opsin-2.8.0/.github/workflows/maven.yml

# This workflow will build a Java project with Maven
# For more information see: https://help.github.com/actions/language-and-framework-guides/building-and-testing-java-with-maven

name: Java CI with Maven

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        # test against latest update of each major Java version:
        java: [ 8, 11, 17 ]
    name: Java ${{ matrix.java }}
    steps:
      - uses: actions/checkout@v3
      - name: Setup java
        uses: actions/setup-java@v3
        with:
          distribution: 'adopt'
          java-version: ${{ matrix.java }}
          cache: 'maven'
      - name: Build with Maven
        run: mvn -B clean test javadoc:javadoc package

opsin-2.8.0/.gitignore

target/
opsin-cli/src/main/java/dl/
.classpath
.project
.settings

opsin-2.8.0/LICENSE.txt

Copyright 2017 Daniel Lowe

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

opsin-2.8.0/README.md

[![Maven Central](https://img.shields.io/maven-central/v/uk.ac.cam.ch.opsin/opsin-core.svg?label=Maven%20Central)](https://search.maven.org/search?q=g:%22uk.ac.cam.ch.opsin%22)
[![Javadoc](https://javadoc.io/badge/uk.ac.cam.ch.opsin/opsin-core.svg)](https://javadoc.io/doc/uk.ac.cam.ch.opsin/opsin-core)
[![MIT license](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Build Status](https://github.com/dan2097/opsin/workflows/Java%20CI%20with%20Maven/badge.svg)](https://github.com/dan2097/opsin/actions)

OPSIN - Open Parser for Systematic IUPAC Nomenclature
=====================================================
__Version 2.8.0 (see [ReleaseNotes.txt](https://raw.githubusercontent.com/dan2097/opsin/master/ReleaseNotes.txt) for what's new in this version)__
__Source code: __
__Web interface and informational site: __
__License: [MIT License](https://opensource.org/licenses/MIT)__

OPSIN is a Java library for IUPAC name-to-structure conversion offering high recall and precision on organic chemical nomenclature.
Java 8 (or higher) is required for OPSIN 2.8.0.

Supported outputs are SMILES, CML (Chemical Markup Language) and InChI (IUPAC International Chemical Identifier).

### Simple Usage Examples
#### Convert a chemical name to SMILES
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -osmi input.txt output.txt`

where input.txt contains chemical name/s, one per line

    NameToStructure nts = NameToStructure.getInstance();
    String smiles = nts.parseToSmiles("acetamide");

#### Convert a chemical name to CML
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -ocml input.txt output.txt`

where input.txt contains chemical name/s, one per line

    NameToStructure nts = NameToStructure.getInstance();
    String cml = nts.parseToCML("acetamide");

#### Convert a chemical name to StdInChI/StdInChIKey/InChI with FixedH
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -ostdinchi input.txt output.txt`
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -ostdinchikey input.txt output.txt`
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -oinchi input.txt output.txt`

where input.txt contains chemical name/s, one per line

    NameToInchi nti = new NameToInchi();
    String stdInchi = nti.parseToStdInchi("acetamide");
    String stdInchiKey = nti.parseToStdInchiKey("acetamide");
    String inchi = nti.parseToInchi("acetamide");

NOTE: OPSIN's non-standard InChI includes an additional layer (FixedH) that indicates which tautomer the chemical name described. StdInChI aims to be tautomer independent.

### Advanced Usage
OPSIN 2.8.0 allows enabling of the following options:

* allowRadicals: Allows substituents to be interpretable e.g. allows interpretation of "ethyl"
* wildcardRadicals: If allowRadicals is enabled, this option uses atoms in the output to represent radicals: 'R' in CML and '*' in SMILES e.g. changes the output of ethyl from C[CH2] to CC\*
* detailedFailureAnalysis: Provides a potentially more accurate reason as to why a chemical name could not be parsed.
This is done by parsing the chemical name from right to left. The trade-off for enabling this is slightly increased memory usage.
* allowAcidsWithoutAcid: Allows interpretation of acids without the word acid e.g. "acetic"
* allowUninterpretableStereo: Allows stereochemistry uninterpretable by OPSIN to be ignored (When used as a library the OpsinResult has a status of WARNING if stereochemistry was ignored)
* verbose: Enables debugging output (command-line only). This option has the effect of lowering the logging threshold on the uk.ac.cam.ch.wwmm.opsin package to DEBUG.

The usage of these options on the command line is described in the command line's help dialog accessible via:
`java -jar opsin-cli-2.8.0-jar-with-dependencies.jar -h`

These options may be controlled using the following code:

    NameToStructure nts = NameToStructure.getInstance();
    NameToStructureConfig ntsconfig = new NameToStructureConfig();
    //a new NameToStructureConfig starts as a copy of OPSIN's default configuration
    ntsconfig.setAllowRadicals(true);
    OpsinResult result = nts.parseChemicalName("acetamide", ntsconfig);
    String cml = result.getCml();
    String smiles = result.getSmiles();
    String stdinchi = NameToInchi.convertResultToStdInChI(result);

`result.getStatus()` may be checked to see if the conversion was successful.
If a structure was generated but OPSIN believes there may be a problem a status of WARNING is returned. Currently this may occur if the name appeared to be ambiguous or stereochemistry was ignored.
By default only optical rotation specification is ignored (this cannot be converted to stereo-configuration algorithmically).

Convenience methods like `result.nameAppearsToBeAmbiguous()` may be used to check the cause of the warning.
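
The status check described above lends itself to a small triage step. The following is a minimal sketch rather than OPSIN code: `Status`, `Outcome` and `acceptedSmiles` are hypothetical stand-ins defined locally so that the example compiles without OPSIN on the classpath. It assumes a three-way status (success, warning, failure); in real use the fields would be filled from the `OpsinResult`, for example via `result.getStatus()` and `result.getSmiles()`.

```java
import java.util.Optional;

/**
 * Sketch of triaging a name-to-structure outcome by its status.
 * Status and Outcome are hypothetical local stand-ins, NOT OPSIN classes.
 */
public class StatusTriage {

    /** Assumed three-way status: converted cleanly, converted with a caveat, or not converted. */
    public enum Status { SUCCESS, WARNING, FAILURE }

    /** Hypothetical holder for the parts of a result that the triage needs. */
    public static final class Outcome {
        public final Status status;
        public final String smiles;   // null when no structure was produced
        public final String message;  // reason for a warning or failure

        public Outcome(Status status, String smiles, String message) {
            this.status = status;
            this.smiles = smiles;
            this.message = message;
        }
    }

    /**
     * Returns the SMILES if the outcome is acceptable to the caller.
     * A WARNING (e.g. ambiguous name, ignored stereochemistry) still carries a
     * structure, so the caller decides whether to trust it.
     */
    public static Optional<String> acceptedSmiles(Outcome outcome, boolean acceptWarnings) {
        switch (outcome.status) {
            case SUCCESS:
                return Optional.ofNullable(outcome.smiles);
            case WARNING:
                return acceptWarnings ? Optional.ofNullable(outcome.smiles) : Optional.empty();
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // "CC(N)=O" is an illustrative SMILES for acetamide, not captured OPSIN output.
        Outcome warned = new Outcome(Status.WARNING, "CC(N)=O", "name appears to be ambiguous");
        System.out.println(acceptedSmiles(warned, true).orElse("rejected: " + warned.message));
        System.out.println(acceptedSmiles(warned, false).orElse("rejected: " + warned.message));
    }
}
```

Passing `acceptWarnings` explicitly keeps the policy with the caller: a batch job that needs high precision can reject warned structures, while an interactive tool can show them together with the warning message.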
NOTE: (Std)InChI cannot be generated for polymers or radicals generated in combination with the wildcardRadicals option

### Availability
OPSIN is available as a standalone JAR from GitHub:

* `opsin-cli-2.8.0-jar-with-dependencies.jar` can be executed as a command-line application. It includes SMILES/CML/InChI support and bundles a logging implementation.
* `opsin-core-2.8.0-jar-with-dependencies.jar` includes just SMILES/CML support.

OPSIN is also available from the Maven Central Repository.

For SMILES/CML output support you would include:

    <dependency>
       <groupId>uk.ac.cam.ch.opsin</groupId>
       <artifactId>opsin-core</artifactId>
       <version>2.8.0</version>
    </dependency>

or if you also need InChI output support:

    <dependency>
       <groupId>uk.ac.cam.ch.opsin</groupId>
       <artifactId>opsin-inchi</artifactId>
       <version>2.8.0</version>
    </dependency>

#### Building from source
To build OPSIN from source, download Maven 3 and OPSIN's source code.

Running `mvn package` in the root of OPSIN's source will build:

| Artifact                                          | Location           | Description                                                       |
|---------------------------------------------------|--------------------|-------------------------------------------------------------------|
| opsin-cli-\<version\>-jar-with-dependencies.jar   | opsin-cli/target   | Standalone command-line application with SMILES/CML/InChI support |
| opsin-core-\<version\>-jar-with-dependencies.jar  | opsin-core/target  | Library with SMILES/CML support                                   |
| opsin-inchi-\<version\>-jar-with-dependencies.jar | opsin-inchi/target | Library with SMILES/CML/InChI support                             |

### About OPSIN

The workings of OPSIN are more fully described in:

    Chemical Name to Structure: OPSIN, an Open Source Solution
    Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, Robert C. Glen
    Journal of Chemical Information and Modeling 2011 51 (3), 739-753

If you use OPSIN in your work, then it would be great if you could cite us.

The following list broadly summarises what OPSIN can currently do and what will be worked on in the future.

#### Supported nomenclature includes:
* alkanes/alkenes/alkynes/heteroatom chains e.g. hexane, hex-1-ene, tetrasiloxane and their cyclic analogues e.g.
cyclopropane
* All IUPAC 1993 recommended rings
* Trivial acids
* Hantzsch-Widman e.g. 1,3-oxazole
* Spiro systems
* All von Baeyer rings e.g. bicyclo[2.2.2]octane
* Hydro e.g. 2,3-dihydropyridine
* Indicated hydrogen e.g. 1H-benzoimidazole
* Heteroatom replacement
* Specification of charge e.g. ium/ide/ylium/uide
* Multiplicative nomenclature e.g. ethylenediaminetetraacetic acid
* Conjunctive nomenclature e.g. cyclohexaneethanol
* Fused ring systems e.g. imidazo[4,5-d]pyridine
* Ring assemblies e.g. biphenyl
* Most prefix and infix functional replacement nomenclature
* The following functional classes: acetals, acids, alcohols, amides, anhydrides, anilides, azetidides, azides, bromides, chlorides, cyanates, cyanides, esters, di/tri/tetra esters, ethers, fluorides, fulminates, glycol ethers, glycols, hemiacetals, hemiketals, hydrazides, hydrazones, hydrides, hydroperoxides, hydroxides, imides, iodides, isocyanates, isocyanides, isoselenocyanates, isothiocyanates, ketals, ketones, lactams, lactims, lactones, mercaptans, morpholides, oxides, oximes, peroxides, piperazides, piperidides, pyrrolidides, selenides, selenocyanates, selenoketones, selenols, selenosemicarbazones, selenones, selenoxides, selones, semicarbazones, sulfides, sulfones, sulfoxides, sultams, sultims, sultines, sultones, tellurides, telluroketones, tellurones, tellurosemicarbazones, telluroxides, thiocyanates, thioketones, thiols, thiosemicarbazones
* Greek letters
* Lambda convention
* Amino Acids and derivatives
* Structure-based polymer names e.g. poly(2,2'-diamino-5-hexadecylbiphenyl-3,3'-diyl)
* Bridge prefixes e.g.
methano
* Specification of oxidation numbers and charge on elements
* Perhalogeno terms
* Subtractive prefixes: deoxy, dehydro, anhydro, demethyl, deamino
* Stoichiometry ratios and mixture indicators
* Nucleosides, (oligo)nucleotides and their esters
* Carbohydrate nomenclature
* Simple CAS names including inverted CAS names
* Steroids including alpha/beta stereochemistry
* Isotopic labelling
* E/Z/R/S stereochemistry
* cis/trans indicating relative stereochemistry on rings and as a synonym of E/Z

#### Currently UNsupported nomenclature includes:
* Other less common stereochemical terms
* Most alkaloids/terpenoids
* Natural product specific nomenclature operations

### Developers and Contributors
* Rich Apodaca
* Albina Asadulina
* Peter Corbett
* Daniel Lowe (Current maintainer)
* John Mayfield
* Peter Murray-Rust
* Noel O'Boyle
* Mark Williamson

Thanks also to the many users who have contributed through suggestions and bug reporting.

![YourKit Logo](https://www.yourkit.com/images/yklogo.png)

OPSIN's developers use YourKit to profile and optimise code.

YourKit supports open source projects with its full-featured Java Profiler.
YourKit, LLC is the creator of [YourKit Java Profiler](https://www.yourkit.com/java/profiler/index.jsp) and [YourKit .NET Profiler](https://www.yourkit.com/.net/profiler/index.jsp), innovative and intelligent tools for profiling Java and .NET applications.

Good Luck and let us know if you have problems, comments or suggestions!
Bugs may be reported on the project's [issue tracker](https://github.com/dan2097/opsin/issues).

opsin-2.8.0/ReleaseNotes.txt

Version 2.8.0 (2022-10-29)
Support for undecahectane/undecadictane (previously only hendeca was supported)
Support for dicarboximido
Improved support for lysergic acid derivatives
Added a few more sugars e.g. digitalose
Added borodeuteride and hydro contractions of pharmaceutical salts e.g.
hydromethanesulfonate
Support substitution on glyceric acid
Corrected interpretation of imidazolium, trioxane and phthalhydrazide

Version 2.7.0 (2022-08-16)
Improved coverage of flavonoid parent structures
Support for apiofuranosyl, added 5 locant to apiose
Improved support for n-amyl
Superscripted numbers in poly spiro systems are now intelligently determined if the input lacks superscript indication
Support for annulynes
Fixed issues where amino acid salts were being interpreted as functionalisation of the amino acid
Fixed bug where annulene parsing was case sensitive
Chalcone, in accordance with current IUPAC recommendations, is now interpreted as specifically the trans isomer
Minor dependency updates

Version 2.6.0 (2021-12-21)
OPSIN now requires Java 8 (or higher)
OPSIN command-line functionality moved to opsin-cli module
OPSIN standalone jars are now built with mvn package
Updated from InChI 1.03 to InChI 1.06
Support for capturing relative/racemic stereochemistry (output via CxSmiles) [contributed by John Mayfield]
Support for deaza/dethia
Support nitrile as a suffix on amino acids [contributed by John Mayfield]
Support more glycero-n-phospho substituents
Support for chloroxime and other haloximes
Support cis/trans on rings where a stereocenter has two non-hydrogen substituents, using Cahn-Ingold-Prelog rules to determine which are relative
Multiple improvements to implicit bracketing logic
Corrected interpretation of methylselenopyruvate
Added group 1/2 nitrides e.g. magnesium nitride
Added molecular diatomics e.g. molecular hydrogen (or dihydrogen)
Fixed out of memory error if a fusion bracket referenced an interior atom instead of a peripheral atom
Fixed out of memory error while parsing very long ambiguous input, by switching parsing algorithm from breadth-first to depth-first

Dependency changes:
Updated logging from Log4J v1.2.17 to the latest Log4J2 (v2.17.0). Neither OPSIN 2.5.0 nor 2.6.0 is vulnerable to Log4Shell.
The logging implementation is only included in the opsin-cli module
opsin-inchi now uses JNA-InChI (https://github.com/dan2097/jna-inchi) rather than JNI-InChI. This supports the latest version of InChI and also supports new Macs with ARM64 processors
Woodstox now uses groupid com.fasterxml.woodstox (the groupid change did not signify a break in API compatibility)
dk.brics.automaton now uses groupid dk.brics (the groupid change did not signify a break in API compatibility)
commons-cli is only used by the opsin-cli module

Version 2.5.0 (2020-10-04)
OPSIN now requires Java 7 (or higher)
Support for traditional oxidation state names e.g. ferric
Added support for defining the stereochemistry of phosphines/arsines
Added newly discovered elements
Improved algorithm for correctly interpreting ester names with a missing space e.g. 3-aminophenyl-4-aminobenzenesulfonate
Fixed structure of canavanine
Corrected interpretation of silver oxide
Vocabulary improvements
Minor improvements/bug fixes

Internal XML Changes:
tokenList files now all use the same schema (tokenLists.dtd)

Version 2.4.0 (2018-12-23)
OPSIN is now licensed under the MIT License
Locant labels included in extended SMILES output
Command-line now has a name flag to include the input name in SMILES/InChI output (tab delimited)
Added support for carotenoids
Added support for Vitamin B-6 related compounds
Added support for more fused ring system bridge prefixes
Added support for anilide as a functional replacement group
Allow heteroatom replacement as a detachable prefix e.g.
3,6,9-triaza-2-(4-phenylbutyl)undecanoic acid
Support Boughton system isotopic suffixes for 13C/14C/15N/17O/18O
Support salts of acids in CAS inverted names
Improved support for implicitly positively charged purine nucleosides/nucleotides
Added various biochemical groups/substituents
Improved logic for determining intended substitution in names with too few brackets
Incorrectly capitalized locants can now be used to reference ring fusion atoms
Some names no longer allow substitution e.g. water, hydrochloride
Many minor precision/recall improvements

Version 2.3.1 (2017-07-23)
Fixed fused ring numbering algorithm incorrectly numbering some ortho- and peri-fused fused systems involving 7-membered rings
Support P-thio to indicate thiophosphate linkage
Count of isotopic replacements no longer required if locants given
Fixed bug where CIP algorithm could assign priorities to identical substituents
Fixed "DL" before a substituent not assigning the substituted alpha-carbon as racemic stereo
L-stereochemistry no longer assumed on semi-systematic glycine derivatives e.g. phenylglycine
Fixed some cases where substituents like carbonyl should have been part of an implicitly bracketed section
Fixed interpretation of leucinic acid and 3/4/5-pyrazolone

Version 2.3.0 (2017-02-23)
D/L stereochemistry can now be assigned algorithmically e.g. L-2-aminobutyric acid
Other minor improvements to amino acid support e.g. homoproline added
Extended SMILES added to command-line interface
Names intended to include the triiodide/tribromide anion no longer erroneously have three monohalides
Ambiguity detected when applying unlocanted subtractive prefixes
Better support for adjacent multipliers e.g.
ditrifluoroacetic acid
deoxynucleosides are now implicitly 2'-deoxynucleosides
Added support for as a syntax for a superscripted number
Added support for amidrazones
Aluminium hydrides/chlorides/bromides/iodides are now covalently bonded
Fixed names with isotopes less than 10 not being supported
Fixed interpretation of some trivial names that clash with systematic names

Version 2.2.0 (2016-10-16)
Added support for IUPAC system for isotope specification e.g. (3-14C,2,2-2H2)butane
Added support for specifying deuteration using the Boughton system e.g. butane-2,2-d2
Added support for multiplied bridges e.g. 1,2:3,4-diepoxy
Front locants after a von Baeyer descriptor are now supported e.g. bicyclo[2.2.2]-7-octene
onosyl substituents now supported e.g. glucuronosyl
More sugar substituents e.g. glucosaminyl
Improved support for malformed polycyclic spiro names
Support for oximino as a suffix
Added method [NameToStructure.getVersion()] to retrieve OPSIN version number
Allowed bridges to be used as detachable prefixes
Allow odd numbers of hydro to be added e.g. trihydro
Added support for unbracketed R stereochemistry (but not S, for the moment, due to the ambiguity with sulfur locants)
Various minor bug fixes e.g. stereochemistry was incorrect for isovaline
Minor vocabulary improvements

Version 2.1.0 (2016-03-12)
Added support for fractional multipliers e.g. hemihydrochloride
Added support for abbreviated common salts e.g. HCl
Added support for sandwich compounds e.g. ferrocene
Improved recognition of names missing the last 'e' (common in German)
Support for E/Z directly before double bond indication e.g. 2Z-ylidene, 2Z-ene
Improved support for functional class ethers e.g. "glycerol triglycidyl ether"
Added general support for names involving an ester formed from an alcohol and an ate group
Grignard reagents and certain compounds (e.g. uranium hexafluoride) are now treated as covalent rather than ionic
Added experimental support for outputting extended SMILES.
Polymers and attachment points are annotated explicitly
Polymers when output as SMILES now have atom classes to indicate which end of the repeat unit is which
Support * as a superscript indicator e.g. *6* to mean superscript 6
Improved recognition of racemic stereochemistry terms
Added general support for names like "beta-alanine N,N-diacetic acid"
Allowed "one" and "ol" suffixes to be used in more cases where another suffix is also present
"ic acid halide" is not interpreted the same as "ic halide"
Fixed some cases where ambiguous operations were not considered ambiguous e.g. monosubstituted phenyl
Improvements/bug fixes to heuristics for detecting when spaces are omitted from ether/ester names
Improved support for stereochemistry in older CAS index names
Many precision improvements e.g. cyclotriphosphazene, thiazoline, TBDMS/TBDPS protecting groups, S-substituted-methionine
Various minor bug fixes e.g. names containing "SULPH" not recognized
Minor vocabulary improvements

Internal XML Changes:
Synonyms of the same concept are now or-ed rather than being separate entities e.g. tertiary|tert-|t-

Version 2.0.0 (2015-07-10)
MAJOR CHANGES:
Requires Java 1.6 or higher
CML (Chemical Markup Language) is now returned as a String rather than a XOM Element
OPSIN now attempts to identify if a chemical name is ambiguous. Names that appear ambiguous return with a status of WARNING with the structure provided being one interpretation of the name

Added support for "alcohol esters" e.g. phenol acetate [meaning phenyl acetate]
Multiplied unlocanted substitution is now more intelligent e.g.
all substituents must connect to same group, and degeneracy of atom environments is taken into account
The ester interpretation is now preferred in more cases where a name does not contain a space but the parent is methanoate/ethanoate/formate/acetate/carbamate
Inorganic oxides are now interpreted, yielding structures with [O-2] ions
Added more trivial names of simple molecules
Support for nitrolic acids
Fixed parsing issue where a directly substituted acetal was not interpretable
Fixed certain groups e.g. phenethyl, not having their suffix attached to a specific location
Corrected interpretation of xanthyl, and various trivial names that look systematic
Name to structure is now ~20% faster
Initialisation time reduced by a third
InChI generation is now ~20% faster
XML processing dependency changed from XOM to Woodstox
Significant internal refactoring
Utility functions designed for internal use are no longer on the public API
Various minor bug fixes

Internal XML Changes:
Groups lacking a labels attribute now have no locants (previously had ascending numeric locants)
Syntax for addGroup/addHeteroAtom/addBond attributes changed to be easier to parse and allow specification of whether the name is ambiguous if a locant is not provided

Version 1.6.0 (2014-04-26)
Added API/command-line options to generate StdInchiKeys
Added support for the IUPAC recommended nomenclature for carbohydrate lactones
Added support for boronic acid pinacol esters
Added basic support for specifying chalcogen acid tautomer form e.g. thioacetic S-acid
Fused ring bridges are now numbered
Names with Endo/Exo/Syn/Anti stereochemistry can now be partially interpreted if warnRatherThanFailOnUninterpretableStereochemistry is used
The warnRatherThanFailOnUninterpretableStereochemistry option will now assign as much stereochemistry as OPSIN understands (All ignored stereochemistry terms are mentioned in the OpsinResult message)
Many minor nomenclature support improvements e.g.
succinic imide; hexaldehyde; phenyldiazonium, organotrifluoroborates etc.
Added more trivial names that can be confused with systematic names e.g. Imidazolidinyl urea
Fixed StackOverflowError that could occur when processing molecules with over 5000 atoms
Many minor bug fixes
Minor vocabulary improvements
Minor speed improvements
NOTE: This is the last release to support Java 1.5

Version 1.5.0 (2013-07-21)
Command line interface now accepts files to read and write to as arguments
Added option to allow interpretation of acids missing the word acid e.g. "acetic" (off by default)
Added option to treat uninterpretable stereochemistry as a warning rather than a failure (off by default)
Added support for nucleotide chains e.g. guanylyl(3'-5')uridine
Added support for parabens, azetidides, morpholides, piperazides, piperidides and pyrrolidides
Vocabulary improvements e.g. homo/beta amino acids
Many minor bug fixes e.g. fulminic acid correctly interpreted

Version 1.4.0 (2013-01-27)
Added support for dialdoses, diketoses, ketoaldoses, alditols, aldonic acids, uronic acids, aldaric acids, glycosides, oligosaccharides, named systematically or from trivial stems, in cyclic or acyclic form
Added support for ketoses named using dehydro
Added support for anhydro
Added more trivial carbohydrate names
Added support for sn-glycerol
Improved heuristics for phospho substitution
Added hydrazido and anilate suffixes
Allowed more functional class nomenclature to apply to amino acids
Added support for inverting CAS names with substituted functional terms e.g.
Acetaldehyde, O-methyloxime
Double substitution of a deoxy chiral centre now uses the CIP rules to decide which substituent replaced the hydroxy group
Unicode right arrows, superscripts and the soft hyphen are now recognised

Version 1.3.0 (2012-09-16)
Added option to output radicals as R groups (* in SMILES)
Added support for carbolactone/dicarboximide/lactam/lactim/lactone/olide/sultam/sultim/sultine/sultone suffixes
Resolved some cases of ambiguity in the grammar; the program's capability to handle longer peptide names is improved
Allowed one (as in ketone) before yl e.g. indol-2-on-3-yl
Allowed primed locants to be used as unprimed locants in a bracket e.g. 2-(4'-methylphenyl)pyridine
Vocabulary improvements
SMILES writer will no longer reuse ring closures on the same atom
Fixed case where a name formed of many words that could be parsed ambiguously would cause OPSIN to run out of memory
NameToStructure.getInstance() no longer throws a checked exception
Many minor bug fixes

Version 1.2.0 (2011-12-06)
OPSIN is now available from Maven Central
Basic support for cyclised carbohydrates e.g. alpha-D-glucopyranose
Basic support for systematic carbohydrate stems e.g.
D-glycero-D-gluco-Heptose
Added heuristic for correcting esters with omitted spaces
Added support for xanthates/xanthic acid
Minor vocabulary improvements
Fixed a few minor bugs/limitations in the Cahn-Ingold-Prelog rules implementation and made it more memory efficient
Many minor improvements and bug fixes

Version 1.1.0 (2011-06-16)
Significant improvements to fused ring numbering code, specifically 3/4/5/7/8 member rings are no longer only allowed in chains of rings
Added support for outputting to StdInChI
Small improvements to fused ring building code
Improvements to heuristics for disambiguating what group is being referred to by a locant
Lower case indicated hydrogen is now recognised
Improvements to parsing speed
Many minor improvements and bug fixes

Version 1.0.0 (2011-03-09)
Added native isomeric SMILES output
Improved command-line interface. The desired format i.e. CML/SMILES/InChI as well as options such as allowing radicals can now all be specified via flags
Debugging is now performed using log4j rather than by passing a verbose flag
Added traditional locants to carboxylic acids and alkanes e.g.
beta-hydroxybutyric acid
Added support for cis/trans indicating the relative stereochemistry of two substituents on rings and fused ring systems
Added support for stoichiometry ratios and mixture indicators
Added support for alpha/beta stereochemistry on steroids
Added support for the method for naming spiro systems described in the 1979 recommendations rule A-42
Added detailedFailureAnalysis option to detect the part of a chemical name that fails to parse
Added support for deoxy
Added open-chain saccharides
Improvements to CAS index name uninversion algorithm
Added support for isotopes into the program allowing deuterio/tritio
Added support for R/S stereochemistry indicated by a locant which is also used to indicate the point of substitution for a substituent
Many minor improvements and bug fixes

Version 0.9.0 (2010-11-01)
Added transition metals/f-block elements and noble gases
Added support for specifying the charge or oxidation number on elements e.g. aluminium(3+), iron(II)
Calculations based off a van Arkel diagram are now used to determine whether functional bonds to metals should be treated as ionic or covalent
Improved support for prefix functional replacement e.g. hydrazono/amido/imido/hydrazido/nitrido/pseudohalides can now be used for functional replacement on appropriate acids
Ortho/meta/para handling improved - can now only apply to six membered rings
Added support for methylenedioxy
Added support for simple bridge prefixes e.g. methano as in 2,3-methanoindene
Added support for perfluoro/perchloro/perbromo/periodo
Generalised alkane support to allow alkanes of lengths up to 9999 to be described without enumeration
Updated dependency on JNI-InChI to 0.7, hence InChI 1.03 is now used.
Improved algorithm for assigning unlocanted hydro terms
Improved heuristic for determining meaning of oxido
Improved charge balancing e.g.
ionic substance of an implicit ratio 2:3 can now be handled rather than being represented as a net charged 1:1 mixture
Grammar is a bit more lenient of placement of stereochemistry and multipliers
Vocabulary improvements especially in the area of nucleosides and nucleotides
Esters of biochemical compounds e.g. triphosphates are now supported
Many minor improvements and bug fixes

Version 0.8.0 (2010-07-16)
NameToStructureConfig can now be used to configure whether radicals e.g. ethyl are output or not.
Names like carbon tetrachloride are now supported
glycol ethers e.g. ethylene glycol ethyl ether are now supported
Prefix functional replacement support now includes halogens e.g. chlorophosphate
Added support for epoxy/epithio/episeleno/epitelluro
Added support for hydrazides/fluorohydrins/chlorohydrins/bromohydrins/iodohydrins/cyanohydrins/acetals/ketals/hemiacetals/hemiketals/diketones/disulfones named using functional class nomenclature
Improvements to algorithm for assigning and finding atoms corresponding to element symbol locants
Added experimental right to left parser (ReverseParseRules.java)
Vocabulary improvements
Parsing is now even faster
Various bug fixes and name interpretation fixes

Version 0.7.0 (2010-06-09)
Added full support for conjunctive nomenclature e.g. 1,3,5-benzenetriacetic acid
Added basic support for CAS names
Added trivial poly-noncarboxylic acids and more trivial carboxylic acids
Added support for spirobi/spiroter/dispiroter and the majority of spiro(ring-locant-ring) nomenclature
Indicators of the direction that a chemical rotates plane polarised light are now detected and ignored
Fixed many cases of trivial names being interpreted systematically by adding more trivial names and detecting such cases
Names such as oxalic bromide cyanide where a halide/pseudohalide replaces an oxygen are now supported
Amino acid esters named from the neutral amino acid are now supported e.g.
glycine ethyl ester
Added more heteroatom replacement terms
Allowed creation of an OPSIN parser through NameToStructure.getOpsinParser()
Added support for dehydro - for unsaturating bonds
Improvements to element symbol locant assignment and retrieving appropriate atoms from locants like N2
OPSIN's SMILES parser now accepts specification of number of hydrogens in cases other than chiral atoms
Mixtures specified by separating components by semicolon space are now supported
Many internal improvements and bug fixes

Version 0.6.1 (2010-03-18)
Counter ions are now duplicated so as to lead, if possible, to a neutral compound
In names like nitrous amide the atoms modified by the functional replacement can now be substituted
Allowed ~number~ for specifying superscripts
Vocabulary improvements
Added quinone suffix
Tetrahedral sulfur stereochemistry is now recognised
Bug fixes to fix incorrect interpretation of some names e.g. triphosgene is now unparseable rather than 3 x phosgene, phospho has different meanings depending on whether it is used on an amino acid or another group etc.

Version 0.6.0 (2010-02-18)
OPSIN is now a mavenised project consisting of two modules: core and inchi. Core does name -->CML, inchi depends on core and allows conversion to inchi
Instead of CML an OpsinResult can be returned which can yield information as to why a name was not interpretable
Added support for unlocanted R/S/E/Z stereochemistry. Removed limit on number of atoms that stereochemistry code can handle
Added support for polymers e.g.
poly(ethylene)
Improvements in handling of multiplicative nomenclature
Improvements to fusion nomenclature handling: multiplied components and multi parent systems are now supported
Improved support for functional class nomenclature; space detection has been improved and support has been added for anhydride, oxide, oxime, hydrazone, semicarbazone, thiosemicarbazone, selenosemicarbazone, tellurosemicarbazone, imide
Support for the lambda convention
Locanted esters
Improvements in dearomatisation code
CML output changed to being CML-Lite compliant
Speed improvements
Support for Greek letters e.g. as alpha or $a or α
Added more infixes
Added more suffixes
Vocabulary improvements
Systematic handling of amino acid nomenclature
Added support for perhydro
Support for ylium/uide
Support for locants like N-1 (instead of N1)
Fixed potential infinite loop in fused ring numbering
Made grammar more lenient in many places e.g. euphonic o, optional square brackets
Sulph is now treated like sulf as in sulphuric acid
and many misc fixes and improvements

Version 0.5.3 (2009-10-22)
Added support for amic, aldehydic, anilic, anilide, carboxanilide and amoyl suffixes
Added support for cyclic imides e.g. succinimide/succinimido
Added support for amide functional class
Support for locants such as N5 which means a nitrogen that is attached in some way to position 5. Locants of this type may also be used in ester formation.
Some improvements to functional replacement using prefixes e.g. thioethanoic acid now works
Disabled stereochemistry in molecules with over 300 atoms as a temporary fix to the problem in 0.5.2
Slight improvement in method for deciding which group detachable hydro prefixes apply to.
Minor vocabulary update

Version 0.5.2 (2009-10-04)
Outputting directly to InChI is now supported using the separately available nameToInchi jar (an OPSIN jar is expected in the same location as the nameToInchi jar)
Fused rings with any number of rings in a chain or formed entirely of 6 membered rings can now be numbered
Added support for E/Z/R/S where locants are given. Unlocanted cases will be dealt with in a subsequent release. In very large molecules a lack of memory may be encountered; this will be resolved in a subsequent release
Some infixes are now supported e.g. ethanthioic acid
All spiro systems with Von Baeyer brackets are now supported e.g. dispiro[4.2.4.2]tetradecane
Vocabulary increase (especially: terpenes, inorganic acids, fused ring components)
Fixed some problems with components with both acyclic and cyclic sections e.g. trityl
Improved locant assignments e.g. 2-furyl is now also fur-2-yl
Speed improvements
Removed dependence on Nux/Saxon
Misc minor fixes

Version 0.5.1 (2009-07-20)
Huge reduction in OPSIN initialisation time (typical ~7 seconds -->800ms)
Allowed thio/seleno/telluro as divalent linkers and for functional replacement when used as prefixes. Peroxy can now be used for functional replacement
Better support for semi-trivially named hydrocarbon fused rings e.g.
tetracene
Better handling of carbonic acid derivatives
Improvements to locant assignment
Support for names like triethyltetramine and triethylene glycol
Misc other fixes to prevent OPSIN generating the wrong structure for certain types of names

Version 0.5 (2009-06-23)
Too many changes to list

Version 0.1 (2006-10-11)
Initial release
opsin-2.8.0/opsin-cli/pom.xml
4.0.0 opsin uk.ac.cam.ch.opsin 2.8.0 opsin-cli OPSIN Command Line interface Command line interface for using OPSIN to convert names to SMILES/InChI/InChIKey/CML org.apache.maven.plugins maven-shade-plugin 3.2.4 package shade target/opsin-cli-${project.version}-jar-with-dependencies.jar uk.ac.cam.ch.wwmm.opsin.Cli uk.ac.cam.ch.opsin opsin-inchi commons-cli commons-cli org.apache.logging.log4j log4j-core org.junit.jupiter junit-jupiter test
opsin-2.8.0/opsin-cli/src/main/java/uk/ac/cam/ch/wwmm/opsin/Cli.java
package uk.ac.cam.ch.wwmm.opsin; import java.io.BufferedReader; import java.io.BufferedWriter; import java.io.File; import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.io.OutputStream; import java.io.OutputStreamWriter; import java.nio.charset.StandardCharsets; import javax.xml.stream.XMLOutputFactory; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamWriter; import org.apache.commons.cli.CommandLine; import org.apache.commons.cli.CommandLineParser; import org.apache.commons.cli.DefaultParser; import org.apache.commons.cli.HelpFormatter; import org.apache.commons.cli.Option; import org.apache.commons.cli.Option.Builder; import org.apache.commons.cli.Options; import org.apache.commons.cli.UnrecognizedOptionException; import org.apache.logging.log4j.Level; import org.apache.logging.log4j.core.config.Configurator; import com.ctc.wstx.api.WstxOutputProperties; import com.ctc.wstx.stax.WstxOutputFactory; public class Cli { private enum InchiType { inchiWithFixedH, stdInchi, stdInchiKey } /** * Run OPSIN as a command-line application. 
* * @param args * @throws Exception */ public static void main(String[] args) throws Exception { Options options = buildCommandLineOptions(); CommandLineParser parser = new DefaultParser(); CommandLine cmd = null; try { cmd = parser.parse(options, args); } catch (UnrecognizedOptionException e) { System.err.println(e.getMessage()); System.exit(1); } if (cmd.hasOption("h")) { displayUsage(options); } if (cmd.hasOption("v")) { Configurator.setLevel("uk.ac.cam.ch.wwmm.opsin", Level.DEBUG); } NameToStructureConfig n2sconfig = generateOpsinConfigObjectFromCmd(cmd); InputStream input = System.in; OutputStream output = System.out; String[] unparsedArgs = cmd.getArgs(); if (unparsedArgs.length == 0) { System.err.println("Run the jar using the -h flag for help. Enter a chemical name to begin:"); } else if (unparsedArgs.length == 1) { input = new FileInputStream(new File(unparsedArgs[0])); } else if (unparsedArgs.length == 2) { input = new FileInputStream(new File(unparsedArgs[0])); output = new FileOutputStream(new File(unparsedArgs[1])); } else { displayUsage(options); } try { String outputType = cmd.getOptionValue("o", "smi"); boolean outputName = cmd.hasOption("n"); if (outputType.equalsIgnoreCase("cml")) { interactiveCmlOutput(input, output, n2sconfig); } else if (outputType.equalsIgnoreCase("smi") || outputType.equalsIgnoreCase("smiles")) { interactiveSmilesOutput(input, output, n2sconfig, false, outputName); } else if (outputType.equalsIgnoreCase("inchi")) { interactiveInchiOutput(input, output, n2sconfig, InchiType.inchiWithFixedH, outputName); } else if (outputType.equalsIgnoreCase("stdinchi")) { interactiveInchiOutput(input, output, n2sconfig, InchiType.stdInchi, outputName); } else if (outputType.equalsIgnoreCase("stdinchikey")) { interactiveInchiOutput(input, output, n2sconfig, InchiType.stdInchiKey, outputName); } else if (outputType.equalsIgnoreCase("extendedsmi") || outputType.equalsIgnoreCase("extendedsmiles") || outputType.equalsIgnoreCase("cxsmi") || 
outputType.equalsIgnoreCase("cxsmiles")) { interactiveSmilesOutput(input, output, n2sconfig, true, outputName); } else { System.err.println("Unrecognised output format: " + outputType); System.err.println( "Expected output types are \"cml\", \"smi\", \"inchi\", \"stdinchi\" and \"stdinchikey\""); System.exit(1); } } finally { if (output != System.out) { output.close(); } if (input != System.in) { input.close(); } } } private static void displayUsage(Options options) { HelpFormatter formatter = new HelpFormatter(); String version = NameToStructure.getVersion(); formatter.printHelp("java -jar opsin-" + (version != null ? version : "[version]") + "-jar-with-dependencies.jar [options] [inputfile] [outputfile]" + OpsinTools.NEWLINE + "OPSIN converts systematic chemical names to CML, SMILES or InChI/StdInChI/StdInChIKey" + OpsinTools.NEWLINE + "Names should be new line delimited and may be read from stdin (default) or a file and output to stdout (default) or a file", options); System.exit(0); } private static Options buildCommandLineOptions() { Options options = new Options(); Builder outputBuilder = Option.builder("o"); outputBuilder.longOpt("output"); outputBuilder.hasArg(); outputBuilder.argName("format"); StringBuilder outputOptionsDesc = new StringBuilder(); outputOptionsDesc.append("Sets OPSIN's output format (default smi)").append(OpsinTools.NEWLINE); outputOptionsDesc.append("Allowed values are:").append(OpsinTools.NEWLINE); outputOptionsDesc.append("cml for Chemical Markup Language").append(OpsinTools.NEWLINE); outputOptionsDesc.append("smi for SMILES").append(OpsinTools.NEWLINE); outputOptionsDesc.append("extendedsmi for Extended SMILES").append(OpsinTools.NEWLINE); outputOptionsDesc.append("inchi for InChI (with FixedH)").append(OpsinTools.NEWLINE); outputOptionsDesc.append("stdinchi for StdInChI").append(OpsinTools.NEWLINE); outputOptionsDesc.append("stdinchikey for StdInChIKey"); outputBuilder.desc(outputOptionsDesc.toString()); 
options.addOption(outputBuilder.build()); options.addOption("h", "help", false, "Displays the allowed command line flags"); options.addOption("v", "verbose", false, "Enables debugging"); options.addOption("a", "allowAcidsWithoutAcid", false, "Allows interpretation of acids without the word acid e.g. \"acetic\""); options.addOption("f", "detailedFailureAnalysis", false, "Enables reverse parsing to more accurately determine why parsing failed"); options.addOption("n", "name", false, "Include name in SMILES/InChI output (tab delimited)"); options.addOption("r", "allowRadicals", false, "Enables interpretation of radicals"); options.addOption("s", "allowUninterpretableStereo", false, "Allows stereochemistry uninterpretable by OPSIN to be ignored"); options.addOption("w", "wildcardRadicals", false, "Radicals are output as wildcard atoms"); return options; } /** * Uses the command line parameters to configure a new NameToStructureConfig * * @param cmd * @return The configured NameToStructureConfig */ private static NameToStructureConfig generateOpsinConfigObjectFromCmd(CommandLine cmd) { NameToStructureConfig n2sconfig = new NameToStructureConfig(); n2sconfig.setInterpretAcidsWithoutTheWordAcid(cmd.hasOption("a")); n2sconfig.setDetailedFailureAnalysis(cmd.hasOption("f")); n2sconfig.setAllowRadicals(cmd.hasOption("r")); n2sconfig.setWarnRatherThanFailOnUninterpretableStereochemistry(cmd.hasOption("s")); n2sconfig.setOutputRadicalsAsWildCardAtoms(cmd.hasOption("w")); return n2sconfig; } private static void interactiveCmlOutput(InputStream input, OutputStream out, NameToStructureConfig n2sconfig) throws IOException, XMLStreamException { NameToStructure nts = NameToStructure.getInstance(); BufferedReader inputReader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8)); XMLOutputFactory factory = new WstxOutputFactory(); factory.setProperty(WstxOutputProperties.P_OUTPUT_ESCAPE_CR, false); XMLStreamWriter writer = factory.createXMLStreamWriter(out, 
"UTF-8"); writer = new IndentingXMLStreamWriter(writer, 2); writer.writeStartDocument(); CMLWriter cmlWriter = new CMLWriter(writer); cmlWriter.writeCmlStart(); int id = 1; String line; while ((line = inputReader.readLine()) != null) { int splitPoint = line.indexOf('\t'); String name = splitPoint >= 0 ? line.substring(0, splitPoint) : line; OpsinResult result = nts.parseChemicalName(name, n2sconfig); Fragment structure = result.getStructure(); cmlWriter.writeMolecule(structure, name, id++); writer.flush(); if (structure == null) { System.err.println(result.getMessage()); } } cmlWriter.writeCmlEnd(); writer.writeEndDocument(); writer.flush(); writer.close(); } private static void interactiveSmilesOutput(InputStream input, OutputStream out, NameToStructureConfig n2sconfig, boolean extendedSmiles, boolean outputName) throws IOException { NameToStructure nts = NameToStructure.getInstance(); BufferedReader inputReader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8)); BufferedWriter outputWriter = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8)); String line; while ((line = inputReader.readLine()) != null) { int splitPoint = line.indexOf('\t'); String name = splitPoint >= 0 ? line.substring(0, splitPoint) : line; OpsinResult result = nts.parseChemicalName(name, n2sconfig); String output = extendedSmiles ? 
result.getExtendedSmiles() : result.getSmiles(); if (output == null) { System.err.println(result.getMessage()); } else { outputWriter.write(output); } if (outputName) { outputWriter.write('\t'); outputWriter.write(line); } outputWriter.newLine(); outputWriter.flush(); } } private static void interactiveInchiOutput(InputStream input, OutputStream out, NameToStructureConfig n2sconfig, InchiType inchiType, boolean outputName) throws Exception { NameToStructure nts = NameToStructure.getInstance(); BufferedReader inputReader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8)); BufferedWriter outputWriter = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8)); String line; while ((line = inputReader.readLine()) != null) { int splitPoint = line.indexOf('\t'); String name = splitPoint >= 0 ? line.substring(0, splitPoint) : line; OpsinResult result = nts.parseChemicalName(name, n2sconfig); String output; switch (inchiType) { case inchiWithFixedH: output = NameToInchi.convertResultToInChI(result); break; case stdInchi: output = NameToInchi.convertResultToStdInChI(result); break; case stdInchiKey: output = NameToInchi.convertResultToStdInChIKey(result); break; default: throw new IllegalArgumentException("Unexpected enum value: " + inchiType); } if (output == null) { System.err.println(result.getMessage()); } else { outputWriter.write(output); } if (outputName) { outputWriter.write('\t'); outputWriter.write(line); } outputWriter.newLine(); outputWriter.flush(); } } }
opsin-2.8.0/opsin-cli/src/main/resources/log4j2.xml
opsin-2.8.0/opsin-core/pom.xml
4.0.0 
opsin uk.ac.cam.ch.opsin 2.8.0 opsin-core OPSIN Core Core files of OPSIN. Allows conversion of chemical names to CML (Chemical Markup Language) org.apache.maven.plugins maven-shade-plugin 3.2.4 package shade target/opsin-core-${project.version}-jar-with-dependencies.jar src/main/resources true **/*.props src/main/resources false **/*.props dk.brics automaton com.fasterxml.woodstox woodstox-core commons-io commons-io org.apache.logging.log4j log4j-api org.junit.jupiter junit-jupiter test org.hamcrest hamcrest-library test org.mockito mockito-core test org.apache.logging.log4j log4j-core test
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/AmbiguityChecker.java
package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Collection; import 
java.util.Deque; import java.util.HashMap; import java.util.HashSet; import java.util.LinkedHashSet; import java.util.List; import java.util.Map; import java.util.Set; class AmbiguityChecker { static boolean isSubstitutionAmbiguous(List substitutableAtoms, int numberToBeSubstituted) { if (substitutableAtoms.isEmpty()) { throw new IllegalArgumentException("OPSIN Bug: Must provide at least one substituable atom"); } if (substitutableAtoms.size() < numberToBeSubstituted) { throw new IllegalArgumentException("OPSIN Bug: substitutableAtoms must be >= numberToBeSubstituted"); } if (substitutableAtoms.size() == numberToBeSubstituted){ return false; } if (allAtomsConnectToDefaultInAtom(substitutableAtoms, numberToBeSubstituted)) { return false; } Set uniqueAtoms = new HashSet<>(substitutableAtoms); if (uniqueAtoms.size() == 1) { return false; } if (allAtomsEquivalent(uniqueAtoms) && (numberToBeSubstituted == 1 || numberToBeSubstituted == substitutableAtoms.size() - 1)){ return false; } return true; } static boolean allAtomsEquivalent(Collection atoms) { StereoAnalyser analyser = analyseRelevantAtomsAndBonds(atoms); Set uniqueEnvironments = new HashSet<>(); for (Atom a : atoms) { uniqueEnvironments.add(getAtomEnviron(analyser, a)); } return uniqueEnvironments.size() == 1; } static boolean allBondsEquivalent(Collection bonds) { Set relevantAtoms = new HashSet<>(); for (Bond b : bonds) { relevantAtoms.add(b.getFromAtom()); relevantAtoms.add(b.getToAtom()); } StereoAnalyser analyser = analyseRelevantAtomsAndBonds(relevantAtoms); Set uniqueBonds = new HashSet<>(); for (Bond b : bonds) { uniqueBonds.add(bondToCanonicalEnvironString(analyser, b)); } return uniqueBonds.size() == 1; } private static String bondToCanonicalEnvironString(StereoAnalyser analyser, Bond b) { String s1 = getAtomEnviron(analyser, b.getFromAtom()); String s2 = getAtomEnviron(analyser, b.getToAtom()); if (s1.compareTo(s2) > 0){ return s1 + s2; } else { return s2 + s1; } } static String 
getAtomEnviron(StereoAnalyser analyser, Atom a) { Integer env = analyser.getAtomEnvironmentNumber(a); if (env == null) { throw new RuntimeException("OPSIN Bug: Atom was not part of ambiguity analysis"); } //"identical" atoms may be distinguished by bonds yet to be formed, hence split by outvalency // e.g. [PH3] vs [PH3]= return env + "\t" + a.getOutValency(); } private static boolean allAtomsConnectToDefaultInAtom(List substitutableAtoms, int numberToBeSubstituted) { Atom defaultInAtom = substitutableAtoms.get(0).getFrag().getDefaultInAtom(); if (defaultInAtom != null) { for (int i = 0; i < numberToBeSubstituted; i++) { if (!substitutableAtoms.get(i).equals(defaultInAtom)) { return false; } } return true; } return false; } static StereoAnalyser analyseRelevantAtomsAndBonds(Collection startingAtoms) { Set atoms = new HashSet<>(); Set bonds = new HashSet<>(); Deque stack = new ArrayDeque<>(startingAtoms); while (!stack.isEmpty()) { Atom a = stack.removeLast(); if (!atoms.contains(a)) { atoms.add(a); for (Bond b : a.getBonds()) { bonds.add(b); stack.add(b.getOtherAtom(a)); } } } List ghostHydrogens = new ArrayList<>(); for (Atom atom : atoms) { int explicitHydrogensToAdd = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(atom); for (int i = 0; i < explicitHydrogensToAdd; i++) { Atom ghostHydrogen = new Atom(ChemEl.H); Bond b = new Bond(ghostHydrogen, atom, 1); atom.addBond(b); ghostHydrogen.addBond(b); ghostHydrogens.add(ghostHydrogen); } } atoms.addAll(ghostHydrogens); StereoAnalyser analyzer = new StereoAnalyser(atoms, bonds); for (Atom ghostHydrogen : ghostHydrogens) { Bond b = ghostHydrogen.getFirstBond(); b.getOtherAtom(ghostHydrogen).removeBond(b); } return analyzer; } static List useAtomEnvironmentsToGivePlausibleSubstitution(List substitutableAtoms, int numberToBeSubstituted) { if (substitutableAtoms.isEmpty()) { throw new IllegalArgumentException("OPSIN Bug: Must provide at least one substituable atom"); } if (substitutableAtoms.size() < 
numberToBeSubstituted) { throw new IllegalArgumentException("OPSIN Bug: substitutableAtoms must be >= numberToBeSubstituted"); } if (substitutableAtoms.size() == numberToBeSubstituted){ return substitutableAtoms; } List preferredAtoms = findPlausibleSubstitutionPatternUsingSymmmetry(substitutableAtoms, numberToBeSubstituted); if (preferredAtoms != null){ return preferredAtoms; } return findPlausibleSubstitutionPatternUsingLocalEnvironment(substitutableAtoms, numberToBeSubstituted); } private static List findPlausibleSubstitutionPatternUsingSymmmetry(List substitutableAtoms, int numberToBeSubstituted) { //cf. octaethylporphyrin (8 identical atoms capable of substitution) StereoAnalyser analyser = analyseRelevantAtomsAndBonds(new HashSet<>(substitutableAtoms)); Map> atomsInEachEnvironment = new HashMap<>(); for (Atom a : substitutableAtoms) { String env = getAtomEnviron(analyser, a); List atomsInEnvironment = atomsInEachEnvironment.get(env); if (atomsInEnvironment == null) { atomsInEnvironment = new ArrayList<>(); atomsInEachEnvironment.put(env, atomsInEnvironment); } atomsInEnvironment.add(a); } List preferredAtoms = null; for (List atoms : atomsInEachEnvironment.values()) { if (atoms.size() == numberToBeSubstituted){ if (preferredAtoms != null){ return null; } preferredAtoms = atoms; } } if (preferredAtoms == null) { //check for environments with double the required atoms where this means each atom can support two substitutions c.f. cyclohexane for (List atoms : atomsInEachEnvironment.values()) { if (atoms.size() == (numberToBeSubstituted * 2)){ Set uniquified = new LinkedHashSet<>(atoms);//retain deterministic atom ordering if (uniquified.size() == numberToBeSubstituted) { if (preferredAtoms != null){ return null; } preferredAtoms = new ArrayList<>(uniquified); } } } } return preferredAtoms; } private static List findPlausibleSubstitutionPatternUsingLocalEnvironment(List substitutableAtoms, int numberToBeSubstituted) { //cf. 
pentachlorotoluene (5 sp2 carbons vs sp3 methyl)
Map<String, List<Atom>> atomsInEachLocalEnvironment = new HashMap<>(); for (Atom a : substitutableAtoms) { int valency = a.determineValency(true); int currentValency = a.getIncomingValency() + a.getOutValency(); int numOfBonds = (valency - currentValency) + a.getBondCount();//distinguish sp2 and sp3 atoms
String s = a.getElement().toString() +"\t" + valency + "\t" + numOfBonds + "\t" + a.hasSpareValency(); List<Atom> atomsInEnvironment = atomsInEachLocalEnvironment.get(s); if (atomsInEnvironment == null) { atomsInEnvironment = new ArrayList<>(); atomsInEachLocalEnvironment.put(s, atomsInEnvironment); } atomsInEnvironment.add(a); } List<Atom> preferredAtoms = null; for (List<Atom> atoms : atomsInEachLocalEnvironment.values()) { if (atoms.size() == numberToBeSubstituted){ if (preferredAtoms != null){ return null; } preferredAtoms = atoms; } } return preferredAtoms; } }
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/AnnotatorState.java
package uk.ac.cam.ch.wwmm.opsin; /** * Contains the state needed during finite-state parsing * From this the token strings and their semantics can be generated * @author Daniel * */ class AnnotatorState { /** The current state of the DFA. */ private final int state; /** The annotation so far. 
*/ private final char annot; /** The index of the first char in the chemical name that has yet to be tokenised */ private final int posInName; private final boolean isCaseSensitive; private final AnnotatorState previousAs; AnnotatorState(int state, char annot, int posInName, boolean isCaseSensitive, AnnotatorState previousAs) { this.state = state; this.annot = annot; this.posInName = posInName; this.isCaseSensitive = isCaseSensitive; this.previousAs = previousAs; } /** * The current state in the DFA * @return */ int getState() { return state; } /** * The annotation that was consumed to transition to this state * @return */ char getAnnot() { return annot; } /** * The index of the first char in the chemical name that has yet to be tokenised (at the point of creating this AnnotatorState) * @return */ int getPosInName() { return posInName; } /** * Whether the corresponding token is case sensitive * @return */ boolean isCaseSensitive() { return isCaseSensitive; } /** * The last annotator state for the previous token (or null if this is the first) * @return */ AnnotatorState getPreviousAs() { return previousAs; } }
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Atom.java
package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Collections; import java.util.HashMap; import java.util.List; import java.util.Map; import java.util.Set; import java.util.regex.Matcher; import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*; /** * An atom. Carries information about which fragment it is in, and an ID * number and a list of bonds that it is involved in. It may also have other information such as * whether it has "spare valencies" due to unsaturation, its charge, locant labels, stereochemistry and notes * * @author ptc24 * @author dl387 * */ class Atom { /**The (unique over the molecule) ID of the atom.*/ private final int id; /**The chemical element of the atom. 
*/ private ChemEl chemEl; /**The locants that pertain to the atom.*/ private final List locants = new ArrayList<>(2); /**The formal charge on the atom.*/ private int charge = 0; /**The isotope of the atom. Null if not defined explicitly.*/ private Integer isotope = null; /** * Holds the atomParity object associated with this object * null by default */ private AtomParity atomParity = null; /**The bonds that involve the atom*/ private final List bonds = new ArrayList<>(4); /**A map between PropertyKey s as declared here and useful atom properties, usually relating to some kind of special case. */ @SuppressWarnings("rawtypes") private final Map properties = new HashMap<>(); /** A set of atoms that were equally plausible to perform functional replacement on */ static final PropertyKey> AMBIGUOUS_ELEMENT_ASSIGNMENT = new PropertyKey<>("ambiguousElementAssignment"); /** The atom class which will be output when serialised to SMILES. Useful for distinguishing attachment points */ static final PropertyKey ATOM_CLASS = new PropertyKey<>("atomClass"); /** Used on wildcard atoms to indicate their meaning */ static final PropertyKey HOMOLOGY_GROUP = new PropertyKey<>("homologyGroup"); /** Used on wildcard atoms to indicate that they are a position variation bond */ static final PropertyKey> POSITION_VARIATION_BOND = new PropertyKey<>("positionVariationBond"); /** The hydrogen count as set in the SMILES*/ static final PropertyKey SMILES_HYDROGEN_COUNT = new PropertyKey<>("smilesHydrogenCount"); /** The oxidation number as specified by Roman numerals in the name*/ static final PropertyKey OXIDATION_NUMBER = new PropertyKey<>("oxidationNumber"); /** Is this atom the carbon of an aldehyde? 
(however NOT formaldehyde)*/ static final PropertyKey ISALDEHYDE = new PropertyKey<>("isAldehyde"); /** Indicates that this atom is an anomeric atom in a cyclised carbohydrate*/ static final PropertyKey ISANOMERIC = new PropertyKey<>("isAnomeric"); /** Transient integer used to indicate traversal of fragments*/ static final PropertyKey VISITED = new PropertyKey<>("visited"); private static final StereoGroup UNKNOWN_STEREOGROUP = new StereoGroup(StereoGroupType.Unk); /**The fragment to which the atom belongs.*/ private Fragment frag; /** Whether an atom is part of a delocalised set of double bonds. A double bond in a kekule structure * can be mapped to a single bond with this attribute set to true on both atoms that were in the double bond * For example, benzene could be temporarily represented by six singly-bonded atoms, each with a set * spare valency attribute, and later converted into a fully-specified valence structure.*/ private boolean spareValency = false; /**The total bond order of all bonds that are expected to be used for inter fragment bonding * e.g. in butan-2-ylidene this would be 2 for the atom at position 2 and 0 for the other 3 */ private int outValency = 0; /** Null by default or set by the lambda convention.*/ private Integer lambdaConventionValency; /** Null by default or set by the SMILES builder*/ private Integer minimumValency; /** Can this atom have implicit hydrogen? True unless explicitly set otherwise*/ private boolean implicitHydrogenAllowed = true; /** This is modified by ium/ide/ylium/uide and is used to choose the appropriate valency for the atom*/ private int protonsExplicitlyAddedOrRemoved = 0; /** * Takes same values as type in Fragment. Useful for discriminating suffix atoms from other atoms when a suffix is incorporated into another fragment */ private String type; /** * Is this atom in a ring. Default false. Set by the CycleDetector. 
 * Double bonds are only converted to spareValency if atom is in a ring * Some suffixes have different meanings if an atom is part of a ring or not e.g. cyclohexanal vs ethanal */ private boolean atomIsInACycle = false; /** * Builds an Atom from scratch. * GENERALLY EXCEPT FOR TESTING SHOULD NOT BE CALLED EXCEPT FROM THE FRAGMANAGER * @param id The ID number, unique to the atom in the molecule being built * @param chemlEl The chemical element * @param frag the Fragment to contain the Atom */ Atom(int id, ChemEl chemlEl, Fragment frag) { if (frag == null){ throw new IllegalArgumentException("Atom is not in a fragment!"); } if (chemlEl == null){ throw new IllegalArgumentException("Atom does not have an element!"); } this.frag = frag; this.id = id; this.chemEl = chemlEl; this.type =frag.getType(); } /** Used to build a DUMMY atom. * Does not have an id/frag/type as would be expected for a proper atom * @param chemlEl The chemical element */ Atom(ChemEl chemlEl){ this.chemEl = chemlEl; this.id = 0; } /** * Uses the lambdaConventionValency or, if that is not available, * the default valency, provided this is >= the current valency. * If it is not, the chemically sensible valencies allowed for the atom are checked, with the one that is closest to and >= the current valency * being returned. If the valency has still not been determined the current valency i.e. assuming the atom to have 0 implicit hydrogen is returned. * This is the correct behaviour for inorganics. For p block elements it means that OPSIN does not believe the atom to be in a valid valency (too high) * * if considerOutValency is true, the valency that will be used to form bonds using the outAtoms is * taken into account i.e. 
if any radicals were used to form bonds * @param considerOutValency * @return */ int determineValency(boolean considerOutValency) { if (lambdaConventionValency != null){ return lambdaConventionValency + protonsExplicitlyAddedOrRemoved; } int currentValency = getIncomingValency(); if (considerOutValency){ currentValency += outValency; } Integer calculatedMinValency = minimumValency == null ? null : minimumValency + protonsExplicitlyAddedOrRemoved; if (charge ==0 || protonsExplicitlyAddedOrRemoved != 0){ Integer defaultValency = ValencyChecker.getDefaultValency(chemEl); if (defaultValency != null){ defaultValency += protonsExplicitlyAddedOrRemoved; if (currentValency <= defaultValency && (calculatedMinValency == null || defaultValency >= calculatedMinValency)){ return defaultValency; } } } Integer[] possibleValencies = ValencyChecker.getPossibleValencies(chemEl, charge); if (possibleValencies != null) { if (calculatedMinValency != null && calculatedMinValency >= currentValency){ return calculatedMinValency; } for (Integer possibleValency : possibleValencies) { if (calculatedMinValency != null && possibleValency < calculatedMinValency){ continue; } if (currentValency <= possibleValency){ return possibleValency; } } } if (calculatedMinValency != null && calculatedMinValency >= currentValency){ return calculatedMinValency; } else{ return currentValency; } } /**Adds a locant to the Atom. Other locants are preserved. * Also associates the locant with the atom in the parent fragments hash * * @param locant The new locant */ void addLocant(String locant) { locants.add(locant); frag.addMappingToAtomLocantMap(locant, this); } /**Replaces all existing locants with a new one. 
     *
     * @param locant The new locant
     */
    void replaceLocants(String locant) {
        clearLocants();
        addLocant(locant);
    }

    void removeLocant(String locantToRemove) {
        int locantArraySize = locants.size();
        for (int i = locantArraySize -1; i >=0 ; i--) {
            if (locants.get(i).equals(locantToRemove)){
                locants.remove(i);
                frag.removeMappingFromAtomLocantMap(locantToRemove);
            }
        }
    }

    /**Removes all locants from the Atom.
     *
     */
    void clearLocants() {
        for (int i = 0, l = locants.size(); i < l; i++) {
            frag.removeMappingFromAtomLocantMap(locants.get(i));
        }
        locants.clear();
    }

    /**
     * Removes only elementSymbolLocants: e.g. N, S', Se
     */
    void removeElementSymbolLocants() {
        for (int i = locants.size() - 1; i >= 0; i--) {
            String locant = locants.get(i);
            if (MATCH_ELEMENT_SYMBOL_LOCANT.matcher(locant).matches()){
                frag.removeMappingFromAtomLocantMap(locant);
                locants.remove(i);
            }
        }
    }

    /**
     * Removes all locants other than elementSymbolLocants (e.g. N, S', Se)
     * Hence removes numeric locants and greek locants
     */
    void removeLocantsOtherThanElementSymbolLocants() {
        for (int i = locants.size() - 1; i >= 0; i--) {
            String locant = locants.get(i);
            if (!MATCH_ELEMENT_SYMBOL_LOCANT.matcher(locant).matches()){
                frag.removeMappingFromAtomLocantMap(locant);
                locants.remove(i);
            }
        }
    }

    /**Checks if the Atom has a given locant.
     *
     * @param locant The locant to test for
     * @return true if it has, false if not
     */
    boolean hasLocant(String locant) {
        if (locants.contains(locant)) {
            return true;
        }
        Matcher m = MATCH_AMINOACID_STYLE_LOCANT.matcher(locant);
        if (m.matches()){//e.g. N'5
            if (chemEl.toString().equals(m.group(1))){//element symbol
                if (!m.group(2).equals("") && (!hasLocant(m.group(1) +m.group(2)))){//has primes
                    return false;//must have exact locant e.g. N'
                }
                if (OpsinTools.depthFirstSearchForNonSuffixAtomWithLocant(this, m.group(3)) != null){
                    return true;
                }
            }
        }
        return false;
    }

    /**Gets the first locant for the Atom.
     This may be the locant that was initially
     * specified, or the most recent locant specified using replaceLocants, or first
     * locant to be added since the last invocation of clearLocants.
     *
     * @return The locant, or null if there is no locant
     */
    String getFirstLocant() {
        return locants.size() > 0 ? locants.get(0) : null;
    }

    /**Returns an unmodifiable list of all locants associated with the atom
     *
     * @return The list of locants (may be empty)
     */
    List<String> getLocants() {
        return Collections.unmodifiableList(locants);
    }

    /**Returns the subset of the locants which are element symbol locants e.g. N, S', Se
     *
     * @return The list of locants (may be empty)
     */
    List<String> getElementSymbolLocants() {
        List<String> elementSymbolLocants = new ArrayList<>(1);
        for (int i = 0, l = locants.size(); i < l; i++) {
            String locant = locants.get(i);
            if (MATCH_ELEMENT_SYMBOL_LOCANT.matcher(locant).matches()) {
                elementSymbolLocants.add(locant);
            }
        }
        return elementSymbolLocants;
    }

    void setFrag(Fragment f) {
        frag = f;
    }

    Fragment getFrag() {
        return frag;
    }

    /**Gets the ID of the atom.
     *
     * @return The ID of the atom
     */
    int getID() {
        return id;
    }

    /**Gets the chemical element corresponding to the element of the atom.
     *
     * @return The chemical element corresponding to the element of the atom
     */
    ChemEl getElement() {
        return chemEl;
    }

    /**Sets the chemical element corresponding to the element of the atom.
     *
     * @param chemEl The chemical element corresponding to the element of the atom
     */
    void setElement(ChemEl chemEl) {
        this.chemEl = chemEl;
    }

    /**Gets the formal charge on the atom.
     *
     * @return The formal charge on the atom
     */
    int getCharge() {
        return charge;
    }

    /**Modifies the charge of this atom by the amount given. This can be any integer
     * The number of protons changed is noted so as to calculate the correct valency for the atom. This can be any integer.
* For example ide is the loss of a proton so is charge=-1, protons =-1 * @param charge * @param protons */ void addChargeAndProtons(int charge, int protons){ this.charge += charge; protonsExplicitlyAddedOrRemoved+=protons; } /**Sets the formal charge on the atom. * NOTE: make sure to update protonsExplicitlyAddedOrRemoved if necessary * * @param c The formal charge on the atom */ void setCharge(int c) { charge = c; } /** * Sets the formal charge and number of protonsExplicitlyAddedOrRemoved to 0 */ void neutraliseCharge() { charge = 0; protonsExplicitlyAddedOrRemoved = 0; } /** * Gets the mass number of the atom or null if not explicitly defined * e.g. 3 for tritium * @return */ Integer getIsotope() { return isotope; } /** * Sets the mass number of the atom explicitly * @param isotope */ void setIsotope(Integer isotope) { if (isotope != null && isotope < chemEl.ATOMIC_NUM) { throw new RuntimeException("Isotopic mass cannot be less than the element's number of protons: " + chemEl.toString() + " " + isotope + " < " + chemEl.ATOMIC_NUM ); } this.isotope = isotope; } /**Adds a bond to the atom * * @param b The bond to be added */ void addBond(Bond b) { if (bonds.contains(b)){ throw new IllegalArgumentException("Atom already has given bond (This is not allowed as this would give two bonds between the same atoms!)"); } bonds.add(b); } /**Removes a bond to the atom * * @param b The bond to be removed * @return whether bond was present */ boolean removeBond(Bond b) { return bonds.remove(b); } /**Calculates the number of bonds connecting to the atom, excluding bonds to implicit * hydrogens. Double bonds count as * two bonds, etc. Eg ethene - both C's have an incoming valency of 2. 
     *
     * @return Incoming Valency
     */
    int getIncomingValency() {
        int v = 0;
        for (int i = 0, len = bonds.size(); i < len; i++) {
            v += bonds.get(i).getOrder();
        }
        return v;
    }

    int getProtonsExplicitlyAddedOrRemoved() {
        return protonsExplicitlyAddedOrRemoved;
    }

    void setProtonsExplicitlyAddedOrRemoved(int protonsExplicitlyAddedOrRemoved) {
        this.protonsExplicitlyAddedOrRemoved = protonsExplicitlyAddedOrRemoved;
    }

    /**Does the atom have spare valency to form double bonds?
     *
     * @return true if atom has spare valency
     */
    boolean hasSpareValency() {
        return spareValency;
    }

    /**Set whether an atom has spare valency
     *
     * @param sv The spare valency
     */
    void setSpareValency(boolean sv) {
        spareValency = sv;
    }

    /**Gets the total bond order of the bonds expected to be created from this atom for inter fragment bonding
     *
     * @return The outValency
     */
    int getOutValency() {
        return outValency;
    }

    /**Adds to the total bond order of the bonds expected to be created from this atom for inter fragment bonding
     *
     * @param outV The outValency to be added
     */
    void addOutValency(int outV) {
        outValency += outV;
    }

    List<Bond> getBonds() {
        return Collections.unmodifiableList(bonds);
    }

    int getBondCount() {
        return bonds.size();
    }

    /**Gets a list of atoms that connect to the atom
     *
     * @return The list of atoms connected to the atom
     */
    List<Atom> getAtomNeighbours(){
        int bondCount = bonds.size();
        List<Atom> results = new ArrayList<>(bondCount);
        for (int i = 0; i < bondCount; i++) {
            results.add(bonds.get(i).getOtherAtom(this));
        }
        return results;
    }

    Integer getLambdaConventionValency() {
        return lambdaConventionValency;
    }

    void setLambdaConventionValency(Integer valency) {
        this.lambdaConventionValency = valency;
    }

    String getType() {
        return type;
    }

    void setType(String type) {
        this.type = type;
    }

    boolean getAtomIsInACycle() {
        return atomIsInACycle;
    }

    /**
     * Sets whether atom is in a cycle, true if it is
     * @param atomIsInACycle
     */
    void setAtomIsInACycle(boolean atomIsInACycle) {
        this.atomIsInACycle = atomIsInACycle;
    }

    AtomParity getAtomParity() {
        return atomParity;
    }

    void setAtomParity(AtomParity atomParity) {
        this.atomParity = atomParity;
    }

    void setAtomParity(Atom[] atomRefs4, int parity) {
        atomParity = new AtomParity(atomRefs4, parity);
    }

    Integer getMinimumValency() {
        return minimumValency;
    }

    void setMinimumValency(Integer minimumValency) {
        this.minimumValency = minimumValency;
    }

    boolean getImplicitHydrogenAllowed() {
        return implicitHydrogenAllowed;
    }

    void setImplicitHydrogenAllowed(boolean implicitHydrogenAllowed) {
        this.implicitHydrogenAllowed = implicitHydrogenAllowed;
    }

    @SuppressWarnings("unchecked")
    <T> T getProperty(PropertyKey<T> propertyKey) {
        return (T) properties.get(propertyKey);
    }

    <T> void setProperty(PropertyKey<T> propertyKey, T value) {
        properties.put(propertyKey, value);
    }

    /**
     * Checks if the valency of this atom allows it to have the amount of spare valency that the atom currently has
     * May reduce the spare valency on the atom to be consistent with the valency of the atom
     * Does nothing if the atom has no spare valency
     * @param takeIntoAccountExternalBonds
     * @throws StructureBuildingException
     */
    void ensureSVIsConsistantWithValency(boolean takeIntoAccountExternalBonds) throws StructureBuildingException {
        if (spareValency) {
            Integer maxValency;
            if (lambdaConventionValency != null) {
                maxValency = lambdaConventionValency + protonsExplicitlyAddedOrRemoved;
            }
            else{
                Integer hwValency = ValencyChecker.getHWValency(chemEl);
                if (hwValency == null) {
                    throw new StructureBuildingException(chemEl + " is not expected to be aromatic!");
                }
                maxValency = hwValency + protonsExplicitlyAddedOrRemoved;
            }
            int maxSpareValency;
            if (takeIntoAccountExternalBonds) {
                maxSpareValency = maxValency - getIncomingValency() - outValency;
            }
            else{
                maxSpareValency = maxValency - frag.getIntraFragmentIncomingValency(this);
            }
            if (maxSpareValency < 1) {
                setSpareValency(false);
            }
        }
    }

    /**
     * Returns the first bond in the atom's bond list or null if it has no bonds
     * @return
     */
    Bond getFirstBond() {
        if (bonds.size() > 0){
            return bonds.get(0);
} return null; } /**Gets the bond between this atom and a given atom * * @param a The atom to find a bond to * @return The bond, or null if there is no bond */ Bond getBondToAtom(Atom a) { for (int i = 0, l = bonds.size(); i < l; i++) { Bond b = bonds.get(i); if(b.getOtherAtom(this) == a){ return b; } } return null; } /**Gets the bond between this atom and a given atom, throwing if fails. * * @param a The atom to find a bond to * @return The bond found * @throws StructureBuildingException */ Bond getBondToAtomOrThrow(Atom a) throws StructureBuildingException { Bond b = getBondToAtom(a); if(b == null){ throw new StructureBuildingException("Couldn't find specified bond"); } return b; } /** * Set the stereo group, ignored if the atom does not have any parity info. * @param stroGrp the stereo group. */ void setStereoGroup(StereoGroup stroGrp) { if (atomParity != null) { atomParity.setStereoGroup(stroGrp); } } /** * Access the stereo group on the atom parity info. * @return the stereo group */ StereoGroup getStereoGroup() { return atomParity != null ? atomParity.getStereoGroup() : UNKNOWN_STEREOGROUP; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/AtomParity.java000066400000000000000000000031311451751637500266060ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Hold information about 4 atoms and their chiral determinant allowing the description of tetrahedral stereochemistry * @author dl387 * */ class AtomParity { /** * A dummy hydrogen atom. Used to represent an implicit hydrogen that is attached to a tetrahedral stereocentre */ static final Atom hydrogen = new Atom(ChemEl.H); /** * A dummy hydrogen atom. 
     Used to represent the hydrogen that replaced a hydroxy at a tetrahedral stereocentre
     */
    static final Atom deoxyHydrogen = new Atom(ChemEl.H);

    /**
     * Typical case of an absolute stereocentre in group number 1
     */
    private static final StereoGroup ABSOLUTE_STEREOGROUP = new StereoGroup(StereoGroupType.Abs);

    private Atom[] atomRefs4;
    private int parity;
    private StereoGroup stereoGroup = ABSOLUTE_STEREOGROUP;

    /**
     * Create an atomParity from an array of 4 atoms and the parity of the chiral determinant
     * @param atomRefs4
     * @param parity
     */
    AtomParity(Atom[] atomRefs4, int parity){
        if (atomRefs4.length !=4){
            throw new IllegalArgumentException("atomRefs4 must contain references to 4 atoms");
        }
        this.atomRefs4 = atomRefs4;
        this.parity = parity;
    }

    Atom[] getAtomRefs4() {
        return atomRefs4;
    }

    void setAtomRefs4(Atom[] atomRefs4) {
        this.atomRefs4 = atomRefs4;
    }

    int getParity() {
        return parity;
    }

    void setParity(int parity) {
        this.parity = parity;
    }

    void setStereoGroup(StereoGroup stroGrp) {
        this.stereoGroup = stroGrp;
    }

    StereoGroup getStereoGroup() {
        return this.stereoGroup;
    }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/AtomProperties.java000066400000000000000000000171421451751637500275010ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.EnumMap;
import java.util.Map;

/**
 * Holds useful atomic properties
 * @author dl387
 *
 */
class AtomProperties {

    private static final Map<ChemEl, Double> elementToPaulingElectronegativity = new EnumMap<>(ChemEl.class);
    private static final Map<ChemEl, Integer> elementToHwPriority = new EnumMap<>(ChemEl.class);

    static{
        elementToPaulingElectronegativity.put(ChemEl.H, 2.20);
        elementToPaulingElectronegativity.put(ChemEl.Li, 0.98);
        elementToPaulingElectronegativity.put(ChemEl.Be, 1.57);
        elementToPaulingElectronegativity.put(ChemEl.B, 2.04);
        elementToPaulingElectronegativity.put(ChemEl.C, 2.55);
        elementToPaulingElectronegativity.put(ChemEl.N, 3.04);
        elementToPaulingElectronegativity.put(ChemEl.O, 3.44);
elementToPaulingElectronegativity.put(ChemEl.F, 3.98); elementToPaulingElectronegativity.put(ChemEl.Na, 0.93); elementToPaulingElectronegativity.put(ChemEl.Mg, 1.31); elementToPaulingElectronegativity.put(ChemEl.Al, 1.61); elementToPaulingElectronegativity.put(ChemEl.Si, 1.90); elementToPaulingElectronegativity.put(ChemEl.P, 2.19); elementToPaulingElectronegativity.put(ChemEl.S, 2.58); elementToPaulingElectronegativity.put(ChemEl.Cl, 3.16); elementToPaulingElectronegativity.put(ChemEl.K, 0.82); elementToPaulingElectronegativity.put(ChemEl.Ca, 1.00); elementToPaulingElectronegativity.put(ChemEl.Sc, 1.36); elementToPaulingElectronegativity.put(ChemEl.Ti, 1.54); elementToPaulingElectronegativity.put(ChemEl.V, 1.63); elementToPaulingElectronegativity.put(ChemEl.Cr, 1.66); elementToPaulingElectronegativity.put(ChemEl.Mn, 1.55); elementToPaulingElectronegativity.put(ChemEl.Fe, 1.83); elementToPaulingElectronegativity.put(ChemEl.Co, 1.88); elementToPaulingElectronegativity.put(ChemEl.Ni, 1.91); elementToPaulingElectronegativity.put(ChemEl.Cu, 1.90); elementToPaulingElectronegativity.put(ChemEl.Zn, 1.65); elementToPaulingElectronegativity.put(ChemEl.Ga, 1.81); elementToPaulingElectronegativity.put(ChemEl.Ge, 2.01); elementToPaulingElectronegativity.put(ChemEl.As, 2.18); elementToPaulingElectronegativity.put(ChemEl.Se, 2.55); elementToPaulingElectronegativity.put(ChemEl.Br, 2.96); elementToPaulingElectronegativity.put(ChemEl.Kr, 3.00); elementToPaulingElectronegativity.put(ChemEl.Rb, 0.82); elementToPaulingElectronegativity.put(ChemEl.Sr, 0.95); elementToPaulingElectronegativity.put(ChemEl.Y, 1.22); elementToPaulingElectronegativity.put(ChemEl.Zr, 1.33); elementToPaulingElectronegativity.put(ChemEl.Nb, 1.6); elementToPaulingElectronegativity.put(ChemEl.Mo, 2.16); elementToPaulingElectronegativity.put(ChemEl.Tc, 1.9); elementToPaulingElectronegativity.put(ChemEl.Ru, 2.2); elementToPaulingElectronegativity.put(ChemEl.Rh, 2.28); elementToPaulingElectronegativity.put(ChemEl.Pd, 
2.20); elementToPaulingElectronegativity.put(ChemEl.Ag, 1.93); elementToPaulingElectronegativity.put(ChemEl.Cd, 1.69); elementToPaulingElectronegativity.put(ChemEl.In, 1.78); elementToPaulingElectronegativity.put(ChemEl.Sn, 1.96); elementToPaulingElectronegativity.put(ChemEl.Sb, 2.05); elementToPaulingElectronegativity.put(ChemEl.Te, 2.1); elementToPaulingElectronegativity.put(ChemEl.I, 2.66); elementToPaulingElectronegativity.put(ChemEl.Xe, 2.60); elementToPaulingElectronegativity.put(ChemEl.Cs, 0.79); elementToPaulingElectronegativity.put(ChemEl.Ba, 0.89); elementToPaulingElectronegativity.put(ChemEl.La, 1.1); elementToPaulingElectronegativity.put(ChemEl.Ce, 1.12); elementToPaulingElectronegativity.put(ChemEl.Pr, 1.13); elementToPaulingElectronegativity.put(ChemEl.Nd, 1.14); elementToPaulingElectronegativity.put(ChemEl.Pm, 1.13); elementToPaulingElectronegativity.put(ChemEl.Sm, 1.17); elementToPaulingElectronegativity.put(ChemEl.Eu, 1.2); elementToPaulingElectronegativity.put(ChemEl.Gd, 1.2); elementToPaulingElectronegativity.put(ChemEl.Tb, 1.1); elementToPaulingElectronegativity.put(ChemEl.Dy, 1.22); elementToPaulingElectronegativity.put(ChemEl.Ho, 1.23); elementToPaulingElectronegativity.put(ChemEl.Er, 1.24); elementToPaulingElectronegativity.put(ChemEl.Tm, 1.25); elementToPaulingElectronegativity.put(ChemEl.Yb, 1.1); elementToPaulingElectronegativity.put(ChemEl.Lu, 1.27); elementToPaulingElectronegativity.put(ChemEl.Hf, 1.3); elementToPaulingElectronegativity.put(ChemEl.Ta, 1.5); elementToPaulingElectronegativity.put(ChemEl.W, 2.36); elementToPaulingElectronegativity.put(ChemEl.Re, 1.9); elementToPaulingElectronegativity.put(ChemEl.Os, 2.2); elementToPaulingElectronegativity.put(ChemEl.Ir, 2.20); elementToPaulingElectronegativity.put(ChemEl.Pt, 2.28); elementToPaulingElectronegativity.put(ChemEl.Au, 2.54); elementToPaulingElectronegativity.put(ChemEl.Hg, 2.00); elementToPaulingElectronegativity.put(ChemEl.Tl, 1.62); 
elementToPaulingElectronegativity.put(ChemEl.Pb, 2.33); elementToPaulingElectronegativity.put(ChemEl.Bi, 2.02); elementToPaulingElectronegativity.put(ChemEl.Po, 2.0); elementToPaulingElectronegativity.put(ChemEl.At, 2.2); elementToPaulingElectronegativity.put(ChemEl.Rn, 2.2); elementToPaulingElectronegativity.put(ChemEl.Fr, 0.7); elementToPaulingElectronegativity.put(ChemEl.Ra, 0.9); elementToPaulingElectronegativity.put(ChemEl.Ac, 1.1); elementToPaulingElectronegativity.put(ChemEl.Th, 1.3); elementToPaulingElectronegativity.put(ChemEl.Pa, 1.5); elementToPaulingElectronegativity.put(ChemEl.U, 1.38); elementToPaulingElectronegativity.put(ChemEl.Np, 1.36); elementToPaulingElectronegativity.put(ChemEl.Pu, 1.28); elementToPaulingElectronegativity.put(ChemEl.Am, 1.13); elementToPaulingElectronegativity.put(ChemEl.Cm, 1.28); elementToPaulingElectronegativity.put(ChemEl.Bk, 1.3); elementToPaulingElectronegativity.put(ChemEl.Cf, 1.3); elementToPaulingElectronegativity.put(ChemEl.Es, 1.3); elementToPaulingElectronegativity.put(ChemEl.Fm, 1.3); elementToPaulingElectronegativity.put(ChemEl.Md, 1.3); elementToPaulingElectronegativity.put(ChemEl.No, 1.3); elementToPaulingElectronegativity.put(ChemEl.Lr, 1.3); elementToHwPriority.put(ChemEl.F, 23); elementToHwPriority.put(ChemEl.Cl, 22); elementToHwPriority.put(ChemEl.Br, 21); elementToHwPriority.put(ChemEl.I, 20); elementToHwPriority.put(ChemEl.O, 19); elementToHwPriority.put(ChemEl.S, 18); elementToHwPriority.put(ChemEl.Se, 17); elementToHwPriority.put(ChemEl.Te, 16); elementToHwPriority.put(ChemEl.N, 15); elementToHwPriority.put(ChemEl.P, 14); elementToHwPriority.put(ChemEl.As, 13); elementToHwPriority.put(ChemEl.Sb, 12); elementToHwPriority.put(ChemEl.Bi, 11); elementToHwPriority.put(ChemEl.Si, 10); elementToHwPriority.put(ChemEl.Ge, 9); elementToHwPriority.put(ChemEl.Sn, 8); elementToHwPriority.put(ChemEl.Pb, 7); elementToHwPriority.put(ChemEl.B, 6); elementToHwPriority.put(ChemEl.Al, 5); elementToHwPriority.put(ChemEl.Ga, 
            4);
        elementToHwPriority.put(ChemEl.In, 3);
        elementToHwPriority.put(ChemEl.Tl, 2);
        elementToHwPriority.put(ChemEl.Hg, 1);
    }

    /**
     * Useful to give an indication of whether a bond is likely to be ionic (diff >1.8), polar or covalent (diff < 1.2)
     * @param chemEl
     * @return
     */
    static Double getPaulingElectronegativity(ChemEl chemEl) {
        return elementToPaulingElectronegativity.get(chemEl);
    }

    /**
     * Maps chemEl to the priority of that atom in the Hantzsch-Widman system. A higher value indicates a higher priority.
     * @param chemEl
     * @return
     */
    static Integer getHwpriority(ChemEl chemEl) {
        return elementToHwPriority.get(chemEl);
    }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Attribute.java000066400000000000000000000012451451751637500264640ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

class Attribute {

    private final String name;
    private String value;

    Attribute(String name, String value) {
        this.name = name;
        this.value = value;
    }

    /**
     * Creates a copy
     * @param attribute
     */
    Attribute(Attribute attribute) {
        this.name = attribute.getName();
        this.value = attribute.getValue();
    }

    String getValue() {
        return value;
    }

    String getName() {
        return name;
    }

    void setValue(String value) {
        this.value = value;
    }

    String toXML() {
        return getName() + "=\"" + OpsinTools.xmlEncode(value) + "\"";
    }

    public String toString() {
        return name +"\t" + value;
    }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/AutomatonInitialiser.java000066400000000000000000000104241451751637500306640ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import dk.brics.automaton.Automaton;
import dk.brics.automaton.RegExp;
import dk.brics.automaton.RunAutomaton;
import dk.brics.automaton.SpecialOperations;

/**
 * Handles storing and retrieving
 automata to/from files
 * This is highly useful to do as building these deterministic automata from scratch can take minutes
 * @author dl387
 *
 */
class AutomatonInitialiser {

    private static final Logger LOG = LogManager.getLogger(AutomatonInitialiser.class);
    private final ResourceGetter resourceGetter;

    AutomatonInitialiser(String resourcePath) {
        resourceGetter = new ResourceGetter(resourcePath);
    }

    /**
     * In preference serialised automata and their hashes will be looked for in the resource folder in your working directory
     * If they cannot be found there then these files will be looked for in the standard resource folder
     * (this is actually the standard behaviour of the resourceGetter but it is reiterated here because, if the stored hash doesn't match
     * the current hash, the updated serialised automaton and hash will be created in the working directory resource folder as the standard
     * resource folder will not typically be writable)
     * @param automatonName : A name for the automaton so that it can be saved/loaded from disk
     * @param regex : the regex from which to build the RunAutomaton
     * @param tableize : if true, a transition table is created which makes the run method faster at the cost of higher memory usage (adds ~256kb)
     * @param reverseAutomaton : should the automaton be reversed
     * @return A RunAutomaton, may have been built from scratch or loaded from a file
     */
    RunAutomaton loadAutomaton(String automatonName, String regex, boolean tableize, boolean reverseAutomaton) {
        if (reverseAutomaton){
            automatonName+="_reversed_";
        }
        try{
            if (isAutomatonCached(automatonName, regex)) {
                return loadCachedAutomaton(automatonName);
            }
        }
        catch (IOException e) {
            LOG.warn("Error loading cached automaton: "+automatonName, e);
        }
        RunAutomaton automaton = createAutomaton(regex, tableize, reverseAutomaton);
        cacheAutomaton(automatonName, automaton, regex);
        return automaton;
    }

    private boolean isAutomatonCached(String automatonName, String regex) {
        String currentRegexHash =
getRegexHash(regex); String cachedRegexHash = getCachedRegexHash(automatonName); return currentRegexHash.equals(cachedRegexHash); } private String getRegexHash(String regex) { return Integer.toString(regex.hashCode()); } private String getCachedRegexHash(String automatonName) { /*This file contains the hashcode of the regex which was used to generate the automaton on the disk */ return resourceGetter.getFileContentsAsString(automatonName + "RegexHash.txt"); } private RunAutomaton loadCachedAutomaton(String automatonName) throws IOException{ try (InputStream automatonInput = resourceGetter.getInputstreamFromFileName(automatonName +"SerialisedAutomaton.aut")){ return RunAutomaton.load(new BufferedInputStream(automatonInput)); } catch (Exception e) { IOException ioe = new IOException("Error loading automaton"); ioe.initCause(e); throw ioe; } } private static RunAutomaton createAutomaton(String regex, boolean tableize, boolean reverseAutomaton) { Automaton a = new RegExp(regex).toAutomaton(); if (reverseAutomaton){ SpecialOperations.reverse(a); } return new RunAutomaton(a, tableize); } private void cacheAutomaton(String automatonName, RunAutomaton automaton, String regex) { try (OutputStream regexHashOutputStream = resourceGetter.getOutputStream(automatonName + "RegexHash.txt")) { regexHashOutputStream.write(getRegexHash(regex).getBytes(StandardCharsets.UTF_8)); try (OutputStream automatonOutputStream = resourceGetter.getOutputStream(automatonName + "SerialisedAutomaton.aut")) { automaton.store(automatonOutputStream); } } catch (IOException e) { LOG.warn("Error serialising automaton: "+automatonName, e); } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Bond.java000066400000000000000000000067661451751637500254200ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; /**A bond, between two atoms. 
* * @author ptc24 * @author dl387 * */ class Bond { /** The Atom the bond comes from */ private final Atom from; /** The Atom the bond goes to */ private final Atom to; /** The bond order */ private int order; static enum SMILES_BOND_DIRECTION{ RSLASH, LSLASH } /** If this bond was built from SMILES can be set to either RSLASH or LSLASH. Subsequently read to add a bondStereoElement * null by default*/ private SMILES_BOND_DIRECTION smilesBondDirection = null; /** * Holds the bondStereo object associated with this bond * null by default */ private BondStereo bondStereo = null; /** DO NOT CALL DIRECTLY EXCEPT FOR TESTING * Creates a new Bond. * * @param from The Atom the bond comes from. * @param to The Atom the bond goes to. * @param order The bond order. */ Bond(Atom from, Atom to, int order) { if (from == to){ throw new IllegalArgumentException("Bonds must be made between different atoms"); } if (order < 1 || order > 3){ throw new IllegalArgumentException("Bond order must be 1, 2 or 3"); } if (from == null){ throw new IllegalArgumentException("From atom was null!"); } if (to == null){ throw new IllegalArgumentException("To atom was null!"); } this.from = from; this.to = to; this.order = order; } /** * Gets from ID * @return ID */ int getFrom() { return from.getID(); } /** * Gets to ID * @return ID */ int getTo() { return to.getID(); } /**Gets order. * @return*/ int getOrder() { return order; } /**Sets order. * @param order*/ void setOrder(int order) { this.order = order; } /** * Gets from Atom * @return Atom */ Atom getFromAtom() { return from; } /** * Gets to Atom * @return Atom */ Atom getToAtom() { return to; } /**Adds to the bond order. * * @param o The value to be added to the bond order. 
*/ void addOrder(int o) { order += o; } /** * Returns either null or RSLASH or LSLASH * @return */ SMILES_BOND_DIRECTION getSmilesStereochemistry() { return smilesBondDirection; } void setSmilesStereochemistry(SMILES_BOND_DIRECTION bondDirection) { this.smilesBondDirection = bondDirection; } BondStereo getBondStereo() { return bondStereo; } void setBondStereo(BondStereo bondStereo) { this.bondStereo = bondStereo; } void setBondStereoElement(Atom[] atomRefs4, BondStereoValue cOrT) { bondStereo = new BondStereo(atomRefs4, cOrT); } /** * Returns the atom at the other end of the bond to given atom * @param atom * @return */ Atom getOtherAtom(Atom atom) { if (from == atom){ return to; } else if (to == atom){ return from; } else{ return null; } } @Override public int hashCode() { final int prime = 31; int result = 1; result = prime * result + from.getID(); result = prime * result + to.getID(); return result; } @Override public boolean equals(Object obj) { if (this == obj) { return true; } if (obj == null) { return false; } if (getClass() != obj.getClass()) { return false; } Bond other = (Bond) obj; if (from == other.from && to == other.to){ return true; } if (from == other.to && to == other.from){ return true; } return false; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/BondStereo.java000066400000000000000000000026701451751637500265700ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Holds information about the positions of 2 atoms relative to a double bond allowing the specification of cis/trans stereochemistry * @author dl387 * */ class BondStereo { private Atom[] atomRefs4; private BondStereoValue bondStereoValue; /** * Possible values for a bondStereo element * @author dl387 * */ enum BondStereoValue{ CIS("C"), TRANS("T"); private final String value; BondStereoValue(String value){ this.value = value; } @Override public String toString() { return value; } } /** * Create a bondStereo from an array of 4 atoms. 
     The 2nd and 3rd atoms of this array are connected via a double bond.
     * The 1st and 4th atoms are at either end of this bond and indication is given as to whether they are cis or trans to each other.
     * @param atomRefs4
     * @param cOrT
     */
    BondStereo(Atom[] atomRefs4, BondStereoValue cOrT) {
        if (atomRefs4.length !=4){
            throw new IllegalArgumentException("atomRefs4 must contain references to 4 atoms");
        }
        this.atomRefs4 = atomRefs4;
        this.bondStereoValue = cOrT;
    }

    Atom[] getAtomRefs4() {
        return atomRefs4;
    }

    void setAtomRefs4(Atom[] atomRefs4) {
        this.atomRefs4 = atomRefs4;
    }

    BondStereoValue getBondStereoValue() {
        return bondStereoValue;
    }

    void setBondStereoValue(BondStereoValue bondStereoValue) {
        this.bondStereoValue = bondStereoValue;
    }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/BuildResults.java000066400000000000000000000106151451751637500271430ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * A "struct" to hold the results of fragment building.
 * @author dl387
 *
 */
class BuildResults {
    /**Holds the atoms that are currently marked as radicals. An atom may be listed twice for say diyl
     * Typically these will be utilised by a word rule e.g. the ethyl of ethyl ethanoate has one
     * Also holds the order of the bond that will be created when it is used (valency)
     * setExplicitly says whether the outAtom absolutely definitely refers to that atom or not.
     * e.g. propyl is stored as prop-1-yl with this set to false while prop-2-yl has it set to true
     * These OutAtoms are the same objects as are present in the fragments*/
    private final List<OutAtom> outAtoms = new ArrayList<>();

    /**The atoms that may be used to form things like esters*/
    private final List<FunctionalAtom> functionalAtoms = new ArrayList<>();

    /**A list of fragments that have been evaluated to form this BuildResults.
They are in the order they would be found in the XML*/ private final Set<Fragment> fragments = new LinkedHashSet<>(); /**A BuildResults is constructed from a list of Fragments. * This constructor creates this list from the groups present in an XML word/bracket/sub element. * @param wordSubOrBracket*/ BuildResults(Element wordSubOrBracket) { List<Element> groups = OpsinTools.getDescendantElementsWithTagName(wordSubOrBracket, XmlDeclarations.GROUP_EL); for (Element group : groups) { Fragment frag = group.getFrag(); fragments.add(frag); for (int i = 0, l = frag.getOutAtomCount(); i < l; i++) { outAtoms.add(frag.getOutAtom(i)); } int functionalAtomCount = frag.getFunctionalAtomCount(); if (functionalAtomCount > 0){ Element parent = group.getParent(); if (parent.getName().equals(XmlDeclarations.ROOT_EL) || OpsinTools.getNextGroup(group) == null) { for (int i = 0; i < functionalAtomCount; i++) { functionalAtoms.add(frag.getFunctionalAtom(i)); } } } } } /** * Construct a blank buildResults */ BuildResults() {} /** * Returns a read only view of the fragments in this BuildResults * @return */ Set<Fragment> getFragments(){ return Collections.unmodifiableSet(fragments); } int getFragmentCount(){ return fragments.size(); } OutAtom getOutAtom(int i) { return outAtoms.get(i); } int getOutAtomCount() { return outAtoms.size(); } OutAtom removeOutAtom(int i) { OutAtom outAtom = outAtoms.get(i); outAtom.getAtom().getFrag().removeOutAtom(outAtom); return outAtoms.remove(i); } void removeAllOutAtoms() { for (int i = outAtoms.size() -1; i >=0 ; i--) { removeOutAtom(i); } } /** * Returns the atom corresponding to position i in the functionalAtoms list * @param i index * @return atom */ Atom getFunctionalAtom(int i) { return functionalAtoms.get(i).getAtom(); } FunctionalAtom removeFunctionalAtom(int i) { FunctionalAtom functionalAtom = functionalAtoms.get(i); functionalAtom.getAtom().getFrag().removeFunctionalAtom(functionalAtom); return functionalAtoms.remove(i); } int getFunctionalAtomCount(){ return 
functionalAtoms.size(); } /** * Returns the first OutAtom * @return OutAtom */ OutAtom getFirstOutAtom() { return outAtoms.get(0); } /** * Returns the atom corresponding to the given id, assuming the atom the id corresponds to is within the list of fragments in this BuildResults * @param id index * @return atom * @throws StructureBuildingException */ Atom getAtomByIdOrThrow(int id) throws StructureBuildingException { for (Fragment fragment : fragments) { Atom outAtom =fragment.getAtomByID(id); if (outAtom != null){ return outAtom; } } throw new StructureBuildingException("No fragment contained this id: " + id); } void mergeBuildResults(BuildResults otherBR) { outAtoms.addAll(otherBR.outAtoms); functionalAtoms.addAll(otherBR.functionalAtoms); fragments.addAll(otherBR.fragments); } /** * Returns the sum of the charges of the fragments in the buildResults * @return */ int getCharge() { int totalCharge = 0; for (Fragment frag : fragments) { totalCharge += frag.getCharge(); } return totalCharge; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/BuildState.java000066400000000000000000000026341451751637500265640ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import uk.ac.cam.ch.wwmm.opsin.OpsinWarning.OpsinWarningType; /** * Used to pass the current configuration and FragmentManager around * The currentWordRule can be mutated to keep track of what the parent wordRule is at the given time * * @author dl387 * */ class BuildState { final FragmentManager fragManager; final HashMap<Element, List<Fragment>> xmlSuffixMap; final NameToStructureConfig n2sConfig; // counter is used for DL- racemic stereochemistry in oligomers, we place each one in a separate racemic group, // there is implicitly one group in case the input has a combination of (RS)- and then DL- int numRacGrps = 1; private final List<OpsinWarning> warnings = new ArrayList<>(); WordRule currentWordRule = null; BuildState(NameToStructureConfig 
n2sConfig) { this.n2sConfig = n2sConfig; IDManager idManager = new IDManager(); fragManager = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); xmlSuffixMap = new HashMap<>(); } List<OpsinWarning> getWarnings() { return warnings; } void addWarning(OpsinWarningType type, String message) { warnings.add(new OpsinWarning(type, message)); } void addIsAmbiguous(String message) { warnings.add(new OpsinWarning(OpsinWarningType.APPEARS_AMBIGUOUS, message)); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CASTools.java000066400000000000000000000246671451751637500261650ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Arrays; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; /** * Tools for converting CAS nomenclature into IUPAC nomenclature. * @author dl387 */ class CASTools { private static final Pattern matchCasCollectiveIndex = Pattern.compile("([\\[\\(\\{]([1-9][0-9]?[cC][iI][, ]?)+[\\]\\)\\}])+|[1-9][0-9]?[cC][iI]", Pattern.CASE_INSENSITIVE); private static final Pattern matchAcid = Pattern.compile("acid[\\]\\)\\}]*", Pattern.CASE_INSENSITIVE); private static final Pattern matchCommaSpace = Pattern.compile(", "); private static final Pattern matchCompoundWithPhrase = Pattern.compile("(compd\\. with|compound with|and) ", Pattern.CASE_INSENSITIVE); private static final Pattern matchFunctionalTermAllowingSubstituentPrefix = Pattern.compile("(amide|hydrazide|(thi|selen|tellur)?oxime|hydrazone|(iso)?(semicarbazone|thiosemicarbazone|selenosemicarbazone|tellurosemicarbazone)|imide|imine|semioxamazone)[\\]\\)\\}]*", Pattern.CASE_INSENSITIVE); /** * Uninverts a CAS name. 
* Throws an exception if OPSIN is unable to determine whether something is a substituent or functional term * or if something unexpected in a CAS name is encountered * @param name * @return * @throws ParsingException */ static String uninvertCASName(String name, ParseRules parseRules) throws ParsingException { List<String> nameComponents = new ArrayList<>(Arrays.asList(matchCommaSpace.split(name))); List<String> substituents = new ArrayList<>(); List<String> seperateWordSubstituents = new ArrayList<>(); List<String> functionalTerms = new ArrayList<>(); String parent = nameComponents.get(0); String[] parentNameParts = parent.split(" "); if (parentNameParts.length != 1) { if (matchCasCollectiveIndex.matcher(parentNameParts[parentNameParts.length - 1]).matches()) {//CAS collective index description should be ignored StringBuilder parentSB = new StringBuilder(); for (int i = 0; i < parentNameParts.length - 1; i++) { parentSB.append(parentNameParts[i]); } parent = parentSB.toString(); parentNameParts = parent.split(" "); } for (int i = 1; i < parentNameParts.length; i++) { if (!matchAcid.matcher(parentNameParts[i]).matches()) { ParseRulesResults results = parseRules.getParses(parentNameParts[i]); List<ParseTokens> parseTokens = results.getParseTokensList(); if (parseTokens.isEmpty()) { throw new ParsingException("Invalid CAS name. 
Parent compound was followed by an unexpected term"); } } } } boolean addedBracket = false; boolean esterEncountered = false; for (int i = 1; i < nameComponents.size(); i++) { String nameComponent = nameComponents.get(i); Matcher m = matchCompoundWithPhrase.matcher(nameComponent); boolean compoundWithcomponent = false; if (m.lookingAt()) { nameComponent = nameComponent.substring(m.group().length()); compoundWithcomponent = true; } String[] components = nameComponents.get(i).split(" "); for (int c = 0, componentLen = components.length; c < componentLen; c++) { String component = components[c]; if (compoundWithcomponent) { functionalTerms.add(component); continue; } if (component.endsWith("-")) { Character missingCloseBracket = missingCloseBracketCharIfApplicable(component); if (missingCloseBracket !=null) { if (addedBracket) { throw new ParsingException("Close bracket appears to be missing"); } parent += missingCloseBracket; addedBracket = true; } substituents.add(component); } else { ParseRulesResults results = parseRules.getParses(component); List<ParseTokens> parseTokens = results.getParseTokensList(); if (parseTokens.size() > 0) { List<ParseWord> parseWords = WordTools.splitIntoParseWords(parseTokens, component); List<ParseTokens> firstParseWordTokens = parseWords.get(0).getParseTokens(); WordType firstWordType = OpsinTools.determineWordType(firstParseWordTokens.get(0).getAnnotations()); for (int j = 1; j < firstParseWordTokens.size(); j++) { if (!firstWordType.equals(OpsinTools.determineWordType(firstParseWordTokens.get(j).getAnnotations()))) { throw new ParsingException(component + " can be interpreted in multiple ways. 
For the sake of precision OPSIN has decided not to process this as a CAS name"); } } if (parseWords.size() == 1) { switch (firstWordType) { case functionalTerm: if (component.equalsIgnoreCase("ester")) { if (seperateWordSubstituents.isEmpty()){ throw new ParsingException("ester encountered but no substituents were specified in potential CAS name!"); } if (esterEncountered) { throw new ParsingException("ester formation was mentioned more than once in CAS name!"); } parent = uninvertEster(parent); esterEncountered = true; } else { functionalTerms.add(component); } break; case substituent: seperateWordSubstituents.add(component); break; case full: if (StringTools.endsWithCaseInsensitive(component, "ate") || StringTools.endsWithCaseInsensitive(component, "ite")//e.g. Piperazinium, 1,1-dimethyl-, 2,2,2-trifluoroacetate hydrochloride || StringTools.endsWithCaseInsensitive(component, "ium") || StringTools.endsWithCaseInsensitive(component, "hydrofluoride") || StringTools.endsWithCaseInsensitive(component, "hydrochloride") || StringTools.endsWithCaseInsensitive(component, "hydrobromide") || StringTools.endsWithCaseInsensitive(component, "hydroiodide")) { functionalTerms.add(component); } else if (StringTools.endsWithCaseInsensitive(component, "ic") && c + 1 < componentLen && components[c + 1].equalsIgnoreCase("acid")) { functionalTerms.add(component); functionalTerms.add(components[++c]); } else { throw new ParsingException("Unable to interpret: " + component + " (as part of a CAS index name)- A full word was encountered where a substituent or functionalTerm was expected"); } break; default: throw new ParsingException("Unrecognised CAS index name form"); } } else if (parseWords.size() == 2 && firstWordType.equals(WordType.substituent)) { //could be something like O-methyloxime which is parsed as [O-methyl] [oxime] List<ParseTokens> secondParseWordTokens = parseWords.get(1).getParseTokens(); WordType secondWordType = 
OpsinTools.determineWordType(secondParseWordTokens.get(0).getAnnotations()); for (int j = 1; j < secondParseWordTokens.size(); j++) { if (!secondWordType.equals(OpsinTools.determineWordType(secondParseWordTokens.get(j).getAnnotations()))) { throw new ParsingException(component + " can be interpreted in multiple ways. For the sake of precision OPSIN has decided not to process this as a CAS name"); } } if (secondWordType.equals(WordType.functionalTerm) && matchFunctionalTermAllowingSubstituentPrefix.matcher(parseWords.get(1).getWord()).matches()){ functionalTerms.add(component); } else{ throw new ParsingException("Unrecognised CAS index name form, could have a missing space?"); } } else { throw new ParsingException("Unrecognised CAS index name form"); } } else { if (!matchCasCollectiveIndex.matcher(component).matches()) {//CAS collective index description should be ignored throw new ParsingException("Unable to interpret: " + component + " (as part of a CAS index name)"); } } } } } StringBuilder casName = new StringBuilder(); for (String prefixFunctionalTerm : seperateWordSubstituents) { casName.append(prefixFunctionalTerm); casName.append(" "); } for (int i = substituents.size() - 1; i >= 0; i--) { //stereochemistry term comes after substituent term. In older CAS names (9CI) this stereochemistry term can apply to the substituent term. 
Hence append in reverse order casName.append(substituents.get(i)); } casName.append(parent); for (String functionalTerm : functionalTerms) { casName.append(" "); casName.append(functionalTerm); } return casName.toString(); } private static Character missingCloseBracketCharIfApplicable(String component) { int bracketLevel =0; Character missingCloseBracket =null; for (int i = 0, l = component.length(); i < l; i++) { char character = component.charAt(i); if (character == '(' || character == '[' || character == '{') { bracketLevel++; if (bracketLevel ==1){ missingCloseBracket = character; } } if (character == ')' || character == ']' || character == '}') { bracketLevel--; if (bracketLevel<0){ return null; } } } if (bracketLevel == 1){ if (missingCloseBracket == '('){ return ')'; } if (missingCloseBracket == '['){ return ']'; } if (missingCloseBracket == '{'){ return '}'; } } return null; } /** * Modifies the name of the parent acid from ic to ate (or ous to ite) * hence allowing the formation of the uninverted ester * @param parent * @return * @throws ParsingException */ private static String uninvertEster(String parent) throws ParsingException { int len = parent.length(); if (len == 0) { throw new ParsingException("Failed to uninvert CAS ester"); } char lastChar = parent.charAt(len - 1); if (lastChar == ')') { if (StringTools.endsWithCaseInsensitive(parent, "ic acid)")) { parent = parent.substring(0, parent.length() - 8) + "ate)"; } else if (StringTools.endsWithCaseInsensitive(parent, "ous acid)")) { parent = parent.substring(0, parent.length() - 9) + "ite)"; } else if (StringTools.endsWithCaseInsensitive(parent, "ine)")){//amino acid parent = parent.substring(0, parent.length() - 2) + "ate)"; } else{ throw new ParsingException("Failed to uninvert CAS ester"); } } else { if (StringTools.endsWithCaseInsensitive(parent, "ic acid")) { parent = parent.substring(0, parent.length() - 7) + "ate"; } else if (StringTools.endsWithCaseInsensitive(parent, "ous acid")) { parent = 
parent.substring(0, parent.length() - 8) + "ite"; } else if (StringTools.endsWithCaseInsensitive(parent, "ine")){//amino acid parent = parent.substring(0, parent.length() - 1) + "ate"; } else{ throw new ParsingException("Failed to uninvert CAS ester"); } } return parent; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CMLWriter.java000066400000000000000000000154021451751637500263310ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.ByteArrayOutputStream; import java.io.UnsupportedEncodingException; import java.util.List; import javax.xml.stream.XMLOutputFactory; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamWriter; import com.ctc.wstx.api.WstxOutputProperties; import com.ctc.wstx.stax.WstxOutputFactory; class CMLWriter { /** * CML Elements/Attributes/NameSpace */ static final String CML_NAMESPACE = "http://www.xml-cml.org/schema"; private static final XMLOutputFactory factory = new WstxOutputFactory(); static { factory.setProperty(WstxOutputProperties.P_OUTPUT_ESCAPE_CR, false); } /**The XML writer*/ private final XMLStreamWriter writer; /** * Creates a CML writer for the given fragment * @param writer */ CMLWriter(XMLStreamWriter writer) { this.writer = writer; } static String generateCml(Fragment structure, String chemicalName) { return generateCml(structure, chemicalName, false); } static String generateIndentedCml(Fragment structure, String chemicalName) { return generateCml(structure, chemicalName, true); } private static String generateCml(Fragment structure, String chemicalName, boolean indent) { ByteArrayOutputStream out = new ByteArrayOutputStream(); try { XMLStreamWriter xmlWriter = factory.createXMLStreamWriter(out, "UTF-8"); if (indent) { xmlWriter = new IndentingXMLStreamWriter(xmlWriter, 2); } CMLWriter cmlWriter = new CMLWriter(xmlWriter); cmlWriter.writeCmlStart(); cmlWriter.writeMolecule(structure, chemicalName, 1); cmlWriter.writeCmlEnd(); xmlWriter.close(); } catch 
(XMLStreamException e) { throw new RuntimeException(e); } try { return out.toString("UTF-8"); } catch (UnsupportedEncodingException e) { throw new RuntimeException("JVM doesn't support UTF-8...but it should do!"); } } void writeCmlStart(){ try { writer.writeStartElement("cml"); writer.writeDefaultNamespace(CML_NAMESPACE); writer.writeAttribute("convention", "conventions:molecular"); writer.writeNamespace("conventions", "http://www.xml-cml.org/convention/"); writer.writeNamespace("cmlDict", "http://www.xml-cml.org/dictionary/cml/"); writer.writeNamespace("nameDict", "http://www.xml-cml.org/dictionary/cml/name/"); } catch (XMLStreamException e) { throw new RuntimeException(e); } } void writeCmlEnd(){ try { writer.writeEndElement(); writer.flush(); } catch (XMLStreamException e) { throw new RuntimeException(e); } } void writeMolecule(Fragment structure, String chemicalName, int id) throws XMLStreamException { writer.writeStartElement("molecule"); writer.writeAttribute("id", "m" + id); writer.writeStartElement("name"); writer.writeAttribute("dictRef", "nameDict:unknown"); writer.writeCharacters(chemicalName); writer.writeEndElement(); if (structure != null) { writer.writeStartElement("atomArray"); for(Atom atom : structure) { writeAtom(atom); } writer.writeEndElement(); writer.writeStartElement("bondArray"); for(Bond bond : structure.getBondSet()) { writeBond(bond); } writer.writeEndElement(); } writer.writeEndElement(); } private void writeAtom(Atom atom) throws XMLStreamException { writer.writeStartElement("atom"); writer.writeAttribute("id", "a" + Integer.toString(atom.getID())); writer.writeAttribute("elementType", atom.getElement().toString()); if(atom.getCharge() != 0){ writer.writeAttribute("formalCharge", Integer.toString(atom.getCharge())); } if(atom.getIsotope() != null){ writer.writeAttribute("isotopeNumber", Integer.toString(atom.getIsotope())); } if (atom.getElement() != ChemEl.H){ int hydrogenCount =0; List<Atom> neighbours = atom.getAtomNeighbours(); for (Atom 
neighbour : neighbours) { if (neighbour.getElement() == ChemEl.H){ hydrogenCount++; } } if (hydrogenCount==0){//prevent adding of implicit hydrogen writer.writeAttribute("hydrogenCount", "0"); } } AtomParity atomParity = atom.getAtomParity(); if(atomParity != null) { StereoGroupType stereoGroupType = atomParity.getStereoGroup().getType(); if (!((stereoGroupType == StereoGroupType.Rac || stereoGroupType == StereoGroupType.Rel) && countStereoGroup(atom) == 1)) { writeAtomParity(atomParity); } } for(String locant : atom.getLocants()) { writer.writeStartElement("label"); writer.writeAttribute("value", locant); writer.writeAttribute("dictRef", "cmlDict:locant"); writer.writeEndElement(); } writer.writeEndElement(); } private int countStereoGroup(Atom atom) { StereoGroup refGroup = atom.getAtomParity().getStereoGroup(); int count = 0; for (Atom a : atom.getFrag()) { AtomParity atomParity = a.getAtomParity(); if (atomParity == null) { continue; } if (atomParity.getStereoGroup().equals(refGroup)) { count++; } } return count; } private void writeAtomParity(AtomParity atomParity) throws XMLStreamException { writer.writeStartElement("atomParity"); writeAtomRefs4(atomParity.getAtomRefs4()); writer.writeCharacters(Integer.toString(atomParity.getParity())); writer.writeEndElement(); } private void writeBond(Bond bond) throws XMLStreamException { writer.writeStartElement("bond"); writer.writeAttribute("id", "a" + Integer.toString(bond.getFrom()) + "_a" + Integer.toString(bond.getTo())); writer.writeAttribute("atomRefs2", "a" + Integer.toString(bond.getFrom()) + " a" + Integer.toString(bond.getTo())); switch (bond.getOrder()) { case 1: writer.writeAttribute("order", "S"); break; case 2: writer.writeAttribute("order", "D"); break; case 3: writer.writeAttribute("order", "T"); break; default: writer.writeAttribute("order", "unknown"); break; } BondStereo bondStereo = bond.getBondStereo(); if (bondStereo != null){ writeBondStereo(bondStereo); } writer.writeEndElement(); } private void 
writeBondStereo(BondStereo bondStereo) throws XMLStreamException { writer.writeStartElement("bondStereo"); writeAtomRefs4(bondStereo.getAtomRefs4()); writer.writeCharacters(bondStereo.getBondStereoValue().toString()); writer.writeEndElement(); } private void writeAtomRefs4(Atom[] atomRefs4) throws XMLStreamException { StringBuilder atomRefsSb = new StringBuilder(); for(int i = 0; i< atomRefs4.length - 1; i++) { atomRefsSb.append('a'); atomRefsSb.append(atomRefs4[i].getID()); atomRefsSb.append(' '); } atomRefsSb.append('a'); atomRefsSb.append(atomRefs4[atomRefs4.length - 1].getID()); writer.writeAttribute("atomRefs4", atomRefsSb.toString()); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ChemEl.java000066400000000000000000000027741451751637500256660ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; enum ChemEl { R(0), H(1), He(2), Li(3), Be(4), B(5), C(6), N(7), O(8), F(9), Ne(10), Na(11), Mg(12), Al(13), Si(14), P(15), S(16), Cl(17), Ar(18), K(19), Ca(20), Sc(21), Ti(22), V(23), Cr(24), Mn(25), Fe(26), Co(27), Ni(28), Cu(29), Zn(30), Ga(31), Ge(32), As(33), Se(34), Br(35), Kr(36), Rb(37), Sr(38), Y(39), Zr(40), Nb(41), Mo(42), Tc(43), Ru(44), Rh(45), Pd(46), Ag(47), Cd(48), In(49), Sn(50), Sb(51), Te(52), I(53), Xe(54), Cs(55), Ba(56), La(57), Ce(58), Pr(59), Nd(60), Pm(61), Sm(62), Eu(63), Gd(64), Tb(65), Dy(66), Ho(67), Er(68), Tm(69), Yb(70), Lu(71), Hf(72), Ta(73), W(74), Re(75), Os(76), Ir(77), Pt(78), Au(79), Hg(80), Tl(81), Pb(82), Bi(83), Po(84), At(85), Rn(86), Fr(87), Ra(88), Ac(89), Th(90), Pa(91), U(92), Np(93), Pu(94), Am(95), Cm(96), Bk(97), Cf(98), Es(99), Fm(100), Md(101), No(102), Lr(103), Rf(104), Db(105), Sg(106), Bh(107), Hs(108), Mt(109), Ds(110), Rg(111), Cn(112), Nh(113), Fl(114), Mc(115), Lv(116), Ts(117), Og(118); final int ATOMIC_NUM; private ChemEl(int atomicNum) { this.ATOMIC_NUM = atomicNum; } boolean isChalcogen() { return (this == O || this == S || this == Se || this == Te); } boolean isHalogen() { return 
(this == F || this == Cl || this == Br || this == I); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CipOrderingException.java000066400000000000000000000012261451751637500306040ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /**Thrown if the ordering of ligands cannot be determined by OPSIN's implementation of the CIP rules. * This could be due to a limitation of the implementation or ligands actually being indistinguishable * * @author dl387 * */ class CipOrderingException extends StereochemistryException { private static final long serialVersionUID = 1L; CipOrderingException() { super(); } CipOrderingException(String message) { super(message); } CipOrderingException(String message, Throwable cause) { super(message, cause); } CipOrderingException(Throwable cause) { super(cause); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CipSequenceRules.java000066400000000000000000000423141451751637500277420ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Collections; import java.util.Comparator; import java.util.Deque; import java.util.List; import java.util.Queue; /** * An implementation of rules 1-2 of the CIP rules i.e. 
constitutional differences then isotopes if there is a tie * Cases that require rules 3-5 to distinguish result in an exception * * Phantom atoms are not added as I believe that the results of the program will still be the same even in their absence as everything beats a phantom and comparing phantoms to phantoms achieves nothing * (higher ligancy beats lower ligancy when comparisons are performed) * @author dl387 * */ class CipSequenceRules { private static class CipOrderingRunTimeException extends RuntimeException { private static final long serialVersionUID = 1L; CipOrderingRunTimeException(String message) { super(message); } } private final Atom chiralAtom; CipSequenceRules(Atom chiralAtom) { this.chiralAtom = chiralAtom; } /** * Returns the chiral atom's neighbours in CIP order from lowest priority to highest priority * @return * @throws CipOrderingException */ List<Atom> getNeighbouringAtomsInCipOrder() throws CipOrderingException { List<Atom> neighbours = chiralAtom.getAtomNeighbours(); try { Collections.sort(neighbours, new SortByCipOrder(chiralAtom)); } catch (CipOrderingRunTimeException e) { throw new CipOrderingException(e.getMessage()); } return neighbours; } /** * Returns the chiral atom's neighbours, with the exception of the given atom, in CIP order from lowest priority to highest priority * @param neighbourToIgnore * @return * @throws CipOrderingException */ List<Atom> getNeighbouringAtomsInCipOrderIgnoringGivenNeighbour(Atom neighbourToIgnore) throws CipOrderingException { List<Atom> neighbours = chiralAtom.getAtomNeighbours(); if (!neighbours.remove(neighbourToIgnore)) { throw new IllegalArgumentException("OPSIN bug: Atom " + neighbourToIgnore.getID() +" was not a neighbour of the given stereogenic atom"); } try { Collections.sort(neighbours, new SortByCipOrder(chiralAtom)); } catch (CipOrderingRunTimeException e) { throw new CipOrderingException(e.getMessage()); } return neighbours; } /** * Holds information about what atoms to try next and how those atoms were 
reached (to prevent immediate back tracking and to detect cycles) * @author dl387 * */ private static class CipState { CipState(List<AtomWithHistory> nextAtoms1, List<AtomWithHistory> nextAtoms2) { this.nextAtoms1 = nextAtoms1; this.nextAtoms2 = nextAtoms2; } final List<AtomWithHistory> nextAtoms1; final List<AtomWithHistory> nextAtoms2; } /** * Holds an atom with associated visited atoms * @author dl387 * */ private static class AtomWithHistory { AtomWithHistory(Atom atom, List<Atom> visitedAtoms, Integer indexOfOriginalFromRoot) { this.atom = atom; this.visitedAtoms = visitedAtoms; this.indexOfOriginalFromRoot = indexOfOriginalFromRoot; } final Atom atom; final List<Atom> visitedAtoms; final Integer indexOfOriginalFromRoot; } /** * Sorts atoms by their CIP order, low to high * @author dl387 * */ private class SortByCipOrder implements Comparator<Atom> { private final Atom chiralAtom; private final AtomListCipComparator atomListCipComparator = new AtomListCipComparator(); private final ListOfAtomListsCipComparator listOfAtomListsCipComparator = new ListOfAtomListsCipComparator(); private final CipComparator cipComparator = new CipComparator(); private int rule = 0; SortByCipOrder(Atom chiralAtom) { this.chiralAtom = chiralAtom; } public int compare(Atom a, Atom b) { /* * rule = 0 --> Rule 1a Higher atomic number precedes lower * rule = 1 --> Rule 1b A duplicated atom, with its predecessor node having the same label closer to the root, ranks higher than a duplicated atom, with its predecessor node having the same label farther from the root, which ranks higher than any non-duplicated atom node * rule = 2 --> Rule 2 Higher atomic mass number precedes lower */ for (rule = 0; rule <= 2; rule++) { List<Atom> atomsVisted = new ArrayList<>(); atomsVisted.add(chiralAtom); AtomWithHistory aWithHistory = new AtomWithHistory(a, atomsVisted, null); AtomWithHistory bWithHistory = new AtomWithHistory(b, new ArrayList<>(atomsVisted), null); int compare = compareByCipRules(aWithHistory, bWithHistory); if (compare != 0) { return compare; } List<AtomWithHistory> nextAtoms1 = new 
ArrayList<>(); nextAtoms1.add(aWithHistory); List<AtomWithHistory> nextAtoms2 = new ArrayList<>(); nextAtoms2.add(bWithHistory); CipState startingState = new CipState(nextAtoms1, nextAtoms2); Deque<CipState> cipStateQueue = new ArrayDeque<>(); cipStateQueue.add(startingState); /* Go through CIP states in a breadth-first manner: * Neighbours of the given atom/s (if multiple atoms this is because so far the two paths leading to them have been equivalent) are evaluated for both a and b * Neighbours are sorted by CIP priority * Comparisons performed between neighbours of a and neighbours of b (will break if compare != 0) * Degenerate neighbours grouped together * CIP state formed for each list of neighbours and added to queue in order of priority * */ while(!cipStateQueue.isEmpty()) { CipState currentState = cipStateQueue.removeFirst(); compare = compareAtNextLevel(currentState, cipStateQueue); if (compare != 0) { return compare; } } } throw new CipOrderingRunTimeException("Failed to assign CIP stereochemistry, this indicates a bug in OPSIN or a limitation in OPSIN's implementation of the sequence rules"); } /** * Compares the neighbours of the atoms specified in nextAtom1/2 in cipstate. 
* Returns the result of the comparison between these neighbours * If the comparison returned 0 adds new cipstates to the queue * @param cipState * @param queue * @return */ private int compareAtNextLevel(CipState cipState, Queue<CipState> queue) { List<List<AtomWithHistory>> neighbours1 = getNextLevelNeighbours(cipState.nextAtoms1); List<List<AtomWithHistory>> neighbours2 = getNextLevelNeighbours(cipState.nextAtoms2); int compare = compareNeighboursByCipPriorityRules(neighbours1, neighbours2); if (compare != 0) { return compare; } List<List<AtomWithHistory>> prioritisedNeighbours1 = formListsWithSamePriority(neighbours1); List<List<AtomWithHistory>> prioritisedNeighbours2 = formListsWithSamePriority(neighbours2); //As earlier compare was 0, prioritisedNeighbours1.size() == prioritisedNeighbours2.size() for (int i = prioritisedNeighbours1.size() - 1; i >= 0; i--) { queue.add(new CipState(prioritisedNeighbours1.get(i), prioritisedNeighbours2.get(i))); } return 0; } private int compareNeighboursByCipPriorityRules(List<List<AtomWithHistory>> neighbours1, List<List<AtomWithHistory>> neighbours2) { int difference = listOfAtomListsCipComparator.compare(neighbours1, neighbours2); if (difference >0) { return 1; } if (difference < 0) { return -1; } return 0; } private List<List<AtomWithHistory>> getNextLevelNeighbours(List<AtomWithHistory> nextAtoms) { List<List<AtomWithHistory>> neighbourLists = new ArrayList<>(); for (AtomWithHistory nextAtom : nextAtoms) { neighbourLists.add(getNextAtomsWithAppropriateGhostAtoms(nextAtom)); } Collections.sort(neighbourLists, atomListCipComparator); return neighbourLists; } /** * If given say [H,C,C] this becomes [H] [C,C] * If given say [H,C,C] [H,C,C] this becomes [H,H] [C,C,C,C] * If given say [H,C,C] [H,C,F] this becomes [H],[C,C][H][C][F] * as [H,C,F] is higher priority than [H,C,C] so all its atoms must be evaluated first * The input lists of neighbours are assumed to have been presorted. 
* @param neighbourLists */ private List<List<AtomWithHistory>> formListsWithSamePriority(List<List<AtomWithHistory>> neighbourLists) { int intialNeighbourListCount = neighbourLists.size(); if (intialNeighbourListCount > 1) { List<List<AtomWithHistory>> listsToRemove = new ArrayList<>(); for (int i = 0; i < intialNeighbourListCount; i++) { List<List<AtomWithHistory>> neighbourListsToCombine = new ArrayList<>(); List<AtomWithHistory> primaryAtomList = neighbourLists.get(i); for (int j = i + 1; j < intialNeighbourListCount; j++) { List<AtomWithHistory> neighbourListToCompareWith = neighbourLists.get(j); if (atomListCipComparator.compare(primaryAtomList, neighbourListToCompareWith) == 0) { neighbourListsToCombine.add(neighbourListToCompareWith); i++; } else { break; } } for (List<AtomWithHistory> neighbourList: neighbourListsToCombine) { listsToRemove.add(neighbourList); primaryAtomList.addAll(neighbourList); } } neighbourLists.removeAll(listsToRemove); } List<List<AtomWithHistory>> updatedNeighbourLists = new ArrayList<>(); //lists of same priority have already been combined (see above) e.g. [H,C,C] [H,C,C] -->[H,C,C,H,C,C] //now sort these combined lists by CIP priority //then group atoms that have the same CIP priority for (int i = 0, lstsLen = neighbourLists.size(); i < lstsLen; i++) { List<AtomWithHistory> neighbourList = neighbourLists.get(i); Collections.sort(neighbourList, cipComparator); AtomWithHistory lastAtom = null; List<AtomWithHistory> currentAtomList = new ArrayList<>(); for (int j = 0, lstLen = neighbourList.size(); j < lstLen; j++) { AtomWithHistory a = neighbourList.get(j); if (lastAtom != null && compareByCipRules(lastAtom, a) != 0) { updatedNeighbourLists.add(currentAtomList); currentAtomList = new ArrayList<>(); } currentAtomList.add(a); lastAtom = a; } if (!currentAtomList.isEmpty()) { updatedNeighbourLists.add(currentAtomList); } } return updatedNeighbourLists; } /** * Sorts atoms by their atomic number, low to high * @author dl387 * */ private class CipComparator implements Comparator<AtomWithHistory> { public int compare(AtomWithHistory a, AtomWithHistory b) { return compareByCipRules(a, b); } } /** * Sorts atomLists by CIP rules, low to high * 
@author dl387 * */ private class AtomListCipComparator implements Comparator> { public int compare(List a, List b) { int aSize = a.size(); int bSize = b.size(); int differenceInSize = aSize - bSize; int maxCommonSize = aSize > bSize ? bSize : aSize; for (int i = 1; i <= maxCommonSize; i++) { int difference = compareByCipRules(a.get(aSize - i), b.get(bSize - i)); if (difference > 0) { return 1; } if (difference < 0) { return -1; } } if (differenceInSize > 0) { return 1; } if (differenceInSize < 0) { return -1; } return 0; } } /** * Sorts lists of atomLists by CIP rules, low to high * @author dl387 * */ private class ListOfAtomListsCipComparator implements Comparator>> { public int compare(List> a, List> b) { int aSize = a.size(); int bSize = b.size(); int differenceInSize = aSize - bSize; int maxCommonSize = aSize > bSize ? bSize : aSize; for (int i = 1; i <= maxCommonSize; i++) { List aprime = a.get(aSize - i); List bprime = b.get(bSize - i); int aprimeSize = aprime.size(); int bprimeSize = bprime.size(); int differenceInSizeprime = aprimeSize - bprimeSize; int maxCommonSizeprime = aprimeSize > bprimeSize ? bprimeSize : aprimeSize; for (int j = 1; j <= maxCommonSizeprime; j++) { int difference = compareByCipRules(aprime.get(aprimeSize - j), bprime.get(bprimeSize - j)); if (difference > 0) { return 1; } if (difference < 0) { return -1; } } if (differenceInSizeprime > 0) { return 1; } if (differenceInSizeprime < 0) { return -1; } } if (differenceInSize > 0) { return 1; } if (differenceInSize < 0) { return -1; } return 0; } } /** * Gets the neighbouring atoms bar the previous atom in CIP order * If the neighbouring atom has already been visited it is replaced with a ghost atom * Multiple bonds including those to previous atoms yield ghost atoms unless the bond goes to the chiral atom e.g. 
in a sulfoxide
		 * @param atomWithHistory
		 * @return
		 */
		private List<AtomWithHistory> getNextAtomsWithAppropriateGhostAtoms(AtomWithHistory atomWithHistory) {
			Atom atom = atomWithHistory.atom;
			List<Atom> visitedAtoms = atomWithHistory.visitedAtoms;
			Atom previousAtom = visitedAtoms.get(visitedAtoms.size()-1);
			List<Atom> visitedAtomsIncludingCurrentAtom = new ArrayList<>(visitedAtoms);
			visitedAtomsIncludingCurrentAtom.add(atom);

			List<AtomWithHistory> neighboursWithHistory = new ArrayList<>();
			for(Bond b : atom.getBonds()) {
				Atom atomBondConnectsTo = b.getOtherAtom(atom);
				if (!atomBondConnectsTo.equals(chiralAtom)) {//P-91.1.4.2.4 (higher order bonds to chiral centre do not involve duplication of atoms)
					for (int j = b.getOrder(); j > 1; j--) {//add ghost atoms to represent higher order bonds
						Atom ghost = new Atom(atomBondConnectsTo.getElement());
						if (rule > 0) {
							int indexOfOriginalAtom = visitedAtoms.indexOf(atomBondConnectsTo);
							if (indexOfOriginalAtom != -1) {
								neighboursWithHistory.add(new AtomWithHistory(ghost, visitedAtomsIncludingCurrentAtom, indexOfOriginalAtom));
							}
							else{
								neighboursWithHistory.add(new AtomWithHistory(ghost, visitedAtomsIncludingCurrentAtom, visitedAtoms.size() + 1));
							}
						}
						else{
							neighboursWithHistory.add(new AtomWithHistory(ghost, visitedAtomsIncludingCurrentAtom, null));
						}
					}
				}
				if (!atomBondConnectsTo.equals(previousAtom)) {
					if (visitedAtoms.contains(atomBondConnectsTo)) {//cycle detected, add ghost atom instead
						Atom ghost = new Atom(atomBondConnectsTo.getElement());
						if (rule > 0) {
							neighboursWithHistory.add(new AtomWithHistory(ghost, visitedAtomsIncludingCurrentAtom, visitedAtoms.indexOf(atomBondConnectsTo)));
						}
						else{
							neighboursWithHistory.add(new AtomWithHistory(ghost, visitedAtomsIncludingCurrentAtom, null));
						}
					}
					else{
						neighboursWithHistory.add(new AtomWithHistory(atomBondConnectsTo, visitedAtomsIncludingCurrentAtom, null));
					}
				}
			}
			Collections.sort(neighboursWithHistory, cipComparator);
			return neighboursWithHistory;
		}

		/**
		 * Greater than 0 means a is preferred over b (vice versa for less than 0)
		 * @param a
		 * @param b
		 * @return
		 */
		private int compareByCipRules(AtomWithHistory a, AtomWithHistory b) {
			//rule 1a
			//prefer higher atomic number
			int atomicNumber1 = a.atom.getElement().ATOMIC_NUM;
			int atomicNumber2 = b.atom.getElement().ATOMIC_NUM;
			if (atomicNumber1 > atomicNumber2) {
				return 1;
			}
			else if (atomicNumber1 < atomicNumber2) {
				return -1;
			}
			if (rule > 0) {
				//rule 1b
				//prefer duplicate to non-duplicate
				Integer indexFromRoot1 = a.indexOfOriginalFromRoot;
				Integer indexFromRoot2 = b.indexOfOriginalFromRoot;
				if (indexFromRoot1 != null && indexFromRoot2 == null) {
					return 1;
				}
				if (indexFromRoot1 == null && indexFromRoot2 != null) {
					return -1;
				}
				//prefer duplicate of node closer to root
				if (indexFromRoot1 != null && indexFromRoot2 != null) {
					if (indexFromRoot1 < indexFromRoot2 ) {
						return 1;
					}
					if (indexFromRoot1 > indexFromRoot2 ) {
						return -1;
					}
				}
				if (rule > 1) {
					//rule 2
					//prefer higher atomic mass
					Integer atomicMass1 = a.atom.getIsotope();
					Integer atomicMass2 = b.atom.getIsotope();
					if (atomicMass1 != null && atomicMass2 == null) {
						return 1;
					}
					else if (atomicMass1 == null && atomicMass2 != null) {
						return -1;
					}
					else if (atomicMass1 != null && atomicMass2 != null) {
						if (atomicMass1 > atomicMass2) {
							return 1;
						}
						else if (atomicMass1 < atomicMass2) {
							return -1;
						}
					}
				}
			}
			return 0;
		}
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ComponentGenerationException.java
package uk.ac.cam.ch.wwmm.opsin;

/**Thrown during component generation.
 *
 * @author ptc24
 *
 */
class ComponentGenerationException extends Exception {

	private static final long serialVersionUID = 1L;

	ComponentGenerationException() {
		super();
	}

	ComponentGenerationException(String message) {
		super(message);
	}

	ComponentGenerationException(String message, Throwable cause) {
		super(message, cause);
	}

	ComponentGenerationException(Throwable cause) {
		super(cause);
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ComponentGenerator.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;
import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*;

/**Does destructive procedural parsing on parser results.
 *
 * @author ptc24
 * @author dl387
 *
 */
class ComponentGenerator {

	/**
	 * Sort bridges such that the highest priority secondary bridges come first
	 * e.g.
1^(6,3).1^(15,13) * rearranged to 1^(15,13).1^(6,3) * @author dl387 * */ private static class VonBaeyerSecondaryBridgeSort implements Comparator> { public int compare(HashMap bridge1, HashMap bridge2){ //first we compare the larger coordinate, due to an earlier potential swapping of coordinates this is always in "AtomId_Larger" int largerCoordinate1 = bridge1.get("AtomId_Larger"); int largerCoordinate2 = bridge2.get("AtomId_Larger"); if (largerCoordinate1 >largerCoordinate2) { return -1; } else if (largerCoordinate2 >largerCoordinate1) { return 1; } //tie int smallerCoordinate1 = bridge1.get("AtomId_Smaller"); int smallerCoordinate2 = bridge2.get("AtomId_Smaller"); if (smallerCoordinate1 >smallerCoordinate2) { return -1; } else if (smallerCoordinate2 >smallerCoordinate1) { return 1; } //tie int bridgelength1 = bridge1.get("Bridge Length"); int bridgelength2 = bridge2.get("Bridge Length"); if (bridgelength1 >bridgelength2) { return -1; } else if (bridgelength2 >bridgelength1) { return 1; } else{ return 0; } } } //match a fusion bracket with only numerical locants. 
If this is followed by a HW group it probably wasn't a fusion bracket private static final Pattern matchNumberLocantsOnlyFusionBracket = Pattern.compile("\\[\\d+(,\\d+)*\\]"); private static final Pattern matchCommaOrDot =Pattern.compile("[\\.,]"); private static final Pattern matchAnnulene = Pattern.compile("[\\[\\(\\{]([1-9]\\d*)[\\]\\)\\}]annul(en|yn)", Pattern.CASE_INSENSITIVE); private static final String elementSymbols ="(?:He|Li|Be|B|C|N|O|F|Ne|Na|Mg|Al|Si|P|S|Cl|Ar|K|Ca|Sc|Ti|V|Cr|Mn|Fe|Co|Ni|Cu|Zn|Ga|Ge|As|Se|Br|Kr|Rb|Sr|Y|Zr|Nb|Mo|Tc|Ru|Rh|Pd|Ag|Cd|In|Sn|Sb|Te|I|Xe|Cs|Ba|La|Ce|Pr|Nd|Pm|Sm|Eu|Gd|Tb|Dy|Ho|Er|Tm|Yb|Lu|Hf|Ta|W|Re|Os|Ir|Pt|Au|Hg|Tl|Pb|Po|At|Rn|Fr|Ra|Ac|Th|Pa|U|Np|Pu|Am|Cm|Bk|Cf|Es|Fm|Md|No|Lr|Rf|Db|Sg|Bh|Hs|Mt|Ds)"; private static final Pattern matchStereochemistry = Pattern.compile("(.*?)(SR|R/?S|r/?s|[Rr]\\^?[*]|[Ss]\\^?[*]|[Ee][Zz]|[EZ][*]|[RSEZrsezabx]|[cC][iI][sS]|[tT][rR][aA][nN][sS]|[aA][lL][pP][hH][aA]|[bB][eE][tT][aA]|[xX][iI]|[eE][xX][oO]|[eE][nN][dD][oO]|[sS][yY][nN]|[aA][nN][tT][iI]|M|P|Ra|Sa|Sp|Rp|R(?:[Oo][Rr]|[Aa][Nn][Dd])S|S(?:[Oo][Rr]|[Aa][Nn][Dd])R|E(?:[Oo][Rr]|[Aa][Nn][Dd])Z|Z(?:[Oo][Rr]|[Aa][Nn][Dd])E)"); private static final Pattern matchRacemic = Pattern.compile("rac(\\.|em(\\.|ic)?)?-?", Pattern.CASE_INSENSITIVE); private static final Pattern matchRS = Pattern.compile("[Rr]/?\\^?[*Ss]?|[Ss]\\^?[*Rr]?|R(?:[Oo][Rr]|[Aa][Nn][Dd])S|S(?:[Oo][Rr]|[Aa][Nn][Dd])R"); private static final Pattern matchEZ = Pattern.compile("[EZez]|[Ee][Zz]|[EZ]\\*|EandZ|EorZ"); private static final Pattern matchAlphaBetaStereochem = Pattern.compile("a|b|x|[aA][lL][pP][hH][aA]|[bB][eE][tT][aA]|[xX][iI]"); private static final Pattern matchCisTrans = Pattern.compile("[cC][iI][sS]|[tT][rR][aA][nN][sS]"); private static final Pattern matchEndoExoSynAnti = Pattern.compile("[eE][xX][oO]|[eE][nN][dD][oO]|[sS][yY][nN]|[aA][nN][tT][iI]"); private static final Pattern matchAxialStereo = Pattern.compile("M|P|Ra|Sa|Sp|Rp"); private static final Pattern 
matchLambdaConvention = Pattern.compile("(\\S+)?lambda\\D*(\\d+)\\D*", Pattern.CASE_INSENSITIVE); private static final Pattern matchHdigit =Pattern.compile("H\\d"); private static final Pattern matchNonDigit =Pattern.compile("\\D+"); private static final Pattern matchAddedHydrogenLocantBracket = Pattern.compile("[1-9][0-9]*[a-g]?'*H(,[1-9][0-9]*[a-g]?'*H)*", Pattern.CASE_INSENSITIVE); private static final Pattern matchRSLocantBracket = Pattern.compile("[RS]|R[,/]?S", Pattern.CASE_INSENSITIVE); private static final Pattern matchSuperscriptedLocant = Pattern.compile("(" + elementSymbols +"'*)[\\^\\[\\(\\{~\\*\\<]*(?:[sS][uU][pP][ ]?)?([^\\^\\[\\(\\{~\\*\\<\\]\\)\\}\\>]+)[^\\[\\(\\{]*"); private static final Pattern matchIUPAC2004ElementLocant = Pattern.compile("(\\d+'*)-(" + elementSymbols +"'*)(.*)"); private static final Pattern matchGreek = Pattern.compile("alpha|beta|gamma|delta|epsilon|zeta|eta|omega", Pattern.CASE_INSENSITIVE); private static final Pattern matchInlineSuffixesThatAreAlsoGroups = Pattern.compile("carbonyl|oxy|sulfenyl|sulfinyl|sulfonyl|selenenyl|seleninyl|selenonyl|tellurenyl|tellurinyl|telluronyl"); private final BuildState buildState; ComponentGenerator(BuildState buildState) { this.buildState = buildState; } /** * Processes a parse result destructively adding semantic information by processing the various micro syntaxes. * @param parse * @throws ComponentGenerationException */ void processParse(Element parse) throws ComponentGenerationException { List substituentsAndRoot = OpsinTools.getDescendantElementsWithTagNames(parse, new String[]{SUBSTITUENT_EL, ROOT_EL}); for (Element subOrRoot: substituentsAndRoot) { /* Throws exceptions for occurrences that are ambiguous and this parse has picked the incorrect interpretation */ resolveAmbiguities(subOrRoot); processLocants(subOrRoot); convertOrthoMetaParaToLocants(subOrRoot); formAlkaneStemsFromComponents(subOrRoot); processAlkaneStemModifications(subOrRoot);//e.g. 
tert-butyl processHeterogenousHydrides(subOrRoot);//e.g. tetraphosphane, disiloxane processIndicatedHydrogens(subOrRoot); processStereochemistry(subOrRoot); processInfixes(subOrRoot); processSuffixPrefixes(subOrRoot); processLambdaConvention(subOrRoot); } List groups = OpsinTools.getDescendantElementsWithTagName(parse, GROUP_EL); /* Converts open/close bracket elements to bracket elements and * places the elements inbetween within the newly created bracket */ List brackets = new ArrayList<>(); findAndStructureBrackets(substituentsAndRoot, brackets); for (Element subOrRoot: substituentsAndRoot) { processHydroCarbonRings(subOrRoot); handleSuffixIrregularities(subOrRoot);//handles quinone -->dioxo } for (Element group : groups) { detectAlkaneFusedRingBridges(group); processRings(group);//processes cyclo, von baeyer and spiro tokens handleGroupIrregularities(group);//handles benzyl, diethylene glycol, phenanthrone and other awkward bits of nomenclature } for (Element bracket : brackets) { moveDetachableHetAtomRepl(bracket); } } /** * Resolves common ambiguities e.g. 
tetradeca being 4x10carbon chain rather than 14carbon chain * @param subOrRoot * @throws ComponentGenerationException */ static void resolveAmbiguities(Element subOrRoot) throws ComponentGenerationException { List multipliers = subOrRoot.getChildElements(MULTIPLIER_EL); for (Element apparentMultiplier : multipliers) { if (!BASIC_TYPE_VAL.equals(apparentMultiplier.getAttributeValue(TYPE_ATR)) && !VONBAEYER_TYPE_VAL.equals(apparentMultiplier.getAttributeValue(TYPE_ATR))){ continue; } int multiplierNum = Integer.parseInt(apparentMultiplier.getAttributeValue(VALUE_ATR)); Element nextEl = OpsinTools.getNextSibling(apparentMultiplier); if (multiplierNum >=3){//detects ambiguous use of things like tetradeca if(nextEl !=null){ if (nextEl.getName().equals(ALKANESTEMCOMPONENT)){//can ignore the trivial alkanes as ambiguity does not exist for them int alkaneChainLength = Integer.parseInt(nextEl.getAttributeValue(VALUE_ATR)); if (alkaneChainLength >=10 && alkaneChainLength > multiplierNum){ Element isThisALocant = OpsinTools.getPreviousSibling(apparentMultiplier); if (isThisALocant == null || !isThisALocant.getName().equals(LOCANT_EL) || isThisALocant.getValue().split(",").length != multiplierNum){ throw new ComponentGenerationException(apparentMultiplier.getValue() + nextEl.getValue() +" should not have been lexed as two tokens!"); } } } } } if (multiplierNum >=4 && nextEl !=null && nextEl.getName().equals(HYDROCARBONFUSEDRINGSYSTEM_EL) && nextEl.getValue().equals("phen") && !"e".equals(nextEl.getAttributeValue(SUBSEQUENTUNSEMANTICTOKEN_ATR))){//deals with tetra phenyl vs tetraphen yl Element possibleLocantOrMultiplierOrSuffix = OpsinTools.getNextSibling(nextEl); if (possibleLocantOrMultiplierOrSuffix!=null){//null if not used as substituent if (possibleLocantOrMultiplierOrSuffix.getName().equals(SUFFIX_EL)){//for phen the aryl substituent, expect an adjacent suffix e.g. 
phenyl, phenoxy Element isThisALocant = OpsinTools.getPreviousSibling(apparentMultiplier); if (isThisALocant == null || !isThisALocant.getName().equals(LOCANT_EL) || isThisALocant.getValue().split(",").length != 1){ String multiplierAndGroup =apparentMultiplier.getValue() + nextEl.getValue(); throw new ComponentGenerationException(multiplierAndGroup +" should not have been lexed as one token!"); } } } } if (multiplierNum > 4 && !apparentMultiplier.getValue().endsWith("a")){//disambiguate pent oxy and the like. Assume it means pentanoxy rather than 5 oxys if (nextEl !=null && nextEl.getName().equals(GROUP_EL)&& matchInlineSuffixesThatAreAlsoGroups.matcher(nextEl.getValue()).matches()){ throw new ComponentGenerationException(apparentMultiplier.getValue() + nextEl.getValue() +" should have been lexed as [alkane stem, inline suffix], not [multiplier, group]!"); } } } List fusions = subOrRoot.getChildElements(FUSION_EL); for (Element fusion : fusions) { String fusionText = fusion.getValue(); if (matchNumberLocantsOnlyFusionBracket.matcher(fusionText).matches()){ Element possibleHWRing = OpsinTools.getNextSiblingIgnoringCertainElements(fusion, new String[]{MULTIPLIER_EL, HETEROATOM_EL}); if (possibleHWRing !=null && HANTZSCHWIDMAN_SUBTYPE_VAL.equals(possibleHWRing.getAttributeValue(SUBTYPE_ATR))){ int heteroCount = 0; int multiplierValue = 1; Element currentElem = OpsinTools.getNextSibling(fusion); while(currentElem != null && !currentElem.getName().equals(GROUP_EL)){ if(currentElem.getName().equals(HETEROATOM_EL)) { heteroCount+=multiplierValue; multiplierValue =1; } else if (currentElem.getName().equals(MULTIPLIER_EL)){ multiplierValue = Integer.parseInt(currentElem.getAttributeValue(VALUE_ATR)); } currentElem = OpsinTools.getNextSibling(currentElem); } String[] locants = fusionText.substring(1, fusionText.length() - 1).split(","); if (locants.length == heteroCount){ boolean foundLocantNotInHwSystem =false; for (String locant : locants) { if (Integer.parseInt(locant) > 
(possibleHWRing.getAttributeValue(VALUE_ATR).length()-2)){ foundLocantNotInHwSystem =true; } } if (!foundLocantNotInHwSystem){ throw new ComponentGenerationException("This fusion bracket is in fact more likely to be a description of the locants of a HW ring"); } } } } } } /** * Removes hyphens from the end of locants if present * Looks for locants of the form number-letter and converts them to letternumber * e.g. 1-N becomes N1. 1-N is the IUPAC 2004 recommendation, N1 is the previous recommendation * Strips indication of superscript * Strips added hydrogen out of locants * Strips stereochemistry out of locants * Normalises case on greeks to lower case * * @param subOrRoot * @throws ComponentGenerationException */ static void processLocants(Element subOrRoot) throws ComponentGenerationException { List children = subOrRoot.getChildElements(); for (Element el : children) { String elName = el.getName(); if (elName.equals(LOCANT_EL)) { Element locantEl = el; List individualLocants = splitIntoIndividualLocants(StringTools.removeDashIfPresent(locantEl.getValue())); for (int i = 0, locantCount = individualLocants.size(); i < locantCount; i++) { String locantText = individualLocants.get(i); char lastChar = locantText.charAt(locantText.length() - 1); if(lastChar == ')' || lastChar == ']' || lastChar == '}') { //stereochemistry or added hydrogen that result from the application of this locant as a locant for a substituent may be included in brackets after the locant int bracketStart = -1; for (int j = locantText.length() - 2; j >=0; j--) { char ch = locantText.charAt(j); if (ch == '(' || ch == '[' || ch == '{') { bracketStart = j; break; } } if (bracketStart >=0) { String brackettedText = locantText.substring(bracketStart + 1, locantText.length() - 1); if (matchAddedHydrogenLocantBracket.matcher(brackettedText).matches()) { locantText = StringTools.removeDashIfPresent(locantText.substring(0, bracketStart));//strip the bracket from the locantText //create individual tags for 
added hydrogen. Examples of bracketed text include "9H" or "2H,7H" String[] addedHydrogens = brackettedText.split(","); for (String addedHydrogen : addedHydrogens) { Element addedHydrogenElement = new TokenEl(ADDEDHYDROGEN_EL); String hydrogenLocant = fixLocantCapitalisation(addedHydrogen.substring(0, addedHydrogen.length() - 1)); addedHydrogenElement.addAttribute(new Attribute(LOCANT_ATR, hydrogenLocant)); OpsinTools.insertBefore(locantEl, addedHydrogenElement); } if (locantEl.getAttribute(TYPE_ATR) == null){ locantEl.addAttribute(new Attribute(TYPE_ATR, ADDEDHYDROGENLOCANT_TYPE_VAL));//this locant must not be used as an indirect locant } } else if (matchRSLocantBracket.matcher(brackettedText).matches()) { locantText = StringTools.removeDashIfPresent(locantText.substring(0, bracketStart));//strip the bracket from the locantText String rs = brackettedText.replaceAll("\\W", "");//convert R/S to RS Element newStereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, "(" + standardizeLocantVariants(locantText) + rs + ")"); newStereoChemEl.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL)); OpsinTools.insertBefore(locantEl, newStereoChemEl); } //compounds locant e.g. 1(10). are left as is, and handled by the function that handles unsaturation //brackets for superscripts are removed by standardizeLocantVariants e.g. N(alpha) } else{ throw new ComponentGenerationException("OPSIN bug: malformed locant text"); } } individualLocants.set(i, standardizeLocantVariants(locantText)); } locantEl.setValue(StringTools.stringListToString(individualLocants, ",")); Element afterLocants = OpsinTools.getNextSibling(locantEl); if(afterLocants == null) { throw new ComponentGenerationException("Nothing after locant tag: " + locantEl.toXML()); } if (individualLocants.size() == 1) { ifCarbohydrateLocantConvertToAminoAcidStyleLocant(locantEl); } } else if (elName.equals(COLONORSEMICOLONDELIMITEDLOCANT_EL)) { //e.g. 
(1,2:3,4) //This type of locant is typically used with pairs of locants e.g epoxides, anhydrides String locantText = StringTools.removeDashIfPresent(el.getValue()); StringBuilder updatedLocantText = new StringBuilder(); StringBuilder sb = new StringBuilder(); for (int i = 0, len = locantText.length(); i < len; i++) { char ch = locantText.charAt(i); if (ch == ',' || ch == ':' || ch == ';') { updatedLocantText.append(standardizeLocantVariants(sb.toString())); sb.setLength(0); updatedLocantText.append(ch); } else { sb.append(ch); } } updatedLocantText.append(standardizeLocantVariants(sb.toString())); el.setValue(updatedLocantText.toString()); } } } private static String standardizeLocantVariants(String locantText) { if (locantText.contains("-")) {//avoids this regex being invoked typically //rearranges locant to the older equivalent form Matcher m = matchIUPAC2004ElementLocant.matcher(locantText); if (m.matches()){ locantText = m.group(2) + m.group(1) + m.group(3); } } if (Character.isLetter(locantText.charAt(0))) { //remove indications of superscript as the fact a locant is superscripted can be determined from context e.g. N~1~ ->N1 Matcher m = matchSuperscriptedLocant.matcher(locantText); if (m.lookingAt()) { String replacementString = m.group(1) + m.group(2); locantText = m.replaceFirst(replacementString); } if (locantText.length() >= 3){ //convert greeks to lower case m = matchGreek.matcher(locantText); while (m.find()) { locantText = locantText.substring(0, m.start()) + m.group().toLowerCase(Locale.ROOT) + locantText.substring(m.end()); } } } locantText = fixLocantCapitalisation(locantText); return locantText; } /** * Looks for locants of the form ()? * and converts them to * e.g. 
2,4,6 tri O- -->O2,O4,O6 tri * @param locant */ private static void ifCarbohydrateLocantConvertToAminoAcidStyleLocant(Element locant) { if (MATCH_ELEMENT_SYMBOL.matcher(locant.getValue()).matches()) { Element possibleMultiplier = OpsinTools.getPreviousSibling(locant); if (possibleMultiplier != null && possibleMultiplier.getName().equals(MULTIPLIER_EL)) { int multiplierValue = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); Element possibleOtherLocant = OpsinTools.getPreviousSibling(possibleMultiplier); if (possibleOtherLocant != null) { String[] locantValues = possibleOtherLocant.getValue().split(","); if (locantValues.length == Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR))){ for (int i = 0; i < locantValues.length; i++) { locantValues[i] = locant.getValue() + locantValues[i]; } possibleOtherLocant.setValue(StringTools.arrayToString(locantValues, ",")); locant.detach(); } } else{ StringBuilder sb = new StringBuilder(); for (int i = 0; i < multiplierValue - 1; i++) { sb.append(locant.getValue()); sb.append(StringTools.multiplyString("'", i)); sb.append(','); } sb.append(locant.getValue()); sb.append(StringTools.multiplyString("'", multiplierValue - 1)); Element newLocant = new TokenEl(LOCANT_EL, sb.toString()); OpsinTools.insertBefore(possibleMultiplier, newLocant); locant.detach(); } } } } /** * Takes a string of locants and splits on commas, but taking into account brackets * e.g. 
1,2(1H,2H),3 becomes [1][2(1H,2H)][3]
	 * @param locantString
	 * @return
	 */
	private static List<String> splitIntoIndividualLocants(String locantString) {
		List<String> individualLocants = new ArrayList<>();
		char[] charArray = locantString.toCharArray();
		boolean inBracket = false;
		int indiceOfLastMatch = 0;
		for (int i = 0; i < charArray.length; i++) {
			char c = charArray[i];
			if (c == ','){
				if (!inBracket){
					individualLocants.add(locantString.substring(indiceOfLastMatch, i));
					indiceOfLastMatch = i+1;
				}
			}
			else if (c == '(' || c == '[' || c == '{') {
				inBracket = true;
			}
			else if(c == ')' || c == ']' || c == '}') {
				inBracket = false;
			}
		}
		individualLocants.add(locantString.substring(indiceOfLastMatch, charArray.length));
		return individualLocants;
	}

	/**Converts ortho/meta/para into locants
	 * Depending on context para, for example, will either become para or 1,para
	 *
	 * @param subOrRoot
	 * @throws ComponentGenerationException
	 */
	private void convertOrthoMetaParaToLocants(Element subOrRoot) throws ComponentGenerationException{
		List<Element> ompLocants = subOrRoot.getChildElements(ORTHOMETAPARA_EL);
		for (Element ompLocant : ompLocants) {
			String locantText = ompLocant.getValue();
			String firstChar = locantText.substring(0, 1);
			ompLocant.setName(LOCANT_EL);
			ompLocant.addAttribute(new Attribute(TYPE_ATR, ORTHOMETAPARA_TYPE_VAL));
			if (orthoMetaParaLocantIsTwoLocants(ompLocant)) {
				if ("o".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("1,ortho");
				}
				else if ("m".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("1,meta");
				}
				else if ("p".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("1,para");
				}
				else{
					throw new ComponentGenerationException(locantText + " was not identified as being either ortho, meta or para but according to the chemical grammar it should have been");
				}
			}
			else{
				if ("o".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("ortho");
				}
				else if ("m".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("meta");
				}
				else if ("p".equalsIgnoreCase(firstChar)){
					ompLocant.setValue("para");
				}
				else{
					throw new
ComponentGenerationException(locantText + " was not identified as being either ortho, meta or para but according to the chemical grammar it should have been");
				}
			}
		}
	}

	private boolean orthoMetaParaLocantIsTwoLocants(Element ompLocant) {
		Element afterOmpLocant = OpsinTools.getNextSibling(ompLocant);
		if (afterOmpLocant != null){
			String elName = afterOmpLocant.getName();
			if(elName.equals(MULTIPLIER_EL) && afterOmpLocant.getAttributeValue(VALUE_ATR).equals("2")){
				//e.g. p-dimethyl
				return true;
			}
			String outIds = afterOmpLocant.getAttributeValue(OUTIDS_ATR);
			if (outIds != null && outIds.split(",").length > 1) {
				//e.g. p-phenylene
				return true;
			}
			if(elName.equals(GROUP_EL)){
				Element multiplier = OpsinTools.getNextSibling(afterOmpLocant);
				if(multiplier != null && multiplier.getName().equals(MULTIPLIER_EL) && multiplier.getAttributeValue(VALUE_ATR).equals("2")){
					Element suffix = OpsinTools.getNextSiblingIgnoringCertainElements(multiplier, new String[]{INFIX_EL, SUFFIXPREFIX_EL});
					//guard against there being no element after the multiplier
					if(suffix != null && suffix.getName().equals(SUFFIX_EL)){
						//e.g. o-benzenediamine
						return true;
					}
				}
			}
		}
		return false;
	}

	/**
	 * Processes adjacent alkane stem component elements into a single alkaneStem group element with the appropriate SMILES
	 * e.g.
dodecane would be "do" value=2 and "dec" value=10 -->alkaneStem with 12 carbons * * @param subOrRoot */ private void formAlkaneStemsFromComponents(Element subOrRoot) { Deque alkaneStemComponents =new ArrayDeque<>(subOrRoot.getChildElements(ALKANESTEMCOMPONENT)); while(!alkaneStemComponents.isEmpty()){ Element alkaneStemComponent = alkaneStemComponents.removeFirst(); int alkaneChainLength =0; StringBuilder alkaneName = new StringBuilder(); alkaneChainLength += Integer.parseInt(alkaneStemComponent.getAttributeValue(VALUE_ATR)); alkaneName.append(alkaneStemComponent.getValue()); while (!alkaneStemComponents.isEmpty() && OpsinTools.getNextSibling(alkaneStemComponent)==alkaneStemComponents.getFirst()) { alkaneStemComponent.detach(); alkaneStemComponent = alkaneStemComponents.removeFirst(); alkaneChainLength += Integer.parseInt(alkaneStemComponent.getAttributeValue(VALUE_ATR)); alkaneName.append(alkaneStemComponent.getValue()); } Element alkaneStem = new TokenEl(GROUP_EL, alkaneName.toString()); alkaneStem.addAttribute(new Attribute(TYPE_ATR, CHAIN_TYPE_VAL)); alkaneStem.addAttribute(new Attribute(SUBTYPE_ATR, ALKANESTEM_SUBTYPE_VAL)); alkaneStem.addAttribute(new Attribute(VALUE_ATR, StringTools.multiplyString("C", alkaneChainLength))); alkaneStem.addAttribute(new Attribute(USABLEASJOINER_ATR, "yes")); alkaneStem.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); OpsinTools.insertAfter(alkaneStemComponent, alkaneStem); alkaneStemComponent.detach(); } } /** * Applies the traditional alkane modifiers: iso, tert, sec, neo by modifying the alkane chain's SMILES * * @param subOrRoot * @throws ComponentGenerationException */ private void processAlkaneStemModifications(Element subOrRoot) throws ComponentGenerationException { List alkaneStemModifiers = subOrRoot.getChildElements(ALKANESTEMMODIFIER_EL); for (Element alkaneStemModifier : alkaneStemModifiers) { Element alkane = OpsinTools.getNextSibling(alkaneStemModifier); String type; if 
(alkaneStemModifier.getAttribute(VALUE_ATR) != null){
				type = alkaneStemModifier.getAttributeValue(VALUE_ATR);//identified by token;
			}
			else{
				if (alkaneStemModifier.getValue().equals("n-")){
					type = "normal";
				}
				else if (alkaneStemModifier.getValue().equals("i-")){
					type = "iso";
				}
				else if (alkaneStemModifier.getValue().equals("s-")){
					type = "sec";
				}
				else{
					throw new ComponentGenerationException("Unrecognised alkaneStem modifier");
				}
			}
			alkaneStemModifier.detach();
			if (alkane == null){
				throw new ComponentGenerationException("OPSIN Bug: AlkaneStem not found after alkaneStemModifier");
			}
			String subType = alkane.getAttributeValue(SUBTYPE_ATR);
			boolean isAmyl;
			if (AMYL_SUBTYPE_VAL.equals(subType)) {
				isAmyl = true;
			}
			else if (CHAIN_TYPE_VAL.equals(alkane.getAttributeValue(TYPE_ATR)) && ALKANESTEM_SUBTYPE_VAL.equals(subType)) {
				isAmyl = false;
			}
			else {
				throw new ComponentGenerationException("OPSIN Bug: AlkaneStem not found after alkaneStemModifier");
			}
			int chainLength = isAmyl ? 5 : alkane.getAttributeValue(VALUE_ATR).length();
			String smiles;
			String labels = NONE_LABELS_VAL;
			if (type.equals("normal")) {
				//normal behaviour is default so don't need to do anything
				//n-methyl and n-ethyl contain redundant information and are probably intended to mean N-methyl/N-ethyl
				if ((chainLength == 1 || chainLength == 2) && alkaneStemModifier.getValue().equals("n-")){
					Element locant = new TokenEl(LOCANT_EL, "N");
					OpsinTools.insertBefore(alkane, locant);
				}
				continue;
			}
			else if (type.equals("tert")){
				if (chainLength < 4) {
					throw new ComponentGenerationException("ChainLength too small for tert modifier, required minLength 4. 
Found: " + chainLength);
				}
				if (chainLength > 8) {
					throw new ComponentGenerationException("Interpretation of tert on an alkane chain of length: " + chainLength + " is ambiguous");
				}
				if (chainLength == 8) {
					smiles = "C(C)(C)CC(C)(C)C";
				}
				else{
					smiles = "C(C)(C)C" + StringTools.multiplyString("C", chainLength-4);
				}
			}
			else if (type.equals("iso")){
				if (chainLength < 3) {
					throw new ComponentGenerationException("ChainLength too small for iso modifier, required minLength 3. Found: " + chainLength);
				}
				boolean suffixPresent = isAmyl || subOrRoot.getChildElements(SUFFIX_EL).size() > 0;
				if (chainLength == 3 && !suffixPresent){
					throw new ComponentGenerationException("iso has no meaning without a suffix on an alkane chain of length 3");
				}
				if (chainLength == 8 && !suffixPresent){
					smiles = "C(C)(C)CC(C)(C)C";
				}
				else{
					smiles = StringTools.multiplyString("C", chainLength - 3) + "C(C)C";
					StringBuilder sb = new StringBuilder();
					for (int c = 1; c <= chainLength - 2; c++) {
						sb.append(c);
						sb.append('/');
					}
					sb.append('/');
					labels = sb.toString();
				}
			}
			else if (type.equals("sec")) {
				if (chainLength < 3) {
					throw new ComponentGenerationException("ChainLength too small for sec modifier, required minLength 3. Found: " + chainLength);
				}
				boolean suffixPresent = isAmyl || subOrRoot.getChildElements(SUFFIX_EL).size() > 0;
				if (!suffixPresent) {
					throw new ComponentGenerationException("sec has no meaning without a suffix on an alkane chain");
				}
				smiles = "C(C)C" + StringTools.multiplyString("C", chainLength-3);
			}
			else if (type.equals("neo")) {
				if (chainLength < 5) {
					throw new ComponentGenerationException("ChainLength too small for neo modifier, required minLength 5. 
Found: " +chainLength); } smiles = StringTools.multiplyString("C", chainLength - 5) + "CC(C)(C)C"; } else{ throw new ComponentGenerationException("Unrecognised alkaneStem modifier"); } if (isAmyl) { smiles = alkane.getAttributeValue(VALUE_ATR).substring(0,1) + smiles;//include - or = indicating yl/ylidene } alkane.getAttribute(VALUE_ATR).setValue(smiles); alkane.getAttribute(LABELS_ATR).setValue(labels); alkane.removeAttribute(alkane.getAttribute(USABLEASJOINER_ATR)); } } /**Form heterogeneous hydrides/substituents * These are chains of one heteroatom or alternating heteroatoms and are expressed using SMILES * They are typically treated in an analogous way to alkanes * @param subOrRoot The root/substituents * @throws ComponentGenerationException */ private void processHeterogenousHydrides(Element subOrRoot) throws ComponentGenerationException { List multipliers = subOrRoot.getChildElements(MULTIPLIER_EL); for (int i = 0; i < multipliers.size(); i++) { Element m = multipliers.get(i); if (m.getAttributeValue(TYPE_ATR).equals(GROUP_TYPE_VAL)){ continue; } Element multipliedElem = OpsinTools.getNextSibling(m); if(multipliedElem.getName().equals(GROUP_EL) && multipliedElem.getAttribute(SUBTYPE_ATR)!=null && multipliedElem.getAttributeValue(SUBTYPE_ATR).equals(HETEROSTEM_SUBTYPE_VAL)) { int mvalue = Integer.parseInt(m.getAttributeValue(VALUE_ATR)); Element possiblyALocant = OpsinTools.getPreviousSibling(m);//detect rare case where multiplier does not mean form a chain of heteroatoms e.g. 
something like 1,2-disulfanylpropane if(possiblyALocant !=null && possiblyALocant.getName().equals(LOCANT_EL)&& mvalue==possiblyALocant.getValue().split(",").length){ Element suffix = OpsinTools.getNextSibling(multipliedElem, SUFFIX_EL); if (suffix !=null && suffix.getAttributeValue(TYPE_ATR).equals(INLINE_TYPE_VAL)){ Element possibleMultiplier = OpsinTools.getPreviousSibling(suffix); if (!possibleMultiplier.getName().equals(MULTIPLIER_EL)){//NOT something like 3,3'-diselane-1,2-diyl continue; } } } //chain of heteroatoms String heteroatomSmiles=multipliedElem.getAttributeValue(VALUE_ATR); if (heteroatomSmiles.equals("B") && OpsinTools.getPreviousSibling(m)==null){ Element possibleUnsaturator = OpsinTools.getNextSibling(multipliedElem); if (possibleUnsaturator !=null && possibleUnsaturator.getName().equals(UNSATURATOR_EL) && possibleUnsaturator.getAttributeValue(VALUE_ATR).equals("1")){ throw new ComponentGenerationException("Polyboranes are not currently supported"); } } String smiles = StringTools.multiplyString(heteroatomSmiles, mvalue); multipliedElem.getAttribute(VALUE_ATR).setValue(smiles); m.detach(); multipliers.remove(i--); } } for (Element m : multipliers) { if (m.getAttributeValue(TYPE_ATR).equals(GROUP_TYPE_VAL)){ continue; } Element multipliedElem = OpsinTools.getNextSibling(m); if(multipliedElem.getName().equals(HETEROATOM_EL)){ Element possiblyAnotherHeteroAtom = OpsinTools.getNextSibling(multipliedElem); if (possiblyAnotherHeteroAtom !=null && possiblyAnotherHeteroAtom.getName().equals(HETEROATOM_EL)){ Element possiblyAnUnsaturator = OpsinTools.getNextSiblingIgnoringCertainElements(possiblyAnotherHeteroAtom, new String[]{LOCANT_EL, MULTIPLIER_EL});//typically ane but can be ene or yne e.g. 
triphosphaza-1,3-diene if (possiblyAnUnsaturator !=null && possiblyAnUnsaturator.getName().equals(UNSATURATOR_EL)){ StringBuilder newGroupName = new StringBuilder(m.getValue()); newGroupName.append(multipliedElem.getValue()); newGroupName.append(possiblyAnotherHeteroAtom.getValue()); //chain of alternating heteroatoms if (possiblyAnUnsaturator.getAttributeValue(VALUE_ATR).equals("1")){ checkForAmbiguityWithHWring(multipliedElem.getAttributeValue(VALUE_ATR), possiblyAnotherHeteroAtom.getAttributeValue(VALUE_ATR)); } int mvalue = Integer.parseInt(m.getAttributeValue(VALUE_ATR)); StringBuilder smilesSB= new StringBuilder(); Element possiblyARingFormingEl = OpsinTools.getPreviousSibling(m); boolean heteroatomChainWillFormARing = false; if (possiblyARingFormingEl!=null && (possiblyARingFormingEl.getName().equals(CYCLO_EL) || possiblyARingFormingEl.getName().equals(VONBAEYER_EL) || possiblyARingFormingEl.getName().equals(SPIRO_EL))){ heteroatomChainWillFormARing=true; //will be cyclised later. 
for (int j = 0; j < mvalue; j++) { smilesSB.append(possiblyAnotherHeteroAtom.getAttributeValue(VALUE_ATR)); smilesSB.append(multipliedElem.getAttributeValue(VALUE_ATR)); } } else{ for (int j = 0; j < mvalue -1; j++) { smilesSB.append(multipliedElem.getAttributeValue(VALUE_ATR)); smilesSB.append(possiblyAnotherHeteroAtom.getAttributeValue(VALUE_ATR)); } smilesSB.append(multipliedElem.getAttributeValue(VALUE_ATR)); } String smiles =smilesSB.toString(); smiles = matchHdigit.matcher(smiles).replaceAll("H?");//hydrogen count will be determined by standard valency multipliedElem.detach(); Element addedGroup = new TokenEl(GROUP_EL, newGroupName.toString()); addedGroup.addAttribute(new Attribute(VALUE_ATR, smiles)); addedGroup.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); addedGroup.addAttribute(new Attribute(TYPE_ATR, CHAIN_TYPE_VAL)); addedGroup.addAttribute(new Attribute(SUBTYPE_ATR, HETEROSTEM_SUBTYPE_VAL)); if (!heteroatomChainWillFormARing){ addedGroup.addAttribute(new Attribute(USABLEASJOINER_ATR, "yes")); } OpsinTools.insertAfter(possiblyAnotherHeteroAtom, addedGroup); possiblyAnotherHeteroAtom.detach(); m.detach(); } else if (possiblyAnUnsaturator!=null && possiblyAnUnsaturator.getValue().equals("an") && HANTZSCHWIDMAN_SUBTYPE_VAL.equals(possiblyAnUnsaturator.getAttributeValue(SUBTYPE_ATR))){ //check for HWring that should be interpreted as a heterogenous hydride boolean foundLocantIndicatingHwRingHeteroatomPositions =false;//allow formally incorrect HW ring systems if they have locants Element possibleLocant = OpsinTools.getPreviousSibling(m); if (possibleLocant !=null && possibleLocant.getName().equals(LOCANT_EL)){ int expected = Integer.parseInt(m.getAttributeValue(VALUE_ATR)) + 1; if (expected == possibleLocant.getValue().split(",").length){ foundLocantIndicatingHwRingHeteroatomPositions = true; } } if (!foundLocantIndicatingHwRingHeteroatomPositions){ checkForAmbiguityWithHeterogenousHydride(multipliedElem.getAttributeValue(VALUE_ATR), 
possiblyAnotherHeteroAtom.getAttributeValue(VALUE_ATR)); } } } } } } /** * Throws an exception if the given heteroatoms could be part of a valid Hantzch-widman ring * For this to be true the first heteroatom must be higher priority than the second * and the second must be compatible with a HW ane stem * @param firstHeteroAtomSMILES * @param secondHeteroAtomSMILES * @throws ComponentGenerationException */ private void checkForAmbiguityWithHWring(String firstHeteroAtomSMILES, String secondHeteroAtomSMILES) throws ComponentGenerationException { Matcher m = MATCH_ELEMENT_SYMBOL.matcher(firstHeteroAtomSMILES); if (!m.find()){ throw new ComponentGenerationException("Failed to extract element from heteroatom"); } ChemEl atom1ChemEl = ChemEl.valueOf(m.group()); m = MATCH_ELEMENT_SYMBOL.matcher(secondHeteroAtomSMILES); if (!m.find()){ throw new ComponentGenerationException("Failed to extract element from heteroatom"); } ChemEl atom2ChemEl = ChemEl.valueOf(m.group()); if (AtomProperties.getHwpriority(atom1ChemEl) > AtomProperties.getHwpriority(atom2ChemEl)){ if (atom2ChemEl == ChemEl.O || atom2ChemEl == ChemEl.S || atom2ChemEl == ChemEl.Se || atom2ChemEl == ChemEl.Te || atom2ChemEl == ChemEl.Bi || atom2ChemEl == ChemEl.Hg){ if (!hasSiorGeorSnorPb(atom1ChemEl, atom2ChemEl)){ throw new ComponentGenerationException("Hantzch-widman ring misparsed as a heterogeneous hydride with alternating atoms"); } } } } /** * Are either of the elements Si/Ge/Sn/Pb * @param atom1ChemEl * @param atom2ChemEl * @return */ private boolean hasSiorGeorSnorPb(ChemEl atom1ChemEl, ChemEl atom2ChemEl) { return (atom1ChemEl == ChemEl.Si || atom1ChemEl == ChemEl.Ge || atom1ChemEl == ChemEl.Sn || atom1ChemEl == ChemEl.Pb || atom2ChemEl == ChemEl.Si || atom2ChemEl == ChemEl.Ge || atom2ChemEl == ChemEl.Sn || atom2ChemEl == ChemEl.Pb); } /** * Throws an exception if the given heteroatoms could be part of a heterogenous hydride * For this to be true the second heteroatom must be higher priority than the first 
	 * @param firstHeteroAtomSMILES
	 * @param secondHeteroAtomSMILES
	 * @throws ComponentGenerationException
	 */
	private void checkForAmbiguityWithHeterogenousHydride(String firstHeteroAtomSMILES, String secondHeteroAtomSMILES) throws ComponentGenerationException {
		Matcher m = MATCH_ELEMENT_SYMBOL.matcher(firstHeteroAtomSMILES);
		if (!m.find()){
			throw new ComponentGenerationException("Failed to extract element from heteroatom");
		}
		String atom1Element = m.group();
		m = MATCH_ELEMENT_SYMBOL.matcher(secondHeteroAtomSMILES);
		if (!m.find()){
			throw new ComponentGenerationException("Failed to extract element from heteroatom");
		}
		String atom2Element = m.group();
		if (AtomProperties.getHwpriority(ChemEl.valueOf(atom2Element)) > AtomProperties.getHwpriority(ChemEl.valueOf(atom1Element))){
			throw new ComponentGenerationException("heterogeneous hydride with alternating atoms misparsed as a Hantzch-widman ring");
		}
	}

	/** Handle indicated hydrogen e.g. 1H- in 1H-pyrrole
	 *
	 * @param subOrRoot The substituent/root to look for indicated hydrogens in.
	 * @throws ComponentGenerationException
	 */
	private void processIndicatedHydrogens(Element subOrRoot) throws ComponentGenerationException {
		List<Element> indicatedHydrogens = subOrRoot.getChildElements(INDICATEDHYDROGEN_EL);
		for (Element indicatedHydrogenGroup : indicatedHydrogens) {
			String txt = StringTools.removeDashIfPresent(indicatedHydrogenGroup.getValue());
			if (!StringTools.endsWithCaseInsensitive(txt, "h")){//remove brackets if they are present
				txt = txt.substring(1, txt.length()-1);
			}
			String[] hydrogenLocants =txt.split(",");
			for (String hydrogenLocant : hydrogenLocants) {
				if (StringTools.endsWithCaseInsensitive(hydrogenLocant, "h")) {
					String locant = fixLocantCapitalisation(hydrogenLocant.substring(0, hydrogenLocant.length() - 1));
					Element indicatedHydrogenEl = new TokenEl(INDICATEDHYDROGEN_EL);
					indicatedHydrogenEl.addAttribute(new Attribute(LOCANT_ATR, locant));
					OpsinTools.insertBefore(indicatedHydrogenGroup, indicatedHydrogenEl);
				}
				else{
					throw new ComponentGenerationException("OPSIN Bug: malformed indicated hydrogen element!");
				}
			}
			indicatedHydrogenGroup.detach();
		}
	}

	/** Handles stereoChemistry in brackets: R/S/E/Z/a/alpha/b/beta and cis/trans
	 * Will assign a locant to a stereoChemistry element if one was specified/available
	 *
	 * @param subOrRoot The substituent/root to look for stereoChemistry in.
* @throws ComponentGenerationException */ void processStereochemistry(Element subOrRoot) throws ComponentGenerationException { List stereoChemistryElements = subOrRoot.getChildElements(STEREOCHEMISTRY_EL); List locantedUnbrackettedEzTerms = new ArrayList<>(); for (Element stereoChemistryElement : stereoChemistryElements) { if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(STEREOCHEMISTRYBRACKET_TYPE_VAL)){ processStereochemistryBracket(stereoChemistryElement); } else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(CISORTRANS_TYPE_VAL)){ assignLocantUsingPreviousElementIfPresent(stereoChemistryElement);//assign a locant if one is directly before the cis/trans } else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(E_OR_Z_TYPE_VAL)){ stereoChemistryElement.addAttribute(new Attribute(VALUE_ATR, stereoChemistryElement.getValue().toUpperCase(Locale.ROOT))); if (assignLocantUsingPreviousElementIfPresent(stereoChemistryElement)) {//assign a locant if one is directly before the E/Z locantedUnbrackettedEzTerms.add(stereoChemistryElement); } } else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(ENDO_EXO_SYN_ANTI_TYPE_VAL)){ processLocantAssigningForEndoExoSynAnti(stereoChemistryElement);//assign a locant if one is directly before the endo/exo/syn/anti. 
				//Don't necessarily detach it
			}
			else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(ALPHA_OR_BETA_TYPE_VAL)){
				processUnbracketedAlphaBetaStereochemistry(stereoChemistryElement);
			}
			else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(RELATIVECISTRANS_TYPE_VAL)){
				processRelativeCisTrans(stereoChemistryElement);
			}
			else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(OPTICALROTATION_TYPE_VAL)){
				processOpticalRotation(stereoChemistryElement);
			}
		}
		if (locantedUnbrackettedEzTerms.size() > 0) {
			duplicateLocantFromStereoTermIfAdjacentToEneOrYlidene(locantedUnbrackettedEzTerms);
		}
	}

	private void processStereochemistryBracket(Element stereoChemistryElement) throws ComponentGenerationException {
		try {
			StereoGroupType stereoType = null;
			String txt = stereoChemistryElement.getValue();
			if (StringTools.startsWithCaseInsensitive(txt, "rel-")) {
				stereoType = StereoGroupType.Rel;
				txt = txt.substring(4);
			}
			txt = StringTools.removeDashIfPresent(txt);
			Matcher racemicMatcher = matchRacemic.matcher(txt);
			if (racemicMatcher.lookingAt()) {
				txt = txt.substring(racemicMatcher.group().length());
				// should be an error if relative is set already but should not be
				// possible to be matched by grammar...
				stereoType = StereoGroupType.Rac;
			}
			txt = normaliseBinaryBrackets(txt);
			if (txt.length() > 0) {//if txt is just "rel- or rac-" then it will be length 0 at this point
				List<String> stereoChemistryDescriptors = splitStereoBracketIntoDescriptors(txt);
				boolean exclusiveStereoTerm = false;
				if (stereoChemistryDescriptors.size() == 1) {
					String stereoChemistryDescriptor = stereoChemistryDescriptors.get(0);
					if (stereoChemistryDescriptor.equalsIgnoreCase("rel")) {
						stereoType = StereoGroupType.Rel;
						exclusiveStereoTerm = true;
					}
					else if (matchRacemic.matcher(stereoChemistryDescriptor).matches()) {
						stereoType = StereoGroupType.Rac;
						exclusiveStereoTerm = true;
					}
				}
				if (!exclusiveStereoTerm) {
					for (String stereoChemistryDescriptor : stereoChemistryDescriptors) {
						Matcher m = matchStereochemistry.matcher(stereoChemistryDescriptor);
						if (m.matches()) {
							Element stereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, stereoChemistryDescriptor);
							String locantVal = m.group(1);
							if (locantVal.length() > 0) {
								stereoChemEl.addAttribute(new Attribute(LOCANT_ATR, fixLocantCapitalisation(StringTools.removeDashIfPresent(locantVal))));
							}
							OpsinTools.insertBefore(stereoChemistryElement, stereoChemEl);
							if (matchRS.matcher(m.group(2)).matches()) {
								stereoChemEl.addAttribute(new Attribute(TYPE_ATR, R_OR_S_TYPE_VAL));
								String symbol = m.group(2).toUpperCase(Locale.ROOT).replaceAll("/", "");
								StereoGroupType localStereoType = stereoType; // needs to be local
								if (symbol.equals("RS") || symbol.equals("SR") || symbol.equals("RANDS") || symbol.equals("SANDR")) {
									// rel-(RS) is conflicting, interpret as relative even though a
									// racemic descriptor was used
									if (localStereoType == null) {
										localStereoType = StereoGroupType.Rac;
									}
									symbol = symbol.substring(0, 1); // RS => R, SR => S
								}
								else if (symbol.equals("R*") || symbol.equals("S*") || symbol.equals("R^*") || symbol.equals("S^*") || symbol.equals("RORS") || symbol.equals("SORR")) {
									// rac-(R*) is conflicting, interpret as racemic even though a
									// relative descriptor was used
									if
(localStereoType == null) { localStereoType = StereoGroupType.Rel; } symbol = symbol.substring(0, 1); // R* => R, S* => S } stereoChemEl.addAttribute(new Attribute(VALUE_ATR, symbol)); if (localStereoType == null) { localStereoType = StereoGroupType.Abs; } stereoChemEl.addAttribute(new Attribute(STEREOGROUP_ATR, localStereoType.name())); } else if (matchEZ.matcher(m.group(2)).matches()) { stereoChemEl.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL)); String symbol = m.group(2).toUpperCase(Locale.ROOT); if (symbol.equalsIgnoreCase("EandZ") || symbol.equalsIgnoreCase("EorZ") || symbol.equalsIgnoreCase("E*") || symbol.equalsIgnoreCase("Z*")) symbol = "EZ"; stereoChemEl.addAttribute(new Attribute(VALUE_ATR, symbol)); } else if (matchAlphaBetaStereochem.matcher(m.group(2)).matches()) { stereoChemEl.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL)); if (Character.toLowerCase(m.group(2).charAt(0)) == 'a') { stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "alpha")); } else if (Character.toLowerCase(m.group(2).charAt(0)) == 'b') { stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "beta")); } else if (Character.toLowerCase(m.group(2).charAt(0)) == 'x') { stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "xi")); } else { throw new ComponentGenerationException("Malformed alpha/beta stereochemistry element: " + stereoChemistryElement.getValue()); } } else if (matchCisTrans.matcher(m.group(2)).matches()) { stereoChemEl.addAttribute(new Attribute(TYPE_ATR, CISORTRANS_TYPE_VAL)); stereoChemEl.addAttribute(new Attribute(VALUE_ATR, m.group(2).toLowerCase(Locale.ROOT))); } else if (matchEndoExoSynAnti.matcher(m.group(2)).matches()) { stereoChemEl.addAttribute(new Attribute(TYPE_ATR, ENDO_EXO_SYN_ANTI_TYPE_VAL)); stereoChemEl.addAttribute(new Attribute(VALUE_ATR, m.group(2).toLowerCase(Locale.ROOT))); } else if (matchAxialStereo.matcher(m.group(2)).matches()) { stereoChemEl.addAttribute(new Attribute(TYPE_ATR, AXIAL_TYPE_VAL)); stereoChemEl.addAttribute(new 
Attribute(VALUE_ATR, m.group(2))); } else { throw new ComponentGenerationException("Malformed stereochemistry element: " + stereoChemistryElement.getValue()); } } else { throw new ComponentGenerationException("Malformed stereochemistry element: " + stereoChemistryElement.getValue()); } } } else { // Rac or Rel exclusively Element stereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, stereoChemistryElement.getValue()); stereoChemEl.addAttribute(new Attribute(TYPE_ATR, stereoType == StereoGroupType.Rac ? RAC_TYPE_VAL : REL_TYPE_VAL)); OpsinTools.insertBefore(stereoChemistryElement, stereoChemEl); } } else { if (stereoType == StereoGroupType.Rac || stereoType == StereoGroupType.Rel) { Element stereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, stereoChemistryElement.getValue()); stereoChemEl.addAttribute(new Attribute(TYPE_ATR, stereoType == StereoGroupType.Rac ? RAC_TYPE_VAL : REL_TYPE_VAL)); OpsinTools.insertBefore(stereoChemistryElement, stereoChemEl); } } } catch (StereochemistryException ex) { if (buildState.n2sConfig.warnRatherThanFailOnUninterpretableStereochemistry()) { buildState.addWarning(OpsinWarning.OpsinWarningType.STEREOCHEMISTRY_IGNORED, ex.getMessage()); } else { throw new ComponentGenerationException(ex); } } finally { stereoChemistryElement.detach(); } } /** * Normalizes brackets that are written with an AND or OR: *

* "(R)- and (S)-" becomes "(RS)", * "(R)- or (S)-" becomes "(R*)-" * "(R,R)- or (S,R)-" becomes "(R*,R)-" * * @param inputStr the stereo bracket test * @return normalised bracket or the input if it could not be normalised */ static String normaliseBinaryBrackets(String inputStr) throws StereochemistryException { if (inputStr.isEmpty()) { return inputStr; } int len = inputStr.length() - 1; int i = 1; for (; i < len; i++) { if (inputStr.charAt(i) == ')') { break; } } if (i == len) { return inputStr; // no match } String firstBracket = inputStr.substring(1, i); i++; // close bracket // optional dash if (i < len && inputStr.charAt(i) == '-') { i++; } StereoGroupType mode; if (StringTools.startsWithCaseInsensitive(inputStr, i, "AND")) { mode = StereoGroupType.Rac; } else if (StringTools.startsWithCaseInsensitive(inputStr, i, "OR")) { mode = StereoGroupType.Rel; } else { return inputStr; } for (; i < len; i++) { if (inputStr.charAt(i) == '(') { break; } } if (i == len) { return inputStr; // no match } int mark = i + 1; for (; i < len; i++) { if (inputStr.charAt(i) == ')') { break; } } String secondBracket = inputStr.substring(mark, i); if (firstBracket.length() != secondBracket.length()) { throw new StereochemistryException("Alternative stereochemistry brackets are different lengths: " + firstBracket + " " + secondBracket); } StringBuilder generated = new StringBuilder(); generated.append('('); for (int j = 0; j < firstBracket.length(); j++) { generated.append(firstBracket.charAt(j)); if (firstBracket.charAt(j) == secondBracket.charAt(j)) { continue; } else if (firstBracket.charAt(j) == 'R' || firstBracket.charAt(j) == 'S' || firstBracket.charAt(j) == 'r' || firstBracket.charAt(j) == 's' || firstBracket.charAt(j) == 'E' || firstBracket.charAt(j) == 'Z') { if (mode == StereoGroupType.Rac) { generated.append(secondBracket.charAt(j)); } else { generated.append('*'); } } else { throw new StereochemistryException("Invalid combination of stereo brackets: " + 
firstBracket.charAt(j) + " " + secondBracket.charAt(j)); } } generated.append(')'); return generated.toString(); } private List splitStereoBracketIntoDescriptors(String stereoBracket) { List stereoDescriptors = new ArrayList<>(); StringBuilder sb = new StringBuilder(); //ignore first and last character (opening and closing bracket) for (int i = 1, l = stereoBracket.length() - 1; i < l; i++) { char ch = stereoBracket.charAt(i); if (ch ==','){ stereoDescriptors.add(sb.toString()); sb.setLength(0); } else if (ch == '-'){ if (matchStereochemistry.matcher(sb.toString()).matches()){ //delimiter between stereochemistry stereoDescriptors.add(sb.toString()); sb.setLength(0); } else{ //locanted stereochemistry term sb.append(ch); } } else{ sb.append(ch); } } stereoDescriptors.add(sb.toString()); return stereoDescriptors; } public void processOpticalRotation(Element e) { if (e.getValue().startsWith("(+/-)") || e.getValue().startsWith("(+-)")) { Element stereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, e.getValue()); stereoChemEl.addAttribute(new Attribute(TYPE_ATR, RAC_TYPE_VAL)); OpsinTools.insertBefore(e, stereoChemEl); } } private boolean assignLocantUsingPreviousElementIfPresent(Element stereoChemistryElement) { Element possibleLocant = OpsinTools.getPrevious(stereoChemistryElement); if (possibleLocant !=null && possibleLocant.getName().equals(LOCANT_EL) && possibleLocant.getValue().split(",").length==1){ stereoChemistryElement.addAttribute(new Attribute(LOCANT_ATR, possibleLocant.getValue())); possibleLocant.detach(); return true; } return false; } private void processLocantAssigningForEndoExoSynAnti(Element stereoChemistryElement) { Element possibleLocant = OpsinTools.getPrevious(stereoChemistryElement); if (possibleLocant !=null && possibleLocant.getName().equals(LOCANT_EL) && possibleLocant.getValue().split(",").length==1){ stereoChemistryElement.addAttribute(new Attribute(LOCANT_ATR, possibleLocant.getValue())); Element group = 
OpsinTools.getNextSibling(stereoChemistryElement, GROUP_EL); if (group != null && (CYCLICUNSATURABLEHYDROCARBON_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) || OpsinTools.getPreviousSibling(group).getName().equals(VONBAEYER_EL))){ //detach locant only if we're sure it has no other meaning //typically locants in front of endo/exo/syn/anti also indicate the position of a susbtituent/suffix e.g. 3-exo-amino possibleLocant.detach(); } } } private void processUnbracketedAlphaBetaStereochemistry(Element stereoChemistryElement) throws ComponentGenerationException { String txt = StringTools.removeDashIfPresent(stereoChemistryElement.getValue()); String[] stereoChemistryDescriptors = txt.split(","); List locants = new ArrayList<>(); boolean createLocantsEl =false; for (String stereoChemistryDescriptor : stereoChemistryDescriptors) { Matcher digitMatcher = MATCH_DIGITS.matcher(stereoChemistryDescriptor); if (digitMatcher.lookingAt()){ String locant = digitMatcher.group(); String possibleAlphaBeta = digitMatcher.replaceAll(""); locants.add(locant); Matcher alphaBetaMatcher = matchAlphaBetaStereochem.matcher(possibleAlphaBeta); if (alphaBetaMatcher.matches()){ Element stereoChemEl = new TokenEl(STEREOCHEMISTRY_EL, stereoChemistryDescriptor); stereoChemEl.addAttribute(new Attribute(LOCANT_ATR, locant)); OpsinTools.insertBefore(stereoChemistryElement, stereoChemEl); stereoChemEl.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL)); if (Character.toLowerCase(possibleAlphaBeta.charAt(0)) == 'a'){ stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "alpha")); } else if (Character.toLowerCase(possibleAlphaBeta.charAt(0)) == 'b'){ stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "beta")); } else if (Character.toLowerCase(possibleAlphaBeta.charAt(0)) == 'x'){ stereoChemEl.addAttribute(new Attribute(VALUE_ATR, "xi")); } else{ throw new ComponentGenerationException("Malformed alpha/beta stereochemistry element: " + stereoChemistryElement.getValue()); } } else{ 
					createLocantsEl =true;
				}
			}
		}
		if (!createLocantsEl){
			//create locants unless a group supporting alpha/beta stereochem is within this substituent/root
			createLocantsEl =true;
			List<Element> groups = OpsinTools.getNextSiblingsOfType(stereoChemistryElement, GROUP_EL);
			for (Element group : groups) {
				if (group.getAttributeValue(ALPHABETACLOCKWISEATOMORDERING_ATR)!=null){
					createLocantsEl=false;
					break;
				}
			}
		}
		if (createLocantsEl){
			Element newLocantEl = new TokenEl(LOCANT_EL, StringTools.stringListToString(locants, ","));
			OpsinTools.insertAfter(stereoChemistryElement, newLocantEl);
		}
		stereoChemistryElement.detach();
	}

	private void processRelativeCisTrans(Element stereoChemistryElement) {
		String value = StringTools.removeDashIfPresent(stereoChemistryElement.getValue());
		StringBuilder sb = new StringBuilder();
		String[] terms = value.split(",");
		for (String term : terms) {
			if (term.startsWith("c-")|| term.startsWith("t-") || term.startsWith("r-")){
				if (sb.length() > 0){
					sb.append(',');
				}
				sb.append(term.substring(2));
			}
			else{
				throw new RuntimeException("Malformed relativeCisTrans element");
			}
		}
		Element locantEl = new TokenEl(LOCANT_EL, sb.toString());
		OpsinTools.insertAfter(stereoChemistryElement, locantEl);
	}

	/**
	 * If the e/z term is next to an ene or ylidene duplicate the locant
	 * e.g.
2E,4Z-diene --> 2E,4Z-2,4-diene * 2E-ylidene --> 2E-2-ylidene * @param locantedUnbrackettedEzTerms */ private void duplicateLocantFromStereoTermIfAdjacentToEneOrYlidene(List locantedUnbrackettedEzTerms) { for (int i = 0, l = locantedUnbrackettedEzTerms.size(); i < l; i++) { Element currentTerm = locantedUnbrackettedEzTerms.get(i); List groupedTerms = new ArrayList<>(); groupedTerms.add(currentTerm); while (i + 1 < l && locantedUnbrackettedEzTerms.get(i + 1).equals(OpsinTools.getNextSibling(currentTerm))) { currentTerm = locantedUnbrackettedEzTerms.get(++i); groupedTerms.add(currentTerm); } Element lastTermInGroup = groupedTerms.get(groupedTerms.size() - 1); Element eneOrYlidene; if (groupedTerms.size() > 1) { Element multiplier = OpsinTools.getNextSibling(lastTermInGroup); if (!(multiplier != null && multiplier.getName().equals(MULTIPLIER_EL) && String.valueOf(groupedTerms.size()).equals(multiplier.getAttributeValue(VALUE_ATR)))) { continue; } eneOrYlidene = OpsinTools.getNextSibling(multiplier); } else { eneOrYlidene = OpsinTools.getNextSibling(lastTermInGroup); } if (eneOrYlidene != null) { String name = eneOrYlidene.getName(); if (name.equals(UNSATURATOR_EL) || name.equals(SUFFIX_EL)) { if ((name.equals(UNSATURATOR_EL) && eneOrYlidene.getAttributeValue(VALUE_ATR).equals("2")) || (name.equals(SUFFIX_EL) && eneOrYlidene.getAttributeValue(VALUE_ATR).equals("ylidene"))) { List locants = new ArrayList<>(); for (Element stereochemistryTerm : groupedTerms) { locants.add(stereochemistryTerm.getAttributeValue(LOCANT_ATR)); } Element newLocant = new TokenEl(LOCANT_EL, StringTools.stringListToString(locants, ",")); OpsinTools.insertAfter(lastTermInGroup, newLocant); } else{ if (name.equals(UNSATURATOR_EL)){ throw new RuntimeException("After E/Z stereo expected ene but found: " + eneOrYlidene.getValue()); } else { throw new RuntimeException("After E/Z stereo expected yldiene but found: " + eneOrYlidene.getValue()); } } } } } } /** * Looks for "suffixPrefix" and assigns 
	 * their value as an attribute of an adjacent suffix
	 * @param subOrRoot
	 * @throws ComponentGenerationException
	 */
	private void processSuffixPrefixes(Element subOrRoot) throws ComponentGenerationException {
		List<Element> suffixPrefixes = subOrRoot.getChildElements(SUFFIXPREFIX_EL);
		for (Element suffixPrefix : suffixPrefixes) {
			Element suffix = OpsinTools.getNextSibling(suffixPrefix);
			if (suffix==null || ! suffix.getName().equals(SUFFIX_EL)){
				throw new ComponentGenerationException("OPSIN bug: suffix not found after suffixPrefix: " + suffixPrefix.getValue());
			}
			suffix.addAttribute(new Attribute(SUFFIXPREFIX_ATR, suffixPrefix.getAttributeValue(VALUE_ATR)));
			suffixPrefix.detach();
		}
	}

	/**
	 * Looks for infixes and assigns them to the next suffix using a semicolon delimited infix attribute
	 * If the infix/suffix block has been bracketed e.g. (dithioate) then the infix is multiplied out
	 * If preceded by a suffixPrefix e.g. sulfono infixes are also multiplied out
	 * If a multiplier is present and neither of these cases is met then it is ambiguous as to whether the multiplier is referring to the infix or the infixed suffix
	 * This ambiguity is resolved in processInfixFunctionalReplacementNomenclature by looking at the structure of the suffix to be modified
	 * @param subOrRoot
	 * @throws ComponentGenerationException
	 */
	private void processInfixes(Element subOrRoot) throws ComponentGenerationException {
		List<Element> infixes = subOrRoot.getChildElements(INFIX_EL);
		for (Element infix : infixes) {
			Element suffix = OpsinTools.getNextSiblingIgnoringCertainElements(infix, new String[]{INFIX_EL, SUFFIXPREFIX_EL, MULTIPLIER_EL});
			if (suffix ==null || !suffix.getName().equals(SUFFIX_EL)){
				throw new ComponentGenerationException("No suffix found next next to infix: "+ infix.getValue());
			}
			List<String> currentInfixInformation;
			if (suffix.getAttribute(INFIX_ATR)==null){
				suffix.addAttribute(new Attribute(INFIX_ATR, ""));
				currentInfixInformation = new ArrayList<>();
			}
			else{
				currentInfixInformation =
StringTools.arrayToList(suffix.getAttributeValue(INFIX_ATR).split(";")); } String infixValue =infix.getAttributeValue(VALUE_ATR); currentInfixInformation.add(infixValue); Element possibleMultiplier = OpsinTools.getPreviousSibling(infix); Element possibleBracket; boolean multiplierKnownToIndicateInfixMultiplicationPresent =false; if (possibleMultiplier.getName().equals(MULTIPLIER_EL)){ //suffix prefix present so multiplier must indicate infix replacement Element possibleSuffixPrefix = OpsinTools.getPreviousSiblingIgnoringCertainElements(infix, new String[]{MULTIPLIER_EL, INFIX_EL}); if (possibleSuffixPrefix!=null && possibleSuffixPrefix.getName().equals(SUFFIXPREFIX_EL)){ multiplierKnownToIndicateInfixMultiplicationPresent =true; } Element elementBeforeMultiplier = OpsinTools.getPreviousSibling(possibleMultiplier); //double multiplier indicates multiple suffixes which all have their infix multiplied //if currentInfixInformation contains more than 1 entry it contains information from an infix from before the multiplier so the interpretation of the multiplier as a suffix multiplier is impossible if (elementBeforeMultiplier.getName().equals(MULTIPLIER_EL) || currentInfixInformation.size() > 1){ multiplierKnownToIndicateInfixMultiplicationPresent =true; } possibleBracket = elementBeforeMultiplier; } else{ possibleBracket=possibleMultiplier; possibleMultiplier=null; infix.detach(); } if (possibleBracket.getName().equals(STRUCTURALOPENBRACKET_EL)){ Element bracket = OpsinTools.getNextSibling(suffix); if (!bracket.getName().equals(STRUCTURALCLOSEBRACKET_EL)){ throw new ComponentGenerationException("Matching closing bracket not found around infix/suffix block"); } if (possibleMultiplier!=null){ int multiplierVal = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); for (int i = 1; i < multiplierVal; i++) { currentInfixInformation.add(infixValue); } possibleMultiplier.detach(); infix.detach(); } possibleBracket.detach(); bracket.detach(); } else if 
			(multiplierKnownToIndicateInfixMultiplicationPresent){//multiplier unambiguously means multiplication of the infix
				int multiplierVal = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR));
				for (int i = 1; i < multiplierVal; i++) {
					currentInfixInformation.add(infixValue);
				}
				possibleMultiplier.detach();
				infix.detach();
			}
			else if (possibleMultiplier!=null && GROUP_TYPE_VAL.equals(possibleMultiplier.getAttributeValue(TYPE_ATR))){//e.g. ethanbisthioic acid == ethanbis(thioic acid)
				infix.detach();
			}
			suffix.getAttribute(INFIX_ATR).setValue(StringTools.stringListToString(currentInfixInformation, ";"));
		}
	}

	/**
	 * Identifies lambdaConvention elements.
	 * The element's value is expected to be a comma separated list of lambda values and 0 or more locants. Where a lambda value has the following form:
	 * optional locant, the word lambda and then a number which is the valency specified (with possibly some attempt to indicate this number is superscripted)
	 * If the element is followed by heteroatoms (possibly multiplied) they are multiplied and the locant/lambda assigned to them
	 * Otherwise a new lambdaConvention element is created with the valency specified by the lambda convention taking the attribute "lambda"
	 * In the case where heteroatoms belong to a fused ring system a new lambdaConvention element is also created.
The original locants are retained in the benzo specific fused ring nomenclature: * 2H-5lambda^5-phosphinino[3,2-b]pyran --> 2H 5lambda^5 phosphinino[3,2-b]pyran BUT * 1lambda^4,5-Benzodithiepin --> 1lambda^4 1,5-Benzodithiepin * @param subOrRoot * @throws ComponentGenerationException */ private void processLambdaConvention(Element subOrRoot) throws ComponentGenerationException { List lambdaConventionEls = subOrRoot.getChildElements(LAMBDACONVENTION_EL); boolean fusedRingPresent = false; if (lambdaConventionEls.size()>0){ if (subOrRoot.getChildElements(GROUP_EL).size()>1){ fusedRingPresent = true; } } for (Element lambdaConventionEl : lambdaConventionEls) { boolean frontLocantsExpected =false;//Is the lambdaConvention el followed by benz/benzo of a fused ring system (these have front locants which correspond to the final fused rings numbering) or by a polycylicspiro system String[] lambdaValues = StringTools.removeDashIfPresent(lambdaConventionEl.getValue()).split(","); Element possibleHeteroatomOrMultiplier = OpsinTools.getNextSibling(lambdaConventionEl); int heteroCount = 0; int multiplierValue = 1; while(possibleHeteroatomOrMultiplier != null){ if(possibleHeteroatomOrMultiplier.getName().equals(HETEROATOM_EL)) { heteroCount+=multiplierValue; multiplierValue =1; } else if (possibleHeteroatomOrMultiplier.getName().equals(MULTIPLIER_EL)){ multiplierValue = Integer.parseInt(possibleHeteroatomOrMultiplier.getAttributeValue(VALUE_ATR)); } else{ break; } possibleHeteroatomOrMultiplier = OpsinTools.getNextSibling(possibleHeteroatomOrMultiplier); } boolean assignLambdasToHeteroAtoms =false; if (lambdaValues.length==heteroCount){//heteroatom and number of locants +lambdas must match if (fusedRingPresent && possibleHeteroatomOrMultiplier!=null && possibleHeteroatomOrMultiplier.getName().equals(GROUP_EL) && possibleHeteroatomOrMultiplier.getAttributeValue(SUBTYPE_ATR).equals(HANTZSCHWIDMAN_SUBTYPE_VAL)){ //You must not set the locants of a HW system which forms a component 
of a fused ring system. The locant specified corresponds to the complete fused ring system. } else{ assignLambdasToHeteroAtoms =true; } } else if (possibleHeteroatomOrMultiplier!=null && ((heteroCount==0 && OpsinTools.getNextSibling(lambdaConventionEl).equals(possibleHeteroatomOrMultiplier) && fusedRingPresent && possibleHeteroatomOrMultiplier.getName().equals(GROUP_EL) && (possibleHeteroatomOrMultiplier.getValue().equals("benzo") || possibleHeteroatomOrMultiplier.getValue().equals("benz")) && !OpsinTools.getNextSibling(possibleHeteroatomOrMultiplier).getName().equals(FUSION_EL) && !OpsinTools.getNextSibling(possibleHeteroatomOrMultiplier).getName().equals(LOCANT_EL)) || (possibleHeteroatomOrMultiplier.getName().equals(POLYCYCLICSPIRO_EL) && (possibleHeteroatomOrMultiplier.getAttributeValue(VALUE_ATR).equals("spirobi")|| possibleHeteroatomOrMultiplier.getAttributeValue(VALUE_ATR).equals("spiroter"))))){ frontLocantsExpected = true;//a benzo fused ring e.g. 1lambda4,3-benzothiazole or a symmetrical poly cyclic spiro system } List heteroAtoms = new ArrayList<>();//contains the heteroatoms to apply the lambda values too. 
Can be empty if the values are applied to a group directly rather than to a heteroatom if (assignLambdasToHeteroAtoms){//populate heteroAtoms, multiplied heteroatoms are multiplied out Element multiplier = null; Element heteroatomOrMultiplier = OpsinTools.getNextSibling(lambdaConventionEl); while(heteroatomOrMultiplier != null){ if(heteroatomOrMultiplier.getName().equals(HETEROATOM_EL)) { heteroAtoms.add(heteroatomOrMultiplier); if (multiplier!=null){ for (int i = 1; i < Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)); i++) { Element newHeteroAtom = heteroatomOrMultiplier.copy(); OpsinTools.insertBefore(heteroatomOrMultiplier, newHeteroAtom); heteroAtoms.add(newHeteroAtom); } multiplier.detach(); multiplier=null; } } else if (heteroatomOrMultiplier.getName().equals(MULTIPLIER_EL)){ if (multiplier !=null){ break; } else{ multiplier = heteroatomOrMultiplier; } } else{ break; } heteroatomOrMultiplier = OpsinTools.getNextSibling(heteroatomOrMultiplier); } } for (int i = 0; i < lambdaValues.length; i++) {//assign all the lambdas to heteroatoms or to newly created lambdaConvention elements Matcher m = matchLambdaConvention.matcher(lambdaValues[i]); if (m.matches()){//a lambda Attribute valencyChange = new Attribute(LAMBDA_ATR, m.group(2)); String locant = m.group(1) != null ? 
fixLocantCapitalisation(m.group(1)) : null; if (frontLocantsExpected){ if (locant == null) { throw new ComponentGenerationException("Locant not found for lambda convention before a benzo fused ring system"); } lambdaValues[i] = locant; } if (assignLambdasToHeteroAtoms){ Element heteroAtom = heteroAtoms.get(i); heteroAtom.addAttribute(valencyChange); if (locant != null) { heteroAtom.addAttribute(LOCANT_ATR, locant); } } else{ Element newLambda = new TokenEl(LAMBDACONVENTION_EL); newLambda.addAttribute(valencyChange); if (locant != null) { newLambda.addAttribute(LOCANT_ATR, locant); } OpsinTools.insertBefore(lambdaConventionEl, newLambda); } } else{//just a locant e.g 1,3lambda5 String locant = fixLocantCapitalisation(lambdaValues[i]); lambdaValues[i] = locant; if (!assignLambdasToHeteroAtoms){ if (!frontLocantsExpected){ throw new ComponentGenerationException("Lambda convention not specified for locant: " + locant); } } else{ Element heteroAtom = heteroAtoms.get(i); heteroAtom.addAttribute(new Attribute(LOCANT_ATR, locant)); } } } if (!frontLocantsExpected){ lambdaConventionEl.detach(); } else{ lambdaConventionEl.setName(LOCANT_EL); lambdaConventionEl.setValue(StringTools.arrayToString(lambdaValues, ",")); } } } /**Finds matching open and close brackets, and places the * elements contained within in a big <bracket> element. 
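	 *
	 * The nesting-level bookkeeping used to pair brackets can be sketched on a plain token list.
	 * This is an illustration under assumptions: the class and method names are hypothetical,
	 * tokens are plain strings rather than Elements, and the guard against a close bracket
	 * appearing before any open bracket is an addition of this sketch.
	 *
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OPSIN: finds the outermost bracket pairs in a token list.
final class BracketMatchSketch {
    // Returns {openIndex, closeIndex} for every outermost "(" ... ")" pair.
    static List<int[]> outermostPairs(List<String> tokens) {
        List<int[]> pairs = new ArrayList<>();
        int level = 0;   // current nesting depth
        int open = -1;   // index of the outermost open bracket currently being tracked
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (token.equals("(")) {
                if (level == 0) {
                    open = i;
                }
                level++;
            } else if (token.equals(")")) {
                level--;
                if (level < 0) {
                    throw new IllegalArgumentException("Brackets do not match!");
                }
                if (level == 0) {
                    pairs.add(new int[]{open, i});
                }
            }
        }
        if (level != 0) {
            throw new IllegalArgumentException("Brackets do not match!");
        }
        return pairs;
    }
}
```
	 *
	 * Nested pairs are not reported by the sketch; the method below handles them by recursing into each new bracket element.
	 *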
* @param brackets * * @param substituentsAndRoot: The substituent/root elements at the current level of the tree * @throws ComponentGenerationException */ private void findAndStructureBrackets(List substituentsAndRoot, List brackets) throws ComponentGenerationException { int blevel = 0; Element openBracket = null; boolean nestedBrackets = false; for (Element sub : substituentsAndRoot) { List children = sub.getChildElements(); for (Element child : children) { String name = child.getName(); if(name.equals(OPENBRACKET_EL)) { blevel++; if(openBracket == null) { openBracket = child; } else { nestedBrackets = true; } } else if (name.equals(CLOSEBRACKET_EL)) { blevel--; if(blevel == 0) { Element bracket = structureBrackets(openBracket, child); brackets.add(bracket); if (nestedBrackets) { findAndStructureBrackets(OpsinTools.getDescendantElementsWithTagNames(bracket, new String[]{SUBSTITUENT_EL, ROOT_EL}), brackets); } openBracket = null; nestedBrackets = false; } } } } if (blevel != 0){ throw new ComponentGenerationException("Brackets do not match!"); } } /**Places the elements in substituents containing/between an open and close bracket * in a <bracket> tag. * * @param openBracket The open bracket element * @param closeBracket The close bracket element * @return The bracket element thus created. * @throws ComponentGenerationException */ private Element structureBrackets(Element openBracket, Element closeBracket) throws ComponentGenerationException { Element bracket = new GroupingEl(BRACKET_EL); Element currentEl = openBracket.getParent(); OpsinTools.insertBefore(currentEl, bracket); /* Pick up everything in the substituent before the bracket*/ Element firstChild = currentEl.getChild(0); while(!firstChild.equals(openBracket)) { firstChild.detach(); bracket.addChild(firstChild); firstChild = currentEl.getChild(0); } /* Pick up all elements from the one with the open bracket, * to the one with the close bracket, inclusive. 
*/ while(!currentEl.equals(closeBracket.getParent())) { Element nextEl = OpsinTools.getNextSibling(currentEl); currentEl.detach(); bracket.addChild(currentEl); currentEl = nextEl; if (currentEl == null) { throw new ComponentGenerationException("Brackets within a word do not match!"); } } currentEl.detach(); bracket.addChild(currentEl); /* Pick up elements after the close bracket */ currentEl = OpsinTools.getNextSibling(closeBracket); while(currentEl != null) { Element nextEl = OpsinTools.getNextSibling(currentEl); currentEl.detach(); bracket.addChild(currentEl); currentEl = nextEl; } openBracket.detach(); closeBracket.detach(); return bracket; } /**Looks for annulen/polyacene/polyaphene/polyalene/polyphenylene/polynaphthylene/polyhelicene tags and replaces them with a group with appropriate SMILES. * @param subOrRoot The subOrRoot to look for tags in * @throws ComponentGenerationException */ private void processHydroCarbonRings(Element subOrRoot) throws ComponentGenerationException { List annulens = subOrRoot.getChildElements(ANNULEN_EL); for (Element annulen : annulens) { String annulenValue = annulen.getValue(); Matcher m = matchAnnulene.matcher(annulenValue); if (!m.matches()) { throw new ComponentGenerationException("Invalid annulen tag"); } int annuleneSize = Integer.valueOf(m.group(1)); if (annuleneSize < 3) { throw new ComponentGenerationException("Invalid annulene size"); } //build [annulenSize]annulene ring as SMILES StringBuilder sb = new StringBuilder(); if (m.group(2).equalsIgnoreCase("yn")) { sb.append("C1#C"); } else { sb.append("c1c"); } for (int i = 2; i < annuleneSize; i++) { sb.append("c"); } sb.append('1'); Element group = new TokenEl(GROUP_EL, annulenValue); group.addAttribute(new Attribute(VALUE_ATR, sb.toString())); group.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); group.addAttribute(new Attribute(TYPE_ATR, RING_TYPE_VAL)); group.addAttribute(new Attribute(SUBTYPE_ATR, RING_SUBTYPE_VAL)); 
annulen.getParent().replaceChild(annulen, group); } List hydrocarbonFRSystems = subOrRoot.getChildElements(HYDROCARBONFUSEDRINGSYSTEM_EL); for (Element hydrocarbonFRSystem : hydrocarbonFRSystems) { Element multiplier = OpsinTools.getPreviousSibling(hydrocarbonFRSystem); if(multiplier != null && multiplier.getName().equals(MULTIPLIER_EL)) { int multiplierValue =Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)); String classOfHydrocarbonFRSystem =hydrocarbonFRSystem.getAttributeValue(VALUE_ATR); StringBuilder smilesSB= new StringBuilder(); if (classOfHydrocarbonFRSystem.equals("polyacene")){ if (multiplierValue <=3){ throw new ComponentGenerationException("Invalid polyacene"); } smilesSB.append("c1ccc"); for (int j = 2; j <= multiplierValue; j++) { smilesSB.append("c"); smilesSB.append(ringClosure(j)); smilesSB.append("c"); } smilesSB.append("ccc"); for (int j = multiplierValue; j >2; j--) { smilesSB.append("c"); smilesSB.append(ringClosure(j)); smilesSB.append("c"); } smilesSB.append("c12"); }else if (classOfHydrocarbonFRSystem.equals("polyaphene")){ if (multiplierValue <=3){ throw new ComponentGenerationException("Invalid polyaphene"); } smilesSB.append("c1ccc"); int ringsAbovePlane; int ringsOnPlane; int ringOpeningCounter=2; if (multiplierValue %2==0){ ringsAbovePlane =(multiplierValue-2)/2; ringsOnPlane =ringsAbovePlane +1; } else{ ringsAbovePlane =(multiplierValue-1)/2; ringsOnPlane =ringsAbovePlane; } for (int j = 1; j <= ringsAbovePlane; j++) { smilesSB.append("c"); smilesSB.append(ringClosure(ringOpeningCounter++)); smilesSB.append("c"); } for (int j = 1; j <= ringsOnPlane; j++) { smilesSB.append("cc"); smilesSB.append(ringClosure(ringOpeningCounter++)); } smilesSB.append("ccc"); ringOpeningCounter--; for (int j = 1; j <= ringsOnPlane; j++) { smilesSB.append("cc"); smilesSB.append(ringClosure(ringOpeningCounter--)); } for (int j = 1; j < ringsAbovePlane; j++) { smilesSB.append("c"); smilesSB.append(ringClosure(ringOpeningCounter--)); 
smilesSB.append("c"); } smilesSB.append("c12"); } else if (classOfHydrocarbonFRSystem.equals("polyalene")){ if (multiplierValue <5){ throw new ComponentGenerationException("Invalid polyalene"); } smilesSB.append("c1"); for (int j = 3; j < multiplierValue; j++) { smilesSB.append("c"); } smilesSB.append("c2"); for (int j = 3; j <= multiplierValue; j++) { smilesSB.append("c"); } smilesSB.append("c12"); } else if (classOfHydrocarbonFRSystem.equals("polyphenylene")){ if (multiplierValue <2){ throw new ComponentGenerationException("Invalid polyphenylene"); } smilesSB.append("c1cccc2"); for (int j = 1; j < multiplierValue; j++) { smilesSB.append("c3ccccc3"); } smilesSB.append("c12"); } else if (classOfHydrocarbonFRSystem.equals("polynaphthylene")){ if (multiplierValue <3){ throw new ComponentGenerationException("Invalid polynaphthylene"); } smilesSB.append("c1cccc2cc3"); for (int j = 1; j < multiplierValue; j++) { smilesSB.append("c4cc5ccccc5cc4"); } smilesSB.append("c3cc12"); } else if (classOfHydrocarbonFRSystem.equals("polyhelicene")){ if (multiplierValue <4){ throw new ComponentGenerationException("Invalid polyhelicene"); } smilesSB.append("c1c"); int ringOpeningCounter=2; for (int j = 1; j < multiplierValue; j++) { smilesSB.append("ccc"); smilesSB.append(ringClosure(ringOpeningCounter++)); } smilesSB.append("cccc"); ringOpeningCounter--; for (int j = 2; j < multiplierValue; j++) { smilesSB.append("c"); smilesSB.append(ringClosure(ringOpeningCounter--)); } smilesSB.append("c12"); } else{ throw new ComponentGenerationException("Unknown semi-trivially named hydrocarbon fused ring system"); } Element newGroup =new TokenEl(GROUP_EL, multiplier.getValue() + hydrocarbonFRSystem.getValue()); newGroup.addAttribute(new Attribute(VALUE_ATR, smilesSB.toString())); newGroup.addAttribute(new Attribute(LABELS_ATR, FUSEDRING_LABELS_VAL)); newGroup.addAttribute(new Attribute(TYPE_ATR, RING_TYPE_VAL)); newGroup.addAttribute(new Attribute(SUBTYPE_ATR, HYDROCARBONFUSEDRINGSYSTEM_EL)); 
hydrocarbonFRSystem.getParent().replaceChild(hydrocarbonFRSystem, newGroup); multiplier.detach(); } else{ throw new ComponentGenerationException("Invalid semi-trivially named hydrocarbon fused ring system"); } } } /** * Handles irregular suffixes. e.g. Quinone and ylene * @param subOrRoot * @throws ComponentGenerationException */ private void handleSuffixIrregularities(Element subOrRoot) throws ComponentGenerationException { List suffixes = subOrRoot.getChildElements(SUFFIX_EL); for (Element suffix : suffixes) { String suffixValue = suffix.getValue(); if (suffixValue.equals("ic") || suffixValue.equals("ous")){ if (!buildState.n2sConfig.allowInterpretationOfAcidsWithoutTheWordAcid()) { Element next = OpsinTools.getNext(suffix); if (next == null){ throw new ComponentGenerationException("\"acid\" not found after " +suffixValue); } } } // convert quinone to dione else if (suffixValue.equals("quinone") || suffixValue.equals("quinon")){ suffix.removeAttribute(suffix.getAttribute(ADDITIONALVALUE_ATR)); suffix.setValue("one"); Element multiplier = OpsinTools.getPreviousSibling(suffix); if (multiplier.getName().equals(MULTIPLIER_EL)){ Attribute multVal = multiplier.getAttribute(VALUE_ATR); int newMultiplier = Integer.parseInt(multVal.getValue()) * 2; multVal.setValue(String.valueOf(newMultiplier)); } else{ multiplier = new TokenEl(MULTIPLIER_EL, "di"); multiplier.addAttribute(new Attribute(VALUE_ATR, "2")); OpsinTools.insertBefore(suffix, multiplier); } } else if (suffixValue.equals("ylene") || suffixValue.equals("ylen")){ suffix.removeAttribute(suffix.getAttribute(ADDITIONALVALUE_ATR)); suffix.setValue("yl"); Element alk = OpsinTools.getPreviousSibling(suffix, GROUP_EL); if (alk.getAttribute(USABLEASJOINER_ATR)!=null){ alk.removeAttribute(alk.getAttribute(USABLEASJOINER_ATR)); } Element multiplier = new TokenEl(MULTIPLIER_EL, "di"); multiplier.addAttribute(new Attribute(VALUE_ATR, "2")); OpsinTools.insertBefore(suffix, multiplier); } else if (suffixValue.equals("ylium") 
&&//disambiguate between ylium the charge modifying suffix and ylium the acylium suffix "acylium".equals(suffix.getAttributeValue(VALUE_ATR)) && suffix.getAttribute(SUFFIXPREFIX_ATR)==null && suffix.getAttribute(INFIX_ATR)==null){ Element group = OpsinTools.getPreviousSibling(suffix, GROUP_EL); if (group==null || (!ACIDSTEM_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR)) && !CHALCOGENACIDSTEM_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR)) && !NONCARBOXYLICACID_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR)))){ Element beforeSuffix = OpsinTools.getPreviousSibling(suffix); String o = beforeSuffix.getAttributeValue(SUBSEQUENTUNSEMANTICTOKEN_ATR); if (o ==null || !StringTools.endsWithCaseInsensitive(o, "o")){ if (group!=null && ARYLSUBSTITUENT_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ //contracted form for removal of hydride e.g. 9-Anthrylium suffix.getAttribute(VALUE_ATR).setValue("ylium"); suffix.getAttribute(TYPE_ATR).setValue(CHARGE_TYPE_VAL); suffix.removeAttribute(suffix.getAttribute(SUBTYPE_ATR)); } else{ throw new ComponentGenerationException("ylium is intended to be the removal of H- in this context not the formation of an acylium ion"); } } } } else if (suffixValue.equals("nitrolic acid") || suffixValue.equals("nitrolicacid")) { Element precedingGroup = OpsinTools.getPreviousSibling(suffix, GROUP_EL); if (precedingGroup == null){ if (subOrRoot.getChildCount() != 1) { throw new RuntimeException("OPSIN Bug: nitrolic acid not expected to have sibilings"); } Element precedingSubstituent = OpsinTools.getPreviousSibling(subOrRoot); if(precedingSubstituent == null || !precedingSubstituent.getName().equals(SUBSTITUENT_EL)){ throw new ComponentGenerationException("Expected substituent before nitrolic acid"); } List existingSuffixes = precedingSubstituent.getChildElements(SUFFIX_EL); if (existingSuffixes.size() == 1) { if (!existingSuffixes.get(0).getValue().equals("yl")){ throw new ComponentGenerationException("Unexpected suffix found before 
nitrolic acid");
						}
						existingSuffixes.get(0).detach();
						for (Element child : precedingSubstituent.getChildElements()) {
							child.detach();
							OpsinTools.insertBefore(suffix, child);
						}
						precedingSubstituent.detach();
					}
					else{
						throw new ComponentGenerationException("Only the nitrolic acid case where it is preceded by an yl suffix is supported");
					}
				}
			}
		}
	}

	/**
	 * Looks for alkaneStems followed by a bridge forming 'o' and makes them fused ring bridge elements
	 * @param group
	 */
	private void detectAlkaneFusedRingBridges(Element group) {
		if (ALKANESTEM_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){
			Element unsaturator = OpsinTools.getNextSibling(group);
			if (unsaturator != null && unsaturator.getName().equals(UNSATURATOR_EL)) {
				Element possibleBridgeFormer = OpsinTools.getNextSiblingIgnoringCertainElements(group, new String[]{UNSATURATOR_EL});
				if(possibleBridgeFormer != null && possibleBridgeFormer.getName().equals(BRIDGEFORMINGO_EL)){
					group.setName(FUSEDRINGBRIDGE_EL);
					Attribute smilesValAtr = group.getAttribute(VALUE_ATR);
					smilesValAtr.setValue("-" + smilesValAtr.getValue() + "-");
					possibleBridgeFormer.detach();
					unsaturator.detach();
				}
			}
		}
	}

	/**Looks for (multiplier)cyclo/spiro/cyclo tags before a chain
	 * and replaces them with a group with appropriate SMILES
	 * Note that only simple spiro tags are handled at this stage i.e. not dispiro
	 * @param group A group which is potentially a chain
	 * @throws ComponentGenerationException
	 */
	private void processRings(Element group) throws ComponentGenerationException {
		Element previous = OpsinTools.getPreviousSiblingIgnoringCertainElements(group, new String[]{LOCANT_EL});
		if(previous != null) {
			String previousElType = previous.getName();
			if(previousElType.equals(SPIRO_EL)){
				processSpiroSystem(group, previous);
			} else if(previousElType.equals(VONBAEYER_EL)) {
				processVonBaeyerSystem(group, previous);
			}
			else if(previousElType.equals(CYCLO_EL)) {
				processCyclisedChain(group, previous);
			}
		}
	}

	/**
	 * Processes a spiro descriptor element.
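	 *
	 * The atom count check performed on the descriptor can be sketched as follows: the number of
	 * atoms is the sum of the descriptor values plus one atom per spiro centre, and this must equal
	 * the length of the chain. The class and method names are hypothetical and superscripted
	 * locants are deliberately not handled in this illustration.
	 *
```java
// Hypothetical helper, not part of OPSIN: counts the atoms implied by a simple spiro descriptor.
final class SpiroCountSketch {
    // descriptor: e.g. "spiro[4.5]"; numberOfSpiros: 1 for spiro, 2 for dispiro, and so on
    static int atomCount(String descriptor, int numberOfSpiros) {
        String inner = descriptor.substring(descriptor.indexOf('[') + 1, descriptor.lastIndexOf(']'));
        int total = numberOfSpiros; // each spiro atom is shared by two rings and counted once
        for (String part : inner.split("[,.]")) {
            total += Integer.parseInt(part);
        }
        return total;
    }
}
```
	 *
	 * For example spiro[4.5] implies 4 + 5 + 1 = 10 atoms, matching a decane chain.
	 *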
	 * This modifies the provided chainGroup into the spiro system by replacing the value of the chain group with appropriate SMILES
	 * @param chainGroup
	 * @param spiroEl
	 * @throws ComponentGenerationException
	 * @throws NumberFormatException
	 */
	private void processSpiroSystem(Element chainGroup, Element spiroEl) throws NumberFormatException, ComponentGenerationException {
		int[][] spiroDescriptors = getSpiroDescriptors(StringTools.removeDashIfPresent(spiroEl.getValue()));

		Element multiplier = OpsinTools.getPreviousSibling(spiroEl);
		int numberOfSpiros = 1;
		if (multiplier != null && multiplier.getName().equals(MULTIPLIER_EL) && BASIC_TYPE_VAL.equals(multiplier.getAttributeValue(TYPE_ATR))) {
			numberOfSpiros = Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR));
			multiplier.detach();
		}
		int numberOfCarbonInDescriptors = 0;
		boolean hasSuperscripts = false;
		for (int[] spiroDescriptor : spiroDescriptors) {
			numberOfCarbonInDescriptors += spiroDescriptor[0];
			if (spiroDescriptor[1] != -1) {
				hasSuperscripts = true;
			}
		}
		numberOfCarbonInDescriptors += numberOfSpiros;
		int expectedNumberOfCarbons = chainGroup.getAttributeValue(VALUE_ATR).length();
		if (numberOfCarbonInDescriptors != expectedNumberOfCarbons) {
			if (numberOfCarbonInDescriptors > expectedNumberOfCarbons && spiroDescriptors.length > 2 && !hasSuperscripts) {
				//Can we infer where superscripts should have been?
				//Assume that all 2+ digit spiro descriptors are actually single digit bridges followed by a superscripted locant
				//Note that the locant can't be zero, so 10,20,30 etc. can't be split to a superscripted zero
				int carbonsWithSuperscriptsInferred = 0;
				for (int i = 0; i < spiroDescriptors.length; i++) {
					int[] spiroDescriptor = spiroDescriptors[i];
					int carbons = spiroDescriptor[0];
					if (i > 1 && carbons >= 11) {
						String str = String.valueOf(carbons);
						carbons = Integer.parseInt(str.substring(0,1));
						int locant = Integer.parseInt(str.substring(1));
						if (locant > 0) {
							spiroDescriptor[0] = carbons;
							spiroDescriptor[1] = locant;
						}
					}
					carbonsWithSuperscriptsInferred += carbons;
				}
				carbonsWithSuperscriptsInferred += numberOfSpiros;
				if (carbonsWithSuperscriptsInferred == expectedNumberOfCarbons) {
					numberOfCarbonInDescriptors = carbonsWithSuperscriptsInferred;
				}
			}
			if (numberOfCarbonInDescriptors != expectedNumberOfCarbons) {
				throw new ComponentGenerationException("Disagreement between number of atoms in spiro descriptor: " + numberOfCarbonInDescriptors +" and number of atoms in chain: " + expectedNumberOfCarbons);
			}
		}

		int numOfOpenedBrackets = 1;
		int curIndex = 2;
		String smiles = "C0" + StringTools.multiplyString("C", spiroDescriptors[0][0]) + "10(";

		// for those molecules where no superscripts are present, compare prefix number with curIndex.
		for (int i = 1; i < spiroDescriptors.length; i++) {
			if (spiroDescriptors[i][1] >= 0) {
				int ringOpeningPos = findIndexOfRingOpenings(smiles, spiroDescriptors[i][1]);
				String ringOpeningLabel = String.valueOf(smiles.charAt(ringOpeningPos));
				ringOpeningPos++;
				if (ringOpeningLabel.equals("%")) {
					//check the bound before reading the character so that charAt can never run past the end of the string
					while (ringOpeningPos < smiles.length() && smiles.charAt(ringOpeningPos) >= '0' && smiles.charAt(ringOpeningPos) <= '9') {
						ringOpeningLabel += smiles.charAt(ringOpeningPos);
						ringOpeningPos++;
					}
				}
				if (smiles.indexOf("C" + ringOpeningLabel, ringOpeningPos) >= 0) {
					// this ring opening has already been closed
					// i.e. this atom connects more than one ring in a spiro fusion

					// insert extra ring opening
					smiles = smiles.substring(0, ringOpeningPos) + ringClosure(curIndex) + smiles.substring(ringOpeningPos);

					// add ring in new brackets
					smiles += "(" + StringTools.multiplyString("C", spiroDescriptors[i][0]) + ringClosure(curIndex) + ")";
					curIndex++;
				} else {
					smiles += StringTools.multiplyString("C", spiroDescriptors[i][0]) + ringOpeningLabel + ")";
				}
			} else if (numOfOpenedBrackets >= numberOfSpiros) {
				smiles += StringTools.multiplyString("C", spiroDescriptors[i][0]);

				// take the number before bracket as index for smiles
				// we can open more brackets, this considered in prev if
				curIndex--;
				smiles += ringClosure(curIndex) + ")";

				// from here start to decrease index for the following
			} else {
				smiles += StringTools.multiplyString("C", spiroDescriptors[i][0]);
				smiles += "C" + ringClosure(curIndex++) + "(";
				numOfOpenedBrackets++;
			}
		}
		chainGroup.getAttribute(VALUE_ATR).setValue(smiles);
		chainGroup.getAttribute(TYPE_ATR).setValue(RING_TYPE_VAL);
		if (chainGroup.getAttribute(USABLEASJOINER_ATR) != null) {
			chainGroup.removeAttribute(chainGroup.getAttribute(USABLEASJOINER_ATR));
		}
		spiroEl.detach();
	}

	/**
	 * If the integer given is > 9 returns % followed by the integer, otherwise just returns the integer
	 * @param ringClosure
	 * @return the ring closure label to use in SMILES
	 */
	private String ringClosure(int ringClosure) {
		if (ringClosure > 9) {
			return "%" + Integer.toString(ringClosure);
		}
		else{
			return Integer.toString(ringClosure);
		}
	}

	/**
	 * Prepares spiro string for processing
	 * @param text - string with spiro e.g.
spiro[2.2]
	 * @return array with number of carbons in each group and associated index of spiro atom
	 */
	private int[][] getSpiroDescriptors(String text) {
		if (text.indexOf("-") == 5) {
			text = text.substring(7, text.length() - 1);//cut off spiro-[ and terminal ]
		}
		else{
			text = text.substring(6, text.length() - 1);//cut off spiro[ and terminal ]
		}

		String[] spiroDescriptorStrings = matchCommaOrDot.split(text);

		int[][] spiroDescriptors = new int[spiroDescriptorStrings.length][2]; // one entry per descriptor: [number of atoms, superscripted locant or -1 if absent]

		for (int i = 0; i < spiroDescriptorStrings.length; i++) {
			String[] elements = matchNonDigit.split(spiroDescriptorStrings[i]);
			if (elements.length > 1) {//a "superscripted" number is present
				spiroDescriptors[i][0] = Integer.parseInt(elements[0]);
				StringBuilder superScriptedNumber = new StringBuilder();
				for (int j = 1; j < elements.length; j++){//may be more than one non digit as there are many ways of indicating superscripts
					superScriptedNumber.append(elements[j]);
				}
				spiroDescriptors[i][1] = Integer.parseInt(superScriptedNumber.toString());
			}
			else {
				spiroDescriptors[i][0] = Integer.parseInt(spiroDescriptorStrings[i]);
				spiroDescriptors[i][1] = -1;
			}
		}
		return spiroDescriptors;
	}

	/**
	 * Finds the carbon atom with the given locant in the provided SMILES
	 * Returns the next index which is expected to correspond to the atom's ring opening/s
	 * @param smiles string to search in
	 * @param locant locant of the atom in given structure
	 * @return index of ring openings
	 * @throws ComponentGenerationException
	 */
	private Integer findIndexOfRingOpenings(String smiles, int locant) throws ComponentGenerationException {
		int count = 0;
		int pos = -1;
		for (int i = 0, len = smiles.length(); i < len; i++) {
			if (smiles.charAt(i) == 'C') {
				count++;
				if (count == locant) {
					pos = i;
					break;
				}
			}
		}
		if (pos == -1) {
			throw new ComponentGenerationException("Unable to find atom corresponding to number indicated by superscript in spiro descriptor");
		}
		return pos + 1;
	}

	/**
	 * Given an element corresponding to an alkane or other systematic chain and the preceding vonBaeyerBracket element:
	 * Generates the SMILES of the von Baeyer system and assigns this to the chain Element
	 * Checks are done on the von Baeyer multiplier and chain length
	 * The multiplier and vonBaeyerBracket are detached
	 * @param chainEl
	 * @param vonBaeyerBracketEl
	 * @throws ComponentGenerationException
	 */
	private void processVonBaeyerSystem(Element chainEl, Element vonBaeyerBracketEl) throws ComponentGenerationException {
		String vonBaeyerBracket = StringTools.removeDashIfPresent(vonBaeyerBracketEl.getValue());
		Element multiplier = OpsinTools.getPreviousSibling(vonBaeyerBracketEl);
		int numberOfRings=Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR));
		multiplier.detach();

		int alkylChainLength;
		Deque<String> elementSymbolArray = new ArrayDeque<>();
		String smiles =chainEl.getAttributeValue(VALUE_ATR);
		char[] smilesArray =smiles.toCharArray();
		for (int i = 0; i < smilesArray.length; i++) {//only able to interpret the SMILES that should be in an unmodified unbranched chain
			char currentChar =smilesArray[i];
			if (currentChar == '['){
				if ( smilesArray[i +2]==']'){
					elementSymbolArray.add("[" +String.valueOf(smilesArray[i+1]) +"]");
					i=i+2;
				}
				else{
					elementSymbolArray.add("[" + String.valueOf(smilesArray[i+1]) +String.valueOf(smilesArray[i+2]) +"]");
					i=i+3;
				}
			}
			else{
				elementSymbolArray.add(String.valueOf(currentChar));
			}
		}
		alkylChainLength=elementSymbolArray.size();

		int totalLengthOfBridges=0;
		int bridgeLabelsUsed=3;//start labelling from 3 upwards
		//3 and 4 will be the atoms on each end of one secondary bridge, 5 and 6 for the next etc.
		List<HashMap<String, Integer>> bridges = new ArrayList<>();

		Map<Integer, ArrayList<Integer>> bridgeLocations = new HashMap<>(alkylChainLength);
		if (vonBaeyerBracket.indexOf("-")==5){
			vonBaeyerBracket = vonBaeyerBracket.substring(7, vonBaeyerBracket.length()-1);//cut off cyclo-[ and terminal ]
		}
		else{
			vonBaeyerBracket = vonBaeyerBracket.substring(6, vonBaeyerBracket.length()-1);//cut off cyclo[ and terminal ]
		}
		String[] bridgeDescriptors = matchCommaOrDot.split(vonBaeyerBracket);//the bridgelengths and positions for secondary bridges
		//all bridges from past the first 3 are secondary bridges and require specification of bridge position which will be partially in the subsequent position in the array
		for (int i = 0; i < bridgeDescriptors.length; i++) {
			String bridgeDescriptor = bridgeDescriptors[i];
			HashMap<String, Integer> bridge = new HashMap<>();
			int bridgeLength =0;
			if (i > 2){//this is a secondary bridge (chain start/end locations should have been specified)
				i++;
				String coordinatesStr1;
				String coordinatesStr2 = matchNonDigit.matcher(bridgeDescriptors[i]).replaceAll("");
				String[] tempArray = matchNonDigit.split(bridgeDescriptor);

				if (tempArray.length ==1){
					//there is some ambiguity as it has not been made obvious which number/s are supposed to be the superscripted locant
					//so we assume that it is more likely that it will be referring to an atom of label >10
					//rather than a secondary bridge of length > 10
					char[] tempCharArray = bridgeDescriptor.toCharArray();
					if (tempCharArray.length ==2){
						bridgeLength= Character.getNumericValue(tempCharArray[0]);
						coordinatesStr1= Character.toString(tempCharArray[1]);
					}
					else if (tempCharArray.length ==3){
						bridgeLength= Character.getNumericValue(tempCharArray[0]);
						coordinatesStr1=Character.toString(tempCharArray[1]) +Character.toString(tempCharArray[2]);
					}
					else if (tempCharArray.length ==4){
						bridgeLength = Integer.parseInt(Character.toString(tempCharArray[0]) +Character.toString(tempCharArray[1]));
						coordinatesStr1 = Character.toString(tempCharArray[2]) +Character.toString(tempCharArray[3]);
					}
					else{
						throw new ComponentGenerationException("Unsupported Von Baeyer locant description: " + bridgeDescriptor );
					}
				}
				else{//bracket or other delimiter detected, no ambiguity!
					bridgeLength= Integer.parseInt(tempArray[0]);
					coordinatesStr1= tempArray[1];
				}

				bridge.put("Bridge Length", bridgeLength );
				int coordinates1=Integer.parseInt(coordinatesStr1);
				int coordinates2=Integer.parseInt(coordinatesStr2);
				if (coordinates1 > alkylChainLength || coordinates2 > alkylChainLength){
					throw new ComponentGenerationException("Indicated bridge position is not on chain: " +coordinates1 +"," +coordinates2);
				}
				if (coordinates2>coordinates1){//makes sure that bridges are built from highest coord to lowest
					int swap =coordinates1;
					coordinates1=coordinates2;
					coordinates2=swap;
				}
				if (bridgeLocations.get(coordinates1)==null){
					bridgeLocations.put(coordinates1, new ArrayList<>());
				}
				if (bridgeLocations.get(coordinates2)==null){
					bridgeLocations.put(coordinates2, new ArrayList<>());
				}
				bridgeLocations.get(coordinates1).add(bridgeLabelsUsed);
				bridge.put("AtomId_Larger_Label", bridgeLabelsUsed);
				bridgeLabelsUsed++;
				if (bridgeLength==0){//0 length bridge, hence want atoms with the same labels so they can join together without a bridge
					bridgeLocations.get(coordinates2).add(bridgeLabelsUsed -1);
					bridge.put("AtomId_Smaller_Label", bridgeLabelsUsed -1);
				}
				else{
					bridgeLocations.get(coordinates2).add(bridgeLabelsUsed);
					bridge.put("AtomId_Smaller_Label", bridgeLabelsUsed);
				}
				bridgeLabelsUsed++;

				bridge.put("AtomId_Larger", coordinates1);
				bridge.put("AtomId_Smaller", coordinates2);
			}
			else{
				bridgeLength= Integer.parseInt(bridgeDescriptor);
				bridge.put("Bridge Length", bridgeLength);
			}
			totalLengthOfBridges += bridgeLength;
			bridges.add(bridge);
		}
		if (totalLengthOfBridges + 2 !=alkylChainLength ){
			throw new ComponentGenerationException("Disagreement between lengths of bridges and alkyl chain length");
		}
		if (numberOfRings +1 != bridges.size()){
			throw new ComponentGenerationException("Disagreement between number of rings and number of bridges");
		}

		StringBuilder smilesSB = new StringBuilder();
		int atomCounter=1;
		int bridgeCounter=1;
		//add standard bridges
		for (HashMap<String, Integer> bridge : bridges) {
			if (bridgeCounter==1){
				smilesSB.append(elementSymbolArray.removeFirst());
				smilesSB.append("1");
				if (bridgeLocations.get(atomCounter)!=null){
					for (Integer bridgeAtomLabel : bridgeLocations.get(atomCounter)) {
						smilesSB.append(ringClosure(bridgeAtomLabel));
					}
				}
				smilesSB.append("(");
			}
			int bridgeLength =bridge.get("Bridge Length");

			for (int i = 0; i < bridgeLength; i++) {
				atomCounter++;
				smilesSB.append(elementSymbolArray.removeFirst());
				if (bridgeLocations.get(atomCounter)!=null){
					for (Integer bridgeAtomLabel : bridgeLocations.get(atomCounter)) {
						smilesSB.append(ringClosure(bridgeAtomLabel));
					}
				}
			}
			if (bridgeCounter==1){
				atomCounter++;
				smilesSB.append(elementSymbolArray.removeFirst());
				smilesSB.append("2");
				if (bridgeLocations.get(atomCounter)!=null){
					for (Integer bridgeAtomLabel : bridgeLocations.get(atomCounter)) {
						smilesSB.append(ringClosure(bridgeAtomLabel));
					}
				}
			}
			if (bridgeCounter==2){
				smilesSB.append("1)");
			}
			if (bridgeCounter==3){
				smilesSB.append("2");
			}
			bridgeCounter++;
			if (bridgeCounter >3){break;}
		}

		//create list of secondary bridges that need to be added
		//0 length bridges and the 3 main bridges are dropped
		List<HashMap<String, Integer>> secondaryBridges = new ArrayList<>();
		for (HashMap<String, Integer> bridge : bridges) {
			if(bridge.get("AtomId_Larger")!=null && bridge.get("Bridge Length")!=0){
				secondaryBridges.add(bridge);
			}
		}

		Comparator<HashMap<String, Integer>> sortBridges= new VonBaeyerSecondaryBridgeSort();
		Collections.sort(secondaryBridges, sortBridges);

		List<HashMap<String, Integer>> dependantSecondaryBridges;
		//add secondary bridges, recursively add dependent secondary bridges
		do{
			dependantSecondaryBridges = new ArrayList<>();
			for (HashMap<String, Integer> bridge : secondaryBridges) {
				int bridgeLength =bridge.get("Bridge Length");
				if (bridge.get("AtomId_Larger") > atomCounter){
					dependantSecondaryBridges.add(bridge);
					continue;
				}
				smilesSB.append(".");
				for (int i = 0; i < bridgeLength; i++) {
atomCounter++; smilesSB.append(elementSymbolArray.removeFirst()); if (i==0){ smilesSB.append(ringClosure(bridge.get("AtomId_Larger_Label"))); } if (bridgeLocations.get(atomCounter)!=null){ for (Integer bridgeAtomLabel : bridgeLocations.get(atomCounter)) { smilesSB.append(ringClosure(bridgeAtomLabel)); } } } smilesSB.append(ringClosure(bridge.get("AtomId_Smaller_Label"))); } if (dependantSecondaryBridges.size() >0 && dependantSecondaryBridges.size()==secondaryBridges.size()){ throw new ComponentGenerationException("Unable to resolve all dependant bridges!!!"); } secondaryBridges=dependantSecondaryBridges; } while(dependantSecondaryBridges.size() > 0); chainEl.getAttribute(VALUE_ATR).setValue(smilesSB.toString()); chainEl.getAttribute(TYPE_ATR).setValue(RING_TYPE_VAL); if (chainEl.getAttribute(USABLEASJOINER_ATR) !=null){ chainEl.removeAttribute(chainEl.getAttribute(USABLEASJOINER_ATR)); } vonBaeyerBracketEl.detach(); } /** * Converts a chain group into a ring. * The chain group can either be an alkane or heteroatom chain * @param chainGroup * @param cycloEl * @throws ComponentGenerationException */ private void processCyclisedChain(Element chainGroup, Element cycloEl) throws ComponentGenerationException { String smiles=chainGroup.getAttributeValue(VALUE_ATR); int chainlen =0; for (int i = smiles.length() -1 ; i >=0; i--) { if (Character.isUpperCase(smiles.charAt(i)) && smiles.charAt(i) !='H'){ chainlen++; } } if (chainlen < 3){ throw new ComponentGenerationException("Heteroatom chain too small to create a ring: " + chainlen); } smiles+="1"; if (smiles.charAt(0)=='['){ int closeBracketIndex = smiles.indexOf(']'); smiles= smiles.substring(0, closeBracketIndex +1) +"1" + smiles.substring(closeBracketIndex +1); } else{ if (Character.getType(smiles.charAt(1)) == Character.LOWERCASE_LETTER){//element is 2 letters long smiles= smiles.substring(0,2) +"1" + smiles.substring(2); } else{ smiles= smiles.substring(0,1) +"1" + smiles.substring(1); } } 
chainGroup.getAttribute(VALUE_ATR).setValue(smiles); if (chainlen==6){//6 membered rings have ortho/meta/para positions if (chainGroup.getAttribute(LABELS_ATR)!=null){ chainGroup.getAttribute(LABELS_ATR).setValue("1/2,ortho/3,meta/4,para/5/6"); } else{ chainGroup.addAttribute(new Attribute(LABELS_ATR, "1/2,ortho/3,meta/4,para/5/6")); } } chainGroup.getAttribute(TYPE_ATR).setValue(RING_TYPE_VAL); if (chainGroup.getAttribute(USABLEASJOINER_ATR) !=null){ chainGroup.removeAttribute(chainGroup.getAttribute(USABLEASJOINER_ATR)); } cycloEl.detach(); } /**Handles special cases in IUPAC nomenclature. * Benzyl etc. * @param group The group to look for irregularities in. * @throws ComponentGenerationException */ private void handleGroupIrregularities(Element group) throws ComponentGenerationException { String groupValue =group.getValue(); if (!buildState.n2sConfig.allowInterpretationOfAcidsWithoutTheWordAcid()) { if (group.getAttribute(FUNCTIONALIDS_ATR) !=null && (groupValue.endsWith("ic") || groupValue.endsWith("ous"))){ Element next = OpsinTools.getNext(group); if (next == null){ throw new ComponentGenerationException("\"acid\" not found after " +groupValue); } } } String groupType = group.getAttributeValue(TYPE_ATR); String groupSubType = group.getAttributeValue(SUBTYPE_ATR); if (OUSICATOM_SUBTYPE_VAL.equals(groupSubType)) { Element next = OpsinTools.getNext(group, false); if (next == null){ throw new ComponentGenerationException("counter anion not found after " +groupValue); } } if(groupValue.equals("thiophen") || groupValue.equals("selenophen") || groupValue.equals("tellurophen")) {//thiophenol is generally phenol with an O replaced with S not thiophene with a hydroxy Element possibleSuffix = OpsinTools.getNextSibling(group); if (!"e".equals(group.getAttributeValue(SUBSEQUENTUNSEMANTICTOKEN_ATR)) && possibleSuffix !=null && possibleSuffix.getName().equals(SUFFIX_EL)) { if (possibleSuffix.getValue().startsWith("ol")){ Element isThisALocant = 
						OpsinTools.getPreviousSibling(group);
					if (isThisALocant == null || !isThisALocant.getName().equals(LOCANT_EL) || isThisALocant.getValue().split(",").length != 1){
						throw new ComponentGenerationException(groupValue + "ol has been incorrectly interpreted as "+ groupValue+", ol instead of phenol with the oxygen replaced");
					}
				}
			}
		}
		else if(groupValue.equals("chromen")) {//chromene in IUPAC nomenclature is fully unsaturated, but sometimes is instead considered to be chromane with a front locanted double bond
			Element possibleLocant = OpsinTools.getPreviousSibling(group);
			if (possibleLocant != null && possibleLocant.getName().equals(LOCANT_EL) && (possibleLocant.getValue().equals("2") || possibleLocant.getValue().equals("3"))) {
				Element possibleSuffix = OpsinTools.getNextSibling(group);
				if (possibleSuffix == null || possibleSuffix.getName().equals(LOCANT_EL)){//if there is a suffix assume the locant refers to that rather than the double bond
					group.getAttribute(VALUE_ATR).setValue("O1CCCc2ccccc12");
					group.addAttribute(ADDBOND_ATR, "2 locant required");
					group.addAttribute(FRONTLOCANTSEXPECTED_ATR, "2,3");
				}
			}
		}
		else if (groupValue.equals("methylene") || groupValue.equals("methylen")) {//e.g.
3,4-methylenedioxyphenyl Element nextSub = OpsinTools.getNextSibling(group.getParent()); if (nextSub !=null && nextSub.getName().equals(SUBSTITUENT_EL) && OpsinTools.getNextSibling(group)==null && (OpsinTools.getPreviousSibling(group)==null || !OpsinTools.getPreviousSibling(group).getName().equals(MULTIPLIER_EL))){//not trimethylenedioxy List children = nextSub.getChildElements(); if (children.size() >=2 && children.get(0).getValue().equals("di")&& children.get(1).getValue().equals("oxy")){ group.setValue(groupValue + "dioxy"); group.getAttribute(VALUE_ATR).setValue("C(O)O"); group.getAttribute(OUTIDS_ATR).setValue("2,3"); group.getAttribute(SUBTYPE_ATR).setValue(EPOXYLIKE_SUBTYPE_VAL); if (group.getAttribute(LABELS_ATR)!=null){ group.getAttribute(LABELS_ATR).setValue(NONE_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NONE_SUBTYPE_VAL)); } nextSub.detach(); for (int i = children.size() -1 ; i >=2; i--) { children.get(i).detach(); OpsinTools.insertAfter(group, children.get(i)); } } } } else if (groupValue.equals("ethylene") || groupValue.equals("ethylen")) { Element previous = OpsinTools.getPreviousSibling(group); if (previous != null && previous.getName().equals(MULTIPLIER_EL)){ int multiplierValue = Integer.parseInt(previous.getAttributeValue(VALUE_ATR)); Element possibleRoot = OpsinTools.getNextSibling(group.getParent()); if (possibleRoot==null && OpsinTools.getParentWordRule(group).getAttributeValue(WORDRULE_ATR).equals(WordRule.glycol.toString())){//e.g. 
dodecaethylene glycol StringBuilder smiles = new StringBuilder("CC"); for (int i = 1; i < multiplierValue; i++) { smiles.append("OCC"); } group.getAttribute(OUTIDS_ATR).setValue("1," +Integer.toString(3*(multiplierValue-1) +2)); group.getAttribute(VALUE_ATR).setValue(smiles.toString()); previous.detach(); if (group.getAttribute(LABELS_ATR)!=null){//use numeric numbering group.getAttribute(LABELS_ATR).setValue(NUMERIC_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); } } else if (possibleRoot!=null && possibleRoot.getName().equals(ROOT_EL)){ List children = possibleRoot.getChildElements(); if (children.size()==2){ Element amineMultiplier =children.get(0); Element amine =children.get(1); if (amineMultiplier.getName().equals(MULTIPLIER_EL) && (amine.getValue().equals("amine") || amine.getValue().equals("amin"))){//e.g. Triethylenetetramine if (Integer.parseInt(amineMultiplier.getAttributeValue(VALUE_ATR))!=multiplierValue +1){ throw new ComponentGenerationException("Invalid polyethylene amine!"); } StringBuilder smiles = new StringBuilder(); for (int i = 0; i < multiplierValue; i++) { smiles.append("NCC"); } smiles.append("N"); group.removeAttribute(group.getAttribute(OUTIDS_ATR)); group.getAttribute(VALUE_ATR).setValue(smiles.toString()); previous.detach(); possibleRoot.detach(); group.getParent().setName(ROOT_EL); if (group.getAttribute(LABELS_ATR)!=null){//use numeric numbering group.getAttribute(LABELS_ATR).setValue(NUMERIC_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); } } } } } else{ Element nextSub = OpsinTools.getNextSibling(group.getParent()); if (nextSub !=null && nextSub.getName().equals(SUBSTITUENT_EL) && OpsinTools.getNextSibling(group)==null){ List children = nextSub.getChildElements(); if (children.size() >=2 && children.get(0).getValue().equals("di")&& children.get(1).getValue().equals("oxy")){ group.setValue(groupValue + "dioxy"); 
group.getAttribute(VALUE_ATR).setValue("C(O)CO"); group.getAttribute(OUTIDS_ATR).setValue("2,4"); group.getAttribute(SUBTYPE_ATR).setValue(EPOXYLIKE_SUBTYPE_VAL); if (group.getAttribute(LABELS_ATR)!=null){ group.getAttribute(LABELS_ATR).setValue(NONE_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NONE_SUBTYPE_VAL)); } nextSub.detach(); for (int i = children.size() -1 ; i >=2; i--) { children.get(i).detach(); OpsinTools.insertAfter(group, children.get(i)); } } } } } else if (groupValue.equals("propylene") || groupValue.equals("propylen")) { Element previous = OpsinTools.getPreviousSibling(group); if (previous!=null && previous.getName().equals(MULTIPLIER_EL)){ int multiplierValue = Integer.parseInt(previous.getAttributeValue(VALUE_ATR)); Element possibleRoot = OpsinTools.getNextSibling(group.getParent()); if (possibleRoot==null && OpsinTools.getParentWordRule(group).getAttributeValue(WORDRULE_ATR).equals(WordRule.glycol.toString())){//e.g. dodecaethylene glycol StringBuilder smiles =new StringBuilder("CCC"); for (int i = 1; i < multiplierValue; i++) { smiles.append("OC(C)C"); } group.getAttribute(OUTIDS_ATR).setValue("2," +Integer.toString(4*(multiplierValue-1) +3)); group.getAttribute(VALUE_ATR).setValue(smiles.toString()); if (group.getAttribute(LABELS_ATR)!=null){ group.getAttribute(LABELS_ATR).setValue(NONE_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NONE_LABELS_VAL)); } previous.detach(); } } } //acridone (not codified), anthrone, phenanthrone and xanthone have the one at position 9 by default else if (groupValue.equals("anthr")|| groupValue.equals("anthran") || groupValue.equals("phenanthr") || groupValue.equals("acrid") || groupValue.equals("xanth") || groupValue.equals("thioxanth") || groupValue.equals("selenoxanth")|| groupValue.equals("telluroxanth")|| groupValue.equals("xanthen")) { Element possibleLocant = OpsinTools.getPreviousSibling(group); if (possibleLocant==null || 
					!possibleLocant.getName().equals(LOCANT_EL)){//only need to give one a locant of 9 if no locant currently present
				Element possibleSuffix = OpsinTools.getNextSibling(group);
				if (possibleSuffix!=null && "one".equals(possibleSuffix.getAttributeValue(VALUE_ATR))){
					//Rule C-315.2
					Element newLocant =new TokenEl(LOCANT_EL, "9");
					OpsinTools.insertBefore(possibleSuffix, newLocant);
					Element newAddedHydrogen = new TokenEl(ADDEDHYDROGEN_EL);
					newAddedHydrogen.addAttribute(new Attribute(LOCANT_ATR, "10"));
					OpsinTools.insertBefore(newLocant, newAddedHydrogen);
				}
				else if (possibleSuffix!=null && possibleSuffix.getName().equals(SUFFIX_EL) &&
						(groupValue.equals("xanth") || groupValue.equals("thioxanth") || groupValue.equals("selenoxanth")|| groupValue.equals("telluroxanth"))){
					//disambiguate between xanthate/xanthic acid and xanthene
					String suffixVal = possibleSuffix.getAttributeValue(VALUE_ATR);
					if (suffixVal.equals("ic") || suffixVal.equals("ate")){
						throw new ComponentGenerationException(groupValue + possibleSuffix.getValue() +" is not a derivative of xanthene");
					}
				}
			}
		}
		else if (groupValue.equals("phospho")){//is this the organic meaning (P(=O)=O) or biochemical meaning (P(=O)(O)O)
			Element wordRule = OpsinTools.getParentWordRule(group);
			for (Element otherGroup : OpsinTools.getDescendantElementsWithTagName(wordRule, GROUP_EL)) {
				String type = otherGroup.getAttributeValue(TYPE_ATR);
				String subType = otherGroup.getAttributeValue(SUBTYPE_ATR);
				if (OpsinTools.isBiochemical(type, subType) ||
						(YLFORACYL_SUBTYPE_VAL.equals(subType) && ("glycol".equals(otherGroup.getValue()) || "diglycol".equals(otherGroup.getValue())) ) ||
						(YLFORYL_SUBTYPE_VAL.equals(subType) && "glycer".equals(otherGroup.getValue())) ) {
					group.getAttribute(VALUE_ATR).setValue("-P(=O)(O)O");
					group.addAttribute(new Attribute(USABLEASJOINER_ATR, "yes"));
					break;
				}
			}
		}
		else if (groupValue.equals("hydrogen")){
			Element hydrogenParentEl = group.getParent();
			Element nextSubOrRoot = OpsinTools.getNextSibling(hydrogenParentEl);
			if
(nextSubOrRoot!=null){ Element possibleSuitableAteGroup = nextSubOrRoot.getChild(0); if (!possibleSuitableAteGroup.getName().equals(GROUP_EL) || !NONCARBOXYLICACID_TYPE_VAL.equals(possibleSuitableAteGroup.getAttributeValue(TYPE_ATR))){ throw new ComponentGenerationException("Hydrogen is not meant as a substituent in this context!"); } Element possibleMultiplier = OpsinTools.getPreviousSibling(group); String multiplier = "1"; if (possibleMultiplier!=null && possibleMultiplier.getName().equals(MULTIPLIER_EL)){ multiplier = possibleMultiplier.getAttributeValue(VALUE_ATR); possibleMultiplier.detach(); } possibleSuitableAteGroup.addAttribute(new Attribute(NUMBEROFFUNCTIONALATOMSTOREMOVE_ATR, multiplier)); group.detach(); List childrenToMove = hydrogenParentEl.getChildElements(); for (int i = childrenToMove.size() -1 ; i >=0; i--) { childrenToMove.get(i).detach(); nextSubOrRoot.insertChild(childrenToMove.get(i), 0); } hydrogenParentEl.detach(); } } else if (groupValue.equals("acryl")){ if (SIMPLESUBSTITUENT_SUBTYPE_VAL.equals(groupSubType)){ Element nextEl = OpsinTools.getNext(group); if (nextEl!=null && nextEl.getValue().equals("amid")){ throw new ComponentGenerationException("amide in acrylamide is not [NH2-]"); } } } else if (groupValue.equals("azo") || groupValue.equals("azoxy") || groupValue.equals("nno-azoxy") || groupValue.equals("non-azoxy") || groupValue.equals("onn-azoxy") || groupValue.equals("diazoamino") || groupValue.equals("hydrazo") ){ Element enclosingSub = group.getParent(); Element next = OpsinTools.getNextSiblingIgnoringCertainElements(enclosingSub, new String[]{HYPHEN_EL}); if (next==null && OpsinTools.getPreviousSibling(enclosingSub) == null){//e.g. 
[(E)-NNO-azoxy]benzene next = OpsinTools.getNextSiblingIgnoringCertainElements(enclosingSub.getParent(), new String[]{HYPHEN_EL}); } if (next!=null && next.getName().equals(ROOT_EL)){ if (!(next.getChild(0).getName().equals(MULTIPLIER_EL))){ List suffixes = next.getChildElements(SUFFIX_EL); if (suffixes.isEmpty()){//only case without locants is handled so far. suffixes only apply to one of the fragments rather than both!!! Element newMultiplier = new TokenEl(MULTIPLIER_EL); newMultiplier.addAttribute(new Attribute(VALUE_ATR, "2")); next.insertChild(newMultiplier, 0); Element interSubstituentHyphen = OpsinTools.getPrevious(group); if (interSubstituentHyphen!=null && !interSubstituentHyphen.getName().equals(HYPHEN_EL)){//prevent implicit bracketting OpsinTools.insertAfter(interSubstituentHyphen, new TokenEl(HYPHEN_EL)); } } } } } else if (groupValue.equals("coenzyme a") || groupValue.equals("coa")){ Element enclosingSubOrRoot = group.getParent(); Element previous = OpsinTools.getPreviousSibling(enclosingSubOrRoot); if (previous != null){ List groups = OpsinTools.getDescendantElementsWithTagName(previous, GROUP_EL); if (groups.size() > 0){ Element possibleAcid = groups.get(groups.size() - 1); if (ACIDSTEM_TYPE_VAL.equals(possibleAcid.getAttributeValue(TYPE_ATR))){ if (possibleAcid.getAttribute(SUFFIXAPPLIESTO_ATR) != null && possibleAcid.getAttributeValue(SUFFIXAPPLIESTO_ATR).split(",").length > 1){//multi acid. 
yl should be one oyl and the rest carboxylic acids Element suffix = OpsinTools.getNextSibling(possibleAcid, SUFFIX_EL); if (suffix.getAttribute(ADDITIONALVALUE_ATR) == null){ suffix.addAttribute(new Attribute(ADDITIONALVALUE_ATR, "ic")); } } String subType = possibleAcid.getAttributeValue(SUBTYPE_ATR); if (subType.equals(YLFORYL_SUBTYPE_VAL) || subType.equals(YLFORNOTHING_SUBTYPE_VAL)){ possibleAcid.getAttribute(SUBTYPE_ATR).setValue(YLFORACYL_SUBTYPE_VAL);//yl always means an acyl when next to coenzyme A } } } } //locanted substitution onto Coenzyme A is rarely intended, so put prior content into a bracket to disfavour it Element enclosingBracketOrWord = enclosingSubOrRoot.getParent(); int indexOfCoa = enclosingBracketOrWord.indexOf(enclosingSubOrRoot); if (indexOfCoa > 0) { Element newBracket = new GroupingEl(BRACKET_EL); List precedingElements = enclosingBracketOrWord.getChildElements(); for (int i = 0; i < indexOfCoa; i++) { Element precedingElement = precedingElements.get(i); precedingElement.detach(); newBracket.addChild(precedingElement); } OpsinTools.insertBefore(enclosingSubOrRoot, newBracket); } } else if (groupValue.equals("sphinganine") || groupValue.equals("icosasphinganine") || groupValue.equals("eicosasphinganine") || groupValue.equals("phytosphingosine") || groupValue.equals("sphingosine") || groupValue.equals("sphinganin") || groupValue.equals("icosasphinganin") || groupValue.equals("eicosasphinganin") || groupValue.equals("phytosphingosin") || groupValue.equals("sphingosin")){ Element enclosingSubOrRoot = group.getParent(); Element previous = OpsinTools.getPreviousSibling(enclosingSubOrRoot); if (previous!=null){ List groups = OpsinTools.getDescendantElementsWithTagName(previous, GROUP_EL); if (groups.size()>0){ Element possibleAcid = groups.get(groups.size()-1); if (ALKANESTEM_SUBTYPE_VAL.equals(possibleAcid.getAttributeValue(SUBTYPE_ATR))){ List inlineSuffixes = OpsinTools.getChildElementsWithTagNameAndAttribute(possibleAcid.getParent(), 
SUFFIX_EL, TYPE_ATR, INLINE_TYPE_VAL); if (inlineSuffixes.size()==1 && inlineSuffixes.get(0).getAttributeValue(VALUE_ATR).equals("yl")){ inlineSuffixes.get(0).getAttribute(VALUE_ATR).setValue("oyl");//yl on a systematic acid next to a fatty acid means acyl //c.f. Nomenclature of Lipids 1976, Appendix A, note a } } } } } else if (groupValue.equals("sel")){ //check that it is not "selenium" if (HETEROSTEM_SUBTYPE_VAL.equals(groupSubType) && group.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR) ==null){ Element unsaturator = OpsinTools.getNextSibling(group); if (unsaturator !=null && unsaturator.getName().equals(UNSATURATOR_EL) && unsaturator.getValue().equals("en") && group.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR) ==null){ Element ium = OpsinTools.getNextSibling(unsaturator); if (ium !=null && ium.getName().equals(SUFFIX_EL) && ium.getValue().equals("ium")){ throw new ComponentGenerationException("selenium does not indicate a chain of selenium atoms with a double bond and a positive charge"); } } } } else if ((groupValue.equals("keto") || groupValue.equals("aldehydo")) && SIMPLESUBSTITUENT_SUBTYPE_VAL.equals(groupSubType)){ //check for case where this is specifying the open chain form of a ketose/aldose Element previousEl = OpsinTools.getPreviousSibling(group); if (previousEl ==null || !previousEl.getName().equals(LOCANT_EL) || groupValue.equals("aldehydo")){ Element parentSubstituent = group.getParent(); Element nextSubOrRoot = OpsinTools.getNextSibling(parentSubstituent); Element parentOfCarbohydate = nextSubOrRoot; Element carbohydrate = null; while (parentOfCarbohydate != null){ Element possibleCarbohydrate = parentOfCarbohydate.getFirstChildElement(GROUP_EL); if (possibleCarbohydrate !=null && possibleCarbohydrate.getAttributeValue(TYPE_ATR).equals(CARBOHYDRATE_TYPE_VAL)){ carbohydrate = possibleCarbohydrate; break; } parentOfCarbohydate = OpsinTools.getNextSibling(parentOfCarbohydate); } if (carbohydrate != null) { if 
(parentOfCarbohydate.getChildElements(CARBOHYDRATERINGSIZE_EL).size() > 0){ throw new ComponentGenerationException("Carbohydrate has a specified ring size but " + groupValue + " indicates the open chain form!"); } for (Element suffix : parentOfCarbohydate.getChildElements(SUFFIX_EL)) { if ("yl".equals(suffix.getAttributeValue(VALUE_ATR))) { throw new ComponentGenerationException("Carbohydrate appears to be a glycosyl, but " + groupValue + " indicates the open chain form!"); } } Element alphaOrBetaLocantEl = OpsinTools.getPreviousSiblingIgnoringCertainElements(carbohydrate, new String[]{STEREOCHEMISTRY_EL}); if (alphaOrBetaLocantEl != null && alphaOrBetaLocantEl.getName().equals(LOCANT_EL) ){ String value = alphaOrBetaLocantEl.getValue(); if (value.equals("alpha") || value.equals("beta") || value.equals("alpha,beta") || value.equals("beta,alpha")){ throw new ComponentGenerationException("Carbohydrate has alpha/beta anomeric form but " + groupValue + " indicates the open chain form!"); } } group.detach(); List childrenToMove = parentSubstituent.getChildElements(); for (int i = childrenToMove.size() -1 ; i >=0; i--) { Element el = childrenToMove.get(i); if (!el.getName().equals(HYPHEN_EL)){ el.detach(); nextSubOrRoot.insertChild(el, 0); } } parentSubstituent.detach(); if (RING_SUBTYPE_VAL.equals(carbohydrate.getAttributeValue(SUBTYPE_ATR))) { String carbohydrateAdditionValue = carbohydrate.getAttributeValue(ADDITIONALVALUE_ATR); //OPSIN assumes a few trivial names are more likely to describe the cyclic form. 
additionalValue contains the SMILES for the acyclic form if (carbohydrateAdditionValue == null){ throw new ComponentGenerationException(carbohydrate.getValue() + " can only describe the cyclic form but " + groupValue + " indicates the open chain form!"); } carbohydrate.getAttribute(VALUE_ATR).setValue(carbohydrateAdditionValue); } } else if (groupValue.equals("aldehydo")){ throw new ComponentGenerationException("aldehydo is only a valid prefix when it precedes a carbohydrate!"); } } } else if (groupValue.equals("bor") || groupValue.equals("antimon") || groupValue.equals("arsen") || groupValue.equals("phosphor") || groupValue.equals("phosphate") || groupValue.equals("phosphat") || groupValue.equals("silicicacid") || groupValue.equals("silicic acid") || groupValue.equals("silicate") || groupValue.equals("silicat")){//fluoroboric acid/fluoroborate are trivial rather than systematic; tetra(fooyl)borate is inorganic Element suffix = null; Boolean isAcid = null; if (groupValue.endsWith("acid")){ if (OpsinTools.getNext(group) == null){ isAcid = true; } } else if (groupValue.endsWith("ate") || groupValue.endsWith("at")){ if (OpsinTools.getNext(group) == null){ isAcid = false; } } else{ suffix = OpsinTools.getNextSibling(group); if (suffix != null && suffix.getName().equals(SUFFIX_EL) && suffix.getAttribute(INFIX_ATR) == null && OpsinTools.getNext(suffix) == null){ String suffixValue = suffix.getAttributeValue(VALUE_ATR); if (suffixValue.equals("ic")){ isAcid = true; } else if (suffixValue.equals("ate")){ isAcid = false; } } } if (isAcid != null){//check for inorganic interpretation Element substituent = OpsinTools.getPreviousSibling(group.getParent()); if (substituent !=null && (substituent.getName().equals(SUBSTITUENT_EL) || substituent.getName().equals(BRACKET_EL))){ List children = substituent.getChildElements(); Element firstChild = children.get(0); boolean matched = false; if (children.size() ==1 && firstChild.getName().equals(GROUP_EL) && 
(firstChild.getValue().equals("fluoro") || firstChild.getValue().equals("fluor"))){ if (groupValue.equals("bor")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "F[B-](F)(F)F.[H+]" : "F[B-](F)(F)F"); matched = true; } else if (groupValue.equals("antimon")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "F[Sb-](F)(F)(F)(F)F.[H+]" : "F[Sb-](F)(F)(F)(F)F"); matched = true; } else if (groupValue.startsWith("silicic") || groupValue.startsWith("silicat")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "F[Si|6-2](F)(F)(F)(F)F.[H+].[H+]" : "F[Si|6-2](F)(F)(F)(F)F"); matched = true; } if (matched) { substituent.detach(); } } else if (firstChild.getName().equals(MULTIPLIER_EL)) { String multiplierVal = firstChild.getAttributeValue(VALUE_ATR); if (groupValue.equals("bor")){ if (multiplierVal.equals("4") || (multiplierVal.equals("3") && OpsinTools.getPreviousSibling(substituent) != null)) { //tri case allows organotrifluoroborates group.getAttribute(VALUE_ATR).setValue(isAcid ? "[B-].[H+]" :"[B-]"); matched = true; } } else if (groupValue.equals("antimon") && multiplierVal.equals("6")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "[Sb-].[H+]" :"[Sb-]"); matched = true; } else if (groupValue.equals("arsen") && multiplierVal.equals("6")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "[As-].[H+]" :"[As-]"); matched = true; } else if (groupValue.startsWith("phosph") && multiplierVal.equals("6")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? "[P-].[H+]" :"[P-]"); matched = true; } else if (groupValue.startsWith("silic") && multiplierVal.equals("6")) { group.getAttribute(VALUE_ATR).setValue(isAcid ? 
"[Si|6-2].[H+].[H+]" :"[Si|6-2]"); matched = true; } } if (matched) { group.getAttribute(TYPE_ATR).setValue(SIMPLEGROUP_TYPE_VAL); group.getAttribute(SUBTYPE_ATR).setValue(SIMPLEGROUP_SUBTYPE_VAL); Attribute usableAsJoiner = group.getAttribute(USABLEASJOINER_ATR); if (usableAsJoiner != null){ group.removeAttribute(usableAsJoiner); } Attribute acceptsAdditiveBonds = group.getAttribute(ACCEPTSADDITIVEBONDS_ATR); if (acceptsAdditiveBonds != null){ group.removeAttribute(acceptsAdditiveBonds); } Attribute functionalIds = group.getAttribute(FUNCTIONALIDS_ATR); if (functionalIds != null){ group.removeAttribute(functionalIds); } if (suffix != null){ suffix.detach(); } } } } } else if (groupValue.equals("pyruv")) { //cf. methylselenopyruvate Element precedingSubstituent = OpsinTools.getPreviousSibling(group.getParent()); if (precedingSubstituent != null && OpsinTools.getPreviousSibling(precedingSubstituent) != null) { Element subGroup = precedingSubstituent.getFirstChildElement(GROUP_EL); if (subGroup != null && FunctionalReplacement.matchChalcogenReplacement.matcher(subGroup.getValue()).matches() && OpsinTools.getNextSibling(subGroup) == null) { OpsinTools.insertAfter(subGroup, new TokenEl(HYPHEN_EL));//discourage the use of the chalcogen term as functional replacement } } } else if (ENDINIC_SUBTYPE_VAL.equals(groupSubType) && AMINOACID_TYPE_VAL.equals(groupType)) { //aspartyl and glutamyl typically mean alpha-aspartyl/alpha-glutamyl String[] suffixAppliesTo = group.getAttributeValue(SUFFIXAPPLIESTO_ATR).split(","); if (suffixAppliesTo.length == 2) { Element yl = OpsinTools.getNextSibling(group); if (yl.getAttributeValue(VALUE_ATR).equals("yl")) { if (yl.getAttribute(ADDITIONALVALUE_ATR) == null){ yl.addAttribute(new Attribute(ADDITIONALVALUE_ATR, "ic")); } } } } else if (SALTCOMPONENT_SUBTYPE_VAL.equals(groupSubType)) { Element parse = null; Element tempParent = group.getParent(); while (tempParent != null) { parse = tempParent; tempParent = tempParent.getParent(); } if 
					(parse.getChildCount() <= 1) {
				throw new ComponentGenerationException("Group expected to be part of a salt but only one component found. Could be a class of compound: " + groupValue);
			}
			if (groupValue.length() > 0) {
				//e.g. 2HCl
				char firstChar = groupValue.charAt(0);
				if (firstChar >= '1' && firstChar <= '9') {
					Element shouldntBeAmultiplier= OpsinTools.getPreviousSibling(group);
					if (shouldntBeAmultiplier != null && shouldntBeAmultiplier.getName().equals(MULTIPLIER_EL)) {
						throw new ComponentGenerationException("Unexpected multiplier found before: " + groupValue);
					}
					Element multiplier = new TokenEl(MULTIPLIER_EL, String.valueOf(firstChar));
					multiplier.addAttribute(TYPE_ATR, BASIC_TYPE_VAL);
					multiplier.addAttribute(VALUE_ATR, String.valueOf(firstChar));
					OpsinTools.insertBefore(group, multiplier);
					group.setValue(groupValue.substring(1));
				}
			}
		}
		else if (ELEMENTARYATOM_TYPE_VAL.equals(groupType)) {
			//simple inorganic molecular diatomics should be implicitly bonded e.g. dioxygen
			Element multiplier = OpsinTools.getPreviousSibling(group);
			if (multiplier != null && "2".equals(multiplier.getAttributeValue(VALUE_ATR))) {
				//check that the name is just formed of two tokens e.g.
dinitrogen Element temp = group.getParent(); Element parent = temp; while (temp != null) { parent = temp; temp = parent.getParent(); } if (OpsinTools.countNumberOfElementsAndNumberOfChildLessElements(parent)[1] == 2) { String newVal; switch (group.getAttributeValue(VALUE_ATR)) { case "[H]": newVal = "[H][H]"; break; case "[N]": newVal = "N#N"; break; case "[O]": newVal = "O=O"; break; case "[F]": newVal = "FF"; break; case "[Cl]": newVal = "ClCl"; break; case "[Br]": newVal = "BrBr"; break; case "[I]": newVal = "II"; break; default: newVal = null; break; } if (newVal != null) { Element newGroup = new TokenEl(GROUP_EL, groupValue); newGroup.addAttribute(TYPE_ATR, SIMPLEGROUP_TYPE_VAL); newGroup.addAttribute(SUBTYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL); newGroup.addAttribute(VALUE_ATR, newVal); OpsinTools.insertAfter(group, newGroup); group.detach(); multiplier.detach(); } } } } if (AMINOACID_TYPE_VAL.equals(groupType)) { Element previous = OpsinTools.getPreviousSibling(group.getParent()); if (previous != null) { List groups = OpsinTools.getDescendantElementsWithTagName(previous, GROUP_EL); if (groups.size() > 0) { Element possibleAcid = groups.get(groups.size() - 1); if (ACIDSTEM_TYPE_VAL.equals(possibleAcid.getAttributeValue(TYPE_ATR))) { if (possibleAcid.getAttribute(SUFFIXAPPLIESTO_ATR) != null && possibleAcid.getAttributeValue(SUFFIXAPPLIESTO_ATR).split(",").length > 1) {//multi acid. 
yl should be one oyl and the rest carboxylic acids Element suffix = OpsinTools.getNextSibling(possibleAcid, SUFFIX_EL); if (suffix.getAttribute(ADDITIONALVALUE_ATR) == null) { suffix.addAttribute(new Attribute(ADDITIONALVALUE_ATR, "ic")); } } } } } } } private void moveDetachableHetAtomRepl(Element bracket) throws ComponentGenerationException { int indexOfLastHeteroatom = -1; for (int i = bracket.getChildCount() - 1; i >= 0; i--) { Element child = bracket.getChild(i); if (child.getName().equals(HETEROATOM_EL)) { indexOfLastHeteroatom = i; break; } } if (indexOfLastHeteroatom >=0) { Element rightMostGroup = null; Element nextSubOrRootOrBracket = OpsinTools.getNextSibling(bracket); while (nextSubOrRootOrBracket != null) { Element groupToConsider = nextSubOrRootOrBracket.getFirstChildElement(GROUP_EL); if (groupToConsider != null) { rightMostGroup = groupToConsider; } nextSubOrRootOrBracket = OpsinTools.getNextSibling(nextSubOrRootOrBracket); } if (rightMostGroup == null) { throw new ComponentGenerationException("Unable to find group for: " + bracket.getChild(0).getValue() +" to apply to!"); } Element rightMostGroupParent = rightMostGroup.getParent(); for (int i = indexOfLastHeteroatom; i >= 0; i--) { Element locantededHeteroAtomRepl = bracket.getChild(i); locantededHeteroAtomRepl.detach(); rightMostGroupParent.insertChild(locantededHeteroAtomRepl, 0); } } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ComponentProcessor.java000066400000000000000000007344151451751637500303770ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.Comparator; import java.util.Deque; import java.util.HashMap; import java.util.HashSet; import java.util.Iterator; import java.util.LinkedHashMap; import java.util.List; import java.util.Locale; import java.util.Map; import java.util.Set; import java.util.regex.Matcher; import 
java.util.regex.Pattern; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*; /**Performs structure-aware destructive procedural parsing on parser results. * * @author dl387 * */ class ComponentProcessor { private static final Pattern matchAddedHydrogenBracket =Pattern.compile("[\\[\\(\\{]([^\\[\\(\\{]*)H[\\]\\)\\}]"); private static final Pattern matchElementSymbolOrAminoAcidLocant = Pattern.compile("[A-Z][a-z]?'*(\\d+[a-z]?'*)?"); private static final Pattern matchChalcogenReplacement= Pattern.compile("thio|seleno|telluro"); private static final Pattern matchGroupsThatAreAlsoInlineSuffixes = Pattern.compile("carbon|oxy|sulfen|sulfin|sulfon|selenen|selenin|selenon|telluren|tellurin|telluron"); private static final String[] traditionalAlkanePositionNames =new String[]{"alpha", "beta", "gamma", "delta", "epsilon", "zeta"}; private final FunctionalReplacement functionalReplacement; private final SuffixApplier suffixApplier; private final BuildState state; //rings that look like HW rings but have other meanings. For the HW like inorganics the true meaning is given private static final Map<String, String[]> specialHWRings = new HashMap<>(); static{ //The first entry of the array is a special instruction e.g. blocked or saturated. 
The correct order of the heteroatoms follows //terminal e is ignored from all of the keys as it is optional in the input name specialHWRings.put("oxin", new String[]{"blocked"}); specialHWRings.put("azin", new String[]{"blocked"}); specialHWRings.put("selenin", new String[]{"not_icacid", "Se","C","C","C","C","C"}); specialHWRings.put("tellurin", new String[]{"not_icacid", "Te","C","C","C","C","C"}); specialHWRings.put("thiol", new String[]{"not_nothingOrOlate", "S","C","C","C","C"}); specialHWRings.put("selenol", new String[]{"not_nothingOrOlate", "Se","C","C","C","C"}); specialHWRings.put("tellurol", new String[]{"not_nothingOrOlate", "Te","C","C","C","C"}); specialHWRings.put("oxazol", new String[]{"","O","C","N","C","C"}); specialHWRings.put("thiazol", new String[]{"","S","C","N","C","C"}); specialHWRings.put("selenazol", new String[]{"","Se","C","N","C","C"}); specialHWRings.put("tellurazol", new String[]{"","Te","C","N","C","C"}); specialHWRings.put("oxazolidin", new String[]{"","O","C","N","C","C"}); specialHWRings.put("thiazolidin", new String[]{"","S","C","N","C","C"}); specialHWRings.put("selenazolidin", new String[]{"","Se","C","N","C","C"}); specialHWRings.put("tellurazolidin", new String[]{"","Te","C","N","C","C"}); specialHWRings.put("oxazolid", new String[]{"","O","C","N","C","C"}); specialHWRings.put("thiazolid", new String[]{"","S","C","N","C","C"}); specialHWRings.put("selenazolid", new String[]{"","Se","C","N","C","C"}); specialHWRings.put("tellurazolid", new String[]{"","Te","C","N","C","C"}); specialHWRings.put("oxazolin", new String[]{"","O","C","N","C","C"}); specialHWRings.put("thiazolin", new String[]{"","S","C","N","C","C"}); specialHWRings.put("selenazolin", new String[]{"","Se","C","N","C","C"}); specialHWRings.put("tellurazolin", new String[]{"","Te","C","N","C","C"}); specialHWRings.put("oxoxolan", new String[]{"","O","C","O","C","C"}); specialHWRings.put("oxoxol", new String[]{"","O","C","O","C","C"}); specialHWRings.put("oxoxan", new 
String[]{"","O","C","C","O","C","C"}); specialHWRings.put("oxoxin", new String[]{"","O","C","C","O","C","C"}); specialHWRings.put("oxoxoxan", new String[]{"","O","C","O","C","O","C"}); specialHWRings.put("boroxin", new String[]{"saturated","O","B","O","B","O","B"}); specialHWRings.put("borazin", new String[]{"saturated","N","B","N","B","N","B"}); specialHWRings.put("borthiin", new String[]{"saturated","S","B","S","B","S","B"}); } ComponentProcessor(BuildState state, SuffixApplier suffixApplier) { this.state = state; this.suffixApplier = suffixApplier; this.functionalReplacement = new FunctionalReplacement(state); } /** * Processes a parse result that has already gone through the ComponentGenerator. * At this stage one can expect all substituents/roots to have at least 1 group. * Multiple groups are present in, for example, fusion nomenclature. By the end of this function there will be exactly 1 group * associated with each substituent/root. Multiplicative nomenclature can result in there being multiple roots * @param parse * @throws ComponentGenerationException * @throws StructureBuildingException */ void processParse(Element parse) throws ComponentGenerationException, StructureBuildingException { List words =OpsinTools.getDescendantElementsWithTagName(parse, WORD_EL); int wordCount =words.size(); for (int i = wordCount -1; i>=0; i--) { Element word = words.get(i); String wordRule = OpsinTools.getParentWordRule(word).getAttributeValue(WORDRULE_EL); state.currentWordRule = WordRule.valueOf(wordRule); if (word.getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ continue;//functionalTerms are handled on a case by case basis by wordRules } List roots = OpsinTools.getDescendantElementsWithTagName(word, ROOT_EL); if (roots.size() > 1){ throw new ComponentGenerationException("Multiple roots, but only 0 or 1 were expected. 
Found: " + roots.size()); } List substituents = OpsinTools.getDescendantElementsWithTagName(word, SUBSTITUENT_EL); List substituentsAndRoot = OpsinTools.combineElementLists(substituents, roots); List brackets = OpsinTools.getDescendantElementsWithTagName(word, BRACKET_EL); List substituentsAndRootAndBrackets = OpsinTools.combineElementLists(substituentsAndRoot, brackets); List groups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL); for (Element group : groups) { Fragment thisFrag = resolveGroup(state, group); processChargeAndOxidationNumberSpecification(group, thisFrag);//e.g. mercury(2+) or mercury(II) } for (Element subOrRoot : substituentsAndRoot) { applyDLPrefixes(subOrRoot); processCarbohydrates(subOrRoot);//e.g. glucopyranose (needs to be done before determineLocantMeaning to cope with alpha,beta for undefined anomer stereochemistry) } Element finalSubOrRootInWord = word.getChild(word.getChildCount() - 1); while (!finalSubOrRootInWord.getName().equals(ROOT_EL) && !finalSubOrRootInWord.getName().equals(SUBSTITUENT_EL)){ List children = OpsinTools.getChildElementsWithTagNames(finalSubOrRootInWord, new String[]{ROOT_EL, SUBSTITUENT_EL, BRACKET_EL}); if (children.isEmpty()){ throw new ComponentGenerationException("Unable to find finalSubOrRootInWord"); } finalSubOrRootInWord = children.get(children.size() - 1); } for (Element subOrRootOrBracket : substituentsAndRootAndBrackets) { determineLocantMeaning(subOrRootOrBracket, finalSubOrRootInWord); } for (Element subOrRoot : substituentsAndRoot) { processMultipliers(subOrRoot); detectConjunctiveSuffixGroups(subOrRoot, groups); matchLocantsToDirectFeatures(subOrRoot); List groupsOfSubOrRoot = subOrRoot.getChildElements(GROUP_EL); if (groupsOfSubOrRoot.size() > 0) { Element lastGroupInSubOrRoot =groupsOfSubOrRoot.get(groupsOfSubOrRoot.size() - 1); preliminaryProcessSuffixes(lastGroupInSubOrRoot, subOrRoot.getChildElements(SUFFIX_EL)); } } for (int j = substituents.size() -1; j >=0; j--) { Element 
substituent = substituents.get(j); if (substituent.getChildElements(GROUP_EL).isEmpty()) { boolean removed = removeAndMoveToAppropriateGroupIfHydroSubstituent(substituent);//this REMOVES a substituent just containing hydro/perhydro elements and moves these elements in front of an appropriate ring if (!removed){ removed = removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent); } if (!removed){ removed = removeAndMoveToAppropriateGroupIfRingBridge(substituent); } if (!removed){ throw new RuntimeException("OPSIN Bug: Encountered substituent with no group!: " + substituent.toXML() ); } substituents.remove(j); substituentsAndRoot.remove(substituent); substituentsAndRootAndBrackets.remove(substituent); } } functionalReplacement.processAcidReplacingFunctionalClassNomenclature(finalSubOrRootInWord, word); if (functionalReplacement.processPrefixFunctionalReplacementNomenclature(groups, substituents)){//true if functional replacement performed, 1 or more substituents will have been removed substituentsAndRoot = OpsinTools.combineElementLists(substituents, roots); substituentsAndRootAndBrackets =OpsinTools.combineElementLists(substituentsAndRoot, brackets); } handleGroupIrregularities(groups); for (Element subOrRoot : substituentsAndRoot) { processHW(subOrRoot);//Hantzsch-Widman rings FusedRingBuilder.processFusedRings(state, subOrRoot); processFusedRingBridges(subOrRoot); assignElementSymbolLocants(subOrRoot); processRingAssemblies(subOrRoot); processPolyCyclicSpiroNomenclature(subOrRoot); } for (Element subOrRoot : substituentsAndRoot) { applyLambdaConvention(subOrRoot); handleMultiRadicals(subOrRoot); } addImplicitBracketsToAminoAcids(groups, brackets); for (Element substituent : substituents) { matchLocantsToIndirectFeatures(substituent); addImplicitBracketsWhenSubstituentHasTwoLocants(substituent, brackets); implicitlyBracketToPreviousSubstituentIfAppropriate(substituent, brackets); } for (Element root : roots) { matchLocantsToIndirectFeatures(root); } for 
(Element subOrRoot : substituentsAndRoot) { assignImplicitLocantsToDiTerminalSuffixes(subOrRoot); processConjunctiveNomenclature(subOrRoot); suffixApplier.resolveSuffixes(subOrRoot.getFirstChildElement(GROUP_EL), subOrRoot.getChildElements(SUFFIX_EL)); if (subOrRoot.getName().equals(SUBSTITUENT_EL)) { moveSubstituentDetachableHetAtomRepl(subOrRoot); } } moveErroneouslyPositionedLocantsAndMultipliers(brackets);//e.g. (tetramethyl)azanium == tetra(methyl)azanium List children = OpsinTools.getChildElementsWithTagNames(word, new String[]{ROOT_EL, SUBSTITUENT_EL, BRACKET_EL}); addImplicitBracketsWhenFirstSubstituentHasTwoMultipliers(children.get(0), brackets);//e.g. ditrifluoroacetic acid --> di(trifluoroacetic acid) while (children.size() == 1) { children = OpsinTools.getChildElementsWithTagNames(children.get(0), new String[]{ROOT_EL, SUBSTITUENT_EL, BRACKET_EL}); } if (children.size() > 0) { assignLocantsToMultipliedRootIfPresent(children.get(children.size() - 1));//multiplicative nomenclature e.g. methylenedibenzene or 3,4'-oxydipyridine } substituentsAndRootAndBrackets =OpsinTools.combineElementLists(substituentsAndRoot, brackets);//implicit brackets may have been created for (Element subBracketOrRoot : substituentsAndRootAndBrackets) { assignLocantsAndMultipliers(subBracketOrRoot); } processBiochemicalLinkageDescriptors(substituents, brackets); processWordLevelMultiplierIfApplicable(word, roots, wordCount); } new WordRulesOmittedSpaceCorrector(state, parse).correctOmittedSpaces();//TODO where should this go? } /**Resolves the contents of a group element * * @param group The group element * @return The fragment specified by the group element. * @throws StructureBuildingException If the group can't be built. 
* @throws ComponentGenerationException */ static Fragment resolveGroup(BuildState state, Element group) throws StructureBuildingException, ComponentGenerationException { String groupValue = group.getAttributeValue(VALUE_ATR); String labelsValue = group.getAttributeValue(LABELS_ATR); Fragment thisFrag = state.fragManager.buildSMILES(groupValue, group, labelsValue != null ? labelsValue : NONE_LABELS_VAL); group.setFrag(thisFrag); //processes groups like cymene and xylene whose structure is determined by the presence of a locant in front e.g. p-xylene processXyleneLikeNomenclature(state, group, thisFrag); setFragmentDefaultInAtomIfSpecified(thisFrag, group); setFragmentFunctionalAtomsIfSpecified(group, thisFrag); applyTraditionalAlkaneNumberingIfAppropriate(group, thisFrag); applyHomologyGroupLabelsIfSpecified(group, thisFrag); if (ELEMENTARYATOM_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR))) { //these do not have implicit hydrogen e.g. phosphorus is literally just a phosphorus atom for (Atom a : thisFrag) { a.setImplicitHydrogenAllowed(false); } } return thisFrag; } private enum AtomReferenceType { ID, DEFAULTID, LOCANT, DEFAULTLOCANT } private static class AtomReference { private final AtomReferenceType referenceType; private final String reference; AtomReference(AtomReferenceType referenceType, String reference) { this.referenceType = referenceType; this.reference = reference; } } private static class AddGroup { private final Fragment frag; private AtomReference atomReference; AddGroup(Fragment frag, AtomReference atomReference) { this.frag = frag; this.atomReference = atomReference; } } private static class AddHeteroatom { private final String heteroAtomSmiles; private AtomReference atomReference; AddHeteroatom(String heteroAtomSmiles, AtomReference atomReference) { this.heteroAtomSmiles = heteroAtomSmiles; this.atomReference = atomReference; } } private static class AddBond { private final int bondOrder; private AtomReference atomReference; AddBond(int 
bondOrder, AtomReference atomReference) { this.bondOrder = bondOrder; this.atomReference = atomReference; } } /** * Checks for groups with the addGroup/addBond/addHeteroAtom attributes. For the addGroup attribute adds the group defined by the SMILES described within * e.g. for xylene this function would add two methyls. Xylene is initially generated using the structure of benzene! * See tokenList dtd for more information on the syntax of these attributes if it is not clear from the code * @param state * @param group: The group element * @param parentFrag: The fragment that has been generated from the group element * @throws StructureBuildingException * @throws ComponentGenerationException */ private static void processXyleneLikeNomenclature(BuildState state, Element group, Fragment parentFrag) throws StructureBuildingException, ComponentGenerationException { boolean ambiguous = false; if(group.getAttribute(ADDGROUP_ATR) != null) { String addGroupInformation = group.getAttributeValue(ADDGROUP_ATR); List groupsToBeAdded = new ArrayList<>(); ////typically only one, but 2 in the case of xylene and quinones for (String groupToBeAdded : addGroupInformation.split(";")) { String[] description = groupToBeAdded.split(" "); if (description.length < 3 || description.length > 4) { throw new ComponentGenerationException("malformed addGroup tag"); } String smiles = description[0]; AtomReferenceType referenceType = AtomReferenceType.valueOf(description[1].toUpperCase(Locale.ROOT)); String reference = description[2]; Fragment fragToAdd; if (description.length == 4) {//labels may optionally be specified for the group to be added fragToAdd = state.fragManager.buildSMILES(smiles, group, description[3]); } else{ fragToAdd = state.fragManager.buildSMILES(smiles, group, NONE_LABELS_VAL); } groupsToBeAdded.add(new AddGroup(fragToAdd, new AtomReference(referenceType, reference))); } Element previousEl = OpsinTools.getPreviousSibling(group); if (previousEl !=null && 
previousEl.getName().equals(LOCANT_EL)){//does the name specify locants that override the default ones? List<String> locantValues = StringTools.arrayToList(previousEl.getValue().split(",")); if ((locantValues.size() == groupsToBeAdded.size() || locantValues.size() + 1 == groupsToBeAdded.size()) && locantAreAcceptableForXyleneLikeNomenclatures(locantValues, group)){//one locant can be implicit in some cases boolean assignlocants = true; if (locantValues.size() != groupsToBeAdded.size()){ //check that the first group will by default be added to the atom with locant 1. If this is not the case then as many locants as there were groups should have been specified //or no locants should have been specified, which is what will be assumed (i.e. the locants will be left unassigned) AddGroup groupInformation = groupsToBeAdded.get(0); String locant; switch (groupInformation.atomReference.referenceType) { case DEFAULTLOCANT: case LOCANT: locant = parentFrag.getAtomByLocantOrThrow(groupInformation.atomReference.reference).getFirstLocant(); break; case DEFAULTID: case ID: locant = parentFrag.getAtomByIDOrThrow(parentFrag.getIdOfFirstAtom() + Integer.parseInt(groupInformation.atomReference.reference) - 1).getFirstLocant(); break; default: throw new ComponentGenerationException("malformed addGroup tag"); } if (locant == null || !locant.equals("1")){ assignlocants = false; } } if (assignlocants){ for (int i = groupsToBeAdded.size() - 1; i >=0 ; i--) { //if fewer locants than expected are specified, only the locants of the later groups will be changed //e.g. 
4-xylene will transform 1,2-xylene to 1,4-xylene AddGroup groupInformation = groupsToBeAdded.get(i); if (locantValues.size() >0){ groupInformation.atomReference = new AtomReference(AtomReferenceType.LOCANT, locantValues.get(locantValues.size() - 1)); locantValues.remove(locantValues.size() - 1); } else{ break; } } group.removeAttribute(group.getAttribute(FRONTLOCANTSEXPECTED_ATR)); previousEl.detach(); } } } for (int i = 0; i < groupsToBeAdded.size(); i++) { AddGroup groupInformation = groupsToBeAdded.get(i); Fragment newFrag = groupInformation.frag; Atom atomOnParentFrag; switch (groupInformation.atomReference.referenceType) { case DEFAULTLOCANT: ambiguous = true; case LOCANT: if (groupInformation.atomReference.reference.equals("required")) { throw new ComponentGenerationException(group.getValue() + " requires an allowed locant"); } atomOnParentFrag = parentFrag.getAtomByLocantOrThrow(groupInformation.atomReference.reference); break; case DEFAULTID: ambiguous = true; case ID: atomOnParentFrag = parentFrag.getAtomByIDOrThrow(parentFrag.getIdOfFirstAtom() + Integer.parseInt(groupInformation.atomReference.reference) -1); break; default: throw new ComponentGenerationException("malformed addGroup tag"); } if (newFrag.getOutAtomCount() >1){ throw new ComponentGenerationException("too many outAtoms on group to be added"); } if (newFrag.getOutAtomCount() ==1) { OutAtom newFragOutAtom = newFrag.getOutAtom(0); newFrag.removeOutAtom(newFragOutAtom); state.fragManager.incorporateFragment(newFrag, newFragOutAtom.getAtom(), parentFrag, atomOnParentFrag, newFragOutAtom.getValency()); } else{ Atom atomOnNewFrag = newFrag.getDefaultInAtomOrFirstAtom(); state.fragManager.incorporateFragment(newFrag, atomOnNewFrag, parentFrag, atomOnParentFrag, 1); } } } if(group.getAttributeValue(ADDHETEROATOM_ATR) != null) { String addHeteroAtomInformation = group.getAttributeValue(ADDHETEROATOM_ATR); List heteroAtomsToBeAdded = new ArrayList<>(); for (String heteroAtomToBeAdded : 
addHeteroAtomInformation.split(";")) { String[] description = heteroAtomToBeAdded.split(" "); if (description.length != 3) { throw new ComponentGenerationException("malformed addHeteroAtom tag"); } String heteroAtomSmiles = description[0]; AtomReferenceType referenceType = AtomReferenceType.valueOf(description[1].toUpperCase(Locale.ROOT)); String reference = description[2]; heteroAtomsToBeAdded.add(new AddHeteroatom(heteroAtomSmiles, new AtomReference(referenceType, reference))); } Element previousEl = OpsinTools.getPreviousSibling(group); if (previousEl != null && previousEl.getName().equals(LOCANT_EL)){//has the name got specified locants to override the default ones List locantValues =StringTools.arrayToList(previousEl.getValue().split(",")); if (locantValues.size() == heteroAtomsToBeAdded.size() && locantAreAcceptableForXyleneLikeNomenclatures(locantValues, group)){ for (int i = heteroAtomsToBeAdded.size() -1; i >=0 ; i--) {//all heteroatoms must have a locant or default locants will be used AddHeteroatom groupInformation = heteroAtomsToBeAdded.get(i); groupInformation.atomReference = new AtomReference(AtomReferenceType.LOCANT, locantValues.get(locantValues.size() - 1)); locantValues.remove(locantValues.size() - 1); } group.removeAttribute(group.getAttribute(FRONTLOCANTSEXPECTED_ATR)); previousEl.detach(); } } for (int i = 0; i < heteroAtomsToBeAdded.size(); i++) { AddHeteroatom heteroAtomInformation = heteroAtomsToBeAdded.get(i); Atom atomOnParentFrag = null; switch (heteroAtomInformation.atomReference.referenceType) { case DEFAULTLOCANT: ambiguous = true; case LOCANT: if (heteroAtomInformation.atomReference.reference.equals("required")) { throw new ComponentGenerationException(group.getValue() + " requires an allowed locant"); } atomOnParentFrag = parentFrag.getAtomByLocantOrThrow(heteroAtomInformation.atomReference.reference); break; case DEFAULTID: ambiguous = true; case ID: atomOnParentFrag = parentFrag.getAtomByIDOrThrow(parentFrag.getIdOfFirstAtom() + 
Integer.parseInt(heteroAtomInformation.atomReference.reference) - 1); break; default: throw new ComponentGenerationException("malformed addHeteroAtom tag"); } state.fragManager.replaceAtomWithSmiles(atomOnParentFrag, heteroAtomInformation.heteroAtomSmiles); } } if(group.getAttributeValue(ADDBOND_ATR) != null && !HANTZSCHWIDMAN_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))) {//HW add bond is handled later String addBondInformation = group.getAttributeValue(ADDBOND_ATR); List bondsToBeAdded = new ArrayList<>(); for (String bondToBeAdded : addBondInformation.split(";")) { String[] description = bondToBeAdded.split(" "); if (description.length != 3) { throw new ComponentGenerationException("malformed addBond tag"); } int bondOrder = Integer.parseInt(description[0]); AtomReferenceType referenceType = AtomReferenceType.valueOf(description[1].toUpperCase(Locale.ROOT)); String reference = description[2]; bondsToBeAdded.add(new AddBond(bondOrder, new AtomReference(referenceType, reference))); } boolean locanted = false; Element previousEl = OpsinTools.getPreviousSibling(group); if (previousEl != null && previousEl.getName().equals(LOCANT_EL)){//has the name got specified locants to override the default ones List locantValues = StringTools.arrayToList(previousEl.getValue().split(",")); if (locantValues.size() == bondsToBeAdded.size() && locantAreAcceptableForXyleneLikeNomenclatures(locantValues, group)){ for (int i = bondsToBeAdded.size() -1; i >=0 ; i--) {//all bond order changes must have a locant or default locants will be used AddBond bondInformation = bondsToBeAdded.get(i); bondInformation.atomReference = new AtomReference(AtomReferenceType.LOCANT, locantValues.get(locantValues.size() - 1)); locantValues.remove(locantValues.size() - 1); } group.removeAttribute(group.getAttribute(FRONTLOCANTSEXPECTED_ATR)); previousEl.detach(); locanted = true; } } for (int i = 0; i < bondsToBeAdded.size(); i++) { AddBond bondInformation = bondsToBeAdded.get(i); Atom 
atomOnParentFrag; switch (bondInformation.atomReference.referenceType) { case DEFAULTLOCANT: ambiguous = true; case LOCANT: if (bondInformation.atomReference.reference.equals("required")) { throw new ComponentGenerationException(group.getValue() + " requires an allowed locant"); } atomOnParentFrag=parentFrag.getAtomByLocantOrThrow(bondInformation.atomReference.reference); break; case DEFAULTID: ambiguous = true; case ID: atomOnParentFrag= parentFrag.getAtomByIDOrThrow(parentFrag.getIdOfFirstAtom() + Integer.parseInt(bondInformation.atomReference.reference) -1); break; default: throw new ComponentGenerationException("malformed addBond tag"); } Bond b = FragmentTools.unsaturate(atomOnParentFrag, bondInformation.bondOrder, parentFrag); if (!locanted && b.getOrder() == 2 && parentFrag.getAtomCount() == 5 && b.getFromAtom().getAtomIsInACycle() && b.getToAtom().getAtomIsInACycle()){ //special case just that substitution of groups like imidazoline may actually remove the double bond... b.setOrder(1); b.getFromAtom().setSpareValency(true); b.getToAtom().setSpareValency(true); } } } if (ambiguous) { state.addIsAmbiguous(group.getValue() +" describes multiple structures"); } } /** * Checks that all locants are present within the front locants expected attribute of the group * @param locantValues * @param group * @return */ private static boolean locantAreAcceptableForXyleneLikeNomenclatures(List locantValues, Element group) { if (group.getAttribute(FRONTLOCANTSEXPECTED_ATR) == null){ throw new IllegalArgumentException("Group must have frontLocantsExpected to implement xylene-like nomenclature"); } List allowedLocants = Arrays.asList(group.getAttributeValue(FRONTLOCANTSEXPECTED_ATR).split(",")); for (String locant : locantValues) { if (!allowedLocants.contains(locant)){ return false; } } return true; } /** * Looks for the presence of {@link XmlDeclarations#DEFAULTINLOCANT_ATR} and {@link XmlDeclarations#DEFAULTINID_ATR} on the group and applies them to the fragment * @param 
thisFrag * @param group * @throws StructureBuildingException */ private static void setFragmentDefaultInAtomIfSpecified(Fragment thisFrag, Element group) throws StructureBuildingException { String defaultInLocant = group.getAttributeValue(DEFAULTINLOCANT_ATR); String defaultInId = group.getAttributeValue(DEFAULTINID_ATR); if (defaultInLocant != null){//sets the atom at which substitution will occur by default thisFrag.setDefaultInAtom(thisFrag.getAtomByLocantOrThrow(defaultInLocant)); } else if (defaultInId != null){ thisFrag.setDefaultInAtom(thisFrag.getAtomByIDOrThrow(thisFrag.getIdOfFirstAtom() + Integer.parseInt(defaultInId) - 1)); } } /** * Looks for the presence of FUNCTIONALIDS_ATR on the group and applies them to the fragment * @param group * @param thisFrag * @throws StructureBuildingException */ private static void setFragmentFunctionalAtomsIfSpecified(Element group, Fragment thisFrag) throws StructureBuildingException { if (group.getAttribute(FUNCTIONALIDS_ATR)!=null){ String[] functionalIDs = group.getAttributeValue(FUNCTIONALIDS_ATR).split(","); for (String functionalID : functionalIDs) { thisFrag.addFunctionalAtom(thisFrag.getAtomByIDOrThrow(thisFrag.getIdOfFirstAtom() + Integer.parseInt(functionalID) - 1)); } } } private static void applyTraditionalAlkaneNumberingIfAppropriate(Element group, Fragment thisFrag) { String groupType = group.getAttributeValue(TYPE_ATR); if (groupType.equals(ACIDSTEM_TYPE_VAL)){ List<Atom> atomList = thisFrag.getAtomList(); Atom startingAtom = thisFrag.getFirstAtom(); if (group.getAttribute(SUFFIXAPPLIESTO_ATR) != null){ String suffixAppliesTo = group.getAttributeValue(SUFFIXAPPLIESTO_ATR); String[] suffixAppliesToArr = suffixAppliesTo.split(","); if (suffixAppliesToArr.length != 1){ return; } startingAtom = atomList.get(Integer.parseInt(suffixAppliesToArr[0]) - 1); } List<Atom> neighbours = startingAtom.getAtomNeighbours(); int counter = -1; Atom previousAtom = startingAtom; for (int i = neighbours.size() - 1; i >=0; i--) {//only 
consider carbon atoms if (neighbours.get(i).getElement() != ChemEl.C){ neighbours.remove(i); } } while (neighbours.size() == 1){ counter++; if (counter > 5){ break; } Atom nextAtom = neighbours.get(0); if (nextAtom.getAtomIsInACycle()){ break; } String traditionalLocant = traditionalAlkanePositionNames[counter]; if (!nextAtom.hasLocant(traditionalLocant)){ nextAtom.addLocant(traditionalLocant); } neighbours = nextAtom.getAtomNeighbours(); neighbours.remove(previousAtom); for (int i = neighbours.size()-1; i >=0; i--) {//only consider carbon atoms if (neighbours.get(i).getElement() != ChemEl.C){ neighbours.remove(i); } } previousAtom = nextAtom; } } else if (groupType.equals(CHAIN_TYPE_VAL) && ALKANESTEM_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ List<Atom> atomList = thisFrag.getAtomList(); if (atomList.size() == 1){ return; } Element possibleSuffix = OpsinTools.getNextSibling(group, SUFFIX_EL); boolean terminalSuffixWithNoSuffixPrefixPresent = false; if (possibleSuffix!=null && TERMINAL_SUBTYPE_VAL.equals(possibleSuffix.getAttributeValue(SUBTYPE_ATR)) && possibleSuffix.getAttribute(SUFFIXPREFIX_ATR) == null){ terminalSuffixWithNoSuffixPrefixPresent = true; } for (Atom atom : atomList) { String firstLocant = atom.getFirstLocant(); if (!atom.getAtomIsInACycle() && firstLocant != null && firstLocant.length() == 1 && Character.isDigit(firstLocant.charAt(0))){ int locantNumber = Integer.parseInt(firstLocant); if (terminalSuffixWithNoSuffixPrefixPresent){ if (locantNumber > 1 && locantNumber <= 7){ atom.addLocant(traditionalAlkanePositionNames[locantNumber - 2]); } } else{ if (locantNumber > 0 && locantNumber <= 6){ atom.addLocant(traditionalAlkanePositionNames[locantNumber - 1]); } } } } } } private static void applyHomologyGroupLabelsIfSpecified(Element group, Fragment frag) { String homologyValsStr = group.getAttributeValue(HOMOLOGY_ATR); if (homologyValsStr != null) { String[] vals = homologyValsStr.split(";"); List<Atom> homologyAtoms = new ArrayList<>(); for 
(Atom a : frag) { if (a.getElement() == ChemEl.R) { homologyAtoms.add(a); } } int count = vals.length; if (count != homologyAtoms.size()) { throw new RuntimeException("OPSIN Bug: Number of homology atoms should match number of homology labels! for: " + group.getValue() ); } for (int i = 0; i < count; i++) { homologyAtoms.get(i).setProperty(Atom.HOMOLOGY_GROUP, vals[i]); } } } private void processChargeAndOxidationNumberSpecification(Element group, Fragment frag) { if (OUSICATOM_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))) { String oxidationStates = group.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR); if (oxidationStates == null) { throw new RuntimeException(COMMONOXIDATIONSTATESANDMAX_ATR + " should be specified on: " + group.getValue()); } frag.getFirstAtom().setProperty(Atom.OXIDATION_NUMBER, Integer.parseInt(oxidationStates.split(":")[0])); } Element nextEl = OpsinTools.getNextSibling(group); if (nextEl != null){ if(nextEl.getName().equals(CHARGESPECIFIER_EL)) { frag.getFirstAtom().setCharge(Integer.parseInt(nextEl.getAttributeValue(VALUE_ATR))); nextEl.detach(); } if(nextEl.getName().equals(OXIDATIONNUMBERSPECIFIER_EL)) { frag.getFirstAtom().setProperty(Atom.OXIDATION_NUMBER, Integer.parseInt(nextEl.getAttributeValue(VALUE_ATR))); nextEl.detach(); } } } /** * Removes a substituent that consists only of hydro/perhydro elements and moves its contents in front of the next in-scope ring * @param substituent * @return true if the substituent was a hydro substituent and hence was removed * @throws ComponentGenerationException */ private boolean removeAndMoveToAppropriateGroupIfHydroSubstituent(Element substituent) throws ComponentGenerationException { List<Element> hydroElements = substituent.getChildElements(HYDRO_EL); if (hydroElements.size() > 0) { Element targetRing = null; final Element adjacentSubOrRootOrBracket = OpsinTools.getNextSibling(substituent); if (adjacentSubOrRootOrBracket == null) { throw new ComponentGenerationException("Cannot find ring for hydro 
substituent to apply to"); } //first check adjacent substituent/root. If the hydro element has one locant or the ring is locantless then we can assume the hydro is acting as a nondetachable prefix Element potentialRing = adjacentSubOrRootOrBracket.getFirstChildElement(GROUP_EL); if (potentialRing != null && containsCyclicAtoms(potentialRing)) { Element possibleLocantInFrontOfHydro = OpsinTools.getPreviousSibling(hydroElements.get(0)); if (possibleLocantInFrontOfHydro != null && possibleLocantInFrontOfHydro.getName().equals(LOCANT_EL) && possibleLocantInFrontOfHydro.getValue().split(",").length == 1) { //e.g.4-decahydro-1-naphthalenyl targetRing = potentialRing; } else{ Element possibleLocantInFrontOfRing = OpsinTools.getPreviousSibling(potentialRing, LOCANT_EL); if (possibleLocantInFrontOfRing != null) { if (potentialRing.getAttribute(FRONTLOCANTSEXPECTED_ATR) != null) {//check whether the group was expecting a locant e.g. 2-furyl String locantValue = possibleLocantInFrontOfRing.getValue(); String[] expectedLocants = potentialRing.getAttributeValue(FRONTLOCANTSEXPECTED_ATR).split(","); for (String expectedLocant : expectedLocants) { if (locantValue.equals(expectedLocant)){ targetRing = potentialRing; break; } } } //check whether the group is a benzofused ring e.g. 1,4-benzodioxin if (FUSIONRING_SUBTYPE_VAL.equals(potentialRing.getAttributeValue(SUBTYPE_ATR)) && (potentialRing.getValue().equals("benzo")|| potentialRing.getValue().equals("benz")) && !OpsinTools.getNextSibling(potentialRing).getName().equals(FUSION_EL)){ targetRing = potentialRing; } } else{ targetRing = potentialRing; } } } //that didn't match so the hydro appears to be a detachable prefix. 
detachable prefixes attach in preference to the rightmost applicable group so search any remaining substituents/roots from right to left if (targetRing == null) { Element nextSubOrRootOrBracketFromLast = substituent.getParent().getChild(substituent.getParent().getChildCount() - 1);//the last sibling while (!nextSubOrRootOrBracketFromLast.equals(substituent)){ potentialRing = nextSubOrRootOrBracketFromLast.getFirstChildElement(GROUP_EL); if (potentialRing != null && containsCyclicAtoms(potentialRing)){ targetRing = potentialRing; break; } else{ nextSubOrRootOrBracketFromLast = OpsinTools.getPreviousSibling(nextSubOrRootOrBracketFromLast); } } } if (targetRing == null) { throw new ComponentGenerationException("Cannot find ring for hydro substituent to apply to"); } //move the children of the hydro substituent List<Element> children = substituent.getChildElements(); Element targetSubstituent = targetRing.getParent(); if (targetSubstituent.equals(adjacentSubOrRootOrBracket)) { for (int i = children.size()-1; i >=0 ; i--) { Element child = children.get(i); if (child.getName().equals(HYPHEN_EL)) { continue; } child.detach(); targetSubstituent.insertChild(child, 0); } } else { boolean inDetachablePrefix = true; for (int i = children.size()-1; i >=0 ; i--) { Element child = children.get(i); String elName = child.getName(); if (elName.equals(HYPHEN_EL)) { continue; } else if (inDetachablePrefix && elName.equals(HYDRO_EL)) { child.detach(); targetSubstituent.insertChild(child, 0); } else if (elName.equals(STEREOCHEMISTRY_EL)) { inDetachablePrefix = false; child.detach(); adjacentSubOrRootOrBracket.insertChild(child, 0); } else { throw new ComponentGenerationException("Unexpected term found before detachable hydro prefix: " + child.getValue() ); } } } substituent.detach(); return true; } return false; } /** * Removes substituents which are just a subtractivePrefix element e.g. 
deoxy and moves their contents to be in front of the next in scope biochemical fragment (or failing that group) * @param substituent * @return true if the substituent was a subtractivePrefix substituent and hence was removed * @throws ComponentGenerationException */ static boolean removeAndMoveToAppropriateGroupIfSubtractivePrefix(Element substituent) throws ComponentGenerationException { List<Element> subtractivePrefixes = substituent.getChildElements(SUBTRACTIVEPREFIX_EL); if (subtractivePrefixes.size() > 0) { Element biochemicalGroup = null;//preferred Element standardGroup = null; final Element adjacentSubOrRootOrBracket = OpsinTools.getNextSibling(substituent); Element nextSubOrRootOrBracket = adjacentSubOrRootOrBracket; if (nextSubOrRootOrBracket == null){ throw new ComponentGenerationException("Unable to find group for: " + subtractivePrefixes.get(0).getValue() +" to apply to!"); } //prefer the nearest (unlocanted) biochemical group or the rightmost standard group while (nextSubOrRootOrBracket != null) { Element groupToConsider = nextSubOrRootOrBracket.getFirstChildElement(GROUP_EL); if (groupToConsider != null) { if (OpsinTools.isBiochemical(groupToConsider.getAttributeValue(TYPE_ATR), groupToConsider.getAttributeValue(SUBTYPE_ATR))){ biochemicalGroup = groupToConsider; if (OpsinTools.getPreviousSiblingsOfType(biochemicalGroup, LOCANT_EL).isEmpty()) { break; } } else { standardGroup = groupToConsider; } } nextSubOrRootOrBracket = OpsinTools.getNextSibling(nextSubOrRootOrBracket); } Element targetGroup = biochemicalGroup != null ? 
biochemicalGroup : standardGroup; if (targetGroup == null) { throw new ComponentGenerationException("Unable to find group for: " + subtractivePrefixes.get(0).getValue() +" to apply to!"); } //move the children of the subtractivePrefix substituent List<Element> children = substituent.getChildElements(); Element targetSubstituent = targetGroup.getParent(); if (targetSubstituent.equals(adjacentSubOrRootOrBracket)) { for (int i = children.size()-1; i >=0 ; i--) { Element child =children.get(i); if (!child.getName().equals(HYPHEN_EL)){ child.detach(); targetSubstituent.insertChild(child, 0); } } } else { boolean inDetachablePrefix = true; for (int i = children.size()-1; i >=0 ; i--) { Element child = children.get(i); String elName = child.getName(); if (elName.equals(HYPHEN_EL)) { continue; } else if (inDetachablePrefix && elName.equals(SUBTRACTIVEPREFIX_EL)) { child.detach(); targetSubstituent.insertChild(child, 0); } else if (elName.equals(STEREOCHEMISTRY_EL)) { inDetachablePrefix = false; child.detach(); adjacentSubOrRootOrBracket.insertChild(child, 0); } else { throw new ComponentGenerationException("Unexpected term found before detachable subtractive prefix: " + child.getValue() ); } } } substituent.detach(); return true; } return false; } /** * Removes substituents which are just a fused ring element and moves their contents to be in front of the next in scope ring * @param substituent * @return true if the substituent was a ring bridge and hence was removed * @throws ComponentGenerationException */ private boolean removeAndMoveToAppropriateGroupIfRingBridge(Element substituent) throws ComponentGenerationException { List<Element> ringBridges = substituent.getChildElements(FUSEDRINGBRIDGE_EL); if (ringBridges.size() > 0) { final Element adjacentSubOrRootOrBracket = OpsinTools.getNextSibling(substituent); Element nextSubOrRootOrBracket = adjacentSubOrRootOrBracket; if (nextSubOrRootOrBracket == null){ throw new ComponentGenerationException("Unable to find group for: " + 
ringBridges.get(0).getValue() +" to apply to!"); } Element targetGroup = null; Element standardGroup = null; //prefer the nearest (unlocanted) ring group or the rightmost standard group while (nextSubOrRootOrBracket != null) { Element groupToConsider = nextSubOrRootOrBracket.getFirstChildElement(GROUP_EL); if (groupToConsider != null) { if (containsCyclicAtoms(groupToConsider) && OpsinTools.getPreviousSiblingsOfType(groupToConsider, LOCANT_EL).isEmpty()) { targetGroup = groupToConsider; break; } else { standardGroup = groupToConsider; } } nextSubOrRootOrBracket = OpsinTools.getNextSibling(nextSubOrRootOrBracket); } if (targetGroup == null) { targetGroup = standardGroup; } if (targetGroup == null) { throw new ComponentGenerationException("Unable to find group for: " + ringBridges.get(0).getValue() +" to apply to!"); } //move the children of the fusedRingBridge substituent List<Element> children = substituent.getChildElements(); Element targetSubstituent = targetGroup.getParent(); if (targetSubstituent.equals(adjacentSubOrRootOrBracket)) { for (int i = children.size()-1; i >=0 ; i--) { Element child =children.get(i); if (!child.getName().equals(HYPHEN_EL)){ child.detach(); targetSubstituent.insertChild(child, 0); } } } else { boolean inDetachablePrefix = true; for (int i = children.size()-1; i >=0 ; i--) { Element child = children.get(i); String elName = child.getName(); if (elName.equals(HYPHEN_EL)) { continue; } else if (inDetachablePrefix && (elName.equals(FUSEDRINGBRIDGE_EL) || elName.equals(COLONORSEMICOLONDELIMITEDLOCANT_EL) || elName.equals(LOCANT_EL))) { child.detach(); targetSubstituent.insertChild(child, 0); } else if (elName.equals(STEREOCHEMISTRY_EL)) { inDetachablePrefix = false; child.detach(); adjacentSubOrRootOrBracket.insertChild(child, 0); } else { throw new ComponentGenerationException("Unexpected term found before detachable ring bridge: " + child.getValue() ); } } } substituent.detach(); return true; } return false; } private boolean 
containsCyclicAtoms(Element potentialRing) { Fragment potentialRingFrag = potentialRing.getFrag(); List<Atom> atomList = potentialRingFrag.getAtomList(); for (Atom atom : atomList) { if (atom.getAtomIsInACycle()){ return true; } } return false; } /** * Checks for agreement between the number of locants and multipliers. * If a locant element contains multiple locants and is not next to a multiplier, the known cases in which this is legitimate are checked for. * This may result in a locant being moved if that is more convenient for subsequent processing. * @param subOrBracketOrRoot The substituent/root/bracket to look for locants in. * @param finalSubOrRootInWord : used to check if a locant is referring to the root as in multiplicative nomenclature * @throws ComponentGenerationException * @throws StructureBuildingException */ private void determineLocantMeaning(Element subOrBracketOrRoot, Element finalSubOrRootInWord) throws StructureBuildingException, ComponentGenerationException { List<Element> locants = subOrBracketOrRoot.getChildElements(LOCANT_EL); Element group = subOrBracketOrRoot.getFirstChildElement(GROUP_EL);//will be null if element is a bracket for (Element locant : locants) { String[] locantValues = locant.getValue().split(","); if(locantValues.length > 1) { Element afterLocant = OpsinTools.getNextSibling(locant); int structuralBracketDepth = 0; Element multiplierEl = null; while (afterLocant != null){ String elName = afterLocant.getName(); if (elName.equals(STRUCTURALOPENBRACKET_EL)){ structuralBracketDepth++; } else if (elName.equals(STRUCTURALCLOSEBRACKET_EL)){ structuralBracketDepth--; } if (structuralBracketDepth != 0){ afterLocant = OpsinTools.getNextSibling(afterLocant); continue; } if(elName.equals(LOCANT_EL)) { break; } else if (elName.equals(MULTIPLIER_EL)){ if (locantValues.length == Integer.parseInt(afterLocant.getAttributeValue(VALUE_ATR))){ if (afterLocant.equals(OpsinTools.getNextSiblingIgnoringCertainElements(locant, new String[]{INDICATEDHYDROGEN_EL}))){ 
//direct locant, typical case. An exception is made for indicated hydrogen e.g. 1,2,4-1H-triazole multiplierEl = afterLocant; break; } else{ Element afterMultiplier = OpsinTools.getNextSibling(afterLocant); if (afterMultiplier!=null && (afterMultiplier.getName().equals(SUFFIX_EL) || afterMultiplier.getName().equals(INFIX_EL) || afterMultiplier.getName().equals(UNSATURATOR_EL) || afterMultiplier.getName().equals(GROUP_EL))){ multiplierEl = afterLocant; //indirect locant break; } } } if (afterLocant.equals(OpsinTools.getNextSibling(locant))){//if nothing better can be found report this as a locant/multiplier mismatch multiplierEl = afterLocant; } } else if (elName.equals(RINGASSEMBLYMULTIPLIER_EL) && afterLocant.equals(OpsinTools.getNextSibling(locant))){//e.g. 1,1'-biphenyl multiplierEl = afterLocant; if (!FragmentTools.allAtomsInRingAreIdentical(group.getFrag())){//if all atoms are identical then the locant may refer to suffixes break; } } else if (elName.equals(FUSEDRINGBRIDGE_EL)&& locantValues.length ==2 && afterLocant.equals(OpsinTools.getNextSibling(locant))){//e.g. 1,8-methano break; } afterLocant = OpsinTools.getNextSibling(afterLocant); } if(multiplierEl != null) { if(Integer.parseInt(multiplierEl.getAttributeValue(VALUE_ATR)) == locantValues.length ) { // number of locants and multiplier agree boolean locantModified = false;//did determineLocantMeaning do something? 
if (locantValues[locantValues.length-1].endsWith("'") && group!=null && subOrBracketOrRoot.indexOf(group) > subOrBracketOrRoot.indexOf(locant)){//quite possible that this is referring to a multiplied root if (group.getAttribute(OUTIDS_ATR)!=null && group.getAttributeValue(OUTIDS_ATR).split(",").length>1){ locantModified = checkSpecialLocantUses(locant, locantValues, finalSubOrRootInWord); } else{ Element afterGroup = OpsinTools.getNextSibling(group); int inlineSuffixCount =0; int multiplier = 1; while (afterGroup != null){ if(afterGroup.getName().equals(MULTIPLIER_EL)){ multiplier =Integer.parseInt(afterGroup.getAttributeValue(VALUE_ATR)); } else if(afterGroup.getName().equals(SUFFIX_EL) && afterGroup.getAttributeValue(TYPE_ATR).equals(INLINE_TYPE_VAL)){ inlineSuffixCount +=(multiplier); multiplier=1; } afterGroup = OpsinTools.getNextSibling(afterGroup); } if (inlineSuffixCount >=2){ locantModified = checkSpecialLocantUses(locant, locantValues, finalSubOrRootInWord); } } } if (!locantModified && !OpsinTools.getNextSibling(locant).equals(multiplierEl)){//the locants apply indirectly to the multiplier e.g. 2,3-butandiol //move the locant to be next to the multiplier. locant.detach(); OpsinTools.insertBefore(multiplierEl, locant); } } else { if(!checkSpecialLocantUses(locant, locantValues, finalSubOrRootInWord)) { throw new ComponentGenerationException("Mismatch between locant and multiplier counts (" + Integer.toString(locantValues.length) + " and " + multiplierEl.getAttributeValue(VALUE_ATR) + "):" + locant.getValue()); } } } else { /* Multiple locants without a multiplier */ if(!checkSpecialLocantUses(locant, locantValues, finalSubOrRootInWord)) { throw new ComponentGenerationException("Multiple locants without a multiplier: " + locant.toXML()); } } } } } /**Looks for Hantzsch-Widman systems, and sees if the number of locants * agrees with the number of heteroatoms. 
* If this is not the case alternative possibilities are tested: * The locants could be intended to indicate the position of outAtoms e.g. 1,4-phenylene * The locants could be intended to indicate the attachment points of the root groups in multiplicative nomenclature e.g. 4,4'-methylenedibenzoic acid * @param locant The element corresponding to the locant group to be tested * @param locantValues The locant values * @param finalSubOrRootInWord : used to check if a locant is referring to the root as in multiplicative nomenclature * @return true if there's a HW system, and agreement; or if the locants conform to one of the alternative possibilities, otherwise false. * @throws ComponentGenerationException */ private boolean checkSpecialLocantUses(Element locant, String[] locantValues, Element finalSubOrRootInWord) throws ComponentGenerationException { int count = locantValues.length; Element currentElem = OpsinTools.getNextSibling(locant); int heteroCount = 0; int multiplierValue = 1; while(currentElem != null && !currentElem.getName().equals(GROUP_EL)){ if(currentElem.getName().equals(HETEROATOM_EL)) { heteroCount += multiplierValue; multiplierValue = 1; } else if (currentElem.getName().equals(MULTIPLIER_EL)){ multiplierValue = Integer.parseInt(currentElem.getAttributeValue(VALUE_ATR)); } else{ break; } currentElem = OpsinTools.getNextSibling(currentElem); } if(currentElem != null && currentElem.getName().equals(GROUP_EL)){ if (currentElem.getAttributeValue(SUBTYPE_ATR).equals(HANTZSCHWIDMAN_SUBTYPE_VAL)) { if(heteroCount == count) { return true; } else if (heteroCount > 1){ return false;//there is a case where locants don't apply to heteroatoms in a HW system, but in that case only one locant is expected so this function would not be called } } if (heteroCount == 0 && currentElem.getAttribute(OUTIDS_ATR) != null ) {//e.g. 
1,4-phenylene String[] outIDs = currentElem.getAttributeValue(OUTIDS_ATR).split(",", -1); Fragment groupFragment = currentElem.getFrag(); if (count ==outIDs.length && groupFragment.getAtomCount() > 1){//things like oxy do not need to have their outIDs specified int idOfFirstAtomInFrag =groupFragment.getIdOfFirstAtom(); boolean foundLocantNotPresentOnFragment = false; for (int i = outIDs.length - 1; i >=0; i--) { Atom a =groupFragment.getAtomByLocant(locantValues[i]); if (a == null){ foundLocantNotPresentOnFragment = true; break; } outIDs[i] = Integer.toString(a.getID() - idOfFirstAtomInFrag + 1);//convert to relative id } if (!foundLocantNotPresentOnFragment){ currentElem.getAttribute(OUTIDS_ATR).setValue(StringTools.arrayToString(outIDs, ",")); locant.detach(); return true; } } } else if(currentElem.getValue().equals("benz") || currentElem.getValue().equals("benzo")){ Element potentialGroupAfterBenzo = OpsinTools.getNextSibling(currentElem, GROUP_EL);//need to make sure this isn't benzyl if (potentialGroupAfterBenzo!=null){ return true;//e.g. 
1,2-benzothiazole } } } if(currentElem != null) { String name = currentElem.getName(); if (name.equals(POLYCYCLICSPIRO_EL)){ return true; } else if (name.equals(FUSEDRINGBRIDGE_EL) && count == 2){ return true; } else if (name.equals(SUFFIX_EL) && CYCLEFORMER_SUBTYPE_VAL.equals(currentElem.getAttributeValue(SUBTYPE_ATR)) && count == 2){ currentElem.addAttribute(new Attribute(LOCANT_ATR, locant.getValue())); locant.detach(); return true; } else if (name.equals(SUBTRACTIVEPREFIX_EL) && ANHYDRO_TYPE_VAL.equals(currentElem.getAttributeValue(TYPE_ATR))){ if (count != 2) { throw new ComponentGenerationException("Two locants are required before an anhydro prefix, but found: "+ locant.getValue()); } currentElem.addAttribute(new Attribute(LOCANT_ATR, locant.getValue())); locant.detach(); return true; } } boolean detectedMultiplicativeNomenclature = detectMultiplicativeNomenclature(locant, locantValues, finalSubOrRootInWord); if (detectedMultiplicativeNomenclature){ return true; } if (currentElem != null && count ==2 && currentElem.getName().equals(GROUP_EL)){ if (EPOXYLIKE_SUBTYPE_VAL.equals(currentElem.getAttributeValue(SUBTYPE_ATR))){ return true; } if ("yes".equals(currentElem.getAttributeValue(IMINOLIKE_ATR))){ currentElem.getAttribute(SUBTYPE_ATR).setValue(EPOXYLIKE_SUBTYPE_VAL); return true; } } Element parentElem = locant.getParent(); if (count == 2 && parentElem.getName().equals(BRACKET_EL)){//e.g. 3,4-(dichloromethylenedioxy) this is changed to (dichloro3,4-methylenedioxy) List<Element> substituents = parentElem.getChildElements(SUBSTITUENT_EL); if (substituents.size() > 0){ Element finalSub = substituents.get(substituents.size() - 1); Element group = finalSub.getFirstChildElement(GROUP_EL); if (EPOXYLIKE_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ locant.detach(); OpsinTools.insertBefore(group, locant); return true; } } } return false; } /** * Detects multiplicative nomenclature. 
If it does then the locant will be moved, changed to a multiplicative locant and true will be returned * @param locant * @param locantValues * @param finalSubOrRootInWord * @return */ private boolean detectMultiplicativeNomenclature(Element locant, String[] locantValues, Element finalSubOrRootInWord) { int count =locantValues.length; Element multiplier = finalSubOrRootInWord.getChild(0); if (finalSubOrRootInWord.getParent().getName().equals(BRACKET_EL)){//e.g. 1,1'-ethynediylbis(1-cyclopentanol) if (!multiplier.getName().equals(MULTIPLIER_EL)){ multiplier = finalSubOrRootInWord.getParent().getChild(0); } else{ Element elAfterMultiplier = OpsinTools.getNextSibling(multiplier); String elName = elAfterMultiplier.getName(); if (elName.equals(HETEROATOM_EL) || elName.equals(SUBTRACTIVEPREFIX_EL)|| (elName.equals(HYDRO_EL) && !elAfterMultiplier.getValue().startsWith("per"))|| elName.equals(FUSEDRINGBRIDGE_EL)) { multiplier = finalSubOrRootInWord.getParent().getChild(0); } } } Element commonParent = locant.getParent().getParent();//this should be a common parent of the multiplier in front of the root. 
If it is not, then this locant is in a different scope Element parentOfMultiplier =multiplier.getParent(); while (parentOfMultiplier!=null){ if (commonParent.equals(parentOfMultiplier)){ if (locantValues[count-1].endsWith("'") && multiplier.getName().equals(MULTIPLIER_EL) && !OpsinTools.getNextSibling(multiplier).getName().equals(MULTIPLICATIVELOCANT_EL) && Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)) == count ){//multiplicative nomenclature locant.setName(MULTIPLICATIVELOCANT_EL); locant.detach(); OpsinTools.insertAfter(multiplier, locant); return true; } } parentOfMultiplier = parentOfMultiplier.getParent(); } return false; } private void applyDLPrefixes(Element subOrRoot) throws ComponentGenerationException { List<Element> dlStereochemistryEls = OpsinTools.getChildElementsWithTagNameAndAttribute(subOrRoot, STEREOCHEMISTRY_EL, TYPE_ATR, DLSTEREOCHEMISTRY_TYPE_VAL); for (Element dlStereochemistry : dlStereochemistryEls) { String dlStereochemistryValue = dlStereochemistry.getAttributeValue(VALUE_ATR); Element elementToApplyTo = OpsinTools.getNextSibling(dlStereochemistry); if (elementToApplyTo == null){ continue; } String type = elementToApplyTo.getAttributeValue(TYPE_ATR); if (OPTICALROTATION_TYPE_VAL.equals(type)) { elementToApplyTo = OpsinTools.getNextSibling(elementToApplyTo); if (elementToApplyTo == null) { continue; } type = elementToApplyTo.getAttributeValue(TYPE_ATR); } if (AMINOACID_TYPE_VAL.equals(type)) { if (!applyDlStereochemistryToAminoAcid(elementToApplyTo, dlStereochemistryValue)){ continue; } } else if (CARBOHYDRATE_TYPE_VAL.equals(type)) { applyDlStereochemistryToCarbohydrate(elementToApplyTo, dlStereochemistryValue); } else if (CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL.equals(type)) { applyDlStereochemistryToCarbohydrateConfigurationalPrefix(elementToApplyTo, dlStereochemistryValue); } else{ continue; } dlStereochemistry.detach(); } } boolean applyDlStereochemistryToAminoAcid(Element aminoAcidEl, String dlStereochemistryValue) throws 
ComponentGenerationException { Fragment aminoAcid = aminoAcidEl.getFrag(); List<Atom> atomList = aminoAcid.getAtomList(); List<Atom> atomsWithParities = new ArrayList<>(); for (Atom atom : atomList) { if (atom.getAtomParity() != null) { atomsWithParities.add(atom); } } if (atomsWithParities.isEmpty()) { //achiral amino acid... but may become chiral after substitution return false; } if (dlStereochemistryValue.equals("dl")){ int grpnum = ++state.numRacGrps; for (Atom atom : atomsWithParities) { atom.setStereoGroup(new StereoGroup(StereoGroupType.Rac, grpnum)); } } else{ boolean invert; if (dlStereochemistryValue.equals("l") || dlStereochemistryValue.equals("ls")){ invert = false; } else if (dlStereochemistryValue.equals("d") || dlStereochemistryValue.equals("ds")){ invert = true; } else{ throw new RuntimeException("OPSIN bug: Unexpected value for D/L stereochemistry found before amino acid: " + dlStereochemistryValue ); } if ("yes".equals(aminoAcidEl.getAttributeValue(NATURALENTISOPPOSITE_ATR))){ invert = !invert; } if (invert) { for (Atom atom : atomsWithParities) { atom.getAtomParity().setParity(-atom.getAtomParity().getParity()); } } } return true; } void applyDlStereochemistryToCarbohydrate(Element carbohydrateEl, String dlStereochemistryValue) throws ComponentGenerationException { Fragment carbohydrate = carbohydrateEl.getFrag(); List<Atom> atomList = carbohydrate.getAtomList(); List<Atom> atomsWithParities = new ArrayList<>(); for (Atom atom : atomList) { if (atom.getAtomParity()!=null){ atomsWithParities.add(atom); } } if (atomsWithParities.isEmpty()){ throw new ComponentGenerationException("D/L stereochemistry :" + dlStereochemistryValue + " found before achiral carbohydrate");//sounds like a vocab bug... 
} StereoGroupType grp; int grpnum = 0; boolean invert; if (dlStereochemistryValue.equals("dl")){ invert = false; grp = StereoGroupType.Rac; grpnum = ++state.numRacGrps; } else if (dlStereochemistryValue.equals("d") || dlStereochemistryValue.equals("dg")){ invert = false; grp = StereoGroupType.Abs; } else if (dlStereochemistryValue.equals("l") || dlStereochemistryValue.equals("lg")){ invert = true; grp = StereoGroupType.Abs; } else{ throw new ComponentGenerationException("Unexpected value for D/L stereochemistry found before carbohydrate: " + dlStereochemistryValue ); } if ("yes".equals(carbohydrateEl.getAttributeValue(NATURALENTISOPPOSITE_ATR))){ invert = !invert; } if (invert) { for (Atom atom : atomsWithParities) { atom.getAtomParity().setParity(-atom.getAtomParity().getParity()); atom.setStereoGroup(new StereoGroup(grp, grpnum)); } } else if (grp != StereoGroupType.Abs) { for (Atom atom : atomsWithParities) { atom.setStereoGroup(new StereoGroup(grp, grpnum)); } } } static void applyDlStereochemistryToCarbohydrateConfigurationalPrefix(Element elementToApplyTo, String dlStereochemistryValue) throws ComponentGenerationException { if (dlStereochemistryValue.equals("d") || dlStereochemistryValue.equals("dg")) { //do nothing, D- is implicit } else if (dlStereochemistryValue.equals("l") || dlStereochemistryValue.equals("lg")) { String[] values = elementToApplyTo.getAttributeValue(VALUE_ATR).split("/", -1); StringBuilder sb = new StringBuilder(); for (String value : values) { if (value.equals("r")){ sb.append("l"); } else if (value.equals("l")){ sb.append("r"); } else{ throw new RuntimeException("OPSIN Bug: Invalid carbohydrate prefix value: " + elementToApplyTo.getAttributeValue(VALUE_ATR)); } sb.append("/"); } String newVal = sb.toString().substring(0, sb.length()-1); elementToApplyTo.getAttribute(VALUE_ATR).setValue(newVal); } else if (dlStereochemistryValue.equals("dl")) { String[] values = elementToApplyTo.getAttributeValue(VALUE_ATR).split("/"); String newVal = 
"?" + StringTools.multiplyString("/?", values.length-1); elementToApplyTo.getAttribute(VALUE_ATR).setValue(newVal); } else{ throw new ComponentGenerationException("Unexpected value for D/L stereochemistry found before carbohydrate prefix: " + dlStereochemistryValue ); } } /** * Cyclises carbohydrates and regularises their suffixes * @param subOrRoot * @throws StructureBuildingException */ private void processCarbohydrates(Element subOrRoot) throws StructureBuildingException { List carbohydrates = OpsinTools.getChildElementsWithTagNameAndAttribute(subOrRoot, GROUP_EL, TYPE_ATR, CARBOHYDRATE_TYPE_VAL); for (Element carbohydrate : carbohydrates) { Fragment carbohydrateFrag = carbohydrate.getFrag(); String subtype = carbohydrate.getAttributeValue(SUBTYPE_ATR); boolean isAldose; if (CARBOHYDRATESTEMKETOSE_SUBTYPE_VAL.equals(subtype)){ isAldose = false; } else if (CARBOHYDRATESTEMALDOSE_SUBTYPE_VAL.equals(subtype) || SYSTEMATICCARBOHYDRATESTEMALDOSE_SUBTYPE_VAL.equals(subtype)){ isAldose = true; } else{ Attribute anomericId = carbohydrate.getAttribute(SUFFIXAPPLIESTO_ATR); if (anomericId != null){ Atom anomericCarbon = carbohydrateFrag.getAtomByID(carbohydrateFrag.getIdOfFirstAtom() + Integer.parseInt(anomericId.getValue()) -1); if (APIOFURANOSE_SUBTYPE_VAL.equals(subtype)) { Element suffix = getNextSibling(carbohydrate); if (suffix != null && suffix.getName().equals(SUFFIX_EL)) { suffix.addAttribute(new Attribute(LOCANTID_ATR, String.valueOf(anomericCarbon.getID()))); } } else { applyAlphaBetaStereoToCyclisedCarbohydrate(carbohydrate, anomericCarbon); } carbohydrate.removeAttribute(anomericId); } //trivial carbohydrates don't have suffixes (except apiofuranoses) continue; } boolean cyclisationPerformed = false; Attribute anomericId = carbohydrate.getAttribute(SUFFIXAPPLIESTO_ATR); if (anomericId == null){ throw new StructureBuildingException("OPSIN bug: Missing suffixAppliesTo on: " + carbohydrate.getValue()); } Atom potentialCarbonyl = 
carbohydrateFrag.getAtomByID(carbohydrateFrag.getIdOfFirstAtom() + Integer.parseInt(anomericId.getValue()) -1); if (potentialCarbonyl == null){ throw new StructureBuildingException("OPSIN bug: " + anomericId.getValue() + " did not point to an atom on: " + carbohydrate.getValue()); } carbohydrate.removeAttribute(anomericId); Element nextSibling = OpsinTools.getNextSibling(carbohydrate); while (nextSibling !=null){ Element nextNextSibling = OpsinTools.getNextSibling(nextSibling); String elName = nextSibling.getName(); if (elName.equals(SUFFIX_EL)){ Element suffix = nextSibling; String value = suffix.getAttributeValue(VALUE_ATR); if (value.equals("dialdose") || value.equals("aric acid") || value.equals("arate")){ if (!isAldose){ throw new StructureBuildingException(value + " may only be used with aldoses"); } if (cyclisationPerformed){ throw new StructureBuildingException("OPSIN bug: " + value + " not expected after carbohydrate cycliser"); } processAldoseDiSuffix(value, carbohydrate, potentialCarbonyl); suffix.detach(); } else if (value.startsWith("uron")){ //strictly these are also aldose di suffixes but in practice they are also used on ketoses suffix.addAttribute(new Attribute(LOCANT_ATR, String.valueOf(carbohydrateFrag.getChainLength()))); } else if (!cyclisationPerformed && (value.equals("ulose") || value.equals("osulose"))){ if (value.equals("ulose")){ isAldose = false; if (SYSTEMATICCARBOHYDRATESTEMALDOSE_SUBTYPE_VAL.equals(subtype)){ carbohydrate.getAttribute(SUBTYPE_ATR).setValue(SYSTEMATICCARBOHYDRATESTEMKETOSE_SUBTYPE_VAL); } } potentialCarbonyl = processUloseSuffix(carbohydrate, suffix, potentialCarbonyl); suffix.detach(); } else if (value.equals("itol") || value.equals("yl") || value.equals("glycoside")){ suffix.addAttribute(new Attribute(LOCANT_ATR, potentialCarbonyl.getFirstLocant())); if (value.equals("glycoside") && OpsinTools.getParentWordRule(subOrRoot).getAttributeValue(WORDRULE_ATR).equals(WordRule.simple.toString())){ throw new 
StructureBuildingException("A glycoside requires a space-separated substituent e.g. methyl alpha-D-glucopyranoside"); } } } else if (elName.equals(CARBOHYDRATERINGSIZE_EL)){ if (cyclisationPerformed){ throw new StructureBuildingException("OPSIN bug: Carbohydrate cyclised twice!"); } Element ringSize = nextSibling; cycliseCarbohydrateAndApplyAlphaBetaStereo(carbohydrate, ringSize, potentialCarbonyl); ringSize.detach(); cyclisationPerformed = true; } else if (!elName.equals(LOCANT_EL) && !elName.equals(MULTIPLIER_EL) && !elName.equals(UNSATURATOR_EL) && !elName.equals(COLONORSEMICOLONDELIMITEDLOCANT_EL)){ break; } nextSibling = nextNextSibling; } if (!cyclisationPerformed){ applyUnspecifiedRingSizeCyclisationIfPresent(carbohydrate, potentialCarbonyl); } } } private void applyUnspecifiedRingSizeCyclisationIfPresent(Element group, Atom potentialCarbonyl) throws StructureBuildingException { boolean cyclise = false; Element possibleYl = OpsinTools.getNextSibling(group); if (possibleYl != null && possibleYl.getName().equals(SUFFIX_EL)){ if (possibleYl.getAttributeValue(VALUE_ATR).equals("yl")){ cyclise = true; } else { //(on|uron)osyl possibleYl = OpsinTools.getNextSibling(possibleYl); if (possibleYl != null && possibleYl.getName().equals(SUFFIX_EL) && possibleYl.getAttributeValue(VALUE_ATR).equals("yl")) { cyclise = true; } } } if (!cyclise) { Element alphaOrBetaLocantEl = OpsinTools.getPreviousSiblingIgnoringCertainElements(group, new String[]{STEREOCHEMISTRY_EL}); if (alphaOrBetaLocantEl != null && alphaOrBetaLocantEl.getName().equals(LOCANT_EL) ){ String value = alphaOrBetaLocantEl.getValue(); if (value.equals("alpha") || value.equals("beta") || value.equals("alpha,beta") || value.equals("beta,alpha")){ cyclise = true; } } } if (cyclise) { Element ringSize = new TokenEl(CARBOHYDRATERINGSIZE_EL); String sugarStem = group.getValue(); if (group.getFrag().hasLocant("5") && !sugarStem.equals("rib") && !sugarStem.equals("fruct")){ ringSize.addAttribute(new 
Attribute(VALUE_ATR, "6")); } else{ ringSize.addAttribute(new Attribute(VALUE_ATR, "5")); } OpsinTools.insertAfter(group, ringSize); cycliseCarbohydrateAndApplyAlphaBetaStereo(group, ringSize, potentialCarbonyl); ringSize.detach(); } } /** * Indicates that the compound is a ketose. * This may take the form of replacement of the aldose functionality with ketose functionality or the addition of ketose functionality * The carbonyl may be subsequently used in cyclisation e.g. non-2-ulopyranose * The potential carbonyl atom is returned * @param group * @param suffix * @param potentialCarbonyl * @return * @throws StructureBuildingException */ private Atom processUloseSuffix(Element group, Element suffix, Atom potentialCarbonyl) throws StructureBuildingException { List<String> locantsToConvertToKetones = new ArrayList<>(); Element potentialLocantOrMultiplier = OpsinTools.getPreviousSibling(suffix); if (potentialLocantOrMultiplier.getName().equals(MULTIPLIER_EL)){ int multVal = Integer.parseInt(potentialLocantOrMultiplier.getAttributeValue(VALUE_ATR)); Element locant = OpsinTools.getPreviousSibling(potentialLocantOrMultiplier); if (locant != null && locant.getName().equals(LOCANT_EL)){ String[] locantStrs = locant.getValue().split(","); if (locantStrs.length != multVal) { throw new StructureBuildingException("Mismatch between locant and multiplier counts (" + locantStrs.length + " and " + multVal + "):" + locant.getValue()); } Collections.addAll(locantsToConvertToKetones, locantStrs); locant.detach(); } else{ for (int i = 0; i < multVal; i++) { locantsToConvertToKetones.add(String.valueOf(i + 2)); } } potentialLocantOrMultiplier.detach(); } else { Element locant = potentialLocantOrMultiplier; if (!locant.getName().equals(LOCANT_EL)){ locant = OpsinTools.getPreviousSibling(group); } if (locant !=null && locant.getName().equals(LOCANT_EL)){ String locantStr = locant.getValue(); if (locantStr.split(",").length==1){ locantsToConvertToKetones.add(locantStr); } else{ throw new 
StructureBuildingException("Incorrect number of locants for ul suffix: " + locantStr); } locant.detach(); } else{ locantsToConvertToKetones.add("2"); } } Fragment frag = group.getFrag(); if (suffix.getAttributeValue(VALUE_ATR).equals("ulose")) {//convert aldose to ketose Atom aldehydeAtom = potentialCarbonyl; boolean foundBond = false; for (Bond bond : aldehydeAtom.getBonds()) { if (bond.getOrder() ==2){ Atom otherAtom = bond.getOtherAtom(aldehydeAtom); if (otherAtom.getElement() == ChemEl.O && otherAtom.getCharge()==0 && otherAtom.getBondCount()==1){ bond.setOrder(1); foundBond = true; break; } } } if (!foundBond){ throw new StructureBuildingException("OPSIN bug: Unable to convert aldose to ketose"); } Atom backboneAtom = frag.getAtomByLocantOrThrow(locantsToConvertToKetones.get(0)); potentialCarbonyl = backboneAtom; } for (String locantStr : locantsToConvertToKetones) { Atom backboneAtom = frag.getAtomByLocantOrThrow(locantStr); boolean foundBond = false; for (Bond bond : backboneAtom.getBonds()) { if (bond.getOrder() ==1){ Atom otherAtom = bond.getOtherAtom(backboneAtom); if (otherAtom.getElement() == ChemEl.O && otherAtom.getCharge()==0 && otherAtom.getBondCount()==1){ bond.setOrder(2); foundBond = true; break; } } } if (!foundBond){ throw new StructureBuildingException("Failed to find hydroxy group at position:" + locantStr); } backboneAtom.setAtomParity(null); } return potentialCarbonyl; } /** * Cyclises carbohydrate configuration prefixes according to the ring size indicator * Alpha/beta stereochemistry is then applied if present * @param carbohydrateGroup * @param ringSize * @param potentialCarbonyl * @throws StructureBuildingException */ private void cycliseCarbohydrateAndApplyAlphaBetaStereo(Element carbohydrateGroup, Element ringSize, Atom potentialCarbonyl) throws StructureBuildingException { Fragment frag = carbohydrateGroup.getFrag(); String ringSizeVal = ringSize.getAttributeValue(VALUE_ATR); Element potentialLocant = 
OpsinTools.getPreviousSibling(ringSize);
		Atom carbonylCarbon = null;
		Atom atomToJoinWith = null;
		if (potentialLocant.getName().equals(LOCANT_EL)){
			String[] locants = potentialLocant.getValue().split(",");
			if (locants.length != 2){
				throw new StructureBuildingException("Expected 2 locants in front of sugar ring size specifier but found: " + potentialLocant.getValue());
			}
			try{
				int firstLocant = Integer.parseInt(locants[0]);
				int secondLocant = Integer.parseInt(locants[1]);
				if (Math.abs(secondLocant - firstLocant) != (Integer.parseInt(ringSizeVal) -2)){
					throw new StructureBuildingException("Mismatch between ring size: " + ringSizeVal + " and ring size specified by locants: " + (Math.abs(secondLocant - firstLocant) + 2) );
				}
			}
			catch (NumberFormatException e){
				throw new StructureBuildingException("Locants for ring should be numeric but were: " + potentialLocant.getValue());
			}
			carbonylCarbon = frag.getAtomByLocantOrThrow(locants[0]);
			atomToJoinWith = frag.getAtomByLocantOrThrow("O" + locants[1]);
			potentialLocant.detach();
		}
		if (carbonylCarbon == null){
			carbonylCarbon = potentialCarbonyl;
			if (carbonylCarbon ==null){
				throw new RuntimeException("OPSIN bug: Could not find carbonyl carbon in carbohydrate");
			}
		}

		for (Bond b: carbonylCarbon.getBonds()) {
			if (b.getOrder()==2){
				b.setOrder(1);
				break;
			}
		}
		int locantOfCarbonyl;
		try{
			locantOfCarbonyl = Integer.parseInt(carbonylCarbon.getFirstLocant());
		}
		catch (Exception e) {
			throw new RuntimeException("OPSIN bug: Could not determine locant of carbonyl carbon in carbohydrate", e);
		}
		if (atomToJoinWith ==null){
			String locantToJoinWith = String.valueOf(locantOfCarbonyl + Integer.parseInt(ringSizeVal) -2);
			atomToJoinWith =frag.getAtomByLocant("O" +locantToJoinWith);
			if (atomToJoinWith ==null){
				throw new StructureBuildingException("Carbohydrate was not an appropriate length to form a ring of size: " + ringSizeVal);
			}
		}
		state.fragManager.createBond(carbonylCarbon, atomToJoinWith, 1);
		CycleDetector.assignWhetherAtomsAreInCycles(frag);
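		/*
		 * Worked example of the ring-closure arithmetic above. This is an illustrative sketch only;
		 * the sugar classes named here are assumptions chosen for the example, not cases taken from this file.
		 *   ring oxygen locant = locantOfCarbonyl + ringSize - 2
		 *   aldose, pyranose form (carbonyl C1, ring size 6): 1 + 6 - 2 = 5, so C1 is bonded to O5
		 *   aldose, furanose form (carbonyl C1, ring size 5): 1 + 5 - 2 = 4, so C1 is bonded to O4
		 *   2-ketose, pyranose form (carbonyl C2, ring size 6): 2 + 6 - 2 = 6, so C2 is bonded to O6
		 * The explicit-locant check enforces the same relation: locants "1,5" differ by 4,
		 * which must equal ringSize - 2 for a six-membered ring.
		 */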
applyAlphaBetaStereoToCyclisedCarbohydrate(carbohydrateGroup, carbonylCarbon); } private void applyAlphaBetaStereoToCyclisedCarbohydrate(Element carbohydrateGroup, Atom carbonylCarbon) { Fragment frag = carbohydrateGroup.getFrag(); Element alphaOrBetaLocantEl = OpsinTools.getPreviousSiblingIgnoringCertainElements(carbohydrateGroup, new String[]{STEREOCHEMISTRY_EL}); if (alphaOrBetaLocantEl !=null && alphaOrBetaLocantEl.getName().equals(LOCANT_EL)){ Element stereoPrefixAfterAlphaBeta = OpsinTools.getNextSibling(alphaOrBetaLocantEl); Atom anomericReferenceAtom = getAnomericReferenceAtom(frag); if (anomericReferenceAtom ==null){ throw new RuntimeException("OPSIN bug: Unable to determine anomeric reference atom in: " +carbohydrateGroup.getValue()); } applyAnomerStereochemistryIfPresent(alphaOrBetaLocantEl, carbonylCarbon, anomericReferenceAtom); if (carbonylCarbon.getAtomParity() !=null && (SYSTEMATICCARBOHYDRATESTEMALDOSE_SUBTYPE_VAL.equals(carbohydrateGroup.getAttributeValue(SUBTYPE_ATR)) || SYSTEMATICCARBOHYDRATESTEMKETOSE_SUBTYPE_VAL.equals(carbohydrateGroup.getAttributeValue(SUBTYPE_ATR)))){ //systematic chains only have their stereochemistry defined after structure building to account for the fact that some stereocentres may be removed //hence inspect the stereoprefix to see if it is L and flip if so String val = stereoPrefixAfterAlphaBeta.getAttributeValue(VALUE_ATR); if (val.substring(val.length() -1 , val.length()).equals("l")){//"r" if D, "l" if L //flip if L AtomParity atomParity = carbonylCarbon.getAtomParity(); atomParity.setParity(-atomParity.getParity()); } } } carbonylCarbon.setProperty(Atom.ISANOMERIC, true); } private void processAldoseDiSuffix(String suffixValue, Element group, Atom aldehydeAtom) throws StructureBuildingException { Fragment frag = group.getFrag(); Atom alcoholAtom = frag.getAtomByLocantOrThrow(String.valueOf(frag.getChainLength())); if (suffixValue.equals("aric acid") || suffixValue.equals("arate")){ 
FragmentTools.removeTerminalOxygen(state, alcoholAtom, 1); Fragment f = state.fragManager.buildSMILES("O", group, NONE_LABELS_VAL); state.fragManager.incorporateFragment(f, f.getFirstAtom(), frag, alcoholAtom, 2); f = state.fragManager.buildSMILES("O", group, NONE_LABELS_VAL); Atom hydroxyAtom = f.getFirstAtom(); if (suffixValue.equals("arate")){ hydroxyAtom.addChargeAndProtons(-1, -1); } state.fragManager.incorporateFragment(f, f.getFirstAtom(), frag, alcoholAtom, 1); frag.addFunctionalAtom(hydroxyAtom); f = state.fragManager.buildSMILES("O", group, NONE_LABELS_VAL); hydroxyAtom = f.getFirstAtom(); if (suffixValue.equals("arate")){ hydroxyAtom.addChargeAndProtons(-1, -1); } state.fragManager.incorporateFragment(f, f.getFirstAtom(), frag, aldehydeAtom, 1); frag.addFunctionalAtom(hydroxyAtom); } else if (suffixValue.equals("dialdose")){ FragmentTools.removeTerminalOxygen(state, alcoholAtom, 1); Fragment f = state.fragManager.buildSMILES("O", group, NONE_LABELS_VAL); state.fragManager.incorporateFragment(f, f.getFirstAtom(), frag, alcoholAtom, 2); } else{ throw new IllegalArgumentException("OPSIN Bug: Unexpected suffix value: " + suffixValue); } } /** * Gets the configurationalAtom currently i.e. 
the defined stereocentre with the highest locant * @param frag * @return */ private Atom getAnomericReferenceAtom(Fragment frag){ List atomList = frag.getAtomList(); int highestLocantfound = Integer.MIN_VALUE; Atom configurationalAtom = null; for (Atom a : atomList) { if (a.getAtomParity()==null){ continue; } try{ String locant = a.getFirstLocant(); if (locant !=null) { int intVal = Integer.parseInt(locant); if (intVal > highestLocantfound){ highestLocantfound = intVal; configurationalAtom = a; } } } catch (NumberFormatException e) { //may throw number format exceptions } } return configurationalAtom; } private void applyAnomerStereochemistryIfPresent(Element alphaOrBetaLocantEl, Atom anomericAtom, Atom anomericReferenceAtom) { String value = alphaOrBetaLocantEl.getValue(); if (value.equals("alpha") || value.equals("beta")){ Atom[] referenceAtomRefs4 = getDeterministicAtomRefs4ForReferenceAtom(anomericReferenceAtom); boolean flip = StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(referenceAtomRefs4, 1, anomericReferenceAtom.getAtomParity().getAtomRefs4(), anomericReferenceAtom.getAtomParity().getParity()); Atom[] atomRefs4 = getDeterministicAtomRefs4ForAnomericAtom(anomericAtom); if (flip){ if (value.equals("alpha")){ anomericAtom.setAtomParity(atomRefs4, 1); } else{ anomericAtom.setAtomParity(atomRefs4, -1); } } else{ if (value.equals("alpha")){ anomericAtom.setAtomParity(atomRefs4, -1); } else{ anomericAtom.setAtomParity(atomRefs4, 1); } } alphaOrBetaLocantEl.detach(); } else if (value.equals("alpha,beta") || value.equals("beta,alpha")){ //unspecified stereochemistry alphaOrBetaLocantEl.detach(); } } private Atom[] getDeterministicAtomRefs4ForReferenceAtom(Atom referenceAtom) { List neighbours = referenceAtom.getAtomNeighbours(); if (neighbours.size()!=3){ throw new RuntimeException("OPSIN bug: Unexpected number of atoms connected to anomeric reference atom of carbohydrate"); } String nextLowestLocant = 
String.valueOf(Integer.parseInt(referenceAtom.getFirstLocant()) -1);
		Atom[] atomRefs4 = new Atom[4];
		for (Atom neighbour : neighbours) {
			if (neighbour.getElement() == ChemEl.O) {
				atomRefs4[0] = neighbour;
			}
			else if (neighbour.getElement() == ChemEl.C) {
				if (neighbour.getFirstLocant().equals(nextLowestLocant)){
					atomRefs4[1] = neighbour;
				}
				else {
					atomRefs4[2] = neighbour;
				}
			}
			else{
				throw new RuntimeException("OPSIN bug: Unexpected atom element type connected to anomeric reference atom");
			}
		}
		atomRefs4[3] = AtomParity.hydrogen;
		for (Atom atom : atomRefs4) {
			if (atom ==null){
				throw new RuntimeException("OPSIN bug: Unable to determine atomRefs4 for anomeric reference atom");
			}
		}
		return atomRefs4;
	}

	private Atom[] getDeterministicAtomRefs4ForAnomericAtom(Atom anomericAtom) {
		List<Atom> neighbours = anomericAtom.getAtomNeighbours();
		Atom[] atomRefs4 = new Atom[4];
		if (neighbours.size() == 3 || neighbours.size() == 4 ){
			//normal case
		}
		else if (neighbours.size() == 2 && anomericAtom.getOutValency() == 1) {
			//trivial glycosyl
			atomRefs4[1] = AtomParity.deoxyHydrogen;
		}
		else {
			throw new RuntimeException("OPSIN bug: Unexpected number of atoms connected to anomeric atom of carbohydrate");
		}
		for (Atom neighbour : neighbours) {
			if (neighbour.getElement() == ChemEl.C){
				if (neighbour.getAtomIsInACycle()){
					atomRefs4[0] = neighbour;
				}
				else{
					atomRefs4[3] = neighbour;
				}
			}
			else if (neighbour.getElement() == ChemEl.O){
				int incomingVal =neighbour.getIncomingValency();
				if (incomingVal ==1){
					atomRefs4[1] = neighbour;
				}
				else if (incomingVal ==2){
					atomRefs4[2] = neighbour;
				}
				else{
					throw new RuntimeException("OPSIN bug: Unexpected valency on oxygen in carbohydrate");
				}
			}
			else{
				throw new RuntimeException("OPSIN bug: Unexpected atom element type connected to anomeric atom of carbohydrate");
			}
		}
		if (atomRefs4[3]==null){
			atomRefs4[3] = AtomParity.hydrogen;
		}
		for (Atom atom : atomRefs4) {
			if (atom ==null){
				throw new RuntimeException("OPSIN bug: Unable to assign anomeric carbon stereochemistry on carbohydrate");
			}
		}
		return atomRefs4;
	}

	/** Look for multipliers, and multiply out suffixes/unsaturators/heteroatoms/hydros.
	 * Locants are assigned if the number of locants matches the multiplier
	 * associated with them. E.g. triol -> ololol.
	 * Note that infix multiplication is handled separately as resolution of suffixes is required to perform this unambiguously
	 * @param subOrRoot The substituent/root to look for multipliers in.
	 */
	private void processMultipliers(Element subOrRoot) {
		List<Element> multipliers = subOrRoot.getChildElements(MULTIPLIER_EL);
		for (Element multiplier : multipliers) {
			Element possibleLocant = OpsinTools.getPreviousSibling(multiplier);
			String[] locants = null;
			if (possibleLocant != null){
				String possibleLocantElName = possibleLocant.getName();
				if (possibleLocantElName.equals(LOCANT_EL)){
					locants = possibleLocant.getValue().split(",");
				}
				else if (possibleLocantElName.equals(COLONORSEMICOLONDELIMITEDLOCANT_EL)){
					locants = StringTools.removeDashIfPresent(possibleLocant.getValue()).split(":");
				}
			}
			Element featureToMultiply = OpsinTools.getNextSibling(multiplier);
			String nextName = featureToMultiply.getName();
			if(nextName.equals(UNSATURATOR_EL) ||
					nextName.equals(SUFFIX_EL) ||
					nextName.equals(SUBTRACTIVEPREFIX_EL) ||
					(nextName.equals(HETEROATOM_EL) && !GROUP_TYPE_VAL.equals(multiplier.getAttributeValue(TYPE_ATR))) ||
					nextName.equals(HYDRO_EL)) {
				int mvalue = Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR));
				if (mvalue>1){
					featureToMultiply.addAttribute(new Attribute(MULTIPLIED_ATR, "multiplied"));
				}
				for(int i= mvalue -1; i >=1; i--) {
					Element newElement = featureToMultiply.copy();
					if (locants !=null && locants.length==mvalue){
						newElement.addAttribute(new Attribute(LOCANT_ATR, locants[i]));
					}
					OpsinTools.insertAfter(featureToMultiply, newElement);
				}
				multiplier.detach();
				if (locants !=null && locants.length==mvalue){
					featureToMultiply.addAttribute(new Attribute(LOCANT_ATR, locants[0]));
					possibleLocant.detach();
				}
			}
		}
	}

	/**
	 * Converts group
elements that are identified as being conjunctive suffixes to CONJUNCTIVESUFFIXGROUP_EL * and labels them appropriately. Any suffixes that the conjunctive suffix may have are resolved onto it * @param subOrRoot * @param allGroups * @throws ComponentGenerationException * @throws StructureBuildingException */ private void detectConjunctiveSuffixGroups(Element subOrRoot, List allGroups) throws ComponentGenerationException, StructureBuildingException { List groups = subOrRoot.getChildElements(GROUP_EL); if (groups.size() > 1) { List conjunctiveGroups = new ArrayList<>(); Element ringGroup =null; for (int i = groups.size() -1 ; i >=0; i--) { Element group =groups.get(i); if (!group.getAttributeValue(TYPE_ATR).equals(RING_TYPE_VAL)){//e.g. the methanol in benzenemethanol. conjunctiveGroups.add(group); } else{ ringGroup =group; break; } } if (conjunctiveGroups.isEmpty()){ return; } if (ringGroup ==null){ throw new ComponentGenerationException("OPSIN bug: unable to find ring associated with conjunctive suffix group"); } if (conjunctiveGroups.size()!=1){ throw new ComponentGenerationException("OPSIN Bug: Two groups exactly should be present at this point when processing conjunctive nomenclature"); } Element primaryConjunctiveGroup =conjunctiveGroups.get(0); Fragment primaryConjunctiveFrag = primaryConjunctiveGroup.getFrag(); //remove all locants List atomList = primaryConjunctiveFrag.getAtomList(); for (Atom atom : atomList) { atom.clearLocants(); } List suffixes = new ArrayList<>(); Element possibleSuffix = OpsinTools.getNextSibling(primaryConjunctiveGroup); while (possibleSuffix !=null){ if (possibleSuffix.getName().equals(SUFFIX_EL)){ suffixes.add(possibleSuffix); } possibleSuffix = OpsinTools.getNextSibling(possibleSuffix); } preliminaryProcessSuffixes(primaryConjunctiveGroup, suffixes); suffixApplier.resolveSuffixes(primaryConjunctiveGroup, suffixes); for (Element suffix : suffixes) { suffix.detach(); } primaryConjunctiveGroup.setName(CONJUNCTIVESUFFIXGROUP_EL); 
allGroups.remove(primaryConjunctiveGroup); Element possibleMultiplier = OpsinTools.getPreviousSibling(primaryConjunctiveGroup); //label atoms appropriately boolean alphaIsPosition1 = atomList.get(0).getIncomingValency() < 3; int counter =0; for (int i = (alphaIsPosition1 ? 0 : 1); i < atomList.size(); i++) { Atom a = atomList.get(i); if (counter==0){ a.addLocant("alpha"); } else if (counter==1){ a.addLocant("beta"); } else if (counter==2){ a.addLocant("gamma"); } else if (counter==3){ a.addLocant("delta"); } else if (counter==4){ a.addLocant("epsilon"); } else if (counter==5){ a.addLocant("zeta"); } else if (counter==6){ a.addLocant("eta"); } counter++; } if (MULTIPLIER_EL.equals(possibleMultiplier.getName())){ int multiplier = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); for (int i = 1; i < multiplier; i++) { Element conjunctiveSuffixGroup = primaryConjunctiveGroup.copy(); Fragment newFragment = state.fragManager.copyAndRelabelFragment(primaryConjunctiveFrag, i); newFragment.setTokenEl(conjunctiveSuffixGroup); conjunctiveSuffixGroup.setFrag(newFragment); conjunctiveGroups.add(conjunctiveSuffixGroup); OpsinTools.insertAfter(primaryConjunctiveGroup, conjunctiveSuffixGroup); } Element possibleLocant = OpsinTools.getPreviousSibling(possibleMultiplier); possibleMultiplier.detach(); if (possibleLocant.getName().equals(LOCANT_EL)){ String[] locants = possibleLocant.getValue().split(","); if (locants.length!=multiplier){ throw new ComponentGenerationException("mismatch between number of locants and multiplier in conjunctive nomenclature routine"); } for (int i = 0; i < locants.length; i++) { conjunctiveGroups.get(i).addAttribute(new Attribute(LOCANT_ATR, locants[i])); } possibleLocant.detach(); } } } } /** Match each locant to the next applicable "feature". Assumes that processLocants * has done a good job and rejected cases where no match can be made. 
	 * Handles cases where the locant is next to the feature it refers to
	 *
	 * @param subOrRoot The substituent/root to look for locants in.
	 * @throws ComponentGenerationException
	 */
	private void matchLocantsToDirectFeatures(Element subOrRoot) throws ComponentGenerationException {
		List<Element> locants = subOrRoot.getChildElements(LOCANT_EL);
		List<Element> groups = subOrRoot.getChildElements(GROUP_EL);
		for (Element group : groups) {
			if (group.getAttributeValue(SUBTYPE_ATR).equals(HANTZSCHWIDMAN_SUBTYPE_VAL)){//handle Hantzsch-Widman systems
				if (group.getAttribute(ADDBOND_ATR)!=null){//special case for partunsatring
					//exception for where a locant is supposed to indicate the location of a double bond...
					List<Element> deltas = subOrRoot.getChildElements(DELTA_EL);
					if (deltas.isEmpty()){
						Element delta =new TokenEl(DELTA_EL);
						Element appropriateLocant = OpsinTools.getPreviousSiblingIgnoringCertainElements(group, new String[]{HETEROATOM_EL, MULTIPLIER_EL});
						if (appropriateLocant !=null && appropriateLocant.getName().equals(LOCANT_EL) && appropriateLocant.getValue().split(",").length == 1){
							delta.setValue(appropriateLocant.getValue());
							OpsinTools.insertBefore(appropriateLocant, delta);
							appropriateLocant.detach();
							locants.remove(appropriateLocant);
						}
						else{
							delta.setValue("");
							subOrRoot.insertChild(delta, 0);//no obvious attempt to set double bond position, potentially ambiguous, valency will be used to choose later
						}
					}
				}
				if (locants.size()>0 ){
					Element locantBeforeHWSystem = null;
					List<Element> heteroAtoms = new ArrayList<>();
					int indexOfGroup = subOrRoot.indexOf(group);
					for (int j = indexOfGroup -1; j >= 0; j--) {
						String elName = subOrRoot.getChild(j).getName();
						if (elName.equals(LOCANT_EL)){
							locantBeforeHWSystem = subOrRoot.getChild(j);
							break;
						}
						else if(elName.equals(HETEROATOM_EL)){
							Element heteroAtom = subOrRoot.getChild(j);
							heteroAtoms.add(heteroAtom);
							if (heteroAtom.getAttribute(LOCANT_ATR)!=null){//locants already assigned, presumably by processMultipliers
								break;
							}
						}
						else{
							break;
						}
					}
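					/*
					 * Worked example of the scan above and the assignment that follows. This is a sketch only;
					 * it assumes a name tokenised as locant "1,3", heteroatom "thia", heteroatom "aza",
					 * then the Hantzsch-Widman group (the tokenisation itself is an assumption, not shown in this file).
					 *   The backwards scan collects heteroAtoms = [aza, thia] and stops at the locant element.
					 *   Reversing the list restores name order: [thia, aza].
					 *   locantValues = ["1", "3"] has the same length as heteroAtoms, so thia receives locant 1 and aza receives locant 3.
					 * A lone locant in front of a system of 10 atoms or fewer is instead left unassigned.
					 */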
					Collections.reverse(heteroAtoms);
					if (locantBeforeHWSystem !=null){
						String[] locantValues = locantBeforeHWSystem.getValue().split(",");
						//detect a solitary locant in front of a HW system and prevent it being assigned.
						//something like 1-aziridin-1-yl never means the N is at position 1 as it is at position 1 by convention
						//this special case is not applied to pseudo HW like systems e.g. [1]oxacyclotetradecine
						if (locantValues.length ==1 && group.getFrag().getAtomCount() <=10){
							locants.remove(locantBeforeHWSystem);//don't assign this locant
						}
						else {
							if (locantValues.length == heteroAtoms.size()){
								for (int j = 0; j < locantValues.length; j++) {
									String locantValue = locantValues[j];
									heteroAtoms.get(j).addAttribute(new Attribute(LOCANT_ATR, locantValue));
								}
								locantBeforeHWSystem.detach();
								locants.remove(locantBeforeHWSystem);
							}
							else if (heteroAtoms.size()>1){
								throw new ComponentGenerationException("Mismatch between number of locants and Hantzsch-Widman heteroatoms");
							}
						}
					}
				}
			}
		}
		assignSingleLocantsToAdjacentFeatures(locants);
	}

	/**
	 * Looks for an unsaturator/suffix/heteroatom/hydro element adjacent to the given locant
	 * and if the locant element describes just 1 locant assigns it
	 * @param locants
	 */
	private void assignSingleLocantsToAdjacentFeatures(List<Element> locants) {
		for (Element locant : locants) {
			String[] locantValues = locant.getValue().split(",");
			Element referent = OpsinTools.getNextSibling(locant);
			if (referent != null && locantValues.length == 1){
				String refName = referent.getName();
				if (refName.equals(ISOTOPESPECIFICATION_EL)) {
					referent = OpsinTools.getNextSibling(referent);
					if (referent == null) {
						return;
					}
					refName = referent.getName();
				}
				//Only assigning locants to elements that were not created by a multiplier
				if(referent.getAttribute(LOCANT_ATR) == null && referent.getAttribute(MULTIPLIED_ATR) == null && (refName.equals(UNSATURATOR_EL) ||
						refName.equals(SUFFIX_EL) ||
						refName.equals(HETEROATOM_EL) ||
						refName.equals(CONJUNCTIVESUFFIXGROUP_EL) ||
refName.equals(SUBTRACTIVEPREFIX_EL) || (refName.equals(HYDRO_EL) && !referent.getValue().startsWith("per") ))) {//not perhydro referent.addAttribute(new Attribute(LOCANT_ATR, locantValues[0])); locant.detach(); } } } } /** * Handles suffixes, passes them to resolveGroupAddingSuffixes. * Processes the suffixAppliesTo command which multiplies a suffix and attaches the suffixes to the atoms described by the given IDs * @param group * @param suffixes * @throws ComponentGenerationException * @throws StructureBuildingException */ private void preliminaryProcessSuffixes(Element group, List suffixes) throws ComponentGenerationException, StructureBuildingException{ Fragment suffixableFragment = group.getFrag(); if (group.getAttribute(SUFFIXAPPLIESTO_ATR)!=null){//typically a trivial polyAcid or aminoAcid processSuffixAppliesTo(group, suffixes,suffixableFragment); } else{ for (Element suffix : suffixes) { if (suffix.getAttribute(ADDITIONALVALUE_ATR)!=null){ throw new ComponentGenerationException("suffix: " + suffix.getValue() + " used on an inappropriate group"); } } } applyDefaultLocantsToSuffixesIfApplicable(group, suffixableFragment); List suffixFragments =resolveGroupAddingSuffixes(suffixes, suffixableFragment); state.xmlSuffixMap.put(group, suffixFragments); boolean suffixesResolved =false; if (group.getAttributeValue(TYPE_ATR).equals(CHALCOGENACIDSTEM_TYPE_VAL)){//merge the suffix into the chalcogen acid stem e.g sulfonoate needs to be one fragment for infix replacement suffixApplier.resolveSuffixes(group, suffixes); suffixesResolved =true; } processSuffixPrefixes(suffixes);//e.g. carbox amide functionalReplacement.processInfixFunctionalReplacementNomenclature(suffixes, suffixFragments); processRemovalOfHydroxyGroupsRules(suffixes, suffixableFragment); if (group.getValue().equals("oxal")){//oxalic acid is treated as a non carboxylic acid for the purposes of functional replacment. 
See P-65.2.3 suffixApplier.resolveSuffixes(group, suffixes); group.getAttribute(TYPE_ATR).setValue(NONCARBOXYLICACID_TYPE_VAL); suffixesResolved =true; } if (suffixesResolved){ //suffixes have already been resolved so need to be detached to avoid being passed to resolveSuffixes later for (int i = suffixes.size() -1; i>=0; i--) { Element suffix =suffixes.remove(i); suffix.detach(); } } if (group.getAttribute(NUMBEROFFUNCTIONALATOMSTOREMOVE_ATR)!=null){ int numberToRemove = Integer.parseInt(group.getAttributeValue(NUMBEROFFUNCTIONALATOMSTOREMOVE_ATR)); if (numberToRemove > suffixableFragment.getFunctionalAtomCount()){ throw new ComponentGenerationException("Too many hydrogen for the number of positions on non carboxylic acid"); } for (int i = 0; i< numberToRemove; i++) { Atom functionalAtom = suffixableFragment.removeFunctionalAtom(0).getAtom(); functionalAtom.neutraliseCharge(); } } } private void applyDefaultLocantsToSuffixesIfApplicable(Element group, Fragment suffixableFragment) { String defaultLocantsAtrValue = group.getAttributeValue(SUFFIXAPPLIESTOBYDEFAULT_ATR); if (defaultLocantsAtrValue != null){ String[] suffixInstructions = defaultLocantsAtrValue.split(","); Element suffix = OpsinTools.getNextNonChargeSuffix(group); if (suffix !=null) { List suffixes = new ArrayList<>(); while (suffix != null) { suffixes.add(suffix); suffix = OpsinTools.getNextNonChargeSuffix(suffix); } if (suffixInstructions.length == suffixes.size()) { int firstIdInFragment = suffixableFragment.getIdOfFirstAtom(); for (int i = 0; i < suffixInstructions.length; i++) { String suffixInstruction = suffixInstructions[i]; suffixes.get(i).addAttribute(new Attribute(DEFAULTLOCANTID_ATR, Integer.toString(firstIdInFragment + Integer.parseInt(suffixInstruction) -1))); } } } } } /** * Processes the effects of the suffixAppliesTo attribute * @param group * @param suffixes * @param suffixableFragment * @throws ComponentGenerationException */ private void processSuffixAppliesTo(Element group, List 
suffixes, Fragment suffixableFragment) throws ComponentGenerationException {
		//suffixAppliesTo attribute contains instructions for number/positions of suffix
		//this is of the form comma separated ids with the number of ids corresponding to the number of instances of the suffix
		Element suffix =OpsinTools.getNextNonChargeSuffix(group);
		if (suffix ==null){
			if (group.getAttributeValue(TYPE_ATR).equals(ACIDSTEM_TYPE_VAL)){
				throw new ComponentGenerationException("No suffix where suffix was expected");
			}
		}
		else{
			if (suffixes.size()>1 && group.getAttributeValue(TYPE_ATR).equals(ACIDSTEM_TYPE_VAL)){
				throw new ComponentGenerationException("More than one suffix detected on trivial polyAcid. Not believed to be allowed");
			}
			String suffixInstruction =group.getAttributeValue(SUFFIXAPPLIESTO_ATR);
			String[] suffixInstructions = suffixInstruction.split(",");
			int firstIdInFragment=suffixableFragment.getIdOfFirstAtom();
			if (CYCLEFORMER_SUBTYPE_VAL.equals(suffix.getAttributeValue(SUBTYPE_ATR))){
				if (suffixInstructions.length !=2){
					throw new ComponentGenerationException("suffix: " + suffix.getValue() + " used on an inappropriate group");
				}
				String[] locantIds = new String[2];
				locantIds[0] = Integer.toString(firstIdInFragment + Integer.parseInt(suffixInstructions[0]) - 1);
				locantIds[1] = Integer.toString(firstIdInFragment + Integer.parseInt(suffixInstructions[1]) - 1);
				suffix.addAttribute(new Attribute(LOCANTID_ATR, StringTools.arrayToString(locantIds, ",")));
				return;
			}
			boolean symmetricSuffixes =true;
			if (suffix.getAttribute(ADDITIONALVALUE_ATR)!=null){//handles amic, aldehydic, anilic and amoyl suffixes properly
				if (suffixInstructions.length < 2){
					throw new ComponentGenerationException("suffix: " + suffix.getValue() + " used on an inappropriate group");
				}
				symmetricSuffixes = false;
			}

			if (suffix.getAttribute(LOCANT_ATR)==null){
				suffix.addAttribute(new Attribute(LOCANTID_ATR, Integer.toString(firstIdInFragment + Integer.parseInt(suffixInstructions[0]) -1)));
			}
			for (int i = 1; i <
suffixInstructions.length; i++) { Element newSuffix = new TokenEl(SUFFIX_EL); if (symmetricSuffixes){ newSuffix.addAttribute(new Attribute(VALUE_ATR, suffix.getAttributeValue(VALUE_ATR))); newSuffix.addAttribute(new Attribute(TYPE_ATR, suffix.getAttributeValue(TYPE_ATR))); if (suffix.getAttribute(SUBTYPE_ATR)!=null){ newSuffix.addAttribute(new Attribute(SUBTYPE_ATR, suffix.getAttributeValue(SUBTYPE_ATR))); } if (suffix.getAttribute(INFIX_ATR)!=null && suffix.getAttributeValue(INFIX_ATR).startsWith("=")){//clone infixes that effect double bonds but not single bonds e.g. maleamidate still should have one functional atom newSuffix.addAttribute(new Attribute(INFIX_ATR, suffix.getAttributeValue(INFIX_ATR))); } } else{ newSuffix.addAttribute(new Attribute(VALUE_ATR, suffix.getAttributeValue(ADDITIONALVALUE_ATR))); newSuffix.addAttribute(new Attribute(TYPE_ATR, ROOT_EL)); } newSuffix.addAttribute(new Attribute(LOCANTID_ATR, Integer.toString(firstIdInFragment + Integer.parseInt(suffixInstructions[i]) -1))); OpsinTools.insertAfter(suffix, newSuffix); suffixes.add(newSuffix); } } } /**Processes a suffix and returns any fragment the suffix intends to add to the molecule * @param suffixes The suffix elements for a fragment. * @param frag The fragment to which the suffix will be applied * @return An arrayList containing the generated fragments * @throws StructureBuildingException If the suffixes can't be resolved properly. 
* @throws ComponentGenerationException */ private List resolveGroupAddingSuffixes(List suffixes, Fragment frag) throws StructureBuildingException, ComponentGenerationException { List suffixFragments =new ArrayList<>(); String groupType = frag.getType(); String subgroupType = frag.getSubType(); String suffixTypeToUse =null; if (suffixApplier.isGroupTypeWithSpecificSuffixRules(groupType)){ suffixTypeToUse =groupType; } else{ suffixTypeToUse = STANDARDGROUP_TYPE_VAL; } for (Element suffix : suffixes) { String suffixValue = suffix.getAttributeValue(VALUE_ATR); boolean cyclic;//needed for addSuffixPrefixIfNonePresentAndCyclic rule Atom atomLikelyToBeUsedBySuffix = null; String locant = suffix.getAttributeValue(LOCANT_ATR); String locantId = suffix.getAttributeValue(LOCANTID_ATR); if (locant != null && locant.indexOf(',') == -1) { atomLikelyToBeUsedBySuffix = frag.getAtomByLocant(locant); } else if (locantId != null && locantId.indexOf(',') == -1) { atomLikelyToBeUsedBySuffix = frag.getAtomByIDOrThrow(Integer.parseInt(locantId)); } if (atomLikelyToBeUsedBySuffix==null){ //a locant has not been specified //also can happen in the cases of things like fused rings where the final numbering is not available so lookup by locant fails (in which case all the atoms will be cyclic anyway) atomLikelyToBeUsedBySuffix = frag.getFirstAtom(); } cyclic = atomLikelyToBeUsedBySuffix.getAtomIsInACycle(); List suffixRules = suffixApplier.getSuffixRuleTags(suffixTypeToUse, suffixValue, subgroupType); Fragment suffixFrag = null; /* * Temp fragments are build for each addGroup rule and then merged into suffixFrag */ for (SuffixRule suffixRule : suffixRules) { switch (suffixRule.getType()) { case addgroup: String labels = suffixRule.getAttributeValue(SUFFIXRULES_LABELS_ATR); if (labels == null) { labels = NONE_LABELS_VAL; } suffixFrag = state.fragManager.buildSMILES(suffixRule.getAttributeValue(SUFFIXRULES_SMILES_ATR), SUFFIX_TYPE_VAL, labels); List atomList = suffixFrag.getAtomList(); String 
functionalIdsAtr = suffixRule.getAttributeValue(SUFFIXRULES_FUNCTIONALIDS_ATR);
					if (functionalIdsAtr != null) {
						String[] relativeIdsOfFunctionalAtoms = functionalIdsAtr.split(",");
						for (String relativeId : relativeIdsOfFunctionalAtoms) {
							int atomIndice = Integer.parseInt(relativeId) -1;
							if (atomIndice >=atomList.size()){
								throw new StructureBuildingException("Check suffixRules.xml: Atom requested to have a functionalAtom was not within the suffix fragment");
							}
							suffixFrag.addFunctionalAtom(atomList.get(atomIndice));
						}
					}
					String outIdsAtr = suffixRule.getAttributeValue(SUFFIXRULES_OUTIDS_ATR);
					if (outIdsAtr != null) {
						String[] relativeIdsOfOutAtoms = outIdsAtr.split(",");
						for (String relativeId : relativeIdsOfOutAtoms) {
							int atomIndice = Integer.parseInt(relativeId) -1;
							if (atomIndice >=atomList.size()){
								throw new StructureBuildingException("Check suffixRules.xml: Atom requested to have an outAtom was not within the suffix fragment");
							}
							suffixFrag.addOutAtom(atomList.get(atomIndice), 1 , true);
						}
					}
					break;
				case addSuffixPrefixIfNonePresentAndCyclic:
					if (cyclic && suffix.getAttribute(SUFFIXPREFIX_ATR)==null){
						suffix.addAttribute(new Attribute(SUFFIXPREFIX_ATR, suffixRule.getAttributeValue(SUFFIXRULES_SMILES_ATR)));
					}
					break;
				case addFunctionalAtomsToHydroxyGroups:
					if (suffixFrag != null){
						throw new ComponentGenerationException("addFunctionalAtomsToHydroxyGroups is not currently compatible with the addGroup suffix rule");
					}
					addFunctionalAtomsToHydroxyGroups(atomLikelyToBeUsedBySuffix);
					break;
				case chargeHydroxyGroups:
					if (suffixFrag != null){
						throw new ComponentGenerationException("chargeHydroxyGroups is not currently compatible with the addGroup suffix rule");
					}
					chargeHydroxyGroups(atomLikelyToBeUsedBySuffix);
					break;
				case removeTerminalOxygen:
					if (suffixFrag != null){
						throw new ComponentGenerationException("removeTerminalOxygen is not currently compatible with the addGroup suffix rule");
					}
					int bondOrder =
Integer.parseInt(suffixRule.getAttributeValue(SUFFIXRULES_ORDER_ATR)); FragmentTools.removeTerminalOxygen(state, atomLikelyToBeUsedBySuffix, bondOrder); break; default: break; } } if (suffixFrag != null) { suffixFragments.add(suffixFrag); suffix.setFrag(suffixFrag); } } return suffixFragments; } /**Processes any convertHydroxyGroupsToOutAtoms and convertHydroxyGroupsToPositiveCharge instructions * This is not handled as part of resolveGroupAddingSuffixes as something like carbonochloridoyl involves infix replacement * on a hydroxy that would otherwise actually be removed by this rule! * @param suffixes The suffix elements for a fragment. * @param frag The fragment to which the suffix will be applied * @throws ComponentGenerationException */ private void processRemovalOfHydroxyGroupsRules(List suffixes, Fragment frag) throws ComponentGenerationException { String groupType = frag.getType(); String subgroupType = frag.getSubType(); String suffixTypeToUse =null; if (suffixApplier.isGroupTypeWithSpecificSuffixRules(groupType)) { suffixTypeToUse =groupType; } else{ suffixTypeToUse = STANDARDGROUP_TYPE_VAL; } for (Element suffix : suffixes) { String suffixValue = suffix.getAttributeValue(VALUE_ATR); List suffixRules = suffixApplier.getSuffixRuleTags(suffixTypeToUse, suffixValue, subgroupType); for (SuffixRule suffixRule : suffixRules) { SuffixRuleType type =suffixRule.getType(); if (type == SuffixRuleType.convertHydroxyGroupsToOutAtoms) { convertHydroxyGroupsToOutAtoms(frag); } else if (type == SuffixRuleType.convertHydroxyGroupsToPositiveCharge) { convertHydroxyGroupsToPositiveCharge(frag); } } } } /** * Finds all hydroxy groups connected to a given atom and adds a functionalAtom to each of them * @param atom * @throws StructureBuildingException */ private void addFunctionalAtomsToHydroxyGroups(Atom atom) throws StructureBuildingException { List neighbours = atom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (neighbour.getElement() == ChemEl.O && 
neighbour.getCharge() == 0 && neighbour.getBondCount() == 1 && atom.getBondToAtomOrThrow(neighbour).getOrder() == 1){ neighbour.getFrag().addFunctionalAtom(neighbour); } } } /** * Finds all hydroxy groups connected to a given atom and makes them negatively charged * @param atom * @throws StructureBuildingException */ private void chargeHydroxyGroups(Atom atom) throws StructureBuildingException { List neighbours = atom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (neighbour.getElement() == ChemEl.O && neighbour.getCharge()==0 && neighbour.getBondCount()==1 && atom.getBondToAtomOrThrow(neighbour).getOrder()==1){ neighbour.addChargeAndProtons(-1, -1); } } } /** * Given a fragment removes all hydroxy groups and adds a valency 1 outAtom to the adjacent atom for each hydroxy group * Note that O[OH] is not considered a hydroxy c.f. carbonoperoxoyl * @param frag */ private void convertHydroxyGroupsToOutAtoms(Fragment frag) { List atomList = frag.getAtomList(); for (Atom atom : atomList) { if (atom.getElement() == ChemEl.O && atom.getCharge()==0 && atom.getBondCount()==1 && atom.getFirstBond().getOrder()==1 && atom.getOutValency() == 0){ Atom adjacentAtom = atom.getAtomNeighbours().get(0); if (adjacentAtom.getElement() != ChemEl.O){ state.fragManager.removeAtomAndAssociatedBonds(atom); frag.addOutAtom(adjacentAtom, 1, true); } } } } /** * Given a fragment removes all hydroxy groups and applies ylium to the adjacent atom (+1 charge -1 proton) * Note that O[OH] is not considered a hydroxy * @param frag */ private void convertHydroxyGroupsToPositiveCharge(Fragment frag) { List atomList = frag.getAtomList(); for (Atom atom : atomList) { if (atom.getElement() == ChemEl.O && atom.getCharge()==0 && atom.getBondCount()==1 && atom.getFirstBond().getOrder()==1 && atom.getOutValency() == 0){ Atom adjacentAtom = atom.getAtomNeighbours().get(0); if (adjacentAtom.getElement() != ChemEl.O){ state.fragManager.removeAtomAndAssociatedBonds(atom); 
adjacentAtom.addChargeAndProtons(1, -1); } } } } /** * Searches for suffix elements with the suffixPrefix attribute set * A suffixPrefix is something like sulfon in sulfonamide. It would in this case take the value S(=O) * @param suffixes * @throws StructureBuildingException */ private void processSuffixPrefixes(List suffixes) throws StructureBuildingException{ for (Element suffix : suffixes) { if (suffix.getAttribute(SUFFIXPREFIX_ATR)!=null){ Fragment suffixPrefixFrag = state.fragManager.buildSMILES(suffix.getAttributeValue(SUFFIXPREFIX_ATR), SUFFIX_TYPE_VAL, NONE_LABELS_VAL); addFunctionalAtomsToHydroxyGroups(suffixPrefixFrag.getFirstAtom()); if (suffix.getValue().endsWith("ate") || suffix.getValue().endsWith("at")){ chargeHydroxyGroups(suffixPrefixFrag.getFirstAtom()); } Atom firstAtomOfPrefix = suffixPrefixFrag.getFirstAtom(); firstAtomOfPrefix.addLocant("X");//make sure this atom is not given a locant Fragment suffixFrag = suffix.getFrag(); state.fragManager.incorporateFragment(suffixPrefixFrag, suffixFrag); //manipulate suffixFrag such that all the bonds to the first atom (the R) go instead to the first atom of suffixPrefixFrag. //Then reconnect the R to that atom Atom theR = suffixFrag.getFirstAtom(); List neighbours = theR.getAtomNeighbours(); for (Atom neighbour : neighbours) { Bond b = theR.getBondToAtomOrThrow(neighbour); state.fragManager.removeBond(b); state.fragManager.createBond(neighbour, firstAtomOfPrefix, b.getOrder()); } state.fragManager.createBond(firstAtomOfPrefix, theR, 1); } } } /** * Checks through the groups accessible from the startingElement taking into account brackets (i.e. those that it is feasible that the group of the startingElement could substitute onto). * It is assumed that one does not intentionally locant onto something in a deeper level of bracketing (not implicit bracketing). e.g. 
2-propyl(ethyl)ammonia will give prop-2-yl * @param state * @param startingElement * @param locant: the locant string to check for the presence of * @return whether the locant was found * @throws StructureBuildingException */ static boolean checkLocantPresentOnPotentialRoot(BuildState state, Element startingElement, String locant) throws StructureBuildingException { boolean foundSibling =false; Deque s = new ArrayDeque<>(); s.add(startingElement); boolean doneFirstIteration =false;//check on index only done on first iteration to only get elements with an index greater than the starting element while (s.size()>0){ Element currentElement =s.removeLast(); Element parent = currentElement.getParent(); List siblings = OpsinTools.getChildElementsWithTagNames(parent, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); int indexOfCurrentElement =parent.indexOf(currentElement); for (Element bracketOrSub : siblings) { if (!doneFirstIteration && parent.indexOf(bracketOrSub) <= indexOfCurrentElement){ continue; } if (bracketOrSub.getName().equals(BRACKET_EL)){//only want to consider implicit brackets, not proper brackets if (bracketOrSub.getAttribute(TYPE_ATR)==null){ continue; } s.add(bracketOrSub.getChild(0)); } else{ Element group = bracketOrSub.getFirstChildElement(GROUP_EL); Fragment groupFrag = group.getFrag(); if (groupFrag.hasLocant(locant)){ return true; } List suffixes =state.xmlSuffixMap.get(group); if (suffixes!=null){ for (Fragment suffix : suffixes) { if (suffix.hasLocant(locant)){ return true; } } } List conjunctiveGroups = OpsinTools.getNextSiblingsOfType(group, CONJUNCTIVESUFFIXGROUP_EL); for (Element conjunctiveGroup : conjunctiveGroups) { if (conjunctiveGroup.getFrag().hasLocant(locant)){ return true; } } } foundSibling =true; } doneFirstIteration =true; } if (!foundSibling){//Special case: anything the group could potentially substitute onto is in a bracket. 
The bracket is checked recursively s = new ArrayDeque<>(); s.add(startingElement); doneFirstIteration =false;//check on index only done on first iteration to only get elements with an index greater than the starting element while (s.size()>0){ Element currentElement =s.removeLast(); Element parent = currentElement.getParent(); List siblings = OpsinTools.getChildElementsWithTagNames(parent, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); int indexOfCurrentElement =parent.indexOf(currentElement); for (Element bracketOrSub : siblings) { if (!doneFirstIteration && parent.indexOf(bracketOrSub) <= indexOfCurrentElement){ continue; } if (bracketOrSub.getName().equals(BRACKET_EL)){ s.add(bracketOrSub.getChild(0)); } else{ Element group = bracketOrSub.getFirstChildElement(GROUP_EL); Fragment groupFrag = group.getFrag(); if (groupFrag.hasLocant(locant)){ return true; } List suffixes =state.xmlSuffixMap.get(group); if (suffixes!=null){ for (Fragment suffix : suffixes) { if (suffix.hasLocant(locant)){ return true; } } } List conjunctiveGroups = OpsinTools.getNextSiblingsOfType(group, CONJUNCTIVESUFFIXGROUP_EL); for (Element conjunctiveGroup : conjunctiveGroups) { if (conjunctiveGroup.getFrag().hasLocant(locant)){ return true; } } } } doneFirstIteration =true; } } return false; } /** Handles special cases in IUPAC nomenclature that are most elegantly solved by modification of the fragment * Also sets the default in atom for alkanes so that say methylethyl is prop-2-yl rather than propyl * @param groups * @throws StructureBuildingException * @throws ComponentGenerationException */ private void handleGroupIrregularities(List groups) throws StructureBuildingException, ComponentGenerationException{ for (Element group : groups) { String groupValue =group.getValue(); if (groupValue.equals("porphyrin")|| groupValue.equals("porphin")){ List hydrogenAddingEls = group.getParent().getChildElements(INDICATEDHYDROGEN_EL); boolean implicitHydrogenExplicitlySet =false; for (Element 
hydrogenAddingEl : hydrogenAddingEls) { String locant = hydrogenAddingEl.getAttributeValue(LOCANT_ATR); if (locant !=null && (locant.equals("21") || locant.equals("22") || locant.equals("23") || locant.equals("24"))){ implicitHydrogenExplicitlySet =true; } } if (!implicitHydrogenExplicitlySet){ //porphyrins implicitly have indicated hydrogen at the 21/23 positions //directly modify the fragment to avoid problems with locants in for example ring assemblies Fragment frag = group.getFrag(); frag.getAtomByLocantOrThrow("21").setSpareValency(false); frag.getAtomByLocantOrThrow("23").setSpareValency(false); } } else if (groupValue.equals("xanthate") || groupValue.equals("xanthat") || groupValue.equals("xanthic acid") || groupValue.equals("xanthicacid")){ //This test needs to be here rather than in the ComponentGenerator to correctly reject non-substituted thioxanthates Element wordRule = OpsinTools.getParentWordRule(group); if (wordRule.getAttributeValue(WORDRULE_ATR).equals(WordRule.simple.toString())){ if (OpsinTools.getDescendantElementsWithTagName(wordRule, SUBSTITUENT_EL).isEmpty()){ throw new ComponentGenerationException(groupValue +" describes a class of compounds rather than a particular compound"); } } } else if (groupValue.equals("adenosin") || groupValue.equals("cytidin") || groupValue.equals("guanosin") || groupValue.equals("inosin") || groupValue.equals("uridin") || groupValue.equals("xanthosin")){ //An unlocanted deoxy on these groups means 2'-deoxy by convention Element previous = OpsinTools.getPreviousSibling(group); if (previous != null && previous.getName().equals(SUBTRACTIVEPREFIX_EL) && previous.getAttributeValue(TYPE_ATR).equals(DEOXY_TYPE_VAL) && previous.getAttributeValue(VALUE_ATR).equals("O") && previous.getAttribute(LOCANT_ATR) == null) { Element prev2 = OpsinTools.getPrevious(previous); if (prev2 == null || !prev2.getName().equals(SUBTRACTIVEPREFIX_EL)) { Fragment frag = group.getFrag(); StructureBuildingMethods.applySubtractivePrefix(state, frag, ChemEl.O, "2'"); 
previous.detach(); } } } else if (groupValue.equals("imidazol")) { Element next = OpsinTools.getNextSibling(group); if (next != null && next.getName().equals(SUFFIX_EL) && next.getValue().equals("ium")) { Atom pos3 = group.getFrag().getAtomByLocant("3"); if (pos3 != null) { //imidazolium = imidazol-3-ium next.addAttribute(DEFAULTLOCANTID_ATR, String.valueOf(pos3.getID())); } } } if ("yes".equals(group.getAttributeValue(USABLEASJOINER_ATR)) && group.getAttribute(DEFAULTINID_ATR) == null && group.getAttribute(DEFAULTINLOCANT_ATR) == null) { //makes linkers by default attach end to end Fragment frag = group.getFrag(); int chainLength = frag.getChainLength(); if (chainLength > 1){ boolean connectEndToEndWithPreviousSub = true; if (group.getAttributeValue(TYPE_ATR).equals(CHAIN_TYPE_VAL) && ALKANESTEM_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){//don't do this if the group is preceded by another alkaneStem e.g. methylethyl makes more sense as prop-2-yl rather than propyl Element previousSubstituent = OpsinTools.getPreviousSibling(group.getParent()); if (previousSubstituent != null){ List previousSubstGroups = previousSubstituent.getChildElements(GROUP_EL); if (previousSubstGroups.size() == 1){ Element previousGroup = previousSubstGroups.get(0); if (previousGroup.getAttributeValue(TYPE_ATR).equals(CHAIN_TYPE_VAL) && ALKANESTEM_SUBTYPE_VAL.equals(previousGroup.getAttributeValue(SUBTYPE_ATR))){ Element suffixAfterGroup = OpsinTools.getNextSibling(previousGroup, SUFFIX_EL); if (suffixAfterGroup == null || suffixAfterGroup.getFrag() == null || suffixAfterGroup.getFrag().getOutAtomCount() == 0){ connectEndToEndWithPreviousSub = false; } } } } } if (connectEndToEndWithPreviousSub){ Element parent = group.getParent(); while (parent.getName().equals(BRACKET_EL)) { parent = parent.getParent(); } if (!parent.getName().equals(ROOT_EL)) { group.addAttribute(new Attribute(DEFAULTINID_ATR, Integer.toString(chainLength))); 
frag.setDefaultInAtom(frag.getAtomByLocantOrThrow(Integer.toString(chainLength))); } } } } } } /** * Handles Hantzsch-Widman rings. Adds SMILES to the group corresponding to the ring's structure * @param subOrRoot * @throws StructureBuildingException * @throws ComponentGenerationException */ private void processHW(Element subOrRoot) throws StructureBuildingException, ComponentGenerationException{ List hwGroups = OpsinTools.getChildElementsWithTagNameAndAttribute(subOrRoot, GROUP_EL, SUBTYPE_ATR, HANTZSCHWIDMAN_SUBTYPE_VAL); for (Element group : hwGroups) { Fragment hwRing = group.getFrag(); List atomList =hwRing.getAtomList(); boolean noLocants = true; List prevs = new ArrayList<>(); Element prev = OpsinTools.getPreviousSibling(group); while(prev != null && prev.getName().equals(HETEROATOM_EL)) { prevs.add(prev); if(prev.getAttribute(LOCANT_ATR) != null) { noLocants = false; } prev = OpsinTools.getPreviousSibling(prev); } Collections.reverse(prevs); List heteroatomsToProcess = prevs; if (atomList.size() == 6 && group.getValue().equals("an")){ boolean hasNitrogen = false; boolean hasSiorGeorSnorPb = false; boolean saturatedRing = true; for(Element heteroatom : heteroatomsToProcess){ String heteroAtomElement =heteroatom.getAttributeValue(VALUE_ATR); Matcher m = MATCH_ELEMENT_SYMBOL.matcher(heteroAtomElement); if (!m.find()){ throw new ComponentGenerationException("Failed to extract element from Hantzsch-Widman heteroatom"); } heteroAtomElement = m.group(); if (heteroAtomElement.equals("N")){ hasNitrogen = true; } if (heteroAtomElement.equals("Si") || heteroAtomElement.equals("Ge") || heteroAtomElement.equals("Sn") || heteroAtomElement.equals("Pb") ){ hasSiorGeorSnorPb = true; } } for (Atom a: atomList) { if (a.hasSpareValency()){ saturatedRing = false; } } if (saturatedRing && !hasNitrogen && hasSiorGeorSnorPb){ throw new ComponentGenerationException("Blocked Hantzsch-Widman system (6 member saturated ring with no nitrogen but has Si/Ge/Sn/Pb)"); } } StringBuilder 
nameSB = new StringBuilder(); for(Element heteroatom : heteroatomsToProcess) { String hetValue = heteroatom.getValue(); if (hetValue.endsWith("a")) { nameSB.append(hetValue.substring(0, hetValue.length() - 1)); } else { nameSB.append(hetValue); } } nameSB.append(group.getValue()); String name = nameSB.toString(); group.setValue(name); if(noLocants && heteroatomsToProcess.size() > 0) { String[] specialRingInformation = specialHWRings.get(name); if(specialRingInformation != null) { String specialInstruction =specialRingInformation[0]; if (!specialInstruction.equals("")){ if (specialInstruction.equals("blocked")){ throw new ComponentGenerationException("Blocked Hantzsch-Widman system"); } else if (specialInstruction.equals("saturated")){ for (Atom a: atomList) { a.setSpareValency(false); } } else if (specialInstruction.equals("not_icacid")){ if (group.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR) == null){ Element nextEl = OpsinTools.getNextSibling(group); if (nextEl != null && nextEl.getName().equals(SUFFIX_EL) && nextEl.getAttribute(LOCANT_ATR) == null && nextEl.getAttributeValue(VALUE_ATR).equals("ic")){ throw new ComponentGenerationException(name + nextEl.getValue() +" appears to be a generic class name, not a Hantzsch-Widman ring"); } } } else if (specialInstruction.equals("not_nothingOrOlate")){ if (group.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR) == null){ Element nextEl = OpsinTools.getNextSibling(group); if (nextEl==null || (nextEl!=null && nextEl.getName().equals(SUFFIX_EL) && nextEl.getAttribute(LOCANT_ATR)==null && nextEl.getAttributeValue(VALUE_ATR).equals("ate"))){ throw new ComponentGenerationException(name +" has the syntax for a Hantzsch-Widman ring but probably does not mean that in this context"); } } } else{ throw new ComponentGenerationException("OPSIN Bug: Unrecognised special Hantzsch-Widman ring instruction"); } } //something like oxazole where by convention locants go 1,3 or an inorganic HW-like system for (int i = 1; i < 
specialRingInformation.length; i++) { Atom a = hwRing.getAtomByLocantOrThrow(Integer.toString(i)); a.setElement(ChemEl.valueOf(specialRingInformation[i])); } for(Element p : heteroatomsToProcess){ p.detach(); } heteroatomsToProcess.clear(); } } //add locanted heteroatoms for (Iterator it = heteroatomsToProcess.iterator(); it.hasNext();) { Element heteroatom = it.next(); String locant = heteroatom.getAttributeValue(LOCANT_ATR); if (locant == null) { continue; } String elementReplacement = heteroatom.getAttributeValue(VALUE_ATR); Matcher m = MATCH_ELEMENT_SYMBOL.matcher(elementReplacement); if (!m.find()){ throw new ComponentGenerationException("Failed to extract element from Hantzsch-Widman heteroatom"); } elementReplacement = m.group(); Atom a = hwRing.getAtomByLocantOrThrow(locant); a.setElement(ChemEl.valueOf(elementReplacement)); if (heteroatom.getAttribute(LAMBDA_ATR) != null){ a.setLambdaConventionValency(Integer.parseInt(heteroatom.getAttributeValue(LAMBDA_ATR))); } heteroatom.detach(); it.remove(); } List deltaEls = subOrRoot.getChildElements(DELTA_EL); //add locanted double bonds and convert unlocanted to unsaturators for (Element deltaEl : deltaEls) { String locantOfDoubleBond = deltaEl.getValue(); if (locantOfDoubleBond.equals("")){ Element newUnsaturator = new TokenEl(UNSATURATOR_EL); newUnsaturator.addAttribute(new Attribute(VALUE_ATR, "2")); OpsinTools.insertAfter(group, newUnsaturator); } else{ Atom firstInDoubleBond = hwRing.getAtomByLocantOrThrow(locantOfDoubleBond); FragmentTools.unsaturate(firstInDoubleBond, 2, hwRing); } deltaEl.detach(); } //add unlocanted heteroatoms int hetAtomsToProcess = heteroatomsToProcess.size(); if (hetAtomsToProcess > 0) { List carbonAtomsInRing = new ArrayList<>(); for (Atom atom : atomList) { if (atom.getElement() == ChemEl.C) { carbonAtomsInRing.add(atom); } } if (hetAtomsToProcess> 1 && hetAtomsToProcess < (carbonAtomsInRing.size() -1)) { Element possibleBenzo = OpsinTools.getPreviousSibling(group, GROUP_EL); 
//assume benzo fusions or hwring as a fusion prefix produce unambiguous heteroatom positioning if (!(possibleBenzo != null && (possibleBenzo.getValue().equals("benz") || possibleBenzo.getValue().equals("benzo")) || "o".equals(group.getAttributeValue((SUBSEQUENTUNSEMANTICTOKEN_ATR))))) { state.addIsAmbiguous("Heteroatom positioning in the Hantzsch-Widman name " + name); } } if (hetAtomsToProcess > carbonAtomsInRing.size()) { throw new StructureBuildingException(hetAtomsToProcess +" heteroatoms were specified for a Hantzsch-Widman ring with only " + carbonAtomsInRing.size() + " atoms"); } for (int i = 0; i < hetAtomsToProcess; i++) { Element heteroatom = heteroatomsToProcess.get(i); String elementReplacement = heteroatom.getAttributeValue(VALUE_ATR); Matcher m = MATCH_ELEMENT_SYMBOL.matcher(elementReplacement); if (!m.find()){ throw new ComponentGenerationException("Failed to extract element from Hantzsch-Widman heteroatom"); } elementReplacement = m.group(); Atom a = carbonAtomsInRing.get(i); a.setElement(ChemEl.valueOf(elementReplacement)); if (heteroatom.getAttribute(LAMBDA_ATR)!=null){ a.setLambdaConventionValency(Integer.parseInt(heteroatom.getAttributeValue(LAMBDA_ATR))); } heteroatom.detach(); } } } } /** * Assigns Element symbols to groups, suffixes and conjunctive suffixes. * Suffixes have preference. 
* @param subOrRoot * @throws StructureBuildingException */ private void assignElementSymbolLocants(Element subOrRoot) throws StructureBuildingException { List groups = subOrRoot.getChildElements(GROUP_EL); Element lastGroupElementInSubOrRoot =groups.get(groups.size()-1); List suffixFragments = new ArrayList<>(state.xmlSuffixMap.get(lastGroupElementInSubOrRoot)); Fragment suffixableFragment = lastGroupElementInSubOrRoot.getFrag(); //treat conjunctive suffixes as if they were suffixes List conjunctiveGroups = subOrRoot.getChildElements(CONJUNCTIVESUFFIXGROUP_EL); for (Element group : conjunctiveGroups) { suffixFragments.add(group.getFrag()); } FragmentTools.assignElementLocants(suffixableFragment, suffixFragments); for (int i = groups.size()-2; i>=0; i--) { FragmentTools.assignElementLocants(groups.get(i).getFrag(), new ArrayList<>()); } } /** * Processes constructs such as biphenyl, 1,1':4',1''-Terphenyl, 2,2'-Bipyridylium, m-Quaterphenyl * @param subOrRoot * @throws ComponentGenerationException * @throws StructureBuildingException */ private void processRingAssemblies(Element subOrRoot) throws ComponentGenerationException, StructureBuildingException { List ringAssemblyMultipliers = subOrRoot.getChildElements(RINGASSEMBLYMULTIPLIER_EL); for (Element multiplier : ringAssemblyMultipliers) { int mvalue = Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)); /* * Populate ringJoiningLocants with locants. Two locants are required for every pair of rings to be joined. * e.g. bi requires 2, ter requires 4 etc. 
*/ List<List<String>> ringJoiningLocants = new ArrayList<>(); Element potentialLocant = OpsinTools.getPreviousSibling(multiplier); Element group = OpsinTools.getNextSibling(multiplier, GROUP_EL); if (potentialLocant != null && (potentialLocant.getName().equals(COLONORSEMICOLONDELIMITEDLOCANT_EL) || potentialLocant.getName().equals(LOCANT_EL))){ //a locant appears to have been provided to indicate how to connect the rings of the ringAssembly if (ORTHOMETAPARA_TYPE_VAL.equals(potentialLocant.getAttributeValue(TYPE_ATR))){//an OMP locant has been provided to indicate how to connect the rings of the ringAssembly String locant2 = potentialLocant.getValue(); String locant1 = "1"; List locantArrayList = new ArrayList<>(); locantArrayList.add("1"); locantArrayList.add("1'"); ringJoiningLocants.add(locantArrayList); for (int i = 1; i < mvalue - 1; i++) { locantArrayList = new ArrayList<>(); locantArrayList.add(locant2 + StringTools.multiplyString("'", i)); locantArrayList.add(locant1 + StringTools.multiplyString("'", i + 1)); ringJoiningLocants.add(locantArrayList); } potentialLocant.detach(); } else{ String locantText = StringTools.removeDashIfPresent(potentialLocant.getValue()); //locantText might be something like 1,1':3',1'' String[] perRingLocantArray = MATCH_COLONORSEMICOLON.split(locantText); if (perRingLocantArray.length != (mvalue - 1)){ throw new ComponentGenerationException("Disagreement between number of locants(" + locantText + ") and ring assembly multiplier: " + mvalue); } if (perRingLocantArray.length != 1 || perRingLocantArray[0].split(",").length != 1){//if there is just a single locant it doesn't relate to how the rings are connected for (String ringLocantArray : perRingLocantArray) { String[] locantArray = ringLocantArray.split(","); if (locantArray.length != 2){ throw new ComponentGenerationException("missing locant, expected 2 locants: " + ringLocantArray); } ringJoiningLocants.add(Arrays.asList(locantArray)); } potentialLocant.detach(); } } } Fragment 
fragmentToResolveAndDuplicate = group.getFrag(); Element elementToResolve;//temporary element containing elements that should be resolved before the ring is duplicated Element nextEl = OpsinTools.getNextSibling(multiplier); if (nextEl.getName().equals(STRUCTURALOPENBRACKET_EL)){//brackets have been provided to aid disambiguation. These brackets are detached e.g. bi(cyclohexyl) elementToResolve = new GroupingEl(SUBSTITUENT_EL); Element currentEl = nextEl; nextEl = OpsinTools.getNextSibling(currentEl); currentEl.detach(); while (nextEl != null && !nextEl.getName().equals(STRUCTURALCLOSEBRACKET_EL)){ currentEl = nextEl; nextEl = OpsinTools.getNextSibling(currentEl); currentEl.detach(); elementToResolve.addChild(currentEl); } if (nextEl != null){ nextEl.detach(); } } else{ elementToResolve = determineElementsToResolveIntoRingAssembly(multiplier, ringJoiningLocants.size(), fragmentToResolveAndDuplicate.getOutAtomCount()); } List suffixes = elementToResolve.getChildElements(SUFFIX_EL); suffixApplier.resolveSuffixes(group, suffixes); int bondOrder = 1; if (fragmentToResolveAndDuplicate.getOutAtomCount() > 1){ throw new StructureBuildingException("Ring assembly fragment should have one or no OutAtoms; not more than one!"); } if (fragmentToResolveAndDuplicate.getOutAtomCount() == 1) {//e.g. bicyclohexanylidene bondOrder = fragmentToResolveAndDuplicate.getOutAtom(0).getValency(); } boolean twoRingsJoinedUsingSuffixPosition = ringJoiningLocants.isEmpty() && mvalue == 2 && fragmentToResolveAndDuplicate.getOutAtomCount() == 1; if (!twoRingsJoinedUsingSuffixPosition && fragmentToResolveAndDuplicate.getOutAtomCount() == 1) { //remove yl (or the like). 
Need to make sure that resolveUnLocantedFeatures doesn't consider 2,2'-bipyridyl ambiguous due to the location of the yl fragmentToResolveAndDuplicate.removeOutAtom(0); } StructureBuildingMethods.resolveLocantedFeatures(state, elementToResolve); StructureBuildingMethods.resolveUnLocantedFeatures(state, elementToResolve); group.detach(); OpsinTools.insertAfter(multiplier, group); if (twoRingsJoinedUsingSuffixPosition){ Fragment clone = state.fragManager.copyAndRelabelFragment(fragmentToResolveAndDuplicate, 1); Atom atomOnParent = fragmentToResolveAndDuplicate.getOutAtom(0).getAtom(); Atom atomOnClone = clone.getOutAtom(0).getAtom(); fragmentToResolveAndDuplicate.removeOutAtom(0); clone.removeOutAtom(0); state.fragManager.incorporateFragment(clone, atomOnClone, fragmentToResolveAndDuplicate, atomOnParent, bondOrder); } else { List clonedFragments = new ArrayList<>(); for (int j = 1; j < mvalue; j++) { clonedFragments.add(state.fragManager.copyAndRelabelFragment(fragmentToResolveAndDuplicate, j)); } Fragment lastRingUnlocantedBondedTo = null; for (int i = 0; i < mvalue - 1; i++) { Fragment clone = clonedFragments.get(i); Atom atomOnParent; Atom atomOnLatestClone; if (ringJoiningLocants.size() > 0){//locants defined atomOnParent = fragmentToResolveAndDuplicate.getAtomByLocantOrThrow(ringJoiningLocants.get(i).get(0)); String secondLocant = ringJoiningLocants.get(i).get(1); if (mvalue ==2 && !secondLocant.endsWith("'")){ //Allow prime to be (incorrectly) omitted on second locant in bi ring assemblies e.g. 
2,2-bipyridine try { atomOnLatestClone = clone.getAtomByLocantOrThrow(secondLocant); } catch (StructureBuildingException e){ atomOnLatestClone = clone.getAtomByLocant(secondLocant + "'"); if (atomOnLatestClone == null){ throw e; } } } else{ atomOnLatestClone = clone.getAtomByLocantOrThrow(secondLocant); } } else{ List potentialAtomsOnParent; if (lastRingUnlocantedBondedTo == null){ potentialAtomsOnParent = FragmentTools.findSubstituableAtoms(fragmentToResolveAndDuplicate, bondOrder); } else{ potentialAtomsOnParent = FragmentTools.findSubstituableAtoms(lastRingUnlocantedBondedTo, bondOrder); } List potentialAtomsOnClone = FragmentTools.findSubstituableAtoms(clone, bondOrder); if (potentialAtomsOnParent.isEmpty() || potentialAtomsOnClone.isEmpty()) { throw new StructureBuildingException("Unable to find suitable atom for unlocanted ring assembly construction"); } if (AmbiguityChecker.isSubstitutionAmbiguous(potentialAtomsOnParent, 1)) { state.addIsAmbiguous("Choice of atoms to form ring assembly: " + group.getValue()); } if (AmbiguityChecker.isSubstitutionAmbiguous(potentialAtomsOnClone, 1)) { state.addIsAmbiguous("Choice of atoms to form ring assembly: " + group.getValue()); } atomOnParent = potentialAtomsOnParent.get(0); atomOnLatestClone = potentialAtomsOnClone.get(0); lastRingUnlocantedBondedTo = clone; } state.fragManager.incorporateFragment(clone, atomOnLatestClone, fragmentToResolveAndDuplicate, atomOnParent, bondOrder); } } group.setValue(multiplier.getValue() + group.getValue()); Element possibleOpenStructuralBracket = OpsinTools.getPreviousSibling(multiplier); if (possibleOpenStructuralBracket!=null && possibleOpenStructuralBracket.getName().equals(STRUCTURALOPENBRACKET_EL)){//e.g. [2,2'-bipyridin]. //To emphasise there can actually be two sets of structural brackets e.g. 
[1,1'-bi(cyclohexyl)] OpsinTools.getNextSibling(possibleOpenStructuralBracket, STRUCTURALCLOSEBRACKET_EL).detach(); possibleOpenStructuralBracket.detach(); } multiplier.detach(); } } /** * Given the element after the ring assembly multiplier determines which siblings should be resolved by adding them to elementToResolve * @param multiplier * @param ringJoiningLocants * @param outAtomCount * @return * @throws ComponentGenerationException */ private Element determineElementsToResolveIntoRingAssembly(Element multiplier, int ringJoiningLocants, int outAtomCount) throws ComponentGenerationException { Element elementToResolve = new GroupingEl(SUBSTITUENT_EL); boolean groupFound = false; boolean inlineSuffixSeen = outAtomCount > 0; Element currentEl = OpsinTools.getNextSibling(multiplier); while (currentEl != null) { //Attach all until group found //Attach unlocanted charge suffixes/unsaturation //Attach one unlocanted unmultiplied inline suffix (or one locanted unmultiplied inline suffix if it is a bi ring assembly e.g. 
bipyridin-2-yl) Element nextEl = OpsinTools.getNextSibling(currentEl); if (!groupFound) { currentEl.detach(); elementToResolve.addChild(currentEl); if (currentEl.getName().equals(GROUP_EL)) { groupFound = true; } } else { if (currentEl.getName().equals(SUFFIX_EL)) { String suffixType = currentEl.getAttributeValue(TYPE_ATR); if (suffixType.equals(CHARGE_TYPE_VAL) && currentEl.getAttribute(LOCANT_ATR) == null) { currentEl.detach(); elementToResolve.addChild(currentEl); } else if (!inlineSuffixSeen && suffixType.equals(INLINE_TYPE_VAL) && currentEl.getAttributeValue(MULTIPLIED_ATR) == null && (currentEl.getAttribute(LOCANT_ATR) == null || ("2".equals(multiplier.getAttributeValue(VALUE_ATR)) && ringJoiningLocants == 0)) && currentEl.getFrag() == null){ inlineSuffixSeen = true; currentEl.detach(); elementToResolve.addChild(currentEl); } else { break; } } else if (currentEl.getName().equals(UNSATURATOR_EL) && currentEl.getAttribute(LOCANT_ATR) == null) { currentEl.detach(); elementToResolve.addChild(currentEl); } else { break; } } currentEl = nextEl; } Element parent = multiplier.getParent(); if (!parent.getName().equals(SUBSTITUENT_EL) && OpsinTools.getChildElementsWithTagNameAndAttribute(parent, SUFFIX_EL, TYPE_ATR, INLINE_TYPE_VAL).size() > 0){ throw new ComponentGenerationException("Unexpected radical adding suffix on ring assembly"); } return elementToResolve; } /** * Process any polycyclic spiro systems present in subOrRoot * It is assumed that at this stage all Hantzsch-Widman rings/fused rings have been resolved to single groups allowing them to be simply spiro fused * * http://www.chem.qmul.ac.uk/iupac/spiro/ (SP-2 through SP-6) * @param subOrRoot * @throws ComponentGenerationException * @throws StructureBuildingException */ private void processPolyCyclicSpiroNomenclature(Element subOrRoot) throws ComponentGenerationException, StructureBuildingException { List polyCyclicSpiros = subOrRoot.getChildElements(POLYCYCLICSPIRO_EL); if (polyCyclicSpiros.size()>0){ 
			Element polyCyclicSpiroDescriptor = polyCyclicSpiros.get(0);
			String value = polyCyclicSpiroDescriptor.getAttributeValue(VALUE_ATR);
			if (value.equals("spiro")){
				if (polyCyclicSpiros.size()!=1){
					throw new ComponentGenerationException("Nested polyspiro systems are not supported");
				}
				processNonIdenticalPolyCyclicSpiro(polyCyclicSpiroDescriptor);
			}
			else if (value.equals("spiroOldMethod")){
				processOldMethodPolyCyclicSpiro(polyCyclicSpiros);
			}
			else if (value.equals("spirobi")){
				if (polyCyclicSpiros.size()!=1){
					throw new ComponentGenerationException("Nested polyspiro systems are not supported");
				}
				processSpiroBiOrTer(polyCyclicSpiroDescriptor, 2);
			}
			else if (value.equals("spiroter")){
				if (polyCyclicSpiros.size()!=1){
					throw new ComponentGenerationException("Nested polyspiro systems are not supported");
				}
				processSpiroBiOrTer(polyCyclicSpiroDescriptor, 3);
			}
			else if (value.equals("dispiroter")){
				if (polyCyclicSpiros.size()!=1){
					throw new ComponentGenerationException("Nested polyspiro systems are not supported");
				}
				processDispiroter(polyCyclicSpiroDescriptor);
			}
			else{
				throw new ComponentGenerationException("Unsupported spiro system encountered");
			}
			polyCyclicSpiroDescriptor.detach();
		}
	}

	private void processNonIdenticalPolyCyclicSpiro(Element polyCyclicSpiroDescriptor) throws ComponentGenerationException, StructureBuildingException {
		Element subOrRoot = polyCyclicSpiroDescriptor.getParent();
		Element openBracket = OpsinTools.getNextSibling(polyCyclicSpiroDescriptor);
		if (!openBracket.getName().equals(STRUCTURALOPENBRACKET_EL)){
			throw new ComponentGenerationException("OPSIN Bug: Open bracket not found where open bracket expected");
		}
		List<Element> spiroBracketElements = OpsinTools.getSiblingsUpToElementWithTagName(openBracket, STRUCTURALCLOSEBRACKET_EL);
		Element closeBracket = OpsinTools.getNextSibling(spiroBracketElements.get(spiroBracketElements.size() - 1));
		if (closeBracket == null || !closeBracket.getName().equals(STRUCTURALCLOSEBRACKET_EL)){
			throw new
				ComponentGenerationException("OPSIN Bug: Close bracket not found where close bracket expected");
		}

		List<Element> groups = new ArrayList<>();
		for (Element spiroBracketElement : spiroBracketElements) {
			String name = spiroBracketElement.getName();
			if (name.equals(GROUP_EL)) {
				groups.add(spiroBracketElement);
			}
			else if (name.equals(SPIROLOCANT_EL)) {
				Element spiroLocant = spiroBracketElement;
				String[] locants = StringTools.removeDashIfPresent(spiroLocant.getValue()).split(",");
				if (locants.length != 2) {
					throw new ComponentGenerationException("Incorrect number of locants found before component of polycyclic spiro system");
				}
				boolean changed = false;
				Matcher m1 = matchAddedHydrogenBracket.matcher(locants[0]);
				if (m1.find()) {
					Element addedHydrogenElement = new TokenEl(ADDEDHYDROGEN_EL);
					String addedHydrogenLocant = m1.group(1);
					int primes = StringTools.countTerminalPrimes(addedHydrogenLocant);
					if (primes > 0 && primes == (groups.size() - 1)) {//rings are primeless before spiro fusion (hydrogen is currently added before spiro fusion)
						addedHydrogenLocant = addedHydrogenLocant.substring(0, addedHydrogenLocant.length() - primes);
					}
					addedHydrogenElement.addAttribute(new Attribute(LOCANT_ATR, addedHydrogenLocant));
					OpsinTools.insertBefore(spiroLocant, addedHydrogenElement);
					locants[0] = m1.replaceAll("");
					changed = true;
				}
				Matcher m2 = matchAddedHydrogenBracket.matcher(locants[1]);
				if (m2.find()) {
					Element addedHydrogenElement = new TokenEl(ADDEDHYDROGEN_EL);
					String addedHydrogenLocant = m2.group(1);
					int primes = StringTools.countTerminalPrimes(addedHydrogenLocant);
					if (primes > 0 && primes == groups.size()) {//rings are primeless before spiro fusion (hydrogen is currently added before spiro fusion)
						addedHydrogenLocant = addedHydrogenLocant.substring(0, addedHydrogenLocant.length() - primes);
					}
					addedHydrogenElement.addAttribute(new Attribute(LOCANT_ATR, addedHydrogenLocant));
					OpsinTools.insertAfter(spiroLocant, addedHydrogenElement);
					locants[1] = m2.replaceAll("");
					changed = true;
				}
				if
					(changed) {
					spiroLocant.addAttribute(new Attribute(TYPE_ATR, ADDEDHYDROGENLOCANT_TYPE_VAL));
				}
				spiroLocant.setValue(StringTools.arrayToString(locants, ","));
			}
		}
		int groupCount = groups.size();
		if (groupCount < 2) {
			throw new ComponentGenerationException("OPSIN Bug: At least two groups were expected in polycyclic spiro system");
		}

		Element firstGroup = groups.get(0);
		List<Element> firstGroupEls = new ArrayList<>();
		int indexOfOpenBracket = subOrRoot.indexOf(openBracket);
		Element firstSpiroLocant = OpsinTools.getNextSibling(firstGroup, SPIROLOCANT_EL);
		if (firstSpiroLocant == null) {
			throw new ComponentGenerationException("Unable to find spiroLocant for polycyclic spiro system");
		}
		int indexOfFirstSpiroLocant = subOrRoot.indexOf(firstSpiroLocant);
		for (int i = indexOfOpenBracket + 1; i < indexOfFirstSpiroLocant; i++) {
			firstGroupEls.add(subOrRoot.getChild(i));
		}
		resolveFeaturesOntoGroup(firstGroupEls);
		Set<Atom> spiroAtoms = new HashSet<>();
		for (int i = 1; i < groupCount; i++) {
			Element nextGroup = groups.get(i);
			Element spiroLocant = OpsinTools.getNextSibling(groups.get(i - 1), SPIROLOCANT_EL);
			if (spiroLocant == null) {
				throw new ComponentGenerationException("Unable to find spiroLocant for polycyclic spiro system");
			}
			String[] locants = spiroLocant.getValue().split(",");
			locants[0] = fixLocantCapitalisation(locants[0]);
			locants[1] = fixLocantCapitalisation(locants[1]);

			List<Element> nextGroupEls = new ArrayList<>();
			int indexOfLocant = subOrRoot.indexOf(spiroLocant);
			int indexOfNextSpiroLocantOrEndOfSpiro = subOrRoot.indexOf(i + 1 < groupCount ?
OpsinTools.getNextSibling(nextGroup, SPIROLOCANT_EL) : OpsinTools.getNextSibling(nextGroup, STRUCTURALCLOSEBRACKET_EL)); for (int j = indexOfLocant + 1; j < indexOfNextSpiroLocantOrEndOfSpiro; j++) { nextGroupEls.add(subOrRoot.getChild(j)); } resolveFeaturesOntoGroup(nextGroupEls); spiroLocant.detach(); Fragment nextFragment = nextGroup.getFrag(); FragmentTools.relabelNumericLocants(nextFragment.getAtomList(), StringTools.multiplyString("'", i)); String secondLocant = locants[1]; Atom atomOnNextFragment; if (secondLocant.endsWith("'")){ atomOnNextFragment = nextFragment.getAtomByLocantOrThrow(locants[1]); } else{ //for simple spiro fusions the prime is often forgotten atomOnNextFragment = nextFragment.getAtomByLocantOrThrow(locants[1] + "'"); } Atom atomToBeReplaced = null; for (int j = 0; j < i; j++) { atomToBeReplaced = groups.get(j).getFrag().getAtomByLocant(locants[0]); if (atomToBeReplaced != null){ break; } } if (atomToBeReplaced == null){ throw new ComponentGenerationException("Could not find the atom with locant " + locants[0] +" for use in polycyclic spiro system"); } spiroAtoms.add(atomToBeReplaced); if (atomToBeReplaced.getElement() != atomOnNextFragment.getElement()){ //In well formed names these should be identical but by special case pick the heteroatom if the other is carbon if (atomToBeReplaced.getElement() != ChemEl.C && atomOnNextFragment.getElement() == ChemEl.C) { atomOnNextFragment.setElement(atomToBeReplaced.getElement()); } else if (atomToBeReplaced.getElement() != ChemEl.C && atomOnNextFragment.getElement() != ChemEl.C) { throw new ComponentGenerationException("Disagreement between which element the spiro atom should be: " + atomToBeReplaced.getElement() +" and " + atomOnNextFragment.getElement() ); } } if (atomToBeReplaced.hasSpareValency()){ atomOnNextFragment.setSpareValency(true); } state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToBeReplaced, atomOnNextFragment); } if (spiroAtoms.size() > 1) { Element 
				expectedMultiplier = OpsinTools.getPreviousSibling(polyCyclicSpiroDescriptor);
			if (expectedMultiplier != null && expectedMultiplier.getName().equals(MULTIPLIER_EL) && Integer.parseInt(expectedMultiplier.getAttributeValue(VALUE_ATR)) == spiroAtoms.size()) {
				expectedMultiplier.detach();
			}
		}
		Element rootGroup = groups.get(groupCount - 1);
		Fragment rootFrag = rootGroup.getFrag();
		String name = rootGroup.getValue();
		for (int i = 0; i < groupCount - 1; i++) {
			Element group = groups.get(i);
			state.fragManager.incorporateFragment(group.getFrag(), rootFrag);
			name = group.getValue() + name;
			group.detach();
		}
		rootGroup.setValue(polyCyclicSpiroDescriptor.getValue() + name);
		openBracket.detach();
		closeBracket.detach();
	}

	/**
	 * Processes spiro systems described using the now deprecated method described in the 1979 guidelines Rule A-42
	 * @param spiroElements
	 * @throws ComponentGenerationException
	 * @throws StructureBuildingException
	 */
	private void processOldMethodPolyCyclicSpiro(List<Element> spiroElements) throws ComponentGenerationException, StructureBuildingException {
		Element firstSpiro =spiroElements.get(0);
		Element subOrRoot = firstSpiro.getParent();
		Element firstEl = subOrRoot.getChild(0);
		List<Element> elementsToResolve = OpsinTools.getSiblingsUpToElementWithTagName(firstEl, POLYCYCLICSPIRO_EL);
		elementsToResolve.add(0, firstEl);
		resolveFeaturesOntoGroup(elementsToResolve);

		for (int i = 0; i < spiroElements.size(); i++) {
			Element currentSpiro = spiroElements.get(i);
			Element previousGroup = OpsinTools.getPreviousSibling(currentSpiro, GROUP_EL);
			if (previousGroup==null){
				throw new ComponentGenerationException("OPSIN bug: unable to locate group before polycyclic spiro descriptor");
			}
			Element nextGroup = OpsinTools.getNextSibling(currentSpiro, GROUP_EL);
			if (nextGroup==null){
				throw new ComponentGenerationException("OPSIN bug: unable to locate group after polycyclic spiro descriptor");
			}
			Fragment previousFrag = previousGroup.getFrag();
			Fragment parentFrag = nextGroup.getFrag();
FragmentTools.relabelNumericLocants(parentFrag.getAtomList(), StringTools.multiplyString("'",i+1)); elementsToResolve = OpsinTools.getSiblingsUpToElementWithTagName(currentSpiro, POLYCYCLICSPIRO_EL); resolveFeaturesOntoGroup(elementsToResolve); String locant1 =null; Element possibleFirstLocant = OpsinTools.getPreviousSibling(currentSpiro); if (possibleFirstLocant !=null && possibleFirstLocant.getName().equals(LOCANT_EL)){ if (possibleFirstLocant.getValue().split(",").length==1){ locant1 = possibleFirstLocant.getValue(); possibleFirstLocant.detach(); } else{ throw new ComponentGenerationException("Malformed locant before polycyclic spiro descriptor"); } } Atom atomToBeReplaced; if (locant1 != null){ atomToBeReplaced = previousFrag.getAtomByLocantOrThrow(locant1); } else{ List potentialAtoms = FragmentTools.findSubstituableAtoms(previousFrag, 2); if (potentialAtoms.isEmpty()) { throw new StructureBuildingException("No suitable atom found for spiro fusion"); } if (AmbiguityChecker.isSubstitutionAmbiguous(potentialAtoms, 1)) { state.addIsAmbiguous("Choice of atom for spiro fusion on: " + previousGroup.getValue()); } atomToBeReplaced = potentialAtoms.get(0); } Atom atomOnParentFrag; String locant2 =null; Element possibleSecondLocant = OpsinTools.getNextSibling(currentSpiro); if (possibleSecondLocant !=null && possibleSecondLocant.getName().equals(LOCANT_EL)){ if (possibleSecondLocant.getValue().split(",").length==1){ locant2 = possibleSecondLocant.getValue(); possibleSecondLocant.detach(); } else{ throw new ComponentGenerationException("Malformed locant after polycyclic spiro descriptor"); } } if (locant2 != null){ atomOnParentFrag = parentFrag.getAtomByLocantOrThrow(locant2); } else{ List potentialAtoms = FragmentTools.findSubstituableAtoms(parentFrag, 2); if (potentialAtoms.isEmpty()) { throw new StructureBuildingException("No suitable atom found for spiro fusion"); } if (AmbiguityChecker.isSubstitutionAmbiguous(potentialAtoms, 1)) { state.addIsAmbiguous("Choice of 
atom for spiro fusion on: " + nextGroup.getValue()); }; atomOnParentFrag = potentialAtoms.get(0); } state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToBeReplaced, atomOnParentFrag); if (atomToBeReplaced.hasSpareValency()){ atomOnParentFrag.setSpareValency(true); } if (atomToBeReplaced.getCharge()!=0 && atomOnParentFrag.getCharge()==0){ atomOnParentFrag.setCharge(atomToBeReplaced.getCharge()); atomOnParentFrag.setProtonsExplicitlyAddedOrRemoved(atomToBeReplaced.getProtonsExplicitlyAddedOrRemoved()); } state.fragManager.incorporateFragment(previousFrag, parentFrag); nextGroup.setValue(previousGroup.getValue() + currentSpiro.getValue() + nextGroup.getValue()); previousGroup.detach(); } } /** * Two or three copies of the fragment after polyCyclicSpiroDescriptor are spiro fused at one centre * @param polyCyclicSpiroDescriptor * @param components * @throws ComponentGenerationException * @throws StructureBuildingException */ private void processSpiroBiOrTer(Element polyCyclicSpiroDescriptor, int components) throws ComponentGenerationException, StructureBuildingException { Element locant = OpsinTools.getPreviousSibling(polyCyclicSpiroDescriptor); String locantText; if (locant ==null || !locant.getName().equals(LOCANT_EL)){ if (components==2){ locantText ="1,1'"; } else{ throw new ComponentGenerationException("Unable to find locant indicating atoms to form polycyclic spiro system!"); } } else{ locantText = locant.getValue(); locant.detach(); } String[] locants = locantText.split(","); if (locants.length!=components){ throw new ComponentGenerationException("Mismatch between spiro descriptor and number of locants provided"); } Element group = OpsinTools.getNextSibling(polyCyclicSpiroDescriptor, GROUP_EL); if (group==null){ throw new ComponentGenerationException("Cannot find group to which spirobi/ter descriptor applies"); } determineFeaturesToResolveInSingleComponentSpiro(polyCyclicSpiroDescriptor); Fragment fragment = group.getFrag(); List clones = new 
ArrayList<>(); for (int i = 1; i < components ; i++) { clones.add(state.fragManager.copyAndRelabelFragment(fragment, i)); } Atom atomOnOriginalFragment = fragment.getAtomByLocantOrThrow(locants[0]); for (int i = 1; i < components ; i++) { Fragment clone = clones.get(i - 1); Atom atomToBeReplaced; if (components ==2 && !locants[i].endsWith("'")){ //Allow prime to be (incorrectly) omitted on second locant in spirobi try { atomToBeReplaced = clone.getAtomByLocantOrThrow(locants[i]); } catch (StructureBuildingException e){ atomToBeReplaced = clone.getAtomByLocant(locants[i] + "'"); if (atomToBeReplaced == null){ throw e; } } } else{ atomToBeReplaced = clone.getAtomByLocantOrThrow(locants[i]); } state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToBeReplaced, atomOnOriginalFragment); if (atomToBeReplaced.hasSpareValency()){ atomOnOriginalFragment.setSpareValency(true); } } for (Fragment clone : clones) { state.fragManager.incorporateFragment(clone, fragment); } group.setValue(polyCyclicSpiroDescriptor.getValue() + group.getValue()); } /** * Three copies of the fragment after polyCyclicSpiroDescriptor are spiro fused at two centres * @param polyCyclicSpiroDescriptor * @throws StructureBuildingException * @throws ComponentGenerationException */ private void processDispiroter(Element polyCyclicSpiroDescriptor) throws StructureBuildingException, ComponentGenerationException { String value = polyCyclicSpiroDescriptor.getValue(); value = value.substring(0, value.length()-10);//remove dispiroter value = StringTools.removeDashIfPresent(value); String[] locants = value.split(":"); Element group = OpsinTools.getNextSibling(polyCyclicSpiroDescriptor, GROUP_EL); if (group==null){ throw new ComponentGenerationException("Cannot find group to which dispiroter descriptor applies"); } determineFeaturesToResolveInSingleComponentSpiro(polyCyclicSpiroDescriptor); Fragment fragment = group.getFrag(); List clones = new ArrayList<>(); for (int i = 1; i < 3 ; i++) { 
clones.add(state.fragManager.copyAndRelabelFragment(fragment, i)); } for (Fragment clone : clones) { state.fragManager.incorporateFragment(clone, fragment); } String[] locants1 = locants[0].split(","); Atom atomOnLessPrimedFragment = fragment.getAtomByLocantOrThrow(fixLocantCapitalisation(locants1[0])); Atom atomToBeReplaced = fragment.getAtomByLocantOrThrow(fixLocantCapitalisation(locants1[1])); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToBeReplaced, atomOnLessPrimedFragment); if (atomToBeReplaced.hasSpareValency()){ atomOnLessPrimedFragment.setSpareValency(true); } String[] locants2 = locants[1].split(","); atomOnLessPrimedFragment = fragment.getAtomByLocantOrThrow(fixLocantCapitalisation(locants2[0])); atomToBeReplaced = fragment.getAtomByLocantOrThrow(fixLocantCapitalisation(locants2[1])); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToBeReplaced, atomOnLessPrimedFragment); if (atomToBeReplaced.hasSpareValency()){ atomOnLessPrimedFragment.setSpareValency(true); } group.setValue("dispiroter" + group.getValue()); } /** * The features between the polyCyclicSpiroDescriptor and the first group element, or beween the STRUCTURALOPENBRACKET_EL and STRUCTURALCLOSEBRACKET_EL * are found and then passed to resolveFeaturesOntoGroup * @param polyCyclicSpiroDescriptor * @throws StructureBuildingException * @throws ComponentGenerationException */ private void determineFeaturesToResolveInSingleComponentSpiro(Element polyCyclicSpiroDescriptor) throws StructureBuildingException, ComponentGenerationException { Element possibleOpenBracket = OpsinTools.getNextSibling(polyCyclicSpiroDescriptor); List elementsToResolve; if (possibleOpenBracket.getName().equals(STRUCTURALOPENBRACKET_EL)){ possibleOpenBracket.detach(); elementsToResolve = OpsinTools.getSiblingsUpToElementWithTagName(polyCyclicSpiroDescriptor, STRUCTURALCLOSEBRACKET_EL); OpsinTools.getNextSibling(elementsToResolve.get(elementsToResolve.size()-1)).detach();//detach 
close bracket
		}
		else{
			elementsToResolve = OpsinTools.getSiblingsUpToElementWithTagName(polyCyclicSpiroDescriptor, GROUP_EL);
		}
		resolveFeaturesOntoGroup(elementsToResolve);
	}

	/**
	 * Given some elements including a group element resolves all locanted and unlocanted features.
	 * If suffixes are present these are resolved and detached
	 * @param elementsToResolve
	 * @throws StructureBuildingException
	 * @throws ComponentGenerationException
	 */
	private void resolveFeaturesOntoGroup(List<Element> elementsToResolve) throws StructureBuildingException, ComponentGenerationException{
		if (elementsToResolve.isEmpty()){
			return;
		}
		Element substituentToResolve = new GroupingEl(SUBSTITUENT_EL);//temporary element containing elements that should be resolved before the ring is cloned
		Element parent = elementsToResolve.get(0).getParent();
		int index = parent.indexOf(elementsToResolve.get(0));
		Element group =null;
		List<Element> suffixes = new ArrayList<>();
		Element locant =null;
		for (Element element : elementsToResolve) {
			String elName =element.getName();
			if (elName.equals(GROUP_EL)){
				group = element;
			}
			else if (elName.equals(SUFFIX_EL)){
				suffixes.add(element);
			}
			else if (elName.equals(LOCANT_EL) && group==null){
				locant = element;
			}
			element.detach();
			substituentToResolve.addChild(element);
		}
		if (group ==null){
			throw new ComponentGenerationException("OPSIN bug: group element should have been given to method");
		}
		if (locant !=null){//locant is probably an indirect locant, try and assign it
			List<Element> locantAble = findElementsMissingIndirectLocants(substituentToResolve, locant);
			String[] locantValues = locant.getValue().split(",");
			if (locantAble.size() >= locantValues.length){
				for (int i = 0; i < locantValues.length; i++) {
					String locantValue = locantValues[i];
					locantAble.get(i).addAttribute(new Attribute(LOCANT_ATR, locantValue));
				}
				locant.detach();
			}
		}
		if (!suffixes.isEmpty()){
			suffixApplier.resolveSuffixes(group, suffixes);
			for (Element suffix : suffixes) {
				suffix.detach();
			}
		}
		if (substituentToResolve.getChildCount()
				!= 0){
			StructureBuildingMethods.resolveLocantedFeatures(state, substituentToResolve);
			StructureBuildingMethods.resolveUnLocantedFeatures(state, substituentToResolve);
			List<Element> children = substituentToResolve.getChildElements();
			for (int i = children.size() -1; i>=0; i--) {
				Element child = children.get(i);
				child.detach();
				parent.insertChild(child, index);
			}
		}
	}

	private static class SortBridgesByHighestLocantedBridgehead implements Comparator<Fragment>{

		private final Map<Fragment, Atom[]> bridgeToRingAtoms;

		SortBridgesByHighestLocantedBridgehead(Map<Fragment, Atom[]> bridgeToRingAtoms) {
			this.bridgeToRingAtoms = bridgeToRingAtoms;
		}

		public int compare(Fragment bridge1, Fragment bridge2) {
			Atom[] ringAtoms1 = bridgeToRingAtoms.get(bridge1);
			int bridge1HighestRingLocant = Math.max(getLocantNumber(ringAtoms1[0]),getLocantNumber(ringAtoms1[1]));

			Atom[] ringAtoms2 = bridgeToRingAtoms.get(bridge2);
			int bridge2HighestRingLocant = Math.max(getLocantNumber(ringAtoms2[0]),getLocantNumber(ringAtoms2[1]));

			if (bridge1HighestRingLocant > bridge2HighestRingLocant){
				return -1;
			}
			if (bridge1HighestRingLocant < bridge2HighestRingLocant){
				return 1;
			}
			return 0;
		}
	}

	/**
	 * Processes bridges e.g.
4,7-methanoindene * Resolves and attaches said bridges to the adjacent ring fragment * Numbers the bridges in accordance with FR-8.6/FR-8.7 * @param subOrRoot * @throws StructureBuildingException */ private void processFusedRingBridges(Element subOrRoot) throws StructureBuildingException { List bridges = subOrRoot.getChildElements(FUSEDRINGBRIDGE_EL); int bridgeCount = bridges.size(); if (bridgeCount == 0) { return; } Element groupEl = OpsinTools.getNextSibling(bridges.get(bridgeCount - 1), GROUP_EL); Fragment ringFrag = groupEl.getFrag(); Map bridgeToRingAtoms = new LinkedHashMap<>(); for (Element bridge : bridges) { Element possibleMultiplier = OpsinTools.getPreviousSibling(bridge); List locants = null; int multiplier = 1; if (possibleMultiplier != null) { Element possibleLocant; if (possibleMultiplier.getName().equals(MULTIPLIER_EL)) { multiplier = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); possibleLocant = OpsinTools.getPreviousSibling(possibleMultiplier); possibleMultiplier.detach(); if (possibleLocant != null && possibleLocant.getName().equals(COLONORSEMICOLONDELIMITEDLOCANT_EL)) { locants = new ArrayList<>(); String[] locantsForEachMultiple = StringTools.removeDashIfPresent(possibleLocant.getValue()).split(":"); if (locantsForEachMultiple.length != multiplier) { throw new RuntimeException("Mismatch between locant and multiplier counts (" + locantsForEachMultiple.length + " and " + multiplier + "): " + possibleLocant.getValue()); } for (String locantsForInstance : locantsForEachMultiple) { String[] locantArray = locantsForInstance.split(","); if (locantArray.length != 2) { throw new RuntimeException("Expected two locants per bridge, but was: " + possibleLocant.getValue()); } locants.add(locantArray); } possibleLocant.detach(); } } else { possibleLocant = possibleMultiplier; if (possibleLocant != null && possibleLocant.getName().equals(LOCANT_EL)) { String[] locantArray = possibleLocant.getValue().split(","); if (locantArray.length == 
2) { locants = new ArrayList<>(); locants.add(locantArray); possibleLocant.detach(); } } } } for (int i = 0; i < multiplier; i++) { Fragment bridgeFrag = state.fragManager.buildSMILES(bridge.getAttributeValue(VALUE_ATR), groupEl, NONE_LABELS_VAL); Atom[] ringAtoms; if (locants != null) { String[] locantArray = locants.get(i); if (locantArray.length == 2) { bridgeFrag.getOutAtom(0).setLocant(locantArray[0]); bridgeFrag.getOutAtom(1).setLocant(locantArray[1]); } ringAtoms = StructureBuildingMethods.formEpoxide(state, bridgeFrag, ringFrag.getDefaultInAtomOrFirstAtom()); } else{ List possibleAtoms = FragmentTools.findSubstituableAtoms(ringFrag, 1); if (possibleAtoms.isEmpty()) { throw new StructureBuildingException("Unable to find suitable atom to form bridge"); } if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtoms, 1)) { state.addIsAmbiguous("Addition of bridge to: " + groupEl.getValue()); } ringAtoms = StructureBuildingMethods.formEpoxide(state, bridgeFrag, possibleAtoms.get(0)); } bridgeToRingAtoms.put(bridgeFrag, ringAtoms); state.fragManager.incorporateFragment(bridgeFrag, ringFrag); } bridge.detach(); } int highestLocant = getHighestNumericLocant(ringFrag); List bridgeFragments = new ArrayList<>(bridgeToRingAtoms.keySet()); Collections.sort(bridgeFragments, new SortBridgesByHighestLocantedBridgehead(bridgeToRingAtoms)); for (Fragment bridgeFragment: bridgeFragments) { List bridgeFragmentAtoms = bridgeFragment.getAtomList(); Atom[] ringAtoms = bridgeToRingAtoms.get(bridgeFragment); if (getLocantNumber(ringAtoms[0]) <= getLocantNumber(ringAtoms[1])){ for (int i = bridgeFragmentAtoms.size() - 1; i >=0; i--) { bridgeFragmentAtoms.get(i).addLocant(String.valueOf(++highestLocant)); } } else{ for (Atom atom : bridgeFragmentAtoms) { atom.addLocant(String.valueOf(++highestLocant)); } } } } private static int getLocantNumber(Atom atom) { String locant = atom.getFirstLocant(); if (locant != null) { Matcher m = MATCH_NUMERIC_LOCANT.matcher(locant); if (m.matches()){ 
				return Integer.parseInt(m.group(1));
			}
		}
		return 0;
	}

	private int getHighestNumericLocant(Fragment ringFrag) {
		for (int i = 1; ; i++) {
			if (ringFrag.getAtomByLocant(String.valueOf(i)) == null){
				return i - 1;
			}
		}
	}

	/**
	 * Searches for lambdaConvention elements and applies the valency they specify to the atom
	 * they specify on the substituent/root's fragment
	 * @param subOrRoot
	 * @throws StructureBuildingException
	 */
	private void applyLambdaConvention(Element subOrRoot) throws StructureBuildingException {
		List<Element> lambdaConventionEls = subOrRoot.getChildElements(LAMBDACONVENTION_EL);
		for (Element lambdaConventionEl : lambdaConventionEls) {
			Fragment frag = subOrRoot.getFirstChildElement(GROUP_EL).getFrag();
			if (lambdaConventionEl.getAttribute(LOCANT_ATR)!=null){
				frag.getAtomByLocantOrThrow(lambdaConventionEl.getAttributeValue(LOCANT_ATR)).setLambdaConventionValency(Integer.parseInt(lambdaConventionEl.getAttributeValue(LAMBDA_ATR)));
			}
			else{
				if (frag.getAtomCount()!=1){
					throw new StructureBuildingException("Ambiguous use of lambda convention. 
Fragment has more than 1 atom but no locant was specified for the lambda"); } frag.getFirstAtom().setLambdaConventionValency(Integer.parseInt(lambdaConventionEl.getAttributeValue(LAMBDA_ATR))); } lambdaConventionEl.detach(); } } /** * Uses the number of outAtoms that are present to assign the number of outAtoms on substituents that can have a variable number of outAtoms * Hence at this point it can be determined if a multi radical susbtituent is present in the name * This would be expected in multiplicative nomenclature and is noted in the state so that the StructureBuilder knows to resolve the * section of the name from that point onwards in a left to right manner rather than right to left * @param subOrRoot: The sub/root to look in * @throws ComponentGenerationException * @throws StructureBuildingException */ private void handleMultiRadicals(Element subOrRoot) throws ComponentGenerationException, StructureBuildingException{ Element group =subOrRoot.getFirstChildElement(GROUP_EL); String groupValue =group.getValue(); Fragment thisFrag = group.getFrag(); if (groupValue.equals("methylene") || groupValue.equals("methylen") || groupValue.equals("oxy")|| matchChalcogenReplacement.matcher(groupValue).matches()){//resolves for example trimethylene to propan-1,3-diyl or dithio to disulfan-1,2-diyl. 
Locants may not be specified before the multiplier Element beforeGroup = OpsinTools.getPreviousSibling(group); if (beforeGroup!=null && beforeGroup.getName().equals(MULTIPLIER_ATR) && beforeGroup.getAttributeValue(TYPE_ATR).equals(BASIC_TYPE_VAL) && OpsinTools.getPreviousSibling(beforeGroup)==null){ int multiplierVal = Integer.parseInt(beforeGroup.getAttributeValue(VALUE_ATR)); if (!unsuitableForFormingChainMultiradical(group, beforeGroup)){ if (groupValue.equals("methylene") || groupValue.equals("methylen")){ group.getAttribute(VALUE_ATR).setValue(StringTools.multiplyString("C", multiplierVal)); } else if (groupValue.equals("oxy")){ group.getAttribute(VALUE_ATR).setValue(StringTools.multiplyString("O", multiplierVal)); } else if (groupValue.equals("thio")){ group.getAttribute(VALUE_ATR).setValue(StringTools.multiplyString("S", multiplierVal)); } else if (groupValue.equals("seleno")){ group.getAttribute(VALUE_ATR).setValue(StringTools.multiplyString("[SeH?]", multiplierVal)); } else if (groupValue.equals("telluro")){ group.getAttribute(VALUE_ATR).setValue(StringTools.multiplyString("[TeH?]", multiplierVal)); } else{ throw new ComponentGenerationException("unexpected group value"); } group.getAttribute(OUTIDS_ATR).setValue("1,"+Integer.parseInt(beforeGroup.getAttributeValue(VALUE_ATR))); group.setValue(beforeGroup.getValue() + groupValue); beforeGroup.detach(); if (group.getAttribute(LABELS_ATR)!=null){//use numeric numbering group.getAttribute(LABELS_ATR).setValue(NUMERIC_LABELS_VAL); } else{ group.addAttribute(new Attribute(LABELS_ATR, NUMERIC_LABELS_VAL)); } state.fragManager.removeFragment(thisFrag); thisFrag = resolveGroup(state, group); group.removeAttribute(group.getAttribute(USABLEASJOINER_ATR)); } } } if (group.getAttribute(OUTIDS_ATR)!=null){//adds outIDs at the specified atoms String[] radicalPositions = group.getAttributeValue(OUTIDS_ATR).split(","); int firstIdInFrag =thisFrag.getIdOfFirstAtom(); for (String radicalID : radicalPositions) { 
thisFrag.addOutAtom(firstIdInFrag + Integer.parseInt(radicalID) - 1, 1, true); } } int outAtomCount = thisFrag.getOutAtomCount(); if (outAtomCount >=2){ if (groupValue.equals("amine") || groupValue.equals("amin")) {//amine is a special case as it shouldn't technically be allowed but is allowed due to it's common usage in EDTA Element previousGroup = OpsinTools.getPreviousGroup(group); Element nextGroup = OpsinTools.getNextGroup(group); if (previousGroup==null || previousGroup.getFrag().getOutAtomCount() < 2 || nextGroup==null){//must be preceded by a multi radical throw new ComponentGenerationException("Invalid use of amine as a substituent!"); } } if (state.currentWordRule == WordRule.polymer){ if (outAtomCount >=3){//In poly mode nothing may have more than 2 outAtoms e.g. nitrilo is -N= or =N- int valency =0; for (int i = 2; i < outAtomCount; i++) { OutAtom nextOutAtom = thisFrag.getOutAtom(i); valency += nextOutAtom.getValency(); thisFrag.removeOutAtom(nextOutAtom); } thisFrag.getOutAtom(1).setValency(thisFrag.getOutAtom(1).getValency() + valency); } } } if (outAtomCount ==2 && EPOXYLIKE_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ Element possibleLocant = OpsinTools.getPreviousSibling(group); if (possibleLocant !=null){ String[] locantValues = possibleLocant.getValue().split(","); if (locantValues.length==2){ thisFrag.getOutAtom(0).setLocant(locantValues[0]); thisFrag.getOutAtom(1).setLocant(locantValues[1]); possibleLocant.detach(); subOrRoot.addAttribute(new Attribute(LOCANT_ATR, locantValues[0])); } } } int totalOutAtoms = outAtomCount + calculateOutAtomsToBeAddedFromInlineSuffixes(group, subOrRoot.getChildElements(SUFFIX_EL)); if (totalOutAtoms >= 2){ group.addAttribute(new Attribute (ISAMULTIRADICAL_ATR, Integer.toString(totalOutAtoms))); } } /** * Checks for cases where multiplier(methylene) or multiplier(thio) and the like should not be interpreted as one fragment * Something like nitrilotrithiotriacetic acid or oxetane-3,3-diyldimethylene 
	 * @param group
	 * @param multiplierBeforeGroup
	 * @return
	 */
	private boolean unsuitableForFormingChainMultiradical(Element group, Element multiplierBeforeGroup) {
		Element previousGroup = OpsinTools.getPreviousGroup(group);
		if (previousGroup!=null){
			if (previousGroup.getAttribute(ISAMULTIRADICAL_ATR)!=null){
				if (previousGroup.getAttributeValue(ACCEPTSADDITIVEBONDS_ATR)!=null && OpsinTools.getPreviousSibling(previousGroup.getParent())!=null){
					return false;
				}
				//the initial multiplier is preceded by another multiplier e.g. bis(dithio)
				if (OpsinTools.getPrevious(multiplierBeforeGroup).getName().equals(MULTIPLIER_EL)){
					return false;
				}
				if (previousGroup.getAttributeValue(ISAMULTIRADICAL_ATR).equals(multiplierBeforeGroup.getAttributeValue(VALUE_ATR))){
					return true;//probably multiplicative
				}
				else{
					return false;
				}
			}
			else if (OpsinTools.getPreviousSibling(previousGroup, MULTIPLIER_EL)==null){
				//This is a 99% solution to the detection of cases such as ethylidenedioxy == ethan-1,1-diyldioxy
				Fragment previousGroupFrag = previousGroup.getFrag();
				int outAtomValency =0;
				if (previousGroupFrag.getOutAtomCount()==1){
					outAtomValency = previousGroupFrag.getOutAtom(0).getValency();
				}
				else{
					Element suffix = OpsinTools.getNextSibling(previousGroup, SUFFIX_EL);
					if (suffix!=null && suffix.getAttributeValue(VALUE_ATR).equals("ylidene")){
						outAtomValency =2;
					}
					if (suffix!=null && suffix.getAttributeValue(VALUE_ATR).equals("ylidyne")){
						outAtomValency =3;
					}
				}
				if (outAtomValency==Integer.parseInt(multiplierBeforeGroup.getAttributeValue(VALUE_ATR))){
					return true;
				}
			}
		}
		return false;
	}

	/**
	 * Calculates number of OutAtoms that the resolveSuffixes method will add.
* @param group * @param suffixes * @return numberOfOutAtoms that will be added by resolveSuffixes * @throws ComponentGenerationException */ private int calculateOutAtomsToBeAddedFromInlineSuffixes(Element group, List suffixes) throws ComponentGenerationException { int outAtomsThatWillBeAdded = 0; Fragment frag = group.getFrag(); String groupType = frag.getType(); String subgroupType = frag.getSubType(); String suffixTypeToUse =null; if (suffixApplier.isGroupTypeWithSpecificSuffixRules(groupType)){ suffixTypeToUse =groupType; } else{ suffixTypeToUse = STANDARDGROUP_TYPE_VAL; } List suffixList =state.xmlSuffixMap.get(group); for (Fragment suffix : suffixList) { outAtomsThatWillBeAdded += suffix.getOutAtomCount(); } for (Element suffix : suffixes) { String suffixValue = suffix.getAttributeValue(VALUE_ATR); List suffixRules = suffixApplier.getSuffixRuleTags(suffixTypeToUse, suffixValue, subgroupType); for (SuffixRule suffixRule : suffixRules) { if(suffixRule.getType() == SuffixRuleType.setOutAtom) { outAtomsThatWillBeAdded += 1; } } } return outAtomsThatWillBeAdded; } /** * Corrects something like L-alanyl-L-glutaminyl-L-arginyl-O-phosphono-L-seryl-L-alanyl-L-proline to: * ((((L-alanyl-L-glutaminyl)-L-arginyl)-O-phosphono-L-seryl)-L-alanyl)-L-proline * i.e. 
substituents go onto the last mentioned amino acid; amino acids chain together to form peptides * @param groups * @param brackets */ private void addImplicitBracketsToAminoAcids(List groups, List brackets) { for (int i = groups.size() -1; i >=0; i--) { Element group = groups.get(i); if (group.getAttributeValue(TYPE_ATR).equals(AMINOACID_TYPE_VAL) && OpsinTools.getNextGroup(group)!=null){ Element possibleLocant = OpsinTools.getPreviousSiblingIgnoringCertainElements(group, new String[]{MULTIPLIER_EL}); if (possibleLocant != null && possibleLocant.getName().equals(LOCANT_EL)){ continue; } Element subOrRoot = group.getParent(); //now find the brackets/substituents before this element Element previous = OpsinTools.getPreviousSibling(subOrRoot); List previousElements = new ArrayList<>(); while( previous !=null){ if (!previous.getName().equals(SUBSTITUENT_EL) && !previous.getName().equals(BRACKET_EL)){ break; } previousElements.add(previous); previous = OpsinTools.getPreviousSibling(previous); } if (previousElements.size()>0){//an implicit bracket is needed Collections.reverse(previousElements); Element bracket = new GroupingEl(BRACKET_EL); bracket.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); Element parent = subOrRoot.getParent(); int indexToInsertAt = parent.indexOf(previousElements.get(0)); for (Element element : previousElements) { element.detach(); bracket.addChild(element); } subOrRoot.detach(); bracket.addChild(subOrRoot); parent.insertChild(bracket, indexToInsertAt); brackets.add(bracket); } } } } /** * Looks for whether this substituent should be bracketed to the substituent before it * E.g. 
	 * dimethylaminobenzene -> (dimethylamino)benzene when the substituent is the amino
	 * The list of brackets is modified if the method does something
	 * @param substituent
	 * @param brackets
	 * @throws StructureBuildingException
	 * @throws ComponentGenerationException
	 */
	private void implicitlyBracketToPreviousSubstituentIfAppropriate(Element substituent, List<Element> brackets) throws ComponentGenerationException, StructureBuildingException {
		String firstElInSubName = substituent.getChild(0).getName();
		if (firstElInSubName.equals(LOCANT_EL) ||
				firstElInSubName.equals(MULTIPLIER_EL) ||
				firstElInSubName.equals(STEREOCHEMISTRY_EL)){
			return;
		}

		Element substituentGroup = substituent.getFirstChildElement(GROUP_EL);
		//Only some substituents are valid joiners (e.g. no rings are valid joiners). Need to be at least bivalent.
		if (substituentGroup.getAttribute(USABLEASJOINER_ATR) == null){
			return;
		}

		//checks that the element before is a substituent or a bracket which will obviously include substituent/s
		//this makes sure that there would be more than just a substituent if a bracket is added
		Element elementBeforeSubstituent = OpsinTools.getPreviousSibling(substituent);
		if (elementBeforeSubstituent == null ||
				!elementBeforeSubstituent.getName().equals(SUBSTITUENT_EL) &&
				!elementBeforeSubstituent.getName().equals(BRACKET_EL)) {
			return;
		}

		// groups like carbonyl/sulfonyl should typically be implicitly bracketed e.g. tert-butoxy-carbonyl, unless they are part of multiplicative nomenclature
		boolean sulfonylLike = matchGroupsThatAreAlsoInlineSuffixes.matcher(substituentGroup.getValue()).matches();

		Element elementAftersubstituent = OpsinTools.getNextSibling(substituent);
		if (elementAftersubstituent != null) {
			//Not preceded and followed by a bracket e.g. Not (benzyl)methyl(phenyl)amine c.f.
P-16.4.1.3 (draft 2004) //carbonyl-like allowed due to empirical usage if (!sulfonylLike && elementBeforeSubstituent.getName().equals(BRACKET_EL) && !IMPLICIT_TYPE_VAL.equals(elementBeforeSubstituent.getAttributeValue(TYPE_ATR)) && elementAftersubstituent.getName().equals(BRACKET_EL)) { Element firstChildElementOfElementAfterSubstituent = elementAftersubstituent.getChild(0); if ((firstChildElementOfElementAfterSubstituent.getName().equals(SUBSTITUENT_EL) || firstChildElementOfElementAfterSubstituent.getName().equals(BRACKET_EL)) && !OpsinTools.getPrevious(firstChildElementOfElementAfterSubstituent).getName().equals(HYPHEN_EL)) { return; } } } //there must be an element after the substituent (or the substituent is being used for locanted ester formation) for the implicit bracket to be required if (!isSubBracketOrRoot(elementAftersubstituent)) { if (!(elementAftersubstituent == null && locantedEsterImplicitBracketSpecialCase(substituent, elementBeforeSubstituent))) { return; } } //look for hyphen between substituents, this seems to indicate implicit bracketing was not desired e.g. dimethylaminomethane vs dimethyl-aminomethane //an exception is made for groups like carbonyl/sulfonyl as these typically should be implicitly bracketed e.g. tert-butoxy-carbonyl Element elementDirectlyBeforeSubstituent = OpsinTools.getPrevious(substituent.getChild(0));//can't return null as we know elementBeforeSubstituent is not null if (!sulfonylLike && elementDirectlyBeforeSubstituent.getName().equals(HYPHEN_EL)) { return; } List groupElements = OpsinTools.getDescendantElementsWithTagName(elementBeforeSubstituent, GROUP_EL);//one for a substituent, possibly more for a bracket Element lastGroupOfElementBeforeSub = groupElements.get(groupElements.size() - 1); if (lastGroupOfElementBeforeSub == null) { throw new ComponentGenerationException("No group where group was expected"); } //prevents alkyl chains being bracketed together e.g. 
ethylmethylamine //...unless it's something like 2-methylethyl where the first appears to be locanted onto the second if (substituentsAreEndToEndAlkyls(substituentGroup, lastGroupOfElementBeforeSub, elementBeforeSubstituent)) { return; } Fragment frag = substituentGroup.getFrag(); //prevent bracketing to multi radicals unless through substitution they are likely to cease being multiradicals if (lastGroupOfElementBeforeSub.getAttribute(ISAMULTIRADICAL_ATR) != null && lastGroupOfElementBeforeSub.getAttribute(ACCEPTSADDITIVEBONDS_ATR) == null && lastGroupOfElementBeforeSub.getAttribute(IMINOLIKE_ATR) == null) { return; } if (substituentGroup.getAttribute(ISAMULTIRADICAL_ATR) != null) { if (substituentGroup.getAttribute(ACCEPTSADDITIVEBONDS_ATR) == null && substituentGroup.getAttribute(IMINOLIKE_ATR) == null) { //after implicit bracketting the substituent should no longer be a multi-radical. If neither of the above attributes apply this can't happen return; } //being not substitutable doesn't mean it can't form additive bonds cf. oxy. 
Additive bonds can still benefit from implicit bracketing boolean isSubstitutable = false; for (Atom atom : frag) { if (StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(atom) > 0){ isSubstitutable = true; break; } } if (!isSubstitutable && elementAftersubstituent != null && elementAftersubstituent.getChild(0).getName().equals(MULTIPLIER_EL)) { //return if multiplicative nomenclature detected, if the multiplier differs from the out atom count, additive bonds may still be possible if (frag.getOutAtomCount() == Integer.parseInt(elementAftersubstituent.getChild(0).getAttributeValue(VALUE_ATR))){ String elType = elementAftersubstituent.getName(); if (elType.equals(ROOT_EL)) { return; } else if (elType.equals(SUBSTITUENT_EL)) { List groups = OpsinTools.getDescendantElementsWithTagName(elementAftersubstituent, GROUP_EL); for (Element group : groups) { if (group.getAttribute(ISAMULTIRADICAL_ATR) != null){ return ;//a multi radical } } } else if (elType.equals(BRACKET_EL) && OpsinTools.getDescendantElementsWithTagName(elementAftersubstituent, ROOT_EL).size() > 0) { return; } } } } if (lastGroupOfElementBeforeSub.getAttribute(IMINOLIKE_ATR) != null && substituentGroup.getAttribute(IMINOLIKE_ATR) != null){ return;//possibly a multiplicative additive operation } if (implicitBracketWouldPreventAdditiveBonding(elementBeforeSubstituent, elementAftersubstituent)) { return;//e.g. N-ethylmethylsulfonimidoyl } if (implicitBracketWouldPreventConnectionToAmineSuffix(elementBeforeSubstituent, elementAftersubstituent)) { return;//e.g. 
N-methylmethyleneamine } if (substituentGroup.getValue().equals("sulf") && frag.getAtomCount() == 1) { Element suffix = OpsinTools.getNextSiblingIgnoringCertainElements(substituentGroup, new String[]{UNSATURATOR_EL}); if (suffix != null && suffix.getAttributeValue(VALUE_ATR).equals("ylidene")) { substituentGroup.removeAttribute(substituentGroup.getAttribute(USABLEASJOINER_ATR)); //TODO resolve suffixes as early as can be done unambiguously //e.g. it should be possible to know that sulfanylidene has 0 hydrogen but azaniumylidyne has 1 return; } } //prevent bracketing perhalogeno terms if (PERHALOGENO_SUBTYPE_VAL.equals(lastGroupOfElementBeforeSub.getAttributeValue(SUBTYPE_ATR))) { return; } if (frag.getAtomCount() == 1 && frag.getFirstAtom().getElement() == ChemEl.Si) { //silyl is typically triple substituted, so more than just the preceding substituent potentially should be inside the implicit bracket //e.g. ethoxydimethylsilylpropyl should be (ethoxydimethylsilyl)propyl elementBeforeSubstituent = determineSubstitutentForSilylImplicitBracketting(elementBeforeSubstituent, 3); } /* * locant may need to be moved. This occurs when the group in elementBeforeSubstituent is not supposed to be locanted onto * theSubstituentGroup * e.g. 
2-aminomethyl-1-chlorobenzene where the 2 refers to the benzene NOT the methyl */ List locantRelatedElements = new ArrayList<>();//sometimes moved String[] locantValues = null; List stereoChemistryElements = new ArrayList<>();//always moved if bracketing occurs List childrenOfElementBeforeSubstituent = elementBeforeSubstituent.getChildElements(); for (Element childOfElBeforeSub : childrenOfElementBeforeSubstituent) { String currentElementName = childOfElBeforeSub.getName(); if (currentElementName.equals(STEREOCHEMISTRY_EL)){ stereoChemistryElements.add(childOfElBeforeSub); } else if (currentElementName.equals(LOCANT_EL)) { if (locantValues != null) { break; } locantRelatedElements.add(childOfElBeforeSub); locantValues = childOfElBeforeSub.getValue().split(","); } else{ break; } } //either all locants will be moved, or none boolean moveLocants = false; if (locantValues != null) { Element elAfterLocant = OpsinTools.getNextSibling(locantRelatedElements.get(0)); for (String locantText : locantValues) { //Check the right fragment in the bracket: //if it only has 1 then assume locanted substitution onto it not intended. Or if doesn't have the required locant if (frag.getAtomCount() == 1 || !frag.hasLocant(locantText) || matchElementSymbolOrAminoAcidLocant.matcher(locantText).find() || (locantValues.length == 1 && elAfterLocant.getName().equals(MULTIPLIER_EL))) { if (checkLocantPresentOnPotentialRoot(state, substituent, locantText)){ moveLocants = true;//locant location is present elsewhere break; } else { if( frag.getAtomCount() == 1 && frag.hasLocant(locantText)) { //1 locant was intended to locant onto fragment with 1 atom } else{ moveLocants = true;//the fragment adjacent to the locant doesn't have this locant or doesn't need any indirect locants. 
Assume it will appear elsewhere later break; } } } } if (moveLocants && locantValues.length > 1) { if (elAfterLocant != null && elAfterLocant.getName().equals(MULTIPLIER_EL)) { Element shouldBeAGroupOrSubOrBracket = OpsinTools.getNextSiblingIgnoringCertainElements(elAfterLocant, new String[]{MULTIPLIER_EL}); if (shouldBeAGroupOrSubOrBracket != null) { if ((shouldBeAGroupOrSubOrBracket.getName().equals(GROUP_EL) && elAfterLocant.getAttributeValue(TYPE_ATR).equals(GROUP_TYPE_VAL))//e.g. 2,5-bisaminothiobenzene --> 2,5-bis(aminothio)benzene || sulfonylLike){//e.g. 4,4'-dimethoxycarbonyl-2,2'-bioxazole --> 4,4'-di(methoxycarbonyl)-2,2'-bioxazole locantRelatedElements.add(elAfterLocant);//e.g. 1,5-bis-(4-methylphenyl)sulfonyl --> 1,5-bis-((4-methylphenyl)sulfonyl) } else if (ORTHOMETAPARA_TYPE_VAL.equals(locantRelatedElements.get(0).getAttributeValue(TYPE_ATR))) {//e.g. p-dimethylamino[ring] locantRelatedElements.get(0).setValue(locantValues[1]); } else if (frag.getAtomCount() == 1) {//e.g. 1,3,4-trimethylthiobenzene --> 1,3,4-tri(methylthio)benzene locantRelatedElements.add(elAfterLocant); } else{//don't bracket other complex multiplied substituents (name hasn't given enough hints if indeed bracketing was expected) return; } } else{ moveLocants = false; } } else{ moveLocants = false; } } } Element bracket = new GroupingEl(BRACKET_EL); bracket.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); for (Element stereoChemistryElement : stereoChemistryElements) { stereoChemistryElement.detach(); bracket.addChild(stereoChemistryElement); } if (moveLocants){ for (Element locantElement : locantRelatedElements) { locantElement.detach(); bracket.addChild(locantElement); } } /* * Case when a multiplier should be moved * e.g. 
tripropan-2-yloxyphosphane -->tri(propan-2-yloxy)phosphane or trispropan-2-ylaminophosphane --> tris(propan-2-ylamino)phosphane */ if (locantRelatedElements.isEmpty()) { Element possibleMultiplier = childrenOfElementBeforeSubstituent.get(0); if (possibleMultiplier.getName().equals(MULTIPLIER_EL) && ( sulfonylLike || possibleMultiplier.getAttributeValue(TYPE_ATR).equals(GROUP_TYPE_VAL))){ Element desiredGroup = OpsinTools.getNextSiblingIgnoringCertainElements(possibleMultiplier, new String[]{MULTIPLIER_EL}); if (desiredGroup !=null && desiredGroup.getName().equals(GROUP_EL)) { possibleMultiplier.detach(); bracket.addChild(possibleMultiplier); } } } Element parent = substituent.getParent(); int startIndex = parent.indexOf(elementBeforeSubstituent); int endIndex = parent.indexOf(substituent); for(int i = 0 ; i <= (endIndex - startIndex); i++) { Element n = parent.getChild(startIndex); n.detach(); bracket.addChild(n); } parent.insertChild(bracket, startIndex); brackets.add(bracket); } private boolean substituentsAreEndToEndAlkyls(Element group, Element precedingGroup, Element precedingGroupSubstituent) { if (!isPotentialAlkyl(group) || !isPotentialAlkyl(precedingGroup)) { return false; } Element suffixAfterGroup = OpsinTools.getNextSibling(precedingGroup, SUFFIX_EL); //if the alkane ends in oxy, sulfinyl, sulfonyl etc. 
it's not a pure alkane //the outatom check rules out things like "oyl" which don't extend the chain if (suffixAfterGroup !=null && suffixAfterGroup.getFrag() != null && suffixAfterGroup.getFrag().getOutAtomCount() > 0) { return false; } Element locant = null; Element multiplier = null; for (int i = 0, len = precedingGroupSubstituent.getChildCount(); i < len; i++) { Element childOfElBeforeSub = precedingGroupSubstituent.getChild(i); String elName = childOfElBeforeSub.getName(); if (elName.equals(LOCANT_EL)) { locant = childOfElBeforeSub; Element next = OpsinTools.getNextSibling(childOfElBeforeSub); if (next != null && next.getName().equals(MULTIPLIER_EL)) { multiplier = next; } break; } else if (!elName.equals(STEREOCHEMISTRY_EL)){ break; } } if (locant != null) { //It's very rare to have a locanted substituent followed by an unlocanted substituent //Hence the more typical interpretation is that either: //The first substituent connects to the latter e.g. 1,1-dimethylethyl //The two substituents really should be joined end to end e.g. formylmethyl // //Both these scenarios are handled by returning that this isn't a end to end alkyl link and hence a bracket can be added String[] locantVals = locant.getValue().split(","); if (!fragHasLocants(group.getFrag(), locantVals) && (multiplier == null || Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)) == locantVals.length) && isSimpleAlkyl(group) && isSimpleAlkyl(precedingGroup)) { //special case //e.g. N,N-diethylmethylphosphonamidate (methyl is unlocanted) return true; } if (multiplier != null && Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)) == 2) { Element nextGroup = OpsinTools.getNextGroup(group); if (nextGroup != null) { Fragment frag = nextGroup.getFrag(); if (frag.getAtomCount() == 1 && frag.getFirstAtom().getElement() == ChemEl.Si) { //special case //e.g. 
3-dimethylethoxysilylpropyl return true; } } } return false; } return true; } private boolean fragHasLocants(Fragment frag, String[] locantVals) { for (String locantVal : locantVals) { if (!frag.hasLocant(locantVal)) { return false; } } return true; } private boolean isPotentialAlkyl(Element group) { String type = group.getAttributeValue(TYPE_ATR); return (type.equals(CHAIN_TYPE_VAL) && group.getAttributeValue(SUBTYPE_ATR).equals(ALKANESTEM_SUBTYPE_VAL) || type.equals(ACIDSTEM_TYPE_VAL)); } private boolean isSimpleAlkyl(Element group) { Element suffix= OpsinTools.getNextSibling(group, SUFFIX_EL); String type = group.getAttributeValue(TYPE_ATR); return type.equals(CHAIN_TYPE_VAL) && group.getAttributeValue(SUBTYPE_ATR).equals(ALKANESTEM_SUBTYPE_VAL) && suffix != null && suffix.getValue().equals("yl") ; } private boolean implicitBracketWouldPreventAdditiveBonding(Element elementBeforeSubstituent, Element elementAftersubstituent) { if (elementAftersubstituent != null && elementAftersubstituent.getName().equals(SUBSTITUENT_EL)) { Element groupAfterSubstituent = elementAftersubstituent.getFirstChildElement(GROUP_EL); if (groupAfterSubstituent.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && !isSubBracketOrRoot(OpsinTools.getNextSibling(elementAftersubstituent))) { if (elementBeforeSubstituent.getChild(0).getName().equals(LOCANT_EL)) { Fragment additiveAcceptingFrag = groupAfterSubstituent.getFrag(); Element viableSubstituent = elementBeforeSubstituent; while (viableSubstituent != null) { if (viableSubstituent.getName().equals(SUBSTITUENT_EL) || viableSubstituent.getName().equals(BRACKET_EL)) { Element possibleLocant = viableSubstituent.getChild(0); if (possibleLocant.getName().equals(LOCANT_EL)){ if (additiveAcceptingFrag.getFirstAtom().equals(additiveAcceptingFrag.getAtomByLocant(possibleLocant.getValue()))) { return false; } } } viableSubstituent = OpsinTools.getPreviousSibling(viableSubstituent); } return true; } } } return false; } private boolean 
implicitBracketWouldPreventConnectionToAmineSuffix(Element elementBeforeSubstituent, Element elementAftersubstituent) { //Treat amine like a suffix rather than a group cf. 815.3 (IUPAC 1979)... but only when a locant is present //e.g. Dimethylaminopropylamine should still be (Dimethylaminopropyl)amine //but N-methylmethyleneamine not N-(methylmethylene)amine if (elementAftersubstituent != null && elementAftersubstituent.getName().equals(ROOT_EL) && elementAftersubstituent.getChildCount() == 1) { String val = elementAftersubstituent.getValue(); if ((val.equals("amine") || val.equals("amin")) && elementBeforeSubstituent.getChild(0).getName().equals(LOCANT_EL)) { return true ; } } return false; } private Element determineSubstitutentForSilylImplicitBracketting(Element el, int expectedSubstituents) { Element elToUse = el; while (elToUse != null && (elToUse.getName().equals(SUBSTITUENT_EL) || elToUse.getName().equals(BRACKET_EL))) { int multiplier = 1; boolean locantPresent = false;//locant can be present on the first substituent, but not others e.g. 3-dimethylethoxysilylpropyl int index = 0; int childCount = elToUse.getChildCount(); Element childToConsider = elToUse.getChild(index++); if (childToConsider.getName().equals(LOCANT_EL)) { locantPresent = true; childToConsider = index < childCount ? elToUse.getChild(index++) : null; } if (childToConsider != null && childToConsider.getName().equals(MULTIPLIER_EL)) { multiplier = Integer.parseInt(childToConsider.getAttributeValue(VALUE_ATR)); childToConsider = index < childCount ? 
						elToUse.getChild(index++) : null;
			}
			if (childToConsider == null ||
					!(childToConsider.getName().equals(GROUP_EL) || childToConsider.getName().equals(SUBSTITUENT_EL))) {
				return el;//Not expected substituent content
			}
			expectedSubstituents -= multiplier;
			if (expectedSubstituents <=0) {
				if (expectedSubstituents < 0) {
					return el;//Not expected substituent content
				}
				return elToUse;//extend implicit bracket
			}
			if (locantPresent) {
				return el;
			}
			elToUse = OpsinTools.getPreviousSibling(elToUse);
		}
		return el;
	}

	/**
	 * Returns true in the case that:
	 * the given substituent is a direct child of a word element
	 * The preceding substituent/bracket is the first element in the word element
	 * The current word rule involves locanted ester like linkages
	 * @param substituent
	 * @param elementBeforeSubstituent
	 * @return
	 */
	private boolean locantedEsterImplicitBracketSpecialCase(Element substituent, Element elementBeforeSubstituent) {
		if (substituent.getParent().getName().equals(WORD_EL) &&
				OpsinTools.getPreviousSibling(elementBeforeSubstituent) == null &&
				(state.currentWordRule == WordRule.ester ||
				state.currentWordRule == WordRule.functionalClassEster ||
				state.currentWordRule == WordRule.multiEster ||
				state.currentWordRule == WordRule.acetal)){
			return true;
		}
		return false;
	}

	/**
	 * Attempts to match locants to non adjacent suffixes/unsaturators
	 * e.g. 2-propanol, 3-furyl, 2'-Butyronaphthone
	 * @param subOrRoot The substituent/root to look for locants in.
	 * @throws StructureBuildingException
	 */
	private void matchLocantsToIndirectFeatures(Element subOrRoot) throws StructureBuildingException {
		/* Root fragments (or the root in a bracket) can have prefix-locants
		 * that work on suffixes - (2-furyl), 2-propanol, (2-propylmethyl), (2-propyloxy), 2'-Butyronaphthone.
*/ List locantEls = findLocantsThatCouldBeIndirectLocants(subOrRoot); if (locantEls.size()>0){ Element group =subOrRoot.getFirstChildElement(GROUP_EL); Element lastLocant = locantEls.get(locantEls.size()-1);//the locant that may apply to an unsaturator/suffix String[] locantValues = lastLocant.getValue().split(","); if (locantValues.length==1 && group.getAttribute(FRONTLOCANTSEXPECTED_ATR)!=null){//some trivial retained names like 2-furyl expect locants to be in front of them. For these the indirect intepretation will always be used rather than checking whether 2-(furyl) even makes sense String[] allowedLocants = group.getAttributeValue(FRONTLOCANTSEXPECTED_ATR).split(","); for (String allowedLocant : allowedLocants) { if (locantValues[0].equals(allowedLocant)){ Element expectedSuffix = OpsinTools.getNextSibling(group); if (expectedSuffix!=null && expectedSuffix.getName().equals(SUFFIX_EL) && expectedSuffix.getAttribute(LOCANT_ATR)==null){ expectedSuffix.addAttribute(new Attribute(LOCANT_ATR, locantValues[0])); lastLocant.detach(); return; } break; } } } boolean allowIndirectLocants =true; if(state.currentWordRule == WordRule.multiEster && !ADDEDHYDROGENLOCANT_TYPE_VAL.equals(lastLocant.getAttributeValue(TYPE_ATR))){//special case e.g. 1-benzyl 4-butyl terephthalate (locants do not apply to yls) Element parentEl = subOrRoot.getParent(); if (parentEl.getName().equals(WORD_EL) && parentEl.getAttributeValue(TYPE_ATR).equals(SUBSTITUENT_EL) && parentEl.getChildCount()==1 && locantValues.length==1 && !ORTHOMETAPARA_TYPE_VAL.equals(lastLocant.getAttributeValue(TYPE_ATR))){ allowIndirectLocants =false; } } Fragment fragmentAfterLocant = group.getFrag(); if (fragmentAfterLocant.getAtomCount()<=1){ allowIndirectLocants =false;//e.g. prevent 1-methyl as meth-1-yl is extremely unlikely to be the intended result } if (allowIndirectLocants){ /* The first locant is most likely a locant indicating where this subsituent should be attached. 
* If the locant cannot be found on a potential root this cannot be the case though (assuming the name is valid of course) */ if (!ADDEDHYDROGENLOCANT_TYPE_VAL.equals(lastLocant.getAttributeValue(TYPE_ATR)) && locantEls.size() ==1 && group.getAttribute(ISAMULTIRADICAL_ATR)==null && locantValues.length == 1 && checkLocantPresentOnPotentialRoot(state, subOrRoot, locantValues[0]) && OpsinTools.getPreviousSibling(lastLocant, LOCANT_EL)==null){ return; } boolean assignableToIndirectFeatures =true; List locantAble =findElementsMissingIndirectLocants(subOrRoot, lastLocant); if (locantAble.size() < locantValues.length){ assignableToIndirectFeatures =false; } else{ for (String locantValue : locantValues) { if (!fragmentAfterLocant.hasLocant(locantValue)){//locant is not available on the group adjacent to the locant! assignableToIndirectFeatures =false; } } } if (!assignableToIndirectFeatures){//usually indicates the name will fail unless the suffix has the locant or heteroatom replacement will create the locant if (locantValues.length==1){ List suffixes =state.xmlSuffixMap.get(group); //I do not want to assign element locants as in locants on the suffix as I currently know of no examples where this actually occurs if (matchElementSymbolOrAminoAcidLocant.matcher(locantValues[0]).matches()){ return; } for (Fragment suffix : suffixes) { if (suffix.hasLocant(locantValues[0])){//e.g. 
2'-Butyronaphthone Atom dummyRAtom =suffix.getFirstAtom(); List neighbours =dummyRAtom.getAtomNeighbours(); Bond b =null; atomLoop: for (Atom atom : neighbours) { List neighbourLocants = atom.getLocants(); for (String neighbourLocant : neighbourLocants) { if (MATCH_NUMERIC_LOCANT.matcher(neighbourLocant).matches()){ b = dummyRAtom.getBondToAtomOrThrow(atom); break atomLoop; } } } if (b!=null){ state.fragManager.removeBond(b);//the current bond between the dummy R and the suffix state.fragManager.createBond(dummyRAtom, suffix.getAtomByLocantOrThrow(locantValues[0]), b.getOrder()); lastLocant.detach(); } } } } } else{ for (int i = 0; i < locantValues.length; i++) { String locantValue = locantValues[i]; locantAble.get(i).addAttribute(new Attribute(LOCANT_ATR, locantValue)); } lastLocant.detach(); } } } } /** * Finds locants that are before a group element and not immediately followed by a multiplier * @param subOrRoot * @return */ private List findLocantsThatCouldBeIndirectLocants(Element subOrRoot) { List locantEls = new ArrayList<>(); for (int i = 0, len = subOrRoot.getChildCount(); i < len; i++) { Element el = subOrRoot.getChild(i); if (el.getName().equals(LOCANT_EL)) { Element afterLocant = OpsinTools.getNextSibling(el); if (afterLocant != null && afterLocant.getName().equals(MULTIPLIER_EL)){//locant should not be followed by a multiplier. c.f. 
					//1,2,3-tributyl 2-acetyloxypropane-1,2,3-tricarboxylate
					continue;
				}
				locantEls.add(el);
			}
			else if (el.getName().equals(GROUP_EL)) {
				break;
			}
		}
		return locantEls;
	}

	/**
	 * Find elements that can have indirect locants but don't currently
	 * This requirement excludes hydro and heteroatoms as it is assumed that locants for these are always adjacent (or handled by the special HW code in the case of heteroatoms)
	 * @param subOrRoot The subOrRoot of interest
	 * @param locantEl the locant, only elements after it will be considered
	 * @return An arrayList of locantable elements
	 */
	private List<Element> findElementsMissingIndirectLocants(Element subOrRoot,Element locantEl) {
		List<Element> locantAble = new ArrayList<>();
		List<Element> childrenOfSubOrBracketOrRoot=subOrRoot.getChildElements();
		for (Element el : childrenOfSubOrBracketOrRoot) {
			String name =el.getName();
			if (name.equals(SUFFIX_EL) || name.equals(UNSATURATOR_EL) || name.equals(CONJUNCTIVESUFFIXGROUP_EL)){
				if (el.getAttribute(LOCANT_ATR) ==null && el.getAttribute(LOCANTID_ATR) ==null && el.getAttribute(MULTIPLIED_ATR)==null){// shouldn't already have a locant or be multiplied (should have already had locants assigned to it if that were the case)
					if (subOrRoot.indexOf(el)>subOrRoot.indexOf(locantEl)){
						if (name.equals(SUFFIX_EL)){//check a few special cases that must not be locanted
							Element group = OpsinTools.getPreviousSibling(el, GROUP_EL);
							String type = group.getAttributeValue(TYPE_ATR);
							if ((type.equals(ACIDSTEM_TYPE_VAL) && !CYCLEFORMER_SUBTYPE_VAL.equals(el.getAttributeValue(SUBTYPE_ATR)))||
									type.equals(NONCARBOXYLICACID_TYPE_VAL) || type.equals(CHALCOGENACIDSTEM_TYPE_VAL)){
								continue;
							}
						}
						locantAble.add(el);
					}
				}
			}
		}
		return locantAble;
	}

	/**
	 * Put di-carbon modifying suffixes e.g.
oic acids, aldehydes on opposite ends of chain * @param subOrRoot * @throws StructureBuildingException */ private void assignImplicitLocantsToDiTerminalSuffixes(Element subOrRoot) throws StructureBuildingException { Element terminalSuffix1 = subOrRoot.getFirstChildElement(SUFFIX_EL); if (terminalSuffix1!=null){ if (isATerminalSuffix(terminalSuffix1) && OpsinTools.getNextSibling(terminalSuffix1) != null){ Element terminalSuffix2 =OpsinTools.getNextSibling(terminalSuffix1); if (isATerminalSuffix(terminalSuffix2)){ Element hopefullyAChain = OpsinTools.getPreviousSibling(terminalSuffix1, GROUP_EL); if (hopefullyAChain != null && hopefullyAChain.getAttributeValue(TYPE_ATR).equals(CHAIN_TYPE_VAL)){ int chainLength = hopefullyAChain.getFrag().getChainLength(); if (chainLength >=2){ terminalSuffix1.addAttribute(new Attribute(LOCANT_ATR, "1")); terminalSuffix2.addAttribute(new Attribute(LOCANT_ATR, Integer.toString(chainLength))); } } } } } } /** * Checks whether a suffix element is: * a suffix, an inline suffix OR terminal root suffix, has no current locant * @param suffix * @return */ private boolean isATerminalSuffix(Element suffix){ return suffix.getName().equals(SUFFIX_EL) && suffix.getAttribute(LOCANT_ATR) == null && (suffix.getAttributeValue(TYPE_ATR).equals(INLINE_TYPE_VAL) || TERMINAL_SUBTYPE_VAL.equals(suffix.getAttributeValue(SUBTYPE_ATR))); } private void processConjunctiveNomenclature(Element subOrRoot) throws ComponentGenerationException, StructureBuildingException { List conjunctiveGroups = subOrRoot.getChildElements(CONJUNCTIVESUFFIXGROUP_EL); int conjunctiveGroupCount = conjunctiveGroups.size(); if (conjunctiveGroupCount > 0){ Element ringGroup = subOrRoot.getFirstChildElement(GROUP_EL); Fragment ringFrag = ringGroup.getFrag(); if (ringFrag.getOutAtomCount()!=0 ){ throw new ComponentGenerationException("OPSIN Bug: Ring fragment should have no radicals"); } for (int i = 0; i < conjunctiveGroupCount; i++) { Element conjunctiveGroup = conjunctiveGroups.get(i); 
Fragment conjunctiveFragment = conjunctiveGroup.getFrag(); String locant = conjunctiveGroup.getAttributeValue(LOCANT_ATR); Atom atomToConnectToOnConjunctiveFrag = FragmentTools.lastNonSuffixCarbonWithSufficientValency(conjunctiveFragment); if (atomToConnectToOnConjunctiveFrag == null) { throw new ComponentGenerationException("OPSIN Bug: Unable to find non suffix carbon with sufficient valency"); } if (locant != null){ state.fragManager.createBond(atomToConnectToOnConjunctiveFrag, ringFrag.getAtomByLocantOrThrow(locant) , 1); } else{ List possibleAtoms = FragmentTools.findSubstituableAtoms(ringFrag, 1); if (possibleAtoms.isEmpty()){ throw new StructureBuildingException("No suitable atom found for conjunctive operation"); } if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtoms, 1)) { state.addIsAmbiguous("Connection of conjunctive group to: " + ringGroup.getValue()); } state.fragManager.createBond(atomToConnectToOnConjunctiveFrag, possibleAtoms.get(0) , 1); } state.fragManager.incorporateFragment(conjunctiveFragment, ringFrag); } } } /** * Converts a biochemical linkage description e.g. 
(1->4) into an O[1-9] locant * If the carbohydrate is preceded by substituents these are placed into a bracket and the bracket locanted * @param substituents * @param brackets * @throws StructureBuildingException */ private void processBiochemicalLinkageDescriptors(List substituents, List brackets) throws StructureBuildingException { for (Element substituent : substituents) { List bioLinkLocants = substituent.getChildElements(BIOCHEMICALLINKAGE_EL); if (bioLinkLocants.size() > 0){ if (bioLinkLocants.size() > 1){ throw new RuntimeException("OPSIN Bug: More than 1 biochemical linkage locant associated with subsituent"); } Element bioLinkLocant = bioLinkLocants.get(0); String bioLinkLocantStr = bioLinkLocant.getValue(); bioLinkLocantStr = bioLinkLocantStr.substring(1, bioLinkLocantStr.length() -1);//strip brackets checkAndApplyFirstLocantOfBiochemicalLinkage(substituent, bioLinkLocantStr); int secondLocantStartPos = Math.max(bioLinkLocantStr.lastIndexOf('>'), bioLinkLocantStr.lastIndexOf('-')) + 1; String locantToConnectTo = bioLinkLocantStr.substring(secondLocantStartPos); Element parent = substituent.getParent(); Attribute locantAtr = new Attribute(LOCANT_ATR, "O" + locantToConnectTo); Element elementAfterSubstituent = OpsinTools.getNextSibling(substituent); boolean hasAdjacentGroupToSubstitute = isSubBracketOrRoot(elementAfterSubstituent); /* If a biochemical is not at the end of a scope but is preceded by substituents/brackets * these are bracketted and the locant assigned to the bracket. 
* Else If the group is the only thing in a bracket the locant is assigned to the bracket (this is used to describe branches) * Else the locant is assigned to the substituent */ boolean bracketAdded =false; if (hasAdjacentGroupToSubstitute) { //now find the brackets/substituents before this element Element previous = OpsinTools.getPreviousSibling(substituent); List previousElements = new ArrayList<>(); while( previous != null) { if (!previous.getName().equals(SUBSTITUENT_EL) && !previous.getName().equals(BRACKET_EL)) { break; } previousElements.add(previous); previous = OpsinTools.getPreviousSibling(previous); } if (previousElements.size() > 0) {//an explicit bracket is needed Collections.reverse(previousElements); Element bracket = new GroupingEl(BRACKET_EL); bracket.addAttribute(locantAtr); int indexToInsertAt = parent.indexOf(previousElements.get(0)); for (Element element : previousElements) { element.detach(); bracket.addChild(element); } substituent.detach(); bracket.addChild(substituent); parent.insertChild(bracket, indexToInsertAt); brackets.add(bracket); bracketAdded = true; if (substituent.getAttribute(LOCANT_ATR) != null) { throw new StructureBuildingException("Substituent with biochemical linkage descriptor should not also have a locant: " + substituent.getAttributeValue(LOCANT_ATR)); } } } if (!bracketAdded) { Element elToAddAtrTo; if (parent.getName().equals(BRACKET_EL) && !hasAdjacentGroupToSubstitute) { elToAddAtrTo = parent; } else{ elToAddAtrTo = substituent; } if (elToAddAtrTo.getAttribute(LOCANT_ATR) !=null) { throw new StructureBuildingException("Substituent with biochemical linkage descriptor should not also have a locant: " + elToAddAtrTo.getAttributeValue(LOCANT_ATR)); } elToAddAtrTo.addAttribute(locantAtr); } bioLinkLocant.detach(); } } for (Element bracket : brackets) { List bioLinkLocants = bracket.getChildElements(BIOCHEMICALLINKAGE_EL); if (bioLinkLocants.size() > 0){ if (bioLinkLocants.size() > 1) { throw new RuntimeException("OPSIN Bug: 
More than 1 biochemical linkage locant associated with bracket"); } Element bioLinkLocant = bioLinkLocants.get(0); Element substituent = OpsinTools.getPreviousSibling(bioLinkLocant); if (substituent == null || !substituent.getName().equals(SUBSTITUENT_EL)){ throw new RuntimeException("OPSIN Bug: Substituent expected before biochemical linkage locant"); } String bioLinkLocantStr = bioLinkLocant.getValue(); bioLinkLocantStr = bioLinkLocantStr.substring(1, bioLinkLocantStr.length() -1); checkAndApplyFirstLocantOfBiochemicalLinkage(substituent, bioLinkLocantStr); int secondLocantStartPos = Math.max(bioLinkLocantStr.lastIndexOf('>'), bioLinkLocantStr.lastIndexOf('-')) + 1; String locantToConnectTo = bioLinkLocantStr.substring(secondLocantStartPos); if (bracket.getAttribute(LOCANT_ATR) !=null){ throw new StructureBuildingException("Substituent with biochemical linkage descriptor should not also have a locant: " + bracket.getAttributeValue(LOCANT_ATR)); } bracket.addAttribute(new Attribute(LOCANT_ATR, "O" + locantToConnectTo)); bioLinkLocant.detach(); } } } private boolean isSubBracketOrRoot(Element element) { return element !=null && (element.getName().equals(SUBSTITUENT_EL) || element.getName().equals(BRACKET_EL) || element.getName().equals(ROOT_EL)); } private void checkAndApplyFirstLocantOfBiochemicalLinkage(Element substituent, String biochemicalLinkage) throws StructureBuildingException { Element group = substituent.getFirstChildElement(GROUP_EL); Fragment frag = group.getFrag(); String firstLocant = biochemicalLinkage.substring(0, biochemicalLinkage.indexOf('-')); if (group.getAttributeValue(TYPE_ATR).equals(CARBOHYDRATE_TYPE_VAL)) { Atom anomericAtom = frag.getAtomByLocantOrThrow(firstLocant); boolean anomericIsOutAtom = false; for (int i = 0; i < frag.getOutAtomCount(); i++) { if (frag.getOutAtom(i).getAtom().equals(anomericAtom)){ anomericIsOutAtom = true; } } if (!anomericIsOutAtom){ throw new StructureBuildingException("Invalid glycoside linkage descriptor. 
Locant: " + firstLocant + " should point to the anomeric carbon"); } } else{ Atom positionOfPhospho = frag.getAtomByLocantOrThrow("O" + firstLocant); if (positionOfPhospho.getBondCount() !=1){ throw new StructureBuildingException(firstLocant + " should be the carbon to which a hydroxy group is attached!"); } if (frag.getOutAtomCount()==1){ Atom atomToConnect = frag.getOutAtom(0).getAtom(); state.fragManager.createBond(positionOfPhospho, atomToConnect, 1); } else{ throw new RuntimeException("OPSIN Bug: Biochemical linkage only expected on groups with 1 OutAtom"); } } if (OpsinTools.getNextGroup(group)==null){ throw new StructureBuildingException("Biochemical linkage descriptor should be followed by another biochemical: " + biochemicalLinkage); } } private void moveSubstituentDetachableHetAtomRepl(Element substituent) throws ComponentGenerationException { Element child = substituent.getChild(0); List locantededHeteroAtomRepls = new ArrayList<>(); while (child != null && child.getName().equals(HETEROATOM_EL) && child.getAttribute(LOCANT_ATR) != null) { locantededHeteroAtomRepls.add(child); child = OpsinTools.getNextSibling(child); } if (!locantededHeteroAtomRepls.isEmpty() && child != null && child.getName().equals(LOCANT_EL)) { //e.g. 
4-aza-3-methyl Element rightMostGroup = null; Element nextSubOrRootOrBracket = OpsinTools.getNextSibling(substituent); while (nextSubOrRootOrBracket != null) { Element groupToConsider = nextSubOrRootOrBracket.getFirstChildElement(GROUP_EL); if (groupToConsider != null) { rightMostGroup = groupToConsider; } nextSubOrRootOrBracket = OpsinTools.getNextSibling(nextSubOrRootOrBracket); } if (rightMostGroup == null) { throw new ComponentGenerationException("Unable to find group for: " + substituent.getChild(0).getValue() +" to apply to!"); } Element rightMostGroupParent = rightMostGroup.getParent(); for (int i = locantededHeteroAtomRepls.size() - 1; i >= 0; i--) { Element locantededHeteroAtomRepl = locantededHeteroAtomRepls.get(i); locantededHeteroAtomRepl.detach(); rightMostGroupParent.insertChild(locantededHeteroAtomRepl, 0); } } } /** * Moves a multiplier out of a bracket if the bracket contains only one substituent * e.g. (trimethyl) --> tri(methyl). * The multiplier may have locants e.g. [N,N-bis(2-hydroxyethyl)] * This is done because OPSIN has no idea what to do with (trimethyl) as there is nothing within the scope to substitute onto! 
* @param brackets */ private void moveErroneouslyPositionedLocantsAndMultipliers(List brackets) { for (int i = brackets.size()-1; i >=0; i--) { Element bracket =brackets.get(i); List childElements = bracket.getChildElements(); boolean hyphenPresent = false; int childCount = childElements.size(); if (childCount==2){ for (int j = childCount -1; j >=0; j--) { if (childElements.get(j).getName().equals(HYPHEN_EL)){ hyphenPresent=true; } } } if (childCount==1 || hyphenPresent && childCount==2){ List substituentContent = childElements.get(0).getChildElements(); if (substituentContent.size()>=2){ Element locant =null; Element multiplier =null; Element possibleMultiplier = substituentContent.get(0); if (substituentContent.get(0).getName().equals(LOCANT_EL)){//probably erroneous locant locant = substituentContent.get(0); possibleMultiplier = substituentContent.get(1); } if (possibleMultiplier.getName().equals(MULTIPLIER_EL)){//erroneously placed multiplier present multiplier = possibleMultiplier; } if (locant!=null){ if (multiplier==null || locant.getValue().split(",").length == Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR))){ locant.detach(); OpsinTools.insertBefore(childElements.get(0), locant); } else{ continue; } } if (multiplier !=null){ multiplier.detach(); OpsinTools.insertBefore(childElements.get(0), multiplier); } } } } } /** * Checks for case where the term is a substituent that starts with two multipliers * Interprets the first as a word level multiplier and the second as a substituent multiplier by adding an implicit bracket * @param substituent * @param brackets */ private void addImplicitBracketsWhenFirstSubstituentHasTwoMultipliers(Element substituent, List brackets) { if (!substituent.getName().equals(SUBSTITUENT_EL)) { return; } List multipliers = new ArrayList<>(); for (int i = 0, len = substituent.getChildCount(); i < len; i++) { Element child = substituent.getChild(i); if (child.getName().equals(MULTIPLIER_EL)) { multipliers.add(child); } else 
{ break; } } if (multipliers.size() != 2) { return; } Element bracket = new GroupingEl(BRACKET_EL); bracket.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); Element parent = substituent.getParent(); List<Element> elsToAddToBracket = parent.getChildElements(); Element wordMultiplier = multipliers.get(0); wordMultiplier.detach(); bracket.addChild(wordMultiplier); for (Element el : elsToAddToBracket) { el.detach(); bracket.addChild(el); } parent.addChild(bracket); brackets.add(bracket); } /** * Given the right most child of a word: * Checks whether this is multiplied e.g. methylenedibenzene * If it is then it checks for the presence of locants e.g. 4,4'-oxydibenzene which has been changed to oxy-4,4'-dibenzene * An attribute called inLocants is then added that is either INLOCANTS_DEFAULT or a comma separated list of locants * @param rightMostElement * @throws ComponentGenerationException * @throws StructureBuildingException */ private void assignLocantsToMultipliedRootIfPresent(Element rightMostElement) throws ComponentGenerationException, StructureBuildingException { List<Element> multipliers = rightMostElement.getChildElements(MULTIPLIER_EL); if(multipliers.size() == 1) { Element multiplier =multipliers.get(0); if (OpsinTools.getPrevious(multiplier)==null){ throw new StructureBuildingException("OPSIN bug: Unacceptable input to function"); } List<Element> locants = rightMostElement.getChildElements(MULTIPLICATIVELOCANT_EL); if (locants.size()>1){ throw new ComponentGenerationException("OPSIN bug: Only none or one multiplicative locant expected"); } int multiVal = Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)); if (locants.isEmpty()){ rightMostElement.addAttribute(new Attribute(INLOCANTS_ATR, INLOCANTS_DEFAULT)); } else{ Element locantEl = locants.get(0); String[] locantValues = locantEl.getValue().split(","); if (locantValues.length == multiVal){ rightMostElement.addAttribute(new Attribute(INLOCANTS_ATR, locantEl.getValue())); locantEl.detach(); } else{ throw new 
ComponentGenerationException("Mismatch between number of locants and number of roots"); } } } else if (rightMostElement.getName().equals(BRACKET_EL)){ assignLocantsToMultipliedRootIfPresent(rightMostElement.getChild(rightMostElement.getChildCount()-1)); } } /** * Adds an implicit bracket in the case where two locants have been given. * One for the locanting of substituent on to the next substituent and one * for the locanting of this combined substituent onto a parent group * e.g. 5-p-hydroxyphenyl-1,2-dithiole-3-thione --> e.g. 5-(p-hydroxyphenyl)-1,2-dithiole-3-thione * @param substituent * @param brackets */ private void addImplicitBracketsWhenSubstituentHasTwoLocants(Element substituent, List brackets) { Element siblingSubstituent = OpsinTools.getNextSibling(substituent); if (siblingSubstituent != null && siblingSubstituent.getName().equals(SUBSTITUENT_EL)) { List locants = getLocantsAtStartOfSubstituent(substituent); if (locants.size() == 2 && locantsAreSingular(locants) && getLocantsAtStartOfSubstituent(siblingSubstituent).isEmpty()) { Element bracket = new GroupingEl(BRACKET_EL); bracket.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); Element parent = substituent.getParent(); int indexToInsertAt = parent.indexOf(substituent); int elementsToMove = substituent.indexOf(locants.get(0)) + 1; for (int i = 0; i < elementsToMove; i++) { Element locantOrStereoToMove = substituent.getChild(0); locantOrStereoToMove.detach(); bracket.addChild(locantOrStereoToMove); } substituent.detach(); siblingSubstituent.detach(); bracket.addChild(substituent); bracket.addChild(siblingSubstituent); parent.insertChild(bracket, indexToInsertAt); brackets.add(bracket); } } } /** * Retrieves the first elements of a substituent which are locants skipping over stereochemistry elements * @param substituent * @return */ private List getLocantsAtStartOfSubstituent(Element substituent) { List locants = new ArrayList<>(); for (int i = 0, len = substituent.getChildCount(); i < len; 
i++) { Element child = substituent.getChild(i); String currentElementName = child.getName(); if (currentElementName.equals(LOCANT_EL)){ locants.add(child); } else if (currentElementName.equals(STEREOCHEMISTRY_EL)){ //ignore } else{ break; } } return locants; } /** * Checks that none of the locants contain commas * @param locants * @return */ private boolean locantsAreSingular(List<Element> locants) { for (Element locant : locants) { if (locant.getValue().split(",").length > 1){ return false; } } return true; } /** * Assigns locants and multipliers to substituents/brackets * If both locants and multipliers are present a final check is done that their counts agree. * WordLevel multipliers are processed e.g. diethyl ethanoate * Adding a locant to a root or any other group that cannot engage in substitutive nomenclature will result in an exception being thrown * An exception is made for cases where the locant could be referring to a position on another word * @param subOrBracket * @throws ComponentGenerationException * @throws StructureBuildingException */ private void assignLocantsAndMultipliers(Element subOrBracket) throws ComponentGenerationException, StructureBuildingException { List<Element> locants = subOrBracket.getChildElements(LOCANT_EL); int multiplier = 1; List<Element> multipliers = subOrBracket.getChildElements(MULTIPLIER_EL); Element parentElem = subOrBracket.getParent(); boolean oneBelowWordLevel = parentElem.getName().equals(WORD_EL); Element groupIfPresent = subOrBracket.getFirstChildElement(GROUP_EL); if (multipliers.size() > 0) { if (multipliers.size() > 1){ throw new ComponentGenerationException(subOrBracket.getName() +" has multiple multipliers, unable to determine meaning!"); } if (oneBelowWordLevel && OpsinTools.getNextSibling(subOrBracket) == null && OpsinTools.getPreviousSibling(subOrBracket) == null) { return;//word level multiplier } multiplier = Integer.parseInt(multipliers.get(0).getAttributeValue(VALUE_ATR)); subOrBracket.addAttribute(new 
Attribute(MULTIPLIER_ATR, multipliers.get(0).getAttributeValue(VALUE_ATR))); //multiplier is INTENTIONALLY not detached. As brackets/subs are only multiplied later on it is necessary at that stage to determine what elements (if any) are in front of the multiplier if (groupIfPresent !=null && PERHALOGENO_SUBTYPE_VAL.equals(groupIfPresent.getAttributeValue(SUBTYPE_ATR))){ throw new StructureBuildingException(groupIfPresent.getValue() +" cannot be multiplied"); } } if(locants.size() > 0) { if (multiplier==1 && oneBelowWordLevel && OpsinTools.getPreviousSibling(subOrBracket)==null){//locant might be word Level locant if (wordLevelLocantsAllowed(subOrBracket, locants.size())){//something like S-ethyl or S-(2-ethylphenyl) or S-4-tert-butylphenyl Element locant = locants.remove(0); if (locant.getValue().split(",").length!=1){ throw new ComponentGenerationException("Multiplier and locant count failed to agree; All locants could not be assigned!"); } parentElem.addAttribute(new Attribute(LOCANT_ATR, locant.getValue())); locant.detach(); if (locants.isEmpty()){ return; } } } if (subOrBracket.getName().equals(ROOT_EL)){ locantsToDebugString(locants); throw new ComponentGenerationException(locantsToDebugString(locants)); } if (locants.size() != 1){ throw new ComponentGenerationException(locantsToDebugString(locants)); } Element locantEl = locants.get(0); String[] locantValues = locantEl.getValue().split(","); if (multiplier != locantValues.length){ throw new ComponentGenerationException("Multiplier and locant count failed to agree; All locants could not be assigned!"); } Element parent = subOrBracket.getParent(); //attempt to find cases where locant will not be utilised. A special case is made for carbonyl derivatives //e.g. 
1H-2-benzopyran-1,3,4-trione 4-[N-(4-chlorophenyl)hydrazone] if (!parent.getName().equals(WORD_EL) || !parent.getAttributeValue(TYPE_ATR).equals(WordType.full.toString()) || !state.currentWordRule.equals(WordRule.carbonylDerivative)){ List children =parent.getChildElements(); boolean foundSomethingToSubstitute =false; for (int i = parent.indexOf(subOrBracket) +1 ; i < children.size(); i++) { if (!children.get(i).getName().equals(HYPHEN_EL)){ foundSomethingToSubstitute = true; } } if (!foundSomethingToSubstitute){ throw new ComponentGenerationException(locantsToDebugString(locants)); } } if (groupIfPresent !=null && PERHALOGENO_SUBTYPE_VAL.equals(groupIfPresent.getAttributeValue(SUBTYPE_ATR))){ throw new StructureBuildingException(groupIfPresent.getValue() +" cannot be locanted"); } subOrBracket.addAttribute(new Attribute(LOCANT_ATR, locantEl.getValue())); locantEl.detach(); } } private String locantsToDebugString(List locants) { StringBuilder message = new StringBuilder("Unable to assign all locants. "); message.append((locants.size() > 1) ? "These locants " : "This locant "); message.append((locants.size() > 1) ? 
"were " : "was "); message.append("not assigned: "); for(Element locant : locants) { message.append(locant.getValue()); message.append(" "); } return message.toString(); } private boolean wordLevelLocantsAllowed(Element subBracketOrRoot, int numberOflocants) { Element parentElem = subBracketOrRoot.getParent(); if (WordType.valueOf(parentElem.getAttributeValue(TYPE_ATR))==WordType.substituent && (OpsinTools.getNextSibling(subBracketOrRoot)==null || numberOflocants>=2)){ if (state.currentWordRule == WordRule.ester || state.currentWordRule == WordRule.functionalClassEster || state.currentWordRule == WordRule.multiEster || state.currentWordRule == WordRule.acetal){ return true; } } if ((state.currentWordRule == WordRule.potentialAlcoholEster || state.currentWordRule == WordRule.amineDiConjunctiveSuffix || (state.currentWordRule == WordRule.ester && (OpsinTools.getNextSibling(subBracketOrRoot)==null || numberOflocants>=2))) && parentElem.getName().equals(WORD_EL)){ Element wordRule = parentElem.getParent(); List words = wordRule.getChildElements(WORD_EL); Element ateWord = words.get(words.size()-1); if (parentElem==ateWord){ return true; } } if (state.currentWordRule == WordRule.acidReplacingFunctionalGroup && parentElem.getName().equals(WORD_EL) && (OpsinTools.getNextSibling(subBracketOrRoot)==null || numberOflocants>=2)) { //e.g. diphosphoric acid 1,3-di(ethylamide) if (parentElem.getParent().indexOf(parentElem) > 0){ return true; } } return false; } /** * If a word level multiplier is present e.g. diethyl butandioate then this is processed to ethyl ethyl butandioate * If wordCount is 1 then an exception is thrown if a multiplier is encountered e.g. 
triphosgene parsed as tri phosgene * @param word * @param roots * @param wordCount * @throws StructureBuildingException * @throws ComponentGenerationException */ private void processWordLevelMultiplierIfApplicable(Element word, List roots, int wordCount) throws StructureBuildingException, ComponentGenerationException { if (word.getChildCount() == 1){ Element firstSubBrackOrRoot = word.getChild(0); Element multiplier = firstSubBrackOrRoot.getFirstChildElement(MULTIPLIER_EL); if (multiplier != null) { int multiVal = Integer.parseInt(multiplier.getAttributeValue(VALUE_ATR)); List locants = firstSubBrackOrRoot.getChildElements(LOCANT_EL); boolean assignLocants = false; boolean wordLevelLocants = wordLevelLocantsAllowed(firstSubBrackOrRoot, locants.size());//something like O,S-dimethyl phosphorothioate if (locants.size() > 1) { throw new ComponentGenerationException("Unable to assign all locants"); } String[] locantValues = null; if (locants.size() == 1) { locantValues = locants.get(0).getValue().split(","); if (locantValues.length == multiVal){ assignLocants = true; locants.get(0).detach(); if (wordLevelLocants){ word.addAttribute(new Attribute(LOCANT_ATR, locantValues[0])); } else{ throw new ComponentGenerationException(locantsToDebugString(locants)); } } else{ throw new ComponentGenerationException("Unable to assign all locants"); } } checkForNonConfusedWithNona(multiplier); if (wordCount == 1) { if (!isMonoFollowedByElement(multiplier, multiVal)){ throw new StructureBuildingException("Unexpected multiplier found at start of word. Perhaps the name is trivial e.g. 
triphosgene"); } } if (multiVal == 1) {//mono return; } List elementsNotToBeMultiplied = new ArrayList<>();//anything before the multiplier for (int i = firstSubBrackOrRoot.indexOf(multiplier) -1 ; i >=0 ; i--) { Element el = firstSubBrackOrRoot.getChild(i); el.detach(); elementsNotToBeMultiplied.add(el); } multiplier.detach(); for(int i= multiVal -1; i>=1; i--) { Element clone = state.fragManager.cloneElement(state, word); if (assignLocants){ clone.getAttribute(LOCANT_ATR).setValue(locantValues[i]); } OpsinTools.insertAfter(word, clone); } for (Element el : elementsNotToBeMultiplied) {//re-add anything before multiplier to original word firstSubBrackOrRoot.insertChild(el, 0); } } } else if (roots.size() == 1) { if (OpsinTools.getDescendantElementsWithTagName(roots.get(0), FRACTIONALMULTIPLIER_EL).size() > 0){ throw new StructureBuildingException("Unexpected fractional multiplier found within chemical name"); } } } private void checkForNonConfusedWithNona(Element multiplier) throws StructureBuildingException { if (multiplier.getValue().equals("non")){ String subsequentUnsemanticToken = multiplier.getAttributeValue(SUBSEQUENTUNSEMANTICTOKEN_ATR); if (subsequentUnsemanticToken !=null && subsequentUnsemanticToken.toLowerCase(Locale.ROOT).startsWith("a")){ return; } throw new StructureBuildingException("\"non\" probably means \"not\". 
If a multiplier of value 9 was intended \"nona\" should be used"); } } /** * Names like monooxygen may be used to emphasise that a molecule is not being described * @param multiplier * @param multiVal * @return */ private boolean isMonoFollowedByElement(Element multiplier, int multiVal) { if (multiVal ==1){ Element possibleElement = OpsinTools.getNextSibling(multiplier); if (possibleElement != null && possibleElement.getName().equals(GROUP_EL) && (ELEMENTARYATOM_TYPE_VAL.equals(possibleElement.getAttributeValue(TYPE_ATR)) || possibleElement.getValue().equals("hydrogen"))){ return true; } } return false; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CycleDetector.java package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Deque; import java.util.LinkedHashSet; import java.util.List; import java.util.Set; /** * Assigns whether atoms are in rings or not * @author dl387 * */ class CycleDetector { /** * Performs a depth first search for rings hence assigning whether atoms are in rings or not * This is necessary for deciding the applicability, and in some cases meaning, of suffixes and to determine what atoms are capable of having spare valency * Fragments made of disconnected sections are supported * @param frag */ static void assignWhetherAtomsAreInCycles(Fragment frag) { List<Atom> atomList = frag.getAtomList(); for (Atom atom : atomList) { atom.setAtomIsInACycle(false); atom.setProperty(Atom.VISITED, null); } for (Atom a : atomList) {//as OPSIN does not disallow disconnected sections within a single "fragment" (e.g. 
in suffixes) for robustness this for loop is required if(a.getProperty(Atom.VISITED) == null){//true for only the first atom in a fully connected molecule traverseRings(a, null, 0); } } } private static int traverseRings(Atom currentAtom, Atom previousAtom, int depth){ Integer previouslyAssignedDepth = currentAtom.getProperty(Atom.VISITED); if(previouslyAssignedDepth != null){ return previouslyAssignedDepth; } currentAtom.setProperty(Atom.VISITED, depth); List<Atom> equivalentAtoms = new ArrayList<>(); equivalentAtoms.add(currentAtom); List<Atom> neighbours; for(;;) { //Non-recursively process atoms in a chain //add the atoms in the chain to equivalentAtoms as either all or none of them are in a ring neighbours = currentAtom.getAtomNeighbours(); neighbours.remove(previousAtom); if (neighbours.size() != 1) { break; } Atom nextAtom = neighbours.get(0); if (nextAtom.getProperty(Atom.VISITED) != null) { //chain reached a previously visited atom, must be a ring break; } previousAtom = currentAtom; currentAtom = nextAtom; equivalentAtoms.add(currentAtom); currentAtom.setProperty(Atom.VISITED, ++depth); } int result = depth + 1; for (Atom neighbour : neighbours) { int temp = traverseRings(neighbour, currentAtom, depth + 1); result = Math.min(result, temp); } if (result < depth){ for (Atom a : equivalentAtoms) { a.setAtomIsInACycle(true); } } else if (result == depth) { currentAtom.setAtomIsInACycle(true); } return result; } private static class PathSearchState{ final Atom currentAtom; final List<Atom> orderAtomsVisited; public PathSearchState(Atom currentAtom, List<Atom> orderAtomsVisited ) { this.currentAtom = currentAtom; this.orderAtomsVisited = orderAtomsVisited; } Atom getCurrentAtom() { return currentAtom; } List<Atom> getOrderAtomsVisited() { return orderAtomsVisited; } } /** * Attempts to find paths from a1 to a2 using only the given bonds * @param a1 * @param a2 * @param peripheryBonds * @return */ static List<List<Atom>> getPathBetweenAtomsUsingBonds(Atom a1, Atom a2, Set<Bond> peripheryBonds){ List<List<Atom>> paths 
= new ArrayList<>(); Deque<PathSearchState> stateStack = new ArrayDeque<>(); stateStack.add(new PathSearchState(a1, new ArrayList<>())); while (stateStack.size()>0){ PathSearchState state =stateStack.removeLast();//depth first traversal List<Atom> orderAtomsVisited = state.getOrderAtomsVisited(); Atom nextAtom = state.getCurrentAtom(); orderAtomsVisited.add(nextAtom); Set<Bond> neighbourBonds = new LinkedHashSet<>(nextAtom.getBonds()); neighbourBonds.retainAll(peripheryBonds); for (Bond neighbourBond : neighbourBonds) { Atom neighbour = neighbourBond.getOtherAtom(nextAtom); if (orderAtomsVisited.contains(neighbour)){//atom already visited by this path continue; } if (neighbour ==a2 ){//target atom found paths.add(new ArrayList<>(orderAtomsVisited.subList(1, orderAtomsVisited.size()))); } else{//add atom to stack, its neighbours will be recursively investigated shortly stateStack.add(new PathSearchState(neighbour, new ArrayList<>(orderAtomsVisited))); } } } return paths; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CyclicAtomList.java package uk.ac.cam.ch.wwmm.opsin; import java.util.List; /** * Convenience class for iterating over a list of atoms that form a ring * Doing getNext when the index is the final atom in the list will return the first atom * Doing getPrevious when the index is the first atom in the list will return the final atom * @author dl387 * */ class CyclicAtomList{ private int index = -1; private final List<Atom> atomList; /** * Construct a cyclicAtomList from an atomList * Index defaults to -1 * @param atomList */ CyclicAtomList(List<Atom> atomList) { this.atomList = atomList; } /** * Construct a cyclicAtomList from an atomList * The second parameter sets the current index * @param atomList * @param index */ CyclicAtomList(List<Atom> atomList, int index) { this.atomList = atomList; setIndex(index); } /** * Returns the number of elements in this list. 
If this list contains more * than Integer.MAX_VALUE elements, returns * Integer.MAX_VALUE. * * @return the number of elements in this list */ int size() { return atomList.size(); } /** * Returns the atom at the specified position in this list. * @param index index of the element to return * @return Atom the atom at the specified position in this list * @throws IndexOutOfBoundsException - if the index is out of range (index < 0 || index >= size()) */ Atom get(int index) throws IndexOutOfBoundsException { return atomList.get(index); } /** * Return the current index in the list * @return */ int getIndex() { return index; } /** * Set the current index * @param index */ void setIndex(int index) { if (index >= atomList.size()){ throw new IllegalArgumentException("Specified index is not within ringAtom list"); } this.index = index; } /** * Increments and returns the atom at the new index in the list (next atom) * When the index is the final atom in the list will return the first atom * @return */ Atom next() { int tempIndex = index + 1; if (tempIndex >= atomList.size()){ tempIndex = 0; } index = tempIndex; return atomList.get(index); } /** * Decrements and returns the atom at the new index in the list (previous atom) * when the index is the first atom in the list will return the final atom * @return */ Atom previous() { int tempIndex = index - 1; if (tempIndex < 0){ tempIndex = atomList.size() -1 ; } index = tempIndex; return atomList.get(index); } /** * Returns the next atom in the list * When the index is the final atom in the list will return the first atom * Doesn't affect the list * @return */ Atom peekNext() { int tempIndex = index + 1; if (tempIndex >= atomList.size()){ tempIndex = 0; } return atomList.get(tempIndex); } /** * Returns the previous atom in the list * when the index is the first atom in the list will return the final atom * Doesn't affect the list * @return */ Atom peekPrevious() { int tempIndex = index - 1; if (tempIndex < 0){ tempIndex = 
atomList.size() -1 ; } return atomList.get(tempIndex); } /** * Returns the atom corresponding to the current index * Note that CyclicAtomLists have a default index of -1 * @return */ Atom getCurrent() { return atomList.get(index); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Element.java package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; abstract class Element { protected String name; protected Element parent = null; protected final List<Attribute> attributes = new ArrayList<>(); Element(String name) { this.name = name; } void addAttribute(Attribute attribute) { attributes.add(attribute); } void addAttribute(String atrName, String atrValue) { attributes.add(new Attribute(atrName, atrValue)); } /** * Adds a child element * @param child */ abstract void addChild(Element child); /** * Creates a deep copy with no parent */ abstract Element copy(); void detach() { if (parent != null) { parent.removeChild(this); parent = null; } } Attribute getAttribute(int index) { return attributes.get(index); } /** * Returns the attribute with the given name * or null if the attribute doesn't exist * @param name * @return */ Attribute getAttribute(String name) { for (int i = 0, len = attributes.size(); i < len; i++) { Attribute a = attributes.get(i); if (a.getName().equals(name)) { return a; } } return null; } int getAttributeCount() { return attributes.size(); } /** * Returns the value of the attribute with the given name * or null if the attribute doesn't exist * @param name * @return */ String getAttributeValue(String name) { Attribute attribute = getAttribute(name); if (attribute != null) { return attribute.getValue(); } return null; } /** * Returns the child at the given index in the children list * @param index * @return */ abstract Element getChild(int index); /** * Returns the number of children * @return */ abstract int getChildCount(); /** * Returns a copy 
	 * of the child elements
	 *
	 * @return
	 */
	abstract List<Element> getChildElements();

	/**
	 * Gets child elements with this name (in iteration order)
	 * @param name
	 * @return
	 */
	abstract List<Element> getChildElements(String name);

	/**
	 * Returns the first child element with the specified name
	 *
	 * @param name
	 * @return
	 */
	abstract Element getFirstChildElement(String name);

	/**
	 * Returns the last child element
	 *
	 * @return
	 */
	abstract Element getLastChildElement();

	/**
	 * Returns the fragment associated with this element (only applicable to tokens)
	 * @return
	 */
	Fragment getFrag() {
		throw new UnsupportedOperationException("Only tokens can have associated fragments");
	}

	String getName() {
		return name;
	}

	Element getParent() {
		return this.parent;
	}

	abstract String getValue();

	/**
	 * Returns the index of the given child in the children list (or -1 if it isn't a child)
	 * @param child
	 * @return
	 */
	abstract int indexOf(Element child);

	/**
	 * Inserts the element at the given index in the children list
	 * @param child
	 * @param index
	 */
	abstract void insertChild(Element child, int index);

	boolean removeAttribute(Attribute attribute) {
		return attributes.remove(attribute);
	}

	/**
	 * Removes the given child element
	 * @param child
	 * @return
	 */
	abstract boolean removeChild(Element child);

	/**
	 * Removes the element at the given index in the children list
	 * @param index
	 * @return
	 */
	abstract Element removeChild(int index);

	/**
	 * Replaces a child element with another element
	 * @param oldChild
	 * @param newChild
	 */
	abstract void replaceChild(Element oldChild, Element newChild);

	/**
	 * Sets the fragment associated with this element (only applicable to tokens!)
	 * @param frag
	 */
	void setFrag(Fragment frag) {
		throw new UnsupportedOperationException("Only tokens can have associated fragments");
	}

	void setName(String name) {
		this.name = name;
	}

	void setParent(Element newParentEl) {
		this.parent = newParentEl;
	}

	abstract void setValue(String text);

	public String toString() {
		return toXML();
	}

	String toXML() {
		return toXML(0).toString();
	}

	private StringBuilder toXML(int indent) {
		StringBuilder result = new StringBuilder();
		for (int i = 0; i < indent; i++) {
			result.append(" ");
		}
		result.append('<');
		result.append(name);
		for (Attribute atr : attributes) {
			result.append(' ');
			result.append(atr.toXML());
		}
		result.append('>');
		int childCount = getChildCount();
		if (childCount > 0) {
			for (int i = 0; i < childCount; i++) {
				Element child = getChild(i);
				result.append(OpsinTools.NEWLINE);
				result.append(child.toXML(indent + 1));
			}
			result.append(OpsinTools.NEWLINE);
			for (int i = 0; i < indent; i++) {
				result.append(" ");
			}
		}
		else{
			result.append(OpsinTools.xmlEncode(getValue()));
		}
		//closing tag
		result.append("</");
		result.append(name);
		result.append('>');
		return result;
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Fragment.java000066400000000000000000000416571451751637500262770ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;

import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*;
import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

/**A fragment of a molecule, holds bonds and atoms.
 *
 * @author ptc24
 * @author dl387
 *
 */
class Fragment implements Iterable<Atom> {

	/**A mapping between IDs and the atoms in this fragment, by default is ordered by the order atoms are added to the fragment*/
	private final Map<Integer, Atom> atomMapFromId = new LinkedHashMap<>();

	/**Equivalent to and synced to atomMapFromId.values() */
	private final Collection<Atom> atomCollection = atomMapFromId.values();

	/**A mapping between locants and the atoms in this fragment*/
	private final Map<String, Atom> atomMapFromLocant = new HashMap<>();

	/**The bonds in the fragment*/
	private final Set<Bond> bondSet = new LinkedHashSet<>();

	/**The associated token element*/
	private Element tokenEl;

	/**The atoms that are used when this fragment is connected to another fragment. Unused outAtoms means that the fragment is a radical or an error has occurred
	 * Initially empty */
	private final List<OutAtom> outAtoms = new ArrayList<>();

	/**The atoms that are used on this fragment to form things like esters
	 * Initially empty */
	private final List<FunctionalAtom> functionalAtoms = new ArrayList<>();

	/**The atom that fragments connecting to this fragment should connect to in preference
	 * e.g.
	 * for amino acids the alpha amino group
	 * Null by default*/
	private Atom defaultInAtom = null;

	/**The atoms in the fragment that have been indicated to have hydrogen at the SMILES level.*/
	private final List<Atom> indicatedHydrogen = new ArrayList<>();

	/**Pseudo atoms indicating start and end of polymer structure repeat unit*/
	private List<Atom> polymerAttachmentPoints = null;

	/**
	 * DO NOT CALL DIRECTLY EXCEPT FOR TESTING
	 * Makes an empty Fragment associated with the given tokenEl
	 * @param tokenEl
	 */
	Fragment(Element tokenEl) {
		this.tokenEl = tokenEl;
	}

	/**
	 * DO NOT CALL DIRECTLY EXCEPT FOR TESTING
	 * Makes an empty Fragment with the given type
	 *
	 * @param type
	 */
	Fragment(String type) {
		this.tokenEl = new TokenEl("");
		this.tokenEl.addAttribute(TYPE_ATR, type);
	}

	/**Adds an atom to the fragment and associates it with this fragment*/
	void addAtom(Atom atom) {
		List<String> locants =atom.getLocants();
		for (String locant: locants) {
			atomMapFromLocant.put(locant, atom);
		}
		atomMapFromId.put(atom.getID(), atom);
		atom.setFrag(this);
	}

	/**
	 * Returns the number of atoms in the fragment
	 * @return
	 */
	int getAtomCount() {
		return atomCollection.size();
	}

	/**
	 * Returns a copy of the fragment's atoms
	 * @return
	 */
	List<Atom> getAtomList() {
		return new ArrayList<>(atomCollection);
	}

	/**
	 * Adds a bond to the fragment.
	 * @param bond
	 */
	void addBond(Bond bond) {
		bondSet.add(bond);
	}

	/**Removes a bond from the fragment if it is present.
	 * @param bond
	 * @return true if the bond was present*/
	boolean removeBond(Bond bond) {
		return bondSet.remove(bond);
	}

	/**Gets an unmodifiable view of bondSet.*/
	Set<Bond> getBondSet() {
		return Collections.unmodifiableSet(bondSet);
	}

	/**Gets the id of the atom in the fragment with the specified locant.
	 *
	 * @param locant The locant to look for
	 * @return The id of the found atom, or 0 if it is not found
	 */
	int getIDFromLocant(String locant) {
		Atom a = getAtomByLocant(locant);
		if (a != null){
			return a.getID();
		}
		return 0;
	}

	/**Gets the id of the atom in the fragment with the specified locant, throwing if this fails.
	 *
	 * @param locant The locant to look for
	 * @return The id of the found atom
	 * @throws StructureBuildingException
	 */
	int getIDFromLocantOrThrow(String locant) throws StructureBuildingException {
		int id = getIDFromLocant(locant);
		if(id == 0) {
			throw new StructureBuildingException("Couldn't find id from locant " + locant + ".");
		}
		return id;
	}

	/**Gets the atom in the fragment with the specified locant.
	 *
	 * @param locant The locant to look for
	 * @return The found atom, or null if it is not found
	 */
	Atom getAtomByLocant(String locant) {
		Atom a =atomMapFromLocant.get(locant);
		if (a != null){
			return a;
		}
		Matcher m =MATCH_AMINOACID_STYLE_LOCANT.matcher(locant);
		if (m.matches()){//e.g. N5
			Atom backboneAtom =atomMapFromLocant.get(m.group(3));//the atom corresponding to the numeric or greek component
			if (backboneAtom==null){
				return null;
			}
			a = FragmentTools.getAtomByAminoAcidStyleLocant(backboneAtom, m.group(1), m.group(2));
			if (a != null){
				return a;
			}
		}
		return null;
	}

	/**Gets the atom in the fragment with the specified locant, throwing if this fails.
	 *
	 * @param locant The locant to look for
	 * @return The found atom
	 * @throws StructureBuildingException
	 */
	Atom getAtomByLocantOrThrow(String locant) throws StructureBuildingException {
		Atom a = getAtomByLocant(locant);
		if(a == null) {
			throw new StructureBuildingException("Could not find the atom with locant " + locant + ".");
		}
		return a;
	}

	/**Gets the atom in the fragment with the specified ID.
	 *
	 * @param id The id of the atom.
	 * @return The found atom, or null.
	 */
	Atom getAtomByID(int id) {
		return atomMapFromId.get(id);
	}

	/**Gets the atom in the fragment with the specified ID, throwing if this fails.
	 *
	 * @param id The id of the atom.
	 * @return The found atom
	 * @throws StructureBuildingException
	 */
	Atom getAtomByIDOrThrow(int id) throws StructureBuildingException {
		Atom a = getAtomByID(id);
		if(a == null) {
			throw new StructureBuildingException("Couldn't find atom with id " + id + ".");
		}
		return a;
	}

	/**Finds a bond between two specified atoms the first of which must be within the fragment
	 *
	 * @param ID1 The id of one atom
	 * @param ID2 The id of the other atom
	 * @return The bond found, or null
	 */
	Bond findBond(int ID1, int ID2) {
		Atom a = atomMapFromId.get(ID1);
		if (a != null){
			for (Bond b : a.getBonds()) {
				if((b.getFrom() == ID1 && b.getTo() == ID2) ||
						(b.getTo() == ID1 && b.getFrom() == ID2)) {
					return b;
				}
			}
		}
		return null;
	}

	/**Finds a bond between two specified atoms the first of which must be within the fragment, throwing if it fails.
	 *
	 * @param ID1 The id of one atom
	 * @param ID2 The id of the other atom
	 * @return The bond found
	 * @throws StructureBuildingException
	 */
	Bond findBondOrThrow(int ID1, int ID2) throws StructureBuildingException {
		Bond b = findBond(ID1, ID2);
		if(b == null) {
			throw new StructureBuildingException("Couldn't find specified bond");
		}
		return b;
	}

	/**Works out how many atoms in the fragment have consecutive locants,
	 * starting from 1, and are bonded to one another in a chain
	 *
	 * @return The number of atoms in the locant chain
	 */
	int getChainLength() {
		int length = 0;
		Atom next = getAtomByLocant(Integer.toString(length + 1));
		Atom previous = null;
		while (next != null){
			if (previous != null && previous.getBondToAtom(next) == null){
				break;
			}
			length++;
			previous = next;
			next = getAtomByLocant(Integer.toString(length + 1));
		}
		return length;
	}

	/**
	 * Gets the type of the corresponding tokenEl
	 * Returns "" if undefined
	 * @return
	 */
	String getType() {
		String type = tokenEl.getAttributeValue(TYPE_ATR);
		return type != null ?
				type : "";
	}

	/**
	 * Gets the subType of the corresponding tokenEl
	 * Returns "" if undefined
	 * @return
	 */
	String getSubType() {
		String subType = tokenEl.getAttributeValue(SUBTYPE_ATR);
		return subType != null ? subType : "";
	}

	/**
	 * Gets the associated tokenEl
	 * Whether or not this is a real token can be tested by whether it has a parent
	 * @return
	 */
	Element getTokenEl() {
		return tokenEl;
	}

	/**
	 * Sets the associated tokenEl
	 * Type/subType are inherited from the tokenEl
	 * @param tokenEl
	 */
	void setTokenEl(Element tokenEl) {
		this.tokenEl = tokenEl;
	}

	/**
	 * How many OutAtoms (i.e. radicals) are associated with this fragment
	 * @return
	 */
	int getOutAtomCount() {
		return outAtoms.size();
	}

	/**
	 * Gets the outAtom at a specific index of the outAtoms list
	 * @param i
	 * @return
	 */
	OutAtom getOutAtom(int i) {
		return outAtoms.get(i);
	}

	/**
	 * Adds an outAtom
	 * @param id
	 * @param valency
	 * @param setExplicitly
	 * @throws StructureBuildingException
	 */
	void addOutAtom(int id, int valency, Boolean setExplicitly) throws StructureBuildingException {
		addOutAtom(getAtomByIDOrThrow(id), valency, setExplicitly);
	}

	/**
	 * Adds an outAtom
	 * @param atom
	 * @param valency
	 * @param setExplicitly
	 */
	void addOutAtom(Atom atom, int valency, Boolean setExplicitly) {
		outAtoms.add(new OutAtom(atom, valency, setExplicitly));
	}

	/**
	 * Includes the OutAtoms of a given fragment into this fragment
	 * Note that no OutAtoms are created in doing this
	 * @param frag
	 */
	void incorporateOutAtoms(Fragment frag) {
		outAtoms.addAll(frag.outAtoms);
	}

	/**
	 * Removes the outAtom at a specific index of the outAtoms list
	 * @param i
	 */
	void removeOutAtom(int i) {
		OutAtom removedOutAtom = outAtoms.remove(i);
		if (removedOutAtom.isSetExplicitly()){
			removedOutAtom.getAtom().addOutValency(-removedOutAtom.getValency());
		}
	}

	/**
	 * Removes the specified outAtom from the outAtoms list
	 * @param outAtom
	 */
	void removeOutAtom(OutAtom outAtom) {
		if (outAtoms.remove(outAtom) && outAtom.isSetExplicitly()){
			outAtom.getAtom().addOutValency(-outAtom.getValency());
		}
	}

	/**
	 * How many functionalAtoms (i.e. locations that can form esters) are associated with this fragment
	 * @return
	 */
	int getFunctionalAtomCount() {
		return functionalAtoms.size();
	}

	/**
	 * Gets the functionalAtom at a specific index of the functionalAtoms list
	 * @param i
	 * @return
	 */
	FunctionalAtom getFunctionalAtom(int i) {
		return functionalAtoms.get(i);
	}

	/**Adds a functionalAtom
	 * @param atom*/
	void addFunctionalAtom(Atom atom) {
		functionalAtoms.add(new FunctionalAtom(atom));
	}

	/**
	 * Includes the FunctionalAtoms of a given fragment into this fragment
	 * Note that no FunctionalAtoms are created in doing this
	 * @param frag
	 */
	void incorporateFunctionalAtoms(Fragment frag) {
		functionalAtoms.addAll(frag.functionalAtoms);
	}

	/**
	 * Removes the functionalAtom at a specific index of the functionalAtoms list
	 * @param i
	 * @return
	 */
	FunctionalAtom removeFunctionalAtom(int i) {
		return functionalAtoms.remove(i);
	}

	/**
	 * Removes the specified functionalAtom from the functionalAtoms list
	 * @param functionalAtom
	 */
	void removeFunctionalAtom(FunctionalAtom functionalAtom) {
		functionalAtoms.remove(functionalAtom);
	}

	List<Atom> getPolymerAttachmentPoints() {
		return polymerAttachmentPoints;
	}

	void setPolymerAttachmentPoints(List<Atom> polymerAttachmentPoints) {
		this.polymerAttachmentPoints = polymerAttachmentPoints;
	}

	/**Gets a list of atoms in the fragment that connect to a specified atom
	 *
	 * @param atom The reference atom
	 * @return The list of atoms connected to the atom
	 */
	List<Atom> getIntraFragmentAtomNeighbours(Atom atom) {
		List<Atom> results = new ArrayList<>(atom.getBondCount());
		for(Bond b : atom.getBonds()) {
			Atom otherAtom = b.getOtherAtom(atom);
			if (otherAtom == null) {
				throw new RuntimeException("OPSIN Bug: A bond associated with an atom does not involve it");
			}
			//If the other atom is in atomMapFromId then it is in this fragment
			if (atomMapFromId.get(otherAtom.getID()) != null) {
				results.add(otherAtom);
			}
		}
		return
				results;
	}

	/**Calculates the number of bonds connecting to the atom, excluding bonds to implicit
	 * hydrogens. Double bonds count as
	 * two bonds, etc. Eg ethene - both C's have an incoming valency of 2.
	 *
	 * Only bonds to atoms within the fragment are counted. Suffix atoms are excluded
	 *
	 * @param atom
	 * @return Incoming Valency
	 * @throws StructureBuildingException
	 */
	int getIntraFragmentIncomingValency(Atom atom) throws StructureBuildingException {
		int v = 0;
		for(Bond b : atom.getBonds()) {
			//recalled atoms will be null if they are not part of this fragment
			if(b.getFromAtom() == atom) {
				Atom a =getAtomByID(b.getTo());
				if (a != null && !a.getType().equals(SUFFIX_TYPE_VAL)){
					v += b.getOrder();
				}
			}
			else if(b.getToAtom() == atom) {
				Atom a =getAtomByID(b.getFrom());
				if (a != null && !a.getType().equals(SUFFIX_TYPE_VAL)){
					v += b.getOrder();
				}
			}
			else{
				throw new StructureBuildingException("A bond associated with an atom does not involve it");
			}
		}
		return v;
	}

	/**
	 * Checks valencies are sensible
	 * @throws StructureBuildingException
	 */
	void checkValencies() throws StructureBuildingException {
		for(Atom a : atomCollection) {
			if(!ValencyChecker.checkValency(a)) {
				throw new StructureBuildingException("Atom is in unphysical valency state! Element: " + a.getElement() + " valency: " + a.getIncomingValency());
			}
		}
	}

	/**
	 * Removes an atom from this fragment
	 * @param atom
	 */
	void removeAtom(Atom atom) {
		int atomID =atom.getID();
		atomMapFromId.remove(atomID);
		for (String l : atom.getLocants()) {
			atomMapFromLocant.remove(l);
		}
		if (defaultInAtom == atom){
			defaultInAtom = null;
		}
	}

	/**
	 * Retrieves the overall charge of the fragment by querying all its atoms
	 * @return
	 */
	int getCharge() {
		int charge=0;
		for (Atom a : atomCollection) {
			charge+=a.getCharge();
		}
		return charge;
	}

	Atom getDefaultInAtom() {
		return defaultInAtom;
	}

	void setDefaultInAtom(Atom inAtom) {
		this.defaultInAtom = inAtom;
	}

	Atom getDefaultInAtomOrFirstAtom() {
		return defaultInAtom != null ?
				defaultInAtom : getFirstAtom();
	}

	/**
	 * Adds a mapping between the locant and atom object
	 * @param locant A locant as a string
	 * @param a An atom
	 */
	void addMappingToAtomLocantMap(String locant, Atom a){
		atomMapFromLocant.put(locant, a);
	}

	/**
	 * Removes the mapping between a locant and its atom
	 * @param locant A locant as a string
	 */
	void removeMappingFromAtomLocantMap(String locant){
		atomMapFromLocant.remove(locant);
	}

	/**
	 * Checks to see whether a locant is present on this fragment
	 * @param locant
	 * @return
	 */
	boolean hasLocant(String locant) {
		return getAtomByLocant(locant) != null;
	}

	/**
	 * Returns an unmodifiable set of the locants associated with this fragment
	 * @return
	 */
	Set<String> getLocants() {
		return Collections.unmodifiableSet(atomMapFromLocant.keySet());
	}

	List<Atom> getIndicatedHydrogen() {
		return indicatedHydrogen;
	}

	void addIndicatedHydrogen(Atom atom) {
		indicatedHydrogen.add(atom);
	}

	/**
	 * Returns the id of the first atom in the fragment
	 * @return
	 * @throws StructureBuildingException
	 */
	int getIdOfFirstAtom() {
		return getFirstAtom().getID();
	}

	/**
	 * Returns the first atom in the fragment or null if it has no atoms
	 * Typically the first atom will be the first atom that was added to the fragment
	 * @return firstAtom
	 */
	Atom getFirstAtom(){
		Iterator<Atom> atomIterator =atomCollection.iterator();
		if (atomIterator.hasNext()){
			return atomIterator.next();
		}
		return null;
	}

	/**
	 * Clears and recreates atomMapFromId (and hence AtomCollection) using the order of the atoms in atomList
	 * @param atomList
	 * @throws StructureBuildingException
	 */
	void reorderAtomCollection(List<Atom> atomList) throws StructureBuildingException {
		if (atomMapFromId.size() != atomList.size()){
			throw new StructureBuildingException("atom list is not the same size as the number of atoms in the fragment");
		}
		atomMapFromId.clear();
		for (Atom atom : atomList) {
			atomMapFromId.put(atom.getID(), atom);
		}
	}

	/**
	 * Reorders the fragment's internal atomList by the value of the first locant of the atoms
	 * e.g.
	 * 1,2,3,3a,3b,4
	 * Used for assuring the correct order of atom iteration when performing ring fusion
	 * @throws StructureBuildingException
	 */
	void sortAtomListByLocant() throws StructureBuildingException {
		List<Atom> atomList =getAtomList();
		Collections.sort(atomList, new FragmentTools.SortByLocants());
		reorderAtomCollection(atomList);
	}

	@Override
	public Iterator<Atom> iterator() {
		return atomCollection.iterator();
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FragmentManager.java000066400000000000000000000726041451751637500275660ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

/** Holds the Fragments during the construction of the molecule,
 * handles the building of new fragments and handles the creation/deletion of atoms/bonds
 *
 * @author ptc24
 * @author dl387
 *
 */
class FragmentManager {

	/** A mapping between fragments and inter fragment bonds */
	private final Map<Fragment, Set<Bond>> fragToInterFragmentBond = new LinkedHashMap<>();

	/** All of the atom-containing fragments in the molecule */
	private final Set<Fragment> fragments = fragToInterFragmentBond.keySet();

	/** A builder for fragments specified as SMILES */
	private final SMILESFragmentBuilder sBuilder;

	/** A source of unique integers */
	private final IDManager idManager;

	/** Sets up a new Fragment manager, containing no fragments.
	 *
	 * @param sBuilder A SMILESFragmentBuilder - dependency injection.
	 * @param idManager An IDManager.
	 */
	FragmentManager(SMILESFragmentBuilder sBuilder, IDManager idManager) {
		if (sBuilder == null || idManager == null ){
			throw new IllegalArgumentException("FragmentManager was passed a null object in its constructor!");
		}
		this.sBuilder = sBuilder;
		this.idManager = idManager;
	}

	/** Builds a fragment, based on a SMILES string
	 * The fragment will not correspond to a token
	 *
	 * @param smiles The fragment to build
	 * @return The built fragment
	 * @throws StructureBuildingException
	 */
	Fragment buildSMILES(String smiles) throws StructureBuildingException {
		return buildSMILES(smiles, "", NONE_LABELS_VAL);
	}

	/** Builds a fragment, based on a SMILES string
	 * The fragment will not correspond to a token
	 *
	 * @param smiles
	 * @param type
	 * @param labelMapping
	 * @return
	 * @throws StructureBuildingException
	 */
	Fragment buildSMILES(String smiles, String type, String labelMapping) throws StructureBuildingException {
		Fragment newFrag = sBuilder.build(smiles, type, labelMapping);
		addFragment(newFrag);
		return newFrag;
	}

	/** Builds a fragment, based on a SMILES string
	 * The fragment will correspond to the given tokenEl
	 *
	 * @param smiles The fragment to build
	 * @param tokenEl The corresponding tokenEl
	 * @param labelMapping How to label the fragment
	 * @return The built fragment
	 * @throws StructureBuildingException
	 */
	Fragment buildSMILES(String smiles, Element tokenEl, String labelMapping) throws StructureBuildingException {
		Fragment newFrag = sBuilder.build(smiles, tokenEl, labelMapping);
		addFragment(newFrag);
		return newFrag;
	}

	/**Creates a new fragment, containing all of the atoms and bonds
	 * of all of the other fragments - i.e. the whole molecule. This updates
	 * which fragments the atoms think they are in to the new super fragment
	 * but does not change the original fragments.
	 * Hence the original fragments remain associated with their atoms
	 * Atoms and Bonds are not copied.
	 *
	 * @return The unified fragment
	 */
	Fragment getUnifiedFragment() {
		Fragment uniFrag = new Fragment("");
		for (Entry<Fragment, Set<Bond>> entry : fragToInterFragmentBond.entrySet()) {
			Fragment f = entry.getKey();
			Set<Bond> interFragmentBonds = entry.getValue();
			for(Atom atom : f) {
				uniFrag.addAtom(atom);
			}
			for(Bond bond : f.getBondSet()) {
				uniFrag.addBond(bond);
			}
			uniFrag.incorporateOutAtoms(f);
			uniFrag.incorporateFunctionalAtoms(f);

			for (Bond interFragmentBond : interFragmentBonds) {
				uniFrag.addBond(interFragmentBond);
			}
		}
		addFragment(uniFrag);
		return uniFrag;
	}

	/** Incorporates a fragment, usually a suffix, into a parent fragment
	 * This does:
	 * Imports all of the atoms and bonds from another fragment into this one.
	 * Also imports outAtoms and functionalAtoms
	 * Reassigns inter-fragment bonds of the child fragment as either intra-fragment bonds
	 * of the parent fragment or as inter-fragment bonds of the parent fragment
	 *
	 * The original fragment still maintains its original atomList/bondList
	 *
	 * @param childFrag The fragment to be incorporated
	 * @param parentFrag The parent fragment
	 * @throws StructureBuildingException
	 */
	void incorporateFragment(Fragment childFrag, Fragment parentFrag) throws StructureBuildingException {
		for(Atom atom : childFrag) {
			parentFrag.addAtom(atom);
		}
		for(Bond bond : childFrag.getBondSet()) {
			parentFrag.addBond(bond);
		}
		parentFrag.incorporateOutAtoms(childFrag);
		parentFrag.incorporateFunctionalAtoms(childFrag);

		Set<Bond> interFragmentBonds = fragToInterFragmentBond.get(childFrag);
		if (interFragmentBonds == null){
			throw new StructureBuildingException("Fragment not registered with this FragmentManager!");
		}
		for (Bond bond : interFragmentBonds) {//reassign inter-fragment bonds of child
			if (bond.getFromAtom().getFrag() == parentFrag && bond.getToAtom().getFrag() == parentFrag){
				//bond is now enclosed within parentFrag so make it an intra-fragment bond
				//and remove it from the inter-fragment set of the parentFrag
				parentFrag.addBond(bond);
				fragToInterFragmentBond.get(parentFrag).remove(bond);
			}
			else{
				//bond was an inter-fragment bond between the childFrag and another frag
				//It is now between the parentFrag and another frag
				addInterFragmentBond(bond);
			}
		}
		fragToInterFragmentBond.remove(childFrag);
	}

	/** Incorporates a fragment, usually a suffix, into a parent fragment, creating a bond between them.
	 *
	 * @param childFrag The fragment to be incorporated
	 * @param fromAtom An atom on that fragment
	 * @param parentFrag The parent fragment
	 * @param toAtom An atom on that fragment
	 * @param bondOrder The order of the joining bond
	 * @throws StructureBuildingException
	 */
	void incorporateFragment(Fragment childFrag, Atom fromAtom, Fragment parentFrag, Atom toAtom, int bondOrder) throws StructureBuildingException {
		if (!fromAtom.getFrag().equals(childFrag)){
			throw new StructureBuildingException("OPSIN Bug: fromAtom was not associated with childFrag!");
		}
		if (!toAtom.getFrag().equals(parentFrag)){
			throw new StructureBuildingException("OPSIN Bug: toAtom was not associated with parentFrag!");
		}
		incorporateFragment(childFrag, parentFrag);
		createBond(fromAtom, toAtom, bondOrder);
	}

	/** Converts an atom in a fragment to a different atomic symbol described by a SMILES string
	 * Charged atoms can also be specified eg.
	 * [NH4+]
	 *
	 * @param a The atom to change to a heteroatom
	 * @param smiles The SMILES for one atom
	 * @throws StructureBuildingException
	 */
	void replaceAtomWithSmiles(Atom a, String smiles) throws StructureBuildingException {
		replaceAtomWithAtom(a, getHeteroatom(smiles), false);
	}

	/**
	 * Converts the smiles for a heteroatom to an atom
	 * @param smiles
	 * @return
	 * @throws StructureBuildingException
	 */
	Atom getHeteroatom(String smiles) throws StructureBuildingException {
		Fragment heteroAtomFrag = sBuilder.build(smiles);
		if (heteroAtomFrag.getAtomCount() != 1){
			throw new StructureBuildingException("Heteroatom smiles described a fragment with multiple atoms!");
		}
		return heteroAtomFrag.getFirstAtom();
	}

	/** Uses the information given in the given heteroatom to change the atomic symbol
	 * and charge of the given atom
	 *
	 * @param a The atom to change to a heteroatom
	 * @param heteroAtom The atom to copy element/charge properties from
	 * @param assignLocant Whether a locant should be assigned to the heteroatom if the locant is not used elsewhere
	 * @throws StructureBuildingException if a charge disagreement occurs
	 */
	void replaceAtomWithAtom(Atom a, Atom heteroAtom, boolean assignLocant) throws StructureBuildingException {
		ChemEl chemEl =heteroAtom.getElement();
		int replacementCharge =heteroAtom.getCharge();
		if (replacementCharge!=0){
			if (a.getCharge()==0){
				a.addChargeAndProtons(replacementCharge, heteroAtom.getProtonsExplicitlyAddedOrRemoved());
			}
			else if (a.getCharge()==replacementCharge){
				a.setProtonsExplicitlyAddedOrRemoved(heteroAtom.getProtonsExplicitlyAddedOrRemoved());
			}
			else{
				throw new StructureBuildingException("Charge conflict between replacement term and atom to be replaced");
			}
		}
		a.setElement(chemEl);
		a.removeElementSymbolLocants();
		if (assignLocant){
			String primes = "";
			while (a.getFrag().getAtomByLocant(chemEl.toString() + primes) != null){//if element symbol already assigned, add a prime and try again
				primes += "'";
			}
			a.addLocant(chemEl.toString() + primes);
		}
	}

	/**
	 * Gets an atom, given an id number
	 * Use this if you don't know what fragment the atom is in
	 * @param id The id of the atom
	 * @return The atom, or null if no such atom exists.
	 */
	Atom getAtomByID(int id) {
		for(Fragment f : fragments) {
			Atom a = f.getAtomByID(id);
			if(a != null) {
				return a;
			}
		}
		return null;
	}

	/** Gets an atom, given an id number, throwing if fails.
	 * Use this if you don't know what fragment the atom is in
	 * @param id The id of the atom
	 * @return The atom
	 * @throws StructureBuildingException
	 */
	Atom getAtomByIDOrThrow(int id) throws StructureBuildingException {
		Atom a = getAtomByID(id);
		if(a == null) {
			throw new StructureBuildingException("Couldn't get atom by id");
		}
		return a;
	}

	/**Turns all of the spare valencies in the fragments into double bonds.
	 *
	 * @throws StructureBuildingException
	 */
	void convertSpareValenciesToDoubleBonds() throws StructureBuildingException {
		for(Fragment f : fragments) {
			FragmentTools.convertSpareValenciesToDoubleBonds(f);
		}
	}

	/**
	 * Checks valencies are all chemically reasonable.
	 * An exception is thrown if any are not
	 * @throws StructureBuildingException
	 */
	void checkValencies() throws StructureBuildingException {
		for(Fragment f : fragments) {
			f.checkValencies();
		}
	}

	Set<Fragment> getFragments() {
		return Collections.unmodifiableSet(fragments);
	}

	/**
	 * Registers a fragment
	 * @param frag
	 */
	private void addFragment(Fragment frag) {
		fragToInterFragmentBond.put(frag, new LinkedHashSet<>());
	}

	/**
	 * Removes a fragment
	 * Any inter-fragment bonds of this fragment are removed from the fragments it was connected to
	 * Throws an exception if fragment wasn't present
	 * @param frag
	 * @throws StructureBuildingException
	 */
	void removeFragment(Fragment frag) throws StructureBuildingException {
		Set<Bond> interFragmentBondsInvolvingFragmentSet = fragToInterFragmentBond.get(frag);
		if (interFragmentBondsInvolvingFragmentSet == null) {
			throw new StructureBuildingException("Fragment not registered with this FragmentManager!");
		}
		List<Bond> interFragmentBondsInvolvingFragment = new ArrayList<>(interFragmentBondsInvolvingFragmentSet);
		for (Bond bond : interFragmentBondsInvolvingFragment) {
			if (bond.getFromAtom().getFrag() == frag){
				fragToInterFragmentBond.get(bond.getToAtom().getFrag()).remove(bond);
			}
			else{
				fragToInterFragmentBond.get(bond.getFromAtom().getFrag()).remove(bond);
			}
		}
		fragToInterFragmentBond.remove(frag);
	}

	int getOverallCharge() {
		int totalCharge = 0;
		for (Fragment frag : fragments) {
			totalCharge += frag.getCharge();
		}
		return totalCharge;
	}

	/**
	 * Creates a copy of a fragment by copying data
	 * labels the atoms using new ids from the idManager
	 * @param originalFragment
	 * @return the clone of the fragment
	 * @throws StructureBuildingException
	 */
	Fragment copyFragment(Fragment originalFragment) throws StructureBuildingException {
		return copyAndRelabelFragment(originalFragment, 0);
	}

	/**
	 * Creates a copy of a fragment by copying data
	 * labels the atoms using new ids from the idManager
	 * @param originalFragment
	 * @param primesToAdd: The minimum number of primes to add to the
	 * cloned atoms. More primes will be added if necessary to keep the locants unique e.g. N in the presence of N' becomes N'' when this is 1
	 * @return the clone of the fragment
	 */
	Fragment copyAndRelabelFragment(Fragment originalFragment, int primesToAdd) {
		Element tokenEl = new TokenEl("");
		tokenEl.addAttribute(TYPE_ATR, originalFragment.getType());
		tokenEl.addAttribute(SUBTYPE_ATR, originalFragment.getSubType());
		Fragment newFragment = new Fragment(tokenEl);
		HashMap<Atom, Atom> oldToNewAtomMap = new HashMap<>();//maps old Atom to new Atom
		List<Atom> atomList =originalFragment.getAtomList();
		for (Atom atom : atomList) {
			int id = idManager.getNextID();
			ArrayList<String> newLocants = new ArrayList<>(atom.getLocants());
			if (primesToAdd !=0){
				for (int i = 0; i < newLocants.size(); i++) {
					String currentLocant = newLocants.get(i);
					int currentPrimes = StringTools.countTerminalPrimes(currentLocant);
					String locantSansPrimes = currentLocant.substring(0, currentLocant.length()-currentPrimes);
					int highestNumberOfPrimesWithThisLocant = currentPrimes;
					while (originalFragment.getAtomByLocant(locantSansPrimes + StringTools.multiplyString("'", highestNumberOfPrimesWithThisLocant +1 ))!=null){
						highestNumberOfPrimesWithThisLocant++;
					}
					newLocants.set(i, locantSansPrimes + StringTools.multiplyString("'", ((highestNumberOfPrimesWithThisLocant +1)*primesToAdd) + currentPrimes));
				}
			}
			Atom newAtom =new Atom(id, atom.getElement(), newFragment);
			for (String newLocant : newLocants) {
				newAtom.addLocant(newLocant);
			}
			newAtom.setCharge(atom.getCharge());
			newAtom.setIsotope(atom.getIsotope());
			newAtom.setSpareValency(atom.hasSpareValency());
			newAtom.setProtonsExplicitlyAddedOrRemoved(atom.getProtonsExplicitlyAddedOrRemoved());
			newAtom.setLambdaConventionValency(atom.getLambdaConventionValency());
			//outValency is derived from the outAtoms so is automatically cloned
			newAtom.setAtomIsInACycle(atom.getAtomIsInACycle());
			newAtom.setType(atom.getType());//may be different from fragment type if the original atom was formerly in a
			//suffix
			newAtom.setMinimumValency(atom.getMinimumValency());
			newAtom.setImplicitHydrogenAllowed(atom.getImplicitHydrogenAllowed());
			newFragment.addAtom(newAtom);
			oldToNewAtomMap.put(atom, newAtom);
		}
		for (Atom atom : atomList) {
			if (atom.getAtomParity() != null){
				Atom[] oldAtomRefs4 = atom.getAtomParity().getAtomRefs4();
				Atom[] newAtomRefs4 = new Atom[4];
				for (int i = 0; i < oldAtomRefs4.length; i++) {
					Atom oldAtom = oldAtomRefs4[i];
					if (oldAtom.equals(AtomParity.hydrogen)){
						newAtomRefs4[i] = AtomParity.hydrogen;
					}
					else if (oldAtom.equals(AtomParity.deoxyHydrogen)){
						newAtomRefs4[i] = AtomParity.deoxyHydrogen;
					}
					else{
						newAtomRefs4[i] = oldToNewAtomMap.get(oldAtom);
					}
				}
				AtomParity newAtomParity =new AtomParity(newAtomRefs4, atom.getAtomParity().getParity());
				newAtomParity.setStereoGroup(atom.getAtomParity().getStereoGroup());
				oldToNewAtomMap.get(atom).setAtomParity(newAtomParity);
			}
			Set<Atom> oldAmbiguousElementAssignmentAtoms = atom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT);
			if (oldAmbiguousElementAssignmentAtoms!=null){
				Set<Atom> newAtoms = new LinkedHashSet<>();
				for (Atom oldAtom : oldAmbiguousElementAssignmentAtoms) {
					newAtoms.add(oldToNewAtomMap.get(oldAtom));
				}
				oldToNewAtomMap.get(atom).setProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT, newAtoms);
			}
			Integer smilesHydrogenCount = atom.getProperty(Atom.SMILES_HYDROGEN_COUNT);
			if (smilesHydrogenCount!=null){
				oldToNewAtomMap.get(atom).setProperty(Atom.SMILES_HYDROGEN_COUNT, smilesHydrogenCount);
			}
			Integer oxidationNumber = atom.getProperty(Atom.OXIDATION_NUMBER);
			if (oxidationNumber!=null){
				oldToNewAtomMap.get(atom).setProperty(Atom.OXIDATION_NUMBER, oxidationNumber);
			}
			Boolean isAldehyde = atom.getProperty(Atom.ISALDEHYDE);
			if (isAldehyde!=null){
				oldToNewAtomMap.get(atom).setProperty(Atom.ISALDEHYDE, isAldehyde);
			}
			Boolean isAnomeric = atom.getProperty(Atom.ISANOMERIC);
			if (isAnomeric!=null){
				oldToNewAtomMap.get(atom).setProperty(Atom.ISANOMERIC, isAnomeric);
			}
			Integer atomClass = atom.getProperty(Atom.ATOM_CLASS);
			if
(atomClass!=null){ oldToNewAtomMap.get(atom).setProperty(Atom.ATOM_CLASS, atomClass); } String homologyGroup = atom.getProperty(Atom.HOMOLOGY_GROUP); if (homologyGroup != null) { oldToNewAtomMap.get(atom).setProperty(Atom.HOMOLOGY_GROUP, homologyGroup); } List oldPositionVariationAtoms = atom.getProperty(Atom.POSITION_VARIATION_BOND); if (oldPositionVariationAtoms != null) { List newAtoms = new ArrayList<>(); for (Atom oldAtom : oldPositionVariationAtoms) { newAtoms.add(oldToNewAtomMap.get(oldAtom)); } oldToNewAtomMap.get(atom).setProperty(Atom.POSITION_VARIATION_BOND, newAtoms); } } for (int i = 0, l = originalFragment.getOutAtomCount(); i < l; i++) { OutAtom outAtom = originalFragment.getOutAtom(i); newFragment.addOutAtom(oldToNewAtomMap.get(outAtom.getAtom()), outAtom.getValency(), outAtom.isSetExplicitly()); if (outAtom.getLocant() !=null){ newFragment.getOutAtom(newFragment.getOutAtomCount() -1).setLocant(outAtom.getLocant() + StringTools.multiplyString("'", primesToAdd) ); } } for (int i = 0, l = originalFragment.getFunctionalAtomCount(); i < l; i++) { FunctionalAtom functionalAtom = originalFragment.getFunctionalAtom(i); newFragment.addFunctionalAtom(oldToNewAtomMap.get(functionalAtom.getAtom())); } if (originalFragment.getDefaultInAtom() != null) { newFragment.setDefaultInAtom(oldToNewAtomMap.get(originalFragment.getDefaultInAtom())); } Set bondSet =originalFragment.getBondSet(); for (Bond bond : bondSet) { Bond newBond = createBond(oldToNewAtomMap.get(bond.getFromAtom()), oldToNewAtomMap.get(bond.getToAtom()), bond.getOrder()); newBond.setSmilesStereochemistry(bond.getSmilesStereochemistry()); if (bond.getBondStereo() != null){ Atom[] oldAtomRefs4 = bond.getBondStereo().getAtomRefs4(); Atom[] newAtomRefs4 = new Atom[4]; for (int i = 0; i < oldAtomRefs4.length; i++) { newAtomRefs4[i] = oldToNewAtomMap.get(oldAtomRefs4[i]); } newBond.setBondStereoElement(newAtomRefs4, bond.getBondStereo().getBondStereoValue()); } } List indicatedHydrogenAtoms = 
originalFragment.getIndicatedHydrogen(); for (Atom atom : indicatedHydrogenAtoms) { newFragment.addIndicatedHydrogen(oldToNewAtomMap.get(atom)); } addFragment(newFragment); return newFragment; } /** * Takes an element and produces a copy of it. Groups and suffixes are copied so that the new element * has its own group and suffix fragments * @param elementToBeCloned * @param state The current buildstate * @return * @throws StructureBuildingException */ Element cloneElement(BuildState state, Element elementToBeCloned) throws StructureBuildingException { return cloneElement(state, elementToBeCloned, 0); } /** * Takes an element and produces a copy of it. Groups and suffixes are copied so that the new element * has its own group and suffix fragments * @param elementToBeCloned * @param state The current buildstate * @param primesToAdd: The minimum number of primes to add to the cloned atoms. More primes will be added if necessary to keep the locants unique e.g. N in the presence of N' becomes N'' when this is 1 * @return * @throws StructureBuildingException */ Element cloneElement(BuildState state, Element elementToBeCloned, int primesToAdd) throws StructureBuildingException { Element clone = elementToBeCloned.copy(); List originalGroups = OpsinTools.getDescendantElementsWithTagName(elementToBeCloned, XmlDeclarations.GROUP_EL); List clonedGroups = OpsinTools.getDescendantElementsWithTagName(clone, XmlDeclarations.GROUP_EL); HashMap oldNewFragmentMapping =new LinkedHashMap<>(); for (int i = 0; i < originalGroups.size(); i++) { Fragment originalFragment = originalGroups.get(i).getFrag(); Fragment newFragment = copyAndRelabelFragment(originalFragment, primesToAdd); oldNewFragmentMapping.put(originalFragment, newFragment); newFragment.setTokenEl(clonedGroups.get(i)); clonedGroups.get(i).setFrag(newFragment); List originalSuffixes =state.xmlSuffixMap.get(originalGroups.get(i)); List newSuffixFragments =new ArrayList<>(); for (Fragment suffix : originalSuffixes) { 
newSuffixFragments.add(copyFragment(suffix)); } state.xmlSuffixMap.put(clonedGroups.get(i), newSuffixFragments); } Set interFragmentBondsToClone = new LinkedHashSet<>(); for (Fragment originalFragment : oldNewFragmentMapping.keySet()) {//add inter fragment bonds to cloned fragments for (Bond bond : fragToInterFragmentBond.get(originalFragment)) { interFragmentBondsToClone.add(bond); } } for (Bond bond : interFragmentBondsToClone) { Atom originalFromAtom = bond.getFromAtom(); Atom originalToAtom = bond.getToAtom(); Fragment originalFragment1 = originalFromAtom.getFrag(); Fragment originalFragment2 = originalToAtom.getFrag(); if (!oldNewFragmentMapping.containsKey(originalFragment1) || (!oldNewFragmentMapping.containsKey(originalFragment2))){ throw new StructureBuildingException("An element that was a clone contained a bond that went outside the scope of the cloning"); } Fragment newFragment1 = oldNewFragmentMapping.get(originalFragment1); Fragment newFragment2 = oldNewFragmentMapping.get(originalFragment2); Atom fromAtom = newFragment1.getAtomList().get(originalFragment1.getAtomList().indexOf(originalFromAtom)); Atom toAtom = newFragment2.getAtomList().get(originalFragment2.getAtomList().indexOf(originalToAtom)); createBond(fromAtom, toAtom, bond.getOrder()); } return clone; } /** * Takes an atom, removes it and bonds everything that was bonded to it to the replacementAtom with the original bond orders. 
* Non element symbol locants are copied to the replacement atom * @param atomToBeReplaced * @param replacementAtom */ void replaceAtomWithAnotherAtomPreservingConnectivity(Atom atomToBeReplaced, Atom replacementAtom) { atomToBeReplaced.removeElementSymbolLocants(); List locants = new ArrayList<>(atomToBeReplaced.getLocants()); for (String locant : locants) { atomToBeReplaced.removeLocant(locant); replacementAtom.addLocant(locant); } List bonds = atomToBeReplaced.getBonds(); for (Bond bond : bonds) { Atom connectedAtom = bond.getOtherAtom(atomToBeReplaced); if (connectedAtom.getAtomParity() != null){ Atom[] atomRefs4 = connectedAtom.getAtomParity().getAtomRefs4(); for (int i = 0 ; i < 4; i++) { if (atomRefs4[i] == atomToBeReplaced){ atomRefs4[i] = replacementAtom; break; } } } if (bond.getBondStereo() != null){ Atom[] atomRefs4 = bond.getBondStereo().getAtomRefs4(); for (int i = 0 ; i < 4; i++) { if (atomRefs4[i] == atomToBeReplaced){ atomRefs4[i] = replacementAtom; break; } } } createBond(replacementAtom, bond.getOtherAtom(atomToBeReplaced), bond.getOrder()); } removeAtomAndAssociatedBonds(atomToBeReplaced); } /** * Removes a bond from the inter-fragment bond mappings if it was present * @param bond */ private void removeInterFragmentBondIfPresent(Bond bond) { fragToInterFragmentBond.get(bond.getFromAtom().getFrag()).remove(bond); fragToInterFragmentBond.get(bond.getToAtom().getFrag()).remove(bond); } /** * Adds a bond to the fragment to inter-fragment bond mappings * @param bond */ private void addInterFragmentBond(Bond bond) { fragToInterFragmentBond.get(bond.getFromAtom().getFrag()).add(bond); fragToInterFragmentBond.get(bond.getToAtom().getFrag()).add(bond); } /** * Gets an unmodifiable view of the set of the inter-fragment bonds a fragment is involved in * @param frag * @return set of inter fragment bonds */ Set getInterFragmentBonds(Fragment frag) { Set interFragmentBonds = fragToInterFragmentBond.get(frag); if (interFragmentBonds == null) { throw new 
IllegalArgumentException("Fragment not registered with this FragmentManager!");
		}
		return Collections.unmodifiableSet(interFragmentBonds);
	}

	/**
	 * Create a new Atom of the given element belonging to the given fragment
	 * @param chemEl
	 * @param frag
	 * @return Atom
	 */
	Atom createAtom(ChemEl chemEl, Fragment frag) {
		Atom a = new Atom(idManager.getNextID(), chemEl, frag);
		frag.addAtom(a);
		return a;
	}

	/**
	 * Create a new bond between two atoms.
	 * The bond is associated with these atoms.
	 * It is also listed as an inter-fragment bond or associated with a fragment
	 * @param fromAtom
	 * @param toAtom
	 * @param bondOrder
	 * @return Bond
	 */
	Bond createBond(Atom fromAtom, Atom toAtom, int bondOrder) {
		Bond b = new Bond(fromAtom, toAtom, bondOrder);
		fromAtom.addBond(b);
		toAtom.addBond(b);
		if (fromAtom.getFrag() == toAtom.getFrag()){
			fromAtom.getFrag().addBond(b);
		}
		else{
			addInterFragmentBond(b);
		}
		return b;
	}

	void removeAtomAndAssociatedBonds(Atom atom){
		List<Bond> bondsToBeRemoved = new ArrayList<>(atom.getBonds());
		for (Bond bond : bondsToBeRemoved) {
			removeBond(bond);
		}
		atom.getFrag().removeAtom(atom);
		Set<Atom> ambiguousElementAssignment = atom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT);
		if (ambiguousElementAssignment != null){
			ambiguousElementAssignment.remove(atom);
			if (ambiguousElementAssignment.size() == 1){
				ambiguousElementAssignment.iterator().next().setProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT, null);
			}
		}
	}

	void removeBond(Bond bond){
		bond.getFromAtom().getFrag().removeBond(bond);
		bond.getFromAtom().removeBond(bond);
		bond.getToAtom().removeBond(bond);
		removeInterFragmentBondIfPresent(bond);
	}

	/**
	 * Valency is used to determine the expected number of hydrogens
	 * Hydrogens are then added to bring the number of connections up to the minimum required to satisfy the atom's valency
	 * This allows the valency of the atom to be encoded e.g. phosphane-3 hydrogen, phosphorane-5 hydrogen.
	 * It is also necessary when considering stereochemistry as a hydrogen beats nothing in the CIP rules
	 * @throws StructureBuildingException
	 */
	void makeHydrogensExplicit() throws StructureBuildingException {
		for (Fragment fragment : fragments) {
			List<Atom> atomList = fragment.getAtomList();
			for (Atom parentAtom : atomList) {
				int explicitHydrogensToAdd = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(parentAtom);
				for (int i = 0; i < explicitHydrogensToAdd; i++) {
					Atom hydrogen = createAtom(ChemEl.H, fragment);
					createBond(parentAtom, hydrogen, 1);
				}
				if (parentAtom.getAtomParity() != null){
					if (explicitHydrogensToAdd > 1) {
						//Cannot have tetrahedral chirality with 2 or more hydrogens
						parentAtom.setAtomParity(null);//probably caused by deoxy
					}
					else {
						modifyAtomParityToTakeIntoAccountExplicitHydrogen(parentAtom);
					}
				}
			}
		}
	}

	private void modifyAtomParityToTakeIntoAccountExplicitHydrogen(Atom atom) throws StructureBuildingException {
		AtomParity atomParity = atom.getAtomParity();
		if (!StereoAnalyser.isPossiblyStereogenic(atom)){
			//no longer a stereoCentre e.g. 
due to unsaturation atom.setAtomParity(null); } else{ Atom[] atomRefs4 = atomParity.getAtomRefs4(); Integer positionOfImplicitHydrogen = null; Integer positionOfDeoxyHydrogen = null; for (int i = 0; i < atomRefs4.length; i++) { Atom a = atomRefs4[i]; if (a.equals(AtomParity.hydrogen)){ positionOfImplicitHydrogen = i; } else if (a.equals(AtomParity.deoxyHydrogen)){ positionOfDeoxyHydrogen = i; } } if (positionOfImplicitHydrogen != null || positionOfDeoxyHydrogen != null) { //atom parity was set in SMILES, the dummy hydrogen atom has now been substituted List neighbours = atom.getAtomNeighbours(); for (Atom atomRef : atomRefs4) { neighbours.remove(atomRef); } if (neighbours.isEmpty()) { throw new StructureBuildingException("OPSIN Bug: Unable to determine which atom has substituted a hydrogen at stereocentre"); } else if (neighbours.size() == 1 && positionOfDeoxyHydrogen != null) { atomRefs4[positionOfDeoxyHydrogen] = neighbours.get(0); if (positionOfImplicitHydrogen != null){ throw new StructureBuildingException("OPSIN Bug: Unable to determine which atom has substituted a hydrogen at stereocentre"); } } else if (neighbours.size() == 1 && positionOfImplicitHydrogen != null) { atomRefs4[positionOfImplicitHydrogen] = neighbours.get(0); } else if (neighbours.size() == 2 && positionOfDeoxyHydrogen != null && positionOfImplicitHydrogen != null) { try{ List cipOrderedAtoms = new CipSequenceRules(atom).getNeighbouringAtomsInCipOrder(); //higher priority group replaces the former hydroxy groups (deoxyHydrogen) if (cipOrderedAtoms.indexOf(neighbours.get(0)) > cipOrderedAtoms.indexOf(neighbours.get(1))) { atomRefs4[positionOfDeoxyHydrogen] = neighbours.get(0); atomRefs4[positionOfImplicitHydrogen] = neighbours.get(1); } else{ atomRefs4[positionOfDeoxyHydrogen] = neighbours.get(1); atomRefs4[positionOfImplicitHydrogen] = neighbours.get(0); } } catch (CipOrderingException e){ //assume ligands equivalent so it makes no difference which is which atomRefs4[positionOfDeoxyHydrogen] = 
neighbours.get(0);
						atomRefs4[positionOfImplicitHydrogen] = neighbours.get(1);
					}
				}
				else{
					throw new StructureBuildingException("OPSIN Bug: Unable to determine which atom has substituted a hydrogen at stereocentre");
				}
			}
		}
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FragmentTools.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*;
import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

/**
 * Sorts a list of atoms such that their order agrees with the order symbolic locants are typically assigned
 *
 * Preferred atoms are sorted to the START of the list
 * @author dl387
 *
 */
class SortAtomsForElementSymbols implements Comparator<Atom> {

	public int compare(Atom a, Atom b){
		int bondOrderA = a.getProperty(Atom.VISITED);
		int bondOrderB = b.getProperty(Atom.VISITED);
		if (bondOrderA > bondOrderB) {//lower order bond is preferred
			return 1;
		}
		if (bondOrderA < bondOrderB) {
			return -1;
		}

		if (a.getOutValency() > b.getOutValency()) {//prefer atoms with outValency
			return -1;
		}
		if (a.getOutValency() < b.getOutValency()) {
			return 1;
		}

		int expectedHydrogenA = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a);
		int expectedHydrogenB = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(b);

		if (expectedHydrogenA > expectedHydrogenB) {//prefer atoms with more hydrogen
			return -1;
		}
		if (expectedHydrogenA < expectedHydrogenB) {
			return 1;
		}
		return 0;
	}
}

/**
 * Performs a very crude sort of atoms such that those that are more likely to be substituted are preferred for low locants
 * Preferred atoms are sorted to the START of the
list * @author dl387 * */ class SortAtomsForMainGroupElementSymbols implements Comparator { public int compare(Atom a, Atom b){ int compare = a.getElement().compareTo(b.getElement()); if (compare != 0) {//only bother comparing properly if elements are the same return compare; } int aExpectedHydrogen = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a); int bExpectedHydrogen = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(b); if (aExpectedHydrogen > 0 && bExpectedHydrogen == 0) {//having substitutable hydrogen preferred return -1; } if (aExpectedHydrogen == 0 && bExpectedHydrogen > 0) { return 1; } List locantsA = a.getLocants(); List locantsB = b.getLocants(); if (locantsA.isEmpty() && !locantsB.isEmpty()) {//having no locants preferred return -1; } if (!locantsA.isEmpty() && locantsB.isEmpty()) { return 1; } return 0; } } class FragmentTools { /** * Sorts by number, then by letter e.g. 4,3,3b,5,3a,2 -->2,3,3a,3b,4,5 * @author dl387 * */ static class SortByLocants implements Comparator { static final Pattern locantSegmenter =Pattern.compile("(\\d+)([a-z]?)('*)"); public int compare(Atom atoma, Atom atomb){ if (atoma.getType().equals(SUFFIX_TYPE_VAL) && !atomb.getType().equals(SUFFIX_TYPE_VAL)){//suffix atoms go to the back return 1; } if (atomb.getType().equals(SUFFIX_TYPE_VAL) && !atoma.getType().equals(SUFFIX_TYPE_VAL)){ return -1; } String locanta =atoma.getFirstLocant(); String locantb =atomb.getFirstLocant(); if (locanta==null|| locantb==null){ return 0; } Matcher m1 =locantSegmenter.matcher(locanta); Matcher m2 =locantSegmenter.matcher(locantb); if (!m1.matches()|| !m2.matches()){//inappropriate locant return 0; } String locantaPrimes = m1.group(3); String locantbPrimes = m2.group(3); if (locantaPrimes.compareTo(locantbPrimes)>=1) { return 1;//e.g. 1'' vs 1' } else if (locantbPrimes.compareTo(locantaPrimes)>=1) { return -1;//e.g. 
1' vs 1'' } else{ int locantaNumber = Integer.parseInt(m1.group(1)); int locantbNumber = Integer.parseInt(m2.group(1)); if (locantaNumber >locantbNumber) { return 1;//e.g. 3 vs 2 or 3a vs 2 } else if (locantbNumber >locantaNumber) { return -1;//e.g. 2 vs 3 or 2 vs 3a } else{ String locantaLetter = m1.group(2); String locantbLetter = m2.group(2); if (locantaLetter.compareTo(locantbLetter)>=1) { return 1;//e.g. 1b vs 1a } else if (locantbLetter.compareTo(locantaLetter)>=1) { return -1;//e.g. 1a vs 1b } return 0; } } } } /** * Assign element locants to groups/suffixes. These are in addition to any numerical locants that are present. * Adds primes to make each locant unique. * For groups a locant is not given to carbon atoms * If an element appears in a suffix then element locants are not assigned to occurrences of that element in the parent group * HeteroAtoms in acidStems connected to the first Atom of the fragment are treated as if they were suffix atoms * @param suffixableFragment * @param suffixFragments * @throws StructureBuildingException */ static void assignElementLocants(Fragment suffixableFragment, List suffixFragments) throws StructureBuildingException { Map elementCount = new HashMap<>();//keeps track of how many times each element has been seen Set atomsToIgnore = new HashSet<>();//atoms which already have a symbolic locant List allFragments = new ArrayList<>(suffixFragments); allFragments.add(suffixableFragment); /* * First check whether any element locants have already been assigned, these will take precedence */ for (Fragment fragment : allFragments) { List atomList = fragment.getAtomList(); for (Atom atom : atomList) { List elementSymbolLocants = atom.getElementSymbolLocants(); for (String locant : elementSymbolLocants) { int primeCount = StringTools.countTerminalPrimes(locant); String element = locant.substring(0, locant.length() - primeCount); Integer seenCount = elementCount.get(element); if (seenCount == null || (seenCount < primeCount + 1)){ 
elementCount.put(element, primeCount + 1);
					}
					atomsToIgnore.add(atom);
				}
			}
		}

		{
			Set<String> elementsToIgnore = elementCount.keySet();

			for (Fragment fragment : allFragments) {
				List<Atom> atomList = fragment.getAtomList();
				for (Atom atom : atomList) {
					if (elementsToIgnore.contains(atom.getElement().toString())){
						atomsToIgnore.add(atom);
					}
				}
			}
		}

		String fragType = suffixableFragment.getType();
		if (fragType.equals(NONCARBOXYLICACID_TYPE_VAL) || fragType.equals(CHALCOGENACIDSTEM_TYPE_VAL)){
			if (suffixFragments.size() != 0){
				throw new StructureBuildingException("No suffix fragments were expected to be present on non carboxylic acid");
			}
			processNonCarboxylicAcidLabelling(suffixableFragment, elementCount, atomsToIgnore);
		}
		else{
			if (suffixFragments.size() > 0){
				processSuffixLabelling(suffixFragments, elementCount, atomsToIgnore);
				Integer seenCount = elementCount.get("N");
				if (seenCount != null && seenCount > 1){//look for special case violation of IUPAC rule, =(N)=(NN) is N//N' in practice rather than N/N'/N''
					//this method will put both locants on the N with substitutable hydrogen
					detectAndCorrectHydrazoneDerivativeViolation(suffixFragments);
				}
			}
			processMainGroupLabelling(suffixableFragment, elementCount, atomsToIgnore);
		}
	}

	private static void detectAndCorrectHydrazoneDerivativeViolation(List<Fragment> suffixFragments) {
		fragmentLoop: for (Fragment suffixFrag : suffixFragments) {
			List<Atom> atomList = suffixFrag.getAtomList();
			for (Atom atom : atomList) {
				if (atom.getElement() == ChemEl.N && atom.getIncomingValency() ==3 ){
					List<String> locants =atom.getLocants();
					if (locants.size()==1 && MATCH_ELEMENT_SYMBOL_LOCANT.matcher(locants.get(0)).matches()){
						List<Atom> neighbours = atom.getAtomNeighbours();
						for (Atom neighbour : neighbours) {
							if (neighbour.getElement() == ChemEl.N && neighbour.getIncomingValency()==1){
								String locantToAdd = locants.get(0);
								atom.clearLocants();
								neighbour.addLocant(locantToAdd);
								continue fragmentLoop;
							}
						}
					}
				}
			}
		}
	}

	private static void processMainGroupLabelling(Fragment suffixableFragment, Map<String, Integer> 
elementCount, Set atomsToIgnore) { Set elementToIgnore = new HashSet<>(elementCount.keySet()); List atomList = suffixableFragment.getAtomList(); Collections.sort(atomList, new SortAtomsForMainGroupElementSymbols()); Atom atomToAddCLabelTo = null;//only add a C label if there is only one C in the main group boolean seenMoreThanOneC = false; for (Atom atom : atomList) { if (atomsToIgnore.contains(atom)){ continue; } ChemEl chemEl = atom.getElement(); if (elementToIgnore.contains(chemEl.toString())){ continue; } if (chemEl == ChemEl.C) { if (seenMoreThanOneC) { continue; } if (atomToAddCLabelTo != null){ atomToAddCLabelTo = null; seenMoreThanOneC = true; } else{ atomToAddCLabelTo = atom; } } else{ assignLocant(atom, elementCount); } } if (atomToAddCLabelTo != null){ atomToAddCLabelTo.addLocant("C"); } } private static void processSuffixLabelling(List suffixFragments, Map elementCount, Set atomsToIgnore) { List startingAtoms = new ArrayList<>(); Set atomsVisited = new HashSet<>(); for (Fragment fragment : suffixFragments) { Atom rAtom = fragment.getFirstAtom(); List nextAtoms = getIntraFragmentNeighboursAndSetVisitedBondOrder(rAtom); atomsVisited.addAll(nextAtoms); startingAtoms.addAll(nextAtoms); } Collections.sort(startingAtoms, new SortAtomsForElementSymbols()); Deque atomsToConsider = new ArrayDeque<>(startingAtoms); while (atomsToConsider.size() > 0){ assignLocantsAndExploreNeighbours(elementCount, atomsToIgnore, atomsVisited, atomsToConsider); } } private static void processNonCarboxylicAcidLabelling(Fragment suffixableFragment, Map elementCount, Set atomsToIgnore) { Set atomsVisited = new HashSet<>(); Atom firstAtom = suffixableFragment.getFirstAtom(); List startingAtoms = getIntraFragmentNeighboursAndSetVisitedBondOrder(firstAtom); Collections.sort(startingAtoms, new SortAtomsForElementSymbols()); atomsVisited.add(firstAtom); Deque atomsToConsider = new ArrayDeque<>(startingAtoms); while (atomsToConsider.size() > 0){ 
assignLocantsAndExploreNeighbours(elementCount, atomsToIgnore, atomsVisited, atomsToConsider); } if (!atomsToIgnore.contains(firstAtom) && firstAtom.determineValency(true) > firstAtom.getIncomingValency()) { //e.g. carbonimidoyl the carbon has locant C assignLocant(firstAtom, elementCount); } } private static void assignLocantsAndExploreNeighbours(Map elementCount, Set atomsToIgnore, Set atomsVisited, Deque atomsToConsider) { Atom atom = atomsToConsider.removeFirst(); atomsVisited.add(atom); if (!atomsToIgnore.contains(atom)) {//assign locant assignLocant(atom, elementCount); } List atomsToExplore = getIntraFragmentNeighboursAndSetVisitedBondOrder(atom); atomsToExplore.removeAll(atomsVisited); Collections.sort(atomsToExplore, new SortAtomsForElementSymbols()); for (int i = atomsToExplore.size() - 1; i >= 0; i--) { atomsToConsider.addFirst(atomsToExplore.get(i)); } } /** * Gets the neighbours of an atom that claim to be within the same frag * The order of bond taken to get to the neighbour is set on the neighbours Atom.VISITED property * @param atom * @return */ private static List getIntraFragmentNeighboursAndSetVisitedBondOrder(Atom atom) { List atomsToExplore = new ArrayList<>(); List bonds = atom.getBonds(); for (Bond bond : bonds) { Atom neighbour = bond.getOtherAtom(atom); if (neighbour.getFrag().equals(atom.getFrag())) { atomsToExplore.add(neighbour); neighbour.setProperty(Atom.VISITED, bond.getOrder()); } } return atomsToExplore; } private static void assignLocant(Atom atom, Map elementCount) { String element = atom.getElement().toString(); Integer count = elementCount.get(element); if (count == null){ atom.addLocant(element); elementCount.put(element, 1); } else{ atom.addLocant(element + StringTools.multiplyString("'", count)); elementCount.put(element, count + 1); } } /** Adjusts the order of a bond in a fragment. 
* * @param fromAtom The lower-numbered atom in the bond * @param bondOrder The new bond order * @param fragment The fragment * @return The bond that was unsaturated * @throws StructureBuildingException */ static Bond unsaturate(Atom fromAtom, int bondOrder, Fragment fragment) throws StructureBuildingException { Atom toAtom = null; Integer locant = null; try{ String primes =""; String locantStr = fromAtom.getFirstLocant(); int numberOfPrimes = StringTools.countTerminalPrimes(locantStr); locant = Integer.parseInt(locantStr.substring(0, locantStr.length()-numberOfPrimes)); primes = StringTools.multiplyString("'", numberOfPrimes); Atom possibleToAtom = fragment.getAtomByLocant(String.valueOf(locant +1)+primes); if (possibleToAtom !=null && fromAtom.getBondToAtom(possibleToAtom)!=null){ toAtom = possibleToAtom; } else if (possibleToAtom ==null && fromAtom.getAtomIsInACycle()){//allow something like cyclohexan-6-ene, something like butan-4-ene will still fail possibleToAtom = fragment.getAtomByLocant("1" + primes); if (possibleToAtom !=null && fromAtom.getBondToAtom(possibleToAtom)!=null){ toAtom =possibleToAtom; } } } catch (Exception e) { List atomList = fragment.getAtomList(); int initialIndice = atomList.indexOf(fromAtom); if (initialIndice +1 < atomList.size() && fromAtom.getBondToAtom(atomList.get(initialIndice +1))!=null){ toAtom = atomList.get(initialIndice +1); } } if (toAtom==null){ if (locant!=null){ throw new StructureBuildingException("Could not find bond to unsaturate starting from the atom with locant: " +locant); } else{ throw new StructureBuildingException("Could not find bond to unsaturate"); } } Bond b = fromAtom.getBondToAtomOrThrow(toAtom); if (b.getOrder() != 1) { throw new StructureBuildingException("Bond indicated to be unsaturated was already unsaturated"); } b.setOrder(bondOrder); return b; } /** Adjusts the order of a bond in a fragment. 
	 *
	 * @param fromAtom The first atom in the bond
	 * @param locantTo The locant of the other atom in the bond
	 * @param bondOrder The new bond order
	 * @param fragment The fragment
	 * @throws StructureBuildingException
	 */
	static void unsaturate(Atom fromAtom, String locantTo, int bondOrder, Fragment fragment) throws StructureBuildingException {
		Atom toAtom = fragment.getAtomByLocantOrThrow(locantTo);
		Bond b = fromAtom.getBondToAtomOrThrow(toAtom);
		if (b.getOrder() != 1) {
			throw new StructureBuildingException("Bond indicated to be unsaturated was already unsaturated");
		}
		b.setOrder(bondOrder);
	}

	/**Adjusts the labeling on a fused ring system, such that bridgehead atoms
	 * have locants ending in 'a' or 'b' etc. Example: naphthalene
	 * 1,2,3,4,5,6,7,8,9,10->1,2,3,4,4a,5,6,7,8,8a
	 * @param atomList
	 */
	static void relabelLocantsAsFusedRingSystem(List<Atom> atomList) {
		int locantVal = 0;
		char locantLetter = 'a';
		for (Atom atom : atomList) {
			atom.clearLocants();
		}
		for (Atom atom : atomList) {
			if(atom.getElement() != ChemEl.C || atom.getBondCount() < 3) {
				locantVal++;
				locantLetter = 'a';
				atom.addLocant(Integer.toString(locantVal));
			}
			else {
				atom.addLocant(Integer.toString(locantVal) + locantLetter);
				locantLetter++;
			}
		}
	}

	/**
	 * Adds the given string to all the locants of the atoms.
	 * @param atomList
	 * @param stringToAdd
	 */
	static void relabelLocants(List<Atom> atomList, String stringToAdd) {
		for (Atom atom : atomList) {
			List<String> locants = new ArrayList<>(atom.getLocants());
			atom.clearLocants();
			for (String locant : locants) {
				atom.addLocant(locant + stringToAdd);
			}
		}
	}

	/**
	 * Adds the given string to all the numeric locants of the atoms.
* @param atomList * @param stringToAdd */ static void relabelNumericLocants(List atomList, String stringToAdd) { for (Atom atom : atomList) { List locants = new ArrayList<>(atom.getLocants()); for (String locant : locants) { if (MATCH_NUMERIC_LOCANT.matcher(locant).matches()){ atom.removeLocant(locant); atom.addLocant(locant + stringToAdd); } } } } static void splitOutAtomIntoValency1OutAtoms(OutAtom outAtom) { Fragment frag =outAtom.getAtom().getFrag(); for (int i = 1; i < outAtom.getValency(); i++) { frag.addOutAtom(outAtom.getAtom(), 1, outAtom.isSetExplicitly()); } outAtom.setValency(1); } /** * Checks if the specified Nitrogen is potentially involved in [NH]C=N <-> N=C[NH] tautomerism * Given the starting nitrogen returns the other nitrogen or null if that nitrogen does not appear to be involved in such tautomerism * @param nitrogen * @return null or the other nitrogen */ static Atom detectSimpleNitrogenTautomer(Atom nitrogen) { if (nitrogen.getElement() == ChemEl.N && nitrogen.getAtomIsInACycle()){ for (Atom neighbour : nitrogen.getAtomNeighbours()) { if (neighbour.hasSpareValency() && neighbour.getElement() == ChemEl.C && neighbour.getAtomIsInACycle()){ List distance2Neighbours = neighbour.getAtomNeighbours(); distance2Neighbours.remove(nitrogen); for (Atom distance2Neighbour : distance2Neighbours) { if (distance2Neighbour.hasSpareValency() && distance2Neighbour.getElement() == ChemEl.N && distance2Neighbour.getAtomIsInACycle() && distance2Neighbour.getCharge()==0){ return distance2Neighbour; } } } } } return null; } /**Increases the order of bonds joining atoms with spareValencies, * and uses up said spareValencies. 
* [spare valency is an indication of the atom's desire to form the maximum number of non-cumulative double bonds] * @param frag * @throws StructureBuildingException If the algorithm can't work out where to put the bonds */ static void convertSpareValenciesToDoubleBonds(Fragment frag) throws StructureBuildingException { List atomCollection = frag.getAtomList(); /* pick atom, getAtomNeighbours, decideIfTerminal, resolve */ /* * Remove spare valency on atoms with valency precluding creation of double bonds */ for(Atom a : atomCollection) { a.ensureSVIsConsistantWithValency(true); } /* * Remove spare valency on atoms that are not adjacent to another atom with spare valency */ atomLoop: for(Atom a : atomCollection) { if(a.hasSpareValency()) { for(Atom aa : frag.getIntraFragmentAtomNeighbours(a)) { if(aa.hasSpareValency()) { continue atomLoop; } } a.setSpareValency(false); } } /* * The indicated hydrogen from the original SMILES definition of the fragment e.g. [nH] are used to disambiguate if there are * an odd number of atoms with spare valency. 
Hence pyrrole is unambiguously 1H-pyrrole unless specified otherwise
		 * Things get more complicated if the input contained multiple indicated hydrogen as it is unclear whether these still apply to the final molecule
		 */
		Atom atomToReduceValencyAt = null;
		List<Atom> originalIndicatedHydrogen = frag.getIndicatedHydrogen();
		List<Atom> indicatedHydrogen = new ArrayList<>(originalIndicatedHydrogen.size());
		for (Atom atom : frag.getIndicatedHydrogen()) {
			if (atom.hasSpareValency() && atom.getCharge() == 0) {
				indicatedHydrogen.add(atom);
			}
		}
		if (indicatedHydrogen.size() > 0) {
			//typically there will be only one indicated hydrogen
			if (indicatedHydrogen.size() > 1) {
				for (Atom indicatedAtom : indicatedHydrogen) {
					boolean couldBeInvolvedInSimpleNitrogenTautomerism = false;//fix for guanine like purine derivatives
					if (indicatedAtom.getElement() == ChemEl.N && indicatedAtom.getAtomIsInACycle()) {
						atomloop : for (Atom neighbour : indicatedAtom.getAtomNeighbours()) {
							if (neighbour.getElement() == ChemEl.C && neighbour.getAtomIsInACycle()) {
								List<Atom> distance2Neighbours = neighbour.getAtomNeighbours();
								distance2Neighbours.remove(indicatedAtom);
								for (Atom distance2Neighbour : distance2Neighbours) {
									if (distance2Neighbour.getElement() == ChemEl.N && distance2Neighbour.getAtomIsInACycle() && !originalIndicatedHydrogen.contains(distance2Neighbour)){
										couldBeInvolvedInSimpleNitrogenTautomerism = true;
										break atomloop;
									}
								}
							}
						}
					}
					//retain spare valency if has the cyclic [NH]C=N moiety but substitution has meant that this tautomerism doesn't actually occur cf. 8-oxoguanine
					if (!couldBeInvolvedInSimpleNitrogenTautomerism || detectSimpleNitrogenTautomer(indicatedAtom) != null) {
						indicatedAtom.setSpareValency(false);
					}
				}
			}
			else{
				atomToReduceValencyAt = indicatedHydrogen.get(0);
			}
		}

		int svCount = 0;
		for(Atom a : atomCollection) {
			svCount += a.hasSpareValency() ? 1 :0;
		}

		/*
		 * Double-bonds go between pairs of atoms so if there is an odd number of candidate atoms (e.g. 
pyrrole) an atom must be chosen * The atom with indicated hydrogen (see above) is used in preference else heuristics are used to choose a candidate */ if((svCount & 1) == 1) { if (atomToReduceValencyAt == null) { atomToReduceValencyAt = findBestAtomToRemoveSpareValencyFrom(frag, atomCollection); } atomToReduceValencyAt.setSpareValency(false); svCount--; } while(svCount > 0) { boolean foundTerminalFlag = false; boolean foundNonBridgeHeadFlag = false; boolean foundBridgeHeadFlag = false; //First handle cases where double bond placement is completely unambiguous i.e. an atom where only one neighbour has spare valency for(Atom a : atomCollection) { if(a.hasSpareValency()) { int count = 0; for(Atom aa : frag.getIntraFragmentAtomNeighbours(a)) { if(aa.hasSpareValency()) { count++; } } if(count == 1) { for(Atom aa : frag.getIntraFragmentAtomNeighbours(a)) { if(aa.hasSpareValency()) { foundTerminalFlag = true; a.setSpareValency(false); aa.setSpareValency(false); a.getBondToAtomOrThrow(aa).addOrder(1); svCount -= 2;//Two atoms where for one of them this bond is the only double bond it can possibly form break; } } } } } if(foundTerminalFlag) { continue; } //Find two atoms where one, or both, of them are not bridgeheads for(Atom a : atomCollection) { List<Atom> neighbours = frag.getIntraFragmentAtomNeighbours(a); if(a.hasSpareValency() && neighbours.size() < 3) { for(Atom aa : neighbours) { if(aa.hasSpareValency()) { foundNonBridgeHeadFlag = true; a.setSpareValency(false); aa.setSpareValency(false); a.getBondToAtomOrThrow(aa).addOrder(1); svCount -= 2;//Two atoms where one of them is not a bridge head break; } } } if(foundNonBridgeHeadFlag) { break; } } if(foundNonBridgeHeadFlag) { continue; } //Find two atoms where both of them are bridgeheads for(Atom a : atomCollection) { List<Atom> neighbours = frag.getIntraFragmentAtomNeighbours(a); if(a.hasSpareValency()) { for(Atom aa : neighbours) { if(aa.hasSpareValency()) { foundBridgeHeadFlag = true; a.setSpareValency(false);
aa.setSpareValency(false); a.getBondToAtomOrThrow(aa).addOrder(1); svCount -= 2;//Two atoms where both of them are a bridge head e.g. necessary for something like coronene break; } } } if(foundBridgeHeadFlag) { break; } } if(!foundBridgeHeadFlag) { throw new StructureBuildingException("Failed to assign all double bonds! (Check that indicated hydrogens have been appropriately specified)"); } } } private static Atom findBestAtomToRemoveSpareValencyFrom(Fragment frag, List atomCollection) { for(Atom a : atomCollection) {//try and find an atom with SV that neighbours only one atom with SV if(a.hasSpareValency()) { int atomsWithSV = 0; for(Atom aa : frag.getIntraFragmentAtomNeighbours(a)) { if(aa.hasSpareValency()) { atomsWithSV++; } } if (atomsWithSV == 1) { return a; } } } atomLoop: for(Atom a : atomCollection) {//try and find an atom with bridgehead atoms with SV on both sides c.f. phenoxastibinine == 10H-phenoxastibinine if(a.hasSpareValency()) { List neighbours = frag.getIntraFragmentAtomNeighbours(a); if (neighbours.size() == 2) { for(Atom aa : neighbours) { if(frag.getIntraFragmentAtomNeighbours(aa).size() < 3){ continue atomLoop; } } return a; } } } //Prefer nitrogen to carbon e.g. get NHC=C rather than N=CCH Atom firstAtomWithSpareValency = null; Atom firstHeteroAtomWithSpareValency = null; for(Atom a : atomCollection) { if(a.hasSpareValency()) { if (a.getElement() != ChemEl.C) { if (a.getCharge() == 0) { return a; } if(firstHeteroAtomWithSpareValency == null) { firstHeteroAtomWithSpareValency = a; } } if(firstAtomWithSpareValency == null) { firstAtomWithSpareValency = a; } } } if (firstAtomWithSpareValency == null) { throw new IllegalArgumentException("OPSIN Bug: No atom had spare valency!"); } return firstHeteroAtomWithSpareValency != null ? 
firstHeteroAtomWithSpareValency : firstAtomWithSpareValency; } static Atom getAtomByAminoAcidStyleLocant(Atom backboneAtom, String elementSymbol, String primes) { //Search for appropriate atom by using the same algorithm as is used to assign locants initially List startingAtoms = new ArrayList<>(); Set atomsVisited = new HashSet<>(); List neighbours = getIntraFragmentNeighboursAndSetVisitedBondOrder(backboneAtom); mainLoop: for (Atom neighbour : neighbours) { atomsVisited.add(neighbour); if (!neighbour.getType().equals(SUFFIX_TYPE_VAL)){ for (String neighbourLocant : neighbour.getLocants()) { if (MATCH_NUMERIC_LOCANT.matcher(neighbourLocant).matches()){//gone to an inappropriate atom continue mainLoop; } } } startingAtoms.add(neighbour); } Collections.sort(startingAtoms, new SortAtomsForElementSymbols()); Map elementCount = new HashMap<>();//keeps track of how many times each element has been seen Deque atomsToConsider = new ArrayDeque<>(startingAtoms); boolean hydrazoneSpecialCase =false;//look for special case violation of IUPAC rule where the locant of the =N- atom is skipped. 
This flag is set when =N- is encountered while (atomsToConsider.size() > 0){ Atom atom = atomsToConsider.removeFirst(); atomsVisited.add(atom); int primesOnPossibleAtom =0; String element =atom.getElement().toString(); if (elementCount.get(element)==null){ elementCount.put(element,1); } else{ int count =elementCount.get(element); primesOnPossibleAtom =count; elementCount.put(element, count +1); } if (hydrazoneSpecialCase){ if (element.equals(elementSymbol) && primes.length() == primesOnPossibleAtom -1){ return atom; } hydrazoneSpecialCase =false; } List atomNeighbours = getIntraFragmentNeighboursAndSetVisitedBondOrder(atom); atomNeighbours.removeAll(atomsVisited); for (int i = atomNeighbours.size() -1; i >=0; i--) { Atom neighbour = atomNeighbours.get(i); if (!neighbour.getType().equals(SUFFIX_TYPE_VAL)){ for (String neighbourLocant : neighbour.getLocants()) { if (MATCH_NUMERIC_LOCANT.matcher(neighbourLocant).matches()){//gone to an inappropriate atom atomNeighbours.remove(i); break; } } } } if (atom.getElement() == ChemEl.N && atom.getIncomingValency() ==3 && atom.getCharge()==0 && atomNeighbours.size()==1 && atomNeighbours.get(0).getElement() == ChemEl.N){ hydrazoneSpecialCase =true; } else{ if (element.equals(elementSymbol)){ if (primes.length() == primesOnPossibleAtom){ return atom; } } } Collections.sort(atomNeighbours, new SortAtomsForElementSymbols()); for (int i = atomNeighbours.size() - 1; i >= 0; i--) { atomsToConsider.addFirst(atomNeighbours.get(i)); } } if (primes.equals("") && backboneAtom.getElement().toString().equals(elementSymbol)){//maybe it meant the starting atom return backboneAtom; } return null; } /** * Determines whether the bond between two elements is likely to be covalent * This is crudely determined based on whether the combination of elements fall outside the ionic and * metallic sections of a van Arkel diagram * @param chemEl1 * @param chemEl2 * @return */ static boolean isCovalent(ChemEl chemEl1, ChemEl chemEl2) { Double 
atom1Electrongegativity = AtomProperties.getPaulingElectronegativity(chemEl1); Double atom2Electrongegativity = AtomProperties.getPaulingElectronegativity(chemEl2); if (atom1Electrongegativity!=null && atom2Electrongegativity !=null){ double halfSum = (atom1Electrongegativity + atom2Electrongegativity)/2; double difference = Math.abs(atom1Electrongegativity - atom2Electrongegativity); if (halfSum < 1.6){ return false;//probably metallic } if (difference < 1.76 * halfSum - 3.03){ return true; } } return false; } /** * Is the atom a suffix atom/carbon of an aldehyde atom/chalcogen functional atom/hydroxy (or chalcogen equivalent) * (by special step heterostems are not considered hydroxy e.g. disulfane) * @param atom * @return */ static boolean isCharacteristicAtom(Atom atom) { if (atom.getType().equals(SUFFIX_TYPE_VAL) || (atom.getElement().isChalcogen() && !HETEROSTEM_SUBTYPE_VAL.equals(atom.getFrag().getSubType()) && atom.getIncomingValency() == 1 && atom.getOutValency() == 0 && atom.getCharge() == 0)) { return true; } return isFunctionalAtomOrAldehyde(atom); } /** * Is the atom an aldehyde atom or a chalcogen functional atom * @param atom * @return */ static boolean isFunctionalAtomOrAldehyde(Atom atom) { if (Boolean.TRUE.equals(atom.getProperty(Atom.ISALDEHYDE))){//substituting an aldehyde would make it no longer an aldehyde return true; } return isFunctionalAtom(atom); } /** * Is the atom a chalcogen functional atom * @param atom * @return */ static boolean isFunctionalAtom(Atom atom) { ChemEl chemEl = atom.getElement(); if (chemEl.isChalcogen()) {//potential chalcogen functional atom Fragment frag = atom.getFrag(); for (int i = 0, l = frag.getFunctionalAtomCount(); i < l; i++) { if (atom.equals(frag.getFunctionalAtom(i).getAtom())){ return true; } } } return false; } /** * Checks that all atoms in a ring appear to be equivalent * @param ring * @return true if all equivalent, else false */ static boolean allAtomsInRingAreIdentical(Fragment ring){ List atomList = 
ring.getAtomList(); Atom firstAtom = atomList.get(0); ChemEl chemEl = firstAtom.getElement(); int valency = firstAtom.getIncomingValency(); boolean spareValency = firstAtom.hasSpareValency(); for (Atom atom : atomList) { if (atom.getElement() != chemEl){ return false; } if (atom.getIncomingValency() != valency){ return false; } if (atom.hasSpareValency() != spareValency){ return false; } } return true; } static void removeTerminalAtom(BuildState state, Atom atomToRemove) { AtomParity atomParity = atomToRemove.getAtomNeighbours().get(0).getAtomParity(); if (atomParity!=null){//replace reference to atom with reference to implicit hydrogen Atom[] atomRefs4= atomParity.getAtomRefs4(); for (int i = 0; i < atomRefs4.length; i++) { if (atomRefs4[i]==atomToRemove){ atomRefs4[i] = AtomParity.deoxyHydrogen; break; } } } state.fragManager.removeAtomAndAssociatedBonds(atomToRemove); } /** * Removes a terminal oxygen from the atom * An exception is thrown if no suitable oxygen could be found connected to the atom * Note that [N+][O-] is treated as N=O * @param state * @param atom * @param desiredBondOrder * @throws StructureBuildingException */ static void removeTerminalOxygen(BuildState state, Atom atom, int desiredBondOrder) throws StructureBuildingException { //TODO prioritise [N+][O-] List neighbours = atom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (neighbour.getElement() == ChemEl.O && neighbour.getBondCount()==1){ Bond b = atom.getBondToAtomOrThrow(neighbour); if (b.getOrder()==desiredBondOrder && neighbour.getCharge()==0){ FragmentTools.removeTerminalAtom(state, neighbour); if (atom.getLambdaConventionValency()!=null){//corrects valency for phosphin/arsin/stibin atom.setLambdaConventionValency(atom.getLambdaConventionValency()-desiredBondOrder); } if (atom.getMinimumValency()!=null){//corrects valency for phosphin/arsin/stibin atom.setMinimumValency(atom.getMinimumValency()-desiredBondOrder); } return; } else if (neighbour.getCharge() ==-1 && 
b.getOrder()==1 && desiredBondOrder == 2){ if (atom.getCharge() ==1 && atom.getElement() == ChemEl.N){ FragmentTools.removeTerminalAtom(state, neighbour); atom.neutraliseCharge(); return; } } } } if (desiredBondOrder ==2){ throw new StructureBuildingException("Double bonded oxygen not found at suffix attachment position. Perhaps a suffix has been used inappropriately"); } else if (desiredBondOrder ==1){ throw new StructureBuildingException("Hydroxy oxygen not found at suffix attachment position. Perhaps a suffix has been used inappropriately"); } else { throw new StructureBuildingException("Suitable oxygen not found at suffix attachment position. Perhaps a suffix has been used inappropriately"); } } /** * Finds terminal atoms of the given element type from the list given * The terminal atoms must be single bonded, not radicals and uncharged * @param atoms * @param chemEl * @return */ static List<Atom> findHydroxyLikeTerminalAtoms(List<Atom> atoms, ChemEl chemEl) { List<Atom> matches =new ArrayList<>(); for (Atom atom : atoms) { if (atom.getElement() == chemEl && atom.getIncomingValency() == 1 && atom.getOutValency() == 0 && atom.getCharge() == 0){ matches.add(atom); } } return matches; } /** * Checks whether a bond is part of a 6 member or smaller ring. * This is necessary as such double bonds are assumed to not be capable of having E/Z stereochemistry * @param bond * @return true unless in a 6 member or smaller ring */ static boolean notIn6MemberOrSmallerRing(Bond bond) { Atom fromAtom =bond.getFromAtom(); Atom toAtom = bond.getToAtom(); if (fromAtom.getAtomIsInACycle() && toAtom.getAtomIsInACycle()){//obviously both must be in rings //attempt to get from the fromAtom to the toAtom in 6 or fewer steps.
List visitedAtoms = new ArrayList<>(); Deque atomsToInvestigate = new ArrayDeque<>();//A queue is not used as I need to make sure that only up to depth 6 is investigated List neighbours =fromAtom.getAtomNeighbours(); neighbours.remove(toAtom); for (Atom neighbour : neighbours) { atomsToInvestigate.add(neighbour); } visitedAtoms.add(fromAtom); for (int i = 0; i < 5; i++) {//up to 5 bonds from the neighbours of the fromAtom i.e. up to ring size 6 if (atomsToInvestigate.isEmpty()){ break; } Deque atomsToInvestigateNext = new ArrayDeque<>(); while (!atomsToInvestigate.isEmpty()) { Atom currentAtom =atomsToInvestigate.removeFirst(); if (currentAtom == toAtom){ return false; } visitedAtoms.add(currentAtom); neighbours =currentAtom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (!visitedAtoms.contains(neighbour) && neighbour.getAtomIsInACycle()){ atomsToInvestigateNext.add(neighbour); } } } atomsToInvestigate = atomsToInvestigateNext; } } return true; } /** * Finds the hydroxy atom of all hydroxy functional groups in a fragment * i.e. 
not in carboxylic acid or oxime * @param frag * @return * @throws StructureBuildingException */ static List findHydroxyGroups(Fragment frag) throws StructureBuildingException { List hydroxyAtoms = new ArrayList<>(); List atoms = frag.getAtomList(); for (Atom atom : atoms) { if (atom.getElement() == ChemEl.O && atom.getIncomingValency() == 1 && atom.getOutValency() == 0 && atom.getCharge() == 0){ Atom adjacentAtom = atom.getAtomNeighbours().get(0); List neighbours = adjacentAtom.getAtomNeighbours(); if (adjacentAtom.getElement() == ChemEl.C){ neighbours.remove(atom); if (neighbours.size() >= 1 && neighbours.get(0).getElement() == ChemEl.O && adjacentAtom.getBondToAtomOrThrow(neighbours.get(0)).getOrder()==2){ continue; } if (neighbours.size() >= 2 && neighbours.get(1).getElement() == ChemEl.O && adjacentAtom.getBondToAtomOrThrow(neighbours.get(1)).getOrder()==2){ continue; } hydroxyAtoms.add(atom); } } } return hydroxyAtoms; } static List findnAtomsForSubstitution(List atomList, Atom preferredAtom, int numberOfSubstitutionsRequired, int bondOrder, boolean takeIntoAccountOutValency, boolean preserveValency) { int atomCount = atomList.size(); int startingIndex = preferredAtom != null ? 
atomList.indexOf(preferredAtom) : 0; if (startingIndex < 0){ throw new IllegalArgumentException("OPSIN Bug: preferredAtom should be part of the list of atoms to search through"); } CyclicAtomList atoms = new CyclicAtomList(atomList, startingIndex - 1);//next() will retrieve the atom at the startingIndex List substitutableAtoms = new ArrayList<>(); if (atomCount == 1 && ELEMENTARYATOM_TYPE_VAL.equals(atomList.get(0).getFrag().getType())) { Atom atom = atomList.get(0); int timesAtomCanBeSubstituted = getTimesElementaryAtomCanBeSubstituted(atom); for (int j = 1; j <= timesAtomCanBeSubstituted; j++) { substitutableAtoms.add(atom); } } else { for (int i = 0; i < atomCount; i++) {//aromaticity preserved, standard valency assumed, characteristic atoms ignored Atom atom = atoms.next(); if (!FragmentTools.isCharacteristicAtom(atom) || (numberOfSubstitutionsRequired == 1 && atom == preferredAtom)) { int currentExpectedValency = atom.determineValency(takeIntoAccountOutValency); int usedValency = atom.getIncomingValency() + (atom.hasSpareValency() ? 1 : 0) + (takeIntoAccountOutValency ? atom.getOutValency() : 0); int timesAtomCanBeSubstituted = ((currentExpectedValency - usedValency)/ bondOrder); for (int j = 1; j <= timesAtomCanBeSubstituted; j++) { substitutableAtoms.add(atom); } } } } if (substitutableAtoms.size() >= numberOfSubstitutionsRequired){ return substitutableAtoms; } substitutableAtoms.clear(); for (int i = 0; i < atomCount; i++) {//aromaticity preserved, standard valency assumed, functional suffixes ignored Atom atom = atoms.next(); if (!FragmentTools.isFunctionalAtomOrAldehyde(atom) || (numberOfSubstitutionsRequired == 1 && atom == preferredAtom)) { int currentExpectedValency = atom.determineValency(takeIntoAccountOutValency); int usedValency = atom.getIncomingValency() + (atom.hasSpareValency() ? 1 : 0) + (takeIntoAccountOutValency ? 
atom.getOutValency() : 0); int timesAtomCanBeSubstituted = ((currentExpectedValency - usedValency)/ bondOrder); for (int j = 1; j <= timesAtomCanBeSubstituted; j++) { substitutableAtoms.add(atom); } } } if (substitutableAtoms.size() >= numberOfSubstitutionsRequired){ return substitutableAtoms; } if (preserveValency) { return null; } substitutableAtoms.clear(); for (int i = 0; i < atomCount; i++) {//aromaticity preserved, any sensible valency allowed, anything substitutable Atom atom = atoms.next(); Integer maximumValency = ValencyChecker.getMaximumValency(atom); if (maximumValency != null) { int usedValency = atom.getIncomingValency() + (atom.hasSpareValency() ? 1 : 0) + (takeIntoAccountOutValency ? atom.getOutValency() : 0); int timesAtomCanBeSubstituted = ((maximumValency - usedValency)/ bondOrder); for (int j = 1; j <= timesAtomCanBeSubstituted; j++) { substitutableAtoms.add(atom); } } else{ for (int j = 0; j < numberOfSubstitutionsRequired; j++) { substitutableAtoms.add(atom); } } } if (substitutableAtoms.size() >= numberOfSubstitutionsRequired){ return substitutableAtoms; } substitutableAtoms.clear(); for (int i = 0; i < atomCount; i++) {//aromaticity dropped, any sensible valency allowed, anything substitutable Atom atom = atoms.next(); Integer maximumValency = ValencyChecker.getMaximumValency(atom); if (maximumValency != null) { int usedValency = atom.getIncomingValency() + (takeIntoAccountOutValency ? 
atom.getOutValency() : 0); int timesAtomCanBeSubstituted = ((maximumValency - usedValency)/ bondOrder); for (int j = 1; j <= timesAtomCanBeSubstituted; j++) { substitutableAtoms.add(atom); } } else { for (int j = 0; j < numberOfSubstitutionsRequired; j++) { substitutableAtoms.add(atom); } } } if (substitutableAtoms.size() >= numberOfSubstitutionsRequired){ return substitutableAtoms; } return null; } private static int getTimesElementaryAtomCanBeSubstituted(Atom atom) { Integer oxidationNumber = atom.getProperty(Atom.OXIDATION_NUMBER);//explicitly set oxidation state if (oxidationNumber == null) { String oxidationStates = atom.getFrag().getTokenEl().getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR);//properties of this element if (oxidationStates != null) { String[] commonOxidationStates = oxidationStates.split(":")[0].split(","); //highest common oxidation state oxidationNumber = Integer.parseInt(commonOxidationStates[commonOxidationStates.length - 1]); } else { oxidationNumber = 0; } } int usedValency = atom.getIncomingValency(); return (oxidationNumber > usedValency) ? 
oxidationNumber - usedValency : 0; } static List<Atom> findnAtomsForSubstitution(List<Atom> atomList, Atom preferredAtom, int numberOfSubstitutionsRequired, int bondOrder, boolean takeIntoAccountOutValency) { return findnAtomsForSubstitution(atomList, preferredAtom, numberOfSubstitutionsRequired, bondOrder, takeIntoAccountOutValency, false); } static List<Atom> findnAtomsForSubstitution(Fragment frag, Atom preferredAtom, int numberOfSubstitutionsRequired, int bondOrder, boolean takeIntoAccountOutValency) { return findnAtomsForSubstitution(frag.getAtomList(), preferredAtom, numberOfSubstitutionsRequired, bondOrder, takeIntoAccountOutValency); } /** * Returns a list of atoms of size >= numberOfSubstitutionsRequired (or null if this is not possible) * An atom must have sufficient valency to support a substituent requiring a bond of order bondOrder * If an atom can support multiple substituents it will appear in the list multiple times * This method iterates over the fragment atoms attempting to fulfil these requirements with incrementally more lenient constraints: * aromaticity preserved, standard valency assumed, characteristic atoms ignored * aromaticity preserved, standard valency assumed, functional suffixes ignored * aromaticity preserved, any sensible valency allowed, anything substitutable * aromaticity dropped, any sensible valency allowed, anything substitutable * * Iteration starts from the defaultInAtom (if applicable, else the first atom) i.e.
the defaultInAtom if substitutable will be the first atom in the list * @param frag * @param numberOfSubstitutionsRequired * @param bondOrder * @return */ static List<Atom> findnAtomsForSubstitution(Fragment frag, int numberOfSubstitutionsRequired, int bondOrder) { return findnAtomsForSubstitution(frag.getAtomList(), frag.getDefaultInAtom(), numberOfSubstitutionsRequired, bondOrder, true); } /** * Returns a list of the most preferable atoms for substitution (empty list if none are) * An atom must have sufficient valency to support a substituent requiring a bond of order bondOrder * If an atom can support multiple substituents it will appear in the list multiple times * This method iterates over the fragment atoms attempting to fulfil these requirements with incrementally more lenient constraints: * aromaticity preserved, standard valency assumed, characteristic atoms ignored * aromaticity preserved, standard valency assumed, functional suffixes ignored * aromaticity preserved, any sensible valency allowed, anything substitutable * aromaticity dropped, any sensible valency allowed, anything substitutable * * Iteration starts from the defaultInAtom (if applicable, else the first atom) i.e.
the defaultInAtom if substitutable will be the first atom in the list * @param frag * @param bondOrder * @return */ static List<Atom> findSubstituableAtoms(Fragment frag, int bondOrder) { List<Atom> potentialAtoms = findnAtomsForSubstitution(frag, 1, bondOrder); if (potentialAtoms == null) { return Collections.emptyList(); } return potentialAtoms; } static Atom lastNonSuffixCarbonWithSufficientValency(Fragment conjunctiveFragment) { List<Atom> atomList = conjunctiveFragment.getAtomList(); for (int i = atomList.size()-1; i >=0; i--) { Atom a = atomList.get(i); if (a.getType().equals(SUFFIX_TYPE_VAL)){ continue; } if (a.getElement() != ChemEl.C){ continue; } if (ValencyChecker.checkValencyAvailableForBond(a, 1)){ return a; } } return null; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FunctionalAtom.java package uk.ac.cam.ch.wwmm.opsin; /** * Struct for a FunctionalAtom. As expected holds the atom. * This is used to indicate, for example, that this atom may form an ester * * @author dl387 * */ class FunctionalAtom { private final Atom atom; FunctionalAtom(Atom atom) { this.atom = atom; } Atom getAtom() { return atom; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FunctionalReplacement.java package uk.ac.cam.ch.wwmm.opsin; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import java.util.ArrayList; import java.util.Collections; import java.util.Comparator; import java.util.Iterator; import java.util.LinkedHashSet; import java.util.LinkedList; import java.util.List; import java.util.Set; import java.util.regex.Pattern; /** * Methods for performing functional replacement * @author dl387 * */ class FunctionalReplacement { /** * Sorts infix transformations by the number of acceptable inputs for the transformation. * e.g.
thio ends up towards the end of the list as it accepts both -O and =O whilst say imido only accepts =O * @author dl387 * */ private static class SortInfixTransformations implements Comparator<String> { public int compare(String infixTransformation1, String infixTransformation2) { int allowedInputs1 = infixTransformation1.split(",").length; int allowedInputs2 = infixTransformation2.split(",").length; if (allowedInputs1 < allowedInputs2){//infixTransformation1 preferred return -1; } if (allowedInputs1 > allowedInputs2){//infixTransformation2 preferred return 1; } else{ return 0; } } } private static enum PREFIX_REPLACEMENT_TYPE{ chalcogen,//ambiguous halideOrPseudoHalide,//only means functional replacement when applied to non carboxylic acids dedicatedFunctionalReplacementPrefix,//no ambiguity exists hydrazono,//ambiguous, only applies to non carboxylic acid peroxy//ambiguous, also applies to etheric oxygen } static final Pattern matchChalcogenReplacement= Pattern.compile("thio|seleno|telluro"); private final BuildState state; FunctionalReplacement(BuildState state) { this.state = state; } /** * Applies the effects of acid replacing functional class nomenclature * This must be performed early so that prefix/infix functional replacement is performed correctly * and so that element symbol locants are assigned appropriately * @param finalSubOrRootInWord * @param word * @throws ComponentGenerationException * @throws StructureBuildingException */ void processAcidReplacingFunctionalClassNomenclature(Element finalSubOrRootInWord, Element word) throws ComponentGenerationException, StructureBuildingException { Element wordRule = OpsinTools.getParentWordRule(word); if (WordRule.valueOf(wordRule.getAttributeValue(WORDRULE_ATR)) == WordRule.acidReplacingFunctionalGroup){ Element parentWordRule = word.getParent(); if (parentWordRule.indexOf(word)==0){ for (int i = 1, l = parentWordRule.getChildCount(); i < l ; i++) { Element acidReplacingWord = parentWordRule.getChild(i); if
(!acidReplacingWord.getName().equals(WORD_EL)) { throw new RuntimeException("OPSIN bug: problem with acidReplacingFunctionalGroup word rule"); } String type = acidReplacingWord.getAttributeValue(TYPE_ATR); if (type.equals(WordType.full.toString())) { //case where functionalTerm is substituted //as words are processed from right to left in cases like phosphoric acid tri(ethylamide) this will be phosphoric acid ethylamide ethylamide ethylamide processAcidReplacingFunctionalClassNomenclatureFullWord(finalSubOrRootInWord, acidReplacingWord); } else if (type.equals(WordType.functionalTerm.toString())) { processAcidReplacingFunctionalClassNomenclatureFunctionalWord(finalSubOrRootInWord, acidReplacingWord); } else { throw new RuntimeException("OPSIN bug: problem with acidReplacingFunctionalGroup word rule"); } } } } } /** * Performs prefix functional replacement e.g. thio in thioacetic acid replaces an O with S * Prefixes will present themselves as substituents. There is potential ambiguity between usage as a substituent * and as a functional replacement term in some cases. If the substituent is deemed to indicate functional replacement * it will be detached and its effects applied to the subsequent group * * The list of groups and substituents given to this method will be mutated in the process. * * For heterocyclic rings functional replacement should technically be limited to : * pyran, morpholine, chromene, isochromene and xanthene, chromane and isochromane. 
* but this is not currently enforced * @param groups * @param substituents * @return boolean: has any functional replacement occurred * @throws StructureBuildingException * @throws ComponentGenerationException */ boolean processPrefixFunctionalReplacementNomenclature(List groups, List substituents) throws StructureBuildingException, ComponentGenerationException { int originalNumberOfGroups = groups.size(); for (int i = originalNumberOfGroups-1; i >=0; i--) { Element group =groups.get(i); String groupValue = group.getValue(); PREFIX_REPLACEMENT_TYPE replacementType = null; if (matchChalcogenReplacement.matcher(groupValue).matches() && !isChalcogenSubstituent(group) || groupValue.equals("thiono")){ replacementType =PREFIX_REPLACEMENT_TYPE.chalcogen; } else if (HALIDEORPSEUDOHALIDE_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ replacementType =PREFIX_REPLACEMENT_TYPE.halideOrPseudoHalide; } else if (DEDICATEDFUNCTIONALREPLACEMENTPREFIX_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ replacementType =PREFIX_REPLACEMENT_TYPE.dedicatedFunctionalReplacementPrefix; } else if (groupValue.equals("hydrazono")){ replacementType =PREFIX_REPLACEMENT_TYPE.hydrazono; } else if (groupValue.equals("peroxy")){ replacementType =PREFIX_REPLACEMENT_TYPE.peroxy; } if (replacementType != null) { //need to check whether this is an instance of functional replacement by checking the substituent/root it is applying to Element substituent = group.getParent(); Element nextSubOrBracket = OpsinTools.getNextSibling(substituent); if (nextSubOrBracket!=null && (nextSubOrBracket.getName().equals(ROOT_EL) || nextSubOrBracket.getName().equals(SUBSTITUENT_EL))){ Element groupToBeModified = nextSubOrBracket.getFirstChildElement(GROUP_EL); if (groupPrecededByElementThatBlocksPrefixReplacementInterpetation(groupToBeModified)) { if (replacementType == PREFIX_REPLACEMENT_TYPE.dedicatedFunctionalReplacementPrefix){ throw new ComponentGenerationException("dedicated Functional 
Replacement Prefix used in an inappropriate position :" + groupValue); } continue;//not 2,2'-thiodipyran } Element locantEl = null;//null unless a locant that agrees with the multiplier is present Element multiplierEl = null; int numberOfAtomsToReplace = 1;//the number of atoms to be functionally replaced, modified by a multiplier e.g. dithio Element possibleMultiplier = OpsinTools.getPreviousSibling(group); if (possibleMultiplier != null) { Element possibleLocant; if (possibleMultiplier.getName().equals(MULTIPLIER_EL)) { numberOfAtomsToReplace = Integer.valueOf(possibleMultiplier.getAttributeValue(VALUE_ATR)); possibleLocant = OpsinTools.getPreviousSibling(possibleMultiplier); multiplierEl = possibleMultiplier; } else{ possibleLocant = possibleMultiplier; } if (possibleLocant !=null && possibleLocant.getName().equals(LOCANT_EL) && possibleLocant.getAttribute(TYPE_ATR) == null) { int numberOfLocants = possibleLocant.getValue().split(",").length; if (numberOfLocants == numberOfAtomsToReplace){//locants and number of replacements agree locantEl = possibleLocant; } else if (numberOfAtomsToReplace > 1) {//doesn't look like prefix functional replacement if (replacementType == PREFIX_REPLACEMENT_TYPE.dedicatedFunctionalReplacementPrefix){ throw new ComponentGenerationException("dedicated Functional Replacement Prefix used in an inappropriate position :" + groupValue); } continue; } } } int oxygenReplaced; if (replacementType == PREFIX_REPLACEMENT_TYPE.chalcogen) { oxygenReplaced = performChalcogenFunctionalReplacement(groupToBeModified, locantEl, numberOfAtomsToReplace, group.getAttributeValue(VALUE_ATR)); } else if (replacementType == PREFIX_REPLACEMENT_TYPE.peroxy) { if (nextSubOrBracket.getName().equals(SUBSTITUENT_EL)) { continue; } oxygenReplaced = performPeroxyFunctionalReplacement(groupToBeModified, locantEl, numberOfAtomsToReplace); } else if (replacementType == PREFIX_REPLACEMENT_TYPE.dedicatedFunctionalReplacementPrefix){ if 
(!groupToBeModified.getAttributeValue(TYPE_ATR).equals(NONCARBOXYLICACID_TYPE_VAL) && !(groupToBeModified.getValue().equals("form") && groupValue.equals("imido"))){ throw new ComponentGenerationException("dedicated Functional Replacement Prefix used in an inappropriate position :" + groupValue); } oxygenReplaced = performFunctionalReplacementOnAcid(groupToBeModified, locantEl, numberOfAtomsToReplace, group.getAttributeValue(VALUE_ATR)); if (oxygenReplaced==0){ throw new ComponentGenerationException("dedicated Functional Replacement Prefix used in an inappropriate position :" + groupValue); } } else if (replacementType == PREFIX_REPLACEMENT_TYPE.hydrazono || replacementType == PREFIX_REPLACEMENT_TYPE.halideOrPseudoHalide){ Fragment acidFrag = groupToBeModified.getFrag(); if (!groupToBeModified.getAttributeValue(TYPE_ATR).equals(NONCARBOXYLICACID_TYPE_VAL) || acidHasSufficientHydrogenForSubstitutionInterpretation(acidFrag, group.getFrag().getOutAtom(0).getValency(), locantEl)){ //hydrazono replacement only applies to non carboxylic acids e.g. 
hydrazonooxalic acid //need to be careful to note that something like chlorophosphonic acid isn't functional replacement continue; } oxygenReplaced = performFunctionalReplacementOnAcid(groupToBeModified, locantEl, numberOfAtomsToReplace, group.getAttributeValue(VALUE_ATR)); } else{ throw new StructureBuildingException("OPSIN bug: Unexpected prefix replacement type"); } if (oxygenReplaced>0){ state.fragManager.removeFragment(group.getFrag()); substituent.removeChild(group); groups.remove(group); for (int j = substituent.getChildCount() - 1; j >= 0; j--){//there may be a locant that should be moved Element child = substituent.getChild(j); child.detach(); nextSubOrBracket.insertChild(child, 0); } substituents.remove(substituent); substituent.detach(); if (oxygenReplaced>1){ multiplierEl.detach(); } } } else if (replacementType == PREFIX_REPLACEMENT_TYPE.dedicatedFunctionalReplacementPrefix){ throw new ComponentGenerationException("dedicated Functional Replacement Prefix used in an inappropriate position :" + groupValue); } } } return groups.size() != originalNumberOfGroups; } private boolean isChalcogenSubstituent(Element group) { //Is this group followed by a hyphen and directly preceded by a substituent i.e. no multiplier/locant //e.g. methylthio- Element next = OpsinTools.getNextSibling(group); if (next != null && next.getName().equals(HYPHEN_EL) && OpsinTools.getPreviousSibling(group) == null) { Element previousGroup = OpsinTools.getPreviousGroup(group); if (previousGroup != null) { //TODO We actually want to know if a carbon atom is the attachment point... 
but we don't know the attachment point locations at this point Element suffix = OpsinTools.getNextSibling(previousGroup, SUFFIX_EL); if (suffix == null || suffix.getFrag() == null) { for (Atom a : previousGroup.getFrag()) { if (a.getElement() == ChemEl.C) { return true; } } } } } return false; } /** * Currently prefix replacement terms must be directly adjacent to the groupToBeModified with an exception made * for carbohydrate stereochemistry prefixes e.g. 'gluco' and for subtractive prefixes e.g. 'deoxy' * @param groupToBeModified * @return */ private boolean groupPrecededByElementThatBlocksPrefixReplacementInterpetation(Element groupToBeModified) { Element previous = OpsinTools.getPreviousSibling(groupToBeModified); while (previous !=null && (previous.getName().equals(SUBTRACTIVEPREFIX_EL) || (previous.getName().equals(STEREOCHEMISTRY_EL) && previous.getAttributeValue(TYPE_ATR).equals(CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL)))){ previous = OpsinTools.getPreviousSibling(previous); } return previous != null; } /* * */ /** * Performs functional replacement using infixes e.g. 
thio in ethanthioic acid replaces an O with S * @param suffixFragments May be modified if a multiplier is determined to mean multiplication of a suffix, usually untouched * @param suffixes The suffix elements May be modified if a multiplier is determined to mean multiplication of a suffix, usually untouched * @throws StructureBuildingException * @throws ComponentGenerationException */ void processInfixFunctionalReplacementNomenclature(List suffixes, List suffixFragments) throws StructureBuildingException, ComponentGenerationException { for (int i = 0; i < suffixes.size(); i++) { Element suffix = suffixes.get(i); if (suffix.getAttribute(INFIX_ATR) != null){ Fragment fragToApplyInfixTo = suffix.getFrag(); Element possibleAcidGroup = OpsinTools.getPreviousSiblingIgnoringCertainElements(suffix, new String[]{MULTIPLIER_EL, INFIX_EL, SUFFIX_EL}); if (possibleAcidGroup !=null && possibleAcidGroup.getName().equals(GROUP_EL) && (possibleAcidGroup.getAttributeValue(TYPE_ATR).equals(NONCARBOXYLICACID_TYPE_VAL)|| possibleAcidGroup.getAttributeValue(TYPE_ATR).equals(CHALCOGENACIDSTEM_TYPE_VAL))){ fragToApplyInfixTo = possibleAcidGroup.getFrag(); } if (fragToApplyInfixTo ==null){ throw new ComponentGenerationException("infix has erroneously been assigned to a suffix which does not correspond to a suffix fragment. suffix: " + suffix.getValue()); } //e.g. 
=O:S,-O:S (which indicates replacing either a double or single bonded oxygen with S) //This is semicolon delimited for each infix List infixTransformations = StringTools.arrayToList(suffix.getAttributeValue(INFIX_ATR).split(";")); List atomList =fragToApplyInfixTo.getAtomList(); LinkedList singleBondedOxygen = new LinkedList<>(); LinkedList doubleBondedOxygen = new LinkedList<>(); populateTerminalSingleAndDoubleBondedOxygen(atomList, singleBondedOxygen, doubleBondedOxygen); int oxygenAvailable = singleBondedOxygen.size() +doubleBondedOxygen.size(); /* * Modifies suffixes, suffixFragments, suffix and infixTransformations as appropriate */ disambiguateMultipliedInfixMeaning(suffixes, suffixFragments, suffix, infixTransformations, oxygenAvailable); /* * Sort infixTransformations so more specific transformations are performed first * e.g. ethanthioimidic acid-->ethanimidthioic acid as imid can only apply to the double bonded oxygen */ Collections.sort(infixTransformations, new SortInfixTransformations()); for (String infixTransformation : infixTransformations) { String[] transformationArray = infixTransformation.split(":"); if (transformationArray.length !=2){ throw new StructureBuildingException("Atom to be replaced and replacement not specified correctly in infix: " + infixTransformation); } String[] transformations = transformationArray[0].split(","); String replacementSMILES = transformationArray[1]; boolean acceptDoubleBondedOxygen = false; boolean acceptSingleBondedOxygen = false; boolean nitrido =false; for (String transformation : transformations) { if (transformation.startsWith("=")){ acceptDoubleBondedOxygen = true; } else if (transformation.startsWith("-")){ acceptSingleBondedOxygen = true; } else if (transformation.startsWith("#")){ nitrido =true; } else{ throw new StructureBuildingException("Malformed infix transformation. Expected to start with either - or =. 
Transformation was: " +transformation); } if (transformation.length()<2 || transformation.charAt(1)!='O'){ throw new StructureBuildingException("Only replacement by oxygen is supported. Check infix defintions"); } } boolean infixAssignmentAmbiguous =false; if ((acceptSingleBondedOxygen ||nitrido) && !acceptDoubleBondedOxygen){ if (singleBondedOxygen.isEmpty()){ throw new StructureBuildingException("Cannot find single bonded oxygen for infix with SMILES: "+ replacementSMILES+ " to modify!"); } if (singleBondedOxygen.size() !=1){ infixAssignmentAmbiguous=true; } } if (!acceptSingleBondedOxygen && (acceptDoubleBondedOxygen || nitrido)){ if (doubleBondedOxygen.isEmpty()){ throw new StructureBuildingException("Cannot find double bonded oxygen for infix with SMILES: "+ replacementSMILES+ " to modify!"); } if (doubleBondedOxygen.size() != 1){ infixAssignmentAmbiguous=true; } } if (acceptSingleBondedOxygen && acceptDoubleBondedOxygen){ if (oxygenAvailable ==0){ throw new StructureBuildingException("Cannot find oxygen for infix with SMILES: "+ replacementSMILES+ " to modify!"); } if (oxygenAvailable !=1){ infixAssignmentAmbiguous=true; } } Set ambiguousElementAtoms = new LinkedHashSet<>(); Atom atomToUse = null; if ((acceptDoubleBondedOxygen || nitrido) && doubleBondedOxygen.size()>0 ){ atomToUse = doubleBondedOxygen.removeFirst(); } else if (acceptSingleBondedOxygen && singleBondedOxygen.size()>0 ){ atomToUse = singleBondedOxygen.removeFirst(); } else{ throw new StructureBuildingException("Cannot find oxygen for infix with SMILES: "+ replacementSMILES+ " to modify!");//this would be a bug } Fragment replacementFrag = state.fragManager.buildSMILES(replacementSMILES, SUFFIX_TYPE_VAL, NONE_LABELS_VAL); if (replacementFrag.getOutAtomCount()>0){//SMILES include an indication of the bond order the replacement fragment will have, this is not intended to be an outatom replacementFrag.removeOutAtom(0); } Atom atomThatWillReplaceOxygen =replacementFrag.getFirstAtom(); if 
(replacementFrag.getAtomCount()==1 && atomThatWillReplaceOxygen.getElement().isChalcogen()){ atomThatWillReplaceOxygen.setCharge(atomToUse.getCharge()); atomThatWillReplaceOxygen.setProtonsExplicitlyAddedOrRemoved(atomToUse.getProtonsExplicitlyAddedOrRemoved()); } removeOrMoveObsoleteFunctionalAtoms(atomToUse, replacementFrag);//also will move charge if necessary moveObsoleteOutAtoms(atomToUse, replacementFrag);//if the replaced atom was an outatom the fragments outatom list need to be corrected if (nitrido){ atomToUse.getFirstBond().setOrder(3); Atom removedHydroxy = singleBondedOxygen.removeFirst(); state.fragManager.removeAtomAndAssociatedBonds(removedHydroxy); removeAssociatedFunctionalAtom(removedHydroxy); } state.fragManager.incorporateFragment(replacementFrag, atomToUse.getFrag()); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToUse, atomThatWillReplaceOxygen); if (infixAssignmentAmbiguous){ ambiguousElementAtoms.add(atomThatWillReplaceOxygen); if (atomThatWillReplaceOxygen.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)!=null){ ambiguousElementAtoms.addAll(atomThatWillReplaceOxygen.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)); } } if (infixAssignmentAmbiguous){//record what atoms could have been replaced. Often this ambiguity is resolved later e.g. 
S-methyl ethanthioate for (Atom a : doubleBondedOxygen) { ambiguousElementAtoms.add(a); if (a.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)!=null){ ambiguousElementAtoms.addAll(a.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)); } } for (Atom a : singleBondedOxygen) { ambiguousElementAtoms.add(a); if (a.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)!=null){ ambiguousElementAtoms.addAll(a.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT)); } } for (Atom atom : ambiguousElementAtoms) { atom.setProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT, ambiguousElementAtoms); } } } } } } /* * Functional class nomenclature */ /** * Replaces the appropriate number of functional oxygen atoms with the corresponding fragment * @param acidContainingRoot * @param acidReplacingWord * @throws ComponentGenerationException * @throws StructureBuildingException */ private void processAcidReplacingFunctionalClassNomenclatureFullWord(Element acidContainingRoot, Element acidReplacingWord) throws ComponentGenerationException, StructureBuildingException { String locant = acidReplacingWord.getAttributeValue(LOCANT_ATR); Element acidReplacingGroup = StructureBuildingMethods.findRightMostGroupInBracket(acidReplacingWord); if (acidReplacingGroup ==null){ throw new ComponentGenerationException("OPSIN bug: acid replacing group not found where one was expected for acidReplacingFunctionalGroup wordRule"); } String functionalGroupName = acidReplacingGroup.getValue(); Fragment acidReplacingFrag = acidReplacingGroup.getFrag(); if (acidReplacingGroup.getParent().getChildCount() != 1){ throw new ComponentGenerationException("Unexpected qualifier to: " + functionalGroupName); } Element groupToBeModified = acidContainingRoot.getFirstChildElement(GROUP_EL); List oxygenAtoms = findFunctionalOxygenAtomsInApplicableSuffixes(groupToBeModified); if (oxygenAtoms.isEmpty()){ oxygenAtoms = findFunctionalOxygenAtomsInGroup(groupToBeModified); } if (oxygenAtoms.isEmpty()){ List conjunctiveSuffixElements 
=OpsinTools.getNextSiblingsOfType(groupToBeModified, CONJUNCTIVESUFFIXGROUP_EL); for (Element conjunctiveSuffixElement : conjunctiveSuffixElements) { oxygenAtoms.addAll(findFunctionalOxygenAtomsInGroup(conjunctiveSuffixElement)); } } if (oxygenAtoms.size() < 1){ throw new ComponentGenerationException("Insufficient oxygen to replace with " + functionalGroupName +"s in " + acidContainingRoot.getFirstChildElement(GROUP_EL).getValue()); } boolean isAmide = functionalGroupName.equals("amide") || functionalGroupName.equals("amid"); if (isAmide) { if (acidReplacingFrag.getAtomCount()!=1){ throw new ComponentGenerationException("OPSIN bug: " + functionalGroupName + " not found where expected"); } Atom amideNitrogen = acidReplacingFrag.getFirstAtom(); amideNitrogen.neutraliseCharge(); amideNitrogen.clearLocants(); acidReplacingFrag.addMappingToAtomLocantMap("N", amideNitrogen); } Atom chosenOxygen = locant != null ? removeOxygenWithAppropriateLocant(oxygenAtoms, locant) : oxygenAtoms.get(0); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(chosenOxygen, acidReplacingFrag.getFirstAtom()); removeAssociatedFunctionalAtom(chosenOxygen); } /** * Replaces the appropriate number of functional oxygen atoms with the corresponding fragment * @param acidContainingRoot * @param functionalWord * @throws ComponentGenerationException * @throws StructureBuildingException */ private void processAcidReplacingFunctionalClassNomenclatureFunctionalWord(Element acidContainingRoot, Element functionalWord) throws ComponentGenerationException, StructureBuildingException { if (functionalWord !=null && functionalWord.getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ Element functionalTerm = functionalWord.getFirstChildElement(FUNCTIONALTERM_EL); if (functionalTerm ==null){ throw new ComponentGenerationException("OPSIN bug: functionalTerm word not found where one was expected for acidReplacingFunctionalGroup wordRule"); } Element acidReplacingGroup = 
functionalTerm.getFirstChildElement(FUNCTIONALGROUP_EL); String functionalGroupName = acidReplacingGroup.getValue(); Element possibleLocantOrMultiplier = OpsinTools.getPreviousSibling(acidReplacingGroup); int numberOfAcidicHydroxysToReplace = 1; String[] locants = null; if (possibleLocantOrMultiplier != null){ if (possibleLocantOrMultiplier.getName().equals(MULTIPLIER_EL)){ numberOfAcidicHydroxysToReplace = Integer.parseInt(possibleLocantOrMultiplier.getAttributeValue(VALUE_ATR)); possibleLocantOrMultiplier.detach(); possibleLocantOrMultiplier = OpsinTools.getPreviousSibling(acidReplacingGroup); } if (possibleLocantOrMultiplier != null){ if (possibleLocantOrMultiplier.getName().equals(LOCANT_EL)){ locants = StringTools.removeDashIfPresent(possibleLocantOrMultiplier.getValue()).split(","); possibleLocantOrMultiplier.detach(); } else { throw new ComponentGenerationException("Unexpected qualifier to acidReplacingFunctionalGroup functionalTerm"); } } } if (functionalTerm.getChildCount() != 1){ throw new ComponentGenerationException("Unexpected qualifier to acidReplacingFunctionalGroup functionalTerm"); } Element groupToBeModified = acidContainingRoot.getFirstChildElement(GROUP_EL); List oxygenAtoms = findFunctionalOxygenAtomsInApplicableSuffixes(groupToBeModified); if (oxygenAtoms.isEmpty()) { oxygenAtoms = findFunctionalOxygenAtomsInGroup(groupToBeModified); } if (oxygenAtoms.isEmpty()) { List conjunctiveSuffixElements =OpsinTools.getNextSiblingsOfType(groupToBeModified, CONJUNCTIVESUFFIXGROUP_EL); for (Element conjunctiveSuffixElement : conjunctiveSuffixElements) { oxygenAtoms.addAll(findFunctionalOxygenAtomsInGroup(conjunctiveSuffixElement)); } } if (numberOfAcidicHydroxysToReplace > oxygenAtoms.size()){ throw new ComponentGenerationException("Insufficient oxygen to replace with nitrogen in " + acidContainingRoot.getFirstChildElement(GROUP_EL).getValue()); } boolean isAmide = functionalGroupName.equals("amide") || functionalGroupName.equals("amid"); if (isAmide) { 
for (int i = 0; i < numberOfAcidicHydroxysToReplace; i++) { Atom functionalOxygenToReplace = locants != null ? removeOxygenWithAppropriateLocant(oxygenAtoms, locants[i]) : oxygenAtoms.get(i); removeAssociatedFunctionalAtom(functionalOxygenToReplace); functionalOxygenToReplace.setElement(ChemEl.N); } } else{ String groupValue = acidReplacingGroup.getAttributeValue(VALUE_ATR); String labelsValue = acidReplacingGroup.getAttributeValue(LABELS_ATR); Fragment acidReplacingFrag = state.fragManager.buildSMILES(groupValue, SUFFIX_TYPE_VAL, labelsValue != null ? labelsValue : NONE_LABELS_VAL); Fragment acidFragment = groupToBeModified.getFrag(); if (acidFragment.hasLocant("2")){//prefer numeric locants on group to those of replacing group for (Atom atom : acidReplacingFrag) { atom.clearLocants(); } } Atom firstFunctionalOxygenToReplace = locants != null ? removeOxygenWithAppropriateLocant(oxygenAtoms, locants[0]) : oxygenAtoms.get(0); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(firstFunctionalOxygenToReplace, acidReplacingFrag.getFirstAtom()); removeAssociatedFunctionalAtom(firstFunctionalOxygenToReplace); for (int i = 1; i < numberOfAcidicHydroxysToReplace; i++) { Fragment clonedHydrazide = state.fragManager.copyAndRelabelFragment(acidReplacingFrag, i); Atom functionalOxygenToReplace = locants != null ? 
removeOxygenWithAppropriateLocant(oxygenAtoms, locants[i]) : oxygenAtoms.get(i); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(functionalOxygenToReplace, clonedHydrazide.getFirstAtom()); state.fragManager.incorporateFragment(clonedHydrazide, functionalOxygenToReplace.getFrag()); removeAssociatedFunctionalAtom(functionalOxygenToReplace); } state.fragManager.incorporateFragment(acidReplacingFrag, firstFunctionalOxygenToReplace.getFrag()); } } else{ throw new ComponentGenerationException("amide word not found where expected, bug?"); } } private Atom removeOxygenWithAppropriateLocant(List oxygenAtoms, String locant) throws ComponentGenerationException { for (Iterator iterator = oxygenAtoms.iterator(); iterator.hasNext();) { Atom atom = iterator.next(); if (atom.hasLocant(locant)) { iterator.remove(); return atom; } } //Look for the case whether the locant refers to the backbone for (Iterator iterator = oxygenAtoms.iterator(); iterator.hasNext();) { Atom atom = iterator.next(); if (OpsinTools.depthFirstSearchForNonSuffixAtomWithLocant(atom, locant) != null){ iterator.remove(); return atom; } } throw new ComponentGenerationException("Failed to find acid group at locant: " + locant); } /* * Prefix functional replacement nomenclature */ private boolean acidHasSufficientHydrogenForSubstitutionInterpretation(Fragment acidFrag, int hydrogenRequiredForSubstitutionInterpretation, Element locantEl) { List atomsThatWouldBeSubstituted = new ArrayList<>(); if (locantEl !=null){ String[] possibleLocants = locantEl.getValue().split(","); for (String locant : possibleLocants) { Atom atomToBeSubstituted = acidFrag.getAtomByLocant(locant); if (atomToBeSubstituted !=null){ atomsThatWouldBeSubstituted.add(atomToBeSubstituted); } else{ atomsThatWouldBeSubstituted.clear(); atomsThatWouldBeSubstituted.add(acidFrag.getDefaultInAtomOrFirstAtom()); break; } } } else{ atomsThatWouldBeSubstituted.add(acidFrag.getDefaultInAtomOrFirstAtom()); } for (Atom atom : 
atomsThatWouldBeSubstituted) { if (StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(atom) < hydrogenRequiredForSubstitutionInterpretation){ return false;//insufficient hydrogens for substitution interpretation } } return true; } /** * Performs replacement of oxygen atoms by chalcogen atoms * If this is ambiguous e.g. thioacetate then Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT is populated * @param groupToBeModified * @param locantEl * @param numberOfAtomsToReplace * @param replacementSmiles * @return * @throws StructureBuildingException */ private int performChalcogenFunctionalReplacement(Element groupToBeModified, Element locantEl, int numberOfAtomsToReplace, String replacementSmiles) throws StructureBuildingException { List oxygenAtoms = findOxygenAtomsInApplicableSuffixes(groupToBeModified); if (oxygenAtoms.isEmpty()) { oxygenAtoms = findOxygenAtomsInGroup(groupToBeModified); } if (locantEl != null) {//locants are used to indicate replacement on trivial groups List oxygenWithAppropriateLocants = pickOxygensWithAppropriateLocants(locantEl, oxygenAtoms); if(oxygenWithAppropriateLocants.size() < numberOfAtomsToReplace) { numberOfAtomsToReplace = 1; //e.g. -1-thioureidomethyl } else{ locantEl.detach(); oxygenAtoms = oxygenWithAppropriateLocants; } } List replaceableAtoms = new ArrayList<>(); if (replacementSmiles.startsWith("=")) { //e.g. 
thiono replacementSmiles = replacementSmiles.substring(1); for (Atom oxygen : oxygenAtoms) { int incomingValency = oxygen.getIncomingValency(); int bondCount = oxygen.getBondCount(); if (bondCount == 1 && incomingValency == 2) { replaceableAtoms.add(oxygen); } } } else { List doubleBondedOxygen = new ArrayList<>(); List singleBondedOxygen = new ArrayList<>(); List ethericOxygen = new ArrayList<>(); for (Atom oxygen : oxygenAtoms) { int incomingValency = oxygen.getIncomingValency(); int bondCount = oxygen.getBondCount(); if (bondCount == 1 && incomingValency ==2 ) { doubleBondedOxygen.add(oxygen); } else if (bondCount == 1 && incomingValency == 1) { singleBondedOxygen.add(oxygen); } else if (bondCount == 2 && incomingValency == 2) { ethericOxygen.add(oxygen); } } replaceableAtoms.addAll(doubleBondedOxygen); replaceableAtoms.addAll(singleBondedOxygen); replaceableAtoms.addAll(ethericOxygen); } int totalOxygen = replaceableAtoms.size(); if (numberOfAtomsToReplace >1){ if (totalOxygen < numberOfAtomsToReplace){ numberOfAtomsToReplace=1; } } int atomsReplaced =0; if (totalOxygen >=numberOfAtomsToReplace){//check that there are at least as many oxygens as requested replacements boolean prefixAssignmentAmbiguous =false; Set ambiguousElementAtoms = new LinkedHashSet<>(); if (totalOxygen != numberOfAtomsToReplace){ prefixAssignmentAmbiguous=true; } for (Atom atomToReplace : replaceableAtoms) { if (atomsReplaced == numberOfAtomsToReplace){ ambiguousElementAtoms.add(atomToReplace); continue; } else{ state.fragManager.replaceAtomWithSmiles(atomToReplace, replacementSmiles); if (prefixAssignmentAmbiguous){ ambiguousElementAtoms.add(atomToReplace); } } atomsReplaced++; } if (prefixAssignmentAmbiguous){//record what atoms could have been replaced. Often this ambiguity is resolved later e.g. 
S-methyl thioacetate for (Atom atom : ambiguousElementAtoms) { atom.setProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT, ambiguousElementAtoms); } } } return atomsReplaced; } /** * Converts functional oxygen to peroxy e.g. peroxybenzoic acid * Returns the number of oxygens replaced * @param groupToBeModified * @param locantEl * @param numberOfAtomsToReplace * @return * @throws StructureBuildingException */ private int performPeroxyFunctionalReplacement(Element groupToBeModified, Element locantEl, int numberOfAtomsToReplace) throws StructureBuildingException { List oxygenAtoms = findFunctionalOxygenAtomsInApplicableSuffixes(groupToBeModified); if (oxygenAtoms.isEmpty()){ oxygenAtoms = findEthericOxygenAtomsInGroup(groupToBeModified); oxygenAtoms.addAll(findFunctionalOxygenAtomsInGroup(groupToBeModified)); } if (locantEl !=null){ List oxygenWithAppropriateLocants = pickOxygensWithAppropriateLocants(locantEl, oxygenAtoms); if(oxygenWithAppropriateLocants.size() < numberOfAtomsToReplace){ numberOfAtomsToReplace =1; } else{ locantEl.detach(); oxygenAtoms = oxygenWithAppropriateLocants; } } if (numberOfAtomsToReplace >1 && oxygenAtoms.size() < numberOfAtomsToReplace){ numberOfAtomsToReplace=1; } int atomsReplaced = 0; if (oxygenAtoms.size() >=numberOfAtomsToReplace){//check that there are at least as many oxygens as requested replacements atomsReplaced = numberOfAtomsToReplace; for (int j = 0; j < numberOfAtomsToReplace; j++) { Atom oxygenToReplace = oxygenAtoms.get(j); if (oxygenToReplace.getBondCount()==2){//etheric oxygen Fragment newOxygen = state.fragManager.buildSMILES("O", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); Bond bondToRemove = oxygenToReplace.getFirstBond(); Atom atomToAttachTo = bondToRemove.getFromAtom() == oxygenToReplace ? 
bondToRemove.getToAtom() : bondToRemove.getFromAtom(); state.fragManager.createBond(atomToAttachTo, newOxygen.getFirstAtom(), 1); state.fragManager.createBond(newOxygen.getFirstAtom(), oxygenToReplace, 1); state.fragManager.removeBond(bondToRemove); state.fragManager.incorporateFragment(newOxygen, groupToBeModified.getFrag()); } else{ Fragment replacementFrag = state.fragManager.buildSMILES("OO", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); removeOrMoveObsoleteFunctionalAtoms(oxygenToReplace, replacementFrag); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(oxygenToReplace, replacementFrag.getFirstAtom()); state.fragManager.incorporateFragment(replacementFrag, groupToBeModified.getFrag()); } } } return atomsReplaced; } /** * Replaces double bonded oxygen and/or single bonded oxygen depending on the input SMILES * SMILES with a valency 1 outAtom replace -O, SMILES with a valency 2 outAtom replace =O * SMILES with a valency 3 outAtom replace -O and =O (nitrido) * Returns the number of oxygen replaced * @param groupToBeModified * @param locantEl * @param numberOfAtomsToReplace * @param replacementSmiles * @return * @throws StructureBuildingException */ private int performFunctionalReplacementOnAcid(Element groupToBeModified, Element locantEl, int numberOfAtomsToReplace, String replacementSmiles) throws StructureBuildingException { int outValency; if (replacementSmiles.startsWith("-")){ outValency =1; } else if (replacementSmiles.startsWith("=")){ outValency =2; } else if (replacementSmiles.startsWith("#")){ outValency =3; } else{ throw new StructureBuildingException("OPSIN bug: Unexpected valency on fragment for prefix functional replacement"); } replacementSmiles = replacementSmiles.substring(1); List oxygenAtoms = findOxygenAtomsInApplicableSuffixes(groupToBeModified); if (oxygenAtoms.isEmpty()){ oxygenAtoms = findOxygenAtomsInGroup(groupToBeModified); } if (locantEl !=null){//locants are used to indicate replacement on trivial groups List 
oxygenWithAppropriateLocants = pickOxygensWithAppropriateLocants(locantEl, oxygenAtoms); List singleBondedOxygen = new ArrayList<>(); List terminalDoubleBondedOxygen = new ArrayList<>(); populateTerminalSingleAndDoubleBondedOxygen(oxygenWithAppropriateLocants, singleBondedOxygen, terminalDoubleBondedOxygen); if (outValency ==1){ oxygenWithAppropriateLocants.removeAll(terminalDoubleBondedOxygen); } else if (outValency ==2){ oxygenWithAppropriateLocants.removeAll(singleBondedOxygen); } if(oxygenWithAppropriateLocants.size() < numberOfAtomsToReplace){ numberOfAtomsToReplace =1; //e.g. -1-thioureidomethyl } else{ locantEl.detach(); oxygenAtoms = oxygenWithAppropriateLocants; } } List singleBondedOxygen = new ArrayList<>(); List terminalDoubleBondedOxygen = new ArrayList<>(); populateTerminalSingleAndDoubleBondedOxygen(oxygenAtoms, singleBondedOxygen, terminalDoubleBondedOxygen); if (outValency ==1){ oxygenAtoms.removeAll(terminalDoubleBondedOxygen); } else if (outValency ==2){ oxygenAtoms.removeAll(singleBondedOxygen); //favour bridging oxygen over double bonded oxygen c.f. 
imidodicarbonate oxygenAtoms.removeAll(terminalDoubleBondedOxygen); oxygenAtoms.addAll(terminalDoubleBondedOxygen); } else { if (singleBondedOxygen.isEmpty() || terminalDoubleBondedOxygen.isEmpty()){ throw new StructureBuildingException("Both a -OH and =O are required for nitrido prefix functional replacement"); } oxygenAtoms.removeAll(singleBondedOxygen); } if (numberOfAtomsToReplace >1 && oxygenAtoms.size() < numberOfAtomsToReplace){ numberOfAtomsToReplace=1; } int atomsReplaced =0; if (oxygenAtoms.size() >=numberOfAtomsToReplace){//check that there are at least as many oxygens as requested replacements for (Atom atomToReplace : oxygenAtoms) { if (atomsReplaced == numberOfAtomsToReplace){ continue; } else{ Fragment replacementFrag = state.fragManager.buildSMILES(replacementSmiles, atomToReplace.getFrag().getTokenEl(), NONE_LABELS_VAL); if (outValency ==3){//special case for nitrido atomToReplace.getFirstBond().setOrder(3); Atom removedHydroxy = singleBondedOxygen.remove(0); state.fragManager.removeAtomAndAssociatedBonds(removedHydroxy); removeAssociatedFunctionalAtom(removedHydroxy); } state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(atomToReplace, replacementFrag.getFirstAtom()); if (outValency ==1){ removeOrMoveObsoleteFunctionalAtoms(atomToReplace, replacementFrag); } moveObsoleteOutAtoms(atomToReplace, replacementFrag); state.fragManager.incorporateFragment(replacementFrag, atomToReplace.getFrag()); } atomsReplaced++; } } return atomsReplaced; } /* * Infix functional replacement nomenclature */ /** * This block handles infix multiplication. Unless brackets are provided this is ambiguous without knowledge of the suffix that is being modified * For example butandithione could be interpreted as butandi(thione) or butan(dithi)one. 
* Obviously the latter is wrong in this case but it is the correct interpretation for butandithiate * @param suffixes * @param suffixFragments * @param suffix * @param infixTransformations * @param oxygenAvailable * @throws ComponentGenerationException * @throws StructureBuildingException */ private void disambiguateMultipliedInfixMeaning(List suffixes, List suffixFragments,Element suffix, List infixTransformations, int oxygenAvailable) throws ComponentGenerationException, StructureBuildingException { Element possibleInfix = OpsinTools.getPreviousSibling(suffix); if (possibleInfix.getName().equals(INFIX_EL)){//the infix is only left when there was ambiguity Element possibleMultiplier = OpsinTools.getPreviousSibling(possibleInfix); if (possibleMultiplier.getName().equals(MULTIPLIER_EL)){ int multiplierValue =Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); if (infixTransformations.size() + multiplierValue-1 <=oxygenAvailable){//multiplier means multiply the infix e.g. butandithiate for (int j = 1; j < multiplierValue; j++) { infixTransformations.add(0, infixTransformations.get(0)); } } else{ Element possibleLocant = OpsinTools.getPreviousSibling(possibleMultiplier); String[] locants = null; if (possibleLocant.getName().equals(LOCANT_EL)) { locants = possibleLocant.getValue().split(","); } if (locants !=null){ if (locants.length!=multiplierValue){ throw new ComponentGenerationException("Multiplier/locant disagreement when multiplying infixed suffix"); } suffix.addAttribute(new Attribute(LOCANT_ATR, locants[0])); } suffix.addAttribute(new Attribute(MULTIPLIED_ATR, "multiplied")); for (int j = 1; j < multiplierValue; j++) {//multiplier means multiply the infixed suffix e.g. 
butandithione Element newSuffix = suffix.copy(); Fragment newSuffixFrag = state.fragManager.copyFragment(suffix.getFrag()); newSuffix.setFrag(newSuffixFrag); suffixFragments.add(newSuffixFrag); OpsinTools.insertAfter(suffix, newSuffix); suffixes.add(newSuffix); if (locants !=null){//assign locants if available newSuffix.getAttribute(LOCANT_ATR).setValue(locants[j]); } } if (locants!=null){ possibleLocant.detach(); } } possibleMultiplier.detach(); possibleInfix.detach(); } else{ throw new ComponentGenerationException("Multiplier expected in front of ambiguous infix"); } } } /* * Convenience Methods */ /** * Given an atom that is to be replaced by a functional replacement fragment * determines whether this atom is a functional atom and, if it is, performs the following processes: * The functionalAtom is removed. If the replacement fragment is an atom of O/S/Se/Te or the * terminal atom of the fragment is a single bonded O/S/Se/Te a functionalAtom is added to this atom. * @param atomToBeReplaced * @param replacementFrag */ private void removeOrMoveObsoleteFunctionalAtoms(Atom atomToBeReplaced, Fragment replacementFrag){ List replacementAtomList = replacementFrag.getAtomList(); Fragment origFrag = atomToBeReplaced.getFrag(); for (int i = origFrag.getFunctionalAtomCount() - 1; i >=0; i--) { FunctionalAtom functionalAtom = origFrag.getFunctionalAtom(i); if (atomToBeReplaced.equals(functionalAtom.getAtom())){ atomToBeReplaced.getFrag().removeFunctionalAtom(i); Atom terminalAtomOfReplacementFrag = replacementAtomList.get(replacementAtomList.size()-1); if ((terminalAtomOfReplacementFrag.getIncomingValency() ==1 || replacementAtomList.size()==1)&& terminalAtomOfReplacementFrag.getElement().isChalcogen()){ replacementFrag.addFunctionalAtom(terminalAtomOfReplacementFrag); terminalAtomOfReplacementFrag.setCharge(atomToBeReplaced.getCharge()); terminalAtomOfReplacementFrag.setProtonsExplicitlyAddedOrRemoved(atomToBeReplaced.getProtonsExplicitlyAddedOrRemoved()); } 
atomToBeReplaced.neutraliseCharge(); } } } /** * Given an atom that is to be replaced by a functional replacement fragment * determines whether this atom has outvalency and if it does removes the outatom from the atom's fragment * and adds an outatom to the replacementFrag * @param atomToBeReplaced * @param replacementFrag */ private void moveObsoleteOutAtoms(Atom atomToBeReplaced, Fragment replacementFrag){ if (atomToBeReplaced.getOutValency() >0){//this is not known to occur in well formed IUPAC names but would occur in thioxy (as a suffix) List replacementAtomList = replacementFrag.getAtomList(); Fragment origFrag = atomToBeReplaced.getFrag(); for (int i = origFrag.getOutAtomCount() - 1; i >=0; i--) { OutAtom outAtom = origFrag.getOutAtom(i); if (atomToBeReplaced.equals(outAtom.getAtom())){ atomToBeReplaced.getFrag().removeOutAtom(i); Atom terminalAtomOfReplacementFrag = replacementAtomList.get(replacementAtomList.size()-1); replacementFrag.addOutAtom(terminalAtomOfReplacementFrag, outAtom.getValency(), outAtom.isSetExplicitly()); } } } } private void removeAssociatedFunctionalAtom(Atom atomWithFunctionalAtom) throws StructureBuildingException { Fragment frag = atomWithFunctionalAtom.getFrag(); for (int i = frag.getFunctionalAtomCount() - 1; i >=0; i--) { FunctionalAtom functionalAtom = frag.getFunctionalAtom(i); if (atomWithFunctionalAtom.equals(functionalAtom.getAtom())){ atomWithFunctionalAtom.getFrag().removeFunctionalAtom(i); return; } } throw new StructureBuildingException("OPSIN bug: Unable to find associated functionalAtom"); } /** * Returns the subset of oxygenAtoms that possess one of the locants in locantEl * Searches for locant on nearest non suffix atom in case of suffixes * @param locantEl * @param oxygenAtoms * @return */ private List pickOxygensWithAppropriateLocants(Element locantEl, List oxygenAtoms) { String[] possibleLocants = locantEl.getValue().split(","); boolean pLocantSpecialCase = allLocantsP(possibleLocants); List 
oxygenWithAppropriateLocants = new ArrayList<>(); for (Atom atom : oxygenAtoms) { List atomlocants = atom.getLocants(); if (atomlocants.size() > 0) { for (String locantVal : possibleLocants) { if (atomlocants.contains(locantVal)) { oxygenWithAppropriateLocants.add(atom); break; } } } else if (pLocantSpecialCase) { for (Atom neighbour : atom.getAtomNeighbours()) { if (neighbour.getElement() == ChemEl.P) { oxygenWithAppropriateLocants.add(atom); break; } } } else { Atom atomWithNumericLocant = OpsinTools.depthFirstSearchForAtomWithNumericLocant(atom); if (atomWithNumericLocant != null) { List atomWithNumericLocantLocants = atomWithNumericLocant.getLocants(); for (String locantVal : possibleLocants) { if (atomWithNumericLocantLocants.contains(locantVal)) { oxygenWithAppropriateLocants.add(atom); break; } } } } } return oxygenWithAppropriateLocants; } private boolean allLocantsP(String[] locants) { if (locants.length == 0) { return false; } for (String locant : locants) { if (!locant.equals("P")) { return false; } } return true; } /** * Returns oxygen atoms in suffixes with functionalAtoms * @param groupToBeModified * @return */ private List findFunctionalOxygenAtomsInApplicableSuffixes(Element groupToBeModified) { List suffixElements =OpsinTools.getNextSiblingsOfType(groupToBeModified, SUFFIX_EL); List oxygenAtoms = new ArrayList<>(); for (Element suffix : suffixElements) { Fragment suffixFrag = suffix.getFrag(); if (suffixFrag != null) {//null for non carboxylic acids for (int i = 0, l = suffixFrag.getFunctionalAtomCount(); i < l; i++) { Atom a = suffixFrag.getFunctionalAtom(i).getAtom(); if (a.getElement() == ChemEl.O) { oxygenAtoms.add(a); } } } } return oxygenAtoms; } /** * Returns functional oxygen atoms in groupToBeModified * @param groupToBeModified * @return */ private List findFunctionalOxygenAtomsInGroup(Element groupToBeModified) { List oxygenAtoms = new ArrayList<>(); Fragment frag = groupToBeModified.getFrag(); for (int i = 0, l = 
frag.getFunctionalAtomCount(); i < l; i++) { Atom a = frag.getFunctionalAtom(i).getAtom(); if (a.getElement() == ChemEl.O){ oxygenAtoms.add(a); } } return oxygenAtoms; } /** * Returns etheric oxygen atoms in groupToBeModified * @param groupToBeModified * @return */ private List findEthericOxygenAtomsInGroup(Element groupToBeModified) { List oxygenAtoms = new ArrayList<>(); List atomList = groupToBeModified.getFrag().getAtomList(); for (Atom a: atomList) { if (a.getElement() == ChemEl.O && a.getBondCount()==2 && a.getCharge()==0 && a.getIncomingValency()==2){ oxygenAtoms.add(a); } } return oxygenAtoms; } /** * Returns oxygen atoms in suffixes with functionalAtoms or acidStem suffixes or aldehyde suffixes (1979 C-531) * @param groupToBeModified * @return */ private List findOxygenAtomsInApplicableSuffixes(Element groupToBeModified) { List suffixElements =OpsinTools.getNextSiblingsOfType(groupToBeModified, SUFFIX_EL); List oxygenAtoms = new ArrayList<>(); for (Element suffix : suffixElements) { Fragment suffixFrag = suffix.getFrag(); if (suffixFrag != null) {//null for non carboxylic acids if (suffixFrag.getFunctionalAtomCount() > 0 || groupToBeModified.getAttributeValue(TYPE_ATR).equals(ACIDSTEM_TYPE_VAL) || suffix.getAttributeValue(VALUE_ATR).equals("aldehyde")) { List atomList = suffixFrag.getAtomList(); for (Atom a : atomList) { if (a.getElement() == ChemEl.O) { oxygenAtoms.add(a); } } } } } return oxygenAtoms; } /** * Returns oxygen atoms in groupToBeModified * @param groupToBeModified * @return */ private List findOxygenAtomsInGroup(Element groupToBeModified) { List oxygenAtoms = new ArrayList<>(); List atomList = groupToBeModified.getFrag().getAtomList(); for (Atom a : atomList) { if (a.getElement() == ChemEl.O){ oxygenAtoms.add(a); } } return oxygenAtoms; } private void populateTerminalSingleAndDoubleBondedOxygen(List atomList, List singleBondedOxygen, List doubleBondedOxygen) throws StructureBuildingException { for (Atom a : atomList) { if (a.getElement() == 
ChemEl.O){//find terminal oxygens
				if (a.getBondCount()==1){
					int incomingValency = a.getIncomingValency();
					if (incomingValency ==2){
						doubleBondedOxygen.add(a);
					}
					else if (incomingValency ==1){
						singleBondedOxygen.add(a);
					}
					else{
						throw new StructureBuildingException("Unexpected bond order to oxygen; expected 1 or 2, found: " +incomingValency);
					}
				}
			}
		}
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FusedRingBuilder.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*;
import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

/**
 * Assembles fused rings named using fusion nomenclature
 * @author dl387
 *
 */
class FusedRingBuilder {
	private final BuildState state;
	private final List<Element> groupsInFusedRing;
	private final Element lastGroup;
	private final Fragment parentRing;
	private final Map<Integer, Fragment> fragmentInScopeForEachFusionLevel = new HashMap<>();
	private final Map<Atom, Atom> atomsToRemoveToReplacementAtom = new HashMap<>();

	private FusedRingBuilder(BuildState state, List<Element> groupsInFusedRing) {
		this.state = state;
		this.groupsInFusedRing = groupsInFusedRing;
		lastGroup = groupsInFusedRing.get(groupsInFusedRing.size()-1);
		parentRing = lastGroup.getFrag();
		fragmentInScopeForEachFusionLevel.put(0, parentRing);
	}

	/**
	 * Master method for processing fused rings.
Fuses groups together * @param state: contains the current id and fragment manager * @param subOrRoot Element (substituent or root) * @throws StructureBuildingException */ static void processFusedRings(BuildState state, Element subOrRoot) throws StructureBuildingException { List groups = subOrRoot.getChildElements(GROUP_EL); if (groups.size() < 2){ return;//nothing to fuse } List groupsInFusedRing =new ArrayList<>(); for (int i = groups.size()-1; i >=0; i--) {//group groups into fused rings Element group =groups.get(i); groupsInFusedRing.add(0, group); if (i!=0){ Element startingEl = group; if ((group.getValue().equals("benz") || group.getValue().equals("benzo")) && FUSIONRING_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){ Element beforeBenzo = OpsinTools.getPreviousSibling(group); if (beforeBenzo !=null && beforeBenzo.getName().equals(LOCANT_EL)){ startingEl = beforeBenzo; } } Element possibleGroup = OpsinTools.getPreviousSiblingIgnoringCertainElements(startingEl, new String[]{MULTIPLIER_EL, FUSION_EL}); if (!groups.get(i-1).equals(possibleGroup)){//end of fused ring system if (groupsInFusedRing.size()>=2){ //This will be invoked in cases where there are multiple fused ring systems in the same subOrRoot such as some spiro systems new FusedRingBuilder(state, groupsInFusedRing).buildFusedRing(); } groupsInFusedRing.clear(); } } } if (groupsInFusedRing.size()>=2){ new FusedRingBuilder(state, groupsInFusedRing).buildFusedRing(); } } /** * Combines the groups given in the {@link FusedRingBuilder} constructor to destructively create the fused ring system * This fused ring is then numbered * @throws StructureBuildingException */ void buildFusedRing() throws StructureBuildingException{ /* * Apply any nonstandard ring numbering, sorts atomOrder by locant * Aromatises appropriate cycloalkane rings, Rejects groups with acyclic atoms */ processRingNumberingAndIrregularities(); processBenzoFusions();//FR-2.2.8 e.g. 
in 2H-[1,3]benzodioxino[6',5',4':10,5,6]anthra[2,3-b]azepine benzodioxino is one component List nameComponents = formNameComponentList(); nameComponents.remove(lastGroup); List componentFragments = new ArrayList<>();//all the ring fragments (other than the parentRing). These will later be merged into the parentRing List parentFragments = new ArrayList<>(); parentFragments.add(parentRing); int numberOfParents = 1; Element possibleMultiplier = OpsinTools.getPreviousSibling(lastGroup); if (nameComponents.size()>0 && possibleMultiplier !=null && possibleMultiplier.getName().equals(MULTIPLIER_EL)){ numberOfParents = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); possibleMultiplier.detach(); for (int j = 1; j < numberOfParents; j++) { Fragment copyOfParentRing =state.fragManager.copyFragment(parentRing); parentFragments.add(copyOfParentRing); componentFragments.add(copyOfParentRing); } } /*The indice from nameComponents to use next. Work from right to left i.e. starts at nameComponents.size()-1*/ int ncIndice = processMultiParentSystem(parentFragments, nameComponents, componentFragments);//handle multiparent systems /* * The number of primes on the component to be connected. * This is initially 0 indicating fusion of unprimed locants with the letter locants of the parentRing * Subsequently it will switch to 1 indicating fusion of a second order component (primed locants) with a * first order component (unprimed locants) * Next would be double primed fusing to single primed locants etc. * */ int fusionLevel = (nameComponents.size()-1 -ncIndice)/2; for (; ncIndice>=0; ncIndice--) { Element fusion = null; if (nameComponents.get(ncIndice).getName().equals(FUSION_EL)){ fusion = nameComponents.get(ncIndice--); } if (ncIndice <0 || !nameComponents.get(ncIndice).getName().equals(GROUP_EL)){ throw new StructureBuildingException("Group not found where group expected. 
This is probably a bug"); } Fragment nextComponent = nameComponents.get(ncIndice).getFrag(); int multiplier = 1; Element possibleMultiplierEl = OpsinTools.getPreviousSibling(nameComponents.get(ncIndice));//e.g. the di of difuro if (possibleMultiplierEl != null && possibleMultiplierEl.getName().equals(MULTIPLIER_EL)){ multiplier = Integer.parseInt(possibleMultiplierEl.getAttributeValue(VALUE_ATR)); } String[] fusionDescriptors =null; if (fusion !=null){ String fusionDescriptorString = fusion.getValue().toLowerCase(Locale.ROOT).substring(1, fusion.getValue().length()-1); if (multiplier ==1){ fusionDescriptors = new String[]{fusionDescriptorString}; } else{ if (fusionDescriptorString.split(";").length >1){ fusionDescriptors = fusionDescriptorString.split(";"); } else if (fusionDescriptorString.split(":").length >1){ fusionDescriptors = fusionDescriptorString.split(":"); } else if (fusionDescriptorString.split(",").length >1){ fusionDescriptors = fusionDescriptorString.split(","); } else{//multiplier does not appear to mean multiplied component. 
Could be indicating multiplication of the whole fused ring system if (ncIndice!=0){ throw new StructureBuildingException("Unexpected multiplier: " + possibleMultiplierEl.getValue() +" or incorrect fusion descriptor: " + fusionDescriptorString); } multiplier =1; fusionDescriptors = new String[]{fusionDescriptorString}; } } } if (multiplier >1){ possibleMultiplierEl.detach(); } Fragment[] fusionComponents = new Fragment[multiplier]; for (int j = 0; j < multiplier; j++) { if (j>0){ fusionComponents[j] = state.fragManager.copyAndRelabelFragment(nextComponent, j); } else{ fusionComponents[j] = nextComponent; } } for (int j = 0; j < multiplier; j++) { Fragment component = fusionComponents[j]; componentFragments.add(component); if (fusion !=null){ if (fusionDescriptors[j].split(":").length==1){//A fusion bracket without a colon is used when applying to the parent component (except in a special case where locants are ommitted) //check for case of omitted locant from a higher order fusion bracket e.g. cyclopenta[4,5]pyrrolo[2,3-c]pyridine if (fusionDescriptors[j].split("-").length==1 && fusionDescriptors[j].split(",").length >1 && FragmentTools.allAtomsInRingAreIdentical(component) && ((StringTools.countTerminalPrimes(fusionDescriptors[j].split(",")[0])) != fusionLevel) ){//Could be like cyclopenta[3,4]cyclobuta[1,2]benzene where the first fusion to occur has parent locants omitted not child locants int numberOfPrimes = StringTools.countTerminalPrimes(fusionDescriptors[j].split(",")[0]); //note that this is the number of primes on the parent ring. So would expect the child ring and hence the fusionLevel to be 1 higher if (numberOfPrimes + 1 != fusionLevel){ if (numberOfPrimes + 2 == fusionLevel){//ring could be in previous fusion level e.g. 
the benzo in benzo[10,11]phenanthro[2',3',4',5',6':4,5,6,7]chryseno[1,2,3-bc]coronene fusionLevel--; } else{ throw new StructureBuildingException("Incorrect number of primes in fusion bracket: " +fusionDescriptors[j]); } } relabelAccordingToFusionLevel(component, fusionLevel); List numericalLocantsOfParent = Arrays.asList(fusionDescriptors[j].split(",")); List numericalLocantsOfChild = findPossibleNumericalLocants(component, determineAtomsToFuse(fragmentInScopeForEachFusionLevel.get(fusionLevel), numericalLocantsOfParent, null).size()-1); processHigherOrderFusionDescriptors(component, fragmentInScopeForEachFusionLevel.get(fusionLevel), numericalLocantsOfChild, numericalLocantsOfParent); } else{ fusionLevel = 0; relabelAccordingToFusionLevel(component, fusionLevel); String fusionDescriptor = fusionDescriptors[j]; String[] fusionArray = determineNumericalAndLetterComponents(fusionDescriptor); int numberOfPrimes =0; if (!fusionArray[1].equals("")){ numberOfPrimes =StringTools.countTerminalPrimes(fusionArray[1]); if (fusionArray[0].equals("")){ fusionDescriptor = fusionArray[1].replaceAll("'", ""); } else{ fusionDescriptor = fusionArray[0]+ "-" +fusionArray[1].replaceAll("'", ""); } if (numberOfPrimes >= parentFragments.size()){ throw new StructureBuildingException("Unexpected prime in fusion descriptor"); } } performSimpleFusion(fusionDescriptor, component, parentFragments.get(numberOfPrimes));//e.g. 
pyrano[3,2-b]imidazo[4,5-e]pyridine where both are level 0 fusions } } else{ //determine number of primes in fusor and hence determine fusion level int numberOfPrimes = -j + StringTools.countTerminalPrimes(fusionDescriptors[j].split(",")[0]); if (numberOfPrimes != fusionLevel){ if (fusionLevel == numberOfPrimes +1){ fusionLevel--; } else{ throw new StructureBuildingException("Incorrect number of primes in fusion bracket: " +fusionDescriptors[j]); } } relabelAccordingToFusionLevel(component, fusionLevel); performHigherOrderFusion(fusionDescriptors[j], component, fragmentInScopeForEachFusionLevel.get(fusionLevel)); } } else{ relabelAccordingToFusionLevel(component, fusionLevel); performSimpleFusion(null, component, fragmentInScopeForEachFusionLevel.get(fusionLevel)); } } fusionLevel++; if (multiplier ==1){//multiplied components may not be substituted onto fragmentInScopeForEachFusionLevel.put(fusionLevel, fusionComponents[0]); } } for (Fragment ring : componentFragments) { state.fragManager.incorporateFragment(ring, parentRing); } removeMergedAtoms(); FusedRingNumberer.numberFusedRing(parentRing);//numbers the fused ring; StringBuilder fusedRingName = new StringBuilder(); for (Element element : nameComponents) { fusedRingName.append(element.getValue()); } fusedRingName.append(lastGroup.getValue()); Element fusedRingEl =lastGroup;//reuse this element to save having to remap suffixes... 
		fusedRingEl.getAttribute(VALUE_ATR).setValue(fusedRingName.toString());
		fusedRingEl.getAttribute(TYPE_ATR).setValue(RING_TYPE_VAL);
		fusedRingEl.setValue(fusedRingName.toString());

		for (Element element : nameComponents) {
			element.detach();
		}
	}

	private void removeMergedAtoms() {
		for (Atom a : atomsToRemoveToReplacementAtom.keySet()) {
			state.fragManager.removeAtomAndAssociatedBonds(a);
		}
		atomsToRemoveToReplacementAtom.clear();
	}

	/**
	 * Forms a list of all group and fusion elements from the first group up to
	 * (but not including) the last group in the fused ring
	 * @return
	 */
	private List<Element> formNameComponentList() {
		List<Element> nameComponents = new ArrayList<>();
		Element currentEl = groupsInFusedRing.get(0);
		while(currentEl != lastGroup){
			if (currentEl.getName().equals(GROUP_EL) || currentEl.getName().equals(FUSION_EL)){
				nameComponents.add(currentEl);
			}
			currentEl = OpsinTools.getNextSibling(currentEl);
		}
		return nameComponents;
	}

	private void processRingNumberingAndIrregularities() throws StructureBuildingException {
		for (Element group : groupsInFusedRing) {
			Fragment ring = group.getFrag();
			if (ALKANESTEM_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){
				aromatiseCyclicAlkane(group);
			}
			processPartiallyUnsaturatedHWSystems(group, ring);
			if (group == lastGroup) {
				//perform a quick check that every atom in this group is in fact cyclic. Fusion components are enumerated and hence all guaranteed to be purely cyclic
				List<Atom> atomList = ring.getAtomList();
				for (Atom atom : atomList) {
					if (!atom.getAtomIsInACycle()) {
						throw new StructureBuildingException("Inappropriate group used in fusion nomenclature. Only groups composed entirely of atoms in cycles may be used. i.e. 
not: " + group.getValue()); } } if (group.getAttribute(FUSEDRINGNUMBERING_ATR) != null) { String[] standardNumbering = group.getAttributeValue(FUSEDRINGNUMBERING_ATR).split("/", -1); for (int j = 0; j < standardNumbering.length; j++) { atomList.get(j).replaceLocants(standardNumbering[j]); } } else { ring.sortAtomListByLocant();//for those where the order the locants are in is sensible } } for (Atom atom : atomList) { atom.clearLocants();//the parentRing does not have locants, letters are used to indicate the edges } } else if (group.getAttribute(FUSEDRINGNUMBERING_ATR) == null) { ring.sortAtomListByLocant();//for those where the order the locants are in is sensible } } } /** * Interprets the unlocanted unsaturator after a partially unsaturated HW Rings as indication of spare valency and detaches it * This is necessary as this unsaturator can only refer to the HW ring and for names like 2-Benzoxazolinone to avoid confusion as to what the 2 refers to. * @param group * @param ring */ private void processPartiallyUnsaturatedHWSystems(Element group, Fragment ring) { if (HANTZSCHWIDMAN_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) && group.getAttribute(ADDBOND_ATR)!=null){ List unsaturators = OpsinTools.getNextAdjacentSiblingsOfType(group, UNSATURATOR_EL); if (unsaturators.size()>0){ Element unsaturator = unsaturators.get(0); if (unsaturator.getAttribute(LOCANT_ATR)==null && unsaturator.getAttributeValue(VALUE_ATR).equals("2")){ unsaturator.detach(); List bondsToUnsaturate = StructureBuildingMethods.findBondsToUnSaturate(ring, 2, true); if (bondsToUnsaturate.isEmpty()) { throw new RuntimeException("Failed to find bond to unsaturate on partially saturated HW ring"); } Bond b = bondsToUnsaturate.get(0); b.getFromAtom().setSpareValency(true); b.getToAtom().setSpareValency(true); } } } } /** * Given a cyclicAlkaneGroup determines whether or not it should be aromatised. 
Unlocanted ene will be detached if it is an aromatisation hint * No unsaturators -->aromatise * Just ane -->don't * More than 1 ene or locants on ene -->don't * yne --> don't * @param cyclicAlkaneGroup */ private void aromatiseCyclicAlkane(Element cyclicAlkaneGroup) { Element next = OpsinTools.getNextSibling(cyclicAlkaneGroup); List unsaturators = new ArrayList<>(); while (next!=null && next.getName().equals(UNSATURATOR_EL)){ unsaturators.add(next); next = OpsinTools.getNextSibling(next); } boolean conjugate =true; if (unsaturators.size()==1){ int value = Integer.parseInt(unsaturators.get(0).getAttributeValue(VALUE_ATR)); if (value !=2){ conjugate =false; } else if (unsaturators.get(0).getAttribute(LOCANT_ATR)!=null){ conjugate =false; } } else if (unsaturators.size()==2){ int value1 = Integer.parseInt(unsaturators.get(0).getAttributeValue(VALUE_ATR)); if (value1 !=1){ conjugate =false; } else{ int value2 = Integer.parseInt(unsaturators.get(1).getAttributeValue(VALUE_ATR)); if (value2 !=2 || unsaturators.get(1).getAttribute(LOCANT_ATR)!=null){ conjugate =false; } } } else if (unsaturators.size() >2){ conjugate =false; } if (conjugate){ for (Element unsaturator : unsaturators) { unsaturator.detach(); } List atomList = cyclicAlkaneGroup.getFrag().getAtomList(); for (Atom atom : atomList) { atom.setSpareValency(true); } } } private int processMultiParentSystem(List parentFragments, List nameComponents, List componentFragments) throws StructureBuildingException { int i = nameComponents.size()-1; int fusionLevel =0; if (i>=0 && parentFragments.size()>1){ List previousFusionLevelFragments = parentFragments; for (; i>=0; i--) { if (previousFusionLevelFragments.size()==1){//completed multi parent system fragmentInScopeForEachFusionLevel.put(fusionLevel, previousFusionLevelFragments.get(0)); break; } Element fusion = null; if (nameComponents.get(i).getName().equals(FUSION_EL)){ fusion = nameComponents.get(i--); } else{ throw new StructureBuildingException("Fusion bracket 
not found where fusion bracket expected"); } if (i <0 || !nameComponents.get(i).getName().equals(GROUP_EL)){ throw new StructureBuildingException("Group not found where group expected. This is probably a bug"); } Fragment nextComponent = nameComponents.get(i).getFrag(); relabelAccordingToFusionLevel(nextComponent, fusionLevel); int multiplier = 1; Element possibleMultiplierEl = OpsinTools.getPreviousSibling(nameComponents.get(i)); if (possibleMultiplierEl != null && possibleMultiplierEl.getName().equals(MULTIPLIER_EL)){ multiplier = Integer.parseInt(possibleMultiplierEl.getAttributeValue(VALUE_ATR)); possibleMultiplierEl.detach(); } List fusionComponents = new ArrayList<>(); for (int j = 0; j < multiplier; j++) { if (j>0){ Fragment clonedFrag = state.fragManager.copyFragment(nextComponent); relabelAccordingToFusionLevel(clonedFrag, j);//fusionLevels worth of primes already added fusionComponents.add(clonedFrag); } else{ fusionComponents.add(nextComponent); } } fusionLevel+=multiplier; if (multiplier>1 && multiplier != previousFusionLevelFragments.size()){ throw new StructureBuildingException("Mismatch between number of components and number of parents in fused ring system"); } String fusionDescriptorString = fusion.getValue().toLowerCase(Locale.ROOT).substring(1, fusion.getValue().length()-1); String[] fusionDescriptors =null; if (fusionDescriptorString.split(";").length >1){ fusionDescriptors = fusionDescriptorString.split(";"); } else if (fusionDescriptorString.split(":").length >1){ fusionDescriptors = fusionDescriptorString.split(":"); } else if (fusionDescriptorString.split(",").length >1){ fusionDescriptors = fusionDescriptorString.split(","); } else{ throw new StructureBuildingException("Invalid fusion descriptor: " + fusionDescriptorString); } if (fusionDescriptors.length != previousFusionLevelFragments.size()){ throw new StructureBuildingException("Invalid fusion descriptor: "+fusionDescriptorString +"(Number of locants disagrees with number of parents)"); 
} for (int j = 0; j < fusionDescriptors.length; j++) { String fusionDescriptor = fusionDescriptors[j]; Fragment component = multiplier>1 ? fusionComponents.get(j) : nextComponent; Fragment parentToUse = previousFusionLevelFragments.get(j); boolean simpleFusion = fusionDescriptor.split(":").length <= 1; if (simpleFusion){ String[] fusionArray = determineNumericalAndLetterComponents(fusionDescriptor); if (fusionArray[1].length() != 0){ int numberOfPrimes =StringTools.countTerminalPrimes(fusionArray[1]); if (fusionArray[0].length() == 0){ fusionDescriptor = fusionArray[1].replaceAll("'", ""); } else{ fusionDescriptor = fusionArray[0]+ "-" +fusionArray[1].replaceAll("'", ""); } if (numberOfPrimes !=j){//check the number of primes on the letter part agree with the parent to use e.g.[4,5-bcd:1,2-c']difuran throw new StructureBuildingException("Incorrect number of primes in fusion descriptor: " + fusionDescriptor); } } performSimpleFusion(fusionDescriptor, component, parentToUse); } else{ performHigherOrderFusion(fusionDescriptor, component, parentToUse); } } previousFusionLevelFragments = fusionComponents; componentFragments.addAll(fusionComponents); } if (previousFusionLevelFragments.size()!=1){ throw new StructureBuildingException("Invalid fused ring system. 
Incomplete multiparent system");
			}
		}
		return i;
	}

	/**
	 * Splits a first order fusion descriptor into its numerical and letter parts
	 * Either one of these can be the blank string as they may have been omitted
	 * The first entry in the array is the numbers and the second the letters
	 * @param fusionDescriptor
	 * @return
	 */
	private String[] determineNumericalAndLetterComponents(String fusionDescriptor) {
		String[] fusionArray = fusionDescriptor.split("-");
		if (fusionArray.length ==2){
			return fusionArray;
		}
		else{
			String[] components = new String[2];
			if (fusionArray[0].contains(",")){//the digit section
				components[0]=fusionArray[0];
				components[1]="";
			}
			else{
				components[0]="";
				components[1]=fusionArray[0];
			}
			return components;
		}
	}

	/**
	 * Searches groups for benz(o) components and fuses them in accordance with
	 * FR-2.2.8 Heterobicyclic components with a benzene ring
	 * @throws StructureBuildingException
	 */
	private void processBenzoFusions() throws StructureBuildingException {
		for(int i = groupsInFusedRing.size() - 2; i >= 0; i--) {
			Element group = groupsInFusedRing.get(i);
			if (group.getValue().equals("benz") || group.getValue().equals("benzo")) {
				Element possibleFusionbracket = OpsinTools.getNextSibling(group);
				if (!possibleFusionbracket.getName().equals(FUSION_EL)) {
					Element possibleMultiplier = OpsinTools.getPreviousSibling(group);
					if (possibleMultiplier == null || !possibleMultiplier.getName().equals(MULTIPLIER_EL) || possibleMultiplier.getAttributeValue(TYPE_ATR).equals(GROUP_TYPE_VAL)) {
						//e.g. 2-benzofuran. Fused rings of this type are a special case treated as being a single component
						//and have a special convention for indicating the position of heteroatoms
						benzoSpecificFusion(group, groupsInFusedRing.get(i + 1));
						group.detach();
						groupsInFusedRing.remove(i);
					}
				}
			}
		}
	}

	/**
	 * Modifies the component's locants according to the fusionLevel.
* @param component * @param fusionLevel */ private void relabelAccordingToFusionLevel(Fragment component, int fusionLevel) { if (fusionLevel > 0){ FragmentTools.relabelNumericLocants(component.getAtomList(), StringTools.multiplyString("'", fusionLevel)); } } /** * Handles fusion between components where the fusion descriptor is of the form: * comma separated locants dash letters * e.g imidazo[4,5-d]pyridine * The fusionDescriptor may be given as null or the letter/numerical part omitted. * Sensible defaults will be found instead * @param fusionDescriptor * @param childRing * @param parentRing * @throws StructureBuildingException */ private void performSimpleFusion(String fusionDescriptor, Fragment childRing, Fragment parentRing) throws StructureBuildingException { List numericalLocantsOfChild = null; List letterLocantsOfParent = null; if (fusionDescriptor != null){ String[] fusionArray = fusionDescriptor.split("-"); if (fusionArray.length ==2){ numericalLocantsOfChild = Arrays.asList(fusionArray[0].split(",")); char[] tempLetterLocantsOfParent = fusionArray[1].toCharArray(); letterLocantsOfParent = new ArrayList<>(); for (char letterLocantOfParent : tempLetterLocantsOfParent) { letterLocantsOfParent.add(String.valueOf(letterLocantOfParent)); } } else{ if (fusionArray[0].contains(",")){//only has digits String[] numericalLocantsOfChildTemp = fusionArray[0].split(","); numericalLocantsOfChild = Arrays.asList(numericalLocantsOfChildTemp); } else{//only has letters char[] tempLetterLocantsOfParentCharArray = fusionArray[0].toCharArray(); letterLocantsOfParent = new ArrayList<>(); for (char letterLocantOfParentCharArray : tempLetterLocantsOfParentCharArray) { letterLocantsOfParent.add(String.valueOf(letterLocantOfParentCharArray)); } } } } int edgeLength =1; if (numericalLocantsOfChild != null){ if (numericalLocantsOfChild.size() <=1){ throw new StructureBuildingException("At least two numerical locants must be provided to perform fusion!"); } edgeLength = 
numericalLocantsOfChild.size()-1;
		}
		else if (letterLocantsOfParent != null){
			edgeLength = letterLocantsOfParent.size();
		}
		if (numericalLocantsOfChild == null){
			numericalLocantsOfChild = findPossibleNumericalLocants(childRing, edgeLength);
		}
		if (letterLocantsOfParent == null){
			letterLocantsOfParent = findPossibleLetterLocants(parentRing, edgeLength);
		}
		if (numericalLocantsOfChild == null || letterLocantsOfParent ==null){
			throw new StructureBuildingException("Unable to find bond to form fused ring system. Some information for forming fused ring system was only supplied implicitly");
		}
		processFirstOrderFusionDescriptors(childRing, parentRing, numericalLocantsOfChild, letterLocantsOfParent);//fuse the rings
	}

	/**
	 * Takes a ring and returns a list with one letter locant for each side of a run of sides
	 * that is bounded by adjacent non-bridgehead carbons
	 * The number of sides is specified by edgeLength
	 * @param ring
	 * @param edgeLength The number of bonds to be fused along
	 * @return
	 */
	private List<String> findPossibleLetterLocants(Fragment ring, int edgeLength) {
		List<Integer> carbonAtomIndexes = new ArrayList<>();
		int numberOfAtoms = ring.getAtomCount();
		CyclicAtomList cyclicAtomList = new CyclicAtomList(ring.getAtomList());
		for (int i = 0; i <= numberOfAtoms; i++) {
			//iterate backwards in list to use highest locanted edge in preference.
			//this retains what is currently locant 1 on the parent ring as locant 1 if the first two atoms found match
			//the last atom in the list is potentially tested twice e.g. on a 6 membered ring, 6-5 and 1-6 are both possible
			Atom atom = cyclicAtomList.previous();
			//want non-bridgehead carbon atoms. Double-check that these carbon atoms are actually bonded (e.g. von Baeyer systems have non-consecutive atom numbering!)
if (atom.getElement() == ChemEl.C && atom.getBondCount() == 2 && (carbonAtomIndexes.isEmpty() || atom.getAtomNeighbours().contains(cyclicAtomList.peekNext()))){ carbonAtomIndexes.add(cyclicAtomList.getIndex()); if (carbonAtomIndexes.size() == edgeLength + 1){//as many carbons in a row as to give that edgelength ->use these side/s Collections.reverse(carbonAtomIndexes); List letterLocantsOfParent = new ArrayList<>(); for (int j = 0; j < edgeLength; j++) { letterLocantsOfParent.add(String.valueOf((char)(97 + carbonAtomIndexes.get(j))));//97 is ascii for a } return letterLocantsOfParent; } } else{ carbonAtomIndexes.clear(); } } return null; } /** * Takes a ring and returns an array of numbers corresponding to a side/s * that contains two adjacent non bridgehead carbons * The number of sides is specified by edgeLength * @param ring * @param edgeLength The number of bonds to be fused along * @return */ private List findPossibleNumericalLocants(Fragment ring, int edgeLength) { List carbonLocants = new ArrayList<>(); int numberOfAtoms = ring.getAtomCount(); CyclicAtomList cyclicAtomList = new CyclicAtomList(ring.getAtomList()); for (int i = 0; i <= numberOfAtoms; i++) { //the last atom in the list is potentially tested twice e.g. on a 6 membered ring, 1-2 and 6-1 are both possible Atom atom = cyclicAtomList.next(); //want non-bridgehead carbon atoms. Double-check that these carbon atoms are actually bonded (e.g. von baeyer systems have non-consecutive atom numbering!) 
if (atom.getElement() == ChemEl.C && atom.getBondCount() == 2 && (carbonLocants.isEmpty() || atom.getAtomNeighbours().contains(cyclicAtomList.peekPrevious()))){ carbonLocants.add(atom.getFirstLocant()); if (carbonLocants.size() == edgeLength + 1){//as many carbons in a row as to give that edgelength ->use these side/s List numericalLocantsOfChild = new ArrayList<>(); for (String locant : carbonLocants) { numericalLocantsOfChild.add(locant); } return numericalLocantsOfChild; } } else{ carbonLocants.clear(); } } return null; } /** * Performs a single ring fusion using the values in numericalLocantsOfChild/letterLocantsOfParent * @param childRing * @param parentRing * @param numericalLocantsOfChild * @param letterLocantsOfParent * @throws StructureBuildingException */ private void processFirstOrderFusionDescriptors(Fragment childRing, Fragment parentRing, List numericalLocantsOfChild, List letterLocantsOfParent) throws StructureBuildingException { List childAtoms = determineAtomsToFuse(childRing, numericalLocantsOfChild, letterLocantsOfParent.size() +1); if (childAtoms ==null){ throw new StructureBuildingException("Malformed fusion bracket!"); } List parentAtoms = new ArrayList<>(); List parentPeripheralAtomList = getPeripheralAtoms(parentRing.getAtomList()); CyclicAtomList cyclicListAtomsOnSurfaceOfParent = new CyclicAtomList(parentPeripheralAtomList, (int)letterLocantsOfParent.get(0).charAt(0) -97);//convert from lower case character through ascii to 0-23 parentAtoms.add(cyclicListAtomsOnSurfaceOfParent.getCurrent()); for (int i = 0; i < letterLocantsOfParent.size(); i++) { parentAtoms.add(cyclicListAtomsOnSurfaceOfParent.next()); } fuseRings(childAtoms, parentAtoms); } /** * Returns the sublist of the given atoms that are peripheral atoms given that the list is ordered such that the interior atoms are at the end of the list * @param atomList * @return */ private List getPeripheralAtoms(List atomList) { //find the indice of the last atom on the surface of the ring. 
This obviously connects to the first atom. The objective is to exclude any interior atoms. List neighbours = atomList.get(0).getAtomNeighbours(); int indice = Integer.MAX_VALUE; for (Atom atom : neighbours) { int indexOfAtom =atomList.indexOf(atom); if (indexOfAtom ==1){//not the next atom continue; } else if (indexOfAtom ==-1){//not in parentRing continue; } if (atomList.indexOf(atom)< indice){ indice = indexOfAtom; } } return atomList.subList(0, indice +1); } /** * Handles fusion between components where the fusion descriptor is of the form: * comma separated locants colon comma separated locants * e.g pyrido[1'',2'':1',2']imidazo * @param fusionDescriptor * @param nextComponent * @param fusedRing * @throws StructureBuildingException */ private void performHigherOrderFusion(String fusionDescriptor, Fragment nextComponent, Fragment fusedRing) throws StructureBuildingException { List numericalLocantsOfChild = null; List numericalLocantsOfParent = null; String[] fusionArray = fusionDescriptor.split(":"); if (fusionArray.length ==2){ numericalLocantsOfChild = Arrays.asList(fusionArray[0].split(",")); numericalLocantsOfParent = Arrays.asList(fusionArray[1].split(",")); } else{ throw new StructureBuildingException("Malformed fusion bracket: This is an OPSIN bug, check regexTokens.xml"); } processHigherOrderFusionDescriptors(nextComponent, fusedRing, numericalLocantsOfChild, numericalLocantsOfParent);//fuse the rings } /** * Performs a single ring fusion using the values in numericalLocantsOfChild/numericalLocantsOfParent * @param childRing * @param parentRing * @param numericalLocantsOfChild * @param numericalLocantsOfParent * @throws StructureBuildingException */ private void processHigherOrderFusionDescriptors(Fragment childRing, Fragment parentRing, List numericalLocantsOfChild, List numericalLocantsOfParent) throws StructureBuildingException { List childAtoms =determineAtomsToFuse(childRing, numericalLocantsOfChild, null); if (childAtoms ==null){ throw new 
StructureBuildingException("Malformed fusion bracket!"); } List parentAtoms = determineAtomsToFuse(parentRing, numericalLocantsOfParent, childAtoms.size()); if (parentAtoms ==null){ throw new StructureBuildingException("Malformed fusion bracket!"); } fuseRings(childAtoms, parentAtoms); } /** * Determines which atoms on a ring should be used for fusion given a set of numerical locants. * If from the other ring involved in the fusion it is known how many atoms are expected to be found this should be provided * If this is not known it should be set to null and the smallest number of fusion atoms will be returned. * @param ring * @param numericalLocantsOnRing * @param expectedNumberOfAtomsToBeUsedForFusion * @return * @throws StructureBuildingException */ private List determineAtomsToFuse(Fragment ring, List numericalLocantsOnRing, Integer expectedNumberOfAtomsToBeUsedForFusion) throws StructureBuildingException { List parentPeripheralAtomList = getPeripheralAtoms(ring.getAtomList()); String firstLocant = numericalLocantsOnRing.get(0); String lastLocant = numericalLocantsOnRing.get(numericalLocantsOnRing.size() - 1); int indexfirst = parentPeripheralAtomList.indexOf(ring.getAtomByLocantOrThrow(firstLocant)); if (indexfirst == -1) { throw new StructureBuildingException(firstLocant + " refers to an atom that is not a peripheral atom!"); } int indexfinal = parentPeripheralAtomList.indexOf(ring.getAtomByLocantOrThrow(lastLocant)); if (indexfinal == -1) { throw new StructureBuildingException(lastLocant + " refers to an atom that is not a peripheral atom!"); } CyclicAtomList cyclicRingAtomList = new CyclicAtomList(parentPeripheralAtomList, indexfirst); List fusionAtoms = null; List potentialFusionAtomsAscending = new ArrayList<>(); potentialFusionAtomsAscending.add(cyclicRingAtomList.getCurrent()); while (cyclicRingAtomList.getIndex() != indexfinal){//assume numbers are ascending potentialFusionAtomsAscending.add(cyclicRingAtomList.next()); } if 
(expectedNumberOfAtomsToBeUsedForFusion ==null ||expectedNumberOfAtomsToBeUsedForFusion == potentialFusionAtomsAscending.size()){ boolean notInPotentialParentAtoms =false; for (int i =1; i < numericalLocantsOnRing.size()-1 ; i ++){ if (!potentialFusionAtomsAscending.contains(ring.getAtomByLocantOrThrow(numericalLocantsOnRing.get(i)))){ notInPotentialParentAtoms =true; } } if (!notInPotentialParentAtoms){ fusionAtoms = potentialFusionAtomsAscending; } } if (fusionAtoms ==null || expectedNumberOfAtomsToBeUsedForFusion ==null){//that didn't work, so try assuming the numbers are descending cyclicRingAtomList.setIndex(indexfirst); List potentialFusionAtomsDescending = new ArrayList<>(); potentialFusionAtomsDescending.add(cyclicRingAtomList.getCurrent()); while (cyclicRingAtomList.getIndex() != indexfinal){//assume numbers are descending potentialFusionAtomsDescending.add(cyclicRingAtomList.previous()); } if (expectedNumberOfAtomsToBeUsedForFusion ==null || expectedNumberOfAtomsToBeUsedForFusion == potentialFusionAtomsDescending.size()){ boolean notInPotentialParentAtoms =false; for (int i =1; i < numericalLocantsOnRing.size()-1 ; i ++){ if (!potentialFusionAtomsDescending.contains(ring.getAtomByLocantOrThrow(numericalLocantsOnRing.get(i)))){ notInPotentialParentAtoms =true; } } if (!notInPotentialParentAtoms){ if (fusionAtoms!=null && expectedNumberOfAtomsToBeUsedForFusion ==null){ //prefer less fusion atoms if (potentialFusionAtomsDescending.size()< fusionAtoms.size()){ fusionAtoms = potentialFusionAtomsDescending; } } else{ fusionAtoms = potentialFusionAtomsDescending; } } } } return fusionAtoms; } /** * Creates the bonds required to fuse two rings together. 
* The child atoms are recorded as atoms that should be removed later * @param childAtoms * @param parentAtoms * @throws StructureBuildingException */ private void fuseRings(List childAtoms, List parentAtoms) throws StructureBuildingException { if (parentAtoms.size()!=childAtoms.size()){ throw new StructureBuildingException("Problem with fusion descriptors: Parent atoms specified: " + parentAtoms.size() +" Child atoms specified: " + childAtoms.size() + " These should have been identical!"); } //replace parent atoms if the atom has already been used in fusion with the original atom //This will occur if fusion has resulted in something resembling a spiro centre e.g. cyclopenta[1,2-b:5,1-b']bis[1,4]oxathiine for (int i = parentAtoms.size() -1; i >=0; i--) { if (atomsToRemoveToReplacementAtom.get(parentAtoms.get(i))!=null){ parentAtoms.set(i, atomsToRemoveToReplacementAtom.get(parentAtoms.get(i))); } if (atomsToRemoveToReplacementAtom.get(childAtoms.get(i))!=null){ childAtoms.set(i, atomsToRemoveToReplacementAtom.get(childAtoms.get(i))); } } //sync spareValency and check that element type matches for (int i = 0; i < childAtoms.size(); i++) { Atom parentAtom = parentAtoms.get(i); Atom childAtom = childAtoms.get(i); if (childAtom.hasSpareValency()){ parentAtom.setSpareValency(true); } if (parentAtom.getElement() != childAtom.getElement()){ throw new StructureBuildingException("Invalid fusion descriptor: Heteroatom placement is ambiguous as it is not present in both components of the fusion"); } atomsToRemoveToReplacementAtom.put(childAtom, parentAtom); } Set fusionEdgeBonds = new HashSet<>();//these bonds already exist in both the child and parent atoms for (int i = 0; i < childAtoms.size() -1; i++) { fusionEdgeBonds.add(childAtoms.get(i).getBondToAtomOrThrow(childAtoms.get(i+1))); fusionEdgeBonds.add(parentAtoms.get(i).getBondToAtomOrThrow(parentAtoms.get(i+1))); } Set bondsToAddToParentAtoms = new LinkedHashSet<>(); for (Atom childAtom : childAtoms) { for (Bond b : 
childAtom.getBonds()) { if (!fusionEdgeBonds.contains(b)){ bondsToAddToParentAtoms.add(b); } } } Set bondsToAddToChildAtoms = new LinkedHashSet<>(); for (Atom parentAtom : parentAtoms) { for (Bond b : parentAtom.getBonds()) { if (!fusionEdgeBonds.contains(b)){ bondsToAddToChildAtoms.add(b); } } } for (Bond bond : bondsToAddToParentAtoms) { Atom from = bond.getFromAtom(); int indiceInChildAtoms = childAtoms.indexOf(from); if (indiceInChildAtoms !=-1){ from = parentAtoms.get(indiceInChildAtoms); } Atom to = bond.getToAtom(); indiceInChildAtoms = childAtoms.indexOf(to); if (indiceInChildAtoms !=-1){ to = parentAtoms.get(indiceInChildAtoms); } state.fragManager.createBond(from, to, 1); } for (Bond bond : bondsToAddToChildAtoms) { Atom from = bond.getFromAtom(); int indiceInParentAtoms = parentAtoms.indexOf(from); if (indiceInParentAtoms !=-1){ from = childAtoms.get(indiceInParentAtoms); } Atom to = bond.getToAtom(); indiceInParentAtoms = parentAtoms.indexOf(to); if (indiceInParentAtoms !=-1){ to = childAtoms.get(indiceInParentAtoms); } Bond newBond = new Bond(from, to, 1); if (childAtoms.contains(from)){ from.addBond(newBond); } else{ to.addBond(newBond); } } } /** * Fuse the benzo with the subsequent ring * Uses locants in front of the benz/benzo group to assign heteroatoms on the now numbered fused ring system * @param benzoEl * @param parentEl * @throws StructureBuildingException */ private void benzoSpecificFusion(Element benzoEl, Element parentEl) throws StructureBuildingException { /* * Perform the fusion, number it and associate it with the parentEl */ Fragment benzoRing = benzoEl.getFrag(); Fragment parentRing = parentEl.getFrag(); performSimpleFusion(null, benzoRing , parentRing); state.fragManager.incorporateFragment(benzoRing, parentRing); removeMergedAtoms(); FusedRingNumberer.numberFusedRing(parentRing);//numbers the fused ring; Fragment fusedRing =parentRing; setBenzoHeteroatomPositioning(benzoEl, fusedRing); } /** * Checks for locant(s) before benzo and 
uses these to set the positions of the heteroatoms in the fused ring
	 * @param benzoEl
	 * @param fusedRing
	 * @throws StructureBuildingException
	 */
	private void setBenzoHeteroatomPositioning(Element benzoEl, Fragment fusedRing) throws StructureBuildingException {
		Element locantEl = OpsinTools.getPreviousSibling(benzoEl);
		if (locantEl != null && locantEl.getName().equals(LOCANT_EL)) {
			String[] locants = locantEl.getValue().split(",");
			if (locantsCouldApplyToHeteroatomPositions(locants, benzoEl)) {
				List<Atom> atomList = fusedRing.getAtomList();
				List<Atom> heteroatoms = new ArrayList<>();
				List<ChemEl> elementOfHeteroAtom = new ArrayList<>();
				for (Atom atom : atomList) {//this iterates in the same order as the numbering system
					if (atom.getElement() != ChemEl.C){
						heteroatoms.add(atom);
						elementOfHeteroAtom.add(atom.getElement());
					}
				}
				if (locants.length == heteroatoms.size()){//as many locants as there are heteroatoms to assign
					//check for special case of a single locant indicating where the group substitutes e.g. 4-benzofuran-2-yl
					if (!(locants.length == 1 && OpsinTools.getPreviousSibling(locantEl) == null && ComponentProcessor.checkLocantPresentOnPotentialRoot(state, benzoEl.getParent(), locants[0]))) {
						for (Atom atom : heteroatoms) {
							atom.setElement(ChemEl.C);
						}
						for (int i=0; i< heteroatoms.size(); i++) {
							fusedRing.getAtomByLocantOrThrow(locants[i]).setElement(elementOfHeteroAtom.get(i));
						}
						locantEl.detach();
					}
				}
				else if (locants.length > 1){
					throw new StructureBuildingException("Unable to assign all locants to benzo-fused ring or multiplier was missing");
				}
			}
		}
	}

	private boolean locantsCouldApplyToHeteroatomPositions(String[] locants, Element benzoEl) {
		if (!locantsAreAllNumeric(locants)) {
			return false;
		}
		List<Element> suffixes = benzoEl.getParent().getChildElements(SUFFIX_EL);
		int suffixesWithoutLocants = 0;
		for (Element suffix : suffixes) {
			if (suffix.getAttribute(LOCANT_ATR)==null){
				suffixesWithoutLocants++;
			}
		}
		if (locants.length == suffixesWithoutLocants){//In preference locants will be assigned to suffixes rather than to this nomenclature
			return false;
		}
		return true;
	}

	private boolean locantsAreAllNumeric(String[] locants) {
		for (String locant : locants) {
			if (!MATCH_NUMERIC_LOCANT.matcher(locant).matches()){
				return false;
			}
		}
		return true;
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/FusedRingNumberer.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

/**
 * Numbers fusedRings
 * @author aa593
 * @author dl387
 *
 */
class FusedRingNumberer {

	private static final Logger LOG = LogManager.getLogger(FusedRingNumberer.class);

	private static class RingConnectivityTable {
		final List<RingShape> ringShapes = new ArrayList<>();
		final List<Ring> neighbouringRings = new ArrayList<>();
		final List<Integer> directionFromRingToNeighbouringRing = new ArrayList<>();
		final List<Ring> usedRings = new ArrayList<>();

		RingConnectivityTable copy(){
			RingConnectivityTable copy = new RingConnectivityTable();
			copy.ringShapes.addAll(ringShapes);
			copy.neighbouringRings.addAll(neighbouringRings);
			copy.directionFromRingToNeighbouringRing.addAll(directionFromRingToNeighbouringRing);
			copy.usedRings.addAll(usedRings);
			return copy;
		}
	}

	/**
	 * Wrapper for a ring of a fused ring system with the shape that ring is currently being treated as having
	 * @author dl387
	 *
	 */
	private static class RingShape{
		private final Ring ring;
		private final FusionRingShape shape;
		public RingShape(Ring ring, FusionRingShape shape) {
			this.ring = ring;
			this.shape = shape;
		}
		Ring getRing() {
			return ring;
		}
		FusionRingShape getShape() {
			return shape;
		}
	}

	enum FusionRingShape {
		enterFromLeftHouse,//5 membered ring
		enterFromTopLeftHouse,//5 membered ring
		enterFromTopRightHouse,//5 membered ring
		enterFromRightHouse,//5 membered ring
		enterFromLeftSevenMembered,//7 membered ring
		enterFromTopSevenMembered,//7 membered ring
		enterFromRightSevenMembered,//7 membered ring
		enterFromBottomRightSevenMembered,//7 membered ring
		enterFromBottomLeftSevenMembered,//7 membered ring
		standard
	}

	private static class Chain {
		private final int length;
		private final int startingX;
		private final int y;

		Chain(int length, int startingX, int y) {
			this.length = length;
			this.startingX = startingX;
			this.y = y;
		}

		int getLength() {
			return length;
		}
		int getStartingX() {
			return startingX;
		}
		int getY() {
			return y;
		}
	}

	/**
	 * Sorts atomSequences by the IUPAC rules for determining the preferred labelling
	 * The most preferred will be sorted to the front (0th position)
	 * @author dl387
	 *
	 */
	private static class SortAtomSequences implements Comparator<List<Atom>> {

		public int compare(List<Atom> sequenceA, List<Atom> sequenceB){
			if (sequenceA.size() != sequenceB.size()){
				//Error in fused ring building. Identified ring sequences not the same lengths!
				return 0;
			}

			int i=0;
			int j=0;
			//Give low numbers for the heteroatoms as a set.
while(i < sequenceA.size()){ Atom atomA=sequenceA.get(i); boolean isAaHeteroatom = atomA.getElement() != ChemEl.C; //bridgehead carbon do not increment numbering if (!isAaHeteroatom && atomA.getBondCount()>=3){ i++; continue; } Atom atomB=sequenceB.get(j); boolean isBaHeteroatom =atomB.getElement() != ChemEl.C; if (!isBaHeteroatom && atomB.getBondCount()>=3){ j++; continue; } if (isAaHeteroatom && !isBaHeteroatom){ return -1; } if (isBaHeteroatom && !isAaHeteroatom){ return 1; } i++;j++; } i=0; j=0; //Give low numbers for heteroatoms when considered in the order: O, S, Se, Te, N, P, As, Sb, Bi, Si, Ge, Sn, Pb, B, Hg while(i < sequenceA.size()){ Atom atomA=sequenceA.get(i); //bridgehead carbon do not increment numbering if (atomA.getElement() == ChemEl.C && atomA.getBondCount()>=3){ i++; continue; } Atom atomB=sequenceB.get(j); if (atomB.getElement() == ChemEl.C && atomB.getBondCount()>=3){ j++; continue; } Integer heteroAtomPriorityA = heteroAtomValues.get(atomA.getElement()); int atomAElementValue = heteroAtomPriorityA != null ? heteroAtomPriorityA : 0; Integer heteroAtomPriorityB = heteroAtomValues.get(atomB.getElement()); int atomBElementValue = heteroAtomPriorityB != null ? heteroAtomPriorityB : 0; if (atomAElementValue > atomBElementValue){ return -1; } if (atomAElementValue < atomBElementValue){ return 1; } i++;j++; } //Give low numbers to fusion carbon atoms. for ( i = 0; i < sequenceA.size(); i++) { Atom atomA=sequenceA.get(i); Atom atomB=sequenceB.get(i); if (atomA.getBondCount()>=3 && atomA.getElement() == ChemEl.C){ if (!(atomB.getBondCount()>=3 && atomB.getElement() == ChemEl.C)){ return -1; } } if (atomB.getBondCount()>=3 && atomB.getElement() == ChemEl.C){ if (!(atomA.getBondCount()>=3 && atomA.getElement() == ChemEl.C)){ return 1; } } } //Note that any sequences still unsorted at this step will have fusion carbon atoms in the same places //which means you can go through both sequences without constantly looking for fusion carbons i.e. 
			//the variable j is no longer needed

			//Give low numbers to fusion rather than non-fusion atoms of the same heteroelement.
			for (i = 0; i < sequenceA.size(); i++) {
				Atom atomA=sequenceA.get(i);
				Atom atomB=sequenceB.get(i);
				if (atomA.getBondCount()>=3){
					if (!(atomB.getBondCount()>=3)){
						return -1;
					}
				}
				if (atomB.getBondCount()>=3){
					if (!(atomA.getBondCount()>=3)){
						return 1;
					}
				}
			}
			//TODO consider heteroatoms FR5.4d
			return 0;
		}
	}

	private static final Map<ChemEl, Integer> heteroAtomValues = new EnumMap<>(ChemEl.class);
	static{
		//unknown heteroatoms or carbon are given a value of 0
		heteroAtomValues.put(ChemEl.Hg, 2);
		heteroAtomValues.put(ChemEl.Tl, 3);
		heteroAtomValues.put(ChemEl.In, 4);
		heteroAtomValues.put(ChemEl.Ga, 5);
		heteroAtomValues.put(ChemEl.Al, 6);
		heteroAtomValues.put(ChemEl.B, 7);
		heteroAtomValues.put(ChemEl.Pb, 8);
		heteroAtomValues.put(ChemEl.Sn, 9);
		heteroAtomValues.put(ChemEl.Ge, 10);
		heteroAtomValues.put(ChemEl.Si, 11);
		heteroAtomValues.put(ChemEl.Bi, 12);
		heteroAtomValues.put(ChemEl.Sb, 13);
		heteroAtomValues.put(ChemEl.As, 14);
		heteroAtomValues.put(ChemEl.P, 15);
		heteroAtomValues.put(ChemEl.N, 16);
		heteroAtomValues.put(ChemEl.Te, 17);
		heteroAtomValues.put(ChemEl.Se, 18);
		heteroAtomValues.put(ChemEl.S, 19);
		heteroAtomValues.put(ChemEl.O, 20);
		heteroAtomValues.put(ChemEl.I, 21);
		heteroAtomValues.put(ChemEl.Br, 22);
		heteroAtomValues.put(ChemEl.Cl, 23);
		heteroAtomValues.put(ChemEl.F, 24);
	}

	/*
	 * The meaning of the integers used is as follows:
	 *        2
	 *    3   ^   1
	 *      \ | /
	 * +-4 <-   -> 0
	 *      / | \
	 *   -3   v   -1
	 *       -2
	 *
	 * They indicate the relative directions between rings
	 * Possibly enums should be used...
	 */

	/**
	 * Numbers the fused ring
	 * Works reliably for all common ring systems.
* Some complex fused ring systems involving multiple connections to rings with an odd number of edges may still be wrong * @param fusedRing * @throws StructureBuildingException */ static void numberFusedRing(Fragment fusedRing) throws StructureBuildingException { List rings = SSSRFinder.getSetOfSmallestRings(fusedRing); if (rings.size() <2) { throw new StructureBuildingException("Ring perception system found less than 2 rings within input fragment!"); } List atomList = fusedRing.getAtomList(); setupAdjacentFusedRingProperties(rings); if (!checkRingApplicability(rings)) { for (Atom atom : atomList) { atom.clearLocants(); } return; } List> atomSequences = determinePossiblePeripheryAtomOrders(rings, atomList.size()); if (atomSequences.isEmpty()){ for (Atom atom : atomList) { atom.clearLocants(); } return; } // add missing atoms to each path for (List path : atomSequences) {//TODO properly support interior atom labelling for(Atom atom : atomList) { if(!path.contains(atom)) { path.add(atom); } } } // find the preferred numbering scheme then relabel with this scheme Collections.sort(atomSequences, new SortAtomSequences()); FragmentTools.relabelLocantsAsFusedRingSystem(atomSequences.get(0)); fusedRing.reorderAtomCollection(atomSequences.get(0)); } /** * Populates rings with their neighbouring fused rings and the bonds involved * @param rings */ static void setupAdjacentFusedRingProperties(List rings){ for (int i = 0, l = rings.size(); i < l; i++) { Ring curRing = rings.get(i); bondLoop : for (Bond bond : curRing.getBondList()) { // go through all the bonds for the current ring for (int j = i + 1; j < l; j++) { Ring otherRing = rings.get(j); if (otherRing.getBondList().contains(bond)) { // check if this bond belongs to any other ring otherRing.addNeighbour(bond, curRing); curRing.addNeighbour(bond, otherRing); // if so, then associate the bond with the adjacent ring continue bondLoop; } } } } } /** * Checks that all the rings are of sizes 3-8 or if larger than 8 are 
involved in 2 or fewer fused bonds * @param rings * @return */ private static boolean checkRingApplicability(List rings) { for (Ring ring : rings) { if (ring.size() <=2){ throw new RuntimeException("Invalid ring size: " +ring.size()); } if (ring.size() >8 && ring.getNumberOfFusedBonds() > 2){ return false; } } return true; } /** * Returns possible enumerations of atoms. Currently Interior atoms are not considered. * These enumerations will be compliant with rules FR5.1-FR5.3 of the fused ring nomenclature guidelines * http://www.chem.qmul.ac.uk/iupac/fusedring/FR51.html * @param rings * @param atomCountOfFusedRingSystem * @return * @throws StructureBuildingException */ private static List> determinePossiblePeripheryAtomOrders(List rings, int atomCountOfFusedRingSystem) throws StructureBuildingException { List tRings = findTerminalRings(rings); if (tRings.size()<1) { throw new RuntimeException("OPSIN bug: Unable to find a terminal ring in fused ring system"); } Ring tRing = tRings.get(0); Bond b1 = getStartingNonFusedBond(tRing); if(b1 == null) { throw new RuntimeException("OPSIN Bug: Non-fused bond from terminal ring not found"); } List cts = new ArrayList<>(); RingConnectivityTable startingCT = new RingConnectivityTable(); cts.add(startingCT); buildRingConnectionTables(tRing, null, 0, b1, b1.getFromAtom(), startingCT, cts); //The preference against fusion to elongated edges is built into the construction of the ring table /* FR 5.1.1/FR 5.1.2 Preferred shapes preferred to distorted shapes */ removeCTsWithDistortedRingShapes(cts); //TODO better implement the corner cases of FR 5.1.3-5.1.5 /* FR-5.2a. 
Maximum number of rings in a horizontal row */ Map> horizonalRowDirections = findLongestChainDirections(cts); List ringMaps = createRingMapsAlignedAlongGivenhorizonalRowDirections(horizonalRowDirections); /* FR-5.2b-d */ return findPossiblePaths(ringMaps, atomCountOfFusedRingSystem); } /** * Finds the rings with the minimum number of fused bonds * @param rings * @return */ private static List findTerminalRings(List rings) { List tRings = new ArrayList<>(); int minFusedBonds = Integer.MAX_VALUE; for (Ring ring : rings){ if (ring.getNumberOfFusedBonds() < minFusedBonds) { minFusedBonds = ring.getNumberOfFusedBonds(); } } for (Ring ring : rings){ if (ring.getNumberOfFusedBonds() == minFusedBonds) { tRings.add(ring); } } return tRings; } /** * Recursive function to create the connectivity table of the rings, for each connection includes both directions * @param currentRing * @param previousRing * @param previousDir * @param previousBond * @param atom * @param ct * @param cts * @return */ private static List buildRingConnectionTables(Ring currentRing, Ring previousRing, int previousDir, Bond previousBond, Atom atom, RingConnectivityTable ct, List cts) { // order atoms and bonds in the ring currentRing.makeCyclicLists(previousBond, atom); List generatedCts = new ArrayList<>(); List allowedShapes = getAllowedShapesForRing(currentRing, previousBond); if (allowedShapes.isEmpty()) { throw new RuntimeException("OPSIN limitation, unsupported ring size in fused ring numbering"); } ct.usedRings.add(currentRing); for (int i = allowedShapes.size() - 1; i >=0; i--) { FusionRingShape fusionRingShape = allowedShapes.get(i); RingConnectivityTable currentCT; if (i==0) { currentCT = ct; } else{ currentCT = ct.copy(); cts.add(currentCT); generatedCts.add(currentCT); } RingShape ringShape = new RingShape(currentRing, fusionRingShape); List ctsToExpand = new ArrayList<>(); ctsToExpand.add(currentCT);//all the cts to consider, the currentCT and generated clones for (Ring neighbourRing : 
currentRing.getNeighbours()) { //find the directions between the current ring and all neighbouring rings including the previous ring // this means that the direction to the previous ring will then be known in both directions // find direction Bond currentBond = findFusionBond(currentRing, neighbourRing); int dir = 0; if (neighbourRing == previousRing) { dir = getOppositeDirection(previousDir); } else { dir = calculateRingDirection(ringShape, previousBond, currentBond, previousDir); } //System.out.println(currentRing +"|" +neighbourRing +"|" +dir +"|" +(neighbourRing==previousRing)); // place into connectivity table, like graph, rings and their connection for (RingConnectivityTable ctToExpand : ctsToExpand) { ctToExpand.ringShapes.add(ringShape); ctToExpand.neighbouringRings.add(neighbourRing); ctToExpand.directionFromRingToNeighbouringRing.add(dir); } if (!currentCT.usedRings.contains(neighbourRing)) { List newCts = new ArrayList<>(); for (RingConnectivityTable ctToExpand : ctsToExpand) { Atom a = getAtomFromBond(currentRing, currentBond); List generatedDownStreamCts = buildRingConnectionTables(neighbourRing, currentRing, dir, currentBond, a, ctToExpand, cts); newCts.addAll(generatedDownStreamCts); } ctsToExpand.addAll(newCts); generatedCts.addAll(newCts); } } } return generatedCts; } /** * Returns the allowed shapes for the given ring. 
	 * The starting bond is required to ensure that elongated bonds do not unnecessarily correspond to fusions
	 * Currently only 5 membered rings are considered in multiple orientations but the same
	 * is probably required for 7+ member rings
	 * @param ring
	 * @param startingBond
	 * @return
	 */
	private static List<FusionRingShape> getAllowedShapesForRing(Ring ring, Bond startingBond) {
		List<FusionRingShape> allowedRingShapes = new ArrayList<>();
		int size = ring.size();
		if (size==5){
			List<Bond> fusedBonds = ring.getFusedBonds();
			int fusedBondCount = fusedBonds.size();
			if (fusedBondCount==1){
				allowedRingShapes.add(FusionRingShape.enterFromLeftHouse);
			}
			else if (fusedBondCount==2 || fusedBondCount==3 || fusedBondCount==4){
				List<Integer> distances = new ArrayList<>();//one distance is likely to be 0
				for (Bond fusedBond : fusedBonds) {
					distances.add(calculateDistanceBetweenBonds(startingBond, fusedBond, ring));
				}
				if (!distances.contains(1)){
					allowedRingShapes.add(FusionRingShape.enterFromLeftHouse);
				}
				if (!distances.contains(4)){
					allowedRingShapes.add(FusionRingShape.enterFromRightHouse);
				}

				if (!distances.contains(2)){
					allowedRingShapes.add(FusionRingShape.enterFromTopLeftHouse);
				}
				else if (!distances.contains(3)){
					allowedRingShapes.add(FusionRingShape.enterFromTopRightHouse);
				}
				allowedRingShapes = removeDegenerateRingShapes(allowedRingShapes, distances, 5);
			}
			else if (fusedBondCount==5){
				allowedRingShapes.add(FusionRingShape.enterFromLeftHouse);
				allowedRingShapes.add(FusionRingShape.enterFromRightHouse);
				//top left and top right are the same other than position of the elongated bond which will invariably be used anyway
				allowedRingShapes.add(FusionRingShape.enterFromTopLeftHouse);
			}
		}
		else if (size==7){
			List<Bond> fusedBonds = ring.getFusedBonds();
			int fusedBondCount = fusedBonds.size();
			if (fusedBondCount==1){
				allowedRingShapes.add(FusionRingShape.enterFromLeftSevenMembered);
			}
			else{
				List<Integer> distances = new ArrayList<>();//one distance is likely to be 0
				for (Bond fusedBond : fusedBonds) {
distances.add(calculateDistanceBetweenBonds(startingBond, fusedBond, ring)); } if (!distances.contains(4) && !distances.contains(6)){ allowedRingShapes.add(FusionRingShape.enterFromLeftSevenMembered); } if (!distances.contains(1) && !distances.contains(6)){ allowedRingShapes.add(FusionRingShape.enterFromTopSevenMembered); } if (!distances.contains(1) && !distances.contains(3)){ allowedRingShapes.add(FusionRingShape.enterFromRightSevenMembered); } if (!distances.contains(2) && !distances.contains(4)){ allowedRingShapes.add(FusionRingShape.enterFromBottomRightSevenMembered); } if (!distances.contains(3) && !distances.contains(5)){ allowedRingShapes.add(FusionRingShape.enterFromBottomLeftSevenMembered); } allowedRingShapes = removeDegenerateRingShapes(allowedRingShapes, distances, 7); } } else{ allowedRingShapes.add(FusionRingShape.standard); } return allowedRingShapes; } /** * Removes the ring shapes that for given distances have identical properties * @param allowedRingShapes * @param distances * @param ringSize */ private static List removeDegenerateRingShapes(List allowedRingShapes, List distances, int ringSize) { distances = new ArrayList<>(distances); distances.remove((Integer)0);//remove distance 0 if present, this invariably comes from the starting bond and is not of interest (and breaks getDirectionFromDist) for (int i = allowedRingShapes.size() - 1; i >=0; i--) { FusionRingShape shapeToConsiderRemoving = allowedRingShapes.get(i); for (int j = i - 1; j >=0; j--) { FusionRingShape shapeToCompareWith = allowedRingShapes.get(j); boolean foundDifference = false; for (Integer distance : distances) { if (getDirectionFromDist(shapeToConsiderRemoving, ringSize, distance) != getDirectionFromDist(shapeToCompareWith, ringSize, distance)){ foundDifference = true; break; } } if (!foundDifference){ allowedRingShapes.remove(i); break; } } } return allowedRingShapes; } /** * Calculates the direction of the next ring according to the distance between fusion bonds and the 
previous direction * @param ringShape * @param previousBond * @param currentBond * @param previousDir * @return */ private static int calculateRingDirection(RingShape ringShape, Bond previousBond, Bond currentBond, int previousDir) { // take the ring fused to one from the previous loop step Ring ring = ringShape.getRing(); if (ring.getCyclicBondList() == null ) { throw new RuntimeException("OPSIN bug: cyclic bond set should have already been populated"); } int dist = calculateDistanceBetweenBonds(previousBond, currentBond, ring); if (dist == 0) { throw new RuntimeException("OPSIN bug: Distance between bonds is equal to 0"); } int relativeDir = getDirectionFromDist(ringShape.getShape(), ring.size(), dist); return determineAbsoluteDirectionUsingPreviousDirection(ringShape.getShape(), ring.size(), relativeDir, previousDir); } /** * Given two bonds on a ring returns the distance (in bonds) between them * @param bond1 * @param bond2 * @param ring * @return */ private static int calculateDistanceBetweenBonds(Bond bond1, Bond bond2, Ring ring) { List cyclicBondList =ring.getCyclicBondList(); int previousBondIndice = cyclicBondList.indexOf(bond1); int currentBondIndice = cyclicBondList.indexOf(bond2); if (previousBondIndice==-1 || currentBondIndice==-1){ throw new RuntimeException("OPSIN bug: previous and current bond were not present in the cyclic bond list of the current ring"); } int ringSize =ring.size(); int dist = (ringSize + currentBondIndice - previousBondIndice) % ringSize; return dist; } /** * Uses the ring shape, the ring size and distance between the incoming and outgoing fused bond to determine * the relative direction between the entry point on the ring and the exit point * @param fusionRingShape * @param ringSize * @param dist * @return */ private static int getDirectionFromDist(FusionRingShape fusionRingShape, int ringSize, int dist) { int dir=0; if (ringSize == 3) { // 3 member ring if (dist == 1) { dir = -1; } else if (dist == 2) { dir = 1; } else throw 
new RuntimeException("Impossible distance between bonds for a 3 membered ring"); } else if (ringSize == 4) { // 4 member ring if (dist ==1) { dir = -2; } else if (dist == 2) { dir = 0; } else if (dist ==3) { dir = 2; } else throw new RuntimeException("Impossible distance between bonds for a 4 membered ring"); } else if (ringSize == 5) { // 5 member ring switch (fusionRingShape) { case enterFromLeftHouse: if (dist ==1){ dir = -2;//fusion to an elongated bond } else if (dist ==2){ dir = 0; } else if (dist ==3){ dir = 1; } else if (dist ==4){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 5 membered ring"); } break; case enterFromTopLeftHouse: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -1;//fusion to an elongated bond } else if (dist ==3){ dir = 1; } else if (dist ==4){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 5 membered ring"); } break; case enterFromTopRightHouse: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -1; } else if (dist ==3){ dir = 1;//fusion to an elongated bond } else if (dist ==4){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 5 membered ring"); } break; case enterFromRightHouse: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -1; } else if (dist ==3){ dir = 0; } else if (dist ==4){ dir = 2;//fusion to an elongated bond } else { throw new RuntimeException("Impossible distance between bonds for a 5 membered ring"); } break; default : throw new RuntimeException("OPSIN Bug: Unrecognised fusion ring shape for 5 membered ring"); } } else if (ringSize == 7) { // 7 member ring switch (fusionRingShape) { case enterFromLeftSevenMembered: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -1; } else if (dist ==3){ dir = 0; } else if (dist ==4){ dir = 1;//fusion to an abnormally angled bond } else if (dist ==5){ dir = 2; } else if (dist ==6){ dir = 3;//fusion to an abnormally angled bond } else { throw new 
RuntimeException("Impossible distance between bonds for a 7 membered ring"); } break; case enterFromTopSevenMembered: if (dist ==1){ dir = -3;//fusion to an abnormally angled bond } else if (dist ==2){ dir = -2; } else if (dist ==3){ dir = -1; } else if (dist ==4){ dir = 1; } else if (dist ==5){ dir = 2; } else if (dist ==6){ dir = 3;//fusion to an abnormally angled bond } else { throw new RuntimeException("Impossible distance between bonds for a 7 membered ring"); } break; case enterFromRightSevenMembered: if (dist ==1){ dir = -3;//fusion to an abnormally angled bond } else if (dist ==2){ dir = -2; } else if (dist ==3){ dir = -1;//fusion to an abnormally angled bond } else if (dist ==4){ dir = 0; } else if (dist ==5){ dir = 1; } else if (dist ==6){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 7 membered ring"); } break; case enterFromBottomRightSevenMembered: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -2;//fusion to an abnormally angled bond } else if (dist ==3){ dir = -1; } else if (dist ==4){ dir = 0;//fusion to an abnormally angled bond } else if (dist ==5){ dir = 1; } else if (dist ==6){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 7 membered ring"); } break; case enterFromBottomLeftSevenMembered: if (dist ==1){ dir = -3; } else if (dist ==2){ dir = -1; } else if (dist ==3){ dir = 0;//fusion to an abnormally angled bond } else if (dist ==4){ dir = 1; } else if (dist ==5){ dir = 2;//fusion to an abnormally angled bond } else if (dist ==6){ dir = 3; } else { throw new RuntimeException("Impossible distance between bonds for a 7 membered ring"); } break; default: throw new RuntimeException("OPSIN Bug: Unrecognised fusion ring shape for 7 membered ring"); } } else if (ringSize % 2 == 0) {//general case even number of atoms ring (a 6 membered ring or distortion of) if (dist == 1) { dir = -3; } else if (dist == ringSize-1) { dir = 3; } else { dir = dist - ringSize/2; if 
(Math.abs(dir) > 2 && ringSize >= 8){// 8 and more neighbours
					dir = -2 * Integer.signum(dir);
				}
			}
		}
		else {// general case odd number of atoms ring (distortion of an even numbered ring by insertion of one atom).
			if (dist == 1) {
				dir = -3;
			}
			else if (dist == ringSize/2 || dist == ringSize/2 + 1) {//0 in both cases as effectively we are using a different depiction of the ring system. See FR-5.1.1 (this is done to give the longest horizontal row)
				dir = 0;
			}
			else if (dist == ringSize-1) {
				dir = 3;
			}
			else if(dist < ringSize/2) {
				dir = -2;
			}
			else if(dist > ringSize/2+1) {
				dir = 2;
			}
			else{
				throw new RuntimeException("OPSIN Bug: Unable to determine direction between odd number of atoms ring and next ring");
			}
		}
		return dir;
	}

	private static void removeCTsWithDistortedRingShapes(List<RingConnectivityTable> cts) {
		Map<RingConnectivityTable, List<Integer>> ctToDistortedRings = new HashMap<>();
		for (RingConnectivityTable ct : cts) {
			List<Integer> distortedRingSizes = new ArrayList<>();
			ctToDistortedRings.put(ct, distortedRingSizes);
			List<RingShape> ringShapes = ct.ringShapes;
			for (int i = 0; i < ringShapes.size(); i++) {
				Ring r1 = ringShapes.get(i).getRing();
				Ring r2 = ct.neighbouringRings.get(i);
				for (int j = i +1; j < ringShapes.size(); j++) {
					if (ringShapes.get(j).getRing().equals(r2) && ct.neighbouringRings.get(j).equals(r1)){//look for the reverse entry in the ring connection table
						int expectedDir = getOppositeDirection(ct.directionFromRingToNeighbouringRing.get(i));
						if (expectedDir != ct.directionFromRingToNeighbouringRing.get(j)){
							distortedRingSizes.add(r2.size());
						}
					}
				}
			}
		}
		int minDistortedRings = Integer.MAX_VALUE;//find the minimum number of distorted rings
		for (List<Integer> distortedRingSizes : ctToDistortedRings.values()) {
			if (distortedRingSizes.size() < minDistortedRings){
				minDistortedRings = distortedRingSizes.size();
			}
		}
		for (int i = cts.size()-1; i>=0; i--) {
			if (ctToDistortedRings.get(cts.get(i)).size()>minDistortedRings){
				cts.remove(i);
			}
		}
	}

	/**
	 * Given a list of cts, finds the longest chain of rings in a line. This can be used to find a possible horizontal row.
	 * The output is a map between the connection tables and the directions which give the longest chains.
	 * Some cts may have no directions that give a chain of rings of this length.
	 *
	 * @param cts
	 * @return
	 */
	private static Map<RingConnectivityTable, List<Integer>> findLongestChainDirections(List<RingConnectivityTable> cts){
		Map<RingConnectivityTable, List<Integer>> horizonalRowDirections = new LinkedHashMap<>();
		int maxChain = 0;
		for (RingConnectivityTable ct : cts) {
			if (ct.ringShapes.size() != ct.neighbouringRings.size() || ct.neighbouringRings.size() != ct.directionFromRingToNeighbouringRing.size()) {
				throw new RuntimeException("OPSIN Bug: Sizes of arrays in fused ring numbering connection table are not equal");
			}
			int ctEntriesSize = ct.ringShapes.size();
			List<Integer> directions = new ArrayList<>();
			horizonalRowDirections.put(ct, directions);

			for (int i = 0; i < ctEntriesSize; i++) {
				Ring neighbour = ct.neighbouringRings.get(i);
				int curChain = 1;
				int curDir = ct.directionFromRingToNeighbouringRing.get(i);

				nextRingInChainLoop: for (int k = 0; k <= ct.usedRings.size(); k++) {//<= rather than < so buggy behaviour can be caught
					int indexOfNeighbour = indexOfCorrespondingRingshape(ct.ringShapes, neighbour);

					if (indexOfNeighbour >= 0) {
						for (int j = indexOfNeighbour; j < ctEntriesSize; j++) {
							if (ct.ringShapes.get(j).getRing() == neighbour && ct.directionFromRingToNeighbouringRing.get(j) == curDir) {
								curChain++;
								neighbour = ct.neighbouringRings.get(j);
								continue nextRingInChainLoop;
							}
						}
					}
					else{
						throw new RuntimeException("OPSIN bug: fused ring numbering: Ring missing from connection table");
					}
					if (curChain >= maxChain ) {
						int oDir = getOppositeDirection(curDir);
						if(curChain > maxChain){//new longest chain found
							for (List<Integer> previousDirections: horizonalRowDirections.values()) {
								previousDirections.clear();
							}
						}
						// if we have already seen this direction or its opposite, it is the same orientation
						if(curChain > maxChain || (!directions.contains(curDir) && !directions.contains(oDir))) {
							directions.add(curDir);
						}
						maxChain = curChain;
					}
break; } if (maxChain > ct.usedRings.size()){ throw new RuntimeException("OPSIN bug: fused ring layout contained a loop: more rings in a chain than there were rings!"); } } } return horizonalRowDirections; } /** * Given a list of ringShapes finds the indice of the ringShape corresponding to the given ring * returns -1 if this is not possible * @param ringShapes * @param ring * @return */ private static int indexOfCorrespondingRingshape(List ringShapes, Ring ring) { for (int i = 0; i < ringShapes.size(); i++) { if (ringShapes.get(i).getRing().equals(ring)){ return i; } } return -1; } /** * For each RingConnectivityTable and for each horizontal row direction creates a ringMap aligned along the given horizontal row direction * @param horizonalRowDirectionsMap * @return * @throws StructureBuildingException */ private static List createRingMapsAlignedAlongGivenhorizonalRowDirections(Map> horizonalRowDirectionsMap) throws StructureBuildingException { List ringMaps = new ArrayList<>(); for (Entry> entry : horizonalRowDirectionsMap.entrySet()) { RingConnectivityTable ct = entry.getKey(); if ( ct.ringShapes.size() != ct.neighbouringRings.size() || ct.neighbouringRings.size() != ct.directionFromRingToNeighbouringRing.size() || ct.ringShapes.size() <= 0) { throw new RuntimeException("OPSIN Bug: Sizes of arrays in fused ring numbering connection table are not equal"); } int ctEntriesSize = ct.ringShapes.size(); for (Integer horizonalRowDirection : entry.getValue()) { int[] directionFromRingToNeighbouringRing = new int[ctEntriesSize]; // turn the ring system such as to be aligned along the horizonalRowDirection for(int i=0; i> findPossiblePaths(List ringMaps, int atomCountOfFusedRingSystem){ List chainQs = new ArrayList<>(); List correspondingRingMap = new ArrayList<>(); for (Ring[][] ringMap : ringMaps) { List chains = findChainsOfMaximumLengthInHorizontalDir(ringMap); // For each chain count the number of rings in each quadrant for (Chain chain : chains) { int midChainXcoord 
= chain.getLength() + chain.getStartingX() - 1;//Remember the X axis is measured in 1/2s so don't need to 1/2 length Double[] qs = countQuadrants(ringMap, midChainXcoord, chain.getY()); chainQs.add(qs); correspondingRingMap.add(ringMap); } } /* * The quadrant numbers are as follows: * * 1 | 0 * ----+---- * 2 | 3 * * But at this stage it is not known what the mapping between these numbers and the/a preferred orientation of the structure is */ // order for each right corner candidates for each chain List> allowedUpperRightQuadrantsForEachChain =rulesBCD(chainQs); List> paths = new ArrayList<> (); for (int c=0; c < chainQs.size(); c++) { Ring[][] ringMap = correspondingRingMap.get(c); List allowedUpperRightQuadrants = allowedUpperRightQuadrantsForEachChain.get(c); for (Integer upperRightQuadrant : allowedUpperRightQuadrants) { Ring[][] qRingMap = transformQuadrantToUpperRightOfRingMap(ringMap, upperRightQuadrant); if (LOG.isTraceEnabled()){ debugRingMap(qRingMap); } boolean inverseAtoms = (upperRightQuadrant == 2 || upperRightQuadrant == 0); List peripheralAtomPath = orderAtoms(qRingMap, inverseAtoms, atomCountOfFusedRingSystem); paths.add(peripheralAtomPath); } } return paths; } private static Ring[][] generateRingMap(RingConnectivityTable ct, int[] directionFromRingToNeighbouringRing) { int ctEntriesSize = ct.ringShapes.size(); // Find max and min coordinates for ringMap // we put the first ring into takenRings to start with it in the connection table int nRings = ct.usedRings.size(); int[][] coordinates = new int[nRings][]; // correspondent to usedRings Ring[] takenRings = new Ring[nRings]; int takenRingsCnt = 0; int maxX = 0; int minX = 0; int maxY = 0; int minY = 0; takenRings[takenRingsCnt++] = ct.ringShapes.get(0).getRing(); coordinates[0] = new int[]{0,0}; // Go through the rings in a system // Find the rings connected to them and assign coordinates according to the direction // Each time we go to the ring, whose coordinates were already identified. 
for(int tr=0; tr= 0) { for (int j=indexOfCurrentRing; j< ctEntriesSize; j++) { if (ct.ringShapes.get(j).getRing() == currentRing) { Ring neighbour = ct.neighbouringRings.get(j); if (arrayContains(takenRings, neighbour)) { continue; } int[] newXY = new int[2]; newXY[0] = xy[0] + Math.round(2 * countDX(directionFromRingToNeighbouringRing[j])); newXY[1] = xy[1] + countDY(directionFromRingToNeighbouringRing[j]); if(takenRingsCnt > takenRings.length) { throw new RuntimeException("OPSIN Bug: Fused ring numbering bug"); } takenRings[takenRingsCnt] = neighbour; coordinates[takenRingsCnt] = newXY; takenRingsCnt++; if (newXY[0] > maxX){ maxX = newXY[0]; } else if (newXY[0] < minX) { minX = newXY[0]; } if (newXY[1] > maxY){ maxY = newXY[1]; } else if (newXY[1] < minY) { minY = newXY[1]; } } } } else{ throw new RuntimeException("OPSIN bug: fused ring numbering: Ring missing from connection table"); } } // the height and the width of the map int h = maxY - minY + 1; int w = maxX - minX + 1; Ring[][] ringMap = new Ring[w][h]; // Map rings using coordinates calculated in the previous step, and transform them according to found minX and minY int ix = -minX; int iy = -minY; if (ix >= w || iy >= h) { throw new RuntimeException("OPSIN Bug: Fused ring numbering bug, Coordinates have been calculated wrongly"); } int curX = 0; int curY = 0; for (int ti = 0; ti < takenRings.length; ti++){ int[] xy = coordinates[ti]; curX = xy[0] - minX; curY = xy[1] - minY; if(curX <0 || curX > w || curY < 0 || curY > h) { throw new RuntimeException("OPSIN Bug: Fused ring numbering bug, Coordinates have been calculated wrongly"); } if (ringMap[curX][curY] != null){ return null; } ringMap[curX][curY] = takenRings[ti]; } return ringMap; } /** * Finds all the chains of maximum length for the current direction * @param ringMap * @return */ private static List findChainsOfMaximumLengthInHorizontalDir(Ring[][] ringMap){ int w = ringMap.length; int h = ringMap[0].length; List chains = new ArrayList<>(); int 
maxChain = 0; int chain = 0; // Find the longest chain for (int j=0; j maxChain){ chains.clear(); maxChain = chain; } if(chain >= maxChain) { chains.add(new Chain(chain, i, j)); } i += 2*chain; } } } return chains; } /** * Counts number of rings in each quadrant * @param ringMap * @param midChainXcoord * @param yChain * @return */ private static Double[] countQuadrants(Ring[][] ringMap, int midChainXcoord, int yChain){ Double[] qs = new Double[4]; qs[0] = 0d; qs[1] = 0d; qs[2] = 0d; qs[3] = 0d; int w = ringMap.length; int h = ringMap[0].length; // Count rings in each quadrants for (int x=0; x yChain ) { qs[0]+=0.5; qs[1]+=0.5; } else if( x == midChainXcoord && y < yChain ) { qs[2]+=0.5; qs[3]+=0.5; } else if( x < midChainXcoord && y == yChain ) { qs[1]+=0.5; qs[2]+=0.5; } else if( x > midChainXcoord && y == yChain ) { qs[0]+=0.5; qs[3]+=0.5; } if (x==midChainXcoord && y==yChain ){ qs[0]+=0.25; qs[1]+=0.25; qs[2]+=0.25; qs[3]+=0.25; } } else if(x > midChainXcoord && y > yChain) { qs[0]++; } else if(x < midChainXcoord && y > yChain) { qs[1]++; } else if(x < midChainXcoord && y < yChain) { qs[2]++; } else if(x > midChainXcoord && y < yChain) { qs[3]++; } } } return qs; } /** * Applying rules FR5.2 B, C and D to the ring system. * Return a list of possible upper right quadrants for each chain given. A chain may have multiple possible upper right quadrants (due to symmetry) * or none if other chains can be shown to be preferable by application of the rules * @param chainQs - array with number of ring in each quadrant for each chain. */ private static List> rulesBCD(List chainQs) { List> possibleUpperRightQuadrantsForEachChain = new ArrayList<>(); int nChains = chainQs.size(); if (nChains==0){ throw new RuntimeException("OPSIN Bug: Fused ring numbering, no chains found?"); } // Rule B: Maximum number of rings in upper right quadrant. 
		// Upper right corner candidates (it is not at this stage known which quadrant is the upper right one)
		double qmax = 0;

		for (Double[] chainQ : chainQs) {
			for (int j = 0; j < 4; j++) {
				Double q = chainQ[j];
				if(q > qmax) {
					qmax = q;
				}
			}
		}

		for (Double[] chainQ : chainQs) {
			List<Integer> allowedUpperRightQuadrants = new ArrayList<>();
			for (int j = 0; j < 4; j++){
				if (chainQ[j] == qmax) {
					allowedUpperRightQuadrants.add(j);
				}
			}
			possibleUpperRightQuadrantsForEachChain.add(allowedUpperRightQuadrants);
		}

		// Rule C: Minimum number of rings in lower left quadrant
		double qmin = Double.MAX_VALUE;

		for (int c = 0; c < nChains; c++) {
			List<Integer> possibleUpperRightQuadrant = possibleUpperRightQuadrantsForEachChain.get(c);
			for (Integer upperRightQuad : possibleUpperRightQuadrant) {
				int qdiagonal = (upperRightQuad + 2) % 4;
				if (chainQs.get(c)[qdiagonal] < qmin){
					qmin = chainQs.get(c)[qdiagonal];
				}
			}
		}
		for (int c = 0; c < nChains; c++) {
			List<Integer> possibleUpperRightQuadrant = possibleUpperRightQuadrantsForEachChain.get(c);
			List<Integer> allowedUpperRightQuadrants = new ArrayList<>();
			for (Integer upperRightQuad : possibleUpperRightQuadrant) {
				int qdiagonal = (upperRightQuad + 2) % 4;
				if (chainQs.get(c)[qdiagonal]==qmin) {
					allowedUpperRightQuadrants.add(upperRightQuad);
				}
			}
			possibleUpperRightQuadrantsForEachChain.set(c, allowedUpperRightQuadrants);
		}

		// Rule D: Maximum number of rings above the horizontal row
		double rMax = 0;
		for (int c = 0; c < nChains; c++) {
			List<Integer> possibleUpperRightQuadrant = possibleUpperRightQuadrantsForEachChain.get(c);
			for (Integer upperRightQuad : possibleUpperRightQuadrant) {
				int upperLeftQuad;
				if (upperRightQuad % 2 == 0) {
					upperLeftQuad = upperRightQuad + 1;
				}
				else {
					upperLeftQuad = upperRightQuad - 1;
				}

				if (chainQs.get(c)[upperLeftQuad] + chainQs.get(c)[upperRightQuad] > rMax) {
					rMax = chainQs.get(c)[upperLeftQuad] + chainQs.get(c)[upperRightQuad];
				}
			}
		}
		for (int c = 0; c < nChains; c++) {
			List<Integer> possibleUpperRightQuadrant = possibleUpperRightQuadrantsForEachChain.get(c);
			List<Integer> allowedUpperRightQuadrants = new ArrayList<>();
			for (Integer upperRightQuad : possibleUpperRightQuadrant) {
				int upperLeftQuad;
				if (upperRightQuad % 2 == 0) {
					upperLeftQuad = upperRightQuad + 1;
				}
				else {
					upperLeftQuad = upperRightQuad - 1;
				}

				if (chainQs.get(c)[upperLeftQuad] + chainQs.get(c)[upperRightQuad] == rMax) {
					allowedUpperRightQuadrants.add(upperRightQuad);
				}
			}
			possibleUpperRightQuadrantsForEachChain.set(c, allowedUpperRightQuadrants);
		}
		return possibleUpperRightQuadrantsForEachChain;
	}

	/**
	 * Enumerates the peripheral atoms in a system in accordance with FR-5.3:
	 * First finds the uppermost right ring, takes the next neighbour in the clockwise direction, and so on until the starting atom is reached
	 * @param ringMap
	 * @param inverseAtoms The direction in which the periphery atoms should be enumerated. Anticlockwise by default
	 * @param atomCountOfFusedRingSystem
	 * @return
	 */
	private static List<Atom> orderAtoms(Ring[][] ringMap, boolean inverseAtoms, int atomCountOfFusedRingSystem){
		int w = ringMap.length;
		int h = ringMap[0].length;

		// find upper right ring
		Ring upperRightRing = null;
		for (int i=w-1; i>=0; i--) {
			if (ringMap[i][h-1] != null) {
				upperRightRing = ringMap[i][h-1];
				break;
			}
		}
		if (upperRightRing == null) {
			throw new RuntimeException("OPSIN Bug: Upper right ring not found when performing fused ring numbering");
		}
		List<Ring> visitedRings = new ArrayList<>();
		visitedRings.add(upperRightRing);
		while (isEntirelyFusionAtoms(upperRightRing)){//cf. cyclopropa[de]anthracene
			upperRightRing = findClockwiseRingFromUpperRightRing(ringMap, upperRightRing, visitedRings);
			if (upperRightRing==null){
				throw new RuntimeException("OPSIN Bug: Unable to find clockwise ring without fusion atoms");
			}
			visitedRings.add(upperRightRing);
		}

		Ring prevRing = findUpperLeftNeighbourOfUpperRightRing(ringMap, upperRightRing);
		Bond prevBond = findFusionBond(upperRightRing, prevRing);
		Bond nextBond = null;

		Ring currentRing = upperRightRing;
		Ring nextRing = null;
		List<Atom> atomPath = new ArrayList<>();
int count = 0; mainLoop: for (; count <= atomCountOfFusedRingSystem; count++) { int ringSize = currentRing.size(); int startingBondIndex = currentRing.getBondIndex(prevBond) ; List cyclicBonds = currentRing.getCyclicBondList(); List fusedBonds = currentRing.getFusedBonds(); if (!inverseAtoms) { for(int bondIndex = 0; bondIndex < ringSize; bondIndex++) { int i = (startingBondIndex + bondIndex + 1) % ringSize; // +1 because we start from the bond next to stBond and end with it // if this bond is fused then it indicates the next ring to move to Bond bond = cyclicBonds.get(i); if(fusedBonds.contains(bond)) { nextBond = bond; break; } } } else { for(int bondIndex = 0; bondIndex < ringSize; bondIndex++) { int i = (startingBondIndex - bondIndex -1 + ringSize) % ringSize; // -1 because we start from the bond next to stBond and end with it // if this bond is fused then it indicates the next ring to move to Bond bond = cyclicBonds.get(i); if(fusedBonds.contains(bond)) { nextBond = bond; break; } } } if (nextBond == null) { throw new RuntimeException("OPSIN Bug: None of the bonds from this ring were fused, but this is not possible "); } // next ring nextRing = currentRing.getNeighbourOfFusedBond(nextBond); int endNumber = currentRing.getBondIndex(nextBond) ; // Add atoms in order, considering inverse or not inverse if (!inverseAtoms) { // if distance between prev bond and cur bond = 1 (it means that fused bonds are next to each other) i.e. 
come under interior atom numbering // we don't add that atom, cause it was added already if ( (endNumber - startingBondIndex + ringSize) % ringSize != 1) { startingBondIndex = (startingBondIndex + 1) % ringSize; endNumber = (endNumber - 1 + ringSize ) % ringSize; if (startingBondIndex > endNumber) { endNumber += ringSize; } // start from the atom next to fusion for (int j = startingBondIndex; j <= endNumber; j++) { Atom atom = currentRing.getCyclicAtomList().get(j % ringSize); if (atomPath.contains(atom)) { break mainLoop; } atomPath.add(atom); } } } else { if ( ( startingBondIndex - endNumber + ringSize) % ringSize != 1) { startingBondIndex = (startingBondIndex - 2 + ringSize ) % ringSize; endNumber = endNumber % ringSize; if (startingBondIndex < endNumber) { startingBondIndex += ringSize; } for (int j = startingBondIndex; j >= endNumber; j-- ) { Atom atom = currentRing.getCyclicAtomList().get(j % ringSize); if (atomPath.contains(atom)) { break mainLoop; } atomPath.add(atom); } } } prevBond = nextBond; prevRing = currentRing; currentRing = nextRing; } if (count ==atomCountOfFusedRingSystem){ throw new RuntimeException("OPSIN Bug: Fused ring numbering may have been stuck in an infinite loop while enumerating peripheral numbering"); } return atomPath; } private static boolean isEntirelyFusionAtoms(Ring upperRightRing) { List atomList = upperRightRing.getAtomList(); for (Atom atom : atomList) { if (atom.getBondCount() < 3){ return false; } } return true; } /** * Finds the neighbour ring, which is the clockwise of the given ring. 
* @param ringMap * @param upperRightRing * @param visitedRings * @return */ private static Ring findClockwiseRingFromUpperRightRing (Ring[][] ringMap, Ring upperRightRing, List visitedRings){ Ring clockwiseRing = null; int maxX = 0; int maxY = 0; for (Ring ring : upperRightRing.getNeighbours()) { if (visitedRings.contains(ring)){ continue; } int xy[] = findRingPosition(ringMap, ring); if (xy==null) { throw new RuntimeException("OPSIN Bug: Ring not found in ringMap when performing fused ring numbering"); } if (xy[0] > maxX || xy[0] == maxX && xy[1] > maxY ) { maxX = xy[0]; maxY = xy[1]; clockwiseRing = ring; } } return clockwiseRing; } /** * Finds the neighbour ring, which is the uppermost and on the left side from the given ring. Used to find previous bond for the uppermost right ring, from which we start to enumerate * @param ringMap * @param upperRightRing * @return */ private static Ring findUpperLeftNeighbourOfUpperRightRing (Ring[][] ringMap, Ring upperRightRing){ Ring nRing = null; int minX = Integer.MAX_VALUE; int maxY = 0; for (Ring ring : upperRightRing.getNeighbours()) { // upper left would be previous ring int xy[] = findRingPosition(ringMap, ring); if (xy==null) { throw new RuntimeException("OPSIN Bug: Ring not found in ringMap when performing fused ring numbering"); } if (xy[1] > maxY || xy[1] == maxY && xy[0] < minX ) { minX = xy[0]; maxY = xy[1]; nRing = ring; } } return nRing; } /** * Finds the position(i,j) of the ring in the map * @param ringMap * @param ring * @return */ private static int[] findRingPosition(Ring[][] ringMap, Ring ring) { int w = ringMap.length; int h = ringMap[0].length; for(int i=0; i allBonds = new ArrayList<>(tRing.getBondList()); for (Bond fusedBond : tRing.getFusedBonds()) { List neighbouringBonds = fusedBond.getFromAtom().getBonds(); for (Bond bond : neighbouringBonds) { allBonds.remove(bond); } neighbouringBonds = fusedBond.getToAtom().getBonds(); for (Bond bond : neighbouringBonds) { allBonds.remove(bond); } } if 
(allBonds.size() > 0){ return allBonds.get(0); } for (Bond bond : tRing.getBondList()) { if(tRing.getNeighbourOfFusedBond(bond) == null){ // return a non-fused bond return bond; } } return null; } /** * Given the direction of the bond from ring1 to ring2, returns the opposite direction: from ring2 to ring1 * @param prevDir * @return */ static int getOppositeDirection(int prevDir) { int dir; if (prevDir == 0) { dir = 4; } else if (Math.abs(prevDir) == 4){ dir =0; } else if (Math.abs(prevDir) == 2){ dir = 2 * -1 * Integer.signum(prevDir); } else if (Math.abs(prevDir) == 1){ dir = 3 * -1 * Integer.signum(prevDir); } else {//prevDir will be +-3 dir = 1 * -1 * Integer.signum(prevDir); } return dir; } /** * Finds the atom connected to the bond, takes into account the order of the bonds and atoms in the ring * @param ring * @param curBond * @return */ private static Atom getAtomFromBond(Ring ring, Bond curBond) { if (ring.getCyclicBondList() == null) { throw new RuntimeException("The cyclic bond list should already have been generated"); } int bondIndice= ring.getCyclicBondList().indexOf(curBond); int atomIndice = ( bondIndice - 1 + ring.size() ) % ring.size(); return ring.getCyclicAtomList().get(atomIndice); } /** * Finds the fusion bond between 2 rings * @param r1 * @param r2 * @return */ private static Bond findFusionBond (Ring r1, Ring r2) { List b2 = r2.getBondList(); for(Bond bond : r1.getBondList()){ if (b2.contains(bond)) { return bond; } } return null; } /** * Counts delta x distance between previous and next rings * @param val * @return */ private static float countDX (int val) { float dX = 0; if (Math.abs(val) == 1) { dX += 0.5f; } else if (Math.abs(val) == 3) { dX -= 0.5f; } else if (Math.abs(val) == 0) { dX += 1f; } else if (Math.abs(val) == 4) { dX -= 1f; } return dX; } /** * Counts delta y distance (height) between previous and next rings * @param val * @return */ private static int countDY (int val) { int dY = 0; if (Math.abs(val) != 4) { if (val > 0) { dY 
= 1; } if (val < 0) { dY = -1; } } return dY; } /** * Take into account the previous direction to convert the given relative direction into a direction that is absolute for the fused ring system * @param fusionRingShape * @param ringSize * @param relativeDirection * @param previousDir * @return */ static int determineAbsoluteDirectionUsingPreviousDirection(FusionRingShape fusionRingShape, int ringSize, int relativeDirection, int previousDir){ int interimDirection; if (Math.abs(previousDir) == 4) { if (relativeDirection == 0) { interimDirection = 4; } else { interimDirection = relativeDirection + 4 * -1 * Integer.signum(relativeDirection); // if dir<0 we add 4, if dir>0 we add -4 } } else { interimDirection = relativeDirection + previousDir; } if (Math.abs(interimDirection) > 4) {// Added interimDirection = (8 - Math.abs(interimDirection)) * Integer.signum(interimDirection) * -1; } //TODO investigate this function and unit test /* Even numbered rings when angled do not have direction 2. * Almost true for 5 member except for corner case where fusion to elongated bond occurs */ if (Math.abs(interimDirection) == 2 && ((ringSize % 2 ==0) || ringSize==5 || ringSize==7)) { // if (one of them equal to 1 and another is equal to 3, we decrease absolute value and conserve the sign) if (Math.abs(relativeDirection)==1 && Math.abs(previousDir)==3 || Math.abs(relativeDirection)==3 && Math.abs(previousDir)==1) { interimDirection = 1 * Integer.signum(interimDirection); } // if both are equal to 1 else if(Math.abs(relativeDirection)==1 && Math.abs(previousDir)==1 ) { interimDirection = 3 * Integer.signum(interimDirection); } // if both are equal to 3 else if(Math.abs(relativeDirection)==3 && Math.abs(previousDir)==3 ) { interimDirection = 3 * Integer.signum(interimDirection); } // else it is correctly 2 } if (interimDirection == -4) { interimDirection = 4; } return interimDirection; } private static void debugRingMap(Ring[][] ringMap) { Ring[][] yxOrdered = new 
Ring[ringMap[0].length][ringMap.length];
		for (int x = 0; x < ringMap.length; x++) {
			Ring[] yRings = ringMap[x];
			for (int y = 0; y < yRings.length; y++) {
				yxOrdered[y][x] =yRings[y];
			}
		}
		for (int y = yxOrdered.length-1; y >=0 ; y--) {
			Ring[] xRings = yxOrdered[y];
			StringBuilder sb = new StringBuilder();
			for (Ring ring : xRings) {
				if (ring!=null){
					int size = ring.size();
					if (size>9){
						if (size==10){
							sb.append("0");
						}
						else if (size % 2 ==0){
							sb.append("2");
						}
						else{
							sb.append("1");
						}
					}
					else{
						sb.append(size);
					}
				}
				else{
					sb.append(" ");
				}
			}
			LOG.trace(sb.toString());
		}
		LOG.trace("#########");
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/GroupingEl.java000066400000000000000000000052431451751637500265760ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.List;

class GroupingEl extends Element{

	private final List<Element> children = new ArrayList<>();

	GroupingEl(String name) {
		super(name);
	}

	@Override
	void addChild(Element child) {
		child.setParent(this);
		children.add(child);
	}

	@Override
	Element copy() {
		GroupingEl copy = new GroupingEl(this.name);
		for (Element childEl : this.children) {
			Element newChild = childEl.copy();
			newChild.setParent(copy);
			copy.addChild(newChild);
		}
		for (int i = 0, len = this.attributes.size(); i < len; i++) {
			Attribute atr = this.attributes.get(i);
			copy.addAttribute(new Attribute(atr));
		}
		return copy;
	}

	@Override
	Element getChild(int index) {
		return children.get(index);
	}

	@Override
	int getChildCount() {
		return children.size();
	}

	@Override
	List<Element> getChildElements() {
		return new ArrayList<>(children);
	}

	@Override
	List<Element> getChildElements(String name) {
		List<Element> elements = new ArrayList<>(1);
		for (Element element : children) {
			if (element.name.equals(name)) {
				elements.add(element);
			}
		}
		return elements;
	}

	@Override
	Element getFirstChildElement(String name) {
		for (Element child : children) {
			if (child.getName().equals(name)) {
				return child;
			}
		}
		return null;
	}

	@Override
	Element getLastChildElement() {
		int childCount = children.size();
		return childCount > 0 ? children.get(childCount - 1) : null;
	}

	String getValue() {
		int childCount = getChildCount();
		if (childCount == 0) {
			return "";
		}
		StringBuilder result = new StringBuilder();
		for (int i = 0; i < childCount; i++) {
			result.append(children.get(i).getValue());
		}
		return result.toString();
	}

	@Override
	int indexOf(Element child) {
		return children.indexOf(child);
	}

	@Override
	void insertChild(Element child, int index) {
		child.setParent(this);
		children.add(index, child);
	}

	@Override
	boolean removeChild(Element child) {
		child.setParent(null);
		return children.remove(child);
	}

	@Override
	Element removeChild(int index) {
		Element removed = children.remove(index);
		removed.setParent(null);
		return removed;
	}

	@Override
	void replaceChild(Element oldChild, Element newChild) {
		int index = indexOf(oldChild);
		if (index == -1) {
			throw new RuntimeException("oldChild is not a child of this element.");
		}
		removeChild(index);
		insertChild(newChild, index);
	}

	void setValue(String text) {
		throw new UnsupportedOperationException("Token groups do not have a value");
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/IDManager.java000066400000000000000000000011021451751637500263000ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

/**A source of unique integers. Starts at 1 by default.
 *
 * @author ptc24
 *
 */
class IDManager {
	/**the last integer generated, or 0 at first*/
	private int currentID;

	int getCurrentID() {
		return currentID;
	}

	/**Initialises currentID at zero - will give 1 when first called */
	IDManager() {
		currentID = 0;
	}

	/**Generates a new, unique integer. This is one
	 * higher than the previous integer, or 1 if previously uncalled.
	 * @return The generated integer.
	 */
	int getNextID() {
		currentID += 1;
		return currentID;
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/IndentingXMLStreamWriter.java000066400000000000000000000025221451751637500313710ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

import org.codehaus.stax2.util.StreamWriterDelegate;

/**
 * This only overrides the commands actually used by the CmlWriter i.e. it isn't general
 */
class IndentingXMLStreamWriter extends StreamWriterDelegate {

	private final int indentSize;
	private int depth = 0;
	private boolean atStartOfNewline = false;

	IndentingXMLStreamWriter(XMLStreamWriter writer, int indentSize) {
		super(writer);
		this.indentSize = indentSize;
	}

	@Override
	public void writeStartElement(String arg0) throws XMLStreamException {
		if (!atStartOfNewline){
			super.writeCharacters(OpsinTools.NEWLINE);
		}
		super.writeCharacters(StringTools.multiplyString(" ", depth * indentSize));
		super.writeStartElement(arg0);
		atStartOfNewline = false;
		depth++;
	}

	@Override
	public void writeEndElement() throws XMLStreamException {
		depth--;
		if (atStartOfNewline) {
			super.writeCharacters(StringTools.multiplyString(" ", depth * indentSize));
		}
		super.writeEndElement();
		super.writeCharacters(OpsinTools.NEWLINE);
		atStartOfNewline = true;
	}

	@Override
	public void writeCharacters(String arg0) throws XMLStreamException {
		super.writeCharacters(arg0);
		atStartOfNewline = false;
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/IsotopeSpecificationParser.java000066400000000000000000000077621451751637500320330ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IsotopeSpecificationParser {

	private static final Pattern matchBoughtonIsotope =Pattern.compile("(?:-([^,]+(?:,[^,]+)*))?-((\\d+)([A-Z][a-z]?)|d)(\\d+)?");
	private static final Pattern matchIupacIsotope
=Pattern.compile("(?:([^,]+(?:,[^,]+)*)-)?(\\d+)([A-Z][a-z]?)(\\d+)?"); static class IsotopeSpecification { private final ChemEl chemEl; private final int isotope; private final int multiplier; private final String[] locants; IsotopeSpecification(ChemEl chemEl, int isotope, int multiplier, String[] locants) { this.chemEl = chemEl; this.isotope = isotope; this.multiplier = multiplier; this.locants = locants; } ChemEl getChemEl() { return chemEl; } int getIsotope() { return isotope; } int getMultiplier() { return multiplier; } String[] getLocants() { return locants; } } static IsotopeSpecification parseIsotopeSpecification(Element isotopeSpecification) throws StructureBuildingException { String type = isotopeSpecification.getAttributeValue(XmlDeclarations.TYPE_ATR); if (XmlDeclarations.BOUGHTONSYSTEM_TYPE_VAL.equals(type)) { return processBoughtonIsotope(isotopeSpecification); } else if (XmlDeclarations.IUPACSYSTEM_TYPE_VAL.equals(type)) { return processIupacIsotope(isotopeSpecification); } else { throw new RuntimeException("Unsupported isotope specification syntax"); } } private static IsotopeSpecification processBoughtonIsotope(Element isotopeSpecification) throws StructureBuildingException { String val = isotopeSpecification.getValue(); Matcher m = matchBoughtonIsotope.matcher(val); if (!m.matches()) { throw new RuntimeException("Malformed isotope specification: " + val); } int isotope; ChemEl chemEl; if (m.group(2).equals("d")) { isotope = 2; chemEl = ChemEl.H; } else { isotope = Integer.parseInt(m.group(3)); chemEl = ChemEl.valueOf(m.group(4)); } int multiplier = 1; String multiplierStr = m.group(5); if (multiplierStr != null) { multiplier = Integer.parseInt(multiplierStr); } String locantsStr = m.group(1); String[] locants = null; if(locantsStr != null) { locants = locantsStr.split(","); if (multiplierStr == null) { multiplier = locants.length; } else if (locants.length != multiplier) { throw new StructureBuildingException("Mismatch between number of locants: " 
+ locants.length + " and number of " + chemEl.toString() + " isotopes requested: " + multiplier); } for (int i = 0; i < locants.length; i++) { locants[i] = OpsinTools.fixLocantCapitalisation(locants[i]); } } return new IsotopeSpecification(chemEl, isotope, multiplier, locants); } private static IsotopeSpecification processIupacIsotope(Element isotopeSpecification) throws StructureBuildingException { String val = isotopeSpecification.getValue(); Matcher m = matchIupacIsotope.matcher(val); if (!m.matches()) { throw new RuntimeException("Malformed isotope specification: " + val); } int isotope = Integer.parseInt(m.group(2)); ChemEl chemEl = ChemEl.valueOf(m.group(3)); int multiplier = 1; String multiplierStr = m.group(4); if (multiplierStr != null) { multiplier = Integer.parseInt(multiplierStr); } String locantsStr = m.group(1); String[] locants = null; if(locantsStr != null) { locants = locantsStr.split(","); if (multiplierStr == null) { multiplier = locants.length; } else if (locants.length != multiplier) { throw new StructureBuildingException("Mismatch between number of locants: " + locants.length + " and number of " + chemEl.toString() +" isotopes requested: " + multiplier); } for (int i = 0; i < locants.length; i++) { locants[i] = OpsinTools.fixLocantCapitalisation(locants[i]); } } return new IsotopeSpecification(chemEl, isotope, multiplier, locants); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/NameToStructure.java000066400000000000000000000166101451751637500276270ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.InputStream; import java.util.Collections; import java.util.List; import java.util.Properties; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import uk.ac.cam.ch.wwmm.opsin.OpsinResult.OPSIN_RESULT_STATUS; /** The "master" class, to turn a name into a structure. 
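 *
 * Example usage (a minimal sketch that relies only on methods declared in this class and in OpsinResult;
 * the chemical name is an arbitrary illustration and further error handling is left to the caller):
 *   NameToStructure nts = NameToStructure.getInstance();
 *   OpsinResult result = nts.parseChemicalName("acetonitrile");
 *   if (result.getStatus() != OPSIN_RESULT_STATUS.FAILURE) {
 *       String smiles = result.getSmiles(); // may still be null if SMILES generation itself fails
 *   } else {
 *       String reason = result.getMessage();
 *   }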
* * @author ptc24 * @author dl387 */ public class NameToStructure { private static final Logger LOG = LogManager.getLogger(NameToStructure.class); /**Applies OPSIN's grammar to tokenise and assign meaning to tokens*/ private ParseRules parseRules; /**Parses a chemical name into one (or more in the case of ambiguity) parse trees*/ private Parser parser; /**Which suffixes apply to what and what their effects are*/ private SuffixRules suffixRules; private static NameToStructure NTS_INSTANCE; public static synchronized NameToStructure getInstance() { if (NTS_INSTANCE == null) { NTS_INSTANCE = new NameToStructure(); } return NTS_INSTANCE; } /** * Returns the version of the OPSIN library * @return Version number String */ public static String getVersion() { try(InputStream is = NameToStructure.class.getResourceAsStream("opsinbuild.props")) { Properties props = new Properties(); props.load(is); return props.getProperty("version"); } catch (Exception e) { return null; } } /**Initialises the name-to-structure converter. * * @throws NameToStructureException If the converter cannot be initialised, most likely due to bad or missing data files. */ private NameToStructure() { LOG.debug("Initialising OPSIN... "); try { /*Initialise all of OPSIN's classes. 
Some classes are injected as dependencies into subsequent classes*/ //Allows retrieving of OPSIN resources ResourceGetter resourceGetter = new ResourceGetter("uk/ac/cam/ch/wwmm/opsin/resources/"); ResourceManager resourceManager = new ResourceManager(resourceGetter); WordRules wordRules = new WordRules(resourceGetter); parseRules = new ParseRules(resourceManager); Tokeniser tokeniser = new Tokeniser(parseRules); parser = new Parser(wordRules, tokeniser, resourceManager); suffixRules = new SuffixRules(resourceGetter); } catch (Exception e) { throw new NameToStructureException(e.getMessage(), e); } LOG.debug("OPSIN initialised"); } /** * Convenience method for converting a name to CML with OPSIN's default options * @param name The chemical name to parse. * @return A CML element, containing the parsed molecule, or null if the name was uninterpretable. */ public String parseToCML(String name) { OpsinResult result = parseChemicalName(name); String cml = result.getCml(); if(cml != null && LOG.isDebugEnabled()){ LOG.debug(cml); } return cml; } /** * Convenience method for converting a name to SMILES with OPSIN's default options * @param name The chemical name to parse. * @return A SMILES string describing the parsed molecule, or null if the name was uninterpretable. */ public String parseToSmiles(String name) { OpsinResult result = parseChemicalName(name); String smiles = result.getSmiles(); LOG.debug(smiles); return smiles; } /**Parses a chemical name, returning an OpsinResult which represents the molecule. * This object contains in the status whether the name was parsed successfully * A message which may contain additional information if the status was warning/failure * The OpsinResult has methods to generate a SMILES or CML representation * For InChI, the OpsinResult should be given to the NameToInchi class * * @param name The chemical name to parse. 
* @return OpsinResult */ public OpsinResult parseChemicalName(String name) { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); return parseChemicalName(name, n2sConfig); } /**Parses a chemical name, returning an OpsinResult which represents the molecule. * This object contains in the status whether the name was parsed successfully * A message which may contain additional information if the status was warning/failure * CML and SMILES representations may be retrieved directly from the object * InChI may be generated using NameToInchi * * @param name The chemical name to parse. * @param n2sConfig Options to control how OPSIN interprets the name. * @return OpsinResult */ public OpsinResult parseChemicalName(String name, NameToStructureConfig n2sConfig) { if (name == null){ throw new IllegalArgumentException("String given for name was null"); } n2sConfig = n2sConfig.clone();//avoid n2sconfig being modified mid name processing List<Element> parses; try { LOG.debug(name); String modifiedName = PreProcessor.preProcess(name); parses = parser.parse(n2sConfig, modifiedName); Collections.sort(parses, new SortParses());//fewer tokens preferred } catch (Exception e) { if(LOG.isDebugEnabled()) { LOG.debug(e.getMessage(), e); } String message = e.getMessage() != null ? e.getMessage() : "exception with null message"; return new OpsinResult(null, OPSIN_RESULT_STATUS.FAILURE, message, name); } String reasonForFailure = ""; Fragment fragGeneratedWithWarning = null; List<OpsinWarning> warnings = Collections.emptyList(); for(Element parse : parses) { try { if (LOG.isDebugEnabled()) { LOG.debug(parse.toXML()); } //Performs XML manipulation e.g. nesting bracketing, processing some nomenclatures BuildState state = new BuildState(n2sConfig); new ComponentGenerator(state).processParse(parse); if (LOG.isDebugEnabled()) { LOG.debug(parse.toXML()); } //Converts the XML to fragments (handles many different nomenclatures for describing structure). 
Assigns locants new ComponentProcessor(state, new SuffixApplier(state, suffixRules)).processParse(parse); if (LOG.isDebugEnabled()) { LOG.debug(parse.toXML()); } //Constructs a single fragment from the fragments generated by the ComponentProcessor. Applies stereochemistry Fragment frag = new StructureBuilder(state).buildFragment(parse); if (LOG.isDebugEnabled()) { LOG.debug(parse.toXML()); } if (state.getWarnings().isEmpty()) { return new OpsinResult(frag, OPSIN_RESULT_STATUS.SUCCESS, "", name); } if (fragGeneratedWithWarning == null) { //record first frag that had a warning but try other parses as they may work without a warning fragGeneratedWithWarning = frag; warnings = state.getWarnings(); } } catch (Exception e) { if (reasonForFailure.length() == 0) { reasonForFailure = e.getMessage() != null ? e.getMessage() : "exception with null message"; } if (LOG.isDebugEnabled()) { LOG.debug(e.getMessage(), e); } } } if (fragGeneratedWithWarning != null) { return new OpsinResult(fragGeneratedWithWarning, OPSIN_RESULT_STATUS.WARNING, warnings, name); } return new OpsinResult(null, OPSIN_RESULT_STATUS.FAILURE, reasonForFailure, name); } /** * Returns an OPSIN parser * This can be used to determine whether a word can be interpreted as being part of a chemical name. * Just because a word can be split into tokens does not mean the word constitutes a valid chemical name * e.g. ester is interpretable but is not in itself a chemical name * @return Opsin parser for recognition/parsing of a chemical word */ public static ParseRules getOpsinParser() { NameToStructure n2s = NameToStructure.getInstance(); return n2s.parseRules; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/NameToStructureConfig.java000066400000000000000000000115121451751637500307510ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Allows OPSIN to be configured e.g. 
enable processing of radicals * Example usage: * NameToStructureConfig n2sConfig = new NameToStructureConfig(); * n2sConfig.setAllowRadicals(true); * nts.parseChemicalName(chemicalName, n2sConfig); * where nts is an instance of NameToStructure * @author dl387 * */ public class NameToStructureConfig implements Cloneable { // Fields set with default values private boolean allowRadicals = false; private boolean outputRadicalsAsWildCardAtoms = false; private boolean detailedFailureAnalysis = false; private boolean interpretAcidsWithoutTheWordAcid = false; private boolean warnRatherThanFailOnUninterpretableStereochemistry = false; /** * Constructs a NameToStructureConfig with default settings: * allowRadicals = false * outputRadicalsAsWildCardAtoms = false * detailedFailureAnalysis = false * interpretAcidsWithoutTheWordAcid = false * warnRatherThanFailOnUninterpretableStereochemistry = false */ public NameToStructureConfig() { } /** * Are radicals allowed? e.g. should fragments such as phenyl be interpretable * @return whether radicals are allowed */ public boolean isAllowRadicals() { return allowRadicals; } /** * Sets whether radicals are allowed e.g. should fragments such as phenyl be interpretable */ public void setAllowRadicals(boolean allowRadicals) { this.allowRadicals = allowRadicals; } /** * Are radicals output as wildcard atoms e.g. [*]CC for ethyl * @return whether radicals are output using explicit wildcard atoms */ public boolean isOutputRadicalsAsWildCardAtoms() { return outputRadicalsAsWildCardAtoms; } /** * Should radicals be output as wildcard atoms e.g. [*]CC for ethyl (as opposed to [CH2]C)
* Note that if this is set to true InChIs cannot be generated * @param outputRadicalsAsWildCardAtoms */ public void setOutputRadicalsAsWildCardAtoms(boolean outputRadicalsAsWildCardAtoms) { this.outputRadicalsAsWildCardAtoms = outputRadicalsAsWildCardAtoms; } /** * Should OPSIN attempt reverse parsing to more accurately determine why parsing failed * @return whether a more precise cause of failure should be determined if parsing fails */ public boolean isDetailedFailureAnalysis() { return detailedFailureAnalysis; } /** * Sets whether OPSIN should attempt reverse parsing to more accurately determine why parsing failed */ public void setDetailedFailureAnalysis(boolean detailedFailureAnalysis) { this.detailedFailureAnalysis = detailedFailureAnalysis; } /** * Are acids without the word "acid" interpretable e.g. should "acetic" be interpretable * @return whether acids without the word "acid" should be interpretable */ public boolean allowInterpretationOfAcidsWithoutTheWordAcid() { return interpretAcidsWithoutTheWordAcid; } /** * Sets whether acids without the word "acid" are interpretable e.g. 
should "acetic" be interpretable * @param interpretAcidsWithoutTheWordAcid */ public void setInterpretAcidsWithoutTheWordAcid(boolean interpretAcidsWithoutTheWordAcid) { this.interpretAcidsWithoutTheWordAcid = interpretAcidsWithoutTheWordAcid; } /** * If OPSIN cannot understand the stereochemistry in a name should OPSIN's result be a warning * and structure with incomplete stereochemistry, or should failure be returned (Default) * @return whether ignored stereochemistry is a warning (rather than a failure) */ public boolean warnRatherThanFailOnUninterpretableStereochemistry() { return warnRatherThanFailOnUninterpretableStereochemistry; } /** * Sets whether, when OPSIN cannot understand the stereochemistry in a name, OPSIN's result should be a warning * and structure with incomplete stereochemistry, or whether failure should be returned (Default) * @param warnRatherThanFailOnUninterpretableStereochemistry */ public void setWarnRatherThanFailOnUninterpretableStereochemistry(boolean warnRatherThanFailOnUninterpretableStereochemistry) { this.warnRatherThanFailOnUninterpretableStereochemistry = warnRatherThanFailOnUninterpretableStereochemistry; } /** * Constructs a NameToStructureConfig with default settings: * allowRadicals = false * outputRadicalsAsWildCardAtoms = false * detailedFailureAnalysis = false * interpretAcidsWithoutTheWordAcid = false * warnRatherThanFailOnUninterpretableStereochemistry = false */ public static NameToStructureConfig getDefaultConfigInstance() { return new NameToStructureConfig(); } @Override public NameToStructureConfig clone() { try { return (NameToStructureConfig) super.clone(); } catch (CloneNotSupportedException e) { // Can only be thrown if we *don't* implement Cloneable, which we do... 
throw new Error("Impossible!", e); } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/NameToStructureException.java000066400000000000000000000007461451751637500315110ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /**Thrown if OPSIN failed to initialise * * @author ptc24 * */ public class NameToStructureException extends RuntimeException { private static final long serialVersionUID = 1L; NameToStructureException() { super(); } NameToStructureException(String message) { super(message); } NameToStructureException(String message, Throwable cause) { super(message, cause); } NameToStructureException(Throwable cause) { super(cause); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/OpsinRadixTrie.java000066400000000000000000000120771451751637500274320ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; /** * A black/white radix tree implementation. * A radix tree is a type of trie where common prefixes are merged together to save space * This implementation employs short arrays rather than maps to exploit the fact that all OPSIN tokens are ASCII. * @author dl387 * */ class OpsinRadixTrie { final OpsinTrieNode rootNode; OpsinRadixTrie() { rootNode = new OpsinTrieNode("", false); } /** * Adds a string to the Trie. * This string should not contain any non ASCII characters * @param token */ void addToken(String token) { int tokenLength =token.length(); String remaingStr =token; OpsinTrieNode currentNode = rootNode; for (int i = 0; i < tokenLength;) { int charsMatched = currentNode.getNumberOfMatchingCharacters(remaingStr, 0); remaingStr = remaingStr.substring(charsMatched); i+=charsMatched; currentNode = currentNode.add(remaingStr, charsMatched); } currentNode.setIsEndPoint(true); } /** * Returns all possible runs of the input string that reached end point nodes in the trie * e.g. 
ylidene might return 2 ("yl"), 6 ("yliden") and 7 ("ylidene") * Results are given as the index of the end of the match in the chemicalName * Returns null if no runs were possible * @param chemicalName * @param posInName The point at which to start matching * @return */ List findMatches(String chemicalName, int posInName) { int untokenisedChemicalNameLength = chemicalName.length(); List indexes = null; if (rootNode.isEndPoint()) { indexes = new ArrayList<>(); indexes.add(posInName); } OpsinTrieNode node = rootNode; for (int i = posInName; i < untokenisedChemicalNameLength; i++) { node = node.getChild(chemicalName.charAt(i)); if (node == null) { break; } int nodeLength = node.getValue().length(); if (nodeLength > 1) { int charsMatched = node.getNumberOfMatchingCharacters(chemicalName, i); if (charsMatched != nodeLength) { break; } i += (charsMatched - 1); } if (node.isEndPoint()) { if (indexes == null) { indexes = new ArrayList<>(); } indexes.add(i + 1); } } return indexes; } /** * Same as findMatches but the trie has been populated by reversed tokens * @param chemicalName * @param posInName The index after the first character to start matching * @return */ List findMatchesReadingStringRightToLeft(String chemicalName, int posInName ) { List indexes = null; if (rootNode.isEndPoint()) { indexes = new ArrayList<>(); indexes.add(posInName); } OpsinTrieNode node = rootNode; for (int i = posInName - 1; i >=0; i--) { node = node.getChild(chemicalName.charAt(i)); if (node == null) { break; } int nodeLength = node.getValue().length(); if (nodeLength > 1) { int charsMatched = node.getNumberOfMatchingCharactersInReverse(chemicalName, i); if (charsMatched != nodeLength) { break; } i -= (charsMatched - 1); } if (node.isEndPoint()) { if (indexes == null) { indexes = new ArrayList<>(); } indexes.add(i); } } return indexes; } } class OpsinTrieNode { private boolean isEndPoint; private String key; private OpsinTrieNode[] children = new OpsinTrieNode[128]; OpsinTrieNode(String key, 
boolean isEndPoint) { this.isEndPoint = isEndPoint; this.key = key; } String getValue() { return key; } boolean isEndPoint() { return isEndPoint; } void setIsEndPoint(boolean isEndPoint) { this.isEndPoint = isEndPoint; } private void setChildren(OpsinTrieNode[] children) { this.children = children; } OpsinTrieNode add(String remaingStr, int charsMatched) { if (charsMatched < key.length()){//need to split this Trie node OpsinTrieNode newNode = new OpsinTrieNode(key.substring(charsMatched), isEndPoint); newNode.setChildren(children); children = new OpsinTrieNode[128]; children[key.charAt(charsMatched)] = newNode; key = key.substring(0, charsMatched); isEndPoint =false; } if (remaingStr.length()!=0){ int charValue = (int) remaingStr.charAt(0); if (children[charValue] == null) { children[charValue] = new OpsinTrieNode(remaingStr, false); } return children[charValue]; } return this; } int getNumberOfMatchingCharacters(String chemicalName, int posInName) { int maxLength = Math.min(key.length(), chemicalName.length() - posInName); for (int i = 0; i < maxLength; i++) { if (key.charAt(i) != chemicalName.charAt(posInName + i)){ return i; } } return maxLength; } int getNumberOfMatchingCharactersInReverse(String chemicalName, int posInName) { int maxLength = Math.min(key.length(), posInName + 1); for (int i = 0; i < maxLength; i++) { if (key.charAt(i) != chemicalName.charAt(posInName - i)){ return i; } } return maxLength; } OpsinTrieNode getChild(char c) { return children[(int) c]; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/OpsinResult.java000066400000000000000000000162301451751637500270100ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.Collections; import java.util.List; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import uk.ac.cam.ch.wwmm.opsin.OpsinWarning.OpsinWarningType; /** * Holds the structure OPSIN has generated from a name * Additionally holds a status code for whether name 
interpretation was successful * @author dl387 * */ public class OpsinResult { private static final Logger LOG = LogManager.getLogger(OpsinResult.class); private final Fragment structure; private final OPSIN_RESULT_STATUS status; private final String message; private final String chemicalName; private final List<OpsinWarning> warnings; /** * Whether parsing the chemical name was successful, encountered problems or was unsuccessful.
* If the result is not {@link OPSIN_RESULT_STATUS#FAILURE} then a structure has been generated * @author dl387 * */ public enum OPSIN_RESULT_STATUS{ /** * OPSIN successfully interpreted the name */ SUCCESS, /** * OPSIN interpreted the name but detected a potential problem e.g. could not interpret stereochemistry
* Currently, by default, WARNING is not used as stereochemistry failures are treated as failures
* In the future, ambiguous chemical names may produce WARNING */ WARNING, /** * OPSIN failed to interpret the name */ FAILURE } OpsinResult(Fragment frag, OPSIN_RESULT_STATUS status, List<OpsinWarning> warnings, String chemicalName) { this.structure = frag; this.status = status; StringBuilder sb = new StringBuilder(); for (int i = 0, l = warnings.size(); i < l; i++) { OpsinWarning warning = warnings.get(i); sb.append(warning.getType().toString()); sb.append(": "); sb.append(warning.getMessage()); if (i + 1 < l){ sb.append("; "); } } this.message = sb.toString(); this.chemicalName = chemicalName; this.warnings = warnings; } OpsinResult(Fragment frag, OPSIN_RESULT_STATUS status, String message, String chemicalName) { this.structure = frag; this.status = status; this.message = message; this.chemicalName = chemicalName; this.warnings = Collections.emptyList(); } Fragment getStructure() { return structure; } /** * Returns an enum indicating whether interpreting the chemical name was successful * If an issue was identified but a chemical structure could still be deduced the status is {@link OPSIN_RESULT_STATUS#WARNING} * @return {@link OPSIN_RESULT_STATUS} status */ public OPSIN_RESULT_STATUS getStatus() { return status; } /** * Returns a message explaining why generation of a molecule from the name failed * This string will be blank when no problems were encountered * @return String explaining problems encountered */ public String getMessage() { return message; } /** * Returns the chemical name that this OpsinResult was generated from * @return String containing the original chemical name */ public String getChemicalName() { return chemicalName; } /** * Generates the CML corresponding to the molecule described by the name * If name generation failed i.e. 
the OPSIN_RESULT_STATUS is FAILURE then null is returned * @return Chemical Markup Language as a String */ public String getCml() { if (structure != null){ try{ return CMLWriter.generateCml(structure, chemicalName); } catch (Exception e) { LOG.debug("CML generation failed", e); } } return null; } /** * Generates the CML corresponding to the molecule described by the name
* If name generation failed i.e. the OPSIN_RESULT_STATUS is FAILURE then null is returned
* The CML is indented * @return Indented Chemical Markup Language as a String */ public String getPrettyPrintedCml() { if (structure != null){ try{ return CMLWriter.generateIndentedCml(structure, chemicalName); } catch (Exception e) { LOG.debug("CML generation failed", e); } } return null; } /** * Generates the SMILES corresponding to the molecule described by the name
* If name generation failed i.e. the OPSIN_RESULT_STATUS is FAILURE then null is returned * @return SMILES as a String */ public String getSmiles() { return getSmiles(SmilesOptions.DEFAULT); } /** * Generates the SMILES corresponding to the molecule described by the name
* If name generation failed i.e. the OPSIN_RESULT_STATUS is FAILURE then null is returned. *
* The options parameter is used to control the output by a set of binary flags. This is * primarily used to control the output layers in ChemAxon Extended SMILES (CXSMILES). *

	 * // only include the enhanced stereo layers
	 * result.getSmiles(SmilesOptions.CXSMILES_ENHANCED_STEREO);
	 * // only include the enhanced stereo and polymer layers
	 * result.getSmiles(SmilesOptions.CXSMILES_ENHANCED_STEREO |
	 *                  SmilesOptions.CXSMILES_POLYMERS);
	 * 
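	 * Usage sketch (assuming result is an OpsinResult returned by NameToStructure#parseChemicalName):
	 * this method returns null rather than throwing, so callers may want to check the value before use
	 *   String cxsmiles = result.getSmiles(SmilesOptions.CXSMILES);
	 *   if (cxsmiles == null) {
	 *       // no structure was generated, or SMILES generation failed; getMessage() may explain a parse failure
	 *   }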
* * @param options binary flags of {@link SmilesOptions} (default: {@link SmilesOptions#DEFAULT}) * @return SMILES as a String * @see SmilesOptions */ public String getSmiles(int options) { if (structure != null){ try{ return SMILESWriter.generateSmiles(structure, options); } catch (Exception e) { LOG.debug("SMILES generation failed", e); } } return null; } /** * Experimental function that generates the extended SMILES corresponding to the molecule described by the name * If name generation failed i.e. the OPSIN_RESULT_STATUS is FAILURE then null is returned * If the molecule doesn't utilise any features made possible by extended SMILES this is equivalent to {@link #getSmiles()} * @return Extended SMILES as a String */ public String getExtendedSmiles() { if (structure != null){ try{ return SMILESWriter.generateSmiles(structure, SmilesOptions.CXSMILES); } catch (Exception e) { LOG.debug("Extended SMILES generation failed", e); } } return null; } /** * A list of warnings encountered when the result was {@link OPSIN_RESULT_STATUS#WARNING}
* This list of warnings is immutable * @return A list of {@link OpsinWarning} */ public List getWarnings() { return Collections.unmodifiableList(warnings); } /** * Convenience method to check if one of the associated OPSIN warnings was {@link OpsinWarningType#APPEARS_AMBIGUOUS} * @return true if name appears to be ambiguous */ public boolean nameAppearsToBeAmbiguous() { for (OpsinWarning warning : warnings) { if (warning.getType() == OpsinWarningType.APPEARS_AMBIGUOUS) { return true; } } return false; } /** * Convenience method to check if one of the associated OPSIN warnings was {@link OpsinWarningType#STEREOCHEMISTRY_IGNORED} * @return true if stereochemistry was ignored to interpret the name */ public boolean stereochemistryIgnored() { for (OpsinWarning warning : warnings) { if (warning.getType() == OpsinWarningType.STEREOCHEMISTRY_IGNORED) { return true; } } return false; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/OpsinTools.java000066400000000000000000000601071451751637500266340ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Deque; import java.util.HashSet; import java.util.List; import java.util.Locale; import java.util.Set; import java.util.regex.Pattern; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; /** * A set of useful methods and constants to assist OPSIN * @author dl387 * */ class OpsinTools { static final Pattern MATCH_DIGITS = Pattern.compile("\\d+"); static final Pattern MATCH_COLONORSEMICOLON = Pattern.compile("[:;]"); static final Pattern MATCH_AMINOACID_STYLE_LOCANT =Pattern.compile("([A-Z][a-z]?)('*)((\\d+[a-z]?|alpha|beta|gamma|delta|epsilon|zeta|eta|omega)'*)"); static final Pattern MATCH_ELEMENT_SYMBOL =Pattern.compile("[A-Z][a-z]?"); static final Pattern MATCH_ELEMENT_SYMBOL_LOCANT =Pattern.compile("[A-Z][a-z]?'*"); static final Pattern MATCH_NUMERIC_LOCANT =Pattern.compile("(\\d+)[a-z]?'*"); static final char 
END_OF_SUBSTITUENT = '\u00e9'; static final char END_OF_MAINGROUP = '\u00e2'; static final char END_OF_FUNCTIONALTERM = '\u00FB'; static final String NEWLINE = System.getProperty("line.separator"); static boolean isBiochemical(String type, String subType) { return BIOCHEMICAL_SUBTYPE_VAL.equals(subType) || CARBOHYDRATE_TYPE_VAL.equals(type) || AMINOACID_TYPE_VAL.equals(type); } /** * Returns the next sibling suffix node which is not related to altering charge (ium/ide/id) * @param currentEl */ static Element getNextNonChargeSuffix(Element currentEl) { Element next = getNextSibling(currentEl); while (next != null) { if (next.getName().equals(SUFFIX_EL) && !CHARGE_TYPE_VAL.equals(next.getAttributeValue(TYPE_ATR))){ return next; } next = getNextSibling(next); } return null; } /** * Returns a new list containing the elements of list1 followed by list2 * @param list1 * @param list2 * @return The new list */ static List combineElementLists(List list1, List list2) { List elementList = new ArrayList<>(list1.size() + list2.size()); elementList.addAll(list1); elementList.addAll(list2); return elementList; } static String fixLocantCapitalisation(String locant) { int len = locant.length(); if (len >= 2) { char lastChar = locant.charAt(len - 1); if ((lastChar >= 'A' && lastChar <= 'G') && MATCH_DIGITS.matcher(locant).region(0, len - 1).matches() ) { //erroneous uppercase e.g. 1A rather than 1a locant = locant.toLowerCase(Locale.ROOT); } } return locant; } /** * Returns the previous group. 
This group element need not be a sibling * @param current: starting element * @return */ static Element getPreviousGroup(Element current) { if (current.getName().equals(GROUP_EL)) {//can start with a group or the sub/root the group is in current = current.getParent(); } Element parent = current.getParent(); if (parent == null || parent.getName().equals(WORDRULE_EL)) { return null; } int index = parent.indexOf(current); if (index ==0) { return getPreviousGroup(parent);//no group found } Element previous = parent.getChild(index - 1); while (previous.getChildCount() != 0) { previous = previous.getChild(previous.getChildCount() - 1); } List groups = previous.getParent().getChildElements(GROUP_EL); if (groups.isEmpty()){ return getPreviousGroup(previous); } else{ return groups.get(groups.size() - 1);//return last group if multiple exist e.g. fused ring } } /** * Returns the next group. This group element need not be a sibling * @param current: starting element * @return */ static Element getNextGroup(Element current) { if (current.getName().equals(GROUP_EL)) {//can start with a group or the sub/root the group is in current = current.getParent(); } Element parent = current.getParent(); if (parent == null || parent.getName().equals(MOLECULE_EL)) { return null; } int index = parent.indexOf(current); if (index == parent.getChildCount() - 1) { return getNextGroup(parent);//no group found } Element next = parent.getChild(index + 1); while (next.getChildCount() != 0){ next = next.getChild(0); } List groups = next.getParent().getChildElements(GROUP_EL); if (groups.isEmpty()){ return getNextGroup(next); } else{ return groups.get(0);//return first group if multiple exist e.g. fused ring } } /** * Finds the wordRule element that encloses the given element. 
* Returns the wordRule element or throws an exception * @param el * @return wordRule Element */ static Element getParentWordRule(Element el) { Element parent = el.getParent(); while(parent != null && !parent.getName().equals(WORDRULE_EL)){ parent = parent.getParent(); } if (parent == null){ throw new RuntimeException("Cannot find enclosing wordRule element"); } else{ return parent; } } /** * Searches in a depth-first manner for a non-suffix atom that has the target non element symbol locant * Returns either that atom or null if one cannot be found * @param startingAtom * @param targetLocant * @return the matching atom or null */ static Atom depthFirstSearchForNonSuffixAtomWithLocant(Atom startingAtom, String targetLocant) { Deque stack = new ArrayDeque<>(); stack.add(startingAtom); Set atomsVisited = new HashSet<>(); while (stack.size() > 0) { Atom currentAtom = stack.removeLast(); atomsVisited.add(currentAtom); List neighbours = currentAtom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (atomsVisited.contains(neighbour)){//already visited continue; } List locants = new ArrayList<>(neighbour.getLocants()); locants.removeAll(neighbour.getElementSymbolLocants()); //A main group atom, would expect to only find one except in something strange like succinimide //The locants.size() > 0 condition allows things like terephthalate to work which have an atom between the suffixes and main atoms that has no locant if (locants.size() > 0 && !neighbour.getType().equals(SUFFIX_TYPE_VAL)){ if (locants.contains(targetLocant)){ return neighbour; } continue; } stack.add(neighbour); } } return null; } /** * Searches in a depth-first manner for an atom with a numeric locant * Returns either that atom or null if one cannot be found * @param startingAtom * @return the matching atom or null */ static Atom depthFirstSearchForAtomWithNumericLocant(Atom startingAtom){ Deque stack = new ArrayDeque<>(); stack.add(startingAtom); Set atomsVisited = new HashSet<>(); while 
(stack.size() > 0) { Atom currentAtom = stack.removeLast(); atomsVisited.add(currentAtom); List neighbours = currentAtom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (atomsVisited.contains(neighbour)){//already visited continue; } List locants = neighbour.getLocants(); for (String neighbourLocant : locants) { if (MATCH_NUMERIC_LOCANT.matcher(neighbourLocant).matches()){ return neighbour; } } stack.add(neighbour); } } return null; } /** * Given a list of annotations returns the word type as indicated by the final annotation of the list * @param annotations * @return WordType * @throws ParsingException */ static WordType determineWordType(List annotations) throws ParsingException { char finalAnnotation = annotations.get(annotations.size() - 1); if (finalAnnotation == END_OF_MAINGROUP) { return WordType.full; } else if (finalAnnotation == END_OF_SUBSTITUENT) { return WordType.substituent; } else if (finalAnnotation == END_OF_FUNCTIONALTERM) { return WordType.functionalTerm; } else{ throw new ParsingException("OPSIN bug: Unable to determine word type!"); } } /**Gets the next sibling of a given element. * * @param element The reference element. * @return The next Sibling, or null. */ static Element getNextSibling(Element element) { Element parent = element.getParent(); int i = parent.indexOf(element); if (i + 1 >= parent.getChildCount()) { return null; } return parent.getChild(i + 1); } /**Gets the first next sibling of a given element whose element name matches the given string. * * @param current The reference element. * @param elName The element name to look for * @return The matched next Sibling, or null. */ static Element getNextSibling(Element current, String elName) { Element next = getNextSibling(current); while (next != null) { if (next.getName().equals(elName)){ return next; } next = getNextSibling(next); } return null; } /**Gets the previous sibling of a given element. * * @param element The reference element. 
* @return The previous Sibling, or null. */ static Element getPreviousSibling(Element element) { Element parent = element.getParent(); int i = parent.indexOf(element); if (i == 0) { return null; } return parent.getChild(i - 1); } /**Gets the first previous sibling of a given element whose element name matches the given string. * * @param current The reference element. * @param elName The element name of a element to look for * @return The matched previous Sibling, or null. */ static Element getPreviousSibling(Element current, String elName) { Element prev = getPreviousSibling(current); while (prev != null) { if (prev.getName().equals(elName)){ return prev; } prev = getPreviousSibling(prev); } return null; } /**Inserts a element so that it occurs before a reference element. The new element * must not currently have a parent. * * @param element The reference element. * @param newElement The new element to insert. */ static void insertBefore(Element element, Element newElement) { Element parent = element.getParent(); int i = parent.indexOf(element); parent.insertChild(newElement, i); } /**Inserts an element so that it occurs after a reference element. The new element * must not currently have a parent. * * @param element The reference element. * @param neweElement The new element to insert. */ static void insertAfter(Element element, Element neweElement) { Element parent = element.getParent(); int i = parent.indexOf(element); parent.insertChild(neweElement, i + 1); } /** * Gets the next element. This element need not be a sibling * @param element: starting element * @return */ static Element getNext(Element element) { return getNext(element, true); } /** * Gets the next element. 
This element need not be a sibling * If withinConnectedComponent is set the method is restricted to the current branch of the root molecule element * @param element: starting element * @param withinConnectedComponent * @return */ static Element getNext(Element element, boolean withinConnectedComponent) { Element parent = element.getParent(); if (parent == null || (withinConnectedComponent && parent.getName().equals(XmlDeclarations.MOLECULE_EL))) { return null; } int index = parent.indexOf(element); if (index + 1 >= parent.getChildCount()) { return getNext(parent, withinConnectedComponent);//reached end of element } Element next = parent.getChild(index + 1); while (next.getChildCount() > 0){ next = next.getChild(0); } return next; } /** * Gets the previous element. This element need not be a sibling * @param element: starting element * @return */ static Element getPrevious(Element element) { return getPrevious(element, true); } /** * Gets the previous element. This element need not be a sibling * If withinConnectedComponent is set the method is restricted to the current branch of the root molecule element * @param element: starting element * @param withinConnectedComponent * @return */ static Element getPrevious(Element element, boolean withinConnectedComponent) { Element parent = element.getParent(); if (parent == null || (withinConnectedComponent && parent.getName().equals(XmlDeclarations.MOLECULE_EL))) { return null; } int index = parent.indexOf(element); if (index == 0) { return getPrevious(parent, withinConnectedComponent);//reached beginning of element } Element previous = parent.getChild(index - 1); while (previous.getChildCount() > 0){ previous = previous.getChild(previous.getChildCount() - 1); } return previous; } /** * Returns a list containing sibling elements with the given element name after the given element. 
* These elements need not be continuous * @param currentElem: the element to look for following siblings of * @param elName: the name of the elements desired * @return */ static List getNextSiblingsOfType(Element currentElem, String elName) { List laterSiblingElementsOfType = new ArrayList<>(); Element parent = currentElem.getParent(); if (parent == null){ return laterSiblingElementsOfType; } int indexOfCurrentElem = parent.indexOf(currentElem); for (int i = indexOfCurrentElem + 1; i < parent.getChildCount(); i++) { Element child = parent.getChild(i); if (child.getName().equals(elName)) { laterSiblingElementsOfType.add(child); } } return laterSiblingElementsOfType; } /** * Returns a list containing sibling elements with the given element name after the given element. * @param currentElem: the element to look for following siblings of * @param elName: the name of the elements desired * @return */ static List getNextAdjacentSiblingsOfType(Element currentElem, String elName) { List siblingElementsOfType = new ArrayList<>(); Element parent = currentElem.getParent(); if (parent == null){ return siblingElementsOfType; } Element nextSibling = getNextSibling(currentElem); while (nextSibling != null && nextSibling.getName().equals(elName)){ siblingElementsOfType.add(nextSibling); nextSibling = getNextSibling(nextSibling); } return siblingElementsOfType; } /** * Returns a list containing sibling elements with the given element names after the given element. 
* These elements need not be continuous and are returned in the order encountered * @param currentElem: the element to look for following siblings of * @param elNames: An array of the names of the elements desired * @return */ static List getNextSiblingsOfTypes(Element currentElem, String[] elNames){ List laterSiblingElementsOfTypes = new ArrayList<>(); currentElem = getNextSibling(currentElem); while (currentElem != null){ String name = currentElem.getName(); for (String elName : elNames) { if (name.equals(elName)){ laterSiblingElementsOfTypes.add(currentElem); break; } } currentElem = getNextSibling(currentElem); } return laterSiblingElementsOfTypes; } /** * Returns a list containing sibling elements with the given element name before the given element. * These elements need not be continuous * @param currentElem: the element to look for previous siblings of * @param elName: the name of the elements desired * @return */ static List getPreviousSiblingsOfType(Element currentElem, String elName) { List earlierSiblingElementsOfType = new ArrayList<>(); Element parent = currentElem.getParent(); if (parent == null){ return earlierSiblingElementsOfType; } int indexOfCurrentElem = parent.indexOf(currentElem); for (int i = 0; i < indexOfCurrentElem; i++) { Element child = parent.getChild(i); if (child.getName().equals(elName)) { earlierSiblingElementsOfType.add(child); } } return earlierSiblingElementsOfType; } /** * Gets the next sibling element of the given element. 
If this element's name is within the elementsToIgnore array this is repeated * If no appropriate element can be found null is returned * @param startingEl * @param elNamesToIgnore * @return */ static Element getNextSiblingIgnoringCertainElements(Element startingEl, String[] elNamesToIgnore){ Element parent = startingEl.getParent(); if (parent == null){ return null; } int i = parent.indexOf(startingEl); if (i + 1 >= parent.getChildCount()) { return null; } Element next = parent.getChild(i + 1); String elName = next.getName(); for (String namesToIgnore : elNamesToIgnore) { if (elName.equals(namesToIgnore)){ return getNextSiblingIgnoringCertainElements(next, elNamesToIgnore); } } return next; } /** * Gets the previous sibling element of the given element. If this element's name is within the elementsToIgnore array this is repeated * If no appropriate element can be found null is returned * @param startingEl * @param elNamesToIgnore * @return */ static Element getPreviousSiblingIgnoringCertainElements(Element startingEl, String[] elNamesToIgnore){ Element parent = startingEl.getParent(); if (parent == null){ return null; } int i = parent.indexOf(startingEl); if (i == 0) { return null; } Element previous = parent.getChild(i - 1); String elName = previous.getName(); for (String namesToIgnore : elNamesToIgnore) { if (elName.equals(namesToIgnore)){ return getPreviousSiblingIgnoringCertainElements(previous, elNamesToIgnore); } } return previous; } /** * Finds all descendant elements whose name matches the given element name * @param startingElement * @param elementName * @return */ static List getDescendantElementsWithTagName(Element startingElement, String elementName) { List matchingElements = new ArrayList<>(); Deque stack = new ArrayDeque<>(); for (int i = startingElement.getChildCount() - 1; i >= 0; i--) { stack.add(startingElement.getChild(i)); } while (stack.size() > 0){ Element currentElement = stack.removeLast(); if (currentElement.getName().equals(elementName)){ 
matchingElements.add(currentElement); } for (int i = currentElement.getChildCount() - 1; i >= 0; i--) { stack.add(currentElement.getChild(i)); } } return matchingElements; } /** * Finds all descendant elements whose element name matches one of the strings in elementNames * @param startingElement * @param elementNames * @return */ static List getDescendantElementsWithTagNames(Element startingElement, String[] elementNames) { List matchingElements = new ArrayList<>(); Deque stack = new ArrayDeque<>(); for (int i = startingElement.getChildCount() - 1; i >= 0; i--) { stack.add(startingElement.getChild(i)); } while (stack.size()>0){ Element currentElement = stack.removeLast(); String currentElName = currentElement.getName(); for (String targetTagName : elementNames) { if (currentElName.equals(targetTagName)){ matchingElements.add(currentElement); break; } } for (int i = currentElement.getChildCount() - 1; i >= 0; i--) { stack.add(currentElement.getChild(i)); } } return matchingElements; } /** * Finds all child elements whose element name matches one of the strings in elementNames * @param startingElement * @param elementNames * @return */ static List getChildElementsWithTagNames(Element startingElement, String[] elementNames) { List matchingElements = new ArrayList<>(); for (int i = 0, l = startingElement.getChildCount(); i < l; i++) { Element child = startingElement.getChild(i); String currentElName = child.getName(); for (String targetTagName : elementNames) { if (currentElName.equals(targetTagName)){ matchingElements.add(child); break; } } } return matchingElements; } /** * Finds all descendant elements whose element name matches the given elementName * Additionally the element must have the specified attribute and the value of the attribute must be as specified * @param startingElement * @param elementName * @param attributeName * @param attributeValue * @return */ static List getDescendantElementsWithTagNameAndAttribute(Element startingElement, String elementName, 
String attributeName, String attributeValue) { List matchingElements = new ArrayList<>(); Deque stack = new ArrayDeque<>(); for (int i = startingElement.getChildCount() - 1; i >= 0; i--) { stack.add(startingElement.getChild(i)); } while (stack.size() > 0){ Element currentElement =stack.removeLast(); if (currentElement.getName().equals(elementName)){ if (attributeValue.equals(currentElement.getAttributeValue(attributeName))){ matchingElements.add(currentElement); } } for (int i = currentElement.getChildCount() - 1; i >= 0; i--) { stack.add(currentElement.getChild(i)); } } return matchingElements; } /** * Finds all child elements whose element name matches the given elementName * Additionally the element must have the specified attribute and the value of the attribute must be as specified * @param startingElement * @param elementName * @return */ static List getChildElementsWithTagNameAndAttribute(Element startingElement, String elementName, String attributeName, String attributeValue) { List matchingElements = new ArrayList<>(); for (int i = 0, l = startingElement.getChildCount(); i < l; i++) { Element child = startingElement.getChild(i); if (child.getName().equals(elementName)){ if (attributeValue.equals(child.getAttributeValue(attributeName))){ matchingElements.add(child); } } } return matchingElements; } /** * Finds and returns the number of elements and the number of elements with no children, that are descendants of the startingElement * The 0th position of the returned array is the total number of elements * The 1st position is the number of child less elements * @param startingElement * @return */ static int[] countNumberOfElementsAndNumberOfChildLessElements(Element startingElement) { int[] counts = new int[2]; Deque stack = new ArrayDeque<>(); stack.add(startingElement); while (stack.size() > 0){ Element currentElement = stack.removeLast(); int childCount = currentElement.getChildCount(); if (childCount == 0) { counts[1]++; } else{ for (int i = 0; i < 
childCount; i++) { stack.add(currentElement.getChild(i)); } counts[0] += childCount; } } return counts; } /** * Finds all the later siblings of startingEl up until there are no more siblings or an * element with the given element name is reached (exclusive of that element) * @param startingEl * @param elName * @return */ static List<Element> getSiblingsUpToElementWithTagName(Element startingEl, String elName) { List<Element> laterSiblings = new ArrayList<>(); Element nextEl = getNextSibling(startingEl); while (nextEl != null && !nextEl.getName().equals(elName)){ laterSiblings.add(nextEl); nextEl = getNextSibling(nextEl); } return laterSiblings; } /** * Replaces tabs, newlines, carriage returns and the characters "&<> with XML character/entity references * @param str * @return */ static String xmlEncode(String str) { StringBuilder result = new StringBuilder(); for (int i = 0, len = str.length(); i < len; i++) { char c = str.charAt(i); switch (c) { case '\t': result.append("&#x9;"); break; case '\n': result.append("&#xA;"); break; case '\r': result.append("&#xD;"); break; case '"': result.append("&quot;"); break; case '&': result.append("&amp;"); break; case '<': result.append("&lt;"); break; case '>': result.append("&gt;"); break; default: result.append(c); } } return result.toString(); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/OpsinWarning.java000066400000000000000000000035741451751637500271460ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * A warning generated by OPSIN while interpreting a name.
* The specifics of the warning may be used to judge whether you want to accept the generated structure. */ public class OpsinWarning { /** * The type of problem OPSIN encountered */ public enum OpsinWarningType { /**OPSIN ignored stereochemistry from the input name to give this structure. This can have various causes
: * OPSIN doesn't support interpretation of the type of stereochemistry * OPSIN stereo-perception doesn't support this type of stereocentre * The name describes the wrong structure * The stereochemistry is being requested at the wrong atom/bond */ STEREOCHEMISTRY_IGNORED("Stereochemical term ignored"), /**OPSIN made a choice that appeared to be ambiguous to give this structure i.e. the name may describe multiple possible structures
* The name may be missing locants
* Alternatively, the name could actually be a trivial rather than a systematic name
* OPSIN tries to make sensible choices in ambiguous cases, so the resultant structure may nonetheless be the intended one*/ APPEARS_AMBIGUOUS("This name appears to be ambiguous"); private final String explanation; private OpsinWarningType(String explanation) { this.explanation = explanation; } public String getExplanation() { return explanation; } } private final OpsinWarningType type; private final String message; OpsinWarning(OpsinWarningType type, String message) { this.type = type; this.message = message; } /** * @return The type of the warning, see {@link OpsinWarningType} */ public OpsinWarningType getType() { return type; } /** * @return The message describing the specific cause of this warning */ public String getMessage() { return message; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/OutAtom.java000066400000000000000000000035201451751637500261070ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Struct for an OutAtom. As expected holds a reference to an atom. * However, if setExplicitly is false then it is not absolutely certain that this is the atom that is referred to * e.g. propyl is stored as prop-1-yl with this set to false while prop-2-yl has it set to true * * Also holds the order of the bond that will be created when it is used (valency) e.g. chloro 1, oxo 2 * * Optionally a locant may be specified for what the outAtom should connect to if it is convenient to store such information.
This is used in ester formation and epoxy formation * @author dl387 * */ class OutAtom { private Atom atom; private int valency; private boolean setExplicitly; private String locant; OutAtom(Atom atom, int valency, Boolean setExplicitly) { this(atom, valency, setExplicitly, null); } OutAtom(Atom atom, int valency, Boolean setExplicitly, String locant) { this.atom = atom; this.valency = valency; this.setExplicitly = setExplicitly; this.locant = locant; if (setExplicitly){ atom.addOutValency(valency); } } Atom getAtom() { return atom; } void setAtom(Atom atom) { this.atom = atom; } int getValency() { return valency; } void setValency(int valency) { if (setExplicitly){ atom.addOutValency(valency - this.valency); } this.valency = valency; } boolean isSetExplicitly() { return setExplicitly; } void setSetExplicitly(boolean setExplicitly) { if (!this.setExplicitly && setExplicitly){ atom.addOutValency(valency); } else if (this.setExplicitly && !setExplicitly){ atom.addOutValency(-valency); } this.setExplicitly = setExplicitly; } String getLocant() { return locant; } void setLocant(String locant) { this.locant = locant; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Parse.java000066400000000000000000000020011451751637500255620ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; /**A "struct" containing data on the parsing of a chemical name.
* * @author ptc24 * @author dl387 * */ class Parse { /**The chemical name.*/ private final String name; /**The words within the name, and their parsing data.*/ private final List<ParseWord> words = new ArrayList<>(); /** * Creates a parse object for a chemical name * @param chemicalName */ Parse(String chemicalName) { name = chemicalName; } Parse deepCopy() { Parse p = new Parse(name); for(ParseWord pw : words) { p.words.add(pw.deepCopy()); } return p; } public String toString() { return "[" + name + ", " + words.toString() + "]"; } List<ParseWord> getWords() { return words; } boolean addWord(ParseWord word) { return words.add(word); } boolean removeWord(ParseWord word) { return words.remove(word); } ParseWord getWord(int index) { return words.get(index); } String getName() { return name; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ParseRules.java000066400000000000000000000172771451751637500266220ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Collections; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; import dk.brics.automaton.RunAutomaton; /** * Instantiate via NameToStructure.getOpsinParser() * * Performs finite-state allocation of roles ("annotations") to tokens: * The chemical name is broken down into tokens e.g. ethyl --> eth yl by applying the chemical grammar in regexes.xml * Each of the tokens eth and yl is associated with a letter, referred to here as an annotation, which indicates the role of the token. * These letters are defined in regexes.xml and would in this case have the meaning alkaneStem and inlineSuffix * * The chemical grammar employs the annotations associated with the tokens when deciding what may follow what has already been seen * e.g.
you cannot start a chemical name with yl and an optional e is valid after an arylGroup * * @author ptc24 * @author dl387 * */ public class ParseRules { /** A DFA encompassing the grammar of a chemical word. */ private final RunAutomaton chemAutomaton; /** The allowed symbols in chemAutomaton */ private final char[] stateSymbols; private final OpsinRadixTrie[] symbolTokenNamesDict; private final RunAutomaton[] symbolRegexAutomataDict; private final Pattern[] symbolRegexesDict; private final AnnotatorState initialState; /** * Creates a left to right parser that can parse a substituent/full/functional word * @param resourceManager */ ParseRules(ResourceManager resourceManager){ this.chemAutomaton = resourceManager.getChemicalAutomaton(); this.symbolTokenNamesDict = resourceManager.getSymbolTokenNamesDict(); this.symbolRegexAutomataDict = resourceManager.getSymbolRegexAutomataDict(); this.symbolRegexesDict = resourceManager.getSymbolRegexesDict(); this.stateSymbols = chemAutomaton.getCharIntervals(); this.initialState = new AnnotatorState(chemAutomaton.getInitialState(), '\0', 0, true, null); } /**Determines the possible annotations for a chemical word * Returns a list of parses and how much of the word could not be interpreted * e.g. usually the list will have only one parse and the string will equal "" * For something like ethyloxime. 
The list will contain the parse for ethyl and the string will equal "oxime" as it was unparsable * For something like eth no parses would be found and the string will equal "eth" * * @param chemicalWord * @return Results of parsing * @throws ParsingException */ public ParseRulesResults getParses(String chemicalWord) throws ParsingException { String chemicalWordLowerCase = StringTools.lowerCaseAsciiString(chemicalWord); ArrayDeque asStack = new ArrayDeque<>(); asStack.add(initialState); int posInNameOfLastSuccessfulAnnotations = 0; List successfulAnnotations = new ArrayList<>(); AnnotatorState longestAnnotation = initialState;//this is the longest annotation. It does not necessarily end in an accept state int stateSymbolsSize = stateSymbols.length; while (!asStack.isEmpty()) { AnnotatorState as = asStack.removeLast();//depth-first avoids pathological memory consumption if parsing ambiguity is encountered int posInName = as.getPosInName(); if (chemAutomaton.isAccept(as.getState())){ if (posInName >= posInNameOfLastSuccessfulAnnotations){//this annotation is worthy of consideration if (posInName > posInNameOfLastSuccessfulAnnotations){//this annotation is longer than any previously found annotation successfulAnnotations.clear(); posInNameOfLastSuccessfulAnnotations = posInName; } else if (successfulAnnotations.size() > 128){ throw new ParsingException("Ambiguity in OPSIN's chemical grammar has produced more than 128 annotations. Parsing has been aborted. 
Please report this as a bug"); } successfulAnnotations.add(as); } } //record the longest annotation found so it can be reported to the user for debugging if (posInName > longestAnnotation.getPosInName()){ longestAnnotation = as; } for (int i = 0; i < stateSymbolsSize; i++) { char annotationCharacter = stateSymbols[i]; int potentialNextState = chemAutomaton.step(as.getState(), annotationCharacter); if (potentialNextState != -1) {//-1 means this state is not accessible from the previous state OpsinRadixTrie possibleTokenisationsTrie = symbolTokenNamesDict[i]; if (possibleTokenisationsTrie != null) { List possibleTokenisations = possibleTokenisationsTrie.findMatches(chemicalWordLowerCase, posInName); if (possibleTokenisations != null) {//next could be a token for (int j = 0, l = possibleTokenisations.size(); j < l; j++) {//typically list size will be 1 so this is faster than an iterator int tokenizationIndex = possibleTokenisations.get(j); AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, false, as); //System.out.println("tokened " + chemicalWordLowerCase.substring(posInName, tokenizationIndex)); asStack.add(newAs); } } } RunAutomaton possibleAutomata = symbolRegexAutomataDict[i]; if (possibleAutomata != null) {//next could be an automaton int matchLength = possibleAutomata.run(chemicalWord, posInName); if (matchLength != -1){//matchLength = -1 means it did not match int tokenizationIndex = posInName + matchLength; AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, true, as); //System.out.println("neword automata " + chemicalWord.substring(posInName, tokenizationIndex)); asStack.add(newAs); } } Pattern possibleRegex = symbolRegexesDict[i]; if (possibleRegex != null) {//next could be a regex Matcher mat = possibleRegex.matcher(chemicalWord).region(posInName, chemicalWord.length()); mat.useTransparentBounds(true); if (mat.lookingAt()) {//match at start int 
tokenizationIndex = posInName + mat.group(0).length(); AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, true, as); //System.out.println("neword regex " + mat.group(0)); asStack.add(newAs); } } } } } List outputList = new ArrayList<>(); String uninterpretableName = chemicalWord; String unparseableName = chemicalWord.substring(longestAnnotation.getPosInName()); if (successfulAnnotations.size() > 0){//at least some of the name could be interpreted into a substituent/full/functionalTerm int bestAcceptPosInName = -1; for(AnnotatorState as : successfulAnnotations) { outputList.add(convertAnnotationStateToParseTokens(as, chemicalWord, chemicalWordLowerCase)); bestAcceptPosInName = as.getPosInName();//all acceptable annotator states found should have the same posInName } uninterpretableName = chemicalWord.substring(bestAcceptPosInName); } return new ParseRulesResults(outputList, uninterpretableName, unparseableName); } private ParseTokens convertAnnotationStateToParseTokens(AnnotatorState as, String chemicalWord, String chemicalWordLowerCase) { List tokens = new ArrayList<>(); List annotations = new ArrayList<>(); AnnotatorState previousAs; while ((previousAs = as.getPreviousAs()) != null) { if (as.isCaseSensitive()) { tokens.add(chemicalWord.substring(previousAs.getPosInName(), as.getPosInName())); } else{ tokens.add(chemicalWordLowerCase.substring(previousAs.getPosInName(), as.getPosInName())); } annotations.add(as.getAnnot()); as = previousAs; } Collections.reverse(tokens); Collections.reverse(annotations); return new ParseTokens(tokens, annotations); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ParseRulesResults.java000066400000000000000000000033551451751637500301740ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.List; /** * A wrapper for the results from parsing a chemical name or part of a chemical name * through ParseRules * * @author dl387 */ public class 
ParseRulesResults { private final List<ParseTokens> parseTokensList; private final String uninterpretableName; private final String unparseableName; public ParseRulesResults(List<ParseTokens> parseTokensList, String uninterpretableName, String unparseableName) { this.parseTokensList = parseTokensList; this.uninterpretableName = uninterpretableName; this.unparseableName = unparseableName; } /** * One ParseTokens object is returned for each possible interpretation of a chemical name * If none of the name can be interpreted this list will be empty * @return List of possible tokenisations/annotations of tokens */ public List<ParseTokens> getParseTokensList() { return parseTokensList; } /** * The substring of the name that could not be classified into a substituent/full/functionalTerm * e.g. in ethyl-2H-fooarene "2H-fooarene" will be returned * @return String of uninterpretable chemical name */ public String getUninterpretableName() { return uninterpretableName; } /** * The substring of the name that could not be tokenised at all. * This will always be the same as or shorter than the uninterpretable substring of the name * e.g. in ethyl-2H-fooarene "fooarene" will be returned * @return String of unparseable chemical name */ public String getUnparseableName() { return unparseableName; } public String toString() { return "(" + parseTokensList.toString() + ", " + uninterpretableName + ", " + unparseableName + ")"; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ParseTokens.java000066400000000000000000000032231451751637500267550ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Collections; import java.util.List; /**A "struct" containing data on a possible tokenisation of a word in a chemical name.
* * @author ptc24 * @author dl387 * */ public class ParseTokens { /**The tokens that the word is made up of.*/ private final List<String> tokens; /**The annotation of each token, in the same order as the tokens.*/ private final List<Character> annotations; /** * Creates a ParseTokens from an existing list of tokens and annotations * The lists must be of identical length otherwise an exception is thrown * @param tokens * @param annotations */ ParseTokens(List<String> tokens, List<Character> annotations){ if (tokens.size() != annotations.size()){ throw new IllegalArgumentException("OPSIN bug: mismatch between the sizes of tokens list and annotation list"); } this.tokens = Collections.unmodifiableList(new ArrayList<>(tokens)); this.annotations = Collections.unmodifiableList(new ArrayList<>(annotations)); } public List<String> getTokens() { return tokens; } public List<Character> getAnnotations() { return annotations; } public String toString() { return "[" + tokens + ", " + annotations + "]"; } @Override public boolean equals(Object other) { if (this == other) { return true; } if (other instanceof ParseTokens) { ParseTokens otherPT = (ParseTokens) other; return this.tokens.equals(otherPT.tokens) && this.annotations.equals(otherPT.annotations); } return false; } @Override public int hashCode() { return (3 * this.tokens.hashCode()) * (7 * this.annotations.hashCode()); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ParseWord.java000066400000000000000000000015141451751637500264260ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; /**A "struct" containing data on the parsing of a word in a chemical name.
* * @author ptc24 * @author dl387 * */ class ParseWord { /**The word itself.*/ private final String word; /**All of the possible tokenisations of the word.*/ private final List<ParseTokens> parseTokens; ParseWord deepCopy() { return new ParseWord(word, parseTokens); } ParseWord(String word, List<ParseTokens> parseTokens) { this.word = word; if (parseTokens == null){ this.parseTokens = null; } else{ this.parseTokens = new ArrayList<>(parseTokens); } } String getWord() { return word; } List<ParseTokens> getParseTokens() { return parseTokens; } public String toString() { return "[" + word + ", " + parseTokens + "]"; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Parser.java000066400000000000000000000334411451751637500257600ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.IOException; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Arrays; import java.util.Deque; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; /**Conducts finite-state parsing on chemical names. * Adds XML annotation to the semantic constituents of the name. * * @author ptc24/dl387 * */ class Parser { /**Uses ParseRules to intelligently parse chemical names into substituents/full terms/functional terms*/ private final Tokeniser tokeniser; /**The rules by which words are grouped together (e.g. in functional class nomenclature)*/ private final WordRules wordRules; /**Holds the various tokens used.*/ private final ResourceManager resourceManager; private final ParseRules parseRules; private static final Pattern matchSemiColonSpace = Pattern.compile("; "); private static final Pattern matchStoichiometryIndication = Pattern.compile("[ ]?[\\{\\[\\(](\\d+|\\?)([:/](\\d+|\\?))+[\\}\\]\\)]$"); private static final Logger LOG = LogManager.getLogger(Parser.class); /** * No-argument constructor. Uses ResourceGetter found at
Uses ResourceGetter found at * uk/ac/cam/ch/wwmm/opsin/resources/ * @throws IOException */ Parser() throws IOException { ResourceGetter resources = new ResourceGetter("uk/ac/cam/ch/wwmm/opsin/resources/"); this.wordRules = new WordRules(resources); this.resourceManager = new ResourceManager(resources); this.parseRules = new ParseRules(this.resourceManager); this.tokeniser = new Tokeniser(parseRules); } /**Initialises the parser. * @param resourceManager * @param tokeniser * @param wordRules */ Parser(WordRules wordRules, Tokeniser tokeniser, ResourceManager resourceManager) { this.wordRules =wordRules; this.resourceManager = resourceManager; this.tokeniser = tokeniser; this.parseRules = tokeniser.getParseRules(); } /**Parses a chemical name to an XML representation of the parse. * @param n2sConfig * * @param name The name to parse. * @return The parse. * @throws ParsingException If the name is unparsable. */ List<Element> parse(NameToStructureConfig n2sConfig, String name) throws ParsingException { Integer[] componentRatios = null; if (name.endsWith(")") || name.endsWith("]") || name.endsWith("}")){ Matcher m = matchStoichiometryIndication.matcher(name); if (m.find()){ componentRatios = processStoichiometryIndication(m.group()); name = m.replaceAll(""); } } Parse parse = null; if (name.contains(", ")){ try{ TokenizationResult tokenizationResult = tokeniser.tokenize(CASTools.uninvertCASName(name, parseRules), false); if (tokenizationResult.isSuccessfullyTokenized()){ parse = tokenizationResult.getParse(); } } catch (ParsingException ignored) { } } else if (name.contains("; ")){//a mixture, spaces are sufficient for OPSIN to treat as a mixture. 
These spaces for obvious reasons must not be removed TokenizationResult tokenizationResult = tokeniser.tokenize(matchSemiColonSpace.matcher(name).replaceAll(" "), false); if (tokenizationResult.isSuccessfullyTokenized()){ parse = tokenizationResult.getParse(); } } boolean allowSpaceRemoval; if (parse == null) { allowSpaceRemoval = true; TokenizationResult tokenizationResult = tokeniser.tokenize(name , true); if (tokenizationResult.isSuccessfullyTokenized()){ parse = tokenizationResult.getParse(); } else{ if (n2sConfig.isDetailedFailureAnalysis()){ generateExactParseFailureReason(tokenizationResult, name); } else{ throw new ParsingException(name + " is unparsable due to the following being uninterpretable: " + tokenizationResult.getUninterpretableName() + " The following was not parseable: " +tokenizationResult.getUnparsableName()); } } } else { allowSpaceRemoval = false; } List<Parse> parses = generateParseCombinations(parse); if (parses.isEmpty()) { throw new ParsingException("No parses could be found for " + name); } List<Element> results = new ArrayList<>(); ParsingException preciseException = null; for(Parse pp : parses) { Element moleculeEl = new GroupingEl(MOLECULE_EL); moleculeEl.addAttribute(new Attribute(NAME_ATR, name)); for(ParseWord pw : pp.getWords()) { Element word = new GroupingEl(WORD_EL); moleculeEl.addChild(word); List<ParseTokens> parseTokens = pw.getParseTokens(); if (parseTokens.size() != 1){ throw new ParsingException("OPSIN bug: parseWord should have exactly 1 annotation after creating additional parses step"); } ParseTokens tokensForWord = parseTokens.get(0); WordType wordType = OpsinTools.determineWordType(tokensForWord.getAnnotations()); word.addAttribute(new Attribute(TYPE_ATR, wordType.toString())); String value = pw.getWord(); if (value.startsWith("-")) { //we want -functionalterm to be the same as functionalterm value = value.substring(1); } word.addAttribute(new Attribute(VALUE_ATR, value)); writeWordXML(word, tokensForWord.getTokens(), 
WordTools.chunkAnnotations(tokensForWord.getAnnotations())); } /* All words are placed into a wordRule. * Often multiple words in the same wordRule. * WordRules can be nested within each other e.g. in Carbonyl cyanide m-chlorophenyl hydrazone -> * Carbonyl cyanide m-chlorophenyl hydrazone */ try { wordRules.groupWordsIntoWordRules(moleculeEl, n2sConfig, allowSpaceRemoval, componentRatios); } catch (ParsingException e) { if(LOG.isDebugEnabled()) { LOG.debug(e.getMessage(), e); } // Using that parse no word rules matched continue; } try{ if (componentRatios != null){ applyStoichiometryIndicationToWordRules(moleculeEl, componentRatios); } if (moleculeEl.getAttributeValue(ISSALT_ATR) != null && moleculeEl.getChildElements(WORDRULE_EL).size() < 2) { throw new ParsingException(name + " is apparently a salt, but the name only contained one component. The name could be describing a class of compounds"); } results.add(moleculeEl); } catch (ParsingException e) { preciseException = e; } } if (results.isEmpty()) { if (preciseException != null) { throw preciseException; } throw new ParsingException(name + " could be parsed but OPSIN was unsure of the meaning of the words. 
This error will occur, by default, if a name is just a substituent"); } return results; } static Integer[] processStoichiometryIndication(String ratioString) throws ParsingException { ratioString = ratioString.trim(); ratioString = ratioString.substring(1, ratioString.length()-1); String[] ratioStrings = ratioString.split(":"); if (ratioStrings.length ==1){ ratioStrings = ratioString.split("/"); } Integer[] componentRatios = new Integer[ratioStrings.length]; for (int i = 0; i < ratioStrings.length; i++) { String currentRatio = ratioStrings[i]; if (currentRatio.contains("/")){ throw new ParsingException("Unexpected / in component ratio declaration"); } if (currentRatio.equals("?")){ componentRatios[i] = 1; } else{ componentRatios[i] = Integer.parseInt(currentRatio); } } return componentRatios; } private void generateExactParseFailureReason(TokenizationResult tokenizationResult, String name) throws ParsingException { ReverseParseRules reverseParseRules; try { reverseParseRules = new ReverseParseRules(resourceManager); } catch (IOException e) { throw new RuntimeException("Failed to load resources for parsing names from right to left!",e); } String uninterpretableLR = tokenizationResult.getUninterpretableName(); String unparseableLR = tokenizationResult.getUnparsableName(); TokenizationResult reverseTokenizationResult = tokeniser.tokenizeRightToLeft(reverseParseRules, uninterpretableLR, true); String uninterpretableRL = reverseTokenizationResult.getUninterpretableName(); String unparseableRL = reverseTokenizationResult.getUnparsableName(); int indiceToTruncateUpTo = uninterpretableLR.length()-unparseableLR.length(); StringBuilder message = new StringBuilder(); message.append(name); if (!uninterpretableRL.equals("")){ message.append(" was uninterpretable due to the following section of the name: "); message.append(uninterpretableRL); if (indiceToTruncateUpTo <= unparseableRL.length()){ String uninterpretableInContext = unparseableRL.substring(indiceToTruncateUpTo); if 
(!uninterpretableInContext.equals("")){ message.append(" The following was not understandable in the context it was used: "); message.append(uninterpretableInContext); } } } else{ message.append(" has no tokens unknown to OPSIN but does not conform to its grammar. "); message.append("From left to right it is unparsable due to the following being uninterpretable:"); message.append(uninterpretableLR); message.append(" The following of which was not parseable: "); message.append(unparseableLR); } throw new ParsingException(message.toString()); } /** * For cases where any of the parse's parseWords contain multiple annotations create a * parse for each possibility. Hence after this process there may be multiple Parse objects but * the parseWords they contain will each only have one parseTokens object. * @param parse * @return * @throws ParsingException */ private List generateParseCombinations(Parse parse) throws ParsingException { int numberOfCombinations = 1; List parseWords = parse.getWords(); for (ParseWord pw : parseWords) { int parsesForWord = pw.getParseTokens().size(); numberOfCombinations *= parsesForWord; if (numberOfCombinations > 128){//checked here to avoid integer overflow on inappropriate input throw new ParsingException("Too many different combinations of word interpretation are possible (>128) i.e. 
name contains too many terms that OPSIN finds ambiguous to interpret"); } } if (numberOfCombinations ==1){ return Arrays.asList(parse); } List parses = new ArrayList<>(); Deque parseQueue = new ArrayDeque<>(); parseQueue.add(new Parse(parse.getName())); while (!parseQueue.isEmpty()){ Parse currentParse = parseQueue.removeFirst(); int wordsInCurrentParse = currentParse.getWords().size(); if(wordsInCurrentParse == parseWords.size()) { parses.add(currentParse); } else { ParseWord referenceWord = parseWords.get(wordsInCurrentParse); List referenceWordParseTokens = referenceWord.getParseTokens(); for (int i = referenceWordParseTokens.size()-1; i >=0; i--) { ParseTokens parseTokens = referenceWordParseTokens.get(i); Parse parseWithNextWord = i > 0 ? currentParse.deepCopy() : currentParse; ParseWord newParseWord = new ParseWord(referenceWord.getWord(), Arrays.asList(parseTokens)); parseWithNextWord.addWord(newParseWord); parseQueue.add(parseWithNextWord); } } } return parses; } /**Write the XML corresponding to a particular word in a parse. * * @param wordEl The empty XML word element to be written into. * @param tokens The list of tokens. * @param annotations The lists of annotations. 
This has been divided into a separate list per substituent/root/functionalTerm * @throws ParsingException */ void writeWordXML(Element wordEl, List<String> tokens, List<List<Character>> annotations) throws ParsingException { int annotNumber = 0; int annotPos = 0; Element chunk = new GroupingEl(SUBSTITUENT_EL); wordEl.addChild(chunk); Element lastTokenElement = null; for (String token : tokens) { if (annotPos >= annotations.get(annotNumber).size()) { annotPos = 0; annotNumber++; chunk = new GroupingEl(SUBSTITUENT_EL); wordEl.addChild(chunk); lastTokenElement = null; } Element tokenElement = resourceManager.makeTokenElement(token, annotations.get(annotNumber).get(annotPos)); if (tokenElement != null) {//null for tokens that have ignoreWhenWritingXML set chunk.addChild(tokenElement); lastTokenElement = tokenElement; } else if (lastTokenElement!=null && token.length() > 0){ if (lastTokenElement.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR) != null){ lastTokenElement.getAttribute(SUBSEQUENTUNSEMANTICTOKEN_ATR).setValue(lastTokenElement.getAttributeValue(SUBSEQUENTUNSEMANTICTOKEN_ATR) + token); } else{ lastTokenElement.addAttribute(new Attribute(SUBSEQUENTUNSEMANTICTOKEN_ATR, token)); } } annotPos++; } WordType wordType = WordType.valueOf(wordEl.getAttributeValue(TYPE_ATR)); if(wordType == WordType.full) { chunk.setName(ROOT_EL); } else if(wordType == WordType.functionalTerm) { chunk.setName(FUNCTIONALTERM_EL); } } /** * Assigns an indication of stoichiometry to each child word rule of the moleculeEl. * Throws an exception if there is a mismatch between the number of word rules and ratios. * @param moleculeEl * @param componentRatios * @throws ParsingException */ private void applyStoichiometryIndicationToWordRules(Element moleculeEl,Integer[] componentRatios) throws ParsingException { List<Element> wordRules = moleculeEl.getChildElements(WORDRULE_EL); if (wordRules.size()!=componentRatios.length){ throw new ParsingException("Component and stoichiometry indication mismatch. 
OPSIN believes there to be " +wordRules.size() +" components but " + componentRatios.length +" ratios were given!"); } for (int i = 0; i < componentRatios.length; i++) { wordRules.get(i).addAttribute(new Attribute(STOICHIOMETRY_ATR,String.valueOf(componentRatios[i]))); } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ParsingException.java000066400000000000000000000006661451751637500300110ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /**Thrown during finite-state parsing. * * @author ptc24 * */ public class ParsingException extends Exception { private static final long serialVersionUID = 1L; ParsingException() { super(); } ParsingException(String message) { super(message); } ParsingException(String message, Throwable cause) { super(message, cause); } ParsingException(Throwable cause) { super(cause); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/PreProcessingException.java000066400000000000000000000007061451751637500311640ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /**Thrown during preprocessing. 
* * @author dl387 * */ class PreProcessingException extends Exception { private static final long serialVersionUID = 1L; PreProcessingException() { super(); } PreProcessingException(String message) { super(message); } PreProcessingException(String message, Throwable cause) { super(message, cause); } PreProcessingException(Throwable cause) { super(cause); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/PreProcessor.java000066400000000000000000000121731451751637500271510ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.HashMap; import java.util.Locale; import java.util.Map; /** * Takes a name: * strips leading/trailing white space * Normalises representation of greeks and some other characters * @author dl387 * */ class PreProcessor { private static final Map DOTENCLOSED_TO_DESIRED = new HashMap<>(); private static final Map XMLENTITY_TO_DESIRED = new HashMap<>(); static { DOTENCLOSED_TO_DESIRED.put("a", "alpha"); DOTENCLOSED_TO_DESIRED.put("b", "beta"); DOTENCLOSED_TO_DESIRED.put("g", "gamma"); DOTENCLOSED_TO_DESIRED.put("d", "delta"); DOTENCLOSED_TO_DESIRED.put("e", "epsilon"); DOTENCLOSED_TO_DESIRED.put("l", "lambda"); DOTENCLOSED_TO_DESIRED.put("x", "xi"); DOTENCLOSED_TO_DESIRED.put("alpha", "alpha"); DOTENCLOSED_TO_DESIRED.put("beta", "beta"); DOTENCLOSED_TO_DESIRED.put("gamma", "gamma"); DOTENCLOSED_TO_DESIRED.put("delta", "delta"); DOTENCLOSED_TO_DESIRED.put("epsilon", "epsilon"); DOTENCLOSED_TO_DESIRED.put("zeta", "zeta"); DOTENCLOSED_TO_DESIRED.put("eta", "eta"); DOTENCLOSED_TO_DESIRED.put("lambda", "lambda"); DOTENCLOSED_TO_DESIRED.put("xi", "xi"); DOTENCLOSED_TO_DESIRED.put("omega", "omega"); DOTENCLOSED_TO_DESIRED.put("fwdarw", "->"); XMLENTITY_TO_DESIRED.put("alpha", "alpha"); XMLENTITY_TO_DESIRED.put("beta", "beta"); XMLENTITY_TO_DESIRED.put("gamma", "gamma"); XMLENTITY_TO_DESIRED.put("delta", "delta"); XMLENTITY_TO_DESIRED.put("epsilon", "epsilon"); XMLENTITY_TO_DESIRED.put("zeta", "zeta"); 
XMLENTITY_TO_DESIRED.put("eta", "eta"); XMLENTITY_TO_DESIRED.put("lambda", "lambda"); XMLENTITY_TO_DESIRED.put("xi", "xi"); XMLENTITY_TO_DESIRED.put("omega", "omega"); } /** * Master method for PreProcessing * @param chemicalName * @return * @throws PreProcessingException */ static String preProcess(String chemicalName) throws PreProcessingException { chemicalName = chemicalName.trim();//remove leading and trailing whitespace if (chemicalName.length() == 0){ throw new PreProcessingException("Input chemical name was blank!"); } chemicalName = performMultiCharacterReplacements(chemicalName); chemicalName = StringTools.convertNonAsciiAndNormaliseRepresentation(chemicalName); return chemicalName; } private static String performMultiCharacterReplacements(String chemicalName) { StringBuilder sb = new StringBuilder(chemicalName.length()); for (int i = 0, nameLength = chemicalName.length(); i < nameLength; i++) { char ch = chemicalName.charAt(i); switch (ch) { case '$': if (i + 1 < nameLength){ char letter = chemicalName.charAt(i + 1); String replacement = getReplacementForDollarGreek(letter); if (replacement != null){ sb.append(replacement); i++; break; } } sb.append(ch); break; case '.': //e.g. .alpha. String dotEnclosedString = getLowerCasedDotEnclosedString(chemicalName, i); String dotEnclosedReplacement = DOTENCLOSED_TO_DESIRED.get(dotEnclosedString); if (dotEnclosedReplacement != null){ sb.append(dotEnclosedReplacement); i = i + dotEnclosedString.length() + 1; break; } sb.append(ch); break; case '&': { //e.g. 
&alpha; String xmlEntityString = getLowerCasedXmlEntityString(chemicalName, i); String xmlEntityReplacement = XMLENTITY_TO_DESIRED.get(xmlEntityString); if (xmlEntityReplacement != null){ sb.append(xmlEntityReplacement); i = i + xmlEntityString.length() + 1; break; } sb.append(ch); break; } case 's': case 'S'://correct British spelling to the IUPAC spelling if (chemicalName.regionMatches(true, i + 1, "ulph", 0, 4)){ sb.append("sulf"); i = i + 4; break; } sb.append(ch); break; default: sb.append(ch); } } return sb.toString(); } private static String getLowerCasedDotEnclosedString(String chemicalName, int indexOfFirstDot) { int end = -1; int limit = Math.min(indexOfFirstDot + 9, chemicalName.length()); for (int j = indexOfFirstDot + 1; j < limit; j++) { if (chemicalName.charAt(j) == '.'){ end = j; break; } } if (end > 0){ return chemicalName.substring(indexOfFirstDot + 1, end).toLowerCase(Locale.ROOT); } return null; } private static String getLowerCasedXmlEntityString(String chemicalName, int indexOfAmpersand) { int end = -1; int limit = Math.min(indexOfAmpersand + 9, chemicalName.length()); for (int j = indexOfAmpersand + 1; j < limit; j++) { if (chemicalName.charAt(j) == ';'){ end = j; break; } } if (end > 0){ return chemicalName.substring(indexOfAmpersand + 1, end).toLowerCase(Locale.ROOT); } return null; } private static String getReplacementForDollarGreek(char ch) { switch (ch) { case 'a' : return "alpha"; case 'b' : return "beta"; case 'g' : return "gamma"; case 'd' : return "delta"; case 'e' : return "epsilon"; case 'l' : return "lambda"; default: return null; } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/PropertyKey.java000066400000000000000000000010071451751637500270120ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * * @author dl387 * * @param <T> */ class PropertyKey<T> { private final String name; public PropertyKey(String name) { this.name = name; } @Override public int hashCode() { return 37 * (name != null ? 
name.hashCode() : 0); } @Override public boolean equals(Object obj) { return this == obj; } @Override public String toString() { return "Key{" + name + "}"; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ResourceGetter.java000066400000000000000000000127271451751637500274720ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.File; import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.io.OutputStream; import java.net.URL; import java.nio.charset.StandardCharsets; import javax.xml.stream.XMLInputFactory; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamReader; import org.apache.commons.io.IOUtils; import org.codehaus.stax2.XMLInputFactory2; import com.ctc.wstx.stax.WstxInputFactory; /** * Handles I/O: * Gets resource files from packages, which is useful for including data from the JAR file. * Provides OutputStreams for the serialisation of automata. * * @author ptc24 * @author dl387 * */ class ResourceGetter { private static final XMLInputFactory xmlInputFactory; private final String resourcePath; private final String workingDirectory; static { xmlInputFactory = new WstxInputFactory(); xmlInputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false); xmlInputFactory.setProperty(XMLInputFactory2.P_AUTO_CLOSE_INPUT, true); } /** * Sets up a resourceGetter to get resources from a particular path. * /-separated - e.g. uk.ac.cam.ch.wwmm.opsin.resources should be * /uk/ac/cam/ch/wwmm/opsin/resources/ * * @param resourcePath The /-separated resource path. 
*/ ResourceGetter(String resourcePath) { if(resourcePath.startsWith("/")) { resourcePath = resourcePath.substring(1); } this.resourcePath = resourcePath; String workingDirectory; try { workingDirectory = new File(".").getCanonicalPath();//works on linux unlike using the system property } catch (IOException e) { //Automata will not be serialisable workingDirectory = null; } this.workingDirectory = workingDirectory; } /** * Gets the resourcePath used to initialise this ResourceGetter * @return */ String getResourcePath() { return resourcePath; } /**Fetches a data file from resourcePath, * and returns an XML stream reader for it * * @param name The name of the file to parse. * @return An XMLStreamReader * @throws IOException */ XMLStreamReader getXMLStreamReader(String name) throws IOException { if(name == null){ throw new IllegalArgumentException("Input to function was null"); } try { if (workingDirectory != null){ File f = getFile(name); if(f != null) { return xmlInputFactory.createXMLStreamReader(new FileInputStream(f)); } } ClassLoader l = getClass().getClassLoader(); URL url = l.getResource(resourcePath + name); if (url == null){ throw new IOException("URL for resource: " + resourcePath + name + " is invalid"); } return xmlInputFactory.createXMLStreamReader(url.openStream()); } catch (XMLStreamException e) { throw new IOException("Validity exception occurred while reading the XML file with name:" +name, e); } } private File getFile(String name) { File f = new File(getResDir(), name); if(f.isFile()){ return f; } return null; } private File getResDir() { File resourcesTop = new File(workingDirectory, "resources"); return new File(resourcesTop, resourcePath); } /**Fetches a data file from resourcePath, and returns the entire contents * as a string. * * @param name The file to fetch. 
* @return The contents of the file as a string or "" if an IOException occurred */ String getFileContentsAsString(String name){ if(name == null){ throw new IllegalArgumentException("Input to function was null"); } try (InputStreamReader is = new InputStreamReader(getInputstreamFromFileName(name), StandardCharsets.UTF_8)) { return IOUtils.toString(is); } catch (IOException e) { return ""; } } /**Fetches a data file from the working directory or resourcePath as an InputStream. * * @param name The name of the file to get an InputStream of. * @return An InputStream corresponding to the file. * @throws IOException */ InputStream getInputstreamFromFileName(String name) throws IOException { if(name == null){ throw new IllegalArgumentException("Input to function was null"); } if (workingDirectory!=null){ File f = getFile(name); if(f != null) { return new FileInputStream(f); } } ClassLoader l = getClass().getClassLoader(); URL url = l.getResource(resourcePath + name); if (url == null){ throw new IOException("URL for resource: " + resourcePath + name + " is invalid"); } return url.openStream(); } /**Sets up an output stream to which a resource file can be written; this * resource file will be in a subdirectory of the resources directory in * the working directory. * * @param name The name of the file to write. * @return The output stream. 
* @throws IOException */ OutputStream getOutputStream(String name) throws IOException { if(name == null){ throw new IllegalArgumentException("Input to function was null"); } File f = getFileForWriting(name); return new FileOutputStream(f); } private File getFileForWriting(String name) throws IOException { File resourcesTop = new File(workingDirectory, "resources"); File resDir = new File(resourcesTop, resourcePath); if(!resDir.exists()){ if (!resDir.mkdirs()){ throw new IOException("Failed to generate requested directories to create: " + name); } } return new File(resDir, name); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ResourceManager.java000066400000000000000000000422201451751637500276010ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.IOException; import java.util.Arrays; import java.util.HashMap; import java.util.Map; import java.util.regex.Matcher; import java.util.regex.Pattern; import javax.xml.stream.XMLStreamConstants; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamReader; import dk.brics.automaton.RunAutomaton; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; /**Holds all of the tokens used in parsing of chemical names. * Holds all automata * Generates XML Elements for tokens. 
* * @author ptc24 * @author dl387 * */ class ResourceManager { private static final TokenEl IGNORE_WHEN_WRITING_PARSE_TREE = new TokenEl(""); /**Used to load XML files.*/ private final ResourceGetter resourceGetter; /**Used to serialise and deserialise automata.*/ private final AutomatonInitialiser automatonInitialiser; /**A mapping between primitive tokens, and annotation->Token object mappings.*/ private final HashMap<String, Map<Character, TokenEl>> tokenDict = new HashMap<>(); /**A mapping between regex tokens, and annotation->Token object mappings.*/ private final HashMap<Character, TokenEl> reSymbolTokenDict = new HashMap<>(); /**A mapping between annotation symbols and a trie of tokens.*/ private final OpsinRadixTrie[] symbolTokenNamesDict; /**A mapping between annotation symbols and DFAs (annotation->automata mapping).*/ private final RunAutomaton[] symbolRegexAutomataDict; /**A mapping between annotation symbols and regex patterns (annotation->regex pattern mapping).*/ private final Pattern[] symbolRegexesDict; /**The automaton which describes the grammar of a chemical name from left to right*/ private final RunAutomaton chemicalAutomaton; /**As symbolTokenNamesDict but the tokens are reversed*/ private OpsinRadixTrie[] symbolTokenNamesDictReversed; /**As symbolRegexAutomataDict but automata are reversed */ private RunAutomaton[] symbolRegexAutomataDictReversed; /**As symbolRegexesDict but regexes match the end of string */ private Pattern[] symbolRegexesDictReversed; /**The automaton which describes the grammar of a chemical name from right to left*/ private RunAutomaton reverseChemicalAutomaton; /**Generates the ResourceManager. * This involves reading in the token files, the regexToken file (regexTokens.xml) and the grammar file (regexes.xml). * DFA are built or retrieved for the regexTokens and the chemical grammar. * * Throws an exception if the XML token and regex files can't be read in properly or the grammar cannot be built. 
* @param resourceGetter * @throws IOException */ ResourceManager(ResourceGetter resourceGetter) throws IOException { this.resourceGetter = resourceGetter; this.automatonInitialiser = new AutomatonInitialiser(resourceGetter.getResourcePath() + "serialisedAutomata/"); chemicalAutomaton = processChemicalGrammar(false); int grammarSymbolsSize = chemicalAutomaton.getCharIntervals().length; symbolTokenNamesDict = new OpsinRadixTrie[grammarSymbolsSize]; symbolRegexAutomataDict = new RunAutomaton[grammarSymbolsSize]; symbolRegexesDict = new Pattern[grammarSymbolsSize]; processTokenFiles(false); processRegexTokenFiles(false); } /** * Processes tokenFiles * @param reversed Should the tokens be reversed * @throws IOException */ private void processTokenFiles(boolean reversed) throws IOException { XMLStreamReader filesToProcessReader = resourceGetter.getXMLStreamReader("index.xml"); try { while (filesToProcessReader.hasNext()) { int event = filesToProcessReader.next(); if (event == XMLStreamConstants.START_ELEMENT && filesToProcessReader.getLocalName().equals("tokenFile")) { String fileName = filesToProcessReader.getElementText(); processTokenFile(fileName, reversed); } } } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading index.xml", e); } finally { try { filesToProcessReader.close(); } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading index.xml", e); } } } private void processTokenFile(String fileName, boolean reversed) throws IOException { XMLStreamReader reader = resourceGetter.getXMLStreamReader(fileName); try { while (reader.hasNext()) { if (reader.next() == XMLStreamConstants.START_ELEMENT) { String tagName = reader.getLocalName(); if (tagName.equals("tokenLists")) { while (reader.hasNext()) { switch (reader.next()) { case XMLStreamConstants.START_ELEMENT: if (reader.getLocalName().equals("tokenList")) { processTokenList(reader, reversed); } break; } } } else if 
(tagName.equals("tokenList")) { processTokenList(reader, reversed); } } } } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading " + fileName, e); } finally { try { reader.close(); } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading " + fileName, e); } } } private void processTokenList(XMLStreamReader reader, boolean reversed) throws XMLStreamException { String tokenTagName = null; Character symbol = null; String type = null; String subType = null; boolean ignoreWhenWritingXML = false; for (int i = 0, l = reader.getAttributeCount(); i < l; i++) { String atrName = reader.getAttributeLocalName(i); String atrValue = reader.getAttributeValue(i); if (atrName.equals("tagname")){ tokenTagName = atrValue; } else if (atrName.equals("symbol")){ symbol = atrValue.charAt(0); } else if (atrName.equals(TYPE_ATR)){ type = atrValue; } else if (atrName.equals(SUBTYPE_ATR)){ subType = atrValue; } else if (atrName.equals("ignoreWhenWritingXML")){ ignoreWhenWritingXML = atrValue.equals("yes"); } else{ throw new RuntimeException("Malformed tokenlist"); } } if (tokenTagName == null || symbol == null) { throw new RuntimeException("Malformed tokenlist"); } int index = Arrays.binarySearch(chemicalAutomaton.getCharIntervals(), symbol); if (index < 0) { throw new RuntimeException(symbol +" is associated with a tokenList of tagname " + tokenTagName +" however it is not actually used in OPSIN's grammar!!!"); } while (reader.hasNext()) { switch (reader.next()) { case XMLStreamConstants.START_ELEMENT: if (reader.getLocalName().equals("token")) { TokenEl el; if (ignoreWhenWritingXML) { el = IGNORE_WHEN_WRITING_PARSE_TREE; } else{ el = new TokenEl(tokenTagName); if (type != null) { el.addAttribute(TYPE_ATR, type); } if (subType != null) { el.addAttribute(SUBTYPE_ATR, subType); } for (int i = 0, l = reader.getAttributeCount(); i < l; i++) { el.addAttribute(reader.getAttributeLocalName(i), 
reader.getAttributeValue(i)); } } String text = reader.getElementText(); StringBuilder sb = new StringBuilder(text.length()); for (int i = 0, len = text.length(); i < len; i++) { char ch = text.charAt(i); if (ch == '\\') { if (i + 1 >= len) { throw new RuntimeException("Malformed token text: " + text); } ch = text.charAt(++i); } else if (ch == '|') { addToken(sb.toString(), el, symbol, index, reversed); sb.setLength(0); continue; } sb.append(ch); } addToken(sb.toString(), el, symbol, index, reversed); } break; case XMLStreamConstants.END_ELEMENT: if (reader.getLocalName().equals("tokenList")) { return; } break; } } } private void addToken(String text, TokenEl el, Character symbol, int index, boolean reversed) { Map symbolToToken = tokenDict.get(text); if(symbolToToken == null) { symbolToToken = new HashMap<>(); tokenDict.put(text, symbolToToken); } symbolToToken.put(symbol, el); if (!reversed){ OpsinRadixTrie trie = symbolTokenNamesDict[index]; if(trie == null) { trie = new OpsinRadixTrie(); symbolTokenNamesDict[index] = trie; } trie.addToken(text); } else{ OpsinRadixTrie trie = symbolTokenNamesDictReversed[index]; if(trie == null) { trie = new OpsinRadixTrie(); symbolTokenNamesDictReversed[index] = trie; } trie.addToken(new StringBuilder(text).reverse().toString()); } } private void processRegexTokenFiles(boolean reversed) throws IOException{ XMLStreamReader reader = resourceGetter.getXMLStreamReader("regexTokens.xml"); Map tempRegexes = new HashMap<>(); Pattern matchRegexReplacement = Pattern.compile("%.*?%"); try { while (reader.hasNext()) { if (reader.next() == XMLStreamConstants.START_ELEMENT) { String localName = reader.getLocalName(); if (!localName.equals("regex") && !localName.equals("regexToken")){ continue; } String re = reader.getAttributeValue(null, "regex"); Matcher m = matchRegexReplacement.matcher(re); StringBuilder newValueSB = new StringBuilder(); int position = 0; while(m.find()) {//replace sections enclosed in %..% with the appropriate regex 
newValueSB.append(re.substring(position, m.start())); StringBuilder replacement = tempRegexes.get(m.group()); if (replacement == null){ throw new RuntimeException("Regex entry for: " + m.group() + " missing! Check regexTokens.xml"); } newValueSB.append(replacement); position = m.end(); } newValueSB.append(re.substring(position)); if (localName.equals("regex")) { String regexName = reader.getAttributeValue(null, "name"); if (regexName == null){ throw new RuntimeException("Regex entry in regexTokens.xml with no name. regex: " + newValueSB.toString()); } tempRegexes.put(regexName, newValueSB); continue; } addRegexToken(reader, newValueSB.toString(), reversed); } } } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading regexTokens.xml", e); } finally { try { reader.close(); } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading regexTokens.xml", e); } } } private void addRegexToken(XMLStreamReader reader, String regex, boolean reversed) { String tokenTagName = null; Character symbol = null; String type = null; String subType = null; String value = null; boolean determinise = false; boolean ignoreWhenWritingXML = false; for (int i = 0, l = reader.getAttributeCount(); i < l; i++) { String atrName = reader.getAttributeLocalName(i); String atrValue = reader.getAttributeValue(i); if (atrName.equals("tagname")){ tokenTagName = atrValue; } else if (atrName.equals("symbol")){ symbol = atrValue.charAt(0); } else if (atrName.equals(TYPE_ATR)){ type = atrValue; } else if (atrName.equals(SUBTYPE_ATR)){ subType = atrValue; } else if (atrName.equals("value")){ value = atrValue; } else if (atrName.equals("determinise")){ determinise = atrValue.equals("yes"); } else if (atrName.equals("ignoreWhenWritingXML")){ ignoreWhenWritingXML = atrValue.equals("yes"); } else if (!atrName.equals("regex")){ throw new RuntimeException("Malformed regexToken"); } } if (tokenTagName == null || symbol == null) { throw 
new RuntimeException("Malformed regexToken"); } if (!reversed) { //reSymbolTokenDict will be populated when the constructor is called for left-right parsing, hence skip for right-left if (reSymbolTokenDict.get(symbol) != null) { throw new RuntimeException(symbol +" is associated with multiple regular expressions. The following expression clashes: " + regex +" This should be resolved by combining regular expressions that map the same symbol" ); } if (ignoreWhenWritingXML) { reSymbolTokenDict.put(symbol, IGNORE_WHEN_WRITING_PARSE_TREE); } else{ TokenEl el = new TokenEl(tokenTagName); if (type != null){ el.addAttribute(TYPE_ATR, type); } if (subType != null){ el.addAttribute(SUBTYPE_ATR, subType); } if (value != null){ el.addAttribute(VALUE_ATR, value); } reSymbolTokenDict.put(symbol, el); } } int index = Arrays.binarySearch(chemicalAutomaton.getCharIntervals(), symbol); if (index < 0){ throw new RuntimeException(symbol +" is associated with the regex " + regex +" however it is not actually used in OPSIN's grammar!!!"); } if (!reversed){ if (determinise){//should the regex be compiled into a DFA for faster execution? symbolRegexAutomataDict[index] = automatonInitialiser.loadAutomaton(tokenTagName + "_" + (int)symbol, regex, false, false); } else{ symbolRegexesDict[index] = Pattern.compile(regex); } } else{ if (determinise){//should the regex be compiled into a DFA for faster execution? 
symbolRegexAutomataDictReversed[index] = automatonInitialiser.loadAutomaton(tokenTagName + "_" + (int)symbol, regex, false, true); } else{ symbolRegexesDictReversed[index] = Pattern.compile(regex +"$"); } } } private RunAutomaton processChemicalGrammar(boolean reversed) throws IOException { XMLStreamReader reader = resourceGetter.getXMLStreamReader("regexes.xml"); Map regexDict = new HashMap<>(); Pattern matchRegexReplacement = Pattern.compile("%.*?%"); try { while (reader.hasNext()) { if (reader.next() == XMLStreamConstants.START_ELEMENT && reader.getLocalName().equals("regex")) { String name = reader.getAttributeValue(null, "name"); String value = reader.getAttributeValue(null, "value"); Matcher m = matchRegexReplacement.matcher(value); StringBuilder newValueSB = new StringBuilder(); int position = 0; while(m.find()) { newValueSB.append(value.substring(position, m.start())); StringBuilder replacement = regexDict.get(m.group()); if (replacement == null){ throw new RuntimeException("Regex entry for: " + m.group() + " missing! Check regexes.xml"); } newValueSB.append(replacement); position = m.end(); } newValueSB.append(value.substring(position)); if (regexDict.get(name) != null){ throw new RuntimeException("Regex entry: " + name + " has duplicate definitions! 
Check regexes.xml"); } regexDict.put(name, newValueSB); } } } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading regexes.xml", e); } finally { try { reader.close(); } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading regexes.xml", e); } } String re = regexDict.get("%chemical%").toString(); if (!reversed){ return automatonInitialiser.loadAutomaton("chemical", re, true, false); } else{ return automatonInitialiser.loadAutomaton("chemical", re, true, true); } } synchronized void populatedReverseTokenMappings() throws IOException{ if (reverseChemicalAutomaton == null){ reverseChemicalAutomaton = processChemicalGrammar(true); } int grammarSymbolsSize = reverseChemicalAutomaton.getCharIntervals().length; if (symbolTokenNamesDictReversed == null){ symbolTokenNamesDictReversed = new OpsinRadixTrie[grammarSymbolsSize]; processTokenFiles(true); } if (symbolRegexAutomataDictReversed == null && symbolRegexesDictReversed==null){ symbolRegexAutomataDictReversed = new RunAutomaton[grammarSymbolsSize]; symbolRegexesDictReversed = new Pattern[grammarSymbolsSize]; processRegexTokenFiles(true); } } /**Given a token string and an annotation character, makes the XML element for * the token string. * @param tokenString The token string. * @param symbol The annotation character. * * @return The XML element produced. 
* @throws ParsingException */ TokenEl makeTokenElement(String tokenString, Character symbol) throws ParsingException { Map annotationToToken = tokenDict.get(tokenString); if(annotationToToken != null){ TokenEl token = annotationToToken.get(symbol); if (token != null) { if (token == IGNORE_WHEN_WRITING_PARSE_TREE){ return null; } return token.copy(tokenString); } } TokenEl regexToken = reSymbolTokenDict.get(symbol); if (regexToken != null){ if (regexToken == IGNORE_WHEN_WRITING_PARSE_TREE){ return null; } return regexToken.copy(tokenString); } throw new ParsingException("Parsing Error: This is a bug in the program. A token element could not be found for token: " + tokenString +" using annotation symbol: " +symbol); } RunAutomaton getChemicalAutomaton() { return chemicalAutomaton; } OpsinRadixTrie[] getSymbolTokenNamesDict() { return symbolTokenNamesDict; } RunAutomaton[] getSymbolRegexAutomataDict() { return symbolRegexAutomataDict; } Pattern[] getSymbolRegexesDict() { return symbolRegexesDict; } RunAutomaton getReverseChemicalAutomaton() { return reverseChemicalAutomaton; } OpsinRadixTrie[] getSymbolTokenNamesDictReversed() { return symbolTokenNamesDictReversed; } RunAutomaton[] getSymbolRegexAutomataDictReversed() { return symbolRegexAutomataDictReversed; } Pattern[] getSymbolRegexesDictReversed() { return symbolRegexesDictReversed; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ReverseParseRules.java000066400000000000000000000210551451751637500301430ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.IOException; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; import dk.brics.automaton.RunAutomaton; /** * The same as ParseRules but works from right to left * * Performs finite-state allocation of roles ("annotations") to tokens: * The chemical name is broken down into tokens e.g. 
 * ethyl -->eth yl by applying the chemical grammar in regexes.xml
 * The tokens eth and yl are associated with a letter which is referred to here as an annotation which is the role of the token.
 * These letters are defined in regexes.xml and would in this case have the meaning alkaneStem and inlineSuffix
 *
 * The chemical grammar employs the annotations associated with the tokens when deciding what may follow what has already been seen
 * e.g. you cannot start a chemical name with yl and an optional e is valid after an arylGroup
 *
 * @author dl387
 *
 */
class ReverseParseRules {

	/** A DFA encompassing the grammar of a chemical word. */
	private final RunAutomaton chemAutomaton;
	/** The allowed symbols in chemAutomaton */
	private final char[] stateSymbols;

	private final OpsinRadixTrie[] symbolTokenNamesDictReversed;
	private final RunAutomaton[] symbolRegexAutomataDictReversed;
	private final Pattern[] symbolRegexesDictReversed;

	/**
	 * Creates a right to left parser that can parse a substituent/full/functional word
	 * @param resourceManager
	 * @throws IOException
	 */
	ReverseParseRules(ResourceManager resourceManager) throws IOException{
		resourceManager.populatedReverseTokenMappings();
		this.chemAutomaton = resourceManager.getReverseChemicalAutomaton();
		this.symbolTokenNamesDictReversed = resourceManager.getSymbolTokenNamesDictReversed();
		this.symbolRegexAutomataDictReversed = resourceManager.getSymbolRegexAutomataDictReversed();
		this.symbolRegexesDictReversed = resourceManager.getSymbolRegexesDictReversed();
		this.stateSymbols = chemAutomaton.getCharIntervals();
	}

	/**Determines the possible annotations for a chemical word
	 * Returns a list of parses and how much of the word could not be interpreted
	 * e.g. usually the list will have only one parse and the string will equal ""
	 * For something like ethyloxime.
	 * The list will contain the parse for ethyl and the string will equal "oxime" as it was unparsable
	 * For something like eth no parses would be found and the string will equal "eth"
	 *
	 * @param chemicalWord
	 * @return
	 * @throws ParsingException
	 */
	public ParseRulesResults getParses(String chemicalWord) throws ParsingException {
		AnnotatorState initialState = new AnnotatorState(chemAutomaton.getInitialState(), '\0', chemicalWord.length(), true, null);
		String chemicalWordLowerCase = StringTools.lowerCaseAsciiString(chemicalWord);
		ArrayDeque<AnnotatorState> asStack = new ArrayDeque<>();
		asStack.add(initialState);

		int posInNameOfLastSuccessfulAnnotations = chemicalWord.length();
		List<AnnotatorState> successfulAnnotations = new ArrayList<>();
		AnnotatorState longestAnnotation = initialState;//this is the longest annotation. It does not necessarily end in an accept state
		int stateSymbolsSize = stateSymbols.length;
		while (!asStack.isEmpty()) {
			AnnotatorState as = asStack.removeLast();//depth-first avoids pathological memory consumption if parsing ambiguity is encountered
			int posInName = as.getPosInName();
			if (chemAutomaton.isAccept(as.getState())){
				if (posInName <= posInNameOfLastSuccessfulAnnotations){//this annotation is worthy of consideration
					if (posInName < posInNameOfLastSuccessfulAnnotations){//this annotation is longer than any previously found annotation
						successfulAnnotations.clear();
						posInNameOfLastSuccessfulAnnotations = posInName;
					}
					else if (successfulAnnotations.size() > 128){
						throw new ParsingException("Ambiguity in OPSIN's chemical grammar has produced more than 128 annotations. Parsing has been aborted. 
Please report this as a bug"); } successfulAnnotations.add(as); } } //record the longest annotation found so it can be reported to the user for debugging if (posInName < longestAnnotation.getPosInName()){ longestAnnotation = as; } for (int i = 0; i < stateSymbolsSize; i++) { char annotationCharacter = stateSymbols[i]; int potentialNextState = chemAutomaton.step(as.getState(), annotationCharacter); if (potentialNextState != -1) {//-1 means this state is not accessible from the previous state OpsinRadixTrie possibleTokenisationsTrie = symbolTokenNamesDictReversed[i]; if (possibleTokenisationsTrie != null) { List possibleTokenisations = possibleTokenisationsTrie.findMatchesReadingStringRightToLeft(chemicalWordLowerCase, posInName); if (possibleTokenisations != null) {//next could be a token for (int j = 0, l = possibleTokenisations.size(); j < l; j++) {//typically list size will be 1 so this is faster than an iterator int tokenizationIndex = possibleTokenisations.get(j); AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, false, as); //System.out.println("tokened " + chemicalWordLowerCase.substring(tokenizationIndex, posInName)); asStack.add(newAs); } } } RunAutomaton possibleAutomata = symbolRegexAutomataDictReversed[i]; if (possibleAutomata != null) {//next could be an automaton int matchLength = runInReverse(possibleAutomata, chemicalWord, posInName); if (matchLength != -1){//matchLength = -1 means it did not match int tokenizationIndex = posInName - matchLength; AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, true, as); //System.out.println("neword automata " + chemicalWord.substring(tokenizationIndex, posInName)); asStack.add(newAs); } } Pattern possibleRegex = symbolRegexesDictReversed[i]; if (possibleRegex != null) {//next could be a regex Matcher mat = possibleRegex.matcher(chemicalWord).region(0, posInName); mat.useTransparentBounds(true); if (mat.find()) 
{//match at end (patterns use $ anchor) int tokenizationIndex = posInName - mat.group(0).length(); AnnotatorState newAs = new AnnotatorState(potentialNextState, annotationCharacter, tokenizationIndex, true, as); //System.out.println("neword regex " + mat.group(0)); asStack.add(newAs); } } } } } List outputList = new ArrayList<>(); String uninterpretableName = chemicalWord; String unparseableName = chemicalWord.substring(0, longestAnnotation.getPosInName()); if (successfulAnnotations.size() > 0){//at least some of the name could be interpreted into a substituent/full/functionalTerm int bestAcceptPosInName = -1; for(AnnotatorState as : successfulAnnotations) { outputList.add(convertAnnotationStateToParseTokens(as, chemicalWord, chemicalWordLowerCase)); bestAcceptPosInName = as.getPosInName();//all acceptable annotator states found should have the same posInName } uninterpretableName = chemicalWord.substring(0, bestAcceptPosInName); } return new ParseRulesResults(outputList, uninterpretableName, unparseableName); } /** * Returns the length of the longest accepted run of the given string * starting at pos in the string and working backwards * @param automaton * @param s the string * @param indexAfterFirstchar pos in string to start at * @return length of the longest accepted run, -1 if no run is accepted */ private int runInReverse(RunAutomaton automaton, String s, int indexAfterFirstchar) { int state = automaton.getInitialState(); int max = -1; for (int pos = indexAfterFirstchar -1; ; pos--) { if (automaton.isAccept(state)){ max = indexAfterFirstchar - 1 - pos; } if (pos == -1){ break; } state = automaton.step(state, s.charAt(pos)); if (state == -1){ break; } } return max; } private ParseTokens convertAnnotationStateToParseTokens(AnnotatorState as, String chemicalWord, String chemicalWordLowerCase) { List tokens = new ArrayList<>(); List annotations = new ArrayList<>(); AnnotatorState previousAs; while ((previousAs = as.getPreviousAs()) != null) { if 
(as.isCaseSensitive()) {
				tokens.add(chemicalWord.substring(as.getPosInName(), previousAs.getPosInName()));
			}
			else{
				tokens.add(chemicalWordLowerCase.substring(as.getPosInName(), previousAs.getPosInName()));
			}
			annotations.add(as.getAnnot());
			as = previousAs;
		}
		return new ParseTokens(tokens, annotations);
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Ring.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Class representing a single ring (i.e. NOT a fused ring which is formed from multiple rings)
 * @author dl387
 *
 */
class Ring {
	private final List<Atom> atomList = new ArrayList<>();
	private final List<Bond> bondList;
	private final Map<Bond, Ring> bondToNeighbourRings = new LinkedHashMap<>();

	private List<Atom> cyclicAtomList;
	private List<Bond> cyclicBondList;

	Ring(List<Bond> bondList){
		if (bondList==null || bondList.isEmpty()){
			throw new IllegalArgumentException("Bond list is empty");
		}
		this.bondList = bondList;

		for(Bond bond: bondList){
			Atom atom1 = bond.getFromAtom();
			if (!atomList.contains(atom1)) {
				atomList.add(atom1);
			}

			Atom atom2 = bond.getToAtom();
			if (!atomList.contains(atom2)) {
				atomList.add(atom2);
			}
		}

		if (atomList.size() != bondList.size()) {
			throw new RuntimeException("atomList and bondList different sizes. 
Ring(bond)"); } } List getBondList() { return bondList; } List getAtomList() { return atomList; } /** * Number of ring atoms/bonds * @return */ int size() { return atomList.size(); } int getNumberOfFusedBonds() { return bondToNeighbourRings.size(); } /** * Return bonds utilised in multiple rings * @return List */ List getFusedBonds(){ return new ArrayList<>(bondToNeighbourRings.keySet()); } int getBondIndex(Bond bond){ return cyclicBondList.indexOf(bond); } List getCyclicBondList(){ return cyclicBondList; } List getCyclicAtomList(){ return cyclicAtomList; } List getNeighbours() { return new ArrayList<>(bondToNeighbourRings.values()); } Ring getNeighbourOfFusedBond(Bond fusedBond) { return bondToNeighbourRings.get(fusedBond); } void addNeighbour(Bond bond, Ring ring) { if (this == ring) { throw new IllegalArgumentException("Ring can't be a neighbour of itself"); } bondToNeighbourRings.put(bond, ring); } /** * Stores atoms and bonds in the order defined by atom and bond * @param stBond - the first bond * @param stAtom - the atom defining in which direction to go */ void makeCyclicLists(Bond stBond, Atom stAtom){ if (cyclicBondList==null){ cyclicBondList = new ArrayList<>(); cyclicAtomList = new ArrayList<>(); Atom atom = stAtom; cyclicBondList.add(stBond); cyclicAtomList.add(atom); for (int i=0; i organicAtoms = new HashSet<>(); /**Aromatic Atoms.*/ private static final Set aromaticAtoms = new HashSet<>(); static { organicAtoms.add("B"); organicAtoms.add("C"); organicAtoms.add("N"); organicAtoms.add("O"); organicAtoms.add("P"); organicAtoms.add("S"); organicAtoms.add("F"); organicAtoms.add("Cl"); organicAtoms.add("Br"); organicAtoms.add("I"); aromaticAtoms.add("c"); aromaticAtoms.add("n"); aromaticAtoms.add("o"); aromaticAtoms.add("p"); aromaticAtoms.add("s"); aromaticAtoms.add("si"); aromaticAtoms.add("as"); aromaticAtoms.add("se"); aromaticAtoms.add("sb"); aromaticAtoms.add("te"); } private final IDManager idManager; SMILESFragmentBuilder(IDManager idManager) { 
this.idManager = idManager; } private class ParserInstance { private final Deque stack = new ArrayDeque<>(); private final Map ringClosures = new HashMap<>(); private final String smiles; private final int endOfSmiles; private final Fragment fragment; private final int firstAtomOutValency; private final int lastAtomOutValency; private int i; ParserInstance(String smiles, Fragment fragment) { this.smiles = smiles; this.fragment = fragment; int lastIndex = smiles.length(); char firstChar = smiles.charAt(0);//used by OPSIN to specify the valency with which this fragment connects if (firstChar == '-') { this.firstAtomOutValency = 1; this.i = 1; } else if (firstChar == '=') { this.firstAtomOutValency = 2; this.i = 1; } else if (firstChar == '#') { this.firstAtomOutValency = 3; this.i = 1; } else { this.firstAtomOutValency = -1; this.i = 0; } char lastChar = smiles.charAt(lastIndex - 1);//used by OPSIN to specify the valency with which this fragment connects and to indicate it connects via the last atom in the SMILES if (lastChar == '-') { this.lastAtomOutValency = 1; this.endOfSmiles = lastIndex - 1; } else if (lastChar == '=') { this.lastAtomOutValency = 2; this.endOfSmiles = lastIndex - 1; } else if (lastChar == '#') { this.lastAtomOutValency = 3; this.endOfSmiles = lastIndex - 1; } else { this.lastAtomOutValency = -1; this.endOfSmiles = lastIndex; } } void parseSmiles() throws StructureBuildingException { stack.add(new StackFrame(null, 1)); for (; i < endOfSmiles; i++) { char ch = smiles.charAt(i); switch (ch) { case '(': stack.add(new StackFrame(stack.getLast())); break; case ')': stack.removeLast(); break; case '-': stack.getLast().bondOrder = 1; break; case '=': if (stack.getLast().bondOrder != 1){ throw new StructureBuildingException("= in unexpected position: bond order already defined!"); } stack.getLast().bondOrder = 2; break; case '#': if (stack.getLast().bondOrder != 1){ throw new StructureBuildingException("# in unexpected position: bond order already 
defined!"); } stack.getLast().bondOrder = 3; break; case '/': if (stack.getLast().slash != null){ throw new StructureBuildingException("/ in unexpected position: bond configuration already defined!"); } stack.getLast().slash = SMILES_BOND_DIRECTION.RSLASH; break; case '\\': if (stack.getLast().slash != null){ throw new StructureBuildingException("\\ in unexpected position: bond configuration already defined!"); } stack.getLast().slash = SMILES_BOND_DIRECTION.LSLASH; break; case '.': stack.getLast().atom = null; break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g': case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n': case 'o': case 'p': case 'q': case 'r': case 's': case 't': case 'u': case 'v': case 'w': case 'x': case 'y': case 'z': case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G': case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U': case 'V': case 'W': case 'X': case 'Y': case 'Z': case '*': processOrganicAtom(ch); break; case '[': processBracketedAtom(); break; case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case '%': processRingOpeningOrClosure(ch); break; default: throw new StructureBuildingException(ch + " is in an unexpected position. Check this is not a mistake and that this feature of SMILES is supported by OPSIN's SMILES parser"); } } if (!ringClosures.isEmpty()){ throw new StructureBuildingException("Unmatched ring opening"); } if (firstAtomOutValency > 0) { fragment.addOutAtom(fragment.getFirstAtom(), firstAtomOutValency, true); } if (lastAtomOutValency > 0) { //note that in something like C(=O)- this would be the carbon not the oxygen fragment.addOutAtom(getInscopeAtom(), lastAtomOutValency, true); } } /** * An organic atom e.g. 'C', 'Cl', 'c' etc. 
* @param ch * @throws StructureBuildingException */ private void processOrganicAtom(char ch) throws StructureBuildingException { String elementType = String.valueOf(ch); boolean spareValency = false; if(is_A_to_Z(ch)) {//normal atoms if(i + 1 < endOfSmiles && is_a_to_z(smiles.charAt(i + 1)) && organicAtoms.contains(smiles.substring(i, i + 2))) { elementType = smiles.substring(i, i + 2); i++; } else if (!organicAtoms.contains(elementType)){ throw new StructureBuildingException(elementType + " is not an organic Element. If it is actually an element it should be in square brackets"); } } else if(is_a_to_z(ch)) {//aromatic atoms if (!aromaticAtoms.contains(elementType)){ throw new StructureBuildingException(elementType + " is not an aromatic Element. If it is actually an element it should not be in lower case"); } elementType = String.valueOf((char)(ch - 32)); spareValency = true; } else if (ch == '*') { elementType = "R"; } Atom atom = createAtom(elementType, fragment); atom.setSpareValency(spareValency); fragment.addAtom(atom); StackFrame currentFrame = stack.getLast(); if(currentFrame.atom != null) { Bond b = createBond(currentFrame.atom, atom, currentFrame.bondOrder); if (currentFrame.slash != null){ b.setSmilesStereochemistry(currentFrame.slash); currentFrame.slash = null; } if (currentFrame.atom.getAtomParity() != null){ addAtomToAtomParity(currentFrame.atom.getAtomParity(), atom); } } currentFrame.atom = atom; currentFrame.bondOrder = 1; } /** * square brackets- contain non-organic atoms or where required to set properties such as charge/chirality etc. * e.g. 
[Na+] * @throws StructureBuildingException */ private void processBracketedAtom() throws StructureBuildingException { i++; int indexOfRightSquareBracket = smiles.indexOf(']', i); if (indexOfRightSquareBracket == -1) { throw new StructureBuildingException("[ without matching \"]\""); } // isotope String isotope = ""; while(is_0_to_9(smiles.charAt(i))) { isotope += smiles.charAt(i); i++; } char ch; if (i < indexOfRightSquareBracket){ ch = smiles.charAt(i); i++; } else{ throw new StructureBuildingException("No element found in square brackets"); } // elementType String elementType = String.valueOf(ch); boolean spareValency = false; if(is_A_to_Z(ch)) {//normal atoms if(is_a_to_z(smiles.charAt(i))) { elementType += smiles.charAt(i); i++; } } else if(is_a_to_z(ch)) {//aromatic atoms if(is_a_to_z(smiles.charAt(i))) { if (aromaticAtoms.contains(elementType + smiles.charAt(i))){ elementType = String.valueOf((char)(ch - 32)) + smiles.charAt(i); i++; } else{ throw new StructureBuildingException(elementType + smiles.charAt(i) + " is not an aromatic Element. 
If it is actually an element it should not be in lower case"); } } else{ if (!aromaticAtoms.contains(elementType)){ throw new StructureBuildingException(elementType + " is not an aromatic Element."); } elementType = String.valueOf((char)(ch - 32)); } spareValency = true; } else if (elementType.equals("*")){ elementType = "R"; } else{ throw new StructureBuildingException(elementType + " is not a valid element type!"); } Atom atom = createAtom(elementType, fragment); atom.setSpareValency(spareValency); if (isotope.length() > 0){ atom.setIsotope(Integer.parseInt(isotope)); } fragment.addAtom(atom); StackFrame currentFrame = stack.getLast(); if(currentFrame.atom != null) { Bond b = createBond(currentFrame.atom, atom, currentFrame.bondOrder); if (currentFrame.slash != null){ b.setSmilesStereochemistry(currentFrame.slash); currentFrame.slash = null; } if (currentFrame.atom.getAtomParity() != null){ addAtomToAtomParity(currentFrame.atom.getAtomParity(), atom); } } Atom previousAtom = currentFrame.atom;//needed for setting atomParity elements up currentFrame.atom = atom; currentFrame.bondOrder = 1; Integer hydrogenCount = 0; int charge = 0; Boolean chiralitySet = false; for (; i < indexOfRightSquareBracket; i++) { ch = smiles.charAt(i); if(ch == '@') {// chirality-sets atom parity if (chiralitySet){ throw new StructureBuildingException("Atom parity appeared to be specified twice for an atom in a square bracket!"); } processTetrahedralStereochemistry(atom, previousAtom, fragment.getAtomCount() == 1); chiralitySet = true; } else if (ch == 'H'){// hydrogenCount if (hydrogenCount == null || hydrogenCount != 0){ throw new StructureBuildingException("Hydrogen count appeared to be specified twice for an atom in a square bracket!"); } if (smiles.charAt(i + 1) == '?'){ //extension to allow standard valency (as determined by the group in the periodic table) to dictate hydrogens i++; hydrogenCount = null; } else{ String hydrogenCountString =""; while(is_0_to_9(smiles.charAt(i + 1))) 
{ hydrogenCountString += smiles.charAt(i + 1); i++; } if (hydrogenCountString.length() == 0){ hydrogenCount = 1; } else{ hydrogenCount = Integer.parseInt(hydrogenCountString); } if (atom.hasSpareValency()) { if ((!elementType.equals("C") && !elementType.equals("Si")) || hydrogenCount >=2){ fragment.addIndicatedHydrogen(atom); } } } } else if(ch == '+' || ch == '-') {// formalCharge if (charge != 0){ throw new StructureBuildingException("Charge appeared to be specified twice for an atom in a square bracket!"); } charge = (ch == '+') ? 1 : -1; String changeChargeStr = ""; int changeCharge = 1; while(is_0_to_9(smiles.charAt(i + 1))) {//e.g. [C+2] changeChargeStr += smiles.charAt(i + 1); i++; } if (changeChargeStr.length() == 0){ while(i + 1 < indexOfRightSquareBracket){//e.g. [C++] ch = smiles.charAt(i + 1); if (ch == '+'){ if (charge != 1){ throw new StructureBuildingException("Atom has both positive and negative charges specified!");//e.g. [C+-] } } else if (ch == '-'){ if (charge != -1){ throw new StructureBuildingException("Atom has both negative and positive charges specified!"); } } else{ break; } changeCharge++; i++; } } changeCharge = changeChargeStr.length() == 0 ? 
changeCharge : Integer.parseInt(changeChargeStr); atom.setCharge(charge * changeCharge); } else if(ch == '|') { StringBuilder lambda = new StringBuilder(); while(i < endOfSmiles && is_0_to_9(smiles.charAt(i + 1))) { lambda.append(smiles.charAt(i + 1)); i++; } atom.setLambdaConventionValency(Integer.parseInt(lambda.toString())); } else{ throw new StructureBuildingException("Unexpected character found in square bracket"); } } atom.setProperty(Atom.SMILES_HYDROGEN_COUNT, hydrogenCount); } /** * Adds an atomParity element to the given atom using the information at the current index * @param atom * @param previousAtom * @param isFirstAtom */ private void processTetrahedralStereochemistry(Atom atom, Atom previousAtom, boolean isFirstAtom){ Boolean chiralityClockwise = false; if (smiles.charAt(i + 1) == '@'){ chiralityClockwise = true; i++; } Atom[] atomRefs4 = new Atom[4]; AtomParity atomParity = new AtomParity(atomRefs4, chiralityClockwise ? 1 : -1); int index =0; if (previousAtom != null){ atomRefs4[index] = previousAtom; index++; } else if (isFirstAtom && firstAtomOutValency == 1) { atomRefs4[index] = AtomParity.deoxyHydrogen; index++; } if (smiles.charAt(i + 1) == 'H'){ atomRefs4[index] = AtomParity.hydrogen; //this character will also be checked by the hydrogen count check, hence don't increment i } atom.setAtomParity(atomParity); } /** * Process ring openings and closings e.g. 
		 * the two 1s in c1ccccc1
		 * @param ch
		 * @throws StructureBuildingException
		 */
		private void processRingOpeningOrClosure(char ch) throws StructureBuildingException {
			String closure = String.valueOf(ch);
			if(ch == '%') {
				if (i + 2 < endOfSmiles && is_0_to_9(smiles.charAt(i + 1)) && is_0_to_9(smiles.charAt(i + 2))) {
					closure = smiles.substring(i + 1, i + 3);
					i +=2;
				}
				else{
					throw new StructureBuildingException("A ring opening index after a % must be two digits long");
				}
			}
			if(ringClosures.containsKey(closure)) {
				processRingClosure(closure);
			}
			else {
				if (getInscopeAtom() == null){
					throw new StructureBuildingException("A ring opening has appeared before any atom!");
				}
				processRingOpening(closure);
			}
		}

		private void processRingOpening(String closure) throws StructureBuildingException {
			StackFrame currentFrame = stack.getLast();
			StackFrame sf = new StackFrame(currentFrame);
			if (currentFrame.slash != null){
				sf.slash = currentFrame.slash;
				currentFrame.slash = null;
			}
			AtomParity atomParity = sf.atom.getAtomParity();
			if (atomParity != null){//replace ringclosureX with actual reference to id when it is known
				sf.indexOfDummyAtom = addAtomToAtomParity(atomParity, ringOpeningDummyAtom);
			}
			ringClosures.put(closure, sf);
			currentFrame.bondOrder = 1;
		}

		private void processRingClosure(String closure) throws StructureBuildingException {
			StackFrame sf = ringClosures.remove(closure);
			StackFrame currentFrame = stack.getLast();
			int bondOrder = 1;
			if(sf.bondOrder > 1) {
				if(currentFrame.bondOrder > 1 && sf.bondOrder != currentFrame.bondOrder){
					throw new StructureBuildingException("ring closure has two different bond orders specified!");
				}
				bondOrder = sf.bondOrder;
			}
			else if(currentFrame.bondOrder > 1) {
				bondOrder = currentFrame.bondOrder;
			}
			Bond b;
			if (currentFrame.slash != null) {
				//stereochemistry specified on ring closure
				//special case e.g. 
				//CC1=C/F.O\1 Bond is done from the O to the C due to the presence of the \
				b = createBond(currentFrame.atom, sf.atom, bondOrder);
				b.setSmilesStereochemistry(currentFrame.slash);
				if(sf.slash != null && sf.slash.equals(currentFrame.slash)) {//specified twice check for contradiction
					throw new StructureBuildingException("Contradictory double bond stereoconfiguration");
				}
				currentFrame.slash = null;
			}
			else {
				b = createBond(sf.atom, currentFrame.atom, bondOrder);
				if (sf.slash != null) {
					//stereochemistry specified on ring opening
					b.setSmilesStereochemistry(sf.slash);
				}
			}

			AtomParity currentAtomParity = currentFrame.atom.getAtomParity();
			if (currentAtomParity != null) {
				addAtomToAtomParity(currentAtomParity, sf.atom);
			}

			AtomParity closureAtomParity = sf.atom.getAtomParity();
			if (closureAtomParity != null) {//replace dummy atom with actual atom e.g. N[C@@H]1C.F1 where the 1 initially holds a dummy atom before being replaced with the F atom
				Atom[] atomRefs4 = closureAtomParity.getAtomRefs4();
				if (sf.indexOfDummyAtom == null) {
					throw new RuntimeException("OPSIN Bug: Index of dummy atom representing ring closure atom not set");
				}
				atomRefs4[sf.indexOfDummyAtom] = currentFrame.atom;
			}
			currentFrame.bondOrder = 1;
		}

		/**
		 * Adds an atom at the first null (i.e. not yet filled) position in the atomParity's atomRefs4
		 * @param atomParity
		 * @param atom
		 * @return Returns the index of the atom in the atomParity's atomRefs4
		 * @throws StructureBuildingException
		 */
		private int addAtomToAtomParity(AtomParity atomParity, Atom atom) throws StructureBuildingException {
			Atom[] atomRefs4 = atomParity.getAtomRefs4();
			boolean setAtom = false;
			int i = 0;
			for (; i < atomRefs4.length; i++) {
				if (atomRefs4[i] == null){
					atomRefs4[i] = atom;
					setAtom = true;
					break;
				}
			}
			if (!setAtom){
				throw new StructureBuildingException("Tetrahedral stereocentre specified in SMILES appears to involve more than 4 atoms");
			}
			return i;
		}

		/**
		 * For non-empty SMILES will return the atom at the top of the stack i.e.
the one that will be bonded to next if the SMILES continued * (only valid during execution of and after {@link ParserInstance#parseSmiles()} has been called) * @return */ Atom getInscopeAtom(){ return stack.getLast().atom; } } /** * Build a Fragment based on a SMILES string. * The type/subType of the Fragment are the empty String * The fragment has no locants * * @param smiles The SMILES string to build from. * @return The built fragment. * @throws StructureBuildingException */ Fragment build(String smiles) throws StructureBuildingException { return build(smiles, "", NONE_LABELS_VAL); } /** * Build a Fragment based on a SMILES string. * @param smiles The SMILES string to build from. * @param type The type of the fragment retrieved when calling {@link Fragment#getType()} * @param labelMapping A string indicating which locants to assign to each atom. Can be a slash delimited list, "numeric", "fusedRing" or "none"/"" * @return * @throws StructureBuildingException */ Fragment build(String smiles, String type, String labelMapping) throws StructureBuildingException { return build(smiles, new Fragment(type), labelMapping); } /** * Build a Fragment based on a SMILES string. * @param smiles The SMILES string to build from. * @param tokenEl The corresponding tokenEl * @param labelMapping A string indicating which locants to assign to each atom. Can be a slash delimited list, "numeric", "fusedRing" or "none"/"" * @return Fragment The built fragment. * @throws StructureBuildingException */ Fragment build(String smiles, Element tokenEl, String labelMapping) throws StructureBuildingException { if (tokenEl == null){ throw new IllegalArgumentException("tokenEl is null. 
FragmentManager's DUMMY_TOKEN should be used instead"); } return build(smiles, new Fragment(tokenEl), labelMapping); } private Fragment build(String smiles, Fragment fragment, String labelMapping) throws StructureBuildingException { if (smiles == null) { throw new IllegalArgumentException("SMILES specified is null"); } if (labelMapping == null) { throw new IllegalArgumentException("labelMapping is null use \"none\" if you do not want any numbering or \"numeric\" if you would like default numbering"); } if (smiles.isEmpty()){ return fragment; } ParserInstance instance = new ParserInstance(smiles, fragment); instance.parseSmiles(); List atomList = fragment.getAtomList(); processLabelling(labelMapping, atomList); verifyAndTakeIntoAccountLonePairsInAtomParities(atomList); addBondStereoElements(fragment); for (Atom atom : atomList) { if (atom.getProperty(Atom.SMILES_HYDROGEN_COUNT) != null && atom.getLambdaConventionValency() == null){ setupAtomValency(atom); } } CycleDetector.assignWhetherAtomsAreInCycles(fragment); return fragment; } private void processLabelling(String labelMapping, List atomList) throws StructureBuildingException { if (labelMapping.equals(NONE_LABELS_VAL) || labelMapping.length() == 0) { return; } if (labelMapping.equals(NUMERIC_LABELS_VAL)) { int atomNumber = 1; for (Atom atom : atomList) { atom.addLocant(Integer.toString(atomNumber++)); } } else if(labelMapping.equals(FUSEDRING_LABELS_VAL)) {//fragment is a fusedring with atoms in the correct order for fused ring numbering //this will do stuff like changing labels from 1,2,3,4,5,6,7,8,9,10->1,2,3,4,4a,5,6,7,8,8a FragmentTools.relabelLocantsAsFusedRingSystem(atomList); } else{ String[] labelMap = labelMapping.split("/", -1);//place slash delimited labels into an array int numOfAtoms = atomList.size(); if (labelMap.length != numOfAtoms){ throw new StructureBuildingException("Group numbering has been invalidly defined in resource file: labels: " +labelMap.length + ", atoms: " + numOfAtoms ); } for 
(int i = 0; i < numOfAtoms; i++) { String labels[] = labelMap[i].split(","); for (String label : labels) { if (label.length() > 0) { atomList.get(i).addLocant(label); } } } } } private void verifyAndTakeIntoAccountLonePairsInAtomParities(List atomList) throws StructureBuildingException { for (Atom atom : atomList) { AtomParity atomParity = atom.getAtomParity(); if (atomParity != null){ Atom[] atomRefs4 = atomParity.getAtomRefs4(); int nullAtoms = 0; int hydrogen = 0; for (Atom atomRefs4Atom : atomRefs4) { if (atomRefs4Atom == null){ nullAtoms++; } else if (atomRefs4Atom.equals(AtomParity.hydrogen)){ hydrogen++; } } if (nullAtoms != 0){ if (nullAtoms ==1 && hydrogen==0 && (atom.getElement() == ChemEl.N || atom.getElement() == ChemEl.S || atom.getElement() == ChemEl.Se)){//special case where lone pair is part of the tetrahedron if (atomList.indexOf(atomRefs4[0]) < atomList.indexOf(atom)){//is there an atom in the SMILES in front of the stereocentre? atomRefs4[3] = atomRefs4[2]; atomRefs4[2] = atomRefs4[1]; atomRefs4[1] = atom; } else{ atomRefs4[3] = atomRefs4[2]; atomRefs4[2] = atomRefs4[1]; atomRefs4[1] = atomRefs4[0]; atomRefs4[0] = atom; } } else{ throw new StructureBuildingException("SMILES is malformed. 
Tetrahedral stereochemistry defined on a non tetrahedral centre"); } } } } } private void addBondStereoElements(Fragment currentFrag) throws StructureBuildingException { Set<Bond> bonds = currentFrag.getBondSet(); for (Bond centralBond : bonds) {//identify cases of E/Z stereochemistry and add appropriate bondstereo tags if (centralBond.getOrder() == 2) { List<Bond> fromAtomBonds = centralBond.getFromAtom().getBonds(); for (Bond preceedingBond : fromAtomBonds) { if (preceedingBond.getSmilesStereochemistry() != null) { List<Bond> toAtomBonds = centralBond.getToAtom().getBonds(); for (Bond followingBond : toAtomBonds) { if (followingBond.getSmilesStereochemistry() != null) {//now found a double bond surrounded by two bonds with slashes boolean upFirst; boolean upSecond; Atom atom2 = centralBond.getFromAtom(); Atom atom3 = centralBond.getToAtom(); Atom atom1 = preceedingBond.getOtherAtom(atom2); Atom atom4 = followingBond.getOtherAtom(atom3); if (preceedingBond.getSmilesStereochemistry() == SMILES_BOND_DIRECTION.LSLASH) { upFirst = preceedingBond.getToAtom() == atom2;//in normally constructed SMILES this will be the case but you could write C(/F)=C/F instead of F\C=C/F } else if (preceedingBond.getSmilesStereochemistry() == SMILES_BOND_DIRECTION.RSLASH) { upFirst = preceedingBond.getToAtom() != atom2; } else{ throw new StructureBuildingException(preceedingBond.getSmilesStereochemistry() + " is not a slash!"); } if (followingBond.getSmilesStereochemistry() == SMILES_BOND_DIRECTION.LSLASH) { upSecond = followingBond.getFromAtom() != atom3; } else if (followingBond.getSmilesStereochemistry() == SMILES_BOND_DIRECTION.RSLASH) { upSecond = followingBond.getFromAtom() == atom3; } else{ throw new StructureBuildingException(followingBond.getSmilesStereochemistry() + " is not a slash!"); } BondStereoValue cisTrans = upFirst == upSecond ? BondStereoValue.CIS : BondStereoValue.TRANS; if (centralBond.getBondStereo() != null) { //double bond has redundant specification e.g. 
C/C=C\\1/NC1 hence need to check it is consistent Atom[] atomRefs4 = centralBond.getBondStereo().getAtomRefs4(); if (atomRefs4[0].equals(atom1) || atomRefs4[3].equals(atom4)) { if (centralBond.getBondStereo().getBondStereoValue().equals(cisTrans)){ throw new StructureBuildingException("Contradictory double bond stereoconfiguration"); } } else{ if (!centralBond.getBondStereo().getBondStereoValue().equals(cisTrans)){ throw new StructureBuildingException("Contradictory double bond stereoconfiguration"); } } } else{ Atom[] atomRefs4= new Atom[4]; atomRefs4[0] = atom1; atomRefs4[1] = atom2; atomRefs4[2] = atom3; atomRefs4[3] = atom4; centralBond.setBondStereoElement(atomRefs4, cisTrans); } } } } } } } for (Bond bond : bonds) { bond.setSmilesStereochemistry(null); } } /** * Utilises the atom's hydrogen count as set by the SMILES as well as incoming valency to determine the atom's valency * If the atom is charged whether protons have been added or removed will also need to be determined * @param atom * @throws StructureBuildingException */ private void setupAtomValency(Atom atom) throws StructureBuildingException { int hydrogenCount = atom.getProperty(Atom.SMILES_HYDROGEN_COUNT); int incomingValency = atom.getIncomingValency() + hydrogenCount +atom.getOutValency(); int charge = atom.getCharge(); int absoluteCharge =Math.abs(charge); ChemEl chemEl = atom.getElement(); if (atom.hasSpareValency()) { Integer hwValency = ValencyChecker.getHWValency(chemEl); if (hwValency == null || absoluteCharge > 1) { throw new StructureBuildingException(chemEl +" is not expected to be aromatic!"); } if (absoluteCharge != 0) { Integer[] possibleVal = ValencyChecker.getPossibleValencies(chemEl, charge); if (possibleVal != null && possibleVal.length > 0) { hwValency = possibleVal[0]; } else { throw new StructureBuildingException(chemEl +" with charge " + charge + " is not expected to be aromatic!"); } } if (incomingValency < hwValency){ incomingValency++; } } Integer defaultVal = 
ValencyChecker.getDefaultValency(chemEl); if (defaultVal !=null){//s or p block element if (defaultVal != incomingValency || charge !=0) { if (Math.abs(incomingValency - defaultVal) == absoluteCharge) { atom.setProtonsExplicitlyAddedOrRemoved(incomingValency - defaultVal); } else{ Integer[] unchargedStableValencies = ValencyChecker.getPossibleValencies(chemEl, 0); boolean hasPlausibleValency =false; for (Integer unchargedStableValency : unchargedStableValencies) { if (Math.abs(incomingValency - unchargedStableValency)==Math.abs(charge)){ atom.setProtonsExplicitlyAddedOrRemoved(incomingValency - unchargedStableValency); //we strictly set the valency if a charge is specified but are more loose about things if uncharged e.g. allow penta substituted phosphine if (charge != 0) { atom.setLambdaConventionValency(unchargedStableValency); } else{ atom.setMinimumValency(incomingValency); } hasPlausibleValency=true; break; } } if (!hasPlausibleValency){//could be something like [Sn] which would be expected to be attached to later atom.setMinimumValency(incomingValency); } } } } else{ if (hydrogenCount > 0){//make hydrogen explicit Fragment frag =atom.getFrag(); for (int i = 0; i < hydrogenCount; i++) { Atom hydrogen = createAtom(ChemEl.H, frag); createBond(atom, hydrogen, 1); } } } } /** * Create a new Atom of the given element belonging to the given fragment * @param elementSymbol * @param frag * @return Atom */ private Atom createAtom(String elementSymbol, Fragment frag) { return createAtom(ChemEl.valueOf(elementSymbol), frag); } /** * Create a new Atom of the given element belonging to the given fragment * @param chemEl * @param frag * @return Atom */ private Atom createAtom(ChemEl chemEl, Fragment frag) { Atom a = new Atom(idManager.getNextID(), chemEl, frag); frag.addAtom(a); return a; } /** * Create a new bond between two atoms. * The bond is associated with these atoms. 
* @param fromAtom * @param toAtom * @param bondOrder * @return Bond */ private Bond createBond(Atom fromAtom, Atom toAtom, int bondOrder) { Bond b = new Bond(fromAtom, toAtom, bondOrder); fromAtom.addBond(b); toAtom.addBond(b); fromAtom.getFrag().addBond(b); return b; } private boolean is_A_to_Z(char ch) { return ch >= 'A' && ch <= 'Z'; } private boolean is_a_to_z(char ch) { return ch >= 'a' && ch <= 'z'; } private boolean is_0_to_9(char ch){ return ch >= '0' && ch <= '9'; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SMILESWriter.java package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.Comparator; import java.util.Deque; import java.util.EnumMap; import java.util.HashMap; import java.util.HashSet; import java.util.LinkedHashMap; import java.util.List; import java.util.Locale; import java.util.Map; import java.util.Set; import uk.ac.cam.ch.wwmm.opsin.Bond.SMILES_BOND_DIRECTION; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; /** * Writes an isomeric SMILES serialisation of an OPSIN fragment * @author dl387 * */ class SMILESWriter { /**The organic atoms and their allowed implicit valences in SMILES */ private static final Map<ChemEl, Integer[]> organicAtomsToStandardValencies = new EnumMap<>(ChemEl.class); /**Closures 1-9, %10-99, 0 */ private static final List<String> closureSymbols = new ArrayList<>(); /**The available ring closure symbols, ordered from start to end in the preferred order for use.*/ private final Deque<String> availableClosureSymbols = new ArrayDeque<>(closureSymbols); /**Maps between bonds and the ring closure to use when the atom that ends the bond is encountered.*/ private final HashMap<Bond, String> bondToClosureSymbolMap = new HashMap<>(); /**Maps between bonds and the atom that this bond will go to in the SMILES. 
Populated in the order the bonds are to be made */ private final HashMap bondToNextAtomMap = new LinkedHashMap<>(); /**The structure to be converted to SMILES*/ private final Fragment structure; /**Holds the SMILES string which is under construction*/ private final StringBuilder smilesBuilder = new StringBuilder(); /**Should extended SMILES be output*/ private int options; /**The order atoms were traversed when creating the SMILES*/ private List smilesOutputOrder; static { organicAtomsToStandardValencies.put(ChemEl.B, new Integer[]{3}); organicAtomsToStandardValencies.put(ChemEl.C, new Integer[]{4}); organicAtomsToStandardValencies.put(ChemEl.N, new Integer[]{3,5});//note that OPSIN doesn't accept valency 5 nitrogen without the lambda convention organicAtomsToStandardValencies.put(ChemEl.O, new Integer[]{2}); organicAtomsToStandardValencies.put(ChemEl.P, new Integer[]{3,5}); organicAtomsToStandardValencies.put(ChemEl.S, new Integer[]{2,4,6}); organicAtomsToStandardValencies.put(ChemEl.F, new Integer[]{1}); organicAtomsToStandardValencies.put(ChemEl.Cl, new Integer[]{1}); organicAtomsToStandardValencies.put(ChemEl.Br, new Integer[]{1}); organicAtomsToStandardValencies.put(ChemEl.I, new Integer[]{1}); organicAtomsToStandardValencies.put(ChemEl.R, new Integer[]{1,2,3,4,5,6,7,8,9}); for (int i = 1; i <=9; i++) { closureSymbols.add(String.valueOf(i)); } for (int i = 10; i <=99; i++) { closureSymbols.add("%"+i); } closureSymbols.add("0"); } /** * Creates a SMILES writer for the given fragment * @param structure * @param options */ private SMILESWriter(Fragment structure, int options) { this.structure = structure; this.options = options; } /** * Generates SMILES for the given fragment * The following assumptions are currently made: * The fragment contains no bonds to atoms outside the fragment * Hydrogens are all explicit * Spare valency has been converted to double bonds * @param options the set of {@link SmilesOptions} to use * @return SMILES String */ static String 
generateSmiles(Fragment structure, int options) { return new SMILESWriter(structure, options).writeSmiles(); } /** * Generates SMILES for the given fragment * The following assumptions are currently made: * The fragment contains no bonds to atoms outside the fragment * Hydrogens are all explicit * Spare valency has been converted to double bonds * @return SMILES String */ static String generateSmiles(Fragment structure) { return new SMILESWriter(structure, SmilesOptions.DEFAULT).writeSmiles(); } /** * Generates extended SMILES for the given fragment * The following assumptions are currently made: * The fragment contains no bonds to atoms outside the fragment * Hydrogens are all explicit * Spare valency has been converted to double bonds * @return Extended SMILES String */ static String generateExtendedSmiles(Fragment structure) { return new SMILESWriter(structure, SmilesOptions.CXSMILES).writeSmiles(); } String writeSmiles() { assignSmilesOrder(); assignDoubleBondStereochemistrySlashes(); List atomList = structure.getAtomList(); smilesOutputOrder = new ArrayList<>(atomList.size()); boolean isEmpty = true; for (Atom currentAtom : atomList) { Integer visitedDepth = currentAtom.getProperty(Atom.VISITED); if (visitedDepth != null && visitedDepth ==0) {//new component if (!isEmpty){ smilesBuilder.append('.'); } traverseSmiles(currentAtom); isEmpty = false; } } if ((options & SmilesOptions.CXSMILES) != 0) { writeExtendedSmilesLayer(options); } return smilesBuilder.toString(); } private void writeExtendedSmilesLayer(int options) { List atomLabels = new ArrayList<>(); List atomLocants = new ArrayList<>(); List positionVariationBonds = new ArrayList<>(); Integer lastLabel = null; Integer lastLocant = null; int attachmentPointCounter = 1; Map> enhancedStereo = null; Set seenAttachmentpoints = new HashSet<>(); List polymerAttachPoints = structure.getPolymerAttachmentPoints(); boolean isPolymer = polymerAttachPoints != null && polymerAttachPoints.size() > 0; for (int i = 0, l 
= smilesOutputOrder.size(); i < l; i++) { Atom a = smilesOutputOrder.get(i); String homologyGroup = a.getProperty(Atom.HOMOLOGY_GROUP); if (homologyGroup != null) { homologyGroup = escapeExtendedSmilesLabel(homologyGroup); if (homologyGroup.startsWith("_")) { atomLabels.add(homologyGroup); } else { atomLabels.add(homologyGroup + "_p"); } lastLabel = i; } else if (a.getElement() == ChemEl.R){ if (isPolymer) { atomLabels.add("star_e"); } else { Integer atomClass = a.getProperty(Atom.ATOM_CLASS); if (atomClass != null) { seenAttachmentpoints.add(atomClass); } else { do { atomClass = attachmentPointCounter++; } while (seenAttachmentpoints.contains(atomClass)); } atomLabels.add("_AP" + String.valueOf(atomClass)); } lastLabel = i; } else { atomLabels.add(""); } String firstLocant = a.getFirstLocant(); if (firstLocant != null) { atomLocants.add(firstLocant); lastLocant = i; } else { atomLocants.add(""); } List atomsInPositionVariationBond = a.getProperty(Atom.POSITION_VARIATION_BOND); if (atomsInPositionVariationBond != null) { StringBuilder sb = new StringBuilder(); sb.append(i); for (int j = 0; j < atomsInPositionVariationBond.size(); j++) { sb.append(j==0 ? 
':' : '.'); Atom referencedAtom = atomsInPositionVariationBond.get(j); int referencedAtomIndex = smilesOutputOrder.indexOf(referencedAtom); if (referencedAtomIndex == -1){ throw new RuntimeException("OPSIN Bug: Failed to resolve position variation bond atom"); } sb.append(referencedAtomIndex); } positionVariationBonds.add(sb.toString()); } StereoGroup stereoGroup = a.getStereoGroup(); if (stereoGroup.getType() != StereoGroupType.Unk) { if (enhancedStereo == null) { enhancedStereo = new HashMap<>(); } List grps = enhancedStereo.get(stereoGroup); if (grps == null) { enhancedStereo.put(stereoGroup, grps = new ArrayList<>()); } grps.add(smilesOutputOrder.indexOf(a)); } } List extendedSmiles = new ArrayList<>(2); if (lastLabel != null && (options & SmilesOptions.CXSMILES_ATOM_LABELS) != 0) { extendedSmiles.add("$" + StringTools.stringListToString(atomLabels.subList(0, lastLabel + 1), ";") + "$" ); } if (lastLocant != null && (options & SmilesOptions.CXSMILES_ATOM_VALUES) != 0) { extendedSmiles.add("$_AV:" + StringTools.stringListToString(atomLocants.subList(0, lastLocant + 1), ";") + "$" ); } if (enhancedStereo != null && (options & SmilesOptions.CXSMILES_ENHANCED_STEREO) != 0) { if (enhancedStereo.size() == 1) { if (enhancedStereo.get(new StereoGroup(StereoGroupType.Rac, 1)) != null || enhancedStereo.get(new StereoGroup(StereoGroupType.Rac, 2)) != null) { extendedSmiles.add("r"); } else if (enhancedStereo.get(new StereoGroup(StereoGroupType.Rel, 1)) != null) { List idxs = enhancedStereo.get(new StereoGroup(StereoGroupType.Rel, 1)); StringBuilder sb = new StringBuilder(); sb.append("o1:"); sb.append(idxs.get(0)); for (int i = 1; i < idxs.size(); i++) { sb.append(',').append(idxs.get(i)); } extendedSmiles.add(sb.toString()); } // Abs is ignored in this case since that is the default in smiles that // all stereochemistry is absolute } else { StringBuilder sb = new StringBuilder(); int numRac = 1, numRel = 1; // renumber List>> entries = new 
ArrayList<>(enhancedStereo.entrySet()); // ensure consistent output order Collections.sort(entries, new Comparator<Map.Entry<StereoGroup, List<Integer>>>() { @Override public int compare(Map.Entry<StereoGroup, List<Integer>> a, Map.Entry<StereoGroup, List<Integer>> b) { Collections.sort(a.getValue()); Collections.sort(b.getValue()); int len = Math.min(a.getValue().size(), b.getValue().size()); for (int i = 0; i < len; i++) { int cmp = a.getValue().get(i).compareTo(b.getValue().get(i)); if (cmp != 0) return cmp; } int cmp = Integer.compare(a.getValue().size(), b.getValue().size()); if (cmp != 0) return cmp; return a.getKey().compareTo(b.getKey()); // error? } }); for (Map.Entry<StereoGroup, List<Integer>> e : entries) { sb.setLength(0); StereoGroup key = e.getKey(); switch (key.getType()) { case Abs: // skip Abs this is the default in SMILES but we could be verbose about it continue; case Rel: sb.append("o").append(numRel++).append(":"); break; case Rac: sb.append("&").append(numRac++).append(":"); break; case Unk: continue; } List<Integer> idxs = e.getValue(); sb.append(idxs.get(0)); for (int i = 1; i < idxs.size(); i++) sb.append(',').append(idxs.get(i)); extendedSmiles.add(sb.toString()); } } } if (positionVariationBonds.size() > 0) { extendedSmiles.add("m:" + StringTools.stringListToString(positionVariationBonds, ",")); } if (isPolymer && (options & SmilesOptions.CXSMILES_POLYMERS) != 0) { StringBuilder sruContents = new StringBuilder(); sruContents.append("Sg:n:"); boolean appendDelimiter = false; for (int i = 0, l = smilesOutputOrder.size(); i < l; i++) { if (smilesOutputOrder.get(i).getElement() != ChemEl.R) { if (appendDelimiter) { sruContents.append(','); } sruContents.append(i); appendDelimiter = true; } } sruContents.append("::ht"); extendedSmiles.add(sruContents.toString()); } if (extendedSmiles.size() > 0) { smilesBuilder.append(" |"); smilesBuilder.append(StringTools.stringListToString(extendedSmiles, ",")); smilesBuilder.append('|'); } } private String escapeExtendedSmilesLabel(String str) { StringBuilder sb = new StringBuilder(); for (int i = 0, len = str.length(); i < len; 
i++) { char ch = str.charAt(i); if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9') ) { sb.append(ch); } else { sb.append("&#"); sb.append(String.valueOf((int)ch)); sb.append(';'); } } return sb.toString(); } /** * Walks through the fragment populating the Atom.VISITED property indicating how many bonds * an atom is from the start of the fragment walk. A new walk will be started for each disconnected component of the fragment */ private void assignSmilesOrder() { List atomList = structure.getAtomList(); for (Atom atom : atomList) { atom.setProperty(Atom.VISITED, null); } for (Atom a : atomList) { if(a.getProperty(Atom.VISITED) == null && !isSmilesImplicitProton(a)){//true for only the first atom in a fully connected molecule traverseMolecule(a); } } } private static class TraversalState { private final Atom atom; private final Bond bondTaken; private final int depth; private TraversalState(Atom atom, Bond bondTaken, int depth) { this.atom = atom; this.bondTaken = bondTaken; this.depth = depth; } } /** * Iterative function for populating the Atom.VISITED property * Also populates the bondToNextAtom Map * @param startingAtom * @return */ private void traverseMolecule(Atom startingAtom){ Deque stack = new ArrayDeque(); stack.add(new TraversalState(startingAtom, null, 0)); while (!stack.isEmpty()){ TraversalState currentstate = stack.removeLast(); Atom currentAtom = currentstate.atom; Bond bondtaken = currentstate.bondTaken; if (bondtaken != null) { bondToNextAtomMap.put(bondtaken, currentAtom); } if(currentAtom.getProperty(Atom.VISITED) != null){ continue; } int depth = currentstate.depth; currentAtom.setProperty(Atom.VISITED, depth); List bonds = currentAtom.getBonds(); for (int i = bonds.size() - 1; i >=0; i--) { Bond bond = bonds.get(i); if (bond.equals(bondtaken)){ continue; } Atom neighbour = bond.getOtherAtom(currentAtom); if (isSmilesImplicitProton(neighbour)){ continue; } stack.add(new TraversalState(neighbour, bond, depth + 
1)); } } } private boolean isSmilesImplicitProton(Atom atom) { if (atom.getElement() != ChemEl.H){ //not hydrogen return false; } if (atom.getIsotope() != null && atom.getIsotope() != 1){ //deuterium/tritium return false; } List neighbours = atom.getAtomNeighbours(); int neighbourCount = neighbours.size(); if (neighbourCount > 1){ //bridging hydrogen return false; } if (neighbourCount == 0){ //just a hydrogen atom return false; } Atom neighbour = neighbours.get(0); ChemEl chemEl = neighbour.getElement(); if (chemEl == ChemEl.H || chemEl == ChemEl.R) { //only connects to hydrogen or an R-group return false; } if (chemEl == ChemEl.N){ List bondsFromNitrogen = neighbour.getBonds(); if (bondsFromNitrogen.size() == 2){ for (Bond bond : bondsFromNitrogen) { if (bond.getBondStereo() != null){ //special case where hydrogen is connected to a nitrogen with imine double bond stereochemistry return false; } } } } return true; } private boolean hasStereo(Atom atom) { AtomParity parity = atom.getAtomParity(); if (parity == null) { return false; } if ((options & SmilesOptions.CXSMILES_ENHANCED_STEREO) != 0) { return true; } //When not outputting extended SMILES, treat rac/rel like undefined, when a stereogroup only has a single atom //e.g. 
rac-(R)-chlorofluorobromomethane StereoGroupType stereoGroupType = parity.getStereoGroup().getType(); if ((stereoGroupType == StereoGroupType.Rac || stereoGroupType == StereoGroupType.Rel) && countStereoGroup(atom) == 1) { return false; } return true; } private int countStereoGroup(Atom atom) { StereoGroup refGroup = atom.getAtomParity().getStereoGroup(); int count = 0; for (Atom a : atom.getFrag()) { AtomParity atomParity = a.getAtomParity(); if (atomParity == null) { continue; } if (atomParity.getStereoGroup().equals(refGroup)) { count++; } } return count; } /** * Goes through the bonds with BondStereo in the order they are to be created in the SMILES * The bondStereo is used to set whether the bonds to non-implicit hydrogens that are adjacent to this bond * should be represented by / or \ in the SMILES. If this method has already set the slash on some bonds * e.g. in a conjugated system this must be taken into account when setting the next slashes so as to not * create a contradictory double bond stereochemistry definition. 
*/ private void assignDoubleBondStereochemistrySlashes() { Set<Bond> bonds = bondToNextAtomMap.keySet(); for (Bond bond : bonds) { bond.setSmilesStereochemistry(null); } for (Bond bond : bonds) { BondStereo bondStereo =bond.getBondStereo(); if (bondStereo!=null){ Atom[] atomRefs4 = bondStereo.getAtomRefs4(); Bond bond1 = atomRefs4[0].getBondToAtom(atomRefs4[1]); Bond bond2 = atomRefs4[2].getBondToAtom(atomRefs4[3]); if (bond1 ==null || bond2==null){ throw new RuntimeException("OPSIN Bug: Bondstereo described atoms that are not bonded"); } Atom bond1ToAtom = bondToNextAtomMap.get(bond1); Atom bond2ToAtom = bondToNextAtomMap.get(bond2); SMILES_BOND_DIRECTION bond1Slash = bond1.getSmilesStereochemistry();//null except in conjugated systems SMILES_BOND_DIRECTION bond2Slash = bond2.getSmilesStereochemistry(); SMILES_BOND_DIRECTION bond1Direction = SMILES_BOND_DIRECTION.LSLASH; SMILES_BOND_DIRECTION bond2Direction = SMILES_BOND_DIRECTION.LSLASH; if (bondStereo.getBondStereoValue().equals(BondStereoValue.CIS)){ bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH;//flip the slash type to be used from \ to / } if (!bond1ToAtom.equals(atomRefs4[1])){ bond1Direction = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } if (!bond2ToAtom.equals(atomRefs4[3])){ bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } //One of the bonds may already have a defined slash from a previous bond stereo. If so make sure that we don't change it. if (bond1Slash !=null && !bond1Slash.equals(bond1Direction)){ bond1Direction = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? 
SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } else if (bond2Slash !=null && !bond2Slash.equals(bond2Direction)){ bond1Direction = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } //Also need to investigate the bonds which are implicitly set by the bondStereo //F Cl // C=C //N O //e.g. the bonds from the C-N and C-O (the higher priority atoms will always be used for bond1/2) Bond bond1Other =null; Bond bond2Other =null; SMILES_BOND_DIRECTION bond1OtherDirection =null; SMILES_BOND_DIRECTION bond2OtherDirection =null; List bondsFrom2ndAtom = new ArrayList<>(atomRefs4[1].getBonds()); bondsFrom2ndAtom.remove(bond1); bondsFrom2ndAtom.remove(bond); if (bondsFrom2ndAtom.size()==1){//can be 0 for imines if (bondToNextAtomMap.containsKey(bondsFrom2ndAtom.get(0))){//ignore bonds to implicit hydrogen bond1Other = bondsFrom2ndAtom.get(0); bond1OtherDirection = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; if (!bond1ToAtom.equals(atomRefs4[1])){ bond1OtherDirection = bond1OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } if (!bondToNextAtomMap.get(bond1Other).equals(atomRefs4[1])){ bond1OtherDirection = bond1OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } } } List bondsFrom3rdAtom= new ArrayList<>(atomRefs4[2].getBonds()); bondsFrom3rdAtom.remove(bond2); bondsFrom3rdAtom.remove(bond); if (bondsFrom3rdAtom.size()==1){ if (bondToNextAtomMap.containsKey(bondsFrom3rdAtom.get(0))){ bond2Other = bondsFrom3rdAtom.get(0); bond2OtherDirection = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? 
SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; if (!bond2ToAtom.equals(atomRefs4[3])){ bond2OtherDirection = bond2OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } if (!bondToNextAtomMap.get(bond2Other).equals(bond2Other.getOtherAtom(atomRefs4[2]))){ bond2OtherDirection = bond2OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } } } //One of the bonds may already have a defined slash from a previous bond stereo. If so make sure that we don't change it. if (bond1Other !=null && bond1Other.getSmilesStereochemistry() !=null && !bond1Other.getSmilesStereochemistry().equals(bond1OtherDirection)){ bond1Direction = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond1OtherDirection = bond1OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; if (bond2Other!=null){ bond2OtherDirection = bond2OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } } else if (bond2Other !=null && bond2Other.getSmilesStereochemistry() !=null && !bond2Other.getSmilesStereochemistry().equals(bond2OtherDirection)){ bond1Direction = bond1Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond2Direction = bond2Direction.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; bond2OtherDirection = bond2OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; if (bond1Other!=null){ bond1OtherDirection = bond1OtherDirection.equals(SMILES_BOND_DIRECTION.LSLASH) ? 
SMILES_BOND_DIRECTION.RSLASH : SMILES_BOND_DIRECTION.LSLASH; } } //Set slashes for all bonds that are not to implicit hydrogen //In non conjugated systems this will yield redundant, but consistent, information bond1.setSmilesStereochemistry(bond1Direction); bond2.setSmilesStereochemistry(bond2Direction); if (bond1Other!=null){ bond1Other.setSmilesStereochemistry(bond1OtherDirection); } if (bond2Other!=null){ bond2Other.setSmilesStereochemistry(bond2OtherDirection); } } } } private static final TraversalState startBranch = new TraversalState(null, null, -1); private static final TraversalState endBranch = new TraversalState(null, null, -1); /** * Generates the SMILES starting from the currentAtom, iteratively exploring * in the same order as {@link SMILESWriter#traverseMolecule(Atom)} * @param startingAtom */ private void traverseSmiles(Atom startingAtom){ Deque stack = new ArrayDeque<>(); stack.add(new TraversalState(startingAtom, null, 0)); while (!stack.isEmpty()){ TraversalState currentstate = stack.removeLast(); if (currentstate == startBranch){ smilesBuilder.append('('); continue; } if (currentstate == endBranch){ smilesBuilder.append(')'); continue; } Atom currentAtom = currentstate.atom; Bond bondtaken = currentstate.bondTaken; if (bondtaken != null){ smilesBuilder.append(bondToSmiles(bondtaken)); } int depth = currentstate.depth; smilesBuilder.append(atomToSmiles(currentAtom, depth, bondtaken)); smilesOutputOrder.add(currentAtom); List bonds = currentAtom.getBonds(); List newlyAvailableClosureSymbols = null; for (Bond bond : bonds) {//ring closures if (bond.equals(bondtaken)) { continue; } Atom neighbour = bond.getOtherAtom(currentAtom); Integer nDepth = neighbour.getProperty(Atom.VISITED); if (nDepth != null && nDepth <= depth){ String closure = bondToClosureSymbolMap.get(bond); smilesBuilder.append(closure); if (newlyAvailableClosureSymbols == null){ newlyAvailableClosureSymbols = new ArrayList<>(); } newlyAvailableClosureSymbols.add(closure); } } for 
(Bond bond : bonds) {//ring openings Atom neighbour = bond.getOtherAtom(currentAtom); Integer nDepth = neighbour.getProperty(Atom.VISITED); if (nDepth != null && nDepth > (depth +1)){ String closure = availableClosureSymbols.removeFirst(); bondToClosureSymbolMap.put(bond, closure); smilesBuilder.append(bondToSmiles(bond)); smilesBuilder.append(closure); } } if (newlyAvailableClosureSymbols != null) { //By not immediately adding to availableClosureSymbols we avoid using the same digit //to both close and open on the same atom for (int i = newlyAvailableClosureSymbols.size() -1; i >=0; i--) { availableClosureSymbols.addFirst(newlyAvailableClosureSymbols.get(i)); } } boolean seenFirstBranch = false; for (int i = bonds.size() - 1; i >=0; i--) { //adjacent atoms which have not been previously written Bond bond = bonds.get(i); Atom neighbour = bond.getOtherAtom(currentAtom); Integer nDepth = neighbour.getProperty(Atom.VISITED); if (nDepth != null && nDepth == depth + 1){ if (!seenFirstBranch){ stack.add(new TraversalState(neighbour, bond, depth + 1)); seenFirstBranch = true; } else { stack.add(endBranch); stack.add(new TraversalState(neighbour, bond, depth + 1)); stack.add(startBranch); } } } } } /** * Returns the SMILES describing the given atom. 
* Where possible square brackets are not included to give more readable SMILES * @param atom * @param depth * @param bondtaken * @return */ private String atomToSmiles(Atom atom, int depth, Bond bondtaken) { StringBuilder atomSmiles = new StringBuilder(); int hydrogenCount = calculateNumberOfBondedExplicitHydrogen(atom); boolean needsSquareBrackets = determineWhetherAtomNeedsSquareBrackets(atom, hydrogenCount); if (needsSquareBrackets) { atomSmiles.append('['); } if (atom.getIsotope() != null) { atomSmiles.append(atom.getIsotope()); } ChemEl chemEl = atom.getElement(); if (chemEl == ChemEl.R) {//used for polymers atomSmiles.append('*'); } else{ if (atom.hasSpareValency()) {//spare valency corresponds directly to lower case SMILES in OPSIN's SMILES reader atomSmiles.append(chemEl.toString().toLowerCase(Locale.ROOT)); } else{ atomSmiles.append(chemEl.toString()); } } if (hasStereo(atom)) atomSmiles.append(atomParityToSmiles(atom, depth, bondtaken)); if (hydrogenCount != 0 && needsSquareBrackets && chemEl != ChemEl.H){ atomSmiles.append('H'); if (hydrogenCount != 1){ atomSmiles.append(String.valueOf(hydrogenCount)); } } int charge = atom.getCharge(); if (charge != 0){ if (charge == 1){ atomSmiles.append('+'); } else if (charge == -1){ atomSmiles.append('-'); } else{ if (charge > 0){ atomSmiles.append('+'); } atomSmiles.append(charge); } } if (needsSquareBrackets) { Integer atomClass = atom.getProperty(Atom.ATOM_CLASS); if (atomClass != null) { atomSmiles.append(':'); atomSmiles.append(String.valueOf(atomClass)); } atomSmiles.append(']'); } return atomSmiles.toString(); } private int calculateNumberOfBondedExplicitHydrogen(Atom atom) { List<Atom> neighbours = atom.getAtomNeighbours(); int count = 0; for (Atom neighbour : neighbours) { if (neighbour.getProperty(Atom.VISITED) == null){ count++; } } return count; } private boolean determineWhetherAtomNeedsSquareBrackets(Atom atom, int hydrogenCount) { Integer[] expectedValencies = 
organicAtomsToStandardValencies.get(atom.getElement()); if (expectedValencies == null){ return true; } if (atom.getCharge() != 0){ return true; } if (atom.getIsotope() != null){ return true; } if (hasStereo(atom)) { return true; } int valency = atom.getIncomingValency(); boolean valencyCanBeDescribedImplicitly = Arrays.binarySearch(expectedValencies, valency) >= 0; int targetImplicitValency =valency; if (valency > expectedValencies[expectedValencies.length-1]){ valencyCanBeDescribedImplicitly = true; } if (!valencyCanBeDescribedImplicitly){ return true; } int nonHydrogenValency = valency - hydrogenCount; int implicitValencyThatWouldBeGenerated = nonHydrogenValency; for (int i = expectedValencies.length - 1; i >= 0; i--) { if (expectedValencies[i] >= nonHydrogenValency){ implicitValencyThatWouldBeGenerated =expectedValencies[i]; } } if (targetImplicitValency != implicitValencyThatWouldBeGenerated){ return true; } if (atom.getProperty(Atom.ATOM_CLASS) != null) { return true; } return false; } private String atomParityToSmiles(Atom currentAtom, int depth, Bond bondtaken) { AtomParity atomParity = currentAtom.getAtomParity(); Atom[] atomRefs4 = atomParity.getAtomRefs4().clone(); List<Atom> atomrefs4Current = new ArrayList<>(); if (bondtaken != null) {//previous atom Atom neighbour = bondtaken.getOtherAtom(currentAtom); atomrefs4Current.add(neighbour); } for (Atom atom : atomRefs4) {//lone pair as in tetrahedral sulfones if (atom.equals(currentAtom)){ atomrefs4Current.add(currentAtom); } } List<Bond> bonds = currentAtom.getBonds(); for (Bond bond : bonds) {//implicit hydrogen Atom neighbour = bond.getOtherAtom(currentAtom); if (neighbour.getProperty(Atom.VISITED) == null){ atomrefs4Current.add(currentAtom); } } for (Bond bond : bonds) {//ring closures if (bond.equals(bondtaken)){ continue; } Atom neighbour = bond.getOtherAtom(currentAtom); if (neighbour.getProperty(Atom.VISITED) == null){ continue; } if (neighbour.getProperty(Atom.VISITED) <= depth){ 
atomrefs4Current.add(neighbour); } } for (Bond bond : bonds) {//ring openings Atom neighbour = bond.getOtherAtom(currentAtom); if (neighbour.getProperty(Atom.VISITED) == null){ continue; } if (neighbour.getProperty(Atom.VISITED) > (depth +1)){ atomrefs4Current.add(neighbour); } } for (Bond bond : bonds) {//next atom/s Atom neighbour = bond.getOtherAtom(currentAtom); if (neighbour.getProperty(Atom.VISITED) == null){ continue; } if (neighbour.getProperty(Atom.VISITED) == depth + 1){ atomrefs4Current.add(neighbour); } } Atom[] atomrefs4CurrentArr = new Atom[4]; for (int i = 0; i < atomrefs4Current.size(); i++) { atomrefs4CurrentArr[i] = atomrefs4Current.get(i); } for (int i = 0; i < atomRefs4.length; i++) {//replace mentions of explicit hydrogen with the central atom the hydrogens are attached to, to be consistent with the SMILES representation if (atomRefs4[i].getProperty(Atom.VISITED) == null){ atomRefs4[i] = currentAtom; } } boolean equivalent = StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(atomRefs4, atomParity.getParity(), atomrefs4CurrentArr, 1); if (equivalent){ return "@@"; } else{ return "@"; } } /** * Generates the SMILES description of the bond * In the case of cis/trans stereochemistry this relies on the {@link SMILESWriter#assignDoubleBondStereochemistrySlashes} * having been run to setup the smilesBondDirection attribute * @param bond * @return */ private String bondToSmiles(Bond bond){ String bondSmiles = ""; int bondOrder = bond.getOrder(); if (bondOrder == 2){ bondSmiles = "="; } else if (bondOrder == 3){ bondSmiles = "#"; } else if (bond.getSmilesStereochemistry() != null){ if (bond.getSmilesStereochemistry() == SMILES_BOND_DIRECTION.RSLASH){ bondSmiles ="/"; } else{ bondSmiles ="\\"; } } return bondSmiles; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SSSRFinder.java000066400000000000000000000076141451751637500264510ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import 
java.util.HashMap; import java.util.HashSet; import java.util.LinkedHashSet; import java.util.List; import java.util.Map; import java.util.Set; /** * Class for finding SSSR * The algorithm employed does not work in some corner cases * * @author pm286 * @author dl387 * */ class SSSRFinder { /** get set of smallest rings. * In corner cases the list of rings returned will not be the SSSR * @param frag * @return list of rings */ static List<Ring> getSetOfSmallestRings(Fragment frag){ List<Atom> atomSet = frag.getAtomList(); List<Ring> ringList = getRings(atomSet); if (ringList.size() > 1) { boolean change = true; while (change) { for (int i = 0; i < ringList.size(); i++) { Ring ring = ringList.get(i); change = reduceRingSizes(ring, ringList); } } } return ringList; } /** get list of rings. * not necessarily SSSR * @param atomSet * @return list of rings */ private static List<Ring> getRings(List<Atom> atomSet ){ List<Ring> ringList = new ArrayList<>(); Set<Atom> usedAtoms = new HashSet<>(); Atom root = atomSet.get(0); Atom parentAtom = null; Map<Atom, Atom> atomToParentMap = new HashMap<>(); Set<Bond> linkBondSet = new LinkedHashSet<>(); expand(root, parentAtom, usedAtoms, atomToParentMap, linkBondSet); for (Bond bond : linkBondSet) { Ring ring = getRing(bond, atomToParentMap); ringList.add(ring); } return ringList; } private static Ring getRing(Bond bond, Map<Atom, Atom> atomToParentMap){ Atom atomFrom = bond.getFromAtom() ; Atom atomTo = bond.getToAtom(); List<Bond> bondSet0 = getAncestors1(atomFrom, atomToParentMap); List<Bond> bondSet1 = getAncestors1(atomTo, atomToParentMap); List<Bond> mergedBondSet = symmetricDifference (bondSet0, bondSet1); mergedBondSet.add(bond); return new Ring(mergedBondSet); } private static List<Bond> getAncestors1(Atom atom, Map<Atom, Atom> atomToParentMap){ List<Bond> newBondSet = new ArrayList<>(); while (true) { Atom atom1 = atomToParentMap.get(atom); if (atom1 == null) { break; } Bond bond = atom.getBondToAtom(atom1); if (newBondSet.contains(bond)) { break; } newBondSet.add(bond); atom = atom1; } return newBondSet; } private static void 
expand(Atom atom, Atom parentAtom, Set<Atom> usedAtoms, Map<Atom, Atom> atomToParentMap, Set<Bond> linkBondSet){ usedAtoms.add(atom); atomToParentMap.put(atom, parentAtom); List<Atom> ligandAtomList = atom.getAtomNeighbours(); for (Atom ligandAtom : ligandAtomList) { if (ligandAtom.equals(parentAtom)) { // skip existing bond } else if (usedAtoms.contains(ligandAtom)) { Bond linkBond = atom.getBondToAtom(ligandAtom); linkBondSet.add(linkBond); // already treated } else { expand(ligandAtom, atom, usedAtoms, atomToParentMap, linkBondSet); } } } private static boolean reduceRingSizes(Ring ring, List<Ring> newList){ boolean change = false; for (int i = 0; i < newList.size(); i++) { Ring target = newList.get(i); if (target == ring) { continue; } List<Bond> newBondSet = symmetricDifference ( target.getBondList(), ring.getBondList() ) ; if (newBondSet.size() < target.size()) { Ring newRing = new Ring(newBondSet); newList.set(i, newRing); change = true; } } return change; } private static List<Bond> symmetricDifference(List<Bond> bondSet1, List<Bond> bondSet2) { List<Bond> newBondSet = new ArrayList<>(); for (Bond bond1 : bondSet1) { if (!bondSet2.contains(bond1)) { newBondSet.add(bond1); } } for (Bond bond2 : bondSet2) { if (!bondSet1.contains(bond2)) { newBondSet.add(bond2); } } return newBondSet; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SmilesOptions.java000066400000000000000000000030261451751637500273300ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Options to control SMILES generation. * These options can be provided to the {@link OpsinResult#getSmiles(int)} method to control what is included in * the generated SMILES. The main use here is to control generation of ChemAxon Extended SMILES (CXSMILES) that supports * features beyond plain SMILES. * @see ChemAxon Extended SMILES and SMARTS - CXSMILES and CXSMARTS */ public interface SmilesOptions { /** * Default SMILES generation, as Daylight intended. */ int DEFAULT = 0x0; /** * Include atom labels in CXSMILES. 
These can provide semantics to * atoms e.g. a label of _AP1 is the first attachment point. */ int CXSMILES_ATOM_LABELS = 0x1; /** * Include atom values in CXSMILES. The first locant value of each atom is used for this. */ int CXSMILES_ATOM_VALUES = 0x2; /** * Include repeat brackets in the CXSMILES layers for polymers. */ int CXSMILES_POLYMERS = 0x4; /** * Include racemic, relative, and absolute enhanced stereochemistry in the CXSMILES layers. */ int CXSMILES_ENHANCED_STEREO = 0x8; /** * Include all CXSMILES layers that are relevant. This option is equivalent to turning on all CXSMILES features. */ int CXSMILES = CXSMILES_ATOM_LABELS | CXSMILES_ATOM_VALUES | CXSMILES_POLYMERS | CXSMILES_ENHANCED_STEREO; } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SortParses.java000066400000000000000000000032461451751637500266310ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.WORDRULE_ATR; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.WORDRULE_EL; import java.util.Comparator; /** * Prefer non-substituent word rules to substituent word rule e.g. ethylene is C=C not -CC- * Prefer the parse with the least elements that have 0 children e.g. benzal beats benz al (1 childless element vs 2 childless elements) * Prefer less elements e.g. 
beats */ class SortParses implements Comparator<Element> { public int compare(Element el1, Element el2){ boolean isSubstituent1 = WordRule.substituent.toString().equals(el1.getFirstChildElement(WORDRULE_EL).getAttributeValue(WORDRULE_ATR)); boolean isSubstituent2 = WordRule.substituent.toString().equals(el2.getFirstChildElement(WORDRULE_EL).getAttributeValue(WORDRULE_ATR)); if (isSubstituent1 && !isSubstituent2){ return 1; } if (!isSubstituent1 && isSubstituent2){ return -1; } int[] counts1 = OpsinTools.countNumberOfElementsAndNumberOfChildLessElements(el1); int[] counts2 = OpsinTools.countNumberOfElementsAndNumberOfChildLessElements(el2); int childLessElementsInEl1 = counts1[1]; int childLessElementsInEl2 = counts2[1]; if ( childLessElementsInEl1> childLessElementsInEl2){ return 1; } else if (childLessElementsInEl1 < childLessElementsInEl2){ return -1; } int elementsInEl1 = counts1[0]; int elementsInEl2 = counts2[0]; if ( elementsInEl1> elementsInEl2){ return 1; } else if (elementsInEl1 < elementsInEl2){ return -1; } else{ return 0; } } }opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StereoAnalyser.java000066400000000000000000000536611451751637500274720ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Arrays; import java.util.Collection; import java.util.Collections; import java.util.Comparator; import java.util.HashMap; import java.util.HashSet; import java.util.List; import java.util.Map; import java.util.Set; import java.util.Map.Entry; /** * Identifies stereocentres and determines the CIP order of connected atoms * @author dl387 * */ class StereoAnalyser { /** The atoms/bonds upon which this StereoAnalyser is operating */ private final Collection<Atom> atoms; private final Collection<Bond> bonds; /** Maps each atom to its currently assigned colour. Eventually all atoms in non identical environments will have different colours. 
Higher is higher priority*/ private final Map<Atom, Integer> mappingToColour; /** Maps each atom to an array of the colours of its neighbours*/ private final Map<Atom, int[]> atomNeighbourColours; private final AtomNeighbouringColoursComparator atomNeighbouringColoursComparator = new AtomNeighbouringColoursComparator(); private static final AtomicNumberThenAtomicMassComparator atomicNumberThenAtomicMassComparator = new AtomicNumberThenAtomicMassComparator(); /** * Holds information about a tetrahedral stereocentre * @author dl387 * */ class StereoCentre{ private final Atom stereoAtom; private final boolean trueStereoCentre; /** * Creates a stereocentre object from a tetrahedral stereocentre atom * @param stereoAtom */ StereoCentre(Atom stereoAtom, Boolean isTrueStereoCentre) { this.stereoAtom = stereoAtom; this.trueStereoCentre = isTrueStereoCentre; } Atom getStereoAtom() { return stereoAtom; } /** * Does this atom have 4 constitutionally different groups (or 3 and a lone pair) * or is it only a stereo centre due to the presence of other centres in the molecule * @return */ boolean isTrueStereoCentre() { return trueStereoCentre; } List<Atom> getCipOrderedAtoms() throws CipOrderingException { List<Atom> cipOrderedAtoms = new CipSequenceRules(stereoAtom).getNeighbouringAtomsInCipOrder(); if (cipOrderedAtoms.size()==3){//lone pair is the 4th. 
This is represented by the atom itself and is always the lowest priority cipOrderedAtoms.add(0, stereoAtom); } return cipOrderedAtoms; } } /*** * Holds information about a double bond that can possess E/Z stereochemistry * @author dl387 * */ class StereoBond{ private final Bond bond; StereoBond(Bond bond) { this.bond = bond; } Bond getBond() { return bond; } /** * Returns the following atoms: * Highest CIP atom on one side * atom in bond * other atom in bond * Highest CIP atom on other side * @return * @throws CipOrderingException */ List<Atom> getOrderedStereoAtoms() throws CipOrderingException { Atom a1 = bond.getFromAtom(); Atom a2 = bond.getToAtom(); List<Atom> cipOrdered1 = new CipSequenceRules(a1).getNeighbouringAtomsInCipOrderIgnoringGivenNeighbour(a2); List<Atom> cipOrdered2 = new CipSequenceRules(a2).getNeighbouringAtomsInCipOrderIgnoringGivenNeighbour(a1); List<Atom> stereoAtoms = new ArrayList<>(); stereoAtoms.add(cipOrdered1.get(cipOrdered1.size()-1));//highest CIP adjacent to a1 stereoAtoms.add(a1); stereoAtoms.add(a2); stereoAtoms.add(cipOrdered2.get(cipOrdered2.size()-1));//highest CIP adjacent to a2 return stereoAtoms; } } /** * Sorts atoms by their atomic number, low to high * In the case of a tie sorts by atomic mass * @author dl387 * */ private static class AtomicNumberThenAtomicMassComparator implements Comparator<Atom> { public int compare(Atom a, Atom b){ return compareAtomicNumberThenAtomicMass(a, b); } } private static int compareAtomicNumberThenAtomicMass(Atom a, Atom b){ int atomicNumber1 = a.getElement().ATOMIC_NUM; int atomicNumber2 = b.getElement().ATOMIC_NUM; if (atomicNumber1 > atomicNumber2){ return 1; } else if (atomicNumber1 < atomicNumber2){ return -1; } Integer atomicMass1 = a.getIsotope(); Integer atomicMass2 = b.getIsotope(); if (atomicMass1 != null && atomicMass2 == null){ return 1; } else if (atomicMass1 == null && atomicMass2 != null){ return -1; } else if (atomicMass1 != null && atomicMass2 != null){ if (atomicMass1 > atomicMass2){ return 1; } else if 
(atomicMass1 < atomicMass2){ return -1; } } return 0; } /** * Sorts based on the list of colours for neighbouring atoms * e.g. [1,2] > [1,1] [1,1,3] > [2,2,2] [1,1,3] > [3] * @author dl387 * */ private class AtomNeighbouringColoursComparator implements Comparator<Atom> { public int compare(Atom a, Atom b){ int[] colours1 = atomNeighbourColours.get(a); int[] colours2 = atomNeighbourColours.get(b); int colours1Size = colours1.length; int colours2Size = colours2.length; int maxCommonColourSize = Math.min(colours1Size, colours2Size); for (int i = 1; i <= maxCommonColourSize; i++) { int difference = colours1[colours1Size - i] - colours2[colours2Size - i]; if (difference > 0){ return 1; } if (difference < 0){ return -1; } } int differenceInSize = colours1Size - colours2Size; if (differenceInSize > 0){ return 1; } if (differenceInSize < 0){ return -1; } return 0; } } /** * Employs a derivative of the InChI algorithm to label which atoms are equivalent. * These labels can then be used by the findStereo(Atoms/Bonds) functions to find features that * can possess stereoChemistry * @param molecule */ StereoAnalyser(Fragment molecule) { this (molecule.getAtomList(), molecule.getBondSet()); } /** * Employs a derivative of the InChI algorithm to label which atoms are equivalent. 
* These labels can then be used by the findStereo(Atoms/Bonds) functions to find features that * can possess stereoChemistry * NOTE: All bonds of every atom must be in the set of bonds, no atom may have a bond to an atom not in the list * @param atoms * @param bonds */ StereoAnalyser(Collection<Atom> atoms, Collection<Bond> bonds) { this.atoms = atoms; this.bonds = bonds; List<Atom> ghostAtoms = addGhostAtoms(); List<Atom> atomsToSort = new ArrayList<>(atoms); atomsToSort.addAll(ghostAtoms); mappingToColour = new HashMap<>(atomsToSort.size()); atomNeighbourColours = new HashMap<>(atomsToSort.size()); Collections.sort(atomsToSort, atomicNumberThenAtomicMassComparator); List<List<Atom>> groupsByColour = populateColoursByAtomicNumberAndMass(atomsToSort); boolean changeFound = true; while(changeFound){ for (List<Atom> groupWithAColour : groupsByColour) { for (Atom atom : groupWithAColour) { int[] neighbourColours = findColourOfNeighbours(atom); atomNeighbourColours.put(atom, neighbourColours); } } List<List<Atom>> updatedGroupsByColour = new ArrayList<>(); changeFound = populateColoursAndReportIfColoursWereChanged(groupsByColour, updatedGroupsByColour); groupsByColour = updatedGroupsByColour; } removeGhostAtoms(ghostAtoms); } /** * Adds "ghost" atoms in the same way as the CIP rules for handling double bonds * e.g. 
C=C --> C(G)=C(G) where ghost is a carbon with no hydrogen bonded to it * @return The ghost atoms created */ private List<Atom> addGhostAtoms() { List<Atom> ghostAtoms = new ArrayList<>(); for (Bond bond : bonds) { int bondOrder = bond.getOrder(); for (int i = bondOrder; i > 1; i--) { Atom fromAtom = bond.getFromAtom(); Atom toAtom = bond.getToAtom(); Atom ghost1 = new Atom(fromAtom.getElement()); Bond b1 = new Bond(ghost1, toAtom, 1); toAtom.addBond(b1); ghost1.addBond(b1); ghostAtoms.add(ghost1); Atom ghost2 = new Atom(toAtom.getElement()); Bond b2 = new Bond(ghost2, fromAtom, 1); fromAtom.addBond(b2); ghost2.addBond(b2); ghostAtoms.add(ghost2); } } return ghostAtoms; } /** * Removes the ghost atoms added by addGhostAtoms * @param ghostAtoms */ private void removeGhostAtoms(List<Atom> ghostAtoms) { for (Atom atom : ghostAtoms) { Bond b = atom.getFirstBond(); b.getOtherAtom(atom).removeBond(b); } } /** * Takes a list of atoms sorted by atomic number/mass * and populates the mappingToColour map * @param atomList * @return */ private List<List<Atom>> populateColoursByAtomicNumberAndMass(List<Atom> atomList) { List<List<Atom>> groupsByColour = new ArrayList<>(); Atom previousAtom = null; List<Atom> atomsOfThisColour = new ArrayList<>(); int atomsSeen = 0; for (Atom atom : atomList) { if (previousAtom != null && compareAtomicNumberThenAtomicMass(previousAtom, atom) != 0){ for (Atom atomOfthisColour : atomsOfThisColour) { mappingToColour.put(atomOfthisColour, atomsSeen); } groupsByColour.add(atomsOfThisColour); atomsOfThisColour = new ArrayList<>(); } previousAtom = atom; atomsOfThisColour.add(atom); atomsSeen++; } if (!atomsOfThisColour.isEmpty()){ for (Atom atomOfThisColour : atomsOfThisColour) { mappingToColour.put(atomOfThisColour, atomsSeen); } groupsByColour.add(atomsOfThisColour); } return groupsByColour; } /** * Takes the lists of atoms pre-grouped by colour and sorts each by its neighbours colours * The updatedGroupsByColour is populated with those for which this process caused a change * and populates the 
mappingToColour map * Returns whether mappingToColour was changed * @param groupsByColour * @param updatedGroupsByColour * @return boolean Whether mappingToColour was changed */ private boolean populateColoursAndReportIfColoursWereChanged(List<List<Atom>> groupsByColour, List<List<Atom>> updatedGroupsByColour) { boolean changeFound = false; int atomsSeen = 0; for (List<Atom> groupWithAColour : groupsByColour) { Collections.sort(groupWithAColour, atomNeighbouringColoursComparator); Atom previousAtom = null; List<Atom> atomsOfThisColour = new ArrayList<>(); for (Atom atom : groupWithAColour) { if (previousAtom != null && atomNeighbouringColoursComparator.compare(previousAtom, atom) != 0){ for (Atom atomOfThisColour : atomsOfThisColour) { if (!changeFound && atomsSeen != mappingToColour.get(atomOfThisColour)){ changeFound = true; } mappingToColour.put(atomOfThisColour, atomsSeen); } updatedGroupsByColour.add(atomsOfThisColour); atomsOfThisColour = new ArrayList<>(); } previousAtom = atom; atomsOfThisColour.add(atom); atomsSeen++; } if (!atomsOfThisColour.isEmpty()){ for (Atom atomOfThisColour : atomsOfThisColour) { if (!changeFound && atomsSeen != mappingToColour.get(atomOfThisColour)){ changeFound = true; } mappingToColour.put(atomOfThisColour, atomsSeen); } updatedGroupsByColour.add(atomsOfThisColour); } } return changeFound; } /** * Produces a sorted (low to high) array of the colour of the atoms surrounding a given atom * @param atom * @return int[] colourOfAdjacentAtoms */ private int[] findColourOfNeighbours(Atom atom) { List<Bond> bonds = atom.getBonds(); int bondCount = bonds.size(); int[] colourOfAdjacentAtoms = new int[bondCount]; for (int i = 0; i < bondCount; i++) { Bond bond = bonds.get(i); Atom otherAtom = bond.getOtherAtom(atom); colourOfAdjacentAtoms[i] = mappingToColour.get(otherAtom); } Arrays.sort(colourOfAdjacentAtoms);//sort such that this goes from low to high return colourOfAdjacentAtoms; } /** * Retrieves a list of any tetrahedral stereoCentres * Internally this is done by checking 
whether the "colour" of all neighbouring atoms of the tetrahedral atom are different * @return List<StereoCentre> */ List<StereoCentre> findStereoCentres(){ List<Atom> potentialStereoAtoms = getPotentialStereoCentres(); List<Atom> trueStereoCentres = new ArrayList<>(); for (Atom potentialStereoAtom : potentialStereoAtoms) { if (isTrueStereCentre(potentialStereoAtom)){ trueStereoCentres.add(potentialStereoAtom); } } List<StereoCentre> stereoCentres = new ArrayList<>(); for (Atom trueStereoCentreAtom : trueStereoCentres) { stereoCentres.add(new StereoCentre(trueStereoCentreAtom, true)); } potentialStereoAtoms.removeAll(trueStereoCentres); List<Atom> paraStereoCentres = findParaStereoCentres(potentialStereoAtoms, trueStereoCentres); for (Atom paraStereoCentreAtom : paraStereoCentres) { stereoCentres.add(new StereoCentre(paraStereoCentreAtom, false)); } return stereoCentres; } /** * Retrieves atoms that pass the isPossiblyStereogenic() criteria * @return */ private List<Atom> getPotentialStereoCentres() { List<Atom> potentialStereoAtoms = new ArrayList<>(); for (Atom atom : atoms) { if (isPossiblyStereogenic(atom)){ potentialStereoAtoms.add(atom); } } return potentialStereoAtoms; } /** * Checks whether the atom has 3 or 4 neighbours all of which are constitutionally different * @param potentialStereoAtom * @return */ private boolean isTrueStereCentre(Atom potentialStereoAtom) { List<Atom> neighbours = potentialStereoAtom.getAtomNeighbours(); if (neighbours.size()!=3 && neighbours.size()!=4){ return false; } int[] colours = new int[4]; for (int i = neighbours.size() - 1 ; i >=0; i--) { colours[i] = mappingToColour.get(neighbours.get(i)); } boolean foundIdenticalNeighbour =false; for (int i = 0; i < 4; i++) { int cl = colours[i]; for (int j = i +1; j < 4; j++) { if (cl == colours[j]){ foundIdenticalNeighbour =true; break; } } } return !foundIdenticalNeighbour; } /** * Finds a subset of the stereocentres associated with rule 2 from: * DOI: 10.1021/ci00016a003 * @param potentialStereoAtoms * @param trueStereoCentres */ private List<Atom> 
findParaStereoCentres(List<Atom> potentialStereoAtoms, List<Atom> trueStereoCentres) { List<Atom> paraStereoCentres = new ArrayList<>(); for (Atom potentialStereoAtom : potentialStereoAtoms) { List<Atom> neighbours = potentialStereoAtom.getAtomNeighbours(); if (neighbours.size() == 4){ int[] colours = new int[4]; for (int i = neighbours.size() - 1 ; i >=0; i--) { colours[i] = mappingToColour.get(neighbours.get(i)); } //find pairs of constitutionally identical substituents Map<Integer, Integer> foundPairs = new HashMap<>(); for (int i = 0; i < 4; i++) { int cl = colours[i]; for (int j = i +1; j < 4; j++) { if (cl == colours[j]){ foundPairs.put(i, j); break; } } } int pairs = foundPairs.keySet().size(); if (pairs==1 || pairs==2){ boolean hasTrueStereoCentreInAllBranches = true; for (Entry<Integer, Integer> entry: foundPairs.entrySet()) { if (!branchesHaveTrueStereocentre(neighbours.get(entry.getKey()), neighbours.get(entry.getValue()), potentialStereoAtom, trueStereoCentres)){ hasTrueStereoCentreInAllBranches = false; break; } } if (hasTrueStereoCentreInAllBranches){ paraStereoCentres.add(potentialStereoAtom); } } } } return paraStereoCentres; } private boolean branchesHaveTrueStereocentre(Atom branchAtom1, Atom branchAtom2, Atom potentialStereoAtom, List<Atom> trueStereoCentres) { List<Atom> atomsToVisit= new ArrayList<>(); Set<Atom> visitedAtoms = new HashSet<>(); visitedAtoms.add(potentialStereoAtom); atomsToVisit.add(branchAtom1); atomsToVisit.add(branchAtom2); while(!atomsToVisit.isEmpty()){ List<Atom> newAtomsToVisit = new ArrayList<>(); while(!atomsToVisit.isEmpty()){ Atom atom = atomsToVisit.remove(0); if (trueStereoCentres.contains(atom)){ return true; } if (atomsToVisit.contains(atom)){//the two branches have converged on this atom, don't investigate neighbours of it do{ atomsToVisit.remove(atom); } while (atomsToVisit.contains(atom)); continue; } else{ List<Atom> neighbours = atom.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (visitedAtoms.contains(neighbour)){ continue; } newAtomsToVisit.add(neighbour); } } 
visitedAtoms.add(atom); } atomsToVisit = newAtomsToVisit; } return false; } /** * Checks whether an atom could be a tetrahedral stereocentre by checking that it is both tetrahedral * and does not have neighbours that are identical due to resonance/tautomerism * @param atom * @return */ static boolean isPossiblyStereogenic(Atom atom){ return isKnownPotentiallyStereogenic(atom) && !isAchiralDueToResonanceOrTautomerism(atom); } /** * Roughly corresponds to the list of atoms in table 8 of the InChI manual * tetrahedral phosphorus/arsenic are also allowed by InChI but are, perhaps due to an oversight, omitted from this table * Essentially does a crude check for whether an atom is known to be able to possess tetrahedral geometry * and whether it is currently tetrahedral. Atoms that are tetrahedral but not typically considered chiral * like tertiary amines are not recognised * @param atom * @return */ static boolean isKnownPotentiallyStereogenic(Atom atom) { List<Atom> neighbours = atom.getAtomNeighbours(); ChemEl chemEl = atom.getElement(); if (neighbours.size() == 4){ if (chemEl == ChemEl.B || chemEl == ChemEl.C || chemEl == ChemEl.Si || chemEl == ChemEl.Ge || chemEl == ChemEl.Sn || chemEl == ChemEl.N || chemEl == ChemEl.P || chemEl == ChemEl.As || chemEl == ChemEl.S || chemEl == ChemEl.Se){ return true; } } else if (neighbours.size() ==3){ if ((chemEl == ChemEl.S || chemEl == ChemEl.Se) && (atom.getIncomingValency()==4 || (atom.getCharge() ==1 && atom.getIncomingValency()==3))){ //tetrahedral sulfur/selenium - 3 bonds and the lone pair return true; } if (chemEl == ChemEl.N && atom.getCharge() ==0 && atom.getIncomingValency()==3 && atomsContainABondBetweenThemselves(neighbours)){ return true; //nitrogen where two attached atoms are connected together } if ((chemEl == ChemEl.P || chemEl == ChemEl.As) &&atom.getIncomingValency() == 3) { //tetrahedral phosphorus/arsenic - 3 bonds and the lone pair return true; } } return false; } private static boolean 
atomsContainABondBetweenThemselves(List<Atom> atoms) { for (Atom atom : atoms) { for (Atom neighbour : atom.getAtomNeighbours()) { if (atoms.contains(neighbour)){ return true; } } } return false; } static boolean isAchiralDueToResonanceOrTautomerism(Atom atom) { ChemEl chemEl = atom.getElement(); if(chemEl == ChemEl.N || chemEl == ChemEl.P || chemEl == ChemEl.As || chemEl == ChemEl.S || chemEl == ChemEl.Se) { List<Atom> neighbours = atom.getAtomNeighbours(); Set<String> resonanceAndTautomerismAtomicElementPlusIsotopes = new HashSet<>(); for (Atom neighbour : neighbours) { ChemEl neighbourChemEl = neighbour.getElement(); if ((neighbourChemEl.isChalcogen() || neighbourChemEl == ChemEl.N) && isOnlyBondedToHydrogensOtherThanGivenAtom(neighbour, atom)){ if (resonanceAndTautomerismAtomicElementPlusIsotopes.contains(neighbourChemEl.toString() + atom.getIsotope())){ return true; } resonanceAndTautomerismAtomicElementPlusIsotopes.add(neighbourChemEl.toString() + atom.getIsotope()); } if (neighbourChemEl == ChemEl.H && neighbour.getBondCount()==1){ //terminal H atom neighbour return true; } } } return false; } private static boolean isOnlyBondedToHydrogensOtherThanGivenAtom(Atom atom, Atom attachedNonHydrogen) { for (Atom neighbour: atom.getAtomNeighbours()) { if (neighbour.equals(attachedNonHydrogen)){ continue; } if (neighbour.getElement() != ChemEl.H){ return false; } } return true; } /** * Retrieves a list of any double bonds possessing the potential to have E/Z stereoChemistry * This is done internally by checking the two atoms attached to the ends of the double bond are different * As an exception nitrogen's lone pair is treated like a low priority group and so is allowed to only have 1 atom connected to it * @return */ List<StereoBond> findStereoBonds() { List<StereoBond> stereoBonds = new ArrayList<>(); for (Bond bond : bonds) { if (bond.getOrder()==2){ Atom a1 = bond.getFromAtom(); List<Atom> neighbours1 = a1.getAtomNeighbours(); neighbours1.remove(bond.getToAtom()); if (neighbours1.size()==2 || 
(neighbours1.size()==1 && a1.getElement() == ChemEl.N && a1.getIncomingValency()==3 && a1.getCharge()==0)){ if (neighbours1.size()==2 && mappingToColour.get(neighbours1.get(0)).equals(mappingToColour.get(neighbours1.get(1)))){ continue; } Atom a2 = bond.getToAtom(); List neighbours2 = a2.getAtomNeighbours(); neighbours2.remove(bond.getFromAtom()); if (neighbours2.size()==2 || (neighbours2.size()==1 && a2.getElement() == ChemEl.N && a2.getIncomingValency()==3 && a2.getCharge()==0)){ if (neighbours2.size()==2 && mappingToColour.get(neighbours2.get(0)).equals(mappingToColour.get(neighbours2.get(1)))){ continue; } stereoBonds.add(new StereoBond(bond)); } } } } return stereoBonds; } /** * Returns a number describing the environment of an atom. Atoms with the same number are in identical environments * Null if atom was not part of this environment analysis * @param a * @return */ Integer getAtomEnvironmentNumber(Atom a) { return mappingToColour.get(a); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StereoGroup.java000066400000000000000000000021031451751637500267710ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.Objects; class StereoGroup implements Comparable { private final StereoGroupType type; private final int number; StereoGroup(StereoGroupType type) { this(type, 1); } StereoGroup(StereoGroupType type, int number) { this.type = type; this.number = number; } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; StereoGroup that = (StereoGroup) o; return number == that.number && type == that.type; } @Override public int hashCode() { return Objects.hash(type, number); } public int compareTo(StereoGroup that) { int cmp = this.type.compareTo(that.type); if (cmp != 0) { return cmp; } return Integer.compare(this.number, that.number); } StereoGroupType getType() { return type; } int getNumber() { return number; } @Override public String toString() { 
		return "StereoGroup{" + "type=" + type + ", number=" + number + '}';
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StereoGroupType.java000066400000000000000000000023401451751637500276360ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

/**
 * Indicates which type of stereo group a stereo-center belongs to. The group
 * can be specified on the {@link AtomParity}. For formats that support enhanced
 * stereo (currently only CXSMILES) you would normally also have a group number
 * (e.g. Rac1/&1, Rac2/&2, Rac3/&3, etc.); however, there is currently
 * no way to specify this separation in IUPAC nomenclature. An extension may be to
 * indicate grouping with parentheses "(1RS)-,(3RS)-" "((1RS),(3RS))-". However, the most
 * common cases of mixed stereo groups are likely to be where part of the
 * structure is known (absolute) and part is racemic/relative, which can be
 * specified like this: "(1RS,3R)-". The most common cases are all absolute, all
 * racemic (AND enantiomer), all relative (OR enantiomer).
 */
public enum StereoGroupType {
	/**
	 * Absolute stereochemistry, the configuration of the stereo center is known.
	 */
	Abs,
	/**
	 * Racemic stereochemistry, the sample is a mixture of both configurations
	 * of the stereo center.
	 */
	Rac,
	/**
	 * Relative stereochemistry, the configuration of a stereo center is unknown
	 * but may be known relative to another configuration.
	 */
	Rel,
	/**
	 * Fallback sentinel value to ensure non-null.
	 */
	Unk
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StereochemistryException.java000066400000000000000000000010331451751637500315640ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

/**
 * Thrown if stereochemistry cannot be applied to a structure
 * @author Daniel
 *
 */
class StereochemistryException extends StructureBuildingException {

	private static final long serialVersionUID = 1L;

	StereochemistryException() {
		super();
	}

	StereochemistryException(String message) {
		super(message);
	}

	StereochemistryException(String message, Throwable cause) {
		super(message, cause);
	}

	StereochemistryException(Throwable cause) {
		super(cause);
	}
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StereochemistryHandler.java000066400000000000000000001344041451751637500312140ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.Map.Entry;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue;
import uk.ac.cam.ch.wwmm.opsin.OpsinWarning.OpsinWarningType;
import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoBond;
import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoCentre;

import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

/**
 * Identifies stereocentres, assigns stereochemistry elements to them and then uses the CIP rules to calculate appropriate atomParity/bondStereo tags
 * @author dl387
 *
 */
class StereochemistryHandler {

	private static final Logger LOG = LogManager.getLogger(StereochemistryHandler.class);

	private final BuildState state;
	private final Map<Atom, StereoCentre> atomStereoCentreMap;
	private final Map<Bond, StereoBond> bondStereoBondMap;
	private final Map<Atom, StereoCentre> notExplicitlyDefinedStereoCentreMap;
	private final Map<Bond, StereoBond> notExplicitlyDefinedStereoBondMap;

StereochemistryHandler(BuildState state, Map atomStereoCentreMap, Map bondStereoBondMap) { this.state = state; this.atomStereoCentreMap = atomStereoCentreMap; notExplicitlyDefinedStereoCentreMap = new HashMap<>(atomStereoCentreMap); this.bondStereoBondMap = bondStereoBondMap; notExplicitlyDefinedStereoBondMap = new HashMap<>(bondStereoBondMap); } /** * Processes and assigns stereochemistry elements to appropriate fragments * @param stereoChemistryEls * @throws StructureBuildingException */ void applyStereochemicalElements(List stereoChemistryEls) throws StructureBuildingException { List locantedStereoChemistryEls = new ArrayList<>(); List unlocantedStereoChemistryEls = new ArrayList<>(); List carbohydrateStereoChemistryEls = new ArrayList<>(); List globalRacemicOrRelative = new ArrayList<>(); for (Element stereoChemistryElement : stereoChemistryEls) { if (stereoChemistryElement.getAttributeValue(LOCANT_ATR)!=null){ locantedStereoChemistryEls.add(stereoChemistryElement); } else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL)){ carbohydrateStereoChemistryEls.add(stereoChemistryElement); } else if (stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(RAC_TYPE_VAL) || stereoChemistryElement.getAttributeValue(TYPE_ATR).equals(REL_TYPE_VAL)) { globalRacemicOrRelative.add(stereoChemistryElement); } else{ unlocantedStereoChemistryEls.add(stereoChemistryElement); } } //perform locanted before unlocanted to avoid unlocanted elements using the stereocentres a locanted element refers to for (Element stereochemistryEl : locantedStereoChemistryEls) { try { matchStereochemistryToAtomsAndBonds(stereochemistryEl); } catch (StereochemistryException e) { if (state.n2sConfig.warnRatherThanFailOnUninterpretableStereochemistry()){ state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, e.getMessage()); } else{ throw e; } } } if (!carbohydrateStereoChemistryEls.isEmpty()){ 
processCarbohydrateStereochemistry(carbohydrateStereoChemistryEls); } for (Element stereochemistryEl : unlocantedStereoChemistryEls) { try { matchStereochemistryToAtomsAndBonds(stereochemistryEl); } catch (StereochemistryException e) { if (state.n2sConfig.warnRatherThanFailOnUninterpretableStereochemistry()){ state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, e.getMessage()); } else{ throw e; } } } if (globalRacemicOrRelative.size() > 1) { if (state.n2sConfig.warnRatherThanFailOnUninterpretableStereochemistry()){ state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, "More than one global indicator of rac- or rel- was specified"); } else { throw new StructureBuildingException("More than one global indicator of rac- or rel- was specified"); } } for (Element stereochemistryEl : globalRacemicOrRelative) { try { matchStereochemistryToAtomsAndBonds(stereochemistryEl); } catch (StereochemistryException e) { if (state.n2sConfig.warnRatherThanFailOnUninterpretableStereochemistry()){ state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, e.getMessage()); } else{ throw e; } } } } /** * Checks that all atomParity and bondStereo elements correspond to identified stereocentres. 
	 * If they do not, the stereocentres have presumably been removed by substitution and hence the atomParity/bondStereo is removed
	 * @param atomsWithPreDefinedAtomParity
	 * @param bondsWithPreDefinedBondStereo
	 */
	void removeRedundantStereoCentres(List<Atom> atomsWithPreDefinedAtomParity, List<Bond> bondsWithPreDefinedBondStereo) {
		for (Atom atom : atomsWithPreDefinedAtomParity) {
			if (!atomStereoCentreMap.containsKey(atom)){
				atom.setAtomParity(null);
			}
		}
		for (Bond bond : bondsWithPreDefinedBondStereo) {
			if (!bondStereoBondMap.containsKey(bond)){
				bond.setBondStereo(null);
			}
		}
	}

	/**
	 * Attempts to locate a suitable atom/bond for the stereochemistryEl, applies it and detaches the stereochemistry element
	 * @param stereoChemistryEl
	 * @throws StructureBuildingException
	 * @throws StereochemistryException
	 */
	private void matchStereochemistryToAtomsAndBonds(Element stereoChemistryEl) throws StructureBuildingException, StereochemistryException {
		String stereoChemistryType = stereoChemistryEl.getAttributeValue(TYPE_ATR);
		if (stereoChemistryType.equals(R_OR_S_TYPE_VAL)){
			assignStereoCentre(stereoChemistryEl);
		}
		else if (stereoChemistryType.equals(E_OR_Z_TYPE_VAL)){
			assignStereoBond(stereoChemistryEl);
		}
		else if (stereoChemistryType.equals(CISORTRANS_TYPE_VAL)){
			//cis/trans on a ring is attempted first; otherwise it is treated like E/Z on a double bond
			if (!assignCisTransOnRing(stereoChemistryEl)){
				assignStereoBond(stereoChemistryEl);
			}
		}
		else if (stereoChemistryType.equals(ALPHA_OR_BETA_TYPE_VAL)){
			assignAlphaBetaXiStereochem(stereoChemistryEl);
		}
		else if (stereoChemistryType.equals(DLSTEREOCHEMISTRY_TYPE_VAL)){
			assignDlStereochem(stereoChemistryEl);
		}
		else if (stereoChemistryType.equals(RAC_TYPE_VAL)){
			applyGlobalRacOrRelFlags(stereoChemistryEl, StereoGroupType.Rac);
		}
		else if (stereoChemistryType.equals(REL_TYPE_VAL)){
			applyGlobalRacOrRelFlags(stereoChemistryEl, StereoGroupType.Rel);
		}
		else if (stereoChemistryType.equals(ENDO_EXO_SYN_ANTI_TYPE_VAL)){
			throw new StereochemistryException(stereoChemistryType + " stereochemistry is not currently interpretable by OPSIN");
		}
		else if
(stereoChemistryType.equals(RELATIVECISTRANS_TYPE_VAL)){ throw new StereochemistryException(stereoChemistryType + " stereochemistry is not currently interpretable by OPSIN"); } else if (stereoChemistryType.equals(AXIAL_TYPE_VAL)){ throw new StereochemistryException(stereoChemistryType + " stereochemistry is not currently interpretable by OPSIN"); } else if (stereoChemistryType.equals(OPTICALROTATION_TYPE_VAL)){ state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, "Optical rotation cannot be algorithmically used to assign stereochemistry. This term was ignored: " + stereoChemistryEl.getValue()); } else{ throw new StructureBuildingException("Unexpected stereochemistry type: " +stereoChemistryType); } stereoChemistryEl.detach(); } /** * Groups carbohydrateStereoChemistryEls by their parent element and * sends them for further processing * @param carbohydrateStereoChemistryEls * @throws StructureBuildingException */ private void processCarbohydrateStereochemistry(List carbohydrateStereoChemistryEls) throws StructureBuildingException { Map> groupToStereochemEls = new HashMap<>(); for (Element carbohydrateStereoChemistryEl : carbohydrateStereoChemistryEls) { Element nextGroup = OpsinTools.getNextSibling(carbohydrateStereoChemistryEl, GROUP_EL); if (nextGroup ==null || (!SYSTEMATICCARBOHYDRATESTEMALDOSE_SUBTYPE_VAL.equals(nextGroup.getAttributeValue(SUBTYPE_ATR)) && !SYSTEMATICCARBOHYDRATESTEMKETOSE_SUBTYPE_VAL.equals(nextGroup.getAttributeValue(SUBTYPE_ATR)))){ throw new RuntimeException("OPSIN bug: Could not find carbohydrate chain stem to apply stereochemistry to"); } if (groupToStereochemEls.get(nextGroup)==null){ groupToStereochemEls.put(nextGroup, new ArrayList<>()); } groupToStereochemEls.get(nextGroup).add(carbohydrateStereoChemistryEl); } for (Entry> entry : groupToStereochemEls.entrySet()) { assignCarbohydratePrefixStereochem(entry.getKey(), entry.getValue()); } } /** * Applies global racemic/relative stereochemistry. 
The logic is as follows:
	 * - Atoms are divided into two sets: defined and undefined.
	 * - (1) If there is a single undefined stereogenic atom it is set to R and marked as racemic/relative
	 * - (2) If there is more than one undefined stereogenic atom the global specification is ignored (a warning is recorded).
	 * - (3) If there are no undefined stereocenters then all defined stereocenters are marked as racemic/relative
	 *
	 * @param stereoChemistryEl the stereo chemistry element
	 * @param groupType the stereo group Rac(emic) or Rel(ative).
	 * @throws StructureBuildingException
	 * @throws StereochemistryException
	 */
	private void applyGlobalRacOrRelFlags(Element stereoChemistryEl, StereoGroupType groupType) throws StructureBuildingException, StereochemistryException {
		Element wordParent = stereoChemistryEl.getParent();
		while (wordParent != null && !wordParent.getName().equals(WORD_EL)) {
			wordParent = wordParent.getParent();
		}
		if (wordParent == null) {
			return;
		}

		Element parentSubBracketOrRoot = stereoChemistryEl.getParent();
		List<Fragment> possibleFragments = new ArrayList<>(StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot));
		List<Element> adjacentGroupEls = OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL);
		for (int i = adjacentGroupEls.size()-1; i >=0; i--) {
			possibleFragments.add(adjacentGroupEls.get(i).getFrag());
		}

		// collect all fragments that occur in subsequent words
		List<Element> words = OpsinTools.getNextSiblingsOfType(wordParent, WORD_EL);
		for (Element word : words) {
			List<Element> possibleGroups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL);
			for (int i = possibleGroups.size() - 1; i >= 0; i--) {
				possibleFragments.add(possibleGroups.get(i).getFrag());
			}
		}

		// first sort atoms into those with already defined and those with undefined stereochemistry
		List<Atom> undefinedStereo = new ArrayList<>();
		List<Atom> definedStereo = new ArrayList<>();
		for (Fragment fragment : possibleFragments) {
			List<Atom> atomList = fragment.getAtomList();
			for (Atom potentialStereoAtom : atomList) {
				if (potentialStereoAtom.getAtomParity() != null) {
					definedStereo.add(potentialStereoAtom);
				} else if
(notExplicitlyDefinedStereoCentreMap.get(potentialStereoAtom) != null) { undefinedStereo.add(potentialStereoAtom); } } } if (undefinedStereo.size() > 0) { if (undefinedStereo.size() > 1) { state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, "More than one undefined stereocenter for rac- or rel- mixture"); return; } try { applyStereoChemistryToStereoCentre(undefinedStereo.get(0), notExplicitlyDefinedStereoCentreMap.get(undefinedStereo.get(0)), "R"); } catch (CipOrderingException e) { state.addWarning(OpsinWarningType.STEREOCHEMISTRY_IGNORED, "Could not set rac- or rel- stereochemistry: " + e.getMessage()); return; } undefinedStereo.get(0).setStereoGroup(new StereoGroup(groupType)); notExplicitlyDefinedStereoCentreMap.remove(undefinedStereo.get(0)); } else { for (Atom atom : definedStereo) { atom.setStereoGroup(new StereoGroup(groupType)); } } } /** * Handles R/S stereochemistry. r/s is not currently handled * @param stereoChemistryEl * @throws StructureBuildingException * @throws StereochemistryException */ private void assignStereoCentre(Element stereoChemistryEl) throws StructureBuildingException, StereochemistryException { //generally the LAST group in this list will be the appropriate group e.g. 
(5S)-5-ethyl-6-methylheptane where the heptane is the appropriate group //we use the same algorithm as for unlocanted substitution so as to deprecate assignment into brackets Element parentSubBracketOrRoot = stereoChemistryEl.getParent(); List possibleFragments = StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot); List adjacentGroupEls = OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL); for (int i = adjacentGroupEls.size()-1; i >=0; i--) { possibleFragments.add(adjacentGroupEls.get(i).getFrag()); } String locant = stereoChemistryEl.getAttributeValue(LOCANT_ATR); String rOrS = stereoChemistryEl.getAttributeValue(VALUE_ATR); String grpStr = stereoChemistryEl.getAttributeValue(STEREOGROUP_ATR); StereoGroupType grpType = grpStr != null ? StereoGroupType.valueOf(grpStr) : StereoGroupType.Unk; for (Fragment fragment : possibleFragments) { if (attemptAssignmentOfStereoCentreToFragment(fragment, rOrS, locant, grpType)) { return; } } Element possibleWordParent = parentSubBracketOrRoot.getParent(); if (possibleWordParent.getName().equals(WORD_EL) && possibleWordParent.getChild(0).equals(parentSubBracketOrRoot)){ //something like (3R,4R,5R)-ethyl 4-acetamido-5-amino-3-(pentan-3-yloxy)cyclohex-1-enecarboxylate //i.e. 
the stereochemistry is in a different word to what it is applied to List words = OpsinTools.getNextSiblingsOfType(possibleWordParent, WORD_EL); for (Element word : words) { List possibleGroups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL); for (int i = possibleGroups.size()-1; i >=0; i--) { if (attemptAssignmentOfStereoCentreToFragment(possibleGroups.get(i).getFrag(), rOrS, locant, grpType)) { return; } } } } throw new StereochemistryException("Could not find atom that: " + stereoChemistryEl.toXML() + " appeared to be referring to"); } private boolean attemptAssignmentOfStereoCentreToFragment(Fragment fragment, String rOrS, String locant, StereoGroupType stereoType) throws StereochemistryException, StructureBuildingException { if (locant == null) {//undefined locant List atomList = fragment.getAtomList(); for (Atom potentialStereoAtom : atomList) { if (notExplicitlyDefinedStereoCentreMap.containsKey(potentialStereoAtom)){ applyStereoChemistryToStereoCentre(potentialStereoAtom, notExplicitlyDefinedStereoCentreMap.get(potentialStereoAtom), rOrS); potentialStereoAtom.setStereoGroup(new StereoGroup(stereoType)); notExplicitlyDefinedStereoCentreMap.remove(potentialStereoAtom); return true; } } } else{ Atom potentialStereoAtom = fragment.getAtomByLocant(locant); if (potentialStereoAtom !=null && notExplicitlyDefinedStereoCentreMap.containsKey(potentialStereoAtom)){ applyStereoChemistryToStereoCentre(potentialStereoAtom, notExplicitlyDefinedStereoCentreMap.get(potentialStereoAtom), rOrS); potentialStereoAtom.setStereoGroup(new StereoGroup(stereoType)); notExplicitlyDefinedStereoCentreMap.remove(potentialStereoAtom); return true; } } return false; } /** * Assigns atom parity to the given atom in accordance with the CIP rules * @param atom The stereoAtom * @param stereoCentre * @param rOrS The description given in the name * @throws StructureBuildingException * @throws StereochemistryException */ private void applyStereoChemistryToStereoCentre(Atom atom, 
StereoCentre stereoCentre, String rOrS) throws StructureBuildingException, StereochemistryException { List cipOrderedAtoms =stereoCentre.getCipOrderedAtoms(); if (cipOrderedAtoms.size()!=4){ throw new StructureBuildingException("Only tetrahedral chirality is currently supported"); } Atom[] atomRefs4 = new Atom[4]; atomRefs4[0] = cipOrderedAtoms.get(cipOrderedAtoms.size()-1); for (int i = 0; i < cipOrderedAtoms.size() -1; i++) {//from highest to lowest (true for S) hence atomParity 1 for S atomRefs4[i+1] = cipOrderedAtoms.get(i); } if (rOrS.equals("R")) { atom.setAtomParity(atomRefs4, -1); } else if (rOrS.equals("S")) { atom.setAtomParity(atomRefs4, 1); } else{ throw new StructureBuildingException("Unexpected stereochemistry type: " + rOrS); } } /** * Handles E/Z stereochemistry and cis/trans in cases where cis/trans unambiguously corresponds to E/Z * @param stereoChemistryEl * @throws StructureBuildingException * @throws StereochemistryException */ private void assignStereoBond(Element stereoChemistryEl) throws StructureBuildingException, StereochemistryException { //generally the LAST group in this list will be the appropriate groups e.g. 
(2Z)-5-ethyl-6-methylhex-2-ene where the hex-2-ene is the appropriate group //we use the same algorithm as for unlocanted substitution so as to deprecate assignment into brackets Element parentSubBracketOrRoot = stereoChemistryEl.getParent(); List possibleFragments = StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot); List adjacentGroupEls = OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL); for (int i = adjacentGroupEls.size()-1; i >=0; i--) { possibleFragments.add(adjacentGroupEls.get(i).getFrag()); } String locant = stereoChemistryEl.getAttributeValue(LOCANT_ATR); String eOrZ = stereoChemistryEl.getAttributeValue(VALUE_ATR); boolean isCisTrans =false; if (stereoChemistryEl.getAttributeValue(TYPE_ATR).equals(CISORTRANS_TYPE_VAL)){ isCisTrans =true; String cisOrTrans = stereoChemistryEl.getAttributeValue(VALUE_ATR); if (cisOrTrans.equalsIgnoreCase("cis")){ eOrZ = "Z"; } else if (cisOrTrans.equalsIgnoreCase("trans")){ eOrZ = "E"; } else{ throw new StructureBuildingException("Unexpected cis/trans stereochemistry type: " +cisOrTrans); } } for (Fragment fragment : possibleFragments) { if (attemptAssignmentOfStereoBondToFragment(fragment, eOrZ, locant, isCisTrans)) { return; } } Element possibleWordParent = parentSubBracketOrRoot.getParent(); if (possibleWordParent.getName().equals(WORD_EL) && possibleWordParent.getAttributeValue(TYPE_ATR).equals(WordType.substituent.toString())){ //the element is in front of a substituent and may refer to the full group //i.e. 
the stereochemistry is in a different word to what it is applied to List words = OpsinTools.getChildElementsWithTagNameAndAttribute(possibleWordParent.getParent(), WORD_EL, TYPE_ATR, WordType.full.toString()); for (Element word : words) { List possibleGroups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL); for (int i = possibleGroups.size()-1; i >=0; i--) { if (attemptAssignmentOfStereoBondToFragment(possibleGroups.get(i).getFrag(), eOrZ, locant, isCisTrans)) { return; } } } } if (isCisTrans){ throw new StereochemistryException("Could not find bond that: " + stereoChemistryEl.toXML() + " could refer unambiguously to"); } else{ throw new StereochemistryException("Could not find bond that: " + stereoChemistryEl.toXML() + " was referring to"); } } private boolean attemptAssignmentOfStereoBondToFragment(Fragment fragment, String eOrZ, String locant, boolean isCisTrans) throws StereochemistryException { if (locant == null){//undefined locant Set bondSet = fragment.getBondSet(); for (Bond potentialBond : bondSet) { if (notExplicitlyDefinedStereoBondMap.containsKey(potentialBond) && (!isCisTrans || cisTransUnambiguousOnBond(potentialBond))){ applyStereoChemistryToStereoBond(potentialBond, notExplicitlyDefinedStereoBondMap.get(potentialBond), eOrZ); notExplicitlyDefinedStereoBondMap.remove(potentialBond); return true; } } List sortedInterFragmentBonds = sortInterFragmentBonds(state.fragManager.getInterFragmentBonds(fragment), fragment); for (Bond potentialBond : sortedInterFragmentBonds) { if (notExplicitlyDefinedStereoBondMap.containsKey(potentialBond) && (!isCisTrans || cisTransUnambiguousOnBond(potentialBond))){ applyStereoChemistryToStereoBond(potentialBond, notExplicitlyDefinedStereoBondMap.get(potentialBond), eOrZ); notExplicitlyDefinedStereoBondMap.remove(potentialBond); return true; } } } else{ Atom firstAtomInBond = fragment.getAtomByLocant(locant); if (firstAtomInBond !=null){ List bonds = firstAtomInBond.getBonds(); for (Bond potentialBond : bonds) 
{ if (notExplicitlyDefinedStereoBondMap.containsKey(potentialBond) && (!isCisTrans || cisTransUnambiguousOnBond(potentialBond))){ applyStereoChemistryToStereoBond(potentialBond, notExplicitlyDefinedStereoBondMap.get(potentialBond), eOrZ); notExplicitlyDefinedStereoBondMap.remove(potentialBond); return true; } } } } return false; } /** * Does the stereoBond have a hydrogen connected to both ends of it. * If not it is ambiguous when used in conjunction with cis/trans and E/Z should be used. * @param potentialBond * @return */ static boolean cisTransUnambiguousOnBond(Bond potentialBond) { List neighbours1 = potentialBond.getFromAtom().getAtomNeighbours(); boolean foundHydrogen1 =false; for (Atom neighbour : neighbours1) { if (neighbour.getElement() == ChemEl.H){ foundHydrogen1 =true; } } List neighbours2 = potentialBond.getToAtom().getAtomNeighbours(); boolean foundHydrogen2 =false; for (Atom neighbour : neighbours2) { if (neighbour.getElement() == ChemEl.H){ foundHydrogen2 =true; } } return (foundHydrogen1 && foundHydrogen2); } /** * Sorts bonds such that those originating from the given fragment are preferred * @param interFragmentBonds A set of interFragment Bonds * @param preferredOriginatingFragment * @return A sorted list */ private List sortInterFragmentBonds(Set interFragmentBonds, Fragment preferredOriginatingFragment) { List interFragmentBondList = new ArrayList<>(); for (Bond bond : interFragmentBonds) { if (bond.getFromAtom().getFrag() ==preferredOriginatingFragment){ interFragmentBondList.add(0, bond); } else{ interFragmentBondList.add(bond); } } return interFragmentBondList; } /** * Assigns bondstereo to the given bond in accordance with the CIP rules * @param bond The stereobond * @param stereoBond * @param eOrZ The stereo description given in the name * @throws StereochemistryException */ private void applyStereoChemistryToStereoBond(Bond bond, StereoBond stereoBond, String eOrZ ) throws StereochemistryException { List stereoBondAtoms = 
stereoBond.getOrderedStereoAtoms();
		//stereoBondAtoms contains the higher priority atom at one end, the two bond atoms and the higher priority atom at the other end
		Atom[] atomRefs4 = new Atom[4];
		atomRefs4[0] = stereoBondAtoms.get(0);
		atomRefs4[1] = stereoBondAtoms.get(1);
		atomRefs4[2] = stereoBondAtoms.get(2);
		atomRefs4[3] = stereoBondAtoms.get(3);
		if (eOrZ.equals("E")){
			bond.setBondStereoElement(atomRefs4, BondStereoValue.TRANS);
		}
		else if (eOrZ.equals("Z")){
			bond.setBondStereoElement(atomRefs4, BondStereoValue.CIS);
		}
		else if (eOrZ.equals("EZ")){
			//explicitly unspecified double bond geometry: any existing bondStereo is cleared
			bond.setBondStereo(null);
		}
		else{
			throw new IllegalArgumentException("Unexpected stereochemistry type: " + eOrZ);
		}
	}

	/**
	 * Searches for instances of two tetrahedral stereocentres/pseudo-stereocentres
	 * then sets their configuration such that the substituents at these centres are cis or trans to each other
	 * @param stereoChemistryEl
	 * @return true if the cis/trans descriptor was applied to a pair of ring atoms
	 * @throws StructureBuildingException
	 */
	private boolean assignCisTransOnRing(Element stereoChemistryEl) throws StructureBuildingException {
		if (stereoChemistryEl.getAttribute(LOCANT_ATR) != null) {
			return false;
		}
		Element parentSubBracketOrRoot = stereoChemistryEl.getParent();
		List<Fragment> possibleFragments = StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot);
		List<Element> adjacentGroupEls = OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL);
		for (int i = adjacentGroupEls.size()-1; i >=0; i--) {
			possibleFragments.add(adjacentGroupEls.get(i).getFrag());
		}
		for (Fragment fragment : possibleFragments) {
			if (attemptAssignmentOfCisTransRingStereoToFragment(fragment, stereoChemistryEl)){
				return true;
			}
		}
		Element possibleWordParent = parentSubBracketOrRoot.getParent();
		if (possibleWordParent.getName().equals(WORD_EL) && possibleWordParent.getChild(0).equals(parentSubBracketOrRoot)){
			//stereochemistry is in a different word to what it is applied to
			List<Element> words = OpsinTools.getNextSiblingsOfType(possibleWordParent, WORD_EL);
			for (Element word : words) {
				List<Element>
possibleGroups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL); for (int i = possibleGroups.size()-1; i >=0; i--) { if (attemptAssignmentOfCisTransRingStereoToFragment(possibleGroups.get(i).getFrag(), stereoChemistryEl)) { return true; } } } } return false; } private boolean attemptAssignmentOfCisTransRingStereoToFragment(Fragment fragment, Element stereoChemistryEl) throws StructureBuildingException { List atomList = fragment.getAtomList(); List chosenStereoAtoms = new ArrayList<>(); List stereoAtomsWithTwoNonHydrogen = new ArrayList<>(); for (Atom potentialStereoAtom : atomList) { if (potentialStereoAtom.getAtomIsInACycle()){ List neighbours = potentialStereoAtom.getAtomNeighbours(); if (neighbours.size() == 4) { int hydrogenCount = 0; int acylicOrNotInFrag = 0; for (Atom neighbour : neighbours) { if (neighbour.getElement() == ChemEl.H) { hydrogenCount++; } if (!neighbour.getAtomIsInACycle() || !atomList.contains(neighbour)) { acylicOrNotInFrag++; } } if (hydrogenCount == 1 || (hydrogenCount == 0 && acylicOrNotInFrag == 1)) { chosenStereoAtoms.add(potentialStereoAtom); } else if (hydrogenCount == 0 && acylicOrNotInFrag == 2 && notExplicitlyDefinedStereoCentreMap.containsKey(potentialStereoAtom)) { stereoAtomsWithTwoNonHydrogen.add(potentialStereoAtom); } } } } boolean chooseAtomByCip = false; if (chosenStereoAtoms.size() < 2 && chosenStereoAtoms.size() + stereoAtomsWithTwoNonHydrogen.size() == 2) { chosenStereoAtoms.addAll(stereoAtomsWithTwoNonHydrogen); chooseAtomByCip = true; } if (chosenStereoAtoms.size() == 2) { Atom a1 = chosenStereoAtoms.get(0); Atom a2 = chosenStereoAtoms.get(1); if (a1.getAtomParity() != null && a2.getAtomParity() != null){//one can have defined stereochemistry but not both return false; } Set peripheryBonds = determinePeripheryBonds(fragment); List> paths = CycleDetector.getPathBetweenAtomsUsingBonds(a1, a2, peripheryBonds); if (paths.size() != 2) { return false; } applyStereoChemistryToCisTransOnRing(a1, a2, paths, 
atomList, stereoChemistryEl.getAttributeValue(VALUE_ATR), chooseAtomByCip); notExplicitlyDefinedStereoCentreMap.remove(chosenStereoAtoms.get(0)); notExplicitlyDefinedStereoCentreMap.remove(chosenStereoAtoms.get(1)); if (chooseAtomByCip) { state.addIsAmbiguous("Ring cis/trans applied to stereocenter where no hydrogen was present. Cahn-Ingold-Prelog rules used to determine which substituents are cis/trans, but other conventions may be in use"); } return true; } return false; } private Set determinePeripheryBonds(Fragment fragment) { List rings = SSSRFinder.getSetOfSmallestRings(fragment); FusedRingNumberer.setupAdjacentFusedRingProperties(rings); Set bondsToConsider = new HashSet<>(); for (Ring ring : rings) { for (Bond bond : ring.getBondList()) { bondsToConsider.add(bond); } } for (Ring ring : rings) { bondsToConsider.removeAll(ring.getFusedBonds()); } return bondsToConsider; } private void applyStereoChemistryToCisTransOnRing(Atom a1, Atom a2, List> paths, List fragmentAtoms, String cisOrTrans, boolean chooseAtomByCip) throws StructureBuildingException { List a1Neighbours = a1.getAtomNeighbours(); Atom[] atomRefs4a1 = new Atom[4]; Atom firstPathAtom = paths.get(0).size()>0 ? paths.get(0).get(0) : a2; atomRefs4a1[2] = firstPathAtom; Atom secondPathAtom = paths.get(1).size()>0 ? 
paths.get(1).get(0) : a2; atomRefs4a1[3] = secondPathAtom; a1Neighbours.remove(firstPathAtom); a1Neighbours.remove(secondPathAtom); if (firstPathAtom.equals(secondPathAtom)){ throw new StructureBuildingException("OPSIN Bug: cannot assign cis/trans on ring stereochemistry"); } if (chooseAtomByCip) { atomRefs4a1[1] = getLowestCip(a1, a1Neighbours); } else { atomRefs4a1[1] = getHydrogenOrAcyclicOrOutsideOfFragment(a1Neighbours, fragmentAtoms); } if (atomRefs4a1[1] ==null){ throw new StructureBuildingException("OPSIN Bug: cannot assign cis/trans on ring stereochemistry"); } a1Neighbours.remove(atomRefs4a1[1]); atomRefs4a1[0] = a1Neighbours.get(0); List a2Neighbours = a2.getAtomNeighbours(); Atom[] atomRefs4a2 = new Atom[4]; firstPathAtom = paths.get(0).size()>0 ? paths.get(0).get(paths.get(0).size()-1) : a1; atomRefs4a2[2] = firstPathAtom; secondPathAtom = paths.get(1).size()>0 ? paths.get(1).get(paths.get(1).size()-1) : a1; atomRefs4a2[3] = secondPathAtom; a2Neighbours.remove(firstPathAtom); a2Neighbours.remove(secondPathAtom); if (firstPathAtom.equals(secondPathAtom)){ throw new StructureBuildingException("OPSIN Bug: cannot assign cis/trans on ring stereochemistry"); } if (chooseAtomByCip) { atomRefs4a2[1] = getLowestCip(a2, a2Neighbours); } else { atomRefs4a2[1] = getHydrogenOrAcyclicOrOutsideOfFragment(a2Neighbours, fragmentAtoms); } if (atomRefs4a2[1] ==null){ throw new StructureBuildingException("OPSIN Bug: cannot assign cis/trans on ring stereochemistry"); } a2Neighbours.remove(atomRefs4a2[1]); atomRefs4a2[0] = a2Neighbours.get(0); boolean enantiomer =false; if (a1.getAtomParity()!=null){ if (!checkEquivalencyOfAtomsRefs4AndParity(atomRefs4a1, 1, a1.getAtomParity().getAtomRefs4(), a1.getAtomParity().getParity())){ enantiomer=true; } } else if (a2.getAtomParity()!=null){ if (cisOrTrans.equals("cis")){ if (!checkEquivalencyOfAtomsRefs4AndParity(atomRefs4a2, -1, a2.getAtomParity().getAtomRefs4(), a2.getAtomParity().getParity())){ enantiomer=true; } } else if 
(cisOrTrans.equals("trans")){ if (!checkEquivalencyOfAtomsRefs4AndParity(atomRefs4a2, 1, a2.getAtomParity().getAtomRefs4(), a2.getAtomParity().getParity())){ enantiomer=true; } } } if (enantiomer){ if (cisOrTrans.equals("cis")){ a1.setAtomParity(atomRefs4a1, -1); a2.setAtomParity(atomRefs4a2, 1); } else if (cisOrTrans.equals("trans")){ a1.setAtomParity(atomRefs4a1, -1); a2.setAtomParity(atomRefs4a2, -1); } } else{ if (cisOrTrans.equals("cis")){ a1.setAtomParity(atomRefs4a1, 1); a2.setAtomParity(atomRefs4a2, -1); } else if (cisOrTrans.equals("trans")){ a1.setAtomParity(atomRefs4a1, 1); a2.setAtomParity(atomRefs4a2, 1); } } } private Atom getLowestCip(Atom a, List atomsToConsider) { try { List neigh = new CipSequenceRules(a).getNeighbouringAtomsInCipOrder(); for (Atom atom : neigh) { if (!atomsToConsider.contains(atom)) { continue; } return atom; } } catch (CipOrderingException e) { LOG.debug(e.getMessage(), e); } return null; } private Atom getHydrogenOrAcyclicOrOutsideOfFragment(List atoms, List fragmentAtoms) { for (Atom atom : atoms) { if (atom.getElement() == ChemEl.H){ return atom; } } for (Atom atom : atoms) { if (!atom.getAtomIsInACycle() || !fragmentAtoms.contains(atom)){ return atom; } } return null; } /** * Handles assignment of alpha and beta stereochemistry to appropriate ring systems * Currently these are only assignable to natural products * Xi (unknown) stereochemistry is applicable to any tetrahedral centre * @param stereoChemistryEl * @throws StructureBuildingException */ private void assignAlphaBetaXiStereochem(Element stereoChemistryEl) throws StructureBuildingException { Element parentSubBracketOrRoot = stereoChemistryEl.getParent(); List possibleFragments = StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot); Fragment substituentGroup =null; if (parentSubBracketOrRoot.getName().equals(SUBSTITUENT_EL)){ substituentGroup = parentSubBracketOrRoot.getFirstChildElement(GROUP_EL).getFrag(); } List adjacentGroupEls = 
OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL); for (int i = adjacentGroupEls.size()-1; i >=0; i--) { possibleFragments.add(adjacentGroupEls.get(i).getFrag()); } String locant = stereoChemistryEl.getAttributeValue(LOCANT_ATR); String alphaOrBeta = stereoChemistryEl.getAttributeValue(VALUE_ATR); for (Fragment fragment : possibleFragments) { Atom potentialStereoAtom = fragment.getAtomByLocant(locant); if (potentialStereoAtom !=null && atomStereoCentreMap.containsKey(potentialStereoAtom)){//same stereocentre can be defined twice e.g. one subsituent alpha the other beta if (alphaOrBeta.equals("xi")){ potentialStereoAtom.setAtomParity(null); } else { String alphaBetaClockWiseAtomOrdering = fragment.getTokenEl().getAttributeValue(ALPHABETACLOCKWISEATOMORDERING_ATR); if (alphaBetaClockWiseAtomOrdering==null){ throw new StructureBuildingException("Identified fragment is not known to be able to support alpha/beta stereochemistry"); } applyAlphaBetaStereochemistryToStereoCentre(potentialStereoAtom, fragment, alphaBetaClockWiseAtomOrdering, alphaOrBeta, substituentGroup); } notExplicitlyDefinedStereoCentreMap.remove(potentialStereoAtom); return; } } throw new StructureBuildingException("Could not find atom that: " + stereoChemistryEl.toXML() + " appeared to be referring to"); } /** * Converts the alpha/beta descriptor into an atomRefs4 and parity. * The ordering of atoms in the atomsRefs4 is determined by using the two adjacent atoms along the rings edge as defined by ALPHABETACLOCKWISEATOMORDERING_ATR. 
* by what atom is also part of the ring or is a hydrogen * and by the substituent atom (as determined by the optional substituentGroup group parameter or by being a non-hydrogen) * @param stereoAtom * @param fragment * @param alphaBetaClockWiseAtomOrdering * @param alphaOrBeta * @param substituentGroup * @throws StructureBuildingException */ private void applyAlphaBetaStereochemistryToStereoCentre(Atom stereoAtom, Fragment fragment, String alphaBetaClockWiseAtomOrdering, String alphaOrBeta, Fragment substituentGroup) throws StructureBuildingException { List ringOrder = StringTools.arrayToList(alphaBetaClockWiseAtomOrdering.split("/")); int positionInList = ringOrder.indexOf(stereoAtom.getFirstLocant()); if (stereoAtom.getAtomIsInACycle() && positionInList!=-1){ Atom[] atomRefs4 = new Atom[4]; List neighbours = stereoAtom.getAtomNeighbours(); if (neighbours.size()==4){ int previousIndice = positionInList==0 ? ringOrder.size()-1: positionInList -1; int nextindice = positionInList==ringOrder.size()-1? 0: positionInList +1; atomRefs4[0] = fragment.getAtomByLocantOrThrow(ringOrder.get(previousIndice)); atomRefs4[3] = fragment.getAtomByLocantOrThrow(ringOrder.get(nextindice)); neighbours.remove(atomRefs4[0]); neighbours.remove(atomRefs4[3]); Atom a1 =neighbours.get(0); Atom a2 =neighbours.get(1); if ((fragment.getAtomList().contains(a1) && ringOrder.contains(a1.getFirstLocant()))){ atomRefs4[1]=a1; atomRefs4[2]=a2; } else if ((fragment.getAtomList().contains(a2) && ringOrder.contains(a2.getFirstLocant()))){ atomRefs4[1]=a2; atomRefs4[2]=a1; } else if (a1.getElement() == ChemEl.H && a2.getElement() != ChemEl.H){ atomRefs4[1]=a2; atomRefs4[2]=a1; } else if (a2.getElement() == ChemEl.H && a1.getElement() != ChemEl.H){ atomRefs4[1]=a1; atomRefs4[2]=a2; }//TODO support case where alpha/beta are applied prior to a suffix (and the stereocentre doesn't have a hydrogen) e.g. 
17alpha-yl else if (substituentGroup !=null && fragment !=substituentGroup && substituentGroup.getAtomList().contains(a1)){ atomRefs4[1]=a1; atomRefs4[2]=a2; } else if (substituentGroup !=null && fragment !=substituentGroup && substituentGroup.getAtomList().contains(a2)){ atomRefs4[1]=a2; atomRefs4[2]=a1; } else{ throw new StructureBuildingException("alpha/beta stereochemistry could not be determined at position " +stereoAtom.getFirstLocant()); } AtomParity previousAtomParity = stereoAtom.getAtomParity(); if (alphaOrBeta.equals("alpha")){ stereoAtom.setAtomParity(atomRefs4, 1); } else if (alphaOrBeta.equals("beta")){ stereoAtom.setAtomParity(atomRefs4, -1); } else{ throw new StructureBuildingException("OPSIN Bug: malformed alpha/beta stereochemistry value"); } if (!notExplicitlyDefinedStereoCentreMap.containsKey(stereoAtom)){//stereocentre has already been defined, need to check for contradiction! AtomParity newAtomParity =stereoAtom.getAtomParity(); if (previousAtomParity == null){ if (newAtomParity != null){ throw new StructureBuildingException("contradictory alpha/beta stereochemistry at position " +stereoAtom.getFirstLocant()); } } else if (newAtomParity == null){ if (previousAtomParity != null){ throw new StructureBuildingException("contradictory alpha/beta stereochemistry at position " +stereoAtom.getFirstLocant()); } } else if (!checkEquivalencyOfAtomsRefs4AndParity(previousAtomParity.getAtomRefs4(), previousAtomParity.getParity(), newAtomParity.getAtomRefs4(), newAtomParity.getParity())){ throw new StructureBuildingException("contradictory alpha/beta stereochemistry at position " +stereoAtom.getFirstLocant()); } } } else{ throw new StructureBuildingException("Unsupported stereocentre type for alpha/beta stereochemistry"); } } else{ throw new StructureBuildingException("Unsupported stereocentre type for alpha/beta stereochemistry"); } } /** * Applies carbohydate configurational prefixes to the appropriate carbohydrateStem * @param carbohydrateGroup * @param 
carbohydrateStereoChemistryEls * @throws StructureBuildingException */ private void assignCarbohydratePrefixStereochem(Element carbohydrateGroup, List carbohydrateStereoChemistryEls) throws StructureBuildingException { Fragment carbohydrate = carbohydrateGroup.getFrag(); Set atoms = notExplicitlyDefinedStereoCentreMap.keySet(); List stereocentresInCarbohydrate = new ArrayList<>(); for (Atom atom : atoms) { if (carbohydrate.getAtomByID(atom.getID())!=null){ Boolean isAnomeric = atom.getProperty(Atom.ISANOMERIC); if (isAnomeric ==null || !isAnomeric) { stereocentresInCarbohydrate.add(atom); } } } //stereoconfiguration is specified from the farthest from C-1 to nearest to C-1 //but it is easier to set it the other way around hence this reverse Collections.reverse(carbohydrateStereoChemistryEls); List stereocentreConfiguration = new ArrayList<>(); for (Element carbohydrateStereoChemistryEl: carbohydrateStereoChemistryEls) { String[] values = carbohydrateStereoChemistryEl.getAttributeValue(VALUE_ATR).split("/"); Collections.addAll(stereocentreConfiguration, values); } if (stereocentresInCarbohydrate.size() != stereocentreConfiguration.size()){ throw new StructureBuildingException("Disagreement between number of stereocentres on carbohydrate: " + stereocentresInCarbohydrate.size() + " and centres defined by configurational prefixes: " + stereocentreConfiguration.size()); } Collections.sort(stereocentresInCarbohydrate, new FragmentTools.SortByLocants()); for (int i = 0; i < stereocentresInCarbohydrate.size(); i++) { Atom stereoAtom =stereocentresInCarbohydrate.get(i); String configuration = stereocentreConfiguration.get(i); if (configuration.equals("r")){ AtomParity atomParity = stereoAtom.getAtomParity(); if (atomParity ==null){ throw new RuntimeException("OPSIN bug: stereochemistry was not defined on a carbohydrate stem, but it should been"); } //do nothing, r by default } else if (configuration.equals("l")){ AtomParity atomParity = stereoAtom.getAtomParity(); if 
(atomParity ==null){ throw new RuntimeException("OPSIN bug: stereochemistry was not defined on a carbohydrate stem, but it should been"); } atomParity.setParity(-atomParity.getParity()); } else if (configuration.equals("?")){ stereoAtom.setAtomParity(null); } else{ throw new RuntimeException("OPSIN bug: unexpected carbohydrate stereochemistry configuration: " + configuration); } notExplicitlyDefinedStereoCentreMap.remove(stereoAtom); } } private void assignDlStereochem(Element stereoChemistryEl) throws StructureBuildingException { String dOrL = stereoChemistryEl.getAttributeValue(VALUE_ATR); Element elementToApplyTo = OpsinTools.getNextSiblingIgnoringCertainElements(stereoChemistryEl, new String[]{STEREOCHEMISTRY_EL}); if (elementToApplyTo != null && elementToApplyTo.getName().equals(GROUP_EL) && attemptAssignmentOfDlStereoToFragment(elementToApplyTo.getFrag(), dOrL)){ // D/L adjacent to group that now has an appropriate stereocentre e.g. glycine return; } Element parentSubBracketOrRoot = stereoChemistryEl.getParent(); //generally the LAST group in this list will be the appropriate group //we use the same algorithm as for unlocanted substitution so as to deprecate assignment into brackets List possibleFragments = StructureBuildingMethods.findAlternativeFragments(parentSubBracketOrRoot); List adjacentGroupEls = OpsinTools.getDescendantElementsWithTagName(parentSubBracketOrRoot, GROUP_EL); for (int i = adjacentGroupEls.size()-1; i >=0; i--) { possibleFragments.add(adjacentGroupEls.get(i).getFrag()); } for (Fragment fragment : possibleFragments) { if (attemptAssignmentOfDlStereoToFragment(fragment, dOrL)) { return; } } throw new StereochemistryException("Could not find stereocentre to apply " + dOrL.toUpperCase(Locale.ROOT) + " stereochemistry to"); } private boolean attemptAssignmentOfDlStereoToFragment(Fragment fragment, String dOrL) throws StereochemistryException, StructureBuildingException { List atomList = fragment.getAtomList(); for (Atom potentialStereoAtom : 
                atomList) {
            if (notExplicitlyDefinedStereoCentreMap.containsKey(potentialStereoAtom) && potentialStereoAtom.getBondCount() == 4) {
                List<Atom> neighbours = potentialStereoAtom.getAtomNeighbours();
                Atom acidGroup = null;//A carbon connected to non-carbons e.g. COOH
                Atom amineOrAlcohol = null;//N or O e.g. NH2 (as this may be substituted don't check H count)
                Atom sideChain = null;//A carbon
                Atom hydrogen = null;//A hydrogen
                for (Atom atom : neighbours) {
                    ChemEl el = atom.getElement();
                    if (el == ChemEl.H) {
                        hydrogen = atom;
                    }
                    else if (el == ChemEl.C) {
                        int chalcogenNeighbours = 0;
                        for (Atom neighbour2 : atom.getAtomNeighbours()) {
                            if (atom == neighbour2) {
                                continue;
                            }
                            if (neighbour2.getElement().isChalcogen()) {
                                chalcogenNeighbours++;
                            }
                        }
                        if (chalcogenNeighbours > 0) {
                            acidGroup = atom;
                        }
                        else {
                            sideChain = atom;
                        }
                    }
                    else if (el == ChemEl.O || el ==ChemEl.N) {
                        amineOrAlcohol = atom;
                    }
                }
                if (acidGroup != null && amineOrAlcohol != null && sideChain != null && hydrogen != null) {
                    Atom[] atomRefs4 = new Atom[]{acidGroup, sideChain, amineOrAlcohol, hydrogen};
                    if (dOrL.equals("l") || dOrL.equals("ls")) {
                        potentialStereoAtom.setAtomParity(atomRefs4, -1);
                    }
                    else if (dOrL.equals("d") || dOrL.equals("ds")) {
                        potentialStereoAtom.setAtomParity(atomRefs4, 1);
                    }
                    else if (dOrL.equals("dl")) {
                        potentialStereoAtom.setAtomParity(atomRefs4, 1);
                        potentialStereoAtom.setStereoGroup(new StereoGroup(StereoGroupType.Rac,++state.numRacGrps));
                    }
                    else{
                        throw new RuntimeException("OPSIN bug: Unexpected value for D/L stereochemistry found: " + dOrL );
                    }
                    notExplicitlyDefinedStereoCentreMap.remove(potentialStereoAtom);
                    return true;
                }
            }
        }
        return false;
    }

    static int swapsRequiredToSort(Atom[] atomRefs4){
        Atom[] atomRefs4Copy = atomRefs4.clone();
        int swapsPerformed = 0;
        int i,j;

        //bubble sort by atom ID, counting the adjacent swaps performed
        for (i=atomRefs4Copy.length; --i >=0;) {
            boolean swapped = false;
            for (j=0; j<i; j++) {
                if (atomRefs4Copy[j].getID() > atomRefs4Copy[j+1].getID()){
                    Atom temp = atomRefs4Copy[j+1];
                    atomRefs4Copy[j+1] = atomRefs4Copy[j];
                    atomRefs4Copy[j] = temp;
                    swapsPerformed++;
                    swapped =
true; } } if (!swapped){ return swapsPerformed; } } return swapsPerformed; } static boolean checkEquivalencyOfAtomsRefs4AndParity(Atom[] atomRefs1, int atomParity1, Atom[] atomRefs2, int atomParity2){ int swaps1 = swapsRequiredToSort(atomRefs1); int swaps2 = swapsRequiredToSort(atomRefs2); if (atomParity1 < 0 && atomParity2 > 0 || atomParity1 > 0 && atomParity2 < 0){ swaps1++; } return swaps1 %2 == swaps2 %2; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StringTools.java000066400000000000000000000304221451751637500270070ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Arrays; import java.util.List; /**Static routines for string manipulation. * * @author ptc24 * @author dl387 * */ class StringTools { /** * Converts a list of strings into a single string delimited by the given separator * * @param list A list of strings. * @param separator * @return The corresponding string. */ static String stringListToString(List list, String separator) { StringBuilder sb = new StringBuilder(); int lastIndexOfList = list.size() - 1; for (int i = 0; i < lastIndexOfList; i++) { sb.append(list.get(i)); sb.append(separator); } if (lastIndexOfList >= 0){ sb.append(list.get(lastIndexOfList)); } return sb.toString(); } /**Produce repetitions of a string. Eg. HelloWorld * 2 = HelloWorldHelloWorld. * * @param s The string to multiply. * @param n The number of times to multiply it. * @return The multiplied string. */ static String multiplyString(String s, int n) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < n; i++) { sb.append(s); } return sb.toString(); } /**Joins an array of strings into a single string. * * @param stringArray The strings to join together. * @param separator The separator to use. * @return The resulting string. 
*/ static String arrayToString(String[] stringArray, String separator) { StringBuilder sb = new StringBuilder(); int lastIndexOfArray = stringArray.length - 1; for(int i = 0; i < lastIndexOfArray; i++) { sb.append(stringArray[i]); sb.append(separator); } if (lastIndexOfArray >= 0){ sb.append(stringArray[lastIndexOfArray]); } return sb.toString(); } /**Converts a unicode string into ASCII * e.g. converting Greek letters to their names (e.g. alpha) * Unrecognised non-ASCII characters trigger an exception * * @param s The string to convert * @return The converted string * @throws PreProcessingException */ static String convertNonAsciiAndNormaliseRepresentation(String s) throws PreProcessingException { StringBuilder sb = new StringBuilder(s.length()); for (int i = 0, l = s.length(); i < l; i++) { char c = s.charAt(i); switch (c) { case '\t': case '\n': case '\u000B'://vertical tab case '\f': case '\r': //normalise white space sb.append(" "); break; case '`': sb.append("'");//replace back ticks with apostrophe break; case '"': sb.append("''");//replace quotation mark with two primes break; default: if(c >= 128) { sb.append(getReplacementForNonASCIIChar(c));//replace non ascii characters with hard coded ascii strings } else if (c > 31){//ignore control characters sb.append(c); } } } return sb.toString(); } private static String getReplacementForNonASCIIChar(char c) throws PreProcessingException { switch (c) { case '\u03b1': return "alpha";//greeks case '\u03b2': return "beta"; case '\u03b3': return "gamma"; case '\u03b4': return "delta"; case '\u03b5': return "epsilon"; case '\u03b6': return "zeta"; case '\u03b7': return "eta"; case '\u03b8': return "theta"; case '\u03b9': return "iota"; case '\u03ba': return "kappa"; case '\u03bb': return "lambda"; case '\u03bc': return "mu"; case '\u03bd': return "nu"; case '\u03be': return "xi"; case '\u03bf': return "omicron"; case '\u03c0': return "pi"; case '\u03c1': return "rho"; case '\u03c2': return "stigma"; case '\u03c3': 
return "sigma"; case '\u03c4': return "tau"; case '\u03c5': return "upsilon"; case '\u03c6': return "phi"; case '\u03c7': return "chi"; case '\u03c8': return "psi"; case '\u03c9': return "omega"; case '\u1D05': return "D";//small capitals case '\u029F': return "L"; case '\u00B1': return "+-";//plus minus symbol case '\u2213': return "-+"; case '\u2192'://right arrows case '\u2794': case '\u2799': case '\u279C': return "->"; case '\u00C6': return "AE";//common ligatures case '\u00E6': return "ae"; case '\u0152': return "OE"; case '\u0153': return "oe"; case '\u0132': return "IJ"; case '\u0133': return "ij"; case '\u1D6B': return "ue"; case '\uFB00': return "ff"; case '\uFB01': return "fi"; case '\uFB02': return "fl"; case '\uFB03': return "ffi"; case '\uFB04': return "ffl"; case '\uFB06': return "st"; case '\u00E0': return "a";//diacritics case '\u00C0': return "A"; case '\u00E1': return "a"; case '\u00C1': return "A"; case '\u00E2': return "a"; case '\u00C2': return "A"; case '\u00E3': return "a"; case '\u00C3': return "A"; case '\u00E4': return "a"; case '\u00C4': return "A"; case '\u00E5': return "a"; case '\u00C5': return "A"; case '\u00E7': return "c"; case '\u00C7': return "C"; case '\u00E8': return "e"; case '\u00C8': return "E"; case '\u00E9': return "e"; case '\u00C9': return "E"; case '\u00EA': return "e"; case '\u00CA': return "E"; case '\u00EB': return "e"; case '\u00CB': return "E"; case '\u00EC': return "i"; case '\u00CC': return "I"; case '\u00ED': return "i"; case '\u00CD': return "I"; case '\u00EE': return "i"; case '\u00CE': return "I"; case '\u00EF': return "i"; case '\u00CF': return "I"; case '\u00F2': return "o"; case '\u00D2': return "O"; case '\u00F3': return "o"; case '\u00D3': return "O"; case '\u00F4': return "o"; case '\u00D4': return "O"; case '\u00F5': return "o"; case '\u00D5': return "O"; case '\u00F6': return "o"; case '\u00D6': return "O"; case '\u00F9': return "u"; case '\u00D9': return "U"; case '\u00FA': return "u"; case '\u00DA': 
return "U"; case '\u00FB': return "u"; case '\u00DB': return "U"; case '\u00FC': return "u"; case '\u00DC': return "U"; case '\u00FD': return "y"; case '\u00DD': return "Y"; case '\u0115': return "e"; case '\u0114': return "E"; case '\u0117': return "e"; case '\u0116': return "E"; case '\u2070': return "0";//superscripts case '\u00B9': return "1"; case '\u00B2': return "2"; case '\u00B3': return "3"; case '\u2074': return "4"; case '\u2075': return "5"; case '\u2076': return "6"; case '\u2077': return "7"; case '\u2078': return "8"; case '\u2079': return "9"; case '\u2080': return "0";//subscripts case '\u2081': return "1"; case '\u2082': return "2"; case '\u2083': return "3"; case '\u2084': return "4"; case '\u2085': return "5"; case '\u2086': return "6"; case '\u2087': return "7"; case '\u2088': return "8"; case '\u2089': return "9"; case '\u2018': return "'";//quotation marks and primes (map to apostrophe/s) case '\u2019': return "'"; case '\u201B': return "'"; case '\u02BC': return "'"; case '\u201C': return "''"; case '\u201D': return "''"; case '\u2032': return "'";//primes case '\u2033': return "''"; case '\u2034': return "'''"; case '\u2057': return "''''"; case '\u02B9': return "'";//modifier primes case '\u02BA': return "''"; case '\u2035': return "'";//back primes case '\u2036': return "''"; case '\u2037': return "'''"; case '\u00B4': return "'";//accents case '\u02CA': return "'"; case '\u0301': return "'"; case '\u02DD': return "''"; case '\u030B': return "''"; case '\u2010'://dashes, hyphens and the minus sign case '\u2011': case '\u2012': case '\u2013': case '\u2014': case '\u2015': case '\u2212': return "-"; case '\u02DC'://small tilde case '\u223C'://tilde operator case '\u301C': return "~";//wave dash case '\uff0c': return ",";//full width punctuation case '\uFF1A': return ":"; case '\uFF1B': return ";"; case '\uFF08': return "("; case '\uFF09': return ")"; case '\uFF3B': return "["; case '\uFF3D': return "]"; case '\u3010': return "["; case 
'\u3011': return "]";
        case '\uFF5B': return "{";
        case '\uFF5D': return "}";

        case '\u00DF': return "beta";//similar glyph

        case '\u2000'://different sized spaces
        case '\u2001':
        case '\u2002':
        case '\u2003':
        case '\u2004':
        case '\u2005':
        case '\u2006':
        case '\u2008':
        case '\u2009':
        case '\u200A':
        case '\u205F':
        case '\u00A0'://Non-breaking spaces
        case '\u2007':
        case '\u202F':
        case '\u3000': return " ";//ideographic space

        case '\u00AD'://soft hyphen
        case '\u200b'://zero width space
        case '\u200d'://zero width joiner
        case '\uFEFF': return "";//BOM-found at the start of some UTF files

        default: throw new PreProcessingException("Unrecognised unicode character: " + c);
        }
    }

    /**Converts a string array to an ArrayList.
     *
     * @param array The array.
     * @return The ArrayList.
     */
    static List<String> arrayToList(String [] array) {
        List<String> list = new ArrayList<>();
        list.addAll(Arrays.asList(array));
        return list;
    }

    /**
     * If a dash is the last character it is removed
     * @param locantText
     * @return
     */
    static String removeDashIfPresent(String locantText){
        if(locantText.endsWith("-")) {
            locantText = locantText.substring(0, locantText.length() - 1);
        }
        return locantText;
    }

    /**
     * Counts the number of primes at the end of a locant
     * @param locantText
     * @return
     */
    static int countTerminalPrimes(String locantText){
        int numberOfPrimes = 0;
        for(int i = locantText.length() -1; i > 0; i--){
            if (locantText.charAt(i) == '\''){
                numberOfPrimes++;
            }
            else{
                break;
            }
        }
        return numberOfPrimes;
    }

    /**
     * Tests if this string starts with the specified prefix ignoring case.
     * @param str
     * @param prefix
     * @return
     */
    static boolean startsWithCaseInsensitive(String str, String prefix) {
        return str.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    /**
     * Tests if this string, from index i onwards, starts with the specified prefix ignoring case.
     */
    static boolean startsWithCaseInsensitive(String str, int i, String prefix) {
        return str.regionMatches(true, i, prefix, 0, prefix.length());
    }

    /**
     * Tests if this string ends with the specified suffix ignoring case.
* @param str * @param suffix * @return */ static boolean endsWithCaseInsensitive(String str, String suffix) { if (suffix.length() > str.length()) { return false; } int strOffset = str.length() - suffix.length(); return str.regionMatches(true, strOffset, suffix, 0, suffix.length()); } /** * Lower cases a string (only converts A-Z to a-z) * @param str */ static String lowerCaseAsciiString(String str) { StringBuilder sb = new StringBuilder(str.length()); for (int i = 0, l = str.length(); i < l; i++) { char c = str.charAt(i); if (c >= 'A' && c <= 'Z') { c = (char) (c + 32); } sb.append(c); } return sb.toString(); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StructureBuilder.java000066400000000000000000003403151451751637500300340ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.HashMap; import java.util.List; import java.util.ListIterator; import java.util.Map; import java.util.Set; import java.util.regex.Matcher; import java.util.regex.Pattern; import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoBond; import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoCentre; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*; import static uk.ac.cam.ch.wwmm.opsin.StructureBuildingMethods.*; /**Constructs a single OPSIN fragment which describes the molecule from the ComponentGenerator/ComponentProcessor results. * * @author ptc24 * @author dl387 * */ class StructureBuilder { private static final Pattern matchBoroHydrogenIsotope = Pattern.compile("boro(deuter|trit)ide?"); private final BuildState state; private final List polymerAttachmentPoints = new ArrayList<>();//rGroups need to be represented as normal atoms for the purpose of working out stereochemistry. 
They will be converted to a suitable representation later private int currentTopLevelWordRuleCount; StructureBuilder(BuildState state) { this.state = state; } /** Builds a molecule as a Fragment based on ComponentProcessor output. * @param molecule The ComponentProcessor output. * @return A single Fragment - the built molecule. * @throws StructureBuildingException If the molecule won't build - there may be many reasons. */ Fragment buildFragment(Element molecule) throws StructureBuildingException { List wordRules = molecule.getChildElements(WORDRULE_EL); currentTopLevelWordRuleCount = wordRules.size(); if (currentTopLevelWordRuleCount == 0) { throw new StructureBuildingException("Molecule contains no word rules!?"); } for (Element wordRule : wordRules) { processWordRuleChildrenThenRule(wordRule); } if (currentTopLevelWordRuleCount != wordRules.size()) { wordRules = molecule.getChildElements(WORDRULE_EL);//very rarely a word rule adds a top level word rule } List groupElements = OpsinTools.getDescendantElementsWithTagName(molecule, GROUP_EL); processSpecialCases(groupElements); processOxidationNumbers(groupElements); state.fragManager.convertSpareValenciesToDoubleBonds(); state.fragManager.checkValencies(); manipulateStoichiometry(molecule, wordRules); state.fragManager.makeHydrogensExplicit(); Fragment uniFrag = state.fragManager.getUnifiedFragment(); processStereochemistry(molecule, uniFrag); if (uniFrag.getOutAtomCount() > 0) { if (!state.n2sConfig.isAllowRadicals()) { throw new StructureBuildingException("Radicals are currently set to not convert to structures"); } if (state.n2sConfig.isOutputRadicalsAsWildCardAtoms()) { convertOutAtomsToAttachmentAtoms(uniFrag); } } if (polymerAttachmentPoints.size() > 0) { for (Atom rAtom : polymerAttachmentPoints) { rAtom.setElement(ChemEl.R); } uniFrag.setPolymerAttachmentPoints(polymerAttachmentPoints); } return uniFrag; } private void processWordRuleChildrenThenRule(Element wordRule) throws StructureBuildingException { 
List wordRuleChildren = wordRule.getChildElements(WORDRULE_EL); for (Element wordRuleChild : wordRuleChildren) { processWordRuleChildrenThenRule(wordRuleChild); } processWordRule(wordRule); } private void processWordRule(Element wordRuleEl) throws StructureBuildingException { WordRule wordRule = WordRule.valueOf(wordRuleEl.getAttributeValue(WORDRULE_ATR)); List words = OpsinTools.getChildElementsWithTagNames(wordRuleEl, new String[]{WORD_EL, WORDRULE_EL}); state.currentWordRule = wordRule; switch (wordRule) { case simple: for (Element word : words) { if (!word.getName().equals(WORD_EL) || !word.getAttributeValue(TYPE_ATR).equals(WordType.full.toString())){ throw new StructureBuildingException("OPSIN bug: Unexpected contents of 'simple' wordRule"); } resolveWordOrBracket(state, word); } break; case substituent: for (Element word : words) { if (!word.getName().equals(WORD_EL) || !word.getAttributeValue(TYPE_ATR).equals(WordType.substituent.toString()) || !state.n2sConfig.isAllowRadicals()){ throw new StructureBuildingException("OPSIN bug: Unexpected contents of 'substituent' wordRule"); } resolveWordOrBracket(state, word); } break; case ester: case multiEster: buildEster(words);//e.g. ethyl ethanoate, dimethyl terephthalate, methyl propanamide break; case divalentFunctionalGroup: buildDiValentFunctionalGroup(words);// diethyl ether or methyl propyl ketone break; case monovalentFunctionalGroup: buildMonovalentFunctionalGroup(words);// ethyl chloride, isophthaloyl dichloride, diethyl ether, ethyl alcohol break; case functionalClassEster: buildFunctionalClassEster(words);//e.g. ethanoic acid ethyl ester, tetrathioterephthalic acid dimethyl ester break; case acidReplacingFunctionalGroup: //e.g. ethanoic acid ethyl amide, terephthalic acid dimethyl amide, //ethanoic acid amide, carbonic dihydrazide //already processed by the ComponentProcessor for (Element word : words) { resolveWordOrBracket(state, word); } break; case oxide: buildOxide(words);//e.g. 
styrene oxide, triphenylphosphane oxide, thianthrene 5,5-dioxide, propan-2-one oxide break; case carbonylDerivative: buildCarbonylDerivative(words);//e.g. Imidazole-2-carboxamide O-ethyloxime, pentan-3-one oxime break; case anhydride: buildAnhydride(words);//e.g. acetic anhydride break; case acidHalideOrPseudoHalide: buildAcidHalideOrPseudoHalide(words);//e.g. phosphinimidic chloride break; case additionCompound: buildAdditionCompound(words);//e.g. carbon tetrachloride break; case glycol: buildGlycol(words);//e.g. ethylene glycol break; case glycolEther: buildGlycolEther(words);//e.g. octaethyleneglycol monododecyl ether break; case acetal: buildAcetal(words);//e.g. propanal diethyl acetal break; case potentialAlcoholEster: //e.g. uridine 5'-(tetrahydrogen triphosphate) if (!buildAlcoholEster(words, currentTopLevelWordRuleCount)){ //should be processed as two "simple" wordrules if no hydroxy found, hence number of top level word rules may change //These simple word rules have already been processed splitAlcoholEsterRuleIntoTwoSimpleWordRules(words); currentTopLevelWordRuleCount++; } break; case cyclicPeptide: buildCyclicPeptide(words); break; case amineDiConjunctiveSuffix: //e.g. 
glycine N,N-diacetic acid buildAmineDiConjunctiveSuffix(words); break; case polymer: buildPolymer(words); break; default: throw new StructureBuildingException("Unexpected Word Rule"); } } private void buildEster(List words) throws StructureBuildingException { boolean inSubstituents = true; BuildResults substituentsBr = new BuildResults(); List ateGroups = new ArrayList<>(); Map buildResultsToLocant = new HashMap<>();//typically locant will be null for (Element word : words) { resolveWordOrBracket(state, word); BuildResults br = new BuildResults(word); if (inSubstituents && br.getFunctionalAtomCount() > 0){ inSubstituents = false; } if (inSubstituents){ if (!word.getAttributeValue(TYPE_ATR).equals(WordType.substituent.toString())){ if (word.getAttributeValue(TYPE_ATR).equals(WordType.full.toString())){ throw new StructureBuildingException("bug? ate group did not have any functional atoms!"); } else{ throw new StructureBuildingException("OPSIN bug: Non substituent word found where substituent expected in ester"); } } int outAtomCount = br.getOutAtomCount(); boolean traditionalEster =false; for (int i = 0; i < outAtomCount; i++) { OutAtom out = br.getOutAtom(i); if (out.getValency()>1){ FragmentTools.splitOutAtomIntoValency1OutAtoms(out); traditionalEster =true; } } if (traditionalEster){//e.g. ethylidene dipropanoate br = new BuildResults(word); outAtomCount = br.getOutAtomCount(); } if (outAtomCount ==1){//TODO add support for locanted terepthaloyl String locantForSubstituent = word.getAttributeValue(LOCANT_ATR); if (locantForSubstituent!=null){ br.getFirstOutAtom().setLocant(locantForSubstituent);//indexes which functional atom to connect to when there is a choice. 
Also can disambiguate which atom is a S in things like thioates } } else if (outAtomCount ==0){ throw new StructureBuildingException("Substituent was expected to have at least one outAtom"); } substituentsBr.mergeBuildResults(br); } else{ String locant = word.getAttributeValue(LOCANT_ATR);//specifying a locant for an ateWord is very unusual as this information is typically redundant c.f. dodecamethylene 1,12-bis(chloroformate) if (br.getFunctionalAtomCount()<1){ throw new StructureBuildingException("bug? ate group did not have any functional atoms!"); } ateGroups.add(br); buildResultsToLocant.put(br, locant); } } if (ateGroups.isEmpty()){ throw new StructureBuildingException("OPSIN bug: Missing ate group in ester"); } int outAtomCount =substituentsBr.getOutAtomCount(); if (outAtomCount ==0){ throw new StructureBuildingException("OPSIN bug: Missing outatom on ester substituents"); } int esterIdCount = 0; for (BuildResults br : ateGroups) { esterIdCount += br.getFunctionalAtomCount(); } if (outAtomCount > esterIdCount){ throw new StructureBuildingException("There are more radicals in the substituents(" + outAtomCount +") than there are places to form esters("+esterIdCount+")"); } if (esterIdCount > outAtomCount && outAtomCount % ateGroups.size() !=0) { //actually checks if the same number of ester forming points would be used in each ate group e.g. 
ethyl diacetate is wrong throw new StructureBuildingException("There are less radicals in the substituents(" + outAtomCount +") than there are places to form esters("+esterIdCount+")"); } for(int i=0; i< outAtomCount; i++) { BuildResults ateBr = ateGroups.get(i % ateGroups.size()); Atom ateAtom; if (substituentsBr.getFirstOutAtom().getLocant()!=null){ ateAtom =determineFunctionalAtomToUse(substituentsBr.getFirstOutAtom().getLocant(), ateBr); } else{ ateAtom =ateBr.getFunctionalAtom(0); ateBr.removeFunctionalAtom(0); } String locant = buildResultsToLocant.get(ateBr); if (locant ==null){//typical case Atom atomOnSubstituentToUse = getOutAtomTakingIntoAccountWhetherSetExplicitly(substituentsBr, 0); state.fragManager.createBond(ateAtom, atomOnSubstituentToUse, 1); substituentsBr.removeOutAtom(0); } else{ Integer outAtomPosition =null; for (int j = 0; j < substituentsBr.getOutAtomCount(); j++) { if (substituentsBr.getOutAtom(j).getAtom().hasLocant(locant)){ outAtomPosition = j; break; } } if (outAtomPosition ==null){ throw new StructureBuildingException("Unable to find substituent with locant: " + locant + " to form ester!"); } Atom atomOnSubstituentToUse = substituentsBr.getOutAtom(outAtomPosition).getAtom(); state.fragManager.createBond(ateAtom, atomOnSubstituentToUse, 1); substituentsBr.removeOutAtom(outAtomPosition); } ateAtom.neutraliseCharge(); } } private void buildDiValentFunctionalGroup(List words) throws StructureBuildingException { int wordIndice = 0; if (!words.get(wordIndice).getAttributeValue(TYPE_ATR).equals(WordType.substituent.toString())) { throw new StructureBuildingException("word: " +wordIndice +" was expected to be a substituent"); } resolveWordOrBracket(state, words.get(wordIndice)); BuildResults substituent1 =new BuildResults(words.get(wordIndice)); if (substituent1.getOutAtom(0).getValency() !=1){ throw new StructureBuildingException("OutAtom has unexpected valency. Expected 1. 
Actual: " + substituent1.getOutAtom(0).getValency()); } BuildResults substituent2; if (substituent1.getOutAtomCount()==2){// e.g. tetramethylene sulfone if (substituent1.getOutAtom(1).getValency() !=1){ throw new StructureBuildingException("OutAtom has unexpected valency. Expected 1. Actual: " + substituent1.getOutAtom(1).getValency()); } substituent2 = substituent1; } else{ if (substituent1.getOutAtomCount()!=1){ throw new StructureBuildingException("Expected one outAtom. Found " + substituent1.getOutAtomCount() ); } wordIndice++; if (words.get(wordIndice).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())) {//e.g. methyl sulfoxide rather than dimethyl sulfoxide Element clone = state.fragManager.cloneElement(state, words.get(0)); OpsinTools.insertAfter(words.get(0), clone); words = words.get(0).getParent().getChildElements(); } else{ resolveWordOrBracket(state, words.get(wordIndice)); } substituent2 =new BuildResults(words.get(wordIndice)); if (substituent2.getOutAtomCount()!=1){ throw new StructureBuildingException("Expected one outAtom. Found " + substituent2.getOutAtomCount() ); } if (substituent2.getOutAtom(0).getValency() !=1){ throw new StructureBuildingException("OutAtom has unexpected valency. Expected 1. 
Actual: " + substituent2.getOutAtom(0).getValency()); } } wordIndice++; if (words.get(wordIndice) ==null || !words.get(wordIndice).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())) { throw new StructureBuildingException(words.get(wordIndice).getValue()+" was expected to be a functionalTerm"); } List functionalGroup = OpsinTools.getDescendantElementsWithTagName(words.get(wordIndice), FUNCTIONALGROUP_EL); if (functionalGroup.size()!=1){ throw new StructureBuildingException("Unexpected number of functionalGroups found, could be a bug in OPSIN's grammar"); } String smilesOfGroup = functionalGroup.get(0).getAttributeValue(VALUE_ATR); Fragment diValentGroup =state.fragManager.buildSMILES(smilesOfGroup, FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); Atom outAtom1 = getOutAtomTakingIntoAccountWhetherSetExplicitly(substituent1, 0); substituent1.removeOutAtom(0); Atom outAtom2 = getOutAtomTakingIntoAccountWhetherSetExplicitly(substituent2, 0); substituent2.removeOutAtom(0); if (diValentGroup.getOutAtomCount()==1){//c.f. peroxide where it is a linker state.fragManager.createBond(outAtom1, diValentGroup.getOutAtom(0).getAtom(), 1); diValentGroup.removeOutAtom(0); state.fragManager.createBond(outAtom2, diValentGroup.getFirstAtom(), 1); } else{ if (outAtom1 != outAtom2){//general case state.fragManager.createBond(outAtom1, diValentGroup.getFirstAtom(), 1); state.fragManager.createBond(outAtom2, diValentGroup.getFirstAtom(), 1); } else{//e.g. carbonyl sulfide state.fragManager.createBond(outAtom1, diValentGroup.getFirstAtom(), 2); } } state.fragManager.incorporateFragment(diValentGroup, outAtom1.getFrag()); } private void buildMonovalentFunctionalGroup(List words) throws StructureBuildingException { resolveWordOrBracket(state, words.get(0)); List groups = OpsinTools.getDescendantElementsWithTagName(words.get(0), GROUP_EL); for (Element group : groups) {//replaces outAtoms with valency greater than 1 with multiple outAtoms; e.g. 
ylidene -->diyl Fragment frag = group.getFrag(); for (int i = frag.getOutAtomCount()-1; i>=0; i--) { OutAtom outAtom =frag.getOutAtom(i); if (outAtom.getValency()>1){ FragmentTools.splitOutAtomIntoValency1OutAtoms(outAtom); } } } BuildResults substituentBR = new BuildResults(words.get(0)); List<Fragment> functionalGroupFragments = new ArrayList<>(); for (int i=1; i<words.size(); i++) { Element functionalGroupWord =words.get(i); List<Element> functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalGroupWord, FUNCTIONALGROUP_EL); if (functionalGroups.size()!=1){ throw new StructureBuildingException("Expected exactly 1 functionalGroup. Found " + functionalGroups.size()); } Fragment monoValentFunctionGroup =state.fragManager.buildSMILES(functionalGroups.get(0).getAttributeValue(VALUE_ATR), FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); if (functionalGroups.get(0).getAttributeValue(TYPE_ATR).equals(MONOVALENTSTANDALONEGROUP_TYPE_VAL)){ Atom ideAtom = monoValentFunctionGroup.getDefaultInAtomOrFirstAtom(); ideAtom.addChargeAndProtons(1, 1);//e.g. make cyanide charge neutral } Element possibleMultiplier = OpsinTools.getPreviousSibling(functionalGroups.get(0)); functionalGroupFragments.add(monoValentFunctionGroup); if (possibleMultiplier!=null){ int multiplierValue = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); for (int j = 1; j < multiplierValue; j++) { functionalGroupFragments.add(state.fragManager.copyFragment(monoValentFunctionGroup)); } possibleMultiplier.detach(); } } int outAtomCount =substituentBR.getOutAtomCount(); if (outAtomCount > functionalGroupFragments.size()){//something like isophthaloyl chloride (more precisely written isophthaloyl dichloride) if (functionalGroupFragments.size()!=1){ throw new StructureBuildingException("Incorrect number of functional groups found to balance outAtoms"); } Fragment monoValentFunctionGroup = functionalGroupFragments.get(0); for (int j = 1; j < outAtomCount; j++) { functionalGroupFragments.add(state.fragManager.copyFragment(monoValentFunctionGroup)); } } else if 
(functionalGroupFragments.size() > outAtomCount){ throw new StructureBuildingException("There are more function groups to attach than there are positions to attach them to!"); } for (int i = 0; i < outAtomCount; i++) { Fragment ideFrag =functionalGroupFragments.get(i); Atom ideAtom = ideFrag.getDefaultInAtomOrFirstAtom(); Atom subAtom = getOutAtomTakingIntoAccountWhetherSetExplicitly(substituentBR, 0); state.fragManager.createBond(ideAtom, subAtom, 1); substituentBR.removeOutAtom(0); state.fragManager.incorporateFragment(ideFrag, subAtom.getFrag()); } } private void buildFunctionalClassEster(List words) throws StructureBuildingException { Element firstWord = words.get(0); if (!firstWord.getAttributeValue(TYPE_ATR).equals(WordType.full.toString())) { throw new StructureBuildingException("Don't alter wordRules.xml without checking the consequences!"); } resolveWordOrBracket(state, firstWord);//the group BuildResults acidBr = new BuildResults(firstWord); if (acidBr.getFunctionalAtomCount()==0) { throw new StructureBuildingException("No functionalAtoms detected!"); } int wordCountMinus1 = words.size() - 1; if (wordCountMinus1 < 2 || !words.get(wordCountMinus1).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())) { throw new StructureBuildingException("OPSIN Bug: Bug in functionalClassEster rule; 'ester' not found where it was expected"); } for (int i = 1; i < wordCountMinus1; i++) { Element currentWord = words.get(i); String wordType = currentWord.getAttributeValue(TYPE_ATR); if (!wordType.equals(WordType.substituent.toString())) { if (wordType.equals(WordType.functionalTerm.toString()) && currentWord.getAttributeValue(VALUE_ATR).equalsIgnoreCase("ester")) { //superfluous ester word continue; } throw new StructureBuildingException("OPSIN Bug: Bug in functionalClassEster rule; Encountered: " + currentWord.getAttributeValue(VALUE_ATR)); } resolveWordOrBracket(state, currentWord); BuildResults substituentBr = new BuildResults(currentWord); int 
outAtomCount = substituentBr.getOutAtomCount(); if (acidBr.getFunctionalAtomCount() < outAtomCount) { throw new StructureBuildingException("Insufficient functionalAtoms on acid"); } for (int j = 0; j < outAtomCount; j++) { String locantForSubstituent = currentWord.getAttributeValue(LOCANT_ATR); Atom functionalAtom; if (locantForSubstituent != null) { functionalAtom = determineFunctionalAtomToUse(locantForSubstituent, acidBr); } else{ functionalAtom = acidBr.getFunctionalAtom(0); acidBr.removeFunctionalAtom(0); } if (substituentBr.getOutAtom(j).getValency() != 1) { throw new StructureBuildingException("Substituent was expected to have only an outgoing valency of 1"); } state.fragManager.createBond(functionalAtom, getOutAtomTakingIntoAccountWhetherSetExplicitly(substituentBr, j), 1); if (functionalAtom.getCharge() == -1) { functionalAtom.neutraliseCharge(); } } substituentBr.removeAllOutAtoms(); } } /** * Handles names like thiophene 1,1-dioxide; carbon dioxide; benzene oxide * Does the same for sulfide/selenide/telluride * @param words * @throws StructureBuildingException */ private void buildOxide(List<Element> words) throws StructureBuildingException { resolveWordOrBracket(state, words.get(0));//the group List<Fragment> oxideFragments = new ArrayList<>(); List<String> locantsForOxide =new ArrayList<>();//often not specified if (!words.get(1).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ throw new StructureBuildingException("Oxide functional term not found where expected!"); } Element rightMostGroup; if (words.get(0).getName().equals(WORDRULE_EL)){//e.g. 
Nicotinic acid N-oxide List fullWords = OpsinTools.getDescendantElementsWithTagNameAndAttribute(words.get(0), WORD_EL, TYPE_ATR, WordType.full.toString()); if (fullWords.isEmpty()){ throw new StructureBuildingException("OPSIN is entirely unsure where the oxide goes so has decided not to guess"); } rightMostGroup = findRightMostGroupInBracket(fullWords.get(fullWords.size()-1)); } else{ rightMostGroup = findRightMostGroupInBracket(words.get(0)); } int numberOfOxygenToAdd =1; List multipliers =OpsinTools.getDescendantElementsWithTagName(words.get(1), MULTIPLIER_EL); if (multipliers.size() >1){ throw new StructureBuildingException("Expected 0 or 1 multiplier found: " + multipliers.size()); } if (multipliers.size()==1){ numberOfOxygenToAdd = Integer.parseInt(multipliers.get(0).getAttributeValue(VALUE_ATR)); multipliers.get(0).detach(); } else{ if (ELEMENTARYATOM_TYPE_VAL.equals(rightMostGroup.getAttributeValue(TYPE_ATR))){ Atom elementaryAtom = rightMostGroup.getFrag().getFirstAtom(); int charge = elementaryAtom.getCharge(); if (charge >0 && charge %2 ==0){ numberOfOxygenToAdd = charge/2; } else if (elementaryAtom.getProperty(Atom.OXIDATION_NUMBER)!=null){ int valency = elementaryAtom.getProperty(Atom.OXIDATION_NUMBER) - elementaryAtom.getIncomingValency(); if (valency >0 && valency %2 ==0){ numberOfOxygenToAdd = valency/2; } } } } List functionalGroup =OpsinTools.getDescendantElementsWithTagName(words.get(1), FUNCTIONALGROUP_EL); if (functionalGroup.size()!=1){ throw new StructureBuildingException("Expected 1 group element found: " + functionalGroup.size()); } String smilesReplacement = functionalGroup.get(0).getAttributeValue(VALUE_ATR); String labels = functionalGroup.get(0).getAttributeValue(LABELS_ATR); for (int i = 0; i < numberOfOxygenToAdd; i++) { oxideFragments.add(state.fragManager.buildSMILES(smilesReplacement, FUNCTIONALCLASS_TYPE_VAL, labels != null ? 
labels : NONE_LABELS_VAL)); } List locantEls =OpsinTools.getDescendantElementsWithTagName(words.get(1), LOCANT_EL); if (locantEls.size() >1){ throw new StructureBuildingException("Expected 0 or 1 locant elements found: " + locantEls.size()); } if (locantEls.size()==1){ String[] locants = StringTools.removeDashIfPresent(locantEls.get(0).getValue()).split(","); locantsForOxide.addAll(Arrays.asList(locants)); locantEls.get(0).detach(); } if (!locantsForOxide.isEmpty() && locantsForOxide.size()!=oxideFragments.size()){ throw new StructureBuildingException("Mismatch between number of locants and number of oxides specified"); } Fragment groupToModify = rightMostGroup.getFrag();//all the suffixes are part of this fragment at this point mainLoop: for (int i = 0; i < oxideFragments.size(); i++) { Atom oxideAtom = oxideFragments.get(i).getFirstAtom(); if (!locantsForOxide.isEmpty()){ Atom atomToAddOxideTo =groupToModify.getAtomByLocantOrThrow(locantsForOxide.get(i)); if (atomToAddOxideTo.getElement() == ChemEl.C && !ELEMENTARYATOM_TYPE_VAL.equals(groupToModify.getType())) { throw new StructureBuildingException("Locant " + locantsForOxide.get(i) + " indicated oxide applied to carbon, but this would lead to hypervalency!"); } formAppropriateBondToOxideAndAdjustCharges(atomToAddOxideTo, oxideAtom); } else{ if (ELEMENTARYATOM_TYPE_VAL.equals(groupToModify.getType())){ Atom elementaryAtom= groupToModify.getFirstAtom(); formAppropriateBondToOxideAndAdjustCharges(elementaryAtom, oxideAtom);//e.g. carbon dioxide int chargeOnAtom =elementaryAtom.getCharge(); if (chargeOnAtom>=2){ elementaryAtom.setCharge(chargeOnAtom-2); } continue mainLoop; } else{ List atomList = groupToModify.getAtomList(); //In preference suffixes are substituted onto e.g. 
acetonitrile oxide for (Atom atom : atomList) { if (!atom.getType().equals(SUFFIX_TYPE_VAL)) { continue; } if (atom.getElement() != ChemEl.C && atom.getElement() != ChemEl.O) { formAppropriateBondToOxideAndAdjustCharges(atom, oxideAtom); continue mainLoop; } } for (Atom atom : atomList) { if (atom.getElement() != ChemEl.C && atom.getElement() != ChemEl.O) { formAppropriateBondToOxideAndAdjustCharges(atom, oxideAtom); continue mainLoop; } } } //No heteroatoms could be found. Perhaps it's supposed to be something like styrene oxide Set bondSet = groupToModify.getBondSet();//looking for double bond for (Bond bond : bondSet) { if (bond.getOrder()==2 && bond.getFromAtom().getElement() == ChemEl.C && bond.getToAtom().getElement() == ChemEl.C){ bond.setOrder(1); state.fragManager.createBond(bond.getFromAtom(), oxideAtom, 1); state.fragManager.createBond(bond.getToAtom(), oxideAtom, 1); continue mainLoop; } } //...or maybe something a bit iffy nomenclature wise like benzene oxide :-S for (Bond bond : bondSet) { Atom fromAtom =bond.getFromAtom(); Atom toAtom = bond.getToAtom(); if (fromAtom.hasSpareValency() && toAtom.hasSpareValency() &&fromAtom.getElement() == ChemEl.C && toAtom.getElement() == ChemEl.C){ fromAtom.setSpareValency(false); toAtom.setSpareValency(false); state.fragManager.createBond(fromAtom, oxideAtom, 1); state.fragManager.createBond(toAtom, oxideAtom, 1); continue mainLoop; } } //something like where oxide goes on an oxygen propan-2-one oxide List atomList = groupToModify.getAtomList(); for (Atom atom : atomList) { if (!atom.getType().equals(SUFFIX_TYPE_VAL)) { continue; } if (atom.getElement() != ChemEl.C) { formAppropriateBondToOxideAndAdjustCharges(atom, oxideAtom); continue mainLoop; } } for (Atom atom : atomList) { if (atom.getElement() != ChemEl.C) { formAppropriateBondToOxideAndAdjustCharges(atom, oxideAtom); continue mainLoop; } } throw new StructureBuildingException("Unable to find suitable atom or a double bond to add oxide to"); } } for 
(Fragment oxide : oxideFragments) { state.fragManager.incorporateFragment(oxide, groupToModify); } } /** * Decides whether an oxide should double bond e.g. P=O or single bond as a zwitterionic form e.g. [N+]-[O-] * Corrects the charges if necessary and forms the bond * @param atomToAddOxideTo * @param oxideAtom * @throws StructureBuildingException */ private void formAppropriateBondToOxideAndAdjustCharges(Atom atomToAddOxideTo, Atom oxideAtom) throws StructureBuildingException { Integer maxVal = ValencyChecker.getMaximumValency(atomToAddOxideTo.getElement(), atomToAddOxideTo.getCharge()); if (maxVal ==null || (atomToAddOxideTo.getIncomingValency() + atomToAddOxideTo.getOutValency() +2) <= maxVal){ if (atomToAddOxideTo.getLambdaConventionValency()==null || !ValencyChecker.checkValencyAvailableForBond(atomToAddOxideTo, 2)){//probably in well formed names 2 protons should always be added but some names use the lambdaConvention to specify the valency after oxide has been applied atomToAddOxideTo.addChargeAndProtons(0, 2);//this is an additive operation, up the proton count by 2 } state.fragManager.createBond(atomToAddOxideTo, oxideAtom, 2); } else{ if (atomToAddOxideTo.getCharge()!=0 || oxideAtom.getCharge()!=0){ throw new StructureBuildingException("Oxide appeared to refer to an atom that has insufficient valency to accept the addition of oxygen"); } atomToAddOxideTo.addChargeAndProtons(1, 1); oxideAtom.addChargeAndProtons(-1, -1); maxVal = ValencyChecker.getMaximumValency(atomToAddOxideTo.getElement(), atomToAddOxideTo.getCharge()); if (maxVal !=null && (atomToAddOxideTo.getIncomingValency() + atomToAddOxideTo.getOutValency() +1) > maxVal){ throw new StructureBuildingException("Oxide appeared to refer to an atom that has insufficient valency to accept the addition of oxygen"); } state.fragManager.createBond(atomToAddOxideTo, oxideAtom, 1); } } private void buildCarbonylDerivative(List<Element> words) throws StructureBuildingException { if 
(!WordType.full.toString().equals(words.get(0).getAttributeValue(TYPE_ATR))){ throw new StructureBuildingException("OPSIN bug: Wrong word type encountered when applying carbonylDerivative wordRule"); } List replacementFragments = new ArrayList<>(); List locantForFunctionalTerm =new ArrayList<>();//usually not specified if (!words.get(1).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){//e.g. acetone O-ethyloxime or acetone 1-chloro-1-methylhydrazone for (int i = 1; i < words.size(); i++) { Element word = words.get(i); Fragment frag = findRightMostGroupInWordOrWordRule(word).getFrag(); replacementFragments.add(frag); int childCount = word.getChildCount(); if (childCount == 1 && word.getChild(0).getName().equals(BRACKET_EL) && word.getChild(0).getAttribute(LOCANT_ATR)!=null){ locantForFunctionalTerm.add(word.getChild(0).getAttributeValue(LOCANT_ATR)); } else if (childCount == 2 && word.getChild(0).getAttribute(LOCANT_ATR) != null ){ Element firstChild = word.getChild(0); String locant = firstChild.getAttributeValue(LOCANT_ATR); if (word.getChild(1).getName().equals(ROOT_EL) && !frag.hasLocant(locant) && MATCH_NUMERIC_LOCANT.matcher(locant).matches()){ //e.g. 1,3-benzothiazole-2-carbaldehyde 2-phenylhydrazone locantForFunctionalTerm.add(firstChild.getAttributeValue(LOCANT_ATR)); firstChild.removeAttribute(firstChild.getAttribute(LOCANT_ATR)); } } } } else{//e.g. 
butan-2,3-dione dioxime or hexan-2,3-dione 2-oxime int numberOfCarbonylReplacements =1; List<Element> multipliers =OpsinTools.getDescendantElementsWithTagName(words.get(1), MULTIPLIER_EL); if (multipliers.size() >1){ throw new StructureBuildingException("Expected 0 or 1 multiplier found: " + multipliers.size()); } if (multipliers.size()==1){ numberOfCarbonylReplacements = Integer.parseInt(multipliers.get(0).getAttributeValue(VALUE_ATR)); multipliers.get(0).detach(); } List<Element> functionalGroup =OpsinTools.getDescendantElementsWithTagName(words.get(1), FUNCTIONALGROUP_EL); if (functionalGroup.size()!=1){ throw new StructureBuildingException("Expected 1 functionalGroup element found: " + functionalGroup.size()); } String smilesReplacement = functionalGroup.get(0).getAttributeValue(VALUE_ATR); String labels = functionalGroup.get(0).getAttributeValue(LABELS_ATR); for (int i = 0; i < numberOfCarbonylReplacements; i++) { Fragment replacementFragment = state.fragManager.buildSMILES(smilesReplacement, FUNCTIONALCLASS_TYPE_VAL, labels != null ? 
labels : NONE_LABELS_VAL); if (i >0){ FragmentTools.relabelLocants(replacementFragment.getAtomList(), StringTools.multiplyString("'", i)); } List atomList = replacementFragment.getAtomList(); for (Atom atom : atomList) { atom.removeLocantsOtherThanElementSymbolLocants();//prevents numeric locant locanted substitution from outside the functional word } replacementFragments.add(replacementFragment); } List locantEls =OpsinTools.getDescendantElementsWithTagName(words.get(1), LOCANT_EL); if (locantEls.size() >1){ throw new StructureBuildingException("Expected 0 or 1 locant elements found: " + locantEls.size()); } if (locantEls.size() == 1) { String[] locants = StringTools.removeDashIfPresent(locantEls.get(0).getValue()).split(","); locantForFunctionalTerm.addAll(Arrays.asList(locants)); locantEls.get(0).detach(); } } if (!locantForFunctionalTerm.isEmpty() && locantForFunctionalTerm.size()!=replacementFragments.size()){ throw new StructureBuildingException("Mismatch between number of locants and number of carbonyl replacements"); } Element rightMostGroup = findRightMostGroupInWordOrWordRule(words.get(0)); Element parent = rightMostGroup.getParent(); boolean multiplied =false; while (!parent.equals(words.get(0))){ if (parent.getAttribute(MULTIPLIER_ATR)!=null){ multiplied =true; } parent = parent.getParent(); } if (!multiplied){ List carbonylOxygens = findCarbonylOxygens(rightMostGroup.getFrag(), locantForFunctionalTerm); int replacementsToPerform = Math.min(replacementFragments.size(), carbonylOxygens.size()); replaceCarbonylOxygenWithReplacementFragments(words, replacementFragments, carbonylOxygens, replacementsToPerform); } resolveWordOrBracket(state, words.get(0));//the component if (replacementFragments.size() >0){ //Note that the right most group may be multiplied e.g. 3,3'-methylenebis(2,4,6-trimethylbenzaldehyde) disemicarbazone //or the carbonyl may not even be on the right most group e.g. 
4-oxocyclohexa-2,5-diene-1-carboxylic acid 4-oxime BuildResults br = new BuildResults(words.get(0)); List carbonylOxygens = new ArrayList<>(); List fragments = new ArrayList<>(br.getFragments()); for (ListIterator iterator = fragments.listIterator(fragments.size()); iterator.hasPrevious();) {//iterate in reverse order - right most groups preferred carbonylOxygens.addAll(findCarbonylOxygens(iterator.previous(), locantForFunctionalTerm)); } replaceCarbonylOxygenWithReplacementFragments(words, replacementFragments, carbonylOxygens, replacementFragments.size()); } } private void replaceCarbonylOxygenWithReplacementFragments(List words, List replacementFragments, List carbonylOxygens, int functionalReplacementsToPerform) throws StructureBuildingException { if (functionalReplacementsToPerform > carbonylOxygens.size()){ throw new StructureBuildingException("Insufficient carbonyl groups found!"); } for (int i = 0; i < functionalReplacementsToPerform; i++) { Atom carbonylOxygen =carbonylOxygens.remove(0);//the oxygen of the carbonyl Fragment carbonylFrag = carbonylOxygen.getFrag(); Fragment replacementFrag = replacementFragments.remove(0); List atomList = replacementFrag.getAtomList(); if (atomList.size() == 2){ //special case for oxime //adds a locant like O1 giving another way of referencing this atom Atom numericLocantAtomConnectedToCarbonyl = OpsinTools.depthFirstSearchForAtomWithNumericLocant(carbonylOxygen); if (numericLocantAtomConnectedToCarbonyl != null) { Atom lastatom = atomList.get(1); lastatom.addLocant(lastatom.getElement().toString() + numericLocantAtomConnectedToCarbonyl.getFirstLocant()); } } if (!words.get(1).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ resolveWordOrBracket(state, words.get(1 +i)); } for (Atom atom : atomList) { atom.removeLocantsOtherThanElementSymbolLocants();//prevents numeric locant locanted substitution from outside the functional word List locants =atom.getLocants(); for (int j = locants.size() -1; j >=0; 
j--) { String locant = locants.get(j); if (carbonylFrag.hasLocant(locant)){ atom.removeLocant(locant); } } } if (replacementFrag.getOutAtomCount() == 2) { //e.g. chloroxime Atom carbonylCarbon = carbonylOxygen.getFirstBond().getOtherAtom(carbonylOxygen); OutAtom outAtom = replacementFrag.getOutAtom(1); replacementFrag.removeOutAtom(outAtom); if (carbonylCarbon.getIncomingValency() >=4) { throw new StructureBuildingException("Insufficient substitutable hydrogen for haloxime"); } state.fragManager.createBond(carbonylCarbon, outAtom.getAtom(), outAtom.getValency()); } if (replacementFrag.getOutAtomCount() != 1) { throw new RuntimeException("OPSIN Bug: Carbonyl replacement fragment expected to have one outatom"); } Atom atomToReplaceCarbonylOxygen = replacementFrag.getOutAtom(0).getAtom(); replacementFrag.removeOutAtom(0); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(carbonylOxygen, atomToReplaceCarbonylOxygen); atomToReplaceCarbonylOxygen.setType(carbonylOxygen.getType());//copy the type e.g. 
if the carbonyl was a suffix this should appear as a suffix if (replacementFrag.getTokenEl().getParent() == null) {//incorporate only for the case that replacementFrag came from a functional class element state.fragManager.incorporateFragment(replacementFrag, carbonylFrag); } } } /** * Given a fragment and optionally a list of locants finds carbonyl oxygen atoms * If locants are given the carbonyl must be associated with one of the given locants * @param fragment * @param locantForCarbonylAtom * @return the carbonyl oxygen atoms found * @throws StructureBuildingException */ private List<Atom> findCarbonylOxygens(Fragment fragment, List<String> locantForCarbonylAtom) throws StructureBuildingException { List<Atom> matches = new ArrayList<>(); List<Atom> rootFragAtomList = fragment.getAtomList(); for (Atom atom : rootFragAtomList) {//find all carbonyl oxygen if (atom.getElement() == ChemEl.O && atom.getCharge()==0){ List<Atom> neighbours =atom.getAtomNeighbours(); if (neighbours.size()==1){ if (neighbours.get(0).getElement() == ChemEl.C){ if (!locantForCarbonylAtom.isEmpty()){ Atom numericLocantAtomConnectedToCarbonyl = OpsinTools.depthFirstSearchForAtomWithNumericLocant(atom); if (numericLocantAtomConnectedToCarbonyl!=null){//could be the carbon of the carbonyl or the ring the carbonyl connects to in say a carbaldehyde boolean matchesLocant = false; for (String locant : locantForCarbonylAtom) { if (numericLocantAtomConnectedToCarbonyl.hasLocant(locant)){ matchesLocant =true; } } if (!matchesLocant){ continue; } } else{ continue; } } Bond b = atom.getBondToAtomOrThrow(neighbours.get(0)); if (b.getOrder()==2){ matches.add(atom); } } } } } return matches; } private void buildAnhydride(List<Element> words) throws StructureBuildingException { if (words.size()!=2 && words.size()!=3){ throw new StructureBuildingException("Unexpected number of words in anhydride. 
Check wordRules.xml, this is probably a bug"); } Element anhydrideWord = words.get(words.size()-1); List functionalClass =OpsinTools.getDescendantElementsWithTagName(anhydrideWord, FUNCTIONALGROUP_EL); if (functionalClass.size()!=1){ throw new StructureBuildingException("Expected 1 group element found: " + functionalClass.size()); } String anhydrideSmiles = functionalClass.get(0).getAttributeValue(VALUE_ATR); int numberOfAnhydrideLinkages =1; List multipliers =OpsinTools.getDescendantElementsWithTagName(anhydrideWord, MULTIPLIER_EL); if (multipliers.size() >1){ throw new StructureBuildingException("Expected 0 or 1 multiplier found: " + multipliers.size()); } if (multipliers.size()==1){ numberOfAnhydrideLinkages = Integer.parseInt(multipliers.get(0).getAttributeValue(VALUE_ATR)); multipliers.get(0).detach(); } String anhydrideLocant = null; List anhydrideLocants =OpsinTools.getDescendantElementsWithTagNames(anhydrideWord, new String[]{LOCANT_EL, COLONORSEMICOLONDELIMITEDLOCANT_EL}); if (anhydrideLocants.size() >1){ throw new StructureBuildingException("Expected 0 or 1 anhydrideLocants found: " + anhydrideLocants.size()); } if (anhydrideLocants.size()==1){ anhydrideLocant = anhydrideLocants.get(0).getValue(); anhydrideLocants.get(0).detach(); } resolveWordOrBracket(state, words.get(0)); BuildResults br1 = new BuildResults(words.get(0)); if (br1.getFunctionalAtomCount() ==0){ throw new StructureBuildingException("Cannot find functionalAtom to form anhydride"); } if (words.size()==3){//asymmetric anhydride if (anhydrideLocant!=null){ throw new StructureBuildingException("Unsupported or invalid anhydride"); } resolveWordOrBracket(state, words.get(1)); BuildResults br2 = new BuildResults(words.get(1)); if (br2.getFunctionalAtomCount() ==0){ throw new StructureBuildingException("Cannot find functionalAtom to form anhydride"); } if (numberOfAnhydrideLinkages>1){ for (int i = numberOfAnhydrideLinkages-1; i >=0 ; i--) { if (br2.getFunctionalAtomCount()==0){ throw new 
StructureBuildingException("Cannot find functionalAtom to form anhydride"); } BuildResults newAcidBr; if (i!=0){ Element newAcid = state.fragManager.cloneElement(state, words.get(0)); OpsinTools.insertAfter(words.get(0), newAcid); newAcidBr = new BuildResults(newAcid); } else{ newAcidBr =br1; } formAnhydrideLink(anhydrideSmiles, newAcidBr, br2); } } else{ if (br1.getFunctionalAtomCount()!=1 && br2.getFunctionalAtomCount()!=1 ) { throw new StructureBuildingException("Invalid anhydride description"); } formAnhydrideLink(anhydrideSmiles, br1, br2); } } else{//symmetric anhydride if (br1.getFunctionalAtomCount()>1){//cyclic anhydride if (br1.getFunctionalAtomCount()==2){ if (numberOfAnhydrideLinkages!=1 || anhydrideLocant !=null ){ throw new StructureBuildingException("Unsupported or invalid anhydride"); } formAnhydrideLink(anhydrideSmiles, br1, br1); } else{//cyclic anhydride where group has more than 2 acids if (anhydrideLocant ==null){ throw new StructureBuildingException("Anhydride formation appears to be ambiguous; More than 2 acids, no locants"); } String[] acidLocants =MATCH_COLONORSEMICOLON.split(StringTools.removeDashIfPresent(anhydrideLocant)); if (acidLocants.length != numberOfAnhydrideLinkages){ throw new StructureBuildingException("Mismatch between number of locants and number of anhydride linkages to form"); } if (br1.getFunctionalAtomCount() < (numberOfAnhydrideLinkages *2)){ throw new StructureBuildingException("Mismatch between number of acid atoms and number of anhydride linkages to form"); } List functionalAtoms = new ArrayList<>(); for (int i = 0; i < br1.getFunctionalAtomCount(); i++) { functionalAtoms.add(br1.getFunctionalAtom(i)); } for (int i = 0; i < numberOfAnhydrideLinkages; i++) { String[] locants = acidLocants[i].split(","); Atom oxygen1 =null; for (int j = functionalAtoms.size() -1; j >=0; j--) { Atom functionalAtom = functionalAtoms.get(j); Atom numericLocantAtomConnectedToFunctionalAtom = 
OpsinTools.depthFirstSearchForAtomWithNumericLocant(functionalAtom); if (numericLocantAtomConnectedToFunctionalAtom.hasLocant(locants[0])){ oxygen1=functionalAtom; functionalAtoms.remove(j); break; } } Atom oxygen2 =null; for (int j = functionalAtoms.size() -1; j >=0; j--) { Atom functionalAtom = functionalAtoms.get(j); Atom numericLocantAtomConnectedToFunctionalAtom = OpsinTools.depthFirstSearchForAtomWithNumericLocant(functionalAtom); if (numericLocantAtomConnectedToFunctionalAtom.hasLocant(locants[1])){ oxygen2=functionalAtom; functionalAtoms.remove(j); break; } } if (oxygen1 ==null || oxygen2==null){ throw new StructureBuildingException("Unable to find locanted atom for anhydride formation"); } formAnhydrideLink(anhydrideSmiles, oxygen1, oxygen2); } } } else{ if (numberOfAnhydrideLinkages!=1 || anhydrideLocant !=null ){ throw new StructureBuildingException("Unsupported or invalid anhydride"); } Element newAcid = state.fragManager.cloneElement(state, words.get(0)); OpsinTools.insertAfter(words.get(0), newAcid); BuildResults br2 = new BuildResults(newAcid); formAnhydrideLink(anhydrideSmiles, br1, br2); } } } /** * Given buildResults for both the acids and the SMILES of the anhydride forms the anhydride bond using the first functionalAtom on each BuildResults * @param anhydrideSmiles * @param acidBr1 * @param acidBr2 * @throws StructureBuildingException */ private void formAnhydrideLink(String anhydrideSmiles, BuildResults acidBr1, BuildResults acidBr2)throws StructureBuildingException { Atom oxygen1 = acidBr1.getFunctionalAtom(0); acidBr1.removeFunctionalAtom(0); Atom oxygen2 = acidBr2.getFunctionalAtom(0); acidBr2.removeFunctionalAtom(0); formAnhydrideLink(anhydrideSmiles, oxygen1, oxygen2); } /** * Given two atoms and the SMILES of the anhydride forms the anhydride bond * @param anhydrideSmiles * @param oxygen1 * @param oxygen2 * @throws StructureBuildingException */ private void formAnhydrideLink(String anhydrideSmiles, Atom oxygen1, Atom oxygen2)throws 
StructureBuildingException { if (oxygen1.getElement() != ChemEl.O || oxygen2.getElement() != ChemEl.O || oxygen1.getBondCount()!=1 ||oxygen2.getBondCount()!=1) { throw new StructureBuildingException("Problem building anhydride"); } Atom atomOnSecondAcidToConnectTo = oxygen2.getAtomNeighbours().get(0); state.fragManager.removeAtomAndAssociatedBonds(oxygen2); Fragment anhydride = state.fragManager.buildSMILES(anhydrideSmiles, FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); Fragment acidFragment1 = oxygen1.getFrag(); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(oxygen1, anhydride.getFirstAtom()); List atomsInAnhydrideLinkage = anhydride.getAtomList(); state.fragManager.createBond(atomsInAnhydrideLinkage.get(atomsInAnhydrideLinkage.size()-1), atomOnSecondAcidToConnectTo, 1); state.fragManager.incorporateFragment(anhydride, acidFragment1); } private void buildAcidHalideOrPseudoHalide(List words) throws StructureBuildingException { if (!words.get(0).getAttributeValue(TYPE_ATR).equals(WordType.full.toString())){ throw new StructureBuildingException("Don't alter wordRules.xml without checking the consequences!"); } resolveWordOrBracket(state, words.get(0)); BuildResults acidBr = new BuildResults(words.get(0)); int functionalAtomCount =acidBr.getFunctionalAtomCount(); if (functionalAtomCount==0){ throw new StructureBuildingException("No functionalAtoms detected!"); } boolean monoMultiplierDetected =false; List functionalGroupFragments = new ArrayList<>(); for (int i = 1; i < words.size(); i++ ) { Element functionalGroupWord =words.get(i); List functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalGroupWord, FUNCTIONALGROUP_EL); if (functionalGroups.size()!=1){ throw new StructureBuildingException("Expected exactly 1 functionalGroup. 
Found " + functionalGroups.size()); } Fragment monoValentFunctionGroup =state.fragManager.buildSMILES(functionalGroups.get(0).getAttributeValue(VALUE_ATR), FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); if (functionalGroups.get(0).getAttributeValue(TYPE_ATR).equals(MONOVALENTSTANDALONEGROUP_TYPE_VAL)){ Atom ideAtom = monoValentFunctionGroup.getDefaultInAtomOrFirstAtom(); ideAtom.addChargeAndProtons(1, 1);//e.g. make cyanide charge neutral } Element possibleMultiplier = OpsinTools.getPreviousSibling(functionalGroups.get(0)); functionalGroupFragments.add(monoValentFunctionGroup); if (possibleMultiplier != null){ int multiplierValue = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); if (multiplierValue == 1) { monoMultiplierDetected = true; } for (int j = 1; j < multiplierValue; j++) { functionalGroupFragments.add(state.fragManager.copyFragment(monoValentFunctionGroup)); } possibleMultiplier.detach(); } } int halideCount = functionalGroupFragments.size(); if (halideCount < functionalAtomCount && halideCount == 1 && !monoMultiplierDetected) { //e.g. 
phosphoric chloride, chloride is implicitly multiplied Fragment ideFrag = functionalGroupFragments.get(0); for (int i = halideCount; i < functionalAtomCount; i++) { functionalGroupFragments.add(state.fragManager.copyFragment(ideFrag)); } halideCount = functionalAtomCount; } else if (halideCount > functionalAtomCount || (!monoMultiplierDetected && halideCount < functionalAtomCount)) { throw new StructureBuildingException("Mismatch between number of halide/pseudo halide fragments and acidic oxygens"); } for (int i = halideCount - 1; i >=0; i--) { Fragment ideFrag =functionalGroupFragments.get(i); Atom ideAtom = ideFrag.getDefaultInAtomOrFirstAtom(); Atom acidAtom = acidBr.getFunctionalAtom(i); if (acidAtom.getElement() != ChemEl.O){ throw new StructureBuildingException("Atom type expected to be oxygen but was: " +acidAtom.getElement()); } acidBr.removeFunctionalAtom(i); Fragment acidFragment =acidAtom.getFrag(); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(acidAtom, ideAtom); state.fragManager.incorporateFragment(ideFrag, acidFragment); } } private void buildAdditionCompound(List<Element> words) throws StructureBuildingException { Element firstWord = words.get(0); if (!firstWord.getAttributeValue(TYPE_ATR).equals(WordType.full.toString())) { throw new StructureBuildingException("Don't alter wordRules.xml without checking the consequences!"); } resolveWordOrBracket(state, firstWord); Element elementaryAtomEl = StructureBuildingMethods.findRightMostGroupInBracket(firstWord); Fragment elementaryAtomFrag = elementaryAtomEl.getFrag(); Atom elementaryAtom = elementaryAtomFrag.getFirstAtom(); int charge = elementaryAtom.getCharge(); List<Fragment> functionalGroupFragments = new ArrayList<>(); for (int i = 1; i < words.size(); i++ ) { Element functionalGroupWord = words.get(i); List<Element> functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalGroupWord, FUNCTIONALGROUP_EL); if (functionalGroups.size() != 1){ throw new StructureBuildingException("Expected exactly 1 functionalGroup. 
Found " + functionalGroups.size()); } Element functionGroup = functionalGroups.get(0); Fragment monoValentFunctionGroup = state.fragManager.buildSMILES(functionGroup.getAttributeValue(VALUE_ATR), FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); if (functionGroup.getAttributeValue(TYPE_ATR).equals(MONOVALENTSTANDALONEGROUP_TYPE_VAL)){ Atom ideAtom = monoValentFunctionGroup.getDefaultInAtomOrFirstAtom(); ideAtom.addChargeAndProtons(1, 1);//e.g. make cyanide and the like charge neutral } Element possibleMultiplier = OpsinTools.getPreviousSibling(functionGroup); functionalGroupFragments.add(monoValentFunctionGroup); if (possibleMultiplier != null) { int multiplierValue = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); for (int j = 1; j < multiplierValue; j++) { functionalGroupFragments.add(state.fragManager.copyFragment(monoValentFunctionGroup)); } possibleMultiplier.detach(); } else if (words.size() == 2) {//silicon chloride -->silicon tetrachloride int incomingBondOrder = elementaryAtom.getIncomingValency(); int expectedValency; if (charge > 0) { expectedValency = incomingBondOrder + charge; } else{ if (elementaryAtom.getProperty(Atom.OXIDATION_NUMBER) != null) { expectedValency = elementaryAtom.getProperty(Atom.OXIDATION_NUMBER); } else{ if (elementaryAtomEl.getAttribute(COMMONOXIDATIONSTATESANDMAX_ATR) != null) { String[] typicalOxidationStates = elementaryAtomEl.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR).split(":")[0].split(","); expectedValency = Integer.parseInt(typicalOxidationStates[0]); } else{ expectedValency = ValencyChecker.getPossibleValencies(elementaryAtom.getElement(), charge)[0]; } } } int implicitMultiplier = expectedValency - incomingBondOrder > 1 ? 
expectedValency - incomingBondOrder : 1; for (int j = 1; j < implicitMultiplier; j++) { functionalGroupFragments.add(state.fragManager.copyFragment(monoValentFunctionGroup)); } } } if (charge > 0) { elementaryAtom.setCharge(charge - functionalGroupFragments.size()); } //[AlH3] --> [AlH4-] , [AlH4] --> [AlH4-] applyAluminiumHydrideSpecialCase(firstWord, elementaryAtom, functionalGroupFragments); int halideCount = functionalGroupFragments.size(); Integer maximumVal = ValencyChecker.getMaximumValency(elementaryAtom.getElement(), elementaryAtom.getCharge()); if (maximumVal != null && halideCount > maximumVal) { throw new StructureBuildingException("Too many halides/pseudo halides added to " +elementaryAtom.getElement()); } for (int i = halideCount - 1; i >= 0; i--) { Fragment ideFrag = functionalGroupFragments.get(i); Atom ideAtom = ideFrag.getDefaultInAtomOrFirstAtom(); state.fragManager.incorporateFragment(ideFrag, ideAtom, elementaryAtomFrag, elementaryAtom, 1); } } private void applyAluminiumHydrideSpecialCase(Element firstWord, Atom elementaryAtom, List<Fragment> functionalGroupFragments) throws StructureBuildingException { if ((elementaryAtom.getElement() == ChemEl.Al || elementaryAtom.getElement() == ChemEl.B) && elementaryAtom.getCharge() == 0) { if (functionalGroupFragments.size() == 3) { if (functionalGroupFragments.get(0).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H && functionalGroupFragments.get(1).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H && functionalGroupFragments.get(2).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H) { Element counterCationWordRule = OpsinTools.getPreviousSibling(firstWord.getParent()); if (counterCationWordRule != null && counterCationWordRule.getChildCount() == 1) { Element word =counterCationWordRule.getFirstChildElement(WORD_EL); if (word != null && word.getChildCount() ==1) { Element root = word.getFirstChildElement(ROOT_EL); if (root != null && root.getChildCount() ==1) { Element group = 
root.getFirstChildElement(GROUP_EL); if (group != null && ELEMENTARYATOM_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR))) { ChemEl chemEl = group.getFrag().getFirstAtom().getElement(); if (chemEl == ChemEl.Li || chemEl == ChemEl.Na || chemEl == ChemEl.K || chemEl == ChemEl.Rb || chemEl == ChemEl.Cs) { functionalGroupFragments.add(state.fragManager.copyFragment(functionalGroupFragments.get(0))); elementaryAtom.setCharge(-1); } } } } } } } else if (functionalGroupFragments.size() == 4) { if (functionalGroupFragments.get(0).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H && functionalGroupFragments.get(1).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H && functionalGroupFragments.get(2).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H && functionalGroupFragments.get(3).getDefaultInAtomOrFirstAtom().getElement() == ChemEl.H) { elementaryAtom.setCharge(-1); } } } } private void buildGlycol(List words) throws StructureBuildingException { int wordIndice = 0; resolveWordOrBracket(state, words.get(wordIndice));//the group Element finalGroup = findRightMostGroupInWordOrWordRule(words.get(wordIndice)); Fragment theDiRadical = finalGroup.getFrag(); if (theDiRadical.getOutAtomCount()!=2){ throw new StructureBuildingException("Glycol class names (e.g. ethylene glycol) expect two outAtoms. Found: " + theDiRadical.getOutAtomCount() ); } wordIndice++; if (wordIndice >= words.size() || !words.get(wordIndice).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ throw new StructureBuildingException("Glycol functionalTerm word expected"); } List functionalClassEls = OpsinTools.getDescendantElementsWithTagName(words.get(wordIndice), FUNCTIONALCLASS_EL); if (functionalClassEls.size()!=1){ throw new StructureBuildingException("Glycol functional class not found where expected"); } OutAtom outAtom1 = theDiRadical.getOutAtom(0); Atom chosenAtom1 = outAtom1.isSetExplicitly() ? 
outAtom1.getAtom() : findAtomForUnlocantedRadical(state, theDiRadical, outAtom1); Fragment functionalFrag =state.fragManager.buildSMILES(functionalClassEls.get(0).getAttributeValue(VALUE_ATR), FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); if (outAtom1.getValency() != 1){ throw new StructureBuildingException("OutAtom has unexpected valency. Expected 1. Actual: " + outAtom1.getValency()); } state.fragManager.createBond(chosenAtom1, functionalFrag.getFirstAtom(), 1); state.fragManager.incorporateFragment(functionalFrag, theDiRadical); OutAtom outAtom2 = theDiRadical.getOutAtom(1); Atom chosenAtom2 = outAtom2.isSetExplicitly() ? outAtom2.getAtom() : findAtomForUnlocantedRadical(state, theDiRadical, outAtom2); Fragment hydroxy =state.fragManager.buildSMILES("O", FUNCTIONALCLASS_TYPE_VAL, NONE_LABELS_VAL); if (outAtom2.getValency() != 1){ throw new StructureBuildingException("OutAtom has unexpected valency. Expected 1. Actual: " + outAtom2.getValency()); } state.fragManager.createBond(chosenAtom2, hydroxy.getFirstAtom(), 1); state.fragManager.incorporateFragment(hydroxy, theDiRadical); theDiRadical.removeOutAtom(1); theDiRadical.removeOutAtom(0); } /** * Handles glycol ether nomenclature e.g. 
* triethylene glycol n-butyl ether * tripropylene glycol methyl ether * dipropylene glycol methyl ether acetate * @param words * @throws StructureBuildingException */ private void buildGlycolEther(List words) throws StructureBuildingException { List wordsToAttachToGlycol = new ArrayList<>(); Element glycol =words.get(0); resolveWordOrBracket(state, glycol);//if this actually is something like ethylene glycol this is a no-op as it will already have been resolved if (!glycol.getAttributeValue(TYPE_ATR).equals(WordType.full.toString())){ throw new StructureBuildingException("OPSIN Bug: Cannot find glycol word!"); } for (int i = 1; i < words.size(); i++) { Element wordOrWordRule =words.get(i); //ether ignored if (!wordOrWordRule.getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ resolveWordOrBracket(state, wordOrWordRule);//the substituent to attach wordsToAttachToGlycol.add(wordOrWordRule); } else if (!wordOrWordRule.getAttributeValue(VALUE_ATR).equalsIgnoreCase("ether")){ throw new StructureBuildingException("Unexpected word encountered when applying glycol ether word rule " + wordOrWordRule.getAttributeValue(VALUE_ATR)); } } int numOfEthers = wordsToAttachToGlycol.size(); if (numOfEthers == 0) { throw new StructureBuildingException("OPSIN Bug: Unexpected number of substituents for glycol ether"); } Element finalGroup = findRightMostGroupInWordOrWordRule(glycol); List hydroxyAtoms = FragmentTools.findHydroxyGroups(finalGroup.getFrag()); if (hydroxyAtoms.isEmpty()) { throw new StructureBuildingException("No hydroxy groups found in: " + finalGroup.getValue() + " to form ether"); } if (hydroxyAtoms.size() < numOfEthers) { throw new StructureBuildingException("Insufficient hydroxy groups found in: " + finalGroup.getValue() + " to form required number of ethers"); } for (int i = 0; i < numOfEthers; i++) { BuildResults br = new BuildResults(wordsToAttachToGlycol.get(i)); if (br.getOutAtomCount() >0){//form ether 
state.fragManager.createBond(hydroxyAtoms.get(i), br.getOutAtom(0).getAtom(), 1); br.removeOutAtom(0); } else if (br.getFunctionalAtomCount() >0){//form ester Atom ateAtom = br.getFunctionalAtom(0); ateAtom.neutraliseCharge(); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(hydroxyAtoms.get(i), br.getFunctionalAtom(0)); br.removeFunctionalAtom(0); } else{ throw new StructureBuildingException("Word had neither an outAtom nor a functionalAtom, hence neither an ether nor an ester could be formed: " + wordsToAttachToGlycol.get(i).getAttributeValue(VALUE_ATR)); } } } /** * Builds acetals/ketals/hemiacetals/hemiketals and chalcogen analogues * The distinction between acetals and ketals is not enforced (ketals are a subset of acetals) * @param words * @throws StructureBuildingException */ private void buildAcetal(List<Element> words) throws StructureBuildingException { for (int i = 0; i < words.size()-1; i++) { resolveWordOrBracket(state, words.get(i)); } BuildResults substituentsBr = new BuildResults(); for (int i = 1; i < words.size()-1; i++) { Element currentWord = words.get(i); BuildResults substituentBr = new BuildResults(currentWord); int outAtomCount = substituentBr.getOutAtomCount(); if (outAtomCount ==1){ String locantForSubstituent = currentWord.getAttributeValue(LOCANT_ATR); if (locantForSubstituent!=null){ substituentBr.getFirstOutAtom().setLocant(locantForSubstituent); } } else if (outAtomCount ==0){ throw new StructureBuildingException("Substituent was expected to have at least one outAtom"); } substituentsBr.mergeBuildResults(substituentBr); } Element rightMostGroup = findRightMostGroupInWordOrWordRule(words.get(0)); Fragment rootFragment = rightMostGroup.getFrag();//the group which will be modified List<Atom> carbonylOxygen= findCarbonylOxygens(rootFragment, new ArrayList<>()); Element functionalWord = words.get(words.size()-1); List<Element> functionalClasses = OpsinTools.getDescendantElementsWithTagName(functionalWord, FUNCTIONALCLASS_EL); if 
(functionalClasses.size()!=1){ throw new StructureBuildingException("OPSIN bug: unable to find acetal functionalClass"); } Element functionalClassEl = functionalClasses.get(0); String functionalClass = functionalClassEl.getValue(); Element beforeAcetal = OpsinTools.getPreviousSibling(functionalClassEl); int numberOfAcetals =1; String[] elements = functionalClassEl.getAttributeValue(VALUE_ATR).split(","); if (beforeAcetal != null){ if (beforeAcetal.getName().equals(MULTIPLIER_EL)){ numberOfAcetals = Integer.parseInt(beforeAcetal.getAttributeValue(VALUE_ATR)); } else{ replaceChalcogensInAcetal(functionalClassEl, elements); } } if (carbonylOxygen.size() < numberOfAcetals){ throw new StructureBuildingException("Insufficient carbonyls to form " + numberOfAcetals +" " + functionalClass ); } boolean hemiacetal = functionalClass.contains("hemi"); List acetalFrags = new ArrayList<>(); for (int i = 0; i < numberOfAcetals; i++) { acetalFrags.add(formAcetal(carbonylOxygen, elements)); } int bondsToForm = hemiacetal ? 
numberOfAcetals : 2*numberOfAcetals; if (substituentsBr.getOutAtomCount()!=bondsToForm){ throw new StructureBuildingException("Incorrect number of substituents when forming " + functionalClass); } connectSubstituentsToAcetal(acetalFrags, substituentsBr, hemiacetal); } private void replaceChalcogensInAcetal(Element functionalClassEl, String[] elements) throws StructureBuildingException { Element currentEl = functionalClassEl.getParent().getChild(0); int multiplier = 1; if (currentEl.getName().equals(MULTIPLIER_EL)){ multiplier = Integer.parseInt(currentEl.getAttributeValue(VALUE_ATR)); if (multiplier > 2){ throw new StructureBuildingException(functionalClassEl.getValue() + " only has two oxygens!"); } currentEl = OpsinTools.getNextSibling(currentEl); } int i = 0; while(currentEl != functionalClassEl) { if (currentEl.getName().equals(GROUP_EL)) { for (int j = 0; j < multiplier; j++) { if (i == 2) { throw new StructureBuildingException(functionalClassEl.getValue() + " only has two oxygens!"); } if (!elements[i].equals("O")){ throw new StructureBuildingException("Replacement on " + functionalClassEl.getValue() + " can only be used to replace oxygen!"); } elements[i++] = currentEl.getAttributeValue(VALUE_ATR); } } else { throw new StructureBuildingException("Unexpected element before acetal"); } currentEl = OpsinTools.getNextSibling(currentEl); } } private Fragment formAcetal(List<Atom> carbonylOxygen, String[] elements) throws StructureBuildingException { Atom neighbouringCarbon = carbonylOxygen.get(0).getAtomNeighbours().get(0); state.fragManager.removeAtomAndAssociatedBonds(carbonylOxygen.get(0)); carbonylOxygen.remove(0); Fragment acetalFrag = state.fragManager.buildSMILES(StringTools.arrayToString(elements, "."),"",NONE_LABELS_VAL); FragmentTools.assignElementLocants(acetalFrag, new ArrayList<>()); List<Atom> acetalAtomList = acetalFrag.getAtomList(); Atom atom1 = acetalAtomList.get(0); state.fragManager.createBond(neighbouringCarbon, atom1, 1); Atom atom2 = 
acetalAtomList.get(1); state.fragManager.createBond(neighbouringCarbon, atom2, 1); state.fragManager.incorporateFragment(acetalFrag, neighbouringCarbon.getFrag()); return acetalFrag; } private boolean buildAlcoholEster(List words, int numberOfWordRules) throws StructureBuildingException { for (Element word : words) { if (!WordType.full.toString().equals(word.getAttributeValue(TYPE_ATR))){ throw new StructureBuildingException("Bug in word rule for potentialAlcoholEster"); } resolveWordOrBracket(state, word); } int ateWords = words.size() -1; if (ateWords < 1){ throw new StructureBuildingException("Bug in word rule for potentialAlcoholEster"); } Fragment potentialAlcoholFragment = findRightMostGroupInWordOrWordRule(words.get(0)).getFrag(); List hydroxyAtoms = FragmentTools.findHydroxyGroups(potentialAlcoholFragment); List chosenHydroxyAtoms = new ArrayList<>(); List ateBuildResults = new ArrayList<>(); for (int i = 1; i < words.size(); i++) { Element ateWord = words.get(i); BuildResults wordBr = new BuildResults(ateWord); if (isAppropriateAteGroupForAlcoholEster(ateWord, wordBr)) { String locant = ateWord.getAttributeValue(LOCANT_ATR); if (locant != null) { Atom atomOnAlcoholFragment = potentialAlcoholFragment.getAtomByLocantOrThrow(locant); if (!hydroxyAtoms.contains(atomOnAlcoholFragment) || chosenHydroxyAtoms.contains(atomOnAlcoholFragment)) { atomOnAlcoholFragment = potentialAlcoholFragment.getAtomByLocantOrThrow("O" + locant); } if (!hydroxyAtoms.contains(atomOnAlcoholFragment) || chosenHydroxyAtoms.contains(atomOnAlcoholFragment)) { throw new StructureBuildingException(locant + " did not point to a hydroxy group to be used for ester formation"); } chosenHydroxyAtoms.add(atomOnAlcoholFragment); } else if (words.size() == 2) { //special case for adenosine triphosphate and the like //also true for pyridoxine derivatives //guess that locant might be 5' Atom atomOnAlcoholFragment = potentialAlcoholFragment.getAtomByLocant("O5'"); if 
(hydroxyAtoms.contains(atomOnAlcoholFragment)) { chosenHydroxyAtoms.add(atomOnAlcoholFragment); } } ateBuildResults.add(wordBr); } else { return false; } } if (chosenHydroxyAtoms.size() < ateWords) { if (!chosenHydroxyAtoms.isEmpty()) { throw new RuntimeException("OPSIN Bug: Either all or none of the esters should be locanted in alcohol ester rule"); } if (hydroxyAtoms.size() == ateWords || hydroxyAtoms.size() > ateWords && (AmbiguityChecker.allAtomsEquivalent(hydroxyAtoms) || potentialAlcoholFragment.getTokenEl().getValue().equals("glycerol") )) { for (int i = 0; i < ateWords; i++) { chosenHydroxyAtoms.add(hydroxyAtoms.get(i)); } } else { return false; } } for (int i = 0; i < ateWords; i++) { BuildResults br = ateBuildResults.get(i); Element ateWord = words.get(i + 1); Element ateGroup = findRightMostGroupInWordOrWordRule(ateWord); if (ateGroup.getAttribute(NUMBEROFFUNCTIONALATOMSTOREMOVE_ATR) == null && numberOfWordRules == 1) { //by convention [O-] are implicitly converted to [OH] when phosphates/sulfates are attached //If word rules is > 1 this will be done or not done as part of charge balancing for (int j = br.getFunctionalAtomCount() -1; j >= 1; j--) { Atom atomToDefunctionalise = br.getFunctionalAtom(j); br.removeFunctionalAtom(j); atomToDefunctionalise.neutraliseCharge(); } } Atom functionalAtom = br.getFunctionalAtom(0); br.removeFunctionalAtom(0); functionalAtom.neutraliseCharge(); state.fragManager.replaceAtomWithAnotherAtomPreservingConnectivity(functionalAtom, chosenHydroxyAtoms.get(i)); } return true; } private void buildAmineDiConjunctiveSuffix(List words) throws StructureBuildingException { for (Element word : words) { if (!WordType.full.toString().equals(word.getAttributeValue(TYPE_ATR))){ throw new StructureBuildingException("Bug in word rule for amineDiConjunctiveSuffix"); } resolveWordOrBracket(state, word); } if (words.size() != 3) { throw new StructureBuildingException("Unexpected number of words encountered when processing name of type 
amineDiConjunctiveSuffix, expected 3 but found: " + words.size()); } Element aminoAcid = findRightMostGroupInWordOrWordRule(words.get(0)); if (aminoAcid == null) { throw new RuntimeException("OPSIN Bug: failed to find amino acid"); } Atom amineAtom = aminoAcid.getFrag().getDefaultInAtom(); if (amineAtom == null) { throw new StructureBuildingException("OPSIN did not know where the amino acid amine was located"); } for (int i = 1; i < words.size(); i++) { Element word = words.get(i); Fragment suffixLikeGroup = findRightMostGroupInWordOrWordRule(word).getFrag(); String locant = word.getAttributeValue(LOCANT_ATR); if (locant != null){ if (!locant.equals("N")) { throw new RuntimeException("OPSIN Bug: locant expected to be N but was: " + locant); } } Atom atomToConnectToOnConjunctiveFrag = FragmentTools.lastNonSuffixCarbonWithSufficientValency(suffixLikeGroup); if (atomToConnectToOnConjunctiveFrag == null) { throw new StructureBuildingException("OPSIN Bug: Unable to find non suffix carbon with sufficient valency"); } state.fragManager.createBond(atomToConnectToOnConjunctiveFrag, amineAtom, 1); } } private static final Pattern matchCommonCarboxylicSalt = Pattern.compile("tri-?fluoro-?acetate?$", Pattern.CASE_INSENSITIVE); private static final Pattern matchCommonEsterFormingInorganicSalt = Pattern.compile("(ortho-?)?(bor|phosphor|phosphate?|phosphite?)|carbam|carbon|sulfur|sulfate?|sulfite?|diphosphate?|triphosphate?", Pattern.CASE_INSENSITIVE); /** * CAS endorses the use of ...ol ...ate names to mean esters * but only for cases involving "common acids": * Acetic acid; Benzenesulfonic acid; Benzenesulfonic acid, 4-methyl-; Benzoic acid and its monoamino, mononitro, and dinitro derivatives; * Boric acid (H3BO3); Carbamic acid; Carbamic acid, N-methyl-; Carbamic acid, N-phenyl-; Carbonic acid; Formic acid; Methanesulfonic acid; * Nitric acid; Phosphoric acid; Phosphorodithioic acid; Phosphorothioic acid; Phosphorous acid; Propanoic acid; Sulfuric acid; and Sulfurous acid. 
* ...unless the alcohol component is also common. * * As in practice a lot of use won't be from CAS names we use the following heuristic: * Is locanted OR * Has 1 functional atom (And not common salt e.g. Trifluoroacetate) OR * common phosphorus/sulfur ate including di/tri phosphate * @param ateWord * @param wordBr * @return * @throws StructureBuildingException */ private boolean isAppropriateAteGroupForAlcoholEster(Element ateWord, BuildResults wordBr) throws StructureBuildingException { if (wordBr.getFunctionalAtomCount() > 0) { if (ateWord.getAttributeValue(LOCANT_ATR) != null) { //locanted, so locant must be used for this purpose return true; } if (wordBr.getFunctionalAtomCount() == 1) { if (matchCommonCarboxylicSalt.matcher(ateWord.getAttributeValue(VALUE_ATR)).find()) { return false; } return true; } String ateGroupText = findRightMostGroupInWordOrWordRule(ateWord).getValue(); //e.g. triphosphate if (matchCommonEsterFormingInorganicSalt.matcher(ateGroupText).matches()) { return true; } } return false; } private void splitAlcoholEsterRuleIntoTwoSimpleWordRules(List words) { Element firstGroup = words.get(0); Element wordRule = firstGroup.getParent(); wordRule.getAttribute(WORDRULE_ATR).setValue(WordRule.simple.toString()); wordRule.getAttribute(VALUE_ATR).setValue(firstGroup.getAttributeValue(VALUE_ATR)); Element newWordRule = new GroupingEl(WORDRULE_EL); newWordRule.addAttribute(TYPE_ATR, WordType.full.toString()); newWordRule.addAttribute(WORDRULE_ATR, WordRule.simple.toString()); newWordRule.addAttribute(VALUE_ATR, words.get(1).getAttributeValue(VALUE_ATR)); OpsinTools.insertAfter(wordRule, newWordRule); for (int i = 1; i < words.size(); i++) { Element word = words.get(i); word.detach(); newWordRule.addChild(word); } } private void connectSubstituentsToAcetal(List acetalFrags, BuildResults subBr, boolean hemiacetal) throws StructureBuildingException { Map usageMap= new HashMap<>(); for (int i = subBr.getOutAtomCount() -1; i>=0; i--) { OutAtom out = 
subBr.getOutAtom(i); subBr.removeOutAtom(i); Atom atomToUse = null; if (out.getLocant()!=null){ boolean numericLocant = MATCH_NUMERIC_LOCANT.matcher(out.getLocant()).matches(); for (Fragment possibleAcetalFrag : acetalFrags) { if (numericLocant){ Atom a =OpsinTools.depthFirstSearchForNonSuffixAtomWithLocant(possibleAcetalFrag.getFirstAtom(), out.getLocant()); if (a!=null){ List atomList = possibleAcetalFrag.getAtomList(); if (atomList.get(0).getBondCount()==1){ atomToUse = atomList.get(0); break; } else if (atomList.get(1).getBondCount()==1){ atomToUse = atomList.get(1); break; } } } else if (possibleAcetalFrag.hasLocant(out.getLocant())){ atomToUse = possibleAcetalFrag.getAtomByLocantOrThrow(out.getLocant()); break; } } if (atomToUse==null){ throw new StructureBuildingException("Unable to find suitable acetalFrag"); } } else{ List atomList = acetalFrags.get(0).getAtomList(); if (atomList.get(0).getBondCount()==1){ atomToUse = atomList.get(0); } else if (atomList.get(1).getBondCount()==1){ atomToUse = atomList.get(1); } else{ throw new StructureBuildingException("OPSIN bug: unable to find acetal atom"); } } Fragment acetalFrag = atomToUse.getFrag(); int usage = usageMap.get(acetalFrag) !=null ? 
usageMap.get(acetalFrag) : 0; state.fragManager.createBond(out.getAtom(), atomToUse, out.getValency()); usage++; if (usage >=2 || hemiacetal){ acetalFrags.remove(acetalFrag); } usageMap.put(acetalFrag, usage); } } private void buildCyclicPeptide(List words) throws StructureBuildingException { if (words.size() != 2){ throw new StructureBuildingException("OPSIN Bug: Expected 2 words in cyclic peptide name, found: " + words.size()); } Element peptide = words.get(1); resolveWordOrBracket(state, peptide); BuildResults peptideBr = new BuildResults(peptide); if (peptideBr.getOutAtomCount() ==1){ Atom outAtom = getOutAtomTakingIntoAccountWhetherSetExplicitly(peptideBr, 0); List aminoAcids = OpsinTools.getDescendantElementsWithTagNameAndAttribute(peptide, GROUP_EL, TYPE_ATR, AMINOACID_TYPE_VAL); if (aminoAcids.size() < 2){ throw new StructureBuildingException("Cyclic peptide building failed: Requires at least two amino acids!"); } Atom inAtom = aminoAcids.get(0).getFrag().getDefaultInAtomOrFirstAtom(); state.fragManager.createBond(outAtom, inAtom, peptideBr.getOutAtom(0).getValency()); peptideBr.removeAllOutAtoms(); } else{ throw new StructureBuildingException("Cyclic peptide building failed: Expected 1 outAtoms, found: " +peptideBr.getOutAtomCount()); } } private void buildPolymer(List words) throws StructureBuildingException { if (words.size()!=2){ throw new StructureBuildingException("Currently unsupported polymer name type"); } Element polymer = words.get(1); resolveWordOrBracket(state, polymer); BuildResults polymerBr = new BuildResults(polymer); if (polymerBr.getOutAtomCount() ==2){ Atom inAtom = getOutAtomTakingIntoAccountWhetherSetExplicitly(polymerBr, 0); Atom outAtom = getOutAtomTakingIntoAccountWhetherSetExplicitly(polymerBr, 1); /* * We assume the polymer repeats so as an approximation we create an R group with the same element as the group at the other end of polymer (with valency equal to the bondorder of the Rgroup so no H added) */ Atom rGroup1 
=state.fragManager.buildSMILES("[" + outAtom.getElement().toString() + "|" + polymerBr.getOutAtom(0).getValency() + "]", "", "alpha").getFirstAtom(); rGroup1.setProperty(Atom.ATOM_CLASS, 1); state.fragManager.createBond(inAtom, rGroup1, polymerBr.getOutAtom(0).getValency()); Atom rGroup2 =state.fragManager.buildSMILES("[" + inAtom.getElement().toString() + "|" + polymerBr.getOutAtom(1).getValency() + "]", "", "omega").getFirstAtom(); rGroup2.setProperty(Atom.ATOM_CLASS, 2); state.fragManager.createBond(outAtom, rGroup2, polymerBr.getOutAtom(1).getValency()); polymerAttachmentPoints.add(rGroup1); polymerAttachmentPoints.add(rGroup2); polymerBr.removeAllOutAtoms(); } else{ throw new StructureBuildingException("Polymer building failed: Two termini were not found; Expected 2 outAtoms, found: " +polymerBr.getOutAtomCount()); } } /** * Finds a suitable functional atom corresponding to the given locant * Takes into account situations where function replacement may have resulted in the wrong atoms being functional atoms * @param locant * @param mainGroupBR * @return functionalAtomToUse * @throws StructureBuildingException */ private Atom determineFunctionalAtomToUse(String locant, BuildResults mainGroupBR) throws StructureBuildingException { for (int i = 0; i < mainGroupBR.getFunctionalAtomCount(); i++) { //look for exact locant match Atom possibleAtom = mainGroupBR.getFunctionalAtom(i); if (possibleAtom.hasLocant(locant)){ mainGroupBR.removeFunctionalAtom(i); Set<Atom> degenerateAtoms = possibleAtom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT); if (degenerateAtoms != null){ degenerateAtoms.remove(possibleAtom); } return possibleAtom; } } if (MATCH_NUMERIC_LOCANT.matcher(locant).matches()){ //None of the functional atoms had an appropriate locant. Look for the case where the locant refers to the backbone. e.g. 
5-methyl 2-aminopentanedioate for (int i = 0; i < mainGroupBR.getFunctionalAtomCount(); i++) { Atom possibleAtom = mainGroupBR.getFunctionalAtom(i); if (OpsinTools.depthFirstSearchForNonSuffixAtomWithLocant(possibleAtom, locant)!=null){ mainGroupBR.removeFunctionalAtom(i); Set degenerateAtoms = possibleAtom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT); if (degenerateAtoms != null){ degenerateAtoms.remove(possibleAtom); } return possibleAtom; } } } else if (MATCH_ELEMENT_SYMBOL_LOCANT.matcher(locant).matches()){ //None of the functional atoms had an appropriate locant. Look for the special cases: // Where the lack of primes on an element symbol locant should be ignored e.g. O,O-diethyl carbonate // Where the locant is used to decide on the ester configuration c.f. O-methyl ..thioate and S-methyl ..thioate boolean isElementSymbol = MATCH_ELEMENT_SYMBOL.matcher(locant).matches(); for (int i = 0; i < mainGroupBR.getFunctionalAtomCount(); i++) { Atom possibleAtom = mainGroupBR.getFunctionalAtom(i); if (isElementSymbol && possibleAtom.getElement().toString().equals(locant)){ mainGroupBR.removeFunctionalAtom(i); return possibleAtom; } Set degenerateAtoms = possibleAtom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT); if (degenerateAtoms != null){ boolean foundAtom = false; for (Atom a : degenerateAtoms) { if (a.hasLocant(locant) || (isElementSymbol && a.getElement().toString().equals(locant))){ //swap locants and element type List tempLocants = new ArrayList<>(a.getLocants()); List tempLocants2 = new ArrayList<>(possibleAtom.getLocants()); a.clearLocants(); possibleAtom.clearLocants(); for (String l : tempLocants) { possibleAtom.addLocant(l); } for (String l : tempLocants2) { a.addLocant(l); } ChemEl originalChemEl = possibleAtom.getElement(); possibleAtom.setElement(a.getElement()); a.setElement(originalChemEl); mainGroupBR.removeFunctionalAtom(i); foundAtom =true; break; } } if (foundAtom){ degenerateAtoms.remove(possibleAtom); return possibleAtom; } } } } throw new 
StructureBuildingException("Cannot find functional atom with locant: " +locant + " to form an ester with"); } /** * Applies explicit stoichiometry, charge balancing and fractional multipliers * @param molecule * @param wordRules * @throws StructureBuildingException */ private void manipulateStoichiometry(Element molecule, List wordRules) throws StructureBuildingException { boolean explicitStoichiometryPresent = applyExplicitStoichiometryIfProvided(wordRules); boolean chargedFractionalGroup = false; List wordRulesWithFractionalMultipliers = new ArrayList<>(0); for (Element wordRule : wordRules) { Element fractionalMultiplier = wordRule.getChild(0); while (fractionalMultiplier.getChildCount() != 0){ fractionalMultiplier = fractionalMultiplier.getChild(0); } if (fractionalMultiplier.getName().equals(FRACTIONALMULTIPLIER_EL)) { if (explicitStoichiometryPresent) { throw new StructureBuildingException("Fractional multipliers should not be used in conjunction with explicit stoichiometry"); } String[] value = fractionalMultiplier.getAttributeValue(VALUE_ATR).split("/"); if (value.length != 2) { throw new RuntimeException("OPSIN Bug: malformed fractional multiplier: " + fractionalMultiplier.getAttributeValue(VALUE_ATR)); } try { int numerator = Integer.parseInt(value[0]); int denominator = Integer.parseInt(value[1]); if (denominator != 2) { throw new RuntimeException("Only fractions of a 1/2 currently supported"); } for (int j = 1; j < numerator; j++) { Element clone = state.fragManager.cloneElement(state, wordRule); OpsinTools.insertAfter(wordRule, clone); wordRulesWithFractionalMultipliers.add(clone); } } catch (NumberFormatException e) { throw new RuntimeException("OPSIN Bug: malformed fractional multiplier: " + fractionalMultiplier.getAttributeValue(VALUE_ATR)); } //don't detach the fractional multiplier to avoid charge balancing multiplication (cf. 
handling of mono) wordRulesWithFractionalMultipliers.add(wordRule); if (new BuildResults(wordRule).getCharge() !=0){ chargedFractionalGroup = true; } } } if (wordRulesWithFractionalMultipliers.size() > 0) { if (wordRules.size() == 1) { throw new StructureBuildingException("Unexpected fractional multiplier found at start of word"); } if (chargedFractionalGroup) { for (Element wordRule : wordRules) { if (wordRulesWithFractionalMultipliers.contains(wordRule)) { continue; } Element clone = state.fragManager.cloneElement(state, wordRule); OpsinTools.insertAfter(wordRule, clone); } } } boolean saltExpected = molecule.getAttribute(ISSALT_ATR) != null; if (saltExpected) { deprotonateAcidIfSaltWithMetal(molecule); } int overallCharge = state.fragManager.getOverallCharge(); if (overallCharge != 0){//a net charge is present! Could just mean the counterion has not been specified though balanceChargeIfPossible(molecule, overallCharge, explicitStoichiometryPresent); } if (wordRulesWithFractionalMultipliers.size() > 0 && !chargedFractionalGroup) { for (Element wordRule : molecule.getChildElements(WORDRULE_EL)) { if (wordRulesWithFractionalMultipliers.contains(wordRule)) { continue; } Element clone = state.fragManager.cloneElement(state, wordRule); OpsinTools.insertAfter(wordRule, clone); } } } private boolean applyExplicitStoichiometryIfProvided(List wordRules) throws StructureBuildingException { boolean explicitStoichiometryPresent =false; for (Element wordRule : wordRules) { if (wordRule.getAttribute(STOICHIOMETRY_ATR)!=null){ int stoichiometry = Integer.parseInt(wordRule.getAttributeValue(STOICHIOMETRY_ATR)); wordRule.removeAttribute(wordRule.getAttribute(STOICHIOMETRY_ATR)); for (int j = 1; j < stoichiometry; j++) { Element clone = state.fragManager.cloneElement(state, wordRule); OpsinTools.insertAfter(wordRule, clone); } explicitStoichiometryPresent =true; } } return explicitStoichiometryPresent; } private void deprotonateAcidIfSaltWithMetal(Element molecule) { List 
positivelyChargedComponents = new ArrayList<>(); List negativelyChargedComponents = new ArrayList<>(); List neutralComponents = new ArrayList<>(); List wordRules = molecule.getChildElements(WORDRULE_ATR); for (Element wordRule : wordRules) { BuildResults br = new BuildResults(wordRule); int charge = br.getCharge(); if (charge > 0) { positivelyChargedComponents.add(br); } else if (charge < 0) { negativelyChargedComponents.add(br); } else { neutralComponents.add(br); } } if (negativelyChargedComponents.isEmpty() && (positivelyChargedComponents.size() > 0 || getMetalsThatCanBeImplicitlyCations(molecule).size() > 0)) { for (int i = neutralComponents.size() - 1; i>=0; i--) { List functionalAtoms = new ArrayList<>(); for (Fragment f : neutralComponents.get(i).getFragments()) { for (int j = 0; j < f.getFunctionalAtomCount(); j++) { functionalAtoms.add(f.getFunctionalAtom(j).getAtom()); } } for (Atom functionalAtom : functionalAtoms) { if (functionalAtom.getCharge() == 0 && functionalAtom.getIncomingValency() == 1){ functionalAtom.addChargeAndProtons(-1, -1); } } } } } /** * A net charge is present; Given the molecule element the overallCharge is there an unambiguous way of * multiplying fragments to make the net charge 0 * metals without specified charge may be given an implicit positive charge * * If this fails look for the case where there are multiple molecules and the mixture is only negative due to negatively charged functional Atoms e.g. 
pyridine acetate and remove the negative charge * @param molecule * @param explicitStoichiometryPresent * @param overallCharge * @throws StructureBuildingException */ private void balanceChargeIfPossible(Element molecule, int overallCharge, boolean explicitStoichiometryPresent) throws StructureBuildingException { List wordRules = molecule.getChildElements(WORDRULE_ATR); if (wordRules.size() == 1) {//single molecule if (overallCharge == 1) { phosphoZwitterionSpecialCase(wordRules.get(0)); } return; } List positivelyChargedComponents = new ArrayList<>(); List negativelyChargedComponents = new ArrayList<>(); Map componentToChargeMapping = new HashMap<>(); Map componentToBR = new HashMap<>(); List cationicElements = getMetalsThatCanBeImplicitlyCations(molecule); overallCharge = setCationicElementsToTypicalCharge(cationicElements, overallCharge); if (overallCharge == 0) { return; } if (overallCharge == -2 && triHalideSpecialCase(wordRules)) { //e.g. three iodides --> triiodide ion return; } for (Element wordRule : wordRules) { BuildResults br = new BuildResults(wordRule); componentToBR.put(wordRule, br); int charge = br.getCharge(); if (charge>0){ positivelyChargedComponents.add(wordRule); } else if (charge <0){ negativelyChargedComponents.add(wordRule); } componentToChargeMapping.put(wordRule, charge); } if (cationicElements.size() ==1 && overallCharge < 0) {//e.g. manganese tetrachloride [Mn+2]-->[Mn+4] boolean mustBeCommonOxidationState = negativelyChargedComponents.size() == 1 && negativelyChargedComponents.get(0).getChildElements(WORD_EL).size() == 1; //For simple case e.g. 
silver oxide, constrain the metal to common oxidation states, otherwise allow any plausible oxidation state of metal if (setChargeOnCationicElementAppropriately(overallCharge, cationicElements.get(0), mustBeCommonOxidationState)) { return; } } if (!explicitStoichiometryPresent && (positivelyChargedComponents.size()==1 && cationicElements.isEmpty() && negativelyChargedComponents.size() >=1 || positivelyChargedComponents.size()>=1 && negativelyChargedComponents.size() ==1 )){ boolean success = multiplyChargedComponents(negativelyChargedComponents, positivelyChargedComponents, componentToChargeMapping, overallCharge); if (success){ return; } } if (cationicElements.size() ==1){//e.g. magnesium monochloride [Mg2+]-->[Mg+] boolean success = setChargeOnCationicElementAppropriately(overallCharge, cationicElements.get(0), false); if (success){ return; } } if (overallCharge <0){ if (overallCharge == -1 && acetylideSpecialCase(wordRules)) { //e.g. acetylide dianion --> acetylide monoanion return; } //neutralise functionalAtoms if they are the sole cause of the negative charge and multiple molecules are present int chargeOnFunctionalAtoms = 0; for (Element wordRule : wordRules) { BuildResults br = componentToBR.get(wordRule); int functionalAtomCount = br.getFunctionalAtomCount(); for (int i = functionalAtomCount -1; i >=0; i--) { chargeOnFunctionalAtoms += br.getFunctionalAtom(i).getCharge(); } } if (chargeOnFunctionalAtoms <= overallCharge){ for (Element wordRule : wordRules) { BuildResults br = componentToBR.get(wordRule); int functionalAtomCount = br.getFunctionalAtomCount(); for (int i = functionalAtomCount -1; i >=0; i--) { if (overallCharge==0){ return; } overallCharge-=br.getFunctionalAtom(i).getCharge(); br.getFunctionalAtom(i).neutraliseCharge(); br.removeFunctionalAtom(i); } } } } } private List getMetalsThatCanBeImplicitlyCations(Element molecule) { List cationicElements = new ArrayList<>(); List elementaryAtoms = 
OpsinTools.getDescendantElementsWithTagNameAndAttribute(molecule, GROUP_EL, TYPE_ATR, ELEMENTARYATOM_TYPE_VAL); for (Element elementaryAtom : elementaryAtoms) { if (elementaryAtom.getAttribute(COMMONOXIDATIONSTATESANDMAX_ATR)!=null){ Atom metalAtom = elementaryAtom.getFrag().getFirstAtom(); if (metalAtom.getCharge() == 0 && metalAtom.getProperty(Atom.OXIDATION_NUMBER) == null) {//if not 0 charge cannot be implicitly modified String[] typicalOxidationStates = elementaryAtom.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR).split(":")[0].split(","); int typicalCharge = Integer.parseInt(typicalOxidationStates[typicalOxidationStates.length-1]); if (typicalCharge > metalAtom.getBondCount()){ cationicElements.add(elementaryAtom); } } } } return cationicElements; } /** * Sets the cationicElements to the lowest typical charge as specified by the COMMONOXIDATIONSTATESANDMAX_ATR that is >= incoming valency * The valency incoming to the cationicElement is taken into account e.g. phenylmagnesium chloride is [Mg+] * @param cationicElements * @param overallCharge * @return */ private int setCationicElementsToTypicalCharge(List<Element> cationicElements, int overallCharge) { for (Element cationicElement : cationicElements) { Fragment cationicFrag = cationicElement.getFrag(); String[] typicalOxidationStates = cationicElement.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR).split(":")[0].split(","); int incomingValency = cationicFrag.getFirstAtom().getIncomingValency(); for (String typicalOxidationState : typicalOxidationStates) { int charge = Integer.parseInt(typicalOxidationState); if (charge>= incomingValency){ charge -= incomingValency; overallCharge += charge; cationicFrag.getFirstAtom().setCharge(charge); break; } } } return overallCharge; } /** * Checks for tribromide/triiodide and joins the ions if found * @param wordRules * @return */ private boolean triHalideSpecialCase(List<Element> wordRules) { for (Element wordRule : wordRules) { if (wordRule.getChildCount() == 3) { String value = 
wordRule.getAttributeValue(VALUE_ATR); if ("tribromide".equals(value) || "tribromid".equals(value) || "triiodide".equals(value) || "triiodid".equals(value)) { List groups1 = OpsinTools.getDescendantElementsWithTagName(wordRule.getChild(0), GROUP_EL); List groups2 = OpsinTools.getDescendantElementsWithTagName(wordRule.getChild(1), GROUP_EL); List groups3 = OpsinTools.getDescendantElementsWithTagName(wordRule.getChild(2), GROUP_EL); if (groups1.size() != 1 || groups2.size() != 1 || groups3.size() != 1) { throw new RuntimeException("OPSIN Bug: Unexpected trihalide representation"); } Atom centralAtom = groups1.get(0).getFrag().getFirstAtom(); Atom otherAtom1 = groups2.get(0).getFrag().getFirstAtom(); otherAtom1.setCharge(0); Atom otherAtom2 = groups3.get(0).getFrag().getFirstAtom(); otherAtom2.setCharge(0); state.fragManager.createBond(centralAtom, otherAtom1, 1); state.fragManager.createBond(centralAtom, otherAtom2, 1); return true; } } } return false; } private boolean phosphoZwitterionSpecialCase(Element wordRule) { List groups = OpsinTools.getDescendantElementsWithTagName(wordRule, GROUP_EL); List phosphoFrags = new ArrayList<>(); for (Element group : groups) { if (PHOSPHO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) && group.getValue().endsWith("phospho")) { phosphoFrags.add(group.getFrag()); } } if (phosphoFrags.size() == 1) { List matches = FragmentTools.findHydroxyLikeTerminalAtoms(phosphoFrags.get(0).getAtomList(), ChemEl.O); for (Atom atom : matches) { if (atom.getFirstBond().getOtherAtom(atom).getElement() == ChemEl.P) { atom.addChargeAndProtons(-1, -1); return true; } } } return false; } private boolean acetylideSpecialCase(List wordRules) { for (Element wordRule : wordRules) { String value = wordRule.getAttributeValue(VALUE_ATR); if ("acetylide".equals(value) || "acetylid".equals(value)) { List groups = OpsinTools.getDescendantElementsWithTagName(wordRule, GROUP_EL); if (groups.size() != 1) { throw new RuntimeException("OPSIN Bug: Unexpected 
acetylide representation"); } Fragment frag = groups.get(0).getFrag(); Atom firstAtom = frag.getFirstAtom(); if (frag.getCharge() == -2 && firstAtom.getCharge() == -1) { firstAtom.addChargeAndProtons(1, 1); return true; } } } return false; } /** * Multiplies out charged word rules to balance charge * Returns true if balancing was possible, else false * @param negativelyChargedComponents * @param positivelyChargedComponents * @param componentToChargeMapping * @param overallCharge * @return * @throws StructureBuildingException */ private boolean multiplyChargedComponents(List<Element> negativelyChargedComponents, List<Element> positivelyChargedComponents, Map<Element, Integer> componentToChargeMapping, int overallCharge) throws StructureBuildingException { Element componentToMultiply; if (overallCharge >0){ if (negativelyChargedComponents.size() >1){ return false;//ambiguous as to which to multiply } componentToMultiply = negativelyChargedComponents.get(0); } else{ if (positivelyChargedComponents.size() >1){ return false;//ambiguous as to which to multiply } componentToMultiply = positivelyChargedComponents.get(0); } int charge = componentToChargeMapping.get(componentToMultiply); if (overallCharge % charge ==0){//e.g. magnesium chloride if (!componentCanBeMultiplied(componentToMultiply)){ return false; } int timesToDuplicate = Math.abs(overallCharge/charge); for (int i = 0; i < timesToDuplicate; i++) { OpsinTools.insertAfter(componentToMultiply, state.fragManager.cloneElement(state, componentToMultiply)); } } else{//e.g. 
iron(3+) sulfate -->2:3 mixture if (positivelyChargedComponents.size() >1 || !componentCanBeMultiplied(positivelyChargedComponents.get(0))){ return false; } if (negativelyChargedComponents.size() >1 || !componentCanBeMultiplied(negativelyChargedComponents.get(0))){ return false; } int positiveCharge = componentToChargeMapping.get(positivelyChargedComponents.get(0)); int negativeCharge = Math.abs(componentToChargeMapping.get(negativelyChargedComponents.get(0))); int targetTotalAbsoluteCharge = positiveCharge * negativeCharge; for (int i = (targetTotalAbsoluteCharge/negativeCharge); i >1; i--) { OpsinTools.insertAfter(negativelyChargedComponents.get(0), state.fragManager.cloneElement(state, negativelyChargedComponents.get(0))); } for (int i = (targetTotalAbsoluteCharge/positiveCharge); i >1; i--) { OpsinTools.insertAfter(positivelyChargedComponents.get(0), state.fragManager.cloneElement(state, positivelyChargedComponents.get(0))); } } return true; } private boolean componentCanBeMultiplied(Element componentToMultiply) { if (componentToMultiply.getAttributeValue(WORDRULE_ATR).equals(WordRule.simple.toString()) && OpsinTools.getChildElementsWithTagNameAndAttribute(componentToMultiply, WORD_EL, TYPE_ATR, WordType.full.toString()).size()>1){ return false;//already has been multiplied e.g. dichloride } Element firstChild = componentToMultiply.getChild(0); while (firstChild.getChildCount() != 0){ firstChild = firstChild.getChild(0); } if (firstChild.getName().equals(MULTIPLIER_EL) || firstChild.getName().equals(FRACTIONALMULTIPLIER_EL) ){//e.g. monochloride. 
Allows specification of explicit stoichiometry return false; } return true; } private boolean setChargeOnCationicElementAppropriately(int overallCharge, Element cationicElement, boolean mustBeCommonOxidationState) { Atom cation = cationicElement.getFrag().getFirstAtom(); int chargeOnCationNeeded = -(overallCharge -cation.getCharge()); if (mustBeCommonOxidationState) { String[] typicalOxidationStates = cationicElement.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR).split(":")[0].split(","); for (String typicalOxidationState : typicalOxidationStates) { int charge = Integer.parseInt(typicalOxidationState); if (charge == chargeOnCationNeeded) { cation.setCharge(chargeOnCationNeeded); return true; } } } else { int maximumCharge = Integer.parseInt(cationicElement.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR).split(":")[1]); if (chargeOnCationNeeded >=0 && chargeOnCationNeeded <= maximumCharge){ cation.setCharge(chargeOnCationNeeded); return true; } } return false; } private Element findRightMostGroupInWordOrWordRule(Element wordOrWordRule) throws StructureBuildingException { if (wordOrWordRule.getName().equals(WORDRULE_EL)){ List words = OpsinTools.getDescendantElementsWithTagName(wordOrWordRule, WORD_EL); for (int i = words.size() -1 ; i >=0; i--) {//ignore functionalTerm Words if (words.get(i).getAttributeValue(TYPE_ATR).equals(WordType.functionalTerm.toString())){ words.remove(words.get(i)); } } if (words.isEmpty()){ throw new StructureBuildingException("OPSIN bug: word element not found where expected"); } return StructureBuildingMethods.findRightMostGroupInBracket(words.get(words.size()-1)); } else if (wordOrWordRule.getName().equals(WORD_EL)){//word element can be treated just like a bracket return StructureBuildingMethods.findRightMostGroupInBracket(wordOrWordRule); } else{ throw new StructureBuildingException("OPSIN bug: expected word or wordRule"); } } /** * Nasty special cases: * Oxido and related groups can act as O= or even [O-][N+] * This nasty 
behaviour is in generated ChemDraw names and is supported by most nameToStructure tools so it is supported here * Acting as O= notably is often correct behaviour for inorganics * * Methionine (and the like) when substituted at the sulfur/selenium/tellurium are implicitly positively charged * * purine nucleosides/nucleotides are implicitly positively charged when 7-substituted * @param groups * @throws StructureBuildingException */ private void processSpecialCases(List groups) throws StructureBuildingException { for (Element group : groups) { String subType = group.getAttributeValue(SUBTYPE_ATR); if (OXIDOLIKE_SUBTYPE_VAL.equals(subType)){ Atom oxidoAtom = group.getFrag().getFirstAtom(); if (oxidoAtom.getBondCount() != 1) { continue; } Atom connectedAtom = oxidoAtom.getFirstBond().getOtherAtom(oxidoAtom); ChemEl chemEl = connectedAtom.getElement(); if (checkForConnectedOxo(connectedAtom)){//e.g. not oxido(trioxo)ruthenium continue; } if (ELEMENTARYATOM_TYPE_VAL.equals(connectedAtom.getFrag().getType()) || ((chemEl == ChemEl.S || chemEl == ChemEl.P) && connectedAtom.getCharge() ==0 && ValencyChecker.checkValencyAvailableForBond(connectedAtom, 1))){ oxidoAtom.neutraliseCharge(); oxidoAtom.getFirstBond().setOrder(2); } else if (chemEl == ChemEl.N && connectedAtom.getCharge()==0){ int incomingValency = connectedAtom.getIncomingValency(); if ((incomingValency + connectedAtom.getOutValency()) ==3 && connectedAtom.hasSpareValency()){ connectedAtom.addChargeAndProtons(1, 1);//e.g. 
N-oxidopyridine } else if ((incomingValency + connectedAtom.getOutValency()) ==4){ if (connectedAtom.getLambdaConventionValency()!=null && connectedAtom.getLambdaConventionValency()==5){ oxidoAtom.setCharge(0); oxidoAtom.setProtonsExplicitlyAddedOrRemoved(0); oxidoAtom.getFirstBond().setOrder(2); } else{ connectedAtom.addChargeAndProtons(1, 1); } } } } else if (AMINOACID_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR))) { for (Atom atom : group.getFrag()) { if (atom.getElement().isChalcogen() && atom.getElement() != ChemEl.O && atom.getBondCount() == 3 && atom.getIncomingValency() == 3 && atom.getCharge() == 0) { atom.addChargeAndProtons(1, 1); } } } else if (BIOCHEMICAL_SUBTYPE_VAL.equals(subType)) { Fragment frag = group.getFrag(); Atom atom = frag.getAtomByLocant("7"); if (atom != null) { String groupName = group.getValue(); if (groupName.equals("adenosin") || groupName.equals("guanosin") || groupName.equals("inosin") || groupName.equals("thioinosin") || groupName.equals("xanthosin") || groupName.equals("nucleocidin") || groupName.contains("adenylic") || groupName.contains("guanylic") || groupName.contains("inosinic") || groupName.contains("xanthylic") || groupName.endsWith("adenylyl") || groupName.endsWith("adenosyl") || groupName.endsWith("guanylyl") || groupName.endsWith("guanosyl") || groupName.endsWith("inosinylyl") || groupName.endsWith("inosyl") || groupName.endsWith("xanthylyl") || groupName.endsWith("xanthosyl")) { if (atom.getElement() == ChemEl.N && atom.hasSpareValency() && atom.getBondCount() == 3 && atom.getIncomingValency() == 3 && atom.getCharge() == 0) { atom.addChargeAndProtons(1, 1); } } } } else { Matcher m = matchBoroHydrogenIsotope.matcher(group.getValue()); if (m.matches()) { //The number of deuterium in borodeuteride will depend on how many have been substituted! 
Fragment frag = group.getFrag(); int substitutableHydrogen = calculateSubstitutableHydrogenAtoms(frag.getFirstAtom()); boolean isDeuterium = m.group(1).equals("deuter"); for (int i = 0; i < substitutableHydrogen; i++) { Fragment isotope = state.fragManager.buildSMILES(isDeuterium ? "[2H]" : "[3H]"); state.fragManager.createBond(frag.getFirstAtom(), isotope.getFirstAtom(), 1); state.fragManager.incorporateFragment(isotope, frag); } } } } } /** * Is the atom connected to an atom whose fragment has an xml entry called "oxo" * @param atom * @return */ private boolean checkForConnectedOxo(Atom atom) { List bonds = atom.getBonds(); for (Bond bond : bonds) { Atom connectedAtom; if (bond.getFromAtom() == atom){ connectedAtom = bond.getToAtom(); } else{ connectedAtom = bond.getFromAtom(); } Element correspondingEl = connectedAtom.getFrag().getTokenEl(); if (correspondingEl.getValue().equals("oxo")){ return true; } } return false; } /** * Sets the charge according to the oxidation number if the oxidation number atom property has been set * @param groups * @throws StructureBuildingException */ private void processOxidationNumbers(List groups) throws StructureBuildingException { for (Element group : groups) { if (ELEMENTARYATOM_TYPE_VAL.equals(group.getAttributeValue(TYPE_ATR))){ Atom atom = group.getFrag().getFirstAtom(); if (atom.getProperty(Atom.OXIDATION_NUMBER)!=null){ List neighbours = atom.getAtomNeighbours(); int chargeThatWouldFormIfLigandsWereRemoved =0; for (Atom neighbour : neighbours) { Element neighbourEl = neighbour.getFrag().getTokenEl(); Bond b = atom.getBondToAtomOrThrow(neighbour); //carbonyl and nitrosyl are neutral ligands if (!((neighbourEl.getValue().equals("carbon") && NONCARBOXYLICACID_TYPE_VAL.equals(neighbourEl.getAttributeValue(TYPE_ATR))) || neighbourEl.getValue().equals("nitrosyl"))){ chargeThatWouldFormIfLigandsWereRemoved+=b.getOrder(); } } atom.setCharge(atom.getProperty(Atom.OXIDATION_NUMBER)-chargeThatWouldFormIfLigandsWereRemoved); } } } } 
/** * Handles the application of stereochemistry and checking * existing stereochemical specification is still relevant. * @param molecule * @param uniFrag * @throws StructureBuildingException */ private void processStereochemistry(Element molecule, Fragment uniFrag) throws StructureBuildingException { List stereoChemistryEls = findStereochemistryElsInProcessingOrder(molecule); List atomList = uniFrag.getAtomList(); List atomsWithPreDefinedAtomParity = new ArrayList<>(); for (Atom atom : atomList) { if (atom.getAtomParity()!=null){ atomsWithPreDefinedAtomParity.add(atom); } } Set bonds = uniFrag.getBondSet(); List bondsWithPreDefinedBondStereo = new ArrayList<>(); for (Bond bond : bonds) { if (bond.getBondStereo()!=null){ bondsWithPreDefinedBondStereo.add(bond); } } if (stereoChemistryEls.size() >0 || atomsWithPreDefinedAtomParity.size() >0 || bondsWithPreDefinedBondStereo.size() >0){ StereoAnalyser stereoAnalyser = new StereoAnalyser(uniFrag); Map atomStereoCentreMap = new HashMap<>();//contains all atoms that are stereo centres with a mapping to the corresponding StereoCentre object List stereoCentres = stereoAnalyser.findStereoCentres(); for (StereoCentre stereoCentre : stereoCentres) { atomStereoCentreMap.put(stereoCentre.getStereoAtom(),stereoCentre); } Map bondStereoBondMap = new HashMap<>(); List stereoBonds = stereoAnalyser.findStereoBonds(); for (StereoBond stereoBond : stereoBonds) { Bond b = stereoBond.getBond(); if (FragmentTools.notIn6MemberOrSmallerRing(b)){ bondStereoBondMap.put(b, stereoBond); } } StereochemistryHandler stereoChemistryHandler = new StereochemistryHandler(state, atomStereoCentreMap, bondStereoBondMap); stereoChemistryHandler.applyStereochemicalElements(stereoChemistryEls); stereoChemistryHandler.removeRedundantStereoCentres(atomsWithPreDefinedAtomParity, bondsWithPreDefinedBondStereo); } } /** * Finds stereochemistry els in a recursive right to left manner. 
* Within the same scope though stereochemistry els are found left to right * @param parentEl * @return */ private List findStereochemistryElsInProcessingOrder(Element parentEl) { List matchingElements = new ArrayList<>(); List stereochemistryElsAtThisLevel = new ArrayList<>(); for (int i = parentEl.getChildCount() - 1; i >=0; i--) { Element child = parentEl.getChild(i); if (child.getName().equals(STEREOCHEMISTRY_EL)){ stereochemistryElsAtThisLevel.add(child); } else{ matchingElements.addAll(findStereochemistryElsInProcessingOrder(child)); } } Collections.reverse(stereochemistryElsAtThisLevel); matchingElements.addAll(stereochemistryElsAtThisLevel); return matchingElements; } private void convertOutAtomsToAttachmentAtoms(Fragment uniFrag) throws StructureBuildingException { int outAtomCount = uniFrag.getOutAtomCount(); for (int i = outAtomCount -1; i >=0; i--) { OutAtom outAtom = uniFrag.getOutAtom(i); uniFrag.removeOutAtom(i); Atom rGroup = state.fragManager.createAtom(ChemEl.R, uniFrag); state.fragManager.createBond(outAtom.getAtom(), rGroup, outAtom.getValency()); } } /** * Returns the atom corresponding to position i in the outAtoms list * If the outAtom is not set explicitly a suitable atom will be found * @param buildResults * @param i index * @return atom * @throws StructureBuildingException */ private Atom getOutAtomTakingIntoAccountWhetherSetExplicitly(BuildResults buildResults, int i) throws StructureBuildingException { OutAtom outAtom = buildResults.getOutAtom(i); if (outAtom.isSetExplicitly()){ return outAtom.getAtom(); } else{ return findAtomForUnlocantedRadical(state, outAtom.getAtom().getFrag(), outAtom); } } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StructureBuildingException.java000066400000000000000000000007451451751637500320620ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /**Thrown during assembly of the structure * * @author ptc24 * */ class StructureBuildingException extends Exception { private static final 
long serialVersionUID = 1L; StructureBuildingException() { super(); } StructureBuildingException(String message) { super(message); } StructureBuildingException(String message, Throwable cause) { super(message, cause); } StructureBuildingException(Throwable cause) { super(cause); } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/StructureBuildingMethods.java000066400000000000000000003422531451751637500315320ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayDeque; import java.util.ArrayList; import java.util.Collection; import java.util.Collections; import java.util.Deque; import java.util.HashMap; import java.util.HashSet; import java.util.List; import java.util.Map; import java.util.Map.Entry; import java.util.Set; import java.util.regex.Matcher; import java.util.regex.Pattern; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import uk.ac.cam.ch.wwmm.opsin.IsotopeSpecificationParser.IsotopeSpecification; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*; /** * Methods for processing the substitutive and additive operations that connect all the fragments together * as well as indicated hydrogen/unsaturation/heteroatom replacement * @author dl387 * */ class StructureBuildingMethods { private static final Logger LOG = LogManager.getLogger(StructureBuildingMethods.class); private static final Pattern matchCompoundLocant =Pattern.compile("[\\[\\(\\{](\\d+[a-z]?'*)[\\]\\)\\}]"); private StructureBuildingMethods() {} /** * Resolves a word/bracket: * Locanted attributes of words are resolved onto their group * Locanted substitution is performed * Connections involving multi radicals are processed * Unlocanted attributes of words are resolved onto their group * * If word is a wordRule the function will instantly return * * @param state * @param word * @throws StructureBuildingException */ static void resolveWordOrBracket(BuildState 
state, Element word) throws StructureBuildingException { if (word.getName().equals(WORDRULE_EL)){//already been resolved return; } if (!word.getName().equals(WORD_EL) && !word.getName().equals(BRACKET_EL)){ throw new StructureBuildingException("A word or bracket is the expected input"); } recursivelyResolveLocantedFeatures(state, word); recursivelyResolveUnLocantedFeatures(state, word); //TODO check all things that can substitute have outAtoms //TODO think whether you can avoid the need to have a cansubstitute function by only using appropriate group List<Element> subsBracketsAndRoots = OpsinTools.getDescendantElementsWithTagNames(word, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); for (Element subsBracketsAndRoot : subsBracketsAndRoots) { if (subsBracketsAndRoot.getAttribute(MULTIPLIER_ATR) != null) { throw new StructureBuildingException("Structure building problem: multiplier on :" + subsBracketsAndRoot.getName() + " was never used"); } } List<Element> groups = OpsinTools.getDescendantElementsWithTagName(word, GROUP_EL); for (int i = 0; i < groups.size(); i++) { Element group = groups.get(i); if (group.getAttribute(RESOLVED_ATR)==null && i != groups.size()-1){ throw new StructureBuildingException("Structure building problem: Bond was not made from :" +group.getValue() + " but one should have been"); } } } /** * Performs locanted attribute resolution * then additive joining of fragments * then locanted substitutive joining of fragments * * @param state * @param word * @throws StructureBuildingException */ static void recursivelyResolveLocantedFeatures(BuildState state, Element word) throws StructureBuildingException { if (!word.getName().equals(WORD_EL) && !word.getName().equals(BRACKET_EL)){ throw new StructureBuildingException("A word or bracket is the expected input"); } List<Element> subsBracketsAndRoots = OpsinTools.getChildElementsWithTagNames(word, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); //substitution occurs left to right so by doing this right to left you ensure 
that any groups that will come into existence //due to multipliers being expanded will be in existence for (int i =subsBracketsAndRoots.size()-1; i>=0; i--) { Element subBracketOrRoot = subsBracketsAndRoots.get(i); if (subBracketOrRoot.getName().equals(BRACKET_EL)){ recursivelyResolveLocantedFeatures(state,subBracketOrRoot); if (potentiallyCanSubstitute(subBracketOrRoot)){ performAdditiveOperations(state, subBracketOrRoot); performLocantedSubstitutiveOperations(state, subBracketOrRoot); } } else{ resolveRootOrSubstituentLocanted(state, subBracketOrRoot); } } } /** * Performs unlocanted attribute resolution * then unlocanted substitutive joining of fragments * * @param state * @param word * @throws StructureBuildingException */ static void recursivelyResolveUnLocantedFeatures(BuildState state, Element word) throws StructureBuildingException { if (!word.getName().equals(WORD_EL) && !word.getName().equals(BRACKET_EL)){ throw new StructureBuildingException("A word or bracket is the expected input"); } List<Element> subsBracketsAndRoots = OpsinTools.getChildElementsWithTagNames(word, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); //substitution occurs left to right so by doing this right to left you ensure that any groups that will come into existence //due to multipliers being expanded will be in existence for (int i =subsBracketsAndRoots.size()-1; i>=0; i--) { Element subBracketOrRoot = subsBracketsAndRoots.get(i); if (subBracketOrRoot.getName().equals(BRACKET_EL)){ recursivelyResolveUnLocantedFeatures(state,subBracketOrRoot); if (potentiallyCanSubstitute(subBracketOrRoot)){ performUnLocantedSubstitutiveOperations(state, subBracketOrRoot); } } else{ resolveRootOrSubstituentUnLocanted(state, subBracketOrRoot); } } } static void resolveRootOrSubstituentLocanted(BuildState state, Element subOrRoot) throws StructureBuildingException { resolveLocantedFeatures(state, subOrRoot);//e.g. 
unsaturators, hydro groups and heteroatom replacement boolean foundSomethingToSubstitute = potentiallyCanSubstitute(subOrRoot); if (foundSomethingToSubstitute){ performAdditiveOperations(state, subOrRoot);//e.g. ethylenediimino, oxyethylene (operations where two outAtoms are used to produce the bond and no locant is required as groups) performLocantedSubstitutiveOperations(state, subOrRoot);//e.g. 2-methyltoluene } } static void resolveRootOrSubstituentUnLocanted(BuildState state, Element subOrRoot) throws StructureBuildingException { boolean foundSomethingToSubstitute = potentiallyCanSubstitute(subOrRoot); resolveUnLocantedFeatures(state, subOrRoot);//e.g. unsaturators, hydro groups and heteroatom replacement if (foundSomethingToSubstitute){ performUnLocantedSubstitutiveOperations(state, subOrRoot);//e.g. tetramethylfuran } } private static void performLocantedSubstitutiveOperations(BuildState state, Element subBracketOrRoot) throws StructureBuildingException { Element group; if (subBracketOrRoot.getName().equals(BRACKET_EL)) { group = findRightMostGroupInBracket(subBracketOrRoot); } else{ group = subBracketOrRoot.getFirstChildElement(GROUP_EL); } if (group.getAttribute(RESOLVED_ATR) != null) { return; } Fragment frag = group.getFrag(); if (frag.getOutAtomCount() >=1 && subBracketOrRoot.getAttribute(LOCANT_ATR) != null){ String locantString = subBracketOrRoot.getAttributeValue(LOCANT_ATR); if (frag.getOutAtomCount() >1){ checkAndApplySpecialCaseWhereOutAtomsCanBeCombinedOrThrow(frag, group); } if (subBracketOrRoot.getAttribute(MULTIPLIER_ATR) != null) {//e.g. 
1,2-diethyl multiplyOutAndSubstitute(state, subBracketOrRoot); } else{ Fragment parentFrag = findFragmentWithLocant(subBracketOrRoot, locantString); if (parentFrag == null){ String modifiedLocant = checkForBracketedPrimedLocantSpecialCase(subBracketOrRoot, locantString); if (modifiedLocant != null){ parentFrag = findFragmentWithLocant(subBracketOrRoot, modifiedLocant); if (parentFrag != null){ locantString = modifiedLocant; } } } if (parentFrag==null){ throw new StructureBuildingException("Cannot find in scope fragment with atom with locant " + locantString + "."); } group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); Element groupToAttachTo = parentFrag.getTokenEl(); if (groupToAttachTo.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && parentFrag.getOutAtomCount() > 0 && groupToAttachTo.getAttribute(ISAMULTIRADICAL_ATR) != null && parentFrag.getAtomByLocantOrThrow(locantString).getOutValency() > 0 && frag.getOutAtom(0).getValency() == 1 && parentFrag.getFirstAtom().equals(parentFrag.getAtomByLocantOrThrow(locantString))) { //horrible special case to allow C-hydroxycarbonimidoyl and the like //If additive nomenclature the first atom should be an out atom joinFragmentsAdditively(state, frag, parentFrag); } else{ Atom atomToSubstituteAt = parentFrag.getAtomByLocantOrThrow(locantString); if (PHOSPHO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) && frag.getOutAtom(0).getValency() == 1){ if (atomToSubstituteAt.getElement() != ChemEl.O){ for (Atom neighbour : atomToSubstituteAt.getAtomNeighbours()) { if (neighbour.getElement() == ChemEl.O && neighbour.getBondCount()==1 && neighbour.getFirstBond().getOrder() == 1 && neighbour.getOutValency() == 0 && neighbour.getCharge() == 0){ atomToSubstituteAt = neighbour; break; } } } } joinFragmentsSubstitutively(state, frag, atomToSubstituteAt); } } } } private static void performUnLocantedSubstitutiveOperations(BuildState state, Element subBracketOrRoot) throws StructureBuildingException { Element group; if 
(subBracketOrRoot.getName().equals(BRACKET_EL)){ group = findRightMostGroupInBracket(subBracketOrRoot); } else{ group = subBracketOrRoot.getFirstChildElement(GROUP_EL); } if (group.getAttribute(RESOLVED_ATR) != null){ return; } Fragment frag = group.getFrag(); if (frag.getOutAtomCount() >= 1){ if (subBracketOrRoot.getAttribute(LOCANT_ATR) != null){ throw new RuntimeException("Substituent has an unused outAtom and has a locant but locanted substitution should already have been performed!"); } if (frag.getOutAtomCount() > 1){ checkAndApplySpecialCaseWhereOutAtomsCanBeCombinedOrThrow(frag, group); } if (subBracketOrRoot.getAttribute(MULTIPLIER_ATR) != null) {//e.g. diethyl multiplyOutAndSubstitute(state, subBracketOrRoot); } else{ if (PERHALOGENO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))) { performPerHalogenoSubstitution(state, frag, subBracketOrRoot); } else{ List atomsToJoinTo = null; if (PHOSPHO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) && frag.getOutAtom(0).getValency() == 1){ List possibleParents = findAlternativeFragments(subBracketOrRoot); for (Fragment fragment : possibleParents) { List hydroxyAtoms = FragmentTools.findHydroxyGroups(fragment); if (hydroxyAtoms.size() >= 1){ atomsToJoinTo = hydroxyAtoms; } break; } } if (atomsToJoinTo == null) { atomsToJoinTo = findAtomsForSubstitution(subBracketOrRoot, 1, frag.getOutAtom(0).getValency()); } if (atomsToJoinTo == null){ throw new StructureBuildingException("Unlocanted substitution failed: unable to find suitable atom to bond atom with id:" + frag.getOutAtom(0).getAtom().getID() + " to!"); } if (AmbiguityChecker.isSubstitutionAmbiguous(atomsToJoinTo, 1)) { state.addIsAmbiguous("Connection of " + group.getValue() + " to " + atomsToJoinTo.get(0).getFrag().getTokenEl().getValue()); } joinFragmentsSubstitutively(state, frag, atomsToJoinTo.get(0)); } group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); } } } /** * Clones the perhalogenFrag sufficiently to replace all in scope hydrogen 
with halogens. * The cloned fragments are merged into the perhalogenFrag * @param state * @param perhalogenFrag * @param subBracketOrRoot * @throws StructureBuildingException */ private static void performPerHalogenoSubstitution(BuildState state, Fragment perhalogenFrag, Element subBracketOrRoot) throws StructureBuildingException { List<Fragment> fragmentsToAttachTo = findAlternativeFragments(subBracketOrRoot); List<Atom> atomsToHalogenate = new ArrayList<>(); for (Fragment fragment : fragmentsToAttachTo) { FragmentTools.convertSpareValenciesToDoubleBonds(fragment); for (Atom atom : fragment) { int substitutableHydrogen = calculateSubstitutableHydrogenAtoms(atom); if (substitutableHydrogen > 0 && FragmentTools.isCharacteristicAtom(atom)){ continue; } for (int i = 0; i < substitutableHydrogen; i++) { atomsToHalogenate.add(atom); } } } if (atomsToHalogenate.isEmpty()){ throw new RuntimeException("Failed to find any substitutable hydrogen to apply " + perhalogenFrag.getTokenEl().getValue() + " to!"); } List<Fragment> halogens = new ArrayList<>(); halogens.add(perhalogenFrag); for (int i = 0; i < atomsToHalogenate.size() - 1; i++) { halogens.add(state.fragManager.copyFragment(perhalogenFrag)); } for (int i = 0; i < atomsToHalogenate.size(); i++) { Fragment halogen = halogens.get(i); Atom from = halogen.getOutAtom(0).getAtom(); halogen.removeOutAtom(0); state.fragManager.createBond(from, atomsToHalogenate.get(i), 1); } for (int i = 1; i < atomsToHalogenate.size(); i++) { state.fragManager.incorporateFragment(halogens.get(i), perhalogenFrag); } } /** * Multiplies out groups/brackets and substitutes them. 
The attribute "locant" is checked for locants * If it is present it should contain a comma separated list of locants * The strategy employed is to clone subOrBracket and its associated fragments as many times as the multiplier attribute * perform(Un)LocantedSubstitutiveOperations is then called with on each call a different clone (or the original element) being in position * Hence bonding between the clones is impossible * @param state * @param subOrBracket * @throws StructureBuildingException */ private static void multiplyOutAndSubstitute(BuildState state, Element subOrBracket) throws StructureBuildingException { Attribute multiplierAtr = subOrBracket.getAttribute(MULTIPLIER_ATR); int multiplier = Integer.parseInt(multiplierAtr.getValue()); subOrBracket.removeAttribute(multiplierAtr); String[] locants = null; String locantsAtrValue = subOrBracket.getAttributeValue(LOCANT_ATR); if (locantsAtrValue != null){ locants = locantsAtrValue.split(","); } Element parentWordOrBracket = subOrBracket.getParent(); int indexOfSubOrBracket = parentWordOrBracket.indexOf(subOrBracket); subOrBracket.detach(); List elementsNotToBeMultiplied = new ArrayList<>();//anything before the multiplier in the sub/bracket Element multiplierEl = subOrBracket.getFirstChildElement(MULTIPLIER_EL); if (multiplierEl == null){ throw new RuntimeException("Multiplier not found where multiplier expected"); } for (int i = subOrBracket.indexOf(multiplierEl) -1 ; i >=0 ; i--) { Element el = subOrBracket.getChild(i); el.detach(); elementsNotToBeMultiplied.add(el); } multiplierEl.detach(); List multipliedElements = new ArrayList<>(); for (int i = multiplier - 1; i >=0; i--) { Element currentElement; if (i != 0){ currentElement = state.fragManager.cloneElement(state, subOrBracket, i); addPrimesToLocantedStereochemistryElements(currentElement, StringTools.multiplyString("'", i));//Stereochemistry elements with locants will need to have their locants primed (stereochemistry is only processed after structure 
building) } else{ currentElement = subOrBracket; } multipliedElements.add(currentElement); if (locants != null){ parentWordOrBracket.insertChild(currentElement, indexOfSubOrBracket); currentElement.getAttribute(LOCANT_ATR).setValue(locants[i]); performLocantedSubstitutiveOperations(state, currentElement); currentElement.detach(); } } if (locants == null) { parentWordOrBracket.insertChild(multipliedElements.get(0), indexOfSubOrBracket); performUnlocantedSubstitutiveOperations(state, multipliedElements); multipliedElements.get(0).detach(); } for (Element multipliedElement : multipliedElements) {//attach all the multiplied subs/brackets parentWordOrBracket.insertChild(multipliedElement, indexOfSubOrBracket); } for (Element el : elementsNotToBeMultiplied) {//re-add anything before multiplier to original subOrBracket subOrBracket.insertChild(el, 0); } } private static void performUnlocantedSubstitutiveOperations(BuildState state, List<Element> multipliedElements) throws StructureBuildingException { int numOfSubstituents = multipliedElements.size(); Element subBracketOrRoot = multipliedElements.get(0); Element group; if (subBracketOrRoot.getName().equals(BRACKET_EL)){ group = findRightMostGroupInBracket(subBracketOrRoot); } else{ group = subBracketOrRoot.getFirstChildElement(GROUP_EL); } Fragment frag = group.getFrag(); if (frag.getOutAtomCount() >= 1){ if (subBracketOrRoot.getAttribute(LOCANT_ATR) != null){ throw new RuntimeException("Substituent has an unused outAtom and has a locant but locanted substitution should already have been performed!"); } if (PERHALOGENO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))) { throw new StructureBuildingException(group.getValue() + " cannot be multiplied"); } if (frag.getOutAtomCount() > 1){ checkAndApplySpecialCaseWhereOutAtomsCanBeCombinedOrThrow(frag, group); } List<Atom> atomsToJoinTo = null; if (PHOSPHO_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR)) && frag.getOutAtom(0).getValency() == 1){ List<Fragment> possibleParents = 
findAlternativeFragments(subBracketOrRoot); for (Fragment fragment : possibleParents) { List hydroxyAtoms = FragmentTools.findHydroxyGroups(fragment); if (hydroxyAtoms.size() >= numOfSubstituents){ atomsToJoinTo = hydroxyAtoms; } break; } } if (atomsToJoinTo == null) { atomsToJoinTo = findAtomsForSubstitution(subBracketOrRoot, numOfSubstituents, frag.getOutAtom(0).getValency()); } if (atomsToJoinTo == null) { throw new StructureBuildingException("Unlocanted substitution failed: unable to find suitable atom to bond atom with id:" + frag.getOutAtom(0).getAtom().getID() + " to!"); } if (AmbiguityChecker.isSubstitutionAmbiguous(atomsToJoinTo, numOfSubstituents)) { state.addIsAmbiguous("Connection of " + group.getValue() + " to " + atomsToJoinTo.get(0).getFrag().getTokenEl().getValue()); List atomsPreferredByEnvironment = AmbiguityChecker.useAtomEnvironmentsToGivePlausibleSubstitution(atomsToJoinTo, numOfSubstituents); if (atomsPreferredByEnvironment != null) { atomsToJoinTo = atomsPreferredByEnvironment; } } joinFragmentsSubstitutively(state, frag, atomsToJoinTo.get(0)); group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); for (int i = 1; i < numOfSubstituents; i++) { subBracketOrRoot = multipliedElements.get(i); if (subBracketOrRoot.getName().equals(BRACKET_EL)){ group = findRightMostGroupInBracket(subBracketOrRoot); } else{ group = subBracketOrRoot.getFirstChildElement(GROUP_EL); } frag = group.getFrag(); if (frag.getOutAtomCount() > 1){//TODO do this prior to multiplication? 
checkAndApplySpecialCaseWhereOutAtomsCanBeCombinedOrThrow(frag, group); } joinFragmentsSubstitutively(state, frag, atomsToJoinTo.get(i)); group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); } } } /** * Adds locanted unsaturators, heteroatoms and hydrogen elements to the group within the sub or root * @param state * @param subOrRoot * @throws StructureBuildingException */ static void resolveLocantedFeatures(BuildState state, Element subOrRoot) throws StructureBuildingException { List groups = subOrRoot.getChildElements(GROUP_EL); if (groups.size() != 1){ throw new StructureBuildingException("Each sub or root should only have one group element. This indicates a bug in OPSIN"); } Element group = groups.get(0); Fragment thisFrag = group.getFrag(); List unsaturators = new ArrayList<>(); List heteroatoms = new ArrayList<>(); List hydrogenElements = new ArrayList<>(); List subtractivePrefixElements = new ArrayList<>(); List isotopeSpecifications = new ArrayList<>(); List children =subOrRoot.getChildElements(); for (Element currentEl : children) { String elName =currentEl.getName(); if (elName.equals(UNSATURATOR_EL)){ unsaturators.add(currentEl); } else if (elName.equals(HETEROATOM_EL)){ heteroatoms.add(currentEl); } else if (elName.equals(SUBTRACTIVEPREFIX_EL)){ subtractivePrefixElements.add(currentEl); } else if (elName.equals(HYDRO_EL)){ hydrogenElements.add(currentEl); } else if (elName.equals(INDICATEDHYDROGEN_EL)){ hydrogenElements.add(currentEl); } else if (elName.equals(ADDEDHYDROGEN_EL)){ hydrogenElements.add(currentEl); } else if (elName.equals(ISOTOPESPECIFICATION_EL)){ isotopeSpecifications.add(currentEl); } } /* * Add locanted functionality */ if (!subtractivePrefixElements.isEmpty()) { List atomsToDehydro = new ArrayList<>(); //locanted substitution can be assumed to be irrelevant to subtractive operations hence perform all subtractive operations now Map unlocantedSubtractivePrefixes = new HashMap<>(); Map unlocantedHeteroatomRemovalPrefixes = new 
HashMap<>(); for(int i = subtractivePrefixElements.size() -1; i >= 0; i--) { Element subtractivePrefix = subtractivePrefixElements.get(i); String type = subtractivePrefix.getAttributeValue(TYPE_ATR); String locant = subtractivePrefix.getAttributeValue(LOCANT_ATR); switch (type) { case DEOXY_TYPE_VAL: ChemEl atomToRemove = ChemEl.valueOf(subtractivePrefix.getAttributeValue(VALUE_ATR)); if (locant == null) { Integer count = unlocantedSubtractivePrefixes.get(atomToRemove); unlocantedSubtractivePrefixes.put(atomToRemove, count != null ? count + 1 : 1); } else { applySubtractivePrefix(state, thisFrag, atomToRemove, locant); } break; case ANHYDRO_TYPE_VAL: applyAnhydroPrefix(state, thisFrag, subtractivePrefix); break; case DEHYDRO_TYPE_VAL: if(locant != null) { atomsToDehydro.add(thisFrag.getAtomByLocantOrThrow(locant)); } else{ throw new StructureBuildingException("locants are assumed to be required for the use of dehydro to be unambiguous"); } break; case HETEROATOMREMOVAL_TYPE_VAL: ChemEl heteroatom = ChemEl.valueOf(subtractivePrefix.getAttributeValue(VALUE_ATR)); if (locant == null) { Integer count = unlocantedHeteroatomRemovalPrefixes.get(heteroatom); unlocantedHeteroatomRemovalPrefixes.put(heteroatom, count != null ? 
count + 1 : 1); } else { applyHeteroatomRemovalPrefix(state, thisFrag, heteroatom, locant); } break; default: throw new StructureBuildingException("OPSIN bug: Unexpected subtractive prefix type: " + type); } subtractivePrefix.detach(); } for (Entry entry : unlocantedSubtractivePrefixes.entrySet()) { applyUnlocantedSubtractivePrefixes(state, thisFrag, entry.getKey(), entry.getValue()); } for (Entry entry : unlocantedHeteroatomRemovalPrefixes.entrySet()) { applyUnlocantedHeteroatomRemovalPrefixes(state, thisFrag, entry.getKey(), entry.getValue()); } if (atomsToDehydro.size() > 0){ boolean isCarbohydrateDehydro = false; if (group.getAttributeValue(TYPE_ATR).equals(CARBOHYDRATE_TYPE_VAL)){ Set uniquifiedDehydroAtoms = new HashSet<>(atomsToDehydro); if (uniquifiedDehydroAtoms.size()==atomsToDehydro.size()){//need to rule out case where dehydro is being used to form triple bonds on carbohydrates isCarbohydrateDehydro = true; } } if (isCarbohydrateDehydro){ for (Atom a : atomsToDehydro) { List hydroxyAtoms = FragmentTools.findHydroxyLikeTerminalAtoms(a.getAtomNeighbours(), ChemEl.O); if (hydroxyAtoms.size() > 0){ hydroxyAtoms.get(0).getFirstBond().setOrder(2); } else{ throw new StructureBuildingException("atom with locant " + a.getFirstLocant() + " did not have a hydroxy group to convert to a ketose"); } } } else{ List atomsToFormDoubleBonds = new ArrayList<>(); List atomsToFormTripleBondsBetween = new ArrayList<>();//dehydro on a double/aromatic bond forms a triple bond for (Atom a : atomsToDehydro) { if (!a.hasSpareValency()){ a.setSpareValency(true); atomsToFormDoubleBonds.add(a); } else{ atomsToFormTripleBondsBetween.add(a); } } for (Atom atom : atomsToFormDoubleBonds) {//check that all the dehydro-ed atoms are next to another atom with spare valency boolean hasSpareValency =false; for (Atom neighbour : atom.getAtomNeighbours()) { if (neighbour.hasSpareValency()){ hasSpareValency = true; break; } } if (!hasSpareValency){ throw new 
StructureBuildingException("Unexpected use of dehydro; two adjacent atoms were not unsaturated such as to form a double bond"); } } addDehydroInducedTripleBonds(atomsToFormTripleBondsBetween); } } } for(int i=hydrogenElements.size() -1;i >= 0;i--) { Element hydrogen = hydrogenElements.get(i); String locant = hydrogen.getAttributeValue(LOCANT_ATR); if(locant != null) { Atom a =thisFrag.getAtomByLocantOrThrow(locant); if (a.hasSpareValency()){ a.setSpareValency(false); } else{ if (!acdNameSpiroIndicatedHydrogenBug(group, locant)){ throw new StructureBuildingException("hydrogen addition at locant: " + locant +" was requested, but this atom is not unsaturated"); } } hydrogenElements.remove(i); hydrogen.detach(); } } for(int i=unsaturators.size() -1;i >= 0;i--) { Element unsaturator = unsaturators.get(i); String locant = unsaturator.getAttributeValue(LOCANT_ATR); int bondOrder = Integer.parseInt(unsaturator.getAttributeValue(VALUE_ATR)); if(bondOrder <= 1) { unsaturator.detach(); continue; } if(locant != null) { unsaturators.remove(unsaturator); /* * Is the locant a compound locant e.g. 
1(6) * This would indicate unsaturation between the atoms with locants 1 and 6 */ Matcher matcher = matchCompoundLocant.matcher(locant); if (matcher.find()) { String compoundLocant = matcher.group(1); locant = matcher.replaceAll(""); FragmentTools.unsaturate(thisFrag.getAtomByLocantOrThrow(locant), compoundLocant, bondOrder, thisFrag); } else { FragmentTools.unsaturate(thisFrag.getAtomByLocantOrThrow(locant), bondOrder, thisFrag); } unsaturator.detach(); } } for(int i=heteroatoms.size() -1;i >= 0;i--) { Element heteroatomEl = heteroatoms.get(i); String locant = heteroatomEl.getAttributeValue(LOCANT_ATR); if(locant != null) { Atom heteroatom = state.fragManager.getHeteroatom(heteroatomEl.getAttributeValue(VALUE_ATR)); Atom atomToBeReplaced =thisFrag.getAtomByLocantOrThrow(locant); if (heteroatom.getElement() == atomToBeReplaced.getElement() && heteroatom.getCharge() == atomToBeReplaced.getCharge()){ throw new StructureBuildingException("The replacement term " +heteroatomEl.getValue() +" was used on an atom that already is a " + heteroatom.getElement()); } state.fragManager.replaceAtomWithAtom(thisFrag.getAtomByLocantOrThrow(locant), heteroatom, true); if (heteroatomEl.getAttribute(LAMBDA_ATR) != null){ thisFrag.getAtomByLocantOrThrow(locant).setLambdaConventionValency(Integer.parseInt(heteroatomEl.getAttributeValue(LAMBDA_ATR))); } heteroatoms.remove(heteroatomEl); heteroatomEl.detach(); } } if (isotopeSpecifications.size() > 0) { applyIsotopeSpecifications(state, thisFrag, isotopeSpecifications, true); } } /** * ACD/Name has a known bug where it produces names in which a suffixed saturated ring in a polycyclic spiro * is treated as if it is unsaturated and hence has indicated hydrogens * e.g. 
1',3'-dihydro-2H,5H-spiro[imidazolidine-4,2'-indene]-2,5-dione * @param group * @param indicatedHydrogenLocant * @return */ private static boolean acdNameSpiroIndicatedHydrogenBug(Element group, String indicatedHydrogenLocant) { if (group.getValue().startsWith("spiro")) { for (Element suffix : group.getParent().getChildElements(SUFFIX_EL)) { String suffixLocant = suffix.getAttributeValue(LOCANT_ATR); if (suffixLocant != null && suffixLocant.equals(indicatedHydrogenLocant)) { LOG.debug("Indicated hydrogen at " + indicatedHydrogenLocant + " ignored. Known bug in generated IUPAC name"); return true; } } } return false; } /** * Removes a terminal atom of a particular element e.g. oxygen * The locant specifies the atom adjacent to the atom to be removed * Formally the atom is replaced by hydrogen, hence stereochemistry is intentionally preserved * @param state * @param fragment * @param chemEl * @param locant A locant or null * @throws StructureBuildingException */ static void applySubtractivePrefix(BuildState state, Fragment fragment, ChemEl chemEl, String locant) throws StructureBuildingException { Atom adjacentAtom = fragment.getAtomByLocantOrThrow(locant); List applicableTerminalAtoms = FragmentTools.findHydroxyLikeTerminalAtoms(adjacentAtom.getAtomNeighbours(), chemEl); if (applicableTerminalAtoms.isEmpty()) { throw new StructureBuildingException("Unable to find terminal atom of type: " + chemEl + " at locant "+ locant +" for subtractive nomenclature"); } Atom atomToRemove = applicableTerminalAtoms.get(0); if (FragmentTools.isFunctionalAtom(atomToRemove)) {//This can occur with aminoglycosides where the anomeric OH is removed by deoxy for (int i = 0, len = fragment.getFunctionalAtomCount(); i < len; i++) { if (atomToRemove.equals(fragment.getFunctionalAtom(i).getAtom())) { fragment.removeFunctionalAtom(i); break; } } fragment.addFunctionalAtom(atomToRemove.getFirstBond().getOtherAtom(atomToRemove)); } FragmentTools.removeTerminalAtom(state, atomToRemove); } /** * 
Removes terminal atoms of a particular element e.g. oxygen * The number to remove is decided by the count * Formally the atom is replaced by hydrogen, hence stereochemistry is intentionally preserved * @param state * @param fragment * @param chemEl * @param count * @throws StructureBuildingException */ static void applyUnlocantedSubtractivePrefixes(BuildState state, Fragment fragment, ChemEl chemEl, int count) throws StructureBuildingException { List applicableTerminalAtoms = FragmentTools.findHydroxyLikeTerminalAtoms(fragment.getAtomList(), chemEl); if (applicableTerminalAtoms.isEmpty() || applicableTerminalAtoms.size() < count) { throw new StructureBuildingException("Unable to find terminal atom of type: " + chemEl + " for subtractive nomenclature"); } if (AmbiguityChecker.isSubstitutionAmbiguous(applicableTerminalAtoms, count)) { state.addIsAmbiguous("Group to remove with subtractive prefix"); } for (int i = 0; i < count; i++) { Atom atomToRemove = applicableTerminalAtoms.get(i); if (FragmentTools.isFunctionalAtom(atomToRemove)) {//This can occur with aminoglycosides where the anomeric OH is removed by deoxy for (int j = 0, len = fragment.getFunctionalAtomCount(); j < len; j++) { if (atomToRemove.equals(fragment.getFunctionalAtom(j).getAtom())) { fragment.removeFunctionalAtom(j); break; } } fragment.addFunctionalAtom(atomToRemove.getFirstBond().getOtherAtom(atomToRemove)); } FragmentTools.removeTerminalAtom(state, atomToRemove); } } private static void applyAnhydroPrefix(BuildState state, Fragment frag, Element subtractivePrefix) throws StructureBuildingException { ChemEl chemEl = ChemEl.valueOf(subtractivePrefix.getAttributeValue(VALUE_ATR)); String locantStr = subtractivePrefix.getAttributeValue(LOCANT_ATR); if (locantStr == null) { throw new StructureBuildingException("Two locants are required before an anhydro prefix"); } String[] locants = locantStr.split(","); Atom backBoneAtom1 = frag.getAtomByLocantOrThrow(locants[0]); Atom backBoneAtom2 = 
frag.getAtomByLocantOrThrow(locants[1]); List<Atom> applicableTerminalAtoms = FragmentTools.findHydroxyLikeTerminalAtoms(backBoneAtom1.getAtomNeighbours(), chemEl); if (applicableTerminalAtoms.isEmpty()){ throw new StructureBuildingException("Unable to find terminal atom of type: " + chemEl + " for subtractive nomenclature"); } FragmentTools.removeTerminalAtom(state, applicableTerminalAtoms.get(0)); applicableTerminalAtoms = FragmentTools.findHydroxyLikeTerminalAtoms(backBoneAtom2.getAtomNeighbours(), chemEl); if (applicableTerminalAtoms.isEmpty()){ throw new StructureBuildingException("Unable to find terminal atom of type: " + chemEl + " for subtractive nomenclature"); } state.fragManager.createBond(backBoneAtom1, applicableTerminalAtoms.get(0), 1); } /** * Attempts to form triple bond between the atoms in atomsToFormTripleBondsBetween * Throws an exception if the list contains duplicates or atoms with no adjacent atom in the list * @param atomsToFormTripleBondsBetween * @throws StructureBuildingException */ private static void addDehydroInducedTripleBonds(List<Atom> atomsToFormTripleBondsBetween) throws StructureBuildingException { if (atomsToFormTripleBondsBetween.size()>0){ Set<Atom> atoms = new HashSet<>(atomsToFormTripleBondsBetween); if (atomsToFormTripleBondsBetween.size() != atoms.size()){ throw new StructureBuildingException("locants specified for dehydro specify the same atom too many times"); } atomLoop: for (int i = atomsToFormTripleBondsBetween.size()-1; i >=0; i = i-2) {//two atoms will have a triple bond formed between them Atom a = atomsToFormTripleBondsBetween.get(i); List<Atom> neighbours = a.getAtomNeighbours(); for (Atom neighbour : neighbours) { if (atomsToFormTripleBondsBetween.contains(neighbour)){ atomsToFormTripleBondsBetween.remove(i); atomsToFormTripleBondsBetween.remove(neighbour); Bond b = a.getBondToAtomOrThrow(neighbour); b.setOrder(3); a.setSpareValency(false); neighbour.setSpareValency(false); continue atomLoop; } } throw new 
StructureBuildingException("dehydro indicated atom should form a triple bond but no adjacent atoms also had hydrogen removed!"); } } } /** * Replaces the specified heteroatom at the specified locant with carbon * @param state * @param fragment * @param chemEl * @param locant A locant * @throws StructureBuildingException */ private static void applyHeteroatomRemovalPrefix(BuildState state, Fragment fragment, ChemEl chemEl, String locant) throws StructureBuildingException { Atom atomToChange = fragment.getAtomByLocantOrThrow(locant); if (atomToChange.getElement() != chemEl) { throw new StructureBuildingException("Removal of " + chemEl.toString() + " requested, but the atom at locant " + locant + " is not this element"); } atomToChange.setElement(ChemEl.C); atomToChange.removeElementSymbolLocants(); } /** * Replaces the specified number of the specified heteroatoms with carbon * @param state * @param fragment * @param chemEl * @param count * @throws StructureBuildingException */ private static void applyUnlocantedHeteroatomRemovalPrefixes(BuildState state, Fragment fragment, ChemEl chemEl, int count) throws StructureBuildingException { List<Atom> applicableAtoms = new ArrayList<>(); for (Atom atom : fragment) { if (atom.getElement() == chemEl) { applicableAtoms.add(atom); } } if (applicableAtoms.isEmpty() || applicableAtoms.size() < count) { throw new StructureBuildingException("Unable to find sufficient atoms of element: " + chemEl + " for subtractive nomenclature"); } if (AmbiguityChecker.isSubstitutionAmbiguous(applicableAtoms, count)) { state.addIsAmbiguous("Group to remove with subtractive prefix"); } for (int i = 0; i < count; i++) { Atom atomToChange = applicableAtoms.get(i); atomToChange.setElement(ChemEl.C); atomToChange.removeElementSymbolLocants(); } } /** * Adds unlocanted unsaturators, heteroatoms and hydrogen elements to the group within the sub or root * @param state * @param subOrRoot * @throws StructureBuildingException */ static void 
resolveUnLocantedFeatures(BuildState state, Element subOrRoot) throws StructureBuildingException { List groups = subOrRoot.getChildElements(GROUP_EL); if (groups.size() != 1){ throw new StructureBuildingException("Each sub or root should only have one group element. This indicates a bug in OPSIN"); } Fragment frag = groups.get(0).getFrag(); List unsaturationBondOrders = new ArrayList<>(); List heteroatoms = new ArrayList<>(); List hydrogenElements = new ArrayList<>(); List isotopeSpecifications = new ArrayList<>(); List children = subOrRoot.getChildElements(); for (Element currentEl : children) { String elName = currentEl.getName(); if (elName.equals(UNSATURATOR_EL)) { int bondOrder = Integer.parseInt(currentEl.getAttributeValue(VALUE_ATR)); if (bondOrder > 1) { unsaturationBondOrders.add(bondOrder); } currentEl.detach(); } else if (elName.equals(HETEROATOM_EL)){ heteroatoms.add(currentEl); currentEl.detach(); } else if (elName.equals(HYDRO_EL) || elName.equals(INDICATEDHYDROGEN_EL) || elName.equals(ADDEDHYDROGEN_EL)){ hydrogenElements.add(currentEl); currentEl.detach(); } else if (elName.equals(ISOTOPESPECIFICATION_EL)){ isotopeSpecifications.add(currentEl); } } if (hydrogenElements.size() > 0) { applyUnlocantedHydro(state, frag, hydrogenElements); } if (unsaturationBondOrders.size() > 0){ unsaturateBonds(state, frag, unsaturationBondOrders); } if (heteroatoms.size() > 0) { applyUnlocantedHeteroatoms(state, frag, heteroatoms); } if (isotopeSpecifications.size() > 0) { applyIsotopeSpecifications(state, frag, isotopeSpecifications, false); } if (frag.getOutAtomCount() > 0){//assign any outAtoms that have not been set to a specific atom to a specific atom for (int i = 0, l = frag.getOutAtomCount(); i < l; i++) { OutAtom outAtom = frag.getOutAtom(i); if (!outAtom.isSetExplicitly()){ outAtom.setAtom(findAtomForUnlocantedRadical(state, frag, outAtom)); outAtom.setSetExplicitly(true); } } } } private static void applyUnlocantedHydro(BuildState state, Fragment frag, List 
hydrogenElements) throws StructureBuildingException { /* * This function is not entirely straightforward as certain atoms definitely should have their spare valency reduced * However names are not consistent as to whether they bother having the hydro tags do this! * The atoms in atomsWithSV are in atom order those that can take a hydro element and then those that shouldn't really take a hydro element as its absence is unambiguous */ List<Atom> atomsAcceptingHydroPrefix = new ArrayList<>(); Set<Atom> atomsWhichImplicitlyHadTheirSVRemoved = new HashSet<>(); List<Atom> atomList = frag.getAtomList(); for (Atom atom : atomList) { if (atom.getType().equals(SUFFIX_TYPE_VAL)){ continue; } atom.ensureSVIsConsistantWithValency(false);//doesn't take into account suffixes if (atom.hasSpareValency()) { atomsAcceptingHydroPrefix.add(atom); //if we take into account suffixes is the SV removed atom.ensureSVIsConsistantWithValency(true); if (!atom.hasSpareValency()) { atomsWhichImplicitlyHadTheirSVRemoved.add(atom); } } } int hydrogenElsCount = hydrogenElements.size(); for (Element hydrogenElement : hydrogenElements) { if (hydrogenElement.getValue().equals("perhydro")) { if (hydrogenElsCount != 1){ throw new StructureBuildingException("Unexpected indication of hydrogen when perhydro makes such indication redundant"); } for (Atom atom : atomsAcceptingHydroPrefix) { atom.setSpareValency(false); } return; } } List<Atom> atomsWithDefiniteSV = new ArrayList<>(); List<Atom> otherAtomsThatCanHaveHydro = new ArrayList<>(); for(Atom a : atomsAcceptingHydroPrefix) { if (atomsWhichImplicitlyHadTheirSVRemoved.contains(a)) { otherAtomsThatCanHaveHydro.add(a); } else { boolean canFormDoubleBond = false; for(Atom aa : frag.getIntraFragmentAtomNeighbours(a)) { if(aa.hasSpareValency()) { canFormDoubleBond = true; break; } } if (canFormDoubleBond) { atomsWithDefiniteSV.add(a); } else { otherAtomsThatCanHaveHydro.add(a); } } } List<Atom> prioritisedAtomsAcceptingHydro = new ArrayList<>(atomsWithDefiniteSV); 
prioritisedAtomsAcceptingHydro.addAll(otherAtomsThatCanHaveHydro);//these end up at the end of the list if (hydrogenElsCount > prioritisedAtomsAcceptingHydro.size()) { throw new StructureBuildingException("Cannot find atom to add hydrogen to (" + hydrogenElsCount + " hydrogens requested but only " + prioritisedAtomsAcceptingHydro.size() +" positions that can be hydrogenated)" ); } int svCountAfterRemoval = atomsWithDefiniteSV.size() - hydrogenElsCount; if (svCountAfterRemoval > 1) { //ambiguity likely. If it's 1 then an atom will be implicitly hydrogenated //NOTE: as hydrogens are added in pairs, the condition "unambiguous if one hydrogen is added and all atoms are identical" is unlikely ever to be satisfied if (!(AmbiguityChecker.allAtomsEquivalent(atomsWithDefiniteSV) && (hydrogenElsCount == 1 || hydrogenElsCount == atomsWithDefiniteSV.size() - 1))) { state.addIsAmbiguous("Ambiguous choice of positions to add hydrogen to on " + frag.getTokenEl().getValue()); } } for (int i = 0; i < hydrogenElsCount; i++) { prioritisedAtomsAcceptingHydro.get(i).setSpareValency(false); } } private static void unsaturateBonds(BuildState state, Fragment frag, List<Integer> unsaturationBondOrders) throws StructureBuildingException { int tripleBonds = 0; int doublebonds = 0; for (Integer bondOrder : unsaturationBondOrders) { if (bondOrder == 3) { tripleBonds++; } else if (bondOrder == 2) { doublebonds++; } else { throw new RuntimeException("Unexpected unsaturation bond order: " + bondOrder); } } if (tripleBonds > 0) { unsaturateBonds(state, frag, 3, tripleBonds); } if (doublebonds > 0) { unsaturateBonds(state, frag, 2, doublebonds); } } private static void unsaturateBonds(BuildState state, Fragment frag, int bondOrder, int numToUnsaturate) throws StructureBuildingException { List<Bond> bondsThatCouldBeUnsaturated = findBondsToUnSaturate(frag, bondOrder, false); List<Bond> alternativeBondsThatCouldBeUnsaturated = Collections.emptyList(); if (bondsThatCouldBeUnsaturated.size() < numToUnsaturate){ 
bondsThatCouldBeUnsaturated = findBondsToUnSaturate(frag, bondOrder, true); } else { alternativeBondsThatCouldBeUnsaturated = findAlternativeBondsToUnSaturate(frag, bondOrder, bondsThatCouldBeUnsaturated); } if (bondsThatCouldBeUnsaturated.size() < numToUnsaturate){ throw new StructureBuildingException("Failed to find bond to change to a bond of order: " + bondOrder); } if (bondsThatCouldBeUnsaturated.size() > numToUnsaturate) { //by convention cycloalkanes can have one unsaturation implicitly at the 1 locant //terms like oxazoline are formally ambiguous but in practice the lowest locant is the one that will be intended (in this case 2-oxazoline) if (!isCycloAlkaneSpecialCase(frag, numToUnsaturate, bondsThatCouldBeUnsaturated) && !HANTZSCHWIDMAN_SUBTYPE_VAL.equals(frag.getSubType())) { if (alternativeBondsThatCouldBeUnsaturated.size() >= numToUnsaturate) { List allBonds = new ArrayList<>(bondsThatCouldBeUnsaturated); allBonds.addAll(alternativeBondsThatCouldBeUnsaturated); if (!(AmbiguityChecker.allBondsEquivalent(allBonds) && numToUnsaturate == 1 )) { state.addIsAmbiguous("Unsaturation of bonds of " + frag.getTokenEl().getValue()); } } else { if (!(AmbiguityChecker.allBondsEquivalent(bondsThatCouldBeUnsaturated) && (numToUnsaturate == 1 || numToUnsaturate == bondsThatCouldBeUnsaturated.size() - 1))){ state.addIsAmbiguous("Unsaturation of bonds of " + frag.getTokenEl().getValue()); } } } } for (int i = 0; i < numToUnsaturate; i++) { bondsThatCouldBeUnsaturated.get(i).setOrder(bondOrder); } } private static boolean isCycloAlkaneSpecialCase(Fragment frag, int numToUnsaturate, List bondsThatCouldBeUnsaturated) { if (numToUnsaturate == 1) { Bond b = bondsThatCouldBeUnsaturated.get(0); Atom a1 = b.getFromAtom(); Atom a2 = b.getToAtom(); if ((ALKANESTEM_SUBTYPE_VAL.equals(frag.getSubType()) || HETEROSTEM_SUBTYPE_VAL.equals(frag.getSubType())) && a1.getAtomIsInACycle() && a2.getAtomIsInACycle() && (a1.equals(frag.getFirstAtom()) || a2.equals(frag.getFirstAtom()))) { 
//mono unsaturated cyclo alkanes are unambiguous e.g. cyclohexene return true; } } return false; } private static boolean isCycloAlkaneHeteroatomSpecialCase(Fragment frag, int numHeteroatoms, List atomsThatCouldBeReplaced) { if (numHeteroatoms == 1) { if ((ALKANESTEM_SUBTYPE_VAL.equals(frag.getSubType()) || HETEROSTEM_SUBTYPE_VAL.equals(frag.getSubType())) && frag.getFirstAtom().getAtomIsInACycle() && atomsThatCouldBeReplaced.get(0).equals(frag.getFirstAtom())) { //single heteroatom implicitly goes to 1 position return true; } } return false; } private static class HeteroAtomSmilesAndLambda { private final String smiles; private final String lambdaConvention; public HeteroAtomSmilesAndLambda(String smiles, String lambdaConvention) { this.smiles = smiles; this.lambdaConvention = lambdaConvention; } @Override public int hashCode() { final int prime = 31; int result = 1; result = prime * result + ((lambdaConvention == null) ? 0 : lambdaConvention .hashCode()); result = prime * result + ((smiles == null) ? 
0 : smiles.hashCode()); return result; } @Override public boolean equals(Object obj) { if (this == obj) return true; if (obj == null) return false; if (getClass() != obj.getClass()) return false; HeteroAtomSmilesAndLambda other = (HeteroAtomSmilesAndLambda) obj; if (lambdaConvention == null) { if (other.lambdaConvention != null) return false; } else if (!lambdaConvention.equals(other.lambdaConvention)) return false; if (smiles == null) { if (other.smiles != null) return false; } else if (!smiles.equals(other.smiles)) return false; return true; } } private static void applyUnlocantedHeteroatoms(BuildState state, Fragment frag, List heteroatoms) throws StructureBuildingException { Map heteroatomDescriptionToCount = new HashMap<>(); for (Element heteroatomEl : heteroatoms) { String smiles = heteroatomEl.getAttributeValue(VALUE_ATR); String lambdaConvention = heteroatomEl.getAttributeValue(LAMBDA_ATR); HeteroAtomSmilesAndLambda desc = new HeteroAtomSmilesAndLambda(smiles, lambdaConvention); Integer count = heteroatomDescriptionToCount.get(desc); heteroatomDescriptionToCount.put(desc, count != null ? 
count + 1 : 1); } List atomlist = frag.getAtomList(); for (Entry entry : heteroatomDescriptionToCount.entrySet()) { HeteroAtomSmilesAndLambda desc = entry.getKey(); int replacementsRequired = entry.getValue(); Atom heteroatom = state.fragManager.getHeteroatom(desc.smiles); ChemEl heteroatomChemEl = heteroatom.getElement(); //finds an atom for which changing it to the specified heteroatom will not cause valency to be violated List atomsThatCouldBeReplaced = new ArrayList<>(); for (Atom atom : atomlist) { if (atom.getType().equals(SUFFIX_TYPE_VAL)) { continue; } if ((heteroatomChemEl.equals(atom.getElement()) && heteroatom.getCharge() == atom.getCharge())){ continue;//replacement would do nothing } if(atom.getElement() != ChemEl.C && heteroatomChemEl != ChemEl.C){ if (atom.getElement() == ChemEl.O && (heteroatomChemEl == ChemEl.S || heteroatomChemEl == ChemEl.Se || heteroatomChemEl == ChemEl.Te)) { //by special case allow replacement of oxygen by chalcogen } else{ //replacement of heteroatom by another heteroatom continue; } } if (ValencyChecker.checkValencyAvailableForReplacementByHeteroatom(atom, heteroatom)) { atomsThatCouldBeReplaced.add(atom); } } if (atomsThatCouldBeReplaced.size() < replacementsRequired){ throw new StructureBuildingException("Cannot find suitable atom for heteroatom replacement"); } if (atomsThatCouldBeReplaced.size() > replacementsRequired && !isCycloAlkaneHeteroatomSpecialCase(frag, replacementsRequired, atomsThatCouldBeReplaced)) { if (!(AmbiguityChecker.allAtomsEquivalent(atomsThatCouldBeReplaced) && (replacementsRequired == 1 || replacementsRequired == atomsThatCouldBeReplaced.size() - 1))) { //by convention cycloalkanes can have one unsaturation implicitly at the 1 locant state.addIsAmbiguous("Heteroatom replacement on " + frag.getTokenEl().getValue()); } } for (int i = 0; i < replacementsRequired; i++) { Atom atomToReplaceWithHeteroAtom = atomsThatCouldBeReplaced.get(i); state.fragManager.replaceAtomWithAtom(atomToReplaceWithHeteroAtom, 
heteroatom, true); if (desc.lambdaConvention != null) { atomToReplaceWithHeteroAtom.setLambdaConventionValency(Integer.parseInt(desc.lambdaConvention)); } } } } private static void applyIsotopeSpecifications(BuildState state, Fragment frag, List isotopeSpecifications, boolean applyLocanted) throws StructureBuildingException { for(int i = isotopeSpecifications.size() - 1; i >= 0; i--) { Element isotopeSpecification = isotopeSpecifications.get(i); IsotopeSpecification isotopeSpec = IsotopeSpecificationParser.parseIsotopeSpecification(isotopeSpecification); String[] locants = isotopeSpec.getLocants(); if(locants != null) { if (!applyLocanted) { continue; } } else if (applyLocanted) { continue; } ChemEl chemEl = isotopeSpec.getChemEl(); int isotope = isotopeSpec.getIsotope(); if(locants != null) { if (chemEl == ChemEl.H) { for (int j = 0; j < locants.length; j++) { Atom atomWithHydrogenIsotope = frag.getAtomByLocantOrThrow(locants[j]); Atom hydrogen = state.fragManager.createAtom(isotopeSpec.getChemEl(), frag); hydrogen.setIsotope(isotope); state.fragManager.createBond(atomWithHydrogenIsotope, hydrogen, 1); } } else { for (int j = 0; j < locants.length; j++) { Atom atom = frag.getAtomByLocantOrThrow(locants[j]); if (chemEl != atom.getElement()) { throw new StructureBuildingException("The atom at locant: " + locants[j] + " was not a " + chemEl.toString() ); } atom.setIsotope(isotope); } } } else { int multiplier = isotopeSpec.getMultiplier(); if (chemEl == ChemEl.H) { List parentAtomsToApplyTo = FragmentTools.findnAtomsForSubstitution(frag, multiplier, 1); if (parentAtomsToApplyTo == null){ throw new StructureBuildingException("Failed to find sufficient hydrogen atoms for unlocanted hydrogen isotope replacement"); } if (AmbiguityChecker.isSubstitutionAmbiguous(parentAtomsToApplyTo, multiplier)) { if (!casIsotopeAmbiguitySpecialCase(frag, parentAtomsToApplyTo, multiplier)) { state.addIsAmbiguous("Position of hydrogen isotope on " + frag.getTokenEl().getValue()); } } for 
(int j = 0; j < multiplier; j++) { Atom atomWithHydrogenIsotope = parentAtomsToApplyTo.get(j); Atom hydrogen = state.fragManager.createAtom(isotopeSpec.getChemEl(), frag); hydrogen.setIsotope(isotope); state.fragManager.createBond(atomWithHydrogenIsotope, hydrogen, 1); } } else { List parentAtomsToApplyTo = new ArrayList<>(); for (Atom atom : frag) { if (atom.getElement() == chemEl) { parentAtomsToApplyTo.add(atom); } } if (parentAtomsToApplyTo.size() < multiplier) { throw new StructureBuildingException("Failed to find sufficient atoms for " + chemEl.toString() + " isotope replacement"); } if (AmbiguityChecker.isSubstitutionAmbiguous(parentAtomsToApplyTo, multiplier)) { state.addIsAmbiguous("Position of isotope on " + frag.getTokenEl().getValue()); } for (int j = 0; j < multiplier; j++) { parentAtomsToApplyTo.get(j).setIsotope(isotope); } } } isotopeSpecification.detach(); } } private static boolean casIsotopeAmbiguitySpecialCase(Fragment frag, List parentAtomsToApplyTo, int multiplier) throws StructureBuildingException { if (multiplier !=1) { return false; } List atoms = frag.getAtomList(); Atom firstAtom = atoms.get(0); if (!parentAtomsToApplyTo.get(0).equals(firstAtom)) { return false; } ChemEl firstAtomEl = firstAtom.getElement(); if (atoms.size() ==2) { if (firstAtomEl == atoms.get(1).getElement()) { //e.g. ethane return true; } } else { int intraFragValency = frag.getIntraFragmentIncomingValency(firstAtom); boolean spareValency = firstAtom.hasSpareValency(); if (firstAtom.getAtomIsInACycle()) { for (int i = 1; i < atoms.size(); i++) { Atom atom = atoms.get(i); if (atom.getElement() != firstAtomEl){ return false; } if (frag.getIntraFragmentIncomingValency(atom) != intraFragValency){ return false; } if (atom.hasSpareValency() != spareValency){ return false; } } //e.g. 
benzene return true; } } return false; } static Atom findAtomForUnlocantedRadical(BuildState state, Fragment frag, OutAtom outAtom) throws StructureBuildingException { List<Atom> possibleAtoms = FragmentTools.findnAtomsForSubstitution(frag, outAtom.getAtom(), 1, outAtom.getValency(), true); if (possibleAtoms == null){ throw new StructureBuildingException("Failed to assign all unlocanted radicals to actual atoms without violating valency"); } if (!((ALKANESTEM_SUBTYPE_VAL.equals(frag.getSubType()) || HETEROSTEM_SUBTYPE_VAL.equals(frag.getSubType())) && possibleAtoms.get(0).equals(frag.getFirstAtom()))) { if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtoms, 1)) { state.addIsAmbiguous("Positioning of radical on: " + frag.getTokenEl().getValue()); } } return possibleAtoms.get(0); } private static List<Bond> findAlternativeBondsToUnSaturate(Fragment frag, int bondOrder, Collection<Bond> bondsToIgnore) { return findBondsToUnSaturate(frag, bondOrder, false, new HashSet<>(bondsToIgnore)); } /** * Finds bonds within the fragment that can have their bond order increased to the specified bond order * If allowAdjacentUnsaturatedBonds is false, bonds adjacent to higher order bonds are not returned * @param frag * @param bondOrder * @param allowAdjacentUnsaturatedBonds * @return */ static List<Bond> findBondsToUnSaturate(Fragment frag, int bondOrder, boolean allowAdjacentUnsaturatedBonds) { return findBondsToUnSaturate(frag, bondOrder, allowAdjacentUnsaturatedBonds, Collections.emptySet()); } private static List<Bond> findBondsToUnSaturate(Fragment frag, int bondOrder, boolean allowAdjacentUnsaturatedBonds, Set<Bond> bondsToIgnore) { List<Bond> bondsToUnsaturate = new ArrayList<>(); mainLoop: for (Atom atom1 : frag) { if (atom1.hasSpareValency() || SUFFIX_TYPE_VAL.equals(atom1.getType()) || atom1.getProperty(Atom.ISALDEHYDE) !=null) { continue; } List<Bond> bonds = atom1.getBonds(); int incomingValency = 0; for (Bond bond : bonds) { //don't place implicitly unsaturated bonds next to each other if (bond.getOrder() != 1 && 
!allowAdjacentUnsaturatedBonds) { continue mainLoop; } if (bondsToUnsaturate.contains(bond)) { if (!allowAdjacentUnsaturatedBonds) { continue mainLoop; } incomingValency += bondOrder; } else { incomingValency += bond.getOrder(); } } Integer maxVal = getLambdaValencyOrHwValencyOrMaxValIfCharged(atom1); if(maxVal != null && (incomingValency + (bondOrder - 1) + atom1.getOutValency()) > maxVal) { continue; } bondLoop: for (Bond bond : bonds) { if (bond.getOrder() == 1 && !bondsToUnsaturate.contains(bond) && !bondsToIgnore.contains(bond)) { Atom atom2 = bond.getOtherAtom(atom1); if (frag.getAtomByID(atom2.getID()) != null) {//check other atom is actually in the fragment! if (atom2.hasSpareValency() || SUFFIX_TYPE_VAL.equals(atom2.getType()) || atom2.getProperty(Atom.ISALDEHYDE) !=null) { continue; } int incomingValency2 = 0; for (Bond bond2 : atom2.getBonds()) { //don't place implicitly unsaturated bonds next to each other if (bond2.getOrder() != 1 && !allowAdjacentUnsaturatedBonds) { continue bondLoop; } if (bondsToUnsaturate.contains(bond2)) { if (!allowAdjacentUnsaturatedBonds) { continue bondLoop; } incomingValency2 += bondOrder; } else { incomingValency2 += bond2.getOrder(); } } Integer maxVal2 = getLambdaValencyOrHwValencyOrMaxValIfCharged(atom2); if(maxVal2 != null && (incomingValency2 + (bondOrder - 1) + atom2.getOutValency()) > maxVal2) { continue; } bondsToUnsaturate.add(bond); break bondLoop; } } } } return bondsToUnsaturate; } /** * Return the lambda convention derived valency + protons if set * Otherwise if charge is 0 returns {@link ValencyChecker#getHWValency(ChemEl)} * Otherwise return {@link ValencyChecker#getMaximumValency(ChemEl, int)} * Returns null if the maximum valency is not known * @param a * @return */ static Integer getLambdaValencyOrHwValencyOrMaxValIfCharged(Atom a) { if (a.getLambdaConventionValency() != null) { return a.getLambdaConventionValency() + a.getProtonsExplicitlyAddedOrRemoved(); } else if (a.getCharge() == 0){ return 
ValencyChecker.getHWValency(a.getElement()); } else { return ValencyChecker.getMaximumValency(a.getElement(), a.getCharge()); } } private static void performAdditiveOperations(BuildState state, Element subBracketOrRoot) throws StructureBuildingException { if (subBracketOrRoot.getAttribute(LOCANT_ATR) != null){//additive nomenclature does not employ locants return; } Element group; if (subBracketOrRoot.getName().equals(BRACKET_EL)){ group =findRightMostGroupInBracket(subBracketOrRoot); } else{ group =subBracketOrRoot.getFirstChildElement(GROUP_EL); } if (group.getAttribute(RESOLVED_ATR) != null){ return; } Fragment frag = group.getFrag(); int outAtomCount = frag.getOutAtomCount(); if (outAtomCount >=1){ if (subBracketOrRoot.getAttribute(MULTIPLIER_ATR) ==null){ Element nextSiblingEl = OpsinTools.getNextSibling(subBracketOrRoot); if (nextSiblingEl.getAttribute(MULTIPLIER_ATR) != null && (outAtomCount >= Integer.parseInt(nextSiblingEl.getAttributeValue(MULTIPLIER_ATR)) || //probably multiplicative nomenclature, should be as many outAtoms as the multiplier outAtomCount==1 && frag.getOutAtom(0).getValency()==Integer.parseInt(nextSiblingEl.getAttributeValue(MULTIPLIER_ATR))) && hasRootLikeOrMultiRadicalGroup(nextSiblingEl)){ if (outAtomCount==1){//special case e.g. 4,4'-(benzylidene)dianiline FragmentTools.splitOutAtomIntoValency1OutAtoms(frag.getOutAtom(0)); //special case where something like benzylidene is being used as if it meant benzdiyl for multiplicative nomenclature //this is allowed in the IUPAC 79 recommendations but not recommended in the current recommendations } performMultiplicativeOperations(state, group, nextSiblingEl); } else if (group.getAttribute(ISAMULTIRADICAL_ATR) != null){//additive nomenclature e.g. 
ethyleneoxy Fragment nextFrag = getNextInScopeMultiValentFragment(subBracketOrRoot); if (nextFrag != null){ Element nextMultiRadicalGroup = nextFrag.getTokenEl(); Element parentSubOrRoot = nextMultiRadicalGroup.getParent(); if (state.currentWordRule != WordRule.polymer){//imino does not behave like a substituent in polymers only as a linker if (nextMultiRadicalGroup.getAttribute(IMINOLIKE_ATR) != null){//imino/methylene can just act as normal substituents, should an additive bond really be made??? Fragment adjacentFrag = OpsinTools.getNextGroup(subBracketOrRoot).getFrag(); if (nextFrag != adjacentFrag){//imino is not the absolute next frag if (potentiallyCanSubstitute(nextMultiRadicalGroup.getParent()) || potentiallyCanSubstitute(nextMultiRadicalGroup.getParent().getParent())){ return; } } } if (group.getAttribute(IMINOLIKE_ATR) != null && levelsToWordEl(group) > levelsToWordEl(nextMultiRadicalGroup)){ return;//e.g. imino substitutes ((chloroimino)ethylene)dibenzene } } if (parentSubOrRoot.getAttribute(MULTIPLIER_ATR) != null){ throw new StructureBuildingException("Attempted to form additive bond to a multiplied component"); } group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); joinFragmentsAdditively(state, frag, nextFrag); } } else {//e.g. chlorocarbonyl or hydroxy(sulfanyl)phosphoryl List siblingFragments = findAlternativeFragments(subBracketOrRoot); if (siblingFragments.size()>0){ Fragment nextFrag = siblingFragments.get(siblingFragments.size()-1); Element nextGroup = nextFrag.getTokenEl(); if (nextGroup.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && nextGroup.getAttribute(ISAMULTIRADICAL_ATR) != null && (nextFrag.getOutAtomCount()>1|| nextGroup.getAttribute(RESOLVED_ATR) != null && nextFrag.getOutAtomCount()>=1 )){ Atom toAtom = nextFrag.getOutAtom(0).getAtom(); if (calculateSubstitutableHydrogenAtoms(toAtom) ==0){ group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); joinFragmentsAdditively(state, frag, nextFrag);//e.g. 
aminocarbonyl or aminothio } } if (group.getAttribute(RESOLVED_ATR)==null && siblingFragments.size()>1){ for (int i = 0; i< siblingFragments.size()-1; i++) { Fragment lastFrag = siblingFragments.get(i); Element lastGroup = lastFrag.getTokenEl(); if (lastGroup.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && lastGroup.getAttribute(ISAMULTIRADICAL_ATR) != null && (lastFrag.getOutAtomCount()>1|| lastGroup.getAttribute(RESOLVED_ATR) != null && lastFrag.getOutAtomCount()>=1 )){ Atom toAtom = lastFrag.getOutAtom(0).getAtom(); if (calculateSubstitutableHydrogenAtoms(toAtom) ==0){ group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); joinFragmentsAdditively(state, frag, lastFrag);//e.g. hydroxy(sulfanyl)phosphoryl } break; } //loop may continue if lastFrag was in fact completely unsubstitutable e.g. hydroxy...phosphoryloxy. The oxy is unsubstitutable as the phosphoryl will already have bonded to it if (FragmentTools.findSubstituableAtoms(lastFrag, frag.getOutAtom(outAtomCount - 1).getValency()).size() > 0) { break; } } } } } } else{// e.g. 
dimethoxyphosphoryl or bis(methylamino)phosphoryl List<Fragment> siblingFragments = findAlternativeFragments(subBracketOrRoot); if (siblingFragments.size()>0){ int multiplier = Integer.parseInt(subBracketOrRoot.getAttributeValue(MULTIPLIER_ATR)); Fragment nextFrag = siblingFragments.get(siblingFragments.size()-1); Element nextGroup = nextFrag.getTokenEl(); if (nextGroup.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && nextGroup.getAttribute(ISAMULTIRADICAL_ATR) != null && (nextFrag.getOutAtomCount()>=multiplier|| nextGroup.getAttribute(RESOLVED_ATR) != null && nextFrag.getOutAtomCount()>=multiplier +1 )){ Atom toAtom = nextFrag.getOutAtom(0).getAtom(); if (calculateSubstitutableHydrogenAtoms(toAtom) ==0){ group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); multiplyOutAndAdditivelyBond(state, subBracketOrRoot, nextFrag);//e.g. dihydroxyphosphoryl } } if (group.getAttribute(RESOLVED_ATR)==null && siblingFragments.size()>1){ for (int i = 0; i< siblingFragments.size()-1; i++) { Fragment lastFrag = siblingFragments.get(i); Element lastGroup = lastFrag.getTokenEl(); if (lastGroup.getAttribute(ACCEPTSADDITIVEBONDS_ATR) != null && lastGroup.getAttribute(ISAMULTIRADICAL_ATR) != null && (lastFrag.getOutAtomCount()>=multiplier|| lastGroup.getAttribute(RESOLVED_ATR) != null && lastFrag.getOutAtomCount()>=multiplier +1 )){ Atom toAtom = lastFrag.getOutAtom(0).getAtom(); if (calculateSubstitutableHydrogenAtoms(toAtom) ==0){ group.addAttribute(new Attribute(RESOLVED_ATR, "yes")); multiplyOutAndAdditivelyBond(state, subBracketOrRoot, lastFrag);//e.g. dihydroxyphosphoryloxy } break; } //loop may continue if lastFrag was in fact completely unsubstitutable e.g. hydroxy...phosphoryloxy. The oxy is unsubstitutable as the phosphoryl will already have bonded to it if (FragmentTools.findSubstituableAtoms(lastFrag, frag.getOutAtom(outAtomCount - 1).getValency()).size() > 0) { break; } } } } } } } /** * Searches the input for something that either is a multiRadical or has no outAtoms i.e. 
not dimethyl * @param subBracketOrRoot * @return */ private static boolean hasRootLikeOrMultiRadicalGroup(Element subBracketOrRoot) { List groups = OpsinTools.getDescendantElementsWithTagName(subBracketOrRoot, GROUP_EL); if (subBracketOrRoot.getAttribute(INLOCANTS_ATR) != null){ return true;// a terminus with specified inLocants } for (Element group : groups) { Fragment frag = group.getFrag(); int outAtomCount =frag.getOutAtomCount(); if (group.getAttribute(ISAMULTIRADICAL_ATR) != null){ if (outAtomCount >=1 ){ return true;//a multi radical } } else if (outAtomCount ==0 && group.getAttribute(RESOLVED_ATR)==null){ return true;// a terminus } } return false; } /** * Multiply out subOrBracket and additively bond all substituents to the specified fragment * @param state * @param subOrBracket * @param fragToAdditivelyBondTo * @throws StructureBuildingException */ private static void multiplyOutAndAdditivelyBond(BuildState state, Element subOrBracket, Fragment fragToAdditivelyBondTo) throws StructureBuildingException { int multiplier = Integer.parseInt(subOrBracket.getAttributeValue(MULTIPLIER_ATR)); subOrBracket.removeAttribute(subOrBracket.getAttribute(MULTIPLIER_ATR)); List clonedElements = new ArrayList<>(); List elementsNotToBeMultiplied = new ArrayList<>();//anything before the multiplier in the sub/bracket for (int i = multiplier -1; i >=0; i--) { Element currentElement; if (i != 0){ currentElement = state.fragManager.cloneElement(state, subOrBracket, i); addPrimesToLocantedStereochemistryElements(currentElement, StringTools.multiplyString("'", i));//Stereochemistry elements with locants will need to have their locants primed (stereochemistry is only processed after structure building) clonedElements.add(currentElement); } else{ currentElement = subOrBracket; Element multiplierEl = subOrBracket.getFirstChildElement(MULTIPLIER_EL); if (multiplierEl ==null){ throw new StructureBuildingException("Multiplier not found where multiplier expected"); } for (int j = 
subOrBracket.indexOf(multiplierEl) -1 ; j >=0 ; j--) { Element el = subOrBracket.getChild(j); el.detach(); elementsNotToBeMultiplied.add(el); } multiplierEl.detach(); } Element group; if (currentElement.getName().equals(BRACKET_EL)){ group = findRightMostGroupInBracket(currentElement); } else{ group = currentElement.getFirstChildElement(GROUP_EL); } Fragment frag = group.getFrag(); if (frag.getOutAtomCount() != 1 ){ throw new StructureBuildingException("Additive bond formation failure: Fragment expected to have one OutAtom in this case but had: "+ frag.getOutAtomCount()); } joinFragmentsAdditively(state, frag, fragToAdditivelyBondTo); } for (Element clone : clonedElements) {//make sure cloned substituents don't substitute onto each other! OpsinTools.insertAfter(subOrBracket, clone); } for (Element el : elementsNotToBeMultiplied) {//re-add anything before multiplier to original subOrBracket subOrBracket.insertChild(el, 0); } } /** * Creates a build results from the input group for use as the input to the real performMultiplicativeOperations function * @param state * @param group * @param multipliedParent * @throws StructureBuildingException */ private static void performMultiplicativeOperations(BuildState state, Element group, Element multipliedParent) throws StructureBuildingException{ BuildResults multiRadicalBR = new BuildResults(group.getParent()); performMultiplicativeOperations(state, multiRadicalBR, multipliedParent); } private static void performMultiplicativeOperations(BuildState state, BuildResults multiRadicalBR, Element multipliedParent) throws StructureBuildingException { int multiplier = Integer.parseInt(multipliedParent.getAttributeValue(MULTIPLIER_ATR)); if (multiplier != multiRadicalBR.getOutAtomCount()){ if (multiRadicalBR.getOutAtomCount() == multiplier*2){ //TODO substituents like nitrilo can have their outatoms combined } if (multiplier != multiRadicalBR.getOutAtomCount()){ throw new StructureBuildingException("Multiplication bond formation 
failure: number of outAtoms disagree with multiplier(multiplier: " + multiplier + ", outAtom count: " + multiRadicalBR.getOutAtomCount()+ ")"); } } if (LOG.isTraceEnabled()){ LOG.trace(multiplier +" multiplicative bonds to be formed"); } multipliedParent.removeAttribute(multipliedParent.getAttribute(MULTIPLIER_ATR)); List inLocants = null; String inLocantsString = multipliedParent.getAttributeValue(INLOCANTS_ATR); if (inLocantsString != null){//true for the root of a multiplicative name if (inLocantsString.equals(INLOCANTS_DEFAULT)){ inLocants = new ArrayList<>(multiplier); for (int i = 0; i < multiplier; i++) { inLocants.add(INLOCANTS_DEFAULT); } } else{ inLocants = StringTools.arrayToList(inLocantsString.split(",")); if (inLocants.size() != multiplier){ throw new StructureBuildingException("Mismatch between multiplier and number of inLocants in multiplicative nomenclature"); } } } List clonedElements = new ArrayList<>(); BuildResults newBr = new BuildResults(); for (int i = multiplier -1; i >=0; i--) { Element multipliedElement; if (i != 0){ multipliedElement = state.fragManager.cloneElement(state, multipliedParent, i); addPrimesToLocantedStereochemistryElements(multipliedElement, StringTools.multiplyString("'", i));//Stereochemistry elements with locants will need to have their locants primed (stereochemistry is only processed after structure building) clonedElements.add(multipliedElement); } else{ multipliedElement = multipliedParent; } //determine group that will be additively bonded to Element multipliedGroup; if (multipliedElement.getName().equals(BRACKET_EL)) { multipliedGroup = getFirstMultiValentGroup(multipliedElement); if (multipliedGroup == null){//root will not have a multivalent group List groups = OpsinTools.getDescendantElementsWithTagName(multipliedElement, GROUP_EL); if (inLocants == null){ throw new StructureBuildingException("OPSIN Bug? 
in locants must be specified for a multiplied root in multiplicative nomenclature"); } if (inLocants.get(0).equals(INLOCANTS_DEFAULT)){ multipliedGroup = groups.get(groups.size() - 1); } else{ groupLoop: for (int j = groups.size()-1; j >=0; j--) { Fragment possibleFrag = groups.get(j).getFrag(); for (String locant : inLocants) { if (possibleFrag.hasLocant(locant)){ multipliedGroup = groups.get(j); break groupLoop; } } } } if (multipliedGroup == null){ throw new StructureBuildingException("Locants for inAtoms on the root were either misassigned to the root or were invalid: " + inLocants.toString() +" could not be assigned!"); } } } else{ multipliedGroup = multipliedElement.getFirstChildElement(GROUP_EL); } Fragment multipliedFrag = multipliedGroup.getFrag(); OutAtom multiRadicalOutAtom = multiRadicalBR.getOutAtom(i); Fragment multiRadicalFrag = multiRadicalOutAtom.getAtom().getFrag(); Element multiRadicalGroup = multiRadicalFrag.getTokenEl(); if (multiRadicalGroup.getAttribute(RESOLVED_ATR) == null){ resolveUnLocantedFeatures(state, multiRadicalGroup.getParent());//the addition of unlocanted unsaturators can affect the position of radicals e.g. diazenyl multiRadicalGroup.addAttribute(new Attribute(RESOLVED_ATR, "yes")); } boolean substitutivelyBondedToRoot = false; if (inLocants != null) { Element rightMostGroup; if (multipliedElement.getName().equals(BRACKET_EL)) { rightMostGroup = findRightMostGroupInBracket(multipliedElement); } else{ rightMostGroup = multipliedElement.getFirstChildElement(GROUP_EL); } rightMostGroup.addAttribute(new Attribute(RESOLVED_ATR, "yes"));//this group will not be used further within this word but can in principle be a substituent e.g. methylenedisulfonyl dichloride if (multipliedGroup.getAttribute(ISAMULTIRADICAL_ATR) != null) {//e.g. 
methylenedisulfonyl dichloride if (!multipliedParent.getAttributeValue(INLOCANTS_ATR).equals(INLOCANTS_DEFAULT)) { throw new StructureBuildingException("inLocants should not be specified for a multiradical parent in multiplicative nomenclature"); } } else{ Atom from = multiRadicalOutAtom.getAtom(); int bondOrder = multiRadicalOutAtom.getValency(); //bonding will be substitutive rather than additive as this is bonding to a root Atom atomToJoinTo = null; for (int j = inLocants.size() -1; j >=0; j--) { String locant = inLocants.get(j); if (locant.equals(INLOCANTS_DEFAULT)){//note that if one entry in inLocants is default then they all are "default" List<Atom> possibleAtoms = getPossibleAtomsForUnlocantedConnectionToMultipliedRoot(multipliedGroup, bondOrder, i); if (possibleAtoms.isEmpty()) { throw new StructureBuildingException("No suitable atom found for multiplicative operation"); } if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtoms, 1)) { state.addIsAmbiguous("Connection to multiplied group: " + multipliedGroup.getValue()); } atomToJoinTo = possibleAtoms.get(0); inLocants.remove(j); break; } else{ Atom inAtom = multipliedFrag.getAtomByLocant(locant); if (inAtom != null) { atomToJoinTo = inAtom; inLocants.remove(j); break; } } } if (atomToJoinTo == null){ throw new StructureBuildingException("Locants for inAtoms on the root were either misassigned to the root or were invalid: " + inLocants.toString() +" could not be assigned!"); } if (!multiRadicalOutAtom.isSetExplicitly()) {//not set explicitly so may be an inappropriate atom from = findAtomForUnlocantedRadical(state, from.getFrag(), multiRadicalOutAtom); } multiRadicalFrag.removeOutAtom(multiRadicalOutAtom); state.fragManager.createBond(from, atomToJoinTo, bondOrder); if (LOG.isTraceEnabled()){ LOG.trace("Substitutively bonded (multiplicative to root) " + from.getID() + " (" + from.getFrag().getTokenEl().getValue() + ") " + atomToJoinTo.getID() + " (" + atomToJoinTo.getFrag().getTokenEl().getValue() + ")"); } 
substitutivelyBondedToRoot = true; } } if (!substitutivelyBondedToRoot) { joinFragmentsAdditively(state, multiRadicalFrag, multipliedFrag); } if (multipliedElement.getName().equals(BRACKET_EL)) { recursivelyResolveUnLocantedFeatures(state, multipliedElement);//there may be outAtoms that are involved in unlocanted substitution, these can be safely used now e.g. ...bis((3-hydroxy-4-methoxyphenyl)methylene) where (3-hydroxy-4-methoxyphenyl)methylene is the currentElement } if (inLocants == null) { //currentElement is not a root element. Need to build up a new BuildResults so as to call performMultiplicativeOperations again //at this stage an outAtom has been removed from the fragment within currentElement through an additive bond newBr.mergeBuildResults(new BuildResults(multipliedElement)); } } if (newBr.getFragmentCount() == 1) { throw new StructureBuildingException("Multiplicative nomenclature cannot yield only one temporary terminal fragment"); } if (newBr.getFragmentCount() >= 2) { List siblings = OpsinTools.getNextSiblingsOfTypes(multipliedParent, new String[]{SUBSTITUENT_EL, BRACKET_EL, ROOT_EL}); if (siblings.isEmpty()) { Element parentOfMultipliedEl = multipliedParent.getParent(); if (parentOfMultipliedEl.getName().equals(BRACKET_EL)) {//brackets are allowed siblings = OpsinTools.getNextSiblingsOfTypes(parentOfMultipliedEl, new String[]{SUBSTITUENT_EL, BRACKET_EL, ROOT_EL}); if (siblings.get(0).getAttribute(MULTIPLIER_ATR) == null) { throw new StructureBuildingException("Multiplier not found where multiplier was expected for successful multiplicative nomenclature"); } performMultiplicativeOperations(state, newBr, siblings.get(0)); } else{ throw new StructureBuildingException("Could not find suitable element to continue multiplicative nomenclature"); } } else{ if (siblings.get(0).getAttribute(MULTIPLIER_ATR) == null) { throw new StructureBuildingException("Multiplier not found where multiplier was expected for successful multiplicative nomenclature"); } 
                performMultiplicativeOperations(state, newBr, siblings.get(0));
            }
        }

        for (Element clone : clonedElements) {//only insert cloned substituents now so they don't substitute onto each other!
            OpsinTools.insertAfter(multipliedParent, clone);
        }
    }

    /**
     * Applies special case to prefer the end of chains with the usableAsAJoiner attribute cf. p-phenylenedipropionic acid
     * Such cases will still be considered to be formally ambiguous
     * @param multipliedGroup the group element whose fragment is to be connected to
     * @param bondOrder the order of the bond to be formed
     * @param primesAdded the number of primes appended to locants when looking up the end of the chain
     * @return the candidate atoms for the connection
     * @throws StructureBuildingException
     */
    private static List<Atom> getPossibleAtomsForUnlocantedConnectionToMultipliedRoot(Element multipliedGroup, int bondOrder, int primesAdded) throws StructureBuildingException {
        Fragment multipliedFrag = multipliedGroup.getFrag();
        if ("yes".equals(multipliedGroup.getAttributeValue(USABLEASJOINER_ATR)) && multipliedFrag.getDefaultInAtom() == null) {
            Element previous = OpsinTools.getPrevious(multipliedGroup);
            if (previous != null && previous.getName().equals(MULTIPLIER_EL)){
                String locant = getLocantOfEndOfChainIfGreaterThan1(multipliedFrag, primesAdded);
                if (locant != null) {
                    Atom preferredAtom = multipliedFrag.getAtomByLocantOrThrow(locant);
                    List<Atom> possibleAtoms = FragmentTools.findnAtomsForSubstitution(multipliedFrag.getAtomList(), preferredAtom, 1, bondOrder, true);
                    if (possibleAtoms == null) {
                        possibleAtoms = Collections.emptyList();
                    }
                    return possibleAtoms;
                }
            }
        }
        return FragmentTools.findSubstituableAtoms(multipliedFrag, bondOrder);
    }

    private static String getLocantOfEndOfChainIfGreaterThan1(Fragment frag, int primes) {
        String primesStr = StringTools.multiplyString("'", primes);
        int length = 0;
        Atom next = frag.getAtomByLocant(Integer.toString(length + 1) + primesStr);
        Atom previous = null;
        while (next != null){
            if (previous != null && previous.getBondToAtom(next) == null){
                break;
            }
            length++;
            previous = next;
            next = frag.getAtomByLocant(Integer.toString(length + 1) + primesStr);
        }
        if (length > 1){
            return
Integer.toString(length) + primesStr;
        }
        return null;
    }

    /**
     * Given a substituent/bracket finds the next multivalent substituent/root that is in scope and hence its group
     * e.g. for oxy(dichloromethyl)methylene given oxy substituent the methylene group would be found
     * for oxy(dichloroethylene) given oxy substituent the ethylene group would be found
     * for oxy(carbonylimino) given oxy, carbonyl would be found
     * @param substituentOrBracket
     * @return frag
     * @throws StructureBuildingException
     */
    private static Fragment getNextInScopeMultiValentFragment(Element substituentOrBracket) throws StructureBuildingException {
        if (!substituentOrBracket.getName().equals(SUBSTITUENT_EL) && !substituentOrBracket.getName().equals(BRACKET_EL)){
            throw new StructureBuildingException("Input to this function should be a substituent or bracket");
        }
        if (substituentOrBracket.getParent()==null){
            throw new StructureBuildingException("substituent did not have a parent!");
        }
        Element parent = substituentOrBracket.getParent();

        List<Element> children = OpsinTools.getChildElementsWithTagNames(parent, new String[]{SUBSTITUENT_EL, BRACKET_EL, ROOT_EL});//will be returned in index order
        int indexOfSubstituent =parent.indexOf(substituentOrBracket);
        for (Element child : children) {
            if (parent.indexOf(child) <=indexOfSubstituent){//only want things after the input
                continue;
            }
            if (child.getAttribute(MULTIPLIER_ATR) != null){
                continue;
            }
            List<Element> childDescendants;
            if (child.getName().equals(BRACKET_EL)){
                childDescendants = OpsinTools.getDescendantElementsWithTagNames(child, new String[]{SUBSTITUENT_EL, ROOT_EL});//will be returned in depth-first order
            }
            else{
                childDescendants =new ArrayList<>();
                childDescendants.add(child);
            }
            for (Element descendantChild : childDescendants) {
                Element group = descendantChild.getFirstChildElement(GROUP_EL);
                if (group == null){
                    throw new StructureBuildingException("substituent/root is missing its group");
                }
                Fragment possibleFrag = group.getFrag();
                if (group.getAttribute(ISAMULTIRADICAL_ATR)
!= null && (possibleFrag.getOutAtomCount() >=2 || (possibleFrag.getOutAtomCount() >=1 && group.getAttribute(RESOLVED_ATR) != null ))){ return possibleFrag; } } } return null; } /** * Given a bracket searches in a depth first manner for the first multi valent group * @param bracket * @return group * @throws StructureBuildingException */ private static Element getFirstMultiValentGroup(Element bracket) throws StructureBuildingException { if (!bracket.getName().equals(BRACKET_EL)){ throw new StructureBuildingException("Input to this function should be a bracket"); } List groups = OpsinTools.getDescendantElementsWithTagName(bracket, GROUP_EL);//will be returned in index order for (Element group : groups) { Fragment possibleFrag = group.getFrag(); if (group.getAttribute(ISAMULTIRADICAL_ATR) != null && (possibleFrag.getOutAtomCount() >=2 || (possibleFrag.getOutAtomCount() >=1 && group.getAttribute(RESOLVED_ATR) != null ))){ return group; } } return null; } private static void joinFragmentsAdditively(BuildState state, Fragment fragToBeJoined, Fragment parentFrag) throws StructureBuildingException { Element elOfFragToBeJoined = fragToBeJoined.getTokenEl(); if (EPOXYLIKE_SUBTYPE_VAL.equals(elOfFragToBeJoined.getAttributeValue(SUBTYPE_ATR))){ for (int i = 0, l = fragToBeJoined.getOutAtomCount(); i < l; i++) { OutAtom outAtom = fragToBeJoined.getOutAtom(i); if (outAtom.getLocant() != null){ throw new StructureBuildingException("Inappropriate use of " + elOfFragToBeJoined.getValue()); } } } int outAtomCountOnFragToBeJoined = fragToBeJoined.getOutAtomCount(); if (outAtomCountOnFragToBeJoined ==0){ throw new StructureBuildingException("Additive bond formation failure: Fragment expected to have at least one OutAtom but had none"); } if (parentFrag.getOutAtomCount() == 0){ throw new StructureBuildingException("Additive bond formation failure: Fragment expected to have at least one OutAtom but had none"); } OutAtom in = null; if (parentFrag.getOutAtomCount() > 1){ int 
firstOutAtomOrder = parentFrag.getOutAtom(0).getValency(); boolean unresolvedAmbiguity =false; for (int i = 1, l = parentFrag.getOutAtomCount(); i < l; i++) { OutAtom outAtom = parentFrag.getOutAtom(i); if (outAtom.getValency() != firstOutAtomOrder){ unresolvedAmbiguity =true; } } if (unresolvedAmbiguity){//not all outAtoms on parent equivalent firstOutAtomOrder = fragToBeJoined.getOutAtom(0).getValency(); unresolvedAmbiguity =false; for (int i = 1, l = fragToBeJoined.getOutAtomCount(); i < l; i++) { OutAtom outAtom = fragToBeJoined.getOutAtom(i); if (outAtom.getValency() != firstOutAtomOrder){ unresolvedAmbiguity =true; } } if (unresolvedAmbiguity && outAtomCountOnFragToBeJoined == 2){//not all outAtoms on frag to be joined are equivalent either! //Solves the specific case of 2,2'-[ethane-1,2-diylbis(azanylylidenemethanylylidene)]diphenol vs 2,2'-[ethane-1,2-diylidenebis(azanylylidenemethanylylidene)]bis(cyclohexan-1-ol) //but does not solve the general case as only a single look behind is performed. 
Element previousGroup = OpsinTools.getPreviousGroup(elOfFragToBeJoined); if (previousGroup != null){ Fragment previousFrag = previousGroup.getFrag(); if (previousFrag.getOutAtomCount() > 1){ int previousGroupFirstOutAtomOrder = previousFrag.getOutAtom(0).getValency(); unresolvedAmbiguity =false; for (int i = 1, l = previousFrag.getOutAtomCount(); i < l; i++) { OutAtom outAtom = previousFrag.getOutAtom(i); if (outAtom.getValency() != previousGroupFirstOutAtomOrder){ unresolvedAmbiguity =true; } } if (!unresolvedAmbiguity && previousGroupFirstOutAtomOrder==parentFrag.getOutAtom(0).getValency()){ for (int i = 1, l = parentFrag.getOutAtomCount(); i < l; i++) { OutAtom outAtom = parentFrag.getOutAtom(i); if (outAtom.getValency() != previousGroupFirstOutAtomOrder){ in = outAtom; break; } } } } } } else{ for (int i = 0, l = parentFrag.getOutAtomCount(); i < l; i++) { OutAtom outAtom = parentFrag.getOutAtom(i); if (outAtom.getValency()==firstOutAtomOrder){ in = outAtom; break; } } } } } if (in==null){ in = parentFrag.getOutAtom(0); } Atom to = in.getAtom(); int bondOrder = in.getValency(); if (!in.isSetExplicitly()){//not set explicitly so may be an inappropriate atom to = findAtomForUnlocantedRadical(state, to.getFrag(), in); } parentFrag.removeOutAtom(in); OutAtom out =null; for (int i =outAtomCountOnFragToBeJoined -1; i>=0; i--) { if (fragToBeJoined.getOutAtom(i).getValency() == bondOrder){ out = fragToBeJoined.getOutAtom(i); break; } } if (out ==null){ if (outAtomCountOnFragToBeJoined >=bondOrder){//handles cases like nitrilo needing to be -N= (remove later outAtoms first as per usual) int valency =0; Atom lastOutAtom = fragToBeJoined.getOutAtom(outAtomCountOnFragToBeJoined -1).getAtom(); for (int i =outAtomCountOnFragToBeJoined -1; i >= 0; i--) { OutAtom nextOutAtom = fragToBeJoined.getOutAtom(i); if (nextOutAtom.getAtom() != lastOutAtom){ throw new StructureBuildingException("Additive bond formation failure: bond order disagreement"); } valency += 
nextOutAtom.getValency(); if (valency==bondOrder){ nextOutAtom.setValency(valency); out = nextOutAtom; break; } fragToBeJoined.removeOutAtom(nextOutAtom); } if (out==null){ throw new StructureBuildingException("Additive bond formation failure: bond order disagreement"); } } else{ throw new StructureBuildingException("Additive bond formation failure: bond order disagreement"); } } Atom from = out.getAtom(); if (!out.isSetExplicitly()){//not set explicitly so may be an inappropriate atom from = findAtomForUnlocantedRadical(state, from.getFrag(), out); } fragToBeJoined.removeOutAtom(out); state.fragManager.createBond(from, to, bondOrder); if (LOG.isTraceEnabled()){ LOG.trace("Additively bonded " + from.getID() + " (" + from.getFrag().getTokenEl().getValue() + ") " + to.getID() + " (" + to.getFrag().getTokenEl().getValue() + ")" ); } } private static void joinFragmentsSubstitutively(BuildState state, Fragment fragToBeJoined, Atom atomToJoinTo) throws StructureBuildingException { Element elOfFragToBeJoined = fragToBeJoined.getTokenEl(); if (EPOXYLIKE_SUBTYPE_VAL.equals(elOfFragToBeJoined.getAttributeValue(SUBTYPE_ATR))){ formEpoxide(state, fragToBeJoined, atomToJoinTo); return; } int outAtomCount = fragToBeJoined.getOutAtomCount(); if (outAtomCount >1){ throw new StructureBuildingException("Substitutive bond formation failure: Fragment expected to have one OutAtom but had: "+ outAtomCount); } if (outAtomCount ==0 ){ throw new StructureBuildingException("Substitutive bond formation failure: Fragment expected to have one OutAtom but had none"); } if (elOfFragToBeJoined.getAttribute(IMINOLIKE_ATR) != null){//special case for methylene/imino if (fragToBeJoined.getOutAtomCount()==1 && fragToBeJoined.getOutAtom(0).getValency()==1 ){ fragToBeJoined.getOutAtom(0).setValency(2); } } OutAtom out = fragToBeJoined.getOutAtom(0); Atom from = out.getAtom(); int bondOrder = out.getValency(); if (!out.isSetExplicitly()){//not set explicitly so may be an inappropriate atom List 
possibleAtoms = FragmentTools.findnAtomsForSubstitution(fragToBeJoined.getAtomList(), from, 1, bondOrder, false); if (possibleAtoms == null){ throw new StructureBuildingException("Failed to assign all unlocanted radicals to actual atoms without violating valency"); } if (!((ALKANESTEM_SUBTYPE_VAL.equals(fragToBeJoined.getSubType()) || HETEROSTEM_SUBTYPE_VAL.equals(fragToBeJoined.getSubType())) && possibleAtoms.get(0).equals(fragToBeJoined.getFirstAtom()))) { if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtoms, 1)) { state.addIsAmbiguous("Positioning of radical on: " + fragToBeJoined.getTokenEl().getValue()); } } from = possibleAtoms.get(0); } fragToBeJoined.removeOutAtom(out); state.fragManager.createBond(from, atomToJoinTo, bondOrder); if (LOG.isTraceEnabled()){ LOG.trace("Substitutively bonded " + from.getID() + " (" + from.getFrag().getTokenEl().getValue() + ") " + atomToJoinTo.getID() + " (" + atomToJoinTo.getFrag().getTokenEl().getValue() + ")"); } } /** * Forms a bridge using the given fragment. 
* The bridgingFragment's outAtoms locants or a combination of the atomToJoinTo and a suitable atom * are used to decide what atoms to form the bridge between * @param state * @param bridgingFragment * @param atomToJoinTo * @return Atoms that the bridgingFragment attached to * @throws StructureBuildingException */ static Atom[] formEpoxide(BuildState state, Fragment bridgingFragment, Atom atomToJoinTo) throws StructureBuildingException { Fragment fragToJoinTo = atomToJoinTo.getFrag(); List atomList = fragToJoinTo.getAtomList(); if (atomList.size()==1){ throw new StructureBuildingException("Epoxides must be formed between two different atoms"); } Atom firstAtomToJoinTo; if (bridgingFragment.getOutAtom(0).getLocant() != null){ firstAtomToJoinTo = fragToJoinTo.getAtomByLocantOrThrow(bridgingFragment.getOutAtom(0).getLocant()); } else{ firstAtomToJoinTo = atomToJoinTo; } OutAtom outAtom1 = bridgingFragment.getOutAtom(0); bridgingFragment.removeOutAtom(0); //In epoxy chalcogenAtom1 will be chalcogenAtom2. Methylenedioxy is also handled by this method state.fragManager.createBond(outAtom1.getAtom(), firstAtomToJoinTo, outAtom1.getValency()); Atom secondAtomToJoinTo; if (bridgingFragment.getOutAtom(0).getLocant() != null){ secondAtomToJoinTo = fragToJoinTo.getAtomByLocantOrThrow(bridgingFragment.getOutAtom(0).getLocant()); } else{ int index = atomList.indexOf(firstAtomToJoinTo); Atom preferredAtom = (index + 1 >= atomList.size()) ? 
atomList.get(index - 1) : atomList.get(index + 1); List possibleSecondAtom = FragmentTools.findnAtomsForSubstitution(fragToJoinTo.getAtomList(), preferredAtom, 1, 1, true); if (possibleSecondAtom != null) { possibleSecondAtom.removeAll(Collections.singleton(firstAtomToJoinTo)); } if (possibleSecondAtom == null || possibleSecondAtom.isEmpty()) { throw new StructureBuildingException("Unable to find suitable atom to form bridge"); } if (AmbiguityChecker.isSubstitutionAmbiguous(possibleSecondAtom, 1)) { state.addIsAmbiguous("Addition of bridge to: "+ fragToJoinTo.getTokenEl().getValue()); } secondAtomToJoinTo = possibleSecondAtom.get(0); } OutAtom outAtom2 = bridgingFragment.getOutAtom(0); bridgingFragment.removeOutAtom(0); if (outAtom1.getAtom().equals(outAtom2.getAtom()) && firstAtomToJoinTo == secondAtomToJoinTo){ throw new StructureBuildingException("Epoxides must be formed between two different atoms"); } int bondValency = outAtom2.getValency(); if (outAtom2.getAtom().hasSpareValency() && !secondAtomToJoinTo.hasSpareValency()) { //bridging groups like azeno are treated as aromatic so that it is not fixed as to which of the two bonds is the double bond //if connected to a saturated group though, one of them must be a double bond bondValency = 2; } state.fragManager.createBond(outAtom2.getAtom(), secondAtomToJoinTo, bondValency); CycleDetector.assignWhetherAtomsAreInCycles(bridgingFragment); return new Atom[]{firstAtomToJoinTo, secondAtomToJoinTo}; } /** * Attempts to find an in-scope fragment capable of forming the given numberOfSubstitutions each with the given bondOrder * @param subOrBracket * @param numberOfSubstitutions * @param bondOrder * @return */ private static List findAtomsForSubstitution(Element subOrBracket, int numberOfSubstitutions, int bondOrder) { FindAlternativeGroupsResult results = findAlternativeGroups(subOrBracket); List substitutableAtoms = findAtomsForSubstitution(results.groups, numberOfSubstitutions, bondOrder, true); if 
(substitutableAtoms != null) { return substitutableAtoms; } substitutableAtoms = findAtomsForSubstitution(results.groups, numberOfSubstitutions, bondOrder, false); if (substitutableAtoms != null) { return substitutableAtoms; } substitutableAtoms = findAtomsForSubstitution(results.groupsSubstitutionUnlikely, numberOfSubstitutions, bondOrder, true); if (substitutableAtoms != null) { return substitutableAtoms; } substitutableAtoms = findAtomsForSubstitution(results.groupsSubstitutionUnlikely, numberOfSubstitutions, bondOrder, false); return substitutableAtoms; } private static List findAtomsForSubstitution(List possibleParents, int numberOfSubstitutions, int bondOrder, boolean preserveValency) { boolean rootHandled = false; for (int i = 0, l = possibleParents.size(); i < l; i++) { Element possibleParent = possibleParents.get(i); Fragment frag = possibleParent.getFrag(); List substitutableAtoms; if (possibleParent.getParent().getName().equals(ROOT_EL)){//consider all root groups as if they were one if(rootHandled) { continue; } List atoms = frag.getAtomList(); for (int j = i + 1; j < l; j++) { Element possibleOtherRoot = possibleParents.get(j); if (possibleOtherRoot.getParent().getName().equals(ROOT_EL)) { atoms.addAll(possibleOtherRoot.getFrag().getAtomList()); } } rootHandled = true; substitutableAtoms = FragmentTools.findnAtomsForSubstitution(atoms, frag.getDefaultInAtom(), numberOfSubstitutions, bondOrder, true, preserveValency); } else{ substitutableAtoms = FragmentTools.findnAtomsForSubstitution(frag.getAtomList(), frag.getDefaultInAtom(), numberOfSubstitutions, bondOrder, true, preserveValency); } if (substitutableAtoms != null){ return substitutableAtoms; } } return null; } /** * Finds all the fragments accessible from the startingElement taking into account brackets * i.e. 
those that the group of the startingElement could feasibly substitute onto
     * @param startingElement
     * @return A list of fragments in the order to try them as possible parent fragments (for substitutive operations)
     */
    static List<Fragment> findAlternativeFragments(Element startingElement) {
        List<Fragment> foundFragments = new ArrayList<>();
        FindAlternativeGroupsResult results = findAlternativeGroups(startingElement);
        for (Element group : results.groups) {
            foundFragments.add(group.getFrag());
        }
        for (Element group : results.groupsSubstitutionUnlikely) {
            foundFragments.add(group.getFrag());
        }
        return foundFragments;
    }

    /**
     * Finds all the groups accessible from the startingElement taking into account brackets
     * i.e. those that the group of the startingElement could feasibly substitute onto
     * (locanting onto bracketed groups is unlikely so these are kept separate in the results object)
     * @param startingElement
     * @return An object containing the groups in the order to try them as possible parent groups (for substitutive operations)
     */
    static FindAlternativeGroupsResult findAlternativeGroups(Element startingElement) {
        Deque<AlternativeGroupFinderState> stack = new ArrayDeque<>();
        stack.add(new AlternativeGroupFinderState(startingElement.getParent(), false));
        List<Element> groups = new ArrayList<>();
        List<Element> groupsSubstitutionUnlikely = new ArrayList<>();//locanting into brackets is rarely the desired answer so keep these separate
        boolean doneFirstIteration = false;//check on index only done on first iteration to only get elements with an index greater than the starting element
        while (stack.size() > 0) {
            AlternativeGroupFinderState state = stack.removeLast();
            Element currentElement = state.el;
            boolean substitutionUnlikely = state.substitutionUnlikely;
            if (currentElement.getName().equals(GROUP_EL)) {
                if (substitutionUnlikely) {
                    groupsSubstitutionUnlikely.add(currentElement);
                }
                else {
                    groups.add(currentElement);
                }
                continue;
            }
            List<Element> siblings = OpsinTools.getChildElementsWithTagNames(currentElement, new
String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); for (Element bracketOrSubOrRoot : siblings) { if (!doneFirstIteration && currentElement.indexOf(bracketOrSubOrRoot) <= currentElement.indexOf(startingElement)){ continue; } if (bracketOrSubOrRoot.getAttribute(MULTIPLIER_ATR) != null){ continue; } boolean substitutionUnlikelyForThisEl = substitutionUnlikely; if (bracketOrSubOrRoot.getName().equals(BRACKET_EL)){ if (!IMPLICIT_TYPE_VAL.equals(bracketOrSubOrRoot.getAttributeValue(TYPE_ATR))) { substitutionUnlikelyForThisEl = true; } stack.add(new AlternativeGroupFinderState(bracketOrSubOrRoot, substitutionUnlikelyForThisEl)); } else{ if (bracketOrSubOrRoot.getAttribute(LOCANT_ATR) != null) { substitutionUnlikelyForThisEl = true; } Element group = bracketOrSubOrRoot.getFirstChildElement(GROUP_EL); stack.add(new AlternativeGroupFinderState(group, substitutionUnlikelyForThisEl)); } } doneFirstIteration = true; } return new FindAlternativeGroupsResult(groups, groupsSubstitutionUnlikely); } private static class AlternativeGroupFinderState { private final Element el; private final boolean substitutionUnlikely; AlternativeGroupFinderState(Element el, boolean substitutionUnlikely) { this.el = el; this.substitutionUnlikely = substitutionUnlikely; } } private static class FindAlternativeGroupsResult { private final List groups; private final List groupsSubstitutionUnlikely; FindAlternativeGroupsResult(List groups, List groupsSubstitutionUnlikely) { this.groups = groups; this.groupsSubstitutionUnlikely = groupsSubstitutionUnlikely; } } /** * Checks through the groups accessible from the currentElement taking into account brackets * i.e. 
those that it is feasible that the group of the currentElement could substitute onto * @param startingElement * @param locant: the locant string to check for the presence of * @return The fragment with the locant, or null * @throws StructureBuildingException */ private static Fragment findFragmentWithLocant(Element startingElement, String locant) throws StructureBuildingException { Deque stack = new ArrayDeque<>(); stack.add(startingElement.getParent()); boolean doneFirstIteration = false;//check on index only done on first iteration to only get elements with an index greater than the starting element Fragment monoNuclearHydride = null;//e.g. methyl/methane - In this case no locant would be expected as unlocanted substitution is always unambiguous. Hence deprioritise while (stack.size() > 0) { Element currentElement = stack.removeLast(); if (currentElement.getName().equals(SUBSTITUENT_EL) || currentElement.getName().equals(ROOT_EL)) { Fragment groupFrag = currentElement.getFirstChildElement(GROUP_EL).getFrag(); if (monoNuclearHydride != null && currentElement.getAttribute(LOCANT_ATR) != null) {//It looks like all groups are locanting onto the monoNuclearHydride e.g. 
1-oxo-1-phenyl-sulfanylidene return monoNuclearHydride; } if (groupFrag.hasLocant(locant)) { if (locant.equals("1") && groupFrag.getAtomCount() == 1) { if (monoNuclearHydride == null) { monoNuclearHydride = groupFrag; } } else{ return groupFrag; } } continue; } else if (monoNuclearHydride != null) { return monoNuclearHydride; } List siblings = OpsinTools.getChildElementsWithTagNames(currentElement, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); List bracketted = new ArrayList<>(); if (!doneFirstIteration) {//on the first iteration, ignore elements before the starting element and favour the element directly after the starting element (conditions apply) int indexOfStartingEl = currentElement.indexOf(startingElement); Element substituentToTryFirst = null; for (Element bracketOrSubOrRoot : siblings) { int indexOfCurrentEl = currentElement.indexOf(bracketOrSubOrRoot); if (indexOfCurrentEl <= indexOfStartingEl) { continue; } if (bracketOrSubOrRoot.getAttribute(MULTIPLIER_ATR) != null) { continue; } if (bracketOrSubOrRoot.getName().equals(BRACKET_EL)) { if (IMPLICIT_TYPE_VAL.equals(bracketOrSubOrRoot.getAttributeValue(TYPE_ATR)) && bracketOrSubOrRoot.getAttribute(LOCANT_EL) == null) { //treat implicit brackets without locants as if they are not there for (Element descendent : getChildrenIgnoringLocantlessImplicitBrackets(bracketOrSubOrRoot)) { if (descendent.getName().equals(BRACKET_EL)) { bracketted.add(descendent); } else { if (substituentToTryFirst == null && descendent.getAttribute(LOCANT_EL) == null && MATCH_NUMERIC_LOCANT.matcher(locant).matches()) { substituentToTryFirst = descendent; } else { stack.add(descendent); } } } } else { bracketted.add(bracketOrSubOrRoot); } } else { if (substituentToTryFirst == null && bracketOrSubOrRoot.getAttribute(LOCANT_EL) == null && MATCH_NUMERIC_LOCANT.matcher(locant).matches()) { substituentToTryFirst = bracketOrSubOrRoot; } else { stack.add(bracketOrSubOrRoot); } } } if (substituentToTryFirst != null) { 
stack.add(substituentToTryFirst); } doneFirstIteration = true; } else { for (Element bracketOrSubOrRoot : siblings) { if (bracketOrSubOrRoot.getAttribute(MULTIPLIER_ATR) != null) { continue; } if (bracketOrSubOrRoot.getName().equals(BRACKET_EL)) { if (IMPLICIT_TYPE_VAL.equals(bracketOrSubOrRoot.getAttributeValue(TYPE_ATR)) && bracketOrSubOrRoot.getAttribute(LOCANT_EL) == null) { //treat implicit brackets without locants as if they are not there for (Element descendent : getChildrenIgnoringLocantlessImplicitBrackets(bracketOrSubOrRoot)) { if (descendent.getName().equals(BRACKET_EL)) { bracketted.add(descendent); } else { stack.add(descendent); } } } else { bracketted.add(bracketOrSubOrRoot); } } else { stack.add(bracketOrSubOrRoot); } } } //locanting into brackets is rarely the desired answer so place at the bottom of the stack for (int i = bracketted.size() -1; i >=0; i--) { stack.addFirst(bracketted.get(i)); } } return monoNuclearHydride; } private static List getChildrenIgnoringLocantlessImplicitBrackets(Element implicitBracket) { List childrenAndImplicitBracketChildren = new ArrayList<>(); for (Element child : implicitBracket.getChildElements()) { if (child.getName().equals(BRACKET_EL) && IMPLICIT_TYPE_VAL.equals(child.getAttributeValue(TYPE_ATR)) && child.getAttribute(LOCANT_EL) == null) { childrenAndImplicitBracketChildren.addAll(getChildrenIgnoringLocantlessImplicitBrackets(child)); } else { childrenAndImplicitBracketChildren.add(child); } } return childrenAndImplicitBracketChildren; } static Element findRightMostGroupInBracket(Element bracket) { List subsBracketsAndRoots = OpsinTools.getChildElementsWithTagNames(bracket, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); Element lastSubsBracketOrRoot = subsBracketsAndRoots.get(subsBracketsAndRoots.size() - 1); while (lastSubsBracketOrRoot.getName().equals(BRACKET_EL)) { subsBracketsAndRoots = OpsinTools.getChildElementsWithTagNames(lastSubsBracketOrRoot, new String[]{BRACKET_EL, SUBSTITUENT_EL, ROOT_EL}); 
lastSubsBracketOrRoot = subsBracketsAndRoots.get(subsBracketsAndRoots.size() - 1); } return findRightMostGroupInSubOrRoot(lastSubsBracketOrRoot); } static Element findRightMostGroupInSubBracketOrRoot(Element subBracketOrRoot) { if (subBracketOrRoot.getName().equals(BRACKET_EL)) { return findRightMostGroupInBracket(subBracketOrRoot); } else { return findRightMostGroupInSubOrRoot(subBracketOrRoot); } } private static Element findRightMostGroupInSubOrRoot(Element subOrRoot) { for (int i = subOrRoot.getChildCount() - 1; i >= 0; i--) { Element el = subOrRoot.getChild(i); if (el.getName().equals(GROUP_EL)) { return el; } } return null; } private static boolean potentiallyCanSubstitute(Element subBracketOrRoot) { Element parent = subBracketOrRoot.getParent(); List children =parent.getChildElements(); for (int i = parent.indexOf(subBracketOrRoot) +1 ; i < children.size(); i++) { if (!children.get(i).getName().equals(HYPHEN_EL)){ return true; } } return false; } static String checkForBracketedPrimedLocantSpecialCase(Element subBracketOrRoot, String locantString) { int terminalPrimes = StringTools.countTerminalPrimes(locantString); if (terminalPrimes > 0){ int brackettingDepth = 0; Element parent = subBracketOrRoot.getParent(); while (parent != null && parent.getName().equals(BRACKET_EL)){ if (!IMPLICIT_TYPE_VAL.equals(parent.getAttributeValue(TYPE_ATR))){ brackettingDepth++; } parent = parent.getParent(); } if (terminalPrimes == brackettingDepth){ return locantString.substring(0, locantString.length() - terminalPrimes); } } return null; } /** * In cases such as methylenecyclohexane two outAtoms are combined to form a single outAtom with valency * equal to sum of the valency of the other outAtoms. 
     * This is only allowed on substituents where all the outAtoms are on the same atom
     * @param frag
     * @param group
     * @throws StructureBuildingException
     */
    private static void checkAndApplySpecialCaseWhereOutAtomsCanBeCombinedOrThrow(Fragment frag, Element group) throws StructureBuildingException {
        int outAtomCount = frag.getOutAtomCount();
        if (outAtomCount <= 1) {
            return;
        }
        if (EPOXYLIKE_SUBTYPE_VAL.equals(group.getAttributeValue(SUBTYPE_ATR))){
            return;
        }
        String groupValue = group.getValue();
        if (groupValue.equals("oxy") || groupValue.equals("thio") || groupValue.equals("seleno") || groupValue.equals("telluro")){//always bivalent
            return;
        }
        //special case: all outAtoms on same atom e.g. methylenecyclohexane
        Atom firstOutAtom = frag.getOutAtom(0).getAtom();
        int valencyOfOutAtom = 0;
        for (int i = outAtomCount - 1; i >=0 ; i--) {//remove all outAtoms and add one with the total valency of all those that have been removed
            OutAtom out = frag.getOutAtom(i);
            if (!out.getAtom().equals(firstOutAtom)){
                throw new StructureBuildingException("Substitutive bond formation failure: Fragment expected to have one OutAtom but had: "+ outAtomCount);
            }
            valencyOfOutAtom += out.getValency();
            frag.removeOutAtom(i);
        }
        frag.addOutAtom(firstOutAtom, valencyOfOutAtom, true);
    }

    /**
     * Calculates the number of substitutable hydrogens by taking into account:
     * Specified valency if applicable, outAtoms and the lowest valency state that will satisfy these
     * e.g. thio has 2 outAtoms and no bonds hence -->2 outgoing, lowest stable valency = 2 hence no substitutable hydrogen
     * e.g.
phosphonyl has 2 outAtoms and one double bond -->4 outgoing, lowest stable valency =5 hence 1 substitutable hydrogen * @param atom * @return */ static int calculateSubstitutableHydrogenAtoms(Atom atom) { if (!atom.getImplicitHydrogenAllowed()) { return 0; } int valency = atom.determineValency(true); int currentValency = atom.getIncomingValency() + atom.getOutValency(); int substitutableHydrogen = valency - currentValency; return substitutableHydrogen >= 0 ? substitutableHydrogen : 0; } /** * Stereochemistry terms are assigned right at the end so that checks can be done on whether the indicated atom is in fact chiral. * In the process of multiplication locants are primed. This function adds the appropriate number of primes to any locanted stereochemistry locants * The primesString is the string containing the primes to add to each locant * @param subOrBracket * @param primesString */ private static void addPrimesToLocantedStereochemistryElements(Element subOrBracket, String primesString) { List stereoChemistryElements =OpsinTools.getDescendantElementsWithTagName(subOrBracket, STEREOCHEMISTRY_EL); for (Element stereoChemistryElement : stereoChemistryElements) { if (stereoChemistryElement.getAttribute(LOCANT_ATR) != null){ stereoChemistryElement.getAttribute(LOCANT_ATR).setValue(stereoChemistryElement.getAttributeValue(LOCANT_ATR) + primesString); } } } /** * Calculates the number of times getParent() must be called to reach a word element * Returns null if element does not have an enclosing word element. 
   * @param element
   * @return
   */
  private static Integer levelsToWordEl(Element element) {
    int count =0;
    while (!element.getName().equals(WORD_EL)){
      element = element.getParent();
      if (element == null){
        return null;
      }
      count++;
    }
    return count;
  }

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SuffixApplier.java
package uk.ac.cam.ch.wwmm.opsin;

import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

import uk.ac.cam.ch.wwmm.opsin.IsotopeSpecificationParser.IsotopeSpecification;

class SuffixApplier {

  private final BuildState state;
  private final SuffixRules suffixRules;

  SuffixApplier(BuildState state, SuffixRules suffixRules) {
    this.state = state;
    this.suffixRules = suffixRules;
  }

  /**
   * Does suffixApplicability.xml have an entry for this group type?
   * @param groupType
   * @return
   */
  boolean isGroupTypeWithSpecificSuffixRules(String groupType){
    return suffixRules.isGroupTypeWithSpecificSuffixRules(groupType);
  }

  /**Process the effects of suffixes upon a fragment.
   * Unlocanted non-terminal suffixes are not attached yet. All other suffix effects are performed
   * @param group The group element for the fragment to which the suffixes will be added
   * @param suffixes The suffix elements for a fragment.
   * @throws StructureBuildingException If the suffixes can't be resolved properly.
   * @throws ComponentGenerationException
   */
  void resolveSuffixes(Element group, List<Element> suffixes) throws StructureBuildingException, ComponentGenerationException {
    Fragment frag = group.getFrag();
    List<Atom> atomList = frag.getAtomList();//this instance of atomList will not change even once suffixes are merged into the fragment
    String groupType = frag.getType();
    String subgroupType = frag.getSubType();
    String suffixTypeToUse = isGroupTypeWithSpecificSuffixRules(groupType) ? groupType : STANDARDGROUP_TYPE_VAL;

    List<Fragment> associatedSuffixFrags = state.xmlSuffixMap.get(group);
    if (associatedSuffixFrags != null) {//null for non-final group in polycyclic spiro systems
      associatedSuffixFrags.clear();
    }
    Map<String, List<Element>> suffixValToSuffixes = new LinkedHashMap<>();//effectively undoes the effect of multiplying out suffixes
    for (Element suffix : suffixes) {
      String suffixValue = suffix.getAttributeValue(VALUE_ATR);
      List<Element> suffixesWithThisVal = suffixValToSuffixes.get(suffixValue);
      if (suffixesWithThisVal == null) {
        suffixesWithThisVal = new ArrayList<>();
        suffixValToSuffixes.put(suffixValue, suffixesWithThisVal);
      }
      suffixesWithThisVal.add(suffix);

      //Apply isotopes to suffixes if present
      if (suffix.getFrag() != null) {
        //boughton system applies to preceding suffix
        //iupac system applies to following suffix
        Element boughtonIsotopeSpecification = OpsinTools.getNextSibling(suffix);
        if (boughtonIsotopeSpecification != null && boughtonIsotopeSpecification.getName().equals(ISOTOPESPECIFICATION_EL)) {
          if (BOUGHTONSYSTEM_TYPE_VAL.equals(boughtonIsotopeSpecification.getAttributeValue(TYPE_ATR))) {
            applyIsotopeToSuffix(suffix.getFrag(), boughtonIsotopeSpecification, false);
          }
          else {
            throw new RuntimeException("Unexpected isotope specification after suffix");
          }
        }
        Element iupacIsotopeSpecification = OpsinTools.getPreviousSibling(suffix);
        while (iupacIsotopeSpecification != null && iupacIsotopeSpecification.getName().equals(ISOTOPESPECIFICATION_EL) &&
            IUPACSYSTEM_TYPE_VAL.equals(iupacIsotopeSpecification.getAttributeValue(TYPE_ATR))) {
          Element next = OpsinTools.getPreviousSibling(iupacIsotopeSpecification);
          applyIsotopeToSuffix(suffix.getFrag(), iupacIsotopeSpecification, true);
          iupacIsotopeSpecification = next;
        }
      }
    }

    boolean reDetectCycles = false;
    List<Fragment> fragsToMerge = new ArrayList<>();
    for (Entry<String, List<Element>> entry : suffixValToSuffixes.entrySet()) {
      String suffixValue = entry.getKey();
      List<Element> suffixesWithThisVal = entry.getValue();
      List<Atom> possibleAtomsToAttachSuffixTo = null;
      List<SuffixRule> rulesToApply = suffixRules.getSuffixRuleTags(suffixTypeToUse, suffixValue, subgroupType);
      for (int suffixIndex = 0; suffixIndex < suffixesWithThisVal.size(); suffixIndex++) {
        Element suffix = suffixesWithThisVal.get(suffixIndex);
        Fragment suffixFrag = null;
        for (SuffixRule suffixRule : rulesToApply) {
          switch (suffixRule.getType()) {
          case addgroup:
            if (suffixFrag == null) {
              suffixFrag = suffix.getFrag();
              if (suffixFrag == null) {
                throw new RuntimeException("OPSIN Bug: Suffix was expected to have an associated fragment but it wasn't found");
              }
              Atom firstAtomInSuffix = suffixFrag.getFirstAtom();
              if (firstAtomInSuffix.getBondCount() <= 0) {
                throw new ComponentGenerationException("OPSIN Bug: Dummy atom in suffix should have at least one bond to it");
              }
              if (CYCLEFORMER_SUBTYPE_VAL.equals(suffix.getAttributeValue(SUBTYPE_ATR))){
                processCycleFormingSuffix(suffixFrag, frag, suffix);
                reDetectCycles = true;
              }
              else{
                int bondOrderRequired = firstAtomInSuffix.getIncomingValency();
                Atom fragAtomToUse = getFragAtomToUse(frag, suffix, suffixTypeToUse);
                if (fragAtomToUse == null) {
                  if (possibleAtomsToAttachSuffixTo == null) {
                    int substitutionsRequired = suffixesWithThisVal.size();
                    possibleAtomsToAttachSuffixTo = FragmentTools.findnAtomsForSubstitution(frag, atomList.get(0), substitutionsRequired, bondOrderRequired, true);
                    if (possibleAtomsToAttachSuffixTo == null) {
                      throw new StructureBuildingException("No suitable atom found to attach " + suffixValue + " suffix");
                    }
                    for (Atom atom : possibleAtomsToAttachSuffixTo) {
                      if (FragmentTools.isCharacteristicAtom(atom)){
                        throw new StructureBuildingException("No suitable atom found to attach suffix");
                      }
                    }
                    if ("yes".equals(suffixRule.getAttributeValue(SUFFIXRULES_KETONELOCANT_ATR)) && !atomList.get(0).getAtomIsInACycle()) {
                      List<Atom> proKetoneAtoms = getProKetonePositions(possibleAtomsToAttachSuffixTo);
                      //Note that names like "ethanone" are allowable as the fragment may subsequently be substituted to form an actual ketone
                      if (proKetoneAtoms.size() >= substitutionsRequired) {
                        possibleAtomsToAttachSuffixTo = proKetoneAtoms;
                      }
                    }
                    if (!(substitutionsRequired == 1 && (ALKANESTEM_SUBTYPE_VAL.equals(frag.getSubType()) || HETEROSTEM_SUBTYPE_VAL.equals(frag.getSubType())) && possibleAtomsToAttachSuffixTo.get(0).equals(frag.getFirstAtom()))) {
                      if (AmbiguityChecker.isSubstitutionAmbiguous(possibleAtomsToAttachSuffixTo, substitutionsRequired)) {
                        state.addIsAmbiguous("Addition of " + suffixValue +" suffix to: " + group.getValue());
                      }
                    }
                  }
                  fragAtomToUse = possibleAtomsToAttachSuffixTo.get(suffixIndex);
                }

                //create a new bond and associate it with the suffixfrag and both atoms.
                //Remember the suffixFrag has not been imported into the frag yet
                List<Bond> bonds = new ArrayList<>(firstAtomInSuffix.getBonds());
                for (Bond bondToSuffix : bonds) {
                  Atom suffixAtom = bondToSuffix.getOtherAtom(firstAtomInSuffix);
                  state.fragManager.createBond(fragAtomToUse, suffixAtom, bondToSuffix.getOrder());
                  state.fragManager.removeBond(bondToSuffix);
                  if (fragAtomToUse.getIncomingValency() > 2 && (suffixValue.equals("aldehyde") || suffixValue.equals("al")|| suffixValue.equals("aldoxime"))){//formaldehyde/methanal are excluded as they are substitutable
                    if("X".equals(suffixAtom.getFirstLocant())){//carbaldehyde
                      suffixAtom.setProperty(Atom.ISALDEHYDE, true);
                    } else{
                      fragAtomToUse.setProperty(Atom.ISALDEHYDE, true);
                    }
                  }
                }
              }
            }
            else{
              throw new ComponentGenerationException("OPSIN bug: Suffix may only have one addgroup rule: " + suffix.getValue());
            }
            break;
          case changecharge:
            int chargeChange = Integer.parseInt(suffixRule.getAttributeValue(SUFFIXRULES_CHARGE_ATR));
            int protonChange = Integer.parseInt(suffixRule.getAttributeValue(SUFFIXRULES_PROTONS_ATR));
            if (suffix.getAttribute(SUFFIXPREFIX_ATR) == null) {
              Atom fragAtomToUse = getFragAtomToUse(frag, suffix, suffixTypeToUse);
              if (fragAtomToUse != null) {
                fragAtomToUse.addChargeAndProtons(chargeChange, protonChange);
              }
              else{
                applyUnlocantedChargeModification(atomList, chargeChange, protonChange);
              }
            }
            else {//a suffix prefixed acylium suffix
              if (suffixFrag == null) {
                throw new StructureBuildingException("OPSIN bug: ordering of elements in suffixRules.xml wrong; changeCharge found before addGroup");
              }
              Set<Bond> bonds = state.fragManager.getInterFragmentBonds(suffixFrag);
              if (bonds.size() != 1) {
                throw new StructureBuildingException("OPSIN bug: Wrong number of bonds between suffix and group");
              }
              for (Bond bond : bonds) {
                if (bond.getFromAtom().getFrag() == suffixFrag) {
                  bond.getFromAtom().addChargeAndProtons(chargeChange, protonChange);
                } else {
                  bond.getToAtom().addChargeAndProtons(chargeChange, protonChange);
                }
              }
            }
            break;
          case
          setOutAtom:
            String outValencyAtr = suffixRule.getAttributeValue(SUFFIXRULES_OUTVALENCY_ATR);
            int outValency = outValencyAtr != null ? Integer.parseInt(outValencyAtr) : 1;
            if (suffix.getAttribute(SUFFIXPREFIX_ATR) == null) {
              if (!fragsToMerge.isEmpty()) {
                //ensure suffix fragments that were just added can be referenced e.g. glucitol-O1-yl
                mergeSuffixFrags(frag, fragsToMerge);
                fragsToMerge.clear();
              }
              Atom fragAtomToUse = getFragAtomToUse(frag, suffix, suffixTypeToUse);
              if (fragAtomToUse != null) {
                frag.addOutAtom(fragAtomToUse, outValency, true);
              } else {
                frag.addOutAtom(frag.getFirstAtom(), outValency, false);
              }
            } else {//something like oyl on a ring, which means it is now carbonyl and the outAtom is on the suffix and not frag
              if (suffixFrag == null) {
                throw new StructureBuildingException("OPSIN bug: ordering of elements in suffixRules.xml wrong; setOutAtom found before addGroup");
              }
              Set<Bond> bonds = state.fragManager.getInterFragmentBonds(suffixFrag);
              if (bonds.size() != 1) {
                throw new StructureBuildingException("OPSIN bug: Wrong number of bonds between suffix and group");
              }
              for (Bond bond : bonds) {
                if (bond.getFromAtom().getFrag() == suffixFrag) {
                  suffixFrag.addOutAtom(bond.getFromAtom(), outValency, true);
                } else {
                  suffixFrag.addOutAtom(bond.getToAtom(), outValency, true);
                }
              }
            }
            break;
          case setAcidicElement:
            ChemEl chemEl = ChemEl.valueOf(suffixRule.getAttributeValue(SUFFIXRULES_ELEMENT_ATR));
            swapElementsSuchThatThisElementIsAcidic(suffixFrag, chemEl);
            break;
          case addSuffixPrefixIfNonePresentAndCyclic:
          case addFunctionalAtomsToHydroxyGroups:
          case chargeHydroxyGroups:
          case removeTerminalOxygen:
          case convertHydroxyGroupsToOutAtoms:
          case convertHydroxyGroupsToPositiveCharge:
            //already processed
            break;
          }
        }

        if (suffixFrag != null) {//merge suffix frag and parent fragment
          fragsToMerge.add(suffixFrag);
          suffix.setFrag(null);
        }
      }
    }
    mergeSuffixFrags(frag, fragsToMerge);
    if (reDetectCycles) {
      CycleDetector.assignWhetherAtomsAreInCycles(frag);
    }
  }

  private void
  mergeSuffixFrags(Fragment frag, List<Fragment> suffixFrags) throws StructureBuildingException {
    for (Fragment suffixFrag : suffixFrags) {
      state.fragManager.removeAtomAndAssociatedBonds(suffixFrag.getFirstAtom());//the dummy R atom
      Set<String> suffixLocants = new HashSet<>(suffixFrag.getLocants());
      for (String suffixLocant : suffixLocants) {
        if (Character.isDigit(suffixLocant.charAt(0))){//check that numeric locants do not conflict with the parent fragment e.g. hydrazide 2' with biphenyl 2'
          if (frag.hasLocant(suffixLocant)){
            suffixFrag.getAtomByLocant(suffixLocant).removeLocant(suffixLocant);
          }
        }
      }
      state.fragManager.incorporateFragment(suffixFrag, frag);
    }
  }

  private void applyIsotopeToSuffix(Fragment frag, Element isotopeSpecification, boolean mustBeApplied) throws StructureBuildingException {
    IsotopeSpecification isotopeSpec = IsotopeSpecificationParser.parseIsotopeSpecification(isotopeSpecification);
    ChemEl chemEl = isotopeSpec.getChemEl();
    int isotope = isotopeSpec.getIsotope();
    int multiplier = isotopeSpec.getMultiplier();
    String[] locants = isotopeSpec.getLocants();
    if (locants != null && !mustBeApplied) {
      //locanted boughton isotope probably applies to the group rather than the suffix
      return;
    }
    if (locants == null) {
      List<Atom> atoms = frag.getAtomList();
      atoms.remove(0);
      if (chemEl == ChemEl.H) {
        List<Atom> parentAtomsToApplyTo = FragmentTools.findnAtomsForSubstitution(atoms, null, multiplier, 1, true);
        if (parentAtomsToApplyTo == null) {
          if (mustBeApplied) {
            throw new StructureBuildingException("Failed to find sufficient hydrogen atoms for unlocanted hydrogen isotope replacement");
          }
          else {
            return;
          }
        }
        if (AmbiguityChecker.isSubstitutionAmbiguous(parentAtomsToApplyTo, multiplier)) {
          state.addIsAmbiguous("Position of hydrogen isotope on " + frag.getTokenEl().getValue());
        }
        for (int j = 0; j < multiplier; j++) {
          Atom atomWithHydrogenIsotope = parentAtomsToApplyTo.get(j);
          Atom hydrogen = state.fragManager.createAtom(isotopeSpec.getChemEl(), frag);
          hydrogen.setIsotope(isotope);
          state.fragManager.createBond(atomWithHydrogenIsotope, hydrogen, 1);
        }
      }
      else {
        List<Atom> parentAtomsToApplyTo = new ArrayList<>();
        for (Atom atom : atoms) {
          if (atom.getElement() == chemEl) {
            parentAtomsToApplyTo.add(atom);
          }
        }
        if (parentAtomsToApplyTo.size() < multiplier) {
          if(mustBeApplied) {
            throw new StructureBuildingException("Failed to find sufficient atoms for " + chemEl.toString() + " isotope replacement");
          }
          else {
            return;
          }
        }
        if (AmbiguityChecker.isSubstitutionAmbiguous(parentAtomsToApplyTo, multiplier)) {
          state.addIsAmbiguous("Position of isotope on " + frag.getTokenEl().getValue());
        }
        for (int j = 0; j < multiplier; j++) {
          parentAtomsToApplyTo.get(j).setIsotope(isotope);
        }
      }
    }
    else {
      if (chemEl == ChemEl.H) {
        for (int j = 0; j < locants.length; j++) {
          Atom atomWithHydrogenIsotope = frag.getAtomByLocantOrThrow(locants[j]);
          Atom hydrogen = state.fragManager.createAtom(isotopeSpec.getChemEl(), frag);
          hydrogen.setIsotope(isotope);
          state.fragManager.createBond(atomWithHydrogenIsotope, hydrogen, 1);
        }
      }
      else {
        for (int j = 0; j < locants.length; j++) {
          Atom atom = frag.getAtomByLocantOrThrow(locants[j]);
          if (chemEl != atom.getElement()) {
            throw new StructureBuildingException("The atom at locant: " + locants[j] + " was not a " + chemEl.toString() );
          }
          atom.setIsotope(isotope);
        }
      }
    }
    isotopeSpecification.detach();
  }

  /**
   * Return the subset of atoms that are "pro-ketone"
   * i.e.
   * a [CD2](C)C
   * @param atoms
   * @return
   */
  private List<Atom> getProKetonePositions(List<Atom> atoms) {
    List<Atom> proKetonePositions = new ArrayList<>();
    for (Atom atom : atoms) {
      List<Bond> bonds = atom.getBonds();
      if (bonds.size() == 2 &&
          bonds.get(0).getOrder() == 1 &&
          bonds.get(1).getOrder() == 1 &&
          bonds.get(0).getOtherAtom(atom).getElement() == ChemEl.C &&
          bonds.get(1).getOtherAtom(atom).getElement() == ChemEl.C) {
        proKetonePositions.add(atom);
      }
    }
    return proKetonePositions;
  }

  private void processCycleFormingSuffix(Fragment suffixFrag, Fragment suffixableFragment, Element suffix) throws StructureBuildingException, ComponentGenerationException {
    List<Atom> rAtoms = new ArrayList<>();
    for (Atom a : suffixFrag) {
      if (a.getElement() == ChemEl.R){
        rAtoms.add(a);
      }
    }
    if (rAtoms.size() != 2){
      throw new ComponentGenerationException("OPSIN bug: Incorrect number of R atoms associated with cyclic suffix");
    }
    if (rAtoms.get(0).getBondCount() <= 0 || rAtoms.get(1).getBondCount() <= 0) {
      throw new ComponentGenerationException("OPSIN Bug: Dummy atoms in suffix should have at least one bond to them");
    }

    Atom parentAtom1;
    Atom parentAtom2;

    String locant = suffix.getAttributeValue(LOCANT_ATR);
    String locantId = suffix.getAttributeValue(LOCANTID_ATR);
    if (locant != null){
      String[] locants = locant.split(",");
      if (locants.length ==2){
        parentAtom1 = suffixableFragment.getAtomByLocantOrThrow(locants[0]);
        parentAtom2 = suffixableFragment.getAtomByLocantOrThrow(locants[1]);
      }
      else if (locants.length ==1){
        parentAtom1 = suffixableFragment.getAtomByLocantOrThrow("1");
        parentAtom2 = suffixableFragment.getAtomByLocantOrThrow(locants[0]);
      }
      else{
        throw new ComponentGenerationException("Incorrect number of locants associated with cycle forming suffix, expected 2 found: " + locants.length);
      }
    }
    else if (locantId !=null) {
      String[] locantIds = locantId.split(",");
      if (locantIds.length !=2){
        throw new ComponentGenerationException("OPSIN bug: Should be exactly 2 locants associated with a cyclic suffix");
      }
      parentAtom1 =
          suffixableFragment.getAtomByIDOrThrow(Integer.parseInt(locantIds[0]));
      parentAtom2 = suffixableFragment.getAtomByIDOrThrow(Integer.parseInt(locantIds[1]));
    }
    else{
      int chainLength = suffixableFragment.getChainLength();
      if (chainLength > 1 && chainLength == suffixableFragment.getAtomCount()){
        parentAtom1 = suffixableFragment.getAtomByLocantOrThrow("1");
        parentAtom2 = suffixableFragment.getAtomByLocantOrThrow(String.valueOf(chainLength));
      }
      else{
        List<Atom> hydroxyAtoms = FragmentTools.findHydroxyGroups(suffixableFragment);
        if (hydroxyAtoms.size() == 1 && suffixableFragment.getAtomByLocant("1") != null){
          parentAtom1 = suffixableFragment.getAtomByLocantOrThrow("1");
          parentAtom2 = hydroxyAtoms.get(0);
        }
        else{
          throw new ComponentGenerationException("cycle forming suffix: " + suffix.getValue() +" should be locanted!");
        }
      }
    }
    if (parentAtom1.equals(parentAtom2)){
      throw new ComponentGenerationException("cycle forming suffix: " + suffix.getValue() +" attempted to form a cycle involving the same atom twice!");
    }

    if (suffixableFragment.getType().equals(CARBOHYDRATE_TYPE_VAL)){
      FragmentTools.removeTerminalOxygen(state, parentAtom1, 2);
      FragmentTools.removeTerminalOxygen(state, parentAtom1, 1);
      List<Atom> chainHydroxy = FragmentTools.findHydroxyLikeTerminalAtoms(parentAtom2.getAtomNeighbours(), ChemEl.O);
      if (chainHydroxy.size() == 1){
        FragmentTools.removeTerminalAtom(state, chainHydroxy.get(0));//make sure to retain stereochemistry
      }
      else{
        throw new ComponentGenerationException("The second locant of a carbohydrate lactone should point to a carbon in the chain with a hydroxyl group");
      }
    }
    else{
      if (parentAtom2.getElement() == ChemEl.O){//cyclic suffixes like lactone formally indicate the removal of hydroxy cf.
        //1979 rule 472.1
        //...although in most cases they are used on structures that don't actually have a hydroxy group
        List<Atom> neighbours = parentAtom2.getAtomNeighbours();
        if (neighbours.size()==1){
          List<Atom> suffixNeighbours = rAtoms.get(1).getAtomNeighbours();
          if (suffixNeighbours.size()==1 && suffixNeighbours.get(0).getElement() == ChemEl.O){
            state.fragManager.removeAtomAndAssociatedBonds(parentAtom2);
            parentAtom2 = neighbours.get(0);
          }
        }
      }
    }
    makeBondsToSuffix(parentAtom1, rAtoms.get(0));
    makeBondsToSuffix(parentAtom2, rAtoms.get(1));
    state.fragManager.removeAtomAndAssociatedBonds(rAtoms.get(1));
  }

  private Atom getFragAtomToUse(Fragment frag, Element suffix, String suffixTypeToUse) throws StructureBuildingException {
    String locant = suffix.getAttributeValue(LOCANT_ATR);
    if (locant != null) {
      return frag.getAtomByLocantOrThrow(locant);
    }
    String locantId = suffix.getAttributeValue(LOCANTID_ATR);
    if (locantId != null) {
      return frag.getAtomByIDOrThrow(Integer.parseInt(locantId));
    }
    String defaultLocantId = suffix.getAttributeValue(DEFAULTLOCANTID_ATR);
    if (defaultLocantId != null) {
      return frag.getAtomByIDOrThrow(Integer.parseInt(defaultLocantId));
    }
    else if (suffixTypeToUse.equals(ACIDSTEM_TYPE_VAL) || suffixTypeToUse.equals(NONCARBOXYLICACID_TYPE_VAL) || suffixTypeToUse.equals(CHALCOGENACIDSTEM_TYPE_VAL)) {//means that e.g. sulfonyl, has an explicit outAtom
      return frag.getFirstAtom();
    }
    return null;
  }

  /**
   * Preference is given to mono cation/anions as they are expected to be more likely
   * Additionally, if a locant has not been specified then it was typically intended to refer to a nitrogen even if the nitrogen is not at locant 1 e.g.
   * isoquinolinium
   * Hence preference is given to nitrogen atoms and then to non carbon atoms
   * @param atomList
   * @param chargeChange
   * @param protonChange
   */
  private void applyUnlocantedChargeModification(List<Atom> atomList, int chargeChange, int protonChange) {
    //List of atoms that can accept this charge while remaining in a reasonable valency
    List<Atom> nitrogens = new ArrayList<>();//most likely
    List<Atom> otherHeteroatoms = new ArrayList<>();//plausible
    List<Atom> carbonsAtoms = new ArrayList<>();//rare
    List<Atom> chargedAtoms = new ArrayList<>();//very rare
    if (atomList.isEmpty()) {
      throw new RuntimeException("OPSIN Bug: List of atoms to add charge suffix to was empty");
    }
    for (Atom a : atomList) {
      ChemEl chemEl = a.getElement();
      Integer[] stableValencies = ValencyChecker.getPossibleValencies(chemEl, a.getCharge() + chargeChange);
      if (stableValencies == null) {//unstable valency so seems unlikely
        continue;
      }
      int resultantExpectedValency = (a.getLambdaConventionValency() ==null ? ValencyChecker.getDefaultValency(chemEl) : a.getLambdaConventionValency()) + a.getProtonsExplicitlyAddedOrRemoved() + protonChange;

      if (!Arrays.asList(stableValencies).contains(resultantExpectedValency)) {
        //unstable valency so seems unlikely
        continue;
      }
      if (protonChange < 0) {
        int substitableHydrogen = StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a);
        if (a.hasSpareValency() && !a.getFrag().getIndicatedHydrogen().contains(a)) {
          substitableHydrogen--;
        }
        if (substitableHydrogen < 1) {
          //no hydrogens so operation can't remove one!
          continue;
        }
      }
      if (a.getCharge() == 0) {
        if (chemEl == ChemEl.N) {
          nitrogens.add(a);
        }
        else if (chemEl != ChemEl.C) {
          otherHeteroatoms.add(a);
        }
        else {
          carbonsAtoms.add(a);
        }
      }
      else {
        chargedAtoms.add(a);
      }
    }
    List<Atom> listFromWhichToChoose;
    if (!nitrogens.isEmpty()) {
      listFromWhichToChoose = nitrogens;
      if (AMINOACID_TYPE_VAL.equals(atomList.get(0).getFrag().getType())) {
        //By convention treat names like lysinium as unambiguous (prefer alpha nitrogen)
        if (listFromWhichToChoose.contains(atomList.get(0))){
          listFromWhichToChoose = new ArrayList<>();
          listFromWhichToChoose.add(atomList.get(0));
        }
      }
    }
    else if (!otherHeteroatoms.isEmpty()) {
      listFromWhichToChoose = otherHeteroatoms;
    }
    else if (!carbonsAtoms.isEmpty()) {
      listFromWhichToChoose = carbonsAtoms;
    }
    else if (!chargedAtoms.isEmpty()) {
      listFromWhichToChoose = chargedAtoms;
    }
    else {
      listFromWhichToChoose = atomList;
    }

    Atom chosenAtom = listFromWhichToChoose.get(0);
    if (!AmbiguityChecker.allAtomsEquivalent(listFromWhichToChoose)) {
      state.addIsAmbiguous("Addition of charge suffix to: " + chosenAtom.getFrag().getTokenEl().getValue());
    }

    chosenAtom.addChargeAndProtons(chargeChange, protonChange);
  }

  /**
   * e.g.
   * if element is "S" changes C(=S)O -->C(=O)S
   * @param frag
   * @param chemEl
   * @throws StructureBuildingException
   */
  private void swapElementsSuchThatThisElementIsAcidic(Fragment frag, ChemEl chemEl) throws StructureBuildingException {
    for (int i = 0, l =frag.getFunctionalAtomCount(); i < l; i++) {
      Atom atom = frag.getFunctionalAtom(i).getAtom();
      Set<Atom> ambiguouslyElementedAtoms = atom.getProperty(Atom.AMBIGUOUS_ELEMENT_ASSIGNMENT);
      if (ambiguouslyElementedAtoms != null) {
        Atom atomToSwapWith = null;
        for (Atom ambiguouslyElementedAtom : ambiguouslyElementedAtoms) {
          if (ambiguouslyElementedAtom.getElement() == chemEl){
            atomToSwapWith = ambiguouslyElementedAtom;
            break;
          }
        }
        if (atomToSwapWith != null) {
          if (atomToSwapWith != atom) {
            //swap locants and element type
            List<String> tempLocants1 = new ArrayList<>(atom.getLocants());
            List<String> tempLocants2 = new ArrayList<>(atomToSwapWith.getLocants());
            atom.clearLocants();
            atomToSwapWith.clearLocants();
            for (String locant : tempLocants1) {
              atomToSwapWith.addLocant(locant);
            }
            for (String locant : tempLocants2) {
              atom.addLocant(locant);
            }
            ChemEl a2ChemEl = atomToSwapWith.getElement();
            atomToSwapWith.setElement(atom.getElement());
            atom.setElement(a2ChemEl);
            ambiguouslyElementedAtoms.remove(atomToSwapWith);
          }
          ambiguouslyElementedAtoms.remove(atom);
          return;
        }
      }
    }
    throw new StructureBuildingException("Unable to find potential acidic atom with element: " + chemEl);
  }

  /**
   * Creates bonds between the parentAtom and the atoms connected to the R atoms.
   * Removes bonds to the R atom
   * @param parentAtom
   * @param suffixRAtom
   */
  private void makeBondsToSuffix(Atom parentAtom, Atom suffixRAtom) {
    List<Bond> bonds = new ArrayList<>(suffixRAtom.getBonds());
    for (Bond bondToSuffix : bonds) {
      Atom suffixAtom = bondToSuffix.getOtherAtom(suffixRAtom);
      state.fragManager.createBond(parentAtom, suffixAtom, bondToSuffix.getOrder());
      state.fragManager.removeBond(bondToSuffix);
    }
  }

  List<SuffixRule> getSuffixRuleTags(String suffixTypeToUse, String suffixValue, String subgroupType) throws ComponentGenerationException {
    return suffixRules.getSuffixRuleTags(suffixTypeToUse, suffixValue, subgroupType);
  }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SuffixRule.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.List;

class SuffixRule {

  private final SuffixRuleType type;
  private final List<Attribute> attributes;

  SuffixRule(SuffixRuleType type, List<Attribute> attributes) {
    this.type = type;
    this.attributes = attributes;
  }

  SuffixRuleType getType() {
    return type;
  }

  /**
   * Returns the value of the attribute with the given name
   * or null if the attribute doesn't exist
   * @param name
   * @return
   */
  String getAttributeValue(String name) {
    for (Attribute a : attributes) {
      if (a.getName().equals(name)) {
        return a.getValue();
      }
    }
    return null;
  }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SuffixRuleType.java
package uk.ac.cam.ch.wwmm.opsin;

enum SuffixRuleType {
  addgroup,
  addSuffixPrefixIfNonePresentAndCyclic,
  setOutAtom,
  changecharge,
  addFunctionalAtomsToHydroxyGroups,
  chargeHydroxyGroups,
  removeTerminalOxygen,
  convertHydroxyGroupsToOutAtoms,
  convertHydroxyGroupsToPositiveCharge,
  setAcidicElement
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/SuffixRules.java
package uk.ac.cam.ch.wwmm.opsin;

import
    static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SuffixRules {

  /**For a given group type what suffixes are applicable.
   * Within this group type which are applicable for a given suffixValue
   * Returns a list as different group subTypes can give different meanings*/
  private final Map<String, Map<String, List<ApplicableSuffix>>> suffixApplicability;

  private static class ApplicableSuffix {

    private final String requiredSubType;
    private final List<SuffixRule> suffixRules;

    public ApplicableSuffix(String requiredSubType, List<SuffixRule> suffixRules) {
      this.requiredSubType = requiredSubType;
      this.suffixRules = suffixRules;
    }
  }

  SuffixRules(ResourceGetter resourceGetter) throws IOException {
    Map<String, List<SuffixRule>> suffixRulesMap = generateSuffixRulesMap(resourceGetter);
    suffixApplicability = generateSuffixApplicabilityMap(resourceGetter, suffixRulesMap);
  }

  private Map<String, List<SuffixRule>> generateSuffixRulesMap(ResourceGetter resourceGetter) throws IOException {
    Map<String, List<SuffixRule>> suffixRulesMap = new HashMap<>();
    XMLStreamReader reader = resourceGetter.getXMLStreamReader("suffixRules.xml");
    try {
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT &&
            reader.getLocalName().equals(SUFFIXRULES_RULE_EL)) {
          String ruleValue = reader.getAttributeValue(null, SUFFIXRULES_VALUE_ATR);
          if (suffixRulesMap.get(ruleValue) != null) {
            throw new RuntimeException("Suffix: " + ruleValue + " appears multiple times in suffixRules.xml");
          }
          suffixRulesMap.put(ruleValue, processSuffixRules(reader));
        }
      }
    }
    catch (XMLStreamException e) {
      throw new IOException("Parsing exception occurred while reading suffixRules.xml", e);
    }
    finally {
      try {
        reader.close();
      } catch (XMLStreamException e) {
        throw new IOException("Parsing exception occurred while reading suffixRules.xml", e);
      }
    }
    return suffixRulesMap;
  }

  private List<SuffixRule>
  processSuffixRules(XMLStreamReader reader) throws XMLStreamException {
    String startingElName = reader.getLocalName();
    List<SuffixRule> rules = new ArrayList<>();
    while (reader.hasNext()) {
      switch (reader.next()) {
      case XMLStreamConstants.START_ELEMENT:
        String tagName = reader.getLocalName();
        SuffixRuleType type = SuffixRuleType.valueOf(tagName);
        List<Attribute> attributes = new ArrayList<>();
        for (int i = 0, l = reader.getAttributeCount(); i < l; i++) {
          attributes.add(new Attribute(reader.getAttributeLocalName(i), reader.getAttributeValue(i)));
        }
        rules.add(new SuffixRule(type, attributes));
        break;
      case XMLStreamConstants.END_ELEMENT:
        if (reader.getLocalName().equals(startingElName)) {
          return rules;
        }
        break;
      }
    }
    throw new RuntimeException("Malformed suffixRules.xml");
  }

  private Map<String, Map<String, List<ApplicableSuffix>>> generateSuffixApplicabilityMap(ResourceGetter resourceGetter, Map<String, List<SuffixRule>> suffixRulesMap) throws IOException {
    Map<String, Map<String, List<ApplicableSuffix>>> suffixApplicability = new HashMap<>();
    XMLStreamReader reader = resourceGetter.getXMLStreamReader("suffixApplicability.xml");
    try {
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT &&
            reader.getLocalName().equals(SUFFIXAPPLICABILITY_GROUPTYPE_EL)) {
          Map<String, List<ApplicableSuffix>> suffixToRuleMap = new HashMap<>();
          suffixApplicability.put(reader.getAttributeValue(null, SUFFIXAPPLICABILITY_TYPE_ATR), suffixToRuleMap);
          while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT &&
                reader.getLocalName().equals(SUFFIXAPPLICABILITY_SUFFIX_EL)) {
              String suffixValue = reader.getAttributeValue(null, SUFFIXAPPLICABILITY_VALUE_ATR);
              List<ApplicableSuffix> suffixList = suffixToRuleMap.get(suffixValue);
              //can have multiple entries if subType attribute is set
              if (suffixToRuleMap.get(suffixValue) == null){
                suffixList = new ArrayList<>();
                suffixToRuleMap.put(suffixValue, suffixList);
              }
              String requiredSubType = reader.getAttributeValue(null, SUFFIXAPPLICABILITY_SUBTYPE_ATR);
              String suffixRuleName = reader.getElementText();
              List<SuffixRule> suffixRules = suffixRulesMap.get(suffixRuleName);
              if
                  (suffixRules == null) {
                throw new RuntimeException("Suffix: " + suffixRuleName +" does not have a rule associated with it in suffixRules.xml");
              }
              suffixList.add(new ApplicableSuffix(requiredSubType, suffixRules));
            }
            else if (event == XMLStreamConstants.END_ELEMENT &&
                reader.getLocalName().equals(SUFFIXAPPLICABILITY_GROUPTYPE_EL)) {
              break;
            }
          }
        }
      }
    }
    catch (XMLStreamException e) {
      throw new IOException("Parsing exception occurred while reading suffixApplicability.xml", e);
    }
    finally {
      try {
        reader.close();
      } catch (XMLStreamException e) {
        throw new IOException("Parsing exception occurred while reading suffixApplicability.xml", e);
      }
    }
    return suffixApplicability;
  }

  /**
   * Returns the appropriate suffixRules for the given arguments.
   * The suffix rules are the children of the appropriate rule in suffixRules.xml
   * @param suffixTypeToUse
   * @param suffixValue
   * @param subgroupType
   * @return
   * @throws ComponentGenerationException
   */
  List<SuffixRule> getSuffixRuleTags(String suffixTypeToUse, String suffixValue, String subgroupType) throws ComponentGenerationException {
    Map<String, List<ApplicableSuffix>> groupToSuffixMap = suffixApplicability.get(suffixTypeToUse);
    if (groupToSuffixMap == null){
      throw new ComponentGenerationException("Suffix Type: " + suffixTypeToUse + " does not have a corresponding groupType entry in suffixApplicability.xml");
    }
    List<ApplicableSuffix> potentiallyApplicableSuffixes = groupToSuffixMap.get(suffixValue);
    if(potentiallyApplicableSuffixes == null || potentiallyApplicableSuffixes.isEmpty() ) {
      throw new ComponentGenerationException("Suffix: " + suffixValue + " does not apply to the group it was associated with (type: " + suffixTypeToUse + ") according to suffixApplicability.xml");
    }
    List<SuffixRule> suffixRules = null;
    for (ApplicableSuffix suffix : potentiallyApplicableSuffixes) {
      if (suffix.requiredSubType != null) {
        if (!suffix.requiredSubType.equals(subgroupType)) {
          continue;
        }
      }
      if (suffixRules != null) {
        throw new ComponentGenerationException("Suffix: " + suffixValue + " appears multiple times in suffixApplicability.xml");
      }
      suffixRules = suffix.suffixRules;
    }
    if (suffixRules == null){
      throw new ComponentGenerationException("Suffix: " +suffixValue +" does not apply to the group it was associated with (type: "+ suffixTypeToUse + ") due to the group's subType: "+ subgroupType +" according to suffixApplicability.xml");
    }
    return suffixRules;
  }

  /**
   * Does suffixApplicability.xml have an entry for this group type?
   * @param groupType
   * @return
   */
  boolean isGroupTypeWithSpecificSuffixRules(String groupType){
    return suffixApplicability.containsKey(groupType);
  }
}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/TokenEl.java
package uk.ac.cam.ch.wwmm.opsin;

import java.util.Collections;
import java.util.List;

class TokenEl extends Element {

  private String value;
  private Fragment frag;

  TokenEl(String name) {
    super(name);
    this.value = "";
  }

  TokenEl(String name, String value) {
    super(name);
    this.value = value;
  }

  @Override
  void addChild(Element child) {
    throw new UnsupportedOperationException("Tokens do not have children");
  }

  @Override
  Element copy() {
    TokenEl copy = new TokenEl(this.name, this.value);
    for (int i = 0, len = this.attributes.size(); i < len; i++) {
      Attribute atr = this.attributes.get(i);
      copy.addAttribute(new Attribute(atr));
    }
    return copy;
  }

  /**
   * Creates a copy with no parent
   * The provided value is used instead of the Element to be copied's value
   * @param value
   * @return
   */
  TokenEl copy(String value) {
    TokenEl copy = new TokenEl(this.name, value);
    for (int i = 0, len = this.attributes.size(); i < len; i++) {
      Attribute atr = this.attributes.get(i);
      copy.addAttribute(new Attribute(atr));
    }
    return copy;
  }

  @Override
  Element getChild(int index) {
    throw new UnsupportedOperationException("Tokens do not have children");
  }

  @Override
  int getChildCount() {
    return 0;
  }

  @Override
  List<Element> getChildElements() {
    return Collections.emptyList();
  }

  @Override
  List<Element> getChildElements(String
name) { return Collections.emptyList(); } @Override Element getFirstChildElement(String name) { return null; } @Override Element getLastChildElement() { return null; } @Override Fragment getFrag() { return frag; } String getValue() { return value; } @Override int indexOf(Element child) { return -1; } @Override void insertChild(Element child, int index) { throw new UnsupportedOperationException("Tokens do not have children"); } @Override boolean removeChild(Element child) { throw new UnsupportedOperationException("Tokens do not have children"); } @Override Element removeChild(int index) { throw new UnsupportedOperationException("Tokens do not have children"); } @Override void replaceChild(Element oldChild, Element newChild) { throw new UnsupportedOperationException("Tokens do not have children"); } @Override void setFrag(Fragment frag) { this.frag = frag; } void setValue(String text) { this.value = text; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/Tokeniser.java000066400000000000000000000250611451751637500264660ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.Collections; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; /** * Uses OPSIN's DFA based grammar to break a name into tokens with associated meanings ("annotations"). * @author dl387 * */ class Tokeniser { private final ParseRules parseRules; private final Pattern matchCasCollectiveIndex = Pattern.compile("([\\[\\(\\{]([1-9][0-9]?[cC][iI][, ]?)+[\\]\\)\\}])+|[1-9][0-9]?[cC][iI]", Pattern.CASE_INSENSITIVE ); private final Pattern matchCompoundWithPhrase = Pattern.compile("(compd\\. with|compound with|and) ", Pattern.CASE_INSENSITIVE ); Tokeniser(ParseRules parseRules) { this.parseRules = parseRules; } ParseRules getParseRules() { return parseRules; } /** * Master method for tokenizing chemical names into words and within words into tokens * @param name The chemical name. 
* @param allowRemovalOfWhiteSpace * @return * @throws ParsingException */ TokenizationResult tokenize(String name, boolean allowRemovalOfWhiteSpace) throws ParsingException { TokenizationResult result = allowRemovalOfWhiteSpace ? new TokenizationResult(WordTools.removeWhiteSpaceIfBracketsAreUnbalanced(name)) :new TokenizationResult(name); TokenizationResult resultFromBeforeWhitespaceRemoval = null; while (!result.isSuccessfullyTokenized()){ ParseRulesResults results = parseRules.getParses(result.getUnparsedName()); List parseTokens = results.getParseTokensList(); result.setWorkingName(results.getUninterpretableName()); String parsedName = result.getUnparsedName().substring(0, result.getUnparsedName().length() - result.getWorkingName().length()); if (isWordParsable(parseTokens, result)) { parseWord(result, parseTokens, parsedName, false); resultFromBeforeWhitespaceRemoval =null; } else { if (resultFromBeforeWhitespaceRemoval == null) { resultFromBeforeWhitespaceRemoval = new TokenizationResult(name); resultFromBeforeWhitespaceRemoval.setErrorFields(result.getUnparsedName(), result.getWorkingName(), results.getUnparseableName()); } if (!fixWord(result, parsedName, allowRemovalOfWhiteSpace)) { result.setErrorFields(resultFromBeforeWhitespaceRemoval.getUnparsedName(), resultFromBeforeWhitespaceRemoval.getUninterpretableName(), resultFromBeforeWhitespaceRemoval.getUnparsableName()); break; } } } return result; } /** * Master method for tokenizing chemical names into words and within words into tokens * This is performed in a right to left manner * @param reverseParseRules * @param name The chemical name. 
* @param allowRemovalOfWhiteSpace * @return * @throws ParsingException */ TokenizationResult tokenizeRightToLeft(ReverseParseRules reverseParseRules, String name, boolean allowRemovalOfWhiteSpace) throws ParsingException { TokenizationResult result = new TokenizationResult(name); //removeWhiteSpaceIfBracketsAreUnbalanced is not currently employed as the input to this function from the parser will often be what the LR tokenizer couldn't handle, which may not have matching brackets TokenizationResult resultFromBeforeWhitespaceRemoval = null; while (!result.isSuccessfullyTokenized()){ ParseRulesResults results = reverseParseRules.getParses(result.getUnparsedName()); List parseTokens =results.getParseTokensList(); result.setWorkingName(results.getUninterpretableName()); String parsedName = result.getUnparsedName().substring(result.getWorkingName().length()); if (isWordParsableInReverse(parseTokens, result)) { parseWord(result, parseTokens, parsedName, true); resultFromBeforeWhitespaceRemoval =null; } else{ if (resultFromBeforeWhitespaceRemoval == null) { resultFromBeforeWhitespaceRemoval = new TokenizationResult(name); resultFromBeforeWhitespaceRemoval.setErrorFields(result.getUnparsedName(), result.getWorkingName(), results.getUnparseableName()); } if (!fixWordInReverse(result, parsedName, allowRemovalOfWhiteSpace)) { result.setErrorFields(resultFromBeforeWhitespaceRemoval.getUnparsedName(), resultFromBeforeWhitespaceRemoval.getUninterpretableName(), resultFromBeforeWhitespaceRemoval.getUnparsableName()); break; } } } Collections.reverse(result.getParse().getWords()); return result; } private boolean isWordParsableInReverse(List parseTokens, TokenizationResult result) { return parseTokens.size()>0 && (result.isFullyInterpretable() || result.getWorkingName().charAt(result.getWorkingName().length()-1)==' ' || result.getWorkingName().charAt(result.getWorkingName().length()-1) =='-'); } private boolean isWordParsable(List parseTokens, TokenizationResult result) { return 
parseTokens.size()>0 && (result.isFullyInterpretable() || result.getWorkingName().charAt(0) ==' ' || result.getWorkingName().charAt(0) =='-');
	}

	private void parseWord(TokenizationResult result, List<ParseTokens> parseTokens, String parsedName, boolean reverse) {
		//If something like ethylchloride is encountered this should be split back to ethyl chloride and there will be 2 ParseWords returned
		//In cases of properly formed names there will be only one ParseWord
		//If there are two parses, one of which assumes a missing space and one of which does not, the former is discarded
		addParseWords(parseTokens, parsedName, result.getParse(), reverse);

		if (result.isFullyInterpretable()) {
			result.setUnparsedName(result.getWorkingName());
		} else {
			String remainingName = result.getWorkingName();
			if (reverse){
				if (remainingName.length() > 3 && remainingName.endsWith(" - ")){
					remainingName = remainingName.substring(0, remainingName.length() - 3);
				}
				else{
					remainingName = remainingName.substring(0, remainingName.length() - 1);
				}
			}
			else{
				if (remainingName.length() > 3 && remainingName.startsWith(" - ")){//this is a way of indicating a mixture
					remainingName = remainingName.substring(3);
				}
				else{
					remainingName = remainingName.substring(1);
				}
			}
			result.setUnparsedName(remainingName);
		}
	}

	private void addParseWords(List<ParseTokens> parseTokens, String parsedName, Parse parse, boolean reverse) {
		List<ParseWord> parseWords = WordTools.splitIntoParseWords(parseTokens, parsedName);

		if (reverse) {
			Collections.reverse(parseWords);//make this set of words back to front as well
		}

		for (ParseWord parseWord : parseWords) {
			parse.addWord(parseWord);
		}
	}

	private boolean fixWord(TokenizationResult result, String parsedName, boolean allowRemovalOfWhiteSpace) throws ParsingException {
		Matcher m = matchCompoundWithPhrase.matcher(result.getWorkingName());
		if (m.lookingAt() && lastParsedWordWasFullOrFunctionalTerm(result)) {
			result.setUnparsedName(parsedName + result.getWorkingName().substring(m.group().length()));
		} else if
(matchCasCollectiveIndex.matcher(result.getWorkingName()).matches()) {
			result.setUnparsedName(parsedName);
		} else {
			if (allowRemovalOfWhiteSpace) {
				//TODO add a warning message if this code is invoked. A name invoking this is unambiguously BAD
				List<ParseWord> parsedWords = result.getParse().getWords();
				if (!reverseSpaceRemoval(parsedWords, result)) {
					//Try to remove the first space in the working name (the text to the right of what was parsed) and try again
					int indexOfSpace = result.getWorkingName().indexOf(' ');
					if (indexOfSpace != -1) {
						result.setUnparsedName( parsedName + result.getWorkingName().substring(0, indexOfSpace) + result.getWorkingName().substring(indexOfSpace + 1));
					} else {
						return false;
					}
				}
			} else {
				return false;
			}
		}
		return true;
	}

	private boolean lastParsedWordWasFullOrFunctionalTerm(TokenizationResult result) throws ParsingException {
		List<ParseWord> parseWords = result.getParse().getWords();
		if (parseWords.size()>0){
			List<ParseTokens> parseTokensList = parseWords.get(parseWords.size()-1).getParseTokens();
			for (ParseTokens parseTokens : parseTokensList) {
				WordType type = OpsinTools.determineWordType(parseTokens.getAnnotations());
				if (type.equals(WordType.full) || type.equals(WordType.functionalTerm)){
					return true;
				}
			}
		}
		return false;
	}

	private boolean fixWordInReverse(TokenizationResult result, String parsedName, boolean allowRemovalOfWhiteSpace) {
		if (allowRemovalOfWhiteSpace) {
			//Try to remove a space and try again
			//TODO add a warning message if this code is invoked.
			//A name invoking this is unambiguously BAD
			int indexOfSpace = result.getWorkingName().lastIndexOf(' ');
			if (indexOfSpace != -1) {
				result.setUnparsedName( result.getWorkingName().substring(0, indexOfSpace) + result.getWorkingName().substring(indexOfSpace + 1) + parsedName);
			} else {
				return false;
			}
		} else {
			return false;
		}
		return true;
	}

	/**
	 * Fixes cases such as "benzene sulfonamide" --> "benzenesulfonamide"
	 * @param parsedWords
	 * @param result
	 * @return true if the last parsed word and the unparsed text could be parsed together as one word
	 * @throws ParsingException
	 */
	private boolean reverseSpaceRemoval(List<ParseWord> parsedWords, TokenizationResult result) throws ParsingException {
		boolean successful = false;

		if (!parsedWords.isEmpty()) {//first see whether the space before the unparseable word is erroneous
			ParseWord pw = parsedWords.get(parsedWords.size() - 1);
			String lastWordAndUnparsed = pw.getWord() + result.getUnparsedName();
			ParseRulesResults backResults = parseRules.getParses(lastWordAndUnparsed);
			List<ParseTokens> backParseTokens = backResults.getParseTokensList();
			String backUninterpretableName = backResults.getUninterpretableName();
			String backParsedName = lastWordAndUnparsed.substring(0, lastWordAndUnparsed.length() - backUninterpretableName.length());
			if (backParsedName.length() > pw.getWord().length() && backParseTokens.size() > 0 && (backUninterpretableName.equals("") || backUninterpretableName.charAt(0) == ' ' || backUninterpretableName.charAt(0) == '-')) {//a word was interpretable
				result.getParse().removeWord(pw);
				List<ParseWord> parseWords = WordTools.splitIntoParseWords(backParseTokens, backParsedName);
				for (ParseWord parseWord : parseWords) {
					result.getParse().addWord(parseWord);
				}
				if (!backUninterpretableName.equals("")) {
					result.setUnparsedName(backUninterpretableName.substring(1));//remove white space at start of uninterpretableName
				} else {
					result.setUnparsedName(backUninterpretableName);
				}
				successful = true;
			}
		}

		return successful;
	}
}
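/*
 * Illustrative sketch only, not part of OPSIN: the string manipulation done by
 * fixWord and by the left-to-right branch of parseWord in Tokeniser, isolated
 * from the parser so that it can be run on its own.
 * The class name and both method names are hypothetical.
 */
class TokeniserStringSketch {

	/**
	 * As in fixWord: drops the first space of the still unparsed text and joins
	 * the result back onto what was already parsed.
	 * Returns null if there is no space to remove (fixWord returns false in that case).
	 */
	static String removeFirstSpace(String parsedName, String workingName) {
		int indexOfSpace = workingName.indexOf(' ');
		if (indexOfSpace == -1) {
			return null;
		}
		return parsedName + workingName.substring(0, indexOfSpace) + workingName.substring(indexOfSpace + 1);
	}

	/**
	 * As in parseWord when reverse is false: skips the " - " mixture separator,
	 * otherwise skips the single space or hyphen that terminated the word.
	 */
	static String stripLeadingSeparator(String remainingName) {
		if (remainingName.length() > 3 && remainingName.startsWith(" - ")) {
			return remainingName.substring(3);
		}
		return remainingName.substring(1);
	}

	public static void main(String[] args) {
		System.out.println(removeFirstSpace("chloro", "meth ane"));
		System.out.println(stripLeadingSeparator(" - ethanol"));
		System.out.println(stripLeadingSeparator(" chloride"));
	}
}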
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/TokenizationResult.java000066400000000000000000000025471451751637500304040ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * * @author dl387 * */ class TokenizationResult { private final Parse parse; private String workingName; private String unparsableName; private String unparsedName; private String uninterpretableName; TokenizationResult(String name) { this.parse = new Parse(name); this.workingName = ""; this.unparsableName = ""; this.unparsedName = name; this.uninterpretableName = ""; } boolean isSuccessfullyTokenized() { return unparsedName.length()==0; } Parse getParse() { return parse; } void setUninterpretableName(String name) { this.uninterpretableName = name; } String getUninterpretableName() { return this.uninterpretableName; } String getWorkingName() { return workingName; } void setWorkingName(String name) { this.workingName = name; } boolean isFullyInterpretable() { return "".equals(workingName); } String getUnparsableName() { return unparsableName; } void setUnparsableName(String name) { this.unparsableName = name; } String getUnparsedName() { return unparsedName; } void setUnparsedName(String name) { this.unparsedName = name; } void setErrorFields(String unparsedName, String uninterpretableName, String unparsableName) { this.unparsedName = unparsedName; this.uninterpretableName = uninterpretableName; this.unparsableName = unparsableName; } } opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ValencyChecker.java000066400000000000000000000474601451751637500274200ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.EnumMap; import java.util.HashMap; import java.util.Map; /** * Provides valency checking features and a lookup on the possible valencies * for an atom given its element and charge * * Also used to perform a final check on the output of OPSIN, to reject interpretations * that result in hypervalent structures due to incorrect names 
or misinterpreted names
 *
 * @author ptc24
 * @author dl387
 *
 */
class ValencyChecker {

	/** used to decide on the likely valency state*/
	private static final Map<ChemEl, Integer> expectedDefaultValency = new EnumMap<>(ChemEl.class);

	/** used to decide whether an atom has spare valency in a ring, these are the same as specified in the Hantzsch-Widman system */
	private static final Map<ChemEl, Integer> valencyInHW = new EnumMap<>(ChemEl.class);

	/** used to decide on the likely valency state; maps element to charge to the stable valencies in ascending order */
	private static final Map<ChemEl, Map<Integer, Integer[]>> possibleStableValencies = new EnumMap<>(ChemEl.class);

	static {
		expectedDefaultValency.put(ChemEl.B, 3);
		expectedDefaultValency.put(ChemEl.Al, 3);
		expectedDefaultValency.put(ChemEl.In, 3);
		expectedDefaultValency.put(ChemEl.Ga, 3);
		expectedDefaultValency.put(ChemEl.Tl, 3);
		expectedDefaultValency.put(ChemEl.C, 4);
		expectedDefaultValency.put(ChemEl.Si, 4);
		expectedDefaultValency.put(ChemEl.Ge, 4);
		expectedDefaultValency.put(ChemEl.Sn, 4);
		expectedDefaultValency.put(ChemEl.Pb, 4);
		expectedDefaultValency.put(ChemEl.N, 3);
		expectedDefaultValency.put(ChemEl.P, 3);
		expectedDefaultValency.put(ChemEl.As, 3);
		expectedDefaultValency.put(ChemEl.Sb, 3);
		expectedDefaultValency.put(ChemEl.Bi, 3);
		expectedDefaultValency.put(ChemEl.O, 2);
		expectedDefaultValency.put(ChemEl.S, 2);
		expectedDefaultValency.put(ChemEl.Se, 2);
		expectedDefaultValency.put(ChemEl.Te, 2);
		expectedDefaultValency.put(ChemEl.Po, 2);
		expectedDefaultValency.put(ChemEl.F, 1);
		expectedDefaultValency.put(ChemEl.Cl, 1);
		expectedDefaultValency.put(ChemEl.Br, 1);
		expectedDefaultValency.put(ChemEl.I, 1);
		expectedDefaultValency.put(ChemEl.At, 1);

		//in order of priority in the HW system
		valencyInHW.put(ChemEl.F, 1);
		valencyInHW.put(ChemEl.Cl, 1);
		valencyInHW.put(ChemEl.Br, 1);
		valencyInHW.put(ChemEl.I, 1);
		valencyInHW.put(ChemEl.O, 2);
		valencyInHW.put(ChemEl.S, 2);
		valencyInHW.put(ChemEl.Se, 2);
		valencyInHW.put(ChemEl.Te, 2);
		valencyInHW.put(ChemEl.N, 3);
		valencyInHW.put(ChemEl.P, 3);
		valencyInHW.put(ChemEl.As, 3);
		valencyInHW.put(ChemEl.Sb, 3);
valencyInHW.put(ChemEl.Bi, 3); valencyInHW.put(ChemEl.Si, 4); valencyInHW.put(ChemEl.Ge, 4); valencyInHW.put(ChemEl.Sn, 4); valencyInHW.put(ChemEl.Pb, 4); valencyInHW.put(ChemEl.B, 3); valencyInHW.put(ChemEl.Al, 3); valencyInHW.put(ChemEl.Ga, 3); valencyInHW.put(ChemEl.In, 3); valencyInHW.put(ChemEl.Tl, 3); valencyInHW.put(ChemEl.Hg, 2); valencyInHW.put(ChemEl.C, 4); possibleStableValencies.put(ChemEl.H, new HashMap<>()); possibleStableValencies.put(ChemEl.He, new HashMap<>()); possibleStableValencies.put(ChemEl.Li, new HashMap<>()); possibleStableValencies.put(ChemEl.Be, new HashMap<>()); possibleStableValencies.put(ChemEl.B, new HashMap<>()); possibleStableValencies.put(ChemEl.C, new HashMap<>()); possibleStableValencies.put(ChemEl.N, new HashMap<>()); possibleStableValencies.put(ChemEl.O, new HashMap<>()); possibleStableValencies.put(ChemEl.F, new HashMap<>()); possibleStableValencies.put(ChemEl.Ne, new HashMap<>()); possibleStableValencies.put(ChemEl.Na, new HashMap<>()); possibleStableValencies.put(ChemEl.Mg, new HashMap<>()); possibleStableValencies.put(ChemEl.Al, new HashMap<>()); possibleStableValencies.put(ChemEl.Si, new HashMap<>()); possibleStableValencies.put(ChemEl.P, new HashMap<>()); possibleStableValencies.put(ChemEl.S, new HashMap<>()); possibleStableValencies.put(ChemEl.Cl, new HashMap<>()); possibleStableValencies.put(ChemEl.Ar, new HashMap<>()); possibleStableValencies.put(ChemEl.K, new HashMap<>()); possibleStableValencies.put(ChemEl.Ca, new HashMap<>()); possibleStableValencies.put(ChemEl.Ga, new HashMap<>()); possibleStableValencies.put(ChemEl.Ge, new HashMap<>()); possibleStableValencies.put(ChemEl.As, new HashMap<>()); possibleStableValencies.put(ChemEl.Se, new HashMap<>()); possibleStableValencies.put(ChemEl.Br, new HashMap<>()); possibleStableValencies.put(ChemEl.Kr, new HashMap<>()); possibleStableValencies.put(ChemEl.Rb, new HashMap<>()); possibleStableValencies.put(ChemEl.Sr, new HashMap<>()); possibleStableValencies.put(ChemEl.In, new 
HashMap<>()); possibleStableValencies.put(ChemEl.Sn, new HashMap<>()); possibleStableValencies.put(ChemEl.Sb, new HashMap<>()); possibleStableValencies.put(ChemEl.Te, new HashMap<>()); possibleStableValencies.put(ChemEl.I, new HashMap<>()); possibleStableValencies.put(ChemEl.Xe, new HashMap<>()); possibleStableValencies.put(ChemEl.Cs, new HashMap<>()); possibleStableValencies.put(ChemEl.Ba, new HashMap<>()); possibleStableValencies.put(ChemEl.Tl, new HashMap<>()); possibleStableValencies.put(ChemEl.Pb, new HashMap<>()); possibleStableValencies.put(ChemEl.Bi, new HashMap<>()); possibleStableValencies.put(ChemEl.Po, new HashMap<>()); possibleStableValencies.put(ChemEl.At, new HashMap<>()); possibleStableValencies.put(ChemEl.Rn, new HashMap<>()); possibleStableValencies.put(ChemEl.Fr, new HashMap<>()); possibleStableValencies.put(ChemEl.Ra, new HashMap<>()); possibleStableValencies.get(ChemEl.H).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.He).put(0, new Integer[]{0}); possibleStableValencies.get(ChemEl.Li).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Be).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.B).put(0, new Integer[]{3}); possibleStableValencies.get(ChemEl.C).put(0, new Integer[]{4}); possibleStableValencies.get(ChemEl.N).put(0, new Integer[]{3}); possibleStableValencies.get(ChemEl.O).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.F).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ne).put(0, new Integer[]{0}); possibleStableValencies.get(ChemEl.Na).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Mg).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.Al).put(0, new Integer[]{3}); possibleStableValencies.get(ChemEl.Si).put(0, new Integer[]{4}); possibleStableValencies.get(ChemEl.P).put(0, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.S).put(0, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Cl).put(0, new Integer[]{1,3,5,7}); 
possibleStableValencies.get(ChemEl.Ar).put(0, new Integer[]{0}); possibleStableValencies.get(ChemEl.K).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ca).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.Ga).put(0, new Integer[]{3}); possibleStableValencies.get(ChemEl.Ge).put(0, new Integer[]{4}); possibleStableValencies.get(ChemEl.As).put(0, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Se).put(0, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Br).put(0, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Kr).put(0, new Integer[]{0,2}); possibleStableValencies.get(ChemEl.Rb).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Sr).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.In).put(0, new Integer[]{3}); possibleStableValencies.get(ChemEl.Sn).put(0, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.Sb).put(0, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Te).put(0, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.I).put(0, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Xe).put(0, new Integer[]{0,2,4,6,8}); possibleStableValencies.get(ChemEl.Cs).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ba).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.Tl).put(0, new Integer[]{1,3}); possibleStableValencies.get(ChemEl.Pb).put(0, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.Bi).put(0, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Po).put(0, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.At).put(0, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Rn).put(0, new Integer[]{0,2,4,6,8}); possibleStableValencies.get(ChemEl.Fr).put(0, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ra).put(0, new Integer[]{2}); possibleStableValencies.get(ChemEl.H).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Li).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Be).put(1, new 
Integer[]{1}); possibleStableValencies.get(ChemEl.Be).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.B).put(2, new Integer[]{1}); possibleStableValencies.get(ChemEl.B).put(1, new Integer[]{2}); possibleStableValencies.get(ChemEl.B).put(-1, new Integer[]{4}); possibleStableValencies.get(ChemEl.B).put(-2, new Integer[]{3}); possibleStableValencies.get(ChemEl.C).put(2, new Integer[]{2}); possibleStableValencies.get(ChemEl.C).put(1, new Integer[]{3}); possibleStableValencies.get(ChemEl.C).put(-1, new Integer[]{3}); possibleStableValencies.get(ChemEl.C).put(-2, new Integer[]{2}); possibleStableValencies.get(ChemEl.N).put(2, new Integer[]{3}); possibleStableValencies.get(ChemEl.N).put(1, new Integer[]{4}); possibleStableValencies.get(ChemEl.N).put(-1, new Integer[]{2}); possibleStableValencies.get(ChemEl.N).put(-2, new Integer[]{1}); possibleStableValencies.get(ChemEl.O).put(2, new Integer[]{4}); possibleStableValencies.get(ChemEl.O).put(1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.O).put(-1, new Integer[]{1}); possibleStableValencies.get(ChemEl.O).put(-2, new Integer[]{0}); possibleStableValencies.get(ChemEl.F).put(2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.F).put(1, new Integer[]{2}); possibleStableValencies.get(ChemEl.F).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Na).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Na).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Mg).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Al).put(3, new Integer[]{0}); possibleStableValencies.get(ChemEl.Al).put(2, new Integer[]{1}); possibleStableValencies.get(ChemEl.Al).put(1, new Integer[]{2}); possibleStableValencies.get(ChemEl.Al).put(-1, new Integer[]{4}); possibleStableValencies.get(ChemEl.Al).put(-2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Si).put(2, new Integer[]{2}); possibleStableValencies.get(ChemEl.Si).put(1, new Integer[]{3}); 
possibleStableValencies.get(ChemEl.Si).put(-1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Si).put(-2, new Integer[]{2}); possibleStableValencies.get(ChemEl.P).put(2, new Integer[]{3}); possibleStableValencies.get(ChemEl.P).put(1, new Integer[]{4}); possibleStableValencies.get(ChemEl.P).put(-1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.P).put(-2, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.S).put(2, new Integer[]{4}); possibleStableValencies.get(ChemEl.S).put(1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.S).put(-1, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.S).put(-2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Cl).put(2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Cl).put(1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Cl).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.K).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.K).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ca).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ca).put(1, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ga).put(3, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ga).put(2, new Integer[]{1}); possibleStableValencies.get(ChemEl.Ga).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ga).put(-1, new Integer[]{4}); possibleStableValencies.get(ChemEl.Ga).put(-2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Ge).put(4, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ge).put(1, new Integer[]{3}); possibleStableValencies.get(ChemEl.Ge).put(-1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Ge).put(-2, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.As).put(2, new Integer[]{3}); possibleStableValencies.get(ChemEl.As).put(1, new Integer[]{4}); possibleStableValencies.get(ChemEl.As).put(-1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.As).put(-2, new Integer[]{1,3,5,7}); 
possibleStableValencies.get(ChemEl.As).put(-3, new Integer[]{0}); possibleStableValencies.get(ChemEl.Se).put(2, new Integer[]{4}); possibleStableValencies.get(ChemEl.Se).put(1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Se).put(-1, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Se).put(-2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Br).put(2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Br).put(1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Br).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Rb).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Rb).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Sr).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Sr).put(1, new Integer[]{1}); possibleStableValencies.get(ChemEl.In).put(3, new Integer[]{0}); possibleStableValencies.get(ChemEl.In).put(2, new Integer[]{1}); possibleStableValencies.get(ChemEl.In).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.In).put(-1, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.In).put(-2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Sn).put(4, new Integer[]{0}); possibleStableValencies.get(ChemEl.Sn).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Sn).put(1, new Integer[]{3}); possibleStableValencies.get(ChemEl.Sn).put(-1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Sn).put(-2, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Sb).put(3, new Integer[]{0}); possibleStableValencies.get(ChemEl.Sb).put(2, new Integer[]{3}); possibleStableValencies.get(ChemEl.Sb).put(1, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.Sb).put(-1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Sb).put(-2, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Te).put(2, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.Te).put(1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Te).put(-1, new 
Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.Te).put(-2, new Integer[]{0}); possibleStableValencies.get(ChemEl.I).put(2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.I).put(1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.I).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Cs).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Cs).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ba).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ba).put(1, new Integer[]{1}); possibleStableValencies.get(ChemEl.Pb).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Pb).put(1, new Integer[]{3}); possibleStableValencies.get(ChemEl.Pb).put(-1, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.Pb).put(-2, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Bi).put(3, new Integer[]{0}); possibleStableValencies.get(ChemEl.Bi).put(2, new Integer[]{3}); possibleStableValencies.get(ChemEl.Bi).put(1, new Integer[]{2,4}); possibleStableValencies.get(ChemEl.Bi).put(-1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.Bi).put(-2, new Integer[]{1,3,5,7}); possibleStableValencies.get(ChemEl.At).put(2, new Integer[]{3,5}); possibleStableValencies.get(ChemEl.At).put(1, new Integer[]{2,4,6}); possibleStableValencies.get(ChemEl.At).put(-1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Fr).put(1, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ra).put(2, new Integer[]{0}); possibleStableValencies.get(ChemEl.Ra).put(1, new Integer[]{1}); } /** * Given a chemical element (e.g. Na) and charge (e.g. 
1) returns the highest stable valency that OPSIN knows is possible
	 * If the highest stable valency is not known for the particular combination of chemical element and charge, null is returned
	 * @param chemEl
	 * @param charge
	 * @return
	 */
	static Integer getMaximumValency(ChemEl chemEl, int charge) {
		Map<Integer, Integer[]> possibleStableValenciesForEl = possibleStableValencies.get(chemEl);
		if (possibleStableValenciesForEl != null){
			Integer[] possibleStableValenciesForElAndCharge = possibleStableValenciesForEl.get(charge);
			if (possibleStableValenciesForElAndCharge != null){
				return possibleStableValenciesForElAndCharge[possibleStableValenciesForElAndCharge.length - 1];
			}
		}
		return null;
	}

	/**
	 * Returns the lambda convention derived valency if set, otherwise returns the same as {@link #getMaximumValency(ChemEl, int)}
	 * Returns null if the maximum valency is not known
	 * @param a
	 * @return
	 */
	static Integer getMaximumValency(Atom a) {
		Integer maxVal;
		if (a.getLambdaConventionValency() != null) {
			maxVal = a.getLambdaConventionValency() + a.getProtonsExplicitlyAddedOrRemoved();
		}
		else{
			maxVal = getMaximumValency(a.getElement(), a.getCharge());
		}
		return maxVal;
	}

	/**
	 * Checks that the total incoming valency to an atom does not exceed its maximum known valency
	 * outValency e.g. on radicals is taken into account
	 * @param a
	 * @return true if the valency is within the maximum, or if the maximum valency is not known
	 */
	static boolean checkValency(Atom a) {
		int valency = a.getIncomingValency() + a.getOutValency();
		Integer maxVal = getMaximumValency(a);
		if(maxVal == null) {
			return true;
		}
		return valency <= maxVal;
	}

	/** Check whether valency is available on the atom to form a bond of the given order.
	 * spareValency and outValency are not taken into account.
	 * @param a atom you are interested in
	 * @param bondOrder order of bond required
	 * @return true if the bond would not exceed the maximum valency, or if the maximum valency is not known
	 */
	static boolean checkValencyAvailableForBond(Atom a, int bondOrder) {
		int valency = a.getIncomingValency() + bondOrder;
		Integer maxVal = getMaximumValency(a);
		if(maxVal == null) {
			return true;
		}
		return valency <= maxVal;
	}

	/** Check whether changing to a heteroatom will result in valency being exceeded
	 * spareValency and outValency are taken into account
	 * @param a atom you are interested in
	 * @param heteroatom atom which will be replacing it
	 * @return true if the replacement would not exceed the heteroatom's maximum valency, or if that maximum is not known
	 */
	static boolean checkValencyAvailableForReplacementByHeteroatom(Atom a, Atom heteroatom) {
		int valency = a.getIncomingValency();
		valency += a.hasSpareValency() ? 1 : 0;
		valency += a.getOutValency();
		Integer maxValOfHeteroAtom = getMaximumValency(heteroatom.getElement(), heteroatom.getCharge());
		return maxValOfHeteroAtom == null || valency <= maxValOfHeteroAtom;
	}

	/**
	 * Returns the default valency of an element when uncharged or null if unknown
	 * @param chemlEl
	 * @return
	 */
	static Integer getDefaultValency(ChemEl chemlEl) {
		return expectedDefaultValency.get(chemlEl);
	}

	/**
	 * Returns the valency of an element in the HW system (useful for deciding whether something should have double bonds in a ring) or null if unknown
	 * Note that the HW system makes no claim about valency when the atom is charged
	 * @param chemEl
	 * @return
	 */
	static Integer getHWValency(ChemEl chemEl) {
		return valencyInHW.get(chemEl);
	}

	/**
	 * Returns the possible stable valencies of an element with a given charge or null if unknown
	 * @param chemEl
	 * @param charge
	 * @return
	 */
	static Integer[] getPossibleValencies(ChemEl chemEl, int charge) {
		Map<Integer, Integer[]> possibleStableValenciesForEl = possibleStableValencies.get(chemEl);
		if (possibleStableValenciesForEl == null){
			return null;
		}
		return possibleStableValenciesForEl.get(charge);
	}
}
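/*
 * Illustrative sketch only, not part of OPSIN: a self-contained miniature of the
 * element -> charge -> stable valencies lookup performed by ValencyChecker.
 * The class name, the nested El enum and the method names are hypothetical;
 * the four table rows are copied from ValencyChecker's static initialiser.
 */
class ValencyLookupSketch {

	enum El { C, N, S }

	private static final java.util.Map<El, java.util.Map<Integer, Integer[]>> TABLE = new java.util.EnumMap<>(El.class);

	static {
		for (El el : El.values()) {
			TABLE.put(el, new java.util.HashMap<>());
		}
		TABLE.get(El.C).put(0, new Integer[]{4});
		TABLE.get(El.N).put(0, new Integer[]{3});
		TABLE.get(El.N).put(1, new Integer[]{4});
		TABLE.get(El.S).put(0, new Integer[]{2,4,6});
	}

	/** Highest listed valency (the arrays are in ascending order) or null if the element/charge pair is unknown */
	static Integer maxValency(El el, int charge) {
		Integer[] valencies = TABLE.get(el).get(charge);
		return valencies == null ? null : valencies[valencies.length - 1];
	}

	/** Mirrors checkValency: an unknown maximum is treated as acceptable */
	static boolean withinMaxValency(El el, int charge, int valency) {
		Integer max = maxValency(el, charge);
		return max == null || valency <= max;
	}

	public static void main(String[] args) {
		System.out.println(maxValency(El.S, 0));
		System.out.println(maxValency(El.N, 1));
		System.out.println(withinMaxValency(El.C, 0, 5));
		System.out.println(withinMaxValency(El.C, 3, 9));
	}
}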
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/WordRule.java000066400000000000000000000011301451751637500262550ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * The currently supported wordRules * All word rules mentioned here should have corresponding rules in StructureBuilder * @author dl387 * */ enum WordRule{ acetal, additionCompound, acidHalideOrPseudoHalide, acidReplacingFunctionalGroup, amineDiConjunctiveSuffix, anhydride, potentialAlcoholEster, carbonylDerivative, cyclicPeptide, divalentFunctionalGroup, ester, functionalClassEster, functionGroupAsGroup, glycol, glycolEther, monovalentFunctionalGroup, multiEster, oxide, polymer, simple, substituent }opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/WordRules.java000066400000000000000000001011531451751637500264460ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.io.IOException; import java.util.ArrayList; import java.util.Collections; import java.util.List; import java.util.Locale; import java.util.regex.Pattern; import javax.xml.stream.XMLStreamConstants; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamReader; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; /**The rules by which words are grouped together (e.g. 
in functional class nomenclature)
 *
 * @author dl387
 *
 */
class WordRules {

	/**The wordRules themselves.*/
	private final List<WordRuleDescription> wordRuleList;

	enum EndsWithGroup {
		acid,
		ateGroup;
	}

	private static final Pattern icOrOusAcid = Pattern.compile("(ic|ous)([ ]?acid)?$");
	private static final Pattern ateOrIteOrAmide = Pattern.compile("(at|it|amid)e?$");

	/**
	 * Describes a word that a wordRule is looking for
	 * @author dl387
	 *
	 */
	private static class WordDescription {
		/**Whether the word is a full word, substituent word or functionalTerm word*/
		private final WordType type;

		/**A group with a hardcoded method for efficient detection */
		private final EndsWithGroup endsWithGroup;

		/**A case insensitive pattern which attempts to match the end of the String value of the word*/
		private final Pattern endsWithPattern;

		/**The case insensitive String value of the word */
		private final String value;

		/** Only applicable for functionalTerms. The string value of the functionalTerm's type attribute*/
		private final String functionalGroupType;

		/** Only applicable for functionalTerms. The string value of the functionalTerm's subType attribute*/
		private final String functionalGroupSubType;

		/** The value of the type attribute of the last group element in the word e.g. aminoAcid*/
		private final String endsWithGroupType;

		/** The value of the subType attribute of the last group element in the word e.g. elementaryAtom*/
		private final String endsWithGroupSubType;

		/**
		 * Makes a description of a word to look for
		 * @param reader
		 */
		WordDescription(XMLStreamReader reader){
			WordType type = null;
			String value = null;
			EndsWithGroup endsWithGroup = null;
			Pattern endsWithPattern = null;
			String functionalGroupType = null;
			String functionalGroupSubType = null;
			String endsWithGroupType = null;
			String endsWithGroupSubType = null;
			for (int i = 0, l = reader.getAttributeCount(); i < l; i++) {
				String atrValue = reader.getAttributeValue(i);
				switch (reader.getAttributeLocalName(i)) {
				case "type":
					type = WordType.valueOf(atrValue);
					break;
				case "value":
					value = atrValue;
					break;
				case "functionalGroupType":
					functionalGroupType = atrValue;
					break;
				case "functionalGroupSubType":
					functionalGroupSubType = atrValue;
					break;
				case "endsWith":
					endsWithGroup = EndsWithGroup.valueOf(atrValue);
					break;
				case "endsWithRegex":
					endsWithPattern = Pattern.compile(atrValue +"$", Pattern.CASE_INSENSITIVE);
					break;
				case "endsWithGroupType":
					endsWithGroupType = atrValue;
					break;
				case "endsWithGroupSubType":
					endsWithGroupSubType = atrValue;
					break;
				default:
					break;
				}
			}
			if (type == null) {
				throw new RuntimeException("Malformed wordRule, no type specified");
			}
			this.type = type;
			this.endsWithGroup = endsWithGroup;
			this.endsWithPattern = endsWithPattern;
			this.value = value;
			this.functionalGroupType = functionalGroupType;
			this.functionalGroupSubType = functionalGroupSubType;
			this.endsWithGroupType = endsWithGroupType;
			this.endsWithGroupSubType = endsWithGroupSubType;
		}

		WordType getType() {
			return type;
		}

		EndsWithGroup getEndsWithGroup() {
			return endsWithGroup;
		}

		Pattern getEndsWithPattern() {
			return endsWithPattern;
		}

		String getValue() {
			return value;
		}

		String getFunctionalGroupType() {
			return functionalGroupType;
		}

		String getFunctionalGroupSubType() {
			return functionalGroupSubType;
		}

		String getEndsWithGroupType() {
			return endsWithGroupType;
		}

		String getEndsWithGroupSubType() {
			return endsWithGroupSubType;
		}
	}

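// Illustrative sketch, not part of OPSIN: how the end-of-word patterns used by
// WordDescription behave. The two constant expressions are copied from
// icOrOusAcid and ateOrIteOrAmide above, and compileEndsWith repeats the
// "endsWithRegex" handling in the constructor (append "$", compile case
// insensitively). The class name EndsWithSketch and the attribute value "ide"
// are assumptions made for the demo; "ide" is not claimed to appear in wordRules.xml.

```java
import java.util.regex.Pattern;

/** Standalone demo of suffix matching with Matcher.find(); hypothetical, not OPSIN code. */
class EndsWithSketch {

	// Same expressions as icOrOusAcid and ateOrIteOrAmide in WordRules.
	static final Pattern IC_OR_OUS_ACID = Pattern.compile("(ic|ous)([ ]?acid)?$");
	static final Pattern ATE_OR_ITE_OR_AMIDE = Pattern.compile("(at|it|amid)e?$");

	/** Anchors the attribute value to the end of the input and ignores case. */
	static Pattern compileEndsWith(String atrValue) {
		return Pattern.compile(atrValue + "$", Pattern.CASE_INSENSITIVE);
	}

	public static void main(String[] args) {
		// find() succeeds if any substring matches; the trailing "$" restricts that to the end.
		System.out.println(IC_OR_OUS_ACID.matcher("acetic acid").find());   // true
		System.out.println(IC_OR_OUS_ACID.matcher("ethane").find());        // false
		System.out.println(ATE_OR_ITE_OR_AMIDE.matcher("acetate").find());  // true
		System.out.println(ATE_OR_ITE_OR_AMIDE.matcher("ethanol").find());  // false
		System.out.println(compileEndsWith("ide").matcher("Chloride").find()); // true
	}
}
```

// Note that only patterns built from "endsWithRegex" are case insensitive; the two
// hardcoded patterns are compiled without flags and so match lowercase endings only.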
	/**
	 * A representation of a wordRule element from wordRules.xml
	 * @author dl387
	 *
	 */
	private static class WordRuleDescription {
		private final List<WordDescription> wordDescriptions;
		private final WordRule ruleName;
		private final WordType ruleType;

		List<WordDescription> getWordDescriptions() {
			return wordDescriptions;
		}

		WordRule getRuleName() {
			return ruleName;
		}

		WordType getRuleType() {
			return ruleType;
		}

		/**
		 * Creates a wordRule from a wordRule element found in wordRules.xml
		 * @param reader
		 * @throws XMLStreamException
		 */
		WordRuleDescription(XMLStreamReader reader) throws XMLStreamException {
			List<WordDescription> wordDescriptions = new ArrayList<>();
			ruleName = WordRule.valueOf(reader.getAttributeValue(null, "name"));
			ruleType = WordType.valueOf(reader.getAttributeValue(null,"type"));
			while (reader.hasNext()) {
				int event = reader.next();
				if (event == XMLStreamConstants.START_ELEMENT) {
					if (reader.getLocalName().equals("word")) {
						wordDescriptions.add(new WordDescription(reader));
					}
				}
				else if (event == XMLStreamConstants.END_ELEMENT) {
					if (reader.getLocalName().equals("wordRule")) {
						break;
					}
				}
			}
			this.wordDescriptions = Collections.unmodifiableList(wordDescriptions);
		}
	}

	/**Initialises the WordRules.
* @param resourceGetter * @throws IOException */ WordRules(ResourceGetter resourceGetter) throws IOException { List wordRuleList = new ArrayList<>(); XMLStreamReader reader = resourceGetter.getXMLStreamReader("wordRules.xml"); try { while (reader.hasNext()) { if (reader.next() == XMLStreamConstants.START_ELEMENT && reader.getLocalName().equals("wordRule")) { wordRuleList.add(new WordRuleDescription(reader)); } } } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading wordRules.xml", e); } finally { try { reader.close(); } catch (XMLStreamException e) { throw new IOException("Parsing exception occurred while reading wordRules.xml", e); } } this.wordRuleList = Collections.unmodifiableList(wordRuleList); } /**Takes a molecule element and places the word elements into wordRule elements * @param moleculeEl A molecule element with word children * @param n2sConfig * @param allowSpaceRemoval * @param componentRatios * @throws ParsingException */ void groupWordsIntoWordRules(Element moleculeEl, NameToStructureConfig n2sConfig, boolean allowSpaceRemoval, Integer[] componentRatios) throws ParsingException { WordRulesInstance instance = new WordRulesInstance(moleculeEl, n2sConfig, allowSpaceRemoval, componentRatios); List wordEls = moleculeEl.getChildElements(WORD_EL); //note that multiple words in wordEls may be later replaced by a wordRule element for (int i = 0; i wordRuleEls = moleculeEl.getChildElements(); for (Element wordRuleEl : wordRuleEls) { if (!wordRuleEl.getName().equals(WORDRULE_EL)){ throw new ParsingException("Unable to assign wordRule to: " + wordRuleEl.getAttributeValue(VALUE_ATR)); } } } private class WordRulesInstance { private final Element moleculeEl; private final boolean allowRadicals; private final boolean allowSpaceRemoval; private final Integer expectedNumOfComponents; WordRulesInstance(Element moleculeEl, NameToStructureConfig n2sConfig, boolean allowSpaceRemoval, Integer[] componentRatios) { this.moleculeEl 
= moleculeEl; this.allowRadicals = n2sConfig.isAllowRadicals(); this.allowSpaceRemoval = allowSpaceRemoval; this.expectedNumOfComponents = componentRatios != null ? componentRatios.length : null; } private boolean matchWordRule(List wordEls, int indexOfFirstWord) throws ParsingException { wordRuleLoop: for (WordRuleDescription wordRuleDesc : wordRuleList) { int i = indexOfFirstWord; List wordDescriptions = wordRuleDesc.getWordDescriptions(); int wordsInWordRule = wordDescriptions.size(); if (i + wordsInWordRule <= wordEls.size()) {//need sufficient words to match the word rule for (int j = 0; j < wordsInWordRule; j++) { Element wordEl = wordEls.get(i + j); WordDescription wd = wordDescriptions.get(j); if (!wd.getType().toString().equals(wordEl.getAttributeValue(TYPE_ATR))){ continue wordRuleLoop;//type mismatch; } String functionalGroupTypePredicate = wd.getFunctionalGroupType(); String functionalGroupSubTypePredicate = wd.getFunctionalGroupSubType(); if (functionalGroupTypePredicate != null || functionalGroupSubTypePredicate != null) { if (!WordType.functionalTerm.toString().equals(wordEl.getAttributeValue(TYPE_ATR))){ continue wordRuleLoop; } Element lastEl = getLastElementInWord(wordEl); if (lastEl == null) { throw new ParsingException("OPSIN Bug: Cannot find the functional element in a functionalTerm"); } while (lastEl.getName().equals(CLOSEBRACKET_EL) || lastEl.getName().equals(STRUCTURALCLOSEBRACKET_EL)) { lastEl = OpsinTools.getPreviousSibling(lastEl); if (lastEl == null) { throw new ParsingException("OPSIN Bug: Cannot find the functional element in a functionalTerm"); } } if (functionalGroupTypePredicate != null && !functionalGroupTypePredicate.equals(lastEl.getAttributeValue(TYPE_ATR))) { continue wordRuleLoop; } if (functionalGroupSubTypePredicate != null && !functionalGroupSubTypePredicate.equals(lastEl.getAttributeValue(SUBTYPE_ATR))) { continue wordRuleLoop; } } EndsWithGroup endsWithGroupPredicate = wd.getEndsWithGroup(); if (endsWithGroupPredicate != 
null && !endsWithGroupPredicateSatisfied(wordEl, endsWithGroupPredicate)) { continue wordRuleLoop; } String valuePredicate = wd.getValue(); if (valuePredicate != null && !wordEl.getAttributeValue(VALUE_ATR).toLowerCase(Locale.ROOT).equals(valuePredicate)){//word string contents mismatch continue wordRuleLoop; } Pattern endsWithPatternPredicate = wd.getEndsWithPattern(); if (endsWithPatternPredicate != null) { if (!endsWithPatternPredicate.matcher(wordEl.getAttributeValue(VALUE_ATR)).find()){ continue wordRuleLoop; } } String endsWithGroupTypePredicate = wd.getEndsWithGroupType(); if (endsWithGroupTypePredicate != null) { Element lastGroupInWordRule = getLastGroupInWordRule(wordEl); if (lastGroupInWordRule == null || !endsWithGroupTypePredicate.equals(lastGroupInWordRule.getAttributeValue(TYPE_ATR))){ continue wordRuleLoop; } } String endsWithSubGroupTypePredicate = wd.getEndsWithGroupSubType(); if (endsWithSubGroupTypePredicate != null) { Element lastGroupInWordRule = getLastGroupInWordRule(wordEl); if (lastGroupInWordRule == null || !endsWithSubGroupTypePredicate.equals(lastGroupInWordRule.getAttributeValue(SUBTYPE_ATR))){ continue wordRuleLoop; } } } //Word Rule matches! 
					Element wordRuleEl = new GroupingEl(WORDRULE_EL);
					WordRule wordRule = wordRuleDesc.getRuleName();
					wordRuleEl.addAttribute(new Attribute(TYPE_ATR, wordRuleDesc.getRuleType().toString()));
					wordRuleEl.addAttribute(new Attribute(WORDRULE_ATR, wordRule.toString()));

					/*
					 * Some wordRules cannot be entirely processed at the structure building stage
					 */
					switch (wordRule) {
					case functionGroupAsGroup:
						//convert the functional term into a full term
						Element functionalWord = wordEls.get(i + wordsInWordRule -1);
						if (!functionalWord.getAttributeValue(TYPE_ATR).equals(FUNCTIONALTERM_EL) || wordsInWordRule>2){
							throw new ParsingException("OPSIN bug: Problem with functionGroupAsGroup wordRule");
						}
						convertFunctionalGroupIntoGroup(functionalWord);
						if (wordsInWordRule==2){
							joinWords(wordEls, wordEls.get(i), functionalWord);
							wordsInWordRule =1;
						}
						wordRuleEl.getAttribute(WORDRULE_ATR).setValue(WordRule.simple.toString());
						break;
					case carbonylDerivative:
					case acidReplacingFunctionalGroup:
						//e.g. acetone 4,4-diphenylsemicarbazone.
This is better expressed as a full word as the substituent actually locants onto the functional term for (int j = 1; j < (wordsInWordRule - 1); j++) { Element wordEl = wordEls.get(i + j); if (WordType.substituent.toString().equals(wordEl.getAttributeValue(TYPE_ATR))) { joinWords(wordEls, wordEls.get(i + j), wordEls.get(i + j + 1)); wordsInWordRule--; List functionalTerm = OpsinTools.getDescendantElementsWithTagName(wordEls.get(i + j), FUNCTIONALTERM_EL);//rename functionalTerm element to root if (functionalTerm.size() != 1){ throw new ParsingException("OPSIN bug: Problem with "+ wordRule +" wordRule"); } functionalTerm.get(0).setName(ROOT_EL); List functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalTerm.get(0), FUNCTIONALGROUP_EL);//rename functionalGroup element to group if (functionalGroups.size() != 1){ throw new ParsingException("OPSIN bug: Problem with "+ wordRule +" wordRule"); } functionalGroups.get(0).setName(GROUP_EL); wordEls.get(i + j).getAttribute(TYPE_ATR).setValue(WordType.full.toString()); } } break; case additionCompound: case oxide: //is the halide/pseudohalide/oxide actually a counterion rather than covalently bonded Element possibleElementaryAtomContainingWord = wordEls.get(i); List elementaryAtoms = OpsinTools.getDescendantElementsWithTagNameAndAttribute(possibleElementaryAtomContainingWord, GROUP_EL, TYPE_ATR, ELEMENTARYATOM_TYPE_VAL); if (elementaryAtoms.size() == 1) { Element elementaryAtom = elementaryAtoms.get(0); ChemEl chemEl1 = getChemElFromElementaryAtomEl(elementaryAtom); if (wordRule == WordRule.oxide) { if (wordsInWordRule != 2){ throw new ParsingException("OPSIN bug: Problem with "+ wordRule +" wordRule"); } Element oxideWord = wordEls.get(i + 1); ChemEl chemEl2 = getChemElFromWordWithFunctionalGroup(oxideWord); if (!FragmentTools.isCovalent(chemEl1, chemEl2) || chemEl1 == ChemEl.Ag){ Element oxideGroup = convertFunctionalGroupIntoGroup(oxideWord); setOxideStructureAppropriately(oxideGroup, elementaryAtom); 
applySimpleWordRule(wordEls, indexOfFirstWord, possibleElementaryAtomContainingWord); continue wordRuleLoop; } } else { for (int j = 1; j < wordsInWordRule; j++) { Element functionalGroup = wordEls.get(i + j); ChemEl chemEl2 = getChemElFromWordWithFunctionalGroup(functionalGroup); if (!FragmentTools.isCovalent(chemEl1, chemEl2)) {//use separate word rules for ionic components boolean specialCaseCovalency = false; if (chemEl2.isHalogen() && wordsInWordRule == 2) { switch (chemEl1) { case Mg: if (possibleElementaryAtomContainingWord.getChildCount() > 1) { //treat grignards (i.e. substitutedmagnesium halides) as covalent specialCaseCovalency = true; } break; case Al: if (chemEl2 == ChemEl.Cl || chemEl2 == ChemEl.Br || chemEl2 == ChemEl.I) { specialCaseCovalency = true; } break; case Ti: if (oxidationNumberOrMultiplierIs(elementaryAtom, functionalGroup, 4) && (chemEl2 == ChemEl.Cl || chemEl2 == ChemEl.Br || chemEl2 == ChemEl.I)) { specialCaseCovalency = true; } break; case V: if (oxidationNumberOrMultiplierIs(elementaryAtom, functionalGroup, 4) && chemEl2 == ChemEl.Cl) { specialCaseCovalency = true; } break; case Zr: case Hf: if (oxidationNumberOrMultiplierIs(elementaryAtom, functionalGroup, 4) && chemEl2 == ChemEl.Br) { specialCaseCovalency = true; } break; case U: if (oxidationNumberOrMultiplierIs(elementaryAtom, functionalGroup, 6) && (chemEl2 == ChemEl.F || chemEl2 == ChemEl.Cl)) { specialCaseCovalency = true; } break; case Np: case Pu: if (oxidationNumberOrMultiplierIs(elementaryAtom, functionalGroup, 6) && chemEl2 == ChemEl.F) { specialCaseCovalency = true; } break; default: break; } } else if ((chemEl2 == ChemEl.H || chemEl2 == ChemEl.C ) && wordsInWordRule == 2) { if (chemEl1 == ChemEl.Al) { //organoaluminium and aluminium hydrides are covalent specialCaseCovalency = true; } } if (!specialCaseCovalency) { continue wordRuleLoop; } } } } } break; case potentialAlcoholEster: if (expectedNumOfComponents != null && expectedNumOfComponents == 
moleculeEl.getChildCount()) { //don't apply this wordRule if doing so makes the number of components incorrect continue wordRuleLoop; } int lastWordIndex = indexOfFirstWord + wordsInWordRule - 1; if (wordEls.get(lastWordIndex).getAttribute(ISSALT_ATR) != null) { //explicitly stated to be a salt, so shouldn't be bonded! continue wordRuleLoop; } if (lastWordIndex + 1 < wordEls.size()) { Element nextWord = wordEls.get(lastWordIndex + 1); if (WordType.functionalTerm.toString().equals(nextWord.getAttributeValue(TYPE_ATR)) && nextWord.getAttributeValue(VALUE_ATR).equalsIgnoreCase("salt")) { //explicitly stated to be a salt, so shouldn't be bonded! continue wordRuleLoop; } } break; case monovalentFunctionalGroup: Element potentialOxy = getLastElementInWord(wordEls.get(0)); String val = potentialOxy.getValue(); if (val.equals("oxy") || val.equals("oxo")) { throw new ParsingException(wordEls.get(0).getValue() + wordEls.get(1).getValue() +" is unlikely to be intended to be a molecule"); } break; default: break; } List wordValues = new ArrayList<>(); Element parentEl = wordEls.get(i).getParent(); int indexToInsertAt = parentEl.indexOf(wordEls.get(i)); for (int j = 0; j < wordsInWordRule; j++) { Element wordEl = wordEls.remove(i); wordEl.detach(); wordRuleEl.addChild(wordEl); wordValues.add(wordEl.getAttributeValue(VALUE_ATR)); } wordRuleEl.addAttribute(new Attribute(VALUE_ATR, StringTools.stringListToString(wordValues, " ")));//The bare string of all the words under this wordRule parentEl.insertChild(wordRuleEl, indexToInsertAt); wordEls.add(i, wordRuleEl); return true; } } Element firstWord = wordEls.get(indexOfFirstWord); if (firstWord.getName().equals(WORD_EL) && WordType.full.toString().equals(firstWord.getAttributeValue(TYPE_ATR))){//No wordRule -->wordRule="simple" applySimpleWordRule(wordEls, indexOfFirstWord, firstWord); return false; } else if (allowSpaceRemoval && WordType.substituent.toString().equals(firstWord.getAttributeValue(TYPE_ATR))){ /* * substituents may 
join together or to a full e.g. 2-ethyl toluene -->2-ethyltoluene * 1-chloro 2-bromo ethane --> 1-chloro-2-bromo ethane then subsequently 1-chloro-2-bromo-ethane */ if (indexOfFirstWord +1 < wordEls.size()){ Element wordToPotentiallyCombineWith = wordEls.get(indexOfFirstWord +1); if (WordType.full.toString().equals(wordToPotentiallyCombineWith.getAttributeValue(TYPE_ATR)) || WordType.substituent.toString().equals(wordToPotentiallyCombineWith.getAttributeValue(TYPE_ATR))){ joinWords(wordEls, firstWord, wordToPotentiallyCombineWith); return true; } } } else if (WordType.functionalTerm.toString().equals(firstWord.getAttributeValue(TYPE_ATR)) && firstWord.getAttributeValue(VALUE_ATR).equalsIgnoreCase("salt")) { if (indexOfFirstWord == 0) { throw new ParsingException("The word salt appeared in an unexpected location"); } Element previousWord = wordEls.get(indexOfFirstWord - 1); if (previousWord.getAttribute(ISSALT_ATR) == null) { previousWord.addAttribute(ISSALT_ATR, "yes"); } wordEls.remove(indexOfFirstWord); firstWord.detach(); if (moleculeEl.getAttribute(ISSALT_ATR) == null) { moleculeEl.addAttribute(ISSALT_ATR, "yes"); } return true; } if (wordEls.size() == 1 && indexOfFirstWord == 0 && firstWord.getName().equals(WORD_EL) && WordType.substituent.toString().equals(firstWord.getAttributeValue(TYPE_ATR))) { if (firstWord.getAttributeValue(VALUE_ATR).equalsIgnoreCase("dihydrogen")) { convertToDihydrogenMolecule(firstWord); return true; } if (allowRadicals) { //name is all one substituent, make this a substituent and finish applySubstituentWordRule(wordEls, indexOfFirstWord, firstWord); } } return false; } private boolean endsWithGroupPredicateSatisfied(Element wordEl, EndsWithGroup endsWithGroupPredicate) throws ParsingException { Element lastEl = getLastElementInWord(wordEl); if (lastEl == null) { return false; } String elName = lastEl.getName(); while (elName.equals(CLOSEBRACKET_EL) || elName.equals(STRUCTURALCLOSEBRACKET_EL) || elName.equals(ISOTOPESPECIFICATION_EL)) 
{ lastEl = OpsinTools.getPreviousSibling(lastEl); if (lastEl == null) { return false; } elName = lastEl.getName(); } if (endsWithGroupPredicate == EndsWithGroup.acid) { if (elName.equals(SUFFIX_EL)) { if (icOrOusAcid.matcher(lastEl.getAttributeValue(VALUE_ATR)).find()) { return true; } } else if (elName.equals(GROUP_EL)) { if (lastEl.getAttribute(FUNCTIONALIDS_ATR) != null && (icOrOusAcid.matcher(lastEl.getValue()).find() || AMINOACID_TYPE_VAL.equals(lastEl.getAttributeValue(TYPE_ATR)))) { return true; } } } else if (endsWithGroupPredicate == EndsWithGroup.ateGroup) { if (elName.equals(GROUP_EL)) { if (lastEl.getAttribute(FUNCTIONALIDS_ATR) != null && ateOrIteOrAmide.matcher(lastEl.getValue()).find()) { return true; } } else { while (lastEl != null && elName.equals(SUFFIX_EL)) { String suffixValAtr = lastEl.getAttributeValue(VALUE_ATR); if (ateOrIteOrAmide.matcher(suffixValAtr).find() || suffixValAtr.equals("glycoside")) { return true; } //glycoside is not always the last suffix lastEl = OpsinTools.getPreviousSibling(lastEl, SUFFIX_EL); } } } return false; } private boolean oxidationNumberOrMultiplierIs(Element elementaryAtomEl, Element functionalGroupWord, int expectedVal) throws ParsingException { List functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalGroupWord, FUNCTIONALGROUP_EL); if (functionalGroups.size() != 1) { throw new ParsingException("OPSIN bug: Unable to find functional group in oxide or addition compound rule"); } Element possibleMultiplier = OpsinTools.getPreviousSibling(functionalGroups.get(0)); if (possibleMultiplier != null && possibleMultiplier.getName().equals(MULTIPLIER_EL)) { return Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)) == expectedVal; } else { Element possibleOxidationNumber = OpsinTools.getNextSibling(elementaryAtomEl); if(possibleOxidationNumber != null && possibleOxidationNumber.getName().equals(OXIDATIONNUMBERSPECIFIER_EL)) { return 
Integer.parseInt(possibleOxidationNumber.getAttributeValue(VALUE_ATR)) == expectedVal; } } return false; } private Element getLastGroupInWordRule(Element wordEl) { Element lastEl = getLastElementInWord(wordEl); if (lastEl.getName().equals(GROUP_EL)) { return lastEl; } else{ List groups = lastEl.getParent().getChildElements(GROUP_EL); if (groups.size() > 0) { return groups.get(groups.size() - 1); } } return null; } private Element getLastElementInWord(Element wordEl) { List children = wordEl.getChildElements(); Element lastChild = children.get(children.size() - 1); while (lastChild.getChildCount() != 0) { children = lastChild.getChildElements(); lastChild = children.get(children.size() - 1); } return lastChild; } private void applySimpleWordRule(List wordEls, int indexOfFirstWord, Element firstWord) { Element parentEl = firstWord.getParent(); int indexToInsertAt = parentEl.indexOf(firstWord); Element wordRuleEl = new GroupingEl(WORDRULE_EL); wordRuleEl.addAttribute(new Attribute(WORDRULE_ATR, WordRule.simple.toString()));//No wordRule wordRuleEl.addAttribute(new Attribute(TYPE_ATR, WordType.full.toString())); wordRuleEl.addAttribute(new Attribute(VALUE_ATR, firstWord.getAttributeValue(VALUE_ATR))); firstWord.detach(); wordRuleEl.addChild(firstWord); wordEls.set(indexOfFirstWord, wordRuleEl); parentEl.insertChild(wordRuleEl, indexToInsertAt); } private void applySubstituentWordRule(List wordEls, int indexOfFirstWord, Element firstWord) { Element parentEl = firstWord.getParent(); int indexToInsertAt = parentEl.indexOf(firstWord); Element wordRuleEl = new GroupingEl(WORDRULE_EL); wordRuleEl.addAttribute(new Attribute(WORDRULE_ATR, WordRule.substituent.toString())); wordRuleEl.addAttribute(new Attribute(TYPE_ATR, WordType.full.toString())); wordRuleEl.addAttribute(new Attribute(VALUE_ATR, firstWord.getAttributeValue(VALUE_ATR))); firstWord.detach(); wordRuleEl.addChild(firstWord); wordEls.set(indexOfFirstWord, wordRuleEl); parentEl.insertChild(wordRuleEl, 
indexToInsertAt); } /** * Merges two adjacent words * The latter word (wordToPotentiallyCombineWith) is merged into the former and removed from wordEls * @param wordEls * @param firstWord * @param wordToPotentiallyCombineWith * @throws ParsingException */ private void joinWords(List wordEls, Element firstWord, Element wordToPotentiallyCombineWith) throws ParsingException { wordEls.remove(wordToPotentiallyCombineWith); wordToPotentiallyCombineWith.detach(); List substituentEls = firstWord.getChildElements(SUBSTITUENT_EL); if (substituentEls.isEmpty()){ throw new ParsingException("OPSIN Bug: Substituent element not found where substituent element expected"); } Element finalSubstituent = substituentEls.get(substituentEls.size() - 1); if (!finalSubstituent.getLastChildElement().getName().equals(HYPHEN_EL)){//add an implicit hyphen if one is not already present Element implicitHyphen = new TokenEl(HYPHEN_EL, "-"); finalSubstituent.addChild(implicitHyphen); } List elementsToMergeIntoSubstituent = wordToPotentiallyCombineWith.getChildElements(); for (int j = elementsToMergeIntoSubstituent.size() -1 ; j >=0; j--) { Element el = elementsToMergeIntoSubstituent.get(j); el.detach(); OpsinTools.insertAfter(finalSubstituent, el); } if (WordType.full.toString().equals(wordToPotentiallyCombineWith.getAttributeValue(TYPE_ATR))){ firstWord.getAttribute(TYPE_ATR).setValue(WordType.full.toString()); } firstWord.getAttribute(VALUE_ATR).setValue(firstWord.getAttributeValue(VALUE_ATR) + wordToPotentiallyCombineWith.getAttributeValue(VALUE_ATR)); } private Element convertFunctionalGroupIntoGroup(Element word) throws ParsingException { word.getAttribute(TYPE_ATR).setValue(WordType.full.toString()); List functionalTerms = OpsinTools.getDescendantElementsWithTagName(word, FUNCTIONALTERM_EL); if (functionalTerms.size() != 1){ throw new ParsingException("OPSIN Bug: Exactly 1 functionalTerm expected in functionalGroupAsGroup wordRule"); } Element functionalTerm = functionalTerms.get(0); 
functionalTerm.setName(ROOT_EL); List functionalGroups = functionalTerm.getChildElements(FUNCTIONALGROUP_EL); if (functionalGroups.size() != 1){ throw new ParsingException("OPSIN Bug: Exactly 1 functionalGroup expected in functionalGroupAsGroup wordRule"); } Element functionalGroup = functionalGroups.get(0); functionalGroup.setName(GROUP_EL); functionalGroup.getAttribute(TYPE_ATR).setValue(SIMPLEGROUP_TYPE_VAL); functionalGroup.addAttribute(new Attribute(SUBTYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL)); return functionalGroup; } private void convertToDihydrogenMolecule(Element word) { word.getAttribute(TYPE_ATR).setValue(WordType.full.toString()); for (int i = word.getChildCount() - 1; i >=0; i--) { word.removeChild(i); } Element root = new GroupingEl(ROOT_EL); Element group = new TokenEl(GROUP_EL); group.addAttribute(TYPE_ATR, SIMPLEGROUP_TYPE_VAL); group.addAttribute(SUBTYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL); group.addAttribute(VALUE_ATR, "[H][H]"); group.setValue("dihydrogen"); root.addChild(group); word.addChild(root); } /** * Sets the SMILES of the oxide group to be something like [O-2] * ... 
unless the oxide group is multiplied and the elementaryAtom has no oxidation states greater 2 * in which case [O-][O-] would be assumed * @param oxideGroup * @param elementaryAtom */ private void setOxideStructureAppropriately(Element oxideGroup, Element elementaryAtom) { boolean chainInterpretation = false; Integer multiplierVal = null; Element possibleMultiplier = OpsinTools.getPreviousSibling(oxideGroup); if (possibleMultiplier != null && possibleMultiplier.getName().equals(MULTIPLIER_EL)){ multiplierVal = Integer.parseInt(possibleMultiplier.getAttributeValue(VALUE_ATR)); if (multiplierVal > 1) { String commonOxidationStatesAndMax = elementaryAtom.getAttributeValue(COMMONOXIDATIONSTATESANDMAX_ATR); if (commonOxidationStatesAndMax == null || Integer.parseInt(commonOxidationStatesAndMax.split(":")[1]) <= 2){ chainInterpretation = true; } } } Attribute value = oxideGroup.getAttribute(VALUE_ATR); String smiles = value.getValue(); String element; if (smiles.equals("O")){ element = "O"; } else if (smiles.equals("S")){ element = "S"; } else if (smiles.startsWith("[Se")){ element = "Se"; } else if (smiles.startsWith("[Te")){ element = "Te"; } else{ throw new RuntimeException("OPSIN Bug: Unexpected smiles for oxideGroup: " + smiles); } if (chainInterpretation){ StringBuilder sb = new StringBuilder(); sb.append('['); sb.append(element); sb.append("-]"); for (int i = 2; i < multiplierVal; i++) { sb.append('['); sb.append(element); sb.append(']'); } sb.append('['); sb.append(element); sb.append("-]"); value.setValue(sb.toString()); possibleMultiplier.detach(); } else{ value.setValue("[" + element + "-2]"); } } private ChemEl getChemElFromElementaryAtomEl(Element elementaryAtomEl) { String elementStr = elementaryAtomEl.getAttributeValue(VALUE_ATR); if (elementStr.startsWith("[")) { int len = elementStr.length() - 1; for (int i = 1; i < len; i++) { char ch = elementStr.charAt(i); if ((ch >= 'A' && ch <='Z') || (ch >= 'a' && ch <='z')) { if (i + 1 < len) { char ch2 = 
elementStr.charAt(i + 1); if ((ch2 >= 'A' && ch2 <='Z') || (ch2 >= 'a' && ch2 <='z')) { //two letter element elementStr = elementStr.substring(i, i + 2); break; } } //one letter element elementStr = elementStr.substring(i, i + 1); break; } } } return ChemEl.valueOf(elementStr); } private ChemEl getChemElFromWordWithFunctionalGroup(Element functionalWord) throws ParsingException { List functionalGroups = OpsinTools.getDescendantElementsWithTagName(functionalWord, FUNCTIONALGROUP_EL); if (functionalGroups.size() != 1){ throw new ParsingException("OPSIN bug: Unable to find functional group in oxide or addition compound rule"); } String smiles = functionalGroups.get(0).getAttributeValue(VALUE_ATR); String elementStr = ""; for (int i = 0; i < smiles.length(); i++) { if (Character.isUpperCase(smiles.charAt(i))){ elementStr += smiles.charAt(i); if (i + 1 wordRules = OpsinTools.getDescendantElementsWithTagName(parse, WORDRULE_EL); for (Element wordRule : wordRules) { WordRule wordRuleVal = WordRule.valueOf(wordRule.getAttributeValue(WORDRULE_ATR)); if (wordRuleVal == WordRule.divalentFunctionalGroup){ checkAndCorrectOmittedSpacesInDivalentFunctionalGroupRule(wordRule); } else if (wordRuleVal == WordRule.simple){ //note that this function may change the word rule to ester checkAndCorrectOmittedSpaceEster(wordRule); } } } /** * Corrects cases like "methylethyl ether" to "methyl ethyl ether" * @param divalentFunctionalGroupWordRule */ private void checkAndCorrectOmittedSpacesInDivalentFunctionalGroupRule(Element divalentFunctionalGroupWordRule) { List substituentWords = OpsinTools.getChildElementsWithTagNameAndAttribute(divalentFunctionalGroupWordRule, WORD_EL, TYPE_ATR, SUBSTITUENT_TYPE_VAL); if (substituentWords.size() == 1){//potentially has been "wrongly" interpreted e.g. 
ethylmethyl ketone is more likely to mean ethyl methyl ketone List children = OpsinTools.getChildElementsWithTagNames(substituentWords.get(0), new String[]{SUBSTITUENT_EL, BRACKET_EL}); if (children.size() == 2) { Element firstSubOrbracket = children.get(0); //rule out correct usage e.g. diethyl ether and locanted substituents e.g. 2-methylpropyl ether if (firstSubOrbracket.getAttribute(LOCANT_ATR) == null && firstSubOrbracket.getAttribute(MULTIPLIER_ATR) == null) { Element firstGroup = findRightMostGroupInSubBracketOrRoot(firstSubOrbracket); Fragment firstFrag = firstGroup.getFrag(); if (hasSingleMonoValentCarbonOrSiliconRadical(firstFrag)) { Element subToMove =children.get(1); subToMove.detach(); Element newWord =new GroupingEl(WORD_EL); newWord.addAttribute(new Attribute(TYPE_ATR, SUBSTITUENT_TYPE_VAL)); newWord.addChild(subToMove); OpsinTools.insertAfter(substituentWords.get(0), newWord); } } } } } /** * Corrects cases like methyl-2-ethylacetate --> methyl 2-ethylacetate * @param wordRule * @throws StructureBuildingException */ private void checkAndCorrectOmittedSpaceEster(Element wordRule) throws StructureBuildingException { List words = wordRule.getChildElements(WORD_EL); if (words.size() != 1) { return; } Element word = words.get(0); String wordRuleContents = wordRule.getAttributeValue(VALUE_ATR); if (matchAteOrIteEnding.matcher(wordRuleContents).find()) { List children = OpsinTools.getChildElementsWithTagNames(word, new String[]{SUBSTITUENT_EL, BRACKET_EL, ROOT_EL}); if (children.size() >= 2) { Element rootEl = children.get(children.size() - 1); Element rootGroup = findRightMostGroupInSubBracketOrRoot(rootEl); Fragment rootFrag = rootGroup.getFrag(); int functionalAtomsCount = rootFrag.getFunctionalAtomCount(); int rootMultiplier = 1; String rootElMultiplierAtrVal = rootEl.getAttributeValue(MULTIPLIER_ATR); if (rootElMultiplierAtrVal != null) { rootMultiplier = Integer.parseInt(rootElMultiplierAtrVal); functionalAtomsCount *= rootMultiplier; } if 
(functionalAtomsCount > 0){ List substituents = children.subList(0, children.size() - 1); int substituentCount = substituents.size(); if (substituentCount == 1 && rootMultiplier > 1) { return; } Element firstChild = substituents.get(0); if (!checkSuitabilityOfSubstituentForEsterFormation(firstChild, functionalAtomsCount)){ if (firstChild.getAttribute(LOCANT_ATR) != null) { //Check for cases like 4-chlorophenyl-3-aminobenzoate i.e. 4-chlorophenyl is the substituent Integer lastSubOrBracketWithoutLocantIdx = null; for (int i = 1; i < substituents.size(); i++) { Element subOrBracket = substituents.get(i); if (subOrBracket.getAttribute(LOCANT_ATR) == null) { if (!checkSuitabilityOfSubstituentForEsterFormation(subOrBracket, 1)) { //shouldn't have a multiplier as preceding substituent needs to connect to this via locanted substitution return; } lastSubOrBracketWithoutLocantIdx = i; break; } } if (lastSubOrBracketWithoutLocantIdx != null && substitutionWouldBeAmbiguous(rootFrag, null)) { List elsToFormEsterSub = new ArrayList<>(); for (int i = 0; i <= lastSubOrBracketWithoutLocantIdx ; i++) { elsToFormEsterSub.add(substituents.get(i)); } transformToEster(wordRule, elsToFormEsterSub); } } return; } String multiplierValue = firstChild.getAttributeValue(MULTIPLIER_ATR); if (specialCaseWhereEsterPreferred(findRightMostGroupInSubBracketOrRoot(firstChild), multiplierValue, rootGroup, substituentCount)) { transformToEster(wordRule, firstChild); } else if (substituentCount > 1 && (allBarFirstSubstituentHaveLocants(substituents) || insufficientSubstitutableHydrogenForSubstitution(substituents, rootFrag, rootMultiplier))){ transformToEster(wordRule, firstChild); } else if ((substituentCount == 1 || rootMultiplier > 1) && substitutionWouldBeAmbiguous(rootFrag, multiplierValue)) { //either 1 substituent or multiplicative nomenclature (in the multiplicative nomenclature case many substituents will not have locants) transformToEster(wordRule, firstChild); } } } } } private boolean 
allBarFirstSubstituentHaveLocants(List substituentsAndBrackets) { if (substituentsAndBrackets.size() <=1){ return false; } for (int i = 1; i < substituentsAndBrackets.size(); i++) { if (substituentsAndBrackets.get(i).getAttribute(LOCANT_ATR)==null){ return false; } } return true; } private boolean insufficientSubstitutableHydrogenForSubstitution(List substituentsAndBrackets, Fragment frag, int rootMultiplier) { int substitutableHydrogens = getAtomForEachSubstitutableHydrogen(frag).size() * rootMultiplier; for (int i = 1; i < substituentsAndBrackets.size(); i++) { Element subOrBracket = substituentsAndBrackets.get(i); Fragment f = findRightMostGroupInSubBracketOrRoot(subOrBracket).getFrag(); String multiplierValue = subOrBracket.getAttributeValue(MULTIPLIER_ATR); int multiplier = 1; if (multiplierValue != null){ multiplier = Integer.parseInt(multiplierValue); } substitutableHydrogens -= (getTotalOutAtomValency(f) * multiplier); } Element potentialEsterSub = substituentsAndBrackets.get(0); int firstFragSubstitutableHydrogenRequired = getTotalOutAtomValency(findRightMostGroupInSubBracketOrRoot(potentialEsterSub).getFrag()); String multiplierValue = potentialEsterSub.getAttributeValue(MULTIPLIER_ATR); int multiplier = 1; if (multiplierValue != null){ multiplier = Integer.parseInt(multiplierValue); } if (substitutableHydrogens >=0 && (substitutableHydrogens - (firstFragSubstitutableHydrogenRequired * multiplier)) < 0){ return true; } return false; } private int getTotalOutAtomValency(Fragment f) { int outAtomValency = 0; for (int i = 0, l = f.getOutAtomCount(); i < l; i++) { outAtomValency += f.getOutAtom(i).getValency(); } return outAtomValency; } /** * Ester form preferred when: * mono is used on substituent * alkyl chain is used on formate/acetate e.g. 
ethylacetate * Root is carbamate, >=2 substituents, and this is the only word rule * (ester and non-ester carbamates differ only by whether or not there is a space, heuristically the ester is almost always intended under these conditions) * @param substituentGroupEl * @param multiplierValue * @param rootGroup * @param numOfSubstituents * @return */ private boolean specialCaseWhereEsterPreferred(Element substituentGroupEl, String multiplierValue, Element rootGroup, int numOfSubstituents) { if (multiplierValue != null && Integer.parseInt(multiplierValue) == 1){ return true; } String rootGroupName = rootGroup.getParent().getValue(); if (substituentGroupEl.getAttributeValue(TYPE_ATR).equals(CHAIN_TYPE_VAL) && ALKANESTEM_SUBTYPE_VAL.equals(substituentGroupEl.getAttributeValue(SUBTYPE_ATR))) { if (substituentGroupEl.getParent().getValue().matches(substituentGroupEl.getValue() + "yl-?") && rootGroupName.matches(".*(form|methan|acet|ethan)[o]?ate?")) { return true; } } if ((rootGroupName.endsWith("carbamate") || rootGroupName.endsWith("carbamat")) && numOfSubstituents >= 2) { Element temp = substituentGroupEl.getParent(); while (temp.getParent() != null) { temp = temp.getParent(); } if (temp.getChildElements(WORDRULE_EL).size() == 1) { return true; } } return false; } private boolean substitutionWouldBeAmbiguous(Fragment frag, String multiplierValue) { int multiplier = 1; if (multiplierValue != null){ multiplier = Integer.parseInt(multiplierValue); } if (multiplier == 1 && frag.getDefaultInAtom() != null) { return false; } List atomForEachSubstitutableHydrogen = getAtomForEachSubstitutableHydrogen(frag); if (atomForEachSubstitutableHydrogen.size() == multiplier){ return false; } StereoAnalyser analyser = new StereoAnalyser(frag); Set uniqueEnvironments = new HashSet<>(); for (Atom a : atomForEachSubstitutableHydrogen) { uniqueEnvironments.add(AmbiguityChecker.getAtomEnviron(analyser, a)); } if (uniqueEnvironments.size() == 1 && (multiplier == 1 || multiplier == 
atomForEachSubstitutableHydrogen.size() - 1)){ return false; } return true; } private boolean checkSuitabilityOfSubstituentForEsterFormation(Element subOrBracket, int rootFunctionalAtomsCount) { if (subOrBracket.getAttribute(LOCANT_ATR) != null){ return false; } Fragment rightMostGroup = findRightMostGroupInSubBracketOrRoot(subOrBracket).getFrag(); if (!hasSingleMonoValentCarbonOrSiliconRadical(rightMostGroup)) { return false; } String multiplierStr = subOrBracket.getAttributeValue(MULTIPLIER_ATR); if (multiplierStr != null) { int multiplier = Integer.parseInt(multiplierStr); if (multiplier > rootFunctionalAtomsCount) { return false; } } return true; } private boolean hasSingleMonoValentCarbonOrSiliconRadical(Fragment frag) { if (frag.getOutAtomCount() == 1) { OutAtom outAtom = frag.getOutAtom(0); if (outAtom.getValency() == 1 && (outAtom.getAtom().getElement() == ChemEl.C || outAtom.getAtom().getElement() == ChemEl.Si)) { return true; } } return false; } private List getAtomForEachSubstitutableHydrogen(Fragment frag) { List substitutableAtoms = new ArrayList<>(); List atomList = frag.getAtomList(); for (Atom atom : atomList) { if (FragmentTools.isCharacteristicAtom(atom)){ continue; } int currentExpectedValency = atom.determineValency(true); int currentValency = (atom.getIncomingValency() + (atom.hasSpareValency() ? 
1 : 0) + atom.getOutValency()); for (int i = currentValency; i < currentExpectedValency; i++) { substitutableAtoms.add(atom); } } return substitutableAtoms; } private void transformToEster(Element parentSimpleWordRule, Element substituentOrBracket) throws StructureBuildingException { parentSimpleWordRule.getAttribute(WORDRULE_ATR).setValue(WordRule.ester.toString()); Element lastChildElOfSub = substituentOrBracket.getChild(substituentOrBracket.getChildCount() - 1); if (lastChildElOfSub.getName().equals(HYPHEN_EL)){ lastChildElOfSub.detach(); } substituentOrBracket.detach(); Element newSubstituentWord = new GroupingEl(WORD_EL); newSubstituentWord.addAttribute(new Attribute(TYPE_ATR, SUBSTITUENT_TYPE_VAL)); newSubstituentWord.addChild(substituentOrBracket); parentSimpleWordRule.insertChild(newSubstituentWord, 0); String multiplierStr = substituentOrBracket.getAttributeValue(MULTIPLIER_ATR); if (multiplierStr!=null){ substituentOrBracket.removeAttribute(substituentOrBracket.getAttribute(MULTIPLIER_ATR)); int multiplier = Integer.parseInt(multiplierStr); for (int i = 1; i < multiplier; i++) { Element clone = state.fragManager.cloneElement(state, newSubstituentWord); OpsinTools.insertAfter(newSubstituentWord, clone); } } } private void transformToEster(Element parentSimpleWordRule, List elsToFormEsterSub) throws StructureBuildingException { parentSimpleWordRule.getAttribute(WORDRULE_ATR).setValue(WordRule.ester.toString()); Element sub = elsToFormEsterSub.get(elsToFormEsterSub.size() - 1); Element lastChildElOfSub = sub.getChild(sub.getChildCount() - 1); if (lastChildElOfSub.getName().equals(HYPHEN_EL)){ lastChildElOfSub.detach(); } Element newSubstituentWord = new GroupingEl(WORD_EL); newSubstituentWord.addAttribute(new Attribute(TYPE_ATR, SUBSTITUENT_TYPE_VAL)); for (Element elToFormEsterSub : elsToFormEsterSub) { elToFormEsterSub.detach(); newSubstituentWord.addChild(elToFormEsterSub); } parentSimpleWordRule.insertChild(newSubstituentWord, 0); } } 
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/WordTools.java000066400000000000000000000156761451751637500264720ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin;

import java.util.ArrayList;
import java.util.List;

import static uk.ac.cam.ch.wwmm.opsin.OpsinTools.*;

/**
 * Tools for dealing uniformly with unusually-formed words.
 */
class WordTools {
	/**
	 * Splits cases where the parseTokensList describes a functionalTerm in addition to another mainGroup/substituent into two parseWords
	 * This occurs if the name is formally missing a space e.g. ethylthiocyanate.
	 * If multiple parses are present then it may be possible to disambiguate between them:
	 * parses with omitted spaces are discarded if a parse without omitted space is found
	 * parses with shorter functional terms are discarded e.g. ethylthiocyanate is [ethyl] [thiocyanate] not [ethylthio] [cyanate]
	 * @param parseTokensList
	 * @param chemicalName
	 * @return
	 */
	static List<ParseWord> splitIntoParseWords(List<ParseTokens> parseTokensList, String chemicalName) {
		List<ParseTokens> wellFormedParseTokens = new ArrayList<>();//these are all in the same word as would be expected
		List<List<ParseTokens>> splitParseTokensForEachParseTokens = new ArrayList<>();
		/*
		 * Each ParseTokens is split into the number of words it describes
		 * e.g. ethylchloride has one interpretation so splitParseTokensList will have one entry
		 * This entry will be formed of TWO parseTokens, one for the ethyl and one for the chloride
		 */
		int leastWordsInOmmittedSpaceParse = Integer.MAX_VALUE;//we want the least number of words i.e. fewer omitted spaces
		int longestFunctionalTermEncountered = 0;//we want the longest functional term
		for (ParseTokens parseTokens : parseTokensList) {
			List<Character> annotations = parseTokens.getAnnotations();
			List<List<Character>> chunkedAnnotations = chunkAnnotations(annotations);//chunked into mainGroup/substituent/functionalTerm
			if (containsOmittedSpace(chunkedAnnotations)){
				List<ParseTokens> omittedWordParseTokens = new ArrayList<>();
				List<String> tokens = parseTokens.getTokens();
				List<Character> newAnnotations = new ArrayList<>();
				List<String> newTokens = new ArrayList<>();
				int currentFunctionalTermLength = 0;
				int annotPos = 0;
				for (List<Character> annotationList : chunkedAnnotations) {
					Character finalAnnotationInList = annotationList.get(annotationList.size() - 1);
					if (finalAnnotationInList.equals(END_OF_FUNCTIONALTERM) && newAnnotations.size() > 0) {
						//create a new parseTokens for the substituent/maingroup preceding the functional term
						//not necessary if the functional term is the first thing to be read e.g. in the case of poly
						omittedWordParseTokens.add(new ParseTokens(newTokens, newAnnotations));
						newAnnotations = new ArrayList<>();
						newTokens = new ArrayList<>();
					}
					for (Character annotation : annotationList) {
						newAnnotations.add(annotation);
						newTokens.add(tokens.get(annotPos++));
					}
					if (finalAnnotationInList.equals(END_OF_FUNCTIONALTERM) || finalAnnotationInList.equals(END_OF_MAINGROUP) || annotPos == tokens.size()) {
						omittedWordParseTokens.add(new ParseTokens(newTokens, newAnnotations));
						if (finalAnnotationInList.equals(END_OF_FUNCTIONALTERM)){
							currentFunctionalTermLength = StringTools.stringListToString(newTokens, "").length();
						}
						newAnnotations = new ArrayList<>();
						newTokens = new ArrayList<>();
					}
				}
				if (omittedWordParseTokens.size() <= leastWordsInOmmittedSpaceParse){
					if (omittedWordParseTokens.size() < leastWordsInOmmittedSpaceParse){
						splitParseTokensForEachParseTokens.clear();
						leastWordsInOmmittedSpaceParse = omittedWordParseTokens.size();
						longestFunctionalTermEncountered = 0;
					}
					if (currentFunctionalTermLength
>=longestFunctionalTermEncountered){
						if (currentFunctionalTermLength > longestFunctionalTermEncountered){
							splitParseTokensForEachParseTokens.clear();
							longestFunctionalTermEncountered =currentFunctionalTermLength;
						}
						splitParseTokensForEachParseTokens.add(omittedWordParseTokens);
					}
				}
			}
			else {
				wellFormedParseTokens.add(parseTokens);
			}
		}
		List<ParseWord> parseWords = new ArrayList<>();
		if (!wellFormedParseTokens.isEmpty()) {
			parseWords.add(new ParseWord(chemicalName, wellFormedParseTokens));
		}
		else {
			for (int i = 0; i < leastWordsInOmmittedSpaceParse; i++) {
				List<ParseTokens> parseTokensForWord = new ArrayList<>();
				for (List<ParseTokens> parseTokens : splitParseTokensForEachParseTokens) {
					if (!parseTokensForWord.contains(parseTokens.get(i))){//if only one word is ambiguous there is no need for the unambiguous word to have multiple identical interpretations
						parseTokensForWord.add(parseTokens.get(i));
					}
				}
				parseWords.add(new ParseWord(StringTools.stringListToString(parseTokensForWord.get(0).getTokens(), ""), parseTokensForWord));
			}
		}
		return parseWords;
	}

	private static boolean containsOmittedSpace(List<List<Character>> chunkedAnnotations) {
		if (chunkedAnnotations.size() > 1){//there are multiple substituents/maingroup/functionalterms
			for (List<Character> annotationList : chunkedAnnotations) {
				if (annotationList.contains(END_OF_FUNCTIONALTERM)){
					return true;
				}
			}
		}
		return false;
	}

	/**Groups the token annotations for a given word into substituent/s and/or a maingroup and/or functionalTerm by
	 * looking for the endOfSubstituent/endOfMainGroup/endOfFunctionalTerm annotations
	 *
	 * @param annots The annotations for a word.
	 * @return A List of lists of annotations, each list corresponds to a substituent/maingroup/functionalTerm
	 */
	static List<List<Character>> chunkAnnotations(List<Character> annots) {
		List<List<Character>> chunkList = new ArrayList<>();
		List<Character> currentTerm = new ArrayList<>();
		for (Character annot : annots) {
			currentTerm.add(annot);
			char ch = annot;
			if (ch == END_OF_SUBSTITUENT || ch == END_OF_MAINGROUP || ch == END_OF_FUNCTIONALTERM) {
				chunkList.add(currentTerm);
				currentTerm = new ArrayList<>();
			}
		}
		return chunkList;
	}

	/**
	 * Works left to right removing spaces if there are too many opening brackets
	 * @param name
	 * @return
	 * @throws ParsingException If brackets are unbalanced and cannot be balanced by removing whitespace
	 */
	static String removeWhiteSpaceIfBracketsAreUnbalanced(String name) throws ParsingException {
		int bracketLevel = 0;
		int stringLength = name.length();
		for (int i = 0; i < stringLength; i++) {
			char c = name.charAt(i);
			if (c == '(' || c == '[' || c == '{') {
				bracketLevel++;
			}
			else if (c == ')' || c == ']' || c == '}') {
				bracketLevel--;
			}
			else if (c == ' ' && bracketLevel > 0) {//brackets unbalanced and a space has been encountered!
				name = name.substring(0, i) + name.substring(i + 1);
				stringLength = name.length();
				i--;
			}
		}
		if (bracketLevel > 0) {
			throw new ParsingException("Unmatched opening bracket found in :" + name);
		}
		else if (bracketLevel < 0) {
			throw new ParsingException("Unmatched closing bracket found in :" + name);
		}
		return name;
	}

}
opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/WordType.java000066400000000000000000000003401451751637500262710ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin;

/**
 * The supported wordTypes.
* Adding further word types should not be taken lightly * @author dl387 * */ enum WordType{ full, polymer, substituent, functionalTerm, }opsin-2.8.0/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/XmlDeclarations.java000066400000000000000000000732151451751637500276200ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; /** * Contains static final strings corresponding to XML element names and attributes employed by OPSIN * @author dl387 * */ class XmlDeclarations { //TODO are all these types and subtypes actually a good idea considering the vast majority are never used? /* * The container XML elements. These are generated by OPSIN */ /**Define a scope for determining what group a substituent should bond to*/ static final String BRACKET_EL ="bracket"; /**Contains a functional group or class. These terms typically effect the chosen wordRule for the name*/ static final String FUNCTIONALTERM_EL ="functionalTerm"; /**The top most element in OPSIN's parse tree. As a name can describe multiple molecules the same is confusingly true of this element*/ static final String MOLECULE_EL ="molecule"; /**Contains a substituent. A substituent will after the ComponentProcessor contain one group*/ static final String SUBSTITUENT_EL = "substituent"; /**Contains a root group(the rightmost in a word). A root will after the ComponentProcessor contain one group*/ static final String ROOT_EL ="root"; /**Contains brackets/substituents/root. Generally these correspond to words in the original chemical name (unless erroneous/omitted spaces were present)*/ static final String WORD_EL ="word"; /**Contains words/wordRules. The value of the wordRule indicates how the StructureBuilder should process its children*/ static final String WORDRULE_EL ="wordRule"; /* * The token XML elements. 
These are generally produced by the parser from the tokenised chemical name * Some are produced by OPSIN in the ComponentGenerator/ComponentProcessor */ /**Adds a hydrogen to an unsaturated system, this is hydrogen that is added due to a suffix and is expressed in a locant e.g. 1(2H) */ static final String ADDEDHYDROGEN_EL ="addedHydrogen"; /**A component of an alkaneStem e.g. [octa][hexaconta][tetract]ane will have three alkaneStemComponents*/ static final String ALKANESTEMCOMPONENT ="alkaneStemComponent"; /**Something like tert/iso/sec Modifies an alkaneStem in the ComponentGenerator*/ static final String ALKANESTEMMODIFIER_EL ="alkaneStemModifier"; /**An annulene/annulyne. Converted to a group by the ComponentGenerator*/ static final String ANNULEN_EL ="annulen"; /**A bridge described in SMILES for used on rings*/ static final String FUSEDRINGBRIDGE_EL ="fusedRingBridge"; /**An O that indicates that the preceding alkaneStem is in fact a bridge*/ static final String BRIDGEFORMINGO_EL ="bridgeFormingO"; /**A locant indicating the positions for a glycosidic linkage. The first locant will point to an alpha carbon * Also used to indicate joining of nucleosyl groups*/ static final String BIOCHEMICALLINKAGE_EL ="biochemicalLinkage"; /**Indicates the size of the ring in a carbohydrate e.g. furanose = 5*/ static final String CARBOHYDRATERINGSIZE_EL ="carbohydrateRingSize"; /**A charge specifier e.g. (2+). Value is the charge to set something to*/ static final String CHARGESPECIFIER_EL ="chargeSpecifier"; /**Used amongst other things to indicate how the rings of a ring assembly should be assembled*/ static final String COLONORSEMICOLONDELIMITEDLOCANT_EL ="colonOrSemiColonDelimitedLocant"; /**Created by the ComponentProcessor. 
Something like the acetic acid in benzene-1,3,5-triacetic acid*/ static final String CONJUNCTIVESUFFIXGROUP_EL ="conjunctiveSuffixGroup"; /**Used by the ComponentGenerator to group elements into bracket elements*/ static final String CLOSEBRACKET_EL ="closebracket"; /**Used by the ComponentGenerator to modify alkanes into cycloalkanes*/ static final String CYCLO_EL ="cyclo"; /** A delta used to indicate the position of a double bond in older nomenclature*/ static final String DELTA_EL ="delta"; /**A fractional multiplier e.g. hemi*/ static final String FRACTIONALMULTIPLIER_EL ="fractionalMultiplier"; /**A functional Class such as acid. Does not correspond to a fragment*/ static final String FUNCTIONALCLASS_EL ="functionalClass"; /**A functional group such as alcohol or sulfone. Describes a fragment*/ static final String FUNCTIONALGROUP_EL ="functionalGroup"; /**Currently just poly or oligo for polymers*/ static final String FUNCTIONALMODIFIER_EL ="functionalModifier"; /**A fusion bracket. Used in fusion nomenclature*/ static final String FUSION_EL ="fusion"; /**Define a scope for determining what group a substituent should bond to*/ static final String GROUP_EL ="group"; /**A heteroatom. Could be part of a Hantzsch Widman ring or a replacement prefix*/ static final String HETEROATOM_EL ="heteroatom"; /**Adds a hydrogen to an unsaturated system (hydro/perhydro)*/ static final String HYDRO_EL ="hydro"; /**One of the systematic hydrocarbon fused ring series e.g. tetralene, pentalene. Converted to a group by the ComponentGenerator*/ static final String HYDROCARBONFUSEDRINGSYSTEM_EL ="hydrocarbonFusedRingSystem"; /**Adds a hydrogen to an unsaturated system to indicate what atoms are saturated in a system where not all atoms with spare valency can form double bonds e.g. e.g. 
2H-pyran*/ static final String INDICATEDHYDROGEN_EL ="indicatedHydrogen"; /**Specifies that one or more atoms are enriched with a particular isotope*/ static final String ISOTOPESPECIFICATION_EL ="isotopeSpecification"; /**A hyphen between two substituents. Used as a hint that the two substituents do not join together*/ static final String HYPHEN_EL ="hyphen"; /**ine as in the end of an aminoAcid. Has no meaning*/ static final String INE_EL ="ine"; /**An infix. This performs functionalReplacement on a suffix*/ static final String INFIX_EL ="infix"; /**Indicates that a heteroatom or atom should be in a specific valency*/ static final String LAMBDACONVENTION_EL ="lambdaConvention"; /**A locant e.g. where a substituent should attach*/ static final String LOCANT_EL ="locant"; /**Used by the ComponentGenerator to group elements into bracket elements*/ static final String OPENBRACKET_EL ="openbracket"; /**ortho/meta/para. Converted to a locant by the ComponentProcessor*/ static final String ORTHOMETAPARA_EL ="orthoMetaPara"; /**Describes the number of spiro centres in a polycyclic spiro system*/ static final String POLYCYCLICSPIRO_EL ="polyCyclicSpiro"; /**A locant indicating through which atoms a multiplied parent in multiplicative nomenclature is connected*/ static final String MULTIPLICATIVELOCANT_EL ="multiplicativeLocant"; /**A multiplier e.g. indicating multiplication of a heteroatom or substituent*/ static final String MULTIPLIER_EL ="multiplier"; /**e.g. (III), Specifies the oxidation number of an atom. Value is the oxidation number to set something to*/ static final String OXIDATIONNUMBERSPECIFIER_EL ="oxidationNumberSpecifier"; /**Used to indicate how many rings are in a ring assembly*/ static final String RINGASSEMBLYMULTIPLIER_EL ="ringAssemblyMultiplier"; /**A spiro system. 
Converted to a group by the ComponentGenerator*/ static final String SPIRO_EL ="spiro"; /**A locant that separates components of a spiro system*/ static final String SPIROLOCANT_EL ="spiroLocant"; /**Something like R/S/E/Z. Indicates stereochemical configuration*/ static final String STEREOCHEMISTRY_EL ="stereoChemistry"; /**Present in complicated nomenclature e.g. ring assemblies to avoid ambiguity*/ static final String STRUCTURALCLOSEBRACKET_EL ="structuralCloseBracket"; /**Present in complicated nomenclature to avoid ambiguity*/ static final String STRUCTURALOPENBRACKET_EL ="structuralOpenBracket"; /**Indicates replacement of a group by hydrogen e.g. deoxy means replace OH with H*/ static final String SUBTRACTIVEPREFIX_EL ="subtractivePrefix"; /**A suffix e.g. amide, al, yl etc.*/ static final String SUFFIX_EL ="suffix"; /**Something like sulfon/carbo/carbox that modifies a following suffix*/ static final String SUFFIXPREFIX_EL ="suffixPrefix"; /**ene/yne, indicated that a double/triple bond should be formed at a saturated location*/ static final String UNSATURATOR_EL ="unsaturator"; /**A vonBaeyer system. Converted to a group by the ComponentGenerator*/ static final String VONBAEYER_EL ="vonBaeyer"; /* * The token XML attributes. These are generally produced by the parser from the tokenised chemical name * Some are produced by OPSIN in the ComponentGenerator/ComponentProcessor */ /**The semantic meaning of the token. Exact meaning is dependent on the element type e.g. SMILES for a group but a number for a multiplier*/ static final String VALUE_ATR ="value"; /**The stereo group (absolute, racemic, relative) */ static final String STEREOGROUP_ATR ="stereoGroup"; /**The type of the token. Possible values are enumerated with strings ending in _TYPE_VAL */ static final String TYPE_ATR = "type"; /**The subType of the token. 
Possible values are enumerated with strings ending in _SUBTYPE_VAL */ static final String SUBTYPE_ATR = "subType"; /**Whether the group can be additively bonded to. e.g. thio */ static final String ACCEPTSADDITIVEBONDS_ATR = "acceptsAdditiveBonds"; /**Used to add a higher order bond at a position that can be subsequently specified. * Syntax: semicolon delimited list of the format: orderOfBond space ("id"|"locant"|"defaultId"|"defaultLocant") space (id|locant) */ static final String ADDBOND_ATR = "addBond"; /**Used to add a group at a position that can be subsequently specified * Syntax: semicolon delimited list of the format: SMILESofGroupToBeAdded space ("id"|"locant"|"defaultId"|"defaultLocant") space (id|locant) [space locantLabel]. */ static final String ADDGROUP_ATR = "addGroup"; /**Used to set a heteroatom at a position that can be subsequently specified * Syntax: semicolon delimited list of the format: elementOfAtom space ("id"|"locant"|"defaultId"|"defaultLocant") space (id|locant). */ static final String ADDHETEROATOM_ATR = "addHeteroAtom"; /**Another value that the token takes. e.g. for suffix tokens that add two suffixes to the molecule*/ static final String ADDITIONALVALUE_ATR = "additionalValue"; /**Listed in a clockwise order, the locants of the atoms that define a pseudo 2D plane for alpha/beta stereochemistry */ static final String ALPHABETACLOCKWISEATOMORDERING_ATR="alphaBetaClockWiseAtomOrdering"; /**For elements, the typical oxidation states (comma separated) then a colon and the maximum oxidation state*/ static final String COMMONOXIDATIONSTATESANDMAX_ATR = "commonOxidationStatesAndMax"; /**The ID of the atom which by default an incoming fragment should connect to. 
ID is relative to this particular fragment (first atom =1) */ static final String DEFAULTINID_ATR = "defaultInID"; /**The locant of the atom which by default an incoming fragment should connect to**/ static final String DEFAULTINLOCANT_ATR = "defaultInLocant"; /**Works like the locant attribute but refers to the atoms OPSIN ID. Will be overridden by the locant/locantId attribute*/ static final String DEFAULTLOCANTID_ATR = "defaultLocantID"; /**A comma separated list of locants that are expected in front of a group for either xylene-like nomenclature or as indirect locants*/ static final String FRONTLOCANTSEXPECTED_ATR = "frontLocantsExpected"; /**A comma separated list of relative IDs at which to add functionalAtoms*/ static final String FUNCTIONALIDS_ATR = "functionalIDs"; /**Numbering to use when ring is part of a fused ring system */ static final String FUSEDRINGNUMBERING_ATR = "fusedRingNumbering"; /**Semi-colon delimited list of labels for * atoms, where the * atoms represent generic groups e.g. Alkyl*/ static final String HOMOLOGY_ATR = "homology"; /**Indicates that the substituent can either be -X- or X= depending on context cf. imino or methylene*/ static final String IMINOLIKE_ATR = "iminoLike"; /**The functional replacement specified by an infix to be performed on the suffix*/ static final String INFIX_ATR = "infix"; /**Defines the locants for which a radical will connect to another group in multiplicative nomenclature e.g. in 2,2'-methylenedipyridine the 2,2' become inlocants of the pyridine*/ static final String INLOCANTS_ATR = "inLocants"; /**Determined by the {@link ComponentProcessor}. True if a fragment has more than two radical positions e.g. ethan-1,2-diyl not ethanylidene*/ static final String ISAMULTIRADICAL_ATR = "isAMultiRadical"; /**Was the word salt encountered indicating that a salt was expected? */ static final String ISSALT_ATR = "isSalt"; /**Slash delimited list of locants. List must be the same length as number of atoms. 
Multiple locants can be given to an atom by comma delimiting them*/ static final String LABELS_ATR = "labels"; /**Added to a heteroatom or LAMBDACONVENTION_EL to indicate the desired valency*/ static final String LAMBDA_ATR = "lambda"; /**Locant used when deciding where to apply an operation*/ static final String LOCANT_ATR = "locant"; /**Works like a locant but refers to the atom's OPSIN id*/ static final String LOCANTID_ATR = "locantID"; /**Indicates that this trivial name has the opposite D/L stereochemistry to others in its class i.e. L- for carbohydrates or D- for amino acids*/ static final String NATURALENTISOPPOSITE_ATR ="naturalEntIsOpposite"; /** Used as a fudge for some hydrogen esters e.g. dihydrogenphosphate*/ static final String NUMBEROFFUNCTIONALATOMSTOREMOVE_ATR = "numberOfFunctionalAtomsToRemove"; /**Indicates that an element has been multiplied. Prevents badly assigning indirect locants*/ static final String MULTIPLIED_ATR = "multiplied"; /**Indicates how many times a bracket/substituent should be multiplied*/ static final String MULTIPLIER_ATR ="multiplier"; /** The name that was inputted into OPSIN's parser. Attribute of molecule */ static final String NAME_ATR = "name"; /**A comma separated list of relative IDs at which to add OutAtoms*/ static final String OUTIDS_ATR = "outIDs"; /**Indicates that a substituent/bracket has been processed by StructureBuildingMethods*/ static final String RESOLVED_ATR ="resolved"; /**Placed on a word rule if explicit stoichiometry has been provided. Value is always an integer */ static final String STOICHIOMETRY_ATR = "stoichiometry"; /** Holds the value of any tokens for which XML was not generated by the parser e.g. an optional e. 
Multiple elided tokens will be concatenated*/ static final String SUBSEQUENTUNSEMANTICTOKEN_ATR ="subsequentUnsemanticToken"; /**A comma separated list of relative IDs indicating where to add suffix/es*/ static final String SUFFIXAPPLIESTO_ATR = "suffixAppliesTo"; /**A relative ID indicating at what position to attach a suffix to by default*/ static final String SUFFIXAPPLIESTOBYDEFAULT_ATR = "suffixAppliesToByDefault"; /**Added by the ComponentGenerator to a suffix*/ static final String SUFFIXPREFIX_ATR = "suffixPrefix"; /**Can the substituent be implicitly bracketed to a previous substituent e.g. methylaminobenzene --> (methylamino)benzene as amino has this attribute*/ static final String USABLEASJOINER_ATR = "usableAsAJoiner"; /**The wordRule that a wordRule element corresponds to*/ static final String WORDRULE_ATR ="wordRule"; /* * The values the type attribute can take * Type is expected to be present at minimum on all group elements */ /**A term like amide or hydrazide that replaces a functional hydroxy group*/ static final String ACIDREPLACINGFUNCTIONALGROUP_TYPE_VAL ="acidReplacingFunctionalGroup"; /**A trivial carboxylic acid. These by default do not have their acid groups which are then added on using suffixes*/ static final String ACIDSTEM_TYPE_VAL ="acidStem"; /**This stereochemistry element conveys alpha/beta stereochemistry*/ static final String ALPHA_OR_BETA_TYPE_VAL ="alphaOrBeta"; /**An aminoAcid. These by default do not have their acid groups which are then added on using suffixes. Notably these suffixes do NOT correspond to tokens in the input chemical name!*/ static final String AMINOACID_TYPE_VAL ="aminoAcid"; /**A subtractive prefix that removes a terminal chalcogen and forms an intramolecular bridge to another*/ static final String ANHYDRO_TYPE_VAL ="anhydro"; /**This stereochemistry element conveys axial stereochemistry * These indicate the position of groups around an axis/plane/helix. 
This is expressed by the descriptors: M, P, Ra, Sa, Rp, Sp*/ static final String AXIAL_TYPE_VAL ="axial"; /**A normal multiplier e.g. di*/ static final String BASIC_TYPE_VAL ="basic"; /**An isotopeSpecification using boughton system nomenclature*/ static final String BOUGHTONSYSTEM_TYPE_VAL ="boughtonSystem"; /**A locant enclosed in square brackets e.g. [5]*/ static final String BRACKETEDLOCANT_TYPE_VAL ="bracketedLocant"; /**This stereochemistry element specifies stereochemistry in a carbohydrate e.g. gluco is r/l/r/r (position of hydroxy in a fischer projection)*/ static final String CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL ="carbohydrateConfigurationalPrefix"; /**Groups formed in accordance with carbohydrate nomenclature */ static final String CARBOHYDRATE_TYPE_VAL ="carbohydrate"; /**Indicates the group should be acyclic*/ static final String CHAIN_TYPE_VAL ="chain"; /**This suffix modifies charge*/ static final String CHARGE_TYPE_VAL ="charge"; /**This stereochemistry element conveys cis/trans stereochemistry*/ static final String CISORTRANS_TYPE_VAL ="cisOrTrans"; /**This stereochemistry element conveys R/S stereochemistry*/ static final String R_OR_S_TYPE_VAL ="RorS"; /**This stereochemistry element conveys E/Z stereochemistry*/ static final String E_OR_Z_TYPE_VAL ="EorZ"; /** The entire molecules is racemic and nothing else known. */ static final String RAC_TYPE_VAL ="RAC"; /** The entire molecules is relative and nothing else known (less common). 
*/ static final String REL_TYPE_VAL ="REL"; /**This group is a sulfur/selenium/tellurium acid with the acidic hydroxy missing*/ static final String CHALCOGENACIDSTEM_TYPE_VAL ="chalcogenAcidStem"; /**A subtractive prefix that removes a hydrogen to convert a hydroxy into a carbonyl or convert a bond to a double/triple bond*/ static final String DEHYDRO_TYPE_VAL ="dehydro"; /**A subtractive prefix that removes a terminal hydroxy like atom*/ static final String DEOXY_TYPE_VAL ="deoxy"; /**A functional group describing a divalent group*/ static final String DIVALENTGROUP_TYPE_VAL ="diValentGroup"; /** This stereochemistry element indicates the configuration of an amino acid/carbohydrate relative to glyceraldehyde*/ static final String DLSTEREOCHEMISTRY_TYPE_VAL ="dlStereochemistry"; /**An atom e.g. "lithium" */ static final String ELEMENTARYATOM_TYPE_VAL ="elementaryAtom"; /**This stereochemistry element conveys endo/exo/syn/anti stereochemistry * These indicate relative orientation of groups attached to non-bridgehead atoms in a bicyclo[x.y.z]alkane (x >= y > z > 0)*/ static final String ENDO_EXO_SYN_ANTI_TYPE_VAL ="endoExoSynAnti"; /**A group that is functional class e.g. O for anhydride*/ static final String FUNCTIONALCLASS_TYPE_VAL ="functionalClass"; /**A multiplier for groups of terms e.g. bis*/ static final String GROUP_TYPE_VAL ="group"; /**A subtractive prefix that removes a heteroatom i.e. replaces it with carbon */ static final String HETEROATOMREMOVAL_TYPE_VAL = "heteratomRemoval"; /**An implicit bracket. Implicit brackets are added where a bracket is needed to give the intended meaning*/ static final String IMPLICIT_TYPE_VAL ="implicit"; /**This suffix adds a radical to the preceding group e.g. yl, oyl*/ static final String INLINE_TYPE_VAL ="inline"; /**An isotopeSpecification using IUPAC nomenclature*/ static final String IUPACSYSTEM_TYPE_VAL ="iupacSystem"; /**This functional group is monovalent e.g. 
alcohol*/ static final String MONOVALENTGROUP_TYPE_VAL ="monoValentGroup"; /**This functional group is monovalent and describes a specific compound e.g. cyanide*/ static final String MONOVALENTSTANDALONEGROUP_TYPE_VAL ="monoValentStandaloneGroup"; /**A non carboxylic acid e.g. phosphoric*/ static final String NONCARBOXYLICACID_TYPE_VAL ="nonCarboxylicAcid"; /**This stereochemistry element describes the direction that plane polarised light is rotated*/ static final String OPTICALROTATION_TYPE_VAL ="opticalRotation"; /**Indicates the locant was made from an ortho/meta/para term*/ static final String ORTHOMETAPARA_TYPE_VAL ="orthoMetaPara"; /**This stereochemistry element conveys relative cis/trans stereochemistry e.g. r-1, c-2, t-3*/ static final String RELATIVECISTRANS_TYPE_VAL ="relativeCisTrans"; /**Indicates the group should be, at least in part, cyclic*/ static final String RING_TYPE_VAL ="ring"; /**Indicates a group that does not allow suffixes*/ static final String SIMPLEGROUP_TYPE_VAL ="simpleGroup"; /**Groups that do not have any special rules for suffix handling*/ static final String STANDARDGROUP_TYPE_VAL ="standardGroup"; /**A bracket containing R/S/E/Z descriptors*/ static final String STEREOCHEMISTRYBRACKET_TYPE_VAL ="stereochemistryBracket"; /**Indicates a group that is a substituent*/ static final String SUBSTITUENT_TYPE_VAL ="substituent"; /**A locant that also indicates the addition of hydrogen e.g. 2(1H); not used to locant onto another group*/ static final String ADDEDHYDROGENLOCANT_TYPE_VAL ="addedHydrogenLocant"; /**Indicates a group that is a suffix*/ static final String SUFFIX_TYPE_VAL ="suffix"; /**A suffix that does not add a radical, hence will be present only on the root group */ static final String ROOT_TYPE_VAL ="root"; /**A multiplier for a Von Baeyer system e.g. 
bi in bicyclo*/ static final String VONBAEYER_TYPE_VAL ="VonBaeyer"; /* * The values the subType attribute can take * subType is expected to be present at minimum on all group elements */ /**Functional groups like acetal and mercaptal */ static final String ACETALLIKE_SUBTYPE_VAL = "acetalLike"; /**The stem of an alkane e.g. "eth" */ static final String ALKANESTEM_SUBTYPE_VAL ="alkaneStem"; /**Amyl/amylidene */ static final String AMYL_SUBTYPE_VAL = "amyl"; /**An anhydride functional term e.g. "thioanhydride"*/ static final String ANHYDRIDE_SUBTYPE_VAL ="anhydride"; /**A CAS apio furanose stereoisomer*/ static final String APIOFURANOSE_SUBTYPE_VAL ="apioFuranose"; /**An aryl substituent or stem e.g. "phenyl", "styr" */ static final String ARYLSUBSTITUENT_SUBTYPE_VAL ="arylSubstituent"; /**Nucleotides/nucleosides/natural products. * Carbohydrates can be detected by {@link XmlDeclarations#CARBOHYDRATE_TYPE_VAL} * Amino acids can be detected by {@link XmlDeclarations#AMINOACID_TYPE_VAL} * For any of the above use {@link OpsinTools#isBiochemical(String, String)}*/ static final String BIOCHEMICAL_SUBTYPE_VAL ="biochemical"; /**A trivial carbohydrate stem for an aldose e.g. "galact"*/ static final String CARBOHYDRATESTEMALDOSE_SUBTYPE_VAL ="carbohydrateStemAldose"; /**A trivial carbohydrate stem for a ketose e.g. "fruct"*/ static final String CARBOHYDRATESTEMKETOSE_SUBTYPE_VAL ="carbohydrateStemKetose"; /**Functional groups like oxime and hydrazone that replace a carbonyl */ static final String CARBONYLREPLACEMENT_SUBTYPE_VAL = "carbonylReplacement"; /**The functional group oxide, and its other chalcogen analogs */ static final String CHALCOGENIDE_SUBTYPE_VAL = "chalcogenide"; /**A suffix that forms a cycle e.g. imide, lactam, sultam*/ static final String CYCLEFORMER_SUBTYPE_VAL ="cycleformer"; /**A hydrocarbon stem that is typically followed by an unsaturator e.g. 
"adamant" */ static final String CYCLICUNSATURABLEHYDROCARBON_SUBTYPE_VAL ="cyclicUnsaturableHydrocarbon"; /**Replacement terms that are not substituents e.g. amido/hydrazido/imido/nitrido*/ static final String DEDICATEDFUNCTIONALREPLACEMENTPREFIX_SUBTYPE_VAL = "dedicatedFunctionalReplacementPrefix"; /**An amino acid that ends in an e.g. tryptoph */ static final String ENDINAN_SUBTYPE_VAL ="endInAn"; /**An amino acid that ends in ic e.g. aspart */ static final String ENDINIC_SUBTYPE_VAL ="endInIc"; /**An amino acid that ends in ine e.g. alan */ static final String ENDININE_SUBTYPE_VAL ="endInIne"; /**A substituent that is expected to form a bridge e.g. "epoxy", "epiimino" */ static final String EPOXYLIKE_SUBTYPE_VAL ="epoxyLike"; /**A ring that will be fused onto another ring e.g. "benzo", "pyrido", "pyridino" */ static final String FUSIONRING_SUBTYPE_VAL ="fusionRing"; /**Functional terms like glycol or chlorohydrin that add an oxygen and another atom to opposite ends of a chain */ static final String GLYCOLORHALOHYDRIN_SUBTYPE_VAL = "glycolOrHalohydrin"; /**A group that can be suffixed e.g. "hydrazin" */ static final String GROUPSTEM_SUBTYPE_VAL ="groupStem"; /**A halide or pseudo halide e.g. "bromo", "cyano". Can be functional replacement terms when preceding certain non-carboxylic acids */ static final String HALIDEORPSEUDOHALIDE_SUBTYPE_VAL = "halideOrPseudoHalide"; /**The stem of a Hantzsch-Widman ring system e.g. "an", "ol", "olidin" */ static final String HANTZSCHWIDMAN_SUBTYPE_VAL ="hantzschWidman"; /**A heteroatom hydride e.g. "az" "sulf" (will be followed by an unsaturator, may be preceded by a multiplier to form the heteroatom equivalent of alkanes)*/ static final String HETEROSTEM_SUBTYPE_VAL ="heteroStem"; /**A group with no special properties. Similar to: {@link XmlDeclarations#NONE_SUBTYPE_VAL}*/ static final String SIMPLEGROUP_SUBTYPE_VAL ="simpleGroup"; /**A substituent which intrinsically forms multiple bonds e.g. 
"siloxane", "thio" */ static final String MULTIRADICALSUBSTITUENT_SUBTYPE_VAL ="multiRadicalSubstituent"; /**A non-carboxylic acid which cannot form a substituent e.g. "bor" */ static final String NOACYL_SUBTYPE_VAL ="noAcyl"; /**A group with no special properties. Similar to: {@link XmlDeclarations#SIMPLEGROUP_SUBTYPE_VAL}*/ static final String NONE_SUBTYPE_VAL ="none"; /**oxido/sulfido/selenido/tellurido. These are handled similarly to oxide e.g. might give -[O-] or =O*/ static final String OXIDOLIKE_SUBTYPE_VAL ="oxidoLike"; /**An atom with implicit oxidation state e.g. "ferric" */ static final String OUSICATOM_SUBTYPE_VAL ="ousIcAtom"; /**A term indicating replacement of all substitutable hydrogens by a halogen e.g. "perchloro" */ static final String PERHALOGENO_SUBTYPE_VAL ="perhalogeno"; /** phospho and other closely related substituents. Strongly prefer forming bonds to hydroxy groups */ static final String PHOSPHO_SUBTYPE_VAL ="phospho"; /**A ring group e.g. "pyridin" */ static final String RING_SUBTYPE_VAL ="ring"; /** A component of a salt e.g. "hydrate", "2HCl" */ static final String SALTCOMPONENT_SUBTYPE_VAL ="saltComponent"; /**A substituent with no suffix e.g. "amino" */ static final String SIMPLESUBSTITUENT_SUBTYPE_VAL ="simpleSubstituent"; /**A substituent expecting a suffix e.g. "bor" "vin" */ static final String SUBSTITUENT_SUBTYPE_VAL ="substituent"; /**A group representing a straight chain carbohydrate of a certain length with undefined stereochemistry e.g. hex in hexose */ static final String SYSTEMATICCARBOHYDRATESTEMALDOSE_SUBTYPE_VAL ="systematicCarbohydrateStemAldose"; /**A group representing a straight chain carbohydrate of a certain length with undefined stereochemistry e.g. hex in hex-2-ulose */ static final String SYSTEMATICCARBOHYDRATESTEMKETOSE_SUBTYPE_VAL ="systematicCarbohydrateStemKetose"; /**A suffix that attaches to the end of a chain e.g. 
"aldehyde", "ic acid" */ static final String TERMINAL_SUBTYPE_VAL ="terminal"; /**An acid that when suffixed with yl gives an acyl group e.g. "acet" */ static final String YLFORACYL_SUBTYPE_VAL ="ylForAcyl"; /**An acid that has undefined meaning when suffixed with yl */ static final String YLFORNOTHING_SUBTYPE_VAL ="ylForNothing"; /**An acid that when suffixed with yl gives an alkyl group e.g. "laur" */ static final String YLFORYL_SUBTYPE_VAL ="ylForYl"; /**Requests that no labelling should be applied */ static final String NONE_LABELS_VAL ="none"; /**Requests that labelling be done like a fused ring. It is assumed that the order of the atoms is locant 1 as the first atom*/ static final String FUSEDRING_LABELS_VAL ="fusedRing"; /**Requests that labelling be 1, 2, 3 etc. It is assumed that the order of the atoms is locant 1 as the first atom*/ static final String NUMERIC_LABELS_VAL ="numeric"; /** InLocants have not been specified */ static final String INLOCANTS_DEFAULT = "default"; /** * See suffixRules.dtd */ static final String SUFFIXRULES_RULE_EL = "rule"; static final String SUFFIXRULES_VALUE_ATR = "value"; static final String SUFFIXRULES_SMILES_ATR = "SMILES"; static final String SUFFIXRULES_LABELS_ATR = "labels"; static final String SUFFIXRULES_FUNCTIONALIDS_ATR = "functionalIDs"; static final String SUFFIXRULES_OUTIDS_ATR = "outIDs"; static final String SUFFIXRULES_KETONELOCANT_ATR = "ketoneLocant"; static final String SUFFIXRULES_ORDER_ATR = "order"; static final String SUFFIXRULES_OUTVALENCY_ATR = "outValency"; static final String SUFFIXRULES_CHARGE_ATR = "charge"; static final String SUFFIXRULES_PROTONS_ATR = "protons"; static final String SUFFIXRULES_ELEMENT_ATR = "element"; /** * See suffixApplicability.dtd */ static final String SUFFIXAPPLICABILITY_GROUPTYPE_EL = "groupType"; static final String SUFFIXAPPLICABILITY_SUFFIX_EL = "suffix"; static final String SUFFIXAPPLICABILITY_TYPE_ATR = "type"; static final String SUFFIXAPPLICABILITY_VALUE_ATR = 
"value"; static final String SUFFIXAPPLICABILITY_SUBTYPE_ATR = "subType"; } opsin-2.8.0/opsin-core/src/main/resources/000077500000000000000000000000001451751637500204725ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/000077500000000000000000000000001451751637500211115ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/000077500000000000000000000000001451751637500214745ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/000077500000000000000000000000001451751637500222345ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/000077500000000000000000000000001451751637500226265ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500236155ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500247455ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/opsinbuild.props000066400000000000000000000000321451751637500301750ustar00rootroot00000000000000version=${project.version}opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/000077500000000000000000000000001451751637500267575ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/alkanes.xml000066400000000000000000000063031451751637500311210ustar00rootroot00000000000000 meth eth prop but pent hex hept oct non undec hen do tri tetr pent hex hept oct non undec dec cos|icos|eicos triacont|tricont tetracont pentacont hexacont heptacont octacont nonacont decacont hect dict trict tetract pentact hexact heptact octact nonact kili dili trili tetrali pentali hexali heptali octali nonali normal|normal |normal- tertiary|tertiary |tertiary-|tert|tert.|tert-|tert.-|t- iso|iso- sec|sec.|sec-|sec.-|secondary|secondary |secondary- neo|neo- 
opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/aminoAcids.xml000066400000000000000000001103451451751637500315540ustar00rootroot00000000000000 alan dehydroalan|alpha,beta-dehydroalan argin asparag isoasparag|alpha-asparag cystathion isoglutam|iso-glutam glyc histid isoleuc|iso-leuc alloisoleuc|allo-isoleuc leuc lys methion selenomethion|seleno-methion telluromethion|telluro-methion phenylalan dehydrophenylalan|alpha,beta-dehydrophenylalan prol ser threon allothreon|allo-threon tyros val norval norargin norcyste norleuc allys citrull ornith sarcos thyron thyrox pyrrolys lanthion hydroxyprol hydroxylys hadacid isoser|iso-ser agarit azaser alanos albizz|albizzi alli ethion selenoethion|seleno-ethion telluroethion|telluro-ethion canal canavan cycloleuc isoval|iso-val penicillam phenylglyc thean felin azidolys azido-lys|azidonorleuc|azido-norleuc azidohomoalan|azido-homoalan azidophenylalan|azido-phenylalan monoiodotyros|mono-iodotyros diiodotyros|di-iodotyros triiodothyron|tri-iodothyron tertiaryleuc|tertiary-leuc|tertleuc|tert-leuc|tert.leuc|tert.-leuc|t-leuc thiaprol|thioprol panton lamin saccharop tertiarybutylalan|tertiary-butylalan|tertbutylalan|tert-butylalan|tert.butylalan|tert.-butylalan|t-butylalan tertiarybutylglyc|tertiary-butylglyc|tertbutylglyc|tert-butylglyc|tert.butylglyc|tert.-butylglyc|t-butylglyc propargylglyc allylglyc 2-indolylglyc 3-indolylglyc norcitrull nortyros diphenylalan homoalan homoargin homocitrull homohistid homomethion homolanthion homoleuc homolys homophenylalan homoprol homopropargylglyc homoser homotyros beta-alan beta-homoalan beta-homoargin beta-homoasparag beta-homohydroxyprol beta-homoisoleuc beta-homoleuc beta-homolys beta-homomethion beta-homophenylalan beta-homoprol beta-homopropargylglyc beta-homoser beta-homothreon beta-homotyros beta-leuc beta-phenylalan alpha-glutam glutam beta-glutam homoglutam beta-homoglutam cyste homocyste selenocyste|seleno-cyste selenohomocyste|seleno-homocyste 
tellurocyste|telluro-cyste tellurohomocyste|telluro-homocyste tryptoph homotryptoph beta-homotryptoph nortryptoph aspart homoaspart|glutam homoglutam beta-homoglutam beta-glutam|beta-homoaspart carboxyglutam cyste homocyste selenocyste|seleno-cyste tellurocyste|telluro-cyste pantothen pyroglutam aspart-1-yl|l-aspart-1-yl|alpha-aspartyl|l-alpha-aspartyl|alpha-l-aspartyl d-aspart-1-yl|d-alpha-aspartyl|alpha-d-aspartyl aspart-4-yl|l-aspart-4-yl|beta-aspartyl|l-beta-aspartyl|beta-l-aspartyl d-aspart-4-yl|d-beta-aspartyl|beta-d-aspartyl glutam-1-yl|l-glutam-1-yl|alpha-glutamyl|l-alpha-glutamyl|alpha-l-glutamyl d-glutam-1-yl|d-alpha-glutamyl|alpha-d-glutamyl glutam-5-yl|l-glutam-5-yl|gamma-glutamyl|l-gamma-glutamyl|gamma-l-glutamyl d-glutam-5-yl|d-gamma-glutamyl|gamma-d-glutamyl cystyl|l-cystyl d-cystyl half-cystyl|l-half-cystyl d-half-cystyl tryptyl|l-tryptyl d-tryptyl aspartate(2-) aspartate(1-) glutamate(2-) glutamate(1-) lysinium(1+) lysinium(2+) aspart-1-al aspart-4-al aspart-1-ol aspart-4-ol glutam-1-al glutam-4-al glutam-1-ol glutam-4-ol arginamide|arginamid arginate|arginat homoarginamide|homoarginamid homoarginate|homoarginat methionine sulfoxide|methioninesulfoxide|methionin sulfoxid|methionin-sulfoxid|methioninsulfoxid|methionine oxide|methionineoxide|methionin oxid|methionin-oxid|methioninoxid pantothenol|panthenol pidolic acid|pidolicacid pidolate|pidolat pyroglutamal pyroglutamol pyroglutamide|pyroglutamid selenomethionine selenoxide|selenomethionineselenoxide|selenomethionin selenoxid|selenomethionin-selenoxid|selenomethioninselenoxid|selenomethionine oxide|selenomethionineoxide|selenomethionin oxid|selenomethionin-oxid|selenomethioninoxid telluromethionine telluroxide|telluromethioninetelluroxide|telluromethionin telluroxid|telluromethionin-telluroxid|telluromethionintelluroxid|telluromethionine oxide|telluromethionineoxide|telluromethionin oxid|telluromethionin-oxid|telluromethioninoxid abrine cystine|cystin cystinate|cystinat dopa homocystine|homocystin 
homocystinate|homocystinat selenocystine|seleno-cystine|selenocystin|seleno-cystin selenocystinate|seleno-cystinate|selenocystinat|seleno-cystinat tellurocystine|telluro-cystine|tellurocystin|telluro-cystin tellurocystinate|telluro-cystinate|tellurocystinat|telluro-cystinat alanopine|alanopin beta-alanopine|beta-alanopin butyrine carnitine|carnitin l-carnitine|l-carnitin d-carnitine|d-carnitin ciliatine creatine|creatin cystamine|cystamin cysteamine|cysteamin diaminopimelic acid|d,l-diaminopimelic acid|meso-diaminopimelic acid l,l-diaminopimelic acid|ll-diaminopimelic acid|l-diaminopimelic acid d,d-diaminopimelic acid|dd-diaminopimelic acid|d-diaminopimelic acid dibromotyrosine|dibromotyrosin dihydroxyphenylglycine|dihydroxyphenylglycin glutathione disulfide|glutathionedisulfide|glutathion disulfid|glutathion-disulfid|glutathiondisulfid guvacine|guvacin homohypotaurine|homohypotaurin homohypotaurinate|homohypotaurate|homohypotaurinat|homohypotaurat homoselenohypotaurine|homoselenohypotaurin homoselenohypotaurinate|homoselenohypotaurate|homoselenohypotaurinat|homoselenohypotaurat homotaurine|homotaurin homotaurinate|homotaurate|homotaurinat|homotaurat homoselenotaurine|homoselenotaurin homoselenotaurinate|homoselenotaurate|homoselenotaurinat|homoselenotaurat hypotaurine|hypotaurin hypotaurinate|hypotaurate|hypotaurinat|hypotaurat hypotaurocyamine|hypotaurocyamin methioninamine|methioninamin methylselenocysteine|methylselenocystein nopaline octopinic acid|octopinicacid octopine|octopin selenocystamine|seleno-cystamine|selenocystamin|seleno-cystamin selenocysteamine|seleno-cysteamine|selenocysteamin|seleno-cysteamin selenohypotaurine|seleno-hypotaurine|selenohypotaurin|seleno-hypotaurin selenohypotaurinate|seleno-hypotaurinate|selenohypotaurate|seleno-hypotaurate|selenohypotaurinat|seleno-hypotaurinat|selenohypotaurat|seleno-hypotaurat strombine taurine|taurin taurinate|taurate|taurinat|taurat dimethyltaurate|dimethyltaurat 
selenotaurine|seleno-taurine|selenotaurin|seleno-taurin selenotaurinate|seleno-taurinate|selenotaurate|seleno-taurate|selenotaurinat|seleno-taurinat|selenotaurat|seleno-taurat taurocyamine|taurocyamin tauropine|tauropin tetrazolylglycine|tetrazolylglycin tricine isolysine|beta-lysine|isolysin|beta-lysin lysopine|d-lysopine|lysopin|d-lysopin d-methionine (s)-sulfoxide|d-methionine-(s)-sulfoxide|d-methionine-s-sulfoxide|d-methionin-(s)-sulfoxid|d-methionin-s-sulfoxid|d-methionine (s)-s-oxide|d-methionine-(s)-s-oxide|d-methionin-(s)-s-oxid d-methionine (r)-sulfoxide|d-methionine-(r)-sulfoxide|d-methionine-r-sulfoxide|d-methionin-(r)-sulfoxid|d-methionin-r-sulfoxid|d-methionine (r)-s-oxide|d-methionine-(r)-s-oxide|d-methionin-(r)-s-oxid methionine (s)-sulfoxide|methionine-(s)-sulfoxide|methionine-s-sulfoxide|methionin-(s)-sulfoxid|methionin-s-sulfoxid|methionine (s)-s-oxide|methionine-(s)-s-oxide|methionin-(s)-s-oxid|l-methionine (s)-sulfoxide|l-methionine-(s)-sulfoxide|l-methionine-s-sulfoxide|l-methionin-(s)-sulfoxid|l-methionin-s-sulfoxid|l-methionine (s)-s-oxide|l-methionine-(s)-s-oxide|l-methionin-(s)-s-oxid methionine (r)-sulfoxide|methionine-(r)-sulfoxide|methionine-r-sulfoxide|methionin-(r)-sulfoxid|methionin-r-sulfoxid|methionine (r)-s-oxide|methionine-(r)-s-oxide|methionin-(r)-s-oxid|l-methionine (r)-sulfoxide|l-methionine-(r)-sulfoxide|l-methionine-r-sulfoxide|l-methionin-(r)-sulfoxid|l-methionin-r-sulfoxid|l-methionine (r)-s-oxide|l-methionine-(r)-s-oxide|l-methionin-(r)-s-oxid pantetheine|pantethein statine trimethylglycine|trimethylglycin opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/arylGroups.xml000066400000000000000000001276111451751637500316600ustar00rootroot00000000000000 aceanthren aceanthrylen acenaphthylen acenaphthen acenaphthoquinon acephenanthren acephenanthrylen acridan acridarsin acridin acridophosphin acrindolin adrenalin alloxan amphetamin anethol anilin anisidin anisol anthracen anthraquinon anthradiquinon 
anthrazin anthyridin arsanthren arsanthridin arsindol arsindolin arsindolizidin arsindolizin arsinolin arsinolizin as-indacen as-triazin|asym-triazin azulen benzanthron benzidin|4,4'-benzidin benzen benzoin benzoquinon benzotribromid benzotrichlorid benzotrifluorid benzotriiodid benzyn betacarbolin|beta-carbolin|b-carbolin bibenzyl biphenylen boranthren borneol caffein camphen camphorquinon carbazol carbostyril catechol chalcon chinolin cholanthren chroman thiochroman selenochroman tellurochroman chromen chrom-2-en chrom-3-en thiochromen selenochromen tellurochromen chromenylium thiochromenylium selenochromenylium tellurochromenylium chromocen chromon chrysen cinnolin cobaltocen collidin coumaran coumarin|cumarin coumaron coronen cresol cumen cymen cyclopenta[a]phenanthren decalin dopamin duren ethylenimin eugenol ferrocen flavylium fluoranthen fluoren spiro-9,9'-bifluoren fulven furan thiofuran selenofuran tellurofuran furazan furoxan guaiacol harmalin harmalol harman harmanamid harmin hemimelliten histamin homomorpholin thiahomomorpholin|thiohomomorpholin selenohomomorpholin tellurohomomorpholin homopiperazin homopiperidin hydantoin hydrobenzoin hydroquinon imidazol imidazolidin imidazolin indan ind-1-en ind-2-en indazol inden indol indolin indolizidin indolizin indoxazen isatin isoarsindol isoarsindolin isoarsinolin isobenzofuran isobenzothiofuran|isobenzothiophen isocarbostyril isochinolin isochroman isothiochroman isoselenochroman isotellurochroman isochromen isothiochromen isoselenochromen isotellurochromen isochromenylium isothiochromenylium isoselenochromenylium isotellurochromenylium isocoumarin|isocumarin isoduren isoindol isoindolin isoquinolin isoquinolon isophosphindol isophosphindolin isophosphinolin isosafrol isoselenazol isoselenazolidin isoselenazolin isotellurazol isotellurazolidin isotellurazolin isothiazol isothiazolidin isothiazolin isoxazol|isooxazol isoxazolidin|isooxazolidin isoxazolin|isooxazolin isoviolanthren lepidin lupetidin menthol 
mercuranthren mesitylen melliten molybdocen morpholin thiamorpholin|thiomorpholin selenomorpholin telluromorpholin naphthacen naphthalen|naphthalin naphthoquinon naphthodiquinon naphthyridin nickelocen niobocen norbornen|norcamphen|norbornylen norharmin osmocen ovalen oxanthren oxindol paracetamol paraxanthin perimidin perylen phenalen phenanthrazin phenanthren phenanthridin phenanthrolin phenarsazin phenazin phenetidin phenetol phenoxid phenoxylium phenomercurin phenoxazin phenothiazin phenoselenazin phenotellurazin phenophosphazinin|phenophosphazin phenarsazinin phenazasilin phenoarsazin phenomercurazin|phenomercazin phenoxathiin phenoxaselenin phenoxasilin phenoxatellurin phenoxaphosphinin|phenoxaphosphin phenoxarsinin|phenoxarsin phenoxastibinin|phenoxantimonin phenothiarsinin|phenothiarsin phloroglucinol phosphanthren phosphanthridin phosphindol phosphindolin phosphindolizidin phosphindolizin phosphinolin phosphinolizin phthalazin phthalid phthaloperin piaselenol|piazselenol piazthiol pinolin picolin piperazin piperidin picen pleiaden plumbocen prehniten pseudocumen pteridin purin pyran thiopyran selenopyran telluropyran pyranthren pyrazin pyrazol pyrazolidin pyrazolin pyren pyridazin pyridin pyrimidin pyrindan pyrinden|pyrindin pyrocatechol pyrogallol pyrrol pyrrolizidin pyrrolizin pyrrolidin pyrrolidon pyrrolin pyrylium quinaldin quinazolin quindolin quinindolin quinolin quinolizidin quinolizin quinolon quinoxalin quinuclidin resorcinol rhodanin rhodocen rubicen ruthenocen safrol s-indacen s-triazin|sym-triazin s-triazol|sym-triazol silanthren skatol stilben sulfolan sulfol-2-en sulfol-3-en sulfolen styren selenanthren selenophen telluranthren tellurophen tetralin tetralon thebenidin theobromin theophyllin thianthren thiophen titanocen tolan toluen toluidin trinden trinaphthylen triphenodioxazin triphenodithiazin triphenylen tritan tyramin tryptamin tryptolin uranocen urazol v-triazin vanadocen veratrol xanthen thioxanthen selenoxanthen telluroxanthen 
xanthylium violanthren xylen xylidin zirconocen adenin cytosin isocytosin guanin isoguanin hypoxanthin thymin uracil xanthin adenosin cytidin isocytidin guanosin isoguanosin inosin thioinosin thymidin|deoxythymidin uridin xanthosin nucleocidin idoxuridin ribosylthymin orotidin pseudouridin wybutosin apigenin aromadendrin|aromodendrin taxifolin|dihydroquercetin auron coumestan cyanidin diosmetin flavan flavan-2-en|flav-2-en flavan-3-en|flav-3-en flavon flavanon genistein dihydrogenistein genistin isoflavan|iso-flavan isoflavan-2-en|isoflav-2-en|iso-flavan-2-en|iso-flav-2-en isoflavan-3-en|isoflav-3-en|iso-flavan-3-en|iso-flav-3-en isoflavon|iso-flavon luteolin naringenin neoflavan|neo-flavan neoflavan-2-en|neoflav-2-en|neo-flavan-2-en|neo-flav-2-en neoflavan-3-en|neoflav-3-en|neo-flavan-3-en|neo-flav-3-en neoflavon|neo-flavon quercetin bilin corrin dihydrophloroglucinol porphyrin|porphin opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/arylSubstituents.xml000066400000000000000000000304131451751637500331060ustar00rootroot00000000000000 imidaz indaz isobenzothioph isothiaz isoxaz|isooxaz phenmeth pheneth phenprop phenisoprop phenbut phenpent phenhex pyrr styr tol phenyl acenaphth acrid acridars anthrac anthr|anthran arsanthr arsanthrid as-indac azul benzhydr boranthr chrys cinnol coron cyclopenta[a]phenanthr cym fluoranth fulv fur thiofur selenofur tellurofur furfur imidazolid indoliz isobenzofur isobenzothiofur isoquinol isoselenaz isotelluraz isoxazolid mesit morphol thiamorphol|thiomorphol selenomorphol telluromorphol naphthac naphthal naphth naphthyrid oval oxanthr perimid phenal phenanthr phenanthrid phenanthrol phenarsaz phenaz phenoxaz phenothiaz phenoselenaz phenotelluraz phenophosphaz phenoarsaz phenomercuraz|phenomercaz phenoxaselen phenoxatellur phenoxaphosph phenoxars phenoxastibin|phenoxantimon phenothiars phosphanthr piperaz piperid pleiad pterid pur pyr thiopyr selenopyr telluropyr pyranthr pyrazolid pyridaz pyrid pyrimid 
pyrroliz pyrrolid quinazol quinol quinoliz quinoxal quinuclid rubic s-indac selenanthr selenoph silanthr stilb telluranthr telluroph thianthr thioph thien|thiene tolu xanth thioxanth selenoxanth telluroxanth opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/atomHydrides.xml000066400000000000000000000046671451751637500321520ustar00rootroot00000000000000 bor alum indig gall thall carb sil germ stann plumb az phosph phosphor ars arsor stib stibor bismuth oxid sulf sel tell pol fluor chlor brom iod astat az opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/carbohydrateSuffixes.xml000066400000000000000000000105311451751637500336650ustar00rootroot00000000000000 ide|id uronamide|uronamid uronate|uronat uronic acid uronicacid urononitrile|urononitril onamide|onamid onate|onat onic acid|onicacid ononitrile|ononitril ose|os arate|arat|arate(2-)|arat(2-) aric acid|aricacid onamide|onamid onate|onat onic acid|onicacid ononitrile|ononitril osonamide|osonamid osonate|osonat osonic acid|osonicacid osononitrile|osononitril uronamide|uronamid uronate|uronat uronic acid uronicacid urononitrile|urononitril on uron ar on oson uron odiald itol ul ul yl onoyl uronoyl opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/carbohydrates.xml000066400000000000000000000455611451751637500323460ustar00rootroot00000000000000 glycer erythr thre rib arabin xyl lyx all altr gluc mann gul id galact tal abequ amicet ascaryl boivin sarment colit digitox cymar fuc digital parat quinov rhamn rhodin tyvel hamamel cladin strept eval evernitr api erythrul ribul xylul psic fruct sorb tagat sedoheptul oxir oxet furan pyran septan octan glycero erythro threo ribo arabino xylo lyxo allo altro gluco manno gulo ido galacto talo tri tetr pent hex hept oct non dec undec dodec tridec tetradec pentadec hexadec apio-alpha-d-furan|d-apio-alpha-d-furan apio-beta-d-furan|d-apio-beta-d-furan apio-alpha-l-furan|d-apio-alpha-l-furan 
apio-beta-l-furan|d-apio-beta-l-furan l-apio-alpha-d-furan l-apio-beta-d-furan l-apio-alpha-l-furan l-apio-beta-l-furan dithioerythritol dithiothreitol sorbitol rhamnulose fuculose glucamine|glucamin meglumine|meglumin saccharic acid saccharate|saccharat galactal glucal ascorbic acid dehydroascorbic acid ascorbate|ascorbat galactosamine|galactosamin glucosamine|glucosamin mannosamine|mannosamin fucosamine|fucosamin quinovosamine|quinovosamin bacillosamine|bacillosamin garosamine|garosamin neuraminic acid neuraminate|neuraminat neuraminamide|neuraminamid neuraminol muramic acid isomuramic acid lactose lactosamine|lactosamin lactosediamine|lactosediamin rutinose glyceraldehyde|glyceraldehyd d-glyceraldehyde|(d)-glyceraldehyde|d-glyceraldehyd|(d)-glyceraldehyd l-glyceraldehyde|(l)-glyceraldehyde|l-glyceraldehyd|(l)-glyceraldehyd dextrose galactosaminyl glucosaminyl mannosaminyl fucosaminyl quinovosaminyl bacillosaminyl garosaminyl neuraminyl|neuraminosyl muramyl|muramosyl isomuramyl|isomuramosyl lactosyl lactosaminyl lactosediaminyl rutinosyl aldehydo opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/carboxylicAcids.xml000066400000000000000000000742201451751637500326110ustar00rootroot00000000000000 form acet propion|propi butyr isobutyr valer isovaler oxal malon succin glutar acetoacet atrolact biotin|d-biotin caprin capro capryl enanth fol dihydrofol|7,8-dihydrofol tetrahydrofol|5,6,7,8-tetrahydrofol 5,10-methenyl-5,6,7,8-tetrahydrofol|5,10-methenyltetrahydrofol|5,10-methenyl-tetrahydrofol isocapro isosuccin lact|dl-lact l-lact|l(+)-lact|l-(+)-lact d-lact|d(-)-lact|d-(-)-lact mesoxal naphthion orot oxalacet pelargon|pelarg dimercaptosuccin tetrol pival adip pimel suber azela|azel sebac acryl methacryl croton isocroton male fumar citracon mesacon camphor acetur allophan alpha-isoduryl angel benzil beta-isoduryl bicarbam brassyl citramal cresot o-cresot|ortho-cresot|o-cresotin|ortho-cresotin m-cresot|meta-cresot|m-cresotin|meta-cresotin 
p-cresot|para-cresot|p-cresotin|para-cresotin gamma-isoduryl glutacon glycol diglycol glyoxyl|glyoxal hippur homolevulin hydracryl isonipecot isocinchomeron itacon levulin mal|dl-mal l-mal|l(+)-mal|l-(+)-mal d-mal|d(-)-mal|d-(-)-mal malein mandel|dl-mandel l-mandel|l(+)-mandel|l-(+)-mandel d-mandel|d(-)-mandel|d-(-)-mandel thiomal mucon nipecot orsellin|o-orsellin|ortho-orsellin p-orsellin|para-orsellin oxanil d-pipecolin|d(+)-pipecolin|d-(+)-pipecolin|d-pipecol|d(+)-pipecol|d-(+)-pipecol l-pipecolin|l(-)-pipecolin|l-(-)-pipecolin|l-pipecol|l(-)-pipecol|l-(-)-pipecol pipecolin|pipecol pyruv sulfanil thioglycol alpha-resorcyl beta-resorcyl gamma-resorcyl trimellit trimes laur myrist palmit stear ole elaid benz hydratrop atrop cinnam nicotin isonicotin then anis arachid|arachidyl|arach arachidon behen behenol brassid caffe cerebron ceromeliss ceroplast cerot citronell coumar conifer clupanodon eleostear alpha-eleostear beta-eleostear eruc ferul farnes gadole gedd gentis geran ghedd glycer glycid gondo homoanis homogentis homoisovanill homoprotocatechu homovanill homoveratr hydrocaffe hydrocinnam hydroferul hydroisoferul dihydrocaffe dihydrocinnam dihydroferul dihydroisoferul isoferul|iso-ferul isostear|iso-stear isovanill|iso-vanill laccer lignocer linole (9,12,15)-linolen|alpha-linolen (6,9,12)-linolen|gamma-linolen nervon margar meliss montan myristole ner o-veratr|orthoveratr|ortho-veratr o-homoveratr|orthohomoveratr|ortho-homoveratr palmitole penicillan 6-aminopenicillan petroselin|petrosel picol protocatechu psyll punic ricinole ricinelaid sinap sorb syring undecylen vaccen vanill veratr xyl propiol phthal isophthal terephthal anthranil aconit allant asaron asclepin carbaz cinchonin citr duryl edet gall hemellit hemimellit henatriacontyl heneicosyl heptacosyl heptadecyl hexatriacontyl homophthal homoisophthal homoterephthal hydant hydrangel isocitr isophthalon itamal itatartar lutidin m-hemipin metahemipin meta-hemipin m-homosalicyl meta-homosalicyl maleur 
melilot mellit mellophan mevalon mevald mucobrom mucochlor nonadecyl nonacosyl o-homosalicyl ortho-homosalicyl oxam oxaldehyd oxalur p-homosalicyl para-homosalicyl pant pentadecyl pentacosyl phloret phthalon piperonyl prehnit prehnityl pristan propargyl pter dihydropter|7,8-dihydropter 5,6,7,8-tetrahydropter 2-pyrocatechu o-pyrocatechu pyromellit pyrotartar quinald salicyl thiosalicyl seneci stearidon tartar dl-tartar l-tartar l(+)-tartar l-(+)-tartar dextrotartar d-tartar d(-)-tartar d-(-)-tartar levotartar mesotartar tartron terephthalon thaps tigl traumat tricarballyl tricosyl tridecyl trop undecyl umbell uvit valpr vanilmandel vanillomandel vanillylmandel leuc|dl-leuc d-leuc l-leuc n-butyr chargeAndOxidationNumberSpecifiers.xml000066400000000000000000000032471451751637500363510ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources (1+) (2+) (3+) (4+) (5+) (6+) (7+) (8+) (9+) (+1) (+2) (+3) (+4) (+5) (+6) (+7) (+8) (+9) (-1) (-2) (-3) (-4) (-5) (-6) (-7) (-8) (-9) (1-) (2-) (3-) (4-) (5-) (6-) (7-) (8-) (9-) (0) (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) cyclicUnsaturableHydrocarbon.xml000066400000000000000000000030241451751637500352700ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources adamant cub prism born|camph|bornyl car menth|p-menth|para-menth m-menth|meta-menth o-menth|ortho-menth norborn|norcamph|norbornyl norcar norpin pin thuj opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/elementaryAtoms.xml000066400000000000000000000336431451751637500326630ustar00rootroot00000000000000 lithium sodium|natrium potassium|kalium rubidium caesium|cesium francium beryllium magnesium calcium strontium barium radium aluminium|aluminum gallium indium thallium tin stannum lead|plumbum bismuth polonium scandium titanium vanadium chromium manganese iron cobalt nickel copper zinc yttrium zirconium niobium|columbium molybdenum technetium 
ruthenium rhodium palladium silver cadmium lanthanum cerium praseodymium neodymium promethium samarium europium gadolinium terbium dysprosium holmium erbium thulium ytterbium lutetium hafnium tantalum tungsten rhenium osmium iridium platinum gold mercury|hydrargyrum actinium thorium protactinium uranium neptunium plutonium americium curium berkelium californium einsteinium fermium mendelevium nobelium lawrencium rutherfordium dubnium seaborgium bohrium hassium meitnerium darmstadtium roentgenium copernicium nihonium flerovium moscovium livermorium boron carbon silicon germanium nitrogen phosphorus arsenic antimony|stibium oxygen sulfur selenium tellurium fluorine chlorine bromine iodine astatine tennessine helium neon argon krypton xenon radon actinon thoron oganesson antimonous antimonic argentous argentic aurous auric bismuthous bismuthic cerous ceric chromous chromic cobaltous cobaltic cuprous cupric europous europic ferrous ferric germanous germanic iridous iridic manganous manganic mercurous mercuric molybdenous molybdenic neptunous neptunic nickelous nickelic niobous niobic osmious osmic palladious palladic platinous platinic plumbous plumbic rhenious rhenic rhodious rhodic ruthenous ruthenic stannous stannic thallous thallic titanous titanic tungstous tungstic uranous uranic vanadous vanadic opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/functionalTerms.xml000066400000000000000000000241631451751637500326640ustar00rootroot00000000000000 ester salt glycol fluorohydrin chlorohydrin bromohydrin iodohydrin cyanohydrin oxime|oxim thioxime|thioxim|thiooxime|thiooxim|thio-oxime|thio-oxim selenoxime|selenoxim|selenooxime|selenooxim|seleno-oxime|seleno-oxim telluroxime|telluroxim|tellurooxime|tellurooxim|telluro-oxime|telluro-oxim bromoxime|bromoxim|bromooxime|bromooxim|bromo-oxime|bromo-oxim chloroxime|chloroxim|chlorooxime|chlorooxim|chloro-oxime|chloro-oxim fluoroxime|fluoroxim|fluorooxime|fluorooxim|fluoro-oxime|fluoro-oxim 
iodoxime|iodoxim|iodooxime|iodooxim|iodo-oxime|iodo-oxim hydrazone|hydrazon semicarbazone|semicarbazon thiosemicarbazone|thiosemicarbazon|thio-semicarbazone|thio-semicarbazon selenosemicarbazone|selenosemicarbazon|seleno-semicarbazone|seleno-semicarbazon tellurosemicarbazone|tellurosemicarbazon|telluro-semicarbazone|telluro-semicarbazon isosemicarbazone|isosemicarbazon isothiosemicarbazone|isothiosemicarbazon isoselenosemicarbazone|isoselenosemicarbazon isotellurosemicarbazone|isotellurosemicarbazon semioxamazone|semioxamazon imide|imid imine|imin oxide|oxid sulfide|sulfid selenide|selenid telluride|tellurid amide|amid anilide|anilid azetidide|azetidid hydrazide|hydrazid morpholide|morpholid piperazide|piperazid piperidide|piperidid pyrrolidide|pyrrolidid mercaptal acetal|ketal hemimercaptal hemiacetal|hemiketal hemithioacetal|hemithioketal hemidithioacetal|hemidithioketal anhydride|anhydrid thioanhydride|thioanhydrid selenoanhydride|selenoanhydrid telluroanhydride|telluroanhydrid peroxyanhydride|peroxyanhydrid dithioperoxyanhydride|dithioperoxyanhydrid diselenoperoxyanhydride|diselenoperoxyanhydrid ditelluroperoxyanhydride|ditelluroperoxyanhydrid oligo poly cyclo alcohol alcoholate mercaptan selenol thiol ether ketone|keton diketone|diketon triketone|triketon ketoxime|ketoxim oxide|oxid peroxide|peroxid selenide|selenid diselenide|diselenid triselenide|triselenid selenone|selenon diselenone|diselenon selenoxide|selenoxid diselenoxide|diselenoxid selenoether selenoketone|selenoketon sulfide|sulfid disulfide|disulfid trisulfide|trisulfid tetrasulfide|tetrasulfid pentasulfide|pentasulfid hexasulfide|hexasulfid sulfone|sulfon disulfone|disulfon sulfoxide|sulfoxid disulfoxide|disulfoxid telluride|tellurid ditelluride|ditellurid tritelluride|tritellurid telluroether telluroketone|telluroketon tellurone|telluron ditellurone|ditelluron telluroxide|telluroxid ditelluroxide|ditelluroxid thioether thioketone|thioketon azide|azid bromide|bromid chloride|chlorid cyanate|cyanat 
cyanide|cyanid deuteride|deuterid deuteroxide|deuteroxid fluoride|fluorid fulminate|fulminat hydride|hydrid hydroperoxide|hydroperoxid hydroselenide|hydroselenid hydrodiselenide|hydrodiselenid hydrotriselenide|hydrotriselenid hydrosulfide|hydrosulfid hydrodisulfide|hydrodisulfid hydrotrisulfide|hydrotrisulfid hydrotetrasulfide|hydrotetrasulfid hydrotelluride|hydrotellurid hydroditelluride|hydroditellurid hydrotritelluride|hydrotritellurid hydroxide|hydroxid iodide|iodid isocyanate|isocyanat isocyanide|isocyanid isofulminate|isofulminat isonitrile|isonitril isoselenocyanate|isoselenocyanat isotellurocyanate|isotellurocyanat isothiocyanate|isothiocyanat selenocyanate|selenocyanat selenofulminate|selenofulminat tellurocyanate|tellurocyanat tellurofulminate|tellurofulminat thiocyanate|thiocyanat thiofulminate|thiofulminat tritide|tritid opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/fusionComponents.xml000066400000000000000000000365421451751637500330640ustar00rootroot00000000000000 benzo|benz naphthyridino phenanthrolino aceanthryleno acenaphthyleno|acenaphth|acenaphtho acephenanthryleno acridarsino acridino acridophosphino acrindolino anthyridino anthraceno|anthra anthrazino arsanthreno arsanthridino arsindolizino arsindolo arsinolino arsinolizino azuleno benzeno betacarbolino|beta-carbolino|b-carbolino biphenyleno boranthreno carbazolo chinolino chromeno thiochromeno selenochromeno tellurochromeno chryseno cinnolino coroneno fluorantheno fluoreno furano furazano furo imidazolo|imidazo as-indaceno s-indaceno indazolo indeno indolizidino indolizino indolo isoarsindolo isoarsinolino isobenzofurano|isobenzofuro isochinolino isochromeno isothiochromeno isoselenochromeno isotellurochromeno isoindolo isooxazolo|isoxazolo isophosphindolo isophosphinolino isoquino|isoquinolino isoselenazolo isotellurazolo isothiazolo mercuranthreno naphthaceno naphthaleno|naphthalino|naphth|naphtho ovaleno oxanthreno perimidino peryleno|perylo phenaleno 
phenanthrazino phenanthreno phenanthridino phenanthro phenazino phenoxazino phenothiazino phenoselenazino phenotellurazino phenophosphazinino|phenophosphazino phenarsazinino phenazasilino phenoarsazino phenomercurazino|phenomercazino phenoxathiino phenoxaselenino phenoxasilin phenoxatellurino phenoxaphosphinino|phenoxaphosphino phenoxarsinino|phenoxarsino phenoxastibinino|phenoxantimonino phenothiarsinino|phenothiarsino phenomercurino phosphanthreno phosphanthridino phosphindolo phosphindolizino phosphinolino phosphinolizino phthalazino phthaloperino piceno pleiadeno pteridino purino pyrano thiopyrano selenopyrano telluropyrano pyranthreno pyrazino pyrazolo pyreno pyridazino pyridino|pyrido pyrimidino|pyrimido pyrrolizidino pyrrolizino pyrrolo quinazolino quindolino quinindolino quino|quinolino quinolizidino quinolizino quinoxalino rubiceno selenanthreno selenopheno silanthreno s-triazino|sym-triazino s-triazolo|sym-triazolo telluranthreno telluropheno thebenidino thianthreno thieno|thiopheno trindeno trinaphthyleno triphenodioxazino triphenodithiazino triphenyleno xantheno thioxantheno selenoxantheno telluroxantheno opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/germanTokens.xml000066400000000000000000000012571451751637500321430ustar00rootroot00000000000000 brom chlor fluor iod perfluor perbrom perchlor period groupStemsAllowingAllSuffixes.xml000066400000000000000000000011621451751637500354350ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources hydrazin acetylen hydroxylamin phytan groupStemsAllowingInlineSuffixes.xml000066400000000000000000000105701451751637500361460ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources allen formazan isodiazen keten urethan isopren ammonium phosphonium arsonium stibonium bismuthonium oxonium sulfonium selenonium telluronium fluoronium chloronium bromonium iodonium silylium germylium stannylium 
plumbylium phosphine arsine stibin bismuthin aceton propionon butyron valeron enanthon caprylon isobutyron isovaleron lauron myriston palmiton stearon glutathion glycerol|glycerin sn-glycerol guanidin saccharin opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/heteroAtoms.xml000066400000000000000000000266771451751637500320150ustar00rootroot00000000000000 fluora chlora broma ioda astata oxa thia selena tellura polona aza phospha arsa stiba bisma carba sila germa stanna plumba bora alumina galla inda thalla zinca cadma mercura cupra argenta aura nickela pallada platina darmstadta cobalta rhoda irida meitnera ferra ruthena osma hassa mangana techneta rhena bohra chroma molybda tungsta seaborga vanada nioba tantala dubna titana zircona hafna rutherforda scanda yttra lanthana cera praseodyma neodyma prometha samara europa gadolina terba dysprosa holma erba thula ytterba luteta actina thora protactina urana neptuna plutona america cura berkela californa einsteina ferma mendeleva nobela lawrenca berylla magnesa calca stronta bara rada litha soda potassa rubida caesa franca hela neona argona kryptona xenona radona fluoronia chloronia bromonia iodonia astatonia oxonia thionia selenonia telluronia polononia azonia phosphonia arsonia stibonia bismuthonia carbonia silonia germonia stannonia plumbonia boronia aluminonia gallonia indonia thallonia zinconia cadmonia mercuronia cupronia argentonia auronia nickelonia palladonia platinonia darmstadtonia cobaltonia rhodonia iridonia meitneronia ferronia ruthenonia osmonia hassonia manganonia technetonia rhenonia bohronia chromonia molybdonia tungstonia seaborgonia vanadonia niobonia tantalonia dubnonia titanonia zircononia hafnonia rutherfordonia scandonia yttronia lanthanonia ceronia praseodymonia neodymonia promethonia samaronia europonia gadolinonia terbonia dysprosonia holmonia erbonia thulonia ytterbonia lutetonia actinonia thoronia protactinonia uranonia neptunonia plutononia americonia curonia 
berkelonia californonia einsteinonia fermonia mendelevonia nobelonia lawrenconia beryllonia magnesonia calconia strontonia baronia radonia lithonia sodonia potassonia rubidonia caesonia franconia helonia neononia argononia kryptononia xenononia radononia fluoranylia chloranylia bromanylia iodanylia astatanylia oxidanylia sulfanylia selanylia tellanylia polanylia azanylia phosphanylia arsanylia stibanylia bismuthanylia carbanylia silanylia germanylia stannanylia plumbanylia boranylia alumanylia indiganylia gallanylia thallanylia fluoranida chloranida bromanida iodanida astatanida oxidanida sulfanida selanida tellanida polanida azanida phosphanida arsanida stibanida bismuthanida carbanida silanida germanida stannanida plumbanida boranida alumanida indiganida gallanida thallanida fluoranuida chloranuida bromanuida iodanuida astatanuida oxidanuida sulfanuida selanuida tellanuida polanuida azanuida phosphanuida arsanuida stibanuida bismuthanuida carbanuida silanuida germanuida stannanuida plumbanuida boranuida alumanuida indiganuida gallanuida thallanuida opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/hwHeteroAtoms.xml000066400000000000000000000110741451751637500322750ustar00rootroot00000000000000 fluora chlora broma ioda oxa thia selena tellura aza phospha arsa stiba bisma sila germa stanna plumba bora aluma galla indiga thalla mercura fluor chlor brom iod ox thi selen tellur az phosph ars stib bism sil germ stann plumb bor alum gall indig thall mercur phosphor arsen antimon oxa thia selena tellura aza bisma sila germa stanna plumba mercura ox thi selen tellur az bism sil germ stann plumb mercur phosphor arsen antimon oxa thia selena tellura bisma mercura ox thi selen tellur bism mercur arsen aluma indiga fluor chlor brom iod ox thi selen tellur az phosph ars stib bism sil germ stann plumb bor alum gall indig thall mercur 
opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/hwSuffixes.xml000066400000000000000000000105671451751637500316450ustar00rootroot00000000000000 iren|irin iran|iridin et etan|etidin etin|eten ol olan|olidin olin|olen inan inin epin epan ocin ocan onin onan ecin ecan cycloundecin cyclododecin cyclotridecin cyclotetradecin cyclopentadecin cyclohexadecin cycloheptadecin cyclooctadecin cyclononadecin cycloeicosin|cycloicosin cyclohenicosin cyclodocosin cyclotricosin cyclotetracosin cyclopentacosin cyclohexacosin cycloheptacosin cyclooctacosin cyclononacosin cyclotriacontin cyclohentriacontin cyclodotriacontin cyclotritriacontin cyclotetratriacontin cyclopentatriacontin cyclohexatriacontin cycloheptatriacontin cyclooctatriacontin cyclononatriacontin cyclotetracontin in an opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/index.xml000066400000000000000000000030771451751637500306170ustar00rootroot00000000000000 arylSubstituents.xml multiRadicalSubstituents.xml simpleSubstituents.xml substituents.xml arylGroups.xml simpleGroups.xml simpleCyclicGroups.xml groupStemsAllowingAllSuffixes.xml groupStemsAllowingInlineSuffixes.xml cyclicUnsaturableHydrocarbon.xml elementaryAtoms.xml aminoAcids.xml carbohydrates.xml naturalProducts.xml alkanes.xml atomHydrides.xml carboxylicAcids.xml nonCarboxylicAcids.xml chargeAndOxidationNumberSpecifiers.xml functionalTerms.xml heteroAtoms.xml hwHeteroAtoms.xml hwSuffixes.xml fusionComponents.xml multipliers.xml infixes.xml inlineSuffixes.xml inlineChargeSuffixes.xml suffixPrefix.xml carbohydrateSuffixes.xml suffixes.xml unsaturators.xml miscTokens.xml opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/infixes.xml000066400000000000000000000032061451751637500311470ustar00rootroot00000000000000 amid azid bromid chlorid cyanatid cyanid dithioperox diselenoperox ditelluroperox fluorid hydrazid hydrazon imid iodid isocyanatid isocyanid isothiocyanatid isoselenocyanatid 
isotellurocyanatid nitrid perox selen tellur thi thiocyanatid selenocyanatid tellurocyanatid hydroxim opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/inlineChargeSuffixes.xml000066400000000000000000000010041451751637500336010ustar00rootroot00000000000000 ium ide|id ylium|(ylium) uide|uid ylium|(ylium) opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/inlineSuffixes.xml000066400000000000000000000057471451751637500325110ustar00rootroot00000000000000 aldehydoyl amido|amidyl aminyl amoyl aniloyl carbonimidoyl carbonyl hydrazido io iminyl oximino oxy oyl selenenyl seleninyl selenonyl sulfenamido sulfenoselenoyl sulfenothioyl sulfenyl sulfinyl sulfonyl tellurenyl tellurinyl telluronyl yl ylidene|yliden ylidyne|ylidyn imido|imidyl dicarboximido oyl yl ylidene|yliden ylidyne|ylidyn amido|amidyl carbonyl hydrazido oxy oyl amido|amidyl hydrazido oyl yl ylidene|yliden ylene|ylen opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/miscTokens.xml000066400000000000000000000175321451751637500316300ustar00rootroot00000000000000 cen len phen phenylen naphthylen helicen cyclo|cyclo- hydro perhydro cis trans endo exo syn anti r r or s s or r r and s s and r ine|in thio seleno telluro ylamine|ylamin spirobi|spirobi- spiroter|spiroter- spiro|spiro- spiro|spiro- d|(d) l|(l) dl|d,l|(dl) ds dg ls lg deoxy desoxy deamino desamino demethyl desmethyl dehydro anhydro deaza|desaza deoxa|desoxa dethia|desthia deoxy desoxy deamino desamino demethyl desmethyl dehydro anhydro deaza|desaza deoxa|desoxa dethia|desthia - - , ( [ { ) ] } ( [ { ) ] } ( [ { ) ] } a e o o , o multiRadicalSubstituents.xml000066400000000000000000000236541451751637500345030ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources methylene|methylen ethylene|ethylen propylene|propylen butylene|butylen pentylene|pentylen hexylene|hexylen heptylene|heptylen octylene|octylen nonylene|nonylen 
undecylene|undecylen carbene vinylene|vinylen neopentylene|neopentylen durylene|durylen phenylene|phenylen phthalylidene|phthalyliden|phthalal isophthalylidene|isophthalyliden|isophthalal terephthalylidene|terephthalyliden|terephthalal tolylene|tolylen semicarbazono siloxane|siloxan ureylene|ureylen xylylene|xylylen hydrazo azino azo azoxy nno-azoxy non-azoxy onn-azoxy diazoamino nitrene|aminylene|aminylen imino iminio nitrilo nitrilio phosphinidyne|phosphinidyn arsinidyne|arsinidyn stibylidyne|stibylidyn bismuthylidyne|bismuthylidyn phosphinidenio arsinidenio phosphinico arsinico stibinico peroxy nitroryl phosphinato arsinato stibinato silylene|silylen germylene|germylen stannylene|stannylen plumbylene|plumbylen borylene|borylen oxy thio seleno telluro carbothioyl carboselenoyl carbotelluroyl carbohydrazonoyl carboimidoyl carbohydroximoyl etheno prop[1]eno|propeno but[1]eno|[1]buteno|but-1-eno but[2]eno|[2]buteno|but-2-eno buta[1,3]dieno|[1,3]butadieno|buta-1,3-dieno epoxy|epioxido epidioxy|epidioxido epitrioxy|epitrioxido epithio|episulfano lambda4-sulfano|epi-lambda4-sulfano lambda6-sulfano|epi-lambda6-sulfano epidithio|epidisulfano episeleno|episelano epitelluro|epitellano epimino|epiimino|epiazano diazano|biimino|epidiazano diazeno|epidiazeno triaz[1]eno|azimino|epitriazeno|epitriaz-1-eno phosphano|epiphosphano arsano|epiarsano stibano|epistibano silano|episilano germano|epigermano stannano borano|epiborano (metheno)|metheno (azeno)|azeno|epiazanylylidene (phospheno)|phospheno|epiphosphanylylidene (ethanylylidene)|ethanylylidene|epiethanylylidene (ethanediylidene)|ethanediylidene|epiethanediylidene (propan[1]yl[3]ylidene)|propan[1]yl[3]ylidene (prop[1]en[1]yl[3]ylidene)|prop[1]en[1]yl[3]ylidene (diazanediylidene)|diazanediylidene|epidiazanediylidene (epoxymethano)|oxaethano (epiminoethano)|(iminoethano)|[1]azapropano (epoxyprop[1]eno)|(epoxy[1]propeno)|[1]oxabut-2-eno (epoxyprop[2]eno)|(epoxy[2]propeno)|[1]oxabut-3-eno (ethanoiminomethano)|(ethaniminomethano) 
[2]azabutano (methanooxymetheno)|(methanoxymetheno)|epi[2]oxapropan-1-yl-3-ylidene (epoxymethanoazenometheno)|(epoxymethanonitrilometheno)|[1]oxa[3]azabut-3-eno (epoxymethenoazenomethano)|(epoxymethenonitrilomethano)|[1]oxa[3]azabut-2-eno amine|amin opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/multipliers.xml000066400000000000000000000120351451751637500320530ustar00rootroot00000000000000 mono mon di tri tetr pent hex hept oct non dec undec dodec tridec tetradec pentadec hexadec heptadec octadec nonadec eicos|icos henicos|heneicos docos tricos tetracos pentacos hexacos heptacos octacos nonacos triacont hentriacont dotriacont tritriacont tetratriacont pentatriacont hexatriacont heptatriacont octatriacont nonatriacont tetracont hentetracont dotetracont tritetracont tetratetracont pentatetracont hexatetracont heptatetracont octatetracont nonatetracont pentacont bis tris tetrakis pentakis hexakis heptakis octakis nonakis decakis undecakis dodecakis tridecakis tetradecakis pentadecakis hexadecakis heptadecakis octadecakis nonadecakis eicosakis|icosakis henicosakis|heneicosakis docosakis tricosakis tetracosakis pentacosakis hexacosakis heptacosakis octacosakis nonacosakis triacontakis bi tri tetra penta hexa hepta octa nona deca undeca dodeca trideca tetradeca pentadeca hexadeca heptadeca octadeca nonadeca eicosa bi ter quater quinque sexi septi octi novi deci undeci dodeci trideci tetradeci pentadeci hemi sesqui hemipenta hemihepta opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/naturalProducts.xml000066400000000000000000000654741451751637500327130ustar00rootroot00000000000000 aconit ajmal cinchon morphin trop androst campest chol cholest ergost estr|oestr furost gon gorgost poriferast pregn spirost stigmast pterocarp prost roten thrombox alpha-terpineol beta-terpineol gamma-terpineol 4-terpineol|terpin-4-ol|4-terpinenol|terpinen-4-ol bufanolide|bufanolid bufadienolide|bufadienolid cardanolide|cardanolid 
cardenolide|cardenolid androstenol androstenone|androstenon androstadienone|androstadienon androstanediol androstenediol androstenedione|androstenedion campestanol cholesterol codeine|codein codeinone|codeinon dihydrocodeine|dihydrocodein hydrocodone|hydrocodon|dihydrocodeinone|dihydrocodeinon oxycodone|oxycodon|dihydrohydroxycodeinone|dihydrohydroxycodeinon estratetraenol heroin|diamorphine|diamorphin|diacetylmorphine|diacetylmorphin dihydroheroin|diacetyldihydromorphine|diacetyldihydromorphin morphine|morphin morphinone|morphinon dihydromorphine|dihydromorphin hydromorphone|hydromorphon|dihydromorphinone|dihydromorphinon aporphin berbin ergolin alpha-pinen (+)-alpha-pinen (-)-alpha-pinen beta-pinen (+)-beta-pinen (-)-beta-pinen alpha-terpinen beta-terpinen gamma-terpinen delta-terpinen|terpinolen beta,beta-caroten|beta-caroten|all-trans-beta-caroten beta,gamma-caroten beta,epsilon-caroten|alpha-caroten|all-trans-alpha-caroten beta,kappa-caroten beta,phi-caroten beta,chi-caroten beta,psi-caroten|gamma-caroten|all-trans-gamma-caroten gamma,gamma-caroten gamma,epsilon-caroten gamma,kappa-caroten gamma,phi-caroten gamma,chi-caroten gamma,psi-caroten epsilon,epsilon-caroten|epsilon-caroten|all-trans-epsilon-caroten epsilon,kappa-caroten epsilon,phi-caroten epsilon,chi-caroten epsilon,psi-caroten|delta-caroten|all-trans-delta-caroten kappa,kappa-caroten kappa,phi-caroten kappa,chi-caroten kappa,psi-caroten phi,phi-caroten|isorenieraten|all-trans-isorenieraten phi,chi-caroten phi,psi-caroten chi,chi-caroten|renierapurpurin|all-trans-renierapurpurin chi,psi-caroten psi,psi-caroten|lycopen|all-trans-lycopen neurosporen|all-trans-neurosporen zeta-caroten|all-trans-zeta-caroten cepham cephem ceph-2-em ceph-3-em penam penem pen-2-em 1-carbapen-1-em|1-carba-pen-1-em|carbapen-1-em|carba-pen-1-em lysergic acid dihydrolysergic acid isolysergic acid dihydroisolysergic acid lysergate|lysergat dihydrolysergate|dihydrolysergat isolysergate|isolysergat 
dihydroisolysergate|dihydroisolysergat lysergamide|lysergamid dihydrolysergamide|dihydrolysergamid isolysergamide|isolysergamid dihydroisolysergamide|dihydroisolysergamid lysergol dihydrolysergol isolysergol dihydroisolysergol opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/nonCarboxylicAcids.xml000066400000000000000000000541461451751637500332710ustar00rootroot00000000000000 arsor arsen azor nitror phosphor orthophosphor ortho-phosphor stibor antimon sulfur carbam carbanil carbon mesyl diphosphor pyrophosphor pyro-phosphor sulfam sulfinam sulfenam arsin phosphin arson azin azon phosphen phosphon stibin stibon arsenen bor orthobor ortho-bor borin boron selen tellur chrom dichrom mangan permangan technet pertechnet rhen perrhen perruthen dibor diboron diphosphon diarson diarsor distibon distibor ditellur diselen disulfur|pyrosulfur|pyro-sulfur hypodiboron hypodiphosphon hypodiphosphor hypodiarson hypodiarsor hypodistibon hypodistibor hypodisulfur|dithion hypodiselen hypoditellur dicarbon disilic triphosphon triphosphor triselen trisulfur tricarbon tetraphosphor tetracarbon trithion tetrathion pentathion besyl brosyl cacodyl edisyl esyl ethion hyponitr isethion isohypophosphor methion nitron nonafl nosyl orthocarbon|ortho-carbon orthoform|ortho-form orthoacet|ortho-acet orthopropion|ortho-propion orthobutyr|ortho-butyr orthoisobutyr|ortho-isobutyr orthovaler|ortho-valer orthoisovaler|ortho-isovaler orthotellur|ortho-tellur hypobor pyrocarbon|pyro-carbon sulfamid sulfoxyl tosyl tresyl amidosulfonate|amidosulfonat amidosulfonicacid|amidosulfonic acid arsenious|arseniousacid|arsenious acid bicarbonate|bicarbonat bisulfate|bisulfat bisulfite|bisulfit diphosphate|diphosphat diphosphite|diphosphit diselenious|diseleniousacid|diselenious acid disulfate|pyrosulfate|pyro-sulfate|disulfat|pyrosulfat|pyro-sulfat disulfite|disulfit pyrosulfite|pyro-sulfite|pyrosulfit|pyro-sulfit metabisulfate|metabisulfat metabisulfite|metabisulfit 
glycero-1-phosphate|glycero-1-phosphat glycero-2-phosphate|glycero-2-phosphat glycero-3-phosphate|glycero-3-phosphat sn-glycero-1-phosphate|sn-glycero-1-phosphat sn-glycero-2-phosphate|sn-glycero-2-phosphat sn-glycero-3-phosphate|sn-glycero-3-phosphat hydrosulfite|hydrosulfit hypothiocyanate|hypothiocyanat hypothiocyanicacid|hypothiocyanic acid hypothiocyanite|hypothiocyanit hypothiocyanousacid|hypothiocyanous acid hypodiphosphate|hypodiphosphat hypodiphosphite|hypodiphosphit hypophosphate|hypophosphorate|hypophosphat|hypophosphorat hypophosphoric|hypophosphoricacid|hypophosphoric acid hypophosphite|hypophosphorite|hypophosphit|hypophosphorit hypophosphorous|hypophosphorousacid|hypophosphorous acid isohypophosphate|isohypophosphat orthonitrate|ortho-nitrate|orthonitrat|ortho-nitrat orthophosphate|ortho-phosphate|orthophosphat|ortho-phosphat orthophosphite|ortho-phosphite|orthophosphit|ortho-phosphit pentaphosphate|pentaphosphat peroxodicarbonate|peroxodicarbonat peroxodicarbonicacid|peroxodicarbonic acid peroxocarbonate|peroxocarbonat peroxocarbonicacid|peroxocarbonic acid persulfate|persulfat perxenate|perxenat perxenicacid|perxenic acid phosphite|phosphit phosphate|phosphoate|phosphat pyrophosphate|pyro-phosphate|pyrophosphat|pyro-phosphat pyrophosphite|pyro-phosphite|pyrophosphit|pyro-phosphit selenious|seleniousacid|selenious acid selenite|selenit selenate|selenoate|selenat sulfite|sulfit sulfate|sulfoate|sulfat peroxomonosulfate|peroxymonosulfate|peroxomonosulfat|peroxymonosulfat peroxomonosulfuric|peroxomonosulfuricacid|peroxomonosulfuric acid|peroxymonosulfuric|peroxymonosulfuricacid|peroxymonosulfuric acid tellurite|tellurit tellurate|telluroate|tellurat triflic|triflicacid|triflic acid triflate|triflat tetraphosphate|tetraphosphat triphosphate|triphosphat bromic|bromicacid|bromic acid bromous|bromousacid|bromous acid chloric|chloricacid|chloric acid chlorous|chlorousacid|chlorous acid fluoric|fluoricacid|fluoric acid fluorous|fluorousacid|fluorous acid 
iodic|iodicacid|iodic acid iodous|iodousacid|iodous acid hypobromous|hypobromousacid|hypobromous acid hypochlorous|hypochlorousacid|hypochlorous acid hypofluorous|hypofluorousacid|hypofluorous acid hypoiodous|hypoiodousacid|hypoiodous acid metaperiodic|metaperiodicacid|metaperiodic acid nitric|nitricacid|nitric acid nitrous|nitrousacid|nitrous acid orthoperiodic|orthoperiodicacid|orthoperiodic acid|ortho-periodic acid silicic|silicicacid|silicic acid|orthosilicic|orthosilicicacid|orthosilicic acid|ortho-silicic acid perbromic|perbromicacid|perbromic acid perchloric|perchloricacid|perchloric acid perfluoric|perfluoricacid|perfluoric acid periodic|periodicacid|periodic acid bromate|bromat bromite|bromit chlorate|chlorat chlorite|chlorit fluorate|fluorat fluorite|fluorit iodate|iodat iodite|iodit hypobromite|hypobromit hypochlorite|hypochlorit hypofluorite|hypofluorit hypoiodite|hypoiodit metaperiodate|metaperiodat nitrate|nitrat nitrite|nitrit orthoperiodate|ortho-periodate|orthoperiodat|ortho-periodat silicate|orthosilicate|ortho-silicate|silicat|orthosilicat|ortho-silicat perbromate|perbromat perchlorate|perchlorat perfluorate|perfluorat periodate|periodat sulfon sulfin|thion sulfen selenon selenin selenen telluron tellurin telluren opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/regexTokenList.dtd000066400000000000000000000010441451751637500324220ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/regexTokens.xml000066400000000000000000000244721451751637500320100ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/regexes.dtd000066400000000000000000000002011451751637500311070ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/regexes.xml000066400000000000000000001102311451751637500311410ustar00rootroot00000000000000 
opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata/000077500000000000000000000000001451751637500325775ustar00rootroot00000000000000alkaneStemModifier_110RegexHash.txt000066400000000000000000000000131451751637500411760ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1589210247alkaneStemModifier_110SerialisedAutomaton.aut000066400000000000000000000005231451751637500432540ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata alkaneStemModifier_110_reversed_RegexHash.txt000066400000000000000000000000131451751637500432340ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1589210247alkaneStemModifier_110_reversed_SerialisedAutomaton.aut000066400000000000000000000005231451751637500453120ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata annulen_78RegexHash.txt000066400000000000000000000000111451751637500370260ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata977079442annulen_78SerialisedAutomaton.aut000066400000000000000000000037371451751637500411200ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata annulen_78_reversed_RegexHash.txt000066400000000000000000000000111451751637500410640ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata977079442annulen_78_reversed_SerialisedAutomaton.aut000066400000000000000000000037371451751637500431560ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata biochemicalLinkage_236RegexHash.txt000066400000000000000000000000101451751637500411730ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata21796777biochemicalLinkage_236SerialisedAutomaton.aut000066400000000000000000000014231451751637500432540ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata biochemicalLinkage_236_reversed_RegexHash.txt000066400000000000000000000000101451751637500432310ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata21796777biochemicalLinkage_236_reversed_SerialisedAutomaton.aut000066400000000000000000000014231451751637500453120ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata chemicalRegexHash.txt000066400000000000000000000000131451751637500366170ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1544405556chemicalSerialisedAutomaton.aut000066400000000000000000712724451451751637500407230ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata
(#.O 7<<3e?/LSNa6.5&K :(#O(#4 &.O4jR4S<20@(c<Q&$9&9)`QOIO:lFQOOA"Da rSePI#X3SeV/& r@Seh_(:)rD)Y":)5H[@V5HW5HV >hv(G<.2=8H1;.V0#%@HH>N 01NFFN'T+e''g,$K=Pm'K8]3=V/& r@='-Rmh&/8Y.A8@(%X6PB HLVN(#bI ) qP#&|(kH2+>)HD (Mi3 (4W68(#4W [LSS-'ITH HMi(#?s! HN*'V'3;U-I^8;U %MZ %=;;U& % %&VN(kO(k>)H|&| QaQaiaLL*"W B>0#98B*Haj*A=*N,%C9N,N,9O+B<_( ZE ZZC 0 C C S P;V;!;E# =$9&# OCOIO:l# (OO(HRD5;O62F`J93+L^#)3+:+'O?Wj8"g>6u<{qz9Oq X5K>97?K!E*5Kff7?BJB7?,5K"`* :EP5 Ia36/<S90(Z6/6/(ZN*P8Y]HYPG8 $.>GO8 8 @@NG4U@\BLLP.6.&W0..H?-C,A @ 7U(o!4'0@l(oK%(o@lO!n$PA^$@75?>A s4Y iY 4-Y7O-SY: %19?!(U!(&M(K6:RbK }P9D=?6)\DMqWV,%Gv :%RRCP7RQ8h \X4!B8h+8hv7G6V?Q1117>X# kBgCd..Q3M(bHub"HS*(bH?(2xQ2x1o2xHp962N?O=8B9Y] 3Y78G:(kG]Q53ONC:(5:(@@N34U@\BLLISIbIJ*m0A0A1: /tWKXZK7M!BADoDo9[DoDo5!V{{O~K&dACXO~M!M!XO~M!M!U/TA/T)Wo!.ELD%L }- HG{P2F.V9R HFF<O!]F*1k8 % %MZ %=;&& % %#<<8i:%WL9CR9C9C)X#S %KJ6/H H *W#OQ1( OVVFOVNK 7U:CY]<7]?YU 7UGHJH~!C}C ON5HJHJ@@N4U@\'-BLL!n$P$L7 0AX 07: 0N&+R R8v;R \  "(.wm&.q:..w 4a5%GEG;vj;v;vp"={MKW/X "6W2XMKM& JM?7""|T_2NV|2NUH]MG0T0M>5> 90M(#(#M7D$0M;'-(#(#(h#-Ev#=v=vX7#_Wb#D 5#.0#Fq(*#T}T}NT}&6%!,{6%5=6%YUl ~NV4-zB'C.F 42XZ;]9(y4 2=*`2=TUl1:@A .+)BO#*(*(PIJ*(6%!,{.6%5=6% SVoSP6C;1M67GN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5:(@@N.4U@\BLL,a)J> S*%#3+%6N*Ks7>3 /aVT7F L * *FN@I-89&v;m='03FPP.V1$ $KX f2S5s $(#'(# I f $[(#9(#[e VVo6rS<O HC;1+< H V VT D:_D-;o5 D?S?Oy D?? BiBi<BiBi$6 $X f2<[S5s $(#"p'(# I f $(#(#eT&:6Q  @O++= R8vN= <&,NX'-'-j%$X  f2 e%Ae7!I f%'-eK82<2;BY7OQ[BO6N*#$#$I&'?SI&MI&OMP8MI&"pcAAXq A"9+/3r$3r"3r:<AK^9<N5<'-5dlh^H[[P+//\ 099BR@Mr:  P-OKg ;OKAoOK7-@-IMeQ0C"Q("FK"%J6<V61$1M1 &&"6eeMSS7IB$`B$?L<%i1@G:KXZ@u9g*SOR@zf@WKXZ7IS6~:Qh::N -p!n$PA^$@7M!5?hR0R0R0 O7@ En KYN*S; c.%K *3/{DP#AHnRKPTH/<TY~ K#%/6Y~oY~Q ?2d/*+@0// & K/ /KW2/d8eE7B)S1T` 6 W2W26GjGs(X 3GjV5sGs(#'(# D$D$Gs;(#(#E9 Y8 #7' :RY'E/ JVC 'd1C' E'9;1- z)%.?t!KXXTIO/ 21BO0Y(GWC2|0 02| 1S?R3%y% iHQ0JiY  FFpSG . 
G<&" oL  oI H :BDr ).oUP#=%PBsT> 75&s KJsH H5*W#.CME  CWUsx!kMECWSQ$6FX f2Q5s6(#&'(# I f6C2V[(#(#[eTJI$mMPU"1yVNUHS  WUo@N -n:-/$ KR1-/,N5-/'-1U6. $,11Dy,9BN'1d4>+f:6EuD2|!e02|v 6lJeOO0P;k$>O;k(#=v(#M7D$;k'-(#(#O J8BI JKG JB<+j?S8X f25H SAe/2'GI fS'-XkheKt8RZNXC=%WG9 4(:KVF m9 ).9 KKVF4%#=-*(X( ?LU?2J?2?2,jTFO6 2U-W$J6NKUJhJ6L^L*uQ.8D"3=TJ6 HQ.6$OdO'b7A,+YAYAE9 Y8 #7' :RY'E/ JVC 'd1' sE'9;-" M40AE=*HRl0A1W&! J1DW?/C 1W6R\C1WP"W?[6n"3EL ID/ 6Q6Q W,?\N5CN5 P45Y5.2Y6??AN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5:(@@#1)N.4U@\BLL{TD,q.,q,,q<,;L )x)x/qQ<=J.$&0R"JL}#@!`u(rs0L} "Q7{N;("OL" }gW3K@W)*fW O  UJ  0SDWCTRSG,UU PW(pj -FX fGU Ae/I f '-eXVX+5XO hVX%1%#cXXVX%%Q$&<=$&*R 'm6NV/GVjN MII-;,tSxG\z&A05o zKezDABG\  P P6.(6E8G f2@8?? I f84??4eK5,'[CR#H@=%YG B: CGA@ m(#? B*. B KK@4%#=-*!G - -* #X$3X?X$ (UX${BBTQDGG5YYFR/GH2;622*%,'59nR{$) )%,J.B37J%S8J!=dJ(#(#M-p/3(#(#/3$6 $X f2nS5s $(#'(# I f $(#(#e0W'Tti EF'+; E5 #A #Fq(#A4Y i0 FM90Y=6 "# 3Y=6?=6A 35 CP9!0Y!0)N7!0 GwJGw1g O6WOS,u&&=K5,'K~CR#HJ=%YG B CG6A mR B*. B KK6A4%#=-*0PP/>5 3 IEUP/(#(#M7D$P/4;'-(#(#(hD<088)<%1$='FX f2JV5s='(#'(# I f='[(#9(#[e9g!99C D2CiNKD2B1QzJD2(#LAP |?u0U9Ec90 |<Xb?, $"pY<1t?<A$5 CP9R34 iHQJiY XG FFpG .(. !!0GPPR:BDr*k;WB1T/Q=W1DH;.H1!ORQ=K95[#|I%UAPFN5Yi'-hHLI;Q*HTBOZ 1TBPTB1QXSQ,[l9lENI;/GVjN MI;SxG\zA05o zKezDABG\  P P+R RA8vV+'.E7/d):U9NP?E7??*gBWW45)S'?7;?J0L5s.,Xk; ;1XrFT KB1LJL-VXSDWYRL U7L |N7#}?YPT2F DB;QF4 Mc&K*%4VU4T;K*WW;+y w93(#U(D a.|U(U(;VHyHyWTPP9 LHRU/TA64gJx;/TKo1)D<,JCWoC=1Q71C9LHJO Q] #fM+B&;:,e9^ %&;9^ 5ODu)27SgRDu:<4 ::>'>R=>-@M45U'82.H,'D6 T,V4H,$/@6&63-H,E1%/N=JRO: (# ]DrW9F6U:)z!H'-BGE[?ITSPYI);  ,4.O m @6<RR?6++!+ (#.O 7<<3e?/LSNa6.5&K :(#O(#4 &.O4jR4S<&C& OTTP%TXW:V]PY ?6Mx)\H6D6|WV,0%.OGv%?c `!6%O.OtY7CuI%~D +e.E)C, E&bE8; . ByHc>5?V>M> 8?;#7 ;U;M4WE0A W*6PF[1]P777%UP77P8PG..OC8 @@. 
U:Y&x6?5RWcFF`"{7~3S"I6:"IG"{'"{UFPH "\ $PX>S5s $(#'(# $(#(#X+f: 20LEuWC2|0 02|>9J <,N'-'-N?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5:(@@#N64U@\BLL'NuD'0Y#XK[A0mG58FT0mB0mZ+AF88IfE}bT:N75 Ia36/SS90S(Z6/ +6/(Z!X@ Ae/-pLA56 ]LAffLA:N[8/%?"O8%Qb <-/$ aR1-/N5-/'-5d'-Q;J8--u--.XT%8..gUjj!1/\A;HK82<0T90W2Q';.KF 3Y8QJ?Q5A 35 CP9 c cD c7{::18IU?'^8<?"?, UM$HY')DI%~D +H.ELT6|C, EE8; . By5 . .\<<2+R R8vV+{U"0 2__RRz////M4WE0A W"P,@P PU  $DE(#(#MDE/3(#(#/3LJ53+PJWWaaX@E9o(:)r:)VVK82<2*;RVN>?,GAOMHS@vQ%AOSSIJXAO-;.nE41y-[}9H[#D 5#K%p0#V/(=#'-E=YE='E=0u6oK>V6h;'jW;%KY 6;DhKP@ 4,H nKPTH <-/$ R1-/N5-/'-5d'-A/@QdJ+E-J"p"pJ("<@ p"={MKW/X "O2XM%KM& J'E9tM?7""11>Xf?P?\N5CN5E P45Y%5(N?O=8B9Y] VWY78G:(EG]%VWOC:(5:(@@NVW4U@\BLL BY!NWNM_ L 6 6Yo -!LLM;<-IQQ+2!9,US ;a@$SO.@foS/ /  B$$1H421JQ41 &*IPPPT/"P/"WHO)WH/"WHQj*'U-/1-?? -? -I?%X-5(#-(#(#lJp;4 ^%6!60PP> f2Y1(#(#M%WD$I f'-(#(#he"(.wm&.qB.w 4a5%GEGR $KX f2S5s $(#(# I f $/3(#(#/3eA&CK #(?JsE&P(&P+H&P+&&/1.Y9J:5 '81.8?AV</ 4X8Y78l!q+ @] QjVN;HSG43E3O :/|*p!;T X0=U)1)13LHRU/TA64gJx;/TKo1/ )D<,JCWoCE1Q1CLHJO ] %%%0:K6pC N3'-N&XDB,@x'2M8 o -iT*QTF"!#"T*QG?*Q@4m89p #?LC$kX5q<+Y#A 5#;0#Fq(#BZBZ02 $KX f2S5s $(#'(# I f $[(#>(#[eG/T/T)|<59/I![ >] PU6 >]Dz>]$;Y3|H=U)1)13 J8 Y/FAV7i@J 9,<R?V;';b4,VECb3CbSCbP3p.'E7/d):U91P?E7?? 
BY8Z1 )S N?7Y?1s7.,Xk>YT*G4$S]A/V 2M8J3I (i488<M(iPPX C<LHRU/TA64gUx;/TKo18))"UWoIO=1Q1CLHUO ]   %&%:=6?r:GW8 Ia.o6/A$0S(Z6/ 6/(ZN'-AFFGZ%s6Q R#D)M22*8%,'59nR{$)O( )%,J37) O(GSGJtJ!=d<#J 7g>Y>P"tN1N%)%lAaN1"Lp=g (Ws< ;NKIO,8 ; %MZ %=; ; % %>H< Q'ItHx>60g HxF7Hx #N7Bq7mO4@BqFF_F#` ` `BqFFGeI )]!E1?/?/fH6#"M#73^7#77^.)?ES7hE")?!OMDOMB&OMVC<p"="?24Fv6WFv;V;!; ^&'X M!M!M!X HRM!M!HR!]#yV0U#yY#y IC > QMkE/X19KY HJS$(9 '-MGK@"NGF30MF,4.OM"U.OK(m%"G"M2,O4KZEG.O4j4M[$V:N#/99\H[N?O=8B9Y] 6YC8G:(EGLq6C OC >H:(5:(@@N64U@\LD2CiNKD2BBITHD2 \ 4KFSF H 4N7IvN7*],/ 4=N7N7=8D fU7P ?6)\P'E.WV,0%Gv-I+'%OTvB@B@+/B@R8 PHUG\WxuOLzDABG\u P P= !1E= = W%NT}7y?G$$.XR8CSGR8(l:TPAUWo'TPN5TP4'-hPnABG\JJNFu:BDrY]+rHYOHBG'T-9C O'K'@@N4U@\BLL:N'-U/TA/TAK)X>%I X(=)X00-3F'.900009x LZ"O`E-:-/$ 8R1-/ N5-/'-;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7=4L |3o,N7#}7?Y8XYW8T8 }FF%sW93p9:N75 Ia36/S90S(Z6/ 6/(Z"ZU/TA/TAK)** 7} " 90(?T ELD%U\/NLUA D^^UP.UDTD^ HQ6$:3AK*y>3C NC 5)u3'-hFSSETTSm@)&MX< ]Dr;4)zF#B'KTQP7F3N.O'3e'O<-N/><-H17R<-MM)CDJvK52BY @<>,BC 7mCF"IF_F#`C$FF$J&:fI:,~4:Y(#:(#(# 9!22-/$ OR1-/@N5-/UK?!(U5R1}F/!(`%$R&M:$"I6:%$ '%$U$FPH "\ =SP GSxN W: 7CI-,tSxC ACq* )J> S8^p;:/hS7N'-;}'- H 8 ).oUP#Psm$> 75s KJsH H5*W#'j%$X  f2 e%AeI f%'-eH LC,A @ 7UR8(oGi!4'@l(oK%(oSGR8@l(lE8'Q)9F:QJR!3}M/JRWAJRG$ -G N5GIGG=N" M4C 0AE=*H7Rl0A1W' J17/&i1W6R\1W"7[6n"3EL0K@3E!L=#\@wK@2xW)95$Bc?J~22D8tX0BcSY6 )%6<RRR-<(# &j CF!N=/hN/'-'-Dy9B2Lz!V\.Z>!+-!7UHQ 2iIJ.uIJP2IJO8}{8}rA!B#h:VRA^3VRN5VR'-Y5dh2? 7UC9nM3 f7UP:!GW=4<GSP:N=dP:aY/;%&;Y/rBR=rO*;O*C O*NFq-:@y%5Fr%7vH5%1/NR1/H11/pH>="2=H>E)CSKE)'])&{0> f2&{(#(#MI f&{'-(#-*(#eX\XTIO/ 2O0$FGWC2|0 02|P7/Y ?6Mx)\D'WV,0%Gv%I+'%OTvM VAA| V VM:6C Nc'-K(U1&cNDFx$m)M( ( ( ( (%H  f2 e%AeI f%!s!se66O}6l-NV$-B4lUZ*81(yX24'<"<l81$$QUQO~,&*8ACXO~M!M!XO~HRM! M!HR)ToEP,%. f ".WE(#(#M%WD$I fE;'-(#(#he4$S2]A/V 283I (i88(i3<ZB3<H3<Mo# &X,# O&# M(#!RY3(#2l(E'@]!] 
P-8Ui6>]Dz?2>]:|Ui+HQR<7!\TIO/ *rIV1DO%BTG@K7?%B -K(%BA77/W7K;6#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6L"pO#3 -LG5 \ 4KFS H 4N7B}IvN7*] 4=N7N7=(%J  e%Ae%!s!sV7E|&T_2NV|2NU#O @#V!@@ O0+-!#?Q}COYJ? A B?Y1Jl#7&/8Y=&v:56g#.Al8A>7$4l8Y8<Z+A+$8# k8!XI @I MS!6&I3;)!M!*UI3M!%M!"I3M!M!.&;0!6 %38p@I(@IKO@IK5,'[CR#H?=%YG B^ CG8? m B*. B KK?4%#=-*@*" "ZMQ!Gi(V mS$ K ?~D>7?~B}B};?~? <FK+ H7H HHOoOo$P$P Oo$P-MN,O)CN !K@B9QRuRu:'('(7zU 8O G MM88 8$<888-[6n/y:BDr:H>=%,BG=VBTRE@@=IJ=-;@41y-[}PY ?6)\H6|WV,0%.OGv%O HLVN%(k )OFA(k>)O&&$=KJH Y HO*W#K5,'[CR#HPQ=%YG B# C2PQ> m> B*.F B KLKPQ4%#.=-*)PXXD'L*Ad>8JAd"AdO@P,C(#(#@[4(#/GSxN MII-;:SxzSA05o zKez KYN#*S; c.`K *9O3C[5`@@EAHnK`THFz lU-P2IYKI+1IV NYXK:B:DrRF=%6xBGK?TY*/V> mN5?6.? KKV4%#'-=-*& _J&y&0JFFZ8!>;=wNbF`)F`LLTF`L_RO:S(# ]DrW9F;:)zYg'- BGE[?IKTPYI)VH  %4.O m @6<RR?6!+S(#.OKS7<<3e?/LSNa6.5&K :(#4 &.O4j 4S<MY!NWCt 8)N  \ 6 M 6CX2 (=B``@ WKXZPD 8?6Mx)\P/w.WV,0%.OGv%%O[,)'[7[GE*AM)1/NPE?Mo( 5)SAe7 x5s.,Xk)&{0> f&{(#>(#MI f&{'-(#(#e-@M?U'82'$(k&pE6E>)!$/E66K] EO&p- NYFD 4H:[NHL\p5&U?z Fl-/$ R1-/N5-/'-L'-#L< ]Dr;4)zFB'TQP7F3N''ON?mP4?K+ NPAFqX OTq777'6q776/H 7UC,A @7U(oKc!4'@lN(oK4+(o@l3xB(FF6!FF0Rb f2V?b??? %=I fb?$?e?K5,'[CR#HPQ=%YG B C2PQ> m>N B*. B K KPQ4%#=-*" P>H f '&G+ (#(#M%WD$I f ;'-(#(#he;gW,:BDrBTOFcFcKSS %:R7[P> f UW7[(#(#M%WD$I f7[;'-(#%g(#he<-><-H1<-M:+X'O?Wj/|@8"g>6uK,5):/|!p KS55YsYs D:_D-;o5 D?S?OyD D??7@!K5z5zJ5z<?{/tE5)0-/;1-?? -4??46?(B8 jT W'XW2KKB=/ *YiGO"G8wG#'HV-CV-:jMW-MC_N19*47?! 
':4J7?BJ7?8*44J??O HLVN%(k )O1(kRP>)FO&&$=KJH HO*W#"ZGT@CC'G;GB}F F%sO6Q3PY ?6Mx)\H'6|WV,0%GvI+'%OTv8'Q)Q !D8T:28(56PT2F DB;QF4TG Mc&K*4V"4K*NWKRHOdO'b7A:BAKq5BN5BYi'-h=3#!"Jm&LQ#!84Y#!M==O8}{;4F8}'5rP7N''*E*NAMYVK2X?Mo%#TX29XT` 6 TW2W26/+D :a/']'j M,,#'B&P>>XK?(T>>?~L/'>7?~B}B}?~$RV%NWY?,GAO$HS$g*\WYFFAOSSIJAO-;V*@WY41y-[}2=8--u-W&{.@> f2&{(#(#MI f&{/3(#(#/3eAs <#(5#"0#V/(@0#n0WWY<)PH2?2?W$U:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{B<UFPH "\<=U/TA/T/G)V 0.!w/G 1$PY ?6)\H66|WV,0%.OGv?c `!6%O.Ot2`P,-a-a>Od Q'OHx7A HxFHx *48*4?O1-IXnn@n39 h3109;3E5))A;HK8(#<0TQ q90,2J2EQ;"HD (wYE(4W68QJ(#4W Q[LSJAw(#?s! PPN*' nS?4Sv 1S?JK,E."_&/.Y'.A)T4ATAOTDI_TDTDM 9Q7$Ct 8Gq)Q7 T\ 6 M 6 WP ? NB`:*" 1CTT +TEN9'=W/9==Y09W8dW',0O%'Ny'X%""2 L UQJ7y $1@/*C<)C0 JO M6P*CJO-%LJOR1PS P O+ HLVNL(#b )55 qP#V(k2+>)=QK1HDL (,!3 (4W68L(#L4W [LS?>K H(#?s! ,*'4|(V8',R"  PUPH ?6)\H W6|WV,%Gv%W^(.+@G11>$ 9@>N5>TFO6 2U-W$J6NDJhJ6L^L {Q.DS8TJ6 HQ6${L6>{? ENE>5JOdO7A$/* -Q$&<=$&]R  #"I: oP?S4A7XV o-i F"**C 8Ke #C?^.$:+X'O?Wj68"g>6u+*C*36X?e5n H:K(&^Mx!c'-eMGK@=*v>GMFMj4.OM=.OK(m%=G"M2Mj:4KZEG.O4j4M[$VLkLU? C4Lk<LkTh""Q%T(4F(Q (Hf Tk?L+U M@ !(w;,&M1>aY/U;%&;$M$}R wY/%%(I rINI ?*(Q *(IJPIJ*4*(IJ HLVN%(k )OFA(kb>)O&&KJTH< HO*WKJ#22T;D'599nR{ <0.%J%?- cS%J!J8; .2 cBy 4 , EHH6|3 8".(QLB9& 2R(H>f7UR8G.:(.! @ MF,RO.:(5(%:(GSGR8N,R4Q-=4j)nM[(l.7$8; .By -/$ R1-/@N5-/Jk& I-@M45U'82.N'D60>,5(4EN$/@6&D63-NE1%&/N=J 1U  6  - VmP. 9Q7$Q7T Y< ]Dr;4)zFB'tTP7N'3e'O-@*T:8:*T&15&*T&&O^L5I^#K9"^Q??C N758 U9;  <?O)%K$Q~Gv@ %KG1Y 9UVU UDaDa"DaUrUr2UrfH TN:.X?< ]Dr H:K(&^Mx!c'-BMGK@=Tv>GNMFMj4.OM=.OK(m%=G"M2MjO4KZEG.O4j4M[$VR+'/E;+N*8"->CWNX;.(N!ORCK95[#|IY?L>J N f 2HT5s(#'(# 2D$I f;(#(#e;.!X7(GP\u J<H-?4FW//~*\@ CFWFW(==RQM 0&{0> f&{(#(#MI f&{'-(#)(#eJ HLVN%(k )OFA(k1>)O&&TKJH' HO*W#K@3K@W)/tE5_4H}.D)Y"T05H@V5HW5Hu(rs0V &JNRVN?,GAO1HS@vAOSSIJAO-;41y-[}?R: ,x;';b4,EbH'llN A/7o/Uk9Z{L6*"N*><A6LH:1G*15%Peen?6)\e1 WV,%NGv4"%iK5,'[CR#H?=%YG B^ CG8? mN B*. 
B KK?4%#=-*o?R !n$P$ t:-5?:-5S!UL~L~X>%Ip X(=7X%cLg{TRF5JRS6,6$6c Fx>oFx&FxXDVCN H(UMx'-MGK@AOHS/GMF)4MAOSS.OK(m%AOG"M2:4KZEG.O#4jH4M[$V!=,Fx($mMMG 5V 5; 5*W?!(U!(`&M(K6:Rb%$Kz Y^zz>*)JJf@@|@:%+|DCs4OY/?1n4O?F5:$9Ui6H?:|Ui+HQZ:a/PBP%ZP3BX-.# .Ve.IJUV@xG%g=&oI 'K'&oB=B&oBBY;6/KM!0XoRY01O~K&ACXO~M!7M!X O~M!M!C98C'CB(BDB;D+<44EZHT2F F I5YY@KO55Ff7R$,[!p9S61 ;l/| CNXRW C$] C,5):/|RR!p NK 7UCY]<7]HYVp7UGHJ!C}SGOHJHJ@@NG4U@\BLL+ M N& )G MH NP NSt1<* 7U:CJ)|6-Y 7UG"H!V! FN5"UIJ"-; 41y-['-} GhS: HN7N7IvN7*]?j?jN7N7P7U7U! 8*z O ?\N5CN5! ":Q&!:CX/;>uud Q'/!&pHxC HxFHxK] EO&p NY/hN'-'-P- CD+>I < CDCD;mJv&cM[)5V?\N5CO$1kGqN5  2 Y"tN1%) %lAa"Lp.8: #. %.!N?%X-5(#-(#(#N 01NFFUNWTFO6 2U-W$J6NKUJhJ6L^L*uQ.D3=TJ6 HQ6$RO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<31;4=3<3&*O62J & sCQIRQFQ1E W1EW1E -02R\$R\!R\R\8 RTHRT+/RTUM,RU)l<W<LX!]< %W /XGU Ae/ !s!s*0/NQP/ f2X/?? %=I f/??eSK95{@}M!Jf@@|@)^:%+/XHH :& gC5wAu-"*C>X@CC?'*--G_4H}.D)Y2"T05H@V5HW5Hu(rs0V ,'H1MC*>AV-NEF; ,,NE$NEH9S],,X,X,F^n7n#7nN7: Dr;4FB'P}TP7' 'AFq#:-/$ R1-/C N5-/'- IT4. &-3^4N+NnP4NNhE")?'NuD'0YXK[A0mRG58FT0mB20mZ+AF88IJ&@#[J&J&FTf1Vx SVoC`/0<SM;1<DPCME%MR1PS P O+"4=cOK ;OKAoOK6X%$ X%V+N5X%U:Y&x6?5RWcF(`"{dy3S"I6:"I~"{'"{UFPH _"\LSLoLL 1QJ|):g1"1F- u"+S.NK@rTAGD.CNTYV6h;'jW;%KY @;DhKP 4,HO nKPTH)j;"eVlK % * ;!]O!] G)j;!]!]FUA%.+%N?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5PI:(@@N64U@\BLL:*" 22{'599nR{ <.%J%+EkkS%J!J8; .2kBy/GK5,'[CR#HPQ=%YG BV C2PQ> m>T B*. 
B KSKPQ4%#=-*O8}{;4F8}'rQP7F3.O''/h-/$ R1-/N5-/'-;}'- 5!V, 5; 5E9 Y8 #7' |:RY'Q?/2&V|C'd1' E|'9;-XP XP&( XPA4Y i0 FM90Y=6 "# 3Y=6?E=6A 35 CP9-@M/U'82'$It&pE6E0gXT$/E66K] EO&p-T NY|IGU7|7=CAINHAI323'3&/8Y.IATO8F0PP> f2TE(#(#M%WD$I f'-(#)(#he"%*%#3+%6N*Ks7>3 /aVT7F L *(\ =u*FN@IV|Ht4PM_4H}.D)Y2",T05HX`@V5HW5Hu(rs0V W /9FFCY  6l;1X1LJV %W-L@%v;mQ 9/=4%M%MG)*%#3+%6N*Q27>3 Q;;Q2Q LQ E *3 *Q2N@IXDVCNHM(&^Mx '-MGK%AOHS&GMF4%MAOSSK(m%AOG"M2:4KZEG.O4j4M[$VYy,-% H3KK -/$ R1-/LN5-/'-'-`<+<XD5N*H(>Mx '-e-GK@AOVQDBGMF$4MAOSSK(m%AOG8L@o$:4KZEGTv4j4M[:XDV2-CN H((!Mx'-MGK@AO lHS&&GF3MF-q?N4MAOSSK`2-.OK2-(m%AOG"M2-q:4KZEG.O4j 4M[$VNFuI1/)2Y]+rHYOH/)G',M;-9O'K'@@N4U@\BLL7U7"V: $X f2S5s $(#Q '(# I f $(#(#e?B8RCS7=3LG4Y< WK!*G4(G(GJW*h"W?G4&U3FF%sW93pO9%-/$ R1-/FN5-/'-+q62>[6+q+q@J OPF&/yeen:H>,eG=RE@@F=IJ=-;@41y-[}55UPH: F>A.HGKP6|,FNP}IJP-;41y-['-}n#WWY< m> n B*.UY B KKPQ4%#.=-*N*S*3JQL 0 'C C?KKII2?<?<2=,22;EENW303e9))2G:IC3AK&S%OC3N5C3'-hNFu:BDrY]+rH=%OHBG'T-9C O'K'@@N4U@\BLLO8}{8}rA!F3S #1< R X:&Q&QHI1/)2,A @/)(oM;4'@l(oK(o@l pT pSD pK-P9('"`V 3aFK R7JU$#cVU$Z%U$H9KY,A @ (o4'0@l(oK%(o@lOL%<B<(G<7KXpQppH E)CSGKE)K+3+*`G*`0+*`"|KLN7"|"|9+B+B+"@X++K(#M=4214J41441a8 HLVN%(k )FA(kRP>)X#;tTKJH H*W# J8 Y/ AJT@ %KP25n?6Mx)\eWV,0%.OGv%O%3P$Fr%7v%I'I@U''0=C K:B:DrRF=%6xBG?TY*/V> mN5?6.? KKV4%#'-=-*Q$ 4A 4@ 4+2P!BtOO 80YTh 0/B>K= Dn;V6 H :BDr ).oUP#=%PBszT> 75&s KJsH H5*W#G: G:IIN7G:I)R0a6X6?L50a066})R0aSQ*'?'^8<?""C?=8B98GW GN5OOC:(G5OU:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{B<UFPH "\XO#DX X=:XJX f 2+<75s(#'(# 2D$I fC2;(#(#e0O8ss&XDUV@x'2M8 -iT*QTG!#"T*QG?*Q@4m89p #?LC$kX5qELD%L^#)'>4K ?5KO1AA;Cy P7!>y y!WQ+U098*;@j*A=* A22*8%,'59nR{$)O( )%,JP37) O(GSG1J!=d8yJaY/;%&;Y/.28-@M4 U'82.'$&p6,V !$/66K] EO&p-! NYG;ND.CNYE=( -@M4U'82.'$&p6,V !$/66K] EO&p-! NYV)PxM;MM!BQq'&" o/3o o! $X f%9S5s $(#(# 9I f $'-(#(#e /)9Xq?Oq9! ?H?-C,A @7UR8(o1!4'0@l(oK%(oSGR8@lO(lR\Eb&pK] EO&p NY k *C; f20,;??? 
%=I f;??e:EP5 Ia36/1S90(Z6/ 6/(ZG"">XDV+bNHM(>Mx '--GK%AOHS&GMFKw4%MAOSSK(m%AOG8L@oKw:4KZEGTv4j4M[:5ODu)27SgRDu:<4 :: NJ;-v>R9 >>N'-;}0&{0> f2&{(#(#MI f&{'-(#(#e4T.T@G#@!`6I,G*%#3+%6N*Q27>3 Q;;Q2Q LQ E *Os =.*Q2N@I)=8PF-)'')8''"ZS2]:|Cj%-Ba2LG3L11 L%'LK1 -0n5 \YST@n!$!n APS"6]G;$2/9:\<Vv59:\N2M!:\o O=?!e%BAO=?7L| m:HmmTh"R$,[!p96L;l CN24SW C$] CR4S 6  - ?B8RCS7=3LG4Y<)W KW_!*G4(G(GW*h"HWM?e?G4&U"?F%s, L "90 ;""6+@X(T81HV HVW303e9)4)(R-9A2BN5'-h)UTFO62U-W$/N#)J%qD^^Q. DTD^ HQ6$E9 Y8 #7' |:RY'/2&V|C'd1' E|'9;-8--)5V?\S@N5CO$1kGqN5 p 7g>Y>"tN1%)%lAa"Lp: $X f2)S5s $(#'(# I f $(#(#e?8'9;" 1AQ<=?p&],H- 2@$&0 nA nR3#@ nA\Au(rs0 13 9&,;4jP:&::44 49H:3WT8/$j3P3W/$UX?< ]Dr H:K(Mx!c'-BMGK@=2QT;GF3NMFX4.OM=.OK(m%I@=G"M2XO4KZEG.O4j4M[$V V/ 6Q6Q HH Ss1e<3<ZB3<3<Mo 7UC7U ! :& gC5wAu&Z-"*CO>X@CC?'*--GC FJ{NLfOKSfJ{%$q FJ{( SD1so@N=tO=t7v=tX:VH>+F*2W2/dYfS:UfY H QE7??B;d3"W8)S?=?XT` 6 W8W2 'FW2t)6<OWI1<D8W % %MZ %=;&&W %&!];u %#:A93TB*`F( SVoSP6C;1MR6N7/##N7H9z' HIHFF_F#` ` `HFF.62@@3<&*V$ED$V#X1j{>{?oOAYTE9@>N5.#>8!ho$$ YP;mPZ3PK5,'CR#HC=%YG BS CGPBVF m B*. B KKVF4%#=-*E8')9F:!Q".JR$\!3}M/JRWAJR!G".WW,>K <1$ 21N51-p'-5d'-1<E9 Y8 #7' :RY'/ JVC'd1' E'9;-MI ( ;F&A\/f<</ff/f< ++%g%gF ;yJ")6+7 ZPh:N$8/HW,t._NI-Lz5jYN)J)J/#Lz LzSN8^7pW0NENDXP+0"N XP&:XPHgYs [YsL4H4 p4C &N YX 7n9*S?: L 14*Z'v 1S?4JJK,%y=PPT1TTk?L+U M@ !(w; &M1>aY/U;%& T;$M$}R wY/%%(T ELD%U\/N-WLU  D^^NUP.UDTD^ HQ6$+= RO,8vBPY ?6)\H6|WV,0%.OGv%OF%s 6LA ]LAfLA';:EP35 Ia36/S90(Z6/6/(Z{{{Z? :% ? ? M["AN7 D7 -:]1-?? -??X@E9o.'E7/d):U9 XP?E7??ZBW X)SAe?7? Xs.,Xk.oB$$'8+O%'+'.'.'A.'hGm RO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)@N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<MXMX&'MX.Jq?Oq9! 
?0LMLL*^P *^~*^0Ar_5^?=rYC_eVeHeAr _eeSWWS)(Fz lP22#1#"I: oP?S4A7XV o-i F"**C 8Ke #C?^.$?'^8<?"" ?X7-@1s-K^-SI 6S$S# 3P  `>ODu)DuJ<N':j -FX f26GU  Ae/I f '-e/**NK 7UCY]<7]HYVp 7UGHJ!C}SGOHJHJ@@NG4U@\BLL.3XT%8..g.2 <P_FNDG'-'-$`:KB:KguR?'2v(#SC49#?)z Q'- GE[ BI9y C+uOI)> E N4 m@6<RR B*+9+++-!+2v(#+ .OK2v7<<3e B/LSNa6.5&KE :(#4 &.O4jHA;4S<*7c=_<*%6SBVQ%63%6X:EP35 Ia36/S90(Z6/6/(Z#&F,LBm9nFp%s"[YE32WpN6L/pO#-G5;9;65e;T PT #bT FFR%?@\?@\2B#(o??4 GJ("h-<?4%</R.U\%R=%Cn(9>;}Y>"tN1%4/")%lAa"LpUK?!(U5R1}F/!(`%$=X&M:$6:_%$ '%$U$FPH "\>-e>O>>:C:; :XDVTvNHM(>Mx 'G%AOHS&RMFNz%AOSS(%AOGI+'Nz4Q-=Tv4j)nM[1)(G+R J<"h-R08vYh\@=A#Cn(9>;}(4F(Q (E 4:,i$$)$9HhY.# .eVe ?.e"f,=lE;JHf 2?%'HV-CrB K(qV-LJKh,WKh.=7Kh |N7#}?Y""wD77{("YOY8"YYp mp'p "?0J%U2AE2N52Yi'-Plh Z;::|%Ba|;SA1 A1 ?'K1 -0nRGd!n4$PK?Y #L$/:\9i 7M!57@;:\N2Do>:\R M! nlDo .#..WK5,'[CR#HPQ=%YG B C2PQ> m>U B*. B KKPQ4%#=-*KYN#*S; c.`K *S 3C[5`@@8zAHnK`TH'C ? CFK$.7mEC$F_F#`$$F"\F$ EU'+; E< M 9@$Ct 8Gq)$1Q7! 2T\ 6  M Q&! 6CXCX/;).R $KX f2QS5s $(#(# HI f $/3(#-*(#/3e=!Oz9=!22=!6A Z;:?;SN" N:E.=9B9& F(H>f'-8BGK%:(GAMGMFV4O%M:(5K(m%:(G89NV:4KZEG4j4M[?VNHSG(( ?(?6D./66JB6< Dr;4)zF#B'%TP7N'3e'O"3=NNNWJ W $XS5s $(#(# $/3(#(#/3%>"?RV%:N7GKAOTWHS?IFNF5?'AOSSIJ-AO-;41y-['-IJDPvTOPvLSjLM!PvLS S(G+R J<"h-R 8vYh\@=Cn(9>;} A&:IC3ABK&S%OC3N5C3'-h:W5 Ia.6/$0S(Z6/6/'C.(ZH?%Ia&Y /XIM!M!M!XIM! M!H9# H NU6 )RP# XUL$[ . !?KJH H *W#:7NV='F V/ 6Q6Q59# YA  9?;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7EAL |,N7#}7?YKW:B6KK.'E7/d):U94P?E7??JBW+ 4)SN?7?4s.,Xk LF`3a33KK!W@1AA=9B9& QN(H>f'-8BGK.:(GMGMFV4O,CM:(5K(m%:(G89NV:4KZEG4j4M[?Fl0YC NM'-'-E1zNK 7UCY]<7]HYVp7UGHJ!C}SGONHJHJ@@NG4U@\BLL*I A4)K V61$RdC=%&(1GD,KM1pRVF mD,U.D, KKVF4%#=-**`70 ((E@z.R$,[l96;l C N:CWK~ C$];. C!O!RCK95[#|I_4T.D)Y"T5H @V5HW5HVR3 NO#HQJiN UFFpG . 
G.WY+N 01NFFNY6?`"46:Rb"{K5,'CR#HC=%YG BS CGPBVF m B*.[ B KKVF4%#=-*#RYRL,RY;RYq9OqLA9!#zafDaX X(=XK~FK~K~ =9+%$X  f2 e%8I f%e%gJv<.[\<.(G<.i;1X1LJV %W-L |%5AM7<MPX>A6q'I+g)?B6J-VN]K LN '-'-Al,N4 (G+R J<"h-R 8vYh\@=Cn(9>;}VN HSF& \ 4KFSM H 4N7IvN7*]I 4=N7?YN7=LY=SC %?Zs57X<p%"={MKWX "R2(2AKM& J?7KK"HD*E*NAMYV HK2:UXLVMo%#TX2XT` 6 TW2W26r3BR r=5; ItH<It0g"Y7O"R?'+QC9#'C)z@GE[ BF CE20 mE[ B*(#7<3e B/1 0O@K 4W<Bu+Q4j`.mY QvN9$\YQv"I9D=DG~<]C"G<]#+'8"8$A(JSsHJJ?u GI- ?4FWA#/6 2A#CA# CFWFW2==PmPZP3XcHwXcIKUg@lC(o@l#M1l'(%'(UWW<"<l#&4n#N1MN1V#N1&)!LB o;S!"("/.!""NCt D 7v=t>"6P> f2 _+6(#(#M%WD$I f6'-(#&(#he3O39$3MN .*lXnx;HAA;HK852<0Tt0W2Q'G;,&ElYQJ?QAEl5 CP9([-L f2@1-?? JWI f-4?$?4e?NO4?K+ NAF:I SSK5,'%:CR#>I=%?GK BV CW5F mNF5" B*.I B KYK4%#'-IJ@`-*!N&+5+2U +I1/)2/)GAM;;POS'@@PT;N" %4$P/bA/V /8Q XI (i88(iP+'"+4> &+!K!K, L0R@ f 2Q@N-@??? %=I f@;??Ne117>XYF2(g2S+?+(650,6;6AA#&F,LBm4nFp-%s"[ <43pN6LpO#4-G5'HV-CrB K(qFV-LJKh,B}WKh.=7Kh |N7#}?YK5,'[CR#HPQ=%YG B C2PQ> m> n B*.T B KDKPQ4%#.=-*' <F!K+ H7H HHj?S8X f25H SAe/2'GI fS'-(Xkhe5!gV:EP5 Ia3.6/S90(Z6/6/'C.(ZH?E)CSX K. 
WE#  E)E)8-/$ R1-/LN5-/'-|Q7Q7"T/PY%6?AU%'9q,-/$ KAR1-/N5-/'-Xj ;ejJjH LC,A @7U(o:!4'@l(oKR(o@l" M4C 0AE=*H3Rl0A1W) 8`N3Jp/Jp&i1W6R\1Wu1G"3[6nR\"3ELP,@P 4PP,@P AAP $KX f2S5s $(#'(# .I f $[(#(#[e:#D 5#K0#V/(#'-RU533KRVN4?,GAOHS@vD4AOSSIJAO-;441y-[}UVBHp1IG$OKI` 3G"V/&G'-'Y6'Y -'Y%,jPpDe*4 (G+R J<M-RA8vY#M\&@=Cn(M9>;})!jPX=9+ w \ G1VX.L)S,!]44>4GT+= R8vN= V/ 6Q LHRU/TA64gCx;/TKo1!.)T:LWo1Q1CLH:LO ] =G%!A=G=GE5#j4S2]:|Cj%-Ba2LG3L11 L%'LK1 -0n8 f$RH8868Yk!60488)J70A+0qIwwEdEd7 h5%E<;B;%!]O!]9;%!]!]/ $ D/ =vN5/ SIR:F6@:N5:NTR$%-/$ :R1-/N5-/'-%BA  e%8%)CG1) m7Y7:,N'-4ZM9E}?0[+]Kt9(?]@4H>4Hf4H'&n!LJ"&n&nt>PY ?6Mx)\HJ6|WV,0%.OGv%%O:&::7$ : g$U$V-@M4 U'82E'$E6,U% !$/E66-!4SeDN?O=8:B9Y] DS1Y K8GK:(@GY1C ONC 5:(5B:(@@/N14U@\'-KfL: w)G5OKQ ;OKAoOK &o9|'K'&oB=B&oBB* tT ELD%U\/N-WLUA D^^UP.UDTD^ HQ6$"^4/=WKkI0A+0qIw$/7Ow+/,pH>= "2;J6W:IUJH>H>50PSJP> f Y-KSJ(#(#M%WD$I fSJ;'-(#(#he/P QF5'-uP77@c/;P77TR  6lYA?Z+A8I08@ ^SC : Ny P7!y yD.CN!YV1LpN b+%g?%g)6_bE7/dE7?B6@JJ)S1?JJ228(>=X44$v441a3W8Y8@Q : ) W} EL'pK-mq AMZ % L#!]L 3W  %Y-3N?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5*S:(@@|N64U@\BLLhLkLL.(',`"`**G:YAKP9 YFNF5Y'-h!l !l!lKS0>(t6)g Q'6HxV%x HxFHx W $KX f2 >S5s $(#=(# 7sI f $/3(#(#/3eENNHNv+H %H?&" o!)@G? oXs4OY1s-K^4OSF5 6S$S 'F-"RSVAN-;.;.I-;.M4WEG0A G/T;WW;RVN4?,GAOHS@vD4AOSSIJAO-;441y-[}V a!,/ a!u a:3,M-1-?-$CTr*@ (:)r:)VzVK5,'[CR#HPQ=%&G B^ CBB2PQ> m> n B*. B KKPQ4%#T-*N?O=8B9Y] 6Y78G:(T<G,Lq6C OC ~:(5:(@@28(N64U@\BLL6"cT&!=$9< !OIO:l!(OO(# iM#7UA7#77N7(?H- $+HH#(-;68:N75 Ia3.6/S90S(Z6/6/'C.(ZH?Q<=$&RRO:S(# ]DrW9F;:)zYg'- BGE[?ITPYI)(  %4.O m @6<RR?6++!+S(#.OKS7<<3e?/LSNa6.5&K :O(#4 &.O4j 4S<>I4?V>M>F Z;:;S"K5,'[CR#H?=%YG B^ CG8? mT B*. B KK?4%#=-*%)ToEP. f ".WE(#(#M%WD$I fE;'-(#(#heE ,v=8#M?eO\TUO&O>Y$/U&Ul ~NV$-0 BNo3.H4lE2Z7MS(yE2024'<"<lUl$$:QKYN#*S; c.`K *9O3C[5`@@EAHn&~=K`TH-SQP )>QQ1+`Gx= sUCNQ;;M9mKZEGjQ/1?B8RCS7=3LG4Y< WK!*G4(G(GW*h"W?G4&U3K5,'[CR#HPQ=%YG B C2PQ> m>N B*. 
B KKPQ4%#.=-*f}:},&3'0q6GBGBq9Oq9!YNDNn0WWY<''*<^8 B) w :h ""$K=P'K8]3=V/& r@='-Rmh6PC &6^PN+Nn66PNN HpGHX;6WAo28434+N42=8=BB$=B (=B )7D88OA.q #Sq.q>B+QY >#2> :N75 Ia 36/S90S(Z6/ 6/(Z*@:Q/ 1AQ$&<=Y,2-}$&A!`R<#@A\%AI, 1<WNQa%lL%(GUEBS?U8R34Y iHQ0JiY C FFpG . G qU%2XGSxN SxACJJ4" H4/;BgOU"4/GU.[(!0(0/ UY"U0.!w/G0/11$<O>Y!]F*1k8 % %MZ %=;&& % %#TTTTT+Y!]O!] %!]SE*iR*i*iY*I%~D +R{2.EN0+C, EE8; . By%KN'---4$S2]A/V 28[3I (i8@8(i7S9"V@(5PXUPm:PU?O&p$6J? f 2=5s(#'(# 2D$I f;(#(#e,G7{7{7{7{1I ?!1I4)1I Y8 ;&Y/7?CP?X>%IU X(=HX%c!< 9x +F9K5,'CR#Hh=%YG B CG$ m B*. B KK4%#=-*;X!G!"Z&"! ;#;E)T#)TSD)T>NHv6 'Q'UHxOL HxFHx 1Vx VVoXC`$00S<iMi@ACiMEM+<1A V VT>NBBRDv&~=RQ?GhS2J HN7N7IvN7*]?j?jN7N7R?(# ]C9#JC)z'-BGE[ BI'T'KI) ,4 m @6<RR B*!(#K7<<3e B/LSHJk& IK,:(#4 &+Q4jP4-N"&&?0[+]Kt9(?]/!Q/!C:1G* 15%407 4>>49qP[A1]P7"7%UP778%T PT ##(U7T ;SJ;SD1SD1;SD1RGd!n4$PK?Y #L$/:\- 757@;:\N2Do>:\R M! nlDo KEN:C3W8Y8@Q : E() W} EL/0K0cE(A L#!]L 3WE( %Y-3!G'p0J C nH2VVn2L4H4O pOO4O-W >WNI?LHRU/TA64gJx;/TKo1/ )D<,JCWoCE1Q1CLHJO Q] !n$P$$::-M u uy#3535;vj;vEE;vEM"Z#K5,'K~CR#HJ=%YG B [ CG6A mMJ B*. B KK6A4%#=-*X3$ .?t.bPBPD%ZDPD!U !UQF'!U' ,'' L=L' L4C<A;HK882<0T0W2QJ;.KY@QJ?QA5 CP9<YY+RQ'K[D9J9RO'8"K[H4G4~O4;H(7GHl+O@] Qj?3W! !:  F:+4:$`B:Ku(t6)g6H%%x&?w& %&)8tK'B=BBB@:.M66qgO FNX'-'-?5/_- BSS&SSJ~kJ~J~U Y'E# E)8)R)F?)F%1RW-PU)3M:&Z*~* V@DR65A Dv?'*FR6--G22*By'59nR{) <%,%JQ37X?S%J!J2?6KJU:Y&x6?5RWcFF`"{7~3S"I6:"IG"{'EV"{Ss1eUFPH' "\ u"=0@'8'Q)E -#\%QF>!.Bc?F>]F>D80BcSY6 )%C\@1C\=PC\P9R HKS55VKTTA;HK82<0T90W2Q';.KF 3Y8QJ?QA 35 CP9$1NFu:BDrY]+rH=%OHBG' T-9O'K'@@N4U@\BLL%W8YW=0%~D +3E!eC, E$@E .CTT+ 7UC7U!sSi5tjj;1.BX:&Q*KYN#*S; c.<8K *W03/{<8EAHnK<8TH 1S?!n AP$? Ao V2 (+RQ'K[D9J9RO'8"K[H>G4~O4H(7Hl+O@] Qj6k H|/>E# <)M(( ?(.8: #. 
%.Ae!YQ+%g%g O4U,"z=$/*6=WQ+U098*J@j*A=* A**'Nj'O O'O1'1*Y %J( % % [1'& % %F":=.4Y iY 4-Y7O=6-NFu:BDrY]+rH=%OHBG'T-9O'K'@@N4U@\BLL/yeen:H>,eG=RE@@N=IJ=-;@41y-[}*B=U)1+%H+?KsA)1V3 'F L'V0V*FN@IZZZDa6hDa"DaI=0 -FX f2zGU Ae/I f '-eL$ LN5L-p'-'-" M4WE)=*H&2Z>0A 1W  k/ 1W61WT;"kWW;F9MH"39"V/&9-pUN11"-7I#Fz l?5O'1Q,C PV2Y zY^zz (533K:C:; :2=D@.+(t6)g6%xF HLI;Q*HTB6^OZ 1$yTBPTB1 c%29[P1'  Cj J8 Y/K -/$ =R1-/N5-/'-'-W%J  e%Ae%!s!s(G+R J<"h-R8vYh\@=Cn(9>;}3W8Y8@Q : E() W} ELCK0cE(AAL#!]"L y+3WE( %Y-!]3($PP3A2: Dr;4)zF#B'25TP7'3e'4/;BgO"/GU(!0(0/ UY"U0.!w/G0/11$*%# +%6NN 3$ +$, L * >$$*, Y)FE#04V0?-IEe4 4X=d4PUPH ?6)\H 6|WV,%NGv%<{q?OqYX.;C><9*47?E!Bx 7?BJ 7?8*4 ??O<, UM=< ]Dr;4)zFB@'TP7N''1 0O@O+Q.mC.CC8$; 'W $KX f2F<S5s $(#(# I f $/3(#(#/3e( @X f2GU Ae/I f !sXk!se 9Q7$Q7.T Y'%6 $ XS5s $(#/\'(# $(#(#>N@@4 -DK H(#?s! N*'4|P6QUY [HZV>%~D +0HZEW~HC, EE 3h313E5)' <CDG$  C l@F mN5S@'-'-%Od%OV+%OUV@x-iG3L*Q@LFW B>0#98B*Haj*A=* Z19&R ZWT# ZE5+;uB) w;YR1XrFT,KB1LJL;V0.,B}N7WB}P3L U7L |,N7#}7?YRW-PPV-5ODu)27SgRDu:G9<4 :: $67JX f2Q5s7(#'(# 2D$I f7(# r9(#eGHR34Y iHQJiY Kv FFpG .3; G";""6#5 R| C Xu=:V68$O?Wj1M1g>6u9YUl ~NV4 4-zB' 4.F 42=<Z" 4N(yNQ2=*`2=TQ9BzUl 41:@A D//"/?*??/?R\22*8%,'59nR{$)O( )%,J37) O(GSGJtJ!=dEJ3Q>QQ1#?)>J B&P>'YFIA$f6'Y -'Y,j1Yz?XTIG/ 2O60GWC2|t0 V0P:A62|>>4)5V?\CXC O$1LGq4N5! 1 INOY 1"tQ&!)OCXCX/;A^:9A^7M!A^C$.7m#EC$F_F#`$$FF$GR80 -FX fGU Ae/I f '-eY!NWNY) %{ZIzFF%sW93p9A;HK882<0T]0W2Q);W]"pY"pFQJ?Q$A]5 CP9J%W oAoOKXU2DX X1'W $KX f2S5s $(#=(# I f $/3(#(#/3e'C CF ,>4: $X f2S5s $(#=(# I f $'-(#(#evG6V?Q1@NNA HLVN(#b )B qP#B(k2+>)K1HD (3 (4W68(#24W [LS?>K H$8(#(#?s! N*'4| (GX:BNK 7U:CY]<7]HYVp7UGHJN!C}SGC ONHJHJ@@NG4U@\'-BLL4BX;NJ-4 >*)J#D0# JTF#'A.\D']H,>z5555$8 PHC,t?$SI-YG\2Lz2EJy42LzLzDABG\4 P PFwFF:3,XDV+bNHM(>Mx 'G%AOHS&RMFNz%AOSS(%AOGI+'Nz4Q-=Tv4j)nM[&AfR?'(#SC9#%tC)z'- GE[ BI C.2I) Y4 m @6<RR B*!(#K7<<3e B/LS/nX@KY:(#4 &4jP451PUPH ?6)\H6|WV,%Gv%Y6?`/46:Rb"{=vFXW5P/\-r/yeen:H>,eG=RE@@=IJ=-;@41y-[}B-@B-IB-* 7U:CJ)|6-Y7UGK"KV!V! 
FN5"UIJ"-; 41y-['-}F$Bq7m@BqF_F#`BqFF!u,,JR?'&C(#SCW9#GM)zX'-GE[ BI l C6I)F3 G'4 m @6<RR B*!K`&C(#.O&C7<<3e B/LSNa6.5&KG:(#(#4 &.O4jR4S<K5::!f)5568@I1wO(UD@IOKOm@IOO1&>X?q(}>>lEE $KX f2S5s $(#@(# I f $/3(#(#/3eGy;X,X18TFS6m&<1LJ kLV0!_W kL UN7L@H9S],!_X,N7 1X,CuF^XDV2-CN H((!Mx'-MGK@AOHS&&GF3MF-q?N4MAOSS2-.OK2-(m%AOG"M2-q:4KZEG.O4j 4M[$VA\,<A\2,,)X3> T Q.$9)`QOIO:lQ(OO(UCUPH: F6-DHGKP6|4 FN5P}IJP-; 41y-['-}O??K5,'[CR#HPQ=%YG BA C2PQ> m>C B*. B K%KPQ4%#=-*&DZE$;Y3H HLVN5(k )S(%(k7>)M9U8bKJH H8*WI# * (D'J,7WX>5><SWX(#(#M7D$WX(#(#hO;&/.Y.A#/&{Y>&{(#(#M&{(#(#PD 8?6Mx)\P/w.WV,0%.OGv%%O3!;9x7,%?b@\? BFN:1I@\!d yB, A|:#?#L!dW-!dA7:/W7K;6#@cA,/9 A@RXq R6Q3@5L @O#-G5=B9& F(H>f8R8G%:(GA}MF8O%:(5(%:(GSGR8N84Q-=4j)nM[(l>%"7i $@J Q"?BU:Y6?5RWc}FF`"{Rb+$6:&x"{'"{U$FPH "\AIAIO0U:Y6?5RWc}FF`"{Rb+$6:"{'&x"{U$FPH "\4t(.H>N@ @C6WFW%-/$ R1-/N5-/'-N, G6  - 8CSQK&:EP35 Ia36/S90(Z6/6/(ZPP7"7%UP76Re"G""J'JZI A#>6 2A#CA#2=;^SC Ny P7!y yD.CN!YGNV$-4 ZE ,vN2L&?y*8*T**N::{*NDoT6Do;*NDoDo|<592[/I W &[>]GPA(W 6>]Dz?>]:|W +HQB!8,a)J$ N>OOGR$,[l96 l CN W C$];. C!OR K95[#|IWyDIJ"Z7 fU78"ghA;HK8(# <0TQ q90,22EQ;"HD (wYE(4W68QJ(#4W Q[LS?>KAw(#?s! N*'4|Tk?L+U M@ !(w;&M1>aY/U;%&;$M$}R wY/%%($N$3c$1E7/dPE7B:~A5 )S1A5>ANgF,9@>NS5R^& C>48!h#.)CZ9#CZWD=dj?Go8X f o53 GoAe/2'GI fGo;'-(7Xk2 heXK1\H9KY,A @(o4'0@l.O(oK(o@lO! .:'('(7zUV-;-6Q- ;ITT 042EX 07: 0N5=U)127SgR)1:34 ::6XD: N*H(>Mx '-P-GK@AOVQ.DGMFQ4MAOSSK(m%AOG8L@oQ:4KZEGTv4j4M[:|<59/I![ >]PU6 >]Dz>]$;Y3|H+V%NFu:BDrY]+rH=%OH#BG'5T-9ON'K'@@N4U@\BLLDe? Y8 Y/R3 G8 iHQ0JiTB GTFFpSG . !!0GPP{{T{'JZ*j>8rVK78rM8rRO<L(#+DrW9F% :)zY#'-BGE[?IXTUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S<KMQ!R9XyC=%5G$$i"#KVF m$Q.Xg$ KKVF4%#=-*A.VJIv 4!oE!o!o@!lSUK?!(U5R1}F/!(`%$E&M:$6:%$ '%$U$FPH "\-%&;FW/9D: LkC4Lk<LkTh"  Z&;::|%G&a ;JSRZG&" "?'KG& -$Sm=8 <B9-/$ 8GR1-/N5-/'-'-$Co5|?Co@CoDM? 
1S?1(.'C.H?T ELD%U\/NLU D^^NUP.UDTD^ HQ6$:EP5 Ia3.6/S90(Z6/I6/'C.(ZH?9[[P [.nE7IC@8R06&yL7I0,PAP/b/ Xr3BRr=I/Vi//a 2XY4 N)22 E; '+; E[; ?'^8<?"";O?PUPH ?6)\H6|WV,%Gv%BR66V%x6MRf6,46?Qa3XcHwXcUg@l@l S+k+kM 9Q7$Ct 8Gq)Q7 T\ 6 M 6hF;"M*!"T":g#$~L+!U1AN4KUU/T/GKoP1 )'PWoP1O 1,g0.!w/GLHP1ODl 1; $7-@-I # 3P I<QR3|?uHQJi| Q?FFpG .? GTELD%uLG9LD)_C~31D(N$D31JT&r-NOO\NO 3NO3.YiRWP7)VXsL;Y1s-K^4OSF5 6S$S$;Y3H )7D88UM(N UMk!JMk4ND.CNYUT7iW#E7i6 67i6Bi0SR+S(F=,;R*&s> f2AJ*&(#>(#MI f*&-p/3(#(#/3eR{_3W  @QLKi) 9 ES ;%AMZS 4!]S  3W% %Y-3ARYRYM<9K5,'%CR#HO=%YG BQV CH`'F mF" B*. B KT:K'4%#IJ=-*AT;Cy2! P7!y y! 2@i>Q2r 'M> #.I R?'(#SC9#%tC)z'-GE[ BI& C.2I) Y4 m @6<RR B*!(#K7<<3e B/LS/nX@KY:(#4 &4jP451-@M4U'82.'D6*,V1=$/6&563-E1%/N=J!KG'@Ee:7:$ v$ 1?$k- %=@;(o??4 GJ("h-?4%/R.U\N%R=%Cn(9>;}JA0Y6JAJAY!NWN L 6 6 9-r.QH16/\/\6T G=R:BDr*k;=%WB1OT/Q= (W1DH;.1!ORQ=K95[#|I#@cA,/9 A@!:Xq R3@5Ld@O#-G5::Y(#4:(#S$ ESFN5Se D,G$9 1D@,OOIO:lGIGI,OAYQ *O:Y,<cY @M4Y,4Tl<<<<R$,[K~l964ul C [No4uWMJ C$];. C!OR4uK95[#|I7pS<7=|627PJ{LfOSfJ{%$qJ{H 3hOL\TUO&&@OW"OIW"AW"M\ +T6 * AQT 4\xW45$3C4\.*.T+T 4\..m:XJX f 2+<75s(#'(# 2D$I fC2;(#(#eB99!PP%<@I>RYI>I>;>RV%N4?,GAOHS@vD4RAOSSIJAO-;441y-[}-@M4U'82.H,'D6,V4H,$/6&63-H,E1%/N=J L QJ#D 5#40#V/(6#'-N 01NFFUNGE*AM)1/NPE?Mo( 5)SAe75s.,Xk4!d" L""Lp2C2(&20O8ssLHRU/TA64gCx;/TKo1B)T:LWo1Ql1CGuLH:LO ] 7TfLA=S%+@o@:<@:@: } W30PL>%3GWWe+W(mIJW-;41y-[}C/#Y6I%~D +E+C, EE G)Y>"tN1%)%lAa(Lp&XDUV@x'2 ( -iT*Q?)G9N:V$T*QG?$y*Q@4m89p #?LC$kX5qA;HK8(#<0T q90, 223BQV;F"HD (wYPA(4W68QJ(#4W Q[LS?>KAw(#?s! N*'4|ILRW-PYV*-8 ::F@:N5:4NT24(22?"Z$T*i$;$O 5&g5 Q5*,$=N7.:$4L*7UA7`*771@/*C<)C0 JO M6PCJO-%JOR1PS P O+U:Y6?5RWc}FF`"{ +$6:&x"{'W"{4hU$FPH "\R&{.@> f2&{(#>(#MI f&{/3(#(#/3eAWDc N P%K"O8EP353S9SLrH#R%U2AE2N52Yi'-hK6+*CRC=%(*G5u36;X.VF m5u<(.5u KKVF4%#=-*Y&XP*%#3+%6N*7>3 V/ LE * *N@I32D4o)&F)32KJ,KJ'32KJKJ4.~]=X;NJ-4 +'"+4+/!K:WIN)N)7N)N);A'<A.LA$6 $X f2<[S5s $(#"p'(# I f $(#(#eXd#8D: "5L!! *X * LA@0-Q f21-?? 
I f-?$?eG"2/+O%-/$ K@>R1-/N5-/'-$+\M@e\ (#\ $X f2<S5s $(#'(# <I f $(#(#eT`;7U1A/T)K@3K@AW)=+K6_ Qv'@26_Qv>\RM B>K?KSIO#L"B/;KaPG'75;K-Do;KRM! nl E*AM?GMoBWp)S1Wp426?Xi{?X & ,5=U27SgR)1$:34 ::>$$ YBR@?]  P K=8Q HLVN%(k )OFA(kb>)O&&KJ[HC< HO*WKJ#@' Nj'O O'O'jXU2DX oXNV$-4ZRX0RS*HnKN9Hn0{:T7Hn1+ X R?'&C(#SCW9#G-TL!TD$8/HW,t._9-I-:Lz5*+5G9-/#Lz LzS9-8^pMP9D=?6)\DT>WV,%NGv%``@_(:)rD)Y":)5H5@V5HW5HVW(a&(SXM!M!XM!M!V61$1H0M1&)v8?/-6)?RNU&3*4KA.@KKA5KA8*4??Oq&aC1qC,HLI;HA+)FM-&N@EEY>&3X*4KA(.@K KA5KA8*4??O4:Q"1X WvE/Q)EEP9D=?6)\DWV,%NGv%2 6I9nH{ <R{9D)i+G5-S9D*a=d9D1 Ae/I f-p!s!se)mkY [HZV>%~D +HZEHC, EE AG:: Dr;4)zFB@'JTP7'3e'1 0O@+Q.m)5V?\N5CO$1kGqN52<  2 YS@"tN1%) %lAa"LpBbR < ]Dr;4FB'TP7N.O''OB=/2&09)=22YiK=d=0m2 %2Ao2l8UOkUl 41:@A :`+:`4:`<L76L'tN't'tL't}Wv6lWW3K@W)*fWL+!U1AN4KUU/T/GKoP1)'PWoP1O 1,g0.!w/GLHP1ODl 1; $1$ $XVS5s $(#'(# $[(#(#[LXO*L(LR' 5GF+V, 5; 5'JZ*j>2LkLU? C4Lk<,2^LkTh""XU2DX A93TX.#. %.  ,f}:*>A}NE&F; ,,NE$NE,X,:-/$ KR1-/>N5-/'-.'E7/d):U91PE7??&%B8Z1 )S N?7?1sNXk%/=u% e%8%7;K -/$ 3R1-/LN5-/'-'-Mx '-8MGK%AOG&GMFT04%MAOSSK(m%AOG"M2T0:4KZEG.O4j4M[$VN#1)V#% M $P%I/ f2X/?? I f/??eQKYN#*S; c.`K *S 3C[5`@@8zAHnFK`TH (1s F:B:B 2 W!!0PPO<R ^T.F5JR.S:EP35 Ia36/S90(Z6/N6/(Z HLVN%(k )FA(kRP>)X#;tKJH H*W#J+H#wJF 0FFJF:*" #NNU&{!&{(#&{(#<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB7?, B5K"`*.h 5A :SBN5;'-VhWL#H:8 Ia6/ $0(Z6/6/(Z 8$1IR8 P9D=?6)\JD3WV,%Gv%mMZ <N'-'-=P+; E1c',k2%ANh/Xq&/V/IXcNhXI!$!n APS"6]G  $2/9:\99'559:\N2M!:\oJ?u GI- ?4FWA#/6 2NA#CA# CFWFW2==C*CFkC9x.<aY/U;%&;$M$}R wY/%%(L@"Y}F8k7U OD ,]y;L4P47:M56W @X f2;GU Ae/I f !s!se& IVR RR#!6(#U(.|U(U(+P" P5tH f '&G+ (#(#M%WD$I f ;'-(#(#heVN.Y%6?%GF%1 gPU)3M:&Z*~"*  >@DR65A Dv?'*FR6--GF$PEAN4.3#4N 4.Jp;4.T( q- qY` qTN" %T*E*AMYVPK21X91Mo%#TX2XT$K6Pn V36V/& r@6'-:hR$,[l96 l C N W C$];. 
C!OR K95[#|I!;!("!&sW )&s(`)CC(LU)Yl;4+DX;4JpJp8m";4533RvKX kE 9V|LLAPQ0J~K@3E!L=Ny@wK@2GW))IUNy=?=G22D8(3 0NySY6 )%%NN VELD%L4FG  1c6k4e a(S(SO+$ @KE3N5M N7eN;=U)1)138  L'1 P>=<1P>rTP- % % %HzYC,A @R8(o4'0@l(oK(oSGR8@lO(l6 'q'qW=qFX f2*5s=q(#Q '(# I f=qC2[(#(#[e6:I*"V?"Hb"; @LA HEP35 ) RP#2b3LrS9W 'LrKJLrH H *W# \ 4KFS H 4N7B}IvN7*] 4=N7N7=5656!M!56!R(G+R J<M-R&@8vY#M\C;@=Cn(M9>;}D N:+$AK*=H+$>N>5R %&;+$'-h:IA,X/rN5'-Lh<;5r<e65e0<ee \daX4DG$ U2-G N5G_4T.D)Y"T5H@V5HW5HV(.wm.w({4a4R@(yX=TR@5 " :%UU#-M4Y i'QW0HY= -L$/=0&=3--LE1%/N=JU HLVN5(k )S(FA(k z>)C9U8bKJH H8*W#"XP+0"N XP&AXP?tR^T.F5JR.SH StUmR?'(#SC9#Q{C)z'-GE[ BI CW'KI) ,4 mE[@6<RR B*!(#K7<<3e B/LSHJk& IK,:(#4 &+Q4jP4-( A>7oW//b/OUkM+lO=B2:IUOH>H>5CT.C9X[5:C4A4Y i0 FM90TY=6E "# 3"pY=6?=6A 35 CP9hC:&{0> fA&{(#=(#MI f&{'-(#(#e,,W,1~%fL+q>62>[6+q+q@J Q"E= ^YE='E=&u!$'&uYh&u@ E)CS6iKNE)K VVoS V;$&$&RUA;HK882<0T]0W2Q);W]"pY"pFQJ? 9Q$A]5 C?P9A?JMW1<((6<RR<O(# &jPUPH ?6)\H6|WV,%NGv%*E*AMYVK2X8aMo%#TX2 lXTWa O~K&ACXO~M!7M!XO~M!M!I|<59[/I 1n&[>]P-8Ui6>]Dz?>]:|Ui+HQ>8d 'Q'82It&pHx0g HxFHxK] EO&p NY#V++>+,9D(z(z &{0> f2&{(#@(#MI f&{'-(#(#eIO o o F"/16l,F/ 1B B 9lGhS54 H9lN7N7IvN7*]?j?j9lN7?YN7.8: #. % .^ K.\:*"^.\&HG"-@M U'82'$2&pE6E;XT$/E66K] EO&p-T NY/< ]Dr;4FB'TQP7F3N''O;W.HS:BBBBQg)HQgC.Qg;YQt(YYY1Jl!Q&!CX/;NU6ULL8J8@G 88 8$<88%o0*E*AMYVPK2:UX91Mo%#TX2XT, HLVN (#b ) E qP#B(k2+U>)AK1HD  (,3 (4W68 (#2 4W [LS?>K H(#(#?s! =+*'4|LL"B$mcQi- &{.@> f2*&{(#(#MI f&{/3(#(#/3e>;V 'Q';HxX!; HxFHx KN/!QR QC=%F1/!GWC-KVF m>. KKVF4%#=-*W $KX f2S5s $(#E3(# I f $/3(#(#/3e#H7mHIHFF_F#` ` `HFF!KIO o oF"6FOJ6Fy6F:/| N 01NFFN10"B /  N +; E-4-@M4 U'82'$&pE6E," &$/E66K] EO&p-& NY*NB:{*NDoDo;*NDoDo;YR1XrFTQ3KB1LJL9VXS7Q3WIvN7P3L U7L |Q3N7#}?Y)5V?\CXC O$1WOGq=N5! 1RV H',sY 1"t Q&!),sCXCX/;5 "'%#\ WU8J&.MoB+R R78v?=P EU%%>= =2D#B?."IF"I,IS;%%I Ae O"((Q/U:&{0> f&{(#(#MI f&{'-(#(#eQ8G 8 88T}T}NT}As1?F#STk?!(U M@F !(;&M1>aY/;%&;Y/V0cN'54vT= RAP :BDr0U9Ec90 B<%T, $Y'<1t?<A$5 CP9;YVl % ;!]O!] G; ,!]!] 
HLVN%(k )OFA(k>)O&&$=KJHY HO*W#?RWPVKj5A&KjJ"*/BP)#X  f2C]#8#GI f#eYXrFeY7y:YYJJQ9 e<Y{U*@ @ VJIv2( [HZV>9n0@  HHZ>.HJx;B^S>=d> f2X7(#@(#MI f-p/3(#(#/3e:KN'-AH=BN5'-h3<3<3<W'HV-CrB K(qV-LJKh,WKh.=7&Kh |N7#}?YeJf@@|@:%8 !F0NFu:BDrY]+rH=%OH#BG'PT-9C O'K'@@N4U@\BLL"6P> f2 _+6(#(#M%WD$I f6'-(#(#heKJR+:"F5L!!Q7QN]QTXQwMXQXQN@#A+VBGMMMD>P9``@NI?HJHM>W1EA/PB]%%R$,[K~l964ul CNo4uWMJ C$];. C!OR4uK95[#|IXP+0"N XP&XPBgCd..8HRA?)JO? s> f2 (#(#MI f -p/3(#(#/3e+&&/1.Y9J:5 '8.8A/ 4X8Y78l+ @] -: +%g?7D/]nB;/] dN/]p%"={MKWWX "?2CWOKM& JW?7""A4 ](# W30PL>%3GWWe+W(mIJMW-;41y-[}K82<2*;Z FRV%NWY?,GAO$HS$g*\WYFFAOSSIJAO-;*@WY41y-[};%5)ToEP>. f ".WE(#(#M%WD$I fE;'-(#(#he 1AQ<=?p&],H- 2@$&0 nAR3#@ nA\Au(rs0 13 Ao1SDP9)PSe2X3SeV/&SeW&{3>&{(#(#M&{/3(#(#/3{:A wT@:NB:5 S:4As(#h# 1$:G5GE*AM)1/NPE:U?Mo( 5)SAe75s.,Xk &>QXU2DX X4*,24+I4N1MN1V4N1N1>>k(Q~1m)mk# 9V?sLP/b/ X::8 Ia6/$0(Z6/6/(Ze D,G$91D@,OOIO:lGIGI,OAY OR? C9#'C)zB@GE[ BE[TE20 mE[ B*(#7<3e B/1 0O@K 4W<Bu+Q4j`.mUl ~NV4 4-zB' 4.F 42=<Z" 4N(yNQ2=*`2=TBzUl 41:@A 4TC|2N952B;AG#3 LwRj B)Z K4,{5Z 5=Z ?FK5,'[CR#HPQ=%YG BA C2PQ> m>C B*.( B KKPQ4%#.=-*sQWR HLVN (#b ) E qP#B(k2+>)AK1HD (,3 (4W68 (#2 4W [LS?>K H(#(#?s! =+*'4|@b8@b ;&(wYP;&;&#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGe*Q:#Q?^.$IBI!I"8(G+R J<5-R&@8vMJ$5V+\V+&@=Cn(59Y;}+= 4" M4C 0AE=*H3Rl0A1WX 8`N3Jp/JpO1W6R\I1WQ"3[6nR\"3EL:8 Ia6/$0(Z6/6/(Z C#"I: oP?S4A7XV o-i F"**C 8Ke #C?^.$A>JI'4'4'4'46TSE*iR*i*iC2#@cA,/9 LA@RXq R6Q3@5L@O#-G5H/GSxN MII-;,tSxzSA05o zKez 5:e"Y%6?,R&Q%%!]?!]9%!]7-@-9I S8-=JB yU;**VK7C(K7K7MDuDuM<#xA.q #.q+X?< ]Dr H:K(&^)z!c'-BMGK@=Tv>GNMFMj4.OM=.OK(m3e=G"M2MjO4KZEG.O4j4M[$VX>%Ip X(=HX%cLg**6"(.wm&.q.w4a5%GEG7IB$`:KB?LO'1F':K@0uE;YOR@z8I[@F7IY"cIS>Y V N1Tk0SY M@ S;1>aY/;%&;Y/RM B>K?KSIO#L"B/;KaPG'5;K-Do;KRM! 
nl +KFCWv+H7HHHR?'(#SC9#%tC)z'- GE[ BI C.2I) Y4 m @6<RR B*!(#K7<<3e B/LS/nX@KY:(#4 &4jP451C O62JHuK -/$ 6R1-/N5-/'-'-PB24,#Tk0SY M@ S;1>aY/U;%&;Y/%4@"I"IUL><JX f 2 5s<(#'(# 2D$I f<;(#(#e8 =#C2Cx>QAd Q'OHx7+7A HxF Hx 4 7IB$`:KB?L 1F':K@ruP- @OR@X_@z8F@FA&7I "cIS8>Y OMBDOMB&OM/W*+@/</ & K/ /+<K V VT>A@T HLVN(#b )B qP#&|(k2+5>)K1HD (3 (4W68(#4W [LS?>K H(#?s! N*'4|#Ev#=v4y#K $XS5s $(#B'(# $[(#(#[E MAB41 .Z44/ NV4-4ZC$<E$C$& =& C$& & (2AR3 A iHQ0Ji* 6FFpSG . !!0GPPN<3!33<U OD y;L4P4CIs0NO1NFFNF%1RWPU)3M>&Z@:V .">5ADv%,F>$ 4OHLIHA Do?\N5CN5 P45Y50"SJP> f X&G-KSJ(#(#M%WD$I fSJ;'-(#(#he8,V_OrcV_**1V_cAAXqT3 V/ 6Q6Q# )O&&$=KJ ,HY HO*WKJ#CJ% XKF4s&LMSO4FLM:/:gQh::N  R4lDRr8pC2 = *38p v1/BNR1/H11/(XHXXLHRU/TA64gJx;/TKo1;)U<,JCOWoCE1Q.1CQfLHJO Q] %$ FJ:J06??J?Jg)k-"9F9MH"39"V/&98IWRGd!n$PK?YO#L$/:\/6O54:\N2Do2:\1xRM! nl E1%-/$ R1-/ N5-/8'Q)Q!D8'M 9Q7$Ct 8Gq)8Q7 T\ 6 M 6%L'H 6,BI I6,=m8B %MZ %=;&&B % %'Y%6?/wc86:Rb$M$}R w8%%(OmUe6I'1@562iUe6 )B'@>B $PX&S5s $(#'(# $[(#(#[Nh,8RIXcNhXIKSS(Q,'HV-C*>KAV-NEF; ,NE$NE,R$,[l96;l C N:CWK~ C$];. C!ORCK95[#|IyBR 22*%,'59nR{$) )%,J.B37J%S8J!=d0J mN?6.:? KKVF4%#'-=-*"9P> f  C _9(#(#M%WD$I f9;'-(#H(#heT@GH&w"&w&wSE!e=JB=1K5,'8CR#H=%YG BA C.)r"p m"p# B*., B K#K)r4%#?=-*X**J L1Z1Z# 1Z(G+R J<5-RO8vJ$5V+\V+C;@=Cn.NG(59>;}*4 ,8*4?O8811M15TT0Mk-Mk#UG2.y4n+I#N1MN1V4#N1N1&UL&yL7INFu:BDrY]+rHYOHBG'BT-9O'K}'@@N4U@\BLLWDD] J"N"IF$(o??4 GJ("h-?4%/R.U\%R=%Cn(9>;}XDVTvNHM(>Mx '--GK%AOHS&5GMF4%MAOSSK(m%AOG8L@o:4KZEGTv4j4M[:B;G;0E;0;0S-%7Y:W5 Ia.6/$0S(Z6/U6/'C.(ZH?RI2?<?<T` 6 W26Q&$9&9)`QOIO:lQOO):V>L#^?Wa33KMDuDu<+(|5",|;|AAA @L,5)) 0 1=4G[M OG[B[#M 1G19G[G[M 11p%"={MKWX "D232OKM& J%?7""= Cj%$X  f e%AeI f%'-e @IzU6/5 7IzZIz>R 7j>|><<K $ . LN5 7'-'-L#?V,'HV-C*>KAFV-NEF; ,NE$NE,FT5J7<d9x LB~/*E*NAMYVK2X(Mo%#T=X2. 
XT` 6 TW2W26 ~4h W` vAwKR P> s2W>C5)2.3K.J2..AI<AI.q.q.N+"Q7{("O8" $KX f2S5s $(#'(# I f $[(#(#[e 8 ".N( !G".W,$eUsPq %-@MuU'82'$/!&pE6EC8FJ$/E66K] EO&p-FJ NYPT2F DB;QF4 Mc&K*4V4T;K*WW;04T7-S2]:|Cj%-Ba2L3L11 V/L%'LK1 -0nQPNYJJ8 PHUG\WxuOLzDABG\u P P ?4?4/>VN(k(k">)&L ?<??[)'[[M!F S:N75 Ia3.6/S90S(Z6/.P6/'C.(ZH?0 $ XS5s $(#(# $'-(#(#F*%# +%6NN4rO13$ YGy?? L * >$$*?? Yt;2"ttB//=%;1XrFT KB1LJL-VXSDWL U7YRL |N7#}?Y>;V Q';Hx!; HxFHx "Q("FK"HI KZC :1 L5wAu- CX@CC-/GVjN MI;SxG\zA05o NzKezDABG\  P P>R 7j>+|> 4,p 4N7IvN7*] 4N7E %I VVo!S<PC;1+< V VT[,)'[7[R$,[K~l96 l CY=NUA (W (8 C$];. C!OR K95[#|IY+3DI%~D +0.E$?-C, E*E8; . By=8B98 GC 9+/B@'3 ;|I&,8 ; %MZ %=; ;& % %&$cV)&$rD $riM" [:~X)'F[7[1O17>X Ul+)(.wmz:.DC.}B.wY4aDjN(yI*`=TUl1:@A :& gC5wAu&Z-)3"*CO>X@CC?'*--GY!NWN L 6 6V%= 7:=X3%KSIi;UD8NV6Q:R P> f .t,Q (#(#M%WD$I f ;'-(#(#heR.oQ$2$2#L C60MT00S3,BL%L(@L(L(?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"InW@%3&?G4&U"?"'M!M!G2JWN7O?9G$NNGC/ :F d dPUP ?6Mx)\H'_6|WV,%Gv%I+'%Tv T TT+8:RV^P> f2U:eV^(#(#M%WD$I fV^'-(#%(#he3cM N! NQ NOAP |?u0U9Ec90 |<?, $Y'<1t?<A$5 CP9N?O=88B9Y] 9Y78G:(G(sWLGOG6:(5:(@@NL4U@\BLL4VVU'VUUq497@!KN?mP4?K+ NPAF4{%Z NNQUl+GM#N_+*`G*`04+*`*`G1&/.Y'.A)T4ATX;V;!;%MH(hkFz; l(.8XZI[PIA WfI[)I[WKXZW6~&+N&R$,[K~l96 l CTNUA (W (MJ C$];. C!OR K95[#|IN1,<-^ItH<It10g"*`1K5,'%CR#H=%YG B6 CG=e) m$= B*. B KK)4%#=-*p%"={MKWX "7232AKM& RJ|?7& ""X>%I _ X(=HX%c+P1=:++N?O=%8:B9Y] DSRY 8G:((G2) 'FONF5O:(5:(@@N'4U@\'- #LO+<E7 *X #D D:A:*H L-69;C61 E9;'9;-p%"={MKWX "7232AKM& JJ|?7& ""Po&ICLHSRA<A6LH+8>;")6AEj7A}A Z&;::|%G&a|;-GSZG&" ";?'+KG& -'0n-0 $H 7UC,A @7U(oS}!4'@l(oK4(o@l:BDr;4FB'=rTP7'R'DI: oP;Q*74A oTB*F"OZ 1TBPTB1T W'XW0iS*1bM01b1b!j-RUnr0DDDDXDDDDH 7UC,A Y@7U(o:x!4'@lN(oK(o@l@L?M@@<-l<-)X#;tKJH H*W#$#$#16h=NLK c*6;NB*S)*NBQQ0OQO1(:h 3394=3<3R+'/E;+NR8"->C (WNX;.N!ORCK95[#|I(M*?B8RCS7=3LY<)WX1KMW*h"W?&U? U+< BSAp%"={MKWWX "$2CWAKM& JW?7""B|OPCB| B|,'HV-C*>KAV-NEF; ,NE$NE, $ X S5s $(#/\'(# 5 $(#(#).L .LYl.Lw.w(Gw7P HLVN(#b )B qP#&| (k2+D>)K1HD (3 (4W68(#4W [LS?>K H(#?s! 
N*'4|(t6)g6H%%x4gErU[ErCEr%I%)=K%> ;J<N95r;e65e0<;eeJ?u GI- ?4FWA#/6 2NA#CA# CFWFW2== ?7L| G>QAd 'Q'OHx%7A HxFHx '0'1$ $KX f2 S5s $(#"p'(# 6I f $[(#(#[eU:Y&x6?5RWcFF`"{43S"IF6:"I~"{',\"{R<UFPH' "\ M $P%I3W  @QLKi) 9 : ES 9;%AS 4!]S  3W% %Y-3?~>7?~B}?~H(X &~5sH(#'(# D$D$H;(#(#(X7-@1s-K^-SI 61S$S*%#3+%6N*Q27>3 P$;;Q2Q LQ & * ?*Q2N@I>Hvd 'Q'82U&pHxX QQQQ.(DVD5D?S?OyD??P ?6Mx)\P'.WV,0%GvI+'%OTvH LC,A @7U(o!4'@l(oK(o@l Jp VUM*6PR3 A iHQJi* 6FFpSG . !!0GPP>E*AMLMoWpXu)3,[l9lNcKQ-,f}:*>A}NE&F; ,NE$NE,5=U)127SgR6N)1:Q34 ::Po&<1eR?  Ss1e Tj:K:Ku?S2]:|Cj%-Ba2Lx3L11 V/L%'BLK1 -0n9Y"p&C bOL\TUO&& HO4UK,7H$/*H'U8 bHGW8T8 }HJ,H^=\:nH^>{IH^Y#D4V%~D +0E?-C, EE &/8Y.9AG##Y')DI%~D +H.E6|C, EE8; . By?8<?"?_(:)rD)Y":)5H@V5HW5HV 7 \ G1VV538}8}rDR?'&C(#SCW9#GM)zX'-GE[ BI l C6I)8S G'4 m @6<RR B*!&C(#.O&C7<<3e B/LSNa6.5&KG:(#O(#4 &.O4jR4S<RWPV&B9Gq9Oq94!fE7/dE7?B6@JJ)S1?XJJ3W  @QLKi) 9 ES R;%AS 4!].S  3W% %Y-3!JUsPqJF>1jY9U/TA/T)RV8N?,GAOGHS%V74GGAOSSIJAO-;441y-[}l< B>B/Ha@<5?;K<N/ND/;/ 30UV@xG3LL(<*>ANgF,9@>N#5.#>48!h 08OE4 CCSiPjBR@H:  P-c-F--XD7awN*H(>Mx '--GK@AOVQiD-GMFa4MAOSSK(m%AOG8L@oa:4KZEGTv4j4M[:Q<=$&R#3W8Y8@Q : E() W} ELCK0cE(AAL#!]*-L +3WE( %Y-!]3-F)ToEP> f2%:E(#(#M%WD$I fE'-(#-*(#he=.>E9D=DK<];GI-B!7O  fU78)A!F$SN@N$3c$1':BDoDo<M!Do 'MDMNM.DABG\5B#ho' 4GR8GAA-G N 5.#G8!h1%U)yAY$IJKPHo)yN5)yYi'-hCSK && v{L6*"N*>RM B>K?KSIO#L"B/;KaPG'75;K-Do;KRM! nl L0MV ONeV V H=)J_8PKF-)''=)''NWO-/$ R1-/FN5-/'-'-#:U4W68=4W(#?s! u7J22*8%,'59nR{$)O( 3%,J"37F2 O(GSGJtJ!=dJY'N1'%) %lAa"Lp(v;/ f2&{(#(#MI f&{/3(#(#/3eGL-G N 5GP]UKS{I^Eb{_%{Q3W8Y@Q : &j) W} ELK-mD]&jAL#!]L 3W&j %Y-3XU`%I X(=CX-9J^(zN#v9.3 uS J8BI JKG J<+7(=32c4o4&F)32KJ,KJ'32KJKJA;HK8(#PP<0TQ q90,2J2EQ;NHD (/YE(4W68QJ(#4W Q[LSJA/(#?s! PPN*' nHK u? uyP yWSyW"2-?;X!!V Y<A6LH b8,a)J)J/hFCV+NI'-'-3M(k(k+E>)10~0~;,(o??4 GJ("h-?4% 6/R.U\N%R=%Cn(9>;};W. 
0X}'  ;U3HA:ApQpp{b{_6{VUxM>J&&  WULsP/w ?6)\'iWV,0%GvI+'%OTv ) 8tK'B=B BB QJ7 Q0>QQOQ:EP5 Ia36/1S90(Z6/ 6/(ZE9GE6)EAP :BDr0U9Ec90 B<T, $"pY<1t?2<A$5 CP9/+IF/Pu/*.o099 1@/*C<)C0 JO M6PCJO-%JOR1PS P O+ 0QE9 J8 KI.7xU4:YS?'/K.C'd<'v 1S?E.JJK,K:B:DrRFC=%ZBGK?,TY*VF> mN?6.J? KKVF4%#'-=-*0PF3!E!Ga<(]Q". 2 ! 7K? 22!G".07KWW,M\7M\W2/dE7B$`B:Kuf66=Q6OF%1RWPU)3Ml&Z@PV . M5A2Dv%,FM$ 4O,=,LL ,LO62JN?O=8B9Y] 6Y78G:(T<G,Lq6C OC ~:(5:(@@28(N64U@\BLL9+&&/.Y9J:5 O'81.8>eA0 O418Y78lN!G+O@] QjT PT #T JE-J"pJ:U:M N5'-P]=#== %-/$ R1-/N5-//XUPH F>A.HGP%6|,NP}IJ-hP-;41y-[},,XK<<"<E"9+.7A#..%0?5K;.7P:7 ;5;0\PpK*4 %)=K%> % (h(hm 7m mFF_F#`m$FF$>)4 9Q7$O8)Gq#W$1Q7'TX$ >Y'N1'%) %lAa"Lp '"3bE 'Q '^I^#^Q?D}XP+0"NW XP&XP,4!d<0 L<><9qRM B>K?KSIO#L"B/;KaPG'75;K-Do)w;KRM! nl )FM-&N@EEY&3*4KA.@KKA5KA8*4??O8k<{q9Oq X>Y97?d! Qz7?BJB77?,Y+Q"`*.h A-J!>*9D=J)|>"DG"V.F"UIJ"-;.41y-[} HLVN(#b )B qP#% (k2+>)K1HD (3 (4W68(#4W [LS?>K H (#?s! N*'4|CA "Z&AhcL)FM HcF;7E3O$ LON5O(Ms-((XT+-:T)X1T::;:Z6D 5:6*0Yy-%YKDCQIRQFQQtE5N1NFNA;HK882<0T]0W2Q;W]"pY"p@QJ?QBA]5 CP9EQ)p$9)`QOCIO:lQ(OO(XBPUPH ?6)\HLx6|WV,%NGv0:% GwJGw6U>9}&/8Y.A9BlJBl3iBl\M0 N#&2WF,LBm3 nFp[%sK+ 3 6Q36QjpN6L pO1\#3 -LG5.7A#.P3W."U1A/T)1*W+A-=/%D} K-:(P=!=/5R&{.@> f2FO&{(#(#MI f&{/3(#-*(#/3e>N5z4@$5zJ5z<@I.M(@IOKOO@IOK5,'%CR#H=%YG B CK>~& m&9 B*. B K8pK4%#=-*N YX 7n9*S? L 14<*Zv 1S?4JJK,!!0SX>%IU X(=CX%c+KCS*zK &W& DO+#9#WP- % % % HLVN(k )P#FA(k>)X#S %KJH H *W#.A#. :.KYN#*S; c.<8K *k3/{<88zAHnK<8TH H.V9R HF.vQ+U*@::M:N U(|.5A.K" 5N55'-h$8/HW,t._NI-:Lz ^YN)J)JOVLz LzSN8^p 'K5,'CR#HD=%YG B/ C,IGNC mC S B*. 
B K8KGN4%#=-*">4/yeen:H>,eG= \RE@@=IJ=-;@41y-[}LW@@:%D>DD"H2hH3;;y!;n H 8 ).oUP#Ps$> 75s KJsH H5*W# 56 L-69;C61 E9;'9;-G"On YG"}G"W#-/$ 3R1-/>N5-/M J  f2X+J8I fJ++eR(;PR'4RP`@ JXKJ f 2KT{ 5s@ (#'(# 2D$I f@ C2;(#(#eE8'Q)9F:QJR!3}M/JRWAJRkGW>0)k.3K.Jk..=G=F?=E^9 7 ) 1(1((HC&p8AP :BDr0U9Ec9=% #B<T, $Y'<1t?<A$5 CP9Co15|?Co@@CoY6I%~D +E+C, EE ;0++"3=Q J;YR1XrFT,KB1LJLXV2g.,B}WB}L U7L |G?,N7#}?YWh $KX f2S5s $(#'(# I f $[(#(#[e0H 7UC,A @7U(oS}!4'@l(oKH(o@l< Dr;4FB'TP7N''O;#04    w't =),[l9lNI;W?PI;cAAXqT,<RM B>K?KSIO#L"B/;K aPG'5;K-Do;KRM! nl ZI&R ZW ZE598OYTRF%1RW-PU)3M:#&ZP* CV5A Dv?'*F--G6:'1@562i6" 1AQ$&<=Y,2-}$&ACR<#@%A\5HAI, 1<WNQa(U12`22UnX^UnR(Unv(GK $KX f2$S5s $(#'(# I f $[(#(#[eRLL*PT1XTV<V0{V"Q8,#iaO9+ #i#i:00/96SP1L<{q?OqYX.;C>9*47?!Bx 7?BJ 7?8*4 ??O(a&SXM!M!M!XM!M!T PT ##@T BR@?$C6(@ $CP$C:@&I0O2%gPD 8?6Mx)\P.WV,0%Gv%O4 1c1cVMGO-*VMVM1+ >n/>n">nN?O=88B9Y] 2Y78G:(5G]3>OJt:(5:(@@N>4U@\BLLCTCTCTgRSI=SU2SS-:$+KQ J0BYFHM$JFFN~ `JFFN %&;'dAd774d7Yh##5##,##TO~>ACO~M!M!XO~M!K !T:K V61$RdC=%&(1GD,KM1pRVF mD,U.D, KKVF4%#=-*)CNAb)() 'E :! #S.N EfLfK&s AfL# %L D.CN3W  %+@p:Y*" -Q"("O7{"!n AP$ NF! NQ NT LPT ##T F X:)9%   e%8%++<L&`KYN#*S; c.`K *9O3C[5`@@EAH n&~=K`THH-| 88I/L@ 'C C>IZ%#RQ$6FX f2Q5s6(#&'(# I f6C2[(#(#[eP{A. 0&32.J0&9 0&UCNNQP25n?6Mx)\ewWV,0%.OGv%%O  ; G@4 BR 3,;EPT2F DB;QF4 Mc&K*%4V4T;K*WW;U:QrQ5U:IJ.uIJP2U:IJIJY(YYY1I 16JHS@NNBN'tC"%X&\>bF}M56:.6UU H(X &~5sH(#'(# D$D$H;(#(#( Z&R ZW ZE5VNHSGW" IOIW"AW"K7L 4NFOL OL *X * LA@CP0W3!E!Ga,34 K@". 2W)IM? 2*f2!G".0WW,3cOj=, X/Ul+)(.wmz:.DC.}.wY4aDjN(yI*`=TUl1:@A O$ LON5O!>(v2BJ1(v(v1G7{,Oxl:WKkI:EN'-K:BDrRFC=%ZBG?TY*VF m?6.? 
KKVF4%#=-*O5= P`@ J4KJ f 2KT{ 5s@ (#'(# 2D$I f@ C2;(#(#e/T0d [1Vx SVoC`/-0<SM/&-*C* tME%@MR61-S P% O+Ul ~NV4-zB'C.F 420Z;]9(y4 2=*`2=T(/Ul1:@A $$9?$;,!Q$&<=$&R4955'C Cg!> mS!O<7; KG:={ <|%m  D}&8>QW[) 8)9[[PI &4vT?L>JX f2HT5s(#'(# 2D$I f(#(#e4Q-W 9Q7$Q7RT`8E!n AP$:GY A>M?d Q'3&pHxUZS9 HxFU*HxK] EO&p NY)ToOP> f2.=O(#(#M%WD$I fO'-(#(#he&R'V A&'*IS$8HW,t._CI-:LzO*+0o/Lz LzS8^p1D<%DO.CDe ~4hMF?Q$&<=$&R %J/EyAA&aC1qCcKQ.1XX:*"  1B=/>=d *YiK%-/$  R1-/N5-/'-#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6LpOHc#3 -G5_1<@}#X p1I?!1I4)1I5:RW\"**XI:QIT&I*r*E*NAMYVK2XMo%#TX2XT` 6 TW2W26U-/$ R1-/ N5-/B ?I+Gh77J7 HI+N7N7IvN7*]?j?jI+N7N7VV~Vg009p%"={MKWX "D232OKM& MJV%?7& ""pH>="H2Z121J1%IM(WY7Y77Y7$z: < *_SH-'SH(#M(#SH(#* 099T5 +t h%1%#cVX%%#0vQVQs@+FQR#QN7QN7%c-r.QH [-r/\/\S-r(#9 EU'+; ET[ WR3<.. =3<3<AzD$Gs(XRD$V5sGs(#'(# D$D$Gs(#(#J y8 y? y+&&/1.Y9J:5 '81.8?AV</ 4X8Y7/&8l+ @] 7QjPFy.HJH31#)31AyiL HLVN(k )/zFA(k7>)X#_?1KJH H?1*W#:-/$ R1-/C N5-/'-;YR1XrFT3KB1LJL$GVXS >3WL U7L |3N7#}?YTKBE9 Y8 #7' D:RY'D?/2&O9;C'd1' AE9;'9;-<'O_BO;O_kO_<%J+.[Q^7()Ms-((X@NK5,'[CR#H@=%YG B CGA@ m(#N B*. B KK@4%#=-*:.A5%N.N5.Yi'-h7IB$`:KB?L 1U :K@4u6`- @OR@X_@z8@F7I "cISY -k,l8PN7KK0A+0 q Y8 YS?/t v 1S? JK,Y6?0'"I6:Rb'$R$ z9S61 ;B/| CTX-$W C$] C,5):/|R$!p XHHN?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5)L:(@@N64U@\BLL)&{0> f&{(#>(#MI f&{'-(#(#e1816?R6 %16Nv?12v +R %OR8v'V+\ '1/NR1/H11/;YR1XrFT,KB1LJL;V0.,B}N7WB}P3L U7EAL |,N7#}7?YeLN(_'4=:"=X:?? ==:??-/$ R1-/N5-/'-K>V6h;'jW;%KY ;DhKP 4,HQ> nKPTH?~D>7?~B}?~9)F L@EEY&3 KA-.QJ@KKA5KABAARXq=]6:&::2? 7UC9nM3 f7UP:!GW=4<SP:N=dP: f2&{(#@(#MI f&{'-(#(#e q q;Y 340x'!;YIJPIJl;YIJIJ8BI4 + MNT<&NPN K5,'CR#HD=%YG B/ C,IGNC mC S B*.a B K8KGN4%#=-*?B8CS7=3L+Y<)WQ&KU&RW*h"W?U&&U?32c4o4&F)32KJ,KJ'32KJKJH?-C,A @7U(o!4'0@l(oK(o@lOTN,FE& V= CAiB6LW33LBQJ??4 GI- ?4A#T/6 2A#CA#2 ' ,'' =' O+C(FA % HLEP(#75 ) qP#&|32ES94K1HD (E(4W68(#4W [LS?>K H(#?s! 
N*'4|Ei* 7U:CJ)|>"7UGK"!V.FN"UIJ"-;.41y-['-}%-/$ =R1-/FN5-/'-A7JiJ''AFJ'WUAD2CiNKD2BBJD28d 6EV  1<E^EV928("8T2F =*,FW ?W?/:R\Kr"W?[6n"3EL1@/*C<)C0 JO M6P*CJO-%JOR1PS P O+*B=U)1+%H+?KsA)1V 83 'F LV0V*FN@I,'H1MC*>AV-NE'F; ,NE$.SNEH9S],,X,X,F^,!7/$)4K1HD (E(4W68(#4W [LSS-'ITH H(#?s! HN*'V HLVN(#b ) qP#&|(k2E>)4K1HD (E(4W68(#4W [LS?>K H(#?s! N*'4|5tXv5s5(#/\'(# D$D$5(#(#I? 82o0&{(#(#M&{/3(#(#/3FT&!WV}N*<<0Z+AO'$ 1cA1c'CMb C"p6FPP.IJ _M&{0> f2A&{(#(#MI f&{(#(#e8'Q)Q!%I X(=BX<U)l c cDY c7{;s V&A\/fLy/ff/fP.W2/dE7BRC' RR'M $KX f2.S5s $(#(# I f $/3(#(#/3e5B#hMo'K[DK[G:4AA;HK8(# <0TQ q90,2J2EQ;NHD (/YE(4W68QJ(#4W Q[LSJA/(#?s! PPN*' n'B''%U)yAIJKPHo)yN5)yYi'-h4B T 4vT C4 ]w$BgO"wU0*(!0(0/UY"U0/1R3 A iHQ0Ji* 6FFpSG . !!0GPP< +<M\)M\ f)):BDoDo<M!DoF@JU,F@F@.(F1 -02R\$R\!R\R\KW:B6K"*-K? 1x5 D5 "&""5 "2UoX->>LP"Vq?Oq 9*4! 3D8*4??O0U0}N7Bq7m=j@BqFF_F#` ` `BqFFX> E%I X(=B'+; EX< IYZC7YZX]YZ"*.R?'O(#SCW9#1M)z'-GE[ BIO C*DI)F3 G4 m Q@6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:(#4 &.O4jH 4S<P@'K[DK[=GKLR&&&{0> fY&{(#(#MI f&{'-(#(#e4%?`4 49HHhg:$ Y@:N5:F%1RWPU)3Ml&Z@PV . M5A2Dv%,GFM$ 4Oq,$V|vJ/" !#JJ+&&/.Y9J:5 O'81.8>eA0 O418Y78lG+O@] Qj6FOJ6Fy6F(GP:GW8 Ia6/4$0S(Z6/6/(Z :N:Y(#4:(#(# H :BDr ).oUP#=%P#BszT> 75&s KJsH H5*W#:EP5 Ia3.6/sS90(Z6/ 6/'C.(ZH?Kydd?RW303Ge9)4W-;): L< *_#UG2.y4n+I#N1MN1V#N1N1'@ .G!+qJ *XV&l=Bp=FF>F=FLcW Iw < E9+kFz; l(.8XZI[PIA WI[)I[WKXZW6~X>%IU X(=7X%c#=u7LA\,< mA\2,,e DJG$9W]QJOCOIO:lGIGIJOO:h}'^'EB'A;HK8(#<0TQ q90,22EQ;"HD (wYE(4W68QJ(#4W Q[LS?>KAw(#?s! N*'4|G$OK3 3GV/&G'-:KXA>N+$'-B-A -@B-I50B-;'D<;S$'& =& ;;'& & ,xDE8')9F:Q".JR !3}M/*fJRWAWJR!G".WW,T W'X+@/W/& KG/ /K V40EI <3,?#YIAS%BN5'-hO(I[%4 (I(IDl 5V 5; 58$A(JJR?'&C(#SCW9#G)zX'-GE[ BI l C<>I) .-'4 m @6<RR B*!K`&C(#.O&C7<<3e B/LSNa6.5&K.-:(#(#4 &.O4j R4S<4RVN>?,GAOMHS@vQ%AOSSIJAO-;.nE41y-[}( @X f2 GU Ae/I f !sXk!se: 8V\:M%*,f:N  / Yi# NOF>.$BpEC$F_F#`$F-/$ 5R1-/N5-/'->3C>>RV%NWY?,GAO$HS$g*\WYFFAOSSIJAAO-;WY41y-[IJ}V/PxPx<{q9Oq X>Y97?D! 
Qz7?BJB7?,Q"`*.h A&:7VVH)4 9Q7$O8)Gq#WQ7'TX$ >Y'N1'%) %lAa"Lp4@"I"I,D$ <$)D N5DH-oA4Y6I%~D +.EF+C, EE8; . By{-{T{ Z&;::|%/a|;GGSA1M/ V/;?'K/ -0n NFuI1/)2Y]+rHYOH/)G'KOM;-9ON'K'@@N4U@\BLL M//GHCC#C%i/1D[-[["%H#ab"HS*H=3&!'C C8C mS8<!=,Fx$mMMG9D=DG~<]C"-;<] Z;::|%Ba|;)SA1 A1 &?'K2/1 -0n;&4{YP;&.%;&H VM?<L.YJ? A?Y1.)E9 J8 KI.7x/v:YS?'/$x\C'd'v 1S?E\JJK,Xx Y?1U:zXx.7XxE}::8N%fE9 Y8 #7' D:RY'P/2&O9;C'd1' E9;'9;-VNNHS1&F4&MO@(%-@MAU'82'$&pE6E$){$$/E66K] EO&p-$ NY_bE8')9F:Q".JR'!3}M/JRWA\JR!G".WW,%TG GV/G1fY $KX f2S5s $(#(# I f $/3(#(#/3e<>HZF*2W2/dYfS:Uf- H E7??Bq()S??XT` 6 (W2 'FW2t)6Q8N}:XHH SVoS*K LN'-'-R*F>Od Q'OHx 7A HxF*vHx :-/$ 'R1-/N5-/'-LKxNSM9g/ $ D/ =vN5/ !W-e D*G$9L*OOIO:lGIGI*OO22*By'59nR{$) <%,.J37J?SJ!J8; .2?By9  . )JBn?B8CS7=3L+Y<)WKU&W*h"W?U&&U?K5,'[CR#HPQ=%YG B C2PQ> m>U B*. B K-KPQ4%#=-*$Bp7m9EC$F_F#`$$FF$.+o#@BAARXq O"((F`F`LF`Q+U@[O_;O_kO_*< ='?B8RCS7=3LY<)WKMJW*h"W?&U?a||?=$x >M?d 'Q'3&pHxS9 HxFHxK] EO&p NY2@9H0]2@A2@E5T .1y8n(V,7G5MAV+NX% <DG$ DGFN5DG'-'-  ?$>$3 )7/Q$ $KX f2TS5s $(#'(# I f $[(##(#[e: $X f2.S5s $(#'(# I f $(#I(#e UI*I}}p&VN(k(k>)PH2?H2?HK u? u1yP yWS1!Oy _ WFWF' 52J <L$ X L N5L z'-'-U*kWSWSF: J8 YI/qJ#Sl",D#@V -02R\$R\!R\R\RTHRT+/RTUM,,zlH]$`:KB:KuG7FXDVCN*H&:(>Mx '--GK@AOHS=GMF&4@MAOSSK(m%AOG8L@o&:4KZEGTv4j4M[: / <6$ XM6N56'-'-H?-C,A @7U(o!4'0@l(oK(o@lO 1AQ%$&<=Y,52-}$&A3.RW*5 #@ UA\`AI, 15WNQau/%5XD/%/%D}D}MN VT>21S.We6Qf1STB1S,Q"`*.h 0 $ XX^SC Ny P7!y yD.CN!Y0DsvG6VQp%"={MKWX "7232AKM& J?7""-i2iLi G /1'6l:8 Ia.o6/ $0(Z6/6/(Z::K""9P> f R _9(#(#M%WD$I f9;'-(#&H(#he"{RF!RF7FRFA/25xAW.Aq9Oq9!YNNQ<=$&R:R7[P> f 1UW7[(#(#M%WD$I f7[;'-(#%g(#heRG#.H D D D D  6l :pLB: )W))G5Wt7qET` 6 W26  QP>IPgIEI;? $X f217S5s $(#'(# I f $(#(#eM2( [HZV>9n0@  HHZ>@HJx;B^GS>=d>(0&{0> f2&{(#(#MI f&{'-(#)(#e@B3 4SU H 4N7IvN7*] 4N7N7VVMCA1-:cAAUXqB%6"(~C$r5Q1#$ *#C N5#;.:Ih!,"5QMz|Q7Q7T*?'*-GYO&7k/2j4LA6 ]LAfXiLA {4*,2 {4+I4N1MN1V { {4N1N1K5,'8CR#H=%YG BA C.)r"p m"p# B*. 
B KK)r4%#=-*,+;JH;G;1H4Uz21JQ41 6969V;"IF"I-*RZPUPm+P3Z.R ;.R.R#y';V0U#yY#yJ(N?XX$33X?X$ ( (HJRX$U~ yIqF8 y? yIQQ+2!rwN.w(G(G!'(?wUV@xG42?AJ"<l 77IB$`:KB?L 1F':K@0DuP- @OR@T@z8@F7I "cIS>Y J8BI JK5 JBTXW:#6}X?< ]Dr H:K()z!c'-BMGK@=T;GF3NMFX4.OM=.OK(m%=G"M2XO4KZEG.O4j4M[$V>ANgF,9@>NS5.#>48!h $ X.S5s $(#B'(# $(#(#RGd!n4$PK?Y #L$/:\P6 757%:\N2Do:\R M! nl NfQ (PP% Y4 t0' J?u GI ?4FWA#/6 2A#CA# CFWFW2==8P$8" /Fk<7{EQ9GE6EF*2N/dYfS:Uf- H ??Moq()S??XT` 6 (W2 'FW2t)6@OL $$)$9$ HhLz9C$<1Q$)$9Hh9MO\TUO&)OV <N-/$ THSR1-/N5-/'-'-KDw35"qMDw+KGDwTR)To'P> f |3'(#(#M%WD$I f';'-(#.-*o(#he"O`E- HLVN%(k )OFA(kb>)O&&KJHC< HO*W#r-R#;9E)J+s$8/HW,t._NI-:Lzo ^YN)J)J/#Lz -LzSN8^ pK5,'%CR#Hh=%YG BH CG$ m?' B*. B KK4%#=-*E=YE='E="?P/w ?6Mx)\'UKiWV,0%Gv%I+'%OTvH^&e:>:nH^>{>{H^< @Y@OJ@O4@O+P 'E :! #S.N EfLK&s AfL# %L D.CN3W  %+@p:Y$`:KB:KuW"S6 IOIW"AW""Z&$_4H}.D)Y"T05H @V5HW'5Hu(rs0V FV]= -IX5/&PBD#61AzWTELD%uLG9LD)_C~31D(L5D312i.uIJ:- R4@cQi-R9D3Q*/G;3W8Y@Q : Ki) W} ELRK-mVS%A8L#!]L 3W% %Y-3tZ>''y07&TAy'G'9y''8ETR 2U8K\22+.$`B:K;uP.>Od 'Q'OHxJ7A HxFHx %(P# 3$ 44. 4,pS H 4N7IvN7*] 4=N7N7=^C Ny P7!y yD.CN!Y7IB$`B$?L<Gk1Ym:KXZ@uQ?%OR@z@WKXZ7I%6~'BsD>%:A@:N5(:As(#h8F. ?# YC.eVeHe.ee 1@4T.Y?22T6,<#@6%6I, 1<WNQaCT.CFC)*N::{*NDoT6Do;*NDoDoK5,'[CR#H@=%YG B CGA@ m(#T B*. B KK@4%#=-*GE*AM)1/NPE:U?J>Mo( 5)S75s.,Xk4 AQNbpD)/>)6$*4*3^X&X&j$*:#h#h'2j4A'<A.AI|-G?IPA;HK8(# <0TQ q90,2J2EQ;"HD (wYE(4W68QJ(#4W Q[LSJAw(#?s! PPN*' n4%?`4 Kb49HHhg+H9KY,A @(o4'0@l.O(oK0(o@lO8 -/$ N\R1-/N5-/'-;}&{0> f2&{(#(#MI f&{'-(#(#e q *WWBR@?$C(@ $CP$C SD)Tp"="2JCJ>IJ256% ?4?4/UI ~)T$"/)TSD)TW2S*12S2S(_$ K(_LN5(_A\/f//ff/fP. #9O8}{;4F8}'rP7'')Y')DI%~D +H.EY6|C, E E8; . By:-/$ R1-/C N5-/'-?~L/G>7?~B}N7B}?~$B<'CN CKF mSKRB22V[+;I= = 3W8Y8@Q : E() W} EL/0K0cE(A L#!]V&L 3WE( %Y-!]3.'E7/d):U91P?E7??BY8Z1 )S ?7?1s.,XkO620J ^#) **A*32WL@2W2W*qG-G-YeG-G-9.L .LYlB.L8 = M#C2AG: G:IIN7G:IP2K%MDB;Q%M4Mc&K*4V4K* :JCX 4 t HLkC4Lk<LkThEF"/1rI1616 16K5,'%CR#H=%G B Ce>~& m&9 B*. 
B KK4%#/-* ?'7= GSxN W: 7CI-SxACq* > S8^p:-/$ ER1-/FN5-/'-SEu%dUUJiJ''AFJ'** 6"MN VT>2X1S.We6Qf1STB1S,Q"`*.h K5,'%CR#H1=%YG BA CG9 Z mN" B*. B KK Z4%#=-*(t6)g6%xF * RNUF Z&R ZW ZE&5 PP9P7R$ ,+=N5N7?@< ]Dr;4)zFB'TTP7N.O'3e'O<S*+< VT'E3;V;K!;$/'3+XTIO/ 21*rO0Y(GWC2|0 02|TLSo?.?.V -ZONeV V HHE9 Y8 #7' D:RY'D?/2&O9;C'd1' E9;'9;-')7Y "(F&PY ?6)\H66|WV,0%.OGv?c `!6%O.OtYRYRYR&Y DahDa"Da1Vx VVoX'z`$)a07S<iMC'|CiMEGM+<1'| V VTI!;&=~R`P;&;&33W8Y8@Q : E() W} EL/0K0cE(A L#!] lL 3WE( %Y-!]3<\060aQPvTOPvSjPvB&zXt/B5VW3LiK@".W),VO?'!G".OWW,3R!U7K5,'[CR#H?=%YG B^ CG8? mU B*. B KK?4%#=-*=4O4OF5=Bp9=FF>F=F$ 'Db0"4 k%%sO{O{#(5# 0#Fq(75#BZBZ:0}Bq7m=j@BqFF_F#` ` `BqFF=%gE!E%E:N75 Ia36/S90S(Z6/6/(Z1$z;4I1R3=S )WM"3=)WV6KCP @K(KIlE'L9ML< Dr;4)zFB'TP7N''O?(G+R J<5-RO8vJ$5V+\V+C;@=.Cn.NG(59=>;}Ul+)(.wmz:.DC.}B.w34aDj(yI*`=TUl1:@A A;HK882<0T]0MU2QJ;(W]"pY"pFQJ?QA]5 C.9J@P8*F8*18*J5 CK O7I6&yL7I,PUl ~NV4 4-zB'E.F 42yA%KFN'-T5J :N5 Ia36/S90S(Z6/6/(Z %K".P=L,5))K(P M9+B+B;J<95r;e65e0;eeTT50-M4Y i'QW0HY= -L$/=0&=3--LE1%/N=J"8X>%IU X(=BX%cT[>Y$W3K@!W);T9:-Q f221-?? /I f-??eW|+>IeDGW|YW|;mJv&cM[:a/;7;:@&I0JO2'C C&'3 ;|I5,8 ; %MZ %=;;U ;& % %&b;1X1LJ#V %W-L |%W_{K:BDrRFC=%Z#BG?>TY*VF m?6.? KKVF4%#=-*VV">Hvd Q'U&pHxL HxFHxK] EO&p NYE}E0J~K@3E!L=9@wK@28{W)95Y9?(#G22D809SY6 )%*X58*T*Q+U*@GB 9lGhS54 H9lN7N7IvN7*]?j?j9lN7SM?YN7l-K $KX f2S5s $(#'(# I f $[(#X(#[e ]w$Bgw**4 c4>>S)4YdKP9A;HK882<0T]0W2Q?;W]"pY"p!QJ?QA]5 CP9>TTT* 7UCJ)|>Y" 7UG")c!V."UIJ"-;.41y-[}Y(YYY1R=G,5Lww*6EEF&#e2J&y&0JJ<{q9Oq X>Y97?D! 
Q7?BJB7?,Q"`*.h Q Lp%"={MKWX "7232AKM& J?7& ""@z@F@-8Y%6?% RTLAw`~"{DO6:~"{F"{$M$}R wUO%FPF&%H*-(3Y+ E1(  fh::1G* Ia16/5%0(Z6/6/(Z:3AK*y>3C NC 533'-h+,QgO )H Qg.C.AQg..GF%1RW2PU)3MI&Z@(V08I5A Dv%,O8FI$ 4OW&{3>&{(#(#M&{/3(#(#/3F(@-QO8}{8}rA!S'*VI f'j -FX fGU Ae/I f '-XkeTo0M>5>#_90M(#(#M7D$0M'-(#(#hA;HK882<0T]0W2Q);W]"pY"pFQJ?%QA]5 C?P9 $X kS5s $(#(# 3s $/3(#(#/3;v vj;vEE;vEr3r=?'^8<?"2q?/I@: }V&D,9=+= R8vN=;,!&/8Y.A8%D1%D]%%DKsE'9;1-I`J:KN'->0A+ Q'0Hx=cq HxF+Hx N&{0> f&{(#@(#MI f&{'-(#(#e?*6P)DE8')9F:Q".JR.!3}M/*fJRWAJR!G".WW, b8,a)J)JK5,'[:CR#>'y=%?GK B C%'y> mN>5s B*.f B KK'y4%#'-.@`-*Q rMu2i5IJ.uIJP2IJIJ3 %>K} z(3 >.+w;K5,'[CR#H?=%YG B: CG8? m? B*. B KK?4%#=-*MY!NWCt 8)N T4 \ 6 M 6; WULs:1G*15%.(Z(Z?NO;9;e65e0;e,[l9lNI;I;&;YUYUYU wB R 19P9S9P ?6Mx)\HU6|WV,0%Gv%%O"|LN7"|"|9+BA4 ](#4U f$`:KB:KuGOR@$>*PY ?6)\H'6|WV,0%Gv$bI+'%OTvD-* vKh:&{0> f2&{(#=(#MI f&{'-(#(#e"cX'c7X>%I X(=X%c4$* 7UCJ)|>Y"7UG"!V.N"UIJ"-;.41y-[}2WL@2W2W:8888E+xI+ A/7o/:Uk;#6"(.wm&.q.w 4a5%GEG1II8?!1I4)1I;Q<=J.$&0#ZR"JL}#@!`Au(rs0L} +U'a$aKa LRJ,WH9S],X,F^ 9XXGU Ae/ o=vNV{  gAGg90Jq;m,H=4M[S4%~:S=:U_:%I D KG:={   D}&A#A.5AK.K" 5N55'-h5 5 h5 PF[%H1]P7"7%UOEP77W2NPFyu$.H;JH;GGQB52; A/7o{!/X</ eUkH/'M eB& eJM?7""=*:9B /1NVs,0NNUI#>W $KX f2#S5s $(#(# RI f $/3(#)(#/3ekFz l(.18LI[.PIA WI[)I[WM40AE0A C'7'7'-/|1-?? -?RGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\ iR M! nl Ma ~4h/M &{.@> f2,K&{(#@(#MI f&{/3(#(#/3e'CU CC 8A\A\)0 (WMAWGnW%&@ y7]8 y? yR<)N >1NF8|NV_OrcV_**@D1V_B*'F@OtA,F@F@.(FF&Mo BBPUPmP HY.V9R HF.QCSwK(G7: Dr;4)zFB'25TP7'3e' !//>c*FDK5,'[CR#HPQ=%YG BV C2PQ> m>T B*. B KBSKPQ4%#=-*#&2WF,LBm3 nFp[%sK+ 3 6Q36QjpN6LpO1\#3 -G56@RGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\ iR M! nl /; 'j?KX f2F:  Ae/2'GI f'-eCb3CbSCb2|DCs4OY/?1n4OF5:$9Ui6H?:|Ui+HQ-Od'13' ,82OM7A2$/*#&3-E1%/N=J(K4F(Q Q ,;(J"L=v;m&M[(WQ( GwJW//bGwM+lOBOC; f 20,;??? %=I f;;??Ne:RV^P> f2U:eV^(#(#M%WD$I fV^'-(#%(#heP03LB HLVN(#b )B qP#&|(k2+>)K1HD (3 (4W68(#4W [LS?>K H(#?s! 
N*'4|k#%q##UCNQ C6Y<W<LX!]<>ANgF,9@>N5 A93T>48!h>pQ@\@\#PBXDVO+bNHM(Mx '-MGK%AOHSQxSGF3MF84%MAOSSOKO(m%AOG"M2:4KZEG.O4j;4M[$VEV%'&S '&22'&A< Dr;4FB'TP7N''OA;HX8(#A<0T q90,*J23BQ"m6EHD (YPA(4W68QJ(#4W Q[LSJA(#?s! PPN*' n#qBBaGG.FX f25s.(#'(# I f.[(#(#[e= 5: Q!PT2F DB;QF4S  Mc&K*%4V4T;K*WW;F%1RW2PU)3M>&Z@:V .">5A Dv%,F>$ 4O'CX$ CK52C ( mS2C1T0d E '+; EB$ (o??4 GJ("h-?4%</R.U\%R=%Cn(9>;}#33$. >$$. Y( A>7oW//b/UkM+lO=B:IUOH>H>5 V Vo5 V4YF2(g2G+8g mSR KgEg<%D .%%1SD pP9#_Ev#=v4y#1Vx SVoC`/-0<SMW9&-*C*ME%MR1-S P O+FF6[%s9C5&Q2h <X$ LX,N5X '-'- 9IHN7\N7N7N7Q$&<=$&R1G. ?# YC.eVeHe.eeM\)M%M\ f))FdUl ~ b-0 BNo@.Vw.wlU2U4a/GRn(yU2=24'<"<lUlRn$$QX? %j1 f@+Y!MS@+@+F Y8 Y;/DTCP' T0>%"7i@J Q"Q9Bz?\N5CN5 P455 :1 L5wAu- CQX@CCI>>'0YX>0m.58F0mB0mF8K5,'[CR#HPQ=%YG B C2PQ> m>N B*. B K KPQ4%#=-*+N5+ 2U -+ )*)+g/)Ml,K5,'%CR#H=%YG BK CG=e) m9 B*. B KK)4%#=-*7J r\38c\\pD(G93@9V9KNSR34 iHQ0JiY FFpG . !!0GPP\H/\|?mPIlE'LP9MR++++PJCJ>IJ:P+"1( @X f2GU Ae/I f !s!se9 <@0M3@0V/&@0EU=$9,UOOIO:lU(OO(B!8,a)J$ &>A;H <0T+9 q90,Y3BQ3B  (=Y3BQJ(# Q[!!0A=%o(9PPJ*'UB-YK]C,P-8B6!3>]Dz?>]:|+HQ7 E 'F'HV-CV-:jMM~~5"?$6 $X f2S5s $(#'(# I f $(#9(#e"Z$T z*i$;$O1Vk-#-#^SC NyQ P7!>y yD.CN!Y=4O4OF5@ ." M4C 0AE=*H7Rl0A1W' J17/O1W6R\1W"7[6n"3ELRLL*PT1XT1cA,k2%A/.YXq&/V/=$xAP :BDr0U9Ec9=% #B<4T, $Y<1t?<A$5 CP9\U OD y;L44A\, KA\2,,(:)r:)V!`V+; ELHRU/TA64gJx;/TKo1)D<,JCWoC=1QK1CLHJO Q] U>KU#3#3U#31B$$(LU6 /+%6NN )1$ 3+$, L * >$$*, Y="%LbR T T+8L6mE8#3(37hQ L'7hT0d B$r3BR r=3ntOTDTMI_TDTDF34H 3534T.T@"F VFF9'=W/9==9H+HH <L$ X L N5L-p'-'-"(bm&.q.wlF4a5%GE4'<"<lG$$QV UPH H6|LR/4R/V.9Q,QQ:3bT M$HTM0, V&+K :g 1B$$H"'X-%   e%8%:$# /  N$ a||?O&ELD%L }9,'= &; %&; -t0 -FX f2=GU Ae/I f eG,UU -RYRL,RY;RYY.2Xu)?Kt9(?8-118-8-=GbzCN%KW3AT=K):M3N53Yi'-h+'8"@:@:@: }KN'-;}K5,'%CR#HO=%YG BQV CH`'F mF" B*. B K:K'4%#=-*GRD"E7/dPE7B:~A5 )S1A5<AD<.< O}6$6 X f2<>F5s (#"p'(# 2D$I f (#(#e#D0# JTF#4/;BgO"/GU (!0(0/ UY" 9U0.!w/G0/11$X19KY HJS$(9'-MGK@"NGF30MF,4.OM"U.OK(m%"G"M2,O4KZEG.O4j4M[$VHy|h5 @%3&Y7CuI%~D +e.EC, EE8; . 
By.zAK;0zN5z'-0<.Hlh1(q1V1+%dK@X++K(#MMm4M[2lW&{3>&{(#(#M&{/3(#(#/3L,LUU LLO@K*""W*Q*$mM .Ju3T;W; ;d$$E}::X>%I' X(=X%c# V  1<Q8LA56 ]LAff-LA) R)F?YT)QM+R R8v;R +f:Eu(E!8.;&4{YP;&.;&cFxDoE0&/.Y.WADI203DDG1)V@4 8A?J'#5R*U2p,#5;#5<{qz9Oq X5K>Y97?!*5Kff7?BJB; 7?,5K"`*B.h /MfS)0=nXAANWf:<2NNK(#MMm)C4M[*%#3+%6N*7>3 Y V/ L * *N@I(EM\+*?L++O+%>:'=8B98G/C OC-l,eG= RE@@N=IJ=-;@41y-[}M Z&;::|%G&a|; SOZG&" ";?'KLeG& -0nFD H87J>S87871","p mp'p 0JD2CiNKD2BBKD2T-?%-5(#-(#0p< ^ %E::0pS#0pUPH: F>A.HGP6|,FNP}IJP-;41y-['-}}RA.q #.q+- L6m*> `$YYJX /"P/"O)/"FWv+H7HHHX?TC967VCPCU# !UB=l=l,=ll.cE;8pH>="F2=H>D2CiNKD2BB6\D2(#2Wa (v;/ W1EW)1EK5,'[CR#HPQ=%YG B C2PQ> m>U B*. B KQi-KPQ4%#=-*LD 2 A#2PUP ?6Mx)\H'6|WV,%GvI+'%Tv 1AQ%$&<=Y,52-}$&A3.RW*5 #@ UA\AI, 15WNQaT PT ##T 5(GO&X 5QQ5QQO~K&ACXO~M!M!XO~M!M!SE!=JQg3 ;)H Qg.C.AQg..LH/6/q8Cx |4KoO"R%>^M):LWoIO"/QO"CLH:LO ] 5 5 h5 VJSiP@CCYnOO7AF2MX32V/&2-p VP Vo5 V4$6JC? f 2=5s(#'(# 2D$I f;(#(#e4X%GNN:,N*'-X> E%I X(=F'+; EXB$ t<X(T87 6ONKQn<X(5sQn(#/\'(# Qn![(#(#[+GTN7N7RM B>K?KSIO#L"B/;K aPG'5;K-Do;KRM! nl U  +( 5V, 5; 5AA2:C N'-UTP 67x7xGO8 8 @@NG4U@\BLL+s]&o9|'K'&oB=B&oBB<+Dr;4FB@'TxTP7N''1 0O@O+Q.mFF0|F4&6JB="%R$,[K~l96 l CTNUA (W (MJ C$];.T C!OAR K95[;.#|I1ObW;>I:P;Q* o TBLF"OZ 1TBPTB4m89p 15qLE%> 87PLELE9HHhg4H4O pOO4OTFO62U-W$/N(J%qD^^ Q. D7TD^ HQ6$0PP/>5 3 IEUP/(#(#M7D$+QP/4;'-(#(#(hKYN*S; c.%K *3/{DP#AH nRKPTHN?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5:(@@/N64U@\BLLP25n?6Mx)\:e6HWV,0%.OGv%?c `!6%O.Ot#%q##UCNQ Z;::|%a|;JSA1 ?'K -0nWH9!(#h0,#,Gb,CXX2PJD!f/f\<<B7l;>>>.87%S8787%DG}Q16%D%]%?%D%%fT5 > f.C.Af.. $X f2RS5s $(#&'(# ><I f $(#(#e #EUV@x?Sm7 -i*QGCG.iCY*QGeTB*Q#C?^.$WUJ5z;;Y=I%~D +3.E0eC, EMpE8; . 
ByCo15|?Co@@CoDT0d E '+; E[ UQ ^&/XM!7M!M!XCHRM!M!HR 6  - %;(bHub"HS*(bHH.R.HHPP%G?~>7?~B}N7B}?~=-(2V NNQUl%80&{Y>&{(#(#M&{'-(#(#Wg /1 "B8PY ?6Mx)\H6D6|WV,0%.OGv%?c `!6%O.Ot:%H>BA;HK82<0T0W2QS;.K)YQJ?QA)5 CP9${7"I$8&$rQW[) 8$r)<YB8RVN 6?,GAOSHS@vG$9AOSSIJAO-;941y-[}W2/dE7)bBW2!< 9xD}YB>H< 'Q'ItHx0g HxFHx BR@?4u  P>a3C>>:+:9x L8'Q)QY%!:An1M 9@$Ct 8Gq)$1Q7! "jT\ 6 M Q&! 6CXCX/;K7-Je: F:+:N?O=%8B9Y] @Y78G:(G]:S6ONO:(5:(@@N64U@\BLL^."(""P,@P P7K5,'%CR#H=%YG B CK>~& m&9 B*. B KK4%#=-*A;HK8(#<0T q90,2J23BQ"m;UpHD (BYPA(4W68QJ(#4W Q[LSJAB(#?s! PPN*' nJ?u GI- <?4FWA#/6 2NA#CA# CFWFW2==+O;*3'CN CKF mSKMK5,'[CR#H?=%YG B^ CG8? mC B*. B KK?4%#=-*TELD%uLGN9LD_C~31D(D316(#6F6NV4-4ZF V(yX2=TVK82<2;BB YP*(/sF- u"+8wG%lN5 "2/PUP ?6Mx)\H6|WV,%Gv%%UXPJ"N XP)M&)MMXP)M)MD9=8 <B981GC N6'-'-* 7UCJ)|>Y"7UG")c!V."UIJ"-;.41y-[}Y6?0'"IF6:Rb'-bIBBO/* TBC;1JORB:.A8!V=%N.N5.'-h7IB$`B$?L<Gk1Ym:KXZ@uQ?%OR@z@WKXZ7I%6~-/$ uR1-/LN5-/'-8 @~7Q*%# +%6NN4r&K3$ YGy?? L * >$$*?? Y5kAA!KG -lUnFsX^UnRUnU@Y?Q ,/W*+@0/</ & K/ /+<K V VTGE*AM)1/NPE?Mo( 5 )S75s.,Xk8J$%7mjEC$F"I_F#`Jn$FF:KNB'-BhDA>60BR W 0>098B2*< aj*A=2m*oW U=63B9& F(H>f'-8KGK%:(GAMGMFV4O%M:(5K(m%:(G:A93TNV:4KZEG4j4M[:s: $X f2S5s $(#'(# I f $(#(#eCx'UPUPH ?6)\H 6|WV,%Gv3%+>xS4HQ)>(#/l(#X3>(#(#N*S*3%}EE7{I^E!2b{_%{.Q-' wB)[vJ/W" !#JJODu)Du<N' ZP:;aS4OP+B'NuD'0YXK[A0m,wG58F0mB0mZ+AF88I4!d3 :'N1YdI3 3 i#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGeI*Q#Q?^.e$33ReD!!A!FF < $ /7 >N5 -p'-'-@E`f4H+&&/1.Y9J:5 '8.8A'~/ 4X8Y78l+ @] -AP :BDr0U9Ec9=% B<T, $Y'<1t?<A$5 CP9U/TA/TU|)C,I: oP;Q*7 oTB*F"OZ 1TBPTB1BQ#0(BDB;DDLMdC!n$P$/# t:-5?:\:-LN0Pt OS;S3o%FN'-?!%07=EQ+Y< 0 :9=AU& Br"8 ?U&&U?',0O%''2y'% R@-%RC# 3 A;HK882<0T0W2Q>u;.KY!QJ?QA5 CP9$K6Pn V36V/& r@6'-ht@#> >(t6)g Q'6Hx%x HxFHx Jr533K F%1RW2PU)3MI&Z1_:VXE8I5A xDv%,FI$ TuOS2]:|Cj%-Ba2L:3L11" L%'@LK1 -0n>,BAN6",.F +F 6(F vE/CQ)EE74"GKSI'SIN7bN7IN7K5,'[CR#HPQ=%YG B C2PQ> m>U B*.X B KQi-KPQ4%#.=-*U:Y6?5RWc.FF`"{S'.6:"{'"{U.FPH "\-/$ R1-/N5-/'-)F L@EEY&3 KANQJ@KKA5KA?  6?~>7?~B}N7B}<E")??~9$%7mEC$F"I_F#`$FFPT2F DB;QF4. 
Mc&K*4V4T;K*WW;LHRU/TA64gCx;/TKo1B)T:LWo1Q1CGuLH:LO ] VST@/MS4"C.*.T+.._V1D)G\)O{DABG\ P0K@3E!L=#\@wK@2xW)95$Bc?J~2/2D8X0BcSY6 )%*%#3+%6N*Ks7>3 /aVT7F L *: *FN@I0%$X  f& e%AeJI f%'-e;7 ;/;H3;;!;zB"22*8%,'59nR{$)O( )%,JP37) O(GSG1J!=dB4J>'0YX>0mn.58F0mB0mF8PNQWSWS*.$.7m#EC$F_F#`$$FF$< ,)h< < X:&Q*o (G+= J<H-RFW8v/~*\@ CFWFW(==M28G+3vJ8XTIO/ 2O0$FGWC2|0 u02|F" M4C 0AE=*H7Rl0A1W& J17/&i1W6R\1W"7[6n"3ELG<v"KVNHSXZTWKXZ6~[1,=,LL ,L ~  7{ Y6I%~D +E+C, EE ,'H1MC*>AV-NEF; ,,NE$NEH9S],,X,X,F^U@%3&!] rQ mSWF/5>JLW4>@ V <N6$ HSQWN5W'-'- WO 7 ".FAP |?u0U9Ec90 |<?, $Y<1t?<A$5 CP9Jd3333 ' :& gC5wAu-"*C>X@CC?'*--G,-3VNUHS:@PKH:(6Y3@V/& r@@'-Rmh:-/$ KR1-/N5-/'-&J&y&0(J*%#3+%6N*Ks7>3 /aVT7F L * u*FN@I-/$ O R1-/=N5-/ D  N 5U W"IR?'(#WCW9#; =)z'-GE[ BI CKYFI)F3 @94 m @6<RR B*!(#.OK7<<3e; B/LSNa6.5&K@9:(#4 &.OR4jS~Pb4S<= X %,%,e37OlA"(bm&.qB.wlL4a5%GE4'<"<lG$$QR 0E2~)R7N7Y\R77P' -p"={MKW/X "O2XM%KM& J9tM?7""<{qz9Oq X5K>Y97?!*5Kff7?BJBW-7?, 5K"`*B.h 0MT00$)>$$ YU1A/T)D <"1Vx SVoC`/0<SML<DPCME%MR*1PS P O+M7| {;-{X@{KYN*S; c.%K *3/{DP#AHnKPTHGH#Cz*r"/h$ N5Yi-p'-;}'-R?'O(#SC9#ANgF,9@>N5S9>48!hT8P 23!; K82<2 ;"pRr3*!#o!HB6eHB!HB( A>7oW/X/b/UkM+lO=B:IUOH>H>57" HLHRU/TA64gCx;/TKo1!.)T:LWo1Q1CLH:LO ] N?O=88B9Y] 9Y"8G:(L'GWLGOG6:(5:(@@NL4U@\0LEQ;;S*N b+%g?%g+_>1y-[K82<2(<;Bo@NUl ~NV4 4-zB'E.F 42W(Z;]<E(yQ2=*`2=TUlE1:@A G8W/9  \ G1VQJ|):g 18P /Y=?6Mx)\DWV,0%.OGv%O3 sIU=8B98G/C OC<"T=)H"5~"7TLVL ""TLL#6L.L(PLXP+; EEW%J  e%Ae%!s!s- $X9S5s $(#(#  $/3(#(#/3R&{.@> f2&{(#(#MI f&{/3(#(#/3e%T(h(hKYN*S; c.%K *3/{DPAHnKPTHu34 B=R =f==|<592[/I &[>]TP-8B6>]Dz?>]:|+HQB$K5,'8CR#H3=%YG B>u CGTF m! B*. B KK4%#=-*8R8RI&o'&o=B&oF*2W2/dYfS:UfY H QE7??B;d3"W8)S?=?XT` 6 W8W2 'FW2t)6X7-@1s-K^-SI 61S$SW 0>098B2*8aj*A=*o=Oc>IB=?<?>>=??#EUVY@x?Sm7 -i*QG.";*QGe*Q#?^.$ 45N'A 4@@ 4WHQ/ Q<Q16/)QW:#:3O~&SQACXO~M!M!XO~M!M!p.F'@<82=8B+^ BB;.I:2XCWFl-/$ R1-/N5-/'-L'-@OdJ@O@O@d+*%# +%6NN 3$ +$, L * >$$*, YA;HK8(#<0T q90,223BQ;F"HD (wYPA(4W68QJ(#4W Q[LS?>KAw(#?s! 
N*'4|(}%bXF Q T NNPwW3.w(G(Gwi RFX f2m5sR(#'(# I fR[(#(#[eY7Y77Y7~++++([-L f2G_1-?%g? E;I f-4??4eM-l'K5+"@< )h< < X:q?Oq 9*4'! 3D7?8*4??OU:Y&x6?5RWc.FF`"{dS'.6:~"{'"{U.FPH "\G0G5=.!w/G16h=NL *;NB*S)*NBMU\2'9N/ yr y>Q7 y@u4`'JZ> 2 22(UPH 9n0@  HH>76|Jx;B^GS>=dCa>0#98KSB*Haj*A=*/yee:n:H6-KeGK=GRE:FN5=IJ=-;:41y-['-}X$33X?X$ ( (0RX$9D=DG$ -G N5G z?Q??BL)>I)5V?\N5CO$1kPGqN5 TiPY"tN1%)P%lAa"Lp)&s(`)CCLH):59UUL5y .# .eVe ?.e8h \4!B8h+8h/y:BDr:H>,BG=TRE@@F=IJ=-;@41y-[}RW-PV-UPOTUO&O5kAAH{b{_6{D$9"-F 5$ J{D$NMKG#:34NV_V_*V_#@cA,/9 LA@!:Xq R3@5L@O#-G51A Fl-/$ "LR1-/N5-/'-'-:W5 Ia.6/$0S(Z6/6/'C.(ZH?R#@{RRARX:XXQ E{3C/E/]B;/] dN/]&!eene J8 YI/JJO}JJUCNFUOYT Qf6lP`><*X f25s><(#&'(# 2D$I f><C2(#(# ^eLH"<A6LH+8!o5X+ E!o!o@@ sV>R 7j>W>Tk0SY M@ S; 1>aY/;%&;Y/W&{.@> f2&{(#(#MI f&{/3(#)(#/3eY6?"IK V61$RdC=%&(61GD,KM1pRVF mD,U.D, KKVF4%#=-*  R4- Re94T.T@G#@!`G>Rz:(1Jf@@|,k2%@/:%&/V/:1G*1'5%.(Z6/(ZW2/dE7QOB=W2;h!kME(@EP353.S9lVRO< (#+DrW9F6U:)z!H'-BGE[?IQTSPYI) dN ,4.O m @6<RR?69!+ (#.O 7<<3eVE?/LSNa6.5&K O(#(#4 &.O4jR4S<8*)d8*18*O!!0PP'H& 1AQ%$&<=Y,52-}$&A3.RW*5 #@ UA\AI,M 15WNQa&#e2J&yU&0JJ:-/$ KR1-/C N5-/'-QPNYJK82<J2;Rr"pY7ORr O>W%H  f2H e%Ae \I f%!s!seE8'Q)9F:QJR"!3}M/JRWAKJR=6DaDa"Da"9AI0I$kIP6u>>>.0FA0mF!l !l!lKS;1XrFT3KB1LJL$GVXS >3WL U7L |3N7#}?YKN'-R<)CT.C.FC / Co22f3C23(&3I23353$H'!i'!u'J+E-J"p"pEAJV5 . .K<XGA5s(#'(# [(#(#[0QO!O!*T@: b<@:@: }HJ,<C2 %a C< 8$yR$,[l964ul CNo4uW C$];. 
C!OR4uK95[#|IR$,[!p96;l CN24SW C$] CR4SO~K&ACXO~M!M!XO~M!M!S=x&SQQSQU:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{B<UFPH' "\7DFk n9$/R>">">"Q?PY ?6Mx)\HJ6|WV,0%.OGv%%OLCn0WWV6h;'jW;%KY ;DhKP 4,H nKPTHm'O%''+s]WDc NN%KND.CNY1EF :& gC5wAu&Z-)3"*C>>X@CC?'*--GN7(q+#F W 7""OA 22"1 L UN/R\kND/;/D<Qv+'"+4+!K$8/HW,t._NI-:Lzo ^YN)J)J/#Lz /LzS^N8^ p,#,GbL ,L:&{0> f2&{(#=(#MI f&{'-(#(#e0&{0> f2&{(#(#MI f&{'-(#(#e%-/$ KR1-/N5-/'-# iM#73^7#77N?O=8B9Y] HY78G:(CG]GO:(5:(@@NG4U@\BLL$8/HW,t._NI-Lz5jYN)J)JOVLz LzSN8^7pXDV+bNHM(>Mx '-MGK%AOHS&GMFKw4%MAOSSK(m%AOG"M2Kw:4KZEG.O4j4M[$VUl ~NV4-zB'C.F 420Z;]9(y4 2=*`9u2=T(/Ul1:@A K5,'%CR#HO=%YG BQV CH`'F mF" B*. B KK'4%#=-**<M* 7UCJ)|>"7UG"!V."UIJ"-;.41y-[}2SH*12S 2S1$ <X+|5s (#'(# [(#(#[1/NR1/H11/(7I#Fz l?5O'1Q,CPV2Y >0A+ 'Q'0Hxq HxFHx :Yb:M!GM!:M!HT ELD%U\/NLU D^L^UP.UDTD^ HQ6$WAKRN5Yi'-=;}-nhFP5MX3P5V/&P5-pB=/9D=d *YiU0|F9(9 U79N7 VnLA56 ]LAffILAI>RYI>I>;>#fM+B?B8CS7=3L+Y<)WKU&W*h"RW?U&&U?R?'(#WC9#Q{C)z'-GE[ BI CWI) 4 mE[@6<RR B*!(#K7<<3e B/LSHJk& IK:(#4 &+Q4jP4-:VRA^3VRN5VR'-5dh!KMN5'-MP$-:J3rO}JJUCNNQ,'=)5V?\S@N5CO$1kPGqN5 TiPYP"tN1%)P%lAa"Lp /Fk-@M45U'82.N'D60>,5(4EN$/@6&63Q-NE1%&/N=JLww*OSR3 G8 iHQJiB 4TFFpSG . !!0GPP,54)OKFM!5 ADoDo9[): DoDo/vJ #JJ B>Ba.w"IFI{)MSxSx61A-/$ ,TR1-/E3N5-/ 3j MV7. +P8D}J'JZI A#>6 2A#CA#2rXUr=-r.QH [-r/\/\UV-r*,32V_cV_*V_NK[K[ |G&FEQM O+Z0PI>5>0hII(#(#M7D$I'-(#(#hb">nH9KY,A @ (o 4'0@l.O(oK%(o@lO0 $X f2O\S5s $(#(# I f $'-(#(#eVNGIHS4AOGPaFKRGd!n$PK?YO#L$/:\?6O5:\N2Do:\RM! nl PP%GXG:=7j>>*GLN'-1 $4215J5Q4155 e D,G$91D@,OOIO:lGIGI,O O?!n$P$L7 VA| V VM 3$4 t  L QJ#DKARO: (# ]DrW9F6U:)z!H'-BGE[?ITSPYI);  ,4.O m @6<RR?6++!+ (#.O 7<<3e?/LSNa6.5&K :(#O(#4 &.O4jR4S<v??2J?2?2,j#)7%q##UCNQNQy+aD-NyDAQ.9)`QOIO:lQO \,/)FO&&KJH HO*W#<'FG$9X?FOOIO:lGIGIFOO:(G+R J<"h-R 8vYh\@=Cn(9>;}@&h6;X-0#2#X-X-9O $ Q;(#*'2lV0K@3E!L=#\@wK@2'W)95$Bc?2J~2D80BcSY6 )%Jp/./J/ 5-/$ R1-/N5-/'-'-D6<%DO.D ^I^#^QX? 
&>Q y0MnXlu(rs0 E5(3 &~=  \ G14GV0K@3E!L=#\@wK@2'W)95$Bc?J~22D80BcSY6 )%7n`@53 $KX f2_S5s $(#>{'(# I f $[(#(#[eVK C@N(#ODu)Dur<7iE7i 7iUN YX 7n:9*S? L 14*Zv 1S?4JJK,.'E7/d):U91P?E7??BY8Z1 )S ?7?-1s.,XkG@4 BR -W$.7mL<EC$F"I_F#`C$$FF$> Y8MM1p*+R R8v;R \  W"I-M0A+'QW'H0=q-L$/=0&=3--LE1%/N=J5y &07/&TAy'G'9*y''X2`25n H4(&^Mx-'-eMGK@POG0MF4.OMP}.OK(m%!APG"M2O4KZEG.O4j4M[$V*XV&lE 3Bb1 Z;:;S"N)fE}NS/NSxNS-/$ R1-/N5-/( 7IB$`:KB?LO'1F':K@uE;YOR@z8@F7IY"cIS>Y <+Dr;4)zF#B@''TP7N'3e'1 0O@O+Q.mK5,'[CR#HPQ=%&G B^ CBB2PQ> m>U B*. B KKPQ4%#T-*LU9ECCUPH HG6|LR/4PGR/# k%IGU7GEV%72+&&/1.Y9J:5 '81.8fAV</ 4B8Y78l+ @] Qj f2&{(#(#MI f&{'-(#)(#e* /1HOR:HH< S+k+k8GTf%4-/$ \R1-/N5-/'-A;HK82<0T90W2Q';.KF 3Y8QJ?QN5A 35 CP9H9KY,A @(o4'0@l.O(oK(o@lOeenep!#EUVY@x?Sm7 -i*Q8G.";*QGe*Q#?^.$R:BDr*k;WB1T/Q= (W1DH;.1!ORQ=K95[#|I$8HW,t._CI-:Lz%*+0o/Lz LzS08^pJH7 D;D;o5 D?S?Oy D??CB32.h313E5)O22*8%,'59nR{$)O( )%,J37) O(GSGJtJ!=dYkJMx H'G@AO@6|.MF@AOSS(%AOGI+'4Q-=Tv4j)nM[DBB@}M!0r9#&F,LBm9nFp%s"[YE32WpN6LQ;pO5 #-G5NN*S*Wq3@"" NG@M 4SU H 4N7IvN7*] 4N7N7NSLNSxNSR ."&E +7G%<;6;7!]?!]97!]!]72'.NG9&U-@M45U'82.N'D60>,5(4EN$/@6&63Q-NE1%/N=J4IK`N*S*37IB$`:KB?LO'1F':K@uE;YOR@z8@F7IY"cIS>Y RO< (# ]DrW9F6U:)z!H'-BGE[?ITSPYI)^N ,4.O m @6<RR?6! (#.O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<?~X*,U!!!!W2/dE7)bBW24$P/bA/V /8 XI (i88(iIx4s&LM1OLKIxLMSa33KK?~>7?~B}N7?~N?O=8B9Y] 6Y78G:(V$G,Lq6C OC C:(5:(@@N64U@\BLLNm/nX@ /EeWr //'/*/B I+GhSJ7 HI+N7N7IvN7*]?j?jI+N7N7W+MR:BDr*k;WB1T/Q=W1DH;.1!ORQ=K95[#|I8 PH P+,02C2(&32Co15|?Co@@5KCoK>V6h;'jW;%KY @;DhKP 4,H nKPTHF%1RWPU)3Ml&Z@PV . 
M5A2Dv%,GFM$ 4O1!?B8E)CSBM#3~-XY< .WKH5vV.W*hNWE# ?VE)E)8 HLVN%(k )FA(k6>)X#;t$=KJH H*W#;+Q&_,QQ:3b 'E9t.'E7/d):U91P?E7??BY8Z1 )S ?7.A?1s7.,Xk'NuD'0Y#X:5K[A0m1G58F0mB0mZ+AF88I49J+E-J"p"pJ)E %(5 Q(Y>*!HX9# HLVN5(k )S(FA(k z>)C9U8bKJ!H.I H8*W&#W5:6W5%=#OW5".N(!G".W,(a&SXM!M!M!XM!M!!Fz6 FM( GwJW//bGw M+lOBOH>2Q %2Ao2l8 A1??):/|X$3X?X$ (X$5&)JEHH=L6|/h-/$ R1-/N5-/'-;}'-4$S]A/V 2M8J3I (i488<M(iPPXV_OrcV_**V_PY ?6Mx)\H)E6|,F30%Gv%%O:#D 5#0#V/(#'-:#*oIL*}K}}KMQ!R9XyC=%5G$i"#KVF m$Q.$ KKVF4%#=-*2? S/y:BDr:H>=%,BG=TRE@@F=IJ=-;@41y-[}  Z;::|%Ba|;SA1 A1 &?'K1 -0n:-/$ 46R1-/,N5-/'-/RK#E'JaItIt0g HLVN5(k )S(FA(k z>)C9U8bKJ/HI H8*W&#TPFyuLG9D,N.H_C~31D(D31XTIG/ 21O60(GWC2|t0 0P:A62|>>4"ZTR+ ;-WY!NWCtNY) B# V/ 6Q6Q9#  O}<d9x L Z;:;4S;;MmKZEGjN?O=8B9Y] VWY78G:(EG]%VWO>H:(5:(@@NVW4U@\BLL-/$ JR1-/"N5-/ B$ XF +F 6( F #D 5#K0#"V/(#'-%BA  e%Ae%"M2N?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5:(@@/N64U@\BLLMeQ06O. %I y8 y? yN-/$ AR1-/ N5-/-)oA;HK8(#<0T q90,223BQ;F"HD (wYPA(4W68QJ(#24W Q[LS?>KAw(#(#?s! N*'4|`2fJd E33F'+; E33D 45MYYYYFW CFWFW= ZP:;aS4P=PPTUc{{%V%A,JA%A$p*%,'5%,!E375K5,'%CR#HO=%YG BQV CH`'F mF" B*. B K:K'4%#IJ=-* 4S H 4N7IvN7*] 4N7N7( A/7oW/X/bM/RUkM+lOBOMSc7,%?b@\? BFN:1I@\!d yB, A|:#?#L!dW-B!dAL87:/W7-K;6E7/dPE7B:~A5 )S1A5)5ELD%L=8B98GN5OOC5OK@3K@<W)5<?'2D8<+4e F(S> SHF%1RW2PU)3MI&Z@YV08I5A xDv%,Y9FI$ 4OV/CV:IUH>5K5,'[CR#HPQ=%YG BLL C2PQ> m>s B*. B KKPQ4%#=-*R#?)&s(`)CC)08xJ8X f LuF_w JAe/2'GI fJ;'-2 heL[,:k8M!'llNHICIC:#D 5#0#V/(3j#'-&JX f2Y:5s&(#'(# 2D$I f&(#X(#eI0Q6Ix+5+P(' A"%BP(=8F=EP(==W8 PH PXDe5nHM(&^Mx '-eMGK%AO&JGMF4%M_SSK(m%AOG"M2:4KZEG.O4j4M[$VEdEdJQ,;';b4,E',0O%','0 -FX f2GU Ae/I f '-eC CCF6 PCS K*(E# (E)8 8"*t*t#"I: oP?S4A7XV o-i F"**C%g 8Ke #C?^.$8w)R[:~)'F[ @7[323'3N=8B9N8G3<`FOC<`>PPIIIK3o4!d":H L""N1LpK$%H*p?R(GB&+ X00%)>RGd!n4$PK?Y #L$/:\- 757@;:\N2Do":\ iR M! nlDo RiR?'&C(#SC49#GD)zX'-GE[ BI CEPI)V O['4 m@6<RR B*+9+++!+&C(#+ .O&C7<<3e B/LSNa6.5&KO[:(#(#4 &.O4jH R4S<%f9[[P0`:R$;Y3:RH&(RV%NWY.GAOHS*\WYFFRAOSSIJAO-;WY41y-[FPP%>Hvd 'Q'U&pHx 22BRERYR+L,RY;RYP)ToEP. 
f ".WE(#(#M%WD$I fE;'-(#(#heK 7IB$`B$?L<%i1@G:KXZ@u9g*SOR@zf@WKXZ7IS6~OA.q #S.q>+QY >#2> R;PRR+*C*6+99X9UNTJG/T/T)G,UU %a%DoGZ!1XM!M!M!XM!M!!F7IB$`:KB?LO'1F':K@0uE;YOR@z8@F 7IY"cIS>Y (?~D>7?~B}B})??~%?@\?@\BTG?!e!dGAI cAIT##XU5##,##T!.W3K@W)WP~D5 55= Yh===;mc?B8CS7=3L+Y<)WQ&KU&RW*h"W?U&&U?$ v$ 1?$K5,'[:CR#>T=%*G B^ CO%T> mN>5s B*. B KKT4%#'--* I9w9wCP3/@P PL16%),{6%5=6%FRV%NWY?,GAO$HS$g*\WYFFAOSSIJNAO-;bWY41y-[IJ} PP9eQ&$9)`QOIO:lQOOt!X7:OQ%Q4QQXDVOCN H(!Mx'-MGK@AOHS&GF3MF-q84MAOSSO.OKO(m%AOG"M2-q:4KZEG.O4j;4M[$V<"'0'=A8L@oGHCC#C 1AQ<=?p&],H- {2-$&0 nA" RSj#@ nA\Au(rs0 1 OoOo$P$P Oo$PRV%NWY?,GAOHS$g*\WYFFRAOSSIJG#AO-;6_WY41y-[IJ}Dn4.'E7/d):U91P?E7?? BY8Z1 )S N?7?1s.,XkHx.@ss&ssP-rH [-r/\-r?B8CS7=3LY<)WKMW*h"W?&U? e7 eF%1RWPU)3Ml&Z@PV . M5A2Dv%C,FM$ 4O'R32ACHQ0Ji* 6FFpSG . G1l'(%'(RUW,H 7UC,A Y@7U(o!4'@l(oK(o@l-@M4 U'82E'$E6,U% !$/E66-!YG9B/<X(d5s/(#'(# /[(#(#[*%#3+%6N*Q27>3 P$;;Q2Q LQ & * *Q2N@I-/$ R1-/V+N5-/P&E*N7XExP&N7JN7TP&N7N7%2(UPH 9n0@  HH>6|Jx;B^S>=d>21S.We6Q1STB1S,Q"`*.h h) *" a&SXM!M!XM!M!O J$>OO GHI1/)2,A Y@/)(oM;4'@l(oK(o@la&SXM!M!XM!M!EP353S9SU#  :I K5,'[CR#HPQ=%YG B C2PQ> m> n B*.O B KKPQ4%#.=-*? $X f23DS5s $(#>{'(# I f $(#(#e8O(I4 (I(I LU > Cw/W7Q}. DM}! JI JK JBn0WWT!5T!%U)yA7#IJKPHo)yN5)yYi'-h92r 'M>1z"",9C <1jY9:S >*%#$'HV-CrB K(qFV-LJKh,WKh.=7Kh |N7#}?Y0J~K@3E!L=Ny@wK@2GW))IUNy=?=G22D83 0NySY6 )%I+gE  ,vx:`<@) @#MM8 8$<@@88?>6M6M$6M6M$(6EH[[JPY=I%~D +3.EeC, EE8; . By JV!BI JBKBG JBB<+s-k,l8!{>3W8Y8@Q : E() W} ELQK,dcE( %AAL#!]SXL " H3WE( %Y-!]3@LK"8T2F =*,FW ?W?/:R\"W?[6n"3EL'o9#EUV@x?Sm7 -i*QGCG.iCY*QGe*Q#C?^.$,qX.,q,,qY*I%~D +2.EN0+C, EE8; . By;5c>H< 'Q'ItHx0g HxFHx &*9&aC1qC0) m> U:  T\(Y1rR'JyQK1r'N3H538}8}rX19KY HJS$(Mx9'-MGK@"="NGF30MF,4.OM"U.OK(m%"G"M2,O4KZEG.O4j4M[$VUV@xG$y+xI1 L UCXHN-AXZ D.##3^7#B2"T=)H"5~"7TLVL ""TLLU&aC1qC0) m J+-r.QH [-r/\/\-rOa>)4-(`)COCi)V/PxPx W30PL>%3GWe+W(mIJW-;41y-[} -cNhUQJ ^&@XJM!M!M!XJHRM!M!HR"^GHLpJ+E-J"p"pAJPFy.HJH31#)D31 $KX f2S5s $(#>{'(# I f $[(#(#[eWbJWW([-BT1-?? 
-4??4v?vV+%O %,%,37+&&/1.Y9J:5 '8.8A'~/ 4B8Y78l+ @] -ET ELD%U\/NLU D^L^UP.8UDTD^ HQ6$I: oP;Q* oTBF"OZ 1TBP68TB1% yI<# C#k#C0$.7m%EC$F_F#`$$FF$D#04V0 ?-K]Wt7)X(Z1KDV+N<'-;m6h&m;"x-H n-H6I+X2`Y H4(Mx-'-HMGK@P,G6| G8S0MFBH4.OMP}.OK(m%/PG"M2BHOO4KZEG.O4j4M[$VW0\H/\&pK] EO&p NYJN47CM!J(JNk JN(aG?/W*+@/</, & KG/ ^/+<K V VTVGOFIVG8wVG;mQ 9/=4#> (33<-M0A+'QW'H0=#q-L$/=0&=3--LE1%/N=J&aC1qC0) VrGE*AM)1/NPE?Mo( 5)SAe75s.,XkGAA-G N 5 <G8!h<+< VTAr_5^?=rYC_eVeHeAr _eeS9#*X2`Y H4(Mx-'-HMGK@P.r6|G0MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$VUl ~NV4-zB'C.F 42XZ;]9(y2=*`4 2=TUl1:@A WDNCQIRQFQQ=8B98S1GG::)N!!0A;HK882<0T]0MU2QJ;(W]"pY"p@QJ?QA]5 C.9?/(59[P13o X'3 ;|I&,8 ; %MZ %=; ;& % %&:>N'-)'JG=0%$X  f2 e%AeI f%'-ewN.w(G(G?wJRJJE+,AOAA.C6,BI!S=m8B % %MZ %=;&&B % %hh>>'0YX>0m.58FT0mB0mF8F40F qF.Ss1e,t%UUQ"15`55`Gn5`;A3A\,<B mA\2,, SVoSP6C;1M6=I;=xP7 ;//5;//BY 2 69?B8E)CSBM#3~-XY< CWKH5vVCW \NWE# ?VE)E)'8JN7-@-_IH(=[ *@L?M@@Ul ~NV4-zB'C.F 420Z;]9(y4 2=*`2=TUl1:@A /TVyRO<(# ]DrW9F:)zY'-BGE[?IBT#PYI)F3N 4.O m @6<RR?6!(#.OK7<<3e?/LSNa6.5&K O(#4 &.O4jP4S<G, %*5J"*T W'X+@/W/& K/ /KNH$|PJ.(Q/<M4WE0A5 K2 %NNURWP_VKj8G=X|?u| .?EE2YS<E274x-E11]&kKN'-+!X '"+4+!K.6N P`@ J=KJ f 2KT{ 5s@ (#'(# 2D$I f@ C2;(#(#e/y:B:Dr:H>,BGK=TRE@@FN=IJ=-;@41y-['-} K5,'[CR#HPQ=%&G B^ CBB2PQ> m>N B*. B KKPQ4%#T-*'6q:E;Cy P7!y y!AP :BDr0U9Ec90 B<T, $"pY<1t?<A$5 CP9&p~K] EO&p NY11K>Xf?P)X>2=8B+^ BBIM*V_7"FD 9,M"]/W#V/#c|8UV|U 7:M5S2]:|Cj%-Ba2LG3L11 L%'LK1 -0nHaiM' R@Aj, ;  I1/)2/)M;;POSPe)J"Q7{("OL"#qBB/aG(4 & C:IC3AK&S%OC3N5C3'-h7,%?b@\? BFN:1I@\!dMB, A|:#?#$!dW-*!dA37:/W7-K;6 )FM-&N@EEY>&3*4KA(.@K KA5KA8*4??O7/!Q/! aC>CtPe2Xs4OY1s-K^4OSF5 6S$S?iGnWJ'JZI A# >6 2A#CA#2=I=QUTFO62U-W$/N(J%qD^^ Q.RD7TD^ HQ6$/y:BDr:H>=%,BG=TRE@@F=IJ=-;@41y-[}JM"RJJE+?K>.%:N5 Ia36/S90S(Z6/6/(ZB"P?{*9D=J)|>"DG"U3V."UIJ"-;.41y-[}/O+* VN%Wf:<2NNK(#MMm4M[2lI6A BIBB!IBB"85z5zJ5z<9A@3BN5Yi'-hPY ?6Mx)\H'6|WV,0%Gv%I+'%OTv ?F)<E9 Y8 #7' :RY'/ JVC'd1+R' Y$?E'9;1-FkCq3 rLr,rK5,'[CR#HPQ=%YG B# C2PQ> m> B*. 
B KLKPQ4%#.=-*a||?-8:A3Y4"*-AQ?7`6eK,e9^),,XK9^ c,b QK -/$ R1-/N5-/'-'-4*>:Q.)`QOIO:lQO!!(!(OL&MX(!9F<)*sY*sU8*s!$.j>B$7lhK5,'[CR#H@=%YG Bx CGA@ m(# B*. B KK@4%#=-*F63|Ya3|M3|!GBQ?~>7?~B}N7?~I#1W98ELRO:S(# ]DrW9F;:)zYg'- BGE[?ITPYI)VH  %4.O m @6<RR?6!+S(#.OKS7<<3e?/LSNa6.5&K :(#4 &.O4j 4S<D3DDG1G<)6! EKLR&AGL &%KN'-K5,'[CR#HPQ=%YG BA C2PQ> m>C B*. B K KPQ4%#=-*"("O":+$AK*=H+$>N>5Y+$'-hw$c&?N.w(GLM!w+&&/1.Y9J:5 '81.8fAV</ 4B8Y78l2V+ @] Qj.B ?I+GhS7J7 HI+N7N7IvN7*]?j?jI+N7N7'B&P>>>KCSK(G8/#&,LJWFNh[p%sa E3[pN6pIXcNh#EXI{;-{I{Cj@"I#Ev#=v#UPH F>A.HGP%6|,NP}IJP-;41y-[}-/$ R1-/V+N5-/'-RO<L(# ]DrW9F% :)zY'-BGE[?I9TUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e#?/LSNa6.5&K O(#4 &.O4jf4S<Y53%~D +3EKC, EE 2K%MD%ML%4OK ;OKAoOKFz l-2P2ORCF27B*KYN*S; c.<8K *3/{<8AHnK<8TH1>Y,1OTE_3Y777'61>M^Y77 $KX f2oS5s $(#'(# RI f $[(#(#[e( <$ &N54-p'-'-TT:YAKP9 YFNF5'Y'-h;J<J(95r;eVe0_;ees'8+O%''-M0A+'QW'H0=q-L$/=0&=3--LE1%/N=J, (, Ce-, FFF:3,, G$8HW,t._CI-:Lz%*+0o/Lz LzS8^pX2RGd!n4$PK?Y #L$/:\9i 7M!57@;:\N2Do(:\5;R M! nlDo  $ X1S5s $(#'(# H $(#(#AW!D)#IlE'L9M V(U|/< H :BDr ).oUP#PBs!T> 75's KJ/sH H5*W#O>3) $X fS5s $(#>(# I f $'-(#(#e+&&/.Y9J:5 ;'81.8A0L;48Y78l+;@] Qj9K[P8PZ6 $;Y3PZHA4Y i0 FM90Y=6 "# 3Y=6?=6A 35 CP9RA)R7N7Y\R7BJ748&5%8&8& 1AQ%$&<=Y,52-}$&ARW*5 #@ SA\"WAI,?5 15WNQa=dH^=\:nH^>{H^??2J?2?2,jO.u HEP35 ) RP#2b3Lr S9W 'LrKJLrH H *W#A Q+U@[?7(ZsGO# !v!"NA)3B22)3qY8.'E7/d):U91PE7??&%B8Z1 )S ?7?1sNXkJ0L ?*$=E  1XXRW-P*E*V)I 5A&?'* --G9(9 U79 <b#bbDt,G $X f28S5s $(#'(# 'iI f $(#(#e} 57: N"1Vx SVoC`/ }0<SM9m<" }CME%MR1 }S P O+E9 Y8 #7' :'Y'Q?/9VC'd1' E'9; GH#,&6B34ACf$$VNHSG+ M;F>0L{;P;& "#*WN )6'="E (K01 *-@ (395313)X0~R0~ Y' E# E)8 Z&;::|%G&a|; SOZG&" "?'PKLKIxG& -'0nGjGs(X 3GjV5sGs(#'(# D$D$Gs;(#(#(N 01NFF'N+N&IJE>=Oc>IB=?<?>>=??C;WCP<CG<%H  f2 e%8I f%++etOR?'2v(#SCW9#?M)z Q'- GE[ BI l CKI)8S GN4 m @6<RR B*!2v(#.OK2v7<<3e B/LSNa6.5&KG:O(#4 &.O4j;4S<VBE^k?\CXCN5 CXK5,'[CR#HPQ=%YG BLL C2PQ> m>s B*.f B KKPQ4%#.=-*)D2CiNKD2BB"D2(#*qG-G-G-G-%OG d%OV+V+(%O;&(wYP;&E;&%Od%OV+ 
G%O20@&{P)&{(#(#$>&{(#L><JX f2 5s<(#'(# 2D$I f<(#(#eb*" (W<#1Z D %J( % % [ , % %3X])5V?\N5CO$1kPGqN5 TiPY"tN1%)P%lAa"Lp=-<DN 5Ut h%1%#c%%U OD yGQ2]N!GAUtIsK $XW&S5s $(#'(# $[(#(#[ Dnj;V8?9,K5,'CR#H=%YG B CG=e) m B*. B KK)4%#=-*Q$&<=$&R495#@!`5XDVCN H;(!Mx'-MGK@AOHS&GF3MF-q)4MAOSS.OK(m%;AOG"M2-q:4KZEG.O4j 64M[$VNA H/i(-PY"7UG"!V.F"UIJ"-;.41y-[}:&{0> f&{(#(#MI f&{'-(#(#eMU ?~D>7?~B}B}F?~4T.T@G#@!`6I,GU1A/TB) F1(QBS4-22<;WyO})NC) Q)8LQ, >VLL= / K5,'[CR#HPQ=%YG B C2PQ> m>N B*. B KKPQ4%#=-*HJQ:8@I0Q8R-@M45U'82.H,'D6,V4H,$/@6&63-H,E1%/N=J  Iw < E)fE}#Ev#=v#U6XsXV+FG$9KDFOIO:lGIGIFOO 'BUNNW< *X *E L6F OJ6Fy6FWA2N5Yi'-;}h:X2`UPY H4(Mx-'-HMGK@P8[6| GF3MFBH4.OMP}.OK(m%PG"M2BH:4KZEG.O4j4M[$V*^L1CP *^~ e,*^0JO2=*3cV <NIHSFNE'-'-%?@\?@\6B >"P:A6">4 ?+; E8G;8(+8BR34 iHQJiY FFpG . G< . .G".7-@-I'HV-CrB K(qV-LJKh,B}WKh.=7kKh |N7#}?YE lI1!V7EXXGO7GDN( G"N?F1^!1^T1^L:K5,'%CR#H=%YG B CK>~& m&9 B*.L B K8pK4%#KJ=-*&n!LJ"&n&n2UHk66664$S]A/V 2M83I (i488<M(iPPX(.aJ??4 GI ?4A#/6 2A#CA#2 C yY  P7!y Hy! 9xD}F%1RW2PU)3MI&Z@YV08I5A xDv%M,1jY9FI$ %4O8$0!0-Q f2Vh1-?? I f-??e }==F="P&qP %5%GEG:LcV7K5,K;KAA&-/y:BDr:H>=%,BG=TRE@@N=IJ=-;@41y-[}W+A-/$ R1-/"N5-/8WV8CC%RGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\ iR M! nlDo 7CJN4M!9J(JNk 7CJN/>OOG2; q)P009WN;1X1V$MH9S],MX,F^M4WE0AF %W'C CC &7A>:EP35 Ia36/S90(Z6/M6/(ZUl ~NV4-zB'C.F 42Z;]9(y2=*`2=TUl1:@A :5JRwELF%1RW2PU)3M>&Z@HV .">5A Dv%,F>$ 4ORQUl .Yz9J%1(W y+ ZP:;S4P0-Q f2 1-?? I f-?$?eY%6? e8% BiBiBiBiR7YJ1FF%s'C C>!j"Z5j1Vx VVoXC`$00S<iM@ACiMEM+<1A V VTGS3P3W22*%,'59nR{$) )%,J37J%SJ!=dJ)4 (E(# ['C. H%o(9HJ*'?BN8l%#7X B%#-Ev#=v=v7#K6+*CRC=%(T*G5u36;X.VF m5u<(.5u KKVF4%#=-*Ky E E *%#3+%6N*7>3 Y V/ LE * *N@IsKH.sZsS[4NL'-'-^C Ny4 P7!>y =F{M#9XH"6R2 U6RKM=:IUUH>H>5{BBT&"C">!PP1$(UFX f2!M5s(U(#'(# I f(U[(#(#[ePUPH ?6)\H W6|WV,%Gv%XTIG/ 2O60GWC2|t0 0P:A62|>>4St1< RNi2Wa R"'?8:A@:NRx5(:As(#hq?Oq9>!+=?: r 'TS Sr"TS'',TS'W B>098B*aj*A=*ERYRK-EL,RY;RYA;HK8(#<0T q90,223BQJ;F"HD (wYPA(4W68QJ(#24W Q[LS?>KAw(#(#?s! 
N*'4|&FRY0,YF(@QBh9Te9.7{97{+( 5V, 5; 5AA#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6LpOc#3 -G5V 4&--V V E9 Y8 #7' |:RY'Q?/2&V|C'd1' E|'9;-P7U7U!4/":N<8'O?Wj61/|BM1g>6u,5):/|!p C(#([-L f21-?? I f-4??4e;'D<;S$'& =& ;;'& & ;1.BX:*NV$-4Z$VN(k(k>)PH2?[2? b8,a)J)J,|8CC%B NBNBLgLgQ+U*@+MHXX 1AQ%$&<=Y,52 $&A0RC*5 #@ SA\AI, 15WNAa<+Dr;4)zFB@''TP7N'3e'1 0O@O+Q.mD!W<Le=W Cc5?GO"b&X 5QQcQ5QQ( NO#NU=G7O G 1*<UVUUI8nn@7n+-!;X,X1JPV,X,-/$ R1-/,N5-/D" M4C 0AE=*H3Rl0A1W) 8`N3Jp/Jp&i1W6R\1Wu1G"3[6n"3EL3 \,/)VDi U$/66- U0U5" QMNJPD 8?6)\P#=.WV,0%Gv%O)5V?\S@N5CO$1kGqN5 p 7g>Y>"tN1%")%lAa"Lp;&(wYP;&7E;&# 3Wk. Z8">rI3>U)!WQM!**UI3M!%M!"rI3M!M!/G WTPP=@(NP8PG..OC8 @@.X.4G#D 5#K?0#"V/(#'-BFo9>W<F%1RW-PU)3M:#&ZP* V5A Dv?'*F--G R 4'8+O%'+''34-IY84 % %MZ %=;4& % %&9-:PPPTFF%%***%3ZN7(J+#" (> f26=$ (#@(#M%WD$I f '-(#(#+7e" M40AE=*HRl0A1W J1DW?/1W6R\1W"W?[6n"3ELM4WE0A WO>}/pO? VVoS  V >DV+WP ?. B6lPPMXh%3"%55%5N1(G+R J<-RA8vYDT\N&@=Cn(9>;}RVN>?,GAOMHS@vQ%AOSSIJAO-;41y-[}=8B98GN5O5O7/CSYK(G0%RY01;K82<2;"pZ^1$ $KX f2S5s $(#"p'(# I f $[(#(#[e{;-{& I{\5@,+PD}D}RLY3/L*XDVCNHM(&^Mx '-MGK%AOHS&GMF84%MAOSSK(m%AOG"M2:4KZEG.O*4jG4M[$VK =$0@'8'Q)E -#\%QF>!.Bc?F>]F>D80BcSY6 )%P7/Y ?6Mx)\D'WV,0%GvI+'%OTvAn1HR$p$p{;-{&{F1FFf%sFYR?'(#SC9#IW %&;X/K.E7/d):U9NP?E7??*gBWW45)S'?7?L5s.,XkI: oP oF"80YTh.OPFy.HT&=T%LTL*236I+BqSk9DW$/6&63-WE1%/N=J.3P0&{0> f&{(#E3(#MI f&{'-(#(#ehE")?59X$33X?X$ ( (X$: $X f2:S5s $(#Q '(# I f $(#(#e>M?d 'Q'3&pHxS9 HxFHxK] EO&p NY-/$ KR1-/LN5-/'-LHRU/TA64g(x;/TKo1S)/O(Wo1Q1CLH(O ] =.E!#b) L HLVN(# ) qP#&|(k2E>)4K1HD (E(4W68(#4W [LS?>K H(#?s! N*'4|JY)&s(`)CCU)#&2:}4n+I#N1MN1V#N1N1:$K@PK76Y3@V/& r@@'-RmhH 7UC,A Y@7U(o!4'@l(oK(o@lB@;@A@22*8%,'59nR{$)J )%,J537J:JSJtJ!=dJ>p>E)CSX tK. 
WE#  E)E)8"8%D+&16%D%]%?%D%%NFu:BDrY]+rH=%OH#BG' T-9O'K'@@N4U@\BLL&x(: &x&xUl ~NV$-0 BNo3.H4lU2BPZ7MS(yU2=24'<"<lUl$$Q"!"T":gL:?B8CS7=3LY<)WKMW*h"W?&U?(G+R J<5-R 8vJ$5V+\V+&@=Cn1(59>;}$8/HW,t._9-I-:Lz *+5G9-/#Lz LzS9-8^pJ?u GI- <?4FWA#E/6 2A#CA# CFWFW2===B9& 2R(H>f8R8G.:(.G @ MF,RO.:(5(%:(GSGR8N,R4Q-=4j)nM[(lB=1.T W'XWQ2K;1/KK5,'[CR#HPQ=%YG B C2PQ> m>N B*. B K KPQ4%#.=-*8CCL%N 'NR17=iT,N;N PD 8?6Mx)\P.WV,0%Gv%%OYoR Z EQ;;SJo9&JoLL'tJoL >d 'Q'82&pHx3$ HxFHxK] EO&p NYK.FX f2F5s.(#'(# I f.[(#X(#[eWmM C6>? #EUVY@x?SmQ7 -i*QT G+O Q%g%g/5*QGeH *Q#Q?^.e$S %( % % [ % %T ELD%U\/NLUA D^^UP.UDTD^ HQ6$:+'O?Wj8"g>6u;YR1XrFT,KB1LJL;V0.,B}N7WB}P3L U7=L | ,N7#}7?YNK 7U:CY]<7]HYVp7UGKHJ%A!C}SGC ONHJ2aHJ@@NG4U@\'-BLLK5,'%CR#HO=%YG BQV CH`'F mF" B*. B KT:K'4%#=-* HEP35 ) RP#2b3Lr1S9W &LrKJ.LrH H *W#B D+ES 8 HD+N7B}N7IvN7*]?j?jD+N7N7$#RRYAT6ATBBBBIV*B=U)1+%H+?KsA6N)1V 83 'F LV0V*FN@ILH/6/q8Cx |KoO">^M):LCWoO"/QO"CLH:LO ] !/@]("<@ V ..Ic--V V Az-M4Y i'QW'HY=N -L$/=0&5j=3--LE1%/N=JAQXJ; A.'538}8}r M9E+B<RV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJAO-;GWY41y-[}Y4%RO/0j0j)4 9Q7$O8)Gq#WQ7'TX$ >Y'N1'%) %lAa"Lp*NB:{H*NDoT6Do;$*NDoDoM2Ur4$*!07A*'G'9*''MMK"")=-< + B0&{0> fw&{(#(#MI f&{'-(#)(#eRI: oP oF"@3WW-4?!%07=EQ+Y< 0 :9=AU& Br" ?U&&U?K5,'[CR#HPQ=%&G B^ CBB2PQ> m>? B*. B KKPQ4%#T-*X&{QFXVDn8 =0!'&4 ?0!0!C2lV23:W>5)2.3K.J2../BG+IF//%O d%OV+%O H 8 ).oUP#PsY2$> 75&s KJsH H5*W#%"8%N?B*%#3+%6N*Ks7>3 'VT7F L * *FN@IO~K&ACXO~M!7M!XO~M!M!?~D>7?~B}B}?~A"'<JX f2U 5s<(#'(# 2D$I f<C2(#(#eW;MyR 0E2~)R7N7Y\R773cDA2!=EV)$Hh3233s.sZs?JK@K@W)4  Gy;X,X18TFSF+1LJ kLM?VCKC#W kL UN7,L@H9S],X,N7 1X,CuF^S@EQS@S@0R@e f 2Q@N-@??? 
%=I f@;??Ne6YcF$BpEC$F_F#`$F%<44)FMN@EEY&3KA.@KKA52KAb6s+:9x L LF%1RW2PU)3M>&Z@XV .">5A xDv%,F>$ 4O ZX(T87gR RKX'JZI*j>:Qh::N :-/$ R1-/N5-/'-B<J% 0 $ X4iS5s $(#=v(# ;k $'-(#(#;X,X1CVV&X,$  -"UV@x G:424m89p 425q?{OdO7A$/*3<) Ku8"*t?+~V8*%#3+%6N*Q27>3 P$;;Q2Q LQ & * *Q2N@I22*%,'59nR{$)0X )%,J 37JU0XSJ!=dJ$3>->>&-E)=KO=K.a=KQ3%K; UPUP ?6Mx)\H6|WV,%Gv%DA<A.A X>%IF X(=X%cr T T@5+>{)5V?\S@N5CO$1kGqN5> 7g>Y>P"tN11%)%lAaN1"Lp-vS2]23U]-w -wD2CiNKD2B1JD2(#XDUPY HM(&^Mx '-HMGK%AO6|&GMF4%MAOSSK(m%AOG"M2:4KZEG.O4j4M[$V" M4C 0AE=*H3Rl0A1WX 8`N3Jp/JpO1W6R\@^1W"3[6nR\"3EL*%# +%6NN4r&K3$ YGy?? L * >$$*?? YRHs> f2H(#(#MI fH-p/3(#-*(#/3e HLVN%(k )FA(ks>)X#;tTKJH H*W#T9HU5"H2P^4/;BgOU"/GU(!0(0/UY"U0.!w/G0/11$Ky/E U1A/TN4) q1 t%1%#c%Y6?`46:Rb"{RL3/L*W $XNS5s $(#(# DE $/3(#(#/38%`6>; oM 9@$Ct 8)Q7! (%T\ 6  M Q&! 6CXCX/;NfQ 22*%,'59nR{$) <%,J37J?SJ!J2?,[l9lN (:  F:+4::,lAKFt,l,N5,l'-hA.q #.q+ :8888X2`Y H4(Mx-'-HMGK@P,G6| G8S0MFBH4.OMP}.OK(m%PG"M2BHOO4KZEG.O4j4M[$V3LBP /Y=?6Mx)\DWV,0%.OGv%%O2`P,-a-a:3AK*y>3C NC 53'-hQ11V1W'NR=iT,N;N -@M45U'82.N'D6I,5(4EN$/6&63-NE1%&/N=J JEMF*2W2/dYfS:UfY H QE7??FbB;dW8)S?=?XT` 6 W8W2 'FW2t)6K:BDrRFC=%ZBG?TY*VF m?6.? KKVF4%#=-*%j426?X>X? %j?X$:+X'O?Wj/|8"g>6u,5):/|!p NK 7U:CY]<7]?YU7UGHJH~!C}C ON5HJHJ@@N4U@\'-BLL0m*sY*sU8*s!j )EA<HZHZ/H!bR 6.(3W8Y@Q : Ki) W} ELK-mVS%AL#!]L 3W% %Y-3P25n?6Mx)\e6WV,0%.OGv?c `!6%O.Otv>]?12v 4$S]A/V 2M83I (i488<M(iPPXJaItIt-0g7 J8 YS?/M>CP'v 1S?>JJK,1c',k2%ANh/9=Xq&8/V/IXcNhXIM k DrI2i5IJ.uIJP2U:IJIJ1y-[) O./  .I D,>GpJ&+&&/1.Y9J:5 '81.8?AV</ 4X8Y7.X8l^+ @] 7Qj#&2WF,LBm3 Fp-%s 3 6Q36QjpN6LpO#3 -RG5W3A=K):M3N53Yi'-hK5,'[:CR#>'y=%?GK B C%'y> mN>5s B*.= B K&kK'y4%#'-.@`-*7w,^M1(G+R J<5-R 8vJ$5V+\V+&@=Cn1(59=>;}:F4FNh%s% =b3IXcNh=bXI HLVN(#bI ) qP#&|(kH2+>)HD (Mi3 (4W68(#4W [LSS-'ITH HMi(#?s! HN*'VQH D D D Df84SCTCTCTgRS&) oR@{RRA R:&{0> f2&{(#(#MI f&{'-(#(#e22j'599nR{ <D.%J%M; S%J!J8; .2; ByH6666i/1MZH3;;y!;NG !#S'.NG#S.0? 8X f2  Ae/2'GI f '-he,N1?\N5CN5? >*< ]Dr;4)zFB'TP7N''OO".*.WG4. 5U%>H>/l(#>K5,'[CR#HPQ=%YG BA C2PQ> m>C B*. 
B K-%KPQ4%#=-*005??-:W0&{0> f&{(#(#MI f&{'-(#(#eK ($*=L2E#JE** 0J?S*^L1CP *^~*^0JO2|<59[/I &[>]C,P-8B6>]Dz?>]:|+HQ>ANgF,9@>N5+ 9>48!hNSNSxNS)2s-k,l8E``@O|GAA-G N 5FG8!hR<=8 <B9-/$ 8P@GR1-/FN5-/'-'- 8)LE$C'7L?+e''g6:BDrBGTOFc mS?/Fc KYN*S; c.%K *3/{DP#AHmnRKPTHF A/7o{!/X</ eUkH/'M eB& eJM?7""02{I^E!2b{_{.r$*`PRPE9 Y8 #7' D:RY'D?/2&O9;C'd1' E9;'9;-3W  @QLKi) 9 ES ;%AMZS 4!]S  3W% %Y-3"c,-=,,6 |'q p*U{)NT pSDNF pKP9&SM[2l1H4Uz21J51,DX2`Y H4(Mx-'-HMGK@P.r6|G0MFBH4.OMP}.OK(m%JPG"M2BHO4KZEG.O4j4M[$VXsL;Y1s-K^4OS F5 6S$S$;Y3H6PM*mKZEGj=9B9& QN(H>f'-8BGK.:(GMGMFV4O,CM:(5K(m%:(G89NV:4KZEG4j4M[?DJ83`< ,)h< < X:SE SVoV_SD~/*C;1/O;X,X1JPV,X,=8B98GWzGN5OOC:(@@5OM: 7IB$`:KB?LO'1F':K@0uE;YOR@z8B@F7@ 7IY"cIS>Y Y6?46:Rb 1AQ%$&<=Y,52-}$&A3.RW*5 #@ UA\AI,M 15WNQadFU Wv+H7HHH6?c `!6.Ot=8B9;8G@GOC3EI:P;Q* o TBHF"OZ 1$yTBPTB4m89p 15q>w<*& J#/%D/%/%D}RK5,'[:CR#>'y=%?GK B:! C%'y> mN>5C B*.y B K-%K'y4%#'-.@`-*D=P8P"Y..G@4 BR VO&p'^'EB'UV@xG .+1 .K5,'CR#HJ=%YG B CG6A m B*. B KK6A4%#=-*U11M1P25IJ:A@:N5(:As(#ht"%P2K%MDB;Q%M4Mc&K*4V4K* 5GNMFMj4.OM=.OK(m%=G"M2MjO4KZEG.O4j4M[$V< %[ HLVN(# ) qP#&|(kH2E>)4K1HD (E(4W68(#4W [LSS-'ITH H(#?s! HN*'VCTr*@ X2`UPY H4(&^Mx-'-HMGK@PS6|GMF4.OMP}.OK(m%PG"M2:4KZEG.O4j4M[$VC`CQ0C4RO< (# ]DrW9F6U:)z!H#'-BGE[?I@TSPYI);N ,4.O m @6<RR?69++!+ (#+ .O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<K5,'[CR#HPQ=%YG B# C2PQ> m> B*. B KLKPQ4%#=-*PMXh:6) < R X:&Q?``@QEH.a%; #-Ev#=v=v#N*P8Y]HYPG8 $.>GO8 8 @@NG4U@\BLL-l<-=%YG B!z C: ; ( m (R B*.# B K+BK ;4%#;.=-*-)H! PUV@x-iMBG3L*QLTj:K:KuUQO~,&TACXO~M!7M!XO~HRM!M!HRV*5- hG JG G @Q*dAd774d79TN?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5Y:(@@BO#N64U@\BLLG'LDD:6DDY R3%y% iHQ0JiY FFpSG . G/O3ONNM q:TPAUWo'TPN5TP4'-hYKqS!)O f&{(#(#MI f&{'-(#(#e 9!22j;ejJjJ4JQNV$-4ZR$ GwJGw5O u:7N@60T0M>5>90M(#(#M7D$0M'-(#(#hX,MtI>WUE9CH9+B<_ODu)Dur<!Va$aKa>5?V>Mi>|<592[/I W &[>]+zPA(W 6!3>]Dz?>]:|W +HQ22'&$JCCCH^b=\:nH^>{>{(#e9+%?>?@\4B"cAANXqT3@O2? 
7UC9nM3Y f 7UP:s!GW=4<SP:N=dP:6u,5):/|!p 6,BI6,=m8B %MZ %=;&&B % %FK$1!-0 $Y TI80YTh"L%e5 R:7) o9(9 U79oN 30K>$YF2(gR?(NC=%@2GR+C.FVF mRC.R KKVF4%#=-*/y:BDr:H>=%,BG=HTRE@@N=IJ=-;@41y-[}U5 % %-WD$Gs(XRD$V5sGs(#'(# D$(#D$Gs(#(#/W*+@0/`</IM & KG/ /+<K V VT{{{ HLVN(#b )B qP#% (k2+>)K1HD (%3 (4W68(#4W [LS?>K H (#?s! ""N*'4|%5T,a)J)JY+3DI%~D +0.EA?-C, EE8; . ByGG8wG/EL/UE/E/E0JO2%?@\?T @\B7^#?!e%SZ$w'I@U''|W1 oES|"("/.|""`8E1' <C0@ CFN@'-'-$J%`6>;L NFOL OZL &Y`$67JX f 2*Q5s7(#'(# 2D$I f7;(#?9(5(#e&P>N?O=8B9Y] 6Y78G:(V$G,Lq6C OC C:(5;:(@@:7NN64U@\BLL+H^5X+ 2U -+    w't B&7,%?b@\? BFN:1@\!dKB fA|:#?#L!dW-!dA7:/W7PL;6#04V0?-oKOCK_(:)rD)Y":)5HE@V5HW5HVNK[K[G1&11;U2sK%PPIPM! =10W3!E!Ga<(]K@". 2W) 7K? 22!G".07KWW,Dr2i5IJPIJP2IJIJ $X f2S5s $(#J'(# I f $(#(#eYM=H>UE 1S?DN?O=88B9Y] 9Y78G:(G(sWLGOG6:(5":(@@%jNL4U@\=dBLL23l13C2T(&T2TT+`"^JJJYi#C#k#58&}!LHRU/TA64gJx;/TKo1/ )D<,JCWoCE1Q()1C sLHJO Q] ,); ('BV/GL OL OL p-:AM N5'-hOFN'-'-1c',k2%ANh/ @Xq&/V:/IXcNhXISFK'E9t0JBF$JFFN~JFF+B7v4!>QAd 'Q'82OHx%7A HxFHx BR 2M&&#D 5#T:0#"V/(,#'-VMLpY 0A+0q :&"C5wAu-"C-$>X@C@C*9CFWFWRGd!n$PK?YO#L$/:\?6O5:\N2Do4:\RM! nl ?R :IC3AAK&S%OC3N5C3'-h;d$$E}:;.7P7 ;5;A^9%:9A^7M!7L:a/A^$.7m%EC$F_F#`m$$FF$DUAT1I?!1I4)1Ij?*O8X f %<) *O Ae/2'GI f*O;'-2 hea $aKa#6UERM B>K?KSIO#L"B/;KaPG'5;K-Do;KRM! nl 2CUBI?JK@K@W)$7NW#G u:7N#/8'K[D'0YXK[0mG58F0mBX0mF8- =J $X fRS5s $(#@(# 6*I f $'-(#(#e7I#Fz l?5O'1Q,CPV2Y :*" Wd 1S4S tYl.L7S BQtX( GwJW//bGwM+lOBOH>*" 8(-;Y 340x'!;YIJPIJl;YIJIJX19KY HJS$(9'-MGK@"NGF30MF,4.OM"U.OK(m%"G"M2,O4KZEG.O4j4M[$V8] &1:8 #(5#0#Fq(#BZBZ5 R| ?C Oz R Z3W8Y@Q : &j) W} ELK-mD]&jAL#!]L 3W&j %Y-3<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB(7?,B5K"`*B.h =8B98GC =#04V0?-oKK 'E :!%S+]N EfLK'AfL# %>L D.CN3W %+@p:Y>QAd 'Q'O&pHx7A HxFHxK] EO&p NYEP353 S9SLrH< ,)h< < X:4N &:03^4N+Nn4NNA;HK882<0T]0W2Q);W]"pY"pFQJ?QA]5 CP9UX<U(#U%<;;%!]?!]9%!]!]MQ!0iP&!*N7VXExP&N7JN7T?jP&N7N7G;,! 
_GWM] _ _:gB.\DQ2d*i 8*T*H)4 9Q7$O8)Gq#WQ7'TX$ Y'N1'%) %lAa"Lp_4H}.D)Y"T05H@V5HW5Hu(rs0V Yq<J%B <-/$ /R1-/>N5-/'-'- ,f}:*>A}NEG&F; ,NE$NE,4  %.X;95r;e65e0;ee 69&'2&'Lh\V M//EP353P[S9SLr[TFO62U-W$/N#)J%qD^^Q.DTD^ HQ6$<CG"YG"}G"W $X f2L!S5s $(#'(# @ I f $(#(#eO~,&)ACXO~M!M!XO~HRM!M!HRJV :KKaFNY'-N?O=8B9Y] HY78G:(CG]GO:(5:(@@NG4U@\BLL=/7,==;mcc % {T,a)J)J"E")?K $XS5s $(#B'(# $[(#(#[&"C"K>P= 1E= = WK-8KK A;HK852<0T0W2Q+;.K)YQJ?QA)5 CP9", 5M(#RY3(#2l>z N(F(Q (>#K6!~3<LJCC/X>%I X(=BXT[V\< 9xO%E -(#2 JI JKB JJWr9BJ]Wr=dEe=d Wr=d=dw$c&?N.w(GL"w6R$^./66J6=o!E1?/?/X<JX f2 5s<(#'(# 2D$I f<C2(#(#eUTP0rh0&&URSVN(k(k>)FF<%sW93pO9?e+;M?e7IB$`:KB?LXF1F':K@4uEYXFORT@z8@F7IXF"cIS>Y wN.w(G(G(?w!!(!(&MQK::3u@G??!%07=EQ+Y< 0 9=AU&(G Br"X4 ?U&&U?K K +A1f+/xPd<SLHRU/TA64gCx;/TKo1)T:LWo1Q1CLH:LO ] @]=)O~K&ACXO~M!M!XO~M!M!.2`22{XRVN>?,GAO4HS@vQAOSSIJ%AO-;41y-[}3<". <-/$ R1-/N5-/'-'-#@cA,/9 A@Xq R3@5L@O#-G5B/Si+ M;>0L{;P;"X2`25n H4(&^Mx-:'-eMGK@POG0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$V:-/$ R1-/N5-/'-WlUQN(LU-D*%,'5%,37'0J`` %1)= }&11;U3HAEsV NY&nJ"&n&n1$ $KX f22S5s $(#'(# (UI f $[(#(#[e-O>?%X-5(#-(#(#L>S?03L'>;.;.ILL>;.;.*fW @ N@%C-I38C % %MZ %=;C& % %&<{q9Oq XR>Y97?K! DR7?BJB7?,R"`*.h QFODu)27Dur<D*N:*NDoT6Do;*NDo %j>NHv6 'Q'82UHxOL HxFHx L8*O T$TXL8T3*<OIF*1k8 % %MZ %=;&& % %#A^:9A^7A^,'H1MC*>AV-NEgF; ,NE$NEH9S],,X,X,F^ 1AQ$&<=Y,2-}$&AR<#@A\AI, 1<WNQaeQ&$9)`QOIO:lQO OA<HZHZ/H::7,%?b@\? BFNN1I@\!d7_BbN?$!dW-!dA7N/W7K;67,%?@\? BFNV1I@\!dB7?!dW-!dA77/W7K;65)5568<{qz9Oq XR>Y97?K! DR7?BJB7?,R"`*.h A5ItH<0p7f90ItBF0g=  3YH\?A 35 CP9T!u')PV_OrcV_*2,A'MEV_S;$ E,O;"N5;6U>P@R$,[!p9S61 ;l/| CNXRW C$] C,5):/|RR!p +&&/1.Y9J:5 '81.8 `A/ 4X8Y7/&8l+ @] 7QjJGzSXX,@ Ae/zGh0Rb f2V?b??? %=I fb?!$?e aM$/ a!u a:3,, $ XS5s $(#(# $(#(#O0NEVYh'u/IK5,'K~CR#H>=%JG B CI ; ( m (R B*. B KK ;4%#S/-*>/)60"/PGE/UCNQNQ:DpW>'JR< 5)V, 5; 5AALLYLjYWRR\YT PT #@T B==GL] HLVN5(k )S(FA(k z>)C9U8bKJH.I H8*W# <HC NL'-'-#A 5#0#Fq(#(%H  f2 e% AeI f%!s!se91H4Uz21J11aSY;&OSY< SYUV@xJ GYP4m89p P5q INV4-4ZN0Rb f 2V?b??? 
%=I fb;?e$@4?Ne HLVN(k )P#FA(kV>)X#S %KJOcH," H *W#&eenep"?v!"NA^9%:9A^7M!7BA^!XE!!!s!HHRHF`F`LF`q  TJI$mM9:R Ph f .t,Q (#(#M%WD$I f ;'-(#(#he-rH [-r/\-r0", 5M(#RY(#2l-{TW3K@W)? ",N7uY1Vx SVoC`/-0<SMW9&-*C*ME%VMR1-S P% O++,,IwIQ&*Q IQI$OXL OXOXJ)/@K@39K@W) +=?'+7G.G.;&4{YP;&.MV{{;& 5V, 5; 5RW-P=V#b) P9D=?6)\D WV,%NGv+%1Vx VVoXC`$)a07S<iMC'|CiMEGM+<1'| V VTKG".8 PHK6R\##0-Q f21-?%g? I f-??e!K%UL?X7-@1s-K^-SI 6S$S)AF.Y4A3#"47m%S%"4FF_F#` ` `"4F"\F-/$ R1-/N5-/ H :BDr ).oUP#PBsT> 75's KJsH H5*W# <FK+ H7H HHYjI%~D +D.EC, EE8; . By: S!N4:Y(#:(#(#!)I"I"J;J/ AT< D/ =vN=v5*X7/ As(#h G/ A B'C C>@26_Q)p$9:)`QOIO:lQ(OO(!W@U/TA)/T)A,CWo!.,E+W&{.@> f2&{(#=(#MI f&{/3(#(#/3e8;wCl88N,%C9N,O<YN,9O+B<_D&F~6 NKTJhD.CNJhY=*+ M N e& )G MH NP NT1^?R$,[l96;l C N:CWK~ C$];. C!ORCK95[#|IjH53(/10&J0&90&UCSN@+MS@+@+F<P#v9DCGDA5!A DA&S#>Hvd Q'U&pHxeL HxF+HxK] EO&p NY:kJT""v;T X \4B+K>$YF2(gR?(NC=%@R{2GR Y+C.FVF mRC.R KKVF4%#=-*MK\/}K\U8K\SH'SHMSHX(N?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5:(@@1)N.4U@\BLLYW=0%~D +3EeC, EE 7FRF< ;NKIR,8 ; %MZ %=;, ; % %8ByO/pQ[/p/pO0Q3009Hh"Z&" ; GSxN W: 7CI-,tSxACq* > S8^p4@"I"I;U1,z>M}K!KK;4+DX;4JpJp u";41EE7/dE7?bB6@JJ)S1?XJJ0O*W*Q*$mWsMF0&"CURl&Z&d"[0+>)o'<M5A[%L[,FM$ 4OPT2F DB;QF4BS Mc&K*4V4K*aT].F" M4C 0AE=*H3Rl0A1W) 8`N3Jp/Jp&i1W6R\1W1G"3[6n"3EL0IU@3W8Y8@Q : E() W} ELQK,dcE( %AAL#!]@.L H3WE( %Y-!]3Kt8RZNXC=%W G9 4(:KVF m9 ).9 KKVF4%#=-*>QAd Q'O&pHxJ#7A HxF7HxK] EO&p NY 1AQ<=?p&],H- {2-$&0 nARSj#@ nA\Au(rs0 1 A.q #.q+- K-L f21-?? I f-4??4e,"<LP<"<E++1K$1^!1^T1^+( 5V, 5;| 5AANV4-4Z7t-N(yX- HLVN%(k )OFA(k>)O&&$=KJH HO*W#6TE")?#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6LpO#3 -G5/**G33VT@N?O=%8B9Y] Y78G:(G]YOO:(5:(@@N4U@\BLLS,F$8/HW,t._NI-:Lzo ^YN)J)J/#Lz DOLzS^N8^ p6+?c `!6.OtL7IN*S*Wq3@R34Y iHQJiY Kv FFpG .W G ="%.8: #. 
%&E .SA;*OO"4CQ)(#(#((#(#KW:B6K*;-K?1\X$33X?X$ ( (X$D$9F 5$ J{D$NK/5hUnX^UnRUnL&cQ4$%7m"EC$F_F#`$F"\F7: : zDahDa":Da=.N?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5XH:(@@/5hN.4U@\IJBLLALlTs'E=^YE='!E= 4KFS H 4N7IvN7*] 4=N7N7=Nd%dJd65Rf6,6( -BCjB]%%M!L-*#N7Jn7mQkJnF"IF_F#` ` `JnFFQ<=$&RL} nFl-/$ R1-/N5-/'-'-) $X f2S5s $(#>(# I f $'-(#(#e4 &{.@> f2&{(#(#MI f&{/3(#(#/3e0RP> f2,1(#(#M%WD$I f'-(#(#heRA0E)R7N7Y\R77LGCG 7UC7UG!ZOCHJ@@j 1AQ%$&<=Y,52-}$&ARW*5 #@ SA\AI,?5 15WNQa)P8P.Y;FF%sW939''/[ Yi5IQY"`YXD+L"aN*H(>Mx '-8-GK@AOVQGDGMFQ4MAOSSK(m%AOG8L@oQ:4KZEGTv4j4M[:5(GO&X 5QQ5<~QQ6"cT&XC"cPXLUPmmP&),EhJ'JZI A#">6 2A#CA#2QM 9@$Ct 8Gq)Q7! T\ 6 M Q&! 6CXCX/;7,%?@\? BFNV1I@\!d@SB7?b!dW-!dA977/W7K;6K@3K@W)5<?'2<A;HK8(#<0T q90,2J23BQ;UpHD (BYPA(4W68QJ(#4W Q[LSJAB(#?s! PPN*' n ELI@a # #I # #`$6 JX f 2)F5s (#"p'(# 2D$I f ;(#(#eDwMMDw+Dw; BC.C@C8*F8*128*56h.;-HL'LGzKL4UT,=U:Y&x6?5RWc.FF`"{.S'.6:~"{'"{U.FPH "\(+B89UV' <C-/$ 1 CR1-/>N5-/'-'-P P7"7%UP7ItH<It10g5eD2CiNKD2BBD24 44, 5M(#RY(#=XQUO4CQ)(#(#((#(#22*%,'59nR{$)J )%,J37J:JSJ!=dJY>"tN1%)%lAaN1"Lp$KPK1 3"V/& r@'-RmhTS>LH<A6LH+8MkHMkB%6(FO (F(FM=I{N?O=88B9Y] 9Y78G:(G(sWLGOG6:(5:(@@%jNL4U@\BLLN7:N75 Ia3.6/S90S(Z6/6/'C.(ZH?3rDd$3r"DD3r)YLB>#&Ue** 6A S7/$6u,5):/|!p 7,%?b@\? BFN:1I@\!dMB, A|:#?#$!dW-!dA7:/W7K;6v /FkLcF%1RW2PU)3MI&Z@(V08I5A Dv%,O8FI$ %4O$%7mEC$F_F#`$F"\F(/Ln3(/>S(F" F-r.QHJ3w/\/\3wz-R&8RNU OD7n9 y 14<*Z4JK5,'CR#Hh=%YG B O CG$ m B*. 
B KK4%#=-* w$c&?N.w(G(Gw8@@"'B&P>>P:k ?+%CkX:+3!YSh5J!Afh!FLH/6/q8Cx |KoO"R%>^M):LWoIO"/QgO"CLH:LO ] P(' A%BP(=8F=P(==NP/b/ X(iz6 SHe, S"eS?6P:A6>4a?Km@!aaTh Z;::|%Ba|;)SA1 A1 &?'zK2/1 -0nFV?QFHbFGG8wG&L.2\A=2\N52\'-hR3%y% iHQJiY ) FFpSG .#T G#TFO6 2U-W$J6NKUJhJ6L^L*uQ.D3=TJ6 HQ.6$?GHA&w_&n7*# # O# M p!:3z'O?WjB/|Tg>6u,5):/|!p 1 n11C5%&V8M8M8M8M8CCA%B NB*NBVN(k(k>)&K.gLUV>>> <#(5#0#V/(#L"@F1 0O@+Q.m \d3s(#(#M3s/3(#(#/3'R!0I+'Tv%?>?@\4B;<O>ANgF,9@>NJ*5@HM2>48!hG"">2AAT=)H5~"7TLVL TLL>n36/>nA"A">nA"6eeMQ!i(V mS 1@/*C<)C0 JO M6PCJO-%YJOR1PS P O+3W8Y8@Q : E() W} ELCK0cE(AAL#!]L y+3WE( %Y-3RF5U   4!OLp;4+DX;4JpJpR;4 8;1X1VB}JT!?+/KqS!/*RRJRRAR"gMI 5BM4)ME};Y0@E BL+U !UBL: ,R)ToEP%. f ".WE(#(#M%WD$I fE;'-(#(#he-+CGO!EQ)p$92)`QOCIO:lQ(OO(HN?O=%8:B9Y] DSFaY K8GK:(G'e/ ^FONF5O:(5!:(@@1)N ^4U@\'-IJKfL-M0A+'QW'H0=q-L$/=0&=3--LE1%/N=Jl-:TPAUWo'TPN5TvTP4'-h# 6V{{7,%?b@\? BFN:1I@\!d yB, A|:#?#L!dW-!dAXL87:/W7K;6IYKI+IJM88 ++%g%g)6#:U?'^8<?""8?m 7m mF_F#`m$FF$K5,'8CR#H=%YG BA C.)r"p m"p# B*.) B KK)r4%#?=-*MSxSx61A/5 R|MI; z 05o zKez  P=V.#d8=I:P;Q* o TBQF"OZ 1TBPE~TB4m89p 15q2,TOY~/6Y~oY~Q KJOMN7(?H+ $+HH-;68TFO62U-W$/NJ%qD^^Q.DTD^ HQ6$2j4X+f: 20EuWC2|0 02|KL4%o(9&/.Y.WA;-W>8 (P(Jp(K' 5GGV, 5; 5}I Hb$J #8 EVk> I)J}9FFCY )PD 8?6)\P.WV,0%Gv%OCoN^?Co@rCo)>BR@?$C(@ $CP$C-Gr-?E-ERGd!n4$PK?Y#L$/:\'S5@;:\N2Do:\RM! nl 'K[DK[WwG:4AHlH#a(bb"HS*HTIO/OG> O=?!e%BAO=%a&ZSXM!M!M!XM!M!FQ%&%%9(?:BDr;4FB'=rTP7'T'#LF<PUPmBiPNh'Q@\@\BDD)D?!(U!(`&M(K6:Rb%$K1?FR3%y% iHQ0JiTY  FFpSG . GA[7A[L7LA[LQ!@C1"(""<3,?Q<=J.$&0R"JL}#@!`Au(rs0L} oJ&E:9F:JR#W3}M/JRWAJR2n11C5%"R<*Ps>L<*(#(#M<*-p/3(#(#/3EDE)FE -:=) U$VU$Z%U$8=!OK0FP!!0JM+B-/$ KR1-/V+N5-/'-$*W4A4Y i0 FM90TY=6, "# 3Y'=6?=6A 35 CP9 0JMDuDuM<.{Yq9Oq94!fa+Q/An1HRTK5,'K~CR#H>=%YG B!z C: ; ( m (R B*. B K+BK ;4%#=-*H6#1WE:9F:JRW3}M/JRWAJRRO<L(# ]DrW9F% :)zY#'-BGE[?I9TUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S<T } 5 {&g5 Q5O J8.C*CFkC9xL&LN?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5?|:(@@N64U@\BLLY6I%~D +.E+C, EE8; . 
ByY%6?#w;[w6:Rb$M$}R ww%%(k,   8O G MM88 8$<888.26'!7!i'!u'M>>=`>>BM!>>M!H?H?RA( A>7oW/X/bM/9AUkM+lOB:IUOH>H>5',0O%':'G3 ++G*`+=== %W $KX f2S5s $(#(# I f $/3(#)(#/3e Ry &07&TAy''9y'':C:+:H KN5N5A V/ 6Q6Q=] (o??4 GJ("h-?4%</R.U\%R=%Cn(9>;}G'L<YSKB2<@<C8PD 8?6)\PV.WV,0%.OGv %Oy2-NyAR?:(# ]Dr9#3 VU#. L'(#E * *.N@ISe2X3SeV/&Se-pNRw3tQ-ZRw;Rw4dR\EL#-/$ ER1-/C N5-/$V$.7mEC$F_F#`$$FF$CC/!R;O#O7OLc B?<???Q<=$&R3";/5h )T)TSD)TK5,'[CR#HPQ=%&G B^ CBB2PQ> m> B*. B KKPQ4%#T-*1Z-1Z# 1ZY:XJX f 2?GE5s(#'(# 2D$I fC2;(#I2(#e8GGP mS9 K.I 22*8%,'59nR{$)O( )%,JP37) O(GSG1J!=dJk;7 ;U;H/5 R|MI; z 05o zKez r\38c\\Vp{L6>{? ENE>Jn7n#7nN7NU% Y=v(JWX)Ps>X)(#(#MX)-p/3(#(#/3(}%bX&oI '&oB=B&oBGb,!n$P$/ t:-5?:\:- 1T[C/#?B8#CSBM#3~,Y<QI0 .W.9 >+.W*hWE# ?+E)E)8?N'N7(?+# RLLLg7Lg3^KJ26,RK#E'p"="2GZ:IUZH>5(&:3%0P)upMO6N*#$#$ATBBBBUVP ZP:;=SPM u u6 y35 :1 L5wAu- C>X@CC-7,%?@\? BFNV1I@\!d!eB7?!dW-b!dA77/W7K;6DAFqQQ AQX-@_3XXX:&Q%eTn%e6%e1(?%BA  e%8%6P?c `!6.Ot)"bH#(99FD05# J{#Bg/D'L*7JXTIG/ 21O60UvGWC2|0 0P:A62|>>4Xu)$X>%I X(=X-rXUBR\8>9E)J !N#1)7~DP?}5;1Q#u%Q4QQP>BW>d 'Q'&pHx3$ HxFHxK] EO&p NYR; f2@N,;??? %=I f;??eE8'Q)9F:QJR"!3}M/JRWASJR b#bb/UK5,'[CR#H?=%YG B CG8? mN B*. B KK?4%#=-*7GO8 8 @@NG4U@\BLLM?5!Z %jUQ*'PtPt;:H;$&$&RsALALq9Oq9!QyP*%,'5%,37S55W 0>098B2*aj*A=*orK\8 =0!&4 ?0!0!C2WQ+U098*@j*A=*4!d3 :N1YdI3 3 RIL;.7P:7 ;>;0\JJ ])$vJIJGjGs(X BV5sGs(#'(# D$D$Gs;(#(#(N?O=8B9Y] VWY78G:(kG]%VWOC:(5:(@@NVW4U@\BLL HN(#75 ) 6 qP#CP32+LrS9+HDS (3 (4W68Lr(#24W Lr[LS?>K H(#(#?s! N*'4|Q)`QIOQ!n$P$LY3//08/%C%%%UX<U(#UE9 Y8 #7' :RY'/ JVC'd1`' E'9;1-YW=0%~D +R{3ESaeC, EE 9q,W BR4M%]D$VXTIO/ 2O0GWC2|0 02|;1XrFT3KB1LJLVXS >3WL U7L |3N7#}?YH:BDr;4=%FZB'CTP7''Xd#8 FP5MX3P5V/&P5 $KX f2S5s $(#'(# I f $[(#>(#[e+N$}R wRWPRNVKj5A&,KjXU`%I X(=CX+KIJ:'W $KX f2 S5s $(#E3(# =I f $/3(#(#/3e0:!Q3009HhHh2 ZP: xO$b>8;M$TSS 5U $T?IY<MUP.PH5dXODM2Q0%?WI EPPU'+; EPPT[ GB&V XR32ACHQ0JiT* 6FFpSG . 
GNU6UL  E8')9F:Q".JR!3}M/JRWAJR!G".WW, C6./66J6A$2VSiLtjj+RQ'K[D9J9RO'8":5K[H.G4~O4H(7Hl+O@] QjPY ?6Mx)\H'6|WV,0%Gv%I+'%OTv#G*Z:I8 ;NKI,8 ; % %=; ; % %'Y6'Y -'Y,jVNNHS1&F4&XDX XI3>lSTBN#}"`G,$F GR83{353O(~CJJ<J<(%/y:BDr:H>,BG=TRE@@=IJ=-;@41y-[}&/.Y.WA#/4A8<1R10 1AOB''''=&XDUV@x'2 ( -iT*Q;G9N:VT*QG?$y*Q@4m89p #?LC$kX5qY7 Y77Y772PJ%R w$,I'$$~'$7y SVoS<DC+<C VT2@c&2$72MKa Y4 t  H :BDr ).oUP#=%PBsBT> 75's KJsH H5*W#Tk?!(U M@F !(;&M1>aY/;%&;Y/EXF3 J8 3YS?/qCPv 1S?qJJK,=M8N} $~:3z'O?WjB/|"Tg>6u,5):/|!p M 9Q7$Ct 8)Q7 U0T\ 6 M 6+$A\/f </ff/f@"I<Gy;X,X18TFSF+1LJ kLVCKCW kL UN7,L@H9S],X,N7 1X,CuF^ $LFX f2W5s$L(#'(# I f$L[(#>(#[eV)Hh;X,X12:VM5=U27SgR)1$:=O34 :3-:>$$ YNERV%NWY.GAOHS*\WYFFAOSSIJAO-;WY41y-[FLA 9i6 ]LAffLA"U-,' ' ('93@9V9- K.\w4T7-.\$ ?JK@K@W)7=uUlX2Q~3\QUl3" F/*+@0/<)/p & K/ /K# iM#7UA7*#77$JX(T8)/F FA;HK882<0T]0W2Q);W]"pY"pFQJ? QA]5 C?P9X>%I% X(=FX%c[6{*NB:{H*NDoT6Do;*NDoDoJ)JWJ# 1H `Bq7m1| `@BqF_F#` ` `BqFFJ=X=JiJK $KX f2S5s $(#'(# I f $[(#(#[e GwJWGw50//UkR3|?uHQ0Ji| `?FFpG . G) O.S/ 9K5,'CR#HC=%YG B@, CGPBVF m[ B*.4b B KVKVF4%#=-* HQ GE3PY+3DI%~D +0.E?-C, EE8; . By/>)6# 3 X2`UPY H4(Mx-'-HMGK@PV6| G8SMFBH4.OMP}.OK(m%PG"M2BH:O4KZEG.O4j4M[$VSDDq?Oq9V!Np)4 (E(# ['C. H%o(9HJ*'?O/p%Q[/p/pA%O >ONV~QmONON;UK5,'%:CR#>I=%?GK BV CW5F mNF5" B*.E B K\YK4%#'-IJ@`-*Cz[-svN PO62J '[J7*@ (Np%"={MKWWX "R2CWAKM& JW?7""K =D;$$100K <X!5s (#B'(# [(#(#[$"3r'C CTqw/G-PW{5 ?6Mx)\e'%WV,0%Gv%I+'%OTvUl ~NV$-0 BNo@.Vw4lU2UZ/GRn(yU2=24'<"<lUlRn$$QUU3#$ NUtO.#1G19114 ]w$BgO"wU*(!0(0/ UY"U0/1NU6UL H :u WQ<=$&0R.'f#@!`Au(rs0'f &u'&uYh&u'@?g)D%DO.D'Q CXdQd77@cd77KYN#*S; c.K *W03/{ (#EAHnKTH".o89UUTL;,|'NuD'0Y#XK[A0m1G58F0mB0mZ+AF88I a/ a!u a:3,5 D5 &5 ,[l9l,NI;W?P CI;(NQc(N@(N bD8%Fr%7v%C+TXL8A.L=K3 O=K.a=K:G PTJG G @Q&>*j7`'dj17`7` ?' <C $  CS"~> mN5S"~-p'-'-T PT #fT %889DoO-/$ R1-/N5-/'-'-T? M[F%s=bR?:(#8Dr9#RC)zD'-BG3B BITYw2I) Y4 mPA@6<RR B*!(#K7<<3e B/LSNa6.5&KJ:(#4 &.O4jP4S<**'38pTk0SY M@ S; 1>aY/U;%&;Y/%6.54j =.4jGR w8FP(K5,'[CR#HPQ=%YG B C2PQ> m>? B*. 
B KTKPQ4%#=-* HLVN(k )P#FA(k>)X#S KJH H *W#XD{-%82(h(h7kRGd!n4$PK?Y'#L$/:\#B/:'5M!@;:\N2Do:\R'M! nl LNEE9 Y8 #7' :RY'/ JVC'd1' $?E'9;-RVN=?,GAOHS@v =AOSSIJAO-;=41y-[}{X!4=GA=G=GE57i@J '%#\U:Y&x6?5RWc.FF`"{KS'.6:G"{'"{U.FPH "\&' ;CeJ  @ N@: gQh::N  < ]Dr;4FB' TP7N.O'1'O %,%,37OlOlS2]23U]-w-wX=u?5KY9/%f>1jY9%S EPY ?6Mx)\H@6|,F30%.OGv%%O9A 2BN5;'-VhYTA8 . $X fS5s $(#(# I f $'-(#(#e) $X f2CrS5s $(#(# EI f $'-(#-*(#eEH.a  L CQJ#DKA@U/TA/TKo)Wo!.1,gzB"2HNK 7U:CY]<7]HYVp7UGKHJ!C}SGC ONHJHJ@@NG4U@\'-BLL7,%?b@\? BFNN1I@\!d=RBbN?L!dW-!dA7N/W7K;6P 11>X9]j MV7. B ( GwJW//bGwM+lO=BOH>4R}NPF[1]P7"7%UP77%J/EyAAWDc N P64Y iY %M%M-)):RN.'-V+K5,'[CR#HPQ=%YG BLL C2PQ> m>s B*. B KKPQ4%#=-*Q4QQVPz 7mBPzFF_F#`Pz$F"\F$EGEA7EVN;HSG4/> f2Q(#(#MI f(#(#eC"&\>b?B8RCS7=3LG4Y<)W KW_!*G4(G(GW*h"W?G4&U?P ~BPD%ZDPD&US0UUD$V#HFQE]B*LWF C<(}X(SvEuUo61 1~%fL0R2 f 2S+-2?%g?? %=I f2;??Ne&Q&6)M&k1l'(%(.8'(I[(>UIA WfI[)I[WN?O=88B9Y] 9Y78G:(G(sWLGOG6:(5:(@@NL4U@\=dBLLH3XcHw,A @Xc(o4'@lN(oK(o@l4}l%lLmI/8%DP8D :a/8)Y%%oPr$#D 5#K0#V/(#'-#NQeTLR?'(#I7C9#'TaN*S*3EE6O.D((L$(>("HV2hRGd!n$PK?Y#L$/:\'S5:\N2Do:\RM! nl {4*,2( {4+I4N1MN1V { {4N1N1" M4WE)=*H&( 0A 1W SG/ 1W6%1WT;"GWW;#U1  P>=<1P>r-5 %PW{5 ?6Mx)\e'WV,0%GvI+'%OTvXsL;Y1s-K^4OSF5 61S$S$;Y3HBENBEM!%M!#M!BEM!H?LC,A @7U(o!4'0@l(oK(o@lO-Gr-?-E HLVN(k )P#FA(k>)X#S KJ%H H *W#K5,'[CR#H@=%YG B CGA@ m(#s B*. B KK@4%#=-*kFz; l(.18XZI[3PIA WI[)I[WKXZW6~X-i0#2#X-X-9O4ZM9E}:Co?Co@CoE 6  -X K $ . LN5 -p'-'-NK 7UCY]<7]HYVp7UGHJ!C}SGOHJHJ@@NG4U@\BLL1Vx SVoC`/ }0<SM9m<" }C tME%MR1 }S P O+8O8}{;4F8}'rQP7F3.O''J0LQEE&rEEQ :&"C5wAu&Z-"C>X@CCK>QAd 'Q'OHx7A HxFHx S2]23U]-w LK-w:+$AK*=H+$>N>5S&;+$'-h!!xVYh HLVNS(#b )! qP#% (k2+S>) aK1HD (%3 #(4W68S(#S4W [LS?>K H(#?s! QM*'4|"PP% N;A;H (#%<0T q90,YJ23BQ"m UpHD (BYPA(4W68QJ(#4W Q[LSJAB(#?s! PPN*' n yI<LE> 87PLELE9H"("O"A;HK882<0T]0W2Q);W]"pY"pFQJ?QA]5 CP9$.7mEC$F_F#`$$FF$ "n'y=%?GK B:! 
C%'y> mN>5C B*.( B KK'y4%#'-.@`-*W5 :6W5%=#OW5EdEd>C H 8 ).oUP#Ps$> 75's KJsH H5*W#$%7m"EC$F_F#`"4$F"\F;9X!Gy!A^:9A^7M!A^R p& I 1AQ$&<=Y,2-}$&A!`R<#@A\AI, 1<WNQa>vX;Cy P7!>y y!;1X1TV?")?45Q)")(#(#(")(#(#:WP/w ?6Mx)\'UKiWV,0%Gv%I+'%OTvNMAP :BDr0U9Ec90 B<T, $Y'<1t?<A$5 CP9Q$&<=$&R495#@!`AI,508&* VfW*QWA*$mMMp%"={MKWX "D232OKM& GQJ?7& ""ddJd***3-8Y%6?% RTLAw`~"{~DO6:~"{F"{$M$}R wUO%FPF&%H*-(3XcHwXcE(/m?lC6O62J ^#) (P8NU OD7n9 y 14*Z4JE!E%ECtPi""Ps>(#(#M-p/3(#(#/322*By'599nR{ <%,.%J37EkkS%J!J8; .2kBy&5`/55`Gn5`;34ff$\<<2UPH F>A.HGP6|,NP}IJP-;41y-[}Y6I%~D +E: +C, E7E RLL*PT1T;&4{YP;&;&0p< ^< %E::0pS#0p967vUQJ ^&@XJM!M!XJHRM!M!HR6%Od%OV+%O3h87: ZM58 S#G MM88 8$<8882 RVN>?,GAO4HS@vQAOSSIJAO-;41y-[}JJ%E<;;%!]?!]9%!]!]O/pQ[/p/pAO+&&/1.Y9J:5 '81.8?AV</ 4X8Y78lZ^+ @] QjYnOOm7AOD9 ) C y& P7!y y!Ri H <(# ]Dr ).oJ qP#)zH B2+stT0OHDN (.3 (4W68s (#4W3es[LS?>K H.O(#?s! N*'4|'=IP#&,LJA/UFNh[p%s4JN3[pN68pIXcNh#NXI(G+R J<M-R 8vY#M\C;@=Cn(M9>;}XCkX:5,J2NK'-18C=NL C*4%NB*S)*NBLmTk?!(U M@F !(;&M1>aY/;%&;Y/0&{0> f2&{(#E3(#MI f&{'-(#(#eIPgIE5I/JoSE2AJ=,22;EP ?6)\H6|WV,0%Gv%O3RB$i$i 1"-P`><JX f 2w5s><(#&'(# 2D$I f><C2;(#(#e>ANgF,9@>NS5XC>48!hT W'XW2K;1K>%DW:A wT@:N5\I:4As(#h+ M N)G MH NP N vBI=m8B %MZ %=;B % %B'g"BBE^EQ)p$92)`QOCIO:l# Q(OO(:A wT@:NRx5(:4As(#hS $PXS5s $(#'(# $[(#(#[ 0 $X f( S5s $(#E3(# I f $'-(#(#e-V,-V6J-VTh(XPJ4 XP&=XPQ$ $KX f2S5s $(#&'(# I f $[(#(#[eI8PPPPKLN'-8r78rM8rGDyN'N'9BN'" M40AE=*HRl0A1W J1DW?/C 1W6R\1W"W?[6n"3ELKXZH&-<.E7/d):U94P?E7??BW+ 4)S?7?4s.,Xk t 'r%I!wIQ&*Q IQV#M$MO4UV#V#:g :V68$O?Wj1MM1g>6uE)3#<AAS:5647d7dB=/29)=22YiK=D i= // XR34 iHQ0JiY FFpG . 
G QJ=NA>'Q&$9)`QOIO:lQOOXDV+bNHM(>Mx '--GK%AOHS&5GMF4%MAOSSK(m%AOG8L@o:4KZEGTv4j4M[::+$AK*=H+$>N>5S+$'-h RR*wHO.w(G(Gwi(G+R J<M-R8vY#M\@=Cn(M9>;}LW4>B%6S;(Q?%63%6R?'&C(#SCW9#C<M)zX'-GE[ BI CFBI)F3 G'4 m @6<RR B*!&C(#.O&C7<<3e B/LSNa6.5&KG:(#(#4 &.O4jR4S<H?-C,A @7UR8(o!4'0@l(oK(oSGR8@lO(l@*XW.W.W.W.RY0 ?s> f2X?(#(#MI f?-p/3(#(#/3eZ^ VVoSG V -+MG dG8w'G%?@\?@\BTG?!e!dAG5<SY+;1.BX:FXW Z5(OL(OJ(O GyR&& )q'8M.0..N *X 7n:9* L 14*Z4FLP?.F&[ SC==C">bKP'y=%?GK B C%'y> mN>5s B*.%P B KK'y4%#'-.@`-*Y*:2V!YGz1Y 6h;"x-H n-H_S=?+9V7DV7LcV7^C Ny P7!y yD.CN!Y5M%Yh@HGHAU:Y6?5RWc}FF`"{Rb+$6:"{'"{U$FPH "\ 45N'A 4@@<L7I 4FXt/BXEdEd>C"!KYN#*S; c.`K *9O3C[5`@@EAH0n=K`THH $X f2CLS5s $(#'(# I f $(#X(#e0fj)P4(`)CC)1}Y`'#M40AE0A KT;KW; h!E!!!!B!8,a)J)J"ZY6I%~D +R{.EF+C, EE8; . ByPP%GG<!><sRV8N 6?,GAOSHS@vG$9AOSSIJAO-;941y-[}7IB$`:KB?L 1F':K@0DuP- @OR@T@z8@F1r7I "cIS>Y 9"1^O'>ER&Z^'>X~'3 ;|I,8 ; %MZ %=; ;& % %&Ut.1G191-/$ R1-/=vN5-/*sS?Y*sU8*s!j'N&BIN&=m8B %MZ %=;&&B % %#7#Jf@@|@ @:%<6l8 2S?L> <JX f 2:X$J5s <(#'(# 2D$I f <;(#S N>HA(#eVDVN HSFS..EA2K%M%MdRO:L(# ]DrW9F% :)zY'-BGE[?ITUPYI)=  ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K :O(#4 &.O4jf4S<'*P>7-&RSNK[K[ |G-M4Y i'QW'HY=@ -L$/=0&"=3--LE1%/N=JV61$1H0M1T} T}N)T}62JN7,?96$NN6J:LSAT#BBBB=8 PH>FN P%!5%%XDVCN H(!Mx'-MGK@AOHSYI&GF3MF-q?N4MAOSS.OK(m%;AOG"M2-q:4KZEG.O4jK= 64M[$V-9P1U9,;&Q&6&H(X~5sH(#'(# D$D$H(#(#!,M!,!+@p"={MKW/X "O2XM%KM& fJ'E9tM?7""UC!G+!"U&UgRS o:I!BAS%R&!BN5!B'-hI2 !.X%CT=T%LTLp"={MKW/X "6W2XMKM& %JM?7""UQO~,&ACXO~M!M!XO~HRM!M!HR=<.==MM5 < <N'-'-F%1RWPU)3Ml&Z@V . 
M5ADv%,FM$ 4O.7A#.E .R?'(#SC9#%tC)z'-GE[ BI C.2I) Y4 m @6<RR B*!(#K7<<3e B/LS/nX@KY:(#4 &4jP4518 PH<}G\2A&ODABG\& P PyJO~ACO~M!O~C $8 PHC,t?$3_I-)G\2Lz&@>22LzNLzDABG\>2 P PX2`Y H4(Mx-'-HMGK@P($6| GF30MFBH4.OMP}.OK(m%(PG"M2BHO4KZEG.O4j4M[$V:X:LX f22I2 5s:(#Q '(# 2D$I f:C2(#(#I#e7IB$`:KB?L 1F':K@ruP- @OR@X_@z86j@F7I "cIS8>Y N b+%g?%g)8>5 D5 "&""5 "]&kC8 =0!4 ?0!0!C2>QAd 'Q'O&pHx 7A HxFHxK] EO&p NY&>996 Z;::|%Ba|;SA1 A1 ?'K1 -0n/GVjN MII-;SxG\zA05o zKezDABG\  P P ) C y P7!y y!<OWI34<D8W % %MZ %=;&&W %!];u %# +UN=8 <B98S GFN'-'--PUP ?6Mx)\H'_6|WV,%Gv%I+'%TvMX - MX'&'-MX'8?,8\8X.6#1W<EdEd7Q s#!LQ#!8#!M=)-!UK?!(U5R1}F/!(`%$=X&M:$6:_%$ '%$U$FPH "\WDc NN %KP~D22V+bN9nR{$)( <JHSJ PFS!J2PH> f2&GM,.(#(#M%WD$I f.'-(#(#heR?'I7C9#,5(4EN$/@6&63Q-NE1%/N=J>5=U)127SgR)1: 34 :-:U1APf/T/G) qWo!.0.!w/Gq11$R)E8')9F:Q".JR!3}M/JRWAJR!G".WW,:HK u?O?Wj uyg>6uIp>  9YOoOoOo)f4+!K* 1c1cGIFG$9GIFOIO:lGIGIFOO22 (P(Jp(:'('(U !-{RV%:NFE0GAOHSTFFNF5AOSSIJAO-;F41y-['-fGR"OAMBLON5OhDy5T5T5TPUPH ?6)\H W6|WV,%Gv%0mPKA$$$$'8+O%'y'OBI)*=m8B % %MZ %=;&&B % %#& R{g2g`7gG"On YG"}38>G"W#+!,A1f+/x3'9N/)FM-&N@EEY&3*4KA.@KKA5:=KA8*4??O#yNA0U#yY#yR :&"C5wAu-"C-$>X@C'C,O5@,I&SI&MI&XjP>>3>m.$ bp"={MKW/X "O2XM%KM& J9tM?7""AA}AA;HK852<0Tt0 2QS;YN&ElYQJ?QAEl5 C89BUNN)DB)R)F?)X&F6%]%?%%< ]Dr;4FB@'GTP7N'#'1 0O@O+Q.m:3AK*y>3C NC 59a3'-h=/$.P /Y=?6Mx)\DWV,0%Gv%%O 3Gd ZP:;=S#P:(G+R J<5-R 8vJ$5V+\V+&@=C*Cn_1(59=>;}K:B6K*KR&{.@> f2&{(#(#MI f&{/3(#(#/3e+@J -,jLH/6/q8Cx |4KoO"Q>^M):LWoO"/QO"CLH:LO ] J7*Ye']'j M,,?8<?"?P25n?6Mx)\e:WV,0%Gv%%OS$SO.S'/ q0NEOdO7A7B`7'cG-7e=8 <B9DG$ 8G&FON5C-p'-'-Q<=$&GlRB"BBE^N},O;BQ8N}O$ $X f28S5s $(#'(# I f $(#(#e" P$H f '&G+ (#(#M%WD$I f ;'-(#(#heT2F F@ I5Y/YP@P PQ$&<=$&*R N*S*3%}EnE93W8Y8@Q : &j) W} ELK-mD]&jA L#!]L 3W&j %Y-36 |3IVN(k(k2>)4M!& C'v"Q("">#&2WF,LBm4nFp-%s"[ <43pN6LpO#4-G5K:B:DrRF=%6xBGK?TY*/V> mN5?6.? KKV4%#'-=-*}Ny/W*+@0/</IM & KG/ /+<K V VT2PHdG&/.Y'.A)T4ATA)&{0> fO&{(#(#MI f&{'-(#-*(#e+R RA8vV+J T+sI$ ,+=N5ETZ**%#3+%6N*Ks7>3 VT7F L * *FN@Ip=4w8 mp'p 0J?Sb=]:BDrBGTTOFc mS? 
KFcR 6.(Hl,O*6PCNKJAGD.CNJY-CVYh" :&"C5wAu&Z-R"C%>X@CC( A/7oW//b/YUkM+lOBG7O-/?0KAN-;.;.I>-;.;.QMSH(6BC$ <GE$C$& =& C$& & ."~IG""?6aO8 /16l$K=PK8]3=V/& r@='-Rmh#N7H7m HIHFF_F#` ` `HFF 4A 4@ 4T!u6H4OR:HH<K5,'CR#H7=%YG B< CG7 mS B*. B KK 4%#=-*A;HK82<0T90W2Q7O;.KF 3Y8QJ?QA 35 CP9 RGd!n$PK?YO#L$/:\/6O54:\N2Do:\ 1xRM! nl RM B>K?KSIO#L"B/;K aPG'5;K-Do;KRM! nl ISIN7bN7IN7VDJ,'H1MC*>AV-NEF; ,NE$NEH9S],,X,X,F^4[7P4['4['P DNV4-4ZF VV3XcHwXc%@o/}R>RGd!n$PK?Y#L$/:\S5:\N2Do:\RM! nl PDbBaBaL& L&2L&,E]%^E]-FQ-E]-U0l5E5EPW{5 ?6Mx)\e'%WV,0%Gv%I+'%OTv" M40AE=*H7Rl0A1W J17/1W6R\1W"7[6n"3ELWWS)@"# VFF WrBJ]Wr=dEe=d Wr=d=d:YAKP9 YFNF5 ,Y'-h-/R1-/NR-/A;HK882<0T0W2Q>u;.KBY'(#!QJ?QA5 CP9 HLVN%(k )OFA(k1>)O&&TKJY[H HO*WKJ#K5,'CR#HC=%YG B@, CGPBVF m[ B*. B KKVF4%#=-*PY ?6)\H@=6|,F30%.OGv%3%OY6?"I:-/$ R1-/N5-/'-L $PYS $(#'(# $(#.G)8<+Dr;4)zFB@'TP7N''1 0O@O+Q.mERYR+L,RY;RY  YN YX 7n:97*S? L 14*Zv 1S?4JJK,R RR=/O1 :a/#&3',LJWANh[p[Xqa E3[pN6pIXcNh#EXIIzU6/5 7IzZIz-L f21-?? I f-4??4e#&LM40AE0A Jp HLVN5(k )/zFA(k(>)X#_?1bKJH H?1*W#UVUUgBR@J 9JPJaNOv/^ HLVN%(k )OFA(k1>)O&&TKJ.H0' HO*WKJ#YeY7YY$zJC\@1C\=C\P00.900009xR?'(#SC9#JC)z'-GE[ BI C'KI) ,4 m @6<RR B*!(#K7<<3e B/LSHJk& IK,:(#4 &+Q4jP4-q?Oq 9*4! 3D7?8*4??O;1X1"VB} [HZV>HZH<S4>| R$Q#iaO9+ #ir#i:W B>098B*aj*A=H* EDRKLR&AGL &L.L(LP`.<JX f2,5s.<(#'(# 2D$I f.<C2(# #(#e* 7U:CJ)|6-Y7UG"H!V! FN5"UIJ"-; 41y-['-})JO~,&)ACXO~M!M!X O~HRM!M!HR>:XJX f 2 GE5s(#'(# 2D$I fC2;(##I2(#e?\CXCN5 CX0K@3E!L=#\@wK@2W)95$Bc?22D80BcSY6 )%:1G*15%.(Z6/(Z.E7/d):U9NP?E7??1BWW45)S?7?5s.,Xk9WC/9==9${"Z|G GzL BY4 t OT3DU1<JX f2] 5s<(#'(# 2D$I f<(#(#eI=)'I R)F?)8,a)JXTIG/ 21O60GWC2|t0 0P:A62|>>4?!(U!(+&MK0 $X fS5s $(#E3(# I f $'-(#(#eK;=c<~.)qO,YB    w wN.w(G(G?wh/;E)CS-KI0eI&IX> E%I4 X(=F'+; EX%c[ Z/* TBC;1JORBTFO62U-W$/NJ%qD^^QTD^M5BM4)ME}W& WDW4t.H$:S$)$S$9HhHh... 0 . ?X//// ]w$w*T0/!.0/<@ ;VX2`Y H4(Mx-'-HMGK@P6| GF30MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$VM 9@$Ct 8)Q7! T\ 6  M Q&! 
6CXCX/;2+C2(&2LH/6/q8Cx |4KoO">^M):LCWoO"/QO"CLH:LO ] 1cA,k2%A/Xq&/V/LHRU/TA64gJx;/TKo1/ )D<,JCWoCE1Q1C sLHJO ] -l'-NM+'--7&t5M=2l&x &x&xW2B{A4V062 @WJb2mK82<2;XT X(=IXk1l'(%(.8'(I[UIA WfI[)I[W{L6*"N*> P k ARO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)@N %4.O m @6<RR?6!S(#.OKS7<<3e(?/LSNa6.5&K O(#4 &.O4j 4S<CCC"u'HS=v<./)F<.(G<. 9+++*O$U7#1)4-(`)COCDU)*=@G JPK5,'[CR#HPQ=%YG BV C2PQ> m>T B*. B KBSKPQ4%#.=-* * 6l6l4e FXO$hDX X=1$ $KX f22S5s $(#'(# I f $[(#(#[eMQ!0i Q($ :1G*15%R; f 2$@N,;??? %=I f;;??Ne*%#3+%6N*Ks7>3 /aVT7F L * =u*FN@I0PP> f2TE(#(#M%WD$I f'-(#K)(#he-/$ .R1-/LN5-/ C)G?H#ab"HS*H(cOL\TUO&&O $XNS5s $(#'(# / $[(#(#[ L@E UQJB?(%MHP/b/ X9A BN5'-h3M(k(k>)11 4GR8W:-/$ K/R1-/N5-/'-LN?O=8B9Y] HY78G:( G]GO:(5:(@@NG4U@\BLL8q/.&Is0=8B9;8G@GOC.MY00(u LDGG B;&,R`P;&;&3P)tS?OP?*?VyP??j7`M"Z*`6j17`7`9J^(.wm.w4a4R@(yX=TR@UXC:9%W!<W;!"("/.!"*E*NAMYVK2XMo%#T=X2XT` 6 TW2W26MY1 [;>>>.BR@$C@ $CP$C6F OJ6Fy6F HLVN5(k )/zFA(k7>)X#_?1bKJH H?1*W#Ma ~4hLH/6/q8Cx |KoO"Q>^M):LWoO"/QO"CLH:LO ] """:W303G2e9)4W-;)WX)Ps>X)(#(#MX)l/3(#(#/3K<FX f2T5s<(#J'(# I f<"B[(#(#[e=g R?'2v(#SCW9#?M)z Q'- GE[ BI l CKI)8S GN4 m @6<RR B*!K`2v(#.OK2v7<<3e B/LSNa6.5&KG:O(#4 &.O4j;4S<LbR"*- O5555 J8BI JK5 JX2`Y H4(Mx-'-HMGK@P($6| GF30MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$V/G:O0.!w/G1$R+'/E;+6N*8"->CWNX;.N!ORCK95[#|IED 5GV, 5Gz 5??4 G?4/,,<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB7?,B5K"`*.h $0F(@QQ%x5Fr%7v%8u;(#4-RGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\5;R M! 
nl %R wR;PR'4RC8+Q&$9)`QOKO:lQOO(P<4(%%w6"wVwS>2h426?X8H)G?X&3*4KA.@K KA5KA8*4??O 1AQ$&<=Y,2-}$&ACR<#@%A\=AI,W 1<WNQaTFO62U-W$/N#)J%qD^^ Q.DTD^ HQ6$+B=.]E EUECN%Q1/NR1/H11/3394=3<3RO<S(# ]DrW9F;:)zYg#'- BGE[?ITPYI)@N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<>>>3kE)CS  K.`WE# `E)E)8*E*NAMYV HK2XLVMo%#TX2XT` 6 TW2W26$DBD )6'="E ( *-@ (K $X3S5s $(#/\'(# Qn $[(#(#[K82<J2;Rr"pY7ORrA9$<*9t-&'E9t{I^Eb{_{2XW87-6#6F620O"G< ]Dr;4)zFB'KTQP7F3N.O'3e'OD)yj?Go8X f 53 GoAe/2'GI fGo;'-7Xk2 he<XDVOCN H=(!Mx'-MGK@AOHSQ&GF3MF-q84MAOSSO.OKO(m%AOG"M2-q:4KZEG.O4j;4M[$V>}>>lEWCM5l" / r;4p3E  VNGHS4AO-;0?J8X f p w JAe/2'GI fJ;'-2 heF%1RWPU)3Ml&Z@&V . M5ADv%2,FM$ 4O 7Q+&PN?O=8B9Y] Y78G:(G]YO:(5:(@@N4U@\BLL U9N%OG d%OV+V+?F%ON?O=8B9Y] HY78G:( G]GO:(5+:(@@LL7NG4U@\BLL{;-{{T PT ##U7T H!  K5,'[CR#HPQ=%YG BA C2PQ> m>C B*. B KKPQ4%#=-* ;JpFI[Su/!TJI$mMMG%6S;Q?%63%6X1q1V1 <@0M3@0V/&@0-p*%#3+%6N*Q27>3 P$;;Q2Q LQ & ** ?*Q2N@IE*AMMoBWp)S1Wpp"=""2:}6W}d M9+BAc K@I:C:; =:E9 Y8 #7' :RY'E/ JVC 'd1' >KsE'9;-K5,'[CR#HPQ=%YG B C2PQ> m>? B*. B KKPQ4%#=-*2 ZP: xO$:H;M$TS/6  $T?4IY<MP.PH5dX ^&'X M!M!X HRM!M!HRP2K%MDB;Q%M4>Mc&K*%4V4K*W-$ %'Ce, F%1RW2PU)3MI&Z1_:VXE8I5A Dv%,FI$ TuO=U)1)138  L'V1)+GIFG$9'%GIFOIO:lGIGIFOO|DCs4OY/?1n4O(F5:$9Ui6H?I:|Ui+HQ27nKM! ADoDo9[DoDo!I"AF&/.Y.TdAT" M4C 0AE=*H3Y0A1W' ?N3Jp/Jp&i1W6R\1W"3[6n48ELF`#!F`LLF`OmV 5V, 5; 5AAB?U7YH8<?""#8?O./ ,'H1MC*>AFV-NEF; ,,NE$NEH9S],,X,X,F^3W)_j3P3WUBB,!uO^Y$? @>BAVh  ,,XK#7\) 8W`A:?`N5`Yi'-hL^O/03J2n( ^V32D4o&F)32KJ,KJ'32KJKJ?M I[<2* 7U:CJ)|>"7UG"Bw!V.FN"UIJ"-;.41y-['-}BEBE%M!BEQUtNMw+Gttt<%2#D 5#Hr0#I(J#.?tCo15|?Co@@KCo5C[8XS@ DUAT: $X f2S5s $(#(# I f $'-(#(#e15 R|M C ?U7YH8<?""?@,r!).!!JQL Fy07{&TAy'G'9y''X)1)137VqVqW&{3>&{(#(#M&{/3(#(#/3O(I%4 (I(I-=EP35'R7'Bt3'S9-U $/'&'3- E1%/N=J8 PH<}G\PE2A&OLzDABG\& P P ++%g%gK 'N b+%g?%g;0E;0*;0, (4I1 }&11;UJ? 
*JC)>CJC?~>7?~B}N7B};?~5=U27SgR)1$:34 ::>$$ Y/y:B:Dr:H>,BG=TRE@@FN=IJ=-;@41y-['-} -S%J!J8; .2>ByPY ?6Mx)\H6|,F30%Gv%OY+=3Y++ M N MH NP NBCQgRW-PPV -r3BRX8>9E)JG dG8wGN?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5:(@@N64U@\BLL pT pSD pKP94PSPTk?L+U M@ !(w;&M1>aY/;%&;$M$}R wY/%%(0aUX6?50a066}0aj?38X7 f 6 +W 3Ae/2'GI f3;'-2 he SD@ PP?%u( /XGU Ae/ !s!sH1(3 QV n3)X3;>>%%oPr$ B ; .((' wBKA'FA}6AJf@@|@:%+/3Wj3P3WUCIQT W'X+@/W/N& KG/ /K VW2/dE7BW2* 7UCJ)|>Y"7UG"!V."UIJ"-;.41y-[}8*8*18*"v)5V?\N5CO$1kGqN52<  2 YS@"tN1%@) %lAa"Lp4!KOf< 2 6I9nH{ <9D+G5-S9D*a=d9D -)kBi-=t O=t7v(=tX:&Q*Ul ~NV4 4-zB' 4.F 42M9Z" 4N(yN?2=*`,2=TkUl 41:*`@A Gc&"C">!P5A&PVHBBBBV,AP :BDr0U9Ec9=% B<CT, $"pY<1t?<A$5 CP9;2 Z3C2(&I25K?KSIO#L"B/;KaPG'75;K-Do;KRM! nl ,x%-/$ KR1-/N5-/'-&; %&; >d Q'&pHx<&$ HxFHxK] EO&p NY9D=D~<]<]B@B@+/B@;GNS/NSx'NSv<.\<.(G<.i%M%MG!0Y!0)N7!0&C$ -D!W-M4Y i'QW0HY=Q -L$/=0&=3--LE1%/N=J/&P 9Q7$Q7T 1bGDM01b1b!jBQ#BO#j;ejJj?0Y4U:/|W@G1@4$Bp7mEC$F_F#`$FFNWMScW Z&;::|%G&a|; SOZG&" "?'KG& -0nQ#Q AQ6(XDVCN H;(!Mx'-MGK@AOHS&GF3MF-q)4<2MAOSS.OK(m%;AOG"M2-q:4KZEG.O4j 64M[$VNR?L>NX f2Nt215s(#>{'(# 2D$I f(#(#<eQ#880TEfEfG\DABG\ PV*@j%$X  f> e% Ae*OI f%'-e1{ ND)R)F?) #W$HLW$/W$;&5$:S$)$9HhHh4 c4>>g4Y7CuI%~D +e.EC, EE8; . ByE!E%EK@39K@W) +=?'+9 v?E.))))1$6FX f2#5s6(#"p'(# I f6[(#(#[e'CG7 C?K mS B KKC58*%,'5;%,37B2[GSk2[% Q.$9)`QOIO:lQOO8 = XL C2/j&=J o))&s(`)CC)85yeQ&$9)`QOCIO:lQOO&Q#VO&&;DDk$`B:K u.C2NWEdEdJ"V?9"Hb"0L,B'^H?8O?@%3&$yE Q%OG d%OV+V+VO0 (%O;i(E1!V7EXX&JN-W4R3%y% iHQJiY FFpSG . G TELD%uLG9LD_C~31D(D31JvG H7{7{7{7{ V   1<:yBeGO~,&ACXO~M!M!XO~HRM!M!HR;~=K% H :BDr ).oUP#=%PBs,~T> 75s KJsH H5*W#-&FLE#`DE)FEW303)ejyL dC ?FWh-/$ KR1-/N5-/'-9_&H;eeneG!S=G!{/tE5C^P3S?(xOP?*?VyC^=P??SBI*=m8B % %MZ %=;&&B % %:I*YM(WJ$R<U:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{B<UFPH' "\&{P&{(#(#$>&{(#J;M8PNFiJ''J'')QW:%?>?J;@\6B>L?!eP:A6L>>4$>$$ YW303e#7%q##UCNQv: wG5$67JX f2Q5s7(#'(# 2D$I f7(#9(#eY+3DI%~D +R{0.EA?-C, EE8; . 
By&'#Bg8 'HV-CrB K(qFV-LJKh,,WIvKh.=7Kh |N7#}?Y z R+'/E;+N8"->CWNX;.N!ORCK95[#|IX** LP,,|<592[/I W &[>]+zPA(W 6!3>]Dz?8>]:0|W +?HQ!K%UL?/GSxN MI;SxzDA05o zKez # /FkB @yEQ)p$9)`QOIO:lQ(O O( B<MOu".*.WG4.QE)CS-K(N*S*'n3%}EnE1Vx SVoC`/0<SML<DPCME%/MR1PS P O+RZX?e5n H:K(&^Mx!c'-eMGK@=v>GMFMj4.OM=.OK(m%=G"M2Mj:4KZEG.O4j4M[$V 0O@>)J***3K{ V3W8Y@Q : Ki) W} ELGK-mVS%AL#!]8L 3W% %Y-3+x.oGW $KX f2;S5s $(#Q '(# =qI f $[(#(#[e-`-`HX>  NE  1X%$8HW,t._CI-:Lz%*+0o/Lz LzSO/08^pR&{.@> f2 &{(#>(#MI f&{/3(#(#/3e(# YP|<592[/I W &[>]GPA(W 6>]Dz?>]:Y|W +HQX8H:&COC;#93W8Y8@Q : E() W} ELCK0cE(AAL#!]L +3WE( %Y-3ADo 5:x2S/9Y=$DG8W/9$ eeneG!S=-;!Pp%lAaRGd!n4$PK?Y #L$/:\P6 757%:\N2Do<:\PR M! nlDo 0s-UQO~,&ACXO~M!M!XO~HRM! M!HRCFWFWYG9BKW:B6K-KVCV~VPeen?6)\eWV,%Gv%+RQ'K[D9J9RO'8":5K[H>G4~O4H(7Hl+O@] QjDRDDO+RV%:N7GKAO'*HS?IFNF5AOSSIJAAO-;41y-['-IJDGT,71RBE8+RB'&' 71.dRB''9[>[P.C6 3kR?'O(#SCW9#C{M)z'-GE[ BI CI)F3 G4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K:(#4 &.OR4jS~4S<&-U+~3W8Y@Q : Ki) W} ELRK-mVS%A8L#!]L B3W% %Y-3 Ac"2UfQ H :BDr ).oUP#PBsT> 75&s KJsH H5*W#KYN#*S; c.`K *S 3C[5`@@8zAHSnWFK`THH:8FN'-=@T6j -FX fWGU  Ae/I f '-e9[[ImP 6 >]:#EUVY@x?Sm7 -i*Q.%G."/5*QGe*Q#?^.$pS!N?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5:(@@|N64U@\BLLq15+(55X,5;5AAI f+AV&lK),,XK B 4S H 4N7B}IvN7*] 4N7N7UPH F>A.HGPA6|,FP}IJIP-;41y-[}GTN78~:(j7`M"Z6j17`7`9DFUt?HH?LC,A @7U(o%!4'0@l(oK%(o@lO%;%;.a%;UM 0W3!E!Ga,3K@". 2-YW)EM? 2*f2!G".0WW,.E7/d):U9NP?E7??1BWW45)S'?7?5s.,XkWSEWn'JZ>HK u? uyP yyXnx;HTFO6 2U-W$J6NDJhJ6L^L {Q.0DS8TJ6 HQ.6$@!;T XG-N7(?%B-0(_C U8j Ku8"TFO62U-W$NLJ%^Q.DT HQ6$>Od 'Q'1OHxJ7A HxFHx @-!q!:cDQ9OcC NC 5c'- CSK &W&FL -f2uE1=<1r27,%?b@\? 
BFN:1I@\!dMB, A|:#?#$!dW-!dA37:/W7K;6W2K(P Mc>9+B+BO~K&MDACXO~M!%M!XI3O~M!M!K82<2;RW-PYV-L*J0L%X?<+Dr H:K()z!c'-BMGK@=5*T;GF3NMFX4.OM=.OK(m3e=G"M2XO4KZEG.O4j4M[$VKT* ]w$w*#D 5#.0#Fq(# 9++++ G $2 0 $X f2S5s $(#(# I f $'-(#)(#eE$)>$$ YYF v8(W#28(#3/3*30B1* 1#R*&s> f2AJ*&(#>(#MI f*&V/3(#(#/3eN:5h Qv#";/5hQvOEQ;;SPD)M)M)M)M $ XS5s $(#(# $(#(#<)I3;)!M!*UI3M!%M!"I3:M!M!k'3;U-I^8;U % %MZ %=;;U& % %&'34-IY84 % %MZ %=;4W& % %&4J0LpYdKP9&M[,<M40AE0A C'7'/1W7'X>%I' X(=7X%c#RGd!n$PK?YO#L$/:\/6O54:\N2Do:\RM! nl JI: oP oF"@3WW0-C N#22A*'599nR{ <2.%J%+M;S%J!J8; .2;ByQ+U@?B8CS7=3L+Y<)WQ&KU&RW*h"YW?U&&U?J?u GI ?4FWA#I/6 2A#CA# CFWFW2==UA13K33.!Y P kB NS/NSxNS!< 9x LG):K5,'%CR#H=%YG B CK>~& m&9 B*.Q B K38pK4%#KJ=-*I: oP;Q* oTBF"OZ 1TBPTB15IJ4O#$V E e '+; E[ e LL(L'CJ C6"p mS6AHeQ&$9)`QOIO:lQOOI: oP;Q*7 oTBF"OZ 1TBPTB1XaA&:3z'O?Wj6B/|"Tg>6u,5):/|!p W6G**6""QQ AQ0LPE! 1@4T.Y?22T6Y< #@6%76I, 1<WNQa J"Ny &07/&TAy'G'9y''MmKZEGj:TM\ J6 * T T+88:M+k(7M+>N>5M+'-D(OUR{  W&{3>&{(#(#M&{/3(#(#/3 &{.@> f2&{(#(#MI f&{/3(#(#/3eNE// $"/?*??/?44..e.eUKPUJN7U:N5 Ia36/S90S(Z6/ 6/(Z=$3N7O4L7UA777Ul ~NV4 4-zB' 4.F 42=<Z" 4N(yNQ2=*`2=TUl 41:@A MN VT>21S.We6Qf1STB1S,Q"`*.h 65eQVd0Y:GW8 Ia6/A$0S(Z6/ 6/(Z HLVN(#H ) qP#&|(kH2E>)4=HD (L,E(4W68(#4W [LSS-'ITH HL,(#?s! HN*'V,'HV-C*>KABV-NEF; ,NE$NE, c ]w$w*T0/!.U0/YIBOI!I"8#D 5#K0#V/(#'-;GTX> E%I X(=B'+; EXT[ !JR!*!BK\K\U8K\$.K& P0J?V(* -r.QH{|/\/\|?\N5CN5? 
>&Y9VKTT 8l'VM+ kFz; l(.8XZI[PIA WfI[)I[WKXZW6~**Mh66-@M4U'82.'D6,V1=$/6&63-E1%/N=J dN.'L,P,F'N?O=8B9Y] HY78G:( G]GO:(5:(@@LL7NG4U@\BLLB 4S H 4N7IvN7*] 4N7?YN7 4'A 4@( 4: )W))G5N7/##iXU2DX 9X,X,GE*AM)1/NPE?J>Mo( 5)S75s.,Xk4:Q"1 4Uo4#4XX2` /Y= H4(&^Mx-J'-DMGK@P2G0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$VR?2-(# ]C(9#/M)z'-B-GE[ BIQTV>I)F3 9v?N4 m@6<RR B*!2-(#.OK2-7<<3e B/LSU=8 uSK9v:(#4 &>4j4/A=%g Z&;::|%G&a|; SOZG&" ";?'KG& -0nK5,K;KAA3 QV n3)XO<3;>>9=W/9=9 $X f2@S5s $(#(# 6I f $'-(#(#e8G!o E!o!o@:-ZY+C:+:HH>R 7j>|>tGttt<%W5LfO:6W5%$qW5N1G;UN5-/'-Nq.UL?RYR/iL,RY;RY Bz(Q~6Q9BzeCVCPCL-VR&,-V6J-VTh=oS2]2%3U]-w LK-w!RTJ[HRT+/RTUM, KXP XP&XPIQQ!6-BM=K;MM!BQBQRO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)N %4.O m @6<RR?6!S(#.OKS7<<3e80?/LSNa6.5&K OO(#4 &.O4j 4S<USRAQN(LU-D#>>MA6LHA;HK8 <0TQ q90,2EQ;0 (=YEQJ(# Q[!!0A=%o(9PPJ*'w LXPL EBq?Oq9>!?CDI < CDCD;mJv&M[=U)1)138 /-6KC-+A+-E9CH9(FA69I3M(k(k>)*9D=J)|>"DG"V.N"UIJ"-;.41y-[}J#wJF 0FFJF)1)I)WWGnWA>G  XHH $}R wOKg ;OKAoOKD}R?'+QC9#'C)z@GE[ BE[ CE20 mE[ B*(#7<3e B/1 0O@K 4W<Bu+Q4j`.mE9 Y8 #7' :RY'/ JVC'd1' Y$?E'9;-<1.,`A#ANj%$X  fX e%Ae3I f%'-e'CPSOGD: Dr;4)zF#B@'JTP7'3e'1 0O@+Q.m1-(,:WU::I ++%g%g>4X@yD-NyDA22J+b'599nR{ <.%J%HS.*S%J!J8; .2.*ByX>%I% X(=X%c6{./(]JLN5'-L'E R<<)q*)+g/) Z&;::|%a|;JSA1 ?'K -0n'M=/G - GwJGw5$ JI JK2O JFx>oFxFx"+ %V(F- u"V(+/!Q/!GC > mS K>G!J I1/)2/)M;;PP+sIPeen?6)\eWV,%Gv%R?'(#WCW9#; )z Q'- GE[ BI CKY2I)6 2N4 m @6<RR B*9!+(#.OK7<<3e; B/LSNa6.5&K2:(#4 &.OJ4jHSkPb4S<US7-;+xG$u|<592[/I W [>]C,P*C(W 6!3>]Dz?>]:|W +DQ6wO2yUD6OKOm6OO1$ $XVS5s $(#'(# $[(#(#[PUP ?6)\H'6|WV,%GvI+'%TvA;H&.(#+<0T q90,J23BQ"m CEHD (YPA(4W68QJ(#4W Q[LSJA(#?s! PPN*' nWDD80_W!4PK+ N D ;TXL8Hp ,YA1 f -3'(#(#M%WD$I f';'-(#Y.-*o(#he+*C*369-@M45U'82.H,'D6 T,V4H,$/6&63-H,E1%/N=JIa52UTSKT7"2TS'&',RBTS''|<592[/I W &[>]GPA(W 6>]Dz?*)>]:Y|W +?HQTk?!(U M@ !(;=&M1>aY/;%&%;Y/LIA?!(U!(&MXB5TT+%gT-:T)XT:C:+:H([E;G f2"E;?%g?? 
I fE;4??4e")6T-:T)XT** X%)!Q @HUJ N@ -n$RO<S(# ]DrW9F;:)zYg#'- BGE[?IFTPYI)(N %4.O m @6<RR?69++!+S(#+ .OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<>0A+ 'Q'QW0Hxq HxFHx N7(H $+HH;60J~K@3E!L=Ny@wK@23W))IUNy=?=;22D8I0NySY6 )%'O%''+X:XJX f2<75s(#'(# 2D$I fC2(#(#eUV@xG=%g=nPPWW2<=E=E'& '&22'& O.#1G1911Kt8RZNXC=%WG9 4(:KVF m9 )." 9 KKVF4%#=-*0 $X f2;S5s $(#E3(# MRI f $'-(#(#e0A+0qIw$/7O=3wU!"U&UgRS HLVNHI )~ qP#&|(k.+>)$ (.+(# ['C. H.%o(9HJ*'?_HwMG$9 CGN $8GG$7Eh \GuP3W:"QVN=E K5,'K~CR#H>=%YG B!z C: ; ( m (R B*.4 B KBK ;4%#;.=-*.'.'A.'"gJ06+-J"p"pJD1XK@3K@AW)=,_RV(1)4-(`)COC)U/TA)/T)A,COWo!.,E+ $X fIdS5s $(#(#  I f $'-(#(#e _ GWM] _ _:g 1?.c/D :a/ ;43,*;4JpJp;4T(W&{.@> f2&{(#(#MI f&{/3(#(#/3e2C2(&2E:9F:JRW3}M/JRWAJRW|<592[/I W &[>]GPA(W 6>]Dz?8j>]:+Y|W +?HQGF00q6[;(<n#WWY<$$  Y?()f4!K>/IF//&P(&P+H&P+LA56 ]LAffLA<*<Q.$9))`QOIO:lQOO!)I"I"4NI%34N+Nn4N/GVjN MII-;SxG\zA05o NzKezDABG\  P P1^Uj"!1^T1^8F$A(JSs[J (o??4 GJ("h-<?4% 6/R.U\N%R=%Cn(9>;}K1;- ?5K-O7!\TIO/ *rIV1DBO%BSUG@K7#?%B -%BA77/W7K;6Jf@@|@ @:%l N'-%NN { 3LA^S!^:9A^77RA^Q&_,QQ:3b,4NP9D=?6)\JDT>WV,%NGv%0"+PE,= 4)6(F '8+O%','44>$A4 <\844>4K82<K2; hZY7OZWUJs> f21UJ(#(#MI fUJ-p/3(#(#/3ef}:},& a/ a!u a:3K,Gy;X,X18TFSF+1LJ kL$VCKCW kL UN7,L@H9S],X,N7 1X,CuF^# # O# M!82= X+++'C C8C mS8mL:HmmTh"",hu'J&H='JF6%]%?'J'J%%%'Ce HLVNL(#b )55 qP#V(k2+L>)=QK1HD (,!3 -y(4W68L(#L4W [LS?>K H(#?s! ,*'4| HLVN(k )P#FA(kV>)X#S %KJMwH H *W#RV%:N7GKAOTWHS?IFNF5?'AOSSIJEAO-;41y-['-IJDF K5,'[CR#H@=%YG B CGA@ m(#U B*. B KK@4%#=-*T33C|S9S=' 2=8B+^ BBI)I ='WWS)=;;4;4Jp;4KRs=N'B&P>>XKBR@$C4@ $CP$C44 49 H \ 4KFSF H 4N7IvN7*] 4=N7N7='Q!O%''IBOIsI)%E:9F:JR2W3}M/*fJRWAJRW<):/|:G$O 3GV/&G'-,o#~R<*k(K4F(Q Q (R$p$p LA 9i6 ]LAf$LAP $(o%J  e%8%++=9B9& L(H>f'-8KGK.:(G@LGMF4O.M:(5K(m%:(G:A93TN:4KZEG4j4M[:s ++%g%gL#73f/ 1X2`Y H4(Mx-'-HMGK@P6| G8S0MFBH4.OMP}.OK(m%PG"M2BHOO4KZEG.O4j4M[$VT ;TTTTDU %QV%A,JA%A &{.@> f2&{(#@(#MI f&{/3(#(#/3e."]TPFyuLG9D.H_C~31D(D31Nl"N;"!L7IK5,'[CR#HPQ=%YG BA C2PQ> m>C B*. 
B KWa KPQ4%#=-*H LC,A @7UR8(oGi!4'@l(oK%(oSGR8@l(l#i+ #i#ig-/$ R1-/E3N5-/*!!!!W3K@W)OW6Vl %+Y !]O!] G!]!]Fl$ :1N5-p'-L'-S,8/Zh6;U:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{R<UFPH "\U OD y;L4P4:R"`*88)J Q0P0;G>(0;(#(#M7D$0;'-(#(#F.KV4v %AG4v"S9^>5?V>M>+{6w@O5i2yUD6OKOm+{GI6OO( # .)CZ9#CZWE7E7B`8E/:-/$ KR1-/,N5-/'-2 2 s4OY4OPaF57$vW6,).M 9@$Ct 8Gq)Q7! 2T\ 6  M Q&! 6CXCX/;r3BRLz r=R?'O(#SCW9#C{M)z Q'- GE[ BI CI)F3 GN4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K6:(#4 &.OJ4jSk4S<Q'NuD'0YXK[A0mG58F0mB0mZ+AF88I-Q f21-?? I f-??e U@Y?Q ?2dl6%6%56% b8,a)J)JC2 Z3C2(&2+` H :BDr ).oUP#=%PBsT> 75s KJsH H5*W#;3kIW<, UMKEdEd"$ N^~@N5f}:}&U/TA/TU|)C'8+O%''B5B,B4X7PEtXXG1VV,;L )x)xr-Lr,r"^9/9=9 UMDt'HV-CrB K(qV-LJKh,B}WKh.=7Kh |N7#}?YP@P P,MP77 /  N =lX3,=l,=llEENTNNN%EUl ~NV4 4-zB' 4.F 42M9Z" 4N(yN?2=*`2=TkUl 41:@A 6N< ;NKIR,8 ; %MZ %=; ; % %PPWPxTTk?L+U M@ !(w;&M1>aY/U;%&;$M$}R wY/%%( -IX5*4 8*4?O;b4,x'9L,,HQ$FX f265s(#'(# I fC2[(##(#[e-=EP35'R7Bt3'DMS9-U $/'&'3- E1%/N=JK5,'8CR#H=%YG BA C.)r"p m"p# B*. B K#K)r4%#=-*,tLFHkHk5<F@F-/$ R1-/N5-/'HV-CrB K(qV-LJKh,WIvKh.=7Kh |N7#}?Y=&<.==MM5R&QNhIXcNhXI533K; $G7{7{N"7{7{EQ)p$943)`QOIO:lUQ(OO(~$ 1~>N5~X2` /Y= H4(&^Mx-'-DMGK@P2G0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$VnPPWWAV-NEWF; ,NE$*NE,VG5eQ&$9!a)`QOIO:lQO OK:B:DrRFC=%ZBGK?TY*VF> mN?6.? KKVF4%#'-=-*Fx$mMGb p %%$63*O T=<3T HLVN (#b ) E qP#B(k2+ >)AK1HD (,3 b(4W68 (#2 4W [LS?>K H(#(#?s! =+*'4|<>Ks WXeA0q Wj W8k<,2I*!8, % %MZ %=;&&, % %J%.'E7/d):U91P?E7?? BY8Z1 )S N?7D?FE1s7.,XkR8%DP8RCP7R8Q::TvNFuI1/)2Y]+rHYOH/)G'M;-9O'K'@@N4U@\BLLyJYRiGY1< ]Dr;4)zFB@' TP7N'3e'1 0O@O+Q.m-Q f2,x1-?? ;I f-??e32DY&32KJ,KJ'32KJU;#0G$O:q 3G"V/&G'-&NK 7UCY]<7]HYVp7UGHJ!C}SGC OHJHJ@@NG4U@\BLL/W*+@/</ & KG/ /+<K V VTI8."IF"I.P/b/ X"(i8(i$!']H"$3E I$R?^R1RVDCV~VPUPH ?6)\H 6|WV,%Gv'/%,[l9lNI;W?P CI;W $XS5s $(#'(# $[(#(#[=8 #BR@?$CA(@ $CP$C LP&.w.w4a+PD};"QO$$ Y9>;3?TDKXcXc2wA4Y i0 FM90Y=6 "# 3Y'=6?=6A 35 CP9RV8N?,GAOGHS%V74GGAOSSIJAO-;441y-[},=,,RGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\R M! 
nl $8/HW,t._NI-:Lzo ^YN)J)J/#Lz LzSN8^pP5A@P  4P4Y iY%L G5)ToEP> f2%:E(#(#M%WD$I fE'-(#Y-*(#he Z43>% Z7: ZMM5NO8}{;4F8}'QPrQP7F3''RHK u?*k;W u1y/Q=W1DH;.1!ORQ=K95[#|I H :BDr ).oUP#=%P#Bs,~T> 75s KJsH H5*W#MX MX'&'-MX'#)56{6{ZF.v.v.v.vC&'E9t*W+A )(`)CO)/R.X@CCW&{.@> f2Ea&{(#(#MI f&{/3(#%(#/3e<{M'-&OqYX.;f>1*47?. ':4J7?BJ7?8*44J??OD2CiNKD2BF]ITHD2(#-)Jc.UPH: F6-DHGKP t6|4 FN5P}IJP-; 41y-['-}V V )F L@EEY&3 KAQJ@KKA5KAQ:N $;&4{YP;&{;&Fz lP2ORCF2,#R8 '  E5TN *X 7n9*9 L 14*Z2H4/=/M9[[P 6 HI1/)2,A @/)(oSM;4'@l(oK7(o@lr)XT;YD3B4 0x'!;YIJPIJl;YIJIJm?~D>7?~B}B}R?~-=v;m&cM[N?O=%8B9Y] 4]Y<8G:((GBF4.FOFO:(5:(@@N.4U@\ L w)[2)[B)[XZ!VYh(LU70A+ 'Q'0Hxq HxFHx ORO:S(# ]DrW9F;:)zYg'- BGE[?IJTPYI)  %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K :O(#4 &.O4j 4S<V Y8*" BPT2F DB;QHF4. Mc&K*4V4T;K*WW;) $X fS5s $(#(# I f $'-(#(#eO8}{;4F8}'nNnMK>>O> 3?BUPH F>A.HGP%6|,NP}IJP-;41y-[})R0a6X6?L50a066})R0aS,l8WD$KLR&AG&,%u4X2`Y H4(Mx-'-HMGK@P($6| GF30MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$VP/\*sG&CY*sU8)3*s!jBQ#%D+&{16%D%]%?%D%%R 0EU)R77Y\PR770PP> f (^TP(#(#M%WD$I f;'-(#K)(#heS2]2S3X"J,!)Y$?T>>>.0FA0mF/_)5V?\S@N5CO$1kGqN5> 7g>Y>P"tN1%)%lAa"Lp#-GEv#=v=v# ?\CXCN5' CXV6Q+'8"/oCW?PCRV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJAO-;WY41y-[}4. &3^4N+Nn4NN-9V2>--E5)Q-B9& ((H>f'-7UKGK.:(3!0LGMF4O,CM:(5K(m%:(G:A93TN:4KZEG4j4M[:s>8d Q'It&pHx0g HxFHxK] EO&p NYJg)k'X@_3XXX:&QC NYy%XF%1RW-PU)3M:&Z*~*  V@DR65A Dv?'*FR6--GH1 ."IF"I ;1XrFT KB1LJLVXSDWL U7L |N7#}?YK NG +[6n5=U)127SgRH)1:Q34 ::#&F,LBm4nFp%s"[ <43pN6LpO#4-G5B-A @B-I50B-)T#)TSD)TK8KK :A6:P/w ?6)\'iWV,0%Gv%I+'%OTv$I'$$~'$'&$-<-=EP35'R7Bt3'3S9-U $/'&'3- E1%/N=JYn/<ej?V=Dk>DD"H2h(A,6 )0 (;KBR@?J=S 9JPJ1?aS::U)[[[I|8=J S;E+;u: $X f2S5s $(#(# I f $'-(#%(#eK5,'[CR#HPQ=%YG B C2PQ> m>N B*.o B K KPQ4%#.=-*+*C*GY-6+9 mS5u K9--WIj(($88J=Y .'E7/d):U91P?E7??BY8Z1 )S ?7?1s7.,Xk*RLYVK2XK#L*%#TX2XTW2M"H0-Q f2!1-?%g? 
2I f-??e$8/HW,t._NI-:Lz ^YN)J)JOVLz _LzSN8^ pcNhOdO7A$/*33BRO<S(#+DrW9F;:)zYg'- BGE[?ITPYI)VHN %4.O m @6<RR?69!+S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<%6"o;LQ?%63%6RGd!n4$PK?Y #p$/:\'K 757@;:\N2Do:\R M! nl7 T 4vT C ]& IKP @K(NK(u;V;!;$/J{Lf FOSfJ{%$qJ{4%(#4*`G*`04*`*`N7(H $+HH#(;6: $X f2)*S5s $(#(# I f $'-(#(#e*38p D:_D;o5 D?S?Oy D??|?mPIlE'LP9MR --I778o8o&Y` qF T T+8L6mE8-c B?<???W-=;V'R7'Bt;'!;-U $/'&'3- E1%/N=Jy &07#&TAy''9)y''K5,'[CR#HPQ=%YG B C2PQ> m>U B*. B KKPQ4%#.=-*;*A>AYTE9@>N#5.#>8!h'Ny(D'HJ,VM*VMVM1+ KW:B6KYK+&&/1.Y9J:5 '81.8 `A/ 4X8Y78l!q+ @] 7Qj0-Q f2Vh1-?? @I f-??eJO7I#Fz l?5O'1Q,C PV2Y S/!Q/!C > mS>:3z'O?WjB/|"Tg>6u*,5):/|!p *%#3+%6N*Q27>3 Q;;Q2Q LQ E * .*Q2N@IA;HK882<0T0W2QJ;.KYFQJ?QA5 CP9NV4-4!ZNRO<L(# ]DrW9F% :)zY'-BGE[?IITUPYI)=N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K OO(#4 &.O4jf4S<<>P`Gs(XOV5sGs(#'(# D$D$GsC2(#(#VCV~;V<%&;:IC3AK&S%OC3N5C3'-hP ?6Mx)\P'9.WV,0%Gv%I+'%OTv:MGE3K@3K@"lW)5<?'2<K5,'[:CR#>'y=%?GK B:! C%'y> mN>5C B*. w B K%K'y4%#'-.@`-*SH'SH(#M(#SH(#CT.CCEI )BD[2)B-/)BP:A@:NB:5(:As(#h_1003F'.900009x# '[J7*((XE)CSKP8P"Y.Y.5AK" 5N55'-h$*.8M<04/"E?MG.CNH2"&E UK?!(U5R1}F/!(`%$R&M:$"I6:%$ '%$U$FPH "\< #0c9c,|0c0# 3WkX X(=PX6+BJL. :GW8 Ia6/A$0S(Z6/ 6/(Z0 %RY015&6CCS%@/?"EA:BDr;4FB'=rTP7'G'%4N'-WlSW@G1V@4UlM\QUl".MY00M HcssZs))))*J/-/$ R1-/LN5-/f}:}&1H,-NE,*B=U)1+%H+?KsA6N)1V 3 'F L'V0V*FN@I*E*NAMYV HK2:UX>Mo%#T=X2XT` 6 TW2W26*Tqw/GE7 *X #D D:A:7*> L-69;C61 E9;'9;-XU2DX X7,%?b@\? BFN:1I@\!dMB, A|:#?#$!dW-!dA37:/W7K;6#EUVY@x?Sm7 -i*Q8G."/5*QGe*Q#?^.$ A.Hl0'Y=I%~D +3.EXeC, EE8; . ByRRO< (# ]DrW9F6U:)z!H'-BGE[?ITSPYI)^N ,4.O m @6<RR?6! 
(#.O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<;g;g?Q6J68HFVF((:2326]952]B:A wT@:NRx5&:4As(#hF%1RW2PU)3MI&Z@(V08I5A Dv%,FI$ 4O;7 ;;|8?00*v$N7$4L*7UA7* 77q9Oq9*4!$Np8*4Np?O $X f2JwS5s $(#'(# I f $(##(#eKh.5T!3ntXT E%I X(=F'+; EXB$ .'E7/d):U91P?E7??BY8Z1 )S ?7?1s.,Xk22*%,'59nR{$) )%,Jk37J%S8J!=dJAFV-NEgF; ,NE$NEH9S],,X,X,F^ +J??4 GI ?4A#:/6 2A#C1iA#2 9R HY++&&/1.Y9J:5 '81.8?AV</ 4X8Y78l^+ @] Qj1cA,k2%A/Xq&/V/!V\4/;BgOU"/GU(!0(0/ UY"U0.!w/G0/11$'A:(H8H:&CC2j4/%;RMDf*e%;.a%;UM !X!,N!DKTPy2D'',D'' P /Y=?6Mx)\DWV,0%.OGv%%OLUV+U)u/ $ D/ =vN5/ WIX=N)N)N)N)O@RVN>?,GAOMHS@vQ%AOSSIJC=AO-;E41y-[}P`@ JQKJ f 2KT{ 5s@ (#'(# 2D$I f@ C2;(#(#e%?>?@\6BE.?!eP:A6.>>4.q.q+K/mmX2`Y H4(Mx-'-HMGK@P6|G0MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$VRI:RP> f2,-(#(#M%WD$I f'-(#(#heQ 32iIJ.uIJP2IJ;.&.w.w4aO>>5"s 7"s"s#UG24n+I#N1MN1V#N1N1G\Y(#5U( [HZV>HZH4]=X;NJ-4 K $ C N5 '-'-DFRCT.CFC%KEFN1'- $KX f2S5s $(#'(# I f $[(#(#[eW6,,qR/\9Te9.7{97{1c',k2%LANh/9=Xq&8/V/IXcNhXI4U,"z==kFz l(.18I[PIA WI[)I[WQEE&rEE-CCC>9.NG)ToEP> f ".WE(#(#M%WD$I fE;'-(#(#he$`B:K u.CN A/7o{!/X<M/ e4^UkH/'M eB& eJM?7""1l'(%'(UWWN YX 7n:9*S?7 L 14<*Zv 1S?4JJK,E)CS6iKN E)PUP ?6)\H'+}6|WV,%Gv&I+'%TvR,=B9& 2R(H>f8R8G.:(G @ MF,RO.:(5(%:(GSGR8N,R4Q-=4j)nM[(lY=$4;-$q=.K $KX f2mS5s $(#'(# I f $[(#X(#[e3W  @QLKi) 9 : ES R;%AS 4!]S  3W% %Y-3-/$ K<R1-/LN5-/'-* 7UCJ)|>Y"7UG"!V.F"UIJ"-;.41y-[}PT2F DB;QF4 Mc&K*4V4T;K*WW;EP353S98'Q)Q!)O&&TKJH0' HO*W#QQ<=$&GlR'f HEP35 ) RP#2b3Lr1S9W &LrKJLrH H *W#w'&+ '&22 '&u(3 P9D=?6)\D3WV,%Gv%Eg0.J=M5*B=U)1+%H+?KsA)1V 3 'F L'V0V*FN@IK5,'[CR#H@=%YG B CGA@ m(# n B*. 
B KK@4%#=-*16",E8')9F:Q".JR!3}M/*fJRWAJR!G".WW,"ZG&"7CC'G;G0&J0&90&UCNRG:%wS|YJl+= R78v6=74Z44/ &MopH>="P2P(NA%BP(=8F=P(==MBQX>%I X(=FX[-4%(#-4*`G*`0--4*`*`?>MQ??B6lP%?@\?@\&NBTG?!e!dAGJX=JiJEg0Yi: $X f2S5s $(#'(# I f $(#I(#e#UK?!(U5R1}F/!(`%$E&M:$6:%$ '%$U$FPH "\Lg!JLgUV@xG3LLDABG\!'0+RQ'K[D9J9RO'8"K[HG4~O4H(7Hl+O@] QjAD(Gw(E1=V9n0@  HHZ>HJx;B^S>=d> f2&{(#(#MI f&{'-(#%(#e(   " TXx Y?1U:zXx.XxE}::R'Vs> f2B'V(#(#MI f'V-p/3(#(#/3e(G+R J<5-RO8vJ$5V+\V+C;@=CnNG(59=>;}:1G*15%(ZLL(L>'>':E?B8E)CSBM#3~,Y<QI .WK >+.W*hWE# ?+E)E)8 $ XS5s $(#B'(# $(#(#Q+U@[?*1 $,11Oj3Oj3cM7Oj:-/$ WR1-/N5-/'- sILAH"7ALVL ALLX)Ep $KX f2S5s $(#(# I f $/3(#(#/3eS..Ak&&Ss1e&{0> f2K&{(#(#MI f&{'-(#(#eCNV'CJ C6"p mS6%?>?@\WB> 1 7 \,> G1VV 9Q7$Q7T Y'% 6h;H|3W8Y@Q : Ki) W} ELRK-mVS%A8L#!]*HL B3W% %Y-349  Q Q<QEF1Vx SVoC`/-0SM9m/&-*C* tME%MR1-S PAO+S2]:|Cj%-Ba2Lx3L11 V/L%'LK1 -0nE(%3oAV-NEF; ,,NE$NEH9S],,X,X,F^*rN YX 7n:97*S?7 L 14<*Zv 1S?4JJK,2 hZ.v.vL.v.v<*B=U)1+%H+?KsA)1V 83 'F LV0V*FN@IRYRiL,RY;RY/*CCN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5?:(@@5hN.4U@\IJBLL* Z;::|%Ba|;SA1 A1 ?'&K1 -0nFlN'-L'-!yu/IY6?`46:Rb"{/h-/$ 0R1-/N5-/'-'-K5,'8CR#H=%YG BA C.)r"p m"p# B*. B K+#K)r4%#=-*I/!/!C j88dO)$8HW,t._9-I-:Lz5*+5G9-Lz LzS9-8^p =D}+&&/1.Y9J:5 '81.8?AV</ 4X8Y78l+ @] Qj/*/X>%IT X(=X%cH?XN/)/)M;7NN* ,;0^16)/i/iH"/iJVV%J//J3W8Y8@Q : E() " ELKPcE(A L#!]L 3WE( %Y->L'*YJ'/1'6l6lA5xAW.A7WX>5> ~<SWX(#(#M7D$WX(#(#hE(LL7YXYYoY~ <YK>N '-'-:7N.'E7/d):U91P?E7??;B 8Z1 )S ?7.A?1s7.,XkRM B>K?KSIO#L"B/;K aPG'5;K-DoG6;KRM! nl 2 !C5W/M)6<+pQp!p uS7bN?N>T 2? 7UC9nM3Y f 7UP:Ak!GW=4<GSP:N=dP:^M):LWoO"/Q'ZO"CLH:LO ] |Q7Q7"T HLVN5(k )/zFA(k+>)X#_?1KJH H?1*W#.=PFL G/T/T)1K?@WgX2`Y H4(&^Mx-'-HMGK@P?U6|G0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$V2VXDW/YN*H(>Mx '-D-GK@AOVQD:GMFL4MAOSSK(m%AOG8L@oL:4KZEGTv4j4M[: $KX f2_S5s $(#>{'(# |I f $[(#(#[eT}7-T}NNT}5(X Y5s5(#/\'(# D$D$5;(#(#(X d'6B(:)r:)V!`5HV*%#3+%6N*Ks7>3 'VT7F L * *FN@IB:BDr;4=%FFB'TP7N'':U 4x. 
" SD!fJR:WBAK::BN5BYi'-h H8 = 9L C2&=J oJ>UH>(#/l(#4>(#/m*r-\%%:A wT@:N5X@:4As(#h+'8"CO>}Oe D*GQL*OOIO:lGIGI*OO::A7N'-LF- u"*m0A0A1: NU OD7n9 %y 14*Z4JPD 8?6Mx)\P.WV,0%.OGv%O+: Qv !T:QvX'-=;V'R7'Bt;'-!;-U $/'&'3- E1%/N=JN*S*3%}EEUK\K\U8K\+= IRFW8v.6\ CFWFW6==0K@3E!L=#\@wK@2xW)95$Bc?J~22D8X0BcSY6 )%$8HW,t._CI-:LzO*+0oLz LzS8^p( GwJW//bGw/M+lOBO $XNS5s $(#'(# $[(#(#[IzTU 7IzZIzN?O=8B9Y] VWY78G:(EG]%VWO~:(5:(@@NVW4U@\BLLK_4T.D)Y2"T5H93@V5HW5HV+N^5X+ 2U -+  SVoS*W7ss> f27s(#=(#MI f7s-p/3(#(#/3e#&F,LBm9nFp%s"[YE32WpN6LpO#-G5:HK u?O?Wj u;Zyg>6u=V !u a,1Vx SVoC`/0<SML<DPCME%MR1PS P O+ GwJGw OO  N N?O=8:B9Y] DS1Y K8GK:(@GY1C ONC 5:(5?|:(@@N14U@\'-KfL"^JJJA<HZHZHFP25n?6)\eWV,0%Gv%O#qBBa =7,==;mc%p(h(h7CJN4NM!9J(JNk 7CJN//&<-B1><-H1<-M(X:KN'-5dX> E%I X(=F'+; EX[ "^PIPGPYi3ER?'(#VCW9#IM)z'--G. BI CMI)F3 G)4 m,C@6<RR B*!(#.OK7<<3e; B/LSU=8 uSKG:(#4 &>4jPb4/A!X.`2j4BRSB/YW*ROb0j%")@Ew42z5Q)")(#(#(%D$")(#(#ms4OY4OF5,n76 :7P6Q  SVoV_SD~/*C;1/XJO"S: @L< *_ W! :BDr;4FB'TTP7N'/c'NV$-Fe4lZ R(yX4'<"<lR$$QYB!:-/$ KR1-/N5-/'-LS@I+aN7#N/R\kND/;/K5,'CR#HD=%G BT CwIGNC mC S B*. B KKGN4%#,9-*33*3=8B98 GC 009)>iy &07&TAy'G'9y''q9Oq9!YND7?,N*0;.* 7U:CJ)|>"7UGK"!V.FN"UIJM."-;.41y-['-}P@P Pm&98@Zm/m;E>>-Q f221-?? I f-??eUQO~,&TACXO~M!7M!XO~HRM!M!HR&=dMC8V*@RO<L(# ]DrW9F% :)zY'-BGE[?IITUPYI)=N ,!4.O m @6<RR?6!L(#.OKL7<<3eOp?/LSNa6.5&K OO(#4 &.O4jf4S< <-/$ R1-/>N5-/'-'-I{CDE&QgO )H Qg.C.AQg..+RQ'K[D9J9RO'8"K[H>G4~O4H(7*>Hl+O@] QjE*AMMoBWpWpR3%y iHQ0JiY C) FFpSG . GI$= // XO &2&2P/q BBh; 9-%YY X8DX X0//UkXwOO;'UPNQW2$67JX f 2 Q5s7(#'(# 2D$I f7;(# r?9(5(#e/ F5L!!2=8HN;.V0#%@HH Z;::|%Ba|;)SA1 A1 &?'8K1 -0n$8/HW,t._NI-:Lzo ^YN)J)J/#Lz LzS^N8^pRGd!n$PK?YO#L$/:\?6O54:\N2Do:\RM! nl /EUE/E/E02j4X?W. %I XP(=PBXPP< ZP:G ;MS5O <MOPPXR<) HLVN5(k )S(FA(k z>)C9U8bKJHI H8*W#*" AM*J SVoSP66U'&+ '&22'&P Q-uP77@cP77)T)TSD)T-"^4=:-/$ R1-/C N5-/'-.0 H :BDr ).oUP#=%P#BsBT> 75's KJsH H5*W#O<:7O~K&=^ACXO~M!M!XIO~M! 
M!LvM~TE9 Y8 #7' D:RY'P/2&O9;C'd1' E9;'9;-P&{Y>&{(#(#M&{'-(#(#O62J ^#) Y6?"I'NNN5 R| ?C O ;1X?~1VE''B}N7W-'%OG d%OV+V+(%O" M40AE=*HRl0A1W J1DW?/1W6R\C 1W"W?[6n"3EL18R10 41nPPWW>>Y.F00?-z:W:W9'=W/9==994 nl/h-/$ R1-/N5-/'-'-7,%?@\? BFNV1I@\!d@SB7?b!dW-!dA77/W7K;6A@ hPEATbEKN8$+PE,X-2#X-X-9ONFu:BDrY]+rH=%OHBG'5T-9ON'K'@@N4U@\BLL;g$WC 9NH ]w$w**n&aC1qC, 8@F0&"CURl&Z&d"[>)o'<M5A[%[,FM$ 4OJV ' <C CN'-'- 0 AX 07: 0 ]w$w**8?)P"CNAb)()-l<-098B2*aj*A=*o--E5))PUPH ?6)\H 6|WV,%NGv%0&{Y>R&{(#(#M&{'-(#(#.zAK>0zN5z'-.Hlh' iI@U''M!*-@M45U'82.N'86 T,S4EN$/@6&63-NE1%/=Jm@Zm/m;)E' <C CFN'-'-*m0A0A /t/tY~/6Y~oY~QL H7IB$`:KB?L 1U :K@4u6`- @OR@T@z8@F7I "cISY R)RN7R9D=DGA~<]C"G<]V' YU@!#b) SY09[P&{Y>0&{(#(#M&{'-(#(#A;HK852<0Tt0W2Q'G;,&ElYQJ?JKQEAEl5 C&P9 HL(#W8 ) qP#&|2E$4I~HD (.E(4W68(#4W [LS?>K H.(#?s! N*'4|%)uRO< (# ]DrW9F6U:)z!H'-BGE[?ITSPYI)LN ,4.O m @6<RR?6! (#.O 7<<3e?/LSNa6.5&K O(#(#4 &.O4jR4S<F`)F`LLvTF`A;HK882<0T]0W2Q?;W]"pY"p!QJ?Q6ReA]5 CP9QUl.{Y#y#EUVY@x?SmH7 -i*QG.FH?;*QGe*Q#H?^.$5HzK>$YF2(gR?(NC=%@2GR Y+C.FVF mRC.R KKVF4%#=-*FlC N'-'-A!N&]) "A2@T ,;W!49[>P:RJUX6W`?:5J066}JLVII&IR?'(#SCW9#; l)z Q'- GE[ BI CKY(BI)F3 GN4 m @6<RR B*!(#.OK7<<3e; B/LSNa6.5&KG:(#4 &.OJ4jSkPb4S<pW*Ps>eA0 O418Y78l+O@] Qj>/GSxN MI;SxzDA05o zKe z < ^< %=:S# W8kPP-HLIH@C1TB16o#Q3N2/Dr 2i5IJPIJP2;YIJIJ# 9'=W/9==09HnKN9Hn0{Hn1+ X F (F0|4F4 7m 3F_F#`$FF$2D|Q#Q A/Q&{3>&{(#(#M&{/3(#(#/3K20rN.+T/A' D ;+DDYTh4T7-D3DDG1)K5,'8CR#H=%YG B5 CG )@ m'(## B*. B KK@4%#=-*PMIFRO: (# ]DrW9F6U:)z!H'-BGE[?IITSPYI)L  ,4.O m @6<RR?6! (#.O 7<<3e?/LSNa6.5&K :(#(#4 &.O4jR4S<[:~)'F[ @72[D;$$FMkMk??4 G?46/,\ %Cn,LO22@$8HW,t._CI-:Lz%*+0o/Lz zLzS8^p-CVYh\M`9'R!I+'TvN*SCo*3""@""EICG1)V@43~HhX$33X?X$ ( (RX$1*}=<1r/ E'  ;UUO.KV4vAG4vA&GF00qT#YYD3E=JBN7((_.(.. 0 . 
T W'X+@/W/;& K/ /K V7IB$`:KB?LO'1F':K@uE;YOR@z8@F7IY"cIS>Y 0 $X fS5s $(#(# I f $'-(#(#e!$Yv0PS"6]G  B2/9:\9a9'559:\N2M!:\of'-8KGK%:(GAMGMFV4O%M:(5K(m%:(G:A93TNV:4KZEG4j4M[:sF(@Q%5Fr%7v%%$X  f2H e%AeI f%eXDV+bNHM(>Mx '-BGK%AOHS&GMFKw4%MAOSSK(m%AOG89Kw:4KZEG4j4M[?De'Nj' '>QQOQ4 ]w$BgO"wU*(!0(0/UY"U0/1KYN#*S; c.`K *9O3C[5`@@EAHn=K`THJA46JAJA"5+%g?6q2 1&[ SC==UUU2W>C5)2.3K.Jk2.. g009XGG,7.0q6GBGBZ+A" ;""6.#K&LN.'-;X,XOi1V&W-H9S],&X,X,F^ NKG f2 7??? I f4??4eH/"/P/UCNQ7*=/$</08/22*%,'59nR{$) )%,J.B37J%S8J!=dJ5> T%90M(#(#M7D$0M;'-(#(#(h|<592[/I W &[>]+zPA(W 6!3>]Dz? >]:|W +?HQRR <-/$ R1-/N5-/'-'-VURWPCV 5A& &/.Y.A#/4AW+MYSY6Z'+O%''.&!oJ-o/M*`oXT E%I X(=F'+; EX[ EI MQ!i+l'O*6P HLVN%(k )OFA(k1>)O&&TKJyH' HO*WKJ#:1G*15% b5)C2Fl-/$ R1-/C N5-/'-'-X(T8DX XB1* 1#* N N N3 n3)X3;*>&L2f2kY;Y;<<K $ TL N5 -p'-'-Q:lQ0OQ=tO=t7v=tX:C+BF(BDB;DBb A$z)1)I)P 11>X&%C%C%$ V_OrcV_**V_NK 7U:CY]<7]?YU7UGKHJI !C}C ON5HJHJ@@N4U@\'-BLL%-/$ KAR1-/FN5-/'-4.Z44/ 3H 33Peen?6)\eWV,%NGv%35w--1) Qv$N#1)QvG.E7/d):U9NP?E7??BWW45)S?7?5s.,Xk!G". !t%1%#c%XY?DXXY1v:?D&15&XYXY?D&&)s;*%,'5%,37GC$< E$C$& =& C$& & D;$$F0/hI$ IV+N5IYi'-'-,q.,q,,qI6#. e e' <C-/$  CR1-/>N5-/'-'- , 1 L5w UYZC7YZ"X]"YZ"* 0O@R$,[l96;l/| CN24SW C$] C,5):/|R4S!p gV7E-1NGs(X M%9`V5sGs(#'(# D$D$Gs;(#(#(M9g oH D %J( % % [ % %F`)F`LLF`N?O=%8:B9Y] DSFaY K8GK:(G'e/ ^FONF5O:(57:(@@N ^4U@\'-IJKfLPXLUPm:P&)U+6ITHRRBX:-H1Tk?L+U M@ !(w;Qu&M1>aY/;%&;$M$}R wY/%%(vv VOA.q #S.q>B+QY >#2> *T+8*T&15&*T&I ? $X f2S5s $(#'(# I f $(#(#e0F=G%!A=G=GE5*m0A0A THY{XE>Od Q'OHx 7A HxFTHx YGOT3Y777'6Y77BR@?J' 9JPJaSN26E9 Y8 #7' D:RY'D?/2&O9;C'd1h' AE9;'9;--=;V'R7'Bt;'!;-U $/'&'3- E1%/N=J*^BKRP *^~*^0EL=$9RLOOIO:lL(O O(>W|" M4C 0AE=*H3Rl0A1W) 8`N3Jp/Jp&i1W6R\EB1W"3[6nR\"3ELf}:},&, 4)?\CXCN5 CXG/T/T)KEGy1MX18TFS6m&<V-LJ kL k0!_W kL UN7L@H9S],!_X,N7 1X,CuF^ (MFHLKIx!OjD3Oj3cOj"(.wm&q.w4a5%GEG<{qz9Oq XR>Y97??! 
DR!T7?BJB7?,R"`*.h ?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"W3&?G4&U?Xx:zXx.XxE}:# iM#73^7#77" M4C 0AE=*H7Rl0A1W J17/O1W6R\1W"7[6n"3ELjI %K<# `X+#&2WF,LBm3 nFp[%sK+ 3 6Q36QjpN6L>SpO#3 -LG5A[7A[7A[TIoUIo -:(<9_j4>U:Y&x6?5RWcFF`"{7~3S"I6:"IG"{'""{UFPH' "\'8+O%'M+' GwJGw O6WOB&'g"BBE^E^J??4 GI- ?4A#T/6 2A#CA#2F%1RW2PU)3MI&Z@YV08I5A xDv%c,Y9FI$ %4O< ;NKIO,8 ; %MZ %=; ; % %R(Mgg )mkX7-@1s-K^-S6I 6S$S2xQQ2xJ+1oJ+2xJ+TS KTP"2TS'',TS''7IB$`:KB?L 1F':K@0DuP- @OR@T@z8>@F7I "cIS8>Y S<*Tv: 8:*T&15&?D*T&&'RGd!n$PK?YO#L$/:\/6O54:\N2Do*:\RM! nl .1y8n(@F"1 0O@+Q.m N `Bq7m `@BqF_F#` ` `BqFFE+%f HEP35 ) RP#2b3LrS9W &LrKJLrH H *W#HLIHA&?W"F- u"+#EUVY@x?SmQ7 -i*QT G+O Q%g%g/5*QGe*QG;#Q?^.e$G4<WI<D8W % %MZ %=;&&W %& %`8ERR8vPUItH<It0g +-RO; HV'NuD'0Y#XK[A0mG58F0mB0mZ+AF88I5W` `+`W $KX f2D S5s $(#(# UJI f $/3(#(#/3e9]%D= E= = WR (R5fu QfkFz l(.8I[PIA WI[)I[WVLp?!(U!(&M(KK 0 -IX5"5":EP5 Ia36/S90(Z6/6/(Z!HAHH/BR@?$C4,-@ $CP$C.;'7  fU7839 h313E5))"Q(""+&&/1.Y9J:5 K'81.8>xA0 K4;B8Y78l+K@] QjHK u? uyP yWS1!Oy< ]Dr;4)zF#B'TTP7N.O'3e'OO0<-><-H1<-(M$K,:' 3,"V/& r@,'- 22*%,'59nR{$)J )%,J"37J:JSJ!=dJ)Co%?Co@@CoOOTUO&OVL+!U1AN4UNr/T/GKoP1P)D4kGWoP1O1,g0.!w/GLHG1ODl 1; $:I)5gFNTFO6 2U-W$J6NKUJhJ6L^L*uQ.DTJ6 HQ6$YD(# D$BK5sD(#B'(# D$D$PPD;(#(#(PI>N~ Z;:;S>8 K8SOC(WGE*AM)1/NPE?Mo( 5 )S75s.,Xk2C2(&T2)5V?\S@N5CO$1kGqN5> 7g>Y>P"tN1%)%lAa"Lp:..A%N.N5.'-hn115%#^""/u8-)B<"8@@0KY01)+(EF4$S2]A/Va 28Q(3I (i88(i P;J-5'  E5T)023:W>5)2.3K.J2..11VFSSY. HLVN%(k )O1(kRP>)FO&&TKJH HO*W#B=8 =0!&4 ?0!0!C2 `1[X+RR9s@bU OD y=3p*8-18-8-3F<AA9.KV4v %AG4vWPXLUPmVP&)!1P?/j?38:7 f 6 +W 3Ae/2'GI f3;'-2 he+&&/.Y9J:5 O'81.8AA0 O48Y78l+O@] QjRVN?,GAOHS@vAOSSIJAO-;41y-[}ObW; HLVN(#b )B qP#&|(k2+M>)K1HD (3 (4W68(#24W [LS?>K H(#(#?s! N*'4|V-V-EJM+nB 0SYS2FY/!1/\6/XL=kJK5,'K~CR#HJ=%YG B CG6A mR B*. 
B KK6A4%#=-*Y,:%wS|YlLHRU/TA64g(x;/TKo1T)/O(WoE1Q1CLH(O ] SDP90SYS1Y/RbY/I865 SN?O=8B9Y] 6YC8G:(EGLq6C OC :(5:(@@N64U@\L B(vR?'O(#SCW9#C{M)z'-GE[ BI CI)F3 G84 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K:(#4 &.O4j4S<!/\N #04VY]- HY0G$7?- zN`GO$7$7@@NG4U@\BLL(G+R J<5-R 8vJ$5V+\V+&@=6 Cn(59=>;}-r [-r/\-r!_V@tDlSz C6Y$`:KB:KuGOR@ $ XS5s $(#'(# $(#(#K$S4-5P++Y-Q f21-?? I f-??eJf@@|@:%+[,~6)'F[7[ 1W^,MN #04VY]- HY0G$71?- zN`GO$7VC$7@@NG4U@\BLL22+6$$,8: $X f2:S5s $(#Q '(# :I f $(#(#eFnK;=>I5?<X?< ]Dr H:K(&^)z!cF'-BMGK@=Tv>GNMFMj4.OM=.OK(m3e=G"M2MjO4KZEG.O4j4M[$V|<59/I)2*[ >]PBWW6 >]Dz1>]$;Y3|WH%)=K%> F%1RWPU)3Ml&Z@&V . M5ADv%,FM$ 4OV%T+-:T)XTQQNM X>%I X(=CX+KO"vRO<S(#+DrW9F;:)zYg'- BGE[?I-TPYI)VHN %4.O m @6<RR?69!+S(#.OKS7<<3e7?/LSNa6.5&K O(#4 &.O4j 4S<4-22; WyWy=8 <B9DG$ 8G&FON5C'-'-28(B!8,a)J$ LtRO< (# ]DrW9F6U:)z!H#'-BGE[?ITSPYI)^N ,4.O m @6<RR?6! (#.O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<!;Cy P7!y y!PtC#X-O'NuD'0Y#XK[A0m(G58FT0mB0mZ+AF88IX> E%I' X(=F'+; EX%c[# Y8 Y/DTCPT*X f20 Ae/I fe2).7A#.. 6NM\E3 J6 * XDV2-CN H((!Mx'-MGK@AOHSG(&GF3MF-q?N4MAOSS2-.OK2-(m%AOG"M2-q:4KZEG.O4j 4M[$V3h313E5"v)3394=3<3: Dr;4FB@'G]TP7'+\'1 0O@+Q.m? $X f23DS5s $(#>{'(# I f $(#(#e=!99I9=!22=!62K%M%M(3K*4K* H :BDr ).oUP#PBsT> 75s KJsH H5*W#: Dr;4)zFB'TP7''/=d:h??@;X,X1V>OO G)5V?\N5CO$1kGqN5  2 YS@"tN1%) %lAa"Lp:$ -*:N5:U-)VN(k(k!#>) "AT8 PH>FN& P::'CH8 C (;N2e 0X 07: 0<0SYSD5N5N *X 7n:9D*&Y L 14*Z4W=qFX f2*5s=q(#Q '(# I f=qC2>[(#(#[e1H421J514D=88*VM7&<MPXWBRV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJ?AO-;HdGWY41y-[IJ}X?< ]Dr H:K(Mx!c'-BMGK@=T;GF3NMFX4.OM=.OK(m%=G"M2XO4KZEG.O4j4M[$V=63B9& F(H>f8R8G%:(GA}MF8O%:(5(%:(GSGR8N84Q-=4j)nM[(lx4U,+Np5+ 2U -+ L#&F,LBm9nFp%s"[YE3pN6LpO#-G5W8.3XT%8..gP2K%MDB;Q%M4GMc&K*4V4K*W*%,'5%,137G-/$ R1-/N5-/'-+&&/.Y9J:5 O'81.8>eA0 O418Y78lN!G+O@] QjQ R Q<QOO*@ Qv/8V*@Qv7NK 7U:CY]<7]?YU 7UGKHJI !C}C ON5HJHJ@@N4U@\'-BLL |<59[/I 1n&[>]GVP-8Ui62>]Dz?>]:UL?|Ui+HQBW7l0~2D0;1X?~1VE''B}W-'6Cy-"HO.u#j4l;|Y;4'<"<l$Q>-TL!W 0>098B2*8aj*A=*oY97?d! 
Qz7?BJB7?,Y+Q"`*.h ' 5GGV, 5;#5 53$K%.&W0..(8-R?'2v(#SCW9#N|M)z Q'- GE[ BI C,uI)F3 GN4 m @6<RR B*!2v(#.OK2v7<<3e B/LSNa6.5&KG:(#4 &.O4j;4S<KXcXcK!1oQ&$9()`QOKO:l6QOO")?45Q)")(#(#(")(#(#%45A ISLN5;'-Vh+f:EuP25n?6)\e6WV,0%.OGv?c `!6%O.OtR 0E)R7N7Y\R77GhSJ HN7N7IvN7*]?j?jN7N7UriUr2Ur6 <D3N1'-5d'- SGH;6y-NyA)FMN@EEY>&3XKA=9.@KKA5KAXCE426?X  &?XB -)BV <NHSN'-'-Hk3WW>IWTFO6 2U-W$J6NDJhJ6L^L {Q.DTJ6 HQ6$&H>< ]Dr;4FB'DTP7N'F'OLi7I#Fz l?5O'1Q,LC PV2Y Y'NuD'0Y#X:5K[A0m(G58FT0mB0mZ+AF88I6WF+f:EuD2|2|XK9?YAB4c6Y @Y Z;::|%Ba|;)SA1 A1 &?'85K/1 -0nK5,'[CR#HPQ=%YG BA C2PQ> m>C B*.,Q B KWa KPQ4%#.=-*FGM  3GV/&G A8N?O=8B9Y] 6Y78G:(T<G,Lq6C OC ~:(5T:(@@8(N64U@\BLL#33UHr L' r- #5#4n#MN1#L&3 L&2"DL&OX$  XFN5XYi'-'-DD)DJ[3\LdyUM, K2223KKD~L3DDG1)V.#(&Y :V!YGz1Y"K:BDrRFC=%ZBG?;"TY*VF> m?6.? KKVF4%#=-*2VLFz lU-P( A>7oW/X/bM/UkM+lO=B:IUOH>H>56Cy-"H*'K[D'0Y'8XK[0mDG58F0mB0mF0&&=!J0&90&UCNI58 R(UR-I~?%X-5(#-(#(#q?OqS9*4!+=D7?8*4+=??OO-/$ OUR1-/N5-/'-'-(:)r:)/`V!`5HV* $2<1N)59$0@PT W'XW0iK0-Q f2!1-?%g? I f-??e7L$ LN5L-p'-'-1$ $KX f2S5s $(#'(# I f $[(#(#[e,[l9X$lNU;j (W?P;j2?ARYR9nM3 fYP:FGW=4<SP:N=dP:ANgF,9@>NJ*5.#>48!hRV%N4?,GAOHS@vD4AOSSIJAO-;441y-[}N'-;}M4WE0AF %TW 9)I"I"W&{3>&{(#(#M&{/3(#(#/30J~K@3E!L=B@wK@28{W)95=VB?G22D80BSY6 )%%?@\?@\B#XX2` /Y= H4(&^Mx-'-DMGK@PG0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$VVN(k(k>){L6>{ENE>6T&J&y&0J"W`g"$7{("7{O7{D"7{7{ %RO<S(#+DrW9F;:)zYg'- BGE[?I-TPYI)VHN %4.O m @6<RR?69!+S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<WAo8 14 / 15>+&&/1.Y9J:5 '81.8fAV</ 4B8Y738lV+ @] 7QjB=94=Sn:7VV;JH;GGN;4,#|<59[/I 1n&[>] P-8Ui6>]Dz?>]:|Ui+HQL .L(LG"">A=kJ1 Z&;::|%G&a|; SOZG&" ";?'KG& -0n:+X'O?WjF.8"g>6u8'JZ*j>AP :BDr0U9Ec9=% #B<CT, $"pY<1t?<A$5 CP9'iJX f25s'i(#'(# I f'i(#(#eE:9F:JRW3}M/*fJRWAJRW-/$ K.R1-/N5-/'-PUPH ?6)\PH W6|WV,%Gv%+&&/1.Y9J:5 '81.8fAV</ 4B8Y78lV+ @] Qj=$N7O4L7UA777,;UV@xNGLb"4*4! 
5 f::!f)5568Y#@cA,/9 A@RXq R6Q3@5L@O#-G50J~K@3E!L=Ny@wK@2GW))IUNy=?=G2:2D80NySY6 )%Y#D4V%~D +0EV?-C, EE Y-Oh>^^"P&qP5a%5%GEG$$ CXA  sK( ( j( ( HgUl+)(.wmz:.DC.}.w4aDj(yI*`=TUl1:@A ARYRY 99R cG$OK 3GV/&G'-O/0:N<8'O?Wj1/| HM1g>6u,5):/|!p PF[1]P777%UYP77 1AQ%$&<=Y,52-}$&ARW*5 #@ SA\AI,5 15WNQa HLVN(#bI )F qP#&|(kH2+>)*HD (Mi+(4W68(#4W [LSS-'ITH HMi(#?s! HN*'V;#7&/8Y=&v:56gQ #I.Al8A8$2$4l8YT8<Z+A+2$8# k8!XID=VM;P{P{S+KU?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"3W?G4&U"?f}:*},&(L$(>("H2h#"47m%S%"4FF_F#` ` `"4FF"\FFwXL8O62VbJLXDVCNHM(>Mx '-BGK%AOHS&GMFKw4%MAOSSK(m%AOG89Kw:4KZEG4j4M[?5=U27SgR6N)1$: 34 ::>$$ Y 1AQ%$&<=Y,52 $&A0RC*5 #@ UA\AI, 15WNAaRLL*SYR&OSY<SY LPFy$.H*,#D 5#K*0#V/(G#'-<WI<D8W % %MZ %=;&&W % %M u u6 y>n/>nA"A">nAR{= !1E= = W#34=3<3&R34Y iHQ0JipY C FFpG . G@TFO6 2U-W$J6NKUJhJ6L^L*uQ.1DTJ6 HQ.6$ Z% Z7: ZM<M5{'(# I f $(#(#e;JH;G(;00'.900009xdNN'Qd3GdN7UQ ^&/XM!7M!M!XHRM!M!HRV61$1G5M1 & mSD, K&2n(23:452.3K.J2.%NN { 2/6#1W.Z..ZV\.v.Z!UUK?!(U5R1}F/!(`%$R&M:$"I6:%$ '.%$U$FPH "\>ANgF,9@>N5.#>48!h.7A#. .<{qz9Oq X5K>97?K!E*5Kff!T7?BJB7?,5K"`* .7A#. .X$?X$ (X$@4"*-K ;T .I LdyUM,K82<2@;BY7OQB#"#73^7#7/5 R|MI; z?h 05o NzKez  PM 'E 'Q 'Ws$K N$3c$1A;HK882<0T]0W2Q?;W]"pY"p!QJ?0SQ6ReA]5 C?P9N7'&?+ '&22'&%O d%OV+ G%OBl,JBl3i/Bl:G$O,6 3GV/&G'->7j>'>SvX>%Ip X(=X%cLgt;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7PL |,N7#}7?Y*)Y$" &{3>C&{(#(#M&{/3(#(#/3NRwQ-ZRw;Rw4dEL@G?J/@/\YU,,SY30;&OSY<SY.MBQ:W>'HhJ1$sJ<J<%&>C>>>85/pG33"@@2 6I9nH{ <9D+G5-GS9D*a=d9DY97?d! Qz7?BJB2o7?,Y+Q"`*.h _4H}.D)Y2"T05H@V5HW5Hu(rs0V ;*UV@xFG$yT33S9,&SSH-<P`.<JX f 2',5s.<(#'(# 2D$I f.<C2;(# KJ#"(#eK\W/}K\U8K\3T<AA3RB$i$i'HV-CrB K(qV-LJKh,WKh.=7Kh |N7#}?Y2 Z38CC2(&I25X!X$ T-?X$ ( (X$ NK 7UCY]<7]HYVp7UGHJ !C}SGONHJHJ@@NG4U@\BLL+*C*6PJrO}JJUCNX8h4W!B8h+X8hmUl ~NV4-zB'C.F 420Z;]9(y4 2=*`2=T(/Ul1:@A /DJL$ LN5LYi-p'-'-35w--L>S?0L'>;.;.ILL>;.;.D,UU /%n8%DP8D :a/8P9D=?6)\DWV,%Gv% y]8 y? 
yCjN].7A#.@P3W.OD*%#3+%6N*Q27>3 Q;;Q2Q LQ E * *Q2N@I*%,'5%,337S5SkJ/yee:n:H6-KeG=BRE:FN5=IJ=-;:41y-['-}Q <-/$ R1-/FN5-/'-'-)=9B9& ((H>f'-8KGK.:(G0LGMF4O,CM:(5K(m%:(G:A93TN:4KZEG4j4M[:sC5X&Q8=E?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"W?G4&U?!~<:-/$ KR1-/C N5-/'-XY?DX1XY1v:?D&15&XYXY?D&&5";/5hL& L&2"DL&) $X fS5s $(#(# 'I f $'-(#-*(#e-/$ R1-/>N5-/$ g$$088="8T2F =*,F4c ?W?Jp/:R\"W?[6n"3EL?K $XS5s $(#/\'(# $[(#(#[K6+*CRC=%(*G5u6;X.VF m5u<(.5u KKVF4%#=-*1!I$R,?^R1ROSCbX:A^S!^:9A^77/A^TFO6 2U-W$NLJ%^*uQ.DT HQ6$7,%?b@\? BFN:1I@\!dMB, A|:#?#$!dW-"!dA7:/W7-K;6,AOAA=8B98(,GF<`8D)====RO< (# ]DrW9F6U:)z!H#'-BGE[?I0QTSPYI)LN ,4.O m @6<RR?6! (#.O 7<<3e?/LSNa6.5&K O(#(#4 &.O4jR4S<1)GjGs(XGjV5sGs(#'(# D$D$Gs(#(#G 33e;DDH2=l,G=l,=llE+;.7P:7 ;';0\T[P^Gj -FX fGU Ae/I f '-ej%$X  fN e%AeGoI f%'-Xke;cKZUw,[>,BT/So?.?. w9AP |?u0U9Ec90 |<?, $Y<1t?<A$5 CP9G;#-=EP35'R7Bt3'S9-U $/'&'3- E1%/N=JQ0QQ;3;73~0Hh>d 'Q'&pHx$ HxFHxK] EO&p NYFWNV CFWFW=4D=88Y#D4V%~D +0EI?-C, EE 4X%GNNLA56 ]LAffLA%82(h(hRV%:N7GKAOTWHS?IFNF5?'AOSSIJ<AO-;,#41y-['-IJD.omI-/$ R1-/,N5-/R3 A iHQ0JiT* 6FFpSG . !!0GPP#&F,LBm9nFp%s"[YE32WpN6LpO#-G5Ul ~NV4-zB'C.F 42XZ;]9(y2=*`2=TUl1:@A 7IB$`B$?L<%i1@G:KXZ@u*SOR@Zf@WKXZ7IS>6~JA6JAJA732c4o&F)32KJ,KJ'32KJKJ%?>?@\BR32ACHQJi* 6FFpSG . G nlSt 1 / 1Y6I%~D +E+C, EE R+'/E;+6NR8"->C (WNX;.N!ORCK95[#|I<'NR7=iT,N;N N'-'-5 [5U:Y&x6?5RWcF(`"{d?z3S"IF6:"I~"{'"{UFPH _"\H)G HLVN%(k )OkFA(k />)X#:Ok'(#KJH HOk*W#-_1N7O$U(I:Peen?6)\:e5WV,%NGv%F%1RW2PU)3MI&Z@(V08I5A Dv%,8FI$ 4O/"2/0wBQ#0(BD)TB;DDV --V V J<"<lKt8RZNXC=%WG9 :KVF m9 ).9 KKVF4%#=-*M#(5BM4)ME}:d%dJdH 7UC,A Y@<7U(o:x!4'@lN(oK(o@l T:tJp)5V?\S@N5CO$1kGqN5 p 7g>Y>"tN1%")%lAaN1"LpFF qF <$ &N54l'-'-UV@xGEQ)p$9)`QOIO:lQ(OO(.Y:-/$ R1-/,N5-/'-$_A$2V4KQ$&<=$&]R5Q"1O|YARYR%~D +YECC, EE -<3&ALW4>B+K+ N cQi-0R@0Ve f 2Q@N-@??? %=I f@;??Ne;A <R-Od'13' ,OX7A2$/*#&E3-E1%/N=J"?8K,8\8@E`fK-BT1-?? 8-4??4)FM-&N@EEY>&3*4KA .@KKA5KA8*4??OX0PO( OOP /Y=?6)\D=WV,0%GvN[%OTA%78=4O4OF5JHtF`F`LF`KN/!QR QC=%F182/!GWC-KVF m>. 
KKVF4%#=-*:+X'O?Wj8"g>6u$8/HW,t._NI-:Lz ^YN)J)JOVLz DLzS7-N8^ p f-=; 95r;e65e0;een0WWY<"DG" V.F"UIJ"-;.41y-[} A/7o{!/X</ eUkH/'M eB& eJM?7""-@M4U'82.'D6W,V1=$/56&639-E1%/N=JDl :.K5,'CR#HC=%YG B@, CGPBVF m[ B*. B KVKVF4%#=-*"8T2F =*,F ?W?Jp/:R\"W?[6n"3ELNRw9Q-ZRw;RwR\EL#A(t6)g6H%%xU:Y6?5RWc}FF`"{ +$6:&x"{';"{U$FPH "\0 $ XS5s $(#=v(# $'-(#(#9&,;4/;BgO"/GU?(!0(0/UY"PwU0.!w/G0/11$L|  8~89p pM#32&32,KJ323E!/5ODu)27SgRDu:<4 :: SO1CLMuR+'/E;+N*8"->CWNX;.N!ORCK95[#|I#,KpE< mp'p 0J UGjGs(XCV5sGs(#'(# D$D$GsC2(#(#!"Kl%l'K[D'0YXK[0mG58F0mB20mF)(`)C)E+:-/$ KR1-/N5-/'-5d4X(T8; Y'u& C-@MVN'82. q'$(k6>)VDi U/Q6- U4ee'>'/ F5L!!Y6??A1c',k2%ANh/Xq&8/V/IXcNhXI9L 7 OL OL L AN*S*C3%}EE^C Ny P7!>y yD.CN!YH 7UC,A Y@7U(o!4'@l(oK(o@lY RVN>?,GAOHS@vQAOSSIJAO-;41y-[}3] Z&;::|%G&a|; SOZG&" ";?'KKFG& -'0n(.wm.w' 4aR@4U ,"z=$/*6=HOR:HH<#D 5#0#V/(#'->?V>M> 18C=NL C*%NB*S)*NBQ" M40AE=*HRl0A1W&! J1DW?/C 1W6R\1W"W?[6n"3EL4 CC+RQ'K[D9J9RO'8"K[HG4~O4;H(7Hl+O@] QjRO<(# ]DrW9F:)z!H'-BGE[?IBT#PYI)F3N 4.O m @6<RR?6!(#.O7<<3e?/LSNa6.5&K O(#(#4 &.O4jP4S<2(UPH 9n0@ HH>Y6|Jx;B^GS>=d>?RR18vK=:I(nAK ,XO(nN5(n'-LO2hUPH F>A.HGPX6|,P}IJP-;41y-[}:GW8 Ia6/$0S(Z6/6/(Z7-@1s-_I"^=+<44ZHH;+PP%GXGq9Oq9!f0EI .7A#.3W.s.sZIsT89B 4S+ H 4N7IvN7*]9l 4N7?YN7V1-PU OD yVN(k(k2>)J0L-Od'13' ,OM7A2$/*#&)3-E1%/N=JtXR3 G8 iHQJiB TFFpSG . !!0GPP'3 ;|IE,8 ; %MZ %=; ;& % %&(A6 3- 5 5 h5 8 =0! ?0!0!C2aXP5A@P  AAP3XcHwXcUg@lC@lXDUP)D HM(>Mx '-HMGK%AO6|&GMFKw4%MAOSSK(m%AOG"M2Kw:4KZEG.O4j4M[$VPFy^.HJH31#)D312(22? (F"O (F(FM==C @-A@-n/Ko1XWo!.O"CXK5,'K~CR#H>=%YG B!z C: ; ( m (R B*. 
B KBK ;4%#=-*KlDAp 8 4^OX4 2U -4 p"="Fg2FvGE*AM)1/NPE?J>Mo( 5)S75s.,Xk0J~K@3E!L=Ny@wK@23W))IUNy=?=;2+2D8I0NySY6 )%(F(Q (GuSNEh \Gu3AAP :BDr0U9Ec9=% B<T, $Y<1t?<A$5 CP9(<%')6 )0 ()V"I"I-e D*G5QL*OOIO:lGIGI*OO:2 ZP: xO$:H;M$T,S/6 $T?4IY<MP.PH5dX #1< RK X:&Q&QG "V#UV#V#:g 3H 35323:52.3K.J2./GVjN MII-;SxG\z&A05o zKezDABG\  P P+'"+4+!K!K, pU+6SRAQN(LU-D."IF"Ih,FN' <C-/$  CR1-/FN5-/'-'-#@cA,/9 A@!:Xq R3@5L@O#-G5S ?FNh%s;=$3IXcNh=$XI/y:B:Dr:H6-=%KBG=YTRE:FN5=IJ=-;:41y-['-}Vm;-F`)F`LL0RvTF`Y6I%~D +R{E4+C, EE KW:B6K"*-KS?v 1S?JK,C%-o44Y iY%L -.5&+GM#N_+*`G*`0+*`*` Z;:;S-7I#Fz l?5O'1Q,LCPV2Y !0Y!0)!0N^,mqI^#^Q? y9# 9#HX9# 1AQ%$&<=Y,52-}$&A3.RW*5 #@ UA\AI, 15WNQa6%!A6%5=6%H3K2/y:B:Dr:H>,BGK=TRE@@FN=IJV=-;@41y-['-}h#&F,LBm9nFp%s"[YE32WpN6L4 pO#-G5F&\Y"vV3%Fr%7v%*8*T*QR$'I7C6; C CG&, m x$] CR&,7$MN76LW@ $5272 2,##TH@Gn5`3O8}{;4F8}'rP7N''CT.C9N:C,MP77K>V6h;'jW;%KY 6;DhKP@ 4,HW nKPTH{L6>{? EN#E>JI:N'-D!CGDA5!A DA?##Fq#4\s43C4\.*.T+4\.. T ELD%U\/NLU  D^^NUP.UDTD^ HQ6$P Q-uP77@cP; 774UK,7H$/*HfX?<+Dr H:K()z!c'-BMGK@=T;GF3NMFX4.OM=.OK(m%=G"M2XO4KZEG.O4j4M[$VB>*"G&$-<G R6#XB5RHPY6I%~D +E: +C, E eE K5,'[CR#HPQ=%YG B C2PQ> m>? B*. B KOYTKPQ4%#=-*>8pW& WDWBUX@/"/P/UCLNQ;,0 HLVN(#b )B qP#V(k2+>)K1HD (3 (4W68(#4W [LS?>K H5\(#?s! N*'4|F}M: >{$1I ?!1I4)>t1IRGd!n4$PK?Y #L$/:\9i 7M!57@;:\N2Do-:\R M! nlDo R3.!w/GM'MH]M00?-bO(I%4 (I(IF4)M,UXC GSxN W: 7CI-SxACq* > !S8^p 1AQ$&<=Y,2-}$&ACR<#@%A\AI, 1<WNQaQY"`Y!!A!FMN VT>21S.We6Qf1STB1S,Q"`*.h /Mf!g 4YS}A 4@@ 4C8=8B98VG++Y"cIS>&5S!UL~L~DyVA'VC& F> >f'-MGK%:( CAGMF984 m%M55K(m%:(G"M2N9:4KZEG.O*4jG4M[$V"C @- <04/"@-@WELD%L<^#)UD!o+ E!o!o@HzYC,A @R8(oG4'0@l(oKF(oSGR8@lO(l=DB;FG=F?=E^XPMXh: HN(#75 ) 6 qP#w32+LrS9+HDS (3 (4W68Lr(#4W Lr[LS?>K H(#?s! N*'4|GA GV/G2 Z38CC2(&2+`5_A^S!^:9A^77A^H@Gn3"P&qP%5%GEG$W $XXWS5s $(#=v(# 7 $/3(#(#/3<) BRFN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(57:(@@N.4U@\IJBLL6wO2yUD6OKOm6<~OO-M4Y i'QW0HTY=Q -L$/=0&=3--LE1%/N=JAAM8H)JMvPH5 R| C ), <-/$ R1-/,N5-/'-'-L9p !@0MKYN#*S; c.K *X3/{ (#8zAHnKTHK5,'[CR#H?=%YG B CG8? mT B*. 
B KK?4%#=-*n#WW4T!7.>T!V5VT!VK5,'CR#H7=%YG BK CG7 m B*. B KK 4%#=-*)B2)B-)BP $ X*S5s $(#(# ") $(#(#(YX f2V YAe/I fY-p!sXk!seLHRU/TA64gUx;/TKo1T)"UWoIOE1Q1CLHUO ] Q+&&/.Y9J:5 O'81.8AA0 O48Y718l+O@] Qj%;JH;GG=(2;= 7@ KA-/$ aR1-/V+N5-/'-t2"ttB<$`:KB:K9uGOR@FDg3-/$ R1-/N5-/'-'8&%8&8&22*8%,'59nR{$)J )%,J"37J:JSJtJ!=dJ?V>M>M4WE0A < <6$ XM6N56Yi-p'-'-Iz 7IzZIz6Rf6,6?;?[U5"H22h*-/$ K)R1-/N5-/'- 45N'A 4@@7I 4K@3K@zW)<0SYS1Y/Rb;Y/O= LA$ ]LAffLAP.-@M4U'82.'D6*,V1=$/6&63-E1%/N=J \I m>s B*.S\ B KKPQ4%#.=-*A;HK82<0T90W2Q';.KF 3Y8QJ?QN5A 35 CP9 ++%g%g,D#7KSSIVA&CK >>'0>Y.!#o!6e!PY ?6Mx)\HO6|WV,0%Gv%%O T TH+8L6mE>8$KGPK mn333GV/& r@G'-:'RmhRGd!n4$PK?Y'#L$/:\W/:'5M!%:\N2Do:\R'M! nl >>>.0FAF?\N5CN5; ?D u?? ?'3 ;|I,8 ; %MZ %=; ;& % %&52U+ D;D;o5 D?S?Oy D??(# [U5 % %)v*E*AMYVPK2X91Mo%#TX2XTC8BB,B4 1AQ%$&<=Y,52-}$&ARW*5 #@ SA\6AI, 15WNQa>(o??4 GJ("h-?4%K/R.UV+\%R=%Cn(9>;}JD/%XD/%/%D} S;E;uA;HK882<0T]0W2Q?;W]"pY"p!QJ?QReA]5 CP9:(?\CXCN5' -CX<X|<<1X@m%~9LL11M1TXsL;Y1s-K^I4OS F5 6S$S$;Y3H $ X S5s $(#/\'(# $(#(#>TK.5UkFz; l(.18XZI[PIA WfI[)I[WKXZW6~G`V61$1GM1 & mSD, K&4$S]A/V 2M89T3I (i8>D8<M(iPPXG'L=:C8QL6f== 5098B2*$+aj*A=*oK H(#?s! N*'4|Cg 1xB!8,a)J$ X2V-;-6Q-OKg ;OKAoOK A2=8H C;.V0#%@HHM u uyOjOj3cOj.I: oP o-iSF"@3W Wz )0 (X>%I7 X(=FX%c[1:I$d6:I<4 P#MUPBimBimPBiBi#7&/8Y=&v:56gQ #I.Al8A8$<2$4l8YT8<Z+A+2$8# k8!XI HLVN%(k )FA(kRP>)X#;t$=KJH H*W#G88KlGj -FX f2GU  Ae/I f '-e;}6#66Vw+XB5|25X E:o"E ( *-YYF >+F 6( F 3W8Y8@Q : E() W} EL/0K0cE(A L#!]U)L 3WE( %Y-!]3&{0> f&{(#@(#MI f&{'-(#(#eJ-J"pJAC-I38C %MZ %=;C& % %&P)X#S %KJH H *W#(G+R J<"h-R08vYh\@=<Cn(9>;}MSxSxA+q>[6+q+q@J 22"63'599nR{ <8.%J%G cS%J!J8; .2 cBySQ7d;9;e65e0;e-xCVCPCI: oP;Q*7 oTB*F"OZ 1TBPTB1I#N7H7m' HIHFF_F#` ` `HFFA^:9A^7A^CG@4 BR - s> f2X7(#@(#MI f#>/3(#(#/3e%`;E7/dE7VBJJR:BDr*k;=%WB1PT/Q=W1DH;.1!ORQ=K95[#|IQ$&<=$&%R495#@!`AI,5P25n?6)\e/WV,0%GvXf%O%(h(h,yW8DVD85D?S?Oy88D??-V"++Va,-V6J-VTh"-C-+A+-P@?:XJ# f 2+<75s(#'(# 2D$I fC2;(#(#eW B>098B*aj*A= * HLVNS(#b )! qP#% (k2+>) aK1HDS (%3 (4W68S(#S4W [LS?>K H(#?s! 
QM*'4|: >836+>6Q>pH>=F{M#X)"6R2PU8_6RKM:IU8_H>H>56S%OG d%OV+V+%O1"R?:(#8Dr9#RC)zT'-BG3B BITYw2I) Y4 mPA@6<RR B*!(#7<<3e B/LSNa6.5&KS:(#(#4 &.O4jP4S<Qp"={MKW/X "2XMKM& JM?7""6X;4+DX;4JpJp;4:#D 5#K0#V/(@#'--:]1-?*? -??FF%sD2CiNKD2BBD23HLI;Q*HTBOZ 1$yTBPTB1RV%NWY?,GAOHS$g*\WYFFRAOSSIJBAO-;WY41y-[IJ}$`:KB 4:KuA G@ORG+'ppp/@WRLL7!#o!HB6eHB!HBG'>=&/.Y.AAWGZ+AG8I'jE)CSKNE)7IB$`:KB?L 1F':K@ruP- @OR@X_@z8@F7I "cIS>Y 1 L I&CUKg-,f}:*>A}NE8&F; ,NE$NE,X,0+-!BQ#_T+_6o._4:KC N'-++%g;HG V @M+:9x LUN!G:):)O3X>%I X(=#OX V';zC& FH>f'-MGK%:( CA:2GMFP*4O%M:(5K(m%:(G"M2NP*:4KZEG.O4j4M[$V0FFF?&#KXZ4 - XDV2-CN H(($Mx'-MGK@AO lHS&yG6MF.I?N4MAOSS2-.OK2-(m%AOG"M2.I:4KZEG.O4j 4M[$V6BQ$ $KX f2TS5s $(#'(# I f $[(##(#[e?UB% 7IB$`:KB?LO'1F':K@uE;YOR@z8@F7IY"cIS>Y %E<;%;%!]?!]97%!]!]R?'(#SCW9#; M)zX'-GE[ BI CKYI)F3 G'4 m @6<RR B*!(#.O7<<3e; B/LSNa6.5&K4:(#(#4 &.OP4j)Pb4S<E# [HZV>HZ'H<S4>V6h;'jW;%KY ;DhKP@ 4,H nKPTH XsRO:S(# ]DrW9F;:)zYg'- BGE[?ITPYI)@  %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K :(#4 &.O4j 4S<)]",3Z ;'w[,~)'F[7[Y_3d6Re2 Z38CC2(&5}2&%hQ&6)M& POM DOMB&OMNx ZP:;S1$ $KX f2 S5s $(#"p'(# I f $[(#(#[e)FMN@EEY&3KA.@KKA5J}KA&=8 <B9-/$ 8UGR1-/C N5-/'-'-:$ @:N5:+= R78v=&{0> f2&{(#(#MI f&{'-(#(#eG$ON 3GV/&G'-RpR"""P,@P P <-/$ 0R1-/N5-/'-'-eeneG)!S=G!D$%DO.Dj@nN?O=8B9Y] 2Y78G:(L'G]3>O:(5:(@@N>4U@\BLLQ$ $KX f2&S5s $(#&'(# I f $[(#(#[e2??4 G?4/,\ %Cn,5wQ,5K/*+@//X & K/ $/KTy*J6JJY4%|<59[/I 1n&[>]GVP-8Ui62>]Dz?>]:?|Ui+HQRV8N?,GAOGHS%V74GGAOSSIJAO-;G441y-[}WF SVoSNX7-@1s-K^-SI 6S$S'D!+M'D 'D:3b$`B:Kuf:C N'-LA 9i6 ]LAffDLAPC &^PN+NnPNN:#* L&:'J\ 5L&2iL&* 7UCJ)|>Y"7UG"!V.N"UIJ"-;.41y-[} ?*$=E E 1XXj?7!8X f2 M 7!Ae/2'GI f7!'-he?Q6JDaDa"Da ZU?Cy-"H*I0eI&QI7-@-_I4;#046BNK 7UCY]<7]HYVp7UGHJ,!C}SGC OHJHJ@@NG4U@\BLL0GB'P5A@P  P J -'Y,j64_1<E^E11>XT PT ##bT Y8 YN#/?P(N%P(=8F=P(=N,N,N,9=/OA^S!^:9A^77+RA^,-A&0V44>4,1Y,Ig;N YX 7n:9*S? 
L 14<*Zv 1S?4JJK,7;J0@'8'Q)E -#\%QF>!.Bc=?F>]F>D80BcSY6 )%CT ELD%U\/NLU  D^^NUP.~UDTD^ HQ6$!qO'>*w@-!q'>:'!i'!u'K" HI<UU:Y&x6?5RWcF(`"{dy3S"I6:"IG"{'"{UFPH _"\H @!Kgg/ AT< D/ =vN=v5;?/ As(#hTFO6 2U-W$J6NDJhJ6L^L {Q.CD*nS8TJ6 HQ.6$Qj -FX f2TmGU Ae/I f '-e@RV%N`?,GAOHHS@v`N?'AOSSIJAO-;`41y-[}.8; .ByD.nE $XS5s $(#'(# $[(#(#[M40AE0A$ :IA,X/rN5'-4LhRWPVA'FA}A>QAd Q'OHx7A HxFHx ===;mc;7 ;P; ">'5`55`Gn5`;3RB1`+RB'&' RB'''.NG: vE/ Q)EE"G#(5# 0#Fq(#BZBZ?kNL --WI8o ]w$w**T0/!.U0/#NN 2U Z,-Yu% Z7: ZMM5=2l"KYN*S; c.%KE*3/{DPAKP `3<{qz9Oq XR>Y97?!g! DR7?BJB7?,R"`*.h 7IB$`:KB?LO'1F':K@0uE;YOR@z8@F 7IY"cIS>Y 1@O dJ@O>@O;@d(#+2lVJN4M!J(JNk JNFkR&{3>&{(#(#M&{/3(#(#/3RF!RF7FRFG2JN7O?9G$NNG F%s 45N'A 4@@ 4 E8')9F:Q".JR$\!3}M/JRWAJR!G".WW, dBiBiBiBi>N?O=8B9Y] 6Y78G:(V$G,Lq6C OC C:(5:(@@N64U@\BLLA;HK82<0T90W2Q;.KF 3YQJ?QA 35 CP9.^LEH;1/ NR1/H11/fP7U7U!E\ T T@5+>{'3E<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB 47?, B5K"`*B.h "3i:&{0> f@&{(#(#MI f&{'-(#%(#e) 8 7UC7U!Z%?@\?@\2B#7,%?b@\? BFNN1I@\!dKBbN?$!dW-!dA7N/W7K;6?9XHHE:NUO*=_<*SQNN;ItH<It0g"":EY R#339Q  >>>F&IF$EC$_F$!SQ4QQWJ`DI9+HW&{.@> f2!&{(#=(#MI f&{/3(#(#/3eNV4-4!ZN-O62J ^#)D =t O=t7v=tX:&Q*,^M1K :'('(UM!  " <L.1-#..J%M\M\0 (D'L*XN/)/)IM;N4[P4['4[ )  (B iHI1/)2,A Y@/)(oM;4'@l(oK(o@lXTIO/ 21O0Y(GWC2|0 02|+R R8v-f+= R8v= ;nIC,8 ; %MZ %=; ;& % %&vG6VQU:Y6?5RWc}FF`"{ +$6:&x"{'"{ ~4hU$FPH "\D*G+~#;YR1XrFTQ3KB1LJLdVXS7Q3WIvN7L U7L |Q3N7#}?YX)1)1e3Vqq>Zq777Nq7dEWvH7HH+W&{3>K&{(#=v(#M&{/3(#(#/3UPH HGW6|LR/4P-;R/X7>QAd 'Q'82O&pHx 7A HxFHxK] EO&p NY##))5##,<"I##T@(5AA}A)]<{q9Oq X>Y97?d! 
Qz7?BJB7?,Y+Q"`*.h &[VY<V0{V&t5MUV@xG=%g?=)X+f: 20EuWC2|t0 02|>s4OY4OPaF5'CG C?K mS B KKTFO62U-W$/N(J%qD^^ Q.DDTD^ HQ6$Q$ $XFS5s $(#'(# $[(#(#[Q|iJ 77A[H8D HLVN(k )/zFA(k>)X#_?1KJH H?1*W#RO<S(#+DrW9F;:)zYg#'- BGE[?I-TPYI)VHN %4.O m @6<RR?69!+S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<!I@II7:*" 6:B;&(wYP;&E;&("|+BXDVCNH0(>Mx '-BGK@AOHS3,GMFKw4MAOSSK(m%AOG89Kw:4KZEG4j4M[?B 4S H 4N7IvN7*] 4N7N7A;HK882<0T]0W2Q;W]"pY"p@QJ?QA]5 CP9KM!BADoDo9[DoDo$8/HW,t._NI-:Lz ^YN)J)JOVLz HLzS-N8^ pU7X>%IU X(=!JX%cN *X 7n:97*&Y L 14*Z4Y4 ?\N5C4N5 s*>Y*M<MPX (1s F:B:B< ]Dr;4)zF#B'TQP7F3N'3e'O S3f Z;:;S>8 IY8HB ?I+GhSJ7 HI+N7N7IvN7*]?j?jI+N7N7)5V?\N5CO$1kGqN52<  2 YS@"tN1Vu%?@) %lAa"Lp 0, V :gIL LHRU/TA64gJx;/TKo1)D<,JCWoC=1Q1C9LHJO ] "ZGSL'G;G3"c%E<;%;%!]?!]9%!]!]G 4" HLE8(#7I ) qP#&|3H2+S9HD (Mi3 (4W68(#4W [LSS-'ITH HMi(#?s! HN*'V7M8rB 4S7 H 4N7IvN7*]I+ 4N7N7BD#nD7 ++%g%g$?C0IA)Y$?)NV$-B4lZ*81(yX4'<"<l81$$Q)V*@-N]QF`KF`LF`:$K3jPn33jV/& r@3j'-h% a&S3HX M!7M!M!X M!M!!FK5,'CR#HD=%YG B/ C,IGNC mC S B*. B K8KGN4%#=-*XPEtXXG1VYXrFeY7YYJJA[#U7A[L7LA[L$6 $X f2S5s $(#'(# I f $(#9(#eWX.>SWX(#(#MWX(#(#4U2T,HMV,G5#Sl",D#@V1f+GY6I%~D +R{E+C, EE X'c 1@4T.Y?22,T6Y< #@6%6I, 1<WNQa)@B#4J  j?38X f 6 +W 3Ae/2'GI f3;'-2 he!>H< 'Q'82ItHx0g HxFHx  DM40AE0A' JpA, TSKT"2TS'&',TS'''HV-CrB K(qV-LJKh,,WIvKh.=7*Kh |N7#}?Y UD="ZT'1R+ ;T!.>T!V5VT!VA'<A.A2(UPH 9n0@  HH>6|Jx;B^GS>=d>d3MV/& r@M'-hT ELD%U\/N-WLU D^L^UP.UDTD^ HQ6$,t%UU%@@P?E8<?"2q?=U)1)13I5'A rINH,I ??N?O=8B9Y] VWY78G:(/G]%VWO~:(5:(@@NVW4U@\BLL73W8Y@Q : Ki) W} ELGK-mVS%AL#!]L 3W% %Y-3M!XpLO9O&MF398=FOO9YF2(g2S+g?U7YH8<?""C? T TT+1GE %(s~u1G(;D>pF%1RW2PU)3MI&Z@(V08I5A Dv%.6,FI$ %4O*4G"YG"}G")]WJ$Jm-L f2'1-?? 
I f-4??4e K5,'CR#HD=%YG B/ C,IGNC mC S B*.J B KKGN4%#=-*8 !F0:.AV=%N.N5.'-h/&/8YO&.AAO4AZ+AO88IXpL80o -!LLN?O=8B9Y] 6Y78G:(T<G,Lq6C OC ~:(5B:(@@N64U@\BLLLHRU/TA64g(x;/TKo1S)/O(Wo=1Q1CLH(O ] =1P-]/u}L+!U1AN4KUU/T/GKoP1C)' PWoP1O 1,g0.!w/GLHP1ODl 1; $RO<S(# ]DrW9F;:)zYg#'- BGE[?ITPYI)N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<NV$-4=Z$8 =0!6&4 ?0!0!C2: >I +>6Q>W,.T..$mM Z;::|%Ba|;)SA1 A1 &?'K/1 -0nE5  P)JUV.1-Kn#. %.J%3&=M393I"?#332UHr L' r> I)J@WGN77GO?9G$NNWGS/B/ fU7P(N7%P(=8F=P(=T8H7n,.0.0J=JiJ5{8 Ad >ANgF,9@>N#5W@o>48!hR\8N 45N'A 4@@ 44Q)JiJ'J-Nh,IXcNhXIXDV2-CN H((!Mx'-MGK@AO lHS&&G8SMF-q?N4MAOSS2-.OK2-(m%AOG"M2-q:O4KZEG.O4j 4M[$VC|V|U:I:M:N FK;=S+)v8?? y+?RGE*AM)1/NPE?Mo( 5 )S7K5s.,XkX2`UPY H4(Mx-'-HMGK@P6| GF3MFBH4.OMP}.OK(m%PG"M2BH:4KZEG.O4j4M[$V#EUVY@x?SmQ7 -i*QT G+O Q%g%g/5*QGe*Q#Q?^.$ &G#:3,4N( A>7oW/X/b/9AUkM+lOB:IUOH>H>5M40AE;40A 7AJp/A*E*AMYVK2XMo%#TX2XTVe>5/35/5/p< Dr;4)zFB'%TP7N'3e'O=W2;TW9:E7E7,iB \!4B+* |<592[/I W &[>]GPA(W 6>]Dz?>]:+Y|W +HQ=VKTT ;4+DX;4JpJp";4 6EV 76 1<E^E --WIK5,'%CR#H=%YG B CK>~& m&9 B*. B KK4%#KJ=-* P k 7M65&o9|'K'&oB=B&oBB+w;<VG9IVG8wVG;m,H=4M[WX Ae/!s!s22*8%,'59nR{$)O( )%,JP37) O(GSG1J!=dJTY*VF m?6.c? KKVF4%#=-*1R10 41G'L=:QL6f==M=N(""H LC,A @7U(oB7!4'@l(oK%(o@l0YTh#T }+HXG9.X f2(S5sG(#J'(# 2D$I fG(#(# eRB1`+RB'&' RB=3''DIPPPP#EUV@x?Sm7 -i*QG."*QGe*Q#?^.$G.oI/!/!SbC0C{L6>{ENWE>N}#!"Jm&LQ#!8#!M==IW 0>098B2*aj*A=*oX@CC?'*--G,f}:*>A}NE&F; ,NE$NE,X,?R1X> E%IU X(=B'+; EX%cT[ *H.5# 8,0*Fq(@@* )FLE7PLELE9H Z;::|%Ba|;)SA1 A1 &?'K1 -0n"Z$T*i$;$ JI JK JV9RO:L(# ]DrW9F% :)zY'-BGE[?ITUPYI)<  ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K :(#4 &.O4jf4S<W7Ps>5b7(#=v(#M7-p/3(#(#/3L22*%,'59nR{$) )%,J.B37J%S8J!=dJ)o'<M5A[%;{[,FM$ 4O'U%8? y==C<q15*T:T*%"%%)3qY<R? 
7$4l8Y8<Z+A+$8# k8!XI9[0LPPZ>uud Q'/!&pHxC HxFVHxK] EO&p NY(8ON'-'-:-/$ R1-/ N5-/'-TUYx3aFKYx & 7YA C#"I: oP?S4A7XV o-i F"**C 8Ke #C?^.$-=EP35'R7Bt3'DMS9-U $/'&'3- E1%/N=J:EZ..TA-P8!n$P$/GL t:-5?:\:-'CU CC D%Ia&Y /XIM!M!M!XIM!Do 0M!!F+&&/1.Y9J:5 ;'81.8A0L;4X8Y78l+;@] Qj">>>>Y.59}Le)< X+f: 207EuWC2|t0 02|>TFO6 2U-W$J6NKUJhJ6L^L*uQ.D"3=TJ6 HQ6$QL9p 49% R e%8%<\TR _ RR $X f2S5s $(#'(# I f $(#(#eL: L!]q08R$,[K~l96 l CY=NUA (W (8 C$];. C!OM~TR K95[#|I$'QO%''R$,[l964ul CNo4uW C$];. C!OR4uK95[#|I0)C2#sK |<592[/I W &[>]+zPA(W 6!3>]Dz?>]:0|W +HQR$K89S61 ;1/| CM1X3GW C$] C,5):/|RG!p K-BT1-?? -4??4S S8B/B/:FN'-4/BgOx"/qUC(!0(0/UY"U0/'b9G88G6%&*RO<L(# ]DrW9F% :)zY'-BGE[?ITUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S<9C NRwQ-ZRw;RwEL'MMN *X 7n9* L 14*Z4H[O62VbJL3+ 4m89p 5qe<*@  cK5,'[CR#H?=%YG B CG8? mU B*. B KK?4%#=-*T'#5R*U2p,#5;#5LZ#D 5#0#V/(#'-O,TDI_TDTD-@M4U'82.'D6W,V1=$/56&Hx63-E1%/N=JARYRYM< $ X%S5s $(#(# 0M $'-(#(#7-@-I Sg6^C g80*@ / 8 = L C2=J*%#3+%6N*.7>3 XVU#. L'(#& * *.N@ISH'SHM(#SH 124MLMUo4#4*!HX<*GU!n AP $2WM 5?o S8^p EN '+; E[N Q (?*E*AMYVPK2XMo%#TX2XT%N'-_4T.D)Y2"T5H@V5HW5HVB==K@3K@W)5<?'<7vI<54*?58N *X 7n:9*&Y L 14*Z4@a # # # #1c',k2%ANh/Xq&8/V/IXcNhXIN:;3?T$8HW,t._CI-:LzO*+0oLz /LzS8^pL N>( A>7oW/X/b/UkM+lOB:IUOH>H>5rXUBR$r=Lz22*8%,'59nR{$)J )%,JX37J:JS1J!=dJ--E5).'E7/d):U91P?E7??BY8Z1 )S ?7?/91s.,Xk+ NV ! A/7o/:Uk9DoM! 16t,7 / 1-^"Sj1$ <X+|5s (#'(# [(#(#["@QQ>/,|nPPWW4j4M[4?I BM1D7r(H*+9+++++ *HU:Y&x6?5RWcFF`"{.H6:_F~"{'"{UFPH "\(fO?,4<<NW<<Le HEP35 ) RP#2b3Lr?S9W LrKJLrH H *W#(F <L$ !LC N5LA'-'-!$!n APS"6]G  $2/9:\9'559:\N2M!:\oY +K5,'CR#H=%YG B CG=e) m B*. B KK)4%#=-*6Qi-#D0# J#DGJo&JoLL'tJoL#33Q RGd!n4$PK?Y #p$/:\'%z 7M!57@;:\N2Do:\R M! nl7 5{B99!0#n#nQQ8s65;C8C'C#7&/8Y=&v:56gQ #I.Al86A8$<2$4l8YT8<Z+A+2$8# k8!XIE7/dE7B6@JJJJUT[6!A^+ 2:9A^7M!7A^C:K/N<'-5d--9V2>--E5) <DG$ DGFN5DG-p'-'-&=;&k7,%?@\? 
BFNV1I@\!d@SB7?b!dW-0!dA77/W7K;6W303)e))># Y8 ;&Y/7?CP?Q353139'=W/9==9 $XS5s $(#(# $/3(#(#/3&{0> fS&{(#(#MI f&{'-(#(#eK WsUE19F#_IL=v=vL-:DD#i+ #i#iJRY5)4oYF)KJ,KJ'YYKJKJBR:Ja\M`+&&/1.Y9J:5 '81.8?AV</ 4X8Y78l-!q+ @] Qj=9B9& ((H>f'-8KGK.:(3G0LGMF4O,CM:(5K(m%:(G:A93TN:4KZEG4j4M[:s V/ 6Q6Q # :?B'#!i'!u'$EXF3"0 ;""6 22=!N-N,N,N,9O++%g$Bp7m)EC$F_F#`$FF0&32.J0&90&UCNNQrVMN VT>2X1S.We6Q1STB1S,Q"`*.h 3W  @QLKi) 9 ES ;%AS 4!]S  3W% %Y-3UNTJ0Jq;m,9/=4M[+/-@M4U'82.'D6W,V1=$/56&63-9-E1%/N=JFF6[%s>s52.K:B:DrRFC=%ZBG?TY*VF> mN?6.? KKVF4%#'-=-*H9KY,A @(o 4'0@l.O(oK%(o@lOFG=z 3G"V/&G2?ARYR9nM3 fYP:gGW=4<GSP:N=dP: !.Bc?F>]F>D80BcSY6 )%$`:KB:KuGOR@FLHRU/TA64gJx;/TKo1)D<,JCWoC=1Q<1CLHJO Q] /y:BDr:H>,BG=STRE@@N=IJEz=-;@41y-[}95^%%Hi2(UPH 9n0@ HH>X6|Jx;B^S>=d>3WP3L U7L |3N7#}?Y9[9DoM!T?"(.wm&q.w1R4a5%GEGR?'O(#SCW9#9*ANgF,9@>N5.#>48!h$"1#(9FD05# J{#EGE6E U&JN4M!J(JNk JN:( GwJW//bGw%FM+lO=BOH>@mFILW<7W<_9T PT ##T $$P=P$PEII2?<?<sC0I,>KsJmPN0?@Zm/m;E=1Jf@@|,k2%@/M:%&/V/L$ LN5L'-'-(82u-9D(z(zCI+*VCPCI |"V?"Hb"$Bp7mEC$F_F#`$$FF$HW 6@?RJ LU > )*Tk?!(U M@ !(;=&M1>aY/;%& ";Y/9-DXE{*Tv8:*T&15&*T&&$:{$DoT6Do;$DoDoY6I%~D +E+C, EE 73% (h(h>$ 9@>N5>lB<R? K)B< Z(EWU_URq/|Q8+H%-/$ R1-/N5-/'-@=1;;;6<RRA<(# &jUG@&I0'WLl*" -PD)M)M&)M)MD a@o0&&7 I} ;: ; ;K5,'[:CR#>T=%*G B^ CO%T> mN>5C B*. B KKT4%#'--*DE*s4OY4OF5,n76 7,k8HVCX>%IQ X(=CX%c+K&'N?O=8B9Y] VWY78G:(KG]%VWO:(5:(@@NVW4U@\BLL:):)Qo6w6w,"^6dLL7i!n$P$$:j7`M"Z*`6j17`7` ~4F4N &3^4N+Nn4NNFl/N'-L'-H9KY,A @(oV'4'0@l(oK[(o@lO <4NN'-'NRF*7=iT,N;N E%W5E=^YE='E=8H886/@8Ax7A^5!f)5568F="P?.F"N2NNKM)FM-&N@EEY>&3X*4KA .@KKA5KA8*4??OH^:nH^>{H^UB8^(NE(N@(NXTIG/ 2O60GWC2|0 0P:A62|>>4 (GXWPeen?6)\:e+tWV,%Gv%:EZ..DT!n AP$ A4$P/bA/V /8# XI (i488(iPQ8j%$X  f2 e% AeI f%'-eTj:K:KuL??4/BgOx"UC(!0(0/UY"U0/GE3#VN(k(k>)PH2?H2?+"1GT&I9[>P* 7UCJ)|>"7UG"E!V.N"UIJA{"-;.41y-[}VMGO-*VMVM1+ X j?387 f 6 +W 3Ae/2'GI f3;'-2 heI:P;Q* o TB>F"OZ 1$yTBP=TB4m89p 15q005??-?L> <JX f2$J5s <(#'(# 2D$I f <(#S >(#eO7G6NRGd!n4$PK?Y#L$/:\#BS5@;:\N2Do:\RM! 
nl 2fJd33F335# 3$ o))))0Wx<A^9%:9A^7M!73/A^TC?s]65SB- ]@vB-IB-: $X f2-S5s $(#(# V^I f $'-(#%(#eI? B1 1'HV-CV-MO8}{8}rA!S'QQ0OQ#58>{H^ "E ( *=-G$O 3G"V/&G'-*RLYVK2XL*%#TX2XTUA93TJpXZTWKXZ6~)s(`)C)9bH 7UC,A Y@7U(o!4'@lN(oK(o@l~KB))))-.Xx:*N*NT6Do*N#&24n+I#N1MN1V#N1N1E7/dE7LB A5%wS|Y2 6I9nH{ <9D)i+G5-S9D*a=d 9DY97?!*5Kff7?BJBOu7?, 5K"`*B.h U#V:+'O?Wj8"g>6uW%Ia&Y /XIM!M!M!XIM!8 M!E) -/$ ,R1-/N5-/fQ)p$9)`QOIO:lQ(OO(/G0.!w/G1$j%$X  f2:J e% AeI f%'-eJ *J)>JIBIJIL'LGzLPeen?6)\e5WV,%NGv% "9GO7GDNG"NK5,'%CR#HO=%YG BQV CH`'F mF" B*. B KK'4%#IJ=-*',0O%''A;HK852<0Tt0W2Q'G;,&ElYQJ?Q0NEAEl5 CP9SY0PSJP> f N;%-KSJ(#(#M%WD$I fSJ;'-(#(#heKB6KK*T8:*T&15&*T&&!A2G;,!P7/Y ?6)\D'AWV,0%GvBI+'%OTv-/$ R1-/N5-/U@\(u LDGGNY" 7UG"!V.F"UIJ"-;.41y-[}b; D&Z@V .">5ADv%,F>$ 4O7!\TIO/ *rIV1DO%BSUG@K7#?%B -%BA77/W7K;6hN?!%07=EQ+Y< 0 9=AU& Br" ?U&&U?mHmmTh" HEP35 ) RP#2b3Lr S9W 'LrKJLrH H *W#7;    I1616 16PK:B:DrRF=%6x#BG?{TY*/V> mN5?6.? KKV4%#'-=-*P25n?6Mx)\eWV,0%Gv%OM R?'&C(#SCW9#GM)zX'-GE[ BI l C6I)8S G'4 m @6<RR B*!K`&C(#.O&C7<<3e B/LSNa6.5&KG:(#O(#4 &.O4jR4S<: ";}F7E+R?:(#8Dr9#RC)z '- BG3B BITYw2I) Y4 mPA@6<RR B*!(#K7<<3e B/LSNa6.5&K%:(#4 &.O4jP4S<I1/)2/);>M;Ph#;953%~3KR?'O(#VCW9#6zM)z'-G. BI C5UI)F3 G84 m,C@6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:(#4 &.O4j4S<YjI%~D +D.E;lC, EE8; . By<{qz9Oq X5K>Y97?!*5Kff7?BJB7?,5K"`*.h T X1;@PmPZ3P.'E7/d):U91P?E7??BY8Z1 )S ?7?-1s7.,Xk4IUH)'WWS);5:SS)T'D<S$'& =& '& & L(8U:Y6?5RWc}FF`"{ +$6:&x"{'"{U$FPH "\:w9b8FI4 TK?vB ; .('FHlN?O=8B9Y] 3Y78G:(:G]Q53ON>H:(5:(@@N34U@\BLLK:BDrRFC=%ZBG?TY*VF> m?6.? KKVF4%#=-*/qARYRY 9SCP: f2 (#(#MI f-p/3(#(#/3e(1,;b4,A;HK8(#<0T q90,223BQ;F"HD (wYPA(4W68QJ(#4W Q[LS?>KAw(#?s! N*'4|QK1rND@-C-+A-R$,[l96;l C N:CWK~ C$];. C!O!RCK95[#|IW2=8HV0#%@HHI(1Y1Vx SVoC`/0<SM<DPCME%MR1PS P O+>;1XrFT KB1LJLVXSDWYRL U7ML |?TN7#}?YlJp< ]Dr;4FB'LTQP7F3N.O''O-HcEXP+0"N XP&MXPD)3*B=U)1+%H+?KsA)1VJ3 'FQ LV0< V*FN@I-I1/)2/)M;#-Ev#=v=v#K+1;- ?5K-OQ8N}5Q;\ 55= / G2u1G 19\6t,7 / 1=G=F?=E^@B)~Re6LW63d6Re9QV[NZ+= Q8--u--RGd!n4$PK?Y#L$/:\WS5%:\N2Do:\RM! 
nl U OD y4{CBfEb{8_8XL{88#T PT #T ?'^8<?""?3R'+R R78v S9:'3 ;|I5,8 ; %MZ %=; ;& % %&>(,PC+*C*6+9 mS9*(=3Y14'!*(IJPIJl*(IJIJ:HK u?O?Wj uyg>6uY%6?AUw%5 55= NV4-4ZF V(yXVWr9B!J]Wr=dEe=d Wr=d=d(NE(N@j(N*S'`$SO.S/ j."IF"IK4,I{E&$$E}:;yJjS&{(#=v(#M&{'-(#(#YjI%~D +D.EC, EE8; . By;H 0}Bq7m0}@BqF_F#` ` `BqFF7pS D/y:B:Dr:H>,BG=)ATRE@@FN=IJO=-;@41y-['-}WBA]/<W5WYjI%~D +R{D.E;lC, EE8; . ByUNK HEP35 ) RP#2b3Lr?S9W LrKJ9~LrH H *W#)CN(8Cl885 ,55J22;z'599nR{ <.%J% CM;S%J!J8; .2;ByF0&"CURl&Z&d)3"[GJ>)o'<M5A[%[,FM$ 4Ol4'<"<l$Q==K%t2"ttB 0 L;L7 LH@<+2kO_<rLr,rH3;;!;lnnT W'X+@/W/& K/ /K VTB"W $KX f2S5s $(#'(# I f $[(#I(#[eE9 J8 KI.7xU4: YS?'/K.C'd<'v 1S?E.JJK,#A 5#0#Fq(#BZBZ&kY [HZV>%~D +HZE HC, E E @=1h-/N/RZ/ /K1r8*F8*18* Y8 Y/DTCP' T0 $X f2S5s $(#E3(# I f $'-(#(#e WULskFz l(.8I[1PIA WI[)I[WJMBQJSk1D9y7r(H*+9+++++ *H&XDUV@x'2 ( -iT*QG9N:VT*QG?$y*Q@4m89p #?LC$kX5qFJ%s8=$;F`)F`LLF`.E7/d):U9NP?E7??*gBWW45)S'?7?J0L5s.,XkRO :9* #p2+0E7N7Y\77&Cc/&&7IB ;B$?L<Gk1YmXZ@PQ?%OR@z@WKXZ7I%6~LHRU/TA64gJxx/TKo1S)+<,JCOWoCE1Q1CLHJO ; $`B:Ku(Q<*Q AQ. (8u;(#40K@3E!L=B@wK@2W)95=VB?22D80BSY6 )%IL4'0 M?eU/TA/TKo=)Wo!.1CWQ+U098*@j*A=* AV6Q-N?O=8B9Y] 6Y78G:(V$G,Lq6C OC C:(5:(@@:7NN64U@\BLL O6q%   e%8%++'?\N5CN5 +RQ'K[D9J9RO'8"K[H.G4~O4H(7Hl+O@] Qj +`SpS|DCs4OY/?1nI4O?F5:$9Ui6H?:|Ui+HQXDVCN H(UMx'-MGK@AOHS/GMF4MAOSS.OK(m%AOG"M2:4KZEG.O4j4M[$V'EQ;;S**Xl:X>XQ E{L+!F;AN4UNr/GKoP1PD4kGWoP1O1,g0.!w/GLHG1ODl 1; $BS&34=3<3(L#EUV@x?Sm7 -i*Q8G."*QGe*Q#?^.$?'^8<?"") 8?XDe5nHM(&^Mx '-eMGK%AO&JGF3MF4%M_SSK(m%AOG"M2:4KZEG.O4j4M[$VW2/dE7BR32ACHQJi* 46FFpSG .$Z G*!@ ."Q7{("OD" %-Q f2,x1-?? 
I f-??e eKQg3 )H Qg.C.AQg..9UUTL* 7UCJ)|>"7UG"!V.F"UIJ"-;.41y-[}8s65W $KX f2S5s $(#Q '(# I f $[(#(#[e4@"I"IUiiIgL#A@9%GpH>="F2=JH>Bb1 N&HA $X7&S5s $(#'(# U $[(#(#[>uud 'Q'/!&pHxC HxFHxK] EO&p NY{FjL{{ :1O?Wj1K>Xg>6u,7    ?~D>7?~B}B}?~"C 1.BX:C O*SA@<8RR18v4+jY+jG)M+j"N5;YR1XrFT,K61LJL$GVl.,B}N7WB}P3L U7L |,N7#}L?YPCW2/dBE7BJ)S1T` 6 W2W26 |.KA0%KN5K'-0<hSY&OSY<SY.'E7/d):U91PE7??&%B=D8Z1 )S ?7?1sNXk'D'D 'D:3b_4H}.D)Y2",T05H@V5HW5Hu(rs0V 6Vl,W %+Y !]O!] G!]!]0%6Q%6%6W sPs>T s(#(#M s4/3(#(#/3'j.(K4F(Q Q Q*(:IC3A;!K&S%OC3N5C3'-hLMSLoL$`:KB:KXZu).WKXZ.6~6" +?c `!6.Ot#EUVY@x?SmQ7 -i*QY5G7O Q%g?%g/5*QGe@*Q>,B#Q?^.e$U7,%?b@\? BFN:1I@\!dMB, A|:#?#$!dW-.$!dA37:/W7-K;6*B=U)1+%H+?KsA6N)1VJ3 'FQ LV0V*FN@IQ)p$9:)`QOIO:l!Q(OO(/ $ D/ =vN5/ -pW 0>098B2*aj*A=*o"&V8M8MD8M8M- -)k-TB"L-$W3A=K):M3N53Yi'-h522?O+1mIcR?'(#SC9#%tC)z'-GE[ BI C.2I) Y4 m @6<RR B*!(#7<<3e B/LS/nX@KY:(#(#4 &4jP451ELD%L^#)UD<K5,'8CR#H3=%YG B5 CGTF m# B*. B KK4%#=-*%[&4q)) DUAT@(522*8%,'59nR{$)O( )%,J37) O(GSGJtJ!=dJd Q'&pHx$ HxFHxK] EO&p NY 5uM\)JO%M\ f))R&{3>7E&{(#(#M&{/3(#(#/3+/?LH/6/q8Cx |KoO">^M):LCWoO"/Q('O"CLH:LO ] _H1Mv(G$v )5V?\S@N5CO$1kGqN5 p 7g>Y>"tN1%4/")%lAaN1"Lp"X),? 0O-T_Bk4* 0Bk{Jf@@|,k@ @:%: $X f2)S5s $(#'(# I f $(#(#eK>V6h;'jW;%KY ;DhKP 4,H nKPTH" M4C 0AE=*H3Rl0A1WX 8`N3Jp/JpO1W6R\1W"3[6n"3ELVV~V-%%?!]%8i,1$6FX f2#5s6(#"p'(# I f6A[(#(#[e/R.4$/ Y.!:` J +:`:`;<V$><'_ZkFz l(.8I[1PIA WI[)YI[WUl ~NV4 4-zB' 4.F 42M9Z" 4N(yN?2=*` 2=T>kUl 41:*`@A 5V, 5; 5AAX?e5n H:K(Mx!c'-eMGK@=O;G8SMFX4.OM=.OK(m%=G"M2X:O4KZEG.O4j4M[$V6P2CJMB9hCT'J]J=dEe=d CCJ=d=d"7c>Hvd 'Q'U&pHxL HxFHxK] EO&p NY=v;mj%$X  f2 e%AeI f%'-Xke)",=fY5}Y="P2H> 5-/$ nR1-/N5-/'-'-M40AE0AC C'7'/1W7'C_NU8'Q)QY%!^SC : NyQ P7!>y yD.CN!Y+O?6Qs:Y:YWD$-l 0 $ XS5s $(#(# $'-(#(#T!9C C J13NUNUNU.(FR$'C6; C CG&,W C$] CR&,)$R< QEoGH^b=\:nH^>{>{BH^Ul ~NV4 4-zB'E.F 422Z;]<E(y?2=*`2=TUlE1:@A "nELN?N  N8l%: %M|KKG! 
-8KK D}&$N P FX"+^ #0c9c,|0c0RV%N`?,GAO SHS@v`NRAOSSIJAO-;`41y-[}:-/$ K4R1-/N5-/'-5d1aE9 J8 KI.7xU4:YS?'/K.C'd<'v 1S?E.JJK,L} ?<??D%DO.D?w5z4@$5zJ'E5z<2tF $ XS5s $(#'(# $(#(#2(UPH 9n0@ HH>X6|Jx;B^S>=d>Ba>Ks.V#L=v,:p!-WQAS!-<!-KSSR?'O(#SCW9#9*M)z'-GE[ BI l CBI)F3 G4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:(#4 &.O4jH 4S<+ VH HLVN%(k )OFA(kb>)O&&KJCH HO*WKJ#:-/$ KR1-/FN5-/'-27;J<N95r;e65e0;eeK@3K@W)5<?'2D8<,,Y8LMd9x L=!99I9=!223g=!6E:9F:JR+W3}M/JRWAJRW?~D>7?~B}B}E")??~ 0AX 07: 00{BU($5JP0 ]N)$vJI$5@J#L>?B8CS7=3L+Y<)WQ&KU&RW*h"UW*?U&&U?Y,cY @M4Y<WOV}+Y*:&l2V!YGz1Y <0{V:EP5 Ia36/S90(Z6/6/(Z7 Bv YDo M!G+ZM+Z+Z B0.Ax!kMEQ9 e<*@ @ -/$ K#gR1-/V+N5-/'-#L% 0(GD<%DO.DK@3K@zW)PXUPmVPCD CDCD;mJv>vX!$!n APS"6]G;$2/9:\Vv59:\N2M!:\oJ<J<%2(G+R J<"h-R08vYh\@=Cn(9>;}:  *_WX.>SWX(#(#MWX-p(#(#EgC$EC$=& C$U,pKPUN7JN7xN7UN7(tX f2U t Ae/I ft-p!s!se!X?< ]Dr H:K()z!c#'-BMGK@=T;G8SNMFX4.OM=.OK(m3e=G"M2XOO4KZEG.O4j4M[$VB3B-,LXV_OrcV_** 1V_)&s(`)CCR$A6LH) 45N'A 4@@L7I 47,%?>?I! B&81H@\6J !dB) 00?J !dWt!dP:A670>>4,q1.,q,R',q SVoSP6C;16 $X f2<S5s $(#'(# I f $(#(#eG'K@)F L@EEY&3 KAQJ@K KA5KA?@4H>4Hf4HTK+-:T)XT [HZV>HZ.}H+Y|8YJ'JZI A#>6 2NA#CA#2=RGd!n4$PK?Y #L$/:\P6 757%:\N2Do:\PR M! nl 3?AU:Y&x6?5RWcFF`"{43S"IF6:"I~"{'+#"{UFPH' "\LgHLg=.==MM5Ul+)(.wmz:.DC.}.w34aDj(yI*`(=TUl1:@A  3%K;UTNV$-4Z$3Y+ "E ( *-W2U"F?E^XP>`"N XP&AXP?tNU6U$L H t"%=W f2Tg&{(#E3(#MI f&{/3(#(#/3eUV@xG%g NO#N=yUGA;HK852<0T0W2Q.E;.K)YQJ?QA)5 CP9@K5,'%CR#Hh=%YG BA CG$ m" B*. B KK4%#=-*S4W684W(#?s! V='FW:+'O?Wj/8"g>6u 22X19KY HJS$(Mx9'-MGK@"NGF30MF,4.OM"U.OK(m%"G"M2,O4KZEG.O4j4M[$VOFJ%s8K*Y),,XK H1$ $KX f2BS5s $(#'(# I f $[(#9(#[e;JH;GG2;Ye9UJY>:0C2EHER _ R+iRi7;:;=1;"N (b;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7L |3o,N7#}?YA;HK882<0T0W2Q ;.KY@QJ?QA5 CP9I$1,   )FMN@EEY>&3KA.@KKA5KA+QQS %QUl ~NV4 4-zB' 4.422ZB 4N(yNQ2=*`2=TUl 41: Oe8 H NU6 )RP# XUL$[ . !?KJ(0H H *W# H <(# ]Dr ).oJ qP#)zH B2+sTT0OHDN (..O3 (4W68s (#4W3es[LS?>K H.O(#?s! 
N*'4|T68L@o2 (1F|E7/dPE7B:~A5 )S1A52!8[5NFu:BDrY]+rHYOHBG'T-9ON'K'@@N4U@\BLL:-/$ R1-/N5-/'-&9.aEHH6|8N}*5=U27SgR)1$:34 ::>$$ Y HLVN(#b )B qP#V(k2+>)K1HD (,!3 (4W68(#4W [LS?>K H5\(#?s! RiGN*'4|7,%?b@\? BFNN1I@\!dKBbN?L!dW-!dA7N/W7K;6BR@J 9JPJ1?a-4%(#.-4*`G*`0--4*`*`UHL]]0@'8'Q)E -#\%QF>!.Bc?F>]=F>D80BcSY6 )%9/9=9UK?!(U5R1}F/!(`%$&M:$6:_%$ '%$U$FPH "\ ;|I,8 ; %MZ %=; ;& % %&30tH^:nH^>{H^ V/ 6Q6Q O~K&ACXO~M!M!XO~M! M!L<*V1B>> H 8 ).oUP#Ps%$> 75's KJsH H5*W#F%s>0A+ Q'0Hxq HxFHx B=/ *Yi H :BDr ).oUP#PBs24T> 75s KJ- sH H5*W#)", 3 !33< r 7m 3FF_F#`$FF$J"f,E ;NKIV,8 ; %MZ %=;B ; % %&Z^=JBF$=FFN~=FFW $KX f2;S5s $(#Q '(# I f $[(#(#[e:BDr;4FB'TP7''-/1-?? -?? 7v$0P;k>5> 8;k(#=v(#M7D$;k;'-(#(#(h8S&SQQSQ NO#NUFG 3G"V/&G    62?:!:SH9 S'5sSH(#M(#?SH(#(#28--u- 1AQ$&<=Y,2-}$&ACR<#@%A\AI,W 1<WNQaDnR34 iHQ0JiY z FFpG . !!0GPP*rr$6 $X f2nS5s $(#'(# I f $(#(#e"O7UA7N77{I^Eb{_XL{#R?'O(#SCW9#9*M)z'-GE[ BI l CBI)8S G4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:O(#4 &.O4jH 4S<8 bHGW8T 8 }HJ,OSUV@xGG$OKDY 3GV/&G'-+S<13C+-@M45U'82.N'D6I,5(4EN$/6&638-NE1%/N=J;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7L | ,N7#}?Y&2sK%;&YP;&;&U OD7n y498W $KX f2KS5s $(#'(# O9I f $[(#(#[e66=Q6O*pH>=F{M#9XH"6R2 U6RKM=:IUUH>H>5!0K<3&JX f 2oY:5s&(#'(# 2D$I f&;(#?X((#eU/TA/TKo)Wo!.1C) $X f2@S5s $(#(# OI f $'-(#(#eg HLVN%(k )OkFA(ks>)X#:Ok'(#TKJH HOk*W#KMQ!R9XyC=%5G$$i"#KVF m$Q.$ KKVF4%#=-*D B?2223:HK u?O?Wj u'yg>6uMK;MM!BQ?jGhS=?j HN7IvN7*]?j?jN7N7V, 5: E  1XC-9WJiJ22K ='599nR{ <3.%J%e.*S%J!J8; .2.*By ZP:;SP;0;0;0):I :83[ .I 8 *' ,'' L=L' LPL0@'8'Q)E -#\%!QF>!.Bc?F>]F>D80BcSY6 )%$9$P=P$P*L2E#JE** 0JK5,'[CR#HPQ=%YG BLL C2PQ> m>s B*. B K&kKPQ4%#=-*C$EC$& =& C$& ;;:N5 Ia36/HS90S(Z6/*6/(ZR34 iHQ0JiY &4 FFpG . G*,$N7.:$4L*7UA7`*777&t5M%g>"<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB7?,5K"`*.h 8 -FX f2GU Ae/I f eQ*:9B:"X:?? 
:??%RV4UT,(G+= J<H-RFW8v/~*\@ CFWFW(==RW-PVXDVCNH0(>Mx '-BGK@AO7HS3,GMFKw4MAOSSK(m%AOG89Kw:4KZEG4j4M[?)=8PF-)'')''='0LRFx$mMRE,#>#J8I=JiJ9B=GK -/$ R1-/N5-/'-'-G'>Q&cND_,}?rYC_eVeHe_eeP /Y=?6)\DWV,0%Gv%O:R P> f L//J (#=(#M%WD$I f ;'-(#(#heE9GE6E=d R!|H.N )CNAb)(A)Ry HLVN%(k )OkFA(k6>)X#:Ok'(#$=KJH HOk*W#*%#3+%6N*Ks7>3 /aVT7F L * u*FN@I)C\ @1C\=PC\P009-c- B?<???"NB=942= sUC'NRiT,N;N+N2\'-8#z-"W3AK):M3N53Yi'-h*%#3+%6N*Q27>3 Q;;Q2Q LQ E *- .*Q2N@I?jGhS?j HN7IvN7*]?j?jN7N71U <6$ XM6N56-p'-'-@GHZ2VJM5I:P;Q* o TBHF"OZ 1$yTBPTB4m89p 15q9I@ 'B&P>>K:%WD/!Q/! aC(G+R J<"h-R08vYh\@=;Cn60(9>;}P?I! B&$1"@\6J !dJ B.pP$ ?J !dW!dP:A67$ >>4+3#+*`G*`0+*`*`=OcIB=?<?=??H3XcHw,A @Xc(o@h4'@l.O(oK(o@lS:I/DQA*CQ/C N5/'- LE87PLELE9H7@!K!KW |RHK u?*k;W u1H4y/Q= (W1DH;.1!ORQ=K95[#|IW+RQ'K[D9J9RO'8"K[HG4~O4H(7Hl+O@] QjRGd!n4$PK?Y #L$/:\- 757@;:\N2Do:\R M! nl F0&"CURl&Z&d"[0+>)o'<M5A[%[,FM$ 4O87S8787K$R?'2v(#SCW9#?-N)z Q'- GE[ BI l C0f+5I) XN4 m @6<RR B*!K`2v(#.OK2v7<<3e B/LSNa6.5&KX:(#4 &.O4jA;4S<EDE)FEF?MQ??B(D @g3(L$(>("H2h2hRO< (# ]DrW9F6U:)z!H'-BGE[?ITSPYI)^N ,4.O m @6<RR?6! (#.O 7<<3e2?/LSNa6.5&K O(#O(#4 &.O4jR4S<?K.KA0%KN5K'-hNHU!WAj4NVTIO/O%IU X(=FX%c[>H< Q'ItHx0g HxFHx B}?~#.NG 8VyV>;?? $X f2RS5s $(#&'(# I f $(#(#eLV1B-/$ R1-/N5-/'-'-Sj/9?//VL4$S]A/V 2M8B3I (i88<M(iPPXH|/8 PHN P#A 5#!0#Fq(#N?O=8B9Y] VWY78G:(EG]%VWO:(5:(@@NVW4U@\BLLvJ #JJ&&  WULsE9 Y8 #7' :RY'E/ JVC 'd1' sE'9;1-5<WA6LH@@*9D:=J)|6-DG"V! FN5"UIJ"-; 41y-['-}SH!'5sSHMSH yY\r y>Q7 yOTO'tV'tO't#&2WF,LBm3 nFp[%sK+ 3 6Q36QjpN6LpO#3 -G5-/=s1-?? 
-??"Q7{N;("OD" } %=1;X&~= YN?O3W8Y8@Q : ) W} ELPUK-mq AMZ %AL#!]L 3W  %Y-3qq777Nq7+q6+q+q@J :6_|<592[/I &[>]P-8B6!3>]Dz?>]:|+HQ/1XWo!.X VVoSXP XP&XP0SYS2FF%*F#D 5#90#V/(M#'-+&&/.Y9J:5 O'81.8AA0 O418Y78l+O@] Qj::);ItH<It0g"Y7O"%;RMDf*e%;.aENs%;UM O62F`J93+L^#)3+Ul ~NV4 4-zB' 4.F 42M9Z" 4N(yN?2=*`U2=TUl 41:*`@A 3.b"FxFxFx;)?*"G7hE")?G!'N$.52;5:A;HK882<0T0W2QJ;.KY!QJ?QA5 CP9X<*+&&/1.Y9J:5 ;'81.8A0L;4B8Y78l+;@] QjA?L>JX f 2)@215s(#>{'(# 2D$I f;(#(#e OO H :BDr ).oUP#PBs*KT> 75&s KJ-sH H5*W#W@@j:%)5V?\S@N5CO$1kGq&N5 B\g>Y>P"tN1%)%lAa(Lp`X?< ]Dr H:K(Mx!c'-BBGK@=IT;GF3NMFX4.OM=.OK(m%AZ=G & CXO4KZEG>4j4M[4cX? %jE7 *X #D D:A:* L-69;C61 E9;'9;-s4OY4O/F5,n76 :7%?>?@\IBtL>&.w.w,4a? 1x +O&aC1qSC>QY >#2> >=F{M#X)/6R6RUkPU8_6RKM:IU8_H>H>5 $PXS5s $(#'(# $(#(#;>>.(6PC & !6^PN+Nn66PNNAWS hPEASTb N0 N N*E*NAMYV HK2XMo%#T=X2XT` 6 TW2W26G'Ad8JAd"Ad VI&'?SI&MI& <OH*#$#$&{&{(#&{Q1XKWP8P..OC.M-X2? 7UC9nM3Y f7UP:s!GW=4<SP:N=dP:6u= LVLL= &/8Y u.AAI4AZ+A88In#WW>.P25n?6)\e62}WV,0%.OGv?c `!6%O.Ot ?4?4/q?Oq9!?8&K,8\8X X(=XC(/wN.w(GwU <L$ !LC N5L-p'-'-KSRV8N 6?,GAO&HS@vG$9AOSSIJAO-;941y-[};YR1XrFT,KB1LJLXV2g.,B}WB}L U7L |?,N7#}?Y<Rj%$X  f e% AeI f%'-eOaQg3 ;)H Qg.C.AfQg..#$$ T H^T+7V>{L6mUE6Bf}:}&1H,,8$A(JSJ-/?0KAN-;.;.I-;.;.117>X,[l9lFNI;W?P C!OI; Q%*z O WKkIC 7mCF"IF_F#`C$$FF$4 $ M/4 ,N54 ' <C6$  CA#tN5#t'-'-PPP)$9  1P)N $8GG$;9UC-9AI:P;Q* o TBF"OZ 1$yTBPTB4m89p 15q J8 YCI/<JIHK u? uyYL#&:H6Q :1 L5wAu- CX@CC-JJ<J<%A'FA}A :& gC5wAu&Z-"*C>X@CC?'*--G Z;:;4S8(%H  f2 e%AeI f%!sXk!se7H?LC,A @ 7U(o%!4'0@l(oK%(o@lOQ+U.@[?*1E W1EW)1EN_H1<-MRHzYC,A @R8(o 4'0@l(oK%(oSGR8@lO(lH# '7O~ACO~M!M!XO~M! UKU#3#3U#3D2CiNKD2BB9KD2(# 'E :!%S+]N EfL K'AfL# %>L D.CN3W %+@p:Y4!d<:EL<><B'C$ CKN'*Tj:K:Ku|DCs4OY/?1n4OF5:$9Ui6H?:|Ui+HQaKm@!aaThYK:BDrRFC=%ZBG?>TY*VF m?6.? 
KKVF4%#=-*A'\H"7ALVL ALLED/ D;;o D?S?Oy D??\N5CN5 > &{.@> f2"O&{(#(#MI f&{/3(#(#/3e,q1.,q,,q-(-@M:?U'82'$3&pE6ES9!;$/E66K] EO&p-; NY;9&w& %&N`'-IZ%(h(hkFz; l(.8XZI[>PIA WI[) 9I[WKXZW6~+F+FU9:R P> f .t,Q (#(#M%WD$I f ;'-(#(#heW $XS5s $(#(# $/3(#(#/3%!=$9< !OOIO:l!(OO(S; (X$K3,e9^),,XK9^ TM[ 4)1ICT$D4 ]w$BgO"wU*(!0(0/UY"U0/8HG\Mg!DABG\! PG5-([-BTb1-?? -4??4%&!5%%)ToEPD. f ".WE(#(#M%WD$I fE;'-(#(#he&XDUV@x'2 ( -iT*QG9N:V$T*QG?$y*Q@4m89p #?LC$kX5q )DD88~!J-T!1P-AN-;.--=EP35'R7Bt3'3S9-U $/'&'3- E1%/N=JF,jHHM!RM!M!HM! ?4?4/{>>5L| '9;@z@@:BDr;4=%FB'CTP7''.MR"H2I5P EU%6M5=U27SgR)1$:)@34 ::>$$ Y 8Y6?6J5 R| C *Tv: 8:*T&15&*T&&'}2PHdG%OG d%OV+V+%O$$+ !X|@XDV2-CN H((!Mx'-MGK@AO2-HSG(&GF3MF-q?N4#MAOSS2-.OK2-(m%AOG"M2-q:4KZEG.O4j 4M[$V#W#?BO##2R$,[l96;l CN:CW C$];. C!ORCK95[#|I6J1~816?R6 %16;1XrFT KB1LJLVXSDWYRL U7@PL |N7#}?YX2`Y H4(&^Mx-'-HMGK@P6|G0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$VUM>1(N7s57XU5"H2 8$1I PT8/$."&E /$+I$=71RP4Q,.4 49H?FlM$ MC N5M-p'-'-8G#GP mS9 KO `!6JGwGwM?B8RCS7=3LG4Y<)W KW_!*G4(G(GW*h"%YW?G4&U"?U:Y&x6?5RWcFF`"{~3S"I6:"I~"{'"{<UFPH "\)/5 R|MI; z 05o NzKez  PGnGhS HN7IvN7*]N7N7N/m.i)uP7X7H^b=\:nH^>{>{O(}>>lE"57:bTC?s]6LU9 % )ZB ; .('D(6E E5U55E5E)CS  (K.`WE# `E)E)8Y6?0'"IF6:Rb'$!N?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5:(@@N64U@\BLL wSLB}(AA ?'7=B& XK5,'[CR#H?=%YG B CG8? mC B*. B KK?4%#=-*)5V?\N5CO$1kGqN52<  2 YS@"tN1)e%@) %lAa"Lp]>K$-B HEP35 ) RP#2b3LrS9W LrKJLrH H *W# "O8XsL;Y1s-K^4OSF5 61S$S$;Y3Hp%"={MKWWX "R2CWOKM& JW?7""(# ;V4T.T@GGN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5:(@@5hN.4U@\BLL64FT4 Aq4 4 B.\DQ !CWlB@)}B@+/B@@( Uh:XJX f2GE5s(#'(# 2D$I fC2(#I(#eT2F F I5Y/YR$,[K~l96 l CY=NUA (W (8 C$];. C!OTR K95[#|I9"$4?D "Hb9"TR 1AQ%$&<=Y,52-}$&ARW*5 #@ SA\3AI,5 15WNQaRRGd!n$PK?YO#L$/:\6O5:\N2Do:\RM! nl M\+H!*?L++O%O+8H5m)JRV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJAO-;HdGWY41y-[}AZ+A8IV <NDG$ HS%EFN54E-p'-'-!n AP$ A$6JX f 2=5s(#'(# 2D$I f;(#(#e &A}5;<{qz9Oq XR>Y97?K! 
DR!T7?BJB7?,R"`*.h %8v12v Q30QQ;33vE3NN7(?-0(_1511#33FUHr L' rvvH3XcHw,A @Xc(o%k4'@lN(oK(o@lw$c&?N.w(G(GS!wJ *JC)>CJC(XXz Ae/!s!sO@Co?4@"I"I Y8 YN#/?8J.'E7/d):U94P?E7??ZBW+ 4)S?7?4s.,Xks4OY4OF5,n77 >DV+WP ? B6lPQ: a $: /XwN.w(G(GN&w3Co15|?Co@@5KCop"="24Fv6WFv% %!]?!]9%!]H 7UC,A Y@-7U(o!4'@l(oK(o@l-#AGr-?E-E*%#3+%6N*Q27>3 P$;;Q2Q LQ & *M *Q2N@I N(t6)g6G%xF *Hx @z.A;HK882<0T]0MU2QJ;(W]"pY"p!QJ?QA]5 C.9|<59[/I 1n&[>]GVP-8Ui62>]Dz?S>]:|Ui+HQ#D 5#0#V/(##EUV@x?Sm7 -i*QGCG.iCY*QGe*QW:#C?^.$ A1M40AE;40A 7AJp/AK.1VF( !9YB!/8,8\84G0-Q f2 1-?? bI f-?$?e<JX f 2(] 5s<(#'(# 2D$I f<;(#(#e;0 #DDDDDDDDR FFFRGd!n$PK?YO#L$/:\/6O54:\N2Do':\ 1xRM! nl  $YGB& XNxMNHF.FTD1S,TO?8 ELF'+; E5L /V#NN 2U79$%J ,B e%8%++> 11>Xf-@M4U'82. 'D6n,VM $/6&63- E1%/N=JY6I%~D +E4+C, EE DBO#N?O=88B9Y] 2Y78G:(L'G]3>O6:(5:(@@N>4U@\BLL8QJJ4/;BgO"/GU(!0(0/UY"U0.!w/G0/11$"j ;ejJj<S-5X|<<1XR3|?uHQ0JiU9| `?FFpG . Gt<>'' R&]&kR9;YR1XrFT,KB1LJLXV2g.,B}WB}L U7SwL |,N7#}7?Y<{q?OqYX.;f>19*47?! ':4J7?BJ7?8*44J??On#WWY< f&{(#(#MI f&{'-(#%(#eTIO/OG> O=?!eO=K>V6h;'jW;%KY ;DhKP 4,H nKPTHK:BDrRFC=%ZBG?TY*VF m?6.? KKVF4%#=-*F"u4 ; +PD}*H.5#,0*Fq(*L#A4T.TG60)5V@C O$1LGq4Q7! 1 1TINOY 1"tQ&!)OCXCX/;R?'2v(#SCW9#?M)z Q'- GE[ BI l CKI)F3 GN4 m @6<RR B*!2v(#.OK2v7<<3e B/LSNa6.5&KG:(#4 &.O4j;4S<RW-P*7*V t*5A&?'**--G=1 [HZV>HZH<<Fda@"# VFF> 1l'(%('(RU?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"1TW3&?G4&U"?I0I%>)<4I> Uh&W.sFK5,'CR#HC=%YG BS CGPBVF m[ B*. B KKVF4%#=-*%g3W8Y8@Q : E() W} ELCK0cE(AAL#!]L H3WE( %Y-3&6CCS% <GM 3GV/&GR=C\RV%NWY?,GAOHS$g*\WYFFRAOSSIJ=AO-;6_WY41y-[IJ}I: oP o!F"WYoH T T+>{!4% !4XY!4U%\S0UU 0RO<L(#+DrW9F% :)zY'-BGE[?IXTUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S</I }4" 44-@M45U'82.N'D6I,5(4EN$/6&=638-NE1%&/N=JPY ?6)\H6|,F30%.OGv%O%-/$ KR1-/N5-/'-:W2B{A(;4V062 @WJb2m @B9QRuRuHLIH1TB1V_OrcV_**1V_KAH7+ XU2DX M2X K?E3NA8:8USA@<88U([@TG f2D@T??? I f@T4??4e +;1XrFT KB1LJLVXSDWYRL U7L |N7#}?Y2@#=u7,%?@\? BFNV1I@\!d!eB7?b!dW-!dA77/W7K;6([-L f21-?? 
I f-4??4e>'P,D#4/"+M?Q 3C/mEUX<U(#U4Q@\@\BP'<<_@]!("H2hQH3XcHw,A @Xc(o4'@l.O(oK(o@l %V<V0{VT,Vs-=,,3XcHwXc%@m%~9LLN3XcHw,AXc%9&U%;%;.a%;UMQ  KTN(LUG0JX4`Dl VNHS4<.RRRAR,kRU7Q;5 G1D#EUVY@x?SmQ7 -i*QT G+O Q%g%g/5*QGe*Q>,B#Q?^.$L%S5L(I2( [HZV>9n0@  HHZ>HJx;B^GS>=d>f'-8KGK%:(GAGGMF*d4O%M:(5K(m%:(G:A93TN*d:4KZEG4j4M[:sG33VT X%)!Q @ N@ -n@BM66OS|Peen?6)\e+tWV,%Gv%M'MH]$M#&2WF,LBm4nFp3%s"[ <43jpN6LpO#4-G5A;HK82<0T0W2QJ;.KYQJ?QA5 CP9) #MM8 8$<88([-L f21-?? I f-4?$?4e+`PT PT #+T N?O=8:B9Y] DSAY8G:(EGA GAC ONC 5:(5:(@@NA4U@\'-L,+#' <C-/$ 4 CR1-/FN5-/'-'-W $KX f2;S5s $(#'(# T,I f $[(#I(#[e#G'L<:#XmB2<@<0nn@7nKJ3"5zH5zJ5z<:A wT@:N5(:4As(#hXDVTvN*H R(>Mx 'G@AOHS.MF@AOSS(%AOGI+'4Q-=Tv4j)nM[3O/0;:V/W*+@/</ & K/ I]/+<K V VT'HV-CV-LJ:jMW-Kh |M/GVjN MI;SxG\zA05o NzKePzDABG\  P PV $;PC) $X f2S5s $(#(# I f $'-(#(#e H <(# ]Dr ).oJ qP#)z&B2+sTT0OHDN (..O3 (4W68s (#4W3es[LS?>K H.O(#?s! N*'4|8Y<(%GR8CSGR8(l'C] C&"ZGD/L'G;G3,7UM5PY ?6Mx)\H6|,F30%.OGv%O#04V0GN?-oKOC$7@@K7!I@II*9D=J)|>"DG"+GV.N"UIJ"-;.41y-[}' NAj, ; !5 P5AW!PJ&OL\TUO&&O;YR1XrFT,KB1LJLXV2g.,B}WB}L U7L |G?,N7#}7?Y ZP:;6S+{6wO5i2yUD6OKOm+{GI6OO( -L f2'1-?? 8I f-4??4e>ANgF,9@>N#5 o>48!h@K95[ !K-@M4U'82.H,'D6 T,V4H,$/6&63-H,E1%/N=J1cA,k2%LA/.YXq&/V/ B>Ba<o@N-nIMoMMX?W. E%I XP(=PB'+; EXPP<  $X f2S5s $(#(# I f $'-(#(#eW@@:%P+>>"P&qPH%5%GEG$! 
$YYJX A"6*P> f  : 6*(#@(#M%WD$I f6*;'-(#(#he@] 0PI>5>As -0hII(#(#M7D$I;'-(#(#(hY0<>,B L QJ#DI9EEEE(:)r:)VzOG1 L U1B yO1Vx SVoC`/-0<SM/&-*C* tME%MR61-S P O+ .pI0QUQO~,&&ACXO~M!M!XJO~HRM!M!HR1s;//\4$S2]A/Va 28Q(3I (i88(iO5,S2]:|Cj%-Ba2L:3L11" L%'LK1 -0n,r!).!!:G$OKEM 3GV/&G'-sN:P9"V.(K9A3BN5Yi'-hN?O=8B9Y] 6Y78G:(V$G,Lq6C OC C:(5:(@@7NN64U@\BLL+&&/1.Y9J:5 '81.8 `A/ 4X8Y798l-!q+ @] 7QjeQ&$9!a)`QOIO:l,QO O6 P)5V?\N5CO$1kGqN52<  2 YS@"tN1 %) %lAa"LpC^P3S)?(xOP?*?VyC^=P??S-M4Y i'QW'HY= -L$/=0&=3--LE1%/N=J3 6 4"*-T2F F I5Y/Y |FX f2/Z5s|(#>{'(# I f|[(#(#[eBG :&"C5wAu&Z-)3"C%>X@CCWDc N P& %K1\L)FI?1\FOK82<2;BY7OQB7IB$`:KB?LXF1F':K@!&uEYXFORT@z8@F7IXF"cIS>Y C:M40AE0A' Jp!YSh5J!A!F%-/$ KR1-/FN5-/'-X> E%I+ X(=F'+; EX%c[0y 2@P;C0]2@A2@E5T)0&-9 8Q1l'(%'(RU@$LF$`:KB:K]u@G#C8r*88-z)Y$?%=@ 4Y%6?U% KN5N5 JE-J"pJ' <C0- CN#t'-'-6%!6%5=6%UPH F>A.HGP6|,FP}IJP-;41y-[}A.q #.q+- L6m XB0.2J=,22;E:5 W0q W4 W8kG$ -G N5G-p:-/$ R1-/N5-/'-5dK5,'%CR#Hh=%YG B O CG$ m" B*. B KK4%#=-*IIBIO8}{8}rT}7-T}NT}(G+R J<"h-R08vYh\@=Cn(9>;}+&&/1.Y9J:5 '81.8fAV</ 4B8Y7U8l2V+ @] 7Qj!P*fWA&:7VVV_OrcV_**MEV_8P$J#&2WF,LBm3 nFp[%sK+ 3 6Q36QjpN6LpO1\#3 -G5FlM$ MC N5MA'-'-QW(6 )0 (J>8Q P )>QQ11 Z;:;S>8 8 G;,!) NC) Q) ](#E7/dE7LB )&s(`)CCQLH):N'-L 2>ANgF,9@>N5T3T>48!h "|0*J *X *E LP,@P P6.zG\IO~0 ?>D6M6M6M6M9X+ NW"OIW"AW"FHHVgR $KX f2PS5s $(#>(# *&I f $/3(#(#/3eU/TA/T);$&$& R1S-RV%N4?,GAOHHS@vD4?'AOSSIJAO-;441y-[}V|LLXJMMK""Pd<!-SB6vN!GLC Z;:;1S>8 K8: N'-e D,G$9>1D@,OOIO:lGIGI,OQ *O:X?< ]Dr H:K()z!c#'-BBGK@=PhT;GF3NMFX4.OM=.OK(m3e=G & CXO4KZEG>4j4M[477@U}W/<W5WAx71Vx SVoC`/0<SML<DPCME%MR*1PS P O+TP|<, UM K()E8   RGU AeC@ GSxN SxSACJOSJX19/Y= HJS$(Mx9'-DMGK@"NGF3MF,4.OM"U.OK(m%"G"M2,:4KZEG.O4j4M[$VAh5> -0hII(#(#M7D$I;'-(#(#(hO&aC1qSC>>QY >#2> 1 L IA;HK882<0T]0W2Q;W]"pY"p@QJ?QLBA]5 CP9 vK7iW* 7UCJ)|>Y" 7UG"!V.N"UIJ"-;.41y-[}7!\TIO/ *rIV1DO%BSUG@K7#?%B -+%BA77/W7K;6? 
$X f2S5s $(#'(# I f $(#>(#eS.'%\+N.'A.'8CC%B NB*NB7GRP:GW8 Ia6/$0S(Z6/6/(Z?T ELD%U\/NLU D^^UP.UDTD^ HQ6$#AJNM3>J<)#2/J<%2P7/Y ?6)\D'WV,0%GvI+'%OTvO/p!Q[/p/p%O FxY&>oFxFx1Jf@@|,k2%@/N:%&8/V/YcY @Y"XJJ+E-J"p"pJRYRL,RY;RY%V%,J%N-jJ?u GI ?4FWA#H/6 2NA#CA# CFWFW2==9C 9(U?(   J8 3YS?N/qCP'v 1S?qJJK,v!"N?QI1/)2/)GM;;POS'@@PA>R 7j>>&{0> f2&{(#(#MI f&{(#(#e.#..N?O=8B9Y] 3Y78G:(KG]Q53ON:(5:(@@N34U@\BLL s=8 <B9L$ 8G6C ON5C6A'-'-Jo&JoJoK5,'8CR#H=%[G B C*)r"p m"p# B*. B KK)r4%#F-*  E0gAGS  g GM A!] %:&{0> f&{(#=(#MI f&{'-(#(#e)5V?\S@N5CO$1kGqN5> 7g>Y>P"tN1-~%)%lAaN1"Lp9[>[P.C6 P{.P HLVN(#b )B qP#B(k2+>)K1HD (,3 (4W68(#24W [LS?>K H$8(#(#?s! >:0CN*'4|1Vx SVoC`/-0<SMW9&-*C*ME%G MRW!1-S P% O+ R$,[K~l96 l CTNUA (W (MJ C$];. C!OR K95[;.#|IPK$$$$y&Ty'G'9y'E9 Y8 #7' D:RY'D?/2&O9;C'd1:' E9;'9;-X>%IU X(=X%c%#I&%%L?VN-/$ KR1-/N5-/'-C8C'CF'*%#3+%6N*Ks7>3 'VT7F L * *FN@ISR+'/E;+NR8"->C (WNX;.4N!ORCK95[#|I'[=G5 'EE 'Q ''wJQ6 ?G"">EI **AT*3GH'{NyYY=Y+3DI%~D +0.E?-C, EE8; . By@R;@A #@3?P(%P(8F=P(WAWGnOW}5;%.o(h(h0J~K@3E!L=Ny@wK@23W))IUNy=?=;22D80NySY6 )%RV%NWY?,GAO$HS$g*\WYFFAOSSIJAO-;WY41y-[}5,7$?5 0X 07: 0Hp9X>%IQ X(=X%cSpkGW>0)k.3K.Jk.. IG/ AB B 1AQ%$&<=Y,2-}$&ADmRY #@SA\AI, 1WNQaF@,F@F@.(F&X V| V VM"'%JbK@3K@W)=<OI!]F*1k8 % %MZ %=;&& % %#BAAXqK:BDrRFC=%ZBG?TY*VF m?6.o? KKVF4%#=-*71f+E%UGEA7E.GTUC!G+!"U&J'GUgRS on0WW m>C B*. B KKPQ4%#T-**" MO"(F %)M2CMEO2--uMM2--' HLVNS(#b )! qP#% (k2+6P>) aK1HDS (%3 (4W68S(#S4W [LS?>K H(#?s! QM*'4|P[1]P7"7%UP77.L+&&/.Y9J:5 O'81.8A0 O48Y78l+O@] Qjh4<D<.<H 7UC,A Y@<7U(o!4'@l(oK(o@l(G+R J<M-R&@8vY#M\&@=Cn(M9>;};;6<RR<(# &j5656M56RV%NWY?,GAOHS$g*\WYFFRAOSSIJAO-;WY41y-[} wCN?O=8B9Y] 3Y78G:(/G]Q53ON~:(5:(@@N34U@\BLLS?9E[P ' 6 $;Y3' H$; ' e>I8CCL%NBK N'-'-1 A/7o/Uk9Z6W eJ:j M"ZV7. *` PBD2KD2BD2P- NN$ M $%I iK5,'[CR#HPQ=%YG BV C2PQ> m>T B*. B KSKPQ4%#.=-*LHRU/TA64gCx;/TKo1!.)T:LWo1Q1CLH:LO ] & YL>#&K82<2(<;|DCs4OY/?1n4O?F5:$9Ui6H?;:|Ui+HQ:N<8'O?Wj1/|M1g>6u,5):/|!p :-/$ KR1-/N5-/'-N7(?!d-0(_ ?GhS: HN7N7IvN7*]?j?jN7N7w$c&?N.w(GLRw(?>{L6>{? 
ENE>*KH LC,A @7U(o!4'@l(oK(o@l3&+)@%3& 0PP/>5 3 IEUP/(#(#M7D$>P/4;'-(#(#(hX d/]U:Y&x6?5RWcFF`"{~3S"I6:"I~"{'+#"{UFPH' "\ .Yi*"G&$-<GD 9+/?WIPPUPPT['C Cg!> mS! sYY /#%A5ItH<0p7f90It0g=  3YH\?A 35 CP9/W~IF//X2`Y H4(&^Mx-'-HMGK@P?U6|G0MF4.OMP}.OK(m%PG"M2O4KZEG.O4j4M[$V/y:BDr:H>=%,#BG=TRE@@F=IJ=-;@41y-[}7? $ X.S5s $(#B'(# D $(#(#D 'tN?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5':(@@#N64U@\BLLUl ~NV$-0 BNo3.H4lU2Z7MS(yU2=24'<"<lUl$$Q0+-!?,PPT2F DB;QF4 Mc&K*4V4K*2fJd33F33DX@%6SQ%63%6.'%\+N.'A@7k.'3 @*H.5#E@,0*Fq(@@* O+.#1G19M 11?\N5C4N5 s*>Y*,K%Y [HZV>%~D +HZEW~HC, EE "(bm&q.wl4a5%GE4'<"<lG$$Q77'cG-7 9 4m89p 5qBb  E8'Q)9F:QJR!3}M/JRWAJR-;K?-3-){b{_8{U-!] TW7Z3iFF%sW99)'JM 9@$Ct 8)Q7! T\ 6 M Q&! 6CXCX/;&r$%7mEC$F_F#`$FF ) R)F?)G-YK&SM[2l>Y * 7U:CJ)|>"7UG"!V.FN"UIJ"-;.41y-['-})FM-&N@EEY&3*4KAR.@K KA5 KA8*4??O22*By'599nR{ <%,.%J%37EkkS%J!J8; .2kByUl ~NV4 4-zB' 4.F 42=<Z" 4N(yNQ2=*`*2=TUl 41:*`@A XP+0"NW XP&MXP,D"rB ?D+GhS"8 HD+N7B}N7IvN7*]?j?jD+N7N7/P<$@P P2=8=B W=B (=B 7IB$`:KB?LO'1F':K@0uE;YOR@z8@F7@ 7IY"cIS>Y :i16YBLL*LWFOA.q #S.q>+QY >#2> !<8 f$RH83 ^868YIk:4N'-<52*JBY7K:KY77Y713=h !33<B|CB|B| AT+R R8v;R \ Cn W$HLW$/W$PW{5 ?6)\e'UWV,0%Gv'I+'%OTv&.MoA4Y i0 FM90Y=6, "# 3Y'=6?#k=6A 35 CP9X:4$S]A/V 2M83I (i88<M(iPPX7nA^9%:9A^7M!7A^P/I.3>l$i>AYTE9@>NS5.#>8!h"w45YY-YY+= RO,8v6Ix+qP7/Y ?6Mx)\D'WV,0%Gv%I+'%OTv:-gEGE6E2CUI2K%M%MK*K*3<3<3<*%,'5%,137G2[<{qz9Oq X5K>Y97?3V!*5Kff!T7?BJB=7?,5K"`*B.h #EUVY@x?SmQ7 -i*QY5G7O Q%g?%g/5*QGeH *Q#Q?^.e$EdEd>C h/*Jz FL/SFLFLUM 72NV4-4Z7t-N(yX-D(# D$BK5sD(#B'(# D$D$D;(#(#(0duV/N?O=8B9Y] 6Y78G:(T<G,Lq6C OC ~:(5:(@@N64U@\BLL*'$DD aM$/ a!uW9 a:3,,.b54j(?4jNFu:BDrY]+rHYOHBG'T-9O'K'@@N4U@\BLL>>X=`>>BM!>>S2]:|Cj%-Ba2Lx3L11 V/L%'LK1 -0n"WTW/<W5WP^SuSuQ;#Q AQj7`'d"Z$j17`7`%ZP7,%?b@\? 
BFN:1I@\!d yB, A|:#?#L!dW-!dAL87:/W7K;6M i:UMXoMX'&' MX''4X;NJ-4 .E7/d):U9NP?E7??1BWW45)S?7'?5s.,XkJD=W23W8Y@Q : Ki) W} ELRK-mVS%A8L#!]yL 3W% %Y-3-r.QH [-r/\/\V-r%E<;;%!]O!]9%!]!]LGjGs(XV5sGs(#'(# D$D$Gs(#(#PE9 Y8 #7' |:RY'Q?/2&V|C 'd1' E|'9;- 1: ;NKIV,8 ; %MZ %=; ; % %H^b=\:nH^>{>{H^pS3,#!LQ#!8#!M=K5,'K~CR#H>=%YG B!z C: ; ( m (R B*.Qw B KK ;4%#;.=-*B@)}B@+/B@-/$ R1-/FN5-/L% 0(G<O$ LON5O-pYA4c6Y @YU$WrWrEe=dWrh+&&/.Y9J:5 ;'81.8A0L;48Y78l+;@] Qj$.7mEC$F_F#`Pz$$F"\F$O$;YD34 0x'!;YIJPIJl;YIJIJm=o4$P/bA/V /8 XI (i88(iOMDOMB&OM?6K)0@'8'Q)E -#\%QF>!.Bc?F>]F>D80BcSY6 )%R?'(#SCW9#; M)z'-GE[ BI CKYI)F3 G4 m @6<RR B*!(#.OK7<<3e; B/LSNa6.5&K:(#4 &.OR4jS~Pb4S<VE^XUKg-Pj7`M"Z2*`6j17`7`{BBTT/yee:n:H6-KeG=RE:FN5=IJ=-;:41y-['-} H :BDr ).oUP#=%PBsT> 75's KJsH H5*W#8^B;6!D./66J6R3%y% iHQ0JiY : FFpSG . !!0GPPR $KX f2S5s $(#(# 'VI f $/3(#(#/3e? &4T <N'-'-9/9=978HV" M40AE=*HRl0A1W&! J1DW?/C 1W6R\1WP"W?[6n"3ELM\)%M\ f)):BDr;4=%FB'TP7''ErU[ErCEr%IFkU8j GSxN W: 7CI-SxACq* )J> S8^p" M40AE=*HRl0A1W J1DW?/1W6R\1W"W?[6n"3ELO*I;O*C O*:ITAK> ?aVTC N5T'-hTr*@ N,C7N,N,9O'M &{3>&{(#(#M&{/3(#(#/3A# E+f:EuS3W8Y8@Q : E() " ELKPcE(AAL#!]L 3WE( %Y->LQgO *)HQg.C.AQg.:+'O?Wj8"g>6u&@ ;X,XI*1VW-H9S],X,X,F^5%TKY'@K(KBhbO3=:-ZY+C:+D:HH7X BM40AE0A$ 7'V0GK7R3%y iHQ0JiY FFpSG . G(G+R J<5-R 8vJ$5V+\V+&@=Cn_1(59>;} =|<59[/I &[>]P-8B6>]Dz?>]:|+HQ),)I ='WWS)=;YF2(g2+8g mSgC/ :F d d ' *%# +%6NN4rO13$ Gy?? L * >$$*?? Y:-/$ KR1-/N5-/'-!7<Lk"&Q(X]YZ+QQS %Q %0y<0y(G+R J<M-R&@8vY#M\@=Cn(M9>;}R?'O(#SCW9#C{M)zX'-GE[ BI CI)F3 G84 m @6<RR B*!O(#.OO7<<3e B/LSNa6.5&K4:(#(#4 &.O4j4S<RVN>?,GAOMHS@vQ%AOSSIJAO-;E41y-[}: $X fS5s $(#(# I f $'-(#(#e+'UL?5 5 h5 E)"#*F1)JvU!"U&Ug6RS = WFWp5!D:CW J8 Y/<J%A&  $X f2S5s $(#'(# I f $(#X(#e"' G"")Ab)()?YJ? A?Y1EdEdJ h:~C N/'-WDD8+3DWRS2I;X,X1VX,2X)5V?\CXC O$1LGq4N5! 
1 1 INOY 1"tQ&!)OCXCX/;9FF M=]N*SCo*3""@""R=%$X  f2 e%8#I f%eP)tS?OP?*?VyP??N'JZ^> 2 A#22(UPH 9n0@ HH>Y6|Jx;B^GS>=d>'K8]3=V/& r@='-Rmh#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGeX(*QX:#Q?^.e$A;HK82<0T90W2Q7O;.KF 3YQJ?8QA 35 CP9J T=' ENAI' sBE=8F=' ' E==O@[GSxN SxACJOJ) $X f2S5s $(#(# I f $'-(#-*(#eQg3 )H Qg.*.AQg.. 22@Cu'599nR{ <e.%J%3S%J!J8; .23ByVPSP/W7'U-"HJ1b"UR?'O(#I7C9#Y CCC#!!2W>5)2.3K.J2..RM40AE0A C'7'/1W7'& > VK5,'[CR#HPQ=%YG B# C2PQ> m> B*. B KKPQ4%#=-*)G1) %GHENA;HK882<0T0W2Q-;.KBY'(#FQJ?QA5 CP9;$.7mL<EC$F"I_F#`$$FF$AP :BDr0U9Ec90 B<T, $Y<1t?<A$5 CP9E9 Y8 #7' :RY'E/ JVC 'd1' E'9;-'*YJ'/1(Rl'6l6lvE/Q)EE744/BgOx"U(!0(0/UY"U0/ A/7o{!/X</ e4^UkH/'M eB& . eJM?7"")?>q*"G7hE")?G! (QBS"ZT1R+ ;GH#JMBT'J]J=dEe=d J=d=d)F L@EEY&3 KA<QJ@K KA5KA?H;<LP<"Qp <E++9 %)N ;nI,8 ; %MZ %=; ;& % %&R8SGR8(lD_vT4 ]w$BgO"wU?#*(!0(0/UY"U0/q 1AQ$&<=Y,2-}$&A!`R<#@%A\AI, 1<WNQaA^!^:9A^7A^I&C& 9AP :BDr0U9Ec9=% B<T, $"pY<1t?<A$5 CP9?'^8<?""?7I@8R06&yL7I0,P VVoS>_ HLVN%(k )OFA(kb>)O&&KJH HO*W#N?O=88B9Y] 9Y78G:(G(sWLGOG6:(5:(@@ %jNL4U@\BLL2@0]2@A2@E5T K L3N '-'-Q$&<=$&R'm #@!`'mP4KX0P(/# b8,a)J)JNwJ:N*S*3%}EEO>IJH9533KX k R34Y iHQ0JiY FFpG . G/* TBBOOB(FFFF: $X f2 AS5s $(#=(# I f $'-(#(#eF%1RWPU)3Ml&Z@PV . M5A2Dv%,FM$ 4OGSxN SxT$AJK5,'CR#H?=%YG B^ CG8? m B*. B KK?4%#=-*SIRTFO6 2U-W$J6NDJhJ6L^L {Q.D*nS8TJ6 HQ6$I/<X(d5s/(#'(# /C2[(#(#[n115%I-%O8}{;8}r%g@-k#0y# 8+:HF%1RW-PU)3M:#&ZP* V5A Dv?'*F--GH"rB|/)>iVN(kK(k>)'s44T>$1:IUH>5 j'K84I8?? 
84??4-@M4U'82.'D6W,V1=$/56&963-9-E1%/N=J8&'E9t5 R| 6&6CC%.@/@/c5?GO"b&X 5QQcQ5QQ( BM2MEO2--uMM2--,PIwI&*I?B8E)CSBM#3~,Y<QI .WK >+.W*hWE# ?+E)E)8O8}{;4F8}'rP7'':7 kXDVCN*H(>Mx '--GK@AOVQHSDGMF&4MAOSSK(m%AOG8L@o&:4KZEGTv4j4M[:%%%%&n >P:% KLC:; :DB>N%ErU[ErCEr%I/5 R|MI; z# 05o zKez 22*%,'59nR{$)- )%,JM37J2-SJ!=dJ,5(4EN$/@6&63-NE1%/N=JHn9Hn0{Hn1T+ < ;NKIRd,8 ; %MZ %=;W ; % %PT2F DB;QF4TG Mc&K*4VM4K*/0j0jW $KX f2S5s $(#(# I f $/3(#(#/3e+K ;L+@p533KX 9 ?tURpIU/TA)/T)A,CWo!., :& gC5wAu-"*C8>X@CC?'*--Gp%"={MKWX "R2(2OKM& J?7KK";.=vN/ $8/HW,t._NI-:Lz ^YN)J)JOVLz LzS7-N8^p"Q(""E{F#(5#0#"V/(#'CGW C?K mS B/K"ZT1R+ ;.nE( 5N'-'-D-k>DD"H2hf%: Gs(XV5sGs(#'(# Gs(#(#F~N@YV5HRO<L(# ]DrW9F% :)zY'-BGE[?I9TUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S<B 4S*[ H 4N7B}IvN7*] 4N7N7K5,'[CR#H?=%YG B CG8? m n B*. B KK?4%#=-**%1(h(h:Rg:O J#KJB <N'-5d'- $m1c',k2%ANh/Xq&/V/IXcNhXI -=9"("OY" Z K4,{R 5Z 5= Z 8J=-J=J=Y  56E9 Y8 #7' D:RY'P/2&O9;C'd1' E9;'9;-S`E>60YO-T_Bk2++YBk'W&{3>)2&{(#(#M&{/3(#(#/3RWN <-/$ 3R1-/FN5-/'-'-1 =9+?),!N7uY=N5-=EP35'R7Bt3'S9-U $/'&'3- E1%/N=JG0O7GDN( G"N$8/HW,t._NI-:Lzo ^YN)J)J/#Lz LzS^N8^p*/DoX>%I _ X(=X%c EKLR&AGL &=8 <B96$ 8GUNN5N'-'-(o??4 GJ("h-?4% 6/R.U\N%R=+%Cn(9>;}AT IqWP ? BRS;4;4Jp;4G$OK 3GV/&G'-&aC1qC,)B2)B-/)BPq?OqS9*42!+=D7?8*4+=??O3W  @QLKi) 9 ES 9;%AS 4!]&S  3W% %Y-3TELD%uLGN9LD_C~31D(D31vTN7+vUS7F-2dA" 9 Ba4m89p 5qlT NNP3r$3r"DD3r*4%;3@q%;.a%;UM :6h;"x-H-H>NHv6 Q'UHxML HxF![Hx (.wm.w4a4R@(yXR@Hg1@tDlSz 1AQ%$&<=Y,2-}$&A0RY #@SA\AI, 1WNQa4%a&GZ!1XM!M!M!XM!M!!F:W>'JR%e+[#Tn%e6%e1 F:+:/E3BR=Z>]X-$/U SVoS*/)33Re$(9F5$ J{$A69Te9.97{16h=NL *1;NB*S)*NB*G+~' X4WDDCR++BE2)!M!UBEM!%M!"BEM!M!NU OD7n9 Jy 14*Z4-pzM 3k8 HLVN(#b )B qP#&| (k2+>)K1HD (3 (4W68(#4W [LS?>K H(#?s! N*'4|+f:Eu2|D4XNBK5sD(#B'(# D$D$D(#(#@?'C] C&@ 4KFS"@ H 4N7IvN7*] 4=N7N7=:vN% .%%kGW>A0)k.3K.Jk..0J~K@3E!L=B@wK@2W)95=VB?;22D80BSY6 )%CIRQ>TW<Buj66O}P6< ]Dr;4FB'TP7N''OG'WEY  VVoS>_CD B?9"$4? "Hb9"TRKV+N'-W|GW|YW|;mJv6PnABG\E5XP [ xB&OM D;o DS? 
D/*+@//X & K/ 0/K'K[D'0Y'8XK[0mG58F0mB0mF6TNPY ?6Mx)\H66|WV,0%.OGv?c `!6%O.Ot-/$  R1-/ N5-/Z2h(#U( a.|U(U(;VNM O+IJMN VT>21S.We6Q1STB91S,Q"`*.h (G?~>7?~B}?~W(& WDWOjS6 Oj3cOj/GSxN MI;SxzA05o zKez ' :a/29r c cDY c7{X>%I X(=>X3,;,'HV-C*>AV-NEWF; ,NE$`NE,')7'K O'Y7 Y77Y7F&5 UPH F>A.HGPX6|,P}IJH_P-;41y-[}([-L f21-?? @TI f-4??4ep5!*<YTFO6 2U-W$J6NTwLJ>jhJ6L^L {Q.DTJ6 HQ0!$LA 9i6 ]LAf$$LAA'\H? "7ALVL ALL=8G8W/9&J&JT)1Xj Qv 4,#Qv9k..CNTk?L+U M@ !(w;)Q&M1>aY/;%&10;$M$}R wY/%%(: I#YKI+INFuI1/)2Y]+rHYOH/)G'RM;-9C O'K'@@N4U@\BLLUS7D`-``@+ Y8 Y /TG dG8wG>!/TFO62U-W$/N(J%qD^^ Q.DTD^ HQ6$*.5N5M< XP+0"NW XP&AXP,?t? u?? ?QXDVCNHM(>Mx '--GK%AOHS&GMFKw4%MAOSSK(m%AOG8L@oKw:4KZEGTv4j4M[: / 6Q U$-@M4U'82.!h'D6,VQ!h$/6&63-!hE1%/N=J )F8TSKT("2TS'',TS''GM2?>6M6M6M6M?2FJ?2?2,j1,+++DY)0DeI:P?.F8)BO#rU1AE/T/G) FWo!.0.!w/G F11$.'E7/d):U91P?E7?? BY8Z1 )S N?7?FE1s.,XkVB<:V~VK5,'CR#HC=%YG B@, CGPBVF m[ B*.J? B K$VKVF4%#=-*1bM01b1b!j)&{0> f2I&{(#(#MI f&{'-(#-*(#e DFG$9QRFOOIO:lGIGIFOO:P(' A"%BP(=8F=P(==!LB oM;S!"("/.|!"".?"EA?7A4Y i0 FM90Y=6 "# 3"pY=6?=6A 35 CP9:-/$ R1-/FN5-/'-Ky/K /Fk,vRO<S(# ]DrW9F;:)zYg'- BGE[?IFTPYI)(N %4.O m @6<RR?69++!+S(#+ .OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<4, 'E 'Q ' *)V  1<8LA;HK852<0Tt0W2Q'G;,&ElYQJ?VQAEl5 C&P9#"M#73^7#774FNS|?u|P?E2$`:KB:KuGORdWvH7HH # Co5|?Co@rCo'3 ;|I,8 ; %MZ %=; ;& % %&P ?6Mx)\HU6|WV,0%Gv%%O/l'YFIA$f6'Y -DT'Y,j1 *X * LV++>+S..AMQ!G~i(V mS$ K !!?~>7?~B}N7B})??~*JE** 0J!m%%(WQ)2;1X1"VB}'3W  @QLKi) 9 ES R;%AS 4!]S  3W% %Y-3?B8E)CSBM#3~,Y<QI .W.K >+.W*hWE# ?+E)E)8.1XX)s(`)CI)_N -l'-(*%Ip X(=!JX%cLgFhC&|?u|P?0PP> f ZU5&(#E3(#M%WD$I f;'-(#(#he3|OFYa3|M3|!BQ =d 0BE7E7BI88g8g+x GCH9OS0p([-L f2G_1-?%g? I f-4??4e::#''kM $X f2S5s $(#(# I f $(#(#e;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7L |,N7#}?Y" M4C 0AE=*H3Rl0A1W) 8`N3Jp/Jp&i1W6R\1W"3[6n"3ELY,1OT3Y777'61>M^Y77 @fW&{.@> f2&{(#(#MI f&{/3(#%(#/3e|?u|?EE2E2 4Wr9IWr=dEe=d Wr=d T;W;0LHRU/TA64gCx;/TKo1B)T:LWo1QT1C \GuLH:LO ] 0aUX6)R?50a066}0a!!Y6I%~D +.E#+C, E&E8; . 
By-M4Y i'QW0HY= -L$/=0&=3--LE1%/N=J@B27@ G NPQ5FCVNG(HS4AO-;XXXG1V Y=v#%")@42z5Q)")(#(#(%D$")(#(#mJ#wJ 0J%JbA/y:BDr:H>=%,#BG=HTRE@@N=IJ=-;@41y-[}W''5I:P;Q* o TBLF"OZ 1TBPTB4m89p 15qK5,'%CR#HO=%YG BQV CH`'F mF" B*. B K\YK'4%#=-*W sPs>T s(#(#M s4-p/3(#(#/3R3 A iHQJi* @6FFpSG .4 !!0GPP/,1O%+Lww*$;Y3H@*H.5#@,0*Fq(@@*XMQtXW/V <N-/$ /HSR1-/FN5-/'-'-& Z ZZ;&(wYP;&;&jW ;ejJjP)Wo!.1,g;$ ,O;"N5;%Ia&:Y /XIM!M!M!XIM!8Do 0M!!F* 7UCJ)|>"7UG"!V.N"UIJ"-;.41y-[}Ul+)(.wmz:.DC.}.w34aDj(yI*`=TUl1:@A LbR(RVN>?,GAOMHS@vQ%AOSSIJAO-;41y-[}N+4,<?S-5X|<<1XX1ObW;%?>?@\WB.>F009 M-@M4 U'82'$&pE6," &$/E66K] EO&p-& NYE4 c4>>4E9 Y8 #7' :'Y'Q?/9VC 'd1' E'9; .H /y:BDr:H>=%,#BG=VBTRE@@=IJ=-;@41y-[}+:"W3A%K@".W)E ?'!G". WW,-@M4U'82.'$6,V !$/66-!P ?6)\P'.WV,0%GvI+'%OTvHt%MMJ+E-J"p"p]AJS@QS@S@/%5 +'8"/oCC.8: #. %3U .?_P% N2'-DNFu:BDrY]+rHYOHBG'=HT-9ON'K '@@N4U@\BLLP!o@A4Y i0 FM90TY=6 "# 3Y=6?=6A 35 CP9<{q?OqYX.;C><9*47?!Bx 7?BJ 7?8*4 ??O0ZAQ-=j?\N5CN5 P45Y%5%0 $X f2*S5s $(#(# I f $'-(#)(#e 7iR?'+QC9#3C NC 5G'L3'-h6UT /EH@q9OqLA9!#zafDa')&{0> f &{(#(#MI f&{'-(#(#eI&PgIE5I(K4F(Q Q <*(2.4326]952]B-&}&}LHRU/TA64gJx;/TKo1;)U<,JCOWoCE1QJ91C QfLHJO Q] Ul ~NV$-0 BNo@.Vw4lU2Z/GRn(yU2=24'<"<lUlRn$$QE R<<+<44Z*XV&l;0;0;0N?O=8B9Y] VWY78G:(G]%VWO:(5:(@@NVW4U@\BLL (N(N@(N=HcRV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJAO-;WY41y-[}-Od'13' ,82OX7A2$/*#&3-E1%/N=JvJ/ " !#JJFLRM>B/SFLFLUM 1H421J11ak1l'(%(.8'(I[UIA WI[)I[WUK?!(U5R1}F/!(`%$&M:$6:%$ '%$U$FPH "\1A!"1c',k2%ANh/D5Xq&/V/IXcNhXI(LL7_(:)rD)Y":)5H@V5HW5HV"(bm&q.wl4a5%GE4'<"<lG$$QJ-J"pJ== W-/$ R1-/C N5-/ `D@K1#$ *#C N5#F;8k ?++1G;+ GEqQ '?]3=C$I I&'?SI&MZI&+>9FF #<))R34Y iHQJiY Kv FFpG .R) G*o@6:'Ue1@562i6HdGN7XU2DX X%{L6>{ENE>8N}NV4-4 ZV)6(%U1AKTm.1FN51Yi'-h@) ,@#MM8 8$<@@88< ]Dr;4)zF#B'tTP7N'3e'OW $X,S5s $(#'(# $[(#(#[ %ICT.C-C?~D>7?~B}B}?~VUxMi2i->L->i-> Z&;::|%G&a|; SOZG&" ";?'+KG& -'0n%92N'- NO#NU=G7OG-@M4U'82.!h'D6D,VQ!h$/6&63-!hE1%/N=JR$,[l96;l C?PN:CW C$];.K~ C!ORCK95[#|IHxL UPH H6|LR/R/KNHN=4APEAP`.<JX f2,5s.<(#'(# 2D$I 
f.<C2(##(#eB.\DQ?2d0F@OtA,F@JGF@.(FF;SJ;SS;S+&&/1.Y9J:5 K'81.8A0 K4;X8Y78l+K@] Qj' 5GV, 5; 5>$ 9@>N5>4-p;4+DX;4JpJp! u";4:+X'O?Wj8"g>6uL;8 3y*%,'5;%,37B2[GSk2[3EXDV2-CN H((!Mx'-MGK@AO lHS&&GF3MF-q?N4MAOSS2-.OK2-(m%AOG"M2-q:4KZEG.O4j 4M[$V* 7UCJ)|>"7UG";!V.F"UIJ:"-;.41y-[}# ?$Sl",D#$@VOg#33+:W>f}:}&1H,-,W $KX f2S5s $(#(# I f $/3(#(#/3ePPPTFF0J~K@3E!L=B@wK@2DW)95=VB?;22D80BSY6 )%=&5ZG=F?=E^E^<FR?'O(#SCW9#C{M)zX'-GE[ BI CI)F3 G'4 m @6<RR B*!O(#.OO7<<3e B/LSNa6.5&K4:(#(#4 &.OP4j)4S</hV+N'-'- H.'E7/d):U91P?E7?? BY8Z1 )S N?7?1s7.,Xk~$ ,1~>N5~PPPxT,LKIx@m>uud 'Q'/!&pHx;C HxFHxK] EO&p NY"Q("1"=P R%?@\?@\BTGGR&{.@> f2&{(#(#MI f&{/3(#-*(#/3eL!JE]'%^E]-FQ-E]-YY/RV8N 6?,GAO5HS@vG$9JtAOSSIJAO-;941y-[}fW $XS5s $(#(# $/3(#(#/3]= -IX5 =W51FFf%s8!)O%I X(=X%cI V/ 6Q MN@5nMv4e + M;>0L{;P;"A \ 4KFS H 4N7IvN7*] 4=N7N7="(bm&.q.wl4a5%GE4'<"<lG$$QK $KX f2mS5s $(#'(# .I f $[(#X(#[e. E '+; EB$ 5=U)127SgR)1:Q34 ::8$A(JSs[J* 8*T*'UV@xFG$y3O 5JR9&U)T#)TSD)T+( 5V, 5; 5AAD;$$:BDr;4FB'=rTP7'X ',f}:*>A}NE &F; ,,NE$NE,X,<I*1k8 % %MZ %=;&& % %L#:'!LGz1L1FF%sFYFY-!q=d#"I: oP?S4A7XV o-i 8F"**C 8KeP #C?^.$" FE)=*H&2Z>F 1W   k/ 1W61WT;"kWW;"uQMY#(t6)g6H%%x @1 0O@+Q.m>>>Y.l G,(Q XU2DX 2X/QNc;2Xt/BK/P Q5'-uP77@c/;P77TRYeY7YYJ'D6;'D +YD'D:3b,4NRO<L(#+DrW9F% :)zY'-BGE[?IXTUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e ?/LSNa6.5&K O(#4 &.O4jf4S<KYN#*S; c.`K *S 3C[5`@@8zAHnK`THH,j=8 <B98GN'-'-*#S?'*-G*%,'5%,37S5SkJ54Y++;+++)eVVgA UPH: F6-DHGP6|4 FN5P}IJP-; 41y-['-}CBWVGIVG8wVG;m&L/q k%%115VS%DG}Q16%D%]%?%D%%"X%$ X%V+N5X% HLVN(# ) qP#&|(kH2E>)4=HD (L,E(4W68(#4W [LSS-'ITH HL,(#?s! HN*'V-EI R?'+QC9#'C)z@GE[ B CE20 mE[ B*(#7<3e B/1 0O@K 4W<Bu+Q4j`.mPPPxTAD9#33UHr L' rU0|K5,'[CR#HPQ=%YG B# C2PQ> m> B*. B KLKPQ4%#=-*0SY S2FVN(kO(k>)H|&|V6 (tX f2U t Ae/I ftB!s!se2Ge!Mz )Q Q RO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)(N %4.O m @6<RR?69++!+S(#.OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<)OKM!5 ADoDo9[): DoDo/>vX%X>%I7 X(=X%c1'=]7IB$`:KB?L 1F':K@ruP- @OR@X_@z8@FA&7I "cIS>Y MN .TB ?I+Gh 7J7 HI+N7N7IvN7*]?j?jI+N7N7Y6I%~D +.E+C, EE8; . 
ByL%<B<(G<W" IOIW"AW"Wv7HHzYC,A @ R8(o 4'0@l(oK%(oSGR8@lO(lQ$ $XS5s $(#'(# $[(#(#[4)VtR H(-`02?ARYR9nM3 fYP:GW=4<SP:N=dP:7oW//b/UkM+lOB:IUOH>H>536)ToEP. f ".WE(#(#M%WD$I fE;'-(#(#heC/# dW&;8*8*18*S_bVLC$#N7"47m!%S%"4FF_F#` ` `"4F'"\&F@"Y}P5A@P  6PZ.v.v.v.v'>|VGO>X QQQQ8&$rQW[) 8$r) B B B1Vx VVoXC`$)a07S<iM!C'|CiMEGM+<1'| V VTP2K%MDB;Q%M4Mc&K*4V4K*W6I+BqSkqO!%M%MQ+U0*@< ]Dr;4FB@'TP7N''1 0O@O+Q.mBV/1I ?!1I4)1I0Q< 8J1Vx SVoC`/-0<SMW9&-*C*ME%MRW!1-S P O+8;8(+8B,875%S8787#:36ATBBBB"8S2]:|Cj%-Ba2L3L11 L%'LK1 -0n>EV%DNHzYC,A @R8(o4'0@l(oK(oSGR8@lO(l$`:KB:KguB=/L]U O !L]GL]YiX>%I+ X(=FX%c[S`U:Y6?5RWc.FF`"{dS'.6:"{'"{U.FPH "\7{)O4PJ }WRV%NWY?,GAOHS$g*\WYFFRAOSSIJAO-;6_WY41y-[}2(UPH 9n0@  HH>6|Jx;B^S>=d<>AL8+(2A = R!|e|@4u1GF#(5#0#V/(#?!%07=EQ+Y< 0 9=AU&(G Br" ?U&&U?LRM B>K?KSIO#L"B/;K aPG'5;K-Do;KRM! nl ( A/7oW/X/b!/RUkM+lOBO.'E7/d):U91P?E7??;B 8Z1 )S ?7V?W/91s7.,Xk5ODu)27SgRDu:<4 :: A#EUVY@x?SmQ7 -i*QT G+O Q%g%g/5*QGe8\*Q#Q?^.e$" G""G'LD:GD:6DD8Y XOA2I R K $XW&S5s $(#'(#  $[(#(#[0#EUV@x?Sm7 -i*QG.iCY*QGe*Q#C?^.$Uq*N7KPExUN7JN7TUN7N7WT,FX f2@5sT,(#'(# I fT,C2[(#I(#[eQW Z&;::|%a|;JSA1 ;?'K -0n(?H'(FG$9FOIO:lFOOF0&"CURl&Z&d"[>)o'<M5A[%[,FM$ 4O T H^T+7V>{L6m!n$P$/ t:-5?:\:-JJ NFu:BDrY]+rH=%OHBG'PT-9C O'K'@@N4U@\BLLVM-*VMVM1+ 8dp%"={MKWX "D232OKM& J?7"" X3P3W/hI$ IV+N5IYi-p'-'- Ae |<59[/I 1n&[>] P-8Ui62>]Dz?>]:|Ui+HQ _M] _ _:g ( J !P! !FF%s6QM OB[#M 1G19M 116: Dr;4FB@'TP7''1 0O@+Q.m 4KFS"@ H 4N7IvN7*]D 4=N7N7=([-L f21-?%g? I f-4??4eE K@3K@".W)YrV!G".VW,,#,GbL ,LS2E i7~DPM i7?\N5CN5 P45Y5&'&'$v1aH LC,A @7UR8(o!4'@l(oK(oSGR8@l(l/<X(d5s/(#'(# /[(#(#[k1l'(%(.8'(I[RGUIA WI[)I[WQ<=$&R 5`3%7055`Gn5`;33j -FX fGU Ae/I f '-Xke);'WWS);j?- $ X<S5s $(#'(# $(#(# t%LHRU/TA64gCx;/TKo1B)T:LWo1QU1CLH:LO ] JGwGwAm?d?d= sUC5=U27SgR)1$:N34 :M:>$$ YL"y'LGzKL HP2F.V9R HFF \GuBe~)).))W$KHLW$/W$XXXsXB'C C?K mSK NO#N=yUADAD8rK78rML8r+3b+*`G*`0+*`%OT%d%OV+V+%OH X8h34W!B8h+X8hmF%1RWPU)3Ml&Z@PV . 
M5A2Dv%,FM$ 4O2Sn1RtMq1V1 sHW R:"(bm&q.wlE4a5%GE4'<"<lG$$Q'E3KJ".8 uS;YR1XrFT,KB1LJLXV2g.,B}WB}L U7>L |?,N7#}7?YcUf*JUfUfQQ$ $XFS5s $(#'(# 7 $[(#(#[R$,[l96;l C N:CWK~ C$];.W C!O;,!RCK95[#|IU.CE!"U&UgRSXLoI?7;V? Y=$,FE$ g00D962J&N7,?96$NN6 4,pS H 4N7IvN7*] 4N7N7IBIsI).=.II0eI&I EF'+; ED 4T.TQ7QFN]FQFO'+1bDM01b1b!jC$ <E$C$& =& C$& & :BDr;4=%FB'CTP7'''-4R8!n$P$2TpHoNB+8w0ZAE $X f2 S5s $(#J'(# I f $(#(#eRO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)@N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K O(#4 &.O4j 4S<%a&Z!1XM!M!M!XM!M!:N<8'O?Wj1/|BM1g>6u,5):/|!p #qBB/a+Ut&.1G191&'*46N*T5  u:7N$9 N $8$%?@\?@\JBRV8N?,GAOGHS%V74GGAOSSIJ,AO-;441y-[=d}"Z|X?< ]Dr H:K()z!c'-BMGK@=BT;GF3NMFX4.OM=.OK(m3e=G"M2XO4KZEG.O4j4M[$V0J~K@3E!L=NyC:K@2W)!IUNy=?=G22D80NySY6Q\%+&&/.Y9J:5 O'81.8>eA0 O418Y70m8l+O@] Qj&PRV8N GAOSHS74GGAOSSIJAO-;441y-[PKx&Dw6"wVDDw*QW=s> f2$R=(#E3(#MI f=$/3(#(#/3eE*AM?MoBWp)S1Wp1 !#ST'T_1#S4$`:KB:K]u@J~ kJ~J~RV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJ-AO-;WY41y-[IJ}R$,[K~l964ul CNo4uW8 C$];. C!OR4uK95[#|IT]"IoL MU MPA4 "(u-V^G'LDD:6DDY yA&Ty'G'9y' 5DN]C,P*C(W 6>]Dz?>]:|W +DQ:V68$O?Wj1 M1g>6uU+=  UAPBN5Yi'-hF6EU"F?=E^4/;BgOU"/GU(!0(0/UY"U0.!w/G0/11$CT.C9C' 5GV, 5; 5:Yb:M!GM!:M!RH?LC,A @7U(o!4'0@l(oK>(o@lO+RQ'K[D9J9RO'8":5K[H4G4~O4;H(7Hl+O@] QjR3 G8 iHQ0JiB TFFpSG . !!0GPP?~>7?~B}N7B}?~ARYRY 9SC99T<+7,G?I! B&$1"O6J !dJ G.pP$ ?J !dW!dP:A67$ >>4>Q%LFN'- f '&G+ (#(#M%WD$I f ;'-(#(#he%$X  f2 e%AeI f%eJ?u GI ?4FWA#/6 2NA#CA# CFWFW2== T T+8L6mE>8CmJ 3hM\+*?L++OO+F#(5#D0#"V/(9#RIN#v9(t6)g6%x-<((~u1GRV8N?,GAOGHS%V74GGAOSSIJ6AO-;G441y-[=d}<''/1'r6l-Od'13' ,OM7A2$/*#&3-E1%/N=JQ$&<=$&0Rx#u(rs0# +K A/7o/Uk9Z6WV61$1M1<-N/><-H1<-MM"&E RHK u?*k;W u1y/Q= (W1DH;.1!ORQ=K95[#|IICK5,'[CR#HPQ=%YG BV C2PQ> m>T B*. B KKPQ4%#=-*,@F'C C (S],:3jP(n33jV/& r@3j'-h4Cp'J&'JF6%]%?'J'J%%pH>=F{M#X)"6R6R2PU8_6RKM:IU8_H>H>58OGn31QZOw-?BX9B1BV"LUV"}NMN.FTD1S,T8W8T8 } Ss1e$%7mEC$F_F#`H$FFN?O=8B9Y] VWY78G:(:G]%VWO>H:(5:(@@NVW4U@\BLLH| >E>EpJBF$JFFN~J8FFS#GSO5R 6.(FHla&SXM!M!XM!M!? 
$X f217S5s $(#'(# I f $(#(#eXQX JwMXQXQN@ -nRO; ?8<?"?9# 9#$PHX$P9#$P9FVSW9FHY9FA<HZHZHE!b!bM@'MH]$M58>{UPH F>A.HGPX6|,P}IJP-;41y-[}/y:BDr:H>,BG=XTRE@@F=IJBI=-;@41y-[}iM $J #8 ERV%NWY?,GAOQSHS$g*\WYFF?'AOSSIJAO-;GWY41y-[IJ}S, S"eS? /Dk49 YN?O=8B9Y] HY78G:( G]GO:(5:(@@L7NG4U@\BLL3V*T"K,LG$O 3GV/&G'-U0{+ %KON)y'-?L>JEN f 2HT5s(#'(# 2D$I f;(#(#eR?'(#SCW9#; =)z'-GE[ BI CKYbI)F3 D*4 m @6<RR B*!(#.OK7<<3e; B/LSNa6.5&KD*:(#4 &.OR4jS~Pb4S<#D 5#0#I(#6,6$"N76Ht%MM9[[Y@P 6 >]-@M6DHvU'82'$U&pE6EL!$/E66K] EO&p- NY,A-@M4U'82.'D6W,V1=$/56&63-E1%/N=J|0W6I+BqqH| >E>E 1A5H}<=?p&],H- 2@T0 nA n3#@ nA\Au(rs0 13 NK 7U:CY]<7]?YU7UGHJ!C}C ON5HJHJ@@N4U@\'-BLLA=BN5'-h+&&/1.Y9J:5 ;'81.8>xA0L;4B8Y78l+;@] QjH3XcHw,A @Xc(oQ4'@l(oK(o@l_Y5)4o9YF)KJ,KJ'YYKJKJeene!! D119#"P&qP%5%GEG$(G0G VHW B>0#98B*aj*A=*$/UD}&@b@b4U2T,C996PO;(D'IIW%H  f2 e%AeI f%!s!seJ 9eH<>''9%G;N/1WR\kND/;/D2CiNKD2BF]"D2(#$%7mEC$F_F#`$FFN7M;6!n AP$-P88HVEE#=UK?!(U5R1}F/!(`%$=X&M:$6:_%$ '&%$U$FPH "\*NB:{*NDoT6Do;*NDoDo+R RFW8vB CFWFWB=,tUtO$.#1G1911E)CSX K. E#  E)E)8J$H. HLVNHI )~ qP#&|(k.++>)$ (.+(# ['C. H.%o(9HJ*'?80CL=, p4?WI EPPU'+; EPP< ',',PJk& I(CN%%V#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6LpOc#3 -LG5Hm". B>B/a@<5?;K<&Up=4w8 mp'} p 0J?S:XJ9 f 2+<75s(#'(# 2D$I fC2;(#(#e2S*12S 2S.A#..1</i/./iH"/iVN(k(k">)&|JaItIt-0gK1@*W.W.L)W.W.S,K5,'CR#Ho=%YG B< CGE0$ mNS B*. B KK0$4%#=-*?B8RCS7=3LG4Y<)W KW_!*G4(G(GW*h"WM?e?G4&U? 
+ R:BDr*k;=%W#B1PT/Q=W1DH;.1!ORQ=K95[#|I4/BgO"U(!0(0/UY"U0/N+<44EZn#WWY<WS hPEASTb2@0]2@A2@E5+VT $ X1S5s $(#'(# $(#(#2  mpH>="2ML7/4(LL7:+DD?L?!X!,!3K3W8Y8@Q : E() W} ELQK,dcE( %AAL#!]6L 3WE( %Y-!]3I:P;Q* o TBF"OZ 1$yTBPTB4m89p 15q#N7"47m3)%S%"4FF_F#` ` `"4FF'"\&Fa(P<4(H2%%:BDrBGTOFc mS?/Fc-r [-r/\-rNb9HW H=.0J~K@3E!L=Ny@wK@23W))IUNy=?=;2A2D80NySY6 )%<UV <NHSFN'-'-33LA56 ]LAff4LA4+DUPH: F>A.HGKPB6|,FNP}IJ-+P-;41y-['-} -0CD< CDCD;mJvNK 7U:CY]<7]?YU7UGKHJ!C}C ON5HJHJ@@N4U@\'-BLLNWE> HLVN5(k )S(FA(k z>)C9U8bKJ H H8*W&#!+'R~E-%6"5P:A6>46' ENA' sBE=8F=' ' E==%?>?@\Bt> O*;O*C O*-@M45U'82.N'86 T,S4EN$/6&63-NE1%/=J&/8Y.A8TIO/OG> O=O==KO=K.a=K4'OC & I'-MGK@:( CJUEGF3MF 84OM:(5O.OKO(m%:(G"M2N :4KZEG.O4j;4M[$V( I55!>?'P4N34N+Nn4NP8M $~$17!X74?n+ ,X,B 4S7 H 4N7IvN7*] 4N7N7Bd-;V;BdK?-3BdBd-XR9c,|0tUOYTW<;`-@M45U'82.N'D60>,5(4EN$/@6&'63Q-NE1%&/N=J QaQaA^S!^:9A^777RA^BE2)!DM!UBEM!%M!"BEM!M!8J T2F FR Y#.%o(9R$;I79S61 ;/| C CX3GW C$] C,5):/|RG!p E7E7,iB8gEEEE4ACf$$,[l9lEN5`3%7055`Gn? 5`;33<g7M>7;X?< ]Dr H:K()z!c#'-BMGK@=BT;GF3NMFX4.OM=.OK(m3e=G"M2XO4KZEG.O4j4M[$V'YG6'Y -'Y,j8uY 6'llNP:" PX:?? ==:;??6:I[/H8D7\:Q%U)yAKPHo)yN5)yYi'-h3XcHwXcUg@lC(o@l1(q1V.1/W*+@0/</ & KG/ /+<K V VTM 9@$Ct 8)Q7! BoT\ 6 M Q&! 6CXCX/;4A +'Qr8"/oCW?PN!OCCQIRQFQV NY!KXD2CiNKD2BF]THD2(#0@'8'Q)E -#\%QF>7!.Bc=?F>]QF>D80BcSY6 )%!MNV$-Fe4lYZ R(yX24'<"<lR$$Q#&,LJA/UFNh[p%s%JN3[pO8pIXcNh#NEXI$5JP0 ]<N)$vJI$5@J%#7IB$`:KB?LO'1F':K@0uE;YOR@z8@F7IY"cIS>Y 4TWKX/W*+@0/`</ & K/ /+<K V VT{b{_{O\TDG$OK 3G"V/&G'-2(h) V --V V W $X fS5s $(#(# I f $'-(#(#eW|QGW|YW|;mJvRL:L!,"a7 $aKa-$ (DyN'& N'9BN'EZE'L9[[P 6 >]:?B8RCS7=3LG4Y<)W;KW_!*G4(G(GJW*h"W@%3&?G4&U?< P /Y=?6)\DKWV,0%.OGv.f%OJaItIt0gHKK!-AS!-<!-K!S 4,p 4N7IvN7*] 4N74C<  Z&;::|%G&a|; SOZG&" "?')KG& -'0nR66V%x %{Z  ;: ; ;:GW8 Ia6/4$0S(Z6/*6/(ZXN/)/)M;eF0&"CURl&Z&d"[GJ>)o'<M5A[%[,FM$ 4O"+?1\LY3:U@\5U#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGe*Q#Q?^.$A#8wVGg=X?< ]Dr H:K(Mx!c'-BBGK@=T;GF3NMFX4.OM=.OK(m%=G & CXO4KZEG>4j4M[4&*RO: (# ]DrW9F6U:)z!H'-BGE[?ITSPYI)L  ,4.O m @6<RR?6! 
(#.O 7<<3e?/LSNa6.5&K :(#(#4 &.O4jR4S<==I.MI&IX&JX f2Y:5s&(#'(# 2D$I f&(#=X(#e;.NFu:BDrY]+rHYOHBG'T-9C O'K B'@@N4U@\BLL<s0 B>Ba@<<Wr9Wr=dEe=d Wr=d rSeP#X3SeV/& r@SehI ! TQ&!CX/;+x,q:1.,q,,q-N?%-5(#-(#(#1sUPH F>A.HGPA6|,FP}IJP-;41y-[} zQ;j(#*'2lV7-fCX' G/ AB BSDr2i5IJ.uIJP2IJIJ/h-/$ R1-/V+N5-/'-'-@?8@P0PP/6>;UP/(#(#M7D$P/4'-(#(#NT-c- B?<?=?? mIBOIJI ;, ;MZ % ;Fli$ 8iN5i'-'-5 R| ?C  4! 3 :A wT@:N5(:4As(#h0 $ XOS5s $(#(# I $'-(#(# NY#D4V%~D +0E?-C, EE ON'-'-"G""G<P(%:EP5 Ia3.6/sS90(Z6/ 6/'C.(ZH?c3k%7kJ3|DCs4OY/?1nI4O(F5:$9Ui6H?:|Ui+HQzF////p"="324Fv6WJFv"^P XcNhO~K&=^ACXO~M!M!XO~M! M!+g 'U#L<=vL(K1a"?Km@!aaTh"GT&%45A2{ ISLN5;'-VhRFK J&/8Y.AT8WX19/Y= HJS$(Mx9'-DMGK@"AgNGF3MF,4.OM"U.OK(m%"G"M2,:4KZEG.O4j4M[$V b' <C-/$  CR1-/N5-/'-'- \ 4KFSM H 4N7IvN7*] 4=N7?YN7=5P/b/ X"(i(iP PIPM!URp'@U'')11)I)K=< ]Dr;4FB' TQP7F3N'7'O+y?A2&AJ=,22;E>>WN?K5,'%CR#HO=%YG BQV CH`'F mF" B*. B KYK'4%#=-*NO\NO NO3A4 ](# fE T` 6 W26 qO>> 0A+0 qwy+1}E+" M4C 0AE=*H3Rl0A1WX 8`N3Jp/JpO1W6R\1WQ"3[6n"3EL0W3!E!Ga,34 K@". 2W)EM? 2*f2!G".0WW,0+X8DX XL" J8 Y/JRO<L(# ]DrW9F% :)zY'-BGE[?ITUPYI)=N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K OO(#4 &.O4jf4S<VNGHS4AOGWWIDIRp;.ICG1)R32ACHQ0Ji* 6FFpSG . G#yV0U#yY.s#y?I 9w%45ASLN5'-h7 7 0&{Y>PV&{(#(#M&{'-(#(#C "^CH[[P=JB=pF$=FFN~=FF HL(#8 ) qP#&|2E$4I~HD (.E(4W68(#4W [LS?>K H.(#?s! N*'4| 0G Q4(v1(v(v1):T=7,%?>?I! B&$1"@\6J !dB.pP$ ?J !dW!dP:A67$ >>4P25n?6)\eWV,0%.OGv%OJ H^b=\:nH^>{>{H^1"8@@"y&J>60 ;-WE9 J8 KI.7x/v:YS?'/$x\C'd'v 1S?E\JJK,:BDrBTO P{W%U)yAIJKPHo)yN5)yYi'-hQ $H.,!*%,'5%,37S5Sk5/./JBi/..PY ?6Mx)\H6|WV,0%.OGv%OU1>Ks;4DX;4Jp;41Vx SVoC`/ }0<SM9m<" }CME%MR1 }S P O+1Vx SVoC`/0<SML<DPCME%A_MR*1PS P O+>QAd Q'O&pHx7A HxFHxK] EO&p NY(o??4 GJ("h-?4%K/R.UV+\%R=L%Cn(9>;}WoCTC!fCTCTgRS o;J<95r;eVe0;ee/@:TPAUWo'TPN5.OTP4'-h77Y|U/MoGM~T.E7/d):U9NP?E7??*gBWW45)S'?7 ?L5s.,XkM40AE0A C'7'/7'.)0'0XP XP&=XPBTT ELD%U\/NLU D^L^UP.UDTD^ HQ6$*M40AE0A* C'7'/1W7'LHRU/TA64gJx;/TKo1/ )D<,JCWoCE1Q1C QfLHJO ] t>3lTTT TT+`88Y GE*AM)1/NPE?Mo( 5)S75s.,XkK5,'CR#HC=%YG B@, CGPBVF m[ B*. 
B K$VKVF4%#=-*"!"T":g;=4Y iY 4-Y7O=6-7@ =W2P*0J>1jY9X&!rPFy$.H31L0ME7/dE7VBA;HK852<0Tt0W2Q'G;,&ElYQJ?QEAEl5 CP9HV4M!,:(k 1Vx SVoC`/ }0<SMJ!<" }CME%MR1 }S P O+;X,X1V,X,60K@3E!L=#\@wK@2xW)95$Bc?J~22D80BcSY6 )% HZH<S4<;)QK1r/* W5Lf+O:6W5%$qW5):g)FMN@EEY>&3VKA=9.@KKA5KA0K@3E!L=#\@wK@2'W)95$Bc?22D80BcSY6 )%0J~K@3E!L=9@wK@2DW)95Y9?(#;22D809SY6 )%<%0 -FX f 5GU Ae/I f '-e* J"NJ?'^8<?""?R?'(#I7C9#K H.O(#?s! N*'4|;!R!*!BA;HK82<0T0W2Q;.KYQJ?QA5 CP97Z3iBlL.  B BAA BAULH/6/q8Cx |KoO">^M):LCWoO"/QO"CLH:LO ] OA.q #S.q>+QY >#2 *> 1c',k2%ANh/4Xq&8/V./IXcNhXI/@/\3w7FFNh%sY7HIXcNh7HXI GhS9j  HN7IvN7*]?j?jN7N72v/kT: Dr;4)zFB@'TP7''1 0O@+Q.mX2`Y H4(Mx-'-HMGK@P.r6|G0MFBH4.OMP}.OK(m%PG"M2BHO4KZEG.O4j4M[$VB- ]:d@vB-IB-BQ:%wS|Y=4O4OF5*B=U)1+%H+?KsA)1V3 'FQ LV0V*FN@IGSxN SxT$AW2/dE7QOB=W2"<l&aC1qC?0) m> D2CiNKD2B1"D2(#V ;H;G;D//? $X f2A}S5s $(#'(# <I f $(#>(#eUPH: F6-DHGP6|4 FN5P}IJP-; 41y-['-}U,p<KPUN7JN7xN7UN7(N$E(N@(N+eQ&$9))`QOCIO:lQOO+ M;!@>0L{;P;& "# -5'  E5T>Od 'Q'OHx7A HxFHx GJX f 28S5sG(#J'(# 2D$I fG;(#(#e4'4'4'4'4T .'E7/d):U94P?E7??&%BW+ 4)SN?7?4s.,XkT"p2<)P,-a-a8K5,'CR#HD=%YG B/ C,IGNC mC S B*. B KKGN4%#=-*ODu)Du<N':WQ+U098*@j*A=* AJ&#[J&J& E '+; E[ (OQL(OJ8(O0J~K@3E!L=Ny@wK@23W))IUNy=?=;22D84I0NySY6 )%WrB'J]Wr=dEe=d Wr=d=dP25n?6)\eJ WV,0%.OGvY%OX?<+Dr H:K(Mx!c'-BMGK@=YT;GF3NMFX4.OM=.OK(m%L=G"M2XO4KZEG.O4j4M[$VUQO~,&&ACXO~M!M!XO~HRM!M!HR-WVBP ?WIPPUPP<?O>IaW!:@&I0+@X> 02 6I9nH{ <9D)i+G5-S9D*a=d9DX** L{L6>{? ENR9E>R\%H  f2 e%8I f%++e1021J!1Fm O&vM ;NKI$e,8 ; % %=; ; % %K:B:DrRF=%6x#BGK?TY*/V> mN5?6.? 
KKV4%#'-=-*M55 C%?@\?@\B9RHK u?*k;W u1- y/Q=W1DH;.1!ORQ=K95[#|I -:$+K -,jiF UF%s8W&n"&n&nQC 90&NRKRR:` +:`:`<+.[*kJNp%"={MKWX "7232AKM& J|?7""JJCCC<\06!oQl!OFJOF!OF$lN#?)><LN'-''/1'6lRRRARh!< o;S!"("/.!"" Z;::|%a|;SA1 ?'K -0nF?2,j w?}SLRO:S(# ]DrW9F;:)zYg'- BGE[?ITPYI)  %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K :O(#4 &.O4j 4S<&Nhn>nNnX> E%I.8 X(=B'+; EX%cT[ '!i'!u-d'(_$ (_LN5(_,ALHRU/TA64gJx;/TKo1)D<,JCWoC=1Q1CLHJO ] T+-:T)XT3pCo15|?Co@@Coa'D)7'K O'UP  $w/=$G7{7{N"7{7{ % J8 YS?M/M>CP'v 1S?>JJK,HnR5D9Hn0{Hn1+ UP /Y=?6)\DWV,0%.OGv%O J+mI JK2O J,Z-&$-<-DwMMDw+DwSE Qf-K ?5KOBDoh;7IB$`:KB?L 1F':K@0DuP- @OR@T@z8&@F1r7I "cIS8>Y 222=3]GVP-8Ui62>]Dz?2c>]:UL?|Ui+HQ 5 77!\TIO/ *rIV1DBO%BTG@K7?%B -%BA77/W7K;6<G2K%M%MF F%sO6Q3 1S?ERYREL,RY;RY5.q.q.N+m!FP!!0J:N75 Ia36/SS90S(Z6/X6/(Z.FlN'-'-&H;'CH8 C (2C8 PHV0G3T<AA#lY97?d! Qz7?BJB7?,Q"`*.h ~)))),'H1MC*>AV-NE/F; ,,NE$) NEH9S],,X,X,F^FG$9FOIO:lGIGIFOO>=X44$v44 K! -8KK sTQdsZsSSP`@ JX f 2KT{ 5s@ (#'(# 2D$I f@ C2;(#(#e:C:; :(SDlNFuI1/)2Y]+rHYOH/)G'M;-9ON'K'@@N4U@\BLLJUc*^P *^~*^0-%a5DoGZ!1XM!M!M!XM!M!!FTS"TS'TSTFO6 2U-W$J6NTwLJ>jhJ6L^L*uQ.DTJ6 HQ0* 7U:CJ)|6-Y7UGK"!V! FN5"UIJ"-; 41y-['-}p"="24FvFv4TWR?'O(#SCW9#6zM)z'-G. BI C5UI)F3 G)4 m,C@6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:(#4 &.O#4jH4S<3;cKZUw,NK[K[G(&& 2U89 EO&p | +B+:KFN'-W?7L| H 7UC,A @7U(o!4'@lN(oK(o@lII+IzD-p-;YR1XrFT3KB1LJLdVXS >3WL U7L |3N7#}?Y;Cy P7!y y!RWP_VLA56 ]LAffILAG88Ga@!aaTh.#. %.W FJ{NLf OKSfJ{%$q FJ{( E# =$9&# OCOIO:l# E+(OO(w$c&?N.w(GL?S!w"8T2F =*,HFW ?W?/:R\"W?[6n"3EL5ODu)27SgRDu:<4 ::#J9(? 8"R34Y iHQJiY FFpG . GPY ?6Mx)\HO6|WV,0%Gv%%O&DKy E E ;:@V***%3Cwi z(3 R$,[l96;l C?PN:CW C$];. C!ORCK95[#|I<{q9Oq XR>Y97?! 
DR7?BJB7?,R"`*.h U-/?0AN-;.;.I-;.;.16h=NL *1;NB*S)*NB#04V0 ?-4 BR %RV=R?'(#SCW9#; M)z Q'- GE[ BI CKYI)F3 GN4 m @6<RR B*!(#.OK7<<3e; B/LSNa6.5&K6:(#4 &.OJ4jSkPb4S<=<3GjGs(X .DGjV5sGs(#'(# D$D$Gs;(#(#P%I X(='SX* Z&;::|%G&a|; SOZG&" ";?'KFG& -0n1 !T:#s %6"o;LQ?%63-%6 J8 YS?/M>CPv 1S?>JJK,(o:+X'O?Wj6/|X8"g>6u,5):/|!p X#-@_3XXX:&Q&QTFO6 2U-W$5&N3J%75&^N*uQ.DT5& HQ6$WGs(X MV5sGs(#'(# D$D$Gs;(#(#(1!3T*11>Xf?P&{0> f2&{(#(#MI f&{'-(#(#eW $KX f2S5s $(#'(# I f $[(#(#[e!n$PA^$@75?C(#7B M4WE0A5 M;MM!BQ3"|LN7"|"|9+B(U1'C ? CF)4 9Q7$O8)Gq#W$1Q7' TX$ Y'N1'%) %lAa"Lp)j;"eVl % * ;!]O!] G)j;!]!]FK V61$RdC=%&(1GD,M1pRVF mD,U.D, KKVF4%#=-*71RBE*B8+RB'&' 71.dRB''B=/2Ua9)=22Yi=d=0}()! F(Q Q (I# L UQJ(e,@RRRU:Y&x6?5RWc.FF`"{dS'.6:G"{'"{U.FPH "\>AYTE9@>N5.#>8!hBD2;4DX;4JpO0;44 ]w$BgO"wU$*(!0(0/ UY"U0/1(o??4 GJ("h-?4%/R.UV+\%R=%Cn(9>;}T W'XW0i89S9@9+W#<|#9G'@'W&{3>&{(#=v(#M&{/3(#(#/3V*,L0PP> f !TP(#(#M%WD$I f;'-(#)(#heA\, A\2,,YXA:BDr;4FB'=rTP7'$I''CO CD@& mS@PY ?6Mx)\H)E6|,F30%Gv%%O7,%?@\? BFNN1I@\!dBbN?!dW-!dA7N/W7K;6&?% ;!o:BDr;4FB'=rTP7'A'J{LfOSfJ{%$qJ{<~=)J_8PF-)''=)''N>">">"Q?K5,'%CR#H=%YG B CG=e) m9 B*. B KK)4%#=-* 5Xg>6u4@"I"I4,8,8\8+@X++KMS$  >ESFN5SPV>"LK>">"Q??1)B<'B&P>>S0+I77-K|2}x;V;!;$/'3RF!RF7FRF!GlDMx '-eMGK%AO&6CGMF'_4%MAOSSK(m%AOG"M2'_:4KZEG.O4j4M[$VGE*AM)1/NPE:U?Mo( 5 )S75s.,Xk%~ N:=X3%K;UD89\,7A-7/MA#'N?O=8B9Y] VWY78G:(EG]%VWO:(5:(@@NVW4U@\BLL&"C">>"K>">"Q?>S?0'>;.;.I>;.;.:I(nAK7(,XO(nN5(n'-4LO2hLU9ECCP>=YFP?.F9p E+,x)xI$/OXR3 G8 iHQ0JiB GTFFpSG . !!0GPP@OLMSLo*{L-r.QH [-r/\/\%UV-r(#Y3%6"o;Q?%63%6l4U6,"z=$/*63=F0&"CURl&Z&d)3"[0+>)o'<M5A[%[,FM$ 4O& 45N'A 4@@57I 4b7<d9x L L@mFIiF2i->L->i->%4.N'-4!><5w"0aUX6?50a066}0aJd3333<*N $X f2S5s $(#'(# I f $(#(#e)AS>(2h qN2N?O=8B9Y] 2Y78G:(G]3>O:(5:(@@N>4U@\BLLL%S5A5xAAY-8,=L+6?% RTLA!(w`~"{~&MDO6:~"{F"{$M$}R wUO%FPF&%H*-(:N5 Ia36/S90S(Z6/ 6/(Z'yI@U''0=M!*?YJ? A?Y01RRM4$S2]A/V 28[3I (i8(8(i*E#JE** 0JK N'-'- A#-TELD%uLGN9-WLD_C~31D(D31?L>JXN f 2HT5s(#'(# 2D$I f;(#(#eN?O=8B9Y] HY78G:( G]GO:(5(o:(@@NG4U@\BLL7,%?b@\? 
BFN:1I@\!d yB, A|:#?#L!dW- !dA7:/W7-K;6X> E%I=i X(=F'+; EX%c[: 1^"!1^TT1^|<5DL;/I!4O >] F5U6 >]Dz>]$;Y3|H$.7mEC$F"I_F#`$$FF$K@3K@W)5<< U<X;5sU(#'(# U[(#(#['+e''g#&F,LBm9nFp%s"[YE3pN6L2WpO#-G57Tf/A''''==:*|LC9(u-G.^H[,PFy.HJH3131)&s(`)CCi);WG8AN77GO?9G$NNWG Z&;::|%G&a|; SOZG&" "?'NKIxG& -'0nK5,'[CR#HPQ=%&G B^ CBB2PQ> m>s B*. B KKPQ4%#T-*SO1C"|7"|"|9+B{;-{ uS{"#d8#!= FK& ;Y')DI%~D +H.E6|C, EE8; . By > *E*NAMYV HK2XMo%#TX2XT` 6 TW2W261d4><9W<LX!]<h'#5R*U2p,#5;#57<_ ,:k8M!RB==G$8HW,t._CI-:Lz%*+0o/Lz *%LzSO/08^pF x&DR $XS5s $(#(# $/3(#(#/36XV7S{DV7LcEV7yUM, Z&;::|%G&a|;-GSZG&" ";?'2KG& -'0nW5:6W5%W5>\VW WRY!7<"*" P6!3(/I?7;N1NK 7UCY]<7]HYVp7UGHJ,!C}SGC OHJRHJ@@NG4U@\BLL HLVN%(k )FA(k />)X#;tKJH H*W# UB!8,a)J)J08x 8X f2F_ Ae/2'GI f '-heM~TRj B)PUPH ?6)\H6|WV,%NGv%G7G$8/HW,t._9-I-:LzYP*+5G9-OVLz LzS9-8^p+6;"9  Q<=$&R5S:cMY!NWCt 8)N  \ 6 M 6CX /Fk8 <-/$  R1-/C N5-/'-'-(#4!O"Lp-/$ >?R1-/N5-/'-'-$1/W$JT} T}NT}1Q1@/*C<)C0 `JO M6P*CJO-%JOR1PS P O+8CC%B NBNB3B=h !33<%m:HmmTh"=63B9& F(H>f'-8KGK%:(GAGGMF*d4O%M:(5K(m%:(G:A93TN*d:4KZEG4j4M[:sQ:NO 5G IV, 5;RY 5 pU(|]n#WWaY/;%&;Y/%eQ&$9)`QOIO:l*QOOAyi>-!;YR1XrFT3KB1LJL9VXS >3WP3L U7L |3N7#}?Y/E/UE/E/E0j(,UU )4-(`)COC+)"(bm&.q.wl4a5%GE4'<"<lG$$QR $XLaS5s $(#(# <* $/3(#(#/3o <-/$ M3R1-/ N5-/'-'- G%'7,%?@\? BFNN1I@\!dKBbN?!dW-!dA7N/W7K;6Ml,r ++%g%g#Bg+RQ'K[D9J9RO'8"K[H.G4~O4H(7$aHl+O@] Qj%?>?@\IBt>'=]F EB '+; E[B DoTY SEe)eeC2=J6D./66J6H?-C,A @7UR8(o&D!4'0@l(oKQ_(oSGR8@lO(l  ggN7DUS7D`F-2d '= (o??4 GJ("h-<?4%K/R.UV+\%R=%Cn(9>;}@K8 ;VEQ)p$9*)`QOIO:lQ(O O(..AK+d%0.LN5.'-h :RA> f2XC(#=(#M%WD$I f'-(#(#3ze. 7 Z&;::|%a|;+hSA1 ?'K -0n2? 7UC9nM3Y f7UP:Ak!GW=4<GSP:N=dP:)=QK1HD (,!3 (4W68L(#L4W [LS?>K H(#?s! ,*'4|X+f: 20EuWC2|0 02|>X?<+Dr H:K()z!c#'-BMGK@=5*T;GF3NMFX4.OM=.OK(m3e=G"M2XO4KZEG.O4j4M[$V.Z..ZV\.Z@E#?@?M@@S?kNG'L=:@8QL6f==v<. 
"F<.(G<.YC2 -OV Y 0H;1< ]Dr;4FB'TQP7F3N.O''O8<@<<-97I6R$,[l96;l CN24SW C$] CR4SOS1TUO&&O ^TFO62U-W$NJ%^Q.DT HQ6$E*AMMoA^S!^:9A^77A^ Gz?tC67VCP$3CE7/dE7B K5,'[CR#HPQ=%YG B# C2PQ> m> B*.C B KKPQ4%#.=-*W$LW$/W$)F QwFF:3,P7~DP0O*6P4Y iY 4--5B#h@n6K1 Fl-/$ 0R1-/C N5-/'-'-45YYYY:#&2WF,LBm3 nFpVY%sK+ 3 6Q36QpN6L3pOHc#3 -LG5RQS)FG 3GV/&G:AX N'-z/* TBC;1BIV9gyN01NF8|Nw$c&?N.w(G(G!wW/+JWGnW4fR#g: 8V\:M:N  II&I6LSLoL:EP35 Ia 36/S90(Z6/6/(ZY86cWM=N(""$ O+.#1G1911JRJJE+SE&:&&C2V#MO4UV#V#:g Gs(X9`V5sGs(#'(# D$D$Gs(#(#4V%GNNW4/BgO"UH(!0(0/UY"8U0/0 0 :-/$ RR1-/>N5-/'- $X f2+S5s $(#(# .I f $'-(#(#eN01NFN* 7U:CJ)|6-Y 7UGK"KV!V! FN5"UIJ"-; 41y-['-}:%W(9_1bj@n <-/$ R1-/ N5-/'-'-,H7&t5M2l%J ,B e%8?%++.HLI;Q*HTBTOZ 1TBPTB1Ul ~NV4 4-zB' 4.422ZB 4N(yN?2=*`2=TUl 41: RWP*VV!?'*!-GARYRY)(`)CO)E+5+"@Gy;X,X18TFSF+1LJ kLVCKC#W kL UN7,L@H9S],X,N7 1X,CuF^0SYS1Y/Y/V+NUV@x-iG3L*QL8V*@O~K&ACXO~M!%M!XO~M!M!V_OrcV_*2,V_ZZ8/vZR3%y% iHQJiY FFpSG . !!0GPP%i/1K~FK~K~#"4,m+Uo4#4 yUKUUV#4UV#V#:g $90&{0> f2&&{(#E3(#MI f&{'-(#(#e"(bm&q.wlB_4a5%GE4'<"<lG$$QZ0;4;NK 7UCY]<7]HYVp 7UGHJ !C}SGONHJHJ@@NG4U@\BLL1<@)&.+T0 ;nU, ; %MZ %=; ; %&TTT _ _ _:gOEL[1]OE7"7%UOE77Y!NWN L 6 6\U:Y6?5RWc}FF`"{ +$6:&x"{'7"{ ~4hU$FPH "\'C C"p2K%M%MK*K*LQ0/8 PHK6!*6:N75 Ia3.6/"S90S(Z6/ 6/'C.(ZH?P25n?6Mx)\e6HWV,0%.OGv%?c `!6%O.OtE/,:` J +:`j:`;<V0u(rs0 1Y=I%~D +R{3.EXeC, EE8; . ByP9D=?6)\DWV,%Gv% (-((X Q r2i5IJ.uIJP2IJIJPH HLVN(k )P#FA(kV>)X#S %KJH H *W#)q'U?F q- qY`( qI 1!R$,[K~l96 l CY=NUA (W (8 C$];.? C!OTR K95[;.#|I*%#3+%6N*Q27>3 P$;;Q2Q LQ & * *Q2N@I <G 3GV/&GF%1RW2PU)3MI&Z@YV08I5A xDv%,1jY9FI$ 4OJGwGwp%"={MKWX "D232OKM& JV%?7""T~D4 $mGStPT2F DB;QHF4BS Mc&K*4V4K*:%?@\?@\BTG?!e!dG>ANgF,9@>NJ*5F2>48!h;X,X12:V<R?'O(#SCW9#9*F\)z'-GE[ BI C/I)Q -4 m@6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K-:(#4 &.O4jQHH 4S<>NHv6 Q'UHxL HxFHx A\<#pU^B Mp)WW1q1V1ILKN/!QR QC=%F1/!GWC-KVF m>.4R KKVF4%#=-*TA#A"5e22VN9nR{$)( <JHSJ PSJ!J2P(V85=U27SgR6N)1$:)@34 ::>$$ YG$OHm 3GV/&G'-\@e\ (#\;:@VKz1?5+jY+jG+j"NJBQR34 iHQ0JiTY &4 FFpG . 
GGs(XV5sGs(#'(# D$D$Gs(#(#KN3'-A^!^:9A^7A^+:9x;G L<0p1 ^ %QE::0pS#<0pF@LA KN'-%BA P e%+Dw%9E}N7;6RVN>?,GAO4HS@vQ%AOSSIJAO-;41y-[}@I.(@IOKOO@IO0LMLL7,%?b@\? BFN:1@\!dKB fA|:#?#$!dW-!dA7:/W7PL;6-Od'13' ,O7A2$/*#&3-E1%/N=J GwJGw O6WOVN(kK(k>)'s44'U3393S W30PL>%R{3GWWe+W(mIJW-;41y-[}*"6ee"S J8 YCI/<>JK>AYTE9@>NJ*5.#>8!h1Jf@@|,k2%@/:%&/V/ (#I2l>H>/l>T~D4 $m;YUYUYU"u4[P4['4[4RWPVKjKj'8+O%'+'F?4Uo4#4 A=] HLVN(#b )B qP#&|(k2+>)K1HD (3 (4W68(#24W [LS?>K H(#(#?s! N*'4| \ 4KFS H 4N7IvN7*] 4=N7?YN7=;&4{YP;&V{{;&U Y*I%~D +2.EK<+C, E >E8; . By2EO2--u2--;1XrFT KB1LJLVXSDWYRL U7L |?TN7#}?Y@+S@+@+FA.q #.q?+- L6m*> H]H>!56 56!M!56!I R3%y iHQ0JiTY C) FFpSG . G?B8RCS7=3LG4Y<)W KW_!*G4(G(GW*h"qW?e?G4&U"?( A/7oW/X/b/RUkM+lOBOH?-C,A @7U(o!4'0@l(oK(o@lO'OVC & I'-MGK@:( CJUEGF3MF )4 mM55O.OKO(m%:(G"M2N :4KZEG.O#4jH;4M[$VX@7?@YtB=/P:=d *Yi:V68$O?Wj1M1g>6u +R %OR8v'V+\ '3hX>%I X(=CX- 7"XQ@22*%,'59nR{$)- )%,J37J2-SJ!=dJX@CC$EXF39G@4 BR VR?'2v(#SCW9#N|M)z Q'- GE[ BI2v C,uI)F3 GN4 m l@6<RR B*!2v(#.OK2v7<<3e B/LSNa6.5&KG:(#4 &.O4j;4S<XD9/Y=HM(&^Mx '-DMGK%AO&IrGF3MF14%MN-SSK(m%AOG"M21:4KZEG.O4j4M[$V Z&;::|%G&a|; SOZG&" "?'KIxG& -0nRA)R7N7Y\R7WDCQIRQFQQA;HK82<0T90W2Q';.KF 3Y8QJ?7Q5A 35 CP9XN/)/)IM;3c$8C=NCL%:tJp4.RVN=?,GAO%HS@v =AOSSIJAO-;=41y-[}%BA  e%8$%)vM A7,%?@\? BFNV1I@\!d@SB7?b!dW-6!dA#v977/W7K;6CT.C:C(OO+URV%NWY?,GAO$HS$g*\WYFFAOSSIJAO-;bWY41y-[}> E  17kXJY1G;*pYdKP9&SM[2l=8 <B9-/$ 8 GR1-/N5-/'-'-VT}A" T}N4T}*RLYVK2X:rL*%#TX2XTJJ ])$vJIJ=31AA-:+&&/.Y9J:5 O'81.8>eA0 O418Y78lG+O@] Qj   Y8 Y/TELD%uLGN9LD_C~31D(D31WJwN.w(G(GwDy,9B".!G".W,1WOV}<4,-MON?O=%8B9Y] Y78G:(HG]YO?':(5:(@@N4U@\BLLPX dRN.RR:7*nS8Y*I%~D +2.E+C, EE8; . 
By4!d3 :YdI3 3 :\;K1rU1A/T) 1 KN5N5 (+##p?XD{-%>4$\Y;3?T }%JEyAAE - -)kBi-N$N$3c$1.#y0U#yY#yI8CCL%E Q)@@] ;1XrFT KB1LJLVXSDWYRL U7L |N7#}?YL OL OL (G+R J<5-RO8vJ$5V+\V+C;@=PCn(59=>;}:K,N'-P5A@P  J`-4PK (U=,&<.==MM5=2ltw+Gttt<%d'Qd3GdN7# EpGx= sUC-H11/KxRF@BM66C;WCPC8GPO&pV[+P /Y=?6Mx)\DWV,0%Gv%%O-/$ R1-/N5-/'-;}QG7QFN]FQF$ ~@N5+GM#+*`G*`0+*`*`H 7UC,A @7U(oS}!4'@l(oK?(o@lT6.5&9fHNb!W|+>IeDGW|Y2W|;mJv&cM[+#1&V HQ >AQ]9@>N5>4 L2Wa 'P 'V+N'-"N/nX@LTKkLL.('FHlNV4-4lZN 4'<"<l $Qq?Oq9V!*P/+50^=_<*0^3=K$IO=K.a=K |FX f2/Z5s|(#>{'(# I f|U[(#(#[e#7V`NuY=&v:56g#K[Al8lG>7$4l8Y8<Z+A+$8# k8!XIH+H %HVRWVVV0@'8'Q)E -#\%QF>7!.Bc=?F>]F>D80BcSY6 )%PUP ?6Mx)\H6|WV,%Gv%%PY ?6)\H6|,F30%Gv%OR?'O(#SCW9#C{M)z Q'- GE[ BI CI)F3 G84 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K6:(#4 &.O4j4S<{L6>{EN2E>#F#YF#FHn9Hn0{Hn1+ =8B98VG5ONVs,0N@k$NNV$-4 Z 7 O7 %Y, % %7 %Bh- ER '+; E[R ODu)Dur<QQMRoTV $XS5s $(#'(# $[(#(#[U%"8T2F =*,F4c ?W?Jp/:R\V"W?[6n"3EL)*)+g)#}) $X f2"MS5s $(#>(# SI f $'-(#(#e)7857?@)5V?\S@N5CO$1kGqN5> 7g>Y>P"tN1%)%lAa"LpP/b/, X"(i8(i W>NBB)$ j5=U27SgR)1$:34 ::>$$ Y!#b) ?S!];JH;GG;2952BNK 7U:CY]<7]HYVp7UGHJ!C}SGC ONHJHJ@@NG4U@\'-BLLLA 9i6 ]LAf$6DLA#04V0G?-oKOC$7@@KD2KD2BD221 J8BI JKG JBR$,[K~l964ul CGNo4uW8 C$];. 
C!OR4uK95[#|IG$O 3GV/&G'-MUdDMJ+NJ+MJ+q9Oq9$'!<TC4vTCAD0 +V)$-/'R1-/NR-/)4-(`)COC s(LU)'* VHQqG,'EGEA7E7!\TIO/ *rIV1DO%BTG@K7?%B -%BA77/W7K;6W $XS5s $(#(# $/3(#(#/3:-/$ KR1-/C N5-/'-@Jj f2;%1(#(#M%WD$I f'-(#(#he((MmOKZEGj8j%$X  f e%AeI f%'-Xke(G+= J<H-*RFW'8v08A K\@N CFWFW( K==!%%NN +3L3LYF2(g2+8gg4R}6P:>D=Rk-ncAAXqT3@OUV@xGO"( SVoS$8 PHC,t?$3_I-)G\2Lz&&@>22LzNLzDABG\>2 P P S8^pT RI.UOYTR y]8 y?% y(#1 L 6kI&C' 5GV, 5; 5!0+%g?%g O3vJ+.cR3%y iHQJiY SB FFpSG .M G K <X!5s (#B'(# P[(#(#[ c%2 4 @4422*%,'59nR{$) c )%,J37J g cSJ!=dJJX f 2HT5s(#'(# 2D$I f;(#(#eBEBEM!%M!#M!BEM!5@,+PD}D}I5'A rINI ??)F L@EEY&3 KAQJ@KKA5KA?5gF#NNU?7979*RZ/DLQ, >VL;C2 L= /  GwJGw5.u T&.G1@YnOO7A%OOF%1RW2PU)3MI&Z@(V08I5A Dv%4#,8FI$ %4O"v' <C $  CS"~> mN5S"~V'-'-5JR#"I: oP?S4A7XV o-i F"**C%g 8KeQ #C?^.$= // X&2K-"^PIPGPYiW3K@W)WXP XP&XP :& gC5wAu&Z-"*C>X@CC?'*--G)&{0> f&{(#(#MI f&{'-(#-*(#e/y:B:Dr:H6-=%K#BG=YTRE:FN5=IJ=-;:41y-['-}'K[D'0Y'8XK[0mDG58F0mB0mFC 0 C C JW7Ps>5b7(#=v(#M7 /3(#(#/365K# &X,# O# M(#!RY3(#2l!K0FP!!0JY%6? G)M\+(*?L++%O+N1NFNM&rQ-:]6Z1-?*? P-??D-/$ R1-/=N5-/JLHRU/TA64g(x;/TKo18))/O(Wo=1Q1CLH(O ] H Z;:;MS&h@<M@PX('4T8k1l'(%(.8'(I[EUIA WI[)I[W2 6I9nH{ <9D3+G5-GS9D*a=d9D Q&!),sCXCX+M/;" 8:):)#Bq7m@BqFF_F#` ` `BqFFs--WIj((DNG:sW $KX f2KS5s $(#'(# I f $[(#(#[e/Ko41XWo!.O"CXI$OX!L OXOXX-%i0#2#X-X-9O+B<_TIO/OB9?//2]N!GI$/K5,'[CR#HPQ=%YG B C2PQ> m>? B*.% B KTKPQ4%#.=-*HLIH11RO< (# ]DrW9F6U:)z!H'-BGE[?I@TSPYI);N ,4.O m @6<RR?69++!+ (#+ .O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<0 $X f2S5s $(#(# I f $'-(#(#eF:;="P?.F"Y$?2k<?;?[U5G+;q"H22h*W; HLVN(k )P#FA(kV>)X#S %KJH," H *W#!< oJ;S!"("/.!"" >V5X,3c HEP35 ) RP#2b3Lr S9W 'LrKJLrH H *W# %,%,e37G"Ej8YG"}G"W8RO<L(# ]DrW9F% :)zY#'-BGE[?IITUPYI)=N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K OO(#4 &.O4jf4S<XV_+XV_**V_AQ*%#3+%6N*Q27>3 Q;;Q2Q LQ E * =.*Q2N@INU Y97?!*5Kff7?BJB7?, 5K"`*.h UVUU1-l:WKkI:CvCvSP?Cv=!9=!22=!6.G\)DABG\ P4;6D6O}P6I:K ;NC3'-4fA5272  <+Y!]O!] 
%!]:K N(n'-LRO<S(# ]DrW9F;:)zYg'- BGE[?IFTPYI)(N %4.O m @6<RR?69++!+S(#.OKS7<<3e+?/LSNa6.5&K OO(#4 &.O4j 4S<$$( E"//5h ]w$w*T0/0/0//$Uk; R(F (F(FM=?&p+<K*4O5@,'>< ;NKI,8 ; %MZ %=; ; % %&.w.w,4a>WA2N5Yi'-;}h"Q7{N;("O8" }%R:BDr*k;=%WB1T/Q= (W1DH;.1!ORQ=K95[#|IWV}&/8Y.IAT8<NO\NO 3NO3)q>NHv6 'Q'UHxL HxFHx 3|=E4=Ya3|M+4 3|!BQBQYARYR%~D +YEC, EE C-"Z$T8*i$;$1H^b=\:nH^>{>{ Y O3ONN=&5ZG=F?1h(=E^E^-jAYnOOm7A33"%("O"#i,S+ #i#iq8TUYx3aFKYx 1Y,7 PF&-/$ !R1-/N5-/JC7@ $8HW,t._CI-:Lz*+0oLz LzS8^p:BDrBJTFcFFGZ%s6Q 4'A 4@ 4OA V4 6K5,'[CR#HPQ=%YG B C2PQ> m> n B*. B KKPQ4%#=-*.0..8@LK5>>'0YX>0mVI.58F0mB0mF)&{0> f2V&{(#(#MI f&{'-(#(#ePY ?6Mx)\H6|WV,0%Gv%OQ$7<XG5s7(#'(# 7C2[(#(#[&S*RLYVK2XL*%#T=X2XTW2. >'C.H?t?(G+R J<5-R&@8vMJ$5V+\V+C;@=Cn(59Y;}AO.o-r.QHQtS/\/\SJ}JJUCNN4?K+ NAF Z;:?;SN" N7IB$`:KB?LXF1F':K@4uEYXFORX_@z8@F7IXF"cIS>Y +H^ v5X+ 2U -4+ 0 $ XS5s $(#(# $'-(#(#KR: $X fS5s $(#=(# I f $'-(#(#eUA%$GV NeV V H 1\ `!63393 VVoS&IG H VT}/[ T}NNT}O"(NR?'O(#SCW9#1M)z'-GE[ BI C*DI)F3 G4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&KG:(#4 &.O4jH 4S</yeen:H>,eG=@RE@@F=IJ=-;@41y-[}= 7*!!Wh!! jTk?L+U M@ !(w;Qu&M1>aY/;%&;$M$}R wY/%%(O?6Qs:Y:Yp"={MKW/X "6W2XM%KM& JM?7""O9&MF398OO9WQ+U098*$0@j*A=*RGd!n4$PK?Y #L$/:\P6 757%:\N2DoW:\*6PR M! nlDo N *X 7n9*9 L 14*Z74#{@@ Z&;::|%G&a|; SOZG&" ";?'mK4FG& -'0nHR:HH<5I O0EI FE9CH9<_S+XPTO"N XP&:XP[M 9@$Ct 8Gq)Q7! "jT\ 6 M Q&! 6CXCX/;p"={MKWWX "R2CWKM& JW?7""9.a=K>3O_;O_kO_<-,V%BBUAGPBN5Yi'-hK>V6h;'jW;%KY c @;DhKP 4,H nKPTHWH9S],X,F^ HEP35 ) RP#2b3Lr1S9W &LrKJLrH H *W#/G:0.!w/G1$I> QRYI>I>;>>X2`UPY H4(Mx-'-HMGK@P6|GMFBH4.OMP}.OK(m%PG"M2BH:4KZEG.O4j4M[$V* 0V? 1AQ<=?p&],H- {2P$&0 nARSj#@ nA\Au(rs0 1 >>|?u|?22*8%,'59nR{$)O( 3%,J"37F2 O(GSG1J!=dJyy#6LW6,+#,8~*^4/BgOx"4UC(!0(0/UY"U0/UV@x=G.7A#.43W.-:A6.i)u4/;BgOU"/GU.[(!0(0/ UY"U0.!w/G0/11$4. &-3^4N+Nn4NN ENKLR&AGL &;YR1XrFT,KB1LJLXV2g.,B}WB}P3L U7L |,N7#}?Y$`B:K;u$(6QW*Ps>2Ux-aA '0&{Y>&{(#(#M&{'-(#(# K-:$P=+K+=:++5IJYRYRYR?\N5CN5; 5FGF 3GV/&GRGd!n4$PK?Y #p$/:\'K 757%:\N2Do:\R M! 
nl7 O8}{8}rCMG-YK&M[2lN?O=88B9Y] 2Y78G:(?G]3>O6:(5:(@@N>4U@\BLL J-5'  E5TXOI(%R#DQCT.C.FC=8 <B9-/$ 8GR1-/FN5-/'-'-A;HK8(#<0T q90,2J23BQ;UpHD (BY3B(4W68QJ(#4W Q[LSJAB(#?s! PPN*' n:BDr;4FB'=rTP7''Jl@)M8 PH<}G\2A&OLzDABG\& P P5S -UL~L~ B>Ba@<5?<VN(k(k!#>)2?N?O=8B9Y] 6Y78G:(G,Lq6C OC >H:(5:(@@N64U@\BLL$|PJ$K|%Xh>$ /9@>N5>A;HK8PP<0T+9 q90,23BQ3B; (=Y3BQJ(# Q[!!0A=%o(9PPJ*'K:BDrRFC=%ZBG?;"TY*VF> m?6.? KKVF4%#=-*j M"Z%EV7. B*` POG JG G @Q3W8Y8@Q : &j) W} ELK-mD]&jAAL#!]L 3W&j %Y-30&{Y>&{(#(#M&{'-(#(#(p dRN.RRX?< ]Dr H:K(&^Mx!c'-BMGK@=nTv>GNMFMj4.OM=.OK(m%2=G"M2MjO4KZEG.O4j4M[$V9[0LPEP353S9SLr[2=E=E zG <FN'-'-Q$7<XG5s7(#'(# 7[(#(#[s4OY4OF5#HRV%NWY?,GAO$HS$g*\WYFFAOSSIJAO-;bWY41y-[}<S 1R~EIX?< ]Dr H:K()z!c'-BBGK@=T;GF3NMFX4.OM=.OK(m%=G & CXO4KZEG>4j4M[4wVV"$M$}R w%(H LC,A @7UR8(o!4'@l(oK(oSGR8@l(l H <(# ]Dr ).oJ qP#)zB2+sTT0OHDN (..O3 (4W68s (#24W3es[LS?>K H.O(#(#?s! N*'4|%U AK S N5 Yi'-IJEhV_OrcV_**!kMEV_ABG\&?22 %2Ao^2l8E Z,-Yu% Z7:NL0. ZMM5=2lX>%I X(=FXB$ O>@Sj4/;BgOU"4/GU(!0(0/UY"U0.!w/G0/11$V)TSKT("2TS'',DTS''$.7mEC$F_F#`$$F"\F$WMA`: Dr;4FB'TP7''D6?c `!6.Ot)4 9Q7$O8)Gq#WQ7' TX$ Y'N1'%) %lAa"Lp/)60"/P/UCNQNQC;WCP<CVV:V?? V4??4&u'&uYh&u@/GVjN MII-;,tSxG\z+A05o NzKezDABG\  P P ;Jp(pH>="2H>, -IX5OV!$ QTV!N5V!Yi-p'-'-6'4["^#]2R:BDr*k;=%WB1T/Q=W1DH;.1!ORQ=K95[#|I" J@!"T++":g 1*" 4F%1RWPU)3Ml&Z@&V . M5A2Dv%,FM$ 4O, 79#aP"9S*9PCX0P #"I: oP?S4A7XV o-i 8F"**C 8Ke #C?^.$(G+R J<5-RO8vJ$5V+\V+C;@=Cn(59>;}/SP?"/?*?Vy/??R $KX f2S5s $(#>(# I f $/3(#(#/3e/*+@0//p & K/ /K6S!RXL4444YW=0%~D +3ESaeC, EE A/Hmm&"C"K>G?IRGd!n4$PK?Y #L$/:\P6 757%:\N2DoE:\R M! nlDo Z;:;7S>8 IY8 T8/$."&E /$+2?2VmNWu$B$B b6s+:L79x L L.*(Q B*(IJPIJ*4*(IJJ+E-J"p"pEAJj1wEi[.nE7%*o@FX> E%IU X(=F'+; EX%c[ pH>=2"2.H>5XEtXXG1V2 ZP: xO$b>8;M$T$TSS 5U $T?IY<MUP.PH5dX.KV4vAG4vWP0M>5>0h90M(#(#M7D$0M'-(#(#h<<"<E+8$L+!U1AN4UNr/T/GKoP1)D4kGWoP1O1,g0.!w/GLHG1ODl 1; $Y Z&;::|%G&a|; SOZG&" "?'KLKIxG& -0nRL3/L*TIC5@ PP%)Uuu'!i'!u'DtWY#> S9%U)yA<IJKPHo)yN5)yYi'-h GYJ !P! 
!Si>BW98dO)/"_:3m*MFNF5'-D|%BA  e%+%/\RV%N4?,GAO2HS@vD4AOSSIJAO-;441y-[}6 .3/S?"/?*?Vy/??NYJ'I rINI ?" M40AE=*HRl0A1W&! J1DW?/C 1W6R\41W"W?[6n"3EL{B"P/q BB4ATB(1;&<,^:RO< (# ]DrW9F6U:)z!H'-BGE[?I@TSPYI);N ,4.O m @6<RR?69++!+ (#.O 7<<3e?/LSNa6.5&K O(#O(#4 &.O4jR4S<dQ$d77@cd77Uk@FPP.AVVQR Q<Q! PA^S!^:9A^77I27RA^3E$`:KB 4:KuA G@ORG*/+50^=_<*0^,D# O>aFK VVoS  V3)%Z#v9)#T Q )>QQ116h=NLK *6;NB*S)*NB0JDXP [$8 PHC,t?$3_I-)G\R4Lz&@>2R4Lz-NLzDABG\>2 P P d-3Y4"*-Q SVoSN6XTIG/ 2O60:AGWC2|0 U0P:A62|>>4I1/)2/);>M;4$P/bA/V /8 XI (i88(iPUb6&PNYJE%(WQ)0K@3E!L=#\@wK@2xW)95$Bc?J~2'2D8tX0BcSY6 )% *-N] L:X:JX f 2& 2 5s:(#Q '(# 2D$I f:C2;(#(#e>>'0YX>0m.58FT0mB0mF8HYR $KX f2S5s $(#(# I f $/3(#-*(#/3eK5,'%CR#H=%YG B CK>~& m&9 B*. B K38pK4%#=-*@sN(LU1"8@@cLH/6/q8Cx |KoO">^M):LWoO"/QO"CLH:LO ] QJJRWPVKj5A&Kj!@6666L%5 Q55CIR?'O(#SCW9#9*F\)z'-GE[ BI l CHq/I)9 -4 m @6<RR B*!O(#.OKO7<<3e B/LSNa6.5&K-:(#4 &.O4jQHH 4S<F|0p< ^ %E::0pS#0p ,G'L=L6f=={L6>{ENF E>8N}B WXeA0q W W8k'O%''A1&VO&&;DP`Gs(X -OV5sGs(#'(# D$D$GsC2;(#(#( *z  ?B8CS7=3L+Y<)WKU&RW*h"W?U&&U?Y4$S2]A/Va 283I (i88(i,=,,HLIH110QO!O!B 6uK>V6h;'jW;%KY c ;DhKP 4,H nKPTHE9 Y8 #7' |:RY'/2&V|C'd1' E|'9;-:KC N'--@MD.QAU'82'$O&pE6E7A" &$/E66K] EO&p-& NY;JH;GG52;:YAKP9 YFNF5Y'-h1^"!1^T1^*nS8!U  !UQF'!U3a HLVNL(#b )55 qP#V(k2+'>)=QK1HDL (,!3 (4W68L(#L4W [LS?>K H(#?s! ,*'4|VV @E!!(!(OL&M:J JY""=63B9& F(H>f'-8BGK%:(GAMGMFV4O%M:(5K(m%:(G89NV:4KZEG4j4M[?W2/dE7B=W2O8}{;4F8}'rQP7F3''M DFG$9FOOIO:lGIGIFOORV%NWY.GAOHS*\WYFF?'AOSSIJAO-;WY41y-[FO8NV!'-'-K5,'%CR#HO=%#G B O CH`'F mF" B*. B KK'4%#-*8 =0!=&4 ?0!0!C2S200=1Ow-;<"<<O*7>$ 9@>N5>4l7H?-C,A @ 7UR8(o1!4'0@l(oK%(oSGR8@lO(l!lW !l!lKSS?'O8(N(N@(N;_ssZs4?\CXCN5 LHRU/TA64gJx;/TKo1;)U<,JCOWoCE1Q1CLHJO Q] BCD#: K5,'[CR#HPQ=%YG B C2PQ> m>U B*. B K-KPQ4%#.=-*;*#JnO7m>JnF"IF_F#` ` `JnFF-gJ!WONUK $X3S5s $(#/\'(# $[(#(#[:):)O36wJM"RJJE++G+'8"/oCW?PN!OCO:YR./\STk?L+U M@ !(w;&M1>aY/;%&;$M$}R wY/%%(D<K5,'K~CR#H>=%YG B!z C: ; ( m (R B*. 
B KK ;4%#=-*22V+bN9nR{$)( <JHSJ P8FS!J2P*GLc# B?<???D2CiNKD2BBJD2R?'(#I7C9#$$JR3|?uHQJi| ?FFpG . G"H6""&0R@ f2@N-@??? %=I f@??eP ?6Mx)\P'9.WV,0%Gv%I+'%OTv?OMs-((nX#"I: oP?S4A7XV o-i F"**C%g 8Ke #C?^.$"iFL2dFDQIRQFQQV 4&--V V 9=B(CFFFF@BS0Ru1X6E+[BGE6E&3KA=9.@KKA5KA5;\ 55= ;&YP;&v;&/!Q/!C >>X( ////$(9DF5$ J{$pppK!1o2x NNnUl3W8Y8@Q : E() " ELK1cE( %AAL#!]L 3WE( %Y->LHHeQ&$9))`QOCIO:lJQOOU:Y&x6?5RWcFF`"{KH6:_FG"{'"{UFPH "\0@'8'Q)E -#\%!QF>7!.Bc=?F>]F>D80BcSY6 )%'O%''AYt(YYY12xQ2xJ+1oJ+2xJ+E$R$2J|):gFV?FHbF [5=U)127SgR)1:34 ::M mvM QOM On % %C9 % % ;nIWB,8 ; %MZ %=; ; % %(K4F(Q Q *(Ul ~NV4-zB'E.F 422Z;]<E(y2=*`2=TUlE1:@A 0!}dd7d ?t4 AXDVOCNHM(Mx '-MGK%AOHSQx'GF3MF#m84%MAOSSOKO(m%AOG"M2#m:4KZEG.O*4jG;4M[$V7IB$`:KB?L 1F':K@0DuP- @OR@T@z8@FK1r7I "cIS8>Y *%#3+%6N*Q2<3 Y  J;;Q2Q LQ E * *Q2N@IVrS7RPePe NO#NU=GG QJ&5" M40AE=*HRl0A1W&! J1DW?/C 1W6R\1WP"W?[6n"3EL.84T..$mMPUPH ?6)\H 6|WV,%Gv+%CSKB*'e D*G$9L*OOIO:lGIGI*OO:^SC Ny P7!>y yD.CN!YK5,'%CR#H=a=%YG BK CG$ m'(#9 B*. B KK$4%#=-*: $X fC>S5s $(#(# I f $'-(#(#eRO< (#+DrW9F6U:)z!H'-BGE[?ITSPYI) dN ,4.O m @6<RR?69!+ (#.O 7<<3e?/LSNa6.5&K O(#(#4 &.O4jR4S<$ HQ><+>R$,[K~l96 l CTNUA (W (MJ C$];. C!OR K95[#|I2fJd E33F'+; E335 $X f2E@S5s $(#(# I f $(#(#e +$I:P;Q* o TBF"OZ 1TBPTB4m89p 15qJO!QXS)5V?\N5CO$1kGqN5  2 Y"tN1S@%) %lAa"LpTFO6 2U-W$5&NSJ%75&^N {Q.DT5& HQ6$=KIO=K.aII=K???BR?'2-(#SC(9#/M)z'--GE[ BI CV>I)F3 9v?N4 m@6<RR B*!2-(#.OK2-7<<3e B/LSU=8 uSK9v:(#4 &>4j4/AD<B?LBSqW[) 8 A'\ALVL ALI<MV <N2HSNW'-'-XVX+5XO hVX%1%#cXXVX%%8HWi:&COLzSC)< ;NKI,8 ; %MZ %=; ; % %W&{.@> f2~&{(#(#MI f&{/3(#)(#/3e( A/7oW/X/b/UkM+lOBOTFO62U-W$/N(J%qD^^ Q.D7TD^ HQ6$(WC R 4Bb6.(>Q6(>>,'B&P>>%&;3OA.q #S.q>+QY >#2V> UK?!(U5R1}F/!(`%$&M:$"I6:%$ '%$U$FPH "\O4DT$c:BDr;4FB'=rTP7'?:'T SEWn*iK5,'[CR#H?=%YG Bx CG8? m B*. B KK?4%#=-*??4 G?4/,\ ,K5,'CR#HJ=%YG B CG6A m B*. B KK6A4%#=-* ARNU6UL OL\TUO&&R HO1u+w'I+'Tv|)Q)&s(`)CCA6LH)LH/6/q8Cx |KoO">^M):LWoIO"/QO"CLH:LO ] :N!B'-<GB& XR34Y iHQ0JiTY C FFpG . 
G 4S H 4N7JN7*] 4N7N7"ZT ;6<RR<(# &jH=.HUPH H&6|hLYQ:N-/ P52G5N52$8 PHC,t?$SI-Y nG\2LzEJy42LzLzDABG\4 P PVV :& gC5wAu&Z-"*C>>X@CC?'*--G;V;!;'{L6>{? ENE>11HW 0>098B2*$+aj*A=*oB/SFLFLUM RWPVKj5A&,KjRO/0j0j HLVN(k )FA(k>)X#;tKJH H*W#VO'>47A$2V'>/W>M>P4HQ)>/lf>M9g o+/,)5V?\CXC O$1WOGq=N5! 1 H',sY 1"t Q&!),sCXCX/;)3B22X?e5n H:K(Mx!c'-eMGK@=;GF3MFX4.OM=.OK(m%=G"M2X:4KZEG.O4j4M[$V 5G IV, 5; 5/U<N/RZ/9P9SG9N/kND/;/Xs4OY1s-K^4OSF5 6S$SXU2DX & CXDxU@"F;&P;&;&Q5 3 IEUP/(#(#M7D$.OP/4;'-(#(#(hA?Z+A8IB=/J=d *YiM4E AP):BDr;4=%FFB'CTP7''$QQ Q<Q:-/$ R1-/>N5-/'-R<)9'=W/9==Y09>:0CA;H'(#B+C0TQ90,2EQ C;HD (IYE(4W68QJ(#4W Q[LS?>KAI(#?s! N*'4|((MNmOKZEGjMN VT>21S.We6Q1STB1S,Q"`*.h UV@xG=%g=6q22*%,'59nR{$) c )%,JN37J g cSJ!=dJ5> #_90M(#(#M7D$0M;'-(#(#(hD$Gs(XD$V5sGs(#'(# D$D$Gs(#(# Z;:?;SN" N?$1cA,k2%A/((Xq&/VY/VWVV0SG Qv?E2PHdGQvAYA sU  8H8868?#XHH##BAAXq&u5X P'&uYhy\&u@@;4+DX;4JpJp;4: )+W))LG5J, $KX f2oS5s $(#'(# I f $[(#(#[e&oI 'K'&oB=B&oBB0&{Y>&{(#(#M&{'-(#(#X$33X?X$ ( (",X$ $ Q;*'V=7?~B}B}?~H LC,A @ 7U(oB7!4'@l(oK%(o@lT2F F 3W  @QLKi) 9 ES ;%AS 4!]S  3W% %Y-3p%"={MKWX "7232AKM& J|?7"",>'>R=>JLKYN*S; c.%K *3/{DP#AH*nKPTH H?3HM!RM!M!HM!R#@cA,/9 A@Xq R6Q3@5L@O#-G57J.G"H2NV$-4Z$KN'-;}:EY IWR.oQ$2$2  GU AeC@ IR(K4F(Q Q ( $X fS5s $(#@(# I f $'-(#(#e&& )q':+X'O?Wj/|X8"g>6u,5):/|!p ([?&>3?&?? ?&4??4I$=715Mp"={MKWWX "2CWKM& JW?7""*%,'5%,F37S5SkJ5:N'-3W8Y@Q : Ki) W} ELGK-mVS%A8L#!]L 3W% %Y-3W $KX f2S5s $(#(# I f $/3(#%(#/3eF*9D:=J)|6-DG">OV! FN5"UIJ"-; 41y-['-}1l'(%'(U)(QNK 7UCY]<7]HYVp 7UGHJ,!C}SGC OHJHJ@@NG4U@\BLL5L\EQ)p$9*)`QOIO:lLQ(O O(A^S!^:9A^77LA^&/.Y.SA#/4A8lTa:mX?< ]Dr H:K(Mx!c'-BMGK@=X5T;G8SNMFX4.OM=.OK(m%<=G"M2XOO4KZEG.O4j4M[$V%  e%8% $X f2S5s $(#&'(# I f $(#(#e)FM-&N@EEY&3*4KA.@K KA5KA8*4??Or8S P;#0,33eI>YI>I>;>4K=F#(5#0#V/(#MF?) 
Z K4,{ 5Z 5= Z PUP ?6)\HX*6|WV,%Gv%:SY&OSY<SY!3Y-@;@A #@Iy+QUl?L> <JX f2$J5s <(#'(# 2D$I f <(#>(#e=*!R+'/E;+N8"->C (WNX;.N!ORCK95[#|IEP353S9S NFuI1/)2Y]+rHYOH/)G'M;-9C O'K'@@N4U@\BLLN 9Q7$Q7TR(MggR8 RR8v;44U:Y&x6?5RWcFF`"{43S"IF6:"I~"{'?f"{<UFPH' "\99'NuD'0YXK[A0mG58FT0mB0mZ+AF88IE QfKSS5YF2(g2+&{0> f&{(#(#MI f&{'-(#(#eXPK"N XP( &( :XP( ( Hg R?:(# ]Dr9#R-U)zT'-BG3B BITYw%+I)F3 ,4 mPA@6<RR B*!(#7<<3e B/LSNa6.5&KPk:(#(#4 &.O>:4j0CP4S<U;LHRU/TA64g(x;/TKo1)/O(Wo1Q1CLH(O ] KYN#*S; c.`K<*k305`@@8zAHnK`T! ZP:E;MS'R# <M#PPXPT2F DB;QF4 Mc&K*%4V4T;K*WW;3W8Y8@Q : &j) W} EL'pK-mD]&jA L#!]L 3W&j %Y-3=8 <B98GFN'-'-UB-YK:EP35 Ia36/S90(Z6/6/(ZF"/y:BDr:H>,BG=TRE@@N=IJ=-;@41y-[}U0UUV<V0{V Cj&6CC%A;HK82<0T0W2Q;.K)YQJ?QA)5 CP9 e eN%WJSH9 '5sSH(#M(#SH(#(#:A wT@:NB:5(:4As(#h*#?'*-GRVN 6?,GAOHS@vG$9AOSSIJAO-;941y-[} HLVNS(#b )! qP#% (k2+>) aK1HD (%3 (4W68S(#S4W [LS?>K H(#?s! QM*'4| D;? ;o D?S?Oy D?EP353.S9?L> <JX f 2R$J5s <(#'(# 2D$I f <;(#N>HA(#eUU8 G MM88 8$<888XD9j=HM(>Mx '-DMGK%AO&4MGMF'O4%MAOSSK(m%AOG"M2'O:4KZEG.O4j4M[$VFli$ 8iN5i-p'-'-??4 G?4/UK?!(U5R1}F/!(`%$E&M:$6:%$ ' %$U$FPH "\-=;V'R7'Bt;'1B!;-U $/'&'3- E1%/N=J*<+< VTKU,Y<5X|<<1X@B=8B98GGEQ-=j,%<; ;%!]?!]9%!]!]L%W")AGOIW"AW"/hN'-;}'-TFO6 2U-W$J6NDJhJ6L^L {Q.2DTJ6 HQ.6$ HBBBB (4.5&EWU_:?UE)5V?\N5CO$1kGqN5  2 Y"tN1%) %lAa"Lp-@M45U'82.N'D6I,5(4EN$/6&T63@<8-NE1%&/N=JR3%y iHQJiY FFpSG . G&[7= p*U{)NT pSD pKP9&SM[2l =U C2%D+&1%D%]%?%D%&`QMIO o o F" 0&{0> f{&{(#(#MI f&{'-(#(#eA\< wBB ?9lGhS 54 H9lN7N7IvN7*]?j?j9lN77?YHN7+Eh \GuJCvOyCvSP?CvO~&ACXO~M!M!XO~M!M!:,iAO@7!\TIO/ *rIV1DO%BG@K7#?%B -%BA77/W7K;6-"RAN-;.;.I-;.Q HY9F3W8Y8@Q : E() W} ELCK0cE(AAL#!]L " H3WE( %Y-35<9,K5,'[CR#HPQ=%YG BLL C2PQ> m>s B*. 
B KKPQ4%#=-*,';RV%NWY?,GAO$HS$g*\WYFFAOSSIJPcAO-;bWY41y-[IJ}EYW=0%~D +3EeC, EE I/!/!CM: $X fS5s $(#(# I f $'-(#%(#eG$OT^ 3GV/&GE%J{ODu)Du<N!)<!A!FSiLtjjPvTOPvLSjLM!PvL2v '3",D#: WDc N P*T?.SW[) 8TS "TS'',TS'U:Y&x6?5RWcFF`"{7~3S"I6:"IG"{'V"{1eUFPH' "\1</,H3l;YR1XrFT,KB1LJLXV2g.,B}WB}L U7L |,N7#}?YMSxSxA#O~&ACXO~M!M!XO~HRM!M!HRyV7DV7LcEV7YH 7UC,A @7U(o!4'@l(oK(o@lO})IQ $X f2 S5s $(#J'(# GI f $(#(#e2H1()WR./\I= TFO6 2U-W$NLJ%^ {Q.DT HQ6$'B&P>>&;H1AHH>2hKYN#*S; c.`K *S 3C[5`@@8zAHnWFK`THK5,'[CR#HPQ=%YG BV C2PQ> m>T B*.H6 B KKPQ4%#.=-*L<A^+6;:9A^77A^CL>+MSLoL80*@9C67VCPC115V)&{0> f2W&{(#>(#MI f&{'-(#(#eMSt ."&E +0MT00&{(#=v(#M&{'-(#(# -+k-/$ BR1-/FN5-/PO4FI bIIYKI+IPT33C|S9K5,'CR#HC=%YG B CGPBVF m B*. B KKVF4%#=-*/gJR@5 'B&P>>ON:$V~QmONON;UD8C$ <GE$C$& =& 'C$& & $ X!vS5s $(#(# WX $(#(#$oGR8%K ?N '-FKQ=;QS %QB/S2]23 HHS:5647d7d/5 R|MI; z# 05o zKez  P 5]GVP-8Ui62>]Dz?>]:|Ui+HQWTLKIxYj?99WH/PD*)M)M)M)MT4:TPAUWo'TPN5>TP4'-h@K95[PY ?6)\HN{6|,F30%GvK%O 4 4IvN7 4T W'X+@W0i RS*"1jY9&/.Y.A#/4A8l1l'(%'(UWI[WR?'(#SC9#JC)z'-GE[ BI' C'KI) ,4 m @6<RR B*!(#K7<<3e B/LSHJk& IK,:(#4 &+Q4jP4-;S gJ;SD1SD1;SD1V -'ITH' <CTc C>N"~'-'-.kO~K&dACXO~M!M!XO~M!M!RM B>K?KSIO#L"B/;K aPG'5;K-Doz;KRM! nl 0R2? f2V-2?%g?? %=I f2?? OekFz; l(.18XZI[5PIA WfI[)I[WKXZW6~K5,'CR#H7=%YG BT CG7 m B*. B KK 4%#=-*'& '&22'&.% k&X M!7M!M!X M!M!COIN2'-'-B=/=d *Yi"6ee"7RGSxN SxACJOSJ@&)(@S@ K-_-_=I: oP o-iF"@3W WPKUPm:PU9R[P9SG944^S4OX4 2U -444 N' $IR-% H <(# ]Dr ).oJ qP#)zB2+stT0OHDN (.3 (4W68s (#24W3es[LS?>K H.O(#(#?s! N*'4|4 c4>>WN)4 j;VH 7UC,A @7U(oS}!4'@l(oKA(o@loLS)BA~/TVy7!\TIO/ *rIV1DO%BG@K7?%B -%BA77/W7K;6T%?@\?@\JBGCTT +J??4 GI- ?4A#/6 2A#CA#2!6? :% ? j? M[IoIo)#4\s43C4\.*.T+4\..2u92u2u*-#EUVY@x?SmQ7If-i*Q8GRO Q%g%g/5*QGe*Q#Q?^.$.+@G11f}:}.&1H,-NE,Q%77m1KJhMN.FTTR$,[K~l96 l CY=NUA (W (8 C$];. C!OR K95[;.#|I X&/.Y.TdAK:B:DrRF=%6xBG?{TY*/V> mN5?6.? KKV4%#'-=-*4U, *X *L LA@CP K5,'[CR#HPQ=%YG B C2PQ> m>? 
B*.( B KKPQ4%#.=-*F;>8d 'Q'It&pHx0g HxFHxK] EO&p NY9 K#A CTT+TT,a)J}LHRU/TA64gCx;/TKo1B)T:LWo1Q1C \GuLH:LO ] .5V*K,LGF00+qYGSxN SxAV,I=:QIT&IWrB'J]Wr=dEe=d JWr=d=dK#E*__6o._4W $KX f2;S5s $(#'(# I f $[(#I(#[eX$?X$ (X$,v;m,9/=4M[|<592[/I W &[>]+zPA(W 6!3>]Dz?>]:0|W +HQPD 8?6Mx)\P.WV,0%Gv%%O=PPT&>99U<MY!NWCt 8)N  \ 6 M 6 jHg*N:!Y*NDoT6Do;*NDo5=U)127SgR)1: 34 :H:#R>QE*nS8#D 5#0#"V/(#'-U-  N?O=8B9Y] 6YC8G:(EGLq6C OC ~:(5:(@@N64U@\L"cP9D=?6)\DWV,%NGv%4A#EUV@x?Sm7 -i*QGCG.iCY*QGe*Q#C?^.$'$ 9@>N5>-pAP :BDr0U9Ec90 B<CT, $Y<1t?<A$5 CP9PT2F DB;QF4 Mc&K*4V4K*Sk/m*r-\%%%e DqUFTn%e6A0M%e11LkLL.('K B N '-'- 0, V :g ,FE>@SjPv:-/$ =IR1-/N5-/'-G+ZM+Z+Z; 5 T&ONRV%:NFE0GAOHSTFFNF5?'AOSSIJAO-;F41y-['-f#EUV@x?Sm7 -i*QGCG.iCY*QGe?*QW:#C?^.$EL*<#(Y$ G$O 3GV/&G=}?@ 6 (%|!EA' <C C>N'-'-&/.Y.A.9YARYR%~D +MYECC, EE W&{.@> f2&{(#E3(#MI f&{/3(#(#/3e%?)4 9Q7$O8)Gq#WQ7' TX$ Y'N1K'%) %lAa"LpL4h(Ma ~4h(,4<<<<j -FX f2GU Ae/I f '-eo6%e1R<*Ps>L<*(#(#M<*/3(#(#/3!n AP$Hw/G/L@ Ys"=W'iA"uXDVTvN*H R(>Mx 'G@AO@HS.MF@AOSS(%AOGI+'4Q-=Tv4j)nM[H?MyH?<+Dr;4FB@'TP7N''1 0O@O+Q.m*J6JJC8#*CFkC9xQ$YFX f2P!5sY(#'(# I fYC2[(#(#[e-"R?0AN-;.;.I-;.;.%ND4 $m{BBTUPH: F>A.HGP=6|,FNP}IJJP-;41y-['-}$6J'? f 2=5s(#'(# 2D$I f;(#(#e*]GTN7" J@!"T":g 1-D2CiNKD2BBHD2!wO0NE#7&/8Y=&v:56gQ #I.Al83"A8$2$4l8YT8<Z+A+2$8# k8!XI sRg#DYxUnUnRUnRO: (# ]DrW9F6U:)z!H'-BGE[?ITSPYI) d  ,4.O m @6<RR?6!+ (#.O 7<<3e?/LSNa6.5&K :(#(#4 &.O4jR4S<4WCh G/;.cAAUXqSm11M1'+e''g EfT;W;K $KX f2V@S5s $(#J'(# <I f $[(#(#[e>-L!7  _  WbF'T+e'2YA'g "FBV <N-/$ HSR1-/N5-/'-'-{/tE50 7} "$j !u&XDUV@x'2M8 -iT*QG!#"T*QG?*Q@4m89p #?LC$kX5q``@# -@M4U'82.W'D64,V=>W$/6&63-WE1%/N=J%U3m2GFN5Yi'-D|Ps>(#(#M/3(#(#/3! Q&!CX/;eKKW<Buj O3DTLSo?.?. N`8 &~={L6>{? EN1E>5J:BDrBGTOFc mS? KFc|<592[/I &[>]C,P-8B6>]Dz?>]:|+HQj)O&&KJH< HO*W#S!P5JIIT0M>5>%90M(#(#M7D$0M'-(#(#h7,%?b@\? 
BFN:1I@\!d yB, A|:#?#L!dW-#!dAXL87:/W7-K;6>Od 'Q'82OHxJ7A HxFHx V<V0{ V Y8 Y / K z)%.?t^(!KX'T_122JI$mM' 5GV, 5;N 5XU2DX 3TX(G+R J<-R 8vYDT\NC;@=Cn(9>;} HLI;Q*HTBOZ 1TBPTB1:BDr;4=%F9B'CTP7''=G19%!A=G=GE5+;u>#L:UQO~,&ACXO~M!7M!XO~HRM!M!HRA +< T T+M[8K5,'[CR#HPQ=%YG B C2PQ> m> n B*. B KDKPQ4%#=-*L`Kx&D:BDr;4FB'TP7N''XZWKXZ6~A;HK82<0TU0W2QVZ;.K UYQJ?QAU5 CP9NVKTT*8l8l6Cy-"H2h*WRs> f2R(#(#MI fR-p/3(#)(#/3eCT(CCCF+ MNHTQ&NPN&RL3/L* /$:$ @:N5:4-pKg-T 5t h%1%#c%% J P ?!C,[l9l*N (;jN 01NFF:NJ+B=/L]0/!L]GL]Yi4p(D @g3(L$(>U.:("H2h2h;02B=;0;0GdFWv+H7HHH(%H  f2! e% AetI f%!s!seR 0E)R77Y\R77=8 <B9-/$ 8GR1-/C N5-/'-'-JOV!$ QTV!N5V!'-'-TL76LNL65Rf6,46PW{5 ?6)\e'WV,0%GvI+'%OTv1 L PT2F DB;QHF4S  Mc&K*%4V4T;K*WW;-/$ KR1-/N5-/'-;} HLVN (#b ) E qP#B(k2+>)AK1HD  (,3 (4W68 (#2 4W [LS?>K H(#(#?s! =+*'4|=8 <B98FGNN'-'-=C-9UwN.w(G(GwUl ~NV4 4-zB' 4.F 42M9Z" 4N(yN?2=*`2=TUl 41:@A -0-W4NF?E^*B=U)1+%H+?KsA)1V 3 'F L'V0.V*FN@I,Z-@M4 U'82.'$6,V !$/66-!n0WWY<M?d 'Q'823&pHxS9 HxFHxK] EO&p NY:V68$O?Wj61MM1g>6u/!Q/!G+pC > mS K>Fz lPDD -k>DD"H2h2hJHSM 9Q7$Ct 8)Q7 T\ 6 M 6>KsG$ -G N5G*9D=J)|>"DG"V."UIJ"-;.41y-[}NU OD7n9 "y 14<*Z4JW=s> f2$R=(#E3(#MI f=-p/3(#(#/3e1. $,11?GgHW8<?""?*sG&CY*sU8*s!jBQ#$6J0? f 2=5s(#'(# 2D$I f;(#(#eDrI2i5IJ.uIJP2IJIJB @y*/N/RZ/G8 uS6PLkC4Lk<LkTh"1R_DxU@"F"=O7UA7N77-YKO>7 O7 Y,7 J%eTn%e6%e1.z//;//#9#33F8r ~W&{.@> f2+&{(#(#MI f&{/3(#(#/3eP/w ?6Mx)\'iWV,0%GvI+'%OTv|<59[/I 1n&[>]GVP-8Ui62>]Dz?6>]:?|Ui+HQN1 m> n B*. 
B KKPQ4%#=-*>M?d Q'3&pHxS9 HxFHxK] EO&p NY5FCErM2S2]2S3-wA\<SKB :S=:U_:%IX X(=XB=/29)=22Yi=6=PPT1T{MDMJ+NJ+MJ+6:'1@562i6c"2Q3XcHwXc%@l22*%,'59nR{$) )%,J.B37J%S8J!=dEJ>4VG9IVG8w)KVG;m,H=4M[2X p$"TT pSD pKP9WU/TA/T)C@ LHSRA<A6LH+8KFX f2X5s(#'(# I fC2[(#(#[e3,;8A4$S]A/V 2M8B3I (i88<M(iPPX&&RS@Y?Q W:I=(9L((8& LH/6/q8Cx |KoO"R%>^M):LWoIO"/QO"CLH:LO ]  1AQ$&<=Y,2-}$&A0RY #@A\AI, 1WNQaJO627uJ ^#)D SvEuT PT ##57T -8Y%6?% RT%w`~"{##}@6:~"{FU"{$M$}R wU@%FPF&%H*-(P ?6Mx)\H6|WV,0%Gv%OO]5S:cL~T~D4 $mMG8H7n,.0.0E1?B8CS7=3L+Y<)WQ&KU&RW*h"W*?U&&U?TFO6 2U-W$N3J%^*uQ.DT HQ6$Q*F-"R?0T|AN-;.;.I-;.;.4 k%%UR{   = C2 $X7&S5s $(#'(# $[(#(#[D=R2 PK82<2N;BY7OQ[B?U5-XXTIG/ 21BO60UvGWC2|0 0P:A62|>>4X?< ]Dr H:K()z!c'-BMGK@=T;G8SNMFX4.OM=.OK(m3e=G"M2XOO4KZEG.O4j4M[$V-/<@<8"@JNE?x E5U55E5LHRU/TA64gCx;/TKo1B)T:LWo1Q1CLH:LO ] MX>%I X(=XP;SA@<85"+o+ %V(F- u"V(+U/TA)/T)A,COWo!.,HI=S2SS-!#eOK0FP!!0JJn#WWY< !.Bc?F>]F>D80BcSY6 )%/&>H>(#/l(#4>(#N1#,[l9X$lNU;j (W?P;j:W5 Ia.6/j$0S(Z6/ 6/'C.(ZH?7UK5,'[CR#HPQ=%YG B C2PQ> m>? B*.5. B KOYTKPQ4%#.=-*X2`UPY H4(Mx-'-HMGK@PO6|GMFBH4.OMP}.OK(m%PG"M2BH:4KZEG.O4j4M[$VUl ~NV$-0 BNo@.Vw4lU2Z/GRn(yU2=24'<"<lUlRn$$QL$N+< VD,IU  ;n, ; %MZ %=; ; %F@,F@F@.(K$F}}p&MV# 0{+ 2 Z3C2(&5}2U!DQN(LU-DC`)( A>7oW/X/b/UkM+lO=B:IUOH>H>5K>$YF2(gR?(NC=%@2GR Y+C.FVF mRC.!R KKVF4%#=-*-r.QH [-r/\/\BV-r(#82Iz 7IzZIz4NV$-B4lZ*81(yX24'<"<l81$$Q@%3&9UUL$;;PEQ)p$943)`QOIO:lQ(OO( 4,pS H 4N7IvN7*] 4=N7N7=2o;V Q';Hx!; HxFHx Rp#!p NO#NU=G7O GA93TW-%s> f2X-%(#(#MI f-%-p/3(#%(#/3e0UHL]]B ?9lGhSk54 H9lN7N7IvN7*]?j?j9lN7SM7?YHN7VNHSN?O=8B9Y] Y78G:(3G]+O:(5:(@@N4U@\BLL+P3W M MP@0 $ XS5s $(#(# P/ $'-(#(#0|( A>7oW//b/FUkM+lOB#:IUOH>H>5 52ARYRY;$ 9SCP:3=%G B O C ;p4F mNF5" B*. B KK44%#'-V8-*2;9A4062 @2/y:B:Dr:H6-=%K#BGK=TRE:FN5=IJ=-;:41y-['-}EdEd>CQMRV%:N7GKAO'*HS?IFNF5AOSSIJ jAO-;*@41y-['-IJD1&H3aFK nPPWW <0{K5,'[CR#HPQ=%&G B^ CBB2PQ> m>T B*. B KKPQ4%#T-* ;NKI,8 ; %MZ %=; ; % %MW 1<RLL*PTT;&(wYP;&7E;&+&&/1.Y9J:5 '81.8fAV</ 4B8Y78l+ @] 7QjM 9Q7$Ct 8Gq)$1Q7 T\ 6 M 60bLA&06F _OJ6Fy6FFR?'(#WCW9#IM)z'--G. 
BI CMJI)F3 *?N4 m,C@6<RR B*!(#.OK7<<3e; B/LSU=8 uSK*:(#4 &>4jK=Pb4/A LU >+ 5G(V, 5GzY 5-BT71-?? -4??4 M$HTM0, V :g YhH@E-'ITHO-/$ % R1-/N5-/'-'-VnM N?O=8B9Y] 6YC8G:(EGLq6C OC C:(5:(@@N64U@\L//J3W8Y8@Q : E() W} EL/0K0cE(A L#!]L 3WE( %Y-3H9KY,A @(o4'0@l(oK(o@lO,_R&$rD $ri,LW<AKNGTu<V+N5<Yi'-hT%V-V-L119GA 5] GV/G2=8H#%@HHBFz lP2OR2 'EE 'Q ' 6X,4ee:Lc=8-BT71-?? V-4??4Bd-;BdK?-3BdBd-&/.Y'.A)T4AT-0WA]/<W5WK $KX f2S5s $(#J'(# I f $[(#(#[eDy/y:BDr:H>=%,BG=TRE@@=IJ=-;@41y-[}N #04VY]- HYR{0G$71?- zN`GO$7$7@@NG4U@\BLLWAWGnWRO< (#+DrW9F6U:)z!H#'-BGE[?IQTSPYI) dN ,4.O m @6<RR?69!+ (#.O 7<<3e?/LSNa6.5&K O(#(#4 &.O4jR4S<:BDr;4=%FB'CTP7''8wR?'2v(#SCW9#?M)z Q'- GE[ BI l CKI)F3 GN4 m @6<RR B*!K`2v(#.OK2v7<<3e B/LSNa6.5&KG:(#4 &.O4j;4S<>>'0YX>0m.58F0mB0mFIvVN(k(k>)PH2?[2?'+ +xG%OG d%OV+V+0 (%OSPXDVCN*H(>Mx '--GK@AOHSDGMF&4MAOSSK(m%AOG8L@o&:4KZEGTv4j4M[::-/$ AR1-/N5-/'-5d C yY  P7!y y!G33xMw$M$}R w%(NK 7UCY]<7]HYVp7UGHJ!C}SGOHJHJ@@NG4U@\BLLT5 +t h%1%#c%%#&2WF,LBm3 Fp-%s 3 6Q36QpN6LpO#3 -RG5A;HK82<0TU0W2Q;.K UYQJ?QAU5 CP9.Z3Q..ZV\.v.ZUW::&{0> f2.&{(#(#MI f&{'-(#%(#eM*S647d7d-V,-V6J-VThO62JLWDCQQFQQItH<It>g0g"Y7O"*W*Q*$mM?Wu0-Q f21-?? I f-??eI+/a$ VjHC,t?$SI-YSxG\2Lz2AEJy42LzLzDABG\4 P PZ+YO!];-2V2L`X)1)1e3eQ,-=,,R H)5V?\N5CO$1kGqN52<  2 YS@"tN1%?@) %lAa"Lp!n APN$2 Y5?o;??(:)r:);&4{YP;&.-1{;&?6ONQmONON;UH?-C,A @7U(o!4'0@l(oK%(o@lOJV%J//J / 6Q (:)r:)Vz(%H  f2> e%AeYI f%!sXk!se Bj :.A=%N.N5.'-h POQ)&{0> f2&{(#(#MI f&{'-(#(#e='K=VKJ= DK=>uud 'Q'82/!&pHx;C HxFHxK] EO&p NY#y0U#yY#yTk0SY M@ S;1>aY/;%&;Y/%+DA1f+D/x)3 'L= 4P(NA.%BP(=8F=P(==A'\QALVL ALwG6"wVDDwP/b/ X"(i(i*%#3+%6N*7>3 XV/ L& * *N@I N1?+~+%dK@X+" 7+K(#MMm4M[2l1$ $XS5s $(#'(# $[(#(#[2L7xK; 5#,K4vXHH VB9& ((H>f'-KGK.:(3 C05YGMF>4O,CM:(5K(m%:(G:A93TN>:4KZEG4j4M[:sG@4 BR *5%lAa&l0W7i@5{J 1,ITHO>RTO'tV'tO't6yE^U1A/T) O:N.4U,"z=$/*=;.>;5,J%([JWG f2WJW??? I fJW4?$?4e)Hhn#WWY<XTIG/ 21BO60(GWC2|t0 0P:A62|>>4:BDr;4=%FB'TP7N''FE7 *X #D D:A:*> L-69;C61." 
E9;'9;-hIb G;.7P7 ;>;?s%#(h(h $ X<S5s $(#'(# Gs $(#(#W@G1&O&&;D78HRW-P=V!9FSW9FHY9F NY >PC (DP&E*N7XExP&N7JN7TP& N7N7/"3=&JX f 2IY:5s&(#'(# 2D$I f&;(#=?X((#emn#WW)O&&$=KJH Y HO*WKJ#PD 8?6)\P.WV,0%.OGv%ON?O=8:B9Y] DS1Y K8GK:(@GY1C ONC 5:(5:(@@/N14U@\'-KfL@UMY!NWCt 8)N  \ 6  M 6CXFWN CFWFW=FPP.,P>X2`UPY H4(Mx-'-HMGK@P6| G8SMFBH4.OMP}.OK(m%PG"M2BH:O4KZEG.O4j4M[$V>R 7j>[|>9x3@9V9p%"={MKWX "D232OKM& VJ%?7& ""RJL 1AQ$&<=Y,2-}$&ACR<#@%A\AI, 1<WNQa,ZU$! A&K5,'CR#HC=%YG B@, CGPBVF m[ B*.' B KKVF4%#=-*#33UHr L'rC:IUH>5"8T2F =*,HF4c ?W?Jp/:R\"W?[6n"3EL([E;G f2"E;?%g?? I fE;6q4??4e 1AQ%$&<=Y,2-}$&A0RY #@UA\AI, 1WNQaV b8,a)J)J=8B98GX+f: 20#+EuWC2|0 02|OA.q #SHp.q>B+QY >#2> 1OMBDOMB&HOM:KQ@C NT'- ?.'E7/d):U91P?E7??BY8Z1 )S ?7?W/91s.,Xkr\3BRv8c\\ p4NNU J!<BI JK5 JA;HK8(#<0T q90, 223BQ;F"HD (wYPA(4W68QJ(#4W Q[LS?>KAw(#?s! N*'4|NV4-4 ZU:Y&x6?5RWcFF`"{7~3S"I6:"IG"{'"{1eUFPH "\>%DJXHH$JQ$&<=$&R'm #@!`'m1"8@@Q"j -FX fGU  Ae/I f '-eJWF)JW6MJ1Vx SVoC`/-0<SMW9&-*C*ME%"MR!1-S P% O+ : q|DCs4OY/?1n4O(F5:$9Ui6H?:|Ui+HQR34Y iHQ0Ji FY C FFpG . GU =8B98GGN5OOC:(G5O%?>?@\B>_;.#T8 } Z19&R ZW ZE5+;u A/7o{!/X</ e4^UkH/'M eB& eJM?7""RV%N4?,GAO SHS@vD4RAOSSIJAO-;441y-[},B%P""?6J??4 GI- <?4A#T/6 2A#CA#2K>V6h;'jW;%KY c 6;DhKP@ 4,H nKPTH5HE8')9F:!Q".JR.!3}M/*fJRWAJR!G".WW,HN&#KXZ,'HV-C*>AV-NEF; ,NE$NE, --I'CG C?K mS B/KXT@A7iE7i6 67i61/?bK $XS5s $(#'(# $[(#(#[)Ul ~NV4 4-zB'E.F 422Z;]<E(yQ2=*`2=TUlE1:@A ! \ 4KFS H 4N7B}IvN7*]d 4=N7N7=M<{qz9Oq X5K>Y97?!*5Kff7?BJB7?, 5K"`*.h 4tXKYN#*S; c.<8K *k3/{<8EAHnK<8TH==F*2W2/dYfS:Uf- H E7??Bq()S??XT` 6 (W2 'FW2t)6SS*E*AMYVK2X8aMo%#TX2 XTE%,3,;D8*E*NAMYV HK2X>Mo%#T=X2XT` 6 TW2W261Vx SVoC`/-0SM9m/&-*C*ME%MR1-S PAO+#EUV@x?Sm7 -i*QGCG.iCY*QGe9*Q#C?^.$6h;"x-H-HI/W<;`YARYR%~D +YEOC, ETfE 2523K.2PT2F DB;QF4BS Mc&K*4V4K*!JR ;nI,8 ; %MZ %=; ; % %#EE##$G'Y J)JW6MJ<D<.<4=N)Q%Q4QSQR 9!n AP$? A4 c4>>)4MY!NWCt 8)N V \ 6  M 6CX? 
BrI3>U)!M!**UI3M!%M!"rI3M!M!/Y!LjYWRR\YLKIx, :I>oS)4&{Y> &{(#(#M&{(#(#UPH F>A.HGPA6|,FP}IJP-;41y-[})ToS2%> f2*NS(#>(#M%WD$I fS'-(#(#D(e?U7YH8<?"";) 8?.l:e DJQ$9KpQJOCOIO:lGIGIJOO1}CSwK(GJUX6?:5J066}JK5,'[CR#HPQ=%YG BA C2PQ> m>C B*.P B K KPQ4%#.=-*`Y6?4:*" 1MN.FTDTN:<2NNK(#M%~D +HZEHC, EE .Q84T..$mMMCtRGd!n4$PK?Y#L$/:\'S5%:\N2Do:\RM! nl W%KN/!QR QC=%F1/!GC-KVF m>. KKVF4%#=-*" M4WE)=*H&2Z>0A 1W  k/ 1W61WT;"kWW;R$,[l96;l C N:CWK~ C$];. C!O;,!RCK95[#|IW/<W5WG` =FWR 1AQ$&<=Y,2-}$&ACR<#@%A\VAI, 1<WNQa"I2? 7UC9nM3 f7UP:7!GW=4<GSP:N=d/FP: f2&{(#>(#MI f&{'-(#(#e|K\S*K\U8K\'N?@X<JX f 2! 5s<(#'(# 2D$I f<C2;(#(#e,:,W,&q+;HLPPPTFF/h/$ +/N5/'-'-%""!21x>? 1x6I77H9S],X,F^9#Ul$X2Q~3\QUl3".'E7/d):U91P?E7??;B 8Z1 )S ?79W?/91s7.,XkK5,'CR#HD=%YG B/ C,IGNC mC S B*.7 B K8KGN4%#=-*! I: oP o!F"<?iGn>V777G5XU2DX @oX,kAo8:-/$ R1-/N5-/'-.T +28--u--@G1B >IQ4lL| ?,' ' ('E7/dE7B6@JJ)S1JJ $KX f2S5s $(#'(# $LI f $[(#>(#[eXsL;Y1s-K^4OS?F5 61S$S$;Y3HR?'(#+QC9#R 7j>>FM QwFF:3,,jH);ejJj!<#'#5R9*U2p,#5;#5LZ7<_  H NU6 )RP# XUL$[ . !?KJH H *W#PWPW$*KJB6KKCx+VDA1f+D/x?(?>{L6>{ENE>1VxJWVoXC`$00<iMi @ACiMEM+<1A V VT0y0y:+'O?Wj68"g>6uU&/.Y.A4*)2=8H;.V0#%@HHG2JN7O?9G$NNG2;9A4062 @2A4Y i0 FM90Y=6E "# 3"pY=6? =6A 35 CP9Q*M%0P)u#. <,LX< O 7 V 4&--V pV $PP0RSJP> f "P,-KSJ(#(#M%WD$I fSJ;'-(#(#he"AN&/.Y.A#/4A8<P0M>5> P0h90M(#(#M7D$0M;'-(#(#(h+f:EuD2|!e2|RGd!n4$PK?Y #L$/:\P6 757%:\N2Do:\*6PR M! nl %W!l !l!lKSDnjB7l{L6>{ENE>N}S99WH/"8T2F =*,F ?W?/:R\"W?[6n"3EL1l'(%'(SUWI[W!dXsL;Y1s-K^4OS%F5 61S$S$;Y3HD*G+~#O~&SyACXO~M!M!XO~HRM!M!HR: $X fSdS5s $(#(# 7[I f $'-(#%(#eC5< ;NKIRd,8 ; %MZ %=; ; % %KYN#*S; c.`K *S 3C[5`@@8zAH 'nFK`THH HL*(#WI ) qP#&|H2+$BHD (Rh3 (4W68(#4W [LSS-'ITH HRh(#?s! HN*'VQ) $X f-TS5s $(#(# EI f $'-(#(#eR?'2-(#SC(9#/M)z'--GE[ BIQ CV>I)F3 9v?N4 m@6<RR B*!2-(#.OK2-7<<3e B/LSU=8 uSK9v:(#4 &>4j4/A!<;!"("/.!"UPH HG6|LR/4P-;R/#1)0W3!E!Ga<(]K@". 2 W) 7K? 22!G".07KWW,!!0PPWCT.CC~R$(X9S61 ;/| C8"XRW C$] C,5):/|RR!p (.;J! 
jKH22*8%,'59nR{$)J )%,J"37J:JS1J!=dJ;V 'Q';Hx!; HxFHx ( A/7oW//b/YUkM+lOBVOF%1RW2PU)3M>&Z@:V .">5A xDv%,F>$ 4O 4SR* H 4N7JN7*]P& 4N7N7R$,[!p96;l/| CN24SW C$] C,5):/|R4S!p 1?511B@)}B@+/B@66-xVN(k(k>)PH2?2?S; (-/$ R1-/V+N5-/!rT:#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGe*QX:#Q?^.$W&{3>2&{(#(#M&{/3(#(#/3PUP ?6)\H6|WV,%Gv%%H  f2 e%8JI f%++eGy;X,X18TFS6m&<1LJ kL kV0!_W kL UN7L@H9S],!_X,N7 1X,CuF^C#O4G!l" M40AE=*HRl0A1W&! J1DW?/C 1W6R\Q1WP"W?[6n"3EL?B8RCS7=3LY<)WKMW*h"W?&U?XB3(# Dn4;V?!%07=EQ+Y< 30 9=AU&(G Br" ?U&&U?X?< ]Dr H:K(Mx!c'-BMGK@=T;G8SNMFX4.OM=.OK(m%=G"M2XOO4KZEG.O4j4M[$VJ:J06J $KX f2 jS5s $(#@(# I f $/3(#(#/3e3<AAHQ ?*.JNJ-4TUWIN)N)N)N) 0N?O=88B9Y] 9Y78G:(G(sWLGOG6:(5C:(@@ %jNL4U@\=dBLL;/I(1H:K>N'-K5,'[CR#HPQ=%YG BLL C2PQ> m>s B*.Y B K5KPQ4%#.=-*"cBk ?+<>,B+3RWPCV 5A& "u7I#Fz l?5O'1Q,CPV2Y R34 iHQJiY FFpG . !!0GPPN?O=%8B9Y] 4]Y78G:(-?GF4.FOFO:(5:(@@/5hN.4U@\BLL-cNhV_cV_*V_U'8 UhY _4H}.D)Y2"T05H@V5HW5Hu(rs0V A&AP |?u0U9Ec90 |<?, $"pY<1t?<A$5 CP9KXcXc,#8 WA0q W W8kP25n?6Mx)\e:WV,0%Gv%%OKMQ!R9XyC=%5G$$i"#KVF m$Q.$ KKVF4%#=-*P.'9[[P70Mnu(rs0 89S9@9x;0RY01L/YK:'eY7YYJ1XGXXsXB3W8Y@Q : Ki) W} ELRK-mVS%A8L#!]-JL /B3W% %Y-3)+( 5V, 5; 5AA0//$UkO-.:MQ!i(V (%J e e%Ae%!s!s# iM#73^71V#77**<X '  ;U Z^K --Od'13' ,O7A2$/*#&3-E1%/N=JV_V_*V_XJ8% $KX f2S5s $(#(# ?I f $/3(#(#/3eWF<XUR5sF(#'(# F[(#(#[VX+5O hVX%1%#cVX%%:<AK6^9<N5<'-Y5dlh# 7"# O# M(#!RY(#L64:Q"1" M4C 0AE=*H3Y0A1W' ?N3Jp/JpO1W6R\1W"3[6n48EL!4@6% !4XY!4' >->> 9XGU Ae/ :N75 Ia36/S90S(Z6/ 6/(Z>8d Q'It&pHx-0g HxF HxK] EO&p NY@*W.W.W.W. <-/$ 5pR1-/,N5-/'-'-(-NE)-:]1-?? 
:-??:+X'O?Wj/|8"g>6u,5):/|!p 8h \4!B8h+8h"+O-/$ R1-/FN5-/'-'- 6)!jP& #LF?* K%UL?C?(=#EUV@x?Sm7 -i*QG.iC*QGeY*Q#C?^.$R$,[!p96L;l CN24SW C$] CR4SF#(5#G30#V/(2#PP4%GXG'c8 = L C2/j=JB9B1BK $KX f2$S5s $(#'(# I f $[(#(#[eF+( 5FV, 5;5 5AA9H$N#v9!XU2DX CXH1 L I&:CC:; :!E!!!s!/ `+&"C">!P5A&[,P>?q(}>>lEJ=JiJ d/*+@// & K/ /KW $KX f2RS5s $(#(# -%I f $/3(#%(#/3e/.E7/d):U9NP?E7??*gBWW45)S'?7?5s.,Xk/UyK9S*S*7,%?>?I!9B&81H@\6J !dB) 00?J !dWt!dP:A670>>4/ AT< D/ =vN=v57/ As(#hV_OrcV_*2,V_22*By'59nR{$) <%,J37J?SJ!J2?CAF#(5#R0#V/(P5#+N'K[D'0Y'8X:5K[0mDG58F0mB0mF GU Ae 0"P> f2&G1(#(#M%WD$I f'-(#(#hezSX,@ Ae/zGh*%#3+%6N*7>3 Y V/ L& * *N@I3%K;UDU6XsEmUY4 N)LM/%D/%/%D}#D 5#?0#V/(Se#!!RO<L(#+DrW9F% :)zY'-BGE[?ITUPYI)<N ,!4.O m @6<RR?6!L(#.OKL7<<3e?/LSNa6.5&K O(#4 &.O4jf4S<;T8fW9:CkX:S~PUPmP;?W7-P7 ;PP';PPT[>>=`>>BM!>>nAA``@%r#;9E)JBOTOVOe<*T@ :A wT@:N5N@:4As(#hY=I%~D +3.EeC, EE8; . By4#EUVY@x?SmQ7 -i*QfG+O Q%g%g;*QGe *Q:#Q?^.e$RO<S(# ]DrW9F;:)zYg'- BGE[?ITPYI)N %4.O m @6<RR?6!S(#.OKS7<<3e?/LSNa6.5&K OO(#4 &.O4j 4S<0K@3E!L=#\@wK@2xW)95$Bc?J~2JR2D80BcSY6 )%4 4X=d4?TT&~=:7: 0UPH F>A.HGP6|,P}IJP-;41y-[}2(UPH 9n0@ HH>6|Jx;B^S>=d> f27s(#=(#MI f7s,/3(#(#/3e;YR1XrFT,K61LJL$GVQ<.,B}WB}L U7L |,N7#}L?YXQ JwMXQXQN@;YR1XrFT,K61LJL$GVQ<.,B}WB}P3L U7L |,N7#}L?YLg>[:~)'F[7[FVJ.'E7/d):U91P?E7?? BY8Z1 )S N?7?1s.,Xk(G+= J<H-IXRFW8v08A K\@N CFWFW( K==IBuY 6 HLVN%(k )OFA(k>)O&&$=KJ"H HO*WKJ#2FDoN?O=8B9Y] 6Y78G:(1G,Lq6C OC :(5:(@@BO#N64U@\BLL*%#3+%6N*7>3 V/ L * *N@I 1@4T.Y?22T6,<#@6% 6I, 1<WNQaN8UQY ^&05XYM!M!M!XYHRM! M!HRUl+)(.wmz:.DC.}.w4aDjN(yI*`=TUl1:@A M1K5,'[CR#HPQ=%YG BLL C2PQ> m>s B*. B K5KPQ4%#=-*W3h#_Ev#=v#UhXK8 Uh?KK5,'%:CR#>I=%?GK BV CW5F mNF5" B*. B KK4%#'-IJ@`-*.CNG+j h&&Q$ $KX f2:S5s $(#'(# I f $[(#(#[eRWPQVKj5A&KjJA6JAJA0Y97?d! 
Qz7?BJBKA7?,Q"`*.h 7 NO7 %Y, % %7 %+*C*G6+9 mS5u K9!0SYS2F*(=34'!*(IJPIJl*(IJIJ H 8 ).oUP#Ps$> 75&s KJsH H5*W#7&t5MR #EUVY@x?SmQ7If-i*Q8G.O Q%g?%g/5*QGe*Q#Q?^.$49Hhg m7PF%1RW2PU)3MI&Z@YV08I5A xDv%/,FI$ %4O $X f2CLS5s $(#'(# &I f $(#X(#el;|4'<"<l$Q?Kt9(?R$,[K~l96 $,l CN2A (W (8 C$];. C!OR K95[+I QXxES:zXx.XxE}:9[[P 6 >] ,6G: G:G:H W! 1cA,k2%A/((Xq&/V@/?(T>>9B?8<?"?XDVOCN H(!Mx'-MGK@AOHS&GF3MF-q)4MAOSSO.OKO(m%AOG"M2-q:4KZEG.O#4jH;4M[$V A?1 <>N'-'-#&=H>K82<K2; hZY7OZ'U $8/HW,t._9-I-:Lz5*+5G9-OVLz LzS9-8^p+%g6q;1XrFT KB1LJL-VXSDWL U7L |N7#}?Y--X;=7,E3f*A=X] HLVN%(k )OFA(k1>)O&&TKJH HO*W# yr y>Q7 y<{q9Oq X>Y97?D! Q7?BJBz7?,Q"`*.h 3d6Re qM{- qY`( q.M3|BQUG:6FKP977'c7I  `` N&u5X P'&uYh&u@@(G+= J<H-IXRFW8vYA K\@N CFWFW( K==?22*8%,'59nR{$)O( )%,J37) O(GSGJtJ!=dJA;H'(#+C0TQ90,2EQ C;HD (I mE(4W68J(#4W Q[LS?>KAI(#?s! N*'4|#D%']CTr*@ @ U@"o Q3x0@'8'Q)E -#\%QF> !.Bc?F>]UF>D80BcSY6 )%HND>>>.?4!d3 :N1YdI3 3 "3>$6JX? f 2=5s(#'(# 2D$I f;(#(#e&{0> f&{(#(#MI f&{'-(#(#e?B8CS7=3L+Y<)WQ&KU&RW*h"W?U&&U?.^^E9?*Y8 KI.7x/v:*S?' L$x\C'd'v 1S?E\JJK,NV$-Fe4lZ R(yX24'<"<lR$$Q9W<#EUV@x?Sm7 -i*QG.iC*QGe*Q#C?^.$&33I3Y3SS: =:Y(#4:(#7W'TtikFz l(.185I[.PIA WI[)I[WF%s7H 4 t 9D=DGy~<]C"-;<]`` QX=)!0SY6YYBC//KqS!+R RB8v;R \ Cn UV@x=GP8W8T8Q }8Th1'1* %J( % % [1'& % %FRO< (# ]DrW9F6U:)z!H'-BGE[?I0QTSPYI)LN ,4.O m @6<RR?6! 
(#.O 7<<3e?/LSNa6.5&K O(#(#4 &.O4jR4S<nXAA$C U8*sjT2F FR ;Bv YDo E9 Y8 #7' :RY'/ JVC'd1%' $?E'9;1-XD{YN*H(>Mx '-H-GK@AOVQ6|DGMF&4MAOSSK(m%AOG8L@o&:4KZEGTv4j4M[:;E }G"WqX OTq777'6q77IKI+I>>>.0FF22*%,'59nR{$) )%,Jk37J%SJ!=d8J f2 (#(#MI f-p/3(#(#/3e> +#'E=:E $YY @AA+R R8v;R \ !n AP$:G A;&P;&;&F1;%_*LWF2&aC1qC,DZYP,@P P7YV,E3f A?9'=W/9==0z09 $PS $(#'(# $(#:-/$ K<R1-/FN5-/'-YY/&fR * 7UCJ)|>"7UG"!V."UIJs"-;.41y-[} GSxN W: 7CI-SxC ACq* )J> S8^p KkIcYB9& ((H>f'-KGK.:(305YGMF>4O,CM:(5K(m%:(G:A93TN>:4KZEG4j4M[:s@+!MS@+@+FGF00+q:+$AK*=H+$>N>5+$'-hPXUPmmP 1@4T.Y?22T6< #@6%6I, 1<WNQa?G$$KW:B6KKB!=8 <B9L$ 8G6C ON5C6-p'-'-)088C-Tz L*CFkE C9x L@/ ssssL#:F'!LGz1LR$p$pP25n?6Mx)\ewWV,0%.OGv%%O:#C:+:HJM*RLYVK2XOL*%#T=X2XTW2{S^I;!V{{;#XT%I X(=FX[7MI>>32D&32KJ,KJ'32KJq9Oq9$'!N'* H+'/1'6l2(UPH 9n0@ HH>6|Jx;B^GS>=d>Y >A:9@>N5> 1@/*C<)C0 `JO M6PCJO-%JOR1PS P O+8;AE5)4oF)KJ,KJ'KJKJ1cA,k2%/A/.YXq&/V/R _ RR!n$P$ t:-:-XlX?<+Dr H:K(Mx!c'-BMGK@=T;GF3NMFX4.OM=.OK(m%=G"M2XO4KZEG.O4j4M[$V4!d3 dI3 3 KQn<X(5sQn(#/\'(# Qn[(#(#[/y:B:Dr:H6-=%KBG=TRE:FN5=IJ=-;:41y-['-}kFz l(.18I[.PIA WI[)I[W 7UC7U !B<L p/0PMRL9> f2G+MR(#E3(#M%WD$I fMR'-(#(#QeBKx&D?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ur[C&f]xpABCDEFGHIJKLMNOPQRSTUVWXYZ[_`abcdefghijklmnopqrstuvwxyz{uq~+`&%C D $J# ||]"6$ #|. &M; r CU%~~ #  j  n8 r "#} of K%;  z$Cm % & $C \$C )g g&Y* 3z!(!  r!'(_b  R6"A" %}}}#} Y  WAJ)Z=~~(@%!" HH  2%% %4$'!'T%$$3 $+)[$j $Cm  < o  $C$C )gg&Y*|&||}@Z|I '8 r"#} oO K%; z&v&&)\&&U& 5$'@]%! M "Y ]]V]&Y {+P V P j '% U) %)4&t!g ( / ( O(S s !o)5 / Y( S s !oT$"$m%# ;M)%);5;])$' ;.f% QK5/"#-$fC bb '  Y(p $%i%&5#G#G%K4't#f$w?#?{t6#$"A b#X e %'H'(h ` (h(h6(h#.(!Q#!!g'  L) 2 %%w  S!r we ~#"  ` &( (  w::::b& { %o 4#(V(V(V#<()?&'V* $%2A!2 lP$"n'&JJA Mt~$.b' %k8!) @St!~ &2)cY*+R~Y) @St! &2#c*+R  / (( S s !o&%#Sm"r @  p)9q&c (Zq D # )5])"6(q$  #.8 ""}%>pD'("R'',#$>"* '%2 "nJ~.b'{ v(v]%b'2[[[g&&W"&' nG;tb! ' ;n >$8$, n<)?&'V* $xxxx$ $4!4'! 
)q3!S )[0 / ?$T  #3 e, !lo$Cm$C%    $C$C ) g&Y* 6{& ] /&")XYZ"]]&&Y( i&((q($}  '8 r"#} o K%; z v]  # Y]V]%&Y v , i ` &'*??6$?"A b#X e %( n8($"#N o K%; z  mz[6 mY m%6 !% m&{\'j%)- B'j'j #!O -k'j ( y#{!%5# HU8 )7e " &k $"<<< (<(  )vv]$$v.%/)P{/"6df m Y$f   A L $%_$r # | # r&i&i&i))w"FsVz&&ME &s E!v%)tK<_'=EK(!)&z f$9# + 6j+9+ " {j "y$Iz )F(8%W#",2&@@@;t  ;YV)&{\ "# oz% (M (L/gggZg:( $\P< << A t))<)L0$&=$`#&Q# (\ &F iP 98  P$ #A)  '#'r jK B `zjjj Ljttt9t't # #"v'v  1$l I # # )g#T G#& 9} m)q m m F mrK B '!&)Q<$$ ##" 8CmR,$96+y 7+g   ,H'sbr%U"sU&H' k$! ^ iH ( #b ###'*"U)@b &)FF}Z!X#{!%5" H&8 )% " &k %< i&z'j %)- B #!0!O -k(v&t t ! (#v %$ S'x&*&U&W!!!&1!}}}#}%2 mCA!2 lf m"n'&JJ Mt~.)]% D# "6 #$#"L)  ! 2F!s&2'2 ]V* $]] ^ 7 I] ^ ^];] L ^ (     /?) !o6 E[ !%[$#$ / ( S s !o6HP  t)))L\9 iP 9 99P$ )A) 9 '#(!N<Z /!:? (8!:!:S &es !:!o$?!z$?$? ^"|$?Df''[(V"'"5''V,# $>"* !'")ON 2 {%+-%% V P  j% $Cr' vJ&)\$C \$C  &&Y*      M#MsC"$m%U ( #b % ]q7 / oq"$T7 Y Z"#3]]&"] &&Y^%2 m $ m m"nJ F ~ m.'#rD Zpx#Vpxxppx(HI'".#!= I]%wm .Y V ] P]]%] $\&YMMM'M(v"S)"&w((v# Z(TX Sl' T*c !y Y3 & \ &EEEE]%m aYa V ] P]"]%] a$\&Y!: (!t!:!: (&es !:u i%u+ 8"uA#h  %, m}U!g''rn$ LK B 2% $WU'M'M$'M D $J(# )|5|])"6(q$  #|.['' ^ ^ ^; ^ z#!"E)!  r#"E"E!'#(_#"Eb   "u;tb-  ;n >$8$, nn!)/!!HoT D!e H &=V$96+y 7+g#% m"r t p)&c ( y&(Y q"v'&  1$l I ?#X e0'P00 0'r jK B (U -j|!2$ttt C' vq 9 $&v$C'v .$C)$C !&\!]]Y*%#m  $## %] # r v!& \ &&* @/ @ @#G1 @ W 5# G&$C'E'v%!$ #!!!.H)$C,!&\!]]Y* X zP bL"'M m mu $!y!y!yJ!y'/ &&M & &s !v%)tK W_'= K(!)&z fU\ 9  iP 9#z 9P$ )A)  '#  ](o D # !T"#"6(q o%k #z!9x[4!!4'e!a! )qe3 S%4)[ 0$R(F $v& t!(#v#*wXKXH>&B,W/'??6(a?"A b#X e %6 6u&&ME &s E!v%)tK<_'=EK(!)&z fH!'C Q +(0&n&&.& Z [&&#L' $* #q"-"-"-"b  /? N$4$4 $4!o(0v l00qxv0!'/es  ))|) C y y )b y <l u;'$  8m"#} o K%; zN(L/ql I%2$ A!2 l)$$"n' lJAMt~$.'!g''rn L5K B 2%6%, C),,,%?%H$?! "H$?$$?! ( `^ "|$?z8'8%n8 8 N#!inN!nnJnag ' ." _)n   &(X! Sl' K! ! *c !T !  y Y  XAr$Zf #"R&G( / , %~ "$ K5)1+!1++1V1 +%8! 
sN!g''rmtnt L K B 2t%(C'C Qf)b #3 +(0lu;& w&!2"XtB )t'-  M !B! $Zf #"R&G / , %~ "$ K5)# ##&j#(l AC(  ( @/&&$&  Y &}u&YgggZg %w  o o S> we ~#X(G eX)- )-%}DH# `)?#&'V* $(z((}($Zf#"R&G / ,%~"$ K5)<<-<o$$6< <g&z ) h (B!  :"  %I:%,:gi_)nd]%wzm Y V ] P]%] $\&Y  ]%$@$m #'Y V ] P]]%] $\&Y$9!g 6+y 7+ Lg 2%X# (-hDf("'," $>"* !" 8 v"#} o K%; z$9#  6+yuu{ 7+"yzu)F n((Y UG(O]  % M  Q %Y% !o]]](%&Y('"* )jC"%U^^'+^#$  2F2"&2'2 ]V* $]  #G&5&5$H!g''r j L5K B 2%l :H # #'H5' "$Zf &7"R&G   / ,%~" $ K5) 0www'% 5U) %)4&t!g V ,I ''##',nz&  }!F/(}'' H} vvl  1$l I$9#  6+y{ 7+"y Mz)F$C[v<a% a$C$$C`S 'x&Y*$Zf &7"R&G0 &+ / , %~ "$ K5)D z @D z zDD!x zE{% * ;^]  $   $Y$ !o]]]($&Y $ggg ,g9 w 8 r"#N oO K%; z] 'Y]]&Y!R6"A %y]  w'4Y'4]]]'4%&Y6+y+$ #|$ @ v$W @oo(oO'(h m$D$D'!H!] !$  !$Y$ !o]]]($&Y "v / mh] " ")&R yv)&R&R)b y'&&  1$l Il&Ru;Mm] I!_'. #!= I $9# # 26j+?# " {j 7 "y$Iz )F& Uy$(;'L"c""&I  obq 98'&]%!) M "`Y ]]V]&Y$qAp!2 l{$$"n'&JJA Mt~$.%v /f%%j%%#%(t ]   /%'a!|a%%Za%F%& (l "  -  $YD'(Nu #a9m$9t"}ttt\) /? (!t ( s !o $P\9 iP 999P$ )A) 9 '#%m Y V  P%] $\&{\Zx*1U $C$C$C \&Y*g}#}ggog  K! / [a& !Za^y&#L' & '#$'#+{&'*"U)@ 4%Pd#b(   'w$*p4 GP 0 " 0Adt"DI)))L'&]%!) M "`Y ]]]_'&Y((' ()   i))D'o(> 8V1  D  RSV VFoF  U Uq Ut/ G%l G GH )R$ t G[(u~[%'>("" ,#Q 8 ~} O K%; ]$7&Y6 ,S  (v(B E{$ R %_ "$x$ A[Q' #(X Sl'*c !y Y f"(z (z(zW2(zv&D R'#U ##4&!$!g#[)z,1zz ` z 2 }'Vk0'\4!4'! )q3!S%4)[ 0): W"q'( L $r }5}6"A'??$? b#X ej'?"W#f t?#?7?::6#$"A! b#X e: %t/ G% G GH )R$ t G[ $.`i"1( ''px#Vpxx"\ppxD R :%u+ 8 $Zf%&7"R&G %%/ ,%~h"%$ K5) % n l$XY$X($X&{\L0 C LL C,& yZ$Zf &7"R&G0 / $, %~ "$ K5)(h#."*& $C'E%%'%$C g&Y*# j@j j"{j"yI S z)F\9 iP 9#tP$ A)  '# ( /!:) ((!:!:S &es !:!o( # ##^&j#v s :)#a9 ?)$)9)[ ![;"#}" &"&I # # g&"' '$|$]  Y]]0&Yf' W!"@Vq w ) _"P &|]  Y]]&Y(u@~`*%'>(6'q'q~'q%I )- -k v)M -'6/% t $C( vR'C Qa.va Vv$C$$C $< +(0S 'x&Y*$"&RvAA)b'v"  1$l IlAu;'H'#{!%5Z" HA8ZZ !)Z% " &k %<'%# U## %t!$!g# #!D!D!Ds# px#Vpxxppx' W 6{& ]9( ]99 p])] L9&56#.#a9m&9$'"$) M)()`5`])$4 %`. 
+ x!8!!<[!'"W#f?#?"W"W6#$"A" b#X e"W %6 , 4 3  & \  &&#(J:" (o)dfVV"5'V !"nN!nnJn(l} -$ ^Y Af%$ c%U"sU&H' k$! ^ iHW|r\K B mmmmkg"&Rv&R&R)b'v&  1$l Il&Ru;%w  o o S we ~#&"8&. D # )C5C])"6(q$"{  #C. 1$l I @O v)K !$W @r & \ &h  ) ;050])$H 0.ttt9t   2 M 2& m!T%%0% #2"# oz$9!g'6+y 7+ Lg 2ZI%& '! !)[& mI"";C ".)b "uE&4a!m'!!Y(p! 6 2 M 2mO] %( #bu. O -#OOOj%zm Y V  P%] )$\&{\"'#U)##4&!$!g# / ?$T=  $T , %L!lo'M 'M'%# U## %4t!$!g# :'$'i U))Bi %)4&t!g $Zf &7"R&G0 / , %~ "$ K5)} $Zf &7"R&G / ,%~)y" $ K5)[)$C( %'C Qa|a Vv$C$$C $< +(0S 'x&Y*)" H%#1'"\/')!q \"\^ z)#g0*"C00&0 1'   Z#D %$%1& # ' (8#",2 c|s|| :|+_Y&{\ D# "6 #$)u / > &LVtP$$?$?$?^"|$?!H&v"&&)\&$C'E$C%!1$C$C$C g&Y*!w $9# 26j+?# " {j 7 "y$Iz )Fv$C'E%%!1%$C$C g&Y*!   d"D))K /$|' [&h$|Z^&#L' & & #!W##W/$~#n 6{& m("r ?(( p) W9(q&c (q S g&"' I&^ ^ ;/ ; ; U'Id' &E)&%?% {%+%% V P  j% %2("nJ ~.&evqx%2"nJ ~.EEEE', t5&M j#?> (X Sl'*c !y Ym!5"r !5!5 p)9!5q&c (q&E @O v)K$W @ " ""g   "( 'C Qf V$  +(0S'x&* iA i l)D(B#5$9''#>#"";C "";)b "%uE&{ z#!(! m r#!'#(_#b   X$$$\9 i 9 99 )A) 9v ""Tv "`vT """(L/(L/ql6= 2(eQ% M)6X  M&Y&F&&{\ y! 'Je("$""|G "|"|m# &&"| K$| [&h$|$|^& #L' $|  $?)!  $?$$?! ( `^ "|$?zV z!(!  G r!'(_4'%b  Y(Y++ +#"![ m  |#(#%k m " H WHHH('&]%!) M "`Y ]] ]&Y "v / tt(mh mK|&s)' > /!:) (!t77 ( s 7!o * 6'H5$>'h>%_>"$($>A ," ;M#7M 1$l I    !g''rn$ LK B 2%$9# + 6j+9+ " {j "y$Iz )F0vQ yO.[&)!/\!](Z/Df''[("'," $$>"* !"#{!%5Z" H8ZZ )Z% " &k %<!qb  : "&?V Lc t'~70 bbH  \$C V= $C$C g(&Y*(C'C Qf)b #3 +(0lu; i"]"])i )Q%&!g$A')$An$A L9 2$A%k&.&.&.&. vv]%b'HD"6D"ND D m"'h# T(  S e ~# \& \ $?$?$?`^"|$?z_!_%HG); c"^} |$^!}KX /$|d [a&X%D%DZa^y&#L' %D& J)&& = = = =8 r}O K%; Z]!% ^z&%A#G$#^ /?# !o"E.)8''"""%"a):%w   o o( S[ we ~# t Df''[(V"&V'"5''V # $>"* !'"1 2",2&X)__ _  ({ l &a )4Zf"R&G   &*$9# 6j+"{j 7 "ygIz)F  " d<<"&!I <$8_!$!$ ""}$ ,lIl"$ #|$ (XK Sl'$OKK*c !f K y Y 8 u(S7r" m) '  / (!t ( s !oW J&$v'%Z$C  $C$C &Y*?#X?8#$# }()}}}1,W1,,1{1 ,4_!U%4(!4'(e! 
!)qe S)[0)8 S' 6{& %!3 (H$ i 9d 9A) D # )P5])"6(q$  #."'$(x {) %w  7[ w "%$% # %t  $ f_'8g!\9 i 9dYY 9$ cA) Y'#%<<< (<( !gr'rn$ L5K B 2%OO'O c (& tF't C' v%$m  g)g z!#)!  r##!'(_ #b    $r'Y(*p''''$#u'2)?!2&2'2 ]V* $] 3Jt  O'*"U)@\  iP 9  P 9$ #A)  '#)$%2&%&!&&IO O  OOj #&#&#&"V (o)d$R(1 -n]  M Y ]V]&Y%#  %t 3 'j%)- B'j'j !O -k'j$ $ ]$ ^ ^ ^; L ^ ; ;$C("avR'C"a Qa.va Vv$C$$C $< +(0S 'x&Y*(%URU)&H' k$! ^ iHH$?! H$!  `N "|z'??6$?"A b#X e %6#66:6 l7(L/( <'C Qf #$  +(0S'x&*JH W#59 V" !   d"D)) 8 r"#} o K%; z` l d /? !o w>w%0 #"ho  #x3 & \& R  )4@ [_)n"' ) ?'M#M% ; ; ;`$ ; dS#p M) )5])$ .'  '.Xw c'W D # "#"6(q o #z)P"_'??x R$ b#X eY&{\U'.'&i&i !a$!a$&,x HS 'xxrI' #!= I4S#G' e!)qe3 S)[0  :%2 cT S'j %)- B'j'j #!O -k'j(ylE D $J(# )$J5$J])(q$K  #$J.&&ME & s E!v%)QKp<_'=EK(!)&z f" $ !$ z( z z& z !!@e ;9" ; ;`$ ; dS &&M3 &s!v%)tK >_'=K(!)&z ft/$tH $ t[%LfV oV"5'V (!"yZ/$p!%%%%& Ux!E{% YYY'R#DYJg(w)br%w  o$; o S! we ~#>>> >O $#O   0nOOj %2CA!2 l'"n'&J mJG Mt~. /!$T=YZ $T=&&{\K F(i'??$? b#X e# %E#&j#!)`)`E)` }!!F!}5#GGM16#.(s (((&'>&x#)D:))z[66c !% iA) (\VZf"R&G %m Y V#  P%] $\&{\ z z z z m' |#(6%k m" $ !&& ( L &))% $$ &*) M)()`5`])$! `.d(c%2("nJ .~.] ^'Q I] ^ ^];] L ^']  ^ Y]]V]%&Y q&X d k' F`,Y<$$ #Lz0 C Lsz$L Cz( ]C& %h m hYh V  P%] h$\&{\wm"r $ p)9q&c (qv"'O!j!"&w((v(<<<-<$9# 6j+ %"%{j 7 "y~Iz%)Fr&;v!&&&U \6&&&*('V(' &('  l$lDf'("'''," $>"* !'"R%2$A!2 l)$$"n& lJAMt~$.)g%u <$$ #2F2"& 2'2 ]V* $]! u!   &] ^ I] ^ ^];] L ^<% -9<% % #< <% %2LA!2 l)"n' lJsMt~. )- -k!#I!ZU H&j%$Cr' + %!$$?!$?$? ^"|$?#f # #6#"A" %)r)?rr&' rV* $~#GGM1U !g4 #)44 43']"y"{"yI#05$Crv#&U)\$C \$C  &&Y* !/ ` & P 8  t"D)))L  o ou o oSe ~#'}(%'$_Cm("r ?q( p)9(q&c (q'?"W#f t?#?l?6#$"A~ b#X e % 1 X)Aqk G#%u. #{!%5Z" H8ZZ !)Z% " &k %<d$[&%?$*= X &~a(+Ja$a%a` Y&*&i&i&iyZ ] u /&"' uYZ"]]V]&&YY n)4 #;!s'.%-##f$#6#"A %"????$*xmmh8 u! 
@%Se9` @ @)qeS1 @0 '".'"W#f=M?#?''"W"W6#$"A" b#X e"W % $D$D8""}p!: (!t!:!: (&es !:('V('$ &('  yC y y)))b yS;l)u;''$~$%6Q "v "e/  w K ]  # Y]V]%&Y#;f% 5/" Y$fC1+ 1++%/1V1 +i M(XiiE{ giX"Xv&9 v$8*!! ""} ,$9# 6j+"{j 7 "ygIz)FfTff# ff C%2A!2 l"i"n& &JJG Mt~.&m !EKD 2P0>00 t0)Lv"'O!!&w((v(EEEE()f%w &h o)_ o S> we ~# /'RRZER&!ww)blwu;#_ ` l d&c|B||F :|_"ii&= T  @!*" h /&i\ $O 3m("r ?(( p)l9(q&c (q z#!"E)! $ r#"E"E!'#(_#"Eb       |( 'C Qf   #$ Z +(0S 'x&*"$Zf#"R&G / ,%~$ K5) D # ) 5])(q$  #.#z z!"E(!  r"E"E!'(_"Eb   (r)(r(rF YBZ(r|||;#  #)Z $C'E$C)M%!1$C$C$C g&Y*%2$}A!2 lQ$$"n'&J lJA Mt~$.\9 i 9d-- 9$% A) -'#v Qy.c)!/\!](ZP/!"!&k!  k`)qS&v#&&)\&G y47!2t!2M %" % n l%Y%(%&{\&"*1!"[[[!{[" rv!)%)%& \'S)%&&*#> &L  t t]%ceh cm e Y  V ] P]]%]  $\&Y  (4"]##(Ko  '.!] ^ I] ^ ^];] L ^rKr  Se ~#t)))&q$$$% U V (pf!: (!!:!: S&es !:u /#&//x#/'( B#lH!` 7\)&`)g4 G#& P& XA]7 / o"$T7YZ" #3]]&1]&&Y<@]% M qY ]V]&Ym "&*6= 2&'(eQB M)6Xmmm#m( i$) ^p ^ ^; L ^ !: (!t!:!: (&es !:&vQ y.[&) v!/\!](Z/D!gr'r!n  L5K B 2% '%$9# )r6j+ "{j 7 "ygIz)FHo a T D!e H ] ^&] ^ ^];] L ^] Y]]&Yq"E "E"E"EP$' t ] )L%{* c$Zf &7"R&G0 / , %~ "$ K5)#( R'C Qf #$$< +(0S'x&*$T%%%% V E)" ! 'Zaaaa(v&.Q yO&.&..k[&'Q)&.!/\!](Z/m]m  ! &C &a )4%w  [ w "% $& D ) r b$T?Q$%_$  Y u&Y    &a )4mR, P V P!, #S?P 8 < t))<)L&02!jq'''$$#)$D)I)!,X )O  -(^O   OOj (w&T!b&T&T&TV J g'#$'# #{!%5# H8 ) " &k m"r  p)D%{&c (  _!< &|P 8 t)L#z!#Z!8ZZZ#h'v @.')!\!](Z "# ozPm("r ?(( p)9(q&c (q!#V!?M9"&Rv)b'vr  1$l Ilu;("'C Qf)b #$< +(0uE&*# ~ XAFr )8'] 6zvw6Y6 V] P:]6$\&Y1#]!#wHH\&q##  / ?$T=  $T , !o$?)! $?$W!  `e "|Wz^'fVVV"5'V W!"c q"% "F& I :   w$(!n UGn D $J# $J$J]$K #$J.#& HHH`Hz. 'x c '4Y'4'4%&{\XDf''[(" ',"  $>"* ! "$C $$C$C &Y* @/MMM & %?)BD(l W - $YD'(Nu & O'e('& &&;&fV' !V"v t(#vM] j]]] L'lH5HHHH(H !e H %' l%Y%(%&{\"'rK BrAAA]%m &GYG V ] P]k]%] G$\&Y mqp!  "nJ&\ ~ .", & /!:) (!t// ( Fs /!o v$ vu. XL^I'I !p ,&MT'h<% -9% < <% 5)7 h ]%! M "!Y ]]]a&YH# t/$tdH $ t[#{!%5# H8 ) " &k ) ^^^*&_| #;T);g;! 
@%S @ @)qS1 @0 Y'  |i>#=<>>> >& #{!%5Z" H8ZZ !)Z% " &k %<     t%!xc;|s||& : |tB't C' vV z!(!  r!'(_4'%b  f$Df''[(V" "5 'V,#  $>"* ! ""]!,*1m[!"     % M l%Y%(%&{\Df'(V" '"5''V,# F$>"* !'"R"p #`&/z 8  D  RS' % )&#{!%5# H8 )% " &k %<XU(>G eX&i $O 3 j L!T@@&W@ j$ A#|) _# v8 i 9 9A) T#& C$C$C &Y /? N$4$4 so$4!oo D (D(& mI%%_l|e!g''r jtt L K B 2t%^ /!$T=YZ$T=&&{\A!2 l)' lGMt '''  0$J 00 $ 0Adt"DI)))L'#f$?#?6#$"A b#X e %  Y  '&Yf$H$()- -k) M)()`5`])$&9 %`.](Y]]]&Y%]'v ' y  ' .')"J !\!](Zm%   ) g&*6%(#(X Sl'*c !y Y&T  %@((j((l(% "# o"z']  BQ F| }!!F!$s(s(s(s[' V z!(!  G rX!'(_4'%b  a>(%p?Z u&:&:&:$&:!f#! ##&j#" v$C'E'v%!)M$ #!!.H)$C !&\!]]Y*%#m  $## %] # gm("r (( p)9(q&c (q*$%2!Y!!Y!Y"nJK ~!Y. W")P""c: # !T!!!!g''r j$ LK B 2% n+j( ( V Pu$\M%q( $\$\$%1 # ?#X e#{!%5# H8 )% " &k %< A%"$(q!;(!;!;!;r m(%k m!KKK+N'-'-N'-nN!nnJn  #{!2%5# H-82 ) " &k {^^'+^$! !  ! $! `T zx XA P+ "& "" <   " & !A%)- B!A!A #%!O -k!A(w $&z c~ ` {& ;" }!F)))')" m*"&]%*1(iW)! W$W!  `e "|Wz}U/ #;W b)n[~'~    P':V J v$C'E%)M%!1L$C$CX tg&Y*`(z(N & m$Ktt't @O vP$W @  # pjE"{j"y)Iz)F ^& ^ ^; L ^)5)5)5*)5v s"9"9"9'3& #$  )  ) )$ )!!! !V1Av"| n((Y UG(O)- -)- &)]%! M "!Y ]]\]a*&Y)%#%~]) & )"$Zf &7"R&G 0 I/ , %~ "$ K5)!2it [4_)nX   nN!dn!*JnE  K T!C)))b!'l)u;'hf&h" ,#& :F] /&"$TwYZ" #3]&=]&&Y" \  i 9d   9$#A)  '#'%U %4t!g %!#o  K ?> &L @O v=)K$W @I>' r&#v!&&& \6&&&*7@77' "7m"r $ p)9q&c (q' 3U)) %)4&t!g '.v "Tv "`vT """"%""O7'"OYYY%1111 tv, N}ZP"(c[ S'~%j'?"W#f t?#? 5?"W"W6#(a"A" b#X e"W %(N d")2!)2 yv!)b y&  1$l Ilu;$u(r!}~'~)?&' r BV* $%; 9 i 4` l dS+#Z )[  e   I&8" 8X S#m"4$'! $$3!$+)[$P0P0 9 9 0t!lI 9)Lr & \ &D'("'',"$>"* ' Q + Q&%)3&%2 l$}A!2 l l$$"n& &JJA Mt~$.((3#a'79m&9a* 4$'!%$$3 $@$+)[$j(##;Zf!#"R&G !!/ ,%~"!$ K5) L(u~I[%'>($))Y/'K'U/'K'K'KX&&[#a9[[9[ ('hf&># @jBj"{j"yIz)F^ FF)?s&'V* $!& m$K$C( %'C Qa6a #v$C$$C $< +(0S 'x&Y*'$*p] ^"p] ^ ^ ];] L ^K /# [a& &Za^y&#L' &  ((("# o)(z!h K(' H} 'q$  $ H!e H )N6"A +3 + +&&! +"4'% U %4t!g (T#m$y5$&5#G#G%K('1v#D'??$? b#X ek $!  
&a )4cA($#$$v^^t/$ t D # "#"6(q o #z ))|)\  i 9d   9$#A)  '#~~~( $Zf &7"R&G)'0 / , %~ "$ K5)&m  )g @]%! M "%Y ]]))]!&Y$4't#f$w?#?$1t6#(a"A b#X e % rY   &Y fV "5'V !" q u - %w  S we ~#! x4!!4'e!k!)qe3 S )[0"7 #jz ds)) /&$T=YZ $T=)&&{\ &~( R'C Qf #$x +(0S'x&*f P 0 < 0 dtI))<)L='g&a)4111` l 1&f- H[:@E:::b&    '7( L $ &!&& c%] &i \ $O 3 -  T'%#`U %4t!g   " #{! H8!&( | )Cp @O v)K !$W @ "# o  W"qz;tb-  ;n >$8$, n P <0  << 0AdtI))<)Lz ?#e"%w  (S[ we ~#''L&=&^ !e%Sed`)qeS0!e] 7b]]] Lc :(!c!.   )/ )^)))3:((t 7)($%)3#n## 6{$_# PP&P$`$`P$#&Q$` '##( /!:) ( (// S Fs /!o YYY'uY]  Y]&Y((($P w##=#[)z&s; [$ ;;_K;!g''r"n LK B 2%_v&.Q y"_&.&..[&)&.!/\!](Z/$C('C Qa@  a #v$C$$C  Z +(0S  'x&Y*( 5 U"v'v  1$l I #/'*D)@)}l] ^] ^ ^];] L ^|*1)p&T Q# +(0z0 Cz CzN:#W X(>G eX"#l)&&$ &#$T=$T  :!h Ny(L/ql:!  s (_"[j' >) / $""|G"|"|m# &&"| b&$C$C $C$$C g &Y*V!n!n !n22(u#!8 @ {S(tT! {(( &2 c#(G*+R 1111 (;&&M3 &!d<s!v%)tK ?kK()&z ZD R\\'%&#U %4t!g   #  ^!h { ^ ^ @ Py ^ !C# DDD D%n#]#^'j%)-"&'j'j &g!O -k'jum"r O p)&c (# V ! %2 "nJ~.\\\_%h D# 3"6(q #!b& ` {&]] ^ I ^];] L ^M'U$" m# && "nJ  }(Q&Y I#  /(6 #.wN m' p W m $9!g'6+(n 7 Lg 2I%"& (Z""*1 B";! w'  kS)qS]7  !A%)- T!A!A #%!O -k!A(!%  Nm'lH5 z#!"E)! N r#"E"E!'#(_#"Eb   #"&Rv)b'v  1$l Ilu;L(LL4 xa*L'r jK B  ~$?$?$?^"|$? " g&&' $Cm$C% $C $C$C )g g&Y*X+( ( V P ( $\vQ yOw.k[&'Q)!/\!](Z/ #[ s!(r)u)u)u)u)?&'V* $(]    "#~)?H&'V* $6++%'mC wm] '!p D# "6 #$C( %'C Qa *a #v$C$$C   +(0S 'x&Y*VqI'#!= I$  %; yZ YPJ0 ? 0 d$t"DI)))L( | 'SAL]&~ /!&~"$ToYZ"#3]]&=]&&Y`m%] )$Zf#"R&G $ / ,%~"$ K5)&=!g''r j$ LK B 2%TNN ;N'${Cu'$$' '$% '8n L 2%"! yv!)b y&  1$l Ilu; V%(C'C Q y$])b y #vb3 +(0lu;(l( .W -~( . . $YDB .'(Nu S @O v )K !$W @[,"Z$?$?W`e"|Wz /?z !o l((V0"nJN%%%%% %%  c): VS4!'! )q3!S)[0& 'HXC'*"U)@ m(*'MH`! % $ ! `  z   j 'f 'f'f"'f*fZ$s )]#  cVy(" yy(y${$4oho'hf# $ BS%S?> $C V U $C$CT (&Y*! I#" /('# C{"y.z)F!# /LV!?'x c%Q /Xa!44Za4& i!!&k! D  # )y5$J])"6(q$K  #$J.s sss# M ,'sb ;"'L"""&I &&M3 &!ds!v%)tK ?kK(!)&z fD&PdbI $3'R .  
L#!= I "]K"] $Q {\\##k'v  @ .'%)$M!\!](Z=\$ 5#Ca*QD%DDD!x$ P P P a P(##f ##6#"A % $n Y # L%$R(m ] X%!#LV!$4%%%"Z^j%2_v& t9(#vy c' I# dU )I)) %)4&t!g) {&7'*"U)@ 4% 18^  ) ) ) ) P <0   0 dt zI)))L#j#$9# C6+y{ 7+"ygz)Fv y.')!\!](Z"V#F ](o }L! w' (V W/ OOOOjP< 0 !k<< 0 dtI))<)LK & /# [a&!&''Za^y&'#L' '& !c; |(") ff"F&I : f!#)D!D!D!D& ?' ? ?> ?;& !4 "&""*1 B";+9(s  3Jt  O z#!"E)!  r"EG!'#(_#"Eb   ] &- -Y ]]V]&Y 0$J 00 $ss 0 dtI))s)L H %% %%&iO&i$C'E%%!1%$C$Ca(g&Y*L((LL 6$0L& %((((b (HoaT D!e H D# "6(q$ # o^ ;/ ; ; U'&w] !%'  !%Y% !o]]](%&Yt/&,$ t&4!$%S'e!!`$$)qe3 S$+)[$0&5&5v t(#v$9# # 26j+?# " {j 7 "y$Iz )FR )g#T#& g& 2& " & '  #&[(6= 2(eB M)6 : wn L 6  2% M(dY(d |(d&{\(5  V P]%! ML!Y ]]V] &Y$?$?$?^"|$?"HG); "^ d 3$$^! d'$&  %( #b&L_)n%(!f!$<[!t $$$U P'"W#f t?#?&"W"W6#$?"A" b#X e"W %X#4  $i "u. XL  /'Z&r;& \T&&*]%! M !Y ]]'0]&Y(  ou oSe ~# Q  +(0d%pC%p%p}(%p   %y'.m8(c # EM[  "v /    ( # XAFr /&YZ&&{\Lb \Df'(V"#+'"5''V,# $>"* !'"c; Ts &&  :  S  & {+ @@ V Ps j@ $ZEf #"R&G( / , %~ "$ K5)$C(" "'C Q " 5 )b " #v$C1$< +(0u E&Y*f[Lz0 C Lz$L Czs &v&&)\&w$!n'??$? b#X eDf''[(V"&V'"5''V # $>"* !'"&l!G m7"o ( %%= <$ !))  E" <""" "m'j %)- B'j'j'#!O -k'j(I^^ D# "6 # $ D # )P5])"6(q$  #.2 n8"#} o K%; z4!4'! )q!S)[0'v y .')"J !\!](Z\\#|If!+ VdQ' %"/g" 'E% Y&* VE)" ! 'Z Q Q Q i Q K  / (!t'p'p ( us 'p!o D $J# XX]"6$< #X.!/  ` & = T! Z PCW$$`#"|$z  n8  "#} of K%;  zv "Tv "%@vT "   j $Zf &7"R&G 0 I/ &, %~ "$ K5)  'g Z' .")))5)+&&#M!P& &sF!P!v%)tK#_'=!PK(!)&z f)%))) B$9#  6+y%%{ 7+"y~z%)F=?fVVV"5'V !""E"E"E"EC y'@ y)b ylu;#& A!2 l!'&JGMt'W]%ceh cm ehYh V# ] P]]%] h$\&Y(u~`*%'>(Q"O"O"O @/ @ @1 @ !xIE'#!='z! ;(! I r ; ;!'(_ ;b  nn n#_` l d;  & d&:a$2&MC)U&)??&' r BV* $5#""""  ,'$Y(*p a /'a!KaZa& If%%+d%%%% Q' %%g" #  rK Br)&Y1Y&Y$9 '+  '6j+9+ " {j "y$Iz )F&&M% &s%%!v%)tK_'=%K(5!)&z f})!TI) [fRff!f$?$?$?(`^"|$?z%2 $IA!2 l  $$"n'&JJA Mt~$.H;"^ 0 C CN$9  6j+J9"9{j 7 "y sIz9)F V\|N%JK /$|d [&hZ^&#L' & $ !&&#$'#$F#{&!%5# H8 )% " &k %< oZ^3  & \ @( &{%%' 'I$3'. #!= I4!%S'! )q3!S)[0 9]%! 
M!Y ]V] &Y F*#%i"##'q'q$%'q%  8"#} o K%; z%c|)V||F: :|&%' $%uYm}Uf' !"("(C y)b(<l yu;   '.Df'("''' " $>"* !'" K^z%hzm hYh V#  P%] h$\&{\&]%! M "!Y ]]]$ &Y$K /$|' [&h$|$|Z& #L' $|& 'h)'t~  A ;"'L"""&I $C$C$C Y&Y*))O - spO 0 OOj) 6 t  O^ ! Y)&{\] ^ I] ^ ^];] L ^J  (2  V &a )4D"` l dS!$#$L$)kV!$g+&$ !!)#a9))9)m'j %)- m'j'j'#!O -k'j("!g''r n LK B 2%Ho a. T D!e H  i ]Q]k !a$!a$xS 'x ^ ^ ^; L ^!g'$$ L 2$%(X! Sl'! ! *c !T !  y Y O= /?$4$4 $4!o ( rK B  K "%w C [ w iA= : (2} : : V n )4 : "# oz(` X! yZ Q  +(0H] /!"$ToYZ"#3]]&=]&&Y Z& PG(h(h(h6(h#.((  [$Zf &7"R&G0 / , %~ "$ K5)Zf#"R&G  / ,%~$ K5)??#X?F#$'# 'z 9c; )s &  : V)] Y]]]&YA$L'(! t4! @$ 'e!  @"")qe3 S)["0%N%N%N%N$$$##=#,  ) .f 5/" Y$f {+ V P j %#m ##  %] # "")b uE&*(n(C'C Qf)b #3 +(0lu; M$C'E%%!1%$C$C%( 5g&Y* b&&##A(((_(#Q #)?&' ]V* $]!&&:%:$ XYX'( L   Y u&YO -?O nOOjS(&l"z(&!\l(&yZ #6#.h&] ^zw A!] ^ ^];] L ^!g'n L 2%I rY (  &Y]% M "Y]]V]&Y$T!%&!&X)K(i)4" d k' d'hf#7s $)Y05$' "O"O"Oz Y  V P$= $\&{\Df("'," $>"* !" #G @/ #GMMM %'M'M$'M6 66)8X6&$C$C $C$C$C g&Y*$<% )) %)t mmmm& ' D z @ zDD!x zssss# M]p / ":"$T Y Z" #3]]&"] &&Y(ys& \#u$D ?j???$*\ 9  iP 9$ 9!!P$ ~A) ! '# z$Cr&#vr (&&&U)\$C \$C61 &&Y*'D'D'D m'M'M.7$C(" "'C Q " )b " #v$C,$< +(0uE&Y* t ''n' )]$.(5${_> &L[$|'  u%Q}("<% -9<% % #< <% 'v ' y  '.')!\!](ZTT)\ & iP 98  P$ #A)  '#  #6 l &a )41"   %|g /' c;|#}"Q||"&F&I : |"v==)b'v  1$l Il=u;\9 i 9d99 9$)A) 9'#0 Cd CN!A%#9)- T!A!A #%!O -k!A( m]m$R h%_"$$A",2H& $ v @.')!\!](Z)H'\  i 9d   9$#A)  '#+++V + $)" H%#1'"\/')!q \"\IK2]  M P &Y& ]]F]&&Y\ ]\\'( ! ! ! $! ` z6$"IHG); "^} |$^!} d k duu)Tu r Y (  &YrtK B  mm"# o!mz," 85# G(X Sl'*c !y Y!7!& #7  8m#7"#} o K%; z'Z 6 X\9 i 999)A) 9 )C8 b} O K%;  e A4!$ '! "")q3!S)["0v s%Y%Y%Y$$$/$8"1D RSX&`"&R yv#R&R&R)b y'&v&  1$l Il&Ru;  q)4;"'L"""&"I - &o&o$%}&o !g''r j L5K B 2%6%]z(Y V] P]V]$\&Y <% - 9<% % #< <% %{$uvTvvT %2 m m m"nJ F~ m.%2H%HH"nJ ~H.  )r&#v!&&&U \6&&&*I%f%f%f(%fZe Q Q$Cr;&$C \$C g &&Y*% )h( | !#LV!] 
Y]]&Yp%x f]p / ":"$T Y Z" #3]]&"] &&Y(""'C Qf)b #$< +(0\u E&* (1N"&{4"]#f#(K#k\  iP 9  P$ #A)  '#* V!!;!(&*&& ?&'&)%R'!dZva @A.'%! $*&(ZK /$|' [&h$|$|Z^& #L' $|& G%8) ))5)+ HP$J0 $ 0 dtI)))L(E# n8"#}  o K%; z4$I'!$$3 $+)[$ + PZt D" R!A%)- B!A!A #%!O -k!A( /(etm' m"&&ME &s E!v%)tK<_'=EK(!)&z f [W1! [U$U!  ` "|Uz% $!] j]]] Ls!#&u& "8 r}O K%; > "&Lv s $ $ / ?$T=  $T , $!o =&&ME &!" s E!v%)'5QKp<_'=EK(!)&z f$ g% d( k' d ^mhlh'h p Wd mh 7HZ$f v$C'E%)M%!1L$C$C g&Y*] /&"'YZ"]V]&&Y'{$z'''`L)D%DDD!xK 2!JCj 'b (] o" o(](] oSte(] ~#  M ']  Y]](&Y(IA!2 l('&JGMt%2$#A!2 l)$$"n' lJAMt~$.WWW`e"|Wz%& m '8 r}O K%; ](3$k(((# #{"yz)Ft'~ 3  "`# 6 , oo$o(X! Sl D'&! ! *kc !T !  y Y (m(u~`%'>(Hoa 8.T D!e HDf'("''," $>"* !"@  o oS e ~# Q +(0 m)-!$Zf &7"R&G0 / ',, %~ "$ K5)% ] @O vF )K !#$W @B#9]  q]  c] A] L tK6''''Y\&{\ / {+@@ V P ) j@ ($js D (?# ]$ #.m'j %)-${m A A #"l!O -k A(}"`}} C/}K /# [&hZ^&#L' & @ >&' &!eY (r'j%)- B'j!A #%!O -k!A('% U) %)4&th!g   n8$"#N o K%; z] !$  2!Y !o]]](&Y'*?? 6$?"A! b#X e %/!!!z!%'t#f$w?#?$1t6#(a"A b#X e %(l "  - $YD'(Nu &&M3 &bs!v%)tK >_'=K(!)&z f6 ,oo$o..o?u $r (%URU)&H' k$! ^ iH / $T%  #3 , !oLz0e C Lz L Cz3 & \u&Y1'CYY 'R#DY 444)u!wr)u~)u )u$Zf &7"R&G0 / &, %~ "$ K5)]%whzm hY V ] P]]%] $\&Y#,<0 )a<< 0 dtI))<)L+(1N"{  #:)n#&> !g''r j LK B 2% ( /!:) (( S s !o 4F 4 4~ 4!: (# !:!: S#&es !:uvQ yO.[&) !/\!](Z/ {+ @@ V P ) j@ ( /a /'aa!4!4Za !4&  r%|%%n%r(rWr&X''%w!#"$ t( ta#U$"m# && $, $9# 6j+"{j 7 "y MIz)F&M{3 &!ds{{!v%)tK& ?k{K()&z #"/#,<#]0 "M<< 0 dtI))<)L ] 9(o<# D #\9 i 9d99 9)A) 9& (& 2 M 2{)4 | y'% MY'0&{\#!g L 2% $ M l$Y$($)&{\8! ""}gss n kz!(!  r!'(_b   z z z z%w  (S> we ~#  t/$ t"("*(.&@%;4G m]%wzm Y V ] P]]%] $\&Y &#~#&& &S$6&F[% MYV&{\\9 iP 999P$ )A) 9 '#!' 's^v y.')!\!](Zva @< (A.'%#! $*&(ZqqqGqC)blu;c\9 i 9d9  9$#A)  '# %' Ghi(%' WvX    y yy'*"UyB2F s2&2'2 ]V* $])#_!_#(z!(!  r!'(_b  $(( 8('G'G" t %t( H~h%)-!O -k'vI(X! Sl'&! ! *kc !T !  y Y & &&&F =Lc;|s&  : (T#((&.&& / (!t( s !o n]7 / o"$T7 Y Z"#3]]&"] &&Y#|( Rnm%"r$ %% p) W9%q&c (q%y'.! !  ! $! 
` "z  '!h {# ' @ P !C# 'D'D'D### 8#A$" Nm# && # ';+; #}" "&I )5!Z)5)5 e$*)5Pdbj] !$  d!Y !o]] ](&Y))%U&)qS%2$A!2 l#$$"n'&JJA Mt~$.)!:&T&NN ;N$?$?$?`^"|$?z] ^ I] ^ ^];] L ^$C( vR'C Qa'va #v$C$$C ' +(0S 'x&Y*$d$d$d#$dO  -(^O   0OOj F  ,#%Q 8 r"#} o K%; z.) ))r yQQ& \Q &] (] }]U] L ' /? N$4$4 oh$4!oo \ ]\\ `&&%^!?$ ^( ^ ^; ^"!N y!'Je$$96+' 7 gg %(|S !i @!'j%)- B'j'j #!O -k'j(Y1CY'R#DY "IpL6Tv $9(96+ 7 gW/u% u uY  '&Y3r%^^ )z+@&s&" U"!!! / ":$T= Y Z $T" &&{\$9#  6j++ " {j 7 "y$Iz )F $C( 'C QaE  a #v$C$$C  p +(0S 'x&Y*66 : v$C'E'v%!)M$ #!!.H)$CS!&\!]]Y*!d ?!d}"&rR}}% }#$D4  #^%K# ## h(W]  M Y ]]&Y#)8'(X Sl '*c !  y Y !/ ` & "O 7'E!%Sede)qeS0%w  [ w )W/(ysO "\ -(^O  OOj 8'I!""}&* &=&&i $O 3W @O v)K !$W @^/ #Df"(V"""5'V # $>"* !"'  U )N %)4&t!g c# #!"  ] ^ I] ^ ^ }];] L ^ZK$C a%aaa$C$$C Sg 'x&Y*G  :$C $$C$C &Y*g#;  "v'&c  1$l I G6 T (#Z +%-#) "%AcR %"&MT&=L%Tm;"r ;; p)H9;q&c (q%w  o& o S[ we ~# q$  "D)) ?m #(O((( (Slb#N % n l%Y%(%&{\t43]%C&n$$R%_"$$Ac| B||F c :|_("* % % % % $9# )r6j+%!"{j "ygIz)F# j@j j"{j"y'I Sz)F $C( 'C Qa?a Vv$C$$C   +(0S 'x&Y*# o#oo{"yzo)FT]](]tt'tbb'bv& t(! (#v U]%(&P4!4'e!&!)qe3 S)[0&&M3 &s!v%)tK_'=K(!)&z f#{!%5# H8 ) " &k ^ t,$ tmR,)'v y.')!\!](Z!!!(7![4!!4'e! !)qe S)[0) @St!~ &2)cY*+R~&&&&a%@'j%)- B'j'j !O -k'jkr yQQ& \% {Q &$C('C Qa?a Vv$C$$C   +(0S 'x&Y*)X&' _% $Cm % N  $C$C )gg&Y*W1! #W$W!  `e "|Wzg 6gg ,g9$Zf#"R&G: / ,%~%,$ K5) Y u&Y& $C"$C" " }$C)b "$C$C guE&Y*" /]i]F] $(]r#mK B ('$9 '+  '6j+ C+ " {j 7 "yIz )FP< << t))<)L$  g] /!"$TvwYZ" #3]]&=]&&Y I#(,!#)c 3  3"rll"F&I : lF #~~ \)&P000 0tI)L&)"ZN 'G6 (#& +%-#) "%Ac )] ^zw!] ^ ^];] L ^krKr$9#  6+y { 7+"yz )F%) )) %)t  %#m ##  %] # mp)2`) Z6$$(R] P /":$"_pYZ"]]X]&&Y%%%tT%JF'~&6"#.)8''F^)P"= &~'PH $Zf#"R&G:  / ,%~%,$ K5)Df'(V" c'"5''V,# $>"* !'"R !/ ` & 3&="4555R  A&&&e"2 Q'rXK B $Cr&#v&&&&)\$C \$C61 &&Y*000v0%2A!2 l("n'&JJG Mt~. f' (!"\\] ^ I] ^ ^ ];] L ^ )69m"r  p)&c (" @0 L C y'@)b ylu;"X '(  %$>:(yn @St&y! &2 cG*+R(d""""7(L/999) L9!2tp {  z!(!  
r!'(_b   );g!D z d@D z zDD!x z)H(Y1 CYY'R#DY%H!vF ,#'H 4 4C 4D"v)b'v  1$l Ilu;%2 mCA!2 l m"n'&JJs Mt~.($( c%] cA$#$ N$$) ]2 ' 9v iA) i<$)$8i % D $J# $J$J]"6$K #$J.&u(U  ol o oSe ~#&c&qC' v! & )W1! %TW$W! ( `e "|Wz3 0& \ &( D $JO# ) $J5$J])"6(q$K  #$J.(z(z(z(z( h$E{ V  L] &*' /###&j# D # ]"6$ #. /!:) (!:!: S( &es !:!ofR"" R"" W"q&j['[H!eYV&{\]%@$m #'Y V ] P]]%] $\&Y7((, m&'U |#((%k m" $# )L / ( S s !o v @ .')!\!](Z&(,M3 &!d$gs!v%)tK ?kK()&z 'j%)-'j'j !O -k'ja /'aa!4!4Za#n!4& $r\\#k] E Y]]&Y"xnm%#m  |## C %] # lb&X#4 g&2&a"&'    'sb %'v .)!\!]](<<<$<!G m (%=P PP$ '# iA) '% U)! %)&t!g #"X("['0"&$q t%"1(P <0 # rr 0 dt$I))r)Lm#] r #&#&#& !gr'r$:n L PK B 2% kVIu(&u#3` XP bLv&.Q y"_&.&..[&)&.!/\!](Z/(H !e H !x&' &yZV t'~7]%m A%Y% V ] P]]%] %$\&Y7!&1!& n#x#xb#x^ 4 49 4` U'_$'%_"$$Aa (# &`#E#E(#EDf'("'''," $>"* !'"Rtxx'x#;;&. $K#/(''*"U)@)'l&. zKX /$|d [a&3X$|$|Zay& #L' $|& (vP(B W T H'')K /$|d [&h%D%DZ^&#L' %D& #Nz"5$#!$-t 8"8 $'#d'$!g''r j\\ LK B 2\%'{$zc''%' !e%See)qeS0)"" "$C( vR'C Qa.va Vv$C$$C $< +(0S 'x&Y*3  & \ &( 'C Qf #$  +(0S'x&*g( /!:) ((!:!:S &es !:!o% $9#  6+y {+"y$z )F<$ $ z$ $T$T&' &##'q'q% c'q%%] 6 l 6Y6]:]6%&Y"v&R yv)sv&R&R)b y&&  1$l Il&Ru;fff!f m m m F m] 6 M )6Y6 ]:]6&Yq'j%)- 'j'j !O -k'j#  |/'8 r K%; (T%7" @Im$$$! %)l { '* '# j@jBj"{j"yIz)F yZ&  %2\"nJ ~.$6x""2x""""Z""I#1\9 i 9d99 9$)A) 9'# /'a!KZa& [ !%P<0  ( << 0 dtI))<)L@]%! M "%Y ]]"m]!&Y!)6g( /!:) (i(77 S s 7!omRl' p W m V%8(X Sl D' *kc !)  y Y "0> "&L8 c(C'C Q y)b y #v3 +(0lu;m< )g&*f(&fC 6I 7}\)'ps v D# "6 #E{% ! ~|f{oL) "`v / $u(r!}" 1$l I!3 0& \ &""^ _ Y 6u&Ym 8 e  Av)P" 1_!A%_)-'{!A!A'#%!O -k!A( (!t([s (I'. #!= Iv s"9"9"9r%2 J11"nJ [~1.$C(" 'C Q " )b " #v$C$C$< +(0uE&Y*; &  v_* (v]%b'&Y  mm"# o!mz|%; %4 @! z#!)<)! )(Q r#)<)<!'=#(_X#)<b    /&$T=Y Z $T" &&{\  I!_'. #!= I#g$s "kz!(!   r!'(_b   XA%h m  Y  V  P%]  $\&{\'%i U)Bi %)4&t!g  M))v5v])$$ v.  
/ )3:((t($%)3] Pp / ":$"$T p Y Z" #3]]&"] &&Y!#LV!HG); ""^\ $ ^!\(&#~&& &S"a &$#   $Cr&;v&&&U)\$C \$C61 &&Y*$\M%qh#F##D#""" "$TW!D$ff)iHv"V#F  ](o H$? ! "H$?$$?! ( `^ "|$?z4!$ 'e!$`$$)qe3 @S$+)[$0c!g' L 2%!g''r n L5K B 2%' T#!<% -9<% % "t < <% "";C "";)b "JEM[uEE'%I# dU)yI## %)&t!$!g# \\#kg# 8$ $'j%)-'j'j #X!O -k'j(D z @D z zDD!x z<<F<W #WWZW$8#$,%?%fVV"5'V !" Rb ? !;(!;!;!;rO  @ OOj %`& lfV"5V'V !V"m  ~%] %V"Q H$?! "H$?$$?! ( `^ "|$?z p$L p p p# j@j j"{j"yIz)FI'. #!= I ""%8!# P P P a P$C'E%%!1%$C$C g&Y*"v&R yv)sv&R&R)b y&&  1$l Il&Ru;% 1+ 1++%/1V1 +F&' ]V* $] uu)Tu u((T%7OO'O  #{!%5# H8 )% " &k %< "u. XL V V\9 iP 9`99P$ )A) 9 '# ?yyy'*"UyBD'("'',"$>"* '@'%# U# %4t!g <<-<:$5txx'x7K2K$9(96+y 7+g""""% }L '!A%)- B!A!A #%!O -k!A( 1$l I'sbE   eD v"!"&w((vJq)))$)Ht#f$#tt6#"A9t %"&8(Y"'-'-'-rK B  ( /!:) ((!:!:S &es !:!oP 0 0tI)L @)?!&' ]V* $] %# #VV{"yzV)F+ V P$\!!!<[!+9'$&)8' &#='>&rD R |:#p(r; &4&4& &4K { 2Rhohho&.&&!T [%-;%2 l$}A!2 l2 lWW"n'&JJ& Mt~W.?'%w &h o)_ o S> we ~#@@@3@'#f$?#?"W6#$"A" b#X e"W %6= 2 J(eB M)6 4F # 4 4~ 4"3&M@ ( $(4#<" @0! v$ vrK B P$J0 $ 0Adt"DI)))L)L &c;|s||&& : | Y  n /'" kZ"&ppp!p] ^H I] ^ ^];] L ^bt &%w  S> we ~#)8 S'Wm1' p W m & & & >& $ )~(&l"z(&!\l(&\ &F iP 9  P$ #A)  '#Y1CYY'R#DY'*"U)@!%& <% <% % < <% 8 r}O K%; $?$?$?  ^"|$?M  $ (q t   d!^"D))r & \&$2$2$2 $2%*!f!!<[!;tbd!  ;ddn >$8!d$, n%zm Y V  P%] $\&{\#G A&5&5 L8'* &Y@@]%! M "%Y ]]]!e&Y!$CH vJH&)\$C \$C  &&Y*& &&( |&8&A!2 l)' lGMt"'WfBff!f]j]jj]] LjHoT!e H!A%)- #!O -k( $9%"&MT' '#U ##4&!$!g#[ &\9 i 999&)A) 9* X8zPdbLcEY$nA(x"(1N"{$#( &v!&&&!( Zw ((_( -  3 /? ;!o -  $9%"&MTG y47UeUKKKt&H $$$#;X`<))!& 3'| HH\&qZ&I 6{& #!$T D# ## ) 5])(q$  #.> '1V&'1'1E) ! '1'Z)U!h {)U)U @ P )U !C# c; s &  : % 4%w  o" o S[ we ~# $'g  V &a )4HoT D!e HY  n 0&]%! M "!Y ]]V]&Y%2?"nJ ~.mmm#m(r$C$C$C &Y*#g#{!%5# H8 ) " &k  Y1vCYY'R#DY)p#"^!~"~#""#%#" ^^^ 2|?I$(v+&3"X'( %$>:(ynD'("'',"$>"* ')lU)l)l' K<$$ #& & & >& c| )||F :|mmE]  f m%#7  "# o z? 
> &L# (B# # >a#  !C@"<"'j'''   &&!Pz Oi -"OiiO"Oji 7!77#7  _!< &|m m( / ( O(S s !o$ $ z$ "'M mQ mu $n$o& E  "c )3:((t($%)3  #6 l &a )4( "*  #-#()] ^ q I] ^ ^ c];] L ^"v'v  1$l I&2F2y&2'2 ]V* $]! %' N\ %Jf '8sf"#} o K%; z;va @A.'%#! $*&(Z Zf#"R&G / ,%~$ K5)6= 2(eQB M)6('C Q y)b y #v3 +(0lu; /!%I 5)BII %)tOI U !g  8 r"#} o K%; zu;tb- + ;n >$8$, n#"^"~ "#%#"5#"O  7'E$'d%1& #  !(u @~ ` * '>( 6&jH;"^ vQ yO .([&) !/\!](Z/; ]#" Nt9'   9[!"# Up'} %Y[[["i4i&= T P&%hzm hYh V  P%] h$\&{\:'$[!" %# '' %t"' {+ @@ V PU j@ \vQ y&(.@c)!/\!](Z/= .{{]%eh m ehYh V# ] P]]%] h$\&Y!h ll !W)vW/P00#C0AA 0t nIA)L+j( ( V P$\( $\)dI&i&iP> 9 9 t!l 9)L!K w>&Bw!e%See)qeSX00 mi |#((%k m!$V1g&&"&' 7!&1!& Y u&Yg ' ."&v&&&)\&nN!nnJn8} O K%; ]z!Uv$%@ V  V$C!iv!i#&U)\$C \$C  &&Y*& "%w   o" o S# we ~# # )rj%"{j"yIz)F  1#x#xb#x" "" "eQV!2t  '*pn#$DI!,  ]   M  Y ] V]&Y( %'C Qf V$$< +(0S'x&*R;tb-  ; >&$8!$, &&#M!P& &sF!P!v%)tK#_'=!PK(!)&z f'_ W")P"0 `&]Y]V]&Y ;/" ; ; U'$#\ /!$T= [Y [Z $T&S [&&{\ V" ! /Z p$L p p p F Rj$ $ #/] '% n  y'$XY$X !o]]]($X&Y([([([([fii' !i"# )! #v Qy.c)!/\!](ZP/ /?) !o$?!z$?$?^"|$? 1! A  "D))'Hm'{$z'''' #!D!D!D!zY V PV)$\&{\]7 / o"$T7 Y Z"#3]]&"] &&YV$  :#& !gr'rn$ L5K B 2% ' ^!h {5 ' ^ ^ @ Py ^ !C# LLL;L 8"#N oO K%; z#Ggs%2?"nJ ~."[, " 8M&L&]!!3$%  MY V PV&{\$J#"% \9 i 999)A) 9'j %)- Bff #!O -kf(\\\#k v_]  % n  Y %Y% !o]]](%&Y#{!%5Z" H8kZ )Z% " &k %<$Zf &7"R&G 0 / &, %~ "$ K5)Q? c&IN XA" >$ ,j($ !a&a$S 'x %v y.c)!/\!](Z/X#4"u%u. 8 m' U |#((%k m" $ ~u.  ) rf '8&Kf"#N oO K%; z  O'#f)N"A6"A /Z"B'eH('] mg ' mY m]]] ." m&Y  v<$`S'x&*$4og#&Q $?`"|z8Pd#b( Q !(B! e)(gllO  -(^@ OOj  ) 'HG); ""^\ $ ^!\)P{/"6#&!m'!!Y(p!)f$B)) [) /$99(9U\ 9  iP 9 9  P$ #A)  '#$%1& #  / ?$Tk#  #3 , !lo= &%{ / (!t ( s !o$'z'z h'z`=K2K, wtd '4a*]%$$m Y V ] P]]%] $\&Yz ~|fK$| [&V$|$|^y #L' $| f '8&Kf"#N oO K%; z$w g ' ."J   H!g' L 2%v&.Q yO&.&..k[&)&.!/\!](Z/]  $   $Y$ !o]]]($&Y'j%)- B'j'j !O -k'j!g' J$n$ L2%i 2$%%g&&' z 3'| a(+Ja$a%a`i(C'C Qf)b #3 +(0lu;$?e$?$? ^"|$?" 
Q1+ ?+11V1 +(C'C Q y !)b y #vx3 +(0lu;j&f# 3 #$D)}I  !f!!<[!( R'C Qf V$$< +(0S'x&*N;NN D ; $J # ) ;$J5$J])"6(q$K  #$J.$C j$ _$C #|)&Y m mY m  m&{\xx(x`%#0 _ Y u&Y ( L ( R'C Qf22 #$ +(0S2'x&*5# M ^(( z#!"E)!  r#"E"E!'){#(_#"Eb   fR   %!, ,,,%?%)@E D $J(# ) X5X])"6(q$<  #X.^($In UnH] /!"$T [Y [Z" #3]]&&S] [&&Yz! ;(!  r ; ;!'(_ ;b  K F(i#"]"]V c t'~7 8  $ 6 : p!M$)(~!% 64Gj~'c %ccRc '  -+(J++V+&%? 9#D%# %2("nJ .~.. 'MvK"T"vKK`vTK$%? #x4$'!O$$3 $+)[$Dd''[d(V" !%`"5%`'V,# C$>"* !%`"Z |y(  '8 "#} f oO K%; z$P!(L/ `V& ` `m(P `O  -(^O  OOj ?B\\| XAF(#  l#!2!2!#V!#yO$C'E%!1$C$C g&Y*) %.#(+##6`# 3'| )4'} % i!A) #{! H8 C C C(L/ql] ^zw!] ^ ^];] L ^=~~!g''r n LK B 2%  V'1 ! %%*^$}#k7!&1!&( aaaa+9   \9 i 9d99 9$)A) 9'#sN\ 9  iP 9 9  P$ #A)  '#(T#%J] !$  !$Y$ !o]]]($&Y(#( cD#OOOOj ?((x )v" ](o)d@O@ |@3@&|]  $ M  ) $Y$ !o]]]($&Y *%) 5) %)t WMO  -(^O  OOj j% )) %)t  $)Bx",2& CA^z(Vz!  r'!(%P m=|%6 D (?# CC]"6$"{ #C. |!B y)$T$A$ 8 r}O K%; # )! # Z(G m)( %%%m_!_8("* Id !XDf("'," $>"* !"] Y]]\&Y"#0""S" y!--[!"O  -(^O  OOj  6{& )e)* ' ]"| 4'$&m$C( vR'C Qa9v22a #v$C$$C  +(0S2 'x&Y*!: ([!:!: S&es !:%ejc; "  "&F& I : " :#(%*"; )<<&! <$9# C6+y { 7+"y$z )FN&Y!8" 8)6))F%= T I^%P#h 3$Cr&;v&&)\$C \$C  &&Y*5o'55#$'+5)#.% Q"%%tT%Jtt %i&z %&&M!P& &s!P!v%)tK_'=!PK(!)&z f.YYY' $  !  # $Zf#"R&G / ,%~$ K5)c;|("$ ||"F&I : | Z(L/%) )) %)t 3 '#f$w?#?$t6#$"A b#X e %y%) )) %)t% D  %U& k$ !&&'( (X Sl'*c !y YM MM'M ((("# o)(z$qI'#!= I"v%<L xB  ! &C J )4J   W1! #W$W!  `e "|Wz&"nJV 3 && \ & ! 6{& )K& /# [a&&QQZa^y&S#L' Q& $8! ""} ,)z@&sR)- )-o !%^S",2& Q +(0( | 'S# E&G4 /!:? (!t!:!:( &es !:!o 9rJ /O D; $J(;# )|D5D])"6(q$  #D.) % $S 'x&*VqP0 0 0tI)L%I II %tOI  :#(v " Tv "%@vT "[( ` l d!2t$Cr ; l&$C \$CT &&Y*%.s,,(,Cq 9$[UL C }!F (''!#! '.( | #&  J t $?| &$` "v'&&  1$l I %HG); "^} |$^!})z'('#>$m'j %$)- m'j'j'#!O -k'j($%i%( ((q($c 3  3"  "&F& I : Lb -jY W@  %! !!(7!'l  UVg'%#`U## %t!$!g# %*'f 'f'f|"'fUeU~ }`'Vk0'\$9#  6j+ " {j 7 +"y$Iz )F"~'` l d'?"W#f t?#? 
5?"W"W6#(a"A" b#X e"W % {+P V P j ;#}""&.I " yv#!)b y'&  1$l Ilu;v)P" 1hK /$|d [&h%%Z^&E#L' %& NJ&!!!bp#"O $#@ OOj H2F2y&2'2 ]V* $]D'(""T'',#$>"* '@ = If!+d Q' g" %#"^"~#"" #%#" %2 mCA!2 l m"n& &JJG Mt~.i''    tH&'% MYV&{\ \n#(X! Sl'&! ! *c !T !  y Y "#0""S" y(& l"z(&!\l(&(" @0t v#]i / Y(i S s !o   i))$)$)b "% { {N  : -&v y.')!\!](Z6 Cg=g #{!%5Z" H8ZZ )Z% " &k %<A$" N am# &&  Q  T x'  -'$ z] z(wY V] P]V]$\&Y\  i 9d 9$ A) '#6*"f :$Zf &7"R&G%0 y/ $, %~ "$ K5) 3'#U )n##4&!$!g#[ $ l*Y*!(*&{\& 9.# H$)c  /!:? (!t!:!: ( &es !:!o% MYV0 &{\XXXP<0  (<< 0 d tI))<)L 'O!J"JJ J#%2CA!2 l' m"n'&JJG Mt~. %4o( | 'S ) 'v$C  H .)!&\!]]Y0&=#m[ )!T)%%0%&Yg {+P V P j #` #  #$f)ff)iHv; &T  z2$&z&v!&&&Pm&)g # "&@"""%K$| [&$|$|^yR #L' $| 'r)  K B (U P(!e%Sede)qeS0)&e " tm))) / ( S( s !o'" 8Mm&v"'O!"&w((v(K / [a& Za^y&#L' & & $C'E$C%!1$C$C$C g&Y*  "O&&M &s!v%)tK_'=K(5!)&z f$Zf&7"R&G / ,%~"$ K5)z ~|f#)$D)I)!,X )#) @St!~ &2)cY*+R~'  0m%mZZ#[Z%[$Zf &7"R&G0 / , %~ "$ K5)'pwK F(i P00 0tI)L u#) 45>%#m ##  %] #%O(([#([([([`=JJ "v !/  z z z!x z|#y(#rD Z VE)> ! 'Z ,#4K /$|d [a&'?$|$|Za^yX& #L' $|& v&.Q yO&.&..[&)&.!/\!](Z/(+%`w('4a*0&=D'YIP P$ '#$C$C$C)b$C$C g&Y*Xv")"&w((v) %3m}%9 W:!A%#9)- T!A!A #%!O -k!A(%w ( [ w WUU`"|Uz )%U"sU&H' k$!)0 ^ iH YV%&{\='sV t'~7&%?G y47uU 8!"#} oO K%; zj  m |#((%k m%$ $m("rP ?(( p)9(q&c (qmfRmxx#Px%$H 8&Y$(s(s(s iA) i Q +(0 D # $J]"6$K #$J.& M3 &!dqs !v%)tK ?kK()&z y}ZQQQQ( :#(EY$nA("(1N"{$t/,$t,,H  $ t,[n /'!4!4Z#n!4& % % % % 'q% c%"+)z)zF''Frn$ LK B 2%^ Q +(0 "# o"zA!2 l6'&J GMt'Wb V  V'm%#m L## @ %] # /!%S)qS)*0  }  &a )4!P(CPy%w  o  o( S> we ~# [' BF ` {&]ZUeU&' ! [ $ l$Y$($&{\v(@ V$r)(   aa'a&6#. !h {) @ P !C# !;rU|%$m Y V  P%] $\&{\!&"Z L" O(o" 1 e(g'"(g(g%(g) r 'G ='G = = =!a$Cr&#v (&&&U)\$C \$C61 &&Y*$?! $?$$?!  
H`^ "|$?z|*1)pb!n!n !n{(-### 8#m)g" """O  O  OOj $C'E%%!1%$C$Cg&Y*#a9m9D'("'',"$>"* '!U%]  Y]&Yt$ t z$;]"C yv&R)b y'&&  1$l Il&Ru;D(>D% 000f $r ### 6{$_# #i    &x:#)D / ?$T  #3 , !o)%%%1 11` l 1&#  #*" \ =JHoT!e H"Em  g=)gg'If!+d Q' g" $9# 6j+)"{j "ygIz)F( / (&Z((( S +s (!o#:i(#:#: #:)nd Z#{!2%5# H82 ) " &k SSS1S&c&q_)n5! @ @R %) D $Je# $J$J]"6$K #$J.Cq 9&>v "Tv "vT "   N (,,,{ ,6= 2O'(eQ% M)6X!#8#$'#] /!"$ToYZ"#3]]&=]&&Yb!F!gr'r j LK B 2%2 0C ($ ( (  M' ( !&2F2y&2'2 ]V* $]L0 C L L C%hzm hYh V  P%] h)$\&{\ '8 f"#} oO K%; z'''$G''v .) !&\!]]*(Fm K$| [&h$|$|^& #L' $|%?D%i'#f$w?#?$6#$t"A b#X e %f' !" =)8 S'W y&!!v'Je!A%)- B #d!O -k(HG ); -"^  } |$ 5 ^!}%2 m   "nJ&\~ . #H tPA(~'>(oCCC#C('] r&#v!""& \"&&*u ~(:!"@)'!M'2#PS" N %~4$U @$ '$Ue! @$$)qe S$+)[$0"X '(  %$>:(yn 4&&M3 &!ds!v%)tK ?kK()&z   M( $(4#< ?A$Crv])%)%&)\$C \$C'S)% &&Y*]Q F(X Sl ' *c !  y Y  D; $J(;# )$J5$J])(q$K  #$J.[%' ]%m Y V# ] P]]%] $\&Y$?)!  o$?$$?!  `^ "|$?z  #  D# $J #o [&%%% D (?# ]"6$ #.xg"% ![ :X(>G eX7 ~(L/K /$|' [a& D$|$|Za^y& #L' $|& ((""L(o g g g"j g)n D (?# ]"6$I #.] R f&T&..."%(XE{#{!%5# Ho& 8 )% " &k %<}#a99m"r  p)&c (6 6u"L"LL;L\9 iP 9 ,9 P$ #A)  '#P${$$ $) M) )5])$ .Xm"rr QQ& \pQ&$ D $J(# )$J5$J])(q$K  #$J.W %' %#{!%5" H8 )% " &k %<R }!!F  XA tJ 8 " ""g   "4!$%S'! $)q3!S)[0%$$"! yv!)b y&  1$l Ilu;7 R$Cr&;v&&&)\$C \$C61 &&Y*%(&P((&#l"z(&!\l(& yZ%%A!2 l!'&JGMt'W& 'Y(*pv&.Q yO&.&..([&)&.!/\!](Z/O' Zxmhh'h p Wd mh ]%! M "!Y ]]V]&Ybb('V(' &(' ` {&]m%" m, NZ('((`!gr'rn LK B 2%'%#`U## %4t!$!g# v&.Q yO&.&..[&)&.!/\!](Z/W7WW(L"WS ($R($C( 'C Qa6a #v$C$$C   +(0S 'x&Y*u  MYV)\&{\ao H%$#'"\/')!q \"\ '%Y%%&{\$C'E%%!1%$C$C g&Y* D  # ) 5])"6(q$  #.$Cr;v#?&)\$C \$C  &&Y*(!%S)qS0 #$DI!,   D # "#"6 o #z$I \&*(v '9n\\]%C]&]M m3 &B!d2s mY m%)gK]]] ?k mK()&&z'Y"v)bv  1$l Ilu;R) k"@ z#!<)! $: r#<<!'#(_ #<b   % d k( k' ]%! 
M!Y ] V]&Yws$"aW y]"u H%'~"\/')!q \"\$C( R'C Qava #v$C$$C $< +(0S 'x&Y*#)###&j#{ G6 >' (# + >''%-'#) "%AcG6 T (#Z + %-#) "%Ac% %t :'$"O  7'E$$m("r ?(( p)9(q&c (q.&c&qPdb/){v&&#M!P& &sF!P!v%)tK#_'=!PK(!)&z fo$$$ /> &LV<% -9<% % "t< <% "V ] %!5 Y%( #bfVVV"5'V !"6"A % 1 Z#D &{^ 5 ! )#w C d k' d)\9 i 9d99 9)A) 9 Z#|%C!J&&6!?$ ^0O  -(^O   nOOj ]%eh m %eOYO V ] P]]%] O$\&Y H |? \'eH(' y&!!v'Je"& Ri"fota&vtaaa #`Tz     "1( O''2%F!u_2%%&2'2 %]V* $]c\9 i 9d99 9)A) 9!'+5 V P$\D D $J # )"$J5$J])"6(q$K  #$J.## C C C]  Y]])&YE{% 99#=( ` l d$v% SDf'(V"'"5''V,# $>"* !'"RK {D 2'#U##4!$!g#&[4!$%S'! $$)q3!S$+)[$0 p n'j%)-$S8'j'j #!O -k'j((Y" "&R yv#Rv&R&R)b y'&&  1$l Il&Ru;(( 7((')3b ( &Y m!(h& t tt9tP_K /$|d [&h$|$|Z& #L' $|& 5! @ @R %)%$"O)q v$C'E%%'yxx$C$C(x&Y*e&#J"nJ w' !''"W#f$?#?"W"W6#$"A" b#X e"W %+Q# #{"yz)F  /   MYV\&{\&&M3 &!#s!v%)tK >_'=K(!)&z f%f *%f%f(%fZ%l"#*wXXH>&B,( 6{& %&j!I!_'. (/#!= IS#G @ e#H LL)qeS#L0"$m%rfV"5'V !"|e'( L]%eh m ).eEYE V ] P]]%] E$\&Y("X'( %$>:(ync |("\ ||"&F&I : |?'a)=\ o o o oS-#e ~#^"v&R yvv)b y'&r  1$l Ilu; @']%) M qY ]V]&Y I`&&$/"O7"O!%Sed`)qeS0f^ 4 49 4 qIIIOIF&'&_ ]V* $] " ""   "  o ]%@$m #'Y V ] P]]%] $\&Ym"r @ W p)9q&c (Zq /!:? ("!!: S s !o: 5# M  Z*1U 2!L MmhK /# [&hZ&#L' & aa'a$%O 4 WHH\&q'!+'#>'%+"'''bP> t)L! w' EV V%w   o o( S[ we ~#( kVI 4 /'  n D # "#"6 o # z+ 66a#(+)>ya##a6a`#'??$I b#X e2$D$HHHH D $J# $J$J]"6$K #$J.[&~`$` )'l&c&qj$%(!?$8$""}qCw>&B, rY #  &Y&r;& \ &&*  f% K5/" Y$fC%l|D(V" "5'V,# $>"* !"[' t)) ) ;"'L"""&TI  $ Df''[("'''," $>"* !'"[( C Cll'(e"* %.!h {%.%. @ PI%. !C# " IIIOI] '% n   '%Y% !o]]](%&Y ~?& "v'v  1$l IqCD'$~D$%6DQ+(""'C Qf)b #$< +(0"uE&* :#( & %? T x'  -$ !&&'v  @ .')!\!](Z)PP$bPP${ '#UV\\ ) VNE)" ! B)P"NJO O  OOj \r&;v!&& \ &&*$u.   B! 6$ Y  V P $\&{\  R z( !  r_!'^% .&g) _X i(('#U##4!$!g# &a !95# "# oz4!$ '! 66)q3!S)[60'PP PP$ '#$X(^^^(R(R(R O(R  /!$T=YZ $T&&{\%2 l$}A!2 l lii"n'&JJ Mt~i.G y47}UeU  "D))md)z&_!f-!!H<[!xmmh8 u o^; ` &$'mC%w   o9 o S!r we ~#4!!4'e! 
!)qe S)[0'Dv&.Q yO&.&..([&)&.!/\!](Z/%)]   M   Y& ]]F]&&Yv&.Q yO&.&..([&)&.!/\!](Z/4a*(~ '>(t'  MY)&{\ S(# ##^&j#!> ZaU\ & i 9d   9$#A)  '#$&(M `')8 S'W] /&"$T&wYZ" #3]]&=]&&Y#$DyI!,  I!_'. (/#!= I '??"W6$?"A" b#X e"W %5# GN>  :'P''' vvl  1$l Im!Egj%8W " ""   "/! $C D # "#"6 o #z %4!4'e!&`)qe3 !S)[0*"!c & <O  O  OOj !([#:!([([(["m' p W m""""""Z""'& $''"W Y??::6$?"A! b#X e: %" 8t!Eb& &M r C"V  ] %X'#'v v&.Q .y"_&.&..[&)&.!/\!](Z/%( $ !###D# y0'%# 5U)## %)4&t!$!g# $9#  6+y99{ 7+"y sz9)F!Y##,%%%(((( {+ V P j %2$A!2 l)$$"n' lJAMt~$. n82"#} o K%; zV (K   _!< &|"U)<Df(V" "5'V,# $>"* !"F(s "H o o o oS#]!e ~## &K / [a& Za^y&#L' & $C' vJ&)\$C \$C  &&Y*#N'j'''##; Q Q;!6D(V" "5'V,# $>"* !"4$'! %$$3 $$+)[$jt  ({` l  )45$ /8j Q(((&Y$B(c; Tsll& : l" 8& / ?$T$ #  #3 8, !loT T($C(" "'C Q "h)b " #v$C#K$< +(0u E&Y*#N4! @$ 'e! @$$)qe S$+)[$0(P H%'"\/')!q \"\$Cr ;&$C \$C g &&Y*f' .!"( $\M$\)c |("& [["F&TI : [] ^ I] ^ ^];] L ^ I#&[('nNK!nnJntjv y.c)!/\!](Z/ %m"r $ p)9q&c (qm("r (( p)9(q&c (qWWW$Wl(_! D# w"6(q #'~7!: (!:!: S&es !: "%!&:/&:&:$&:6up'5'4 ], i ` &"* c] "#!<< HP$J0 $ 0Adt"DI)))LDf'("'''," $>"* !')"Roo(o'#{!%5# H8 ) " &k < <#$  d&%?4!$%S'e!`$)qe3 S)[0##f ##6#"A$%)7 %%v ) r??#X e?$)bXXX>X 6{& %m aYa V  P"%] a$\&{\ cR, = = H =!A%)-'{!A!A'#%!O -k!A('j%)- B'j'j #!O -k'j ((Y66!  [O"'!$8!!! )|""}! ,# # nN\!nnJn)%($C v_$C (v]%b'&Y FXh[(>G eX%(q((&Y$B(( | ) E{ ?]Df("' " $>"* !"'j%)- B'j'j !O -k'jueb$'#' !&&  " &<'<<<"1( H'L CM }!FC"$m%U(#s %w  o o S we ~# D # ) 5])(q$  #. @ @ @1 @  !C@M"$m%(Df'(V" '"5''V # $>"* !'" 2u. j$D$D['[3 $r!'&" a |'N>>#> o Z^&  (( *  AKX /$|d [a& XZa^y&#L' & tF#w t t , t2%2%%%' W  %W`!  %W$W!  `e "|Wz (  '   o"# o$Cr' vJ&)\$C \$C  &&Y* ]Q F] v&t t! (#v $Q$D$Dt/,$t,,H  $ t,[ $y5)B%~"v'v  1$l I L m' m[` t4! @$ 'e! @$$)qe S$+)[$0g}ggog %w  [ w H)! H$H! ` Hz$C'E%%%$C$C g&Y*(g(g(g%(g) 2 ] !$  !$Y$ !o]]]($&Y]   M  Y ]V]&YK$| [&$|$|^y #L' $|$Zf&7"R&G / ,%~"$ K5)"$"%1"&%# " )mKK J | ") 3']$og  (L/m!I"r | pP) )mmZ F %%%  LW1! - $ !  
` "| zaZ c,] &)@S%w  [ w r# )rj"{j"y.Iz)F&X eXIf+d Q' g" ,''##'"v'&  1$l I  4S ' e! )qe3 S)[02w ` l d'= !<$QE{;%)?"&' ]V* $~']%P&# $D) \H9H iP 999P$ )A) 9 '#*1U $u( !]   M E Y ]V]\&Y%wq&h o)_ o S> we ~##$8$!$ ""}$ ,D'("x'',#$>"* '(D ": }""e"%2LA!2 l)"n' lJ Mt~.r; , ,& \l ,&&*] mz[6~ mY m%6]]] !% m&YO  -(^O  OOj !JC )- -)-( z!(!  r!'(_b   (4( | 'S /!$T=YZ $Tm&&{\ m) m m F mP <0  << 0AdtI))<)LfU"wiI&%i  'k v$C'E%%!1$C$C g&Y*(  5${ fPdb9%B wB(c] ^&] ^ ^];] L ^#4 t 444kVq;(N & #$&!A%)- T!A!A #%!O -k!A(&&ME &sE!v%)tK<_'=EK(5!)&z f#^!~ ### ! 0!!z!% V Pr&;v!&&& \6&&&*%w  oi o S> we ~#%2LA!2 l)"n& lJGMt~.%w  o o( S[ we ~#f m Y$f $ $$%$%lf)60 'j%)- B'j'j !O -k'j u$9#  6+y { 7+"y$z )F#5 E  V G) '*t#f$w*?#?$1t6#(a"A b#X e % 0  ] ^"p ^];] L ^(""'C Qf)b #$< +(0u E&* m"r  p)D%{&c ( $r /? N$4$4 o$4!o 6{& % c?)\ '%|*)pI$3'. #!= I / &$T= Y Z $T" )&&{\]7 / o"$T7 SY SZ" #3]]& ] S&&YT$"$m%$& # v&.Q y&.&..@c)&.!/\!](Z/D z#D z zDD!x z6&(N $  "#G%&5'8'%r'8n L5K B 2%("* "u. XL\9 i 999)A) 9\9 iP 999P$ )A) 9 '#<<D#z}d /?) !o)L-"K  $C= '<H&$C \$C &&Y*& mI4attt6"A9t % ~& ~ ~ XA' ~0&v&.Q y"_&.&..[&)&.!/\!](Z/m#D rB_c; """F& KI : 4(l} -~$ ^YG Af%$~1#Dd $[&:ab)J* !B?#X ehY)&{\](Y]]]} &YR)-'j -k&&M% &c s %%!v%)tK_'=%K(!)&z fv"'O*!j!!&w((v( !*"aaa%a` A!PPc|||F :|_(8f '8f  "#} o K%;  z(L(L%](L{\ & iP 98  P$ #A)  '#m<"r  < p)(b&c (v&. y%&.&..c)&.!/\!](Z/%8C'(Y(YY'uY"S! #%x("9'% U)3 %)4&t!g    "D))] /&"  uYZ"]]]&&Y h! ! $ ! `'!  z'C'(u~`*%'>( )Mc; |(" K "F& I : ( "* "0m> "&L8 c!([!([([([  U'.&~#/!mr&#v! & \ &&**E##{!%5# H8 ) " &k < {+ @@ V P = j@ r yQQ& \pQ &  MYV\\&{\ y#{!%5# HU8 )7e " &k 'M $!#uV!PP$ '#WWW`e"|Wz"C)b%uE2(!%w  o& o S[ we ~#5! 
@ 4 @R %)&&ME &cs E!v%)tK<_'=EK(!)&z f / &$T= Y Z $T" &&{\" yv#)b y'&  1$l Ilu;AL xBL$9#  6+y{+"ygz)F @/ 3#HG ); "^  } |$ 5 ^!}F ` {&]< %w   S> we ~#rK B  W&h)P& !(Df(V" "5'V,# $>"* !"(+$3 \#kP  t"D)))L {+P V P j %w  o4  o S> we ~#}Ze |y-"O7'"O %%%(C'C Q y!)b y #v3 +(0lu;)9 o &m) 7K(i$R(Mt M _$$$(v(' ]%eh m ehYh V# ] P]]%] h$\&YPH %e%e%e)K& /# [a&&Zay&#L' & t w q q q!`'G U Uq U]6 66)8X60RG6 T (# +%-#) "%Ace0M$&=D'Y%F%%&' %]V* $] J nnnJn''*??6$?"A b#X e %&$%_"$$%2 l$}A!2 l l$$"n& &JJA Mt~$.g ('j%)- 'j'j #!O -k'j(%w%% o  o( S> we ~#]  M Y ]]&Y#(f`&%D"&RC yv!&&R&R)b y'&&  1$l Il&Ru; ZD'("'',"$>"* 'Z#a9m9 : : : n)4 :nDf'(V"%'"5''V,# $>"* !'"`jt txQg] z  Y  V] P]$=] $\&Yv&.Q yO&.&..k[&'Q)&.!/\!](Z/99(9]Q]'j%)-'j'j !O -k'jC y * y)b ylu;8"-q??? b#X e nn nK$| [&h$|$|^& #L' $|](Y]]#]&Y -j'z'z h'z">K!A%)- #d!O -k(( Lm("r ?(( p)9(q&c (q n Z Q}QQQ'&)%RG!d ?k!d    ^mv&.Q yO&.&..k[&)&.!/\!](Z/K /# [&hQQZ^&S#L' Q& D(>D^$? S $?$? ^"|$?"vvv(v] /'YZ]]V]&Y ) [[%[%; Xh(>G eX%^I]'I !p  ?( !C@m;tb-  ;n >$8$, n /(Y i !o;'L"""&I \ ##f ##6#"A% %c; "  "&F& I : v s %Y%Y%Y a(+Jaa%a`+66% #a9%G9$%_"$$ X(G eX'?"W#f t?#?%?6#$"A< b#X e %tK(l   #` %w  (S[ we ~#r#"^"~#"" #%#"TT z#!"E)!  r#"E"E!'#(_#"Eb   v\&Z $\9 i 9d99 9)A) 9 ! mP ss t))s)L"tPi@'W Z%w  S> we ~#m###ksBB%##m (## %] #&)i %&XXX!)'v .)!\!]]%; t/H $ t %t i_! 9'v! (!!.)P!!\!]]$%!T!!)! ! / V V /X44Z4& K$? $?$? ^"|$?" (;!< &| O  - s(^O   0 OOj $Zf  &7"R&G   I / , %~ " $ K5)!VC$Zf#"R&G:  / ,%~%,$ K5)  MYV&{\("X'( Z%$>:(yn v& t^!(#v"X 7'(  77%$ 7>:(yn8}(####P$ 0$$ 0t/I$)LK /$|d [a&'?X$|$|Za^y& #L' $|&  v v$C( )<'C Qa%a #v$C$$C   +(0S 'x&Y*r%w  o  o( S> we ~#'??$? b#X eo8""}%|5$"'?"W#f t?#? 5?"W"W6#(a"A" b#X e"W %^UUU$ #|)Q")?&' ]V* $][I -$`$`$)$`'#"#$'#!(&(B )C(C'C Q y)b y #v3 +(0lu;)V JXSw>&Bw(| V V -n#{!%5# H 8 ) " &k ;$$$&/ $C y  y)b y(<l yu;Df(" ',"  $>"* ! 
" ' " XPdbLc [4 _)n ; ;## ; ; ; U'?'D(D(> j$ #|) &)'{$z'''!!!!'#!=' @O v)K !#$W @?$9!g'6+n 7 Lg 2I%a aaJ!af z5/" Y$f%#m ##  %] # ) sp!'%&#U %4t!g m"r  p)D%{&c (f% 5/"#-$fC'j%)- 'j!A #%!O -k!A( m vu 6mv%vv()vg&*p'r jK B  w(!]%! M "!Y ]]V]&Y% M$Y$$&{\ $C $C%a a$C$$C Sg 'x&Y* L# b \ ; &  $ lY (&{\'Df"(V"""5'V # $>"* !" yC y y)))b yz;l)u;; g8 %mG] ,)-R > >8 C,,(, D# (o"6(q #$'E'v$ .) !&\!]]*(6z6Y6 V P:6)$\&{\]  M Y ]V]&Y!g'rn L5K B 2%2Fj2y&2'&_2 ]V* $]$"(z((}( (f(5$#{!%5Z" H8ZZ )Z% " &k %<   )4Qg ] 'i Y]&Y)(7K F(i / ?$T f  #3 , !ott't"Ux$$$] ^ I] ^ ^ }];] L ^' H'D Rpq pqq"\pSpqv#Zw+$K F(i cA     l$l!'%&#U %4th!g  ]Q F]%VOa$?$?$?^"|$?" ) r` l dZ' O] /&"$TwY Z" #3]]&"] &&Y5#q  $Zf &7"R&G30 / ',, %~ "$ K5)H& !($$'$$ = 1$l IF!9"!9!9!9?#X e#gt "(u~ `%'>(] ^& ^];] L ^4!$ 'e!$ @$$)qe3 S$+)[$0 "  S+e ~#("*(% !  &a )4 m"| :P YY t)))Y)LQ'%I# dU)yI## %)&t!$!g# &( LA0)-f -k$$>'>%_>"$($>Av&.Q y&.&..c)&.!/\!](Z/ / o$T=YZ $T '&&{\}"Cv)bv  1$l Ilu;%) )) %)ti D  ) / o$T= SY SZ $T  S&&{\  _"P &|v y.')!\!](Z &ng&&.& Z$$rJ%'j%)-"8'j'j #.!O -k'j((Y;t  ;Ho T D!e HH&MIf!+dQ') g" !g'n L 2%]   M wY V ] P]V]&Y X#4)b$u(r!}~'~&T &]](]h "T "T " !a" | @ @ @1 @ #Wvvv(v&*U$DfRxx#Px!( | )C#& C(sc] ^"] ^ ^];] L ^] Y]]V]&Y&MCY0&{\%hzm hY V  P%] $\&{\)S.$Zf &7"R&G0 / , %~ "$ K5) ;& %?8J|#:^(6= 2 B M)6)3![I)) T)!#]!] ` W/J$  !  $&",2&P"&U&)B)(` l d 3'r jXK B l >i   6= 2(eQB M)6!C@3|*)p"  :%:(l  y#{!%5# HU8 )7e " &k )* z,zz ` z  n8 r"#} o K%; zHG); ""^\ $ ^!\!#;A!2 l)' lGMt'W x'r(  K B (U" <"0  << 0AdtI))<)L!6>#=>>> >&K F (i /!Q > Z&LVoj Q%'"SP00#C0AA 0t$IL xA)LL(%`"* ""NQ<!QQQEQF''Fr&n L,K B 2%$Cm % N   $C$C ) g&Y* #{!%5Z# H8ZZ )Z% " &k %<% MYV &{\"";C "";)b "!EuEv%2 A!2 l)$"n' lJAMt~$.bp` /'" kZ"&); @S(t!(( &2 c#(G*+R{"y+!qb{  : "&?C"$f!: (!t!:!: (&es !:"|D#D#zc|s|| :|!6_CKq 9) @St! &2 ckY%*+R )J* &%c /(?(( l(!o'??i$ b#X e8_!_1m zp m)p $$ $W7 ~WW(L"WS` /'"/`RRZ" E R&,T,,"$m$,'c;|("$||"F &I : | $r v$9#  C6+y { 7+"y$z )F#u<    &a )4(l .W - . . 
$YDB .'(Nu $$$/$ D # ]$ #.CCC#C(3  & \u &v8Pd#b(] Y]]V]&YiK /$|' [a& "$|$|Zay& #L' $|& &{&&g&yZ!: #`z  MY)&{\S$C(" "'C Q "( )b " #v$C$C$< +(0uE&Y*!A%&X< #}"}}C/}u&A\9 iP 999P$ )A) 9 '#P&U&%BMN &  < !%  !4  A!2 lp'&J GMt'% U)! %)&t!g $9#  (6+y { 7+"y$z )F'2#8!g'r j L5K B 2% ^"")o:#)D \H)?&' r BV* $%'*"U)@ / o$T=YZ $T1&&{\ 'H&&M3 &!d"0s!v%)tK ?kK(!)&z f&v&&)\ &$ $px#V}x&ppx v%vv$(Sv'x&* 'kD'("R$r'',#$>"* '&"$mF%  8 @'W '8 r  "#} o K%;  z  @&Y$ >&& ?&&&M3 &'s!v%)tK_'=K(!)&z f. U")8''!%U[~ #$###%w  [ w "%&5&5*1U #>D z @D z z DD!x z c; " "F& I : & T56"#)M ]MM&M %LM ]MM&M ?)=\a aaao !P &]$9N )&1+ 1++%/1V1 +''% S D '" '(V"""5'V # $>"* !"c; #}" "F& I :   J)4$?`N"|z>!gr'r j$ L5K B 2%XX 4 #" QL iA) m!5"r ;!5!5 p)9!5q&c (qVi"5i'V !i" A yC y y)))b y7;gl)u;; 0   +9~'~v y.c)!/\!](Z/'%#`U)) %4t!g) mQ:KK K#"^s"~#"" #%#";nnnn'O%l$ )Y!g'n L? 2% o ou oSe ~#XXX>Xy( m' |#((%k m" #=# ^^ rY   &YQ $!A%)-!A!A #%!O -k!A(#$D 4  t z!(!  r!'(_b g s # j)rj%"{j"yIz)F d%N u d%N%N%N@]%! M^$Y$ ]]]$&Y j$ A#|) S 'M'M#(#^^ m)$$ $%b$3Y /(?(( l(!o E  "c )3:((t($%)3 !: (!t!:!: (&es !:%u 3#f '8'f&&"#} o K%; &z% !c):d w /YZV&{\' H} '$'') ''*t#f$w*?#?ft6#$"A b#X e % ` &< $ R $ kVI'e &M]%whzm 7hYh V ] P]]%] h$\&Ym$!$9  6j+)"{j "ygIz)F yC y y)))b y!'l)u;!am  mg)gg$'h%_"$$Aj$* '%m %Y% V  P%] %$\&{\(@>}(PdbEME/s !d ?k!dfEM[E(X! Sl'&! ! *c !T !  y Y  C C ]C& LL d\9 i 9$a99)A) 9Q'%I# dU)I %)4&t!g I $3' .  L#!= I '  [W1! %TW$W! ( `e "|Wz!q##W ,WN %%%n UG(O 6!g''rn$ LK B 2%] ^ I] ^ ^];] L ^ic|_||Fg :|_ U'' v& 9 v)- -k# ~|m  &)g% !g' J$n$ L% 2$%m%T)g&* / ?$T=  $T , !lo /!:) (!:!: S &es !:!o 7 &|t @St! &2 cG*+R # %t|-q"('#&R yv'#&R&R)b y'&&  1$l Il&Ru;MxY#YY'RY);; |< "R% $96+y 7+g D $JO# ) $J5$J])"6(q$K  #$J.Df''[(V"'"5''V,# $>"* !'"F ,#### T(!"g=2)?!2&2'2 ]V* $]Lz0 C Lz L Cz  MYV(&{\~~~7 D#  (o"6(q : #$BdB(c$ff)iHv"Cv&R)b'v&  1$l Il&Ru;& %?v&.Q yO&.&..[&)&.!/\!](Z/m'j %)-m #!0!O -k(B_i!P"!&k!v$C'E%%!1%$C$C g&Y* $Zf %&7"R&G)x %%/ , %~h "%$ K5)!e%See)qeS'0T00$N$C'E%% %$C . 
g&Y* ]$.&)FZ w D (?# ]"6$ #.?#X eE{'"W#f t?#?&?"W"W6#$"A" b#X e"W %] m2 mY m]] ] m&Y'w$vvv(vt/H $ t!)#V#'E% &* P  A t"D)))L&['[a'j%)-$S'j'j #!O -k'j((YB&Pd o | 2 &a )4'r[[%[ |! l"!2D z @D z zDD!x zw# K i# W=#OO'O #hfRff!f4$I'! $$3!$+)[$'  / (!t ( s !o v)M &w1 99[m'j %)- m'j'j'#!O -k'j(#$'#o w$Crv'&)\$C \$C  &&Y*,Ts,,"$m$,'" ;/5 ; ; U' /ZbR6"A0 %#% Q Q Q i Q K x""(=x""""Z"") D# "6(q #$%@5a /'aa!4!4Za ' !4& -%X @ &' &%  v v_(v]%b'$Zf#"R&G / ,%~$ K5)O  -#(^O  OOj 7!77#7Lz0 C Lz L Cz Q2 +(08Pd#b(#.$  }   q )4)P{"6h#/-(:!"!! ! C )b <l u; $'g  V &a )4m"r  p)(b&c (  #  6{&  "v "e/ xxxx!: (!:!: S&es !:uv$C'E%%!1%$C$Cg&Y*#%&=&^ #(vP(B!JC$Crv&)\$C \$C  &&Y*  d"D))mg&2&"&'  %# %t  )U!h { )U)U @ P )U !C# \ 9  iP 9 9  P$ #A)  '# F%8 #. # m"r  p)&c (!g''r j L,K B 2%zY V PV$\&{\" ("(a(+Jaa%a`$C$C$C "&Y* Ho . T D !e H ^#$( $ lY(&{\ @)g#& P& ( %'C Qf #$$< +(0S'x&*] / &"$Tw Y Z" #3]&"] &&Y)2% -% % % 6;'L"C"&4&4"&I &4c d " j d "F& I : =r'=r"?#mn#m L %K B 2#m% ?m("r (( p)9(q&c (q&H")&R yv)&R&R)b y'&&  1$l Il&Ru;(|S ()f)) [)S#AA&&M &s !v%)tK_'=K(!)&z fc; " "F& I : M1!f !!<[! y!'Je\\$}#k$Cr&#v+""&)\$C \$C" &&Y* }!FWdM VE)" ! %; nnnntataaa"6 ,   '8 "#} oO K%; z iH HHH|" v& 9 v&)F Z_!A%_)- #z!O -k( )4]%! M "!Y ]] ]&Y!q  : "&?&]%whzm  hYh V ] P]%] h$\&Y %;  >s >>g   > .k)8 S'W&&M& &s!v%)tK_'=K(!)&z f$86 6Y6:6)%&{\#jv' >) '5##$Cr&#vr  &)\$C \$C &&Y*s l$}As!2 l' l' ' "n'&JJ" Mt~' .( tI( $\M%q$\ " / !a$!a$S 'x ^! T! $! `&/ zKK =K &a)4#C'%2 J"nJ~.( "& ?fH %e%e%eY"m  )g$C(% <'C Qa% a #v$C$$C   +(0S 'x&Y*% " ? > &L(s'%# 5U)## %)&t!$!g# )( $ $$/$'H5' {+ V P j "!t ''n'%%%(%vvWM]%wh m hYh V 6] P]]%] h$\&Y!#LV!gG ;/5 ; ; U' $'g#6 l &a )49 [&h&#L' P< :: t)):)L (X! Sl)x'&! ! *kc !T !  y Y #( Q$9!g'6+"n 7 Lg 2%v$C'E%%!1%$C$Cg&Y* # `'! zm% ) g&* "5' 8! ""} , 6{/& Y(&{\GJ!A%)- B #!O -k(#{!%5Z" H8ZZ )Z% " &k %<+6(!# NcW%! "W$W!  `e "|Wz` /'"/`RRZ"ER&4!$%S'e!!`$$)qe S$+)[$0'rd K B  /!:) (!t!:!: ( &es !:!o$C( 'C Qa(##a #v$C$$C   +(0S# 'x&Y*v&.Q y&.&..@c)&.!/\!](Z/''%&#U %4t!g ` (X Sl' 1*c !  
y Y " Z)b &*fff# ff CM&)Q$*O[!"H  &Y*Ek{"y"X7'( 77%$ 7>:(yn |nHo  T D !e H l  (Df'(V" '"5'V,# $>"* !"!] zwY V] PV]$\&Y  B    "$1"%1"&%# " $C %$C$C &Y* v$C'E%%!1%$C$Cg&Y*"#( ##&j# 1$`#!$# <$L$)kV!$g }!!F! U ]%(&P-;Y "# o"(L(L%](Lk"~'%o@!e%See)qeS0T00n|('V('m &('} /!:) ((!:!: S &es !:!o&}6= 2 B M)68 r}O K%; ]81 ' p $C'E'v$C)M$ u$C.$C)$C g!&\!]]Y*cW }L$Cm$C% N$C $C$C )g g&Y*" o3"" "eiI 3))&O)) %)t\) !A%)-!A!A #%!O -k!A(#$k &%w  o o( S[ we ~#(C'C Q y!)b y #v3 +(0lu;3 0& \ &#gt "`  y&'/P!'/'/!vre'/{!'mC XX4 $-"t R; 2':(u @~$ ` * '>( 6$9#  )r6j+{ " {j 7 "y$Iz )F1+W 1++1V1 +  I("G %2$A!2 l)WW"n' lJ&Mt~W. D# "6 # x} )8'XX]X7iX}(] ^( I] ^ ^ p];] L ^8%k!!V (&*N  :(X Sl'*c !y Y(( ((l(!'% U %4t!g a*)`O  -(^O  OOj $Zf &7"R&G0 / , %~ "$ K5)"B'H!+'3 $r!y u. &]F{('v! (!!.)!!\!]]P;tbd!  ;ddn >$8!d$, n&c&q-!!!VfR "" R"1+ 1++1V1 +pu. '1  Y(p $u(2 ![[[  MYV\&{\ z#!"E)! ) r#"E"E!'=#(_#"Eb   [b!%S)qS0%2) 6r'j %)-T'j'j m#!O -k'j(A!2 l)' lGMtjc; " "F& I : '&)%R !d ?!d n((Y UG(O /&$T=YZ$T=&&{\)G d]q7 / oq"$T7YZ" #3]]& ']&&Y 9J r& r|rt r'j%)-'j'j 5!O -k'juK& /# [a&&Zay&#L' & $ $'&{#'&&y v'g'& /? !o%dD'} m'U |#(6%k m" $ $m<"r o< p)&c ("yC\'mC XXX>X %' wI'Gf%%+ d'G%%%% Q' %%g" S  3 "`m("rL ?(( p) W9(q&c (q3\)&)FFZ'))o(+` q #))) &a)4 z!(!  r!'(_b     2 &a )4 6 X ~'8}} O K%; m;"r ;; p)lH9;q&c (q'v @.')$M!\!](Zv&t t ! B(#v('j%)-&'j'j !O -k'ju#" E!g' L? 2%Ic; Ts &  :  &$9#  6+y{ 7+"ygz)F"V ](o)d uuY  '&Y!e!\\O||"=P(!t9 )"v&R yvav)b y'&  1$l Ilu;r(rWrj$9# + 6j++u"u{j 7 "yIzu)Fc; |("\ ||"&F&I : | y!I'Jem%"r %% p)9%q&c (q]%C&n]%"U"sU&"H' k$! ^ iH>$Cr&#v&&&&)\$C \$C61 &&Y*0T0 " ## g (F"v'v  1$l I c|$||F :|P<  t z)))L c!#R##6"A # %66$T6    G6 (# +%-#) "%Acn(++ +% d( k' &&#M & &TsF !v%)tK# W_'= K(!)&z f !a$!a$&<x HJS 'xx D $J(# )$J5$J])"6(q$K  #$J.("" $>"* (| @&w$9!g'6+y 7+ Lg 2DI%K$| [&h$|$|^& #L' $|%%$C V =!!$C$C;!(&Y*)'$ zo H%$#'"\/')!q \"\%w  S we ~#'hf#C $ %MemR,%)(sy%$ lV!2t!2'q% %)'$$Zf &7"R&G0 y/ , %~ "$ K5)X" ( ' (tP$&),(C%G6' (# +''%-'#) "%Ac'  U )! 
%)&t!g VvQ y.[&) v!/\!](Z/ I)^ ^jr)?rr&' rV* $aaa!a'"W??"W"W6$?"A" b#X e"W %5# GO' T(yQ {  ZnDD DK / [a& Za^y&#L' & mE m m F m(!C@&Ummmm"  MYV)&{\ (!c   (X Sl' *c !)  y Y %N4$'! $$3!$+)[$j!4!#G'e!)qe3 S)[0  e   f( (X! Sl'&! ! *c !T !  y Y  : ZD z @D z zDD!x z ^"v&R yv)sv&R&R)b y&&  1$l Il&Ru;w'%i U)Bi %)4&t!g !)/!!i$|j!xt/,$ t&E[x  !V!!! z,jW `"| z"C)buER r r &6#. z!(!  r!'(_b  7^%$AKK KD(' $Cm$C% j$C $C$CT)g&Y*$9#  6+y { 7+"y$z )F"Pd Y1YY'R#DY'$~$%6Q~~] k 3t  OPa Iaaa` 'L|"(c;|s||&& : | q 1] Pp / ":$"$T Y Z" #3]]&"] &&Y%i 6{/& ))z)z"$Cvvv$C$C(v&Y*]^II !p   }  ^( "44$'!'T$$3 $+)[$j]%m Y V# ] P]]%] $\&Y L }"[ 7W| !c; |("\ ||"&F&I : |6Df'(V" c'"5''V,# !$>"* !'"R$C" "" " } ")b "$C$C guE&Y*['[   1$l I  d"D))mM m%$$u.q&h o#U $k$k oS e$k ~#$Cr ;g , ,&$C \$Cl , &&Y*$4oo y! 'Je  Se ~# ! <c;|#}""| "F& I : $$$"""%"G6 ( +C#y(?#X eX# gs%'$""S#$  #D c&$ %  yC y'@ y)b ylu;#$9!g'6+'n 7 Lg 2I% ,#(TX Sl'T*c !y Yp)%p))p$p)%Y ; ;/ ; ; ; U' )] / o"$T)1w Y Z" #3$T]]&"] &&Y|X XX>X]%whzm  hYh V# ] P]]%] h$\&Y]&-'W !%Y% ]]]%&Y z`&($($($6  /jc|s|| :|_ c! %]  1$l#&'"WG??"W"W6$?"A" b#X e"W %&FP-ttP$t '#!_ D  DDD!x'v6*"2g& 2& Ha " & '  !F (iy }!F%xv#J"]( %'C Qf #$  +(0S'x&*z!  r%2$A!2 l)ii"n' lJ Mt~i. 5$')&i$&dK F (i =  &XXXiXX$/# o"k!3 0& \ &T...)i %&f!$H$ `&&%^!?$%%X(>G eXD  D DD!x('C Qf)b #3 +(0lu;% d k( k'   '~ XA 'h$  !&&6[ $ l$Y$($&{\%2 mCA!2 l m"n& &JJG Mt~.(8",2 &=&^ 'K!?#"]K"]$Zf&7"R&G / ,%~ X"$ K5)O $#O   0OOj  n8$"#N o K%; z'v  .)!\!]] VE)> ! 'Z' o #6| l &a )4 & %? ^%\!/ /!:) (!t!:!:( &es !:!o , = = H =& v$C'E'v%!)M$ #!!.H)$C,!&\!]]Y* ] !$  !*Y* !o]]!](*&Y %&N!g''r+\n\ LK B 2\%mmmm 1$l I# (# # >a#  &;$96+y 7+g $m&v!&&&P00#C0AA 0t(IL xBA)LL _[z 3'| YV&{\!!4  '   f( Z. y&P!!v'Je pw q#Dww Jw Df''[(V"x"5'V,# $$>"* !"  ww Jw4!$ '! $$)q!S$+)[$0!!!<!$ <(( 8(%Q)E([& { {X$ / o$T= Y Z$T" &&{\  LMT #Dv&t t! B(#v V ! ! D#  "6(q #$") &&M3 &s!v%)tK_'=K(!)&z fz$}(@5 c] &tt$t'# % M l%Y%(%)&{\I&"> "&L8 cm]m $rL(cLL4 xa*L# m%k mJK F(i %'j %)-T'j'j #!O -k'j( z!<)!  
r<<!'(_ <b   %# X #&v'9#&&)\(W&> &L77776 2 M)6 ]& *9( 'C Qf  #$ p +(0S 'x&*4$'!%$$3 D$+)[$Y\H9H iP 9YYP$  cA) Y '#p'rZ K B #!##K~#""$m%m"r  p)&c ('''' ]R" ' L! v!\8=( f$(4>);#< v $&, E v&'( L  $"";C "";)b "EEMuEE&q] Y]]&Y /":YZX&&{\)  )5])$ .y")x##'q'q% 'q%% K%P# 3]]](]!'"W Y??6$?"A~ b#X e % "# oz'555#$'+5$&M3 &!ds!v%)tK ?kK()&z $9!g'6+'(n 7 Lg 2I% c(X Sl'?*c !y Yl$C( 'C Qa6a #v$C$$C   +(0S 'x&Y*#p$ppp!C!!)bl!u;&.&.&.&.'*"U)@ 4% D $J# DD]"6$  #D.$%]  M Y ]V]\&Y ! /'a(Aa Za$Z & (L/! ! $! ` zXh [(>% eX%(rv!& \ &&*d&3 [%D' "'M m# mu $\  iP 9 9  P$ #A)  '##*wXXH>&B,(|SA (K /' [&hZ^&#L' & - 1 %;  4 4C 4$?$?$? ^"|$?" /!:) (!t ( s !oa(+J"bKa%a`2F2&2'2 ]V* $]& }!F%;  e00]$H0.C Q}(1#$(' [O\9 i 9d99 9$)A) 9'#%h m hYh V#  P%] h$\&{\ z!(!  r!'(_b g s I'#!= I$/c o ,#$U[ m !%!Q#!]%wh m ehYh V ] P]]%] h$\&Y%*=%u. 8?#X e?#(%c -$~'%''' '$H$yr?D R#t %# z  ~|fnnnn%B&M  2m]m5H  ` z n8 r"#} o K%; z& )8'$C V= $C$C g (&Y* L ($N  _!< &|  ? yFZ! %Se%8e )qeS. 0&iD^#D#zt Gk]   M ~Y ]V]\&Y D U X  $Zf &7"R&G y/ , %~ "$ K5)rK B %0 #"ho ! )! $! ` zmm#9)IYY# uY(( d d< d'j%)- B'j'j'#!O -k'j('%# 5U)# %)4&t!g "1V J c!A%)- B #z!O -k(a Iaaa`"%FZ ' 6(1N"&{# ({"y)z)F)'$P$ 8$$ t/$)L V) _!< &|&%; -5#] /!"$TpYZ" #3]]&m]&&Y$9E p6j+E " {j 7 "y$Iz )F!yR!y!yJ!y!A%)-'{!A!A'#%!O -k!A('# "& }!FZf#"R&G / ,%~$ K5)rrr rHG); "^} |$^!}$  ##f ####6#"A # %""R""$$$D  0 C CNc; Ts&  : F~~K / [&hZ^&#L' & PP&P$`$`P$$i#&Q$` '##XXX>X#i %%% %%   "D))m%& \ )g&*k*1\9 i 9d99 9)A) 9$C $C$C  &Y*] /&"$TnwYZ"#3]]&=]&&Y$9!g 6+n 7 Lg 2%( g_)n5! @4 @R %)Df''[("%`%`'," C$>"* !%`"s&$%#m ##  %] # 4$'! $$3!$+)[$j&~D '" '(V"""5'V,# $>"* !"#  M@( %'C Qf #$  +(0S'x&*3 a( &="4o&," $8(l( W - ( $YD'(Nu #.. .#C(X Sl'*c !  y Y K /# [&hvvZ^&!f#L' v& '(6 ,'"W Y??"W"W6(a?"A" b#X e"W %)r!Fa(!   )'c5])$ . "&p ""g   "K& ' `$L }!F' ) r[ !%[(it/%"H $ t[#$)c v 1$lw! o |  &a )4iiiE{ gim% )g&*$C}('i# j@j j"{j"y$Iz)F#!VT(&*'C Qf +(0#=f $f 1 & \    $  '#r&#v!& \&&*('V('m &('(X Sl' 1*c !  y Y  iYA) , NZ    f $#(, ` & L! 
dgg : "# o2F2"& 2'2 ]V* $]% MYV)&{\8 K%; JYt~C'BP00#C0AA 0tILA)L$] &$5]$  & }!F XA % n l%Y%B(%&{\& h$C $C%a?a  a$C$$CS 'x&Y*^^^K$| [&!R8$|$|^y #L' $| #;&Y]%! M "!Y ]]$q]$&Y # #' HK& /# [a& 2&vvZa^y&!f#L' v& $| }!!F!$&iO 3 ($Zf#"R&G( / ,%~"$ K5) z!"E)!  r"E"E!'(_"Eb   ( | R. ~#c;|#}"E||"F&I : |?i = THo .T D!e H m"r)Y  W&&#M!P& & @sF!P!v%)tK#_'=!PK(!)&z f e   DZ DD #BD|y &* $Cm$C% N$C $C$C )gg&Y*w(p$K F#@$'#(u@~ `*%'>(6#M{!B''!!)[ ](o](Y]] ]}&Y:(] ^"p] ^ ^];] L ^]  M  Y ]]V]&Yb !(!gr'r!n L5K B 2%v&.Q yOO&..[&)&.!/\!](Z/ /'v   .)!&\!]]Y {w!!x  &L(  "D))(8 ~} O K%; ]-_(~! ! 9 DJ""JJ J#qqqSq"% "5' Df("'," $>"* !"! !  ! $! `"v " \z/ )!&$&>&$ DZDD #BD|y` /'"/`RRZ"YR&m'#U##4&!$!g#%th  6"AH %&]%!M &B!d$(s!Y%) gK]]] ?kK()&&z'Y Y&'#U##4!$!g#[ v$C'E'v%!)M$ #!!.H)$C !&\!]]Y*&m r("" $>"* [< <'< $<(%w " [ w "%& a "#n Y&{\ ))?D#z5)B :m("r ?(( p)l9(q&c (q$/#  M & $ M l$Y$($&{\ %w  o o S> we ~#& H$? ! H$!  ` "|z /( !o " YYYYXh(>% eX%tP&$xE{% [F8 ! ""} @& JvZf d#"R&G d/ ,%~$ K5)$H$&&/M%3 &!d"s/%%!v%)tKT ?k%K()&z  $$$/$)! $! ` z&a)4v svv(vv"#"&w((v+++V +"'#G @/#G&&$&% @]%! M "%Y ]]V]&YLcM }"[ 7"""@"www&A!2 l)' lGMt'W!X(B"v'&  1$l I % $r (X Sl' 1*c !  y Y &&ME & s E!v%)'5QKp<_'=EK(!)&z f&&M3 &s!v%)tK >_'=K(!)&z f#g"&RCv&R&R)b'v&  1$l Il&Ru;)8$$'/J[!"$Zf#&7"R&G ##/ ,%~#"#$ K5))- -k'E% &* l a /'a!KZa& m("r ?(( p)9(q&c (q-ZZZZ w Ln L n(l(Y UG(O(qr &|I?'#' &m'j%)-'j'j'#!O -k'j(@'a ?H d   /'Z& $ %%%!A%)-!A!A #%!O -k!A(&&#M& &jsF!v%)tK#_'=K(!)&z f /> Z&LVK$| [&h$|$|^& #L' $|| O $#O  OOj  D # "#"6 o # z(_W/(B%2"nJ ~. e)3)) T)!(&3*)eh!a$Cr&#v _&)\$C \$C &&Y*&T&T&T&TV J (   '&)%Rl[v' !g'8 n  L) 2 %$C!iv!i'cc&)\$C \$Cc &&Y*v"!!"&w((vP< 8 << t))<)L 9 V]  % n  Y'%Y% !o]]](%&Y'%# dU)## %)4&It!$!g#  : "&? w)?"S&' ]V* $~'] (([s 2F2&2'2 ]V* $] y&'/P! '/'/!vre'/ "v'&  1$l I $$ ![ /%'%%Z%F%& $ur#"^"~#""#%#"m('V('$ &('<o ^ $^^!!@0^4!$%S'! $$)q!S$+)[$0! w'  V V (6$ /? ;!o < 4 m(%k mU3xx(x# ##&j#'!gr'r j L5K B 2%K /# [&h''Z^&'#L' '& $9#  6+y{ 7+"y z)F(.  d$C'E'v$C$ u$C.$C)$C g!&\!]]Y*v&w((v" yv#)b y'&!  
1$l Ilu;f "v)b'v  1$l Ilu;( / (( S s !o 8 r"#} oO K%; z!%See)qeS)*0$CV=$C$C g(&Y*'v   .)!&\!]]Y$?$?$?^"|$?nNnnJnz[6%6c !%, qt%'%#`U %4t%a!g ;'% U) %)4&t!g \ "C yv#v)b y'&  1$l Ilu; ^w ^ ^; L ^$C'E%%!1%$C$Cg&Y*]%! M "!Y ]]\]$ !&Y$E ! ! ! T!  w9)`)`E)`"=&$&4Pa7&0 %v y .')!\!](Ztt % bb'b'v  .)!\!]] a&a$S 'x$)p'%&#U %t!g !%~m'$cY(*p $U#{!%5" HA~8 !)% " &k %<]% M qY ]V]&Y%]  $ M    $Y$ !o]]($&Y o o o oS9#]e ~## X"$? #X e    &a )44!%S'e! `)qe3 S)[0)$ %)$)$$R!)$$&tQ<QQEQ!: (!:!: S&es !:V)$K' )))f 5 Y$fHo$|!e HM /X'a#XXZa>X& 2w!x *(?\&_# ~ XAFrv\  ($((!#G @/*#G 'j%)-'j'j !O -k'jud{'mC %2!Y J!Y!Y"nJK~!Y.x$rMsp$hMM!_M c%] _[?:#X e$!4!4'! )q3!S()[0 #o]  M  Y ]]&Y!g''r j LK B 2%V  8"#} o K%; z%w  S# we ~# Z!2, -" $8m*"oA 1$l IA2F(2"& 2' 2 ]V* $]#$###" <"0  :: 0 dtI)):)LU%$C"$C" ") "ii)b "$C$C$iuE&Y*&"Z*1   J3 | Q'c; Ts& K : \$ #|$ $Cr&#v (&&&U)\$C \$C61 &&Y*" /!:? (!t!: ( s !oxx$ RRh %_ "$x$ A%w  [ w "%%4a*yx)z)z", & ".t#f=M# '  6#"AH % $'g 2 &a )4&&M3 &!ds!v%)tK ?kK(!)&z f"2( / "2( ('p'p S us 'p!o _ $kVq&$C()<'C Qa%a #v$C$$C   +(0S 'x&Y*"C ".)b "uE"(R|(R(R"= O(RK!gr'r j#m#m L %K B 2#m%%)7#fh6"A'eH('0000)) z#!(! m r#!'#(_#b   $C(" "'C Q "' )b " #v$C$C$< +(0u E&Y* &oP<'n << t))<)L#"^"~#""#%#" &|c!%'$""SD'("'',"$>"* '')@ ](o'#f$?#?a6#(a"A b#X e %%[ mCA%[!2 l m"n& &JJG Mt~.O $O  OOj (X Sl'M&*c !$#7'Xy Ypx#Vpxxppx#v&.Q y&.&..[&)&.!/\!](Z/"\$C("avR'C"a Qava #v$C$$C x +(0S 'x&Y*'#Y` !N;NN  : " #"e /'  # %2 A!2 l)"n& lJGMt~. g)- A -k!!;#;;!;cccx HJ "# o)G Wz$J<-!7<"t <<$" \%2LA!2 l)"X"X"n' lJMt~"X.####&j#U"{"yI(&l(&!\l(&# C #$"G)Jm# && #8"8"^""%"i -ii"ji!!!<!' %8!! !  ! $! ` z?"W#X e7!&(((G(%o(.88 $/'ffVV"5'V .!"!G m (%=!9!9!9!9 m (%k m(uf  3'] #i    $9!g'6+y 7+ Lg 2$%O#!W"K##W/$~#%w  S[ we ~##~8D RS~'%w  [ w "%WbWW`e"|Wzc;|s| &  : $9!g'6+y 7+ Lg 2% Z#9W!2' t$F!#rDm6  -v %pC s%p%p}(%p 5$ D # #c"#"6(q o #z& 000 ,#"!(KTKKTK!: (!:!: S&es !:u!g' J$n$ Lg%i 2$%% D $J# $J$J]"6$K #$J.Z /r%U"sU&H' k$! 
^ iHO` l d$C'E%%!1%$C$Ck!g&Y*&v!&&&q (](](] Ste(] ~#v&w((v]"'% n "  '%Y% !o]]B](%&Y &@%] /&"'YZ"]]V]&&Yy(yy(y${v "Tv "`vT "&*1K$| [&%8$|$|^y #L' $| i) >  MY&F&&{\I'.#!= I'j%)- B'j'j #!O -k'j((Y#"L<)8'P 8  A t"D)))L W)P"ma] # Ei>$?$?$?(`^"|$?z#05:e;#}" &"&I p!!&k!lS f(UN$)>  :A"V)#$'#!  8 > "&L8 c('j%)- B'j'j !O -k'ju% "! yv!!)b y'&  1$l Ilu;C"%U&&M%3 &!ds%%!v%)tKT ?k%K()&z f !5 Y$fntttt\)'??$ b#X eYvDf(V""5'V,# $>"* !" @"U!!! /8&t/%H $ t[###&j# E%2$ A!2 l)$$"n' lJAMt~$.?'M(u `)?&'V* $"Z(8%W",2&# @j4jV"V{j"yIzV)F(@B>#; 8'"#} o K%; z#f ##6#"A %t %=  w!K~F ~~ v""&w((v%P 3 /X'XXZ>X& G6 ( +!))n!: (!t!:!: (&es !: u|[% % % % ''#$# (tr] 6 l 6Y6]:]6%&Y m" (%k m'BD"X(Df'("'''," $>"* !' "#,$0  0t ]I )Lf $f R;tb-  ;!S >t$8$, !S "# oz] / o"$T)17 Y Z" #3]]&"] &&YK /$|' [a&$|Za^y&#L' & 9l f(fff"# o!fz(#9m#9& i_)ndt &tr;v!& \ &&* D|P>AA t nA)L\9 i 9 99)A) 9 "V ](o @ @ @1 @! @LL)qS#L0b$C $C$C &Y*K /# [a& Za^y&&#L' & ''''$96+ 7 g %&r x #"&Rv&R&R)b'v&  1$l Il&Ru;& &&#S;&)g#& h$k$k S e$k ~#xxxx?G?'` !Z\\#k ` l d  O ( D # "#"6 o #z#9 D'("$r'',#'^$>"* ' t 8(vN(' =r'=rn$ L5K B 2%!4"-"-"-D< <\9 i 9d 9$A) '# "Cv)b'v  1$l Ilu; tF t t , t !! !\  i 9d 9$)A) '#"$ \\|v}(T#{KK =K z[6!%6c !%x\# # $< -< #<< $Pm  ~%] z z z!x z!2!2%8 L666:6rv!&U \ &&*P< rr t$))r)L(V(V(V$d] ^"px] ^ ^ ];] L ^|'E'v!$ !!.),!!&\!]]*(!%YmYY'R#DY$Zf &7"R&G 0  / ',, %~ "$ K5) i A) G z#!"E)! ' r#"E"E!'){#(_#"Eb   4 44 43'] @ v$W @"'M mQ mu $ _]V)!C@! m %k mJ " 8 g^ g$ g#"j g o o o oS+e ~#KPJ0 ?$ 0 dt"DI)))La(+)>Jaa%a`Df(V" ""5'V,# $>"* !""""N"#'2#(4w$ w."1YYY'RYM#f$#6#"A % O  -(^O  OOj o XAFr /? !o z!(! m r!'(_b     M YV)%&{\WWW`e"|Wz("v)b'v  1$l Ilu;$C( 'C Qa?a Vv$C$$C   +(0S 'x&Y* #6#.# Up''ZT("T((G( x  iI&Ii"l ".. .)))$96+3 7 g$d r$d$d#$d( !h { @ P !C# $/"k v$C'E%)M%!1L$C$C t(g&Y* t/cA$(} WWWZW !W)vKW/ D# ## )`5])"6(q$I  #.#$#s< <>;;;!;'{$z'y''PP&P$`$`P$)$` '# `w ` `&&M3 &!ds!v%)tK ?kK(!)&z f("   M(z %t#! !Ho|!e H$m'j %$)-mff #!O -kf( $ )$'{C$z'''o!f!!<[! 
H%'"\/')!q \"\$ m \U |#(#%k m "    / ?$T=  $T , !lo ( ( (  M' ( v)P" 1("*  &% %lx$"! yv !)b y'&  1$l Ilu; >(((0&=m"r  p)9q&c (q" @m !'*??6$?"A b#X e % m'M 'M'hf# $Zf &7"R&G% y/ $, %~ X "$ K5) H1+ 1++1V1 +)l)l)lp"L"LP(C%.[.% MY&{\N  :A$[ h$8t "L& '*??6(a?"A b#X e %(R /?) !o< <'??$$ b#X e  %"&MT!gr'r j L PK B 2%%w  S! we ~# Q +(0 Df(V" "5'V,#" $>"* !"HH\&q!g'n L 2%($C $C$C &Y* % n lY#(&{\'f %%n%n Un D  # )5])(q$  #.&#(XK Sl'KK*c !f K y Y $P\9 iP 9N--P$ % A) - '# =\%$R('j%)- B'j'j #!O -k'j((Y] / &"$T&w Y Z" #3]]&"] &&Y4$U @$ '$Ue! @66)qe3 S)[60 @Z] '% n   '%Y% !o]]](%&Y t d&%?+j( ( V P ( $\(C'C Qf)b #3 +(0lu;  #PP P$ '#\9 i 9d99 9)A) 9${'#m'Y(*pO  d"D))m&(,M3 &!ds!v%)tK ?kK()&z MMM_M$`#&Q#s%wq h o)_  o S> we ~#vm 1% # '=  q  $x#g "":(&=&^   '.Dm5${'j %)- B A A #"l!O -k A(#*wX_KXH>&B,$#c;|s||& : | 5]$ Zf d!#"R&Ga d!!/ ,%~"!$ K5)" @0 v vy(L z}()A&!$8!!!! )|""}! ,6 'E%www&*( L! %!&&M3 & s!v%)tK_'=K(!)&z f&%t$W1$! %TW$W! ( `e "|WzLz0 C LzL Cz'M"Ois(I'Y(*pD'$'~D$%6D&Q  '.!#LV!g&HY1CYY'R#DYqT"$m%] ^z] ^ ^];] L ^G - P 0  0 dt"DI)))L&!? "1###xxxx]$F6$ Y  V] P]] $\&Y $S'x<E{AL xLp'3$W1$!  $$$!  `# "|$z4!$ '! ||)q3!S#!)[|0_[8$R(p%)7%va @<A.'%! $*&(Zv /$vA4!$ '! $$)q3!S$+)[$0(&zc|s|| :|]%! M "!Y ]]V]&Y'q%)  :; [;;_K;/"A V%" ! L0 C Ls$Ld C&:\  i 9d!! 9$~A) !'#'` X"X , NZcAw+K F(ia(+a$aa`v W$%*^^^0^] m& mY m]]]" m&YP $IAp!2 l  $$"n'&JJA Mt~$.8!! ""} , f["2!V+!!$! n8 r"#N o K%; z)]%! M "!Y ]]]]%C&n&Y]'??$? b#X e!& z#!"E)!  r#"E"E!'#(_#"Eb   <&~ $^$UUU= &%{ ; d$9!g'6+"n 7 Lg 2%K F (i =r))'??x$ b#X em'"."W#fM?#?"W"W6#$"A" b#X e"W %f!g'n L 2%46(@ Vr)( ('V('$ &(' fVVV"5'V @ '!" !""R"&"f%2H HH"nJ~H."d  : !2t! i-A) D'("'',#$>"* '$C $Cv%a vva$C$$C(Sv 'x&Y*t 6[ ![[v&. y&.&..c)&.!/\!](Z/Tc; s &  : 3 / $T=  $T , !oW%! "W$W!  `e "|WzX eXQ {]%m Y V# ] P]]%] $\&Y&5&l!G m7"o ( %= <$ !))  E" <Z] ^zw! 
^];] L ^*1&&/M3 &!d<s/!v%)tK ?kK()&z <'<<<$9# C6+y{+"ygz)FP$J0 $YY 0 dt)I))Y)L K$| [&$|$|^y #L' $|I'Gf+Hd'G Q' g" 6"SfR>   %''''b  m}U%#m 7## %] #!h {# @ P !C# 1"g1%/1 1   " W%211"nJ [ ~1.m} [lll @(T! %S )qS. 0W:b%[p$ vD T(vDD%@v$xTD!%-]%! M "!Y ]]a]]&YKX /$|d [a&3X$|$|Zay& #L' $|&  %#a99<w<<-<:%,:^)we d d$ 4  3%Q)E ')WWW(`e"|Wz[$C(" "'C Q "'h)b " #v$C$C$< +(0u E&Y*(8",2&F$C( )<'C Qa%a #v$C$$C   +(0S 'x&Y*%[ mCA%[!2 l"w m"X"X"n'&JJ Mt~"X.!/(LT'A& %jDd''[d(V"&V'"5''V # $>"* !'"g$3"DPW%! "W$W!  `e "|Wz#Z': V 'H VmZZ#[Z (   "#""%oN"C Q}(+j( ( V P$\M( $\$\] '% n  $E'Y !o]]#](&Y )(ys&X(!#"$B?AD D # )"#"6(q o #z8} O K%; ]]!%' W !%Y% !o]]](%&Y$###yk&X)$ )$)$$R!)$$'*"U)@$9# 6j+)"{j "ygIz)F"! D%2F!u2&2'2 ]V* $]) ,% )<)'' %)t"' %2 $IA!2 l  $$"n'&JJA Mt~$.8##f ##6#"A0 %'"W Y??6$?"A< b#X e % [&5%$"  :##'q'q~'q%&! $C'E$C%!1$C$C$C g&Y* D K(X /$|d ([a&3X$|$|Zay& #L' $|& ( 5xmmh8 uI&  'k u. z[66c !%*6 ,66:6]  $ M    $Y$ !o]]]($&Y!&#-;Df'("'''," $>"* !'"K$| [&h$|$|^& #L' $| ) .% N"%k8!(X Sl'M&"Z*c !$#7'Xy Y;$Zf &7"R&G0 / $, %~ "$ K5)]"'% n "  '%Y% !o]]](%&Y c] ' ]% M qY]V]&Yw''$G'cccc4! @$ 'e!$ @||)qe3 S#!)[|0# /'a!KaZa& ]%wm .Y V ] P]]%] $\&Y&iO&i&K'v! (!!.)P(C!!\!]]Pm" m{' ' s   % "%h m OYO V  P%] O$\&{\E,EEEX$Cm$Cv% Y$Cvv $C$C()v g&Y*) M)()`5`])$n `. ;);;]$';.4a*"$r Df"(V"5" "5 'V,#  $>"* ! ")-!A -k$Zf &7"R&G)0 / , %~ "$ K5)` /'"/`RRZ"0 rR&]% M N%(dY(d ]] |](d&Y(o$$"# $ $$%$%l)%t%)) )"()&R yv)&R&R)b y'&&  1$l Il&Ru;& mIb))))@]% M qY ]V]&Yc; q"% "F& I : !g' J$n$ L 2$%# C{"yz)F  f  "# o z @#! !m("r$ ?(( p) W9(q&c (q~~4$'! $$3!$+)[$9I&IIOI @$VZ:' r yQQ& \ { Q &UR6 2 M)6$C( R'C Qaa #vv$C$$C $< +(0S 'x&Y*#$$R(%w%% o)oo o S weo ~#%#]   M   Y ]]V]&Y0'&='{#$z'y v''#{!%5# Ho8 )% " &k %< $o A!2 l) l%Mtg)S%?'G6 (# +%-#) "%Ac " j/ z#!"E(! E r#"E"E!'#(_#"Eb   ///#/6= 2(eQB M)6XI&%IIiOIW/&"( + n8 r||"#} o K%; |z#; c% $2R$2$2 $2'g m |#((%k m" vQ y.c)!/\!](Z/! !! ! T! #$Zf#"R&G $ / ,%~"$ K5)&x Hx]  6 Y]&Yv y.c)!/\!](Z/]   M \ Y ]]V]\&Y! 
@O vP$W @;"'L"""&I #G!?$ `w ` `&`%/ Vx HJx@@@K$| [&%$|$|^y #L' $|  : %pxipxxppxX+w>w  #6 l &a )4'#U##4!$!g# [ &%?%[&H)!T^)@@&W@ncnnJn%2$}A!2 lQ l$$"n'&JJA Mt~$. '8 r"#N oO K%; zM&]5$96+ 7 g %I)) %t\)  \%(PP&P$`$`P$#$` '#"C yv)b y&  1$l Ilu; [V J VE)" ! 'ZDf''[("''' " $>"* !'"@%%k%&X(% / (!t(( ( +s (!o '?? R$Q b#X e#E#E(#E'#U##4!$!g#n&q$K F (i =!:s  W)P&#    d"D))m]%! M "!Y ]]]]%C&Y] g}"}}% }X)z)z #=%s(Uaaaa v)M ) '>#)D @St! &2 cG*+RV 0] }L# ]&~ /!&~"$TYZ" #3]]&]&&Y@ %r ($($($2n Zm +3( + +&&! +"4Df("ee'," $>"* !e"#{!%5" H8 )% " &k %<6|||@Z|o tHK /' [a&Za^y&#L' & q$ # >EY$(nA(x(x"(1N"{$ `V ` `(P `t /tt9t3_vv(X! Sl'! ! *c !T !  y Y _%m GYG V  Pk%] G$\&{\&{'*"U)@ 4%)H!4!#G'e!)qe3 S)[0[#a'79i[[&9[%w  S[ we ~#&$ & P )h .( | dM ;$#}"3$$"&/I $"\!,( | j$ _#|) m (;!< &|& v$$$`!2Wt&R 1$l I_)i )Q%& &u (U))'"W Y??"W"W6$?"A" b#X e"W %@)000%$#f '8&Kf"#N oO K%; z ~ ~ ~ XA' ~0! $S'x " ""   "( L!a<<F<t Rw#~)#  MSO . O  nOOj  #S? z r % %lJ)JJ" =J#s : m"< mY m" m&{\3 & \&&a D # '"#"6(q o  #z&{&%%% ep$ppp"8(X Sl'M*c !$#7'Xy Y R DPdb(YH$?! 'wH$!  ` "|z!!!&1!&l D # )"#"6(q o #z' D ) r & M{3 &!d s {{!v%)tK& ?k{K()&z %z6#.  e  $H c;|s[[&T : [ M% '1V'1'1E) ! '1'Z&v!&&&*1 /&$T=YZ $T=&&{\4!4'! )q3!S)[0 #{!%5Z# HI8ZZ )Z% " &k %<$Zf  &7"R&G  I / &, %~)y " $ K5)HG); ""^\ $ ^!\ j$ _* #|)&Y&I$%1 #  Y i(N >pa >pp!p ((] !%'  !%Y% !o]]](%&Y%w " [ w "% w888 8'%# dU)I## %)4&t!$!g# Ss l$}As!2 l l$$"n& &JJA Mt~$.!w@ h&'f)S(Sl#k  e  H& $ '"W#f?#?"W"W6#$"A" b#X e"W %rv!& \ &&* #&Y1CYY'R#DYav#Jb? '8 r&&"#} o K%; &z>( !C yy!!)b yl!u;2 kVI! 4 t2#v&X(%w!#"$ 3$?! H$?$$?!  `^ "|$?z#/''*D)@^$Zf #&7"R&G !. # #/ ',, %~# "#$ K5)NN"A y&P!!v'Je V&UE)" ! 'Z#"v&R yvvAA)b y'&"  1$l IlAu;]w ^wY]]V]%&Y&&M h &!" )s h h!v%)'5QKp _'= hK(!)&z f/#a9m&9!g'n L 2%&v!&&&R+r & \&&*   q  #)DK /$|d [&h$|$|Z^& #L' $|& C)blu;T DO# 1$C'E$C%$ww$C$Cw&Y*!! 9!If!+d Q' g" $e8!""}!e//\\\| m>>#>000 0v&.Q y&.&..@c)&.!/\!](Z/Y)%4$'! $$3!$+)[$j'K'K'K'K0'\A!2 l'&JGMt'W?'qM#7MW#"b&0 2)&*{v#G!  
]$.$%1& #   |?&&fY)&{\xm%mh8 u %#m  E##  %] ##k DD D"2( / "2( O(S s !o#!%'r) K B (U#"y#""#%#" # #Y C$?`"|z c'E%xx(x&*$!A%)- B!A!A'#%!O -k!A(%Ph m=" 5;tb!  ;n >$8$, n'E% &* ~'$~$%6&Q /&YZV)&&{\%$96+y 7+ge Mb$fVVV"5'V  !"px#VpxxppxtBt C' v z#!"E)!  r#"E"E!'){#(_#"Eb   ] Y]]&Y!d]  BQ F~ ' ]'Df"(V"&"e"5e'V,# $>"* !e"%)- B!O -k (WnPnnn#$'#g | p &)) $  Q +(0,L*%) )) %)t $C V U $C$CT(&Y*v s!vv(v ) rZ$u(r!}~6  %!O  P ^P P$  '#K & /# [a&&Zay&#L' & 9%.!h {%.%. @ PI%. !C# /  mg  mY m ." m&{\$9!g'6+y 7+ Lg 2I%Vrv!cc& \c&&*w( $9# + 6j+!+"{j 7 "y Iz)FHG6 > (# + >%-#) "%Ac [O"1( D(D&  .&&( |&8&#tC &)') $  /&YZV&&{\ d"O7"O'"Wh??"W"W6$?"A" b#X e"W %'*"U)@ D((%:"bxS S S ? ? ?> ?)$LO#=>&$)]  % M  . %Y% !o]](%&Y]v& t 9(#v!f!!<[!5$0T#&^g  Y   &Y&%j(""'C Qf)b #$< +(0u E&*t^I]I !p &j0T0%2$A!2 l)' ' "n' lJ"Mt~' .!H'&]%!) M "`Y ]]]_&Y(4a*[ " g d k d'r K B (}' H} # X  '7( L tm {U% '* '$R(!( R'C Qf #$' +(0S'x&*%2LA!2 l)"n' lJGMt~.Df''[(V"'"5''V,# $>"* !'"$V (  "f!""(U@"$#$%w  [ w 8sI( Y  YYY +5 V P$\(((l('%e$q "# oz%2 A!2 l)"n' lJGMt~.#:i#:#: #:)nd v$C'E%)M%!1L$C$Ca tg&Y* t"  \\ | ZwB  L's&&=]   M \~Y ]]V]\&Y"Y#^^$ /llV v)M <&w'M:"X  z'( )+ z z%$(k z>:(yn{88I'#!= IS Q + Q    V &a )48 } O K%; ]o $(!",2&!"  1'D'D'D y""""""Z"")KK#K2k f%X /' Z$Z & s'v! (!!.)P!!\!]]P(W+++V+ l$& `(n UGn ` z!)<)!  r)<)<!'(_X)<b   |"=Z'r> K B (U4nRE{'j%)-'j'j #J!O -k'j((Y!!  ou o oSe ~#  -  j D'("'',#$>"* ' X!??? b#X e]%% M "%Y ]]V]&Y  fff"# o!fz F"%8 6#.( y(((l( zX m) ' j I(E$wK F#@ D $J(# )$J5$J])"6(q$K  #$J.yF YZP<#]0 "M<< 0 dtI))<)L$96+~ 7 g c(%o(B'rM:W2   l &a )40-_! ! 9( t#8K(X /$|d ([a&X%%Za^y&E#L' %& %2 J"nJ~.$9# )r6j+" {j 7 "y$Iz )F%"(4a*X) KK#K""T" /? (+ S s !o @ {Sts! { &2 cG*+R ' ()! y0'#ww)b ylwu;$$'$,)] ^ I] ^ ^];] L ^ %2("nJ$' .[~.."]K )``]$!`.$Zf#"R&G / ,%~"$ K5)$96+ 7 g ~~R( n8(&"||"#} o K%; |z!(B )C vv]%b'9u (""'C Qf)b #$< +(0uE&*6 ,%66:6v& t! (#v4(!4'(e!!!)qe3 S()[0> h (W  U L j% $R(] ^] ^ ^];] L ^ MN&Yr 6(, D( 'C Qf## #$  +(0S#'x&* &q$o'v! 
!!.)!!\!]](v%"U"sU]&"H' k$!)0 ^ iH$C'E%%!1%$C$C%rg&Y*.[(  !g''r n  LK B 2%  ( /[(#E] wF&' ]V* $](IA!2 l)' lGMt\ 9  iP 9 9P$  A)  '#  (   ')3 p)c)K F(i m%k mz%(T#& c%oFPd#b]"|#  '0't#f$w?#? t 6#$"A! b#X e %]Y#&{\#/'*"U)@ /!:? (!:!: S &es !:!o z z z& z,%##m ##  %] # m)%%%  S($>$y$Y)LLL 6$0L& %&  P<#]0 "M<< 0 dtI))<)LDTDD$xTDTT'#U&D)n##4& n!$!g#[%g)-%f ( M3  & \\ &'w*p d d< d'r jK B (U<&X!#"$)))') V" ! c| s|| :|] /!"$TvYZ" #3]]&=]&&Y)Yr3I $r!D!!!#V#If!+d Q' g" ^  ""   "g %%%:2$`O| v&_%c;|sff& : f $ M l$Y$($&{\ o!  &a )4%9 !g'rn L5K B 2%'r jK B (U(%URU&H' k$! ^ iH 6 M6Y6:6)&{\# # H%'$ "\/')!q \"\###&j# ($R(A$Crv#&U)\$C \$C  &&Y* !)&& o#!"X z'(  z z%$(k z>:(ynP$&(C%%h m EYE V  P%] E$\&{\ )q / o$T= Y Z $T" &&{\")2!)2 yv !==)b y'&  1$l Il=u; (   '#U&b##4&!$!g#!Z e$##f ##6#"A% %%.")8''|!"=cb): V=( f$(4>#< v $&, E v z#!"E)! ) r#"E"E!'=#(_#"Eb   <% -9<% % < <% a(+Jaa%a`hohhSUJ $F(K=# o@j jo"o{j"yIzo)F W$mYY# uY($"Gm# && I$$$)L( a  $C( %'C Qaa #v$C$$C   +(0S 'x&Y*$Nv$)P" 1$&dK F (i =t G$ t<u !h {# @ P !C# c|s|| :|' /(ys /&YZ)&&{\$ )Y&*$Crv'&)\$C \$C  &&Y*  )/ )))3:((t 7)($%)3%w h S> we ~# D $JO# ) $J5$J])"6(q$K  #$J.P<' << t))<)Lq )qqGq# H )cn +N'%I# dU)!I %)4&t%a!g $ff)iHv#& ( |  %&47!#$V!$&KK b[ ' WWW$W\\"&Rv&R&R)bv&  1$l Il&Ru;!w~ !WW/ / ?$Tk  #3 , !lok  z#!#)! ' r###!'){#(_ ##b   8!""}#$DI  ) r]   M " Y& ]]F]&&Y% 5) %)t   _!< &|sc; q"% "F& I :  D # "#"6 o #z  / )3:((t($%)3}(}}} 9@ iA i 6  2 M 2# @$! %!SSS1S&c&q&&M h &s h h!v%)tK _'= hK(5!)&z foh !f$!!<[!!Z e JJJ" =J#($!(r(r(rBZ(r""ii)b$iuE&* !h { @ P !C# . yx y y"" "% _) iA) X#=#$DI!,  ('#U##4!$!g#  yC y'@)b ylu; #5$'$Cm$Cv% Y$Cvv $C$C()vg&Y* i9A) 'j %)-Tm'j'j #!O -k'j( g'F!! 9!!!g$A'$A$A L9 2$A%'%rn L5K B 2%A!2 li'&JGMt5#"v '&  1$l I$ #|$ ] Y]]V]&Y?KK#Kppx#Vpxxppx $^  %q o L|  &a )4%P b555&i#~  MS* %   )5])$ .$Hv{ $%j)iI&)/i[\[[!{[" ] ^&] ^ ^];] L ^A!2 l l%Mt3%a /'aa!4!4Za  !4& X__ _2)? 2&2'2 ]V* $]$Yt tI q (ys%&#Or (X! Sl'&C! a*c !T !  
y Y ] ;& &H ~ \  l)V&%?P00(000 0tI0)L' I# dU )yI## %)&t!$!g# Y))%w oo S weo ~#colonOrSemiColonDelimitedLocant_76RegexHash.txt000066400000000000000000000000121451751637500435730ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-118920414colonOrSemiColonDelimitedLocant_76SerialisedAutomaton.aut000066400000000000000000007445661451751637500457000ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp}"ur[ZW 9]xp"pur[C&f]xpM !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xp:/vculjDl/muljDl/uQ xZx{   v   ../vculjl/mujl/uiir11RR v        XX v     &&UA]UUA]UUA]UUA]UJJii**d%5K25deYy>5d}}(000D**DDDDDDDDDDQ xZx{ R R222nQ..w ;3( 8w ;3( /vculjl/mujL]l/uEEii/vculjgl/muljgl/ud%5K25de%5K25d??HAA  DxxvFFqqDDMQeaMMDD**d%5K25de%y25d&& g--d%5K25de%5K25d=]]===@=======w ;c3( 8w ;c3( Q xZx{`_$_`_$_55!!lnQmmGG    Ml MlnQ/vculjl/muljl/uQ xZ.x{#Pc'PPc'P|w ;3( 8w ;3( d%525de%525d4400UA]kUUA]kU0\\qqUUQoQo?pp.. 
v ''kkqqqqqqqqqqq}}WWQ xZx{d%5K25de%5K25dyyZ%%..0""d%5K25de%5K25dSSnQ**h9N<P$S\h%5K25II(0(((0(((((((0nQ$$            v         v          WWv T-  T- ::EEdd/vculjl/m/yrl/uv))`  $$CCd%5K25de%5K25dIIQ xZx{Q xZx{MQMMMMMMMMMM v qq/vculjl/m/lyLrl/u**ffOOZZvff  w ;3( 8w ;3(  d%5K25de%5y25dWWFFeerr/vculjl/m/yLrl/ud%5K25deYK>5d22CCBBZZPPr## MMOO~~Q xZx{/vculjFl/muljFl/uQw ;3~( 8w ;3~(   Q xZx{IIRRRRd%5K25de%5K25d=@zz}}}}Q xZx{**GGw ;3( 8w ;3( 55==II__;; ""ggY04433 v          Q xZx{77?pp==d%5K25de%K25d%%  /F[ </[ </ppq/vculjDl/muljDl/uffII  88/vculjl/muljLl/u..DD77uuQ xZ.x6{MMKKWWjjw ;3( 86_3 TTDD/vculjl/muljl/uQ xZx{''HHw ;3( 8w ;3( qs}Gss}Gs55zz55w ;3( 8w;3( @@nQZZ66          [[w ;3( 86 ;3( &&xxQ xZ.x6{w ;3~( 8w ;3~( nnKK>W>W77w ;3( 8w ;3( $$ee..#@z{,9dAax^$^$nQNN^ssn= pJ  V2YYjj'' v       nQ>>kkd5K25de5K25dqqF<Q xZ{00nQd%5K25deY5y25d ;3( 8 ;3( /vculjl/m/ljLl/u,,nQ  11Q OxZOx{99..nQQ*Vb!bVb!bd%5K25deY5y>5drJJ  vIvI00##0//   ((??**?`2::2222222222 v  [[gg!!       /vculjl/muljl/u55w ;3( 86 _3 ++00lZZllllllllll/vc|ljl/m|ljl/u  d%5K25de%5K25d!*!*Q xZx{d]  d  dw ;3( 8w ;3( ''     ZZ v qqtw3H   FFQIUN |1+w ;3( ??d%5K25de%5K25dnnF+F+ ~V 0\8# uljl        SSllL{t^o)b ++?ppO O ''XX/vculjl/muljl/u~~iiCCw ;3( 86_3 EE0nQQ xZ.x6{TT:;;'',,b--b--  |||||||||||ww;;55    0v00oow ;3( 8w;( w ;3(* 8w ;3(* ((..((!!WWY0YYYYYYYYYYmmWW}}Q xZx{YY$$ww))YYaa   ::d%5K25de%5K25dZZv!vv!vXXhh**44ll"7 BLt!!     
==NN/vculj+l/mulj+l/uBB/vculjl/muljl/unnrnQOO vcolonOrSemiColonDelimitedLocant_76_reversed_RegexHash.txt000066400000000000000000000000121451751637500456310ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-118920414colonOrSemiColonDelimitedLocant_76_reversed_SerialisedAutomaton.aut000066400000000000000000007507511451751637500477270ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp)ur[ZW 9]xp)pur[C&f]xpM !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpUJJII##_QQQQQQBB w#_Z00G000000nn  0KKvvjjjyyzzJJJJww(22jjjjjj$44ddii jjjjja.R022jjjjj995#_aaa9aaaaaa6kTJ-T4Ab))AA=<   QQQQQQjooLc(,(,  =<jjjjjjjjjjjXX jjSjjjjp\$$-STW(((((((((((??//}}QQQQQQ5#_D]U%@RNakkkkV=V=W kkkmnW//qjB:B:t.R..........5t0  aaWWyy}}  YY<<vvXXXX ;|xffi2i2 jjj``CC''//ff_||[[BBgg**BB=o=o5D]U%@RNHbbj?N"N"ZZ;;nn 0]aWfun22mVXQXQXhhkkoo""ggJJyyXXYY}}aaa??PP{{HVff""0CC2222 jjjaaauu&*FX`@U1&rOOhh#_aaPaaaaKVXXX6699_vv$$#_aaa5t#_XX BB++V 0]aWfuQQQQQQQ$$$:$$$$$$:QQQ#_KK  F$$ii''__QQQQQQQQQQQaaaaa5t]]NN>LcELLELLLLLL((vv44~~GGFsss(ssssss(HDD$$?Q5.RW#_aaaaVnD]U%@RNZ x  x      ,,ee_&*FX`@U1&rx>x>jjj//XX88V 0]aWfuMMdvl0$l0$Vccs//dvXX!!J7J7wwBB''jjjk++{{ll5tDDkkXQXQXRRyy-- ;; 11////_QQQjjjjQQQQvv //jjjjjjjjjQ+n'bs2K#77::aaaaaaaaaaahhk[[dd$[[ccns`` V 0]aWfuBBBB&*FX`@U1&rTttppmmOOVjjjjjjjjjjj))I<r"^8!   Mee_QQQQQQQQQ% %tO3A ^zQuu5JJ22..  snn#_aaaaaQQQQQQFFUU_QQQQQQ\\T=<I==I======&*FX`@U1&r$$''s*s*%%I<r"^8!   M_0''HHHrHHHHHHrmllI<r"^8!   Mjj_QQQQQQQQQQ jjjjDD-SW EE3''q //jjjjjjjjjQ+n'bs2K#3#_aaaaaaaaaa66..''=<CC?a~qqaaaLc5t  &&aaaaaa!!88@@ggnn5t&&Lcdv  SSj#_aaaEEee jjjjjjjjjj  11GGjQQQ)) 5#_Zaaa9aaaaaaG6kTJ-T4AbQ~000z000000zs*qs*V 0]aWfu&*FX`@U1&rWaaaa22ZZkkaaaaaaaaaaa \\MMI<r"^8!   
M))jjjjjj\\{{>>LLnnKKdvpdd~pdddddd~XQXQX_QQQQ  kkcc ;|x>yy33LL55-SH$mH$m?YY||IIQQyQQQyjyy,,BB//k#_aaaaaaa!!hh77  ## vvvvPPJJ&*FX`@U1&rI<r"^8!   M_QQQQQQQQQ %tO3A ^zH)H)_QQQQQ))))D]U%@RNnn_QQQQQQQJJ5''D]U%@RN' #_vvaaa9aaaaaa6kTJ-T4Ab// m^^I<r"^8!   M-S--------$$$$jV 0]aWfuaaaaaa$)) jjjjjjj))#_5D]U%@RN(fusion_173RegexHash.txt000066400000000000000000000000131451751637500367470ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1946653277fusion_173SerialisedAutomaton.aut000066400000000000000000000113051451751637500410250ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp)-ur[ZW 9]xp-pur[C&f]xp'()*,-.01:;<AD[\]^ah{|}~ur[IM`&v겥xpe&  *%#           ,,,, (&+"""''''   !&"$$$!%%&!!'''''((#** ++ ,,,,,fusion_173_reversed_RegexHash.txt000066400000000000000000000000131451751637500410050ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1946653277fusion_173_reversed_SerialisedAutomaton.aut000066400000000000000000000124431451751637500430670ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp3ur[ZW 9]xp3pur[C&f]xp'()*,-.01:;<AD[\]^ah{|}~ur[IM`&v겥xp***!!!% 2! &+   '*  ** &$.$. 
,)'....1*# ** " - *1-**###, !"#/&....%%&'*)0(**)0(002+ ,--/..../"""0(12+fusion_70RegexHash.txt000066400000000000000000000000121451751637500366620ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1685997369fusion_70SerialisedAutomaton.aut000066400000000000000000000034221451751637500407420ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xppur[C&f]xp'()*,-.01:;AD[\]^ah{|}~ur[IM`&v겥xp          fusion_70_reversed_RegexHash.txt000066400000000000000000000000121451751637500407200ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1685997369fusion_70_reversed_SerialisedAutomaton.aut000066400000000000000000000035631451751637500430060ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp'()*,-.01:;AD[\]^ah{|}~ur[IM`&v겥xp             indicatedHydrogen_101RegexHash.txt000066400000000000000000000000121451751637500410560ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata2057999932indicatedHydrogen_101SerialisedAutomaton.aut000066400000000000000000000026111451751637500431350ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp'()*,-.01:ADHI[\]^ahi{|}~ur[IM`&v겥xp      
indicatedHydrogen_101_reversed_RegexHash.txt000066400000000000000000000000121451751637500431140ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata2057999932indicatedHydrogen_101_reversed_SerialisedAutomaton.aut000066400000000000000000000026111451751637500451730ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp'()*,-.01:ADHI[\]^ahi{|}~ur[IM`&v겥xp      isotopeSpecification_255RegexHash.txt000066400000000000000000000000121451751637500416270ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-910681680isotopeSpecification_255SerialisedAutomaton.aut000066400000000000000000002144431451751637500437160ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpS !'()*+,-.01235679:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpE|ssssssss~ Y<<<<<<<(dg@dg @XZZQQDDxxxxxxxOheOOheO Y<<<<<<<(dg@dg@XYjjKKttyybYbbbabbbbbbbY11z={zz={z|sssssVV Y<<<<<<<(dg@dg@Xbb Y<<<<<<<(dg@Vdg@VX5fEEYFQFQVVllo4%}8oo4%}8oRR Y<<<<<<<(dg@dg@Xnn Y<<<<<<<(dg@dg@X66GG00HH""ttVVY<<<<<<<<--rrrrrrr.c..c.55 Y<<<<<<<(dg@dg@XYwwVVY66 Y<<<<<<<(dg@dg XYwwbbyyYssssssm&!*,u\L3`AdT@7NC7NC RR Y<<<<<<<(dg@ @X++99bYa:: Y<<<<<<<(dg@Hdg@HXyy[t[tww''l))rrrrrrrr|ssssssssllxxxxxxxxZZpp $iJU MB |^^ Y<<<<<<<(dg@dg@XHHttkktt|ssssssssRRPP22IIRRww Y<<<<<<<(dg@ @X|Y Y<<<<<<<(dg@dg@X Y<<<<<<<(dg@ @XQQwwSS|ssssssssrrrrrrr.c..c.Y;;YVVW//RRjj Y<<<<<<<(dg@g @Xsssssssm&!*,u\L3`Adg@]]Y##??qqQQ Y<<<<<<<(g@g@X Y<<<<<<<(dg@dg@X__>rrrrrrr.c..c.VVYQQlllllllllll 
Y<<<<<<<(dg@dg@XvvVVisotopeSpecification_255_reversed_RegexHash.txt000066400000000000000000000000121451751637500436650ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-910681680isotopeSpecification_255_reversed_SerialisedAutomaton.aut000066400000000000000000002132111451751637500457440ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpS !'()*+,-.01235679:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpE$d YWasssssssss6c(gLU HH3` T%%%%%%% TTx TTTTTTxWssssjj||PPJJ$WsssssWsssssssG~G~>>T%%%%%%%+t+tVVooYWasssssssss6c(gLU T%%%%%%%sssuu]],,**II,,^^,,}},,QfffffffDD[.qk.qkiiFF??ssssszzzzzzzw0w0;;Qfffffff@@T%%%%%%%sWY[B[[B[[[[[[Z[::$ml\95R 'hv!<W1=81=8WsssZQfffffffssssss77$m\95R 'hv!<,,XXe$d""zzzzzzz{{ &&WsssssssOOsssss$dW$Wm\95R 'hv!<sssssssssss_n_n//y 3`KK,,,,s44ssssssm\95R 'hv!<QfffffffAQQAQQQQQQEE--CCssssssMMe##$m\95R 'hv!<m\95R 'hv!<sss$d^^SSsss3`zzzzzzzNNzzzzzzz$r)r)22Wasssssssssc(gLUpps[$debbWss1s=sss1=8 WssssssssssisotopeSpecification_256RegexHash.txt000066400000000000000000000000131451751637500416310ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-2072431106isotopeSpecification_256SerialisedAutomaton.aut000066400000000000000000002420111451751637500437070ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpDur[ZW 9]xppur[C&f]xpR !'()*+,-.0124568:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpPfK1ipi1ipiH``````""``````""""""""o``````"K1ipQi1ipQiHK1ip%i1ip%iH``````"""``````" 
mm%%##K1ipi1ipiHJJ"""""88nnvv``````ww}}K1ipiuiHwwwdd"=4oooo 4ooooooowwwsso``````"""""""""""K1ipi1pciH``````"""""""PP?9999999???q???????wwwwwwwww""""""ww?qwwwwwwwK1ipiuiiH``````"""""""""""""z.ATaW{R^*M51ipiwwwwUjUUjUwwwwwwwwwwwwwww%%oX3C <6""=[oooooooffFFK1ipi1ipiHo""""""""``````""""""""""o``````K1ipi1ipiHK1ipiuiH%%hhhhhh22KipiipiH ''hhhhhhhFF__OOoff4eeK1ipi1ipiHw"ooooYooooooowwwww//VVK1ipi1ipiHwwwww=4%%w~~77GG``````""""""""%%``````"wwwwwwwwwwwwwwwwwwwo``````""""""wwwwwwwwoooooooooooK1ipiuipiHGG&&xr-N+E0|>:@wwLIwwwg y )g y )wwww """"999999, ;$ , ;$ oSSSSSSSX3C <6""=[ooooooo  wwwwwooo=4ooooooo""""""""``````"=4wwwwwwww??"""``````"%%oooo=4oooooooZBC <6""=[``````"xr-]o"``````""""""wwwwwwwwwll"""""""%%kkK1ipi1piHK1ipi1ipiHwwwwwwww((ooo=4ooooooo??!!ZBC <6""=[vvATaW{R^*M51ipihhhhhh22oo\=4ooooooobhhhhhh22"""K1ipi1ipiHK1ipi1ipiHVV==4ttisotopeSpecification_256_reversed_RegexHash.txt000066400000000000000000000000131451751637500436670ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-2072431106isotopeSpecification_256_reversed_SerialisedAutomaton.aut000066400000000000000000002315241451751637500457540ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpR !'()*+,-.0124568:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpL<Tm"~PyE!Y6x|oeeeeeeLLpp99>>>>>>00duuuuuuuuuu,,wwTm"~PyE!Y6x|oT#WW))II11 #??duuuuuuuT# 55+-dDDuuuuuuuuuC%N/a}Zg@Sduuu[[ KK``::TppuuuuuuAeeeeeeMMBllllllduu&=uuuu3=T#dXX kF kFeeeeeehchcduuuuu22Ruuuuuu-BidRDDuuuuuuuuuBVC%N/a}Zg@SBA_u''>>duuuudeeeeeevvvvvv\^Hn74bquuuuu]]i+++++++++JJ_llllllT#\^Hn74bqT>>>>>>uuuuT {{{{{{zzuuuuuuuuuuu_XXXXllllllrrppllllllWW+ Tuuu(j;s(j;sTGGXXm"~PyE!Y6x|ottXXm"~PyE!Y6x|oppppuuuuuQQXX**d3=3=T<<Tm"~PyE!Y6x|o$$Tdm"~PyE!Y6x|opp+UU 
ffffff\Hn74bquuuuuuuuuu..duuu88uuuuvvvvvvXXOOppT# {{{{{{lambdaConvention_161RegexHash.txt000066400000000000000000000000121451751637500407230ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1658814986lambdaConvention_161SerialisedAutomaton.aut000066400000000000000000004103451451751637500430110ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpM !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpqOO $  $ _$$$$$$$$$$$[BZ s{{Zs{{Zg""{{O &&&OO###VVXXYYPuPu#PPJJ]]$[nn!!BZ s{'{Zs##x####{ZgII--II[[f[22fff[fffffff[55__mmII44u Qa|8QuN Qa|8Qubww++((::u CpkLuCpkLuOPPu Qa|8QuN Qa|8QubUuuWWKK`[00JJf[[[ @d  @d KK``#####u Qa|8QuN Qa|8Qub--u Qa|8QuN Qa|8Qub  BZ {'{Z{'{Zg zziiBZ s{'{Z{#####{Zg9########@@2 1 1 `[``````````^^,,AAMMzzvcc-=?S=?Sll4&X,m}=+? 
BOffu Qa|8QuN Qa|8Qubu Qa|8QuN Qa|8Qub((u Qa|8QuNAQ|8QubOiPPOO#######JJ^^Ob##xxBZ s{'{Zs{'{Zg;;nn[eeu Qa|8uQuN Qa|8uQub<<//SSu Qa|8VQuN Qa|8VQub!aqaq###PPGGzzBZ s{'{Zs{'##{Zg``UUcvvkk..yy55OBZ s{'{Zs{'{ZgZwZwZUUMMVV[BZ F{'{ZF{'{Zgu Qa|8QuN Q|8Qub-=?S=?SDD...........{{**OIIII\IIIIIIIOBZ s{'{Zs{'{Zg2 1 1 u Qa|8QuNAo|QubKK   ((99{{9yyBZ s{'{Z#'####{Zg922czz{{//u Qa|8QuN :8Qub!!9""~~~~~~~BZ s{'{Zs{'{Zg>J>J##U#####&OVVff.uQa|8QuNQa|8Qub44u Qa|8QuN a|8Qub{{BZ s{'{Z###{Zgu Qa|8QuNAoa|QubllJti'dLhRuQa|8QuNQa|8Qub**uuBZ s{'{Zs{'#{Zgu Qa|8QuN Qa|8QubBZ s{'{Zs{'{ZgO66u Qa|8QuNAQo|/QubBZ s{'{Zs{'#{Zg//I\JJtthhuuAA\\BZ s{'{Zs{'{Zg2 1 1 ""[cqquu#6T#6TOuu$$OOC44ll%%rru Q|8QuN Q|8Qub#####77O-=?SW=?SWjj88``44zz##cccbb<<EEUUUUUUUUUUUBZ s{'{Z{###{ZgffBZ s{'N{Zs{'N{ZgHHTQgrEQTQgrEQBZ s{'{Zs{'{Zg00c"%}oGp3>)##s{'{uu~~|/|/PP//BZ s{'{Zs#'{Zg$$YFeDRs7 j1 Qa|8Qu Qa|8QuN Qa|8Qub))HH;;]Z3Z]Z3Z44BZ s{'{Zs{##{Zg ###lambdaConvention_161_reversed_RegexHash.txt000066400000000000000000000000121451751637500427610ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1658814986lambdaConvention_161_reversed_SerialisedAutomaton.aut000066400000000000000000004331201451751637500450430ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpM !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xp}}kk.RRB,,\,,\ `77111111LL( ((((((((((DD`(`(`ffMM kkkkkk;FqS9a-@Du_{ :8Xr8T*xgV3ux&n811=111111=----;FqS9a-@Du_111SJ77]].v..v...... !!zBddiddddddi8GGGGGGGGG)) 22CCNNkkG11111111111SJ llZZss^^SJ; ff}}#X111111111111$ ycGV~OH* Y1CCH PbLL591=tBK 0 II111111111111$ ycGV~OH* Yarrr%rrrrrr%111d||h~{iior|||| . 
N NQQ#111111111111111 ycGV~OH* Y|h~ddbb111111111111$ ycGV~OH* Yo::8#XTSS"8TSSSSSS"81111|{55}}tteeYY11111rT*xgV3ux&nX{==d111111111111 ycGV~OH* Yr33AA KdXcXc/#111111164+SS+SSSSSSee1>>''#;FqS9a-@Du_--EEbbFF11111111RRvZ2O2O#11111111111y ycGV~OH* Y111mm#111111111111 ycGV~OH* Y##oo Q Q''__P|\\--G8GGGGGGGGEE66<<<<::vvvvvvvvvvv{4$P$Po{`]]G44SJ+SS+SSSSSS?^2^2^kkWW''RR//>>ll  {{111 ;FqS9a-@Du_#( 'g'g#111111111111 ycGV~OH* Ykkww :  }}}}#IIFqS9a-@Du_hh111111111111111111 ycGV~OH* Y''||11111ppII||4ss( #SJ#4mm$P$P  :#ppzzjj?B{1111H bLL591=tBK 0bbhhjj||JJ11111PZ\\RRSJ  1@@Z--UU` KKAAbb'';qS9a-@Du_'..''[[WWUU  }}v1111111111zdddiddddddi""111!!111111 .v..v......++?{kk&n&nzrbb}}H0bLL5091=tBK 00bbbbpp{EE II{a?rr%rrrrrr%MM}}T))T*xgV3ux&n h~77^^^FqS9a-@Du_;FqS9a-@Du_FqS9a-@Du_ RR/#Z[[%%wqwq{==--||--locant_108RegexHash.txt000066400000000000000000000000131451751637500367220ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1574170472locant_108SerialisedAutomaton.aut000066400000000000000000002103341451751637500410030ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpN !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpCkk"v"v]]],,CC*WcMO*mHTTTTTT**WcMO*mMHTTOT*,,P4PPjWjjppFt|FFt|F ~~TTTTTd 7u 7u *WcMO*MTOT*]*WcMO*mTTTTT*T33VV((]## ] ]nnTTTKK)jjj#Ft|FFt|F*WcMO*mTHTT*AAZZ*WcM`O*M`OT*=j00)))jjTTTTTTjWj jzz:jWjjjWj jjWj j=jjjAAjWj jVV*WcMO*MHTOT*99jWj j&& *WcMO*MO* IIyy`` *WcMO*MO*jWj jV]?? 
jWjj11*WcMO*TMO*88wwPPPjWccxxxj jV]ddVVVVVVVVVVneennnnnnnnnn] >>Ft|FFt|Fbb!!jWj j*WcM"O*M"O*n]-- AAxjWjj\\ oo~]:jW:::j:::::::j"TTTTTTTTTjWjj*WcM`O*M`O*~]~~~~~~~~~~*WcMO(*MO(* *WcMO*MO**WcMO*MOT*bb::Q.sLX.Q.sLX.bbZZ""*WcMO*TTMHTTTOT*SSPbPbPTTTTTTTTTTTTTaa[A[A^^Ft|FFt|F$$ PPPjWj j]]]hh{6qf@;REUlJ<Y GvvjWj jTTTTTTT*WcMO*MO*}}=jjj] ] ]P P P &&*WcMDO*MDO* P P P::gg0%B52+/TT_MO~~NN e' ' ITITTnnjWj j*WcMvO*MvO*rriiAAlocant_108_reversed_RegexHash.txt000066400000000000000000000000131451751637500407600ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1574170472locant_108_reversed_SerialisedAutomaton.aut000066400000000000000000002103341451751637500430410ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpDur[ZW 9]xppur[C&f]xpN !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpC(d(d}@iF%%%%@%%%%%ll%%%%%Qb:b:aa- ----------%~88hh'~iFiF,,%%%%%%%%gg- %%%%%%%%%%%%%%%%%%$'*22- /~~~~~~~~~GGvEEE%%%KK %%%@kk11@u)_|)_|@%%%%%%@YV]7; 5e=WIH@~~~~~~~~%%%%%%%%%%%%# J3B P!6nf%%%% NN%%%%%QZZ%%%%%%%%%%%%%%%# J3B P!6nf@SS%%%%%%%%%%%%/# J3B P!6nfMMEEE@++Q99LL&&%22qq%%%%%%'.0.0uvE22E22E$~ww}~vCCCQ%%%22%%%%%%%%%%%%# J3B P!6nf~@V]7; 5e=WIH[[%%%%%%%%%%%U J3B P!6nfmmrrOO%%%@V]7; 5e=WIH)_|)_| $qqvEuEE@V]7; 5e=WIHoo$&&%%%%%%%%%%$@22}xXxX}TT\\%%%%%%%%%%%%# J3B P!6nfzz@V]7; 5e=WIH/@{{V]7; 5e=WIH%'p''cp''''''c``&&$ss""* 4 4yjyj$@%%%%%%%%%%%%# J3B 
P!6nf@22%%%%%%^^ttRR??@iFiFAA%%%u<uu<uuuuuu>>locant_175RegexHash.txt000066400000000000000000000000131451751637500367260ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1092805161locant_175SerialisedAutomaton.aut000066400000000000000000000007001451751637500410010ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp '(,-.01:ADahur[IM`&v겥xp4locant_175_reversed_RegexHash.txt000066400000000000000000000000131451751637500407640ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1092805161locant_175_reversed_SerialisedAutomaton.aut000066400000000000000000000007001451751637500430370ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp '(,-.01:ADahur[IM`&v겥xp4orthoMetaPara_79RegexHash.txt000066400000000000000000000000131451751637500401370ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1577719820orthoMetaPara_79SerialisedAutomaton.aut000066400000000000000000000041321451751637500422150ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp!-.ABEFHIMNOPQRSTUabefhimnopqrstuur[IM`&v겥xp    
orthoMetaPara_79_reversed_RegexHash.txt000066400000000000000000000000131451751637500421750ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1577719820orthoMetaPara_79_reversed_SerialisedAutomaton.aut000066400000000000000000000037251451751637500442620ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp!-.ABEFHIMNOPQRSTUabefhimnopqrstuur[IM`&v겥xp   polyCyclicSpiro_197RegexHash.txt000066400000000000000000000000121451751637500406000ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1840510902polyCyclicSpiro_197SerialisedAutomaton.aut000066400000000000000000000076511451751637500426700ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp%'(,-.01:;ADEFIJOPQRSTUadefhijopqrstuur[IM`&v겥xp        polyCyclicSpiro_197_reversed_RegexHash.txt000066400000000000000000000000121451751637500426360ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1840510902polyCyclicSpiro_197_reversed_SerialisedAutomaton.aut000066400000000000000000000074241451751637500447240ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xppur[C&f]xp%'(,-.01:;ADEFIJOPQRSTUadefhijopqrstuur[IM`&v겥xpx      
spiroLocant_201RegexHash.txt000066400000000000000000000000101451751637500377260ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata21884574spiroLocant_201SerialisedAutomaton.aut000066400000000000000000000037771451751637500420250ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp'()*,-.01:ADHI[\]^ahi{|}~ur[IM`&v겥xp            spiroLocant_201_reversed_RegexHash.txt000066400000000000000000000000101451751637500417640ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata21884574spiroLocant_201_reversed_SerialisedAutomaton.aut000066400000000000000000000037771451751637500440630ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp'()*,-.01:ADHI[\]^ahi{|}~ur[IM`&v겥xp              spiro_83RegexHash.txt000066400000000000000000000000111451751637500365160ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata888073889spiro_83SerialisedAutomaton.aut000066400000000000000000000134171451751637500406040ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp0 !()*+,-./01:<=>?IJOPQRSTUV[\]^_ijopqrstuv{|}~ur[IM`&v겥xpp         
spiro_83_reversed_RegexHash.txt000066400000000000000000000000111451751637500405540ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata888073889spiro_83_reversed_SerialisedAutomaton.aut000066400000000000000000000145221451751637500426400ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp0 !()*+,-./01:<=>?IJOPQRSTUV[\]^_ijopqrstuv{|}~ur[IM`&v겥xp         stereoChemistry_185RegexHash.txt000066400000000000000000000000131451751637500406400ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-2044694624stereoChemistry_185SerialisedAutomaton.aut000066400000000000000000004032351451751637500427250ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpN !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvxyz{|}~ur[IM`&v겥xp//{{{^+2. ^%. ^%1***  VV<<RR2**2[Kd'P,'d'P,'2^+2. ^%. ^%1-2^+2. ^%. %1`>i n; n;Of""*ffffffffff{{{{{ {{((*2^+2. ^%. ^%1MM(nk~nnk~n{{{Kd'P,'d'P,'MM{tgGBgtgGBg11FF{>>{{{{{{{{{{qq{{{{{{{{{{{{{{{{{{{{`>i n; n;99{{OO^+2. ^%E^1UUMssTT{{{{{{{{OO`>in;n;2vSJEvD8z,=P {LL2[yhyhgg**2[dduuuuuuuf*QQh2OO2[ddc/_uuuuukk))2qq2`>i n;8DYY}6}6{--ZZ2tgGBgtgGBgSSlClC*9tgGBgtgGBg^+2. ^%. ^%12qq__oo z2ee??||2[}}55%4c<BT3!x&A$GCcV{{7r??ff`>i n; n;  "#(7##(7#AA00`>i n1; n1;CC^+2. ^%.^%12[66WWllw@ @ 66`>i n;8D2hhheeLL`>i n;8DLL**))2==***{{{{00{{*LL^+2. ^%. ^%1""&&}*}}}}}}}}}}iiuu{{{{{{{{{{{{{{55`>i n; nZ;2[::z66{{NN`>i n;8n;ssBT3!x&A$GCcV{{7r  yb+wp @ n;QQKKll^+2 ^% ^%12Kd'P,`'d'P,`']]{{{{{{{bbRRffMMMMMMMMMMMffH;N;H;N;`>i n1; n1;^+2. 
^%E ^1{{{''xx::WWoo a  a 22[::aamm6!j.:J> mI. ^%LL%4c<2[jjc/_uuuuu`>i nl; nl;**??2::2^+2. ^ %. ^ %1{{{{ee2^+2.X^%.X^%1{{{{{{{{XX2KKffffFqFq^+2. ^%R. ^%R12pp{{{qq66ff`>i nC; nC;UUee^+2. ^%. ^%1}}HH`>i n; n;{{{{{{{{{||`>i n;k n;k]]#3I3#3I3^+2. ^%^%1YY{{2[ww{{{{{[}*llffrr2LLq\4M$\4M$ee{{ii^+2. ^%. ^%1le^+2. ^%. ^%1]]~~^+2. ^%E^1`>i n; n;`>i n; n;`>i n; n;LLtt\\OOqq^+2. ^ %. ^ %1stereoChemistry_185_reversed_RegexHash.txt000066400000000000000000000000131451751637500426760ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-2044694624stereoChemistry_185_reversed_SerialisedAutomaton.aut000066400000000000000000004206031451751637500447610ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpN !'()*+,-.01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvxyz{|}~ur[IM`&v겥xp^bk$$6hrrlU`2&I_e~5p /]]00AKA||z3z3w5ww:wwwwww:\>>}}}}}}}}}}ht4%%&&[A[AFD+ s+ scc7^lS;nOqXK=MM pp))^bII,kTTvvvnnd#d#%%^bkk!!j j }} }}}}}}FDk}}}}}ttjN[[33}hAKAP  c0c7^lS;nOqXK=GGBB]]uc^b}}}}}}}}}}}}'gux~UJCSZQc7^lS;nOqXK=^bCC}}}}}}}}}}}}'gux~UJCSZQOO^bHjNk,,,,,,,,,,,@@}}}}vw5yjNjj4jjjjjj488}}}}}}@@ggk^b>>``}}}}}}}}}}0}}}}}}}}}}}}'gux~UJCSZQ*"hJ"hJ;;RRggBB}}}}}}}}}}}}}}}'gux~UJCSZQ__mmhEDD8lt2&I_e~5p /330^b$?d6x2XoM7}}}}}}}}}}}}'gux~UJCSZQ}}}}}chc7^lS;nOqXK=hc{{}}}}}}}}}}}g'gux~UJCSZQgTTooiiu(( w5G+G+..RR11hDDlU`2&I_e~5p /c^bff..=F=FcHHTTFDFF FFFFFF PP||WWcc11z3z3WWc  cL7^lS;nOqXK= h}}}}}}}}}vvvcc7^lS;nOqXK=jN/#/#EE33bbhKAPKAPgg<< \\99c((a\\\\\\\\\\\}}}mmQQ*********0}}}}}}}}}}}}}}}'gux~UJCSZQc*c ??KAKAss^bffh}}}*}hggaKAKAw5c6 ^bf-faaNNhhKAKAFD  cc7^lS;nOqXK=qqgg}}}g}}}}}}}}}}}g'gux~UJCSZQg}}}}}{{}}}}}}  \vvv^bh99VVLLc}}}}chc7^lS;nOqXK=ZZee``k))ccc7^lS;nOqXK=}6,Y,Yc cY7^lS;nOqXK= 
""``''vvv!!c}}}}}}}}}}}}'gux~UJCSZQww}v^b*rid6x2XoM7y,h^b6yVyVc<<c0c7^lS;nOqXK=--^bk%%}}}ck  zzKAPKAP}}}}}}}}}}}c  ::hstereoChemistry_196RegexHash.txt000066400000000000000000000000111451751637500406400ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-21120730stereoChemistry_196SerialisedAutomaton.aut000066400000000000000000000016071451751637500427240ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp()*+,-./0[\]^{|}~ur[IM`&v겥xpstereoChemistry_196_reversed_RegexHash.txt000066400000000000000000000000111451751637500426760ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-21120730stereoChemistry_196_reversed_SerialisedAutomaton.aut000066400000000000000000000016071451751637500447620ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp ur[ZW 9]xp pur[C&f]xp()*+,-./0[\]^{|}~ur[IM`&v겥xpstereoChemistry_202RegexHash.txt000066400000000000000000000000121451751637500406250ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1301929544stereoChemistry_202SerialisedAutomaton.aut000066400000000000000000002115161451751637500427120ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpN !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpD@zz>>'n'n'____gttttto[1[o[1_[oR_______iiHH________aaaa66nnccddtttmJ 
P J P DDo[1pV[o[1pV[oR..gt##;;VV&('gtto[1[o[1__[oRYYl_l__tttttttqqgtgg VVmJ P J P rr99_'GGn'n'==__));;;;NNgttttttttKKddx?$}|?$}|MMo[1[o______[oRo[1j[o[1j[oRgggtttttttttttDD==ttttto[1[o[1[oRgttttttttttdduugttttt%%^^=='mm''GJ P! J P! LL``g55""xx'yynno[1[o[1__[oRSSpVp__V_gtttVVo[1[o[_1__U___[oRgttttttttww]]Ne:fe:f/+{  T__CI[1[&&&~~~~~~~ttttttttZZ22o[1][o[1][oREEssVVgttttttto[1[o[1___[oRQb\*XhW3O0tt QQ'ttt=pV=pV'((o[1[o[1[oRttt<<pVpVo[1j[o[1j_[oRo[1[o_1_U___[oRo[1[o[1[oR__ll'nn'77yddttnno[1[o_1[oR;;QQQ'''mJ P J P gttttttttt@_____44]];;o[1[o[1[oRo[1[o[1[oR('((((((((((_____88AvAvtiio[1[o_1_U_[oRFFk,B,k,B,((jj]_______--o[1[o[1[oRstereoChemistry_202_reversed_RegexHash.txt000066400000000000000000000000121451751637500426630ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1301929544stereoChemistry_202_reversed_SerialisedAutomaton.aut000066400000000000000000002127001451751637500447440ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xpN !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvyz{|}~ur[IM`&v겥xpD???hh~$$@F@@hhhhi)])]nniii}}vJJbo4&=UpTX%iiiiiiiiiiiie_q; A!{|++700OOO^OOOOOO^11iii~JJiiiiiLL--Y<<PPiiiiiiiiiiii6e_q; A!{|hh3232~ 9.R 9.R632j33j333333s**9FFFFFFFFFFFaaiiiiiiiiiiiiiiiie_q; A!{|iiiiiiiiiiiie_q; A!{| [[oB4&=UpTX%iiiii6QQ""Fhhiiiiiiiiiiie>q; A!{|\S\SO..ZZgg5 [[o5B4&=UpTX%57???II,,iiiiiiiHH32)])]cciiiiiiiiKK32::7((7iiiiY [[o/4&=UpTX%v''``iiiihhu~uu732 [[o4&=UpTX%O#W#Wzz?v??5Yb335Bb3333335iiiiiiiiiiirr*** ddttiiiiiiii???MMDVDVvvvvvvvvvOiiiiii99ffk))]ExExiiiim8m8@ [[oB4&=UpTX%wwNN7IIiiiiiihh ~  iiiiiiiiiiiiiiie_q; A!{|..CCGGiiiiiiiiiiiie_q; A!{|((iiiiiilliiiiiiiiii 
yystereoChemistry_230RegexHash.txt000066400000000000000000000000121451751637500406260ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1608971995stereoChemistry_230SerialisedAutomaton.aut000066400000000000000000000005701451751637500427070ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp EFZ[efz{ur[IM`&v겥xp$stereoChemistry_230_reversed_RegexHash.txt000066400000000000000000000000121451751637500426640ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata1608971995stereoChemistry_230_reversed_SerialisedAutomaton.aut000066400000000000000000000005701451751637500447450ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpur[ZW 9]xppur[C&f]xp EFZ[efz{ur[IM`&v겥xp$stereoChemistry_69RegexHash.txt000066400000000000000000000000111451751637500405570ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata681652249stereoChemistry_69SerialisedAutomaton.aut000066400000000000000000005665341451751637500426620ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpP\ur[ZW 9]xp\pur[C&f]xpO !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvxyz{|}~ur[IM`&v겥xpd885/4c=X .JJ=X-.J9J|8//8&&*2p>[]T 27p>"]* 2fpTw p} 00&&pT pL ),KU? 
D\G.p$ MMMMMMMMNNWW5/4c=X .9JJR5.9559J5|***&&p:T _pL:} 737SD*2p>[]T27p>]}2f//8800%88M/J J8M/J MvJ8eiei*2p>[]T 27*p"]* *} 2fpT pL} M/J MvJ8775/4c=X .JJ=X-.JJ|5://l4c=X .JJ=X-.JJ|5555ZD pT pL 5/4c=X .JJ=X-.5J9J|``2eXKT2eXKT2^6B^6B&********&  jjM/J J866//55555MMM}}{{ffW  &&BB2244M/J 5J8%%KK+R1++R1+OO88uuIID 5******ooMMMMM5/4c=X .JJ=XR.5J9J|*5//4c=X .JJ5-.595|u88 t// W0 W08WzzWWWWWWWWWWYY~8vvpT  pL} ;;.//Tw J0 T AJ08pT  pL} 55555555.//TwZJ TZAJ8&&M/J MMMMMMJ8  MMMMMMMMRR^B^B``cc\\M/J MMMMMv'M8nn.//TJ TAJ85/4c=X .JJ=X-.JJ|** 88gg/J J8qq5/4c=X .JJXR.55J9J| =55g9g9''//83S8//n3'j+U,N-kIH**p0!p>]} :HU{HU{cc*2p>[]T 27p>]} 2fhh.Tw9J T9AJ8,I,I^B^***BiirrVxVVxVS$$YY[Tw V[} hh[[p p .TwZJ TZAJ8.TwJ TAJ8)pT pL} /J J89*5/4c=X .JJ=X-.JJ|::4/Qm(a!MMET~JiEsiEspGTw p} ^B^BNN8uuuuuuuuuuuuu// h08# <h0MMMMMz:?9:?9p> pLI MM//W8#<W//8#<T _)} FZZ^B^B*2p>[]T 27p>]} 2f77II8//l8%%8//685155585///T >} 5/4cMX .JJMX-.JJ|*******LL 5 555/4c=X .JJ=X-.JJ|//35"5ii//W8#<W55555555qqHH``MMM;;8//8 5555585P*BP*B%%[Tw V[} *****77CCM/J vJ8//3"//F*2p>[]> 27p>]I 2f``RR.JJxwJ ...AJ#8*2p>[]T 27p>]} 2f*2p>[]T 27p>] 2fSS//WW8.TwJ TAJ8JJ885/4c=X .JJ=X-.JJ|z:?9:?9*2p>[]:T 27p"*]:** 2 *2f 0 MMM0*2p>[]T 27p>]} 2fl;l;2eXKT2eXKT2//??aaFF5/4c=X .JJ=X-.JJ|pT pL} 8//l6688//&&XX5/4c=X .JJ=5X-.JJ|M/J MJ8F.CCxwJ ...AJ#8// W08# <W5550iEsiEs))88/9J 9J8^B^BHHddss5/4c=X .9JJ=X-.9JJ|^^d %d %$$ccVV*2p>[]T 27p>]* 2f5://4c=X .JJ&!5]XR555;89J&|8QQ*2p>[]T 27p>"]** 2fbb::iiccM8//lJ J8// 08# <0.TJ TAJ8Y<(YY<(Y77 ` `//  0 0Y<(YY<(Y77 *2p>[]T 27p>]} 2f@@M/9J 9J8// h0 h08;;D 1g6g6*2p>[]nT 27p>]n} 2f++5/4c=X .JJ=X-.JJ|M/J MMMvJ8D8//lZZ899MMMMMMMpT pL} IIM/9J MMM9MMMv5JM8==pT p>} pT p> MMMMWW~~//WW8pTw p} ;;Z5://4c=X .JJ=X-.JJ|*2p>z]T 27p>]} 2fBB//3"pT _pL} __M8pT > 5/4c=X.JJ=X.JJ|( Lo@<OPtk55C Xy.J@AJ| t%%%%%%%%%%%8//8WWEEpTw p} p^Tw Bp^} B77MM55555MMOOyy||5/4c=X .JJ=X-.JJ|--// 0 08737*S*pT p> ^6B^6B8HU{HU{~~~~~~~~~~~FFZZnnAAuRuR//J J8p:T pL:} z:?9:?9;;pGTw p} AAM//J J8iirqRRpT pL} RR*2Mp>[]T 2Fp>]} 2f/QJ QJ8^B^Brq``MM5//4c=X 
.JJ=X-.JJ|mm*2p>[]T 27p>]} 2f5//4c=X .JJ=X-.JJ|8//Z8Y<(#YY<(#Y8//8cc&&ZZ*2p>[]:T 27p>]:} 2fIIpT pLM/J& MMMMMvJ&8~~*****pT _pL} 88p:T p>: *2p>[]T 27*p]* "*2f  ( Lo@<OPtk55C Xy.JAJ|*2p>[]T 27*p>]} 2fF11FFFFFFFFFF*2Mp>[]T 2Fp>]} 2fM/44MMJ MMMMMJ8M8//J J888//1g6g6DD.CC.wJ ....AJ8S*2Mp>[]T 2FG*>"*** 2f*2p>[]T 27p>]}2 2f5/4c=X .JJ5R.55J|5//4c=X .JW=X-.JW|D b****pTw p}   YY[Tw [} 99p:Tw p:} stereoChemistry_69_reversed_RegexHash.txt000066400000000000000000000000111451751637500426150ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata681652249stereoChemistry_69_reversed_SerialisedAutomaton.aut000066400000000000000000005634531451751637500447150ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xpqWur[ZW 9]xpWpur[C&f]xpO !'()*+,-./01:<=>?ABCDEFGHIJKLMNOPQRSTUVXYZ[\]^_abcdefghijklmnopqrstuvxyz{|}~ur[IM`&v겥xpxxxxxx88iqq4 66&-v w|KKKKKKKK-vKo5:l@7d-v 6E6xxxxxKKKKKKKKKKKwxxxx99.9AA68nSQ3x(/B&8B;UA?nL8|KKK-:-:-D9Smm468DDSd{8B;3A?-I#_L8(&S  1a:D9EEDDxxxxxxxxxxxxsX}J@0,,|KKKhGhG8offS00]]%%%%%%S|KKbRKKKK%RD9%%%%&-v w|KKKKKKKK-vKo5:l@7d-vxxxxx|KKKKKKK-=-=cM%%%%%%%wgga%%  $$,,SSKxxxxxxxxxDDxxxxxxxxxxxxsX}J@0-%-----Smm""68D9Sd{8B;UA?-I[_L8++22>>rr|KKKKKKKKKK68DDSdI8B;@AV?LI1_L8<<<xxCCsi D9EEDDSS<<----|%%%%%%%%%%%%-;NHxG? 5#-pp,,&-v |KKKKKKKKK-vKo5:lQ.@7d-v%%%%%xxxxxxxxxxxx-%%%%%%%%%%%%-;NHxG? 5#-..ttKKKKKSTT%%%%%%%%%%%%;NHxG? 
5#  OO==AATT&-v |KKKKKKKKK-vKo5:lQ.@7d-vKKKKKK<SP<SP<::MPP22MM&-v |KKKKKKKK-vKo5:l@7d-v''KKKKTTDDxxx-| - -i%%  %%%&-v |KKKKKKKK-vK)o5:lQ.@M7d-vxxxxxxxxxxh""%9S+bx#mb[[ee/VVDD%%KKK<<;;xxxxDD2xxxxxxxxxxx2s>}J@02uu))D9SLLR  R  jj++xxx*DD68DDSd{8B;WA?-I_L8KKKKKmiWiW%%%CC---eeD9==TTTT%%%%%%%%%%%ZZxxxxxNNcxxxxxxxxxxxxsX}J@0^^%%%%%%%%%%|KKKKK/E/E%xxx$$''68DDS!{8B;UA?-I[_L8FF,, J,J,xxx  |  S+FmHHKKK$$::(&(((((((( wwuu(&<<&-v |KKKKKKKKK-vKPo5:lQ.@7d-v68DDSdI8B;UAV?LI_L8%%%%%%---(&BBD9|KKKKKD9>>,==KKKKK68DDSd{f8B;*A?-I_L8x22D999   KKKKKKKKKKK6Sd{;UA?-I[_LKKK9|KKKK%%%%%%%%=kY{_% !Q<kY_`<U((zzxS68D9SdI8B;@AV?LI1_L8|x"x/p&"\"-a%%%%%%%%%%%%-;NHxG? 5#-yyD9$$$$---KKKKKKKKKKKDcDDVVKKKK,,DDSxxxxxxxxxxxxsX}J@0K/KKKKE/E&-v |KKKKKKKKK-vK5:lQ.@7d-vUU- - -DDxxxxxxxxxxxxxxxsX}J@0!!$$ %%N==DDDKKKKKKo?}]?}]h-i--""<<KKKm68DDS!I8B;UAV?LI_L8Ro  xxxxxxxxvM%%%%D9==CCx|RRKuu"",,,Z,,,,,,Z----|--9>>,0'0'c$$KGG""""&-v |KKKKKKKKK-vKo5:lQ.@7d-v  |-<-<-xxxxxx=k%_% !Q<k_`<UxxCC66N&-v |KKKKKKKKKKKKKK-vKo5:lQ.@7d-v-/E-/E-KKKKKK%%%%%%%%68DSd{8B;UA?-I[_L8DD^xxxxxxxxxxx2s\}J@02PPyy&-v |KKKKKKKKKKKK-vKo5:lQ.@7d-vMmmmmmmmmmKKKKKKKKKKKEkk,,,,vv<<&-v |KKKKKKKKKKK-vKo5:l@7d-v%%%&-v |KKKKKKKKK-vKo5:lQ.@57d-v|KKKKKOSPSP22S|KKb KKKK 88xxxjjuuSRRSDD|KKKKK7-E%%%%%%%%%%%-;)HxG? 5#-%%%%-:-:-4g4gTTxxxxxxxxxxxxsX}J@0s-i--xxxxxx68DDSd{8B;UA?-I[_L8K%%h""rr,&v |~KK%XKIKKK!vCK5:lNT7d<Uv%m)))D9HH%%%9 $  $ sik%_% !k_`<UCCD9-=-= ~~xxxi  h77JJ-%%%%%%%%%%%%%%%-;NHxG? 5#-AA%%%%%zzAAAA1168DDSd{8B;UA?-I[_L868DDSd{8B;UA?-I[_L8-%%%%%%%%%%%-;HxG? 5#-oll---22%%%%%%%%%%%%;NHxG? 5#KKKKKKKKKKK--- AA**OOD9&|KKKKKKKKKKo5:lQ.@7d^^-a%%%%%%%%%%%%%%%-;NHxG? 
5#-``TT/ 4  4      TTS%%%%%2Ft2FtAAx"x/p&r"\"NNa  DDxxxxxxxxxxxxxxxsX}J@0YYKAAPP22$$*KKKK K |KKKKKSmbmb  33vonBaeyer_66RegexHash.txt000066400000000000000000000000131451751637500373170ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1917907032vonBaeyer_66SerialisedAutomaton.aut000066400000000000000000000227301451751637500414010ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp",ur[ZW 9]xp,pur[C&f]xp5 !()*+,-./0:<=>?CDLMOPQSTUVYZ[\]^_cdlmopqstuvyz{|}~ur[IM`&v겥xp  ))) )&& )**#' )#'  (!!%$$) ++ vonBaeyer_66_reversed_RegexHash.txt000066400000000000000000000000131451751637500413550ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomata-1917907032vonBaeyer_66_reversed_SerialisedAutomaton.aut000066400000000000000000000244541451751637500434440ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/serialisedAutomatasrdk.brics.automaton.RunAutomatonN!IinitialIsize[acceptt[Z[classmapt[I[pointst[C[ transitionsq~xp0ur[ZW 9]xp0pur[C&f]xp5 !()*+,-./0:<=>?CDLMOPQSTUVYZ[\]^_cdlmopqstuvyz{|}~ur[IM`&v겥xp  !  % " &' $ ,  ..++.((*/.. 
".)-#)).%-'(()*)) &# $ , / /opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/simpleCyclicGroups.xml000066400000000000000000000465471451751637500333410ustar00rootroot00000000000000 3-pyrazolone|3-pyrazolon 4-pyrazolone|4-pyrazolon 5-pyrazolone|5-pyrazolon alpha-asarone|alpha-asaron beta-asarone|beta-asaron alpha-furil|2,2'-furil 2,2'-furoin abietamide|abietamid acetovanillone|acetovanillon anise alcohol anisil|p-anisil anisoin|p-anisoin anthra-1,2-quinone|anthra-1,2-quinon anthra-1,4-quinone|anthra-1,4-quinon anthra-9,10-quinone|anthra-9,10-quinon anthranil apocynin barbituric|barbituricacid|barbituric acid benzil benzo-1,2-quinone|benzo-1,2-quinon benzo-1,4-quinone|benzo-1,4-quinon benzol benzopinacol benzopinacolone|benzopinacolon benzosemiquinone|benzosemiquinon bisphenol a|bisphenol-a bourbonal carvacrol catecholate|catecholat chavicol coniferol creosol cumaldehyde|cumaldehyd cuminic acid|cuminicacid|cumic acid|cumicacid cuminal|cuminaldehyde|cuminaldehyd cuminol curcumin cyanuramide|cyanurotriamide|cyanurotriamine|cyanuramid|cyanurotriamid|cyanurotriamin cyanuric bromide|cyanuryl bromide|cyanuric bromid|cyanuryl bromid cyanuric chloride|cyanuryl chloride|cyanuric chlorid|cyanuryl chlorid cyanuric fluoride|cyanuryl fluoride|cyanuric fluorid|cyanuryl fluorid cyanuric iodide|cyanuryl iodide|cyanuric iodid|cyanuryl iodid cyclopentadienide|cyclopentadienid cyclopentadienylium cyclotetraphosphazene|cyclotetraphosphazen cyclotriphosphazene|cyclotriphosphazen dibenzamide|dibenzamid diphenate diphenic|diphenicacid|diphenic acid durohydroquinone|durohydroquinon dypnone elemicin estragole|estragol ethyl vanillin|ethylvanillin homoisovanillin homovanillin o-homovanillin imidazolate|imidazolat isatic|isaticacid|isatic acid isatoic|isatoicacid|isatoic acid isoelemicin isoeugenol isophthalyl alcohol isovanillin lutidine|lutidin melamine|melamin quinol mesitylenate|mesitylenat mesitylenic|mesitylenic acid|mesitylenicacid 
naphtho-1,2-quinone|naphtho-1,2-quinon naphtho-1,4-quinone|naphtho-1,4-quinon nicotine|nicotin olivetol orcinol perbenzoate|perbenzoat perbenzoic|perbenzoicacid|perbenzoic acid phthalhydrazide|phthalhydrazid phthalyl alcohol phenol phenolate|phenolat phenylium phloretin phlorol picrate|picrat picric|picricacid|picric acid pseudocumohydroquinone|pseudocumohydroquinon pterin dihydropterin tetrahydropterin pyrocatecholate|pyrocatecholat quinolinic|quinolinicacid|quinolinic acid methanopterin tetrahydromethanopterin|5,6,7,8-tetrahydromethanopterin resveratrol resorcin rhapontigenin saligenin diethylstilbestrol styphnate|styphnat styphnic|styphnicacid|styphnic acid styrene carbonate|styrene carbonat styrol sydnone|sydnon terephthalyl alcohol thymohydroquinone|thymohydroquinon thymol toluhydroquinone|toluhydroquinon toluol tribenzamide|tribenzamid uric|uricacid|uric acid vanillin o-vanillin xylol o-xylohydroquinone|ortho-xylohydroquinone|o-xylohydroquinon|ortho-xylohydroquinon m-xylohydroquinone|meta-xylohydroquinone|m-xylohydroquinon|meta-xylohydroquinon p-xylohydroquinone|para-xylohydroquinone|p-xylohydroquinon|para-xylohydroquinon zingerone|zingeron 4-pyridoxic|4-pyridoxic|4-pyridoxic acid 4-pyridoxolactone|4-pyridoxolacton 5-pyridoxic|5-pyridoxic|5-pyridoxic acid 5-pyridoxolactone|5-pyridoxolacton isopyridoxal pyridoxal pyridoxamine|pyridoxamin pyridoxine|pyridoxin|pyridoxol pyridoxal-p pyridoxamine-p|pyridoxamin-p pyridoxine-p|pyridoxin-p|pyridoxol-p 2'-adenylic|2'-adenylicacid|2'-adenylic acid 3'-adenylic|3'-adenylicacid|3'-adenylic acid 5'-adenylic|5'-adenylicacid|5'-adenylic acid 3'-thymidylic|3'-thymidylicacid|3'-thymidylic acid 5'-thymidylic|5'-thymidylicacid|5'-thymidylic acid 2'-guanylic|2'-guanylicacid|2'-guanylic acid 3'-guanylic|3'-guanylicacid|3'-guanylic acid 5'-guanylic|5'-guanylicacid|5'-guanylic acid 2'-inosinic|2'-inosinicacid|2'-inosinic acid 3'-inosinic|3'-inosinicacid|3'-inosinic acid 5'-inosinic|5'-inosinicacid|5'-inosinic acid|inosinic acid 
thioinosinic acid 2'-xanthylic|2'-xanthylicacid|2'-xanthylic acid 3'-xanthylic|3'-xanthylicacid|3'-xanthylic acid 5'-xanthylic|5'-xanthylicacid|5'-xanthylic acid 2'-cytidylic|2'-cytidylicacid|2'-cytidylic acid 3'-cytidylic|3'-cytidylicacid|3'-cytidylic acid 5'-cytidylic|5'-cytidylicacid|5'-cytidylic acid 2'-uridylic|2'-uridylicacid|2'-uridylic acid 3'-uridylic|3'-uridylicacid|3'-uridylic acid 5'-uridylic|5'-uridylicacid|5'-uridylic acid 2'-orotidylic|2'-orotidylicacid|2'-orotidylic acid 3'-orotidylic|3'-orotidylicacid|3'-orotidylic acid 5'-orotidylic|5'-orotidylicacid|5'-orotidylic acid chrysoeriol dihydrokaempferol kaempferol opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/simpleGroups.xml000066400000000000000000001276401451751637500322040ustar00rootroot00000000000000 acetoin acetone cyanohydrin acetoxime|acetoxim acetoxylium acrolein agmatine|agmatin allicin aluminohydride|aluminohydrid amidogen aminoxide|aminoxid aminoxylium aminylium ammoniumolate|ammoniumolat anandamide|anandamid arsin biacetyl bicine|bicin bifluoride|bifluorid biisopropenyl biisopropyl biguanide|biguanid biguanidine|biguanidin bismuth oxychloride|bismuthoxychloride|bismuth oxychlorid|bismuthoxychlorid bistriflimide|bistriflimid bisulfide|bisulfid bitartrate|bitartrat biurea biuret bombykol borodeuteride|borodeuterid borohydride|borohydrid borotritide|borotritid bromal bromamine|bromamide|bromamin|bromamid dibromamine|bromimide|dibromamin|bromimid bromoform busulfan butoxide|n-butoxide|butoxid|n-butoxid butoxylium butyroin cadaverine|cadaverin camphor camphorsulfonate|camphorsulfonat camphor-10-sulfonate|camphor-10-sulfonat camphorsulfonic|camphorsulfonicacid|camphorsulfonic acid camphor-10-sulfonic|camphor-10-sulfonicacid|camphor-10-sulfonic acid capraldehyde|capraldehyd capramide|capramid caprate|caprat capric|capricacid|capric acid caprinitrile|caprinitril capriphenone|capriphenon carbazone|carbazon carbinol carbodiazone|carbodiazon carbodiimide|carbodiimid 
carbonohydrazide|carbohydrazide|carbazide|carbonohydrazid|carbohydrazid|carbazid cetane cetanol chloral chloral formamide|chloralformamide|chloral formamid|chloralformamid chloral hydrate|chloralhydrate|chloral hydrat|chloralhydrat chloramine|chloramide|chloramin|chloramid dichloramine|chlorimide|dichloramin|chlorimid chloroform chloropicrin citral citronellol|beta-citronellol alpha-citronellol citronellal|beta-citronellal alpha-citronellal crotonylene|crotonylen cyanamide cyanic|cyanicacid|cyanic acid cyanogen cyanogen azide|cyanogen azid cyanogen bromide|cyanogen bromid cyanogen chloride|cyanogen chlorid cyanogen fluoride|cyanogen fluorid cyanogen iodide|cyanogen iodid cyanuric acid|cyanuricacid dansylamide|dansylamid diacetamide|diacetamid diacetonamine|diacetonamin diacetone alcohol dichlorvos dicyan dicyanamide|dicyanamid diethanolamine|diethanolamin dimercaprol diglyme diphosgene|diphosgen epibromohydrin epichlorohydrin epifluorohydrin epiiodohydrin erythrene|erythren ethoxide|ethoxid ethoxylium ethylene|ethylen alpha-farnesene|alpha-farnesen beta-farnesene|beta-farnesen farnesol felbinac ferricyanide|ferricyanid ferrocyanide|ferrocyanid fluoramine|fluoramide|fluoramin|fluoramid difluoramine|fluorimide|difluoramin fluoroform fulminicacid|fulminic acid gallohydride|gallohydrid geranial geraniol glycerol carbonate|glycerin carbonate|glyceryl carbonate|glycerol carbonat|glycerin carbonat|glyceryl carbonat glyme glyoxal glyoxime|glyoxim halothane|halothan hexametapol hexamethyldisilazide|hexamethyldisilazid hexamethyleneimine|hexamethyleneimin|hexamethylenimine|hexamethylenimin hexamethyleneiminium|hexamethyleniminium hydronium hydroxylium hydroperoxylium hydroxycitronellal ibuprofen imidogen iodoform iron oxychloride|ironoxychloride|iron oxychlorid|ironoxychlorid isobutoxide|iso-butoxide|isobutoxid|iso-butoxid|isobutylate|iso-butylate|isobutylat|iso-butylat isobutylene|isobutylen isocarbonohydrazide|isocarbonohydrazid isocyanic|isocyanicacid|isocyanic acid 
isothiocyanic|isothiocyanicacid|isothiocyanic acid isoselenocyanic|isoselenocyanicacid|isoselenocyanic acid isotellurocyanic|isotellurocyanicacid|isotellurocyanic acid isocyanuric acid|isocyanuricacid isofulminicacid|isofulminic acid isophorone|isophoron isophorone diisocyanate|isophorone diisocyanat|isophoron diisocyanat isopropoxide|iso-propoxide|isopropoxid|iso-propoxid isoselenourea|isoselenurea isoselenouronium|isoselenuronium isosemicarbazide|isosemicarbazid isotellurourea|isotellururea isotellurouronium|isotellururonium isothiourea|isothiurea isothiouronium|isothiuronium isourea isouronium itatartrate|itatartrat ketoprofen laccerol linalool lignocerol mesilate|mesilat mesityl oxide|mesityl oxid methacrolein methamidophos methoxide|methoxid methoxylium methylal methyleneimine|methyleneimin|methylenimine|methylenimin methyleneiminium|methyleniminium monoethanolamine|mono-ethanolamine|monoethanolamin|mono-ethanolamin monoglyme alpha-myrcene|alpha-myrcen beta-myrcene|beta-myrcen neopentyl glycol neral nerol nerolidol neurine nitramide|nitramid nitrenium nitroform nitroglycerin nitrosonium nitronium nitrosamide|nitrosamid nitrous oxide|nitrousoxide|nitrous oxid|nitrousoxid nitroxide|nitroxid nitroxyl alpha-ocimene|alpha-ocimen beta-ocimene|beta-ocimen oxaldehyde|oxaldehyd oxamide|oxamid oxylium ozone ozonide|ozonid penicillic|penicillicacid|penicillic acid penicillate|penicillat pentaerythritol dipentaerythritol tripentaerythritol pentaguanide|pentaguanid pentauret peracetic|peraceticacid|peracetic acid peracetate|peracetat performic|performicacid|performic acid performate peroxylium phorone phosgene|phosgen phosphin phosphoramide|phosphoramid phosphorus oxybromide|phosphorus(v) oxybromide|phosphorusoxybromide|phosphorous oxybromide|phosphorous(v) oxybromide|phosphorousoxybromide|phosphorus oxybromid|phosphorus(v) oxybromid|phosphorusoxybromid|phosphorous oxybromid|phosphorous(v) oxybromid|phosphorousoxybromid phosphorus oxychloride|phosphorus(v) 
oxychloride|phosphorusoxychloride|phosphorous oxychloride|phosphorous(v) oxychloride|phosphorousoxychloride|phosphorus oxychlorid|phosphorus(v) oxychlorid|phosphorusoxychlorid|phosphorous oxychlorid|phosphorous(v) oxychlorid|phosphorousoxychlorid phosphorus oxyfluoride|phosphorus(v) oxyfluoride|phosphorusoxyfluoride|phosphorous oxyfluoride|phosphorous(v) oxyfluoride|phosphorousoxyfluoride|phosphorus oxyfluorid|phosphorus(v) oxyfluorid|phosphorusoxyfluorid|phosphorous oxyfluorid|phosphorous(v) oxyfluorid|phosphorousoxyfluorid phosphorus oxyiodide|phosphorus(v) oxyiodide|phosphorusoxyiodide|phosphorous oxyiodide|phosphorous(v) oxyiodide|phosphorousoxyiodide|phosphorus oxyiodid|phosphorus(v) oxyiodid|phosphorusoxyiodid|phosphorous oxyiodid|phosphorous(v) oxyiodid|phosphorousoxyiodid phosphorus pentasulfide|phosphorus pentasulfid phytol pinacol pinacolone|pinacolon piperylene|piperylen pristane|pristan propione propionoin|propioin propoxide|n-propoxide|propoxid|n-propoxid propoxylium propylene|propylen prussicacid|prussic acid pseudoselenourea|pseudoselenurea pseudotellurourea|pseudotellururea pseudothiourea|pseudothiurea pseudourea putrescine|putrescin rhodinol rhodinal rubeanic acid sarin sec-butoxide|secbutoxide|sec-butoxid|secbutoxid|sec-butylate|secbutylate|sec-butylat|secbutylat selenilimine|selenilimin selenium oxybromide|seleniumoxybromide|selenium oxybromid|seleniumoxybromid selenoximide|selenoximine|selenoximid|selenoximin selenuronium semicarbazide|semicarbazid semioxamazide|semioxamazid soman spermidine|spermidin spermine|spermin squalane|squalan squalene|squalen sulfamide|sulfamid sulfilimine|sulfimide|sulfilimin|sulfimid tabun tartrate|tartrat d-tartrate|(d)-tartrate|d(-)-tartrate|d-(-)-tartrate|d-tartrat|(d)-tartrat|d(-)-tartrat|d-(-)-tartrat l-tartrate|(l)-tartrate|l(+)-tartrate|l-(+)-tartrate|l-tartrat|(l)-tartrat|l(+)-tartrat|l-(+)-tartrat tellurilimine|tellurilimin tellururonium telluroximide|telluroximine|telluroximid|telluroximin 
tertiary-butoxide|tertiarybutoxide|tert-butoxide|tertbutoxide|tert.-butoxide|tert.butoxide|t-butoxide|tbutoxide|tertiary-butoxid|tertiarybutoxid|tert-butoxid|tertbutoxid|tert.-butoxid|tert.butoxid|t-butoxid|tbutoxid|tertiary-butylate|tertiarybutylate|tert-butylate|tertbutylate|tert.-butylate|tert.butylate|t-butylate|tertiary-butylat|tertiarybutylat|tert-butylat|tertbutylat|tert.-butylat|tert.butylat|t-butylat tertiary-pentoxide|tertiarypentoxide|tert-pentoxide|tertpentoxide|tert.-pentoxide|tert.pentoxide|t-pentoxide|tertiary-pentoxid|tertiarypentoxid|tert-pentoxid|tertpentoxid|tert.-pentoxid|tert.pentoxid|t-pentoxid|tertiary-pentylate|tertiarypentylate|tert-pentylate|tertpentylate|tert.-pentylate|tert.pentylate|t-pentylate|tertiary-pentylat|tertiarypentylat|tert-pentylat|tertpentylat|tert.-pentylat|tert.pentylat|t-pentylat|tertiary-amoxide|tertiaryamoxide|tert-amoxide|tertamoxide|tert.-amoxide|tert.amoxide|t-amoxide|tertiary-amoxid|tertiaryamoxid|tert-amoxid|tertamoxid|tert.-amoxid|tert.amoxid|t-amoxid|tertiary-amylate|tertiaryamylate|tert-amylate|tertamylate|tert.-amylate|tert.amylate|t-amylate|tertiary-amylat|tertiaryamylat|tert-amylat|tertamylat|tert.-amylat|tert.amylat|t-amylat tetrabromoaluminate|tetrabromoaluminat tetrachloroaluminate|tetrachloroaluminat tetraglyme tetrafluoroaluminate|tetrafluoroaluminat tetraiodoaluminate|tetraiodoaluminat tetraguanide|tetraguanid tetrauret tetrahydroaluminate|tetrahydroaluminat tetrahydroborate|tetrahydroborat tetrahydrogallate|tetrahydrogallat thiosinamine|thiosinamin thiuram monosulfide|thiuram monosulfid thiuram disulfide|thiuram disulfid thiuronium triacetin triacetamide|triacetamid tributyrin trichlorohydrin triclofos triethanolamine|triethanolamin triflimide|triflimid triflimidate|triflimidat triflimidic acid triguanide|triguanid triglyme trilaurin trimyristin triolein trioxygen tripalmitin triphosgene|triphosgen triptane tristearin triuret trometamol|tromethamine|tromethamin|tromethane|tromethan tropylium uranyl 
urea uronium vinylene|vinylen xanthate|xanthat xanthic acid|xanthicacid ammonia carbenium hydrocyanicacid|hydrocyanic acid hydroisocyanicacid|hydroisocyanic acid hydrazoicacid|hydrazoic acid bis((trifluoromethyl)sulfonyl)imide|bis(trifluoromethane)sulfonimide|bis(trifluoromethanesulfonyl)imide|bis(trifluoromethylsulfonyl)imide|bistrifluoromethanesulfonimide|bis[(trifluoromethyl)sulfonyl]imide|trifluoromethanesulfonimide|bis((trifluoromethyl)sulfonyl)imid|bis(trifluoromethane)sulfonimid|bis(trifluoromethanesulfonyl)imid|bis(trifluoromethylsulfonyl)imid|bistrifluoromethanesulfonimid|bis[(trifluoromethyl)sulfonyl]imid|trifluoromethanesulfonimid bis((pentafluoroethyl)sulfonyl)imide|bis((perfluoroethyl)sulfonyl)imide|bis(pentafluoroethanesulfonyl)imide|bis(pentafluoroethylsulfonyl)imide|bis(perfluoroethanesulfonyl)imide|bis(perfluoroethylsulfonyl)imide|bispentafluoroethylsulfonylimide|bisperfluoroethylsulfonylimide|bis[(pentafluoroethyl)sulfonyl]imide|bis[(perfluoroethyl)sulfonyl]imide|bis((pentafluoroethyl)sulfonyl)imid|bis((perfluoroethyl)sulfonyl)imid|bis(pentafluoroethanesulfonyl)imid|bis(pentafluoroethylsulfonyl)imid|bis(perfluoroethanesulfonyl)imid|bis(perfluoroethylsulfonyl)imid|bispentafluoroethylsulfonylimid|bisperfluoroethylsulfonylimid|bis[(pentafluoroethyl)sulfonyl]imid|bis[(perfluoroethyl)sulfonyl]imid tris((trifluoromethyl)sulfonyl)methide|tris(trifluoromethanesulfonyl)methide|tris(trifluoromethylsulfonyl)methide|tris[(trifluoromethyl)sulfonyl]methide|tris((trifluoromethyl)sulfonyl)methid|tris(trifluoromethanesulfonyl)methid|tris(trifluoromethylsulfonyl)methid|tris[(trifluoromethyl)sulfonyl]methid 
tris((pentafluoroethyl)sulfonyl)methide|tris((perfluoroethyl)sulfonyl)methide|tris(pentafluoroethanesulfonyl)methide|tris(pentafluoroethylsulfonyl)methide|tris(perfluoroethanesulfonyl)methide|tris(perfluoroethylsulfonyl)methide|tris[(pentafluoroethyl)sulfonyl]methide|tris[(perfluoroethyl)sulfonyl]methide|tris((pentafluoroethyl)sulfonyl)methid|tris((perfluoroethyl)sulfonyl)methid|tris(pentafluoroethanesulfonyl)methid|tris(pentafluoroethylsulfonyl)methid|tris(perfluoroethanesulfonyl)methid|tris(perfluoroethylsulfonyl)methid|tris[(pentafluoroethyl)sulfonyl]methid|tris[(perfluoroethyl)sulfonyl]methid azobisisobutyronitrile|azobis-isobutyronitrile|azobisisobutyronitril|azobis-isobutyronitril diphenylethylenediamine|diphenylethylenediamin dicyclohexylurea diepoxybutane|diepoxybutan dimethylurea dimethoxyethane|dimethoxyethan dimethylacetamide|dimethyl-acetamide|dimethylacetamid|dimethyl-acetamid carbonyldiimidazole|carbonyldiimidazol formamide|formamid mercaptopurine|mercaptopurin methanamide|methanamid phytantriol trifluorothymidine|trifluorothymidin trinitrotoluene glyoxalylamide|glyoxalylamid carbanolate|carbanolat chlorazine chlorazin chlorobenzilate|chlorobenzilat benzocyclobutene|benzocyclobuten diazinon|diazinone diazolidinylurea|diazolidinyl urea dichlorodiphenyltrichloroethane|dichlorodiphenyltrichloroethan difluoroheptylazidosulfinate|difluoroheptylazidosulfinat dimethylol ethylene urea dithianone|dithianon dihydromethanophenazine|dihydromethanophenazin iodamide imidazolidinylurea|imidazolidinyl urea methanophenazine|methanophenazin methoxychlor methylazoxymethanol acetate|methylazoxymethanol acetat methyldibromo glutaronitrile|methyldibromoglutaronitrile|methyldibromo glutaronitril|methyldibromoglutaronitril oxolinic acid oxolinate|oxolinat oxybenzone|oxybenzon pentazocine|pentazocin pyridate|pyridat pyrroloquinoline quinone|pyrroloquinoline quinon sulfosalicylic acid stibogluconate|stibogluconat tetrabromogallate|tetrabromogallat 
tetrachlorogallate|tetrachlorogallat tetrafluorogallate|tetrafluorogallat tetraiodogallate|tetraiodogallat toluene-diisocyanate|toluene diisocyanate|toluene-diisocyanat|toluene diisocyanat biotin|biotine|d-biotin|d-biotine biotin sulfone|biotine sulfone|d-biotin sulfone|d-biotine sulfone|biotin sulfon|d-biotin sulfon biotin sulfoxide|biotine sulfoxide|d-biotin sulfoxide|d-biotine sulfoxide|biotin sulfoxid|d-biotin sulfoxid choline|cholin chlorocholine|chlorocholin eicosasphinganine|eicosasphinganin ethanolamine|ethanolamin fluorocholine|fluorocholin fluorouracil propylthiouracil glycerone|glyceron glycocyamine|glycocyamin guanidinium icosasphinganine|icosasphinganin leucinic acid|dl-leucinic acid d-leucinic acid l-leucinic acid phytosphingosine|phytosphingosin sphinganine|sphinganin sphingosine|sphingosin triethylcholine|triethylcholin vitamin c coenzyme a|coa hydrate|hydrat hbr|2hbr|3hbr|4hbr hcl|2hcl|3hcl|4hcl tfa|2tfa|3tfa|4tfa acetylide|acetylid amine|amin aminium aminide|aminid barbiturate|barbiturat boronic pinacol ester|boronic acid pinacol ester|boronicacid pinacol ester|boronicacidpinacol ester|boronicacid pinacolester|boronicacidpinacolester|boronic acidpinacol ester|boronic acidpinacolester|boronic acid pinacolester carboxamide|carboxamid carboxylate|carboxylat carboxylic|carboxylicacid|carboxylic acid diazonium nitrone paraben perselenurane|perselenuran persulfurane|persulfuran selenurane|selenuran sulfoximide|sulfoximine|sulfoximid|sulfoximin sulfoxonium sulfurane|sulfuran nitrolic acid|nitrolicacid hydrofluoride|hydrofluorid hydrochloride|hydrochlorid hydrobromide|hydrobromid hydroiodide|hydriodide|hydroiodid|hydriodid hydroastatide|hydroastatid hydrofluoricacid|hydrofluoric acid hydrochloricacid|hydrochloric acid hydrobromicacid|hydrobromic acid hydroiodicacid|hydroiodic acid|hydriodicacid|hydriodic acid hydroastaticacid|hydroastatic acid hydromethanesulfonate|hydromethanesulfonat hydrophosphate|hydrophosphat dihydrophosphate|dihydrophosphat 
trihydrophosphate|trihydrophosphat hydrosulfate|hydrosulfat dihydrosulfate|dihydrosulfat hydronitrate|hydronitrat hydromaleate|hydromaleat|monohydromaleate|monohydromaleat dihydromaleate|dihydromaleat hydroacetate|hydroacetat hydrobenzoate|hydrobenzoat hydrocitrate|hydrocitrat|monohydrocitrate|monohydrocitrat dihydrocitrate|dihydrocitrat trihydrocitrate|trihydrocitrat hydrofumarate|hydrofumarat|monohydrofumarate|monohydrofumarat dihydrofumarate|dihydrofumarat hydrotartrate|hydrotartrat|monohydrotartrate|monohydrotartrat dihydrotartrate|dihydrotartrat hydrolactate|hydrolactat hydroxalate|hydroxalat|monohydroxalate|monohydroxalat|hydrooxalate|hydrooxalat|monohydrooxalate|monohydrooxalat dihydroxalate|dihydroxalat|dihydrooxalate|dihydrooxalat hydrosuccinate|hydrosuccinat|monohydrosuccinate|monohydrosuccinat dihydrosuccinate|dihydrosuccinat hydro-p-toluenesulfonate|hydro-p-toluenesulfonat molecular hydrogen molecular nitrogen molecular oxygen molecular fluorine molecular chlorine molecular bromine molecular iodine water acetylide|acetylid lithium nitride|lithium nitrid sodium nitride|sodium nitrid potassium nitride|potassium nitrid beryllium nitride|beryllium nitrid calcium nitride|calcium nitrid magnesium nitride|magnesium nitrid strontium nitride|strontium nitrid lithium amide sodium amide|natrium amide potassium amide|kalium amide rubidium amide caesium amide|cesium amide francium amide beryllium amide|beryllium diamide magnesium amide|magnesium diamide calcium amide|calcium diamide strontium amide|strontium diamide barium amide|barium diamide radium amide|radium diamide diphenylphosphoryl azide|diphenylphosphoryl azid|diphenylphosphorylazide|diphenylphosphorylazid opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/simpleSubstituents.xml000066400000000000000000001163451451751637500334410ustar00rootroot00000000000000 1-formazano 1-isoureido 1-isothioureido 3-isoureido 3-isothioureido 5-formazano abietamido acetoxyl acetamino acetenyl 
aci-nitro acroyl acryl active amyl|active-amyl|activeamyl amidoxalyl amidyl amidylidene|amidyliden aminoxy|aminoxyl aminyl aminylidene|aminyliden aminylidyne|aminylidyn amoxy anisal asaryl azidyl iso-amoxy|isoamoxy anilino anisidino benzal|benzylene benzidino 4,4'-benzidino besyl beta-allyl biphen-2-yl biphen-3-yl biphen-4-yl boc borono brosyl butoxyl cacodyl carbamimidamido carbazono carbodiazono carbonohydrazido|carbohydrazido|carbazido carbmethoxy carbomethoxy carbethoxy|carboethoxy carbopropoxy|carbpropoxy carbobutoxy|carbbutoxy carbopentoxy|carbpentoxy carbohexoxy|carbhexoxy carboheptoxy|carbheptoxy carbooctoxy|carboctoxy carbononoxy|carbnonoxy carbodecoxy|carbdecoxy carboundecoxy|carbundecoxy carbododecoxy|carbdodecoxy carbotridecoxy|carbtridecoxy carbotetradecoxy|carbtetradecoxy carbopentadecoxy|carbpentadecoxy carballoxy|carboalloxy|carballyloxy|carboallyloxy carbbenzoxy|carbobenzoxy|carbbenzyloxy|carbobenzyloxy carbisopropoxy|carboisopropoxy|carb-i-propoxy|carbo-i-propoxy carbphenoxy|carbophenoxy|carbphenyloxy|carbophenyloxy carbo-tertiarybutoxy carbo-tertiary-butoxy|carbo-tertbutoxy|carbo-tert.butoxy|carbo-tert-butoxy|carbo-tert.-butoxy|carbo-t-butoxy|carb-tertiarybutoxy|carb-tertiary-butoxy|carb-tertbutoxy|carb-tert.butoxy|carb-tert-butoxy|carb-tert.-butoxy|carb-t-butoxy|carbotertiarybutoxy|carbotertiary-butoxy|carbotertbutoxy|carbotert.butoxy|carbotert-butoxy|carbotert.-butoxy|carbot-butoxy|carbtertiarybutoxy|carbtertiary-butoxy|carbtertbutoxy|carbtert.butoxy|carbtert-butoxy|carbtert.-butoxy|carbt-butoxy cbz carbyne carbynium carvacryl cetyl cinnamal cresyl crotyl cuminyl cuminylidene|cuminyliden|cumal cumoyl cumyl|alpha-cumyl cyanamido cyanyl cyclopentadienyl dabsyl dansyl desyl deuterio|deutero diazonio duryl ethoxyl ethylenimino eugenyl formimino hexamethyleneimino|hexamethylenimino homoallyl homomorpholino hydantoyl hydrazino hydrazono hydrazyl hydrido hydrogen hydroseleno hydroseleninyl hydroselenonyl hydrosulfinyl hydrosulfonyl hydrotelluro 
hydrotellurinyl hydrotelluronyl isobutoxyl|iso-butoxyl isocrotyl isocarbonohydrazido isocyanyl isoeugenyl isopentenyl isosemicarbazido isoureido isothioureido lutidin-2-yl lutidin-3-yl lutidin-4-yl mesoxalo mesyl methallyl methoxyl methylol morpholino oxalaceto hydroxamino hydroximino isopropoxyl linalyl methacroyl methacryl neophyl nerolidyl nitramido nitramino nitroxy nitrosyl nitryl nosyl oximino perselenuranyl perselenuranylidene|perselenuranyliden persulfuranyl persulfuranylidene|persulfuranyliden phenetidino phenetyl phenoxy|phenoxyl picryl pinacolyl piperazino piperidino prenyl propargyl propoxyl as-pseudocumyl v-pseudocumyl s-pseudocumyl putrescinyl pyrrolidino salicylal sec-iso-amyl|sec.-iso-amyl|sec-isoamyl|sec.-isoamyl|secisoamyl|sec-iso-pentyl|sec.-iso-pentyl|sec-isopentyl|sec.-isopentyl|secisopentyl sec-butoxyl|sec.-butoxyl|secbutoxyl seleneno selenino selenono selenyl selenuranyl selenuraniumyl selenuranylidene|selenuranyliden semicarbazido siamyl sulfino sulfo sulfoxy sulfuranyl sulfuraniumyl sulfuranylidene|sulfuranyliden tellureno tellurino tellurono telluryl tertiary-butoxyl|tertiarybutoxyl|tert-butoxyl|tertbutoxyl|tert.-butoxyl|tert.butoxyl|t-butoxyl|tbutoxyl then-2-yl then-2-ylidene|then-2-yliden then-2-ylidyne|then-2-ylidyn then-2-oyl then-3-yl then-3-ylidene|then-3-yliden then-3-ylidyne|then-3-ylidyn then-3-oyl thexyl thiyl thymyl toluidino tosyl trifloxy triflyl tritio trityl ureido vanillal veratral o-veratryl xenyl xylidino amino ammonio phosphonio arsonio stibonio bismuthonio oxonio sulfonio selenonio telluronio fluoronio chloronio bromonio iodonio diazo azinoyl azono nitro fluorosyl bromosyl chlorosyl chloroso iodosyl iodoso astatosyl fluoryl bromyl chloryl chloroxy iodyl iodoxy astatyl astatoxy perfluoryl perbromyl perchloryl periodyl fulminato selenocyano tellurocyano thiocyano isoselenocyano isotellurocyano isothiocyano hydroxy|hydroxyl hydroperoxy|hydroperoxyl|hydrogenperoxyl|perhydroxyl oxyl peroxyl|dioxyl oxo|keto 
mercapto|sulfhydryl|sulfydryl thioxo thiono selenoxo telluroxo phosphinimyl phosphoroso phosphino phosphinylidene|phosphinyliden phosphinothioylidene|phosphinothioyliden phosphinidene|phosphiniden arsono arsonato arso arsinimyl arsoroso|arsenoso arsino arsinylidene|arsinyliden arsinothioylidene|arsinothioyliden arsinidene|arsiniden stibono stibonato stibo stiboso stibino stibylene|stibylen bismuthino bismuthylene|bismuthylen sulfonylidene|sulfonyliden sulfonato sulfinato sulfeno nitroso carboxy|carboxyl carboxylato amidino|guanyl oxalo methoxalyl ethoxalyl fmoc tms tbdms tbdps t-butyl(dimethyl)silanoxy|t-butyl(dimethyl)siloxy|t-butyl-dimethylsilanoxy|t-butyl-dimethylsiloxy|t-butyldimethylsilanoxy|t-butyldimethylsiloxy|tert-butyl(dimethyl)silanoxy|tert-butyl(dimethyl)siloxy|tert-butyl-dimethylsilanoxy|tert-butyl-dimethylsiloxy|tert-butyldimethylsilanoxy|tert-butyldimethylsiloxy t-butyl(dimethyl)silanyl|t-butyl(dimethyl)silyl|t-butyl-dimethylsilanyl|t-butyl-dimethylsilyl|t-butyldimethylsilanyl|t-butyldimethylsilyl|tert-butyl(dimethyl)silanyl|tert-butyl(dimethyl)silyl|tert-butyl-dimethylsilanyl|tert-butyl-dimethylsilyl|tert-butyldimethylsilanyl|tert-butyldimethylsilyl t-butyl(diphenyl)silanoxy|t-butyl(diphenyl)siloxy|t-butyl-diphenylsilanoxy|t-butyl-diphenylsiloxy|t-butyldiphenylsilanoxy|t-butyldiphenylsiloxy|tert-butyl(diphenyl)silanoxy|tert-butyl(diphenyl)siloxy|tert-butyl-diphenylsilanoxy|tert-butyl-diphenylsiloxy|tert-butyldiphenylsilanoxy|tert-butyldiphenylsiloxy t-butyl(diphenyl)silanyl|t-butyl(diphenyl)silyl|t-butyl-diphenylsilanyl|t-butyl-diphenylsilyl|t-butyldiphenylsilanyl|t-butyldiphenylsilyl|tert-butyl(diphenyl)silanyl|tert-butyl(diphenyl)silyl|tert-butyl-diphenylsilanyl|tert-butyl-diphenylsilyl|tert-butyldiphenylsilanyl|tert-butyldiphenylsilyl actinio aluminio americio antimonio argonio arsenio astatio bario berkelio beryllio bismuthio borio bromio cadmio caesio calcio californio cerio chlorio chromio cobaltio cuprio curio dysprosio einsteinio erbio 
europio fermio fluorio francio gadolinio gallio germanio aurio hafnio helio holmio indio iodio iridio ferrio kryptonio lanthanio lawrencio plumbio lithio lutetio magnesio manganio mendelevio mercurio molybdenio neodymio neonio neptunio nickelio niobio nobelio osmio palladio phosphorio platinio plutonio polonio potassio|kalio praseodymio promethio protactinio radonio rhenio rhodio rubidio ruthenio samario scandio selenio silicio argentio sodio|natrio strontio sulfurio tantalio technetio tellurio terbio thallio thorio thulio stannio titanio tungstenio|wolframio uranio vanadio xenonio ytterbio yttrio zincio zirconio diphospho glycero-1-phospho|glycero-1-phosphoryl glycero-2-phospho|glycero-2-phosphoryl glycero-3-phospho|glycero-3-phosphoryl phosphono phosphonato phospho sn-glycero-1-phospho|sn-glycero-1-phosphoryl sn-glycero-2-phospho|sn-glycero-2-phosphoryl sn-glycero-3-phospho|sn-glycero-3-phosphoryl triphospho guanidino pyridoxyl pyridoxylidene|pyridoxyliden p-pyridoxyl|5'-p-pyridoxyl p-pyridoxylidene|p-pyridoxyliden|5'-p-pyridoxylidene|5'-p-pyridoxyliden tauryl taurinomethyl|taurino-methyl 3'-adenylyl 3'-thymidylyl 3'-guanylyl 3'-isoguanylyl 3'-inosinylyl 3'-xanthylyl 3'-cytidylyl 3'-isocytidylyl 3'-uridylyl 3'-orotidylyl 3'-pseudouridylyl 5'-adenylyl 5'-thymidylyl 5'-guanylyl 5'-isoguanylyl 5'-inosinylyl 5'-xanthylyl 5'-cytidylyl 5'-isocytidylyl 5'-uridylyl 5'-orotidylyl 5'-pseudouridylyl 5'-adenosyl|5'-deoxy-5'-adenosyl|adenosyl 5'-thymidyl 5'-guanosyl 5'-isoguanosyl 5'-inosyl 5'-xanthosyl 5'-cytidyl 5'-isocytidyl 5'-uridyl 5'-orotidyl 5'-pseudouridyl adenylyl thymidylyl guanylyl isoguanylyl inosinylyl xanthylyl cytidylyl isocytidylyl uridylyl orotidylyl pseudouridylyl astato azido bromo chloro cyanato cyano fluoro iodo isocyanato isocyano isoselenocyanato isotellurocyanato isothiocyanato amido hydrazido imido nitrido oxido sulfido selenido tellurido perfluoro perbromo perchloro periodo perdeuterio|perdeutero pertritio amyl amylidene|amyliden 
opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/substituents.xml000066400000000000000000000020641451751637500322570ustar00rootroot00000000000000 bor sil germ stann plumb homopiperon phyt phenac piperon salic all|allan vin opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixApplicability.dtd000066400000000000000000000003451451751637500334710ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixApplicability.xml000066400000000000000000000354161451751637500335250ustar00rootroot00000000000000 carbonyl carboximidoyl ide ium oximino oxy oyl selenenyl seleninyl selenonyl sulfenyl sulfinyl sulfonyl tellurenyl tellurinyl telluronyl uide yl ylidene ylidyne ylium acylium aldehyde aldoxime amide amidrazone amidylium amidine amidinium amidium amido anilide ate hydrazide hydrazido hydroxamate hydroxamic ic ic_O_acid ic_S_acid ic_Se_acid ic_Te_acid ide cyclicimide cyclicimidium cyclicimido cyclicimidylium ium lactam lactim lactone nitrile nitrilium nitrolic acid onaphthone ophenone oximino-ylForAcyl oximino oxy-ylForAcyl oxy oyl sultam sultim sultine sultone uide oyl yl ylidene ylidyne oxoAndDiYl oxoAndTriYl ylium aldehyde aldehyde aldehyde aldoxime amide amidrazone amidylium amidium amido anilide ate hydrazide hydrazide hydrazido hydrazido hydroxamate hydroxamate hydroxamic hydroxamic ic ic_O_acid ic_S_acid ic_Se_acid ic_Te_acid cyclicimide cyclicimidium cyclicimido cyclicimidylium ium nitrile nitrilium ol ol olate olate oyl oyl yl yl ylidene carbonyl_to_hydroxy carbonyl_to_hydroxy onamide_aldehyde onamide_aldehyde ous ous ite ite ononitrile_aldehyde ononitrile_aldehyde yl yl lactone hydroxy_to_amide hydroxy_to_ate hydroxy_to_icacid hydroxy_to_nitrile hydroxy_to_carbonyl ic_nonCarboxylic hydroxy_to_amide hydroxy_to_ate hydroxy_to_icacid hydroxy_to_nitrile hydroxy_to_acyl yl_carbohydrate yl ylidene ylium amine aminylium aminoAndYl ite ous diyl diyl ylidene 
acylium_nonCarboxylic ate_nonCarboxylic ic_nonCarboxylic ite_nonCarboxylic ous_nonCarboxylic oyl_nonCarboxylic oyl_nonCarboxylic acylium aldehyde aldehyde aldoxime amide amidrazone amidylium amidine amidinium amidium amido amine aminide aminium amino aminylium anilide arsonous ate azonic arsonite azonate azonite azonous boronate boronic boronicacidpinacolester carbamate carbamic carbolactone carboximidoyl carbonyl carbonylium carboxamide carboxylic carboxylate carboxylite diazonium dicarboximide dicarboximido hydrazide hydrazido hydrazonic hydroxamate hydroxamic ic ic_O_acid ic_S_acid ic_Se_acid ic_Te_acid ide imine iminide iminium iminyl iminylium io ium lactam lactim lactone nitrile nitrilium nitrolic acid ol olate lactone yl onaphthone one ophenone oximino oxy oyl phosphonite phosphonous selenenic selenenyl seleninyl selenonyl selone stibonite stibonous sulfamate sulfamic sulfenamide sulfenamido selenenate sulfenate sulfenic sulfenoselenoate sulfenoselenoic sulfenoselenoyl sulfenothioate sulfenothioic sulfenothioyl sulfenyl sulfinyl sulfonyl sultam sultim sultine sultone tellone tellurenate tellurenic tellurenyl tellurinyl telluronyl thione uide yl ylidene ylidyne ylium opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixPrefix.xml000066400000000000000000000014321451751637500321630ustar00rootroot00000000000000 carb|carbo|carbox|carbono sulfon|sulfono sulfin|sulfino selenon|selenono selenin|selenino telluron|tellurono tellurin|tellurino phosphon|phosphono arson|arsono stibon|stibono opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixRules.dtd000066400000000000000000000023401451751637500317720ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixRules.xml000066400000000000000000000347771451751637500320420ustar00rootroot00000000000000 
opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/suffixes.xml000066400000000000000000000170151451751637500313410ustar00rootroot00000000000000 amine|amin aminide|aminid aminium aminylium carbonylium carboxyamide|carboxyamid carboxylate|carboxylat carboxylic|carboxylicacid|carboxylic acid carboxylite|carboxylit diazonium imine|imin iminide|iminid iminium iminylium quinone|quinon ol olate|olat one|on selone|selon selenenate|selenenat selenenic|selenenicacid|selenenic acid sulfenoselenoate|sulfenoselenoat sulfenoselenoic|sulfenoselenoicacid|sulfenoselenoic acid sulfenothioate|sulfenothioat sulfenothioic|sulfenothioicacid|sulfenothioic acid sulfenamide|sulfenamid sulfenate|sulfenat sulfenic|sulfenicacid|sulfenic acid tellone|tellon tellurenate|tellurenat tellurenic|tellurenicacid|tellurenic acid arsonite|arsonit arsonous|arsonousacid|arsonous acid azonic|azonicacid|azonic acid azonate|azonat azonite|azonit azonous|azonousacid|azonous acid boronate|boronat boronic|boronicacid|boronic acid boronic pinacol ester|boronic acid pinacol ester|boronicacid pinacol ester|boronicacidpinacol ester|boronicacid pinacolester|boronicacidpinacolester|boronic acidpinacol ester|boronic acidpinacolester|boronic acid pinacolester carbamate|carbamat carbamic|carbamicacid|carbamic acid phosphonite|phosphonit phosphonous|phosphonousacid|phosphonous acid stibonite|stibonit stibonous|stibonousacid|stibonous acid sulfamate|sulfamat sulfamic|sulfamicacid|sulfamic acid al aldehydic|aldehydicacid|aldehydic acid aldoxime|aldoxim amate|amat amic|amicacid|amic acid anilate|anilat anilic|anilicacid|anilic acid ite|it nitrolic acid|nitrolicacid onaphthone|onaphthon|naphthone|naphthon ophenone|ophenon|phenone|phenon ous|ousacid|ous acid aldehyde|aldehyd amide|amid amidium anilide|anilid amidine|amidin amidinium amidrazone|amidrazon amidylium ate|at hydrazide|hydrazid hydroxamic|hydroxamicacid|hydroxamic acid hydroxamate|hydroxamat ic|icacid|ic acid ic acid anion ic o-acid ic 
s-acid ic se-acid ic te-acid nitrile|nitril nitrilium ylium aldehyde|aldehyd one|on selone|selon tellone|tellon thione|thion ol imide|ic imide|ic acid imide|imid|ic imid|ic acid imid imidium|ic imidium|ic acid imidium imidylium|ic imidylium|ic acid imidylium carbolactone|carbolacton dicarboximide|dicarboximid lactam lactim lactone|lacton olide|olid sultam sultim sultine|sultin sultone|sulton opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/tokenFiles.dtd000066400000000000000000000001021451751637500315500ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/tokenLists.dtd000066400000000000000000000177411451751637500316250ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/unsaturators.xml000066400000000000000000000007031451751637500322530ustar00rootroot00000000000000 ene|en yne|yn ane|an opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/wordRules.dtd000066400000000000000000000011341451751637500314410ustar00rootroot00000000000000 opsin-2.8.0/opsin-core/src/main/resources/uk/ac/cam/ch/wwmm/opsin/resources/wordRules.xml000066400000000000000000000346701451751637500315010ustar00rootroot00000000000000 
opsin-2.8.0/opsin-core/src/test/000077500000000000000000000000001451751637500165135ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/000077500000000000000000000000001451751637500174345ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/000077500000000000000000000000001451751637500200535ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/000077500000000000000000000000001451751637500204365ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/000077500000000000000000000000001451751637500211765ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/000077500000000000000000000000001451751637500215705ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500225575ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500237075ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/AmbiguityDetectionTest.java000066400000000000000000000022511451751637500312030ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvFileSource;

public class AmbiguityDetectionTest {

	private static NameToStructure n2s;

	@BeforeAll
	public static void setUp() {
		n2s = NameToStructure.getInstance();
	}

	@AfterAll
	public static void cleanUp(){
		n2s = null;
	}

	@ParameterizedTest
	@CsvFileSource(resources ="ambiguous.txt", delimiter='\t')
	public void testNamesThatShouldBeDetectedAsAmbiguous(String ambiguousName) {
		assertTrue(n2s.parseChemicalName(ambiguousName).nameAppearsToBeAmbiguous(), ambiguousName + " should be considered ambiguous");
	}

	@ParameterizedTest
	@CsvFileSource(resources ="unambiguous.txt", delimiter='\t')
	public void testUnAmbiguousCounterExamples(String unambiguousName) {
		assertFalse(n2s.parseChemicalName(unambiguousName).nameAppearsToBeAmbiguous(), unambiguousName + " should be considered unambiguous");
	}
}
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/AtomTest.java000066400000000000000000000034741451751637500263220ustar00rootroot00000000000000
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

public class AtomTest {

	private Fragment frag;
	private SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager());

	@BeforeEach
	public void setUp() {
		frag = new Fragment(mock(Element.class));
	}

	@Test
	public void testAtom() {
		Atom atom = new Atom(10, ChemEl.C, frag);
		assertNotNull(atom, "Got atom");
		assertEquals(10, atom.getID(), "Id = 10");
		assertEquals(ChemEl.C, atom.getElement(), "Element = C");
	}

	@Test
	public void testAddLocantHasLocant() {
		Atom atom = new Atom(10, ChemEl.C, frag);
		atom.addLocant("1");
		assertTrue(atom.hasLocant("1"), "Atom has locant '1'");
		assertFalse(atom.hasLocant("C"), "Atom has no locant 'C'");
		atom.addLocant("C");
		assertTrue(atom.hasLocant("C"), "Atom now has locant 'C'");
	}

	@Test
	public void testGetIncomingValency() throws StructureBuildingException {
		assertEquals(0, sBuilder.build("C").getFirstAtom().getIncomingValency(), "No bonds");
		assertEquals(1, sBuilder.build("CC").getFirstAtom().getIncomingValency(), "One bond");
		assertEquals(2, sBuilder.build("C(C)C").getFirstAtom().getIncomingValency(), "Two bonds");
		assertEquals(2, sBuilder.build("C=O").getFirstAtom().getIncomingValency(), "Double
bond"); assertEquals(3, sBuilder.build("C#C").getFirstAtom().getIncomingValency(), "Triple bond"); assertEquals(1, sBuilder.build("CC=CC#N").getFirstAtom().getIncomingValency(), "One bond"); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/BondTest.java000066400000000000000000000035301451751637500262750ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.mockito.Mockito.mock; import org.junit.jupiter.api.Test; import uk.ac.cam.ch.wwmm.opsin.Bond.SMILES_BOND_DIRECTION; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; public class BondTest { @Test public void testBond() { Fragment frag = new Fragment(mock(Element.class)); Atom a1 = new Atom(1, ChemEl.C, frag); Atom a2 = new Atom(2, ChemEl.C, frag); frag.addAtom(a1); frag.addAtom(a2); Bond bond = new Bond(a1, a2, 1); assertNotNull(bond, "Got bond"); assertEquals(1, bond.getFrom(), "From = 1"); assertEquals(2, bond.getTo(), "To = 2"); assertEquals(1, bond.getOrder(), "Order = 1"); assertEquals(a1, bond.getFromAtom()); assertEquals(a2, bond.getToAtom()); assertEquals(a2, bond.getOtherAtom(a1)); assertEquals(a1, bond.getOtherAtom(a2)); assertEquals(null, bond.getBondStereo()); assertEquals(null, bond.getSmilesStereochemistry()); } @Test public void testBondMutation() { Fragment frag = new Fragment(mock(Element.class)); Atom a1 = new Atom(1, ChemEl.C, frag); Atom a2 = new Atom(2, ChemEl.C, frag); Atom a3 = new Atom(3, ChemEl.C, frag); Atom a4 = new Atom(4, ChemEl.C, frag); frag.addAtom(a1); frag.addAtom(a2); frag.addAtom(a3); frag.addAtom(a4); Bond bond = new Bond(a2, a3, 1); bond.setOrder(2); assertEquals(2, bond.getOrder(), "Order = 2"); BondStereo bondStereo = new BondStereo(new Atom[]{a1,a2,a3,a4}, BondStereoValue.TRANS); bond.setBondStereo(bondStereo); assertEquals(bondStereo, bond.getBondStereo()); 
bond.setSmilesStereochemistry(SMILES_BOND_DIRECTION.LSLASH); assertEquals(SMILES_BOND_DIRECTION.LSLASH, bond.getSmilesStereochemistry()); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/CASToolsTest.java000066400000000000000000000164041451751637500270460ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertThrows; import java.io.IOException; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; /** * */ public class CASToolsTest { private static ParseRules parseRules; @BeforeAll public static void setUp() throws IOException{ ResourceGetter rg = new ResourceGetter("uk/ac/cam/ch/wwmm/opsin/resources/"); parseRules = new ParseRules(new ResourceManager(rg)); } @AfterAll public static void cleanUp() { parseRules = null; } @Test public void cas1() throws ParsingException{ String name = CASTools.uninvertCASName("Silane, chloromethyl-", parseRules); assertEquals("chloromethyl-Silane", name); } @Test public void cas2() throws ParsingException{ String name = CASTools.uninvertCASName("Acetic acid, 2-ethoxy-2-thioxo-", parseRules); assertEquals("2-ethoxy-2-thioxo-Acetic acid", name); } @Test public void cas3() throws ParsingException{ String name = CASTools.uninvertCASName("Silanol, 1,1'-methylenebis-", parseRules); assertEquals("1,1'-methylenebis-Silanol", name); } @Test public void cas4() throws ParsingException{ String name = CASTools.uninvertCASName("Phosphonic acid, P,P'-(8-methylene-3,7,10,14-tetraoxo-4,6,11,13-tetraazahexadecane-1,16-diyl)-bis-, P,P,P',P'-tetramethyl ester", parseRules); assertEquals("P,P,P',P'-tetramethyl P,P'-(8-methylene-3,7,10,14-tetraoxo-4,6,11,13-tetraazahexadecane-1,16-diyl)-bis-Phosphonate", name); } @Test public void cas5() throws ParsingException{ String name = CASTools.uninvertCASName("Benzenamine, 3,3',3''-(1-ethenyl-2-ylidene)tris[6-methyl-", 
parseRules); assertEquals("3,3',3''-(1-ethenyl-2-ylidene)tris[6-methyl-Benzenamine]", name); } @Test public void cas6() throws ParsingException{ String name = CASTools.uninvertCASName("Pyridine, 3,3'-thiobis[6-chloro-", parseRules); assertEquals("3,3'-thiobis[6-chloro-Pyridine]", name); } @Test public void cas7() throws ParsingException{ String name = CASTools.uninvertCASName("1-Butanesulfonic acid, 2,4-diamino-3-chloro- 1-ethyl ester", parseRules); assertEquals("1-ethyl 2,4-diamino-3-chloro-1-Butanesulfonate", name); } @Test public void cas8() throws ParsingException{ String name = CASTools.uninvertCASName("Benzenecarboximidamide, N'-(1E)-1-propen-1-yl-N-(1Z)-1-propen-1-yl-", parseRules); assertEquals("N'-(1E)-1-propen-1-yl-N-(1Z)-1-propen-1-yl-Benzenecarboximidamide", name); } @Test public void cas9() throws ParsingException{ String name = CASTools.uninvertCASName("Phosphoric acid, ethyl dimethyl ester", parseRules); assertEquals("ethyl dimethyl Phosphorate", name); } @Test public void cas10() throws ParsingException{ String name = CASTools.uninvertCASName("2-Propanone, oxime", parseRules); assertEquals("2-Propanone oxime", name); } @Test public void cas11() throws ParsingException{ String name = CASTools.uninvertCASName("Disulfide, bis(2-chloroethyl)", parseRules); assertEquals("bis(2-chloroethyl) Disulfide", name); } @Test public void cas12() throws ParsingException{ String name = CASTools.uninvertCASName("Ethanimidic acid, N-nitro-, (1Z)-", parseRules); assertEquals("(1Z)-N-nitro-Ethanimidic acid", name); } @Test public void cas13() throws ParsingException{ String name = CASTools.uninvertCASName("2(1H)-Pyridinone, hydrazone, (2E)-", parseRules); assertEquals("(2E)-2(1H)-Pyridinone hydrazone", name); } @Test public void cas14() throws ParsingException{ String name = CASTools.uninvertCASName("benzoic acid, 4,4'-methylenebis[2-chloro-", parseRules); assertEquals("4,4'-methylenebis[2-chloro-benzoic acid]", name); } @Test public void cas15() throws 
ParsingException{ String name = CASTools.uninvertCASName("peroxide, ethyl methyl", parseRules); assertEquals("ethyl methyl peroxide", name); } @Test public void cas16() throws ParsingException{ String name = CASTools.uninvertCASName("Phosphonic diamide, P-phenyl- (8CI9CI)", parseRules); assertEquals("P-phenyl-Phosphonic diamide", name); } @Test public void cas17() throws ParsingException{ String name = CASTools.uninvertCASName("piperazinium, 1,1-dimethyl-, 2,2,2-trifluoroacetate hydrochloride", parseRules); assertEquals("1,1-dimethyl-piperazinium 2,2,2-trifluoroacetate hydrochloride", name); } @Test public void cas18() throws ParsingException{ String name = CASTools.uninvertCASName("Acetamide, ethylenebis(((ethyl)amino)-", parseRules); assertEquals("ethylenebis(((ethyl)amino)-Acetamide)", name); } @Test public void cas19() throws ParsingException{ String name = CASTools.uninvertCASName("Benzenesulfonic acid, 4-amino-, 1-methylhydrazide", parseRules); assertEquals("4-amino-Benzenesulfonic acid 1-methylhydrazide", name); } @Test public void cas20() throws ParsingException{ String name = CASTools.uninvertCASName("Acetaldehyde, O-methyloxime", parseRules); assertEquals("Acetaldehyde O-methyloxime", name); } @Test public void cas21() throws ParsingException{ String name = CASTools.uninvertCASName("Acetic acid, 2-amino-2-oxo-, 2-(phenylmethylene)hydrazide", parseRules); assertEquals("2-amino-2-oxo-Acetic acid 2-(phenylmethylene)hydrazide", name); } @Test public void cas22() throws ParsingException{ String name = CASTools.uninvertCASName("L-Alanine, N-carboxy-, 1-ethyl ester", parseRules); assertEquals("1-ethyl N-carboxy-L-Alaninate", name); } @Test public void cas23() throws ParsingException{ String name = CASTools.uninvertCASName("Pyridine, 3-(tetrahydro-2H-pyran-2-yl)-, (S)-", parseRules); assertEquals("(S)-3-(tetrahydro-2H-pyran-2-yl)-Pyridine", name); } @Test public void cas24() throws ParsingException{ String name = 
CASTools.uninvertCASName("Pyrrolo[1,2-a]pyrimidinium, 1-[4-[(aminoiminomethyl)amino]butyl]-7-[[2-[(aminoiminomethyl)-amino]ethyl]thio]-6-(11-dodecenyl)-2,3,4,6,7,8-hexahydro-6-hydroxy-, chloride, dihydrochloride", parseRules); assertEquals("1-[4-[(aminoiminomethyl)amino]butyl]-7-[[2-[(aminoiminomethyl)-amino]ethyl]thio]-6-(11-dodecenyl)-2,3,4,6,7,8-hexahydro-6-hydroxy-Pyrrolo[1,2-a]pyrimidinium chloride dihydrochloride", name); } @Test public void cas25() throws ParsingException{ //In acetic acid, sodium salt (1:1), the stoichiometry is removed prior to uninversion String name = CASTools.uninvertCASName("acetic acid, sodium salt", parseRules); assertEquals("acetic acid sodium salt", name); } @Test public void commaDelimitedAcidSalt() throws ParsingException{ //I don't think this is actually a CAS name, but it's easiest to correct it in this function String name = CASTools.uninvertCASName("benzamide, trifluoroacetic acid salt", parseRules); assertEquals("benzamide trifluoroacetic acid salt", name); } @Test public void notCas1(){ assertThrows(ParsingException.class, () -> { CASTools.uninvertCASName("hexanamine, hexylamine", parseRules); }); } @Test() public void notCas2() { assertThrows(ParsingException.class, () -> { CASTools.uninvertCASName("cyclopropane-1,2-diyldicarbonyl diisocyanate, cyclopropane-1,2-diylbis(carbonyl)bisisocyanate", parseRules); }); } @Test() public void notCas3() { assertThrows(ParsingException.class, () -> { CASTools.uninvertCASName("benzoic acid, ester", parseRules); }); } } ComponentGeneration_AmbiguitiesAndIrregularitiesTest.java000066400000000000000000000205101451751637500371670ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsinpackage uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertThrows; import static org.junit.jupiter.api.Assertions.fail; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import org.junit.jupiter.api.Test; public class 
ComponentGeneration_AmbiguitiesAndIrregularitiesTest { @Test public void testCorrectlyTokenisedAlkane(){ Element substituent = new GroupingEl(SUBSTITUENT_EL); Element alkaneComponent1 = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent1.addAttribute(new Attribute(VALUE_ATR, "4")); Element alkaneComponent2 = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent2.addAttribute(new Attribute(VALUE_ATR, "10")); substituent.addChild(alkaneComponent1); substituent.addChild(alkaneComponent2); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("alkane was well formed, exception should not be thrown"); } } @Test public void testCorrectlyTokenisedAlkane2(){ Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "2")); Element alkaneComponent = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent.addAttribute(new Attribute(VALUE_ATR, "10")); substituent.addChild(multiplier); substituent.addChild(alkaneComponent); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("alkane was well formed, exception should not be thrown"); } } @Test public void testCorrectlyTokenisedAlkane3(){//unambiguously 6 hexanes Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "6")); Element alkaneComponent = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent.addAttribute(new Attribute(VALUE_ATR, "6")); substituent.addChild(multiplier); substituent.addChild(alkaneComponent); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("alkane was well formed, exception should not be thrown"); } } @Test() // tetradec is 14 not 4 x 10 public 
void testMisTokenisedAlkane() { assertThrows(ComponentGenerationException.class, () -> { Element substituent = new GroupingEl(SUBSTITUENT_EL); Element erroneousMultiplier = new TokenEl(MULTIPLIER_EL); erroneousMultiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); erroneousMultiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element alkaneComponent2 = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent2.addAttribute(new Attribute(VALUE_ATR, "10")); substituent.addChild(erroneousMultiplier); substituent.addChild(alkaneComponent2); ComponentGenerator.resolveAmbiguities(substituent); }); } @Test public void testLocantsIndicatingTokenizationIsCorrect(){//should be a group multiplier formally Element substituent = new GroupingEl(SUBSTITUENT_EL); Element locant = new TokenEl(LOCANT_EL, "1,2,3,4"); substituent.addChild(locant); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element alkaneComponent = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent.addAttribute(new Attribute(VALUE_ATR, "10")); substituent.addChild(multiplier); substituent.addChild(alkaneComponent); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("alkane was well formed, exception should not be thrown"); } } @Test() // tetradec is 14 not 4 x 10 public void testLocantsIndicatingTokenizationIsIncorrect() { assertThrows(ComponentGenerationException.class, () -> { Element substituent = new GroupingEl(SUBSTITUENT_EL); Element locant = new TokenEl(LOCANT_EL, "1"); substituent.addChild(locant); Element erroneousMultiplier = new TokenEl(MULTIPLIER_EL); erroneousMultiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); erroneousMultiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element alkaneComponent = new TokenEl(ALKANESTEMCOMPONENT); alkaneComponent.addAttribute(new Attribute(VALUE_ATR, "10")); 
substituent.addChild(erroneousMultiplier); substituent.addChild(alkaneComponent); ComponentGenerator.resolveAmbiguities(substituent); }); } @Test() public void testTetraphenShouldBeTetra_Phen1() {// tetraphenyl assertThrows(ComponentGenerationException.class, () -> { Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element phen = new TokenEl(HYDROCARBONFUSEDRINGSYSTEM_EL, "phen"); Element yl = new TokenEl(SUFFIX_EL, "yl"); substituent.addChild(multiplier); substituent.addChild(phen); substituent.addChild(yl); ComponentGenerator.resolveAmbiguities(substituent); }); } @Test() public void testTetraphenShouldBeTetra_Phen2() {// tetraphenoxy assertThrows(ComponentGenerationException.class, () -> { Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element phen = new TokenEl(HYDROCARBONFUSEDRINGSYSTEM_EL, "phen"); Element yl = new TokenEl(SUFFIX_EL, "oxy"); substituent.addChild(multiplier); substituent.addChild(phen); substituent.addChild(yl); ComponentGenerator.resolveAmbiguities(substituent); }); } @Test public void testTetraphenShouldBeTetraphen1(){//tetrapheneyl Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element phen = new TokenEl(HYDROCARBONFUSEDRINGSYSTEM_EL, "phen"); phen.addAttribute(new Attribute(SUBSEQUENTUNSEMANTICTOKEN_ATR, "e")); Element yl = new TokenEl(SUFFIX_EL, "yl"); substituent.addChild(multiplier); substituent.addChild(phen); substituent.addChild(yl); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch 
(ComponentGenerationException e) { fail("tetraphene was the intended interpretation"); } } @Test public void testTetraphenShouldBeTetraphen2(){//tetraphen2yl Element substituent = new GroupingEl(SUBSTITUENT_EL); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element phen = new TokenEl(HYDROCARBONFUSEDRINGSYSTEM_EL, "phen"); Element locant = new TokenEl(LOCANT_EL, "2"); Element yl = new TokenEl(SUFFIX_EL, "yl"); substituent.addChild(multiplier); substituent.addChild(phen); substituent.addChild(locant); substituent.addChild(yl); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("tetraphen as in tetraphene was the intended interpretation"); } } @Test public void testTetraphenShouldBeTetraphen3(){//2tetraphenyl Element substituent = new GroupingEl(SUBSTITUENT_EL); Element locant = new TokenEl(LOCANT_EL, "2"); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(TYPE_ATR, BASIC_TYPE_VAL)); multiplier.addAttribute(new Attribute(VALUE_ATR, "4")); Element phen = new TokenEl(HYDROCARBONFUSEDRINGSYSTEM_EL, "phen"); Element yl = new TokenEl(SUFFIX_EL, "yl"); substituent.addChild(locant); substituent.addChild(multiplier); substituent.addChild(phen); substituent.addChild(yl); try{ ComponentGenerator.resolveAmbiguities(substituent); } catch (ComponentGenerationException e) { fail("tetraphen as in tetraphene was the intended interpretation"); } } //TODO multiplier oxy tests, fusion vs Hw locants, and handleGroupIrregularities tests } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/ComponentGeneration_MiscTest.java000066400000000000000000000042711451751637500323470ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertThrows; import static 
uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import org.junit.jupiter.api.Test; public class ComponentGeneration_MiscTest { @Test() public void testRejectSingleComponentSaltComponent() { assertThrows(ComponentGenerationException.class, () -> { // reject "hydrate" Element molecule = new GroupingEl(MOLECULE_EL); Element wordRule = new GroupingEl(WORDRULE_EL); Element word = new GroupingEl(WORD_EL); Element root = new GroupingEl(ROOT_EL); Element group = new TokenEl(GROUP_EL); group.addAttribute(new Attribute(TYPE_ATR, SIMPLEGROUP_TYPE_VAL)); group.addAttribute(new Attribute(SUBTYPE_ATR, SALTCOMPONENT_SUBTYPE_VAL)); root.addChild(group); word.addChild(root); wordRule.addChild(word); molecule.addChild(wordRule); processComponents(molecule); }); } @Test public void testNumericallyMultipliedSaltComponent() throws ComponentGenerationException { Element molecule = new GroupingEl(MOLECULE_EL); molecule.addChild(new GroupingEl(WORDRULE_EL)); Element wordRule = new GroupingEl(WORDRULE_EL); Element word = new GroupingEl(WORD_EL); Element root = new GroupingEl(ROOT_EL); Element group = new TokenEl(GROUP_EL); group.addAttribute(new Attribute(TYPE_ATR, SIMPLEGROUP_TYPE_VAL)); group.addAttribute(new Attribute(SUBTYPE_ATR, SALTCOMPONENT_SUBTYPE_VAL)); group.setValue("2hcl"); root.addChild(group); word.addChild(root); wordRule.addChild(word); molecule.addChild(wordRule); processComponents(molecule); assertEquals(2, root.getChildCount()); Element multiplier = root.getChild(0); assertEquals(MULTIPLIER_EL, multiplier.getName()); assertEquals("2", multiplier.getAttributeValue(VALUE_ATR)); assertEquals("2", multiplier.getValue()); Element updatedGroup = root.getChild(1); assertEquals("hcl", updatedGroup.getValue()); } private void processComponents(Element parse) throws ComponentGenerationException { new ComponentGenerator(new BuildState(new NameToStructureConfig())).processParse(parse); } } 
ComponentGeneration_ProcesslocantsTest.java000066400000000000000000000277561451751637500344140ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsinpackage uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertNotNull; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; public class ComponentGeneration_ProcesslocantsTest { private Element locant; private Element substituent; @BeforeEach public void setUpSubstituent(){ substituent = new GroupingEl(SUBSTITUENT_EL); locant = new TokenEl(LOCANT_EL); substituent.addChild(locant); substituent.addChild(new TokenEl(GROUP_EL));//a dummy element to give the locant a potential purpose } @Test public void testCardinalNumber() throws ComponentGenerationException { locant.setValue("1"); ComponentGenerator.processLocants(substituent); assertEquals("1", locant.getValue()); } @Test public void testCardinalNumberWithHyphen() throws ComponentGenerationException { locant.setValue("1-"); ComponentGenerator.processLocants(substituent); assertEquals("1", locant.getValue()); } @Test public void testElementSymbol() throws ComponentGenerationException { locant.setValue("N-"); ComponentGenerator.processLocants(substituent); assertEquals("N", locant.getValue()); } @Test public void testAminoAcidStyleLocant() throws ComponentGenerationException { locant.setValue("N1-"); ComponentGenerator.processLocants(substituent); assertEquals("N1", locant.getValue()); } @Test public void testCompoundLocant() throws ComponentGenerationException { locant.setValue("1(10)-"); ComponentGenerator.processLocants(substituent); assertEquals("1(10)", locant.getValue()); } @Test public void testGreek() throws ComponentGenerationException { locant.setValue("alpha"); ComponentGenerator.processLocants(substituent); assertEquals("alpha", locant.getValue()); } @Test public 
void testNotlowercase1() throws ComponentGenerationException { locant.setValue("AlPhA-"); ComponentGenerator.processLocants(substituent); assertEquals("alpha", locant.getValue()); } @Test public void testNotlowercase2() throws ComponentGenerationException { locant.setValue("NAlPhA-"); ComponentGenerator.processLocants(substituent); assertEquals("Nalpha", locant.getValue()); } @Test public void testIUPAC2004() throws ComponentGenerationException { locant.setValue("2-N-"); ComponentGenerator.processLocants(substituent); assertEquals("N2", locant.getValue()); } @Test public void testSuperscript1() throws ComponentGenerationException { locant.setValue("N^(2)"); ComponentGenerator.processLocants(substituent); assertEquals("N2", locant.getValue()); } @Test public void testSuperscript2() throws ComponentGenerationException { locant.setValue("N^2"); ComponentGenerator.processLocants(substituent); assertEquals("N2", locant.getValue()); } @Test public void testSuperscript3() throws ComponentGenerationException { locant.setValue("N(2)"); ComponentGenerator.processLocants(substituent); assertEquals("N2", locant.getValue()); } @Test public void testSuperscript4() throws ComponentGenerationException { locant.setValue("N~12~"); ComponentGenerator.processLocants(substituent); assertEquals("N12", locant.getValue()); } @Test public void testSuperscript5() throws ComponentGenerationException { locant.setValue("N(alpha)"); ComponentGenerator.processLocants(substituent); assertEquals("Nalpha", locant.getValue()); } @Test public void testSuperscript6() throws ComponentGenerationException { locant.setValue("N^alpha"); ComponentGenerator.processLocants(substituent); assertEquals("Nalpha", locant.getValue()); } @Test public void testSuperscript7() throws ComponentGenerationException { locant.setValue("N*12*"); ComponentGenerator.processLocants(substituent); assertEquals("N12", locant.getValue()); } @Test public void testAddedHydrogen() throws ComponentGenerationException { 
locant.setValue("3(5'H)"); ComponentGenerator.processLocants(substituent); assertEquals("3", locant.getValue()); assertEquals(ADDEDHYDROGENLOCANT_TYPE_VAL, locant.getAttributeValue(TYPE_ATR)); Element addedHydrogen = OpsinTools.getPreviousSibling(locant); assertNotNull(addedHydrogen); assertEquals(ADDEDHYDROGEN_EL, addedHydrogen.getName()); assertEquals("5'", addedHydrogen.getAttributeValue(LOCANT_ATR)); } @Test public void testAddedHydrogen2() throws ComponentGenerationException { locant.setValue("1,2(2H,7H)"); ComponentGenerator.processLocants(substituent); assertEquals("1,2", locant.getValue()); assertEquals(ADDEDHYDROGENLOCANT_TYPE_VAL, locant.getAttributeValue(TYPE_ATR)); Element addedHydrogen1 = OpsinTools.getPreviousSibling(locant); assertNotNull(addedHydrogen1); assertEquals(ADDEDHYDROGEN_EL, addedHydrogen1.getName()); assertEquals("7", addedHydrogen1.getAttributeValue(LOCANT_ATR)); Element addedHydrogen2 = OpsinTools.getPreviousSibling(addedHydrogen1); assertNotNull(addedHydrogen2); assertEquals(ADDEDHYDROGEN_EL, addedHydrogen2.getName()); assertEquals("2", addedHydrogen2.getAttributeValue(LOCANT_ATR)); } @Test public void testStereochemistryInLocant1() throws ComponentGenerationException { locant.setValue("5(R)"); ComponentGenerator.processLocants(substituent); assertEquals("5", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(5R)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testStereochemistryInLocant2() throws ComponentGenerationException { locant.setValue("5-(S)"); ComponentGenerator.processLocants(substituent); assertEquals("5", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); 
assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(5S)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testStereochemistryInLocant3() throws ComponentGenerationException { locant.setValue("N(3)-(S)"); ComponentGenerator.processLocants(substituent); assertEquals("N3", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(N3S)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testStereochemistryInLocant4() throws ComponentGenerationException { locant.setValue("5(RS)"); ComponentGenerator.processLocants(substituent); assertEquals("5", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(5RS)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testStereochemistryInLocant5() throws ComponentGenerationException { locant.setValue("5(R,S)"); ComponentGenerator.processLocants(substituent); assertEquals("5", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(5RS)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testStereochemistryInLocant6() throws 
ComponentGenerationException { locant.setValue("5(R/S)"); ComponentGenerator.processLocants(substituent); assertEquals("5", locant.getValue()); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(5RS)", stereochemistry.getValue());//will be handled by process stereochemistry function } @Test public void testMultipleCardinals() throws ComponentGenerationException { locant.setValue("2,3-"); ComponentGenerator.processLocants(substituent); assertEquals("2,3", locant.getValue()); } @Test public void testMultipleTypesTogether() throws ComponentGenerationException { locant.setValue("2,N5,GaMMa,3-N,N^3,N(2),N~10~,4(5H),3-N(S),1(6)-"); ComponentGenerator.processLocants(substituent); assertEquals("2,N5,gamma,N3,N3,N2,N10,4,N3,1(6)", locant.getValue()); assertEquals(ADDEDHYDROGENLOCANT_TYPE_VAL, locant.getAttributeValue(TYPE_ATR)); Element stereochemistry = OpsinTools.getPreviousSibling(locant); assertNotNull(stereochemistry); assertEquals(STEREOCHEMISTRY_EL, stereochemistry.getName()); assertEquals(STEREOCHEMISTRYBRACKET_TYPE_VAL, stereochemistry.getAttributeValue(TYPE_ATR)); assertEquals("(N3S)", stereochemistry.getValue()); Element addedHydrogen = OpsinTools.getPreviousSibling(stereochemistry); assertNotNull(addedHydrogen); assertEquals(ADDEDHYDROGEN_EL, addedHydrogen.getName()); assertEquals("5", addedHydrogen.getAttributeValue(LOCANT_ATR)); } @Test public void testCarbohydrateStyleLocants() throws ComponentGenerationException { //2,4,6-tri-O locant.setValue("O"); Element multiplier = new TokenEl(MULTIPLIER_EL); multiplier.addAttribute(new Attribute(VALUE_ATR, "3")); OpsinTools.insertBefore(locant, multiplier); Element numericLocant = new TokenEl(LOCANT_EL); numericLocant.setValue("2,4,6"); OpsinTools.insertBefore(multiplier, numericLocant); 
		ComponentGenerator.processLocants(substituent);
		assertEquals("O2,O4,O6", numericLocant.getValue());
		Element group = OpsinTools.getNextSibling(multiplier);
		assertNotNull(group);
		assertEquals(GROUP_EL, group.getName());
	}

	@Test
	public void testCarbohydrateStyleLocantsNoNumericComponent() throws ComponentGenerationException {
		//tri-O
		locant.setValue("O");
		Element multiplier = new TokenEl(MULTIPLIER_EL);
		multiplier.addAttribute(new Attribute(VALUE_ATR, "3"));
		OpsinTools.insertBefore(locant, multiplier);
		ComponentGenerator.processLocants(substituent);
		Element elBeforeMultiplier = OpsinTools.getPreviousSibling(multiplier);
		assertNotNull(elBeforeMultiplier, "A locant should be in front of the multiplier");
		assertEquals(LOCANT_EL, elBeforeMultiplier.getName());
		assertEquals("O,O',O''", elBeforeMultiplier.getValue());
		Element group = OpsinTools.getNextSibling(multiplier);
		assertNotNull(group);
		assertEquals(GROUP_EL, group.getName());
	}

	@Test
	public void testCarbohydrateStyleLocantsCounterExample() throws ComponentGenerationException {
		//2,4,6-tri-2 (this is not a carbohydrate style locant)
		locant.setValue("2");
		Element multiplier = new TokenEl(MULTIPLIER_EL);
		multiplier.addAttribute(new Attribute(VALUE_ATR, "3"));
		OpsinTools.insertBefore(locant, multiplier);
		Element numericLocant = new TokenEl(LOCANT_EL);
		numericLocant.setValue("2,4,6");
		OpsinTools.insertBefore(multiplier, numericLocant);
		ComponentGenerator.processLocants(substituent);
		assertEquals("2,4,6", numericLocant.getValue());
		Element unmodifiedLocant = OpsinTools.getNextSibling(multiplier);
		assertNotNull(unmodifiedLocant);
		assertEquals(LOCANT_EL, unmodifiedLocant.getName());
		assertEquals("2", unmodifiedLocant.getValue());
	}
}
ComponentGeneration_StereochemistryTest.java000066400000000000000000001603661451751637500345760ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static
uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

import java.util.List;

import org.junit.jupiter.api.Test;

public class ComponentGeneration_StereochemistryTest {

	@Test
	public void testUnlocantedS() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(S)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testMultipleUnLocanted() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(R,R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals(null, newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals(null, newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL,
			newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testLocantedR() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(1R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("1", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testMultipleRorSLocanted() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(alphaR,3S,7'S)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("alpha", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("3", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl3 = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl3.getName());
		assertEquals("7'", newStereochemistryEl3.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl3.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl3.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testUnLocantedE() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(E)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("E", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testLocantedZ() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(5Z)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testMultipleRorSorEorZ() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new
			TokenEl(STEREOCHEMISTRY_EL, "(NZ,2E,R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("N", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("2", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("E", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl3 = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl3.getName());
		assertEquals(null, newStereochemistryEl3.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl3.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl3.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testDashInsteadOfComma() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(NZ,2E-R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("N", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("Z",
			newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("2", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("E", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl3 = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl3.getName());
		assertEquals(null, newStereochemistryEl3.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl3.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl3.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testBracketedLocantedCisTrans() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(3cis,5trans)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("3", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("cis", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("5", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("trans", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void
			testBracketedUnlocantedCisTrans() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(5S-trans)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("5", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals(null, newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("trans", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testBracketedExo() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(exo)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("exo", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ENDO_EXO_SYN_ANTI_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testBracketedEndo() throws ComponentGenerationException {
		Element substituent
			= new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(3-endo,5S)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("3", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("endo", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(ENDO_EXO_SYN_ANTI_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("5", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testLocantedCisTrans() throws ComponentGenerationException {
		//XML for 3-cis,5-trans:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "3");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "cis");
		stereochem.addAttribute(new Attribute(TYPE_ATR, CISORTRANS_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "cis"));
		substituent.addChild(stereochem);
		locant = new TokenEl(LOCANT_EL, "5");
		substituent.addChild(locant);
		stereochem = new TokenEl(STEREOCHEMISTRY_EL, "trans");
		stereochem.addAttribute(new Attribute(TYPE_ATR, CISORTRANS_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "trans"));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL,
			modifiedStereochemistryEl1.getName());
		assertEquals("3", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("cis", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element modifiedStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl2.getName());
		assertEquals("5", modifiedStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("trans", modifiedStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, modifiedStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testLocantedExoOn() throws ComponentGenerationException {
		//XML for 3-exobicyclo[2.2.2]oct:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "3");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "exo");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ENDO_EXO_SYN_ANTI_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "exo"));
		substituent.addChild(stereochem);
		Element multiplier = new TokenEl(MULTIPLIER_EL);
		multiplier.addAttribute(new Attribute(TYPE_ATR, VONBAEYER_TYPE_VAL));
		substituent.addChild(multiplier);
		Element vonBaeyer = new TokenEl(VONBAEYER_EL);
		substituent.addChild(vonBaeyer);
		Element group = new TokenEl(GROUP_EL);
		group.addAttribute(new Attribute(TYPE_ATR, CHAIN_TYPE_VAL));
		group.addAttribute(new Attribute(SUBTYPE_ATR, ALKANESTEM_SUBTYPE_VAL));
		substituent.addChild(group);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(4, children.size());
		Element modifiedStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl.getName());
		assertEquals("3", modifiedStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("exo", modifiedStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ENDO_EXO_SYN_ANTI_TYPE_VAL, modifiedStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testLocantedExo() throws ComponentGenerationException {
		//XML for 3-exoamino
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "3");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "exo");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ENDO_EXO_SYN_ANTI_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "exo"));
		substituent.addChild(stereochem);
		Element group = new TokenEl(GROUP_EL);
		group.addAttribute(new Attribute(TYPE_ATR, SUBSTITUENT_EL));
		group.addAttribute(new Attribute(SUBTYPE_ATR, SIMPLESUBSTITUENT_SUBTYPE_VAL));
		substituent.addChild(group);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		assertEquals(LOCANT_EL, children.get(0).getName());
		Element modifiedStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl.getName());
		assertEquals("3", modifiedStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("exo", modifiedStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ENDO_EXO_SYN_ANTI_TYPE_VAL, modifiedStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAnti() throws ComponentGenerationException {
		//XML for anti:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "anti");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ENDO_EXO_SYN_ANTI_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "anti"));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element unmodifiedStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, unmodifiedStereochemistryEl.getName());
		assertEquals("anti",
			unmodifiedStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ENDO_EXO_SYN_ANTI_TYPE_VAL, unmodifiedStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testCis() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "cis");
		stereochem.addAttribute(new Attribute(TYPE_ATR, CISORTRANS_TYPE_VAL));
		stereochem.addAttribute(new Attribute(VALUE_ATR, "cis"));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals(null, modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("cis", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(CISORTRANS_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAxial1() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(M)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("M", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(AXIAL_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAxial2() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(Ra)");
		stereochem.addAttribute(new Attribute(TYPE_ATR,
			STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("Ra", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(AXIAL_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testZUnbracketted() throws ComponentGenerationException {//note that IUPAC mandates brackets
		//XML for Z,Z:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "Z");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		stereochem = new TokenEl(STEREOCHEMISTRY_EL, "Z");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals(null, modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element modifiedStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl2.getName());
		assertEquals(null, modifiedStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", modifiedStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testEandZUnbrackettedLocanted() throws ComponentGenerationException {//note that IUPAC mandates brackets
		//XML for 2E,4Z:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "2");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "E");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		locant = new TokenEl(LOCANT_EL, "4");
		substituent.addChild(locant);
		stereochem = new TokenEl(STEREOCHEMISTRY_EL, "Z");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("2", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("E", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element modifiedStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl2.getName());
		assertEquals("4", modifiedStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", modifiedStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testEandZUnbrackettedBeforeEne() throws ComponentGenerationException {//not allowed in IUPAC names
		//XML for 2E,4Z-diene:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "2");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "E");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		locant = new TokenEl(LOCANT_EL, "4");
		substituent.addChild(locant);
		stereochem = new TokenEl(STEREOCHEMISTRY_EL, "Z");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		Element multiplier = new TokenEl(MULTIPLIER_EL, "di");
		multiplier.addAttribute(new Attribute(VALUE_ATR, "2"));
		substituent.addChild(multiplier);
		Element unsaturator = new TokenEl(UNSATURATOR_EL, "ene");
		unsaturator.addAttribute(new Attribute(VALUE_ATR, "2"));
		substituent.addChild(unsaturator);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(5, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("2", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("E", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element modifiedStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl2.getName());
		assertEquals("4", modifiedStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", modifiedStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newLocant = children.get(2);
		assertEquals(LOCANT_EL, newLocant.getName());
		assertEquals("2,4", newLocant.getValue());
		assertEquals(MULTIPLIER_EL, children.get(3).getName());
		assertEquals(UNSATURATOR_EL, children.get(4).getName());
	}

	@Test
	public void testEandZUnbrackettedBeforeYlidene() throws ComponentGenerationException {//not allowed in IUPAC names
		//XML for 2Z-ylidene:
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "2");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "Z");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		Element suffix = new TokenEl(SUFFIX_EL, "ylidene");
		suffix.addAttribute(new Attribute(VALUE_ATR, "ylidene"));
		substituent.addChild(suffix);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("2", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("Z", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newLocant = children.get(1);
		assertEquals(LOCANT_EL, newLocant.getName());
		assertEquals("2", newLocant.getValue());
		assertEquals(SUFFIX_EL, children.get(2).getName());
	}

	@Test
	public void testBrackettedAlphaBeta() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(1a,2b,3bEtA,4alpha,5xi)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		Element naturalProduct = new TokenEl(GROUP_EL);
		naturalProduct.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		naturalProduct.addAttribute(new Attribute(ALPHABETACLOCKWISEATOMORDERING_ATR, ""));
		substituent.addChild(naturalProduct);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(6, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("1", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("2", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(3);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("4", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(4);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("xi", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAlphaBeta() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "3beta,5alpha");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL));
		substituent.addChild(stereochem);
		Element naturalProduct = new TokenEl(GROUP_EL);
		naturalProduct.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		naturalProduct.addAttribute(new Attribute(ALPHABETACLOCKWISEATOMORDERING_ATR, ""));
		substituent.addChild(naturalProduct);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta",
			newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAlphaBetaNotDirectlyPrecedingANaturalProduct1() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "3beta,5alpha");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		Element newLocantEl = children.get(2);
		assertEquals(LOCANT_EL, newLocantEl.getName());
		assertEquals("3,5", newLocantEl.getValue());
	}

	@Test
	public void testAlphaBetaNotDirectlyPrecedingANaturalProduct2() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(3beta,5alpha)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testAlphaBetaNotDirectlyPrecedingANaturalProduct3() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element naturalProduct = new TokenEl(GROUP_EL);
		naturalProduct.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		naturalProduct.addAttribute(new Attribute(ALPHABETACLOCKWISEATOMORDERING_ATR, ""));
		substituent.addChild(naturalProduct);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "3beta,5alpha");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(4, children.size());
		Element newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));


		newStereochemistryEl = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("5", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		Element newLocantEl = children.get(3);
		assertEquals(LOCANT_EL, newLocantEl.getName());
		assertEquals("3,5", newLocantEl.getValue());
	}

	@Test
	public void testAlphaBetaStereoMixedWithNormalLocants() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "3beta,4,10,12alpha");
		stereochem.addAttribute(new Attribute(TYPE_ATR, ALPHA_OR_BETA_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(3, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("3", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("beta", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		newStereochemistryEl = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("12", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("alpha", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(ALPHA_OR_BETA_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));

		Element newLocantEl = children.get(2);
		assertEquals(LOCANT_EL, newLocantEl.getName());
		assertEquals("3,4,10,12", newLocantEl.getValue());
	}

	//relative stereochemistry is currently treated the same as absolute stereochemistry
	@Test
	public void testRelativeStereoChemistry1() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element
			stereochem = new TokenEl(STEREOCHEMISTRY_EL, "rel-(1R,3S,4S,7R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(4, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("1", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("3", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl3 = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl3.getName());
		assertEquals("4", newStereochemistryEl3.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl3.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl3.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl4 = children.get(3);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl4.getName());
		assertEquals("7", newStereochemistryEl4.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl4.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl4.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRelativeStereoChemistry2() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(1R*,3S*,4S*,7R*)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(4, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("1", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("3", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl3 = children.get(2);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl3.getName());
		assertEquals("4", newStereochemistryEl3.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl3.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl3.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl4 = children.get(3);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl4.getName());
		assertEquals("7", newStereochemistryEl4.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl4.getAttributeValue(VALUE_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl4.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRelativeStereoChemistry3() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "rel-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element stereoElement = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, stereoElement.getName());
		assertEquals(null, stereoElement.getAttributeValue(LOCANT_ATR));
		assertEquals(REL_TYPE_VAL, stereoElement.getAttributeValue(TYPE_ATR));
	}

	//relativeCisTrans is only supported sufficiently to get constitutionally correct results i.e. locants extracted from the stereochemistry
	@Test
	public void testRelativeCisTrans() throws ComponentGenerationException {
		//c-4-
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "c-4-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, RELATIVECISTRANS_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals(null, modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("c-4-", modifiedStereochemistryEl1.getValue());
		assertEquals(RELATIVECISTRANS_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element locant = children.get(1);
		assertEquals(LOCANT_EL, locant.getName());
		assertEquals("4", locant.getValue());
	}

	@Test
	public void testRacemate1() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "rac-(2R)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals("2", newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(),
			newStereochemistryEl.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate2() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(RS)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate2_ci() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(rs)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate3() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(SR)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate4() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "rac-(2R,4S)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("2", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl1.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("4", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl2.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL,
			newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate5() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(2RS,4SR)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(2, children.size());
		Element newStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl1.getName());
		assertEquals("2", newStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl1.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl1.getAttributeValue(TYPE_ATR));

		Element newStereochemistryEl2 = children.get(1);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl2.getName());
		assertEquals("4", newStereochemistryEl2.getAttributeValue(LOCANT_ATR));
		assertEquals("S", newStereochemistryEl2.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl2.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl2.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate6() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "rac-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element stereoElement = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, stereoElement.getName());
		assertEquals(null, stereoElement.getAttributeValue(LOCANT_ATR));
		assertEquals(RAC_TYPE_VAL,
			stereoElement.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate7() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "racem-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element stereoElement = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, stereoElement.getName());
		assertEquals(null, stereoElement.getAttributeValue(LOCANT_ATR));
		assertEquals(RAC_TYPE_VAL, stereoElement.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate8() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "racemic-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element stereoElement = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, stereoElement.getName());
		assertEquals(null, stereoElement.getAttributeValue(LOCANT_ATR));
		assertEquals(RAC_TYPE_VAL, stereoElement.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate9() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(R/S)-");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element newStereochemistryEl = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, newStereochemistryEl.getName());
		assertEquals(null, newStereochemistryEl.getAttributeValue(LOCANT_ATR));
		assertEquals("R", newStereochemistryEl.getAttributeValue(VALUE_ATR));
		assertEquals(StereoGroupType.Rac.name(), newStereochemistryEl.getAttributeValue(STEREOGROUP_ATR));
		assertEquals(R_OR_S_TYPE_VAL, newStereochemistryEl.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemate10() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(RAC)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element stereoElement = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, stereoElement.getName());
		assertEquals(null, stereoElement.getAttributeValue(LOCANT_ATR));
		assertEquals(RAC_TYPE_VAL, stereoElement.getAttributeValue(TYPE_ATR));
	}

	// TODO (R or S)- (R and S)-

	@Test
	public void testRacemateEz1() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(EZ)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("EZ", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemateEz2() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "(2EZ)");
		stereochem.addAttribute(new Attribute(TYPE_ATR, STEREOCHEMISTRYBRACKET_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children =
			substituent.getChildElements();
		assertEquals(1, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("2", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("EZ", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemateEz3_unbracketted() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL, "2");
		substituent.addChild(locant);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "ez");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		assertEquals(1, children.size());
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("2", modifiedStereochemistryEl1.getAttributeValue(LOCANT_ATR));
		assertEquals("EZ", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));
	}

	@Test
	public void testRacemateEz4_unbracketted() throws ComponentGenerationException {
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element stereochem = new TokenEl(STEREOCHEMISTRY_EL, "EZ");
		stereochem.addAttribute(new Attribute(TYPE_ATR, E_OR_Z_TYPE_VAL));
		substituent.addChild(stereochem);
		processStereochemistry(substituent);

		List<Element> children = substituent.getChildElements();
		Element modifiedStereochemistryEl1 = children.get(0);
		assertEquals(STEREOCHEMISTRY_EL, modifiedStereochemistryEl1.getName());
		assertEquals("EZ", modifiedStereochemistryEl1.getAttributeValue(VALUE_ATR));
		assertEquals(E_OR_Z_TYPE_VAL, modifiedStereochemistryEl1.getAttributeValue(TYPE_ATR));
	}

	private void
	processStereochemistry(Element subOrRoot) throws ComponentGenerationException {
		new ComponentGenerator(new BuildState(new NameToStructureConfig())).processStereochemistry(subOrRoot);
	}
}

opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/ComponentProcessorTest.java

package uk.ac.cam.ch.wwmm.opsin;

import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*;

import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;

public class ComponentProcessorTest {

	@Test()
	public void testSubtractiveWithNoGroupToAttachTo() {
		assertThrows(ComponentGenerationException.class, () -> {
			Element word = new GroupingEl(WORD_EL);
			Element substituent = new GroupingEl(SUBSTITUENT_EL);
			word.addChild(substituent);
			Element substractivePrefix = new TokenEl(SUBTRACTIVEPREFIX_EL);
			substractivePrefix.addAttribute(new Attribute(TYPE_ATR, DEOXY_TYPE_VAL));
			substituent.addChild(substractivePrefix);
			ComponentProcessor.removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent);
		});
	}

	@Test
	public void testSubtractiveWithBiochemicalToAttachTo() throws ComponentGenerationException{
		Element word = new GroupingEl(WORD_EL);
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element substractivePrefix = new TokenEl(SUBTRACTIVEPREFIX_EL);
		substractivePrefix.addAttribute(new Attribute(TYPE_ATR, DEOXY_TYPE_VAL));
		substituent.addChild(substractivePrefix);
		word.addChild(substituent);
		Element root = new GroupingEl(ROOT_EL);
		word.addChild(root);
		Element group = new TokenEl(GROUP_EL);
		group.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		root.addChild(group);
		ComponentProcessor.removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent);
		assertEquals(null, substituent.getParent(), "Subtractive prefix should have been detached");
		assertEquals(2, root.getChildCount());
		assertEquals(substractivePrefix, root.getChildElements().get(0));
	}

	@Test
	public void testSubtractiveRightMostPreferred() throws ComponentGenerationException{
		Element word = new GroupingEl(WORD_EL);
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element substractivePrefix = new TokenEl(SUBTRACTIVEPREFIX_EL);
		substractivePrefix.addAttribute(new Attribute(TYPE_ATR, DEOXY_TYPE_VAL));
		substituent.addChild(substractivePrefix);
		word.addChild(substituent);
		Element substituent2 = new GroupingEl(SUBSTITUENT_EL);
		Element group1 = new TokenEl(GROUP_EL);
		group1.addAttribute(new Attribute(TYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL));
		group1.addAttribute(new Attribute(SUBTYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL));
		substituent2.addChild(group1);
		word.addChild(substituent2);
		Element root = new GroupingEl(ROOT_EL);
		word.addChild(root);
		Element group2 = new TokenEl(GROUP_EL);
		group2.addAttribute(new Attribute(TYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL));
		group2.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		root.addChild(group2);
		ComponentProcessor.removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent);
		assertEquals(null, substituent.getParent(), "Subtractive prefix should have been detached");
		assertEquals(2, root.getChildCount());
		assertEquals(substractivePrefix, root.getChildElements().get(0));
	}

	@Test
	public void testSubtractiveBiochemicalPreferredToRightMost() throws ComponentGenerationException{
		Element word = new GroupingEl(WORD_EL);
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element substractivePrefix = new TokenEl(SUBTRACTIVEPREFIX_EL);
		substractivePrefix.addAttribute(new Attribute(TYPE_ATR, DEOXY_TYPE_VAL));
		substituent.addChild(substractivePrefix);
		word.addChild(substituent);
		Element substituent2 = new GroupingEl(SUBSTITUENT_EL);
		Element group1 = new TokenEl(GROUP_EL);
		group1.addAttribute(new Attribute(SUBTYPE_ATR,
			BIOCHEMICAL_SUBTYPE_VAL));
		substituent2.addChild(group1);
		word.addChild(substituent2);
		Element root = new GroupingEl(ROOT_EL);
		word.addChild(root);
		Element group2 = new TokenEl(GROUP_EL);
		group2.addAttribute(new Attribute(SUBTYPE_ATR, SIMPLEGROUP_SUBTYPE_VAL));
		root.addChild(group2);
		ComponentProcessor.removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent);
		assertEquals(null, substituent.getParent(), "Subtractive prefix should have been detached");
		assertEquals(1, root.getChildCount());
		assertEquals(2, substituent2.getChildCount());
		assertEquals(substractivePrefix, substituent2.getChildElements().get(0));
	}

	@Test
	public void testSubtractiveWithMultiplierAndLocants() throws ComponentGenerationException{
		Element word = new GroupingEl(WORD_EL);
		Element substituent = new GroupingEl(SUBSTITUENT_EL);
		Element locant = new TokenEl(LOCANT_EL);
		substituent.addChild(locant);
		Element multiplier = new TokenEl(MULTIPLIER_EL);
		substituent.addChild(multiplier);
		Element substractivePrefix = new TokenEl(SUBTRACTIVEPREFIX_EL);
		substractivePrefix.addAttribute(new Attribute(TYPE_ATR, DEOXY_TYPE_VAL));
		substituent.addChild(substractivePrefix);
		word.addChild(substituent);
		Element root = new GroupingEl(ROOT_EL);
		word.addChild(root);
		Element group = new TokenEl(GROUP_EL);
		group.addAttribute(new Attribute(SUBTYPE_ATR, BIOCHEMICAL_SUBTYPE_VAL));
		root.addChild(group);
		ComponentProcessor.removeAndMoveToAppropriateGroupIfSubtractivePrefix(substituent);
		assertEquals(null, substituent.getParent(), "Subtractive prefix should have been detached");
		assertEquals(4, root.getChildCount());
		assertEquals(locant, root.getChildElements().get(0));
		assertEquals(multiplier, root.getChildElements().get(1));
		assertEquals(substractivePrefix, root.getChildElements().get(2));
	}

	@Test
	public void testDLStereochemistryLOnAminoAcid() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f =
			state.fragManager.buildSMILES("N[C@@H](C)C");
		Element aminoAcidEl = new TokenEl(GROUP_EL);
		aminoAcidEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		assertEquals(true, processor.applyDlStereochemistryToAminoAcid(aminoAcidEl, "l"));
		assertEquals(parityBefore, f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDLStereochemistryDOnAminoAcid() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element aminoAcidEl = new TokenEl(GROUP_EL);
		aminoAcidEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		assertEquals(true, processor.applyDlStereochemistryToAminoAcid(aminoAcidEl, "d"));
		assertEquals(parityBefore, -f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDLStereochemistryDLOnAminoAcid() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element aminoAcidEl = new TokenEl(GROUP_EL);
		aminoAcidEl.setFrag(f);
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		assertTrue(processor.applyDlStereochemistryToAminoAcid(aminoAcidEl, "dl"));
		assertNotNull(f.getAtomByID(2).getAtomParity());
		assertEquals(StereoGroupType.Rac, f.getAtomByID(2).getStereoGroup().getType());
	}

	@Test
	public void testDLStereochemistryDOnAchiralAminoAcid() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("NC(C)C");
		Element aminoAcidEl = new TokenEl(GROUP_EL);
		aminoAcidEl.setFrag(f);
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		assertEquals(false, processor.applyDlStereochemistryToAminoAcid(aminoAcidEl, "d"));
	}

	@Test
	public void testDLStereochemistryLOnCarbohydrate() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element carbohydrateEl = new TokenEl(GROUP_EL);
		carbohydrateEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		processor.applyDlStereochemistryToCarbohydrate(carbohydrateEl, "l");
		assertEquals(parityBefore, -f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDLStereochemistryDOnCarbohydrate() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element carbohydrateEl = new TokenEl(GROUP_EL);
		carbohydrateEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		processor.applyDlStereochemistryToCarbohydrate(carbohydrateEl, "d");
		assertEquals(parityBefore, f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDLStereochemistryInvertedNaturalOnCarbohydrate1() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element carbohydrateEl = new TokenEl(GROUP_EL);
		carbohydrateEl.addAttribute(new Attribute(NATURALENTISOPPOSITE_ATR, "yes"));
		carbohydrateEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		processor.applyDlStereochemistryToCarbohydrate(carbohydrateEl, "l");
		assertEquals(parityBefore, f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDLStereochemistryInvertedNaturalOnCarbohydrate2() throws ComponentGenerationException, StructureBuildingException{
		BuildState state = new BuildState(mock(NameToStructureConfig.class));
		Fragment f = state.fragManager.buildSMILES("N[C@@H](C)C");
		Element carbohydrateEl = new TokenEl(GROUP_EL);
		carbohydrateEl.addAttribute(new Attribute(NATURALENTISOPPOSITE_ATR, "yes"));
		carbohydrateEl.setFrag(f);
		int parityBefore = f.getAtomByID(2).getAtomParity().getParity();
		ComponentProcessor processor = new ComponentProcessor(state, mock(SuffixApplier.class));
		processor.applyDlStereochemistryToCarbohydrate(carbohydrateEl, "d");
		assertEquals(parityBefore, -f.getAtomByID(2).getAtomParity().getParity());
	}

	@Test
	public void testDStereochemistryDOnCarbohydratePrefix() throws ComponentGenerationException, StructureBuildingException{
		Element prefix = new TokenEl(STEREOCHEMISTRY_EL);
		prefix.addAttribute(new Attribute(TYPE_ATR, CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL));
		prefix.addAttribute(new Attribute(VALUE_ATR, "l/r"));//D-threo
		ComponentProcessor.applyDlStereochemistryToCarbohydrateConfigurationalPrefix(prefix, "d");
		assertEquals("l/r", prefix.getAttributeValue(VALUE_ATR));
	}

	@Test
	public void testLStereochemistryDOnCarbohydratePrefix() throws ComponentGenerationException, StructureBuildingException{
		Element prefix = new TokenEl(STEREOCHEMISTRY_EL);
		prefix.addAttribute(new Attribute(TYPE_ATR, CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL));
		prefix.addAttribute(new Attribute(VALUE_ATR, "r/l"));
		ComponentProcessor.applyDlStereochemistryToCarbohydrateConfigurationalPrefix(prefix, "l");
		assertEquals("l/r", prefix.getAttributeValue(VALUE_ATR));
	}

	@Test
	public void testDLStereochemistryDOnCarbohydratePrefix() throws ComponentGenerationException, StructureBuildingException{
		Element prefix = new TokenEl(STEREOCHEMISTRY_EL);
		prefix.addAttribute(new Attribute(TYPE_ATR, CARBOHYDRATECONFIGURATIONPREFIX_TYPE_VAL));
		prefix.addAttribute(new Attribute(VALUE_ATR, "l/r"));
		ComponentProcessor.applyDlStereochemistryToCarbohydrateConfigurationalPrefix(prefix, "dl");
		assertEquals("?/?", prefix.getAttributeValue(VALUE_ATR));
	}
}

opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/CycleDetectorTest.java

package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.junit.jupiter.api.Test;

//Cycle detection is performed as part of fragment creation so we can just check the output of fragment creation
public class CycleDetectorTest {

	private SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager());

	@Test
	public void testAssignCyclic1() throws StructureBuildingException {
		Fragment frag = sBuilder.build("CCCC");
		for (Atom a : frag) {
			assertEquals(false, a.getAtomIsInACycle(), "Should be acyclic");
		}
	}

	@Test
	public void testAssignCyclic2() throws StructureBuildingException {
		Fragment frag = sBuilder.build("c1ccccc1");
		for (Atom a : frag) {
			assertEquals(true, a.getAtomIsInACycle(), "Should be cyclic");
		}
	}

	@Test
	public void testAssignCyclic3() throws StructureBuildingException {
		Fragment frag = sBuilder.build("c12.c23.c34.c45.c56.c61");
		for (Atom a : frag) {
			assertEquals(true, a.getAtomIsInACycle(), "Should be cyclic");
		}
	}

	@Test
	public void testAssignCyclic4() throws StructureBuildingException {
		Fragment frag = sBuilder.build("c1ccccc1CCc1ccccc1");
		List<Atom> atomList = frag.getAtomList();
		for (int i = 0; i < atomList.size(); i++) {
			Atom a = atomList.get(i);
			if (i<=5 || i >=8){
				assertEquals(true, a.getAtomIsInACycle(), "Should be cyclic");
			}
			else{
				assertEquals(false, a.getAtomIsInACycle(), "Should be acyclic");
			}
		}
	}

	@Test
	public void testAssignCyclic5() throws StructureBuildingException {
		Fragment
frag = sBuilder.build("CCc1ccc(O)cc1"); List<Atom> atomList = frag.getAtomList(); for (int i = 0; i < atomList.size(); i++) { Atom a = atomList.get(i); if (i<=1 || i==6){ assertEquals(false, a.getAtomIsInACycle(), "Should be acyclic"); } else{ assertEquals(true, a.getAtomIsInACycle(), "Should be cyclic"); } } } @Test public void testAssignCyclic6() throws StructureBuildingException { Fragment frag = sBuilder.build("CC1CC(O1)C"); List<Atom> atomList = frag.getAtomList(); for (int i = 0; i < atomList.size(); i++) { Atom a = atomList.get(i); if (i==0 || i==5){ assertEquals(false, a.getAtomIsInACycle(), "Should be acyclic"); } else{ assertEquals(true, a.getAtomIsInACycle(), "Should be cyclic"); } } } @Test public void testFindPathBetweenAtoms1() throws StructureBuildingException { Fragment frag = sBuilder.build("c1ccccc1"); List<Atom> atomList = frag.getAtomList(); List<List<Atom>> paths = CycleDetector.getPathBetweenAtomsUsingBonds(atomList.get(0), atomList.get(3), frag.getBondSet()); assertEquals(2, paths.size()); for (List<Atom> path : paths) { assertEquals(2, path.size()); } for (List<Atom> path : paths) { if (atomList.indexOf(path.get(0))==1){ assertEquals(2, atomList.indexOf(path.get(1))); } else{ assertEquals(5, atomList.indexOf(path.get(0))); assertEquals(4, atomList.indexOf(path.get(1))); } } } @Test public void testFindPathBetweenAtoms2() throws StructureBuildingException { Fragment frag = sBuilder.build("C1CCCC2CCCCC12"); List<Atom> atomList = frag.getAtomList(); Set<Bond> bonds = new HashSet<Bond>(frag.getBondSet()); bonds.remove(atomList.get(4).getBondToAtom(atomList.get(9))); List<List<Atom>> paths = CycleDetector.getPathBetweenAtomsUsingBonds(atomList.get(4), atomList.get(9), bonds); assertEquals(2, paths.size()); List<Atom> pathLeftRing; List<Atom> pathRightRing; if (atomList.indexOf(paths.get(0).get(0))==3){ pathLeftRing = paths.get(0); pathRightRing = paths.get(1); } else{ pathLeftRing = paths.get(1); pathRightRing = paths.get(0); } assertEquals(3, atomList.indexOf(pathLeftRing.get(0))); assertEquals(2, 
atomList.indexOf(pathLeftRing.get(1))); assertEquals(1, atomList.indexOf(pathLeftRing.get(2))); assertEquals(0, atomList.indexOf(pathLeftRing.get(3))); assertEquals(5, atomList.indexOf(pathRightRing.get(0))); assertEquals(6, atomList.indexOf(pathRightRing.get(1))); assertEquals(7, atomList.indexOf(pathRightRing.get(2))); assertEquals(8, atomList.indexOf(pathRightRing.get(3))); } @Test public void testFindPathBetweenAtoms3() throws StructureBuildingException { Fragment frag = sBuilder.build("C1(C)CCCC2C(C)CCCC12"); List<Atom> atomList = frag.getAtomList(); Set<Bond> bonds = new HashSet<Bond>(frag.getBondSet()); bonds.remove(atomList.get(0).getBondToAtom(atomList.get(1))); bonds.remove(atomList.get(6).getBondToAtom(atomList.get(7))); bonds.remove(atomList.get(5).getBondToAtom(atomList.get(11))); List<List<Atom>> paths = CycleDetector.getPathBetweenAtomsUsingBonds(atomList.get(0), atomList.get(6), bonds); assertEquals(2, paths.size()); List<Atom> pathLeftRing; List<Atom> pathRightRing; if (atomList.indexOf(paths.get(0).get(0))==2){ pathLeftRing = paths.get(0); pathRightRing = paths.get(1); } else{ pathLeftRing = paths.get(1); pathRightRing = paths.get(0); } assertEquals(2, atomList.indexOf(pathLeftRing.get(0))); assertEquals(3, atomList.indexOf(pathLeftRing.get(1))); assertEquals(4, atomList.indexOf(pathLeftRing.get(2))); assertEquals(5, atomList.indexOf(pathLeftRing.get(3))); assertEquals(11, atomList.indexOf(pathRightRing.get(0))); assertEquals(10, atomList.indexOf(pathRightRing.get(1))); assertEquals(9, atomList.indexOf(pathRightRing.get(2))); assertEquals(8, atomList.indexOf(pathRightRing.get(3))); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/DtdTest.java000066400000000000000000000121271451751637500261300ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertTrue; import java.net.URI; import java.net.URISyntaxException; import java.net.URL; import java.util.HashSet; import java.util.Set; import 
javax.xml.parsers.DocumentBuilder; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.stream.XMLStreamConstants; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamReader; import org.junit.jupiter.api.Test; import org.xml.sax.ErrorHandler; import org.xml.sax.SAXException; import org.xml.sax.SAXParseException; public class DtdTest { private final static String RESOURCE_LOCATION = "uk/ac/cam/ch/wwmm/opsin/resources/"; private final ResourceGetter resourceGetter = new ResourceGetter(RESOURCE_LOCATION); @Test public void testTokenFiles() throws Exception { XMLStreamReader reader = resourceGetter.getXMLStreamReader("index.xml"); while (reader.hasNext()) { if (reader.next() == XMLStreamConstants.START_ELEMENT && reader.getLocalName().equals("tokenFile")) { validate(getUriForFile(reader.getElementText())); } } reader.close(); } @Test public void testRegexes() throws Exception { validate(getUriForFile("regexes.xml")); } @Test public void testRegexTokens() throws Exception { validate(getUriForFile("regexTokens.xml")); } @Test public void testSuffixApplicability() throws Exception { validate(getUriForFile("suffixApplicability.xml")); } @Test public void testSuffixRules() throws Exception { validate(getUriForFile("suffixRules.xml")); } @Test public void testWordRules() throws Exception { validate(getUriForFile("wordRules.xml")); } @Test public void testTokenFilesValueValidity() throws Exception { XMLStreamReader indexReader = resourceGetter.getXMLStreamReader("index.xml"); while (indexReader.hasNext()) { if (indexReader.next() == XMLStreamConstants.START_ELEMENT && indexReader.getLocalName().equals("tokenFile")) { XMLStreamReader tokenReader = resourceGetter.getXMLStreamReader(indexReader.getElementText()); while (tokenReader.hasNext()) { if (tokenReader.next() == XMLStreamConstants.START_ELEMENT) { String tagName = tokenReader.getLocalName(); if (tagName.equals("tokenLists")) { while (tokenReader.hasNext()) { switch 
(tokenReader.next()) { case XMLStreamConstants.START_ELEMENT: if (tokenReader.getLocalName().equals("tokenList")) { validateTokenList(tokenReader); } break; } } } else if (tagName.equals("tokenList")) { validateTokenList(tokenReader); } } } } } indexReader.close(); } private void validateTokenList(XMLStreamReader reader) throws XMLStreamException { Set terms = new HashSet(); while (reader.hasNext()) { switch (reader.next()) { case XMLStreamConstants.START_ELEMENT: if (reader.getLocalName().equals("token")) { String tokenString = reader.getElementText(); assertTrue(!terms.contains(tokenString), tokenString +" occurred more than once in a tokenList"); terms.add(tokenString); char[] characters = tokenString.toCharArray(); for (char c : characters) { assertTrue((int)c < 128, "Non ascii character found in token: " + tokenString + OpsinTools.NEWLINE + "An ASCII replacement should be used!"); assertTrue(!(c >='A' && c <='Z'), "Capital letter found in token: " + tokenString + OpsinTools.NEWLINE + "Only lower case letters should be used!"); } } break; case XMLStreamConstants.END_ELEMENT: if (reader.getLocalName().equals("tokenList")) { return; } break; } } } public static void validate(URI uri) throws Exception { System.out.println("Validating:"+ uri); DocumentBuilderFactory f = DocumentBuilderFactory.newInstance(); f.setValidating(true); DocumentBuilder b = f.newDocumentBuilder(); MyErrorHandler h = new MyErrorHandler(); b.setErrorHandler(h); try { b.parse(uri.toString()); } catch (SAXException e) { if (h.error != null) { System.out.println(h.error); AssertionError ae = new AssertionError("XML Validation error: "+uri.toString()); ae.initCause(h.error); throw ae; } } } static class MyErrorHandler implements ErrorHandler { private SAXParseException error; public void error(SAXParseException exception) throws SAXException { this.error = exception; throw new SAXException("Error"); } public void fatalError(SAXParseException exception) throws SAXException { this.error = 
exception; throw new SAXException("Error"); } public void warning(SAXParseException exception) throws SAXException { this.error = exception; throw new SAXException("Error"); } } private URI getUriForFile (String fileName) throws URISyntaxException { ClassLoader l = getClass().getClassLoader(); URL url = l.getResource(RESOURCE_LOCATION + fileName); if (url ==null) {return null;} return url.toURI(); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/FragmentManagerTest.java000066400000000000000000000044621451751637500304560ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNull; import java.io.IOException; import java.util.ArrayList; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; public class FragmentManagerTest { FragmentManager fragManager; @BeforeEach public void setUp() throws IOException{ IDManager idManager = new IDManager(); fragManager = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); } @Test public void testGetUnifiedFrags() throws StructureBuildingException { Fragment frag1 = fragManager.buildSMILES("CC"); Fragment frag2 = fragManager.buildSMILES("CNC"); fragManager.createBond(frag1.getFirstAtom(), frag2.getFirstAtom(), 1); Fragment frag = fragManager.getUnifiedFragment(); assertEquals(5, frag.getAtomCount(), "Frag has five atoms"); assertEquals(4, frag.getBondSet().size(), "Frag has four bonds"); } @Test public void testRelabelFusedRingSystem() throws StructureBuildingException { Fragment naphthalene = fragManager.buildSMILES("C1=CC=CC2=CC=CC=C12"); FragmentTools.relabelLocantsAsFusedRingSystem(naphthalene.getAtomList()); assertEquals(1, naphthalene.getIDFromLocant("1"), "Locant 1 = atom 1"); assertEquals(5, naphthalene.getIDFromLocant("4a"), "Locant 4a = atom 5"); assertEquals(9, 
naphthalene.getIDFromLocant("8"), "Locant 8 = atom 9"); assertEquals(10, naphthalene.getIDFromLocant("8a"), "Locant 8a = atom 10"); assertEquals(0, naphthalene.getIDFromLocant("9"), "No locant 9"); } @Test public void testCloneFragment() throws StructureBuildingException { Fragment urea = fragManager.buildSMILES("NC(=O)N"); FragmentTools.assignElementLocants(urea, new ArrayList()); assertNotNull(urea.getAtomByLocant("N")); assertNotNull(urea.getAtomByLocant("N'")); assertNull(urea.getAtomByLocant("N''")); assertNull(urea.getAtomByLocant("N'''")); Fragment primedCopy = fragManager.copyAndRelabelFragment(urea, 1); assertEquals(4, primedCopy.getAtomCount()); assertNull(primedCopy.getAtomByLocant("N")); assertNull(primedCopy.getAtomByLocant("N'")); assertNotNull(primedCopy.getAtomByLocant("N''")); assertNotNull(primedCopy.getAtomByLocant("N'''")); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/FragmentTest.java000066400000000000000000000500031451751637500271530ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNull; import static org.junit.jupiter.api.Assertions.assertTrue; import static org.junit.jupiter.api.Assertions.fail; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; public class FragmentTest { private Fragment frag; private FragmentManager fm; @BeforeEach public void setUp(){ IDManager idManager = new IDManager(); fm = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); try { frag = fm.buildSMILES(""); } catch (StructureBuildingException e) { throw new RuntimeException(e); } } @Test public void testFragment() { assertNotNull(frag.getAtomList(), "Has 
atom list"); } @Test public void testAddAtom() { assertEquals(0, frag.getAtomCount(), "Has no atoms"); frag.addAtom(new Atom(1, ChemEl.C, frag)); assertEquals(1, frag.getAtomCount(), "Now has one atom"); } @Test public void testAddBond() { frag.addAtom(new Atom(1, ChemEl.C, frag)); frag.addAtom(new Atom(2, ChemEl.C, frag)); assertEquals(0, frag.getBondSet().size(), "Has no bonds"); fm.createBond(frag.getAtomByID(1), frag.getAtomByID(2), 1); assertEquals(1, frag.getBondSet().size(), "Now has one bond"); } @Test public void testImportFrag() throws StructureBuildingException { Fragment frag1 = fm.buildSMILES("CC"); Fragment frag2 = fm.buildSMILES("CC"); assertEquals(2, frag1.getAtomCount(), "Fragment has two atoms"); assertEquals(1, frag1.getBondSet().size(), "Fragment has one bond"); fm.incorporateFragment(frag2, frag1); assertEquals(4, frag1.getAtomCount(), "Fragment now has four atoms"); assertEquals(2, frag1.getBondSet().size(), "Fragment now has two bonds"); } @Test public void testImportFragWithIntraFragBonds1() throws StructureBuildingException { Fragment frag1 = fm.buildSMILES("C"); Fragment frag2 = fm.buildSMILES("C"); fm.createBond(frag1.getFirstAtom(), frag2.getFirstAtom(), 1); assertEquals(0, frag1.getBondSet().size()); assertEquals(0, frag2.getBondSet().size()); assertEquals(1, fm.getInterFragmentBonds(frag1).size()); assertEquals(1, fm.getInterFragmentBonds(frag2).size()); fm.incorporateFragment(frag2, frag1); assertEquals(1, frag1.getBondSet().size()); assertEquals(0, frag2.getBondSet().size()); assertEquals(0, fm.getInterFragmentBonds(frag1).size()); } @Test public void testImportFragWithIntraFragBonds2() throws StructureBuildingException { Fragment frag1 = fm.buildSMILES("C"); Fragment frag2 = fm.buildSMILES("C"); Fragment frag3 = fm.buildSMILES("C"); fm.createBond(frag2.getFirstAtom(), frag3.getFirstAtom(), 1); assertEquals(0, frag1.getBondSet().size()); assertEquals(0, frag2.getBondSet().size()); assertEquals(0, frag3.getBondSet().size()); 
assertEquals(0, fm.getInterFragmentBonds(frag1).size()); assertEquals(1, fm.getInterFragmentBonds(frag2).size()); assertEquals(1, fm.getInterFragmentBonds(frag3).size()); fm.incorporateFragment(frag2, frag1); assertEquals(0, frag1.getBondSet().size()); assertEquals(0, frag2.getBondSet().size()); assertEquals(0, frag3.getBondSet().size()); assertEquals(1, fm.getInterFragmentBonds(frag1).size()); assertEquals(1, fm.getInterFragmentBonds(frag3).size()); } @Test public void testGetIDFromLocant() { Atom atom = new Atom(10, ChemEl.C, frag); atom.addLocant("a"); frag.addAtom(atom); atom = new Atom(20, ChemEl.C, frag); atom.addLocant("silly"); frag.addAtom(atom); assertEquals(10, frag.getIDFromLocant("a"), "Locant a has ID 10"); assertEquals(20, frag.getIDFromLocant("silly"), "Locant silly has ID 20"); assertEquals(0, frag.getIDFromLocant("42"), "Locant 42 is not present"); } @Test public void testGetAtomByLocant() { Atom atom1 = new Atom(10, ChemEl.C, frag); atom1.addLocant("a"); frag.addAtom(atom1); Atom atom2 = new Atom(20, ChemEl.C, frag); atom2.addLocant("silly"); frag.addAtom(atom2); assertEquals(atom1, frag.getAtomByLocant("a"), "Locant a gets atom1"); assertEquals(atom2, frag.getAtomByLocant("silly"), "Locant silly gets atom2"); assertNull(frag.getAtomByLocant("42"), "Locant 42 is not present"); } @Test public void testGetAtomByID() { Atom atom1 = new Atom(10, ChemEl.C, frag); frag.addAtom(atom1); Atom atom2 = new Atom(20, ChemEl.C, frag); frag.addAtom(atom2); assertEquals(atom1, frag.getAtomByID(10), "ID 10 gets atom1"); assertEquals(atom2, frag.getAtomByID(20), "ID 20 gets atom2"); assertNull(frag.getAtomByID(42), "ID 42 is not present"); } @Test public void testFindBond() { frag.addAtom(new Atom(1, ChemEl.C, frag)); frag.addAtom(new Atom(2, ChemEl.C, frag)); frag.addAtom(new Atom(3, ChemEl.N, frag)); frag.addAtom(new Atom(4, ChemEl.O, frag)); fm.createBond(frag.getAtomByID(2), frag.getAtomByID(4), 2); fm.createBond(frag.getAtomByID(1), frag.getAtomByID(2), 1); 
fm.createBond(frag.getAtomByID(1), frag.getAtomByID(3), 3); Bond b = frag.findBond(2, 4); assertNotNull(b, "Found a bond"); assertEquals(2, b.getOrder(), "..a double bond"); b = frag.findBond(3, 1); assertNotNull(b, "Found a different bond"); assertEquals(3, b.getOrder(), "..a triple bond"); b = frag.findBond(2, 3); assertNull(b, "Don't find non-existent bonds"); } @Test public void testGetChainLength() { assertEquals(0, frag.getChainLength(), "No chain"); Atom a1 =new Atom(1, ChemEl.C, frag); a1.addLocant("1"); frag.addAtom(a1); assertEquals(1, frag.getChainLength(), "Methane"); Atom a2 =new Atom(2, ChemEl.C, frag); a2.addLocant("2"); frag.addAtom(a2); fm.createBond(frag.getAtomByID(1), frag.getAtomByID(2), 1); assertEquals(2, frag.getChainLength(), "ethane"); Atom a3 =new Atom(3, ChemEl.C, frag); a3.addLocant("3"); frag.addAtom(a3); fm.createBond(frag.getAtomByID(2), frag.getAtomByID(3), 1); assertEquals(3, frag.getChainLength(), "propane"); Atom a4 =new Atom(4, ChemEl.C, frag); frag.addAtom(a4); a4.addLocant("4"); fm.createBond(frag.getAtomByID(2), frag.getAtomByID(4), 1); assertEquals(3, frag.getChainLength(), "isobutane"); fm.removeBond(a2.getBondToAtom(a4)); fm.createBond(a3, a4, 1); assertEquals(4, frag.getChainLength(), "butane"); } @Test public void testRelabelSuffixLocants() throws StructureBuildingException { frag = fm.buildSMILES("C(N)N"); assertEquals(0, frag.getIDFromLocant("N"), "Can't find locant N in frag"); assertEquals(0, frag.getIDFromLocant("N'"), "Can't find locant N' in frag"); FragmentTools.assignElementLocants(frag, new ArrayList()); int idN = frag.getIDFromLocant("N"); int idNprime = frag.getIDFromLocant("N'"); if ((idN==2 && idNprime==3) || idN==3 && idNprime==2){ } else{ fail("Locants misassigned"); } } @Test public void testLabelCarbamimidamido() throws StructureBuildingException { frag = fm.buildSMILES("C(N)(=N)N-", NONCARBOXYLICACID_TYPE_VAL, NONE_LABELS_VAL); FragmentTools.assignElementLocants(frag, new ArrayList()); assertEquals(4, 
frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 4"); assertEquals(2, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 2"); assertEquals(3, frag.getIDFromLocant("N''"), "Can find locant N'' in frag: ID = 3"); } @Test public void testLabelHydrazonoHydrazide() throws StructureBuildingException { frag = fm.buildSMILES("C(=NN)NN" , NONCARBOXYLICACID_TYPE_VAL, NONE_LABELS_VAL); FragmentTools.assignElementLocants(frag, new ArrayList<Fragment>()); assertEquals(4, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 4"); assertEquals(5, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 5"); assertEquals(2, frag.getIDFromLocant("N''"), "Can find locant N'' in frag: ID = 2"); assertEquals(3, frag.getIDFromLocant("N'''"), "Can find locant N''' in frag: ID = 3"); } @Test public void testLabelCarbonimidoyl() throws StructureBuildingException { frag = fm.buildSMILES("C(=N)" , ACIDSTEM_TYPE_VAL, NONE_LABELS_VAL); frag.addOutAtom(frag.getFirstAtom(), 1, true); frag.addOutAtom(frag.getFirstAtom(), 1, true); FragmentTools.assignElementLocants(frag, new ArrayList<Fragment>()); assertEquals(2, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 2"); assertEquals(1, frag.getIDFromLocant("C"), "Can find locant C in frag: ID = 1"); } @Test public void testLabelHydrazonicAmide() throws StructureBuildingException { frag = fm.buildSMILES("C", ACIDSTEM_TYPE_VAL, NONE_LABELS_VAL); Fragment suffixfrag = fm.buildSMILES("[R](N)=NN", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); List<Fragment> suffixes = new ArrayList<Fragment>(); suffixes.add(suffixfrag); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffixfrag, frag); assertEquals(3, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 3"); assertEquals(5, frag.getIDFromLocant("N''"), "Can find locant N'' in frag: ID = 5"); //DEVIATION From systematic behaviour assertEquals(5, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 5"); } @Test public void testLabelHydrazonate() throws 
StructureBuildingException { frag = fm.buildSMILES("C", ACIDSTEM_TYPE_VAL, NONE_LABELS_VAL); Fragment suffixfrag = fm.buildSMILES("[R]([O-])=NN", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); List suffixes = new ArrayList(); suffixes.add(suffixfrag); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffixfrag, frag); assertEquals(5, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 5"); //DEVIATION From systematic behaviour assertEquals(5, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 5"); } @Test public void testLabelHexanDiamide() throws StructureBuildingException { frag = fm.buildSMILES("CCCCCC", CHAIN_TYPE_VAL, "1/2/3/4/5/6"); Fragment suffixfrag1 = fm.buildSMILES("[R]N", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); Fragment suffixfrag2 = fm.buildSMILES("[R]N", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); List suffixes = new ArrayList(); suffixes.add(suffixfrag1); suffixes.add(suffixfrag2); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffixfrag1, frag); fm.incorporateFragment(suffixfrag2, frag); assertEquals(8, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 8"); assertEquals(10, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 10"); } @Test public void testLabelDiimidooxalicDiamide() throws StructureBuildingException { frag = fm.buildSMILES("CC", ACIDSTEM_TYPE_VAL, "1/2"); Fragment suffixfrag1 = fm.buildSMILES("[R](N)=N", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); Fragment suffixfrag2 = fm.buildSMILES("[R](N)=N", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); List suffixes = new ArrayList(); suffixes.add(suffixfrag1); suffixes.add(suffixfrag2); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffixfrag1, frag); fm.incorporateFragment(suffixfrag2, frag); assertEquals(4, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 4"); assertEquals(7, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 7"); assertEquals(5, frag.getIDFromLocant("N''"), "Can find locant 
N'' in frag: ID = 5"); assertEquals(8, frag.getIDFromLocant("N'''"), "Can find locant N''' in frag: ID = 8"); } @Test public void testLabelHydrazinecarbohydrazide() throws StructureBuildingException { frag = fm.buildSMILES("NN", SIMPLEGROUP_TYPE_VAL, "1/2"); Fragment suffix = fm.buildSMILES("[R]C(=O)NN", SUFFIX_TYPE_VAL, "/X///"); List suffixes = new ArrayList(); suffixes.add(suffix); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffix, frag); assertEquals(6, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 6"); assertEquals(7, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 7"); assertEquals(0, frag.getIDFromLocant("N''"), "Can't find locant N'' in frag"); assertEquals(0, frag.getIDFromLocant("C"), "Can't find locant C in frag"); } @Test public void testLabelCarbonicDihydrazide() throws StructureBuildingException { frag = fm.buildSMILES("C(=O)(NN)NN", NONCARBOXYLICACID_TYPE_VAL, NONE_LABELS_VAL); FragmentTools.assignElementLocants(frag, new ArrayList()); int idN = frag.getIDFromLocant("N"); int idNprime = frag.getIDFromLocant("N'"); int idNprime2 = frag.getIDFromLocant("N''"); int idNprime3 = frag.getIDFromLocant("N'''"); if ((idN==3 && idNprime==4 && idNprime2==5 && idNprime3==6) || (idN==5 && idNprime==6 && idNprime2==3 && idNprime3==4)){ } else{ fail("Locants misassigned"); } assertEquals(0, frag.getIDFromLocant("C"), "Can't find locant C in frag"); } @Test public void testLabelSulfonoThioate() throws StructureBuildingException { frag = fm.buildSMILES("C"); Fragment suffix = fm.buildSMILES("[R]S(=O)(=O)S", SUFFIX_TYPE_VAL, "/X///"); List suffixes = new ArrayList(); suffixes.add(suffix); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffix, frag); assertEquals(6, frag.getIDFromLocant("S"), "Can find locant S in frag: ID = 6"); assertEquals(0, frag.getIDFromLocant("S'"), "Can't find locant S' in frag"); int idO = frag.getIDFromLocant("O"); int idOprime = 
frag.getIDFromLocant("O'"); if ((idO==4 && idOprime==5)|| (idO==5 && idOprime==4)){ } else{ fail("Locants misassigned"); } } @Test public void testLabelAcetoanilide() throws StructureBuildingException { frag = fm.buildSMILES("CC"); Fragment suffix = fm.buildSMILES("[*](=O)Nc1ccccc1", SUFFIX_TYPE_VAL, "///1'/2'/3'/4'/5'/6'"); List suffixes = new ArrayList(); suffixes.add(suffix); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffix, frag); assertEquals(5, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 5"); } @Test public void testLabelPyridine() throws StructureBuildingException { frag = fm.buildSMILES("n1ccccc1", RING_TYPE_VAL, "1/2/3/4/5/6"); FragmentTools.assignElementLocants(frag, new ArrayList()); assertEquals(1, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 1"); assertEquals(0, frag.getIDFromLocant("C"), "Can't find locant C in frag"); } @Test public void testLabelPiperazine() throws StructureBuildingException { frag = fm.buildSMILES("N1CCNCC1", RING_TYPE_VAL, "1/2/3/4/5/6"); FragmentTools.assignElementLocants(frag, new ArrayList()); int idN = frag.getIDFromLocant("N"); int idNprime = frag.getIDFromLocant("N'"); if ((idN==1 && idNprime==4) || (idN==4 && idNprime==1)){ } else{ fail("Locants misassigned"); } assertEquals(0, frag.getIDFromLocant("C"), "Can't find locant C in frag"); } @Test public void testLabelCarboximidohydrazide() throws StructureBuildingException { frag = fm.buildSMILES("c1ccccc1"); Fragment suffix = fm.buildSMILES("[R]C(=N)NN", SUFFIX_TYPE_VAL, "/X//1'/2'"); List suffixes = new ArrayList(); suffixes.add(suffix); FragmentTools.assignElementLocants(frag, suffixes); fm.incorporateFragment(suffix, frag); assertEquals(10, frag.getIDFromLocant("N"), "Can find locant N in frag: ID = 10"); assertEquals(11, frag.getIDFromLocant("N'"), "Can find locant N' in frag: ID = 11"); assertEquals(9, frag.getIDFromLocant("N''"), "Can find locant N'' in frag: ID = 9"); } @Test public void 
testIndicatedHydrogen() throws StructureBuildingException { Fragment pyrrole = fm.buildSMILES("[nH]1cccc1"); assertEquals(1, pyrrole.getIndicatedHydrogen().size(), "Pyrrole has 1 indicated hydrogen"); assertEquals(pyrrole.getFirstAtom(), pyrrole.getIndicatedHydrogen().get(0), "..and the indicated hydrogen is on the nitrogen"); } @Test public void testSpareValenciesOnAromaticAtoms() throws StructureBuildingException{ Fragment naphthalene = fm.buildSMILES("c1cccc2ccccc12"); for(Atom a : naphthalene) { assertEquals(true, a.hasSpareValency(), "All atoms have sv"); } for(Bond b : naphthalene.getBondSet()) { assertEquals(1, b.getOrder(), "All bonds are of order 1"); } } @Test public void testConvertSpareValenciesToDoubleBonds() throws StructureBuildingException{ Fragment dhp = fm.buildSMILES("c1cCccC1"); FragmentTools.convertSpareValenciesToDoubleBonds(dhp); for(Atom a : dhp) { assertEquals(false, a.hasSpareValency(), "All atoms have no sv"); } Fragment funnydiene = fm.buildSMILES("C(=C)C=C"); FragmentTools.convertSpareValenciesToDoubleBonds(funnydiene); for(Atom a : funnydiene) { assertEquals(false, a.hasSpareValency(), "All atoms have no sv"); } Fragment naphthalene = fm.buildSMILES("c1cccc2ccccc12"); FragmentTools.convertSpareValenciesToDoubleBonds(naphthalene); for(Atom a : naphthalene) { assertEquals(false, a.hasSpareValency(), "All atoms have no sv"); } Fragment pentalene = fm.buildSMILES("c12c(ccc1)ccc2"); for(Atom a : pentalene) { assertEquals(true, a.hasSpareValency(), "All atoms have sv"); } FragmentTools.convertSpareValenciesToDoubleBonds(pentalene); for(Atom a : pentalene) { assertEquals(false, a.hasSpareValency(), "All atoms have no sv"); } } @Test public void testGetAtomNeighbours() throws StructureBuildingException { Fragment naphthalene = fm.buildSMILES("C1=CC=CC2=CC=CC=C12"); assertEquals(2, naphthalene.getIntraFragmentAtomNeighbours(naphthalene.getAtomByID(1)).size(), "Atom 1 has two neighbours"); assertEquals(3, 
naphthalene.getIntraFragmentAtomNeighbours(naphthalene.getAtomByID(5)).size(), "Atom 5 has three neighbours"); } @Test public void testIsCharacteristicAtomSuffix() throws StructureBuildingException{ Fragment parent = fm.buildSMILES("CC"); Fragment suffix = fm.buildSMILES("N", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); fm.incorporateFragment(suffix, suffix.getFirstAtom(), parent, parent.getFirstAtom(), 1); List<Atom> parentAtoms = parent.getAtomList(); assertFalse(FragmentTools.isCharacteristicAtom(parentAtoms.get(0))); assertFalse(FragmentTools.isCharacteristicAtom(parentAtoms.get(1))); assertTrue(FragmentTools.isCharacteristicAtom(parentAtoms.get(2))); } @Test public void testIsCharacteristicAtomAldehyde() throws StructureBuildingException{ Fragment parent = fm.buildSMILES("CC"); Fragment suffix = fm.buildSMILES("O", SUFFIX_TYPE_VAL, NONE_LABELS_VAL); fm.incorporateFragment(suffix, suffix.getFirstAtom(), parent, parent.getFirstAtom(), 2); List<Atom> parentAtoms = parent.getAtomList(); parentAtoms.get(1).setProperty(Atom.ISALDEHYDE, true); assertFalse(FragmentTools.isCharacteristicAtom(parentAtoms.get(0))); assertTrue(FragmentTools.isCharacteristicAtom(parentAtoms.get(1))); assertTrue(FragmentTools.isCharacteristicAtom(parentAtoms.get(2))); } @Test public void testIsCharacteristicAtomFunctionalAtom() throws StructureBuildingException{ Fragment parent = fm.buildSMILES("CC(=O)[O-]"); List<Atom> parentAtoms = parent.getAtomList(); parent.addFunctionalAtom(parentAtoms.get(3)); for (int i = 0; i < parentAtoms.size() - 1; i++) { assertFalse(FragmentTools.isCharacteristicAtom(parentAtoms.get(i))); } assertTrue(FragmentTools.isCharacteristicAtom(parentAtoms.get(parentAtoms.size() - 1))); } @Test public void testIsCharacteristicAtomHydroxy() throws StructureBuildingException{ List<Atom> phenolAtoms = fm.buildSMILES("Oc1ccccc1").getAtomList(); assertTrue(FragmentTools.isCharacteristicAtom(phenolAtoms.get(0))); for (int i = 1; i < phenolAtoms.size(); i++) { 
assertFalse(FragmentTools.isCharacteristicAtom(phenolAtoms.get(i))); } } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/FusedRingNumbererFunctionsTest.java000066400000000000000000000265641451751637500327060ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import org.junit.jupiter.api.Test; public class FusedRingNumbererFunctionsTest { @Test public void testGetOppositeDirection(){ assertEquals(4,FusedRingNumberer.getOppositeDirection(0)); assertEquals(-3,FusedRingNumberer.getOppositeDirection(1)); assertEquals(-2,FusedRingNumberer.getOppositeDirection(2)); assertEquals(-1,FusedRingNumberer.getOppositeDirection(3)); assertEquals(0,FusedRingNumberer.getOppositeDirection(4)); assertEquals(0,FusedRingNumberer.getOppositeDirection(-4)); assertEquals(1,FusedRingNumberer.getOppositeDirection(-3)); assertEquals(2,FusedRingNumberer.getOppositeDirection(-2)); assertEquals(3,FusedRingNumberer.getOppositeDirection(-1)); } // // @Test // public void testDetermineAbsoluteDirectionFromPreviousDirection3Membered(){ // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 0, 3)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 0, 3)); // // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 1, 3)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 1, 3)); // // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -1, 3)); // assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -1, 3)); // } // // @Test // public void testDetermineAbsoluteDirectionFromPreviousDirection4Membered(){ // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 0, 4)); // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 0, 4)); // 
assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 0, 4)); // // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 2, 4)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 2, 4)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 2, 4)); // // assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -2, 4)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, -2, 4)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, -2, 4)); // } // // @Test // public void testDetermineAbsoluteDirectionFromPreviousDirection5Membered(){ // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 0, 5)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 0, 5)); // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 0, 5)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 0, 5)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 0, 5)); // assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 0, 5)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 0, 5)); // // //assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 1, 5)); // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 1, 5)); // //assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 1, 5)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 1, 5)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 1, 5)); // 
//assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 1, 5)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 1, 5)); // // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 2, 5)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 2, 5)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 2, 5)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 2, 5)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 2, 5)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 2, 5)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 2, 5)); // // //assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 3, 5)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 3, 5)); // //assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, 3, 5)); // assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 3, 5)); // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 3, 5)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, 3, 5)); // //assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 3, 5)); //// // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -1, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -1, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, -1, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -1, 5)); //// 
assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -1, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, -1, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -1, 5)); //// //// assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -2, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -2, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, -2, 5)); //// assertEquals(.FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -2, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -2, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, -2, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -2, 5)); //// //// assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(2, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-2, -3, 5)); //// assertEquals(,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -3, 5)); // } // // @Test // public void testDetermineAbsoluteDirectionFromPreviousDirection8Membered(){ // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 0, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 0, 8)); // 
assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 0, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 0, 8)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 0, 8)); // // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 1, 8)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 1, 8)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 1, 8)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 1, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 1, 8)); // // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 1, 8)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 1, 8)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 1, 8)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 1, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 1, 8)); // // assertEquals(2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 2, 8)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 2, 8)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 2, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 2, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 2, 8)); // // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, 3, 8)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, 3, 8)); // 
assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, 3, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 3, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, 3, 8)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, 3, 8)); // // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -1, 8)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -1, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -1, 8)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -1, 8)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -1, 8)); // // assertEquals(-2,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -2, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -2, 8)); // assertEquals(1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -2, 8)); // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -2, 8)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -2, 8)); // // assertEquals(-3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(0, -3, 8)); // assertEquals(-1,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(1, -3, 8)); // assertEquals(0,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(3, -3, 8)); // assertEquals(4,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-1, -3, 8)); // assertEquals(3,FusedRingNumberer.determineAbsoluteDirectionFromPreviousDirection(-3, -3, 8)); // } } 
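// Illustrative sketch only, not taken from OPSIN: the assertions in testGetOppositeDirection above
// are consistent with directions being integers from -4 to 4, where 4 and -4 denote the same direction.
// The class name OppositeDirectionSketch and the method opposite are hypothetical; the real
// FusedRingNumberer.getOppositeDirection may be implemented differently.

```java
// Hypothetical stand-alone sketch; not OPSIN's FusedRingNumberer implementation.
final class OppositeDirectionSketch {

    private OppositeDirectionSketch() {
        // utility class, no instances
    }

    /**
     * Returns the direction pointing the opposite way.
     * Assumption: directions are integers in [-4, 4], and 4 and -4 are the same direction,
     * so both of them map to 0 while 0 maps to 4.
     */
    static int opposite(int direction) {
        if (direction < -4 || direction > 4) {
            throw new IllegalArgumentException("direction out of range: " + direction);
        }
        if (direction == 0) {
            return 4;
        }
        // Shift by half a turn, staying inside the [-4, 4] range.
        return direction > 0 ? direction - 4 : direction + 4;
    }

    public static void main(String[] args) {
        // Print the full mapping so it can be compared with the assertions in the test.
        for (int d = -4; d <= 4; d++) {
            System.out.println(d + " -> " + opposite(d));
        }
    }
}
```

// The mapping above reproduces every value asserted in testGetOppositeDirection (0 to 4, 1 to -3, -4 to 0, and so on).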
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/FusedRingNumbererTest.java000066400000000000000000000374261451751637500310140ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.mockito.Mockito.mock; import java.util.List; import org.junit.jupiter.api.Disabled; import org.junit.jupiter.api.Test; /** * Tests that fused ring numbering is working as expected. A heteroatom(n) has been placed at the expected locant 1 to make the numbering unambiguous where, due to symmetry, geometric considerations are insufficient to deduce a unique numbering. * Currently interior atoms are not labelled. As this is not seen as a problem, tests of compounds with interior atoms do not assign locants to the interior atoms. * @author dl387 * */ public class FusedRingNumbererTest { private SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager()); @Test public void aceanthrene() throws StructureBuildingException { compareNumbering("C1Cc2cccc3cc4ccccc4c1c23", "1/2/2a/3/4/5/5a/6/6a/7/8/9/10/10a/10b/10c"); } @Test public void acenaphthene() throws StructureBuildingException { compareNumbering("C1Cc2cccc3cccc1c23", "1/2/2a/3/4/5/5a/6/7/8/8a/8b"); } @Test public void acephenanthrene() throws StructureBuildingException { compareNumbering("c1ccc2CCc3cc4ccccc4c1c23", "1/2/3/3a/4/5/5a/6/6a/7/8/9/10/10a/10b/10c"); } @Test public void arsanthrene() throws StructureBuildingException { compareNumbering("C1=CC=CC=2[As]=C3C=CC=CC3=[As]C12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void arsanthridine() throws StructureBuildingException { compareNumbering("c1cccc2[as]cc3ccccc3c12", "1/2/3/4/4a/5/6/6a/7/8/9/10/10a/10b"); } @Test public void arsindole() throws StructureBuildingException { compareNumbering("[as]1ccc2ccccc12", "1/2/3/3a/4/5/6/7/7a"); } @Test public void arsindoline() throws StructureBuildingException { compareNumbering("[as]1cccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); 
} @Test public void betacarboline() throws StructureBuildingException { compareNumbering("c1nccc2c3ccccc3nc12", "1/2/3/4/4a/4b/5/6/7/8/8a/9/9a"); } @Test public void boranthrene() throws StructureBuildingException { compareNumbering("C1=CC=CC=2[B]=C3C=CC=CC3=[B]C12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void cholanthrene() throws StructureBuildingException { compareNumbering("C1Cc2cccc3cc4c5ccccc5ccc4c1c23", "1/2/2a/3/4/5/5a/6/6a/6b/7/8/9/10/10a/11/12/12a/12b/12c"); } @Test public void thiochromane() throws StructureBuildingException { compareNumbering("S1CCCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void selenochromane() throws StructureBuildingException { compareNumbering("[Se]1CCCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void tellurochromane() throws StructureBuildingException { compareNumbering("[Te]1CCCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void thiochromene() throws StructureBuildingException { compareNumbering("s1cccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void selenochromene() throws StructureBuildingException { compareNumbering("[se]1cccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void tellurochromene() throws StructureBuildingException { compareNumbering("[Te]1CC=Cc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void coronene() throws StructureBuildingException { compareNumbering("c1cc2ccc3ccc4ccc5ccc6ccc71.c28c3c4c5c6c78", "1/2/2a/3/4/4a/5/6/6a/7/8/8a/9/10/10a/11/12/12a/12b/12c/12d/12e/12f/12g"); } @Test public void indane() throws StructureBuildingException { compareNumbering("C1CCc2ccccc12", "1/2/3/3a/4/5/6/7/7a"); } @Test public void isoarsindole() throws StructureBuildingException { compareNumbering("c1[as]cc2ccccc12", "1/2/3/3a/4/5/6/7/7a"); } @Test public void isoarsinoline() throws StructureBuildingException { compareNumbering("c1[as]ccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void thioisochromane() throws StructureBuildingException { 
compareNumbering("C1SCCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void selenoisochromane() throws StructureBuildingException { compareNumbering("C1[Se]CCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void telluroisochromane() throws StructureBuildingException { compareNumbering("C1[Te]CCc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void isochromene() throws StructureBuildingException { compareNumbering("c1occc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void thioisochromene() throws StructureBuildingException { compareNumbering("c1sccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void selenoisochromene() throws StructureBuildingException { compareNumbering("c1[se]ccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void telluroisochromene() throws StructureBuildingException { compareNumbering("C1[Te]C=Cc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void isophosphindole() throws StructureBuildingException { compareNumbering("c1pcc2ccccc12", "1/2/3/3a/4/5/6/7/7a"); } @Test public void isophosphinoline() throws StructureBuildingException { compareNumbering("c1pccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void isoviolanthrene() throws StructureBuildingException { compareNumbering("c1cccc2c3ccc4c5ccc6Cc7ccccc7c8ccc9c%10ccc%11Cc12.c3%11c4%10.c59c68", "1/2/3/4/4a/4b/5/6/6a/6b/7/8/8a/9/9a/10/11/12/13/13a/13b/14/15/15a/15b/16/17/17a/18/18a/18b/18c/18d/18e"); } @Test public void mercuranthrene() throws StructureBuildingException { compareNumbering("c1cccc2[Hg]c3ccccc3[Hg]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test @Disabled //Transient BUG in path finding code public void ovalene() throws StructureBuildingException { compareNumbering("c1cc2ccc3ccc4cc5ccc6ccc7ccc8cc91.c19c2c3c4c9c5c6c7c8c19", "1/2/2a/3/4/4a/5/6/6a/7/7a/8/9/9a/10/11/11a/12/13/13a/14/14a/14b/14c/14d/14e/14f/14g/14h/14i/14j/14k"); } @Test public void oxanthrene() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3Oc12", 
"1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void perylene() throws StructureBuildingException { compareNumbering("c1ccc2cccc3c4cccc5cccc6c71.c237.c456", "1/2/3/3a/4/5/6/6a/6b/7/8/9/9a/10/11/12/12a/12b/12c/12d"); } @Test public void phenanthridine() throws StructureBuildingException { compareNumbering("c1cccc2ncc3ccccc3c12", "1/2/3/4/4a/5/6/6a/7/8/9/10/10a/10b"); } @Test public void phenomercurine() throws StructureBuildingException { compareNumbering("c1cccc2[Hg]c3ccccc3[Hg]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxazine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3nc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenothiazine() throws StructureBuildingException { compareNumbering("c1cccc2Sc3ccccc3nc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoselenazine() throws StructureBuildingException { compareNumbering("c1cccc2[Se]c3ccccc3nc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenotellurazine() throws StructureBuildingException { compareNumbering("c1cccc2[Te]c3ccccc3nc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenophosphazinine() throws StructureBuildingException { compareNumbering("c1cccc2nc3ccccc3pc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenophosphazine() throws StructureBuildingException { compareNumbering("c1cccc2nc3ccccc3pc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenarsazinine() throws StructureBuildingException { compareNumbering("c1cccc2nc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoarsazine() throws StructureBuildingException { compareNumbering("c1cccc2nc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenomercurazine() throws StructureBuildingException { compareNumbering("c1cccc2nc3ccccc3[Hg]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenomercazine() throws StructureBuildingException { 
compareNumbering("c1cccc2nc3ccccc3[Hg]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxathiine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3Sc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxaselenine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[Se]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxatellurine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[Te]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxaphosphinine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3pc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxaphosphine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3pc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxarsinine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxarsine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxastibinine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[sb]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenoxantimonine() throws StructureBuildingException { compareNumbering("c1cccc2Oc3ccccc3[sb]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenothiarsinine() throws StructureBuildingException { compareNumbering("c1cccc2Sc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phenothiarsine() throws StructureBuildingException { compareNumbering("c1cccc2Sc3ccccc3[as]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phosphanthrene() throws StructureBuildingException { compareNumbering("C1=CC=CC=2P=C3C=CC=CC3=PC12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void phosphindole() throws 
StructureBuildingException { compareNumbering("p1ccc2ccccc12", "1/2/3/3a/4/5/6/7/7a"); } @Test public void phosphinoline() throws StructureBuildingException { compareNumbering("p1cccc2ccccc12", "1/2/3/4/4a/5/6/7/8/8a"); } @Test public void picene() throws StructureBuildingException { compareNumbering("c1cccc2ccc3c4ccc5ccccc5c4ccc3c21", "1/2/3/4/4a/5/6/6a/6b/7/8/8a/9/10/11/12/12a/12b/13/14/14a/14b"); } @Test public void pleiadene() throws StructureBuildingException { compareNumbering("c1ccc2cccc3cc4ccccc4cc51.c235", "1/2/3/3a/4/5/6/6a/7/7a/8/9/10/11/11a/12/12a/12b"); } @Test public void pyranthrene() throws StructureBuildingException { compareNumbering("c1cccc2c3cc4ccc5cc6ccccc6c6cc7ccc8cc12.c38c7c4c56", "1/2/3/4/4a/4b/5/5a/6/7/7a/8/8a/9/10/11/12/12a/12b/13/13a/14/15/15a/16/16a////"); } @Test public void pyrrolizine() throws StructureBuildingException { compareNumbering("c1ccn2cccc12", "1/2/3/4/5/6/7/7a"); } @Test public void quinolizine() throws StructureBuildingException { compareNumbering("c1cccn2ccccc12", "1/2/3/4/5/6/7/8/9/9a"); } @Test public void rubicene() throws StructureBuildingException { compareNumbering("c1ccc2c3ccccc3c3c4cccc5c6ccccc6c6c71.c237.c456", "1/2/3/3a/3b/4/5/6/7/7a/7b/7c/8/9/10/10a/10b/11/12/13/14/14a/14b/14c/14d/14e"); } @Test public void silanthrene() throws StructureBuildingException { compareNumbering("C1=CC=CC=2[Si]=C3C=CC=CC3=[Si]C12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void selenanthrene() throws StructureBuildingException { compareNumbering("c1cccc2[Se]c3ccccc3[Se]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void telluranthrene() throws StructureBuildingException { compareNumbering("c1cccc2[Te]c3ccccc3[Te]c12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void thianthrene() throws StructureBuildingException { compareNumbering("c1cccc2Sc3ccccc3Sc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void trindene() throws StructureBuildingException { compareNumbering("c1ccc2c3cccc3c4cccc4c12", 
"1/2/3/3a/3b/4/5/6/6a/6b/7/8/9/9a/9b"); } @Test public void violanthrene() throws StructureBuildingException { compareNumbering("c1cccc2Cc3ccc4c5ccc6Cc7ccccc7c8ccc9c%10ccc%11c12.c3%11c4%10.c59c68", "1/2/3/4/4a/5/5a/6/7/7a/7b/8/9/9a/10/10a/11/12/13/14/14a/14b/15/16/16a/16b/17/18/18a/18b////"); } @Test public void naphthotetraphene() throws StructureBuildingException { compareNumbering("c1cc2ccc3ccc4cc5ccccc5cc4c3c2c6ccccc16", "1/2/2a/3/4/4a/5/6/6a/7/7a/8/9/10/11/11a/12/12a/12b/12c/12d/13/14/15/16/16a"); } @Test public void anthratetraphene() throws StructureBuildingException { compareNumbering("c1cc2ccc3ccc4cc5ccccc5cc4c3c2c6cc7ccccc7cc16", "1/2/2a/3/4/4a/5/6/6a/7/7a/8/9/10/11/11a/12/12a/12b/12c/12d/13/13a/14/15/16/17/17a/18/18a"); } @Test public void octalenotetraphene() throws StructureBuildingException { compareNumbering("c1ccc2ccc3ccc4cc5ccccc5cc4c3c2cc6ccccccc16", "1/2/3/3a/4/5/5a/6/7/7a/8/8a/9/10/11/12/12a/13/13a/13b/13c/14/14a/15/16/17/18/19/20/20a"); } @Test public void difficultChain() throws StructureBuildingException { compareNumbering("C1C2C3C4CC5CC6CCCCCCC6CCC5CC4CC3C12", "1/1a/1b/1c/2/2a/3/3a/4/5/6/7/8/9/9a/10/11/11a/12/12a/13/13a/13b"); } @Test public void difficultChain2() throws StructureBuildingException { compareNumbering("C1C2CCCCC2C3C4C5C6CCC7C8CCCCC8C7CCC56C4CCC13", "1/1a/2/3/4/5/5a/5b/5c/5d/5e/6/7/7a/7b/8/9/10/11/11a/11b/12/13/13a/13b/14/15/15a"); } @Test public void acrindoline() throws StructureBuildingException { compareNumbering("c1cccc2c3ccc4cc5ccccc5nc4c3nc12", "1/2/3/4/4a/4b/5/6/6a/7/7a/8/9/10/11/11a/12/12a/12b/13/13a"); } @Test public void anthrazine() throws StructureBuildingException { compareNumbering("c1cccc2cc3c4nc5ccc6cc7ccccc7cc6c5nc4ccc3cc12", "1/2/3/4/4a/5/5a/5b/6/6a/7/8/8a/9/9a/10/11/12/13/13a/14/14a/14b/15/15a/16/17/17a/18/18a"); } @Test public void anthyridine() throws StructureBuildingException { compareNumbering("n1cccc2cc3cccnc3nc12", "1/2/3/4/4a/5/5a/6/7/8/9/9a/10/10a"); } @Test public void benzo_cd_azulene() throws 
StructureBuildingException { compareNumbering("c1cc2cccc3ccccc1c23", "1/2/2a/3/4/5/5a/6/7/8/9/9a/9b"); } @Test public void indeno_7_1_cd_azepine() throws StructureBuildingException { compareNumbering("c1nccc2ccc3cccc1c23", "1/2/3/4/4a/5/6/6a/7/8/9/9a/9b"); } @Test @Disabled public void tripleSubstituedSevenMembered() throws StructureBuildingException { compareNumbering("C1NCCN2c3ncccc3Cc4ccccc4C12", "1/2/3/4/5/5a/6/7/8/9/9a/10/10a/11/12/13/14/14a/14b"); compareNumbering("c1cccc2C3CNCCN3c4ncccc4Cc12", "1/2/3/4/5/5a/6/7/8/9/9a/10/10a/11/12/13/14/14a/14b"); } /** * Takes SMILES and expected labels for a fused ring. Generates the fused ring, numbers it, then compares the result to the given slash-delimited labels. An empty label marks an interior atom whose locant is not checked. * @param smiles * @param labels * @throws StructureBuildingException */ private void compareNumbering(String smiles, String labels) throws StructureBuildingException { Fragment fusedRing = sBuilder.build(smiles, mock(Element.class), XmlDeclarations.NONE_LABELS_VAL); String[] labelArray = labels.split("/", -1); FusedRingNumberer.numberFusedRing(fusedRing); List<Atom> atomList = fusedRing.getAtomList(); assertEquals(atomList.size(), labelArray.length);//bug in test if not true! 
for (int i = 0; i < atomList.size(); i++) { if (!labelArray[i].equals("")){//exterior atom locant assertEquals(labelArray[i],atomList.get(i).getFirstLocant()); } } } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/HeteroAtomReplacementTest.java000066400000000000000000000062011451751637500316400ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertThrows; import static org.mockito.Mockito.mock; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; public class HeteroAtomReplacementTest { FragmentManager fragManager; Atom a; @BeforeEach public void setUp() { IDManager idManager = new IDManager(); fragManager = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); a = new Atom(0, ChemEl.C, mock(Fragment.class)); } @Test public void thia() throws StructureBuildingException{ fragManager.replaceAtomWithSmiles(a, "S"); assertEquals(0, a.getCharge()); assertEquals(0, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(2, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void thionia() throws StructureBuildingException{ fragManager.replaceAtomWithSmiles(a, "[SH3+]"); assertEquals(1, a.getCharge()); assertEquals(1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(3, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void sulfanylia() throws StructureBuildingException{ fragManager.replaceAtomWithSmiles(a, "[SH+]"); assertEquals(1, a.getCharge()); assertEquals(-1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(1, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void sulfanida() throws StructureBuildingException{ fragManager.replaceAtomWithSmiles(a, "[SH-]"); assertEquals(-1, a.getCharge()); assertEquals(-1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(1, 
StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void sulfanuida() throws StructureBuildingException{ fragManager.replaceAtomWithSmiles(a, "[SH3-]"); assertEquals(-1, a.getCharge()); assertEquals(1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(3, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void replaceNeutralWithCharged() throws StructureBuildingException{ Atom a = new Atom(0, ChemEl.C, mock(Fragment.class)); fragManager.replaceAtomWithSmiles(a, "[NH4+]"); assertEquals(1, a.getCharge()); assertEquals(1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(4, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test public void replaceChargedWithEquallyCharged() throws StructureBuildingException{ Atom a = new Atom(0, ChemEl.C, mock(Fragment.class)); a.addChargeAndProtons(1, -1); fragManager.replaceAtomWithSmiles(a, "[NH4+]"); assertEquals(1, a.getCharge()); assertEquals(1, a.getProtonsExplicitlyAddedOrRemoved()); assertEquals(4, StructureBuildingMethods.calculateSubstitutableHydrogenAtoms(a)); } @Test() public void replaceChargedWithUnEquallyCharged() { assertThrows(StructureBuildingException.class, () -> { Atom a = new Atom(0, ChemEl.C, mock(Fragment.class)); a.addChargeAndProtons(1, -1); fragManager.replaceAtomWithSmiles(a, "[NH2-]"); }); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/NameToStructureConfigurationsTest.java000066400000000000000000000065401451751637500334360ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; import uk.ac.cam.ch.wwmm.opsin.OpsinResult.OPSIN_RESULT_STATUS; public class NameToStructureConfigurationsTest { private static NameToStructure n2s; @BeforeAll public static void setUp() { n2s = NameToStructure.getInstance(); } @AfterAll public static 
void cleanUp() { n2s = null; } @Test public void testAllowRadicals() throws StructureBuildingException { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setAllowRadicals(false); OpsinResult or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals(OPSIN_RESULT_STATUS.FAILURE, or.getStatus()); n2sConfig.setAllowRadicals(true); or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals(OPSIN_RESULT_STATUS.SUCCESS, or.getStatus()); } @Test public void testOutputRadicalsAsWildCards() throws StructureBuildingException { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setAllowRadicals(true); n2sConfig.setOutputRadicalsAsWildCardAtoms(false); OpsinResult or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals("[CH3]", or.getSmiles()); n2sConfig.setOutputRadicalsAsWildCardAtoms(true); or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals("C*", or.getSmiles()); } @Test public void testOutputRadicalsAsAttachments() { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setAllowRadicals(true); n2sConfig.setOutputRadicalsAsWildCardAtoms(false); OpsinResult or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals("[CH3]", or.getSmiles(SmilesOptions.CXSMILES_ATOM_LABELS)); n2sConfig.setOutputRadicalsAsWildCardAtoms(true); or = n2s.parseChemicalName("methyl", n2sConfig); assertEquals("C* |$;_AP1$|", or.getSmiles(SmilesOptions.CXSMILES_ATOM_LABELS)); } @Test public void testInterpretAcidsWithoutTheWordAcid() throws StructureBuildingException { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setInterpretAcidsWithoutTheWordAcid(false); OpsinResult or = n2s.parseChemicalName("acetic", n2sConfig); assertEquals(OPSIN_RESULT_STATUS.FAILURE, or.getStatus()); n2sConfig.setInterpretAcidsWithoutTheWordAcid(true); or = n2s.parseChemicalName("acetic", n2sConfig); 
assertEquals(OPSIN_RESULT_STATUS.SUCCESS, or.getStatus()); } @Test public void testWarnRatherThanFailOnUninterpretableStereochemistry() throws StructureBuildingException { NameToStructureConfig n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setWarnRatherThanFailOnUninterpretableStereochemistry(false); OpsinResult or = n2s.parseChemicalName("(R)-2,2'-Bis(diphenylphosphino)-1,1'-binaphthyl", n2sConfig); assertEquals(OPSIN_RESULT_STATUS.FAILURE, or.getStatus()); n2sConfig.setWarnRatherThanFailOnUninterpretableStereochemistry(true); or = n2s.parseChemicalName("(R)-2,2'-Bis(diphenylphosphino)-1,1'-binaphthyl", n2sConfig); assertEquals(OPSIN_RESULT_STATUS.WARNING, or.getStatus()); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/NameToStructureTest.java000066400000000000000000000050521451751637500305200ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNull; import org.junit.jupiter.api.Test; public class NameToStructureTest { @Test public void testNameToStructure() { NameToStructure nts = NameToStructure.getInstance(); assertNotNull(nts, "Got a name to structure convertor"); } @Test public void testParseToCML() { NameToStructure nts = NameToStructure.getInstance(); String cml = nts.parseToCML("ethane"); // output is syntactically valid (schema, dictRefs) // labels assigned and is correct. 
// contains a molecule with same connectivity as 'frag of CML' assertEquals("" + "" + "ethane" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "" + "", cml, "Parsing 'ethane'"); assertNull(nts.parseToCML("helloworld"), "Won't parse helloworld"); } @Test public void testParseToSmiles() { NameToStructure nts = NameToStructure.getInstance(); String smiles = nts.parseToSmiles("ethane"); assertEquals("CC", smiles); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/ParserTest.java000066400000000000000000000061701451751637500266520ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertThrows; import java.io.IOException; import java.util.List; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; public class ParserTest { private static Parser parser; private static NameToStructureConfig config; @BeforeAll public static void setUp() throws IOException{ parser = new Parser(); config = NameToStructureConfig.getDefaultConfigInstance(); } @AfterAll public static void cleanUp(){ parser = null; config = null; } @Test() public void testParseThrowsWhenNameIsUninterpretable() throws ParsingException { assertThrows(ParsingException.class, () -> { parser.parse(config, "chunky bacon"); }); } @Test public void testParseUninvertsCASNomenclature() throws ParsingException { List parse = parser.parse(config, "Piperidine, 1-(1-oxopropyl)-"); assertFalse(parse.isEmpty()); } @Test public void testParseReturnsOneWordRuleForEachMixtureComponent() throws ParsingException { List parse = parser.parse(config, "benzene; ethane"); assertEquals(2, parse.get(0).getChildElements(XmlDeclarations.WORDRULE_EL).size()); } @Test() public void testParseThrowsWhenNameIsSubstituentOnly() { assertThrows(ParsingException.class, () -> 
		{
			parser.parse(config, "chloro");
		});
	}

	@Test()
	public void testNoParseForOneComponentSalt() {
		assertThrows(ParsingException.class, () -> {
			parser.parse(config, "pyridine salt");
		});
	}

	@Test
	public void testConvertStringToComponentRatios1() throws ParsingException {
		String ratio = "(1:2)";
		Integer[] componentRatios = Parser.processStoichiometryIndication(ratio);
		assertEquals(2, componentRatios.length);
		for (int i = 0; i < componentRatios.length; i++) {
			if (i == 0) {
				assertEquals(1, (int) componentRatios[i]);
			}
			if (i == 1) {
				assertEquals(2, (int) componentRatios[i]);
			}
		}
	}

	@Test
	public void testConvertStringToComponentRatios2() throws ParsingException {
		String ratio = "[1/1/2]";
		Integer[] componentRatios = Parser.processStoichiometryIndication(ratio);
		assertEquals(3, componentRatios.length);
		for (int i = 0; i < componentRatios.length; i++) {
			if (i == 0) {
				assertEquals(1, (int) componentRatios[i]);
			}
			if (i == 1) {
				assertEquals(1, (int) componentRatios[i]);
			}
			if (i == 2) {
				assertEquals(2, (int) componentRatios[i]);
			}
		}
	}

	@Test
	public void testConvertStringToComponentRatios3() throws ParsingException {
		String ratio = "(1:2:?)";
		Integer[] componentRatios = Parser.processStoichiometryIndication(ratio);
		assertEquals(3, componentRatios.length);
		for (int i = 0; i < componentRatios.length; i++) {
			if (i == 0) {
				assertEquals(1, (int) componentRatios[i]);
			}
			if (i == 1) {
				assertEquals(2, (int) componentRatios[i]);
			}
			if (i == 2) {
				assertEquals(1, (int) componentRatios[i]);
			}
		}
	}
}
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/PolymerTest.java
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.junit.jupiter.api.Test;

public class PolymerTest {

	@Test
	public void testSimplePolymer() throws ParsingException {
		OpsinResult result = NameToStructure.getInstance().parseChemicalName("poly(oxyethylene)");

		String smiles = result.getSmiles();
		assertNotNull(smiles);
		assertEquals(true, smiles.contains("[*:1]"));
		assertEquals(true, smiles.contains("[*:2]"));

		String cml = result.getCml();
		assertEquals(true, cml.contains("alpha"));
		assertEquals(true, cml.contains("omega"));
	}

}
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/PreProcessorTest.java
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

public class PreProcessorTest {

	@Test()
	public void testPreProcessBlankThrows() {
		assertThrows(PreProcessingException.class, () -> {
			PreProcessor.preProcess("");
		});
	}

	@Test
	public void testPreProcessConvertsDollarA() throws PreProcessingException {
		assertEquals("alpha-bromo", PreProcessor.preProcess("$a-bromo"), "Convert dollar-a");
	}

	@Test
	public void testPreProcessConvertsDollarB() throws PreProcessingException {
		assertEquals("beta-bromo", PreProcessor.preProcess("$b-bromo"), "Convert dollar-b");
	}

	@Test
	public void testPreProcessConvertsDollarG() throws PreProcessingException {
		assertEquals("gamma-bromo", PreProcessor.preProcess("$g-bromo"), "Convert dollar-g");
	}

	@Test
	public void testPreProcessConvertsDollarD() throws PreProcessingException {
		assertEquals("delta-bromo", PreProcessor.preProcess("$d-bromo"), "Convert dollar-d");
	}

	@Test
	public void testPreProcessConvertsDollarE() throws PreProcessingException {
		assertEquals("epsilon-bromo", PreProcessor.preProcess("$e-bromo"), "Convert dollar-e");
	}

	@Test
	public void testPreProcessConvertsDollarL() throws PreProcessingException {
		assertEquals("lambda-bromo", PreProcessor.preProcess("$l-bromo"), "Convert dollar-l");
	}

	@Test
	public void testPreProcessConvertsGreekLetterToWord() throws PreProcessingException {
		assertEquals("alpha-bromo", PreProcessor.preProcess("\u03b1-bromo"), "Convert greek to word");
	}

	@Test
	public void testPreProcessConvertsSulphToSulf() throws PreProcessingException {
		assertEquals("sulfur dioxide", PreProcessor.preProcess("sulphur dioxide"), "Converts 'sulph' to 'sulf'");
	}

	@Test
	public void testRemovalOfDotsFromGreekWords1() throws PreProcessingException {
		assertEquals("alpha-methyl-toluene", PreProcessor.preProcess(".alpha.-methyl-toluene"), "Converts '.alpha.' to 'alpha'");
	}

	@Test
	public void testRemovalOfDotsFromGreekWords2() throws PreProcessingException {
		assertEquals("alphabetaeta", PreProcessor.preProcess(".alpha..beta..eta."));
	}

	@Test
	public void testHtmlGreeks() throws PreProcessingException {
		assertEquals("alpha-methyl-toluene", PreProcessor.preProcess("&alpha;-methyl-toluene"));
		assertEquals("beta-methyl-styrene", PreProcessor.preProcess("&BETA;-methyl-styrene"));
	}
}
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/RadixTrieTest.java
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;

import java.util.List;

import org.junit.jupiter.api.Test;

public class RadixTrieTest {

	@Test
	public void testSimpleAddSimpleGet() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("benzene");
		List<Integer> matches = trie.findMatches("benzene", 0);
		assertNotNull(matches);
		assertEquals(1, matches.size());
		assertEquals(7, matches.get(0).intValue());
	}

	@Test
	public void testSimpleAddFindPrefix() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("phenyl");
		List<Integer> matches = trie.findMatches("phenylbenzene", 0);
		assertNotNull(matches);
		assertEquals(1, matches.size());
		assertEquals(6, matches.get(0).intValue());
	}

	@Test
	public void testAddWithBranchFindPrefix() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("pyridinyl");
		trie.addToken("phenyl");
		List<Integer> matches = trie.findMatches("phenylbenzene", 0);
		assertNotNull(matches);
		assertEquals(1, matches.size());
		assertEquals(6, matches.get(0).intValue());
	}

	@Test
	public void testZeroLengthToken() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("");//e.g. end of substituent
		List<Integer> matches = trie.findMatches("phenylbenzene", 0);
		assertNotNull(matches);
		assertEquals(1, matches.size());
		assertEquals(0, matches.get(0).intValue());
	}

	@Test
	public void testMultipleHits() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("methyl");
		trie.addToken("methylidene");
		List<Integer> matches = trie.findMatches("methylidene", 0);
		assertNotNull(matches);
		assertEquals(2, matches.size());
		assertEquals(6, matches.get(0).intValue());
		assertEquals(11, matches.get(1).intValue());
	}

	@Test
	public void testMultipleHits2() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("abcdef");
		trie.addToken("a");
		trie.addToken("");
		trie.addToken("acd");
		trie.addToken("ab");
		trie.addToken("abcf");
		List<Integer> matches = trie.findMatches("abc", 0);
		assertNotNull(matches);
		assertEquals(3, matches.size());
		assertEquals(0, matches.get(0).intValue());
		assertEquals(1, matches.get(1).intValue());
		assertEquals(2, matches.get(2).intValue());
	}

	@Test
	public void testReverseMatching() {
		OpsinRadixTrie trie = new OpsinRadixTrie();
		trie.addToken("enedilyhte");
		trie.addToken("lyhte");
		trie.addToken("");
		trie.addToken("ly");
		trie.addToken("lyhtem");
		List<Integer> matches = trie.findMatchesReadingStringRightToLeft("ethyl", 5);
		assertNotNull(matches);
		assertEquals(3, matches.size());
		assertEquals(5, matches.get(0).intValue());
		assertEquals(3, matches.get(1).intValue());
		assertEquals(0, matches.get(2).intValue());
	}
}
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/SMILESFragmentBuilderTest.java
package uk.ac.cam.ch.wwmm.opsin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.fail;

import java.util.List;
import java.util.Set;

import org.junit.jupiter.api.Test;

import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue;

public class SMILESFragmentBuilderTest {

	private SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager());

	@Test
	public void testBuild() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C");
		assertNotNull(fragment, "Got a fragment");
	}

	@Test
	public void testSimple1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CC");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(2, atomList.size());
		assertEquals(ChemEl.C, atomList.get(0).getElement());
		assertEquals(ChemEl.C, atomList.get(1).getElement());
	}

	@Test
	public void testSimple2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("O=C=O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(ChemEl.O, atomList.get(0).getElement());
		assertEquals(ChemEl.C, atomList.get(1).getElement());
		assertEquals(ChemEl.O, atomList.get(2).getElement());
		Set<Bond> bonds = fragment.getBondSet();
		assertEquals(2, bonds.size());
		for (Bond bond : bonds) {
			assertEquals(2, bond.getOrder());
		}
	}

	@Test
	public void testSimple3() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C#N");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(2, atomList.size());
		Set<Bond> bonds = fragment.getBondSet();
		assertEquals(1, bonds.size());
		for (Bond bond : bonds) {
			assertEquals(3, bond.getOrder());
		}
	}

	@Test
	public void testSimple4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CCN(CC)CC");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(7, atomList.size());
		Atom nitrogen = atomList.get(2);
		assertEquals(ChemEl.N, nitrogen.getElement());
		assertEquals(3, nitrogen.getBondCount());
		List<Atom> neighbours = nitrogen.getAtomNeighbours();//bonds and hence neighbours come from a linked hash set so the order of the neighbours is deterministic
		assertEquals(3, neighbours.size());
		assertEquals(atomList.get(1), neighbours.get(0));
		assertEquals(atomList.get(3), neighbours.get(1));
		assertEquals(atomList.get(5), neighbours.get(2));
	}

	@Test
	public void testSimple5() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CC(=O)O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom carbon = atomList.get(1);
		List<Atom> neighbours = carbon.getAtomNeighbours();
		assertEquals(3, neighbours.size());
		assertEquals(atomList.get(0), neighbours.get(0));
		assertEquals(atomList.get(2), neighbours.get(1));
		assertEquals(atomList.get(3), neighbours.get(2));
		assertEquals(2, carbon.getBondToAtomOrThrow(atomList.get(2)).getOrder());
	}

	@Test
	public void testSimple6() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C1CCCCC1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(6, atomList.size());
		for (Atom atom : atomList) {
			assertEquals(2, atom.getAtomNeighbours().size());
			assertEquals(false, atom.hasSpareValency());
		}
	}

	@Test
	public void testSimple7() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("c1ccccc1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(6, atomList.size());
		for (Atom atom : atomList) {
			assertEquals(2, atom.getAtomNeighbours().size());
			assertEquals(true, atom.hasSpareValency());
		}
	}

	@Test
	public void testSimple8() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[I-].[Na+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(2, atomList.size());
		Atom iodine = atomList.get(0);
		assertEquals(0, iodine.getAtomNeighbours().size());
		assertEquals(-1, iodine.getCharge());

		Atom sodium = atomList.get(1);
		assertEquals(0, sodium.getAtomNeighbours().size());
		assertEquals(1, sodium.getCharge());
	}

	@Test
	public void testSimple9() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("(C(=O)O)");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		Atom carbon = atomList.get(0);
		assertEquals(2, carbon.getAtomNeighbours().size());
	}

	@Test
	public void testSimple10() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C-C-O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
	}

	@Test
	public void testSimple11() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("NC(Cl)(Br)C(=O)O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(7, atomList.size());
		assertEquals(ChemEl.Cl, atomList.get(2).getElement());
	}

	@Test()
	public void unterminatedRingOpening() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("C1CC");
			fail("Should throw exception for bad smiles");
		});
	}

	@Test
	public void doublePositiveCharge1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[C++]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(2, atomList.get(0).getCharge());
	}

	@Test
	public void doublePositiveCharge2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[C+2]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(2, atomList.get(0).getCharge());
	}

	@Test
	public void doubleNegativeCharge1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[O--]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(-2, atomList.get(0).getCharge());
	}

	@Test
	public void doubleNegativeCharge2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[O-2]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(-2, atomList.get(0).getCharge());
	}

	@Test
	public void noIsotopeSpecified() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[NH3]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(null, atomList.get(0).getIsotope());
	}

	@Test
	public void isotopeSpecified() throws
			StructureBuildingException {
		Fragment fragment = sBuilder.build("[15NH3]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertNotNull(atomList.get(0).getIsotope(), "Isotope should not be null");
		int isotope = atomList.get(0).getIsotope();
		assertEquals(15, isotope);
	}

	@Test()
	public void badlyFormedSMILE1() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("H5");
			fail("Should throw exception for bad smiles");
		});
	}

	@Test()
	public void badlyFormedSMILE2() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("CH4");
			fail("Should throw exception for bad smiles");
		});
	}

	@Test()
	public void badlyFormedSMILE3() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("13C");
			fail("Should throw exception for bad smiles");
		});
	}

	@Test()
	public void badlyFormedSMILE4() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("C=#C");
			fail("Should throw exception for bad smiles: is it a double or triple bond?");
		});
	}

	@Test()
	public void badlyFormedSMILE5() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("C#=C");
			fail("Should throw exception for bad smiles: is it a double or triple bond?");
		});
	}

	@Test()
	public void badlyFormedSMILE6() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("F//C=C/F");
			fail("Should throw exception for bad smiles: bond configuration specified twice");
		});
	}

	@Test()
	public void badlyFormedSMILE7() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("F/C=C/\\F");
			fail("Should throw exception for bad smiles: bond configuration specified twice");
		});
	}

	@Test()
	public void badlyFormedSMILE8() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("F[C@@](Cl)Br");
			fail("Should throw exception for invalid atom parity, not enough atoms in atom parity");
		});
	}

	@Test
	public void ringClosureHandling1() throws StructureBuildingException {
		Fragment fragment =
				sBuilder.build("C=1CN1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(2, atomList.get(0).getBondToAtomOrThrow(atomList.get(2)).getOrder());
	}

	@Test
	public void ringClosureHandling2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C1CN=1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(2, atomList.get(0).getBondToAtomOrThrow(atomList.get(2)).getOrder());
	}

	@Test()
	public void ringClosureHandling3() {
		assertThrows(StructureBuildingException.class, () -> {
			sBuilder.build("C#1CN=1");
			fail("Should throw exception for bad smiles");
		});
	}

	@Test
	public void ringClosureHandling4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C=1CN=1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(2, atomList.get(0).getBondToAtomOrThrow(atomList.get(2)).getOrder());
	}

	@Test
	public void ringSupportGreaterThan10() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C%10CC%10");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(2, atomList.get(0).getAtomNeighbours().size());
	}

	@Test
	public void hydrogenHandling1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[OH3+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(1, atomList.get(0).getCharge());
		assertEquals(1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(3, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[CH3][CH2][OH]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(3, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(4, atomList.get(1).determineValency(true));
		assertEquals(0,
				atomList.get(1).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(2, atomList.get(2).determineValency(true));
		assertEquals(0, atomList.get(2).getProtonsExplicitlyAddedOrRemoved());
	}

	@Test
	public void hydrogenHandling3() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH2]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(2, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
	}

	@Test
	public void hydrogenHandling4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH4]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		int minimumVal = atomList.get(0).getMinimumValency();
		assertEquals(4, minimumVal);
		assertEquals(4, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
	}

	@Test
	public void hydrogenHandling5() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH6]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		int minimumVal = atomList.get(0).getMinimumValency();
		assertEquals(6, minimumVal);
		assertEquals(6, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
	}

	@Test
	public void hydrogenHandling6() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH3]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		int minimumVal = atomList.get(0).getMinimumValency();
		assertEquals(3, minimumVal);
		assertEquals(3, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling7() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH3+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(1, atomList.get(0).getCharge());
		assertEquals(1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(3,
				atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling8() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(1, atomList.get(0).getCharge());
		assertEquals(-1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(1, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling9() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH3-]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(-1, atomList.get(0).getCharge());
		assertEquals(1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(3, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling10() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH-]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(-1, atomList.get(0).getCharge());
		assertEquals(-1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(1, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling11() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SH5+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		int lambdaConvent = atomList.get(0).getLambdaConventionValency();
		assertEquals(4, lambdaConvent);
		assertEquals(1, atomList.get(0).getCharge());
		assertEquals(1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(5, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling12() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[Li+]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(1, atomList.get(0).getCharge());
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(0, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling13() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[NaH]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(2, atomList.size());
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(0, atomList.get(0).getCharge());

		assertEquals(0, atomList.get(1).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(0, atomList.get(1).getCharge());
		assertEquals(ChemEl.H, atomList.get(1).getElement());
	}

	@Test
	public void hydrogenHandling14() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("-[SiH3]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
	}

	@Test
	public void hydrogenHandling15() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("=[SiH2]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling16() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("#[SiH]");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling17() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SiH3]-");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling18() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SiH2]=");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling19() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[SiH]#");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1,
				atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling20() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("=[Si]=");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(1, atomList.size());
		assertEquals(4, atomList.get(0).determineValency(true));
	}

	@Test
	public void hydrogenHandling21() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[o+]1ccccc1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(6, atomList.size());
		assertEquals(1, atomList.get(0).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(true, atomList.get(0).hasSpareValency());
		assertEquals(3, atomList.get(0).determineValency(true));
		assertEquals(0, atomList.get(1).getProtonsExplicitlyAddedOrRemoved());
		assertEquals(4, atomList.get(1).determineValency(true));
		assertEquals(true, atomList.get(1).hasSpareValency());
	}

	@Test
	public void indicatedHydrogen() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("Nc1[nH]c(=O)c2c(n1)nc[nH]2");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(11, atomList.size());
		assertEquals(2, fragment.getIndicatedHydrogen().size());
		assertEquals(atomList.get(2), fragment.getIndicatedHydrogen().get(0));
		assertEquals(atomList.get(10), fragment.getIndicatedHydrogen().get(1));
	}

	@Test
	public void chiralityTest1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("N[C@@H](F)C");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(1);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(AtomParity.hydrogen, atomRefs4[1]);
		assertEquals(atomList.get(2), atomRefs4[2]);
		assertEquals(atomList.get(3), atomRefs4[3]);
		assertEquals(1, atomParity.getParity());
	}

	@Test
	public void chiralityTest2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("N[C@H](F)C");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(1);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(AtomParity.hydrogen, atomRefs4[1]);
		assertEquals(atomList.get(2), atomRefs4[2]);
		assertEquals(atomList.get(3), atomRefs4[3]);
		assertEquals(-1, atomParity.getParity());
	}

	@Test
	public void chiralityTest3() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C2.N1.F3.[C@@H]231");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(3);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(AtomParity.hydrogen, atomRefs4[0]);
		assertEquals(atomList.get(0), atomRefs4[1]);
		assertEquals(atomList.get(2), atomRefs4[2]);
		assertEquals(atomList.get(1), atomRefs4[3]);
		assertEquals(1, atomParity.getParity());
	}

	@Test
	public void chiralityTest4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[C@@H]231.C2.N1.F3");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(0);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(AtomParity.hydrogen, atomRefs4[0]);
		assertEquals(atomList.get(1), atomRefs4[1]);
		assertEquals(atomList.get(3), atomRefs4[2]);
		assertEquals(atomList.get(2), atomRefs4[3]);
		assertEquals(1, atomParity.getParity());
	}

	@Test
	public void chiralityTest5() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[C@@H](Cl)1[C@H](C)(F).Br1");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(6, atomList.size());
		Atom chiralAtom1 = atomList.get(0);
		assertEquals(3, chiralAtom1.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom1.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(AtomParity.hydrogen, atomRefs4[0]);
		assertEquals(atomList.get(1), atomRefs4[1]);
		assertEquals(atomList.get(5), atomRefs4[2]);
		assertEquals(atomList.get(2), atomRefs4[3]);
		assertEquals(1, atomParity.getParity());

		Atom chiralAtom2 = atomList.get(2);
		assertEquals(3, chiralAtom2.getAtomNeighbours().size());
		atomParity = chiralAtom2.getAtomParity();
		atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(AtomParity.hydrogen, atomRefs4[1]);
		assertEquals(atomList.get(3), atomRefs4[2]);
		assertEquals(atomList.get(4), atomRefs4[3]);
		assertEquals(-1, atomParity.getParity());
	}

	@Test
	public void chiralityTest6() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("I[C@@](Cl)(Br)F");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(5, atomList.size());
		Atom chiralAtom = atomList.get(1);
		assertEquals(4, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(atomList.get(2), atomRefs4[1]);
		assertEquals(atomList.get(3), atomRefs4[2]);
		assertEquals(atomList.get(4), atomRefs4[3]);
		assertEquals(1, atomParity.getParity());
	}

	@Test
	public void chiralityTest7() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C[S@](N)=O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(1);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(atomList.get(1), atomRefs4[1]);
		assertEquals(atomList.get(2), atomRefs4[2]);
		assertEquals(atomList.get(3), atomRefs4[3]);
		assertEquals(-1, atomParity.getParity());
	}

	@Test
	public void chiralityTest8() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("[S@](C)(N)=O");
		List<Atom> atomList = fragment.getAtomList();
		assertEquals(4, atomList.size());
		Atom chiralAtom = atomList.get(0);
		assertEquals(3, chiralAtom.getAtomNeighbours().size());
		AtomParity atomParity = chiralAtom.getAtomParity();
		Atom[] atomRefs4 = atomParity.getAtomRefs4();
		assertEquals(atomList.get(0), atomRefs4[0]);
		assertEquals(atomList.get(1), atomRefs4[1]);
		assertEquals(atomList.get(2), atomRefs4[2]);
		assertEquals(atomList.get(3), atomRefs4[3]);
		assertEquals(-1, atomParity.getParity());
	}

	@Test
	public void testDoubleBondStereo1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F/C=C/F");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F\\C=C/F");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo3() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C(/F)=C/F");
		Bond b = fragment.findBond(1, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C(\\F)=C/F");
		Bond b = fragment.findBond(1, 3);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo5a() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CC1=C/F.O\\1");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo5b() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CC/1=C/F.O1");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondStereo6() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("CC1=C/F.O/1");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondMultiStereo1() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F/C=C/C=C/C");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
		b = fragment.findBond(4, 5);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondMultiStereo2() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F/C=C\\C=C/C");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
		b = fragment.findBond(4, 5);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondMultiStereo3() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F/C=C\\C=C\\C");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
		b = fragment.findBond(4, 5);
		assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue());
	}

	@Test
	public void testDoubleBondMultiStereo4() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("F/C=C\\C=CC");
		Bond b = fragment.findBond(2, 3);
		assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue());
		b = fragment.findBond(4, 5);
		assertEquals(null, b.getBondStereo());
	}

	//From http://baoilleach.blogspot.com/2010/09/are-you-on-my-side-or-not-its-ez-part.html
	@Test
	public void testDoubleBondNoela() throws StructureBuildingException {
		Fragment fragment = sBuilder.build("C/C=C\\1/NC1");
		Bond b = fragment.findBond(2, 3);
		if (BondStereoValue.TRANS.equals(
b.getBondStereo().getBondStereoValue())){ assertEquals("1 2 3 4", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } else{ assertEquals("1 2 3 5", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } } @Test public void testDoubleBondNoelb() throws StructureBuildingException { Fragment fragment = sBuilder.build("C/C=C1/NC1"); Bond b =fragment.findBond(2, 3); assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue()); assertEquals("1 2 3 4", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } @Test public void testDoubleBondNoelc() throws StructureBuildingException { Fragment fragment = sBuilder.build("C/C=C\\1/NC/1"); Bond b =fragment.findBond(2, 3); if (BondStereoValue.TRANS.equals( b.getBondStereo().getBondStereoValue())){ assertEquals("1 2 3 4", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } else{ assertEquals("1 2 3 5", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } } @Test public void testDoubleBondNoeld() throws StructureBuildingException { Fragment fragment = sBuilder.build("C/C=C1/NC/1"); Bond b =fragment.findBond(2, 3); if (BondStereoValue.TRANS.equals( b.getBondStereo().getBondStereoValue())){ assertEquals("1 2 3 4", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } else{ assertEquals("1 2 3 5", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } } @Test() public void testDoubleBondNoele() { assertThrows(StructureBuildingException.class, () -> { sBuilder.build("C/C=C\\1\\NC1"); fail("Should throw exception for bad smiles: contradictory double bond configuration"); }); } @Test() public void testDoubleBondNoelf() { assertThrows(StructureBuildingException.class, () -> { sBuilder.build("C/C=C\\1NC\\1"); fail("Should throw exception for bad smiles: contradictory double bond configuration"); }); } @Test() public void testDoubleBondNoelg() { assertThrows(StructureBuildingException.class, () -> { sBuilder.build("C/C=C\\1/NC\\1"); fail("Should throw exception for bad smiles: contradictory double bond configuration"); }); } 
@Test public void testDoubleBondCornerCase1() throws StructureBuildingException { Fragment fragment = sBuilder.build("C\\1NC1=C/C"); Bond b =fragment.findBond(3, 4); assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue()); assertEquals("1 3 4 5", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } @Test public void testDoubleBondCornerCase2() throws StructureBuildingException { Fragment fragment = sBuilder.build("C1NC/1=C/C"); Bond b =fragment.findBond(3, 4); assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue()); assertEquals("1 3 4 5", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } @Test() public void testDoubleBondCornerCase3() { assertThrows(StructureBuildingException.class, () -> { sBuilder.build("C/1=C/CCCCCC/1"); fail("Should throw exception for bad smiles: contradictory double bond configuration"); }); } @Test() public void testDoubleBondCornerCase4() { assertThrows(StructureBuildingException.class, () -> { sBuilder.build("C\\1=C/CCCCCC\\1"); fail("Should throw exception for bad smiles: contradictory double bond configuration"); }); } @Test public void testDoubleBondCornerCase5() throws StructureBuildingException { Fragment fragment = sBuilder.build("C\\1=C/CCCCCC/1"); Bond b = fragment.findBond(1, 2); assertEquals(BondStereoValue.TRANS, b.getBondStereo().getBondStereoValue()); assertEquals("8 1 2 3", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } @Test public void testDoubleBondCornerCase6() throws StructureBuildingException { Fragment fragment = sBuilder.build("C/1=C/CCCCCC\\1"); Bond b = fragment.findBond(1, 2); assertEquals(BondStereoValue.CIS, b.getBondStereo().getBondStereoValue()); assertEquals("8 1 2 3", atomRefsToIdStr(b.getBondStereo().getAtomRefs4())); } private String atomRefsToIdStr(Atom[] atomRefs4) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < atomRefs4.length; i++) { sb.append(atomRefs4[i].getID()); if (i + 1 < atomRefs4.length) { sb.append(' '); } } return sb.toString(); } 
} opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/SMILESWriterTest.java000066400000000000000000000366621451751637500276200ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.fail; import java.util.Collections; import java.util.List; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; public class SMILESWriterTest { private FragmentManager fm; @BeforeEach public void setup(){ IDManager idManager = new IDManager(); fm = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); } @Test public void testRoundTrip1() throws StructureBuildingException { Fragment f = fm.buildSMILES("C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C", smiles); } @Test public void testRoundTrip2() throws StructureBuildingException { Fragment f = fm.buildSMILES("C#N"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C#N", smiles); } @Test public void testRoundTrip3() throws StructureBuildingException { Fragment f = fm.buildSMILES(StringTools.multiplyString("C",200)); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals(StringTools.multiplyString("C",200), smiles); } @Test public void testRoundTrip4() throws StructureBuildingException { Fragment f = fm.buildSMILES("O=C=O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("O=C=O", smiles); } @Test public void testRoundTrip5() throws StructureBuildingException { Fragment f = fm.buildSMILES("CCN(CC)CC"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("CCN(CC)CC", smiles); } @Test public void testRoundTrip6() throws StructureBuildingException { Fragment f = fm.buildSMILES("CC(=O)O"); fm.makeHydrogensExplicit(); String 
smiles = SMILESWriter.generateSmiles(f); assertEquals("CC(=O)O", smiles); } @Test public void testRoundTrip7() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1CCCCC1"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C1CCCCC1", smiles); } @Test public void testRoundTrip8() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1=CC=CC=C1"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C1=CC=CC=C1", smiles); } @Test public void testRoundTrip9() throws StructureBuildingException { Fragment f = fm.buildSMILES("NC(Cl)(Br)C(=O)O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("NC(Cl)(Br)C(=O)O", smiles); } @Test public void testRoundTrip10() throws StructureBuildingException { Fragment f = fm.buildSMILES("[NH4+].[Cl-].F.[He-2]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[NH4+].[Cl-].F.[He-2]", smiles); } @Test public void testRoundTrip11() throws StructureBuildingException { Fragment f = fm.buildSMILES("[NH4+].[Cl-].F.[He-2]"); List atomList = f.getAtomList(); Collections.reverse(atomList); f.reorderAtomCollection(atomList); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[He-2].F.[Cl-].[NH4+]", smiles); } @Test public void testRoundTrip12() throws StructureBuildingException { Fragment f = fm.buildSMILES("CCO.N=O.C#N"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("CCO.N=O.C#N", smiles); } @Test public void testOrganic1() throws StructureBuildingException { Fragment f = fm.buildSMILES("[S]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[S]", smiles); } @Test public void testOrganic2() throws StructureBuildingException { Fragment f = fm.buildSMILES("[S][H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[SH]", smiles); } @Test public void 
testOrganic3() throws StructureBuildingException { Fragment f = fm.buildSMILES("[S]([H])[H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("S", smiles); } @Test public void testOrganic4() throws StructureBuildingException { Fragment f = fm.buildSMILES("[S]([H])([H])[H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[SH3]", smiles); } @Test public void testOrganic5() throws StructureBuildingException { Fragment f = fm.buildSMILES("[S]([H])([H])([H])[H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[SH4]", smiles); } @Test public void testOrganic6() throws StructureBuildingException { Fragment f = fm.buildSMILES("S(F)(F)(F)F"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("S(F)(F)(F)F", smiles); } @Test public void testOrganic7() throws StructureBuildingException { Fragment f = fm.buildSMILES("S([H])(F)(F)(F)(F)F"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("S(F)(F)(F)(F)F", smiles); } @Test public void testOrganic8() throws StructureBuildingException { Fragment f = fm.buildSMILES("S([H])([H])(F)(F)(F)F"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[SH2](F)(F)(F)F", smiles); } @Test public void testOrganic9() throws StructureBuildingException { Fragment f = fm.buildSMILES("S(F)(F)(F)(F)(F)(F)F"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("S(F)(F)(F)(F)(F)(F)F", smiles); } @Test public void testOrganic10() throws StructureBuildingException { Fragment f = fm.buildSMILES("[I]([H])([H])[H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[IH3]", smiles); } @Test public void testCharged1() throws StructureBuildingException { Fragment f = fm.buildSMILES("[CH3+]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[CH3+]", smiles); } @Test public void testCharged2() throws StructureBuildingException { Fragment f = fm.buildSMILES("[Mg+2]"); fm.makeHydrogensExplicit(); String smiles = 
SMILESWriter.generateSmiles(f); assertEquals("[Mg+2]", smiles); } @Test public void testCharged3() throws StructureBuildingException { Fragment f = fm.buildSMILES("[BH4-]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[BH4-]", smiles); } @Test public void testCharged4() throws StructureBuildingException { Fragment f = fm.buildSMILES("[O-2]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[O-2]", smiles); } @Test public void testIsotope() throws StructureBuildingException { Fragment f = fm.buildSMILES("[15NH3]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[15NH3]", smiles); } @Test public void testRGroup1() throws StructureBuildingException { Fragment f = fm.buildSMILES("[R]CC[R]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("*CC*", smiles); } @Test public void testRGroup2() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H][R]"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[H]*", smiles); } @Test public void testRingOpeningsGreaterThan10() throws StructureBuildingException { Fragment f = fm.buildSMILES("C12=C3C4=C5C6=C1C7=C8C9=C1C%10=C%11C(=C29)C3=C2C3=C4C4=C5C5=C9C6=C7C6=C7C8=C1C1=C8C%10=C%10C%11=C2C2=C3C3=C4C4=C5C5=C%11C%12=C(C6=C95)C7=C1C1=C%12C5=C%11C4=C3C3=C5C(=C81)C%10=C23"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C12=C3C4=C5C6=C1C1=C7C8=C9C%10=C%11C(=C28)C3=C3C2=C4C4=C5C5=C8C6=C1C1=C6C7=C9C9=C7C%10=C%10C%11=C3C3=C2C2=C4C4=C5C5=C%11C%12=C(C1=C85)C6=C9C9=C%12C%12=C%11C4=C2C2=C%12C(=C79)C%10=C32", smiles); } @Test public void testHydrogenNotBondedToAnyNonHydrogen1() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H-].[H+]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[H-].[H+]", smiles); } @Test public void 
testHydrogenNotBondedToAnyNonHydrogen2() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H][H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[H][H]", smiles); } @Test public void testHydrogenNotBondedToAnyNonHydrogen3() throws StructureBuildingException { Fragment f = fm.buildSMILES("[2H][H]"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[2H][H]", smiles); } @Test public void testHydrogenNotBondedToAnyNonHydrogen4() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H]B1[H]B([H])[H]1"); String smiles = SMILESWriter.generateSmiles(f); assertEquals("B1[H]B[H]1", smiles); } @Test public void testTetrahedralChirality1() throws StructureBuildingException { Fragment f = fm.buildSMILES("N[C@@H](F)C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("N[C@@H](F)C", smiles); } @Test public void testTetrahedralChirality2() throws StructureBuildingException { Fragment f = fm.buildSMILES("N[C@H](F)C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("N[C@H](F)C", smiles); } @Test public void testTetrahedralChirality3() throws StructureBuildingException { Fragment f = fm.buildSMILES("C2.N1.F3.[C@@H]231"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C[C@H](F)N", smiles); } @Test public void testTetrahedralChirality4() throws StructureBuildingException { Fragment f = fm.buildSMILES("[C@@H]231.C2.N1.F3"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[C@H](C)(N)F", smiles); } @Test public void testTetrahedralChirality5() throws StructureBuildingException { Fragment f = fm.buildSMILES("[C@@H](Cl)1[C@H](C)(F).Br1"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("[C@H](Cl)([C@H](C)F)Br", smiles); } @Test public void testTetrahedralChirality6() throws StructureBuildingException { Fragment f = 
fm.buildSMILES("I[C@@](Cl)(Br)F"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("I[C@@](Cl)(Br)F", smiles); } @Test public void testTetrahedralChirality7() throws StructureBuildingException { Fragment f = fm.buildSMILES("C[S@](N)=O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); assertEquals("C[S@](N)=O", smiles); } @Test public void testDoubleBondSupport1() throws StructureBuildingException { Fragment f = fm.buildSMILES("C/C=C/C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("C/C=C/C") && !smiles.equals("C\\C=C\\C")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport2() throws StructureBuildingException { Fragment f = fm.buildSMILES("C/C=C\\C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("C/C=C\\C") && !smiles.equals("C\\C=C/C")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport3() throws StructureBuildingException { Fragment f = fm.buildSMILES("C/C=C\\C=C/C"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("C/C=C\\C=C/C") && !smiles.equals("C\\C=C/C=C\\C")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport4() throws StructureBuildingException { Fragment f = fm.buildSMILES("ClC(C(=O)[O-])=CC(=CC(=O)[O-])Cl"); fm.makeHydrogensExplicit(); f.findBond(2, 6).setBondStereoElement(new Atom[]{f.getAtomByID(1), f.getAtomByID(2), f.getAtomByID(6), f.getAtomByID(7)}, BondStereoValue.TRANS); f.findBond(7, 8).setBondStereoElement(new Atom[]{f.getAtomByID(12), f.getAtomByID(7), f.getAtomByID(8), f.getAtomByID(9)}, BondStereoValue.TRANS); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("Cl\\C(\\C(=O)[O-])=C\\C(=C/C(=O)[O-])\\Cl") && 
!smiles.equals("Cl/C(/C(=O)[O-])=C/C(=C\\C(=O)[O-])/Cl")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport5() throws StructureBuildingException { Fragment f = fm.buildSMILES("C/C=N\\O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("C/C=N\\O") && !smiles.equals("C\\C=N/O")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport6() throws StructureBuildingException { Fragment f = fm.buildSMILES("O=C(/C=C(C(O)=O)\\C=C/C(O)=O)O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("O=C(/C=C(/C(O)=O)\\C=C/C(O)=O)O") && !smiles.equals("O=C(\\C=C(\\C(O)=O)/C=C\\C(O)=O)O")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport7() throws StructureBuildingException { Fragment f = fm.buildSMILES("C(=C(C=CC(=O)O)C(=O)O)C(=O)O"); fm.makeHydrogensExplicit(); f.findBond(1, 2).setBondStereoElement(new Atom[]{f.getAtomByID(11), f.getAtomByID(1), f.getAtomByID(2), f.getAtomByID(8)}, BondStereoValue.TRANS); f.findBond(3, 4).setBondStereoElement(new Atom[]{f.getAtomByID(2), f.getAtomByID(3), f.getAtomByID(4), f.getAtomByID(5)}, BondStereoValue.TRANS); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("C(=C(/C=C/C(=O)O)\\C(=O)O)/C(=O)O") && !smiles.equals("C(=C(\\C=C\\C(=O)O)/C(=O)O)\\C(=O)O")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testDoubleBondSupport8() throws StructureBuildingException { //hydrogen on the nitrogen must be explicit! 
Fragment f = fm.buildSMILES("[H]/N=C(\\N)/O"); fm.makeHydrogensExplicit(); String smiles = SMILESWriter.generateSmiles(f); if (!smiles.equals("[H]/N=C(\\N)/O") && !smiles.equals("[H]\\N=C(/N)\\O")){ fail(smiles +" did not correspond to one of the expected SMILES strings"); } } @Test public void testLabelling1() throws StructureBuildingException { Fragment f = fm.buildSMILES("CCC", "", XmlDeclarations.NONE_LABELS_VAL); for (Atom a : f) { assertEquals(0, a.getLocants().size()); } Fragment f2 = fm.buildSMILES("CCC", "", ""); for (Atom a : f2) { assertEquals(0, a.getLocants().size()); } } @Test public void testLabelling2() throws StructureBuildingException { Fragment f = fm.buildSMILES("CCC", "", "1/2,alpha,2'/"); List<Atom> atoms = f.getAtomList(); assertEquals(1, atoms.get(0).getLocants().size()); assertEquals(3, atoms.get(1).getLocants().size()); assertEquals(0, atoms.get(2).getLocants().size()); assertEquals("1", atoms.get(0).getLocants().get(0)); assertEquals("2", atoms.get(1).getLocants().get(0)); assertEquals("alpha", atoms.get(1).getLocants().get(1)); assertEquals("2'", atoms.get(1).getLocants().get(2)); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/SSSRTest.java000066400000000000000000000011561451751637500262070ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import java.util.List; import org.junit.jupiter.api.Test; public class SSSRTest { @Test public void testFindSSSR() throws Exception { NameToStructure n2s = NameToStructure.getInstance(); Fragment f = n2s.parseChemicalName("violanthrene").getStructure(); List rings = SSSRFinder.getSetOfSmallestRings(f); assertEquals(9, rings.size()); f = n2s.parseChemicalName("aceanthrene").getStructure(); rings = SSSRFinder.getSetOfSmallestRings(f); assertEquals(4, rings.size()); } } 
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/StereochemistryTest.java000066400000000000000000001103621451751637500306060ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNotSame; import static org.junit.jupiter.api.Assertions.assertNull; import static org.junit.jupiter.api.Assertions.assertThrows; import static org.mockito.Mockito.mock; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; import org.hamcrest.MatcherAssert; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Disabled; import org.junit.jupiter.api.Test; import org.hamcrest.CoreMatchers; import java.util.Iterator; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; import uk.ac.cam.ch.wwmm.opsin.OpsinResult.OPSIN_RESULT_STATUS; import uk.ac.cam.ch.wwmm.opsin.OpsinWarning.OpsinWarningType; import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoBond; import uk.ac.cam.ch.wwmm.opsin.StereoAnalyser.StereoCentre; public class StereochemistryTest { private FragmentManager fm; @BeforeEach public void setup() { IDManager idManager = new IDManager(); fm = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); } private static NameToStructure n2s; @BeforeAll public static void intialSetup() { n2s = NameToStructure.getInstance(); } @AfterAll public static void cleanUp(){ n2s = null; } /* * Tests for finding stereo centres */ @Test public void findStereoCentresBromoChloroFluoroMethane() { Fragment f = n2s.parseChemicalName("bromochlorofluoromethane").getStructure(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); assertEquals(1, stereoAnalyser.findStereoCentres().size()); assertEquals(0, stereoAnalyser.findStereoBonds().size()); StereoCentre sc = 
stereoAnalyser.findStereoCentres().get(0); assertNotNull(sc.getStereoAtom()); Atom stereoAtom = sc.getStereoAtom(); assertEquals(ChemEl.C, stereoAtom.getElement()); assertEquals(4, stereoAtom.getID()); } @Test public void findStereoCentresNacetylleucine() throws CipOrderingException { Fragment f = n2s.parseChemicalName("N-acetylleucine").getStructure(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); assertEquals(1, stereoAnalyser.findStereoCentres().size()); assertEquals(0, stereoAnalyser.findStereoBonds().size()); StereoCentre sc = stereoAnalyser.findStereoCentres().get(0); assertNotNull(sc.getStereoAtom()); Atom stereoAtom = sc.getStereoAtom(); assertEquals(ChemEl.C, stereoAtom.getElement()); List<Atom> neighbours = sc.getCipOrderedAtoms(); for (int i = 0; i < neighbours.size(); i++) { Atom a = neighbours.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(ChemEl.C, a.getElement()); } else if (i==2){ assertEquals(ChemEl.C, a.getElement()); } else if (i==3){ assertEquals(ChemEl.N, a.getElement()); } } } @Test public void findStereoCentresBut2ene() { Fragment f = n2s.parseChemicalName("but-2-ene").getStructure(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); assertEquals(0, stereoAnalyser.findStereoCentres().size()); assertEquals(1, stereoAnalyser.findStereoBonds().size()); StereoBond sb = stereoAnalyser.findStereoBonds().get(0); Bond stereoBond = sb.getBond(); assertNotNull(stereoBond); Atom stereoAtom1 = stereoBond.getFromAtom(); Atom stereoAtom2 = stereoBond.getToAtom(); assertNotNull(stereoAtom1); assertNotNull(stereoAtom2); assertNotSame(stereoAtom1, stereoAtom2); if (stereoAtom1.getID() == 2){ assertEquals(3, stereoAtom2.getID()); } else{ assertEquals(2, stereoAtom2.getID()); assertEquals(3, stereoAtom1.getID()); } } /* * Tests for applying stereochemistry */ @Test public void applyStereochemistryLocantedZ() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(2Z)-but-2-ene").getStructure(); 
Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.CIS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryLocantedE() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(2E)-but-2-ene").getStructure(); Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.TRANS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryUnlocantedZ() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(Z)-but-2-ene").getStructure(); Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.CIS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryUnlocantedE() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(E)-but-2-ene").getStructure(); Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", 
atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.TRANS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryCis() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("cis-but-2-ene").getStructure(); Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.CIS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryTrans() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("trans-but-2-ene").getStructure(); Atom atom2 = f.getAtomByLocant("2"); Atom atom3 = f.getAtomByLocant("3"); assertNotNull(atom2); assertNotNull(atom3); Bond chiralBond = atom2.getBondToAtom(atom3); assertNotNull(chiralBond); BondStereo bondStereo = chiralBond.getBondStereo(); assertNotNull(bondStereo); assertEquals("1 2 3 4", atomRefsToIdStr(bondStereo.getAtomRefs4())); assertEquals(BondStereoValue.TRANS, bondStereo.getBondStereoValue()); } @Test public void applyStereochemistryLocantedRS() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(1S,2R)-2-(methylamino)-1-phenylpropan-1-ol").getStructure(); List<Atom> atomList = f.getAtomList(); List<Atom> stereoAtoms = new ArrayList<>(); for (Atom atom : atomList) { if (atom.getAtomParity() != null){ stereoAtoms.add(atom); } } assertEquals(2, stereoAtoms.size()); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); List<StereoCentre> stereoCentres = stereoAnalyser.findStereoCentres(); assertEquals(2, stereoCentres.size()); if (stereoCentres.get(0).getStereoAtom().equals(stereoAtoms.get(0))){ assertEquals(stereoCentres.get(1).getStereoAtom(), stereoAtoms.get(1)); } else{ assertEquals(stereoCentres.get(0).getStereoAtom(), stereoAtoms.get(1)); 
assertEquals(stereoCentres.get(1).getStereoAtom(), stereoAtoms.get(0)); } } /** * Check the number of stereo atoms in a molecule name are assigned to the * correct groups. * @param name chemical name * @param nRacExp number of stereo centers expected to be racemic * @param nRelExp number of stereo centers expected to be relative * @param nAbsExp number of stereo centers expected to be absolute */ void assertEnhancedStereo(String name, int nRacExp, int nRelExp, int nAbsExp) { Fragment f = n2s.parseChemicalName(name).getStructure(); int nRacAtoms = 0; int nRelAtoms = 0; int nAbsAtoms = 0; for (Atom atom : f) { if (atom.getAtomParity() != null) { if (atom.getStereoGroup().getType() == StereoGroupType.Rac) { nRacAtoms++; } if (atom.getStereoGroup().getType() == StereoGroupType.Rel) { nRelAtoms++; } else if (atom.getStereoGroup().getType() == StereoGroupType.Abs) { nAbsAtoms++; } } } assertEquals(nRacExp, nRacAtoms, "Incorrect number of racemic stereo centers"); assertEquals(nRelExp, nRelAtoms, "Incorrect number of relative stereo centers"); assertEquals(nAbsExp, nAbsAtoms, "Incorrect number of absolute stereo centers"); } @Test public void applyStereochemistryLocantedRSracemic() throws StructureBuildingException { assertEnhancedStereo("(1RS,2SR)-2-(methylamino)-1-phenylpropan-1-ol", 2, 0, 0); } @Test public void applyStereochemistryLocantedRSrel() throws StructureBuildingException { assertEnhancedStereo("(1R*,2S*)-2-(methylamino)-1-phenylpropan-1-ol", 0, 2, 0); } @Test public void applyStereochemistryLocantedPartialRac() throws StructureBuildingException { assertEnhancedStereo("(1RS,2R)-2-(methylamino)-1-phenylpropan-1-ol", 1, 0, 1); } @Test public void applyStereochemistryLocantedPartialRel() throws StructureBuildingException { assertEnhancedStereo("(1R*,2R)-2-(methylamino)-1-phenylpropan-1-ol", 0, 1, 1); } @Test public void applyStereochemistryRacSlash() throws StructureBuildingException { assertEnhancedStereo("(1R/S,2R)-2-(methylamino)-1-phenylpropan-1-ol", 1, 0, 
1); } @Test public void applyStereochemistryRelHatStar() throws StructureBuildingException { assertEnhancedStereo("(1R^*,2S^*)-2-(methylamino)-1-phenylpropan-1-ol", 0, 2, 0); } @Test public void applyStereochemistryRacemicUnlocanted() throws StructureBuildingException { assertEnhancedStereo("rac-1-phenylethan-1-ol", 1, 0, 0); } @Disabled("not allowed") public void applyStereochemistryRacemicMultipleUnlocanted() throws StructureBuildingException { assertEnhancedStereo("rac-2-(methylamino)-1-phenylpropan-1-ol", 2, 0, 0); } @Test public void applyStereochemistryRelUnlocanted() throws StructureBuildingException { assertEnhancedStereo("rel-1-phenylethan-1-ol", 0, 1, 0); } @Test public void applyStereochemistryRelUnlocanted2() throws StructureBuildingException { assertEnhancedStereo("(rac)-1-phenylethan-1-ol", 1, 0, 0); } @Test public void prefixTakesPrecedence() throws StructureBuildingException { assertEnhancedStereo("rac-(R*)-1-phenylethan-1-ol", 1, 0, 0); assertEnhancedStereo("rel-(RS)-1-phenylethan-1-ol", 0, 1, 0); } @Test public void applyStereochemistryLocantedRorS() throws StructureBuildingException {
// just for James Davidson, should be rel-(1R)- or 1R*
assertEnhancedStereo("(1R or S)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol", 0, 1, 0); } @Test public void applyStereochemistryRacCis() throws StructureBuildingException {
// racemic cis
assertEnhancedStereo("rac-cis-N4-(2,2-dimethyl-3,4-dihydro-3-oxo-2H-pyrido[3,2-b][1,4]oxazin-6-yl)-N2-[6-[2,6-dimethylmorpholino)pyridin-3-yl]-5-fluoro-2,4-pyrimidinediamine", 2, 0, 0); } @Test public void applyStereochemistryPlusMinus() throws StructureBuildingException { assertEnhancedStereo("(+/-)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol", 1, 0, 0); assertEnhancedStereo("(±)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol", 1, 0, 0); } @Test public void testBracketNormalisation() throws StereochemistryException { MatcherAssert.assertThat(ComponentGenerator.normaliseBinaryBrackets("(R)-and(S)-"), CoreMatchers.is("(RS)")); 
MatcherAssert.assertThat(ComponentGenerator.normaliseBinaryBrackets("(R,S)-and(S,R)-"), CoreMatchers.is("(RS,SR)")); MatcherAssert.assertThat(ComponentGenerator.normaliseBinaryBrackets("(2R,3S)-and(2S,3S)-"), CoreMatchers.is("(2RS,3S)")); MatcherAssert.assertThat(ComponentGenerator.normaliseBinaryBrackets("(2R,3S)-or(2S,3S)-"), CoreMatchers.is("(2R*,3S)")); } @Test public void applyStereochemistryMultipleBrackets() throws StructureBuildingException { assertEnhancedStereo("(R)- and (S)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol", 1, 0, 0); assertEnhancedStereo("(R)- or (S)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol", 0, 1, 0); assertEnhancedStereo("(R,S)- or (S,S)-2-(methylamino)-1-phenylpropan-1-ol", 0, 1, 1); } @Test public void onlyApplyRacToPostfix() throws StructureBuildingException { assertEnhancedStereo("alanyl-rac-alanine", 1, 0, 1); } @Test public void remoteRacSpecification() throws StructureBuildingException { assertEnhancedStereo("rac-tert-butyl 7-[8-(tert-butoxycarbonylamino)-7-fluoro-3-[[(1S,2S,3R)-3-hydroxy-2,3-dimethyl-cyclobutoxy]carbonylamino]-6-isoquinolyl]-8-methyl-2,3-dihydropyrido[2,3-b][1,4]oxazine-1-carboxylate", 3, 0, 0); assertEnhancedStereo("(+-)-tert-butyl 7-[8-(tert-butoxycarbonylamino)-7-fluoro-3-[[(1S,2S,3R)-3-hydroxy-2,3-dimethyl-cyclobutoxy]carbonylamino]-6-isoquinolyl]-8-methyl-2,3-dihydropyrido[2,3-b][1,4]oxazine-1-carboxylate", 3, 0, 0); }
// US20080015199A1_2830
@Test public void applyStereochemistryRelUnlocantedRAndS() throws StructureBuildingException { assertEnhancedStereo("(R) and (S)-4-{3-[(4-Carbamimidoylphenylamino)-(3,5-dimethoxyphenyl)methyl]-5-oxo-4,5-dihydro-[1,2,4]triazol-1-yl}thiazole-5-carboxylic acid", 1, 0, 0); } @Test public void racemicPeptides() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("DL-alanyl-DL-alanine").getStructure(); Map<StereoGroup, Integer> counter = new HashMap<>(); for (Atom atom : f) { if (atom.getAtomParity() != null) { StereoGroup key = atom.getStereoGroup(); Integer count = counter.get(key); 
counter.put(key, count != null ? count + 1 : 1); } } assertEquals(2, counter.size()); Iterator<Integer> iterator = counter.values().iterator(); assertEquals(1, (int)iterator.next()); assertEquals(1, (int)iterator.next()); } @Test public void racemicCarbohydrates() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("4-O-α-DL-Glucopyranosyl-α-DL-glucose").getStructure(); Map<StereoGroup, Integer> counter = new HashMap<>(); for (Atom atom : f) { if (atom.getAtomParity() != null && atom.getStereoGroup().getType() == StereoGroupType.Rac) { StereoGroup key = atom.getStereoGroup(); Integer count = counter.get(key); counter.put(key, count != null ? count + 1 : 1); } } assertEquals(2, counter.size()); Iterator<Integer> iterator = counter.values().iterator(); assertEquals(4, (int)iterator.next()); assertEquals(4, (int)iterator.next()); } @Test public void avoidCollisionOfRacemicDefinitions() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("DL-alanyl-(RS)-butan-2-ol").getStructure(); Map<StereoGroup, Integer> counter = new HashMap<>(); for (Atom atom : f) { if (atom.getAtomParity() != null) { StereoGroup key = atom.getStereoGroup(); Integer count = counter.get(key); counter.put(key, count != null ? 
count + 1 : 1); } } assertEquals(2, counter.size()); Iterator<Integer> iterator = counter.values().iterator(); assertEquals(1, (int)iterator.next()); assertEquals(1, (int)iterator.next()); } @Test public void applyStereochemistryLocantedRandS() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(1R and S)-1-(1-pentyl-1H-pyrazol-5-yl)ethanol").getStructure(); int nRacAtoms = 0; for (Atom atom : f) { if (atom.getAtomParity() != null && atom.getStereoGroup().getType() == StereoGroupType.Rac) { nRacAtoms++; } } assertEquals(1, nRacAtoms); } @Test public void testCIPpriority1() throws StructureBuildingException { Fragment f = fm.buildSMILES("C(Br)(F)([H])Cl"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(ChemEl.F, a.getElement()); } else if (i==2){ assertEquals(ChemEl.Cl, a.getElement()); } else if (i==3){ assertEquals(ChemEl.Br, a.getElement()); } } } @Test public void testCIPpriority2() throws StructureBuildingException { Fragment f = fm.buildSMILES("C([H])(C1CC1)(C1CCC1)O"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(3, a.getID()); } else if (i==2){ assertEquals(6, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } } @Test public void testCIPpriority3() throws StructureBuildingException { Fragment f = fm.buildSMILES("[C](N)(C1=CC(O)=CC=C1)([H])C2=CC=C(O)C=C2"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(11, a.getID()); } else if 
(i==2){ assertEquals(3, a.getID()); } else if (i==3){ assertEquals(ChemEl.N, a.getElement()); } } } @Test public void testCIPpriority4() throws StructureBuildingException { Fragment f = fm.buildSMILES("[C](N)(C1CC(O)CCC1)([H])C2CCC(O)CC2"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(11, a.getID()); } else if (i==2){ assertEquals(3, a.getID()); } else if (i==3){ assertEquals(ChemEl.N, a.getElement()); } } } @Test public void testCIPpriority5() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1([H])(C(=O)O[H])C([H])([H])SC([H])([H])N([H])1"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(3, a.getID()); } else if (i==2){ assertEquals(7, a.getID()); } else if (i==3){ assertEquals(ChemEl.N, a.getElement()); } } } @Test public void testCIPpriority6() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1([H])(O)C([H])(C([H])([H])[H])OC([H])([H])C([H])([H])C1([H])(O[H])"); List<Atom> cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(17, a.getID()); } else if (i==2){ assertEquals(4, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } } @Test public void testCIPpriority7() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H]OC2([H])(C([H])([H])C([H])([H])C3([H])(C4([H])(C([H])([H])C([H])([H])C1=C([H])C([H])([H])C([H])([H])C([H])([H])C1([H])C4([H])(C([H])([H])C([H])([H])C23(C([H])([H])[H])))))"); List<Atom> cipOrdered = new 
CipSequenceRules(f.getAtomList().get(34)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(37, a.getID()); } else if (i==2){ assertEquals(13, a.getID()); } else if (i==3){ assertEquals(33, a.getID()); } } } @Test public void testCIPpriority8() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("(6aR)-6-phenyl-6,6a-dihydroisoindolo[2,1-a]quinazoline-5,11-dione").getStructure(); List cipOrdered = new CipSequenceRules(f.getAtomByLocant("6a")).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(ChemEl.C, a.getElement()); } else if (i==2){ assertEquals("6", a.getFirstLocant()); } else if (i==3){ assertEquals("12", a.getFirstLocant()); } } } @Test public void testCIPpriority9() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1(C=C)CC1C2=CC=CC=C2"); fm.makeHydrogensExplicit(); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(4, a.getID()); } else if (i==2){ assertEquals(2, a.getID()); } else if (i==3){ assertEquals(5, a.getID()); } } } @Test public void testCIPpriority10() throws StructureBuildingException { Fragment f = fm.buildSMILES("C(O[H])([H])(C1([H])C([H])(F)C([H])(Cl)C([H])([H])C([H])(I)C1([H])([H]))C1([H])C([H])(F)C([H])(Br)C([H])([H])C([H])(Cl)C1([H])([H])"); fm.makeHydrogensExplicit(); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(5, a.getID()); } else if 
(i==2){ assertEquals(22, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } } @Test public void testCIPpriority11() throws StructureBuildingException { Fragment f = fm.buildSMILES("C17C=CC23C45OC6C19.O74.O2C3.C5.C6(C)C.C9"); fm.makeHydrogensExplicit(); //stereocentres at 1,4,5,7,8 List atomList = f.getAtomList(); List cipOrdered = new CipSequenceRules(atomList.get(0)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(2, a.getID()); } else if (i==2){ assertEquals(8, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } cipOrdered = new CipSequenceRules(atomList.get(3)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(3, a.getID()); } else if (i==1){ assertEquals(11, a.getID()); } else if (i==2){ assertEquals(5, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } cipOrdered = new CipSequenceRules(atomList.get(4)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(12, a.getID()); } else if (i==1){ assertEquals(4, a.getID()); } else if (i==2){ assertEquals(6, a.getID()); } else if (i==3){ assertEquals(9, a.getID()); } } cipOrdered = new CipSequenceRules(atomList.get(6)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(13, a.getID()); } else if (i==2){ assertEquals(8, a.getID()); } else if (i==3){ assertEquals(ChemEl.O, a.getElement()); } } cipOrdered = new CipSequenceRules(atomList.get(7)).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ 
assertEquals(16, a.getID()); } else if (i==2){ assertEquals(7, a.getID()); } else if (i==3){ assertEquals(1, a.getID()); } } } @Test public void testCIPpriority12() throws StructureBuildingException { Fragment f = fm.buildSMILES("C1(C)(CCC(=O)N1)CCC(=O)NC(C)C"); fm.makeHydrogensExplicit(); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(2, a.getID()); } else if (i==1){ assertEquals(3, a.getID()); } else if (i==2){ assertEquals(8, a.getID()); } else if (i==3){ assertEquals(ChemEl.N, a.getElement()); } } } @Test public void testCIPpriority13() throws StructureBuildingException { Fragment f = fm.buildSMILES("C(O)(C#CC)C1=CC=CC=C1"); fm.makeHydrogensExplicit(); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(ChemEl.H, a.getElement()); } else if (i==1){ assertEquals(6, a.getID()); } else if (i==2){ assertEquals(3, a.getID()); } else if (i==3){ assertEquals(2, a.getID()); } } } @Test public void testCIPpriority14() throws StructureBuildingException { Fragment f = fm.buildSMILES("C(Cl)([2H])([3H])[H]"); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); for (int i = 0; i < cipOrdered.size(); i++) { Atom a = cipOrdered.get(i); if (i==0){ assertEquals(5, a.getID()); } else if (i==1){ assertEquals(3, a.getID()); } else if (i==2){ assertEquals(4, a.getID()); } else if (i==3){ assertEquals(2, a.getID()); } } } @Test public void testCIPpriority15() throws StructureBuildingException { Fragment f = fm.buildSMILES("C([H])(O)(C(C(F)CCl)CCBr)C(C(F)CF)CCI"); fm.makeHydrogensExplicit(); List cipOrdered = new CipSequenceRules(f.getFirstAtom()).getNeighbouringAtomsInCipOrder(); assertEquals(4, cipOrdered.size()); assertEquals(2, cipOrdered.get(0).getID()); 
assertEquals(12, cipOrdered.get(1).getID()); assertEquals(4, cipOrdered.get(2).getID()); assertEquals(3, cipOrdered.get(3).getID()); } @Test() public void testCipUnassignable() { assertThrows(CipOrderingException.class, () -> { // two sides of ring are identical Fragment f = fm.buildSMILES("NC1(O)CCC(CCC2CCCCC2)CC1"); new CipSequenceRules(f.getAtomList().get(1)).getNeighbouringAtomsInCipOrder(); }); } @Test public void testAtomParityEquivalence1() { Atom a1= new Atom(1, ChemEl.C, mock(Fragment.class)); Atom a2= new Atom(2, ChemEl.C, mock(Fragment.class)); Atom a3= new Atom(3, ChemEl.C, mock(Fragment.class)); Atom a4= new Atom(4, ChemEl.C, mock(Fragment.class)); Atom[] atomRefs1 = new Atom[]{a1,a2,a3,a4}; Atom[] atomRefs2 = new Atom[]{a3,a4,a1,a2}; //2 swaps (4 by bubble sort) assertEquals(true, StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(atomRefs1, 1, atomRefs2, 1)); assertEquals(false, StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(atomRefs1, 1, atomRefs2, -1)); } @Test public void testAtomParityEquivalence2() { Atom a1= new Atom(1, ChemEl.C, mock(Fragment.class)); Atom a2= new Atom(2, ChemEl.C, mock(Fragment.class)); Atom a3= new Atom(3, ChemEl.C, mock(Fragment.class)); Atom a4= new Atom(4, ChemEl.C, mock(Fragment.class)); Atom[] atomRefs1 = new Atom[]{a1,a2,a3,a4}; Atom[] atomRefs2 = new Atom[]{a2,a4,a1,a3}; //3 swaps assertEquals(false, StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(atomRefs1, 1, atomRefs2, 1)); assertEquals(true, StereochemistryHandler.checkEquivalencyOfAtomsRefs4AndParity(atomRefs1, 1, atomRefs2, -1)); } @Test public void testCisTransUnambiguous() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H]C([H])([H])C([H])=C([H])C([H])([H])[H]"); assertEquals(true, StereochemistryHandler.cisTransUnambiguousOnBond(f.findBond(5, 7))); } @Test public void testCisTransAmbiguous() throws StructureBuildingException { Fragment f = fm.buildSMILES("[H]C([H])([H])C(Cl)=C([H])C([H])([H])[H]"); 
assertEquals(false, StereochemistryHandler.cisTransUnambiguousOnBond(f.findBond(5, 7))); } @Test public void testChiralAtomWhichBecomesAchiral() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("alpha-amino-alanine").getStructure(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); assertEquals(0, stereoAnalyser.findStereoCentres().size()); assertEquals(0, stereoAnalyser.findStereoBonds().size()); Atom formerChiralCentre = f.getAtomByLocantOrThrow("alpha"); assertNull(formerChiralCentre.getAtomParity(), "This atom is no longer a chiral centre and hence should not have an associated atom parity"); } @Test public void testChiralBondWhichBecomesAchiral() throws StructureBuildingException { Fragment f = n2s.parseChemicalName("3-methylcrotonic acid").getStructure(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); assertEquals(0, stereoAnalyser.findStereoCentres().size()); assertEquals(0, stereoAnalyser.findStereoBonds().size()); Bond formerChiralBond = f.getAtomByLocantOrThrow("2").getBondToAtomOrThrow(f.getAtomByLocantOrThrow("3")); assertNull(formerChiralBond.getBondStereo(), "This Bond is no longer a chiral centre and hence should not have an associated bond stereo"); } @Test public void testIsTetrahedral() throws StructureBuildingException { assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("C(N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Si](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Ge](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[N+](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[P+](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[As+](N)(O)(Cl)Br").getFirstAtom())); 
assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[B-](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Sn](N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[N](=N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[P](=N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[S](=N)(=O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[S+](=N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[S](=O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[S+](O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("N1(C)(OS1)").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Se](=N)(=O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Se+](=N)(O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Se](=O)(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isKnownPotentiallyStereogenic(fm.buildSMILES("[Se+](O)(Cl)Br").getFirstAtom())); } @Test public void testAchiralDueToResonance() throws StructureBuildingException { assertEquals(true, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=N)(=O)([O-])Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=O)([O-])Br").getFirstAtom())); assertEquals(false, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=S)([O-])Br").getFirstAtom())); assertEquals(false, 
StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("C(N)([O-])(Cl)Br").getFirstAtom())); } @Test public void testAchiralDueToTautomerism() throws StructureBuildingException { assertEquals(true, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=N)(=O)([OH])Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=O)([OH])Br").getFirstAtom())); assertEquals(false, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("[S](=S)([OH])Br").getFirstAtom())); assertEquals(false, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("C(N)([OH])(Cl)Br").getFirstAtom())); assertEquals(true, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("N([H])(CC)(C)").getFirstAtom())); assertEquals(false, StereoAnalyser.isAchiralDueToResonanceOrTautomerism(fm.buildSMILES("N1(C)(OS1)").getFirstAtom())); } @Test public void testFindPseudoAsymmetricCarbon1() throws StructureBuildingException { Fragment f = fm.buildSMILES("OCC(O)C(O)C(O)CO"); fm.makeHydrogensExplicit(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); List stereoCentres = stereoAnalyser.findStereoCentres(); assertEquals(3, stereoCentres.size()); for (int i = 0; i < stereoCentres.size(); i++) { StereoCentre stereocentre = stereoCentres.get(i); if (i < 2){ assertEquals(true, stereocentre.isTrueStereoCentre()); } else{ assertEquals(false, stereocentre.isTrueStereoCentre()); assertEquals(5, stereocentre.getStereoAtom().getID()); } } } @Test public void testFindPseudoAsymmetricCarbon2() throws StructureBuildingException { Fragment f = fm.buildSMILES("OCC(O)C(C(Cl)(Br)C)(C(Cl)(Br)C)C(O)CO"); fm.makeHydrogensExplicit(); StereoAnalyser stereoAnalyser = new StereoAnalyser(f); List stereoCentres = stereoAnalyser.findStereoCentres(); assertEquals(5, stereoCentres.size()); for (int i = 0; i < stereoCentres.size(); i++) { StereoCentre stereocentre = stereoCentres.get(i); if (i <4){ 
assertEquals(true, stereocentre.isTrueStereoCentre()); } else{ assertEquals(false, stereocentre.isTrueStereoCentre()); assertEquals(5, stereocentre.getStereoAtom().getID()); } } } @Test public void testAmbiguousStereoTerm() { OpsinResult result = n2s.parseChemicalName("trans-N-[2-Chloro-5-(2-methoxyethyl)benzyl]-N-cyclopropyl-4-hydroxy-4-(1-methyl-2-oxo-1,2-dihydro-4-pyridinyl)-3-piperidinecarboxamide"); assertEquals(OPSIN_RESULT_STATUS.WARNING, result.getStatus()); assertEquals(1, result.getWarnings().size()); assertEquals(OpsinWarningType.APPEARS_AMBIGUOUS, result.getWarnings().get(0).getType()); } private String atomRefsToIdStr(Atom[] atomRefs4) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < atomRefs4.length; i++) { sb.append(atomRefs4[i].getID()); if (i + 1 < atomRefs4.length) { sb.append(' '); } } return sb.toString(); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/StructureBuildingMethodsTest.java000066400000000000000000000141251451751637500324170ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.Set; import org.junit.jupiter.api.Test; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.mockito.Mockito.mock; public class StructureBuildingMethodsTest { @Test public void bracketedPrimeNotSpecialCase() { Element word = new GroupingEl(WORD_EL); Element substituent = new GroupingEl(SUBSTITUENT_EL); word.addChild(substituent); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4'")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4''")); } @Test public void bracketedPrimeSpecialCase1() { Element word = new GroupingEl(WORD_EL); Element bracket = new GroupingEl(BRACKET_EL); word.addChild(bracket); Element substituent = new 
GroupingEl(SUBSTITUENT_EL); bracket.addChild(substituent); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4")); assertEquals("4", StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4'")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4''")); bracket.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4'")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4''")); } @Test public void bracketedPrimeSpecialCase2() { Element word = new GroupingEl(WORD_EL); Element bracket = new GroupingEl(BRACKET_EL); word.addChild(bracket); Element bracket2 = new GroupingEl(BRACKET_EL); bracket.addChild(bracket2); Element substituent = new GroupingEl(SUBSTITUENT_EL); bracket2.addChild(substituent); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4'")); assertEquals("4", StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4''")); bracket2.addAttribute(new Attribute(TYPE_ATR, IMPLICIT_TYPE_VAL)); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4")); assertEquals("4", StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4'")); assertEquals(null, StructureBuildingMethods.checkForBracketedPrimedLocantSpecialCase(substituent, "4''")); } @Test public void notPhosphoSubstitution() throws StructureBuildingException { //standard unlocanted substitution BuildState state = new BuildState(mock(NameToStructureConfig.class)); Element word = new 
GroupingEl(WORD_EL); Element amino = new TokenEl(GROUP_EL); Fragment aminoFrag = state.fragManager.buildSMILES("-N"); amino.setFrag(aminoFrag); Element substituent = new GroupingEl(SUBSTITUENT_EL); substituent.addChild(amino); Element methanol = new TokenEl(GROUP_EL); methanol.setFrag(state.fragManager.buildSMILES("CO")); Element root = new GroupingEl(ROOT_EL); root.addChild(methanol); word.addChild(substituent); word.addChild(root); StructureBuildingMethods.resolveRootOrSubstituentUnLocanted(state, substituent); Set interFragmentBonds = state.fragManager.getInterFragmentBonds(aminoFrag); assertEquals(1, interFragmentBonds.size()); assertEquals(ChemEl.C, interFragmentBonds.iterator().next().getOtherAtom(aminoFrag.getFirstAtom()).getElement()); } @Test public void phosphoUnlocantedSubstitution() throws StructureBuildingException { BuildState state = new BuildState(mock(NameToStructureConfig.class)); Element word = new GroupingEl(WORD_EL); Element phospho = new TokenEl(GROUP_EL); phospho.addAttribute(new Attribute(SUBTYPE_ATR, PHOSPHO_SUBTYPE_VAL)); Fragment phosphoFrag = state.fragManager.buildSMILES("-P(=O)O"); phospho.setFrag(phosphoFrag); Element substituent = new GroupingEl(SUBSTITUENT_EL); substituent.addChild(phospho); Element methanol = new TokenEl(GROUP_EL); methanol.setFrag(state.fragManager.buildSMILES("CO")); Element root = new GroupingEl(ROOT_EL); root.addChild(methanol); word.addChild(substituent); word.addChild(root); StructureBuildingMethods.resolveRootOrSubstituentUnLocanted(state, substituent); Set interFragmentBonds = state.fragManager.getInterFragmentBonds(phosphoFrag); assertEquals(1, interFragmentBonds.size()); assertEquals(ChemEl.O, interFragmentBonds.iterator().next().getOtherAtom(phosphoFrag.getFirstAtom()).getElement()); } @Test public void phosphoLocantedSubstitution() throws StructureBuildingException { BuildState state = new BuildState(mock(NameToStructureConfig.class)); Element word = new GroupingEl(WORD_EL); Element phospho = new 
TokenEl(GROUP_EL); phospho.addAttribute(new Attribute(SUBTYPE_ATR, PHOSPHO_SUBTYPE_VAL)); Fragment phosphoFrag = state.fragManager.buildSMILES("-P(=O)O"); phospho.setFrag(phosphoFrag); Element substituent = new GroupingEl(SUBSTITUENT_EL); substituent.addAttribute(new Attribute(LOCANT_ATR, "4")); substituent.addChild(phospho); Element methanol = new TokenEl(GROUP_EL); methanol.setFrag(state.fragManager.buildSMILES("CCCCO",methanol,"1/2/3/4/")); Element root = new GroupingEl(ROOT_EL); root.addChild(methanol); word.addChild(substituent); word.addChild(root); StructureBuildingMethods.resolveRootOrSubstituentLocanted(state, substituent); Set interFragmentBonds = state.fragManager.getInterFragmentBonds(phosphoFrag); assertEquals(1, interFragmentBonds.size()); assertEquals(ChemEl.O, interFragmentBonds.iterator().next().getOtherAtom(phosphoFrag.getFirstAtom()).getElement()); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/TokenizerTest.java000066400000000000000000000426141451751637500273730ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import java.io.IOException; import java.util.List; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; public class TokenizerTest { private static Tokeniser tokenizer; private static ReverseParseRules reverseParseRules; @BeforeAll public static void setUp() throws IOException{ ResourceGetter rg = new ResourceGetter("uk/ac/cam/ch/wwmm/opsin/resources/"); ResourceManager rm = new ResourceManager(rg); tokenizer = new Tokeniser(new ParseRules(rm)); reverseParseRules = new ReverseParseRules(rm); } @AfterAll public static void cleanUp(){ tokenizer = null; reverseParseRules = null; } @Test public void hexane() throws ParsingException{ TokenizationResult result= tokenizer.tokenize("hexane", true); assertEquals(true, result.isSuccessfullyTokenized()); assertEquals(true, 
result.isFullyInterpretable()); assertEquals("", result.getUninterpretableName()); assertEquals("", result.getUnparsableName()); assertEquals("", result.getUnparsedName()); Parse parse = result.getParse(); assertEquals(1, parse.getWords().size(), "One Word"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("hex", tokens.get(0), "First token: hex"); assertEquals("ane", tokens.get(1), "Second token: ane"); assertEquals("", tokens.get(2), "Third token: end of main group"); } @Test public void hexachlorohexane() throws ParsingException{ Parse parse = tokenizer.tokenize("hexachlorohexane", true).getParse(); assertEquals(1, parse.getWords().size(), "One Word"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(7, tokens.size(), "Seven tokens"); assertEquals("hex", tokens.get(0), "First token: hex"); assertEquals("a", tokens.get(1), "Second token: a"); assertEquals("chloro", tokens.get(2), "Third token: chloro"); assertEquals("", tokens.get(3), "Fourth token: end of main substituent"); assertEquals("hex", tokens.get(4), "Fifth token: hex"); assertEquals("ane", tokens.get(5), "Sixth token: ane"); assertEquals("", tokens.get(6), "Seventh token: end of main group"); } @Test public void ethylChloride() throws ParsingException { Parse parse = tokenizer.tokenize("ethyl chloride", true).getParse(); assertEquals(2, parse.getWords().size(), "Two Words"); ParseWord w = parse.getWord(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("eth", tokens.get(0), "First token: eth"); assertEquals("yl", tokens.get(1), "Second token: yl"); assertEquals("", tokens.get(2), "Third token: end of 
substituent"); w = parse.getWord(1); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("chloride", tokens.get(0), "First token: chloride"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); parse = tokenizer.tokenize("ethylchloride", true).getParse();//missing space assertEquals(2, parse.getWords().size(), "Two Words"); w = parse.getWord(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("eth", tokens.get(0), "First token: eth"); assertEquals("yl", tokens.get(1), "Second token: yl"); assertEquals("", tokens.get(2), "Third token: end of substituent"); w = parse.getWord(1); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("chloride", tokens.get(0), "First token: chloride"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); } @Test public void hexachlorohexaneeeeeee() throws ParsingException{ TokenizationResult result = tokenizer.tokenize("hexachlorohexaneeeeeee", true); assertEquals(false, result.isSuccessfullyTokenized(), "Unparsable"); } @Test public void bracketedHexachlorohexane() throws ParsingException{ Parse parse = tokenizer.tokenize("(hexachloro)hexane", true).getParse(); assertEquals(1, parse.getWords().size(), "One Word"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(9, tokens.size(),"Nine tokens"); assertEquals("(", tokens.get(0), "First token: ("); assertEquals("hex", tokens.get(1), "Second token: hex"); assertEquals("a", tokens.get(2), "Third token: a"); assertEquals("chloro", tokens.get(3), "Fourth token: chloro"); assertEquals(")", tokens.get(4), 
"Fifth token: )"); assertEquals("", tokens.get(5), "Sixth token: end of main substituent"); assertEquals("hex", tokens.get(6), "Seventh token: hex"); assertEquals("ane", tokens.get(7), "Eighth token: ane"); assertEquals("", tokens.get(8), "Ninth token: end of main group"); } @Test public void methyl() throws ParsingException{ Parse parse = tokenizer.tokenize("methyl", true).getParse(); assertEquals(1, parse.getWords().size(), "One Word"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List<String> tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("meth", tokens.get(0), "First token: meth"); assertEquals("yl", tokens.get(1), "Second token: yl"); assertEquals("", tokens.get(2), "Third token: end of substituent"); } @Test public void aceticacid() throws ParsingException{ Parse parse = tokenizer.tokenize("acetic acid", true).getParse(); assertEquals(1, parse.getWords().size(), "One Word"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List<String> tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("acet", tokens.get(0), "First token: acet"); assertEquals("ic acid", tokens.get(1), "Second token: ic acid"); assertEquals("", tokens.get(2), "Third token: end of main group"); } @Test public void acceptableInterWordBreaks() throws ParsingException{ assertEquals(true, tokenizer.tokenize("methane ethane", false).isSuccessfullyTokenized()); assertEquals(true, tokenizer.tokenize("methane-ethane", false).isSuccessfullyTokenized()); assertEquals(true, tokenizer.tokenize("methane - ethane", false).isSuccessfullyTokenized()); assertEquals(false, tokenizer.tokenize("methane -ethane", false).isSuccessfullyTokenized()); assertEquals(false, tokenizer.tokenize("methane - ", false).isSuccessfullyTokenized()); assertEquals(true, tokenizer.tokenizeRightToLeft(reverseParseRules, "methane ethane", 
false).isSuccessfullyTokenized()); assertEquals(true, tokenizer.tokenizeRightToLeft(reverseParseRules, "methane-ethane", false).isSuccessfullyTokenized()); assertEquals(true, tokenizer.tokenizeRightToLeft(reverseParseRules, "methane - ethane", false).isSuccessfullyTokenized()); assertEquals(false, tokenizer.tokenizeRightToLeft(reverseParseRules, "methane -ethane", false).isSuccessfullyTokenized()); assertEquals(false, tokenizer.tokenizeRightToLeft(reverseParseRules, "methane - ", false).isSuccessfullyTokenized()); } @Test public void compoundWithValidUse() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("benzene compound with toluene", true); assertEquals(true, result.isSuccessfullyTokenized()); TokenizationResult result2 =tokenizer.tokenize("benzene and toluene", true); assertEquals(true, result2.isSuccessfullyTokenized()); } @Test public void compoundWithInvalidUse1() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl and toluene", true); assertEquals(false, result.isSuccessfullyTokenized()); } @Test public void compoundWithInvalidUse2() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("and benzene", true); assertEquals(false, result.isSuccessfullyTokenized()); } @Test public void CCCP() throws ParsingException{ TokenizationResult result = tokenizer.tokenize("Carbonyl cyanide m-chlorophenyl oxime", true); assertEquals(true, result.isSuccessfullyTokenized()); assertEquals(true, result.isFullyInterpretable()); assertEquals("", result.getUninterpretableName()); assertEquals("", result.getUnparsableName()); assertEquals("", result.getUnparsedName()); Parse parse = result.getParse(); assertEquals(4, parse.getWords().size(), "Four Words"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("carbon", tokens.get(0), "First token: carbon"); 
assertEquals("yl", tokens.get(1), "Second token: yl"); assertEquals("", tokens.get(2), "Third token: end of substituent"); w = parse.getWords().get(1); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("cyanide", tokens.get(0), "First token: cyanide"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); w = parse.getWords().get(2); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(5, tokens.size(), "Five tokens"); assertEquals("m-", tokens.get(0), "First token: m-"); assertEquals("chloro", tokens.get(1), "Second token: chloro"); assertEquals("", tokens.get(2), "Third token: end of substituent"); assertEquals("phenyl", tokens.get(3), "Fourth token: phenyl"); assertEquals("", tokens.get(4), "Fifth token: end of substituent"); w = parse.getWords().get(3); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("oxime", tokens.get(0), "First token: oxime"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); } @Test public void CCCP_RL() throws ParsingException{ TokenizationResult result = tokenizer.tokenizeRightToLeft(reverseParseRules, "Carbonyl cyanide m-chlorophenyl oxime", true); assertEquals(true, result.isSuccessfullyTokenized()); assertEquals(true, result.isFullyInterpretable()); assertEquals("", result.getUninterpretableName()); assertEquals("", result.getUnparsableName()); assertEquals("", result.getUnparsedName()); Parse parse = result.getParse(); assertEquals(4, parse.getWords().size(), "Four Words"); ParseWord w = parse.getWords().get(0); assertEquals(1, w.getParseTokens().size(), "One Parse"); List tokens = w.getParseTokens().get(0).getTokens(); assertEquals(3, tokens.size(), "Three tokens"); assertEquals("carbon", tokens.get(0), 
"First token: carbon"); assertEquals("yl", tokens.get(1), "Second token: yl"); assertEquals("", tokens.get(2), "Third token: end of substituent"); w = parse.getWords().get(1); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("cyanide", tokens.get(0), "First token: cyanide"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); w = parse.getWords().get(2); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(5, tokens.size(), "Five tokens"); assertEquals("m-", tokens.get(0), "First token: m-"); assertEquals("chloro", tokens.get(1), "Second token: chloro"); assertEquals("", tokens.get(2), "Third token: end of substituent"); assertEquals("phenyl", tokens.get(3), "Fourth token: phenyl"); assertEquals("", tokens.get(4), "Fifth token: end of substituent"); w = parse.getWords().get(3); assertEquals(1, w.getParseTokens().size(), "One Parse"); tokens = w.getParseTokens().get(0).getTokens(); assertEquals(2, tokens.size(), "Two tokens"); assertEquals("oxime", tokens.get(0), "First token: oxime"); assertEquals("", tokens.get(1), "Second token: end of functionalTerm"); } @Test public void partiallyInterpretatableLR() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl-2H-foo|ene", true); assertEquals(false, result.isSuccessfullyTokenized()); assertEquals(false, result.isFullyInterpretable()); assertEquals("2H-foo|ene", result.getUninterpretableName()); assertEquals("foo|ene", result.getUnparsableName()); assertEquals("ethyl-2H-foo|ene", result.getUnparsedName()); } @Test public void partiallyInterpretatableRL1() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl-2H-foo|ene", true); assertEquals(false, result.isSuccessfullyTokenized()); assertEquals(false, result.isFullyInterpretable()); 
assertEquals("ethyl-2H-foo|ene", result.getUninterpretableName()); assertEquals("ethyl-2H-foo|", result.getUnparsableName()); assertEquals("ethyl-2H-foo|ene", result.getUnparsedName()); } @Test public void partiallyInterpretatableRL2() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "fooylpyridine oxide", true); assertEquals(false, result.isSuccessfullyTokenized()); assertEquals(false, result.isFullyInterpretable()); assertEquals("fooyl", result.getUninterpretableName()); assertEquals("f", result.getUnparsableName());//o as in the end of thio then oyl assertEquals("fooylpyridine", result.getUnparsedName()); } @Test public void tokenizeDoesNotTokenizeUnTokenizableName() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl acet|foo toluene", true); assertEquals(false, result.isSuccessfullyTokenized()); } @Test public void tokenizePreservesSpacesInUninterpretableNameLR1() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl acet|foo toluene", true); assertEquals("acet|foo toluene", result.getUninterpretableName()); } @Test public void tokenizePreservesSpacesInUnparsableNameLR1() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl acet|foo toluene", true); assertEquals("|foo toluene", result.getUnparsableName()); } @Test public void tokenizePreservesSpacesInUnparsedNameLR1() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("ethyl acet|foo toluene", true); assertEquals("acet|foo toluene", result.getUnparsedName()); } @Test public void tokenizePreservesSpacesInUninterpretableNameLR2() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("eth yl acet|foo toluene", true); assertEquals("acet|foo toluene", result.getUninterpretableName()); } @Test public void tokenizePreservesSpacesInUnparsableNameLR2() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("eth yl acet|foo toluene", true); 
assertEquals("|foo toluene", result.getUnparsableName()); } @Test public void tokenizePreservesSpacesInUnparsedNameLR2() throws ParsingException{ TokenizationResult result =tokenizer.tokenize("eth yl acet|foo toluene", true); assertEquals("acet|foo toluene", result.getUnparsedName()); } @Test public void tokenizePreservesSpacesInUninterpretableNameRL1() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl toluene", true); assertEquals("ethyl foo|yl", result.getUninterpretableName()); } @Test public void tokenizePreservesSpacesInUnparsableNameRL1() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl toluene", true); assertEquals("ethyl foo|", result.getUnparsableName()); } @Test public void tokenizePreservesSpacesInUnparsedNameRL1() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl toluene", true); assertEquals("ethyl foo|yl", result.getUnparsedName()); } @Test public void tokenizePreservesSpacesInUninterpretableNameRL2() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl tolu ene", true); assertEquals("ethyl foo|yl", result.getUninterpretableName()); } @Test public void tokenizePreservesSpacesInUnparsableNameRL2() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl tolu ene", true); assertEquals("ethyl foo|", result.getUnparsableName()); } @Test public void tokenizePreservesSpacesInUnparsedNameRL2() throws ParsingException{ TokenizationResult result =tokenizer.tokenizeRightToLeft(reverseParseRules, "ethyl foo|yl tolu ene", true); assertEquals("ethyl foo|yl", result.getUnparsedName()); } } 
opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/UninterpretableNameTest.java000066400000000000000000000017011451751637500313550ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.params.ParameterizedTest; import org.junit.jupiter.params.provider.CsvFileSource; import uk.ac.cam.ch.wwmm.opsin.OpsinResult.OPSIN_RESULT_STATUS; public class UninterpretableNameTest { private static NameToStructure n2s; @BeforeAll public static void setUp() { n2s = NameToStructure.getInstance(); } @AfterAll public static void cleanUp(){ n2s = null; } @ParameterizedTest @CsvFileSource(resources ="uninterpretable.txt", delimiter='\t') public void testNamesThatShoudlBeUninterpretable(String uninterpretablName) { OpsinResult result = n2s.parseChemicalName(uninterpretablName); assertEquals(OPSIN_RESULT_STATUS.FAILURE, result.getStatus(), uninterpretablName + " should be uninterpretable"); } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/VerifyFragmentsTest.java000066400000000000000000000076071451751637500305370ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import javax.xml.stream.XMLStreamConstants; import javax.xml.stream.XMLStreamException; import javax.xml.stream.XMLStreamReader; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; import static org.junit.jupiter.api.Assertions.fail; import static uk.ac.cam.ch.wwmm.opsin.XmlDeclarations.*; public class VerifyFragmentsTest { private static ResourceGetter resourceGetter; private static FragmentManager fm; @BeforeAll public static void setUp() { resourceGetter = new ResourceGetter("uk/ac/cam/ch/wwmm/opsin/resources/"); IDManager idManager = new IDManager(); fm = new FragmentManager(new SMILESFragmentBuilder(idManager), idManager); } @AfterAll public static void 
cleanUp(){ resourceGetter = null; fm = null; } @Test public void verifySMILES() throws Exception { XMLStreamReader indexReader = resourceGetter.getXMLStreamReader("index.xml"); while (indexReader.hasNext()) { if (indexReader.next() == XMLStreamConstants.START_ELEMENT && indexReader.getLocalName().equals("tokenFile")) { XMLStreamReader tokenReader = resourceGetter.getXMLStreamReader(indexReader.getElementText()); while (tokenReader.hasNext()) { if (tokenReader.next() == XMLStreamConstants.START_ELEMENT) { String tagName = tokenReader.getLocalName(); if (tagName.equals("tokenLists")) { while (tokenReader.hasNext()) { switch (tokenReader.next()) { case XMLStreamConstants.START_ELEMENT: if (tokenReader.getLocalName().equals("tokenList")) { verifySmilesInTokenList(tokenReader); } break; } } } else if (tagName.equals("tokenList")) { verifySmilesInTokenList(tokenReader); } } } } } indexReader.close(); } private void verifySmilesInTokenList(XMLStreamReader reader) throws XMLStreamException { String tagname = reader.getAttributeValue(null, "tagname"); if (tagname.equals(GROUP_EL) || tagname.equals(FUNCTIONALGROUP_EL) || tagname.equals(HETEROATOM_EL) || tagname.equals(SUFFIXPREFIX_EL)) { String type = reader.getAttributeValue(null, TYPE_ATR); String subType = reader.getAttributeValue(null, SUBTYPE_ATR); while (reader.hasNext()) { switch (reader.next()) { case XMLStreamConstants.START_ELEMENT: if (reader.getLocalName().equals("token")) { String smiles = reader.getAttributeValue(null, VALUE_ATR); String labels = reader.getAttributeValue(null, LABELS_ATR); TokenEl tokenEl = new TokenEl(GROUP_EL); if (type != null){ tokenEl.addAttribute(TYPE_ATR, type); } if (subType != null){ tokenEl.addAttribute(SUBTYPE_ATR, subType); } Fragment mol = null; try { mol = fm.buildSMILES(smiles, tokenEl, labels != null ? 
labels : ""); fm.convertSpareValenciesToDoubleBonds(); fm.makeHydrogensExplicit(); if (!tagname.equals(HETEROATOM_EL)) { //some heteroatom replacements have weird valencies, so only verify valency on more normal fragments try{ mol.checkValencies(); } catch (StructureBuildingException e) { fail("The following token's SMILES produced a structure with invalid valency: " + smiles); } } } catch (Exception e) { e.printStackTrace(); fail("The following SMILES were in error: " + smiles); } finally { if (mol != null) { try { fm.removeFragment(mol); } catch (StructureBuildingException e) { e.printStackTrace(); } } } } break; case XMLStreamConstants.END_ELEMENT: if (reader.getLocalName().equals("tokenList")) { return; } break; } } } } } opsin-2.8.0/opsin-core/src/test/java/uk/ac/cam/ch/wwmm/opsin/WordToolsTest.java000066400000000000000000000203151451751637500273470ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import java.util.Arrays; import java.util.List; import org.junit.jupiter.api.Test; public class WordToolsTest { @Test public void testNormalCase() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("fooane",""), Arrays.asList('a', OpsinTools.END_OF_MAINGROUP)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "fooane"); assertEquals(1, parseWords.size()); assertEquals("fooane", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); assertEquals(pTokens, parseWords.get(0).getParseTokens().get(0)); } @Test public void testNormalCase2() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("fooyl","","fooane",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_MAINGROUP)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "fooylfooane"); assertEquals(1, parseWords.size()); assertEquals("fooylfooane", parseWords.get(0).getWord()); assertEquals(1, 
parseWords.get(0).getParseTokens().size()); assertEquals(pTokens, parseWords.get(0).getParseTokens().get(0)); } @Test public void testNormalCase3() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "functionalfoo"); assertEquals(1, parseWords.size()); assertEquals("functionalfoo", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); assertEquals(pTokens, parseWords.get(0).getParseTokens().get(0)); } @Test public void testNormalCase4() throws ParsingException { ParseTokens pTokens1 = new ParseTokens(Arrays.asList("fooane",""), Arrays.asList('a', OpsinTools.END_OF_MAINGROUP)); ParseTokens pTokens2 = new ParseTokens(Arrays.asList("fooane",""), Arrays.asList('b', OpsinTools.END_OF_MAINGROUP)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens1, pTokens2), "fooane"); assertEquals(1, parseWords.size()); assertEquals("fooane", parseWords.get(0).getWord()); assertEquals(2, parseWords.get(0).getParseTokens().size()); assertEquals(pTokens1, parseWords.get(0).getParseTokens().get(0)); assertEquals(pTokens2, parseWords.get(0).getParseTokens().get(1)); } @Test public void testStartingFunctionalTerm1() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("poly","","foo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM,'a', OpsinTools.END_OF_SUBSTITUENT)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "polyfoo"); assertEquals(2, parseWords.size()); assertEquals("poly", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); ParseTokens pTokensFunc = new ParseTokens(Arrays.asList("poly",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); assertEquals(pTokensFunc, parseWords.get(0).getParseTokens().get(0)); assertEquals("foo", 
parseWords.get(1).getWord()); assertEquals(1, parseWords.get(1).getParseTokens().size()); ParseTokens pTokensGroup = new ParseTokens(Arrays.asList("foo",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT)); assertEquals(pTokensGroup, parseWords.get(1).getParseTokens().get(0)); } @Test public void testStartingFunctionalTerm2() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("poly","","foo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM,'a', OpsinTools.END_OF_MAINGROUP)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "polyfoo"); assertEquals(2, parseWords.size()); assertEquals("poly", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); ParseTokens pTokensFunc = new ParseTokens(Arrays.asList("poly",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); assertEquals(pTokensFunc, parseWords.get(0).getParseTokens().get(0)); assertEquals("foo", parseWords.get(1).getWord()); assertEquals(1, parseWords.get(1).getParseTokens().size()); ParseTokens pTokensGroup = new ParseTokens(Arrays.asList("foo",""), Arrays.asList('a', OpsinTools.END_OF_MAINGROUP)); assertEquals(pTokensGroup, parseWords.get(1).getParseTokens().get(0)); } @Test public void testTerminalFunctionalTerm() throws ParsingException { ParseTokens pTokens = new ParseTokens(Arrays.asList("fooyl","","functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_FUNCTIONALTERM)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens), "fooylfunctionalfoo"); assertEquals(2, parseWords.size()); assertEquals("fooyl", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); ParseTokens pTokensSub = new ParseTokens(Arrays.asList("fooyl",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT)); assertEquals(pTokensSub, parseWords.get(0).getParseTokens().get(0)); assertEquals("functionalfoo", parseWords.get(1).getWord()); assertEquals(1, 
parseWords.get(1).getParseTokens().size()); ParseTokens pTokensFunc = new ParseTokens(Arrays.asList("functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); assertEquals(pTokensFunc, parseWords.get(1).getParseTokens().get(0)); } @Test public void testMultipleParsesTerminalFunctionalTerm() throws ParsingException { ParseTokens pTokens1 = new ParseTokens(Arrays.asList("fooyl","","functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_FUNCTIONALTERM)); ParseTokens pTokens2 = new ParseTokens(Arrays.asList("fooyl","","functionalfoo",""), Arrays.asList('b', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_FUNCTIONALTERM)); List<ParseWord> parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens1,pTokens2), "fooylfunctionalfoo"); assertEquals(2, parseWords.size()); assertEquals("fooyl", parseWords.get(0).getWord()); assertEquals(2, parseWords.get(0).getParseTokens().size()); ParseTokens pTokensSub1 = new ParseTokens(Arrays.asList("fooyl",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT)); assertEquals(pTokensSub1, parseWords.get(0).getParseTokens().get(0)); ParseTokens pTokensSub2 = new ParseTokens(Arrays.asList("fooyl",""), Arrays.asList('b', OpsinTools.END_OF_SUBSTITUENT)); assertEquals(pTokensSub2, parseWords.get(0).getParseTokens().get(1)); assertEquals("functionalfoo", parseWords.get(1).getWord()); assertEquals(1, parseWords.get(1).getParseTokens().size()); ParseTokens pTokensFunc = new ParseTokens(Arrays.asList("functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); assertEquals(pTokensFunc, parseWords.get(1).getParseTokens().get(0)); } @Test public void testMultipleParsesAmbiguousWordTokenisationTerminalFunctionalTerm() throws ParsingException { ParseTokens pTokens1 = new ParseTokens(Arrays.asList("fooyl","","functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_FUNCTIONALTERM)); ParseTokens pTokens2 = new 
ParseTokens(Arrays.asList("fooylfunc","","tionalfoo",""), Arrays.asList('b', OpsinTools.END_OF_SUBSTITUENT,'a', OpsinTools.END_OF_FUNCTIONALTERM)); List parseWords = WordTools.splitIntoParseWords(Arrays.asList(pTokens1,pTokens2), "fooylfunctionalfoo"); assertEquals(2, parseWords.size()); assertEquals("fooyl", parseWords.get(0).getWord()); assertEquals(1, parseWords.get(0).getParseTokens().size()); ParseTokens pTokensSub = new ParseTokens(Arrays.asList("fooyl",""), Arrays.asList('a', OpsinTools.END_OF_SUBSTITUENT)); assertEquals(pTokensSub, parseWords.get(0).getParseTokens().get(0)); assertEquals("functionalfoo", parseWords.get(1).getWord()); assertEquals(1, parseWords.get(1).getParseTokens().size()); ParseTokens pTokensFunc = new ParseTokens(Arrays.asList("functionalfoo",""), Arrays.asList('a', OpsinTools.END_OF_FUNCTIONALTERM)); assertEquals(pTokensFunc, parseWords.get(1).getParseTokens().get(0)); } } opsin-2.8.0/opsin-core/src/test/resources/000077500000000000000000000000001451751637500205255ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/000077500000000000000000000000001451751637500211445ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/000077500000000000000000000000001451751637500215275ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/000077500000000000000000000000001451751637500222675ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/000077500000000000000000000000001451751637500226615ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500236505ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500250005ustar00rootroot00000000000000opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/wwmm/opsin/ambiguous.txt000066400000000000000000000012321451751637500275320ustar00rootroot00000000000
000dimethylbenzene trimethylbenzene tetramethylbenzene bipyridyl terphenyl epoxynaphthalene methanonaphthalene ethylenedipyridine benzenetriacetic acid pyridineacetic acid tetraaminoethylbenzene indenespirocyclopentane xylene naphthyridine 4,4'-(ethane-1,2-diylbis(pyridindiyl))dibutan-1-ol dipyridyl ketone pyridyl acetate pyridindiyl glycol hexadiene hexene hexadienyne hexyne cyclohexadiene diazole triazole diazabenzene azapropane diazapropane triazatrithiabenzene dihydrobenzene dihydropyridine triazolium naphthalenium benzenediol pyridinecarbaldehyde pyridinone 3,3,3-trifluoro-propane-d demethylcaffeine tetradeazaguanine opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/wwmm/opsin/unambiguous.txt000066400000000000000000000022201451751637500300730ustar00rootroot00000000000000benzene methylbenzene pentamethylbenzene hexamethylbenzene biphenyl bipyrid-2-yl 2,2'-bipyridyl epoxyethane 1,2-methanobenzene 2,2'-ethylenedipyridine ethylenedibenzene 1,3,5-benzenetriacetic acid benzeneacetic acid cyclopentanespirocyclobutane indene-2-spiro-1'-cyclopentane 4-xylene p-xylene 1,4-xylene acenaphthoquinone 1,2-acenaphthenequinone 1,8-naphthyridine 4,4'-(ethane-1,2-diylbis(azanediyl))dibutan-1-ol cyclo(tyrosinylvalylprolyl) tetramethylene sulfone dipyridin-2-yl ketone pyridin-2-yl acetate ethylene glycol butadiene propene propadiene 1-chlorocyclohexene cyclohexene cyclohexatriene azole tetrazole pentazole azabenzene pentaazabenzene hexaazabenzene azaethane triazapropane oxacycloundecan-2-one hexahydrobenzene tetrahydropyridin-2-one hexahydropyridin-2-one dihydroorotic acid pyridinium isoquinolinium ethanol propanol cyclohexanol benzenepentaol benzenehexaol valeric acid butanone pentanaldehyde pyridin-2-one methane-d 3,3,3-trifluoro-propane-2-d-1-ol 2,2,2-trifluoro-ethane-d 2-chloro-6-(methyl-d3)-benzene-d tridemethylcaffeine deoxycytidine deazamorpholine pentadeazaguanine 
opsin-2.8.0/opsin-core/src/test/resources/uk/ac/cam/ch/wwmm/opsin/uninterpretable.txt000066400000000000000000000003341451751637500307440ustar00rootroot000000000000007-methylbenzene 1,1,1,1-tetrafluoropropane pentafluoromethane 2,2-pentadienol cyclohexa-1,1-diene bicyclo[12.2.2]octadeca-1(16),1(16)-diene oxochloride oxychloride 1-deaza-morpholine phenaleno[1,9b-c]thiophene opsin-2.8.0/opsin-inchi/000077500000000000000000000000001451751637500151075ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/pom.xml000066400000000000000000000033411451751637500164250ustar00rootroot00000000000000 4.0.0 opsin uk.ac.cam.ch.opsin 2.8.0 opsin-inchi OPSIN InChI Support Adds NameToInchi class for converting names directly to InChIs org.apache.maven.plugins maven-shade-plugin 3.2.4 package shade target/opsin-inchi-${project.version}-jar-with-dependencies.jar uk.ac.cam.ch.opsin opsin-core io.github.dan2097 jna-inchi-core org.junit.jupiter junit-jupiter test org.apache.logging.log4j log4j-core test opsin-2.8.0/opsin-inchi/src/000077500000000000000000000000001451751637500156765ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/000077500000000000000000000000001451751637500166225ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/000077500000000000000000000000001451751637500175435ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/000077500000000000000000000000001451751637500201625ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/000077500000000000000000000000001451751637500205455ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/cam/000077500000000000000000000000001451751637500213055ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/cam/ch/000077500000000000000000000000001451751637500216775ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500226665ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/m
ain/java/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500240165ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/cam/ch/wwmm/opsin/InchiPruner.java000066400000000000000000000047321451751637500271150ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.ArrayList; import java.util.List; public class InchiPruner { /** * Return a modified version of the given InChI where the: * stereochemistry, fixed hydrogen and reconnected layers have been removed * The S indicating standard InChI is also removed * @param inchi * @return InChI just containing the c,h,q,p,i layers */ public static String mainAndChargeLayers(String inchi){ String[] inchiLayers = inchi.split("/"); if (inchiLayers.length < 2){ return inchi; } List retainedLayers = new ArrayList<>(); if (Character.isLetter(inchiLayers[0].charAt(inchiLayers[0].length() -1))){//remove the S indicating this to be a standard InChI inchiLayers[0] = inchiLayers[0].substring(0, inchiLayers[0].length() -1); } retainedLayers.add(inchiLayers[0]);//version identifier retainedLayers.add(inchiLayers[1]);//molecular formula for (int i = 2; i < inchiLayers.length; i++) { Character c = inchiLayers[i].charAt(0); if (c=='c' || c=='h' || c=='q' || c=='p' || c=='i'){ retainedLayers.add(inchiLayers[i]); } else if (c!='b' && c!='t' && c!='m' && c!='s'){//ignore stereochemistry but continue as there may be an isotopic layer break; } } return StringTools.stringListToString(retainedLayers, "/"); } /** * Return a modified version of the given InChI where the: * fixed hydrogen and reconnected layers have been removed * The S indicating standard InChI is also removed * @param inchi * @return InChI just containing the c,h,q,p,b,t,m,s,i layers */ public static String mainChargeAndStereochemistryLayers(String inchi){ String[] inchiLayers = inchi.split("/"); if (inchiLayers.length < 2){ return inchi; } List retainedLayers = new ArrayList<>(); if 
(Character.isLetter(inchiLayers[0].charAt(inchiLayers[0].length() -1))){//remove the S indicating this to be a standard InChI inchiLayers[0] = inchiLayers[0].substring(0, inchiLayers[0].length() -1); } retainedLayers.add(inchiLayers[0]);//version identifier retainedLayers.add(inchiLayers[1]);//molecular formula for (int i = 2; i < inchiLayers.length; i++) { Character c = inchiLayers[i].charAt(0); if (c=='c' || c=='h' || c=='q' || c=='p' || c=='b' || c=='t' || c=='m' || c=='s' || c=='i'){ retainedLayers.add(inchiLayers[i]); } else{ break; } } return StringTools.stringListToString(retainedLayers, "/"); } } opsin-2.8.0/opsin-inchi/src/main/java/uk/ac/cam/ch/wwmm/opsin/NameToInchi.java000066400000000000000000000206011451751637500270160ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import java.util.HashMap; import java.util.List; import java.util.Set; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import io.github.dan2097.jnainchi.InchiAtom; import io.github.dan2097.jnainchi.InchiBond; import io.github.dan2097.jnainchi.InchiBondType; import io.github.dan2097.jnainchi.InchiFlag; import io.github.dan2097.jnainchi.InchiInput; import io.github.dan2097.jnainchi.InchiKeyOutput; import io.github.dan2097.jnainchi.InchiOptions; import io.github.dan2097.jnainchi.InchiStereo; import io.github.dan2097.jnainchi.InchiOptions.InchiOptionsBuilder; import io.github.dan2097.jnainchi.InchiOutput; import io.github.dan2097.jnainchi.InchiStatus; import io.github.dan2097.jnainchi.InchiStereoParity; import io.github.dan2097.jnainchi.JnaInchi; import uk.ac.cam.ch.wwmm.opsin.BondStereo.BondStereoValue; /** * Allows the conversion of OPSIN's output into (Std)InChIs or StdInChIKeys * Also can be used, as a convenience method, to directly convert chemical names to (Std)InChIs or StdInChIKeys * @author dl387 * */ public class NameToInchi { private static final Logger LOG = LogManager.getLogger(NameToInchi.class); private NameToStructure n2s; public 
NameToInchi() { n2s = NameToStructure.getInstance(); } /**Parses a chemical name, returning an InChI representation of the molecule. * * @param name The chemical name to parse. * @return An InChI string, containing the parsed molecule, or null if the molecule would not parse. */ public String parseToInchi(String name) { OpsinResult result = n2s.parseChemicalName(name); return convertResultToInChI(result); } /**Parses a chemical name, returning a StdInChI representation of the molecule. * Note that chemical names typically specify an exact tautomer which is not representable in StdInChI * Use {@link #parseToInchi(String)} if you want to represent the exact tautomer using a fixed hydrogen layer * * @param name The chemical name to parse. * @return A StdInChI string, containing the parsed molecule, or null if the molecule would not parse. */ public String parseToStdInchi(String name) { OpsinResult result = n2s.parseChemicalName(name); return convertResultToStdInChI(result); } /**Parses a chemical name, returning a StdInChIKey for the molecule. * Like StdInChI, StdInChIKeys aim to not be tautomer specific * * @param name The chemical name to parse. * @return A StdInChIKey string or null if the molecule would not parse. */ public String parseToStdInchiKey(String name) { OpsinResult result = n2s.parseChemicalName(name); return convertResultToStdInChIKey(result); } /** * Converts an OPSIN result to InChI. Null is returned if this conversion fails * @param result * @return String InChI */ public static String convertResultToInChI(OpsinResult result){ return convertResultToInChI(result, false); } /** * Converts an OPSIN result to StdInChI. 
Null is returned if this conversion fails * Note that chemical names typically specify an exact tautomer which is not representable in StdInChI * Use {@link #convertResultToInChI(OpsinResult)} if you want to represent the exact tautomer using a fixed hydrogen layer * @param result * @return String InChI */ public static String convertResultToStdInChI(OpsinResult result){ return convertResultToInChI(result, true); } /** * Converts an OPSIN result to a StdInChIKey. Null is returned if this conversion fails * Like StdInChI, StdInChIKeys aim to not be tautomer specific * @param result * @return String InChIKey */ public static String convertResultToStdInChIKey(OpsinResult result){ String stdInchi = convertResultToInChI(result, true); if (stdInchi != null){ try { InchiKeyOutput key = JnaInchi.inchiToInchiKey(stdInchi); return key.getInchiKey(); } catch (Exception e) { if (LOG.isDebugEnabled()){ LOG.debug(e.getMessage(), e); } return null; } } return null; } private static String convertResultToInChI(OpsinResult result, boolean produceStdInChI){ if (result.getStructure() != null){ String inchi = null; try{ inchi = opsinFragmentToInchi(result.getStructure(), produceStdInChI); } catch (Exception e) { if (LOG.isDebugEnabled()){ LOG.debug(e.getMessage(), e); } return null; } if (inchi ==null){ //inchi generation failed return null; } if(LOG.isDebugEnabled()){ LOG.debug(inchi); } return inchi; } return null; } private static String opsinFragmentToInchi(Fragment frag, boolean produceStdInChI) { HashMap<Integer, InchiAtom> opsinIdAtomMap = new HashMap<>(); InchiOptionsBuilder optionsBuilder = new InchiOptions.InchiOptionsBuilder(); optionsBuilder.withFlag(InchiFlag.AuxNone); if (!produceStdInChI){ optionsBuilder.withFlag(InchiFlag.FixedH); } InchiInput input = new InchiInput(); List<Atom> atomList = frag.getAtomList(); // Generate atoms for (Atom atom : atomList) { InchiAtom inchiAtom = new InchiAtom(atom.getElement().toString()); input.addAtom(inchiAtom); inchiAtom.setCharge(atom.getCharge()); Integer
isotope = atom.getIsotope(); if (isotope != null) { inchiAtom.setIsotopicMass(isotope); } opsinIdAtomMap.put(atom.getID(), inchiAtom); } Set<Bond> bondList = frag.getBondSet(); for (Bond bond : bondList) { input.addBond(new InchiBond(opsinIdAtomMap.get(bond.getFrom()), opsinIdAtomMap.get(bond.getTo()), InchiBondType.of((byte)bond.getOrder()))); } for (Atom atom : atomList) {//add atomParities AtomParity atomParity = atom.getAtomParity(); if (atomParity == null) { continue; } StereoGroupType stereoGroupType = atomParity.getStereoGroup().getType(); if ((stereoGroupType == StereoGroupType.Rac || stereoGroupType == StereoGroupType.Rel) && countStereoGroup(atom) == 1) { continue; } Atom[] atomRefs4 = atomParity.getAtomRefs4(); int[] atomRefs4AsInt = new int[4]; for (int i = 0; i < atomRefs4.length; i++) { atomRefs4AsInt[i] = atomRefs4[i].getID(); } InchiStereoParity parity = InchiStereoParity.UNKNOWN; if (atomParity.getParity() > 0){ parity = InchiStereoParity.EVEN; } else if (atomParity.getParity() < 0){ parity = InchiStereoParity.ODD; } input.addStereo(InchiStereo.createTetrahedralStereo(opsinIdAtomMap.get(atom.getID()), opsinIdAtomMap.get(atomRefs4AsInt[0]), opsinIdAtomMap.get(atomRefs4AsInt[1]), opsinIdAtomMap.get(atomRefs4AsInt[2]), opsinIdAtomMap.get(atomRefs4AsInt[3]), parity)); } for (Bond bond : bondList) {//add bondStereos BondStereo bondStereo =bond.getBondStereo(); if (bondStereo != null){ Atom[] atomRefs4 = bondStereo.getAtomRefs4(); int[] atomRefs4Ids = new int[4]; for (int i = 0; i < atomRefs4.length; i++) { atomRefs4Ids[i] = atomRefs4[i].getID(); } if (BondStereoValue.CIS.equals(bondStereo.getBondStereoValue())){ input.addStereo(InchiStereo.createDoubleBondStereo(opsinIdAtomMap.get(atomRefs4Ids[0]), opsinIdAtomMap.get(atomRefs4Ids[1]), opsinIdAtomMap.get(atomRefs4Ids[2]), opsinIdAtomMap.get(atomRefs4Ids[3]), InchiStereoParity.ODD)); } else if (BondStereoValue.TRANS.equals(bondStereo.getBondStereoValue())){
input.addStereo(InchiStereo.createDoubleBondStereo(opsinIdAtomMap.get(atomRefs4Ids[0]), opsinIdAtomMap.get(atomRefs4Ids[1]), opsinIdAtomMap.get(atomRefs4Ids[2]), opsinIdAtomMap.get(atomRefs4Ids[3]), InchiStereoParity.EVEN)); } } } InchiOutput output = JnaInchi.toInchi(input, optionsBuilder.build()); InchiStatus ret = output.getStatus(); if (LOG.isDebugEnabled()){ LOG.debug("Inchi generation status: " + ret); if (InchiStatus.SUCCESS != ret){ LOG.debug(output.getMessage()); } } if (InchiStatus.SUCCESS != ret && InchiStatus.WARNING != ret) { return null; } return output.getInchi(); } private static int countStereoGroup(Atom atom) { StereoGroup refGroup = atom.getAtomParity().getStereoGroup(); int count = 0; for (Atom a : atom.getFrag()) { AtomParity atomParity = a.getAtomParity(); if (atomParity == null) { continue; } if (atomParity.getStereoGroup().equals(refGroup)) { count++; } } return count; } } opsin-2.8.0/opsin-inchi/src/test/000077500000000000000000000000001451751637500166555ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/000077500000000000000000000000001451751637500175765ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/000077500000000000000000000000001451751637500202155ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/000077500000000000000000000000001451751637500206005ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/000077500000000000000000000000001451751637500213405ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/ch/000077500000000000000000000000001451751637500217325ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500227215ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500240515ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/ch/wwmm/opsin/InchiOutputTest
.java000066400000000000000000000067721451751637500300430ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; import uk.ac.cam.ch.wwmm.opsin.OpsinResult.OPSIN_RESULT_STATUS; public class InchiOutputTest { private static NameToInchi n2i; @BeforeAll public static void setUp() { n2i = new NameToInchi(); } @AfterAll public static void cleanUp(){ n2i = null; } @Test public void testStaticToInChI() throws StructureBuildingException{ SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager()); Fragment f = sBuilder.build("C([H])([H])([H])C(=O)N([H])[H]"); OpsinResult result = new OpsinResult(f, OPSIN_RESULT_STATUS.SUCCESS, "", ""); assertEquals("InChI=1/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)/f/h3H2", NameToInchi.convertResultToInChI(result)); } @Test public void testStaticToStdInChI() throws StructureBuildingException{ SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager()); Fragment f = sBuilder.build("C([H])([H])([H])C(=O)N([H])[H]"); OpsinResult result = new OpsinResult(f, OPSIN_RESULT_STATUS.SUCCESS, "", ""); assertEquals("InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)", NameToInchi.convertResultToStdInChI(result)); } @Test public void testStaticToStdInChIKey() throws StructureBuildingException{ SMILESFragmentBuilder sBuilder = new SMILESFragmentBuilder(new IDManager()); Fragment f = sBuilder.build("C([H])([H])([H])C(=O)N([H])[H]"); OpsinResult result = new OpsinResult(f, OPSIN_RESULT_STATUS.SUCCESS, "", ""); assertEquals("DLFVBJFMPXGRIB-UHFFFAOYSA-N", NameToInchi.convertResultToStdInChIKey(result)); } @Test public void testParseToInChI(){ assertEquals("InChI=1/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)/f/h3H2", n2i.parseToInchi("acetamide")); } @Test public void testParseToStdInChI(){ assertEquals("InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)", n2i.parseToStdInchi("acetamide")); } @Test 
public void testParseToStdInChIKey(){ assertEquals("DLFVBJFMPXGRIB-UHFFFAOYSA-N", n2i.parseToStdInchiKey("acetamide")); } @Test public void ignoreRacemicStereoInInchi() throws StructureBuildingException{ assertEquals("InChI=1/C8H10O/c1-7(9)8-5-3-2-4-6-8/h2-7,9H,1H3", n2i.parseToInchi("rac-(R)-1-phenylethan-1-ol")); } // more than one in same rac group @Test public void keepRacemicStereoInInchi() throws StructureBuildingException{ assertEquals("InChI=1/C12H10O2/c13-11-8-5-1-3-7-4-2-6-9(10(7)8)12(11)14/h1-6,11-14H/t11-,12-/m1/s1", n2i.parseToInchi("rac-trans-acenaphthene-1,2-diol")); } @Test public void consistency() throws StructureBuildingException{ assertEquals("InChI=1/C31H29Cl2F2N3O3/c1-30(2,3)14-20-16-38(29(41)37-15-18-4-6-19(7-5-18)28(39)40)27(23-10-8-21(32)12-25(23)34)31(20,17-36)24-11-9-22(33)13-26(24)35/h4-13,20,27H,14-16H2,1-3H3,(H,37,41)(H,39,40)/t20-,27-,31-/m1/s1/f/h37,39H", n2i.parseToInchi("4-({[rac-(2S,3S,4S)-2,3-bis-(4-chloro-2-fluoro-phenyl)-3-cyano-4-(2,2-dimethyl-propyl)-pyrrolidine-1-carbonyl]-amino}-methyl)-benzoic acid")); assertEquals("InChI=1/C31H29Cl2F2N3O3/c1-30(2,3)14-20-16-38(29(41)37-15-18-4-6-19(7-5-18)28(39)40)27(23-10-8-21(32)12-25(23)34)31(20,17-36)24-11-9-22(33)13-26(24)35/h4-13,20,27H,14-16H2,1-3H3,(H,37,41)(H,39,40)/t20-,27-,31-/m1/s1/f/h37,39H", n2i.parseToInchi("rac-4-({[(2S,3S,4S)-2,3-bis-(4-chloro-2-fluoro-phenyl)-3-cyano-4-(2,2-dimethyl-propyl)-pyrrolidine-1-carbonyl]-amino}-methyl)-benzoic acid")); } } opsin-2.8.0/opsin-inchi/src/test/java/uk/ac/cam/ch/wwmm/opsin/NomenclatureIntegrationTest.java000066400000000000000000000150211451751637500324130ustar00rootroot00000000000000package uk.ac.cam.ch.wwmm.opsin; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.fail; import org.junit.jupiter.api.AfterAll; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.params.ParameterizedTest; import org.junit.jupiter.params.provider.CsvFileSource; public class 
NomenclatureIntegrationTest { private static NameToStructure n2s; private static NameToStructureConfig n2sConfig; @BeforeAll public static void setUp() { n2s = NameToStructure.getInstance(); n2sConfig = NameToStructureConfig.getDefaultConfigInstance(); n2sConfig.setAllowRadicals(true); } @AfterAll public static void cleanUp(){ n2s = null; n2sConfig = null; } @ParameterizedTest @CsvFileSource(resources = "radicals.txt", delimiter='\t') public void testRadicals(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "acetals.txt", delimiter='\t') public void testAcetals(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "alcoholEsters.txt", delimiter='\t') public void testAlcoholEsters(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "aminoAcids.txt", delimiter='\t') public void testAminoAcids(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "carbohydrates.txt", delimiter='\t') public void testCarbohydrates(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "chargeBalancing.txt", delimiter='\t') public void testChargeBalancing(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "conjunctiveNomenclature.txt", delimiter='\t') public void testConjunctiveNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "cyclicSuffixes.txt", delimiter='\t') public void testCyclicSuffixes(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "epoxyLike.txt", delimiter='\t') public void testEpoxyLike(String name, String expectedInchi) { checkName(name, 
expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "flavonoids.txt", delimiter='\t') public void testFlavonoids(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "functionalReplacement.txt", delimiter='\t') public void testFunctionalReplacement(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "isotopes.txt", delimiter='\t') public void testIsotopes(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "additiveNomenclature.txt", delimiter='\t') public void testAdditiveNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "multiplicativeNomenclature.txt", delimiter='\t') public void testMultiplicativeNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "omittedSpaces.txt", delimiter='\t') public void testOmittedSpaces(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "functionalClasses.txt", delimiter='\t') public void testFunctionalClassNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "fusedRings.txt", delimiter='\t') public void testFusedRingNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "hwRings.txt", delimiter='\t') public void testHwRingNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "inorganics.txt", delimiter='\t') public void testInorganicNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "ions.txt", delimiter='\t') 
public void testIonNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "spiro.txt", delimiter='\t') public void testSpiroNomenclature(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "organometallics.txt", delimiter='\t') public void testOrganoMetallics(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "implicitBracketting.txt", delimiter='\t') public void testImplicitBracketting(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "stereochemistry.txt", delimiter='\t') public void testStereochemistry(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "detachablePrefixes.txt", delimiter='\t') public void testDetachablePrefixes(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "lettercasing.txt", delimiter='\t') public void testLetterCasing(String name, String expectedInchi) { checkName(name, expectedInchi); } @ParameterizedTest @CsvFileSource(resources = "miscellany.txt", delimiter='\t') public void testMiscellany(String name, String expectedInchi) { checkName(name, expectedInchi); } private void checkName(String name, String expectedInchI) { String inchi = NameToInchi.convertResultToInChI(n2s.parseChemicalName(name, n2sConfig)); if (inchi != null) { String opsinInchi = InchiPruner.mainChargeAndStereochemistryLayers(inchi); String referenceInchi = InchiPruner.mainChargeAndStereochemistryLayers(expectedInchI); assertEquals(referenceInchi, opsinInchi, name + " was misinterpreted as: " + inchi); } else { fail(name +" was uninterpretable"); } } } 
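The checkName helper above compares names against reference InChIs only after both strings have passed through InchiPruner.mainChargeAndStereochemistryLayers, which is why a fixed-hydrogen InChI produced by OPSIN can match a StdInChI listed in the resource files. The following is a minimal, standalone sketch of that pruning idea, not OPSIN's own code: the class name InchiLayerPruningSketch and the method pruneLayers are hypothetical, and the initial split on "/" and the empty-layer guard are assumptions, since the start of the real method is not shown in this section. The retained layer prefixes (c, h, q, p, b, t, m, s, i) and the removal of the trailing "S" follow the InchiPruner fragment visible earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; it does not depend on OPSIN or JNA-InChI.
public class InchiLayerPruningSketch {

    // Layer prefixes kept by the pruning logic visible in InchiPruner.
    private static final String RETAINED_PREFIXES = "chqpbtmsi";

    /**
     * Keeps the version identifier (without a trailing 'S'), the molecular
     * formula, and every following layer up to the first layer whose prefix
     * is not in RETAINED_PREFIXES (for example the fixed-hydrogen "f" layer).
     */
    static String pruneLayers(String inchi) {
        // Assumption: layers are obtained by splitting on '/'.
        String[] layers = inchi.split("/");
        if (layers.length < 2) {
            return inchi; // nothing to prune
        }
        List<String> retained = new ArrayList<>();
        String version = layers[0];
        // Remove the 'S' that marks a standard InChI, e.g. "InChI=1S" -> "InChI=1".
        if (Character.isLetter(version.charAt(version.length() - 1))) {
            version = version.substring(0, version.length() - 1);
        }
        retained.add(version);   // version identifier
        retained.add(layers[1]); // molecular formula
        for (int i = 2; i < layers.length; i++) {
            // Assumption: an empty layer ends pruning instead of throwing.
            if (layers[i].isEmpty()
                    || RETAINED_PREFIXES.indexOf(layers[i].charAt(0)) < 0) {
                break;
            }
            retained.add(layers[i]);
        }
        return String.join("/", retained);
    }

    public static void main(String[] args) {
        // Acetamide strings taken from InchiOutputTest in this archive.
        String std = "InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)";
        String fixedH = "InChI=1/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)/f/h3H2";
        System.out.println(pruneLayers(std));
        System.out.println(pruneLayers(fixedH));
    }
}
```

With the two acetamide strings from InchiOutputTest, both calls reduce to the same pruned string, so a comparison of the kind checkName performs would succeed even though one input carries a fixed-hydrogen layer and the other does not.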
opsin-2.8.0/opsin-inchi/src/test/resources/000077500000000000000000000000001451751637500206675ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/000077500000000000000000000000001451751637500213065ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/000077500000000000000000000000001451751637500216715ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/000077500000000000000000000000001451751637500224315ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/000077500000000000000000000000001451751637500230235ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/000077500000000000000000000000001451751637500240125ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/000077500000000000000000000000001451751637500251425ustar00rootroot00000000000000opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/acetals.txt000066400000000000000000000035341451751637500273240ustar00rootroot00000000000000propanal dimethyl acetal InChI=1S/C5H12O2/c1-4-5(6-2)7-3/h5H,4H2,1-3H3 propanal diethyl acetal InChI=1S/C7H16O2/c1-4-7(8-5-2)9-6-3/h7H,4-6H2,1-3H3 cyclohexanone ethyl methyl ketal InChI=1S/C9H18O2/c1-3-11-9(10-2)7-5-4-6-8-9/h3-8H2,1-2H3 cyclohexane-1,4-dione 1-ethyl 1,4,4-trimethyl diketal InChI=1S/C11H22O4/c1-5-15-11(14-4)8-6-10(12-2,13-3)7-9-11/h5-9H2,1-4H3 2-methylcyclohexane-1,4-dione 1,1-diethyl 4,4-dichloro diketal InChI=1S/C11H20Cl2O4/c1-4-14-11(15-5-2)7-6-10(16-12,17-13)8-9(11)3/h9H,4-8H2,1-3H3 propanal ethylene acetal InChI=1S/C5H10O2/c1-2-5-6-3-4-7-5/h5H,2-4H2,1H3 cyclohexanone ethylene ketal InChI=1S/C8H14O2/c1-2-4-8(5-3-1)9-6-7-10-8/h1-7H2 3-(trimethylsilyl)propanal ethylene ketal InChI=1S/C8H18O2Si/c1-11(2,3)7-4-8-9-5-6-10-8/h8H,4-7H2,1-3H3 butanal ethyl hemiacetal InChI=1S/C6H14O2/c1-3-5-6(7)8-4-2/h6-7H,3-5H2,1-2H3 cyclohexanone methyl hemiketal 
InChI=1S/C7H14O2/c1-9-7(8)5-3-2-4-6-7/h8H,2-6H2,1H3 pentanal diethyl dithioacetal InChI=1S/C9H20S2/c1-4-7-8-9(10-5-2)11-6-3/h9H,4-8H2,1-3H3 propanal S-ethyl O-methyl monothioacetal InChI=1S/C6H14OS/c1-4-6(7-3)8-5-2/h6H,4-5H2,1-3H3 cyclopentanone diethyl monothioketal InChI=1S/C9H18OS/c1-3-10-9(11-4-2)7-5-6-8-9/h3-8H2,1-2H3 ethan-1-one ethylene monothioketal InChI=1S/C4H8OS/c1-4-5-2-3-6-4/h4H,2-3H2,1H3 cyclohexanone Se-ethyl S-methyl selenothioketal InChI=1S/C9H18SSe/c1-3-11-9(10-2)7-5-4-6-8-9/h3-8H2,1-2H3 cyclopentanone ethylene monoselenoketal InChI=1S/C7H12OSe/c1-2-4-7(3-1)8-5-6-9-7/h1-6H2 propanal ethyl dithiohemiacetal InChI=1S/C5H12S2/c1-3-5(6)7-4-2/h5-6H,3-4H2,1-2H3 propanal O-ethyl monothiohemiacetal InChI=1S/C5H12OS/c1-3-5(7)6-4-2/h5,7H,3-4H2,1-2H3 cyclopentanone S-ethyl selenothiohemiketal InChI=1S/C7H14SSe/c1-2-8-7(9)5-3-4-6-7/h9H,2-6H2,1H3 D-glucose diethyl mercaptal InChI=1S/C10H22O5S2/c1-3-16-10(17-4-2)9(15)8(14)7(13)6(12)5-11/h6-15H,3-5H2,1-2H3/t6-,7-,8+,9-/m1/s1 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/additiveNomenclature.txt000066400000000000000000000005631451751637500320550ustar00rootroot00000000000000methylsulfonylbenzene InChI=1S/C7H8O2S/c1-10(8,9)7-5-3-2-4-6-7/h2-6H,1H3 methylsulfonamidobenzene InChI=1S/C7H9NO2S/c1-11(9,10)8-7-5-3-2-4-6-7/h2-6,8H,1H3 2-(N-(2-ethylphenyl)methylsulfonamido)-acetamide InChI=1S/C11H16N2O3S/c1-3-9-6-4-5-7-10(9)13(8-11(12)14)17(2,15)16/h4-7H,3,8H2,1-2H3,(H2,12,14) methylcarbonylbenzene InChI=1S/C8H8O/c1-7(9)8-5-3-2-4-6-8/h2-6H,1H3 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/alcoholEsters.txt000066400000000000000000000030161451751637500305120ustar00rootroot00000000000000phenol acetate InChI=1S/C8H8O2/c1-7(9)10-8-5-3-2-4-6-8/h2-6H,1H3 phenol acetate (1:1) InChI=1S/C6H6O.C2H4O2/c7-6-4-2-1-3-5-6;1-2(3)4/h1-5,7H;1H3,(H,3,4) adenosine triphosphate 
InChI=1S/C10H16N5O13P3/c11-8-5-9(13-2-12-8)15(3-14-5)10-7(17)6(16)4(26-10)1-25-30(21,22)28-31(23,24)27-29(18,19)20/h2-4,6-7,10,16-17H,1H2,(H,21,22)(H,23,24)(H2,11,12,13)(H2,18,19,20)/t4-,6-,7-,10-/m1/s1 adenosine 5'-triphosphate InChI=1S/C10H16N5O13P3/c11-8-5-9(13-2-12-8)15(3-14-5)10-7(17)6(16)4(26-10)1-25-30(21,22)28-31(23,24)27-29(18,19)20/h2-4,6-7,10,16-17H,1H2,(H,21,22)(H,23,24)(H2,11,12,13)(H2,18,19,20)/t4-,6-,7-,10-/m1/s1 choline phosphate InChI=1S/C5H14NO4P/c1-6(2,3)4-5-10-11(7,8)9/h4-5H2,1-3H3,(H-,7,8,9)/p+1 L-histidinol phosphate InChI=1S/C6H12N3O4P/c7-5(3-13-14(10,11)12)1-6-2-8-4-9-6/h2,4-5H,1,3,7H2,(H,8,9)(H2,10,11,12)/t5-/m0/s1 1,2-Ethanediol 1-(4-methylbenzenesulfonate) InChI=1S/C9H12O4S/c1-8-2-4-9(5-3-8)14(11,12)13-7-6-10/h2-5,10H,6-7H2,1H3 #counter examples glycinium acetate InChI=1S/C2H5NO2.C2H4O2/c3-1-2(4)5;1-2(3)4/h1,3H2,(H,4,5);1H3,(H,3,4) D-tryptophanol oxalate InChI=1S/C11H14N2O.C2H2O4/c12-9(7-14)5-8-6-13-11-4-2-1-3-10(8)11;3-1(4)2(5)6/h1-4,6,9,13-14H,5,7,12H2;(H,3,4)(H,5,6)/t9-;/m1./s1 lysine acetate InChI=1S/C6H14N2O2.C2H4O2/c7-4-2-1-3-5(8)6(9)10;1-2(3)4/h5H,1-4,7-8H2,(H,9,10);1H3,(H,3,4)/t5-;/m0./s1 piperidin-4-ol trifluoroacetate InChI=1S/C5H11NO.C2HF3O2/c7-5-1-3-6-4-2-5;3-2(4,5)1(6)7/h5-7H,1-4H2;(H,6,7) piperidin-4-ol 2,2,2-trifluoroacetate InChI=1S/C5H11NO.C2HF3O2/c7-5-1-3-6-4-2-5;3-2(4,5)1(6)7/h5-7H,1-4H2;(H,6,7)opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/aminoAcids.txt000066400000000000000000000075041451751637500277600ustar00rootroot00000000000000arginine InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9/h4H,1-3,7H2,(H,11,12)(H4,8,9,10)/t4-/m0/s1 glutamic acid InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/t3-/m0/s1 epsilon-chloro-lysine InChI=1S/C6H13ClN2O2/c7-5(9)3-1-2-4(8)6(10)11/h4-5H,1-3,8-9H2,(H,10,11)/t4-,5?/m0/s1 3,5-diiodotyrosine InChI=1S/C9H9I2NO3/c10-5-1-4(2-6(11)8(5)13)3-7(12)9(14)15/h1-2,7,13H,3,12H2,(H,14,15)/t7-/m0/s1 1-methyl-proline 
InChI=1S/C6H11NO2/c1-7-4-2-3-5(7)6(8)9/h5H,2-4H2,1H3,(H,8,9)/t5-/m0/s1 beta-chloro-histidine InChI=1S/C6H8ClN3O2/c7-4(5(8)6(11)12)3-1-9-2-10-3/h1-2,4-5H,8H2,(H,9,10)(H,11,12)/t4?,5-/m0/s1 2-chloro-histidine InChI=1S/C6H8ClN3O2/c7-6-9-2-3(10-6)1-4(8)5(11)12/h2,4H,1,8H2,(H,9,10)(H,11,12)/t4-/m0/s1 homocysteine InChI=1S/C4H9NO2S/c5-3(1-2-8)4(6)7/h3,8H,1-2,5H2,(H,6,7)/t3-/m0/s1 norvaline InChI=1S/C5H11NO2/c1-2-3-4(6)5(7)8/h4H,2-3,6H2,1H3,(H,7,8)/t4-/m0/s1 selenocysteine InChI=1S/C3H7NO2Se/c4-2(1-7)3(5)6/h2,7H,1,4H2,(H,5,6)/t2-/m0/s1 serine InChI=1S/C3H7NO3/c4-2(1-5)3(6)7/h2,5H,1,4H2,(H,6,7)/t2-/m0/s1 L-serine InChI=1S/C3H7NO3/c4-2(1-5)3(6)7/h2,5H,1,4H2,(H,6,7)/t2-/m0/s1 D-serine InChI=1S/C3H7NO3/c4-2(1-5)3(6)7/h2,5H,1,4H2,(H,6,7)/t2-/m1/s1 cis-4-hydroxy-L-proline InChI=1S/C5H9NO3/c7-3-1-4(5(8)9)6-2-3/h3-4,6-7H,1-2H2,(H,8,9)/t3-,4-/m0/s1 D-allothreonine InChI=1S/C4H9NO3/c1-2(6)3(5)4(7)8/h2-3,6H,5H2,1H3,(H,7,8)/t2-,3-/m1/s1 glycinium InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)/p+1 glycinate InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)/p-1 tryptophan-1-ylbenzene InChI=1S/C17H16N2O2/c18-15(17(20)21)10-12-11-19(13-6-2-1-3-7-13)16-9-5-4-8-14(12)16/h1-9,11,15H,10,18H2,(H,20,21)/t15-/m0/s1 aspartic-2-ylbenzene InChI=1S/C10H11NO4/c11-10(9(14)15,6-8(12)13)7-4-2-1-3-5-7/h1-5H,6,11H2,(H,12,13)(H,14,15)/t10-/m1/s1 cysteinylbenzene InChI=1S/C9H11NOS/c10-8(6-12)9(11)7-4-2-1-3-5-7/h1-5,8,12H,6,10H2/t8-/m0/s1 α-aspartylbenzene InChI=1S/C10H11NO3/c11-8(6-9(12)13)10(14)7-4-2-1-3-5-7/h1-5,8H,6,11H2,(H,12,13)/t8-/m0/s1 aspart-1-ylbenzene InChI=1S/C10H11NO3/c11-8(6-9(12)13)10(14)7-4-2-1-3-5-7/h1-5,8H,6,11H2,(H,12,13)/t8-/m0/s1 β-aspartylbenzene InChI=1S/C10H11NO3/c11-8(10(13)14)6-9(12)7-4-2-1-3-5-7/h1-5,8H,6,11H2,(H,13,14)/t8-/m0/s1 aspart-4-ylbenzene InChI=1S/C10H11NO3/c11-8(10(13)14)6-9(12)7-4-2-1-3-5-7/h1-5,8H,6,11H2,(H,13,14)/t8-/m0/s1 aspartoyl dichloride InChI=1S/C4H5Cl2NO2/c5-3(8)1-2(7)4(6)9/h2H,1,7H2/t2-/m0/s1 aspartoyl chloride 
InChI=1S/C4H5Cl2NO2/c5-3(8)1-2(7)4(6)9/h2H,1,7H2/t2-/m0/s1 alaninal InChI=1S/C3H7NO/c1-3(4)2-5/h2-3H,4H2,1H3/t3-/m0/s1 aspart-1-al InChI=1S/C4H7NO3/c5-3(2-6)1-4(7)8/h2-3H,1,5H2,(H,7,8)/t3-/m0/s1 alaninol InChI=1S/C3H9NO/c1-3(4)2-5/h3,5H,2,4H2,1H3/t3-/m0/s1 γ-glutamylcysteinylglycine InChI=1S/C10H17N3O6S/c11-5(10(18)19)1-2-7(14)13-6(4-20)9(17)12-3-8(15)16/h5-6,20H,1-4,11H2,(H,12,17)(H,13,14)(H,15,16)(H,18,19)/t5-,6-/m0/s1 L-alanylglycyl-L-leucine InChI=1S/C11H21N3O4/c1-6(2)4-8(11(17)18)14-9(15)5-13-10(16)7(3)12/h6-8H,4-5,12H2,1-3H3,(H,13,16)(H,14,15)(H,17,18)/t7-,8-/m0/s1 glycine ethyl ester InChI=1S/C4H9NO2/c1-2-7-4(6)3-5/h2-3,5H2,1H3 O5-Ethyl hydrogen N-acetylglutamate InChI=1S/C9H15NO5/c1-3-15-8(12)5-4-7(9(13)14)10-6(2)11/h7H,3-5H2,1-2H3,(H,10,11)(H,13,14)/t7-/m0/s1 cyclo(L-asparaginyl-L-glutaminyl-L-tyrosyl-L-valyl-L-ornithyl-L-leucyl-D-phenylalanyl-L-prolyl-L-phenylalanyl-D-phenylalanyl) InChI=1S/C66H87N13O13/c1-38(2)32-47-59(85)77-52(36-42-20-12-7-13-21-42)66(92)79-31-15-23-53(79)64(90)76-49(34-41-18-10-6-11-19-41)61(87)74-48(33-40-16-8-5-9-17-40)60(86)75-51(37-55(69)82)62(88)70-46(28-29-54(68)81)58(84)73-50(35-43-24-26-44(80)27-25-43)63(89)78-56(39(3)4)65(91)71-45(22-14-30-67)57(83)72-47/h5-13,16-21,24-27,38-39,45-53,56,80H,14-15,22-23,28-37,67H2,1-4H3,(H2,68,81)(H2,69,82)(H,70,88)(H,71,91)(H,72,83)(H,73,84)(H,74,87)(H,75,86)(H,76,90)(H,77,85)(H,78,89)/t45-,46-,47-,48+,49-,50-,51-,52+,53-,56-/m0/s1 L-Asparaginium chloride InChI=1S/C4H8N2O3.ClH/c5-2(4(8)9)1-3(6)7;/h2H,1,5H2,(H2,6,7)(H,8,9);1H/t2-;/m0./s1opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/carbohydrates.txt000066400000000000000000001032101451751637500305320ustar00rootroot00000000000000L-glucitol InChI=1S/C6H14O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3-12H,1-2H2/t3-,4+,5-,6-/m0/s1 L-erythro-L-gluco-Non-5-ulose InChI=1S/C9H18O9/c10-1-3(12)5(14)7(16)9(18)8(17)6(15)4(13)2-11/h3-8,10-17H,1-2H2/t3-,4+,5-,6-,7-,8+/m0/s1 4-O-Methyl-D-xylitol 
InChI=1S/C6H14O5/c1-11-5(3-8)6(10)4(9)2-7/h4-10H,2-3H2,1H3/t4-,5+,6+/m0/s1 2,3,5-Tri-O-methyl-D-mannitol InChI=1S/C9H20O6/c1-13-6(4-10)8(12)9(15-3)7(5-11)14-2/h6-12H,4-5H2,1-3H3/t6-,7-,8-,9-/m1/s1 2-O-Acetyl-5-O-methyl-D-mannitol InChI=1S/C9H18O7/c1-5(12)16-7(4-11)9(14)8(13)6(3-10)15-2/h6-11,13-14H,3-4H2,1-2H3/t6-,7-,8-,9-/m1/s1 D-Glucose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,3-6,8-12H,2H2/t3-,4+,5+,6+/m0/s1 aldehydo-D-Glucose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,3-6,8-12H,2H2/t3-,4+,5+,6+/m0/s1 keto-D-Fructose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m1/s1 (+)-D-glucose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,3-6,8-12H,2H2/t3-,4+,5+,6+/m0/s1 D-arabino-Hex-2-ulose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m1/s1 D-glycero-L-gulo-Heptose InChI=1S/C7H14O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h1,3-7,9-14H,2H2/t3-,4-,5-,6+,7+/m1/s1 alpha-D-Glucooxirose InChI=1S/C6H12O6/c7-1-2(8)3(9)4(10)5-6(11)12-5/h2-11H,1H2/t2-,3-,4+,5-,6+/m1/s1 2,3,5,6-Tetra-O-acetyl-alpha-D-galactofuranose InChI=1S/C14H20O10/c1-6(15)20-5-10(21-7(2)16)11-12(22-8(3)17)13(14(19)24-11)23-9(4)18/h10-14,19H,5H2,1-4H3/t10-,11+,12+,13-,14+/m1/s1 alpha-D-Glucoseptanose InChI=1S/C6H12O6/c7-2-1-12-6(11)5(10)4(9)3(2)8/h2-11H,1H2/t2-,3-,4+,5-,6+/m1/s1 #alpha-D-Fructofuranose 1,6-bisphosphate InChI=1S/C6H14O12P2/c7-4-3(1-16-19(10,11)12)18-6(9,5(4)8)2-17-20(13,14)15/h3-5,7-9H,1-2H2,(H2,10,11,12)(H2,13,14,15)/t3-,4-,5+,6+/m1/s1 Methyl alpha-D-glucoseptanoside InChI=1S/C7H14O6/c1-12-7-6(11)5(10)4(9)3(8)2-13-7/h3-11H,2H2,1H3/t3-,4-,5+,6-,7+/m1/s1 Methyl alpha-L-altrooxetoside InChI=1S/C7H14O6/c1-12-7-5(11)6(13-7)4(10)3(9)2-8/h3-11H,2H2,1H3/t3-,4-,5+,6-,7+/m0/s1 Methyl beta-D-allooxiroside InChI=1S/C7H14O6/c1-12-7-6(13-7)5(11)4(10)3(9)2-8/h3-11H,2H2,1H3/t3-,4-,5-,6-,7-/m1/s1 #1,2:3,4-Di-O-isopropylidene-alpha-D-galactopyranose 
InChI=1S/C12H20O6/c1-11(2)15-7-6(5-13)14-10-9(8(7)16-11)17-12(3,4)18-10/h6-10,13H,5H2,1-4H3/t6-,7+,8+,9-,10-/m1/s1 #D-Glucaro-1,4:6,3-dilactone InChI=1S/C6H6O6/c7-1-3-4(12-5(1)9)2(8)6(10)11-3/h1-4,7-8H/t1-,2+,3-,4-/m0/s1 D-Arabinitol InChI=1S/C5H12O5/c6-1-3(8)5(10)4(9)2-7/h3-10H,1-2H2/t3-,4-/m1/s1 Xylitol InChI=1S/C5H12O5/c6-1-3(8)5(10)4(9)2-7/h3-10H,1-2H2/t3-,4+,5+ Methyl alpha-L-arabinopyranoside InChI=1S/C6H12O5/c1-10-6-5(9)4(8)3(7)2-11-6/h3-9H,2H2,1H3/t3-,4-,5+,6+/m0/s1 Methyl beta-L-threofuranoside InChI=1S/C5H10O4/c1-8-5-4(7)3(6)2-9-5/h3-7H,2H2,1H3/t3-,4+,5-/m0/s1 Methyl beta-D-galactofuranoside InChI=1S/C7H14O6/c1-12-7-5(11)4(10)6(13-7)3(9)2-8/h3-11H,2H2,1H3/t3-,4-,5-,6+,7-/m1/s1 Methyl L-glycero-alpha-D-manno-heptopyranoside InChI=1S/C8H16O7/c1-14-8-6(13)4(11)5(12)7(15-8)3(10)2-9/h3-13H,2H2,1H3/t3-,4-,5-,6-,7+,8-/m0/s1 Methyl beta-D-fructofuranoside InChI=1S/C7H14O6/c1-12-7(3-9)6(11)5(10)4(2-8)13-7/h4-6,8-11H,2-3H2,1H3/t4-,5-,6+,7-/m1/s1 Methyl 5-acetamido-3,5-dideoxy-D-glycero-beta-D-galacto-non-2-ulopyranosonate InChI=1S/C12H21NO9/c1-5(15)13-8-6(16)3-12(20,11(19)21-2)22-10(8)9(18)7(17)4-14/h6-10,14,16-18,20H,3-4H2,1-2H3,(H,13,15)/t6-,7+,8+,9+,10+,12-/m0/s1 alpha,beta-D-Fructofuranose 6-phosphate InChI=1S/C6H13O9P/c7-2-6(10)5(9)4(8)3(15-6)1-14-16(11,12)13/h3-5,7-10H,1-2H2,(H2,11,12,13)/t3-,4-,5+,6?/m1/s1 #(6S)-1,2-O-isopropylidene-alpha-D-gluco-hexodialdo-1,4:6,3-difuranose D-ribo-Pentose InChI=1S/C5H10O5/c6-1-3(8)5(10)4(9)2-7/h1,3-5,7-10H,2H2/t3-,4+,5-/m0/s1 D-glycero-D-gluco-Heptose InChI=1S/C7H14O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h1,3-7,9-14H,2H2/t3-,4+,5+,6+,7-/m0/s1 L-ribo-D-manno-Nonose InChI=1S/C9H18O9/c10-1-3(12)5(14)7(16)9(18)8(17)6(15)4(13)2-11/h1,3-9,11-18H,2H2/t3-,4+,5-,6+,7+,8+,9+/m1/s1 3,6-Dideoxy-L-threo-L-talo-decose InChI=1/C10H20O8/c11-3-5(13)1-6(14)7(15)2-8(16)10(18)9(17)4-12/h3,5-10,12-18H,1-2,4H2/t5-,6+,7-,8-,9+,10+/m1/s1 L-threo-Tetrodialdose InChI=1S/C4H6O4/c5-1-3(7)4(8)2-6/h1-4,7-8H/t3-,4-/m0/s1 galacto-Hexodialdose 
InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1-6,9-12H/t3-,4+,5+,6- alpha-D-gluco-Hexodialdo-1,5-pyranose InChI=1S/C6H10O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h1-6,8-11H/t2-,3-,4+,5-,6+/m1/s1 (6R)-D-gluco-Hexodialdo-6,2-pyranose InChI=1S/C6H10O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h1-6,8-11H/t2-,3+,4-,5-,6+/m0/s1 #Methyl alpha-D-gluco-hexodialdo-6,3-furanose-1,5-pyranoside InChI=1S/C7H12O6/c1-11-7-3(9)4-2(8)5(13-7)6(10)12-4/h2-10H,1H3/t2-,3+,4-,5-,6?,7-/m0/s1 D-glycero-Tetrulose InChI=1S/C4H8O4/c5-1-3(7)4(8)2-6/h3,5-7H,1-2H2/t3-/m1/s1 D-lyxo-Hex-2-ulose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,5-9,11-12H,1-2H2/t3-,5+,6-/m1/s1 D-altro-Hept-2-ulopyranose InChI=1S/C7H14O7/c8-1-3-4(10)5(11)6(12)7(13,2-9)14-3/h3-6,8-13H,1-2H2/t3-,4-,5-,6+,7?/m1/s1 L-glycero-D-manno-Oct-2-ulose InChI=1S/C8H16O8/c9-1-3(11)5(13)7(15)8(16)6(14)4(12)2-10/h3,5-11,13-16H,1-2H2/t3-,5+,6+,7+,8+/m0/s1 L-gluco-Hept-4-ulose InChI=1S/C7H14O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h3-6,8-13H,1-2H2/t3-,4+,5-,6-/m0/s1 L-erythro-L-gluco-Non-5-ulose InChI=1S/C9H18O9/c10-1-3(12)5(14)7(16)9(18)8(17)6(15)4(13)2-11/h3-8,10-17H,1-2H2/t3-,4+,5-,6-,7-,8+/m0/s1 L-threo-Hexo-2,5-diulose InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h5-8,11-12H,1-2H2/t5-,6-/m0/s1 #meso-xylo-Hepto-2,6-diulose InChI=1S/C7H12O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h5-9,12-14H,1-2H2/t5-,6+,7+ D-threo-Hexo-2,4-diulose InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,6-9,12H,1-2H2/t3-,6-/m1/s1 alpha-D-threo-Hexo-2,4-diulo-2,5-furanose InChI=1S/C6H10O6/c7-1-3-4(9)5(10)6(11,2-8)12-3/h3,5,7-8,10-11H,1-2H2/t3-,5+,6+/m1/s1 L-altro-Octo-4,5-diulose InChI=1S/C8H14O8/c9-1-3(11)5(13)7(15)8(16)6(14)4(12)2-10/h3-6,9-14H,1-2H2/t3-,4-,5-,6+/m0/s1 D-glycero-D-ido-Nono-3,6-diulose InChI=1S/C9H16O9/c10-1-3(12)5(14)7(16)9(18)8(17)6(15)4(13)2-11/h3-5,8-14,17-18H,1-2H2/t3-,4-,5-,8+,9+/m1/s1 D-arabino-Hexos-3-ulose InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,3-4,6,8-10,12H,2H2/t3-,4-,6-/m1/s1 Methyl beta-D-xylo-hexopyranosid-4-ulose 
InChI=1S/C7H12O6/c1-12-7-6(11)5(10)4(9)3(2-8)13-7/h3,5-8,10-11H,2H2,1H3/t3-,5+,6-,7-/m1/s1 2-dehydro-D-glucose InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,4-6,8,10-12H,2H2/t4-,5-,6-/m1/s1 beta-D-fructofuranosyl 3-dehydro-alpha-D-allopyranoside InChI=1S/C12H20O11/c13-1-4-6(16)8(18)9(19)11(21-4)23-12(3-15)10(20)7(17)5(2-14)22-12/h4-7,9-11,13-17,19-20H,1-3H2/t4-,5-,6-,7-,9-,10+,11-,12+/m1/s1 5-dehydro-D-fructose InChI=1S/C6H10O6/c7-1-3(9)5(11)6(12)4(10)2-8/h5-8,11-12H,1-2H2/t5-,6-/m1/s1 3,6-Dideoxy-beta-D-arabino-hexopyranose InChI=1S/C6H12O4/c1-3-4(7)2-5(8)6(9)10-3/h3-9H,2H2,1H3/t3-,4+,5+,6-/m1/s1 beta-Tyvelopyranose InChI=1S/C6H12O4/c1-3-4(7)2-5(8)6(9)10-3/h3-9H,2H2,1H3/t3-,4+,5+,6-/m1/s1 2-deoxyribose InChI=1S/C5H10O4/c6-2-1-4(8)5(9)3-7/h2,4-5,7-9H,1,3H2/t4-,5+/m0/s1 2-deoxy-D-erythro-pentose InChI=1S/C5H10O4/c6-2-1-4(8)5(9)3-7/h2,4-5,7-9H,1,3H2/t4-,5+/m0/s1 2-deoxyglucose InChI=1S/C6H12O5/c7-2-1-4(9)6(11)5(10)3-8/h2,4-6,8-11H,1,3H2/t4-,5-,6+/m1/s1 2-deoxy-D-arabino-hexose InChI=1S/C6H12O5/c7-2-1-4(9)6(11)5(10)3-8/h2,4-6,8-11H,1,3H2/t4-,5-,6+/m1/s1 2-Deoxy-D-erythro-pentofuranose 5-phosphate InChI=1S/C5H11O7P/c6-3-1-5(7)12-4(3)2-11-13(8,9)10/h3-7H,1-2H2,(H2,8,9,10)/t3-,4+,5?/m0/s1 4-Deoxy-beta-D-xylo-hexopyranose InChI=1S/C6H12O5/c7-2-3-1-4(8)5(9)6(10)11-3/h3-10H,1-2H2/t3-,4-,5+,6+/m0/s1 2-Deoxy-D-ribo-hexose InChI=1S/C6H12O5/c7-2-1-4(9)6(11)5(10)3-8/h2,4-6,8-11H,1,3H2/t4-,5+,6-/m0/s1 2-Deoxy-alpha-D-allo-heptopyranose InChI=1S/C7H14O6/c8-2-4(10)7-6(12)3(9)1-5(11)13-7/h3-12H,1-2H2/t3-,4+,5-,6-,7+/m0/s1 1-Deoxy-L-glycero-D-altro-oct-2-ulose InChI=1S/C8H16O7/c1-3(10)5(12)7(14)8(15)6(13)4(11)2-9/h4-9,11-15H,2H2,1H3/t4-,5+,6+,7-,8+/m0/s1 Methyl 3-azido-4-O-benzoyl-6-bromo-2,3,6-trideoxy-2-fluoro-alpha-D-allopyranoside InChI=1S/C14H15BrFN3O4/c1-21-14-10(16)11(18-19-17)12(9(7-15)22-14)23-13(20)8-5-3-2-4-6-8/h2-6,9-12,14H,7H2,1H3/t9-,10-,11+,12-,14+/m1/s1 3-Deoxy-D-ribohexose InChI=1S/C6H12O5/c7-2-4(9)1-5(10)6(11)3-8/h2,4-6,8-11H,1,3H2/t4-,5+,6-/m1/s1 
5-Deoxy-D-arabino-hept-3-ulose InChI=1S/C7H14O6/c8-2-4(10)1-5(11)7(13)6(12)3-9/h4-6,8-12H,1-3H2/t4-,5+,6+/m0/s1 6-Deoxy-L-gluco-oct-2-ulose InChI=1S/C8H16O7/c9-2-4(11)1-5(12)7(14)8(15)6(13)3-10/h4-5,7-12,14-15H,1-3H2/t4-,5+,7-,8-/m1/s1 1-Deoxy-D-arabinitol InChI=1S/C5H12O4/c1-3(7)5(9)4(8)2-6/h3-9H,2H2,1H3/t3-,4-,5+/m1/s1 5-Deoxy-D-arabinitol InChI=1S/C5H12O4/c1-3(7)5(9)4(8)2-6/h3-9H,2H2,1H3/t3-,4-,5-/m1/s1 5-Acetamido-3,5-dideoxy-D-glycero-alpha-D-galacto-non-2-ulopyranosonic acid InChI=1S/C11H19NO9/c1-4(14)12-7-5(15)2-11(20,10(18)19)21-9(7)8(17)6(16)3-13/h5-9,13,15-17,20H,2-3H2,1H3,(H,12,14)(H,18,19)/t5-,6+,7+,8+,9+,11+/m0/s1 N-acetyl-alpha-neuraminic acid InChI=1S/C11H19NO9/c1-4(14)12-7-5(15)2-11(20,10(18)19)21-9(7)8(17)6(16)3-13/h5-9,13,15-17,20H,2-3H2,1H3,(H,12,14)(H,18,19)/t5-,6+,7+,8+,9+,11+/m0/s1 2-Amino-3-O-[(R)-1-carboxyethyl]-2-deoxy-beta-D-glucopyranose InChI=1S/C9H17NO7/c1-3(8(13)14)16-7-5(10)9(15)17-4(2-11)6(7)12/h3-7,9,11-12,15H,2,10H2,1H3,(H,13,14)/t3-,4-,5-,6-,7-,9-/m1/s1 beta-muramic acid InChI=1S/C9H17NO7/c1-3(8(13)14)16-7-5(10)9(15)17-4(2-11)6(7)12/h3-7,9,11-12,15H,2,10H2,1H3,(H,13,14)/t3-,4-,5-,6-,7-,9-/m1/s1 2-Deoxy-2-methylamino-L-glucopyranose InChI=1S/C7H15NO5/c1-8-4-6(11)5(10)3(2-9)13-7(4)12/h3-12H,2H2,1H3/t3-,4-,5-,6-,7?/m0/s1 4,6-Dideoxy-4-formamido-2,3-di-O-methyl-D-mannopyranose InChI=1S/C9H17NO5/c1-5-6(10-4-11)7(13-2)8(14-3)9(12)15-5/h4-9,12H,1-3H3,(H,10,11)/t5-,6-,7+,8+,9?/m1/s1 2-Acetamido-1,3,4-tri-O-acetyl-2,6-dideoxy-alpha-L-galactopyranose InChI=1S/C14H21NO8/c1-6-12(21-8(3)17)13(22-9(4)18)11(15-7(2)16)14(20-6)23-10(5)19/h6,11-14H,1-5H3,(H,15,16)/t6-,11-,12+,13-,14-/m0/s1 2,3,4,6-Tetra-O-acetyl-1-thio-beta-D-glucopyranose InChI=1S/C14H20O9S/c1-6(15)19-5-10-11(20-7(2)16)12(21-8(3)17)13(14(24)23-10)22-9(4)18/h10-14,24H,5H2,1-4H3/t10-,11-,12+,13-,14+/m1/s1 5-Thio-beta-D-glucopyranose InChI=1S/C6H12O5S/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6-/m1/s1 #Methyl 2,3,4-tri-O-acetyl-1-thio-6-O-trityl-alpha-D-glucopyranoside 
InChI=1S/C32H34O8S/c1-21(33)37-28-27(40-31(41-4)30(39-23(3)35)29(28)38-22(2)34)20-36-32(24-14-8-5-9-15-24,25-16-10-6-11-17-25)26-18-12-7-13-19-26/h5-19,27-31H,20H2,1-4H3/t27-,28-,29+,30-,31-/m1/s1 Methyl 4-seleno-alpha-D-xylofuranoside InChI=1S/C6H12O4Se/c1-10-6-5(9)4(8)3(2-7)11-6/h3-9H,2H2,1H3/t3-,4+,5-,6+/m1/s1 4-Thio-beta-D-galactopyranose InChI=1S/C6H12O5S/c7-1-2-5(12)3(8)4(9)6(10)11-2/h2-10,12H,1H2/t2-,3-,4-,5+,6-/m1/s1 #alpha-D-Glucopyranosyl phenyl (R)-selenoxide InChI=1S/C12H16O6Se/c13-6-8-9(14)10(15)11(16)12(18-8)19(17)7-4-2-1-3-5-7/h1-5,8-16H,6H2/t8-,9-,10+,11-,12-,19-/m1/s1 Methyl 5-seleno-alpha-D-fructofuranoside InChI=1S/C7H14O5Se/c1-12-7(3-9)6(11)5(10)4(2-8)13-7/h4-6,8-11H,2-3H2,1H3/t4-,5-,6+,7+/m1/s1 #Ethyl 3,4,6,7-tetra-O-acetyl-2-deoxy-1,5-dithio-alpha-D-gluco-heptopyranoside InChI=1S/C17H26O8S2/c1-6-26-15-7-13(23-10(3)19)16(25-12(5)21)17(27-15)14(24-11(4)20)8-22-9(2)18/h13-17H,6-8H2,1-5H3/t13-,14+,15-,16+,17+/m0/s1 2-C-Phenyl-alpha-D-glucopyranose InChI=1S/C12H16O6/c13-6-8-9(14)10(15)12(17,11(16)18-8)7-4-2-1-3-5-7/h1-5,8-11,13-17H,6H2/t8-,9-,10+,11+,12-/m1/s1 2-C-Acetamido-2,3,4,6-tetra-O-acetyl-alpha-D-mannopyranosyl fluoride InChI=1S/C16H22FNO10/c1-7(19)18-16(28-11(5)23)14(26-10(4)22)13(25-9(3)21)12(27-15(16)17)6-24-8(2)20/h12-15H,6H2,1-5H3,(H,18,19)/t12-,13-,14+,15+,16+/m1/s1 (5R)-1,2,3,4-Tetra-O-acetyl-5-bromo-alpha-D-xylo-hexopyranuronic acid InChI=1S/C14H17BrO11/c1-5(16)22-9-10(23-6(2)17)12(25-8(4)19)26-14(15,13(20)21)11(9)24-7(3)18/h9-12H,1-4H3,(H,20,21)/t9-,10-,11+,12+,14+/m1/s1 1,2,3,4-tetra-O-acetyl-5-bromo-beta-L-idopyranuronic acid InChI=1S/C14H17BrO11/c1-5(16)22-9-10(23-6(2)17)12(25-8(4)19)26-14(15,13(20)21)11(9)24-7(3)18/h9-12H,1-4H3,(H,20,21)/t9-,10-,11+,12+,14-/m1/s1 2-Deoxy-2-phenyl-alpha-D-glucopyranose InChI=1S/C12H16O5/c13-6-8-10(14)11(15)9(12(16)17-8)7-4-2-1-3-5-7/h1-5,8-16H,6H2/t8-,9-,10-,11-,12+/m1/s1 2-deoxy-2-C-phenyl-alpha-D-glucopyranose 
InChI=1S/C12H16O5/c13-6-8-10(14)11(15)9(12(16)17-8)7-4-2-1-3-5-7/h1-5,8-16H,6H2/t8-,9-,10-,11-,12+/m1/s1 (2R)-2-deoxy-2-phenyl-alpha-D-arabino-hexopyranose InChI=1S/C12H16O5/c13-6-8-10(14)11(15)9(12(16)17-8)7-4-2-1-3-5-7/h1-5,8-16H,6H2/t8-,9-,10-,11-,12+/m1/s1 2,3-Diazido-4-O-benzoyl-6-bromo-2,3,6-trideoxy-alpha-D-mannopyranosyl nitrate InChI=1S/C13H12BrN7O6/c14-6-8-11(26-12(22)7-4-2-1-3-5-7)9(17-19-15)10(18-20-16)13(25-8)27-21(23)24/h1-5,8-11,13H,6H2/t8-,9-,10+,11-,13-/m1/s1 (2R)-2-Bromo-2-chloro-2-deoxy-alpha-D-arabino-hexopyranose InChI=1S/C6H10BrClO5/c7-6(8)4(11)3(10)2(1-9)13-5(6)12/h2-5,9-12H,1H2/t2-,3-,4+,5+,6+/m1/s1 2-bromo-2-chloro-2-deoxy-alpha-D-glucopyranose InChI=1S/C6H10BrClO5/c7-6(8)4(11)3(10)2(1-9)13-5(6)12/h2-5,9-12H,1H2/t2-,3-,4+,5+,6+/m1/s1 (5R)-5-C-Cyclohexyl-5-C-phenyl-D-xylose InChI=1S/C17H24O5/c18-11-14(19)15(20)16(21)17(22,12-7-3-1-4-8-12)13-9-5-2-6-10-13/h1,3-4,7-8,11,13-16,19-22H,2,5-6,9-10H2/t14-,15+,16-,17-/m0/s1 1-C-Phenyl-D-glucose InChI=1S/C12H16O6/c13-6-8(14)10(16)12(18)11(17)9(15)7-4-2-1-3-5-7/h1-5,8,10-14,16-18H,6H2/t8-,10-,11+,12+/m1/s1 1-C-Phenyl-beta-D-glucopyranose InChI=1S/C12H16O6/c13-6-8-9(14)10(15)11(16)12(17,18-8)7-4-2-1-3-5-7/h1-5,8-11,13-17H,6H2/t8-,9-,10+,11-,12-/m1/s1 1-Deoxy-1-(methylimino)-D-xylitol InChI=1S/C6H13NO4/c1-7-2-4(9)6(11)5(10)3-8/h2,4-6,8-11H,3H2,1H3/t4-,5+,6+/m0/s1 D-Glucose phenylhydrazone InChI=1S/C12H18N2O5/c15-7-10(17)12(19)11(18)9(16)6-13-14-8-4-2-1-3-5-8/h1-6,9-12,14-19H,7H2/t9-,10+,11+,12+/m0/s1 D-arabino-Hexos-2-ulose bis(phenylhydrazone) InChI=1S/C18H22N4O4/c23-12-16(24)18(26)17(25)15(22-21-14-9-5-2-6-10-14)11-19-20-13-7-3-1-4-8-13/h1-11,16-18,20-21,23-26H,12H2/t16-,17-,18-/m1/s1 #D-arabino-hex-2-ulose phenylosazone InChI=1S/C18H22N4O4/c23-12-16(24)18(26)17(25)15(22-21-14-9-5-2-6-10-14)11-19-20-13-7-3-1-4-8-13/h1-11,16-18,20-21,23-26H,12H2/t16-,17-,18-/m1/s1 #D-arabino-Hexos-2-ulose phenylosotriazole 
InChI=1S/C12H15N3O4/c16-7-10(17)12(19)11(18)9-6-13-15(14-9)8-4-2-1-3-5-8/h1-6,10-12,16-19H,7H2/t10-,11-,12-/m1/s1 (1R)-1-(2-phenyl-2H-1,2,3-triazol-4-yl)-D-erythritol InChI=1S/C12H15N3O4/c16-7-10(17)12(19)11(18)9-6-13-15(14-9)8-4-2-1-3-5-8/h1-6,10-12,16-19H,7H2/t10-,11-,12-/m1/s1 #2-phenyl-4-(D-arabino-1,2,3,4-tetrahydroxybutyl)-2H-1,2,3-triazole InChI=1S/C12H15N3O4/c16-7-10(17)12(19)11(18)9-6-13-15(14-9)8-4-2-1-3-5-8/h1-6,10-12,16-19H,7H2/t10-,11-,12-/m1/s1 1,5-Anhydro-2-deoxy-D-arabino-hex-1-enitol InChI=1S/C6H10O4/c7-3-5-6(9)4(8)1-2-10-5/h1-2,4-9H,3H2/t4-,5-,6+/m1/s1 D-glucal InChI=1S/C6H10O4/c7-3-5-6(9)4(8)1-2-10-5/h1-2,4-9H,3H2/t4-,5-,6+/m1/s1 Methyl 2-deoxy-D-threo-pent-1-enofuranoside InChI=1S/C6H10O4/c1-9-6-2-4(8)5(3-7)10-6/h2,4-5,7-8H,3H2,1H3/t4-,5-/m1/s1 1-(2-Deoxy-D-threo-pent-1-enofuranosyl)uracil InChI=1S/C9H10N2O5/c12-4-6-5(13)3-8(16-6)11-2-1-7(14)10-9(11)15/h1-3,5-6,12-13H,4H2,(H,10,14,15)/t5-,6-/m1/s1 3,4-Di-O-acetyl-2-deoxy-D-erythro-pent-1-enopyranosyl chloride InChI=1S/C9H11ClO5/c1-5(11)14-7-3-9(10)13-4-8(7)15-6(2)12/h3,7-8H,4H2,1-2H3/t7-,8+/m0/s1 2,6-Anhydro-1-deoxy-D-altro-hept-1-enitol InChI=1S/C7H12O5/c1-3-5(9)7(11)6(10)4(2-8)12-3/h4-11H,1-2H2/t4-,5-,6-,7+/m1/s1 1,2-Dideoxy-D-arabino-hex-1-enitol InChI=1S/C6H12O4/c1-2-4(8)6(10)5(9)3-7/h2,4-10H,1,3H2/t4-,5-,6+/m1/s1 (Z)-1,2,3,4,5-Penta-O-acetyl-D-erythro-pent-1-enitol InChI=1S/C15H20O10/c1-8(16)21-6-13(23-10(3)18)15(25-12(5)20)14(24-11(4)19)7-22-9(2)17/h6,14-15H,7H2,1-5H3/b13-6-/t14-,15+/m1/s1 (E)-1,2,3,4,5-Penta-O-acetyl-D-erythro-pent-1-enitol InChI=1S/C15H20O10/c1-8(16)21-6-13(23-10(3)18)15(25-12(5)20)14(24-11(4)19)7-22-9(2)17/h6,14-15H,7H2,1-5H3/b13-6+/t14-,15+/m1/s1 2-Deoxy-D-threo-pent-1-enose dimethyl acetal InChI=1S/C7H14O5/c1-11-7(12-2)3-5(9)6(10)4-8/h3,5-6,8-10H,4H2,1-2H3/t5-,6-/m1/s1 2,3-Dideoxy-alpha-D-erythro-hex-2-enopyranose InChI=1S/C6H10O4/c7-3-5-4(8)1-2-6(9)10-5/h1-2,4-9H,3H2/t4-,5+,6-/m0/s1 1,2,4,5-Tetradeoxy-D-arabino-octa-1,4-dienitol 
InChI=1S/C8H14O4/c1-2-6(10)3-4-7(11)8(12)5-9/h2-4,6-12H,1,5H2/t6-,7+,8-/m1/s1 1,5-Anhydro-4-deoxy-D-erythro-hex-4-enitol InChI=1S/C6H10O4/c7-2-4-1-5(8)6(9)3-10-4/h1,5-9H,2-3H2/t5-,6+/m1/s1 Methyl 3,4-dideoxy-beta-D-glycero-hex-3-en-2-ulopyranoside InChI=1S/C7H12O4/c1-10-7(5-8)3-2-6(9)4-11-7/h2-3,6,8-9H,4-5H2,1H3/t6-,7-/m0/s1 2,3-Dideoxy-alpha-D-glycero-hex-2-enopyranos-4-ulose InChI=1S/C6H8O4/c7-3-5-4(8)1-2-6(9)10-5/h1-2,5-7,9H,3H2/t5-,6+/m1/s1 Methyl 3,4-dideoxy-beta-D-glycero-hept-3-en-2-ulopyranosid-5-ulose InChI=1S/C8H12O5/c1-12-8(5-10)3-2-6(11)7(4-9)13-8/h2-3,7,9-10H,4-5H2,1H3/t7-,8+/m1/s1 #5-Deoxy-1,2-O-isopropylidene-beta-L-threo-pent-4-enofuranose #5,6-Dideoxy-1,2-O-isopropylidene-alpha-D-xylo-hex-5-enofuranose (Z)-1,7-Anhydro-2,5,6-trideoxy-D-xylo-oct-5-en-1-ynitol InChI=1S/C8H10O4/c9-5-6-1-2-7(10)8(11)3-4-12-6/h1-2,6-11H,5H2/b2-1-/t6-,7-,8-/m0/s1 (Z)-1,7-anhydro-1,1,2,2-tetradehydro-2,5,6-trideoxy-D-xylo-oct-5-enitol InChI=1S/C8H10O4/c9-5-6-1-2-7(10)8(11)3-4-12-6/h1-2,6-11H,5H2/b2-1-/t6-,7-,8-/m0/s1 2-Deoxy-1-O-methyl-D-threo-pent-1-ynitol InChI=1S/C6H10O4/c1-10-3-2-5(8)6(9)4-7/h5-9H,4H2,1H3/t5-,6-/m1/s1 1,1,2,2-tetradehydro-2-deoxy-1-O-methyl-D-threo-pentitol InChI=1S/C6H10O4/c1-10-3-2-5(8)6(9)4-7/h5-9H,4H2,1H3/t5-,6-/m1/s1 #6,7,8-Trideoxy-1,2:3,4-di-O-isopropylidene-alpha-D-galacto-octa-6,7-dienopyranose #6,7,7,8-tetradehydro-6,7,8-trideoxy-1,2:3,4-di-O-isopropylidene-alpha-D-galacto-octopyranose 6-O-Acetyl-5-deoxy-alpha-D-xylo-hex-5-ynofuranose InChI=1S/C8H10O6/c1-4(9)13-3-2-5-6(10)7(11)8(12)14-5/h5-8,10-12H,1H3/t5-,6+,7-,8+/m1/s1 6-O-acetyl-5,5,6,6-tetradehydro-5-deoxy-alpha-D-xylo-hexofuranose InChI=1S/C8H10O6/c1-4(9)13-3-2-5-6(10)7(11)8(12)14-5/h5-8,10-12H,1H3/t5-,6+,7-,8+/m1/s1 #5,6,8-Trideoxy-1,2-O-isopropylidene-alpha-D-xylo-oct-5-ynofuranos-7-ulose #5,5,6,6-tetradehydro-5,6,8-trideoxy-1,2-O-isopropylidene-alpha-D-xylo-octofuranos-7-ulose Hamamelose InChI=1S/C6H12O6/c7-1-4(10)5(11)6(12,2-8)3-9/h2,4-5,7,9-12H,1,3H2/t4-,5-,6-/m1/s1 
2-C-(Hydroxymethyl)-D-ribopyranose InChI=1S/C6H12O6/c7-2-6(11)4(9)3(8)1-12-5(6)10/h3-5,7-11H,1-2H2/t3-,4-,5?,6-/m1/s1 Cladinose InChI=1S/C8H16O4/c1-6(10)7(11)8(2,12-3)4-5-9/h5-7,10-11H,4H2,1-3H3/t6-,7-,8+/m0/s1 2,6-Dideoxy-3-C-methyl-3-O-methyl-L-ribo-hexopyranose InChI=1S/C8H16O4/c1-5-7(10)8(2,11-3)4-6(9)12-5/h5-7,9-10H,4H2,1-3H3/t5-,6?,7-,8+/m0/s1 Streptose InChI=1S/C6H10O5/c1-4(9)6(11,3-8)5(10)2-7/h2-5,9-11H,1H3/t4-,5-,6+/m0/s1 5-Deoxy-3-C-formyl-L-lyxofuranose InChI=1S/C6H10O5/c1-3-6(10,2-7)4(8)5(9)11-3/h2-5,8-10H,1H3/t3-,4-,5?,6+/m0/s1 6-Deoxy-3-C-methyl-D-mannopyranose InChI=1S/C7H14O5/c1-3-4(8)7(2,11)5(9)6(10)12-3/h3-6,8-11H,1-2H3/t3-,4-,5-,6?,7+/m1/s1 Evalose InChI=1S/C7H14O5/c1-4(9)6(11)7(2,12)5(10)3-8/h3-6,9-12H,1-2H3/t4-,5-,6-,7-/m1/s1 2,3,6-Trideoxy-3-C-methyl-4-O-methyl-3-nitro-L-arabino-hexopyranose InChI=1S/C8H15NO5/c1-5-7(13-3)8(2,9(11)12)4-6(10)14-5/h5-7,10H,4H2,1-3H3/t5-,6?,7-,8-/m0/s1 Evernitrose InChI=1S/C8H15NO5/c1-6(11)7(14-3)8(2,4-5-10)9(12)13/h5-7,11H,4H2,1-3H3/t6-,7-,8-/m0/s1 3-Deoxy-4-C-methyl-3-methylamino-L-arabinopyranose InChI=1S/C7H15NO4/c1-7(11)3-12-6(10)4(9)5(7)8-2/h4-6,8-11H,3H2,1-2H3/t4-,5-,6?,7+/m1/s1 Garosamine InChI=1S/C7H15NO4/c1-7(11)3-12-6(10)4(9)5(7)8-2/h4-6,8-11H,3H2,1-2H3/t4-,5-,6?,7+/m1/s1 D-Apiose InChI=1S/C5H10O5/c6-1-4(9)5(10,2-7)3-8/h1,4,7-10H,2-3H2/t4-/m0/s1 3-C-(Hydroxymethyl)-D-glycero-tetrose InChI=1S/C5H10O5/c6-1-4(9)5(10,2-7)3-8/h1,4,7-10H,2-3H2/t4-/m0/s1 3-C-(Hydroxymethyl)-alpha-D-erythrofuranose InChI=1S/C5H10O5/c6-1-5(9)2-10-4(8)3(5)7/h3-4,6-9H,1-2H2/t3-,4-,5+/m0/s1 D-apio-alpha-D-furanose InChI=1S/C5H10O5/c6-1-5(9)2-10-4(8)3(5)7/h3-4,6-9H,1-2H2/t3-,4-,5+/m0/s1 (3R)-alpha-D-apiofuranose InChI=1S/C5H10O5/c6-1-5(9)2-10-4(8)3(5)7/h3-4,6-9H,1-2H2/t3-,4-,5+/m0/s1 3-C-Methyl-D-glucose InChI=1S/C7H14O6/c1-7(13,5(11)3-9)6(12)4(10)2-8/h3-6,8,10-13H,2H2,1H3/t4-,5+,6-,7-/m1/s1 3-Deoxy-3-methyl-D-glucose InChI=1S/C7H14O5/c1-4(5(10)2-8)7(12)6(11)3-9/h2,4-7,9-12H,3H2,1H3/t4-,5+,6-,7+/m1/s1 
2,3,6-Trideoxy-3-C-methyl-4-O-methyl-3-nitro-D-lyxo-hexopyranose InChI=1S/C8H15NO5/c1-5-7(13-3)8(2,9(11)12)4-6(10)14-5/h5-7,10H,4H2,1-3H3/t5-,6?,7+,8-/m1/s1 4-Cyclohexyl-4-deoxy-4-(hydroxymethyl)-D-allose InChI=1S/C13H24O6/c14-6-10(17)12(19)13(8-16,11(18)7-15)9-4-2-1-3-5-9/h6,9-12,15-19H,1-5,7-8H2/t10-,11+,12-,13-/m0/s1 (4R)-4-cyclohexyl-4-deoxy-4-(hydroxymethyl)-D-ribo-hexose InChI=1S/C13H24O6/c14-6-10(17)12(19)13(8-16,11(18)7-15)9-4-2-1-3-5-9/h6,9-12,15-19H,1-5,7-8H2/t10-,11+,12-,13-/m0/s1 3-Deoxy-3,3-dimethyl-D-ribo-hexose InChI=1S/C8H16O5/c1-8(2,6(12)4-10)7(13)5(11)3-9/h4-7,9,11-13H,3H2,1-2H3/t5-,6+,7-/m1/s1 4-C-(Hydroxymethyl)-D-erythro-pentose InChI=1S/C6H12O6/c7-1-4(10)5(11)6(12,2-8)3-9/h1,4-5,8-12H,2-3H2/t4-,5-/m0/s1 3-Deoxy-3-[(1R,2S)-1,2-dihydroxy-3-oxopropyl]-D-glycero-D-altro-heptopyranose InChI=1S/C10H18O9/c11-1-3(13)6(15)5-7(16)9(4(14)2-12)19-10(18)8(5)17/h1,3-10,12-18H,2H2/t3-,4-,5-,6+,7+,8+,9-,10?/m1/s1 #3-deoxy-3-(D-threo-1,2-dihydroxy-3-oxopropyl)-D-glycero-D-altro-heptopyranose InChI=1S/C10H18O9/c11-1-3(13)6(15)5-7(16)9(4(14)2-12)19-10(18)8(5)17/h1,3-10,12-18H,2H2/t3-,4-,5-,6+,7+,8+,9-,10?/m1/s1 #4-Deoxy-4-(L-ribo-1,2,3,4-tetrahydroxybutyl)-D-altro-hexodialdose 4-Deoxy-4-[(1R,2R)-1,2-dihydroxy-3-oxopropyl]-D-altro-heptulo-2,5-furanose InChI=1S/C10H18O9/c11-1-4(14)7(16)6-8(5(15)2-12)19-10(18,3-13)9(6)17/h1,4-9,12-18H,2-3H2/t4-,5+,6+,7-,8+,9-,10?/m0/s1 #4-deoxy-4-(D-erythro-1,2-dihydroxy-3-oxopropyl)-D-allo-heptulo-2,5-furanose InChI=1S/C10H18O9/c11-1-4(14)7(16)6-8(5(15)2-12)19-10(18,3-13)9(6)17/h1,4-9,12-18H,2-3H2/t4-,5+,6+,7-,8+,9-,10?/m0/s1 4-Deoxy-4-[(1R,2S)-1,2,3-trihydroxypropyl]-L-talo-heptos-6-ulose InChI=1S/C10H18O9/c11-1-4(14)8(17)7(9(18)5(15)2-12)10(19)6(16)3-13/h1,4-5,7-10,12-15,17-19H,2-3H2/t4-,5-,7+,8-,9-,10-/m0/s1 #4-deoxy-4-(L-erythro-1,2,3-trihydroxypropyl)-L-talo-heptos-6-ulose InChI=1S/C10H18O9/c11-1-4(14)8(17)7(9(18)5(15)2-12)10(19)6(16)3-13/h1,4-5,7-10,12-15,17-19H,2-3H2/t4-,5-,7+,8-,9-,10-/m0/s1 
#4,6-Dideoxy-3-C-(D-glycero-1-hydroxyethyl)-D-ribo-hexose 3,4-Dideoxy-3-[3-hydroxy-2-(hydroxymethyl)propyl]-4-C-methyl-L-mannose InChI=1S/C11H22O6/c1-7(10(16)5-14)9(11(17)6-15)2-8(3-12)4-13/h6-14,16-17H,2-5H2,1H3/t7-,9-,10+,11+/m1/s1 D-glycero-D-galacto-Heptitol InChI=1S/C7H16O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h3-14H,1-2H2/t3-,4+,5-,6-,7+/m1/s1 D-erythro-L-galacto-Octitol InChI=1S/C8H18O8/c9-1-3(11)5(13)7(15)8(16)6(14)4(12)2-10/h3-16H,1-2H2/t3-,4-,5-,6+,7+,8+/m1/s1 D-Arabinitol InChI=1S/C5H12O5/c6-1-3(8)5(10)4(9)2-7/h3-10H,1-2H2/t3-,4-/m1/s1 D-Glucitol InChI=1S/C6H14O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3-12H,1-2H2/t3-,4+,5-,6-/m1/s1 sorbitol InChI=1S/C6H14O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3-12H,1-2H2/t3-,4+,5-,6-/m1/s1 L-Fucitol InChI=1S/C6H14O5/c1-3(8)5(10)6(11)4(9)2-7/h3-11H,2H2,1H3/t3-,4+,5+,6-/m0/s1 1-deoxy-D-galactitol InChI=1S/C6H14O5/c1-3(8)5(10)6(11)4(9)2-7/h3-11H,2H2,1H3/t3-,4+,5+,6-/m0/s1 L-Rhamnitol InChI=1S/C6H14O5/c1-3(8)5(10)6(11)4(9)2-7/h3-11H,2H2,1H3/t3-,4-,5-,6-/m0/s1 1-deoxy-L-mannitol InChI=1S/C6H14O5/c1-3(8)5(10)6(11)4(9)2-7/h3-11H,2H2,1H3/t3-,4-,5-,6-/m0/s1 #meso-Erythritol InChI=1S/C4H10O4/c5-1-3(7)4(8)2-6/h3-8H,1-2H2/t3-,4+ #meso-Ribitol InChI=1S/C5H12O5/c6-1-3(8)5(10)4(9)2-7/h3-10H,1-2H2/t3-,4+,5+ #meso-Galactitol InChI=1S/C6H14O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3-12H,1-2H2/t3-,4+,5+,6- #meso-D-glycero-L-ido-Heptitol InChI=1S/C7H16O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h3-14H,1-2H2/t3-,4+,5+,6-,7- 5-O-Methyl-D-galactitol InChI=1S/C7H16O6/c1-13-5(3-9)7(12)6(11)4(10)2-8/h4-12H,2-3H2,1H3/t4-,5+,6+,7-/m0/s1 D-Gluconic acid InChI=1S/C6H12O7/c7-1-2(8)3(9)4(10)5(11)6(12)13/h2-5,7-11H,1H2,(H,12,13)/t2-,3-,4+,5-/m1/s1 D-Gluconate InChI=1S/C6H12O7/c7-1-2(8)3(9)4(10)5(11)6(12)13/h2-5,7-11H,1H2,(H,12,13)/p-1/t2-,3-,4+,5-/m1/s1 2-Amino-2-deoxy-D-gluconic acid InChI=1S/C6H13NO6/c7-3(6(12)13)5(11)4(10)2(9)1-8/h2-5,8-11H,1,7H2,(H,12,13)/t2-,3-,4-,5-/m1/s1 Methyl D-gluconate InChI=1S/C7H14O7/c1-14-7(13)6(12)5(11)4(10)3(9)2-8/h3-6,8-12H,2H2,1H3/t3-,4-,5+,6-/m1/s1 
Isopropyl 3,4-di-O-methyl-L-mannonate InChI=1S/C11H22O7/c1-6(2)18-11(15)8(14)10(17-4)9(16-3)7(13)5-12/h6-10,12-14H,5H2,1-4H3/t7-,8+,9-,10-/m0/s1 N,N-Dimethyl-L-xylonamide InChI=1S/C7H15NO5/c1-8(2)7(13)6(12)5(11)4(10)3-9/h4-6,9-12H,3H2,1-2H3/t4-,5+,6-/m0/s1 Methyl 3-deoxy-D-threo-pentonate InChI=1S/C6H12O5/c1-11-6(10)5(9)2-4(8)3-7/h4-5,7-9H,2-3H2,1H3/t4-,5-/m0/s1 #Methyl tetra-O-acetyl-L-arabinonate InChI=1S/C14H20O10/c1-7(15)21-6-11(22-8(2)16)12(23-9(3)17)13(14(19)20-5)24-10(4)18/h11-13H,6H2,1-5H3/t11-,12-,13+/m0/s1 D-Glucono-1,4-lactone InChI=1S/C6H10O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2-5,7-10H,1H2/t2-,3-,4-,5-/m1/s1 #D-Gluconic acid gamma-lactone InChI=1S/C6H10O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2-5,7-10H,1H2/t2-,3-,4-,5-/m1/s1 D-Glucono-1,5-lactone InChI=1S/C6H10O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-5,7-10H,1H2/t2-,3-,4+,5-/m1/s1 #D-Gluconic acid delta-lactone InChI=1S/C6H10O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-5,7-10H,1H2/t2-,3-,4+,5-/m1/s1 3-Deoxy-D-ribo-hexono-1,5-lactone InChI=1S/C6H10O5/c7-2-5-3(8)1-4(9)6(10)11-5/h3-5,7-9H,1-2H2/t3-,4+,5+/m0/s1 #5-Amino-5-deoxy-D-mannono-1,5-lactam InChI=1S/C6H11NO5/c8-1-2-3(9)4(10)5(11)6(12)7-2/h2-5,8-11H,1H2,(H,7,12)/t2-,3-,4+,5+/m1/s1 #Penta-O-acetyl-D-gluconoyl chloride InChI=1S/C16H21ClO11/c1-7(18)24-6-12(25-8(2)19)13(26-9(3)20)14(27-10(4)21)15(16(17)23)28-11(5)22/h12-15H,6H2,1-5H3/t12-,13-,14+,15-/m1/s1 D-erythro-Pent-2-ulosonic acid InChI=1S/C5H8O6/c6-1-2(7)3(8)4(9)5(10)11/h2-3,6-8H,1H2,(H,10,11)/t2-,3-/m1/s1 D-arabino-Hex-5-ulosonic acid InChI=1S/C6H10O7/c7-1-2(8)3(9)4(10)5(11)6(12)13/h3-5,7,9-11H,1H2,(H,12,13)/t3-,4-,5+/m1/s1 alpha-D-arabino-Hex-2-ulopyranosonic acid InChI=1S/C6H10O7/c7-2-1-13-6(12,5(10)11)4(9)3(2)8/h2-4,7-9,12H,1H2,(H,10,11)/t2-,3-,4+,6-/m1/s1 3-Deoxy-alpha-D-manno-oct-2-ulopyranosonic acid InChI=1S/C8H14O8/c9-2-4(11)6-5(12)3(10)1-8(15,16-6)7(13)14/h3-6,9-12,15H,1-2H2,(H,13,14)/t3-,4-,5-,6-,8-/m1/s1 #Ethyl (methyl alpha-D-arabino-hex-2-ulopyranosid)onate beta-D-arabino-Hex-2-ulopyranosono-1,5-lactone
InChI=1S/C6H8O6/c7-3-2-1-11-6(10,4(3)8)5(9)12-2/h2-4,7-8,10H,1H2/t2-,3-,4+,6+/m1/s1 #Indol-3-yl D-xylo-hex-5-ulofuranosonate L-xylo-Hex-2-ulosono-1,4-lactone InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2-3,5,7-9H,1H2/t2-,3+,5+/m0/s1 L-threo-Hex-2-enono-1,4-lactone InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1 L-lyxo-Hex-2-ulosono-1,4-lactone InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2-3,5,7-9H,1H2/t2-,3-,5+/m0/s1 D-Glucuronic acid InChI=1S/C6H10O7/c7-1-2(8)3(9)4(10)5(11)6(12)13/h1-5,8-11H,(H,12,13)/t2-,3+,4-,5-/m0/s1 alpha-D-Mannopyranuronic acid InChI=1S/C6H10O7/c7-1-2(8)4(5(10)11)13-6(12)3(1)9/h1-4,6-9,12H,(H,10,11)/t1-,2-,3-,4-,6-/m0/s1 Phenyl beta-D-glucopyranosiduronic acid InChI=1S/C12H14O7/c13-7-8(14)10(11(16)17)19-12(9(7)15)18-6-4-2-1-3-5-6/h1-5,7-10,12-15H,(H,16,17)/t7-,8-,9+,10-,12+/m0/s1 Methyl alpha-L-idopyranosiduronic acid InChI=1S/C7H12O7/c1-13-7-4(10)2(8)3(9)5(14-7)6(11)12/h2-5,7-10H,1H3,(H,11,12)/t2-,3-,4+,5+,7+/m0/s1 Methyl 2,3,4-tri-O-acetyl-alpha-D-glucopyranosyluronate bromide InChI=1S/C13H17BrO9/c1-5(15)20-8-9(21-6(2)16)11(13(18)19-4)23-12(14)10(8)22-7(3)17/h8-12H,1-4H3/t8-,9-,10+,11-,12-/m0/s1 Methyl alpha-L-glucofuranosidurononitrile InChI=1S/C7H11NO5/c1-12-7-5(11)4(10)6(13-7)3(9)2-8/h3-7,9-11H,1H3/t3-,4-,5-,6-,7+/m0/s1 #Sodium (methyl alpha-L-glucofuranosid)uronate Ethyl 2,3,5-tri-O-benzoyl-alpha-D-mannofuranuronate InChI=1S/C29H26O10/c1-2-35-28(33)23(38-26(31)19-14-8-4-9-15-19)21-22(36-25(30)18-12-6-3-7-13-18)24(29(34)37-21)39-27(32)20-16-10-5-11-17-20/h3-17,21-24,29,34H,2H2,1H3/t21-,22-,23-,24-,29-/m0/s1 D-Glucurono-6,3-lactone InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h1-5,8-10H/t2-,3+,4-,5+/m0/s1 D-Glucofuranurono-6,3-lactone InChI=1S/C6H8O6/c7-1-3-4(12-5(1)9)2(8)6(10)11-3/h1-5,7-9H/t1-,2+,3-,4-,5?/m1/s1 Methyl alpha-D-glucofuranosidurono-6,3-lactone InChI=1S/C7H10O6/c1-11-7-3(9)5-4(13-7)2(8)6(10)12-5/h2-5,7-9H,1H3/t2-,3+,4+,5+,7-/m0/s1 4-Deoxy-L-threo-hex-4-enopyranuronic acid
InChI=1S/C6H8O6/c7-2-1-3(5(9)10)12-6(11)4(2)8/h1-2,4,6-8,11H,(H,9,10)/t2-,4+,6?/m0/s1 Methyl 4-deoxy-L-threo-hex-4-enopyranuronate InChI=1S/C7H10O6/c1-12-6(10)4-2-3(8)5(9)7(11)13-4/h2-3,5,7-9,11H,1H3/t3-,5+,7?/m0/s1 Methyl 4-deoxy-alpha-L-threo-hex-4-enopyranosiduronic acid InChI=1S/C7H10O6/c1-12-7-5(9)3(8)2-4(13-7)6(10)11/h2-3,5,7-9H,1H3,(H,10,11)/t3-,5+,7+/m0/s1 #Methyl (phenyl 4-deoxy-beta-L-threo-hex-4-enopyranosid)uronate Methyl beta-D-galactopyranosiduronamide InChI=1S/C7H13NO6/c1-13-7-4(11)2(9)3(10)5(14-7)6(8)12/h2-5,7,9-11H,1H3,(H2,8,12)/t2-,3+,4+,5-,7+/m0/s1 L-Altraric acid InChI=1S/C6H10O8/c7-1(3(9)5(11)12)2(8)4(10)6(13)14/h1-4,7-10H,(H,11,12)(H,13,14)/t1-,2+,3-,4-/m1/s1 D-Glucaric acid InChI=1S/C6H10O8/c7-1(3(9)5(11)12)2(8)4(10)6(13)14/h1-4,7-10H,(H,11,12)(H,13,14)/t1-,2-,3-,4+/m0/s1 L-glycero-D-galacto-Heptaric acid InChI=1S/C7H12O9/c8-1(2(9)4(11)6(13)14)3(10)5(12)7(15)16/h1-5,8-12H,(H,13,14)(H,15,16)/t2-,3-,4+,5+/m0/s1 #meso-Xylaric acid InChI=1S/C5H8O7/c6-1(2(7)4(9)10)3(8)5(11)12/h1-3,6-8H,(H,9,10)(H,11,12)/t1-,2-,3+ #meso-Galactaric acid InChI=1S/C6H10O8/c7-1(3(9)5(11)12)2(8)4(10)6(13)14/h1-4,7-10H,(H,11,12)(H,13,14)/t1-,2+,3+,4- 4-O-Methyl-D-galactaric acid InChI=1S/C7H12O8/c1-15-5(4(10)7(13)14)2(8)3(9)6(11)12/h2-5,8-10H,1H3,(H,11,12)(H,13,14)/t2-,3-,4+,5-/m1/s1 #(2R,3R)-Tartaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m1/s1 #(+)-Tartaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m1/s1 L-threaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m1/s1 #(2S,3S)-Tartaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m0/s1 #(-)-Tartaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m0/s1 D-threaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2-/m0/s1 #(2R,3S)-Tartaric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2+ #meso-Tartaric acid 
InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2+ erythraric acid InChI=1S/C4H6O6/c5-1(3(7)8)2(6)4(9)10/h1-2,5-6H,(H,7,8)(H,9,10)/t1-,2+ 1-Methyl hydrogen D-galactarate InChI=1S/C7H12O8/c1-15-7(14)5(11)3(9)2(8)4(10)6(12)13/h2-5,8-11H,1H3,(H,12,13)/t2-,3+,4+,5-/m1/s1 6-Methyl hydrogen D-galactarate InChI=1S/C7H12O8/c1-15-7(14)5(11)3(9)2(8)4(10)6(12)13/h2-5,8-11H,1H3,(H,12,13)/t2-,3+,4+,5-/m0/s1 #D-Glucar-1-amic acid #Methyl D-glucar-6-amate L-mannaro-1,4:6,3-dilactone InChI=1S/C6H6O6/c7-1-3-4(12-5(1)9)2(8)6(10)11-3/h1-4,7-8H/t1-,2-,3+,4+/m1/s1 #The above are names from 2-Carb-1 through 2-Carb-23 #2-Carb-37.3 alpha-D-Glucopyranosyl-(1->4)-[alpha-D-glucopyranosyl-(1->6)]-D-glucopyranose InChI=1S/C18H32O16/c19-1-4-7(21)9(23)13(27)17(32-4)30-3-6-15(11(25)12(26)16(29)31-6)34-18-14(28)10(24)8(22)5(2-20)33-18/h4-29H,1-3H2/t4-,5-,6-,7-,8-,9+,10+,11-,12-,13-,14-,15-,16?,17+,18-/m1/s1 4,6-di-O-(alpha-D-glucopyranosyl)-D-glucopyranose InChI=1S/C18H32O16/c19-1-4-7(21)9(23)13(27)17(32-4)30-3-6-15(11(25)12(26)16(29)31-6)34-18-14(28)10(24)8(22)5(2-20)33-18/h4-29H,1-3H2/t4-,5-,6-,7-,8-,9+,10+,11-,12-,13-,14-,15-,16?,17+,18-/m1/s1 (5-Acetamido-3,5-dideoxy-D-glycero-alpha-D-galacto-non-2-ulopyranosylonic acid)-(2->3)-beta-D-galactopyranosyl-(1->3)-[alpha-L-fucopyranosyl-(1->4)]-2-acetamido-2-deoxy-D-glucopyranose InChI=1S/C31H52N2O23/c1-8-17(41)20(44)21(45)28(50-8)53-23-14(7-36)51-27(47)16(33-10(3)38)25(23)54-29-22(46)26(19(43)13(6-35)52-29)56-31(30(48)49)4-11(39)15(32-9(2)37)24(55-31)18(42)12(40)5-34/h8,11-29,34-36,39-47H,4-7H2,1-3H3,(H,32,37)(H,33,38)(H,48,49)/t8-,11-,12+,13+,14+,15+,16+,17+,18+,19-,20+,21-,22+,23+,24+,25+,26-,27?,28-,29-,31-/m0/s1 #5-N-acetyl-alpha-neuraminyl-(2->3)-beta-D-galactopyranosyl-(1->3)-[alpha-L-fucopyranosyl-(1->4)]-2-acetamido-2-deoxy-D-glucopyranose 
InChI=1S/C31H52N2O23/c1-8-17(41)20(44)21(45)28(50-8)53-23-14(7-36)51-27(47)16(33-10(3)38)25(23)54-29-22(46)26(19(43)13(6-35)52-29)56-31(30(48)49)4-11(39)15(32-9(2)37)24(55-31)18(42)12(40)5-34/h8,11-29,34-36,39-47H,4-7H2,1-3H3,(H,32,37)(H,33,38)(H,48,49)/t8-,11-,12+,13+,14+,15+,16+,17+,18+,19-,20+,21-,22+,23+,24+,25+,26-,27?,28-,29-,31-/m0/s1 2-Acetamido-2-deoxy-alpha-D-galactopyranosyl-(1->3)-[alpha-L-fucopyranosyl-(1->2)]-D-galactopyranose InChI=1S/C20H35NO15/c1-5-10(25)14(29)15(30)20(32-5)36-17-16(12(27)8(4-23)33-18(17)31)35-19-9(21-6(2)24)13(28)11(26)7(3-22)34-19/h5,7-20,22-23,25-31H,3-4H2,1-2H3,(H,21,24)/t5-,7+,8+,9+,10+,11-,12-,13+,14+,15-,16-,17+,18?,19+,20-/m0/s1 #CAS D-gluco-2-hept-ulose InChI=1S/C7H14O7/c8-1-3(10)5(12)7(14)6(13)4(11)2-9/h3,5-10,12-14H,1-2H2/t3-,5-,6+,7+/m1/s1 L-Apio-alpha-D-furanose InChI=1S/C5H10O5/c6-1-5(9)2-10-4(8)3(5)7/h3-4,6-9H,1-2H2/t3-,4+,5-/m1/s1 7-[[6-O-(5-O-D-Apio-beta-D-furanosyl-D-apio-beta-D-furanosyl)-beta-D-glucopyranosyl]oxy]-4'-methoxy-5-hydroxyisoflavone InChI=1S/C32H38O18/c1-43-15-4-2-14(3-5-15)17-8-44-19-7-16(6-18(34)21(19)22(17)35)49-28-25(38)24(37)23(36)20(50-28)9-45-29-27(40)32(42,12-47-29)13-48-30-26(39)31(41,10-33)11-46-30/h2-8,20,23-30,33-34,36-42H,9-13H2,1H3/t20-,23-,24+,25-,26+,27+,28-,29-,30+,31-,32+/m1/s1 #Misc 5-acetamido-3,5-dideoxy-D-glycero-alpha-D-galacto-non-2-ulopyranonosyl-(2->3)-beta-D-galactopyranose InChI=1S/C17H29NO14/c1-5(21)18-9-6(22)2-17(16(28)29,31-13(9)10(24)7(23)3-19)32-14-11(25)8(4-20)30-15(27)12(14)26/h6-15,19-20,22-27H,2-4H2,1H3,(H,18,21)(H,28,29)/t6-,7+,8+,9+,10+,11-,12+,13+,14-,15+,17-/m0/s1 glucosyl(1->6)glucopyranose InChI=1S/C12H22O11/c13-1-3-5(14)8(17)10(19)12(23-3)21-2-4-6(15)7(16)9(18)11(20)22-4/h3-20H,1-2H2/t3-,4-,5-,6-,7+,8+,9-,10-,11?,12?/m1/s1 glucitol-O1-ylbenzene InChI=1S/C12H18O6/c13-6-9(14)11(16)12(17)10(15)7-18-8-4-2-1-3-5-8/h1-5,9-17H,6-7H2/t9-,10+,11-,12-/m1/s1 D-sarmentose InChI=1S/C7H14O4/c1-5(9)7(10)6(11-2)3-4-8/h4-7,9-10H,3H2,1-2H3/t5-,6+,7+/m1/s1 L-sarmentose 
InChI=1S/C7H14O4/c1-5(9)7(10)6(11-2)3-4-8/h4-7,9-10H,3H2,1-2H3/t5-,6+,7+/m0/s1 D-cymarose InChI=1S/C7H14O4/c1-5(9)7(10)6(11-2)3-4-8/h4-7,9-10H,3H2,1-2H3/t5-,6+,7-/m1/s1 L-cymarose InChI=1S/C7H14O4/c1-5(9)7(10)6(11-2)3-4-8/h4-7,9-10H,3H2,1-2H3/t5-,6+,7-/m0/s1 digitalose InChI=1S/C7H14O5/c1-4(9)6(11)7(12-2)5(10)3-8/h3-7,9-11H,1-2H3/t4-,5+,6+,7-/m1/s1 D-digitalose InChI=1S/C7H14O5/c1-4(9)6(11)7(12-2)5(10)3-8/h3-7,9-11H,1-2H3/t4-,5+,6+,7-/m1/s1 L-digitalose InChI=1S/C7H14O5/c1-4(9)6(11)7(12-2)5(10)3-8/h3-7,9-11H,1-2H3/t4-,5+,6+,7-/m0/s1 D-cymaropyranose InChI=1S/C7H14O4/c1-4-7(9)5(10-2)3-6(8)11-4/h4-9H,3H2,1-2H3/t4-,5+,6?,7-/m1/s1opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/chargeBalancing.txt000066400000000000000000000013001451751637500307250ustar00rootroot00000000000000ammonium chloride InChI=1S/ClH.H3N/h1H;1H3 sodium chloride InChI=1S/ClH.Na/h1H;/q;+1/p-1 magnesium chloride InChI=1S/2ClH.Mg/h2*1H;/q;;+2/p-2 iron(3+) oxide InChI=1S/2Fe.3O/q2*+3;3*-2 sodium citrate InChI=1S/C6H8O7.3Na/c7-3(8)1-6(13,5(11)12)2-4(9)10;;;/h13H,1-2H2,(H,7,8)(H,9,10)(H,11,12);;;/q;3*+1/p-3 caffeine citrate InChI=1S/C8H10N4O2.C6H8O7/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2;7-3(8)1-6(13,5(11)12)2-4(9)10/h4H,1-3H3;13H,1-2H2,(H,7,8)(H,9,10)(H,11,12) tetrabutylammonium tribromide InChI=1S/C16H36N.Br3/c1-5-9-13-17(14-10-6-2,15-11-7-3)16-12-8-4;1-3-2/h5-16H2,1-4H3;/q+1;-1 N-Hydroxysulfosuccinimide sodium salt InChI=1S/C4H5NO6S.Na/c6-3-1-2(12(9,10)11)4(7)5(3)8;/h2,8H,1H2,(H,9,10,11);/q;+1/p-1 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/conjunctiveNomenclature.txt000066400000000000000000000010141451751637500326030ustar00rootroot00000000000000benzeneacetic acid InChI=1S/C8H8O2/c9-8(10)6-7-4-2-1-3-5-7/h1-5H,6H2,(H,9,10) benzeneethylamine InChI=1S/C8H11N/c9-7-6-8-4-2-1-3-5-8/h1-5H,6-7,9H2 cyclohexaneacetic acid piperidide InChI=1S/C13H23NO/c15-13(14-9-5-2-6-10-14)11-12-7-3-1-4-8-12/h12H,1-11H2 L-aspartic acid N,N-diacetic acid 
InChI=1S/C8H11NO8/c10-5(11)1-4(8(16)17)9(2-6(12)13)3-7(14)15/h4H,1-3H2,(H,10,11)(H,12,13)(H,14,15)(H,16,17)/t4-/m0/s1 beta-alanine diacetic acid InChI=1S/C7H11NO6/c9-5(10)1-2-8(3-6(11)12)4-7(13)14/h1-4H2,(H,9,10)(H,11,12)(H,13,14) opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/cyclicSuffixes.txt000066400000000000000000000051531451751637500306720ustar00rootroot00000000000000hydracrylolactone InChI=1S/C3H4O2/c4-3-1-2-5-3/h1-2H2 5-pentanolide InChI=1S/C5H8O2/c6-5-3-1-2-4-7-5/h1-4H2 4-penten-5-olide InChI=1S/C5H6O2/c6-5-3-1-2-4-7-5/h2,4H,1,3H2 3-hydroxy-4-butanolide InChI=1S/C4H6O3/c5-3-1-4(6)7-2-3/h3,5H,1-2H2 #1,2-cyclohexanedicarboximide InChI=1S/C8H11NO2/c10-7-5-3-1-2-4-6(5)8(11)9-7/h5-6H,1-4H2,(H,9,10,11) cyclohexane-1,2-dicarboximide InChI=1S/C8H11NO2/c10-7-5-3-1-2-4-6(5)8(11)9-7/h5-6H,1-4H2,(H,9,10,11) succinimide InChI=1S/C4H5NO2/c6-3-1-2-4(7)5-3/h1-2H2,(H,5,6,7) N-phenylphthalimide InChI=1S/C14H9NO2/c16-13-11-8-4-5-9-12(11)14(17)15(13)10-6-2-1-3-7-10/h1-9H 7-phthalimido-1-naphthoic acid InChI=1S/C19H11NO4/c21-17-13-5-1-2-6-14(13)18(22)20(17)12-9-8-11-4-3-7-15(19(23)24)16(11)10-12/h1-10H,(H,23,24) 4-butanelactam InChI=1S/C4H7NO/c6-4-2-1-3-5-4/h1-3H2,(H,5,6) butano-4-lactam InChI=1S/C4H7NO/c6-4-2-1-3-5-4/h1-3H2,(H,5,6) O-methyl-4-butanelactim InChI=1S/C5H9NO/c1-7-5-3-2-4-6-5/h2-4H2,1H3 O-methyl-butano-4-lactim InChI=1S/C5H9NO/c1-7-5-3-2-4-6-5/h2-4H2,1H3 O-methyl-butyrolactim InChI=1S/C5H9NO/c1-7-5-3-2-4-6-5/h2-4H2,1H3 caprolactam InChI=1S/C6H11NO/c8-6-4-2-1-3-5-7-6/h1-5H2,(H,7,8) butano-4-lactone InChI=1S/C4H6O2/c5-4-2-1-3-6-4/h1-3H2 gamma-butyrolactone InChI=1S/C4H6O2/c5-4-2-1-3-6-4/h1-3H2 gamma-valerolactone InChI=1S/C5H8O2/c1-4-2-3-5(6)7-4/h4H,2-3H2,1H3 delta-valerolactone InChI=1S/C5H8O2/c6-5-3-1-2-4-7-5/h1-4H2 #phenanthrene-1,10:9,8-dicarbolactone InChI=1S/C16H6O4/c17-15-9-5-1-3-7-8-4-2-6-10-12(8)13(16(18)19-10)14(20-15)11(7)9/h1-6H #1,10:9,8-phenanthrenebiscarbolactone 
InChI=1S/C16H6O4/c17-15-9-5-1-3-7-8-4-2-6-10-12(8)13(16(18)19-10)14(20-15)11(7)9/h1-6H naphthalene-1,8-sultone InChI=1S/C10H6O3S/c11-14(12)9-6-2-4-7-3-1-5-8(13-14)10(7)9/h1-6H #1,8-naphthalenesultone InChI=1S/C10H6O3S/c11-14(12)9-6-2-4-7-3-1-5-8(13-14)10(7)9/h1-6H pentane-2,5-sultone InChI=1S/C5H10O3S/c1-5-3-2-4-8-9(5,6)7/h5H,2-4H2,1H3 naphthalene-1,8-sultam InChI=1S/C10H7NO2S/c12-14(13)9-6-2-4-7-3-1-5-8(11-14)10(7)9/h1-6,11H butane-1,4-sultam InChI=1S/C4H9NO2S/c6-8(7)4-2-1-3-5-8/h5H,1-4H2 O-methyl-propane-3-sultim InChI=1S/C4H9NO2S/c1-7-8(6)4-2-3-5-8/h2-4H2,1H3 pentane-2,5-sultine InChI=1S/C5H10O2S/c1-5-3-2-4-7-8(5)6/h5H,2-4H2,1H3 #N,N'-Bis(1-hexylheptyl)-perylene-3,4:9,10-bis-(dicarboximide) InChI=1S/C50H62N2O4/c1-5-9-13-17-21-33(22-18-14-10-6-2)51-47(53)39-29-25-35-37-27-31-41-46-42(32-28-38(44(37)46)36-26-30-40(48(51)54)45(39)43(35)36)50(56)52(49(41)55)34(23-19-15-11-7-3)24-20-16-12-8-4/h25-34H,5-24H2,1-4H3 ethyl (Z)-2-chloro-3-[2-chloro-5-(cyclohex-1-ene-1,2-dicarboximido)phenyl]acrylate InChI=1S/C19H17Cl2NO4/c1-2-26-19(25)16(21)10-11-9-12(7-8-15(11)20)22-17(23)13-5-3-4-6-14(13)18(22)24/h7-10H,2-6H2,1H3/b16-10-opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/detachablePrefixes.txt000066400000000000000000000016311451751637500314660ustar00rootroot000000000000004-amino-2,3-dihydro-pyridine InChI=1S/C5H8N2/c6-5-1-3-7-4-2-5/h1,3H,2,4,6H2 2,3-dihydro-4-amino-pyridine InChI=1S/C5H8N2/c6-5-1-3-7-4-2-5/h1,3H,2,4,6H2 4-methyl-3-aza-pyridine InChI=1S/C5H6N2/c1-5-2-3-6-4-7-5/h2-4H,1H3 3-aza-4-methyl-pyridine InChI=1S/C5H6N2/c1-5-2-3-6-4-7-5/h2-4H,1H3 2,3-diamino-1,4-ethano-cyclohexane InChI=1S/C8H16N2/c9-7-5-1-2-6(4-3-5)8(7)10/h5-8H,1-4,9-10H2 1,4-ethano-2,3-diamino-cyclohexane InChI=1S/C8H16N2/c9-7-5-1-2-6(4-3-5)8(7)10/h5-8H,1-4,9-10H2 4-(aminomethyl)-3-aza-pyridine InChI=1S/C5H7N3/c6-3-5-1-2-7-4-8-5/h1-2,4H,3,6H2 3-aza-4-(aminomethyl)-pyridine InChI=1S/C5H7N3/c6-3-5-1-2-7-4-8-5/h1-2,4H,3,6H2 3,6,9-triaza-2-(4-phenylbutyl)undecanoic acid 
InChI=1S/C18H31N3O2/c1-2-19-12-13-20-14-15-21-17(18(22)23)11-7-6-10-16-8-4-3-5-9-16/h3-5,8-9,17,19-21H,2,6-7,10-15H2,1H3,(H,22,23) 2lambda6,4lambda6-dithia-2,2,4,4-tetraoxo-pentane InChI=1S/C3H8O4S2/c1-8(4,5)3-9(2,6)7/h3H2,1-2H3 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/epoxyLike.txt000066400000000000000000000014161451751637500276560ustar00rootroot000000000000002,3-methylenedioxynaphthalene InChI=1S/C11H8O2/c1-2-4-9-6-11-10(12-7-13-11)5-8(9)3-1/h1-6H,7H2 3,4-(dichloromethylenedioxy)-benzenesulfonic acid InChI=1S/C7H4Cl2O5S/c8-7(9)13-5-2-1-4(15(10,11)12)3-6(5)14-7/h1-3H,(H,10,11,12) (3,4-methylenedioxyphenyl)methane InChI=1S/C8H8O2/c1-6-2-3-7-8(4-6)10-5-9-7/h2-4H,5H2,1H3 epoxybenzene InChI=1S/C6H4O/c1-2-4-6-5(3-1)7-6/h1-4H 3,4-epoxybutanol InChI=1S/C4H8O2/c5-2-1-4-3-6-4/h4-5H,1-3H2 2,3-epoxypyridine InChI=1S/C5H3NO/c1-2-4-5(7-4)6-3-1/h1-3H methylenedioxydibenzene InChI=1S/C13H12O2/c1-3-7-12(8-4-1)14-11-15-13-9-5-2-6-10-13/h1-10H,11H2 3,4'-methylenedioxydipyridine InChI=1S/C11H10N2O2/c1-2-11(8-13-5-1)15-9-14-10-3-6-12-7-4-10/h1-8H,9H2 3,4-epoxy-1-phenyl-butane InChI=1S/C10H12O/c1-2-4-9(5-3-1)6-7-10-8-11-10/h1-5,10H,6-8H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/flavonoids.txt000066400000000000000000001772441451751637500300660ustar00rootroot00000000000000#The following are commented: #Trivial names that are unlikely to be used as semi-systematic names #Arrow linkages with flavonoids [not yet supported] #Bracketed glycosides [not yet supported] #Natural product fused ring nomenclature [not yet supported] flavan InChI=1S/C15H14O/c1-2-6-12(7-3-1)15-11-10-13-8-4-5-9-14(13)16-15/h1-9,15H,10-11H2 2-phenyl-3,4-dihydro-2H-1-benzopyran InChI=1S/C15H14O/c1-2-6-12(7-3-1)15-11-10-13-8-4-5-9-14(13)16-15/h1-9,15H,10-11H2 2-phenylchromane InChI=1S/C15H14O/c1-2-6-12(7-3-1)15-11-10-13-8-4-5-9-14(13)16-15/h1-9,15H,10-11H2 2-phenyl-3,4-dihydro-2H-chromene InChI=1S/C15H14O/c1-2-6-12(7-3-1)15-11-10-13-8-4-5-9-14(13)16-15/h1-9,15H,10-11H2 
(2S)-2′,7-bis(β-d-glucopyranosyloxy)-8-(2-hydroxyethyl)-4′-methoxyflavan-5-ol InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 2-[(2S)-7-(β-d-glucopyranosyloxy)-5-hydroxy-8-(2-hydroxyethyl)-3,4-dihydro-2H-1-benzopyran-2-yl]-5-methoxyphenyl β-d-glucopyranoside InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 2-[(2S)-7-(β-d-glucopyranosyloxy)-5-hydroxy-8-(2-hydroxyethyl)-3,4-dihydro-2H-chromen-2-yl]-5-methoxyphenyl β-d-glucopyranoside InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 (2S)-7-(β-d-glucopyranosyloxy)-2-[2-(β-d-glucopyranosyloxy)-4-methoxyphenyl]-8-(2-hydroxyethyl)-3,4-dihydro-2H-1-benzopyran-5-ol InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 (2S)-7-(β-d-glucopyranosyloxy)-2-[2-(β-d-glucopyranosyloxy)-4-methoxyphenyl]-8-(2-hydroxyethyl)-chroman-5-ol InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 (2S)-7-(β-d-glucopyranosyloxy)-2-[2-(β-d-glucopyranosyloxy)-4-methoxyphenyl]-8-(2-hydroxyethyl)-3,4-dihydro-2H-chromen-5-ol 
InChI=1S/C30H40O16/c1-41-12-2-3-14(18(8-12)43-29-26(39)24(37)22(35)20(10-32)45-29)17-5-4-13-16(34)9-19(15(6-7-31)28(13)42-17)44-30-27(40)25(38)23(36)21(11-33)46-30/h2-3,8-9,17,20-27,29-40H,4-7,10-11H2,1H3/t17-,20+,21+,22+,23+,24-,25-,26+,27+,29+,30+/m0/s1 (2S)-7-(β-d-glucopyranosyloxy)-4′-methoxy-8-(3-methylbut-2-en-1-yl)flavan-2′-ol InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 (2S)-2′-hydroxy-4′-methoxy-8-(3-methylbut-2-en-1-yl)flavan-7-yl β-d-glucopyranoside InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 (2S)-2-(2-hydroxy-4-methoxyphenyl)-8-(3-methylbut-2-en-1-yl)-3,4-dihydro-2H-1-benzopyran-7-yl β-d-glucopyranoside InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 (2S)-2-(2-hydroxy-4-methoxyphenyl)-8-(3-methylbut-2-en-1-yl)chroman-7-yl β-d-glucopyranoside InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 (2S)-2-(2-hydroxy-4-methoxyphenyl)-8-(3-methylbut-2-en-1-yl)-3,4-dihydro-2H-chromen-7-yl β-d-glucopyranoside InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 2-[(2S)-7-(β-d-glucopyranosyloxy)-8-(3-methylbut-2-en-1-yl)-3,4-dihydro-2H-1-benzopyran-2-yl]-5-methoxyphenol 
InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 2-[(2S)-7-(β-d-glucopyranosyloxy)-8-(3-methylbut-2-en-1-yl)chroman-2-yl]-5-methoxyphenol InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 2-[(2S)-7-(β-d-glucopyranosyloxy)-8-(3-methylbut-2-en-1-yl)-3,4-dihydro-2H-chromen-2-yl]-5-methoxyphenol InChI=1S/C27H34O9/c1-14(2)4-8-18-21(35-27-25(32)24(31)23(30)22(13-28)36-27)11-6-15-5-10-20(34-26(15)18)17-9-7-16(33-3)12-19(17)29/h4,6-7,9,11-12,20,22-25,27-32H,5,8,10,13H2,1-3H3/t20-,22+,23+,24-,25+,27+/m0/s1 flavan-4-one InChI=1S/C15H12O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-9,15H,10H2 2-phenyl-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C15H12O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-9,15H,10H2 2-phenylchroman-4-one InChI=1S/C15H12O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-9,15H,10H2 2-phenyl-2,3-dihydro-4H-chromen-4-one InChI=1S/C15H12O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-9,15H,10H2 (2S)-4′,5,7-trihydroxyflavan-4-one InChI=1S/C15H12O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-6,13,16-18H,7H2/t13-/m0/s1 naringenin InChI=1S/C15H12O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-6,13,16-18H,7H2/t13-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C15H12O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-6,13,16-18H,7H2/t13-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)chroman-4-one InChI=1S/C15H12O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-6,13,16-18H,7H2/t13-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)-2,3-dihydro-4H-chromen-4-one 
InChI=1S/C15H12O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-6,13,16-18H,7H2/t13-/m0/s1 (2S)-4′,5,7-trihydroxy-6,8-bis(3-methylbut-2-en-1-yl)flavan-4-one InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 6,8-bis(3-methylbut-2-en-1-yl)naringenin InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 6,8-diprenylnaringenin InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 #lonchocarpol A InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 #senegalensin InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 #senegalensein InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)-6,8-bis(3-methylbut-2-en-1-yl)-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)-6,8-bis(3-methylbut-2-en-1-yl)chroman-4-one InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 (2S)-5,7-dihydroxy-2-(4-hydroxyphenyl)-6,8-bis(3-methylbut-2-en-1-yl)-2,3-dihydro-4H-chromen-4-one InChI=1S/C25H28O5/c1-14(2)5-11-18-23(28)19(12-6-15(3)4)25-22(24(18)29)20(27)13-21(30-25)16-7-9-17(26)10-8-16/h5-10,21,26,28-29H,11-13H2,1-4H3/t21-/m0/s1 (2R,3R)-3-hydroxyflavan-4-one 
InChI=1S/C15H12O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,14-15,17H/t14-,15+/m0/s1 (2R,3R)-3-hydroxy-2-phenyl-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C15H12O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,14-15,17H/t14-,15+/m0/s1 (2R,3R)-3-hydroxy-2-phenylchroman-4-one InChI=1S/C15H12O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,14-15,17H/t14-,15+/m0/s1 (2R,3R)-3-hydroxy-2-phenyl-2,3-dihydro-4H-chromen-4-one InChI=1S/C15H12O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,14-15,17H/t14-,15+/m0/s1 (2R,3R)-4′,5-dihydroxy-7-methoxy-4-oxoflavan-3-yl acetate InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 7-O-methylaromadendrin 3-acetate InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 3-O-acetyl-7-O-methylaromadendrin InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 (2R,3R)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4-oxo-3,4-dihydro-2H-1-benzopyran-3-yl acetate InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 (2R,3R)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4-oxochroman-3-yl acetate InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 (2R,3R)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4-oxo-3,4-dihydro-2H-chromen-3-yl acetate InChI=1S/C18H16O7/c1-9(19)24-18-16(22)15-13(21)7-12(23-2)8-14(15)25-17(18)10-3-5-11(20)6-4-10/h3-8,17-18,20-21H,1-2H3/t17-,18+/m1/s1 (3R)-isoflavan-7-ol InChI=1S/C15H14O2/c16-14-7-6-12-8-13(10-17-15(12)9-14)11-4-2-1-3-5-11/h1-7,9,13,16H,8,10H2/t13-/m0/s1 (3R)-3-phenyl-3,4-dihydro-2H-1-benzopyran-7-ol 
InChI=1S/C15H14O2/c16-14-7-6-12-8-13(10-17-15(12)9-14)11-4-2-1-3-5-11/h1-7,9,13,16H,8,10H2/t13-/m0/s1 (3R)-3-phenylchroman-7-ol InChI=1S/C15H14O2/c16-14-7-6-12-8-13(10-17-15(12)9-14)11-4-2-1-3-5-11/h1-7,9,13,16H,8,10H2/t13-/m0/s1 (3R)-3-phenyl-3,4-dihydro-2H-chromen-7-ol InChI=1S/C15H14O2/c16-14-7-6-12-8-13(10-17-15(12)9-14)11-4-2-1-3-5-11/h1-7,9,13,16H,8,10H2/t13-/m0/s1 rac-2′,4′,5,7-tetrahydroxyisoflavan-4-one InChI=1S/C15H12O6/c16-7-1-2-9(11(18)3-7)10-6-21-13-5-8(17)4-12(19)14(13)15(10)20/h1-5,10,16-19H,6H2 #dalbergioidin InChI=1S/C15H12O6/c16-7-1-2-9(11(18)3-7)10-6-21-13-5-8(17)4-12(19)14(13)15(10)20/h1-5,10,16-19H,6H2 rac-3-(2,4-dihydroxyphenyl)-5,7-dihydroxy-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C15H12O6/c16-7-1-2-9(11(18)3-7)10-6-21-13-5-8(17)4-12(19)14(13)15(10)20/h1-5,10,16-19H,6H2 rac-3-(2,4-dihydroxyphenyl)-5,7-dihydroxychroman-4-one InChI=1S/C15H12O6/c16-7-1-2-9(11(18)3-7)10-6-21-13-5-8(17)4-12(19)14(13)15(10)20/h1-5,10,16-19H,6H2 rac-3-(2,4-dihydroxyphenyl)-5,7-dihydroxy-2,3-dihydro-4H-chromen-4-one InChI=1S/C15H12O6/c16-7-1-2-9(11(18)3-7)10-6-21-13-5-8(17)4-12(19)14(13)15(10)20/h1-5,10,16-19H,6H2 (4R)-neoflavan-7-ol InChI=1S/C15H14O2/c16-12-6-7-14-13(8-9-17-15(14)10-12)11-4-2-1-3-5-11/h1-7,10,13,16H,8-9H2/t13-/m1/s1 (4R)-4-phenyl-3,4-dihydro-2H-1-benzopyran-7-ol InChI=1S/C15H14O2/c16-12-6-7-14-13(8-9-17-15(14)10-12)11-4-2-1-3-5-11/h1-7,10,13,16H,8-9H2/t13-/m1/s1 (4R)-4-phenylchroman-7-ol InChI=1S/C15H14O2/c16-12-6-7-14-13(8-9-17-15(14)10-12)11-4-2-1-3-5-11/h1-7,10,13,16H,8-9H2/t13-/m1/s1 (4R)-4-phenyl-3,4-dihydro-2H-chromen-7-ol InChI=1S/C15H14O2/c16-12-6-7-14-13(8-9-17-15(14)10-12)11-4-2-1-3-5-11/h1-7,10,13,16H,8-9H2/t13-/m1/s1 flavone InChI=1S/C15H10O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-10H 2-phenyl-4H-1-benzopyran-4-one InChI=1S/C15H10O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-10H 2-phenyl-4H-chromen-4-one InChI=1S/C15H10O2/c16-13-10-15(11-6-2-1-3-7-11)17-14-9-5-4-8-12(13)14/h1-10H 4′,5,7-trihydroxyflavone 
InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-7,16-18H apigenin InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-7,16-18H 5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-7,16-18H 5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)13-7-12(19)15-11(18)5-10(17)6-14(15)20-13/h1-7,16-18H 3′,5,7-trihydroxy-4′-methoxyflavone InChI=1S/C16H12O6/c1-21-13-3-2-8(4-10(13)18)14-7-12(20)16-11(19)5-9(17)6-15(16)22-14/h2-7,17-19H,1H3 diosmetin InChI=1S/C16H12O6/c1-21-13-3-2-8(4-10(13)18)14-7-12(20)16-11(19)5-9(17)6-15(16)22-14/h2-7,17-19H,1H3 5,7-dihydroxy-2-(3-hydroxy-4-methoxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C16H12O6/c1-21-13-3-2-8(4-10(13)18)14-7-12(20)16-11(19)5-9(17)6-15(16)22-14/h2-7,17-19H,1H3 5,7-dihydroxy-2-(3-hydroxy-4-methoxyphenyl)-4H-chromen-4-one InChI=1S/C16H12O6/c1-21-13-3-2-8(4-10(13)18)14-7-12(20)16-11(19)5-9(17)6-15(16)22-14/h2-7,17-19H,1H3 2-(piperidin-1-yl)ethyl 3-methyl-4-oxoflav-2-ene-8-carboxylate InChI=1S/C24H25NO4/c1-17-21(26)19-11-8-12-20(23(19)29-22(17)18-9-4-2-5-10-18)24(27)28-16-15-25-13-6-3-7-14-25/h2,4-5,8-12H,3,6-7,13-16H2,1H3 2-(piperidin-1-yl)ethyl 3-methyl-4-oxo-2,3-didehydroflavan-8-carboxylate InChI=1S/C24H25NO4/c1-17-21(26)19-11-8-12-20(23(19)29-22(17)18-9-4-2-5-10-18)24(27)28-16-15-25-13-6-3-7-14-25/h2,4-5,8-12H,3,6-7,13-16H2,1H3 #flavoxate InChI=1S/C24H25NO4/c1-17-21(26)19-11-8-12-20(23(19)29-22(17)18-9-4-2-5-10-18)24(27)28-16-15-25-13-6-3-7-14-25/h2,4-5,8-12H,3,6-7,13-16H2,1H3 2-(piperidin-1-yl)ethyl 3-methyl-4-oxo-2-phenyl-4H-1-benzopyran-8-carboxylate InChI=1S/C24H25NO4/c1-17-21(26)19-11-8-12-20(23(19)29-22(17)18-9-4-2-5-10-18)24(27)28-16-15-25-13-6-3-7-14-25/h2,4-5,8-12H,3,6-7,13-16H2,1H3 2-(piperidin-1-yl)ethyl 3-methyl-4-oxo-2-phenyl-4H-chromene-8-carboxylate 
InChI=1S/C24H25NO4/c1-17-21(26)19-11-8-12-20(23(19)29-22(17)18-9-4-2-5-10-18)24(27)28-16-15-25-13-6-3-7-14-25/h2,4-5,8-12H,3,6-7,13-16H2,1H3 7-(β-d-glucopyranosyloxy)-4′,5-dihydroxy-3′-methoxyflavone InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 7-O-(β-d-glucopyranosyl)chrysoeriol InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 5-hydroxy-2-(4-hydroxy-3-methoxyphenyl)-4-oxo-4H-1-benzopyran-7-yl β-d-glucopyranoside InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 5-hydroxy-2-(4-hydroxy-3-methoxyphenyl)-4-oxo-4H-chromen-7-yl β-d-glucopyranoside InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 7-(β-d-glucopyranosyloxy)-5-hydroxy-2-(4-hydroxy-3-methoxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 7-(β-d-glucopyranosyloxy)-5-hydroxy-2-(4-hydroxy-3-methoxyphenyl)-4H-chromen-4-one InChI=1S/C22H22O11/c1-30-15-4-9(2-3-11(15)24)14-7-13(26)18-12(25)5-10(6-16(18)32-14)31-22-21(29)20(28)19(27)17(8-23)33-22/h2-7,17,19-25,27-29H,8H2,1H3/t17-,19-,20+,21-,22-/m1/s1 7-{β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)-[α-l-rhamnopyranosyl-(1→6)]-β-d-glucopyranosyloxy}-3′,5-dihydroxy-4′-methoxyflavone 
InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 7-O-{β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)-[α-l-rhamnopyranosyl-(1→6)]-β-d-glucopyranosyl}diosmetin InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 5-hydroxy-2-(3-hydroxy-4-methoxyphenyl)-4-oxo-4H-1-benzopyran-7-yl 6-deoxy-α-l-mannopyranosyl-(1→6)-[β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)]-β-d-glucopyranoside InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 5-hydroxy-2-(3-hydroxy-4-methoxyphenyl)-4-oxo-4H-chromen-7-yl 6-deoxy-α-l-mannopyranosyl-(1→6)-[β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)]-β-d-glucopyranoside InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 
7-{6-deoxy-α-l-mannopyranosyl-(1→6)-[β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)]-β-d-glucopyranosyloxy}-5-hydroxy-2-(3-hydroxy-4-methoxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 7-{6-deoxy-α-l-mannopyranosyl-(1→6)-[β-d-glucopyranosyl-(1→2)-β-d-glucopyranosyl-(1→2)]-β-d-glucopyranosyloxy}-5-hydroxy-2-(3-hydroxy-4-methoxyphenyl)-4H-chromen-4-one InChI=1S/C40H52O25/c1-12-25(46)29(50)33(54)37(58-12)57-11-23-28(49)32(53)35(65-40-36(31(52)27(48)22(10-42)62-40)64-38-34(55)30(51)26(47)21(9-41)61-38)39(63-23)59-14-6-16(44)24-17(45)8-19(60-20(24)7-14)13-3-4-18(56-2)15(43)5-13/h3-8,12,21-23,25-44,46-55H,9-11H2,1-2H3/t12-,21+,22+,23+,25-,26+,27+,28+,29+,30-,31-,32-,33+,34+,35+,36+,37+,38-,39+,40-/m0/s1 #methyl (4′,5-dihydroxy-4-oxoflav-2-en-7-yl 2-O-acetyl-β-d-glucopyranosid)uronate InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 #methyl (4′,5-dihydroxy-4-oxo-2,3-didehydroflavan-7-yl 2-O-acetyl-β-d-glucopyranosid)uronate InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 #methyl (4′,5-dihydroxyflavon-7-yl 2-O-acetyl-β-d-glucopyranosid)uronate InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 #7-O-(methyl 2-O-acetyl-β-d-glucopyranosyluronate)apigenin 
InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 #methyl [5-hydroxy-2-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-7-yl 2-O-acetyl-β-d-glucopyranosid]uronate InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 #methyl [5-hydroxy-2-(4-hydroxyphenyl)-4-oxo-4H-chromen-7-yl 2-O-acetyl-β-d-glucopyranosid]uronate InChI=C24H22O12/c1-10(25)33-22-20(30)19(29)21(23(31)32-2)36-24(22)34-13-7-14(27)18-15(28)9-16(35-17(18)8-13)11-3-5-12(26)6-4-11/h3-9,19-22,24,26-27,29-30H,1-2H3/t19-,20-,21-,22+,24+/m0/s1 8-(β-d-glucopyranosyl)-4′,5,7-trihydroxyflavone InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 #vitexin InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 8-(β-d-glucopyranosyl)apigenin InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 8-(β-d-glucopyranosyl)-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 8-(β-d-glucopyranosyl)-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 (1S)-1,5-anhydro-1-[5,7-dihydroxy-2-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-8-yl]-d-glucitol 
InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 (1S)-1,5-anhydro-1-[5,7-dihydroxy-2-(4-hydroxyphenyl)-4-oxo-4H-chromen-8-yl]-d-glucitol InChI=1S/C21H20O10/c22-7-14-17(27)18(28)19(29)21(31-14)16-11(25)5-10(24)15-12(26)6-13(30-20(15)16)8-1-3-9(23)4-2-8/h1-6,14,17-19,21-25,27-29H,7H2/t14-,17-,18+,19-,21+/m1/s1 6-(β-d-glucopyranosyl)-4′,5-dihydroxy-7-methoxyflavone InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 6-(β-d-glucopyranosyl)-7-O-methylapigenin InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 6-(β-d-glucopyranosyl)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4H-1-benzopyran-4-one InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 6-(β-d-glucopyranosyl)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4H-chromen-4-one InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 (1S)-1,5-anhydro-1-[5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4-oxo-4H-1-benzopyran-6-yl]-d-glucitol InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 (1S)-1,5-anhydro-1-[5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-4-oxo-4H-chromen-6-yl]-d-glucitol InChI=1S/C22H22O10/c1-30-13-7-14-16(11(25)6-12(31-14)9-2-4-10(24)5-3-9)19(27)17(13)22-21(29)20(28)18(26)15(8-23)32-22/h2-7,15,18,20-24,26-29H,8H2,1H3/t15-,18-,20+,21-,22+/m1/s1 
6,8-di-(α-l-arabinopyranosyl)-4′,5,7-trihydroxyflavone InChI=1S/C25H26O13/c26-9-3-1-8(2-4-9)13-5-10(27)14-19(32)15(24-21(34)17(30)11(28)6-36-24)20(33)16(23(14)38-13)25-22(35)18(31)12(29)7-37-25/h1-5,11-12,17-18,21-22,24-26,28-35H,6-7H2/t11-,12-,17-,18-,21+,22+,24-,25-/m0/s1 6,8-di-(α-l-arabinopyranosyl)apigenin InChI=1S/C25H26O13/c26-9-3-1-8(2-4-9)13-5-10(27)14-19(32)15(24-21(34)17(30)11(28)6-36-24)20(33)16(23(14)38-13)25-22(35)18(31)12(29)7-37-25/h1-5,11-12,17-18,21-22,24-26,28-35H,6-7H2/t11-,12-,17-,18-,21+,22+,24-,25-/m0/s1 6,8-di-(α-l-arabinopyranosyl)-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C25H26O13/c26-9-3-1-8(2-4-9)13-5-10(27)14-19(32)15(24-21(34)17(30)11(28)6-36-24)20(33)16(23(14)38-13)25-22(35)18(31)12(29)7-37-25/h1-5,11-12,17-18,21-22,24-26,28-35H,6-7H2/t11-,12-,17-,18-,21+,22+,24-,25-/m0/s1 6,8-di-(α-l-arabinopyranosyl)-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C25H26O13/c26-9-3-1-8(2-4-9)13-5-10(27)14-19(32)15(24-21(34)17(30)11(28)6-36-24)20(33)16(23(14)38-13)25-22(35)18(31)12(29)7-37-25/h1-5,11-12,17-18,21-22,24-26,28-35H,6-7H2/t11-,12-,17-,18-,21+,22+,24-,25-/m0/s1 6-(α-l-arabinopyranosyl)-8-(β-d-glucopyranosyl)-3′,4′,5,7-tetrahydroxyflavone InChI=1S/C26H28O15/c27-5-13-18(33)21(36)23(38)26(41-13)16-20(35)15(25-22(37)17(32)11(31)6-39-25)19(34)14-10(30)4-12(40-24(14)16)7-1-2-8(28)9(29)3-7/h1-4,11,13,17-18,21-23,25-29,31-38H,5-6H2/t11-,13+,17-,18+,21-,22+,23+,25-,26-/m0/s1 #isocarlinoside InChI=1S/C26H28O15/c27-5-13-18(33)21(36)23(38)26(41-13)16-20(35)15(25-22(37)17(32)11(31)6-39-25)19(34)14-10(30)4-12(40-24(14)16)7-1-2-8(28)9(29)3-7/h1-4,11,13,17-18,21-23,25-29,31-38H,5-6H2/t11-,13+,17-,18+,21-,22+,23+,25-,26-/m0/s1 6-(α-l-arabinopyranosyl)-8-(β-d-glucopyranosyl)luteolin InChI=1S/C26H28O15/c27-5-13-18(33)21(36)23(38)26(41-13)16-20(35)15(25-22(37)17(32)11(31)6-39-25)19(34)14-10(30)4-12(40-24(14)16)7-1-2-8(28)9(29)3-7/h1-4,11,13,17-18,21-23,25-29,31-38H,5-6H2/t11-,13+,17-,18+,21-,22+,23+,25-,26-/m0/s1 
6-(α-l-arabinopyranosyl)-2-(3,4-dihydroxyphenyl)-8-(β-d-glucopyranosyl)-5,7-dihydroxy-4H-1-benzopyran-4-one InChI=1S/C26H28O15/c27-5-13-18(33)21(36)23(38)26(41-13)16-20(35)15(25-22(37)17(32)11(31)6-39-25)19(34)14-10(30)4-12(40-24(14)16)7-1-2-8(28)9(29)3-7/h1-4,11,13,17-18,21-23,25-29,31-38H,5-6H2/t11-,13+,17-,18+,21-,22+,23+,25-,26-/m0/s1 6-(α-l-arabinopyranosyl)-2-(3,4-dihydroxyphenyl)-8-(β-d-glucopyranosyl)-5,7-dihydroxy-4H-chromen-4-one InChI=1S/C26H28O15/c27-5-13-18(33)21(36)23(38)26(41-13)16-20(35)15(25-22(37)17(32)11(31)6-39-25)19(34)14-10(30)4-12(40-24(14)16)7-1-2-8(28)9(29)3-7/h1-4,11,13,17-18,21-23,25-29,31-38H,5-6H2/t11-,13+,17-,18+,21-,22+,23+,25-,26-/m0/s1 4′,5,7-trihydroxy-6-[α-d-rhamnopyranosyl-(1→6)-β-d-glucopyranosyl]flavone InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 #dulcinoside InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 6-[α-d-rhamnopyranosyl-(1→6)-β-d-glucopyranosyl]apigenin InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 6-[6-deoxy-α-d-mannopyranosyl-(1→6)-β-d-glucopyranosyl]-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 6-[6-deoxy-α-d-mannopyranosyl-(1→6)-β-d-glucopyranosyl]-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-chromen-4-one 
InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 (1S)-1,5-anhydro-6-O-(6-deoxy-α-d-mannopyranosyl)-1-[5,7-dihydroxy-2-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-6-yl]-d-glucitol InChI=1S/C27H30O14/c1-9-19(31)22(34)25(37)27(39-9)38-8-16-20(32)23(35)24(36)26(41-16)18-13(30)7-15-17(21(18)33)12(29)6-14(40-15)10-2-4-11(28)5-3-10/h2-7,9,16,19-20,22-28,30-37H,8H2,1H3/t9-,16-,19-,20-,22+,23+,24-,25+,26+,27+/m1/s1 3-hydroxyflavone InChI=1S/C15H10O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,17H 3-hydroxy-2-phenyl-4H-1-benzopyran-4-one InChI=1S/C15H10O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,17H 3-hydroxy-2-phenyl-4H-chromen-4-one InChI=1S/C15H10O3/c16-13-11-8-4-5-9-12(11)18-15(14(13)17)10-6-2-1-3-7-10/h1-9,17H 4′-hydroxy-3-{4-O-[(2E)-3-(4-hydroxy-3-methoxyphenyl)prop-2-enoyl]-[α-l-rhamnopyranosyl-(1→2)]-[α-l-rhamnopyranosyl-(1→6)]-β-d-galactopyranosyloxy}-7-(α-l-rhamnopyranosyloxy)flavone InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-3-yl [6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-4-O-[(2E)-3-(4-hydroxy-3-methoxyphenyl)-prop-2-enoyl]-β-d-galactopyranoside 
InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4-oxo-4H-chromen-3-yl [6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-4-O-[(2E)-3-(4-hydroxy-3-methoxyphenyl)-prop-2-enoyl]-β-d-galactopyranoside InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 #{7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-3-yl 4-deoxy-[6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-β-d-galactopyranosid-4-yl} (2E)-3-(4-hydroxy-3-methoxyphenyl)prop-2-enoate InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 #{7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4-oxo-4H-chromen-3-yl 4-deoxy-[6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-β-d-galactopyranosid-4-yl} (2E)-3-(4-hydroxy-3-methoxyphenyl)prop-2-enoate 
InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 3-{[6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-4-O-[(2E)-3-(4-hydroxy-3-methoxyphenyl)prop-2-enoyl]-β-d-galactopyranosyloxy}-7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 3-{[6-deoxy-α-l-mannopyranosyl-(1→2)]-[6-deoxy-α-l-mannopyranosyl-(1→6)]-4-O-[(2E)-3-(4-hydroxy-3-methoxyphenyl)prop-2-enoyl]-β-d-galactopyranosyloxy}-7-(6-deoxy-α-l-mannopyranosyloxy)-2-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C49H58O25/c1-18-31(53)35(57)38(60)46(66-18)65-17-29-43(72-30(52)14-6-21-5-13-26(51)28(15-21)64-4)41(63)45(74-48-40(62)37(59)33(55)20(3)68-48)49(71-29)73-44-34(56)25-12-11-24(69-47-39(61)36(58)32(54)19(2)67-47)16-27(25)70-42(44)22-7-9-23(50)10-8-22/h5-16,18-20,29,31-33,35-41,43,45-51,53-55,57-63H,17H2,1-4H3/b14-6+/t18-,19-,20-,29+,31-,32-,33-,35+,36+,37+,38+,39+,40+,41-,43-,45+,46+,47-,48-,49-/m0/s1 isoflavone InChI=1S/C15H10O2/c16-15-12-8-4-5-9-14(12)17-10-13(15)11-6-2-1-3-7-11/h1-10H 3-phenyl-4H-1-benzopyran-4-one InChI=1S/C15H10O2/c16-15-12-8-4-5-9-14(12)17-10-13(15)11-6-2-1-3-7-11/h1-10H 3-phenyl-4H-chromen-4-one InChI=1S/C15H10O2/c16-15-12-8-4-5-9-14(12)17-10-13(15)11-6-2-1-3-7-11/h1-10H 
4′,5,7-trihydroxyisoflavone InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)11-7-20-13-6-10(17)5-12(18)14(13)15(11)19/h1-7,16-18H genistein InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)11-7-20-13-6-10(17)5-12(18)14(13)15(11)19/h1-7,16-18H 5,7-dihydroxy-3-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)11-7-20-13-6-10(17)5-12(18)14(13)15(11)19/h1-7,16-18H 5,7-dihydroxy-3-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C15H10O5/c16-9-3-1-8(2-4-9)11-7-20-13-6-10(17)5-12(18)14(13)15(11)19/h1-7,16-18H 7-(β-d-glucopyranosyloxy)-4′,5-dihydroxyisoflavone InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 genistin InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 7-O-(β-d-glucopyranosyl)genistein InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 5-hydroxy-3-(4-hydroxyphenyl)-4-oxo-4H-1-benzopyran-7-yl β-d-glucopyranoside InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 5-hydroxy-3-(4-hydroxyphenyl)-4-oxo-4H-chromen-7-yl β-d-glucopyranoside InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 7-(β-d-glucopyranosyloxy)-5-hydroxy-3-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 7-(β-d-glucopyranosyloxy)-5-hydroxy-3-(4-hydroxyphenyl)-4H-chromen-4-one 
InChI=1S/C21H20O10/c22-7-15-18(26)19(27)20(28)21(31-15)30-11-5-13(24)16-14(6-11)29-8-12(17(16)25)9-1-3-10(23)4-2-9/h1-6,8,15,18-24,26-28H,7H2/t15-,18-,19+,20-,21-/m1/s1 neoflavone InChI=1S/C15H10O2/c16-15-10-13(11-6-2-1-3-7-11)12-8-4-5-9-14(12)17-15/h1-10H 4-phenylcoumarin InChI=1S/C15H10O2/c16-15-10-13(11-6-2-1-3-7-11)12-8-4-5-9-14(12)17-15/h1-10H 4-phenyl-2H-1-benzopyran-2-one InChI=1S/C15H10O2/c16-15-10-13(11-6-2-1-3-7-11)12-8-4-5-9-14(12)17-15/h1-10H 4-phenyl-2H-chromen-2-one InChI=1S/C15H10O2/c16-15-10-13(11-6-2-1-3-7-11)12-8-4-5-9-14(12)17-15/h1-10H 3′,6-dihydroxy-4′,7-dimethoxyneoflavone InChI=1S/C17H14O6/c1-21-14-4-3-9(5-12(14)18)10-7-17(20)23-15-8-16(22-2)13(19)6-11(10)15/h3-8,18-19H,1-2H3 #melannein InChI=1S/C17H14O6/c1-21-14-4-3-9(5-12(14)18)10-7-17(20)23-15-8-16(22-2)13(19)6-11(10)15/h3-8,18-19H,1-2H3 6-hydroxy-4-(3-hydroxy-4-methoxyphenyl)-7-methoxy-2H-1-benzopyran-2-one InChI=1S/C17H14O6/c1-21-14-4-3-9(5-12(14)18)10-7-17(20)23-15-8-16(22-2)13(19)6-11(10)15/h3-8,18-19H,1-2H3 6-hydroxy-4-(3-hydroxy-4-methoxyphenyl)-7-methoxy-2H-chromen-2-one InChI=1S/C17H14O6/c1-21-14-4-3-9(5-12(14)18)10-7-17(20)23-15-8-16(22-2)13(19)6-11(10)15/h3-8,18-19H,1-2H3 7-(β-d-glucopyranosyloxy)-8-methylneoflavone InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 7-(β-d-glucopyranosyloxy)-8-methyl-4-phenylcoumarin InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 8-methyl-2-oxo-4-phenyl-2H-1-benzopyran-7-yl β-d-glucopyranoside InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 8-methyl-2-oxo-4-phenyl-2H-chromen-7-yl β-d-glucopyranoside 
InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 7-(β-d-glucopyranosyloxy)-8-methyl-4-phenyl-2H-1-benzopyran-2-one InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 7-(β-d-glucopyranosyloxy)-8-methyl-4-phenyl-2H-chromen-2-one InChI=1S/C22H22O8/c1-11-15(28-22-20(27)19(26)18(25)16(10-23)29-22)8-7-13-14(9-17(24)30-21(11)13)12-5-3-2-4-6-12/h2-9,16,18-20,22-23,25-27H,10H2,1H3/t16-,18-,19+,20-,22-/m1/s1 chalcone InChI=1S/C15H12O/c16-15(14-9-5-2-6-10-14)12-11-13-7-3-1-4-8-13/h1-12H/b12-11+ (2E)-1,3-diphenylprop-2-en-1-one InChI=1S/C15H12O/c16-15(14-9-5-2-6-10-14)12-11-13-7-3-1-4-8-13/h1-12H/b12-11+ 4-(dimethylamino)-2′,5′-dihydroxychalcone InChI=1S/C17H17NO3/c1-18(2)13-6-3-12(4-7-13)5-9-16(20)15-11-14(19)8-10-17(15)21/h3-11,19,21H,1-2H3/b9-5+ (2E)-1-(2,5-dihydroxyphenyl)-3-[4-(dimethylamino)phenyl]prop-2-en-1-one InChI=1S/C17H17NO3/c1-18(2)13-6-3-12(4-7-13)5-9-16(20)15-11-14(19)8-10-17(15)21/h3-11,19,21H,1-2H3/b9-5+ 4-[(1E)-3-oxo-3-phenylprop-1-en-1-yl]benzonitrile InChI=1S/C16H11NO/c17-12-14-8-6-13(7-9-14)10-11-16(18)15-4-2-1-3-5-15/h1-11H/b11-10+ 2-[(2E)-3-phenylprop-2-enoyl]benzoic acid InChI=1S/C16H12O3/c17-15(11-10-12-6-2-1-3-7-12)13-8-4-5-9-14(13)16(18)19/h1-11H,(H,18,19)/b11-10+ 1-(4-hydroxy-2-methoxyphenyl)-3-phenylpropan-1-one InChI=1S/C16H16O3/c1-19-16-11-13(17)8-9-14(16)15(18)10-7-12-5-3-2-4-6-12/h2-6,8-9,11,17H,7,10H2,1H3 aurone InChI=1S/C15H10O2/c16-15-12-8-4-5-9-13(12)17-14(15)10-11-6-2-1-3-7-11/h1-10H/b14-10- (2Z)-2-benzylidene-1-benzofuran-3(2H)-one InChI=1S/C15H10O2/c16-15-12-8-4-5-9-13(12)17-14(15)10-11-6-2-1-3-7-11/h1-10H/b14-10- (2Z)-2-(phenylmethylidene)-1-benzofuran-3(2H)-one InChI=1S/C15H10O2/c16-15-12-8-4-5-9-13(12)17-14(15)10-11-6-2-1-3-7-11/h1-10H/b14-10- 3′,4,4′,6-tetrahydroxyaurone 
InChI=1S/C15H10O6/c16-8-5-11(19)14-12(6-8)21-13(15(14)20)4-7-1-2-9(17)10(18)3-7/h1-6,16-19H/b13-4- #aureusidin InChI=1S/C15H10O6/c16-8-5-11(19)14-12(6-8)21-13(15(14)20)4-7-1-2-9(17)10(18)3-7/h1-6,16-19H/b13-4- (2Z)-2-[(3,4-dihydroxyphenyl)methylidene]-4,6-dihydroxy-1-benzofuran-3(2H)-one InChI=1S/C15H10O6/c16-8-5-11(19)14-12(6-8)21-13(15(14)20)4-7-1-2-9(17)10(18)3-7/h1-6,16-19H/b13-4- (2Z)-2-(3,4-dihydroxybenzylidene)-4,6-dihydroxy-1-benzofuran-3(2H)-one InChI=1S/C15H10O6/c16-8-5-11(19)14-12(6-8)21-13(15(14)20)4-7-1-2-9(17)10(18)3-7/h1-6,16-19H/b13-4- 6-(β-d-glucopyranosyloxy)-4,4′-dihydroxyaurone InChI=1S/C21H20O10/c22-8-15-18(26)19(27)20(28)21(31-15)29-11-6-12(24)16-13(7-11)30-14(17(16)25)5-9-1-3-10(23)4-2-9/h1-7,15,18-24,26-28H,8H2/b14-5-/t15-,18-,19+,20-,21-/m1/s1 (2Z)-6-(β-d-glucopyranosyloxy)-4-hydroxy-2-(4-hydroxybenzylidene)-1-benzofuran-3(2H)-one InChI=1S/C21H20O10/c22-8-15-18(26)19(27)20(28)21(31-15)29-11-6-12(24)16-13(7-11)30-14(17(16)25)5-9-1-3-10(23)4-2-9/h1-7,15,18-24,26-28H,8H2/b14-5-/t15-,18-,19+,20-,21-/m1/s1 (2Z)-6-(β-d-glucopyranosyloxy)-4-hydroxy-2-[(4-hydroxyphenyl)methylidene]-1-benzofuran-3(2H)-one InChI=1S/C21H20O10/c22-8-15-18(26)19(27)20(28)21(31-15)29-11-6-12(24)16-13(7-11)30-14(17(16)25)5-9-1-3-10(23)4-2-9/h1-7,15,18-24,26-28H,8H2/b14-5-/t15-,18-,19+,20-,21-/m1/s1 3,3′,4′,5,7-pentahydroxyflavylium InChI=1S/C15H10O6/c16-8-4-11(18)9-6-13(20)15(21-14(9)5-8)7-1-2-10(17)12(19)3-7/h1-6H,(H4-,16,17,18,19,20)/p+1 cyanidin InChI=1S/C15H10O6/c16-8-4-11(18)9-6-13(20)15(21-14(9)5-8)7-1-2-10(17)12(19)3-7/h1-6H,(H4-,16,17,18,19,20)/p+1 2-(3,4-dihydroxyphenyl)-3,5,7-trihydroxy-1λ4-benzopyran-1-ylium InChI=1S/C15H10O6/c16-8-4-11(18)9-6-13(20)15(21-14(9)5-8)7-1-2-10(17)12(19)3-7/h1-6H,(H4-,16,17,18,19,20)/p+1 2-(3,4-dihydroxyphenyl)-3,5,7-trihydroxychromenylium InChI=1S/C15H10O6/c16-8-4-11(18)9-6-13(20)15(21-14(9)5-8)7-1-2-10(17)12(19)3-7/h1-6H,(H4-,16,17,18,19,20)/p+1 3-(β-d-glucopyranosyloxy)-3′,4′,5,7-tetrahydroxyflavylium 
InChI=1S/C21H20O11/c22-7-16-17(27)18(28)19(29)21(32-16)31-15-6-10-12(25)4-9(23)5-14(10)30-20(15)8-1-2-11(24)13(26)3-8/h1-6,16-19,21-22,27-29H,7H2,(H3-,23,24,25,26)/p+1/t16-,17-,18+,19-,21-/m1/s1 3-O-(β-d-glucopyranosyl)cyanidin InChI=1S/C21H20O11/c22-7-16-17(27)18(28)19(29)21(32-16)31-15-6-10-12(25)4-9(23)5-14(10)30-20(15)8-1-2-11(24)13(26)3-8/h1-6,16-19,21-22,27-29H,7H2,(H3-,23,24,25,26)/p+1/t16-,17-,18+,19-,21-/m1/s1 2-(3,4-dihydroxyphenyl)-3-(β-d-glucopyranosyloxy)-5,7-dihydroxy-1λ4-benzopyran-1-ylium InChI=1S/C21H20O11/c22-7-16-17(27)18(28)19(29)21(32-16)31-15-6-10-12(25)4-9(23)5-14(10)30-20(15)8-1-2-11(24)13(26)3-8/h1-6,16-19,21-22,27-29H,7H2,(H3-,23,24,25,26)/p+1/t16-,17-,18+,19-,21-/m1/s1 2-(3,4-dihydroxyphenyl)-3-(β-d-glucopyranosyloxy)-5,7-dihydroxychromenylium InChI=1S/C21H20O11/c22-7-16-17(27)18(28)19(29)21(32-16)31-15-6-10-12(25)4-9(23)5-14(10)30-20(15)8-1-2-11(24)13(26)3-8/h1-6,16-19,21-22,27-29H,7H2,(H3-,23,24,25,26)/p+1/t16-,17-,18+,19-,21-/m1/s1 3′,4′,5,7-tetrahydroxy-3-{6-O-[(E)-3-(4-hydroxyphenyl)prop-2-enoyl]-β-d-glucopyranosyloxy}flavylium InChI=1S/C30H26O13/c31-16-5-1-14(2-6-16)3-8-25(36)40-13-24-26(37)27(38)28(39)30(43-24)42-23-12-18-20(34)10-17(32)11-22(18)41-29(23)15-4-7-19(33)21(35)9-15/h1-12,24,26-28,30,37-39H,13H2,(H4-,31,32,33,34,35,36)/p+1/t24-,26-,27+,28-,30-/m1/s1 3-O-{6-O-[(E)-3-(4-hydroxyphenyl)prop-2-enoyl]-β-d-glucopyranosyl}cyanidin InChI=1S/C30H26O13/c31-16-5-1-14(2-6-16)3-8-25(36)40-13-24-26(37)27(38)28(39)30(43-24)42-23-12-18-20(34)10-17(32)11-22(18)41-29(23)15-4-7-19(33)21(35)9-15/h1-12,24,26-28,30,37-39H,13H2,(H4-,31,32,33,34,35,36)/p+1/t24-,26-,27+,28-,30-/m1/s1 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-{6-O-[(2E)-3-(4-hydroxyphenyl)prop-2-enoyl]-β-d-glucopyranosyloxy}-1λ4-benzopyran-1-ylium 
InChI=1S/C30H26O13/c31-16-5-1-14(2-6-16)3-8-25(36)40-13-24-26(37)27(38)28(39)30(43-24)42-23-12-18-20(34)10-17(32)11-22(18)41-29(23)15-4-7-19(33)21(35)9-15/h1-12,24,26-28,30,37-39H,13H2,(H4-,31,32,33,34,35,36)/p+1/t24-,26-,27+,28-,30-/m1/s1 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-{6-O-[(2E)-3-(4-hydroxyphenyl)prop-2-enoyl]-β-d-glucopyranosyloxy}chromenylium InChI=1S/C30H26O13/c31-16-5-1-14(2-6-16)3-8-25(36)40-13-24-26(37)27(38)28(39)30(43-24)42-23-12-18-20(34)10-17(32)11-22(18)41-29(23)15-4-7-19(33)21(35)9-15/h1-12,24,26-28,30,37-39H,13H2,(H4-,31,32,33,34,35,36)/p+1/t24-,26-,27+,28-,30-/m1/s1 3′,4′,5,7-tetrahydroxy-3-[α-l-rhamnopyranosyl-(1→6)-β-d-glucopyranosyloxy]flavylium InChI=1S/C27H30O15/c1-9-19(32)21(34)23(36)26(39-9)38-8-18-20(33)22(35)24(37)27(42-18)41-17-7-12-14(30)5-11(28)6-16(12)40-25(17)10-2-3-13(29)15(31)4-10/h2-7,9,18-24,26-27,32-37H,8H2,1H3,(H3-,28,29,30,31)/p+1/t9-,18+,19-,20+,21+,22-,23+,24+,26+,27+/m0/s1 3-O-β-rutinosylcyanidin InChI=1S/C27H30O15/c1-9-19(32)21(34)23(36)26(39-9)38-8-18-20(33)22(35)24(37)27(42-18)41-17-7-12-14(30)5-11(28)6-16(12)40-25(17)10-2-3-13(29)15(31)4-10/h2-7,9,18-24,26-27,32-37H,8H2,1H3,(H3-,28,29,30,31)/p+1/t9-,18+,19-,20+,21+,22-,23+,24+,26+,27+/m0/s1 3-[6-deoxy-α-l-mannopyranosyl-(1→6)-β-d-glucopyranosyloxy]-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-1λ4-benzopyran-1-ylium InChI=1S/C27H30O15/c1-9-19(32)21(34)23(36)26(39-9)38-8-18-20(33)22(35)24(37)27(42-18)41-17-7-12-14(30)5-11(28)6-16(12)40-25(17)10-2-3-13(29)15(31)4-10/h2-7,9,18-24,26-27,32-37H,8H2,1H3,(H3-,28,29,30,31)/p+1/t9-,18+,19-,20+,21+,22-,23+,24+,26+,27+/m0/s1 3-[6-deoxy-α-l-mannopyranosyl-(1→6)-β-d-glucopyranosyloxy]-2-(3,4-dihydroxyphenyl)-5,7-dihydroxychromenylium InChI=1S/C27H30O15/c1-9-19(32)21(34)23(36)26(39-9)38-8-18-20(33)22(35)24(37)27(42-18)41-17-7-12-14(30)5-11(28)6-16(12)40-25(17)10-2-3-13(29)15(31)4-10/h2-7,9,18-24,26-27,32-37H,8H2,1H3,(H3-,28,29,30,31)/p+1/t9-,18+,19-,20+,21+,22-,23+,24+,26+,27+/m0/s1 
3,7-bis(β-d-glucopyranosyloxy)-3′,4′,5-trihydroxyflavylium InChI=1S/C27H30O16/c28-7-17-19(33)21(35)23(37)26(42-17)39-10-4-13(31)11-6-16(41-27-24(38)22(36)20(34)18(8-29)43-27)25(40-15(11)5-10)9-1-2-12(30)14(32)3-9/h1-6,17-24,26-29,33-38H,7-8H2,(H2-,30,31,32)/p+1/t17-,18-,19-,20-,21+,22+,23-,24-,26-,27-/m1/s1 3,7-di-O-(β-d-glucopyranosyl)cyanidin InChI=1S/C27H30O16/c28-7-17-19(33)21(35)23(37)26(42-17)39-10-4-13(31)11-6-16(41-27-24(38)22(36)20(34)18(8-29)43-27)25(40-15(11)5-10)9-1-2-12(30)14(32)3-9/h1-6,17-24,26-29,33-38H,7-8H2,(H2-,30,31,32)/p+1/t17-,18-,19-,20-,21+,22+,23-,24-,26-,27-/m1/s1 2-(3,4-dihydroxyphenyl)-3,7-bis(β-d-glucopyranosyloxy)-5-hydroxy-1λ4-benzopyran-1-ylium InChI=1S/C27H30O16/c28-7-17-19(33)21(35)23(37)26(42-17)39-10-4-13(31)11-6-16(41-27-24(38)22(36)20(34)18(8-29)43-27)25(40-15(11)5-10)9-1-2-12(30)14(32)3-9/h1-6,17-24,26-29,33-38H,7-8H2,(H2-,30,31,32)/p+1/t17-,18-,19-,20-,21+,22+,23-,24-,26-,27-/m1/s1 2-(3,4-dihydroxyphenyl)-3,7-bis(β-d-glucopyranosyloxy)-5-hydroxychromenylium InChI=1S/C27H30O16/c28-7-17-19(33)21(35)23(37)26(42-17)39-10-4-13(31)11-6-16(41-27-24(38)22(36)20(34)18(8-29)43-27)25(40-15(11)5-10)9-1-2-12(30)14(32)3-9/h1-6,17-24,26-29,33-38H,7-8H2,(H2-,30,31,32)/p+1/t17-,18-,19-,20-,21+,22+,23-,24-,26-,27-/m1/s1 #5-hydroxy-6,7-[methylenebis(oxy)]flavone InChI=1S/C16H10O5/c17-10-6-11(9-4-2-1-3-5-9)21-12-7-13-16(20-8-19-13)15(18)14(10)12/h1-7,18H,8H2 9-hydroxy-6-phenyl-8H-[1,3]dioxolo[4,5-g][1]benzopyran-8-one InChI=1S/C16H10O5/c17-10-6-11(9-4-2-1-3-5-9)21-12-7-13-16(20-8-19-13)15(18)14(10)12/h1-7,18H,8H2 9-hydroxy-6-phenyl-8H-[1,3]dioxolo[4,5-g]chromen-8-one InChI=1S/C16H10O5/c17-10-6-11(9-4-2-1-3-5-9)21-12-7-13-16(20-8-19-13)15(18)14(10)12/h1-7,18H,8H2 #3,4-[methylenebis(oxy)]chalcone InChI=1S/C16H12O3/c17-14(13-4-2-1-3-5-13)8-6-12-7-9-15-16(10-12)19-11-18-15/h1-10H,11H2/b8-6+ (2E)-3-(1,3-benzodioxol-5-yl)-1-phenylprop-2-en-1-one InChI=1S/C16H12O3/c17-14(13-4-2-1-3-5-13)8-6-12-7-9-15-16(10-12)19-11-18-15/h1-10H,11H2/b8-6+ 
#(+)-2-methoxy-2,21-dihydrofuro[2″,3″:6,7]aurone InChI=1S/C18H14O4/c1-20-18(11-12-5-3-2-4-6-12)17(19)14-7-8-15-13(9-10-21-15)16(14)22-18/h2-10H,11H2,1H3 #castillene A InChI=1S/C18H14O4/c1-20-18(11-12-5-3-2-4-6-12)17(19)14-7-8-15-13(9-10-21-15)16(14)22-18/h2-10H,11H2,1H3 (+)-2-benzyl-2-methoxybenzo[1,2-b:3,4-b′]difuran-3(2H)-one InChI=1S/C18H14O4/c1-20-18(11-12-5-3-2-4-6-12)17(19)14-7-8-15-13(9-10-21-15)16(14)22-18/h2-10H,11H2,1H3 #21,4,4′,5′,6-pentahydroxy-21,6″-dihydropyrano[2″,3″,4″,5″:2,21,1′,2′]aurone InChI=1S/C16H12O8/c17-7-2-11(20)13-12(3-7)24-16(15(13)22)14(21)8-4-10(19)9(18)1-6(8)5-23-16/h1-4,14,17-21H,5H2 #crombenin InChI=1S/C16H12O8/c17-7-2-11(20)13-12(3-7)24-16(15(13)22)14(21)8-4-10(19)9(18)1-6(8)5-23-16/h1-4,14,17-21H,5H2 4,4′,6,6′,7′-pentahydroxy-1′,4′-dihydro-3H-spiro[[1]benzofuran-2,3′-[2]benzopyran]-3-one InChI=1S/C16H12O8/c17-7-2-11(20)13-12(3-7)24-16(15(13)22)14(21)8-4-10(19)9(18)1-6(8)5-23-16/h1-4,14,17-21H,5H2 4,4′,6,6′,7′-pentahydroxy-3H-spiro[[1]benzofuran-2,3′-isochroman]-3-one InChI=1S/C16H12O8/c17-7-2-11(20)13-12(3-7)24-16(15(13)22)14(21)8-4-10(19)9(18)1-6(8)5-23-16/h1-4,14,17-21H,5H2 #6″-(3,4-dihydroxyphenyl)-3-(β-d-glucopyranosyloxy)-4′,7-dihydroxy-3′,5′-dimethoxypyrano[4″,3″,2″:4,5]-flavylium InChI=1S/C31H28O14/c1-40-21-6-13(7-22(41-2)25(21)36)29-30(45-31-28(39)27(38)26(37)23(11-32)44-31)15-10-18(12-3-4-16(34)17(35)5-12)42-19-8-14(33)9-20(43-29)24(15)19/h3-10,23,26-28,31-32,37-39H,11H2,1-2H3,(H3-,33,34,35,36)/p+1/t23-,26-,27+,28-,31+/m1/s1 5-(3,4-dihydroxyphenyl)-3-(β-d-glucopyranosyloxy)-8-hydroxy-2-(4-hydroxy-3,5-dimethoxyphenyl)-1λ4-pyrano[4,3,2-de][1]benzopyran-1-ylium InChI=1S/C31H28O14/c1-40-21-6-13(7-22(41-2)25(21)36)29-30(45-31-28(39)27(38)26(37)23(11-32)44-31)15-10-18(12-3-4-16(34)17(35)5-12)42-19-8-14(33)9-20(43-29)24(15)19/h3-10,23,26-28,31-32,37-39H,11H2,1-2H3,(H3-,33,34,35,36)/p+1/t23-,26-,27+,28-,31+/m1/s1 
5-(3,4-dihydroxyphenyl)-3-(β-d-glucopyranosyloxy)-8-hydroxy-2-(4-hydroxy-3,5-dimethoxyphenyl)-pyrano[4,3,2-de]chromenylium InChI=1S/C31H28O14/c1-40-21-6-13(7-22(41-2)25(21)36)29-30(45-31-28(39)27(38)26(37)23(11-32)44-31)15-10-18(12-3-4-16(34)17(35)5-12)42-19-8-14(33)9-20(43-29)24(15)19/h3-10,23,26-28,31-32,37-39H,11H2,1-2H3,(H3-,33,34,35,36)/p+1/t23-,26-,27+,28-,31+/m1/s1 #3′,4′,7,8″,9″-pentahydroxy-6″-oxo-6″H-1″,5″-dioxaphenaleno[4″,3″,2″:3,4,5]flavylium InChI=1S/C22H10O9/c23-8-4-13-16-14(5-8)30-20-15-9(6-12(26)18(20)27)22(28)31-21(17(15)16)19(29-13)7-1-2-10(24)11(25)3-7/h1-6H,(H4-,23,24,25,26,27,28)/p+1 #rosacyanin B InChI=1S/C22H10O9/c23-8-4-13-16-14(5-8)30-20-15-9(6-12(26)18(20)27)22(28)31-21(17(15)16)19(29-13)7-1-2-10(24)11(25)3-7/h1-6H,(H4-,23,24,25,26,27,28)/p+1 11-(3,4-dihydroxyphenyl)-4,5,8-trihydroxy-2H-1,6-dioxa-10-oxoniabenzo[cd]pyren-2-one InChI=1S/C22H10O9/c23-8-4-13-16-14(5-8)30-20-15-9(6-12(26)18(20)27)22(28)31-21(17(15)16)19(29-13)7-1-2-10(24)11(25)3-7/h1-6H,(H4-,23,24,25,26,27,28)/p+1 pterocarpan InChI=1S/C15H12O2/c1-4-8-14-10(5-1)12-9-16-13-7-3-2-6-11(13)15(12)17-14/h1-8,12,15H,9H2/t12-,15-/m0/s1 (6aR,11aR)-6a,11a-dihydro-6H-[1]benzofuro[3,2-c][1]benzopyran InChI=1S/C15H12O2/c1-4-8-14-10(5-1)12-9-16-13-7-3-2-6-11(13)15(12)17-14/h1-8,12,15H,9H2/t12-,15-/m0/s1 (6aR,11aR)-6a,11a-dihydro-6H-[1]benzofuro[3,2-c]chromene InChI=1S/C15H12O2/c1-4-8-14-10(5-1)12-9-16-13-7-3-2-6-11(13)15(12)17-14/h1-8,12,15H,9H2/t12-,15-/m0/s1 9-methoxypterocarpan-3-ol InChI=1S/C16H14O4/c1-18-10-3-5-11-13-8-19-14-6-9(17)2-4-12(14)16(13)20-15(11)7-10/h2-7,13,16-17H,8H2,1H3/t13-,16-/m0/s1 #(−)-medicarpin InChI=1S/C16H14O4/c1-18-10-3-5-11-13-8-19-14-6-9(17)2-4-12(14)16(13)20-15(11)7-10/h2-7,13,16-17H,8H2,1H3/t13-,16-/m0/s1 (6aR,11aR)-9-methoxy-6a,11a-dihydro-6H-[1]benzofuro[3,2-c][1]benzopyran-3-ol InChI=1S/C16H14O4/c1-18-10-3-5-11-13-8-19-14-6-9(17)2-4-12(14)16(13)20-15(11)7-10/h2-7,13,16-17H,8H2,1H3/t13-,16-/m0/s1 
(6aR,11aR)-9-methoxy-6a,11a-dihydro-6H-[1]benzofuro[3,2-c]chromen-3-ol InChI=1S/C16H14O4/c1-18-10-3-5-11-13-8-19-14-6-9(17)2-4-12(14)16(13)20-15(11)7-10/h2-7,13,16-17H,8H2,1H3/t13-,16-/m0/s1 3,9-dihydroxypterocarp-6a(11a)-en-6-one InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H 3,9-dihydroxycoumestan-6-one InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H 3,9-dihydroxy-6a,11a-didehydropterocarpan-6-one InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H #coumestrol InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H 3,9-dihydroxy-6H-[1]benzofuro[3,2-c][1]benzopyran-6-one InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H 3,9-dihydroxy-6H-[1]benzofuro[3,2-c]chromen-6-one InChI=1S/C15H8O5/c16-7-1-3-9-11(5-7)19-14-10-4-2-8(17)6-12(10)20-15(18)13(9)14/h1-6,16-17H Rotenane InChI=1S/C16H14O2/c1-3-7-14-11(5-1)9-13-12-6-2-4-8-15(12)17-10-16(13)18-14/h1-8,13,16H,9-10H2/t13-,16+/m0/s1 (6aS,12aS)-6,6a,12,12a-tetrahydro[1]benzopyrano[3,4-b][1]benzopyran InChI=1S/C16H14O2/c1-3-7-14-11(5-1)9-13-12-6-2-4-8-15(12)17-10-16(13)18-14/h1-8,13,16H,9-10H2/t13-,16+/m0/s1 (6aS,12aS)-6,6a,12,12a-tetrahydrochromeno[3,4-b]chromene InChI=1S/C16H14O2/c1-3-7-14-11(5-1)9-13-12-6-2-4-8-15(12)17-10-16(13)18-14/h1-8,13,16H,9-10H2/t13-,16+/m0/s1 5-hydroxy-4′,5′,7-trimethoxyrotenan-4-one InChI=1S/C19H18O7/c1-22-9-4-11(20)18-15(5-9)26-16-8-25-12-7-14(24-3)13(23-2)6-10(12)17(16)19(18)21/h4-7,16-17,20H,8H2,1-3H3/t16-,17+/m1/s1 #sermundone InChI=1S/C19H18O7/c1-22-9-4-11(20)18-15(5-9)26-16-8-25-12-7-14(24-3)13(23-2)6-10(12)17(16)19(18)21/h4-7,16-17,20H,8H2,1-3H3/t16-,17+/m1/s1 (6aS,12aS)-11-hydroxy-2,3,9-trimethoxy-6a,12a-dihydro[1]benzopyrano[3,4-b][1]benzopyran-12(6H)-one 
InChI=1S/C19H18O7/c1-22-9-4-11(20)18-15(5-9)26-16-8-25-12-7-14(24-3)13(23-2)6-10(12)17(16)19(18)21/h4-7,16-17,20H,8H2,1-3H3/t16-,17+/m1/s1 (6aS,12aS)-11-hydroxy-2,3,9-trimethoxy-6a,12a-dihydrochromeno[3,4-b]chromen-12(6H)-one InChI=1S/C19H18O7/c1-22-9-4-11(20)18-15(5-9)26-16-8-25-12-7-14(24-3)13(23-2)6-10(12)17(16)19(18)21/h4-7,16-17,20H,8H2,1-3H3/t16-,17+/m1/s1 #(5″R)-4′,5′-dimethoxy-5″-(prop-1-en-2-yl)-4″,5″-dihydrofuro[2″,3″:7,8]rotenan-4-one InChI=1S/C23H22O6/c1-11(2)16-8-14-15(28-16)6-5-12-22(24)21-13-7-18(25-3)19(26-4)9-17(13)27-10-20(21)29-23(12)14/h5-7,9,16,20-21H,1,8,10H2,2-4H3/t16-,20-,21+/m1/s1 (2R,6aS,12aS)-8,9-dimethoxy-2-(prop-1-en-2-yl)-1,2,12,12a-tetrahydro[1]benzopyrano[3,4-b]furo[2,3-h]-[1]benzopyran-6(6aH)-one InChI=1S/C23H22O6/c1-11(2)16-8-14-15(28-16)6-5-12-22(24)21-13-7-18(25-3)19(26-4)9-17(13)27-10-20(21)29-23(12)14/h5-7,9,16,20-21H,1,8,10H2,2-4H3/t16-,20-,21+/m1/s1 (2R,6aS,12aS)-8,9-dimethoxy-2-(prop-1-en-2-yl)-1,2,12,12a-tetrahydrochromeno[3,4-b]furo[2,3-h]-chromen-6(6aH)-one InChI=1S/C23H22O6/c1-11(2)16-8-14-15(28-16)6-5-12-22(24)21-13-7-18(25-3)19(26-4)9-17(13)27-10-20(21)29-23(12)14/h5-7,9,16,20-21H,1,8,10H2,2-4H3/t16-,20-,21+/m1/s1 #(5″R)-4′,5′-dimethoxy-5″-(prop-1-en-2-yl)-4″,5″-dihydrofuro[2″,3″:7,8]roten-2-ene-21,4-dione InChI=1S/C23H18O7/c1-10(2)15-8-13-14(28-15)6-5-11-20(24)19-12-7-17(26-3)18(27-4)9-16(12)29-23(25)22(19)30-21(11)13/h5-7,9,15H,1,8H2,2-4H3/t15-/m1/s1 (2R)-8,9-dimethoxy-2-(prop-1-en-2-yl)-1,2-dihydro[1]benzopyrano[3,4-b]furo[2,3-h][1]benzopyran-6,12-dione InChI=1S/C23H18O7/c1-10(2)15-8-13-14(28-15)6-5-11-20(24)19-12-7-17(26-3)18(27-4)9-16(12)29-23(25)22(19)30-21(11)13/h5-7,9,15H,1,8H2,2-4H3/t15-/m1/s1 (2R)-8,9-dimethoxy-2-(prop-1-en-2-yl)-1,2-dihydrochromeno[3,4-b]furo[2,3-h]chromene-6,12-dione InChI=1S/C23H18O7/c1-10(2)15-8-13-14(28-15)6-5-11-20(24)19-12-7-17(26-3)18(27-4)9-16(12)29-23(25)22(19)30-21(11)13/h5-7,9,15H,1,8H2,2-4H3/t15-/m1/s1 
#6″′,6″′-dimethyl-5″′,6″′-dihydro-4″′H-[1,3]dioxolo[4″,5″:4′,5′]pyrano[2″′,3″′:7,8]rotenan-4-one InChI=1S/C22H20O6/c1-22(2)6-5-11-14(28-22)4-3-12-20(23)19-13-7-16-17(26-10-25-16)8-15(13)24-9-18(19)27-21(11)12/h3-4,7-8,18-19H,5-6,9-10H2,1-2H3/t18-,19+/m1/s1 #(5aS,12bS)-2,2-dimethyl-3,4,5a,12b-tetrahydro-2H-[1,3]dioxolo[4,5-g]pyrano[2,3-c:6,5-f′]bis-([1]benzopyran)-13(6H)-one InChI=1S/C22H20O6/c1-22(2)6-5-11-14(28-22)4-3-12-20(23)19-13-7-16-17(26-10-25-16)8-15(13)24-9-18(19)27-21(11)12/h3-4,7-8,18-19H,5-6,9-10H2,1-2H3/t18-,19+/m1/s1 (5aS,12bS)-2,2-dimethyl-3,4,5a,12b-tetrahydro-2H-[1,3]dioxolo[4,5-g]pyrano[2,3-c:6,5-f′]dichromen-13(6H)-one InChI=1S/C22H20O6/c1-22(2)6-5-11-14(28-22)4-3-12-20(23)19-13-7-16-17(26-10-25-16)8-15(13)24-9-18(19)27-21(11)12/h3-4,7-8,18-19H,5-6,9-10H2,1-2H3/t18-,19+/m1/s1 #(2R,3R,5″R,6″R)-3,5,7-trihydroxy-6″-(4-hydroxy-3-methoxyphenyl)-5″-(hydroxymethyl)-5″,6″-dihydro-[1,4]dioxino[2″,3″:3′,4′]flavan-4-one InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 #silibinin InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 #silybin InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 (2R,3R)-3,5,7-trihydroxy-2-[(2R,3R)-3-(4-hydroxy-3-methoxyphenyl)-2-(hydroxymethyl)-2,3-dihydro-1,4-benzodioxin-6-yl]-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 (2R,3R)-3,5,7-trihydroxy-2-[(2R,3R)-3-(4-hydroxy-3-methoxyphenyl)-2-(hydroxymethyl)-2,3-dihydro-1,4-benzodioxin-6-yl]chroman-4-one 
InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 (2R,3R)-3,5,7-trihydroxy-2-[(2R,3R)-3-(4-hydroxy-3-methoxyphenyl)-2-(hydroxymethyl)-2,3-dihydro-1,4-benzodioxin-6-yl]-2,3-dihydro-4H-chromen-4-one InChI=1S/C25H22O10/c1-32-17-6-11(2-4-14(17)28)24-20(10-26)33-16-5-3-12(7-18(16)34-24)25-23(31)22(30)21-15(29)8-13(27)9-19(21)35-25/h2-9,20,23-29,31H,10H2,1H3/t20-,23+,24-,25-/m1/s1 #[(3S,4R)-4′-methoxyisoflavan-2′,7-diol]-(4→5′)-[(3R)-4′-methoxyisoflavan-2′,7-diol] InChI=1S/C32H30O8/c1-37-21-6-8-22(27(35)12-21)26-16-40-31-11-20(34)5-7-23(31)32(26)25-13-24(28(36)14-30(25)38-2)18-9-17-3-4-19(33)10-29(17)39-15-18/h3-8,10-14,18,26,32-36H,9,15-16H2,1-2H3/t18-,26+,32+/m0/s1 (3S,4R)-4-{4-hydroxy-5-[(3R)-7-hydroxy-3,4-dihydro-2H-1-benzopyran-3-yl]-2-methoxyphenyl}-3-(2-hydroxy-4-methoxyphenyl)-3,4-dihydro-2H-1-benzopyran-7-ol InChI=1S/C32H30O8/c1-37-21-6-8-22(27(35)12-21)26-16-40-31-11-20(34)5-7-23(31)32(26)25-13-24(28(36)14-30(25)38-2)18-9-17-3-4-19(33)10-29(17)39-15-18/h3-8,10-14,18,26,32-36H,9,15-16H2,1-2H3/t18-,26+,32+/m0/s1 (3S,4R)-4-{4-hydroxy-5-[(3R)-7-hydroxychroman-3-yl]-2-methoxyphenyl}-3-(2-hydroxy-4-methoxyphenyl)-chroman-7-ol InChI=1S/C32H30O8/c1-37-21-6-8-22(27(35)12-21)26-16-40-31-11-20(34)5-7-23(31)32(26)25-13-24(28(36)14-30(25)38-2)18-9-17-3-4-19(33)10-29(17)39-15-18/h3-8,10-14,18,26,32-36H,9,15-16H2,1-2H3/t18-,26+,32+/m0/s1 (3S,4R)-4-{4-hydroxy-5-[(3R)-7-hydroxy-3,4-dihydro-2H-chromen-3-yl]-2-methoxyphenyl}-3-(2-hydroxy-4-methoxyphenyl)-3,4-dihydro-2H-chromen-7-ol InChI=1S/C32H30O8/c1-37-21-6-8-22(27(35)12-21)26-16-40-31-11-20(34)5-7-23(31)32(26)25-13-24(28(36)14-30(25)38-2)18-9-17-3-4-19(33)10-29(17)39-15-18/h3-8,10-14,18,26,32-36H,9,15-16H2,1-2H3/t18-,26+,32+/m0/s1 #[(2S)-5-hydroxy-4′,7-dimethoxyflavan-4-one]-(2′→6)-[(2S)-3′,4′,5,7-tetrahydroxyflavan-4-one] 
InChI=1S/C32H26O11/c1-40-15-4-5-17(26-12-23(37)30-21(35)9-16(41-2)10-27(30)43-26)18(8-15)29-22(36)13-28-31(32(29)39)24(38)11-25(42-28)14-3-6-19(33)20(34)7-14/h3-10,13,25-26,33-36,39H,11-12H2,1-2H3/t25-,26-/m0/s1 #[(2S)-4′,7-di-O-methylnaringenin]-(2′→6)-[(2S)-3′-hydroxynaringenin] InChI=1S/C32H26O11/c1-40-15-4-5-17(26-12-23(37)30-21(35)9-16(41-2)10-27(30)43-26)18(8-15)29-22(36)13-28-31(32(29)39)24(38)11-25(42-28)14-3-6-19(33)20(34)7-14/h3-10,13,25-26,33-36,39H,11-12H2,1-2H3/t25-,26-/m0/s1 (2S)-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-6-{2-[(2S)-5-hydroxy-7-methoxy-4-oxo-3,4-dihydro-2H-1-benzopyran-2-yl]-5-methoxyphenyl}-2,3-dihydro-4H-1-benzopyran-4-one InChI=1S/C32H26O11/c1-40-15-4-5-17(26-12-23(37)30-21(35)9-16(41-2)10-27(30)43-26)18(8-15)29-22(36)13-28-31(32(29)39)24(38)11-25(42-28)14-3-6-19(33)20(34)7-14/h3-10,13,25-26,33-36,39H,11-12H2,1-2H3/t25-,26-/m0/s1 (2S)-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-6-{2-[(2S)-5-hydroxy-7-methoxy-4-oxochroman-2-yl]-5-methoxyphenyl}chroman-4-one InChI=1S/C32H26O11/c1-40-15-4-5-17(26-12-23(37)30-21(35)9-16(41-2)10-27(30)43-26)18(8-15)29-22(36)13-28-31(32(29)39)24(38)11-25(42-28)14-3-6-19(33)20(34)7-14/h3-10,13,25-26,33-36,39H,11-12H2,1-2H3/t25-,26-/m0/s1 (2S)-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-6-{2-[(2S)-5-hydroxy-7-methoxy-4-oxo-3,4-dihydro-2H-chromen-2-yl]-5-methoxyphenyl}-2,3-dihydro-4H-chromen-4-one InChI=1S/C32H26O11/c1-40-15-4-5-17(26-12-23(37)30-21(35)9-16(41-2)10-27(30)43-26)18(8-15)29-22(36)13-28-31(32(29)39)24(38)11-25(42-28)14-3-6-19(33)20(34)7-14/h3-10,13,25-26,33-36,39H,11-12H2,1-2H3/t25-,26-/m0/s1 #(3′,4′,5,7-tetrahydroxyflavone)-(2′→6)-(3′,4′,5,7-tetrahydroxyflavone)-(2′→6)-(3′,4′,5,7-tetrahydroxyflavone) InChI=1S/C45H26O18/c46-16-8-23(51)37-24(52)11-30(62-32(37)9-16)17-2-5-20(48)42(57)35(17)41-28(56)14-34-39(45(41)60)26(54)12-31(63-34)18-3-6-21(49)43(58)36(18)40-27(55)13-33-38(44(40)59)25(53)10-29(61-33)15-1-4-19(47)22(50)7-15/h1-14,46-51,55-60H 
6-(6-{6-[6-(5,7-dihydroxy-4-oxo-4H-1-benzopyran-2-yl)-2,3-dihydroxyphenyl]-5,7-dihydroxy-4-oxo-4H-1-benzopyran-2-yl}-2,3-dihydroxyphenyl)-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-4H-1-benzopyran-4-one InChI=1S/C45H26O18/c46-16-8-23(51)37-24(52)11-30(62-32(37)9-16)17-2-5-20(48)42(57)35(17)41-28(56)14-34-39(45(41)60)26(54)12-31(63-34)18-3-6-21(49)43(58)36(18)40-27(55)13-33-38(44(40)59)25(53)10-29(61-33)15-1-4-19(47)22(50)7-15/h1-14,46-51,55-60H 6-(6-{6-[6-(5,7-dihydroxy-4-oxo-4H-chromen-2-yl)-2,3-dihydroxyphenyl]-5,7-dihydroxy-4-oxo-4H-chromen-2-yl}-2,3-dihydroxyphenyl)-2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-4H-chromen-4-one InChI=1S/C45H26O18/c46-16-8-23(51)37-24(52)11-30(62-32(37)9-16)17-2-5-20(48)42(57)35(17)41-28(56)14-34-39(45(41)60)26(54)12-31(63-34)18-3-6-21(49)43(58)36(18)40-27(55)13-33-38(44(40)59)25(53)10-29(61-33)15-1-4-19(47)22(50)7-15/h1-14,46-51,55-60H #(5,7-dihydroxyflavone)-(4′-oxy-6)-(4′,5,7-trihydroxyflavone) InChI=1S/C30H18O10/c31-16-5-1-14(2-6-16)24-12-21(35)28-26(40-24)13-22(36)30(29(28)37)38-18-7-3-15(4-8-18)23-11-20(34)27-19(33)9-17(32)10-25(27)39-23/h1-13,31-33,36-37H #hinokiflavone InChI=1S/C30H18O10/c31-16-5-1-14(2-6-16)24-12-21(35)28-26(40-24)13-22(36)30(29(28)37)38-18-7-3-15(4-8-18)23-11-20(34)27-19(33)9-17(32)10-25(27)39-23/h1-13,31-33,36-37H 6-[4-(5,7-dihydroxy-4-oxo-4H-1-benzopyran-2-yl)phenoxy]-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-1-benzopyran-4-one InChI=1S/C30H18O10/c31-16-5-1-14(2-6-16)24-12-21(35)28-26(40-24)13-22(36)30(29(28)37)38-18-7-3-15(4-8-18)23-11-20(34)27-19(33)9-17(32)10-25(27)39-23/h1-13,31-33,36-37H 6-[4-(5,7-dihydroxy-4-oxo-4H-chromen-2-yl)phenoxy]-5,7-dihydroxy-2-(4-hydroxyphenyl)-4H-chromen-4-one InChI=1S/C30H18O10/c31-16-5-1-14(2-6-16)24-12-21(35)28-26(40-24)13-22(36)30(29(28)37)38-18-7-3-15(4-8-18)23-11-20(34)27-19(33)9-17(32)10-25(27)39-23/h1-13,31-33,36-37H #[(3R)-2′-methoxyisoflavan-3′,7-diol]-(4′-oxy-4′)-[(3R)-2′-methoxyisoflavan-3′,7-diol] 
InChI=1S/C32H30O9/c1-37-31-23(19-11-17-3-5-21(33)13-27(17)39-15-19)7-9-25(29(31)35)41-26-10-8-24(32(38-2)30(26)36)20-12-18-4-6-22(34)14-28(18)40-16-20/h3-10,13-14,19-20,33-36H,11-12,15-16H2,1-2H3/t19-,20-/m0/s1 #biscyclolobin InChI=1S/C32H30O9/c1-37-31-23(19-11-17-3-5-21(33)13-27(17)39-15-19)7-9-25(29(31)35)41-26-10-8-24(32(38-2)30(26)36)20-12-18-4-6-22(34)14-28(18)40-16-20/h3-10,13-14,19-20,33-36H,11-12,15-16H2,1-2H3/t19-,20-/m0/s1 3,3′-[oxybis(3-hydroxy-2-methoxy-4,1-phenylene)]bis[(3R)-3,4-dihydro-2H-1-benzopyran-7-ol] InChI=1S/C32H30O9/c1-37-31-23(19-11-17-3-5-21(33)13-27(17)39-15-19)7-9-25(29(31)35)41-26-10-8-24(32(38-2)30(26)36)20-12-18-4-6-22(34)14-28(18)40-16-20/h3-10,13-14,19-20,33-36H,11-12,15-16H2,1-2H3/t19-,20-/m0/s1 3,3′-[oxybis(3-hydroxy-2-methoxy-4,1-phenylene)]bis[(3R)-chroman-7-ol] InChI=1S/C32H30O9/c1-37-31-23(19-11-17-3-5-21(33)13-27(17)39-15-19)7-9-25(29(31)35)41-26-10-8-24(32(38-2)30(26)36)20-12-18-4-6-22(34)14-28(18)40-16-20/h3-10,13-14,19-20,33-36H,11-12,15-16H2,1-2H3/t19-,20-/m0/s1 3,3′-[oxybis(3-hydroxy-2-methoxy-4,1-phenylene)]bis[(3R)-3,4-dihydro-2H-chromen-7-ol] InChI=1S/C32H30O9/c1-37-31-23(19-11-17-3-5-21(33)13-27(17)39-15-19)7-9-25(29(31)35)41-26-10-8-24(32(38-2)30(26)36)20-12-18-4-6-22(34)14-28(18)40-16-20/h3-10,13-14,19-20,33-36H,11-12,15-16H2,1-2H3/t19-,20-/m0/s1 #(3′,4′,5,7-tetrahydroxyflavone)-(2′→8,8→2′)-(3′,4′,5,7-tetrahydroxyflavone) InChI=1S/C30H16O12/c31-11-3-1-9-19-7-17(37)23-14(34)6-16(36)26(30(23)41-19)22-10(2-4-12(32)28(22)40)20-8-18(38)24-13(33)5-15(35)25(29(24)42-20)21(9)27(11)39/h1-8,31-36,39-40H #anhydrobartramiaflavone InChI=1S/C30H16O12/c31-11-3-1-9-19-7-17(37)23-14(34)6-16(36)26(30(23)41-19)22-10(2-4-12(32)28(22)40)20-8-18(38)24-13(33)5-15(35)25(29(24)42-20)21(9)27(11)39/h1-8,31-36,39-40H #15,17,23,24,35,37,45,46-octahydroxy-14H,34H-1(2,8),3(8,2)-bis([1]benzopyrana)-2,4(1,2)-dibenzenacyclotetraphane-14,34-dione 
InChI=1S/C30H16O12/c31-11-3-1-9-19-7-17(37)23-14(34)6-16(36)26(30(23)41-19)22-10(2-4-12(32)28(22)40)20-8-18(38)24-13(33)5-15(35)25(29(24)42-20)21(9)27(11)39/h1-8,31-36,39-40H #3,4,5,7,13,14,15,17-octahydroxy-8,10:18,20-di(ethanylylidene)tetrabenzo[b,d,h,j][1,7]dioxacyclododecine-22,24-dione InChI=1S/C30H16O12/c31-11-3-1-9-19-7-17(37)23-14(34)6-16(36)26(30(23)41-19)22-10(2-4-12(32)28(22)40)20-8-18(38)24-13(33)5-15(35)25(29(24)42-20)21(9)27(11)39/h1-8,31-36,39-40H opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/functionalClasses.txt000066400000000000000000000065561451751637500313770ustar00rootroot00000000000000p-benzoquinone monoimine InChI=1S/C6H5NO/c7-5-1-3-6(8)4-2-5/h1-4,7H N,N'-dimethyl-1,4-naphthoquinone diimine InChI=1S/C12H12N2/c1-13-11-7-8-12(14-2)10-6-4-3-5-9(10)11/h3-8H,1-2H3 acetic acid amide InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4) acetic acid N-methylamide InChI=1S/C3H7NO/c1-3(5)4-2/h1-2H3,(H,4,5) acetic acid N-methyl amide InChI=1S/C3H7NO/c1-3(5)4-2/h1-2H3,(H,4,5) 2-pyridylformaldehyde semicarbazone InChI=1/C7H8N4O/c8-7(12)11-10-5-6-3-1-2-4-9-6/h1-5H,(H3,8,11,12)/f/h11H,8H2 ibuprofen methyl ester InChI=1S/C14H20O2/c1-10(2)9-12-5-7-13(8-6-12)11(3)14(15)16-4/h5-8,10-11H,9H2,1-4H3 acetic acid ethyl ester InChI=1S/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3 acetate ethyl ester InChI=1S/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3 acetone ethyloxime InChI=1S/C5H11NO/c1-4-7-6-5(2)3/h4H2,1-3H3 acetone O-ethyloxime InChI=1S/C5H11NO/c1-4-7-6-5(2)3/h4H2,1-3H3 acetone O2-ethyloxime InChI=1S/C5H11NO/c1-4-7-6-5(2)3/h4H2,1-3H3 diphosphoric acid 1,3-di(ethylamide) InChI=1/C4H14N2O5P2/c1-3-5-12(7,8)11-13(9,10)6-4-2/h3-4H2,1-2H3,(H2,5,7,8)(H2,6,9,10) benzene-1,4-dicarboxylic acid chloride InChI=1S/C8H4Cl2O2/c9-7(11)5-1-2-6(4-3-5)8(10)12/h1-4H ethylene glycol methacrylate phosphate InChI=1S/C6H11O6P/c1-5(2)6(7)11-3-4-12-13(8,9)10/h1,3-4H2,2H3,(H2,8,9,10) Bisphenol A diglycidyl ether 
InChI=1/C21H24O4/c1-21(2,15-3-7-17(8-4-15)22-11-19-13-24-19)16-5-9-18(10-6-16)23-12-20-14-25-20/h3-10,19-20H,11-14H2,1-2H3 1,4-butanediol diglycidyl ether InChI=1S/C10H18O4/c1(3-11-5-9-7-13-9)2-4-12-6-10-8-14-10/h9-10H,1-8H2 ethylene glycol ethyl ether acetate InChI=1S/C6H12O3/c1-3-8-4-5-9-6(2)7/h3-5H2,1-2H3 diethylene glycol ethyl methyl ether InChI=1S/C7H16O3/c1-3-9-6-7-10-5-4-8-2/h3-7H2,1-2H3 glycerol monooleate InChI=1/C21H40O4/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-21(24)25-19-20(23)18-22/h9-10,20,22-23H,2-8,11-19H2,1H3/b10-9- glycerol triglycidyl ether InChI=1S/C12H20O6/c1(13-3-10-5-16-10)9(15-7-12-8-18-12)2-14-4-11-6-17-11/h9-12H,1-8H2 acetic acid ammonium salt InChI=1S/C2H4O2.H3N/c1-2(3)4;/h1H3,(H,3,4);1H3 acetic acid ammonia salt InChI=1S/C2H4O2.H3N/c1-2(3)4;/h1H3,(H,3,4);1H3 acetic acid sodium salt InChI=1S/C2H4O2.Na/c1-2(3)4;/h1H3,(H,3,4);/q;+1/p-1 acetic acid sodium(0) salt InChI=1S/C2H4O2.Na/c1-2(3)4;/h1H3,(H,3,4); pyridine acetic acid salt InChI=1S/C5H5N.C2H4O2/c1-2-4-6-5-3-1;1-2(3)4/h1-5H;1H3,(H,3,4) glycinamide trifluoroacetate salt InChI=1S/C2HF3O2.C2H6N2O/c3-2(4,5)1(6)7;3-1-2(4)5/h(H,6,7);1,3H2,(H2,4,5) benzoylarginine 4-nitroanilide InChI=1S/C19H22N6O4/c20-19(21)22-12-4-7-16(24-17(26)13-5-2-1-3-6-13)18(27)23-14-8-10-15(11-9-14)25(28)29/h1-3,5-6,8-11,16H,4,7,12H2,(H,23,27)(H,24,26)(H4,20,21,22)/t16-/m0/s1 acetic acid anilide InChI=1S/C8H9NO/c1-7(10)9-8-5-3-2-4-6-8/h2-6H,1H3,(H,9,10) N-Succinyl-L-phenylalanine-p-nitroanilide InChI=1S/C19H19N3O6/c23-17(10-11-18(24)25)21-16(12-13-4-2-1-3-5-13)19(26)20-14-6-8-15(9-7-14)22(27)28/h1-9,16H,10-12H2,(H,20,26)(H,21,23)(H,24,25)/t16-/m0/s1 ethyl chloride InChI=1S/C2H5Cl/c1-2-3/h2H2,1H3 Butyltin chloride dihydroxide InChI=1S/C4H9.ClH.2H2O.Sn/c1-3-4-2;;;;/h1,3-4H2,2H3;1H;2*1H2;/q;;;;+3/p-3 2-Hydroxy-5-methoxy-benzaldehyde chloroxime InChI=1S/C8H8ClNO3/c1-13-5-2-3-7(11)6(4-5)8(9)10-12/h2-4,11-12H,1H3 2-Hydroxy-5-methoxy-benzaldehyde chlorooxime 
InChI=1S/C8H8ClNO3/c1-13-5-2-3-7(11)6(4-5)8(9)10-12/h2-4,11-12H,1H3 4-NITROBENZALDEHYDE S-(4-NITROPHENYL)THIOOXIME InChI=1S/C13H9N3O4S/c17-15(18)11-3-1-10(2-4-11)9-14-21-13-7-5-12(6-8-13)16(19)20/h1-9H opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/functionalReplacement.txt000066400000000000000000000032311451751637500322240ustar00rootroot00000000000000peroxyacetic acid InChI=1S/C2H4O3/c1-2(3)5-4/h4H,1H3 peroxyacetate InChI=1S/C2H4O3/c1-2(3)5-4/h4H,1H3/p-1 acetperoxoic acid InChI=1S/C2H4O3/c1-2(3)5-4/h4H,1H3 acetperoxoate InChI=1S/C2H4O3/c1-2(3)5-4/h4H,1H3/p-1 phosphoronitridic acid InChI=1S/H2NO2P/c1-4(2)3/h2-3H nitridophosphoric acid InChI=1S/H2NO2P/c1-4(2)3/h2-3H ethandithioate InChI=1S/C2H4S2/c1-2(3)4/h1H3,(H,3,4)/p-1 dithioethanoate InChI=1S/C2H4S2/c1-2(3)4/h1H3,(H,3,4)/p-1 1,1-dithiooxalic acid InChI=1S/C2H2O2S2/c3-1(4)2(5)6/h(H,3,4)(H,5,6) tetrathiooxalic acid InChI=1S/C2H2S4/c3-1(4)2(5)6/h(H,3,4)(H,5,6) imidooxalic acid InChI=1S/C2H3NO3/c3-1(4)2(5)6/h(H2,3,4)(H,5,6) diimidooxalic acid InChI=1S/C2H4N2O2/c3-1(5)2(4)6/h(H2,3,5)(H2,4,6) hydrazonooxalic acid InChI=1S/C2H4N2O3/c3-4-1(5)2(6)7/h3H2,(H,4,5)(H,6,7) dihydrazonooxalic acid InChI=1S/C2H6N4O2/c3-5-1(7)2(8)6-4/h3-4H2,(H,5,7)(H,6,8) 1-hydrazono-2-imidooxalic acid InChI=1S/C2H5N3O2/c3-1(6)2(7)5-4/h4H2,(H2,3,6)(H,5,7) bromooxalic acid InChI=1S/C2HBrO3/c3-1(4)2(5)6/h(H,5,6) 2-chloro-2-thiooxalic acid InChI=1S/C2HClO2S/c3-1(6)2(4)5/h(H,4,5) cyanooxalic acid InChI=1S/C3HNO3/c4-1-2(5)3(6)7/h(H,6,7) 5-carbonoperoxoylpentanoic acid InChI=1S/C6H10O5/c7-5(8)3-1-2-4-6(9)11-10/h10H,1-4H2,(H,7,8) phosphoroperoxoyldibenzene InChI=1S/C12H11O3P/c13-15-16(14,11-7-3-1-4-8-11)12-9-5-2-6-10-12/h1-10,13H phosphoric methyl amide ethyl amide propyl amide InChI=1S/C6H18N3OP/c1-4-6-9-11(10,7-3)8-5-2/h4-6H2,1-3H3,(H3,7,8,9,10) diethyl-thionophosphoric acid InChI=1S/C4H11O3PS/c1-3-6-8(5,9)7-4-2/h3-4H2,1-2H3,(H,5,9) #Counter example benzothiazol-2-ylthio-succinic anhydride 
InChI=1S/C11H7NO3S2/c13-9-5-8(10(14)15-9)17-11-12-6-3-1-2-4-7(6)16-11/h1-4,8H,5H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/fusedRings.txt000066400000000000000000000053011451751637500300130ustar00rootroot00000000000000cyclopenta[1,2-b:5,1-b']bis[1,4]oxathiine InChI=1S/C9H6O2S2/c1-2-8-9(11-4-5-12-8)7(1)10-3-6-13-9/h1-6H 4a,8a-propanoquinoline InChI=1S/C12H13N/c1-2-8-12-9-3-6-11(12,5-1)7-4-10-13-12/h1-2,4-5,7-8,10H,3,6,9H2 1,5-methanoindole InChI=1S/C9H7N/c1-2-9-8-3-4-10(9)6-7(1)5-8/h1-5H,6H2 1,5-methano-1H-indole InChI=1S/C9H7N/c1-2-9-8-3-4-10(9)6-7(1)5-8/h1-5H,6H2 9H-9,10-ethanoacridine InChI=1S/C15H13N/c1-3-7-14-12(5-1)11-9-10-16(14)15-8-4-2-6-13(11)15/h1-8,11H,9-10H2 1,3-epoxynaphthalene InChI=1S/C10H6O/c1-2-4-9-7(3-1)5-8-6-10(9)11-8/h1-6H 1,12-ethenobenzo[4,5]cyclohepta[1,2,3-de]naphthalene InChI=1S/C20H12/c1-3-13-7-8-14-4-2-6-16-10-12-17-11-9-15(5-1)18(13)20(17)19(14)16/h1-12H 2,6:5,7-dimethanoindeno[7,1-bc]furan InChI=1S/C12H6O/c1-2-6-10-4-8-7-3-9(8)12(13-10)11(6)5(1)7/h1-2H,3-4H2 1,2,3,4-tetrahydro-1,4-ethenoanthracen-2-ol InChI=1S/C16H14O/c17-16-9-12-5-6-13(16)15-8-11-4-2-1-3-10(11)7-14(12)15/h1-8,12-13,16-17H,9H2 1,4:5,8-dimethanonaphthalene InChI=1S/C12H8/c1-2-8-5-7(1)11-9-3-4-10(6-9)12(8)11/h1-4H,5-6H2 6,14:7,14-dimethanobenzo[7,8]cycloundeca[1,2-b]pyridine InChI=1S/C20H15N/c1-2-5-16-10-20-11-17(8-14(16)4-1)18(12-20)9-15-6-3-7-21-19(15)13-20/h1-10,13H,11-12H2 6,13-ethano-6,13-methanodibenzo[b,g][1,6]diazecine InChI=1S/C19H16N2/c1-3-7-16-14(5-1)11-18-9-10-19(13-18,20-16)12-15-6-2-4-8-17(15)21-18/h1-8,11-12H,9-10,13H2 (2S)-2-[(5R,6R,7R,14S)-N-cyclopropylmethyl-4,5-epoxy-6,14-ethano-3-hydroxy-6-methoxymorphinan-7-yl]-3,3-dimethylpentan-2-ol InChI=1S/C30H43NO4/c1-6-26(2,3)27(4,33)21-16-28-11-12-30(21,34-5)25-29(28)13-14-31(17-18-7-8-18)22(28)15-19-9-10-20(32)24(35-25)23(19)29/h9-10,18,21-22,25,32-33H,6-8,11-17H2,1-5H3/t21-,22-,25-,27+,28-,29+,30-/m1/s1 
(10R)-7-amino-16-cyclopropyl-12-fluoro-2,10-dimethyl-15-oxo-10,15,16,17-tetrahydro-2H-8,4-(azeno)pyrazolo[4,3-h][2,5,11]benzoxadiazacyclotetradecine-3-carbonitrile InChI=1S/C22H20FN7O2/c1-11-15-7-12(23)3-6-14(15)22(31)30(13-4-5-13)10-17-19(18(8-24)29(2)28-17)16-9-26-20(25)21(27-16)32-11/h3,6-7,9,11,13H,4-5,10H2,1-2H3,(H2,25,26)/t11-/m1/s1 9,10-methanoanthracene InChI=1S/C15H10/c1-2-6-11-10(5-1)14-9-15(11)13-8-4-3-7-12(13)14/h1-8H,9H2 2H-3,5-(epoxymethano)furo[3,4-b]pyran InChI=1S/C8H6O3/c1-5-2-10-8-4-11-7(3-9-5)6(1)8/h1,4H,2-3H2 4a,8a-prop[1]enoquinoline InChI=1S/C12H11N/c1-2-8-12-9-3-6-11(12,5-1)7-4-10-13-12/h1-8,10H,9H2 2H-5,3-(epoxymethano)furo[2,3-c]pyran InChI=1S/C8H6O3/c1-6-5-2-9-7(6)4-11-8(1)10-3-5/h1,4H,2-3H2 11-chloro-9,10-(epoxymethano)anthracene InChI=1S/C15H9ClO/c16-15-13-9-5-1-3-7-11(9)14(17-15)12-8-4-2-6-10(12)13/h1-8,15H 1,4-(ethanediylidene)cyclohexane InChI=1S/C8H10/c1-2-8-5-3-7(1)4-6-8/h1-2H,3-6H2 1,4-azenocyclohexane InChI=1S/C6H9N/c1-2-6-4-3-5(1)7-6/h5H,1-4H2 2H-1,4-azenobenzene InChI=1S/C6H5N/c1-2-6-4-3-5(1)7-6/h1-3H,4H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/hwRings.txt000066400000000000000000000001321451751637500273200ustar00rootroot00000000000000oxazole InChI=1S/C3H3NO/c1-2-5-3-4-1/h1-3H trioxane InChI=1S/C3H6O3/c1-4-2-6-3-5-1/h1-3H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/implicitBracketting.txt000066400000000000000000000120421451751637500316720ustar00rootroot00000000000000methylamino-benzene InChI=1S/C7H9N/c1-8-7-5-3-2-4-6-7/h2-6,8H,1H3 methyl-amino-benzene InChI=1S/C7H9N/c1-6-4-2-3-5-7(6)8/h2-5H,8H2,1H3 dimethylaminobenzene InChI=1S/C8H11N/c1-9(2)8-6-4-3-5-7-8/h3-7H,1-2H3 2,5-bisaminothiobenzene InChI=1S/C6H8N2S2/c7-9-5-1-2-6(10-8)4-3-5/h1-4H,7-8H2 1,4-dimethoxycarbonyl-benzene InChI=1S/C10H10O4/c1-13-9(11)7-3-5-8(6-4-7)10(12)14-2/h3-6H,1-2H3 1,5-bis-(4-methylphenyl)sulfonylbenzene 
InChI=1/C20H18O4S2/c1-15-6-10-17(11-7-15)25(21,22)19-4-3-5-20(14-19)26(23,24)18-12-8-16(2)9-13-18/h3-14H,1-2H3 S-fluoromethyl methanethioate InChI=1S/C2H3FOS/c3-1-5-2-4/h2H,1H2 2-pentafluoroethylpropanamine InChI=1S/C5H8F5N/c1-3(2-11)4(6,7)5(8,9)10/h3H,2,11H2,1H3 p-dimethylaminopyridine InChI=1S/C7H10N2/c1-9(2)7-3-5-8-6-4-7/h3-6H,1-2H3 3-methanesulfonylmethyl-phenylamine InChI=1S/C8H11NO2S/c1-12(10,11)6-7-3-2-4-8(9)5-7/h2-5H,6,9H2,1H3 tert-butyldimethylsilyloxycyclohexane InChI=1S/C12H26OSi/c1-12(2,3)14(4,5)13-11-9-7-6-8-10-11/h11H,6-10H2,1-5H3 tert-butyldimethylsiloxycyclohexane InChI=1S/C12H26OSi/c1-12(2,3)14(4,5)13-11-9-7-6-8-10-11/h11H,6-10H2,1-5H3 tert-butyldimethylsilanoxycyclohexane InChI=1S/C12H26OSi/c1-12(2,3)14(4,5)13-11-9-7-6-8-10-11/h11H,6-10H2,1-5H3 tert-butyl(dimethyl)siloxycyclohexane InChI=1S/C12H26OSi/c1-12(2,3)14(4,5)13-11-9-7-6-8-10-11/h11H,6-10H2,1-5H3 tert-butyl-dimethylsiloxycyclohexane InChI=1S/C12H26OSi/c1-12(2,3)14(4,5)13-11-9-7-6-8-10-11/h11H,6-10H2,1-5H3 tert-butyldiphenylsiloxycyclohexane InChI=1S/C22H30OSi/c1-22(2,3)24(20-15-9-5-10-16-20,21-17-11-6-12-18-21)23-19-13-7-4-8-14-19/h5-6,9-12,15-19H,4,7-8,13-14H2,1-3H3 pyridine ditrifluoroacetate InChI=1S/C5H5N.2C2HF3O2/c1-2-4-6-5-3-1;2*3-2(4,5)1(6)7/h1-5H;2*(H,6,7) bis-tetrabutylammonium phosphate InChI=1S/2C16H36N.H3O4P/c2*1-5-9-13-17(14-10-6-2,15-11-7-3)16-12-8-4;1-5(2,3)4/h2*5-16H2,1-4H3;(H3,1,2,3,4)/q2*+1;/p-2 4-Tert-butoxy-carbonyl-piperazine InChI=1S/C9H18N2O2/c1-9(2,3)13-8(12)11-6-4-10-5-7-11/h10H,4-7H2,1-3H3 4,4'-propylmethylenedianiline InChI=1S/C16H20N2/c1-2-3-16(12-4-8-14(17)9-5-12)13-6-10-15(18)11-7-13/h4-11,16H,2-3,17-18H2,1H3 2-[3-(1-Methoxycarbonyl-2-phenyl-ethoxycarbonylmethyl)-phenylsulfanylmethyl]-benzoic acid InChI=1S/C26H24O6S/c1-31-26(30)23(15-18-8-3-2-4-9-18)32-24(27)16-19-10-7-12-21(14-19)33-17-20-11-5-6-13-22(20)25(28)29/h2-14,23H,15-17H2,1H3,(H,28,29) 2-[4-(1-Phenyl-but-3-enyloxycarbonylmethyl)-phenylsulfanylmethyl]-benzoic acid 
InChI=1/C26H24O4S/c1-2-8-24(20-9-4-3-5-10-20)30-25(27)17-19-13-15-22(16-14-19)31-18-21-11-6-7-12-23(21)26(28)29/h2-7,9-16,24H,1,8,17-18H2,(H,28,29) tert-Butyl (4-bromophenyl)sulfonyl(phenyl)carbamate InChI=1S/C17H18BrNO4S/c1-17(2,3)23-16(20)19(14-7-5-4-6-8-14)24(21,22)15-11-9-13(18)10-12-15/h4-12H,1-3H3 N-methylmethyleneamine InChI=1S/C2H5N/c1-3-2/h1H2,2H3 Dimethylaminopropylamine InChI=1S/C5H14N2/c1-7(2)5-3-4-6/h3-6H2,1-2H3 4-benzoylmethyl-4-piperidinol InChI=1S/C13H17NO2/c15-12(11-4-2-1-3-5-11)10-13(16)6-8-14-9-7-13/h1-5,14,16H,6-10H2 2-acetylmethyl-pyridine InChI=1S/C8H9NO/c1-7(10)6-8-4-2-3-5-9-8/h2-5H,6H2,1H3 3-formylmethyl-pyridine InChI=1S/C7H7NO/c9-5-3-7-2-1-4-8-6-7/h1-2,4-6H,3H2 1,1-dimethylethyl(R)-amino-(4-fluorophenyl)-acetate InChI=1S/C12H16FNO2/c1-12(2,3)16-11(15)10(14)8-4-6-9(13)7-5-8/h4-7,10H,14H2,1-3H3/t10-/m1/s1 N-tris(hydroxymethyl)methylglycine InChI=1S/C6H13NO5/c8-2-6(3-9,4-10)7-1-5(11)12/h7-10H,1-4H2,(H,11,12) #Shouldn't be bracketted Diisopropylazodicarboxylate InChI=1S/C8H14N2O4/c1-5(2)13-7(11)9-10-8(12)14-6(3)4/h5-6H,1-4H3 bis(2-hydroxyethyl)oleylamine InChI=1S/C22H45NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-23(19-21-24)20-22-25/h9-10,24-25H,2-8,11-22H2,1H3/b10-9- bis(2-hydroxyethyl) oleylamine InChI=1S/C22H45NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-23(19-21-24)20-22-25/h9-10,24-25H,2-8,11-22H2,1H3/b10-9- bis(2-hydroxyethyl)-oleylamine InChI=1S/C22H45NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-18-23(19-21-24)20-22-25/h9-10,24-25H,2-8,11-22H2,1H3/b10-9- tert-butyl(R)-3-vinylpyrrolidine-1-carboxylate InChI=1S/C11H19NO2/c1-5-9-6-7-12(8-9)10(13)14-11(2,3)4/h5,9H,1,6-8H2,2-4H3/t9-/m0/s1 4-(2-methoxy-acetyl)cis-3,5-dimethyl-piperazine InChI=1S/C9H18N2O2/c1-7-4-10-5-8(2)11(7)9(12)6-13-3/h7-8,10H,4-6H2,1-3H3/t7-,8+ methyl(R)-amino-(4-fluorophenyl)-acetate InChI=1S/C9H10FNO2/c1-13-9(12)8(11)6-2-4-7(10)5-3-6/h2-5,8H,11H2,1H3/t8-/m1/s1 (2-fluorobenzyl)methylamine InChI=1S/C8H10FN/c1-10-6-7-4-2-3-5-8(7)9/h2-5,10H,6H2,1H3 
(phenylacetyl)methylamine InChI=1S/C9H11NO/c1-10-9(11)7-8-5-3-2-4-6-8/h2-6H,7H2,1H3,(H,10,11) N,N-diethylmethylphosphonamidate InChI=1S/C5H14NO2P/c1-4-6(5-2)9(3,7)8/h4-5H2,1-3H3,(H,7,8)/p-1 bis(ethoxydimethylsilylpropyl) tetrasulfide InChI=1S/C14H34O2S4Si2/c1-7-15-21(3,4)13-9-11-17-19-20-18-12-10-14-22(5,6)16-8-2/h7-14H2,1-6H3 (3-dimethylethoxysilylpropyl)benzene InChI=1S/C13H22OSi/c1-4-14-15(2,3)12-8-11-13-9-6-5-7-10-13/h5-7,9-10H,4,8,11-12H2,1-3H3 N-[6-(Azetidin-1-yl)pyridin-3-yl]-5-(ethyl)dimethylsilyl-1-[(3-fluorophenyl)methyl]-1H-indole-2-carboxamide InChI=1S/C28H31FN4OSi/c1-4-35(2,3)24-10-11-25-21(16-24)17-26(33(25)19-20-7-5-8-22(29)15-20)28(34)31-23-9-12-27(30-18-23)32-13-6-14-32/h5,7-12,15-18H,4,6,13-14,19H2,1-3H3,(H,31,34) opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/inorganics.txt000066400000000000000000000023111451751637500300340ustar00rootroot00000000000000#mostly ionic sodium chloride InChI=1S/ClH.Na/h1H;/q;+1/p-1 sodium oxide InChI=1S/2Na.O/q2*+1;-2 sodium dioxide InChI=1S/2Na.O2/c;;1-2/q2*+1;-2 barium sulfide InChI=1S/Ba.S/q+2;-2 barium disulfide InChI=1S/Ba.S2/c;1-2/q+2;-2 iron(III) oxide InChI=1S/2Fe.3O/q2*+3;3*-2 ferric oxide InChI=1S/2Fe.3O/q2*+3;3*-2 FERRIC CHLORIDE InChI=1S/3ClH.Fe/h3*1H;/q;;;+3/p-3 silver oxide InChI=1S/2Ag.O/q2*+1;-2 silver monoxide InChI=1S/2Ag.O/q2*+1;-2 silver(I) oxide InChI=1S/2Ag.O/q2*+1;-2 silver(1+) oxide InChI=1S/2Ag.O/q2*+1;-2 argentous oxide InChI=1S/2Ag.O/q2*+1;-2 #mostly covalent boron trifluoride InChI=1S/BF3/c2-1(3)4 silicon dioxide InChI=1S/O2Si/c1-3-2 carbon tetrachloride InChI=1S/CCl4/c2-1(3,4)5 silver(I) acetylide InChI=1S/C2.2Ag/c1-2;;/q-2;2*+1 disilver acetylide InChI=1S/C2.2Ag/c1-2;;/q-2;2*+1 monosodium acetylide InChI=1S/C2H.Na/c1-2;/h1H;/q-1;+1 Sodium deuteroxide InChI=1S/Na.H2O/h;1H2/q+1;/p-1/i/hD dihydrogen InChI=1S/H2/h1H molecular hydrogen InChI=1S/H2/h1H dinitrogen InChI=1S/N2/c1-2 dioxygen InChI=1S/O2/c1-2 difluorine InChI=1S/F2/c1-2 dichlorine InChI=1S/Cl2/c1-2 dibromine 
InChI=1S/Br2/c1-2 diiodine InChI=1S/I2/c1-2 molecular iodine InChI=1S/I2/c1-2 oganesson InChI=1S/Og opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/ions.txt000066400000000000000000000012141451751637500266510ustar00rootroot00000000000000#P-73.2.3.1 acetylium InChI=1/C2H3O/c1-2-3/h1H3/q+1 cyclohexanecarbonylium InChI=1/C7H11O/c8-6-7-4-2-1-3-5-7/h7H,1-5H2/q+1 pentanthioylium InChI=1/C5H9S/c1-2-3-4-5-6/h2-4H2,1H3/q+1 pentanoylium InChI=1/C5H9O/c1-2-3-4-5-6/h2-4H2,1H3/q+1 pentanylium InChI=1/C5H11/c1-3-5-4-2/h1,3-5H2,2H3/q+1 ethenesulfinylium InChI=1/C2H3OS/c1-2-4-3/h2H,1H2/q+1 dimethylphosphinoylium InChI=1/C2H6OP/c1-4(2)3/h1-2H3/q+1 methylphosphonoylium InChI=1/CH3OP/c1-3-2/h1H3/q+2 glutarylium InChI=1/C5H6O2/c6-4-2-1-3-5-7/h1-3H2/q+2 pentanedioylium InChI=1/C5H6O2/c6-4-2-1-3-5-7/h1-3H2/q+2 pyridine-2,6-dicarbonylium InChI=1/C7H3NO2/c9-4-6-2-1-3-7(5-10)8-6/h1-3H/q+2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/isotopes.txt000066400000000000000000000216611451751637500275560ustar00rootroot00000000000000#From CAS recommendations Methane-d InChI=1S/CH4/h1H4/i1D Methane-d4 InChI=1S/CH4/h1H4/i1D4 2,2,2-trifluoro-Ethane-d InChI=1S/C2H3F3/c1-2(3,4)5/h1H3/i1D 2-chloro-6-(methyl-d3)-Benzene-d InChI=1S/C7H7Cl/c1-6-3-2-4-7(8)5-6/h2-5H,1H3/i1D3,5D methyl-Phosphine-d2 InChI=1S/CH5P/c1-2/h2H2,1H3/i2D2 Urea-N,N,N',N'-d4 InChI=1S/CH4N2O/c2-1(3)4/h(H4,2,3,4)/i/hD4 Hydroxyl-d-amine-d2 InChI=1S/H3NO/c1-2/h2H,1H2/i1D2,2D N-(methyl-d3)-Methan-d3-amine InChI=1S/C2H7N/c1-3-2/h3H,1-2H3/i1D3,2D3 Silanamine-d2 InChI=1S/H5NSi/c1-2/h1H2,2H3/i1D2 Hydroxyl-d-amine-d InChI=1S/H3NO/c1-2/h2H,1H2/i1D,2D Ethan-2-d-amine InChI=1S/C2H7N/c1-2-3/h2-3H2,1H3/i1D Methanol-d InChI=1S/CH4O/c1-2/h2H,1H3/i2D Methan-d-ol 1-methanesulfonate InChI=1S/C2H6O3S/c1-5-6(2,3)4/h1-2H3/i1D #Benzene-4-d-methane-alpha,alpha-d2-thiol InChI=1S/C7H8S/c8-6-7-4-2-1-3-5-7/h1-5,8H,6H2/i1D,6D2 #Alanine-N,N,1-d3 InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1/i/hD3 
1H-Imidazole-1-d-2-carboxylic acid-d InChI=1S/C4H4N2O2/c7-4(8)3-5-1-2-6-3/h1-2H,(H,5,6)(H,7,8)/i/hD2 #1,2-Ethane-1,1,2,2-d4-diol-1,2-d2 InChI=1S/C2H6O2/c3-1-2-4/h3-4H,1-2H2/i1D2,2D2,3D,4D Acetaldehyde-1,2-d2 InChI=1S/C2H4O/c1-2-3/h2H,1H3/i1D,2D Propanamide-N,3-d2 InChI=1S/C3H7NO/c1-2-3(4)5/h2H2,1H3,(H2,4,5)/i1D/hD 1-(ethyl-2,2,2-d3)-4-(methyl-d3)-Benzene InChI=1S/C9H12/c1-3-9-6-4-8(2)5-7-9/h4-7H,3H2,1-2H3/i1D3,2D3 N-(2-piperidinyl-1-d)-Carbamic acid InChI=1S/C6H12N2O2/c9-6(10)8-5-3-1-2-4-7-5/h5,7-8H,1-4H2,(H,9,10)/i/hD 2-Propanone-1,1,1,3,3,3-d6 InChI=1S/C3H6O/c1-3(2)4/h1-2H3/i1D3,2D3 #From IUPAC (14C)methane InChI=1S/CH4/h1H4/i1+2 trichloro(12C)methane InChI=1S/CHCl3/c2-1(3)4/h1H/i1+0 (12C)chloroform InChI=1S/CHCl3/c2-1(3)4/h1H/i1+0 (²H1)methane InChI=1S/CH4/h1H4/i1D dichloro(²H2)methane InChI=1S/CH2Cl2/c2-1-3/h1H2/i1D2 (2H3)methoxybenzene InChI=1S/C7H8O/c1-8-7-5-3-2-4-6-7/h2-6H,1H3/i1D3 (α,α,α-2H3)anisole InChI=1S/C7H8O/c1-8-7-5-3-2-4-6-7/h2-6H,1H3/i1D3 1-phenyl(1,2-13C2)ethanone InChI=1S/C8H8O/c1-7(9)8-5-3-2-4-6-8/h2-6H,1H3/i1+1,7+1 (1,2-13C2)acetophenone InChI=1S/C8H8O/c1-7(9)8-5-3-2-4-6-8/h2-6H,1H3/i1+1,7+1 (1,2-13C)acetophenone InChI=1S/C8H8O/c1-7(9)8-5-3-2-4-6-8/h2-6H,1H3/i1+1,7+1 1,2-di[(13C)methyl]benzene InChI=1S/C8H10/c1-7-5-3-4-6-8(7)2/h3-6H,1-2H3/i1+1,2+1 (α,α′-13C2)-1,2-xylene InChI=1S/C8H10/c1-7-5-3-4-6-8(7)2/h3-6H,1-2H3/i1+1,2+1 2-(13C)methyl-(1-13C)benzene InChI=1S/C7H8/c1-7-5-3-2-4-6-7/h2-6H,1H3/i1+1,5+1 (2-2H1)ethan-1-ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i1D (2-13C)ethan-1-ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i1+1 1-[amino(14C)methyl]cyclopentan-1-ol InChI=1S/C6H13NO/c7-5-6(8)3-1-2-4-6/h8H,1-5,7H2/i5+2 1-(aminomethyl)cyclopentan-1-(18O)ol InChI=1S/C6H13NO/c7-5-6(8)3-1-2-4-6/h8H,1-5,7H2/i8+2 N-[7-(131I)iodo-9H-fluoren-2-yl]acetamide InChI=1S/C15H12INO/c1-9(18)17-13-3-5-15-11(8-13)6-10-7-12(16)2-4-14(10)15/h2-5,7-8H,6H2,1H3,(H,17,18)/i16+4 sodium 4-ethoxy-4-oxo(2,3-14C2)butanoate 
InChI=1S/C6H10O4.Na/c1-2-10-6(9)4-3-5(7)8;/h2-4H2,1H3,(H,7,8);/q;+1/p-1/i3+2,4+2; sodium ethyl (2,3-14C2)butanedioate InChI=1S/C6H10O4.Na/c1-2-10-6(9)4-3-5(7)8;/h2-4H2,1H3,(H,7,8);/q;+1/p-1/i3+2,4+2; sodium ethyl (2,3-14C2)succinate InChI=1S/C6H10O4.Na/c1-2-10-6(9)4-3-5(7)8;/h2-4H2,1H3,(H,7,8);/q;+1/p-1/i3+2,4+2; 4-[(3-14C)thiolan-2-yl]pyridine InChI=1S/C9H11NS/c1-2-9(11-7-1)8-3-5-10-6-4-8/h3-6,9H,1-2,7H2/i2+2 4-[tetrahydro(3-14C)thiophen-2-yl]pyridine InChI=1S/C9H11NS/c1-2-9(11-7-1)8-3-5-10-6-4-8/h3-6,9H,1-2,7H2/i2+2 2-(35Cl)chloro-3-[(²H3)methyl](1-²H1)pentane InChI=1S/C6H13Cl/c1-4-5(2)6(3)7/h5-6H,4H2,1-3H3/i2D3,3D,7+0 2-(13C)methyl-3-methylpyridine InChI=1S/C7H9N/c1-6-4-3-5-8-7(6)2/h3-5H,1-2H3/i2+1 2-(2,2-2H2)ethyl-3-ethylhexan-1-ol InChI=1S/C10H22O/c1-4-7-9(5-2)10(6-3)8-11/h9-11H,4-8H2,1-3H3/i3D2 #cyclohexane-1,1-di[(14C)-carboxylic acid] InChI=1S/C8H12O4/c9-6(10)8(7(11)12)4-2-1-3-5-8/h1-5H2,(H,9,10)(H,11,12)/i6+2,7+2 #1-carboxycyclohexane-1-(13C,2H)carboxylic acid InChI=1S/C8H12O4/c9-6(10)8(7(11)12)4-2-1-3-5-8/h1-5H2,(H,9,10)(H,11,12)/i6+1/hD #1-(2H)carboxycyclohexane-1-(13C)carboxylic acid InChI=1S/C8H12O4/c9-6(10)8(7(11)12)4-2-1-3-5-8/h1-5H2,(H,9,10)(H,11,12)/i6+1/hD 1-(13C)carboxycyclohexane-1-(14C)carboxylic acid InChI=1S/C8H12O4/c9-6(10)8(7(11)12)4-2-1-3-5-8/h1-5H2,(H,9,10)(H,11,12)/i6+1,7+2 (1-15N)-1H-indole InChI=1S/C8H7N/c1-2-4-8-7(3-1)5-6-9-8/h1-6,9H/i9+1 2,3-dihydro(1-15N)-1H-indole InChI=1S/C8H9N/c1-2-4-8-7(3-1)5-6-9-8/h1-4,9H,5-6H2/i9+1 2,3-dihydro(2,3-2H2,1-15N)-1H-indole InChI=1S/C8H9N/c1-2-4-8-7(3-1)5-6-9-8/h1-4,9H,5-6H2/i5D,6D,9+1 #2,3-di[(2H)hydro]-(2,3-2H2,15N)-1H-indole #6-methyl-2,3-di[(2H2)dihydro](2,3-2H1)napthalen-1-ol (2-²H1)acetic acid InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)/i1D #acetic (²H)acid InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)/i1D (O-²H)acetic acid InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)/i/hD #(O-2H,18O)acetic acid #(18O-2H)acetic acid #(1-14C)pentan(³H)oic acid InChI=1S/C5H10O2/c1-2-3-4-5(6)7/h2-4H2,1H3,(H,6,7)/i5+2/hT sodium 
(14C)formate InChI=1S/CH2O2.Na/c2-1-3;/h1H,(H,2,3);/q;+1/p-1/i1+2; #cyclohexane(²H)carboxylic acid InChI=1S/C7H12O2/c8-7(9)6-4-2-1-3-5-6/h6H,1-5H2,(H,8,9)/i/hD 4-[(2-14C)ethyl]benzoic acid InChI=1S/C9H10O2/c1-2-7-3-5-8(6-4-7)9(10)11/h3-6H,2H2,1H3,(H,10,11)/i1+2 (1-14C)ethyl propanoate InChI=1S/C5H10O2/c1-3-5(6)7-4-2/h3-4H2,1-2H3/i4+2 ethyl (2-14C)propanoate InChI=1S/C5H10O2/c1-3-5(6)7-4-2/h3-4H2,1-2H3/i3+2 (N-²H)acetamide InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)/i/hD #acet(²H)amide InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)/i/hD (N,N-2H2)aniline InChI=1S/C6H7N/c7-6-4-2-1-3-5-6/h1-5H,7H2/i/hD2 (N,N-2H2)benzenamine InChI=1S/C6H7N/c7-6-4-2-1-3-5-6/h1-5H,7H2/i/hD2 #methan(²H,18O)ol InChI=1S/CH4O/c1-2/h2H,1H3/i2+2D (2-²H1,1-³H1)ethan-1-ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i1D,2T (1R)-(1-²H1)ethan-1-ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i2D/t2-/m1/s1 (1E)-(1-²H1)prop-1-ene InChI=1S/C3H6/c1-3-2/h3H,1H2,2H3/i1D/b3-1+ (24R)-5alpha-(24-2H1)cholestane InChI=1S/C27H48/c1-19(2)9-8-10-20(3)23-14-15-24-22-13-12-21-11-6-7-17-26(21,4)25(22)16-18-27(23,24)5/h19-25H,6-18H2,1-5H3/t20-,21-,22+,23-,24+,25+,26+,27-/m1/s1/i9D/t9-,20-,21-,22+,23-,24+,25+,26+,27- 5alpha-(17-²H)pregnane InChI=1S/C21H36/c1-4-15-9-11-18-17-10-8-16-7-5-6-13-20(16,2)19(17)12-14-21(15,18)3/h15-19H,4-14H2,1-3H3/t15-,16+,17-,18-,19-,20-,21+/m0/s1/i15D #L-(4-13C,35S)methionine 2-(18F) fluoro-2-deoxy-β-D-glucopyranose InChI=1S/C6H11FO5/c7-3-5(10)4(9)2(1-8)12-6(3)11/h2-6,8-11H,1H2/t2-,3-,4-,5-,6-/m1/s1/i7-1 (2S)-(2-²H)butan-2-ol InChI=1S/C4H10O/c1-3-4(2)5/h4-5H,3H2,1-2H3/t4-/m0/s1/i4D (2E)-1-chloro(2-²H)but-2-ene InChI=1S/C4H7Cl/c1-2-3-4-5/h2-3H,4H2,1H3/b3-2+/i3D (2R,3R)-3-chloro(2-²H1)butan-2-ol InChI=1S/C4H9ClO/c1-3(5)4(2)6/h3-4,6H,1-2H3/t3-,4-/m1/s1/i4D 1,1,1-trifluoro(2-²H1)ethane InChI=1S/C2H3F3/c1-2(3,4)5/h1H3/i1D 1-chloro-3-fluoro(2-²H)benzene InChI=1S/C6H4ClF/c7-5-2-1-3-6(8)4-5/h1-4H/i4D 2-methoxy(3,4,5,6-³H4)phenol InChI=1S/C7H8O2/c1-9-7-5-3-2-4-6(7)8/h2-5,8H,1H3/i2T,3T,4T,5T (2-14C)butane 
InChI=1S/C4H10/c1-3-4-2/h3-4H2,1-2H3/i3+2 (3-14C,2,2-²H2)butane InChI=1S/C4H10/c1-3-4-2/h3-4H2,1-2H3/i3D2,4+2 (2-14C,3-²H1)butane InChI=1S/C4H10/c1-3-4-2/h3-4H2,1-2H3/i3D,4+2 (3-³H)phenol InChI=1S/C6H6O/c7-6-4-2-1-3-5-6/h1-5,7H/i2T (2R)-(1-²H1)propan-2-ol InChI=1S/C3H8O/c1-3(2)4/h3-4H,1-2H3/i1D/t3-/m0/s1 (2R)-1-(131I)iodo-3-iodopropan-2-ol InChI=1S/C3H6I2O/c4-1-3(6)2-5/h3,6H,1-2H2/i4+4/t3-/m0/s1 (2S,4R)-(4-²H1,2-³H1)pentane InChI=1S/C5H12/c1-3-5-4-2/h3-5H2,1-2H3/i3D,4T/t3-,4+/m1/s1 (²H3)acetonitrile InChI=1S/C2H3N/c1-2-3/h1H3/i1D3 #ethan(²H)ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i3D (2-13C)ethan-1-ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i1+1 {[(²H1)methoxy(²H2)methyl]sulfanyl}methaneperoxol InChI=1S/C3H8O3S/c1-5-2-7-3-6-4/h4H,2-3H2,1H3/i1D,2D2 (2,3-2H2,15N)pyridine InChI=1S/C5H5N/c1-2-4-6-5-3-1/h1-5H/i2D,4D,6+1 (2H6)benzene InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H/i1D,2D,3D,4D,5D,6D 2-(79Br)bromo-(1-13C)benzene InChI=1S/C6H5Br/c7-6-4-2-1-3-5-6/h1-5H/i4+1,7-1 #(1-²H1)ethan-1-(2H)ol InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3/i2D,3D (1,1,1,3,3-²H5)pentan-2-one InChI=1S/C5H10O/c1-3-4-5(2)6/h3-4H2,1-2H3/i2D3,4D2 (2R)-2(O-2H)hydroxy-3-hydroxy(1-2H)propanal InChI=1S/C3H6O3/c4-1-3(6)2-5/h1,3,5-6H,2H2/t3-/m0/s1/i1D,6D #D-(2-O,1-²H2)glyceraldehyde InChI=1S/C3H6O3/c4-1-3(6)2-5/h1,3,5-6H,2H2/t3-/m0/s1/i1D,6D #D-(O-²H)glycer(2H)aldehyde InChI=1S/C3H6O3/c4-1-3(6)2-5/h1,3,5-6H,2H2/t3-/m0/s1/i1D,6D #DL-[methyl-(14C,2H3)]methionine #L-(carbamimidoyl-14C,N′-15N)arginine #L-(α-2H)-phenylalanine InChI=1S/C9H11NO2/c10-8(9(11)12)6-7-4-2-1-3-5-7/h1-5,8H,6,10H2,(H,11,12)/t8-/m0/s1/i8D 1-(naphthalen-2-yl)-2-phenyl(1-15N)diazene InChI=1S/C16H12N2/c1-2-8-15(9-3-1)17-18-16-11-10-13-6-4-5-7-14(13)12-16/h1-12H/i18+1 1-propylidene(1-15N)diazane InChI=1S/C3H8N2/c1-2-3-5-4/h3H,2,4H2,1H3/i5+1 3-[ethyl(2-34S)trisulfan-1-yl]propanoic acid InChI=1S/C5H10O2S3/c1-2-8-10-9-4-3-5(6)7/h2-4H2,1H3,(H,6,7)/i10+2 1-(1-chloronaphthalen-2-yl)-2-phenyl(1-15N)diazene 2-oxide 
InChI=1S/C16H11ClN2O/c17-16-14-9-5-4-6-12(14)10-11-15(16)18-19(20)13-7-2-1-3-8-13/h1-11H/i18+1 #Misc 3-methyl-4-(propyl-2,3-13C)octane InChI=1S/C12H26/c1-5-8-10-12(9-6-2)11(4)7-3/h11-12H,5-10H2,1-4H3/i2+1,6+1 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/lettercasing.txt000066400000000000000000000026411451751637500303720ustar00rootroot000000000000001H-IMIDAZOLE InChI=1S/C3H4N2/c1-2-5-3-4-1/h1-3H,(H,4,5) 1h-imidazole InChI=1S/C3H4N2/c1-2-5-3-4-1/h1-3H,(H,4,5) imidazo[4,5-d]pyridine InChI=1S/C6H5N3/c1-2-7-3-6-5(1)8-4-9-6/h1-4H,(H,8,9) IMIDAZO[4,5-D]PYRIDINE InChI=1S/C6H5N3/c1-2-7-3-6-5(1)8-4-9-6/h1-4H,(H,8,9) (9aS)-perhydropyrazino[2,1-c][1,4]oxazine InChI=1S/C7H14N2O/c1-2-9-3-4-10-6-7(9)5-8-1/h7-8H,1-6H2/t7-/m0/s1 (9AS)-PERHYDROPYRAZINO[2,1-C][1,4]OXAZINE InChI=1S/C7H14N2O/c1-2-9-3-4-10-6-7(9)5-8-1/h7-8H,1-6H2/t7-/m0/s1 9a-methyl-perhydropyrazino[2,1-c][1,4]oxazine InChI=1S/C8H16N2O/c1-8-6-9-2-3-10(8)4-5-11-7-8/h9H,2-7H2,1H3 9A-METHYL-PERHYDROPYRAZINO[2,1-C][1,4]OXAZINE InChI=1S/C8H16N2O/c1-8-6-9-2-3-10(8)4-5-11-7-8/h9H,2-7H2,1H3 PERHYDROPYRAZINO[2,1-C][1,4]OXAZINE-9A-13C InChI=1S/C7H14N2O/c1-2-9-3-4-10-6-7(9)5-8-1/h7-8H,1-6H2/i7+1 PERHYDRO-(9A-13C)-PYRAZINO[2,1-C][1,4]OXAZINE InChI=1S/C7H14N2O/c1-2-9-3-4-10-6-7(9)5-8-1/h7-8H,1-6H2/i7+1 4a-oxo-4alambda5-phosphadecalin InChI=1S/C9H17OP/c10-11-7-3-1-5-9(11)6-2-4-8-11/h9H,1-8H2 4A-OXO-4ALAMBDA5-PHOSPHADECALIN InChI=1S/C9H17OP/c10-11-7-3-1-5-9(11)6-2-4-8-11/h9H,1-8H2 9ah-pyrazino[1,2-a]pyrazine InChI=1S/C7H7N3/c1-3-10-4-2-9-6-7(10)5-8-1/h1-7H 9AH-PYRAZINO[1,2-A]PYRAZINE InChI=1S/C7H7N3/c1-3-10-4-2-9-6-7(10)5-8-1/h1-7H 2,3:4a,8a-diepoxydecalin InChI=1S/C10H14O2/c1-2-4-10-6-8-7(11-8)5-9(10,3-1)12-10/h7-8H,1-6H2 2,3:4A,8A-DIEPOXYDECALIN InChI=1S/C10H14O2/c1-2-4-10-6-8-7(11-8)5-9(10,3-1)12-10/h7-8H,1-6H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/miscellany.txt000066400000000000000000000230011451751637500300370ustar00rootroot00000000000000heptamethylbenzenesulfonamide 
InChI=1S/C13H21NO2S/c1-8-9(2)11(4)13(12(5)10(8)3)17(15,16)14(6)7/h1-7H3 2,4-dimethylpentyl(propan-2-yl)phosphane InChI=1S/C10H23P/c1-8(2)6-10(5)7-11-9(3)4/h8-11H,6-7H2,1-5H3 N-[4-(dimethylamino)-3-[[2,2-dimethylpropanoyl-[(1R)-1-phenylethyl]amino]methyl]phenyl]cyclopropanecarboxamide InChI=1S/C26H35N3O2/c1-18(19-10-8-7-9-11-19)29(25(31)26(2,3)4)17-21-16-22(14-15-23(21)28(5)6)27-24(30)20-12-13-20/h7-11,14-16,18,20H,12-13,17H2,1-6H3,(H,27,30)/t18-/m1/s1 N-[(2R,3S)-3-[3,3-dimethylbutanoyl-[(2-pyrrolidin-1-ylacetyl)amino]amino]-2-hydroxy-4-phenylbutyl]-N-(2-methylpropyl)-1,3-benzodioxole-5-sulfonamide InChI=1S/C33H48N4O7S/c1-24(2)20-36(45(41,42)26-13-14-29-30(18-26)44-23-43-29)21-28(38)27(17-25-11-7-6-8-12-25)37(32(40)19-33(3,4)5)34-31(39)22-35-15-9-10-16-35/h6-8,11-14,18,24,27-28,38H,9-10,15-17,19-23H2,1-5H3,(H,34,39)/t27-,28+/m0/s1 3,3'-methylenebis(2,4,6-trimethylbenzaldehyde) disemicarbazone InChI=1S/C23H30N6O2/c1-12-7-14(3)20(10-26-28-22(24)30)16(5)18(12)9-19-13(2)8-15(4)21(17(19)6)11-27-29-23(25)31/h7-8,10-11H,9H2,1-6H3,(H3,24,28,30)(H3,25,29,31) 2-[1-(3,4-dihydro-2(1H)-isoquinolinylacetyl)-3-oxo-2-piperazinyl]-N-phenylacetamide InChI=1S/C23H26N4O3/c28-21(25-19-8-2-1-3-9-19)14-20-23(30)24-11-13-27(20)22(29)16-26-12-10-17-6-4-5-7-18(17)15-26/h1-9,20H,10-16H2,(H,24,30)(H,25,28) alpha-ethylfuran-2-methanol InChI=1/C7H10O2/c1-2-6(8)7-4-3-5-9-7/h3-6,8H,2H2,1H3 (S)-N-{5-(4-Fluoro-phenyl)-4-[1-(4-fluoro-phenyl)-1H-indol-4-ylmethyl]-3-oxo-3,4-dihydro-pyrazin-2-yl}-2-methylamino-propionamide InChI=1S/C29H25F2N5O2/c1-18(32-2)28(37)34-27-29(38)36(26(16-33-27)19-6-8-21(30)9-7-19)17-20-4-3-5-25-24(20)14-15-35(25)23-12-10-22(31)11-13-23/h3-16,18,32H,17H2,1-2H3,(H,33,34,37)/t18-/m0/s1 (S)-N-{5-(4-Fluoro-phenyl)-4-[1-(4-fluoro-phenyl)-(1H-indol-4-yl)methyl]-3-oxo-3,4-dihydro-pyrazin-2-yl}-2-methylamino-propionamide 
InChI=1S/C29H25F2N5O2/c1-17(32-2)28(37)35-27-29(38)36(25(16-34-27)18-6-10-20(30)11-7-18)26(19-8-12-21(31)13-9-19)23-4-3-5-24-22(23)14-15-33-24/h3-17,26,32-33H,1-2H3,(H,34,35,37)/t17-,26?/m0/s1 1-(1-phenylcyclopentyl)methylamine InChI=1S/C12H17N/c13-10-12(8-4-5-9-12)11-6-2-1-3-7-11/h1-3,6-7H,4-5,8-10,13H2 S-(1-Oxido-2-pyridyl)-N,N,N',N'-tetramethylthiuronium tetrafluoroborate InChI=1S/C10H16N3OS.BF4/c1-11(2)10(12(3)4)15-9-7-5-6-8-13(9)14;2-1(3,4)5/h5-8H,1-4H3;/q+1;-1 azaphosphine InChI=1S/C4H4NP/c1-2-4-6-5-3-1/h1-4H sodium tetraethylborate InChI=1S/C8H20B.Na/c1-5-9(6-2,7-3)8-4;/h5-8H2,1-4H3;/q-1;+1 hexabromoantimonate InChI=1S/6BrH.Sb/h6*1H;/q;;;;;;+5/p-6 hexachloroarsenate InChI=1S/AsCl6/c2-1(3,4,5,6)7/q-1 hexachlorophosphate InChI=1S/Cl6P/c1-7(2,3,4,5)6/q-1 Hexafluorosilicic acid InChI=1S/F6Si/c1-7(2,3,4,5)6/q-2/p+2 Hexafluorophosphoric acid InChI=1S/F6P/c1-7(2,3,4,5)6/q-1/p+1 Hexafluorophosphoric acid triamide InChI=1S/F6N3OP/c1-7(2)11(10,8(3)4)9(5)6 phenyltrifluoroborate InChI=1S/C6H5BF3/c8-7(9,10)6-4-2-1-3-5-6/h1-5H/q-1 ethylnitrolic acid InChI=1S/C2H4N2O3/c1-2(3-5)4(6)7/h5H,1H3 #Formally ambiguous pentachlorobenzyl acetate InChI=1S/C9H5Cl5O2/c1-3(15)16-2-4-5(10)7(12)9(14)8(13)6(4)11/h2H2,1H3 #Formally ambiguous tetraphenylporphyrin InChI=1S/C44H30N4/c1-5-13-29(14-6-1)41-33-21-23-35(45-33)42(30-15-7-2-8-16-30)37-25-27-39(47-37)44(32-19-11-4-12-20-32)40-28-26-38(48-40)43(31-17-9-3-10-18-31)36-24-22-34(41)46-36/h1-28,45,48H #Formally ambiguous hexachlorocyclohexane InChI=1S/C6H6Cl6/c7-1-2(8)4(10)6(12)5(11)3(1)9/h1-6H pyridinium hemisulfate InChI=1S/2C5H5N.H2O4S/c2*1-2-4-6-5-3-1;1-5(2,3)4/h2*1-5H;(H2,1,2,3,4) potassium carbonate sesquihydrate InChI=1S/2CH2O3.4K.3H2O/c2*2-1(3)4;;;;;;;/h2*(H2,2,3,4);;;;;3*1H2/q;;4*+1;;;/p-4 S-methylmethionine InChI=1S/C6H13NO2S/c1-10(2)4-3-5(7)6(8)9/h5H,3-4,7H2,1-2H3/p+1/t5-/m0/s1 Se-benzyl-seleno-methionine InChI=1S/C12H17NO2Se/c1-16(8-7-11(13)12(14)15)9-10-5-3-2-4-6-10/h2-6,11H,7-9,13H2,1H3/p+1/t11-,16?/m0/s1 #different phospho 
interpretation phosphobenzene InChI=1S/C6H5O2P/c7-9(8)6-4-2-1-3-5-6/h1-5H 6-phospho-2-O-methyl-D-mannose InChI=1S/C7H15O9P/c1-15-5(2-8)7(11)6(10)4(9)3-16-17(12,13)14/h2,4-7,9-11H,3H2,1H3,(H2,12,13,14)/t4-,5-,6-,7-/m1/s1 bicyclo[5.4.0]-7-undecene InChI=1/C11H18/c1-2-6-10-8-4-5-9-11(10)7-3-1/h8,11H,1-7,9H2 spiro[4.5]-2-decene InChI=1S/C10H16/c1-2-6-10(7-3-1)8-4-5-9-10/h4-5H,1-3,6-9H2 glutamylglycine InChI=1S/C7H12N2O5/c8-4(1-2-5(10)11)7(14)9-3-6(12)13/h4H,1-3,8H2,(H,9,14)(H,10,11)(H,12,13)/t4-/m0/s1 beta-alaninenitrile InChI=1S/C3H6N2/c4-2-1-3-5/h1-2,4H2 2'-Deoxy-5-azacytidylyl-(3'→5')-2'-deoxyguanosine InChI=1S/C18H24N9O10P/c19-16-22-6-27(18(31)25-16)12-2-8(9(3-28)35-12)37-38(32,33)34-4-10-7(29)1-11(36-10)26-5-21-13-14(26)23-17(20)24-15(13)30/h5-12,28-29H,1-4H2,(H,32,33)(H2,19,25,31)(H3,20,23,24,30)/t7-,8-,9+,10+,11+,12+/m0/s1 4-benzofuran-2-yl-2-methyl-1,2,3,4-tetrahydroisoquinolin-4-ol InChI=1S/C18H17NO2/c1-19-11-14-7-2-4-8-15(14)18(20,12-19)17-10-13-6-3-5-9-16(13)21-17/h2-10,20H,11-12H2,1H3 7-methylguanosine InChI=1S/C11H15N5O5/c1-15-3-16(8-5(15)9(20)14-11(12)13-8)10-7(19)6(18)4(2-17)21-10/h3-4,6-7,10,17-19H,2H2,1H3,(H2-,12,13,14,20)/p+1/t4-,6-,7-,10-/m1/s1 #Vitamin B-6 related compounds pyridoxal 5'-phosphate InChI=1S/C8H10NO6P/c1-5-8(11)7(3-10)6(2-9-5)4-15-16(12,13)14/h2-3,11H,4H2,1H3,(H2,12,13,14) N-(5'-phosphopyridoxyl)-D-alanine InChI=1S/C11H17N2O7P/c1-6-10(14)9(4-13-7(2)11(15)16)8(3-12-6)5-20-21(17,18)19/h3,7,13-14H,4-5H2,1-2H3,(H,15,16)(H2,17,18,19)/t7-/m1/s1 N6-(P-pyridoxyl)-L-lysine InChI=1S/C14H24N3O7P/c1-9-13(18)11(10(6-17-9)8-24-25(21,22)23)7-16-5-3-2-4-12(15)14(19)20/h6,12,16,18H,2-5,7-8,15H2,1H3,(H,19,20)(H2,21,22,23)/t12-/m0/s1 N,N-Dimethylmethyleneiminium chloride InChI=1S/C3H8N.ClH/c1-4(2)3;/h1H2,2-3H3;1H/q+1;/p-1 #subtractive heteroatom replacement 7-Deaza-2′-deoxy-guanosine-5′-triphosphate 
InChI=1S/C11H17N4O13P3/c12-11-13-9-5(10(17)14-11)1-2-15(9)8-3-6(16)7(26-8)4-25-30(21,22)28-31(23,24)27-29(18,19)20/h1-2,6-8,16H,3-4H2,(H,21,22)(H,23,24)(H2,18,19,20)(H3,12,13,14,17)/t6-,7+,8+/m0/s1 deaza-pyridine InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H 1-deaza-pyridine InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H deaza-morpholine InChI=1S/C5H10O/c1-2-4-6-5-3-1/h1-5H2 4-deaza-morpholine InChI=1S/C5H10O/c1-2-4-6-5-3-1/h1-5H2 1-dethia-1-oxa-3-cephem InChI=1S/C6H7NO2/c8-5-4-6-7(5)2-1-3-9-6/h1-2,6H,3-4H2/t6-/m1/s1 1-Myristoyl-sn-glycero-3-phosphocholine InChI=1S/C22H46NO7P/c1-5-6-7-8-9-10-11-12-13-14-15-16-22(25)28-19-21(24)20-30-31(26,27)29-18-17-23(2,3)4/h21,24H,5-20H2,1-4H3/t21-/m1/s1 1,2-dioleoyl-sn-glycerol-3-phosphate InChI=1S/C39H73O8P/c1-3-5-7-9-11-13-15-17-19-21-23-25-27-29-31-33-38(40)45-35-37(36-46-48(42,43)44)47-39(41)34-32-30-28-26-24-22-20-18-16-14-12-10-8-6-4-2/h17-20,37H,3-16,21-36H2,1-2H3,(H2,42,43,44)/b19-17-,20-18-/t37-/m1/s1 1,2-Dipalmitoyl-glycero-3-phosphocholine InChI=1S/C40H80NO8P/c1-6-8-10-12-14-16-18-20-22-24-26-28-30-32-39(42)46-36-38(37-48-50(44,45)47-35-34-41(3,4)5)49-40(43)33-31-29-27-25-23-21-19-17-15-13-11-9-7-2/h38H,6-37H2,1-5H3 1,2-Dipalmitoyl-rac-glycero-3-phosphocholine InChI=1S/C40H80NO8P/c1-6-8-10-12-14-16-18-20-22-24-26-28-30-32-39(42)46-36-38(37-48-50(44,45)47-35-34-41(3,4)5)49-40(43)33-31-29-27-25-23-21-19-17-15-13-11-9-7-2/h38H,6-37H2,1-5H3 methylselenopyruvate InChI=1S/C4H6O3Se/c1-8-2-3(5)4(6)7/h2H2,1H3,(H,6,7)/p-1 beta-methylselenopyruvate InChI=1S/C4H6O3Se/c1-8-2-3(5)4(6)7/h2H2,1H3,(H,6,7)/p-1 2,5-dimethoxy-4-n-amylbenzaldehyde InChI=1S/C14H20O3/c1-4-5-6-7-11-8-14(17-3)12(10-15)9-13(11)16-2/h8-10H,4-7H2,1-3H3 sec-Amyl acetate InChI=1S/C7H14O2/c1-4-5-6(2)9-7(3)8/h6H,4-5H2,1-3H3 [8]ANNULENE InChI=1S/C8H8/c1-2-4-6-8-7-5-3-1/h1-8H [12]annulyne InChI=1S/C12H10/c1-2-4-6-8-10-12-11-9-7-5-3-1/h1-10H 3-aminophthalhydrazide InChI=1S/C8H7N3O2/c9-5-3-1-2-4-6(5)8(13)11-10-7(4)12/h1-3H,9H2,(H,10,12)(H,11,13) 1-decyl-3-methyl-1H-imidazolium 
InChI=1S/C14H27N2/c1-3-4-5-6-7-8-9-10-11-16-13-12-15(2)14-16/h12-14H,3-11H2,1-2H3/q+1 undecahectane InChI=1S/C111H224/c1-3-5-7-9-11-13-15-17-19-21-23-25-27-29-31-33-35-37-39-41-43-45-47-49-51-53-55-57-59-61-63-65-67-69-71-73-75-77-79-81-83-85-87-89-91-93-95-97-99-101-103-105-107-109-111-110-108-106-104-102-100-98-96-94-92-90-88-86-84-82-80-78-76-74-72-70-68-66-64-62-60-58-56-54-52-50-48-46-44-42-40-38-36-34-32-30-28-26-24-22-20-18-16-14-12-10-8-6-4-2/h3-111H2,1-2H3 undecadictane InChI=1S/C211H424/c1-3-5-7-9-11-13-15-17-19-21-23-25-27-29-31-33-35-37-39-41-43-45-47-49-51-53-55-57-59-61-63-65-67-69-71-73-75-77-79-81-83-85-87-89-91-93-95-97-99-101-103-105-107-109-111-113-115-117-119-121-123-125-127-129-131-133-135-137-139-141-143-145-147-149-151-153-155-157-159-161-163-165-167-169-171-173-175-177-179-181-183-185-187-189-191-193-195-197-199-201-203-205-207-209-211-210-208-206-204-202-200-198-196-194-192-190-188-186-184-182-180-178-176-174-172-170-168-166-164-162-160-158-156-154-152-150-148-146-144-142-140-138-136-134-132-130-128-126-124-122-120-118-116-114-112-110-108-106-104-102-100-98-96-94-92-90-88-86-84-82-80-78-76-74-72-70-68-66-64-62-60-58-56-54-52-50-48-46-44-42-40-38-36-34-32-30-28-26-24-22-20-18-16-14-12-10-8-6-4-2/h3-211H2,1-2H3 1-dimethylaminomethyl-d-dihydrolysergic acid methyl ester InChI=1S/C20H27N3O2/c1-21(2)12-23-11-13-9-18-16(15-6-5-7-17(23)19(13)15)8-14(10-22(18)3)20(24)25-4/h5-7,11,14,16,18H,8-10,12H2,1-4H3/t14-,16?,18-/m1/s1 1-dimethylaminomethyl-9,10-dihydro-d-lysergic acid methyl ester InChI=1S/C20H27N3O2/c1-21(2)12-23-11-13-9-18-16(15-6-5-7-17(23)19(13)15)8-14(10-22(18)3)20(24)25-4/h5-7,11,14,16,18H,8-10,12H2,1-4H3/t14-,16?,18-/m1/s1 lithium triethylborodeuteride InChI=1S/C6H16B.Li/c1-4-7(5-2)6-3;/h7H,4-6H2,1-3H3;/q-1;+1/i7D; 2-Phosphoglyceric acid InChI=1S/C3H7O7P/c4-1-2(3(5)6)10-11(7,8)9/h2,4H,1H2,(H,5,6)(H2,7,8,9) 3-phosphoglyceric acid InChI=1S/C3H7O7P/c4-2(3(5)6)1-10-11(7,8)9/h2,4H,1H2,(H,5,6)(H2,7,8,9) 2-methylglyceric acid 
InChI=1S/C4H8O4/c1-4(8,2-5)3(6)7/h5,8H,2H2,1H3,(H,6,7)opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/multiplicativeNomenclature.txt000066400000000000000000000105601451751637500333150ustar00rootroot00000000000000#Some annoying cases that previously failed 4,4'-{(9-oxo-9H-fluorene-2,7-diyl)bis[sulfonyl(methylimino)]}dibutanoic acid InChI=1S/C23H26N2O9S2/c1-24(11-3-5-21(26)27)35(31,32)15-7-9-17-18-10-8-16(14-20(18)23(30)19(17)13-15)36(33,34)25(2)12-4-6-22(28)29/h7-10,13-14H,3-6,11-12H2,1-2H3,(H,26,27)(H,28,29) 4,4'-{9H-fluorene-2,7-diylbis[sulfonyl(methylimino)]}dibutanoic acid InChI=1S/C23H28N2O8S2/c1-24(11-3-5-22(26)27)34(30,31)18-7-9-20-16(14-18)13-17-15-19(8-10-21(17)20)35(32,33)25(2)12-4-6-23(28)29/h7-10,14-15H,3-6,11-13H2,1-2H3,(H,26,27)(H,28,29) diethyl {1,3-phenylenebis[imino(thioxomethylene)]}biscarbamate InChI=1S/C14H18N4O4S2/c1-3-21-13(19)17-11(23)15-9-6-5-7-10(8-9)16-12(24)18-14(20)22-4-2/h5-8H,3-4H2,1-2H3,(H2,15,17,19,23)(H2,16,18,20,24) N,N'-{(4-chloro-1,3-phenylene)bis[imino(thioxomethylene)]}dipropanamide InChI=1S/C14H17ClN4O2S2/c1-3-11(20)18-13(22)16-8-5-6-9(15)10(7-8)17-14(23)19-12(21)4-2/h5-7H,3-4H2,1-2H3,(H2,16,18,20,22)(H2,17,19,21,23) N,N'-{[4-(4-morpholinyl)-1,3-phenylene]bis[imino(thioxomethylene)]}bis(2-methoxybenzamide) InChI=1S/C28H29N5O5S2/c1-36-23-9-5-3-7-19(23)25(34)31-27(39)29-18-11-12-22(33-13-15-38-16-14-33)21(17-18)30-28(40)32-26(35)20-8-4-6-10-24(20)37-2/h3-12,17H,13-16H2,1-2H3,(H2,29,31,34,39)(H2,30,32,35,40) N,N'-{1,5-naphthalenediylbis[imino(thioxomethylene)]}dipropanamide InChI=1S/C18H20N4O2S2/c1-3-15(23)21-17(25)19-13-9-5-8-12-11(13)7-6-10-14(12)20-18(26)22-16(24)4-2/h5-10H,3-4H2,1-2H3,(H2,19,21,23,25)(H2,20,22,24,26) N,N'-{1,8-naphthalenediylbis[imino(thioxomethylene)]}bis(2-fluorobenzamide) InChI=1S/C26H18F2N4O2S2/c27-18-11-3-1-9-16(18)23(33)31-25(35)29-20-13-5-7-15-8-6-14-21(22(15)20)30-26(36)32-24(34)17-10-2-4-12-19(17)28/h1-14H,(H2,29,31,33,35)(H2,30,32,34,36) 
N,N'-{1,8-naphthalenediylbis[imino(thioxomethylene)]}di(2-furamide) InChI=1S/C22H16N4O4S2/c27-19(16-9-3-11-29-16)25-21(31)23-14-7-1-5-13-6-2-8-15(18(13)14)24-22(32)26-20(28)17-10-4-12-30-17/h1-12H,(H2,23,25,27,31)(H2,24,26,28,32) N,N'-{1,8-naphthalenediylbis[imino(thioxomethylene)]}di(2-thiophenecarboxamide) InChI=1S/C22H16N4O2S4/c27-19(16-9-3-11-31-16)25-21(29)23-14-7-1-5-13-6-2-8-15(18(13)14)24-22(30)26-20(28)17-10-4-12-32-17/h1-12H,(H2,23,25,27,29)(H2,24,26,28,30) N,N'-{oxybis[1,3-benzothiazole-6,2-diyliminosulfonyl-4,1-phenyleneimino(thioxomethylene)]}bis[3-(2-furyl)acrylamide] InChI=1S/C42H30N8O9S6/c51-37(19-11-27-3-1-21-57-27)47-39(60)43-25-5-13-31(14-6-25)64(53,54)49-41-45-33-17-9-29(23-35(33)62-41)59-30-10-18-34-36(24-30)63-42(46-34)50-65(55,56)32-15-7-26(8-16-32)44-40(61)48-38(52)20-12-28-4-2-22-58-28/h1-24H,(H,45,49)(H,46,50)(H2,43,47,51,60)(H2,44,48,52,61) N,N'-{oxybis[4,1-phenyleneimino(thioxomethylene)]}bis(3-methylbutanamide) InChI=1S/C24H30N4O3S2/c1-15(2)13-21(29)27-23(32)25-17-5-9-19(10-6-17)31-20-11-7-18(8-12-20)26-24(33)28-22(30)14-16(3)4/h5-12,15-16H,13-14H2,1-4H3,(H2,25,27,29,32)(H2,26,28,30,33) N,N'-{oxybis[4,1-phenyleneimino(thioxomethylene)]}dipropanamide InChI=1S/C20H22N4O3S2/c1-3-17(25)23-19(28)21-13-5-9-15(10-6-13)27-16-11-7-14(8-12-16)22-20(29)24-18(26)4-2/h5-12H,3-4H2,1-2H3,(H2,21,23,25,28)(H2,22,24,26,29) 8,8'-{carbonylbis[imino-3,1-phenylenecarbonylimino(4-methyl-3,1-phenylene)carbonylimino]}dinaphthalene-1,3,5-trisulfonic acid InChI=1/C51H40N6O23S6/c1-25-9-11-29(49(60)54-37-13-15-41(83(69,70)71)35-21-33(81(63,64)65)23-43(45(35)37)85(75,76)77)19-39(25)56-47(58)27-5-3-7-31(17-27)52-51(62)53-32-8-4-6-28(18-32)48(59)57-40-20-30(12-10-26(40)2)50(61)55-38-14-16-42(84(72,73)74)36-22-34(82(66,67)68)24-44(46(36)38)86(78,79)80/h3-24H,1-2H3,(H,54,60)(H,55,61)(H,56,58)(H,57,59)(H2,52,53,62)(H,63,64,65)(H,66,67,68)(H,69,70,71)(H,72,73,74)(H,75,76,77)(H,78,79,80)/f/h52-57,63,66,69,72,75,78H 
2,2'-[ethane-1,2-diylbis(azanylylidenemethanylylidene)]diphenol InChI=1/C16H16N2O2/c19-15-7-3-1-5-13(15)11-17-9-10-18-12-14-6-2-4-8-16(14)20/h1-8,11-12,19-20H,9-10H2 2,2'-((ethane-1,2-diylbis(azanylylidene))bis(methanylylidene))diphenol InChI=1/C16H16N2O2/c19-15-7-3-1-5-13(15)11-17-9-10-18-12-14-6-2-4-8-16(14)20/h1-8,11-12,19-20H,9-10H2 2,2'-[ethane-1,2-diylidenebis(azanylylidenemethanylylidene)]bis(cyclohexan-1-ol) InChI=1/C16H24N2O2/c19-15-7-3-1-5-13(15)11-17-9-10-18-12-14-6-2-4-8-16(14)20/h9-12,15-16,19-20H,1-8H2 2,2'-((ethane-1,2-diylidenebis(azanylylidene))bis(methanylylidene))dicyclohexanol InChI=1/C16H24N2O2/c19-15-7-3-1-5-13(15)11-17-9-10-18-12-14-6-2-4-8-16(14)20/h9-12,15-16,19-20H,1-8H2 tetramethylethylenediamine InChI=1S/C6H16N2/c1-7(2)5-6-8(3)4/h5-6H2,1-4H3 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/omittedSpaces.txt000066400000000000000000000047121451751637500305130ustar00rootroot00000000000000methylformate InChI=1/C2H4O2/c1-4-2-3/h2H,1H3 ethylacetate InChI=1/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3 ethyl-ethanoate InChI=1/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3 ethyl2-ethylacetate InChI=1/C6H12O2/c1-3-5-6(7)8-4-2/h3-5H2,1-2H3 monoethylterephthalate InChI=1/C10H10O4/c1-2-14-10(13)8-5-3-7(4-6-8)9(11)12/h3-6H,2H2,1H3,(H,11,12)/p-1/fC10H9O4/q-1 diethylterephthalate InChI=1/C12H14O4/c1-3-15-11(13)9-5-7-10(8-6-9)12(14)16-4-2/h5-8H,3-4H2,1-2H3 ethyloxalate InChI=1/C4H6O4/c1-2-8-4(7)3(5)6/h2H2,1H3,(H,5,6)/p-1/fC4H5O4/q-1 diethyloxalate InChI=1/C6H10O4/c1-3-9-5(7)6(8)10-4-2/h3-4H2,1-2H3 diethylsuccinate InChI=1/C8H14O4/c1-3-11-7(9)5-6-8(10)12-4-2/h3-6H2,1-2H3 ethylphenylacetate InChI=1S/C10H12O2/c1-2-12-10(11)8-9-6-4-3-5-7-9/h3-7H,2,8H2,1H3 #Note that this is a fudge to give the expected real-world interpretation tert-butyl(phenyl)carbamate InChI=1S/C11H15NO2/c1-11(2,3)14-10(13)12-9-7-5-4-6-8-9/h4-8H,1-3H3,(H,12,13) #not omitted space 2-methylacetate InChI=1/C3H6O2/c1-2-3(4)5/h2H2,1H3,(H,4,5)/p-1/fC3H5O2/q-1 ethylterephthalate 
InChI=1/C10H10O4/c1-2-6-5-7(9(11)12)3-4-8(6)10(13)14/h3-5H,2H2,1H3,(H,11,12)(H,13,14)/p-2/fC10H8O4/q-2 triethylterephthalate InChI=1/C14H18O4/c1-4-8-7-11(13(15)16)9(5-2)10(6-3)12(8)14(17)18/h7H,4-6H2,1-3H3,(H,15,16)(H,17,18)/p-2/fC14H16O4/q-2 tetraethylterephthalate InChI=1/C16H22O4/c1-5-9-10(6-2)14(16(19)20)12(8-4)11(7-3)13(9)15(17)18/h5-8H2,1-4H3,(H,17,18)(H,19,20)/p-2/fC16H20O4/q-2 ethylmalonate InChI=1/C5H8O4/c1-2-3(4(6)7)5(8)9/h3H,2H2,1H3,(H,6,7)(H,8,9)/p-2/fC5H6O4/q-2 diethylmalonate InChI=1/C7H12O4/c1-3-7(4-2,5(8)9)6(10)11/h3-4H2,1-2H3,(H,8,9)(H,10,11)/p-2/fC7H10O4/q-2 ethylsuccinate InChI=1/C6H10O4/c1-2-4(6(9)10)3-5(7)8/h4H,2-3H2,1H3,(H,7,8)(H,9,10)/p-2/fC6H8O4/q-2 acetylacetate InChI=1/C4H6O3/c1-3(5)2-4(6)7/h2H2,1H3,(H,6,7)/p-1/fC4H5O3/q-1 diethylcarbamate InChI=1S/C5H11NO2/c1-3-6(4-2)5(7)8/h3-4H2,1-2H3,(H,7,8)/p-1 sodium tert-butyl(phenyl)carbamate InChI=1S/C11H15NO2.Na/c1-11(2,3)12(10(13)14)9-7-5-4-6-8-9;/h4-8H,1-3H3,(H,13,14);/q;+1/p-1 dimethyl(ethylenedioxy)dicarbamate InChI=1S/C6H12N2O6/c1-11-5(9)7-13-3-4-14-8-6(10)12-2/h3-4H2,1-2H3,(H,7,9)(H,8,10) bis(aminomethyl)4,4'-(ethylenedioxy)bis(3-methylbenzoate) InChI=1S/C20H24N2O6/c1-13-9-15(19(23)27-11-21)3-5-17(13)25-7-8-26-18-6-4-16(10-14(18)2)20(24)28-12-22/h3-6,9-10H,7-8,11-12,21-22H2,1-2H3 chloromethyl ether InChI=1/C2H4Cl2O/c3-1-5-2-4/h1-2H2 3-aminophenyl-4-aminobenzenesulfonate InChI=1S/C12H12N2O3S/c13-9-4-6-12(7-5-9)18(15,16)17-11-3-1-2-10(14)8-11/h1-8H,13-14H2 opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/organometallics.txt000066400000000000000000000010111451751637500310570ustar00rootroot00000000000000methylmercury(1+) InChI=1S/CH3.Hg/h1H3;/q;+1 dimethylzinc InChI=1S/2CH3.Zn/h2*1H3; dihydrido(naphthalen-2-yl)rhenium InChI=1S/C10H7.Re.2H/c1-2-6-10-8-4-3-7-9(10)5-1;;;/h1-3,5-8H;;; (prop-1-yn-1yl)copper InChI=1S/C3H3.Cu/c1-3-2;/h1H3; bis(4-carboxyphenyl)mercury InChI=1S/2C7H5O2.Hg/c2*8-7(9)6-4-2-1-3-5-6;/h2*2-5H,(H,8,9); trimethyltin(IV) InChI=1S/3CH3.Sn/h3*1H3;/q;;;+1 Lithium 
phenylacetylide InChI=1S/C8H5.Li/c1-2-8-6-4-3-5-7-8;/h3-7H;/q-1;+1 Diethylaluminum cyanide InChI=1S/2C2H5.CN.Al/c3*1-2;/h2*1H2,2H3;; opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/radicals.txt000066400000000000000000000010051451751637500274610ustar00rootroot00000000000000methyl InChI=1S/CH3/h1H3 methylidene InChI=1S/CH2/h1H2 methylidyne InChI=1/CH/h1H prop-1-yl InChI=1S/C3H7/c1-3-2/h1,3H2,2H3 propyl InChI=1S/C3H7/c1-3-2/h1,3H2,2H3 prop-2-yl InChI=1S/C3H7/c1-3-2/h3H,1-2H3 phenyl InChI=1S/C6H5/c1-2-4-6-5-3-1/h1-5H terephthaloyl InChI=1S/C8H4O2/c9-5-7-1-2-8(6-10)4-3-7/h1-4H nitrilo InChI=1S/N hexanedioylbis(azanylidene) InChI=1S/C6H8N2O2/c7-5(9)3-1-2-4-6(8)10/h1-4H2 acetylazanidyl InChI=1S/C2H3NO/c1-2(3)4/h1H3/q-1 aniliniumyl InChI=1S/C6H7N/c7-6-4-2-1-3-5-6/h1-5H,7H2/q+1opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/spiro.txt000066400000000000000000000123721451751637500270440ustar00rootroot00000000000000spiro[3.4]octane InChI=1S/C8H14/c1-2-5-8(4-1)6-3-7-8/h1-7H2 spiro[3.3]heptane InChI=1S/C7H12/c1-3-7(4-1)5-2-6-7/h1-6H2 spiro[4.5]decane InChI=1S/C10H18/c1-2-6-10(7-3-1)8-4-5-9-10/h1-9H2 spiro[4.5]deca-1,6-diene InChI=1S/C10H14/c1-2-6-10(7-3-1)8-4-5-9-10/h2,4,6,8H,1,3,5,7,9H2 trispiro[2.0.2^(4).1.2^(8).1^(3)]undecane InChI=1S/C11H16/c1-2-9(1)7-10(3-4-10)11(8-9)5-6-11/h1-8H2 trispiro[2.0.24.1.28.13]undecane InChI=1S/C11H16/c1-2-9(1)7-10(3-4-10)11(8-9)5-6-11/h1-8H2 spiro[cyclopentane-1,1'-indene] InChI=1S/C13H14/c1-2-6-12-11(5-1)7-10-13(12)8-3-4-9-13/h1-2,5-7,10H,3-4,8-9H2 1,1'-spirobiindene InChI=1S/C17H12/c1-3-7-15-13(5-1)9-11-17(15)12-10-14-6-2-4-8-16(14)17/h1-12H dispiro[5.1.7.2]heptadecane InChI=1S/C17H30/c1-2-5-9-16(10-6-3-1)13-14-17(15-16)11-7-4-8-12-17/h1-15H2 dispiro[fluorene-9,1'-cyclohexane-4',1''-indene] InChI=1/C26H22/c1-4-10-22-19(7-1)13-14-25(22)15-17-26(18-16-25)23-11-5-2-8-20(23)21-9-3-6-12-24(21)26/h1-14H,15-18H2 cyclopentanespirocyclobutane InChI=1S/C8H14/c1-2-5-8(4-1)6-3-7-8/h1-7H2 cyclohexanespirocyclopentane 
InChI=1S/C10H18/c1-2-6-10(7-3-1)8-4-5-9-10/h1-9H2 2H-indene-2-spiro-1'-cyclopentane InChI=1S/C13H14/c1-2-6-12-10-13(7-3-4-8-13)9-11(12)5-1/h1-2,5-6,9-10H,3-4,7-8H2 2-cyclohexenespiro-(2'-cyclopentene) InChI=1S/C10H14/c1-2-6-10(7-3-1)8-4-5-9-10/h2,4,6,8H,1,3,5,7,9H2 spirobicyclohexane InChI=1S/C11H20/c1-3-7-11(8-4-1)9-5-2-6-10-11/h1-10H2 2-cyclohexenespiro-(3'-cyclohexene) InChI=1S/C11H16/c1-3-7-11(8-4-1)9-5-2-6-10-11/h1,3,5,9H,2,4,6-8,10H2 spiro[4.5]deca-1,6-dien-2-yl InChI=1S/C10H13/c1-2-6-10(7-3-1)8-4-5-9-10/h2,6,9H,1,3-4,7-8H2 2-cyclohexenespiro-2'-cyclopenten-3'-yl InChI=1S/C10H13/c1-2-6-10(7-3-1)8-4-5-9-10/h2,6,9H,1,3-4,7-8H2 spiro[cyclopentane-1,1'-inden]-2'-yl InChI=1S/C13H13/c1-2-6-12-11(5-1)7-10-13(12)8-3-4-9-13/h1-2,5-7H,3-4,8-9H2 1-oxaspiro[4,5]decane InChI=1S/C9H16O/c1-2-5-9(6-3-1)7-4-8-10-9/h1-8H2 6,8-diazoniadispiro[5.1.6.2]hexadecane dichloride InChI=1S/C14H28N2.2ClH/c1-2-5-9-15(8-4-1)12-13-16(14-15)10-6-3-7-11-16;;/h1-14H2;2*1H/q+2;;/p-2 3,3'-spirobi[3H-indole] InChI=1S/C15H10N2/c1-3-7-13-11(5-1)15(9-16-13)10-17-14-8-4-2-6-12(14)15/h1-10H spiro[piperidine-4,9'-xanthene] InChI=1S/C17H17NO/c1-3-7-15-13(5-1)17(9-11-18-12-10-17)14-6-2-4-8-16(14)19-15/h1-8,18H,9-12H2 cyclohexanespiro-2'-(tetrahydrofuran) InChI=1S/C9H16O/c1-2-5-9(6-3-1)7-4-8-10-9/h1-8H2 tetrahydropyran-2-spirocyclohexane InChI=1S/C10H18O/c1-2-6-10(7-3-1)8-4-5-9-11-10/h1-9H2 3,3'-spirobi(3H-indole) InChI=1S/C15H10N2/c1-3-7-13-11(5-1)15(9-16-13)10-17-14-8-4-2-6-12(14)15/h1-10H 1,2,3,4-tetrahydroquinoline-4-spiro-4'-piperidine InChI=1S/C13H18N2/c1-2-4-12-11(3-1)13(7-10-15-12)5-8-14-9-6-13/h1-4,14-15H,5-10H2 hexahydroazepinium-1-spiro-1'-imidazolidine-3'-spiro-1''-piperidinium dibromide InChI=1S/C14H28N2.2BrH/c1-2-5-9-15(8-4-1)12-13-16(14-15)10-6-3-7-11-16;;/h1-14H2;2*1H/q+2;;/p-2 1-Oxaspiro[4.5]dec-2-yl InChI=1S/C9H15O/c1-2-5-9(6-3-1)7-4-8-10-9/h8H,1-7H2 Cyclohexanespiro-2'-(tetrahydrofuran)-5'-yl InChI=1S/C9H15O/c1-2-5-9(6-3-1)7-4-8-10-9/h8H,1-7H2 Spiro[benzofuran-2(3H),1'-cyclohexan]-4'-yl 
InChI=1S/C13H15O/c1-4-8-13(9-5-1)10-11-6-2-3-7-12(11)14-13/h1-3,6-7H,4-5,8-10H2 Spiro[naphthalene-2(3H),2'-thian]-4'-yl InChI=1S/C14H15S/c1-2-6-13-11-14(8-3-4-10-15-14)9-7-12(13)5-1/h1-3,5-7,11H,4,8-10H2 5lambda^5-arsaspiro[4.4]nonan-5-ylium InChI=1S/C8H16As/c1-2-6-9(5-1)7-3-4-8-9/h1-8H2/q+1 5-arsoniaspiro[4.4]nonane InChI=1S/C8H16As/c1-2-6-9(5-1)7-3-4-8-9/h1-8H2/q+1 5lambda^5-phosphaspiro[4.4]nonan-5-uide InChI=1S/C8H18P/c1-2-6-9(5-1)7-3-4-8-9/h1-9H2/q-1 5lambda^5-phosphanuidaspiro[4.4]nonane InChI=1S/C8H18P/c1-2-6-9(5-1)7-3-4-8-9/h1-9H2/q-1 5lambda^5,5'-spirobi[benzo[b]phosphindol]-5-ylium InChI=1S/C24H16P/c1-5-13-21-17(9-1)18-10-2-6-14-22(18)25(21)23-15-7-3-11-19(23)20-12-4-8-16-24(20)25/h1-16H/q+1 9-phosphonia-9,9'-spirobi[fluorene] InChI=1S/C24H16P/c1-5-13-21-17(9-1)18-10-2-6-14-22(18)25(21)23-15-7-3-11-19(23)20-12-4-8-16-24(20)25/h1-16H/q+1 5,5'-spirobi[benzo[b]phosphindolium] InChI=1S/C24H16P/c1-5-13-21-17(9-1)18-10-2-6-14-22(18)25(21)23-15-7-3-11-19(23)20-12-4-8-16-24(20)25/h1-16H/q+1 5,5'-spirobi[5H-dibenzophospholinium] InChI=1S/C24H16P/c1-5-13-21-17(9-1)18-10-2-6-14-22(18)25(21)23-15-7-3-11-19(23)20-12-4-8-16-24(20)25/h1-16H/q+1 5lambda7,5',5''-spiroter[benzo[b]phosphindol]-5-ide InChI=1S/C36H24P/c1-7-19-31-25(13-1)26-14-2-8-20-32(26)37(31,33-21-9-3-15-27(33)28-16-4-10-22-34(28)37)35-23-11-5-17-29(35)30-18-6-12-24-36(30)37/h1-24H/q-1 1H-2lambda5-spiro[isoquinoline-2,2'-pyrido[1,2-a]pyrazin]-2-ylium InChI=1/C17H15N2/c1-2-6-16-13-19(11-8-15(16)5-1)12-10-18-9-4-3-7-17(18)14-19/h1-12,14H,13H2/q+1 spiro[isoquinoline-2(1H),2'-[2H]pyrido[1,2-a]pyrazinium] InChI=1/C17H15N2/c1-2-6-16-13-19(11-8-15(16)5-1)12-10-18-9-4-3-7-17(18)14-19/h1-12,14H,13H2/q+1 2'H-3lambda5-spiro[3-azabicyclo[3.2.2]nonane-3,3'-[1,3]oxazol]-3-ylium InChI=1S/C11H18NO/c1-2-11-4-3-10(1)7-12(8-11)5-6-13-9-12/h5-6,10-11H,1-4,7-9H2/q+1 spiro[3-azabicyclo[3.2.2]nonane-3,3'(2H)-oxazolium] InChI=1S/C11H18NO/c1-2-11-4-3-10(1)7-12(8-11)5-6-13-9-12/h5-6,10-11H,1-4,7-9H2/q+1 
spiro[fluorene-9,2'-[3]thiabicyclo[2.2.2]oct[5]ene] InChI=1S/C19H16S/c1-3-7-17-15(5-1)16-6-2-4-8-18(16)19(17)13-9-11-14(20-19)12-10-13/h1-9,11,13-14H,10,12H2 #Incorrect indicated hydrogen 5'-bromo-1',3'-dihydro-2H,5H-spiro[imidazolidine-4,2'-indene]-2,5-dione InChI=1/C11H9BrN2O2/c12-8-2-1-6-4-11(5-7(6)3-8)9(15)13-10(16)14-11/h1-3H,4-5H2,(H2,13,14,15,16) opsin-2.8.0/opsin-inchi/src/test/resources/uk/ac/cam/ch/wwmm/opsin/stereochemistry.txt000066400000000000000000000032741451751637500311420ustar00rootroot00000000000000(3xi)-threonine InChI=1S/C4H9NO3/c1-2(6)3(5)4(7)8/h2-3,6H,5H2,1H3,(H,7,8)/t2?,3-/m0/s1 D-(+)-glucose InChI=1S/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h1,3-6,8-12H,2H2/t3-,4+,5+,6+/m0/s1 L-2-phenylglycine InChI=1S/C8H9NO2/c9-7(8(10)11)6-4-2-1-3-5-6/h1-5,7H,9H2,(H,10,11)/t7-/m0/s1 L-2-Aminobutyric acid InChI=1S/C4H9NO2/c1-2-3(5)4(6)7/h3H,2,5H2,1H3,(H,6,7)/t3-/m0/s1 D-2-Aminobutyric acid InChI=1S/C4H9NO2/c1-2-3(5)4(6)7/h3H,2,5H2,1H3,(H,6,7)/t3-/m1/s1 L-5-oxopyrrolidine-2-carboxylic acid InChI=1S/C5H7NO3/c7-4-2-1-3(6-4)5(8)9/h3H,1-2H2,(H,6,7)(H,8,9)/t3-/m0/s1 L(-)-Tryptophan InChI=1S/C11H12N2O2/c12-9(11(14)15)5-7-6-13-10-4-2-1-3-8(7)10/h1-4,6,9,13H,5,12H2,(H,14,15)/t9-/m0/s1 1,2-Bis((S)-(2-methoxyphenyl)(phenyl)phosphino)ethane InChI=1S/C28H28O2P2/c1-29-25-17-9-11-19-27(25)31(23-13-5-3-6-14-23)21-22-32(24-15-7-4-8-16-24)28-20-12-10-18-26(28)30-2/h3-20H,21-22H2,1-2H3/t31-,32-/m0/s1 (2R,4S,4aS)-rel-11-fluoro-2,4-dimethyl-8-(methylsulfinyl)-1,2,4,4a-tetrahydro-2′H,6H-spiro[1,4-oxazino[4,3-a][1,2]oxazolo[4,5-g]quinoline-5,5′-pyrimidine]-2′,4′,6′(1′H,3′H)-trione InChI=1S/C19H19FN4O6S/c1-7-6-24-12-9(4-10-13(11(12)20)30-23-15(10)31(3)28)5-19(14(24)8(2)29-7)16(25)21-18(27)22-17(19)26/h4,7-8,14H,5-6H2,1-3H3,(H2,21,22,25,26,27)/t7-,8+,14-,31?/m1/s1 (+-)-Trans-3-butyl-3-ethyl-2,3,4,5-tetrahydro-5-phenyl-1,4-benzothiazepine-4,8-diol 
InChI=1S/C21H27NO2S/c1-3-5-13-21(4-2)15-25-19-14-17(23)11-12-18(19)20(22(21)24)16-9-7-6-8-10-16/h6-12,14,20,23-24H,3-5,13,15H2,1-2H3/t20-,21-/m0/s1 (±)-cis-6-(3,5-Difluorophenyl)-3-(fluoromethyl)-3,4,6-trimethyl-1-(prop-2-en-1-yl)piperazin-2-one InChI=1S/C17H21F3N2O/c1-5-6-22-15(23)16(2,10-18)21(4)11-17(22,3)12-7-13(19)9-14(20)8-12/h5,7-9H,1,6,10-11H2,2-4H3/t16-,17+/m1/s1opsin-2.8.0/pom.xml000066400000000000000000000206031451751637500142050ustar00rootroot00000000000000 4.0.0 uk.ac.cam.ch.opsin opsin 2.8.0 pom OPSIN Open Parser for Systematic IUPAC Nomenclature http://opsin.ch.cam.ac.uk MIT License https://opensource.org/licenses/MIT https://github.com/dan2097/opsin/ scm:git:https://github.com/dan2097/opsin scm:git:https://github.com/dan2097/opsin 2.8.0 Daniel Lowe https://github.com/dan2097 Lead Programmer Peter Corbett Albina Asadulina Rich Apodaca opsin-core opsin-inchi opsin-cli sonatype-nexus-snapshots Sonatype Nexus Snapshots https://oss.sonatype.org/content/repositories/snapshots/ sonatype-nexus-staging Nexus Release Repository https://oss.sonatype.org/service/local/staging/deploy/maven2/ UTF-8 org.apache.maven.plugins maven-compiler-plugin 3.8.1 1.8 1.8 org.apache.maven.plugins maven-javadoc-plugin 3.2.0 8 maven-source-plugin 3.0.1 true org.apache.maven.plugins maven-surefire-plugin 2.22.2 false org.apache.maven.plugins maven-enforcer-plugin 3.3.0 enforce-bytecode-version enforce 1.8 true org.codehaus.mojo extra-enforcer-rules 1.6.2 org.apache.maven.plugins maven-release-plugin 2.5.3 forked-path false -Psonatype-oss-release sonatype-oss-release org.apache.maven.plugins maven-javadoc-plugin 3.3.1 attach-javadocs jar org.apache.maven.plugins maven-source-plugin 3.2.1 attach-sources jar-no-fork org.apache.maven.plugins maven-gpg-plugin 3.0.1 sign-artifacts verify sign uk.ac.cam.ch.opsin opsin-core ${project.version} uk.ac.cam.ch.opsin opsin-inchi ${project.version} dk.brics automaton 1.12-4 com.fasterxml.woodstox woodstox-core 6.5.1 org.apache.logging.log4j 
log4j-api 2.20.0 org.apache.logging.log4j log4j-core 2.20.0 io.github.dan2097 jna-inchi-core 1.2 commons-io commons-io 2.12.0 commons-cli commons-cli 1.5.0 org.junit.jupiter junit-jupiter 5.9.2 test org.hamcrest hamcrest-library 2.2 test org.mockito mockito-core 4.11.0 test opsin-2.8.0/src/000077500000000000000000000000001451751637500134565ustar00rootroot00000000000000opsin-2.8.0/src/site/000077500000000000000000000000001451751637500144225ustar00rootroot00000000000000opsin-2.8.0/src/site/apt/000077500000000000000000000000001451751637500152065ustar00rootroot00000000000000opsin-2.8.0/src/site/apt/index.apt000066400000000000000000000036621451751637500170320ustar00rootroot00000000000000
OPSIN: Open Parser for Systematic IUPAC Nomenclature

 OPSIN is a library for converting chemical names, especially organic chemical names, into structures. OPSIN can output to CML (Chemical Markup Language), SMILES or InChI.

 You can grab the latest released version of OPSIN in jar form from {{{https://github.com/dan2097/opsin/releases}GitHub}}.

 OPSIN is open source under the {{{https://opensource.org/licenses/MIT}MIT License}}. To see or contribute to the codebase please go to {{{https://github.com/dan2097/opsin/}GitHub}}.

 OPSIN uses Maven for dependency management, so the easiest way to depend on OPSIN is to add it to your pom.xml. See OPSIN's documentation for more information.

 Released versions of OPSIN are available from {{{https://maven.ch.cam.ac.uk/content/repositories/releases/uk/ac/cam/ch/opsin/}here}}.

 SNAPSHOT versions are available from {{{https://maven.ch.cam.ac.uk/content/repositories/snapshots/uk/ac/cam/ch/opsin/}here}}.

 OPSIN is available as a web service for demonstration purposes and for light batch processing {{{http://opsin.ch.cam.ac.uk}here}}.

Attribution

 If you use OPSIN to produce results for publication, then it would be great if you could cite us:

 Chemical Name to Structure: OPSIN, an Open Source Solution

 Daniel M. Lowe, Peter T.
 Corbett, Peter Murray-Rust, Robert C. Glen

 Journal of Chemical Information and Modeling 2011 51 (3), 739-753

Acknowledgements

 Recent performance improvements have been made possible with the use of the excellent YourKit profiling tool. YourKit is kindly supporting open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: {{{http://www.yourkit.com/java/profiler/index.jsp}YourKit Java Profiler}} and {{{http://www.yourkit.com/.net/profiler/index.jsp}YourKit .NET Profiler}}.
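
 For the Maven route mentioned earlier on this page, a dependency entry might look like the fragment below. This is a minimal sketch that assumes the coordinates listed in this release's parent pom (groupId uk.ac.cam.ch.opsin, artifactId opsin-core, version 2.8.0). The parent pom also lists an opsin-inchi module, which is presumably the one to depend on when InChI output is needed. Whether a repository entry is also required depends on where your build resolves artifacts from.

```xml
<!-- Sketch only: coordinates taken from the OPSIN 2.8.0 parent pom. -->
<dependency>
  <groupId>uk.ac.cam.ch.opsin</groupId>
  <!-- Assumption: swap in opsin-inchi if InChI output is required. -->
  <artifactId>opsin-core</artifactId>
  <version>2.8.0</version>
</dependency>
```

 The fragment belongs inside the dependencies element of the consuming project's pom.xml.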