debian/0000755000000000000000000000000012151500501007154 5ustar debian/watch0000644000000000000000000000015112117430155010213 0ustar version=3 http://code.google.com/p/chem-fingerprints/downloads/list .*/chemfp-([\d.]+(?:p\d+)?)\.tar\.gz debian/rdkit2fps.10000644000000000000000000000656612150746740011203 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.2. .TH RDKIT2FPS "1" "May 2013" "rdkit2fps 1.1p1" "User Commands" .SH NAME rdkit2fps \- rdkit2fps .SH DESCRIPTION usage: rdkit2fps [\-h] [\-\-fpSize INT] [\-\-RDK] [\-\-minPath INT] [\-\-maxPath INT] .IP [\-\-nBitsPerHash INT] [\-\-useHs 0|1] [\-\-morgan] [\-\-radius INT] [\-\-useFeatures 0|1] [\-\-useChirality 0|1] [\-\-useBondTypes 0|1] [\-\-torsions] [\-\-targetSize INT] [\-\-pairs] [\-\-minLength INT] [\-\-maxLength INT] [\-\-maccs166] [\-\-substruct] [\-\-rdmaccs] [\-\-id\-tag NAME] [\-\-in FORMAT] [\-o FILENAME] [\-\-errors {strict,report,ignore}] [filenames [filenames ...]] .PP Generate FPS fingerprints from a structure file using RDKit .SS "positional arguments:" .TP filenames input structure files (default is stdin) .SS "optional arguments:" .TP \fB\-h\fR, \fB\-\-help\fR show this help message and exit .TP \fB\-\-fpSize\fR INT number of bits in the fingerprint (applies to RDK, Morgan, topological torsion, and atom pair fingerprints) (default=2048) .TP \fB\-\-id\-tag\fR NAME tag name containing the record id (SD files only) .TP \fB\-\-in\fR FORMAT input structure format (default guesses from filename) .TP \fB\-o\fR FILENAME, \fB\-\-output\fR FILENAME save the fingerprints to FILENAME (default=stdout) .TP \fB\-\-errors\fR {strict,report,ignore} how should structure parse errors be handled? 
(default=strict) .SS "RDKit topological fingerprints:" .TP \fB\-\-RDK\fR generate RDK fingerprints (default) .TP \fB\-\-minPath\fR INT minimum number of bonds to include in the subgraph (default=1) .TP \fB\-\-maxPath\fR INT maximum number of bonds to include in the subgraph (default=7) .TP \fB\-\-nBitsPerHash\fR INT number of bits to set per path (default=4) .TP \fB\-\-useHs\fR 0|1 include information about the number of hydrogens on each atom (default=1) .SS "RDKit Morgan fingerprints:" .TP \fB\-\-morgan\fR generate Morgan fingerprints .TP \fB\-\-radius\fR INT radius for the Morgan algorithm (default=2) .TP \fB\-\-useFeatures\fR 0|1 use chemical\-feature invariants (default=0) .TP \fB\-\-useChirality\fR 0|1 include chirality information (default=0) .TP \fB\-\-useBondTypes\fR 0|1 include bond type information (default=1) .SS "RDKit Topological Torsion fingerprints:" .TP \fB\-\-torsions\fR generate Topological Torsion fingerprints .TP \fB\-\-targetSize\fR INT number of bits in the fingerprint (default=4) .SS "RDKit Atom Pair fingerprints:" .TP \fB\-\-pairs\fR generate Atom Pair fingerprints .TP \fB\-\-minLength\fR INT minimum bond count for a pair (default=1) .TP \fB\-\-maxLength\fR INT maximum bond count for a pair (default=30) .SS "166 bit MACCS substructure keys:" .TP \fB\-\-maccs166\fR generate MACCS fingerprints .SS "881 bit substructure keys:" .TP \fB\-\-substruct\fR generate ChemFP substructure fingerprints .SS "ChemFP version of the 166 bit RDKit/MACCS keys:" .TP \fB\-\-rdmaccs\fR generate 166 bit RDKit/MACCS fingerprints .PP This program guesses the input structure format based on the filename extension. If the data comes from stdin, or the extension name is unknown, then use "\-\-in" to change the default input format. 
The supported format extensions are: .TP File Type Valid FORMATs (use gz if compressed) .HP \fB\-\-\-\-\-\-\-\-\-\fR \fB\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\fR .TP SMILES smi, ism, can, smi.gz, ism.gz, can.gz .TP SDF sdf, mol, sd, mdl, sdf.gz, mol.gz, sd.gz, mdl.gz debian/ob2fps.10000644000000000000000000000231712150746740010454 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.2. .TH OB2FPS "1" "May 2013" "ob2fps 1.1p1" "User Commands" .SH NAME ob2fps \- ob2fps .SH DESCRIPTION usage: ob2fps [\-h] [\-\-FP2 | \fB\-\-FP3\fR | \fB\-\-FP4\fR | \fB\-\-MACCS\fR | \fB\-\-substruct\fR | \fB\-\-rdmaccs]\fR .IP [\-\-id\-tag NAME] [\-\-in FORMAT] [\-o FILENAME] [\-\-errors {strict,report,ignore}] [filenames [filenames ...]] .PP Generate FPS fingerprints from a structure file using OpenBabel .SS "positional arguments:" .TP filenames input structure files (default is stdin) .SS "optional arguments:" .TP \fB\-h\fR, \fB\-\-help\fR show this help message and exit .HP \fB\-\-FP2\fR .HP \fB\-\-FP3\fR .HP \fB\-\-FP4\fR .HP \fB\-\-MACCS\fR .TP \fB\-\-substruct\fR generate ChemFP substructure fingerprints .TP \fB\-\-rdmaccs\fR generate 166 bit RDKit/MACCS fingerprints .TP \fB\-\-id\-tag\fR NAME tag name containing the record id (SD files only) .TP \fB\-\-in\fR FORMAT input structure format (default autodetects from the filename extension) .TP \fB\-o\fR FILENAME, \fB\-\-output\fR FILENAME save the fingerprints to FILENAME (default=stdout) .TP \fB\-\-errors\fR {strict,report,ignore} how should structure parse errors be handled? 
(default=strict) debian/rules0000755000000000000000000000151012151500441010234 0ustar #!/usr/bin/make -f #export DH_VERBOSE=1 GCC_SSSE3_ARCH = amd64 i386 kfreebsd-i386 kfreebsd-amd64 hurd-i386 x32 DEB_HOST_ARCH ?= $(shell dpkg-architecture -qDEB_HOST_ARCH) -include /usr/share/python/python.mk PYVERS := $(shell pyversions -sv) export CPPFLAGS := $(shell dpkg-buildflags --get CPPFLAGS) export CFLAGS := $(shell dpkg-buildflags --get CFLAGS) export LDFLAGS := $(shell dpkg-buildflags --get LDFLAGS) -Wl,--as-needed %: dh $@ --with python2 --parallel ifeq (,$(findstring $(DEB_HOST_ARCH),$(GCC_SSSE3_ARCH))) override_dh_auto_build: dh_auto_build -- --without-ssse3 endif override_dh_auto_test: -for PY in $(PYVERS); do \ py_path="$(CURDIR)/build/lib.`python$${PY} -c 'from distutils import util; print util.get_platform();'`-$$PY/" ; \ (cd tests; PYTHONPATH=$$py_path python$$PY unit2 discover); \ done debian/simsearch.10000644000000000000000000000313212150746740011233 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.2. 
.TH SIMSEARCH "1" "May 2013" "simsearch 1.1p1" "User Commands" .SH NAME simsearch \- simsearch .SH DESCRIPTION usage: simsearch [\-h] [\-k K_NEAREST] [\-t THRESHOLD] [\-q QUERIES] [\-\-NxN] .IP [\-\-hex\-query HEX_QUERY] [\-\-query\-id QUERY_ID] [\-\-in FORMAT] [\-o FILENAME] [\-c] [\-b BATCH_SIZE] [\-\-scan] [\-\-memory] [\-\-times] target_filename .PP Search an FPS file for similar fingerprints .SS "positional arguments:" .TP target_filename target filename .SS "optional arguments:" .TP \fB\-h\fR, \fB\-\-help\fR show this help message and exit .TP \fB\-k\fR K_NEAREST, \fB\-\-k\-nearest\fR K_NEAREST select the k nearest neighbors (use 'all' for all neighbors) .TP \fB\-t\fR THRESHOLD, \fB\-\-threshold\fR THRESHOLD minimum similarity score threshold .TP \fB\-q\fR QUERIES, \fB\-\-queries\fR QUERIES filename containing the query fingerprints .TP \fB\-\-NxN\fR use the targets as the queries, and exclude the selfsimilarity term .TP \fB\-\-hex\-query\fR HEX_QUERY query in hex .TP \fB\-\-query\-id\fR QUERY_ID id for the hex query .TP \fB\-\-in\fR FORMAT input query format (default uses the file extension, else 'fps') .TP \fB\-o\fR FILENAME, \fB\-\-output\fR FILENAME output filename (default is stdout) .TP \fB\-c\fR, \fB\-\-count\fR report counts .TP \fB\-b\fR BATCH_SIZE, \fB\-\-batch\-size\fR BATCH_SIZE batch size .TP \fB\-\-scan\fR scan the file to find matches (low memory overhead) .TP \fB\-\-memory\fR build and search an in\-memory data structure (faster for multiple queries) .TP \fB\-\-times\fR report load and execution times to stderr debian/copyright0000644000000000000000000001145512150746740011135 0ustar This work was packaged for Debian by: Michael Banck on Fri, 08 Jun 2012 16:04:49 +0200 It was downloaded from: http://code.google.com/p/chem-fingerprints/downloads/list Upstream Author: Andrew Dalke General Copyright: Copyright (c) 2010-2013 Andrew Dalke Scientific, AB General License: Permission is hereby granted, free of charge, to any person obtaining a copy of 
this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. License: chemfp/argparse.py Copyright: Copyright (c) 2006-2009 Steven J. Bethard chemfp/argparse.py License: Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 
chemfp/rdmaccs.patterns Copyright: Copyright (c) 2006-2010 Rational Discovery LLC, Greg Landrum, and Julie Penzotti Copyright (c) 2001-2008 Greg Landrum and Rational Discovery LLC chemfp/rdmaccs.patterns License: Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Rational Discovery nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. tests/unittest2/* Copyright: Copyright (c) 1999-2003 Steve Purcell Copyright (c) 2003-2010 Python Software Foundation tests/unittest2/* License: This module is free software, and you may redistribute it and/or modify it under the same terms as Python itself, so long as this copyright message and disclaimer are retained in their original form. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS CODE, EVEN IF THE AUTHOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE CODE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THERE IS NO OBLIGATION WHATSOEVER TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. The Debian packaging is: Copyright (C) 2012 Michael Banck Copyright (C) 2013 The debichem team and is licensed under the same (MIT license) terms as the upstream project. debian/source/0000755000000000000000000000000012151500501010454 5ustar debian/source/format0000644000000000000000000000001411764423427011706 0ustar 3.0 (quilt) debian/sdf2fps.10000644000000000000000000000553312150746740010633 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.2. .TH SDF2FPS "1" "May 2013" "sdf2fps 1.1p1" "User Commands" .SH NAME sdf2fps \- sdf2fps .SH DESCRIPTION usage: sdf2fps [\-h] [\-\-id\-tag TAG] [\-\-fp\-tag TAG] [\-\-num\-bits INT] .IP [\-\-errors {strict,report,ignore}] [\-o FILENAME] [\-\-software TEXT] [\-\-type TEXT] [\-\-decompress METHOD] [\-\-binary] [\-\-binary\-msb] [\-\-hex] [\-\-hex\-lsb] [\-\-hex\-msb] [\-\-base64] [\-\-cactvs] [\-\-daylight] [\-\-decoder DECODER] [\-\-pubchem] [filenames [filenames ...]] .PP Extract a fingerprint tag from an SD file and generate FPS fingerprints .SS "positional arguments:" .TP filenames input SD files (default is stdin) .SS "optional arguments:" .TP \fB\-h\fR, \fB\-\-help\fR show this help message and exit .TP \fB\-\-id\-tag\fR TAG get the record id from TAG instead of the first line of the record .TP \fB\-\-fp\-tag\fR TAG get the fingerprint from tag TAG (required) .TP \fB\-\-num\-bits\fR INT use the first INT bits of the input. 
Use only when the last 1\-7 bits of the last byte are not part of the fingerprint. Unexpected errors will occur if these bits are not all zero. .TP \fB\-\-errors\fR {strict,report,ignore} how should structure parse errors be handled? (default=strict) .TP \fB\-o\fR FILENAME, \fB\-\-output\fR FILENAME save the fingerprints to FILENAME (default=stdout) .TP \fB\-\-software\fR TEXT use TEXT as the software description .TP \fB\-\-type\fR TEXT use TEXT as the fingerprint type description .TP \fB\-\-decompress\fR METHOD use METHOD to decompress the input (default='auto', \&'none', 'gzip', 'bzip2') .SS "Fingerprint decoding options:" .TP \fB\-\-binary\fR Encoded with the characters '0' and '1'. Bit #0 comes first. Example: 00100000 encodes the value 4 .TP \fB\-\-binary\-msb\fR Encoded with the characters '0' and '1'. Bit #0 comes last. Example: 00000100 encodes the value 4 .TP \fB\-\-hex\fR Hex encoded. Bit #0 is the first bit (1<<0) of the first byte. Example: 01f2 encodes the value \ex01\exf2 = 498 .TP \fB\-\-hex\-lsb\fR Hex encoded. Bit #0 is the eighth bit (1<<7) of the first byte. Example: 804f encodes the value \ex01\exf2 = 498 .TP \fB\-\-hex\-msb\fR Hex encoded. Bit #0 is the first bit (1<<0) of the last byte. Example: f201 encodes the value \ex01\exf2 = 498 .TP \fB\-\-base64\fR Base\-64 encoded. Bit #0 is first bit (1<<0) of first byte. Example: AfI= encodes value \ex01\exf2 = 498 .TP \fB\-\-cactvs\fR CACTVS encoding, based on base64 and includes a version and bit length .TP \fB\-\-daylight\fR Daylight encoding, which is a base64 variant .TP \fB\-\-decoder\fR DECODER import and use the DECODER function to decode the fingerprint .SS "shortcuts:" .TP \fB\-\-pubchem\fR decode CACTVS substructure keys used in PubChem. 
Same as \fB\-\-software\fR=\fICACTVS\fR/unknown \fB\-\-type\fR 'CACTVSE_SCREEN/1.0 extended=2' \fB\-\-fp\-tag\fR=\fIPUBCHEM_CACTVS_SUBSKEYS\fR \fB\-\-cactvs\fR debian/compat0000644000000000000000000000000211764423427010376 0ustar 7 debian/control0000644000000000000000000000277412150746740010611 0ustar Source: chemfp Section: science Priority: optional Maintainer: Debichem Team Uploaders: Michael Banck Build-Depends: debhelper (>= 7.0.50~), python-all-dev (>= 2.6.6-3~), python-openbabel, python-rdkit, python-unit Standards-Version: 3.9.4 Homepage: http://code.google.com/p/chem-fingerprints/ Vcs-Browser: http://svn.debian.org/wsvn/debichem/unstable/chemfp/ Vcs-Svn: svn://svn.debian.org/svn/debichem/unstable/chemfp Package: python-chemfp Section: python Architecture: any Provides: ${python:Provides} Depends: ${misc:Depends}, ${python:Depends}, ${shlibs:Depends} Recommends: python-openbabel, python-rdkit Description: cheminformatics fingerprints file formats and tools Chem-fingerprints is a set of formats and related tools for the storage, exchange, and search of cheminformatics fingerprint data sets. . It translates fingerprints from the OpenBabel and RDKit cheminformatics packages (as well as the proprietary OEChem package) into the binary FPS format. . Besides Python modules, it provides the following tools: . 
* sdf2fps - Extract fingerprint data from SD tags * ob2fps - Use OpenBabel to generate fingerprints from structures * rdkit2fps - Use RDKit to generate fingerprints from structures * oe2fps - Use OEChem/OEGraphSim to generate fingerprints from structures * simsearch - Do threshold or k-nearest neighbor Tanimoto similarity searches between two FPS files debian/python-chemfp.manpages0000644000000000000000000000010711764423427013474 0ustar debian/ob2fps.1 debian/rdkit2fps.1 debian/sdf2fps.1 debian/simsearch.1 debian/docs0000644000000000000000000000000711764423427010050 0ustar README debian/changelog0000644000000000000000000000174712151500475011051 0ustar chemfp (1.1p1-2) unstable; urgency=low [ Daniel Leidert ] * debian/rules (override_dh_auto_build): Disable -mssse3 flag on architectures that don't support it to fix FTBFS. -- Debichem Team Thu, 30 May 2013 00:29:13 +0200 chemfp (1.1p1-1) unstable; urgency=low * New upstream release (Closes: #702389). [ Daniel Leidert ] * debian/control: Added Provides field. (Build-Depends): Added python-unit and versioned python-all-dev. (Standards-Version): Bumped to 3.9.4. (Description): Minor improvements. * debian/copyright: Updated. * debian/*.1: Updated by help2man. * debian/rules: Enable tests (passing tests is allowed to fail). Enable hardening. * debian/watch: Added. -- Debichem Team Mon, 27 May 2013 23:20:41 +0200 chemfp (1.0-1) unstable; urgency=low * Initial release (Closes: #676658). -- Michael Banck Fri, 08 Jun 2012 16:04:49 +0200