chemfp-1.1p1/CHANGELOG

What's new in 1.1p1 (12 Feb 2013)
=================================

Fixed memory leaks caused by using Py_BuildValue with an "O" instead
of an "N". This caused the reference count on the returned arena
strings to be too high, so they were never garbage collected. This
should only affect people who made and destroyed many arenas.

Removed unneeded lock in threshold arena searches. This should give
better parallelism when there are many hits (eg, with a low threshold)
and there are multiple threads.

What's new in 1.1 (5 Feb 2013)
==============================

New methods to look up a record, record index, or fingerprint given
the record identifier. These are:

    arena.get_by_id(id)
    arena.get_index_by_id(id)
    arena.get_fingerprint_by_id(id)

Added or updated all of the docstrings for the public API.

Documented that the search methods on the FingerprintArena instance
are deprecated - use chemfp.search instead. These will generate a
warning message in the next release and after that will be removed.

Renamed arena.copy_subset() to arena.copy().

Changed the arena.copy() method so that by default it reorders the
fingerprints if indices are specified, and by default it preserves
the (sub)arena ordering if they are not.

Added a cache for getattr(subarena, "ids"). Otherwise subarena.ids[i]
took O(len(subarena.ids)) time instead of O(1) time.

Renamed chemfp.readers to chemfp.fps_io and decoders.py to
encodings.py. These were not part of the public API but may be in
upcoming versions, so it's best to change them now.

Detect and raise an exception if the metadata size doesn't match the
fingerprint size passed to the arena builder. Thanks to Greg Landrum
for spotting this bug!
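
The size check described in the last 1.1 item can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
arena builder's actual code: `expected_num_bytes` and
`check_fingerprint_size` are hypothetical helper names, and the byte count
uses the same (num_bits + 7)//8 rounding that the Metadata class in
chemfp/__init__.py applies.

```python
def expected_num_bytes(num_bits):
    """Return the number of bytes needed to store num_bits bits (rounded up)."""
    return (num_bits + 7) // 8


def check_fingerprint_size(fp, num_bits):
    """Raise ValueError if the fingerprint length disagrees with the metadata.

    Hypothetical helper for illustration only; it is not part of chemfp.
    'fp' is a byte string and 'num_bits' is the size given by the metadata.
    """
    num_bytes = expected_num_bytes(num_bits)
    if len(fp) != num_bytes:
        raise ValueError(
            "fingerprint contains %d bytes but the metadata specifies %d bytes"
            % (len(fp), num_bytes))
    return num_bytes
```

For instance, a 166-bit MACCS-style fingerprint needs 21 bytes, so passing a
20-byte string with num_bits=166 would raise the ValueError.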
What's new in 1.1b7 (patch release)
===================================

Fixed a problem when the code is compiled with an old compiler which
doesn't understand the POPCNT inline assembly and then run on a machine
which implements POPCNT.

What's new in 1.1b6 (5 Dec 2012)
================================

Added methods to count the number of hits in the search results which
are within a given score range, and to compute the cumulative score
(also called the "raw score") of those hits. These are:

    SearchResults.count_all(min_score=None, max_score=None, interval="[]")
    SearchResults.cumulative_score_all(min_score=None, max_score=None, interval="[]")
    SearchResult.count(min_score=None, max_score=None, interval="[]")
    SearchResult.cumulative_score(min_score=None, max_score=None, interval="[]")

Arenas now have a "copy_subset(indices, reorder=True)" method. This
selects a subset of the entries in the arena and makes a new arena.
Here's how to select a random subset of 100 entries from an arena:

    import random
    subset_indices = random.sample(xrange(len(arena)), 100)
    new_arena = arena.copy_subset(subset_indices)

(NOTE: 'copy_subset' was renamed 'copy' for the final 1.1 release.)

Fixed a bug in the Open Babel patterns FPS output: the 'software' line
needed a space between the Open Babel and chemfp versions.

What's new in 1.1b5 (23 April 2012)
===================================

The command-line search tools support an --NxN option for when the
queries and targets are the same. (The search results do not include
the diagonal term.) The implementation takes advantage of the symmetry
to get almost a two-fold performance increase. This option assumes
that everything will fit into memory.

Added public APIs for the symmetric searches.

New popcount algorithms:
   - Lauradoux and POPCNT versions contributed by Kim Walisch
       These are 2x and 3x faster than the original method.
   - SSSE3 version by Imran Haque, Stanford University
       This is about 2.5x faster than the original method.
       Use --without-ssse3 to disable support for that method.
   - Gilles method, which can be better than the original method.

The timings depend very much on the compiler, CPU features, and choice
of 32- vs 64- bit architecture. For example, Lauradoux is slower than
the lookup tables for 32 bit systems. chemfp selects the best method at
import run-time. Use chemfp.bitops.set_alignment_method to force a
specific method.

The new popcount algorithms require a specific fingerprint alignment
and padding. Use the new "alignment" option in load_fingerprints() to
specify an alignment. The default uses an alignment based on the
available methods and the fingerprint size. (It will be 8 or less
unless you have SSSE3 hardware but not SSE4.2, and your fingerprint is
larger than 224 bits, in which case it's 64 bytes.)

Optional OpenMP support. This is used when the query is an arena. If
your compiler does not support OpenMP then use "--without-openmp" to
disable it.

Support for RDKit's Morgan fingerprints.

Support for Daylight's Circular and Tree fingerprints (if you have
OEGraphSim 2.0.0 installed.)

New decoder for Daylight's "binary2ascii" encoding.

Fixed a memory overflow bug which caused crashes on some Windows and
Linux machines.

Changed the API so that "arena.ids" or "subarena.ids" refers to the
identifiers for that arena/subarena, and "arena.arena_ids" and
"subarena.arena_ids" refer to the complete list of identifiers for
the underlying arena. This is what my code expected, only I got the
implementation backwards. Two of the test cases should have failed
with swapped attributes but it looks like I assumed the computer was
right and made the tests agree with the wrong values. Also added more
tests to highlight other places where I could make a mistake between
'ids' and 'arena_ids.' This fix resolves a serious error identified by
Brian McClain of Vertex.

Moved most memory management code from Python to C.
The speedup is most noticeable when there is a high hit density (eg,
when the threshold is below 0.5).

Created a new 'Results' return object, which lets you sort the hits in
different ways, and request only the score, or only the ids, or both
from the hitlist. The arena search results specifically are stored in
a C data structure. This new API greatly simplifies implementing some
types of clustering algorithms, reduces memory overhead, and improves
performance.

Added Alex Grönholm's 'futures' package as a submodule. It greatly
simplifies making a thread- or process pool. It is a backport of the
code in Python 3.2.

Added Nilton Volpato's 'progressbar' package as a submodule. Use it to
show a text-based progress bar in chemfp-based search tools.

Added an experimental "Watcher" module by Allen Downey. Use it to
handle ^C events, which otherwise get sent to an arbitrary thread. It
works by spawning a child process. The main process listens for a ^C
and forwards that as an os.kill() to the child process. This will
likely only work on Unix systems.

What's new in 1.0 (20 Sept 2011)
================================

The chemfp format is now a tab-delimited format. I talked with two
people who have spaces in their ids: one in their corporate ids and
the other wants to use IUPAC names. In discussion with others, having
a pure tab-delimited format would not be a problem with the primary
audience.

The simsearch output format is also tab delimited.

Completely redeveloped the in-memory search interface. The core data
structure is a "FingerprintArena", which can optionally hold
population count information. The similarity searches use a compressed
row representation, which is a more efficient use of memory and
reduces the number of Python-to-C calls I need to make.

The FPS knearest search is push oriented, and keeps track of the
identifiers at the C level.

Major restructuring of the API so that public functions are at the top
of the "chemfp" package.
Made high-level functions for the expected common tasks of searching
an FPSReader and a FingerprintArena.

The oe2fps, ob2fps, and rdkit2fps readers now support multiple
structure filenames. Each filename is listed on its own "source" line.

New --id-tag to use one of the SD tag fields rather than the title
line. This is needed for ChEBI where you should use --id-tag "ChEBI ID"
to get ids like "CHEBI:776".

New --aromaticity option for oe2fps, and a corresponding "aromaticity"
field in the FPS header.

Improved docstring comments.

Improved error reporting.

Added error handling options "strict", "report", and "ignore."

More comprehensive test suite (which, yes, caught several errors).

What's new in 0.95
==================

Cross-platform pattern-based fingerprint generation, and specific
implementations of a CACTVS/PubChem-like substructure fingerprint and
of RDKit's MACCS patterns.

What's new in 0.9.1
===================

Support for Python 2.5.

What's new in 0.9
=================

Major update from 0.5. Changes to the API, code cleanup, new search
API, and more. Since there are no earlier users, I won't go into the
details.
:)

chemfp-1.1p1/chemfp/__init__.py

# Library for working with cheminformatics fingerprints

# All chem-fingerprint software is distributed with the following license:

# Copyright (c) 2010-2013 Andrew Dalke Scientific, AB (Gothenburg, Sweden)
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

__version__ = "1.1p1"
__version_info = (1, 1, 1)
SOFTWARE = "chemfp/" + __version__

import os
import itertools

__all__ = ["open", "load_fingerprints", "read_structure_fingerprints",
           "Metadata", "FingerprintIterator", "Fingerprints"]

class ChemFPError(Exception):
    pass

class ParseError(ChemFPError):
    pass

def read_structure_fingerprints(type, source=None, format=None, id_tag=None, errors="strict"):
    """Read structures from 'source' and return the corresponding ids and fingerprints

    This returns a FingerprintReader which can be iterated over to get
    the id and fingerprint for each read structure record. The
    fingerprint generated depends on the value of 'type'. Structures
    are read from 'source', which can either be the structure
    filename, or None to read from stdin.

    'type' contains the information about how to turn a structure
    into a fingerprint. It can be a string or a metadata instance.
    String values look like "OpenBabel-FP2/1", "OpenEye-Path", and
    "OpenEye-Path/1 min_bonds=0 max_bonds=5 atype=DefaultAtom btype=DefaultBond".
    Default values are used for unspecified parameters. Use a
    Metadata instance with 'type' and 'aromaticity' values set
    in order to pass aromaticity information to OpenEye.

    If 'format' is None then the structure file format and compression
    are determined by the filename's extension(s), defaulting to
    uncompressed SMILES if that is not possible. Otherwise 'format' may
    be "smi" or "sdf" optionally followed by ".gz" or ".bz2" to indicate
    compression. The OpenBabel and OpenEye toolkits also support
    additional formats.

    If 'id_tag' is None, then the record id is based on the title
    field for the given format. If the input format is "sdf" then 'id_tag'
    specifies the tag field containing the identifier. (Only the first
    line is used for multi-line values.) For example, ChEBI omits the
    title from the SD files and stores the id after the "> <ChEBI ID>"
    line. In that case, use id_tag = "ChEBI ID".

    'aromaticity' specifies the aromaticity model, and is only appropriate for
    OEChem. It must be a string like "openeye" or "daylight".

    Here is an example of using fingerprints generated from a structure file::

        fp_reader = read_structure_fingerprints("OpenBabel-FP4/1", "example.sdf.gz")
        print "Each fingerprint has", fp_reader.metadata.num_bits, "bits"
        for (id, fp) in fp_reader:
            print id, fp.encode("hex")

    :param type: information about how to convert the input structure into a fingerprint
    :type type: string or Metadata
    :param source: The structure data source.
    :type source: A filename (as a string), a file object, or None to read from stdin
    :param format: The file format and optional compression.
            Examples: 'smi' and 'sdf.gz'
    :type format: string, or None to autodetect based on the source
    :param id_tag: The tag containing the record id. Example: 'ChEBI ID'.
            Only valid for SD files.
    :type id_tag: string, or None to use the default title for the given format
    :returns: a FingerprintReader
    """ # ' # emacs cruft
    from . import types
    if isinstance(type, basestring):
        metadata = None
    else:
        metadata = type
        if metadata.type is None:
            raise ValueError("Missing fingerprint type information in metadata")
        type = metadata.type

    structure_fingerprinter = types.parse_type(type)
    return structure_fingerprinter.read_structure_fingerprints(source, format, id_tag, errors, metadata=metadata)

# Low-memory, forward-iteration, or better
def open(source, format=None):
    """Read fingerprints from a fingerprint file

    Read fingerprints from 'source', using the given format. If 'source' is a
    string then it is treated as a filename. If 'source' is None then fingerprints
    are read from stdin. Otherwise, 'source' must be a Python file object
    supporting 'read' and 'readline'.

    If 'format' is None then the fingerprint file format and compression type
    are derived from the source filename, or from the name attribute of the
    source file object.
    If the source is None then stdin is assumed to be uncompressed data in
    "fps" format.

    The supported format strings are:

       fps, fps.gz  - fingerprints are in FPS format

    The result is an FPSReader. Here's an example of printing the contents of
    the file::

        reader = open("example.fps.gz")
        for id, fp in reader:
            print id, fp.encode("hex")

    :param source: The fingerprint source.
    :type source: A filename string, a file object, or None
    :param format: The file format and optional compression.
    :type format: string, or None
    :returns: an FPSReader
    """
    from . import io
    format_name, compression = io.normalize_format(source, format)

    if format_name == "fps":
        from . import fps_io
        return fps_io.open_fps(source, format_name+compression)

    if format_name == "fpb":
        raise NotImplementedError("fpb format support not implemented")

    if format is None:
        raise ValueError("Unable to determine fingerprint format type from %r" % (source,))
    else:
        raise ValueError("Unknown fingerprint format %r" % (format,))


def load_fingerprints(reader, metadata=None, reorder=True, alignment=None):
    """Load all of the fingerprints into an in-memory FingerprintArena data structure

    The FingerprintArena data structure reads all of the fingerprints and
    identifiers from 'reader' and stores them into an in-memory data
    structure which supports fast similarity searches.

    If 'reader' is a string or implements "read" then the contents will be
    parsed with the 'chemfp.open' function. Otherwise it must support
    iteration returning (id, fingerprint) pairs. 'metadata' contains the
    metadata for the arena. If not specified then 'reader.metadata' is used.

    The loader may reorder the fingerprints for better search performance.
    To prevent ordering, use reorder=False.

    The 'alignment' option specifies the data alignment and padding size
    for each fingerprint. A value of 8 means that each fingerprint will
    start on an 8 byte alignment, and use storage space which is a
    multiple of 8 bytes long.
    The default value of None determines the best alignment based on the
    fingerprint size and available popcount methods.

    :param reader: An iterator over (id, fingerprint) pairs
    :type reader: a string, file object, or (id, fingerprint) iterator
    :param metadata: The metadata for the arena, if other than reader.metadata
    :type metadata: Metadata
    :param reorder: Specify if fingerprints should be reordered for better performance
    :type reorder: True or False
    :param alignment: Alignment size (both data alignment and padding)
    :returns: FingerprintArena
    """
    if isinstance(reader, basestring):
        reader = open(reader)
    elif hasattr(reader, "read"):
        reader = open(reader)
    if metadata is None:
        metadata = reader.metadata

    from . import arena
    return arena.fps_to_arena(reader, metadata=metadata, reorder=reorder,
                              alignment=alignment)

##### High-level search interfaces

def count_tanimoto_hits(queries, targets, threshold=0.7, arena_size=100):
    """Count the number of targets within 'threshold' of each query term

    For each query in 'queries', count the number of targets in 'targets'
    which are at least 'threshold' similar to the query. This function
    returns an iterator containing the (query_id, count) pairs.

    Example::

      queries = chemfp.open("queries.fps")
      targets = chemfp.load_fingerprints("targets.fps.gz")
      for (query_id, count) in chemfp.count_tanimoto_hits(queries, targets, threshold=0.9):
          print query_id, "has", count, "neighbors with at least 0.9 similarity"

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have more
    than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: a positive integer, or None
    :returns:
       An iterator containing (query_id, count) pairs, one for each query
    """
    from . import fps_io
    if isinstance(targets, fps_io.FPSReader):
        from . import fps_search
        count_hits = fps_search.count_tanimoto_hits_arena
    else:
        from . import search
        count_hits = search.count_tanimoto_hits_arena

    ### Start the search now so compatibility errors are raised eagerly

    # Start iterating through the subarenas, and get the first of those
    subarenas = queries.iter_arenas(arena_size)
    try:
        first_query_arena = subarenas.next()
    except StopIteration:
        # There are no subarenas; return an empty iterator
        return iter([])

    # Get the first result, and hold on to it for the generator
    first_counts = count_hits(first_query_arena, targets, threshold=threshold)

    def count_tanimoto_hits():
        # Return results for the first arena
        for query_id, count in zip(first_query_arena.ids, first_counts):
            yield query_id, count
        # Return results for the rest of the arenas
        for query_arena in subarenas:
            counts = count_hits(query_arena, targets, threshold=threshold)
            for query_id, count in zip(query_arena.ids, counts):
                yield query_id, count
    return count_tanimoto_hits()


def threshold_tanimoto_search(queries, targets, threshold=0.7, arena_size=100):
    """Find all targets within 'threshold' of each query term

    For each query in 'queries', find all the targets in 'targets' which
    are at least 'threshold' similar to the query. This function returns
    an iterator containing the (query_id, hits) pairs. The hits are stored
    as a list of (target_id, score) pairs.

    Example::

      queries = chemfp.open("queries.fps")
      targets = chemfp.load_fingerprints("targets.fps.gz")
      for (query_id, hits) in chemfp.threshold_tanimoto_search(queries, targets, threshold=0.8):
          print query_id, "has", len(hits), "neighbors with at least 0.8 similarity"
          non_identical = [target_id for (target_id, score) in hits if score != 1.0]
          print " The non-identical hits are:", non_identical

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have more
    than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: positive integer, or None
    :returns:
      An iterator containing (query_id, hits) pairs, one for each query.
      'hits' contains a list of (target_id, score) pairs.
    """
    from . import fps_io
    if isinstance(targets, fps_io.FPSReader):
        from . import fps_search
        threshold_search = fps_search.threshold_tanimoto_search_arena
    else:
        from . import search
        threshold_search = search.threshold_tanimoto_search_arena

    ### Start the search now so compatibility errors are raised eagerly

    # Start iterating through the subarenas, and get the first of those
    subarenas = queries.iter_arenas(arena_size)
    try:
        first_query_arena = subarenas.next()
    except StopIteration:
        # There are no subarenas; return an empty iterator
        return iter([])

    # Get the first result, and hold on to it for the generator
    first_results = threshold_search(first_query_arena, targets, threshold=threshold)

    ## Here's a thought; allow a 'result_order' parameter so I can do:
    # if result_order is not None:
    #     first_results.reorder(reorder)

    def threshold_tanimoto_search():
        # Return results for the first arena
        for query_id, row in zip(first_query_arena.ids, first_results):
            yield query_id, row

        for query_arena in subarenas:
            results = threshold_search(query_arena, targets, threshold=threshold)
            ## I would also need to do
            #if result_order is not None:
            #    first_results.reorder(reorder)

            for query_id, row in zip(query_arena.ids, results):
                yield (query_id, row)
    return threshold_tanimoto_search()

def knearest_tanimoto_search(queries, targets, k=3, threshold=0.7, arena_size=100):
    """Find the 'k'-nearest targets within 'threshold' of each query term

    For each query in 'queries', find the 'k'-nearest of all the targets
    in 'targets' which are at least 'threshold' similar to the query. Ties
    are broken arbitrarily and hits with scores equal to the smallest value
    may have been omitted.

    This function returns an iterator containing the (query_id, hits) pairs,
    where hits is a list of (target_id, score) pairs, sorted so that the
    highest scores are first. The order of ties is arbitrary.

    Example::

      # Use the first 5 fingerprints as the queries
      queries = next(chemfp.open("pubchem_subset.fps").iter_arenas(5))
      targets = chemfp.load_fingerprints("pubchem_subset.fps")

      # Find the 3 nearest hits with a similarity of at least 0.8
      for (query_id, hits) in chemfp.knearest_tanimoto_search(queries, targets, k=3, threshold=0.8):
          print query_id, "has", len(hits), "neighbors with at least 0.8 similarity"
          if hits:
              target_id, score = hits[-1]
              print " The least similar is", target_id, "with score", score

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have more
    than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param k: The maximum number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: positive integer, or None
    :returns:
      An iterator containing (query_id, hits) pairs, one for each query.
      'hits' contains a list of (target_id, score) pairs, sorted by score.
    """
    from . import fps_io
    if isinstance(targets, fps_io.FPSReader):
        from . import fps_search
        knearest_search = fps_search.knearest_tanimoto_search_arena
    else:
        from . import search
        knearest_search = search.knearest_tanimoto_search_arena

    ### Start the search now so compatibility errors are raised eagerly

    # Start iterating through the subarenas, and get the first of those
    subarenas = queries.iter_arenas(arena_size)
    try:
        first_query_arena = subarenas.next()
    except StopIteration:
        # There are no subarenas; return an empty iterator
        return iter([])

    # Get the first result, and hold on to it for the generator
    first_results = knearest_search(first_query_arena, targets, k=k, threshold=threshold)

    def knearest_tanimoto_search():
        # Return results for the first arena
        for query_id, row in zip(first_query_arena.ids, first_results):
            yield query_id, row

        # and for the subarenas
        for query_arena in subarenas:
            results = knearest_search(query_arena, targets, k=k, threshold=threshold)
            for query_id, row in zip(query_arena.ids, results):
                yield (query_id, row)

    return knearest_tanimoto_search()

def count_tanimoto_hits_symmetric(fingerprints, threshold=0.7):
    """Find the number of other fingerprints within `threshold` of each fingerprint

    For each fingerprint in the `fingerprints` arena, find the number
    of other fingerprints in the same arena which are at least
    `threshold` similar to it. The arena must have pre-computed
    popcounts. A fingerprint never matches itself.

    This function returns an iterator of (fingerprint_id, count) pairs.

    Example::

      arena = chemfp.load_fingerprints("targets.fps.gz")
      for (fp_id, count) in chemfp.count_tanimoto_hits_symmetric(arena, threshold=0.6):
          print fp_id, "has", count, "neighbors with at least 0.6 similarity"

    :param fingerprints: The arena containing the fingerprints.
    :type fingerprints: a FingerprintArena with precomputed popcount_indices
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns:
      An iterator of (fp_id, count) pairs, one for each fingerprint
    """
    from . import fps_io, search
    if (isinstance(fingerprints, fps_io.FPSReader) or
        not getattr(fingerprints, "popcount_indices", None)):
        raise ValueError("`fingerprints` must be a FingerprintArena with pre-computed popcount indices")

    # Start the search now so the errors are caught early
    results = search.count_tanimoto_hits_symmetric(fingerprints, threshold)
    def count_tanimoto_hits_symmetric_internal():
        for id, count in zip(fingerprints.ids, results):
            yield id, count
    return count_tanimoto_hits_symmetric_internal()

def threshold_tanimoto_search_symmetric(fingerprints, threshold=0.7):
    """Find the other fingerprints within `threshold` of each fingerprint

    For each fingerprint in the `fingerprints` arena, find the other
    fingerprints in the same arena which are at least `threshold`
    similar to it. The arena must have pre-computed popcounts. A
    fingerprint never matches itself.

    This function returns an iterator of (fingerprint_id, SearchResult) pairs.
    The SearchResult hit order is arbitrary.

    Example::

      arena = chemfp.load_fingerprints("targets.fps.gz")
      for (fp_id, hits) in chemfp.threshold_tanimoto_search_symmetric(arena, threshold=0.75):
          print fp_id, "has", len(hits), "neighbors:"
          for (other_id, score) in hits.get_ids_and_scores():
              print " %s %.2f" % (other_id, score)

    :param fingerprints: The arena containing the fingerprints.
    :type fingerprints: a FingerprintArena with precomputed popcount_indices
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: An iterator of (fp_id, SearchResult) pairs, one for each fingerprint
    """
    from . import fps_io, search
    if (isinstance(fingerprints, fps_io.FPSReader) or
        not getattr(fingerprints, "popcount_indices", None)):
        raise ValueError("`fingerprints` must be a FingerprintArena with pre-computed popcount indices")

    # Start the search now so the errors are caught early
    results = search.threshold_tanimoto_search_symmetric(fingerprints, threshold)
    def threshold_tanimoto_search_symmetric_internal():
        for id, hits in zip(fingerprints.ids, results):
            yield id, hits
    return threshold_tanimoto_search_symmetric_internal()

def knearest_tanimoto_search_symmetric(fingerprints, k=3, threshold=0.7):
    """Find the nearest `k` fingerprints within `threshold` of each fingerprint

    For each fingerprint in the `fingerprints` arena, find the nearest
    `k` fingerprints in the same arena which are at least `threshold`
    similar to it. The arena must have pre-computed popcounts. A
    fingerprint never matches itself.

    This function returns an iterator of (fingerprint_id, SearchResult) pairs.
    The SearchResult hits are ordered from highest score to lowest, with
    ties broken arbitrarily.

    Example::

      arena = chemfp.load_fingerprints("targets.fps.gz")
      for (fp_id, hits) in chemfp.knearest_tanimoto_search_symmetric(arena, k=5, threshold=0.5):
          print fp_id, "has", len(hits), "neighbors, with scores",
          print ", ".join("%.2f" % x for x in hits.get_scores())

    :param fingerprints: The arena containing the fingerprints.
    :type fingerprints: a FingerprintArena with precomputed popcount_indices
    :param k: The maximum number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: An iterator of (fp_id, SearchResult) pairs, one for each fingerprint
    """
    from . import fps_io, search
    if (isinstance(fingerprints, fps_io.FPSReader) or
        not getattr(fingerprints, "popcount_indices", None)):
        raise ValueError("`fingerprints` must be a FingerprintArena with pre-computed popcount indices")

    # Start the search now so the errors are caught early
    results = search.knearest_tanimoto_search_symmetric(fingerprints, k, threshold)
    def knearest_tanimoto_search_symmetric_internal():
        for id, hits in zip(fingerprints.ids, results):
            yield id, hits
    return knearest_tanimoto_search_symmetric_internal()


def check_fp_problems(fp, metadata):
    "This interface is not documented and may change in the future"
    if len(fp) != metadata.num_bytes:
        msg = ("%%(fp)s fingerprint contains %d bytes but %%(metadata)s has %d byte fingerprints" %
               (len(fp), metadata.num_bytes))
        return [("error", "num_bytes mismatch", msg)]
    return []

def check_metadata_problems(metadata1, metadata2):
    "This interface is not documented and may change in the future"
    messages = []
    compared_num_bits = False
    if (metadata1.num_bits is not None and metadata2.num_bits is not None):
        compared_num_bits = True
        if metadata1.num_bits != metadata2.num_bits:
            msg = ("%%(metadata1)s has %d bit fingerprints but %%(metadata2)s has %d bit fingerprints" %
                   (metadata1.num_bits, metadata2.num_bits))
            messages.append( ("error", "num_bits mismatch", msg) )

    if (not compared_num_bits and
        metadata1.num_bytes is not None and
        metadata2.num_bytes is not None and
        metadata1.num_bytes != metadata2.num_bytes):
        # Fixed: the second placeholder was missing its trailing "s"
        msg = ("%%(metadata1)s has %d byte fingerprints but %%(metadata2)s has %d byte fingerprints" %
               (metadata1.num_bytes, metadata2.num_bytes))
        messages.append( ("error", "num_bytes mismatch", msg) )

    if (metadata1.type is not None and
        metadata2.type is not None and
        metadata1.type != metadata2.type):
        msg = ("%%(metadata1)s has fingerprints of type %r but %%(metadata2)s has fingerprints of type %r" %
               (metadata1.type, metadata2.type))
        messages.append( ("warning", "type mismatch", msg) )

    if (metadata1.aromaticity is not None and
metadata2.aromaticity is not None and metadata1.aromaticity != metadata2.aromaticity): msg = ("%%(metadata1)s uses aromaticity %r but %%(metadata2)s uses aromaticity %r" % (metadata1.aromaticity, metadata2.aromaticity)) messages.append( ("warning", "aromaticity mismatch", msg) ) if (metadata1.software is not None and metadata2.software is not None and metadata1.software != metadata2.software): msg = ("%%(metadata1)s comes from software %r but %%(metadata2)s comes from software %r" % (metadata1.software, metadata2.software)) messages.append( ("info", "software mismatch", msg) ) return messages class Metadata(object): """Store information about a set of fingerprints The metadata attributes are: num_bits: number of bits in the fingerprint num_bytes: number of bytes in the fingerprint type: fingerprint type aromaticity: aromaticity model (only used with OEChem) software: software used to make the fingerprints sources: list of sources used to make the fingerprint date: timestamp of when the fingerprints were made """ def __init__(self, num_bits=None, num_bytes=None, type=None, aromaticity=None, software=None, sources=None, date=None): if num_bytes is None: if num_bits is None: pass else: num_bytes = (num_bits + 7)//8 elif num_bits is None: num_bits = num_bytes * 8 else: if (num_bits + 7)//8 != num_bytes: raise ValueError("num_bits of %d is incompatible with num_bytes of %d" % (num_bits, num_bytes)) self.num_bits = num_bits self.num_bytes = num_bytes self.type = type self.aromaticity = aromaticity self.software = software if sources is None: self.sources = [] elif isinstance(sources, basestring): self.sources = [sources] #raise TypeError("sources must be a list, not a string") else: self.sources = sources self.date = date def __repr__(self): return "Metadata(num_bits=%(num_bits)r, num_bytes=%(num_bytes)r, type=%(type)r, aromaticity=%(aromaticity)r, sources=%(sources)r, software=%(software)r, date=%(date)r)" % self.__dict__ def __str__(self): from cStringIO import 
StringIO
        from . import io
        f = StringIO()
        io.write_fps1_header(f, self)
        return f.getvalue()

class FingerprintReader(object):
    """Base class for all chemfp objects holding fingerprint records

    All FingerprintReader instances have a 'metadata' attribute
    containing a Metadata and can be iterated over to get the (id,
    fingerprint) for each record.

    """
    def __init__(self, metadata):
        """Initialize with a Metadata instance"""
        self.metadata = metadata

    def __iter__(self):
        """iterate over the (id, fingerprint) pairs"""
        raise NotImplementedError

    def reset(self):
        """restart any internal iterators

        NOTE: method is likely to be removed in the future

        This is only relevant for fingerprint containers which have
        only one iterator. An example is the FPSReader, which uses
        stream-based file I/O to read fingerprint data. Calling
        reset() resets the iterator to its initial state.

        Iterators must allow reset() if data has not yet been
        read. Otherwise, if a reset is not possible then reset() will
        raise a TypeError.

        :returns: None
        :raises: TypeError
        """

    def iter_arenas(self, arena_size=1000):
        """iterate through 'arena_size' fingerprints at a time

        This iterates through the fingerprints 'arena_size' at a time,
        yielding a FingerprintArena for each group. Working with
        arenas is often faster than processing one fingerprint at a
        time, and more memory efficient than processing all
        fingerprints at once.

        If arena_size=None then this makes an iterator containing
        a single arena containing all of the input.

        :param arena_size: The number of fingerprints to put into an arena.
        :type arena_size: positive integer, or None
        """
        if arena_size is None:
            yield load_fingerprints(self, self.metadata, reorder=False)
            return

        if arena_size < 1:
            raise ValueError("arena_size must be a positive integer")

        it = iter(self)
        while 1:
            # Pull at most arena_size records from the shared iterator
            slice = itertools.islice(it, 0, arena_size)
            arena = load_fingerprints(slice, self.metadata, reorder=False)
            if not arena:
                break
            yield arena

    def save(self, destination):
        from .
import io
        io.write_fps1_output(self, destination, self.metadata)

class FingerprintIterator(FingerprintReader):
    """A FingerprintReader for an iterator of (id, fingerprint) pairs

    This is often used as an adapter so that something which reads the
    id and fingerprint data can be used as a query source.
    """
    def __init__(self, metadata, id_fp_iterator):
        """initialize with a Metadata instance and the (id, fingerprint) iterator"""
        super(FingerprintIterator, self).__init__(metadata)
        self._id_fp_iterator = id_fp_iterator
        self._at_start = True

    def __iter__(self):
        """iterate over the (id, fingerprint) pairs"""
        for x in self._id_fp_iterator:
            self._at_start = False
            yield x

    def reset(self):
        """raise TypeError except if the iterator has not been used"""
        if not self._at_start:
            raise TypeError("It is not possible to reset a FingerprintIterator once it is in use")


class Fingerprints(FingerprintReader):
    """A FingerprintReader containing a list of (id, fingerprint) pairs

    This is often used as an adapter so that something which contains
    the id and fingerprint data can be used as a query source.
    """
    def __init__(self, metadata, id_fp_pairs):
        """initialize with a Metadata instance and the (id, fingerprint) pair list"""
        super(Fingerprints, self).__init__(metadata)
        self._id_fp_pairs = id_fp_pairs

    def __len__(self):
        """return the number of available (id, fingerprint) pairs"""
        return len(self._id_fp_pairs)

    def __iter__(self):
        """iterate over the (id, fingerprint) pairs"""
        return iter(self._id_fp_pairs)

    def __repr__(self):
        return "FingerprintList(%r, %r)" % (self.metadata, self._id_fp_pairs)

    def __getitem__(self, i):
        """return the given (id, fingerprint) pair"""
        return self._id_fp_pairs[i]

    # Question: should I support other parts of the list API?
    # I almost certainly want to support slice syntax like x[:5]


def get_num_threads():
    """Return the number of OpenMP threads to use in searches

    Initially this is the value returned by omp_get_max_threads(),
    which is typically the number of available processor cores unless
    you set the environment variable OMP_NUM_THREADS to some other
    value.

    It may be any value in the range 1 to get_max_threads(), inclusive.
    """
    # I don't want the top-level chemfp module import to import a submodule.
    import _chemfp

    return _chemfp.get_num_threads()

def set_num_threads(num_threads):
    """Set the number of OpenMP threads to use in searches

    If `num_threads` is less than one then it is treated as one, and a
    value greater than get_max_threads() is treated as get_max_threads().
    """
    # I don't want the top-level chemfp module import to import a submodule.
    import _chemfp

    return _chemfp.set_num_threads(num_threads)

def get_max_threads():
    """Return the maximum number of threads available.

    If OpenMP is not available then this will return 1. Otherwise it
    returns the maximum number of threads available, as reported by
    omp_get_num_threads().

    """
    # I don't want the top-level chemfp module import to import a submodule.
    import _chemfp

    return _chemfp.get_max_threads()

# chemfp-1.1p1/chemfp/arena.py
"""Algorithms and data structure for working with a FingerprintArena.

NOTE: This module should not be used directly.

A FingerprintArena stores the fingerprints as a contiguous byte
string, called the `arena`. Each fingerprint takes `storage_size`
bytes, which may be larger than `num_bytes` if the fingerprints have a
specific memory alignment. The bytes for fingerprint i are

    arena[i*storage_size:i*storage_size+num_bytes]

Additional bytes must contain NUL bytes.

The lookup for `ids[i]` contains the id for fingerprint `i`.

A FingerprintArena has an optional `indices` attribute.
When available, it means that the arena fingerprints and corresponding ids are ordered by population count, and the fingerprints with popcount `p` start at index indices[p] and end just before indices[p+1]. """ from __future__ import absolute_import import ctypes from cStringIO import StringIO import array from chemfp import FingerprintReader import _chemfp from chemfp import bitops, search __all__ = [] class FingerprintArena(FingerprintReader): """Stores fingerprints in a contiguous block of memory The public attributes are: metadata `Metadata` about the fingerprints ids list of identifiers, ordered by position """ def __init__(self, metadata, alignment, start_padding, end_padding, storage_size, arena, popcount_indices, arena_ids, start=0, end=None, id_lookup=None, ): if metadata.num_bits is None: raise TypeError("Missing metadata num_bits information") if metadata.num_bytes is None: raise TypeError("Missing metadata num_bytes information") self.metadata = metadata self.alignment = alignment self.num_bits = metadata.num_bits self.start_padding = start_padding self.end_padding = end_padding self.storage_size = storage_size self.arena = arena self.popcount_indices = popcount_indices self.arena_ids = arena_ids self.start = start # the starting index in the arena (not byte position!) if end is None: # the ending index in the arena (not byte position!) 
if self.metadata.num_bytes: end = (len(arena) - start_padding - end_padding) // self.storage_size else: end = 0 self.end = end if self.start == 0 and self.end == len(arena_ids): self._ids = arena_ids else: self._ids = None self._id_lookup = id_lookup assert end >= start self._range_check = xrange(end-start) def __len__(self): """Number of fingerprint records in the FingerprintArena""" return self.end - self.start @property def ids(self): ids = self._ids if ids is None: ids = self.arena_ids[self.start:self.end] self._ids = ids return ids def __getitem__(self, i): """Return the (id, fingerprint) at position i""" if isinstance(i, slice): start, end, step = i.indices(self.end - self.start) if step != 1: raise IndexError("arena slice step size must be 1") if start >= end: return FingerprintArena(self.metadata, self.alignment, 0, 0, self.storage_size, "", "", [], 0, 0) return FingerprintArena(self.metadata, self.alignment, self.start_padding, self.end_padding, self.storage_size, self.arena, self.popcount_indices, self.arena_ids, self.start+start, self.start+end) try: i = self._range_check[i] except IndexError: raise IndexError("arena fingerprint index out of range") arena_i = i + self.start start_offset = arena_i * self.storage_size + self.start_padding end_offset = start_offset + self.metadata.num_bytes return self.arena_ids[arena_i], self.arena[start_offset:end_offset] def _make_id_lookup(self): d = dict((id, i) for (i, id) in enumerate(self.ids)) self._id_lookup = d.get return self._id_lookup def get_by_id(self, id): """Given the record identifier, return the (id, fingerprint) tuple or None if not present""" id_lookup = self._id_lookup if id_lookup is None: id_lookup = self._make_id_lookup() i = id_lookup(id) if i is None: return None arena_i = i + self.start start_offset = arena_i * self.storage_size + self.start_padding end_offset = start_offset + self.metadata.num_bytes return self.arena_ids[arena_i], self.arena[start_offset:end_offset] def get_index_by_id(self, 
id): """Given the record identifier, return the record index or None if not present""" id_lookup = self._id_lookup if id_lookup is None: id_lookup = self._make_id_lookup() return id_lookup(id) def get_fingerprint_by_id(self, id): """Given the record identifier, return its fingerprint or None if not present""" id_lookup = self._id_lookup if id_lookup is None: id_lookup = self._make_id_lookup() i = id_lookup(id) if i is None: return None arena_i = i + self.start start_offset = arena_i * self.storage_size + self.start_padding end_offset = start_offset + self.metadata.num_bytes return self.arena[start_offset:end_offset] def save(self, destination): """Save the arena contents to the given filename or file object""" from . import io need_close = False if isinstance(destination, basestring): need_close = True output = io.open_output(destination) else: output = destination try: io.write_fps1_magic(output) io.write_fps1_header(output, self.metadata) try: for i, (id, fp) in enumerate(self): io.write_fps1_fingerprint(output, fp, id) except ValueError, err: raise ValueError("%s in record %i" % (err, i+1)) finally: if need_close: output.close() def reset(self): """This method is not documented""" pass def __iter__(self): """Iterate over the (id, fingerprint) contents of the arena""" storage_size = self.storage_size if not storage_size: return target_fp_size = self.metadata.num_bytes arena = self.arena for id, start_offset in zip(self.arena_ids[self.start:self.end], xrange(self.start*storage_size+self.start_padding, self.end*storage_size+self.start_padding, storage_size)): yield id, arena[start_offset:start_offset+target_fp_size] def iter_arenas(self, arena_size = 1000): """iterate through `arena_size` fingerprints at a time This iterates through the fingerprints `arena_size` at a time, yielding a FingerprintArena for each group. Working with arenas is often faster than processing one fingerprint at a time, and more memory efficient than processing all fingerprints at once. 
If arena_size=None then this makes an iterator containing a single arena containing all of the input. :param arena_size: The number of fingerprints to put into an arena. :type arena_size: positive integer, or None """ if arena_size is None: yield self return storage_size = self.storage_size start = self.start for i in xrange(0, len(self), arena_size): end = start+arena_size if end > self.end: end = self.end yield FingerprintArena(self.metadata, self.alignment, self.start_padding, self.end_padding, storage_size, self.arena, self.popcount_indices, self.arena_ids, start, end) start = end def count_tanimoto_hits_fp(self, query_fp, threshold=0.7): """Count the fingerprints which are similar enough to the query fingerprint DEPRECATED: Use `chemfp.search.count_tanimoto_hits_fp`_ instead. Return the number of fingerprints in this arena which are at least `threshold` similar to the query fingerprint `query_fp`. :param query_fp: query fingerprint :type query_fp: byte string :param threshold: minimum similarity threshold (default: 0.7) :type threshold: float between 0.0 and 1.0, inclusive :returns: integer count """ return search.count_tanimoto_hits_fp(query_fp, self, threshold) def count_tanimoto_hits_arena(self, queries, threshold=0.7): """Count the fingerprints which are similar enough to each query fingerprint DEPRECATED: Use `chemfp.search.count_tanimoto_hits_arena`_ or `chemfp.search.count_tanimoto_hits_symmetric`_ instead. Returns an iterator containing the (query_id, count) for each fingerprint in `queries`, where `query_id` is the query fingerprint id and `count` is the number of fingerprints found which are at least `threshold` similar to the query. The order of results is the same as the order of the queries. For efficiency reasons, `arena_size` queries are processed at a time. 
:param queries: query fingerprints :type query_fp: FingerprintArena or FPSReader (must implement iter_arenas()) :param threshold: minimum similarity threshold (default: 0.7) :type threshold: float between 0.0 and 1.0, inclusive :param arena_size: number of queries to process at a time (default: 100) :type arena_size: positive integer :returns: list of (query_id, integer count) pairs, one for each query """ return search.count_tanimoto_hits_arena(queries, self, threshold) def threshold_tanimoto_search_fp(self, query_fp, threshold=0.7): """Find the fingerprints which are similar enough to the query fingerprint DEPRECATED: Use `chemfp.search.threshold_tanimoto_search_fp`_ instead. Find all of the fingerprints in this arena which are at least `threshold` similar to the query fingerprint `query_fp`. The hits are returned as a list containing (id, score) tuples in arbitrary order. :param query_fp: query fingerprint :type query_fp: byte string :param threshold: minimum similarity threshold (default: 0.7) :type threshold: float between 0.0 and 1.0, inclusive :returns: list of (int, score) tuples """ return search.threshold_tanimoto_search_fp(query_fp, self, threshold) def threshold_tanimoto_search_arena(self, queries, threshold=0.7, arena_size=100): """Find the fingerprints which are similar to each of the query fingerprints DEPRECATED: Use `chemfp.search.threshold_tanimoto_search_arena`_ or `chemfp.search.threshold_tanimoto_search_symmetric`_ instead. For each fingerprint in the `query_arena`, find all of the fingerprints in this arena which are at least `threshold` similar. The hits are returned as a `SearchResults` instance. 
        :param query_arena: query arena
        :type query_arena: FingerprintArena
        :param threshold: minimum similarity threshold (default: 0.7)
        :type threshold: float between 0.0 and 1.0, inclusive
        :returns: SearchResults
        """
        return search.threshold_tanimoto_search_arena(queries, self, threshold)

    def knearest_tanimoto_search_fp(self, query_fp, k=3, threshold=0.7):
        """Find the k-nearest fingerprints which are similar to the query fingerprint

        DEPRECATED: Use `chemfp.search.knearest_tanimoto_search_fp`_ instead.

        Find the `k` fingerprints in this arena which are most similar
        to the query fingerprint `query_fp` and which are at least `threshold`
        similar to the query. The hits are returned as a list of
        (id, score) tuples sorted with the highest similarity first.
        Ties are broken arbitrarily.

        :param query_fp: query fingerprint
        :type query_fp: byte string
        :param k: number of nearest neighbors to find (default: 3)
        :type k: positive integer
        :param threshold: minimum similarity threshold (default: 0.7)
        :type threshold: float between 0.0 and 1.0, inclusive
        :returns: SearchResults
        """
        return search.knearest_tanimoto_search_fp(query_fp, self, k, threshold)

    def knearest_tanimoto_search_arena(self, queries, k=3, threshold=0.7):
        """Find the k-nearest fingerprints which are similar to each of the query fingerprints

        DEPRECATED: Use `chemfp.search.knearest_tanimoto_search_arena`_ or
        `chemfp.search.knearest_tanimoto_search_symmetric`_ instead.

        For each fingerprint in the `query_arena`, find the `k`
        fingerprints in this arena which are most similar and which
        are at least `threshold` similar to the query fingerprint.
        The hits are returned as a SearchResult where the hits are
        sorted with the highest similarity first. Ties are broken
        arbitrarily.
:param query_arena: query arena :type query_arena: FingerprintArena :param k: number of nearest neighbors to find (default: 3) :type k: positive integer :param threshold: minimum similarity threshold (default: 0.7) :type threshold: float between 0.0 and 1.0, inclusive :returns: SearchResult """ return search.knearest_tanimoto_search_arena(queries, self, k, threshold) def copy(self, indices=None, reorder=None): """Create a new arena using either all or some of the fingerprints in this arena By default this create a new arena. The fingerprint data block and ids may be shared with the original arena, which makes this a shallow copy. If the original arena is a slice, or "sub-arena" of an arena, then the copy will allocate new space to store just the fingerprints in the slice and use its own list for the ids. The `indices` parameter, if not None, is an iterable which contains the indicies of the fingerprint records to copy. Duplicates are allowed, though discouraged. If indices are specified then the default `reorder=None` or a `reorder=True` will reorder the fingerprints for the new arena by popcount. This improves overall search performance. With `reorder=False`, the fingerprints will be in order given by the indices. If indices are not given, then the default is to preserve the order type of the original arena. Otherwise `reorder=True` will always reorder and `reorder=False` will leave them in the current order. :param indices: indicies of the records to copy into the new arena :type indices: iterable containing integers, or None :param reorder: describes how to order the fingerprints :type reorder: True to reorder, False to leave in input order, None for default action """ if reorder is None: if indices is None: # This is a pure copy. Reorder only if there are popcount indices. reorder = (self.popcount_indices != "") else: # The default is to go fast. 
If you want to preserve index order # then you'll need to set reorder=False reorder = True if indices is None: # Make a completely new arena # Handle the trivial case where I don't need to do anything. if (self.start == 0 and (self.end*self.storage_size + self.start_padding + self.end_padding == len(self.arena)) and (not reorder or self.popcount_indices)): return FingerprintArena(self.metadata, self.alignment, self.start_padding, self.end_padding, self.storage_size, self.arena, self.popcount_indices, self.arena_ids, start = 0, end = self.end, id_lookup = self._id_lookup) # Otherwise I need to do some work # Make a copy of the actual fingerprints. (Which could be a subarena.) start = self.start_padding + self.start*self.storage_size end = self.start_padding + self.end*self.storage_size arena = self.arena[start:end] # If we don't have popcount_indices and don't want them ordered # then just do the alignment and we're done. if not reorder and not self.popcount_indices: # Don't reorder the unordered fingerprints start_padding, end_padding, unsorted_arena = ( _chemfp.make_unsorted_aligned_arena(arena, self.alignment)) return FingerprintArena(self.metadata, self.alignment, start_padding, end_padding, self.storage_size, unsorted_arena, "", self.ids, id_lookup = self._id_lookup) # Either we're already sorted or we should become sorted. # If we're sorted then make_sorted_aligned_arena will detect # that and keep the old arena. Otherwise it sorts first and # makes a new arena block. 
current_ids = self.ids ordering = (ChemFPOrderedPopcount*len(current_ids))() popcounts = array.array("i", (0,)*(self.metadata.num_bits+2)) start_padding, end_padding, arena = _chemfp.make_sorted_aligned_arena( self.metadata.num_bits, self.storage_size, arena, len(current_ids), ordering, popcounts, self.alignment) reordered_ids = [current_ids[item.index] for item in ordering] return FingerprintArena(self.metadata, self.alignment, start_padding, end_padding, self.storage_size, arena, popcounts.tostring(), reordered_ids) # On this pathway, we want to make a new arena which contains # selected fingerprints given indices into the old arena. arena = self.arena storage_size = self.storage_size start = self.start start_padding = self.start_padding arena_ids = self.arena_ids # First make sure that all of the indices are in range. # This will also convert negative indices into positive ones. new_indices = [] range_check = self._range_check try: for i in indices: new_indices.append(range_check[i]) except IndexError: raise IndexError("arena fingerprint index %d is out of range" % (i,)) if reorder and self.popcount_indices: # There's a slight performance benefit because # make_sorted_aligned_arena will see that the fingerprints # are already in sorted order and not resort. # XXX Is that true? Why do a Python sort instead of a C sort? # Perhaps because then I don't need to copy fingerprints? 
new_indices.sort() # Copy the fingerprints over to a new arena block unsorted_fps = [] new_ids = [] for new_i in new_indices: start_offset = start_padding + new_i*storage_size end_offset = start_offset + storage_size unsorted_fps.append(arena[start_offset:end_offset]) new_ids.append(arena_ids[new_i]) unsorted_arena = "".join(unsorted_fps) unsorted_fps = None # regain some memory # If the caller doesn't want ordered data, then leave it unsorted if not reorder: start_padding, end_padding, unsorted_arena = _chemfp.make_unsorted_aligned_arena( unsorted_arena, self.alignment) return FingerprintArena(self.metadata, self.alignment, start_padding, end_padding, storage_size, unsorted_arena, "", new_ids) # Otherwise, reorder and align the area, along with popcount information ordering = (ChemFPOrderedPopcount*len(new_ids))() popcounts = array.array("i", (0,)*(self.metadata.num_bits+2)) start_padding, end_padding, sorted_arena = _chemfp.make_sorted_aligned_arena( self.metadata.num_bits, storage_size, unsorted_arena, len(new_ids), ordering, popcounts, self.alignment) reordered_ids = [new_ids[item.index] for item in ordering] return FingerprintArena(self.metadata, self.alignment, start_padding, end_padding, storage_size, sorted_arena, popcounts.tostring(), reordered_ids) # TODO: push more of this malloc-management down into C class ChemFPOrderedPopcount(ctypes.Structure): _fields_ = [("popcount", ctypes.c_int), ("index", ctypes.c_int)] _methods = bitops.get_methods() _has_popcnt = "POPCNT" in _methods _has_ssse3 = "ssse3" in _methods def get_optimal_alignment(num_bits): if num_bits <= 32: # Just in case! if num_bits <= 8: return 1 return 4 # Since the ssse3 method must examine at least 512 bits while the # Gillies method doesn't, this puts the time tradeoff around 210 bits. # I decided to save a bit of space and round that up to 224 bits. # (Experience will tell us if 256 is a better boundary.) 
if num_bits <= 224: return 8 # If you have POPCNT (and you're using it) then there's no reason # to use a larger alignment if _has_popcnt: if num_bits >= 768: if bitops.get_alignment_method("align8-large") == "POPCNT": return 8 else: if bitops.get_alignment_method("align8-small") == "POPCNT": return 8 # If you don't have SSSE3 or you aren't using it, then use 8 if not _has_ssse3 or bitops.get_alignment_method("align-ssse3") != "ssse3": return 8 # In my timing tests: # Lauradoux takes 12.6s # ssse3 takes in 9.0s # Gillies takes 22s # Otherwise, go ahead and pad up to 64 bytes # (Even at 768 bits/96 bytes, the SSSE3 method is faster.) return 64 def fps_to_arena(fps_reader, metadata=None, reorder=True, alignment=None): if metadata is None: metadata = fps_reader.metadata num_bits = metadata.num_bits if not num_bits: if metadata.num_bytes is None: raise ValueError("metadata must contain at least one of num_bits or num_bytes") num_bits = metadata.num_bytes * 8 #assert num_bits if alignment is None: alignment = get_optimal_alignment(num_bits) num_bytes = metadata.num_bytes storage_size = num_bytes if storage_size % alignment != 0: n = alignment - storage_size % alignment end_padding = "\0" * n storage_size += n else: end_padding = None ids = [] unsorted_fps = StringIO() for (id, fp) in fps_reader: if len(fp) != num_bytes: raise ValueError("Fingerprint for id %r has %d bytes while the metadata says it should have %d" % (id, len(fp), num_bytes)) unsorted_fps.write(fp) if end_padding: unsorted_fps.write(end_padding) ids.append(id) unsorted_arena = unsorted_fps.getvalue() unsorted_fps.close() unsorted_fps = None if not reorder or not metadata.num_bits: start_padding, end_padding, unsorted_arena = _chemfp.make_unsorted_aligned_arena( unsorted_arena, alignment) return FingerprintArena(metadata, alignment, start_padding, end_padding, storage_size, unsorted_arena, "", ids) # Reorder ordering = (ChemFPOrderedPopcount*len(ids))() popcounts = array.array("i", 
(0,)*(metadata.num_bits+2))
    # The sorted result gets its own name; `unsorted_arena` is only the input.
    start_padding, end_padding, sorted_arena = _chemfp.make_sorted_aligned_arena(
        num_bits, storage_size, unsorted_arena, len(ids),
        ordering, popcounts, alignment)

    new_ids = [ids[item.index] for item in ordering]
    return FingerprintArena(metadata, alignment,
                            start_padding, end_padding, storage_size,
                            sorted_arena, popcounts.tostring(), new_ids)

# chemfp-1.1p1/chemfp/argparse.py
# -*- coding: utf-8 -*-

# Copyright © 2006-2009 Steven J. Bethard.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy
# of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

"""Command-line parsing library

This module is an optparse-inspired command-line parsing library that:

    - handles both optional and positional arguments
    - produces highly informative usage messages
    - supports parsers that dispatch to sub-parsers

The following is a simple usage example that sums integers from the
command-line and writes the result to a file::

    parser = argparse.ArgumentParser(
        description='sum the integers at the command line')
    parser.add_argument(
        'integers', metavar='int', nargs='+', type=int,
        help='an integer to be summed')
    parser.add_argument(
        '--log', default=sys.stdout, type=argparse.FileType('w'),
        help='the file where the sum should be written')
    args = parser.parse_args()
    args.log.write('%s' % sum(args.integers))
    args.log.close()

The module contains the following public classes:

    - ArgumentParser -- The main entry point for command-line parsing.
As the example above shows, the add_argument() method is used to populate the parser with actions for optional and positional arguments. Then the parse_args() method is invoked to convert the args at the command-line into an object with attributes. - ArgumentError -- The exception raised by ArgumentParser objects when there are errors with the parser's actions. Errors raised while parsing the command-line are caught by ArgumentParser and emitted as command-line messages. - FileType -- A factory for defining types of files to be created. As the example above shows, instances of FileType are typically passed as the type= argument of add_argument() calls. - Action -- The base class for parser actions. Typically actions are selected by passing strings like 'store_true' or 'append_const' to the action= argument of add_argument(). However, for greater customization of ArgumentParser actions, subclasses of Action may be defined and passed as the action= argument. - HelpFormatter, RawDescriptionHelpFormatter, RawTextHelpFormatter, ArgumentDefaultsHelpFormatter -- Formatter classes which may be passed as the formatter_class= argument to the ArgumentParser constructor. HelpFormatter is the default, RawDescriptionHelpFormatter and RawTextHelpFormatter tell the parser not to change the formatting for help text, and ArgumentDefaultsHelpFormatter adds information about argument defaults to the help. All other classes in this module are considered implementation details. (Also note that HelpFormatter and RawDescriptionHelpFormatter are only considered public as object names -- the API of the formatter objects is still considered an implementation detail.) 
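As a rough illustration of the formatter classes described above, the
sketch below passes ArgumentDefaultsHelpFormatter as formatter_class=
and renders the help text. The program name and the two options are
hypothetical, chosen only for this example; they are not part of this
module or of chemfp.

```python
import argparse

# Hypothetical program and option names, used only for illustration.
parser = argparse.ArgumentParser(
    prog='demo',
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument(
    '-k', type=int, default=3,
    help='neighbors to report')
parser.add_argument(
    '--threshold', type=float, default=0.7,
    help='minimum score')

# parse_args() accepts an explicit argument list in place of sys.argv[1:].
args = parser.parse_args(['-k', '5'])

# format_help() returns the help message as a string instead of printing it.
help_text = parser.format_help()
```

With this formatter each help string that was supplied gains a
"(default: ...)" suffix, so the defaults do not need to be repeated by
hand in every help= argument.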
""" __version__ = '1.0.1' __all__ = [ 'ArgumentParser', 'ArgumentError', 'Namespace', 'Action', 'FileType', 'HelpFormatter', 'RawDescriptionHelpFormatter', 'RawTextHelpFormatter' 'ArgumentDefaultsHelpFormatter', ] import copy as _copy import os as _os import re as _re import sys as _sys import textwrap as _textwrap from gettext import gettext as _ try: _set = set except NameError: from sets import Set as _set try: _basestring = basestring except NameError: _basestring = str try: _sorted = sorted except NameError: def _sorted(iterable, reverse=False): result = list(iterable) result.sort() if reverse: result.reverse() return result # silence Python 2.6 buggy warnings about Exception.message if _sys.version_info[:2] == (2, 6): import warnings warnings.filterwarnings( action='ignore', message='BaseException.message has been deprecated as of Python 2.6', category=DeprecationWarning, module='argparse') SUPPRESS = '==SUPPRESS==' OPTIONAL = '?' ZERO_OR_MORE = '*' ONE_OR_MORE = '+' PARSER = '==PARSER==' # ============================= # Utility functions and classes # ============================= class _AttributeHolder(object): """Abstract base class that provides __repr__. The __repr__ method returns a string in the format:: ClassName(attr=name, attr=name, ...) The attributes are determined either by a class-level attribute, '_kwarg_names', or by inspecting the instance __dict__. 
""" def __repr__(self): type_name = type(self).__name__ arg_strings = [] for arg in self._get_args(): arg_strings.append(repr(arg)) for name, value in self._get_kwargs(): arg_strings.append('%s=%r' % (name, value)) return '%s(%s)' % (type_name, ', '.join(arg_strings)) def _get_kwargs(self): return _sorted(self.__dict__.items()) def _get_args(self): return [] def _ensure_value(namespace, name, value): if getattr(namespace, name, None) is None: setattr(namespace, name, value) return getattr(namespace, name) # =============== # Formatting Help # =============== class HelpFormatter(object): """Formatter for generating usage messages and argument help strings. Only the name of this class is considered a public API. All the methods provided by the class are considered an implementation detail. """ def __init__(self, prog, indent_increment=2, max_help_position=24, width=None): # default setting for width if width is None: try: width = int(_os.environ['COLUMNS']) except (KeyError, ValueError): width = 80 width -= 2 self._prog = prog self._indent_increment = indent_increment self._max_help_position = max_help_position self._width = width self._current_indent = 0 self._level = 0 self._action_max_length = 0 self._root_section = self._Section(self, None) self._current_section = self._root_section self._whitespace_matcher = _re.compile(r'\s+') self._long_break_matcher = _re.compile(r'\n\n\n+') # =============================== # Section and indentation methods # =============================== def _indent(self): self._current_indent += self._indent_increment self._level += 1 def _dedent(self): self._current_indent -= self._indent_increment assert self._current_indent >= 0, 'Indent decreased below 0.' 
self._level -= 1 class _Section(object): def __init__(self, formatter, parent, heading=None): self.formatter = formatter self.parent = parent self.heading = heading self.items = [] def format_help(self): # format the indented section if self.parent is not None: self.formatter._indent() join = self.formatter._join_parts for func, args in self.items: func(*args) item_help = join([func(*args) for func, args in self.items]) if self.parent is not None: self.formatter._dedent() # return nothing if the section was empty if not item_help: return '' # add the heading if the section was non-empty if self.heading is not SUPPRESS and self.heading is not None: current_indent = self.formatter._current_indent heading = '%*s%s:\n' % (current_indent, '', self.heading) else: heading = '' # join the section-initial newline, the heading and the help return join(['\n', heading, item_help, '\n']) def _add_item(self, func, args): self._current_section.items.append((func, args)) # ======================== # Message building methods # ======================== def start_section(self, heading): self._indent() section = self._Section(self, self._current_section, heading) self._add_item(section.format_help, []) self._current_section = section def end_section(self): self._current_section = self._current_section.parent self._dedent() def add_text(self, text): if text is not SUPPRESS and text is not None: self._add_item(self._format_text, [text]) def add_usage(self, usage, actions, groups, prefix=None): if usage is not SUPPRESS: args = usage, actions, groups, prefix self._add_item(self._format_usage, args) def add_argument(self, action): if action.help is not SUPPRESS: # find all invocations get_invocation = self._format_action_invocation invocations = [get_invocation(action)] for subaction in self._iter_indented_subactions(action): invocations.append(get_invocation(subaction)) # update the maximum item length invocation_length = max([len(s) for s in invocations]) action_length = 
invocation_length + self._current_indent self._action_max_length = max(self._action_max_length, action_length) # add the item to the list self._add_item(self._format_action, [action]) def add_arguments(self, actions): for action in actions: self.add_argument(action) # ======================= # Help-formatting methods # ======================= def format_help(self): help = self._root_section.format_help() if help: help = self._long_break_matcher.sub('\n\n', help) help = help.strip('\n') + '\n' return help def _join_parts(self, part_strings): return ''.join([part for part in part_strings if part and part is not SUPPRESS]) def _format_usage(self, usage, actions, groups, prefix): if prefix is None: prefix = _('usage: ') # if usage is specified, use that if usage is not None: usage = usage % dict(prog=self._prog) # if no optionals or positionals are available, usage is just prog elif usage is None and not actions: usage = '%(prog)s' % dict(prog=self._prog) # if optionals and positionals are available, calculate usage elif usage is None: prog = '%(prog)s' % dict(prog=self._prog) # split optionals from positionals optionals = [] positionals = [] for action in actions: if action.option_strings: optionals.append(action) else: positionals.append(action) # build full usage string format = self._format_actions_usage action_usage = format(optionals + positionals, groups) usage = ' '.join([s for s in [prog, action_usage] if s]) # wrap the usage parts if it's too long text_width = self._width - self._current_indent if len(prefix) + len(usage) > text_width: # break usage into wrappable parts part_regexp = r'\(.*?\)+|\[.*?\]+|\S+' opt_usage = format(optionals, groups) pos_usage = format(positionals, groups) opt_parts = _re.findall(part_regexp, opt_usage) pos_parts = _re.findall(part_regexp, pos_usage) assert ' '.join(opt_parts) == opt_usage assert ' '.join(pos_parts) == pos_usage # helper for wrapping lines def get_lines(parts, indent, prefix=None): lines = [] line = [] if prefix 
is not None: line_len = len(prefix) - 1 else: line_len = len(indent) - 1 for part in parts: if line_len + 1 + len(part) > text_width: lines.append(indent + ' '.join(line)) line = [] line_len = len(indent) - 1 line.append(part) line_len += len(part) + 1 if line: lines.append(indent + ' '.join(line)) if prefix is not None: lines[0] = lines[0][len(indent):] return lines # if prog is short, follow it with optionals or positionals if len(prefix) + len(prog) <= 0.75 * text_width: indent = ' ' * (len(prefix) + len(prog) + 1) if opt_parts: lines = get_lines([prog] + opt_parts, indent, prefix) lines.extend(get_lines(pos_parts, indent)) elif pos_parts: lines = get_lines([prog] + pos_parts, indent, prefix) else: lines = [prog] # if prog is long, put it on its own line else: indent = ' ' * len(prefix) parts = opt_parts + pos_parts lines = get_lines(parts, indent) if len(lines) > 1: lines = [] lines.extend(get_lines(opt_parts, indent)) lines.extend(get_lines(pos_parts, indent)) lines = [prog] + lines # join lines into usage usage = '\n'.join(lines) # prefix with 'usage:' return '%s%s\n\n' % (prefix, usage) def _format_actions_usage(self, actions, groups): # find group indices and identify actions in groups group_actions = _set() inserts = {} for group in groups: try: start = actions.index(group._group_actions[0]) except ValueError: continue else: end = start + len(group._group_actions) if actions[start:end] == group._group_actions: for action in group._group_actions: group_actions.add(action) if not group.required: inserts[start] = '[' inserts[end] = ']' else: inserts[start] = '(' inserts[end] = ')' for i in range(start + 1, end): inserts[i] = '|' # collect all actions format strings parts = [] for i, action in enumerate(actions): # suppressed arguments are marked with None # remove | separators for suppressed arguments if action.help is SUPPRESS: parts.append(None) if inserts.get(i) == '|': inserts.pop(i) elif inserts.get(i + 1) == '|': inserts.pop(i + 1) # produce all arg 
strings elif not action.option_strings: part = self._format_args(action, action.dest) # if it's in a group, strip the outer [] if action in group_actions: if part[0] == '[' and part[-1] == ']': part = part[1:-1] # add the action string to the list parts.append(part) # produce the first way to invoke the option in brackets else: option_string = action.option_strings[0] # if the Optional doesn't take a value, format is: # -s or --long if action.nargs == 0: part = '%s' % option_string # if the Optional takes a value, format is: # -s ARGS or --long ARGS else: default = action.dest.upper() args_string = self._format_args(action, default) part = '%s %s' % (option_string, args_string) # make it look optional if it's not required or in a group if not action.required and action not in group_actions: part = '[%s]' % part # add the action string to the list parts.append(part) # insert things at the necessary indices for i in _sorted(inserts, reverse=True): parts[i:i] = [inserts[i]] # join all the action items with spaces text = ' '.join([item for item in parts if item is not None]) # clean up separators for mutually exclusive groups open = r'[\[(]' close = r'[\])]' text = _re.sub(r'(%s) ' % open, r'\1', text) text = _re.sub(r' (%s)' % close, r'\1', text) text = _re.sub(r'%s *%s' % (open, close), r'', text) text = _re.sub(r'\(([^|]*)\)', r'\1', text) text = text.strip() # return the text return text def _format_text(self, text): text_width = self._width - self._current_indent indent = ' ' * self._current_indent return self._fill_text(text, text_width, indent) + '\n\n' def _format_action(self, action): # determine the required width and the entry label help_position = min(self._action_max_length + 2, self._max_help_position) help_width = self._width - help_position action_width = help_position - self._current_indent - 2 action_header = self._format_action_invocation(action) # no help; start on same line and add a final newline if not action.help: tup = self._current_indent, '',
action_header action_header = '%*s%s\n' % tup # short action name; start on the same line and pad two spaces elif len(action_header) <= action_width: tup = self._current_indent, '', action_width, action_header action_header = '%*s%-*s ' % tup indent_first = 0 # long action name; start on the next line else: tup = self._current_indent, '', action_header action_header = '%*s%s\n' % tup indent_first = help_position # collect the pieces of the action help parts = [action_header] # if there was help for the action, add lines of help text if action.help: help_text = self._expand_help(action) help_lines = self._split_lines(help_text, help_width) parts.append('%*s%s\n' % (indent_first, '', help_lines[0])) for line in help_lines[1:]: parts.append('%*s%s\n' % (help_position, '', line)) # or add a newline if the description doesn't end with one elif not action_header.endswith('\n'): parts.append('\n') # if there are any sub-actions, add their help as well for subaction in self._iter_indented_subactions(action): parts.append(self._format_action(subaction)) # return a single string return self._join_parts(parts) def _format_action_invocation(self, action): if not action.option_strings: metavar, = self._metavar_formatter(action, action.dest)(1) return metavar else: parts = [] # if the Optional doesn't take a value, format is: # -s, --long if action.nargs == 0: parts.extend(action.option_strings) # if the Optional takes a value, format is: # -s ARGS, --long ARGS else: default = action.dest.upper() args_string = self._format_args(action, default) for option_string in action.option_strings: parts.append('%s %s' % (option_string, args_string)) return ', '.join(parts) def _metavar_formatter(self, action, default_metavar): if action.metavar is not None: result = action.metavar elif action.choices is not None: choice_strs = [str(choice) for choice in action.choices] result = '{%s}' % ','.join(choice_strs) else: result = default_metavar def format(tuple_size): if isinstance(result, 
tuple): return result else: return (result, ) * tuple_size return format def _format_args(self, action, default_metavar): get_metavar = self._metavar_formatter(action, default_metavar) if action.nargs is None: result = '%s' % get_metavar(1) elif action.nargs == OPTIONAL: result = '[%s]' % get_metavar(1) elif action.nargs == ZERO_OR_MORE: result = '[%s [%s ...]]' % get_metavar(2) elif action.nargs == ONE_OR_MORE: result = '%s [%s ...]' % get_metavar(2) elif action.nargs is PARSER: result = '%s ...' % get_metavar(1) else: formats = ['%s' for _ in range(action.nargs)] result = ' '.join(formats) % get_metavar(action.nargs) return result def _expand_help(self, action): params = dict(vars(action), prog=self._prog) for name in list(params): if params[name] is SUPPRESS: del params[name] if params.get('choices') is not None: choices_str = ', '.join([str(c) for c in params['choices']]) params['choices'] = choices_str return self._get_help_string(action) % params def _iter_indented_subactions(self, action): try: get_subactions = action._get_subactions except AttributeError: pass else: self._indent() for subaction in get_subactions(): yield subaction self._dedent() def _split_lines(self, text, width): text = self._whitespace_matcher.sub(' ', text).strip() return _textwrap.wrap(text, width) def _fill_text(self, text, width, indent): text = self._whitespace_matcher.sub(' ', text).strip() return _textwrap.fill(text, width, initial_indent=indent, subsequent_indent=indent) def _get_help_string(self, action): return action.help class RawDescriptionHelpFormatter(HelpFormatter): """Help message formatter which retains any formatting in descriptions. Only the name of this class is considered a public API. All the methods provided by the class are considered an implementation detail. 
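A minimal usage sketch, assuming the standard-library argparse module behaves like this bundled copy. The prog name 'demo' and the description text are made up for illustration.

```python
import argparse

# Hypothetical description whose line break and indent should survive.
description = '''first line
  second line'''

raw = argparse.ArgumentParser(
    prog='demo', description=description,
    formatter_class=argparse.RawDescriptionHelpFormatter)
plain = argparse.ArgumentParser(prog='demo', description=description)

raw_help = raw.format_help()      # keeps the newline and the two-space indent
plain_help = plain.format_help()  # collapses whitespace and re-wraps the text
```

Only the description (and epilog) text is left untouched here; per-argument help strings are still re-wrapped.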
""" def _fill_text(self, text, width, indent): return ''.join([indent + line for line in text.splitlines(True)]) class RawTextHelpFormatter(RawDescriptionHelpFormatter): """Help message formatter which retains formatting of all help text. Only the name of this class is considered a public API. All the methods provided by the class are considered an implementation detail. """ def _split_lines(self, text, width): return text.splitlines() class ArgumentDefaultsHelpFormatter(HelpFormatter): """Help message formatter which adds default values to argument help. Only the name of this class is considered a public API. All the methods provided by the class are considered an implementation detail. """ def _get_help_string(self, action): help = action.help if '%(default)' not in action.help: if action.default is not SUPPRESS: defaulting_nargs = [OPTIONAL, ZERO_OR_MORE] if action.option_strings or action.nargs in defaulting_nargs: help += ' (default: %(default)s)' return help # ===================== # Options and Arguments # ===================== def _get_action_name(argument): if argument is None: return None elif argument.option_strings: return '/'.join(argument.option_strings) elif argument.metavar not in (None, SUPPRESS): return argument.metavar elif argument.dest not in (None, SUPPRESS): return argument.dest else: return None class ArgumentError(Exception): """An error from creating or using an argument (optional or positional). The string value of this exception is the message, augmented with information about the argument that caused it. 
""" def __init__(self, argument, message): self.argument_name = _get_action_name(argument) self.message = message def __str__(self): if self.argument_name is None: format = '%(message)s' else: format = 'argument %(argument_name)s: %(message)s' return format % dict(message=self.message, argument_name=self.argument_name) # ============== # Action classes # ============== class Action(_AttributeHolder): """Information about how to convert command line strings to Python objects. Action objects are used by an ArgumentParser to represent the information needed to parse a single argument from one or more strings from the command line. The keyword arguments to the Action constructor are also all attributes of Action instances. Keyword Arguments: - option_strings -- A list of command-line option strings which should be associated with this action. - dest -- The name of the attribute to hold the created object(s) - nargs -- The number of command-line arguments that should be consumed. By default, one argument will be consumed and a single value will be produced. Other values include: - N (an integer) consumes N arguments (and produces a list) - '?' consumes zero or one arguments - '*' consumes zero or more arguments (and produces a list) - '+' consumes one or more arguments (and produces a list) Note that the difference between the default and nargs=1 is that with the default, a single value will be produced, while with nargs=1, a list containing a single value will be produced. - const -- The value to be produced if the option is specified and the option uses an action that takes no values. - default -- The value to be produced if the option is not specified. - type -- The type which the command-line arguments should be converted to, should be one of 'string', 'int', 'float', 'complex' or a callable object that accepts a single string argument. If None, 'string' is assumed. - choices -- A container of values that should be allowed. 
If not None, after a command-line argument has been converted to the appropriate type, an exception will be raised if it is not a member of this collection. - required -- True if the action must always be specified at the command line. This is only meaningful for optional command-line arguments. - help -- The help string describing the argument. - metavar -- The name to be used for the option's argument with the help string. If None, the 'dest' value will be used as the name. """ def __init__(self, option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None): self.option_strings = option_strings self.dest = dest self.nargs = nargs self.const = const self.default = default self.type = type self.choices = choices self.required = required self.help = help self.metavar = metavar def _get_kwargs(self): names = [ 'option_strings', 'dest', 'nargs', 'const', 'default', 'type', 'choices', 'help', 'metavar', ] return [(name, getattr(self, name)) for name in names] def __call__(self, parser, namespace, values, option_string=None): raise NotImplementedError(_('.__call__() not defined')) class _StoreAction(Action): def __init__(self, option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None): if nargs == 0: raise ValueError('nargs for store actions must be > 0; if you ' 'have nothing to store, actions such as store ' 'true or store const may be more appropriate') if const is not None and nargs != OPTIONAL: raise ValueError('nargs must be %r to supply const' % OPTIONAL) super(_StoreAction, self).__init__( option_strings=option_strings, dest=dest, nargs=nargs, const=const, default=default, type=type, choices=choices, required=required, help=help, metavar=metavar) def __call__(self, parser, namespace, values, option_string=None): setattr(namespace, self.dest, values) class _StoreConstAction(Action): def __init__(self, option_strings, dest, const, 
default=None, required=False, help=None, metavar=None): super(_StoreConstAction, self).__init__( option_strings=option_strings, dest=dest, nargs=0, const=const, default=default, required=required, help=help) def __call__(self, parser, namespace, values, option_string=None): setattr(namespace, self.dest, self.const) class _StoreTrueAction(_StoreConstAction): def __init__(self, option_strings, dest, default=False, required=False, help=None): super(_StoreTrueAction, self).__init__( option_strings=option_strings, dest=dest, const=True, default=default, required=required, help=help) class _StoreFalseAction(_StoreConstAction): def __init__(self, option_strings, dest, default=True, required=False, help=None): super(_StoreFalseAction, self).__init__( option_strings=option_strings, dest=dest, const=False, default=default, required=required, help=help) class _AppendAction(Action): def __init__(self, option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None): if nargs == 0: raise ValueError('nargs for append actions must be > 0; if arg ' 'strings are not supplying the value to append, ' 'the append const action may be more appropriate') if const is not None and nargs != OPTIONAL: raise ValueError('nargs must be %r to supply const' % OPTIONAL) super(_AppendAction, self).__init__( option_strings=option_strings, dest=dest, nargs=nargs, const=const, default=default, type=type, choices=choices, required=required, help=help, metavar=metavar) def __call__(self, parser, namespace, values, option_string=None): items = _copy.copy(_ensure_value(namespace, self.dest, [])) items.append(values) setattr(namespace, self.dest, items) class _AppendConstAction(Action): def __init__(self, option_strings, dest, const, default=None, required=False, help=None, metavar=None): super(_AppendConstAction, self).__init__( option_strings=option_strings, dest=dest, nargs=0, const=const, default=default, required=required, help=help, 
metavar=metavar) def __call__(self, parser, namespace, values, option_string=None): items = _copy.copy(_ensure_value(namespace, self.dest, [])) items.append(self.const) setattr(namespace, self.dest, items) class _CountAction(Action): def __init__(self, option_strings, dest, default=None, required=False, help=None): super(_CountAction, self).__init__( option_strings=option_strings, dest=dest, nargs=0, default=default, required=required, help=help) def __call__(self, parser, namespace, values, option_string=None): new_count = _ensure_value(namespace, self.dest, 0) + 1 setattr(namespace, self.dest, new_count) class _HelpAction(Action): def __init__(self, option_strings, dest=SUPPRESS, default=SUPPRESS, help=None): super(_HelpAction, self).__init__( option_strings=option_strings, dest=dest, default=default, nargs=0, help=help) def __call__(self, parser, namespace, values, option_string=None): parser.print_help() parser.exit() class _VersionAction(Action): def __init__(self, option_strings, dest=SUPPRESS, default=SUPPRESS, help=None): super(_VersionAction, self).__init__( option_strings=option_strings, dest=dest, default=default, nargs=0, help=help) def __call__(self, parser, namespace, values, option_string=None): parser.print_version() parser.exit() class _SubParsersAction(Action): class _ChoicesPseudoAction(Action): def __init__(self, name, help): sup = super(_SubParsersAction._ChoicesPseudoAction, self) sup.__init__(option_strings=[], dest=name, help=help) def __init__(self, option_strings, prog, parser_class, dest=SUPPRESS, help=None, metavar=None): self._prog_prefix = prog self._parser_class = parser_class self._name_parser_map = {} self._choices_actions = [] super(_SubParsersAction, self).__init__( option_strings=option_strings, dest=dest, nargs=PARSER, choices=self._name_parser_map, help=help, metavar=metavar) def add_parser(self, name, **kwargs): # set prog from the existing prefix if kwargs.get('prog') is None: kwargs['prog'] = '%s %s' % (self._prog_prefix, 
name) # create a pseudo-action to hold the choice help if 'help' in kwargs: help = kwargs.pop('help') choice_action = self._ChoicesPseudoAction(name, help) self._choices_actions.append(choice_action) # create the parser and add it to the map parser = self._parser_class(**kwargs) self._name_parser_map[name] = parser return parser def _get_subactions(self): return self._choices_actions def __call__(self, parser, namespace, values, option_string=None): parser_name = values[0] arg_strings = values[1:] # set the parser name if requested if self.dest is not SUPPRESS: setattr(namespace, self.dest, parser_name) # select the parser try: parser = self._name_parser_map[parser_name] except KeyError: tup = parser_name, ', '.join(self._name_parser_map) msg = _('unknown parser %r (choices: %s)') % tup raise ArgumentError(self, msg) # parse all the remaining options into the namespace parser.parse_args(arg_strings, namespace) # ============== # Type classes # ============== class FileType(object): """Factory for creating file object types. Instances of FileType are typically passed as type= arguments to the ArgumentParser add_argument() method. Keyword Arguments: - mode -- A string indicating how the file is to be opened. Accepts the same values as the builtin open() function. - bufsize -- The file's desired buffer size. Accepts the same values as the builtin open() function.
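A small sketch of the special "-" handling, assuming the standard-library argparse.FileType behaves like this bundled class for text modes. Only the mode argument is used here.

```python
import argparse
import sys

reader = argparse.FileType('r')
writer = argparse.FileType('w')

# The special name "-" maps to the standard streams instead of opening a file.
in_stream = reader('-')
out_stream = writer('-')
```

Any other string is passed to open() with the stored mode, so the caller is responsible for closing the returned file.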
""" def __init__(self, mode='r', bufsize=None): self._mode = mode self._bufsize = bufsize def __call__(self, string): # the special argument "-" means sys.std{in,out} if string == '-': if 'r' in self._mode: return _sys.stdin elif 'w' in self._mode: return _sys.stdout else: msg = _('argument "-" with mode %r' % self._mode) raise ValueError(msg) # all other arguments are used as file names if self._bufsize: return open(string, self._mode, self._bufsize) else: return open(string, self._mode) def __repr__(self): args = [self._mode, self._bufsize] args_str = ', '.join([repr(arg) for arg in args if arg is not None]) return '%s(%s)' % (type(self).__name__, args_str) # =========================== # Optional and Positional Parsing # =========================== class Namespace(_AttributeHolder): """Simple object for storing attributes. Implements equality by attribute names and values, and provides a simple string representation. """ def __init__(self, **kwargs): for name in kwargs: setattr(self, name, kwargs[name]) def __eq__(self, other): return vars(self) == vars(other) def __ne__(self, other): return not (self == other) class _ActionsContainer(object): def __init__(self, description, prefix_chars, argument_default, conflict_handler): super(_ActionsContainer, self).__init__() self.description = description self.argument_default = argument_default self.prefix_chars = prefix_chars self.conflict_handler = conflict_handler # set up registries self._registries = {} # register actions self.register('action', None, _StoreAction) self.register('action', 'store', _StoreAction) self.register('action', 'store_const', _StoreConstAction) self.register('action', 'store_true', _StoreTrueAction) self.register('action', 'store_false', _StoreFalseAction) self.register('action', 'append', _AppendAction) self.register('action', 'append_const', _AppendConstAction) self.register('action', 'count', _CountAction) self.register('action', 'help', _HelpAction) self.register('action', 'version', 
_VersionAction) self.register('action', 'parsers', _SubParsersAction) # raise an exception if the conflict handler is invalid self._get_handler() # action storage self._actions = [] self._option_string_actions = {} # groups self._action_groups = [] self._mutually_exclusive_groups = [] # defaults storage self._defaults = {} # determines whether an "option" looks like a negative number self._negative_number_matcher = _re.compile(r'^-\d+$|^-\d*\.\d+$') # whether or not there are any optionals that look like negative # numbers -- uses a list so it can be shared and edited self._has_negative_number_optionals = [] # ==================== # Registration methods # ==================== def register(self, registry_name, value, object): registry = self._registries.setdefault(registry_name, {}) registry[value] = object def _registry_get(self, registry_name, value, default=None): return self._registries[registry_name].get(value, default) # ================================== # Namespace default settings methods # ================================== def set_defaults(self, **kwargs): self._defaults.update(kwargs) # if these defaults match any existing arguments, replace # the previous default on the object with the new one for action in self._actions: if action.dest in kwargs: action.default = kwargs[action.dest] # ======================= # Adding argument actions # ======================= def add_argument(self, *args, **kwargs): """ add_argument(dest, ..., name=value, ...) add_argument(option_string, option_string, ..., name=value, ...)
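The two call forms can be sketched as follows, assuming the standard-library argparse module matches this copy. The prog name 'demo', the argument names and the file name are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')  # made-up program name

# First form: a bare dest creates a positional argument.
parser.add_argument('path')

# Second form: option strings create an optional argument; the dest is
# inferred from the first long option, so '--max-hits' becomes 'max_hits'.
parser.add_argument('-k', '--max-hits', type=int, default=3)

args = parser.parse_args(['targets.fps', '-k', '10'])
```

Passing dest= explicitly to the second form overrides the inferred attribute name.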
""" # if no positional args are supplied or only one is supplied and # it doesn't look like an option string, parse a positional # argument chars = self.prefix_chars if not args or len(args) == 1 and args[0][0] not in chars: kwargs = self._get_positional_kwargs(*args, **kwargs) # otherwise, we're adding an optional argument else: kwargs = self._get_optional_kwargs(*args, **kwargs) # if no default was supplied, use the parser-level default if 'default' not in kwargs: dest = kwargs['dest'] if dest in self._defaults: kwargs['default'] = self._defaults[dest] elif self.argument_default is not None: kwargs['default'] = self.argument_default # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) action = action_class(**kwargs) return self._add_action(action) def add_argument_group(self, *args, **kwargs): group = _ArgumentGroup(self, *args, **kwargs) self._action_groups.append(group) return group def add_mutually_exclusive_group(self, **kwargs): group = _MutuallyExclusiveGroup(self, **kwargs) self._mutually_exclusive_groups.append(group) return group def _add_action(self, action): # resolve any conflicts self._check_conflict(action) # add to actions list self._actions.append(action) action.container = self # index the action by any option strings it has for option_string in action.option_strings: self._option_string_actions[option_string] = action # set the flag if any option strings look like negative numbers for option_string in action.option_strings: if self._negative_number_matcher.match(option_string): if not self._has_negative_number_optionals: self._has_negative_number_optionals.append(True) # return the created action return action def _remove_action(self, action): self._actions.remove(action) def _add_container_actions(self, container): # collect groups by titles title_group_map = {} for group in self._action_groups: if group.title in title_group_map: msg = _('cannot merge actions - two groups are named %r') raise 
ValueError(msg % (group.title)) title_group_map[group.title] = group # map each action to its group group_map = {} for group in container._action_groups: # if a group with the title exists, use that, otherwise # create a new group matching the container's group if group.title not in title_group_map: title_group_map[group.title] = self.add_argument_group( title=group.title, description=group.description, conflict_handler=group.conflict_handler) # map the actions to their new group for action in group._group_actions: group_map[action] = title_group_map[group.title] # add container's mutually exclusive groups # NOTE: if add_mutually_exclusive_group ever gains title= and # description= then this code will need to be expanded as above for group in container._mutually_exclusive_groups: mutex_group = self.add_mutually_exclusive_group( required=group.required) # map the actions to their new mutex group for action in group._group_actions: group_map[action] = mutex_group # add all actions to this container or their group for action in container._actions: group_map.get(action, self)._add_action(action) def _get_positional_kwargs(self, dest, **kwargs): # make sure required is not specified if 'required' in kwargs: msg = _("'required' is an invalid argument for positionals") raise TypeError(msg) # mark positional arguments as required if at least one is # always required if kwargs.get('nargs') not in [OPTIONAL, ZERO_OR_MORE]: kwargs['required'] = True if kwargs.get('nargs') == ZERO_OR_MORE and 'default' not in kwargs: kwargs['required'] = True # return the keyword arguments with no option strings return dict(kwargs, dest=dest, option_strings=[]) def _get_optional_kwargs(self, *args, **kwargs): # determine short and long option strings option_strings = [] long_option_strings = [] for option_string in args: # error on one-or-fewer-character option strings if len(option_string) < 2: msg = _('invalid option string %r: ' 'must be at least two characters long') raise ValueError(msg % 
option_string) # error on strings that don't start with an appropriate prefix if not option_string[0] in self.prefix_chars: msg = _('invalid option string %r: ' 'must start with a character %r') tup = option_string, self.prefix_chars raise ValueError(msg % tup) # error on strings that are all prefix characters if not (_set(option_string) - _set(self.prefix_chars)): msg = _('invalid option string %r: ' 'must contain characters other than %r') tup = option_string, self.prefix_chars raise ValueError(msg % tup) # strings starting with two prefix characters are long options option_strings.append(option_string) if option_string[0] in self.prefix_chars: if option_string[1] in self.prefix_chars: long_option_strings.append(option_string) # infer destination, '--foo-bar' -> 'foo_bar' and '-x' -> 'x' dest = kwargs.pop('dest', None) if dest is None: if long_option_strings: dest_option_string = long_option_strings[0] else: dest_option_string = option_strings[0] dest = dest_option_string.lstrip(self.prefix_chars) dest = dest.replace('-', '_') # return the updated keyword arguments return dict(kwargs, dest=dest, option_strings=option_strings) def _pop_action_class(self, kwargs, default=None): action = kwargs.pop('action', default) return self._registry_get('action', action, action) def _get_handler(self): # determine function from conflict handler string handler_func_name = '_handle_conflict_%s' % self.conflict_handler try: return getattr(self, handler_func_name) except AttributeError: msg = _('invalid conflict_resolution value: %r') raise ValueError(msg % self.conflict_handler) def _check_conflict(self, action): # find all options that conflict with this option confl_optionals = [] for option_string in action.option_strings: if option_string in self._option_string_actions: confl_optional = self._option_string_actions[option_string] confl_optionals.append((option_string, confl_optional)) # resolve any conflicts if confl_optionals: conflict_handler = self._get_handler() 
conflict_handler(action, confl_optionals) def _handle_conflict_error(self, action, conflicting_actions): message = _('conflicting option string(s): %s') conflict_string = ', '.join([option_string for option_string, action in conflicting_actions]) raise ArgumentError(action, message % conflict_string) def _handle_conflict_resolve(self, action, conflicting_actions): # remove all conflicting options for option_string, action in conflicting_actions: # remove the conflicting option action.option_strings.remove(option_string) self._option_string_actions.pop(option_string, None) # if the option now has no option string, remove it from the # container holding it if not action.option_strings: action.container._remove_action(action) class _ArgumentGroup(_ActionsContainer): def __init__(self, container, title=None, description=None, **kwargs): # add any missing keyword arguments by checking the container update = kwargs.setdefault update('conflict_handler', container.conflict_handler) update('prefix_chars', container.prefix_chars) update('argument_default', container.argument_default) super_init = super(_ArgumentGroup, self).__init__ super_init(description=description, **kwargs) # group attributes self.title = title self._group_actions = [] # share most attributes with the container self._registries = container._registries self._actions = container._actions self._option_string_actions = container._option_string_actions self._defaults = container._defaults self._has_negative_number_optionals = \ container._has_negative_number_optionals def _add_action(self, action): action = super(_ArgumentGroup, self)._add_action(action) self._group_actions.append(action) return action def _remove_action(self, action): super(_ArgumentGroup, self)._remove_action(action) self._group_actions.remove(action) class _MutuallyExclusiveGroup(_ArgumentGroup): def __init__(self, container, required=False): super(_MutuallyExclusiveGroup, self).__init__(container) self.required = required self._container 
= container def _add_action(self, action): if action.required: msg = _('mutually exclusive arguments must be optional') raise ValueError(msg) action = self._container._add_action(action) self._group_actions.append(action) return action def _remove_action(self, action): self._container._remove_action(action) self._group_actions.remove(action) class ArgumentParser(_AttributeHolder, _ActionsContainer): """Object for parsing command line strings into Python objects. Keyword Arguments: - prog -- The name of the program (default: sys.argv[0]) - usage -- A usage message (default: auto-generated from arguments) - description -- A description of what the program does - epilog -- Text following the argument descriptions - version -- Add a -v/--version option with the given version string - parents -- Parsers whose arguments should be copied into this one - formatter_class -- HelpFormatter class for printing help messages - prefix_chars -- Characters that prefix optional arguments - fromfile_prefix_chars -- Characters that prefix files containing additional arguments - argument_default -- The default value for all arguments - conflict_handler -- String indicating how to handle conflicts - add_help -- Add a -h/--help option """ def __init__(self, prog=None, usage=None, description=None, epilog=None, version=None, parents=[], formatter_class=HelpFormatter, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True): superinit = super(ArgumentParser, self).__init__ superinit(description=description, prefix_chars=prefix_chars, argument_default=argument_default, conflict_handler=conflict_handler) # default setting for prog if prog is None: prog = _os.path.basename(_sys.argv[0]) self.prog = prog self.usage = usage self.epilog = epilog self.version = version self.formatter_class = formatter_class self.fromfile_prefix_chars = fromfile_prefix_chars self.add_help = add_help add_group = self.add_argument_group self._positionals = 
add_group(_('positional arguments')) self._optionals = add_group(_('optional arguments')) self._subparsers = None # register types def identity(string): return string self.register('type', None, identity) # add help and version arguments if necessary # (using explicit default to override global argument_default) if self.add_help: self.add_argument( '-h', '--help', action='help', default=SUPPRESS, help=_('show this help message and exit')) if self.version: self.add_argument( '-v', '--version', action='version', default=SUPPRESS, help=_("show program's version number and exit")) # add parent arguments and defaults for parent in parents: self._add_container_actions(parent) try: defaults = parent._defaults except AttributeError: pass else: self._defaults.update(defaults) # ======================= # Pretty __repr__ methods # ======================= def _get_kwargs(self): names = [ 'prog', 'usage', 'description', 'version', 'formatter_class', 'conflict_handler', 'add_help', ] return [(name, getattr(self, name)) for name in names] # ================================== # Optional/Positional adding methods # ================================== def add_subparsers(self, **kwargs): if self._subparsers is not None: self.error(_('cannot have multiple subparser arguments')) # add the parser class to the arguments if it's not present kwargs.setdefault('parser_class', type(self)) if 'title' in kwargs or 'description' in kwargs: title = _(kwargs.pop('title', 'subcommands')) description = _(kwargs.pop('description', None)) self._subparsers = self.add_argument_group(title, description) else: self._subparsers = self._positionals # prog defaults to the usage message of this parser, skipping # optional arguments and with no "usage:" prefix if kwargs.get('prog') is None: formatter = self._get_formatter() positionals = self._get_positional_actions() groups = self._mutually_exclusive_groups formatter.add_usage(self.usage, positionals, groups, '') kwargs['prog'] = 
formatter.format_help().strip() # create the parsers action and add it to the positionals list parsers_class = self._pop_action_class(kwargs, 'parsers') action = parsers_class(option_strings=[], **kwargs) self._subparsers._add_action(action) # return the created parsers action return action def _add_action(self, action): if action.option_strings: self._optionals._add_action(action) else: self._positionals._add_action(action) return action def _get_optional_actions(self): return [action for action in self._actions if action.option_strings] def _get_positional_actions(self): return [action for action in self._actions if not action.option_strings] # ===================================== # Command line argument parsing methods # ===================================== def parse_args(self, args=None, namespace=None): args, argv = self.parse_known_args(args, namespace) if argv: msg = _('unrecognized arguments: %s') self.error(msg % ' '.join(argv)) return args def parse_known_args(self, args=None, namespace=None): # args default to the system args if args is None: args = _sys.argv[1:] # default Namespace built from parser defaults if namespace is None: namespace = Namespace() # add any action defaults that aren't present for action in self._actions: if action.dest is not SUPPRESS: if not hasattr(namespace, action.dest): if action.default is not SUPPRESS: default = action.default if isinstance(action.default, _basestring): default = self._get_value(action, default) setattr(namespace, action.dest, default) # add any parser defaults that aren't present for dest in self._defaults: if not hasattr(namespace, dest): setattr(namespace, dest, self._defaults[dest]) # parse the arguments and exit if there are any errors try: return self._parse_known_args(args, namespace) except ArgumentError: err = _sys.exc_info()[1] self.error(str(err)) def _parse_known_args(self, arg_strings, namespace): # replace arg strings that are file references if self.fromfile_prefix_chars is not None: 
arg_strings = self._read_args_from_files(arg_strings) # map all mutually exclusive arguments to the other arguments # they can't occur with action_conflicts = {} for mutex_group in self._mutually_exclusive_groups: group_actions = mutex_group._group_actions for i, mutex_action in enumerate(mutex_group._group_actions): conflicts = action_conflicts.setdefault(mutex_action, []) conflicts.extend(group_actions[:i]) conflicts.extend(group_actions[i + 1:]) # find all option indices, and determine the arg_string_pattern # which has an 'O' if there is an option at an index, # an 'A' if there is an argument, or a '-' if there is a '--' option_string_indices = {} arg_string_pattern_parts = [] arg_strings_iter = iter(arg_strings) for i, arg_string in enumerate(arg_strings_iter): # all args after -- are non-options if arg_string == '--': arg_string_pattern_parts.append('-') for arg_string in arg_strings_iter: arg_string_pattern_parts.append('A') # otherwise, add the arg to the arg strings # and note the index if it was an option else: option_tuple = self._parse_optional(arg_string) if option_tuple is None: pattern = 'A' else: option_string_indices[i] = option_tuple pattern = 'O' arg_string_pattern_parts.append(pattern) # join the pieces together to form the pattern arg_strings_pattern = ''.join(arg_string_pattern_parts) # converts arg strings to the appropriate and then takes the action seen_actions = _set() seen_non_default_actions = _set() def take_action(action, argument_strings, option_string=None): seen_actions.add(action) argument_values = self._get_values(action, argument_strings) # error if this argument is not allowed with other previously # seen arguments, assuming that actions that use the default # value don't really count as "present" if argument_values is not action.default: seen_non_default_actions.add(action) for conflict_action in action_conflicts.get(action, []): if conflict_action in seen_non_default_actions: msg = _('not allowed with argument %s') action_name 
= _get_action_name(conflict_action) raise ArgumentError(action, msg % action_name) # take the action if we didn't receive a SUPPRESS value # (e.g. from a default) if argument_values is not SUPPRESS: action(self, namespace, argument_values, option_string) # function to convert arg_strings into an optional action def consume_optional(start_index): # get the optional identified at this index option_tuple = option_string_indices[start_index] action, option_string, explicit_arg = option_tuple # identify additional optionals in the same arg string # (e.g. -xyz is the same as -x -y -z if no args are required) match_argument = self._match_argument action_tuples = [] while True: # if we found no optional action, skip it if action is None: extras.append(arg_strings[start_index]) return start_index + 1 # if there is an explicit argument, try to match the # optional's string arguments to only this if explicit_arg is not None: arg_count = match_argument(action, 'A') # if the action is a single-dash option and takes no # arguments, try to parse more single-dash options out # of the tail of the option string chars = self.prefix_chars if arg_count == 0 and option_string[1] not in chars: action_tuples.append((action, [], option_string)) for char in self.prefix_chars: option_string = char + explicit_arg[0] explicit_arg = explicit_arg[1:] or None optionals_map = self._option_string_actions if option_string in optionals_map: action = optionals_map[option_string] break else: msg = _('ignored explicit argument %r') raise ArgumentError(action, msg % explicit_arg) # if the action expect exactly one argument, we've # successfully matched the option; exit the loop elif arg_count == 1: stop = start_index + 1 args = [explicit_arg] action_tuples.append((action, args, option_string)) break # error if a double-dash option did not use the # explicit argument else: msg = _('ignored explicit argument %r') raise ArgumentError(action, msg % explicit_arg) # if there is no explicit argument, try to 
match the # optional's string arguments with the following strings # if successful, exit the loop else: start = start_index + 1 selected_patterns = arg_strings_pattern[start:] arg_count = match_argument(action, selected_patterns) stop = start + arg_count args = arg_strings[start:stop] action_tuples.append((action, args, option_string)) break # add the Optional to the list and return the index at which # the Optional's string args stopped assert action_tuples for action, args, option_string in action_tuples: take_action(action, args, option_string) return stop # the list of Positionals left to be parsed; this is modified # by consume_positionals() positionals = self._get_positional_actions() # function to convert arg_strings into positional actions def consume_positionals(start_index): # match as many Positionals as possible match_partial = self._match_arguments_partial selected_pattern = arg_strings_pattern[start_index:] arg_counts = match_partial(positionals, selected_pattern) # slice off the appropriate arg strings for each Positional # and add the Positional and its args to the list for action, arg_count in zip(positionals, arg_counts): args = arg_strings[start_index: start_index + arg_count] start_index += arg_count take_action(action, args) # slice off the Positionals that we just parsed and return the # index at which the Positionals' string args stopped positionals[:] = positionals[len(arg_counts):] return start_index # consume Positionals and Optionals alternately, until we have # passed the last option string extras = [] start_index = 0 if option_string_indices: max_option_string_index = max(option_string_indices) else: max_option_string_index = -1 while start_index <= max_option_string_index: # consume any Positionals preceding the next option next_option_string_index = min([ index for index in option_string_indices if index >= start_index]) if start_index != next_option_string_index: positionals_end_index = consume_positionals(start_index) # only try to 
parse the next optional if we didn't consume # the option string during the positionals parsing if positionals_end_index > start_index: start_index = positionals_end_index continue else: start_index = positionals_end_index # if we consumed all the positionals we could and we're not # at the index of an option string, there were extra arguments if start_index not in option_string_indices: strings = arg_strings[start_index:next_option_string_index] extras.extend(strings) start_index = next_option_string_index # consume the next optional and any arguments for it start_index = consume_optional(start_index) # consume any positionals following the last Optional stop_index = consume_positionals(start_index) # if we didn't consume all the argument strings, there were extras extras.extend(arg_strings[stop_index:]) # if we didn't use all the Positional objects, there were too few # arg strings supplied. if positionals: self.error(_('too few arguments')) # make sure all required actions were present for action in self._actions: if action.required: if action not in seen_actions: name = _get_action_name(action) self.error(_('argument %s is required') % name) # make sure all required groups had one option present for group in self._mutually_exclusive_groups: if group.required: for action in group._group_actions: if action in seen_non_default_actions: break # if no actions were used, report the error else: names = [_get_action_name(action) for action in group._group_actions if action.help is not SUPPRESS] msg = _('one of the arguments %s is required') self.error(msg % ' '.join(names)) # return the updated namespace and the extra arguments return namespace, extras def _read_args_from_files(self, arg_strings): # expand arguments referencing files new_arg_strings = [] for arg_string in arg_strings: # for regular arguments, just add them back into the list if arg_string[0] not in self.fromfile_prefix_chars: new_arg_strings.append(arg_string) # replace arguments referencing files with 
the file content else: try: args_file = open(arg_string[1:]) try: arg_strings = args_file.read().splitlines() arg_strings = self._read_args_from_files(arg_strings) new_arg_strings.extend(arg_strings) finally: args_file.close() except IOError: err = _sys.exc_info()[1] self.error(str(err)) # return the modified argument list return new_arg_strings def _match_argument(self, action, arg_strings_pattern): # match the pattern for this action to the arg strings nargs_pattern = self._get_nargs_pattern(action) match = _re.match(nargs_pattern, arg_strings_pattern) # raise an exception if we weren't able to find a match if match is None: nargs_errors = { None: _('expected one argument'), OPTIONAL: _('expected at most one argument'), ONE_OR_MORE: _('expected at least one argument'), } default = _('expected %s argument(s)') % action.nargs msg = nargs_errors.get(action.nargs, default) raise ArgumentError(action, msg) # return the number of arguments matched return len(match.group(1)) def _match_arguments_partial(self, actions, arg_strings_pattern): # progressively shorten the actions list by slicing off the # final actions until we find a match result = [] for i in range(len(actions), 0, -1): actions_slice = actions[:i] pattern = ''.join([self._get_nargs_pattern(action) for action in actions_slice]) match = _re.match(pattern, arg_strings_pattern) if match is not None: result.extend([len(string) for string in match.groups()]) break # return the list of arg string counts return result def _parse_optional(self, arg_string): # if it's an empty string, it was meant to be a positional if not arg_string: return None # if it doesn't start with a prefix, it was meant to be positional if not arg_string[0] in self.prefix_chars: return None # if it's just dashes, it was meant to be positional if not arg_string.strip('-'): return None # if the option string is present in the parser, return the action if arg_string in self._option_string_actions: action = 
self._option_string_actions[arg_string] return action, arg_string, None # search through all possible prefixes of the option string # and all actions in the parser for possible interpretations option_tuples = self._get_option_tuples(arg_string) # if multiple actions match, the option string was ambiguous if len(option_tuples) > 1: options = ', '.join([option_string for action, option_string, explicit_arg in option_tuples]) tup = arg_string, options self.error(_('ambiguous option: %s could match %s') % tup) # if exactly one action matched, this segmentation is good, # so return the parsed action elif len(option_tuples) == 1: option_tuple, = option_tuples return option_tuple # if it was not found as an option, but it looks like a negative # number, it was meant to be positional # unless there are negative-number-like options if self._negative_number_matcher.match(arg_string): if not self._has_negative_number_optionals: return None # if it contains a space, it was meant to be a positional if ' ' in arg_string: return None # it was meant to be an optional but there is no such option # in this parser (though it might be a valid option in a subparser) return None, arg_string, None def _get_option_tuples(self, option_string): result = [] # option strings starting with two prefix characters are only # split at the '=' chars = self.prefix_chars if option_string[0] in chars and option_string[1] in chars: if '=' in option_string: option_prefix, explicit_arg = option_string.split('=', 1) else: option_prefix = option_string explicit_arg = None for option_string in self._option_string_actions: if option_string.startswith(option_prefix): action = self._option_string_actions[option_string] tup = action, option_string, explicit_arg result.append(tup) # single character options can be concatenated with their arguments # but multiple character options always have to have their argument # separate elif option_string[0] in chars and option_string[1] not in chars: option_prefix = 
option_string explicit_arg = None short_option_prefix = option_string[:2] short_explicit_arg = option_string[2:] for option_string in self._option_string_actions: if option_string == short_option_prefix: action = self._option_string_actions[option_string] tup = action, option_string, short_explicit_arg result.append(tup) elif option_string.startswith(option_prefix): action = self._option_string_actions[option_string] tup = action, option_string, explicit_arg result.append(tup) # shouldn't ever get here else: self.error(_('unexpected option string: %s') % option_string) # return the collected option tuples return result def _get_nargs_pattern(self, action): # in all examples below, we have to allow for '--' args # which are represented as '-' in the pattern nargs = action.nargs # the default (None) is assumed to be a single argument if nargs is None: nargs_pattern = '(-*A-*)' # allow zero or one arguments elif nargs == OPTIONAL: nargs_pattern = '(-*A?-*)' # allow zero or more arguments elif nargs == ZERO_OR_MORE: nargs_pattern = '(-*[A-]*)' # allow one or more arguments elif nargs == ONE_OR_MORE: nargs_pattern = '(-*A[A-]*)' # allow one argument followed by any number of options or arguments elif nargs is PARSER: nargs_pattern = '(-*A[-AO]*)' # all others should be integers else: nargs_pattern = '(-*%s-*)' % '-*'.join('A' * nargs) # if this is an optional action, -- is not allowed if action.option_strings: nargs_pattern = nargs_pattern.replace('-*', '') nargs_pattern = nargs_pattern.replace('-', '') # return the pattern return nargs_pattern # ======================== # Value conversion methods # ======================== def _get_values(self, action, arg_strings): # for everything but PARSER args, strip out '--' if action.nargs is not PARSER: arg_strings = [s for s in arg_strings if s != '--'] # optional argument produces a default when not present if not arg_strings and action.nargs == OPTIONAL: if action.option_strings: value = action.const else: value = 
action.default if isinstance(value, _basestring): value = self._get_value(action, value) self._check_value(action, value) # when nargs='*' on a positional, if there were no command-line # args, use the default if it is anything other than None elif (not arg_strings and action.nargs == ZERO_OR_MORE and not action.option_strings): if action.default is not None: value = action.default else: value = arg_strings self._check_value(action, value) # single argument or optional argument produces a single value elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]: arg_string, = arg_strings value = self._get_value(action, arg_string) self._check_value(action, value) # PARSER arguments convert all values, but check only the first elif action.nargs is PARSER: value = [self._get_value(action, v) for v in arg_strings] self._check_value(action, value[0]) # all other types of nargs produce a list else: value = [self._get_value(action, v) for v in arg_strings] for v in value: self._check_value(action, v) # return the converted value return value def _get_value(self, action, arg_string): type_func = self._registry_get('type', action.type, action.type) if not hasattr(type_func, '__call__'): if not hasattr(type_func, '__bases__'): # classic classes msg = _('%r is not callable') raise ArgumentError(action, msg % type_func) # convert the value to the appropriate type try: result = type_func(arg_string) # TypeErrors or ValueErrors indicate errors except (TypeError, ValueError): name = getattr(action.type, '__name__', repr(action.type)) msg = _('invalid %s value: %r') raise ArgumentError(action, msg % (name, arg_string)) # return the converted value return result def _check_value(self, action, value): # converted value must be one of the choices (if specified) if action.choices is not None and value not in action.choices: tup = value, ', '.join(map(repr, action.choices)) msg = _('invalid choice: %r (choose from %s)') % tup raise ArgumentError(action, msg) # 
======================= # Help-formatting methods # ======================= def format_usage(self): formatter = self._get_formatter() formatter.add_usage(self.usage, self._actions, self._mutually_exclusive_groups) return formatter.format_help() def format_help(self): formatter = self._get_formatter() # usage formatter.add_usage(self.usage, self._actions, self._mutually_exclusive_groups) # description formatter.add_text(self.description) # positionals, optionals and user-defined groups for action_group in self._action_groups: formatter.start_section(action_group.title) formatter.add_text(action_group.description) formatter.add_arguments(action_group._group_actions) formatter.end_section() # epilog formatter.add_text(self.epilog) # determine help from format above return formatter.format_help() def format_version(self): formatter = self._get_formatter() formatter.add_text(self.version) return formatter.format_help() def _get_formatter(self): return self.formatter_class(prog=self.prog) # ===================== # Help-printing methods # ===================== def print_usage(self, file=None): self._print_message(self.format_usage(), file) def print_help(self, file=None): self._print_message(self.format_help(), file) def print_version(self, file=None): self._print_message(self.format_version(), file) def _print_message(self, message, file=None): if message: if file is None: file = _sys.stderr file.write(message) # =============== # Exiting methods # =============== def exit(self, status=0, message=None): if message: _sys.stderr.write(message) _sys.exit(status) def error(self, message): """error(message: string) Prints a usage message incorporating the message to stderr and exits. If you override this in a subclass, it should not return -- it should either exit or raise an exception. 
""" self.print_usage(_sys.stderr) self.exit(2, _('%s: error: %s\n') % (self.prog, message)) chemfp-1.1p1/chemfp/bitops.py from __future__ import absolute_import import os import sys import _chemfp from _chemfp import (hex_isvalid, hex_popcount, hex_intersect_popcount, hex_tanimoto, hex_contains) from _chemfp import (byte_popcount, byte_intersect_popcount, byte_tanimoto, byte_contains, byte_intersect, byte_union, byte_difference) __all__ = ["byte_popcount", "byte_intersect_popcount", "byte_tanimoto", "byte_contains", "hex_isvalid", "hex_popcount", "hex_intersect_popcount", "hex_tanimoto", "hex_contains", "get_methods", "get_alignments", "get_alignment_methods", "set_alignment_method", "select_fastest_method"] def get_methods(): return [_chemfp.get_method_name(i) for i in range(_chemfp.get_num_methods())] def get_alignments(): return [_chemfp.get_alignment_name(i) for i in range(_chemfp.get_num_alignments())] def get_alignment_methods(): settings = {} for alignment in range(_chemfp.get_num_alignments()): method = _chemfp.get_alignment_method(alignment) settings[_chemfp.get_alignment_name(alignment)] = _chemfp.get_method_name(method) return settings def get_alignment_method(alignment): try: alignment_i = get_alignments().index(alignment) except ValueError: raise ValueError("Unknown alignment %r" % (alignment,)) return _chemfp.get_method_name(_chemfp.get_alignment_method(alignment_i)) def set_alignment_method(alignment, method): try: alignment_i = get_alignments().index(alignment) except ValueError: raise ValueError("Unknown alignment %r" % (alignment,)) try: method_i = get_methods().index(method) except ValueError: raise ValueError("Unknown method %r" % (method,)) _chemfp.set_alignment_method(alignment_i, method_i) def select_fastest_method(repeat=10000): if repeat > 100000: raise ValueError("repeat size is meaninglessly large") if repeat < 1: raise ValueError("repeat size must be 1 or 
larger (values under 1000 are likely useless)") for alignment_i, name in enumerate(get_alignments()): _chemfp.select_fastest_method(alignment_i, repeat) def get_options(): return [_chemfp.get_option_name(i) for i in range(_chemfp.get_num_options())] def get_option(option): return _chemfp.get_option(option) def set_option(option, value): _chemfp.set_option(option, value) def print_report(out=sys.stdout): from . import SOFTWARE print >>out, "== Configuration report for", SOFTWARE, "==" print >>out, "Available methods:", " ".join(get_methods()) print >>out, "Alignment methods:" for alignment in get_alignments(): method = get_alignment_method(alignment) print >>out, " %s: %s" % (alignment, method) print >>out, "Option settings:" for option in get_options(): print >>out, " %s: %s" % (option, _chemfp.get_option(option)) def use_environment_variables(environ=None): if environ is None: environ = os.environ known = set() for alignment in get_alignments(): name = "CHEMFP-" + alignment.upper() known.add(name) try: value = environ[name] except KeyError: pass else: try: set_alignment_method(alignment, value) except ValueError, err: print >>sys.stderr, "WARNING: Unable to use $%s = %r: %s" % ( (name, value, err)) for option in get_options(): name = "CHEMFP-" + option.upper() known.add(name) try: value = environ[name] except KeyError: pass else: try: set_option(option, int(value)) except ValueError, err: print >>sys.stderr, "WARNING: Unable to use $%s = %r: %s" % ( (name, value, err)) known.add("CHEMFP-REPORT") report = environ.get("CHEMFP-REPORT", "0") == "1" if report: set_option("report-popcount", 1) set_option("report-intersect", 1) known.add("CHEMFP-PRINT-CONFIG") if (environ.get("CHEMFP-PRINT-CONFIG", "0") == "1" or report): print_report(sys.stderr) for k in environ: if not k.startswith("CHEMFP-"): continue if k not in known: print >>sys.stderr, "WARNING: Unknown chemfp environment variable %r" % (k,) 
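The environment-variable handling in use_environment_variables() above can be sketched without the _chemfp extension. This is only an illustration under stated assumptions: apply_environment is a hypothetical helper, not part of chemfp, and the alignment and method lists passed to it are placeholders for what get_alignments() and get_methods() would report at runtime. It collects the settings and warnings instead of calling set_alignment_method() or printing to stderr.

```python
def apply_environment(environ, alignments, methods, prefix="CHEMFP-"):
    """Return (settings, warnings) for the prefixed variables in environ.

    Hypothetical stand-in for the alignment loop and the unknown-variable
    check in use_environment_variables(); it does not touch _chemfp.
    """
    settings = {}
    warnings = []
    known = set()

    # One variable per alignment: prefix + upper-cased alignment name.
    for alignment in alignments:
        name = prefix + alignment.upper()
        known.add(name)
        if name not in environ:
            continue
        value = environ[name]
        if value in methods:
            settings[alignment] = value
        else:
            warnings.append("Unable to use $%s = %r: Unknown method %r"
                            % (name, value, value))

    # Any other variable carrying the prefix is reported as unknown.
    for key in sorted(environ):
        if key.startswith(prefix) and key not in known:
            warnings.append("Unknown chemfp environment variable %r" % (key,))

    return settings, warnings


# Placeholder names; the real lists come from the compiled extension.
environ = {
    "CHEMFP-ALIGN8": "POPCNT",
    "CHEMFP-ALIGN4": "bogus",
    "CHEMFP-TYPO": "1",
    "HOME": "/tmp",
}
settings, warnings = apply_environment(
    environ, ["align4", "align8"], ["Lauradoux", "POPCNT"])
for message in warnings:
    print("WARNING: " + message)
```

Returning the warnings rather than printing them keeps the sketch easy to check; the original writes them straight to sys.stderr and applies each valid setting immediately.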
chemfp-1.1p1/chemfp/commandline/ chemfp-1.1p1/chemfp/commandline/__init__.py chemfp-1.1p1/chemfp/commandline/cmdsupport.py from __future__ import absolute_import import os import sys import itertools from .. import ChemFPError, Metadata def mutual_exclusion(parser, args, default, groups): true_groups = [] for g in groups: if getattr(args, g): true_groups.append(g) if not true_groups: setattr(args, default, True) elif len(true_groups) == 1: pass else: parser.error("Cannot specify both --%s and --%s" % (true_groups[0], true_groups[1])) def sys_exit_opener(opener, metadata, source, format, id_tag, errors): try: return opener.read_structure_fingerprints(source, format, id_tag, errors, metadata=metadata) except (IOError, ChemFPError, ValueError), err: sys.stderr.write("Problem reading structure fingerprints: %s. 
Exiting.\n" % err) raise SystemExit(1) def iter_all_sources(opener, metadata, filenames, format, id_tag, errors): for filename in filenames: reader = sys_exit_opener(opener, metadata, filename, format, id_tag, errors) for x in reader: yield x def read_multifile_structure_fingerprints(opener, filenames, format, id_tag, aromaticity, errors): metadata = Metadata(aromaticity=aromaticity) if not filenames: reader = sys_exit_opener(opener, metadata, None, format, id_tag, errors) return reader.metadata, reader reader = sys_exit_opener(opener, metadata, filenames[0], format, id_tag, errors) if len(filenames) == 1: return reader.metadata, reader reader.metadata.sources = filenames multi_reader = itertools.chain(reader, iter_all_sources(opener, metadata, filenames[1:], format, id_tag, errors)) return reader.metadata, multi_reader def is_valid_tag(tag): if tag is None: return True for c in "<>\r\n": if c in tag: return False return True def check_filenames(filenames): if not filenames: return None for filename in filenames: if not os.path.exists(filename): return filename return None chemfp-1.1p1/chemfp/commandline/fpsmerge.py from __future__ import absolute_import import sys from .. 
import argparse, io, readers, check_metadata_problems parser = argparse.ArgumentParser( description="Merge multiple FPS files into a single FPS file", ) parser.add_argument( "-o", "--output", metavar="FILENAME", help="save the fingerprints to FILENAME (default=stdout)") parser.add_argument("filenames", nargs="+", help="list of FPS files", default=[]) def main(args=None): args = parser.parse_args(args) if not args.filenames: return metadata = None sources = [] for filename in args.filenames: fps_reader = readers.open_fps(filename) if metadata is None: metadata = fps_reader.metadata else: problems = check_metadata_problems(metadata, fps_reader.metadata) if problems: for (severity, error, msg_template) in problems: msg = msg_template.format(metadata1 = repr(args.filenames[0]), metadata2 = repr(filename)) if severity == "warning": sys.stderr.write("WARNING: " + msg + "\n") elif severity == "error": sys.stderr.write("ERROR: " + msg + "\n") raise SystemExit(1) elif severity == "info": sys.stderr.write("INFO: " + msg + "\n") else: raise AssertionError(severity) sources.extend(fps_reader.metadata.sources) metadata.sources = sources metadata.date = io.utcnow() with io._closing_output(args.output) as outfile: with io.ignore_pipe_errors: io.write_fps1_magic(outfile) io.write_fps1_header(outfile, metadata) for filename in args.filenames: fps_reader = readers.open_fps(filename) for block in fps_reader.iter_blocks(): outfile.write(block) if __name__ == "__main__": main() chemfp-1.1p1/chemfp/commandline/ob2fps.py0000644000077000000240000001075211660452123020557 0ustar dalkestaff00000000000000import sys from chemfp import openbabel as ob from chemfp import argparse, io, types from .. import ParseError from . import cmdsupport ############ Command-line parser definition epilog = """\ OpenBabel autodetects the input structure format based on the filename extension. The default format for structures read from stdin is SMILES. 
Use "--in FORMAT" to select an alternative, where FORMAT is one of the extensions at http://openbabel.org/wiki/List_of_extensions . For a short list of some common formats: File Type Valid FORMATs --------- ------------- SMILES smi, can, smiles SDF sdf, mol, sd, mdl MOL2 mol2, ml2 PDB pdb, ent MacroModel mmod If OpenBabel is compiled with zlib support then it will automatically handle gzip'ed input data if the filename ends with ".gz". You may optionally include that suffix in the format name. """ parser = argparse.ArgumentParser( description="Generate FPS fingerprints from a structure file using OpenBabel", ) group = parser.add_mutually_exclusive_group() group.add_argument("--FP2", action="store_true", # help=ob._fingerprinter_table["FP2"].description + "(default)" ) group.add_argument("--FP3", action="store_true", # help=ob._fingerprinter_table["FP3"].description ) group.add_argument("--FP4", action="store_true", # help=ob._fingerprinter_table["FP4"].description ) if ob.HAS_MACCS: # Added in OpenBabel 2.3 group.add_argument("--MACCS", action="store_true", # help=ob._fingerprinter_table["MACCS"].description ) else: group.add_argument("--MACCS", action="store_true", help="(Not available using your version of OpenBabel)") group.add_argument( "--substruct", action="store_true", help="generate ChemFP substructure fingerprints") group.add_argument( "--rdmaccs", action="store_true", help="generate 166 bit RDKit/MACCS fingerprints") parser.add_argument( "--id-tag", metavar="NAME", help="tag name containing the record id (SD files only)") parser.add_argument( "--in", metavar="FORMAT", dest="format", help="input structure format (default autodetects from the filename extension)") parser.add_argument( "-o", "--output", metavar="FILENAME", help="save the fingerprints to FILENAME (default=stdout)") parser.add_argument( "--errors", choices=["strict", "report", "ignore"], default="strict", help="how should structure parse errors be handled? 
(default=strict)") parser.add_argument( "filenames", nargs="*", help="input structure files (default is stdin)") ######### def main(args=None): args = parser.parse_args(args) outfile = sys.stdout cmdsupport.mutual_exclusion(parser, args, "FP2", ("FP2", "FP3", "FP4", "MACCS", "substruct", "rdmaccs")) if args.FP2: opener = types.get_fingerprint_family("OpenBabel-FP2")() elif args.FP3: opener = types.get_fingerprint_family("OpenBabel-FP3")() elif args.FP4: opener = types.get_fingerprint_family("OpenBabel-FP4")() elif args.MACCS: if not ob.HAS_MACCS: parser.error( "--MACCS is not supported in your OpenBabel installation (%s)" % ( ob.GetReleaseVersion(),)) opener = types.get_fingerprint_family("OpenBabel-MACCS")() elif args.substruct: opener = types.get_fingerprint_family("ChemFP-Substruct-OpenBabel")() elif args.rdmaccs: opener = types.get_fingerprint_family("RDMACCS-OpenBabel")() else: parser.error("should not get here") if not ob.is_valid_format(args.format): parser.error("Unsupported format specifier: %r" % (args.format,)) if not cmdsupport.is_valid_tag(args.id_tag): parser.error("Invalid id tag: %r" % (args.id_tag,)) missing = cmdsupport.check_filenames(args.filenames) if missing: parser.error("Structure file %r does not exist" % (missing,)) # Ready the input reader/iterator metadata, reader = cmdsupport.read_multifile_structure_fingerprints( opener, args.filenames, format = args.format, id_tag = args.id_tag, aromaticity = None, errors = args.errors) try: io.write_fps1_output(reader, args.output, metadata) except ParseError, err: sys.stderr.write("ERROR: %s. Exiting.\n" % (err,)) raise SystemExit(1) if __name__ == "__main__": main() chemfp-1.1p1/chemfp/commandline/oe2fps.py0000644000077000000240000003050712055226640020564 0ustar dalkestaff00000000000000from __future__ import absolute_import import sys import itertools import textwrap from .. import ParseError from .. import argparse, types, io from .. import openeye as oe from . 
import cmdsupport ##### Handle command-line argument parsing # Build up some help text based on the atype and btype fields atype_options = "\n ".join(textwrap.wrap(" ".join(sorted(dict(oe._atype_flags))))) btype_options = "\n ".join(textwrap.wrap(" ".join(sorted(dict(oe._btype_flags))))) if oe.OEGRAPHSIM_API_VERSION == "1": from openeye.oegraphsim import (OEGetFPAtomType, OEFPAtomType_DefaultAtom, OEFPAtomType_DefaultAtom, OEGetFPBondType, OEFPBondType_DefaultBond, OEFPBondType_DefaultBond) type_help = """\ ATYPE is one or more of the following, separated by the '|' character. %(atype_options)s The terms 'Default' and 'DefaultAtom' are expanded to OpenEye's suggested default of %(defaultatom)s. Examples: --atype Default --atype AtomicNumber|HvyDegree (Note that most atom type names change in OEGraphSim 2.0.0.) BTYPE is one or more of the following, separated by the '|' character %(btype_options)s The terms 'Default' and 'DefaultBond' are expanded to OpenEye's suggested default of %(defaultbond)s. Examples: --btype Default --btype BondOrder (Note that "BondOrder" changes to "Order" in OEGraphSim 2.0.0.) For simpler Unix command-line compatibility, a comma may be used instead of a '|' to separate different fields. 
Example: --atype AtomicNumber,HvyDegree """ % dict(atype_options=atype_options, btype_options=btype_options, defaultatom = OEGetFPAtomType(OEFPAtomType_DefaultAtom), defaultbond = OEGetFPBondType(OEFPBondType_DefaultBond)) else: from openeye.oegraphsim import ( OEGetFPAtomType, OEFPAtomType_DefaultPathAtom, OEFPAtomType_DefaultCircularAtom, OEFPAtomType_DefaultTreeAtom, OEGetFPBondType, OEFPBondType_DefaultPathBond, OEFPBondType_DefaultCircularBond, OEFPBondType_DefaultTreeBond, ) type_help = """\ ATYPE is one or more of the following, separated by the '|' character %(atype_options)s The following shorthand terms and expansions are also available: DefaultPathAtom = %(defaultpathatom)s DefaultCircularAtom = %(defaultcircularatom)s DefaultTreeAtom = %(defaulttreeatom)s and 'Default' selects the correct value for the specified fingerprint. Examples: --atype Default --atype Arom|AtmNum|FCharge|HCount BTYPE is one or more of the following, separated by the '|' character %(btype_options)s The following shorthand terms and expansions are also available: DefaultPathBond = %(defaultpathbond)s DefaultCircularBond = %(defaultcircularbond)s DefaultTreeBond = %(defaulttreebond)s and 'Default' selects the correct value for the specified fingerprint. Examples: --btype Default --btype Order|InRing To simplify command-line use, a comma may be used instead of a '|' to separate different fields. 
Example: --atype AtmNum,HvyDegree """ % dict(atype_options=atype_options, btype_options=btype_options, defaultpathatom=OEGetFPAtomType(OEFPAtomType_DefaultPathAtom), defaultcircularatom=OEGetFPAtomType(OEFPAtomType_DefaultCircularAtom), defaulttreeatom=OEGetFPAtomType(OEFPAtomType_DefaultTreeAtom), defaultpathbond=OEGetFPBondType(OEFPBondType_DefaultPathBond), defaultcircularbond=OEGetFPBondType(OEFPBondType_DefaultCircularBond), defaulttreebond=OEGetFPBondType(OEFPBondType_DefaultTreeBond), ) # Extra help text after the parameter descriptions epilog = type_help + """\ OEChem guesses the input structure format based on the filename extension and assumes SMILES for structures read from stdin. Use "--in FORMAT" to select an alternative, where FORMAT is one of: File Type Valid FORMATs (use gz if compressed) --------- ------------------------------------ SMILES smi, ism, can, smi.gz, ism.gz, can.gz SDF sdf, mol, sdf.gz, mol.gz SKC skc, skc.gz CDK cdk, cdk.gz MOL2 mol2, mol2.gz PDB pdb, ent, pdb.gz, ent.gz MacroModel mmod, mmod.gz OEBinary v2 oeb, oeb.gz old OEBinary bin """ parser = argparse.ArgumentParser( description="Generate FPS fingerprints from a structure file using OEChem", epilog=epilog, formatter_class=argparse.RawDescriptionHelpFormatter, ) if oe.OEGRAPHSIM_API_VERSION == "1": PathFamily = oe.OpenEyePathFingerprintFamily_v1 path_group = parser.add_argument_group("path fingerprints") path_group.add_argument( "--path", action="store_true", help="generate path fingerprints (default)") PathFamily.add_argument_to_argparse("numbits", path_group) PathFamily.add_argument_to_argparse("minbonds", path_group) PathFamily.add_argument_to_argparse("maxbonds", path_group) else: CircularFamily = oe.OpenEyeCircularFingerprintFamily_v2 path_group = parser.add_argument_group("path, circular, and tree fingerprints") path_group.add_argument( "--path", action="store_true", help="generate path fingerprints (default)") path_group.add_argument( "--circular", action="store_true", 
help="generate circular fingerprints") path_group.add_argument( "--tree", action="store_true", help="generate tree fingerprints") path_group.add_argument( "--numbits", action="store", type=int, metavar="INT", default=4096, help="number of bits in the fingerprint (default=4096)") path_group.add_argument( "--minbonds", action="store", type=int, metavar="INT", default=0, help="minimum number of bonds in the path or tree fingerprint (default=0)") path_group.add_argument( "--maxbonds", action="store", type=int, metavar="INT", default=None, help="maximum number of bonds in the path or tree fingerprint (path default=5, tree default=4)") CircularFamily.add_argument_to_argparse("minradius", path_group) CircularFamily.add_argument_to_argparse("maxradius", path_group) # The expansion of 'Default' differs based on the fingerprint type path_group.add_argument( "--atype", metavar="ATYPE", default="Default", help="atom type flags, described below (default=Default)") path_group.add_argument( "--btype", metavar="BTYPE", default="Default", help="bond type flags, described below (default=Default)") maccs_group = parser.add_argument_group("166 bit MACCS substructure keys") maccs_group.add_argument( "--maccs166", action="store_true", help="generate MACCS fingerprints") substruct_group = parser.add_argument_group("881 bit ChemFP substructure keys") substruct_group.add_argument( "--substruct", action="store_true", help="generate ChemFP substructure fingerprints") rdmaccs_group = parser.add_argument_group("ChemFP version of the 166 bit RDKit/MACCS keys") rdmaccs_group.add_argument( "--rdmaccs", action="store_true", help="generate 166 bit RDKit/MACCS fingerprints") parser.add_argument( "--aromaticity", metavar="NAME", choices=oe._aromaticity_flavor_names, default="openeye", help="use the named aromaticity model") parser.add_argument( "--id-tag", metavar="NAME", help="tag name containing the record id (SD files only)") parser.add_argument( "--in", metavar="FORMAT", dest="format", 
help="input structure format (default guesses from filename)") parser.add_argument( "-o", "--output", metavar="FILENAME", help="save the fingerprints to FILENAME (default=stdout)") parser.add_argument( "--errors", choices=["strict", "report", "ignore"], default="strict", help="how should structure parse errors be handled? (default=strict)") parser.add_argument( "filenames", nargs="*", help="input structure files (default is stdin)") def _get_atype_and_btype(args, atom_description_to_value, bond_description_to_value): try: atype = atom_description_to_value(args.atype) except ValueError, err: parser.error("--atype must contain '|' separated atom terms: %s" % (err,)) try: btype = bond_description_to_value(args.btype) except ValueError, err: parser.error("--btype must contain '|' separated bond terms: %s" % (err,)) return atype, btype ####### def main(args=None): args = parser.parse_args(args) supported_fingerprints = ("maccs166", "path", "substruct", "rdmaccs") if oe.OEGRAPHSIM_API_VERSION != "1": supported_fingerprints += ("circular", "tree") else: args.circular = False args.tree = False cmdsupport.mutual_exclusion(parser, args, "path", supported_fingerprints) if args.maccs166: # Create the MACCS keys fingerprinter opener = types.get_fingerprint_family("OpenEye-MACCS166")() elif args.path: if not (16 <= args.numbits <= 65536): parser.error("--numbits must be between 16 and 65536 bits") if not (0 <= args.minbonds): parser.error("--minbonds must be 0 or greater") if args.maxbonds is None: args.maxbonds = 5 if not (args.minbonds <= args.maxbonds): parser.error("--maxbonds must not be smaller than --minbonds") atype, btype = _get_atype_and_btype(args, oe.path_atom_description_to_value, oe.path_bond_description_to_value) opener = types.get_fingerprint_family("OpenEye-Path")( numbits = args.numbits, minbonds = args.minbonds, maxbonds = args.maxbonds, atype = atype, btype = btype) elif args.circular: if not (16 <= args.numbits <= 65536): parser.error("--numbits must be 
between 16 and 65536 bits") if not (0 <= args.minradius): parser.error("--minradius must be 0 or greater") if not (args.minradius <= args.maxradius): parser.error("--maxradius must not be smaller than --minradius") atype, btype = _get_atype_and_btype(args, oe.circular_atom_description_to_value, oe.circular_bond_description_to_value) opener = types.get_fingerprint_family("OpenEye-Circular")( numbits = args.numbits, minradius = args.minradius, maxradius = args.maxradius, atype = atype, btype = btype) elif args.tree: if not (16 <= args.numbits <= 65536): parser.error("--numbits must be between 16 and 65536 bits") if not (0 <= args.minbonds): parser.error("--minbonds must be 0 or greater") if args.maxbonds is None: args.maxbonds = 4 if not (args.minbonds <= args.maxbonds): parser.error("--maxbonds must not be smaller than --minbonds") atype, btype = _get_atype_and_btype(args, oe.tree_atom_description_to_value, oe.tree_bond_description_to_value) opener = types.get_fingerprint_family("OpenEye-Tree")( numbits = args.numbits, minbonds = args.minbonds, maxbonds = args.maxbonds, atype = atype, btype = btype) elif args.substruct: opener = types.get_fingerprint_family("ChemFP-Substruct-OpenEye")() elif args.rdmaccs: opener = types.get_fingerprint_family("RDMACCS-OpenEye")() else: parser.error("ERROR: fingerprint not specified?") if args.format is not None: if args.filenames: filename = args.filenames[0] else: filename = None if not oe.is_valid_format(filename, args.format): parser.error("Unsupported format specifier: %r" % (args.format,)) if not oe.is_valid_aromaticity(args.aromaticity): parser.error("Unsupported aromaticity specifier: %r" % (args.aromaticity,)) if not cmdsupport.is_valid_tag(args.id_tag): parser.error("Invalid id tag: %r" % (args.id_tag,)) missing = cmdsupport.check_filenames(args.filenames) if missing: parser.error("Structure file %r does not exist" % (missing,)) # Ready the input reader/iterator metadata, reader = 
cmdsupport.read_multifile_structure_fingerprints( opener, args.filenames, args.format, args.id_tag, args.aromaticity, args.errors) try: io.write_fps1_output(reader, args.output, metadata) except ParseError, err: sys.stderr.write("ERROR: %s. Exiting.\n" % (err,)) raise SystemExit(1) if __name__ == "__main__": main() chemfp-1.1p1/chemfp/commandline/rdkit2fps.py0000644000077000000240000001504712055226640021300 0ustar dalkestaff00000000000000# Copyright (c) 2010 Andrew Dalke Scientific, AB (Gothenburg, Sweden) import sys from .. import ParseError from .. import argparse, io, rdkit, types from . import cmdsupport ########### Configure the command-line parser epilog = """\ This program guesses the input structure format based on the filename extension. If the data comes from stdin, or the extension name is unknown, then use "--in" to change the default input format. The supported format extensions are: File Type Valid FORMATs (use gz if compressed) --------- ------------------------------------ SMILES smi, ism, can, smi.gz, ism.gz, can.gz SDF sdf, mol, sd, mdl, sdf.gz, mol.gz, sd.gz, mdl.gz """ parser = argparse.ArgumentParser( description="Generate FPS fingerprints from a structure file using RDKit", epilog=epilog, formatter_class=argparse.RawDescriptionHelpFormatter, conflict_handler="resolve", ) _base = rdkit._base # --RDK and --morgan both have fpSize but argparse doesn't allow the # same option in different groups. Especially with different defaults. 
_base.add_argument_to_argparse("fpSize", parser) rdk_group = parser.add_argument_group("RDKit topological fingerprints") rdk_group.add_argument("--RDK", action="store_true", help="generate RDK fingerprints (default)") _base.add_argument_to_argparse("minPath", rdk_group) _base.add_argument_to_argparse("maxPath", rdk_group) _base.add_argument_to_argparse("nBitsPerHash", rdk_group) _base.add_argument_to_argparse("useHs", rdk_group) morgan_group = parser.add_argument_group("RDKit Morgan fingerprints") morgan_group.add_argument("--morgan", action="store_true", help="generate Morgan fingerprints") _morgan = rdkit.RDKitMorganFingerprintFamily_v1 _morgan.add_argument_to_argparse("radius", morgan_group) _morgan.add_argument_to_argparse("useFeatures", morgan_group) _morgan.add_argument_to_argparse("useChirality", morgan_group) _morgan.add_argument_to_argparse("useBondTypes", morgan_group) torsion_group = parser.add_argument_group("RDKit Topological Torsion fingerprints") torsion_group.add_argument("--torsions", action="store_true", help="generate Topological Torsion fingerprints") rdkit.RDKitTorsionFingerprintFamily_v1.add_argument_to_argparse( "targetSize", torsion_group) pair_group = parser.add_argument_group("RDKit Atom Pair fingerprints") pair_group.add_argument("--pairs", action="store_true", help="generate Atom Pair fingerprints") rdkit.RDKitTorsionFingerprintFamily_v1.add_argument_to_argparse( "minLength", pair_group) rdkit.RDKitTorsionFingerprintFamily_v1.add_argument_to_argparse( "maxLength", pair_group) maccs_group = parser.add_argument_group("166 bit MACCS substructure keys") maccs_group.add_argument( "--maccs166", action="store_true", help="generate MACCS fingerprints") substruct_group = parser.add_argument_group("881 bit substructure keys") substruct_group.add_argument( "--substruct", action="store_true", help="generate ChemFP substructure fingerprints") rdmaccs_group = parser.add_argument_group("ChemFP version of the 166 bit RDKit/MACCS keys") 
rdmaccs_group.add_argument( "--rdmaccs", action="store_true", help="generate 166 bit RDKit/MACCS fingerprints") parser.add_argument( "--id-tag", metavar="NAME", help="tag name containing the record id (SD files only)") parser.add_argument( "--in", metavar="FORMAT", dest="format", help="input structure format (default guesses from filename)") parser.add_argument( "-o", "--output", metavar="FILENAME", help="save the fingerprints to FILENAME (default=stdout)") parser.add_argument( "--errors", choices=["strict", "report", "ignore"], default="strict", help="how should structure parse errors be handled? (default=strict)") parser.add_argument( "filenames", nargs="*", help="input structure files (default is stdin)") def main(args=None): args = parser.parse_args(args) cmdsupport.mutual_exclusion(parser, args, "RDK", ("maccs166", "RDK", "substruct", "rdmaccs", "morgan", "torsions", "pairs")) if args.maccs166: opener = types.get_fingerprint_family("RDKit-MACCS166")() elif args.RDK: fpSize = args.fpSize or rdkit.NUM_BITS minPath = args.minPath maxPath = args.maxPath nBitsPerHash = args.nBitsPerHash if maxPath < minPath: parser.error("--minPath must not be greater than --maxPath") useHs = args.useHs opener = types.get_fingerprint_family("RDKit-Fingerprint")( minPath=minPath, maxPath=maxPath, fpSize=fpSize, nBitsPerHash=nBitsPerHash, useHs=useHs) elif args.substruct: opener = types.get_fingerprint_family("ChemFP-Substruct-RDKit")() elif args.rdmaccs: opener = types.get_fingerprint_family("RDMACCS-RDKit")() elif args.morgan: opener = types.get_fingerprint_family("RDKit-Morgan")( radius=args.radius, fpSize=args.fpSize, useFeatures=args.useFeatures, useChirality=args.useChirality, useBondTypes=args.useBondTypes) elif args.torsions: opener = types.get_fingerprint_family("RDKit-Torsion")( fpSize=args.fpSize, targetSize=args.targetSize) elif args.pairs: minLength = args.minLength maxLength = args.maxLength if maxLength < minLength: parser.error("--minLength must not be greater than 
--maxLength") opener = types.get_fingerprint_family("RDKit-AtomPair")( fpSize=args.fpSize, minLength=minLength, maxLength=maxLength) else: raise AssertionError("Unknown fingerprinter") if not rdkit.is_valid_format(args.format): parser.error("Unsupported format specifier: %r" % (args.format,)) if not cmdsupport.is_valid_tag(args.id_tag): parser.error("Invalid id tag: %r" % (args.id_tag,)) missing = cmdsupport.check_filenames(args.filenames) if missing: parser.error("Structure file %r does not exist" % (missing,)) metadata, reader = cmdsupport.read_multifile_structure_fingerprints( opener, args.filenames, format=args.format, id_tag=args.id_tag, aromaticity=None, errors=args.errors) try: io.write_fps1_output(reader, args.output, metadata) except ParseError, err: sys.stderr.write("ERROR: %s. Exiting.\n" % (err,)) raise SystemExit(1) if __name__ == "__main__": main() chemfp-1.1p1/chemfp/commandline/sdf2fps.py0000644000077000000240000003031612101361634020726 0ustar dalkestaff00000000000000from __future__ import absolute_import import sys import re import itertools from .. import Metadata, FingerprintIterator, ParseError from .. import argparse from .. import encodings from .. import sdf_reader from .. import io from .. import error_handlers from . import cmdsupport # Backwards compatibility support for Python 2.5 try: next except NameError: def next(it): return it.next() def _check_num_bits(num_bits, # from the user fp_num_bits, # not None if the fp decoder knows it exactly num_bytes, # length of decoded fp in bytes parser): """Check that the number of fingerprint bits and bytes match the user input Difficulties: some fingerprints have only a byte length, and the user doesn't have to specify the input. 
Returns the number of bits, or calls parser.error if there are problems """ if fp_num_bits is not None: # The fingerprint knows exactly how many bits it contains if num_bits is None: # The user hasn't specified, so go with the exact number return fp_num_bits # If the user gave a value, make sure it matches if num_bits != fp_num_bits: parser.error( ("the first fingerprint has %(fp_num_bits)s bits which " "is not the same as the --num-bits value of %(num_bits)s") % dict( num_bits=num_bits, fp_num_bits=fp_num_bits)) raise AssertionError("should not get here") return num_bits # If the number of bits isn't specified, assume it's exactly # enough to fill up the fingerprint bytes. if num_bits is None: return num_bytes * 8 # The user specified the number of bits. The first fingerprint # has a number of bytes. This must be enough to hold the bits, # but only up to 7 bits larger. if (num_bits+7)//8 != num_bytes: parser.error( ("The byte length of the first fingerprint is %(num_bytes)s so --num-bits " "must be %(min)s <= num-bits <= %(max)s, not %(num_bits)s") % dict( num_bytes=num_bytes, min=num_bytes*8-7, max=num_bytes*8, num_bits=num_bits)) raise AssertionError("should not get here") # Accept what the user suggested return num_bits parser = argparse.ArgumentParser( description="Extract a fingerprint tag from an SD file and generate FPS fingerprints", #epilog=epilog, #formatter_class=argparse.RawDescriptionHelpFormatter, ) parser.add_argument( "filenames", nargs="*", help="input SD files (default is stdin)", default=None) parser.add_argument("--id-tag", metavar="TAG", default=None, help="get the record id from TAG instead of the first line of the record") parser.add_argument("--fp-tag", metavar="TAG", help="get the fingerprint from tag TAG (required)") parser.add_argument("--num-bits", metavar="INT", type=int, help="use the first INT bits of the input. Use only when the " "last 1-7 bits of the last byte are not part of the fingerprint. 
" "Unexpected errors will occur if these bits are not all zero.") parser.add_argument( "--errors", choices=["strict", "report", "ignore"], default="strict", help="how should structure parse errors be handled? (default=strict)") parser.add_argument("-o", "--output", metavar="FILENAME", help="save the fingerprints to FILENAME (default=stdout)") parser.add_argument("--software", metavar="TEXT", help="use TEXT as the software description") parser.add_argument("--type", metavar="TEXT", help="use TEXT as the fingerprint type description") # TODO: # Do I want "--gzip", "--auto", "--none", "--bzip2", and "--decompress METHOD"? # Do I want to support encoding of the fps output? # Or, why support all these? Why not just "--in gz", "--in bz2" and be done # with it (do I really need to specify the 'auto' and 'none' options?) parser.add_argument( "--decompress", action="store", metavar="METHOD", default="auto", help="use METHOD to decompress the input (default='auto', 'none', 'gzip', 'bzip2')") #parser.add_argument( # "--compress", action="store", metavar="METHOD", default="auto", # help="use METHOD to compress the output (default='auto', 'none', 'gzip', 'bzip2')") # This adds --cactvs, --base64 and other decoders to the command-line arguments encodings._add_decoding_group(parser) # Support the "--pubchem" option shortcuts_group = parser.add_argument_group("shortcuts") class AddSubsKeys(argparse.Action): def __call__(self, parser, namespace, values, option_string=None): namespace.cactvs=True # the 1.3 is solely based on the version of the document at # ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt namespace.software="CACTVS/unknown" namespace.type="CACTVS-E_SCREEN/1.0 extended=2" namespace.fp_tag="PUBCHEM_CACTVS_SUBSKEYS" shortcuts_group.add_argument("--pubchem", nargs=0, action=AddSubsKeys, help = ("decode CACTVS substructure keys used in PubChem. 
Same as " "--software=CACTVS/unknown --type 'CACTVS-E_SCREEN/1.0 extended=2' " "--fp-tag=PUBCHEM_CACTVS_SUBSKEYS --cactvs")) ############### _illegal_value_pat = re.compile(r"[\000-\037]") def main(args=None): args = parser.parse_args(args) if not args.fp_tag: parser.error("argument --fp-tag is required") if args.num_bits is not None and args.num_bits <= 0: parser.error("--num-bits must be a positive integer") fp_decoder_name, fp_decoder = encodings._extract_decoder(parser, args) missing = cmdsupport.check_filenames(args.filenames) if missing: parser.error("Structure file %r does not exist" % (missing,)) for attr in ("software", "type"): description = getattr(args, attr, None) if description is None: continue m = _illegal_value_pat.search(description) if m is None: continue parser.error("--%(attr)s description may not contain the character %(c)r" % dict( attr=attr, c = m.group(0))) error_handler = error_handlers.get_parse_error_handler(args.errors) # What follows is a bit tricky. I set up a chain of iterators: # - iterate through the SDF iterators # - iterate through the (id, encoded_fp) pairs in each SDF iterator # - convert to (id, fp, num_bits) 3-element tuples # - use the first element to figure out the right metadata # - send to (id, fp) information to the io.write_fps1_output function # Iterate through each of the filenames, yielding the corresponding SDF iterator location = sdf_reader.FileLocation() def get_sdf_iters(): if not args.filenames: yield sdf_reader.open_sdf(None, args.decompress, location=location) else: for filename in args.filenames: location.filename = filename location.lineno = 1 yield sdf_reader.open_sdf(filename, args.decompress, location=location) # Set up the error messages for missing id or fingerprints. 
if args.id_tag is None: MISSING_ID = "Missing title in the record starting %(where)s" MISSING_FP = "Missing fingerprint tag %(tag)r in record starting %(where)s" else: MISSING_ID = "Missing id tag %(tag)r in the record starting %(where)s" MISSING_FP = "Missing fingerprint tag %(tag)r in record %(id)r starting %(where)s" # For each SDF iterator, yield the (id, encoded_fp) pairs if args.id_tag is None: def iter_encoded_fingerprints(sdf_iters): counter = itertools.count(1) for sdf_iter in sdf_iters: for id, fp in sdf_reader.iter_title_and_tag(sdf_iter, args.fp_tag): if id: id = io.remove_special_characters_from_id(id) yield id, fp else: def iter_encoded_fingerprints(sdf_iters): counter = itertools.count(1) for sdf_iter in sdf_iters: for id, fp in sdf_reader.iter_two_tags(sdf_iter, args.id_tag, args.fp_tag): if id: id = io.remove_special_characters_from_id(id) yield id, fp # This is either None or a user-specified integer num_bits = args.num_bits # At this point I don't have enough information to generate the metadata. # I won't get that until I've read the first record. outfile = None # Don't open it until I'm ready to write the first record num_bytes = None # Will need to get (or at least check) the fingerprint byte length # Decoded encoded fingerprints, yielding (id, fp, num_bits) def decode_fingerprints(encoded_fp_reader, error_handler): expected_num_bits = -1 expected_fp_size = None for id, encoded_fp in encoded_fp_reader: if not id: msg = MISSING_ID % dict(id=id, where=location.where(), tag=args.id_tag) error_handler(msg) continue if not encoded_fp: msg = MISSING_FP % dict(id=id, where=location.where(), tag=args.fp_tag) error_handler(msg) continue # Decode the fingerprint, and complain if it isn't decodeable. 
try: num_bits, fp = fp_decoder(encoded_fp) except ValueError, err: msg = ("Could not %(decoder_name)s decode %(tag)r value %(encoded_fp)r: %(err)s %(where)s" % dict(decoder_name=fp_decoder_name, tag=args.fp_tag, where=location.where(), err=err, encoded_fp=encoded_fp)) error_handler(msg) continue if num_bits != expected_num_bits: if expected_num_bits == -1: expected_num_bits = num_bits else: msg = ("Tag %(tag)r value %(encoded_fp)r has %(got)d bits but expected %(expected)d %(where)s" % dict(tag=args.fp_tag, encoded_fp=encoded_fp, got=num_bits, expected=expected_num_bits, where=location.where())) error_handler(msg) continue if len(fp) != expected_fp_size: if expected_fp_size is None: expected_fp_size = len(fp) else: msg = ("Tag %(tag)r value %(encoded_fp)r has %(got)d bytes but expected %(expected)d %(where)s" % dict(tag=args.fp_tag, encoded_fp=encoded_fp, got=len(fp), expected=expected_fp_size, where=location.where())) error_handler(msg) continue yield id, fp, num_bits sdf_iters = get_sdf_iters() encoded_fps = iter_encoded_fingerprints(sdf_iters) decoded_fps = decode_fingerprints(encoded_fps, error_handler) try: id, fp, num_bits = next(decoded_fps) except ParseError, err: sys.stderr.write("ERROR: %s. Exiting.\n" % (err,)) raise SystemExit(1) except StopIteration: # No fingerprints? Make a new empty stream metadata = Metadata(date = io.utcnow()) chained_reader = iter([]) else: # Got the first fingerprint expected_num_bytes = len(fp) # Verify that they match expected_num_bits = _check_num_bits(args.num_bits, num_bits, expected_num_bytes, parser) chained_reader = itertools.chain( [(id, fp)], (x[:2] for x in decoded_fps) ) metadata = Metadata(num_bits = expected_num_bits, software = args.software, type = args.type, sources = args.filenames, date = io.utcnow()) try: io.write_fps1_output(chained_reader, args.output, metadata) except ParseError, err: sys.stderr.write("ERROR: %s. Exiting.\n" 
                         % (err,))
        raise SystemExit(1)

if __name__ == "__main__":
    main()
chemfp-1.1p1/chemfp/commandline/simsearch.py
from __future__ import with_statement
import math
import sys
import itertools
import time

import chemfp
from chemfp import argparse, io, SOFTWARE, bitops
from chemfp import search

# Suppose you have a 4K fingerprint.
#   1/4096 = 0.000244140625.
#   2/4096 = 0.00048828125
# You only need to show "0.0002" and "0.0005" to
# disambiguate the scores. I don't like seeing only
# the minimum resolution, so I also show at least
# the next bit.
#   For 4096 the float_formatter is %.5f and the
# above values are 0.00024 and 0.00049.
# This also prevents the results from being shown
# in scientific notation.
def get_float_formatter(num_bytes):
    num_digits = int(math.log10(num_bytes*8)) + 2
    float_formatter = "%." + str(num_digits) + "f"
    return float_formatter

def write_simsearch_magic(outfile):
    outfile.write("#Simsearch/1\n")

def write_count_magic(outfile):
    outfile.write("#Count/1\n")

def write_simsearch_header(outfile, d):
    lines = []
    for name in ("num_bits", "type", "software", "queries", "targets"):
        value = d.get(name, None)
        if value is not None:
            lines.append("#%s=%s\n" % (name, value))
    for name in ("query_sources", "target_sources"):
        for value in d.get(name, []):
            lines.append("#%s=%s\n" % (name, value))
    outfile.writelines(lines)

#### The NxM cases

def report_threshold(outfile, query_arenas, targets, threshold):
    float_formatter = get_float_formatter(targets.metadata.num_bytes)
    def search_function(query_arena):
        return targets.threshold_tanimoto_search_arena(query_arena, threshold=threshold)
    _report_search(outfile, float_formatter, query_arenas, search_function)

def report_knearest(outfile, query_arenas, targets, k, threshold):
    float_formatter = get_float_formatter(targets.metadata.num_bytes)
    def search_function(query_arena):
        return targets.knearest_tanimoto_search_arena(query_arena, k=k,
                                                      threshold=threshold)
    _report_search(outfile, float_formatter, query_arenas, search_function)

def _report_search(outfile, float_formatter, query_arenas, search_function):
    hit_formatter = "\t%s\t" + float_formatter
    for query_arena in query_arenas:
        for query_id, row in zip(query_arena.ids, search_function(query_arena)):
            outfile.write("%d\t%s" % (len(row), query_id))
            for hit in row.get_ids_and_scores():
                outfile.write(hit_formatter % hit)
            outfile.write("\n") # XXX flush?

def report_counts(outfile, query_arenas, targets, threshold):
    for query_arena in query_arenas:
        counts = targets.count_tanimoto_hits_arena(query_arena, threshold)
        for query_id, hit_count in zip(query_arena.ids, counts):
            outfile.write("%d\t%s\n" % (hit_count, query_id))

#### The NxN cases

def do_NxN_searches(args, k, threshold, target_filename):
    t1 = time.time()

    # load_fingerprints sorts the fingerprints based on popcount
    # I want the output to be in the same order as the input.
    # This means I need to do some reordering. Consider:
    #   0003  ID_A
    #   010a  ID_B
    #   1000  ID_C
    # I use this to generate:
    #   original_ids = ["ID_A", "ID_B", "ID_C"]
    #   targets.ids = [2, 0, 1]
    #   original_index_to_current_index = {2:0, 0:1, 1:2}
    #   current_index_to_original_index = {0:2, 1:0, 2:1}

    original_ids = []
    fps = chemfp.open(target_filename)
    def get_index_to_id(fps):
        for i, (id, fp) in enumerate(fps):
            original_ids.append(id)
            yield i, fp

    targets = chemfp.load_fingerprints(get_index_to_id(fps), fps.metadata)
    original_index_to_current_index = dict(zip(targets.ids, xrange(len(targets))))
    current_index_to_original_id = dict((i, original_ids[original_index])
                                        for i, original_index in enumerate(targets.ids))

    t2 = time.time()
    outfile = io.open_output(args.output)
    with io.ignore_pipe_errors:
        type = "Tanimoto k=%(k)s threshold=%(threshold)s NxN=full" % dict(
            k=k, threshold=threshold, max_score=1.0)

        if args.count:
            type = "Count threshold=%(threshold)s NxN=full" % dict(
                threshold=threshold)
            write_count_magic(outfile)
        else:
            write_simsearch_magic(outfile)

        write_simsearch_header(outfile, {
            "num_bits": targets.metadata.num_bits,
            "software": SOFTWARE,
            "type": type,
            "targets": target_filename,
            "target_sources": targets.metadata.sources})

        if args.count:
            counts = search.count_tanimoto_hits_symmetric(targets, threshold,
                                                          batch_size=args.batch_size)
            for original_index, original_id in enumerate(original_ids):
                current_index = original_index_to_current_index[original_index]
                count = counts[current_index]
                outfile.write("%d\t%s\n" % (count, original_id))
        else:
            hit_formatter = "\t%s\t" + get_float_formatter(targets.metadata.num_bytes)
            if k == "all":
                results = search.threshold_tanimoto_search_symmetric(targets, threshold,
                                                                     batch_size=args.batch_size)
            else:
                results = search.knearest_tanimoto_search_symmetric(targets, k, threshold,
                                                                    batch_size=args.batch_size)

            for original_index, original_id in enumerate(original_ids):
                current_index = original_index_to_current_index[original_index]
                new_indices_and_scores = results[current_index].get_ids_and_scores()
                outfile.write("%d\t%s" % (len(new_indices_and_scores), original_id))
                for (new_index, score) in new_indices_and_scores:
                    original_id = original_ids[new_index]
                    outfile.write(hit_formatter % (original_id, score))
                outfile.write("\n") # XXX flush?
    t3 = time.time()
    if args.times:
        sys.stderr.write("open %.2f search %.2f total %.2f\n" % (t2-t1, t3-t2, t3-t1))

####

def int_or_all(s):
    if s == "all":
        return s
    return int(s)

# the "2fps" options need a way to say "get the options from --reference"
# ob2fps --reference targets.fps | simsearch -k 5 --threshold 0.5 targets.fps

parser = argparse.ArgumentParser(
    description="Search an FPS file for similar fingerprints")
parser.add_argument("-k" ,"--k-nearest", help="select the k nearest neighbors (use 'all' for all neighbors)",
                    default=None, type=int_or_all)
parser.add_argument("-t" ,"--threshold", help="minimum similarity score threshold",
                    default=None, type=float)
parser.add_argument("-q", "--queries", help="filename containing the query fingerprints")
parser.add_argument("--NxN", action="store_true",
                    help="use the targets as the queries, and exclude the self-similarity term")
parser.add_argument("--hex-query", help="query in hex")
parser.add_argument("--query-id", default="Query1",
                    help="id for the hex query")
parser.add_argument("--in", metavar="FORMAT", dest="query_format",
                    help="input query format (default uses the file extension, else 'fps')")
parser.add_argument("-o", "--output", metavar="FILENAME",
                    help="output filename (default is stdout)")
parser.add_argument("-c", "--count", help="report counts", action="store_true")
parser.add_argument("-b", "--batch-size", help="batch size",
                    default=100, type=int)
parser.add_argument("--scan", help="scan the file to find matches (low memory overhead)",
                    action="store_true")
parser.add_argument("--memory", help="build and search an in-memory data structure (faster for multiple queries)",
                    action="store_true")
parser.add_argument("--times", help="report load and execution times to stderr",
                    action="store_true")
parser.add_argument("target_filename", nargs=1, help="target filename", default=None)

## Something to enable multi-threading
#parser.add_argument("-j", "--jobs", help="number of jobs ",
#                    default=10, type=int)

def main(args=None):
    args = parser.parse_args(args)
    target_filename = args.target_filename[0]

    threshold = args.threshold
    k = args.k_nearest

    if args.count and k is not None and k != "all":
        parser.error("--count search does not support --k-nearest")

    # People should not use this without setting parameters. On the
    # other hand, I don't want an error message if there are no
    # parameters. This solution seems to make sense.

    if threshold is None:
        if k is None:
            # If nothing is set, use defaults of --threshold 0.7 -k 3
            threshold = 0.7
            k = 3
        else:
            # only k is set; search over all possible matches
            threshold = 0.0
    else:
        if k is None:
            # only threshold is set; search for all hits above that threshold
            k = "all"

    if k == "all":
        pass
    elif k < 0:
        parser.error("--k-nearest must be non-negative or 'all'")

    if not (0.0 <= threshold <= 1.0):
        parser.error("--threshold must be between 0.0 and 1.0, inclusive")

    if args.batch_size < 1:
        parser.error("--batch-size must be positive")

    bitops.use_environment_variables()

    if args.NxN:
        if args.scan:
            parser.error("Cannot specify --scan with an --NxN search")
        if args.hex_query:
            parser.error("Cannot specify --hex-query with an --NxN search")
        if args.queries:
            parser.error("Cannot specify --queries with an --NxN search")
        do_NxN_searches(args, k, threshold, target_filename)
        return

    if args.scan and args.memory:
        parser.error("Cannot specify both --scan and --memory")

    if args.hex_query and args.queries:
        parser.error("Cannot specify both --hex-query and --queries")

    if args.hex_query:
        query_id = args.query_id
        for c, name in ( ("\t", "tab"),
                         ("\n", "newline"),
                         ("\r", "control-return"),
                         ("\0", "NUL")):
            if c in query_id:
                parser.error("--query-id must not contain the %s character" %
                             (name,))

    # Open the target file. This reads just enough to get the header.
    try:
        targets = chemfp.open(target_filename)
    except (IOError, ValueError), err:
        sys.stderr.write("Cannot open targets file: %s\n" % (err,))
        raise SystemExit(1)

    if args.hex_query is not None:
        try:
            query_fp = args.hex_query.decode("hex")
        except TypeError, err:
            parser.error("--hex-query is not a hex string: %s" % (err,))

        for (severity, error, msg_template) in chemfp.check_fp_problems(query_fp, targets.metadata):
            if severity == "error":
                parser.error(msg_template % dict(fp="query", metadata=repr(target_filename)))

        num_bits = targets.metadata.num_bits
        if num_bits is None:
            num_bits = len(query_fp) * 8
        query_metadata = chemfp.Metadata(num_bits=num_bits, num_bytes=len(query_fp))
        queries = chemfp.Fingerprints(query_metadata,
                                      [(query_id, query_fp)])
        query_filename = None
    else:
        query_filename = args.queries
        try:
            queries = chemfp.open(query_filename, format=args.query_format)
        except (ValueError, IOError), err:
            sys.stderr.write("Cannot open queries file: %s\n" % (err,))
            raise SystemExit(1)

    batch_size = args.batch_size
    query_arena_iter = queries.iter_arenas(batch_size)

    t1 = time.time()

    first_query_arena = None
    for first_query_arena in query_arena_iter:
        break

    if args.scan:
        # Leave the targets as-is
        pass
    elif args.memory:
        targets = chemfp.load_fingerprints(targets)
    if not first_query_arena:
        # No input. Leave as-is
        pass
    elif len(first_query_arena) < min(10, batch_size):
        # Figure out the optimal search. If there is a
        # small number of inputs (< ~10) then a scan
        # of the FPS file is faster than an arena search.
        pass
    else:
        targets = chemfp.load_fingerprints(targets)

    problems = chemfp.check_metadata_problems(queries.metadata, targets.metadata)
    for (severity, error, msg_template) in problems:
        msg = msg_template % dict(metadata1="queries", metadata2="targets")
        if severity == "error":
            parser.error(msg)
        elif severity == "warning":
            sys.stderr.write("WARNING: " + msg + "\n")

    t2 = time.time()
    outfile = io.open_output(args.output)
    with io.ignore_pipe_errors:
        type = "Tanimoto k=%(k)s threshold=%(threshold)s" % dict(
            k=k, threshold=threshold, max_score=1.0)

        if args.count:
            type = "Count threshold=%(threshold)s" % dict(
                threshold=threshold)
            write_count_magic(outfile)
        else:
            write_simsearch_magic(outfile)

        write_simsearch_header(outfile, {
            "num_bits": targets.metadata.num_bits,
            "software": SOFTWARE,
            "type": type,
            "queries": query_filename,
            "targets": target_filename,
            "query_sources": queries.metadata.sources,
            "target_sources": targets.metadata.sources})

        if first_query_arena:
            query_arenas = itertools.chain([first_query_arena],
                                           query_arena_iter)

            if args.count:
                report_counts(outfile, query_arenas, targets,
                              threshold = threshold)
            elif k == "all":
                report_threshold(outfile, query_arenas, targets,
                                 threshold = threshold)
            else:
                report_knearest(outfile, query_arenas, targets,
                                k = k, threshold = threshold)

    t3 = time.time()
    if args.times:
        sys.stderr.write("open %.2f search %.2f total %.2f\n" % (t2-t1, t3-t2, t3-t1))

if __name__ == "__main__":
    main()
chemfp-1.1p1/chemfp/decoders.py
import warnings
warnings.warn("The decoders are now located in the chemfp.encodings module.", DeprecationWarning)

from .encodings import (from_base64, from_binary_lsb, from_binary_msb, from_cactvs,
                        from_daylight, from_hex, from_hex_lsb, from_hex_msb,
                        from_on_bit_positions)
1/0
chemfp-1.1p1/chemfp/encodings.py
"""chemfp.encodings - decode different fingerprint
representations into chemfp form

The chemfp fingerprints are stored as byte strings, with the bytes in
least-significant byte order (bit #0 is stored in the first/left-most
byte) and with the bits in most-significant bit order (bit #0 is the
lowest-order bit, 1<<0, of the first byte, which is the right-most bit
when the byte is written in binary).

Other systems use different encodings. These include:
  - the '0' and '1' characters, as in '00111101'
  - hex encoding, like '3d'
  - base64 encoding, like 'SGVsbG8h'
  - CACTVS's variation of base64 encoding

plus variations of different LSB and MSB orders.

This module decodes most of the fingerprint encodings I have come
across. The fingerprint decoders return a 2-ple of the bit length and
the chemfp fingerprint. The bit length is None unless the bit length
is known exactly, which currently is only the case for the binary,
CACTVS, and on-bit-position decoders. (The hex and other decoders must
round the fingerprints up to a multiple of 8 bits.)

"""

import string
import binascii

_lsb_bit_table = {}  # "10000000" -> 1
_msb_bit_table = {}  # "00000001" -> 1

_reverse_bits_in_a_byte_transtable = None

# These are in lsb order
_lsb_4bit_patterns = (
    "0000", "1000", "0100", "1100",
    "0010", "1010", "0110", "1110",
    "0001", "1001", "0101", "1101",
    "0011", "1011", "0111", "1111")

# Generate '00000000', '10000000', '01000000', ...
#   , '01111111', '11111111'
def _lsb_8bit_patterns():
    for right in _lsb_4bit_patterns:
        for left in _lsb_4bit_patterns:
            yield left + right

def _init():
    to_trans = [None]*256
    for value, bit_pattern in enumerate(_lsb_8bit_patterns()):
        # Each pattern maps to the byte
        byte_value = chr(value)
        to_trans[value] = chr(int(bit_pattern, 2))

        _lsb_bit_table[bit_pattern] = byte_value
        # Include the forms with trailing 0s
        # 10000000, 1000000, 100000, 10000, 1000, 100, 10 and 1 are all 0x01
        # (RDKit fingerprint lengths don't need to be a multiple of 8)
        lsb_pattern = bit_pattern
        while lsb_pattern[-1:] == "0":
            lsb_pattern = lsb_pattern[:-1]
            _lsb_bit_table[lsb_pattern] = byte_value

        msb_pattern = bit_pattern[::-1]
        _msb_bit_table[msb_pattern] = byte_value
        while msb_pattern[:1] == "0":
            msb_pattern = msb_pattern[1:]
            _msb_bit_table[msb_pattern] = byte_value

    global _reverse_bits_in_a_byte_transtable
    _reverse_bits_in_a_byte_transtable = string.maketrans(
        "".join(chr(i) for i in range(256)),
        "".join(to_trans))

_init()

assert _lsb_bit_table["10000000"] == "\x01", _lsb_bit_table["10000000"]
assert _lsb_bit_table["1000000"] == "\x01", _lsb_bit_table["1000000"]
assert _lsb_bit_table["100000"] == "\x01"
assert _lsb_bit_table["10000"] == "\x01"
assert _lsb_bit_table["1"] == "\x01"
assert _lsb_bit_table["1111111"] == "\x7f"

assert _msb_bit_table["00000001"] == "\x01"
assert _msb_bit_table["0000001"] == "\x01"
assert _msb_bit_table["000001"] == "\x01"
assert _msb_bit_table["00001"] == "\x01"
assert _msb_bit_table["1"] == "\x01"
assert _msb_bit_table["00000011"] == "\x03"
assert _msb_bit_table["10000000"] == "\x80"
assert _msb_bit_table["1000000"] == "\x40"


def from_binary_lsb(text):
    """Convert a string like '00010101' (bit 0 here is off) into '\\xa8'

    The encoding characters '0' and '1' are in LSB order, so bit 0 is the left-most field.
    The result is a 2-ple of the fingerprint length and the decoded chemfp fingerprint

    >>> from_binary_lsb('00010101')
    (8, '\\xa8')
    >>> from_binary_lsb('11101')
    (5, '\\x17')
    >>> from_binary_lsb('00000000000000010000000000000')
    (29, '\\x00\\x80\\x00\\x00')
    >>>
    """
    table = _lsb_bit_table
    N = len(text)
    try:
        bytes = "".join(table[text[i:i+8]] for i in xrange(0, N, 8))
    except KeyError:
        raise ValueError("Not a binary string")
    return (N, bytes)

def from_binary_msb(text):
    """Convert a string like '10101000' (bit 0 here is off) into '\\xa8'

    The encoding characters '0' and '1' are in MSB order, so bit 0 is the right-most field.

    >>> from_binary_msb('10101000')
    (8, '\\xa8')
    >>> from_binary_msb('00010101')
    (8, '\\x15')
    >>> from_binary_msb('00111')
    (5, '\\x07')
    >>> from_binary_msb('00000000000001000000000000000')
    (29, '\\x00\\x80\\x00\\x00')
    >>>
    """
    # It feels like there should be a faster, more elegant way to do this.
    # While close,
    #   hex(int('00010101', 2))[2:].decode("hex")
    # does not keep the initial 0s
    try:
        N = len(text)
        bytes = []
        end = N
        start = N-8
        while start > 0:
            bytes.append(_msb_bit_table[text[start:end]])
            end = start
            start -= 8
        bytes.append(_msb_bit_table[text[0:end]])
        return (N, "".join(bytes))
    except KeyError:
        raise ValueError("Not a binary string")


def from_base64(text):
    """Decode a base64 encoded fingerprint string

    The encoded fingerprint must be in chemfp form, with the bytes in
    LSB order and the bits in MSB order.

    >>> from_base64("SGk=")
    (None, 'Hi')
    >>> from_base64("SGk=")[1].encode("hex")
    '4869'
    >>>
    """
    try:
        # This is the same as doing text.decode("base64") but since I
        # need to catch the exception, I might as well work with the
        # underlying implementation code.
        return (None, binascii.a2b_base64(text))
    except binascii.Error, err:
        raise ValueError(str(err))

#def from_base64_msb(text):
#    return (None, text.decode("base64")[::-1], None)

#def from_base64_lsb(text):
#    return (None, text.decode("base64").translate(_reverse_bits_in_a_byte_transtable), None)

def from_hex(text):
    """Decode a hex encoded fingerprint string

    The encoded fingerprint must be in chemfp form, with the bytes in
    LSB order and the bits in MSB order.

    >>> from_hex('10f2')
    (None, '\\x10\\xf2')
    >>>

    Raises a ValueError if the hex string is not a multiple of 2 bytes long
    or if it contains a non-hex character.
    """
    return (None, text.decode("hex"))

def from_hex_msb(text):
    """Decode a hex encoded fingerprint string where the bits and bytes are in MSB order

    >>> from_hex_msb('10f2')
    (None, '\\xf2\\x10')
    >>>

    Raises a ValueError if the hex string is not a multiple of 2 bytes long
    or if it contains a non-hex character.
    """
    return (None, text.decode("hex")[::-1])

def from_hex_lsb(text):
    """Decode a hex encoded fingerprint string where the bits and bytes are in LSB order

    >>> from_hex_lsb('102f')
    (None, '\\x08\\xf4')
    >>>

    Raises a ValueError if the hex string is not a multiple of 2 bytes long
    or if it contains a non-hex character.
    """
    return (None, text.decode("hex").translate(_reverse_bits_in_a_byte_transtable))


# ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt

# This comes from cid:11 which is 1,2-dichloroethane
# AAADcYBAAAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGAIAAAAAAAOAAEAAAAA
# AAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
# That's simple enough to check the bit ordering by eye. Here's the decoded start
#  80-40-00-00-06-00-00 ...
# We know it has to match the bits (starting with bit 0)
#  1000 0000 0100 0000 0000 0000 0000 0000 0000 0110
# and it does, perfectly. That means CACTVS is pure little endian.
# chem-fp has little-endian byte order but big endian bit order.
#  0111 1000 0100 0000 0000 0101 0000 0000 0000 0000 0000 0000

def from_cactvs(text):
    """Decode an 881-bit CACTVS-encoded fingerprint used by PubChem

    >>> from_cactvs("AAADceB7sQAEAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHwIYAAAADA" +
    ...             "rBniwygJJqAACqAyVyVACSBAAhhwIa+CC4ZtgIYCLB0/CUpAhgmADIyYcAgAAO" +
    ...             "AAAAAAABAAAAAAAAAAIAAAAAAAAAAA==")
    (881, '\\x07\\xde\\x8d\\x00 \\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x80\\x06\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x80\\x03\\x00\\x00\\xf8@\\x18\\x00\\x00\\x000P\\x83y4L\\x01IV\\x00\\x00U\\xc0\\xa4N*\\x00I \\x00\\x84\\xe1@X\\x1f\\x04\\x1df\\x1b\\x10\\x06D\\x83\\xcb\\x0f)%\\x10\\x06\\x19\\x00\\x13\\x93\\xe1\\x00\\x01\\x00p\\x00\\x00\\x00\\x00\\x00\\x80\\x00\\x00\\x00\\x00\\x00\\x00\\x00@\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00')
    >>>

    For format details, see
      ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt
    """
    fp = text.decode("base64")
    # first 4 bytes are the length (struct.unpack(">I"))
    if fp[:4] != '\x00\x00\x03q':
        raise ValueError("This implementation is hard-coded for 881 bit CACTVS fingerprints")
    return 881, fp[4:].translate(_reverse_bits_in_a_byte_transtable)


########### Convert from the Daylight encoding created by dt_binary2ascii

# Copied from PyDaylight daylight/dayencodings.py
"""
This code is based on the description of the encoding given in the
contrib program '$DY_ROOT/contrib/src/c/fingerprint/ascii2bits.c'

Here is the description from that file.

*****************************************************************************
ASCII:   |=======+=======+=======+=======| etc.
                         ^
         becomes...   3 <-> 4
                         v
BINARY:  |=====+=====+=====+=====| etc.

Daylight uses the following method for translating binary data into
printable ascii and vice versa. Each 6 bits of binary (range 0-63) is
converted to one of 64 characters in the set [.,0-9A-Za-z]; each 3-byte
triplet thus converts to a 4-byte ASCII string.
Every binary array is padded to a multiple of 3 bytes for the
conversion; once the conversion is done you can't tell whether the
last two bytes are pad bytes or real bytes containing zero. To
remedy this, an extra character is tacked on the ASCII representation;
it will always be one of the characters '3', '2', or '1', indicating
how many of the bytes in the last triplet are genuine. That is, an
ASCII-to-binary conversion will always produce an array whose length
is a 3n bytes, but the last one or two bytes might just be pad bytes;
the last ascii character indicates this.

Thus, ascii strings are always of length (4n + 1) bytes.

Thus, an ascii string can only describe bitmaps with bitcounts
that are a multiple of 8. If other sizes are desired, a specific
bitcount must be remembered.
**************************************************************************
4.61 Change: ',' is replaced by '+'.
**************************************************************************
Author: Jeremy Yang
Rev: 27 Jan 1999
*************************************************************************
"""

# Map from 6 bit value to character encoding (used in binary2ascii)
_daylight_table = (".+" +
    "".join(map(chr, range(ord("0"), ord("9") + 1) +
                     range(ord("A"), ord("Z") + 1) +
                     range(ord("a"), ord("z") + 1))))

# Map from character encoding to 6 bits (used in ascii2binary)
# The '+' used to be represented as ',' in pre-4.61 code
_daylight_reverse_table = {}
for i, c in enumerate(_daylight_table):
    _daylight_reverse_table[c] = i
_daylight_reverse_table[","] = _daylight_reverse_table["+"]
del i, c

def from_daylight(text):
    """Decode a Daylight ASCII fingerprint

    >>> from_daylight("I5Z2MLZgOKRcR...1")
    (None, 'PyDaylight')

    See the implementation for format details.
    """
    if len(text) % 4 != 1:
        raise ValueError("Daylight binary encoding is of the wrong length")

    if text == "3":
        # This is the encoding of an empty string (perverse, I know)
        return None, ""

    count = text[-1]
    if count not in ("1", "2", "3"):
        raise ValueError("Last character of encoding must be 1, 2, or 3, not %r" %
                         (count,))
    count = int(count)

    try:
        # Take four digits at a time
        fields = []
        reverse_table = _daylight_reverse_table
        for i in range(0, len(text)-1, 4):
            t = text[i:i+4]
            d = (reverse_table[t[0]] * 262144 +  # (2**6) ** 3
                 reverse_table[t[1]] * 4096 +    # (2**6) ** 2
                 reverse_table[t[2]] * 64 +      # (2**6) ** 1
                 reverse_table[t[3]])            # (2**6) ** 0

            # This is a 24 bit field
            # Convert back into 8 bits at a time
            c1 = d >> 16
            c2 = (d >> 8) & 0xFF
            c3 = d & 0xFF

            fields.append( chr(c1) + chr(c2) + chr(c3) )
    except KeyError:
        raise ValueError("Unknown encoding symbol")

    # Only 'count' of the last field is legal
    # Because of the special case for empty string earlier,
    # the 'fields' array is non-empty
    fields[-1] = fields[-1][:count]
    s = "".join(fields)
    return (None, s)

assert from_daylight("I5Z2MLZgOKRcR...1") == (None, "PyDaylight")

def from_on_bit_positions(text, num_bits=1024, separator=" "):
    """Decode from a list of integers describing the location of the on bits

    >>> from_on_bit_positions("1 4 9 63", num_bits=32)
    (32, '\\x12\\x02\\x00\\x80')
    >>> from_on_bit_positions("1,4,9,63", num_bits=64, separator=",")
    (64, '\\x12\\x02\\x00\\x00\\x00\\x00\\x00\\x80')

    The text contains a sequence of non-negative integer values
    separated by the `separator` text. Bit positions are folded modulo
    num_bits.

    This is often used to convert sparse fingerprints into a dense
    fingerprint.
    """
    if num_bits <= 0:
        raise ValueError("num_bits must be positive")
    bytes = [0] * ((num_bits+7)//8)
    for bit_s in text.split(separator):
        try:
            bit = int(bit_s)
        except ValueError:
            raise ValueError("Bit positions must be an integer, not %r" % (bit_s,))
        if bit < 0:
            raise ValueError("Bit positions must be non-negative, not %r" % (bit,))
        bit = bit % num_bits
        bytes[bit//8] |= 1<<(bit%8)
    return num_bits, "".join(map(chr, bytes))


##############

def import_decoder(path):
    """Find a decoder function given its full name, as in 'chemfp.encodings.from_cactvs'

    This function imports any intermediate modules, which may be a security concern.
    """
    terms = path.split(".")
    if not terms:
        raise ValueError("missing import name")
    if "" in terms:
        raise ValueError("Empty module name in %r" % (path,))

    # It's impossible to tell if the dotted terms correspond to
    # module or class/instance attribute lookups, so I don't know
    # which fields are imports and which fields are getattrs. To get
    # around that, I'll import everything, and if that fails I'll
    # remove the deepest term and try again.
    tmp_terms = terms[:]
    while tmp_terms:
        try:
            __import__(".".join(tmp_terms), level=0)
        except ImportError:
            del tmp_terms[-1]
        else:
            break
    # I've imported as deep as possible.
    # Now start from the top and work down with getattr calls
    obj = __import__(terms[0], level=0)
    for i, subattr in enumerate(terms[1:]):
        obj = getattr(obj, subattr, None)
        if obj is None:
            failure_path = ".".join(terms[:i+2])
            raise ValueError(("Unable to import a decoder: "
                              "Could not find %(attr)r from %(path)r") %
                             dict(attr=failure_path, path=path))

    return obj

##### Helper code for dealing with common command-line parameters

_decoding_args = []
_decoder_table = {}
def _A(arg, action, decoder, help):
    _decoding_args.append ( ((arg,), dict(action=action, help=help)) )
    _decoder_table[arg.lstrip("-").replace("-","_")] = decoder

_A("--binary", "store_true", from_binary_lsb,
   "Encoded with the characters '0' and '1'. Bit #0 comes first. Example: 00100000 encodes the value 4")
_A("--binary-msb", "store_true", from_binary_msb,
   "Encoded with the characters '0' and '1'. Bit #0 comes last. Example: 00000100 encodes the value 4")
_A("--hex", "store_true", from_hex,
   "Hex encoded. Bit #0 is the first bit (1<<0) of the first byte. Example: 01f2 encodes the value \\x01\\xf2 = 498")
_A("--hex-lsb", "store_true", from_hex_lsb,
   "Hex encoded. Bit #0 is the eighth bit (1<<7) of the first byte. Example: 804f encodes the value \\x01\\xf2 = 498")
_A("--hex-msb", "store_true", from_hex_msb,
   "Hex encoded. Bit #0 is the first bit (1<<0) of the last byte. Example: f201 encodes the value \\x01\\xf2 = 498")
_A("--base64", "store_true", from_base64,
   "Base-64 encoded. Bit #0 is first bit (1<<0) of first byte. Example: AfI= encodes value \\x01\\xf2 = 498")
_A("--cactvs", "store_true", from_cactvs,
   help="CACTVS encoding, based on base64 and includes a version and bit length")
_A("--daylight", "store_true", from_daylight,
   help="Daylight encoding, which is a base64 variant")
_A("--decoder", "store", None,
   help="import and use the DECODER function to decode the fingerprint")

def _add_decoding_group(parser):
    decoding_group = parser.add_argument_group("Fingerprint decoding options")
    for (args, kwargs) in _decoding_args:
        decoding_group.add_argument(*args, **kwargs)

def _extract_decoder(parser, namespace):
    """An internal helper function for the command-line programs"""
    # Were any command-line decoder arguments specified?
    # Make sure that multiple decoders were not specified
    decoder_name = None
    for arg in _decoder_table:
        if getattr(namespace, arg):
            if decoder_name is not None:
                parser.error("Cannot decode with both --%(old_arg)s and --%(arg)s" %
                             dict(old_arg=decoder_name, arg=arg))
            decoder_name = arg

    # When in doubt, assume a hex decoder
    if decoder_name is None:
        decoder_name = "hex"

    # If --decoder was specified, do the import and return (name, decoder)
    if decoder_name == "decoder":
        function_name = getattr(namespace, "decoder")
        fp_decoder = import_decoder(function_name)
        return function_name, fp_decoder

    # Otherwise it's in the decoder table
    fp_decoder = _decoder_table[decoder_name]
    return decoder_name, fp_decoder
chemfp-1.1p1/chemfp/error_handlers.py
"internal module; not for general use"

from __future__ import absolute_import
import sys

from . import ParseError

def ignore_parse_errors(msg):
    pass

def report_parse_errors(msg):
    sys.stderr.write("ERROR: %s. Skipping.\n" % (msg,))

def strict_parse_errors(msg):
    raise ParseError(msg)

_parse_error_handlers = {
    "ignore": ignore_parse_errors,
    "report": report_parse_errors,
    "strict": strict_parse_errors,
}

def get_parse_error_handler(errors):
    try:
        return _parse_error_handlers[errors]
    except KeyError:
        raise ValueError("'errors' must be one of %s" % ", ".join(sorted(_parse_error_handlers)))
chemfp-1.1p1/chemfp/fps_io.py
from __future__ import absolute_import

from cStringIO import StringIO
from __builtin__ import open as _builtin_open
import binascii
import _chemfp
import re
import sys
import heapq
import itertools
import ctypes

from . import Metadata, ParseError, FingerprintReader
from . import fps_search
from . import io

# I tried a wide range of sizes for my laptop, with both compressed
# and uncompressed files, and found that the best size was around
# 2**17.
# Actually, 2**16.8 was the absolute best, which gives
# NOTE: 2**16.8 is about 114105, roughly ten times the value used here.
BLOCKSIZE=11400
# (BTW, the compressed time took 1.3x the uncompressed time)


class FPSParseError(ParseError):
    def __init__(self, errcode, lineno, filename):
        self.errcode = errcode
        self.lineno = lineno
        self.filename = filename
    def __repr__(self):
        return "FPSParseError(%d, %d, %s)" % (self.errcode, self.lineno, self.filename)
    def __str__(self):
        msg = _chemfp.strerror(self.errcode)
        msg += " at line %d" % (self.lineno,)
        if self.filename is not None:
            msg += " of %r" % (self.filename,)
        return msg


def open_fps(source, format=None):
    format_name, compression = io.normalize_format(source, format)
    if format_name != "fps":
        raise ValueError("Unknown format %r" % (format_name,))

    infile = io.open_compressed_input_universal(source, compression)
    filename = io.get_filename(source)

    metadata, lineno, block = read_header(infile, filename)
    return FPSReader(infile, metadata, lineno, block)


# This never buffers
def _read_blocks(infile):
    while 1:
        block = infile.read(BLOCKSIZE)
        if not block:
            break
        if block[-1:] == "\n":
            yield block
            continue
        line = infile.readline()
        if not line:
            # Note: this might not end with a newline!
            yield block
            break
        yield block + line


class FPSReader(FingerprintReader):
    _search = fps_search
    def __init__(self, infile, metadata, first_fp_lineno, first_fp_block):
        self._infile = infile
        self._filename = getattr(infile, "name", "")
        self.metadata = metadata
        self._first_fp_lineno = first_fp_lineno
        self._first_fp_block = first_fp_block
        self._expected_hex_len = 2*metadata.num_bytes
        self._hex_len_source = "size in header"

        self._at_start = True
        self._it = None
        self._block_reader = None

    # Not sure if this is complete.
    # Also, should have a context manager
#    def close(self):
#        self._infile.close()

    def iter_blocks(self):
        if self._block_reader is None:
            self._block_reader = iter(self._iter_blocks())
        return self._block_reader

    def _iter_blocks(self):
        if not self._at_start:
            raise TypeError("Already iterating")

        self._at_start = False

        if self._first_fp_block is None:
            return

        block_stream = _read_blocks(self._infile)
        yield self._first_fp_block
        for block in block_stream:
            yield block

    def iter_rows(self):
        unhexlify = binascii.unhexlify
        lineno = self._first_fp_lineno
        expected_hex_len = self._expected_hex_len
        for block in self.iter_blocks():
            for line in block.splitlines(True):
                err = _chemfp.fps_line_validate(expected_hex_len, line)
                if err:
                    raise FPSParseError(err, lineno, self._filename)
                yield line[:-1].split("\t")
                lineno += 1

    def __iter__(self):
        unhexlify = binascii.unhexlify
        lineno = self._first_fp_lineno
        expected_hex_len = self._expected_hex_len
        for block in self.iter_blocks():
            for line in block.splitlines(True):
                err, id_fp = _chemfp.fps_parse_id_fp(expected_hex_len, line)
                if err:
                    # Include the line?
raise FPSParseError(err, lineno, self._filename) yield id_fp lineno += 1 def _check_at_start(self): if not self._at_start: raise TypeError("FPS file is not at the start of the file; cannot search") def count_tanimoto_hits_fp(self, query_fp, threshold=0.7): self._check_at_start() return fps_search.count_tanimoto_hits_fp(query_fp, self, threshold) def count_tanimoto_hits_arena(self, queries, threshold=0.7, arena_size=100): self._check_at_start() return fps_search.count_tanimoto_hits_arena(queries, self, threshold) def threshold_tanimoto_search_fp(self, query_fp, threshold=0.7): self._check_at_start() return fps_search.threshold_tanimoto_search_fp(query_fp, self, threshold) def threshold_tanimoto_search_arena(self, queries, threshold=0.7, arena_size=100): self._check_at_start() return fps_search.threshold_tanimoto_search_arena(queries, self, threshold) def knearest_tanimoto_search_fp(self, query_fp, k=3, threshold=0.7): self._check_at_start() return fps_search.knearest_tanimoto_search_fp(query_fp, self, k, threshold) def knearest_tanimoto_search_arena(self, queries, k=3, threshold=0.7, arena_size=100): self._check_at_start() return fps_search.knearest_tanimoto_search_arena(queries, self, k, threshold) def _where(filename, lineno): if filename is None: return "line %d" % (lineno,) else: return "%r line %d" % (filename, lineno) # XXX Use Python's warning system def warn_to_stderr(filename, lineno, message): where = _where(filename, lineno) sys.stderr.write("WARNING: %s at %s\n" % (message, where)) def read_header(f, filename, warn=warn_to_stderr): metadata = Metadata() lineno = 1 for block in _read_blocks(f): # A block must be non-empty start = 0 while 1: c = block[start:start+1] if c == "": # End of the block; get the next one break if c != '#': # End of the header. 
This block contains the first fingerprint line block = block[start:] if metadata.num_bits is None: # We can figure this out from the fingerprint on the first line err = _chemfp.fps_line_validate(-1, block) if err: raise FPSParseError(err, lineno, filename) i = block.index("\t") # If you don't specify the number of bits then I'll do it for you. metadata.num_bits = i * 4 metadata.num_bytes = i // 2 return metadata, lineno, block start += 1 # Skip the '#' end = block.find("\n", start) if end == -1: # Only happens when the last line of the file contains # no newlines. In that case, we're at the last block. line = block[start:] start = len(block) else: line = block[start:end] start = end+1 # Right! We've got a line. Check if it's magic # This is the only line which cannot contain a '=' if lineno == 1: if line.rstrip() == "FPS1": lineno += 1 continue if line.startswith("x-") or line.startswith("X-"): # Completely ignore the contents of 'experimental' lines continue if "=" not in line: raise TypeError("header line must contain an '=': %r at %s" % (line, _where(filename, lineno))) key, value = line.split("=", 1) key = key.strip() value = value.strip() if key == "num_bits": try: metadata.num_bits = int(value) metadata.num_bytes = (metadata.num_bits + 7)//8 if not (metadata.num_bits > 0): raise ValueError except ValueError: raise TypeError( "num_bits header must be a positive integer, not %r: %s" % (value, _where(filename, lineno))) metadata.num_bytes = (metadata.num_bits+7)//8 elif key == "software": metadata.software = value.decode("utf8") elif key == "type": # Should I have an auto-normalization step here which # removes excess whitespace? 
#metadata.type = normalize_type(value) metadata.type = value elif key == "source": metadata.sources.append(value) elif key == "date": metadata.date = value elif key == "aromaticity": metadata.aromaticity = value elif key.startswith("x-"): pass else: #print "UNKNOWN", repr(line), repr(key), repr(value) #warn(filename, lineno, "Unknown header %r" % (value,)) pass lineno += 1 # Reached the end of file. No fingerprint lines and nothing left to process. if metadata.num_bits is None: metadata.num_bits = 0 metadata.num_bytes = 0 return metadata, lineno, None chemfp-1.1p1/chemfp/fps_search.py0000644000077000000240000002737012055226640017221 0ustar dalkestaff00000000000000# Internal module to help with FPS-based searches from __future__ import absolute_import import ctypes import itertools import array import _chemfp from . import ChemFPError from . import check_fp_problems, check_metadata_problems class FPSFormatError(ChemFPError): def __init__(self, code, filename, lineno): self.code = code self.filename = filename self.lineno = lineno super(FPSFormatError, self).__init__(code, filename, lineno) def __repr__(self): return "FPSFormatError(%r, %r, %r)" % (self.code, self.filename, self.lineno) def __str__(self): return "%s at line %s of %r" % (_chemfp.strerror(self.code), self.lineno, self.filename) def _chemfp_error(err, lineno, filename): if -40 <= err <= -30: return FPSFormatError(err, filename, lineno) elif err == -2: raise MemoryError(_chemfp.strerror(err)) else: # This shouldn't happen return RuntimeError(_chemfp.strerror(err)) def require_matching_sizes(query_arena, target_reader): query_num_bits = query_arena.metadata.num_bits assert query_num_bits is not None, "arenas must define num_bits" target_num_bits = target_reader.metadata.num_bits if (target_num_bits is not None): if query_num_bits != target_num_bits: raise ValueError("query_arena has %d bits while target_reader has %d bits" % (query_num_bits, target_num_bits)) query_num_bytes = 
query_arena.metadata.num_bytes
    assert query_num_bytes is not None, "arenas must define num_bytes"
    target_num_bytes = target_reader.metadata.num_bytes
    if target_num_bytes is None:
        raise ValueError("target_reader missing num_bytes metadata")
    if query_num_bytes != target_num_bytes:
        raise ValueError("query_arena uses %d bytes while target_reader uses %d bytes" %
                         (query_num_bytes, target_num_bytes))

def report_errors(problem_report):
    for (severity, error, msg_template) in problem_report:
        if severity == "error":
            raise TypeError(msg_template % dict(metadata1 = "query",
                                                metadata2 = "target"))

######## count Tanimoto search #########

def _fp_to_arena(query_fp, metadata):
    assert len(query_fp) == metadata.num_bytes
    from . import arena
    return arena.FingerprintArena(metadata, 1, 0, 0, len(query_fp), query_fp, "", [None])

def count_tanimoto_hits_fp(query_fp, target_reader, threshold):
    return count_tanimoto_hits_arena(_fp_to_arena(query_fp, target_reader.metadata),
                                     target_reader, threshold)[0]

def count_tanimoto_hits_arena(query_arena, target_reader, threshold):
    require_matching_sizes(query_arena, target_reader)

    counts = array.array("i", (0 for i in xrange(len(query_arena))))

    lineno = target_reader._first_fp_lineno

    for block in target_reader.iter_blocks():
        err, num_lines = _chemfp.fps_count_tanimoto_hits(
            query_arena.metadata.num_bits,
            query_arena.start_padding, query_arena.end_padding,
            query_arena.storage_size, query_arena.arena, 0, -1,
            block, 0, -1,
            threshold, counts)
        lineno += num_lines
        if err:
            raise _chemfp_error(err, lineno, target_reader._filename)

    return list(counts)

######## threshold Tanimoto search #########

class TanimotoCell(ctypes.Structure):
    _fields_ = [("score", ctypes.c_double),
                ("query_index", ctypes.c_int),
                ("id_start", ctypes.c_int),
                ("id_end", ctypes.c_int)]

def threshold_tanimoto_search_fp(query_fp, target_reader, threshold):
    """Find matches in the target reader which are at least threshold similar to the query fingerprint

    The result is an FPSSearchResult instance containing the hits.
    """
    ids = []
    scores = []
    fp_size = len(query_fp)
    num_bits = fp_size * 8

    NUM_CELLS = 1000
    cells = (TanimotoCell*NUM_CELLS)()

    lineno = target_reader._first_fp_lineno

    for block in target_reader.iter_blocks():
        start = 0
        end = len(block)
        while 1:
            err, start, num_lines, num_cells = _chemfp.fps_threshold_tanimoto_search(
                num_bits, 0, 0, fp_size, query_fp, 0, -1,
                block, start, end,
                threshold, cells)
            lineno += num_lines
            if err:
                raise _chemfp_error(err, lineno, target_reader._filename)

            for cell in itertools.islice(cells, 0, num_cells):
                ids.append(block[cell.id_start:cell.id_end])
                scores.append(cell.score)
            if start == end:
                break
    return FPSSearchResult(ids, scores)

def threshold_tanimoto_search_arena(query_arena, target_reader, threshold):
    """Find matches in the target reader which are at least threshold similar to the query arena fingerprints

    The results are a list in the form [search_results1, search_results2, ...]
    where search_results are in the same order as the fingerprints in the query_arena.
""" require_matching_sizes(query_arena, target_reader) if not query_arena: return FPSSearchResults([]) results = [FPSSearchResult([], []) for i in xrange(len(query_arena))] # Compute at least 100 tanimotos per query, but at most 10,000 at a time # (That's about 200K of memory) NUM_CELLS = max(10000, len(query_arena) * 100) cells = (TanimotoCell*NUM_CELLS)() lineno = target_reader._first_fp_lineno for block in target_reader.iter_blocks(): start = 0 end = len(block) while 1: err, start, num_lines, num_cells = _chemfp.fps_threshold_tanimoto_search( query_arena.metadata.num_bits, query_arena.start_padding, query_arena.end_padding, query_arena.storage_size, query_arena.arena, 0, -1, block, start, end, threshold, cells) lineno += num_lines if err: raise _chemfp_error(err, lineno, target_reader._filename) for cell in itertools.islice(cells, 0, num_cells): id = block[cell.id_start:cell.id_end] result = results[cell.query_index] result.ids.append(id) result.scores.append(cell.score) if start == end: break return FPSSearchResults(results) ######### k-nearest Tanimoto search, with threshold # Support for peering into the chemfp_fps_heap data structure def _make_knearest_search(num_queries, k): class TanimotoHeap(ctypes.Structure): _fields_ = [("size", ctypes.c_int), ("heap_state", ctypes.c_int), ("indices", ctypes.POINTER(ctypes.c_int*k)), ("ids", ctypes.POINTER(ctypes.c_char_p*k)), ("scores", ctypes.POINTER(ctypes.c_double*k))] class KNearestSearch(ctypes.Structure): _fields_ = [("queries_start", ctypes.c_char_p), ("num_queries", ctypes.c_int), ("query_fp_size", ctypes.c_int), ("query_storage_size", ctypes.c_int), ("k", ctypes.c_int), ("search_state", ctypes.c_int), ("threshold", ctypes.c_double), ("heaps", ctypes.POINTER(TanimotoHeap*num_queries)), ("num_targets_processed", ctypes.c_int), ("_all_ids", ctypes.c_void_p), ("_all_scores", ctypes.c_void_p)] return KNearestSearch() def knearest_tanimoto_search_fp(query_fp, target_reader, k, threshold): """Find k matches in the 
target reader which are at least threshold similar to the query fingerprint

    The result is an FPSSearchResult instance containing the hits.
    """
    query_arena = _fp_to_arena(query_fp, target_reader.metadata)
    return knearest_tanimoto_search_arena(query_arena, target_reader, k, threshold)[0]

def knearest_tanimoto_search_arena(query_arena, target_reader, k, threshold):
    require_matching_sizes(query_arena, target_reader)
    if k < 0:
        raise ValueError("k must be non-negative")

    num_queries = len(query_arena)
    search = _make_knearest_search(num_queries, k)

    _chemfp.fps_knearest_search_init(
        search,
        query_arena.metadata.num_bits,
        query_arena.start_padding, query_arena.end_padding,
        query_arena.storage_size, query_arena.arena, 0, -1,
        k, threshold)

    try:
        for block in target_reader.iter_blocks():
            err = _chemfp.fps_knearest_tanimoto_search_feed(search, block, 0, -1)
            if err:
                lineno = target_reader._first_fp_lineno + search.num_targets_processed
                raise _chemfp_error(err, lineno, target_reader._filename)

        _chemfp.fps_knearest_search_finish(search)

        results = []
        for query_index in xrange(num_queries):
            heap = search.heaps[0][query_index]
            ids = []
            for i in xrange(heap.size):
                id = ctypes.string_at(heap.ids[0][i])
                ids.append(id)
            scores = heap.scores[0][:heap.size]
            results.append(FPSSearchResult(ids, scores))
        return FPSSearchResults(results)

    finally:
        _chemfp.fps_knearest_search_free(search)

def _reorder_row(ids, scores, name):
    indices = range(len(ids))
    if name == "decreasing-score":
        indices.sort(key=lambda i: (-scores[i], ids[i]))
    elif name == "increasing-score":
        indices.sort(key=lambda i: (scores[i], ids[i]))
    elif name == "decreasing-id":
        indices.sort(key=lambda i: ids[i], reverse=True)
    elif name == "increasing-id":
        indices.sort(key=lambda i: ids[i])
    elif name == "reverse":
        ids.reverse()
        scores.reverse()
        return
    elif name == "move-closest-first":
        if len(ids) <= 1:
            # Short-circuit when I don't need to do anything
            return
        x = max(scores)
        i = scores.index(x)
        ids[0], ids[i] = ids[i], ids[0]
        scores[0],
scores[i] = scores[i], scores[0] return else: raise ValueError("Unknown sort order") new_ids = [ids[i] for i in indices] new_scores = [scores[i] for i in indices] ids[:] = new_ids scores[:] = new_scores class FPSSearchResult(object): def __init__(self, ids, scores): self.ids = ids self.scores = scores def __len__(self): return len(self.ids) def __nonzero__(self): return bool(self.ids) def __iter__(self): return itertools.izip(self.ids, self.scores) def __getitem__(self, i): return (self.ids[i], self.scores[i]) def clear(self): self.ids = [] self.scores = [] def get_ids(self): return self.ids def get_scores(self): return self.scores def get_ids_and_scores(self): return zip(self.ids, self.scores) def reorder(self, order="decreasing-score"): _reorder_row(self.ids, self.scores, order) class FPSSearchResults(object): def __init__(self, results): self._results = results def __len__(self): return len(self._results) def __getitem__(self, i): return self._results[i] def __iter__(self): return iter(self._results) def iter_ids(self): for result in self._results: yield result.ids def iter_scores(self): for result in self._results: yield result.scores def iter_ids_and_scores(self): for result in self._results: yield zip(result.ids, result.scores) def reorder_all(self, order="decreasing-score"): for result in self._results: _reorder_row(result.ids, result.scores, order) def clear_all(self): for result in self._results: result.clear() chemfp-1.1p1/chemfp/futures/0000755000077000000240000000000012106315372016214 5ustar dalkestaff00000000000000chemfp-1.1p1/chemfp/futures/__init__.py0000644000077000000240000000142111661040267020326 0ustar dalkestaff00000000000000from __future__ import absolute_import # Copyright 2009 Brian Quinlan. All Rights Reserved. # Licensed to PSF under a Contributor Agreement. 
"""Execute computations asynchronously using threads or processes.""" __author__ = 'Brian Quinlan (brian@sweetapp.com)' from ._base import (FIRST_COMPLETED, FIRST_EXCEPTION, ALL_COMPLETED, CancelledError, TimeoutError, Future, Executor, wait, as_completed) from .process import ProcessPoolExecutor from .thread import ThreadPoolExecutor chemfp-1.1p1/chemfp/futures/_base.py0000644000077000000240000004632211661040267017651 0ustar dalkestaff00000000000000# Copyright 2009 Brian Quinlan. All Rights Reserved. # Licensed to PSF under a Contributor Agreement. from __future__ import with_statement import functools import logging import threading import time try: from collections import namedtuple except ImportError: from concurrent.futures._compat import namedtuple __author__ = 'Brian Quinlan (brian@sweetapp.com)' FIRST_COMPLETED = 'FIRST_COMPLETED' FIRST_EXCEPTION = 'FIRST_EXCEPTION' ALL_COMPLETED = 'ALL_COMPLETED' _AS_COMPLETED = '_AS_COMPLETED' # Possible future states (for internal use by the futures package). PENDING = 'PENDING' RUNNING = 'RUNNING' # The future was cancelled by the user... CANCELLED = 'CANCELLED' # ...and _Waiter.add_cancelled() was called by a worker. CANCELLED_AND_NOTIFIED = 'CANCELLED_AND_NOTIFIED' FINISHED = 'FINISHED' _FUTURE_STATES = [ PENDING, RUNNING, CANCELLED, CANCELLED_AND_NOTIFIED, FINISHED ] _STATE_TO_DESCRIPTION_MAP = { PENDING: "pending", RUNNING: "running", CANCELLED: "cancelled", CANCELLED_AND_NOTIFIED: "cancelled", FINISHED: "finished" } # Logger for internal use by the futures package. 
LOGGER = logging.getLogger("concurrent.futures") STDERR_HANDLER = logging.StreamHandler() LOGGER.addHandler(STDERR_HANDLER) class Error(Exception): """Base class for all future-related exceptions.""" pass class CancelledError(Error): """The Future was cancelled.""" pass class TimeoutError(Error): """The operation exceeded the given deadline.""" pass class _Waiter(object): """Provides the event that wait() and as_completed() block on.""" def __init__(self): self.event = threading.Event() self.finished_futures = [] def add_result(self, future): self.finished_futures.append(future) def add_exception(self, future): self.finished_futures.append(future) def add_cancelled(self, future): self.finished_futures.append(future) class _AsCompletedWaiter(_Waiter): """Used by as_completed().""" def __init__(self): super(_AsCompletedWaiter, self).__init__() self.lock = threading.Lock() def add_result(self, future): with self.lock: super(_AsCompletedWaiter, self).add_result(future) self.event.set() def add_exception(self, future): with self.lock: super(_AsCompletedWaiter, self).add_exception(future) self.event.set() def add_cancelled(self, future): with self.lock: super(_AsCompletedWaiter, self).add_cancelled(future) self.event.set() class _FirstCompletedWaiter(_Waiter): """Used by wait(return_when=FIRST_COMPLETED).""" def add_result(self, future): super(_FirstCompletedWaiter, self).add_result(future) self.event.set() def add_exception(self, future): super(_FirstCompletedWaiter, self).add_exception(future) self.event.set() def add_cancelled(self, future): super(_FirstCompletedWaiter, self).add_cancelled(future) self.event.set() class _AllCompletedWaiter(_Waiter): """Used by wait(return_when=FIRST_EXCEPTION and ALL_COMPLETED).""" def __init__(self, num_pending_calls, stop_on_exception): self.num_pending_calls = num_pending_calls self.stop_on_exception = stop_on_exception super(_AllCompletedWaiter, self).__init__() def _decrement_pending_calls(self): self.num_pending_calls -= 1 if 
not self.num_pending_calls:
            self.event.set()

    def add_result(self, future):
        super(_AllCompletedWaiter, self).add_result(future)
        self._decrement_pending_calls()

    def add_exception(self, future):
        super(_AllCompletedWaiter, self).add_exception(future)
        if self.stop_on_exception:
            self.event.set()
        else:
            self._decrement_pending_calls()

    def add_cancelled(self, future):
        super(_AllCompletedWaiter, self).add_cancelled(future)
        self._decrement_pending_calls()

class _AcquireFutures(object):
    """A context manager that does an ordered acquire of Future conditions."""

    def __init__(self, futures):
        self.futures = sorted(futures, key=id)

    def __enter__(self):
        for future in self.futures:
            future._condition.acquire()

    def __exit__(self, *args):
        for future in self.futures:
            future._condition.release()

def _create_and_install_waiters(fs, return_when):
    if return_when == _AS_COMPLETED:
        waiter = _AsCompletedWaiter()
    elif return_when == FIRST_COMPLETED:
        waiter = _FirstCompletedWaiter()
    else:
        pending_count = sum(
                f._state not in [CANCELLED_AND_NOTIFIED, FINISHED] for f in fs)

        if return_when == FIRST_EXCEPTION:
            waiter = _AllCompletedWaiter(pending_count, stop_on_exception=True)
        elif return_when == ALL_COMPLETED:
            waiter = _AllCompletedWaiter(pending_count, stop_on_exception=False)
        else:
            raise ValueError("Invalid return condition: %r" % return_when)

    for f in fs:
        f._waiters.append(waiter)

    return waiter

def as_completed(fs, timeout=None):
    """An iterator over the given futures that yields each as it completes.

    Args:
        fs: The sequence of Futures (possibly created by different Executors) to
            iterate over.
        timeout: The maximum number of seconds to wait. If None, then there
            is no limit on the wait time.

    Returns:
        An iterator that yields the given Futures as they complete (finished or
        cancelled).

    Raises:
        TimeoutError: If the entire result iterator could not be generated
            before the given timeout.
    """
    if timeout is not None:
        end_time = timeout + time.time()

    with _AcquireFutures(fs):
        finished = set(
                f for f in fs
                if f._state in [CANCELLED_AND_NOTIFIED, FINISHED])
        pending = set(fs) - finished
        waiter = _create_and_install_waiters(fs, _AS_COMPLETED)

    try:
        for future in finished:
            yield future

        while pending:
            if timeout is None:
                wait_timeout = None
            else:
                wait_timeout = end_time - time.time()
                if wait_timeout < 0:
                    raise TimeoutError(
                            '%d (of %d) futures unfinished' % (
                            len(pending), len(fs)))

            waiter.event.wait(wait_timeout)

            with waiter.lock:
                finished = waiter.finished_futures
                waiter.finished_futures = []
                waiter.event.clear()

            for future in finished:
                yield future
                pending.remove(future)

    finally:
        for f in fs:
            f._waiters.remove(waiter)

DoneAndNotDoneFutures = namedtuple(
        'DoneAndNotDoneFutures', 'done not_done')

def wait(fs, timeout=None, return_when=ALL_COMPLETED):
    """Wait for the futures in the given sequence to complete.

    Args:
        fs: The sequence of Futures (possibly created by different Executors) to
            wait upon.
        timeout: The maximum number of seconds to wait. If None, then there
            is no limit on the wait time.
        return_when: Indicates when this function should return. The options
            are:

            FIRST_COMPLETED - Return when any future finishes or is
                              cancelled.
            FIRST_EXCEPTION - Return when any future finishes by raising an
                              exception. If no future raises an exception
                              then it is equivalent to ALL_COMPLETED.
            ALL_COMPLETED -   Return when all futures finish or are cancelled.

    Returns:
        A named 2-tuple of sets. The first set, named 'done', contains the
        futures that completed (finished or cancelled) before the wait
        completed. The second set, named 'not_done', contains uncompleted
        futures.
    """
    with _AcquireFutures(fs):
        done = set(f for f in fs
                   if f._state in [CANCELLED_AND_NOTIFIED, FINISHED])
        not_done = set(fs) - done

        if (return_when == FIRST_COMPLETED) and done:
            return DoneAndNotDoneFutures(done, not_done)
        elif (return_when == FIRST_EXCEPTION) and done:
            if any(f for f in done
                   if not f.cancelled() and f.exception() is not None):
                return DoneAndNotDoneFutures(done, not_done)

        if len(done) == len(fs):
            return DoneAndNotDoneFutures(done, not_done)

        waiter = _create_and_install_waiters(fs, return_when)

    waiter.event.wait(timeout)
    for f in fs:
        f._waiters.remove(waiter)

    done.update(waiter.finished_futures)
    return DoneAndNotDoneFutures(done, set(fs) - done)

class Future(object):
    """Represents the result of an asynchronous computation."""

    def __init__(self):
        """Initializes the future. Should not be called by clients."""
        self._condition = threading.Condition()
        self._state = PENDING
        self._result = None
        self._exception = None
        self._waiters = []
        self._done_callbacks = []

    def _invoke_callbacks(self):
        for callback in self._done_callbacks:
            try:
                callback(self)
            except Exception:
                LOGGER.exception('exception calling callback for %r', self)

    def __repr__(self):
        with self._condition:
            if self._state == FINISHED:
                if self._exception:
                    return '<Future at %s state=%s raised %s>' % (
                        hex(id(self)),
                        _STATE_TO_DESCRIPTION_MAP[self._state],
                        self._exception.__class__.__name__)
                else:
                    return '<Future at %s state=%s returned %s>' % (
                        hex(id(self)),
                        _STATE_TO_DESCRIPTION_MAP[self._state],
                        self._result.__class__.__name__)
            return '<Future at %s state=%s>' % (
                hex(id(self)),
                _STATE_TO_DESCRIPTION_MAP[self._state])

    def cancel(self):
        """Cancel the future if possible.

        Returns True if the future was cancelled, False otherwise. A future
        cannot be cancelled if it is running or has already completed.
        """
        with self._condition:
            if self._state in [RUNNING, FINISHED]:
                return False

            if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
                return True

            self._state = CANCELLED
            self._condition.notify_all()

        self._invoke_callbacks()
        return True

    def cancelled(self):
        """Return True if the future was cancelled."""
        with self._condition:
            return self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]

    def running(self):
        """Return True if the future is currently executing."""
        with self._condition:
            return self._state == RUNNING

    def done(self):
        """Return True if the future was cancelled or finished executing."""
        with self._condition:
            return self._state in [CANCELLED, CANCELLED_AND_NOTIFIED, FINISHED]

    def __get_result(self):
        if self._exception:
            raise self._exception
        else:
            return self._result

    def add_done_callback(self, fn):
        """Attaches a callable that will be called when the future finishes.

        Args:
            fn: A callable that will be called with this future as its only
                argument when the future completes or is cancelled. The callable
                will always be called by a thread in the same process in which
                it was added. If the future has already completed or been
                cancelled then the callable will be called immediately. These
                callables are called in the order that they were added.
        """
        with self._condition:
            if self._state not in [CANCELLED, CANCELLED_AND_NOTIFIED, FINISHED]:
                self._done_callbacks.append(fn)
                return
        fn(self)

    def result(self, timeout=None):
        """Return the result of the call that the future represents.

        Args:
            timeout: The number of seconds to wait for the result if the future
                isn't done. If None, then there is no limit on the wait time.

        Returns:
            The result of the call that the future represents.

        Raises:
            CancelledError: If the future was cancelled.
            TimeoutError: If the future didn't finish executing before the given
                timeout.
            Exception: If the call raised then that exception will be raised.
        """
        with self._condition:
            if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
                raise CancelledError()
            elif self._state == FINISHED:
                return self.__get_result()

            self._condition.wait(timeout)

            if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
                raise CancelledError()
            elif self._state == FINISHED:
                return self.__get_result()
            else:
                raise TimeoutError()

    def exception(self, timeout=None):
        """Return the exception raised by the call that the future represents.

        Args:
            timeout: The number of seconds to wait for the exception if the
                future isn't done. If None, then there is no limit on the wait
                time.

        Returns:
            The exception raised by the call that the future represents or None
            if the call completed without raising.

        Raises:
            CancelledError: If the future was cancelled.
            TimeoutError: If the future didn't finish executing before the given
                timeout.
        """
        with self._condition:
            if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
                raise CancelledError()
            elif self._state == FINISHED:
                return self._exception

            self._condition.wait(timeout)

            if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
                raise CancelledError()
            elif self._state == FINISHED:
                return self._exception
            else:
                raise TimeoutError()

    # The following methods should only be used by Executors and in tests.
    def set_running_or_notify_cancel(self):
        """Mark the future as running or process any cancel notifications.

        Should only be used by Executor implementations and unit tests.

        If the future has been cancelled (cancel() was called and returned
        True) then any threads waiting on the future completing (through calls
        to as_completed() or wait()) are notified and False is returned.

        If the future was not cancelled then it is put in the running state
        (future calls to running() will return True) and True is returned.

        This method should be called by Executor implementations before
        executing the work associated with this future. If this method returns
        False then the work should not be executed.
        Returns:
            False if the Future was cancelled, True otherwise.

        Raises:
            RuntimeError: if this method was already called or if set_result()
                or set_exception() was called.
        """
        with self._condition:
            if self._state == CANCELLED:
                self._state = CANCELLED_AND_NOTIFIED
                for waiter in self._waiters:
                    waiter.add_cancelled(self)
                # self._condition.notify_all() is not necessary because
                # self.cancel() triggers a notification.
                return False
            elif self._state == PENDING:
                self._state = RUNNING
                return True
            else:
                # A Future has no 'future' attribute; report this object's own state.
                LOGGER.critical('Future %s in unexpected state: %s',
                                id(self),
                                self._state)
                raise RuntimeError('Future in unexpected state')

    def set_result(self, result):
        """Sets the return value of work associated with the future.

        Should only be used by Executor implementations and unit tests.
        """
        with self._condition:
            self._result = result
            self._state = FINISHED
            for waiter in self._waiters:
                waiter.add_result(self)
            self._condition.notify_all()
        self._invoke_callbacks()

    def set_exception(self, exception):
        """Sets the result of the future as being the given exception.

        Should only be used by Executor implementations and unit tests.
        """
        with self._condition:
            self._exception = exception
            self._state = FINISHED
            for waiter in self._waiters:
                waiter.add_exception(self)
            self._condition.notify_all()
        self._invoke_callbacks()

class Executor(object):
    """This is an abstract base class for concrete asynchronous executors."""

    def submit(self, fn, *args, **kwargs):
        """Submits a callable to be executed with the given arguments.

        Schedules the callable to be executed as fn(*args, **kwargs) and returns
        a Future instance representing the execution of the callable.

        Returns:
            A Future representing the given call.
        """
        raise NotImplementedError()

    def map(self, fn, *iterables, **kwargs):
        """Returns an iterator equivalent to map(fn, iter).

        Args:
            fn: A callable that will take as many arguments as there are
                passed iterables.
            timeout: The maximum number of seconds to wait.
If None, then there is no limit on the wait time. Returns: An iterator equivalent to: map(func, *iterables) but the calls may be evaluated out-of-order. Raises: TimeoutError: If the entire result iterator could not be generated before the given timeout. Exception: If fn(*args) raises for any values. """ timeout = kwargs.get('timeout') if timeout is not None: end_time = timeout + time.time() fs = [self.submit(fn, *args) for args in zip(*iterables)] try: for future in fs: if timeout is None: yield future.result() else: yield future.result(end_time - time.time()) finally: for future in fs: future.cancel() def shutdown(self, wait=True): """Clean-up the resources associated with the Executor. It is safe to call this method several times. Otherwise, no other methods can be called after this one. Args: wait: If True then shutdown will not return until all running futures have finished executing and the resources used by the executor have been reclaimed. """ pass def __enter__(self): return self def __exit__(self, exc_type, exc_val, exc_tb): self.shutdown(wait=True) return False chemfp-1.1p1/chemfp/futures/_compat.py0000644000077000000240000001104511661040267020214 0ustar dalkestaff00000000000000from keyword import iskeyword as _iskeyword from operator import itemgetter as _itemgetter import sys as _sys def namedtuple(typename, field_names): """Returns a new subclass of tuple with named fields. 
>>> Point = namedtuple('Point', 'x y') >>> Point.__doc__ # docstring for the new class 'Point(x, y)' >>> p = Point(11, y=22) # instantiate with positional args or keywords >>> p[0] + p[1] # indexable like a plain tuple 33 >>> x, y = p # unpack like a regular tuple >>> x, y (11, 22) >>> p.x + p.y # fields also accessable by name 33 >>> d = p._asdict() # convert to a dictionary >>> d['x'] 11 >>> Point(**d) # convert from a dictionary Point(x=11, y=22) >>> p._replace(x=100) # _replace() is like str.replace() but targets named fields Point(x=100, y=22) """ # Parse and validate the field names. Validation serves two purposes, # generating informative error messages and preventing template injection attacks. if isinstance(field_names, basestring): field_names = field_names.replace(',', ' ').split() # names separated by whitespace and/or commas field_names = tuple(map(str, field_names)) for name in (typename,) + field_names: if not all(c.isalnum() or c=='_' for c in name): raise ValueError('Type names and field names can only contain alphanumeric characters and underscores: %r' % name) if _iskeyword(name): raise ValueError('Type names and field names cannot be a keyword: %r' % name) if name[0].isdigit(): raise ValueError('Type names and field names cannot start with a number: %r' % name) seen_names = set() for name in field_names: if name.startswith('_'): raise ValueError('Field names cannot start with an underscore: %r' % name) if name in seen_names: raise ValueError('Encountered duplicate field name: %r' % name) seen_names.add(name) # Create and fill-in the class template numfields = len(field_names) argtxt = repr(field_names).replace("'", "")[1:-1] # tuple repr without parens or quotes reprtxt = ', '.join('%s=%%r' % name for name in field_names) dicttxt = ', '.join('%r: t[%d]' % (name, pos) for pos, name in enumerate(field_names)) template = '''class %(typename)s(tuple): '%(typename)s(%(argtxt)s)' \n __slots__ = () \n _fields = %(field_names)r \n def __new__(_cls, 
%(argtxt)s): return _tuple.__new__(_cls, (%(argtxt)s)) \n @classmethod def _make(cls, iterable, new=tuple.__new__, len=len): 'Make a new %(typename)s object from a sequence or iterable' result = new(cls, iterable) if len(result) != %(numfields)d: raise TypeError('Expected %(numfields)d arguments, got %%d' %% len(result)) return result \n def __repr__(self): return '%(typename)s(%(reprtxt)s)' %% self \n def _asdict(t): 'Return a new dict which maps field names to their values' return {%(dicttxt)s} \n def _replace(_self, **kwds): 'Return a new %(typename)s object replacing specified fields with new values' result = _self._make(map(kwds.pop, %(field_names)r, _self)) if kwds: raise ValueError('Got unexpected field names: %%r' %% kwds.keys()) return result \n def __getnewargs__(self): return tuple(self) \n\n''' % locals() for i, name in enumerate(field_names): template += ' %s = _property(_itemgetter(%d))\n' % (name, i) # Execute the template string in a temporary namespace and # support tracing utilities by setting a value for frame.f_globals['__name__'] namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename, _property=property, _tuple=tuple) try: exec(template, namespace) except SyntaxError: e = _sys.exc_info()[1] raise SyntaxError(e.message + ':\n' + template) result = namespace[typename] # For pickling to work, the __module__ variable needs to be set to the frame # where the named tuple is created. Bypass this step in enviroments where # sys._getframe is not defined (Jython for example). if hasattr(_sys, '_getframe'): result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__') return result chemfp-1.1p1/chemfp/futures/process.py0000644000077000000240000003377211661040267020263 0ustar dalkestaff00000000000000# Copyright 2009 Brian Quinlan. All Rights Reserved. # Licensed to PSF under a Contributor Agreement. """Implements ProcessPoolExecutor. 
The following diagram and text describe the data-flow through the system: |======================= In-process =====================|== Out-of-process ==| +----------+ +----------+ +--------+ +-----------+ +---------+ | | => | Work Ids | => | | => | Call Q | => | | | | +----------+ | | +-----------+ | | | | | ... | | | | ... | | | | | | 6 | | | | 5, call() | | | | | | 7 | | | | ... | | | | Process | | ... | | Local | +-----------+ | Process | | Pool | +----------+ | Worker | | #1..n | | Executor | | Thread | | | | | +----------- + | | +-----------+ | | | | <=> | Work Items | <=> | | <= | Result Q | <= | | | | +------------+ | | +-----------+ | | | | | 6: call() | | | | ... | | | | | | future | | | | 4, result | | | | | | ... | | | | 3, except | | | +----------+ +------------+ +--------+ +-----------+ +---------+ Executor.submit() called: - creates a uniquely numbered _WorkItem and adds it to the "Work Items" dict - adds the id of the _WorkItem to the "Work Ids" queue Local worker thread: - reads work ids from the "Work Ids" queue and looks up the corresponding WorkItem from the "Work Items" dict: if the work item has been cancelled then it is simply removed from the dict, otherwise it is repackaged as a _CallItem and put in the "Call Q". New _CallItems are put in the "Call Q" until "Call Q" is full. NOTE: the size of the "Call Q" is kept small because calls placed in the "Call Q" can no longer be cancelled with Future.cancel(). - reads _ResultItems from "Result Q", updates the future stored in the "Work Items" dict and deletes the dict entry Process #1..n: - reads _CallItems from "Call Q", executes the calls, and puts the resulting _ResultItems in "Result Q" """ from __future__ import absolute_import from __future__ import with_statement import atexit import multiprocessing import threading import weakref import sys from . 
import _base try: import queue except ImportError: import Queue as queue __author__ = 'Brian Quinlan (brian@sweetapp.com)' # Workers are created as daemon threads and processes. This is done to allow the # interpreter to exit when there are still idle processes in a # ProcessPoolExecutor's process pool (i.e. shutdown() was not called). However, # allowing workers to die with the interpreter has two undesirable properties: # - The workers would still be running during interpreter shutdown, # meaning that they would fail in unpredictable ways. # - The workers could be killed while evaluating a work item, which could # be bad if the callable being evaluated has external side-effects e.g. # writing to a file. # # To work around this problem, an exit handler is installed which tells the # workers to exit when their work queues are empty and then waits until the # threads/processes finish. _thread_references = set() _shutdown = False def _python_exit(): global _shutdown _shutdown = True for thread_reference in _thread_references: thread = thread_reference() if thread is not None: thread.join() def _remove_dead_thread_references(): """Remove inactive threads from _thread_references. Should be called periodically to prevent memory leaks in scenarios such as: >>> while True: ... t = ThreadPoolExecutor(max_workers=5) ... t.map(int, ['1', '2', '3', '4', '5']) """ for thread_reference in set(_thread_references): if thread_reference() is None: _thread_references.discard(thread_reference) # Controls how many more calls than processes will be queued in the call queue. # A smaller number will mean that processes spend more time idle waiting for # work while a larger number will make Future.cancel() succeed less frequently # (Futures in the call queue cannot be cancelled). 
EXTRA_QUEUED_CALLS = 1 class _WorkItem(object): def __init__(self, future, fn, args, kwargs): self.future = future self.fn = fn self.args = args self.kwargs = kwargs class _ResultItem(object): def __init__(self, work_id, exception=None, result=None): self.work_id = work_id self.exception = exception self.result = result class _CallItem(object): def __init__(self, work_id, fn, args, kwargs): self.work_id = work_id self.fn = fn self.args = args self.kwargs = kwargs def _process_worker(call_queue, result_queue, shutdown): """Evaluates calls from call_queue and places the results in result_queue. This worker is run in a separate process. Args: call_queue: A multiprocessing.Queue of _CallItems that will be read and evaluated by the worker. result_queue: A multiprocessing.Queue of _ResultItems that will be written to by the worker. shutdown: A multiprocessing.Event that will be set as a signal to the worker that it should exit when call_queue is empty. """ while True: try: call_item = call_queue.get(block=True, timeout=0.1) except queue.Empty: if shutdown.is_set(): return else: try: r = call_item.fn(*call_item.args, **call_item.kwargs) except BaseException: e = sys.exc_info()[1] result_queue.put(_ResultItem(call_item.work_id, exception=e)) else: result_queue.put(_ResultItem(call_item.work_id, result=r)) def _add_call_item_to_queue(pending_work_items, work_ids, call_queue): """Fills call_queue with _WorkItems from pending_work_items. This function never blocks. Args: pending_work_items: A dict mapping work ids to _WorkItems e.g. {5: <_WorkItem...>, 6: <_WorkItem...>, ...} work_ids: A queue.Queue of work ids e.g. Queue([5, 6, ...]). Work ids are consumed and the corresponding _WorkItems from pending_work_items are transformed into _CallItems and put in call_queue. call_queue: A multiprocessing.Queue that will be filled with _CallItems derived from _WorkItems. 
""" while True: if call_queue.full(): return try: work_id = work_ids.get(block=False) except queue.Empty: return else: work_item = pending_work_items[work_id] if work_item.future.set_running_or_notify_cancel(): call_queue.put(_CallItem(work_id, work_item.fn, work_item.args, work_item.kwargs), block=True) else: del pending_work_items[work_id] continue def _queue_manangement_worker(executor_reference, processes, pending_work_items, work_ids_queue, call_queue, result_queue, shutdown_process_event): """Manages the communication between this process and the worker processes. This function is run in a local thread. Args: executor_reference: A weakref.ref to the ProcessPoolExecutor that owns this thread. Used to determine if the ProcessPoolExecutor has been garbage collected and that this function can exit. processes: A list of the multiprocessing.Process instances used as workers. pending_work_items: A dict mapping work ids to _WorkItems e.g. {5: <_WorkItem...>, 6: <_WorkItem...>, ...} work_ids_queue: A queue.Queue of work ids e.g. Queue([5, 6, ...]). call_queue: A multiprocessing.Queue that will be filled with _CallItems derived from _WorkItems for processing by the process workers. result_queue: A multiprocessing.Queue of _ResultItems generated by the process workers. shutdown_process_event: A multiprocessing.Event used to signal the process workers that they should exit when their work queue is empty. """ while True: _add_call_item_to_queue(pending_work_items, work_ids_queue, call_queue) try: result_item = result_queue.get(block=True, timeout=0.1) except queue.Empty: executor = executor_reference() # No more work items can be added if: # - The interpreter is shutting down OR # - The executor that owns this worker has been collected OR # - The executor that owns this worker has been shutdown. if _shutdown or executor is None or executor._shutdown_thread: # Since no new work items can be added, it is safe to shutdown # this thread if there are no pending work items. 
if not pending_work_items: shutdown_process_event.set() # If .join() is not called on the created processes then # some multiprocessing.Queue methods may deadlock on Mac OS # X. for p in processes: p.join() return del executor else: work_item = pending_work_items[result_item.work_id] del pending_work_items[result_item.work_id] if result_item.exception: work_item.future.set_exception(result_item.exception) else: work_item.future.set_result(result_item.result) class ProcessPoolExecutor(_base.Executor): def __init__(self, max_workers=None): """Initializes a new ProcessPoolExecutor instance. Args: max_workers: The maximum number of processes that can be used to execute the given calls. If None or not given then as many worker processes will be created as the machine has processors. """ _remove_dead_thread_references() if max_workers is None: self._max_workers = multiprocessing.cpu_count() else: self._max_workers = max_workers # Make the call queue slightly larger than the number of processes to # prevent the worker processes from idling. But don't make it too big # because futures in the call queue cannot be cancelled. self._call_queue = multiprocessing.Queue(self._max_workers + EXTRA_QUEUED_CALLS) self._result_queue = multiprocessing.Queue() self._work_ids = queue.Queue() self._queue_management_thread = None self._processes = set() # Shutdown is a two-step process. 
self._shutdown_thread = False self._shutdown_process_event = multiprocessing.Event() self._shutdown_lock = threading.Lock() self._queue_count = 0 self._pending_work_items = {} def _start_queue_management_thread(self): if self._queue_management_thread is None: self._queue_management_thread = threading.Thread( target=_queue_manangement_worker, args=(weakref.ref(self), self._processes, self._pending_work_items, self._work_ids, self._call_queue, self._result_queue, self._shutdown_process_event)) self._queue_management_thread.daemon = True self._queue_management_thread.start() _thread_references.add(weakref.ref(self._queue_management_thread)) def _adjust_process_count(self): for _ in range(len(self._processes), self._max_workers): p = multiprocessing.Process( target=_process_worker, args=(self._call_queue, self._result_queue, self._shutdown_process_event)) p.start() self._processes.add(p) def submit(self, fn, *args, **kwargs): with self._shutdown_lock: if self._shutdown_thread: raise RuntimeError('cannot schedule new futures after shutdown') f = _base.Future() w = _WorkItem(f, fn, args, kwargs) self._pending_work_items[self._queue_count] = w self._work_ids.put(self._queue_count) self._queue_count += 1 self._start_queue_management_thread() self._adjust_process_count() return f submit.__doc__ = _base.Executor.submit.__doc__ def shutdown(self, wait=True): with self._shutdown_lock: self._shutdown_thread = True if wait: if self._queue_management_thread: self._queue_management_thread.join() # To reduce the risk of opening too many files, remove references to # objects that use file descriptors. 
self._queue_management_thread = None self._call_queue = None self._result_queue = None self._shutdown_process_event = None self._processes = None shutdown.__doc__ = _base.Executor.shutdown.__doc__ atexit.register(_python_exit) chemfp-1.1p1/chemfp/futures/README0000644000077000000240000000007311661040267017077 0ustar dalkestaff00000000000000# This code comes from http://pypi.python.org/pypi/futures chemfp-1.1p1/chemfp/futures/thread.py0000644000077000000240000001132111661040267020036 0ustar dalkestaff00000000000000# Copyright 2009 Brian Quinlan. All Rights Reserved. # Licensed to PSF under a Contributor Agreement. """Implements ThreadPoolExecutor.""" from __future__ import absolute_import from __future__ import with_statement import atexit import threading import weakref import sys from . import _base try: import queue except ImportError: import Queue as queue __author__ = 'Brian Quinlan (brian@sweetapp.com)' # Workers are created as daemon threads. This is done to allow the interpreter # to exit when there are still idle threads in a ThreadPoolExecutor's thread # pool (i.e. shutdown() was not called). However, allowing workers to die with # the interpreter has two undesirable properties: # - The workers would still be running during interpretor shutdown, # meaning that they would fail in unpredictable ways. # - The workers could be killed while evaluating a work item, which could # be bad if the callable being evaluated has external side-effects e.g. # writing to a file. # # To work around this problem, an exit handler is installed which tells the # workers to exit when their work queues are empty and then waits until the # threads finish. _thread_references = set() _shutdown = False def _python_exit(): global _shutdown _shutdown = True for thread_reference in _thread_references: thread = thread_reference() if thread is not None: thread.join() def _remove_dead_thread_references(): """Remove inactive threads from _thread_references. 
Should be called periodically to prevent memory leaks in scenarios such as: >>> while True: ... t = ThreadPoolExecutor(max_workers=5) ... t.map(int, ['1', '2', '3', '4', '5']) """ for thread_reference in set(_thread_references): if thread_reference() is None: _thread_references.discard(thread_reference) atexit.register(_python_exit) class _WorkItem(object): def __init__(self, future, fn, args, kwargs): self.future = future self.fn = fn self.args = args self.kwargs = kwargs def run(self): if not self.future.set_running_or_notify_cancel(): return try: result = self.fn(*self.args, **self.kwargs) except BaseException: e = sys.exc_info()[1] self.future.set_exception(e) else: self.future.set_result(result) def _worker(executor_reference, work_queue): try: while True: try: work_item = work_queue.get(block=True, timeout=0.1) except queue.Empty: executor = executor_reference() # Exit if: # - The interpreter is shutting down OR # - The executor that owns the worker has been collected OR # - The executor that owns the worker has been shutdown. if _shutdown or executor is None or executor._shutdown: return del executor else: work_item.run() except BaseException: _base.LOGGER.critical('Exception in worker', exc_info=True) class ThreadPoolExecutor(_base.Executor): def __init__(self, max_workers): """Initializes a new ThreadPoolExecutor instance. Args: max_workers: The maximum number of threads that can be used to execute the given calls. 
""" _remove_dead_thread_references() self._max_workers = max_workers self._work_queue = queue.Queue() self._threads = set() self._shutdown = False self._shutdown_lock = threading.Lock() def submit(self, fn, *args, **kwargs): with self._shutdown_lock: if self._shutdown: raise RuntimeError('cannot schedule new futures after shutdown') f = _base.Future() w = _WorkItem(f, fn, args, kwargs) self._work_queue.put(w) self._adjust_thread_count() return f submit.__doc__ = _base.Executor.submit.__doc__ def _adjust_thread_count(self): # TODO(bquinlan): Should avoid creating new threads if there are more # idle threads than items in the work queue. if len(self._threads) < self._max_workers: t = threading.Thread(target=_worker, args=(weakref.ref(self), self._work_queue)) t.daemon = True t.start() self._threads.add(t) _thread_references.add(weakref.ref(t)) def shutdown(self, wait=True): with self._shutdown_lock: self._shutdown = True if wait: for t in self._threads: t.join() shutdown.__doc__ = _base.Executor.shutdown.__doc__ chemfp-1.1p1/chemfp/io.py0000644000077000000240000002245512055226640015512 0ustar dalkestaff00000000000000from __future__ import with_statement import re import os import sys import binascii from datetime import datetime if sys.platform.startswith("win"): DEV_STDIN = "CON" else: if os.path.exists("/dev/stdin"): DEV_STDIN = "/dev/stdin" else: DEV_STDIN = None def utcnow(): return datetime.utcnow().isoformat().split(".", 1)[0] # XXX should this be here? 
def remove_special_characters_from_id(id): if "\n" in id: id = id.splitlines()[0] if "\t" in id: id = id.replace("\t", "") if " " in id: id = id.strip() return id #### _compression_extensions = { ".gz": ".gz", ".gzip": ".gz", ".bz2": ".bz2", ".bzip": ".bz2", ".bzip2": ".bz2", ".xz": ".xz", } _compression_regex = "|".join(re.escape(ext) for ext in _compression_extensions) _format_pat = re.compile("^([a-zA-Z0-9]+)(" + _compression_regex + ")?$") def normalize_format(source, format, default=("fps", "")): if source is None: # Read from stdin filename = None elif isinstance(source, basestring): # This is a filename filename = source elif hasattr(source, "read"): # This is a Python file object filename = getattr(source, "name", None) else: raise ValueError("Unknown source type %r" % (source,)) if format is not None: # This must be of the form [. ] m = _format_pat.match(format) if m is None: if "." in format: if _format_pat.match(format.split(".")[0]): raise ValueError("Unsupported compression in format %r" % (format,)) raise ValueError("Incorrect format syntax %r" % (format,)) if m.group(2): # This is a compression compression = _compression_extensions[m.group(2)] else: compression = "" format_name = m.group(1) return (format_name, compression) if filename is None: # Reading from stdin or an unnamed file-like object with no # specified format Not going to sniff the input. Instead, just return default # The filename could have 0, 1 or 2 extensions base, ext = os.path.splitext(filename) if ext == "": # No extensions, use the default return default ext = ext.lower() # If it's not a compression extension then it's a format indicator if ext not in _compression_extensions: # the [1:] is to remove the leading "." 
return (ext[1:], "") # Found a compression, now look for the format compression = _compression_extensions[ext] base, ext = os.path.splitext(base) if ext == "": # compression found but not the actual format type return (default[0], compression) # The [1:] is to remove the leading "." format_name = ext[1:].lower() return (format_name, compression) def get_filename(source): if source is None: return None elif isinstance(source, basestring): return source else: return getattr(source, "name", None) #### def open_output(destination): if destination is None: return sys.stdout if not isinstance(destination, basestring): return destination base, ext = os.path.splitext(destination) ext = ext.lower() if ext not in _compression_extensions: return open(destination, "w") else: return open_compressed_output(destination, ext) def open_compressed_output(destination, compression): if not compression: if destination is None: return sys.stdout elif isinstance(destination, basestring): return open(destination, "w") else: return destination if compression == ".gz": import gzip if destination is None: return gzip.GzipFile(mode="w", fileobj=sys.stdout) elif isinstance(destination, basestring): return gzip.open(destination, "w") else: return gzip.GzipFile(mode="w", fileobj=destination) if compression == ".bz2": import bz2 if destination is None: if not os.path.exists("/dev/stdout"): raise NotImplementedError("Cannot write bz2 compressed data to stdout on this platform") return bz2.BZ2File("/dev/stdout", "w") elif isinstance(destination, basestring): return bz2.BZ2File(destination, "w") else: raise NotImplementedError("bzip2 compression to file-like objects is not supported") if compression == ".xz": raise NotImplementedError("xz compression is not supported") raise ValueError("Unknown compression type %r" % (compression,)) def open_compressed_input_universal(source, compression): if not compression: if source is None: return sys.stdin elif isinstance(source, basestring): return 
open(source, "rU") else: return source if compression == ".gz": import gzip if source is None: # GzipFile doesn't have a "U"/universal file mode? return gzip.GzipFile(fileobj=sys.stdin) elif isinstance(source, basestring): return gzip.open(source, "r") else: return gzip.GzipFile(fileobj=source) if compression == ".bz2": import bz2 if source is None: # bz2 doesn't support Python objects. On some platforms # I can work around the problem if not os.path.exists("/dev/stdin"): raise NotImplementedError("Cannot read compressed bzip2 data from stdin on this platform") return bz2.BZ2File("/dev/stdin", "rU") elif isinstance(source, basestring): return bz2.BZ2File(source, "rU") else: # Well, I could emulate it, but I'm not going to raise NotImplementedError("bzip2 decompression from file-like objects is not supported") if compression == ".xz": raise NotImplementedError("xz decompression is not supported") raise ValueError("Unknown compression type %r" % (compression,)) def write_fps1_magic(outfile): outfile.write("#FPS1\n") def write_fps1_header(outfile, metadata): lines = [] if metadata.num_bits is not None: lines.append("#num_bits=%d\n" % metadata.num_bits) if metadata.type is not None: assert "\n" not in metadata.type lines.append("#type=" + metadata.type.encode("ascii")+"\n") # type cannot contain non-ASCII characters if metadata.software is not None: assert "\n" not in metadata.software lines.append("#software=" + metadata.software.encode("utf8")+"\n") if metadata.aromaticity is not None: lines.append("#aromaticity=" + metadata.aromaticity.encode("ascii") + "\n") for source in metadata.sources: # Ignore newlines in the source filename, if present source = source.replace("\n", "") lines.append("#source=" + source.encode("utf8")+"\n") if metadata.date is not None: lines.append("#date=" + metadata.date.encode("ascii")+"\n") # date cannot contain non-ASCII characters outfile.writelines(lines) class _IgnorePipeErrors(object): def __enter__(self): return None def __exit__(self, 
type, value, tb): if type is None: return False import errno # Catch any pipe errors, like when piping the output to "head -10" if issubclass(type, IOError) and value[0] == errno.EPIPE: return True return False ignore_pipe_errors = _IgnorePipeErrors() def write_fps1_fingerprint(outfile, fp, id): if "\t" in id: raise ValueError("Fingerprint ids must not contain a tab: %r" % (id,)) if "\n" in id: raise ValueError("Fingerprint ids must not contain a newline: %r" % (id,)) if not id: raise ValueError("Fingerprint ids must not be the empty string") outfile.write("%s\t%s\n" % (binascii.hexlify(fp), id)) # This is a bit of a hack. If I open a file then I want to close it, # but if I use stdout then I don't want to close it. class _closing_output(object): def __init__(self, destination): self.destination = destination self.output = open_output(destination) def __enter__(self): return self.output def __exit__(self, *exec_info): if isinstance(self.destination, basestring): self.output.close() def write_fps1_output(reader, destination, metadata=None): if metadata is None: metadata = reader.metadata hexlify = binascii.hexlify with _closing_output(destination) as outfile: with ignore_pipe_errors: write_fps1_magic(outfile) write_fps1_header(outfile, metadata) for i, (id, fp) in enumerate(reader): if "\t" in id: raise ValueError("Fingerprint ids must not contain a tab: %r in record %d" % (id, i+1)) if "\n" in id: raise ValueError("Fingerprint ids must not contain a newline: %r in record %d" % (id, i+1)) if not id: raise ValueError("Fingerprint ids must not be the empty string in record %d" % (i+1,)) outfile.write("%s\t%s\n" % (hexlify(fp), id)) chemfp-1.1p1/chemfp/openbabel.py0000644000077000000240000003116312104061723017020 0ustar dalkestaff00000000000000"Create Open Babel fingerprints" # Copyright (c) 2010-2013 Andrew Dalke Scientific, AB (Gothenburg, Sweden) # See the contents of "__init__.py" for full license details. 
from __future__ import absolute_import import sys import os import struct import warnings import itertools import sys import openbabel as ob from . import ParseError from . import io from . import types from . import error_handlers # OpenBabel really wants these two variables. I get a segfault if # BABEL_LIBDIR isn't defined, and from the mailing list, some of the # code doesn't work correctly without BABEL_DATADIR. I've had problems # where I forget to set these variables, so check for them now and # warn about possible problems. #if "BABEL_LIBDIR" not in os.environ: # warnings.warn("BABEL_LIBDIR is not set") #else: # ... check that SMILES and a few other things are on the path ... # but note that BABEL_LIBDIR is a colon (or newline or control-return?) # separated field whose behaviour isn't well defined in the docs. # I'm not going to do additional checking without a stronger need. # This is the only thing which I consider to be public __all__ = ["read_structures"] # This is a "standard" size according to the struct module # documentation, so the following is an excess of caution if struct.calcsize(" the version string for the OpenBabel toolkit" obconversion = ob.OBConversion() obconversion.SetInFormat("smi") obconversion.SetOutFormat("pdb") obmol = ob.OBMol() obconversion.ReadString(obmol, "C") for line in obconversion.WriteString(obmol).splitlines(): if "GENERATED BY OPEN BABEL" in line: return line.split()[-1] return "" try: from openbabel import OBReleaseVersion except ImportError: OBReleaseVersion = _emulated_OBReleaseVersion _ob_version = OBReleaseVersion() SOFTWARE = "OpenBabel/" + _ob_version # OpenBabel fingerprints are stored as vector. On all # the machines I use, ints have 32 bits. # OpenBabel bit lengths must be at least sizeof(int)*8 bits long and # must be a factor of two. I have no idea why this is required. # OpenBabel supports new fingerprints through a plugin system. I got # it working thanks to Noel O'Boyle's excellent work with Cinfony. 
I # then found out that the OB API doesn't have any way to get the # number of bits in the fingerprint. The size is rounded up to the # next power of two, so FP4 (307 bits) needs 512 bits (16 ints) # instead of 320 bits (10 ints). That means I can't even get close to # guessing the bitsize. # In the end, I hard-coded the supported fingerprints into the system. ############ # I could have written a more general function which created these but # there's only a few fingerprints lengths to worry about. # This needs 128 bytes, for 1024 bits # vectorUnsignedInt will contain 32 32-bit words = 1024 bits _ob_get_fingerprint = {} def _init(): # ob.OBConversion() for name in ("FP2", "FP3", "FP4", "MACCS"): ob_fingerprinter = ob.OBFingerprint.FindFingerprint(name) if ob_fingerprinter is None: _ob_get_fingerprint[name] = (None, None) else: _ob_get_fingerprint[name] = (ob_fingerprinter, ob_fingerprinter.GetFingerprint) if _ob_get_fingerprint["FP2"][0] is None: raise ImportError("Unable to load OpenBabel FP2 fingerprinter. Check $BABEL_LIBDIR") n = _ob_get_fingerprint["FP2"][0].Getbitsperint() if n != 32: raise AssertionError( "The chemfp.ob module assumes OB fingerprints have 32 bit integers") _init() def calc_FP2(mol, fp=None, get_fingerprint=_ob_get_fingerprint["FP2"][1], _pack_1024 = struct.Struct("<" + "I"*32).pack): if fp is None: fp = ob.vectorUnsignedInt() get_fingerprint(mol, fp) return _pack_1024(*fp) # This needs 7 bytes, for 56 bits. # vectorUnsignedInt will contain 2 32-bit words = 64 bits def calc_FP3(mol, fp=None, get_fingerprint=_ob_get_fingerprint["FP3"][1], _pack_64 = struct.Struct(" (id, OBMol) iterator Iterate over structures from filename, returning the structure title and OBMol for each record. The structure is assumed to be in normalized_format(filename, format) format. If filename is None then this reads from stdin instead of the named file. 
""" if not (filename is None or isinstance(filename, basestring)): raise TypeError("'filename' must be None or a string") error_handler = error_handlers.get_parse_error_handler(errors) obconversion = ob.OBConversion() format_name, compression = io.normalize_format(filename, format, default=("smi", "")) if compression not in ("", ".gz"): raise ValueError("Unsupported compression type for %r" % (filename,)) # OpenBabel auto-detects gzip compression. if not obconversion.SetInFormat(format_name): raise ValueError("Unknown structure format %r" % (format_name,)) obmol = ob.OBMol() if not filename: filename = io.DEV_STDIN if filename is None: raise NotImplementedError("Unable to read from stdin on this operating system") success = obconversion.ReadFile(obmol, filename) filename_repr = "" else: # Deal with OpenBabel's logging if HAS_ERROR_LOG: ob.obErrorLog.ClearLog() lvl = ob.obErrorLog.GetOutputLevel() ob.obErrorLog.SetOutputLevel(-1) # Suppress messages to stderr success = obconversion.ReadFile(obmol, filename) filename_repr = repr(filename) errmsg = None if HAS_ERROR_LOG: ob.obErrorLog.SetOutputLevel(lvl) # Restore message level if ob.obErrorLog.GetErrorMessageCount(): errmsg = _get_ob_error(ob.obErrorLog) if not success: # Either there was an error or there were no structures. open(filename).close() # Make sure the file can be opened for reading # If I get here then the file exists and is readable. # If there was an error message then use it. if errmsg is not None: # Okay, don't know what's going on. Report OB's error raise IOError(5, errmsg, filename) # We've opened the file. Switch to the iterator. return _file_reader(obconversion, obmol, success, id_tag, filename_repr, error_handler) def _file_reader(obconversion, obmol, success, id_tag, filename_repr, error_handler): def where(): return " for record #%d of %s" % (recno, filename_repr) # How do I detect if the input contains a failure? 
    recno = 0
    if id_tag is None:
        while success:
            recno += 1
            title = obmol.GetTitle()
            id = io.remove_special_characters_from_id(title)
            if not id:
                error_handler("Missing title" + where())
            else:
                yield id, obmol
            obmol.Clear()
            success = obconversion.Read(obmol)
    else:
        while success:
            recno += 1
            obj = obmol.GetData(id_tag)
            if obj is None:
                error_handler("Missing id tag %r%s" % (id_tag, where()))
            else:
                dirty_id = obj.GetValue()
                id = io.remove_special_characters_from_id(dirty_id)
                if not id:
                    msg = "Empty id tag %r" % (id_tag,)
                    error_handler(msg + where())
                else:
                    yield id, obmol
            obmol.Clear()
            success = obconversion.Read(obmol)

#####

from .types import FingerprintFamilyConfig

def _read_structures(metadata, source, format, id_tag, errors):
    if metadata.aromaticity is not None:
        raise ValueError("Open Babel does not support alternate aromaticity models "
                         "(want aromaticity=%r)" % metadata.aromaticity)
    return read_structures(source, format, id_tag, errors)

_base = FingerprintFamilyConfig(
    software = SOFTWARE,
    read_structures = _read_structures,
    )

OpenBabelFP2FingerprintFamily_v1 = _base.clone(
    name = "OpenBabel-FP2/1",
    num_bits = 1021,
    make_fingerprinter = lambda: calc_FP2)

OpenBabelFP3FingerprintFamily_v1 = _base.clone(
    name = "OpenBabel-FP3/1",
    num_bits = 55,
    make_fingerprinter = lambda: calc_FP3)

OpenBabelFP4FingerprintFamily_v1 = _base.clone(
    name = "OpenBabel-FP4/1",
    num_bits = 307,
    make_fingerprinter = lambda: calc_FP4)

def _check_calc_MACCS_v1():
    assert HAS_MACCS
    assert MACCS_VERSION == 1
    return calc_MACCS

OpenBabelMACCSFingerprintFamily_v1 = _base.clone(
    name = "OpenBabel-MACCS/1",
    num_bits = 166,
    make_fingerprinter = _check_calc_MACCS_v1)

def _check_calc_MACCS_v2():
    assert HAS_MACCS
    assert MACCS_VERSION == 2
    return calc_MACCS

OpenBabelMACCSFingerprintFamily_v2 = _base.clone(
    name = "OpenBabel-MACCS/2",
    num_bits = 166,
    make_fingerprinter = _check_calc_MACCS_v2)

# ===== chemfp-1.1p1/chemfp/openbabel_patterns.py =====

from __future__ import absolute_import

from openbabel import OBSmartsPattern, OBBitVec

from . import openbabel
from . import pattern_fingerprinter
from . import types
from . import __version__ as chemfp_version

SOFTWARE = openbabel.SOFTWARE + (" chemfp/%s" % (chemfp_version,))

# HasMatch was added to OpenBabel 2.3.
_HAS_HASMATCH = hasattr(OBSmartsPattern(), "HasMatch")

class Matcher(object):
    def __init__(self, ob_matcher):
        self.ob_matcher = ob_matcher

    if _HAS_HASMATCH:
        def HasMatch(self, mol):
            return self.ob_matcher.HasMatch(mol)
    else:
        def HasMatch(self, mol):
            return self.ob_matcher.Match(mol, True)

    def NumUniqueMatches(self, mol):
        self.ob_matcher.Match(mol)
        return sum(1 for x in self.ob_matcher.GetUMapList())

class HydrogenMatcher(object):
    def __init__(self, max_count):
        self.max_count = max_count

    def HasMatch(self, mol):
        for i in range(1, mol.NumAtoms()+1):
            atom = mol.GetAtom(i)
            if atom.GetAtomicNum() == 1:
                return True
            if atom.ImplicitHydrogenCount():
                return True
        return False

    def NumUniqueMatches(self, mol):
        count = 0
        for i in range(1, mol.NumAtoms()+1):
            atom = mol.GetAtom(i)
            if atom.GetAtomicNum() == 1:
                count += 1
            count += atom.ImplicitHydrogenCount()
            if count >= self.max_count:
                return count
        return count

class AromaticRings(object):
    def __init__(self, max_count):
        self.max_count = max_count
        # The single ring case is easy; if there's an aromatic atom in a ring
        # then there's an aromatic ring.
        self._single_matcher = OBSmartsPattern()
        assert self._single_matcher.Init("[aR]")

    if _HAS_HASMATCH:
        def HasMatch(self, mol):
            return self._single_matcher.HasMatch(mol)
    else:
        def HasMatch(self, mol):
            return self._single_matcher.Match(mol, True)

    def NumUniqueMatches(self, mol):
        num_rings = 0
        for ring in mol.GetSSSR():
            if ring.IsAromatic():
                num_rings += 1
                if num_rings == self.max_count:
                    break
        return num_rings

###

class HeteroAromaticRings(object):
    def __init__(self, max_count):
        self.max_count = max_count
        # The single ring case is easy; if there's a non-carbon aromatic atom
        # in a ring then there's a hetero-aromatic ring.
        self._single_matcher = OBSmartsPattern()
        assert self._single_matcher.Init("[aR;!#6]")

    if _HAS_HASMATCH:
        def HasMatch(self, mol):
            return self._single_matcher.HasMatch(mol)
    else:
        def HasMatch(self, mol):
            return self._single_matcher.Match(mol, True)

    def NumUniqueMatches(self, mol):
        num_rings = 0
        for ring in mol.GetSSSR():
            if (ring.IsAromatic() and
                any(mol.GetAtom(atom_id).GetAtomicNum() != 6 for atom_id in ring._path)):
                num_rings += 1
                if num_rings == self.max_count:
                    break
        return num_rings

class NumFragments(object):
    def __init__(self, max_count):
        if max_count >= 3:
            # NotImplemented is a constant, not an exception class;
            # NotImplementedError is the one that can be raised.
            raise NotImplementedError(" >= 3 not implemented")

    def HasMatch(self, mol):
        # Has at least one fragment, which is the same as having atoms
        return (mol.NumAtoms() > 0)

    def NumUniqueMatches(self, mol):
        # If the largest fragment is not the same size as the entire molecule
        # then there are at least two fragments
        n = mol.NumAtoms()
        if n == 0:
            # No atoms means no fragments
            return 0
        v = OBBitVec()
        mol.FindLargestFragment(v)
        if v.CountBits() == n:
            return 1
        return 2

_pattern_classes = {
    "": HydrogenMatcher,
    "": AromaticRings,
    "": HeteroAromaticRings,
    "": NumFragments,
    }

def ob_compile_pattern(pattern, max_count):
    if pattern in _pattern_classes:
        return _pattern_classes[pattern](max_count)

    if pattern.startswith("<"):
        raise NotImplementedError(pattern)

    matcher = OBSmartsPattern()
    if not matcher.Init(pattern):
raise pattern_fingerprinter.UnsupportedPatternError( pattern, "Uninterpretable SMARTS pattern") return Matcher(matcher) class OBPatternFingerprinter(pattern_fingerprinter.PatternFingerprinter): def __init__(self, patterns): assert patterns is not None super(OBPatternFingerprinter, self).__init__(patterns, ob_compile_pattern) def fingerprint(self, mol): bytes = [0] * self.num_bytes for matcher, largest_count, count_info_tuple in self.matcher_definitions: #print matcher, largest_count, count_info_tuple if largest_count == 1: if matcher.HasMatch(mol): count_info = count_info_tuple[0] bytes[count_info.byteno] |= count_info.bitmask else: # There's no good way to get the number of unique # matches without iterating over all of them. actual_count = matcher.NumUniqueMatches(mol) for count_info in count_info_tuple: if actual_count >= count_info.count: bytes[count_info.byteno] |= count_info.bitmask else: break return "".join(map(chr, bytes)) class _CachedFingerprinters(dict): def __missing__(self, name): patterns = pattern_fingerprinter._load_named_patterns(name) fingerprinter = OBPatternFingerprinter(patterns) self[name] = fingerprinter return fingerprinter _cached_fingerprinters = _CachedFingerprinters() # XXX Why are there two "Fingerprinter" classes? # XX Shouldn't they be merged? 
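# A minimal sketch of the dict.__missing__ idiom that _CachedFingerprinters
# relies on above: the first lookup of a name builds and stores the value,
# and later lookups return the stored object without rebuilding it.
# The names load_patterns and CachedLoaders are hypothetical stand-ins
# for illustration only; they are not part of chemfp.

```python
load_calls = []

def load_patterns(name):
    # Hypothetical stand-in for an expensive load-and-compile step.
    load_calls.append(name)
    return "compiled:" + name

class CachedLoaders(dict):
    def __missing__(self, name):
        # Python calls this only when cache[name] is not already present.
        value = load_patterns(name)
        self[name] = value   # store it so later lookups skip __missing__
        return value

cache = CachedLoaders()
first = cache["substruct"]    # triggers load_patterns("substruct")
second = cache["substruct"]   # served from the dict, no second load
```

# Note that __missing__ is only consulted by the cache[name] form;
# cache.get(name) would return None for an uncached name.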
_base = openbabel._base.clone(
    software = SOFTWARE)

SubstructOpenBabelFingerprinter_v1 = _base.clone(
    name = "ChemFP-Substruct-OpenBabel/1",
    num_bits = 881,
    make_fingerprinter = lambda: _cached_fingerprinters["substruct"].fingerprint)

RDMACCSOpenBabelFingerprinter_v1 = _base.clone(
    name = "RDMACCS-OpenBabel/1",
    num_bits = 166,
    make_fingerprinter = lambda: _cached_fingerprinters["rdmaccs"].fingerprint)

# ===== chemfp-1.1p1/chemfp/openeye.py =====

"""Create OpenEye fingerprints

"""
# Copyright (c) 2010-2013 Andrew Dalke Scientific, AB (Gothenburg, Sweden)
# Licensed under "the MIT license"
# See the contents of COPYING or "__init__.py" for full license details.

from __future__ import absolute_import

import sys
import os
import errno
import ctypes
import warnings

from openeye.oechem import *
from openeye.oegraphsim import *

from . import ParseError
from . import types
from . import io
from . import error_handlers
from . import argparse

__all__ = ["read_structures", "get_path_fingerprinter", "get_maccs_fingerprinter"]

class UnknownFormat(KeyError):
    def __str__(self):
        return "Unknown format %r" % (self.args[0],)

#############

# Used when generating the FPS header
SOFTWARE = "OEGraphSim/%(release)s (%(version)s)" % dict(
    release = OEGraphSimGetRelease(),
    version = OEGraphSimGetVersion())

OEGRAPHSIM_API_VERSION = "1"
if "OEMakeCircularFP" in globals():
    OEGRAPHSIM_API_VERSION = "2"

if OEGRAPHSIM_API_VERSION == "1":
    # Set some v2 values because it simplifies code later on.
    # (Beware: v2 actually uses different values than these!)
OEFPAtomType_DefaultPathAtom = OEFPAtomType_DefaultAtom OEFPBondType_DefaultPathBond = OEFPBondType_DefaultBond OEFPAtomType_DefaultCircularAtom = OEFPAtomType_DefaultAtom OEFPBondType_DefaultCircularBond = OEFPBondType_DefaultBond OEFPAtomType_DefaultTreeAtom = OEFPAtomType_DefaultAtom OEFPBondType_DefaultTreeBond = OEFPBondType_DefaultBond ##### Handle the atom and bond type flags for path fingerprints # The atom and bond type flags can be specified on the command-line # # --atype=DefaultAtom --btype=BondOrder,InRing # --atype AtomicNumber,InRing --btype DefaultBond,InRing # # The type fields may be separated by either a "," or a "|". # The relevant OpenEye function (OEGetFPAtomType() and # OEGetFPBondType()) use a "|" but that requires escaping for # the shell, so I support "," as well. # There's another conversion of the integer type values into a string # representation, used when generating the canonical form of the # generation parameters for the FPS output. That case uses "|" # (and not ",") and omits the DefaultAtom and DefaultBond name. # The result is easier to parse with the OpenEye API functions. # Note: Version 1.0 of OEGraphSim uses different names than 2.0 (Grr!) # OpenEye says these names will not change again. We'll see. 
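# A rough sketch of the two conversions described above: text field to
# integer (accepting "," or "|") and integer back to a canonical "|" form.
# The flag values in FLAGS are made up for illustration (the real ones come
# from OEGraphSim), and description_to_value / value_to_description are
# hypothetical names, not the functions chemfp defines.

```python
# Hypothetical flag table, listed in the fixed order used for output.
FLAGS = [("AtomicNumber", 1), ("InRing", 2), ("FormalCharge", 4)]
TABLE = dict(FLAGS)

def description_to_value(description):
    # Accept both "," and "|" as separators by normalizing to ",".
    value = 0
    for word in description.replace("|", ",").split(","):
        word = word.strip()
        if word not in TABLE:
            raise ValueError("Unknown type %r" % (word,))
        value |= TABLE[word]
    return value

def value_to_description(value):
    # Emit names in the order of FLAGS, so the text does not depend
    # on the order the user typed them in.
    words = [name for (name, flag) in FLAGS if (value & flag) == flag]
    return "|".join(words)

value = description_to_value("InRing,AtomicNumber")
canonical = value_to_description(value)
```

# Because the output order comes from the table and not from the input,
# "InRing,AtomicNumber" and "AtomicNumber|InRing" produce the same
# canonical string.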
_atype_flags = [(OEGetFPAtomType(atype), atype) for atype in ( OEFPAtomType_Aromaticity, # Arom OEFPAtomType_AtomicNumber, # AtmNum OEFPAtomType_Chiral, # Chiral OEFPAtomType_EqHalogen, # EqHalo OEFPAtomType_FormalCharge, # FCharge OEFPAtomType_HvyDegree, # HvyDeg OEFPAtomType_Hybridization, # Hyb OEFPAtomType_InRing, # InRing OEFPAtomType_EqAromatic, # EqArom )] if OEGRAPHSIM_API_VERSION != "1": # HCount added in 2.0.0 _atype_flags.extend([ (OEGetFPAtomType(atype), atype) for atype in ( OEFPAtomType_HCount, # HCount OEFPAtomType_EqHBondAcceptor, # EqHBAcc OEFPAtomType_EqHBondDonor, # EqHBDon )]) _btype_flags = [(OEGetFPBondType(btype), btype) for btype in (OEFPBondType_BondOrder, OEFPBondType_Chiral, OEFPBondType_InRing)] # I support the DefaultAtom and DefaultBond special values. # (Note: complex bitflags go first; it simplifies the flag->description code) if OEGRAPHSIM_API_VERSION == "1": _path_atype_flags = ([("DefaultAtom", OEFPAtomType_DefaultAtom)] + _atype_flags + [("Default", OEFPAtomType_DefaultAtom)]) _path_btype_flags = ([("DefaultBond", OEFPBondType_DefaultBond)] + _btype_flags + [("Default", OEFPBondType_DefaultBond)]) _path_atypes = dict(_path_atype_flags) _path_btypes = dict(_path_btype_flags) else: # Version 2 of the API; this adds circular and tree fingerprints # and changes the default atom flags. The chemfp support also # changed. It still accepts the "Default" names as input, but # normalizes them to the full "|" names. (This matches what # OEGraphSim does.) 
_atype_default_flags = [("DefaultPathAtom", OEFPAtomType_DefaultPathAtom), ("DefaultCircularAtom", OEFPAtomType_DefaultCircularAtom), ("DefaultTreeAtom", OEFPAtomType_DefaultTreeAtom), ("DefaultAtom", OEFPAtomType_DefaultAtom)] _btype_default_flags = [("DefaultPathBond", OEFPBondType_DefaultPathBond), ("DefaultCircularBond", OEFPBondType_DefaultCircularBond), ("DefaultTreeBond", OEFPBondType_DefaultTreeBond), ("DefaultBond", OEFPBondType_DefaultBond)] _path_atype_flags = _atype_flags + _atype_default_flags + [("Default", OEFPAtomType_DefaultPathAtom)] _path_btype_flags = _btype_flags + _btype_default_flags + [("Default", OEFPBondType_DefaultPathBond)] _circular_atype_flags = _atype_flags + _atype_default_flags + [ ("Default", OEFPAtomType_DefaultCircularAtom)] _circular_btype_flags = _btype_flags + _btype_default_flags + [ ("Default", OEFPBondType_DefaultCircularBond)] _tree_atype_flags = _atype_flags + _atype_default_flags + [("Default", OEFPAtomType_DefaultTreeAtom)] _tree_btype_flags = _btype_flags + _btype_default_flags + [("Default", OEFPBondType_DefaultTreeBond)] _path_atypes = dict(_path_atype_flags) _path_btypes = dict(_path_btype_flags) _circular_atypes = dict(_circular_atype_flags) _circular_btypes = dict(_circular_btype_flags) _tree_atypes = dict(_tree_atype_flags) _tree_btypes = dict(_tree_btype_flags) ## Go from a "," or "|" separated text field to an integer value # Removes extra whitespace, but none should be present. 
def _get_type_value(a_or_b, table, description):
    value = 0
    # Allow both "|" and "," as separators
    # (XXX OEGraphSim 2.0.0 only allows "|")
    description = description.replace("|", ",")
    for word in description.split(","):
        word = word.strip()
        try:
            value |= table[word]
        except KeyError:
            if not word:
                raise ValueError("Missing %s flag" % (a_or_b,))
            raise ValueError("Unknown %s type %r" % (a_or_b, word))
    return value


def path_atom_description_to_value(description):
    """path_atom_description_to_value(description) -> integer

    Convert an atom description like FormalCharge,EqHalogen
    or FormalCharge|EqHalogen into its atom type value.

    This is similar to OEGetFPAtomType except both "|" and "," are
    allowed separators and "DefaultAtom" is an allowed term.
    """
    return _get_type_value("path atom", _path_atypes, description)

def path_bond_description_to_value(description):
    """path_bond_description_to_value(description) -> integer

    Convert a bond description like BondOrder,Chiral
    or BondOrder|Chiral into its bond type value.

    This is similar to OEGetFPBondType except both "|" and "," are
    allowed separators and "DefaultBond" is an allowed term.
    """
    return _get_type_value("path bond", _path_btypes, description)

if OEGRAPHSIM_API_VERSION != "1":
    def circular_atom_description_to_value(description):
        return _get_type_value("circular atom", _circular_atypes, description)
    def circular_bond_description_to_value(description):
        return _get_type_value("circular bond", _circular_btypes, description)
    def tree_atom_description_to_value(description):
        return _get_type_value("tree atom", _tree_atypes, description)
    def tree_bond_description_to_value(description):
        return _get_type_value("tree bond", _tree_btypes, description)
else:
    def not_available(*args, **kwargs):
        raise NotImplementedError("Unsupported in this version of OEGraphSim")
    circular_atom_description_to_value = circular_bond_description_to_value = \
      tree_atom_description_to_value = tree_bond_description_to_value = not_available

## Go from an integer value into a canonical description

# I could use OEGetFPAtomType() and OEGetFPBondType() but I wanted
# something which has a fixed sort order even for future releases,
# which isn't part of those functions.

def _get_type_description(a_or_b, flags, value):
    words = []
    for (word, flag) in flags:
        if flag & value == flag:
            # After over 12 years of full-time use of Python,
            # I finally have a chance to use the xor operator.
            value = value ^ flag
            words.append(word)
    if value != 0:
        raise AssertionError("Unsupported %s value %d" % (a_or_b, value))
    return "|".join(words)


def path_atom_value_to_description(value):
    """path_atom_value_to_description(value) -> string

    Convert from an atom type value into its text description,
    separated by "|"s. The result is compatible with
    OEGetFPAtomType and is in canonical order.
    """
    return _get_type_description("path atom", _path_atype_flags, value)

def path_bond_value_to_description(value):
    """path_bond_value_to_description(value) -> string

    Convert from a bond type value into its text description,
    separated by "|"s. The result is compatible with
    OEGetFPBondType and is in canonical order.
""" return _get_type_description("path bond", _path_btype_flags, value) if OEGRAPHSIM_API_VERSION != "1": def circular_atom_value_to_description(value): return _get_type_description("circular atom", _circular_atype_flags, value) def circular_bond_value_to_description(value): return _get_type_description("circular bond", _circular_btype_flags, value) def tree_atom_value_to_description(value): return _get_type_description("tree atom", _tree_atype_flags, value) def tree_bond_value_to_description(value): return _get_type_description("tree bond", _tree_btype_flags, value) else: circular_atom_value_to_description = circular_bond_value_to_description = \ tree_atom_value_to_description = tree_bond_value_to_description = not_available ##### Create a function which generate fingerprints # I use functions which return functions because it was a nice way to # hide the differences between the two fingerprinters. I also found # that I can save a bit of time by not creating a new fingerprint each # time. The measured speedup is about 2% for MACCS166 and 6% for path # fingerprints. # Just like the OEGraphMol, these fingerprints must not be reused or # stored. They are mutated EVERY TIME. They are NOT thread-safe. # If you need to use these in multiple threads, then make multiple # fingerprinters. # Believe it or not, reusing the preallocated fingerprint measurably # helps the performance. def get_path_fingerprinter(numbits, minbonds, maxbonds, atype, btype): # Extra level of error checking since I expect people will think # of this as part of the public API. if not (16 <= numbits <= 65536): raise ValueError("numbits must be between 16 and 65536 (inclusive)") if not (0 <= minbonds): raise ValueError("minbonds must be 0 or greater") if not (minbonds <= maxbonds): raise ValueError("maxbonds must not be smaller than minbonds") # XXX validate the atype and type values? # It's a simple mask against the | of all possible value, then test for 0. 
# However, I'm not sure what to report as the error message. fp = OEFingerPrint() fp.SetSize(numbits) data_location = int(fp.GetData()) num_bytes = (numbits+7)//8 def path_fingerprinter(mol): OEMakePathFP(fp, mol, numbits, minbonds, maxbonds, atype, btype) return ctypes.string_at(data_location, num_bytes) return path_fingerprinter def get_maccs_fingerprinter(): fp = OEFingerPrint() # Call SetSize() now to force space allocation, so I only need one GetData() fp.SetSize(166) data_location = int(fp.GetData()) num_bytes = (166+7)//8 def maccs_fingerprinter(mol): OEMakeMACCS166FP(fp, mol) return ctypes.string_at(data_location, num_bytes) return maccs_fingerprinter if OEGRAPHSIM_API_VERSION == "2": def get_circular_fingerprinter(numbits, minradius, maxradius, atype, btype): # Extra level of error checking since I expect people will think # of this as part of the public API. if not (16 <= numbits <= 65536): raise ValueError("numbits must be between 16 and 65536 (inclusive)") if not (0 <= minradius): raise ValueError("minradius must be 0 or greater") if not (minradius <= maxradius): raise ValueError("maxradius must not be smaller than minradius") fp = OEFingerPrint() fp.SetSize(numbits) data_location = int(fp.GetData()) num_bytes = (numbits+7)//8 def circular_fingerprinter(mol): OEMakeCircularFP(fp, mol, numbits, minradius, maxradius, atype, btype) return ctypes.string_at(data_location, num_bytes) return circular_fingerprinter def get_tree_fingerprinter(numbits, minbonds, maxbonds, atype, btype): # Extra level of error checking since I expect people will think # of this as part of the public API. 
if not (16 <= numbits <= 65536): raise ValueError("numbits must be between 16 and 65536 (inclusive)") if not (0 <= minbonds): raise ValueError("minbonds must be 0 or greater") if not (minbonds <= maxbonds): raise ValueError("maxbonds must not be smaller than minbonds") fp = OEFingerPrint() fp.SetSize(numbits) data_location = int(fp.GetData()) num_bytes = (numbits+7)//8 def tree_fingerprinter(mol): OEMakeTreeFP(fp, mol, numbits, minbonds, maxbonds, atype, btype) return ctypes.string_at(data_location, num_bytes) return tree_fingerprinter else: get_circular_fingerprinter = not_available get_tree_fingerprinter = not_available ### A note on fingerprints and ctypes.string_at # The FPS format and OEFingerPrint.GetData() values used identical bit # and byte order. Bytes are in little-endian order and bits are in # big-endian order. That means I can use GetData() to get the # underlying C storage area, use ctypes to turn that into a Python # string, which I then hex encode. # The other option is to use OEFingerPrint.ToHexString(). But that's # pure little endian, so I would need a transposition to make the bits # be what I want them to be. OEChem's hex strings also end with a flag # which says how many extra bits to trim, which I don't need since I # handle it a different way. # Here's some info about the bit order, which I tested by setting a # few bits though the API then seeing what changed in the hex string # and in the underlying GetData() field. 
# The bit pattern
#   01234567 89ABCDEF   pure little endian
#   10011100 01000011
#
#   93 2C      using ToHexString() (pure little endian)
#   0x39 c2    using hex(ord(GetData())) (little endian byte, big endian bit)
#
#   76543210 FEDCBA98
#   00111001 11000010   little endian byte, big endian bit


################ Handle formats

# Map format names to OEChem format types
_formats = {
    "smi": OEFormat_SMI,
    "ism": OEFormat_ISM,
    "can": OEFormat_CAN,

    "sdf": OEFormat_SDF,
    "mol": OEFormat_SDF,

    "skc": OEFormat_SKC,
    "mol2": OEFormat_MOL2,
    "mmod": OEFormat_MMOD,
    "oeb": OEFormat_OEB,
    "bin": OEFormat_BIN,
}

# Some trickiness to verify that the format specification is
# supported, but without doing anything (yet) to set those flags.

# I return a function which will set the file stream parameters
# correctly.

def _do_nothing(ifs):
    pass

# Format is something like ".sdf.gz" or "pdb" or "smi.gz"

def _get_format_setter(format=None):
    if format is None:
        return _do_nothing
    fmt = format.lower()

    is_compressed = 0
    if fmt.endswith(".gz"):
        is_compressed = 1
        fmt = fmt[:-3]

    # Should be something like ".sdf" or "sdf" or "smi"
    format_flag = _formats.get(fmt, None)
    if format_flag is None:
        raise ValueError("Unknown structure format %r" % (format,))

    def set_format(ifs):
        ifs.SetFormat(format_flag)
        if is_compressed:
            ifs.Setgz(is_compressed)
    return set_format

def _open_stdin(set_format, aromaticity_flavor):
    ifs = oemolistream()
    ifs.open()
    set_format(ifs)

    if aromaticity_flavor is not None:
        flavor = ifs.GetFlavor(ifs.GetFormat())
        flavor |= aromaticity_flavor
        ifs.SetFlavor(ifs.GetFormat(), flavor)

    return ifs

def _open_ifs(filename, set_format, aromaticity_flavor):
    ifs = oemolistream()

    if not ifs.open(filename):
        # Let Python try to do better error reporting.
        open(filename).close()

        # If that didn't work, give up and fake it.
        # (Did manual coverage testing for this. The test cases I can
        # think of, like tricky timing, are too tricky.)
raise IOError(errno.EIO, "OEChem cannot open the file", filename) set_format(ifs) if aromaticity_flavor is not None: flavor = ifs.GetFlavor(ifs.GetFormat()) flavor |= aromaticity_flavor ifs.SetFlavor(ifs.GetFormat(), flavor) return ifs # This code is a bit strange. It needs to do eager error checking but # lazy parsing. That is, it needs to check right away that the file # can be opened (if it exists) and the format is understood. But it # can wait until later to actually parse the files. _aromaticity_sorted = ( # ("default", None), ("openeye", OEIFlavor_Generic_OEAroModelOpenEye), ("daylight", OEIFlavor_Generic_OEAroModelDaylight), ("tripos", OEIFlavor_Generic_OEAroModelTripos), ("mdl", OEIFlavor_Generic_OEAroModelMDL), ("mmff", OEIFlavor_Generic_OEAroModelMMFF), ) _aromaticity_flavors = dict(_aromaticity_sorted) _aromaticity_flavor_names = [pair[0] for pair in _aromaticity_sorted] _aromaticity_flavors[None] = OEIFlavor_Generic_OEAroModelOpenEye del _aromaticity_sorted, pair # If unspecified, use "openeye" (this is what OEChem does internally) _aromaticity_flavors[None] = _aromaticity_flavors["openeye"] # Allow "None" def is_valid_format(filename, format): format_name, compression = io.normalize_format(filename, format, default=("smi", "")) if compression not in ("", ".gz"): return False try: _get_format_setter(format_name + compression) return True except ValueError: return False def is_valid_aromaticity(aromaticity): return aromaticity in _aromaticity_flavors # Part of the code (parameter checking, opening the file) are eager. # Actually reading the structures is lazy. 
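# A minimal sketch of the eager-check / lazy-parse split described above,
# using plain text files instead of OEChem. The names read_records and
# _iter_records and the ".txt" format rule are assumptions made for this
# illustration; they are not chemfp functions.

```python
def read_records(filename):
    # Eager part: this runs as soon as read_records() is called, so a bad
    # format or an unreadable file is reported at the call site.
    if not filename.endswith(".txt"):
        raise ValueError("Unknown format for %r" % (filename,))
    infile = open(filename)
    # Lazy part: calling the generator function only creates the iterator;
    # no line is read until the caller starts iterating.
    return _iter_records(infile)

def _iter_records(infile):
    with infile:
        for line in infile:
            yield line.rstrip("\n")

# The format check fails immediately, before any iteration happens.
try:
    read_records("example.unknown")
    eager_error = None
except ValueError as err:
    eager_error = str(err)
```

# If read_records were itself a generator (with the yield in its own body),
# none of its checks would run until the first next() call, which is why
# the work is split across two functions.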
def read_structures(filename=None, format=None, id_tag=None, aromaticity=None, errors="strict"): try: aromaticity_flavor = _aromaticity_flavors[aromaticity] except KeyError: raise ValueError("Unsupported aromaticity model %r" % (aromaticity,)) error_handler = error_handlers.get_parse_error_handler(errors) # Check that that the format is known format_name, compression = io.normalize_format(filename, format, default=("smi", "")) if compression not in ("", ".gz"): raise ValueError("Unsupported compression type for %r" % (filename,)) set_format = _get_format_setter(format_name + compression) # Input is from a file if filename is None: ifs = _open_stdin(set_format, aromaticity_flavor) filename_repr = "" else: ifs = _open_ifs(filename, set_format, aromaticity_flavor) filename_repr = repr(filename) # Only SD files can take the id_tag if ifs.GetFormat() != OEFormat_SDF: id_tag = None # Lazy structure reader return _iter_structures(ifs, id_tag, filename_repr, error_handler) def _iter_structures(ifs, id_tag, filename_repr, error_handler): def where(): return " for record #%d of %s" % (recno+1, filename_repr) if id_tag is None: for recno, mol in enumerate(ifs.GetOEGraphMols()): title = mol.GetTitle() id = io.remove_special_characters_from_id(title) if not id: error_handler("Missing title" + where()) continue yield id, mol else: for recno, mol in enumerate(ifs.GetOEGraphMols()): dirty_id = OEGetSDData(mol, id_tag) if not dirty_id: if not OEHasSDData(mol, id_tag): error_handler("Missing id tag %r%s" % (id_tag, where())) continue id = io.remove_special_characters_from_id(dirty_id) if not id: msg = "Empty id tag %r" % (id_tag,) error_handler(msg + where()) continue yield id, mol def _read_fingerprints(structure_reader, fingerprinter): for (id, mol) in structure_reader: yield id, fingerprinter(mol) from .types import FingerprintFamilyConfig, nonnegative_int def _read_structures(metadata, source, format, id_tag, errors): return read_structures(source, format, id_tag=id_tag, 
aromaticity=metadata.aromaticity, errors=errors) def _correct_numbits(s): try: if not s.isdigit(): raise ValueError i = int(s) if not (16 <= i <= 65536): raise ValueError except ValueError: raise ValueError("must be between 16 and 65536 bits") return i _base = FingerprintFamilyConfig( software = SOFTWARE, read_structures = _read_structures) ######### These are appropriate for OEGraphSim 1.0 ############# def _check_v1(func): def make_fingerprinter(*args, **kwargs): if OEGRAPHSIM_API_VERSION != "1": raise TypeError("This version of OEChem does not support the OEGraphSim 1.0.0 fingerprints") return func(*args, **kwargs) return make_fingerprinter OpenEyePathFingerprintFamily_v1 = _base.clone( name = "OpenEye-Path/1", format_string = ("numbits=%(numbits)s minbonds=%(minbonds)s " "maxbonds=%(maxbonds)s atype=%(atype)s btype=%(btype)s"), num_bits = lambda d: d["numbits"], make_fingerprinter = _check_v1(get_path_fingerprinter)) _path = OpenEyePathFingerprintFamily_v1 _path.add_argument("numbits", decoder=_correct_numbits, metavar="INT", default=4096, help="number of bits in the path fingerprint") _path.add_argument("minbonds", decoder=nonnegative_int, metavar="INT", default=0, help="minimum number of bonds in the path fingerprint") _path.add_argument("maxbonds", decoder=nonnegative_int, metavar="INT", default=5, help="maximum number of bonds in the path fingerprint") _path.add_argument("atype", decoder=path_atom_description_to_value, encoder=path_atom_value_to_description, help="atom type as a '|' separated list of terms", default=OEFPAtomType_DefaultAtom) _path.add_argument("btype", decoder=path_bond_description_to_value, encoder=path_bond_value_to_description, help="bond type as a '|' separated list of terms", default=OEFPBondType_DefaultBond) OpenEyeMACCSFingerprintFamily_v1 = _base.clone( name = "OpenEye-MACCS166/1", num_bits = 166, make_fingerprinter = _check_v1(get_maccs_fingerprinter)) ######### These are appropriate for OEGraphSim 2.0 ############# def 
_check_v2(func): def make_fingerprinter(*args, **kwargs): if OEGRAPHSIM_API_VERSION == "1": raise TypeError("This version of OEChem does not support the OEGraphSim 2.0.0 fingerprints") return func(*args, **kwargs) return make_fingerprinter _ff = OpenEyePathFingerprintFamily_v2 = OpenEyePathFingerprintFamily_v1.clone( name = "OpenEye-Path/2", make_fingerprinter = _check_v2(get_path_fingerprinter)) _ff.add_argument("atype", decoder=path_atom_description_to_value, encoder=path_atom_value_to_description, help="atom type", default=OEFPAtomType_DefaultPathAtom) _ff.add_argument("btype", decoder=path_bond_description_to_value, encoder=path_bond_value_to_description, help="bond type", default=OEFPBondType_DefaultPathBond) OpenEyeMACCSFingerprintFamily_v2 = OpenEyeMACCSFingerprintFamily_v1.clone( name = "OpenEye-MACCS166/2", make_fingerprinter = _check_v2(get_maccs_fingerprinter)) _circular_ff = OpenEyeCircularFingerprintFamily_v2 = _base.clone( name = "OpenEye-Circular/2", format_string = ("numbits=%(numbits)s minradius=%(minradius)s " "maxradius=%(maxradius)s atype=%(atype)s btype=%(btype)s"), num_bits = lambda d: d["numbits"], make_fingerprinter = _check_v2(get_circular_fingerprinter)) _circular_ff.add_argument("numbits", decoder=_correct_numbits, metavar="INT", default=4096, help="number of bits in the circular fingerprint") _circular_ff.add_argument("minradius", decoder=nonnegative_int, metavar="INT", default=0, help="minimum radius for the circular fingerprint") _circular_ff.add_argument("maxradius", decoder=nonnegative_int, metavar="INT", default=5, help="maximum radius for the circular fingerprint") _circular_ff.add_argument("atype", decoder=circular_atom_description_to_value, encoder=circular_atom_value_to_description, help="atom type", default=OEFPAtomType_DefaultCircularAtom) _circular_ff.add_argument("btype", decoder=circular_bond_description_to_value, encoder=circular_bond_value_to_description, help="bond type", default=OEFPBondType_DefaultCircularBond) 
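# A small sketch of how wrappers like _check_v1 and _check_v2 defer the
# version test: wrapping never fails, and the check only runs when a
# fingerprinter is actually requested. API_VERSION, require_version and
# get_fingerprinter are hypothetical names used for illustration.

```python
API_VERSION = "1"   # hypothetical; stands in for OEGRAPHSIM_API_VERSION

def require_version(required, func):
    def make_fingerprinter(*args, **kwargs):
        # The check happens at call time, not when the wrapper is built.
        if API_VERSION != required:
            raise TypeError("API version %s is required" % (required,))
        return func(*args, **kwargs)
    return make_fingerprinter

def get_fingerprinter(numbits):
    # Hypothetical factory returning a function of one molecule.
    return lambda mol: "fp(%d bits) for %s" % (numbits, mol)

make_v1 = require_version("1", get_fingerprinter)
make_v2 = require_version("2", get_fingerprinter)   # building this is always safe

result = make_v1(16)("CCO")
```

# This lets every fingerprint family be defined at import time regardless
# of the installed toolkit version; only using an unsupported family fails.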
_tree_ff = OpenEyeTreeFingerprintFamily_v2 = _base.clone(
    name = "OpenEye-Tree/2",
    format_string = ("numbits=%(numbits)s minbonds=%(minbonds)s "
                     "maxbonds=%(maxbonds)s atype=%(atype)s btype=%(btype)s"),
    num_bits = lambda d: d["numbits"],
    make_fingerprinter = _check_v2(get_tree_fingerprinter))

_tree_ff.add_argument("numbits", decoder=_correct_numbits, metavar="INT", default=4096,
                      help="number of bits in the tree fingerprint")
_tree_ff.add_argument("minbonds", decoder=nonnegative_int, metavar="INT", default=0,
                      help="minimum number of bonds in the tree fingerprint")
_tree_ff.add_argument("maxbonds", decoder=nonnegative_int, metavar="INT", default=4,
                      help="maximum number of bonds in the tree fingerprint")
_tree_ff.add_argument("atype", decoder=tree_atom_description_to_value,
                      encoder=tree_atom_value_to_description,
                      help="atom type", default=OEFPAtomType_DefaultTreeAtom)
_tree_ff.add_argument("btype", decoder=tree_bond_description_to_value,
                      encoder=tree_bond_value_to_description,
                      help="bond type", default=OEFPBondType_DefaultTreeBond)

# ===== chemfp-1.1p1/chemfp/openeye_patterns.py =====

from __future__ import absolute_import

from openeye.oechem import (
    OESubSearch, OEChemGetRelease, OEChemGetVersion, OEGraphMol, OEAndAtom,
    OENotAtom, OEIsAromaticAtom, OEIsCarbon, OEIsAromaticBond, OEAtomIsInRing,
    OEHasBondIdx, OEFindRingAtomsAndBonds, OEDetermineAromaticRingSystems,
    OEDetermineComponents)

from . import openeye
from . import pattern_fingerprinter
from . import types
from . import __version__ as chemfp_version

class HydrogenMatcher(object):
    def __init__(self, max_count):
        self.max_count = max_count

    def SingleMatch(self, mol):
        for atom in mol.GetAtoms():
            if atom.GetAtomicNum() == 1:
                return 1
            if atom.GetImplicitHCount():
                return 1
        return 0

    def Match(self, mol, flg=True):
        max_count = self.max_count
        count = 0
        for atom in mol.GetAtoms():
            if atom.GetAtomicNum() == 1:
                count += 1
            count += atom.GetImplicitHCount()
            if count > max_count:
                break
        return [0] * count

# OpenEye famously does not include SSSR functionality in OEChem.
# Search for "Smallest Set of Smallest Rings (SSSR) Considered Harmful"
# After much thought, I agree. But it makes this sort of code harder.

# That's why I only support up to max_count = 2. Then again, I know
# that this code does the right thing, while I'm not sure about the
# SSSR-based implementations.

class AromaticRings(object):
    def __init__(self, max_count):
        if max_count > 2:
            raise NotImplementedError("No support for >=3 aromatic rings")
        self.max_count = max_count
        self._single_aromatic = OESubSearch("[aR]")
        # In OpenEye SMARTS, [a;!R2] will find aromatic atoms in at least two rings
        # The following finds atoms which are members of at least two aromatic rings
        self._multiring_aromatic = OESubSearch("[a;!R2](:a)(:a):a")

    def SingleMatch(self, mol):
        # This is easy; if there's one aromatic atom then there's one
        # aromatic ring.
        return self._single_aromatic.SingleMatch(mol)

    def Match(self, mol, flg=True):
        # We're trying to find if there are two aromatic rings.
if not self._single_aromatic.SingleMatch(mol): return () if self._multiring_aromatic.SingleMatch(mol): # then obviously there are two aromatic rings return (1,2) # Since no aromatic atom is in two aromatic rings that means # the aromatic ring systems are disjoint, so this gives me the # number of ring systems num_aromatic_systems, parts = OEDetermineAromaticRingSystems(mol) if num_aromatic_systems >= self.max_count: return [0]*self.max_count assert num_aromatic_systems != 0, "there is supposed to be an aromatic ring" if num_aromatic_systems == 1: return (1,) raise AssertionError("Should not get here") _is_hetereo_aromatic = OEAndAtom(OEAndAtom(OEIsAromaticAtom(), OENotAtom(OEIsCarbon())), OEAtomIsInRing()) class HeteroAromaticRings(object): def __init__(self, max_count): if max_count > 2: raise NotImplementedError("No support for >=3 hetero-aromatic rings") self.max_count = max_count def SingleMatch(self, mol): for atom in mol.GetAtoms(_is_hetereo_aromatic): return True return False def Match(self, mol, flg=True): # Find all the hetero-aromatic atoms hetero_atoms = [atom for atom in mol.GetAtoms(_is_hetereo_aromatic)] if len(hetero_atoms) < 2: # The caller just needs an iterable return hetero_atoms # There are at least two hetero-aromatic atoms. # Are there multiple ring systems? num_aromatic_systems, parts = OEDetermineAromaticRingSystems(mol) assert num_aromatic_systems >= 1 # Are there hetero-atoms in different systems? atom_components = set(parts[atom.GetIdx()] for atom in hetero_atoms) if len(atom_components) > 1: return (1,2) # The answer now is "at least one". But are there two? # All of the hetero-aromatic atoms are in the same ring system # This is the best answer I could think of, and it only works # with the OEChem toolkit: remove one of the bonds, re-find # the rings, and see if there's still an aromatic hetero-atom. 
hetero_atom = hetero_atoms[0] for bond in hetero_atom.GetBonds(OEIsAromaticBond()): newmol = OEGraphMol(mol) newmol_bond = newmol.GetBond(OEHasBondIdx(bond.GetIdx())) newmol.DeleteBond(newmol_bond) OEFindRingAtomsAndBonds(newmol) for atom in newmol.GetAtoms(_is_hetereo_aromatic): return (1,2) return (1,) class NumFragments(object): def __init__(self, max_count): pass def SingleMatch(self, mol): return mol.NumAtoms() > 0 def Match(self, mol, flg=True): count, parts = OEDetermineComponents(mol) # parts is a list of component numbers. # Turn them into a set to get the unique set of component numbers # Sets are iterable, so I don't need to do more for the API return set(parts) # Grrr. The substructure keys want up to 4 aromatic rings. The above # code only works for up to 2. The API doesn't let me say "I can # handle up to 2; please set the remainder to 0." # # XXX Well, I can change that. def aromatic_rings(max_count): if max_count > 2: return pattern_fingerprinter.LimitedMatcher(2, AromaticRings(2)) return AromaticRings(max_count) def hetero_aromatic_rings(max_count): if max_count > 2: return pattern_fingerprinter.LimitedMatcher(2, HeteroAromaticRings(2)) return HeteroAromaticRings(max_count) _pattern_classes = { "": HydrogenMatcher, "": aromatic_rings, "": hetero_aromatic_rings, "": NumFragments, } def oechem_compile_pattern(pattern, max_count): if pattern in _pattern_classes: return _pattern_classes[pattern](max_count) elif pattern.startswith("<"): raise NotImplementedError(pattern) # No other special patterns are supported else: pat = OESubSearch() if not pat.Init(pattern): raise pattern_fingerprinter.UnsupportedPatternError( pattern, "Uninterpretable SMARTS pattern") pat.SetMaxMatches(max_count) return pat class OEChemPatternFingerprinter(pattern_fingerprinter.PatternFingerprinter): def __init__(self, patterns): assert patterns is not None super(OEChemPatternFingerprinter, self).__init__(patterns, oechem_compile_pattern) def fingerprint(self, mol): bytes = [0] * 
self.num_bytes for matcher, largest_count, count_info_tuple in self.matcher_definitions: if matcher is NotImplemented: continue #print matcher, largest_count, count_info_tuple if largest_count == 1: if matcher.SingleMatch(mol): count_info = count_info_tuple[0] bytes[count_info.byteno] |= count_info.bitmask else: actual_count = sum(1 for ignore in matcher.Match(mol, True)) # unique matches if actual_count: for count_info in count_info_tuple: if actual_count >= count_info.count: bytes[count_info.byteno] |= count_info.bitmask else: break return "".join(map(chr, bytes)) class _CachedFingerprinters(dict): def __missing__(self, name): patterns = pattern_fingerprinter._load_named_patterns(name) fingerprinter = OEChemPatternFingerprinter(patterns) self[name] = fingerprinter return fingerprinter _cached_fingerprinters = _CachedFingerprinters() SOFTWARE = "OEChem/%(release)s (%(version)s) chemfp/%(chemfp)s" % dict( release = OEChemGetRelease(), version = OEChemGetVersion(), chemfp = chemfp_version) # XXX Why are there two "Fingerprinter" classes? # XX Shouldn't they be merged? 
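# Illustrative sketch, not part of the chemfp sources: a toolkit-free
# version of the count-to-bit packing that fingerprint() above performs
# for patterns with several count thresholds. The function name and the
# (minimum_count, bit) pair layout are assumptions made for this example;
# chemfp itself keeps these values in CountInfo objects.

```python
def set_count_bits_sketch(actual_count, count_bits, num_bytes):
    """Return a byte string with one bit set per satisfied count threshold.

    count_bits is a list of (minimum_count, bit) pairs sorted by
    minimum_count (hypothetical layout, for illustration only).
    """
    values = [0] * num_bytes
    for minimum_count, bit in count_bits:
        if actual_count < minimum_count:
            # The pairs are sorted, so no later threshold can match either.
            break
        # Same arithmetic as CountInfo: byte index and mask within that byte.
        values[bit // 8] |= 1 << (bit % 8)
    return bytes(bytearray(values))
```

# With two matches and thresholds of 1, 2 and 3 matches mapped to bits
# 0, 9 and 10, only bits 0 and 9 end up set.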
_base = openeye._base.clone(
    software = SOFTWARE)

SubstructOpenEyeFingerprinter_v1 = _base.clone(
    name = "ChemFP-Substruct-OpenEye/1",
    num_bits = 881,
    make_fingerprinter = lambda : _cached_fingerprinters["substruct"].fingerprint)

#    def describe(self, bitno):
#        return self._fingerprinter.describe(bitno)

RDMACCSOpenEyeFingerprinter_v1 = _base.clone(
    name = "RDMACCS-OpenEye/1",
    num_bits = 166,
    make_fingerprinter = lambda : _cached_fingerprinters["rdmaccs"].fingerprint)
chemfp-1.1p1/chemfp/pattern_fingerprinter.py0000644000077000000240000001663411660452123021516 0ustar dalkestaff00000000000000
import os

class UnsupportedPatternError(KeyError):
    def __init__(self, pattern, reason=None):
        KeyError.__init__(self, pattern)
        self.pattern = pattern
        if reason is None:
            reason = "Cannot interpret pattern definition"
        self.reason = reason
        self.filename = None
        self.lineno = None

    def __str__(self):
        msg = self.reason + " " + repr(self.pattern)
        if self.lineno is not None:
            msg += " at line %d" % (self.lineno,)
        if self.filename is not None:
            msg += " in file %r" % (self.filename,)
        return msg

class PatternFile(object):
    def __init__(self, filename, max_bit, bit_definitions):
        assert max_bit >= 0, max_bit
        self.filename = filename
        self.max_bit = max_bit
        self.bit_definitions = bit_definitions
        self._bit_to_bit_definition = dict((bitdef.bit, bitdef) for bitdef in bit_definitions)

    def __getitem__(self, bit):
        return self._bit_to_bit_definition[bit]

    def __iter__(self):
        return iter(self._bit_to_bit_definition)

class BitDefinition(object):
    __slots__ = ("bit", "count", "pattern", "description", "lineno")
    def __init__(self, bit, count, pattern, description, lineno):
        self.bit = bit
        self.count = count
        self.pattern = pattern
        self.description = description
        self.lineno = lineno

def load_patterns(infile):
    if isinstance(infile, basestring):
        infile = open(infile, "rU")

    filename = getattr(infile, "name", "")
    bit_definitions = list(read_patterns(infile))
    max_bit = max(bitdef.bit for bitdef in bit_definitions)

    return PatternFile(filename, max_bit, bit_definitions)

def read_patterns(infile):
    seen_bits = {}

    for lineno, line in enumerate(infile):
        lineno += 1

        # Leading and trailing whitespace is ignored
        line = line.strip()

        # Ignore blank lines or those with a leading "#"
        if not line or line.startswith("#"):
            continue

        # The first three columns, plus everything else for the description
        fields = line.split(None, 3)
        if len(fields) != 4:
            raise TypeError("Not enough fields on line %d: %r" % (lineno, line))

        # Normalize whitespace for the description
        fields[3] = " ".join(fields[3].split())

        # Do some type checking and error reporting
        bit, count, pattern, description = fields
        if not bit.isdigit():
            raise TypeError(
                "First field of line %d must be a non-negative bit position, not %r" %
                (lineno, bit))
        bit = int(bit)

        if not count.isdigit() or int(count) == 0:
            raise TypeError(
                "Second field of line %d must be a positive minimum match count, not %r" %
                (lineno, count))
        count = int(count)

        if bit in seen_bits:
            raise TypeError("Line %d redefines bit %d, already set by line %d" %
                            (lineno, bit, seen_bits[bit]))
        seen_bits[bit] = lineno

        yield BitDefinition(bit, count, pattern, description, lineno)

class CountInfo(object):
    __slots__ = ("count", "bit", "byteno", "bitmask")
    def __init__(self, count, bit):
        self.count = count  # minimum count needed to enable this bit
        self.bit = bit      # used to set not_implemented, and useful for debugging

        # These simplify the fingerprint generation code
        self.byteno = bit//8
        self.bitmask = 1<<(bit%8)

def _bit_definition_to_pattern_definition(bit_definitions):
    "Helper function to organize the bit definitions based on pattern instead of bit"

    # A pattern definition is of the form:
    #   (pattern string, largest count, count_info tuple)
    #     where the count_info elements are sorted by count

    # I want to preserve the pattern order so that patterns which
    # are defined first are evaluated first
    ordered_patterns = []
    pattern_info = {}

    # Find all of the bit definitions for a given pattern
    for bitdef in bit_definitions:
        if bitdef.pattern not in pattern_info:
            pattern_info[bitdef.pattern] = []
            ordered_patterns.append(bitdef.pattern)
        pattern_info[bitdef.pattern].append( CountInfo(bitdef.count, bitdef.bit) )

    # Put them into a slightly more useful form
    #  - sorted now makes it easier to test when done
    #  - knowing the max match count lets some matchers optimize how to match
    for pattern in ordered_patterns:
        count_info_list = pattern_info[pattern]
        count_info_list.sort(key=lambda count_info: count_info.count)
        yield (pattern,
               count_info_list[-1].count,  # the largest count
               tuple(count_info_list)
               )

class LimitedMatcher(object):
    def __init__(self, max_supported, matcher):
        self.max_supported = max_supported
        self.matcher = matcher

def _build_matchers(patterns, pattern_definitions, compile_pattern):
    not_implemented = set()
    matcher_definitions = []
    for (pattern, largest_count, count_info_tuple) in pattern_definitions:
        if pattern == "<0>":
            # Special case support for setting (or rather, ignoring) the 0 bit
            continue

        matcher = compile_pattern(pattern, largest_count)

        if isinstance(matcher, LimitedMatcher):
            max_supported = matcher.max_supported
            new_count_info = []
            for count_info in count_info_tuple:
                if count_info.count <= max_supported:
                    new_count_info.append(count_info)
                else:
                    not_implemented.add(count_info.bit)
            matcher = matcher.matcher
            count_info_tuple = tuple(new_count_info)
            if not count_info_tuple:
                continue

        if matcher is None:
            # During development I sometimes forgot to return a matcher
            # This catches those cases
            raise UnsupportedPatternError(pattern)

        matcher_definitions.append( (matcher, largest_count, count_info_tuple) )

    return not_implemented, tuple(matcher_definitions)

def make_matchers(patterns, compile_pattern):
    pattern_definitions = _bit_definition_to_pattern_definition(patterns.bit_definitions)
    try:
        return _build_matchers(patterns, pattern_definitions, compile_pattern)
    except UnsupportedPatternError, err:
        err.filename = patterns.filename

        pattern = err.args[0]
        for bitdef in patterns.bit_definitions:
            if bitdef.pattern == pattern:
                err.lineno = bitdef.lineno
                raise
        raise

class PatternFingerprinter(object):
    def __init__(self, patterns, compile_pattern):
        self.patterns = patterns
        self.num_bytes = (patterns.max_bit // 8) + 1
        self.not_implemented, self.matcher_definitions = (
            make_matchers(patterns, compile_pattern) )

    def describe(self, bit):
        description = self.patterns[bit].description
        if bit in self.not_implemented:
            description += " (NOT IMPLEMENTED)"
        return description

    def fingerprint(self, mol):
        raise NotImplementedError("Must be implemented by a derived class")

def _load_named_patterns(name):
    filename = os.path.join(os.path.dirname(__file__), name + ".patterns")
    return load_patterns(filename)
chemfp-1.1p1/chemfp/progressbar/0000755000077000000240000000000012106315372017050 5ustar dalkestaff00000000000000
chemfp-1.1p1/chemfp/progressbar/__init__.py0000644000077000000240000002357311661040267021176 0ustar dalkestaff00000000000000
#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# progressbar - Text progress bar library for Python.
# Copyright (c) 2005 Nilton Volpato
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

'''Text progress bar library for Python.
A text progress bar is typically used to display the progress of a long running operation, providing a visual cue that processing is underway. The ProgressBar class manages the current progress, and the format of the line is given by a number of widgets. A widget is an object that may display differently depending on the state of the progress bar. There are three types of widgets: - a string, which always shows itself - a ProgressBarWidget, which may return a different value every time its update method is called - a ProgressBarWidgetHFill, which is like ProgressBarWidget, except it expands to fill the remaining width of the line. The progressbar module is very easy to use, yet very powerful. It will also automatically enable features like auto-resizing when the system supports it. ''' from __future__ import division from __future__ import absolute_import import math import os import signal import sys import time try: from fcntl import ioctl from array import array import termios except ImportError: pass from .compat import * from .widgets import * __author__ = 'Nilton Volpato' __author_email__ = 'first-name dot last-name @ gmail.com' __date__ = '2011-05-14' __version__ = '2.3' class UnknownLength: pass class ProgressBar(object): '''The ProgressBar class which updates and prints the bar. A common way of using it is like: >>> pbar = ProgressBar().start() >>> for i in range(100): ... # do something ... pbar.update(i+1) ... >>> pbar.finish() You can also use a ProgressBar as an iterator: >>> progress = ProgressBar() >>> for i in progress(some_iterable): ... # do something ... Since the progress bar is incredibly customizable you can specify different widgets of any type in any order. You can even write your own widgets! However, since there are already a good number of widgets you should probably play around with them before moving on to create your own widgets. The term_width parameter represents the current terminal width. 
If the parameter is set to an integer then the progress bar will use that, otherwise it will attempt to determine the terminal width falling back to 80 columns if the width cannot be determined. When implementing a widget's update method you are passed a reference to the current progress bar. As a result, you have access to the ProgressBar's methods and attributes. Although there is nothing preventing you from changing the ProgressBar you should treat it as read only. Useful methods and attributes include (Public API): - currval: current progress (0 <= currval <= maxval) - maxval: maximum (and final) value - finished: True if the bar has finished (reached 100%) - start_time: the time when start() method of ProgressBar was called - seconds_elapsed: seconds elapsed since start_time and last call to update - percentage(): progress in percent [0..100] ''' __slots__ = ('currval', 'fd', 'finished', 'last_update_time', 'left_justify', 'maxval', 'next_update', 'num_intervals', 'poll', 'seconds_elapsed', 'signal_set', 'start_time', 'term_width', 'update_interval', 'widgets', '_time_sensitive', '__iterable') _DEFAULT_MAXVAL = 100 _DEFAULT_TERMSIZE = 80 _DEFAULT_WIDGETS = [Percentage(), ' ', Bar()] def __init__(self, maxval=None, widgets=None, term_width=None, poll=1, left_justify=True, fd=sys.stderr): '''Initializes a progress bar with sane defaults''' # Don't share a reference with any other progress bars if widgets is None: widgets = list(self._DEFAULT_WIDGETS) self.maxval = maxval self.widgets = widgets self.fd = fd self.left_justify = left_justify self.signal_set = False if term_width is not None: self.term_width = term_width else: try: self._handle_resize() signal.signal(signal.SIGWINCH, self._handle_resize) self.signal_set = True except (SystemExit, KeyboardInterrupt): raise except: self.term_width = self._env_size() self.__iterable = None self._update_widgets() self.currval = 0 self.finished = False self.last_update_time = None self.poll = poll self.seconds_elapsed = 
0 self.start_time = None self.update_interval = 1 def __call__(self, iterable): 'Use a ProgressBar to iterate through an iterable' try: self.maxval = len(iterable) except: if self.maxval is None: self.maxval = UnknownLength self.__iterable = iter(iterable) return self def __iter__(self): return self def __next__(self): try: value = next(self.__iterable) if self.start_time is None: self.start() else: self.update(self.currval + 1) return value except StopIteration: self.finish() raise # Create an alias so that Python 2.x won't complain about not being # an iterator. next = __next__ def _env_size(self): 'Tries to find the term_width from the environment.' return int(os.environ.get('COLUMNS', self._DEFAULT_TERMSIZE)) - 1 def _handle_resize(self, signum=None, frame=None): 'Tries to catch resize signals sent from the terminal.' h, w = array('h', ioctl(self.fd, termios.TIOCGWINSZ, '\0' * 8))[:2] self.term_width = w def percentage(self): 'Returns the progress as a percentage.' return self.currval * 100.0 / self.maxval percent = property(percentage) def _format_widgets(self): result = [] expanding = [] width = self.term_width for index, widget in enumerate(self.widgets): if isinstance(widget, WidgetHFill): result.append(widget) expanding.insert(0, index) else: widget = format_updatable(widget, self) result.append(widget) width -= len(widget) count = len(expanding) while count: portion = max(int(math.ceil(width * 1. / count)), 0) index = expanding.pop() count -= 1 widget = result[index].update(self, portion) width -= len(widget) result[index] = widget return result def _format_line(self): 'Joins the widgets and justifies the line' widgets = ''.join(self._format_widgets()) if self.left_justify: return widgets.ljust(self.term_width) else: return widgets.rjust(self.term_width) def _need_update(self): 'Returns whether the ProgressBar should redraw the line.' 
if self.currval >= self.next_update or self.finished: return True delta = time.time() - self.last_update_time return self._time_sensitive and delta > self.poll def _update_widgets(self): 'Checks all widgets for the time sensitive bit' self._time_sensitive = any(getattr(w, 'TIME_SENSITIVE', False) for w in self.widgets) def update(self, value=None): 'Updates the ProgressBar to a new value.' if value is not None and value is not UnknownLength: if (self.maxval is not UnknownLength and not 0 <= value <= self.maxval): raise ValueError('Value out of range') self.currval = value if not self._need_update(): return if self.start_time is None: raise RuntimeError('You must call "start" before calling "update"') now = time.time() self.seconds_elapsed = now - self.start_time self.next_update = self.currval + self.update_interval self.fd.write(self._format_line() + '\r') self.last_update_time = now def start(self): '''Starts measuring time, and prints the bar at 0%. It returns self so you can use it like this: >>> pbar = ProgressBar().start() >>> for i in range(100): ... # do something ... pbar.update(i+1) ... >>> pbar.finish() ''' if self.maxval is None: self.maxval = self._DEFAULT_MAXVAL self.num_intervals = max(100, self.term_width) self.next_update = 0 if self.maxval is not UnknownLength: if self.maxval < 0: raise ValueError('Value out of range') self.update_interval = self.maxval / self.num_intervals self.start_time = self.last_update_time = time.time() self.update(0) return self def finish(self): 'Puts the ProgressBar bar in the finished state.' self.finished = True self.update(self.maxval) self.fd.write('\n') if self.signal_set: signal.signal(signal.SIGWINCH, signal.SIG_DFL) chemfp-1.1p1/chemfp/progressbar/compat.py0000644000077000000240000000270411661040267020713 0ustar dalkestaff00000000000000#!/usr/bin/python # -*- coding: utf-8 -*- # # progressbar - Text progress bar library for Python. 
# Copyright (c) 2005 Nilton Volpato
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

'''Compatibility methods and classes for the progressbar module'''


# Python 3.x (and backports) use a modified iterator syntax
# This will allow 2.x to behave with 3.x iterators
if not hasattr(__builtins__, 'next'):
    def next(iter):
        try:
            # Try new style iterators
            return iter.__next__()
        except AttributeError:
            # Fallback in case of a "native" iterator
            return iter.next()


# Python < 2.5 does not have "any"
if not hasattr(__builtins__, 'any'):
    def any(iterator):
        for item in iterator:
            if item:
                return True
        return False
chemfp-1.1p1/chemfp/progressbar/LICENSE.txt0000644000077000000240000000462311661040267020703 0ustar dalkestaff00000000000000
You can redistribute and/or modify this library under the terms of the
GNU LGPL license or BSD license (or both).

---

progressbar - Text progress bar library for python.
Copyright (C) 2005 Nilton Volpato

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA --- progressbar - Text progress bar library for python Copyright (c) 2008 Nilton Volpato All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: a. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. b. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. c. Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
chemfp-1.1p1/chemfp/progressbar/README.txt0000644000077000000240000000151511661040267020553 0ustar dalkestaff00000000000000Text progress bar library for Python. A text progress bar is typically used to display the progress of a long running operation, providing a visual cue that processing is underway. The ProgressBar class manages the current progress, and the format of the line is given by a number of widgets. A widget is an object that may display differently depending on the state of the progress bar. There are three types of widgets: - a string, which always shows itself - a ProgressBarWidget, which may return a different value every time its update method is called - a ProgressBarWidgetHFill, which is like ProgressBarWidget, except it expands to fill the remaining width of the line. The progressbar module is very easy to use, yet very powerful. It will also automatically enable features like auto-resizing when the system supports it. chemfp-1.1p1/chemfp/progressbar/widgets.py0000644000077000000240000002204111661040267021072 0ustar dalkestaff00000000000000#!/usr/bin/python # -*- coding: utf-8 -*- # # progressbar - Text progress bar library for Python. # Copyright (c) 2005 Nilton Volpato # # This library is free software; you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public # License as published by the Free Software Foundation; either # version 2.1 of the License, or (at your option) any later version. # # This library is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # Lesser General Public License for more details. 
# # You should have received a copy of the GNU Lesser General Public # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA '''Default ProgressBar widgets''' from __future__ import division import datetime import math try: from abc import ABCMeta, abstractmethod except ImportError: AbstractWidget = object abstractmethod = lambda fn: fn else: AbstractWidget = ABCMeta('AbstractWidget', (object,), {}) def format_updatable(updatable, pbar): if hasattr(updatable, 'update'): return updatable.update(pbar) else: return updatable class Widget(AbstractWidget): '''The base class for all widgets The ProgressBar will call the widget's update value when the widget should be updated. The widget's size may change between calls, but the widget may display incorrectly if the size changes drastically and repeatedly. The boolean TIME_SENSITIVE informs the ProgressBar that it should be updated more often because it is time sensitive. ''' TIME_SENSITIVE = False __slots__ = () @abstractmethod def update(self, pbar): '''Updates the widget. pbar - a reference to the calling ProgressBar ''' class WidgetHFill(Widget): '''The base class for all variable width widgets. This widget is much like the \\hfill command in TeX, it will expand to fill the line. You can use more than one in the same line, and they will all have the same width, and together will fill the line. ''' @abstractmethod def update(self, pbar, width): '''Updates the widget providing the total width the widget must fill. pbar - a reference to the calling ProgressBar width - The total width the widget must fill ''' class Timer(Widget): 'Widget which displays the elapsed seconds.' __slots__ = ('format',) TIME_SENSITIVE = True def __init__(self, format='Elapsed Time: %s'): self.format = format @staticmethod def format_time(seconds): 'Formats time as the string "HH:MM:SS".' 
return str(datetime.timedelta(seconds=int(seconds))) def update(self, pbar): 'Updates the widget to show the elapsed time.' return self.format % self.format_time(pbar.seconds_elapsed) class ETA(Timer): 'Widget which attempts to estimate the time of arrival.' TIME_SENSITIVE = True def update(self, pbar): 'Updates the widget to show the ETA or total time when finished.' if pbar.currval == 0: return 'ETA: --:--:--' elif pbar.finished: return 'Time: %s' % self.format_time(pbar.seconds_elapsed) else: elapsed = pbar.seconds_elapsed eta = elapsed * pbar.maxval / pbar.currval - elapsed return 'ETA: %s' % self.format_time(eta) class FileTransferSpeed(Widget): 'Widget for showing the transfer speed (useful for file transfers).' format = '%6.2f %s%s/s' prefixes = ' kMGTPEZY' __slots__ = ('unit', 'format') def __init__(self, unit='B'): self.unit = unit def update(self, pbar): 'Updates the widget with the current SI prefixed speed.' if pbar.seconds_elapsed < 2e-6 or pbar.currval < 2e-6: # =~ 0 scaled = power = 0 else: speed = pbar.currval / pbar.seconds_elapsed power = int(math.log(speed, 1000)) scaled = speed / 1000.**power return self.format % (scaled, self.prefixes[power], self.unit) class AnimatedMarker(Widget): '''An animated marker for the progress bar which defaults to appear as if it were rotating. 
''' __slots__ = ('markers', 'curmark') def __init__(self, markers='|/-\\'): self.markers = markers self.curmark = -1 def update(self, pbar): '''Updates the widget to show the next marker or the first marker when finished''' if pbar.finished: return self.markers[0] self.curmark = (self.curmark + 1) % len(self.markers) return self.markers[self.curmark] # Alias for backwards compatibility RotatingMarker = AnimatedMarker class Counter(Widget): 'Displays the current count' __slots__ = ('format',) def __init__(self, format='%d'): self.format = format def update(self, pbar): return self.format % pbar.currval class Percentage(Widget): 'Displays the current percentage as a number with a percent sign.' def update(self, pbar): return '%3d%%' % pbar.percentage() class FormatLabel(Timer): 'Displays a formatted label' mapping = { 'elapsed': ('seconds_elapsed', Timer.format_time), 'finished': ('finished', None), 'last_update': ('last_update_time', None), 'max': ('maxval', None), 'seconds': ('seconds_elapsed', None), 'start': ('start_time', None), 'value': ('currval', None) } __slots__ = ('format',) def __init__(self, format): self.format = format def update(self, pbar): context = {} for name, (key, transform) in self.mapping.items(): try: value = getattr(pbar, key) if transform is None: context[name] = value else: context[name] = transform(value) except: pass return self.format % context class SimpleProgress(Widget): 'Returns progress as a count of the total (e.g.: "5 of 47")' __slots__ = ('sep',) def __init__(self, sep=' of '): self.sep = sep def update(self, pbar): return '%d%s%d' % (pbar.currval, self.sep, pbar.maxval) class Bar(WidgetHFill): 'A progress bar which stretches to fill the line.' __slots__ = ('marker', 'left', 'right', 'fill', 'fill_left') def __init__(self, marker='#', left='|', right='|', fill=' ', fill_left=True): '''Creates a customizable progress bar. 
marker - string or updatable object to use as a marker left - string or updatable object to use as a left border right - string or updatable object to use as a right border fill - character to use for the empty part of the progress bar fill_left - whether to fill from the left or the right ''' self.marker = marker self.left = left self.right = right self.fill = fill self.fill_left = fill_left def update(self, pbar, width): 'Updates the progress bar and its subcomponents' left, marker, right = (format_updatable(i, pbar) for i in (self.left, self.marker, self.right)) width -= len(left) + len(right) # Marker must *always* have length of 1 marker *= int(pbar.currval / pbar.maxval * width) if self.fill_left: return '%s%s%s' % (left, marker.ljust(width, self.fill), right) else: return '%s%s%s' % (left, marker.rjust(width, self.fill), right) class ReverseBar(Bar): 'A bar which has a marker which bounces from side to side.' def __init__(self, marker='#', left='|', right='|', fill=' ', fill_left=False): '''Creates a customizable progress bar. 
marker - string or updatable object to use as a marker left - string or updatable object to use as a left border right - string or updatable object to use as a right border fill - character to use for the empty part of the progress bar fill_left - whether to fill from the left or the right ''' self.marker = marker self.left = left self.right = right self.fill = fill self.fill_left = fill_left class BouncingBar(Bar): def update(self, pbar, width): 'Updates the progress bar and its subcomponents' left, marker, right = (format_updatable(i, pbar) for i in (self.left, self.marker, self.right)) width -= len(left) + len(right) if pbar.finished: return '%s%s%s' % (left, width * marker, right) position = int(pbar.currval % (width * 2 - 1)) if position > width: position = width * 2 - position lpad = self.fill * (position - 1) rpad = self.fill * (width - len(marker) - len(lpad)) # Swap if we want to bounce the other way if not self.fill_left: rpad, lpad = lpad, rpad return '%s%s%s%s%s' % (left, lpad, marker, rpad, right) chemfp-1.1p1/chemfp/rdkit.py0000644000077000000240000004602512104061751016212 0ustar dalkestaff00000000000000"Create RDKit fingerprints" # Copyright (c) 2010-2013 Andrew Dalke Scientific, AB (Gothenburg, Sweden) # See the contents of "__init__.py" for full license details. from __future__ import absolute_import import os import sys import gzip import rdkit from rdkit import Chem from rdkit.Chem import rdMolDescriptors import rdkit.rdBase from rdkit.Chem.MACCSkeys import GenMACCSKeys from . import sdf_reader from .encodings import from_binary_lsb as _from_binary_lsb from . import io from . 
import types

# These are the things I consider to be public
__all__ = ["read_structures", "iter_smiles_molecules", "iter_sdf_molecules"]


# If the attribute doesn't exist then this is an unsupported pre-2010 RDKit distribution
SOFTWARE = "RDKit/" + getattr(rdkit.rdBase, "rdkitVersion", "unknown")

# Used to check for version-dependent fingerprints
_VERSION_PROBE_MOL = Chem.MolFromSmiles(r"CC1=CC(=NN1CC(=O)NNC(=O)\C=C\C2=C(C=CC=C2Cl)F)C")

#########

# Helper function to convert a fingerprint to a sequence of bytes.

from rdkit import DataStructs
if getattr(DataStructs, "BitVectToBinaryText", None):
    _fp_to_bytes = DataStructs.BitVectToBinaryText
else:
    # Support for pre-2012 releases of RDKit
    def _fp_to_bytes(fp):
        return _from_binary_lsb(fp.ToBitString())[1]

#########
_allowed_formats = ["sdf", "smi"]
_format_extensions = {
    ".sdf": "sdf",
    ".mol": "sdf",
    ".sd": "sdf",
    ".mdl": "sdf",

    ".smi": "smi",
    ".can": "smi",
    ".smiles": "smi",
    ".ism": "smi",
}

class SmilesFileLocation(object):
    def __init__(self, name=None):
        self.name = name
        self.lineno = 1
    def where(self):
        s = "at line %(lineno)s"
        if self.name is not None:
            s += " of %(name)s"
        return s % self.__dict__

# While RDKit has a SMILES file parser, it doesn't handle reading from
# stdin or from compressed files. I wanted to support those as well, so
# ended up not using Chem.SmilesMolSupplier.

def iter_smiles_molecules(fileobj, name=None, errors="strict"):
    """Iterate over the SMILES file records, returning (title, RDKit.Chem.Mol) pairs

    'fileobj' is an input file or any line iterable
    'name' is the name used to report errors (if not specified, use
       fileobj.name if present)
    'errors' is one of "strict" (default), "log", or "ignore" (other values are experimental)

    Each line of the input must have at least two whitespace-separated
    fields. The first field is the SMILES and the second field is used
    as the title. A blank line, or a line with no second field, is
    passed to the error handler.
    """
    if name is None:
        name = getattr(fileobj, "name", None)
    error_handler = sdf_reader.get_parse_error_handler(errors)
    loc = SmilesFileLocation(name)
    for lineno, line in enumerate(fileobj):
        words = line.split()
        if len(words) <= 1:
            loc.lineno = lineno+1
            if not words:
                error_handler("Unexpected blank line", loc)
            else:
                error_handler("Missing SMILES name (second column)", loc)
            continue

        mol = Chem.MolFromSmiles(words[0])
        if mol is None:
            loc.lineno = lineno+1
            error_handler("Cannot parse the SMILES %r" % (words[0],), loc)
            continue

        yield words[1], mol


def iter_sdf_molecules(fileobj, name=None, id_tag=None, errors="strict"):
    """Iterate over the SD file records, returning (id, Chem.Mol) pairs

    fileobj - the input file object
    name - the name to use to report errors. If None, use fileobj.name .
    """
    # If there's no explicit filename, see if fileobj has one
    if name is None:
        name = getattr(fileobj, "name", None)
    loc = sdf_reader.FileLocation(name)
    error = sdf_reader.get_parse_error_handler(errors)
    if id_tag is None:
        for i, text in enumerate(sdf_reader.iter_sdf_records(fileobj, errors, loc)):
            mol = Chem.MolFromMolBlock(text)
            if mol is None:
                # This was not a molecule?
                error("Could not parse molecule block", loc)
                continue
            title = mol.GetProp("_Name")
            id = io.remove_special_characters_from_id(title)
            if not id:
                error("Missing title for record #%d" % (i+1), loc)
                continue
            yield id, mol
    else:
        # According to
        #   http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01436.html
        # I can make a new SDMolSupplier, then SetData(), get the first record, and
        # get its property names. That's ... crazy.
        sdf_iter = sdf_reader.iter_sdf_records(fileobj, errors, loc)
        for i, (id, text) in enumerate(sdf_reader.iter_tag_and_record(sdf_iter, id_tag)):
            mol = Chem.MolFromMolBlock(text)
            if mol is None:
                # This was not a molecule?
                error("Could not parse molecule block", loc)
                continue
            if id is None:
                error("Missing id tag %r for record #%d" % (id_tag, i+1), loc)
                continue
            id = io.remove_special_characters_from_id(id)
            if not id:
                error("Empty id tag %r for record #%d" % (id_tag, i+1), loc)
                continue
            yield id, mol

# This class helps the case when someone is entering structures
# by hand. (Most likely to occur with SMILES input.) They would like
# to see the result as soon as a record is entered. But the normal
# iteration reader grabs a buffer of input to process, and not a
# line. It's faster that way. The following adapter supports the
# iterator protocol but turns it into simple readline() calls. This
# will be slower but since we do it only if stdin is a tty, there
# shouldn't be a problem.
## class _IterUsingReadline(object):
##     "Internal class for iterating a line at a time from tty input"
##     def __init__(self, fileobj):
##         self.fileobj = fileobj
##     def __iter__(self):
##         return iter(self.fileobj.readline, "")

## def _open(filename, compressed):
##     "Internal function to open the given filename, which might be compressed"
##     if filename is None:
##         if compressed:
##             return gzip.GzipFile(fileobj=sys.stdin, mode="r")
##         else:
##             # Python's iter reads a block.
##             # When someone types interactively, read only a line.
## if sys.stdin.isatty(): ## return _IterUsingReadline(sys.stdin) ## else: ## return sys.stdin ## if compressed: ## return gzip.GzipFile(filename, "r") ## return open(filename, "rU") def is_valid_format(format): if format is None: return True try: format_name, compression = io.normalize_format(None, format, ("smi", None)) except ValueError: return False format_name = _format_extensions.get(format_name, format_name) return format_name in ("sdf", "smi") def read_structures(source, format=None, id_tag=None, errors="strict"): """Iterate the records in the input source as (title, RDKit.Chem.Mol) pairs 'source' is a filename, a file object, or None for stdin 'format' is either "sdf" or "smi" with optional ".gz" or ".bz2" extensions. If None then the format is inferred from the source extension 'errors' is one of "strict" (default), "log", or "ignore" (other values are experimental) """ format_name, compression = io.normalize_format(source, format, default=("smi", None)) format_name = _format_extensions.get(format_name, format_name) if format_name == "sdf": # I have an old PubChem file Compound_09425001_09450000.sdf . # num. lines = 5,041,475 num. bytes = 159,404,037 # # Parse times for iter_sdf_records (parsing records in Python) # 37.6s (best of 37.6, 38.3, 37.8) # Parse times for the RDKit implementation (parsing records in C++) # 40.2s (best of 41.7, 41.33, 40.2) # # The native RDKit reader is slower than the Python one and does # not have (that I can tell) support for compressed files, so # I'll go with the Python one. For those interested, here's the # RDKit version. # #if (not compressed) and (source is not None): # supplier = Chem.SDMolSupplier(source) # def native_sdf_reader(): # for mol in supplier: # if mol is None: # print >>sys.stderr, "Missing? 
after", title # else: # title = mol.GetProp("_Name") # yield title, mol # return native_sdf_reader() fileobj = io.open_compressed_input_universal(source, compression) # fileobj should always have the .name attribute set. return iter_sdf_molecules(fileobj, None, id_tag, errors) elif format_name == "smi": # I timed the native reader at 31.6 seconds (best of 31.6, 31.7, 31.7) # and the Python reader at 30.8 seconds (best of 30.8, 30.9, and 31.0) # Yes, the Python reader is faster and using it gives me better consistency # #if (not compressed) and (source is not None): # supplier = Chem.SmilesMolSupplier(source, delimiter=" \t", titleLine=False) # def native_smiles_reader(): # for mol in supplier: # yield mol.GetProp("_Name"), mol # return native_smiles_reader() fileobj = io.open_compressed_input_universal(source, compression) return iter_smiles_molecules(fileobj, None, errors) else: if format is None: raise ValueError("Unknown structure filename extension: %r" % (source,)) else: raise ValueError("Unknown structure format %r" % (format_name,)) ########### The topological fingerprinter # Some constants shared by the fingerprinter and the command-line code. NUM_BITS = 2048 MIN_PATH = 1 MAX_PATH = 7 BITS_PER_HASH = 4 USE_HS = 1 assert USE_HS == 1, "Don't make this 0 unless you know what you are doing" # Not supporting the tgtDensity and minSize options. # This program generates fixed-length fingerprints. 
def make_rdk_fingerprinter(minPath=MIN_PATH, maxPath=MAX_PATH, fpSize=NUM_BITS, nBitsPerHash=BITS_PER_HASH, useHs=USE_HS): if not (fpSize > 0): raise ValueError("fpSize must be positive") if not (minPath > 0): raise ValueError("minPath must be positive") if not (maxPath >= minPath): raise ValueError("maxPath must not be smaller than minPath") if not (nBitsPerHash > 0): raise ValueError("nBitsPerHash must be positive") def rdk_fingerprinter(mol): fp = Chem.RDKFingerprint( mol, minPath=minPath, maxPath=maxPath, fpSize=fpSize, nBitsPerHash=nBitsPerHash, useHs=useHs) return _fp_to_bytes(fp) return rdk_fingerprinter ########### The MACCS fingerprinter def maccs166_fingerprinter(mol): fp = GenMACCSKeys(mol) # In RDKit the first bit is always bit 1 .. bit 0 is empty (?!?!) bitstring_with_167_bits = fp.ToBitString() # I want the bits to start at 0, so I do a manual left shift return _from_binary_lsb(bitstring_with_167_bits[1:])[1] def make_maccs166_fingerprinter(): return maccs166_fingerprinter ########### The Morgan fingerprinter # Some constants shared by the fingerprinter and the command-line code. 
RADIUS = 2 USE_FEATURES = 0 USE_CHIRALITY = 0 USE_BOND_TYPES = 1 def make_morgan_fingerprinter(fpSize=NUM_BITS, radius=RADIUS, useFeatures=USE_FEATURES, useChirality=USE_CHIRALITY, useBondTypes=USE_BOND_TYPES): if not (fpSize > 0): raise ValueError("fpSize must be positive") if not (radius >= 0): raise ValueError("radius must be positive or zero") def morgan_fingerprinter(mol): fp = rdMolDescriptors.GetMorganFingerprintAsBitVect( mol, radius, nBits=fpSize, useChirality=useChirality, useBondTypes=useBondTypes,useFeatures=useFeatures) return _fp_to_bytes(fp) return morgan_fingerprinter ########### Torsion fingerprinter TARGET_SIZE = 4 def make_torsion_fingerprinter(fpSize=NUM_BITS, targetSize=TARGET_SIZE): if not (fpSize > 0): raise ValueError("fpSize must be positive") if not (targetSize >= 0): raise ValueError("targetSize must be positive or zero") def torsion_fingerprinter(mol): fp = rdMolDescriptors.GetHashedTopologicalTorsionFingerprintAsBitVect( mol, nBits=fpSize, targetSize=targetSize) return _fp_to_bytes(fp) return torsion_fingerprinter TORSION_VERSION = { "\xc2\x10@\x83\x010\x18\xa4,\x00\x80B\xc0\x00\x08\x00": "1", "\x13\x11\x103\x00\x007\x00\x00p\x01\x111\x0107": "2", }[make_torsion_fingerprinter(128)(_VERSION_PROBE_MOL)] ########### Atom Pair fingerprinter MIN_LENGTH = 1 MAX_LENGTH = 30 def make_atom_pair_fingerprinter(fpSize=NUM_BITS, minLength=MIN_LENGTH, maxLength=MAX_LENGTH): if not (fpSize > 0): raise ValueError("fpSize must be positive") if not (minLength >= 0): raise ValueError("minLength must be positive or zero") if not (maxLength >= minLength): raise ValueError("maxLength must not be less than minLength") def pair_fingerprinter(mol): fp = rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect( mol, nBits=fpSize, minLength=minLength, maxLength=maxLength) return _fp_to_bytes(fp) return pair_fingerprinter try: ATOM_PAIR_VERSION = { "\xfdB\xfe\xbd\xfa\xdd\xff\xf5\xff\x05\xdf?\xe3\xc3\xff\xfb": "1", 
"w\xf7\xff\xf7\xff\x17\x01\x7f\x7f\xff\xff\x7f\xff?\xff\xff": "2", }[make_atom_pair_fingerprinter(128)(_VERSION_PROBE_MOL)] except Exception, err: # RDKit 2011.06 contained a bug if "Boost.Python.ArgumentError" in str(type(err)): ATOM_PAIR_VERSION = None else: raise #################### from .types import FingerprintFamilyConfig, positive_int, nonnegative_int, zero_or_one def _read_structures(metadata, source, format, id_tag, errors): if metadata.aromaticity is not None: raise ValueError("RDKit does not support alternate aromaticity models " "(want aromaticity=%r)" % metadata.aromaticity) return read_structures(source, format, id_tag, errors) # Check for metadata.aromaticity _base = FingerprintFamilyConfig( software = SOFTWARE, read_structures = _read_structures, ) _base.add_argument("fpSize", decoder=positive_int, metavar="INT", default=NUM_BITS, help = "number of bits in the fingerprint (applies to RDK, Morgan, topological torsion, and atom pair fingerprints") _base.add_argument("minPath", decoder=positive_int, metavar="INT", default=MIN_PATH, help = "minimum number of bonds to include in the subgraph") _base.add_argument("maxPath", decoder=positive_int, metavar="INT", default=MAX_PATH, help = "maximum number of bonds to include in the subgraph") _base.add_argument("nBitsPerHash", decoder=positive_int, metavar="INT", default=BITS_PER_HASH, help = "number of bits to set per path") _base.add_argument("useHs", decoder=zero_or_one, metavar="0|1", default=USE_HS, help = "include information about the number of hydrogens on each atom") # Morgan _base.add_argument("radius", decoder=nonnegative_int, metavar="INT", default=RADIUS, help = "radius for the Morgan algorithm") _base.add_argument("useFeatures", decoder=zero_or_one, metavar="0|1", default=USE_FEATURES, help = "use chemical-feature invariants") _base.add_argument("useChirality", decoder=zero_or_one, metavar="0|1", default=USE_CHIRALITY, help = "include chirality information") _base.add_argument("useBondTypes", 
                   decoder=zero_or_one, metavar="0|1", default=USE_BOND_TYPES,
                   help = "include bond type information")

# torsion
_base.add_argument("targetSize", decoder=positive_int, metavar="INT", default=TARGET_SIZE,
                   help = "number of bits in the fingerprint")

# pair
_base.add_argument("minLength", decoder=nonnegative_int, metavar="INT", default=MIN_LENGTH,
                   help = "minimum bond count for a pair")
_base.add_argument("maxLength", decoder=nonnegative_int, metavar="INT", default=MAX_LENGTH,
                   help = "maximum bond count for a pair")

#########

RDKitMACCSFingerprintFamily_v1 = _base.clone(
    name = "RDKit-MACCS166/1",
    num_bits = 166,
    make_fingerprinter = make_maccs166_fingerprinter,
    )

# The number of bits depends on the parameters
def _get_num_bits(d):
    return d["fpSize"]

RDKitFingerprintFamily_v1 = _base.clone(
    name = "RDKit-Fingerprint/1",
    format_string = ("minPath=%(minPath)s maxPath=%(maxPath)s fpSize=%(fpSize)s "
                     "nBitsPerHash=%(nBitsPerHash)s useHs=%(useHs)s"),
    num_bits = _get_num_bits,
    make_fingerprinter = make_rdk_fingerprinter,
    )

###

RDKitMorganFingerprintFamily_v1 = _base.clone(
    name = "RDKit-Morgan/1",
    format_string = (
        "radius=%(radius)d fpSize=%(fpSize)s useFeatures=%(useFeatures)d "
        "useChirality=%(useChirality)d useBondTypes=%(useBondTypes)d"),
    num_bits = _get_num_bits,
    make_fingerprinter = make_morgan_fingerprinter,
    )

###

def _check_torsion_version(version):
    def make_fingerprinter(*args, **kwargs):
        if TORSION_VERSION != version:
            raise TypeError("This version of RDKit does not support the RDKit-Torsion/%s fingerprint" % (version,))
        return make_torsion_fingerprinter(*args, **kwargs)
    return make_fingerprinter

RDKitTorsionFingerprintFamily_v1 = _base.clone(
    name = "RDKit-Torsion/1",
    format_string = "fpSize=%(fpSize)s targetSize=%(targetSize)d",
    num_bits = _get_num_bits,
    make_fingerprinter = _check_torsion_version("1"),
    )

# The version 2 family must carry its own name; it is checked against version "2"
RDKitTorsionFingerprintFamily_v2 = _base.clone(
    name = "RDKit-Torsion/2",
    format_string = "fpSize=%(fpSize)s targetSize=%(targetSize)d",
    num_bits = _get_num_bits,
make_fingerprinter = _check_torsion_version("2"), ) ### def _check_atom_pair_version(version): def make_fingerprinter(*args, **kwargs): if ATOM_PAIR_VERSION != version: raise TypeError("This version of RDKit does not support the RDKit-AtomPair/%s fingerprint" % (version,)) return make_atom_pair_fingerprinter(*args, **kwargs) return make_fingerprinter RDKitAtomPairFingerprintFamily_v1 = _base.clone( name = "RDKit-AtomPair/1", format_string = "fpSize=%(fpSize)s minLength=%(minLength)d maxLength=%(maxLength)d", num_bits = _get_num_bits, make_fingerprinter = _check_atom_pair_version("1"), ) RDKitAtomPairFingerprintFamily_v2 = _base.clone( name = "RDKit-AtomPair/2", format_string = "fpSize=%(fpSize)s minLength=%(minLength)d maxLength=%(maxLength)d", num_bits = _get_num_bits, make_fingerprinter = _check_atom_pair_version("2"), ) chemfp-1.1p1/chemfp/rdkit_patterns.py0000644000077000000240000001267711663545155020156 0ustar dalkestaff00000000000000from __future__ import absolute_import from rdkit import Chem from . import pattern_fingerprinter from . import rdkit from . import types from . 
import __version__ as chemfp_version SOFTWARE = rdkit.SOFTWARE + (" chemfp/%s" % (chemfp_version,)) class HydrogenMatcher(object): def has_match(self, mol): for atom in mol.GetAtoms(): if atom.GetAtomicNum() == 1: return 1 if atom.GetTotalNumHs(): return 1 return 0 def num_matches(self, mol, largest_count): num_hydrogens = 0 for atom in mol.GetAtoms(): if atom.GetAtomicNum() == 1: num_hydrogens += 1 num_hydrogens += atom.GetTotalNumHs() if num_hydrogens >= largest_count: return num_hydrogens return num_hydrogens class AromaticRings(object): def __init__(self): # The single ring case is easy; if there's an aromatic atom in a ring # then there's a ring self._single_matcher = Chem.MolFromSmarts("[aR]") def has_match(self, mol): return mol.HasSubstructMatch(self._single_matcher) def num_matches(self, mol, largest_count): nArom = 0 for ring in mol.GetRingInfo().BondRings(): if all(mol.GetBondWithIdx(bondIdx).GetIsAromatic() for bondIdx in ring): nArom += 1 if nArom == largest_count: return nArom return nArom def _is_hetereo_aromatic_atom(atom): return atom.GetIsAromatic() and atom.GetAtomicNum() not in (1, 6) class HeteroAromaticRings(object): def __init__(self): # In the single match case, if there's an aromatic non-carbon atom # then it's a hetereo ring self._single_matcher = Chem.MolFromSmarts("[aR;!#6]") def has_match(self, mol): return mol.HasSubstructMatch(self._single_matcher) def num_matches(self, mol, largest_count): nArom = 0 for ring in mol.GetRingInfo().AtomRings(): if any(_is_hetereo_aromatic_atom(mol.GetAtomWithIdx(atomIdx)) for atomIdx in ring): nArom += 1 if nArom == largest_count: return nArom return nArom class NumFragments(object): def has_match(self, mol): return mol.GetNumAtoms() > 0 def num_matches(self, mol, largest_count): return len(Chem.GetMolFrags(mol)) # RDKit matches "molecule.HasSubstructMatch(match_pattern)" # while every other toolkit does something like "match_pattern.HasSubstructMatch(molecule)" # Since SMARTS doesn't handle all the 
pattern cases, I prefer the second ordering. # This class inverts the order so I can do that. class InvertedMatcher(object): def __init__(self, matcher): self.matcher = matcher def has_match(self, mol): return mol.HasSubstructMatch(self.matcher) def num_matches(self, mol, max_count): return len(mol.GetSubstructMatches(self.matcher)) _pattern_classes = { "": HydrogenMatcher, "": AromaticRings, "": HeteroAromaticRings, "": NumFragments, } def rdkit_compile_pattern(pattern, max_count): if pattern in _pattern_classes: return _pattern_classes[pattern]() elif pattern.startswith("<"): raise NotImplementedError(pattern) #return NotImplemented # Everything else must be a SMARTS pattern matcher = Chem.MolFromSmarts(pattern) if matcher is None: raise pattern_fingerprinter.UnsupportedPatternError( pattern, "Can not interpret SMARTS pattern") return InvertedMatcher(matcher) class RDKitPatternFingerprinter(pattern_fingerprinter.PatternFingerprinter): def __init__(self, patterns): assert patterns is not None super(RDKitPatternFingerprinter, self).__init__(patterns, rdkit_compile_pattern) def fingerprint(self, mol): bytes = [0] * self.num_bytes for matcher, largest_count, count_info_tuple in self.matcher_definitions: if largest_count == 1: if matcher.has_match(mol): count_info = count_info_tuple[0] bytes[count_info.byteno] |= count_info.bitmask else: actual_count = matcher.num_matches(mol, largest_count) if actual_count: for count_info in count_info_tuple: if actual_count >= count_info.count: bytes[count_info.byteno] |= count_info.bitmask else: break return "".join(map(chr, bytes)) class _CachedFingerprinters(dict): def __missing__(self, name): patterns = pattern_fingerprinter._load_named_patterns(name) fingerprinter = RDKitPatternFingerprinter(patterns) self[name] = fingerprinter return fingerprinter _cached_fingerprinters = _CachedFingerprinters() _base = rdkit._base.clone( software = SOFTWARE) SubstructRDKitFingerprinter_v1 = _base.clone( name = "ChemFP-Substruct-RDKit/1", 
num_bits = 881, make_fingerprinter = lambda: _cached_fingerprinters["substruct"].fingerprint) # def describe(self, bitno): # return self._fingerprinter.describe(bitno) RDMACCSRDKitFingerprinter_v1 = _base.clone( name = "RDMACCS-RDKit/1", num_bits = 166, make_fingerprinter = lambda: _cached_fingerprinters["rdmaccs"].fingerprint) chemfp-1.1p1/chemfp/rdmaccs.patterns0000644000077000000240000002253711660452123017726 0ustar dalkestaff00000000000000# The contents of this file are derived from RDKit's Chem/MACCSkeys.py # and translated by hand to the chemfp pattern format. # The RDKit code is distributed with the following license: # Copyright (c) 2006-2010 # Rational Discovery LLC, Greg Landrum, and Julie Penzotti # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are # met: # # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # * Neither the name of Rational Discovery nor the names of its # contributors may be used to endorse or promote products derived # from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # Copyright (C) 2001-2008 greg Landrum and Rational Discovery LLC # # @@ All Rights Reserved @@ # # SMARTS definitions for the publically available MACCS keys # and a MACCS fingerprinter # # I compared the MACCS fingerprints generated here with those from two # other packages (not MDL, unfortunately). Of course there are # disagreements between the various fingerprints still, but I think # these definitions work pretty well. Some notes: # # 1) most of the differences have to do with aromaticity # 2) there's a discrepancy sometimes because the current RDKit # definitions do not require multiple matches to be distinct. e.g. the # SMILES C(=O)CC(=O) can match the (hypothetical) key O=CC twice in my # definition. It's not clear to me what the correct behavior is. # 3) Some keys are not fully defined in the MDL documentation # 4) Two keys, 125 and 166, have to be done outside of SMARTS. # (Note: in chemfp those are bits 123 and 165) # 5) Key 1 (ISOTOPE) isn't defined 0 1 <0> Isotope #1 [#103,#104,#105,#106,#107,#106,#109,#110,#111,#112] 1 Isotope (Not complete) 1 1 [#103,#104] Isotope (Not complete) # *NOTE* spec wrong 2 1 [Ge,#33,#34,Sn,Sb,#52,Tl,Pb,Bi] Group IVa,Va,VIa Periods 4-6 (Ge...) 3 1 [Ac,Th,Pa,U,Np,Pu,Am,Cm,Bk,Cf,Es,Fm,Md,No,Lr] actinide # *NOTE* spec wrong 4 1 [Sc,Ti,Y,Zr,Hf] Group IIIB,IVB (Sc...) 5 1 [La,Ce,Pr,Nd,Pm,Sm,Eu,Gd,Tb,Dy,Ho,Er,Tm,Yb,Lu] Lanthanide # *NOTE* spec wrong 6 1 [V,Cr,Mn,Nb,Mo,Tc,Ta,W,Re] Group VB,VIB,VIIB (V...) 
7 1 [!#6;!#1]~1~*~*~*~1 QAAA@1 8 1 [Fe,Co,Ni,Ru,Rh,Pd,Os,Ir,Pt] Group VIII (Fe...) 9 1 [Be,Mg,Ca,Sr,Ba,Ra] Group IIa (Alkaline earth) 10 1 *~1~*~*~*~1 4-member Ring 11 1 [Cu,Zn,Ag,Cd,Au,Hg] Group IB,IIB (Cu..) 12 1 [#8]~[#7](~[#6])~[#6] ON(C)C 13 1 [#16]-[#16] S-S 14 1 [#8]~[#6](~[#8])~[#8] OC(O)O 15 1 [!#6;!#1]~1~*~*~1 QAA@1 16 1 [#6]#[#6] CTC # *NOTE* spec wrong 17 1 [#5,Al,Ga,In,Tl] Group IIIA (B...) 18 1 *~1~*~*~*~*~*~*~1 7-member Ring 19 1 [#14] Si 20 1 [#6]=[#6](~[!#6;!#1])~[!#6;!#1] C=C(Q)Q 21 1 *~1~*~*~1 3-member Ring 22 1 [#7]~[#6](~[#8])~[#8] NC(O)O 23 1 [#7]-[#8] N-O 24 1 [#7]~[#6](~[#7])~[#7] NC(N)N 25 1 [#6]=;@[#6](@*)@* C$=C($A)$A 26 1 [I] I 27 1 [!#6;!#1]~[CH2]~[!#6;!#1] QCH2Q 28 1 [#15] P 29 1 [#6]~[!#6;!#1](~[#6])(~[#6])~[!#1] CQ(C)(C)A 30 1 [!#6;!#1]~[F,Cl,Br,I] QX 31 1 [#6]~[#16]~[#7] CSN 32 1 [#7]~[#16] NS 33 1 [CH2]=* CH2=A 34 1 [Li,Na,K,Rb,Cs,Fr] Group IA (Alkali Metal) 35 1 [#16R] S Heterocycle 36 1 [#7]~[#6](~[#8])~[#7] NC(O)N 37 1 [#7]~[#6](~[#6])~[#7] NC(C)N 38 1 [#8]~[#16](~[#8])~[#8] OS(O)O 39 1 [#16]-[#8] S-O 40 1 [#6]#[#7] CTN 41 1 F F 42 1 [!C;!c;!#1;!H0]~*~[!C;!c;!#1;!H0] QHAQH # I have no idea (APD) #43 1 OTHER 43 1 <0> OTHER 44 1 [#6]=[#6]~[#7] C=CN 45 1 Br BR 46 1 [#16]~*~[#7] SAN 47 1 [#8]~[!#6;!#1](~[#8])(~[#8]) OQ(O)O 48 1 [!+0] CHARGE 49 1 [#6]=[#6](~[#6])~[#6] C=C(C)C 50 1 [#6]~[#16]~[#8] CSO 51 1 [#7]~[#7] NN 52 1 [!#6;!#1;!H0]~*~*~*~[!#6;!#1;!H0] QHAAAQH 53 1 [!#6;!#1;!H0]~*~*~[!#6;!#1;!H0] QHAAQH 54 1 [#8]~[#16]~[#8] OSO 55 1 [#8]~[#7](~[#8])~[#6] ON(O)C 56 1 [#8R] O Heterocycle 57 1 [!#6;!#1]~[#16]~[!#6;!#1] QSQ 58 1 [#16]!:*:* Snot%A%A 59 1 [#16]=[#8] S=O 60 1 [!#1]~[#16](~[!#1])~[!#1] AS(A)A 61 1 *@*!@*@* A$!A$A 62 1 [#7]=[#8] N=O 63 1 *@*!@[#16] A$A!S 64 1 c:n C%N 65 1 [#6]~[#6](~[#6])(~[#6])~[!#1] CC(C)(C)A 66 1 [!#6;!#1]~[#16] QS 67 1 [!#6;!#1;!H0]~[!#6;!#1;!H0] QHQH (&...) 
FIX: incomplete definition 68 1 [!#6;!#1]~[!#6;!#1;!H0] QQH 69 1 [!#6;!#1]~[#7]~[!#6;!#1] QNQ 70 1 [#7]~[#8] NO 71 1 [#8]~*~*~[#8] OAAO 72 1 [#16]=* S=A 73 1 [CH3]~*~[CH3] CH3ACH3 74 1 [!#1]!@[#7]@[!#1] A!N$A 75 1 [#6]=[#6](~[!#1])~[!#1] C=C(A)A 76 1 [#7]~*~[#7] NAN 77 1 [#6]=[#7] C=N 78 1 [#7]~*~*~[#7] NAAN 79 1 [#7]~*~*~*~[#7] NAAAN 80 1 [#16]~*(~[!#1])~[!#1] SA(A)A 81 1 [!#1]~[CH2]~[!#6;!#1;!H0] ACH2QH 82 1 [!#6;!#1]~1~*~*~*~*~1 QAAAA@1 83 1 [NH2] NH2 84 1 [#6]~[#7](~[#6])~[#6] CN(C)C 85 1 [C;H2,H3][!#6;!#1][C;H2,H3] CH2QCH2 86 1 [F,Cl,Br,I]!@*@* X!A$A 87 1 [#16] S 88 1 [#8]~*~*~*~[#8] OAAAO #89 1 [$([!#6;!#1;!H0]~*~*~[CH2]~[!#1]),$([!#6;!#1;!H0;R]1@[R]@[R]@[CH2;R]1),$([!#6;!#1;!H0]~[R]1@[R]@[CH2;R]1)] QHAACH2A 89 1 [$([!#6;!#1;!H0]~*~*~[CH2]~[!#1]),$([!#6;!#1;!H0;R]1@[R]@[R]@[CH2;R]1),$([!#6;!#1;!H0]~[R]1@[R]@[CH2;R]1)] QHAACH2A 90 1 [$([!#6;!#1;!H0]~*~*~*~[CH2]~[!#1]),$([!#6;!#1;!H0;R]1@[R]@[R]@[R]@[CH2;R]1),$([!#6;!#1;!H0]~[R]1@[R]@[R]@[CH2;R]1),$([!#6;!#1;!H0]~*~[R]1@[R]@[CH2;R]1)] QHAAACH2A 91 1 [#8]~[#6](~[#7])~[#6] OC(N)C 92 1 [!#6;!#1]~[CH3] QCH3 93 1 [!#6;!#1]~[#7] QN 94 1 [#7]~*~*~[#8] NAAO 95 1 *~1~*~*~*~*~1 5 M ring 96 1 [#7]~*~*~*~[#8] NAAAO 97 1 [!#6;!#1]~1~*~*~*~*~*~1 QAAAAA@1 98 1 [#6]=[#6] C=C 99 1 [!#1]~[CH2]~[#7] ACH2N 100 1 [$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1),$([R]@1@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]@[R]1)] 8M Ring or larger. 
This only handles up to ring sizes of 14 101 1 [!#6;!#1]~[#8] QO 102 1 Cl CL 103 1 [!#6;!#1;!H0]~*~[CH2]~[!#1] QHACH2A 104 1 *@*(@*)@* A$A($A)$A 105 1 [!#6;!#1]~*(~[!#6;!#1])~[!#6;!#1] QA(Q)Q 106 1 [F,Cl,Br,I]~*(~[!#1])~[!#1] XA(A)A 107 1 [CH3]~*~*~*~[CH2]~[!#1] CH3AAACH2A 108 1 [!#1]~[CH2]~[#8] ACH2O 109 1 [#7]~[#6]~[#8] NCO 110 1 [#7]~*~[CH2]~[!#1] NACH2A 111 1 [!#1]~*(~[!#1])(~[!#1])~[!#1] AA(A)(A)A 112 1 [#8]!:*:* Onot%A%A 113 1 [CH3]~[CH2]~[!#1] CH3CH2A 114 1 [CH3]~*~[CH2]~[!#1] CH3ACH2A 115 1 [$([CH3]~*~*~[CH2]~[!#1]),$([CH3]~*1~*~[CH2]1)] CH3AACH2A 116 1 [#7]~*~[#8] NAO 117 2 [$([!#1]~[CH2]~[CH2]~[!#1]),$(*1~[CH2]~[CH2]1)] ACH2CH2A > 1 118 1 [#7]=* N=A # FIX: incomplete definition 119 2 [!#6;R] Heterocyclic atom > 1 (&...) 120 1 [#7;R] N Heterocycle 121 1 [!#1]~[#7](~[!#1])~[!#1] AN(A)A 122 1 [#8]~[#6]~[#8] OCO 123 1 [!#6;!#1]~[!#6;!#1] QQ 124 2 Aromatic Ring > 1 125 1 [!#1]!@[#8]!@[!#1] A!O!A # FIX: incomplete definition 126 2 *@*!@[#8] A$A!O > 1 (&...) 127 1 [$([!#1]~[CH2]~*~*~*~[CH2]~[!#1]),$([R]1@[CH2;R]@[R]@[R]@[R]@[CH2;R]1),$([!#1]~[CH2]~[R]1@[R]@[R]@[CH2;R]1),$([!#1]~[CH2]~*~[R]1@[R]@[CH2;R]1)] ACH2AAACH2A 128 1 [$([!#1]~[CH2]~*~*~[CH2]~[!#1]),$([R]1@[CH2]@[R]@[R]@[CH2;R]1),$([!#1]~[CH2]~[R]1@[R]@[CH2;R]1)] ACH2AACH2A # FIX: incomplete definition 129 2 [!#6;!#1]~[!#6;!#1] QQ > 1 (&...) 130 2 [!#6;!#1;!H0] QH > 1 131 1 [#8]~*~[CH2]~[!#1] OACH2A 132 1 *@*!@[#7] A$A!N 133 1 [F,Cl,Br,I] X (HALOGEN) 134 1 [#7]!:*:* Nnot%A%A 135 2 [#8]=* O=A > 1 136 1 [!#6;R] Heterocycle # FIX: incomplete definition 137 2 [!#6;!#1]~[CH2]~[!#1] QCH2A > 1 (&...) 138 1 [O;!H0] OH # FIX: incomplete definition 139 4 [#8] O > 3 (&...) # FIX: incomplete definition 140 3 [CH3] CH3 > 2 (&...) 
141 2 [#7] N > 1 142 1 *@*!@[#8] A$A!O 143 1 [!#1]!:*:*!:[!#1] Anot%A%Anot%A 144 2 *~1~*~*~*~*~*~1 6-member ring > 1 145 3 [#8] O > 2 146 1 [$([!#1]~[CH2]~[CH2]~[!#1]),$([R]1@[CH2;R]@[CH2;R]1)] ACH2CH2A 147 1 [!#1]~[!#6;!#1](~[!#1])~[!#1] AQ(A)A 148 2 [C;H3,H4] CH3 > 1 149 1 [!#1]!@*@*!@[!#1] A!A$A!A 150 1 [#7;!H0] NH 151 1 [#8]~[#6](~[#6])~[#6] OC(C)C 152 1 [!#6;!#1]~[CH2]~[!#1] QCH2A 153 1 [#6]=[#8] C=O 154 1 [!#1]!@[CH2]!@[!#1] A!CH2!A 155 1 [#7]~[!#1](~[!#1])~[!#1] NA(A)A 156 1 [#6]-[#8] C-O 157 1 [#6]-[#7] C-N 158 2 [#8] O>1 159 1 [C;H3,H4] CH3 160 1 [#7] N 161 1 a Aromatic 162 1 *~1~*~*~*~*~*~1 6-member Ring 163 1 [#8] O 164 1 [R] Ring # this can't be done in SMARTS 165 2 more than one fragment chemfp-1.1p1/chemfp/sdf_reader.py0000644000077000000240000002755111660452123017201 0ustar dalkestaff00000000000000"""sdf_reader - iterate through records in an SD file""" # This is used by the command-line SDF reader, used when the source # fingerprints are already encoded in fields of an SD file. # It's also used by the RDKit parser - see the comments there. from __future__ import absolute_import from . import ParseError from . import io __all__ = ["open_sdf", "iter_sdf_records", "iter_two_tags", "iter_title_and_tag"] import sys import re import chemfp class SDFParseError(ParseError): def __init__(self, msg, filename, lineno): super(SDFParseError, self).__init__(msg, filename, lineno) self.msg = msg self.filename = filename self.lineno = lineno def __repr__(self): return "SDFParseError(%r, %r, %r)" % (self.msg, self.filename, self.lineno) def __str__(self): return self.msg def ignore_parse_errors(msg, location): pass def report_parse_errors(msg, location): sys.stderr.write("ERROR: %s %s. 
Skipping.\n" % (msg, location.where()))

def strict_parse_errors(msg, location):
    # SDFParseError takes (msg, filename, lineno), in that order
    raise SDFParseError(msg + " " + location.where(), location.name, location.lineno)

_parse_error_handlers = {
    "ignore": ignore_parse_errors,
    "report": report_parse_errors,
    "strict": strict_parse_errors,
}

def get_parse_error_handler(name_or_callable="strict"):
    try:
        return _parse_error_handlers[name_or_callable]
    except KeyError:
        raise ValueError("'errors' must be one of %s" % ", ".join(sorted(_parse_error_handlers)))

# Do a quick check that the SD record is in the correct format
_sdf_check_pat = re.compile(r"""
.*\n     # line 1
.*\n     # line 2
.*\n     # line 3
  # next are the number of atoms and bonds. This pattern allows
  # ' 0', ' 00', '000', '00 ', '0 ' and ' 0 ', which is the
  # same as checking for field.strip().isdigit()
  # The escape '\040' is for the space character
(\040\040\d|\040\d\d|\d\d\d|\d\d\040|\d\040|\040\d\040) # number of atoms
(\040\040\d|\040\d\d|\d\d\d|\d\d\040|\d\040|\040\d\040) # number of bonds
  # Only space and digits are allowed before the required V2000 or V3000
[\0400-9]{28}V(2000|3000)
""", re.X)

class FileLocation(object):
    """A mutable instance used to track record title and position information

    You may pass one of these to the readers to get that information. I'm
    a bit unsure about doing it this way but I liked the options of
    passing back a 2-ple of (record, position) or records with an
    attribute like record.position even less.

    WARNING: the attributes '.lineno' and '._record' must be set-able.
    Passing in a non-FileLocation object may cause interoperability
    problems in the future.
    """
    def __init__(self, name=None):
        self.name = name
        self.lineno = 1
        self._record = None  # internal variable; it is only valid enough to get the title

    @property
    def title(self):
        # The title isn't needed for most cases so don't extract it unless needed
        if self._record is None:
            return None
        return self._record[:self._record.find("\n")].strip()

    def where(self):
        s = "at line %s" % (self.lineno,)
        if self.name is not None:
            s += " of %r" % (self.name,)
        title = self.title
        if title:
            s += " (title=%r)" % (title,)
        return s

    def info(self):
        return dict(name=self.name, lineno=self.lineno, title=self.title)


def open_sdf(source=None, decompressor="auto", errors="strict", location=None):
    """Open an SD file and return an iterator over the SD records, as blocks of text

    source - input source. Can be None (for sys.stdin), the input filename
        as a string, or a file object.
    errors - one of "strict" (default), "log", or "ignore". Other values are experimental
    location - experimental location tracking.
    """
    # XXX Adapter until I remove the old decompressor code
    if decompressor == "auto":
        format = None
    elif decompressor == "gzip":
        format = "sdf.gz"
    elif decompressor == "bz2":
        format = "sdf.bz2"
    elif decompressor == "none":
        format = "sdf"
    else:
        raise AssertionError(decompressor)
    format_name, compression = io.normalize_format(source, format)
    fileobj = io.open_compressed_input_universal(source, compression)
    return iter_sdf_records(fileobj, errors, location)

# My original implementation used a line-oriented parser. That
# was decently fast, but this version, which reads a block at a time
# and works directly on those blocks, is over 3 times as fast. It's
# also a lot more complicated.

def iter_sdf_records(fileobj, errors="strict", location=None):
    """Iterate over records in an SD file, returning records as blocks of text

    fileobj - input stream. If fileobj.name exists then use it in error messages
    errors - one of "strict" (default), "log", or "ignore".
Other values are experimental location - experimental location tracking """ if location is None: location = FileLocation() if location.name is None: location.name = getattr(fileobj, "name", None) if isinstance(errors, basestring): error = get_parse_error_handler(errors) else: error = errors pushback_buffer = '' records = None while 1: if not records: read_data = fileobj.read(32768) if not read_data: # No more data from the file. If there is something in the # pushback buffer then it's an incomplete record if pushback_buffer: if pushback_buffer.endswith("\n$$$$"): # The file is missing the terminal newline. Compensate. # This will cause an extra read after the known end-of-file. # (Is that a problem? It's not supposed to be one.) pushback_buffer += "\n" else: location._record = None if location.lineno == 1: # No records read. Wrong format. error("Could not find a valid SD record", location) else: error( "unexpected content at the end of the file (perhaps the last record is truncated?)", location) break else: # We're done! break # Join the two blocks of text. This should be enough to # have lots of records, so split, and the last term is # either a partial record or the empty string. Keep track # of that for use in the next go-around. records = (pushback_buffer + read_data).split("\n$$$$\n") pushback_buffer = records.pop() # either '' or a partial record # It is possible though unlikely that the merged blocks of # text contains only an incomplete record, so this might # loop again. However, the joining and searching is an # O(n**2) operation, so I don't want to do that too often. # While it's possible to fix this, there should be no # reason to support huge records - they don't exist unless # you are really stretching and doing things like storing # images or other large data in the SD tags. # To prevent timing problems, don't allow huge records. 
if len(pushback_buffer) > 2000000: location._record = None error("record is too large for this reader", location) return else: # We have a set of records, one string per record. Pass them back. for record in records: # A simple, quick check that it looks about right if not _sdf_check_pat.match(record): location._record = record error("incorrectly formatted record", location) # If the error callback returns then just skip the record else: record += "\n$$$$\n" # restore the split text location._record = record yield record location.lineno += record.count("\n") records = None # This is complicated. I tried implementing this search with a regular # expression but it was about 30% slower than this more direct search. # Note: tag_substr must contain the "<" and ">" def _find_tag_data(rec, tag_substr): "Return the first data line for the given tag substring, or return None" startpos = 0 while 1: tag_start = rec.find(tag_substr, startpos) if tag_start == -1: return None # rfind cannot return -1 because _sdf_check_pat verified there # are at least 3 newlines. The +1 is safe because there's at # least the "<" and ">" from the tag. tag_line_start = rec.rfind("\n", 0, tag_start) + 1 if rec[tag_line_start] != ">": # This tag is not on a data tag line. It might be the value for # some of the text field. startpos = tag_start + 1 continue # This is an actual tag line. Find the start of the next line. # The record must end with "\n$$$$\n" so find will never return -1 # and never return the last character position. 
        next_line_start = rec.find("\n", tag_start) + 1

        # These might occur if there is no data content
        if rec[next_line_start]==">" or rec[next_line_start:next_line_start+4]=="$$$$":
            return ""

        # Otherwise, get up to the end of the line
        return rec[next_line_start:rec.find("\n", next_line_start)]

# These are not legal tag characters (while others may be against the
# SD file spec, these will break the parser)
_bad_char = re.compile(r"[<>\n\r\t\0]")

def iter_two_tags(sdf_iter, tag1, tag2):
    """Iterate over SD records to get the data lines for tag1 and tag2

    sdf_iter - an iterator which returns SD records
    tag1 - the name of the first tag
    tag2 - the name of the second tag

    Each record yields a (tag1_value, tag2_value) 2-ple. If a tag is
    present then its value comes from its first data line (or is the
    empty string if there is no data line). If there are multiple
    fields with the same name then the first one is used. If a tag
    does not exist, its value is None.
    """
    m = _bad_char.search(tag1)
    if m:
        raise TypeError("tag1 must not contain the character %r" % (m.group(0),))
    m = _bad_char.search(tag2)
    if m:
        raise TypeError("tag2 must not contain the character %r" % (m.group(0),))

    tag1_substr = "<" + tag1 + ">"
    tag2_substr = "<" + tag2 + ">"
    for rec in sdf_iter:
        yield _find_tag_data(rec, tag1_substr), _find_tag_data(rec, tag2_substr)

def iter_title_and_tag(sdf_iter, tag):
    """Iterate over SD records to get the title line and data line for the specified tag

    sdf_iter - an iterator over SD records, as text
    tag - the name of the tag value to return

    Each record yields a (title, tag_value) 2-ple where the title is
    the first line of the SD record (not a tag!) and the tag_value
    comes from the first data line for the given tag. If the tag is
    present multiple times, use the first match. If the data line is
    missing, the value is "". If the tag does not exist, the value is
    None.
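
The return-value rules described here can be illustrated with a small
standalone sketch. It is a simplified, line-based stand-in and not the
block-oriented search this module actually uses; `first_tag_value` and
the sample record are hypothetical names made up for the illustration.

```python
def first_tag_value(record, tag):
    # Hypothetical helper, not part of chemfp: scan the record line by line.
    lines = record.splitlines()
    tag_substr = "<" + tag + ">"
    for i, line in enumerate(lines):
        # Only lines which start with ">" are data header lines.
        if line.startswith(">") and tag_substr in line:
            next_line = lines[i + 1] if i + 1 < len(lines) else ""
            # No data content: the next line is another header or the terminator.
            if next_line.startswith(">") or next_line.startswith("$$$$"):
                return ""
            return next_line
    # The tag does not occur on any data header line.
    return None

# A made-up record with one filled-in tag and one tag without a data line.
record = '''ethanol
  toolkit

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ID>
CHEMBL545

> <EMPTY>
$$$$
'''

title = record.splitlines()[0].strip()
id_value = first_tag_value(record, "ID")        # first data line of the tag
empty_value = first_tag_value(record, "EMPTY")  # tag present, no data line
missing_value = first_tag_value(record, "FP")   # tag absent
```

The three cases map to a data string, the empty string, and None, which
lets a caller tell "tag present but empty" apart from "tag missing".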
""" m = _bad_char.search(tag) if m: raise TypeError("tag must not contain the character %r" % (m.group(0),)) tag_substr = "<" + tag + ">" for rec in sdf_iter: yield rec[:rec.find("\n")].strip(), _find_tag_data(rec, tag_substr) def iter_tag_and_record(sdf_iter, tag): m = _bad_char.search(tag) if m: raise TypeError("tag must not contain the character %r" % (m.group(0),)) tag_substr = "<" + tag + ">" for rec in sdf_iter: yield _find_tag_data(rec, tag_substr), rec chemfp-1.1p1/chemfp/search.py0000644000077000000240000011441212104077217016342 0ustar dalkestaff00000000000000"""Search a FingerprintArena and work with the search results This module implements the different ways to search a `FingerprintArena`. The search functions are: Count the number of hits: count_tanimoto_hits_fp - search an arena using a single fingerprint count_tanimoto_hits_arena - search an arena using an arena count_tanimoto_hits_symmetric - search an arena using itself partial_count_tanimoto_hits_symmetric - (advanced use; see the doc string) Find all hits at or above a given threshold, sorted arbitrarily: threshold_tanimoto_search_fp - search an arena using a single fingerprint threshold_tanimoto_search_arena - search an arena using an arena threshold_tanimoto_search_symmetric - search an arena using itself partial_threshold_tanimoto_search_symmetric - (advanced use; see the doc string) fill_lower_triangle - copy the upper triangle terms to the lower triangle Find the k-nearest hits at or above a given threshold, sorted by decreasing similarity: knearest_tanimoto_search_fp - search an arena using a single fingerprint knearest_tanimoto_search_arena - search an arena using an arena knearest_tanimoto_search_symmetric - search an arena using itself The threshold and k-nearest search results use a `SearchResult` when a fingerprint is used as a query, or a `SearchResults` when an arena is used as a query. These internally use a compressed sparse row format. 
""" import _chemfp import ctypes import array # __all__ = ["SearchResult", "SearchResults", "count_tanimoto_hits_fp", "count_tanimoto_hits_arena", "count_tanimoto_hits_symmetric", "partial_count_tanimoto_hits_symmetric", "threshold_tanimoto_search_fp", "threshold_tanimoto_search_arena", "threshold_tanimoto_search_symmetric", "partial_threshold_tanimoto_search_symmetric", "fill_lower_triangle", "knearest_tanimoto_search_fp", "knearest_tanimoto_search_arena", "knearest_tanimoto_search_symmetric" ] class SearchResult(object): """Search results for a query fingerprint against a target arena. The results contains a list of hits. Hits contain a target index, score, and optional target ids. The hits can be reordered based on score or index. """ def __init__(self, search_results, row): self._search_results = search_results self._row = row def __len__(self): """The number of hits""" return self._search_results._size(self._row) def __iter__(self): """Iterate through the pairs of (target index, score) using the current ordering""" return iter(self._search_results._get_indices_and_scores(self._row)) def clear(self): """Remove all hits from this result""" self._search_results._clear_row(self._row) def get_indices(self): """The list of target indices, in the current ordering.""" return self._search_results._get_indices(self._row) def get_ids(self): """The list of target identifiers (if available), in the current ordering""" ids = self._search_results.target_ids if ids is None: return None return [ids[i] for i in self._search_results._get_indices(self._row)] def get_scores(self): """The list of target scores, in the current ordering""" return self._search_results._get_scores(self._row) def get_ids_and_scores(self): """The list of (target identifier, target score) pairs, in the current ordering Raises a TypeError if the target IDs are not available. 
""" ids = self._search_results.target_ids if ids is None: raise TypeError("target_ids are not available") return zip(self.get_ids(), self.get_scores()) def get_indices_and_scores(self): """The list of (target index, score) pairs, in the current ordering""" return self._search_results._get_indices_and_scores(self._row) def reorder(self, ordering="decreasing-score"): """Reorder the hits based on the requested ordering. The available orderings are: increasing-score: sort by increasing score decreasing-score: sort by decreasing score increasing-index: sort by increasing target index decreasing-index: sort by decreasing target index move-closest-first: move the hit with the highest score to the first position reverse: reverse the current ordering :param ordering: the name of the ordering to use """ self._search_results._reorder_row(self._row, ordering) def count(self, min_score=None, max_score=None, interval="[]"): """Count the number of hits with a score between `min_score` and `max_score` Using the default parameters this returns the number of hits in the result. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. The half-open/half-closed intervals "(]" and "[)" are also supported. :param min_score: the minimum score in the range. :type min_score: a float, or None for -infinity :param max_score: the maximum score in the range. :type max_score: a float, or None for +infinity :param interval: specify if the end points are open or closed. 
        :type interval: one of "[]", "()", "(]", "[)"
        :returns: an integer count
        """
        return self._search_results._count_row(self._row, min_score, max_score, interval)

    def cumulative_score(self, min_score=None, max_score=None, interval="[]"):
        """The sum of the scores which are between `min_score` and `max_score`

        Using the default parameters this returns the sum of all of
        the scores in the result. With a specified range this returns
        the sum of all of the scores in that range. The cumulative
        score is also known as the raw score.

        The default `min_score` of None is equivalent to -infinity.
        The default `max_score` of None is equivalent to +infinity.

        The `interval` parameter describes the interval end
        conditions. The default of "[]" uses a closed interval,
        where min_score <= score <= max_score. The interval "()"
        uses the open interval where min_score < score < max_score.
        The half-open/half-closed intervals "(]" and "[)" are also
        supported.

        :param min_score: the minimum score in the range.
        :type min_score: a float, or None for -infinity
        :param max_score: the maximum score in the range.
        :type max_score: a float, or None for +infinity
        :param interval: specify if the end points are open or closed.
        :type interval: one of "[]", "()", "(]", "[)"
        :returns: a floating point value
        """
        return self._search_results._cumulative_score_row(self._row, min_score, max_score, interval)

    ## ??? What does this do?
    ## @property
    ## def target_id(self):
    ##     ids = self._search_results.target_ids
    ##     if ids is None:
    ##         return None
    ##     return ids[self._row]


class SearchResults(_chemfp.SearchResults):
    """Search results for a list of query fingerprints against a target arena

    This acts like a list of SearchResult elements, with the ability
    to iterate over each search result, look them up by index, and
    get the number of results.

    In addition, there are helper methods to iterate over the hits of
    each result and to get the hit indices, scores, and identifiers
    directly as Python lists, sort the list contents, and more.
    """
    def __init__(self, n, arena_ids=None):
        """`n` is the number of SearchResult instances and `arena_ids` the target arena ids

        There is one SearchResult for each query fingerprint. The
        `arena_ids` are used to map the hit index back to the hit id.
        """
        super(SearchResults, self).__init__(n, arena_ids)
        self._results = [SearchResult(self, i) for i in xrange(n)]

    def __iter__(self):
        """Iterate over each SearchResult"""
        return iter(self._results)

    def __len__(self):
        """The number of rows in the SearchResults"""
        return super(SearchResults, self).__len__()

    def __getitem__(self, i):
        """Get the 'i'th SearchResult"""
        try:
            return self._results[i]
        except IndexError:
            raise IndexError("row index is out of range")

    def iter_indices(self):
        """For each search result, yield the list of target indices"""
        for i in xrange(len(self)):
            yield self._get_indices(i)

    def iter_ids(self):
        """For each search result, yield the list of target identifiers"""
        ids = self.target_ids
        for indices in self.iter_indices():
            yield [ids[idx] for idx in indices]

    def iter_scores(self):
        """For each search result, yield the list of target scores"""
        for i in xrange(len(self)):
            yield self._get_scores(i)

    def iter_indices_and_scores(self):
        """For each search result, yield the list of (target index, score) tuples"""
        for i in xrange(len(self)):
            yield zip(self._get_indices(i), self._get_scores(i))

    def iter_ids_and_scores(self):
        """For each search result, yield the list of (target id, score) tuples"""
        ids = self.target_ids
        for i in xrange(len(self)):
            yield [(ids[idx], score) for (idx, score) in self[i]]

    # I don't like how C-level doc strings can't report the call
    # signature even though keyword arguments are supported. I also
    # don't like maintaining the docstrings in C code.
# Problem solved by interposing these Python methods def clear_all(self): """Remove all hits from all of the search results""" return super(SearchResults, self).clear_all() def count_all(self, min_score=None, max_score=None, interval="[]"): """Count the number of hits with a score between `min_score` and `max_score` Using the default parameters this returns the number of hits in the result. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. The half-open/half-closed intervals "(]" and "[)" are also supported. :param min_score: the minimum score in the range. :type min_score: a float, or None for -infinity :param max_score: the maximum score in the range. :type max_score: a float, or None for +infinity :param interval: specify if the end points are open or closed. :type interval: one of "[]", "()", "(]", "[)" :returns: an integer count """ return super(SearchResults, self).count_all(min_score, max_score, interval) def cumulative_score_all(self, min_score=None, max_score=None, interval="[]"): """The sum of all scores in all rows which are between `min_score` and `max_score` Using the default parameters this returns the sum of all of the scores in all of the results. With a specified range this returns the sum of all of the scores in that range. The cumulative score is also known as the raw score. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. 
        The half-open/half-closed intervals "(]" and "[)" are also
        supported.

        :param min_score: the minimum score in the range.
        :type min_score: a float, or None for -infinity
        :param max_score: the maximum score in the range.
        :type max_score: a float, or None for +infinity
        :param interval: specify if the end points are open or closed.
        :type interval: one of "[]", "()", "(]", "[)"
        :returns: a floating point value
        """
        return super(SearchResults, self).cumulative_score_all(min_score, max_score, interval)

    def reorder_all(self, order="decreasing-score"):
        """Reorder the hits for all of the rows based on the requested `order`.

        The available orderings are:
          increasing-score: sort by increasing score
          decreasing-score: sort by decreasing score
          increasing-index: sort by increasing target index
          decreasing-index: sort by decreasing target index
          move-closest-first: move the hit with the highest score to the first position
          reverse: reverse the current ordering

        :param order: the name of the ordering to use
        """
        return super(SearchResults, self).reorder_all(order)


def _require_matching_fp_size(query_fp, target_arena):
    if len(query_fp) != target_arena.metadata.num_bytes:
        raise ValueError("query_fp uses %d bytes while target_arena uses %d bytes" % (
            len(query_fp), target_arena.metadata.num_bytes))

def _require_matching_sizes(query_arena, target_arena):
    assert query_arena.metadata.num_bits is not None, "arenas must define num_bits"
    assert target_arena.metadata.num_bits is not None, "arenas must define num_bits"
    if query_arena.metadata.num_bits != target_arena.metadata.num_bits:
        raise ValueError("query_arena has %d bits while target_arena has %d bits" % (
            query_arena.metadata.num_bits, target_arena.metadata.num_bits))
    if query_arena.metadata.num_bytes != target_arena.metadata.num_bytes:
        raise ValueError("query_arena uses %d bytes while target_arena uses %d bytes" % (
            query_arena.metadata.num_bytes, target_arena.metadata.num_bytes))


def count_tanimoto_hits_fp(query_fp, target_arena,
                           threshold=0.7):
    """Count the number of hits in `target_arena` at least `threshold` similar to the `query_fp`

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print chemfp.search.count_tanimoto_hits_fp(query_fp, targets, threshold=0.1)

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: an integer count
    """
    _require_matching_fp_size(query_fp, target_arena)
    # Improve the alignment so the faster algorithms can be used
    query_start_padding, query_end_padding, query_fp = _chemfp.align_fingerprint(
        query_fp, target_arena.alignment, target_arena.storage_size)
    counts = array.array("i", (0 for i in xrange(len(query_fp))))
    _chemfp.count_tanimoto_arena(threshold, target_arena.num_bits,
                                 query_start_padding, query_end_padding,
                                 target_arena.storage_size, query_fp, 0, 1,
                                 target_arena.start_padding, target_arena.end_padding,
                                 target_arena.storage_size, target_arena.arena,
                                 target_arena.start, target_arena.end,
                                 target_arena.popcount_indices,
                                 counts)
    return counts[0]


def count_tanimoto_hits_arena(query_arena, target_arena, threshold=0.7):
    """For each fingerprint in `query_arena`, count the number of hits in `target_arena` at least `threshold` similar to it

    Example::

        queries = chemfp.load_fingerprints("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps")
        counts = chemfp.search.count_tanimoto_hits_arena(queries, targets, threshold=0.1)
        print counts[:10]

    The result is implementation specific. You'll always be able to
    get its length and do an index lookup to get an integer
    count. Currently it's a ctypes array of C ints, but it could be
    an array.array or Python list in the future.

    :param query_arena: The query fingerprints.
    :type query_arena: a FingerprintArena
    :param target_arena: The target fingerprints.
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: an array of counts
    """
    _require_matching_sizes(query_arena, target_arena)

    counts = (ctypes.c_int*len(query_arena))()
    _chemfp.count_tanimoto_arena(threshold, target_arena.num_bits,
                                 query_arena.start_padding, query_arena.end_padding,
                                 query_arena.storage_size,
                                 query_arena.arena, query_arena.start, query_arena.end,
                                 target_arena.start_padding, target_arena.end_padding,
                                 target_arena.storage_size,
                                 target_arena.arena, target_arena.start, target_arena.end,
                                 target_arena.popcount_indices,
                                 counts)
    return counts

def count_tanimoto_hits_symmetric(arena, threshold=0.7, batch_size=100):
    """For each fingerprint in the `arena`, count the number of other fingerprints at least `threshold` similar to it

    A fingerprint never matches itself.

    The computation can take a long time. Python won't check for a
    ^C until the function finishes. This can be irritating. Instead,
    process only `batch_size` rows at a time before checking for a ^C.

    Example::

        arena = chemfp.load_fingerprints("targets.fps")
        counts = chemfp.search.count_tanimoto_hits_symmetric(arena, threshold=0.2)
        print counts[:10]

    The result object is implementation specific. You'll always be able to
    get its length and do an index lookup to get an integer
    count. Currently it's a ctypes array of C ints, but it could be
    an array.array or Python list in the future.

    :param arena: the set of fingerprints
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param batch_size: the number of rows to process before checking for a ^C
    :type batch_size: integer
    :returns: an array of counts
    """
    N = len(arena)
    counts = (ctypes.c_int * N)()

    # This spends the entire time in C, which means ^C won't work until it finishes.
    # While it's theoretically slightly higher performance, I can't measure the
    # difference, and it's much better to let people be able to interrupt the program.
    # _chemfp.count_tanimoto_hits_arena_symmetric(
    #     threshold, arena.num_bits,
    #     arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
    #     0, N, 0, N,
    #     arena.popcount_indices,
    #     counts)

    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Process batch_size rows at a time, which lets Python handle ^C
    # between batches. Since the code processes a triangle, this means
    # that early on there will be more time between ^C checks than later.
    # I'm not able to detect the Python overhead, so I'm not going
    # to make it more "efficient".
    for query_start in xrange(0, N, batch_size):
        query_end = min(query_start + batch_size, N)
        _chemfp.count_tanimoto_hits_arena_symmetric(
            threshold, arena.num_bits,
            arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
            query_start, query_end, 0, N,
            arena.popcount_indices,
            counts)

    return counts


def partial_count_tanimoto_hits_symmetric(counts, arena, threshold=0.7,
                                          query_start=0, query_end=None,
                                          target_start=0, target_end=None):
    """Compute a portion of the symmetric Tanimoto counts

    For most cases, use count_tanimoto_hits_symmetric instead of this function!

    This function is only useful for thread-pool implementations. In
    that case, set the number of OpenMP threads to 1.

    `counts` is a contiguous array of integers. It should be
    initialized to zeros, and reused for successive calls.

    The function adds counts for counts[query_start:query_end] based
    on computing the upper-triangle portion contained in the rectangle
    query_start:query_end and target_start:target_end and using
    symmetry to fill in the lower half.

    You know, this is pretty complicated.
    Here's the bare minimum example of how to use it correctly to
    process 10 rows at a time using up to 4 threads::

        import chemfp
        import chemfp.search
        from chemfp import futures
        import array

        chemfp.set_num_threads(1)  # Globally disable OpenMP

        arena = chemfp.load_fingerprints("targets.fps")  # Load the fingerprints
        n = len(arena)
        counts = array.array("i", [0]*n)

        with futures.ThreadPoolExecutor(max_workers=4) as executor:
            for row in xrange(0, n, 10):
                executor.submit(chemfp.search.partial_count_tanimoto_hits_symmetric,
                                counts, arena, threshold=0.2,
                                query_start=row, query_end=min(row+10, n))

        print counts

    :param counts: the accumulated Tanimoto counts
    :type counts: a contiguous block of integers
    :param arena: the fingerprints.
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param query_start: the query start row
    :type query_start: an integer
    :param query_end: the query end row
    :type query_end: an integer, or None to mean the last query row
    :param target_start: the target start row
    :type target_start: an integer
    :param target_end: the target end row
    :type target_end: an integer, or None to mean the last target row
    :returns: nothing
    """
    N = len(arena)

    if query_end is None:
        query_end = N
    elif query_end > N:
        query_end = N

    if target_end is None:
        target_end = N
    elif target_end > N:
        target_end = N

    if query_end > len(counts):
        raise ValueError("counts array is too small for the given query range")
    if target_end > len(counts):
        raise ValueError("counts array is too small for the given target range")

    _chemfp.count_tanimoto_hits_arena_symmetric(
        threshold, arena.num_bits,
        arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
        query_start, query_end, target_start, target_end,
        arena.popcount_indices,
        counts)


# These all return indices into the arena!

def threshold_tanimoto_search_fp(query_fp, target_arena, threshold=0.7):
    """Search for fingerprint hits in `target_arena` which are at least `threshold` similar to `query_fp`

    The hits in the returned `SearchResult` are in arbitrary order.

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print list(chemfp.search.threshold_tanimoto_search_fp(query_fp, targets, threshold=0.15))

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResult
    """
    _require_matching_fp_size(query_fp, target_arena)

    # Improve the alignment so the faster algorithms can be used
    query_start_padding, query_end_padding, query_fp = _chemfp.align_fingerprint(
        query_fp, target_arena.alignment, target_arena.storage_size)

    results = SearchResults(1, target_arena.arena_ids)
    _chemfp.threshold_tanimoto_arena(
        threshold, target_arena.num_bits,
        query_start_padding, query_end_padding, target_arena.storage_size, query_fp, 0, 1,
        target_arena.start_padding, target_arena.end_padding,
        target_arena.storage_size, target_arena.arena,
        target_arena.start, target_arena.end,
        target_arena.popcount_indices,
        results, 0)
    return results[0]


def threshold_tanimoto_search_arena(query_arena, target_arena, threshold=0.7):
    """Search for the hits in the `target_arena` at least `threshold` similar to the fingerprints in `query_arena`

    The hits in the returned `SearchResults` are in arbitrary order.

    Example::

        queries = chemfp.load_fingerprints("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps")
        results = chemfp.search.threshold_tanimoto_search_arena(queries, targets, threshold=0.5)
        for query_id, query_hits in zip(queries.ids, results):
            if len(query_hits) > 0:
                print query_id, "->", ", ".join(query_hits.get_ids())

    :param query_arena: The query fingerprints.
    :type query_arena: a FingerprintArena
    :param target_arena: The target fingerprints.
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResults instance
    """
    _require_matching_sizes(query_arena, target_arena)

    num_queries = len(query_arena)

    results = SearchResults(num_queries, target_arena.arena_ids)
    if num_queries:
        _chemfp.threshold_tanimoto_arena(
            threshold, target_arena.num_bits,
            query_arena.start_padding, query_arena.end_padding,
            query_arena.storage_size, query_arena.arena, query_arena.start, query_arena.end,
            target_arena.start_padding, target_arena.end_padding,
            target_arena.storage_size, target_arena.arena,
            target_arena.start, target_arena.end,
            target_arena.popcount_indices,
            results, 0)

    return results

def threshold_tanimoto_search_symmetric(arena, threshold=0.7, include_lower_triangle=True, batch_size=100):
    """Search for the hits in the `arena` at least `threshold` similar to the fingerprints in the arena

    When `include_lower_triangle` is True, compute the upper-triangle
    similarities, then copy the results to get the full set of
    results. When `include_lower_triangle` is False, only compute the
    upper triangle.

    The computation can take a long time. Python won't check for a
    ^C until the function finishes. This can be irritating. Instead,
    process only `batch_size` rows at a time before checking for a ^C.

    The hits in the returned `SearchResults` are in arbitrary order.

    Example::

        arena = chemfp.load_fingerprints("queries.fps")
        full_result = chemfp.search.threshold_tanimoto_search_symmetric(arena, threshold=0.2)
        upper_triangle = chemfp.search.threshold_tanimoto_search_symmetric(
                  arena, threshold=0.2, include_lower_triangle=False)
        assert sum(map(len, full_result)) == sum(map(len, upper_triangle))*2

    :param arena: the set of fingerprints
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param include_lower_triangle:
        if False, compute only the upper triangle, otherwise use symmetry to compute the full matrix
    :type include_lower_triangle: boolean
    :param batch_size: the number of rows to process before checking for a ^C
    :type batch_size: integer
    :returns: a SearchResults instance
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    N = len(arena)
    results = SearchResults(N, arena.arena_ids)

    if N:
        # Break it up into batch_size groups in order to let Python's
        # interrupt handler check for a ^C, which is otherwise
        # suppressed until the function finishes.
        for query_start in xrange(0, N, batch_size):
            query_end = min(query_start + batch_size, N)
            _chemfp.threshold_tanimoto_arena_symmetric(
                threshold, arena.num_bits,
                arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
                query_start, query_end, 0, N,
                arena.popcount_indices,
                results)

        if include_lower_triangle:
            _chemfp.fill_lower_triangle(results, N)

    return results


#def XXXpartial_threshold_tanimoto_search(results, query_arena, target_arena, threshold,
#                                         results_offsets=0):
#    pass

def partial_threshold_tanimoto_search_symmetric(results, arena, threshold=0.7,
                                                query_start=0, query_end=None,
                                                target_start=0, target_end=None):
    """Compute a portion of the symmetric Tanimoto search results

    For most cases, use threshold_tanimoto_search_symmetric instead of this function!

    This function is only useful for thread-pool implementations. In
    that case, set the number of OpenMP threads to 1.
    `results` is a SearchResults instance which is at least as large
    as the arena. It should be reused for successive updates.

    The function adds hits to results[query_start:query_end] based
    on computing the upper-triangle portion contained in the rectangle
    query_start:query_end and target_start:target_end.

    It does not fill in the lower triangle. To get the full matrix,
    call `fill_lower_triangle`.

    You know, this is pretty complicated. Here's the bare minimum
    example of how to use it correctly to process 10 rows at a time
    using up to 4 threads::

        import chemfp
        import chemfp.search
        from chemfp import futures

        chemfp.set_num_threads(1)

        arena = chemfp.load_fingerprints("targets.fps")
        n = len(arena)
        results = chemfp.search.SearchResults(n, arena.ids)

        with futures.ThreadPoolExecutor(max_workers=4) as executor:
            for row in xrange(0, n, 10):
                executor.submit(chemfp.search.partial_threshold_tanimoto_search_symmetric,
                                results, arena, threshold=0.2,
                                query_start=row, query_end=min(row+10, n))

        chemfp.search.fill_lower_triangle(results)

    The hits in the `SearchResults` are in arbitrary order.

    :param results: the intermediate search results
    :type results: a SearchResults instance
    :param arena: the fingerprints.
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param query_start: the query start row
    :type query_start: an integer
    :param query_end: the query end row
    :type query_end: an integer, or None to mean the last query row
    :param target_start: the target start row
    :type target_start: an integer
    :param target_end: the target end row
    :type target_end: an integer, or None to mean the last target row
    :returns: nothing
    """
    assert arena.popcount_indices
    N = len(arena)

    if query_end is None:
        query_end = N
    elif query_end > N:
        query_end = N

    if target_end is None:
        target_end = N
    elif target_end > N:
        target_end = N

    # query_end and target_end are already clamped to N, so compare
    # them against the size of `results` rather than against N.
    if query_end > len(results):
        raise ValueError("results is too small for the given query range")
    if target_end > len(results):
        raise ValueError("results is too small for the given target range")

    if N:
        _chemfp.threshold_tanimoto_arena_symmetric(
            threshold, arena.num_bits,
            arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
            query_start, query_end, target_start, target_end,
            arena.popcount_indices,
            results)


def fill_lower_triangle(results):
    """Duplicate each entry of `results` to its transpose

    This is used after the symmetric threshold search to turn the
    upper-triangle results into a full matrix.
    """
    _chemfp.fill_lower_triangle(results, len(results))


# These all return indices into the arena!

def knearest_tanimoto_search_fp(query_fp, target_arena, k=3, threshold=0.7):
    """Search for `k`-nearest hits in `target_arena` which are at least `threshold` similar to `query_fp`

    The hits in the `SearchResult` are ordered by decreasing similarity score.

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print list(chemfp.search.knearest_tanimoto_search_fp(query_fp, targets, k=3, threshold=0.0))

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param k: the number of nearest neighbors to find.
:type k: positive integer :param threshold: The minimum score threshold. :type threshold: float between 0.0 and 1.0, inclusive :returns: a SearchResult """ _require_matching_fp_size(query_fp, target_arena) query_start_padding, query_end_padding, query_fp = _chemfp.align_fingerprint( query_fp, target_arena.alignment, target_arena.storage_size) if k < 0: raise ValueError("k must be non-negative") results = SearchResults(1, target_arena.arena_ids) _chemfp.knearest_tanimoto_arena( k, threshold, target_arena.num_bits, query_start_padding, query_end_padding, target_arena.storage_size, query_fp, 0, 1, target_arena.start_padding, target_arena.end_padding, target_arena.storage_size, target_arena.arena, target_arena.start, target_arena.end, target_arena.popcount_indices, results, 0) _chemfp.knearest_results_finalize(results, 0, 1) return results[0] def knearest_tanimoto_search_arena(query_arena, target_arena, k=3, threshold=0.7): """Search for the `k` nearest hits in the `target_arena` at least `threshold` similar to the fingerprints in `query_arena` The hits in the `SearchResults` are ordered by decreasing similarity score. Example:: queries = chemfp.load_fingerprints("queries.fps") targets = chemfp.load_fingerprints("targets.fps") results = chemfp.search.knearest_tanimoto_search_arena(queries, targets, k=3, threshold=0.5) for query_id, query_hits in zip(queries.ids, results): if len(query_hits) >= 2: print query_id, "->", ", ".join(query_hits.get_ids()) :param query_arena: The query fingerprints. :type query_arena: a FingerprintArena :param target_arena: The target fingerprints. :type target_arena: a FingerprintArena :param k: the number of nearest neighbors to find. :type k: positive integer :param threshold: The minimum score threshold. 
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResults instance
    """
    _require_matching_sizes(query_arena, target_arena)
    num_queries = len(query_arena)

    results = SearchResults(num_queries, target_arena.arena_ids)

    _chemfp.knearest_tanimoto_arena(
        k, threshold, target_arena.num_bits,
        query_arena.start_padding, query_arena.end_padding,
        query_arena.storage_size, query_arena.arena, query_arena.start, query_arena.end,
        target_arena.start_padding, target_arena.end_padding,
        target_arena.storage_size, target_arena.arena, target_arena.start, target_arena.end,
        target_arena.popcount_indices,
        results, 0)

    _chemfp.knearest_results_finalize(results, 0, num_queries)

    return results

def knearest_tanimoto_search_symmetric(arena, k=3, threshold=0.7, batch_size=100):
    """Search for the `k`-nearest hits in the `arena` at least `threshold` similar to the fingerprints in the arena

    The computation can take a long time. Python won't check for a ^C
    until the function finishes. This can be irritating. Instead, process
    only `batch_size` rows at a time before checking for a ^C.

    The hits in the `SearchResults` are ordered by decreasing similarity score.

    Example::

        arena = chemfp.load_fingerprints("queries.fps")
        results = chemfp.search.knearest_tanimoto_search_symmetric(arena, k=3, threshold=0.8)
        for (query_id, hits) in zip(arena.ids, results):
            print query_id, "->", ", ".join(("%s %.2f" % hit) for hit in hits.get_ids_and_scores())

    :param arena: the set of fingerprints
    :type arena: a FingerprintArena
    :param k: the number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param batch_size: the number of rows to process before checking for a ^C
    :type batch_size: integer
    :returns: a SearchResults instance
    """
    N = len(arena)
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    results = SearchResults(N, arena.arena_ids)

    if N:
        # Break it up into batch_size groups in order to let Python's
        # interrupt handler check for a ^C, which is otherwise
        # suppressed until the function finishes.
        for query_start in xrange(0, N, batch_size):
            query_end = min(query_start + batch_size, N)
            _chemfp.knearest_tanimoto_arena_symmetric(
                k, threshold, arena.num_bits,
                arena.start_padding, arena.end_padding, arena.storage_size, arena.arena,
                query_start, query_end, 0, N,
                arena.popcount_indices,
                results)
        _chemfp.knearest_results_finalize(results, 0, N)

    return results
chemfp-1.1p1/chemfp/slow.py0000644000077000000240000001221612055226640016061 0ustar dalkestaff00000000000000
import heapq
import operator

from . import bitops
from . import Fingerprints
from .
import fps_search def count_tanimoto_hits_fp(query_fp, targets, threshold): return sum(1 for target in targets if bitops.byte_tanimoto(query_fp, target[1]) >= threshold) ## def iter_count_tanimoto_hits(queries, targets, threshold): ## for query_id, query_fp in queries: ## yield query_id, sum(1 for target in targets ## if bitops.byte_tanimoto(query_fp, target[1]) >= threshold) def count_tanimoto_hits_arena(queries, targets, threshold): counts = [] for query_id, query_fp in queries: counts.append(sum(1 for target in targets if bitops.byte_tanimoto(query_fp, target[1]) >= threshold)) return counts def tanimoto_count_once(queries, targets, threshold): # Only go through the list of queries once results = [0] * len(queries) query_ids = [] query_fps = [] for (id, fp) in queries: query_ids.append(id) query_fps.append(fp) for target_id, target_fp in targets: for i, query_fp in enumerate(query_fps): if bitops.byte_tanimoto(query_fp, target_fp) >= threshold: results[i] += 1 return zip(query_ids, results) ########## def threshold_tanimoto_search_fp(query_fp, targets, threshold): ids = [] scores = [] for target_id, target_fp in targets: score = bitops.byte_tanimoto(query_fp, target_fp) if score >= threshold: ids.append(target_id) scores.append(score) return fps_search.FPSSearchResult(ids, scores) def _iter_threshold_tanimoto_search(queries, targets, threshold): for query_id, query_fp in queries: yield threshold_tanimoto_search_fp(query_fp, targets, threshold) def threshold_tanimoto_search_arena(queries, targets, threshold): results = [] for query_id, query_fp in queries: results.append(threshold_tanimoto_search_fp(query_fp, targets, threshold)) return fps_search.FPSSearchResults(results) #def tanimoto_search_all(queries, targets, threshold): # results = [[] for i in xrange(len(queries))] # query_fps = [query[1] for query in queries] # for target_id, target_fp in targets: # for i, query_fp in enumerate(query_fps): # score = bitops.byte_tanimoto(query_fp, target_fp) # if score >= 
threshold:
#                results[i].append( (target_id, score) )
#    return results

##########

def knearest_tanimoto_search_fp(query_fp, targets, k, threshold):
    hits = heapq.nlargest(k, threshold_tanimoto_search_fp(query_fp, targets, threshold),
                          key = operator.itemgetter(1))
    if hits:
        ids, scores = zip(*hits)
    else:
        ids, scores = [], []
    return fps_search.FPSSearchResult(ids, scores)

def _iter_knearest_tanimoto_search(queries, targets, k, threshold):
    for hits in _iter_threshold_tanimoto_search(queries, targets, threshold):
        hits = heapq.nlargest(k, hits, key = operator.itemgetter(1))
        if hits:
            ids, scores = zip(*hits)
        else:
            # zip(*[]) returns an empty list, which cannot be unpacked
            # into two names, so handle the no-hits case explicitly.
            ids, scores = [], []
        yield fps_search.FPSSearchResult(ids, scores)


def knearest_tanimoto_search_arena(queries, targets, k, threshold):
    results = []
    for row in _iter_knearest_tanimoto_search(queries, targets, k, threshold):
        results.append(row)
    return fps_search.FPSSearchResults(results)

# I am not going to optimize this.
#def knearest_tanimoto_search_all(queries, targets, k, threshold):
#    results = tanimoto_search(queries, targets, threshold)
#    for i, (id, hits) in enumerate(results):
#        results[i] = (id, heapq.nlargest(k, hits, key = operator.itemgetter(1)))
#    return results

def _check_threshold(threshold):
    if not (0.0 <= threshold <= 1.0):
        raise ValueError("threshold must be between 0.0 and 1.0, inclusive")

class SlowFingerprints(Fingerprints):
    def count_tanimoto_hits_fp(self, query_fp, threshold=0.7):
        _check_threshold(threshold)
        return count_tanimoto_hits_fp(query_fp, self._id_fp_pairs, threshold)

    def count_tanimoto_hits_arena(self, queries, threshold=0.7):
        _check_threshold(threshold)
        return count_tanimoto_hits_arena(queries, self._id_fp_pairs, threshold)

    def threshold_tanimoto_search_fp(self, fp, threshold=0.7):
        _check_threshold(threshold)
        return threshold_tanimoto_search_fp(fp, self._id_fp_pairs, threshold)

    def threshold_tanimoto_search_arena(self, queries, threshold=0.7):
        _check_threshold(threshold)
        return
threshold_tanimoto_search_arena(queries, self._id_fp_pairs, threshold) def knearest_tanimoto_search_fp(self, fp, k=3, threshold=0.7): if k < 0: raise ValueError("k must be non-negative") _check_threshold(threshold) return knearest_tanimoto_search_fp(fp, self._id_fp_pairs, k, threshold) def knearest_tanimoto_search_arena(self, queries, k=3, threshold=0.7, batch_size=100): if k < 0: raise ValueError("k must be non-negative") _check_threshold(threshold) return knearest_tanimoto_search_arena(queries, self._id_fp_pairs, k, threshold) chemfp-1.1p1/chemfp/substruct.patterns0000644000077000000240000011675711660452123020360 0ustar dalkestaff00000000000000# Substructure fingerprints based on the Pubchem/CACTVS substructure keys. # Format is: # - lines with only whitespace are ignored # - as are lines with "#" as the first non-whitespace character # - leading and trailing whitespace are ignored # - the pattern line contains whitespace separated columns in the form # $bit $min-count $pattern description ... # - columns 5 and onwards make the description, joined by a single " " # - the pattern is either a SMARTS or a term in <>s which cannot be represented # as SMARTS and must be specially recognized or ignored # The <> terms are: # <0> this bit is always 0 # <1> this bit is always 1 # the hydrogen count # the number of aromatic rings # the number of hetero-aromatic rings # In the future I'll probably support metadata like the following #num_bits=881 #type=ChemFP-Substruct/1 # Section 1: Hierarchic Element Counts - These bits test for the # presence or count of individual chemical atoms represented by their # atomic symbol. 
0 4 >= 4 H 1 8 >= 8 H 2 16 >= 16 H 3 32 >= 32 H 4 1 [Li] >= 1 Li 5 2 [Li] >= 2 Li 6 1 [#5] >= 1 B 7 2 [#5] >= 2 B 8 4 [#5] >= 4 B 9 2 [#6] >= 2 C 10 4 [#6] >= 4 C 11 8 [#6] >= 8 C 12 16 [#6] >= 16 C 13 32 [#6] >= 32 C 14 1 [#7] >= 1 N 15 2 [#7] >= 2 N 16 4 [#7] >= 4 N 17 8 [#7] >= 8 N 18 1 [#8] >= 1 O 19 2 [#8] >= 2 O 20 4 [#8] >= 4 O 21 8 [#8] >= 8 O 22 16 [#8] >= 16 O 23 1 [F] >= 1 F 24 2 [F] >= 2 F 25 4 [F] >= 4 F 26 1 [Na] >= 1 Na 27 2 [Na] >= 2 Na 28 1 [#14] >= 1 Si 29 2 [#14] >= 2 Si 30 1 [#15] >= 1 P 31 2 [#15] >= 2 P 32 4 [#15] >= 4 P 33 1 [#16] >= 1 S 34 2 [#16] >= 2 S 35 4 [#16] >= 4 S 36 8 [#16] >= 8 S 37 1 [Cl] >= 1 Cl 38 2 [Cl] >= 2 Cl 39 4 [Cl] >= 4 Cl 40 8 [Cl] >= 8 Cl 41 1 [K] >= 1 K 42 2 [K] >= 2 K 43 1 [Br] >= 1 Br 44 2 [Br] >= 2 Br 45 4 [Br] >= 4 Br 46 1 [I] >= 1 I 47 2 [I] >= 2 I 48 4 [I] >= 4 I 49 1 [Be] >= 1 Be 50 1 [Mg] >= 1 Mg 51 1 [Al] >= 1 Al 52 1 [Ca] >= 1 Ca 53 1 [Sc] >= 1 Sc 54 1 [Ti] >= 1 Ti 55 1 [V] >= 1 V 56 1 [Cr] >= 1 Cr 57 1 [Mn] >= 1 Mn 58 1 [Fe] >= 1 Fe 59 1 [Co] >= 1 Co 60 1 [Ni] >= 1 Ni 61 1 [Cu] >= 1 Cu 62 1 [Zn] >= 1 Zn 63 1 [Ga] >= 1 Ga 64 1 [#32] >= 1 Ge 65 1 [#33] >= 1 As 66 1 [#34] >= 1 Se 67 1 [Kr] >= 1 Kr 68 1 [Rb] >= 1 Rb 69 1 [Sr] >= 1 Sr 70 1 [Y] >= 1 Y 71 1 [Zr] >= 1 Zr 72 1 [Nb] >= 1 Nb 73 1 [Mo] >= 1 Mo 74 1 [Ru] >= 1 Ru 75 1 [Rh] >= 1 Rh 76 1 [Pd] >= 1 Pd 77 1 [Ag] >= 1 Ag 78 1 [Cd] >= 1 Cd 79 1 [In] >= 1 In 80 1 [Sn] >= 1 Sn 81 1 [Sb] >= 1 Sb 82 1 [#52] >= 1 Te 83 1 [Xe] >= 1 Xe 84 1 [Cs] >= 1 Cs 85 1 [Ba] >= 1 Ba 86 1 [Lu] >= 1 Lu 87 1 [Hf] >= 1 Hf 88 1 [Ta] >= 1 Ta 89 1 [W] >= 1 W 90 1 [Re] >= 1 Re 91 1 [Os] >= 1 Os 92 1 [Ir] >= 1 Ir 93 1 [Pt] >= 1 Pt 94 1 [Au] >= 1 Au 95 1 [Hg] >= 1 Hg 96 1 [Tl] >= 1 Tl 97 1 [Pb] >= 1 Pb 98 1 [Bi] >= 1 Bi 99 1 [La] >= 1 La 100 1 [Ce] >= 1 Ce 101 1 [Pr] >= 1 Pr 102 1 [Nd] >= 1 Nd 103 1 [Pm] >= 1 Pm 104 1 [Sm] >= 1 Sm 105 1 [Eu] >= 1 Eu 106 1 [Gd] >= 1 Gd 107 1 [Tb] >= 1 Tb 108 1 [Dy] >= 1 Dy 109 1 [Ho] >= 1 Ho 110 1 [Er] >= 1 Er 111 1 [Tm] >= 1 Tm 112 1 [Yb] >= 1 Yb 113 1 
[Tc] >= 1 Tc 114 1 [U] >= 1 U # Section 2: Rings in a canonic Extended Smallest Set of Smallest # Rings (ESSSR) ring set - These bits test for the presence or count # of the described chemical ring system. An ESSSR ring is any ring # which does not share three consecutive atoms with any other ring in # the chemical structure. For example, naphthalene has three ESSSR # rings (two phenyl fragments and the 10-membered envelope), while # biphenyl will yield a count of only two ESSSR rings. 115 1 *~1~*~*1 >= 1 any ring size 3 116 1 c1cc1 >= 1 aromatic carbon-only ring size 3 117 1 n1aa1 >= 1 aromatic nitrogen-containing ring size 3 118 1 [a;!#6]1aa1 >= 1 aromatic heteroatom-containing ring size 3 119 1 C~1~C~C1 >= 1 non-aromatic carbon-only ring size 3 120 1 N~1~A~A1 >= 1 non-aromatic nitrogen-containing ring size 3 121 1 [A;!#6]~1~A~A1 >= 1 non-aromatic heteroatom-containing ring size 3 122 2 *~1~*~*1 >= 2 any ring size 3 123 2 c1cc1 >= 2 aromatic carbon-only ring size 3 124 2 n1aa1 >= 2 aromatic nitrogen-containing ring size 3 125 2 [a;!#6]1aa1 >= 2 aromatic heteroatom-containing ring size 3 126 2 C~1~C~C1 >= 2 non-aromatic carbon-only ring size 3 127 2 N~1~A~A1 >= 2 non-aromatic nitrogen-containing ring size 3 128 2 [A;!#6]~1~A~A1 >= 2 non-aromatic heteroatom-containing ring size 3 129 1 *~1~*~*~*1 >= 1 any ring size 4 130 1 c1ccc1 >= 1 carbon-only ring size 4 131 1 n1aaa1 >= 1 nitrogen-containing ring size 4 132 1 [a;!#6]1aaa1 >= 1 heteroatom-containing ring size 4 133 1 C~1~C~C~C1 >= 1 non-aromatic carbon-only ring size 4 134 1 N~1~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 4 135 1 [A;!#6]~1~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 4 136 2 *~1~*~*~*1 >= 2 any ring size 4 137 2 c1ccc1 >= 2 aromatic carbon-only ring size 4 138 2 n1aaa1 >= 2 aromatic nitrogen-containing ring size 4 139 2 [a;!#6]1aaa1 >= 2 aromatic heteroatom-containing ring size 4 140 2 C~1~C~C~C1 >= 2 non-aromatic carbon-only ring size 4 141 2 N~1~A~A~A1 >= 2 non-aromatic 
nitrogen-containing ring size 4 142 2 [A;!#6]~1~A~A~A1 >= 2 non-aromatic heteroatom-containing ring size 4 143 1 *~1~*~*~*~*1 >= 1 any ring size 5 144 1 c1cccc1 >= 1 aromatic carbon-only ring size 5 145 1 n1aaaa1 >= 1 aromatic nitrogen-containing ring size 5 146 1 [a;!#6]1aaaa1 >= 1 aromatic heteroatom-containing ring size 5 147 1 C~1~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 5 148 1 N~1~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 5 149 1 [A;!#6]~1A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 5 150 2 *~1~*~*~*~*1 >= 2 any ring size 5 151 2 c1cccc1 >= 2 aromatic carbon-only ring size 5 152 2 n1aaaa1 >= 2 aromatic nitrogen-containing ring size 5 153 2 [a;!#6]1aaaa1 >= 2 aromatic heteroatom-containing ring size 5 154 2 C~1~C~C~C~C1 >= 2 non-aromatic carbon-only ring size 5 155 2 N~1~A~A~A~A1 >= 2 non-aromatic nitrogen-containing ring size 5 156 2 [A;!#6]~1A~A~A~A1 >= 2 non-aromatic heteroatom-containing ring size 5 157 3 *~1~*~*~*~*1 >= 3 any ring size 5 158 3 c1cccc1 >= 3 aromatic carbon-only ring size 5 159 3 n1aaaa1 >= 3 aromatic nitrogen-containing ring size 5 160 3 [a;!#6]1aaaa1 >= 3 aromatic heteroatom-containing ring size 5 161 3 C~1~C~C~C~C1 >= 3 non-aromatic carbon-only ring size 5 162 3 N~1~A~A~A~A1 >= 3 non-aromatic nitrogen-containing ring size 5 163 3 [A;!#6]~1A~A~A~A1 >= 3 non-aromatic heteroatom-containing ring size 5 164 4 *~1~*~*~*~*1 >= 4 any ring size 5 165 4 c1cccc1 >= 4 aromatic carbon-only ring size 5 166 4 n1aaaa1 >= 4 aromatic nitrogen-containing ring size 5 167 4 [a;!#6]1aaaa1 >= 4 aromatic heteroatom-containing ring size 5 168 4 C~1~C~C~C~C1 >= 4 non-aromatic carbon-only ring size 5 169 4 N~1~A~A~A~A1 >= 4 non-aromatic nitrogen-containing ring size 5 170 4 [A;!#6]~1A~A~A~A1 >= 4 non-aromatic heteroatom-containing ring size 5 171 5 *~1~*~*~*~*1 >= 5 any ring size 5 172 5 c1cccc1 >= 5 aromatic carbon-only ring size 5 173 5 n1aaaa1 >= 5 aromatic nitrogen-containing ring size 5 174 5 [a;!#6]1aaaa1 >= 5 aromatic 
heteroatom-containing ring size 5 175 5 C~1~C~C~C~C1 >= 5 non-aromatic carbon-only ring size 5 176 5 N~1~A~A~A~A1 >= 5 non-aromatic nitrogen-containing ring size 5 177 5 [A;!#6]~1A~A~A~A1 >= 5 non-aromatic heteroatom-containing ring size 5 # Rings of size 6 178 1 *~1~*~*~*~*~*1 >= 1 any ring size 6 179 1 c1ccccc1 >= 1 aromatic carbon-only ring size 6 180 1 n1aaaaa1 >= 1 aromatic nitrogen-containing ring size 6 181 1 [a;!#6]1aaaaa1 >= 1 aromatic heteroatom-containing ring size 6 182 1 C~1~C~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 6 183 1 N~1~A~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 6 184 1 [A;!#6]~1~A~A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 6 185 2 *~1~*~*~*~*~*1 >= 2 any ring size 6 186 2 c1ccccc1 >= 2 aromatic carbon-only ring size 6 187 2 n1aaaaa1 >= 2 aromatic nitrogen-containing ring size 6 188 2 [a;!#6]1aaaaa1 >= 2 aromatic heteroatom-containing ring size 6 189 2 C~1~C~C~C~C~C1 >= 2 non-aromatic carbon-only ring size 6 190 2 N~1~A~A~A~A~A1 >= 2 non-aromatic nitrogen-containing ring size 6 191 2 [A;!#6]~1~A~A~A~A~A1 >= 2 non-aromatic heteroatom-containing ring size 6 192 3 *~1~*~*~*~*~*1 >= 3 any ring size 6 193 3 c1ccccc1 >= 3 aromatic carbon-only ring size 6 194 3 n1aaaaa1 >= 3 aromatic nitrogen-containing ring size 6 195 3 [a;!#6]1aaaaa1 >= 3 aromatic heteroatom-containing ring size 6 196 3 C~1~C~C~C~C~C1 >= 3 non-aromatic carbon-only ring size 6 197 3 N~1~A~A~A~A~A1 >= 3 non-aromatic nitrogen-containing ring size 6 198 3 [A;!#6]~1~A~A~A~A~A1 >= 3 non-aromatic heteroatom-containing ring size 6 199 4 *~1~*~*~*~*~*1 >= 4 any ring size 6 200 4 c1ccccc1 >= 4 aromatic carbon-only ring size 6 201 4 n1aaaaa1 >= 4 aromatic nitrogen-containing ring size 6 202 4 [a;!#6]1aaaaa1 >= 4 aromatic heteroatom-containing ring size 6 203 4 C~1~C~C~C~C~C1 >= 4 non-aromatic carbon-only ring size 6 204 4 N~1~A~A~A~A~A1 >= 4 non-aromatic nitrogen-containing ring size 6 205 4 [A;!#6]~1~A~A~A~A~A1 >= 4 non-aromatic heteroatom-containing 
ring size 6 206 5 *~1~*~*~*~*~*1 >= 5 any ring size 6 207 5 c1ccccc1 >= 5 aromatic carbon-only ring size 6 208 5 n1aaaaa1 >= 5 aromatic nitrogen-containing ring size 6 209 5 [a;!#6]1aaaaa1 >= 5 aromatic heteroatom-containing ring size 6 210 5 C~1~C~C~C~C~C1 >= 5 non-aromatic carbon-only ring size 6 211 5 N~1~A~A~A~A~A1 >= 5 non-aromatic nitrogen-containing ring size 6 212 5 [A;!#6]~1~A~A~A~A~A1 >= 5 non-aromatic heteroatom-containing ring size 6 # Rings of size 7 213 1 *~1~*~*~*~*~*~*1 >= 1 any ring size 7 214 1 c1cccccc1 >= 1 aromatic carbon-only ring size 7 215 1 n1aaaaaa1 >= 1 aromatic nitrogen-containing ring size 7 216 1 [a;!#6]1aaaaaa1 >= 1 aromatic heteroatom-containing ring size 7 217 1 C~1~C~C~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 7 218 1 N~1~A~A~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 7 219 1 [A;!#6]~1~A~A~A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 7 220 2 *~1~*~*~*~*~*~*1 >= 2 any ring size 7 221 2 c1cccccc1 >= 2 aromatic carbon-only ring size 7 222 2 n1aaaaaa1 >= 2 aromatic nitrogen-containing ring size 7 223 2 [a;!#6]1aaaaaa1 >= 2 aromatic heteroatom-containing ring size 7 224 2 C~1~C~C~C~C~C~C1 >= 2 non-aromatic carbon-only ring size 7 225 2 N~1~A~A~A~A~A~A1 >= 2 non-aromatic nitrogen-containing ring size 7 226 2 [A;!#6]~1~A~A~A~A~A~A1 >= 2 non-aromatic heteroatom-containing ring size 7 # Rings of size 8 227 1 *~1~*~*~*~*~*~*~*1 >= 1 any ring size 8 228 1 c1ccccccc1 >= 1 aromatic carbon-only ring size 8 229 1 n1aaaaaaa1 >= 1 aromatic nitrogen-containing ring size 8 230 1 [a;!#6]1aaaaaaa1 >= 1 aromatic heteroatom-containing ring size 8 231 1 C~1~C~C~C~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 8 232 1 N~1~A~A~A~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 8 233 1 [A;!#6]~1~A~A~A~A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 8 234 2 *~1~*~*~*~*~*~*~*1 >= 2 any ring size 8 235 2 c1ccccccc1 >= 2 aromatic carbon-only ring size 8 236 2 n1aaaaaaa1 >= 2 aromatic nitrogen-containing 
ring size 8 237 2 [a;!#6]1aaaaaaa1 >= 2 aromatic heteroatom-containing ring size 8 238 2 C~1~C~C~C~C~C~C~C1 >= 2 non-aromatic carbon-only ring size 8 239 2 N~1~A~A~A~A~A~A~A1 >= 2 non-aromatic nitrogen-containing ring size 8 240 2 [A;!#6]~1~A~A~A~A~A~A~A1 >= 2 non-aromatic heteroatom-containing ring size 8 # Rings of size 9 241 1 *~1~*~*~*~*~*~*~*~*1 >= 1 any ring size 9 242 1 c1cccccccc1 >= 1 aromatic carbon-only ring size 9 243 1 n1aaaaaaaa1 >= 1 aromatic nitrogen-containing ring size 9 244 1 [a;!#6]1aaaaaaaa1 >= 1 aromatic heteroatom-containing ring size 9 245 1 C~1~C~C~C~C~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 9 246 1 N~1~A~A~A~A~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 9 247 1 [A;!#6]~1~A~A~A~A~A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 9 # Rings of size 10 248 1 *~1~*~*~*~*~*~*~*~*~*1 >= 1 any ring size 10 249 1 c1ccccccccc1 >= 1 aromatic carbon-only ring size 10 250 1 n1aaaaaaaaa1 >= 1 aromatic nitrogen-containing ring size 10 251 1 [a;!#6]1aaaaaaaaa1 >= 1 aromatic heteroatom-containing ring size 10 252 1 C~1~C~C~C~C~C~C~C~C~C1 >= 1 non-aromatic carbon-only ring size 10 253 1 N~1~A~A~A~A~A~A~A~A~A1 >= 1 non-aromatic nitrogen-containing ring size 10 254 1 [A;!#6]~1~A~A~A~A~A~A~A~A~A1 >= 1 non-aromatic heteroatom-containing ring size 10 255 1 [aR] >= 1 aromatic ring # Interesting. RDkit doesn't support [aR!#6]. Is that a legal SMARTS? 256 1 [aR;!#6] >= 1 hetero-aromatic ring 257 2 >= 2 aromatic rings 258 2 >= 2 hetero-aromatic rings 259 3 >= 3 aromatic rings 260 3 >= 3 hetero-aromatic rings 261 4 >= 4 aromatic rings 262 4 >= 4 hetero-aromatic rings # Section 3: Simple atom pairs - These bits test for the presence of # patterns of bonded atom pairs, regardless of bond order or count. 
263 1 [Li;!H0] Li-H
264 1 [Li]~[Li] Li-Li
265 1 [Li]~[#5] Li-B
266 1 [Li]~[#6] Li-C
267 1 [Li]~[#8] Li-O
268 1 [Li]~[F] Li-F
269 1 [Li]~[#15] Li-P
270 1 [Li]~[#16] Li-S
271 1 [Li]~[Cl] Li-Cl
272 1 [#5;!H0] B-H
273 1 [#5]~[#5] B-B
274 1 [#5]~[#6] B-C
275 1 [#5]~[#7] B-N
276 1 [#5]~[#8] B-O
277 1 [#5]~[F] B-F
278 1 [#5]~[#14] B-Si
279 1 [#5]~[#15] B-P
280 1 [#5]~[#16] B-S
281 1 [#5]~[Cl] B-Cl
282 1 [#5]~[Br] B-Br
283 1 [#6;!H0] C-H
284 1 [#6]~[#6] C-C
285 1 [#6]~[#7] C-N
286 1 [#6]~[#8] C-O
287 1 [#6]~[F] C-F
288 1 [#6]~[Na] C-Na
289 1 [#6]~[Mg] C-Mg
290 1 [#6]~[Al] C-Al
291 1 [#6]~[#14] C-Si
292 1 [#6]~[#15] C-P
293 1 [#6]~[#16] C-S
294 1 [#6]~[Cl] C-Cl
295 1 [#6]~[#33] C-As
296 1 [#6]~[#34] C-Se
297 1 [#6]~[Br] C-Br
298 1 [#6]~[I] C-I
299 1 [#7;!H0] N-H
300 1 [#7]~[#7] N-N
301 1 [#7]~[#8] N-O
302 1 [#7]~[F] N-F
303 1 [#7]~[#14] N-Si
304 1 [#7]~[#15] N-P
305 1 [#7]~[#16] N-S
306 1 [#7]~[Cl] N-Cl
307 1 [#7]~[Br] N-Br
308 1 [#8;!H0] O-H
309 1 [#8]~[#8] O-O
310 1 [#8]~[Mg] O-Mg
311 1 [#8]~[Na] O-Na
312 1 [#8]~[Al] O-Al
313 1 [#8]~[#14] O-Si
314 1 [#8]~[#15] O-P
315 1 [#8]~[K] O-K
316 1 [F]~[#15] F-P
317 1 [F]~[#16] F-S
318 1 [Al;!H0] Al-H
319 1 [Al]~[Cl] Al-Cl
320 1 [#14;!H0] Si-H
321 1 [#14]~[#14] Si-Si
322 1 [#14]~[Cl] Si-Cl
323 1 [#15;!H0] P-H
324 1 [#15]~[#15] P-P
325 1 [#33;!H0] As-H
326 1 [#33]~[#33] As-As

# Section 4: Simple atom nearest neighbors - These bits test for the
# presence of atom nearest neighbor patterns, regardless of bond order
# (denoted by "~") or count, but where bond aromaticity (denoted by
# ":") is significant.
327 1 [#6](~Br)(~[#6]) C(~Br)(~C) 328 1 [#6](~Br)(~[#6])(~[#6]) C(~Br)(~C)(~C) 329 1 [#6;!H0](~Br) C(~Br)(~H) 330 1 c(~Br)(:c) C(~Br)(:C) 331 1 c(~Br)(:n) C(~Br)(:N) 332 1 [#6](~[#6])(~[#6]) C(~C)(~C) 333 1 [#6](~[#6])(~[#6])(~[#6]) C(~C)(~C)(~C) 334 1 [#6](~[#6])(~[#6])(~[#6])(~[#6]) C(~C)(~C)(~C)(~C) 335 1 [#6;!H0](~[#6])(~[#6])(~[#6]) C(~C)(~C)(~C)(~H) 336 1 [#6](~[#6])(~[#6])(~[#6])(~[#7]) C(~C)(~C)(~C)(~N) 337 1 [#6](~[#6])(~[#6])(~[#6])(~[#8]) C(~C)(~C)(~C)(~O) 338 1 [#6;!H0](~[#6])(~[#6])(~[#7]) C(~C)(~C)(~H)(~N) 339 1 [#6;!H0](~[#6])(~[#6])(~[#8]) C(~C)(~C)(~H)(~O) 340 1 [#6](~[#6])(~[#6])(~[#7]) C(~C)(~C)(~N) 341 1 [#6](~[#6])(~[#6])(~[#8]) C(~C)(~C)(~O) 342 1 [#6](~[#6])(~Cl) C(~C)(~Cl) 343 1 [#6;!H0](~[#6])(~Cl) C(~C)(~Cl)(~H) 344 1 [#6;!H0](~[#6]) C(~C)(~H) 345 1 [#6;!H0](~[#6])(~[#7]) C(~C)(~H)(~N) 346 1 [#6;!H0](~[#6])(~[#8]) C(~C)(~H)(~O) 347 1 [#6;!H0](~[#6])(~[#8])(~[#8]) C(~C)(~H)(~O)(~O) 348 1 [#6;!H0](~[#6])(~[#15]) C(~C)(~H)(~P) 349 1 [#6;!H0](~[#6])(~[#16]) C(~C)(~H)(~S) 350 1 [#6](~[#6])(~I) C(~C)(~I) 351 1 [#6](~[#6])(~[#7]) C(~C)(~N) 352 1 [#6](~[#6])(~[#8]) C(~C)(~O) 353 1 [#6](~[#6])(~[#16]) C(~C)(~S) 354 1 [#6](~[#6])(~[#14]) C(~C)(~Si) 355 1 c(~[#6])(:c) C(~C)(:C) 356 1 c(~[#6])(:c)(:c) C(~C)(:C)(:C) 357 1 c(~[#6])(:c)(:n) C(~C)(:C)(:N) 358 1 c(~[#6])(:n) C(~C)(:N) 359 1 c(~[#6])(:n)(:n) C(~C)(:N)(:N) 360 1 [#6](~Cl)(~Cl) C(~Cl)(~Cl) 361 1 [#6;!H0](~Cl) C(~Cl)(~H) 362 1 c(~Cl)(:c) C(~Cl)(:C) 363 1 [#6](~F)(~F) C(~F)(~F) 364 1 c(~F)(:c) C(~F)(:C) 365 1 [#6;!H0](~[#7]) C(~H)(~N) 366 1 [#6;!H0](~[#8]) C(~H)(~O) 367 1 [#6;!H0](~[#8])(~[#8]) C(~H)(~O)(~O) 368 1 [#6;!H0](~[#16]) C(~H)(~S) 369 1 [#6;!H0](~[#14]) C(~H)(~Si) 370 1 [c;!H0](:c) C(~H)(:C) 371 1 [c;!H0](:c)(:c) C(~H)(:C)(:C) 372 1 [c;!H0](:c)(:n) C(~H)(:C)(:N) 373 1 [c;!H0](:n) C(~H)(:N) 374 1 [#6;!H0;!H1;!H2] C(~H)(~H)(~H) 375 1 [#6](~[#7])(~[#7]) C(~N)(~N) 376 1 c(~[#7])(:c) C(~N)(:C) 377 1 c(~[#7])(:c)(:c) C(~N)(:C)(:C) 378 1 c(~[#7])(:c)(:n) C(~N)(:C)(:N) 379 1 c(~[#7])(:n) 
C(~N)(:N) 380 1 [#6](~[#8])(~[#8]) C(~O)(~O) 381 1 c(~[#8])(:c) C(~O)(:C) 382 1 c(~[#8])(:c)(:c) C(~O)(:C)(:C) 383 1 c(~[#16])(:c) C(~S)(:C) 384 1 c(:c)(:c) C(:C)(:C) 385 1 c(:c)(:c)(:c) C(:C)(:C)(:C) 386 1 c(:c)(:c)(:n) C(:C)(:C)(:N) 387 1 c(:c)(:n) C(:C)(:N) 388 1 c(:c)(:n)(:n) C(:C)(:N)(:N) 389 1 c(:n)(:n) C(:N)(:N) 390 1 [#7](~[#6])(~[#6]) N(~C)(~C) 391 1 [#7](~[#6])(~[#6])(~[#6]) N(~C)(~C)(~C) 392 1 [#7;!H0](~[#6])(~[#6]) N(~C)(~C)(~H) 393 1 [#7;!H0](~[#6]) N(~C)(~H) 394 1 [#7;!H0](~[#6])(~[#7]) N(~C)(~H)(~N) 395 1 [#7](~[#6])(~[#8]) N(~C)(~O) 396 1 n(~[#6])(:c) N(~C)(:C) 397 1 n(~[#6])(:c)(:c) N(~C)(:C)(:C) 398 1 [#7;!H0](~[#7]) N(~H)(~N) 399 1 [n;!H0](:c) N(~H)(:C) 400 1 [n;!H0](:c)(:c) N(~H)(:C)(:C) 401 1 [#7](~[#8])(~[#8]) N(~O)(~O) 402 1 n(~[#8])(:o) N(~O)(:O) 403 1 n(:c)(:c) N(:C)(:C) 404 1 n(:c)(:c)(:c) N(:C)(:C)(:C) 405 1 [#8](~[#6])(~[#6]) O(~C)(~C) 406 1 [#8;!H0](~[#6]) O(~C)(~H) 407 1 [#8](~[#6])(~[#15]) O(~C)(~P) 408 1 [#8;!H0](~[#16]) O(~H)(~S) 409 1 o(:c)(:c) O(:C)(:C) 410 1 [#15](~[#6])(~[#6]) P(~C)(~C) 411 1 [#15](~[#8])(~[#8]) P(~O)(~O) 412 1 [#16](~[#6])(~[#6]) S(~C)(~C) 413 1 [#16;!H0](~[#6]) S(~C)(~H) 414 1 [#16](~[#6])(~[#8]) S(~C)(~O) 415 1 [#14](~[#6])(~[#6]) Si(~C)(~C) # Section 5: Detailed atom neighborhoods - These bits test for the # presence of detailed atom neighborhood patterns, regardless of # count, but where bond orders are specific, bond aromaticity matches # both single and double bonds, and where "-", "=", and "#" matches a # single bond, double bond, and triple bond order, respectively. 
416 1 [#6]=,:[#6] C=C 417 1 [#6]#[#6] C#C 418 1 [#6]=,:[#7] C=N 419 1 [#6]#[#7] C#N 420 1 [#6]=,:[#8] C=O 421 1 [#6]=,:[s,S] C=S 422 1 [#7]=,:[#7] N=N 423 1 [#7]=,:[#8] N=O 424 1 [#7]=,:[#15] N=P 425 1 [#15]=,:[#8] P=O 426 1 [#15]=,:[#15] P=P 427 1 [#6]#[#6]-,:[#6] C(#C)(-C) 428 1 [#6;!H0]#[#6] C(#C)(-H) 429 1 [#7]#[#6]-,:[#6] C(#N)(-C) 430 1 [#6](-,:[#6])(-,:[#6])(=,:[#6]) C(-C)(-C)(=C) 431 1 [#6](-,:[#6])(-,:[#6])(=,:[#7]) C(-C)(-C)(=N) 432 1 [#6](-,:[#6])(-,:[#6])(=,:[#8]) C(-C)(-C)(=O) 433 1 [#6](-,:[#6])(Cl)(=,:[#8]) C(-C)(-Cl)(=O) 434 1 [#6&!H0](-,:[#6])(=,:[#6]) C(-C)(-H)(=C) 435 1 [#6&!H0](-,:[#6])(=,:[#7]) C(-C)(-H)(=N) 436 1 [#6&!H0](-,:[#6])(=,:[#8]) C(-C)(-H)(=O) 437 1 [#6](-,:[#6])(-,:[#7])(=,:[#6]) C(-C)(-N)(=C) 438 1 [#6](-,:[#6])(-,:[#7])(=,:[#7]) C(-C)(-N)(=N) 439 1 [#6](-,:[#6])(-,:[#7])(=,:[#8]) C(-C)(-N)(=O) 440 1 [#6](-,:[#6])(-,:[#8])(=,:[#8]) C(-C)(-O)(=O) 441 1 [#6](-,:[#6])(=,:[#6]) C(-C)(=C) 442 1 [#6](-,:[#6])(=,:[#7]) C(-C)(=N) 443 1 [#6](-,:[#6])(=,:[#8]) C(-C)(=O) 444 1 [#6](Cl)(=,:[#8]) C(-Cl)(=O) 445 1 [#6;!H0](-,:[#7])(=,:[#6]) C(-H)(-N)(=C) 446 1 [#6;!H0](=,:[#6]) C(-H)(=C) 447 1 [#6;!H0](=,:[#7]) C(-H)(=N) 448 1 [#6;!H0](=,:[#8]) C(-H)(=O) 449 1 [#6](-,:[#7])(=,:[#6]) C(-N)(=C) 450 1 [#6](-,:[#7])(=,:[#7]) C(-N)(=N) 451 1 [#6](-,:[#7])(=,:[#8]) C(-N)(=O) 452 1 [#6](-,:[#8])(=,:[#8]) C(-O)(=O) 453 1 [#7](-,:[#6])(=,:[#6]) N(-C)(=C) 454 1 [#7](-,:[#6])(=,:[#8]) N(-C)(=O) 455 1 [#7](-,:[#8])(=,:[#8]) N(-O)(=O) 456 1 [#15](-,:[#8])(=,:[#8]) P(-O)(=O) 457 1 [#16](-,:[#6])(=,:[#8]) S(-C)(=O) 458 1 [#16](-,:[#8])(=,:[#8]) S(-O)(=O) 459 1 [#16](=,:[#8])(=,:[#8]) S(=O)(=O) # Section 6: Simple SMARTS patterns - These bits test for the presence # of simple SMARTS patterns, regardless of count, but where bond # orders are specific and bond aromaticity matches both single and # double bonds. 
460 1 [#6]-,:[#6]-,:[#6]#C C-C-C#C 461 1 [#8]-,:[#6]-,:[#6]=,:[#7] O-C-C=N 462 1 [#8]-,:[#6]-,:[#6]=,:[#8] O-C-C=O 463 1 n:c-,:[#16;!H0] N:C-S-[#1] 464 1 [#7]-,:[#6]-,:[#6]=,:[#6] N-C-C=C 465 1 [#8]=,:[#16]-,:[#6]-,:[#6] O=S-C-C 466 1 N#[#6]-,:[#6]=,:[#6] N#C-C=C 467 1 [#6]=,:[#7]-,:[#7]-,:[#6] C=N-N-C 468 1 [#8]=,:[#16]-,:[#6]-,:[#7] O=S-C-N 469 1 [#16]-,:[#16]-,:c:c S-S-C:C 470 1 c:c-,:[#6]=,:[#6] C:C-C=C 471 1 s:c:c:c S:C:C:C 472 1 c:n:c-,:[#6] C:N:C-C 473 1 [#16]-,:c:n:c S-C:N:C 474 1 s:c:c:n S:C:C:N 475 1 [#16]-,:[#6]=,:[#7]-,:[#6] S-C=N-C 476 1 [#6]-,:[#8]-,:[#6]=,:[#6] C-O-C=C 477 1 [#7]-,:[#7]-,:c:c N-N-C:C 478 1 [#16]-,:[#6]=,:[#7;!H0] S-C=N-[#1] 479 1 [#16]-,:[#6]-,:[#16]-,:[#6] S-C-S-C 480 1 c:s:c-,:[#6] C:S:C-C 481 1 [#8]-,:[#16]-,:c:c O-S-C:C 482 1 c:n-,:c:c C:N-C:C 483 1 [#7]-,:[#16]-,:c:c N-S-C:C 484 1 [#7]-,:c:n:c N-C:N:C 485 1 n:c:c:n N:C:C:N 486 1 [#7]-,:c:n:n N-C:N:N 487 1 [#7]-,:[#6]=,:[#7]-,:[#6] N-C=N-C 488 1 [#7]-,:[#6]=,:[#7;!H0] N-C=N-[#1] 489 1 [#7]-,:[#6]-,:[#16]-,:[#6] N-C-S-C 490 1 [#6]-,:[#6]-,:[#6]=,:[#6] C-C-C=C 491 1 [#6]-,:n:[c;!H0] C-N:C-[#1] 492 1 [#7]-,:c:o:c N-C:O:C 493 1 [#8]=,:[#6]-,:c:c O=C-C:C 494 1 [#8]=,:[#6]-,:c:n O=C-C:N 495 1 [#6]-,:[#7]-,:c:c C-N-C:C 496 1 n:n-,:[#6;!H0] N:N-C-[#1] 497 1 [#8]-,:c:c:n O-C:C:N 498 1 [#8]-,:[#6]=,:[#6]-,:[#6] O-C=C-C 499 1 [#7]-,:c:c:n N-C:C:N 500 1 [#6]-,:[#16]-,:c:c C-S-C:C 501 1 Cl-,:c:c-,:[#6] Cl-C:C-C 502 1 [#7]-,:[#6]=,:[#6;!H0] N-C=C-[#1] 503 1 Cl-,:c:[c;!H0] Cl-C:C-[#1] 504 1 n:c:n-,:[#6] N:C:N-C 505 1 Cl-,:c:c-,:[#8] Cl-C:C-O 506 1 [#6]-,:c:n:c C-C:N:C 507 1 [#6]-,:[#6]-,:[#16]-,:[#6] C-C-S-C 508 1 [#16]=,:[#6]-,:[#7]-,:[#6] S=C-N-C 509 1 Br-,:c:c-,:[#6] Br-C:C-C 510 1 [#7;!H0]-,:[#7;!H0] [#1]-N-N-[#1] 511 1 [#16]=,:[#6]-,:[#7;!H0] S=C-N-[#1] 512 1 [#6]-,:[#33]-,:[#8;!H0] C-[As]-O-[#1] 513 1 s:c:[c;!H0] S:C:C-[#1] 514 1 [#8]-,:[#7]-,:[#6]-,:[#6] O-N-C-C 515 1 [#7]-,:[#7]-,:[#6]-,:[#6] N-N-C-C 516 1 [#6;!H0]=,:[#6;!H0] [#1]-C=C-[#1] 517 1 [#7]-,:[#7]-,:[#6]-,:[#7] N-N-C-N 518 1 
[#8]=,:[#6]-,:[#7]-,:[#7] O=C-N-N 519 1 [#7]=,:[#6]-,:[#7]-,:[#6] N=C-N-C 520 1 [#6]=,:[#6]-,:c:c C=C-C:C 521 1 c:n-,:[#6;!H0] C:N-C-[#1] 522 1 [#6]-,:[#7]-,:[#7;!H0] C-N-N-[#1] 523 1 n:c:c-,:[#6] N:C:C-C 524 1 [#6]-,:[#6]=,:[#6]-,:[#6] C-C=C-C 525 1 [#33]-,:c:[c;!H0] [As]-C:C-[#1] 526 1 Cl-,:c:c-,:Cl Cl-C:C-Cl 527 1 c:c:[n;!H0] C:C:N-[#1] 528 1 [#7;!H0]-,:[#6;!H0] [#1]-N-C-[#1] 529 1 Cl-,:[#6]-,:[#6]-,:Cl Cl-C-C-Cl 530 1 n:c-,:c:c N:C-C:C 531 1 [#16]-,:c:c-,:[#6] S-C:C-C 532 1 [#16]-,:c:[c;!H0] S-C:C-[#1] 533 1 [#16]-,:c:c-,:[#7] S-C:C-N 534 1 [#16]-,:c:c-,:[#8] S-C:C-O 535 1 [#8]=,:[#6]-,:[#6]-,:[#6] O=C-C-C 536 1 [#8]=,:[#6]-,:[#6]-,:[#7] O=C-C-N 537 1 [#8]=,:[#6]-,:[#6]-,:[#8] O=C-C-O 538 1 [#7]=,:[#6]-,:[#6]-,:[#6] N=C-C-C 539 1 [#7]=,:[#6]-,:[#6;!H0] N=C-C-[#1] 540 1 [#6]-,:[#7]-,:[#6;!H0] C-N-C-[#1] 541 1 [#8]-,:c:c-,:[#6] O-C:C-C 542 1 [#8]-,:c:[c;!H0] O-C:C-[#1] 543 1 [#8]-,:c:c-,:[#7] O-C:C-N 544 1 [#8]-,:c:c-,:[#8] O-C:C-O 545 1 [#7]-,:c:c-,:[#6] N-C:C-C 546 1 [#7]-,:c:[c;!H0] N-C:C-[#1] 547 1 [#7]-,:c:c-,:[#7] N-C:C-N 548 1 [#8]-,:[#6]-,:c:c O-C-C:C 549 1 [#7]-,:[#6]-,:c:c N-C-C:C 550 1 Cl-,:[#6]-,:[#6]-,:[#6] Cl-C-C-C 551 1 Cl-,:[#6]-,:[#6]-,:[#8] Cl-C-C-O 552 1 c:c-,:c:c C:C-C:C 553 1 [#8]=,:[#6]-,:[#6]=,:[#6] O=C-C=C 554 1 Br-,:[#6]-,:[#6]-,:[#6] Br-C-C-C 555 1 [#7]=,:[#6]-,:[#6]=,:[#6] N=C-C=C 556 1 [#6]=,:[#6]-,:[#6]-,:[#6] C=C-C-C 557 1 n:c-,:[#8;!H0] N:C-O-[#1] 558 1 [#8]=,:[#7]-,:c:c O=N-C:C 559 1 [#8]-,:[#6]-,:[#7;!H0] O-C-N-[#1] 560 1 [#7]-,:[#6]-,:[#7]-,:[#6] N-C-N-C 561 1 Cl-,:[#6]-,:[#6]=,:[#8] Cl-C-C=O 562 1 Br-,:[#6]-,:[#6]=,:[#8] Br-C-C=O 563 1 [#8]-,:[#6]-,:[#8]-,:[#6] O-C-O-C 564 1 [#6]=,:[#6]-,:[#6]=,:[#6] C=C-C=C 565 1 c:c-,:[#8]-,:[#6] C:C-O-C 566 1 [#8]-,:[#6]-,:[#6]-,:[#7] O-C-C-N 567 1 [#8]-,:[#6]-,:[#6]-,:[#8] O-C-C-O 568 1 N#[#6]-,:[#6]-,:[#6] N#C-C-C 569 1 [#7]-,:[#6]-,:[#6]-,:[#7] N-C-C-N 570 1 c:c-,:[#6]-,:[#6] C:C-C-C 571 1 [#6;!H0]-,:[#8;!H0] [#1]-C-O-[#1] 572 1 n:c:n:c N:C:N:C 573 1 [#8]-,:[#6]-,:[#6]=,:[#6] O-C-C=C 574 1 
[#8]-,:[#6]-,:c:c-,:[#6] O-C-C:C-C 575 1 [#8]-,:[#6]-,:c:c-,:[#8] O-C-C:C-O 576 1 [#7]=,:[#6]-,:c:[c;!H0] N=C-C:C-[#1] 577 1 c:c-,:[#7]-,:c:c C:C-N-C:C 578 1 [#6]-,:c:c-,:c:c C-C:C-C:C 579 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6] O=C-C-C-C 580 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#7] O=C-C-C-N 581 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#8] O=C-C-C-O 582 1 [#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] C-C-C-C-C 583 1 Cl-,:c:c-,:[#8]-,:[#6] Cl-C:C-O-C 584 1 c:c-,:[#6]=,:[#6]-,:[#6] C:C-C=C-C 585 1 [#6]-,:c:c-,:[#7]-,:[#6] C-C:C-N-C 586 1 [#6]-,:[#16]-,:[#6]-,:[#6]-,:[#6] C-S-C-C-C 587 1 [#7]-,:c:c-,:[#8;!H0] N-C:C-O-[#1] 588 1 [#8]=,:[#6]-,:[#6]-,:[#6]=,:[#8] O=C-C-C=O 589 1 [#6]-,:c:c-,:[#8]-,:[#6] C-C:C-O-C 590 1 [#6]-,:c:c-,:[#8;!H0] C-C:C-O-[#1] 591 1 Cl-,:[#6]-,:[#6]-,:[#6]-,:[#6] Cl-C-C-C-C 592 1 [#7]-,:[#6]-,:[#6]-,:[#6]-,:[#6] N-C-C-C-C 593 1 [#7]-,:[#6]-,:[#6]-,:[#6]-,:[#7] N-C-C-C-N 594 1 [#6]-,:[#8]-,:[#6]-,:[#6]=,:[#6] C-O-C-C=C 595 1 c:c-,:[#6]-,:[#6]-,:[#6] C:C-C-C-C 596 1 [#7]=,:[#6]-,:[#7]-,:[#6]-,:[#6] N=C-N-C-C 597 1 [#8]=,:[#6]-,:[#6]-,:c:c O=C-C-C:C 598 1 Cl-,:c:c:c-,:[#6] Cl-C:C:C-C 599 1 [#6;!H0]-,:[#6]=,:[#6;!H0] [#1]-C-C=C-[#1] 600 1 [#7]-,:c:c:c-,:[#6] N-C:C:C-C 601 1 [#7]-,:c:c:c-,:[#7] N-C:C:C-N 602 1 [#8]=,:[#6]-,:[#6]-,:[#7]-,:[#6] O=C-C-N-C 603 1 [#6]-,:c:c:c-,:[#6] C-C:C:C-C 604 1 [#6]-,:[#8]-,:[#6]-,:c:c C-O-C-C:C 605 1 [#8]=,:[#6]-,:[#6]-,:[#8]-,:[#6] O=C-C-O-C 606 1 [#8]-,:c:c-,:[#6]-,:[#6] O-C:C-C-C 607 1 [#7]-,:[#6]-,:[#6]-,:c:c N-C-C-C:C 608 1 [#6]-,:[#6]-,:[#6]-,:c:c C-C-C-C:C 609 1 Cl-,:[#6]-,:[#6]-,:[#7]-,:[#6] Cl-C-C-N-C 610 1 [#6]-,:[#8]-,:[#6]-,:[#8]-,:[#6] C-O-C-O-C 611 1 [#7]-,:[#6]-,:[#6]-,:[#7]-,:[#6] N-C-C-N-C 612 1 [#7]-,:[#6]-,:[#8]-,:[#6]-,:[#6] N-C-O-C-C 613 1 [#6]-,:[#7]-,:[#6]-,:[#6]-,:[#6] C-N-C-C-C 614 1 [#6]-,:[#6]-,:[#8]-,:[#6]-,:[#6] C-C-O-C-C 615 1 [#7]-,:[#6]-,:[#6]-,:[#8]-,:[#6] N-C-C-O-C 616 1 c:c:n:n:c C:C:N:N:C 617 1 [#6]-,:[#6]-,:[#6]-,:[#8;!H0] C-C-C-O-[#1] 618 1 c:c-,:[#6]-,:c:c C:C-C-C:C 619 1 [#8]-,:[#6]-,:[#6]=,:[#6]-,:[#6] 
O-C-C=C-C 620 1 c:c-,:[#8]-,:[#6]-,:[#6] C:C-O-C-C 621 1 [#7]-,:c:c:c:n N-C:C:C:N 622 1 [#8]=,:[#6]-,:[#8]-,:c:c O=C-O-C:C 623 1 [#8]=,:[#6]-,:c:c-,:[#6] O=C-C:C-C 624 1 [#8]=,:[#6]-,:c:c-,:[#7] O=C-C:C-N 625 1 [#8]=,:[#6]-,:c:c-,:[#8] O=C-C:C-O 626 1 [#6]-,:[#8]-,:c:c-,:[#6] C-O-C:C-C 627 1 [#8]=,:[#33]-,:c:c:c O=[As]-C:C:C 628 1 [#6]-,:[#7]-,:[#6]-,:c:c C-N-C-C:C 629 1 [#16]-,:c:c:c-,:[#7] S-C:C:C-N 630 1 [#8]-,:c:c-,:[#8]-,:[#6] O-C:C-O-C 631 1 [#8]-,:c:c-,:[#8;!H0] O-C:C-O-[#1] 632 1 [#6]-,:[#6]-,:[#8]-,:c:c C-C-O-C:C 633 1 [#7]-,:[#6]-,:c:c-,:[#6] N-C-C:C-C 634 1 [#6]-,:[#6]-,:c:c-,:[#6] C-C-C:C-C 635 1 [#7]-,:[#7]-,:[#6]-,:[#7;!H0] N-N-C-N-[#1] 636 1 [#6]-,:[#7]-,:[#6]-,:[#7]-,:[#6] C-N-C-N-C 637 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O-C-C-C-C 638 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#7] O-C-C-C-N 639 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#8] O-C-C-C-O 640 1 [#6]=,:[#6]-,:[#6]-,:[#6]-,:[#6] C=C-C-C-C 641 1 [#8]-,:[#6]-,:[#6]-,:[#6]=,:[#6] O-C-C-C=C 642 1 [#8]-,:[#6]-,:[#6]-,:[#6]=,:[#8] O-C-C-C=O 643 1 [#6;!H0]-,:[#6]-,:[#7;!H0] [#1]-C-C-N-[#1] 644 1 [#6]-,:[#6]=,:[#7]-,:[#7]-,:[#6] C-C=N-N-C 645 1 [#8]=,:[#6]-,:[#7]-,:[#6]-,:[#6] O=C-N-C-C 646 1 [#8]=,:[#6]-,:[#7]-,:[#6;!H0] O=C-N-C-[#1] 647 1 [#8]=,:[#6]-,:[#7]-,:[#6]-,:[#7] O=C-N-C-N 648 1 [#8]=,:[#7]-,:c:c-,:[#7] O=N-C:C-N 649 1 [#8]=,:[#7]-,:c:c-,:[#8] O=N-C:C-O 650 1 [#8]=,:[#6]-,:[#7]-,:[#6]=,:[#8] O=C-N-C=O 651 1 [#8]-,:c:c:c-,:[#6] O-C:C:C-C 652 1 [#8]-,:c:c:c-,:[#7] O-C:C:C-N 653 1 [#8]-,:c:c:c-,:[#8] O-C:C:C-O 654 1 [#7]-,:[#6]-,:[#7]-,:[#6]-,:[#6] N-C-N-C-C 655 1 [#8]-,:[#6]-,:[#6]-,:c:c O-C-C-C:C 656 1 [#6]-,:[#6]-,:[#7]-,:[#6]-,:[#6] C-C-N-C-C 657 1 [#6]-,:[#7]-,:c:c-,:[#6] C-N-C:C-C 658 1 [#6]-,:[#6]-,:[#16]-,:[#6]-,:[#6] C-C-S-C-C 659 1 [#8]-,:[#6]-,:[#6]-,:[#7]-,:[#6] O-C-C-N-C 660 1 [#6]-,:[#6]=,:[#6]-,:[#6]-,:[#6] C-C=C-C-C 661 1 [#8]-,:[#6]-,:[#8]-,:[#6]-,:[#6] O-C-O-C-C 662 1 [#8]-,:[#6]-,:[#6]-,:[#8]-,:[#6] O-C-C-O-C 663 1 [#8]-,:[#6]-,:[#6]-,:[#8;!H0] O-C-C-O-[#1] 664 1 [#6]-,:[#6]=,:[#6]-,:[#6]=,:[#6] 
C-C=C-C=C 665 1 [#7]-,:c:c-,:[#6]-,:[#6] N-C:C-C-C 666 1 [#6]=,:[#6]-,:[#6]-,:[#8]-,:[#6] C=C-C-O-C 667 1 [#6]=,:[#6]-,:[#6]-,:[#8;!H0] C=C-C-O-[#1] 668 1 [#6]-,:c:c-,:[#6]-,:[#6] C-C:C-C-C 669 1 Cl-,:c:c-,:[#6]=,:[#8] Cl-C:C-C=O 670 1 Br-,:c:c:c-,:[#6] Br-C:C:C-C 671 1 [#8]=,:[#6]-,:[#6]=,:[#6]-,:[#6] O=C-C=C-C 672 1 [#8]=,:[#6]-,:[#6]=,:[#6;!H0] O=C-C=C-[#1] 673 1 [#8]=,:[#6]-,:[#6]=,:[#6]-,:[#7] O=C-C=C-N 674 1 [#7]-,:[#6]-,:[#7]-,:c:c N-C-N-C:C 675 1 Br-,:[#6]-,:[#6]-,:c:c Br-C-C-C:C 676 1 N#[#6]-,:[#6]-,:[#6]-,:[#6] N#C-C-C-C 677 1 [#6]-,:[#6]=,:[#6]-,:c:c C-C=C-C:C 678 1 [#6]-,:[#6]-,:[#6]=,:[#6]-,:[#6] C-C-C=C-C 679 1 [#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] C-C-C-C-C-C 680 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O-C-C-C-C-C 681 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#8] O-C-C-C-C-O 682 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#7] O-C-C-C-C-N 683 1 [#7]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] N-C-C-C-C-C 684 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O=C-C-C-C-C 685 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#7] O=C-C-C-C-N 686 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#8] O=C-C-C-C-O 687 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]=,:[#8] O=C-C-C-C=O 688 1 [#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] C-C-C-C-C-C-C 689 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O-C-C-C-C-C-C 690 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#8] O-C-C-C-C-C-O 691 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#7] O-C-C-C-C-C-N 692 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O=C-C-C-C-C-C 693 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#8] O=C-C-C-C-C-O 694 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]=,:[#8] O=C-C-C-C-C=O 695 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#7] O=C-C-C-C-C-N 696 1 [#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] C-C-C-C-C-C-C-C 697 1 [#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#6])-,:[#6] C-C-C-C-C-C(C)-C 698 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O-C-C-C-C-C-C-C 699 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#6])-,:[#6] O-C-C-C-C-C(C)-C 700 
1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#8]-,:[#6] O-C-C-C-C-C-O-C 701 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#8])-,:[#6] O-C-C-C-C-C(O)-C 702 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#7]-,:[#6] O-C-C-C-C-C-N-C 703 1 [#8]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#7])-,:[#6] O-C-C-C-C-C(N)-C 704 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6] O=C-C-C-C-C-C-C 705 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#8])-,:[#6] O=C-C-C-C-C(O)-C 706 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](=,:[#8])-,:[#6] O=C-C-C-C-C(=O)-C 707 1 [#8]=,:[#6]-,:[#6]-,:[#6]-,:[#6]-,:[#6](-,:[#7])-,:[#6] O=C-C-C-C-C(N)-C 708 1 [#6]-,:[#6](-,:[#6])-,:[#6]-,:[#6] C-C(C)-C-C 709 1 [#6]-,:[#6](-,:[#6])-,:[#6]-,:[#6]-,:[#6] C-C(C)-C-C-C 710 1 [#6]-,:[#6]-,:[#6](-,:[#6])-,:[#6]-,:[#6] C-C-C(C)-C-C 711 1 [#6]-,:[#6](-,:[#6])(-,:[#6])-,:[#6]-,:[#6] C-C(C)(C)-C-C 712 1 [#6]-,:[#6](-,:[#6])-,:[#6](-,:[#6])-,:[#6] C-C(C)-C(C)-C # Section 7: Complex SMARTS patterns - These bits test for the # presence of complex SMARTS patterns, regardless of count, but where # bond orders and bond aromaticity are specific. 
713 1 [#6]c1ccc([#6])cc1 Cc1ccc(C)cc1 714 1 [#6]c1ccc([#8])cc1 Cc1ccc(O)cc1 715 1 [#6]c1ccc([#16])cc1 Cc1ccc(S)cc1 716 1 [#6]c1ccc([#7])cc1 Cc1ccc(N)cc1 717 1 [#6]c1ccc(Cl)cc1 Cc1ccc(Cl)cc1 718 1 [#6]c1ccc(Br)cc1 Cc1ccc(Br)cc1 719 1 [#8]c1ccc([#8])cc1 Oc1ccc(O)cc1 720 1 [#8]c1ccc([#16])cc1 Oc1ccc(S)cc1 721 1 [#8]c1ccc([#7])cc1 Oc1ccc(N)cc1 722 1 [#8]c1ccc(Cl)cc1 Oc1ccc(Cl)cc1 723 1 [#8]c1ccc(Br)cc1 Oc1ccc(Br)cc1 724 1 [#16]c1ccc([#16])cc1 Sc1ccc(S)cc1 725 1 [#16]c1ccc([#7])cc1 Sc1ccc(N)cc1 726 1 [#16]c1ccc(Cl)cc1 Sc1ccc(Cl)cc1 727 1 [#16]c1ccc(Br)cc1 Sc1ccc(Br)cc1 728 1 [#7]c1ccc([#7])cc1 Nc1ccc(N)cc1 729 1 [#7]c1ccc(Cl)cc1 Nc1ccc(Cl)cc1 730 1 [#7]c1ccc(Br)cc1 Nc1ccc(Br)cc1 731 1 Clc1ccc(Cl)cc1 Clc1ccc(Cl)cc1 732 1 Clc1ccc(Br)cc1 Clc1ccc(Br)cc1 733 1 Brc1ccc(Br)cc1 Brc1ccc(Br)cc1 734 1 [#6]c1cc([#6])ccc1 Cc1cc(C)ccc1 735 1 [#6]c1cc([#8])ccc1 Cc1cc(O)ccc1 736 1 [#6]c1cc([#16])ccc1 Cc1cc(S)ccc1 737 1 [#6]c1cc([#7])ccc1 Cc1cc(N)ccc1 738 1 [#6]c1cc(Cl)ccc1 Cc1cc(Cl)ccc1 739 1 [#6]c1cc(Br)ccc1 Cc1cc(Br)ccc1 740 1 [#8]c1cc([#8])ccc1 Oc1cc(O)ccc1 741 1 [#8]c1cc([#16])ccc1 Oc1cc(S)ccc1 742 1 [#8]c1cc([#7])ccc1 Oc1cc(N)ccc1 743 1 [#8]c1cc(Cl)ccc1 Oc1cc(Cl)ccc1 744 1 [#8]c1cc(Br)ccc1 Oc1cc(Br)ccc1 745 1 [#16]c1cc([#16])ccc1 Sc1cc(S)ccc1 746 1 [#16]c1cc([#7])ccc1 Sc1cc(N)ccc1 747 1 [#16]c1cc(Cl)ccc1 Sc1cc(Cl)ccc1 748 1 [#16]c1cc(Br)ccc1 Sc1cc(Br)ccc1 749 1 [#7]c1cc([#7])ccc1 Nc1cc(N)ccc1 750 1 [#7]c1cc(Cl)ccc1 Nc1cc(Cl)ccc1 751 1 [#7]c1cc(Br)ccc1 Nc1cc(Br)ccc1 752 1 Clc1cc(Cl)ccc1 Clc1cc(Cl)ccc1 753 1 Clc1cc(Br)ccc1 Clc1cc(Br)ccc1 754 1 Brc1cc(Br)ccc1 Brc1cc(Br)ccc1 755 1 [#6]c1c([#6])cccc1 Cc1c(C)cccc1 756 1 [#6]c1c([#8])cccc1 Cc1c(O)cccc1 757 1 [#6]c1c([#16])cccc1 Cc1c(S)cccc1 758 1 [#6]c1c([#7])cccc1 Cc1c(N)cccc1 759 1 [#6]c1c(Cl)cccc1 Cc1c(Cl)cccc1 760 1 [#6]c1c(Br)cccc1 Cc1c(Br)cccc1 761 1 [#8]c1c([#8])cccc1 Oc1c(O)cccc1 762 1 [#8]c1c([#16])cccc1 Oc1c(S)cccc1 763 1 [#8]c1c([#7])cccc1 Oc1c(N)cccc1 764 1 [#8]c1c(Cl)cccc1 Oc1c(Cl)cccc1 765 1 [#8]c1c(Br)cccc1 Oc1c(Br)cccc1 
766 1 [#16]c1c([#16])cccc1 Sc1c(S)cccc1 767 1 [#16]c1c([#7])cccc1 Sc1c(N)cccc1 768 1 [#16]c1c(Cl)cccc1 Sc1c(Cl)cccc1 769 1 [#16]c1c(Br)cccc1 Sc1c(Br)cccc1 770 1 [#7]c1c([#7])cccc1 Nc1c(N)cccc1 771 1 [#7]c1c(Cl)cccc1 Nc1c(Cl)cccc1 772 1 [#7]c1c(Br)cccc1 Nc1c(Br)cccc1 773 1 Clc1c(Cl)cccc1 Clc1c(Cl)cccc1 774 1 Clc1c(Br)cccc1 Clc1c(Br)cccc1 775 1 Brc1c(Br)cccc1 Brc1c(Br)cccc1 776 1 [#6][#6]1[#6][#6][#6]([#6])[#6][#6]1 CC1CCC(C)CC1 777 1 [#6][#6]1[#6][#6][#6]([#8])[#6][#6]1 CC1CCC(O)CC1 778 1 [#6][#6]1[#6][#6][#6]([#16])[#6][#6]1 CC1CCC(S)CC1 779 1 [#6][#6]1[#6][#6][#6]([#7])[#6][#6]1 CC1CCC(N)CC1 780 1 [#6][#6]1[#6][#6][#6](Cl)[#6][#6]1 CC1CCC(Cl)CC1 781 1 [#6][#6]1[#6][#6][#6](Br)[#6][#6]1 CC1CCC(Br)CC1 782 1 [#8][#6]1[#6][#6][#6]([#8])[#6][#6]1 OC1CCC(O)CC1 783 1 [#8][#6]1[#6][#6][#6]([#16])[#6][#6]1 OC1CCC(S)CC1 784 1 [#8][#6]1[#6][#6][#6]([#7])[#6][#6]1 OC1CCC(N)CC1 785 1 [#8][#6]1[#6][#6][#6](Cl)[#6][#6]1 OC1CCC(Cl)CC1 786 1 [#8][#6]1[#6][#6][#6](Br)[#6][#6]1 OC1CCC(Br)CC1 787 1 [#16][#6]1[#6][#6][#6]([#16])[#6][#6]1 SC1CCC(S)CC1 788 1 [#16][#6]1[#6][#6][#6]([#7])[#6][#6]1 SC1CCC(N)CC1 789 1 [#16][#6]1[#6][#6][#6](Cl)[#6][#6]1 SC1CCC(Cl)CC1 790 1 [#16][#6]1[#6][#6][#6](Br)[#6][#6]1 SC1CCC(Br)CC1 791 1 [#7][#6]1[#6][#6][#6]([#7])[#6][#6]1 NC1CCC(N)CC1 792 1 [#7][#6]1[#6][#6][#6](Cl)[#6][#6]1 NC1CCC(Cl)CC1 793 1 [#7][#6]1[#6][#6][#6](Br)[#6][#6]1 NC1CCC(Br)CC1 794 1 Cl[#6]1[#6][#6][#6](Cl)[#6][#6]1 ClC1CCC(Cl)CC1 795 1 Cl[#6]1[#6][#6][#6](Br)[#6][#6]1 ClC1CCC(Br)CC1 796 1 Br[#6]1[#6][#6][#6](Br)[#6][#6]1 BrC1CCC(Br)CC1 797 1 [#6][#6]1[#6][#6]([#6])[#6][#6][#6]1 CC1CC(C)CCC1 798 1 [#6][#6]1[#6][#6]([#8])[#6][#6][#6]1 CC1CC(O)CCC1 799 1 [#6][#6]1[#6][#6]([#16])[#6][#6][#6]1 CC1CC(S)CCC1 800 1 [#6][#6]1[#6][#6]([#7])[#6][#6][#6]1 CC1CC(N)CCC1 801 1 [#6][#6]1[#6][#6](Cl)[#6][#6][#6]1 CC1CC(Cl)CCC1 802 1 [#6][#6]1[#6][#6](Br)[#6][#6][#6]1 CC1CC(Br)CCC1 803 1 [#8][#6]1[#6][#6]([#8])[#6][#6][#6]1 OC1CC(O)CCC1 804 1 [#8][#6]1[#6][#6]([#16])[#6][#6][#6]1 OC1CC(S)CCC1 805 1 
[#8][#6]1[#6][#6]([#7])[#6][#6][#6]1 OC1CC(N)CCC1 806 1 [#8][#6]1[#6][#6](Cl)[#6][#6][#6]1 OC1CC(Cl)CCC1 807 1 [#8][#6]1[#6][#6](Br)[#6][#6][#6]1 OC1CC(Br)CCC1 808 1 [#16][#6]1[#6][#6]([#16])[#6][#6][#6]1 SC1CC(S)CCC1 809 1 [#16][#6]1[#6][#6]([#7])[#6][#6][#6]1 SC1CC(N)CCC1 810 1 [#16][#6]1[#6][#6](Cl)[#6][#6][#6]1 SC1CC(Cl)CCC1 811 1 [#16][#6]1[#6][#6](Br)[#6][#6][#6]1 SC1CC(Br)CCC1 812 1 [#7][#6]1[#6][#6]([#7])[#6][#6][#6]1 NC1CC(N)CCC1 813 1 [#7][#6]1[#6][#6](Cl)[#6][#6][#6]1 NC1CC(Cl)CCC1 814 1 [#7][#6]1[#6][#6](Br)[#6][#6][#6]1 NC1CC(Br)CCC1 815 1 Cl[#6]1[#6][#6](Cl)[#6][#6][#6]1 ClC1CC(Cl)CCC1 816 1 Cl[#6]1[#6][#6](Br)[#6][#6][#6]1 ClC1CC(Br)CCC1 817 1 Br[#6]1[#6][#6](Br)[#6][#6][#6]1 BrC1CC(Br)CCC1 818 1 [#6][#6]1[#6]([#6])[#6][#6][#6][#6]1 CC1C(C)CCCC1 819 1 [#6][#6]1[#6]([#8])[#6][#6][#6][#6]1 CC1C(O)CCCC1 820 1 [#6][#6]1[#6]([#16])[#6][#6][#6][#6]1 CC1C(S)CCCC1 821 1 [#6][#6]1[#6]([#7])[#6][#6][#6][#6]1 CC1C(N)CCCC1 822 1 [#6][#6]1[#6](Cl)[#6][#6][#6][#6]1 CC1C(Cl)CCCC1 823 1 [#6][#6]1[#6](Br)[#6][#6][#6][#6]1 CC1C(Br)CCCC1 824 1 [#8][#6]1[#6]([#8])[#6][#6][#6][#6]1 OC1C(O)CCCC1 825 1 [#8][#6]1[#6]([#16])[#6][#6][#6][#6]1 OC1C(S)CCCC1 826 1 [#8][#6]1[#6]([#7])[#6][#6][#6][#6]1 OC1C(N)CCCC1 827 1 [#8][#6]1[#6](Cl)[#6][#6][#6][#6]1 OC1C(Cl)CCCC1 828 1 [#8][#6]1[#6](Br)[#6][#6][#6][#6]1 OC1C(Br)CCCC1 829 1 [#16][#6]1[#6]([#16])[#6][#6][#6][#6]1 SC1C(S)CCCC1 830 1 [#16][#6]1[#6]([#7])[#6][#6][#6][#6]1 SC1C(N)CCCC1 831 1 [#16][#6]1[#6](Cl)[#6][#6][#6][#6]1 SC1C(Cl)CCCC1 832 1 [#16][#6]1[#6](Br)[#6][#6][#6][#6]1 SC1C(Br)CCCC1 833 1 [#7][#6]1[#6]([#7])[#6][#6][#6][#6]1 NC1C(N)CCCC1 834 1 [#7][#6]1[#6](Cl)[#6][#6][#6][#6]1 NC1C(Cl)CCCC1 835 1 [#7][#6]1[#6](Br)[#6][#6][#6][#6]1 NC1C(Br)CCCC1 836 1 Cl[#6]1[#6](Cl)[#6][#6][#6][#6]1 ClC1C(Cl)CCCC1 837 1 Cl[#6]1[#6](Br)[#6][#6][#6][#6]1 ClC1C(Br)CCCC1 838 1 Br[#6]1[#6](Br)[#6][#6][#6][#6]1 BrC1C(Br)CCCC1 839 1 [#6][#6]1[#6][#6]([#6])[#6][#6]1 CC1CC(C)CC1 840 1 [#6][#6]1[#6][#6]([#8])[#6][#6]1 CC1CC(O)CC1 841 1 
[#6][#6]1[#6][#6]([#16])[#6][#6]1 CC1CC(S)CC1 842 1 [#6][#6]1[#6][#6]([#7])[#6][#6]1 CC1CC(N)CC1 843 1 [#6][#6]1[#6][#6](Cl)[#6][#6]1 CC1CC(Cl)CC1 844 1 [#6][#6]1[#6][#6](Br)[#6][#6]1 CC1CC(Br)CC1 845 1 [#8][#6]1[#6][#6]([#8])[#6][#6]1 OC1CC(O)CC1 846 1 [#8][#6]1[#6][#6]([#16])[#6][#6]1 OC1CC(S)CC1 847 1 [#8][#6]1[#6][#6]([#7])[#6][#6]1 OC1CC(N)CC1 848 1 [#8][#6]1[#6][#6](Cl)[#6][#6]1 OC1CC(Cl)CC1 849 1 [#8][#6]1[#6][#6](Br)[#6][#6]1 OC1CC(Br)CC1 850 1 [#16][#6]1[#6][#6]([#16])[#6][#6]1 SC1CC(S)CC1 851 1 [#16][#6]1[#6][#6]([#7])[#6][#6]1 SC1CC(N)CC1 852 1 [#16][#6]1[#6][#6](Cl)[#6][#6]1 SC1CC(Cl)CC1 853 1 [#16][#6]1[#6][#6](Br)[#6][#6]1 SC1CC(Br)CC1 854 1 [#7][#6]1[#6][#6]([#7])[#6][#6]1 NC1CC(N)CC1 855 1 [#7][#6]1[#6][#6](Cl)[#6][#6]1 NC1CC(Cl)CC1 856 1 [#7][#6]1[#6][#6](Br)[#6][#6]1 NC1CC(Br)CC1 857 1 Cl[#6]1[#6][#6](Cl)[#6][#6]1 ClC1CC(Cl)CC1 858 1 Cl[#6]1[#6][#6](Br)[#6][#6]1 ClC1CC(Br)CC1 859 1 Br[#6]1[#6][#6](Br)[#6][#6]1 BrC1CC(Br)CC1 860 1 [#6][#6]1[#6]([#6])[#6][#6][#6]1 CC1C(C)CCC1 861 1 [#6][#6]1[#6]([#8])[#6][#6][#6]1 CC1C(O)CCC1 862 1 [#6][#6]1[#6]([#16])[#6][#6][#6]1 CC1C(S)CCC1 863 1 [#6][#6]1[#6]([#7])[#6][#6][#6]1 CC1C(N)CCC1 864 1 [#6][#6]1[#6](Cl)[#6][#6][#6]1 CC1C(Cl)CCC1 865 1 [#6][#6]1[#6](Br)[#6][#6][#6]1 CC1C(Br)CCC1 866 1 [#8][#6]1[#6]([#8])[#6][#6][#6]1 OC1C(O)CCC1 867 1 [#8][#6]1[#6]([#16])[#6][#6][#6]1 OC1C(S)CCC1 868 1 [#8][#6]1[#6]([#7])[#6][#6][#6]1 OC1C(N)CCC1 869 1 [#8][#6]1[#6](Cl)[#6][#6][#6]1 OC1C(Cl)CCC1 870 1 [#8][#6]1[#6](Br)[#6][#6][#6]1 OC1C(Br)CCC1 871 1 [#16][#6]1[#6]([#16])[#6][#6][#6]1 SC1C(S)CCC1 872 1 [#16][#6]1[#6]([#7])[#6][#6][#6]1 SC1C(N)CCC1 873 1 [#16][#6]1[#6](Cl)[#6][#6][#6]1 SC1C(Cl)CCC1 874 1 [#16][#6]1[#6](Br)[#6][#6][#6]1 SC1C(Br)CCC1 875 1 [#7][#6]1[#6]([#7])[#6][#6][#6]1 NC1C(N)CCC1 876 1 [#7][#6]1[#6](Cl)[#6][#6]1 NC1C(Cl)CC1 877 1 [#7][#6]1[#6](Br)[#6][#6][#6]1 NC1C(Br)CCC1 878 1 Cl[#6]1[#6](Cl)[#6][#6][#6]1 ClC1C(Cl)CCC1 879 1 Cl[#6]1[#6](Br)[#6][#6][#6]1 ClC1C(Br)CCC1 880 1 
Br[#6]1[#6](Br)[#6][#6][#6]1 BrC1C(Br)CCC1
chemfp-1.1p1/chemfp/types.py0000644000077000000240000003622612101361745016246 0ustar dalkestaff00000000000000
from __future__ import absolute_import

# Information about fingerprint types
from . import argparse

from . import FingerprintIterator, Metadata
from . import io
from .encodings import import_decoder  # XXX too specific to the decoder module


def check_openbabel_maccs166():
    from .openbabel import HAS_MACCS, MACCS_VERSION
    assert HAS_MACCS
    if MACCS_VERSION == 1:
        return "OpenBabel-MACCS/1"
    elif MACCS_VERSION == 2:
        return "OpenBabel-MACCS/2"
    raise AssertionError

def check_openeye_maccs166():
    from .openeye import OEGRAPHSIM_API_VERSION
    return "OpenEye-MACCS166/"+OEGRAPHSIM_API_VERSION

def check_openeye_path():
    from .openeye import OEGRAPHSIM_API_VERSION
    return "OpenEye-Path/"+OEGRAPHSIM_API_VERSION

def check_rdkit_atom_pair():
    from .rdkit import ATOM_PAIR_VERSION
    if ATOM_PAIR_VERSION is None:
        ATOM_PAIR_VERSION = "2" # Nothing is supported; pretend to want v2
    return "RDKit-AtomPair/"+ATOM_PAIR_VERSION

def check_rdkit_torsion():
    from .rdkit import TORSION_VERSION
    return "RDKit-Torsion/"+TORSION_VERSION

### The chemfp fingerprint type API isn't powerful enough
# I have to list all of the possible fingerprint types, even if the
# platform doesn't support that specific type. There's also no support
# for toolkit vendors which carefully track the fingerprint
# version (like OEChem); I should be using their version information
# rather than doing it myself.

_family_config_paths = (
    ("OpenEye-MACCS166/1", "chemfp.openeye.OpenEyeMACCSFingerprintFamily_v1"),
    ("OpenEye-Path/1", "chemfp.openeye.OpenEyePathFingerprintFamily_v1"),

    ("OpenEye-MACCS166/2", "chemfp.openeye.OpenEyeMACCSFingerprintFamily_v2"),
    ("OpenEye-Path/2", "chemfp.openeye.OpenEyePathFingerprintFamily_v2"),
    ("OpenEye-Circular/2", "chemfp.openeye.OpenEyeCircularFingerprintFamily_v2"),
    ("OpenEye-Tree/2", "chemfp.openeye.OpenEyeTreeFingerprintFamily_v2"),

    ("RDKit-MACCS166/1", "chemfp.rdkit.RDKitMACCSFingerprintFamily_v1"),
    ("RDKit-Fingerprint/1", "chemfp.rdkit.RDKitFingerprintFamily_v1"),
    ("RDKit-Morgan/1", "chemfp.rdkit.RDKitMorganFingerprintFamily_v1"),
    ("RDKit-Torsion/1", "chemfp.rdkit.RDKitTorsionFingerprintFamily_v1"),
    ("RDKit-Torsion/2", "chemfp.rdkit.RDKitTorsionFingerprintFamily_v2"),
    ("RDKit-AtomPair/1", "chemfp.rdkit.RDKitAtomPairFingerprintFamily_v1"),
    ("RDKit-AtomPair/2", "chemfp.rdkit.RDKitAtomPairFingerprintFamily_v2"),

    ("OpenBabel-FP2/1", "chemfp.openbabel.OpenBabelFP2FingerprintFamily_v1"),
    ("OpenBabel-FP3/1", "chemfp.openbabel.OpenBabelFP3FingerprintFamily_v1"),
    ("OpenBabel-FP4/1", "chemfp.openbabel.OpenBabelFP4FingerprintFamily_v1"),
    ("OpenBabel-MACCS/1", "chemfp.openbabel.OpenBabelMACCSFingerprintFamily_v1"),
    ("OpenBabel-MACCS/2", "chemfp.openbabel.OpenBabelMACCSFingerprintFamily_v2"),

    ("Indigo-Similarity/1", "chemfp.indigo.IndigoSimilarityFingerprinter_v1"),
    ("Indigo-Substructure/1", "chemfp.indigo.IndigoSubstructureFingerprinter_v1"),
    ("Indigo-ResonanceSubstructure/1", "chemfp.indigo.IndigoResonanceSubstructureFingerprinter_v1"),
    ("Indigo-TautomerSubstructure/1", "chemfp.indigo.IndigoTautomerSubstructureFingerprinter_v1"),
    ("Indigo-Full/1", "chemfp.indigo.IndigoFullFingerprinter_v1"),

    # In the future this will likely change to use a parameterized class
    # which can dynamically load fingerprint definitions

    ("ChemFP-Substruct-OpenEye/1", "chemfp.openeye_patterns.SubstructOpenEyeFingerprinter_v1"),
    ("RDMACCS-OpenEye/1", "chemfp.openeye_patterns.RDMACCSOpenEyeFingerprinter_v1"),

    ("ChemFP-Substruct-RDKit/1", "chemfp.rdkit_patterns.SubstructRDKitFingerprinter_v1"),
    ("RDMACCS-RDKit/1", "chemfp.rdkit_patterns.RDMACCSRDKitFingerprinter_v1"),

    ("ChemFP-Substruct-OpenBabel/1", "chemfp.openbabel_patterns.SubstructOpenBabelFingerprinter_v1"),
    ("RDMACCS-OpenBabel/1", "chemfp.openbabel_patterns.RDMACCSOpenBabelFingerprinter_v1"),

    ("ChemFP-Substruct-Indigo/1", "chemfp.indigo_patterns.SubstructIndigoFingerprinter_v1"),
    ("RDMACCS-Indigo/1", "chemfp.indigo_patterns.RDMACCSIndigoFingerprinter_v1"),
)

_alternates = {
    "OpenBabel-MACCS": check_openbabel_maccs166,
    "OpenEye-MACCS166": check_openeye_maccs166,
    "OpenEye-Path": check_openeye_path,
    "RDKit-AtomPair": check_rdkit_atom_pair,
    "RDKit-Torsion": check_rdkit_torsion,
    }


def _initialize_families(config_paths):
    d = {}
    for name, path in config_paths:
        # Set both the versioned and non-versioned names
        # The paths must be in order from oldest to newest
        unversioned_name = name.split("/")[0]
        d[unversioned_name] = d[name] = path
    return d

# Convert into a dictionary, and include the unversioned names
_family_config_paths = _initialize_families(_family_config_paths)

_loaded_families = {}

# Return the fingerprint family given a name like
#   "OpenBabel-FP2" or "RDKit-Morgan/1"
def get_fingerprint_family(name):
    try:
        return _loaded_families[name]
    except KeyError:
        pass

    # Let's see if we can load it.

    # Is there a better name for this?
    try:
        func = _alternates[name]
    except KeyError:
        new_name = name
    else:
        new_name = func()

    try:
        path = _family_config_paths[new_name]
    except KeyError:
        raise ValueError("Unknown fingerprint family %r" % (name,))

    config = import_decoder(path)
    config.validate()
    family = FingerprintFamily(config)
    _loaded_families[name] = _loaded_families[new_name] = family
    return family

class FingerprintFamily(object):
    def __init__(self, config):
        self.config = config
        #name = config.name
        #format_string = config.format_string

    def __call__(self, **kwargs):
        return Fingerprinter(self.config, kwargs)

    def make_fingerprinter_from_type(self, type):
        terms = type.split()
        if not terms:
            raise ValueError("Empty fingerprint type (%r)" % (type,))

        required_args = self.config.get_args()

        kwargs = {}
        for term in terms[1:]:
            try:
                left, right = term.split("=")
            except ValueError:
                raise ValueError("Term %r in type %r must have one and only one '='" %
                                 (term, type))
            if left in kwargs:
                raise ValueError("Duplicate name %r in type %r" % (left, type))

            try:
                decoder = required_args[left].decoder
            except KeyError:
                raise ValueError("Unknown name %r in type %r" % (left, type))

            try:
                value = decoder(right)
            except ValueError, err:
                raise ValueError("Unable to parse %s value %r in type %r" % (
                    left, right, type))
            kwargs[left] = value

        # Fill in any missing default
        for name, arg in required_args.items():
            if name not in kwargs:
                kwargs[name] = arg.default

        # Let the configuration verify the kwargs
        verify_args = self.config.verify_args
        if verify_args is not None:
            verify_args(kwargs)

        return Fingerprinter(self.config, kwargs)

class Fingerprinter(object):
    def __init__(self, config, fingerprinter_kwargs):
        self.config = config
        if isinstance(config.num_bits, int):
            self.num_bits = config.num_bits
        elif config.num_bits is None:
            raise AssertionError(config.name)
        else:
            self.num_bits = config.num_bits(fingerprinter_kwargs)
        self.fingerprinter_kwargs = fingerprinter_kwargs

    def __eq__(self, other):
        return self.get_type() == other.get_type()

    def __ne__(self, other):
        return self.get_type() != other.get_type()

    def get_type(self):
        if self.config.format_string is None:
            assert not self.fingerprinter_kwargs, "kwargs but no format string!"
            return self.config.name
        encoded = self.config.format_string % self._encode_parameters()
        return self.config.name + " " + encoded

    def _encode_parameters(self):
        d = {}
        for k, v in self.fingerprinter_kwargs.items():
            encoder = self.config.args[k].encoder
            if encoder is None:
                d[k] = v
            else:
                d[k] = encoder(v)
        return d

    def read_structure_fingerprints(self, source, format=None, id_tag=None,
                                    errors="strict", metadata=None):
        source_filename = io.get_filename(source)
        if source_filename is None:
            sources = []
        else:
            sources = [source_filename]

        if metadata is None:
            # XXX I don't like how the user who wants to pass in aromaticity
            # information needs to create the full Metadata
            metadata = Metadata(num_bits=self.num_bits, type=self.get_type(),
                                software=self.config.software,
                                sources=sources)

        structure_reader = self.config.read_structures(metadata, source, format, id_tag, errors)
        fingerprinter = self.config.make_fingerprinter(**self.fingerprinter_kwargs)

        def fingerprint_reader(structure_reader, fingerprinter):
            for (id, mol) in structure_reader:
                yield id, fingerprinter(mol)
        reader = fingerprint_reader(structure_reader, fingerprinter)

        return FingerprintIterator(Metadata(num_bits = self.num_bits,
                                            sources = sources,
                                            software = self.config.software,
                                            type = self.get_type(),
                                            date = io.utcnow(),
                                            aromaticity = metadata.aromaticity),
                                   reader)

#    def describe(self, bitno):
#        if not (0 <= bitno < self.num_bits):
#            raise KeyError("Bit number %d is out of range" % (bitno,))
#
#        bit_description = self.config.bit_description
#        if bit_description is None:
#            return "(unknown)"
#        return bit_description[bitno]


## Some code to figure out what the format strings are inside of a string

class Dummy(object):
    def __str__(self):
        return ""
    def __int__(self):
        return 0

class GetArgs(object):
    def __init__(self, format_string):
        self.format_string = format_string
        self.args = []
    def __getitem__(self, name):
        # O(n**2) but n is small.
        if name in self.args:
            raise TypeError("Duplicate name %r in format string %r" %
                            (name, self.format_string))
        self.args.append(name)
        return Dummy()

def _get_arg_names(s):
    if not s:
        return []
    args = GetArgs(s)
    s % args
    return args.args

class FingerprintArgument(object):
    def __init__(self, name, decoder, encoder, kwargs):
        self.name = name
        self.decoder = decoder
        self.encoder = encoder
        self.default = kwargs["default"]
        self.kwargs = kwargs

# Can't use "first or second" because some of the 'first' arguments
# can be false values like {}.
def OR(first, second):
    if first is None:
        return second
    return first

class FingerprintFamilyConfig(object):
    def __init__(self,
                 name = None,
                 format_string = None,
                 software = None,
                 num_bits = None,
                 read_structures = None,
                 make_fingerprinter = None,
                 verify_args = None,
                 args = None,
                 ):
        self.name = name
        self.format_string = format_string
        self.software = software
        self.num_bits = num_bits
        self.read_structures = read_structures
        self.make_fingerprinter = make_fingerprinter
        if args is None:
            args = {}
        self.verify_args = verify_args
        self.args = args.copy() # This can contain extra args!
        self._exact_args = None

    def validate(self):
        pass

    def get_args(self):
        args = self._exact_args
        if args is None:
            result = {}
            for name in _get_arg_names(self.format_string):
                result[name] = self.args[name]
            args = self._exact_args = result
        return args

    def clone(self, name=None, format_string=None, software=None, num_bits=None,
              read_structures=None, make_fingerprinter=None, verify_args=None,
              args=None):
        return FingerprintFamilyConfig(
            name = OR(name, self.name),
            format_string = OR(format_string, self.format_string),
            software = OR(software, self.software),
            num_bits = OR(num_bits, self.num_bits),
            read_structures = OR(read_structures, self.read_structures),
            make_fingerprinter = OR(make_fingerprinter, self.make_fingerprinter),
            verify_args = OR(verify_args, self.verify_args),
            args = OR(args, self.args))

    def add_argument(self, name, decoder=None, encoder=None, default=None,
                     action=None, metavar=None, help=None):
        #if name in self.args:
        #    raise AssertionError("Argument %r already added" % (name,))
        if default is not None:
            if help is not None:
                help = "%s (default=%s)" % (help, default)

        def parse_argument(s):
            try:
                return decoder(s)
            except ValueError, err:
                raise argparse.ArgumentError(None, "%s %s" % (name, err))

        arg = FingerprintArgument(name, decoder, encoder,
                                  kwargs = dict(type=parse_argument,
                                                default=default,
                                                action=action,
                                                metavar=metavar,
                                                help=help))
        self.args[name] = arg

    def remove_argument(self, name):
        del self.args[name]

    def add_argument_to_argparse(self, name, parser):
        info = self.args[name]
        parser.add_argument("--" + info.name, **info.kwargs)


# Helper functions

def positive_int(s):
    # Don't do int(s) because that allows "+3" and " 3 ", which I don't want
    if not s.isdigit():
        raise ValueError("must be 1 or greater")
    i = int(s)
    if i == 0:
        raise ValueError("must be 1 or greater")
    return i

def nonnegative_int(s):
    if not s.isdigit():
        raise ValueError("must be 0 or greater")
    return int(s)

def zero_or_one(s):
    if s == "0":
        return 0
    if s == "1":
        return 1
    raise ValueError("must be 0 or a 1")


def parse_type(type):
    terms = type.split()
    if not terms:
        raise ValueError("Empty fingerprint type (%r)" % (type,))

    name = terms[0]
    family = get_fingerprint_family(name)
    return family.make_fingerprinter_from_type(type)
chemfp-1.1p1/chemfp/Watcher.py0000644000077000000240000000413211663300476016474 0ustar dalkestaff00000000000000
"""Helper code for multi-threaded Python programs

Quoting from
 http://code.activestate.com/recipes/496735-workaround-for-missed-sigint-in-multithreaded-prog/

Multithreaded Python programs often ignore the SIGINT generated by a
Keyboard Interrupt, especially if the thread that gets the signal is
waiting or sleeping. This module provides a workaround by forking a
child process that executes the rest of the program while the parent
process waits for signals and kills the child process.

How to use:

  from chemfp import Watcher

  def main():
    ...
    Watcher.Watcher()
    ... start multi-threaded code ...

  if __name__ == "__main__":
    main()

Created by Allen Downey and distributed under the PSF license for Python.
"""
import threading, time, os, signal, sys

class Watcher(object):
    """this class solves two problems with multithreaded
    programs in Python, (1) a signal might be delivered
    to any thread (which is just a malfeature) and (2) if
    the thread that gets the signal is waiting, the signal
    is ignored (which is a bug).

    The watcher is a concurrent process (not thread) that
    waits for a signal and the process that contains the
    threads. See Appendix A of The Little Book of Semaphores.
    http://greenteapress.com/semaphores/

    I have only tested this on Linux. I would expect it to
    work on the Macintosh and not work on Windows.
    """

    def __init__(self):
        """ Creates a child process, which returns. The parent
            process waits for a KeyboardInterrupt and then kills
            the child process.
        """
        self.child = os.fork()
        if self.child == 0:
            return
        else:
            self.watch()

    def watch(self):
        try:
            os.wait()
        except KeyboardInterrupt:
            # I put the capital B in KeyBoardInterrupt so I can
            # tell when the Watcher gets the SIGINT
            print 'KeyBoardInterrupt'
            self.kill()
        sys.exit()

    def kill(self):
        try:
            os.kill(self.child, signal.SIGKILL)
        except OSError:
            pass
chemfp-1.1p1/COPYING0000644000077000000240000000372112055226640014315 0ustar dalkestaff00000000000000
Library for working with cheminformatics fingerprints

Unless otherwise noted, the chemfp software is distributed under the
terms of "the MIT license" which follows:

=====================================================================

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

=====================================================================

The maintainer and primary copyright holder is
 - Andrew Dalke, Andrew Dalke Scientific AB

Other copyright holders are:
 - Kim Walisch
 - Stanford University (written by Imran S.
Haque)
 - Python Software Foundation (the heapq code)
 - Christopher Swenson (the TimSort code in hits.c)

This package also includes (nearly) unmodified copies of
 - chemfp/progressbar/ by Nilton Volpato under the LGPL 2.1 and/or BSD license
 - chemfp/futures/ by Brian Quinlan under the Python license
 - chemfp/Watcher.py by Allen Downey under the Python license
 - chemfp/argparse.py by Steven J. Bethard under the Apache License 2.0
chemfp-1.1p1/docs/0000755000077000000240000000000012106315372014205 5ustar dalkestaff00000000000000
chemfp-1.1p1/docs/api.rst0000644000077000000240000013002512104111111015470 0ustar dalkestaff00000000000000
.. _chemfp-api:

==========
chemfp API
==========

This chapter contains the docstrings for the public portion of the
chemfp API.

=============
chemfp module
=============

The following functions and classes are in the chemfp module.

.. py:module:: chemfp

open
====

.. py:function:: open(source, format=None)

    Read fingerprints from a fingerprint file

    Read fingerprints from 'source', using the given format. If 'source'
    is a string then it is treated as a filename. If 'source' is None
    then fingerprints are read from stdin. Otherwise, 'source' must be a
    Python file object supporting 'read' and 'readline'.

    If 'format' is None then the fingerprint file format and compression
    type are derived from the source filename, or from the name attribute
    of the source file object. If the source is None then stdin is
    assumed to be uncompressed data in "fps" format.

    The supported format strings are:

       fps, fps.gz - fingerprints are in FPS format

    The result is an FPSReader. Here's an example of printing the
    contents of the file::

        reader = open("example.fps.gz")
        for id, fp in reader:
            print id, fp.encode("hex")

    :param source: The fingerprint source.
    :type source: A filename string, a file object, or None
    :param format: The file format and optional compression.
    :type format: string, or None
    :returns: an FPSReader

..
_chemfp_load_fingerprints:

load_fingerprints
=================

.. py:function:: load_fingerprints(reader, metadata=None, reorder=True)

    Load all of the fingerprints into an in-memory FingerprintArena data structure

    The FingerprintArena data structure reads all of the fingerprints and
    identifiers from 'reader' and stores them into an in-memory data
    structure which supports fast similarity searches. If 'reader' is a
    string or implements "read" then the contents will be parsed with
    the 'chemfp.open' function. Otherwise it must support iteration
    returning (id, fingerprint) pairs. 'metadata' contains the metadata
    for the arena. If not specified then 'reader.metadata' is used.

    The loader may reorder the fingerprints for better search
    performance. To prevent reordering, use reorder=False.

    The 'alignment' option specifies the data alignment and padding
    size for each fingerprint. A value of 8 means that each fingerprint
    will start on an 8 byte alignment, and use storage space which is a
    multiple of 8 bytes long. The default value of None determines the
    best alignment based on the fingerprint size and available popcount
    methods.

    :param reader: An iterator over (id, fingerprint) pairs
    :type reader: a string, file object, or (id, fingerprint) iterator
    :param metadata: The metadata for the arena, if other than reader.metadata
    :type metadata: Metadata
    :param reorder: Specify if fingerprints should be reordered for better performance
    :type reorder: True or False
    :param alignment: Alignment size (both data alignment and padding)
    :returns: FingerprintArena

.. _chemfp_read_structure_fingerprints:

read_structure_fingerprints
===========================

.. py:function:: read_structure_fingerprints(type, source=None, format=None, id_tag=None, errors="strict")

    Read structures from 'source' and return the corresponding ids and fingerprints

    This returns a FingerprintReader which can be iterated over to get
    the id and fingerprint for each read structure record.
    The fingerprint generated depends on the value of 'type'. Structures
    are read from 'source', which can either be the structure filename,
    or None to read from stdin.

    'type' contains the information about how to turn a structure into
    a fingerprint. It can be a string or a Metadata instance. String
    values look like "OpenBabel-FP2/1", "OpenEye-Path", and
    "OpenEye-Path/1 min_bonds=0 max_bonds=5 atype=DefaultAtom btype=DefaultBond".
    Default values are used for unspecified parameters. Use a Metadata
    instance with 'type' and 'aromaticity' values set in order to pass
    aromaticity information to OpenEye.

    If 'format' is None then the structure file format and compression
    are determined by the filename's extension(s), defaulting to
    uncompressed SMILES if that is not possible. Otherwise 'format' may
    be "smi" or "sdf" optionally followed by ".gz" or ".bz2" to indicate
    compression. The OpenBabel and OpenEye toolkits also support
    additional formats.

    If 'id_tag' is None, then the record id is based on the title
    field for the given format. If the input format is "sdf" then
    'id_tag' specifies the tag field containing the identifier. (Only
    the first line is used for multi-line values.) For example, ChEBI
    omits the title from the SD files and stores the id after the
    "> <ChEBI ID>" line. In that case, use id_tag = "ChEBI ID".

    'aromaticity' specifies the aromaticity model, and is only
    appropriate for OEChem. It must be a string like "openeye" or
    "daylight".

    Here is an example of using fingerprints generated from a structure file::

        fp_reader = read_structure_fingerprints("OpenBabel-FP4/1", "example.sdf.gz")
        print "Each fingerprint has", fp_reader.metadata.num_bits, "bits"
        for (id, fp) in fp_reader:
            print id, fp.encode("hex")

    :param type: information about how to convert the input structure into a fingerprint
    :type type: string or Metadata
    :param source: The structure data source.
    :type source: A filename (as a string), a file object, or None to read from stdin
    :param format: The file format and optional compression.
            Examples: 'smi' and 'sdf.gz'
    :type format: string, or None to autodetect based on the source
    :param id_tag: The tag containing the record id. Example: 'ChEBI ID'.
            Only valid for SD files.
    :type id_tag: string, or None to use the default title for the given format
    :returns: a FingerprintReader

.. _chemfp_count_tanimoto_hits:

count_tanimoto_hits
===================

.. py:function:: count_tanimoto_hits(queries, targets, threshold=0.7, arena_size=100)

    Count the number of targets within 'threshold' of each query term

    For each query in 'queries', count the number of targets in 'targets'
    which are at least 'threshold' similar to the query. This function
    returns an iterator containing the (query_id, count) pairs.

    Example::

        queries = chemfp.open("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps.gz")
        for (query_id, count) in chemfp.count_tanimoto_hits(queries, targets, threshold=0.9):
            print query_id, "has", count, "neighbors with at least 0.9 similarity"

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single
    batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have
    more than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: a positive integer, or None
    :returns: An iterator containing (query_id, count) pairs, one for each query

..
_chemfp_threshold_tanimoto_search:

threshold_tanimoto_search
=========================

.. py:function:: threshold_tanimoto_search(queries, targets, threshold=0.7, arena_size=100)

    Find all targets within 'threshold' of each query term

    For each query in 'queries', find all the targets in 'targets' which
    are at least 'threshold' similar to the query. This function returns
    an iterator containing the (query_id, hits) pairs. The hits are stored
    as a list of (target_id, score) pairs.

    Example::

        queries = chemfp.open("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps.gz")
        for (query_id, hits) in chemfp.threshold_tanimoto_search(queries, targets, threshold=0.8):
            print query_id, "has", len(hits), "neighbors with at least 0.8 similarity"
            non_identical = [target_id for (target_id, score) in hits if score != 1.0]
            print "  The non-identical hits are:", non_identical

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single
    batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have
    more than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: positive integer, or None
    :returns: An iterator containing (query_id, hits) pairs, one for each query.
              'hits' contains a list of (target_id, score) pairs.

.. _chemfp_knearest_tanimoto_search:

knearest_tanimoto_search
========================

..
py:function:: knearest_tanimoto_search(queries, targets, k=3, threshold=0.7, arena_size=100)

    Find the 'k'-nearest targets within 'threshold' of each query term

    For each query in 'queries', find the 'k'-nearest of all the targets
    in 'targets' which are at least 'threshold' similar to the query. Ties
    are broken arbitrarily and hits with scores equal to the smallest value
    may have been omitted.

    This function returns an iterator containing the (query_id, hits) pairs,
    where hits is a list of (target_id, score) pairs, sorted so that the
    highest scores are first. The order of ties is arbitrary.

    Example::

        # Use the first 5 fingerprints as the queries
        queries = next(chemfp.open("pubchem_subset.fps").iter_arenas(5))
        targets = chemfp.load_fingerprints("pubchem_subset.fps")

        # Find the 3 nearest hits with a similarity of at least 0.8
        for (query_id, hits) in chemfp.knearest_tanimoto_search(queries, targets, k=3, threshold=0.8):
            print query_id, "has", len(hits), "neighbors with at least 0.8 similarity"
            if hits:
                target_id, score = hits[-1]
                print "    The least similar is", target_id, "with score", score

    Internally, queries are processed in batches of size 'arena_size'.
    A small batch size uses less overall memory and has lower
    processing latency, while a large batch size has better overall
    performance. Use arena_size=None to process the input as a single
    batch.

    Note: the FPSReader may be used as a target but it can only process
    one batch, and searching a FingerprintArena is faster if you have
    more than a few queries.

    :param queries: The query fingerprints.
    :type queries: any fingerprint container
    :param targets: The target fingerprints.
    :type targets: FingerprintArena or the slower FPSReader
    :param k: The maximum number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: The number of queries to process in a batch
    :type arena_size: positive integer, or None
    :returns: An iterator containing (query_id, hits) pairs, one for each query.
              'hits' contains a list of (target_id, score) pairs, sorted by score.

.. _chemfp_metadata:

Metadata
========

.. py:class:: Metadata(num_bits=None, num_bytes=None, type=None, aromaticity=None, software=None, sources=None, date=None)

    Store information about a set of fingerprints

    The metadata attributes are:

    - num_bits: number of bits in the fingerprint
    - num_bytes: number of bytes in the fingerprint
    - type: fingerprint type
    - aromaticity: aromaticity model (only used with OEChem)
    - software: software used to make the fingerprints
    - sources: list of sources used to make the fingerprint
    - date: timestamp of when the fingerprints were made

.. _chemfp_fingerprintreader:

FingerprintReader (base class)
==============================

.. py:class:: chemfp.FingerprintReader(metadata)

    Initialize with a Metadata instance

    Base class for all chemfp objects holding fingerprint records

    All FingerprintReader instances have a 'metadata' attribute
    containing a Metadata and can be iterated over to get the (id,
    fingerprint) for each record.

iter(arena)
-----------

iterate over the (id, fingerprint) pairs

iter_arenas
-----------

.. py:method:: iter_arenas(arena_size=1000)

    iterate through 'arena_size' fingerprints at a time

    This iterates through the fingerprints 'arena_size' at a time,
    yielding a FingerprintArena for each group. Working with arenas is
    often faster than processing one fingerprint at a time, and more
    memory efficient than processing all fingerprints at once.

    If arena_size=None then this makes an iterator containing a single
    arena containing all of the input.

    :param arena_size: The number of fingerprints to put into an arena.
    :type arena_size: positive integer, or None

===================
chemfp.arena module
===================

FingerprintArena instances are returned as part of the public API but
should not be constructed directly.

.. _chemfp_arena_fingerprintarena:

.. py:module:: chemfp.arena

FingerprintArena
================

Implements the FingerprintReader interface.

.. py:class:: FingerprintArena(... do not call directly ...)

    Stores fingerprints in a contiguous block of memory

    The public attributes are:

    metadata
        `Metadata` about the fingerprints
    ids
        list of identifiers, ordered by position

arena.ids
---------

A list of the fingerprint identifiers, in the same order as the fingerprints.

len(arena)
----------

Number of fingerprint records in the FingerprintArena

arena[i]
--------

Return the (id, fingerprint) at position i

.. _chemfp_arena_FingerprintArena_copy:

copy
----

.. py:method:: FingerprintArena.copy(indices=None, reorder=None)

    Create a new arena using either all or some of the fingerprints in this arena

    By default this creates a new arena. The fingerprint data block and
    ids may be shared with the original arena, which makes this a
    shallow copy. If the original arena is a slice, or "sub-arena" of an
    arena, then the copy will allocate new space to store just the
    fingerprints in the slice and use its own list for the ids.

    The `indices` parameter, if not None, is an iterable which contains
    the indices of the fingerprint records to copy. Duplicates are
    allowed, though discouraged.

    If indices are specified then the default `reorder=None` or a
    `reorder=True` will reorder the fingerprints for the new arena by
    popcount. This improves overall search performance. With
    `reorder=False`, the fingerprints will be in the order given by the
    indices.

    If indices are not given, then the default is to preserve the order
    type of the original arena. Otherwise `reorder=True` will always
    reorder and `reorder=False` will leave them in the current order.
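The ordering rules for `copy` can be modelled in a few lines of plain Python. This is only a sketch of the rules as described here, not chemfp's implementation: `copy_records` and `popcount` are hypothetical helpers, the arena is stood in for by a list of (id, fingerprint) pairs, and ascending popcount order is an assumption.

```python
def popcount(fp):
    """Number of set bits in a fingerprint given as a byte string."""
    return sum(bin(byte).count("1") for byte in bytearray(fp))


def copy_records(records, indices=None, reorder=None):
    """Model of the documented ordering rules (hypothetical helper, not chemfp code).

    records: list of (id, fingerprint) pairs standing in for an arena.
    """
    if indices is None:
        # No indices: keep the current order unless reorder=True is requested.
        selected = list(records)
        sort_by_popcount = reorder is True
    else:
        # Indices given: reorder by popcount unless reorder=False is requested.
        selected = [records[i] for i in indices]
        sort_by_popcount = reorder is None or reorder is True
    if sort_by_popcount:
        # Assumption: popcount order means ascending order; the sort is stable.
        selected.sort(key=lambda record: popcount(record[1]))
    return selected


records = [("a", b"\xff"), ("b", b"\x01"), ("c", b"\x0f")]
subset_default = copy_records(records, indices=[0, 2])
subset_as_given = copy_records(records, indices=[0, 2], reorder=False)
```

With these three records, `subset_default` comes back ordered by popcount while `subset_as_given` keeps the order of the supplied indices.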
    :param indices: indices of the records to copy into the new arena
    :type indices: iterable containing integers, or None
    :param reorder: describes how to order the fingerprints
    :type reorder: True to reorder, False to leave in input order, None for default action

get_by_id
---------

.. py:method:: FingerprintArena.get_by_id(id)

    Given the record identifier, return the (id, fingerprint) tuple or None if not present

.. _get_fingerprint_by_id:

get_fingerprint_by_id
---------------------

.. py:method:: FingerprintArena.get_fingerprint_by_id(id)

    Given the record identifier, return its fingerprint or None if not present

get_index_by_id
---------------

.. py:method:: FingerprintArena.get_index_by_id(id)

    Given the record identifier, return the record index or None if not present

iter(arena)
-----------

Iterate over the (id, fingerprint) contents of the arena

iter_arenas
-----------

.. py:method:: FingerprintArena.iter_arenas(arena_size=1000)

    iterate through `arena_size` fingerprints at a time

    This iterates through the fingerprints `arena_size` at a time,
    yielding a FingerprintArena for each group. Working with arenas is
    often faster than processing one fingerprint at a time, and more
    memory efficient than processing all fingerprints at once.

    If arena_size=None then this makes an iterator containing a single
    arena containing all of the input.

    :param arena_size: The number of fingerprints to put into an arena.
    :type arena_size: positive integer, or None

save
----

.. py:method:: FingerprintArena.save(destination)

    Save the arena contents to the given filename or file object

count_tanimoto_hits_fp
----------------------

.. py:method:: FingerprintArena.count_tanimoto_hits_fp(query_fp, threshold=0.7)

    Count the fingerprints which are similar enough to the query fingerprint

    DEPRECATED: Use `chemfp.search.count_tanimoto_hits_fp`_ instead.

    Return the number of fingerprints in this arena which are at least
    `threshold` similar to the query fingerprint `query_fp`.
    :param query_fp: query fingerprint
    :type query_fp: byte string
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: integer count

count_tanimoto_hits_arena
-------------------------

.. py:method:: FingerprintArena.count_tanimoto_hits_arena(query_arena, threshold=0.7)

    Count the fingerprints which are similar enough to each query fingerprint

    DEPRECATED: Use `chemfp.search.count_tanimoto_hits_arena`_ or
    `chemfp.search.count_tanimoto_hits_symmetric`_ instead.

    Returns an iterator containing the (query_id, count) for each
    fingerprint in `query_arena`, where `query_id` is the query
    fingerprint id and `count` is the number of fingerprints found which
    are at least `threshold` similar to the query.

    The order of results is the same as the order of the queries. For
    efficiency reasons, `arena_size` queries are processed at a time.

    :param query_arena: query fingerprints
    :type query_arena: FingerprintArena or FPSReader (must implement iter_arenas())
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :param arena_size: number of queries to process at a time (default: 100)
    :type arena_size: positive integer
    :returns: list of (query_id, integer count) pairs, one for each query

threshold_tanimoto_search_fp
----------------------------

.. py:method:: FingerprintArena.threshold_tanimoto_search_fp(query_fp, threshold=0.7)

    Find the fingerprints which are similar enough to the query fingerprint

    DEPRECATED: Use `chemfp.search.threshold_tanimoto_search_fp`_ instead.

    Find all of the fingerprints in this arena which are at least
    `threshold` similar to the query fingerprint `query_fp`. The hits
    are returned as a list containing (id, score) tuples in arbitrary
    order.
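To make "at least `threshold` similar" concrete, here is a minimal pure-Python sketch of a threshold Tanimoto search over byte-string fingerprints. It is an illustration only: `tanimoto` and `threshold_search` are hypothetical helpers rather than chemfp functions, the targets are a plain list of (id, fingerprint) pairs instead of an arena, and scoring two all-zero fingerprints as 0.0 is an assumption.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    a, b = bytearray(fp1), bytearray(fp2)
    both = sum(bin(x & y).count("1") for x, y in zip(a, b))    # bits set in both
    either = sum(bin(x | y).count("1") for x, y in zip(a, b))  # bits set in either
    if either == 0:
        return 0.0  # assumption: two empty fingerprints score 0.0
    return both / float(either)


def threshold_search(query_fp, targets, threshold=0.7):
    """Return (id, score) pairs for targets scoring at least `threshold`."""
    hits = []
    for target_id, target_fp in targets:
        score = tanimoto(query_fp, target_fp)
        if score >= threshold:
            hits.append((target_id, score))
    return hits


targets = [("t1", b"\x0f"), ("t2", b"\x03"), ("t3", b"\xf0")]
hits = threshold_search(b"\x0f", targets, threshold=0.5)
```

The comparison is inclusive, so a target whose score equals the threshold is reported as a hit.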
    :param query_fp: query fingerprint
    :type query_fp: byte string
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: list of (int, score) tuples

threshold_tanimoto_search_arena
-------------------------------

.. py:method:: FingerprintArena.threshold_tanimoto_search_arena(query_arena, threshold=0.7)

    Find the fingerprints which are similar to each of the query fingerprints

    DEPRECATED: Use `chemfp.search.threshold_tanimoto_search_arena`_ or
    `chemfp.search.threshold_tanimoto_search_symmetric`_ instead.

    For each fingerprint in the `query_arena`, find all of the
    fingerprints in this arena which are at least `threshold` similar.
    The hits are returned as a `SearchResults` instance.

    :param query_arena: query arena
    :type query_arena: FingerprintArena
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: SearchResults

knearest_tanimoto_search_fp
---------------------------

.. py:method:: FingerprintArena.knearest_tanimoto_search_fp(query_fp, k=3, threshold=0.7)

    Find the k-nearest fingerprints which are similar to the query fingerprint

    DEPRECATED: Use `chemfp.search.knearest_tanimoto_search_fp`_ instead.

    Find the `k` fingerprints in this arena which are most similar to
    the query fingerprint `query_fp` and which are at least `threshold`
    similar to the query. The hits are returned as a list of (id, score)
    tuples sorted with the highest similarity first. Ties are broken
    arbitrarily.

    :param query_fp: query fingerprint
    :type query_fp: byte string
    :param k: number of nearest neighbors to find (default: 3)
    :type k: positive integer
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: SearchResults

knearest_tanimoto_search_arena
------------------------------

..
py:method:: FingerprintArena.knearest_tanimoto_search_arena(query_arena, k=3, threshold=0.7)

    Find the k-nearest fingerprints which are similar to each of the query fingerprints

    DEPRECATED: Use `chemfp.search.knearest_tanimoto_search_arena`_ or
    `chemfp.search.knearest_tanimoto_search_symmetric`_ instead.

    For each fingerprint in the `query_arena`, find the `k` fingerprints
    in this arena which are most similar and which are at least
    `threshold` similar to the query fingerprint. The hits are returned
    as a SearchResult where the hits are sorted with the highest
    similarity first. Ties are broken arbitrarily.

    :param query_arena: query arena
    :type query_arena: FingerprintArena
    :param k: number of nearest neighbors to find (default: 3)
    :type k: positive integer
    :param threshold: minimum similarity threshold (default: 0.7)
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: SearchResult

====================
chemfp.search module
====================

The following functions and classes are in the chemfp.search module.

.. _chemfp_search:

.. py:module:: chemfp.search

Module functions
================

The `*_fp` functions search a query fingerprint against a target
arena. The `*_arena` functions search a query arena against a target
arena. The `*_symmetric` functions use the same arena as query and
target, and exclude matching a fingerprint against itself.

count_tanimoto_hits_fp
----------------------

.. _chemfp_search_count_tanimoto_hits_fp:

..
py:method:: count_tanimoto_hits_fp(query_fp, target_arena, threshold=0.7)

    Count the number of hits in `target_arena` at least `threshold` similar to the `query_fp`

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print chemfp.search.count_tanimoto_hits_fp(query_fp, targets, threshold=0.1)

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: an integer count

count_tanimoto_hits_arena
-------------------------

.. _chemfp_search_count_tanimoto_hits_arena:

.. py:method:: count_tanimoto_hits_arena(query_arena, target_arena, threshold=0.7)

    For each fingerprint in `query_arena`, count the number of hits in `target_arena` at least `threshold` similar to it

    Example::

        queries = chemfp.load_fingerprints("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps")
        counts = chemfp.search.count_tanimoto_hits_arena(queries, targets, threshold=0.1)
        print counts[:10]

    The result is implementation specific. You'll always be able to
    get its length and do an index lookup to get an integer
    count. Currently it's a ctypes array of longs, but it could be an
    array.array or Python list in the future.

    :param query_arena: The query fingerprints.
    :type query_arena: a FingerprintArena
    :param target_arena: The target fingerprints.
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: an array of counts

count_tanimoto_hits_symmetric
-----------------------------

.. _chemfp_search_count_tanimoto_hits_symmetric:

.. py:method:: count_tanimoto_hits_symmetric(arena, threshold=0.7, batch_size=100)

    For each fingerprint in the `arena`, count the number of other fingerprints at least `threshold` similar to it

    A fingerprint never matches itself.
    The computation can take a long time, and Python won't check for a
    ^C until the underlying function call finishes. To stay responsive,
    the search processes only `batch_size` rows at a time before
    checking for a ^C.

    Example::

        arena = chemfp.load_fingerprints("targets.fps")
        counts = chemfp.search.count_tanimoto_hits_symmetric(arena, threshold=0.2)
        print counts[:10]

    The result object is implementation specific. You'll always be able
    to get its length and do an index lookup to get an integer
    count. Currently it's a ctypes array of longs, but it could be an
    array.array or Python list in the future.

    :param arena: the set of fingerprints
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param batch_size: the number of rows to process before checking for a ^C
    :type batch_size: integer
    :returns: an array of counts

threshold_tanimoto_search_fp
----------------------------

.. _chemfp_search_threshold_tanimoto_search_fp:

.. py:method:: threshold_tanimoto_search_fp(query_fp, target_arena, threshold=0.7)

    Search for fingerprint hits in `target_arena` which are at least `threshold` similar to `query_fp`

    The hits in the returned `SearchResult` are in arbitrary order.

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print list(chemfp.search.threshold_tanimoto_search_fp(query_fp, targets, threshold=0.15))

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResult

threshold_tanimoto_search_arena
-------------------------------

.. _chemfp_search_threshold_tanimoto_search_arena:

..
py:method:: threshold_tanimoto_search_arena(query_arena, target_arena, threshold=0.7)

    Search for the hits in the `target_arena` at least `threshold` similar to the fingerprints in `query_arena`

    The hits in the returned `SearchResults` are in arbitrary order.

    Example::

        queries = chemfp.load_fingerprints("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps")
        results = chemfp.search.threshold_tanimoto_search_arena(queries, targets, threshold=0.5)
        for query_id, query_hits in zip(queries.ids, results):
            if len(query_hits) > 0:
                print query_id, "->", ", ".join(query_hits.get_ids())

    :param query_arena: The query fingerprints.
    :type query_arena: a FingerprintArena
    :param target_arena: The target fingerprints.
    :type target_arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResults instance

threshold_tanimoto_search_symmetric
-----------------------------------

.. _chemfp_search_threshold_tanimoto_search_symmetric:

.. py:method:: threshold_tanimoto_search_symmetric(arena, threshold=0.7, include_lower_triangle=True, batch_size=100)

    Search for the hits in the `arena` at least `threshold` similar to the fingerprints in the arena

    When `include_lower_triangle` is True, compute the upper-triangle
    similarities, then copy the results to get the full set of
    results. When `include_lower_triangle` is False, only compute the
    upper triangle.

    The computation can take a long time, and Python won't check for a
    ^C until the underlying function call finishes. To stay responsive,
    the search processes only `batch_size` rows at a time before
    checking for a ^C.

    The hits in the returned `SearchResults` are in arbitrary order.
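The upper-triangle idea behind `include_lower_triangle` can be sketched in plain Python. This is a hedged illustration under stated assumptions, not chemfp's code: `symmetric_threshold_search` and `tanimoto` are hypothetical helpers, fingerprints are plain byte strings in a list, and each result row holds (target index, score) pairs.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    a, b = bytearray(fp1), bytearray(fp2)
    both = sum(bin(x & y).count("1") for x, y in zip(a, b))
    either = sum(bin(x | y).count("1") for x, y in zip(a, b))
    return both / float(either) if either else 0.0  # assumption: 0/0 scores 0.0


def symmetric_threshold_search(fps, threshold=0.7, include_lower_triangle=True):
    """Return one list of (index, score) hits per fingerprint (hypothetical helper)."""
    rows = [[] for _ in fps]
    for i in range(len(fps)):
        # Only pairs with j > i are scored, so the diagonal is never visited
        # and each pair is compared exactly once.
        for j in range(i + 1, len(fps)):
            score = tanimoto(fps[i], fps[j])
            if score >= threshold:
                rows[i].append((j, score))
                if include_lower_triangle:
                    rows[j].append((i, score))  # mirror the hit into row j
    return rows


fps = [b"\x0f", b"\x07", b"\xf0"]
full = symmetric_threshold_search(fps, threshold=0.7)
upper = symmetric_threshold_search(fps, threshold=0.7, include_lower_triangle=False)
```

Because every mirrored hit corresponds to exactly one upper-triangle hit, the full result holds twice as many hits as the upper triangle alone.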
    Example::

        arena = chemfp.load_fingerprints("queries.fps")
        full_result = chemfp.search.threshold_tanimoto_search_symmetric(arena, threshold=0.2)
        upper_triangle = chemfp.search.threshold_tanimoto_search_symmetric(
            arena, threshold=0.2, include_lower_triangle=False)
        assert sum(map(len, full_result)) == sum(map(len, upper_triangle))*2

    :param arena: the set of fingerprints
    :type arena: a FingerprintArena
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :param include_lower_triangle: if False, compute only the upper
        triangle, otherwise use symmetry to compute the full matrix
    :type include_lower_triangle: boolean
    :param batch_size: the number of rows to process before checking for a ^C
    :type batch_size: integer
    :returns: a SearchResults instance

knearest_tanimoto_search_fp
---------------------------

.. _chemfp_search_knearest_tanimoto_search_fp:

.. py:method:: knearest_tanimoto_search_fp(query_fp, target_arena, k=3, threshold=0.7)

    Search for `k`-nearest hits in `target_arena` which are at least `threshold` similar to `query_fp`

    The hits in the `SearchResult` are ordered by decreasing similarity score.

    Example::

        query_id, query_fp = chemfp.load_fingerprints("queries.fps")[0]
        targets = chemfp.load_fingerprints("targets.fps")
        print list(chemfp.search.knearest_tanimoto_search_fp(query_fp, targets, k=3, threshold=0.0))

    :param query_fp: the query fingerprint
    :type query_fp: a byte string
    :param target_arena: the target arena
    :type target_arena: a FingerprintArena
    :param k: the number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResult

knearest_tanimoto_search_arena
------------------------------

.. _chemfp_search_knearest_tanimoto_search_arena:

..
py:method:: knearest_tanimoto_search_arena(query_arena, target_arena, k=3, threshold=0.7)

    Search for the `k` nearest hits in the `target_arena` at least `threshold` similar to the fingerprints in `query_arena`

    The hits in the `SearchResults` are ordered by decreasing similarity score.

    Example::

        queries = chemfp.load_fingerprints("queries.fps")
        targets = chemfp.load_fingerprints("targets.fps")
        results = chemfp.search.knearest_tanimoto_search_arena(queries, targets, k=3, threshold=0.5)
        for query_id, query_hits in zip(queries.ids, results):
            if len(query_hits) >= 2:
                print query_id, "->", ", ".join(query_hits.get_ids())

    :param query_arena: The query fingerprints.
    :type query_arena: a FingerprintArena
    :param target_arena: The target fingerprints.
    :type target_arena: a FingerprintArena
    :param k: the number of nearest neighbors to find.
    :type k: positive integer
    :param threshold: The minimum score threshold.
    :type threshold: float between 0.0 and 1.0, inclusive
    :returns: a SearchResults instance

knearest_tanimoto_search_symmetric
----------------------------------

.. _chemfp_search_knearest_tanimoto_search_symmetric:

.. py:method:: knearest_tanimoto_search_symmetric(arena, k=3, threshold=0.7, batch_size=100)

    Search for the `k`-nearest hits in the `arena` at least `threshold` similar to the fingerprints in the arena

    The computation can take a long time, and Python won't check for a
    ^C until the underlying function call finishes. To stay responsive,
    the search processes only `batch_size` rows at a time before
    checking for a ^C.

    The hits in the `SearchResults` are ordered by decreasing similarity score.
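A small pure-Python model may help show what a symmetric k-nearest search returns. It is a sketch under stated assumptions rather than chemfp's algorithm: `knearest_symmetric` and `tanimoto` are hypothetical helpers, fingerprints are byte strings in a list, hits are (target index, score) pairs, and the top `k` are picked with `heapq.nlargest`.

```python
import heapq


def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    a, b = bytearray(fp1), bytearray(fp2)
    both = sum(bin(x & y).count("1") for x, y in zip(a, b))
    either = sum(bin(x | y).count("1") for x, y in zip(a, b))
    return both / float(either) if either else 0.0  # assumption: 0/0 scores 0.0


def knearest_symmetric(fps, k=3, threshold=0.7):
    """For each fingerprint, return up to k (index, score) hits, best first."""
    results = []
    for i, query_fp in enumerate(fps):
        hits = []
        for j, target_fp in enumerate(fps):
            if i == j:
                continue  # a fingerprint never matches itself
            score = tanimoto(query_fp, target_fp)
            if score >= threshold:
                hits.append((j, score))
        # nlargest returns the hits sorted by decreasing score.
        results.append(heapq.nlargest(k, hits, key=lambda hit: hit[1]))
    return results


fps = [b"\x0f", b"\x07", b"\x03", b"\xf0"]
results = knearest_symmetric(fps, k=1, threshold=0.5)
```

In this toy data the nearest neighbor of fingerprint 2 is fingerprint 1, while the nearest neighbor of fingerprint 1 is fingerprint 0, so k-nearest results need not mirror each other the way threshold results do.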
Example:: arena = chemfp.load_fingerprints("queries.fps") results = chemfp.search.knearest_tanimoto_search_symmetric(arena, k=3, threshold=0.8) for (query_id, hits) in zip(arena.ids, results): print query_id, "->", ", ".join(("%s %.2f" % hit) for hit in hits.get_ids_and_scores()) :param arena: the set of fingerprints :type arena: a FingerprintArena :param k: the number of nearest neighbors to find. :type k: positive integer :param threshold: The minimum score threshold. :type threshold: float between 0.0 and 1.0, inclusive :param batch_size: the number of rows to process before checking for a ^C :type batch_size: integer :returns: a SearchResults instance .. _searchresults: SearchResults ============= .. py:class:: SearchResults(... do not call directly ...) Search results for a list of query fingerprints against a target arena This acts like a list of SearchResult elements, with the ability to iterate over each search result, look them up by index, and get the number of scores. In addition, there are helper methods to iterate over each hit and to get the hit indices, scores, and identifiers directly as Python lists, sort the list contents, and more. len(results) ------------ The number of rows in the SearchResults results[i] ---------- Get the 'i'th SearchResult clear_all --------- .. py:method:: SearchResults.clear_all() Remove all hits from all of the search results .. _chemfp_search_SearchResults_count_all: count_all --------- .. py:method:: SearchResults.count_all(min_score=None, max_score=None, interval="[]") Count the number of hits in all rows with a score between `min_score` and `max_score` See SearchResult.count for the meaning of the parameters. .. _SearchResults.cumulative_score_all: cumulative_score_all -------------------- .. 
py:method:: SearchResults.cumulative_score_all(min_score=None, max_score=None, interval="[]") The sum of all scores in all rows which are between `min_score` and `max_score` Using the default parameters this returns the sum of all of the scores in all of the results. With a specified range this returns the sum of all of the scores in that range. The cumulative score is also known as the raw score. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. The half-open/half-closed intervals "(]" and "[)" are also supported. :param min_score: the minimum score in the range. :type min_score: a float, or None for -infinity :param max_score: the maximum score in the range. :type max_score: a float, or None for +infinity :param interval: specify if the end points are open or closed. :type interval: one of "[]", "()", "(]", "[)" :returns: a floating point value iter(results) ------------- Iterate over each SearchResult hit iter_ids -------- .. py:method:: SearchResults.iter_ids() For each hit, yield the list of target identifiers iter_ids_and_scores ------------------- .. py:method:: SearchResults.iter_ids_and_scores() For each hit, yield the list of (target id, score) tuples iter_indices ------------ .. py:method:: SearchResults.iter_indices() For each hit, yield the list of target indices iter_indices_and_scores ----------------------- .. py:method:: SearchResults.iter_indices_and_scores() For each hit, yield the list of (target index, score) tuples iter_scores ----------- .. py:method:: SearchResults.iter_scores() For each hit, yield the list of target scores iter_hits --------- REMOVED: Renamed to iter_ids_and_scores for 1.1. .. 
_chemfp_search_searchresults_reorder_all: reorder_all ----------- .. py:method:: SearchResults.reorder_all() Reorder the hits for all of the rows based on the requested `order`. The available orderings are: increasing-score: sort by increasing score decreasing-score: sort by decreasing score increasing-index: sort by increasing target index decreasing-index: sort by decreasing target index move-closest-first: move the hit with the highest score to the first position reverse: reverse the current ordering :param ordering: the name of the ordering to use .. _SearchResult: SearchResult ============ .. py:class:: SearchResult (... do not call directly ...) Search results for a query fingerprint against a target arena. The result contains a list of hits. Each hit contains a target index, a score, and an optional target id. The hits can be reordered based on score or index. len(result) ------------ The number of hits iter(result) ------------ Iterate through the pairs of (target index, score) using the current ordering clear ----- .. py:method:: SearchResult.clear() Remove all hits from this result .. _SearchResult_count: count ----- .. py:method:: SearchResult.count(min_score=None, max_score=None, interval="[]") Count the number of hits with a score between `min_score` and `max_score` Using the default parameters this returns the number of hits in the result. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. The half-open/half-closed intervals "(]" and "[)" are also supported. :param min_score: the minimum score in the range. :type min_score: a float, or None for -infinity :param max_score: the maximum score in the range. 
:type max_score: a float, or None for +infinity :param interval: specify if the end points are open or closed. :type interval: one of "[]", "()", "(]", "[)" :returns: an integer count .. _SearchResult.cumulative_score: cumulative_score ---------------- .. py:method:: SearchResult.cumulative_score(min_score=None, max_score=None, interval="[]") The sum of the scores which are between `min_score` and `max_score` Using the default parameters this returns the sum of all of the scores in the result. With a specified range this returns the sum of all of the scores in that range. The cumulative score is also known as the raw score. The default `min_score` of None is equivalent to -infinity. The default `max_score` of None is equivalent to +infinity. The `interval` parameter describes the interval end conditions. The default of "[]" uses a closed interval, where min_score <= score <= max_score. The interval "()" uses the open interval where min_score < score < max_score. The half-open/half-closed intervals "(]" and "[)" are also supported. :param min_score: the minimum score in the range. :type min_score: a float, or None for -infinity :param max_score: the maximum score in the range. :type max_score: a float, or None for +infinity :param interval: specify if the end points are open or closed. :type interval: one of "[]", "()", "(]", "[)" :returns: a floating point value .. _get_ids: get_ids ------- .. py:method:: SearchResult.get_ids() The list of target identifiers (if available), in the current ordering get_ids_and_scores ------------------ .. py:method:: SearchResult.get_ids_and_scores() The list of (target identifier, target score) pairs, in the current ordering Raises a TypeError if the target IDs are not available. .. _get_indices: get_indices ----------- .. py:method:: SearchResult.get_indices() The list of target indices, in the current ordering. .. _get_indices_and_scores: get_indices_and_scores ---------------------- .. 
py:method:: SearchResult.get_indices_and_scores() The list of (target index, score) pairs, in the current ordering .. _get_scores: get_scores ---------- .. py:method:: SearchResult.get_scores() The list of target scores, in the current ordering .. _chemfp_search_searchresult_reorder: reorder ------- .. py:method:: SearchResult.reorder(ordering="decreasing-score") Reorder the hits based on the requested ordering. The available orderings are: increasing-score: sort by increasing score decreasing-score: sort by decreasing score increasing-index: sort by increasing target index decreasing-index: sort by decreasing target index move-closest-first: move the hit with the highest score to the first position reverse: reverse the current ordering :param ordering: the name of the ordering to use .. _chemfp.bitops: ==================== chemfp.bitops module ==================== .. py:module:: chemfp.bitops The following functions are in the chemfp.bitops module. They provide low-level bit operations on byte and hex fingerprints. byte_popcount ============= .. py:function:: byte_popcount() byte_popcount(fp) Return the number of bits set in a byte fingerprint byte_intersect_popcount ======================= .. py:function:: byte_intersect_popcount() byte_intersect_popcount(fp1, fp2) Return the number of bits set in the intersection of the two byte fingerprints byte_tanimoto ============= .. py:function:: byte_tanimoto() byte_tanimoto(fp1, fp2) Compute the Tanimoto similarity between two byte fingerprints byte_contains ============= .. py:function:: byte_contains() byte_contains(super_fp, sub_fp) Return 1 if the on bits of sub_fp are also 1 bits in super_fp hex_isvalid =========== .. py:function:: hex_isvalid() hex_isvalid(s) Return 1 if the string is a valid hex fingerprint, otherwise 0 hex_popcount ============ .. 
py:function:: hex_popcount() hex_popcount(fp) Return the number of bits set in a hex fingerprint, or -1 for non-hex strings hex_intersect_popcount ====================== .. py:function:: hex_intersect_popcount() hex_intersect_popcount(fp1, fp2) Return the number of bits set in the intersection of the two hex fingerprints, or -1 if either string is a non-hex string hex_tanimoto ============ .. py:function:: hex_tanimoto() hex_tanimoto(fp1, fp2) Compute the Tanimoto similarity between two hex fingerprints. Return a float between 0.0 and 1.0, or -1.0 if either string is not a hex fingerprint hex_contains ============ .. py:function:: hex_contains() hex_contains(super_fp, sub_fp) Return 1 if the on bits of sub_fp are also 1 bits in super_fp, otherwise 0. Return -1 if either string is not a hex fingerprint chemfp-1.1p1/docs/api.txt0000644000077000000240000003103412104111106015503 0ustar dalkestaff00000000000000.. _chemfp-api: ========== chemfp API ========== This chapter contains the docstrings for the public portion of the chemfp API. ============= chemfp module ============= The following functions and classes are in the chemfp module. .. py:module:: chemfp open ==== .. py:function:: open(source, format=None) {{ chemfp.open|docstring }} .. _chemfp_load_fingerprints: load_fingerprints ================= .. py:function:: load_fingerprints(reader, metadata=None, reorder=True) {{ chemfp.load_fingerprints|docstring}} .. _chemfp_read_structure_fingerprints: read_structure_fingerprints =========================== .. py:function:: read_structure_fingerprints(type, source=None, format=None, id_tag=None, errors="strict") {{ chemfp.read_structure_fingerprints|docstring}} .. _chemfp_count_tanimoto_hits: count_tanimoto_hits =================== .. py:function:: count_tanimoto_hits(queries, targets, threshold=0.7, arena_size=100) {{ chemfp.count_tanimoto_hits|docstring}} .. _chemfp_threshold_tanimoto_search: threshold_tanimoto_search ========================= .. 
py:function:: threshold_tanimoto_search (queries, targets, threshold=0.7, arena_size=100) {{ chemfp.threshold_tanimoto_search|docstring}} .. _chemfp_knearest_tanimoto_search: knearest_tanimoto_search ======================== .. py:function:: knearest_tanimoto_search (queries, targets, k=3, threshold=0.7, arena_size=100) {{ chemfp.knearest_tanimoto_search|docstring}} .. _chemfp_metadata: Metadata ======== .. py:class:: Metadata(num_bits=None, num_bytes=None, type=None, aromaticity=None, software=None, sources=None, date=None) {{ chemfp.Metadata|docstring}} .. _chemfp_fingerprintreader: FingerprintReader (base class) ============================== .. py:class:: chemfp.FingerprintReader(metadata) {{ chemfp.FingerprintReader.__init__|docstring}} {{ chemfp.FingerprintReader|docstring}} iter(arena) ----------- {{ chemfp.FingerprintReader.__iter__|docstring}} iter_arenas ----------- .. py:method:: iter_arenas(arena_size=1000) {{ chemfp.FingerprintReader.iter_arenas|docstring}} =================== chemfp.arena module =================== FingerprintArena instances are returned as part of the public API but should not be constructed directly. .. _chemfp_arena_fingerprintarena: .. py:module:: chemfp.arena FingerprintArena ================ Implements the FingerprintReader interface. .. py:class:: FingerprintArena(... do not call directly ...) {{ chemfp.arena.FingerprintArena|docstring}} arena.ids --------- A list of the fingerprint identifiers, in the same order as the fingerprints. len(arena) ---------- {{ chemfp.arena.FingerprintArena.__len__|docstring}} arena[i] -------- {{ chemfp.arena.FingerprintArena.__getitem__|docstring}} .. _chemfp_arena_FingerprintArena_copy: copy ---- .. py:method:: FingerprintArena.copy(indices=None, reorder=None) {{ chemfp.arena.FingerprintArena.copy|docstring}} get_by_id --------- .. py:method:: FingerprintArena.get_by_id(id) {{ chemfp.arena.FingerprintArena.get_by_id|docstring}} .. 
_get_fingerprint_by_id: get_fingerprint_by_id --------------------- .. py:method:: FingerprintArena.get_fingerprint_by_id(id) {{ chemfp.arena.FingerprintArena.get_fingerprint_by_id|docstring}} get_index_by_id --------------- .. py:method:: FingerprintArena.get_index_by_id(id) {{ chemfp.arena.FingerprintArena.get_index_by_id|docstring}} iter(arena) ----------- {{ chemfp.arena.FingerprintArena.__iter__|docstring}} iter_arenas ----------- .. py:method:: FingerprintArena.iter_arenas(arena_size=1000) {{ chemfp.arena.FingerprintArena.iter_arenas|docstring}} save ---- .. py:method:: FingerprintArena.save(destination) {{ chemfp.arena.FingerprintArena.save|docstring}} count_tanimoto_hits_fp ---------------------- .. py:method:: FingerprintArena.count_tanimoto_hits_fp(query_fp, threshold=0.7) {{ chemfp.arena.FingerprintArena.count_tanimoto_hits_fp|docstring}} count_tanimoto_hits_arena ------------------------- .. py:method:: FingerprintArena.count_tanimoto_hits_arena(query_arena, threshold=0.7) {{ chemfp.arena.FingerprintArena.count_tanimoto_hits_arena|docstring}} threshold_tanimoto_search_fp ---------------------------- .. py:method:: FingerprintArena.threshold_tanimoto_search_fp(query_fp, threshold=0.7) {{ chemfp.arena.FingerprintArena.threshold_tanimoto_search_fp|docstring}} threshold_tanimoto_search_arena ------------------------------- .. py:method:: FingerprintArena.threshold_tanimoto_search_arena(query_arena, threshold=0.7) {{ chemfp.arena.FingerprintArena.threshold_tanimoto_search_arena|docstring}} knearest_tanimoto_search_fp ---------------------------- .. py:method:: FingerprintArena.knearest_tanimoto_search_fp(query_fp, k=3, threshold=0.7) {{ chemfp.arena.FingerprintArena.knearest_tanimoto_search_fp|docstring}} knearest_tanimoto_search_arena ------------------------------- .. 
py:method:: FingerprintArena.knearest_tanimoto_search_arena(query_arena, k=3, threshold=0.7) {{ chemfp.arena.FingerprintArena.knearest_tanimoto_search_arena|docstring }} ==================== chemfp.search module ==================== The following functions and classes are in the chemfp.search module. .. _chemfp_search: .. py:module:: chemfp.search Module functions ================ The `*_fp` functions search a query fingerprint against a target arena. The `*_arena` functions search a query arena against a target arena. The `*_symmetric` functions use the same arena as query and target, and exclude matching a fingerprint against itself. count_tanimoto_hits_fp ---------------------- .. _chemfp_search_count_tanimoto_hits_fp: .. py:method:: count_tanimoto_hits_fp (query_fp, target_arena, threshold=0.7) {{ chemfp.search. count_tanimoto_hits_fp|docstring}} count_tanimoto_hits_arena ------------------------- .. _chemfp_search_count_tanimoto_hits_arena: .. py:method:: count_tanimoto_hits_arena(query_arena, target_arena, threshold=0.7) {{ chemfp.search. count_tanimoto_hits_arena|docstring}} count_tanimoto_hits_symmetric ----------------------------- .. _chemfp_search_count_tanimoto_hits_symmetric: .. py:method:: count_tanimoto_hits_symmetric(arena, threshold=0.7, batch_size=100) {{ chemfp.search. count_tanimoto_hits_symmetric|docstring}} threshold_tanimoto_search_fp ---------------------------- .. _chemfp_search_threshold_tanimoto_search_fp: .. py:method:: threshold_tanimoto_search_fp(query_fp, target_arena, threshold=0.7) {{ chemfp.search. threshold_tanimoto_search_fp|docstring}} threshold_tanimoto_search_arena ------------------------------- .. _chemfp_search_threshold_tanimoto_search_arena: .. py:method:: threshold_tanimoto_search_arena(query_arena, target_arena, threshold=0.7) {{ chemfp.search. threshold_tanimoto_search_arena|docstring}} threshold_tanimoto_search_symmetric ----------------------------------- .. _chemfp_search_threshold_tanimoto_search_symmetric: .. 
py:method:: threshold_tanimoto_search_symmetric(arena, threshold=0.7, include_lower_triangle=True, batch_size=100) {{ chemfp.search. threshold_tanimoto_search_symmetric|docstring}} knearest_tanimoto_search_fp --------------------------- .. _chemfp_search_knearest_tanimoto_search_fp: .. py:method:: knearest_tanimoto_search_fp(query_fp, target_arena, k=3, threshold=0.7) {{ chemfp.search. knearest_tanimoto_search_fp|docstring}} knearest_tanimoto_search_arena ------------------------------ .. _chemfp_search_knearest_tanimoto_search_arena: .. py:method:: knearest_tanimoto_search_arena(query_arena, target_arena, k=3, threshold=0.7) {{ chemfp.search. knearest_tanimoto_search_arena|docstring}} knearest_tanimoto_search_symmetric ---------------------------------- .. _chemfp_search_knearest_tanimoto_search_symmetric: .. py:method:: knearest_tanimoto_search_symmetric(arena, k=3, threshold=0.7, batch_size=100) {{ chemfp.search. knearest_tanimoto_search_symmetric|docstring}} .. _searchresults: SearchResults ============= .. py:class:: SearchResults(... do not call directly ...) {{ chemfp.search.SearchResults|docstring}} len(results) ------------ {{ chemfp.search.SearchResults.__len__|docstring}} results[i] ---------- {{ chemfp.search.SearchResults.__getitem__|docstring}} clear_all --------- .. py:method:: SearchResults.clear_all() {{ chemfp.search.SearchResults.clear_all|docstring}} .. _chemfp_search_SearchResults_count_all: count_all --------- .. py:method:: SearchResults.count_all(min_score=None, max_score=None, interval="[]") {{ chemfp.search.SearchResults.count_all|docstring}} .. _SearchResults.cumulative_score_all: cumulative_score_all -------------------- .. py:method:: SearchResults.cumulative_score_all(min_score=None, max_score=None, interval="[]") {{ chemfp.search.SearchResults.cumulative_score_all|docstring}} iter(results) ------------- {{ chemfp.search.SearchResults.__iter__|docstring}} iter_ids -------- .. 
py:method:: SearchResults.iter_ids() {{ chemfp.search.SearchResults.iter_ids|docstring}} iter_ids_and_scores ------------------- .. py:method:: SearchResults.iter_ids_and_scores() {{ chemfp.search.SearchResults.iter_ids_and_scores|docstring}} iter_indices ------------ .. py:method:: SearchResults.iter_indices() {{ chemfp.search.SearchResults.iter_indices|docstring}} iter_indices_and_scores ----------------------- .. py:method:: SearchResults.iter_indices_and_scores() {{ chemfp.search.SearchResults.iter_indices_and_scores|docstring}} iter_scores ----------- .. py:method:: SearchResults.iter_scores() {{ chemfp.search.SearchResults.iter_scores|docstring}} iter_hits --------- REMOVED: Renamed to iter_ids_and_scores for 1.1. .. _chemfp_search_searchresults_reorder_all: reorder_all ----------- .. py:method:: SearchResults.reorder_all() {{ chemfp.search.SearchResults.reorder_all|docstring}} .. _SearchResult: SearchResult ============ .. py:class:: SearchResult (... do not call directly ...) {{ chemfp.search.SearchResult|docstring}} len(result) ------------ {{ chemfp.search.SearchResult.__len__|docstring}} iter(result) ------------ {{ chemfp.search.SearchResult.__iter__|docstring}} clear ----- .. py:method:: SearchResult.clear() {{ chemfp.search.SearchResult.clear|docstring}} .. _SearchResult_count: count ----- .. py:method:: SearchResult.count(min_score=None, max_score=None, interval="[]") {{ chemfp.search.SearchResult.count|docstring}} .. _SearchResult.cumulative_score: cumulative_score ---------------- .. py:method:: SearchResult.cumulative_score(min_score=None, max_score=None, interval="[]") {{ chemfp.search.SearchResult.cumulative_score|docstring}} .. _get_ids: get_ids ------- .. py:method:: SearchResult.get_ids() {{ chemfp.search.SearchResult.get_ids|docstring}} get_ids_and_scores ------------------ .. py:method:: SearchResult.get_ids_and_scores() {{ chemfp.search.SearchResult.get_ids_and_scores|docstring}} .. _get_indices: get_indices ----------- .. 
py:method:: SearchResult.get_indices() {{ chemfp.search.SearchResult.get_indices|docstring}} .. _get_indices_and_scores: get_indices_and_scores ---------------------- .. py:method:: SearchResult.get_indices_and_scores() {{ chemfp.search.SearchResult.get_indices_and_scores|docstring}} .. _get_scores: get_scores ---------- .. py:method:: SearchResult.get_scores() {{ chemfp.search.SearchResult.get_scores|docstring}} .. _chemfp_search_searchresult_reorder: reorder ------- .. py:method:: SearchResult.reorder(ordering="decreasing-score") {{ chemfp.search.SearchResult.reorder|docstring}} .. _chemfp.bitops: ==================== chemfp.bitops module ==================== .. py:module:: chemfp.bitops The following functions are in the chemfp.bitops module. They provide low-level bit operations on byte and hex fingerprints. byte_popcount ============= .. py:function:: byte_popcount() {{ chemfp.bitops.byte_popcount|docstring}} byte_intersect_popcount ======================= .. py:function:: byte_intersect_popcount() {{ chemfp.bitops.byte_intersect_popcount|docstring}} byte_tanimoto ============= .. py:function:: byte_tanimoto() {{ chemfp.bitops.byte_tanimoto|docstring}} byte_contains ============= .. py:function:: byte_contains() {{ chemfp.bitops.byte_contains|docstring}} hex_isvalid =========== .. py:function:: hex_isvalid() {{ chemfp.bitops.hex_isvalid|docstring}} hex_popcount ============ .. py:function:: hex_popcount() {{ chemfp.bitops.hex_popcount|docstring}} hex_intersect_popcount ====================== .. py:function:: hex_intersect_popcount() {{ chemfp.bitops.hex_intersect_popcount|docstring}} hex_tanimoto ============ .. py:function:: hex_tanimoto() {{ chemfp.bitops.hex_tanimoto|docstring}} hex_contains ============ .. 
py:function:: hex_contains() {{ chemfp.bitops.hex_contains|docstring}}
chemfp-1.1p1/docs/apply_template.py0000644000077000000240000000222112104074773017601 0ustar dalkestaff00000000000000
import sys
import os
from os.path import join, exists, getmtime
import inspect

import chemfp
import chemfp.arena
import chemfp.bitops

# Usage: apply_template.py NAME   (reads NAME.txt, writes NAME.rst)
name = sys.argv[1]
template_name = name + ".txt"
rst_name = name + ".rst"

from jinja2 import Template, Environment, BaseLoader, TemplateNotFound

class LocalDirLoader(BaseLoader):
    # Load templates from the current directory
    def __init__(self):
        self.path = "."

    def get_source(self, environment, template):
        path = join(self.path, template)
        if not exists(path):
            raise TemplateNotFound(template)
        mtime = getmtime(path)
        with open(path) as f:
            source = f.read().decode('utf-8')
        return source, path, lambda: mtime == getmtime(path)

def docstring(f):
    # Jinja2 filter: return the indented docstring of f, or fail if it has none
    doc = inspect.getdoc(f)
    if doc is None:
        raise AssertionError("Missing docstring for " + f.__name__)
    doc = doc.rstrip("\n")
    doc = " " + doc.replace("\n", "\n ")
    return doc

env = Environment(loader = LocalDirLoader())
env.filters['docstring'] = docstring
template = env.get_template(template_name)

page = template.render({"chemfp": chemfp})
with open(rst_name, "w") as f:
    f.write(page)
chemfp-1.1p1/docs/conf.py0000644000077000000240000001554312104063132015504 0ustar dalkestaff00000000000000
# -*- coding: utf-8 -*-
#
# chemfp documentation build configuration file, created by
# sphinx-quickstart on Tue Sep 13 23:13:43 2011.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

import sys, os

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.')) # -- General configuration ----------------------------------------------------- # If your documentation needs a minimal Sphinx version, state it here. #needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = ['sphinx.ext.autodoc', 'sphinx.ext.doctest'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8-sig' # The master toctree document. master_doc = 'index' # General information about the project. project = u'chemfp' copyright = u'2010-2013, Andrew Dalke' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. version = '1.1' # The full version, including alpha/beta/rc tags. release = '1.1' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. #language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. #today_fmt = '%B %d, %Y' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = ['_build'] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). 
#add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'sphinx' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. html_theme = 'agogo' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. #html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [] # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". #html_title = None # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. #html_use_smartypants = True # Custom sidebar templates, maps document names to template names. 
#html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_domain_indices = True # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. #html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. #html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = None # Output file base name for HTML help builder. htmlhelp_basename = 'chemfpdoc' # -- Options for LaTeX output -------------------------------------------------- # The paper size ('letter' or 'a4'). #latex_paper_size = 'letter' # The font size ('10pt', '11pt' or '12pt'). #latex_font_size = '10pt' # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]). latex_documents = [ ('index', 'chemfp.tex', u'chemfp Documentation', u'Andrew Dalke', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # If true, show page references after internal links. #latex_show_pagerefs = False # If true, show URL addresses after external links. #latex_show_urls = False # Additional stuff for the LaTeX preamble. 
#latex_preamble = '' # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. #latex_domain_indices = True # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', 'chemfp', u'chemfp Documentation', [u'Andrew Dalke'], 1) ] chemfp-1.1p1/docs/index.rst0000644000077000000240000001762512104066521016056 0ustar dalkestaff00000000000000.. _intro: ======================== chemfp 1.1 documentation ======================== `chemfp `_ is a set of tools for working with cheminformatics fingerprints in the FPS format. Most people will use the command-line programs to generate and search fingerprint files. :ref:`ob2fps `, :ref:`oe2fps `, and :ref:`rdkit2fps ` use respectively the `Open Babel `_, `OpenEye `_, and `RDKit `_ chemistry toolkits to convert structure files into fingerprint files. :ref:`sdf2fps ` extracts fingerprints encoded in SD tags to make the fingerprint file. :ref:`simsearch ` finds targets in a fingerprint file which are sufficiently similar to the queries. The programs are built using the :ref:`chemfp Python library API `, which in turn uses a C extension for the performance critical sections. The parts of the library API documented here are meant for public use, along with some examples. Remember: chemfp cannot generate fingerprints from a structure file without a third-party chemistry toolkit. Chemfp is regularly tested on a Mac using multiple versions of OEChem/OEGraphSim, Open Babel, and RDKit as well as Python 2.5, 2.6, and 2.7. .. toctree:: :maxdepth: 2 installing using-tools tool-help using-api api License and advertisement ========================= This program was developed by Andrew Dalke of Andrew Dalke Scientific, AB. It is distributed free of charge under the "MIT" license, shown below. 
Further chemfp development depends on funding from people like
you. Asking for voluntary contributions almost never works. Instead,
starting with chemfp-1.1, the source code is distributed under an
incentive program. You can pay for the commercial distribution, or use
the no-cost version.

If you pay for the commercial distribution then you will get the most
recent version of chemfp, free upgrades for one year, support, and a
discount on renewing participation in the incentive program.

If you use the no-cost distribution then you will get the 1.1 version
of chemfp, limited support, and minor bug fixes and improvements.

The current plan is that older versions of the commercial distribution
will be released under the no-cost program. However, the no-cost
version will always be at least one, and more likely two to three
years behind the version available to those who fund chemfp
development.

If you have questions about or wish to purchase the commercial
distribution, send an email to sales@dalkescientific.com .

.. highlight:: none

::

  Copyright (c) 2010-2012 Andrew Dalke Scientific, AB (Gothenburg, Sweden)

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish,
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:

  The above copyright notice and this permission notice shall be
  included in all copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
  IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
  CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Copyright to portions of the code is held by other people or
organizations, and may be under a different license. See the specific
code for details. These are:

- OpenMP, cpuid, POPCNT, and Lauradoux implementations by Kim
  Walisch, under the MIT license
- SSSE3.2 popcount implementation by Stanford University (written by
  Imran S. Haque) under the BSD license
- heapq by the Python Software Foundation under the Python license
- chemfp/progressbar/ by Nilton Volpato under the LGPL 2.1 and/or BSD
  license
- chemfp/futures/ by Brian Quinlan under the Python license
- chemfp/argparse.py by Steven J. Bethard under the Apache License 2.0


Future
======

The chemfp code base is solid and in use at several companies. It has
great support for fingerprint generation and similarity search and is
very fast, but there's plenty left to do in the future.

The most likely near-term chemfp improvements are to fix the
limitations that prevent chemfp from searching arenas which are larger
than 2GB and add the FPB binary file format to improve the fingerprint
loading times for large data sets.

The fingerprint type internals are confusing. I have not documented
them as part of the public API because they will be rewritten. Once
done, you will have access to toolkit-specific functions like: list
the supported fingerprint types, create a fingerprint when the
structure is a string, and test if two fingerprint types use the same
toolkit.

I do not have a Microsoft Windows computer. I looked into how I might
provide an installer for that OS. It's a lot of work, if only because
I need to support Python 2.5, 2.6, and 2.7 and learn how installers
work on Windows.
I've decided to put it off until a member of the incentive program
specifically wants it.

The goals beyond that will depend in part on user feedback. The
following are possible ideas, to help inspire you. `Let me know `_ if
you need something like one of these.

The threshold and k-nearest arena search results store hits using
compressed sparse rows. These work well for sparse results, but when
you want the entire similarity matrix (i.e., with a minimum threshold
of 0.0) of a large arena, the time and space needed to maintain the
sparse data structure become noticeable. It's likely in that case
that you want to store the scores in a 2D NumPy matrix.

I'm really interested in using chemfp to handle different sorts of
clustering. Let me know if there are things I can add to the API which
would help you do that.

The Tanimoto is the most common similarity method in cheminformatics,
but not the only one. I could add support for Tversky and other
methods. Some of these can also benefit from chemfp's sublinear search
method.

I think that an awk-like, or LINQ-like command-line selection tool
would be useful. It could be used to select fingerprint records by
position, id, or popcount, or just pick records at random. It's
tempting to think of a full-blown language, but that's what Python is
for.

There are several internal APIs, like the SDF reader code, which
should be refined and made part of the public API. This includes
validating the reader against a larger number of real-world SD files.

If you are not a Python programmer then you might prefer that the core
search routines be made accessible through a C API. That's possible,
in that the software was designed with that in mind, but it needs more
development and testing.

ChemFP 1.1 supports OpenMP. That's great for shared-memory
machines. Are you interested in supporting a distributed computing
version?

There are any number of higher-level tools which can be built on the
chemfp components.
For example, what about a WSGI component which implements a web-based
search API for your local network?

There's a paper on using locality-sensitive hashing to find highly
similar fingerprints. Are there cases where it's more useful than
chemfp?

I will support Python 3, but so far no one has asked for it. While the
toolkit interfaces will have to wait until the respective toolkits
support Python 3, the similarity code does not depend on any toolkit
and could be ported on its own.


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

chemfp-1.1p1/docs/installing.rst0000644000077000000240000000307712104064754017115 0ustar dalkestaff00000000000000
Installing
==========

The no-cost version of chemfp is available from
`PyPI <http://pypi.python.org/pypi/chemfp/>`_ so if you use the
third-party tools
`pip <http://www.pip-installer.org/en/latest/index.html>`_ or
`easy_install <http://peak.telecommunity.com/DevCenter/EasyInstall>`_
then you can install it with one of::

    pip install chemfp
    easy_install chemfp

The chemfp tools depend on a working Python installation. You can
download Python 2.7 from http://www.python.org/download/. Note that
older versions of OEChem only support up to Python 2.6. At some point
chemfp will support Python 3.3 or later.

The core chemfp functionality does not depend on a third-party library
but you will need a chemistry toolkit in order to generate new
fingerprints from structure files. chemfp supports the free Open Babel
and RDKit toolkits and the proprietary OEChem toolkit. Make sure you
install the Python libraries for the toolkit(s) you select.

If you have a source version of chemfp then you will need a C compiler
in order to compile it. This uses Python's standard "setup.py". Read
http://docs.python.org/install/index.html for details of how to use
it.
The short version is that on Unix systems using sudo (that is, Mac OS X and most Linux-based OSes) you can do:: sudo python setup.py install while for Windows you can do:: python setup.py install Bear in mind Python 2.7 for Windows was built with Visual C++ 2008. The setup.py step does not work with Visual C++ 2010 or later unless you patch your local version of Python. The bug report is at http://bugs.python.org/issue13210 . chemfp-1.1p1/docs/Makefile0000644000077000000240000001115412104040164015637 0ustar dalkestaff00000000000000# Makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = SPHINXBUILD = sphinx-build PAPER = BUILDDIR = _build PYTHON = python2.7 # Internal variables. PAPEROPT_a4 = -D latex_paper_size=a4 PAPEROPT_letter = -D latex_paper_size=letter ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest help: @echo "Please use \`make ' where is one of" @echo " html to make standalone HTML files" @echo " dirhtml to make HTML files named index.html in directories" @echo " singlehtml to make a single large HTML file" @echo " pickle to make pickle files" @echo " json to make JSON files" @echo " htmlhelp to make HTML files and a HTML help project" @echo " qthelp to make HTML files and a qthelp project" @echo " devhelp to make HTML files and a Devhelp project" @echo " epub to make an epub" @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" @echo " latexpdf to make LaTeX files and run them through pdflatex" @echo " text to make text files" @echo " man to make manual pages" @echo " changes to make an overview of all changed/added/deprecated items" @echo " linkcheck to check all external links for integrity" @echo " doctest to run all doctests embedded in the documentation (if enabled)" clean: -rm -rf $(BUILDDIR)/* html: api.rst $(SPHINXBUILD) 
-b html $(ALLSPHINXOPTS) $(BUILDDIR)/html @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." dirhtml: $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." singlehtml: $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml @echo @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." pickle: $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle @echo @echo "Build finished; now you can process the pickle files." json: $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json @echo @echo "Build finished; now you can process the JSON files." htmlhelp: $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp @echo @echo "Build finished; now you can run HTML Help Workshop with the" \ ".hhp project file in $(BUILDDIR)/htmlhelp." qthelp: $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp @echo @echo "Build finished; now you can run "qcollectiongenerator" with the" \ ".qhcp project file in $(BUILDDIR)/qthelp, like this:" @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/chemfp.qhcp" @echo "To view the help file:" @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/chemfp.qhc" devhelp: $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp @echo @echo "Build finished." @echo "To view the help file:" @echo "# mkdir -p $$HOME/.local/share/devhelp/chemfp" @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/chemfp" @echo "# devhelp" epub: $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub @echo @echo "Build finished. The epub file is in $(BUILDDIR)/epub." latex: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." @echo "Run \`make' in that directory to run these through (pdf)latex" \ "(use \`make latexpdf' here to do that automatically)." 
latexpdf: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo "Running LaTeX files through pdflatex..." make -C $(BUILDDIR)/latex all-pdf @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." text: $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text @echo @echo "Build finished. The text files are in $(BUILDDIR)/text." man: $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man @echo @echo "Build finished. The manual pages are in $(BUILDDIR)/man." changes: $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes @echo @echo "The overview file is in $(BUILDDIR)/changes." linkcheck: $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck @echo @echo "Link check complete; look for any errors in the above output " \ "or in $(BUILDDIR)/linkcheck/output.txt." doctest: $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest @echo "Testing of doctests in the sources finished, look at the " \ "results in $(BUILDDIR)/doctest/output.txt." api.rst: api.txt ../chemfp/__init__.py ../chemfp/arena.py $(PYTHON) apply_template.py api chemfp-1.1p1/docs/tool-help.rst0000644000077000000240000003423712104066423016651 0ustar dalkestaff00000000000000 =============================== Help for the command-line tools =============================== .. 
_ob2fps: ob2fps command-line options =========================== The following comes from ``ob2fps --help``:: usage: ob2fps.py [-h] [--FP2 | --FP3 | --FP4 | --MACCS | --substruct | --rdmaccs] [--id-tag NAME] [--in FORMAT] [-o FILENAME] [--errors {strict,report,ignore}] [filenames [filenames ...]] Generate FPS fingerprints from a structure file using OpenBabel positional arguments: filenames input structure files (default is stdin) optional arguments: -h, --help show this help message and exit --FP2 --FP3 --FP4 --MACCS --substruct generate ChemFP substructure fingerprints --rdmaccs generate 166 bit RDKit/MACCS fingerprints --id-tag NAME tag name containing the record id (SD files only) --in FORMAT input structure format (default autodetects from the filename extension) -o FILENAME, --output FILENAME save the fingerprints to FILENAME (default=stdout) --errors {strict,report,ignore} how should structure parse errors be handled? (default=strict) .. _oe2fps: oe2fps command-line options =========================== The following comes from ``oe2fps --help``:: usage: oe2fps [-h] [--path] [--numbits INT] [--minbonds INT] [--maxbonds INT] [--atype ATYPE] [--btype BTYPE] [--maccs166] [--substruct] [--rdmaccs] [--aromaticity NAME] [--id-tag NAME] [--in FORMAT] [-o FILENAME] [--errors {strict,report,ignore}] [filenames [filenames ...]] Generate FPS fingerprints from a structure file using OEChem positional arguments: filenames input structure files (default is stdin) optional arguments: -h, --help show this help message and exit --aromaticity NAME use the named aromaticity model --id-tag NAME tag name containing the record id (SD files only) --in FORMAT input structure format (default guesses from filename) -o FILENAME, --output FILENAME save the fingerprints to FILENAME (default=stdout) --errors {strict,report,ignore} how should structure parse errors be handled? 
(default=strict) path fingerprints: --path generate path fingerprints (default) --numbits INT number of bits in the path fingerprint (default=4096) --minbonds INT minimum number of bonds in the path fingerprint (default=0) --maxbonds INT maximum number of bonds in the path fingerprint (default=5) --atype ATYPE atom type flags, described below (default=Default) --btype BTYPE bond type flags, described below (default=Default) 166 bit MACCS substructure keys: --maccs166 generate MACCS fingerprints 881 bit ChemFP substructure keys: --substruct generate ChemFP substructure fingerprints ChemFP version of the 166 bit RDKit/MACCS keys: --rdmaccs generate 166 bit RDKit/MACCS fingerprints ATYPE is one or more of the following, separated by the '|' character. Aromaticity AtomicNumber Chiral EqAromatic EqHalogen FormalCharge HvyDegree Hybridization InRing The terms 'Default' and 'DefaultAtom' are expanded to OpenEye's suggested default of AtomicNumber|Aromaticity|Chiral|FormalCharge|HvyDegree|Hybridization|EqHalogen. Examples: --atype Default --atype AtomicNumber|HvyDegree (Note that most atom type names change in OEGraphSim 2.0.0.) BTYPE is one or more of the following, separated by the '|' character BondOrder Chiral InRing The terms 'Default' and 'DefaultBond' are expanded to OpenEye's suggested default of BondOrder|Chiral. Examples: --btype Default --btype BondOrder (Note that "BondOrder" changes to "Order" in OEGraphSim 2.0.0.) For simpler Unix command-line compatibility, a comma may be used instead of a '|' to separate different fields. Example: --atype AtomicNumber,HvyDegree OEChem guesses the input structure format based on the filename extension and assumes SMILES for structures read from stdin. 
Use "--in FORMAT" to select an alternative, where FORMAT is one of: File Type Valid FORMATs (use gz if compressed) --------- ------------------------------------ SMILES smi, ism, can, smi.gz, ism.gz, can.gz SDF sdf, mol, sdf.gz, mol.gz SKC skc, skc.gz CDK cdk, cdk.gz MOL2 mol2, mol2.gz PDB pdb, ent, pdb.gz, ent.gz MacroModel mmod, mmod.gz OEBinary v2 oeb, oeb.gz old OEBinary bin .. _rdkit2fps: rdkit2fps command-line options ============================== The following comes from ``rdkit2fps --help``:: usage: rdkit2fps [-h] [--fpSize INT] [--RDK] [--minPath INT] [--maxPath INT] [--nBitsPerHash INT] [--useHs 0|1] [--morgan] [--radius INT] [--useFeatures 0|1] [--useChirality 0|1] [--useBondTypes 0|1] [--torsions] [--targetSize INT] [--pairs] [--minLength INT] [--maxLength INT] [--maccs166] [--substruct] [--rdmaccs] [--id-tag NAME] [--in FORMAT] [-o FILENAME] [--errors {strict,report,ignore}] [filenames [filenames ...]] Generate FPS fingerprints from a structure file using RDKit positional arguments: filenames input structure files (default is stdin) optional arguments: -h, --help show this help message and exit --fpSize INT number of bits in the fingerprint (applies to RDK, Morgan, topological torsion, and atom pair fingerprints (default=2048) --id-tag NAME tag name containing the record id (SD files only) --in FORMAT input structure format (default guesses from filename) -o FILENAME, --output FILENAME save the fingerprints to FILENAME (default=stdout) --errors {strict,report,ignore} how should structure parse errors be handled? 
(default=strict) RDKit topological fingerprints: --RDK generate RDK fingerprints (default) --minPath INT minimum number of bonds to include in the subgraph (default=1) --maxPath INT maximum number of bonds to include in the subgraph (default=7) --nBitsPerHash INT number of bits to set per path (default=4) --useHs 0|1 include information about the number of hydrogens on each atom (default=1) RDKit Morgan fingerprints: --morgan generate Morgan fingerprints --radius INT radius for the Morgan algorithm (default=2) --useFeatures 0|1 use chemical-feature invariants (default=0) --useChirality 0|1 include chirality information (default=0) --useBondTypes 0|1 include bond type information (default=1) RDKit Topological Torsion fingerprints: --torsions generate Topological Torsion fingerprints --targetSize INT number of bits in the fingerprint (default=4) RDKit Atom Pair fingerprints: --pairs generate Atom Pair fingerprints --minLength INT minimum bond count for a pair (default=1) --maxLength INT maximum bond count for a pair (default=30) 166 bit MACCS substructure keys: --maccs166 generate MACCS fingerprints 881 bit substructure keys: --substruct generate ChemFP substructure fingerprints ChemFP version of the 166 bit RDKit/MACCS keys: --rdmaccs generate 166 bit RDKit/MACCS fingerprints This program guesses the input structure format based on the filename extension. If the data comes from stdin, or the extension name us unknown, then use "--in" to change the default input format. The supported format extensions are: File Type Valid FORMATs (use gz if compressed) --------- ------------------------------------ SMILES smi, ism, can, smi.gz, ism.gz, can.gz SDF sdf, mol, sd, mdl, sdf.gz, mol.gz, sd.gz, mdl.gz .. 
_sdf2fps: sdf2fps command-line options ============================ The following comes from ``sdf2fps --help``:: usage: sdf2fps [-h] [--id-tag TAG] [--fp-tag TAG] [--num-bits INT] [--errors {strict,report,ignore}] [-o FILENAME] [--software TEXT] [--type TEXT] [--decompress METHOD] [--binary] [--binary-msb] [--hex] [--hex-lsb] [--hex-msb] [--base64] [--cactvs] [--daylight] [--decoder DECODER] [--pubchem] [filenames [filenames ...]] Extract a fingerprint tag from an SD file and generate FPS fingerprints positional arguments: filenames input SD files (default is stdin) optional arguments: -h, --help show this help message and exit --id-tag TAG get the record id from TAG instead of the first line of the record --fp-tag TAG get the fingerprint from tag TAG (required) --num-bits INT use the first INT bits of the input. Use only when the last 1-7 bits of the last byte are not part of the fingerprint. Unexpected errors will occur if these bits are not all zero. --errors {strict,report,ignore} how should structure parse errors be handled? (default=strict) -o FILENAME, --output FILENAME save the fingerprints to FILENAME (default=stdout) --software TEXT use TEXT as the software description --type TEXT use TEXT as the fingerprint type description --decompress METHOD use METHOD to decompress the input (default='auto', 'none', 'gzip', 'bzip2') Fingerprint decoding options: --binary Encoded with the characters '0' and '1'. Bit #0 comes first. Example: 00100000 encodes the value 4 --binary-msb Encoded with the characters '0' and '1'. Bit #0 comes last. Example: 00000100 encodes the value 4 --hex Hex encoded. Bit #0 is the first bit (1<<0) of the first byte. Example: 01f2 encodes the value \x01\xf2 = 498 --hex-lsb Hex encoded. Bit #0 is the eigth bit (1<<7) of the first byte. Example: 804f encodes the value \x01\xf2 = 498 --hex-msb Hex encoded. Bit #0 is the first bit (1<<0) of the last byte. Example: f201 encodes the value \x01\xf2 = 498 --base64 Base-64 encoded. 
Bit #0 is first bit (1<<0) of first byte. Example: AfI= encodes value \x01\xf2 = 498 --cactvs CACTVS encoding, based on base64 and includes a version and bit length --daylight Daylight encoding, which is is base64 variant --decoder DECODER import and use the DECODER function to decode the fingerprint shortcuts: --pubchem decode CACTVS substructure keys used in PubChem. Same as --software=CACTVS/unknown --type 'CACTVS- E_SCREEN/1.0 extended=2' --fp-tag=PUBCHEM_CACTVS_SUBSKEYS --cactvs .. _simsearch: simsearch command-line options ============================== The following comes from ``simsearch --help``:: usage: simsearch [-h] [-k K_NEAREST] [-t THRESHOLD] [-q QUERIES] [--NxN] [--hex-query HEX_QUERY] [--query-id QUERY_ID] [--in FORMAT] [-o FILENAME] [-c] [-b BATCH_SIZE] [--scan] [--memory] [--times] target_filename Search an FPS file for similar fingerprints positional arguments: target_filename target filename optional arguments: -h, --help show this help message and exit -k K_NEAREST, --k-nearest K_NEAREST select the k nearest neighbors (use 'all' for all neighbors) -t THRESHOLD, --threshold THRESHOLD minimum similarity score threshold -q QUERIES, --queries QUERIES filename containing the query fingerprints --NxN use the targets as the queries, and exclude the self- similarity term --hex-query HEX_QUERY query in hex --query-id QUERY_ID id for the hex query --in FORMAT input query format (default uses the file extension, else 'fps') -o FILENAME, --output FILENAME output filename (default is stdout) -c, --count report counts -b BATCH_SIZE, --batch-size BATCH_SIZE batch size --scan scan the file to find matches (low memory overhead) --memory build and search an in-memory data structure (faster for multiple queries) --times report load and execution times to stderr chemfp-1.1p1/docs/using-api.rst0000644000077000000240000014516312104111070016630 0ustar dalkestaff00000000000000.. 
highlight:: python ========================= The chemfp Python library ========================= The chemfp command-line programs use a Python library called chemfp. Portions of the API are in flux and subject to change. The stable portions of the API which are open for general use are documented in :ref:`chemfp-api`. The API includes: - low-level Tanimoto and popcount operations - Tanimoto search algorithms based on threshold and/or k-nearest neighbors - a cross-toolkit interface for reading fingerprints from a structure file The following chapters give examples of how to use the API. Byte and hex fingerprints ========================= In this section you'll learn how chemfp stores fingerprints and some of the low-level bit operations on those fingerprints. chemfp stores fingerprints as byte strings. Here are two 8 bit fingerprints:: >>> fp1 = "A" >>> fp2 = "B" The :ref:`chemfp.bitops ` module contains functions which work on byte fingerprints. Here's the Tanimoto of those two fingerprints:: >>> from chemfp import bitops >>> bitops.byte_tanimoto(fp1, fp2) 0.33333333333333331 To understand why, you have to know that ASCII character "A" has the value 65, and "B" has the value 66. The bit representation is:: "A" = 01000001 and "B" = 01000010 so their intersection has 1 bit and the union has 3, giving a Tanimoto of 1/3 or 0.33333333333333331 when represented as a 64 bit floating point number on the computer. You can compute the Tanimoto between any two byte strings with the same length, as in:: >>> bitops.byte_tanimoto("apples&", "oranges") 0.58333333333333337 You'll get a chemfp exception if they have different lengths. .. highlight:: none Most fingerprints are not as easy to read as the English ones I showed above. They tend to look more like:: P1@\x84K\x1aN\x00\n\x01\xa6\x10\x98\\\x10\x11 which is hard to read. I usually show hex-encoded fingerprints. 
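As an aside, the bit arithmetic described in this section can be reproduced without chemfp. The following is only a sketch, written for Python 3 (where byte strings need a ``b`` prefix, unlike the Python 2 sessions shown here). The names `popcount` and `tanimoto` are hypothetical helpers, not part of the chemfp API, and they are far slower than the C code behind `bitops.byte_tanimoto`. The last lines hex-encode the same 16-byte fingerprint shown just before this sketch.

```python
import binascii


def popcount(fp):
    """Count the 1 bits in a byte string (hypothetical helper, not chemfp API)."""
    return sum(bin(byte).count("1") for byte in fp)


def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte strings (illustrative only)."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints, and bits set in either one
    intersection = sum(bin(a & b).count("1") for a, b in zip(fp1, fp2))
    union = sum(bin(a | b).count("1") for a, b in zip(fp1, fp2))
    # Assumption of this sketch: two all-zero fingerprints score 0.0
    return intersection / union if union else 0.0


# "A" is 01000001 and "B" is 01000010: 1 bit in common, 3 bits in the union
score = tanimoto(b"A", b"B")

# Hex-encode the 16-byte fingerprint shown before this sketch
fp = b"P1@\x84K\x1aN\x00\n\x01\xa6\x10\x98\\\x10\x11"
hex_fp = binascii.hexlify(fp).decode("ascii")
```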
The above fingerprint in hex is::

    503140844b1a4e000a01a610985c1011

which is simpler to read, though you still need to know your hex
digits.

.. highlight:: python

The bitops module includes other low-level functions which work on
byte fingerprints, as well as corresponding functions which work on
hex fingerprints. (Hex-encoded fingerprints are decidedly second-class
citizens in chemfp, but they are citizens.)


Fingerprint collections and metadata
====================================

In this section you'll learn the basic operations on a fingerprint
collection and the fingerprint metadata.

A fingerprint record is the fingerprint plus an identifier. In chemfp,
a fingerprint collection is an object which contains fingerprint
records and which follows the common API providing access to those
records.

That's rather abstract, so let's work with a few real examples. You'll
need to create a copy of the "pubchem_targets.fps" file generated in
:ref:`pubchem_fingerprints` in order to follow along.

Here's how to open an FPS file::

    >>> import chemfp
    >>> reader = chemfp.open("pubchem_targets.fps")

Every fingerprint collection has a metadata attribute with details
about the fingerprints. It comes from the header of the FPS file. You
can view the metadata in Python repr format::

    >>> reader.metadata
    Metadata(num_bits=881, num_bytes=111, type='CACTVS-E_SCREEN/1.0 extend
    ed=2', aromaticity=None, sources=['Compound_014550001_014575000.sdf.gz
    '], software=u'CACTVS/unknown', date='2011-09-14T12:10:34')

but I think it's easier to view it in string format, which matches the
format of the FPS header::

    >>> print reader.metadata
    #num_bits=881
    #type=CACTVS-E_SCREEN/1.0 extended=2
    #software=CACTVS/unknown
    #source=Compound_014550001_014575000.sdf.gz
    #date=2011-09-14T12:10:34

All fingerprint collections support iteration. Each step of the
iteration returns the fingerprint identifier and its fingerprint.
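To make the iteration contract concrete, here is a rough sketch of a reader that yields (id, fingerprint) pairs from FPS-formatted text. It is written for Python 3 and is not chemfp's implementation. The function `iter_fps_records` and the two sample records are made up for illustration, and the sketch assumes the usual FPS layout of `#`-prefixed header lines followed by lines holding a hex fingerprint, a tab, and the identifier.

```python
import binascii


def iter_fps_records(lines):
    """Yield (id, fingerprint_bytes) pairs from FPS-style lines.

    Hypothetical helper for illustration only; chemfp.open() also
    handles metadata, error reporting, and compressed input.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines such as "#num_bits=16"
        # Assumed layout: hex fingerprint, tab, identifier, optional extra fields
        hex_fp, record_id = line.split("\t")[:2]
        yield record_id, binascii.unhexlify(hex_fp)


# Made-up sample data: two 16-bit fingerprints with invented ids
sample = [
    "#FPS1\n",
    "#num_bits=16\n",
    "034e\trecord1\n",
    "010e\trecord2\n",
]
records = list(iter_fps_records(sample))
```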
Since I know the 6th record has the id 14550045, I can write a simple loop which stops with that record:: >>> for (id, fp) in reader: ... print id, "starts with", fp.encode("hex")[:20] ... if id == "14550045": ... break ... 14550001 starts with 034e1c00020000000000 14550002 starts with 034e0c00020000000000 14550003 starts with 034e0400020000000000 14550005 starts with 010e1c00000600000000 14550010 starts with 034e1c40000000000000 14550045 starts with 071e8c03000000000000 Fingerprint collections also support iterating via arenas, and several support Tanimoto search functions. FingerprintArena ================ In this section you'll learn about the FingerprintArena fingerprint collection and how to iterate through arenas in a collection. The FPSReader reads through or searches a fingerprint file once. If you want to read the file again you have to reopen it. Reading from disk is slow, and the FPS format is designed for ease-of-use and not performance. If you want to do many queries then it's best to store everything in memory. The :ref:`FingerprintArena ` is a fingerprint collection which does that. Here's how to load fingerprints into an arena:: >>> import chemfp >>> arena = chemfp.load_fingerprints("pubchem_targets.fps") >>> print arena.metadata #num_bits=881 #type=CACTVS-E_SCREEN/1.0 extended=2 #software=CACTVS/unknown #source=Compound_014550001_014575000.sdf.gz #date=2011-09-14T12:10:34 This implements the fingerprint collection API, so you can do things like iterate over an arena and get the id/fingerprint pairs.:: >>> from chemfp import bitops >>> for id, fp in arena: ... print id, "with popcount", bitops.byte_popcount(fp) ... if id == "14574718": ... break ... 
    14550474 with popcount 2
    14574635 with popcount 2
    14550409 with popcount 4
    14550416 with popcount 6
    14574551 with popcount 7
    14550509 with popcount 8
    14550423 with popcount 10
    14550427 with popcount 10
    14574637 with popcount 10
    14574890 with popcount 11
    14574718 with popcount 12

If you look closely you'll notice that the fingerprint record order
has changed from the previous section, and that the population counts
are suspiciously non-decreasing. By default `load_fingerprints`
reorders the fingerprints into a data structure which is faster to
search, although you can disable that if you want the fingerprints to
be in the same order as the input.

The :ref:`FingerprintArena ` has new capabilities. You can ask it how
many fingerprints it contains, get the list of identifiers, and look
up a fingerprint record given an index, as in::

    >>> len(arena)
    3119
    >>> arena.ids[:5]
    ['14550474', '14574635', '14550409', '14550416', '14574551']
    >>> id, fp = arena[6]
    >>> id
    '14550423'
    >>> arena[-1][0]
    '14566760'
    >>> bitops.byte_popcount(arena[-1][1])
    231

An arena supports iterating through subarenas. This is like having a
long list and being able to iterate over sublists. Here's an example
of iterating over the arena to get subarenas of size 1000 (excepting
the last), and printing information about each subarena::

    >>> for subarena in arena.iter_arenas(1000):
    ...   print subarena.ids[0], len(subarena)
    ...
    14550474 1000
    14573373 1000
    14555885 1000
    14560068 119
    >>> arena[0][0]
    '14550474'
    >>> arena[1000][0]
    '14573373'

To help demonstrate what's going on, I showed the first id of each
subarena along with the main arena ids for records 0 and 1000, so you
can verify that they are the same.

Arenas are a core part of chemfp. Processing one fingerprint at a time
is slow, so the main search routines expect to iterate over query
arenas, rather than query fingerprints.

Thus, the FPSReaders -- and all chemfp fingerprint collections -- also
support the `iter_arenas` interface.
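Conceptually, `iter_arenas` batches a stream of records into fixed-size groups. The following Python 3 sketch shows that idea on a plain sequence. The function `iter_batches` is a hypothetical stand-in, not chemfp code; a real arena also carries metadata and a search-optimized layout, which this ignores.

```python
from itertools import islice


def iter_batches(records, batch_size):
    """Yield lists of up to batch_size items (stand-in for iter_arenas)."""
    iterator = iter(records)
    while True:
        # Pull the next batch_size items; the final batch may be shorter
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 3119 placeholder records, batched 1000 at a time
batch_sizes = [len(batch) for batch in iter_batches(range(3119), 1000)]
```

The resulting sizes match the subarena lengths printed in the `iter_arenas(1000)` session above: three full batches and a final batch of 119.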
Here's an example of reading the queries file 25 records at a time::

    >>> queries = chemfp.open("pubchem_queries.fps")
    >>> for arena in queries.iter_arenas(25):
    ...   print len(arena)
    ...
    25
    25
    25
    25
    25
    25
    25
    25
    24

Those add up to 224, which you can verify is the number of structures
in the original source file.

If you have a `FingerprintArena`_ then you can also use Python's slice
notation to make a subarena::

    >>> queries = chemfp.load_fingerprints("pubchem_queries.fps")
    >>> queries[10:15]
    >>> queries[10:15].ids
    ['27599116', '27599118', '27599120', '27583411', '27599082']
    >>> queries.ids[10:15]
    ['27599116', '27599118', '27599120', '27583411', '27599082']

The big restriction is that slices can only have a step size
of 1. Slices like `[10:20:2]` and `[::-1]` aren't supported. If you
want something like that then you'll need to make a new arena instead
of using a subarena slice.

In case you were wondering, yes, you can use `iter_arenas` or the
other FingerprintArena methods on a subarena::

    >>> queries[10:15][1:3].ids
    ['27599118', '27599120']
    >>> queries.ids[11:13]
    ['27599118', '27599120']


How to use query fingerprints to search for similar target fingerprints
=======================================================================

In this section you'll learn how to do a Tanimoto search using the
previously created PubChem fingerprint files for the queries and the
targets.

It's faster to search an arena, so I'll load the target fingerprints::

    >>> import chemfp
    >>> targets = chemfp.load_fingerprints("pubchem_targets.fps")
    >>> len(targets)
    3119

and open the queries as an FPSReader::

    >>> queries = chemfp.open("pubchem_queries.fps")

I'll use :ref:`threshold_tanimoto_search ` to find, for each query,
all hits which are at least 0.7 similar to the query::

    >>> queries = chemfp.open("pubchem_queries.fps")
    >>> for (query_id, hits) in chemfp.threshold_tanimoto_search(queries, targets, threshold=0.7):
    ...   print query_id, len(hits), hits[:2]
    ...
    27575433 0 []
    27575577 18 [('14570945', 0.74874371859296485), ('14570946', 0.73762376237623761)]
    27575602 3 [('14572463', 0.72560975609756095), ('14553070', 0.75935828877005351)]
    27575603 3 [('14572463', 0.72560975609756095), ('14553070', 0.75935828877005351)]
    27575880 9 [('14569876', 0.72307692307692306), ('14567856', 0.73076923076923073)]
    27575897 0 []
    27577227 1 [('14570135', 0.7142857142857143)]
    27577234 0 []
         # ... many lines omitted ...

I'm only showing the first two hits for the sake of space. It seems rather pointless, after all, to show all 18 hits of query id 27575577.

What you don't see is that the implementation uses the iter_arenas() interface on the queries so that it processes only a subarena at a time. There's a tradeoff between a large arena, which is faster because it doesn't often go back to Python code, and a small arena, which uses less memory and is more responsive. You can change the tradeoff using the `arena_size` parameter.

If all you care about is the count of the hits within a given threshold then use :ref:`chemfp.count_tanimoto_hits `::

    >>> queries = chemfp.open("pubchem_queries.fps")
    >>> for (query_id, count) in chemfp.count_tanimoto_hits(queries, targets, threshold=0.7):
    ...     print query_id, count
    ...
    27575433 0
    27575577 18
    27575602 3
    27575603 3
    27575880 9
    27575897 0
    27577227 1
    27577234 0
    27577237 1
    27577250 4
         # ... many lines omitted ...

Or, if you only want the k=2 nearest neighbors for each query within that same threshold of 0.7 then use :ref:`chemfp.knearest_tanimoto_search `::

    >>> queries = chemfp.open("pubchem_queries.fps")
    >>> for (query_id, hits) in chemfp.knearest_tanimoto_search(queries, targets, k=2, threshold=0.7):
    ...     print query_id, hits
    ...
    27575433 []
    27575577 [('14570945', 0.74874371859296485), ('14570951', 0.73853211009174313)]
    27575602 [('14553070', 0.75935828877005351), ('14572463', 0.72560975609756095)]
    27575603 [('14553070', 0.75935828877005351), ('14572463', 0.72560975609756095)]
    27575880 [('14569866', 0.77272727272727271), ('14567856', 0.73076923076923073)]
    27575897 []
    27577227 [('14570135', 0.7142857142857143)]
    27577234 []
    27577237 [('14569555', 0.73711340206185572)]
    27577250 [('14569555', 0.74742268041237114), ('14550456', 0.72131147540983609)]
         # ... many lines omitted ...


How to search an FPS file
=========================

In this section you'll learn how to search an FPS file directly, without loading it into a FingerprintArena.

The previous example loaded the fingerprints into a FingerprintArena. That's the fastest way to do multiple searches. Sometimes though you only want to do one or a couple of queries. It seems rather excessive to read the entire targets file into an in-memory data structure before doing the search when you could search while processing the file.

For that case, use an FPSReader as the target file. Here I'll get the first record from the queries file and use it to search the targets file::

    >>> query_arena = next(chemfp.open("pubchem_queries.fps").iter_arenas(1))

This line opens the file, iterates over its fingerprint records, and returns the first one as a single-record arena.

Note: the next() function was added after Python 2.5 so the above won't work for that version. Instead, use::

    >>> query_arena = chemfp.open("pubchem_queries.fps").iter_arenas(1).next()

which is the older form. Or you can use the equally bewildering::

    >>> for query_arena in chemfp.open("pubchem_queries.fps").iter_arenas(1):
    ...     break

Here are the k=5 closest hits against the targets file::

    >>> targets = chemfp.open("pubchem_targets.fps")
    >>> for query_id, hits in chemfp.knearest_tanimoto_search(query_arena, targets, k=5, threshold=0.0):
    ...     print "Hits for", query_id
    ...     for hit in hits:
    ...         print "", hit
    ...
Hits for 27575433 ('14568234', 0.69035532994923854) ('14550456', 0.64921465968586389) ('14572463', 0.64444444444444449) ('14566364', 0.63953488372093026) ('14573723', 0.63247863247863245) Remember that the FPSReader is based on reading an FPS file. Once you've done a search, the file is read, and you can't do another search. You'll need to reopen the file. Each search processes arena_size query fingerprints at a time. You will need to increase that value if you want to search more than that number of fingerprints with this method. The search performance tradeoff between a FPSReader search and loading the fingerprints into a FingerprintArena occurs with under 10 queries, so there should be little reason to worry about this. FingerprintArena searches returning indices instead of ids =========================================================== In this section you'll learn how to search a FingerprintArena and use hits based on integer indices rather than string ids. The previous sections used a high-level interface to the Tanimoto search code. Those are designed for the common case where you just want the query id and the hits, where each hit includes the target id. Working with strings is actually rather inefficient in both speed and memory. It's usually better to work with indices if you can, and in the next section I'll show how to make a distance matrix using this interface. NOTE: up until the final 1.1 release, this document said to use the FingerprintArena methods. This is no longer recommended. Use the :ref:`chemfp.search ` functions instead. Most of the search methods, except perhaps the single fingerprint methods, will issue a DeprecationWarning in the next release and updated in a release after that. The index-based search functions are in the `chemfp.search` module. They can be categorized into three groups: 1. 
Count the number of hits: * :ref:`count_tanimoto_hits_fp ` - search an arena using a single fingerprint * :ref:`count_tanimoto_hits_arena ` - search an arena using an arena * :ref:`count_tanimoto_hits_symmetric ` - search an arena using itself 2. Find all hits at or above a given threshold, sorted arbitrarily: * :ref:`threshold_tanimoto_search_fp ` - search an arena using a single fingerprint * :ref:`threshold_tanimoto_search_arena ` - search an arena using an arena * :ref:`threshold_tanimoto_search_symmetric ` - search an arena using itself 3. Find the k-nearest hits at or above a given threshold, sorted by decreasing similarity: * :ref:`knearest_tanimoto_search_fp ` - search an arena using a single fingerprint * :ref:`knearest_tanimoto_search_arena ` - search an arena using an arena * :ref:`knearest_tanimoto_search_symmetric ` - search an arena using itself The functions ending '_fp' take a query fingerprint and a target arena. The functions ending '_arena' take a query arena and a target arena. The functions ending '_symmetric' use the same arena as both the query and target. In the following example, I'll use the first 5 fingerprints of a data set to search the entire data set. To do this, I load the data set as an arena, extract the first 5 records as a sub-arena, and do the search. >>> import chemfp >>> from chemfp import search >>> targets = chemfp.load_fingerprints("pubchem_queries.fps") >>> queries = targets[:5] >>> results = search.threshold_tanimoto_search_arena (queries, targets, threshold=0.7) The threshold_tanimoto_search_arena search finds the target fingerprints which have a similarity score of at least 0.7 compared to the query. You can iterate over the results to get the list of hits for each of the queries. The order of the results is the same as the order of the records in the query.:: >>> for hits in results: ... print len(hits), hits.get_ids_and_scores()[:3] ... 
    2 [('27581954', 1.0), ('27581957', 1.0)]
    2 [('27581954', 1.0), ('27581957', 1.0)]
    3 [('27580389', 1.0), ('27580394', 0.88235294117647056), ('27581637', 0.75)]
    2 [('27584917', 1.0), ('27585106', 0.89915966386554624)]
    2 [('27584917', 0.89915966386554624), ('27585106', 1.0)]

This result is like what you saw earlier, except that it doesn't have the query id. You can get that from the arena's `ids` attribute, which contains the list of fingerprint identifiers.

    >>> for query_id, hits in zip(queries.ids, results):
    ...     print "Hits for", query_id
    ...     for hit in hits.get_ids_and_scores()[:3]:
    ...         print "", hit
    ...
    Hits for 27581954
     ('27581954', 1.0)
     ('27581957', 1.0)
    Hits for 27581957
     ('27581954', 1.0)
     ('27581957', 1.0)
         ...

What I really want to show is that you can get the same data using only the offset index for the target record instead of its id. The result from a Tanimoto search is a :ref:`SearchResults ` object, with the methods :ref:`get_indices_and_scores `, :ref:`get_ids `, :ref:`get_scores `, and more::

    >>> for hits in results:
    ...     print len(hits), hits.get_indices_and_scores()[:3]
    ...
    2 [(0, 1.0), (1, 1.0)]
    2 [(0, 1.0), (1, 1.0)]
    3 [(2, 1.0), (5, 0.88235294117647056), (20, 0.75)]
    2 [(3, 1.0), (4, 0.89915966386554624)]
    2 [(3, 0.89915966386554624), (4, 1.0)]
    >>>
    >>> targets.ids[0]
    '27581954'
    >>> targets.ids[1]
    '27581957'
    >>> targets.ids[5]
    '27580394'

I did a few id lookups given the target dataset to show you that the index corresponds to the identifiers from the previous code.

These examples iterated over each individual `SearchResult` to fetch the ids and scores, or indices and scores. Another possibility is to ask the `SearchResults` collection to iterate directly over the list of fields you want.

    >>> for row in results.iter_indices_and_scores():
    ...     print len(row), row[:3]
    ...
    2 [(0, 1.0), (1, 1.0)]
    2 [(0, 1.0), (1, 1.0)]
    3 [(2, 1.0), (5, 0.88235294117647056), (20, 0.75)]
    2 [(3, 1.0), (4, 0.89915966386554624)]
    2 [(3, 0.89915966386554624), (4, 1.0)]

This was added to get a bit more performance out of chemfp and because the API is sometimes cleaner one way and sometimes cleaner the other. Yes, I know that the Zen of Python recommends that "there should be one-- and preferably only one --obvious way to do it." Oh well.

NOTE: The API has changed slightly from 1.0 to 1.1. Previously the `SearchResults` had the methods `iter_hits` and iteration over the `SearchResult` returned a "hit." However, I couldn't remember if a hit used the identifier or the index. You must now be explicit and use `iter_ids*` or `iter_indices*` on the `SearchResults`, and use `get_ids*` or `get_indices*` on the `SearchResult`.


Computing a distance matrix for clustering
==========================================

In this section you'll learn how to compute a distance matrix using the chemfp API.

chemfp does not do clustering. There's a huge number of tools which already do that. A goal of chemfp in the future is to provide some core components which clustering algorithms can use.

That's in the future. Right now you can use the following to build a distance matrix and pass that to one of those tools.

Since we're using the same fingerprint arena for both queries and targets, we know the similarity matrix will be symmetric about the diagonal, and the diagonal terms will be 1.0 (that is, a distance of 0.0). The :ref:`threshold_tanimoto_search_symmetric ` function can take advantage of the symmetry for a factor of two performance gain. There's also a way to limit it to just the upper triangle, which gives a factor of two memory gain as well.

Most of those tools use `NumPy `_, which is a popular third-party package for numerical computing. You will need to have it installed for the following to work.
:: import numpy # NumPy must be installed from chemfp import search # Compute distance[i][j] = 1-Tanimoto(fp[i], fp[j]) def distance_matrix(arena): n = len(arena) # Start off a similarity matrix with 1.0s along the diagonal similarities = numpy.identity(n, "d") ## Compute the full similarity matrix. # The implementation computes the upper-triangle then copies # the upper-triangle into lower-triangle. It does not include # terms for the diagonal. results = search.threshold_tanimoto_search_symmetric(arena, threshold=0.0) # Copy the results into the NumPy array. for row_index, row in enumerate(results.iter_indices_and_scores()): for target_index, target_score in row: similarities[row_index, target_index] = target_score # Return the distance matrix using the similarity matrix return 1.0 - similarities Once you've computed the distance matrix, clustering is easy. I installed the `hcluster `_ package, as well as `matplotlib `_, then ran the following to see the hierarchical clustering:: import chemfp import hcluster # Clustering package from http://code.google.com/p/scipy-cluster/ # ... insert the 'distance_matrix' function definition here ... dataset = chemfp.load_fingerprints("pubchem_queries.fps") distances = distance_matrix(dataset) linkage = hcluster.linkage(distances, method="single", metric="euclidean") # Plot using matplotlib, which you must have installed hcluster.dendrogram(linkage, labels=dataset.ids) import pylab pylab.show() Taylor-Butina clustering ======================== For the last clustering example, here's my (non-validated) variation of the `Butina algorithm from JCICS 1999, 39, 747-750 `_. See also http://www.redbrick.dcu.ie/~noel/R_clustering.html . You might know it as Leader clustering. First, for each fingerprint find all other fingerprints with a threshold of 0.8:: import chemfp from chemfp import search arena = chemfp.load_fingerprints("pubchem_targets.fps") results = search. 
threshold_tanimoto_search_symmetric(arena, threshold = 0.8)

Sort the results so that fingerprints with more hits come first. A fingerprint with more hits is more likely to be a cluster centroid. Break ties arbitrarily by the fingerprint index; since fingerprints are ordered by the number of bits this likely makes larger structures appear first::

    # Reorder so the centroid with the most hits comes first.
    # (That's why I do a reverse sort.)
    # Ignore the arbitrariness of breaking ties by fingerprint index
    results = sorted( (  (len(indices), i, indices)
                              for (i,indices) in enumerate(results.iter_indices())  ),
                      reverse=True)

Apply the leader algorithm to determine the cluster centroids and the singletons::

    # Determine the true/false singletons and the clusters
    true_singletons = []
    false_singletons = []
    clusters = []

    seen = set()
    for (size, fp_idx, members) in results:
        if fp_idx in seen:
            # Can't use a centroid which is already assigned
            continue
        seen.add(fp_idx)

        if size == 0:
            # No other fingerprint is within the threshold
            true_singletons.append(fp_idx)
            continue

        # Figure out which ones haven't yet been assigned
        unassigned = set(members) - seen

        if not unassigned:
            # All of its neighbors already belong to other clusters
            false_singletons.append(fp_idx)
            continue

        # this is a new cluster
        clusters.append( (fp_idx, unassigned) )
        seen.update(unassigned)

Once done, report the results::

    print len(true_singletons), "true singletons"
    print "=>", " ".join(sorted(arena.ids[idx] for idx in true_singletons))
    print

    print len(false_singletons), "false singletons"
    print "=>", " ".join(sorted(arena.ids[idx] for idx in false_singletons))
    print

    # Sort so the cluster with the most compounds comes first,
    # then by alphabetically smallest id
    def cluster_sort_key(cluster):
        centroid_idx, members = cluster
        return -len(members), arena.ids[centroid_idx]

    clusters.sort(key=cluster_sort_key)

    print len(clusters), "clusters"
    for centroid_idx, members in clusters:
        print arena.ids[centroid_idx], "has", len(members), "other members"
        print "=>", " ".join(arena.ids[idx] for idx in members)

The algorithm is quick for this small data set.
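The leader step is easier to follow on a toy data set that needs no fingerprint files. The following is only a sketch under stated assumptions, not the chemfp code: `leader_cluster` is a hypothetical helper, the neighbor lists are made up, and each list is assumed to exclude the fingerprint itself (as with the symmetric search). It treats a fingerprint with an empty neighbor list as a true singleton, and one whose neighbors were all claimed by earlier clusters as a false singleton, which is my reading of those two terms.

```python
def leader_cluster(neighbors):
    """Leader-style clustering on precomputed neighbor lists.

    `neighbors[i]` lists the indices within the threshold of item i,
    excluding i itself. Hypothetical helper for illustration only.
    """
    # Most neighbors first; ties go to the larger index, mirroring a
    # reverse sort on (count, index).
    order = sorted(range(len(neighbors)),
                   key=lambda i: (len(neighbors[i]), i), reverse=True)
    seen = set()
    true_singletons, false_singletons, clusters = [], [], []
    for idx in order:
        if idx in seen:
            continue  # already a member of an earlier cluster
        seen.add(idx)
        if not neighbors[idx]:
            true_singletons.append(idx)   # no neighbors at all
            continue
        unassigned = set(neighbors[idx]) - seen
        if not unassigned:
            false_singletons.append(idx)  # every neighbor is already taken
            continue
        clusters.append((idx, unassigned))
        seen.update(unassigned)
    return true_singletons, false_singletons, clusters

# Made-up data: 0, 1 and 2 are mutually close, 3 is close to 2 and 5,
# 4 is isolated, and 5 is close only to 3.
toy = [[1, 2], [0, 2], [0, 1, 3], [2, 5], [], [3]]
true_s, false_s, found = leader_cluster(toy)
print(true_s)
print(false_s)
print(found)
```

Item 2 has the most neighbors, so it becomes the only centroid and claims 0, 1 and 3. Item 5 then finds its sole neighbor already taken, and item 4 never had one.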
Out of curiosity, I tried this on 100,000 compounds selected arbitrarily from PubChem. It took 35 seconds on my desktop (a 3.2 GHz Intel Core i3) with a threshold of 0.8. In the Butina paper, it took 24 hours to do the same, although that was with a 1024 bit fingerprint instead of 881. It's hard to judge the absolute speed differences of a MIPS R4000 from 1998 to a desktop from 2011, but it's less than the factor of about 2000 you see here.

More relevant is the comparison between these numbers for the 1.1 release compared to the original numbers for the 1.0 release. On my old laptop, may it rest in peace, it took 7 minutes to compute the same benchmark. Where did the roughly 16-fold performance boost come from? Money. After 1.0 was released, Roche funded me to add various optimizations, including taking advantage of the symmetry (2x) and using hardware POPCNT if available (4x). Roche and another company helped fund the OpenMP support, and when my desktop reran this benchmark it used 4 cores instead of 1.

The wary among you might notice that 2*4*4 = 32x faster, while I said the overall code was only 16x faster. Where's the factor of 2x slowdown? It's in the Python code! The :ref:`threshold_tanimoto_search_symmetric ` step took only 13 seconds. The remaining 22 seconds was in the leader code written in Python. To make the analysis more complicated, improvements to the chemfp API sped up the clustering step by about 40%.

With the chemfp 1.0 version, the clustering performance overhead was minor compared to the full similarity search, so I didn't keep track of it. With chemfp 1.1, those roles have reversed!


Reading structure fingerprints using a toolkit
==============================================

In this section you'll learn how to use a chemistry toolkit in order to compute fingerprints from a given structure file.

What happens if you're given a structure file and you want to find the two nearest matches in an FPS file?
You'll have to generate the fingerprints for the structures in the structure file, then do the comparison.

For this section you'll need to have a chemistry toolkit. I'll use the "chebi_maccs.fps" file you generated earlier as the targets, and the PubChem file "Compound_027575001_027600000.sdf.gz" as the source of query structures::

    >>> import chemfp
    >>> from chemfp import search
    >>> targets = chemfp.load_fingerprints("chebi_maccs.fps")
    >>> queries = chemfp.read_structure_fingerprints(targets.metadata, "Compound_027575001_027600000.sdf.gz")
    >>> for (query_id, hits) in chemfp.knearest_tanimoto_search(queries, targets, k=2, threshold=0.4):
    ...     print query_id, "=>",
    ...     for (target_id, score) in hits.get_ids_and_scores():
    ...         print "%s %.3f" % (target_id, score),
    ...     print
    ...
    27575433 => CHEBI:280152 0.667 CHEBI:3176 0.662
    27575577 => CHEBI:6375 0.600 CHEBI:46068 0.600
    27575602 => CHEBI:3090 0.683 CHEBI:6790 0.635
    27575603 => CHEBI:3090 0.683 CHEBI:6790 0.635
    27575880 => CHEBI:59736 0.725 CHEBI:8887 0.617
    27575897 => CHEBI:8887 0.632 CHEBI:51491 0.622
    27577227 => CHEBI:59007 0.831 CHEBI:59120 0.721
    27577234 => CHEBI:59007 0.809 CHEBI:9398 0.722
    27577237 => CHEBI:59007 0.789 CHEBI:52890 0.741
    27577250 => CHEBI:59007 0.753 CHEBI:4681 0.722
         # ... many lines omitted ...

That's it! Pretty simple, wasn't it? You didn't even need to explicitly specify which toolkit you wanted to use.

The only new thing here is :ref:`read_structure_fingerprints `. The first parameter of this is the metadata used to configure the reader. In my case it's::

    >>> print targets.metadata
    #num_bits=166
    #type=OpenEye-MACCS166/1
    #software=OEGraphSim/1.0.0 (20100809)
    #aromaticity=openeye
    #source=ChEBI_lite.sdf.gz
    #date=2011-09-14T17:50:28

The "type" told chemfp which toolkit to use to read molecules, and how to generate fingerprints from those molecules, while "aromaticity" told it which aromaticity model to use when reading the molecule file.
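The metadata printed above is plain text with one `#key=value` pair per line, so it is easy to inspect by hand. The following is a rough sketch of reading such a header, not chemfp's own parser: `parse_fps_header` is a hypothetical helper that does no validation, and taking the toolkit name from the text before the first "-" in the type is an assumption based only on the type strings used in this document.

```python
def parse_fps_header(lines):
    """Collect '#key=value' header lines into a dict.

    Hypothetical helper for illustration; it is not chemfp's parser
    and does no validation.
    """
    metadata = {}
    for line in lines:
        if not line.startswith("#"):
            break  # header lines come first; stop at the first record
        key, _, value = line[1:].rstrip("\n").partition("=")
        metadata[key] = value
    return metadata

# The header lines shown in the text above
header = [
    "#num_bits=166\n",
    "#type=OpenEye-MACCS166/1\n",
    "#software=OEGraphSim/1.0.0 (20100809)\n",
    "#aromaticity=openeye\n",
    "#source=ChEBI_lite.sdf.gz\n",
    "#date=2011-09-14T17:50:28\n",
]
metadata = parse_fps_header(header)

# Assumption: the toolkit name is the part of the type before the first "-"
toolkit = metadata["type"].split("-", 1)[0]
print(toolkit)
print(metadata["aromaticity"])
```

In practice you would pass `targets.metadata` itself to the reader, as shown above, rather than parsing the text yourself.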
You can of course pass in your own metadata as the first parameter to read_structure_fingerprints, and as a shortcut, if you pass in a string then it will be used as the fingerprint type.

For example, if you have OpenBabel installed then you can do::

    >>> reader = chemfp.read_structure_fingerprints("OpenBabel-MACCS", "Compound_027575001_027600000.sdf.gz")
    >>> for i, (id, fp) in enumerate(reader):
    ...     print id, fp.encode("hex")
    ...     if i == 3:
    ...         break
    ...
    27575433 800404000840549e848189cca1f132aedfab6eff1b
    27575577 800400000000449e850581c22190022f8a8baadf1b
    27575602 000000000000449e840191d820a0122eda9abaff1b
    27575603 000000000000449e840191d820a0122eda9abaff1b

If you have OEChem and OEGraphSim installed then you can do::

    >>> reader = chemfp.read_structure_fingerprints("OpenEye-MACCS166", "Compound_027575001_027600000.sdf.gz")
    >>> for i, (id, fp) in enumerate(reader):
    ...     print id, fp.encode("hex")
    ...     if i == 3:
    ...         break
    ...
    27575433 000000080840448e8481cdccb1f1b216daaa6a7e3b
    27575577 000000080000448e850185c2219082178a8a6a5e3b
    27575602 000000080000448e8401d14820a01216da983b7e3b
    27575603 000000080000448e8401d14820a01216da983b7e3b

And if you have RDKit installed then you can do::

    >>> reader = chemfp.read_structure_fingerprints("RDKit-MACCS166", "Compound_027575001_027600000.sdf.gz")
    >>> for i, (id, fp) in enumerate(reader):
    ...     print id, fp.encode("hex")
    ...     if i == 3:
    ...         break
    ...
    27575433 000000000840549e84818dccb1f1323cdfab6eff1f
    27575577 000000000000449e850185c22190023d8a8beadf1f
    27575602 000000000000449e8401915820a0123eda98bbff1f
    27575603 000000000000449e8401915820a0123eda98bbff1f


Select a random fingerprint sample
==================================

In this section you'll learn how to make a new arena where the fingerprints are randomly selected from the old arena.

A FingerprintArena slice creates a subarena. Technically speaking, this is a "view" of the original data. The subarena doesn't actually copy its fingerprint data from the original arena.
Instead, it uses the same fingerprint data, but keeps track of the start and end position of the range it needs. This is why it's not possible to slice with a step size other than +1.

This also means that memory for a large arena won't be freed until all of its subarenas are also removed.

You can see some evidence for this because a :ref:`FingerprintArena ` stores the entire fingerprint data as a set of bytes named `arena`::

    >>> import chemfp
    >>> targets = chemfp.load_fingerprints("pubchem_targets.fps")
    >>> subset = targets[10:20]
    >>> targets.arena is subset.arena
    True

This shows that the `targets` and `subset` share the same raw data set. At least it does to me, the person who wrote the code.

You can ask an arena or subarena to make a :ref:`copy `. This allocates new memory for the new arena and copies all of its fingerprints there.

::

    >>> new_subset = subset.copy()
    >>> len(new_subset) == len(subset)
    True
    >>> new_subset.arena is subset.arena
    False
    >>> subset[7][0]
    '14554484'
    >>> new_subset[7][0]
    '14554484'

The `copy` method can do more than just copy the arena. You can give it a list of indices and it will only copy those fingerprints::

    >>> three_targets = targets.copy([3112, 0, 1234])
    >>> three_targets.ids
    ['14550474', '14564466', '14564904']
    >>> [targets.ids[3112], targets.ids[0], targets.ids[1234]]
    ['14564904', '14550474', '14564466']

Are you confused about why the identifiers aren't in the same order? That's because when you specify indices, the copy automatically reorders them by popcount and stores the popcount information. This extra work helps make future searches faster. Use `reorder=False` to leave the order unchanged::

    >>> my_ordering = targets.copy([3112, 0, 1234], reorder=False)
    >>> my_ordering.ids
    ['14564904', '14550474', '14564466']

This is interesting, in a boring sort of way. Let's get back to the main goal of getting a random subset of the data. I want to select `m` records at random, without replacement, to make a new data set.
You can see this just means making a list with `m` different index values. Python's built-in `random.sample `_ function makes this easy::

    >>> import random
    >>> random.sample("abcdefgh", 3)
    ['b', 'h', 'f']
    >>> random.sample("abcdefgh", 2)
    ['d', 'a']
    >>> random.sample([5, 6, 7, 8, 9], 2)
    [7, 9]
    >>> help(random.sample)
    sample(self, population, k) method of random.Random instance
       Chooses k unique random elements from a population sequence.
       ...
       To choose a sample in a range of integers, use xrange as an argument.
       This is especially fast and space efficient for sampling from a
       large population:   sample(xrange(10000000), 60)

The last line of the help points out what to do next::

    >>> random.sample(xrange(len(targets)), 5)
    [610, 2850, 705, 1402, 2635]
    >>> random.sample(xrange(len(targets)), 5)
    [1683, 2320, 1385, 2705, 1850]

Putting it all together, here's how to get a new arena containing 100 randomly selected fingerprints, without replacement, from the `targets` arena::

    >>> sample_indices = random.sample(xrange(len(targets)), 100)
    >>> sample = targets.copy(indices=sample_indices)
    >>> len(sample)
    100


Look up a fingerprint with a given id
=====================================

In this section you'll learn how to get a fingerprint record with a given id.

All fingerprint records have an identifier and a fingerprint. Identifiers should be unique. (Duplicates are allowed, and if they exist then the lookup code described in this section will arbitrarily decide which record to return. Once made, the choice will not change.)

Let's find the fingerprint for the record in "pubchem_targets.fps" which has the identifier `14564126`. One solution is to iterate over all of the records in a file, using the FPS reader::

    >>> import chemfp
    >>> for id, fp in chemfp.open("pubchem_targets.fps"):
    ...     if id == "14564126":
    ...         break
    ... else:
    ...     raise KeyError("%r not found" % ("14564126",))
    ...
    >>> fp[:5]
    '\x07\x1e\x1c\x00\x00'

I used the somewhat obscure `else` clause to the `for` loop.
If the `for` finishes without breaking, which would happen if the identifier weren't present, then it will raise an exception saying that it couldn't find the given identifier.

If the fingerprint records are already in a `FingerprintArena` then there's a better solution. Use the :ref:`get_fingerprint_by_id ` method to get the fingerprint byte string, or `None` if the identifier doesn't exist::

    >>> arena = chemfp.load_fingerprints("pubchem_targets.fps")
    >>> fp = arena.get_fingerprint_by_id("14564126")
    >>> fp[:5]
    '\x07\x1e\x1c\x00\x00'
    >>> missing_fp = arena.get_fingerprint_by_id("does-not-exist")
    >>> missing_fp
    >>> missing_fp is None
    True

Internally this does about what you think it would. It uses the arena's `ids` list to make a lookup table mapping identifier to index, and caches the table for later use. Given the index, it's very easy to get the fingerprint.

In fact, you can get the index and do the record lookup yourself::

    >>> fp_index = arena.get_index_by_id("14564126")
    >>> arena.get_index_by_id("14564126")
    1559
    >>> arena[1559]
    ('14564126', '\x07\x1e\x1c\x00\x00 ...')


Sorting search results
======================

In this section you'll learn how to sort the search results.

The k-nearest searches return the hits sorted from highest score to lowest, and break ties arbitrarily. This is usually what you want, and the extra cost to sort is small (k*log(k)) compared to the time needed to maintain the internal heap (N*log(k)).

By comparison, the threshold searches return the hits in arbitrary order. Sorting takes up to N*log(N) time, which is extra work for those cases where you don't want sorted data.
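The difference between the two cases can be seen in a small pure-Python sketch. It is not how chemfp implements its searches: the hits are made-up numbers, `score_of` is a helper defined only for this example, and the standard library's `heapq.nlargest` stands in for the bounded heap mentioned above.

```python
import heapq

# Made-up (target_index, score) hits for one query
hits = [(0, 0.42), (1, 0.91), (2, 0.15), (3, 0.77), (4, 0.91), (5, 0.60)]

def score_of(hit):
    """Return the score part of an (index, score) pair."""
    return hit[1]

# k-nearest: keep only the k best candidates, roughly N*log(k) work,
# and get them back already ordered by decreasing score.
k = 3
nearest = heapq.nlargest(k, hits, key=score_of)

# Threshold search: collecting the hits is a single pass that leaves
# them in arbitrary (here, input) order.
above = [hit for hit in hits if score_of(hit) >= 0.5]

# Sorting them afterwards is the optional N*log(N) step.
above_sorted = sorted(above, key=score_of, reverse=True)

print([score_of(hit) for hit in nearest])
print([score_of(hit) for hit in above])
print([score_of(hit) for hit in above_sorted])
```

The threshold hits come back in whatever order they were found, and only the explicit sort puts them in score order. That sort is the cost you skip when you don't need ordered results.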
Use the :ref:`reorder ` method of a :ref:`SearchResult ` if you want the hits sorted in-place:: >>> import chemfp >>> arena = chemfp.load_fingerprints("pubchem_queries.fps") >>> query_fp = arena.get_fingerprint_by_id("27599116") >>> from chemfp import search >>> result = search.threshold_tanimoto_search_fp(query_fp, arena, threshold=0.90) >>> len(result) 6 >>> result.get_ids_and_scores() [('27599092', 0.96153846153846156), ('27599115', 1.0), ('27599116', 1.0), ('27599118', 1.0), ('27599120', 1.0), ('27599082', 0.92537313432835822)] >>> result.reorder("decreasing-score") >>> result.get_ids_and_scores() [('27599115', 1.0), ('27599116', 1.0), ('27599118', 1.0), ('27599120', 1.0), ('27599092', 0.96153846153846156), ('27599082', 0.92537313432835822)] >>> result.reorder("increasing-score") >>> result.get_ids_and_scores() [('27599082', 0.92537313432835822), ('27599092', 0.96153846153846156), ('27599115', 1.0), ('27599116', 1.0), ('27599118', 1.0), ('27599120', 1.0)] There are currently six different sort methods, all specified by name. These are * increasing-score: sort by increasing score * decreasing-score: sort by decreasing score * increasing-index: sort by increasing target index * decreasing-index: sort by decreasing target index * reverse: reverse the current ordering * move-closest-first: move the hit with the highest score to the first position The first two should be obvious from the examples. If you find something useful for the next two then let me know. The `reverse` option reverses the current ordering, and is most useful if you want to reverse the sorted results from a k-nearest search. The `move-closest-first` option exists to improve the leader algorithm stage used by the Taylor-Butina algorithm. The newly seen compound is either in the same cluster as its nearest neighbor or it is the new centroid. I felt it best to implement this as a special reorder term, rather than one of the other possible options. 
If you are interested in other ways to help improve your clustering performance, let me know.

Each `SearchResult` has a :ref:`reorder ` method. If you want to reorder all of the hits of a `SearchResults` then use its :ref:`reorder_all ` method::

    >>> similarity_matrix = search.threshold_tanimoto_search_symmetric(
    ...                         arena, threshold=0.8)
    >>> similarity_matrix.reorder_all("decreasing-score")
    >>> for query_id, row in zip(arena.ids, similarity_matrix):
    ...     print query_id, "->", row.get_ids_and_scores()[:3]
    ...
    27581954 -> [('27581957', 1.0)]
    27581957 -> [('27581954', 1.0)]
    27580389 -> [('27580394', 0.88235294117647056)]
    27584917 -> [('27585106', 0.89915966386554624)]
    27585106 -> [('27584917', 0.89915966386554624)]
    27580394 -> [('27580389', 0.88235294117647056)]
    27593061 -> []
        ...

It takes the same set of ordering names as :ref:`SearchResult.reorder `.


Working with raw scores and counts in a range
=============================================

In this section you'll learn how to get the hit counts and raw scores for an interval.

The length of the `SearchResult` is the number of hits it contains::

    >>> import chemfp
    >>> from chemfp import search
    >>> arena = chemfp.load_fingerprints("pubchem_targets.fps")
    >>> fp = arena.get_fingerprint_by_id("14564126")
    >>> result = search.threshold_tanimoto_search_fp(fp, arena, threshold=0.2)
    >>> len(result)
    2836

This gives you the number of hits at or above a threshold of 0.2, which you can also get by doing :ref:`count_tanimoto_hits_fp `. The result also stores the hits, and you can get the number of hits which are within a specified interval. Here are the hit counts at or above 0.5, 0.80, and 0.95::

    >>> result.count(0.5)
    735
    >>> result.count(0.8)
    5
    >>> result.count(0.95)
    2

The first parameter, `min_score`, specifies the minimum threshold. The second, `max_score`, specifies the maximum.
Here's how to get the number of hits with a score of at most 0.95 and at most 0.5::

    >>> result.count(max_score=0.95)
    2834
    >>> result.count(max_score=0.5)
    2118

If you work out the math, you add 2118+735 and realize that 2853!=2836. There's a difference of 17. This is because the default interval uses a closed range, and there are 17 hits with a score of exactly 0.5::

    >>> result.count(0.5, 0.5)
    17

The third parameter, `interval`, specifies the end conditions. The default is "[]" which means that both ends are closed. The interval "()" means that both ends are open, and "[)" and "(]" are the two half-open/half-closed ranges. To get the number of hits below 0.5 and the number of hits at or above 0.5 then you might use:

    >>> result.count(None, 0.5, "[)")
    2101
    >>> result.count(0.5, None, "[]")
    735

to get the expected results. (A min or max of `None` means that there is respectively no lower or no upper bound.)

Now for something a bit fancier. Suppose you have two sets of structures. How well do they compare to each other? I can think of various ways to do it. One is to look at a comparison profile. Find all NxM comparisons between the two sets. How many of the hits have a score of at least 0.2? How many at 0.5? 0.95?

If there are "many", then the two sets are likely more similar than not. If the answer is "few", then they are likely rather distinct.

I'll be more specific. Are the coenzyme A-like structures in ChEBI more similar to the penicillin-like structures than you would expect by comparing two randomly chosen subsets? By similar, I'll use Tanimoto similarity of the "chebi_maccs.fps" file created in the `Generating fingerprints with ...` command-line tool example.

The CHEBI id for coenzyme A is CHEBI:15346 and for penicillin is CHEBI:17334. I'll define the "coenzyme A-like" structures as the 117 structures where the fingerprint is at least 0.95 similar to coenzyme A, and "penicillin-like" as the 15 structures at least 0.90 similar to penicillin. This gives 1755 total comparisons.
You know enough to do this, but there's a nice optimization I haven't told you about. You can get the total count of all of the threshold hits using the :ref:`SearchResults.count_all ` method, instead of looping over each `SearchResult` and calling :ref:`count `::

    import chemfp
    from chemfp import search

    def get_neighbors_as_arena(arena, id, threshold):
        fp = arena.get_fingerprint_by_id(id)
        # Search the arena which was passed in, not the global "chebi"
        neighbor_results = search.threshold_tanimoto_search_fp(fp, arena, threshold=threshold)
        neighbor_arena = arena.copy(neighbor_results.get_indices())
        return neighbor_arena

    chebi = chemfp.load_fingerprints("chebi_maccs.fps")

    # coenzyme A
    coA_arena = get_neighbors_as_arena(chebi, "CHEBI:15346", threshold=0.95)
    print len(coA_arena), "coenzyme A-like structures"

    # penicillin
    penicillin_arena = get_neighbors_as_arena(chebi, "CHEBI:17334", threshold=0.9)
    print len(penicillin_arena), "penicillin-like structures"

    # I'll compute a profile at different thresholds
    thresholds = [0.25, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]

    # Compare the two sets. (For this case the speed difference between a threshold
    # of 0.25 and 0.0 is not noticeable, but having it makes me feel better.)
    coA_against_penicillin_result = search.threshold_tanimoto_search_arena(
        coA_arena, penicillin_arena, threshold=min(thresholds))

    # Show a similarity profile
    print "Counts coA/penicillin"
    for threshold in thresholds:
        print " %.2f %5d" % (threshold,
                             coA_against_penicillin_result.count_all(min_score=threshold))

This gives a not very useful output::

    117 coenzyme A-like structures
    15 penicillin-like structures
    Counts coA/penicillin
     0.25  1755
     0.50   445
     0.60     0
     0.70     0
     0.80     0
     0.90     0
     0.95     0

It's not useful because it's not possible to make any decisions from this. Are the numbers high or low? They should be low, because these are two quite different structure classes, but there's nothing to compare them against.

I need some sort of background reference.

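As an aside, the relationship between `count` and `count_all` can be modeled without chemfp. This is only a sketch of the semantics, assuming a closed lower bound and using invented scores; the nested lists and both helper functions are hypothetical stand-ins, not chemfp's objects.

```python
# Each inner list stands in for the hit scores of one query (one SearchResult).
# The scores are invented for illustration.
rows = [
    [0.31, 0.52, 0.58],
    [0.27, 0.49],
    [0.55, 0.26, 0.50, 0.33],
]

def count(row, min_score):
    """Number of hits in one row with a score of at least min_score."""
    return sum(1 for score in row if score >= min_score)

def count_all(rows, min_score):
    """Total over every row; the same as summing count() for each row."""
    return sum(count(row, min_score) for row in rows)

profile = [(threshold, count_all(rows, threshold)) for threshold in (0.25, 0.5, 0.6)]
for threshold, total in profile:
    print(" %.2f %5d" % (threshold, total))
```

The real `count_all` gives the same total as the explicit loop, but does the work in one call instead of one call per query.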
What I'll do is construct two randomly chosen sets, one with 117 fingerprints and the other with 15, and generate the same similarity profile with them. That isn't quite fair, since randomly chosen sets will most likely be diverse. Instead, I'll pick one fingerprint at random, then get its 117 or 15, respectively, nearest neighbors as the set members::

    # Get background statistics for random similarity groups of the same size
    import random

    # Find a fingerprint at random, get its k neighbors, return them as a new arena
    def get_random_fp_and_its_k_neighbors(arena, k):
        fp = arena[random.randrange(len(arena))][1]
        similar_search = search.knearest_tanimoto_search_fp(fp, arena, k)
        return arena.copy(similar_search.get_indices())

I'll construct 1000 pairs of sets this way, accumulate the threshold profile, and compare the CoA/penicillin profile to it::

    # Initialize the threshold counts to 0
    total_background_counts = dict.fromkeys(thresholds, 0)

    REPEAT = 1000
    for i in range(REPEAT):
        # Select background sets of the same size and accumulate the threshold count totals
        set1 = get_random_fp_and_its_k_neighbors(chebi, len(coA_arena))
        set2 = get_random_fp_and_its_k_neighbors(chebi, len(penicillin_arena))
        background_search = search.threshold_tanimoto_search_arena(set1, set2, threshold=min(thresholds))
        for threshold in thresholds:
            total_background_counts[threshold] += background_search.count_all(min_score=threshold)

    print "Counts coA/penicillin background"
    for threshold in thresholds:
        print " %.2f %5d %5d" % (threshold,
                                 coA_against_penicillin_result.count_all(min_score=threshold),
                                 total_background_counts[threshold] / (REPEAT+0.0))

Your output should look something like::

    Counts coA/penicillin background
     0.25  1755   423
     0.50   445    82
     0.60     0    38
     0.70     0    17
     0.80     0     6
     0.90     0     4
     0.95     0     1

This is a bit hard to interpret. Clearly the coenzyme A and penicillin sets are not closely similar, but for low Tanimoto scores the similarity is higher than expected.

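One way to read the table above is to divide each observed count by its background count. The following is a minimal sketch which reuses the numbers printed above; the background column comes from random sampling, so your own values will differ, and the variable names are mine rather than part of chemfp.

```python
thresholds = [0.25, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
observed = [1755, 445, 0, 0, 0, 0, 0]    # coA/penicillin counts from the table
background = [423, 82, 38, 17, 6, 4, 1]  # averaged background counts from the table

ratios = []
for observed_count, background_count in zip(observed, background):
    if background_count:
        ratios.append(observed_count / float(background_count))
    else:
        # No background hits at this threshold, so the ratio is undefined
        ratios.append(float("nan"))

for threshold, ratio in zip(thresholds, ratios):
    print(" %.2f %6.2f" % (threshold, ratio))
```

A ratio above 1 means more hits than the background at that threshold, and a ratio of 0 means none at all. For these numbers the ratio is above 4 at the two lowest thresholds and 0 from 0.6 upwards, which matches the reading given in the text.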
That difficulty is okay for now because I mostly wanted to show an example of how to use the chemfp API. If you want to dive deeper into this sort of analysis, then look into the `Similarity Ensemble Approach` (SEA) work of Keiser, Roth, Armbruster, Ernsberger, and Irwin. The paper is available online from http://sea.bkslab.org/ .

The paper actually wants you to use the `raw score`. This is the sum of the hit scores in a given range, and not just the number of hits. No problem! Use :ref:`SearchResult.cumulative_score ` for an individual result or :ref:`SearchResults.cumulative_score_all ` for the entire set of results::

    >>> sum(row.cumulative_score(min_score=0.5, max_score=0.9)
    ...             for row in coA_against_penicillin_result)
    224.83239025119906
    >>> coA_against_penicillin_result.cumulative_score_all(min_score=0.5, max_score=0.9)
    224.83239025119866

These also take the `interval` parameter if you don't want the default of `[]`.

You may wonder why these two values aren't exactly the same. Addition of floating point numbers isn't associative. You can see that I get yet another result if I sum up the values in reverse order::

    >>> sum(list(row.cumulative_score(min_score=0.5, max_score=0.9)
    ...                for row in coA_against_penicillin_result)[::-1])
    224.83239025119875

chemfp-1.1p1/docs/using-tools.rst0000644000077000000240000004047012102124015017213 0ustar dalkestaff00000000000000
.. highlight:: none

===================================
Working with the command-line tools
===================================

The sections in this chapter describe examples of using the command-line tools to generate fingerprint files and to do similarity searches of those files.

.. _pubchem_fingerprints:

Generating fingerprint files from PubChem SD files
==================================================

In this section you'll learn how to create a fingerprint file from an SD file which contains pre-computed CACTVS fingerprints. You do not need a chemistry toolkit for this section.

`PubChem `_ is a great resource of publicly available chemistry information. The data is available for `ftp download `_. We'll use some of their `SD formatted `_ files. Each record has a PubChem/CACTVS fingerprint field, which we'll use.

Start by downloading the files Compound_027575001_027600000.sdf.gz (from ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/CURRENT-Full/SDF/Compound_027575001_027600000.sdf.gz ) and Compound_014550001_014575000.sdf.gz (from ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/CURRENT-Full/SDF/Compound_014550001_014575000.sdf.gz ). At the time of writing they contain 224 and 3119 records, respectively. (I chose smaller than average files so they would be easier to open and review.)

Next, convert the files into fingerprint files. On the command line do the following two commands::

    sdf2fps --pubchem Compound_027575001_027600000.sdf.gz -o pubchem_queries.fps
    sdf2fps --pubchem Compound_014550001_014575000.sdf.gz -o pubchem_targets.fps

Congratulations, that was it!

How does this work? Each PubChem record contains the precomputed CACTVS substructure keys in the PUBCHEM_CACTVS_SUBSKEYS tag. The :option:`--pubchem` flag tells sdf2fps to get the value of that tag and decode it to get the fingerprint. It also adds a few metadata fields to the fingerprint file header.

The order of the fingerprints is the same as the order of the corresponding records in the SDF, although unconvertible records might be skipped, depending on the :option:`--errors` flag.

If you store records in an SD file then you almost certainly don't use the same fingerprint encoding as PubChem. sdf2fps can decode from a number of encodings. Use :option:`--help` to see the list of available decoders.

k-nearest neighbor search
=========================

In this section you'll learn how to search a fingerprint file to find the k-nearest neighbors. You will need the fingerprint files generated in :ref:`pubchem_fingerprints` but you do not need a chemistry toolkit.

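Before searching, it can help to check that the fingerprint files look right. As far as I understand the FPS format, the file starts with '#' header lines, followed by one record per line with the hex-encoded fingerprint, a tab, and the identifier. The reader below is a minimal sketch under that assumption; `read_fps` is a hypothetical helper, not part of chemfp, and the sample text is hand-written.

```python
def read_fps(lines):
    """Split FPS text into a metadata dict and a list of (id, hex_fp) records."""
    metadata = {}
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "#key=value"; the "#FPS1" signature has no "="
            key, sep, value = line[1:].partition("=")
            if sep:
                metadata[key] = value
            continue
        # Assumed record layout: hex fingerprint, tab, identifier
        hex_fp, _, rest = line.partition("\t")
        record_id = rest.split("\t")[0]
        records.append((record_id, hex_fp))
    return metadata, records

SAMPLE_FPS = (
    "#FPS1\n"
    "#num_bits=166\n"
    "#type=RDMACCS-OpenBabel/1\n"
    "00000000000000000000000000000140004480101e\tphenol\n"
)
metadata, records = read_fps(SAMPLE_FPS.splitlines())
print("%s bits, %d record(s)" % (metadata["num_bits"], len(records)))
```

For real work, `chemfp.load_fingerprints` is the better choice. The sketch is only meant to show that the files are easy to inspect by hand.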
We'll use the pubchem_queries.fps as the queries for a k=2 nearest neighbor similarity search of the target file pubchem_targets.fps::

    simsearch -k 2 -q pubchem_queries.fps pubchem_targets.fps

That's all! You should get output which starts::

    #Simsearch/1
    #num_bits=881
    #type=Tanimoto k=2 threshold=0.0
    #software=chemfp/1.0
    #queries=pubchem_queries.fps
    #targets=pubchem_targets.fps
    #query_sources=Compound_027575001_027600000.sdf.gz
    #target_sources=Compound_014550001_014575000.sdf.gz
    2 27575433 14568234 0.6904 14550456 0.6492
    2 27575577 14570945 0.7487 14570951 0.7385
    2 27575602 14553070 0.7594 14572463 0.7256
    2 27575603 14553070 0.7594 14572463 0.7256

How do you interpret the output? The lines starting with '#' are header lines. They contain metadata describing that this is a similarity search report. You can see the search parameters, the name of the tool which did the search, and the filenames which went into the search.

After the '#' header lines come the search results, with one result per line. They are in the same order as the query fingerprints. Each result line contains tab-delimited columns. The first column is the number of hits. The second column is the query identifier used. The remaining columns contain the hit data, with alternating target id and its score.

For example, the first result line contains the 2 hits for the query 27575433. The first hit is the target id 14568234 with score 0.6904 and the second hit is 14550456 with score 0.6492. Since this is a k-nearest neighbor search, the hits are sorted by score, starting with the highest score. Do be aware that ties are broken arbitrarily.

Threshold search
================

In this section you'll learn how to search a fingerprint file to find all of the neighbors at or above a given threshold. You will need the fingerprint files generated in :ref:`pubchem_fingerprints` but you do not need a chemistry toolkit.

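The result lines of a threshold search use the same layout as the k-nearest output: the hit count, the query id, then alternating target id and score, all tab-delimited. Here is a minimal sketch of a parser for one such line. The function name `parse_result_line` is made up for this example, and the sample line is the first k=2 result shown earlier, with the tabs written out explicitly.

```python
def parse_result_line(line):
    """Parse one simsearch result line into (query_id, [(target_id, score), ...])."""
    fields = line.rstrip("\n").split("\t")
    num_hits = int(fields[0])
    query_id = fields[1]
    hit_fields = fields[2:]
    # Each hit takes two columns: the target id and its score
    if len(hit_fields) != 2 * num_hits:
        raise ValueError("expected %d hits but found %d hit columns"
                         % (num_hits, len(hit_fields)))
    hits = [(hit_fields[i], float(hit_fields[i + 1]))
            for i in range(0, len(hit_fields), 2)]
    return query_id, hits

line = "2\t27575433\t14568234\t0.6904\t14550456\t0.6492\n"
query_id, hits = parse_result_line(line)
for target_id, score in hits:
    print("%s -> %s (%.4f)" % (query_id, target_id, score))
```

Checking the hit count against the number of columns is a cheap way to catch truncated or hand-edited lines.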
Let's do a threshold search and find all hits which are at least 0.738 similar to the queries::

    simsearch --threshold 0.738 -q pubchem_queries.fps pubchem_targets.fps

The first 20 lines of output from this are::

    #Simsearch/1
    #num_bits=881
    #type=Tanimoto k=all threshold=0.738
    #software=chemfp/1.0
    #queries=pubchem_queries.fps
    #targets=pubchem_targets.fps
    #query_sources=Compound_027575001_027600000.sdf.gz
    #target_sources=Compound_014550001_014575000.sdf.gz
    0 27575433
    3 27575577 14570945 0.7487 14570992 0.7383 1457095 0.7385
    1 27575602 14553070 0.7594
    1 27575603 14553070 0.7594
    1 27575880 14569866 0.7727
    0 27575897
    0 27577227
    0 27577234
    0 27577237
    1 27577250 14569555 0.7474
    0 27577307
    0 27577324

Take a look at the second result line, which contains the 3 hits for the query id 27575577. As before, the hit information alternates between the target ids and the target scores, but unlike the k-nearest search, the hits are not in a particular order. You can see that here, where the scores are 0.7487, 0.7383, and 0.7385.

You might be wondering why I chose the 0.738 threshold. Query id 27575577 has 6 hits with a threshold of 0.7 or higher. That requires 14 columns to show, which is a bit overwhelming.

Combined k-nearest and threshold search
=======================================

In this section you'll learn how to search a fingerprint file to find the k-nearest neighbors, where all of the hits must be at or above a given threshold. You will need the fingerprint files generated in :ref:`pubchem_fingerprints` but you do not need a chemistry toolkit.

You can combine the :option:`-k` and :option:`--threshold` options to find the k-nearest neighbors which are all at or above a given threshold::

    simsearch -k 3 --threshold 0.7 -q pubchem_queries.fps pubchem_targets.fps

This finds the nearest 3 structures, which all must be at least 0.7 similar to the query fingerprint.

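The rule for a combined search can be written out in plain Python: keep the hits at or above the threshold, sort them from highest score to lowest, and keep at most k. This is only a model of the selection rule, with invented ids and scores; it says nothing about how the C search code in chemfp is actually written, and `knearest_with_threshold` is a hypothetical name.

```python
def knearest_with_threshold(scored_targets, k, threshold):
    """Return up to k (target_id, score) pairs with score >= threshold, best first."""
    passing = [(target_id, score) for target_id, score in scored_targets
               if score >= threshold]
    passing.sort(key=lambda pair: pair[1], reverse=True)
    return passing[:k]

# Invented ids and scores for a single query
candidates = [("t1", 0.66), ("t2", 0.75), ("t3", 0.71), ("t4", 0.74), ("t5", 0.72)]
hits = knearest_with_threshold(candidates, k=3, threshold=0.7)
for target_id, score in hits:
    print("%s %.2f" % (target_id, score))
```

A query can end up with fewer than k hits, or none at all, when too few targets reach the threshold. The simsearch command shown before this sketch applies the same rule to every query.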
The output from that simsearch command starts::

    #Simsearch/1
    #num_bits=881
    #type=Tanimoto k=3 threshold=0.7
    #software=chemfp/1.0
    #queries=pubchem_queries.fps
    #targets=pubchem_targets.fps
    #query_sources=Compound_027575001_027600000.sdf.gz
    #target_sources=Compound_014550001_014575000.sdf.gz
    0 27575433
    3 27575577 14570945 0.7487 14570951 0.7385 14570990.7383
    3 27575602 14553070 0.7594 14572463 0.7256 14553060.7208
    3 27575603 14553070 0.7594 14572463 0.7256 14553060.7208
    3 27575880 14569866 0.7727 14567856 0.7308 14566360.7246
    0 27575897
    1 27577227 14570135 0.7143
    0 27577234
    1 27577237 14569555 0.7371

The output format is identical to the previous two search examples, and because this is a k-nearest search, the hits are sorted from highest score to lowest.

NxN (self-similar) searches
===========================

Use the --NxN option if you want to use the same fingerprints as both the queries and targets::

    simsearch -k 3 --threshold 0.7 --NxN pubchem_queries.fps

This is about twice as fast and uses half as much memory compared to::

    simsearch -k 3 --threshold 0.7 -q pubchem_queries.fps pubchem_queries.fps

and the --NxN option excludes matching a fingerprint to itself (the diagonal term).

Using a toolkit to process the ChEBI dataset
============================================

In this section you'll learn how to create a fingerprint file from a structure file. The structure processing and fingerprint generation are done with a third-party chemistry toolkit. chemfp supports Open Babel, OpenEye, and RDKit. (OpenEye users please note that you will need an OEGraphSim license to use the OpenEye-specific fingerprinters.)

We'll work with data from ChEBI http://www.ebi.ac.uk/chebi/ which contains "Chemical Entities of Biological Interest". They distribute their structures in several formats, including as an SD file. For this section, download the "lite" version from ftp://ftp.ebi.ac.uk/pub/databases/chebi/SDF/ChEBI_lite.sdf.gz .
It contains the same structure data as the complete version but many fewer tag data fields. For ChEBI 82 this file contains 19640 records and the compressed file is 5MB.

Unlike the PubChem data set, the ChEBI data set does not contain fingerprints so we'll need to generate them using a toolkit.

ChEBI record titles don't contain the id
----------------------------------------

Strangely, the ChEBI dataset does not use the title line of the SD file to store the record id. A simple examination shows that 16498 of the title lines are empty, 2119 of them have the title "ChEBI", and 45 of them are labeled "Structure #1."

Instead, the id is the value of the "ChEBI ID" tag, which looks like::

    > <ChEBI ID>
    CHEBI:776

By default the toolkit-based fingerprint generation tools use the title as the identifier, and exit with an error if the identifier is missing. (Use the :option:`--errors` option to change the behaviour). If you try one of them with this data file you will get the error message::

    ERROR: Missing title for record #1 of 'ChEBI_lite.sdf.gz'. Exiting.

Instead, use the :option:`--id-tag` option to specify the name of the data tag containing the id. For this data set you'll need to write it as::

    --id-tag "ChEBI ID"

The quotes are important because of the space in the tag name.

There's also a "ChEBI Name" tag which includes data values like "tropic acid" and "(+)-guaia-6,9-diene". Every record has a unique name so the names could be used as the primary identifier. The FPS fingerprint file format allows identifiers with a space, or comma, or anything other than tab, newline, and a couple of other bytes, so it's no problem using those names directly.

To use the ChEBI Name as the primary chemfp identifier, use::

    --id-tag "ChEBI Name"

Generating fingerprints with Open Babel
---------------------------------------

If you have the Open Babel Python library installed then you can use :ref:`ob2fps ` to generate fingerprints::

    ob2fps --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o ob_chebi.fps

This takes about 30 seconds on my laptop.

The default uses the FP2 fingerprints, so the above is the same as::

    ob2fps --FP2 --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o ob_chebi.fps

ob2fps can generate several other types of fingerprints. (See XXX for details). For example, to generate the Open Babel implementation of the MACCS definition use::

    ob2fps --MACCS --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o chebi_maccs.fps

Generating fingerprints with OpenEye
------------------------------------

If you have the OEChem Python library installed then you can use :ref:`oe2fps ` to generate fingerprints::

    oe2fps --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o oe_chebi.fps

This takes about 10 seconds on my laptop and generates a number of stereochemistry warnings.

The default settings produce OEGraphSim path fingerprints with the values::

    numbits=4096 minbonds=0 maxbonds=5 atype=DefaultAtom btype=DefaultBond

Each of these can be changed through command-line options. See XXX for details. There are also options to use an alternate aromaticity model.

oe2fps can generate several other types of fingerprints. For example, to generate the OpenEye implementation of the MACCS definition use::

    oe2fps --maccs166 --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o chebi_maccs.fps

Generating fingerprints with RDKit
----------------------------------

If you have the RDKit Python library installed then you can use :ref:`rdkit2fps ` to generate fingerprints. Based on the previous examples you probably guessed that the command line is::

    rdkit2fps --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o rdkit_chebi.fps

Unfortunately, this isn't enough.
If you do this you'll get the message::

    [18:54:07] Explicit valence for atom # 12 N, 4, is greater than permitted
    ERROR: Could not parse molecule block at line 14840 of 'ChEBI_lite.sdf.gz'. Exiting.

The first line comes from RDKit's error log. RDKit is careful to check that structures make chemical sense, and in this case it didn't like the 4-valent nitrogen. It refuses to process this molecule.

The second line comes from rdkit2fps. By default it complains and exits with an error if RDKit cannot process a record. Basically it highlights the source of the problem and demands that you do something about it.

In most cases it's okay to skip a few records which can't be processed. You can tell rdkit2fps to report the error but continue processing by using the :option:`--errors` option::

    rdkit2fps --id-tag "ChEBI ID" --errors report ChEBI_lite.sdf.gz -o rdkit_chebi.fps

Four minutes later I see that 403 records out of the 19640 could not be processed.

The previous command line created RDKit's path fingerprints with parameters::

    minPath=1 maxPath=7 fpSize=2048 nBitsPerHash=4 useHs=1

Each of those can be changed through command-line options. See XXX for details.

rdkit2fps can generate several other types of fingerprints. For example, to generate the RDKit implementation of the MACCS definition use::

    rdkit2fps --maccs166 --id-tag "ChEBI ID" ChEBI_lite.sdf.gz -o chebi_maccs.fps

chemfp supports neither count fingerprints nor sparse fingerprints so cannot generate RDKit's circular fingerprints.

chemfp's two cross-toolkit substructure fingerprints
====================================================

In this section you'll learn how to generate the two substructure-based fingerprints which come as part of chemfp. These are based on cross-toolkit SMARTS pattern definitions and can be used with Open Babel, OpenEye, and RDKit. (For OpenEye users, these fingerprints use the base OEChem library and not the separately licensed OEGraphSim add-on.)

chemfp implements two platform-independent fingerprints which were originally designed for substructure filters but which are also used for similarity searches. One is based on the 166-bit MACCS implementation in RDKit and the other comes from the 881-bit PubChem/CACTVS substructure fingerprints.

The chemfp MACCS definition is called "rdmaccs" because it closely derives from the MACCS SMARTS patterns used in RDKit. (These pattern definitions are also used in Open Babel and the CDK, but are completely independent from the OpenEye implementation.)

Here are examples of the respective rdmaccs fingerprint for phenol using each of the toolkits.

Open Babel::

    % echo "c1ccccc1O phenol" | ob2fps --in smi --rdmaccs
    #FPS1
    #num_bits=166
    #type=RDMACCS-OpenBabel/1
    #software=OpenBabel/2.2.3
    #date=2011-09-19T22:32:13
    00000000000000000000000000000140004480101e phenol

OpenEye::

    % echo "c1ccccc1O phenol" | oe2fps --in smi --rdmaccs
    #FPS1
    #num_bits=166
    #type=RDMACCS-OpenEye/1
    #software=OEChem/1.7.4 (20100809)
    #aromaticity=openeye
    #date=2011-09-19T22:31:33
    00000000000000000000000000000140004480101e phenol

RDKit::

    % echo "c1ccccc1O phenol" | rdkit2fps --in smi --rdmaccs
    #FPS1
    #num_bits=166
    #type=RDMACCS-RDKit/1
    #software=RDKit/2011.06.1
    #date=2011-09-19T22:34:42
    00000000000000000000000000000140004480101e phenol

You might be wondering why :option:`--rdmaccs` produces different fingerprint types even if the toolkits use the same SMARTS definitions. Each toolkit perceives chemistry differently. Open Babel before 2.3 didn't support chirality so chiral-based bits will never be set. Each toolkit uses a different definition of aromaticity, so a bit which is set when there are "two or more aromatic rings" will be toolkit dependent.

substruct fingerprints
----------------------

(This is a horribly generic name. If you can think of a better one then let me know.)

chemfp also includes an experimental "substruct" substructure fingerprint. This is an 881 bit fingerprint derived from the PubChem/CACTVS substructure keys. They are still being tested and validated, but if you want to try them out, use the :option:`--substruct` option.

chemfp-1.1p1/fpsmerge0000755000077000000240000000011411660452123015007 0ustar dalkestaff00000000000000
#!/usr/bin/env python
from chemfp.commandline.fpsmerge import main
main()
chemfp-1.1p1/ob2fps0000755000077000000240000000034311660452123014376 0ustar dalkestaff00000000000000
#!/usr/bin/env python
# Allow ^C to kill the process.
import signal
signal.signal(signal.SIGINT, signal.SIG_DFL)

try:
    from chemfp.commandline.ob2fps import main
    main()
except KeyboardInterrupt:
    raise SystemExit()
chemfp-1.1p1/oe2fps0000755000077000000240000000034311660452123014401 0ustar dalkestaff00000000000000
#!/usr/bin/env python
# Allow ^C to kill the process.
import signal
signal.signal(signal.SIGINT, signal.SIG_DFL)

try:
    from chemfp.commandline.oe2fps import main
    main()
except KeyboardInterrupt:
    raise SystemExit()
chemfp-1.1p1/PKG-INFO0000644000077000000240000000236512106315372014360 0ustar dalkestaff00000000000000
Metadata-Version: 1.0
Name: chemfp
Version: 1.1p1
Summary: chemfp is a set of command-line tools for generating cheminformatics fingerprints and searching those fingerprints by Tanimoto similarity, as well as a Python library which can be used to build new tools. These algorithms are designed for the dense, 100-10,000 bit fingerprints which occur in small-molecule/pharmaceutical chemistry. The Tanimoto search algorithms are implemented in C for performance and support both threshold and k-nearest searches. Fingerprint generation can be done either by extracting existing fingerprint data from an SD file or by using an existing chemistry toolkit. chemfp supports the Python libraries from Open Babel, OpenEye, and RDKit toolkits.
Home-page: http://code.google.com/p/chem-fingerprints/
Author: Andrew Dalke
Author-email: dalke@dalkescientific.com
License: MIT
Description: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Software Development :: Libraries :: Python Modules
chemfp-1.1p1/rdkit2fps0000755000077000000240000000021311660452123015107 0ustar dalkestaff00000000000000
#!/usr/bin/env python
try:
    from chemfp.commandline.rdkit2fps import main
    main()
except KeyboardInterrupt:
    raise SystemExit()
chemfp-1.1p1/README0000644000077000000240000000333312106312227014133 0ustar dalkestaff00000000000000
                        chemfp 1.1p1

A Python library and set of tools for working with cheminformatics fingerprint data. For more information, see http://chemfp.com/ .

Copyright under "the MIT license." See COPYING for details.

See THANKS for the people who have contributed in some fashion. (If I've left your name out or didn't credit you correctly, let me know.)

Install in the normal Python way:

  python setup.py install

You may need a 'sudo' or be root, depending on your system.

If you get a message like:

  unrecognized command line option "-fopenmp"

then your compiler does not understand OpenMP. To compile without OpenMP append "--without-openmp" to the setup.py line.

If you get a message like:

  cc1: error: invalid option ssse3
     -or-
  cc1: error: unrecognized command line option "-mssse3"

then your compiler does not understand the SSSE3 intrinsics. To compile without the SSSE3 intrinsics, append "--without-ssse3" to the setup.py line.

For example, to compile on a Mac with gcc-4.0 using sudo:

  sudo python setup.py install --without-openmp --without-ssse3

Note: chemfp requires a C compiler to build the _chemfp extension.
If you use Visual Studio for Microsoft Windows then you will either need the 2008 version or you will have to patch your version of Python to handle 2010 or newer. See http://bugs.python.org/issue13210 .

Documentation? Certainly! Go to:

  https://chemfp.readthedocs.org/en/latest/

or use '--help' on any of the command-line programs:

  rdkit2fps --help
  ob2fps --help
  oe2fps --help
  sdf2fps --help
  simsearch --help

or (for parts of the public API), look at the doc strings

  import chemfp
  help(chemfp)

There are many tests. To run them:

  cd tests
  python unit2 discover
chemfp-1.1p1/sdf2fps0000755000077000000240000000011611660452123014550 0ustar dalkestaff00000000000000
#!/usr/bin/env python
from chemfp.commandline import sdf2fps
sdf2fps.main()
chemfp-1.1p1/setup.py0000755000077000000240000001241312106312227014767 0ustar dalkestaff00000000000000
#!/usr/bin/env python

import sys
from distutils.core import setup, Extension
from distutils.command.build_ext import build_ext
from distutils.sysconfig import get_config_var

DESCRIPTION = """\
chemfp is a set of command-line tools for generating cheminformatics
fingerprints and searching those fingerprints by Tanimoto similarity,
as well as a Python library which can be used to build new tools.

These algorithms are designed for the dense, 100-10,000 bit
fingerprints which occur in small-molecule/pharmaceutical
chemistry. The Tanimoto search algorithms are implemented in C for
performance and support both threshold and k-nearest searches.

Fingerprint generation can be done either by extracting existing
fingerprint data from an SD file or by using an existing chemistry
toolkit. chemfp supports the Python libraries from Open Babel,
OpenEye, and RDKit toolkits.
"""

# chemfp has two compile-time options.
#   USE_OPENMP: Compile with OpenMP support
#   USE_SSSE3: Compile with compiler- and CPU-specific SSSE3 instructions
# These are available through the setup command-line as:
#   --with-openmp / --without-openmp
#   --with-ssse3 / --without-ssse3

# There doesn't seem to be a clean way to do this with distutils, so
# hack something together to make it work.

USE_OPENMP = True  # True means "enable", False means "disable"
USE_SSSE3 = True   # True means "enable", False means "disable"

argv = []
for arg in sys.argv:
    if arg == "--with-openmp":
        USE_OPENMP = True
    elif arg == "--without-openmp":
        USE_OPENMP = False
    elif arg == "--with-ssse3":
        USE_SSSE3 = True
    elif arg == "--without-ssse3":
        USE_SSSE3 = False
    else:
        # not one of the special command-line options; don't delete
        argv.append(arg)
sys.argv = argv

# chemfp has experimental support for OpenMP.

def OMP(*args):
    if USE_OPENMP:
        return list(args)
    return []

def SSSE3(*args):
    if not USE_SSSE3:
        return []
    # Some Python installations on my Mac are compiled with "-arch ppc".
    # gcc doesn't like the -mssse3 option in that case.
    arch = get_config_var("ARCHFLAGS")
    if arch and "-arch ppc" in arch:
        return []
    return list(args)

# Set "USE_OPENMP" (above) to control OpenMP support (enabled by default)
# Set "USE_SSSE3" (above) to control SSSE3 support (enabled by default)

copt = {
    "msvc": OMP("/openmp") + ["/Ox", "/GL"],
    "mingw32" : OMP("-fopenmp") + ["-O3", "-ffast-math", "-march=native"],

    "gcc-4.1": ["-O3"],  # Doesn't support OpenMP, doesn't support -mssse3

    # I'm going to presume that everyone is using an Intel-like processor
    "gcc": OMP("-fopenmp") + SSSE3("-mssse3") + ["-O3"],
    # Options to use before a release
    #  ["-Wall", "-pedantic", "-Wunused-parameter", "-std=c99"],
    }

lopt = {
    "msvc": ["/LTCG", "/MANIFEST"],
    "mingw32" : OMP("-fopenmp"),

    "gcc-4.1": ["-O3"],  # Doesn't support OpenMP
    "gcc": OMP("-fopenmp") + ["-O3"],
    }

def _is_gcc(compiler):
    return "gcc" in compiler or "g++" in compiler

class build_ext_subclass( build_ext ):
    def build_extensions(self):
        c = self.compiler.compiler_type
        if c == "unix":
            c = self.compiler.compiler[0]
            if _is_gcc(c):
                names = [c, "gcc"]
            else:
                names = [c]
        else:
            names = [c]

        # Use the first matching set of compile options
        for c in names:
            if c in copt:
                for e in self.extensions:
                    e.extra_compile_args = copt[ c ]
                break
        # Use the first matching set of link options
        for c in names:
            if c in lopt:
                for e in self.extensions:
                    e.extra_link_args = lopt[ c ]
                break

        build_ext.build_extensions(self)

setup(name = "chemfp",
      version = "1.1p1",
      description = DESCRIPTION,
      author = "Andrew Dalke",
      author_email = 'dalke@dalkescientific.com',
      url = "http://code.google.com/p/chem-fingerprints/",
      license = "MIT",
      classifiers = ["Development Status :: 5 - Production/Stable",
                     "Environment :: Console",
                     "License :: OSI Approved :: MIT License",
                     "Operating System :: OS Independent",
                     "Programming Language :: Python",
                     "Topic :: Scientific/Engineering :: Chemistry",
                     "Topic :: Software Development :: Libraries :: Python Modules"],
      packages = ["chemfp", "chemfp.commandline", "chemfp.futures", "chemfp.progressbar"],
      package_data = {"chemfp": ["rdmaccs.patterns", "substruct.patterns"]},
      scripts =
["ob2fps", "oe2fps", "rdkit2fps", "sdf2fps", "simsearch"], ext_modules = [Extension("_chemfp", ["src/bitops.c", "src/chemfp.c", "src/heapq.c", "src/fps.c", "src/searches.c", "src/hits.c", "src/select_popcount.c", "src/popcount_popcnt.c", "src/popcount_lauradoux.c", "src/popcount_lut.c", "src/popcount_gillies.c", "src/popcount_SSSE3.c", "src/python_api.c", "src/pysearch_results.c"], )], cmdclass = {"build_ext": build_ext_subclass}, ) chemfp-1.1p1/simsearch0000755000077000000240000000021211660452123015154 0ustar dalkestaff00000000000000#!/usr/bin/env python try: from chemfp.commandline.simsearch import main main() except KeyboardInterrupt: raise SystemExit() chemfp-1.1p1/src/0000755000077000000240000000000012106315372014044 5ustar dalkestaff00000000000000chemfp-1.1p1/src/bitops.c0000644000077000000240000003124312055226640015515 0ustar dalkestaff00000000000000#include "chemfp.h" #include "popcount.h" /* Bit operations related to byte and hex fingerprints A byte fingerprint is a length and a sequence of bytes where each byte stores 8 fingerprints bits, in the usual order. (That is, the byte 'A', which is the hex value 0x41, is the bit pattern "01000001".) A hex fingerprint is also stored as a length and a sequence of bytes but each byte encode 4 bits of the fingerprint as a hex character. The only valid byte values are 0-9, A-F and a-f. Other values will cause an error value to be returned. */ /***** Functions for hex fingerprints ******/ /* Map from ASCII value to bit count. Used with hex fingerprints. 
BIG is used in cumulative bitwise-or tests to check for non-hex input */ #define BIG 16 static int hex_to_value[256] = { BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, BIG, BIG, BIG, BIG, BIG, BIG, /* Upper-case A-F */ BIG, 10, 11, 12, 13, 14, 15, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, /* Lower-case a-f */ BIG, 10, 11, 12, 13, 14, 15, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, BIG, }; /* Map from ASCII value to popcount. Used with hex fingerprints. 
*/ static int hex_to_popcount[256] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 0, 0, 0, 0, 0, 0, 0, 2, 3, 2, 3, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 2, 3, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; /* Map from an integer to its popcount. The maximum possible valid hex input is 'f'/'F', which is 15, but non-hex input will set bit 0x10, so I include the range 16-31 as well. */ int _popcount[32] = { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; /* Return 1 if the string contains only hex characters; 0 otherwise */ int chemfp_hex_isvalid(int len, const char *sfp) { int i, union_w=0; const unsigned char *fp = (unsigned char *) sfp; /* Out of range values set 0x10 so do cumulative bitwise-or and see if that bit is set. Optimize for the expected common case of validfingerprints. */ for (i=0; i= BIG) { return -1; /* Then this was an invalid fingerprint (contained non-hex characters) */ } return popcount; } /* Return the population count of the intersection of two hex fingerprints, otherwise return -1. 
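 */
int chemfp_hex_intersect_popcount(int len, const char *sfp1, const char *sfp2) {
  int i, union_w=0, intersect_popcount=0;
  int w1, w2;
  const unsigned char *fp1 = (const unsigned char *) sfp1;
  const unsigned char *fp2 = (const unsigned char *) sfp2;
  for (i=0; i<len; i++) {
    w1 = hex_to_value[fp1[i]];
    w2 = hex_to_value[fp2[i]];
    /* Check for illegal characters */
    union_w |= (w1|w2);
    intersect_popcount += _popcount[w1 & w2];
  }
  if (union_w >= BIG) {
    return -1;
  }
  return intersect_popcount;
}

/* Return the Tanimoto between two hex fingerprints, or -1.0 for invalid fingerprints
   If neither fingerprint has any set bits then return 0.0 */
double chemfp_hex_tanimoto(int len, const char *sfp1, const char *sfp2) {
  int i=0, union_w=0;
  int union_popcount=0, intersect_popcount=0;
  int w1, w2;
  int w3, w4;
  int upper_bound = len - (len%2);
  const unsigned char *fp1 = (const unsigned char *) sfp1;
  const unsigned char *fp2 = (const unsigned char *) sfp2;

  /* Hex fingerprints really should be even-length since two hex characters
     are used for a single fingerprint byte and all chemfp fingerprints must
     be a multiple of 8 bits. I'll allow odd-lengths since I don't see how
     that's a bad idea and I can see how some people will be confused by
     expecting odd lengths to work. More specifically, I was confused because
     I used some odd lengths in my tests. ;) */

  /* I'll process two characters at a time. Loop-unrolling was about 4% faster. */
  for (; i<upper_bound; i+=2) {
    w1 = hex_to_value[fp1[i]];
    w2 = hex_to_value[fp2[i]];
    w3 = hex_to_value[fp1[i+1]];
    w4 = hex_to_value[fp2[i+1]];
    /* Check for illegal characters */
    union_w |= (w1|w2|w3|w4);
    union_popcount += _popcount[w1|w2] + _popcount[w3|w4];
    intersect_popcount += _popcount[w1&w2] + _popcount[w3&w4];
  }
  /* Handle the final character of an odd-length fingerprint */
  for (; i<len; i++) {
    w1 = hex_to_value[fp1[i]];
    w2 = hex_to_value[fp2[i]];
    union_w |= (w1|w2);
    union_popcount += _popcount[w1|w2];
    intersect_popcount += _popcount[w1&w2];
  }
  if (union_w >= BIG) {
    return -1.0;
  }
  /* Special case: define 0/0 = 0.0. It's hard to decide what to use here;
     for example, OpenEye uses 1.0. It seems that 0.0 is the least
     surprising choice.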
*/ if (union_popcount == 0) { return 0.0; } return (intersect_popcount + 0.0) / union_popcount; /* +0.0 to coerce to double */ } /* Return 1 if the query fingerprint is contained in the target, 0 if it isn't, or -1 for invalid fingerprints */ /* This code assumes that 1) most tests fail and 2) most fingerprints are valid */ int chemfp_hex_contains(int len, const char *squery_fp, const char *starget_fp) { int i, query_w, target_w; int union_w=0; const unsigned char *query_fp = (const unsigned char *) squery_fp; const unsigned char *target_fp = (const unsigned char *) starget_fp; for (i=0; i= BIG) { return -1; } return 0; } } /* This was a subset, but there might have been a non-hex input */ if (union_w >= BIG) { return -1; } return 1; } /****** byte fingerprints *******/ /* These algorithms are a lot simpler than working with hex fingeprints. There are a number of performance tweaks I could put in, especially if I know the inputs are word aligned, but I'll leave those for later. */ static int byte_popcounts[] = { 0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5, 1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6, 1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6, 2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7, 1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6, 2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7, 2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7, 3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,4,5,5,6,5,6,6,7,5,6,6,7,6,7,7,8 }; /* Return the population count of a byte fingerprint */ int chemfp_byte_popcount(int len, const unsigned char *fp) { /* This doesn't (yet?) need the fastest code, so use the simplest */ /* (I only call this through Python, and that overhead dominates.) 
*/ return chemfp_popcount_lut8_1(len, fp); } /* Return the population count of the intersection of two byte fingerprints */ int chemfp_byte_intersect_popcount(int len, const unsigned char *fp1, const unsigned char *fp2) { return chemfp_intersect_popcount_lut8_1(len, fp1, fp2); } /* Return the Tanitoto between two byte fingerprints, or -1.0 for invalid fingerprints If neither fingerprint has any set bits then return 1.0 */ double chemfp_byte_tanimoto(int len, const unsigned char *fp1, const unsigned char *fp2) { int i, union_popcount=0, intersect_popcount=0; /* Accumulate the total union and intersection popcounts */ for (i=0; i 0) { w1 = hex_to_value[*hex_fp++]; w2 = hex_to_value[*hex_fp++]; /* Check for illegal characters */ union_w |= (w1|w2); wc = (unsigned char)((w1<<4) | w2); byte = *byte_fp++; union_popcount += byte_popcounts[byte | wc]; intersect_popcount += byte_popcounts[byte & wc]; size--; } if (union_w >= BIG) { return -1.0; } /* Special case define that 0/0 = 0.0. It's hard to decide what to use here, for example, OpenEye uses 1.0. It seems that 0.0 is the least surprising choice. 
*/ if (union_popcount == 0) { return 0.0; } return (intersect_popcount + 0.0) / union_popcount; /* +0.0 to coerce to double */ } chemfp-1.1p1/src/chemfp.c0000644000077000000240000000636012055226640015461 0ustar dalkestaff00000000000000#include "chemfp.h" #include "chemfp_internal.h" #include const char *chemfp_version(void) { return CHEMFP_VERSION_STRING; } const char *chemfp_strerror(int err) { switch (err) { case CHEMFP_OK: return "Ok"; case CHEMFP_BAD_ARG: return "Bad argument"; case CHEMFP_NO_MEM: return "Cannot allocate memory"; case CHEMFP_UNSUPPORTED_WHITESPACE: return "Unsupported whitespace"; case CHEMFP_MISSING_FINGERPRINT: return "Missing fingerprint field"; case CHEMFP_BAD_FINGERPRINT: return "Fingerprint field is in the wrong format"; case CHEMFP_UNEXPECTED_FINGERPRINT_LENGTH: return "Fingerprint is not the expected length"; case CHEMFP_MISSING_ID: return "Missing id field"; case CHEMFP_BAD_ID: return "Id field is in the wrong format"; case CHEMFP_MISSING_NEWLINE: return "Line must end with a newline character"; case CHEMFP_METHOD_MISMATCH: return "Mismatch between popcount method and alignment type"; case CHEMFP_UNKNOWN_ORDERING: return "Unknown sort order"; default: return "Unknown error"; } } /* Code to handle getting and setting options */ typedef int (*get_option_f)(void); typedef int (*set_option_f)(int); typedef struct { const char *name; get_option_f getter; set_option_f setter; } chemfp_option_type; const chemfp_option_type chemfp_options[] = { {"report-popcount", chemfp_get_option_report_popcount, chemfp_set_option_report_popcount}, {"report-intersect", chemfp_get_option_report_intersect_popcount, chemfp_set_option_report_intersect_popcount}, }; int chemfp_get_num_options(void) { return sizeof(chemfp_options) / sizeof(chemfp_option_type); } const char * chemfp_get_option_name(int i) { if (i < 0 || i >= chemfp_get_num_options()) { return NULL; } return chemfp_options[i].name; } int chemfp_get_option(const char *option) { int i; for (i=0; i 
#endif /* The value 0 means "initialize using omp_get_max_threads()" */ /* Otherwise, this will be a value between 1 and omp_get_max_threads() */ static int chemfp_num_threads = 0; int chemfp_get_num_threads(void) { #if defined(_OPENMP) if (chemfp_num_threads == 0) { chemfp_num_threads = omp_get_max_threads(); } return chemfp_num_threads; #else return 1; #endif } void chemfp_set_num_threads(int num_threads) { #if defined(_OPENMP) /* Can only have between 1 thread and the number of logical cores */ if (num_threads < 1) { num_threads = 1; } omp_set_num_threads(num_threads); /* Quoting from the docs: If you use omp_set_num_threads to change the number of threads, subsequent calls to omp_get_max_threads will return the new value. */ chemfp_num_threads = omp_get_max_threads(); #else UNUSED(num_threads); #endif } int chemfp_get_max_threads(void) { #if defined(_OPENMP) return omp_get_max_threads(); #else return 1; #endif } chemfp-1.1p1/src/chemfp.h0000644000077000000240000003423612106312227015463 0ustar dalkestaff00000000000000#ifndef CHEMFP_H #define CHEMFP_H /* Errors are always negative numbers. 
*/ enum chemfp_errors { CHEMFP_OK = 0, CHEMFP_BAD_ARG = -1, CHEMFP_NO_MEM = -2, /* memory allocation failed */ /* File format errors */ CHEMFP_UNSUPPORTED_WHITESPACE = -30, CHEMFP_MISSING_FINGERPRINT = -31, CHEMFP_BAD_FINGERPRINT = -32, CHEMFP_UNEXPECTED_FINGERPRINT_LENGTH = -33, CHEMFP_MISSING_ID = -34, CHEMFP_BAD_ID = -35, CHEMFP_MISSING_NEWLINE = -36, /* Popcount errors */ CHEMFP_METHOD_MISMATCH = -50, /* Various other error messages */ CHEMFP_UNKNOWN_ORDERING = -60 }; int chemfp_get_num_methods(void); const char *chemfp_get_method_name(int method); int chemfp_get_num_alignments(void); const char *chemfp_get_alignment_name(int alignment); int chemfp_get_alignment_method(int alignment); int chemfp_set_alignment_method(int alignment, int method); int chemfp_select_fastest_method(int alignment, int repeat); int chemfp_get_num_options(void); const char *chemfp_get_option_name(int index); int chemfp_get_option(const char *option); int chemfp_set_option(const char *option, int value); /* This gives compile-time version information. */ /* Use "chemfp_version" for run-time version information */ #define CHEMFP_MAJOR_VERSION 1 #define CHEMFP_MINOR_VERSION 1 #define CHEMFP_PATCHLEVEL 0 /* This is of the form (\d+\.\d+) (\.\d)? ((a|b|pre)\d+) for examples: 0.9, 1.0.4, 1.0pre2. In practice, the production releases look like '1.0' and '1.1' and intermediate builds before a full release use the 'b' suffix, like '1.1b7' before the '1.1' final release. */ #define CHEMFP_VERSION_STRING "1.1p1" /* Return the CHEMFP_VERSION. */ const char *chemfp_version(void); /* Convert an error code to a string description */ const char *chemfp_strerror(int err); typedef struct { int popcount; int index; } ChemFPOrderedPopcount; /* XXX Why mixed case? 
*/ typedef struct { double score; int query_index; int id_start; int id_end; } chemfp_tanimoto_cell; /* Linked list of blocks */ #define CHEMFP_HIT_BLOCK_SIZE 16 typedef struct chemfp_hit_block { int target_indices[CHEMFP_HIT_BLOCK_SIZE]; double scores[CHEMFP_HIT_BLOCK_SIZE]; struct chemfp_hit_block *next; } chemfp_hit_block; typedef struct chemfp_search_result { int num_hits; int num_allocated; int *indices; double *scores; } chemfp_search_result; chemfp_search_result *chemfp_alloc_search_results(int num_results); void chemfp_free_results(int num_results, chemfp_search_result *); int chemfp_get_num_hits(chemfp_search_result *results); int chemfp_search_result_reorder(chemfp_search_result *result, const char *ordering); int chemfp_search_results_reorder(int num_results, chemfp_search_result *results, const char *ordering); void chemfp_search_result_clear(chemfp_search_result *result); /*** Low-level operations directly on hex fingerprints ***/ /* Return 1 if the string contains only hex characters; 0 otherwise */ int chemfp_hex_isvalid(int len, const char *fp); /* Return the population count of a hex fingerprint, otherwise return -1 */ int chemfp_hex_popcount(int len, const char *fp); /* Return the population count of the intersection of two hex fingerprints, otherwise return -1. 
 */
int chemfp_hex_intersect_popcount(int len, const char *fp1, const char *fp2);

/* Return the Tanimoto between two hex fingerprints, or -1.0 for invalid fingerprints
   If neither fingerprint has any set bits then return 0.0 */
double chemfp_hex_tanimoto(int len, const char *fp1, const char *fp2);

/* Return 1 if the query fingerprint is contained in the target, 0 if it isn't,
   or -1 for invalid fingerprints */
int chemfp_hex_contains(int len, const char *query_fp, const char *target_fp);

/**** Low-level operations directly on byte fingerprints ***/

/* Return the population count of a byte fingerprint */
int chemfp_byte_popcount(int len, const unsigned char *fp);

/* Return the population count of the intersection of two byte fingerprints */
int chemfp_byte_intersect_popcount(int len, const unsigned char *fp1,
                                   const unsigned char *fp2);

/* Return the Tanimoto between two byte fingerprints, or -1.0 for invalid fingerprints
   If neither fingerprint has any set bits then return 0.0 */
double chemfp_byte_tanimoto(int len, const unsigned char *fp1,
                            const unsigned char *fp2);

double chemfp_byte_hex_tanimoto(int size, const unsigned char *byte_fp,
                                const char *hex_fp);

/* Return 1 if the query fingerprint is contained in the target, 0 if it isn't */
int chemfp_byte_contains(int len, const unsigned char *query_fp,
                         const unsigned char *target_fp);

/**** Functions which work with data from an fps block ***/

/* NOTE: an "fps block" means "one or more fingerprint lines from an fps
   file." These contain the hex fingerprint and the identifier, plus optional
   additional fields. The fps block must end with a newline.
 */

/* Return 0 if string is a valid fps fingerprint line, otherwise an error code */
int chemfp_fps_line_validate(int hex_size,  /* use -1 if not known */
                             int line_size, const char *line_start);

int chemfp_fps_find_id(int hex_size, const char *line,
                       const char **id_start, const char **id_end);

int chemfp_threshold_tanimoto_hexfp_fps(
        int hex_size, const char *hex_query_fp,
        int target_block_size, const char *target_block_start,
        int *lineno, const char **stopped_at,
        double threshold,
        int num_cells, int *id_starts, int *id_ends, double *scores);

/* For each query, count the fingerprints in the fps block whose Tanimoto
   score is greater than or equal to the specified threshold. */
int chemfp_fps_count_tanimoto_hits(
        int num_bits,
        int query_storage_size, const unsigned char *query_arena,
        int query_start, int query_end,
        const char *target_block, int target_block_end,
        double threshold,
        int *counts, int *num_lines_processed);

typedef struct {
  int size;           /* current heap size */
  int heap_state;
  /* These all point to arrays of size k */
  int *indices;       /* [k]; contains a unique id or index */
  char **ids;         /* [k]; array of NULL or malloc'ed identifier */
  double *scores;     /* [k]; the Tanimoto similarity */
} chemfp_fps_heap;

typedef struct {
  const unsigned char *query_start;
  int num_queries;
  int query_fp_size;
  int query_storage_size;
  int k;                    /* max number of elements to find */
  int search_state;         /* 0 for not finished, 1 for finished */
  double threshold;         /* initial threshold */
  chemfp_fps_heap *heaps;   /* [num_queries] heaps */
  int num_targets_processed;
  char **_all_ids;
  double *_all_scores;
} chemfp_fps_knearest_search;

int chemfp_fps_knearest_search_init(
        chemfp_fps_knearest_search *knearest_search,
        int num_bits, int query_storage_size,
        const unsigned char *query_arena, int query_start, int query_end,
        int k, double threshold);

/* Update the heap based on the lines in an fps fingerprint data block.
*/ int chemfp_fps_knearest_tanimoto_search_feed( chemfp_fps_knearest_search *knearest_search, int target_block_len, const char *target_block); /* Call this after the last fps block, in order to convert the heap into an sorted array. */ void chemfp_fps_knearest_search_finish(chemfp_fps_knearest_search *knearest_search); void chemfp_fps_knearest_search_free(chemfp_fps_knearest_search *knearest_search); int chemfp_fps_threshold_tanimoto_search( int num_bits, int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, const char *target_block, int target_block_end, double threshold, int num_cells, chemfp_tanimoto_cell *cells, const char ** stopped_at, int *num_lines_processed, int *num_cells_processed); /***** The byte-oriented algorithms ********/ /* You must have enough cells */ int chemfp_knearest_tanimoto_block( int k, int query_len, unsigned char *query_fp, int num_targets, unsigned char *target_block, int offset, int storage_len, double threshold, int *indices, double *scores); int chemfp_hex_tanimoto_block( int n, int len, unsigned char *hex_query_fp, int target_len, unsigned char *target_block, double threshold, double *scores, int *id_starts, int *id_ends, int *lineno); int chemfp_byte_intersect_popcount_count( int len, unsigned char *query_fp, int num_targets, unsigned char *target_block, int offset, int storage_len, int min_overlap); int chemfp_reorder_by_popcount( int num_bits, int storage_size, const unsigned char *arena, int start, int end, unsigned char *new_arena, ChemFPOrderedPopcount *ordering, int *popcount_indices); int chemfp_count_tanimoto_arena( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int 
target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go into these arrays */ int *result_counts ); int chemfp_threshold_tanimoto_arena( /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go into this data structure */ chemfp_search_result *results ); int chemfp_knearest_tanimoto_arena( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go into this data structure */ chemfp_search_result *results ); int chemfp_count_tanimoto_hits_arena_symmetric( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Fingerprint arena */ int storage_size, const unsigned char *arena, /* Row start and end indices */ int query_start, int query_end, /* Column start and end indices */ int target_start, int target_end, /* Target popcount distribution information */ int *popcount_indices, /* Results _increment_ existing values in the array - remember to initialize! 
*/ int *result_counts); int chemfp_threshold_tanimoto_arena_symmetric( /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Arena */ int storage_size, const unsigned char *arena, /* start and end indices for the rows and columns */ int query_start, int query_end, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *popcount_indices, /* Results go here */ /* NOTE: This must have enough space for all of the fingerprints! */ chemfp_search_result *results); int chemfp_knearest_tanimoto_arena_symmetric( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Arena */ int storage_size, const unsigned char *arena, /* start and end indices for the rows and columns */ int query_start, int query_end, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go here */ /* NOTE: This must have enough space for all of the fingerprints! 
*/ chemfp_search_result *results); void chemfp_knearest_results_finalize(chemfp_search_result *results_start, chemfp_search_result *results_end); int chemfp_fill_lower_triangle(int n, chemfp_search_result *results); typedef int (*chemfp_popcount_f)(int len, const unsigned char *p1); typedef int (*chemfp_intersect_popcount_f)(int len, const unsigned char *p1, const unsigned char *p2); chemfp_popcount_f chemfp_select_popcount(int num_bits, int storage_len, const unsigned char *arena); chemfp_intersect_popcount_f chemfp_select_intersect_popcount(int num_bits, int storage_len1, const unsigned char *arena1, int storage_len2, const unsigned char *arena2); /* OpenMP interface */ int chemfp_get_num_threads(void); void chemfp_set_num_threads(int num_threads); int chemfp_get_max_threads(void); #endif /* CHEMFP_H */ chemfp-1.1p1/src/chemfp_internal.h0000644000077000000240000000117212055226640017356 0ustar dalkestaff00000000000000#ifndef CHEMFP_INTERNAL_H #define CHEMFP_INTERNAL_H #define ALIGNMENT(POINTER, BYTE_COUNT) \ (((uintptr_t)(const void *)(POINTER)) % (BYTE_COUNT)) /* Macro to use for variable names which exist as a */ /* function parameter but otherwise aren't used */ /* This is to prevent compiler warnings on msvc /W4 */ #define UNUSED(x) (void)(x); int chemfp_get_option_report_popcount(void); int chemfp_set_option_report_popcount(int); int chemfp_get_option_report_intersect_popcount(void); int chemfp_set_option_report_intersect_popcount(int); int chemfp_add_hit(chemfp_search_result *result, int target_index, double score); #endif chemfp-1.1p1/src/CMakeLists.txt0000644000077000000240000000115611671104516016611 0ustar dalkestaff00000000000000project(chemfp) cmake_minimum_required(VERSION 2.8) option(BUILD_SHARED "enable static build support" ON) IF(CMAKE_COMPILER_IS_GNUCXX) SET(CMAKE_C_FLAGS "-O3") ENDIF(CMAKE_COMPILER_IS_GNUCXX) ADD_LIBRARY(chemfp SHARED bitops.c chemfp.c heapq.c searches.c fps.c popcount_SSSE3.c popcount_gillies.c popcount_lauradoux.c 
            popcount_lut.c popcount_popcnt.c hits.c select_popcount.c)

add_executable(test_libchemfp test_libchemfp.c)
target_link_libraries(test_libchemfp chemfp)
add_test(test_libchemfp test_libchemfp)
#endforeach(test ${tests})

chemfp-1.1p1/src/cpuid.h

#ifndef CPUID_H
#define CPUID_H

/**
 * @brief   Contains a portable cpuid implementation for x86 and
 *          x86-64 CPUs.
 * @author  Kim Walisch
 * @version 1.1
 * @date    2011
 *
 * The code within this file has been tested successfully with the
 * following compilers and operating systems:
 *
 * GNU GCC 4.4, Linux i386 & x86-64
 * LLVM clang 2.8, Linux i386 & x86-64
 * Oracle Solaris Studio 12.2, Linux i386 & x86-64
 * Intel C++ Composer XE 2011, Linux i386 & x86-64, Windows 7 64-bit
 * GNU GCC 3.3, VMware Linux i386
 * GNU GCC 4.6, VMware Linux i386 & x86-64
 * Apple llvm-gcc-4.2, Mac OS X 10.7
 * Apple clang version 3.0, Mac OS X 10.7
 * Microsoft Visual Studio 2010, Windows 7 64-bit
 * MinGW-w64 GCC 4.6, Windows 7 64-bit
 */

#if defined(_MSC_VER) && (defined(_WIN32) || defined(_WIN64))
#include <intrin.h> /* __cpuid() */
#endif

/* %ecx bit flags */
#define bit_SSE3    (1 <<  0)
#define bit_SSSE3   (1 <<  9)
#define bit_SSE4_1  (1 << 19)
#define bit_SSE4_2  (1 << 20)
#define bit_POPCNT  (1 << 23)
#define bit_AVX     (1 << 28)

/* %edx bit flags */
#define bit_SSE     (1 << 25)
#define bit_SSE2    (1 << 26)

/**
 * Portable cpuid implementation for x86 and x86-64 CPUs
 * (supports PIC and non-PIC code).
 * @return 1 if the CPU supports the cpuid instruction else -1.
*/ static int cpuid(unsigned int info, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { #if defined(_MSC_VER) && (defined(_WIN32) || defined(_WIN64)) int regs[4]; __cpuid(regs, info); *eax = regs[0]; *ebx = regs[1]; *ecx = regs[2]; *edx = regs[3]; return 1; #elif defined(__i386__) || defined(__i386) *eax = info; #if defined(__PIC__) __asm__ __volatile__ ( "mov %%ebx, %%esi;" /* save %ebx PIC register */ "cpuid;" "xchg %%ebx, %%esi;" : "+a" (*eax), "=S" (*ebx), "=c" (*ecx), "=d" (*edx)); #else __asm__ __volatile__ ( "cpuid;" : "+a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)); #endif return 1; #elif defined(__x86_64__) *eax = info; __asm__ __volatile__ ( "cpuid;" : "+a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)); return 1; #else /* compiler or CPU architecture do not support cpuid. */ UNUSED(info); UNUSED(eax); UNUSED(ebx); UNUSED(ecx); UNUSED(edx); return -1; #endif } /** * @return An int value with the SSE and AVX bit flags set if the CPU * supports the corresponding instruction sets. 
 */
static int get_cpuid_flags(void)
{
  int flags = 0;
  unsigned int info = 0x00000001;
  unsigned int eax, ebx, ecx, edx;
  if (cpuid(info, &eax, &ebx, &ecx, &edx) != -1) {
    flags = (edx & (bit_SSE | bit_SSE2)) |
            (ecx & (bit_SSE3 | bit_SSSE3 | bit_SSE4_1 | bit_SSE4_2 |
                    bit_POPCNT | bit_AVX));
  }
  return flags;
}

#endif /* CPUID_H */

chemfp-1.1p1/src/fps.c

/* Functions for using the "fps" hex-based fingerprint file format */

#include <string.h>
#include <stdlib.h>

#include "heapq.h"
#include "chemfp.h"

enum {ADD_TO_HEAP, REPLACE_IN_HEAP, MAXED_OUT_HEAP};

/* Internal function to find the id field in an FPS line */
/* (Which means the fingerprint field is from line to *id_start-1 ) */
/* The line MUST match /^[0-9A-Fa-f]+\t[^\t\r\n]+/ */
/* REQUIRED: the line MUST end with a newline (this is not checked) */
int chemfp_fps_find_id(
        int hex_size,           /* The expected length of the hex field, or -1 if unknown
                                   (If it's known then it's used to validate.) */
        const char *line,       /* The input line */
        const char **id_start,  /* After a successful return, these will contain */
        const char **id_end     /* the start and end+1 position of the id field */
        ) {
  int fp_field_len, id_len;
  const char *s;

  /* Find the hex fingerprint and check that the length is appropriate */
  fp_field_len = (int) strspn(line, "0123456789abcdefABCDEF");
  if (fp_field_len == 0)
    return CHEMFP_MISSING_FINGERPRINT;
  if (fp_field_len % 2 != 0)
    return CHEMFP_BAD_FINGERPRINT;
  if (hex_size != -1 && hex_size != fp_field_len)
    return CHEMFP_UNEXPECTED_FINGERPRINT_LENGTH;

  s = line+fp_field_len;
  /* The only legal thing here is a tab. */
  /* Check if it's some other character, including a NUL */
  switch (s[0]) {
  case '\t': break;  /* The only legal option.
Everything else improves the error code */ case '\n': return CHEMFP_MISSING_ID; case '\r': if (s[1] == '\n') return CHEMFP_MISSING_ID; /* else fallthrough */ case ' ': return CHEMFP_UNSUPPORTED_WHITESPACE; default: return CHEMFP_BAD_FINGERPRINT; } s++; /* You must pass in a newline-terminated string to this function. Therefore, this function will finish while inside the string. Note that I'm also checking for illegal whitespace here. */ id_len = (int) strcspn(s, "\t\n\r"); switch (s[id_len]) { case '\0': return CHEMFP_BAD_ID; case '\r': if (s[id_len+1] != '\n') return CHEMFP_UNSUPPORTED_WHITESPACE; break; } *id_start = s; *id_end = s+id_len; return CHEMFP_OK; } /* Go to the start of the next line. s may be at a newline already. */ static const char *chemfp_to_next_line(const char *s) { while (*s != '\n') s++; return s+1; } int chemfp_fps_line_validate(int hex_size, int line_size, const char *line_start) { const char *id_start, *id_end; if (line_size == 0 || line_start[line_size-1] != '\n') return CHEMFP_MISSING_NEWLINE; return chemfp_fps_find_id(hex_size, line_start, &id_start, &id_end); } /* Return the number of fingerprints in the fps block which are greater than or equal to the specified threshold. 
*/ int chemfp_fps_count_tanimoto_hits( int num_bits, int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, const char *target_block, int target_block_end, double threshold, int *counts, int *num_lines_processed) { const unsigned char *query_fp; const char *line, *next_line, *end; int fp_size = (num_bits+7)/8; int num_lines = 0, query_index; const char *id_start, *id_end; int err; double score; end = target_block + target_block_end; if (target_block_end == 0 || end[-1] != '\n') { err = CHEMFP_MISSING_NEWLINE; goto finish; } line = target_block; while (line < end) { /* Parse a line, get the id start position and length, and verify hex_size */ err = chemfp_fps_find_id(fp_size*2, line, &id_start, &id_end); if (err < 0) goto finish; /* The character after the id might be a newline, or there might be other fields */ next_line = chemfp_to_next_line(id_end); query_fp = query_arena + query_start * query_storage_size; for (query_index=query_start; query_index= threshold) counts[query_index]++; } num_lines++; line = next_line; } err = CHEMFP_OK; finish: *num_lines_processed = num_lines; return err; } /****** Linear Tanimoto search with threshold and unlimited number of hits ********/ int chemfp_fps_threshold_tanimoto_search( int num_bits, int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, const char *target_block, int target_block_end, double threshold, int num_cells, chemfp_tanimoto_cell *cells, const char ** stopped_at, int *num_lines_processed, int *num_cells_processed) { const char *line = target_block; const char *next_line; const char *end = target_block+target_block_end; const char *id_start, *id_end; const unsigned char *query_fp; chemfp_tanimoto_cell *current_cell; double score; int query_index; int num_lines = 0, num_queries; int err, retval; int fp_size = (num_bits+7)/8; current_cell = cells; if (query_start >= query_end) { retval = CHEMFP_OK; goto finish; } num_queries = query_end - 
query_start; if (end[-1] != '\n') { /* There's no guarantee that the missing newline is on "stopped_at" */ /* In the Python API there's no way to trigger this through normal code. */ retval = CHEMFP_MISSING_NEWLINE; goto finish; } while (line < end) { if (num_cells < num_queries) { goto success; } err = chemfp_fps_find_id(2*fp_size, line, &id_start, &id_end); if (err < 0) { retval = err; goto finish; } next_line = chemfp_to_next_line(id_end); query_fp = query_arena + query_start * query_storage_size; for (query_index=query_start; query_index= threshold) { current_cell->score = score; current_cell->query_index = query_index; current_cell->id_start = (int)(id_start - target_block); current_cell->id_end = (int)(id_end - target_block); current_cell++; num_cells--; } } line = next_line; num_lines++; } success: retval = CHEMFP_OK; finish: *stopped_at = line; *num_lines_processed = num_lines; *num_cells_processed = (int)(current_cell - cells); return retval; } /****** Manage the best-of-N Tanimoto linear searches ********/ /* Compare two heap entries based on their score. Break ties based on the insertion index, with a preference to older entries. 
*/ static int fps_heap_lt(chemfp_fps_heap *heap, int i, int j) { if (heap->scores[i] < heap->scores[j]) return 1; if (heap->scores[i] > heap->scores[j]) return 0; /* break ties on a first-come basis */ return (heap->indices[i] > heap->indices[j]); } /* Swap two entries in the heap */ static void fps_heap_swap(chemfp_fps_heap *heap, int i, int j) { int idx = heap->indices[i]; double score = heap->scores[i]; char *id = heap->ids[i]; heap->indices[i] = heap->indices[j]; heap->scores[i] = heap->scores[j]; heap->ids[i] = heap->ids[j]; heap->indices[j] = idx; heap->scores[j] = score; heap->ids[j] = id; } /***************** new code */ int chemfp_fps_knearest_search_init( chemfp_fps_knearest_search *knearest_search, int num_bits, int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, int k, double threshold) { chemfp_fps_heap *heaps = NULL; int *all_indices = NULL; char **all_ids = NULL; double *all_scores = NULL; int i, num_queries; if (query_start >= query_end) { num_queries = 0; goto skip_malloc; } else { num_queries = query_end - query_start; } heaps = (chemfp_fps_heap *) calloc(num_queries, sizeof(chemfp_fps_heap)); if (!heaps) { goto malloc_failure; } all_indices = (int *) calloc(k*num_queries, sizeof(int)); if (!all_indices) { goto malloc_failure; } all_ids = (char **) calloc(k*num_queries, sizeof(char *)); if (!all_ids) { goto malloc_failure; } all_scores = (double *) calloc(k*num_queries, sizeof(double)); if (!all_scores) { goto malloc_failure; } skip_malloc: knearest_search->query_start = query_arena + (query_start*query_storage_size); knearest_search->num_queries = num_queries; knearest_search->query_fp_size = (num_bits+7)/8; knearest_search->query_storage_size = query_storage_size; knearest_search->k = k; knearest_search->search_state = 0; knearest_search->threshold = threshold; knearest_search->heaps = heaps; for (i=0; inum_targets_processed = 0; knearest_search->_all_ids = all_ids; knearest_search->_all_scores = all_scores; 
  return CHEMFP_OK;

 malloc_failure:
  if (all_scores) free(all_scores);
  if (all_ids) free(all_ids);
  if (all_indices) free(all_indices);
  if (heaps) free(heaps);
  return CHEMFP_NO_MEM;
}

static char *new_string(const char *start, const char *end) {
  size_t n = end-start;
  char *s = malloc(n+1);
  if (s) {
    memcpy(s, start, n);
    s[n] = '\0';
  }
  return s;
}

int chemfp_fps_knearest_tanimoto_search_feed(
        chemfp_fps_knearest_search *knearest_search,
        int target_block_len, const char *target_block) {
  int k;
  double score, threshold;
  int num_added = 0;
  char *s;
  const char *line, *next_line, *end, *id_start, *id_end;
  const unsigned char *query_fp;
  chemfp_fps_heap *heap;
  int query_hex_size, query_fp_size, query_storage_size;
  int i, err, retval;

  if (target_block_len == 0 || target_block[target_block_len-1] != '\n')
    return CHEMFP_MISSING_NEWLINE;
  end = target_block+target_block_len;
  threshold = knearest_search->threshold;
  k = knearest_search->k;
  query_fp_size = knearest_search->query_fp_size;
  query_hex_size = query_fp_size * 2;
  query_storage_size = knearest_search->query_storage_size;

  line = target_block;
  while (line < end) {
    err = chemfp_fps_find_id(query_hex_size, line, &id_start, &id_end);
    if (err < 0) {
      retval = err;
      goto finish;
    }
    next_line = chemfp_to_next_line(id_end);
    query_fp = knearest_search->query_start;
    heap = knearest_search->heaps;
    for (i=0; i<knearest_search->num_queries; i++, query_fp += query_storage_size, heap++) {
      switch(heap->heap_state) {
      case ADD_TO_HEAP:
        score = chemfp_byte_hex_tanimoto(query_fp_size, query_fp, line);
        if (score >= threshold) {
          heap->scores[heap->size] = score;
          s = new_string(id_start, id_end);
          if (!s) {
            retval = CHEMFP_NO_MEM;
            goto finish;
          }
          heap->ids[heap->size] = s;
          heap->size++;
        }
        if (heap->size == k) {
          chemfp_heapq_heapify(k, (void *)heap,
                               (chemfp_heapq_lt) fps_heap_lt,
                               (chemfp_heapq_swap) fps_heap_swap);
          heap->heap_state = REPLACE_IN_HEAP;
        }
        break;
      case REPLACE_IN_HEAP:
        score = chemfp_byte_hex_tanimoto(query_fp_size, query_fp, line);
        if (score > heap->scores[0]) {
          heap->scores[0] = score;
          free(heap->ids[0]);
          s = new_string(id_start, id_end);
          if (!s) {
            retval = CHEMFP_NO_MEM;
            goto finish;
          }
          heap->ids[0] = s;
          chemfp_heapq_siftup(k, (void *) heap, 0,
                              (chemfp_heapq_lt) fps_heap_lt,
                              (chemfp_heapq_swap) fps_heap_swap);
          if (heap->scores[0] == 1.0) {
            heap->heap_state = MAXED_OUT_HEAP;
          }
        }
        break;
      case MAXED_OUT_HEAP:
        continue;
      default:
        return -1; /* Not possible */
      }
    }
    line = next_line;
    num_added++;
  }
  retval = CHEMFP_OK;

 finish:
  knearest_search->num_targets_processed += num_added;
  return retval;
}

void chemfp_fps_knearest_search_free(chemfp_fps_knearest_search *knearest_search) {
  free(knearest_search->_all_scores);
  free(knearest_search->_all_ids);
  free(knearest_search->heaps);
}

void chemfp_fps_knearest_search_finish(chemfp_fps_knearest_search *knearest_search) {
  int i;
  chemfp_fps_heap *heap;
  if (knearest_search->search_state == 1) {
    return;
  }
  knearest_search->search_state = 1;
  for (i=0; i<knearest_search->num_queries; i++) {
    heap = knearest_search->heaps+i;
    if (heap->size < knearest_search->k) {
      chemfp_heapq_heapify(heap->size, (void *)heap,
                           (chemfp_heapq_lt) fps_heap_lt,
                           (chemfp_heapq_swap) fps_heap_swap);
    }
    chemfp_heapq_heapsort(heap->size, (void *)heap,
                          (chemfp_heapq_lt) fps_heap_lt,
                          (chemfp_heapq_swap) fps_heap_swap);
  }
}
chemfp-1.1p1/src/heapq.c0000644000077000000240000000640411665272234015322 0ustar dalkestaff00000000000000
/* Low-level heap commands */
/* These are private internal functions used by the rest of the chemfp code */
/* They are not part of the public API */

#include "heapq.h"

/* This code is derived from Python's _heapqmodule.c

   Heritage from _heapqmodule.c

   Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
   Python Software Foundation; All Rights Reserved

   C implementation derived directly from heapq.py in Py2.3
   which was written by Kevin O'Connor, augmented by Tim Peters,
   annotated by François Pinard, and converted to C by Raymond Hettinger.

   I'm only using a few of the functions from that module.
For those, I stripped out the dependencies on Python's data structures and changed it to take user-defined comparison and swap functions. Those are gloss; the code hasn't really changed at all. */ int chemfp_heapq_siftdown(int len, void *heap, int startpos, int pos, chemfp_heapq_lt lt, chemfp_heapq_swap swap) { int parentpos, cmp; /* unused parameter */ (void)(len); /* Follow the path to the root, moving parents down until finding a place newitem fits. */ while (pos > startpos) { parentpos = (pos-1) >> 1; cmp = lt(heap, pos, parentpos); if (cmp == -1) return -1; if (cmp == 0) break; swap(heap, pos, parentpos); pos = parentpos; } return 0; } int chemfp_heapq_siftup(int len, void *heap, int pos, chemfp_heapq_lt lt, chemfp_heapq_swap swap) { int endpos = len; int startpos = pos; int cmp, rightpos, childpos; /* Bubble up the smaller child until hitting a leaf. */ childpos = 2*pos + 1; while (childpos < endpos) { /* Set childpos to index of smaller child. */ rightpos = childpos + 1; if (rightpos < endpos) { cmp = lt(heap, childpos, rightpos); if (cmp == -1) return -1; if (cmp == 0) childpos = rightpos; } /* Move the smaller child up. */ swap(heap, pos, childpos); pos = childpos; childpos = 2*pos + 1; } /* The item at pos contains the original 'pos' item. Bubble it back up to its final resting place (by sifting its parents down). */ return chemfp_heapq_siftdown(len, heap, startpos, pos, lt, swap); } int chemfp_heapq_heapify(int len, void *heap, chemfp_heapq_lt lt, chemfp_heapq_swap swap) { int i; /* Transform bottom-up. The largest index there's any point to looking at is the largest with a child index in-range, so must have 2*i + 1 < n, or i < (n-1)/2. If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so j-1 is the largest, which is n//2 - 1. If n is odd = 2*j+1, this is (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1. 
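     As a quick check of that bound: with n = 6 the last index that
     has a child is 2 (its left child is index 5), and 6/2 - 1 = 2.
     With n = 7 the last such index is again 2 (children 5 and 6),
     and 7/2 - 1 = 2 under integer division.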
*/ for (i=len/2-1; i>=0; i--) { if (chemfp_heapq_siftup(len, heap, i, lt, swap) == -1) return -1; } return 0; } /* Put the heap into sorted order. The code must already be heapified. */ /* Details at http://en.wikipedia.org/wiki/Heapsort */ int chemfp_heapq_heapsort(int len, void *heap, chemfp_heapq_lt lt, chemfp_heapq_swap swap) { int end; if (len == 0) return 0; for (end = len-1; end>0; end--) { swap(heap, 0, end); if (chemfp_heapq_siftup(end, heap, 0, lt, swap) == -1) return -1; } return 0; } chemfp-1.1p1/src/heapq.h0000644000077000000240000000167711660452123015326 0ustar dalkestaff00000000000000/**** Low-level heap operations, for the best-of-N algorithms ****/ /* These are internal data types and functions. While they may */ /* be available in the library, do not call them directly. */ /* Compare two items in the heap. Return -1 on error, 1 for lt, otherwise 0 */ typedef int (*chemfp_heapq_lt)(void *data, int i, int j); /* Swap two items in the heap. This function must never fail. */ typedef void (*chemfp_heapq_swap)(void *data, int i, int j); /* Call after replacing the first element in a heapified list */ int chemfp_heapq_siftup(int len, void *heap, int pos, chemfp_heapq_lt lt, chemfp_heapq_swap swap); /* Convert the un-ordered list into a heap */ int chemfp_heapq_heapify(int len, void *heap, chemfp_heapq_lt lt, chemfp_heapq_swap swap); /* Must heapify first */ int chemfp_heapq_heapsort(int len, void *heap, chemfp_heapq_lt lt, chemfp_heapq_swap swap); chemfp-1.1p1/src/hits.c0000644000077000000240000005476712071450775015212 0ustar dalkestaff00000000000000#include #include #include #include "chemfp.h" #include "chemfp_internal.h" /****************************** Start of TimSort code ************************/ /* The following was modified to support sorting two parallel arrays. I also removed the macros. 
*/ /* License All code [in the repository at https://github.com/swenson/sort], unless otherwise specified, is hereby licensed under the MIT Public License: Copyright (c) 2010 Christopher Swenson Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Modified by Andrew Dalke. I extracted the TimSort code and modified it to work with two parallel arrays. 
*/ #include #include #include #include #define SORT_TYPE1 int #define SORT_TYPE2 double typedef int (*hit_compare_func)(SORT_TYPE1 left_index, SORT_TYPE1 right_index, SORT_TYPE2 left_score, SORT_TYPE2 right_score); #define SORT_CMP(x1, y1, x2, y2) hit_compare(x1, y1, x2, y2) #ifndef CLZ #ifdef __GNUC__ #define CLZ __builtin_clzll #else /* adapted from Hacker's Delight */ static int clzll(uint64_t x) { int n; if (x == 0) return(64); n = 0; if (x <= 0x00000000FFFFFFFFL) {n = n + 32; x = x << 32;} if (x <= 0x0000FFFFFFFFFFFFL) {n = n + 16; x = x << 16;} if (x <= 0x00FFFFFFFFFFFFFFL) {n = n + 8; x = x << 8;} if (x <= 0x0FFFFFFFFFFFFFFFL) {n = n + 4; x = x << 4;} if (x <= 0x3FFFFFFFFFFFFFFFL) {n = n + 2; x = x << 2;} if (x <= 0x7FFFFFFFFFFFFFFFL) {n = n + 1;} return n; } #define CLZ clzll #endif #endif #define SORT_SWAP1(x,y) ({SORT_TYPE1 __SORT_SWAP_t = (x); (x) = (y); (y) = __SORT_SWAP_t;}) #define SORT_SWAP2(x,y) ({SORT_TYPE2 __SORT_SWAP_t = (x); (x) = (y); (y) = __SORT_SWAP_t;}) #ifndef MAX #define MAX(x,y) (((x) > (y) ? (x) : (y))) #endif #ifndef MIN #define MIN(x,y) (((x) < (y) ? 
(x) : (y))) #endif typedef struct { int64_t start; int64_t length; } hits_tim_sort_run_t; void hits_binary_insertion_sort(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t size, hit_compare_func hit_compare); void hits_tim_sort(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t size, hit_compare_func hit_compare); /* Function used to do a binary search for binary insertion sort */ static inline int64_t binary_insertion_find(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const SORT_TYPE1 x1, const SORT_TYPE2 x2, const size_t size, hit_compare_func hit_compare) { int64_t l, c, r; l = 0; r = size - 1; c = r >> 1; SORT_TYPE1 lx1, cx1, rx1; SORT_TYPE2 lx2, cx2, rx2; lx1 = dst1[l]; lx2 = dst2[l]; /* check for beginning conditions */ if (SORT_CMP(x1, lx1, x2, lx2) < 0) return 0; else if (SORT_CMP(x1, lx1, x2, lx2) == 0) { int64_t i = 1; while (SORT_CMP(x1, dst1[i], x2, dst2[i]) == 0) i++; return i; } rx1 = dst1[r]; rx2 = dst2[r]; /* guaranteed not to be >= rx */ cx1 = dst1[c]; cx2 = dst2[c]; while (1) { const int val = SORT_CMP(x1, cx1, x2, cx2); if (val < 0) { if (c - l <= 1) return c; r = c; rx1 = cx1; rx2 = cx2; } else if (val > 0) { if (r - c <= 1) return c + 1; l = c; } else { do { ++c; cx1 = dst1[c]; cx2 = dst2[c]; } while (SORT_CMP(x1, cx1, x2, cx2) == 0); return c; } c = l + ((r - l) >> 1); cx1 = dst1[c]; cx2 = dst2[c]; } } /* Binary insertion sort, but knowing that the first "start" entries are sorted. Used in timsort. 
*/ static inline void binary_insertion_sort_start(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t start, const size_t size, hit_compare_func hit_compare) { int64_t i; for (i = start; i < size; i++) { int64_t j; /* If this entry is already correct, just move along */ if (SORT_CMP(dst1[i - 1], dst1[i], dst2[i - 1], dst2[i]) <= 0) continue; /* Else we need to find the right place, shift everything over, and squeeze in */ SORT_TYPE1 x1 = dst1[i]; SORT_TYPE2 x2 = dst2[i]; int64_t location = binary_insertion_find(dst1, dst2, x1, x2, i, hit_compare); for (j = i - 1; j >= location; j--) { dst1[j + 1] = dst1[j]; dst2[j + 1] = dst2[j]; } dst1[location] = x1; dst2[location] = x2; } } /* Binary insertion sort */ void hits_binary_insertion_sort(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t size, hit_compare_func hit_compare) { binary_insertion_sort_start(dst1, dst2, 1, size, hit_compare); } /* timsort implementation, based on timsort.txt */ static inline void reverse_elements(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, int64_t start, int64_t end) { while (1) { if (start >= end) return; SORT_SWAP1(dst1[start], dst1[end]); SORT_SWAP2(dst2[start], dst2[end]); start++; end--; } } static inline int64_t count_run(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const int64_t start, const size_t size, hit_compare_func hit_compare) { if (size - start == 1) return 1; if (start >= size - 2) { if (SORT_CMP(dst1[size - 2], dst1[size - 1], dst2[size - 2], dst2[size - 1]) > 0) { SORT_SWAP1(dst1[size - 2], dst1[size - 1]); SORT_SWAP2(dst2[size - 2], dst2[size - 1]); } return 2; } int64_t curr = start + 2; if (SORT_CMP(dst1[start], dst1[start + 1], dst2[start], dst2[start + 1]) <= 0) { /* increasing run */ while (1) { if (curr == size - 1) break; if (SORT_CMP(dst1[curr - 1], dst1[curr], dst2[curr - 1], dst2[curr]) > 0) break; curr++; } return curr - start; } else { /* decreasing run */ while (1) { if (curr == size - 1) break; if (SORT_CMP(dst1[curr - 1], dst1[curr], dst2[curr - 1], dst2[curr]) <= 0) break; 
curr++; } /* reverse in-place */ reverse_elements(dst1, dst2, start, curr - 1); return curr - start; } } static inline int compute_minrun(const uint64_t size) { const int top_bit = 64 - CLZ(size); const int shift = MAX(top_bit, 6) - 6; const int minrun = size >> shift; const uint64_t mask = (1ULL << shift) - 1; if (mask & size) return minrun + 1; return minrun; } #define PUSH_NEXT() do {\ len = count_run(dst1, dst2, curr, size, hit_compare); \ run = minrun;\ if (run < minrun) run = minrun;\ if (run > size - curr) run = size - curr;\ if (run > len)\ {\ binary_insertion_sort_start(&dst1[curr], &dst2[curr], len, run, hit_compare); \ len = run;\ }\ run_stack[stack_curr++] = (hits_tim_sort_run_t) {curr, len};\ curr += len;\ if (curr == size)\ {\ /* finish up */ \ while (stack_curr > 1) \ { \ tim_sort_merge(dst1, dst2, run_stack, stack_curr, store, hit_compare); \ run_stack[stack_curr - 2].length += run_stack[stack_curr - 1].length; \ stack_curr--; \ } \ if (store->storage1 != NULL)\ {\ free(store->storage1);\ store->storage1 = NULL;\ free(store->storage2);\ store->storage2 = NULL;\ }\ return;\ }\ }\ while (0) static inline int check_invariant(hits_tim_sort_run_t *stack, const int stack_curr) { if (stack_curr < 2) return 1; if (stack_curr == 2) { const int64_t A = stack[stack_curr - 2].length; const int64_t B = stack[stack_curr - 1].length; if (A <= B) return 0; return 1; } const int64_t A = stack[stack_curr - 3].length; const int64_t B = stack[stack_curr - 2].length; const int64_t C = stack[stack_curr - 1].length; if ((A <= B + C) || (B <= C)) return 0; return 1; } typedef struct { size_t alloc; SORT_TYPE1 *storage1; SORT_TYPE2 *storage2; } hits_temp_storage_t; static inline void tim_sort_resize(hits_temp_storage_t *store, const size_t new_size) { if (store->alloc < new_size) { SORT_TYPE1 *tempstore1 = realloc(store->storage1, new_size * sizeof(SORT_TYPE1)); SORT_TYPE2 *tempstore2 = realloc(store->storage2, new_size * sizeof(SORT_TYPE2)); if (tempstore1 == NULL) { 
fprintf(stderr, "Error allocating temporary storage for tim sort: need %lu bytes", sizeof(SORT_TYPE1) * new_size); exit(1); } if (tempstore2 == NULL) { fprintf(stderr, "Error allocating temporary storage for tim sort: need %lu bytes", sizeof(SORT_TYPE2) * new_size); exit(1); } store->storage1 = tempstore1; store->storage2 = tempstore2; store->alloc = new_size; } } static inline void tim_sort_merge(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const hits_tim_sort_run_t *stack, const int stack_curr, hits_temp_storage_t *store, hit_compare_func hit_compare) { const int64_t A = stack[stack_curr - 2].length; const int64_t B = stack[stack_curr - 1].length; const int64_t curr = stack[stack_curr - 2].start; tim_sort_resize(store, MIN(A, B)); SORT_TYPE1 *storage1 = store->storage1; SORT_TYPE2 *storage2 = store->storage2; int64_t i, j, k; /* left merge */ if (A < B) { memcpy(storage1, &dst1[curr], A * sizeof(SORT_TYPE1)); memcpy(storage2, &dst2[curr], A * sizeof(SORT_TYPE2)); i = 0; j = curr + A; for (k = curr; k < curr + A + B; k++) { if ((i < A) && (j < curr + A + B)) { if (SORT_CMP(storage1[i], dst1[j], storage2[i], dst2[j]) <= 0) { dst1[k] = storage1[i]; dst2[k] = storage2[i++]; } else { dst1[k] = dst1[j]; dst2[k] = dst2[j++]; } } else if (i < A) { dst1[k] = storage1[i]; dst2[k] = storage2[i++]; } else { dst1[k] = dst1[j]; dst2[k] = dst2[j++]; } } } /* right merge */ else { memcpy(storage1, &dst1[curr + A], B * sizeof(SORT_TYPE1)); memcpy(storage2, &dst2[curr + A], B * sizeof(SORT_TYPE2)); i = B - 1; j = curr + A - 1; for (k = curr + A + B - 1; k >= curr; k--) { if ((i >= 0) && (j >= curr)) { if (SORT_CMP(dst1[j], storage1[i], dst2[j], storage2[i]) > 0) { dst1[k] = dst1[j]; dst2[k] = dst2[j--]; } else { dst1[k] = storage1[i]; dst2[k] = storage2[i--]; } } else if (i >= 0) { dst1[k] = storage1[i]; dst2[k] = storage2[i--]; } else { dst1[k] = dst1[j]; dst2[k] = dst2[j--]; } } } } static inline int tim_sort_collapse(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, hits_tim_sort_run_t *stack, int 
stack_curr, hits_temp_storage_t *store, const size_t size, hit_compare_func hit_compare) { while (1) { /* if the stack only has one thing on it, we are done with the collapse */ if (stack_curr <= 1) break; /* if this is the last merge, just do it */ if ((stack_curr == 2) && (stack[0].length + stack[1].length == size)) { tim_sort_merge(dst1, dst2, stack, stack_curr, store, hit_compare); stack[0].length += stack[1].length; stack_curr--; break; } /* check if the invariant is off for a stack of 2 elements */ else if ((stack_curr == 2) && (stack[0].length <= stack[1].length)) { tim_sort_merge(dst1, dst2, stack, stack_curr, store, hit_compare); stack[0].length += stack[1].length; stack_curr--; break; } else if (stack_curr == 2) break; const int64_t A = stack[stack_curr - 3].length; const int64_t B = stack[stack_curr - 2].length; const int64_t C = stack[stack_curr - 1].length; /* check first invariant */ if (A <= B + C) { if (A < C) { tim_sort_merge(dst1, dst2, stack, stack_curr - 1, store, hit_compare); stack[stack_curr - 3].length += stack[stack_curr - 2].length; stack[stack_curr - 2] = stack[stack_curr - 1]; stack_curr--; } else { tim_sort_merge(dst1, dst2, stack, stack_curr, store, hit_compare); stack[stack_curr - 2].length += stack[stack_curr - 1].length; stack_curr--; } } /* check second invariant */ else if (B <= C) { tim_sort_merge(dst1, dst2, stack, stack_curr, store, hit_compare); stack[stack_curr - 2].length += stack[stack_curr - 1].length; stack_curr--; } else break; } return stack_curr; } void hits_tim_sort(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t size, hit_compare_func hit_compare) { if (size < 64) { hits_binary_insertion_sort(dst1, dst2, size, hit_compare); return; } /* compute the minimum run length */ const int minrun = compute_minrun(size); /* temporary storage for merges */ hits_temp_storage_t _store, *store = &_store; store->alloc = 0; store->storage1 = NULL; store->storage2 = NULL; hits_tim_sort_run_t run_stack[128]; int stack_curr = 0; 
int64_t len, run; int64_t curr = 0; PUSH_NEXT(); PUSH_NEXT(); PUSH_NEXT(); while (1) { if (!check_invariant(run_stack, stack_curr)) { stack_curr = tim_sort_collapse(dst1, dst2, run_stack, stack_curr, store, size, hit_compare); continue; } PUSH_NEXT(); } } /* heap sort: based on wikipedia */ static inline void heap_sift_down(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const int64_t start, const int64_t end, hit_compare_func hit_compare) { int64_t root = start; while ((root << 1) <= end) { int64_t child = root << 1; if ((child < end) && (SORT_CMP(dst1[child], dst1[child + 1], dst2[child], dst2[child + 1]) < 0)) child++; if (SORT_CMP(dst1[root], dst1[child], dst2[root], dst2[child]) < 0) { SORT_SWAP1(dst1[root], dst1[child]); SORT_SWAP2(dst2[root], dst2[child]); root = child; } else return; } } static inline void heapify(SORT_TYPE1 *dst1, SORT_TYPE2 *dst2, const size_t size, hit_compare_func hit_compare) { int64_t start = size >> 1; while (start >= 0) { heap_sift_down(dst1, dst2, start, size - 1, hit_compare); start--; } } /****************************** End of TimSort code ************************/ chemfp_search_result *chemfp_alloc_search_results(int size) { /* Initializes all of the counts to 0 (and makes everything else nice too) */ return (chemfp_search_result *) calloc(size, sizeof(chemfp_search_result)); } void chemfp_free_results(int num_results, chemfp_search_result *results) { int i; for (i=0; inum_hits; } int chemfp_add_hit(chemfp_search_result *result, int target_index, double score) { int num_hits = result->num_hits; int num_allocated = result->num_allocated; int *indices, *old_indices; double *scores; if (num_hits == num_allocated) { if (num_hits == 0) { num_allocated = 6; scores = (double *) malloc(num_allocated * (sizeof(int)+sizeof(double))); if (!scores) { return 0; } indices = (int *) (scores + num_allocated); result->num_allocated = num_allocated; result->indices = indices; result->scores = scores; } else { /* Grow by about 12% each time; this is the 
Python listobject resize strategy */ num_allocated += (num_allocated >> 3) + (num_allocated < 9 ? 3 : 6); scores = (double *) realloc(result->scores, num_allocated * (sizeof(int)+sizeof(double))); if (!scores) { return 0; } /* Shift the indices to its new location */ old_indices = (int *) (scores + num_hits); indices = (int *) (scores + num_allocated); memmove(indices, old_indices, num_hits*sizeof(int)); result->num_allocated = num_allocated; result->indices = indices; result->scores = scores; } } else { indices = result->indices; scores = result->scores; } indices[num_hits] = target_index; scores[num_hits] = score; result->num_hits = num_hits+1; return 1; } int chemfp_fill_lower_triangle(int n, chemfp_search_result *results) { int i, j; int *sizes = (int *) malloc(n * sizeof(int)); int retval; int *counts = (int *) malloc(n * sizeof(int)); int num_hits, num_allocated; double *scores; int *old_indices, *indices; chemfp_search_result *result; if (!sizes) { return CHEMFP_NO_MEM; } /* Save all of the count information */ for (i=0; inum_hits + counts[i]; if (result->num_hits + counts[i] > result->num_allocated) { if (result->num_allocated == 0) { scores = (double *) malloc(num_allocated * (sizeof(int)+sizeof(double))); if (!scores) { return CHEMFP_NO_MEM; } indices = (int *) (scores + num_allocated); } else { num_hits = result->num_hits; scores = (double *) realloc(result->scores, num_allocated * (sizeof(int)+sizeof(double))); if (!scores) { return CHEMFP_NO_MEM; } old_indices = (int *) (scores + result->num_allocated); indices = (int *) (scores + num_allocated); memmove(indices, old_indices, num_hits*sizeof(int)); } result->num_allocated = num_allocated; result->indices = indices; result->scores = scores; } } retval = CHEMFP_OK; for (i=0; i score2) { return -1; } if (index1 < index2) { return -1; } if (index1 > index2) { return 1; } return 0; } static int compare_increasing_score(int index1, int index2, double score1, double score2) { if (score1 < score2) { return -1; 
  }
  if (score1 > score2) {
    return 1;
  }
  if (index1 < index2) {
    return -1;
  }
  if (index1 > index2) {
    return 1;
  }
  return 0;
}

static int compare_increasing_index(int index1, int index2, double score1, double score2) {
  if (index1 < index2) {
    return -1;
  }
  if (index1 == index2) {
    return 0;
  }
  return 1;
}

static int compare_decreasing_index(int index1, int index2, double score1, double score2) {
  if (index1 < index2) {
    return 1;
  }
  if (index1 == index2) {
    return 0;
  }
  return -1;
}

static void move_closest_first(int num_hits, int *indices, double *scores) {
  int i, max_i;
  double max_score;
  int index;
  /* Don't need to check. The caller only calls if there is more than one element */
  /* if (num_hits <= 1) { return; }*/
  max_i = 0;
  max_score = scores[0];
  for (i=1; i<num_hits; i++) {
    if (scores[i] > max_score) {
      max_score = scores[i];
      max_i = i;
    }
  }
  /* Found the best score. Swap it with position 0 */
  if (max_i != 0) {
    index = indices[max_i];
    indices[max_i] = indices[0];
    indices[0] = index;
    scores[max_i] = scores[0];
    scores[0] = max_score;
  }
}

static void reverse(int num_hits, int *indices, double *scores) {
  int i = 0;
  int j = num_hits-1;
  int tmp_index;
  double tmp_score;
  while (i < j) {
    tmp_index = indices[i];
    tmp_score = scores[i];
    indices[i] = indices[j];
    scores[i] = scores[j];
    indices[j] = tmp_index;
    scores[j] = tmp_score;
    i++;
    j--;
  }
}

reorder_method_t reorder_methods[] = {
  {"increasing-score", compare_increasing_score, NULL},
  {"decreasing-score", compare_decreasing_score, NULL},
  {"increasing-index", compare_increasing_index, NULL},
  {"decreasing-index", compare_decreasing_index, NULL},
  {"move-closest-first", NULL, move_closest_first},
  {"reverse", NULL, reverse},
  {NULL}
};

static reorder_method_t *chemfp_get_reorder_method(const char *name) {
  int i=0;
  while (reorder_methods[i].name != NULL) {
    if (!strcmp(name, reorder_methods[i].name)) {
      return &reorder_methods[i];
    }
    i++;
  }
  return NULL;
}

int chemfp_search_results_reorder(int num_results, chemfp_search_result *results,
                                  const char *ordering) {
  int i, num_hits;
  reorder_method_t *reorder_method = chemfp_get_reorder_method(ordering);
  if (reorder_method == NULL) {
    return CHEMFP_UNKNOWN_ORDERING;
  }
  if (reorder_method->reorder) {
    for (i=0; i<num_results; i++) {
      num_hits = results[i].num_hits;
      if (num_hits > 1) {
        reorder_method->reorder(num_hits, results[i].indices, results[i].scores);
      }
    }
  } else {
    for (i=0; i<num_results; i++) {
      num_hits = results[i].num_hits;
      if (num_hits > 1) {
        hits_tim_sort(results[i].indices, results[i].scores, num_hits,
                      reorder_method->hit_compare);
      }
    }
  }
  return CHEMFP_OK;
}

int chemfp_search_result_reorder(chemfp_search_result *result, const char *ordering) {
  int num_hits;
  reorder_method_t *reorder_method = chemfp_get_reorder_method(ordering);
  if (reorder_method == NULL) {
    return CHEMFP_UNKNOWN_ORDERING;
  }
  num_hits = result->num_hits;
  if (num_hits > 1) {
    if (reorder_method->reorder) {
      reorder_method->reorder(num_hits, result->indices, result->scores);
    } else {
      hits_tim_sort(result->indices, result->scores, num_hits,
                    reorder_method->hit_compare);
    }
  }
  return CHEMFP_OK;
}

void chemfp_search_result_clear(chemfp_search_result *result) {
  if (result->num_hits != 0) {
    result->num_hits=0;
    free(result->scores);
    result->scores = NULL;
    result->indices = NULL;
  }
}
chemfp-1.1p1/src/popcount.h0000644000077000000240000000353512055226640016074 0ustar dalkestaff00000000000000
#if !defined(POPCOUNT_H)
#define POPCOUNT_H

#include <stdint.h>
#include "chemfp.h"
#include "chemfp_internal.h"

enum {
  CHEMFP_ALIGN1=0,
  CHEMFP_ALIGN4,
  CHEMFP_ALIGN8_SMALL,
  CHEMFP_ALIGN8_LARGE,
  CHEMFP_ALIGN_SSSE3
};

/* These are in the same order as compile_time_methods */
enum {
  CHEMFP_LUT8_1=0,
  CHEMFP_LUT8_4,
  CHEMFP_LUT16_4,
  CHEMFP_LAURADOUX,
  CHEMFP_POPCNT,
  CHEMFP_GILLIES,
  CHEMFP_SSSE3
};

typedef int (*chemfp_method_check_f)(void);

typedef struct {
  int detected_index;
  int id;
  const char *name;
  int alignment;
  int min_size;
  chemfp_method_check_f check;
  chemfp_popcount_f popcount;
  chemfp_intersect_popcount_f intersect_popcount;
} chemfp_method_type;

typedef struct {
  const char *name;
  int alignment;
  int min_size;
  chemfp_method_type *method_p;
} chemfp_alignment_type;

extern chemfp_alignment_type
chemfp_alignments[]; int chemfp_popcount_lut8_1(int n, const unsigned char *fp); int chemfp_intersect_popcount_lut8_1(int n, const unsigned char *fp1, const unsigned char *fp2); int chemfp_popcount_lut8_4(int n, uint32_t *fp); int chemfp_intersect_popcount_lut8_4(int n, uint32_t *fp1, uint32_t *fp2); int chemfp_popcount_lut16_4(int n, uint32_t *fp); int chemfp_intersect_popcount_lut16_4(int n, uint32_t *fp1, uint32_t *fp2); int chemfp_popcount_gillies(int n, uint64_t *fp); int chemfp_intersect_popcount_gillies(int n, uint64_t *fp1, uint64_t *fp2); int chemfp_popcount_lauradoux(int size, const uint64_t *fp); int chemfp_intersect_popcount_lauradoux(int size, const uint64_t *fp1, const uint64_t *fp2); int chemfp_popcount_popcnt(int size, const uint64_t *fp); int chemfp_intersect_popcount_popcnt(int size, const uint64_t *fp1, const uint64_t *fp2); int chemfp_popcount_SSSE3(int, const unsigned*); int chemfp_intersect_popcount_SSSE3(int, const unsigned*, const unsigned*); int chemfp_has_ssse3(void); #endif chemfp-1.1p1/src/popcount_gillies.c0000644000077000000240000000334212055226640017573 0ustar dalkestaff00000000000000#include "popcount.h" /* Quoting from Knuth, Fascicle 1, The first textbook on programming, "The Preparation of Programs for an Electronic Digital Computer" by Wilkes, Wheeler, and Gill, second edition (Reading, Mass.: Addison-Wesley, 1957), 155, 191-193, presented an interesting subroutine for sideways addition due to D. B. Gillies and J. C. P. Miller. Their method was devised for the 35-bit numbers of the EDSAC, but it is readily converted to the following 64-bit procedure... What follows is essentially this code, which is in Wikipedia http://en.wikipedia.org/wiki/Hamming_weight as "popcount_3". 
*/
int chemfp_popcount_gillies(int n, uint64_t *fp) {
  const uint64_t m1 = UINT64_C(0x5555555555555555);
  const uint64_t m2 = UINT64_C(0x3333333333333333);
  const uint64_t m4 = UINT64_C(0x0F0F0F0F0F0F0F0F);
  const uint64_t h01 = UINT64_C(0x0101010101010101);
  int bit_count = 0, i;
  int size = (n+7) / 8;
  uint64_t x;
  for (i=0; i<size; i++) {
    x = fp[i];
    x = x - ((x >> 1) & m1);
    x = (x & m2) + ((x >> 2) & m2);
    x = (x + (x >> 4)) & m4;
    bit_count += (int) ((x * h01) >> 56);
  }
  return bit_count;
}

int chemfp_intersect_popcount_gillies(int n, uint64_t *fp1, uint64_t *fp2) {
  const uint64_t m1 = UINT64_C(0x5555555555555555);
  const uint64_t m2 = UINT64_C(0x3333333333333333);
  const uint64_t m4 = UINT64_C(0x0F0F0F0F0F0F0F0F);
  const uint64_t h01 = UINT64_C(0x0101010101010101);
  int bit_count = 0, i;
  int size = (n+7) / 8;
  uint64_t x;
  for (i=0; i<size; i++) {
    x = fp1[i] & fp2[i];
    x = x - ((x >> 1) & m1);
    x = (x & m2) + ((x >> 2) & m2);
    x = (x + (x >> 4)) & m4;
    bit_count += (int) ((x * h01) >> 56);
  }
  return bit_count;
}
chemfp-1.1p1/src/popcount_lauradoux.c0000644000077000000240000001236212055226640020151 0ustar dalkestaff00000000000000
/**
 * @brief   Contains fast and portable bit population count functions
 *          for molecular fingerprints.
 * @author  Kim Walisch
 * @version 1.1
 * @date    2011
 */

/** Enable the UINT64_C(c) macro from <stdint.h>. */
#if !defined(__STDC_CONSTANT_MACROS)
#  define __STDC_CONSTANT_MACROS
#endif

#include <stdint.h>
#include "popcount.h"

/**
 * Count the number of 1 bits (population count) in a fingerprint
 * using 64-bit tree merging. This implementation uses only 8
 * operations per 8 bytes on 64-bit CPUs.
* The algorithm is due to Cédric Lauradoux, it is described and * benchmarked against other bit population count solutions (lookup * tables, bit-slicing) in his paper: * http://perso.citi.insa-lyon.fr/claurado/ham/overview.pdf * http://perso.citi.insa-lyon.fr/claurado/hamming.html */ int chemfp_popcount_lauradoux(int byte_size, const uint64_t *fp) { const uint64_t m1 = UINT64_C(0x5555555555555555); const uint64_t m2 = UINT64_C(0x3333333333333333); const uint64_t m4 = UINT64_C(0x0F0F0F0F0F0F0F0F); const uint64_t m8 = UINT64_C(0x00FF00FF00FF00FF); const uint64_t m16 = UINT64_C(0x0000FFFF0000FFFF); const uint64_t h01 = UINT64_C(0x0101010101010101); uint64_t count1, count2, half1, half2, acc; uint64_t x; int i, j; int size = (byte_size + 7) / 8; int limit = size - size % 12; int bit_count = 0; /* The main outer loop processes 12*8 = 96 bytes per iteration (previously 240 bytes). This makes the popcount more efficient for small fingerprints e.g. 881 bits. */ for (i = 0; i < limit; i += 12, fp += 12) { acc = 0; for (j = 0; j < 12; j += 3) { count1 = fp[j + 0]; count2 = fp[j + 1]; half1 = fp[j + 2]; half2 = fp[j + 2]; half1 &= m1; half2 = (half2 >> 1) & m1; count1 -= (count1 >> 1) & m1; count2 -= (count2 >> 1) & m1; count1 += half1; count2 += half2; count1 = (count1 & m2) + ((count1 >> 2) & m2); count1 += (count2 & m2) + ((count2 >> 2) & m2); acc += (count1 & m4) + ((count1 >> 4) & m4); } acc = (acc & m8) + ((acc >> 8) & m8); acc = (acc + (acc >> 16)) & m16; acc = acc + (acc >> 32); bit_count += (int) acc; } /* Count the bits of the remaining bytes (MAX 88) using "Counting bits set, in parallel" from the "Bit Twiddling Hacks", the code uses wikipedia's 64-bit popcount_3() implementation: http://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation */ /* Note: This is the "Gillies" algorithm, and timing tests show that it's more effective to put it here than to call the method tied to CHEMFP_ALIGN8_SMALL. 
     You might think it best to use the fastest algorithm, but if you
     are using Lauradoux then you are on a machine which doesn't have
     POPCNT and where Lauradoux is faster than the lookup table.
     That's the same type of machine where Gillies is also faster than
     a lookup table.

     Calling it here instead of through the function table saves
     time. It's 0.5% faster on my Mac (with gcc) and 5% faster on a
     Windows box (with msvc 10). */
  for (i = 0; i < size - limit; i++) {
    x = fp[i];
    x = x - ((x >> 1) & m1);
    x = (x & m2) + ((x >> 2) & m2);
    x = (x + (x >> 4)) & m4;
    bit_count += (int) ((x * h01) >> 56);
  }
  return bit_count;
}

int
chemfp_intersect_popcount_lauradoux(int byte_size,
                                    const uint64_t *fp1, const uint64_t *fp2) {
  const uint64_t m1  = UINT64_C(0x5555555555555555);
  const uint64_t m2  = UINT64_C(0x3333333333333333);
  const uint64_t m4  = UINT64_C(0x0F0F0F0F0F0F0F0F);
  const uint64_t m8  = UINT64_C(0x00FF00FF00FF00FF);
  const uint64_t m16 = UINT64_C(0x0000FFFF0000FFFF);
  const uint64_t h01 = UINT64_C(0x0101010101010101);

  uint64_t count1, count2, half1, half2, acc;
  uint64_t x;
  int i, j;
  int size = (byte_size + 7) / 8;
  int limit = size - size % 12;
  int bit_count = 0;

  /* 64-bit tree merging */
  for (i = 0; i < limit; i += 12, fp1 += 12, fp2 += 12) {
    acc = 0;
    for (j = 0; j < 12; j += 3) {
      count1 = fp1[j + 0] & fp2[j + 0];
      count2 = fp1[j + 1] & fp2[j + 1];
      half1  = fp1[j + 2] & fp2[j + 2];
      half2  = fp1[j + 2] & fp2[j + 2];
      half1 &= m1;
      half2  = (half2 >> 1) & m1;
      count1 -= (count1 >> 1) & m1;
      count2 -= (count2 >> 1) & m1;
      count1 += half1;
      count2 += half2;
      count1 = (count1 & m2) + ((count1 >> 2) & m2);
      count1 += (count2 & m2) + ((count2 >> 2) & m2);
      acc += (count1 & m4) + ((count1 >> 4) & m4);
    }
    acc = (acc & m8) + ((acc >> 8) & m8);
    acc = (acc + (acc >> 16)) & m16;
    acc = acc + (acc >> 32);
    bit_count += (int) acc;
  }

  /* intersect count the bits of the remaining bytes (MAX 88) using
     "Counting bits set, in parallel" from the "Bit Twiddling Hacks",
     the code uses wikipedia's 64-bit popcount_3()
implementation: http://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation */ for (i = 0; i < size - limit; i++) { x = fp1[i] & fp2[i]; x = x - ((x >> 1) & m1); x = (x & m2) + ((x >> 2) & m2); x = (x + (x >> 4)) & m4; bit_count += (int) ((x * h01) >> 56); } return bit_count; } chemfp-1.1p1/src/popcount_lut.c0000644000077000000240000065357712055226641016775 0ustar dalkestaff00000000000000#include "popcount.h" /* 16 bit / 65536 entry lookup table */ static char lut[] = { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 
7, 7, 8, 7, 8, 8, 9, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 
5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 
7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 
7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 
5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 
9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 
5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 
6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 
6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 
6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 
5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 
7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 
5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 
8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 
7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 
8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 
5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 
6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 
8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 
9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 
8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 
6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 
6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 
6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 
6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 
6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 
9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 
9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 
8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 
8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 
9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 
7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 
7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 
9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 
7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 
8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 
4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 
7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 
8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 
6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 
7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 
7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 
6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 
9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 
7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 
6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 
8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 
9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 
8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 
6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 
8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 
6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 
9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 
6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 
8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 
9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 
8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 
7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 
8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 
9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 
6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 
6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 
8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 
8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 
8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 
7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 
9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 
7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 4, 5, 
5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 
9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 
9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 
5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 
9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 
9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 
9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9,10, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 7, 8, 8, 9, 8, 9, 9,10, 8, 9, 9,10, 9,10,10,11, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15, 12,13,13,14,13,14,14,15,13,14,14,15,14,15,15,16 }; int chemfp_popcount_lut8_1(int n, const unsigned char *fp) { int cnt=0; int i; int top = n-n%2; /* I got a 30% performance gain by unrolling */ for (i=0; i>8&255]; cnt += lut[value>>16&255]; cnt += lut[value>>24]; } return cnt; } int chemfp_intersect_popcount_lut8_4(int n, uint32_t *fp1, uint32_t *fp2) { int cnt=0; unsigned int i; n = (n+3) / 4; do { i = *fp1 & *fp2; cnt += lut[i&255]; cnt += 
lut[i>>8&255];
    cnt += lut[i>>16&255];
    cnt += lut[i>>24];
    fp1++;
    fp2++;
  } while(--n);
  return cnt;
}

/* 16 bit LUT assuming I can fetch 4 bytes at a time */
int
chemfp_popcount_lut16_4(int n, uint32_t *fp) {
  int cnt=0;
  unsigned int i;

  /* Also handle cases where the fingerprint length is not a multiple of 4 */
  n = (n+3) / 4;
  do {
    i = *fp;
    cnt += lut[i&65535];
    cnt += lut[i>>16];
    fp++;
  } while(--n);
  return cnt;
}

int
chemfp_intersect_popcount_lut16_4(int n, uint32_t *fp1, uint32_t *fp2) {
  int cnt=0;
  unsigned int i;

  n = (n+3) / 4;
  do {
    i = *fp1 & *fp2;
    cnt += lut[i&65535];
    cnt += lut[i>>16];
    fp1++;
    fp2++;
  } while(--n);
  return cnt;
}
chemfp-1.1p1/src/popcount_popcnt.c0000644000077000000240000000705712071431441017450 0ustar dalkestaff00000000000000
/**
 * @brief   Contains portable popcount functions using the POPCNT
 *          (SSE4.2) instruction for molecular fingerprints.
 * @author  Kim Walisch
 * @version 1.0
 * @date    2011
 *
 * The code within this file has been tested successfully with the
 * following compilers and operating systems:
 *
 * GNU GCC 4.4, Linux i386 & x86-64
 * LLVM clang 2.8, Linux i386 & x86-64
 * Oracle Solaris Studio 12.2, Linux i386 & x86-64
 * Intel C++ Composer XE 2011, Linux i386 & x86-64, Windows 7 64-bit
 * GNU GCC 3.3, VMware Linux i386
 * GNU GCC 4.6, VMware Linux i386 & x86-64
 * Apple llvm-gcc-4.2, Mac OS X 10.7
 * Apple clang version 3.0, Mac OS X 10.7
 * Microsoft Visual Studio 2010, Windows 7 64-bit
 * MinGW-w64 GCC 4.6, Windows 7 64-bit
 */

#include "popcount.h"
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_WIN32) || defined(_WIN64))
#include <nmmintrin.h> /* _mm_popcnt_u32(), _mm_popcnt_u64() */
#endif

/**
 * Convenience functions for the POPCNT instruction.
 */
#if defined(_MSC_VER) && defined(_WIN64)
static uint64_t POPCNT64(uint64_t x) {
  return _mm_popcnt_u64(x);
}
#elif defined(_MSC_VER) && defined(_WIN32)
static uint32_t POPCNT32(uint32_t x) {
  return _mm_popcnt_u32(x);
}
#elif defined(__x86_64__)
static uint64_t POPCNT64(uint64_t x) {
  /* GNU GCC >= 4.2 supports the POPCNT instruction */
  /* APD: Apple's gcc-4.0 supports POPCNT and RHEL5's gcc-4.1 supports POPCNT */
  /* I'll assume that 4.1 is good enough. Is there a better feature test for this? */
#if !defined(__GNUC__) || (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1))
  __asm__ ("popcnt %1, %0" : "=r" (x) : "0" (x));
#endif
  return x;
}
#elif defined(__i386__) || defined(__i386)
static uint32_t POPCNT32(uint32_t x) {
  /* GNU GCC >= 4.2 supports the POPCNT instruction */
#if !defined(__GNUC__) || (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1))
  __asm__ ("popcnt %1, %0" : "=r" (x) : "0" (x));
#endif
  return x;
}
#endif

/**
 * Count the number of bits set in a fingerprint using the
 * POPCNT (SSE4.2) instruction.
 * @warning  Use (get_cpuid_flags() & bit_POPCNT) to test if
 *           the CPU supports the POPCNT instruction.
 */
int chemfp_popcount_popcnt(int size, const uint64_t *fp) {
  int bit_count = 0;
  int i;
#if defined(_WIN64) || defined(__x86_64__)
  /* Round the byte count up to whole 64-bit words */
  size = (size + 7) / 8;
  for (i = 0; i < size; i++)
    bit_count += (int) POPCNT64(fp[i]);
#elif defined(_WIN32) || defined(__i386__) || defined(__i386)
  const uint32_t* fp_32 = (const uint32_t*) fp;
  /* Round the byte count up to whole 32-bit words */
  size = (size + 3) / 4;
  for (i = 0; i < size; i++)
    bit_count += (int) POPCNT32(fp_32[i]);
#else
  UNUSED(size);
  UNUSED(fp);
  i=0;
#endif
  return bit_count;
}

/**
 * Count the number of bits set within the intersection of two
 * fingerprints using the POPCNT (SSE4.2) instruction.
 * @warning  Use (get_cpuid_flags() & bit_POPCNT) to test if
 *           the CPU supports the POPCNT instruction.
*/ int chemfp_intersect_popcount_popcnt(int size, const uint64_t *fp1, const uint64_t *fp2) { int bit_count = 0; int i; #if defined(_WIN64) || defined(__x86_64__) size = (size + 7) / 8; for (i = 0; i < size; i++) bit_count += (int) POPCNT64(fp1[i] & fp2[i]); #elif defined(_WIN32) || defined(__i386__) || defined(__i386) const uint32_t* fp1_32 = (const uint32_t*) fp1; const uint32_t* fp2_32 = (const uint32_t*) fp2; size = (size + 3) / 4; for (i = 0; i < size; i++) bit_count += (int) POPCNT32(fp1_32[i] & fp2_32[i]); #else UNUSED(size); UNUSED(fp1); UNUSED(fp2); i=0; #endif return bit_count; } chemfp-1.1p1/src/popcount_SSSE3.c0000644000077000000240000003070512055226640017006 0ustar dalkestaff00000000000000/* The original version of this code was written by Imran Haque and is in the supplementary information to Haque IS, Pande VS, and Walters WP. Anatomy of High-Performance 2D Similarity Calculations. Journal of Chemical Information and Modeling 2011 See http://cs.stanford.edu/people/ihaque/ Kim Walisch modified the code for use in chemfp and implemented two new chemfp popcount functions for molecular fingerprints. Copyright (c) 2011 Kim Walisch, . License: MIT license. */ /* Written by Imran S. Haque (ihaque@cs.stanford.edu) Copyright (c) 2011 Stanford University. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "popcount.h" #include "cpuid.h" #if defined(_MSC_VER) && (defined(_WIN32) || defined(_WIN64)) #define GENERATE_SSSE3 #elif defined(__i386__) || defined(__i386) || defined(__x86_64__) #if defined(__SSSE3__) || !defined(__GNUC__) #define GENERATE_SSSE3 #endif #endif #if defined(GENERATE_SSSE3) #include #include static __m128i popcount_SSSE3_helper(const unsigned *buf, int N) { /* LUT of count of set bits in each possible 4-bit nibble, from low-to-high: 0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4 */ const __m128i LUT = _mm_set_epi32(0x04030302, 0x03020201, 0x03020201, 0x02010100); const __m128i mask = _mm_set1_epi32(0x0F0F0F0F); const __m128i *vbuf = (__m128i*) buf; __m128i total = _mm_setzero_si128(); __m128i v0, v1, v2, v3; __m128i v0_lo, v1_lo, v2_lo, v3_lo; __m128i v0_hi, v1_hi, v2_hi, v3_hi; __m128i count0, count1, count2, count3; int i; for (i = 0; i + 4 <= N; i += 4) { v0 = _mm_load_si128(vbuf+i+0); v1 = _mm_load_si128(vbuf+i+1); v2 = _mm_load_si128(vbuf+i+2); v3 = _mm_load_si128(vbuf+i+3); /* Split each byte into low and high nibbles */ v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v2_lo = _mm_and_si128(mask, v2); v3_lo = _mm_and_si128(mask, v3); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); 
v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); v2_hi = _mm_and_si128(mask, _mm_srli_epi16(v2, 4)); v3_hi = _mm_and_si128(mask, _mm_srli_epi16(v3, 4)); /* Compute POPCNT of each byte in two halves using PSHUFB instruction for LUT */ count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); count2 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v2_lo), _mm_shuffle_epi8(LUT, v2_hi)); count3 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v3_lo), _mm_shuffle_epi8(LUT, v3_hi)); total = _mm_add_epi8(total, _mm_add_epi8(_mm_add_epi8(count0, count1), _mm_add_epi8(count2, count3))); } switch (N - i) { case 1: v0 = _mm_load_si128(vbuf+i+0); v0_lo = _mm_and_si128(mask, v0); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); total = _mm_add_epi8(total, count0); break; case 2: v0 = _mm_load_si128(vbuf+i+0); v1 = _mm_load_si128(vbuf+i+1); v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); total = _mm_add_epi8(total, _mm_add_epi8(count0, count1)); break; case 3: v0 = _mm_load_si128(vbuf+i+0); v1 = _mm_load_si128(vbuf+i+1); v2 = _mm_load_si128(vbuf+i+2); v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v2_lo = _mm_and_si128(mask, v2); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); v2_hi = _mm_and_si128(mask, _mm_srli_epi16(v2, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); count2 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v2_lo), 
_mm_shuffle_epi8(LUT, v2_hi)); total = _mm_add_epi8(total, _mm_add_epi8(_mm_add_epi8(count0, count1), count2)); break; } /* Reduce 16*8b -> {-,-,-,16b,-,-,-,16b} */ return _mm_sad_epu8(total, _mm_setzero_si128()); } static __m128i intersect_popcount_SSSE3_helper(const unsigned *buf, const unsigned *buf2, int N) { /* LUT of count of set bits in each possible 4-bit nibble, from low-to-high: 0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4 */ const __m128i LUT = _mm_set_epi32(0x04030302, 0x03020201, 0x03020201, 0x02010100); const __m128i mask = _mm_set1_epi32(0x0F0F0F0F); const __m128i *vbuf = (__m128i*) buf; const __m128i *vbuf2 = (__m128i*) buf2; __m128i total = _mm_setzero_si128(); __m128i v0, v1, v2, v3; __m128i v0_lo, v1_lo, v2_lo, v3_lo; __m128i v0_hi, v1_hi, v2_hi, v3_hi; __m128i count0, count1, count2, count3; int i; for (i = 0; i + 4 <= N; i += 4) { v0 = _mm_and_si128(_mm_load_si128(vbuf+i+0), _mm_load_si128(vbuf2+i+0)); v1 = _mm_and_si128(_mm_load_si128(vbuf+i+1), _mm_load_si128(vbuf2+i+1)); v2 = _mm_and_si128(_mm_load_si128(vbuf+i+2), _mm_load_si128(vbuf2+i+2)); v3 = _mm_and_si128(_mm_load_si128(vbuf+i+3), _mm_load_si128(vbuf2+i+3)); /* Split each byte into low and high nibbles */ v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v2_lo = _mm_and_si128(mask, v2); v3_lo = _mm_and_si128(mask, v3); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); v2_hi = _mm_and_si128(mask, _mm_srli_epi16(v2, 4)); v3_hi = _mm_and_si128(mask, _mm_srli_epi16(v3, 4)); /* Compute POPCNT of each byte in two halves using PSHUFB instruction for LUT */ count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); count2 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v2_lo), _mm_shuffle_epi8(LUT, v2_hi)); count3 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v3_lo), _mm_shuffle_epi8(LUT, v3_hi)); total = _mm_add_epi8(total, 
_mm_add_epi8(_mm_add_epi8(count0, count1), _mm_add_epi8(count2, count3))); } switch (N - i) { case 1: v0 = _mm_and_si128(_mm_load_si128(vbuf+i+0), _mm_load_si128(vbuf2+i+0)); v0_lo = _mm_and_si128(mask, v0); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); total = _mm_add_epi8(total, count0); break; case 2: v0 = _mm_and_si128(_mm_load_si128(vbuf+i+0), _mm_load_si128(vbuf2+i+0)); v1 = _mm_and_si128(_mm_load_si128(vbuf+i+1), _mm_load_si128(vbuf2+i+1)); v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); total = _mm_add_epi8(total, _mm_add_epi8(count0, count1)); break; case 3: v0 = _mm_and_si128(_mm_load_si128(vbuf+i+0), _mm_load_si128(vbuf2+i+0)); v1 = _mm_and_si128(_mm_load_si128(vbuf+i+1), _mm_load_si128(vbuf2+i+1)); v2 = _mm_and_si128(_mm_load_si128(vbuf+i+2), _mm_load_si128(vbuf2+i+2)); v0_lo = _mm_and_si128(mask, v0); v1_lo = _mm_and_si128(mask, v1); v2_lo = _mm_and_si128(mask, v2); v0_hi = _mm_and_si128(mask, _mm_srli_epi16(v0, 4)); v1_hi = _mm_and_si128(mask, _mm_srli_epi16(v1, 4)); v2_hi = _mm_and_si128(mask, _mm_srli_epi16(v2, 4)); count0 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v0_lo), _mm_shuffle_epi8(LUT, v0_hi)); count1 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v1_lo), _mm_shuffle_epi8(LUT, v1_hi)); count2 = _mm_add_epi8(_mm_shuffle_epi8(LUT, v2_lo), _mm_shuffle_epi8(LUT, v2_hi)); total = _mm_add_epi8(total, _mm_add_epi8(_mm_add_epi8(count0, count1), count2)); break; } /* Reduce 16*8b -> {-,-,-,16b,-,-,-,16b} */ return _mm_sad_epu8(total, _mm_setzero_si128()); } #endif /* GENERATE_SSSE3 */ /** * Count the number of bits set in a fingerprint using the SSSE3 * instruction set (available in x86 CPUs >= 
2006). * @warning 1) fp must be aligned to a 16 byte boundary. * 2) Use (get_cpuid_flags() & bit_SSSE3) from cpuid.h to * test if the CPU supports the SSSE3 instructions. */ int chemfp_popcount_SSSE3(int size, const unsigned *fp) { #if defined(GENERATE_SSSE3) /* 2^5 loop iters might overflow 8-bit counter, so cap it at 2^4 iters per chunk */ const int iters = 1 << 4; const int N = (size + 3) / 4; int i, count; __m128i count32 = _mm_setzero_si128(); for (i = 0; i + iters * 4 <= N; i += iters * 4) { count32 = _mm_add_epi32(count32, popcount_SSSE3_helper(&fp[i], iters)); } if (i < N) { count32 = _mm_add_epi32(count32, popcount_SSSE3_helper(&fp[i], (N - i + 3) / 4)); } /* Layout coming from PSADBW accumulation is 2*{0,32}: 0 S1 0 S0 */ count = _mm_cvt_ss2si(_mm_cvtepi32_ps(_mm_add_epi32( count32, _mm_shuffle_epi32(count32, _MM_SHUFFLE(2, 2, 2, 2))))); return count; #else UNUSED(size); UNUSED(fp); return 0; #endif } /** * Count the number of bits set within the intersection of two * fingerprints using the SSSE3 instruction set. * @warning 1) fp1 & fp2 must be aligned to 16 byte boundaries. * 2) Use (get_cpuid_flags() & bit_SSSE3) from cpuid.h to * test if the CPU supports the SSSE3 instructions. 
*/ int chemfp_intersect_popcount_SSSE3(int size, const unsigned *fp1, const unsigned *fp2) { #if defined(GENERATE_SSSE3) /* 2^5 loop iters might overflow 8-bit counter, so cap it at 2^4 iters per chunk */ const int iters = 1 << 4; const int N = (size + 3) / 4; int i, count; __m128i count32 = _mm_setzero_si128(); for (i = 0; i + iters * 4 <= N; i += iters * 4) { count32 = _mm_add_epi32(count32, intersect_popcount_SSSE3_helper(&fp1[i], &fp2[i], iters)); } if (i < N) { count32 = _mm_add_epi32(count32, intersect_popcount_SSSE3_helper(&fp1[i], &fp2[i], (N - i + 3) / 4)); } /* Layout coming from PSADBW accumulation is 2*{0,32}: 0 S1 0 S0 */ count = _mm_cvt_ss2si(_mm_cvtepi32_ps(_mm_add_epi32( count32, _mm_shuffle_epi32(count32, _MM_SHUFFLE(2, 2, 2, 2))))); return count; #else UNUSED(size); UNUSED(fp1); UNUSED(fp2); return 0; #endif } int chemfp_has_ssse3(void) { #if defined(GENERATE_SSSE3) return (get_cpuid_flags() & bit_SSSE3); #else (void)(get_cpuid_flags); /* suppress compiler warning */ return 0; #endif } chemfp-1.1p1/src/pysearch_results.c0000644000077000000240000006376712102120253017616 0ustar dalkestaff00000000000000#include #include "pysearch_results.h" #include "structmember.h" #include "chemfp_internal.h" /************ Search Result type ***************/ /* Help with cyclical garbage collection, in case someone does result.target_ids = result */ static int SearchResults_traverse(SearchResults *self, visitproc visit, void *arg) { Py_VISIT(self->target_ids); return 0; } static int SearchResults_clear_memory(SearchResults *self) { if (self->results) { chemfp_free_results(self->num_results, self->results); self->results = NULL; } self->num_results = 0; Py_CLEAR(self->target_ids); return 0; } static void SearchResults_dealloc(SearchResults *self) { SearchResults_clear_memory(self); self->ob_type->tp_free((PyObject *) self); } static PyObject * SearchResults_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { SearchResults *self; self = (SearchResults *) 
type->tp_alloc(type, 0); if (self == NULL) { return NULL; } self->num_results = 0; self->results = NULL; Py_INCREF(Py_None); self->target_ids = Py_None; return (PyObject *)self; } static int SearchResults_init(SearchResults *self, PyObject *args, PyObject *kwds) { int num_results=0; PyObject *target_ids=Py_None; chemfp_search_result *results; static char *kwlist[] = {"num_results", "target_ids", NULL}; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i|O", kwlist, &num_results, &target_ids)) { return -1; } if (num_results < 0) { PyErr_SetString(PyExc_ValueError, "num_results must be non-negative"); return -1; } if (num_results == 0) { results = NULL; } else { results = chemfp_alloc_search_results(num_results); if (!results) { PyErr_NoMemory(); return -1; } } self->num_results = num_results; self->results = results; Py_XINCREF(target_ids); Py_XDECREF(self->target_ids); self->target_ids = target_ids; return 0; } /* len(search_results) */ static Py_ssize_t SearchResults_length(SearchResults *result) { return (Py_ssize_t) result->num_results; } static int check_row(int num_results, int *row) { int row_ = *row; if (row_ < 0) { row_ = num_results + row_; if (row_ < 0) { PyErr_SetString(PyExc_IndexError, "row index is out of range"); return 0; } *row = row_; } else if (row_ >= num_results) { PyErr_SetString(PyExc_IndexError, "row index is out of range"); return 0; } return 1; } static int check_min_max_score(PyObject *min_score_obj, PyObject *max_score_obj, double *min_score, double *max_score) { double value; if (min_score_obj == Py_None) { *min_score = -HUGE_VAL; } else { value = PyFloat_AsDouble(min_score_obj); if (value == -1.0) { if (PyErr_Occurred()) { return 0; } } *min_score = value; } if (max_score_obj == Py_None) { *max_score = HUGE_VAL; } else { value = PyFloat_AsDouble(max_score_obj); if (value == -1.0) { if (PyErr_Occurred()) { return 0; } } *max_score = value; } return 1; } static int check_interval(const char *interval, int *include_min, int *include_max) { 
switch (interval[0]) { case '(': *include_min = 0; break; case '[': *include_min = 1; break; default: PyErr_SetString(PyExc_ValueError, "First interval character must be '(' or '['"); return 0; } switch (interval[1]) { case ')': *include_max = 0; break; case ']': *include_max = 1; break; default: PyErr_SetString(PyExc_ValueError, "Second interval character must be ')' or ']'"); return 0; } if (interval[2]) { PyErr_SetString(PyExc_ValueError, "The interval may only contain two characters"); return 0; } return 1; } static PyObject * SearchResults_reorder_all(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"order", NULL}; const char *ordering = "decreasing-score"; int errval; if (!PyArg_ParseTupleAndKeywords(args, kwds, "|s:reorder_all", kwlist, &ordering)) { return NULL; } errval = chemfp_search_results_reorder(self->num_results, self->results, ordering); if (errval) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(errval)); return NULL; } Py_RETURN_NONE; } static PyObject * SearchResults_reorder_row(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"row", "order", NULL}; int row=-1; int errval; const char *ordering = "decreasing-score"; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i|s:reorder_row", kwlist, &row, &ordering)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } errval = chemfp_search_result_reorder(self->results+row, ordering); if (errval) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(errval)); return NULL; } Py_RETURN_NONE; } #define COUNT_ALL_MACRO(expr) \ for (row=0; rowresults+row); \ scores = self->results[row].scores; \ for (i=0; i max_score) || (min_score == max_score && !(include_min && include_max))) { return PyInt_FromLong(0); } num_rows = self->num_results; if (include_min) { if (include_max) { /* [] -- Include both ends */ if (min_score_obj == Py_None) { if (max_score_obj == Py_None) { /* Special case; just report the number of elements */ for 
(row=0; rowresults+row); } } else { /* No lower bound */ COUNT_ALL_MACRO(scores[i] <= max_score); } } else { if (max_score_obj == Py_None) { /* Lower bound but no upper bound */ COUNT_ALL_MACRO(min_score <= scores[i]); } else { /* Definite lower and upper bound */ COUNT_ALL_MACRO(min_score <= scores[i] && scores[i] <= max_score); } } } else { /* [) -- Include the minimum but not the maximum */ if (min_score_obj == Py_None) { /* There is no minimum */ COUNT_ALL_MACRO(scores[i] < max_score); } else { /* There is a minimum and a maximum test */ COUNT_ALL_MACRO(min_score <= scores[i] && scores[i] < max_score); } } } else { if (include_max) { /* (] -- Exclude the minimum, include the maximum */ if (max_score_obj == Py_None) { /* No specified max, so must only be greater than the lower bound */ COUNT_ALL_MACRO(min_score < scores[i]); } else { COUNT_ALL_MACRO(min_score < scores[i] && scores[i] <= max_score); } } else { /* () -- Exclude the minimum and exclude the maximum */ COUNT_ALL_MACRO(min_score < scores[i] && scores[i] < max_score); } } return PyInt_FromLong(count); } #define COUNT_ROW_MACRO(expr) \ for (i=0; inum_results, &row) || !check_min_max_score(min_score_obj, max_score_obj, &min_score, &max_score) || !check_interval(interval, &include_min, &include_max)) { return NULL; } num_hits = chemfp_get_num_hits(self->results+row); scores = (self->results+row)->scores; if ((min_score > max_score) || (min_score == max_score && (!include_min || !include_max))) { return PyInt_FromLong(0); } if (include_min) { if (include_max) { /* [] -- Include both ends */ if (min_score_obj == Py_None) { if (max_score_obj == Py_None) { /* Special case; just report the number of elements */ count = chemfp_get_num_hits(self->results+row); } else { /* No lower bound */ COUNT_ROW_MACRO(scores[i] <= max_score); } } else { if (max_score_obj == Py_None) { /* Lower bound but no upper bound */ COUNT_ROW_MACRO(min_score <= scores[i]); } else { /* Definite lower and upper bound */ 
COUNT_ROW_MACRO(min_score <= scores[i] && scores[i] <= max_score); } } } else { /* [) -- Include the minimum but not the maximum */ if (min_score_obj == Py_None) { /* There is no minimum */ COUNT_ROW_MACRO(scores[i] < max_score); } else { /* There is a minimum and a maximum test */ COUNT_ROW_MACRO(min_score <= scores[i] && scores[i] < max_score); } } } else { if (include_max) { /* (] -- Exclude the minimum, include the maximum */ if (max_score_obj == Py_None) { /* No specified max, so must only be greater than the lower bound */ COUNT_ROW_MACRO(min_score < scores[i]); } else { COUNT_ROW_MACRO(min_score < scores[i] && scores[i] <= max_score); } } else { /* () -- Exclude the minimum and exclude the maximum */ COUNT_ROW_MACRO(min_score < scores[i] && scores[i] < max_score); } } return PyInt_FromLong(count); } #define CUMULATIVE_SCORE_ALL_MACRO(expr) \ for (row=0; rowresults+row); \ scores = self->results[row].scores; \ for (i=0; i max_score) || (min_score == max_score && !(include_min && include_max))) { /* An empty score range sums to 0.0; return a float, as the non-empty case does */ return PyFloat_FromDouble(0.0); } num_rows = self->num_results; if (include_min) { if (include_max) { /* [] -- Include both ends */ if (min_score_obj == Py_None) { if (max_score_obj == Py_None) { CUMULATIVE_SCORE_ALL_MACRO(1); } else { /* No lower bound */ CUMULATIVE_SCORE_ALL_MACRO(scores[i] <= max_score); } } else { if (max_score_obj == Py_None) { /* Lower bound but no upper bound */ CUMULATIVE_SCORE_ALL_MACRO(min_score <= scores[i]); } else { /* Definite lower and upper bound */ CUMULATIVE_SCORE_ALL_MACRO(min_score <= scores[i] && scores[i] <= max_score); } } } else { /* [) -- Include the minimum but not the maximum */ if (min_score_obj == Py_None) { /* There is no minimum */ CUMULATIVE_SCORE_ALL_MACRO(scores[i] < max_score); } else { /* There is a minimum and a maximum test */ CUMULATIVE_SCORE_ALL_MACRO(min_score <= scores[i] && scores[i] < max_score); } } } else { if (include_max) { /* (] -- Exclude the minimum, include the maximum */ if (max_score_obj == Py_None) { /* 
No specified max, so must only be greater than the lower bound */ CUMULATIVE_SCORE_ALL_MACRO(min_score < scores[i]); } else { CUMULATIVE_SCORE_ALL_MACRO(min_score < scores[i] && scores[i] <= max_score); } } else { /* () -- Exclude the minimum and exclude the maximum */ CUMULATIVE_SCORE_ALL_MACRO(min_score < scores[i] && scores[i] < max_score); } } return PyFloat_FromDouble(score); } #define CUMULATIVE_SCORE_ROW_MACRO(expr) \ for (i=0; inum_results, &row) || !check_min_max_score(min_score_obj, max_score_obj, &min_score, &max_score) || !check_interval(interval, &include_min, &include_max)) { return NULL; } num_hits = chemfp_get_num_hits(self->results+row); scores = (self->results+row)->scores; if ((min_score > max_score) || (min_score == max_score && (!include_min || !include_max))) { return PyFloat_FromDouble(0.0); } if (include_min) { if (include_max) { /* [] -- Include both ends */ if (min_score_obj == Py_None) { if (max_score_obj == Py_None) { CUMULATIVE_SCORE_ROW_MACRO(1); } else { /* No lower bound */ CUMULATIVE_SCORE_ROW_MACRO(scores[i] <= max_score); } } else { if (max_score_obj == Py_None) { /* Lower bound but no upper bound */ CUMULATIVE_SCORE_ROW_MACRO(min_score <= scores[i]); } else { /* Definite lower and upper bound */ CUMULATIVE_SCORE_ROW_MACRO(min_score <= scores[i] && scores[i] <= max_score); } } } else { /* [) -- Include the minimum but not the maximum */ if (min_score_obj == Py_None) { /* There is no minimum */ CUMULATIVE_SCORE_ROW_MACRO(scores[i] < max_score); } else { /* There is a minimum and a maximum test */ CUMULATIVE_SCORE_ROW_MACRO(min_score <= scores[i] && scores[i] < max_score); } } } else { if (include_max) { /* (] -- Exclude the minimum, include the maximum */ if (max_score_obj == Py_None) { /* No specified max, so must only be greater than the lower bound */ CUMULATIVE_SCORE_ROW_MACRO(min_score < scores[i]); } else { CUMULATIVE_SCORE_ROW_MACRO(min_score < scores[i] && scores[i] <= max_score); } } else { /* () -- Exclude the minimum and 
exclude the maximum */ CUMULATIVE_SCORE_ROW_MACRO(min_score < scores[i] && scores[i] < max_score); } } return PyFloat_FromDouble(score); } static PyObject * SearchResults_clear_all(SearchResults *self) { int i; for (i=0; inum_results; i++) { chemfp_search_result_clear(self->results+i); } Py_RETURN_NONE; } static PyObject * SearchResults_clear_row(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"row", NULL}; int row; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i:clear", kwlist, &row)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } chemfp_search_result_clear(self->results+row); Py_RETURN_NONE; } static PyObject * SearchResults_get_indices_and_scores(SearchResults *self, PyObject *args, PyObject *kwds) { int n, i; PyObject *hits, *obj; chemfp_search_result *result; static char *kwlist[] = {"row", NULL}; int row; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i:get_indices", kwlist, &row)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } result = self->results+row; n = chemfp_get_num_hits(result); hits = PyList_New(n); if (!hits) { return NULL; } for (i=0; iindices[i], result->scores[i]); if (!obj) { goto error; } PyList_SET_ITEM(hits, i, obj); } return hits; error: /* I could be smarter and only deallocate the ones which I know are present. */ /* Proving that's correct is a bit trickier than I want to do. 
*/ for (i=0; inum_results, &row)) { return NULL; } return data_blob_to_array(chemfp_get_num_hits(self->results+row), self->results[row].indices, "i", sizeof(int)); } static PyObject * SearchResults_get_scores(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"row", NULL}; int row; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i:get_scores", kwlist, &row)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } return data_blob_to_array(chemfp_get_num_hits(self->results+row), self->results[row].scores, "d", sizeof(double)); } static PyObject * SearchResults_size(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"row", NULL}; int row; if (!PyArg_ParseTupleAndKeywords(args, kwds, "i:size", kwlist, &row)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } return PyInt_FromLong(chemfp_get_num_hits(self->results+row)); } static PyObject * SearchResults_add_hit(SearchResults *self, PyObject *args, PyObject *kwds) { static char *kwlist[] = {"row", "column", "score", NULL}; int row, column; double score; if (!PyArg_ParseTupleAndKeywords(args, kwds, "iid:_add_hit", kwlist, &row, &column, &score)) { return NULL; } if (!check_row(self->num_results, &row)) { return NULL; } /* Return the status code from chemfp_add_hit() */ return PyInt_FromLong(chemfp_add_hit(self->results+row, column, score)); } static PyMethodDef SearchResults_methods[] = { {"clear_all", (PyCFunction) SearchResults_clear_all, METH_VARARGS | METH_KEYWORDS, "Removes all hits from all of the search results"}, {"_clear_row", (PyCFunction) SearchResults_clear_row, METH_VARARGS | METH_KEYWORDS, "(internal) Remove all hits from a given row result"}, {"cumulative_score_all", (PyCFunction) SearchResults_cumulative_score_all, METH_VARARGS | METH_KEYWORDS, "The sum of all scores in all rows which are between `min_score` and `max_score`"}, {"_cumulative_score_row", (PyCFunction) SearchResults_cumulative_score_row, METH_VARARGS | METH_KEYWORDS, 
"(internal) The sum of the scores which are between `min_score` and `max_score` for a given row"}, {"count_all", (PyCFunction) SearchResults_count_all, METH_VARARGS | METH_KEYWORDS, "Count the number of hits with a score between `min_score` and `max_score`"}, {"_count_row", (PyCFunction) SearchResults_count_row, METH_VARARGS | METH_KEYWORDS, "(internal) Count the number of hits with a score between `min_score` and `max_score` for a given row"}, {"_get_indices", (PyCFunction) SearchResults_get_indices, METH_VARARGS | METH_KEYWORDS, "(internal) The list of target indices, in the current ordering for a given row"}, {"_get_scores", (PyCFunction) SearchResults_get_scores, METH_VARARGS | METH_KEYWORDS, "(internal) The list of target scores, in the current ordering for a given row"}, {"_get_indices_and_scores", (PyCFunction) SearchResults_get_indices_and_scores, METH_VARARGS | METH_KEYWORDS, "(internal) The list of (target index, score) pairs, in the current ordering for a given row"}, {"_size", (PyCFunction) SearchResults_size, METH_VARARGS | METH_KEYWORDS, "(internal) The number of hits for a given row"}, {"reorder_all", (PyCFunction) SearchResults_reorder_all, METH_VARARGS | METH_KEYWORDS, "Reorder the hits for all of the rows based on the requested ordering"}, {"_reorder_row", (PyCFunction) SearchResults_reorder_row, METH_VARARGS | METH_KEYWORDS, "(internal) Reorder the hits based on the requested ordering for a given row"}, {"_add_hit", (PyCFunction) SearchResults_add_hit, METH_VARARGS | METH_KEYWORDS, "(internal) Add a target index and hit score to a given row"}, {NULL} }; static PyMemberDef SearchResults_members[] = { {"target_ids", T_OBJECT_EX, offsetof(SearchResults, target_ids), 0, "list of fingerprint identifiers"}, {NULL} }; static PySequenceMethods SearchResults_as_sequence = { (lenfunc)SearchResults_length, /* sq_length */ NULL, /* sq_concat */ NULL, /* sq_repeat */ NULL, /* sq_item */ NULL, /* sq_slice */ NULL, /* sq_ass_item */ NULL, /* sq_ass_slice */ 
NULL, /* sq_contains */ NULL, /* sq_inplace_concat */ NULL /* sq_inplace_repeat */ }; PyTypeObject chemfp_py_SearchResultsType = { PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ "chemfp.search.SearchResults", /*tp_name*/ sizeof(SearchResults), /*tp_basicsize*/ 0, /*tp_itemsize*/ (destructor) SearchResults_dealloc, /*tp_dealloc*/ 0, /*tp_print*/ 0, /*tp_getattr*/ 0, /*tp_setattr*/ 0, /*tp_compare*/ 0, /*tp_repr*/ 0, /*tp_as_number*/ &SearchResults_as_sequence, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ 0, /*tp_hash */ 0, /*tp_call*/ 0, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ 0, /*tp_as_buffer*/ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /*tp_flags*/ "Documentation goes here", /* tp_doc */ (traverseproc) SearchResults_traverse, /* tp_traverse */ (inquiry) SearchResults_clear_memory, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ SearchResults_methods, /* tp_methods */ SearchResults_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ (initproc)SearchResults_init, /* tp_init */ 0, /* tp_alloc */ SearchResults_new /* tp_new */ }; chemfp-1.1p1/src/pysearch_results.h0000644000077000000240000000034112055226641017615 0ustar dalkestaff00000000000000#include #include "chemfp.h" typedef struct { PyObject_HEAD int num_results; chemfp_search_result *results; PyObject *target_ids; } SearchResults; extern PyTypeObject chemfp_py_SearchResultsType; chemfp-1.1p1/src/python_api.c0000644000077000000240000016377412106306046016402 0ustar dalkestaff00000000000000#include #include "chemfp.h" #include "chemfp_internal.h" #include "pysearch_results.h" static PyObject * version(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyString_FromString(chemfp_version()); } /* Slightly renamed so it won't share the same name as strerror(3) */ static PyObject * strerror_(PyObject *self, PyObject *args) { int err; 
UNUSED(self); if (!PyArg_ParseTuple(args, "i:strerror", &err)) return NULL; return PyString_FromString(chemfp_strerror(err)); } /*************** Hex fingerprint operations *************/ static PyObject * hex_isvalid(PyObject *self, PyObject *args) { char *s; int len; UNUSED(self); if (!PyArg_ParseTuple(args, "s#:hex_isvalid", &s, &len)) return NULL; return PyInt_FromLong(chemfp_hex_isvalid(len, s)); } static PyObject * hex_popcount(PyObject *self, PyObject *args) { char *s; int len; UNUSED(self); if (!PyArg_ParseTuple(args, "s#:hex_popcount", &s, &len)) return NULL; return PyInt_FromLong(chemfp_hex_popcount(len, s)); } static PyObject * hex_intersect_popcount(PyObject *self, PyObject *args) { char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:hex_intersect_popcount", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "hex fingerprints must have the same length"); return NULL; } return PyInt_FromLong(chemfp_hex_intersect_popcount(len1, s1, s2)); } static PyObject * hex_tanimoto(PyObject *self, PyObject *args) { char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:hex_tanimoto", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "hex fingerprints must have the same length"); return NULL; } return PyFloat_FromDouble(chemfp_hex_tanimoto(len1, s1, s2)); } static PyObject * hex_contains(PyObject *self, PyObject *args) { char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:hex_contains", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "hex fingerprints must have the same length"); return NULL; } return PyInt_FromLong(chemfp_hex_contains(len1, s1, s2)); } /********* Byte fingerprint operations *************/ static PyObject * byte_popcount(PyObject *self, PyObject *args) { unsigned char *s; int len; UNUSED(self); if (!PyArg_ParseTuple(args, "s#:byte_popcount", &s, 
&len)) return NULL; return PyInt_FromLong(chemfp_byte_popcount(len, s)); } static PyObject * byte_intersect_popcount(PyObject *self, PyObject *args) { unsigned char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:byte_intersect_popcount", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "byte fingerprints must have the same length"); return NULL; } return PyInt_FromLong(chemfp_byte_intersect_popcount(len1, s1, s2)); } static PyObject * byte_tanimoto(PyObject *self, PyObject *args) { unsigned char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:byte_tanimoto", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "byte fingerprints must have the same length"); return NULL; } return PyFloat_FromDouble(chemfp_byte_tanimoto(len1, s1, s2)); } static PyObject * byte_contains(PyObject *self, PyObject *args) { unsigned char *s1, *s2; int len1, len2; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:byte_contains", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "byte fingerprints must have the same length"); return NULL; } return PyInt_FromLong(chemfp_byte_contains(len1, s1, s2)); } static PyObject * byte_intersect(PyObject *self, PyObject *args) { unsigned char *s, *s1, *s2; int i, len1, len2; PyObject *new_obj; UNUSED(self); if (!PyArg_ParseTuple(args, "s#s#:byte_intersect", &s1, &len1, &s2, &len2)) return NULL; if (len1 != len2) { PyErr_SetString(PyExc_ValueError, "byte fingerprints must have the same length"); return NULL; } new_obj = PyString_FromStringAndSize(NULL, len1); if (!new_obj) { return NULL; } s = (unsigned char *) PyString_AS_STRING(new_obj); for (i=0; i 1.0) { PyErr_SetString(PyExc_ValueError, "threshold must be between 0.0 and 1.0, inclusive"); return 1; } return 0; } static int bad_alignment(int alignment) { if (chemfp_byte_popcount(sizeof(int), (unsigned char *) &alignment) != 1) { 
PyErr_SetString(PyExc_ValueError, "alignment must be a positive power of two"); return 1; } return 0; } static int bad_padding(const char *which, int start_padding, int end_padding, const unsigned char **arena, int *arena_size) { char msg[150]; /* printf("PADDING: %d %d for %d\n", start_padding, end_padding, *arena_size);*/ if (start_padding < 0) { sprintf(msg, "%sstart_padding must not be negative", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } if (end_padding < 0) { sprintf(msg, "%send_padding must not be negative", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } if ((start_padding + end_padding) > *arena_size) { sprintf(msg, "%sarena_size is too small for the paddings", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } *arena += start_padding; *arena_size -= (start_padding + end_padding); return 0; } /* The arena num bits and storage size must be compatible */ static int bad_arena_size(const char *which, int num_bits, int storage_size) { char msg[150]; int fp_size = (num_bits+7) / 8; if (storage_size < 0) { sprintf(msg, "%sstorage_size must be positive", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } if (fp_size > storage_size) { sprintf(msg, "num_bits of %d (%d bytes) does not fit into %sstorage_size of %d", num_bits, fp_size, which, storage_size); PyErr_SetString(PyExc_ValueError, msg); return 1; } return 0; } /* There must be enough cells for at least num queries (in an FPS threshold search) */ static int bad_fps_cells(int *num_cells, int cells_size, int num_queries) { char msg[100]; *num_cells = (int)(cells_size / sizeof(chemfp_tanimoto_cell)); if (*num_cells < num_queries) { sprintf(msg, "%d queries requires at least %d cells, not %d", num_queries, num_queries, *num_cells); PyErr_SetString(PyExc_ValueError, msg); return 1; } return 0; } static int bad_results(SearchResults *results, int results_offset) { if (!PyObject_TypeCheck(results, &chemfp_py_SearchResultsType)) { PyErr_SetString(PyExc_TypeError, 
"results is not a SearchResult instance"); return 1; } if (results_offset != 0) { PyErr_SetString(PyExc_ValueError, "non-zero results_offset?"); return 1; } return 0; } static int bad_num_results(int num_results) { if (num_results <= 0) { PyErr_SetString(PyExc_ValueError, "num_results must be positive"); return 1; } return 0; } static int bad_knearest_search_size(int knearest_search_size) { if (knearest_search_size < (int) sizeof(chemfp_fps_knearest_search)) { PyErr_SetString(PyExc_ValueError, "Not enough space allocated for a chemfp_fps_knearest_search"); return 1; } return 0; } /* Check/adjust the start and end positions into an FPS block */ static int bad_block_limits(int block_size, int *start, int *end) { if (*start < 0) { PyErr_SetString(PyExc_ValueError, "block start must not be negative"); return 1; } if (*end == -1 || *end > block_size) { *end = block_size; } else if (*end < 0) { PyErr_SetString(PyExc_ValueError, "block end must either be -1 or non-negative"); return 1; } if (*start > block_size) { *start = block_size; } return 0; } /* Check/adjust the start and end positions into an arena */ static int bad_arena_limits(const char *which, int arena_size, int storage_size, int *start, int *end) { char msg[150]; int max_index; if (arena_size % storage_size != 0) { sprintf(msg, "%sarena size (%d) is not a multiple of its storage size (%d)", which, arena_size, storage_size); PyErr_SetString(PyExc_ValueError, msg); return 1; } if (*start < 0) { sprintf(msg, "%sstart must not be negative", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } max_index = arena_size / storage_size; if (*start > max_index) { /* I'll later ignore if start is too large */ *start = max_index; } if (*end == -1 || *end > max_index) { *end = max_index; } else if (*end < 0) { sprintf(msg, "%send must either be -1 or non-negative", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } return 0; } static int bad_fingerprint_sizes(int num_bits, int query_storage_size, int 
target_storage_size) { return (bad_arena_size("query_", num_bits, query_storage_size) || bad_arena_size("target_", num_bits, target_storage_size)); } static int bad_popcount_indices(const char *which, int check_indices, int num_bits, int popcount_indices_size, int **popcount_indices_ptr) { char msg[150]; int num_popcounts; int prev, i; int *popcount_indices; if (popcount_indices_size == 0) { /* Special case: this means to ignore this field */ *popcount_indices_ptr = NULL; return 0; } if ((popcount_indices_size % sizeof(int)) != 0) { sprintf(msg, "%spopcount indices length (%d) is not a multiple of the native integer size", which, popcount_indices_size); PyErr_SetString(PyExc_ValueError, msg); return 1; } /* If there is 1 bit then there must be three indices: */ /* indices[0]...indices[1] ==> fingerprints with 0 bits set */ /* indices[1]...indices[2] ==> fingerprints with 1 bit set */ num_popcounts = (int)(popcount_indices_size / sizeof(int)); if (num_bits > num_popcounts - 2) { sprintf(msg, "%d bits requires at least %d %spopcount indices, not %d", num_bits, num_bits+2, which, num_popcounts); PyErr_SetString(PyExc_ValueError, msg); return 1; } if (check_indices) { popcount_indices = *popcount_indices_ptr; if (popcount_indices[0] != 0) { /* Report the formatted message, not the raw format string */ sprintf(msg, "%spopcount indices[0] must be 0", which); PyErr_SetString(PyExc_ValueError, msg); return 1; } prev = 0; for (i=1; i= target_end) { /* error code, num lines processed (two values, matching the normal return below) */ return Py_BuildValue("ii", CHEMFP_OK, 0); } Py_BEGIN_ALLOW_THREADS; err = chemfp_fps_count_tanimoto_hits( num_bits, query_storage_size, query_arena, query_start, query_end, target_block+target_start, target_end-target_start, threshold, counts, &num_lines_processed); Py_END_ALLOW_THREADS; return Py_BuildValue("ii", err, num_lines_processed); } /* In Python this is (err, next_start, num_lines_processed, num_cells_processed) = fps_threshold_tanimoto_search(num_bits, query_storage_size, 
query_arena, target_block, target_start, target_end, threshold, cells) */ static PyObject * fps_threshold_tanimoto_search(PyObject *self, PyObject *args) { int num_bits, query_start_padding, query_end_padding; int query_storage_size, query_arena_size, query_start, query_end; const unsigned char *query_arena; const char *target_block, *stopped_at; int target_block_size, target_start, target_end; chemfp_tanimoto_cell *cells; double threshold; int cells_size; int num_lines_processed = 0, num_cells_processed = 0; int num_cells, err; UNUSED(self); if (!PyArg_ParseTuple(args, "iiiit#iit#iidw#:fps_threshold_tanimoto_search", &num_bits, &query_start_padding, &query_end_padding, &query_storage_size, &query_arena, &query_arena_size, &query_start, &query_end, &target_block, &target_block_size, &target_start, &target_end, &threshold, &cells, &cells_size)) return NULL; if (bad_num_bits(num_bits) || bad_padding("query_", query_start_padding, query_end_padding, &query_arena, &query_arena_size) || bad_arena_size("query_", num_bits, query_storage_size) || bad_arena_limits("query ", query_arena_size, query_storage_size, &query_start, &query_end) || bad_block_limits(target_block_size, &target_start, &target_end) || bad_threshold(threshold) || bad_fps_cells(&num_cells, cells_size, query_arena_size / query_storage_size)) { return NULL; } if (target_start >= target_end) { /* start of next byte to process, num lines processed, num cells */ return Py_BuildValue("iiii", CHEMFP_OK, target_end, 0, 0); } Py_BEGIN_ALLOW_THREADS; err = chemfp_fps_threshold_tanimoto_search( num_bits, query_storage_size, query_arena, query_start, query_end, target_block+target_start, target_end-target_start, threshold, num_cells, cells, &stopped_at, &num_lines_processed, &num_cells_processed); Py_END_ALLOW_THREADS; return Py_BuildValue("iiii", err, stopped_at - target_block, num_lines_processed, num_cells_processed); } static PyObject * fps_knearest_search_init(PyObject *self, PyObject *args) { 
chemfp_fps_knearest_search *knearest_search; int start_padding, end_padding; int knearest_search_size, num_bits, query_storage_size; unsigned const char *query_arena; int query_arena_size, query_start, query_end, k; double threshold; int err; UNUSED(self); if (!PyArg_ParseTuple(args, "w#iiiit#iiid:fps_knearest_search_init", &knearest_search, &knearest_search_size, &num_bits, &start_padding, &end_padding, &query_storage_size, &query_arena, &query_arena_size, &query_start, &query_end, &k, &threshold)) return NULL; if (bad_knearest_search_size(knearest_search_size) || bad_num_bits(num_bits) || bad_padding("", start_padding, end_padding, &query_arena, &query_arena_size) || bad_arena_size("query_", num_bits, query_storage_size) || bad_arena_limits("query ", query_arena_size, query_storage_size, &query_start, &query_end) || bad_k(k) || bad_threshold(threshold)) { return NULL; } Py_BEGIN_ALLOW_THREADS; err = chemfp_fps_knearest_search_init( knearest_search, num_bits, query_storage_size, query_arena, query_start, query_end, k, threshold); Py_END_ALLOW_THREADS; if (err) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(err)); return NULL; } return Py_BuildValue(""); } static PyObject * fps_knearest_tanimoto_search_feed(PyObject *self, PyObject *args) { chemfp_fps_knearest_search *knearest_search; int knearest_search_size; const char *target_block; int target_block_size, target_start, target_end; int err; UNUSED(self); if (!PyArg_ParseTuple(args, "w#t#ii:fps_knearest_tanimoto_search_feed", &knearest_search, &knearest_search_size, &target_block, &target_block_size, &target_start, &target_end)) return NULL; if (bad_knearest_search_size(knearest_search_size) || bad_block_limits(target_block_size, &target_start, &target_end)) return NULL; Py_BEGIN_ALLOW_THREADS; err = chemfp_fps_knearest_tanimoto_search_feed(knearest_search, target_block_size, target_block); Py_END_ALLOW_THREADS; return PyInt_FromLong(err); } static PyObject * fps_knearest_search_finish(PyObject *self, 
PyObject *args) { chemfp_fps_knearest_search *knearest_search; int knearest_search_size; UNUSED(self); if (!PyArg_ParseTuple(args, "w#:fps_knearest_search_finish", &knearest_search, &knearest_search_size)) return NULL; if (bad_knearest_search_size(knearest_search_size)) return NULL; Py_BEGIN_ALLOW_THREADS; chemfp_fps_knearest_search_finish(knearest_search); Py_END_ALLOW_THREADS; return Py_BuildValue(""); } static PyObject * fps_knearest_search_free(PyObject *self, PyObject *args) { chemfp_fps_knearest_search *knearest_search; int knearest_search_size; UNUSED(self); if (!PyArg_ParseTuple(args, "w#:fps_knearest_search_free", &knearest_search, &knearest_search_size)) return NULL; if (bad_knearest_search_size(knearest_search_size)) return NULL; Py_BEGIN_ALLOW_THREADS; chemfp_fps_knearest_search_free(knearest_search); Py_END_ALLOW_THREADS; return Py_BuildValue(""); } /**************** The library-based searches **********/ /* Always allocate space. This must overallocate because */ /* there is no guarantee the start alignment. */ /* (Though on my Mac it's always 4-byte aligned. 
*/ static PyObject * _alloc_aligned_arena(Py_ssize_t size, int alignment, int *start_padding, int *end_padding) { PyObject *new_py_string; char *s; uintptr_t i; new_py_string = PyString_FromStringAndSize(NULL, size+alignment-1); if (!new_py_string) { return NULL; } s = PyString_AS_STRING(new_py_string); i = ALIGNMENT(s, alignment); if (i == 0) { *start_padding = 0; *end_padding = alignment-1; } else { *start_padding = (int)(alignment - i); *end_padding = (int)(i-1); } memset(s, 0, *start_padding); memset(s+size+*start_padding, 0, *end_padding); return new_py_string; } static PyObject * _align_arena(PyObject *input_arena_obj, int alignment, int *start_padding, int *end_padding) { const char *input_arena; char *output_arena; Py_ssize_t input_arena_size; uintptr_t i; PyObject *output_arena_obj; if (PyObject_AsCharBuffer(input_arena_obj, &input_arena, &input_arena_size)) { PyErr_SetString(PyExc_ValueError, "arena must be a character buffer"); return NULL; } i = ALIGNMENT(input_arena, alignment); /* Already aligned */ if (i == 0) { *start_padding = 0; *end_padding = 0; Py_INCREF(input_arena_obj); return input_arena_obj; } /* Not aligned. 
We'll have to move it to a new string */ output_arena_obj = _alloc_aligned_arena(input_arena_size, alignment, start_padding, end_padding); if (!output_arena_obj) { /* Allocation failed; the exception is already set */ return NULL; } output_arena = PyString_AS_STRING(output_arena_obj); /* Copy over into the new string */ memcpy(output_arena+*start_padding, input_arena, input_arena_size); return output_arena_obj; } static PyObject * make_unsorted_aligned_arena(PyObject *self, PyObject *args) { int alignment; int start_padding, end_padding; PyObject *input_arena_obj, *output_arena_obj; UNUSED(self); if (!PyArg_ParseTuple(args, "Oi:make_unsorted_aligned_arena", &input_arena_obj, &alignment)) { return NULL; } if (bad_alignment(alignment)) { return NULL; } output_arena_obj = _align_arena(input_arena_obj, alignment, &start_padding, &end_padding); if (!output_arena_obj) { return NULL; } return Py_BuildValue("iiN", start_padding, end_padding, output_arena_obj); } static PyObject * align_fingerprint(PyObject *self, PyObject *args) { PyObject *input_fp_obj, *new_fp_obj; const char *fp; char *new_fp; Py_ssize_t fp_size; int alignment, start_padding, storage_size, end_padding; UNUSED(self); if (!PyArg_ParseTuple(args, "Oii:align_fingerprint", &input_fp_obj, &alignment, &storage_size)) { return NULL; } if (bad_alignment(alignment)) { return NULL; } if (PyObject_AsCharBuffer(input_fp_obj, &fp, &fp_size)) { PyErr_SetString(PyExc_ValueError, "fingerprint must be a character buffer"); return NULL; } if (storage_size < 1) { PyErr_SetString(PyExc_ValueError, "storage size must be positive"); return NULL; } if (storage_size < fp_size) { PyErr_SetString(PyExc_ValueError, "storage size is too small for the query"); return NULL; } /* Are we lucky? */ if (storage_size == fp_size) { new_fp_obj = _align_arena(input_fp_obj, alignment, &start_padding, &end_padding); } else { /* Unlucky. 
Need to allocate more space */ new_fp_obj = _alloc_aligned_arena(storage_size, alignment, &start_padding, &end_padding); if (!new_fp_obj) { return NULL; } new_fp = PyString_AS_STRING(new_fp_obj); /* Copy over into the new string */ memcpy(new_fp+start_padding, fp, fp_size); /* Zero out the remaining bytes */ memset(new_fp+start_padding+fp_size, 0, storage_size-fp_size); } return Py_BuildValue("iiN", start_padding, end_padding, new_fp_obj); } static int calculate_arena_popcounts(int num_bits, int storage_size, const unsigned char *arena, int num_fingerprints, ChemFPOrderedPopcount *ordering) { chemfp_popcount_f calc_popcount; const unsigned char *fp; int fp_index, popcount, prev_popcount; /* Compute the popcounts. (Alignment isn't that important here.) */ calc_popcount = chemfp_select_popcount(num_bits, storage_size, arena); fp = arena; for (fp_index = 0; fp_index < num_fingerprints; fp_index++, fp += storage_size) { popcount = calc_popcount(storage_size, fp); ordering[fp_index].popcount = popcount; ordering[fp_index].index = fp_index; } /* Check if the values are already ordered */ prev_popcount = ordering[0].popcount; for (fp_index = 1; fp_index < num_fingerprints; fp_index++) { if (ordering[fp_index].popcount < prev_popcount) { return 1; /* Need to sort */ } prev_popcount = ordering[fp_index].popcount; } return 0; /* Don't need to sort */ } static int compare_by_popcount(const void *left_p, const void *right_p) { const ChemFPOrderedPopcount *left = (ChemFPOrderedPopcount *) left_p; const ChemFPOrderedPopcount *right = (ChemFPOrderedPopcount *) right_p; if (left->popcount < right->popcount) { return -1; } if (left->popcount > right->popcount) { return 1; } if (left->index < right->index) { return -1; } if (left->index > right->index) { return 1; } return 0; } static void set_popcount_indicies(int num_fingerprints, int num_bits, ChemFPOrderedPopcount *ordering, int *popcount_indices) { int popcount, i; /* We've sorted by popcount so this isn't so difficult */ 
popcount = 0; popcount_indices[0] = 0; for (i=0; ipopcount can be > num_bits. This is undefined behavior. I get to do what I want. I decided to treat them as having "max_popcount" bits. After all, I don't want corrupt data to crash the system, and no one is going to validate the input fingerprints for correctness each time. */ } } } /* Finish up the high end */ while (popcount <= num_bits) { popcount_indices[++popcount] = num_fingerprints; } } static PyObject * make_sorted_aligned_arena(PyObject *self, PyObject *args) { int start = 0; int num_bits, storage_size, num_fingerprints, ordering_size, popcount_indices_size; int start_padding, end_padding; PyObject *input_arena_obj, *output_arena_obj; const unsigned char *input_arena; unsigned char *output_arena; Py_ssize_t input_arena_size; ChemFPOrderedPopcount *ordering; int *popcount_indices; int need_to_sort, i; int alignment; UNUSED(self); if (!PyArg_ParseTuple(args, "iiOiw#w#i:make_sorted_aligned_arena", &num_bits, &storage_size, &input_arena_obj, &num_fingerprints, &ordering, &ordering_size, &popcount_indices, &popcount_indices_size, &alignment )) { return NULL; } if (PyObject_AsCharBuffer(input_arena_obj, (const char **) &input_arena, &input_arena_size)) { PyErr_SetString(PyExc_ValueError, "arena must be a character buffer"); return NULL; } if (bad_num_bits(num_bits) || bad_arena_limits("", (int) input_arena_size, storage_size, &start, &num_fingerprints) || bad_popcount_indices("", 0, num_bits, popcount_indices_size, NULL)) { return NULL; } if ((int)(ordering_size / sizeof(ChemFPOrderedPopcount)) < num_fingerprints) { PyErr_SetString(PyExc_ValueError, "allocated ordering space is too small"); return NULL; } /* Handle the trivial case of no fingerprints */ if (num_fingerprints == 0) { return Py_BuildValue("iiO", 0, 0, input_arena_obj); } need_to_sort = calculate_arena_popcounts(num_bits, storage_size, input_arena, num_fingerprints, ordering); if (!need_to_sort) { /* Everything is ordered. 
Just need the right alignment .... */ output_arena_obj = _align_arena(input_arena_obj, alignment, &start_padding, &end_padding); if (!output_arena_obj) { return NULL; } /* ... and to set the popcount indicies */ set_popcount_indicies(num_fingerprints, num_bits, ordering, popcount_indices); /* Everything is aligned and ordered, so we're done */ return Py_BuildValue("iiN", start_padding, end_padding, output_arena_obj); } /* Not ordered. Make space for the results. */ output_arena_obj = _alloc_aligned_arena(input_arena_size, alignment, &start_padding, &end_padding); if (!output_arena_obj) { return NULL; } output_arena = (unsigned char *)(PyString_AS_STRING(output_arena_obj) + start_padding); Py_BEGIN_ALLOW_THREADS; qsort(ordering, num_fingerprints, sizeof(ChemFPOrderedPopcount), compare_by_popcount); /* Build the new arena based on the values in the old arena */ for (i=0; i query_end) { Py_RETURN_NONE; } if (result_counts_size < (int)((query_end - query_start)*sizeof(int))) { PyErr_SetString(PyExc_ValueError, "not enough space allocated for result_counts"); return NULL; } Py_BEGIN_ALLOW_THREADS; chemfp_count_tanimoto_arena(threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, result_counts); Py_END_ALLOW_THREADS; Py_RETURN_NONE; } /* threshold_tanimoto_arena */ static PyObject * threshold_tanimoto_arena(PyObject *self, PyObject *args) { double threshold; int num_bits; int query_start_padding, query_end_padding; int query_storage_size, query_arena_size, query_start, query_end; const unsigned char *query_arena; int target_start_padding, target_end_padding; int target_storage_size, target_arena_size, target_start, target_end; const unsigned char *target_arena; int *target_popcount_indices, target_popcount_indices_size; int errval, result_offset; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, "diiiit#iiiiit#iit#Oi:threshold_tanimoto_arena", 
&threshold, &num_bits, &query_start_padding, &query_end_padding, &query_storage_size, &query_arena, &query_arena_size, &query_start, &query_end, &target_start_padding, &target_end_padding, &target_storage_size, &target_arena, &target_arena_size, &target_start, &target_end, &target_popcount_indices, &target_popcount_indices_size, &results, &result_offset)) { return NULL; } if (bad_threshold(threshold) || bad_num_bits(num_bits) || bad_fingerprint_sizes(num_bits, query_storage_size, target_storage_size) || bad_padding("query ", query_start_padding, query_end_padding, &query_arena, &query_arena_size) || bad_padding("target ", target_start_padding, target_end_padding, &target_arena, &target_arena_size) || bad_arena_limits("query ", query_arena_size, query_storage_size, &query_start, &query_end) || bad_arena_limits("target ", target_arena_size, target_storage_size, &target_start, &target_end) || bad_popcount_indices("target ", 1, num_bits, target_popcount_indices_size, &target_popcount_indices) || bad_results(results, result_offset) ) { return NULL; } Py_BEGIN_ALLOW_THREADS; errval = chemfp_threshold_tanimoto_arena( threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results->results + result_offset); Py_END_ALLOW_THREADS; return PyInt_FromLong(errval); } /* knearest_tanimoto_arena */ static PyObject * knearest_tanimoto_arena(PyObject *self, PyObject *args) { int k; double threshold; int num_bits; int query_start_padding, query_end_padding; int query_storage_size, query_arena_size, query_start, query_end; const unsigned char *query_arena; int target_start_padding, target_end_padding; int target_storage_size, target_arena_size, target_start, target_end; const unsigned char *target_arena; int *target_popcount_indices, target_popcount_indices_size; int errval, result_offset; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, 
"idiiiit#iiiiit#iit#Oi:knearest_tanimoto_arena", &k, &threshold, &num_bits, &query_start_padding, &query_end_padding, &query_storage_size, &query_arena, &query_arena_size, &query_start, &query_end, &target_start_padding, &target_end_padding, &target_storage_size, &target_arena, &target_arena_size, &target_start, &target_end, &target_popcount_indices, &target_popcount_indices_size, &results, &result_offset)) { return NULL; } if (bad_k(k) || bad_threshold(threshold) || bad_num_bits(num_bits) || bad_padding("query ", query_start_padding, query_end_padding, &query_arena, &query_arena_size) || bad_padding("target ", target_start_padding, target_end_padding, &target_arena, &target_arena_size) || bad_fingerprint_sizes(num_bits, query_storage_size, target_storage_size) || bad_arena_limits("query ", query_arena_size, query_storage_size, &query_start, &query_end) || bad_arena_limits("target ", target_arena_size, target_storage_size, &target_start, &target_end) || bad_popcount_indices("target ", 1, num_bits, target_popcount_indices_size, &target_popcount_indices) || bad_results(results, result_offset)) { return NULL; } Py_BEGIN_ALLOW_THREADS; errval = chemfp_knearest_tanimoto_arena( k, threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results->results); Py_END_ALLOW_THREADS; return PyInt_FromLong(errval); } static PyObject * knearest_results_finalize(PyObject *self, PyObject *args) { int result_offset, num_results; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, "Oii", &results, &result_offset, &num_results)) { return NULL; } if (bad_results(results, result_offset) || bad_num_results(num_results)) { return NULL; } Py_BEGIN_ALLOW_THREADS; chemfp_knearest_results_finalize(results->results+result_offset, results->results+result_offset+num_results); Py_END_ALLOW_THREADS; return Py_BuildValue(""); } /***** Symmetric search code ****/ static PyObject * 
count_tanimoto_hits_arena_symmetric(PyObject *self, PyObject *args) { double threshold; int num_bits, start_padding, end_padding, storage_size, arena_size; int query_start, query_end, target_start, target_end; const unsigned char *arena; int *popcount_indices, *result_counts; int popcount_indices_size, result_counts_size; UNUSED(self); if (!PyArg_ParseTuple(args, "diiiis#iiiis#w#:count_tanimoto_hits_arena_symmetric", &threshold, &num_bits, &start_padding, &end_padding, &storage_size, &arena, &arena_size, &query_start, &query_end, &target_start, &target_end, &popcount_indices, &popcount_indices_size, &result_counts, &result_counts_size)) { return NULL; } if (bad_threshold(threshold) || bad_num_bits(num_bits) || bad_padding("", start_padding, end_padding, &arena, &arena_size) || bad_fingerprint_sizes(num_bits, storage_size, storage_size) || bad_arena_limits("query ", arena_size, storage_size, &query_start, &query_end) || bad_arena_limits("target ", arena_size, storage_size, &target_start, &target_end) || bad_popcount_indices("", 1, num_bits, popcount_indices_size, &popcount_indices)) { return NULL; } if (result_counts_size < (arena_size / storage_size) * sizeof(int) ) { PyErr_SetString(PyExc_ValueError, "not enough space allocated for result_counts"); return NULL; } if (query_start > query_end) { Py_RETURN_NONE; } Py_BEGIN_ALLOW_THREADS; chemfp_count_tanimoto_hits_arena_symmetric(threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, result_counts); Py_END_ALLOW_THREADS; Py_RETURN_NONE; } static PyObject * threshold_tanimoto_arena_symmetric(PyObject *self, PyObject *args) { double threshold; int num_bits, start_padding, end_padding, storage_size, arena_size; int query_start, query_end, target_start, target_end; const unsigned char *arena; int *popcount_indices; int popcount_indices_size; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, "diiiis#iiiis#O:threshold_tanimoto_arena_symmetric", &threshold, &num_bits, 
&start_padding, &end_padding, &storage_size, &arena, &arena_size, &query_start, &query_end, &target_start, &target_end, &popcount_indices, &popcount_indices_size, &results)) { return NULL; } if (bad_threshold(threshold) || bad_num_bits(num_bits) || bad_padding("", start_padding, end_padding, &arena, &arena_size) || bad_fingerprint_sizes(num_bits, storage_size, storage_size) || bad_arena_limits("query ", arena_size, storage_size, &query_start, &query_end) || bad_arena_limits("target ", arena_size, storage_size, &target_start, &target_end) || bad_popcount_indices("", 1, num_bits, popcount_indices_size, &popcount_indices) || bad_results(results, 0)) { return NULL; } Py_BEGIN_ALLOW_THREADS; chemfp_threshold_tanimoto_arena_symmetric(threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results->results); Py_END_ALLOW_THREADS; Py_RETURN_NONE; } /* knearest_tanimoto_arena */ static PyObject * knearest_tanimoto_arena_symmetric(PyObject *self, PyObject *args) { double threshold; int k, num_bits, start_padding, end_padding, storage_size, arena_size; int query_start, query_end, target_start, target_end; const unsigned char *arena; int *popcount_indices; int popcount_indices_size; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, "idiiiis#iiiis#O:knearest_tanimoto_arena_symmetric", &k, &threshold, &num_bits, &start_padding, &end_padding, &storage_size, &arena, &arena_size, &query_start, &query_end, &target_start, &target_end, &popcount_indices, &popcount_indices_size, &results)) { return NULL; } if (bad_k(k) || bad_threshold(threshold) || bad_num_bits(num_bits) || bad_padding("", start_padding, end_padding, &arena, &arena_size) || bad_fingerprint_sizes(num_bits, storage_size, storage_size) || bad_arena_limits("query ", arena_size, storage_size, &query_start, &query_end) || bad_arena_limits("target ", arena_size, storage_size, &target_start, &target_end) || bad_popcount_indices("", 1, num_bits, 
popcount_indices_size, &popcount_indices) || bad_results(results, 0)) { return NULL; } Py_BEGIN_ALLOW_THREADS; chemfp_knearest_tanimoto_arena_symmetric(k, threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results->results); Py_END_ALLOW_THREADS; Py_RETURN_NONE; } static PyObject * fill_lower_triangle(PyObject *self, PyObject *args) { int num_results, errval; SearchResults *results; UNUSED(self); if (!PyArg_ParseTuple(args, "Oi:fill_lower_triangle", &results, &num_results)) { return NULL; } if (bad_results(results, 0) || bad_num_results(num_results)) { return NULL; } Py_BEGIN_ALLOW_THREADS; errval = chemfp_fill_lower_triangle(num_results, results->results); Py_END_ALLOW_THREADS; if (errval) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(errval)); return NULL; } Py_RETURN_NONE; } /* Select the popcount methods */ static PyObject * get_num_methods(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyInt_FromLong(chemfp_get_num_methods()); } static PyObject * get_method_name(PyObject *self, PyObject *args) { int method; const char *s; UNUSED(self); if (!PyArg_ParseTuple(args, "i:get_method_name", &method)) { return NULL; } s = chemfp_get_method_name(method); if (s == NULL) { PyErr_SetString(PyExc_IndexError, "method index is out of range"); return NULL; } return PyString_FromString(s); } static PyObject * get_num_alignments(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyInt_FromLong(chemfp_get_num_alignments()); } static PyObject * get_alignment_name(PyObject *self, PyObject *args) { int alignment; const char *s; UNUSED(self); if (!PyArg_ParseTuple(args, "i:get_alignment_name", &alignment)) { return NULL; } s = chemfp_get_alignment_name(alignment); if (s == NULL) { PyErr_SetString(PyExc_IndexError, "alignment index is out of range"); return NULL; } return PyString_FromString(s); } static PyObject * get_alignment_method(PyObject *self, PyObject *args) { int 
alignment, method; UNUSED(self); if (!PyArg_ParseTuple(args, "i:get_alignment_method", &alignment)) { return NULL; } method = chemfp_get_alignment_method(alignment); if (method < 0) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(method)); return NULL; } return PyInt_FromLong(method); } static PyObject * set_alignment_method(PyObject *self, PyObject *args) { int alignment, method; int result; UNUSED(self); if (!PyArg_ParseTuple(args, "ii:set_alignment_method", &alignment, &method)) { return NULL; } result = chemfp_set_alignment_method(alignment, method); if (result < 0) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(result)); return NULL; } return Py_BuildValue(""); } static PyObject * select_fastest_method(PyObject *self, PyObject *args) { int alignment, repeat, result; UNUSED(self); if (!PyArg_ParseTuple(args, "ii:select_fastest_method", &alignment, &repeat)) { return NULL; } result = chemfp_select_fastest_method(alignment, repeat); if (result < 0) { PyErr_SetString(PyExc_ValueError, chemfp_strerror(result)); return NULL; } return PyInt_FromLong(result); } static PyObject * get_num_options(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyInt_FromLong(chemfp_get_num_options()); } static PyObject * get_option_name(PyObject *self, PyObject *args) { int i; const char *s; UNUSED(self); if (!PyArg_ParseTuple(args, "i:get_option_name", &i)) { return NULL; } s = chemfp_get_option_name(i); if (s == NULL) { PyErr_SetString(PyExc_IndexError, "option name index out of range"); return NULL; } return PyString_FromString(s); } static PyObject * get_option(PyObject *self, PyObject *args) { const char *option; int value; UNUSED(self); if (!PyArg_ParseTuple(args, "s:get_option", &option)) { return NULL; } value = chemfp_get_option(option); if (value == CHEMFP_BAD_ARG) { /* Nothing can currently return -1, so this is an error */ PyErr_SetString(PyExc_ValueError, "Unknown option name"); /* Return NULL so the exception actually propagates */ return NULL; } return PyInt_FromLong(value); } static PyObject * 
set_option(PyObject *self, PyObject *args) { const char *option; int value, result; UNUSED(self); if (!PyArg_ParseTuple(args, "si:set_option", &option, &value)) { return NULL; } /* Make sure it's a valid name */ if (chemfp_get_option(option) == CHEMFP_BAD_ARG) { PyErr_SetString(PyExc_ValueError, "Unknown option name"); return NULL; } result = chemfp_set_option(option, value); if (result != CHEMFP_OK) { PyErr_SetString(PyExc_ValueError, "Bad option value"); return NULL; } return Py_BuildValue(""); } static PyObject* get_num_threads(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyInt_FromLong(chemfp_get_num_threads()); } static PyObject* set_num_threads(PyObject *self, PyObject *args) { int num_threads; UNUSED(self); if (!PyArg_ParseTuple(args, "i:set_num_threads", &num_threads)) { return NULL; } chemfp_set_num_threads(num_threads); Py_RETURN_NONE; } static PyObject* get_max_threads(PyObject *self, PyObject *args) { UNUSED(self); UNUSED(args); return PyInt_FromLong(chemfp_get_max_threads()); } static PyMethodDef chemfp_methods[] = { {"version", version, METH_NOARGS, "version()\n\nReturn the chemfp library version, as a string like '1.0'"}, {"strerror", strerror_, METH_VARARGS, "strerror(n)\n\nConvert the error code integer to more descriptive text"}, {"hex_isvalid", hex_isvalid, METH_VARARGS, "hex_isvalid(s)\n\nReturn 1 if the string is a valid hex fingerprint, otherwise 0"}, {"hex_popcount", hex_popcount, METH_VARARGS, "hex_popcount(fp)\n\nReturn the number of bits set in a hex fingerprint, or -1 for non-hex strings"}, {"hex_intersect_popcount", hex_intersect_popcount, METH_VARARGS, "hex_intersect_popcount(fp1, fp2)\n\nReturn the number of bits set in the intersection of the two hex fingerprints,\nor -1 if either string is a non-hex string"}, {"hex_tanimoto", hex_tanimoto, METH_VARARGS, "hex_tanimoto(fp1, fp2)\n\nCompute the Tanimoto similarity between two hex fingerprints.\nReturn a float between 0.0 and 1.0, or -1.0 if either string is not a 
hex fingerprint"}, {"hex_contains", hex_contains, METH_VARARGS, "hex_contains(super_fp, sub_fp)\n\nReturn 1 if the on bits of sub_fp are also 1 bits in super_fp, otherwise 0.\nReturn -1 if either string is not a hex fingerprint"}, {"byte_popcount", byte_popcount, METH_VARARGS, "byte_popcount(fp)\n\nReturn the number of bits set in a byte fingerprint"}, {"byte_intersect_popcount", byte_intersect_popcount, METH_VARARGS, "byte_intersect_popcount(fp1, fp2)\n\nReturn the number of bits set in the intersection of the two byte fingerprints"}, {"byte_tanimoto", byte_tanimoto, METH_VARARGS, "byte_tanimoto(fp1, fp2)\n\nCompute the Tanimoto similarity between two byte fingerprints"}, {"byte_contains", byte_contains, METH_VARARGS, "byte_contains(super_fp, sub_fp)\n\nReturn 1 if the on bits of sub_fp are also 1 bits in super_fp"}, {"byte_intersect", byte_intersect, METH_VARARGS, "byte_intersect(fp1, fp2)\n\nReturn fp1 & fp2"}, {"byte_union", byte_union, METH_VARARGS, "byte_union(fp1, fp2)\n\nReturn fp1 | fp2"}, {"byte_difference", byte_difference, METH_VARARGS, "byte_difference(fp1, fp2)\n\nReturn fp1 ^ fp2"}, /* FPS */ {"fps_line_validate", fps_line_validate, METH_VARARGS, "fps_line_validate (TODO: document)"}, {"fps_parse_id_fp", fps_parse_id_fp, METH_VARARGS, "fps_parse_id_fp (TODO: document)"}, {"fps_threshold_tanimoto_search", fps_threshold_tanimoto_search, METH_VARARGS, "fps_threshold_tanimoto_search (TODO: document)"}, {"fps_count_tanimoto_hits", fps_count_tanimoto_hits, METH_VARARGS, "fps_count_tanimoto_hits (TODO: document)"}, {"fps_knearest_search_init", fps_knearest_search_init, METH_VARARGS, "fps_knearest_search_init (TODO: document)"}, {"fps_knearest_tanimoto_search_feed", fps_knearest_tanimoto_search_feed, METH_VARARGS, "fps_knearest_tanimoto_search_feed (TODO: document)"}, {"fps_knearest_search_finish", fps_knearest_search_finish, METH_VARARGS, "fps_knearest_search_finish (TODO: document)"}, {"fps_knearest_search_free", fps_knearest_search_free, METH_VARARGS, 
"fps_knearest_search_free (TODO: document)"}, {"count_tanimoto_arena", count_tanimoto_arena, METH_VARARGS, "count_tanimoto_arena (TODO: document)"}, {"threshold_tanimoto_arena", threshold_tanimoto_arena, METH_VARARGS, "threshold_tanimoto_arena (TODO: document)"}, {"knearest_tanimoto_arena", knearest_tanimoto_arena, METH_VARARGS, "knearest_tanimoto_arena (TODO: document)"}, {"knearest_results_finalize", knearest_results_finalize, METH_VARARGS, "knearest_results_finalize (TODO: document)"}, {"count_tanimoto_hits_arena_symmetric", count_tanimoto_hits_arena_symmetric, METH_VARARGS, "count_tanimoto_hits_arena_symmetric (TODO: document)"}, {"threshold_tanimoto_arena_symmetric", threshold_tanimoto_arena_symmetric, METH_VARARGS, "threshold_tanimoto_arena_symmetric (TODO: document)"}, {"knearest_tanimoto_arena_symmetric", knearest_tanimoto_arena_symmetric, METH_VARARGS, "knearest_tanimoto_arena_symmetric (TODO: document)"}, {"fill_lower_triangle", fill_lower_triangle, METH_VARARGS, "fill_lower_triangle (TODO: document)"}, {"make_sorted_aligned_arena", make_sorted_aligned_arena, METH_VARARGS, "make_sorted_aligned_arena (TODO: document)"}, {"make_unsorted_aligned_arena", make_unsorted_aligned_arena, METH_VARARGS, "make_unsorted_aligned_arena (TODO: document)"}, {"align_fingerprint", align_fingerprint, METH_VARARGS, "align_fingerprint (TODO: document)"}, /* Select the popcount methods */ {"get_num_methods", get_num_methods, METH_NOARGS, "get_num_methods (TODO: document)"}, {"get_method_name", get_method_name, METH_VARARGS, "get_method_name (TODO: document)"}, {"get_num_alignments", get_num_alignments, METH_NOARGS, "get_num_alignments (TODO: document)"}, {"get_alignment_name", get_alignment_name, METH_VARARGS, "get_alignment_name (TODO: document)"}, {"get_alignment_method", get_alignment_method, METH_VARARGS, "get_alignment_method (TODO: document)"}, {"set_alignment_method", set_alignment_method, METH_VARARGS, "set_alignment_method (TODO: document)"}, {"select_fastest_method", 
select_fastest_method, METH_VARARGS, "select_fastest_method (TODO: document)"}, {"get_num_options", get_num_options, METH_NOARGS, "get_num_options (TODO: document)"}, {"get_option_name", get_option_name, METH_VARARGS, "get_option_name (TODO: document)"}, {"get_option", get_option, METH_VARARGS, "get_option (TODO: document)"}, {"set_option", set_option, METH_VARARGS, "set_option (TODO: document)"}, {"get_num_threads", get_num_threads, METH_NOARGS, "get_num_threads()\n\nGet the number of OpenMP threads to use in a search"}, {"set_num_threads", set_num_threads, METH_VARARGS, "set_num_threads(num_threads)\n\nSet the number of OpenMP threads to use in a search"}, {"get_max_threads", get_max_threads, METH_NOARGS, "get_max_threads()\n\nGet the maximum number of OpenMP threads available"}, {NULL, NULL, 0, NULL} /* Sentinel */ }; PyMODINIT_FUNC init_chemfp(void) { PyObject *m; if (PyType_Ready(&chemfp_py_SearchResultsType) < 0) { return; } m = Py_InitModule3("_chemfp", chemfp_methods, "Documentation goes here"); if (m == NULL) { /* Module creation failed; the exception is already set */ return; } Py_INCREF(&chemfp_py_SearchResultsType); PyModule_AddObject(m, "SearchResults", (PyObject *)&chemfp_py_SearchResultsType); } chemfp-1.1p1/src/search_core.c0000644000077000000240000011627212106314461016474 0ustar dalkestaff00000000000000/* This is a rather cumbersome solution to two problems I have with OpenMP. 1) Multiple threads and OpenMP don't mix on a Mac. It segfaults during the first OpenMP call. I want people to be able to use chemfp in multi-threaded environments, even with diminished performance, so the single-thread version should not go through the OpenMP path. 2) I measured a roughly 5% performance penalty with a single thread using OpenMP vs. the code compiled without OpenMP. My solution is to compile the core code twice, once for each path. 
The RENAME macro rewrites int RENAME(chemfp_count_tanimoto_arena) to one of: static int chemfp_count_tanimoto_arena_single -- single-threaded, compiler supports OpenMP static int chemfp_count_tanimoto_arena_openmp -- multiple OpenMP threads int chemfp_count_tanimoto_arena -- single-threaded, compiler does not support OpenMP depending on the circumstances. In a normal build, where OpenMP is available, then this file will be #include'd twice. */ /* count code */ int RENAME(chemfp_count_tanimoto_arena)( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ int *target_popcount_indices, /* Results go into these arrays */ int *result_counts ) { int query_index, target_index; const unsigned char *query_fp, *target_fp; int start, end; int count; int fp_size = (num_bits+7) / 8; double score, popcount_sum; int query_popcount, start_target_popcount, end_target_popcount; int target_popcount; int intersect_popcount; chemfp_popcount_f calc_popcount; chemfp_intersect_popcount_f calc_intersect_popcount; if (query_start >= query_end) { /* No queries */ return CHEMFP_OK; } /* Prevent overflow if someone uses a threshold of, say, 1E-80 */ /* (Not really needed unless you trap IEEE 754 overflow errors) */ if (threshold > 0.0 && threshold < 1.0/num_bits) { threshold = 0.5 / num_bits; } if ((target_start >= target_end) || threshold > 1.0) { for (query_index = 0; query_index < (query_end-query_start); query_index++) { /* No possible targets */ result_counts[query_index] = 0; } return CHEMFP_OK; } if (threshold <= 0.0) { /* Everything will match, so there's no need to figure that out */ for (query_index = 0; 
query_index < (query_end-query_start); query_index++) { result_counts[query_index] = (target_end - target_start); } return CHEMFP_OK; } if (target_popcount_indices == NULL) { /* Handle the case when precomputed targets aren't available. */ /* This is a slower algorithm because it tests everything. */ #if USE_OPENMP == 1 #pragma omp parallel for private(query_fp, target_fp, count, target_index, score) schedule(dynamic) #endif for (query_index = 0; query_index < (query_end-query_start); query_index++) { query_fp = query_arena + (query_start + query_index) * query_storage_size; target_fp = target_arena + (target_start * target_storage_size); /* Handle the popcount(query) == 0 special case? */ count = 0; for (target_index = target_start; target_index < target_end; target_index++, target_fp += target_storage_size) { score = chemfp_byte_tanimoto(fp_size, query_fp, target_fp); if (score >= threshold) { count++; } } result_counts[query_index] = count; } return CHEMFP_OK; } /* Choose popcounts optimized for this case */ calc_popcount = chemfp_select_popcount(num_bits, query_storage_size, query_arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, query_storage_size, query_arena, target_storage_size, target_arena); /* This uses the limits from Swamidass and Baldi */ /* It doesn't use the search ordering because it's supposed to find everything */ #if USE_OPENMP == 1 #pragma omp parallel for \ private(query_fp, query_popcount, start_target_popcount, end_target_popcount, \ count, target_popcount, start, end, target_fp, popcount_sum, target_index, intersect_popcount, score) \ schedule(dynamic) #endif for (query_index = 0; query_index < (query_end-query_start); query_index++) { query_fp = query_arena + (query_start + query_index) * query_storage_size; query_popcount = calc_popcount(fp_size, query_fp); /* Special case when popcount(query) == 0; everything has a score of 0.0 */ if (query_popcount == 0) { if (threshold == 0.0) { result_counts[query_index] = 
(target_end - target_start); } continue; } /* Figure out which fingerprints to search */ if (threshold == 0.0) { start_target_popcount = 0; end_target_popcount = num_bits; } else { start_target_popcount = (int)(query_popcount * threshold); end_target_popcount = (int)(ceil(query_popcount / threshold)); if (end_target_popcount > num_bits) { end_target_popcount = num_bits; } } count = 0; for (target_popcount = start_target_popcount; target_popcount <= end_target_popcount; target_popcount++) { start = target_popcount_indices[target_popcount]; end = target_popcount_indices[target_popcount+1]; if (start < target_start) { start = target_start; } if (end > target_end) { end = target_end; } target_fp = target_arena + (start * target_storage_size); popcount_sum = query_popcount + target_popcount; for (target_index = start; target_index < end; target_index++, target_fp += target_storage_size) { intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); score = intersect_popcount / (popcount_sum - intersect_popcount); if (score >= threshold) { count++; } } } result_counts[query_index] = count; } /* went through each of the queries */ return CHEMFP_OK; } int RENAME(chemfp_threshold_tanimoto_arena)( /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go here */ chemfp_search_result *results) { int query_index, target_index; const unsigned char *query_fp, *target_fp; int start, end; int fp_size = (num_bits+7) / 8; double score; int query_popcount, start_target_popcount, end_target_popcount; int target_popcount; int 
intersect_popcount, popcount_sum; int numerator, denominator; int add_hit_error = 0; chemfp_popcount_f calc_popcount; chemfp_intersect_popcount_f calc_intersect_popcount; if (query_start >= query_end) { /* No queries */ return CHEMFP_OK; } /* Prevent overflow if someone uses a threshold of, say, 1E-80 */ /* (Not really needed unless you trap IEEE 754 overflow errors) */ if (threshold > 0.0 && threshold < 1.0/num_bits) { threshold = 0.5 / num_bits; } if ((target_start >= target_end) || threshold > 1.0) { return CHEMFP_OK; } if (target_popcount_indices == NULL) { /* Handle the case when precomputed targets aren't available. */ /* This is a slower algorithm because it tests everything. */ #if USE_OPENMP == 1 #pragma omp parallel for private(query_fp, target_fp, target_index, score) schedule(dynamic) #endif for (query_index = query_start; query_index < query_end; query_index++) { query_fp = query_arena + (query_index * query_storage_size); target_fp = target_arena + (target_start * target_storage_size); /* Handle the popcount(query) == 0 special case? 
*/ for (target_index = target_start; target_index < target_end; target_index++, target_fp += target_storage_size) { score = chemfp_byte_tanimoto(fp_size, query_fp, target_fp); if (score >= threshold) { if (!chemfp_add_hit(results+(query_index-query_start), target_index, score)) { add_hit_error = 1; } } } } if (add_hit_error) { return CHEMFP_NO_MEM; } return CHEMFP_OK; } calc_popcount = chemfp_select_popcount(num_bits, query_storage_size, query_arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, query_storage_size, query_arena, target_storage_size, target_arena); denominator = num_bits * 10; numerator = (int)(threshold * denominator); /* This uses the limits from Swamidass and Baldi */ /* It doesn't use the search ordering because it's supposed to find everything */ #if USE_OPENMP == 1 #pragma omp parallel for \ private(query_fp, query_popcount, target_index, target_fp, start_target_popcount, \ end_target_popcount, target_popcount, start, end, popcount_sum, intersect_popcount, score) \ schedule(dynamic) #endif for (query_index = query_start; query_index < query_end; query_index++) { query_fp = query_arena + (query_index * query_storage_size); query_popcount = calc_popcount(fp_size, query_fp); /* Special case when popcount(query) == 0; everything has a score of 0.0 */ if (query_popcount == 0) { if (threshold == 0.0) { for (target_index = target_start; target_index < target_end; target_index++) { if (!chemfp_add_hit(results+(query_index-query_start), target_index, 0.0)) { add_hit_error = 1; } } } continue; } /* Figure out which fingerprints to search */ if (threshold == 0.0) { start_target_popcount = 0; end_target_popcount = num_bits; } else { start_target_popcount = (int)(query_popcount * threshold); end_target_popcount = (int)(ceil(query_popcount / threshold)); if (end_target_popcount > num_bits) { end_target_popcount = num_bits; } } for (target_popcount=start_target_popcount; target_popcount<=end_target_popcount; target_popcount++) { start 
= target_popcount_indices[target_popcount]; end = target_popcount_indices[target_popcount+1]; if (start < target_start) { start = target_start; } if (end > target_end) { end = target_end; } target_fp = target_arena + (start * target_storage_size); popcount_sum = query_popcount + target_popcount; for (target_index = start; target_index < end; target_index++, target_fp += target_storage_size) { intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); /* In my timings (on a Mac), the comparison against a double was a hotspot, */ /* but division is not. I switch to integer math and gained a 3-4% performance, */ /* at the cost of slightly more complicated code. */ if (denominator * intersect_popcount >= numerator * (popcount_sum - intersect_popcount)) { score = ((double) intersect_popcount) / (popcount_sum - intersect_popcount); if (!chemfp_add_hit(results+(query_index-query_start), target_index, score)) { add_hit_error = 1; } } } } } /* went through each of the queries */ if (add_hit_error) { return CHEMFP_NO_MEM; } return CHEMFP_OK; } static int RENAME(knearest_tanimoto_arena_no_popcounts)( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Fingerprint size in bits */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Results go into these arrays */ chemfp_search_result *results ) { int query_index, target_index; int fp_size = (num_bits+7)/8; const unsigned char *query_fp, *target_fp; double query_threshold, score; chemfp_search_result *result; for (query_index = 0; query_index < (query_end-query_start); query_index++) { query_fp = query_arena + (query_start+query_index) * query_storage_size; result = results+query_index; query_threshold = threshold; target_fp = target_arena + 
(target_start * target_storage_size); target_index = target_start; for (; target_index < target_end; target_index++, target_fp += target_storage_size) { score = chemfp_byte_tanimoto(fp_size, query_fp, target_fp); if (score >= query_threshold) { chemfp_add_hit(result, target_index, score); if (result->num_hits == k) { chemfp_heapq_heapify(k, result, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); query_threshold = result->scores[0]; /* Since we leave the loop early, I need to advance the pointers */ target_index++; target_fp += target_storage_size; break; } } } /* Either we've reached the end of the fingerprints or the heap is full */ if (result->num_hits == k) { /* Continue scanning through the fingerprints */ for (; target_index < target_end; target_index++, target_fp += target_storage_size) { score = chemfp_byte_tanimoto(fp_size, query_fp, target_fp); /* We need to be strictly *better* than what's in the heap */ if (score > query_threshold) { result->indices[0] = target_index; result->scores[0] = score; chemfp_heapq_siftup(k, result, 0, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); query_threshold = result->scores[0]; } /* heapreplaced the old smallest item with the new item */ } /* End of the fingerprint scan */ } else { /* The heap isn't full, so we haven't yet heapified it. 
*/ chemfp_heapq_heapify(result->num_hits, result, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); } } /* Loop through the queries */ return query_index-query_start; } int RENAME(chemfp_knearest_tanimoto_arena)( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Size of the fingerprints and size of the storage block */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ int *target_popcount_indices, /* Results go into these arrays */ chemfp_search_result *results ) { int fp_size; int query_popcount, target_popcount, intersect_popcount; double score, best_possible_score, popcount_sum, query_threshold; const unsigned char *query_fp, *target_fp; int query_index, target_index; int start, end; PopcountSearchOrder popcount_order; chemfp_search_result *result; chemfp_popcount_f calc_popcount; chemfp_intersect_popcount_f calc_intersect_popcount; /* This is C. We don't check for illegal input values. */ if (query_start >= query_end) { return 0; } /* k == 0 is a valid input, and of course the result is no matches */ if (k == 0) { return CHEMFP_OK; } fp_size = (num_bits+7)/8; if (target_popcount_indices == NULL) { /* precomputed targets aren't available. Use the slower algorithm. 
*/ return RENAME(knearest_tanimoto_arena_no_popcounts)( k, threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, results); } /* Choose popcounts optimized for this case */ calc_popcount = chemfp_select_popcount(num_bits, query_storage_size, query_arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, query_storage_size, query_arena, target_storage_size, target_arena); /* Loop through the query fingerprints */ for (query_index=0; query_index < (query_end-query_start); query_index++) { result = results+query_index; query_fp = query_arena + (query_start+query_index) * query_storage_size; query_threshold = threshold; query_popcount = calc_popcount(fp_size, query_fp); if (query_popcount == 0) { /* By definition this will never return hits. Even if threshold == 0.0. */ /* (I considered returning the first k hits, but that's chemically meaningless.) */ /* XXX change this. Make it returns the first k hits */ continue; } /* Search the bins using the ordering from Swamidass and Baldi.*/ init_search_order(&popcount_order, query_popcount, num_bits); /* Look through the sections of the arena in optimal popcount order */ while (next_popcount(&popcount_order, query_threshold)) { target_popcount = popcount_order.popcount; best_possible_score = popcount_order.score; /* If we can't beat the query threshold then we're done with the targets */ if (best_possible_score < query_threshold) { break; } /* Scan through the targets which have the given popcount */ start = target_popcount_indices[target_popcount]; end = target_popcount_indices[target_popcount+1]; if (!check_bounds(&popcount_order, &start, &end, target_start, target_end)) { continue; } /* Iterate over the target fingerprints */ target_fp = target_arena + start*target_storage_size; popcount_sum = (double)(query_popcount + target_popcount); target_index = start; /* There are fewer than 'k' elements in the heap*/ if 
(result->num_hits < k) { for (; target_index < end; target_index++, target_fp += target_storage_size) { intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); score = intersect_popcount / (popcount_sum - intersect_popcount); if (score >= query_threshold) { chemfp_add_hit(result, target_index, score); if (result->num_hits == k) { chemfp_heapq_heapify(k, result, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); query_threshold = result->scores[0]; /* We're going to jump to the "heap is full" section */ /* Since we leave the loop early, I need to advance the pointers */ target_index++; target_fp += target_storage_size; goto heap_replace; } } /* Added to heap */ } /* Went through target fingerprints */ /* If we're here then the heap did not fill up. Try the next popcount */ continue; } heap_replace: /* We only get here if the heap contains k elements */ /* Earlier we tested for "best_possible_score < query_threshold" */ if (query_threshold >= best_possible_score) { /* Can't do better. Might as well give up. */ break; } /* Scan through the target fingerprints; can we improve over the threshold? */ for (; target_index < end; target_index++, target_fp += target_storage_size) { intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); score = intersect_popcount / (popcount_sum - intersect_popcount); if (score > query_threshold) { result->indices[0] = target_index; result->scores[0] = score; chemfp_heapq_siftup(k, result, 0, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); query_threshold = result->scores[0]; if (query_threshold >= best_possible_score) { /* we can't do any better in this section (or in later ones) */ break; } } /* heapreplaced the old smallest item with the new item */ } /* looped over fingerprints */ } /* Went through all the popcount regions */ /* We have scanned all the fingerprints. Is the heap full? */ if (result->num_hits < k) { /* Not full, so need to heapify it. */ chemfp_heapq_heapify(result->num_hits, result, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); } } /* looped over all queries */ return CHEMFP_OK; } /***** Special support for the NxN symmetric case ******/ /* TODO: implement the k-nearest variant. It's harder because a k-nearest search, combined with the Swamidass and Baldi search limits, is not reflexive. 
*/ int RENAME(chemfp_count_tanimoto_hits_arena_symmetric)( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Fingerprint arena */ int storage_size, const unsigned char *arena, /* Row start and end indices */ int query_start, int query_end, /* Column start and end indices */ int target_start, int target_end, /* Target popcount distribution information */ int *target_popcount_indices, /* Results _increment_ existing values in the array - remember to initialize! */ int *result_counts ) { int fp_size = (num_bits+7) / 8; int query_index, target_index; int start, end; int query_popcount, target_popcount; int start_target_popcount, end_target_popcount, intersect_popcount; int count; double popcount_sum, score; const unsigned char *query_fp, *target_fp; chemfp_popcount_f calc_popcount; chemfp_intersect_popcount_f calc_intersect_popcount; #if USE_OPENMP == 1 /* Reduce contention by using a per-thread counts array. For details see: */ /* http://www.dalkescientific.com/writings/diary/archive/2012/01/17/I_parallelize_an_algorithm.html */ int i; int num_threads; int *parallel_counts; int *per_thread_counts; int per_thread_size; #endif /* Check that we're not obviously in the lower triangle */ if (query_start >= target_end) { /* No possible hits */ return CHEMFP_OK; } /* Shift the target towards the upper triangle, if needed */ if (target_start < query_start) { target_start = query_start; } /* Check for edge cases */ if ((query_start >= query_end) || (target_start >= target_end) || (threshold > 1.0)) { return CHEMFP_OK; } if (threshold <= 0.0) { /* By definition, everything matches */ /* FIXME: this is inelegant. 
I'm finding the symmetry and boundary conditions a bit tricky */ for (query_index=query_start; query_index 0.0 && threshold < 1.0/num_bits) { threshold = 0.5 / num_bits; } /* target_popcount_indices must exist; if you don't care for the factor */ /* of two performance increase by precomputing/presorting based on popcount */ /* then why are you interested in the factor of two based on symmetry? */ /* Choose popcount methods optimized for this case */ calc_popcount = chemfp_select_popcount(num_bits, storage_size, arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, storage_size, arena, storage_size, arena); /* This uses the limits from Swamidass and Baldi */ #if USE_OPENMP == 1 num_threads = omp_get_max_threads(); per_thread_size = MAX(query_end, target_end); parallel_counts = (int *) calloc(num_threads * per_thread_size, sizeof(int)); if (!parallel_counts) { return CHEMFP_NO_MEM; } #pragma omp parallel for \ private(query_fp, query_popcount, start_target_popcount, end_target_popcount, \ count, target_popcount, start, end, target_fp, popcount_sum, target_index, \ intersect_popcount, score, per_thread_counts) \ schedule(dynamic) #endif for (query_index = query_start; query_index < query_end; query_index++) { query_fp = arena + (query_index * storage_size); query_popcount = calc_popcount(fp_size, query_fp); #if USE_OPENMP == 1 per_thread_counts = parallel_counts+(omp_get_thread_num() * per_thread_size); #endif /* Special case when popcount(query) == 0; everything has a score of 0.0 */ if (query_popcount == 0) { continue; } /* Figure out which fingerprints to search */ start_target_popcount = (int)(query_popcount * threshold); end_target_popcount = (int)(ceil(query_popcount / threshold)); if (end_target_popcount > num_bits) { end_target_popcount = num_bits; } count = 0; for (target_popcount = start_target_popcount; target_popcount <= end_target_popcount; target_popcount++) { start = target_popcount_indices[target_popcount]; end = 
target_popcount_indices[target_popcount+1]; if (start < target_start) { start = target_start; } start = MAX(query_index+1, start); if (end > target_end) { end = target_end; } target_fp = arena + (start * storage_size); popcount_sum = query_popcount + target_popcount; for (target_index = start; target_index < end; target_index++, target_fp += storage_size) { intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); score = intersect_popcount / (popcount_sum - intersect_popcount); if (score >= threshold) { /* Can accumulate the score for the row. This is likely a register */ /* instead of a memory location so should be slightly faster. */ count++; /* I can't use the same technique for the symmetric match */ #if USE_OPENMP == 1 per_thread_counts[target_index]++; #else result_counts[target_index]++; #endif } } } /* Save the accumulated row counts */ #if USE_OPENMP == 1 if (count) { per_thread_counts[query_index] += count; } #else result_counts[query_index] += count; #endif } /* went through each of the queries */ #if USE_OPENMP == 1 /* Merge the per-thread results into the counts array */ /* TODO: start from MIN(query_start, query_end) */ /* TODO: parallelize? 
*/ for (query_index = 0; query_index < per_thread_size; query_index++) { count = 0; for (i=0; i= target_end) { /* No possible hits */ return CHEMFP_OK; } /* Shift the target towards the upper triangle, if needed */ if (target_start < query_start) { target_start = query_start; } /* Corner cases where I don't need to do anything */ if ((query_start >= query_end) || (target_start >= target_end) || (threshold < 0)) { return CHEMFP_OK; } /* if (threshold == 0.0) { */ /* TODO: Optimize this case */ /* Prevent overflow if someone uses a threshold of, say, 1E-80 */ if (threshold > 0.0 && threshold < 1.0/num_bits) { threshold = 0.5 / num_bits; } if (threshold > 1.0) { return CHEMFP_OK; } calc_popcount = chemfp_select_popcount(num_bits, storage_size, arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, storage_size, arena, storage_size, arena); denominator = num_bits * 10; numerator = (int)(threshold * denominator); /* This uses the limits from Swamidass and Baldi */ /* It doesn't use the search ordering because it's supposed to find everything */ #if USE_OPENMP == 1 #pragma omp parallel for \ private(query_fp, query_popcount, start_target_popcount, end_target_popcount, \ target_popcount, start, end, target_fp, popcount_sum, target_index, intersect_popcount, score) \ schedule(dynamic) #endif for (query_index = query_start; query_index < query_end; query_index++) { query_fp = arena + (query_index * storage_size); query_popcount = calc_popcount(fp_size, query_fp); /* Special case when popcount(query) == 0; everything has a score of 0.0 */ if (query_popcount == 0) { if (threshold == 0.0) { /* Only populate the upper triangle */ target_index = MAX(query_index+1, target_start); for (;target_index < target_end; target_index++) { if (!chemfp_add_hit(results+query_index, target_index, 0.0)) { add_hit_error = 1; } } } continue; } /* Figure out which fingerprints to search, based on the popcount */ if (threshold == 0.0) { start_target_popcount = 0; 
end_target_popcount = num_bits; } else { start_target_popcount = (int)(query_popcount * threshold); end_target_popcount = (int)(ceil(query_popcount / threshold)); if (end_target_popcount > num_bits) { end_target_popcount = num_bits; } } for (target_popcount=start_target_popcount; target_popcount<=end_target_popcount; target_popcount++) { start = popcount_indices[target_popcount]; end = popcount_indices[target_popcount+1]; if (start < target_start) { start = target_start; } if (end > target_end) { end = target_end; } popcount_sum = query_popcount + target_popcount; for (target_index = MAX(query_index+1, start); target_index < end; target_index++) { target_fp = arena + (target_index * storage_size); intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp); if (denominator * intersect_popcount >= numerator * (popcount_sum - intersect_popcount)) { /* Add to the upper triangle */ score = ((double) intersect_popcount) / (popcount_sum - intersect_popcount); if (!chemfp_add_hit(results+query_index, target_index, score)) { add_hit_error = 1; } } } } } /* went through each of the queries */ if (add_hit_error) { return CHEMFP_NO_MEM; } return CHEMFP_OK; } /* I couldn't figure out a way to take advantage of symmetry */ /* This is the same as the NxM algorithm except that it excludes self-matches */ int RENAME(chemfp_knearest_tanimoto_arena_symmetric)( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Arena */ int storage_size, const unsigned char *arena, /* start and end indices for the rows and columns */ int query_start, int query_end, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *popcount_indices, /* Results go into these arrays */ chemfp_search_result *results ) { int fp_size; int query_popcount, target_popcount, intersect_popcount; double score, best_possible_score, 
popcount_sum, query_threshold; const unsigned char *query_fp, *target_fp; int query_index, target_index; int start, end; PopcountSearchOrder popcount_order; chemfp_search_result *result; chemfp_popcount_f calc_popcount; chemfp_intersect_popcount_f calc_intersect_popcount; if (query_start >= query_end) { return 0; } /* k == 0 is a valid input, and of course the result is no matches */ if (k == 0) { return CHEMFP_OK; } fp_size = (num_bits+7)/8; /* Choose popcounts optimized for this case */ calc_popcount = chemfp_select_popcount(num_bits, storage_size, arena); calc_intersect_popcount = chemfp_select_intersect_popcount( num_bits, storage_size, arena, storage_size, arena); /* Loop through the query fingerprints */ #if USE_OPENMP == 1 #pragma omp parallel for \ private(result, query_fp, query_threshold, query_popcount, popcount_order,\ target_popcount, best_possible_score, start, end, target_fp, \ popcount_sum, target_index, intersect_popcount, score) \ schedule(dynamic) #endif for (query_index=query_start; query_index < query_end; query_index++) { result = results+query_index; query_fp = arena + query_index * storage_size; query_threshold = threshold; query_popcount = calc_popcount(fp_size, query_fp); if (query_popcount == 0) { /* By definition this will never return hits. Even if threshold == 0.0. */ /* (I considered returning the first k hits, but that's chemically meaningless.) */ /* XXX change this. 
       Make it return the first k hits */
      continue;
    }

    /* Search the bins using the ordering from Swamidass and Baldi. */
    init_search_order(&popcount_order, query_popcount, num_bits);

    /* Look through the sections of the arena in optimal popcount order */
    while (next_popcount(&popcount_order, query_threshold)) {
      target_popcount = popcount_order.popcount;
      best_possible_score = popcount_order.score;

      /* If we can't beat the query threshold then we're done with the targets */
      if (best_possible_score < query_threshold) {
        break;
      }

      /* Scan through the targets which have the given popcount */
      start = popcount_indices[target_popcount];
      end = popcount_indices[target_popcount+1];
      if (!check_bounds(&popcount_order, &start, &end, target_start, target_end)) {
        continue;
      }

      /* Iterate over the target fingerprints */
      target_fp = arena + start*storage_size;
      popcount_sum = (double)(query_popcount + target_popcount);
      target_index = start;

      /* There are fewer than 'k' elements in the heap */
      if (result->num_hits < k) {
        for (; target_index < end; target_index++, target_fp += storage_size) {
          intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp);
          score = intersect_popcount / (popcount_sum - intersect_popcount);
          if (score >= query_threshold) {
            if (query_index == target_index) {
              continue; /* Don't match self */
            }
            chemfp_add_hit(result, target_index, score);
            if (result->num_hits == k) {
              chemfp_heapq_heapify(k, result, (chemfp_heapq_lt) double_score_lt,
                                   (chemfp_heapq_swap) double_score_swap);
              query_threshold = result->scores[0];
              /* We're going to jump to the "heap is full" section */
              /* Since we leave the loop early, I need to advance the pointers */
              target_index++;
              target_fp += storage_size;
              goto heap_replace;
            }
          } /* Added to heap */
        } /* Went through target fingerprints */

        /* If we're here then the heap did not fill up. Try the next popcount */
        continue;
      }

    heap_replace:
      /* We only get here if the heap contains k elements */

      /* Earlier we tested for "best_possible_score < query_threshold". */
      /* Now that the heap is full, also stop when the two are equal. */
      if (query_threshold >= best_possible_score) {
        /* Can't do better. Might as well give up. */
        break;
      }

      /* Scan through the target fingerprints; can we improve over the threshold?
       */
      for (; target_index < end; target_index++, target_fp += storage_size) {
        intersect_popcount = calc_intersect_popcount(fp_size, query_fp, target_fp);
        score = intersect_popcount / (popcount_sum - intersect_popcount);
        if (score > query_threshold) {
          if (query_index == target_index) {
            continue; /* Don't match self */
          }
          result->indices[0] = target_index;
          result->scores[0] = score;
          chemfp_heapq_siftup(k, result, 0, (chemfp_heapq_lt) double_score_lt,
                              (chemfp_heapq_swap) double_score_swap);
          query_threshold = result->scores[0];
          if (query_threshold >= best_possible_score) {
            /* we can't do any better in this section (or in later ones) */
            break;
          }
        } /* heapreplaced the old smallest item with the new item */
      } /* looped over fingerprints */
    } /* Went through all the popcount regions */

    /* We have scanned all the fingerprints. Is the heap full? */
    if (result->num_hits < k) {
      /* Not full, so need to heapify it. */
      chemfp_heapq_heapify(result->num_hits, result, (chemfp_heapq_lt) double_score_lt,
                           (chemfp_heapq_swap) double_score_swap);
    }
  } /* looped over all queries */
  return CHEMFP_OK;
}
chemfp-1.1p1/src/searches.c0000644000077000000240000003273012055226641016015 0ustar dalkestaff00000000000000#include #include #include #include #include #include
#include "heapq.h"
#include "chemfp.h"
#include "chemfp_internal.h"

#if defined(_OPENMP)
#include <omp.h>
#endif

enum scoring_directions {
  UP_OR_DOWN = 0,
  UP_ONLY,
  DOWN_ONLY,
  FINISHED
};

typedef struct {
  int direction;
  int query_popcount;
  int max_popcount;
  int popcount;
  int up_popcount;
  int down_popcount;
  double score;
} PopcountSearchOrder;

static void init_search_order(PopcountSearchOrder *popcount_order, int query_popcount,
                              int max_popcount) {
  popcount_order->query_popcount = query_popcount;
  popcount_order->popcount = query_popcount;
  popcount_order->max_popcount = max_popcount;
  if (query_popcount <= 1) {
    popcount_order->direction = UP_ONLY;
    popcount_order->down_popcount = 0;
  } else {
    popcount_order->direction = UP_OR_DOWN;
    popcount_order->down_popcount = query_popcount-1;
  }
  popcount_order->up_popcount = query_popcount;
}

static void ordering_no_higher(PopcountSearchOrder *popcount_order) {
  switch (popcount_order->direction) {
  case UP_OR_DOWN:
popcount_order->direction = DOWN_ONLY; break; case UP_ONLY: popcount_order->direction = FINISHED; break; default: break; } } static void ordering_no_lower(PopcountSearchOrder *popcount_order) { switch (popcount_order->direction) { case UP_OR_DOWN: popcount_order->direction = UP_ONLY; break; case DOWN_ONLY: popcount_order->direction = FINISHED; break; default: break; } } #define UP_SCORE(po) (((double)(po->query_popcount))/po->up_popcount) #define DOWN_SCORE(po) (((double)(po->down_popcount))/po->query_popcount) static int next_popcount(PopcountSearchOrder *popcount_order, double threshold) { double up_score, down_score; switch (popcount_order->direction) { case UP_OR_DOWN: up_score = UP_SCORE(popcount_order); down_score = DOWN_SCORE(popcount_order); if (up_score >= down_score) { popcount_order->popcount = (popcount_order->up_popcount)++; popcount_order->score = up_score; if (popcount_order->up_popcount > popcount_order->max_popcount) { popcount_order->direction = DOWN_ONLY; } } else { popcount_order->popcount = (popcount_order->down_popcount)--; popcount_order->score = down_score; if (popcount_order->down_popcount < 0) { popcount_order->direction = UP_ONLY; } } break; case UP_ONLY: popcount_order->score = UP_SCORE(popcount_order); popcount_order->popcount = (popcount_order->up_popcount)++; if (popcount_order->up_popcount > popcount_order->max_popcount) { popcount_order->direction = FINISHED; } break; case DOWN_ONLY: popcount_order->score = DOWN_SCORE(popcount_order); popcount_order->popcount = (popcount_order->down_popcount)--; if (popcount_order->down_popcount < 0) { popcount_order->direction = FINISHED; } break; default: return 0; } /* If the best possible score is under the threshold then we're done. 
*/ if (popcount_order->score < threshold) { popcount_order->direction = FINISHED; return 0; } return 1; } static int check_bounds(PopcountSearchOrder *popcount_order, int *start, int *end, int target_start, int target_end) { if (*start > target_end) { ordering_no_higher(popcount_order); return 0; } if (*end < target_start) { ordering_no_lower(popcount_order); return 0; } if (*start < target_start) { ordering_no_higher(popcount_order); *start = target_start; } if (*end > target_end) { ordering_no_lower(popcount_order); *end = target_end; } return 1; } /**** Support for the k-nearest code ****/ static int double_score_lt(chemfp_search_result *result, int i, int j) { if (result->scores[i] < result->scores[j]) return 1; if (result->scores[i] > result->scores[j]) return 0; /* Sort in descending order by index. (XXX important or overkill?) */ return (result->indices[i] >= result->indices[j]); } static void double_score_swap(chemfp_search_result *result, int i, int j) { int tmp_index = result->indices[i]; double tmp_score = result->scores[i]; result->indices[i] = result->indices[j]; result->scores[i] = result->scores[j]; result->indices[j] = tmp_index; result->scores[j] = tmp_score; } void chemfp_knearest_results_finalize(chemfp_search_result *results_start, chemfp_search_result *results_end) { chemfp_search_result *result; for (result = results_start; result < results_end; result++) { /* Sort the elements */ chemfp_heapq_heapsort(result->num_hits, result, (chemfp_heapq_lt) double_score_lt, (chemfp_heapq_swap) double_score_swap); } } #define MAX(x, y) ((x) > (y) ? 
(x) : (y)) /***** Define the main interface code ***/ #if defined(_OPENMP) #define RENAME(name) name ## _single #define USE_OPENMP 0 #include "search_core.c" #undef RENAME #undef USE_OPENMP #define RENAME(name) name ## _openmp #define USE_OPENMP 1 #include "search_core.c" #undef RENAME #undef USE_OPENMP /* Dispatch based on the number of threads in use */ int chemfp_count_tanimoto_arena( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ int *target_popcount_indices, /* Results go into these arrays */ int *result_counts ) { if (chemfp_get_num_threads() <= 1) { return chemfp_count_tanimoto_arena_single( threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, result_counts); } else { return chemfp_count_tanimoto_arena_openmp( threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, result_counts); } } int chemfp_threshold_tanimoto_arena( /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go here */ chemfp_search_result *results) { 
if (chemfp_get_num_threads() <= 1) { return chemfp_threshold_tanimoto_arena_single( threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results); } else { return chemfp_threshold_tanimoto_arena_openmp( threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results); } } int chemfp_knearest_tanimoto_arena( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Query arena, start and end indices */ int query_storage_size, const unsigned char *query_arena, int query_start, int query_end, /* Target arena, start and end indices */ int target_storage_size, const unsigned char *target_arena, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *target_popcount_indices, /* Results go here */ chemfp_search_result *results) { if (chemfp_get_num_threads() <= 1) { return chemfp_knearest_tanimoto_arena_single( k, threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results); } else { return chemfp_knearest_tanimoto_arena_openmp( k, threshold, num_bits, query_storage_size, query_arena, query_start, query_end, target_storage_size, target_arena, target_start, target_end, target_popcount_indices, results); } } int chemfp_count_tanimoto_hits_arena_symmetric( /* Count all matches within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Fingerprint arena */ int storage_size, const unsigned char *arena, /* Row start and end indices */ int query_start, int query_end, /* Column start and end indices */ int target_start, int target_end, /* Target 
popcount distribution information */ int *popcount_indices, /* Results _increment_ existing values in the array - remember to initialize! */ int *result_counts ) { if (chemfp_get_num_threads() <= 1) { return chemfp_count_tanimoto_hits_arena_symmetric_single( threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, result_counts); } else { return chemfp_count_tanimoto_hits_arena_symmetric_openmp( threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, result_counts); } } int chemfp_threshold_tanimoto_arena_symmetric( /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Arena */ int storage_size, const unsigned char *arena, /* start and end indices for the rows and columns */ int query_start, int query_end, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *popcount_indices, /* Results go here */ /* NOTE: This must have enough space for all of the fingerprints! 
*/ chemfp_search_result *results) { if (chemfp_get_num_threads() <= 1) { return chemfp_threshold_tanimoto_arena_symmetric_single( threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results); } else { return chemfp_threshold_tanimoto_arena_symmetric_openmp( threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results); } } int chemfp_knearest_tanimoto_arena_symmetric( /* Find the 'k' nearest items */ int k, /* Within the given threshold */ double threshold, /* Number of bits in the fingerprint */ int num_bits, /* Arena */ int storage_size, const unsigned char *arena, /* start and end indices for the rows and columns */ int query_start, int query_end, int target_start, int target_end, /* Target popcount distribution information */ /* (must have at least num_bits+1 elements) */ int *popcount_indices, /* Results go into these arrays */ chemfp_search_result *results) { if (chemfp_get_num_threads() <= 1) { return chemfp_knearest_tanimoto_arena_symmetric_single( k, threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results); } else { return chemfp_knearest_tanimoto_arena_symmetric_openmp( k, threshold, num_bits, storage_size, arena, query_start, query_end, target_start, target_end, popcount_indices, results); } } #else /* Not compiling for OpenMP; don't need the run-time switch */ /* Instead, just rename the function */ #define RENAME(name) name #define USE_OPENMP 0 #include "search_core.c" #undef USE_OPENMP #undef RENAME #endif chemfp-1.1p1/src/select_popcount.c0000644000077000000240000006417012071440745017432 0ustar dalkestaff00000000000000#include #include #include #include #include "chemfp.h" #include "chemfp_internal.h" #include "popcount.h" #include "cpuid.h" static unsigned long timeit(chemfp_popcount_f popcount, int size, int repeat); static void verify_methods(void); static int 
chemfp_report_select_popcount = 0;
static chemfp_method_type *chemfp_popcount_method_p = NULL;

static int chemfp_report_select_intersect_popcount = 0;
static chemfp_method_type *chemfp_intersect_popcount_method_p = NULL;

int
chemfp_get_option_report_popcount(void) {
  return chemfp_report_select_popcount;
}

int
chemfp_set_option_report_popcount(int value) {
  if (value == 0 || value == 1) {
    chemfp_report_select_popcount = value;
    chemfp_popcount_method_p = NULL;
    return CHEMFP_OK;
  }
  return CHEMFP_BAD_ARG;
}

int
chemfp_get_option_report_intersect_popcount(void) {
  return chemfp_report_select_intersect_popcount;
}

int
chemfp_set_option_report_intersect_popcount(int value) {
  if (value == 0 || value == 1) {
    chemfp_report_select_intersect_popcount = value;
    chemfp_intersect_popcount_method_p = NULL;
    return CHEMFP_OK;
  }
  return CHEMFP_BAD_ARG;
}

/* These are the alignment categories which I support */
chemfp_alignment_type chemfp_alignments[] = {
  {"align1", 1, 1, NULL},
  {"align4", 4, 4, NULL},
  {"align8-small", 8, 8, NULL},
  {"align8-large", 8, 96, NULL},
  /* This category is purely a hack. It's only used if its method is set to "ssse3" */
  {"align-ssse3", 64, 64, NULL},
};

static int
has_popcnt_instruction(void) {
  const uint64_t test_bytes = 255;
  int test_popcount;

  if (!(get_cpuid_flags() & bit_POPCNT)) {
    return 0;
  }
  /* We are on a machine which has a popcount instruction. Was the */
  /* underlying code compiled to be able to use that instruction? */
  /* If not, then the function will return 0 instead of the popcount. */
  test_popcount = chemfp_popcount_popcnt(8, &test_bytes);
  if (test_popcount == 8) {
    return 1;
  }
  if (test_popcount == 0) {
    return 0;
  }
  fprintf(stderr, "Popcount function chemfp_popcount_popcnt(255) returned %d; expected 0 or 8. 
This should not happen.\n", test_popcount); return 0; } /* These are in the same order as an enum in popcount.h */ static chemfp_method_type compile_time_methods[] = { {0, CHEMFP_LUT8_1, "LUT8-1", 1, 1, NULL, chemfp_popcount_lut8_1, chemfp_intersect_popcount_lut8_1}, {0, CHEMFP_LUT8_4, "LUT8-4", 4, 4, NULL, (chemfp_popcount_f) chemfp_popcount_lut8_4, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_lut8_4}, {0, CHEMFP_LUT16_4, "LUT16-4", 4, 4, NULL, (chemfp_popcount_f) chemfp_popcount_lut16_4, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_lut16_4}, {0, CHEMFP_LAURADOUX, "Lauradoux", 8, 96, NULL, (chemfp_popcount_f) chemfp_popcount_lauradoux, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_lauradoux}, {0, CHEMFP_POPCNT, "POPCNT", 8, 8, has_popcnt_instruction, (chemfp_popcount_f) chemfp_popcount_popcnt, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_popcnt}, {0, CHEMFP_GILLIES, "Gillies", 8, 8, NULL, (chemfp_popcount_f) chemfp_popcount_gillies, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_gillies}, {0, CHEMFP_SSSE3, "ssse3", 64, 64, chemfp_has_ssse3, (chemfp_popcount_f) chemfp_popcount_SSSE3, (chemfp_intersect_popcount_f) chemfp_intersect_popcount_SSSE3}, }; /* These are the methods which are actually available at run-time */ /* This list is used for the public API */ static chemfp_method_type * detected_methods[sizeof(compile_time_methods)/sizeof(chemfp_method_type)]; static int num_methods = 0; static void detect_methods(void) { int i, j=0; if (num_methods != 0) { return; } /* Go through all of the compile-time methods and see if it's available */ for (i=0; i<(int)(sizeof(compile_time_methods)/sizeof(chemfp_method_type)); i++) { if ((compile_time_methods[i].check == NULL) || (compile_time_methods[i].check())) { /* Add it to the list of detected methods, and tell it its index position */ compile_time_methods[i].detected_index = j; detected_methods[j++] = &compile_time_methods[i]; } } num_methods = j; /* Verify that they 
all give the same answers */ verify_methods(); } int chemfp_get_num_methods(void) { detect_methods(); return num_methods; } const char * chemfp_get_method_name(int method) { if (method < 0 || method >= chemfp_get_num_methods()) { return NULL; } return detected_methods[method]->name; } static void set_default_alignment_methods(void) { int lut_method, best64_method, large_method, ssse3_method; unsigned long first_time, lut8_time, lut16_time, lut_time; unsigned long gillies_time, best64_time, lauradoux_time; unsigned long ssse3_time; /* Make sure we haven't already initialized the alignments */ if (chemfp_alignments[0].method_p != NULL) { return; } /* Figure out which methods are available for this hardware */ detect_methods(); /* This is the only possibility for 1-byte aligned */ chemfp_alignments[CHEMFP_ALIGN1].method_p = &compile_time_methods[CHEMFP_LUT8_1]; /* Now do some timing measurements and figure out which method is likely the fastest for this hardware. It's a bit tricky; consider what happens if a timeslice boundary happens while doing a test. I mostly fix that by doing the timing twice and using the fastest time. I could require everyone call chemfp_select_fastest_method, but this should be good enough for almost everyone. */ /* For 4-byte aligned we use a LUT. */ /* TODO: implement a POPCNT instruction-based method for 4-byte aligned code */ /* (You really should use an 8-byte aligned arena in this case, so not a priority */ /* On older hardware the LUT16 can be slower than the LUT8 */ first_time = timeit(compile_time_methods[CHEMFP_LUT8_4].popcount, 128, 200); lut8_time = timeit(compile_time_methods[CHEMFP_LUT8_4].popcount, 128, 200); if (first_time < lut8_time) { lut8_time = first_time; } first_time = timeit(compile_time_methods[CHEMFP_LUT16_4].popcount, 128, 200); lut16_time = timeit(compile_time_methods[CHEMFP_LUT16_4].popcount, 128, 200); if (first_time < lut16_time) { lut16_time = first_time; } /* Which one is faster? 
*/ if (lut8_time < lut16_time) { lut_method = CHEMFP_LUT8_4; lut_time = lut8_time; } else { lut_method = CHEMFP_LUT16_4; lut_time = lut16_time; } chemfp_alignments[CHEMFP_ALIGN4].method_p = &compile_time_methods[lut_method]; /* Let's see if the Gillies method is faster */ first_time = timeit(compile_time_methods[CHEMFP_GILLIES].popcount, 128, 200); gillies_time = timeit(compile_time_methods[CHEMFP_GILLIES].popcount, 128, 200); if (first_time < gillies_time) { gillies_time = first_time; } /* For 8-byte aligned code we always want to use the POPCNT instruction if it exists */ if (has_popcnt_instruction()) { chemfp_alignments[CHEMFP_ALIGN8_SMALL].method_p = chemfp_alignments[CHEMFP_ALIGN8_LARGE].method_p = chemfp_alignments[CHEMFP_ALIGN_SSSE3].method_p = &compile_time_methods[CHEMFP_POPCNT]; } else { /* No POPCNT? Then either the LUT or Gillies for the small case, */ /* and perhaps Lauradoux for the large case */ if (lut_time < gillies_time) { best64_time = lut_time; best64_method = lut_method; } else { best64_time = gillies_time; best64_method = CHEMFP_GILLIES; } chemfp_alignments[CHEMFP_ALIGN8_SMALL].method_p = &compile_time_methods[best64_method]; first_time = timeit(compile_time_methods[CHEMFP_LAURADOUX].popcount, 128, 200); lauradoux_time = timeit(compile_time_methods[CHEMFP_LAURADOUX].popcount, 128, 200); if (first_time < lauradoux_time) { lauradoux_time = first_time; } if (lauradoux_time < best64_time) { large_method = CHEMFP_LAURADOUX; best64_time = lauradoux_time; } else { large_method = best64_method; } chemfp_alignments[CHEMFP_ALIGN8_LARGE].method_p = &compile_time_methods[large_method]; ssse3_method = CHEMFP_LUT16_4; if (chemfp_has_ssse3()) { first_time = timeit(compile_time_methods[CHEMFP_SSSE3].popcount, 128, 200); ssse3_time = timeit(compile_time_methods[CHEMFP_SSSE3].popcount, 128, 200); if (first_time < ssse3_time) { ssse3_time = first_time; } if (ssse3_time < best64_time) { ssse3_method = CHEMFP_SSSE3; } } 
chemfp_alignments[CHEMFP_ALIGN_SSSE3].method_p = &compile_time_methods[ssse3_method]; } } int chemfp_get_num_alignments(void) { set_default_alignment_methods(); return sizeof(chemfp_alignments) / sizeof(chemfp_alignment_type); } const char * chemfp_get_alignment_name(int alignment) { if (alignment < 0 || alignment >= chemfp_get_num_alignments()) { return NULL; } return chemfp_alignments[alignment].name; } int chemfp_get_alignment_method(int alignment) { if (alignment < 0 || alignment >= chemfp_get_num_alignments()) { return CHEMFP_BAD_ARG; } return chemfp_alignments[alignment].method_p->detected_index; } int chemfp_set_alignment_method(int alignment, int method) { /* Make sure it's an available alignment and method */ if (alignment < 0 || alignment >= chemfp_get_num_alignments()) { return CHEMFP_BAD_ARG; } if (method < 0 || method >= chemfp_get_num_methods()) { return CHEMFP_BAD_ARG; } /* Make sure the alignment and sizes are good enough */ if (detected_methods[method]->alignment > chemfp_alignments[alignment].alignment) { return CHEMFP_METHOD_MISMATCH; } if (detected_methods[method]->min_size > chemfp_alignments[alignment].min_size) { return CHEMFP_METHOD_MISMATCH; } chemfp_alignments[alignment].method_p = detected_methods[method]; return CHEMFP_OK; } /**************************************/ /* chemfp stores fingerprints as Python strings */ /* (This may change in the future; memmap, perhaps?) */ /* The Python payload is 4 byte aligned but not 8 byte aligned. */ static int chemfp_determine_alignment(int num_bits, int storage_len, const unsigned char *arena) { int num_bytes = (num_bits+7)/8; if (num_bytes > storage_len) { /* That's just a bad idea */ return CHEMFP_ALIGN1; } set_default_alignment_methods(); if (num_bytes <= 1) { /* Really? 
       */
    return CHEMFP_ALIGN1;
  }
  if (ALIGNMENT(arena, 8) == 0 && storage_len % 8 == 0) {
    if (num_bytes >= 96) {
      return CHEMFP_ALIGN8_LARGE;
    } else {
      return CHEMFP_ALIGN8_SMALL;
    }
  }
  /* ALIGNMENT() returns 0 when the pointer is aligned, so compare against 0 */
  if (ALIGNMENT(arena, 4) == 0 && storage_len % 4 == 0) {
    return CHEMFP_ALIGN4;
  }
  return CHEMFP_ALIGN1;
}

const char *
_alignment_description(const unsigned char *arena) {
  if (ALIGNMENT(arena, 64) == 0) {
    return "64";
  }
  if (ALIGNMENT(arena, 32) == 0) {
    return "32";
  }
  if (ALIGNMENT(arena, 16) == 0) {
    return "16";
  }
  if (ALIGNMENT(arena, 8) == 0) {
    return "8";
  }
  if (ALIGNMENT(arena, 4) == 0) {
    return "4";
  }
  return "1";
}

/* Wrapper function which can report the selected popcount method */
chemfp_popcount_f
chemfp_select_popcount(int num_bits, int storage_len, const unsigned char *arena) {
  int alignment = chemfp_determine_alignment(num_bits, storage_len, arena);
  chemfp_method_type *method_p = chemfp_alignments[alignment].method_p;

  if (chemfp_report_select_popcount && chemfp_popcount_method_p != method_p) {
    chemfp_popcount_method_p = method_p;
    fprintf(stderr, "Popcount method: %s (%s) num_bits: %d "
            "arena: %p (%s byte aligned) storage_len: %d\n",
            method_p->name, chemfp_alignments[alignment].name, num_bits,
            arena, _alignment_description(arena), storage_len);
  }
  return method_p->popcount;
}

/**** Find the best intersection popcount function *****/
static int
chemfp_select_intersect_alignment(int num_bits,
                                  int storage_len1, const unsigned char *arena1,
                                  int storage_len2, const unsigned char *arena2) {
  int storage_len = (storage_len1 < storage_len2) ?
storage_len1 : storage_len2; int num_bytes = (num_bits+7)/8; if (num_bytes > storage_len) { /* That's just a bad idea */ return CHEMFP_ALIGN1; } set_default_alignment_methods(); if (num_bytes <= 1) { return CHEMFP_ALIGN1; } /* Check for 8 byte alignment */ if (ALIGNMENT(arena1, 8) == 0 && ALIGNMENT(arena2, 8) == 0 && storage_len1 % 8 == 0 && storage_len2 % 8 == 0) { /* We only use SSSE3 if this alignment is identical to "CHEMFP_SSSE3" */ if (chemfp_alignments[CHEMFP_ALIGN_SSSE3].method_p->id == CHEMFP_SSSE3) { /* I'll try, but only if I have 64 byte alignment */ if (ALIGNMENT(arena1, 64) == 0 && ALIGNMENT(arena2, 64) == 0 && storage_len1 % 64 == 0 && storage_len2 % 64 == 0) { return CHEMFP_ALIGN_SSSE3; } } if (num_bytes >= 96) { return CHEMFP_ALIGN8_LARGE; } else { return CHEMFP_ALIGN8_SMALL; } } /* Check for 4 byte alignment */ if (ALIGNMENT(arena1, 4) == 0 && ALIGNMENT(arena2, 4) == 0 && storage_len1 % 4 == 0 && storage_len2 % 4 == 0) { return CHEMFP_ALIGN4; } /* At least we're one byte aligned */ return CHEMFP_ALIGN1; } /* Wrapper function which can report the selected intersect popcount method */ chemfp_intersect_popcount_f chemfp_select_intersect_popcount(int num_bits, int storage_len1, const unsigned char *arena1, int storage_len2, const unsigned char *arena2) { int alignment = chemfp_select_intersect_alignment(num_bits, storage_len1, arena1, storage_len2, arena2); chemfp_method_type *method_p = chemfp_alignments[alignment].method_p; if (chemfp_report_select_intersect_popcount && chemfp_intersect_popcount_method_p != method_p) { chemfp_intersect_popcount_method_p = method_p; fprintf(stderr, "Intersect popcount method: %s (%s) num_bits: %d " "arena1: %p (%s byte aligned) storage_len1: %d " "arena2: %p (%s byte aligned) storage_len2: %d\n", method_p->name, chemfp_alignments[alignment].name, num_bits, arena1, _alignment_description(arena1), storage_len1, arena2, _alignment_description(arena2), storage_len2); } return method_p->intersect_popcount; } /*********** 
Automatically select the fastest method ***********/

#if defined(_MSC_VER)
#include <windows.h> /* QueryPerformanceCounter(LARGE_INTEGER*) */
#else
#include <sys/time.h>
#endif

static long long
high_resolution_timer(void) {
#if defined(_MSC_VER)
  LARGE_INTEGER counter;
  if (!QueryPerformanceCounter(&counter) || counter.QuadPart == 0) {
    fprintf(stderr, "Error: high resolution timer not available!\n");
    return 0;
  }
  return counter.QuadPart;
#else
  struct timeval tv;
  gettimeofday(&tv, NULL);
  /* return usecs; widen first so the multiply cannot overflow a 32-bit tv_sec */
  return ((long long) tv.tv_sec)*1000000 + tv.tv_usec;
#endif
}

/* Use uint64_t so it's 64-bit/8 byte aligned */
/* The contents are randomly generated. */
/* My first version was too small, and caused the LUT to appear faster
   even when the Gillies was better on real data. */
/* Overallocate by one to ensure that I can get a 16 byte aligned field */
static uint64_t popcount_buffer[257] = {
0x9b649615d1a50133ull, 0xf3b8dada0e8b43deull, 0x0197e207e4b9af2bull, 0x68a2ecc4053b1305ull, 0x93d933ac2f41e28full, 0xb460859e01b6f925ull, 0xc2c1a9eacc9e4999ull, 0xdc5237f8200aec07ull, 0x9e3bbe45d6e67641ull, 0xa49bed7d060407d4ull, 0xcca5f2913af53c5bull, 0xfdd53575aab7c21aull, 0x76b82d57bfa5c9ddull, 0x0d2a87ba7f2439edull, 0x9ec6a4ee2a6999d4ull, 0xb9ae55f1f402ac97ull, 0x08bbc6d1719a56bdull, 0x969e5ef023c9ed23ull, 0x6b7f08af661a9db6ull, 0xad394da52bbbe18dull, 0xdf9c3e28aae1c460ull, 0xcf82e77d4f02f1efull, 0x1fb88cdb648008ecull, 0xc7a2ab7ecb8f84f5ull, 0xbf8ef6833f18d407ull, 0xb9c7eafdb4653fa2ull, 0x90114b93b87a8a1dull, 0x6e572c9e42e5061cull, 0xb694ec549eeabc20ull, 0xb362909621b9a2c8ull, 0xcadab7b921d3cd0aull, 0xd27f7aef7e2a0c6full, 0xaf5d649ca1d2eefdull, 0x6fc389a822e5769cull, 0xdc849b5da5c5a101ull, 0x3011e28954c71b98ull, 0xecc6f2bb9b24b9d3ull, 0x13d0974bbdbe16b5ull, 0xb50625ca9f3348eeull, 0x91a7462492f11cbbull, 0x5fe0ca6928b55722ull, 0xa5d89c3149133253ull, 0x84645ec3c2cf4be6ull, 0x22fd27c4b7981d9aull, 0x3f9869fee13b43d9ull, 0x0683208def61ce16ull, 0x26f9fd185d31a581ull, 0x837b1ded3af58f74ull, 0x52e0246315b38ad7ull,
0xbde27bb52d771b42ull, 0x7fc2cb4428e33ee2ull, 0xe3511d67a78fb94eull, 0xeac2042d93f9d5f2ull, 0xf987675f01562dd5ull, 0x49f0250c27805c24ull, 0xc331de3409aa714cull, 0x9f3774691ac74fafull, 0x167a091ad590c514ull, 0xe4fbcf7d8f0f2008ull, 0xfbc4b0cb233b04f6ull, 0x960590126cce716aull, 0x1dc1c707f6cc348dull, 0x274b57e30bd6d6d3ull, 0x67525306591d1746ull, 0xf99163b382488844ull, 0xe94f9bf47dfb0b16ull, 0xcbb738584662cebbull, 0x56ee87587103f7e5ull, 0xcd8ff0352714830dull, 0x624dd08f67e90c4bull, 0xfff1f1b5b1f92417ull, 0xcd9d4fb51b05e32bull, 0x43c85c5a7a69cdc4ull, 0xa27e72305a33c247ull, 0xc40882a6813e08f1ull, 0xad2b48e065ca1768ull, 0x1ffa6c9616288e30ull, 0xeb83e3323610ff2bull, 0xb520d27b4f3a3273ull, 0x15470f6c7346b910ull, 0x3397c4c5b5e9bdc6ull, 0x85f3179422591e54ull, 0x86db696004af1781ull, 0x22a9e51e871984beull, 0x2de8e4cdd4652a1cull, 0xe70ef696e037662aull, 0xfc67e1f7083e10f0ull, 0x945105f1c12fc00dull, 0x4d169c35fc28ddebull, 0x5522d55800e2b719ull, 0x618040f560444bedull, 0xff91b03867854f0bull, 0x5ce1bfaf57be27d0ull, 0x81752ce65cf5ba9eull, 0x98e499fe7f0f365eull, 0x5aa2bc888ad924bdull, 0xae2de7838420c59bull, 0x42cda0012ae00ff1ull, 0x7620f99214e30e2full, 0xa0be3f23a80f82ceull, 0x420edefc42cedb09ull, 0x80fe957c6a2817ffull, 0x355174b6692ff140ull, 0x47653e206352c78aull, 0x808f7214b82d7c59ull, 0x5dfcfe4144c253d4ull, 0x4b918724a9084523ull, 0x3e0608080fc35d1bull, 0xf23cfdfd8c0b219eull, 0x55bfd8597cdba8f5ull, 0x269c25c3799d723cull, 0x91e53b39bfdca5deull, 0x02b04e9b8e52e823ull, 0xc53fe276534e5317ull, 0x18bd1dc656174acaull, 0x0e5b4b3a13772eebull, 0xa1943806fca56da6ull, 0x04a5016c4c0be049ull, 0x977ba238079e1e0cull, 0x2df9dbcc4e036035ull, 0x86adc435f1414d29ull, 0x4402f529defe1868ull, 0x03dbf44c63afc870ull, 0xfbfe185f7297e08aull, 0xe717fd0019ef65edull, 0x7918c2b6e9275ba4ull, 0x24f5ee4355f022b3ull, 0xc0ba7a6be52fe0a4ull, 0x685aabb6a61f00d8ull, 0x3fa62a93e20e9372ull, 0xc201d0ade1f15de7ull, 0x28cb5915df8a4912ull, 0x517843f1c3f9928full, 0x4632606437902d9aull, 0x82f853fb34d514b7ull, 
0x00464a29dcb32cbcull, 0x84e1c0073eee811full, 0x6eb2e2781ce72271ull, 0xe3f40911bc8845e9ull, 0xe6f2aacb1dd4d080ull, 0xa87b1b15af61762full, 0x810e66188c97dbeaull, 0xdb919c39003db0d6ull, 0x18452ccd19197178ull, 0x5fe005b938986834ull, 0xb179f1f3b113509full, 0xea27088977c864c2ull, 0x4e524739e812d35eull, 0xf76f7a7d15cc08dbull, 0xc0b9a7c0251f7f58ull, 0x319d8eb2f9334c6dull, 0x65db68328c2d2d4dull, 0xc260bbf348039ee2ull, 0xc692e00595613bffull, 0x90fec8d4b374484dull, 0x8ebd5b2ff1de52dfull, 0xd3781952d5254631ull, 0x84196d92f8852097ull, 0xdc621b34a1763da6ull, 0x0799e73b826efc26ull, 0x098532b1f427cd10ull, 0xfb2b0735121a374eull, 0x9f8d3d10f5108176ull, 0x57ee9d46db4529aaull, 0x7c8db1c2e675c649ull, 0x9d8e3388f3ef4382ull, 0x639b5c10b29fc572ull, 0x011f05e93ec9c4aeull, 0xec28a9716fd3f5a1ull, 0x837c0d205aefb577ull, 0x0099fd93cadcb971ull, 0xf29e78eae535df65ull, 0x3c1ca48f330a6d1dull, 0xb734f3c83f57de82ull, 0x42f85b65c22dc638ull, 0x0c50c85af7d3a601ull, 0xea8ced5869fbe2fdull, 0xb0cc396bfd86be6dull, 0xb3ea7c3295866ef9ull, 0x36cf28b306426badull, 0x590de78ae5300681ull, 0x41f4e16df296c0bcull, 0xaad908beff6a93a9ull, 0x909d243860e863d0ull, 0x1d574b777f6e2725ull, 0xacb7e3a9b94bb2b2ull, 0x3b4d173db0b61bf6ull, 0x4ccc5649c6c02c51ull, 0x8d851d80b1a90638ull, 0x6ca86fac5976ba0aull, 0x09b49bdb4a58e177ull, 0x7da8938aa92fe6b7ull, 0x0f10d2d164ab5260ull, 0x410822b41fff8a8eull, 0x13d8dd389fe19217ull, 0x0d6fcf685fdca839ull, 0xae9965f4e51c9094ull, 0x3cc74eabd4b3574aull, 0x616a5f30b4a1e0a2ull, 0x01c995c3cf9cde82ull, 0x083e3df79ed6d08dull, 0x50ca7def49e9be55ull, 0x6827bee9c7b104adull, 0xb09c88041e5a1480ull, 0xd7d6b3f8a5fd79d2ull, 0xe9a2a7562deb9cbbull, 0xc6df657d5d037eaaull, 0xa0513198d897cf1bull, 0x941721727391ffbbull, 0xdd65e39bef1199cbull, 0x4e1129988fcc1a78ull, 0x57d5274d4189e641ull, 0xcd78a6383892a6c2ull, 0x5380e97a1e588b36ull, 0x4b153a04ed4f2d4cull, 0x78c74fdda5d88d5full, 0xa838c19ff3a05996ull, 0x64a935bf0b55a732ull, 0xa5727c5fee927c99ull, 0x584c550d5f7af1d7ull, 0x7b15564ed80dd58bull, 
0x42db540eda52029cull, 0x78f64d45305d7f6full, 0x8b549a03a9806568ull, 0x6fa3c48b2b01ba66ull, 0xc56ccbe0f05d1511ull, 0x8adcd70ff4730081ull, 0xf3f19cc845fd5b7aull, 0x0936f92d55e55133ull, 0xfda06bcd399ae365ull, 0xde0c5052f3e158a4ull, 0x58584d0c5e3b7dddull, 0x3c3eb71846edfeb7ull, 0xc1080e17c84266ffull, 0xb25fd442e286d778ull, 0x568605346b044740ull, 0x54ffc2f936a972a2ull, 0x366b795d073f062bull, 0x206dadf277bbf8b4ull, 0x916749a7cdf5e525ull, 0x0afce12439536907ull, 0x9fce50346e346701ull, 0x562fe8ffc572a020ull, 0xbac08aa15dc2f3f6ull, 0x992aea3d03fb66a9ull, 0x9e6a37740d285aafull, 0x11dfb9a7b6b4424aull, 0xe220772a626e2f9dull, 0xae5c0a22b8ab8f2dull, 0x11496ae8d4258860ull, 0x6f3e74167f908fe6ull, 0x622f3431103aef5dull, 0x608584c6e190403dull, 0xc8f7ec331fa3110cull, 0x5ef7066f95c03fa1ull, 0x48924db0f5d40254ull, 0xc0d546123dcd5ff2ull }; /* Remember, keeping an extra value to ensure 16 byte alignment */ static const int popcount_buffer_size = sizeof(popcount_buffer) - 1; static unsigned long timeit(chemfp_popcount_f popcount, int size, int repeat) { long long t1, t2; unsigned char *start_buffer, *end_buffer, *fp; int i; if (size > popcount_buffer_size) { size = popcount_buffer_size; } t1 = high_resolution_timer(); if (ALIGNMENT(popcount_buffer, 16) == 8) { start_buffer = (unsigned char *) (popcount_buffer+1); } else { start_buffer = (unsigned char *) popcount_buffer; } end_buffer = start_buffer + popcount_buffer_size; for (i=0; ipopcount, probe_size, repeat); dt = timeit(method_p->popcount, probe_size, repeat); if (first_time < dt) { dt = first_time; } if (best_method == -1 || dt < best_time) { best_method = method; best_time = dt; } } if (best_method == -1) { /* Shouldn't happen, but I want to be on the safe side. 
*/ best_method = old_method; } chemfp_set_alignment_method(alignment, best_method); return best_method; } static void verify_methods(void) { int i; int expected, got; const unsigned char *start_buffer; /* Test that I can check byte 96 out of 128, with zero padding */ unsigned char *single_bit_buffer, *single_bit_buffer_start; single_bit_buffer_start = single_bit_buffer = (unsigned char *) malloc(150); if (!single_bit_buffer) { fprintf(stderr, "chemfp: unable to malloc popcount verification scratch space\n"); return; } while (ALIGNMENT(single_bit_buffer, 16) != 0) { single_bit_buffer++; } memset(single_bit_buffer, 0, 128); single_bit_buffer[96] = 1; if (ALIGNMENT(popcount_buffer, 16) == 8) { start_buffer = (unsigned char *) (popcount_buffer+1); } else { start_buffer = (unsigned char *) popcount_buffer; } if (ALIGNMENT(start_buffer, 16) != 0) { fprintf(stderr, "chemfp: Misaligned data!\n"); } /* 64 byte aligned */ expected = detected_methods[0]->popcount(256, start_buffer); for (i=1; ipopcount(256, start_buffer); if (got != expected) { fprintf(stderr, "chemfp: popcount validation error(1): method %s returned %d instead of %d\n", detected_methods[i]->name, got, expected); } } /* check the bit in byte 97 */ expected = detected_methods[0]->popcount(97, single_bit_buffer); for (i=1; ipopcount(97, single_bit_buffer); if (got != expected) { fprintf(stderr, "chemfp: popcount validation error(2): method %s returned %d instead of %d\n", detected_methods[i]->name, got, expected); } } /* 64 byte aligned */ expected = detected_methods[0]->intersect_popcount(256, start_buffer, start_buffer+128); for (i=1; iintersect_popcount(256, start_buffer, start_buffer+128); if (got != expected) { fprintf(stderr, "chemfp: intersection popcount error(1): method %s returned %d instead of %d\n", detected_methods[i]->name, got, expected); } } /* check the bit in byte 97 */ expected = detected_methods[0]->intersect_popcount(97, single_bit_buffer, single_bit_buffer); for (i=1; 
iintersect_popcount(97, single_bit_buffer, single_bit_buffer); if (got != expected) { fprintf(stderr, "chemfp: intersection popcount error(2): method %s returned %d instead of %d\n", detected_methods[i]->name, got, expected); } } free(single_bit_buffer_start); } chemfp-1.1p1/src/test_libchemfp.c0000644000077000000240000001153011666736033017213 0ustar dalkestaff00000000000000#include <stdio.h> #include <string.h> #include "chemfp.h" /* This is not a comprehensive test suite. That's done in Python code. */ /* This tests that the libchemfp public API is usable from C code */ #define CHECK(msg, expr, result) \ if ( (expr) != (result) ) {puts("FAIL: " msg); failed++;} \ else {puts("PASS: " msg); passed++;} int failed = 0; int passed = 0; int main() { char version_prefix[100]; sprintf(version_prefix, "%d.%d", CHEMFP_MAJOR_VERSION, CHEMFP_MINOR_VERSION); puts("== info functions =="); CHECK("version", strncmp(chemfp_version(), version_prefix, strlen(version_prefix)), 0); CHECK("error code", strcmp(chemfp_strerror(CHEMFP_OK), "Ok"), 0); puts("== Hex checks =="); CHECK("empty string", chemfp_hex_isvalid(0, ""), 1); CHECK("2-bytes valid", chemfp_hex_isvalid(2, "abq"), 1); CHECK("all hex chars", chemfp_hex_isvalid(16, "0123456789abcdef"), 1); CHECK("invalid hex char", chemfp_hex_isvalid(16, "0123456789abcdeg"), 0); CHECK("popcount 0", chemfp_hex_popcount(4, "0000"), 0); CHECK("popcount 1", chemfp_hex_popcount(2, "01ff"), 1); CHECK("popcount bad", chemfp_hex_popcount(4, "01fg"), -1); CHECK("intersect popcount", chemfp_hex_intersect_popcount(6, "0F0123", "010b42"), 3); CHECK("Tanimoto empty", chemfp_hex_tanimoto(2, "00", "00"), 0.0); CHECK("Tanimoto", chemfp_hex_tanimoto(6, "123456", "012345"), (0.0+0+1+0+1+1)/(1+2+2+3+2+3)); CHECK("Tanimoto fail", chemfp_hex_tanimoto(6, "12345 ", "012345"), -1.0); CHECK("contains", chemfp_hex_contains(2, "12", "3a"), 1); CHECK("does not contain", chemfp_hex_contains(2, "3a", "12"), 0); puts("== Binary checks =="); CHECK("empty popcount", chemfp_byte_popcount(0, 
"blah"), 0); CHECK("single byte popcount", chemfp_byte_popcount(1, "A"), 2); CHECK("multiple byte popcount", chemfp_byte_popcount(4, "ABCD"), 2+2+3+2); CHECK("intersect popcount", chemfp_byte_intersect_popcount(4, "ABCD", "BCDE"), 1+2+1+2); CHECK("Tanimoto null", chemfp_byte_tanimoto(0, "", ""), 0.0); CHECK("Tanimoto empty", chemfp_byte_tanimoto(1, "\0", "\0"), 0.0); CHECK("Tanimoto", chemfp_byte_tanimoto(2, "AB", "BC"), (1.0+2) / (3+3)); CHECK("contains", chemfp_byte_contains(2, " *", "**"), 1); CHECK("does not contain", chemfp_byte_contains(2, "**", " *"), 0); puts("== FPS line =="); CHECK("valid with unknown hex len", chemfp_fps_line_validate(-1, 12, "abcdef\tspam\n"), 0); CHECK("valid with known hex len", chemfp_fps_line_validate(6, 12, "abcdef\tspam\n"), 0); CHECK("valid with bad hex len", chemfp_fps_line_validate(4, 12, "abcdef spam\n"), CHEMFP_UNEXPECTED_FINGERPRINT_LENGTH); CHECK("valid with bad hex", chemfp_fps_line_validate(6, 12, "abcdeg spam\n"), CHEMFP_BAD_FINGERPRINT); #if 0 puts("== N-largest =="); int indices[2] = {0,0}; double scores[2] = {0.0, 0.0}; if (chemfp_nlargest_tanimoto_block(2, 2, "A1", 15, "her/hlvhSV#$(ZXVLzf*)4lkdf[]#@", 0, -1, 0.0, indices, scores) < 0) { puts("FAIL: chemfp_nlargest_tanimoto_block"); } else { puts("PASS: chemfp_nlargest_tanimoto_block"); //printf("[%d]=%f [%d]=%f\n", indices[0], scores[0], indices[1], scores[1]); CHECK(" indices[0]", indices[0], 10); CHECK(" indices[1]", indices[1], 13); CHECK(" scores[0]", scores[0], 0.375); CHECK(" scores[1]", scores[1], 0.363636363636363636); } scores[0] = scores[1] = -1; unsigned char *start_ids[2]; int id_lens[2]; int lineno; if (chemfp_hex_tanimoto_block(2, 4, "4131", 157, "6865 ID0\n" "722f ID1\n" "686c ID2\n" "7668 ID3\n" "5356 ID4\n" "2324 ID5\n" "285a ID6\n" "5856 ID7\n" "4c7a ID8 blah\n" "662a ID9\n" "2934 ID10\n" "6c6b ID11\n" "6466 ID12\n" "5b5d ID13 extra items\n" "2340 ID14\n", 0.0, scores, start_ids, id_lens, &lineno) < 0) { puts("FAIL: chemfp_hex_tanimoto_block"); } 
else { puts("PASS: chemfp_hex_tanimoto_block"); CHECK(" scores[0]", scores[0], 0.375); CHECK(" scores[1]", scores[1], 0.363636363636363636); CHECK(" id[0]", strncmp(start_ids[0], "ID10\n", id_lens[0]+1), 0); CHECK(" id[1]", strncmp(start_ids[1], "ID13 ", id_lens[1]+1), 0); } CHECK("chemfp_byte_intersect_popcount_count", // We know the BCDE returns an overlap of 6 bits // "A " returns an overlap of 2 chemfp_byte_intersect_popcount_count(4, "ABCD", 2, "XBCDEXA X", 1, 5, 2), 2); CHECK("chemfp_byte_intersect_popcount_count", chemfp_byte_intersect_popcount_count(4, "ABCD", 2, "XBCDEXA X", 1, 5, 3), 1); #endif printf("Pass: %d Fail: %d\n", passed, failed); } chemfp-1.1p1/tests/0000755000077000000240000000000012106315372014417 5ustar dalkestaff00000000000000chemfp-1.1p1/tests/__init__.py0000644000077000000240000000000212055233451016520 0ustar dalkestaff00000000000000# chemfp-1.1p1/tests/blah.py0000644000077000000240000000213512071447251015703 0ustar dalkestaff00000000000000import random import _chemfp ids = [1,2,3,2,1,2,3,2,1] #ids = range(10) scores = [0.6, 0.4, 0.5, 0.2, 0.8, 0.7, 0.7, 0.0, 1.0] #scores = [random.random() for id in ids] assert len(scores) == len(ids) I = range(len(ids)) values = zip(ids, scores) def load(): x = _chemfp.SearchResults(1) for value in values: x._add_hit(0, *value) return x x = load() print dir(x) assert list(x._get_scores(0)) == scores x.reorder_all("increasing-score") assert list(x._get_scores(0)) == sorted(scores) expect = [ids[i] for i in sorted(I, key=lambda i: (scores[i], ids[i]))] assert list(x._get_indices(0)) == expect, ( list(x._get_indices(0)), expect) ''' x = load() x.reorder_all("decreasing-score") assert list(x._get_scores(0)) == sorted(scores)[::-1] expect = [ids[i] for i in sorted(I, key=lambda i: (-scores[i], ids[i]))] assert list(x._get_indices(0)) == expect x = load() x.reorder_all("increasing-index") assert list(x._get_indices(0)) == sorted(ids), (list(x._get_indices(0)), sorted(ids)) x = load() 
x.reorder_all("decreasing-index") assert list(x._get_indices(0)) == sorted(ids)[::-1] ''' chemfp-1.1p1/tests/chebi_queries.fps.gz0000644000077000000240000000514511660452124020365 0ustar dalkestaff00000000000000chemfp-1.1p1/tests/chebi_rdmaccs.fps0000644000077000000240000032670611660452124017716 0ustar dalkestaff00000000000000#FPS1 #num_bits=166 #type=RDMACCS-OpenEye/1 #software=OEChem/1.7.4 (20100809) #aromaticity=mmff #source=/Users/dalke/databases/ChEBI_lite.sdf.gz #date=2011-09-16T13:49:04 00000000000000008200008490892dc00dc4a7d21e CHEBI:776 000000000000200080000002800002040c0482d608 CHEBI:1148 0000000000000221000800111601017000c1a3d21e CHEBI:1734 00000000000000000000020000100000000400951e CHEBI:1895 0000000002001021820a00011681015004cdb3d21e CHEBI:2303 0000000000000200020a0002048804448c5cb2d21c CHEBI:2365 800404000840200002429dc801a19147946dfafa1f CHEBI:2676 
0000000000003001800008478789a0e49d5df7fa1d CHEBI:2682 000000000000300180800a57935180641c6de3fd1d CHEBI:2790 0000002000000100800430101bd1adf789fbbfff1f CHEBI:2914 000000802906549e14918920b3c79059e6e96b6c1f CHEBI:3013 000004000000000000041490034928f309418db81f CHEBI:3048 000020000000200080022242891029e00d46d6fd1f CHEBI:3082 00004020000001000002385408a214260830d8ff09 CHEBI:3084 0000000000000000020430868fc1208309078d351f CHEBI:3112 000040002840313181d2afdf8b7abc75ddffffff1f CHEBI:3139 000000800900449e1445afb4bbdbbeeb9feb7efd1f CHEBI:3176 0000000000200000010270100a442c1320211dad17 CHEBI:3183 000000800100449e14018b63a9c23f7cd6ecefff1f CHEBI:3213 00000000000000000044304e0a603e8351b37daf1f CHEBI:3215 0000200000000001820634971ac9a9e30ddfbffd1f CHEBI:3216 00000000000000000202301018810412001219a517 CHEBI:3238 00000000000000840154d5804804a01321a12d2517 CHEBI:3240 0004000000000000020430469ac121c30dc78d7d1f CHEBI:3242 0004000000003000820430479ac121c78dcf8f7f3f CHEBI:3243 00000000000000008200209f90f13de49dcec7ff1f CHEBI:3353 00000000000000008200209f80f03ca49d0e46ff1d CHEBI:3354 00004020000001000002385648f23c2628325cff29 CHEBI:3385 0000040010000000001418001423101340a1492a1f CHEBI:3387 00000000000038009a00086681c0054c04ece2fe3f CHEBI:3395 0000000000000000010270504a542c3328239dbd1f CHEBI:3398 00004000000000000202080300ba3e24853a56ff09 CHEBI:3419 80040000085020000048d94c57259047b46d6b7b1f CHEBI:3478 800480000a50120051f89ded3723d04fd46d6a7b1f CHEBI:3485 8004000018502a84a1dcb5ff1f6bbb5f97efffff1f CHEBI:3493 8004100008501a84a1dc9dfd17ab906f96effaff1f CHEBI:3499 800400000850020001d89dcc97239047dc6d6a7f1f CHEBI:3510 800400000850030001cc95cc9f619057986b6b7f1f CHEBI:3537 000000000002102001026015871c04352c9db3df1f CHEBI:3558 00000020002001000002301008c82e8221101ca521 CHEBI:3567 000000000000000000007840485c2a267da27cff3f CHEBI:3637 000000800900449e14b1c9207387d049e6e96b681f CHEBI:3640 000000000000200080004203c01401602c4682551e CHEBI:3642 000000801104449e1411c327e9e63608a6204eef1f CHEBI:3650 
0000400000000001010044c050279155204580581f CHEBI:3655 000000183094000090b41a46afb2d06b9f6fea7f1d CHEBI:3696 0000200000102000020aa84e84e03c84852e76ff19 CHEBI:3697 0000000800000801802404e11b51c34fc0efabff1f CHEBI:3716 00000000000200000084785bcb7c39f37debffff1f CHEBI:3720 00000000000020018202305713c88e669c5dbbfe1d CHEBI:3732 00000000000000000204749049dc2cb32903bdbd1f CHEBI:3738 0000002000000100820430500ac12cf709478ff31f CHEBI:3743 000000020010002000ee50005605c01370a179a817 CHEBI:3749 000084000000816060e04068f865d05bfae16b6b1f CHEBI:3756 00000000020002010008048014092cc40141b4d218 CHEBI:3900 0000000000000000004430100a40ac9301230da917 CHEBI:3994 0000040000000200000a30101c412c1200801da517 CHEBI:3996 0000040000000220000c30101e412c9301830da117 CHEBI:4046 00000010008430009282824fa9f234a88f2edeff09 CHEBI:4222 0000010200000201430804a6b589acc909c5b5d01f CHEBI:4315 000000000000449e0401890020800018d62049681f CHEBI:4325 00000080e180443a1557b4fc3ae3b09bd7bf7feb3f CHEBI:4379 0000000200003200820a785f55ad1142b4ecfbfa1f CHEBI:4392 00000000000000000200080610892dc015c4e5b81f CHEBI:4474 000000802900441e14b1c1207387d049e2e149e81f CHEBI:4495 0000000000000201000800031601815484cda3561e CHEBI:4513 00000000000000200200304008d82aa609021df71d CHEBI:4514 0000000000003221820804879699ace40d5db7d31c CHEBI:4551 000040000000100181c434df1bebbcd7956fffff1f CHEBI:4562 00000000000020008000020688482dd005c4e7fd1f CHEBI:4670 000000800900449e14018fa0bacb964897e97aed1f CHEBI:4702 000000001012002000dc74841e67b0d3f9e3ed6f1f CHEBI:4717 000000003000200081d416db9b339043ccefbaff1f CHEBI:4728 00000020000001000002301049800742705498bd3f CHEBI:4759 000004010000002001d434d0185baeb359a33cbd1f CHEBI:4779 0000000000000000004414ce887832a7852f7eff1f CHEBI:4784 0000000000002000004414ce88683007852d6eff1f CHEBI:4786 00000000000000000004301e0a683df351e1fdff1f CHEBI:4788 000040020000120181ce34df1fe3bcd7957fffff1f CHEBI:4821 0000000200002200018c36de9f719c13052759ff1f CHEBI:4822 000000000000200080020252891829640c5ed6ff1f CHEBI:4856 
000004002000082001f054a05845c01b60a1092917 CHEBI:4858 00000000000020000040024688582a80052654fd09 CHEBI:4877 000001000000000002000000c48002002804809408 CHEBI:4882 00004100000000000200080400a230e40940c47a1d CHEBI:4884 00000000000000000200048f80e01e41884170ee19 CHEBI:4887 0000000010000020005414cf09229243806168ef1f CHEBI:4888 00000000000000000040304e08683e8251b27caf1f CHEBI:4904 0000000000000001030004c79299aa950d85d7df1f CHEBI:4909 0000000000000000015414c00110823700a138ff1f CHEBI:4910 000000012000000001d41c811a53a837d12b5eff1f CHEBI:4974 00004000000000000000080300321024842a427f1f CHEBI:4995 00000000000200000000020088ce0e00210050ad17 CHEBI:5000 00000000000000000000000000000174004483d21e CHEBI:5004 0000001000000020000634cf29e032af8b1f3fff1f CHEBI:5163 000020000012022000cc72559f4dade7f5effffb1f CHEBI:5280 000000000000200080000201801000000c04825708 CHEBI:5445 0000000010000220029c101d07aab8c381617fea1d CHEBI:5706 000000080000000180000480101181d401cba3571e CHEBI:5722 000000100000000110142226ebd2b0cb3b636c7d1d CHEBI:5864 00000000020002010208048014892cc40941a5d21c CHEBI:5981 000000000000002002004056c0cc28c039c0e5aa1f CHEBI:6121 00000000000044be0581c11062848019a2a139c81f CHEBI:6339 000000000000300180000687829080e40c4fa2551c CHEBI:6359 00000000100000200294101b03aa924380617bee1f CHEBI:6758 00000000000000000044305e0a683c8351b17dab1f CHEBI:6759 000000000800049c0485b1103a49ac9b41a33dad1f CHEBI:6780 0000000000000000000008408058056000d0f0bd1f CHEBI:6916 0000000000000000015454c04814803329a3a9351f CHEBI:6923 00004000080000050084b14a1b73b2b7d9ab4fff1f CHEBI:6997 00000000000000000000001000201000000040a209 CHEBI:7438 8004040008400000024295c911b1937794fdfbff1f CHEBI:7447 00000000000000000000001010010174004493d21e CHEBI:7476 000000000000300180800a478b4000641c6de37d1d CHEBI:7489 000000000000300180800ec78b5080641c6fe37d1d CHEBI:7508 0000000000002021820006978299ace40d5db7d71c CHEBI:7569 000000000012022100cc70511f55adc7f1efbffb1f CHEBI:7731 00000000000000000002305008502c320892bdb51f CHEBI:7789 
00000000000022000048084a85701ae49c78f2ff1d CHEBI:7798 00000000000001000000000000082e84090004d608 CHEBI:7896 0000002000000120020430d01ac9ace7997bbffb1d CHEBI:7907 0000000000002021800800111601017004cda3d21e CHEBI:7947 00000000000030000202894000800004040450fa09 CHEBI:7959 800404000c400100024295c881e1904798697afe3f CHEBI:7963 00000000100410000290000f03aabec1857176ee1d CHEBI:7983 00000000100410200290000b03a29241846163ee1f CHEBI:8069 0000000010042020025004c901a290518461616a1f CHEBI:8107 80040400184020008256b5cf0bebba47957f7fff1f CHEBI:8232 00000000000020000040024c88683e0054b074af1f CHEBI:8404 000000000000000001400a169b492d715461d5bd1f CHEBI:8405 000000000800000400c4f1105a45ac9361a30dad17 CHEBI:8435 000000000000000001800800120100115421412817 CHEBI:8452 000000000000082181d444e049dca69f29a3acfd1f CHEBI:8489 0000002002000121020434c69ed9ace7197fbfff1d CHEBI:8884 00000000020002010208048694892cc40945b7d21c CHEBI:9023 0000008021004cbe05f5b5f83bfbfffbc3fbffff1f CHEBI:9139 0000000200000201020a000596892ee4895db7d61c CHEBI:9150 0000000000000001820234c009d9ae8609139cf51d CHEBI:9242 000000000010002081c8085117018175d4fdf3fa1f CHEBI:9287 000000802100449f55118de021829019d621cae81f CHEBI:9332 0000008021005c9e151181212382515dd6edeb7a1f CHEBI:9334 00000000080000000000858000000014800582d21e CHEBI:9362 00000000c080451a0601828eb9eb3cc88f5effff1d CHEBI:9407 00000000000000000002325288502e2658325cff1f CHEBI:9468 00000402000022018208048714892ee48d5db6d618 CHEBI:9516 000000002900202387d6b6ea8bf0b9eb5ff7fefd1f CHEBI:9599 0000000000002000004008480020100054b070aa1f CHEBI:9611 0080240000000201820800811699ace40959b7d31c CHEBI:9688 000000000002080003d444e08bc4881b28a5a9bd1f CHEBI:10023 0000000010100b21a09c16ed0772d0cb9f6deaff1d CHEBI:10110 000000800100441b55818ca0b1c390198a21ca4d1f CHEBI:10127 000000300080010090023057a8d22cae8e1e9ef709 CHEBI:11230 000000000000310080000207001000048c0c825708 CHEBI:11449 00000000000030018000024f033090649c6ff3ff1d CHEBI:11714 00000000000001000200008690892cc48d5cb7d61c 
CHEBI:11901 00000010008030019000024f23b2106c9e4febff1d CHEBI:12194 0000000000000221000800111601017000c1a3d21e CHEBI:12257 0000000000002000010004d690410815042540fe1f CHEBI:15334 0000000000001021000800111601017004cd83d21e CHEBI:15335 000000000000280018000a6e8070300cdc2c467f1f CHEBI:15336 000000000000300000000246894828048c0c46ff09 CHEBI:15337 000000100084000090000a432092002c8e0ecaff09 CHEBI:15338 000000000000000080000080100100508040a1421e CHEBI:15342 000000000000000000000000000000000000008208 CHEBI:15343 00000000000000000000000180000404880482d608 CHEBI:15344 000000102084302193d69fcfbbf3b4fbdf7ffeff1f CHEBI:15345 000000102084302193d69ecfbbf3b4fbdf6ffeff1f CHEBI:15346 000000000000000000020000000000000000908208 CHEBI:15347 000000000001000002020040008000000004909009 CHEBI:15348 000080000000000050220020200040080004108809 CHEBI:15349 0000001000840000100000012082002c860c0ad208 CHEBI:15350 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15351 000000000000300180000207021080640c4fa3d71c CHEBI:15352 000000013014102180b4385f8f62fc479d6defff1d CHEBI:15353 00000020000001000002325088d02c0208161cb509 CHEBI:15354 00000020000001000002305008d82c2608121cf709 CHEBI:15355 000000000000300000008a42804000040404407f09 CHEBI:15356 000001000000010000000000000000048008025208 CHEBI:15357 00000000000001008000000100000404880882d608 CHEBI:15360 00000000000001008000000000000004800082d208 CHEBI:15361 000000100084000090000002209204288e0c8ad708 CHEBI:15362 0000000000000000010414c811211153004588fa1f CHEBI:15363 000001000000010000000000000000040000005208 CHEBI:15364 0000000000000000000000010000016480cca2d21e CHEBI:15365 00000000000000000000000000000004000400d208 CHEBI:15366 0000000000000200020a000004882c84001434d21c CHEBI:15367 000000000200000000000000040000000000000208 CHEBI:15368 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15371 000010032014300080f80a4e9f6bd0411465c3fb1d CHEBI:15372 000000012004200081f008481323d04154e5c3fa1f CHEBI:15373 000000012004300080f00a4e9bebd041146de3fb1d CHEBI:15374 
000010032014300080f80a4e1f6bd0411465c3fb1d CHEBI:15375 00000000000003008008000004000004880c825608 CHEBI:15376 000000000000000000000000000000000004000008 CHEBI:15377 000000000000010000000000000000000000000000 CHEBI:15378 000000000000000000000000200000080000004008 CHEBI:15379 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15380 00000000020002000008000004082c800100148014 CHEBI:15382 00000000020002000008000004082c800100148014 CHEBI:15383 00000000020002000008000004082c800100148014 CHEBI:15384 0000000200000200000a000014092c800110158014 CHEBI:15385 0000000000002000800000001401005004c4a1501e CHEBI:15386 0000200002000200020a0084848108c00944b0901c CHEBI:15387 000000000200020000080004840008c00844b0901c CHEBI:15388 000000000200020000080004840008c00844b0901c CHEBI:15389 0000000000000201820804911481016480dd93d21e CHEBI:15390 0000000000000201820804911481016480dd93d21e CHEBI:15391 00000000000000000202008000890cc08950b0c21c CHEBI:15392 00000000000000000202008680892cc00954b4901c CHEBI:15393 00000000000000000202008680892cc00954b4901c CHEBI:15394 000000100084100012020085208b2ce88f5cbed01c CHEBI:15395 00000000000000000202008000892cc00950b4821c CHEBI:15396 00000000000000000202008000892cc00950b4821c CHEBI:15397 00000000000000000202008280890cc00954b0d21c CHEBI:15398 000000000200020000080000040008c00840b0821c CHEBI:15399 000000000200020000080000040008c00840b0821c CHEBI:15400 0000000000001021800000011201015004cda3521e CHEBI:15401 00000000000000000002000680082cc00954b4901c CHEBI:15402 0000200002000200020a0080048108c00940b0821c CHEBI:15403 00000000000001008000000700002004880c865608 CHEBI:15404 00000000000000000202008600892c400154b4901c CHEBI:15405 00000000020002000008000604082c400044b4901c CHEBI:15406 000000000000200000000050000000000404d0b81f CHEBI:15407 00000000020002000008000004082c400040b4821c CHEBI:15408 00000000000000000002000680082cc00954b4901c CHEBI:15409 00000000000000000002000000082cc00950b4821c CHEBI:15410 00000000000030018002020702182ce40d5db7d51c CHEBI:15411 
00000000100430008010084901221004842c427a09 CHEBI:15412 0000000000001000000000010400015004cca3521e CHEBI:15413 000000002000213181d4bcdf9361bc575d6feeff1f CHEBI:15414 0000040002000200020a008014892c800110158014 CHEBI:15415 00000400020002000208028614992c80010415951c CHEBI:15416 00000400020002000208008614892c84010415d21c CHEBI:15417 00000400020002000208008014892c80010015821c CHEBI:15418 00000400020002000208008694892cc40d4497d21c CHEBI:15419 00000000020002000008020604182880010404951c CHEBI:15420 00000000020002000008000004082880010004821c CHEBI:15421 000000102084302191d41ccf33b3907bde6dea7d1f CHEBI:15422 800404000000000100081ec90d6190478c6dea7f19 CHEBI:15423 800404000000000100001ecd8961b0c78c6dee7f19 CHEBI:15424 800400000000100000041a4f896030c78c6fee7f19 CHEBI:15425 800400010004000000341a4e896270c78d676e7f19 CHEBI:15426 000000010004300000300a46894260848d2e467f09 CHEBI:15427 000000000000200000000a40880000040c04407709 CHEBI:15428 000080000000000058000820200000080404400009 CHEBI:15429 000000020210020001a804868409e8058cbd76fe1b CHEBI:15430 000200020214020011b804a6968be80f8ebd7ffe1f CHEBI:15431 000200020214020011b804b6968be82f8ebd7ffe1f CHEBI:15432 000200020214020011b804b7978be82f8ebdfffe1f CHEBI:15433 000200020214020011b804b7978be82f8ebdfffe1f CHEBI:15434 0000000002001000018004868449a8958dbd76de1b CHEBI:15435 000000020010020001a804868401e0858dad667e1b CHEBI:15436 0000000000001000018004868041a0958dad665e1b CHEBI:15437 0000000000001000018004868049a8958dbd76de1b CHEBI:15439 0000000000000200000a000004082c800110148400 CHEBI:15440 0080200000000201020a000004882c800911b49418 CHEBI:15441 0000201000841200120a0003249a2ca8871c3ed518 CHEBI:15442 000000000000300180000207021080640c4fa3551c CHEBI:15444 00000000000030018000000102008064844da3521c CHEBI:15446 000000000000300180000007821080640c4fa3511c CHEBI:15447 000000102084302193d69fcfbbf3b4fbdf6fffff1f CHEBI:15448 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15449 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15450 
000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15451 000000102084302193d69fcfbbf3b4fbdf7ffeff1f CHEBI:15452 000000102084302193d69fcfbbf3b4fbdf7ffeff1f CHEBI:15453 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15454 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15455 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15456 000000102084302193d69fcfbbf3b4ffdf7ffeff1f CHEBI:15457 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15459 000000102084302193d69fcfbffbb6fbdf7ffeff1f CHEBI:15460 000000102084302193d69fcfbffbb6fbdf7ffeff1f CHEBI:15461 000000102084302193d69fcfbff3b4fbdf6fffff1f CHEBI:15463 000000182084302193d69fcfbff3b5ffdfefffff1f CHEBI:15464 000000102084302193d69fcfbbfbb4ffdf7ffeff1f CHEBI:15465 000000102084302193d69fcfbbfbb4ffdf7ffeff1f CHEBI:15466 000000102084302193d69fcfbbf3b4ffdf7ffeff1f CHEBI:15467 000000102080302193d69ecfbbf3b4fbdf6ffeff1f CHEBI:15468 000000102084302193d69fcfbff3b4fbdf6ffeff1f CHEBI:15469 000000102084302193d6dfcffbf7b4fbffefffff1f CHEBI:15470 000000102084302193d69fcfbffbbefbdf7ffeff1f CHEBI:15471 000000102084302193d69fcfbbf3b4fbdfefffff1f CHEBI:15472 000000102084302193d69fcfbff3b4fbdf7ffeff1f CHEBI:15473 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15474 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15475 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15476 000000102084302193d69fcfbbfbb6fbdf7ffeff1f CHEBI:15477 000000102084322193de9fcfbffbb4fbdf7ffeff1f CHEBI:15478 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15479 000000102084322193de9fcfbffbbcffdf7ffeff1f CHEBI:15480 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15481 000000102084302193d69fdfbbf3b5fbdfffffff1f CHEBI:15482 000000102084322193de9fcfbffbbcffdf7ffeff1f CHEBI:15483 000000102084302193d69fcfbbf3b5fbdf6fffff1f CHEBI:15484 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15485 000000102084322193de9fcfbff3b4fbdf7ffeff1f CHEBI:15486 000000102084302193d69fcfbbf3b4fbdf7ffeff1f CHEBI:15487 000000102084322193de9fcfbff3b4ffdf7ffeff1f CHEBI:15488 
000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15489 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15490 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15491 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15492 000000102084302193d69fcfbbfbbcfbdf7fffff1f CHEBI:15493 000000102084302193d69fcfbbfbbcfbdf7fffff1f CHEBI:15494 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15495 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15496 000000102084302193d69fcfbff3b4ffdf6ffeff1f CHEBI:15497 000000102084302193d6dfcffbf7b4fbff6fffff1f CHEBI:15498 000000102084302193d69fcfbff3b5fbdf6fffff1f CHEBI:15499 000000102084302193d69fcfbbf3b5fbdf6fffff1f CHEBI:15500 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15501 000000102884302193d69fcfbbf3b5fbdf6ffeff1f CHEBI:15502 000000102084302193d69fcfbff3b4ffdf6ffeff1f CHEBI:15503 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15504 000000102084322193de9fcfbff3b4fbdf6fffff1f CHEBI:15505 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15506 000000102084302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15507 000000102084302193d69fcfbffbbefbdf7ffeff1f CHEBI:15508 000000102084302193d69fcfbbf3b4ffdfefffff1f CHEBI:15509 000000102084302193d69fcfbbf3b5ffdf6ffeff1f CHEBI:15510 000000102084302193d69fdfbff3b5fbdfffffff1f CHEBI:15511 000000102084302193d69fcfbbf3b4fbdf7ffeff1f CHEBI:15512 000000102284302193d69fcfbff3b4fbdf6ffeff1f CHEBI:15513 000000102084302193d69fcfbffbbefbdf7ffeff1f CHEBI:15514 000000102084302193d69fcfbbf3b4fbdf6fffff1f CHEBI:15515 000000103884302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15516 000000102084302193d69fcfbbf3b6fbdf7ffeff1f CHEBI:15517 000000102084302193d69fcfbff3b5fbdfefffff1f CHEBI:15518 000000102084302193d69fcfbbfbbcfbdf7fffff1f CHEBI:15519 000000102084322193de9fcfbff3b4fbdf6fffff1f CHEBI:15520 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15521 000000102084302193d69ecfbbf3b4fbdf6ffeff1f CHEBI:15522 000000102084322193de9fcfbffbbcfbdf7ffeff1f CHEBI:15523 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15524 
000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15525 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15527 000000102284322193de9fcfbff3b4ffdf6ffeff1f CHEBI:15528 000000102084302193d69fcfbbfbb4fbdf7ffeff1f CHEBI:15529 000000102084302193d69fcfbffbbefbdf7ffeff1f CHEBI:15530 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15531 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15532 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15533 000000102084302193d69fcfbffbbefbdf7ffeff1f CHEBI:15534 000000102084302193d69fcfbbf3b4ffdf6ffeff1f CHEBI:15535 000000102084302193d69fcfbbf3befbdf7ffeff1f CHEBI:15536 000000102084302193d69fcfbbf3b4fbdf6fffff1f CHEBI:15537 000000102084302193d69fcfbbfbbcfbdf7ffeff1f CHEBI:15538 000000102084302193d69fcfbbfbb6fbdf7ffeff1f CHEBI:15539 000000102084302193d69fdfbff3b5fbdfffffff1f CHEBI:15540 000000102084302193d69fcfbbfbbefbdf7ffeff1f CHEBI:15541 000000102884302193d69fcfbbf3b4fbdf6ffeff1f CHEBI:15542 000000102284302193d69fcfbff3b4fbdf6ffeff1f CHEBI:15543 00000000000000000000008784082ec48d4ca6d618 CHEBI:15544 00000000000000000000008684082ec48d4ca6d618 CHEBI:15545 00000000000002000008008684082ec48d4ca6d618 CHEBI:15546 00000000000000000000008784082ec48d4ca6d618 CHEBI:15547 00000000000000000000008784082ec48d4ca6d618 CHEBI:15548 00000000000000000000008784082ec48d4ca6d618 CHEBI:15550 00000000000000000000008784082ec48d4ca6d618 CHEBI:15551 00000000000000010008048794092ec40d4da6d618 CHEBI:15552 00000000000010000000008784082ec40d4ca6d618 CHEBI:15553 000000000000000100000487a609ae8c0d0da6d61c CHEBI:15554 00000000000000000000008784082ec48d4ca6d618 CHEBI:15555 00000000000000010008048794092ec48d4da6d618 CHEBI:15556 00000000000000000000008784082ec48d4ca6d618 CHEBI:15557 00000000000000000000000684082e840d0486d608 CHEBI:15558 00000000000000000000000684082e84890486d608 CHEBI:15559 000000000000000000000086840822c48944a6d618 CHEBI:15560 00000000000001008000000300000004880c825608 CHEBI:15562 00000000000001008000000300000004880c825608 CHEBI:15563 
00000000000032008208000104800044844ca2521c CHEBI:15564 000000000000220080080000040000400444a0d01c CHEBI:15565 000000000000200080000006800020c40d4ca6521c CHEBI:15566 000000000000000080000006800020c48d4ca6521c CHEBI:15567 000040081004100000900a02092210048424427f09 CHEBI:15569 00000000000020000000084000000004040440fa09 CHEBI:15570 000000000000080018000820000000080420400001 CHEBI:15571 000000102080202191d41ccf33b3917fdeedeb7f1f CHEBI:15572 000000100084100010000a02a9c220080e064e6509 CHEBI:15573 0000000200000201820806869499acc40d5d97d51c CHEBI:15574 0000000200003201820806871699ace40d5fb7d51c CHEBI:15575 0000240002000200020a008694892cc00954b5941c CHEBI:15576 0080200200000201020a008294892cc00955b5d41c CHEBI:15577 0000000000000001020004869299acc4095597d11c CHEBI:15578 0000000000003001820006871299ace40d5fb7d51c CHEBI:15579 00802000000002010208000404882e840915b6d618 CHEBI:15580 00802000000002010208001004882ea40911b6d618 CHEBI:15581 00000000000030008200000100800004840c92d208 CHEBI:15582 00000000000030008200000780800a048c0c82d608 CHEBI:15583 000000000000300082000003808004048c0c82d608 CHEBI:15584 000000000000300082000003808004048c0c82d608 CHEBI:15586 00000000000030008202000100800004840c92d208 CHEBI:15587 00000000000001008000000180000004880c825608 CHEBI:15588 00000000000001008000000180000004880c825608 CHEBI:15589 00000000000030008200000100800004840c92d208 CHEBI:15590 00000000000030008000000700000204840c82d608 CHEBI:15591 00000000000001008002000100000004800c92d208 CHEBI:15592 00000000000001008000000100000004800c825208 CHEBI:15593 00000000000001008000000700002604800c86d608 CHEBI:15594 00000000000001008000000180000004880c825608 CHEBI:15595 000000000000100080000011840001648cccb2d61e CHEBI:15596 00000010c080451a840180012082002c8a088a5708 CHEBI:15597 00000000000001008200000180800804880c82d608 CHEBI:15598 000000000000302180000003920101500ccda3501e CHEBI:15600 00000000000011000080084080400404042040fe09 CHEBI:15601 00000000000021000000004280482604840c56fe09 CHEBI:15602 
00000000000020000002084280400404040450fe09 CHEBI:15603 000000000000100000020800814008040c0450fe09 CHEBI:15604 0000000000000021000000019201015008c581501e CHEBI:15605 0000000000000021000000011201015008c181521e CHEBI:15606 000000000000300082000007808008048c0c82d608 CHEBI:15607 00000000000030008200000780882e848d0c86d608 CHEBI:15608 000000081004100000900a02092210000420402f09 CHEBI:15609 00000400020002000208008614892c84810417d21c CHEBI:15610 000000000000200000002250880004040c0440f709 CHEBI:15611 00000000000000008000000004000004848ca2521e CHEBI:15612 000000000000100000000a06894020840d24447f09 CHEBI:15613 00000000000000008000000400000604800492d608 CHEBI:15614 00000000000030008200000780882e848d0c86d608 CHEBI:15615 000000000000100000800804814804040c2440fe09 CHEBI:15616 000000000010120001a8048e8428fa458cfd76fe1b CHEBI:15617 000000000210120001a8048e8468f8c58cfd76fe1b CHEBI:15618 000000000210120001a8048e8428f8458cfd76fe1b CHEBI:15619 00000000000020000000224488000a040c0640f709 CHEBI:15620 00000000000000000000000000000140084480561e CHEBI:15621 000000000000300080000203801000048c0c825708 CHEBI:15622 000000000000100080000207801000048d0c825708 CHEBI:15623 000000000000200080000003800000048c0c825608 CHEBI:15624 000000002000213181d4bedf9b69bcd35d67eefd1f CHEBI:15625 000000000000000010000006a4082eac0d0c86d608 CHEBI:15626 8004000000000001000000078609ae840d0da7d61c CHEBI:15627 000000000000000010000006a4082eac0d0c86d608 CHEBI:15628 000000000000000010000006a4082eac0d0c86d608 CHEBI:15629 00000000000020008000000684082e840d0c86d608 CHEBI:15630 00802000000000010000000684082e840d0da6d618 CHEBI:15631 000000000000000010000006a4082eac0d0c86d608 CHEBI:15632 000010032014300000f80a4e9f63f045dc6f477f1f CHEBI:15633 000010032014300000f81a4e9f63f047dc6f4f7f1f CHEBI:15634 000010032014300000f80a4e9f63f045dc6f477f1f CHEBI:15635 0000100b2014112000fc3ece9f63f0c7dd6b4f7f1f CHEBI:15636 000010032014300000f81a4e9f63f047dc6f4f7f1f CHEBI:15637 000010032014312000fc1ece9f63f047dc6f4f7f1f CHEBI:15638 
000010032014300000fc1a4e9f63f047dc6f6f7f1f CHEBI:15639 000010032014300000fc1a4e9f63f047dc6f6f7f1f CHEBI:15640 000010032014300000fc1a5e9f63f847dc6f6fff1f CHEBI:15641 000000012004300080f00a4e9bebd041146de3fb1d CHEBI:15642 000000012000300080f00a4e1b6bd0411465c3fb1d CHEBI:15643 000000102094332191dc1ccfb7f3b07bde6fef7f1f CHEBI:15644 0000040000000000000416d7835908e70d4d8bff1f CHEBI:15645 000000000000000000000206841020840d0c865708 CHEBI:15646 00000000000000000000000684082e840d0c86d608 CHEBI:15647 0000000000000001000004811211815004cd83511e CHEBI:15648 000000000000000182000481929181500ccd83511e CHEBI:15649 00000000000020000000894684482e848d0cc6ff09 CHEBI:15650 00802000000000010000000684082e840905a6d618 CHEBI:15651 000000103084302191d414cf33b3906bdeedea7f1f CHEBI:15652 008020000000000100080006840822840905a6d618 CHEBI:15653 00802000000000010008000684082e840905a6d618 CHEBI:15654 000000000000000010000006a4082eac0d0c86d608 CHEBI:15655 000000000000000010000006a40026ac0d0c86d608 CHEBI:15656 000000000000000010000006a4082eac0d0c86d608 CHEBI:15657 00000000000020008000000684082e840d0c86d608 CHEBI:15658 0000800000008341e0080ca92520d01ad229ca6a1b CHEBI:15659 0000800000008341e0080ca92520d01ad229ca6a1b CHEBI:15660 000000000000020000080016040008248c0c02d608 CHEBI:15661 00000000020002000208020684982c800d0494d508 CHEBI:15662 000000000000020000080014840800248c0c02d608 CHEBI:15663 00000010008412001008020724920c28870e0ad508 CHEBI:15664 00008000000000005020002220404148044480541f CHEBI:15665 00008000000000005020002220404148044480541f CHEBI:15666 00008000000000005020002220404148044480541f CHEBI:15667 000000000000020080080006040000048c0c825608 CHEBI:15668 00000000000001000000000004000004000000521e CHEBI:15669 000000000000000000000004840000048c0c025608 CHEBI:15670 00000000000030008000000100000004840c825208 CHEBI:15671 00000000000030008000000100000004840c825208 CHEBI:15672 00000000000030008000000100000004840c825208 CHEBI:15673 00000000000030008000000100000004840c825208 CHEBI:15674 
000000000000200000000a40880004000c04c0b509 CHEBI:15675 000000003004300000d00cc9012290419461626a19 CHEBI:15676 000000003004300000d00cc9012290419461626a19 CHEBI:15677 000000003004300000d00cc9012290419461626a19 CHEBI:15678 00000000000030018000024f87f8bee49d7ff7ff1d CHEBI:15679 00000000000030018000024f87fabee49d6ff7ff1d CHEBI:15681 000000010004300000300a46894260048c2c467f09 CHEBI:15682 00000000000001000000000680082e84090486d608 CHEBI:15683 00000000000030008202000100800004040c92d208 CHEBI:15684 00000000000001008000004e806030848808c67e09 CHEBI:15685 00000000000000008000000000000000000490d208 CHEBI:15686 00000000000000008000000000000000000490d208 CHEBI:15687 00000000000000008000000000000000000490d208 CHEBI:15688 00000000000030008202000100800004040c92d208 CHEBI:15689 00000002000002008208008614892cc08944b7d21c CHEBI:15690 000004000000200181c016de9be998531ce5fbfb1f CHEBI:15692 000000000000200080000201801000000c04825708 CHEBI:15693 000000000000200080408b4e886830848c2ec6ff09 CHEBI:15694 000000000000200000000840000001440444c27a1f CHEBI:15695 0000001000803201900a024f27babcec9f5fffff1d CHEBI:15696 000000003010102180b41e8f8772d0c31d6dea7f1d CHEBI:15698 000000000000300000000a46815020040c04467f09 CHEBI:15699 0000001010903221909c168fa7f290ef9f6febff1d CHEBI:15700 00000000000000000000000000000004840c02521e CHEBI:15702 0000001000843001900002072292806c0e4fab551c CHEBI:15703 000000002000302191d414ef1321925b5c6debfc1f CHEBI:15704 000000000000200000000840000000040404407a09 CHEBI:15705 00000000000000008000000600002004880c86561e CHEBI:15707 000000000000030000080000040000048808025608 CHEBI:15708 0000000000000001800004811601815400cd83521e CHEBI:15709 00000020000001000002305008800402081018a709 CHEBI:15710 000000000000300081000487100100750c4de35e1f CHEBI:15711 0000000000001021000000111211017004cda3d31e CHEBI:15712 0000001010943021909414cf27b2906b9e6dea7f1d CHEBI:15713 0000000000000000800000110400016404ccb2d21e CHEBI:15714 000000110084300192b00ed32392c06c9e6dfbff1d CHEBI:15715 
0000000800000001800434911b1185f740cf9bf91f CHEBI:15716 0000000200001200020a028794992cc00d54b5d51c CHEBI:15717 0000000000001000000000010000015004cca3521e CHEBI:15718 000000000000010000000080040822c48940a6d618 CHEBI:15720 000000100084300090000207a09200288e0e8a5708 CHEBI:15721 000000000000300180000007021001748ccda3571e CHEBI:15723 0010002000000040400230302080400a001018a009 CHEBI:15724 000000000000100000800a02884020000422442501 CHEBI:15725 0000003000800100900230572cda2eae8b1e9ef709 CHEBI:15726 000000000000300001d00ece8960b0858c25467f1b CHEBI:15727 000000010004000000300a06884260040c24447f09 CHEBI:15728 000000000000200000000a46884020040424447f09 CHEBI:15729 0000800000001000502000612000400c040402da09 CHEBI:15730 000000100084100010000001a082000c8e0c0a5708 CHEBI:15732 00000000000000000000001000000000500040a817 CHEBI:15733 000000080000000000000200001000000004001508 CHEBI:15734 00000000000030018040824f837810649c6fe3ff1d CHEBI:15735 000000001000000000103852882214068c244aff09 CHEBI:15737 000004000000000181c016dc9be998731cf1fbfb1f CHEBI:15738 00000000000000000002080000000000000050a801 CHEBI:15739 000000000000010000000000000000040000005208 CHEBI:15740 000000000000000000000002800020048c0c065608 CHEBI:15741 000004000000000000041494834928c309458cb81d CHEBI:15742 000000000000000000000000000026000800048608 CHEBI:15743 0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:15744 000000000010200000080840040000048404427a09 CHEBI:15745 000000000000100000802a06884020800522442501 CHEBI:15746 00000000000002000008000004000004840c025208 CHEBI:15749 0000001000841200100a0007249a2ca88f1c1ed508 CHEBI:15750 000000102080302191d41ecf33b3907fde6feb7d1f CHEBI:15751 000000000000000080000006800020048c0c865608 CHEBI:15753 000000000000300180000207821000640c4fa2551c CHEBI:15754 00000000000000000000000680082e84090404d608 CHEBI:15756 000000000000000000000842804020048c04467e09 CHEBI:15757 0000000000003021800000011201015004cda3501e CHEBI:15758 0080240000000201820a000114892cc089d1b7d61e CHEBI:15759 
000000000000000000000a00884021400444c4351f CHEBI:15760 00000000000001000104101002000003000108a017 CHEBI:15761 0000001000842121910414cf23b2906f9e4daa7f1f CHEBI:15763 000000000000312180080207161181740ccfa3571e CHEBI:15764 0000000000002000800008468040014404cce27e1f CHEBI:15765 000080010004a14060700a6ea862701ad6284f6f1f CHEBI:15766 000000080000000000000000400000002000000500 CHEBI:15767 0000000400002000000048428044014424c4e27e1f CHEBI:15768 00000010008430009282824fa9f234ac8e2edeff09 CHEBI:15769 000000100084300090000207a09200280f0e8a5508 CHEBI:15770 000000000000320180080297961988640c4db3d71c CHEBI:15771 0000000200102200800804c295410045844de27a1d CHEBI:15772 000000000000000000000806804020801100442815 CHEBI:15773 0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:15774 000000000000010000000001000020048908865608 CHEBI:15775 80040000085022000048994e9771b8679c6f6eff1d CHEBI:15776 000000000000010000000000000020c4894086521c CHEBI:15777 000000000000300000000042816210048c0c427e09 CHEBI:15779 000000200000010001d234d600c08807003158fe1b CHEBI:15781 000000011014302000f80a4f8f72f0c19d6f667f1d CHEBI:15782 00000000000000000002381000000002502058a817 CHEBI:15783 00000010008430019000004f23b2106c9e4deaff1d CHEBI:15784 000000100080000090000a47acd220a80e0ece7509 CHEBI:15785 00000020000031000002385788c82c060c34deff09 CHEBI:15786 0000000200000221020a06859699acc48d5db7d71e CHEBI:15787 0000000000002000800040004004014024c4a0501e CHEBI:15788 000000000000000000000011000001708ccca3d61e CHEBI:15789 000000000000010000000001000000048808025608 CHEBI:15792 0000000000003000000008400100014454c4e27a1f CHEBI:15793 0000000000000000000000011401014080c483521e CHEBI:15794 0000000200001221800800011601815084ddb3d21e CHEBI:15795 000000002020010001941a028a18a8937ca77dbd3f CHEBI:15797 000000100084200090400a4b20b2102c8e2edaff09 CHEBI:15798 000000100084000090000003a09a2eac8f0e8ed708 CHEBI:15799 00000000000001008000000280002004880c865608 CHEBI:15801 000000000010200001080cc0140100150425407a1f CHEBI:15802 
000000010004200000300a46884a6004842c46ff09 CHEBI:15805 000000100084300190000007a292006c0e4daa511c CHEBI:15807 0000000200001200820a385f15a9194294fcfbfa1f CHEBI:15808 0000001000803201900a0207269a2cec0f5fbed51c CHEBI:15809 000000100084200010000843a0d2002c860c4a7f09 CHEBI:15811 0000001000803201900a0207269a2cec0f5fbed51c CHEBI:15812 00000000000000000100000013010151004581181f CHEBI:15815 000000010004200000300a46884260040424447f09 CHEBI:15816 000000102084222191de16cf3fb3987b5e6ffafd1f CHEBI:15819 000000112084302191f41ecf33b3d06fdeefeb7f1f CHEBI:15820 0000000000000000010000400300014504c5a2da1f CHEBI:15821 00000000000030018200008792892de40dcda7d21e CHEBI:15822 0000101230943200909c104fbff3b1efceefcfff1f CHEBI:15823 0000000000000200000a0000040000000000108208 CHEBI:15825 00000002000002000208008594892cc08954b7d21c CHEBI:15826 000000112084302191f41ecfbbf3f07fde6fee7f1f CHEBI:15827 00000000c080601a14018843a0d2002c860c4a7f09 CHEBI:15829 000000000000000000000846800828848d04c6fe09 CHEBI:15830 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:15831 000040001004010000100801012210048420427a09 CHEBI:15832 0000000000000200000a000284082c000814949408 CHEBI:15833 000000000000300180000207021001640ccda3571e CHEBI:15834 0000001000840000900000032092002c8e0e8a5708 CHEBI:15835 000000100084200010000847a1c2002c8e0c4a7e09 CHEBI:15836 00000000000000000002020080102c000804149508 CHEBI:15837 80040000085002008048914e9771b8679c6feeff1d CHEBI:15838 000000100084300190000043a3b2106c9e4dea7f1d CHEBI:15840 0000001010942021909c14cf27b2906f9e6dea7f1d CHEBI:15842 00000000000000000000000684082e84090404d608 CHEBI:15843 00000010008420009000000120820068864caa501c CHEBI:15844 000000100084300090000207a09200288e0e8a5708 CHEBI:15845 000000102080312191d41ccf33b3907bde6feb7f1f CHEBI:15846 000000000000300180000207021000640c4fa2571c CHEBI:15847 00000000000010000200000100800004840c02d208 CHEBI:15849 000000000000200080000203a01000280c06825508 CHEBI:15850 000000001010200000d4184b8f629047846d6a7f1d CHEBI:15851 
000100120094130192fa1eefbffbfccbcfffffff1f CHEBI:15852 0000000002000000010030561e41a1f301678df91f CHEBI:15854 00000010d090743b949594cfa7b2906f9e6feb7f1d CHEBI:15855 000000000020010000000000000000002000000000 CHEBI:15858 000000001004300000100846816210048c2c427e09 CHEBI:15859 00000000000000000000001e80683e00090054ae09 CHEBI:15860 00000000020002200208008004882c800110348014 CHEBI:15861 000000000000000000000a0008000200000040a501 CHEBI:15862 00000000000030018008000106008064844da3521c CHEBI:15863 0000000000003021800800011601015004cda3521e CHEBI:15864 000000100084310090000007a092002c8e0c8a5708 CHEBI:15865 000000000000300080000207801000800c0c825708 CHEBI:15866 000000000000300180000007821000440c4da2531c CHEBI:15867 000000103090302190b43edfaff2fc6b9e6feeff1d CHEBI:15868 0000000000001000000000010000017484ccb3d21e CHEBI:15871 0000001000843001900008472392006c1e4dea7d1d CHEBI:15873 0000000000003000010004cf916110958c2d427e1f CHEBI:15874 00000000000030018000024f033890649c7ff3ff1d CHEBI:15876 000000002000030001dc1ec49f518c1755274aff1f CHEBI:15877 0000000800000001800434951b1185d740cf8bf91f CHEBI:15878 000000100084000090008113a09a00288e0c8ad708 CHEBI:15880 00000000c080441a1401a002a08a24088b0e8ec708 CHEBI:15881 00000000000000000000000000000140004480101e CHEBI:15882 000000112084302191f41ccf33b3d07bdeefeb7f1f CHEBI:15883 000004000000000000041494834928c309458cb81d CHEBI:15884 000000000200200000080842c44400042404407e09 CHEBI:15885 0000001010903121909414cf27b2906f9e6debff1d CHEBI:15886 000000000000000000000a06884020040d04447709 CHEBI:15887 00000000000000000000000680082c04880486d608 CHEBI:15888 0000000000000000015004801001801100a1708817 CHEBI:15890 00000000c080541a14018a02a9c220088e064e6509 CHEBI:15891 000000000000200000408b4e886030848d2e467f09 CHEBI:15892 0000000000000100002004c0004060050001047a19 CHEBI:15893 0000000000000200000a000004082c800110148608 CHEBI:15894 000000000000300180000685801000440c4da25718 CHEBI:15895 00000000000020008200008610892dc08dccb7d61e CHEBI:15896 
000000100084100012000007a0922c2c8f0c8ed708 CHEBI:15899 00802000000000018000000100000004840da25218 CHEBI:15900 00000000100410000090020f8b62b0418c61446b1d CHEBI:15901 000000100084300190000007a292006c0e4daa511c CHEBI:15902 000000000000300180000207021000440c4da2551c CHEBI:15903 000000000000000000000002800020040804045708 CHEBI:15904 00000010008430009202024fa9f2342c8e0edeff09 CHEBI:15905 0000000000003201820a305787988ee69d5dbbff1d CHEBI:15906 0000001000840000100008202082100806044a4809 CHEBI:15907 000000000000300080000207801000040c0c825708 CHEBI:15908 0000800000403081d0208327235040480c4da35d1f CHEBI:15911 00000000000020008000000680082e840d0486d608 CHEBI:15913 000000000000200000000842814004048c04c2fe09 CHEBI:15914 000000102080302191d41ecfb3b3907fde6feb7d1f CHEBI:15915 000000100080200010000a47a8d220ac8e2e4e7f09 CHEBI:15916 00000000000031218104124f033090471c4dab7f1f CHEBI:15917 000000103094002190b41c8ba7f2d0eb9f6dea7f1d CHEBI:15918 000000102084302191d414cfb3f3907fde6dea7f1f CHEBI:15919 0000000000000100810010101b41217300ddbff91f CHEBI:15920 000000000000000080000002840000048c0c825608 CHEBI:15924 000000100084200090000003a092002c8f0c8a5708 CHEBI:15925 0000001000803201900a024f27babcec9f7fffff1d CHEBI:15926 000000000000312181841ecf033090431c6dea7f1f CHEBI:15927 00000000000000000000009004082ae48940b6d618 CHEBI:15929 0000000100001000019242004a068201743150ad17 CHEBI:15930 00000000c080601b9409824d27b2906c9e4debff1d CHEBI:15931 000000000000302182080687969181740ccfa3571e CHEBI:15932 0000001010903021929416cfa7b2906f9e6fea7f1d CHEBI:15933 000010003014300080d80a4f8f3290c19c6fe27f1d CHEBI:15934 000000000000300080000207801000000c0c825708 CHEBI:15936 000000000000300082000007848001e48c4ca3521e CHEBI:15937 00000000000000000200028490992c80090435951c CHEBI:15938 0000000200003201820a00071689ace48d5db7d21c CHEBI:15939 00000000000000000100000003000005000500521f CHEBI:15940 00000000000032008008000104000044044ca2521c CHEBI:15941 00000000c080459e0401890021820008d2806a681f CHEBI:15942 
000000100084200090000203a09200a88e0e8a5508 CHEBI:15943 0000000000002000800002149a4121f105cde7fd1f CHEBI:15944 000000000000300180000207021000440c4da2571c CHEBI:15945 000000100084300090000207a09200288e0e8a5708 CHEBI:15946 000000000000300180400a4f033010401c6de2ff1d CHEBI:15947 0000000000000200000a000004082c000010148400 CHEBI:15948 000000202000210001d23cde80e8b88784355efe1b CHEBI:15949 0000000000000000800002149a4121f105c5e7fd1f CHEBI:15950 000000000000100000020005000005400cccb2d61e CHEBI:15951 0000001010903221909c1ccfa7f290ef9f6dfbff1d CHEBI:15952 000000000000120000280040050040449444e25a1d CHEBI:15953 000000100084300090000007a09200288e0c8a5708 CHEBI:15954 000010022010220000fe1ec69e41e80714275cff1d CHEBI:15955 0000000018042000005085869063b0c58d65467f19 CHEBI:15956 00000400000002008008000104000044844ca25218 CHEBI:15957 000000100080000090002257a8d22cac8e0ecef709 CHEBI:15958 00000020000001000002305588880426881c9af709 CHEBI:15960 000000100084200010000846a1d2202c8e0c4e7f09 CHEBI:15961 000000000000300080000207801000800c0e825508 CHEBI:15963 00000000c184401a14018026a0c2308897044e681d CHEBI:15964 000000000000200000000846804020048c0c467e09 CHEBI:15966 000000010004100000300a06894260040c24447f09 CHEBI:15968 00004000000020000000084380721024842c427f09 CHEBI:15970 000000000000200001d00cc6804080050425407e1b CHEBI:15971 0000001010903221909c1ccfa7f290ef9f6dfbff1d CHEBI:15972 00000000000030008000001100000060044ca2d01c CHEBI:15975 000000000000100081000480110100110405c05a1f CHEBI:15976 000000000000000000000000040020c0084084021c CHEBI:15977 000000100084200090000203a09200280e0e8a5508 CHEBI:15978 0000000000000000000430101a412c9301812dad17 CHEBI:15979 00000000000011008202020300900404040c92d708 CHEBI:15980 000100120094120192fa1eefbffbfccbcfffffff1f CHEBI:15982 000080000000b140e0000267a150404ad66ceb7f1f CHEBI:15984 00200000c080411a040181012082000882080a4008 CHEBI:15987 00000000000004181401a072a0402c088a004cef09 CHEBI:15989 0000000000000100000000011001015400c4a3521e CHEBI:15992 
00000010008430019000004723b2106c9e4dea7f1d CHEBI:15993 00000000000000000200008010892cc08940b5c21c CHEBI:15994 000000112084302191f41ccf33b3d06bdeedea7f1f CHEBI:15996 000000000000000000020a0008000400000050a501 CHEBI:15997 000000112084100011f00a4b3bf3d169d6ef4b7d1f CHEBI:15998 000000000000000080000002000001448c4c82561e CHEBI:15999 000000000000200000000a40885020000c06443509 CHEBI:16000 000000100084000090000003a092002c8e0c8a5708 CHEBI:16001 00000000000030008000000100000004840c825208 CHEBI:16002 000000000000200080000002800001440c4c82561e CHEBI:16003 00000000000001008000000000000004000482d208 CHEBI:16004 0400000000800100000000102082000802000ac008 CHEBI:16005 000000000000000000008010000000000000008000 CHEBI:16007 0000000000000000000000010000014000c4a0521e CHEBI:16008 000000002000302181d41cc913219057546dea7a1f CHEBI:16009 0000000000002000000004ce806030458c45467a19 CHEBI:16010 00000000000000000000020600182e800104149508 CHEBI:16011 00000100080000208000858000188834880992d71a CHEBI:16013 000000000000200000000846804020048c0c467e09 CHEBI:16015 000000000000100080000203001000000c06825708 CHEBI:16016 000000000041000000008000000000000000000517 CHEBI:16017 0000000000000200000a0200041008000004109508 CHEBI:16019 0000100a2010302180fc3edf1f31d4431c6feafd1d CHEBI:16020 00802010008400019202305333db8c6e9f5dbbfb1d CHEBI:16021 000000103090302190b41ecfa7b2d0eb9e6fea7f1d CHEBI:16022 000000000000300080000207801000000d0e825708 CHEBI:16023 00000000000020000000084680402004842c467e09 CHEBI:16026 000000102084202191d41ccf33b3907b5e6dea7d1f CHEBI:16027 000000000000110000800a42884020040420447f09 CHEBI:16028 000000000000000001d234d600408807042558fe1b CHEBI:16029 0000000000000200000a0000041a2ca4091014d728 CHEBI:16030 000000000000000001000c8e906110110c21402e1f CHEBI:16031 00000000000000000200008690892cc00944b5d21c CHEBI:16032 000000110084300010300262a982500c8e2c4a7f09 CHEBI:16034 0000000000001021000000051211015004cda3531e CHEBI:16035 00000000000000000000000380182624090496d708 CHEBI:16036 
0000800000008140600200202000400a020018e809 CHEBI:16037 000000100080000090000a47a8d220ac8e0ece7709 CHEBI:16038 000000102084302191f414cf33b3d06bdeedea7f1f CHEBI:16039 000000003010100000b008000722d0411461402a1d CHEBI:16040 000000100084300090000007a09200a88e0c8a5708 CHEBI:16043 00000000000000001000a07280402c08000044af09 CHEBI:16044 0000001000803201900a0207269a2cec8f5fbed51c CHEBI:16046 000010123094300090dc104fbfb390ebceeffbff1f CHEBI:16048 000000103090302190b41ccfa7b2d86f9e6debff1d CHEBI:16049 0000000000000000800414c911211043c0c1aaea1f CHEBI:16050 000000100080300090000017a09a00280e0c9ad508 CHEBI:16051 000000000200000000000000040000000000008000 CHEBI:16052 00000000000010000202000780882cc00d54b4d01c CHEBI:16053 000000000000000000000006800000048c0c02561e CHEBI:16054 00000000000030008000000100000000040c82d208 CHEBI:16055 0000001000841200100a000324920828860c1ad508 CHEBI:16057 00000000000030008000001100000060044ca2d21c CHEBI:16058 00000000000030018000024f033090649c6ff3ff1d CHEBI:16059 0000000000000200014816861f59889305a749bd1f CHEBI:16060 000000000000300080000247813210008c0cc27f09 CHEBI:16062 0080200000003001820206871299ace48d5fb7d71e CHEBI:16063 000000000000310080000207801000048c0c825708 CHEBI:16064 000000000000300180000217161181640ccda3d71e CHEBI:16065 0000000000000200020a000004882c80001034821c CHEBI:16066 00000010008420009000000120820068864caa501c CHEBI:16067 0000000000003000800000010000017484cca3521e CHEBI:16068 000000000000000001500480000080010021400013 CHEBI:16069 000000000000300180000207821000400d4fa2551c CHEBI:16070 000000000000000000000489042010418041406a19 CHEBI:16072 0000001000803201900a024f27babcec9f7fffff1d CHEBI:16073 00000000000000000202008010892cc00950b5861c CHEBI:16074 00000002000002000208008494892cc08954b7d21c CHEBI:16076 0000001000843001900002072292006c0e4daa551c CHEBI:16077 00000000000000000000000100180024800092d71e CHEBI:16079 000000000000300000000a47880000040c04c27709 CHEBI:16080 0000001010903221929c168fa7f298ef9f6ffaff1d CHEBI:16081 
0000001010903021909414cfa7b2906f9e6feb7f1d CHEBI:16082 000000000000000081000480100100158405c2da1f CHEBI:16083 000000100084300190000687a09200ec0e4faa5518 CHEBI:16084 0000001010903021909414cf27b2906f9e6deb7f1d CHEBI:16085 00200000000030018440824f8378906c9d6ff3ff1d CHEBI:16086 00000000000001008000000300000004880c825608 CHEBI:16087 00000000000010010000000710080e448c5db2d618 CHEBI:16089 0000000000000200000a020484182c800914149528 CHEBI:16091 000000000000000001000000120100110001010017 CHEBI:16092 00000000020002000008000004000004840c02d208 CHEBI:16093 00200000c080411a040181002082000882000a4008 CHEBI:16094 000010032014100080f80a4c1f6bd0419465c3fb1d CHEBI:16095 0000000000000100810010101b41217300d9bff91f CHEBI:16096 000000112084302191f41ccf33b3d06fdeedebff1f CHEBI:16097 00000000000022008008084684400004840cc27e09 CHEBI:16098 000000000000300180000207061001640c4da3571e CHEBI:16099 00000000000032008208000104800044044ca2d21c CHEBI:16100 000000000000000000000000000002000000008416 CHEBI:16101 000000000000000000000006800021440cc4a6561e CHEBI:16104 000000000000000100084481c4840044a84da25618 CHEBI:16106 000000100084000090000203209200288e0e8a5708 CHEBI:16108 000000000000100000000203801020000c06045508 CHEBI:16109 00000030008001009002305728d22cae8a1e9ef709 CHEBI:16110 00000000000000000000000100082680890094c608 CHEBI:16111 000000000000300082000007848001e48ccca3521e CHEBI:16112 0000000200000200020a008294892cc00954b5941c CHEBI:16113 0000000000000001000004911211817000cd83d11e CHEBI:16114 0000000000003000000008400100014454c4e2fa1f CHEBI:16116 00000000000030018000024f033090641c4fe3ff1d CHEBI:16117 0000000800000101810014901b51a17700cbbff91f CHEBI:16118 000000000000320080080007840000440c4ca2521c CHEBI:16119 000000000000010000000000000020048800065608 CHEBI:16120 00000000000030018000024f87fabee49d7ff7ff1d CHEBI:16121 000000040000200080004002800401442ccca2561e CHEBI:16122 00000000000000018202000002892cc00951b5d21c CHEBI:16123 000000000000300180000207021080640c4da3d51c CHEBI:16124 
00000000000000000000020680182e800904049508 CHEBI:16125 00000000000030018008024f073090649c4de3ff1d CHEBI:16126 0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:16128 0000000000003001820002cf13b9bde49dcdf7ff1f CHEBI:16129 000000000000000000004000408600102000010016 CHEBI:16130 000000000000080018000a20001000085424403d1f CHEBI:16131 000000100084200090000003a09200288f0c8a5708 CHEBI:16132 000000000000000000000086800020400944841018 CHEBI:16133 000000000000000000000000000000000000400001 CHEBI:16134 00000000000000000002000000000004000410d208 CHEBI:16135 000000000000000000008000000000000000000000 CHEBI:16136 00000000000030018000025b033090649c5df3ff1d CHEBI:16137 000000000000300080000207801000840c0c825708 CHEBI:16138 0000001000840200100a0002249a2ca8071c1ed508 CHEBI:16141 000000000000300080000203801000048c0c825708 CHEBI:16142 00000011008430009030084121824068166cea781d CHEBI:16143 0400001000800100000000002082000802000a4008 CHEBI:16144 000000000000010000000808042010048000427a09 CHEBI:16146 0000000000002000828434d71fc9a3e745ffbfff1f CHEBI:16147 00000000000000000000000000082e800100148400 CHEBI:16148 00000000000001000000000000000004000000521e CHEBI:16150 0000000000000001000000001201214408c185521e CHEBI:16151 0000001000843000900000032092006c8e4eaa571c CHEBI:16152 00000000000030018000024f033090641c4fe3ff1d CHEBI:16153 000000000000300080000207801000040c0c825708 CHEBI:16154 00000010008400009000000120820068864caa501c CHEBI:16155 000000000000310080000207801000040c0c825708 CHEBI:16157 0000000000001000000000110000017484dcb3d21e CHEBI:16159 000000000000200000000846815020a48d0c467f09 CHEBI:16160 0000000000001100000000010000014404cca2d21e CHEBI:16162 00000000000020000000a846804000048c0e427f09 CHEBI:16163 0000000000003000800000010000014004c4a2501e CHEBI:16164 000000000000300180000685801000440c4da25718 CHEBI:16165 000000000000200080000007840001648ccca2561e CHEBI:16166 000000000000000001000008032011450445025a1f CHEBI:16168 000000000000010000000006000001440ccca2561e CHEBI:16169 
000800000000000000000000000000000000000000 CHEBI:16170 000000100080212191841ccf23b2906b9e6dea7f1f CHEBI:16171 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:16172 000000100084300090000047a1b210288e0cca7f09 CHEBI:16173 000000102084102191d41c8bb3f390fbdf6dea7d1f CHEBI:16174 0000000200000200020a008014892cc00950b5861c CHEBI:16175 000000000000200000000a46884020040424447f09 CHEBI:16176 000000000000300180000007821080e40c4fa3551c CHEBI:16177 0000000000000200020a0000049a2ca4001034d71c CHEBI:16179 000000000000200000000240882210048c04427f09 CHEBI:16180 00000000000030008000084101000040144ce2781d CHEBI:16181 000000000000000000000000000006000000108608 CHEBI:16182 000000000000000000000000000000000000008000 CHEBI:16183 0000000200000200820c30141ec12de701ddbffb1f CHEBI:16184 0000000000000100002000400a4060850101047b1d CHEBI:16187 00000000000000000000020680182e800904049508 CHEBI:16188 00000000c080411a040180002082000882080a4008 CHEBI:16189 000000000000200080000000040000400444a0501c CHEBI:16190 000000112084002191f41ccbb3f3d0ebdfedea7f1f CHEBI:16192 00000000000001000000000000000144004482521e CHEBI:16193 00000000000000000000000684082e84090404d608 CHEBI:16196 000000100084200010000047a4fa3ea8870cceff09 CHEBI:16197 00000000000030018000024f033090649c6ff3ff1d CHEBI:16198 000000001004000000100800002210000420402a09 CHEBI:16199 0000000000000000000000111001017084cc93d21e CHEBI:16200 00000000000010000000000100000140044482501e CHEBI:16204 0000000000002020800000000000015004c4a1501e CHEBI:16205 0000001000840200900a2007a49a2ca80f1e9ed508 CHEBI:16206 00000000000000000000000000000140004480521e CHEBI:16207 000000000000100080008202000000048c04825708 CHEBI:16208 000000018004541a14318a06a94260080e264c6d09 CHEBI:16209 0000000000001000800000110400016484ccb2d21e CHEBI:16210 00000000000000010008448144040044a04da25218 CHEBI:16211 000004002000302181f416cf9931d0435cefea7d1b CHEBI:16213 0000001000840200100a0006249a2ca80f1c1ed508 CHEBI:16214 0000001000800100100000002002000802000a4008 CHEBI:16215 
00000000001012000008020f05301004840c42ff09 CHEBI:16216 000000000000300180000207021000440c4da2571c CHEBI:16217 0000001000843001900002072292006c0e4daa551c CHEBI:16218 000000000000300180000217061001640ccfa3d51e CHEBI:16220 000000100084200090000203a09200280e0e8a5508 CHEBI:16221 00000000000030008000000100000000840c825208 CHEBI:16222 04000000000001000002201020800008020018c008 CHEBI:16223 00000000000030008000000100000004840c825208 CHEBI:16224 000000000000200000408a4e886030848c2e467f09 CHEBI:16225 0080240000000021820204811299acc48959b7d31e CHEBI:16226 000000000000000001000000020000010001000017 CHEBI:16227 000000000000200000000840008200040404407a09 CHEBI:16228 00000000000000000200008690892cc00954b5d21c CHEBI:16229 000000000000010000000206801020040904065708 CHEBI:16230 0000000000003000000006c6894000450d45c27b19 CHEBI:16231 000000000000200000000a46886230048524467f09 CHEBI:16232 0000000800000001800034901b51a1f701cfaff91f CHEBI:16233 000000000000010000000000000000000004000008 CHEBI:16234 000000012004100001f00cc81323d04154e1402a1f CHEBI:16235 000000000000000000000200001002000004009508 CHEBI:16236 00000000000001000000400040040004280000561e CHEBI:16237 000000103080302191f41ccfbbb3d0fbdeeffbff1f CHEBI:16238 00000000000000000000000100180024000492d71e CHEBI:16239 000000000000000018000000200000080404004008 CHEBI:16240 000000100084300090000007a09200288e0c8a5708 CHEBI:16241 0000000000003021800800011601015004cda3521e CHEBI:16243 00000020000001000002301008c82c0608101cf709 CHEBI:16244 000000100084300090000207a09200a80e0e8a5508 CHEBI:16246 00000000000030018000024f033090649c6ff3ff1d CHEBI:16250 000000011014102180b4385f8f62fc479d6defff1d CHEBI:16251 0000000000000001000000151201056408cd93d21e CHEBI:16252 00000000001012000008000a852010048c0c42fe09 CHEBI:16253 0000000000000200020a000004982ea4091034d71c CHEBI:16254 000000000000200001d00ec6805080010425403d1b CHEBI:16255 00000000000000000000000100000004800402d208 CHEBI:16256 0000001002840200900800032482006c8e4caa521c CHEBI:16257 
00000000000020000000004a80681004840442fe1f CHEBI:16259 0000800000008140600000612000414a52c4aa781f CHEBI:16260 000000000000300180000a47031080641c6fe37d1d CHEBI:16261 0000001010903021909416cf27b2906f9e6febff1d CHEBI:16264 000000000000000000000002800020048804065608 CHEBI:16265 0000001000840200900a2003249a2ca80f1e9ed508 CHEBI:16266 000000000001300180000247021001640c4da3551f CHEBI:16267 0000800000008140600000202800420a0a0008e509 CHEBI:16268 00000000000000000002301000000002500018a817 CHEBI:16269 000000010004000000300a0e886270000522442f1f CHEBI:16270 00000000000000018202000682892cc00955b5d01c CHEBI:16271 0000001000842000900008472192002c8e0cca7f09 CHEBI:16273 00000002001030010188084797418155ccedc37e1f CHEBI:16274 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:16275 000000000000200080000007800020048c0c865608 CHEBI:16278 000000000000300180000207061000640c4da3571e CHEBI:16279 00000000000002000008000004000004840c025208 CHEBI:16281 000000003004300000900840012210048424427a09 CHEBI:16282 0020000000002000040088428040000c852e427f09 CHEBI:16283 000000102084102191d41c8bb3f390fbdf6dea7d1f CHEBI:16284 00000000000000000002000480082c84091414d608 CHEBI:16285 0000000200000200020a008794892cc00954b5d61c CHEBI:16286 0000001010903021909416cf27b2906f9e6febff1d CHEBI:16287 000000000000200000000846815828248c0c46ff09 CHEBI:16288 000000000000300180000207021000400c4da2551c CHEBI:16289 0000000200000200020a008694892cc00954b5941c CHEBI:16290 0000000000000000800002032018002c080e82d708 CHEBI:16291 000000000000300180000687001000440c4da25718 CHEBI:16292 00000000100410000090000d836290418c61426a1d CHEBI:16294 000000000000200001000cc6904100150425407e1f CHEBI:16296 00000000000030018000024f877abee49d7ff7ff1d CHEBI:16297 000000100084300090000207a09200280e0e8a5508 CHEBI:16298 000004002000302181f416cf1931d0435cefea7f1b CHEBI:16299 0000001000843001900006872092006c0e4daa5518 CHEBI:16300 000080000000014060000020200040080200004009 CHEBI:16301 0000000000000200020a020004982c80001434951c CHEBI:16302 
000000000000300000400a42894000040424407f09 CHEBI:16303 000100120094130192fa1eefbffbfccbcfffffff1f CHEBI:16304 00000000000020000000024e88603014842c477f1f CHEBI:16305 000000000000312180080207161181740ccda3571e CHEBI:16307 000000100084300190000687229280ec0e4faa551c CHEBI:16308 00000000c080459e140181002082000c820c0a521e CHEBI:16309 0000000000000200000a000004082c000810948608 CHEBI:16310 000000103094102190b41c8ba7f2d0eb9f6dea7f1d CHEBI:16311 00000000000000000200008690892cc48d5cb7d61c CHEBI:16312 0000000000002000000006c6884020050505447b19 CHEBI:16313 0000000000002001000006cf887031f48d6fc67f1f CHEBI:16314 0000000000000001820434901ec1adc301c58ffb1f CHEBI:16315 0000800000003000d840086ea160100c8c2c427e09 CHEBI:16316 0000000200000200020a000694892cc48d5cb7d21c CHEBI:16317 000080000000a140e00000202100414a56ccaa781f CHEBI:16318 00000000000020000000004e806830048c0c46fe09 CHEBI:16319 000000000000100000000a0209000404040440f709 CHEBI:16320 00000000000022008008000004000004840c825208 CHEBI:16321 00000010008400009000000120820068864caa501c CHEBI:16322 0000000000000200000a000604082cc00554b4d41c CHEBI:16323 0000000200101201800804d915b191658cddf3fb1f CHEBI:16324 00000000000000000200008690892cc40d54b7d61c CHEBI:16325 0000001000843001900002072292006c0e4daa551c CHEBI:16326 00000000000000008000080b816010048c0cc27e09 CHEBI:16327 000000101090302190d41ecfaff2b0ef9f7fffff1d CHEBI:16329 00000000000000000200008690892cc00944b5d21c CHEBI:16330 00000000000001000000000000002144084486561e CHEBI:16331 000000100084300090000207a09200a88e0e8a5708 CHEBI:16332 00000000000000000000000000000004000400d208 CHEBI:16333 000000002000302181d41ecf133190535c6dea7d1f CHEBI:16335 00000000000030018000024b2330906c9e4de3ff1d CHEBI:16336 0000001000840000900000032092002c8e0e8a5708 CHEBI:16337 000000000000110080000840814000048c0cc27e09 CHEBI:16338 0000000010043000005004cf816290458c6d627e19 CHEBI:16342 000000000000200000000a40880000000c04c0351f CHEBI:16343 000000010004200000300a42880240040c24407f09 CHEBI:16344 
000000008000741a14018842a140000c8e0c4a7f09 CHEBI:16345 00000000000000000000000000000000000480901e CHEBI:16346 0000002000000100000230518888040608149af709 CHEBI:16347 000080000000814060000022a940600e8a0c0e7709 CHEBI:16348 000000001004200000100a46886230048424467f09 CHEBI:16349 0000001000843001900002072292006c8e4faa571c CHEBI:16351 0000000000000000002004c6894040450d45827b19 CHEBI:16352 0000002000000100800230510cd02d6608debef71f CHEBI:16353 0000000010000000005434db89629443886168eb19 CHEBI:16354 0000000400000000000040100004016020c0a0901e CHEBI:16355 000000112084002191f41ccf33b3d04bdeedeb7b1f CHEBI:16356 000000000000280001c40ce28940800f0425487f1b CHEBI:16357 000000000001300180000047821080e40c4fa3551f CHEBI:16358 00000000000000000200008690892cc40d5cb7d61c CHEBI:16359 00000000000030018000024b83b090649c4fe3ff1d CHEBI:16361 00000002000002010208048014892c448051b7d21c CHEBI:16363 00000000000000000100004802201141044580581f CHEBI:16364 00000000000000000100000812211151004501181f CHEBI:16365 000000000000000011d414f08048800b002148ae1b CHEBI:16367 000000000000100100000005120105440ccd83d21e CHEBI:16368 000000000000300080000007800000048c0c825608 CHEBI:16369 000000000000100000080005840000c00c4482521c CHEBI:16370 000000103090302190b41ccfa7b2d06f9e6fea7f1d CHEBI:16371 00000000000010008200028790992cc08d4cb7d71c CHEBI:16372 000000000000300081500482814080050c25c25e1b CHEBI:16373 000000000000300000008a42804000040404407f09 CHEBI:16375 00002000000000000202008680892cc0095494901c CHEBI:16377 00000000000001000000084001000004800842fa09 CHEBI:16378 000100120094130192fa1effbffbfdcbcfffffff1f CHEBI:16379 0000000000000000828434d51fc9a3e7c1ffbfff1f CHEBI:16380 000000040000010000000000000000002000000000 CHEBI:16382 000000000000030000080000040000048808025608 CHEBI:16383 000000000000300082000007848001e48ccca3521e CHEBI:16384 0000000000000000800430101a412df301c5bdfd1f CHEBI:16387 00000000000020008000000000000144044c82521e CHEBI:16388 0000000000000200800a0011040008608058b2d61c CHEBI:16389 
00000000000000000200008690892cc48d5cb7d61c CHEBI:16390 000000000000300180000687001000440c4da25718 CHEBI:16392 000000000000300000000a4784582e800504c4fd09 CHEBI:16393 00000000000010000000001001000004548460fa1f CHEBI:16394 0000800000000020500000602000003c500041fa1f CHEBI:16395 000000000000000000000800002010000000402209 CHEBI:16397 000000000000300000000841000000040404c2fa09 CHEBI:16398 000000000000300080400a4e89601004842c427f09 CHEBI:16399 0000000000003021800800011601015004cda3521e CHEBI:16400 0000000200000200020a000614892cc4895cb7d21c CHEBI:16404 000000000000300180000007821080e40c4fa3551c CHEBI:16405 00000400000002000268148c17e9d0e740e5bbfa1f CHEBI:16408 0000000000001000010002470318014504cda2df1f CHEBI:16409 000000000000100001000a460b18014105e7e0fd1f CHEBI:16410 000000000000000001000486900100150c05405e1f CHEBI:16411 000001000000000000000204001000000506005508 CHEBI:16413 00000000000020000002084000000004040450fa09 CHEBI:16414 0000000800000001800434901951add709cb8ff31f CHEBI:16415 00000002000022008208008694892cc00d54b7d21c CHEBI:16418 0080200000000001820206879099acc48d5db7d71e CHEBI:16419 000000100084300090000007a09200288e0c8ad708 CHEBI:16421 00000002000002000208008014892cc08940b5c21c CHEBI:16422 00000100000000000000000684082e84090404d608 CHEBI:16423 00000000000000000000000000000000080000061e CHEBI:16424 0000000000000200014814861f49889301a149ab1f CHEBI:16425 00000010008400009150048621d280298e2dca5f1b CHEBI:16426 00000000021032018188048f8668b8f58dfdf6fe1f CHEBI:16427 000000000000220080080007040001648c4ca3521e CHEBI:16428 000000020010020003e804c685c9ec858dbdf6fe1b CHEBI:16430 00802400020002010208000004892c80090194901c CHEBI:16431 000000000000300180000207821000640c4fa2571c CHEBI:16432 0000000000002000800000000400014404cca2521e CHEBI:16433 0010800000000020d00000282020101a500409ea1f CHEBI:16434 0000001012903021909c16cf27b2906f9e6febff1d CHEBI:16435 000000303090212190b63cdf2ff2fc6b9e7ffeff1d CHEBI:16436 000000000000300001d004ca80689005842542fe1b CHEBI:16437 
000000000000200000000244880008048c0c42ff09 CHEBI:16439 000000000000200000000056804828048c0c46fe09 CHEBI:16440 000000000001300182020247029000640c4db2d51d CHEBI:16441 00000000000020008200000100800004840c92d208 CHEBI:16444 00000402000002018008048114010044844da25218 CHEBI:16445 00000010008430019000024f23b2106c9e4deaff1d CHEBI:16446 00000000000020000000084000000004040440fa09 CHEBI:16449 00000000101010218094168f877290c39d6dea7f1d CHEBI:16450 000000000000010080000001000000048808825608 CHEBI:16452 00000000001012000008000e876030458c45467a1d CHEBI:16453 00000000000031008202024f89f034048c0ed6ff09 CHEBI:16454 0000000200000200020a008694892cc00954b5941c CHEBI:16455 00000000000030008000024f84783e80050cc6ff09 CHEBI:16456 00000000000001100002a01000002c0408001cd708 CHEBI:16457 00000000000000010008000106000044004582d21c CHEBI:16458 00000000000000000000080e80603e00080044ae09 CHEBI:16459 00000000c080741a14018246a9ca280c8e0e4eff09 CHEBI:16461 000000000000200000400a48886010048c2c427f09 CHEBI:16462 000000100084000010002256a8d22c280e0e4ef509 CHEBI:16463 0000000000001000000002030010014004c4a0551e CHEBI:16464 00000000000000000200008690892cc00d5cb7d61c CHEBI:16466 000000010004200000300a46884260040424447f09 CHEBI:16467 000000000000010000000001040000048808825608 CHEBI:16468 00000000000000000200008690892dc00dc4a5d01e CHEBI:16469 000000000000000182020481969181500ccd93d11e CHEBI:16470 00000000000001100002a01000000404080018d708 CHEBI:16471 00000000000000000000000000002e000800948608 CHEBI:16472 00000002c080421a96098087b49b2ce88f4cbfd71c CHEBI:16473 000000102094322191dc1ccf37b3907bde6feb7f1f CHEBI:16474 000000002000210001d00cd68048a885842d46fe1b CHEBI:16475 0000001000840200100a0002249a2ca8071c1ed508 CHEBI:16477 00000000000030018000024f033090649c7ff3ff1d CHEBI:16478 000000110084300192b00ed72392c06c9e6dfbff1d CHEBI:16479 000000000000004040000020200040080000000009 CHEBI:16480 000000081004100000100a03093210000424407f09 CHEBI:16481 000000000000000000000000100100100000010016 CHEBI:16482 
80048000000030005064186fa970717f9c6fef7f1f CHEBI:16483 00000002000002008208028694992cc08d4ea7d71c CHEBI:16485 000000000000000000000080000020400940840218 CHEBI:16486 000000000000000000000a06081021f4894ac7771f CHEBI:16487 00000000000020000000084680402004842c467e09 CHEBI:16488 000000003000100001d0004913239151046501581f CHEBI:16489 000000002000213181d4bcdf1321bc57dd6feeff1f CHEBI:16490 000000100084200090000003a09a00288e0c8ad708 CHEBI:16493 0020000008000000040085868000a08c0905045718 CHEBI:16494 0000000200000200020a008014892cc00950b5861c CHEBI:16495 00000000000000000202008690892cc00d54b7d41c CHEBI:16496 000000112084102191f41ccbb3f3d0ebdfedea7f1f CHEBI:16497 00000000000030018000024383b210449c4de27f1d CHEBI:16498 0000001000843000900000032092006c8e4eaa571c CHEBI:16500 000000000000310080000207801000040c0c825708 CHEBI:16501 040000000000000000000000000000000000000000 CHEBI:16503 000000000000200180080681841000440c4da25718 CHEBI:16504 000000110084300192b00ed7a392c86c1e6ffbfd1d CHEBI:16505 00000000000030018000024f8778bee49d7ff7ff1d CHEBI:16506 00000010008400009000000120820068864caa501c CHEBI:16507 00000000000000000000000004000004840c025208 CHEBI:16508 00000000000000000000000004000040804080421c CHEBI:16509 000000000000010000000201801020040804065708 CHEBI:16510 0000001000803201900a024f27ba3cec9f5ffeff1d CHEBI:16511 0000002800000101800434901bd9adf701dbbff91f CHEBI:16512 00000000000030008000000100000004840c825208 CHEBI:16513 0000000000001021800800111601017004cda3d21e CHEBI:16514 000000102080302191d41ccfb3b3907fde6fea7f1f CHEBI:16515 00000010008010011000000122828048864d0a401c CHEBI:16517 0000000200000200020a008694892cc00954b5941c CHEBI:16521 000000002000120001d806841e118c11552740bd1f CHEBI:16522 000000000000300000000a43805000040404427f09 CHEBI:16523 00000002000002000208008014892ce48950b7d21c CHEBI:16524 00000000c080541a1601828eb9eb3cc88f5effff1d CHEBI:16525 000000000000000000000000000000048000004208 CHEBI:16526 00000000000000008002000000000004800492d208 CHEBI:16530 
0000100220100020007814c89761d1530465e97e1f CHEBI:16531 000040000000200000000242883210a48c0e427f1f CHEBI:16532 000000000000200001400cce906110110421402e1f CHEBI:16533 000000000000300080000207801000040c0c825708 CHEBI:16534 00000000000020000040024c8a6090418d63406b1d CHEBI:16535 00000000000010000000000100000140044480d01e CHEBI:16536 00000000000031008000000100000004840c825208 CHEBI:16537 00000000000000000100000802201141004500181f CHEBI:16540 000000100080200010000a47a8d220ac8e2e4e7f09 CHEBI:16542 000000000000200000000a4e88683004842446ff09 CHEBI:16543 00000010d090703b949594cf27b2906f9e6febff1d CHEBI:16544 000000000000000100000000021020c4094104531c CHEBI:16545 000000000000300180000217061001640ccdb3d71e CHEBI:16546 0000000000000000800000100400016000c4a2d21e CHEBI:16547 000404000000000002004080508500402040a1021c CHEBI:16548 000000101090302190941ccf27b2906f9e6dfbff1d CHEBI:16549 000000020000200101a808439741c155ccedc37e1f CHEBI:16550 000000000000300180000207021080640c4fa3551c CHEBI:16551 00000000000020000000a05680603c04840446ff09 CHEBI:16552 0000001010903021909414cf27b2906f9e6debff1d CHEBI:16553 000000000000000000002a06884020800522442501 CHEBI:16554 000000000000000000008100000000000000008208 CHEBI:16555 000000103090302190b41ecfa7b2d06f9e6febff1d CHEBI:16556 00000000c080403b940980013683017886cdab521e CHEBI:16557 00000000000000000000000580080404880c82d608 CHEBI:16558 0000000000003021800800011601817484cda3521e CHEBI:16559 000000000000000000000808806010000800402e1f CHEBI:16562 0000000000000000800030101b4921f301dbbff91f CHEBI:16563 00000000000030018000024b83b090649c6ff3ff1d CHEBI:16565 000000000000300000000a4780582e800d04c4fd09 CHEBI:16566 000000000000010000000800010000045080607a1f CHEBI:16567 0000101b2094302190fc1ccfbfbbf8efde6fffff1f CHEBI:16568 0000000000102000800800401701004144c5e17a1f CHEBI:16569 000080000000200058000a66a840200c0d24467f09 CHEBI:16570 0000000000002000800000060000034004c4a0d41e CHEBI:16572 000000000000000000018000000000000000000208 CHEBI:16573 
000000101090302190d41ecfaff2b0ef9f7fffff1d CHEBI:16574 00000000000020000040084800201004842452fa09 CHEBI:16576 00000000000000000200008690892cc40d5cb7d61c CHEBI:16577 000000103090302192b41ecfa7b2dc6b9f6feaff1d CHEBI:16578 00000000000000000000001000000160004080901e CHEBI:16579 00008000000000205800002020000018540441281f CHEBI:16580 00000002000002000208008294892cc00954b5d21c CHEBI:16581 00000000100420008010084901221004842c427a09 CHEBI:16582 00000000000000008000000000000000800090c208 CHEBI:16583 00000010028412001008000724922c288e0c0ed508 CHEBI:16584 000000110084200010300a47a8d260ac8e2e4e7f09 CHEBI:16585 000000000000000000000a06884020840d04447709 CHEBI:16586 0000000200002200820a008694892cc00d5cb7d61c CHEBI:16587 000000100084300092000207a09200a88e0e8a5708 CHEBI:16588 0000000000000200000a020404182c800114149508 CHEBI:16591 0000000800000001800034901b59a1f701cbbff91f CHEBI:16592 000000100084300090000207a09200a88e0e8a5708 CHEBI:16593 00000000000011000080084080400404042040fe09 CHEBI:16594 00000010008430009000000120820068864caa501c CHEBI:16595 000000000000312180080207161189740cdfb3d71e CHEBI:16596 000000100080000190000687209280480e4f8a5518 CHEBI:16597 000010000000020000084000440400102000010016 CHEBI:16598 00000000000030018000024f83b290649c4fe3ff1d CHEBI:16599 000010000000000000080000440400002000000000 CHEBI:16602 0010800000003001d004027f3231916f4ccdabff1f CHEBI:16603 000000000200000000000200041000000004001508 CHEBI:16605 00000000000010000000000801201154d4cce37a1f CHEBI:16606 000000002000300001d00c80000080015421402813 CHEBI:16607 0000000200000200020a008694892cc00954b5941c CHEBI:16608 000000000000300180000207021000440c4da2571c CHEBI:16609 000000000000100000802a06884020800522442501 CHEBI:16610 000000000000300000c0aa4e886030848d2e467f09 CHEBI:16613 000000001804200000508d8e9063b0c18d61446f19 CHEBI:16615 000080000000000040220030200040280010109009 CHEBI:16616 0000001000841000900000032092006c8e4eaa571c CHEBI:16618 0000000000000200000a020404182c800114149508 CHEBI:16619 
0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:16620 00000011c084541a14318226a9c270088e2e4e6d09 CHEBI:16621 000000000000300080000207801000848c0c825708 CHEBI:16622 000000102084302193d69fcfbbfbb4ffdf7ffeff1f CHEBI:16625 00000001000400000030081000024000042040a801 CHEBI:16628 00000010208410001190080323928029d6ad6afd1f CHEBI:16629 000000000000230080080003840000448c4ca2521c CHEBI:16630 000000000000302180080207161181740ccfa3571e CHEBI:16631 0000000000000100800000100000016400cca2d21e CHEBI:16632 040000000000300000000a42804000040404407f09 CHEBI:16633 000000000000300180000687029080e40c4fa3551c CHEBI:16634 000000000000000000020000000004000800108608 CHEBI:16638 00000000000001000000001000000164004082d21e CHEBI:16639 00000000000000000000000380182624090496d708 CHEBI:16641 00000000000020000000a85680402c04040444ff09 CHEBI:16643 0000000000003000018006c68050a0958dad665f1b CHEBI:16645 000000100084300090000007209200288e0c8ad708 CHEBI:16647 0000001010903021909416cf27b2906f9e6febff1d CHEBI:16650 00000000000001008000000000000004000482d208 CHEBI:16651 000000010004000000300a06884260000522442d01 CHEBI:16652 0000240002000200020a008694892cc00954b5941c CHEBI:16653 00000000000020008202000100800004840c92d208 CHEBI:16654 00000000000030018008024f073090649c4de3ff1d CHEBI:16655 000004000000000000041490034928c309418caa1d CHEBI:16656 000000000000302180080207961181740ccfa3571e CHEBI:16658 000000000000210080000201801000040c0c825708 CHEBI:16659 00000000000100000000004000000140044480501f CHEBI:16660 0000400100040084015085901123b635542154ff1f CHEBI:16664 000000000000000082000002048000448c4c82561c CHEBI:16666 00000000000100000000000000400140004480141f CHEBI:16667 000000008000541a14018a02a94020080e064c6509 CHEBI:16668 000000000000310080000207801000848c0c825708 CHEBI:16669 000000010004000000300a06884260000c20442f09 CHEBI:16671 000200020210020001b804b39509ea6f8afdbefe1b CHEBI:16673 00000000000000000100004003000005848d225a1f CHEBI:16675 000000002000202181d4bccf9361b0575d6fee7f1f CHEBI:16680 
00000000000010000000020a896838048c0446ff09 CHEBI:16682 0000000000003000810000490220114104c5a2581f CHEBI:16683 000004000000000000041691035928e709458fff1f CHEBI:16684 0000000000001000810000490720114584cda25a1f CHEBI:16685 0000000200002200820a008694892cc00d5cb7d61c CHEBI:16688 000000000000300082000203809000000c0e825708 CHEBI:16689 000000112084102191f41ccf33b3d06bdeedea7f1f CHEBI:16690 000000001004200000500486806ab8c58d6566fe19 CHEBI:16691 000000202000010011d23cfe80e8b88b84315cee1b CHEBI:16692 000000102080302191d41ecfb3b3907fde6feb7d1f CHEBI:16693 00000000000010008008020304100040044ca2571c CHEBI:16694 0000001010942021909414cf27b2906b9e6dea7f1d CHEBI:16695 00000000000000000000aa1288482c00000244a501 CHEBI:16696 0000000000000000000000869001204008c484101e CHEBI:16697 000000000001000000100800000000000020402001 CHEBI:16698 000000000000210080000203801000048c0c825708 CHEBI:16699 0000000000003001810006c7131181750c4de35d1f CHEBI:16700 000000100084000190000483209200680e4daa5518 CHEBI:16701 00000000c184701b940182672392104c9e4dea7d1d CHEBI:16702 000000000000300180000687809080e40d4fa25518 CHEBI:16703 0000000010103021809416cf073290439c6dea7f1d CHEBI:16704 800404000840000002429dc801a1904794657afa19 CHEBI:16705 0000000400000000000040000004014020c4a0101e CHEBI:16706 000000002000100001d00c80120180115421402817 CHEBI:16708 0000000000001000010002470318014105c7a2dd1f CHEBI:16709 000000100084200090000007a09200a88e0e8a5708 CHEBI:16710 000001000000000000000200001000000004005708 CHEBI:16711 00000000c080741a94018003a082000c8e0c8a5708 CHEBI:16712 000000012000100001d0084813239151546541781f CHEBI:16713 0000000000000001820434941ec1ade301c59ff91f CHEBI:16714 000000000000300180000207021000400d4fa2571c CHEBI:16715 000000000000000000000000000000000000000016 CHEBI:16716 000000000000020000080006840008048c0c02d608 CHEBI:16717 0000000000000000800430141a412df305ddbffd1f CHEBI:16718 000000000000000182020487009004440045b2d318 CHEBI:16719 0000001000843001900002072292006c0e4faa551c CHEBI:16720 
00000000000001008000a01000002c04880086d708 CHEBI:16723 000000000000010000000202801020040804065708 CHEBI:16724 0000000000003000800200010000014404ccb2d21e CHEBI:16725 000000020000332180080007561181642ccda3d73e CHEBI:16726 000000000000010080000001040000048808825608 CHEBI:16727 00000030008001009002325728d22cae8e1e9ef709 CHEBI:16728 0000001000840000100000012082002c860c0a5208 CHEBI:16729 000080000000a1406000006be074582ef62ccaff1f CHEBI:16730 00000000000000000000000004000000000000021e CHEBI:16731 000000103090302190b41ecfaff2f06b9e6fee7f1d CHEBI:16732 0000000000002000010004ce90691015842542fe1f CHEBI:16734 00000000c080701b9409804f27b2906c9e4debff1d CHEBI:16735 0000000100040000007434de8962d4431c6168ab19 CHEBI:16737 0000000000000000000040004004014020c4a0101e CHEBI:16738 00000000000001018000048000000004000182521a CHEBI:16739 000000000000300180000207021001640c4da3571e CHEBI:16741 00000000101430000098004907229045846d427a1d CHEBI:16742 00000002000002000208008014892cc08950b5c21c CHEBI:16744 0000000000003000800000010000014004cca2501e CHEBI:16746 00000000020022000008084a846010048424427e09 CHEBI:16747 0000001000803000900000032092006c8e4eaa571c CHEBI:16749 000000012004302181f41ecf1333d0435cedea7f1f CHEBI:16750 000000000000300180000687809080e40d4fa25518 CHEBI:16751 000000100084200010000a46a8c2202c8f2cce7f09 CHEBI:16752 00000000000030008282824f89f034808d2ed6ff09 CHEBI:16753 00000000000002010008048184000844884da2d618 CHEBI:16754 00000000000000000200008690892cc40d5cb7d61c CHEBI:16755 0000002000000100000230558888040688149af709 CHEBI:16758 00000000000000000400810000082e880900048608 CHEBI:16759 0000000000002000800000001001015004c4a1501e CHEBI:16760 000000102084302191d41ccf33b3907bde6dea7d1f CHEBI:16761 00000000000001008000000000000204880082d608 CHEBI:16763 000000103094022190bc1c8ba7f2d0eb9f6deaff1d CHEBI:16764 000000000000000001000e86984120110421442d17 CHEBI:16765 00000000000000010200048184800444884d82d618 CHEBI:16766 000000000000000000004800400400007080602817 CHEBI:16767 
00000000000030018040824f837810649c6fe3ff1d CHEBI:16768 00000000000001008000080a806030048808c67e09 CHEBI:16769 000000000000200000000a4688482804842c46ff09 CHEBI:16770 000000102084202191d41cef33b3907b5e6dea7d1f CHEBI:16771 000000100084300090000207a09200a80e0e8a5508 CHEBI:16772 00000000000000010008448144040044a04da25218 CHEBI:16773 0000000000000200000a020404182c800114149508 CHEBI:16774 000080000000810070200022a140400e0e0c0a5609 CHEBI:16775 0000000200100200028834961fc9aaa741a35dff1f CHEBI:16776 0000000000000000800430101a41adf305ffbffd1f CHEBI:16777 000000000000302180080207161181740ccda3571e CHEBI:16778 000000000000010000000a0e886030848d22467f09 CHEBI:16780 00000000000001008000000280002004880c865608 CHEBI:16782 00000010008410009000000120820068864caa501c CHEBI:16783 00000000000000000200008690892dc00dc4a7d01e CHEBI:16784 00000010008000001002000020820028021c9ad008 CHEBI:16785 0000000000000021000000151211017004cda3d31e CHEBI:16786 0000001000843001900000012282006c864daa521c CHEBI:16787 000000000000300080000207801000000c0c825708 CHEBI:16789 000000000210120001a8048e8468f8c58cfd76fe1b CHEBI:16790 0000040000000200026814801fc9c8e741e199fb1f CHEBI:16791 0000000008400020016085881061d1510865807b1f CHEBI:16792 000800000000010000000000000000000000000000 CHEBI:16793 0080240000000001000416d1135188e7084d8bff1f CHEBI:16794 000000000000300000000841000000040404c27a1f CHEBI:16795 00000000000000000100069e986939710461d4ff1f CHEBI:16796 00000000000001000184181803201003002148aa1f CHEBI:16797 000000000000300080000207801000000c0e825708 CHEBI:16802 00000000000001000000000801201004d08062fa1f CHEBI:16803 0000000000000000000002049a4120910181452d17 CHEBI:16804 0000001000842021915414cfa1f2906f9e6daa7f1b CHEBI:16805 000000003004310001d004c913239045c4e9427a1f CHEBI:16806 000000000000000000008b0e806830808d0044ef09 CHEBI:16807 000000000000310080000207801000048c0c825708 CHEBI:16808 00000000c080701a94018007a09200288e0c8a5708 CHEBI:16809 000000000000010080000000000020048808865608 CHEBI:16810 
00000000000020000000a85680402c04040444ff09 CHEBI:16811 00000000000020008000000000000000040490d008 CHEBI:16812 000000000000300080000207801000000c0e825508 CHEBI:16813 00000002c080421a16098084348b2ce88b4cbfd21c CHEBI:16814 00000000000001008008000004000004000482521e CHEBI:16815 000000000000302180080207161181740ccda3571e CHEBI:16816 000000000000000000000006800020c40d4486521c CHEBI:16817 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:16818 000000010004200000300a4e886270048424467f1f CHEBI:16820 0000000000002000000006c6894000458d45c27b19 CHEBI:16821 00000020c080411a0403b05028d22c2a8a1a1ef509 CHEBI:16822 000000100084300190800a47abd2806c1e6feb7d1d CHEBI:16823 00000000000030008000000100000004040c82d208 CHEBI:16824 00000000000000000020005000004000000490b81f CHEBI:16825 0000001000840000100000012082002c860c0a5208 CHEBI:16826 00000002000002008208028494992cc08d4cb7d71c CHEBI:16827 000000000000200001000cc6904100150425407e1f CHEBI:16828 0000000000000200034814971fc988b781a15bfb1f CHEBI:16829 00000000000000000000081000000000000040a001 CHEBI:16830 000000000000100002000005808004048c0c82d608 CHEBI:16831 0000000000001221000a0007161109d004cdb3d71e CHEBI:16832 0000000000000200000a000004082c800110148400 CHEBI:16833 00000000000020000000084e806830048c2c46fe09 CHEBI:16834 0000000000000200800a001204082de405dcb6d61e CHEBI:16835 0000800000008140600000202000414a52448a781f CHEBI:16836 0000001000842021915414cf31b3907b1eedbafd1f CHEBI:16837 000000103094302192b41ecfa7b2dc6b9f6feaff1d CHEBI:16840 000000000000100000802a02884020800522442501 CHEBI:16841 000000000200000000000000000000000000000208 CHEBI:16842 000000000000220080080006040002400444a0d41c CHEBI:16843 00000000000010008080084da1681068166ce2fa1d CHEBI:16844 000000000000010000200040024060858009067a1d CHEBI:16845 000000123080320090bc104fbff3f1efceefcfff1f CHEBI:16848 00000002000002010208048014892cc48951b7d21c CHEBI:16850 0000001000841000900000032092006c8e4eaa571c CHEBI:16851 00000000000022008008084284400044844ce27e1d CHEBI:16852 
00200000c080401a1401810020820008860c0a4008 CHEBI:16853 000000000000010080000001040000048808825608 CHEBI:16854 000000000000200000000a46884020040524447f09 CHEBI:16855 000000000000300000408a4e886030848c2e467f09 CHEBI:16856 000000000000300000000841000000040404c2fa09 CHEBI:16857 00000010008430009282824fa9f234a88f2edeff09 CHEBI:16858 00000000000000008000001300000824880c82d608 CHEBI:16859 0000000000003021800800111601017004cda3d21e CHEBI:16860 004000100084300190800ed7b3d3886c1e6febfd1d CHEBI:16861 000000100084300190000487209a00688e4daad518 CHEBI:16862 000000100084310090000007a092002c8e0c8a5708 CHEBI:16863 000000102080302191d41ecf33b3907fde6fea7d1f CHEBI:16864 000000000000000000000a06884020040c04447709 CHEBI:16865 0000000000002000800000011001015084cca3521e CHEBI:16866 00000000000020000000a85680402c04040444ff09 CHEBI:16867 00802000000000018202305313d98c669d5dbbfb1d CHEBI:16869 000000300080210090023257a8d22caa0e1e9ef509 CHEBI:16870 0000000000000001010004c51311814500c5a2db1f CHEBI:16871 00000000000001000000000981601004d888627e1f CHEBI:16872 000000000000300180000a4786582ee40d4fe6fd1d CHEBI:16874 0000000000003100810004c01101015504cde25a1f CHEBI:16875 00000000e080403a1551800aa3e29179deedeb7e1f CHEBI:16877 00000010008420001000004fa0ea302c8e0c4efe09 CHEBI:16878 00000000000000008000000100180224801a92d708 CHEBI:16879 000000000000300080000207801000800c0e825708 CHEBI:16880 000000000000000001000480100100110001400817 CHEBI:16881 004000000000300180000687029280e40c4fa2551c CHEBI:16885 00000000001020000008084004000004840c427a09 CHEBI:16886 0000010008000020000087808010a010080504151a CHEBI:16887 0000000000003000000006c7894020450c45e67b19 CHEBI:16889 000000000000000080000000000000048004025208 CHEBI:16891 000000002000100001900a020310800154a560bd1f CHEBI:16892 000000100084200010000847a0da2ea80f0ccefd09 CHEBI:16893 00000000c080441a94018003a082000c8e0c8a5708 CHEBI:16894 00000000000030008000a017800004000c0c82d708 CHEBI:16895 000000100080210080000003a09200288e0c8a5708 CHEBI:16897 
000000000000000001900000020080010021000017 CHEBI:16898 000000000000300080000207801000000c0e825508 CHEBI:16899 000200020210020001b804b39509ea6f8afdbefe1b CHEBI:16900 00000000000031018000024b83b890649c7ff3ff1d CHEBI:16901 00001000000000008008020304120024080e825708 CHEBI:16903 0000000000000000000000801001005000c481101e CHEBI:16904 000000100084300090000007a09200288e0e8a5708 CHEBI:16905 0010800000008140618400203201415b52610b681f CHEBI:16907 000000102090322191dc1ccf37b3907bde6feb7f1f CHEBI:16908 00000000000100000000004000000000000480101f CHEBI:16910 000000000000200000002250880004000c04c0b51f CHEBI:16913 0000000000001000000000010000014404c4a2521e CHEBI:16914 0000000000003100800000010000014404cca2521e CHEBI:16918 000000010004000000303852880244060c2448ff09 CHEBI:16919 000000000000010000000040816010048808427e09 CHEBI:16923 0000000000002000000004ce806030458c45467a19 CHEBI:16924 000000100084300090000007a09200ac8e0c8a5708 CHEBI:16925 00000000000020000000084680402c04042444fe09 CHEBI:16926 000000000000200000000a46884020848d2c467f09 CHEBI:16927 000000100080200090000a47a8d220a80e0ece7509 CHEBI:16929 000000000000200080000006800020c00d44a4501c CHEBI:16931 0000001010903021909416cf27b2906f9e7ffbff1d CHEBI:16932 0000000200000200020a008294892cc00954b5901c CHEBI:16933 000000000001200000800842804000040424407e09 CHEBI:16934 00000000000030018000048100000044044da2d018 CHEBI:16935 0000000000003200800800081721114144c5e3fa1f CHEBI:16937 0000001000843001900000072292006c8e4daa571c CHEBI:16938 000000000000100000000800814001440c44c27e1f CHEBI:16939 0000000008002000800085801401005004c5a1501e CHEBI:16941 00000010008410019000024fa3fa3eec9f4dfeff1d CHEBI:16942 000000000000300080000207001000048c0c825708 CHEBI:16943 000000000000010000000841000000048000c2fa09 CHEBI:16944 000000000000100000000005900101c00ccc83521e CHEBI:16945 00000000000020000000084681400004dca4e27e1f CHEBI:16946 00000000000001008200000180800004880c825608 CHEBI:16947 00000000000030008000024f813010048c0cc2ff09 CHEBI:16948 
000000000000200080000003800008048c0c82d608 CHEBI:16950 000000103094122190bc1e8ba7f2d0eb9f6fea7f1d CHEBI:16952 00000000000001000000004881681004880842fe09 CHEBI:16953 000000000000302180080207161181500ccda3571e CHEBI:16954 000000112084302191f41ccf33b3d06fdeedebff1f CHEBI:16955 0000000000000000828434d49fc9abe749ff9fff1f CHEBI:16957 000000000000100000000a02894020040c04447709 CHEBI:16958 000040010004049c055185903123b63ddc215eff1f CHEBI:16959 000000102080302191d41ccf33b3907fde6fea7d1f CHEBI:16960 00000002000012008208028794992cc08d4cb7d71c CHEBI:16962 00000000101412000098020b073290418465627f1d CHEBI:16964 0000000000003021800000011201015004cda3521e CHEBI:16965 000000000000010080000000040000048808825608 CHEBI:16967 00000000000000000000020e9860b0c18d63446b19 CHEBI:16968 000000101090302190d416cfa7f2b06f9e7fffff1d CHEBI:16970 0000000000003000800000010000014004c4a2501e CHEBI:16971 00000002000002008208028414992cc08944b7d71c CHEBI:16973 000000000000000001500486814080050c25405e1b CHEBI:16974 000000100084000090000003a092002c8e0e8a5708 CHEBI:16975 004000000000300180800ed793d388641c6fe3fd1d CHEBI:16976 00000000000020000000084000000004040440fa09 CHEBI:16977 000000000000200000408b4e8c683e848d2ec6ff09 CHEBI:16978 0000000000000200000a000004082c000010148608 CHEBI:16980 000000000000300180000a4f837090649c6fe3ff1d CHEBI:16981 00000000000020008000000000000000040490d008 CHEBI:16982 00000000000000000154148801209003002108aa1b CHEBI:16984 00000000000000000200008010892cc08940b5c21c CHEBI:16985 00000000202001000194120a8b38b9d32ce73dfd3f CHEBI:16986 000000000000010000200040024060850001047a1d CHEBI:16987 000000000000020100080485840000448c4da25618 CHEBI:16989 00000000021012000188048e8468b8d58dfd76fe1b CHEBI:16990 00000000000020008000000100000004840c825208 CHEBI:16992 000000000000000182000483848000448c4d825618 CHEBI:16993 00000000000030018000024f86783ee40d4fe6ff1d CHEBI:16994 00000000000020008000000000000004840c025208 CHEBI:16995 000000100084000011d00cc6a0d28029062d4a7d1b CHEBI:16996 
000000000000200080000200801004000c0480d508 CHEBI:16997 000000000000200000000842804000040404407e1f CHEBI:16998 000000300080010090023057acda2eae8f1e9ef709 CHEBI:16999 00000000000001000000020100100004000402571e CHEBI:17000 000010032014300080f80a4f9f73d0c11c6fc37f1d CHEBI:17001 0000000200000200020a0080148b2ce40950b5d61c CHEBI:17002 00000000000030018000024f877abee49d6ff7ff1d CHEBI:17006 000000000000200080000210801001600cccb2d51e CHEBI:17007 000000112084302191f41ccf33b3d06fdeedebff1f CHEBI:17009 000000002000202181d4bccf9361b1571d6fee7f1f CHEBI:17010 000000000000300080000207001000008c0e825708 CHEBI:17011 00000000000030018000024b83b010449c4de2ff1d CHEBI:17012 0000001010940221909c148fa7f290eb9f6deaff1d CHEBI:17013 000000003000300080f4124f9b33d0c3cceffbff1f CHEBI:17015 00000000000024180401a856a0402c0c8c044eff09 CHEBI:17016 000000000001300180000247021080640c4fa3551f CHEBI:17019 000000000000200080000006800020048c0c865608 CHEBI:17023 00000000100430000090004f836290458c6d427a1d CHEBI:17025 00000002000002000208008014892cc08950b5c21c CHEBI:17026 000000000000200000000846804020048c04467e09 CHEBI:17027 000000000000300080000207801000048d0c825708 CHEBI:17028 00000000000030018000024f033090649c7ff3ff1d CHEBI:17029 0000240000000200020a008694892cc00954b5941c CHEBI:17030 0000000800000101810414911311817700cb9bf91f CHEBI:17031 000000000000300080000207801000048d0c825708 CHEBI:17032 000000000210120001a8048e8428f8458cfd76fe1b CHEBI:17033 00000000000000000000000000082e800900048608 CHEBI:17034 0000000200000001800800011601814080c9a3521e CHEBI:17036 0000000000000200014834971e49a8b781a37ffb1f CHEBI:17037 0000000202000200020a008694892cc00954b5941c CHEBI:17038 00000000000030018000000102000044844da2521c CHEBI:17039 000000000000000080000007801000248d0c825708 CHEBI:17040 0000000000000000800430101a412df305c5bffd1f CHEBI:17041 0000000000000000000040004004014024c4a0501e CHEBI:17042 0000000000000020000000000000015000c4a1101e CHEBI:17043 000000100084300090000207a09200280e0e8a5508 CHEBI:17044 
000000000000084060000020200050080220000009 CHEBI:17045 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:17047 0000000000000000800002149a4121f105cdf7fd1f CHEBI:17048 000000100080300190000a472392006c9e4feb7f1d CHEBI:17049 000000100084200090000003a092002c8e0c8a5708 CHEBI:17050 000000000002010000000000000000002000000000 CHEBI:17051 000000100084220090080003a482006c8e4caa521c CHEBI:17052 000000000000300000000842814000048c0c427e09 CHEBI:17053 0000001002802000900800002482002c860c8a5208 CHEBI:17056 000000000000300180000207021080640c4fa3551c CHEBI:17057 0000000200000200820a008694892cc08d5cb7d61c CHEBI:17058 00000020000001000002305088800402081498b509 CHEBI:17059 00000000000020000000084e806030048c24467e09 CHEBI:17061 000000080000000000000a00080000000000402501 CHEBI:17062 000000100084200090000207a09200288f0e8a5708 CHEBI:17063 0000800000008140600008202000400a522048681f CHEBI:17064 000000103090002190b41ccf37b3d04b9e6deb7b1d CHEBI:17065 000000000000100000000001800004040c0482d608 CHEBI:17066 000000102084302193d69fdfbbf3b5fbdfffffff1f CHEBI:17068 00000000000000000000020400100140044480551e CHEBI:17069 000000000000000080000200001000000804005708 CHEBI:17071 000000010004300000300846814240048c2c427e09 CHEBI:17072 000010032014102001f81ece9f63f65304617daf1f CHEBI:17073 0000800800003901e00002772b70586e0e6faafd1d CHEBI:17074 000004000000000100000007800020c40945065218 CHEBI:17075 000000010004300182b00ed70392c0641c6df3ff1d CHEBI:17076 000000000000300082000207809000840c0e825708 CHEBI:17077 000000000000300000408a4e896030848c2e467f09 CHEBI:17078 0000000000002000800430141a412df305cdbffd1f CHEBI:17079 00000000000003000008000004000004800812d208 CHEBI:17081 000000000000010000000846804020048808467e09 CHEBI:17082 000000012000100001f00a4a1b73d14154e7417d1f CHEBI:17083 000000000000200080000006800020048c0c865608 CHEBI:17084 000000000000000000000000000200000000000208 CHEBI:17087 0000201000841200120a0003249a2ca8871c3ed518 CHEBI:17090 000000000000200000000842c04000042404407f09 CHEBI:17092 
000000000000000004008100000000080000008208 CHEBI:17093 00000000000000008202020500900404840c92d708 CHEBI:17094 0000000002002201014806979e41a0f505ed67ff1f CHEBI:17096 000000000000002000000000000000100000010016 CHEBI:17097 0000000000000000800000100000016000c0b2d21e CHEBI:17098 0000000000000221800a04830400095404ddb3d61e CHEBI:17099 000000000000300180000687001000440c4da25718 CHEBI:17100 0000000800000101810414911311817700db9bf91f CHEBI:17101 00000000000030018000024f8778bee49d7ff7ff1d CHEBI:17103 00000008000000018004349a99719dd709cfabfb1f CHEBI:17104 000000000000010000000001040000048808825608 CHEBI:17105 000000100084000090000003a09200280e0e8a5508 CHEBI:17106 0000000000002000810000401701014504cda35a1f CHEBI:17109 00000001000430008030084101024040146ce2781d CHEBI:17110 0000001000843001900004872092006c8e4daa5518 CHEBI:17111 000000000000300080000207801000000d0e825508 CHEBI:17113 000010132094300190fc1ccfb7b3f0efde6fffff1f CHEBI:17114 000000000000300000000a43805000040404427f09 CHEBI:17115 000000000000210080000003800000048c0c825608 CHEBI:17117 000000000000300080000207801000000c0c825708 CHEBI:17118 00000000000001000000000000082e04090004d608 CHEBI:17120 000000000000010080000001000000048008825208 CHEBI:17121 00000000000030008000024f813010008c0cc2ff09 CHEBI:17122 000000000210200000080840040000040404407a09 CHEBI:17123 000000000000010000000a0e896838848d2046ff09 CHEBI:17125 0000002000000100000230518888040608149af709 CHEBI:17126 00008000c0c0709bd421836723d240688e4dab5d1f CHEBI:17127 000000000000010000000000000020048908065608 CHEBI:17128 0000000000000000800030161b4121f305cfbff91f CHEBI:17129 000000000000000000000006800020848904065608 CHEBI:17130 00000004000000008000400200040144accca2561e CHEBI:17131 00000000000000008000000500180224801e92d708 CHEBI:17132 0000000200000221020a04819699acc4895db7d71e CHEBI:17133 000000000000200000000a40880001400c44c0751f CHEBI:17134 0000000000000000800800011401014084cca3521e CHEBI:17136 00000000c080411a140180002002000802040a4008 CHEBI:17137 
000000100084000090000003a09200288e0c8a5708 CHEBI:17138 00000000c080701b940180072292806c8e4fab571c CHEBI:17139 000000000000300080000207801000800c0e825708 CHEBI:17140 000000000000200000008a00884020000402442501 CHEBI:17141 000000000000220080080000840000048c0c825608 CHEBI:17142 000000020010320080080008172110410445e3fa1d CHEBI:17143 000000002000300001d00cc68040a085842d467e1b CHEBI:17144 000010123094302190dc14cf37b3906b9e6dea7f1d CHEBI:17145 0000000000001200820a385f15a9195294fcfbfa1f CHEBI:17146 000000000010282080ec48684724d14b74e5eb7a1f CHEBI:17147 000000000000000000000a06884020000522442501 CHEBI:17148 000000100084000090000002a092002c8e0c0a5708 CHEBI:17150 000000000000300080000207801000800c0e825508 CHEBI:17151 000000000000000000000000000002000800008608 CHEBI:17153 000000000000000001800808032010010021402a1f CHEBI:17154 000000000000302180080007961181748dcda3571e CHEBI:17155 000000010004300080b0084101024040146ce2781d CHEBI:17156 000000102084302193d69fcfbff3b4fbdf6ffeff1f CHEBI:17157 00000000000000008000000000000000800080c208 CHEBI:17158 00000020000011000002385d89e814020c34d8ff09 CHEBI:17159 00000000000000000200008690892dc00dc4a5d01e CHEBI:17160 000000002000302181d63edf937191535c6dfbff1f CHEBI:17161 00000000000001008000000000082e84890086d608 CHEBI:17162 000000000000200080008311801800000c0482d708 CHEBI:17163 000000000000300180000687029080e40c4fa3551c CHEBI:17164 0000000008400020016085c21041c1550465827b1f CHEBI:17165 000000000000000000002a1688482c00052244a501 CHEBI:17166 00000000000000008000000000000000000480d208 CHEBI:17167 0000000200000200020a008694892cc00954b5941c CHEBI:17168 00000000000000000000000000000000000000021e CHEBI:17169 00000000000000000002201000000000000050a001 CHEBI:17170 000000012004102181f41ecf9373d0c35dedea7f1f CHEBI:17172 000000000000300080000207801000800c0e825708 CHEBI:17173 00000000c184601b940182752392906c9e4dfbff1d CHEBI:17174 000000110084000010300a56a8da60280e2e4efd09 CHEBI:17175 00000000000000000000020004100000000400151e CHEBI:17177 
00000000c080451a0601828eb9eb3cc88f5effff1d CHEBI:17179 00000000000001008000020100100004880c825708 CHEBI:17180 0000010008000020000085800018a834080104d71a CHEBI:17181 00000000000030018000024f033890641c4ff3ff1d CHEBI:17182 00000008000001018104149013118117008b8bf91f CHEBI:17183 000404000000000002004080508500402044a1101c CHEBI:17184 0000000000003000800000010000014004c4a2d01e CHEBI:17185 000000100084200190000487209200680e4daa5518 CHEBI:17188 0000000000001000000000010000014404cca2521e CHEBI:17189 00000000000020000000084600000604040450fe09 CHEBI:17191 0000000000210000000040000004014020c4a0101f CHEBI:17192 000000080020000000000000400000002000000500 CHEBI:17194 00000000080020000000a2468a40a0850507447b1d CHEBI:17195 00000000000030000080084a816010048c24427e09 CHEBI:17196 000000100084000090000003209200288e0e8a5708 CHEBI:17197 0000001000803201900a024f27ba3cec9f5ffeff1d CHEBI:17198 0000000000002000800000000000014484cca2521e CHEBI:17199 0000001010903021909414cf27b2906f9e6deb7f1d CHEBI:17200 000000000000200000400a4c886010048d26427f09 CHEBI:17201 000000102084202191f414cf33b3d06bdeedea7f1f CHEBI:17202 0000000000002000000006c6884020050505447b19 CHEBI:17203 00000000000000008000000280002004880c865608 CHEBI:17204 000000000000010080000000040020048808865608 CHEBI:17205 00802402000002010208008014892cc08951b7d21c CHEBI:17206 000000000200020000080006840020048c0c065608 CHEBI:17207 0000000800000001800434901b118597408b8bf91f CHEBI:17209 00000000000001008000000000000144044c82521e CHEBI:17210 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:17211 0000000008002000800085801001015004c5a1501e CHEBI:17212 000000000000200000000a4688482004852c46ff09 CHEBI:17213 00000000000001008002000100000004800892d208 CHEBI:17214 000000000000000000000850804801640444c2fe1f CHEBI:17215 000000012004002001f00e8e9b63f61154a175af1f CHEBI:17216 00000000000000000000000004000004000400d208 CHEBI:17217 00000000020022008208000684882cc00d44b4d01c CHEBI:17219 0000000002000200000a0000040828000000148400 CHEBI:17221 
0000001010903021909416cf27b2906f9e6febff1d CHEBI:17222 0000000000000201000800011601014400c583d21e CHEBI:17224 0000000200000200820c30141ec12de301ddbffb1f CHEBI:17225 000000000000200080000006040001748ccca3561e CHEBI:17226 000000000000300180000207021080640c4fa3551c CHEBI:17227 00000001c084541a14318a06a9c260088e264e6d09 CHEBI:17228 00000100080000208000858080108834080582d71a CHEBI:17229 000000000000300000008a46804020040404447f09 CHEBI:17230 00000000000000000000000000000140004480901e CHEBI:17231 000000000000200000000846804020048c04467e09 CHEBI:17232 0000000000001200010a04cf956198d18475f2fe1f CHEBI:17233 00000000000020008008000004000004840c825208 CHEBI:17236 0000002000000100000230100c800406001018f709 CHEBI:17237 000000103094302190b41ccf27b2d06b9e6dea7f1d CHEBI:17239 000000000200030000080000040000048808025608 CHEBI:17240 0000000000000800118004a0000080090021400013 CHEBI:17241 000000000000200180000681801000448c4da25718 CHEBI:17242 00000000000000000000000000182ea4818c36d71e CHEBI:17243 0000000000002000810000081721114104c5a3581f CHEBI:17244 000000000000010000000000000000000000000008 CHEBI:17245 000000000000000000000000800000000804001508 CHEBI:17246 00000001100010000192000103229141546550f81f CHEBI:17247 000010032014300080f80a4c9f6bd0419465c3fb1d CHEBI:17248 000000010004200000300a46884260848c2c467f09 CHEBI:17249 000000000000300082000007808000048c0c825608 CHEBI:17250 0000000000000200020a008014892c800110148010 CHEBI:17251 00000002000002008208008694892cc08954b7d21c CHEBI:17252 000000000000000080000000040000408040a0421c CHEBI:17253 0000000000002000800000000000014004c4a0d01e CHEBI:17254 000000002000102181d41e8f937190d35d65ea7d1f CHEBI:17256 00200000000020000400884e8060308c8d2e467f09 CHEBI:17257 000000002000000001d00480120180110021400817 CHEBI:17258 000000103090302190b41ccfa7b2d86f9e6debff1d CHEBI:17259 0000001000840000100000052082262c8e0c0ed608 CHEBI:17260 000000000000300180c00a4f837010449c6de2ff1d CHEBI:17261 0000001010903221909c148fa7f290ef9f6debff1d CHEBI:17262 
00000000000000000200008410892dc009c4a5d21e CHEBI:17263 000000100084000090000003a09200ac8e0e8a5708 CHEBI:17264 000000110084300192b00ec3a192486c1e6deafd1d CHEBI:17265 000000000000300080000207801000000c0e825708 CHEBI:17266 00000000000030008000000100000040044ca2501c CHEBI:17268 0000000000000200020a000004882ca4011034d61c CHEBI:17269 000000100084100090000207209200280e0e8a5508 CHEBI:17270 0000001000800100800000012082000c8a088a5708 CHEBI:17271 00000000000001000000000000000204080000d608 CHEBI:17272 000000000000300080000247813210008c0cc27f09 CHEBI:17274 0000000000000200000a000004000004840c12d208 CHEBI:17275 000000000000100000000005000021500ccca7561e CHEBI:17276 000000101090302190d41ecfaff2b06f9f7fffff1d CHEBI:17277 00000000000000000200028690992cc00d5cb7d51c CHEBI:17278 00000000000020000000084e806030848c2c467e09 CHEBI:17279 000000000000100180000687801000c40d4da25718 CHEBI:17281 000000000000010081000486900100150c05c25e1f CHEBI:17282 0000001000843000900000032092006c8e4eaa571c CHEBI:17283 000000100084300190000ccf21b210681e4dea7d19 CHEBI:17284 00000000c080741a14018842a1c2000c8e0c4a7f09 CHEBI:17285 0000000002003201800802179e51a1e50dcfe7ff1f CHEBI:17286 000000110084100010303072a982540e8e2c4aff09 CHEBI:17287 000000000000000100000cc6815020441841647b19 CHEBI:17289 000000000000100000000001800004040c0482d608 CHEBI:17290 00000000000030008000000100000004040c82d208 CHEBI:17291 000000020010000100a808411601c14150e1e17a1f CHEBI:17293 0000000000000300016814971e49e0b705a57ffb1f CHEBI:17294 000000000000200000000842804000040404407e1f CHEBI:17295 000000000000000000000800000000005000402817 CHEBI:17296 0000001000803201900a0007a69a2cec0f5fbed51c CHEBI:17298 000000000200030000080840844000048008427e09 CHEBI:17299 000010000000000000080000440400002000000000 CHEBI:17300 00000000000000000000000000082e800900048608 CHEBI:17302 0000000000000001820434941ec1adc305c58ff91f CHEBI:17303 0000000000000200000a000004082c400050b4821c CHEBI:17304 000000000000300080000007800000048c0c825608 CHEBI:17305 
000000000000300180000207021080640c4fa3551c CHEBI:17306 0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:17307 00000000000000008000000600002e04880486d608 CHEBI:17308 0000000000002001800000012200006c064da2521c CHEBI:17309 0000000000000000010002450310014104c5a2df1f CHEBI:17310 00000020000021000002385688c82c0605345cff09 CHEBI:17311 000000000000100000802a0e88683880052244af09 CHEBI:17312 0000000000002001800000011601014404cda3521e CHEBI:17313 00000000000000000000000680082e840d0486d608 CHEBI:17314 00000000c184701a94018067a19210288e0cca7f09 CHEBI:17316 000000000000300080000207801000000c0e825708 CHEBI:17317 000000000000010000004009c1641004f880627e1f CHEBI:17318 000000002000202181d41cc9132190535465eaf81f CHEBI:17319 000010132094300190f80ccfb7b3f0edde6fffff1f CHEBI:17321 00000040008000001000000060020008a2040a4008 CHEBI:17322 00000000000010000000000104000150044481501e CHEBI:17323 00000000000001000200000180800404880c82d608 CHEBI:17325 000000100084300190000487209a00688e4daad518 CHEBI:17326 0000000000000200000a020404182c800114149508 CHEBI:17327 000000000000200001000ece986130958d25467f1f CHEBI:17328 000000101090302190d41ecfaffab06f9f7fffff1d CHEBI:17329 0000000000000000040081018000000c8804025608 CHEBI:17330 00000000000001000000000801201014d080637a1f CHEBI:17331 000000000000300180000687029080e40c4fa3551c CHEBI:17332 00000000020002008008000004000064844ca2521c CHEBI:17333 8004040008400000024295c801a39047946d7afa19 CHEBI:17334 000000000000300180000207061001640c4da3571e CHEBI:17335 0000000000000200020a020004982c80001434951c CHEBI:17336 000000000000000100004481c4840044a84d825618 CHEBI:17337 000000103090302190b41ccfa7b2d86f9e6debff1d CHEBI:17338 000000000000200000000a46886230048424467f09 CHEBI:17339 000000000000000000000a06880021e48d4cc7771f CHEBI:17340 000004000000200000400a4e886030c11561642b19 CHEBI:17342 0000000000003021800000011201015004cda3501e CHEBI:17343 00000000000012000008000104000004840c025208 CHEBI:17344 000000112084202191f41ccf33b3d06bdeedea7f1f CHEBI:17345 
000000000001000000004000404400002000000417 CHEBI:17346 00000002000002000208008694892cc00944b5d21c CHEBI:17347 000010132094300190fc1cdfb7b3f0efde7fffff1f CHEBI:17349 00000000000000000000000684082e84090404d608 CHEBI:17351 000000000000200080000848002010000404c07a1f CHEBI:17352 00000000000020008000004e806830048c0cc6fe09 CHEBI:17355 00000000000000000000008c10211010508041aa1f CHEBI:17356 00000000000030008000000100000004040c82d208 CHEBI:17357 000000110084000010300a26a8c2700807264e6d09 CHEBI:17358 00000000c080411a040180002002000802000a4008 CHEBI:17359 000000000000310080000207801000040c0c825708 CHEBI:17360 000000103094202190b41ccf27b2d06b9e6dea7f1d CHEBI:17361 000000000000000001000000120100110001010817 CHEBI:17362 000000100084300090000207a09200a88e0e8a5708 CHEBI:17363 000000000000300000000842814000048c0c427e09 CHEBI:17364 00000000000000000008000044040004a40c025208 CHEBI:17365 000000000000200080080002840020048c0c865608 CHEBI:17367 000000002000200001f004c81321d04144e1402a1f CHEBI:17368 0000000000002201820a305387988ee69d5dbbff1d CHEBI:17371 00000400000002000248148e17e990e744e5fbfa1f CHEBI:17372 0010800000002000d000086ea868380e8d2c4eff09 CHEBI:17374 000000000000200080000000000000040404025208 CHEBI:17375 0020000000002000040088428040000c852e427f09 CHEBI:17376 0000000000000001000004811601815400cd83521e CHEBI:17377 000000000000200080000201801000000c04825708 CHEBI:17378 000000000000302180080207161181740ccfa3571e CHEBI:17379 00000000000020000000084681400144dcece27e1f CHEBI:17380 000000000000200001400e86884020858dad667f1b CHEBI:17381 000000000000300181000247031881650dcfa3dd1f CHEBI:17382 000000112084302191f41ecf33b3d06fdeefeb7f1f CHEBI:17383 00000400000000000240148e13e992e745e5fbfe1f CHEBI:17384 000000000000200080000002800001440c4c82561e CHEBI:17385 0000000000000100002004c0004060050001047a19 CHEBI:17388 000000000000100080000203001200240c0e825708 CHEBI:17389 0000000800000001800434901959adf709dbbff31f CHEBI:17390 000000000000300181000687121180750c4de35f1f CHEBI:17391 
000000000000200000000a4e886832848d2446ff09 CHEBI:17394 0000000000002000800000000400014404cca2521e CHEBI:17395 000000000001300180000247021000640c4da3551f CHEBI:17396 0000000000000000000004881021104140c160aa1f CHEBI:17397 00000010008030019000025f23ba106c9e5ffaff1d CHEBI:17398 000000000000300080000207801000800c0e825708 CHEBI:17399 00000400000032018268168f17f9d0e74cedbbff1f CHEBI:17400 00000010008400009000000120820068864caa501c CHEBI:17401 0000001000840000100200012082002c860c1ad208 CHEBI:17402 000000000000200000000842c04000042404407f09 CHEBI:17403 0000000000000000000000801001204008c084021e CHEBI:17404 0000000000001000010002450310014504cda2df1f CHEBI:17405 000000000000000081500482814080058c25c25e1b CHEBI:17406 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:17407 000000000000000000000206801020840d04065708 CHEBI:17409 000000003000300081d000491323915104ed23581f CHEBI:17410 00000000000030008000024f813010008c0cc2ff09 CHEBI:17411 000000000000100000000203001000048c0c025708 CHEBI:17413 00000000000000008000000600002004880c865608 CHEBI:17415 0000000000003000800000010000014484cca2521e CHEBI:17416 000000000000200080000a0e8c60314005e6e67f1f CHEBI:17417 00000000000000000000000680002e04080404d608 CHEBI:17418 00000000000010000008000104000004840c825208 CHEBI:17419 000010032014300000f80a4e9f63f0c5dc6f477f1f CHEBI:17420 00000000c184701b9401827f23b2906c9e6ffbff1d CHEBI:17421 000000102080302191d41ccf33b3907bde6feb7d1f CHEBI:17422 00000000000022008008000004000004840c825208 CHEBI:17424 0000000000000100002004c0084060050101047b19 CHEBI:17425 000000000000300080000207001000048c0c825708 CHEBI:17426 0000000000000000800430141a412df305ddbffd1f CHEBI:17428 000000000000300180000687829080e40c4fa2571c CHEBI:17429 000000010004000000300a06884260000522442d01 CHEBI:17431 00000000000000000000890e806030848d0c467f09 CHEBI:17432 000000012004300081b00ece1f23d04154e5e27f1f CHEBI:17433 00000000000001100002a010000000000010188000 CHEBI:17434 0000000000002000800000001001015004c4a1501e CHEBI:17435 
000000100084100012000007a0922c2c8f0c8ed708 CHEBI:17436 00000000000000000002a010000000000000108000 CHEBI:17437 000000002000320181d806879e118cf55d6fe3fd1f CHEBI:17438 000100120095120192fa1eefbffbfccbcfffffff1f CHEBI:17439 0000801000848140700000202082416ad64c8a781f CHEBI:17440 000000400000010000000000600000082200004008 CHEBI:17441 00000000000000008000080701000004dc8ce27e1f CHEBI:17442 000000001004200000100a46886230048524467f09 CHEBI:17443 000000000000000000000004800001440c4482561e CHEBI:17445 00000000c184701b940180672392906c9e4deb7f1d CHEBI:17446 0000000000000200000a020404182c800114149508 CHEBI:17447 000000000000200100000ecf887031f48d6fc67f1f CHEBI:17448 00000000000000000000000004000144044482521e CHEBI:17450 00000000000001000000000100000004800802d208 CHEBI:17453 0000001010903021909414cf27b2906f9e6debff1d CHEBI:17454 00000000000030008000024b8170114484cce27f1f CHEBI:17455 000000000000300080008206800000000d06805508 CHEBI:17456 0000000000000200020a000004882c80001034821c CHEBI:17457 00000000000000000000221088402d400444c4b51f CHEBI:17458 000000000000300080000207801000000c0e825508 CHEBI:17459 002000000800000004008d8e8060b0880901442f19 CHEBI:17460 000000000000300180000685801000440c4da25718 CHEBI:17464 0000000000000001020204879291ad500ccdb7d51e CHEBI:17465 000000112084312181f41ccf33b3d06fdeedeb7f1f CHEBI:17466 0000800000000900600230302000501a522019a81f CHEBI:17467 00000000002001848000c10000040004a80082571e CHEBI:17468 000000102084302191d41cef33b3907fdeedeb7f1f CHEBI:17469 0000000000003000800000010400016484cca2521e CHEBI:17470 0010802000000320400c30302ec02ccb09430da11f CHEBI:17472 00000000c080401a16018080308b2de88bccafd21e CHEBI:17474 00000000000030000000084100000004840c427a09 CHEBI:17475 000000100080000090000a47acd220ac8e0ece7709 CHEBI:17476 00000000101412000098000907229045846d627a1d CHEBI:17477 000000000000000000000000000000000000000208 CHEBI:17478 00000000000001008008000104000004800c825208 CHEBI:17479 00000000000020000000a84680402004852e467f09 CHEBI:17482 
00000000000030018000024b83b090e49c6ff3ff1d CHEBI:17483 000000000040000000218000080040000000002517 CHEBI:17484 0020000000002000040088468040208c852e467f09 CHEBI:17485 000004000000000000041691035928e709458fff1f CHEBI:17486 0000000000000001800000111601016400cda3d21e CHEBI:17488 000000102080002191d41ccf33b3905b5e6deb791f CHEBI:17489 00000020000001000002305108882ea689189ef709 CHEBI:17490 000000103090302190b41ccf27b2d06f9e6debff1d CHEBI:17494 0000000200003201820a028716992ce40d5db7d51c CHEBI:17495 000000000000200080000200801000040c04025708 CHEBI:17497 0000000200000200020a008694892cc00d54b5d41c CHEBI:17500 0000000000000000800000100400017004ccb3d01e CHEBI:17501 000080000000000050220020200040080004108009 CHEBI:17502 000000000000010000000042806230048808467e09 CHEBI:17503 000000300080010090023057a8d22cae8e1e9ef709 CHEBI:17504 000000000000300180000207021000640c4da3551c CHEBI:17505 0000001000843001900002072292006c0e4faa551c CHEBI:17506 00000000c080701b94018847a6da2eec8f4feefd1d CHEBI:17507 0000000000002000810000401301015504cda35a1f CHEBI:17508 000000002000202181d4bcdf132194535c65eafd1f CHEBI:17509 0000001010903221909c148fa7f290ef9f6dfbff1d CHEBI:17510 000000100080000090000847a09a00ac8e0ecaff09 CHEBI:17511 000000000000300180000217021080640c4db3d51c CHEBI:17512 000000000001010000000000000000000000000001 CHEBI:17514 000000000000300000008a4e806030848c2c467f09 CHEBI:17515 000000000000020000080006840020048c0c065608 CHEBI:17516 000000100080200090000203a09200ac8e0e8a5708 CHEBI:17517 00000000000020000000005000000004040450fa09 CHEBI:17519 000080000040200050208122a04040080404000c1f CHEBI:17520 000000000000300082000007808000c40c4ca2521c CHEBI:17521 000000000000300080000203801000000c06825508 CHEBI:17522 000000002000100001d0049012018011542140a817 CHEBI:17524 00000002000002000208000694892cc48d4cb7d61c CHEBI:17525 0000001000843000900000032092006c8e4eaa571c CHEBI:17526 000000000000300180000003021001748ccda3571e CHEBI:17527 0000001000841200100a0003249a2ca8871c1ed508 CHEBI:17528 
00000000000010210202000792912dd00ccdb7d71e CHEBI:17529 00000000000020000200084284c00044844cc27e1d CHEBI:17530 000000000000300180000207061001640ccda3571e CHEBI:17531 0000000000000020018434980a68bd4301652cb91f CHEBI:17532 00000000000020000000004e806830048c0c46fe09 CHEBI:17533 000000000000000080000a06884020048d04c67709 CHEBI:17534 000000000000300180000007821000440c4da2511c CHEBI:17535 000000003004110000900840012210048428427a09 CHEBI:17536 000000010004200000700a4e88627144846cc67f1f CHEBI:17537 00000000000001000000000004082e84090004d608 CHEBI:17539 000000000000300180000217021000640c4da2d51c CHEBI:17540 000000100084300092000207a09200ac8e0e8a5708 CHEBI:17541 0000001000843001900000012282006c864daa521c CHEBI:17543 004000000000010000000000000200040004025208 CHEBI:17544 0000800000000000512004a6304140190425400c1f CHEBI:17545 0000000002000200000a0000040008000010108608 CHEBI:17546 00000000000020018000001102008064844db3d21c CHEBI:17548 000000000000000000000a46880020048d04c67709 CHEBI:17549 00000010d090703b949596cf27b2906f9e6febff1d CHEBI:17550 0000000200000200820a008694892ee48d5cb7d61c CHEBI:17551 000000112084302191f41ccf33b3d06bdeedea7f1f CHEBI:17552 000000100084000010000a46a8d220280e0e4e7509 CHEBI:17553 000000100084000090000003209a002c8e0e8ad708 CHEBI:17555 0000000000000201000a04831611895004cdb3d51e CHEBI:17556 0000000000003021800800011601817404cda3d21e CHEBI:17558 0000000002003201814806979e51a0f50defe7ff1f CHEBI:17559 00000000000001000000004000000004800852fa09 CHEBI:17560 000000000000300000008a42804000040404407f09 CHEBI:17561 000000003010302180b41ecf0732d0431c6dea7f1d CHEBI:17562 00000000000001000000000000000004808822521e CHEBI:17563 000000000001000001000486104100110021400c17 CHEBI:17566 000000000000000080000203201000280806825708 CHEBI:17567 000000001014100000900009072290418461406a1d CHEBI:17568 00000000000030018000024f033090641c4fe3ff1d CHEBI:17571 000000000000010080000a02884020048800c67709 CHEBI:17572 0000000200000300020a000694892cc4895cb7d21c CHEBI:17573 
0000000000001221000800111601017004cda3d21e CHEBI:17574 00000000000030018000acc7804020440d4fe67f19 CHEBI:17575 00000000000030008000084101000004840cc27a09 CHEBI:17576 00000000000000000002890e806830808d0054ef09 CHEBI:17577 000000000000000000000000000000000000008016 CHEBI:17578 0000000000000200020a000004882c800010358014 CHEBI:17579 0000000002000200020a000284882c000814949408 CHEBI:17580 00000000000030018000024f033090649c7ff3ff1d CHEBI:17581 00000000020012008008000104000064844ca2521c CHEBI:17582 chemfp-1.1p1/tests/decoder.sdf0000644000077000000240000002373712055226641016541 0ustar dalkestaff000000000000009425004 -OEChem-01150805002D 40 41 0 0 0 0 0 0 0999 V2000 2.0000 -3.0580 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -3.0580 0.0000 F 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 0.9420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 2.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.3007 3.9365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 0.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.0580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1097 2.5353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 3.2784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2788 4.1444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.3176 1.5571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 1.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.6856 5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -1.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 3.0246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 2.3343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3954 3.2136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7112 1.4282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 
8.4465 0.9507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9241 1.6860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.1192 5.3102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9377 5.6244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2520 4.8058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9272 1.2520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0010 -0.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1951 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -1.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 -4.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -4.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -5.6780 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 1 0 0 0 0 2 20 1 0 0 0 0 3 14 2 0 0 0 0 4 18 2 0 0 0 0 5 6 1 0 0 0 0 5 9 1 0 0 0 0 5 10 1 0 0 0 0 6 12 2 0 0 0 0 7 8 1 0 0 0 0 7 14 1 0 0 0 0 7 34 1 0 0 0 0 8 18 1 0 0 0 0 8 35 1 0 0 0 0 9 11 2 0 0 0 0 9 13 1 0 0 0 0 10 14 1 0 0 0 0 10 25 1 0 0 0 0 10 26 1 0 0 0 0 11 12 1 0 0 0 0 11 27 1 0 0 0 0 12 15 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 13 30 1 0 0 0 0 15 31 1 0 0 0 0 15 32 1 0 0 0 0 15 33 1 0 0 0 0 16 17 1 0 0 0 0 16 19 1 0 0 0 0 16 20 2 0 0 0 0 17 21 2 0 0 0 0 17 36 1 0 0 0 0 18 21 1 0 0 0 0 19 22 2 0 0 0 0 20 23 1 0 0 0 0 21 37 1 0 0 0 0 22 24 1 0 0 0 0 22 38 1 0 0 0 0 23 24 2 0 0 0 0 23 39 1 0 0 0 0 24 40 1 0 0 0 0 M END > 9425004 > Blah > AAADceB7sQAEAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHwIYAAAADArBniwygJJqAACqAyVyVACSBAAhhwIa+CC4ZtgIYCLB0/CUpAhgmADIyYcAgAAOAAAAAAABAAAAAAAAAAIAAAAAAAAAAA== > I5Z2MLZgOKRcR...1 > HqhdNG+YPqhdNG+kPqhdNG2.2 > 3 > 1P!_3 > 1P!_P > 40 bits, non-symmetric, with a few bits set to check for decoding problems 0000 0000 1010 0000 0000 0000 0000 0011 0000 0000 0 0 5 0 0 0 0 c 0 0 LSB 0 0 a 0 0 0 0 3 0 0 MSB > 0000000010100000000000000000001100000000 > A very short fingerprint to test support for fewer than 8 bits > 001 > Exactly 8 bits > 01101110 > 0001110011101101 > 0001 1100 1110 1101 1 LSB 8 3 7 b 1 0 0011 1001 1101 1011 MSB 0 3 9 d b > 00011100111011011 > Ab > Ab3C > 0123456789abcdef > MQ== > R3JlZXRpbmdzLCBodW1hbg== $$$$ 9425009 -OEChem-01150805002D 47 49 0 0 0 
0 0 0 0999 V2000 5.2187 -2.0269 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.6701 3.2453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.4554 0.4001 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 4.8633 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 5.0712 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.2688 2.2272 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.8566 1.4182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.9097 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.5277 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 5.7294 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 3.9498 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6309 -1.2179 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 6.4725 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.0377 -0.3044 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6756 3.1408 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 5.8339 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2187 -2.0269 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4499 0.5047 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -3.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -4.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.6571 3.5038 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5738 4.2965 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2002 -1.6639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1169 -0.8712 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4684 0.1416 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5517 -0.6511 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.5347 7.0790 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 6.4505 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 5.8987 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 5.2173 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6522 2.1624 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7532 6.0288 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1332 7.1027 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9802 6.8758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4732 1.4830 0.0000 H 0 0 0 0 0 0 0 0 
0 0 0 0 6.1217 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1217 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -7.1858 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 9 1 0 0 0 0 1 18 1 0 0 0 0 2 16 2 0 0 0 0 3 19 2 0 0 0 0 4 5 1 0 0 0 0 4 10 1 0 0 0 0 4 11 1 0 0 0 0 5 14 2 0 0 0 0 6 7 1 0 0 0 0 6 16 1 0 0 0 0 6 38 1 0 0 0 0 7 19 1 0 0 0 0 7 42 1 0 0 0 0 8 18 2 0 0 0 0 8 21 1 0 0 0 0 9 21 2 0 0 0 0 10 13 2 0 0 0 0 10 17 1 0 0 0 0 11 16 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 12 15 1 0 0 0 0 12 18 1 0 0 0 0 12 30 1 0 0 0 0 12 31 1 0 0 0 0 13 14 1 0 0 0 0 13 34 1 0 0 0 0 14 20 1 0 0 0 0 15 19 1 0 0 0 0 15 32 1 0 0 0 0 15 33 1 0 0 0 0 17 35 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 20 41 1 0 0 0 0 21 22 1 0 0 0 0 22 23 2 0 0 0 0 22 24 1 0 0 0 0 23 25 1 0 0 0 0 23 43 1 0 0 0 0 24 26 2 0 0 0 0 24 44 1 0 0 0 0 25 27 2 0 0 0 0 25 45 1 0 0 0 0 26 27 1 0 0 0 0 26 46 1 0 0 0 0 27 47 1 0 0 0 0 M END > 9425009 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAAB8AAAHgAcAAAADAjBnwQzkJZ6EACrAydydgCShAkhgqI7+CG4ZJiIaLLA2fGUpAhknQLIyAc3gAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > MqVZPKNkR4JnR...1 > J4JnR4ZiNm.U612g616g61A.2 > 3 > 0000 0000 0101 1111 1110 1101 1100 1010 0000 0000 0 0 a f 7 b 3 5 0 0 LSB 0 0 5 f e d c a 0 0 MSB > 0000000001011111111011011100101000000000 > 110 > 00111101 > 0010010101110011 > 0001 0010 1011 1001 1 LSB 8 4 d 9 1 0 0010 0101 0111 0011 MSB 0 2 5 7 3 > 00010010101110011 > 01 > ABCDEF0123456789 > AA== > YmxhaGJsYWhzcGFtYmxhaA== $$$$ chemfp-1.1p1/tests/fpverify.py0000644000077000000240000001564211660452124016634 0ustar dalkestaff00000000000000# Verify import re import itertools import string from chemfp.decoders import from_cactvs from chemfp import types # Parse terms which look like "34","82-88", and "0-4,9,23-880" _range_pat = re.compile("(\d+)-(\d+)") _bit_pat = re.compile("\d+") def parse_bitset_definition(s): if s.endswith("=="): num_bits, 
decoded = from_cactvs(s) return set(bitno for (bitno, bit) in iter_bits(decoded) if bit) if s.startswith("0x"): decoded = s[2:].decode("hex") return set(bitno for (bitno, bit) in iter_bits(decoded) if bit) bits = set() if s == "X": return bits start = 0 while 1: m = _range_pat.match(s, start) if m: # Ranges are inclusive, so "3-5" means {3, 4, 5} start = int(m.group(1)) end = int(m.group(2)) assert start <= end, (start, end) for bit in range(start, end+1): bits.add(bit) start = m.end() else: m = _bit_pat.match(s, start) if m: # a single bit bits.add(int(m.group())) start = m.end() else: raise ValueError("Unknown term %r position %d" % (s, start+1)) if start == len(s): break if s[start] != ",": raise ValueError("Must have a ',' in %r at position %d" % (s, start+1)) start += 1 return bits def parse_true(true_term, num_bits): if true_term.endswith("?"): assert "!" not in true_term true_bits = parse_bitset_definition(true_term[:-1]) ignore_bits = set(range(num_bits)) - true_bits return true_bits, ignore_bits if "!" 
not in true_term: return parse_bitset_definition(true_term), set() left, right = true_term.split("!") return parse_bitset_definition(left), parse_bitset_definition(right) def parse_false(false_term, num_bits, true_bits): if false_term == "*": return set(range(num_bits)) - true_bits return parse_bitset_definition(false_term) _bit_offset_table = {} for i in range(256): _bit_offset_table[chr(i)] = tuple(offset for offset in range(8) if i & (1<= fingerprinter.num_bits: break if bitno in ignore_bits: continue if bitno in true_bits: if not val: errors.append( (bitno, val, True) ) else: tested_true.add(bitno) elif bitno in false_bits: if val: errors.append( (bitno, val, False) ) else: tested_false.add(bitno) if errors: print "ERROR: Fingerprint failure:", smiles print "Fingerprint", fingerprinter.get_type() print " bit# got expected" print " ---- --- --------" for (bitno, got, expected) in errors: description = fingerprinter.describe(bitno) print " %3d %d %d %s" % (bitno, got, expected, description) print "bit pattern", ",".join(str(bitno) for (bitno, val) in iter_bits(fp) if val) print "hex pattern", fp.encode("hex") raise SystemExit() all = set(range(fingerprinter.num_bits)) print "Missing true bits:", " ".join(str(b) for b in sorted(all - tested_true)) print "Missing false bits:", " ".join(str(b) for b in sorted(all - tested_false)) def main(): import sys if len(sys.argv) > 1: toolkit = sys.argv[1] main2(toolkit) else: for toolkit in ("OpenEye", "RDKit", "OpenBabel", "Indigo"): main2(toolkit) def main2(toolkit): fingerprinter = None num_bits = None toolkit_version = None test_cases = [] skip = False def toolkit_selected(options): print toolkit, toolkit_version, options return toolkit in options or toolkit_version in options for line in open("rdmaccs.fpverify", "U"): line = line.strip() if not line: continue if line.startswith("#type="): assert fingerprinter is None fingerprinter = get_fingerprinter(line[6:], toolkit) num_bits = fingerprinter.num_bits toolkit_version = 
fingerprinter.software.split()[0] continue if line.startswith("#"): continue fields = line.split() if fields[0] == "=skip": if toolkit_selected(fields[1:]): print "skip", fields[1:] skip = True continue if fields[0] == "=only": if not toolkit_selected(fields[1:]): skip = True continue if fields[0].startswith("="): if fields[0][1:2] in string.ascii_uppercase: # Some sort of toolkit-specific directive. Ignore. continue raise AssertionError(line) assert fingerprinter is not None # haven't defined the fingerprint type if skip: skip = False continue if len(fields) == 1: raise AssertionError(line) if len(fields) == 2: smiles, true_term = fields false_term = "*" else: smiles, true_term, false_term = fields # must have 2 or 3 terms true_bits, ignore_bits = parse_true(true_term, num_bits) false_bits = parse_false(false_term, num_bits, true_bits) true_bits = true_bits - ignore_bits false_bits = false_bits - ignore_bits assert not (true_bits & false_bits), (true_bits & false_bits) test_cases.append( (smiles, true_bits, false_bits, ignore_bits) ) evaluate_test_cases(fingerprinter, test_cases) if __name__ == "__main__": main() chemfp-1.1p1/tests/maccs.smi0000644000077000000240000000014711660452124016222 0ustar dalkestaff00000000000000[Ge] 3->bit_2 [U] 4->bit_3 [Sc] 5->bit_4 [Dy] 6->bit_5 [Be] 10->bit_9 C1CCC1 11->bit_10 C#C 17->bit_16 chemfp-1.1p1/tests/missing_title.sdf0000644000077000000240000001571711660452124020003 0ustar dalkestaff00000000000000 OpenBabel12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 
3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > tryptophan.pdb > The next tag has 4 spaces in it > > Leading tab > ThreeTabs $$$$ Good OpenBabel12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 
1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > tryptophan.pdb > > > tab separated > The next record has a tab in it $$$$ OpenBabel12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 
2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > tryptophan.pdb > This is not Blank > This does not have a tab > two tabs and another line $$$$ chemfp-1.1p1/tests/pubchem.sdf0000644000077000000240000036202111660452124016545 0ustar dalkestaff000000000000009425004 -OEChem-01150805002D 40 41 0 0 0 0 0 0 0999 V2000 2.0000 -3.0580 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -3.0580 0.0000 F 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 0.9420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 2.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.3007 3.9365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 0.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.0580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1097 2.5353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 3.2784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2788 4.1444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.3176 1.5571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 1.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.6856 5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -1.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 3.0246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 2.3343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3954 3.2136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7112 1.4282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.4465 0.9507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9241 1.6860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.1192 5.3102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9377 5.6244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 
0 9.2520 4.8058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9272 1.2520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0010 -0.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1951 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -1.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 -4.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -4.8680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -5.6780 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 1 0 0 0 0 2 20 1 0 0 0 0 3 14 2 0 0 0 0 4 18 2 0 0 0 0 5 6 1 0 0 0 0 5 9 1 0 0 0 0 5 10 1 0 0 0 0 6 12 2 0 0 0 0 7 8 1 0 0 0 0 7 14 1 0 0 0 0 7 34 1 0 0 0 0 8 18 1 0 0 0 0 8 35 1 0 0 0 0 9 11 2 0 0 0 0 9 13 1 0 0 0 0 10 14 1 0 0 0 0 10 25 1 0 0 0 0 10 26 1 0 0 0 0 11 12 1 0 0 0 0 11 27 1 0 0 0 0 12 15 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 13 30 1 0 0 0 0 15 31 1 0 0 0 0 15 32 1 0 0 0 0 15 33 1 0 0 0 0 16 17 1 0 0 0 0 16 19 1 0 0 0 0 16 20 2 0 0 0 0 17 21 2 0 0 0 0 17 36 1 0 0 0 0 18 21 1 0 0 0 0 19 22 2 0 0 0 0 20 23 1 0 0 0 0 21 37 1 0 0 0 0 22 24 1 0 0 0 0 22 38 1 0 0 0 0 23 24 2 0 0 0 0 23 39 1 0 0 0 0 24 40 1 0 0 0 0 M END > 9425004 > 1 > 491 > 5 > 2 > 4 > AAADceB7sQAEAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHwIYAAAADArBniwygJJqAACqAyVyVACSBAAhhwIa+CC4ZtgIYCLB0/CUpAhgmADIyYcAgAAOAAAAAAABAAAAAAAAAAIAAAAAAAAAAA== > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluorophenyl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluorophenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]acrylohydrazide > InChI=1/C16H16ClFN4O2/c1-10-8-11(2)22(21-10)9-16(24)20-19-15(23)7-6-12-13(17)4-3-5-14(12)18/h3-8H,9H2,1-2H3,(H,19,23)(H,20,24)/b7-6+/f/h19-20H > 2.8 > 350.094582 > C16H16ClFN4O2 > 350.775243 > CC1=CC(=NN1CC(=O)NNC(=O)C=CC2=C(C=CC=C2Cl)F)C > CC1=CC(=NN1CC(=O)NNC(=O)\C=C\C2=C(C=CC=C2Cl)F)C > 76 > 350.094582 
> 0 > 24 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 11 12 8 16 19 8 16 20 8 19 22 8 20 23 8 22 24 8 23 24 8 5 6 8 5 9 8 6 12 8 9 11 8 $$$$ 9425009 -OEChem-01150805002D 47 49 0 0 0 0 0 0 0999 V2000 5.2187 -2.0269 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.6701 3.2453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.4554 0.4001 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 4.8633 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 5.0712 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.2688 2.2272 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.8566 1.4182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.9097 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.5277 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 5.7294 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 3.9498 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6309 -1.2179 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 6.4725 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.0377 -0.3044 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6756 3.1408 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 5.8339 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2187 -2.0269 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4499 0.5047 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -3.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -4.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.6571 3.5038 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5738 4.2965 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2002 -1.6639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1169 -0.8712 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4684 0.1416 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5517 -0.6511 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.5347 7.0790 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 6.4505 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 5.8987 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 5.2173 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6522 2.1624 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7532 6.0288 
0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1332 7.1027 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9802 6.8758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4732 1.4830 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1217 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1217 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -7.1858 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 9 1 0 0 0 0 1 18 1 0 0 0 0 2 16 2 0 0 0 0 3 19 2 0 0 0 0 4 5 1 0 0 0 0 4 10 1 0 0 0 0 4 11 1 0 0 0 0 5 14 2 0 0 0 0 6 7 1 0 0 0 0 6 16 1 0 0 0 0 6 38 1 0 0 0 0 7 19 1 0 0 0 0 7 42 1 0 0 0 0 8 18 2 0 0 0 0 8 21 1 0 0 0 0 9 21 2 0 0 0 0 10 13 2 0 0 0 0 10 17 1 0 0 0 0 11 16 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 12 15 1 0 0 0 0 12 18 1 0 0 0 0 12 30 1 0 0 0 0 12 31 1 0 0 0 0 13 14 1 0 0 0 0 13 34 1 0 0 0 0 14 20 1 0 0 0 0 15 19 1 0 0 0 0 15 32 1 0 0 0 0 15 33 1 0 0 0 0 17 35 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 20 41 1 0 0 0 0 21 22 1 0 0 0 0 22 23 2 0 0 0 0 22 24 1 0 0 0 0 23 25 1 0 0 0 0 23 43 1 0 0 0 0 24 26 2 0 0 0 0 24 44 1 0 0 0 0 25 27 2 0 0 0 0 25 45 1 0 0 0 0 26 27 1 0 0 0 0 26 46 1 0 0 0 0 27 47 1 0 0 0 0 M END > 9425009 > 1 > 513 > 7 > 2 > 6 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAAB8AAAHgAcAAAADAjBnwQzkJZ6EACrAydydgCShAkhgqI7+CG4ZJiIaLLA2fGUpAhknQLIyAc3gAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propionohydrazide > InChI=1/C18H20N6O3/c1-12-10-13(2)24(22-12)11-16(26)21-20-15(25)8-9-17-19-18(23-27-17)14-6-4-3-5-7-14/h3-7,10H,8-9,11H2,1-2H3,(H,20,25)(H,21,26)/f/h20-21H 
> 1.9 > 368.159689 > C18H20N6O3 > 368.3898 > CC1=CC(=NN1CC(=O)NNC(=O)CCC2=NC(=NO2)C3=CC=CC=C3)C > CC1=CC(=NN1CC(=O)NNC(=O)CCC2=NC(=NO2)C3=CC=CC=C3)C > 115 > 368.159689 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 12 > 1 18 8 1 9 8 10 13 8 13 14 8 22 23 8 22 24 8 23 25 8 24 26 8 25 27 8 26 27 8 4 10 8 4 5 8 5 14 8 8 18 8 8 21 8 9 21 8 $$$$ 9425012 -OEChem-01150805002D 41 42 0 0 0 0 0 0 0999 V2000 5.0032 -4.0774 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1013 -0.0386 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.6372 -1.0386 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1013 1.9614 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2058 2.9560 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3692 -0.0386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3692 -1.0386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.0032 -4.0774 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0148 1.5547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2353 1.4614 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5032 -2.5386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6942 -3.1264 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.7431 -2.8173 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.6840 2.2978 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.1840 3.1639 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.3122 -3.1264 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2353 0.4614 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2228 0.5765 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5032 -1.5386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.4865 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.5907 4.0774 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2633 -2.8173 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6247 1.3538 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0232 2.0440 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2546 -2.4356 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0342 -2.2699 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3006 2.2330 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8292 0.7055 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3517 -0.0299 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6163 0.4476 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8323 0.2714 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5851 -3.0257 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5393 -3.9013 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4149 -3.9472 0.0000 H 0 0 0 0 0 0 0 0 0 
0 0 0 5.9062 -1.3486 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0243 4.3296 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8429 4.6438 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1571 3.8252 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.4549 -3.4070 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8529 -2.6257 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0717 -2.2277 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 8 1 0 0 0 0 1 16 1 0 0 0 0 2 17 2 0 0 0 0 3 19 2 0 0 0 0 4 5 1 0 0 0 0 4 9 1 0 0 0 0 4 10 1 0 0 0 0 5 15 2 0 0 0 0 6 7 1 0 0 0 0 6 17 1 0 0 0 0 6 31 1 0 0 0 0 7 19 1 0 0 0 0 7 35 1 0 0 0 0 8 12 2 0 0 0 0 9 14 2 0 0 0 0 9 18 1 0 0 0 0 10 17 1 0 0 0 0 10 23 1 0 0 0 0 10 24 1 0 0 0 0 11 12 1 0 0 0 0 11 16 2 0 0 0 0 11 19 1 0 0 0 0 12 13 1 0 0 0 0 13 20 1 0 0 0 0 13 25 1 0 0 0 0 13 26 1 0 0 0 0 14 15 1 0 0 0 0 14 27 1 0 0 0 0 15 21 1 0 0 0 0 16 22 1 0 0 0 0 18 28 1 0 0 0 0 18 29 1 0 0 0 0 18 30 1 0 0 0 0 20 32 1 0 0 0 0 20 33 1 0 0 0 0 20 34 1 0 0 0 0 21 36 1 0 0 0 0 21 37 1 0 0 0 0 21 38 1 0 0 0 0 22 39 1 0 0 0 0 22 40 1 0 0 0 0 22 41 1 0 0 0 0 M END > 9425012 > 1 > 419 > 6 > 2 > 4 > AAADceBzsAAAAAAAAAAAAAAAAAAAAWLAAAAAAAAAAAAAAAAB4AAAHgAcAAAADAzBngQyhJJ6AACrA6VyVgCQBAAlogIyeCG8bFoAZh5I0fKUlchmuBjISUOYAAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-isoxazole-4-carbohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-ethyl-5-methyl-4-isoxazolecarbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-1,2-oxazole-4-carbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-ethyl-5-methyl-1,2-oxazole-4-carbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-isoxazole-4-carbohydrazide > InChI=1/C14H19N5O3/c1-5-11-13(10(4)22-18-11)14(21)16-15-12(20)7-19-9(3)6-8(2)17-19/h6H,5,7H2,1-4H3,(H,15,20)(H,16,21)/f/h15-16H > 1 > 305.14879 > C14H19N5O3 > 305.33236 > CCC1=NOC(=C1C(=O)NNC(=O)CN2C(=CC(=N2)C)C)C > CCC1=NOC(=C1C(=O)NNC(=O)CN2C(=CC(=N2)C)C)C > 102 > 305.14879 > 0 > 22 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 1 16 8 1 8 8 11 12 8 11 16 8 14 15 8 4 5 8 4 9 8 5 15 
8 8 12 8 9 14 8 $$$$ 9425015 -OEChem-01150805002D 54 56 0 0 0 0 0 0 0999 V2000 8.0902 0.6739 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -3.8261 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -0.3261 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7601 -4.8261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5691 -5.4139 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -2.3261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -1.8261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9511 -5.4139 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 0.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7601 -3.8261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 -0.3261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 2.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 1.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.2601 -6.3649 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2601 -6.3649 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 4.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 2.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 2.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -3.3261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -5.1048 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 3.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 3.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 5.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -0.8261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8479 -7.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 5.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 5.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 6.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 6.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 7.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.7476 0.5663 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1461 1.2565 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1495 -3.9337 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.5480 -3.2435 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9687 -0.2184 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5702 -0.9087 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8956 -6.8665 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 2.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 2.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.1916 -4.5152 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4103 -4.9132 
0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8084 -5.6945 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 3.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 3.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0892 -2.0161 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.3463 -7.5384 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3494 -6.8095 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2123 -7.6755 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0291 -2.1361 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 5.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 5.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 6.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 6.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 7.7939 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 13 2 0 0 0 0 2 19 2 0 0 0 0 3 24 2 0 0 0 0 4 5 1 0 0 0 0 4 8 1 0 0 0 0 4 10 1 0 0 0 0 5 15 2 0 0 0 0 6 7 1 0 0 0 0 6 19 1 0 0 0 0 6 45 1 0 0 0 0 7 24 1 0 0 0 0 7 49 1 0 0 0 0 8 14 2 0 0 0 0 8 20 1 0 0 0 0 9 11 1 0 0 0 0 9 13 1 0 0 0 0 9 31 1 0 0 0 0 9 32 1 0 0 0 0 10 19 1 0 0 0 0 10 33 1 0 0 0 0 10 34 1 0 0 0 0 11 24 1 0 0 0 0 11 35 1 0 0 0 0 11 36 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 12 18 1 0 0 0 0 14 15 1 0 0 0 0 14 37 1 0 0 0 0 15 25 1 0 0 0 0 16 21 2 0 0 0 0 16 22 1 0 0 0 0 16 23 1 0 0 0 0 17 21 1 0 0 0 0 17 38 1 0 0 0 0 18 22 2 0 0 0 0 18 39 1 0 0 0 0 20 40 1 0 0 0 0 20 41 1 0 0 0 0 20 42 1 0 0 0 0 21 43 1 0 0 0 0 22 44 1 0 0 0 0 23 26 2 0 0 0 0 23 27 1 0 0 0 0 25 46 1 0 0 0 0 25 47 1 0 0 0 0 25 48 1 0 0 0 0 26 28 1 0 0 0 0 26 50 1 0 0 0 0 27 29 2 0 0 0 0 27 51 1 0 0 0 0 28 30 2 0 0 0 0 28 52 1 0 0 0 0 29 30 1 0 0 0 0 29 53 1 0 0 0 0 30 54 1 0 0 0 0 M END > 9425015 > 1 > 597 > 5 > 2 > 7 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwYAAAAAAAAAAB0AAAHgAYAAAADAzBngQygJJqAACqA6VyVACSBAAlggIa+CG4ZNgIYDLA1fCUpQhgmADIyYcdiMCOwAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > 
N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-keto-4-(4-phenylphenyl)butyrohydrazide > InChI=1/C23H24N4O3/c1-16-14-17(2)27(26-16)15-23(30)25-24-22(29)13-12-21(28)20-10-8-19(9-11-20)18-6-4-3-5-7-18/h3-11,14H,12-13,15H2,1-2H3,(H,24,29)(H,25,30)/f/h24-25H > 3.3 > 404.184841 > C23H24N4O3 > 404.46166 > CC1=CC(=NN1CC(=O)NNC(=O)CCC(=O)C2=CC=C(C=C2)C3=CC=CC=C3)C > CC1=CC(=NN1CC(=O)NNC(=O)CCC(=O)C2=CC=C(C=C2)C3=CC=CC=C3)C > 93.1 > 404.184841 > 0 > 30 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 12 17 8 12 18 8 14 15 8 16 21 8 16 22 8 17 21 8 18 22 8 23 26 8 23 27 8 26 28 8 27 29 8 28 30 8 29 30 8 4 5 8 4 8 8 5 15 8 8 14 8 $$$$ 9425018 -OEChem-01150805002D 50 51 0 0 0 0 0 0 0999 V2000 5.4641 -2.3080 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 9.7942 0.1920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.3080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 9.7942 2.1920 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 9.8988 3.1865 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.1920 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -0.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.7078 1.7853 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 1.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 2.5284 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.8769 3.3944 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.9157 0.8071 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.2836 4.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -1.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -4.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 
-3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.3176 1.5843 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.7162 2.2746 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.9935 2.4636 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.3092 0.6782 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.0446 0.2007 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5221 0.9360 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.8500 4.0558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5358 4.8744 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.7172 4.5602 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 0.5020 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -1.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9966 -3.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.1995 -3.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -2.4980 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7932 -4.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -4.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -4.9280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3100 -1.7711 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -1.9980 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6900 -2.8449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2460 -3.8080 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.4280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4860 -3.8080 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 18 1 0 0 0 0 1 20 1 0 0 0 0 2 15 2 0 0 0 0 3 19 2 0 0 0 0 4 23 2 0 0 0 0 5 6 1 0 0 0 0 5 10 1 0 0 0 0 5 11 1 0 0 0 0 6 13 2 0 0 0 0 7 8 1 0 0 0 0 7 15 1 0 0 0 0 7 37 1 0 0 0 0 8 19 1 0 0 0 0 8 38 1 0 0 0 0 9 23 1 0 0 0 0 9 26 1 0 0 0 0 9 27 1 0 0 0 0 10 12 2 0 0 0 0 10 14 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 12 13 1 0 0 0 0 12 30 1 0 0 0 0 13 17 1 0 0 0 0 14 31 1 0 0 0 0 14 32 1 0 0 0 0 14 33 1 0 0 0 0 16 18 1 0 0 0 0 16 19 1 0 0 0 0 16 21 2 0 0 0 0 17 34 1 0 0 0 0 17 35 1 0 0 0 0 17 36 1 0 0 0 0 18 22 2 0 0 0 0 20 23 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 21 24 1 0 0 0 0 21 41 1 0 0 0 0 22 25 1 0 0 0 0 22 42 1 0 0 0 0 24 25 2 0 0 0 0 24 43 1 0 0 0 0 25 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 26 47 1 0 0 0 0 27 48 1 0 0 0 0 27 49 1 0 0 0 0 27 50 1 0 0 0 0 M END > 9425018 > 1 > 545 > 5 > 2 > 6 > 
AAADceB7sABAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgQYAAAADAjF3gSygZNqAAiqAyVyVACSBAAlihIa+Dm4ZNgIYDLg1fGUpQhgmgDoyYcYiACOAAAAAAAEAAAAAAAAAAgAAAAAAAAAAA== > 2-[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]sulfanyl-N,N-dimethyl-acetamide > 2-[[2-[[N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]hydrazino]-oxomethyl]phenyl]thio]-N,N-dimethylacetamide > 2-[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]sulfanyl-N,N-dimethylacetamide > 2-[2-[[2-(3,5-dimethylpyrazol-1-yl)ethanoylamino]carbamoyl]phenyl]sulfanyl-N,N-dimethyl-ethanamide > 2-[[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]thio]-N,N-dimethyl-acetamide > InChI=1/C18H23N5O3S/c1-12-9-13(2)23(21-12)10-16(24)19-20-18(26)14-7-5-6-8-15(14)27-11-17(25)22(3)4/h5-9H,10-11H2,1-4H3,(H,19,24)(H,20,26)/f/h19-20H > 1.5 > 389.15216 > C18H23N5O3S > 389.47192 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=CC=C2SCC(=O)N(C)C)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=CC=C2SCC(=O)N(C)C)C > 96.3 > 389.15216 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 10 12 8 12 13 8 16 18 8 16 21 8 18 22 8 21 24 8 22 25 8 24 25 8 5 10 8 5 6 8 6 13 8 $$$$ 9425021 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 9.4664 2.2809 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 7.3432 5.1506 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 9.3676 4.1756 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.3927 -2.5473 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.2807 0.8825 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7359 -3.6675 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5845 -4.1965 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.4225 -1.1124 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.2509 -0.5523 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.9667 5.9324 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9667 5.9324 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.5902 5.1506 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.5657 4.1756 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.4667 3.7418 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9706 -4.3112 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.6961 3.6022 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6650 -2.6701 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.7660 2.5629 0.0000 C 0 
0 0 0 0 0 0 0 0 0 0 0 8.5605 2.7044 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.3462 -5.2380 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.3437 -5.1671 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.7044 2.1110 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -4.0704 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4934 -2.1099 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9376 2.0028 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.9874 -5.9324 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.0084 1.0053 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.1800 0.4452 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.5253 6.2014 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8287 6.5368 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.1046 6.5368 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4081 6.2014 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1488 4.8816 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.9767 5.6353 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1407 3.8776 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0636 -2.8207 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4122 -2.1040 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0183 -5.7641 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7532 1.4929 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.1493 -3.4686 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3982 -3.9211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8507 -4.6721 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3800 2.2739 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5129 -6.3315 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3864 -6.4069 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4619 -5.5333 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.8650 -0.8413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5660 0.7341 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8085 -0.8235 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 1 0 0 0 0 2 11 1 0 0 0 0 2 13 1 0 0 0 0 3 12 1 0 0 0 0 3 14 1 0 0 0 0 4 24 2 0 0 0 0 5 28 2 0 0 0 0 6 7 1 0 0 0 0 6 15 1 0 0 0 0 6 17 1 0 0 0 0 7 21 2 0 0 0 0 8 9 1 0 0 0 0 8 24 1 0 0 0 0 8 47 1 0 0 0 0 9 28 1 0 0 0 0 9 49 1 0 0 0 0 10 11 1 0 0 0 0 10 12 1 0 0 0 0 10 29 1 0 0 0 0 10 30 1 0 0 0 0 11 31 1 0 0 0 0 11 32 1 0 0 0 0 12 33 1 0 0 0 0 12 34 1 0 0 0 0 13 14 2 0 0 0 0 13 16 1 0 0 0 0 14 19 1 0 0 0 0 15 20 2 0 0 0 0 15 23 1 0 0 0 0 16 18 2 0 0 0 0 16 35 1 0 0 0 0 17 24 1 0 0 0 0 17 36 1 0 
0 0 0 17 37 1 0 0 0 0 18 22 1 0 0 0 0 18 25 1 0 0 0 0 19 22 2 0 0 0 0 20 21 1 0 0 0 0 20 38 1 0 0 0 0 21 26 1 0 0 0 0 22 39 1 0 0 0 0 23 40 1 0 0 0 0 23 41 1 0 0 0 0 23 42 1 0 0 0 0 25 27 2 0 0 0 0 25 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 27 28 1 0 0 0 0 27 48 1 0 0 0 0 M END > 9425021 > 1 > 590 > 6 > 2 > 4 > AAADceB7uAAEAAAAAAAAAAAAAAAAAWAAAAAwAAAABIAAAAABwAAAHgIYAAAADA7hniYyhpJqBACqAyVyVACSDAAhp0Ia+CC+79gNZiPF8/qWvCrl2BHK6YeAwBAOIAABIQAASABAAAJCAACQAAAAAAAAAA== > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]acrylohydrazide > InChI=1/C19H21ClN4O4/c1-12-8-13(2)24(23-12)11-18(26)22-21-17(25)5-4-14-9-15(20)19-16(10-14)27-6-3-7-28-19/h4-5,8-10H,3,6-7,11H2,1-2H3,(H,21,25)(H,22,26)/b5-4+/f/h21-22H > 2.6 > 404.125133 > C19H21ClN4O4 > 404.84744 > CC1=CC(=NN1CC(=O)NNC(=O)C=CC2=CC3=C(C(=C2)Cl)OCCCO3)C > CC1=CC(=NN1CC(=O)NNC(=O)\C=C\C2=CC3=C(C(=C2)Cl)OCCCO3)C > 94.5 > 404.125133 > 0 > 28 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 13 14 8 13 16 8 14 19 8 15 20 8 16 18 8 18 22 8 19 22 8 20 21 8 6 15 8 6 7 8 7 21 8 $$$$ 9425030 -OEChem-01150805002D 60 62 0 1 0 0 0 0 0999 V2000 9.6449 -2.7859 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.4495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -1.9495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.5505 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.4495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -0.9495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 1.0505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1667 -3.4440 0.0000 N 0 3 0 0 0 0 0 0 0 0 0 0 4.5981 
-0.4495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.4495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 2.5505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.4495 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 6.3301 0.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.4495 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 8.0622 -0.4495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9757 -2.0428 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.9495 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 9.1449 -3.6519 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -0.4495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 2.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -0.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 3.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 4.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 4.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.0695 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 1.1331 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 0.4428 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5422 -2.7872 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 -0.3419 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 -1.0321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 1.1331 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 0.4428 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5501 -3.5088 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0378 -4.0505 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4773 -1.6783 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6657 -1.5058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -1.2595 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9533 -4.2416 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7113 -3.9041 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.1705 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4675 0.0254 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2646 0.0254 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.0572 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -0.3669 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.8419 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 
1.7879 -2.5321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.0695 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 2.2405 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 3.2405 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8862 3.5136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.6592 4.3605 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5062 4.5874 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6182 4.5874 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4651 4.3605 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2382 3.5136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 18 1 0 0 0 0 1 21 1 0 0 0 0 2 17 2 0 0 0 0 3 19 2 0 0 0 0 4 23 2 0 0 0 0 5 25 2 0 0 0 0 6 12 1 0 0 0 0 6 15 1 0 0 0 0 6 17 1 0 0 0 0 7 13 1 0 0 0 0 7 16 1 0 0 0 0 7 23 1 0 0 0 0 8 14 1 0 0 0 0 8 21 1 0 0 0 0 8 38 1 0 0 0 0 8 39 1 0 0 0 0 9 19 1 0 0 0 0 20 9 1 6 0 0 0 9 45 1 0 0 0 0 10 25 1 0 0 0 0 10 26 1 0 0 0 0 10 52 1 0 0 0 0 11 23 1 0 0 0 0 11 27 1 0 0 0 0 11 53 1 0 0 0 0 12 13 1 0 0 0 0 12 19 1 1 0 0 0 12 30 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 14 17 1 6 0 0 0 14 18 1 0 0 0 0 14 33 1 0 0 0 0 15 16 1 0 0 0 0 15 34 1 0 0 0 0 15 35 1 0 0 0 0 16 36 1 0 0 0 0 16 37 1 0 0 0 0 18 40 1 0 0 0 0 18 41 1 0 0 0 0 20 22 1 0 0 0 0 20 25 1 0 0 0 0 20 42 1 0 0 0 0 21 43 1 0 0 0 0 21 44 1 0 0 0 0 22 24 1 0 0 0 0 22 46 1 0 0 0 0 22 47 1 0 0 0 0 24 26 1 0 0 0 0 24 48 1 0 0 0 0 24 49 1 0 0 0 0 26 50 1 0 0 0 0 26 51 1 0 0 0 0 27 28 1 0 0 0 0 27 29 1 0 0 0 0 27 54 1 0 0 0 0 28 55 1 0 0 0 0 28 56 1 0 0 0 0 28 57 1 0 0 0 0 29 58 1 0 0 0 0 29 59 1 0 0 0 0 29 60 1 0 0 0 0 M CHG 1 8 1 M END > 9425030 > 1 > 660 > 4 > 4 > 4 > AAADceB7uABAAAAAAAAAAAAAAAAAAWAAAAAsWAAAAAAAAAAAAAAAHgQQAAAACCjFwASDAAPAAAgIAAEQEAAAAABAABAAAIGIAACAQBogwCAUAAAIFgKAAAAYAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidyl]-4-[(4R)-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidinyl]-4-[oxo-[(4R)-4-thiazolidin-3-iumyl]methyl]piperazine-1,3-dicarboxamide > 
(3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[(4R)-1,3-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[[(4R)-1,3-thiazolidin-3-ium-4-yl]carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-keto-3-piperidyl]-4-[(4R)-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > InChI=1/C18H30N6O4S/c1-11(2)21-18(28)23-6-7-24(17(27)13-9-29-10-20-13)14(8-23)16(26)22-12-4-3-5-19-15(12)25/h11-14,20H,3-10H2,1-2H3,(H,19,25)(H,21,28)(H,22,26)/p+1/t12-,13-,14+/m0/s1/fC18H31N6O4S/h19-22H/q+1 > 427.212749 > C18H31N6O4S+ > 427.54154 > CC(C)NC(=O)N1CCN(C(C1)C(=O)NC2CCCNC2=O)C(=O)C3CSC[NH2+]3 > CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3 > 128 > 427.212749 > 1 > 29 > 3 > 0 > 0 > 0 > 0 > 1 > 8 > 12 19 5 14 17 6 20 9 6 $$$$ 9425031 -OEChem-01150805002D 59 61 0 1 0 0 0 0 0999 V2000 9.6449 -2.8817 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.5453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -2.0453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.4547 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.5453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -1.0453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 0.9547 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.5453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1667 -3.5398 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.5453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 2.4547 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.5453 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 6.3301 0.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -0.5453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -2.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -1.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.5453 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 3.7321 -1.0453 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 2.8660 -0.5453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9757 -2.1386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -1.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 1.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.0453 
0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.1449 -3.7477 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 3.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 3.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 3.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.1653 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 1.0373 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 0.3470 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 -1.1279 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 -0.4376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 0.3470 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 1.0373 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5422 -2.8830 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.4253 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2646 -0.0704 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4675 -0.0704 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.0747 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6657 -1.6016 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4773 -1.7741 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -0.4627 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.1530 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7060 -3.9547 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.9376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -2.6279 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9533 -4.3374 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7113 -3.9999 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.1653 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 2.1447 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 3.1447 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2382 3.4178 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4651 4.2647 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6182 4.4916 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5062 4.4916 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.6592 4.2647 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8862 3.4178 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 21 1 0 0 0 0 1 26 1 0 0 0 0 2 16 2 0 0 0 0 3 17 2 0 0 0 0 4 23 2 0 0 0 0 5 24 2 0 0 0 0 6 12 1 0 0 0 0 6 14 1 0 0 0 0 6 16 1 0 0 0 0 7 13 1 0 0 0 0 7 15 1 0 0 0 0 7 23 1 0 0 0 0 8 17 1 0 0 0 0 19 8 1 6 0 0 0 8 41 1 0 0 0 0 9 18 1 0 0 0 0 9 26 1 0 0 0 0 9 46 1 0 0 0 0 10 24 1 0 0 0 0 10 25 1 0 0 0 0 
10 51 1 0 0 0 0 11 23 1 0 0 0 0 11 27 1 0 0 0 0 11 52 1 0 0 0 0 12 13 1 0 0 0 0 12 17 1 1 0 0 0 12 30 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 14 15 1 0 0 0 0 14 33 1 0 0 0 0 14 34 1 0 0 0 0 15 35 1 0 0 0 0 15 36 1 0 0 0 0 18 16 1 6 0 0 0 18 21 1 0 0 0 0 18 37 1 0 0 0 0 19 20 1 0 0 0 0 19 24 1 0 0 0 0 19 38 1 0 0 0 0 20 22 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 21 42 1 0 0 0 0 21 43 1 0 0 0 0 22 25 1 0 0 0 0 22 44 1 0 0 0 0 22 45 1 0 0 0 0 25 47 1 0 0 0 0 25 48 1 0 0 0 0 26 49 1 0 0 0 0 26 50 1 0 0 0 0 27 28 1 0 0 0 0 27 29 1 0 0 0 0 27 53 1 0 0 0 0 28 54 1 0 0 0 0 28 55 1 0 0 0 0 28 56 1 0 0 0 0 29 57 1 0 0 0 0 29 58 1 0 0 0 0 29 59 1 0 0 0 0 M END > 9425031 > 1 > 660 > 5 > 4 > 4 > AAADceB7uABAAAAAAAAAAAAAAAAAAWAAAAAsWAAAAAAAAAAAAAAAHgQQAAAACCjFwASDAAPAAAgIAAEQEAAAAABAABAAAIGIAACAQBogwCAUAAAIFgKAAAAYAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidyl]-4-[(4R)-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidinyl]-4-[oxo-[(4R)-4-thiazolidinyl]methyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[(4R)-1,3-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[[(4R)-1,3-thiazolidin-4-yl]carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-keto-3-piperidyl]-4-[(4R)-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > InChI=1/C18H30N6O4S/c1-11(2)21-18(28)23-6-7-24(17(27)13-9-29-10-20-13)14(8-23)16(26)22-12-4-3-5-19-15(12)25/h11-14,20H,3-10H2,1-2H3,(H,19,25)(H,21,28)(H,22,26)/t12-,13-,14+/m0/s1/f/h19,21-22H > -0.9 > 426.204924 > C18H30N6O4S > 426.5336 > CC(C)NC(=O)N1CCN(C(C1)C(=O)NC2CCCNC2=O)C(=O)C3CSCN3 > CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSCN3 > 123 > 426.204924 > 0 > 29 > 3 > 0 > 0 > 0 > 0 > 1 > 8 > 12 17 5 18 16 6 19 8 6 $$$$ 9425032 -OEChem-01150805002D 42 43 0 0 0 0 0 0 0999 V2000 3.7321 2.0761 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -1.4239 0.0000 O 0 0 0 
0 0 0 0 0 0 0 0 0 5.4641 3.0761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6551 3.6639 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.5761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.0761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.2731 3.6639 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9641 4.6149 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.9641 4.6149 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -1.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 3.3548 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.5761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.3763 5.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -5.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.0747 2.1837 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6762 1.4935 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.3285 5.1165 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2554 -1.5316 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6540 -0.8413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.0326 2.7652 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8138 3.1632 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4158 3.9445 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -2.6139 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -2.6139 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8779 5.7884 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0119 5.9255 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.8747 5.0595 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 0.2661 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -4.2339 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -4.2339 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1951 0.3861 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2460 -5.4239 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4860 -5.4239 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -6.0439 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 17 2 0 0 0 0 3 4 1 0 0 0 0 3 7 1 0 0 0 0 3 8 1 0 0 0 0 4 10 2 0 0 
0 0 5 6 1 0 0 0 0 5 14 1 0 0 0 0 5 36 1 0 0 0 0 6 17 1 0 0 0 0 6 39 1 0 0 0 0 7 9 2 0 0 0 0 7 13 1 0 0 0 0 8 14 1 0 0 0 0 8 23 1 0 0 0 0 8 24 1 0 0 0 0 9 10 1 0 0 0 0 9 25 1 0 0 0 0 10 19 1 0 0 0 0 11 12 1 0 0 0 0 11 15 2 0 0 0 0 11 16 1 0 0 0 0 12 17 1 0 0 0 0 12 26 1 0 0 0 0 12 27 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 13 30 1 0 0 0 0 15 20 1 0 0 0 0 15 31 1 0 0 0 0 16 21 2 0 0 0 0 16 32 1 0 0 0 0 18 20 2 0 0 0 0 18 21 1 0 0 0 0 18 22 1 0 0 0 0 19 33 1 0 0 0 0 19 34 1 0 0 0 0 19 35 1 0 0 0 0 20 37 1 0 0 0 0 21 38 1 0 0 0 0 22 40 1 0 0 0 0 22 41 1 0 0 0 0 22 42 1 0 0 0 0 M END > 9425032 > 1 > 394 > 4 > 2 > 4 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgAYAAAADAjBngQygJJqAACqAyVyVACSBAAhggIa+CG4ZJgIYDLA1fGUpAhgmADIyAcYiMCOQAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > 2-(3,5-dimethyl-1-pyrazolyl)-N'-[2-(4-methylphenyl)-1-oxoethyl]acetohydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)ethanoyl]ethanehydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > InChI=1/C16H20N4O2/c1-11-4-6-14(7-5-11)9-15(21)17-18-16(22)10-20-13(3)8-12(2)19-20/h4-8H,9-10H2,1-3H3,(H,17,21)(H,18,22)/f/h17-18H > 2 > 300.158626 > C16H20N4O2 > 300.3556 > CC1=CC=C(C=C1)CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > CC1=CC=C(C=C1)CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > 76 > 300.158626 > 0 > 22 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 11 15 8 11 16 8 15 20 8 16 21 8 18 20 8 18 21 8 3 4 8 3 7 8 4 10 8 7 9 8 9 10 8 $$$$ 9425033 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 3.5878 -5.6967 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 -2.6579 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.5519 2.8421 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 1.8421 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.5519 4.8421 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.4654 4.4354 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 2.8421 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 1.8421 0.0000 N 0 0 0 0 0 0 0 0 0 
0 0 0 2.2788 -4.7457 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.6564 5.8366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6859 4.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.6346 6.0445 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.1346 5.1785 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9133 6.5058 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6859 3.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.1291 5.0740 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 0.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 -1.6579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -3.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 1.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 -0.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -0.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 -1.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -1.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -4.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8968 -4.7457 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5878 -5.6967 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -6.5058 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.0753 4.2344 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4738 4.9247 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8867 6.6109 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.3281 6.9665 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4525 6.9206 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4984 6.0450 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.0643 4.4574 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7457 5.0092 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1939 5.6906 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2829 3.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4772 -3.2656 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8757 -2.5753 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 0.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5508 0.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 1.5321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 -1.4679 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5508 -1.4679 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4865 -4.5541 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4984 -6.1413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6356 -7.0073 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5016 -6.8702 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 26 1 0 0 0 0 1 27 1 0 0 0 0 2 18 1 0 0 0 0 2 
19 1 0 0 0 0 3 15 2 0 0 0 0 4 20 2 0 0 0 0 5 6 1 0 0 0 0 5 10 1 0 0 0 0 5 11 1 0 0 0 0 6 13 2 0 0 0 0 7 8 1 0 0 0 0 7 15 1 0 0 0 0 7 38 1 0 0 0 0 8 20 1 0 0 0 0 8 43 1 0 0 0 0 9 25 1 0 0 0 0 9 27 2 0 0 0 0 10 12 2 0 0 0 0 10 14 1 0 0 0 0 11 15 1 0 0 0 0 11 29 1 0 0 0 0 11 30 1 0 0 0 0 12 13 1 0 0 0 0 12 31 1 0 0 0 0 13 16 1 0 0 0 0 14 32 1 0 0 0 0 14 33 1 0 0 0 0 14 34 1 0 0 0 0 16 35 1 0 0 0 0 16 36 1 0 0 0 0 16 37 1 0 0 0 0 17 20 1 0 0 0 0 17 21 2 0 0 0 0 17 22 1 0 0 0 0 18 23 2 0 0 0 0 18 24 1 0 0 0 0 19 25 1 0 0 0 0 19 39 1 0 0 0 0 19 40 1 0 0 0 0 21 23 1 0 0 0 0 21 41 1 0 0 0 0 22 24 2 0 0 0 0 22 42 1 0 0 0 0 23 44 1 0 0 0 0 24 45 1 0 0 0 0 25 26 2 0 0 0 0 26 46 1 0 0 0 0 27 28 1 0 0 0 0 28 47 1 0 0 0 0 28 48 1 0 0 0 0 28 49 1 0 0 0 0 M END > 9425033 > 1 > 544 > 6 > 2 > 6 > AAADceB7sABAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAAB8AAAHgQYAAAADAzl3gayh5JqFAiuAyVyVASS/KBlqjoa+DW+bNgOZjLk9fuXvSjk2BH46YeY3ADOIAAAAAAAAABAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methylthiazol-4-yl)methoxy]benzohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-[(2-methyl-4-thiazolyl)methoxy]benzohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methyl-1,3-thiazol-4-yl)methoxy]benzohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-[(2-methyl-1,3-thiazol-4-yl)methoxy]benzohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methylthiazol-4-yl)methoxy]benzohydrazide > InChI=1/C19H21N5O3S/c1-12-8-13(2)24(23-12)9-18(25)21-22-19(26)15-4-6-17(7-5-15)27-10-16-11-28-14(3)20-16/h4-8,11H,9-10H2,1-3H3,(H,21,25)(H,22,26)/f/h21-22H > 2.1 > 399.13651 > C19H21N5O3S > 399.46674 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=C(C=C2)OCC3=CSC(=N3)C)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=C(C=C2)OCC3=CSC(=N3)C)C > 98.1 > 399.13651 > 0 > 28 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 1 26 8 1 27 8 10 12 8 12 13 8 17 21 8 17 22 8 18 23 8 18 24 8 21 23 8 22 24 8 25 26 8 5 10 8 5 6 8 6 13 8 9 25 8 9 27 8 $$$$ 9425034 -OEChem-01150805002D 46 47 0 0 0 0 0 0 0999 V2000 6.3301 1.9182 0.0000 O 0 0 0 0 
0 0 0 0 0 0 0 0 2.8660 0.9182 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 3.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2437 3.5114 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.4347 4.9127 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 3.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4128 5.1206 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 4.2546 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6915 5.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -5.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 4.1501 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -5.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -1.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8535 3.3105 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2520 4.0008 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6650 5.6870 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2766 5.1211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2308 5.9967 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1064 6.0426 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4766 -4.9742 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0781 -5.6644 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9721 4.7667 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.5239 4.0852 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8425 3.5334 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -3.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -3.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -2.2718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -2.2718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6900 -5.0449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -5.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3100 -6.1188 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0611 2.2282 0.0000 H 0 0 0 0 0 0 0 
0 0 0 0 0 2.3291 -0.7718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 0.6082 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -0.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 24 2 0 0 0 0 3 4 1 0 0 0 0 3 7 1 0 0 0 0 3 8 1 0 0 0 0 4 10 2 0 0 0 0 5 6 1 0 0 0 0 5 14 1 0 0 0 0 5 43 1 0 0 0 0 6 24 1 0 0 0 0 6 45 1 0 0 0 0 7 9 2 0 0 0 0 7 11 1 0 0 0 0 8 14 1 0 0 0 0 8 25 1 0 0 0 0 8 26 1 0 0 0 0 9 10 1 0 0 0 0 9 27 1 0 0 0 0 10 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 11 30 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 12 18 1 0 0 0 0 13 21 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 15 33 1 0 0 0 0 15 34 1 0 0 0 0 15 35 1 0 0 0 0 16 19 2 0 0 0 0 16 20 1 0 0 0 0 16 22 1 0 0 0 0 17 19 1 0 0 0 0 17 36 1 0 0 0 0 18 20 2 0 0 0 0 18 37 1 0 0 0 0 19 38 1 0 0 0 0 20 39 1 0 0 0 0 21 40 1 0 0 0 0 21 41 1 0 0 0 0 21 42 1 0 0 0 0 22 23 2 0 0 0 0 22 44 1 0 0 0 0 23 24 1 0 0 0 0 23 46 1 0 0 0 0 M END > 9425034 > 1 > 458 > 4 > 2 > 5 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgAYAAAADAjBngQygJJqAACqAyVyVACSBAAhggIa+CC4ZNgIYCLA0fCUpAhgmADIyYcAgMAOQAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)acrylohydrazide > InChI=1/C18H22N4O2/c1-4-15-5-7-16(8-6-15)9-10-17(23)19-20-18(24)12-22-14(3)11-13(2)21-22/h5-11H,4,12H2,1-3H3,(H,19,23)(H,20,24)/b10-9+/f/h19-20H > 2.9 > 326.174276 > C18H22N4O2 > 326.39288 > CCC1=CC=C(C=C1)C=CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > CCC1=CC=C(C=C1)\C=C\C(=O)NNC(=O)CN2C(=CC(=N2)C)C > 76 > 326.174276 > 0 > 24 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 12 17 8 12 18 8 16 19 8 16 20 8 17 19 8 18 20 8 3 4 8 3 7 8 4 10 8 7 9 8 9 10 8 $$$$ 9425035 -OEChem-01150805002D 47 49 0 0 0 0 0 0 0999 V2000 2.0000 
2.9045 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 4.1833 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 1.4045 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -2.0955 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 3.9045 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -3.0955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3871 -3.6833 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.0955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.5955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9230 4.4923 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.2321 5.4434 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2321 5.4434 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5411 4.4923 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 2.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 2.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 2.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 1.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0052 -3.6833 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -2.0955 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 0.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.6962 -4.6343 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.6962 -4.6343 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.5955 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9562 -3.3743 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1084 -5.4434 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.3566 4.7445 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6130 3.9554 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2969 6.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6256 5.5723 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8385 5.5723 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.1672 6.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 2.7145 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 1.0945 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.2845 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4082 -1.5129 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8067 -2.2032 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0606 -5.1359 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9272 -0.4055 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1478 -3.9639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.5459 -3.1827 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.7646 
-2.7846 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8671 -0.2855 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6068 -5.0789 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7439 -5.9449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.6100 -5.8078 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 16 1 0 0 0 0 2 13 2 0 0 0 0 3 22 2 0 0 0 0 4 25 2 0 0 0 0 5 10 1 0 0 0 0 5 13 1 0 0 0 0 5 14 1 0 0 0 0 6 7 1 0 0 0 0 6 20 1 0 0 0 0 6 21 1 0 0 0 0 7 24 2 0 0 0 0 8 9 1 0 0 0 0 8 22 1 0 0 0 0 8 40 1 0 0 0 0 9 25 1 0 0 0 0 9 44 1 0 0 0 0 10 11 1 0 0 0 0 10 28 1 0 0 0 0 10 29 1 0 0 0 0 11 12 1 0 0 0 0 11 30 1 0 0 0 0 11 31 1 0 0 0 0 12 13 1 0 0 0 0 12 32 1 0 0 0 0 12 33 1 0 0 0 0 14 15 1 0 0 0 0 14 16 2 0 0 0 0 15 17 2 0 0 0 0 15 34 1 0 0 0 0 16 18 1 0 0 0 0 17 19 1 0 0 0 0 17 22 1 0 0 0 0 18 19 2 0 0 0 0 18 35 1 0 0 0 0 19 36 1 0 0 0 0 20 23 2 0 0 0 0 20 26 1 0 0 0 0 21 25 1 0 0 0 0 21 37 1 0 0 0 0 21 38 1 0 0 0 0 23 24 1 0 0 0 0 23 39 1 0 0 0 0 24 27 1 0 0 0 0 26 41 1 0 0 0 0 26 42 1 0 0 0 0 26 43 1 0 0 0 0 27 45 1 0 0 0 0 27 46 1 0 0 0 0 27 47 1 0 0 0 0 M END > 9425035 > 1 > 589 > 5 > 2 > 4 > AAADceB7sAAEAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAABwAAAHgIYAAAADArBniQywJNqAACqAyVyVACSBAAlhwIa+CG4ZtgIYDLB1/HUpQhgngDIyYcciACOBAAAQAAAABAIAACAAAAAIAAAAAAAAA== > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(2-oxo-1-pyrrolidinyl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-ketopyrrolidin-1-yl)benzohydrazide > InChI=1/C18H20ClN5O3/c1-11-8-12(2)24(22-11)10-16(25)20-21-18(27)13-5-6-14(19)15(9-13)23-7-3-4-17(23)26/h5-6,8-9H,3-4,7,10H2,1-2H3,(H,20,25)(H,21,27)/f/h20-21H > 1.7 > 389.125467 > C18H20ClN5O3 > 389.8361 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)Cl)N3CCCC3=O)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)Cl)N3CCCC3=O)C > 96.3 > 389.125467 > 0 > 27 
> 0 > 0 > 0 > 0 > 0 > 1 > 8 > 14 15 8 14 16 8 15 17 8 16 18 8 17 19 8 18 19 8 20 23 8 23 24 8 6 20 8 6 7 8 7 24 8 $$$$ 9425036 -OEChem-01150805002D 46 48 0 0 0 0 0 0 0999 V2000 8.9073 -1.3318 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 1.1682 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 12.3714 -0.3318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.1682 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 -1.8318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 0.1682 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 -0.2386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 13.3176 -0.0271 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.3318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 13.3176 -1.6366 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.6394 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6394 -1.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.3714 -1.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 1.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 1.1627 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -1.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 1.3706 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 0.5046 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 1.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.9013 -0.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 0.4001 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.1719 0.6431 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3748 0.6431 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1254 1.1682 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 1.7882 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.8854 1.1682 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8418 -0.8068 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.0447 -0.8068 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4634 -1.2949 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2364 -2.1418 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.0834 -2.3688 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 1.9370 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 0.7882 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 
5.6307 1.3711 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 2.2467 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 2.2926 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.9518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5213 -0.8318 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 1.0167 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 -0.2166 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 0.3352 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 2 0 0 0 0 2 23 2 0 0 0 0 3 8 1 0 0 0 0 3 12 1 0 0 0 0 3 15 1 0 0 0 0 4 7 1 0 0 0 0 4 17 1 0 0 0 0 4 18 1 0 0 0 0 5 14 2 0 0 0 0 5 15 1 0 0 0 0 6 9 1 0 0 0 0 6 19 1 0 0 0 0 6 38 1 0 0 0 0 7 22 2 0 0 0 0 8 25 2 0 0 0 0 9 23 1 0 0 0 0 9 42 1 0 0 0 0 10 15 2 0 0 0 0 10 25 1 0 0 0 0 11 12 2 0 0 0 0 11 13 1 0 0 0 0 11 14 1 0 0 0 0 12 16 1 0 0 0 0 13 19 1 0 0 0 0 13 27 1 0 0 0 0 13 28 1 0 0 0 0 14 20 1 0 0 0 0 16 29 1 0 0 0 0 16 30 1 0 0 0 0 16 31 1 0 0 0 0 17 23 1 0 0 0 0 17 32 1 0 0 0 0 17 33 1 0 0 0 0 18 21 2 0 0 0 0 18 24 1 0 0 0 0 20 34 1 0 0 0 0 20 35 1 0 0 0 0 20 36 1 0 0 0 0 21 22 1 0 0 0 0 21 37 1 0 0 0 0 22 26 1 0 0 0 0 24 39 1 0 0 0 0 24 40 1 0 0 0 0 24 41 1 0 0 0 0 25 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 M END > 9425036 > 1 > 532 > 8 > 2 > 4 > AAADceB78AAAAAAAAAAAAAAAAAAAAWLAAAAsAAAAAAAAAFgB+AAAHgAYAAAADAjBngQ3kJZqEACqAyVzdACQhCsxgqIXeCG4ZBiAaBJAzfEUhAhoGALISCIcAAAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > 
InChI=1/C16H20N8O2/c1-9-5-10(2)23(22-9)7-15(26)21-20-14(25)6-13-11(3)19-16-17-8-18-24(16)12(13)4/h5,8H,6-7H2,1-4H3,(H,20,25)(H,21,26)/f/h20-21H > -1.5 > 356.170922 > C16H20N8O2 > 356.3824 > CC1=CC(=NN1CC(=O)NNC(=O)CC2=C(N3C(=NC=N3)N=C2C)C)C > CC1=CC(=NN1CC(=O)NNC(=O)CC2=C(N3C(=NC=N3)N=C2C)C)C > 119 > 356.170922 > 0 > 26 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 10 15 8 10 25 8 11 12 8 11 14 8 18 21 8 21 22 8 3 12 8 3 15 8 3 8 8 4 18 8 4 7 8 5 14 8 5 15 8 7 22 8 8 25 8 $$$$ 9425037 -OEChem-01150805002D 42 43 0 0 0 0 0 0 0999 V2000 8.7788 5.5580 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -1.4420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 -0.4420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 3.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -3.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.4781 -4.4365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -1.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -0.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 1.5580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 2.5580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.6691 -3.0353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -2.9420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.7784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5000 -4.6444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.4612 -2.0571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -1.9420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0933 -5.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 1.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 0.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 2.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 4.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 4.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.6608 -3.5246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.0593 -2.8343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 -3.7136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0677 -1.9282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3323 -1.4507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8548 -2.1860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 
0 2.6597 -5.8102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8411 -6.1244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5269 -5.3058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8517 -1.7520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9687 1.6406 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.5702 0.9503 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.7778 -0.1320 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5837 1.2480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.3759 4.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1818 4.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1818 2.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 23 1 0 0 0 0 2 16 2 0 0 0 0 3 19 2 0 0 0 0 4 20 2 0 0 0 0 5 6 1 0 0 0 0 5 11 1 0 0 0 0 5 12 1 0 0 0 0 6 14 2 0 0 0 0 7 8 1 0 0 0 0 7 16 1 0 0 0 0 7 35 1 0 0 0 0 8 19 1 0 0 0 0 8 38 1 0 0 0 0 9 18 1 0 0 0 0 9 20 1 0 0 0 0 9 39 1 0 0 0 0 10 21 2 0 0 0 0 10 25 1 0 0 0 0 11 13 2 0 0 0 0 11 15 1 0 0 0 0 12 16 1 0 0 0 0 12 26 1 0 0 0 0 12 27 1 0 0 0 0 13 14 1 0 0 0 0 13 28 1 0 0 0 0 14 17 1 0 0 0 0 15 29 1 0 0 0 0 15 30 1 0 0 0 0 15 31 1 0 0 0 0 17 32 1 0 0 0 0 17 33 1 0 0 0 0 17 34 1 0 0 0 0 18 19 1 0 0 0 0 18 36 1 0 0 0 0 18 37 1 0 0 0 0 20 21 1 0 0 0 0 21 22 1 0 0 0 0 22 23 2 0 0 0 0 22 40 1 0 0 0 0 23 24 1 0 0 0 0 24 25 2 0 0 0 0 24 41 1 0 0 0 0 25 42 1 0 0 0 0 M END > 9425037 > 1 > 506 > 6 > 3 > 5 > AAADceBzsAAEAAAAAAAAAAAAAAAAAWAAAAAsAAAAAAAAAAAB4AAAHgIYAAAACArBliQ+gJLqEACqATV3VACShCA3hyIa+KG4ZtgIYHLB1/GUpQhgngDIyYcYCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > 4-chloro-N-[2-[N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazino]-2-oxo-ethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]hydrazino]-2-oxoethyl]-2-pyridinecarboxamide > 4-chloro-N-[2-[2-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazinyl]-2-oxoethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[2-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]hydrazinyl]-2-oxo-ethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazino]-2-keto-ethyl]picolinamide > 
InChI=1/C15H17ClN6O3/c1-9-5-10(2)22(21-9)8-14(24)20-19-13(23)7-18-15(25)12-6-11(16)3-4-17-12/h3-6H,7-8H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/f/h18-20H > 0.4 > 364.105066 > C15H17ClN6O3 > 364.78688 > CC1=CC(=NN1CC(=O)NNC(=O)CNC(=O)C2=NC=CC(=C2)Cl)C > CC1=CC(=NN1CC(=O)NNC(=O)CNC(=O)C2=NC=CC(=C2)Cl)C > 118 > 364.105066 > 0 > 25 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 10 21 8 10 25 8 11 13 8 13 14 8 21 22 8 22 23 8 23 24 8 24 25 8 5 11 8 5 6 8 6 14 8 $$$$ 9425040 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 9.6449 1.1920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 -1.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 -1.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 0.6920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 10.5109 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 -0.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.4781 0.6865 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 1.1920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2429 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2429 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.5109 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.1369 1.2267 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.1369 -0.8427 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.0429 0.7128 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.0429 -0.3288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.6691 -0.7147 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 0.0284 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5000 0.8944 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.4612 -1.6929 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0933 1.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.9784 1.6670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.7754 1.6670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.0434 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2463 
-1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3803 0.1670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1774 0.1670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.1297 1.8466 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.1297 -1.4626 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5787 1.0249 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5787 -0.6409 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8472 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0502 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 -0.0364 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 0.3120 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 -1.4280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8548 -1.5640 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3323 -2.2993 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0677 -1.8218 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6597 2.0602 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5269 1.5558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8411 2.3744 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 15 2 0 0 0 0 3 19 2 0 0 0 0 4 26 2 0 0 0 0 5 11 1 0 0 0 0 5 14 1 0 0 0 0 5 15 1 0 0 0 0 6 9 1 0 0 0 0 6 22 1 0 0 0 0 6 23 1 0 0 0 0 7 8 1 0 0 0 0 7 19 1 0 0 0 0 7 42 1 0 0 0 0 8 26 1 0 0 0 0 8 43 1 0 0 0 0 9 25 2 0 0 0 0 10 12 1 0 0 0 0 10 14 1 0 0 0 0 10 29 1 0 0 0 0 10 30 1 0 0 0 0 11 16 1 0 0 0 0 11 31 1 0 0 0 0 11 32 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 13 15 1 0 0 0 0 13 18 2 0 0 0 0 16 19 1 0 0 0 0 16 33 1 0 0 0 0 16 34 1 0 0 0 0 17 20 1 0 0 0 0 17 35 1 0 0 0 0 18 21 1 0 0 0 0 18 36 1 0 0 0 0 20 21 2 0 0 0 0 20 37 1 0 0 0 0 21 38 1 0 0 0 0 22 26 1 0 0 0 0 22 39 1 0 0 0 0 22 40 1 0 0 0 0 23 24 2 0 0 0 0 23 27 1 0 0 0 0 24 25 1 0 0 0 0 24 41 1 0 0 0 0 25 28 1 0 0 0 0 27 44 1 0 0 0 0 27 45 1 0 0 0 0 27 46 1 0 0 0 0 28 47 1 0 0 0 0 28 48 1 0 0 0 0 28 49 1 0 0 0 0 M END > 9425040 > 1 > 640 > 6 > 2 > 5 > AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAADAjBngQygJNqAACqAyVyVACSBAAlggIa+CG4ZNgIYDrA1fGUpQhgniDIyYcYi4COgAAAAAAQAAAAAAAAACAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > 
N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > 3-(1,3-diketo-4H-isoquinolin-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]propionohydrazide > InChI=1/C19H21N5O4/c1-12-9-13(2)24(22-12)11-17(26)21-20-16(25)7-8-23-18(27)10-14-5-3-4-6-15(14)19(23)28/h3-6,9H,7-8,10-11H2,1-2H3,(H,20,25)(H,21,26)/f/h20-21H > 0.6 > 383.159354 > C19H21N5O4 > 383.40114 > CC1=CC(=NN1CC(=O)NNC(=O)CCN2C(=O)CC3=CC=CC=C3C2=O)C > CC1=CC(=NN1CC(=O)NNC(=O)CCN2C(=O)CC3=CC=CC=C3C2=O)C > 113 > 383.159354 > 0 > 28 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 12 13 8 12 17 8 13 18 8 17 20 8 18 21 8 20 21 8 23 24 8 24 25 8 6 23 8 6 9 8 9 25 8 $$$$ 9425041 -OEChem-01150805002D 45 47 0 1 0 0 0 0 0999 V2000 8.9073 -1.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -3.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -0.2718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.7282 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -3.7718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 2.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 -0.2718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 2.3214 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -2.2718 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 7.1753 -1.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -2.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 2.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 3.7227 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -1.7372 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -3.8065 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 3.9306 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 3.0646 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 1.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -2.2510 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 
-3.2926 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 4.3918 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9601 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -1.6518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9632 -2.3544 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5647 -1.6642 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -4.3918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6553 2.8108 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0538 2.1205 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -1.1172 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -4.4264 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 4.4970 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7723 -0.5818 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -1.9389 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -3.6047 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8462 1.0382 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6307 3.9311 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 4.8067 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 4.8526 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 3.5767 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 2.3434 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 2.8952 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 10 1 0 0 0 0 1 13 1 0 0 0 0 2 12 2 0 0 0 0 3 15 2 0 0 0 0 4 22 2 0 0 0 0 5 12 1 0 0 0 0 5 14 1 0 0 0 0 5 30 1 0 0 0 0 6 8 1 0 0 0 0 6 16 1 0 0 0 0 6 17 1 0 0 0 0 7 9 1 0 0 0 0 7 15 1 0 0 0 0 7 36 1 0 0 0 0 8 21 2 0 0 0 0 9 22 1 0 0 0 0 9 39 1 0 0 0 0 10 11 1 1 0 0 0 10 12 1 0 0 0 0 10 27 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 13 14 1 0 0 0 0 13 18 2 0 0 0 0 14 19 2 0 0 0 0 16 22 1 0 0 0 0 16 31 1 0 0 0 0 16 32 1 0 0 0 0 17 20 2 0 0 0 0 17 25 1 0 0 0 0 18 23 1 0 0 0 0 18 33 1 0 0 0 0 19 24 1 0 0 0 0 19 34 1 0 0 0 0 20 21 1 0 0 0 0 20 35 1 0 0 0 0 21 26 1 0 0 0 0 23 24 2 0 0 0 0 23 37 1 0 0 0 0 24 38 1 0 0 0 0 25 40 1 0 0 0 0 25 41 1 0 0 0 0 25 42 1 0 0 0 0 26 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 M END > 9425041 > 1 > 557 > 6 > 3 > 4 > AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAACBzhlgYyxpLqBACqASVyVAKSDAAhogIa+CH/bJgOZjbE8f+XvCjm/BHY6AeVQAAAAAAAAAAAEAAAAAAAAAAgAAAAAAAAAA== > 
N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-keto-4H-1,4-benzoxazin-2-yl]acetohydrazide > InChI=1/C17H19N5O4/c1-10-7-11(2)22(21-10)9-16(24)20-19-15(23)8-14-17(25)18-12-5-3-4-6-13(12)26-14/h3-7,14H,8-9H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/t14-/m1/s1/f/h18-20H > 0.4 > 357.143704 > C17H19N5O4 > 357.36386 > CC1=CC(=NN1CC(=O)NNC(=O)CC2C(=O)NC3=CC=CC=C3O2)C > CC1=CC(=NN1CC(=O)NNC(=O)C[C@@H]2C(=O)NC3=CC=CC=C3O2)C > 114 > 357.143704 > 0 > 26 > 1 > 0 > 0 > 0 > 0 > 1 > 8 > 10 11 5 13 14 8 13 18 8 14 19 8 17 20 8 18 23 8 19 24 8 20 21 8 23 24 8 6 17 8 6 8 8 8 21 8 $$$$ 9425042 -OEChem-01150805002D 45 47 0 1 0 0 0 0 0999 V2000 8.9073 -1.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -3.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -0.2718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.7282 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -3.7718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 2.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 -0.2718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 2.3214 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -2.2718 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 7.1753 -1.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -2.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 2.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 3.7227 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -1.7372 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -3.8065 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 3.9306 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 3.0646 0.0000 C 0 0 0 0 0 
0 0 0 0 0 0 0 5.4432 1.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -2.2510 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -3.2926 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 4.3918 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9601 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -1.6518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9632 -2.3544 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5647 -1.6642 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -4.3918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6553 2.8108 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0538 2.1205 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -1.1172 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -4.4264 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 4.4970 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7723 -0.5818 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -1.9389 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -3.6047 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8462 1.0382 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6307 3.9311 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 4.8067 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 4.8526 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 3.5767 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 2.3434 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 2.8952 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 10 1 0 0 0 0 1 13 1 0 0 0 0 2 12 2 0 0 0 0 3 15 2 0 0 0 0 4 22 2 0 0 0 0 5 12 1 0 0 0 0 5 14 1 0 0 0 0 5 30 1 0 0 0 0 6 8 1 0 0 0 0 6 16 1 0 0 0 0 6 17 1 0 0 0 0 7 9 1 0 0 0 0 7 15 1 0 0 0 0 7 36 1 0 0 0 0 8 21 2 0 0 0 0 9 22 1 0 0 0 0 9 39 1 0 0 0 0 10 11 1 6 0 0 0 10 12 1 0 0 0 0 10 27 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 13 14 1 0 0 0 0 13 18 2 0 0 0 0 14 19 2 0 0 0 0 16 22 1 0 0 0 0 16 31 1 0 0 0 0 16 32 1 0 0 0 0 17 20 2 0 0 0 0 17 25 1 0 0 0 0 18 23 1 0 0 0 0 18 33 1 0 0 0 0 19 24 1 0 0 0 0 19 34 1 0 0 0 0 20 21 1 0 0 0 0 20 35 1 0 0 0 0 21 26 1 0 0 0 0 23 24 2 0 0 0 0 23 37 1 0 0 0 0 24 38 1 0 0 0 0 25 40 1 0 0 0 0 25 41 1 0 0 0 0 25 42 1 0 0 0 0 26 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 M END > 9425042 > 1 > 557 > 6 > 3 > 4 > 
AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAACBzhlgYyxpLqBACqASVyVAKSDAAhogIa+CH/bJgOZjbE8f+XvCjm/BHY6AeVQAAAAAAAAAAAEAAAAAAAAAAgAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-keto-4H-1,4-benzoxazin-2-yl]acetohydrazide > InChI=1/C17H19N5O4/c1-10-7-11(2)22(21-10)9-16(24)20-19-15(23)8-14-17(25)18-12-5-3-4-6-13(12)26-14/h3-7,14H,8-9H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/t14-/m0/s1/f/h18-20H > 0.4 > 357.143704 > C17H19N5O4 > 357.36386 > CC1=CC(=NN1CC(=O)NNC(=O)CC2C(=O)NC3=CC=CC=C3O2)C > CC1=CC(=NN1CC(=O)NNC(=O)C[C@H]2C(=O)NC3=CC=CC=C3O2)C > 114 > 357.143704 > 0 > 26 > 1 > 0 > 0 > 0 > 0 > 1 > 8 > 10 11 6 13 14 8 13 18 8 14 19 8 17 20 8 18 23 8 19 24 8 20 21 8 23 24 8 6 17 8 6 8 8 8 21 8 $$$$ 9425045 -OEChem-01150805002D 50 52 0 0 0 0 0 0 0999 V2000 4.6783 3.2611 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 9.7619 1.5904 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 10.7619 -1.8737 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 12.7619 -1.8737 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6783 1.6517 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 9.7619 -0.1417 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.7619 -0.1417 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 13.7564 -1.9783 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.7619 1.5904 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.7619 1.5904 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2619 2.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2619 0.7244 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2619 2.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.2619 0.7244 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 1.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2619 -1.0077 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.3551 -2.7873 
0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.0983 -3.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.9643 -2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.2619 -1.0077 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 3.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 1.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3770 -2.9952 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 1.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.8779 -3.3631 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1793 1.3783 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8695 0.9798 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3445 1.8024 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6542 2.2010 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1542 3.0670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8445 2.6685 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3695 0.1138 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6793 0.5123 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1542 -0.3971 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.8445 -0.7956 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4519 -0.6786 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.0335 -4.0730 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.0719 0.3953 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 4.0764 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 0.8364 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.2481 -2.3887 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.7705 -3.1241 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5059 -3.6016 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.6257 -3.9295 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 15.1300 -2.7967 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 15.4443 -3.6153 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 3.2664 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 1.6464 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 13 1 0 0 0 0 1 15 1 0 0 0 0 2 14 2 0 0 0 0 3 21 2 0 0 0 0 4 8 1 0 0 0 0 4 17 1 0 0 0 0 4 18 1 0 0 0 0 5 13 2 0 0 0 0 5 16 1 0 0 0 0 6 7 1 0 0 0 0 6 14 1 0 0 0 0 6 38 1 0 0 0 0 7 21 1 0 0 0 0 7 40 1 0 0 0 0 8 20 2 0 0 0 0 9 10 1 0 0 0 0 9 11 1 0 0 0 0 9 28 1 0 0 0 0 9 29 1 0 0 0 0 10 12 1 0 0 0 0 10 30 1 0 0 0 0 10 31 1 0 0 0 0 11 13 1 0 0 0 0 11 32 1 0 0 0 0 11 33 1 0 0 0 0 12 14 1 0 0 0 0 12 34 1 0 0 0 0 12 35 1 0 0 0 0 15 16 
1 0 0 0 0 15 22 2 0 0 0 0 16 23 2 0 0 0 0 17 21 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 18 19 2 0 0 0 0 18 24 1 0 0 0 0 19 20 1 0 0 0 0 19 39 1 0 0 0 0 20 27 1 0 0 0 0 22 25 1 0 0 0 0 22 41 1 0 0 0 0 23 26 1 0 0 0 0 23 42 1 0 0 0 0 24 43 1 0 0 0 0 24 44 1 0 0 0 0 24 45 1 0 0 0 0 25 26 2 0 0 0 0 25 49 1 0 0 0 0 26 50 1 0 0 0 0 27 46 1 0 0 0 0 27 47 1 0 0 0 0 27 48 1 0 0 0 0 M END > 9425045 > 1 > 520 > 5 > 2 > 7 > AAADceB7sABAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAFgB/AAAHgQYAAAACAjB1gQywbJqEAiuASVyVACT9KBhijpa+D24ZJgIYLLg0fGUpAhgmADoyAcYCAAAAAAAAAAAAQAAAAAAAAACAAAAAAAAAA== > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]valerohydrazide > InChI=1/C19H23N5O2S/c1-13-11-14(2)24(23-13)12-18(26)22-21-17(25)9-5-6-10-19-20-15-7-3-4-8-16(15)27-19/h3-4,7-8,11H,5-6,9-10,12H2,1-2H3,(H,21,25)(H,22,26)/f/h21-22H > 2 > 385.157246 > C19H23N5O2S > 385.48322 > CC1=CC(=NN1CC(=O)NNC(=O)CCCCC2=NC3=CC=CC=C3S2)C > CC1=CC(=NN1CC(=O)NNC(=O)CCCCC2=NC3=CC=CC=C3S2)C > 88.9 > 385.157246 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 1 13 8 1 15 8 15 16 8 15 22 8 16 23 8 18 19 8 19 20 8 22 25 8 23 26 8 25 26 8 4 18 8 4 8 8 5 13 8 5 16 8 8 20 8 $$$$ 9425046 -OEChem-01150805002D 40 41 0 0 0 0 0 0 0999 V2000 2.0000 -3.5580 0.0000 Br 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 0.9420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -4.5580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -0.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.4347 3.9365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.0580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2437 2.5353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.4420 0.0000 C 0 0 
0 0 0 0 0 0 0 0 0 0 7.9128 3.2784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4128 4.1444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4516 1.5571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 1.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.8195 5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2520 3.0246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8535 2.3343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5294 3.2136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8451 1.4282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5805 0.9507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0580 1.6860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.2531 5.3102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0717 5.6244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3859 4.8058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0611 1.2520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -0.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -3.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2881 -5.5949 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -5.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9081 -4.5211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 21 1 0 0 0 0 2 14 2 0 0 0 0 3 20 1 0 0 0 0 3 23 1 0 0 0 0 4 17 2 0 0 0 0 5 6 1 0 0 0 0 5 9 1 0 0 0 0 5 10 1 0 0 0 0 6 12 2 0 0 0 0 7 8 1 0 0 0 0 7 14 1 0 0 0 0 7 33 1 0 0 0 0 8 17 1 0 0 0 0 8 34 1 0 0 0 0 9 11 2 0 0 0 0 9 13 1 0 0 0 0 10 14 1 0 0 0 0 10 24 1 0 0 0 0 10 25 1 0 0 0 0 11 12 1 0 0 0 0 11 26 1 0 0 0 0 12 15 1 0 0 0 0 13 27 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 15 30 1 0 0 0 0 15 31 1 0 0 0 0 15 32 1 0 0 0 0 16 17 1 0 0 0 0 16 18 2 0 0 0 0 16 19 1 0 0 0 0 18 21 1 0 0 0 0 18 35 1 0 0 0 0 19 22 2 0 0 0 0 19 36 1 0 0 0 0 20 21 2 0 0 0 
0 20 22 1 0 0 0 0 22 37 1 0 0 0 0 23 38 1 0 0 0 0 23 39 1 0 0 0 0 23 40 1 0 0 0 0 M END > 9425046 > 1 > 437 > 5 > 2 > 4 > AAADceBzsAAAEAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgBYAAABrAzBngYyhpJqBACqAyVyVACSDAAlogYa+CG+bPgMZjLE9fuUtShk2BHI65eY3ADOIAAAEAAABABAAAAgAAAIAAAAAAAAAA== > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxy-benzohydrazide > 3-bromo-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-methoxybenzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxybenzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-methoxy-benzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxy-benzohydrazide > InChI=1/C15H17BrN4O3/c1-9-6-10(2)20(19-9)8-14(21)17-18-15(22)11-4-5-13(23-3)12(16)7-11/h4-7H,8H2,1-3H3,(H,17,21)(H,18,22)/f/h17-18H > 2.5 > 380.048403 > C15H17BrN4O3 > 381.22448 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)OC)Br)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)OC)Br)C > 85.3 > 380.048403 > 0 > 23 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 11 12 8 16 18 8 16 19 8 18 21 8 19 22 8 20 21 8 20 22 8 5 6 8 5 9 8 6 12 8 9 11 8 $$$$ chemfp-1.1p1/tests/pubchem.sdf.gz0000644000077000000240000003643211660452124017170 0ustar dalkestaff00000000000000b/Los6S͜4%34DJ)NgB]ϓ*@'EGpz1`WAjQKq캄}Rfu"e~~1'zR[*{w].LӖJ=U L/wõUڥ(uU}ۥRK}]P*[&uҶv©bۥhۥB#K=U*ZB%HzfROqvGpT:sվwK}c' v}ᨨRo"vÎgdgQngd<ޣL' ^Eyjyoѣ-]F (ӗvR{q5ip{ݣG[EQ9ݶ(v??U̕l/a}uӏEQ~,n^ݼ*8?2>,gro[<(?gћ(̙ܩ>\yQONvv'ųG[A&>fo/?]g&{wgp~zus9&Y_u7N ;9|s?d^߼ __=%]+ Kg;@vuU7wh`Yq0.7Prrur^r0N ɛO7'mCxP)/O D2>?|j"(60;P:{p.37!ÑGCpP@1 A u!7}<[vb?\1'^i8]<햇{ӗƣf.A-ZJ.\aN<6 (KxX ?dR)C)>,AIJO _KJ;xZk͝&N<aϷAcQIbn8;$)jgq^ND{oӽn>_sѾMv'yPIA:e|\n ⃅o#I zDV|PՖlo%4Bt_1Q۞򫙨uxv6xAW  u^UU \ =)yX+ZX@ S?1̑IO-:[,%HńDw\)F _#Xϖb ah֣hǕ"jO4֮[栄Q =-jR><):Ht>Er#RT)v6q-%RYʬ.f#uv5ӘBO Bmei`Zfpz gQki"(0\)ZCHz>n]aK-}tR;˕ Jnb#̘k)6+;,i(+$iBEc;IvK)y7Ii^FB=¸H젪6QT<*4Te[<2 \$FHf(`0BER'$`:Rh $`$J$h!)8J3BV*U Bjk2, rxb7O̊ ~~kY'm_}yO_ݢTܼy{V}'Ӌ^DxՋMF%?+m.=!UvۋjGwU:hd#mFN&mZčPLTV#9c Bىg`+(3 v*3YQg Cy PJhD#| gŸ7QGPHT$⓸9⠔dW));)^P AB&Ju O(x]9>-lg 
.RQ+MWG$JwU_{4ˤ?v ,Gօ}uOjiFRV2qH}MKٲ܏_R(U6dP_pu=-tϗgiyDN/]s`<{5Kמ-:"¥xdz*xD;]_J/%/+/%%gZxDڕOYa:E*X0F_Bu ub"ޫ2^]3U\mUi?U^o)8 UJ]F.>~ޥvٍt;k'K4=Q\b6s),(`lu}.Ȫ3-b05VRD AB/ >pʈW@ߐ2"KNS@pc$ L2WqC휎aUr#` *_ANࣞe=Gu휊6UV@EO^Hr:#7uCx&wv2;rvcLwqkDd ݪ=SyX{rc3~~]UDD ":׋USt5CDNj!IU<ϵ͹*?؂9ml~UkC ͟_XuN.QכӨ3p,MNoLLyj=(KW (GLʜ٨>i\|U}:2AmB7^%3PO-dYUf5zE.NE}j!%@mUԇtS-=XMv װ ӚCv՚  8yϮOrXeX \r}_0Ri6ׅUGsuugzCtGen㹻ޗYHr+UJ_6jSʫ^GIq0 ̶W)di 94JM*ϫ7;G?G=w *j=O*^UlW+L-7ۦbTuUMyj!m!ӌI`0*~$S`_r*JZ]4).25[ƫDjMES ѫR=gkrTyv^w˳tңegU?iKTu,Ϋ#V^U*bdLCB"IbH"I"I"ǰ䘉%sR]HY=2 >Dr5$1c!͠%ۑC8L$K Ԉj;t W$K1$1'C<6[_1;cgoV߾albllvvQV'^x_oǧvקfVL۞DMbv3|Ź%QmcwqEIS;pRi4pZJ!d ތAs+508";D[v pk=Gάʧ[8c+MA+'2pE:TYmITY8%O?QuV蜌)ro#c O'WTzBfܢ ~<>;nen;n;n;n;n;n;n1-[ŏ-[bybybybybWb-p#la̋8n{rTBsBsbwqUUF.uˊY]̐*u)qXFCrd } [ٗL+"{_*dQU1д~w%؂}PmuDnum l (8^(X |[.W_\PpDJsWW.4ϗW)pA%ŲW֯ \*^Hv5:MXxsD {]P.. uظR.[nҝGu?J رq¥F6ӵ *gVՔܪˮiR 5pv& LiD|'qs `?Y]Z6H(dH()(D!M›uEb%PRwK¦A:^BB@T @|H27K xѢv Y]яʙB $@! >*YJv 8@">Ib; 2dXM$$dT"6 Ȱ/G}!(%%bvb\8Iݿ5z/zˋG?7j~hXlB{e (ӄ=`Yݧ8Q[DR~mSf7R˱Y]9'*vi]տX0K9)],'u(yO`~DI71]p\=՟)K]/u$I^UO'͊jƼҫ58XԆ,8RF+i^^ - j5x`O;IoZZ6%N5C13ʡK}3]w~9fTBgO6L'OQ`Xhû=h/rr jO.溌HFk j^]Zn[K5p6ňsݺ@Ѻxh:;RݽUU״,nWʺmJLeJ+TIX͍2I5A+j¤W#j~4Vsߞ@R-MD`|L-jJdlhQ P&3iDqlx/۩ HOOWz޾} §C3=؋'zЊ[hJ0:) ȠLCt;uD^iЏҮsNim5F 9425004 > 1 > 491 > 5 > 2 > 4 > AAADceB7sQAEAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHwIYAAAADArBniwygJJqAACqAyVyVACSBAAhhwIa+CC4ZtgIYCLB0/CUpAhgmADIyYcAgAAOAAAAAAABAAAAAAAAAAIAAAAAAAAAAA== > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluorophenyl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluorophenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]prop-2-enehydrazide > (E)-3-(2-chloro-6-fluoro-phenyl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]acrylohydrazide > 
InChI=1/C16H16ClFN4O2/c1-10-8-11(2)22(21-10)9-16(24)20-19-15(23)7-6-12-13(17)4-3-5-14(12)18/h3-8H,9H2,1-2H3,(H,19,23)(H,20,24)/b7-6+/f/h19-20H > 2.8 > 350.094582 > C16H16ClFN4O2 > 350.775243 > CC1=CC(=NN1CC(=O)NNC(=O)C=CC2=C(C=CC=C2Cl)F)C > CC1=CC(=NN1CC(=O)NNC(=O)\C=C\C2=C(C=CC=C2Cl)F)C > 76 > 350.094582 > 0 > 24 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 11 12 8 16 19 8 16 20 8 19 22 8 20 23 8 22 24 8 23 24 8 5 6 8 5 9 8 6 12 8 9 11 8 $$$$ 9425009 -OEChem-01150805002D 47 49 0 0 0 0 0 0 0999 V2000 5.2187 -2.0269 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.6701 3.2453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.4554 0.4001 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 4.8633 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 5.0712 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.2688 2.2272 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.8566 1.4182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.9097 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.5277 -2.9780 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 5.7294 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 3.9498 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6309 -1.2179 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 6.4725 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.0377 -0.3044 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6756 3.1408 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 5.8339 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2187 -2.0269 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4499 0.5047 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -3.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -4.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -5.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.5847 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8527 -6.0658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -6.5658 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.6571 3.5038 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5738 4.2965 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2002 -1.6639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1169 -0.8712 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4684 0.1416 0.0000 H 0 0 0 0 0 0 0 
0 0 0 0 0 4.5517 -0.6511 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.5347 7.0790 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 6.4505 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 5.8987 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 5.2173 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6522 2.1624 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7532 6.0288 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1332 7.1027 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9802 6.8758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4732 1.4830 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1217 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -4.7558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1217 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3158 -6.3758 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.7187 -7.1858 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 9 1 0 0 0 0 1 18 1 0 0 0 0 2 16 2 0 0 0 0 3 19 2 0 0 0 0 4 5 1 0 0 0 0 4 10 1 0 0 0 0 4 11 1 0 0 0 0 5 14 2 0 0 0 0 6 7 1 0 0 0 0 6 16 1 0 0 0 0 6 38 1 0 0 0 0 7 19 1 0 0 0 0 7 42 1 0 0 0 0 8 18 2 0 0 0 0 8 21 1 0 0 0 0 9 21 2 0 0 0 0 10 13 2 0 0 0 0 10 17 1 0 0 0 0 11 16 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 12 15 1 0 0 0 0 12 18 1 0 0 0 0 12 30 1 0 0 0 0 12 31 1 0 0 0 0 13 14 1 0 0 0 0 13 34 1 0 0 0 0 14 20 1 0 0 0 0 15 19 1 0 0 0 0 15 32 1 0 0 0 0 15 33 1 0 0 0 0 17 35 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 20 41 1 0 0 0 0 21 22 1 0 0 0 0 22 23 2 0 0 0 0 22 24 1 0 0 0 0 23 25 1 0 0 0 0 23 43 1 0 0 0 0 24 26 2 0 0 0 0 24 44 1 0 0 0 0 25 27 2 0 0 0 0 25 45 1 0 0 0 0 26 27 1 0 0 0 0 26 46 1 0 0 0 0 27 47 1 0 0 0 0 M END > 9425009 > 1 > 513 > 7 > 2 > 6 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAAB8AAAHgAcAAAADAjBnwQzkJZ6EACrAydydgCShAkhgqI7+CG4ZJiIaLLA2fGUpAhknQLIyAc3gAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > 
N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(3-phenyl-1,2,4-oxadiazol-5-yl)propionohydrazide > InChI=1/C18H20N6O3/c1-12-10-13(2)24(22-12)11-16(26)21-20-15(25)8-9-17-19-18(23-27-17)14-6-4-3-5-7-14/h3-7,10H,8-9,11H2,1-2H3,(H,20,25)(H,21,26)/f/h20-21H > 1.9 > 368.159689 > C18H20N6O3 > 368.3898 > CC1=CC(=NN1CC(=O)NNC(=O)CCC2=NC(=NO2)C3=CC=CC=C3)C > CC1=CC(=NN1CC(=O)NNC(=O)CCC2=NC(=NO2)C3=CC=CC=C3)C > 115 > 368.159689 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 12 > 1 18 8 1 9 8 10 13 8 13 14 8 22 23 8 22 24 8 23 25 8 24 26 8 25 27 8 26 27 8 4 10 8 4 5 8 5 14 8 8 18 8 8 21 8 9 21 8 $$$$ 9425012 -OEChem-01150805002D 41 42 0 0 0 0 0 0 0999 V2000 5.0032 -4.0774 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1013 -0.0386 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.6372 -1.0386 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1013 1.9614 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2058 2.9560 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3692 -0.0386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3692 -1.0386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.0032 -4.0774 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0148 1.5547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2353 1.4614 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5032 -2.5386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6942 -3.1264 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.7431 -2.8173 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.6840 2.2978 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.1840 3.1639 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.3122 -3.1264 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2353 0.4614 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2228 0.5765 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5032 -1.5386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.4865 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.5907 4.0774 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2633 -2.8173 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6247 1.3538 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0232 2.0440 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2546 -2.4356 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0342 -2.2699 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3006 2.2330 0.0000 H 0 0 0 0 0 
0 0 0 0 0 0 0 8.8292 0.7055 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3517 -0.0299 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6163 0.4476 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8323 0.2714 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5851 -3.0257 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5393 -3.9013 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4149 -3.9472 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9062 -1.3486 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0243 4.3296 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8429 4.6438 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1571 3.8252 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.4549 -3.4070 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8529 -2.6257 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0717 -2.2277 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 8 1 0 0 0 0 1 16 1 0 0 0 0 2 17 2 0 0 0 0 3 19 2 0 0 0 0 4 5 1 0 0 0 0 4 9 1 0 0 0 0 4 10 1 0 0 0 0 5 15 2 0 0 0 0 6 7 1 0 0 0 0 6 17 1 0 0 0 0 6 31 1 0 0 0 0 7 19 1 0 0 0 0 7 35 1 0 0 0 0 8 12 2 0 0 0 0 9 14 2 0 0 0 0 9 18 1 0 0 0 0 10 17 1 0 0 0 0 10 23 1 0 0 0 0 10 24 1 0 0 0 0 11 12 1 0 0 0 0 11 16 2 0 0 0 0 11 19 1 0 0 0 0 12 13 1 0 0 0 0 13 20 1 0 0 0 0 13 25 1 0 0 0 0 13 26 1 0 0 0 0 14 15 1 0 0 0 0 14 27 1 0 0 0 0 15 21 1 0 0 0 0 16 22 1 0 0 0 0 18 28 1 0 0 0 0 18 29 1 0 0 0 0 18 30 1 0 0 0 0 20 32 1 0 0 0 0 20 33 1 0 0 0 0 20 34 1 0 0 0 0 21 36 1 0 0 0 0 21 37 1 0 0 0 0 21 38 1 0 0 0 0 22 39 1 0 0 0 0 22 40 1 0 0 0 0 22 41 1 0 0 0 0 M END > 9425012 > 1 > 419 > 6 > 2 > 4 > AAADceBzsAAAAAAAAAAAAAAAAAAAAWLAAAAAAAAAAAAAAAAB4AAAHgAcAAAADAzBngQyhJJ6AACrA6VyVgCQBAAlogIyeCG8bFoAZh5I0fKUlchmuBjISUOYAAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-isoxazole-4-carbohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-ethyl-5-methyl-4-isoxazolecarbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-1,2-oxazole-4-carbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-ethyl-5-methyl-1,2-oxazole-4-carbohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-ethyl-5-methyl-isoxazole-4-carbohydrazide > 
InChI=1/C14H19N5O3/c1-5-11-13(10(4)22-18-11)14(21)16-15-12(20)7-19-9(3)6-8(2)17-19/h6H,5,7H2,1-4H3,(H,15,20)(H,16,21)/f/h15-16H > 1 > 305.14879 > C14H19N5O3 > 305.33236 > CCC1=NOC(=C1C(=O)NNC(=O)CN2C(=CC(=N2)C)C)C > CCC1=NOC(=C1C(=O)NNC(=O)CN2C(=CC(=N2)C)C)C > 102 > 305.14879 > 0 > 22 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 1 16 8 1 8 8 11 12 8 11 16 8 14 15 8 4 5 8 4 9 8 5 15 8 8 12 8 9 14 8 $$$$ 9425015 -OEChem-01150805002D 54 56 0 0 0 0 0 0 0999 V2000 8.0902 0.6739 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -3.8261 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -0.3261 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7601 -4.8261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5691 -5.4139 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -2.3261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -1.8261 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9511 -5.4139 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 0.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7601 -3.8261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 -0.3261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 2.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 1.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.2601 -6.3649 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2601 -6.3649 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 4.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 2.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 2.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.6261 -3.3261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -5.1048 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 3.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 3.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 5.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 -0.8261 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8479 -7.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 5.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 5.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3581 6.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0902 6.6739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 7.1739 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.7476 0.5663 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1461 1.2565 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1495 -3.9337 0.0000 H 0 0 0 0 
0 0 0 0 0 0 0 0 3.5480 -3.2435 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9687 -0.2184 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5702 -0.9087 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8956 -6.8665 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 2.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 2.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.1916 -4.5152 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4103 -4.9132 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8084 -5.6945 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 3.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 3.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0892 -2.0161 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.3463 -7.5384 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3494 -6.8095 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2123 -7.6755 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0291 -2.1361 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 5.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 5.3639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8212 6.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6271 6.9839 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 7.7939 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 13 2 0 0 0 0 2 19 2 0 0 0 0 3 24 2 0 0 0 0 4 5 1 0 0 0 0 4 8 1 0 0 0 0 4 10 1 0 0 0 0 5 15 2 0 0 0 0 6 7 1 0 0 0 0 6 19 1 0 0 0 0 6 45 1 0 0 0 0 7 24 1 0 0 0 0 7 49 1 0 0 0 0 8 14 2 0 0 0 0 8 20 1 0 0 0 0 9 11 1 0 0 0 0 9 13 1 0 0 0 0 9 31 1 0 0 0 0 9 32 1 0 0 0 0 10 19 1 0 0 0 0 10 33 1 0 0 0 0 10 34 1 0 0 0 0 11 24 1 0 0 0 0 11 35 1 0 0 0 0 11 36 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 12 18 1 0 0 0 0 14 15 1 0 0 0 0 14 37 1 0 0 0 0 15 25 1 0 0 0 0 16 21 2 0 0 0 0 16 22 1 0 0 0 0 16 23 1 0 0 0 0 17 21 1 0 0 0 0 17 38 1 0 0 0 0 18 22 2 0 0 0 0 18 39 1 0 0 0 0 20 40 1 0 0 0 0 20 41 1 0 0 0 0 20 42 1 0 0 0 0 21 43 1 0 0 0 0 22 44 1 0 0 0 0 23 26 2 0 0 0 0 23 27 1 0 0 0 0 25 46 1 0 0 0 0 25 47 1 0 0 0 0 25 48 1 0 0 0 0 26 28 1 0 0 0 0 26 50 1 0 0 0 0 27 29 2 0 0 0 0 27 51 1 0 0 0 0 28 30 2 0 0 0 0 28 52 1 0 0 0 0 29 30 1 0 0 0 0 29 53 1 0 0 0 0 30 54 1 0 0 0 0 M END > 9425015 > 1 > 597 > 5 > 2 > 7 > 
AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwYAAAAAAAAAAB0AAAHgAYAAAADAzBngQygJJqAACqA6VyVACSBAAlggIa+CG4ZNgIYDLA1fCUpQhgmADIyYcdiMCOwAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-oxo-4-(4-phenylphenyl)butanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-keto-4-(4-phenylphenyl)butyrohydrazide > InChI=1/C23H24N4O3/c1-16-14-17(2)27(26-16)15-23(30)25-24-22(29)13-12-21(28)20-10-8-19(9-11-20)18-6-4-3-5-7-18/h3-11,14H,12-13,15H2,1-2H3,(H,24,29)(H,25,30)/f/h24-25H > 3.3 > 404.184841 > C23H24N4O3 > 404.46166 > CC1=CC(=NN1CC(=O)NNC(=O)CCC(=O)C2=CC=C(C=C2)C3=CC=CC=C3)C > CC1=CC(=NN1CC(=O)NNC(=O)CCC(=O)C2=CC=C(C=C2)C3=CC=CC=C3)C > 93.1 > 404.184841 > 0 > 30 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 12 17 8 12 18 8 14 15 8 16 21 8 16 22 8 17 21 8 18 22 8 23 26 8 23 27 8 26 28 8 27 29 8 28 30 8 29 30 8 4 5 8 4 8 8 5 15 8 8 14 8 $$$$ 9425018 -OEChem-01150805002D 50 51 0 0 0 0 0 0 0999 V2000 5.4641 -2.3080 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 9.7942 0.1920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.3080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 9.7942 2.1920 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 9.8988 3.1865 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.1920 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -0.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.7078 1.7853 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 1.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 2.5284 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.8769 3.3944 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.9157 0.8071 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.2836 4.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.8080 0.0000 C 0 0 0 
0 0 0 0 0 0 0 0 0 7.1962 -1.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -4.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.3176 1.5843 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.7162 2.2746 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.9935 2.4636 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.3092 0.6782 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.0446 0.2007 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5221 0.9360 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.8500 4.0558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5358 4.8744 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.7172 4.5602 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 0.5020 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -1.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9966 -3.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.1995 -3.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -2.4980 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7932 -4.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 -4.1180 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -4.9280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3100 -1.7711 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -1.9980 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6900 -2.8449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2460 -3.8080 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.4280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4860 -3.8080 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 18 1 0 0 0 0 1 20 1 0 0 0 0 2 15 2 0 0 0 0 3 19 2 0 0 0 0 4 23 2 0 0 0 0 5 6 1 0 0 0 0 5 10 1 0 0 0 0 5 11 1 0 0 0 0 6 13 2 0 0 0 0 7 8 1 0 0 0 0 7 15 1 0 0 0 0 7 37 1 0 0 0 0 8 19 1 0 0 0 0 8 38 1 0 0 0 0 9 23 1 0 0 0 0 9 26 1 0 0 0 0 9 27 1 0 0 0 0 10 12 2 0 0 0 0 10 14 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 12 13 1 0 0 0 0 12 30 1 0 0 0 0 13 17 1 0 0 0 0 14 31 1 0 0 0 0 14 32 1 0 0 0 0 14 33 1 0 0 0 0 16 18 1 0 0 0 0 16 
19 1 0 0 0 0 16 21 2 0 0 0 0 17 34 1 0 0 0 0 17 35 1 0 0 0 0 17 36 1 0 0 0 0 18 22 2 0 0 0 0 20 23 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 21 24 1 0 0 0 0 21 41 1 0 0 0 0 22 25 1 0 0 0 0 22 42 1 0 0 0 0 24 25 2 0 0 0 0 24 43 1 0 0 0 0 25 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 26 47 1 0 0 0 0 27 48 1 0 0 0 0 27 49 1 0 0 0 0 27 50 1 0 0 0 0 M END > 9425018 > 1 > 545 > 5 > 2 > 6 > AAADceB7sABAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgQYAAAADAjF3gSygZNqAAiqAyVyVACSBAAlihIa+Dm4ZNgIYDLg1fGUpQhgmgDoyYcYiACOAAAAAAAEAAAAAAAAAAgAAAAAAAAAAA== > 2-[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]sulfanyl-N,N-dimethyl-acetamide > 2-[[2-[[N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]hydrazino]-oxomethyl]phenyl]thio]-N,N-dimethylacetamide > 2-[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]sulfanyl-N,N-dimethylacetamide > 2-[2-[[2-(3,5-dimethylpyrazol-1-yl)ethanoylamino]carbamoyl]phenyl]sulfanyl-N,N-dimethyl-ethanamide > 2-[[2-[[[2-(3,5-dimethylpyrazol-1-yl)acetyl]amino]carbamoyl]phenyl]thio]-N,N-dimethyl-acetamide > InChI=1/C18H23N5O3S/c1-12-9-13(2)23(21-12)10-16(24)19-20-18(26)14-7-5-6-8-15(14)27-11-17(25)22(3)4/h5-9H,10-11H2,1-4H3,(H,19,24)(H,20,26)/f/h19-20H > 1.5 > 389.15216 > C18H23N5O3S > 389.47192 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=CC=C2SCC(=O)N(C)C)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=CC=C2SCC(=O)N(C)C)C > 96.3 > 389.15216 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 10 12 8 12 13 8 16 18 8 16 21 8 18 22 8 21 24 8 22 25 8 24 25 8 5 10 8 5 6 8 6 13 8 $$$$ 9425021 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 9.4664 2.2809 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 7.3432 5.1506 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 9.3676 4.1756 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.3927 -2.5473 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.2807 0.8825 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7359 -3.6675 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5845 -4.1965 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.4225 -1.1124 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.2509 -0.5523 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.9667 5.9324 
0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9667 5.9324 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.5902 5.1506 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.5657 4.1756 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.4667 3.7418 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9706 -4.3112 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.6961 3.6022 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.6650 -2.6701 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.7660 2.5629 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.5605 2.7044 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.3462 -5.2380 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.3437 -5.1671 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.7044 2.1110 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -4.0704 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4934 -2.1099 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9376 2.0028 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.9874 -5.9324 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.0084 1.0053 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.1800 0.4452 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.5253 6.2014 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8287 6.5368 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.1046 6.5368 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4081 6.2014 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1488 4.8816 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.9767 5.6353 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1407 3.8776 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0636 -2.8207 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4122 -2.1040 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0183 -5.7641 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7532 1.4929 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.1493 -3.4686 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3982 -3.9211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8507 -4.6721 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3800 2.2739 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5129 -6.3315 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3864 -6.4069 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4619 -5.5333 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.8650 -0.8413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5660 0.7341 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8085 -0.8235 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 1 0 0 0 0 2 11 1 0 0 0 0 2 13 1 0 0 0 0 3 12 1 0 0 0 0 3 14 1 0 0 0 0 4 24 2 0 0 0 0 5 28 2 0 0 0 0 6 7 1 0 0 
0 0 6 15 1 0 0 0 0 6 17 1 0 0 0 0 7 21 2 0 0 0 0 8 9 1 0 0 0 0 8 24 1 0 0 0 0 8 47 1 0 0 0 0 9 28 1 0 0 0 0 9 49 1 0 0 0 0 10 11 1 0 0 0 0 10 12 1 0 0 0 0 10 29 1 0 0 0 0 10 30 1 0 0 0 0 11 31 1 0 0 0 0 11 32 1 0 0 0 0 12 33 1 0 0 0 0 12 34 1 0 0 0 0 13 14 2 0 0 0 0 13 16 1 0 0 0 0 14 19 1 0 0 0 0 15 20 2 0 0 0 0 15 23 1 0 0 0 0 16 18 2 0 0 0 0 16 35 1 0 0 0 0 17 24 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 18 22 1 0 0 0 0 18 25 1 0 0 0 0 19 22 2 0 0 0 0 20 21 1 0 0 0 0 20 38 1 0 0 0 0 21 26 1 0 0 0 0 22 39 1 0 0 0 0 23 40 1 0 0 0 0 23 41 1 0 0 0 0 23 42 1 0 0 0 0 25 27 2 0 0 0 0 25 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 27 28 1 0 0 0 0 27 48 1 0 0 0 0 M END > 9425021 > 1 > 590 > 6 > 2 > 4 > AAADceB7uAAEAAAAAAAAAAAAAAAAAWAAAAAwAAAABIAAAAABwAAAHgIYAAAADA7hniYyhpJqBACqAyVyVACSDAAhp0Ia+CC+79gNZiPF8/qWvCrl2BHK6YeAwBAOIAABIQAASABAAAJCAACQAAAAAAAAAA== > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]prop-2-enehydrazide > (E)-3-(6-chloro-3,4-dihydro-2H-1,5-benzodioxepin-8-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]acrylohydrazide > InChI=1/C19H21ClN4O4/c1-12-8-13(2)24(23-12)11-18(26)22-21-17(25)5-4-14-9-15(20)19-16(10-14)27-6-3-7-28-19/h4-5,8-10H,3,6-7,11H2,1-2H3,(H,21,25)(H,22,26)/b5-4+/f/h21-22H > 2.6 > 404.125133 > C19H21ClN4O4 > 404.84744 > CC1=CC(=NN1CC(=O)NNC(=O)C=CC2=CC3=C(C(=C2)Cl)OCCCO3)C > CC1=CC(=NN1CC(=O)NNC(=O)\C=C\C2=CC3=C(C(=C2)Cl)OCCCO3)C > 94.5 > 404.125133 > 0 > 28 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 13 14 8 13 16 8 14 19 8 15 20 8 16 18 8 18 22 8 19 22 8 20 21 8 6 15 8 6 7 8 7 21 8 $$$$ 9425030 -OEChem-01150805002D 60 62 0 1 0 0 0 0 0999 V2000 
9.6449 -2.7859 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.4495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -1.9495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.5505 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.4495 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -0.9495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 1.0505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1667 -3.4440 0.0000 N 0 3 0 0 0 0 0 0 0 0 0 0 4.5981 -0.4495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.4495 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 2.5505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.4495 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 6.3301 0.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.4495 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 8.0622 -0.4495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9757 -2.0428 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.9495 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 9.1449 -3.6519 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -0.4495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 2.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -0.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -1.9495 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 3.5505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 4.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 4.0505 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.0695 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 1.1331 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 0.4428 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5422 -2.7872 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 -0.3419 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 -1.0321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 1.1331 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 0.4428 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5501 -3.5088 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0378 -4.0505 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4773 -1.6783 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6657 -1.5058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -1.2595 0.0000 H 0 0 0 0 0 0 0 0 0 0 
0 0 8.9533 -4.2416 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7113 -3.9041 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.1705 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4675 0.0254 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2646 0.0254 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.0572 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -0.3669 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.8419 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -2.5321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.0695 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 2.2405 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 3.2405 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8862 3.5136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.6592 4.3605 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5062 4.5874 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6182 4.5874 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4651 4.3605 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2382 3.5136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 18 1 0 0 0 0 1 21 1 0 0 0 0 2 17 2 0 0 0 0 3 19 2 0 0 0 0 4 23 2 0 0 0 0 5 25 2 0 0 0 0 6 12 1 0 0 0 0 6 15 1 0 0 0 0 6 17 1 0 0 0 0 7 13 1 0 0 0 0 7 16 1 0 0 0 0 7 23 1 0 0 0 0 8 14 1 0 0 0 0 8 21 1 0 0 0 0 8 38 1 0 0 0 0 8 39 1 0 0 0 0 9 19 1 0 0 0 0 20 9 1 6 0 0 0 9 45 1 0 0 0 0 10 25 1 0 0 0 0 10 26 1 0 0 0 0 10 52 1 0 0 0 0 11 23 1 0 0 0 0 11 27 1 0 0 0 0 11 53 1 0 0 0 0 12 13 1 0 0 0 0 12 19 1 1 0 0 0 12 30 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 14 17 1 6 0 0 0 14 18 1 0 0 0 0 14 33 1 0 0 0 0 15 16 1 0 0 0 0 15 34 1 0 0 0 0 15 35 1 0 0 0 0 16 36 1 0 0 0 0 16 37 1 0 0 0 0 18 40 1 0 0 0 0 18 41 1 0 0 0 0 20 22 1 0 0 0 0 20 25 1 0 0 0 0 20 42 1 0 0 0 0 21 43 1 0 0 0 0 21 44 1 0 0 0 0 22 24 1 0 0 0 0 22 46 1 0 0 0 0 22 47 1 0 0 0 0 24 26 1 0 0 0 0 24 48 1 0 0 0 0 24 49 1 0 0 0 0 26 50 1 0 0 0 0 26 51 1 0 0 0 0 27 28 1 0 0 0 0 27 29 1 0 0 0 0 27 54 1 0 0 0 0 28 55 1 0 0 0 0 28 56 1 0 0 0 0 28 57 1 0 0 0 0 29 58 1 0 0 0 0 29 59 1 0 0 0 0 29 60 1 0 0 0 0 M CHG 1 8 1 M END > 9425030 > 1 > 660 > 4 > 4 > 4 > 
AAADceB7uABAAAAAAAAAAAAAAAAAAWAAAAAsWAAAAAAAAAAAAAAAHgQQAAAACCjFwASDAAPAAAgIAAEQEAAAAABAABAAAIGIAACAQBogwCAUAAAIFgKAAAAYAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidyl]-4-[(4R)-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidinyl]-4-[oxo-[(4R)-4-thiazolidin-3-iumyl]methyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[(4R)-1,3-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[[(4R)-1,3-thiazolidin-3-ium-4-yl]carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-keto-3-piperidyl]-4-[(4R)-thiazolidin-3-ium-4-carbonyl]piperazine-1,3-dicarboxamide > InChI=1/C18H30N6O4S/c1-11(2)21-18(28)23-6-7-24(17(27)13-9-29-10-20-13)14(8-23)16(26)22-12-4-3-5-19-15(12)25/h11-14,20H,3-10H2,1-2H3,(H,19,25)(H,21,28)(H,22,26)/p+1/t12-,13-,14+/m0/s1/fC18H31N6O4S/h19-22H/q+1 > 427.212749 > C18H31N6O4S+ > 427.54154 > CC(C)NC(=O)N1CCN(C(C1)C(=O)NC2CCCNC2=O)C(=O)C3CSC[NH2+]3 > CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3 > 128 > 427.212749 > 1 > 29 > 3 > 0 > 0 > 0 > 0 > 1 > 8 > 12 19 5 14 17 6 20 9 6 $$$$ 9425031 -OEChem-01150805002D 59 61 0 1 0 0 0 0 0999 V2000 9.6449 -2.8817 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -2.5453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -2.0453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.4547 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.5453 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -1.0453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 0.9547 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.5453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.1667 -3.5398 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.5453 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 2.4547 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.5453 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 6.3301 0.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -0.5453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 0.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 
-2.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -1.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 -2.5453 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 3.7321 -1.0453 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 2.8660 -0.5453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9757 -2.1386 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -1.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 1.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.0453 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.1449 -3.7477 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0622 3.4547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9282 3.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 3.9547 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.1653 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1181 1.0373 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7195 0.3470 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 -1.1279 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 -0.4376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6728 0.3470 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.2742 1.0373 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5422 -2.8830 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.4253 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2646 -0.0704 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4675 -0.0704 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.0747 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6657 -1.6016 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4773 -1.7741 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -0.4627 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.1530 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.7060 -3.9547 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3894 -1.9376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7879 -2.6279 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9533 -4.3374 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7113 -3.9999 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.1653 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5991 2.1447 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5252 3.1447 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2382 3.4178 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4651 4.2647 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.6182 4.4916 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5062 4.4916 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 
6.6592 4.2647 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8862 3.4178 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 21 1 0 0 0 0 1 26 1 0 0 0 0 2 16 2 0 0 0 0 3 17 2 0 0 0 0 4 23 2 0 0 0 0 5 24 2 0 0 0 0 6 12 1 0 0 0 0 6 14 1 0 0 0 0 6 16 1 0 0 0 0 7 13 1 0 0 0 0 7 15 1 0 0 0 0 7 23 1 0 0 0 0 8 17 1 0 0 0 0 19 8 1 6 0 0 0 8 41 1 0 0 0 0 9 18 1 0 0 0 0 9 26 1 0 0 0 0 9 46 1 0 0 0 0 10 24 1 0 0 0 0 10 25 1 0 0 0 0 10 51 1 0 0 0 0 11 23 1 0 0 0 0 11 27 1 0 0 0 0 11 52 1 0 0 0 0 12 13 1 0 0 0 0 12 17 1 1 0 0 0 12 30 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 14 15 1 0 0 0 0 14 33 1 0 0 0 0 14 34 1 0 0 0 0 15 35 1 0 0 0 0 15 36 1 0 0 0 0 18 16 1 6 0 0 0 18 21 1 0 0 0 0 18 37 1 0 0 0 0 19 20 1 0 0 0 0 19 24 1 0 0 0 0 19 38 1 0 0 0 0 20 22 1 0 0 0 0 20 39 1 0 0 0 0 20 40 1 0 0 0 0 21 42 1 0 0 0 0 21 43 1 0 0 0 0 22 25 1 0 0 0 0 22 44 1 0 0 0 0 22 45 1 0 0 0 0 25 47 1 0 0 0 0 25 48 1 0 0 0 0 26 49 1 0 0 0 0 26 50 1 0 0 0 0 27 28 1 0 0 0 0 27 29 1 0 0 0 0 27 53 1 0 0 0 0 28 54 1 0 0 0 0 28 55 1 0 0 0 0 28 56 1 0 0 0 0 29 57 1 0 0 0 0 29 58 1 0 0 0 0 29 59 1 0 0 0 0 M END > 9425031 > 1 > 660 > 5 > 4 > 4 > AAADceB7uABAAAAAAAAAAAAAAAAAAWAAAAAsWAAAAAAAAAAAAAAAHgQQAAAACCjFwASDAAPAAAgIAAEQEAAAAABAABAAAIGIAACAQBogwCAUAAAIFgKAAAAYAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidyl]-4-[(4R)-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-oxo-3-piperidinyl]-4-[oxo-[(4R)-4-thiazolidinyl]methyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[(4R)-1,3-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > (3R)-N'-[(3S)-2-oxopiperidin-3-yl]-N-propan-2-yl-4-[[(4R)-1,3-thiazolidin-4-yl]carbonyl]piperazine-1,3-dicarboxamide > (3R)-N-isopropyl-N'-[(3S)-2-keto-3-piperidyl]-4-[(4R)-thiazolidine-4-carbonyl]piperazine-1,3-dicarboxamide > 
InChI=1/C18H30N6O4S/c1-11(2)21-18(28)23-6-7-24(17(27)13-9-29-10-20-13)14(8-23)16(26)22-12-4-3-5-19-15(12)25/h11-14,20H,3-10H2,1-2H3,(H,19,25)(H,21,28)(H,22,26)/t12-,13-,14+/m0/s1/f/h19,21-22H > -0.9 > 426.204924 > C18H30N6O4S > 426.5336 > CC(C)NC(=O)N1CCN(C(C1)C(=O)NC2CCCNC2=O)C(=O)C3CSCN3 > CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSCN3 > 123 > 426.204924 > 0 > 29 > 3 > 0 > 0 > 0 > 0 > 1 > 8 > 12 17 5 18 16 6 19 8 6 $$$$ 9425032 -OEChem-01150805002D 42 43 0 0 0 0 0 0 0999 V2000 3.7321 2.0761 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -1.4239 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 3.0761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6551 3.6639 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.5761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.0761 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.2731 3.6639 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9641 4.6149 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.9641 4.6149 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -1.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.2242 3.3548 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.5761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.3763 5.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.9239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -5.4239 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.0747 2.1837 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6762 1.4935 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.3285 5.1165 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2554 -1.5316 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6540 -0.8413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.0326 2.7652 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8138 3.1632 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4158 3.9445 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -2.6139 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -2.6139 0.0000 H 0 0 
0 0 0 0 0 0 0 0 0 0 4.8779 5.7884 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0119 5.9255 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.8747 5.0595 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 0.2661 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -4.2339 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -4.2339 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.1951 0.3861 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2460 -5.4239 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4860 -5.4239 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -6.0439 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 17 2 0 0 0 0 3 4 1 0 0 0 0 3 7 1 0 0 0 0 3 8 1 0 0 0 0 4 10 2 0 0 0 0 5 6 1 0 0 0 0 5 14 1 0 0 0 0 5 36 1 0 0 0 0 6 17 1 0 0 0 0 6 39 1 0 0 0 0 7 9 2 0 0 0 0 7 13 1 0 0 0 0 8 14 1 0 0 0 0 8 23 1 0 0 0 0 8 24 1 0 0 0 0 9 10 1 0 0 0 0 9 25 1 0 0 0 0 10 19 1 0 0 0 0 11 12 1 0 0 0 0 11 15 2 0 0 0 0 11 16 1 0 0 0 0 12 17 1 0 0 0 0 12 26 1 0 0 0 0 12 27 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 13 30 1 0 0 0 0 15 20 1 0 0 0 0 15 31 1 0 0 0 0 16 21 2 0 0 0 0 16 32 1 0 0 0 0 18 20 2 0 0 0 0 18 21 1 0 0 0 0 18 22 1 0 0 0 0 19 33 1 0 0 0 0 19 34 1 0 0 0 0 19 35 1 0 0 0 0 20 37 1 0 0 0 0 21 38 1 0 0 0 0 22 40 1 0 0 0 0 22 41 1 0 0 0 0 22 42 1 0 0 0 0 M END > 9425032 > 1 > 394 > 4 > 2 > 4 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgAYAAAADAjBngQygJJqAACqAyVyVACSBAAhggIa+CG4ZJgIYDLA1fGUpAhgmADIyAcYiMCOQAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > 2-(3,5-dimethyl-1-pyrazolyl)-N'-[2-(4-methylphenyl)-1-oxoethyl]acetohydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)ethanoyl]ethanehydrazide > 2-(3,5-dimethylpyrazol-1-yl)-N'-[2-(4-methylphenyl)acetyl]acetohydrazide > InChI=1/C16H20N4O2/c1-11-4-6-14(7-5-11)9-15(21)17-18-16(22)10-20-13(3)8-12(2)19-20/h4-8H,9-10H2,1-3H3,(H,17,21)(H,18,22)/f/h17-18H > 2 > 300.158626 > C16H20N4O2 > 300.3556 > CC1=CC=C(C=C1)CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > 
CC1=CC=C(C=C1)CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > 76 > 300.158626 > 0 > 22 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 11 15 8 11 16 8 15 20 8 16 21 8 18 20 8 18 21 8 3 4 8 3 7 8 4 10 8 7 9 8 9 10 8 $$$$ 9425033 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 3.5878 -5.6967 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 -2.6579 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.5519 2.8421 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 1.8421 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.5519 4.8421 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.4654 4.4354 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 2.8421 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 1.8421 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.2788 -4.7457 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.6564 5.8366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6859 4.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.6346 6.0445 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.1346 5.1785 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.9133 6.5058 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6859 3.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.1291 5.0740 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 0.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 -1.6579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -3.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.9538 1.3421 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 -0.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -0.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8198 -1.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -1.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.0878 -4.1579 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.8968 -4.7457 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5878 -5.6967 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -6.5058 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.0753 4.2344 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4738 4.9247 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8867 6.6109 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.3281 6.9665 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4525 6.9206 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.4984 6.0450 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.0643 4.4574 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.7457 5.0092 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1939 5.6906 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 
4.2829 3.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.4772 -3.2656 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8757 -2.5753 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 0.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5508 0.1521 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 1.5321 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.3568 -1.4679 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5508 -1.4679 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.4865 -4.5541 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4984 -6.1413 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6356 -7.0073 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5016 -6.8702 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 26 1 0 0 0 0 1 27 1 0 0 0 0 2 18 1 0 0 0 0 2 19 1 0 0 0 0 3 15 2 0 0 0 0 4 20 2 0 0 0 0 5 6 1 0 0 0 0 5 10 1 0 0 0 0 5 11 1 0 0 0 0 6 13 2 0 0 0 0 7 8 1 0 0 0 0 7 15 1 0 0 0 0 7 38 1 0 0 0 0 8 20 1 0 0 0 0 8 43 1 0 0 0 0 9 25 1 0 0 0 0 9 27 2 0 0 0 0 10 12 2 0 0 0 0 10 14 1 0 0 0 0 11 15 1 0 0 0 0 11 29 1 0 0 0 0 11 30 1 0 0 0 0 12 13 1 0 0 0 0 12 31 1 0 0 0 0 13 16 1 0 0 0 0 14 32 1 0 0 0 0 14 33 1 0 0 0 0 14 34 1 0 0 0 0 16 35 1 0 0 0 0 16 36 1 0 0 0 0 16 37 1 0 0 0 0 17 20 1 0 0 0 0 17 21 2 0 0 0 0 17 22 1 0 0 0 0 18 23 2 0 0 0 0 18 24 1 0 0 0 0 19 25 1 0 0 0 0 19 39 1 0 0 0 0 19 40 1 0 0 0 0 21 23 1 0 0 0 0 21 41 1 0 0 0 0 22 24 2 0 0 0 0 22 42 1 0 0 0 0 23 44 1 0 0 0 0 24 45 1 0 0 0 0 25 26 2 0 0 0 0 26 46 1 0 0 0 0 27 28 1 0 0 0 0 28 47 1 0 0 0 0 28 48 1 0 0 0 0 28 49 1 0 0 0 0 M END > 9425033 > 1 > 544 > 6 > 2 > 6 > AAADceB7sABAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAAB8AAAHgQYAAAADAzl3gayh5JqFAiuAyVyVASS/KBlqjoa+DW+bNgOZjLk9fuXvSjk2BH46YeY3ADOIAAAAAAAAABAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methylthiazol-4-yl)methoxy]benzohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-[(2-methyl-4-thiazolyl)methoxy]benzohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methyl-1,3-thiazol-4-yl)methoxy]benzohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-[(2-methyl-1,3-thiazol-4-yl)methoxy]benzohydrazide > 
N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-[(2-methylthiazol-4-yl)methoxy]benzohydrazide > InChI=1/C19H21N5O3S/c1-12-8-13(2)24(23-12)9-18(25)21-22-19(26)15-4-6-17(7-5-15)27-10-16-11-28-14(3)20-16/h4-8,11H,9-10H2,1-3H3,(H,21,25)(H,22,26)/f/h21-22H > 2.1 > 399.13651 > C19H21N5O3S > 399.46674 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=C(C=C2)OCC3=CSC(=N3)C)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC=C(C=C2)OCC3=CSC(=N3)C)C > 98.1 > 399.13651 > 0 > 28 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 1 26 8 1 27 8 10 12 8 12 13 8 17 21 8 17 22 8 18 23 8 18 24 8 21 23 8 22 24 8 25 26 8 5 10 8 5 6 8 6 13 8 9 25 8 9 27 8 $$$$ 9425034 -OEChem-01150805002D 46 47 0 0 0 0 0 0 0999 V2000 6.3301 1.9182 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 0.9182 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 3.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2437 3.5114 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.9182 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.4347 4.9127 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 3.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4128 5.1206 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 4.2546 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.6915 5.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -4.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -5.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 4.1501 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -2.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -2.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -5.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -1.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.5818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.4182 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.8535 3.3105 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2520 4.0008 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6650 5.6870 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2766 5.1211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.2308 5.9967 0.0000 H 0 0 0 0 
0 0 0 0 0 0 0 0 6.1064 6.0426 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.4766 -4.9742 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0781 -5.6644 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9721 4.7667 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.5239 4.0852 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.8425 3.5334 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -3.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -3.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -2.2718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -2.2718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.6900 -5.0449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 -5.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3100 -6.1188 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0611 2.2282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 -0.7718 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 0.6082 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2690 -0.8918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 24 2 0 0 0 0 3 4 1 0 0 0 0 3 7 1 0 0 0 0 3 8 1 0 0 0 0 4 10 2 0 0 0 0 5 6 1 0 0 0 0 5 14 1 0 0 0 0 5 43 1 0 0 0 0 6 24 1 0 0 0 0 6 45 1 0 0 0 0 7 9 2 0 0 0 0 7 11 1 0 0 0 0 8 14 1 0 0 0 0 8 25 1 0 0 0 0 8 26 1 0 0 0 0 9 10 1 0 0 0 0 9 27 1 0 0 0 0 10 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 11 30 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 12 18 1 0 0 0 0 13 21 1 0 0 0 0 13 31 1 0 0 0 0 13 32 1 0 0 0 0 15 33 1 0 0 0 0 15 34 1 0 0 0 0 15 35 1 0 0 0 0 16 19 2 0 0 0 0 16 20 1 0 0 0 0 16 22 1 0 0 0 0 17 19 1 0 0 0 0 17 36 1 0 0 0 0 18 20 2 0 0 0 0 18 37 1 0 0 0 0 19 38 1 0 0 0 0 20 39 1 0 0 0 0 21 40 1 0 0 0 0 21 41 1 0 0 0 0 21 42 1 0 0 0 0 22 23 2 0 0 0 0 22 44 1 0 0 0 0 23 24 1 0 0 0 0 23 46 1 0 0 0 0 M END > 9425034 > 1 > 458 > 4 > 2 > 5 > AAADceB7sAAAAAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgAYAAAADAjBngQygJJqAACqAyVyVACSBAAhggIa+CC4ZNgIYCLA0fCUpAhgmADIyYcAgMAOQAAAAAAAAACAAAAAAAAAAAAAAAAAAA== > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(4-ethylphenyl)prop-2-enehydrazide > 
(E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(4-ethylphenyl)prop-2-enehydrazide > (E)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(4-ethylphenyl)acrylohydrazide > InChI=1/C18H22N4O2/c1-4-15-5-7-16(8-6-15)9-10-17(23)19-20-18(24)12-22-14(3)11-13(2)21-22/h5-11H,4,12H2,1-3H3,(H,19,23)(H,20,24)/b10-9+/f/h19-20H > 2.9 > 326.174276 > C18H22N4O2 > 326.39288 > CCC1=CC=C(C=C1)C=CC(=O)NNC(=O)CN2C(=CC(=N2)C)C > CCC1=CC=C(C=C1)\C=C\C(=O)NNC(=O)CN2C(=CC(=N2)C)C > 76 > 326.174276 > 0 > 24 > 0 > 0 > 1 > 0 > 0 > 1 > 4 > 12 17 8 12 18 8 16 19 8 16 20 8 17 19 8 18 20 8 3 4 8 3 7 8 4 10 8 7 9 8 9 10 8 $$$$ 9425035 -OEChem-01150805002D 47 49 0 0 0 0 0 0 0999 V2000 2.0000 2.9045 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 5.4921 4.1833 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 1.4045 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -2.0955 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 3.9045 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -3.0955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3871 -3.6833 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 -0.0955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -0.5955 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.9230 4.4923 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.2321 5.4434 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.2321 5.4434 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5411 4.4923 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 2.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 2.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 2.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 1.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 1.4045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0052 -3.6833 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1962 -2.0955 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 0.9045 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.6962 -4.6343 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.6962 -4.6343 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 -1.5955 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9562 -3.3743 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1084 -5.4434 0.0000 C 0 0 0 
0 0 0 0 0 0 0 0 0 2.3566 4.7445 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6130 3.9554 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2969 6.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6256 5.5723 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8385 5.5723 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.1672 6.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 2.7145 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 1.0945 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 0.2845 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.4082 -1.5129 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.8067 -2.2032 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0606 -5.1359 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9272 -0.4055 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1478 -3.9639 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.5459 -3.1827 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.7646 -2.7846 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8671 -0.2855 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6068 -5.0789 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7439 -5.9449 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.6100 -5.8078 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 16 1 0 0 0 0 2 13 2 0 0 0 0 3 22 2 0 0 0 0 4 25 2 0 0 0 0 5 10 1 0 0 0 0 5 13 1 0 0 0 0 5 14 1 0 0 0 0 6 7 1 0 0 0 0 6 20 1 0 0 0 0 6 21 1 0 0 0 0 7 24 2 0 0 0 0 8 9 1 0 0 0 0 8 22 1 0 0 0 0 8 40 1 0 0 0 0 9 25 1 0 0 0 0 9 44 1 0 0 0 0 10 11 1 0 0 0 0 10 28 1 0 0 0 0 10 29 1 0 0 0 0 11 12 1 0 0 0 0 11 30 1 0 0 0 0 11 31 1 0 0 0 0 12 13 1 0 0 0 0 12 32 1 0 0 0 0 12 33 1 0 0 0 0 14 15 1 0 0 0 0 14 16 2 0 0 0 0 15 17 2 0 0 0 0 15 34 1 0 0 0 0 16 18 1 0 0 0 0 17 19 1 0 0 0 0 17 22 1 0 0 0 0 18 19 2 0 0 0 0 18 35 1 0 0 0 0 19 36 1 0 0 0 0 20 23 2 0 0 0 0 20 26 1 0 0 0 0 21 25 1 0 0 0 0 21 37 1 0 0 0 0 21 38 1 0 0 0 0 23 24 1 0 0 0 0 23 39 1 0 0 0 0 24 27 1 0 0 0 0 26 41 1 0 0 0 0 26 42 1 0 0 0 0 26 43 1 0 0 0 0 27 45 1 0 0 0 0 27 46 1 0 0 0 0 27 47 1 0 0 0 0 M END > 9425035 > 1 > 589 > 5 > 2 > 4 > AAADceB7sAAEAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAAABwAAAHgIYAAAADArBniQywJNqAACqAyVyVACSBAAlhwIa+CG4ZtgIYDLB1/HUpQhgngDIyYcciACOBAAAQAAAABAIAACAAAAAIAAAAAAAAA== > 
4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(2-oxo-1-pyrrolidinyl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(2-oxopyrrolidin-1-yl)benzohydrazide > 4-chloro-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(2-ketopyrrolidin-1-yl)benzohydrazide > InChI=1/C18H20ClN5O3/c1-11-8-12(2)24(22-11)10-16(25)20-21-18(27)13-5-6-14(19)15(9-13)23-7-3-4-17(23)26/h5-6,8-9H,3-4,7,10H2,1-2H3,(H,20,25)(H,21,27)/f/h20-21H > 1.7 > 389.125467 > C18H20ClN5O3 > 389.8361 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)Cl)N3CCCC3=O)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)Cl)N3CCCC3=O)C > 96.3 > 389.125467 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 14 15 8 14 16 8 15 17 8 16 18 8 17 19 8 18 19 8 20 23 8 23 24 8 6 20 8 6 7 8 7 24 8 $$$$ 9425036 -OEChem-01150805002D 46 48 0 0 0 0 0 0 0999 V2000 8.9073 -1.3318 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 1.1682 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 12.3714 -0.3318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.1682 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 -1.8318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 0.1682 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 -0.2386 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 13.3176 -0.0271 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.3318 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 13.3176 -1.6366 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.6394 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6394 -1.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.3714 -1.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 1.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 1.1627 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -0.3318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -1.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 1.3706 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 
0.5046 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.1682 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 1.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.9013 -0.8318 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 0.4001 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.1719 0.6431 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.3748 0.6431 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1254 1.1682 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5054 1.7882 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.8854 1.1682 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8418 -0.8068 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.0447 -0.8068 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4634 -1.2949 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2364 -2.1418 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.0834 -2.3688 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 1.9370 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 0.7882 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6307 1.3711 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 2.2467 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 2.2926 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.9518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5213 -0.8318 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 1.0167 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 -0.2166 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 0.3352 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 19 2 0 0 0 0 2 23 2 0 0 0 0 3 8 1 0 0 0 0 3 12 1 0 0 0 0 3 15 1 0 0 0 0 4 7 1 0 0 0 0 4 17 1 0 0 0 0 4 18 1 0 0 0 0 5 14 2 0 0 0 0 5 15 1 0 0 0 0 6 9 1 0 0 0 0 6 19 1 0 0 0 0 6 38 1 0 0 0 0 7 22 2 0 0 0 0 8 25 2 0 0 0 0 9 23 1 0 0 0 0 9 42 1 0 0 0 0 10 15 2 0 0 0 0 10 25 1 0 0 0 0 11 12 2 0 0 0 0 11 13 1 0 0 0 0 11 14 1 0 0 0 0 12 16 1 0 0 0 0 13 19 1 0 0 0 0 13 27 1 0 0 0 0 13 28 1 0 0 0 0 14 20 1 0 0 0 0 16 29 1 0 0 0 0 16 30 1 0 0 0 0 16 31 1 0 0 0 0 17 23 1 0 0 0 0 17 32 1 0 0 0 0 17 33 1 0 0 0 0 18 21 2 0 0 0 0 18 24 1 0 0 0 0 20 34 1 0 0 0 0 20 35 1 0 0 0 0 20 36 1 0 0 0 0 21 22 1 0 0 0 0 21 37 1 0 0 0 0 22 26 1 0 0 0 0 24 39 1 0 0 0 0 24 40 1 0 0 0 0 24 41 1 0 0 0 0 25 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 26 46 1 0 0 0 0 M END > 9425036 > 1 > 532 > 8 > 2 > 4 > 
AAADceB78AAAAAAAAAAAAAAAAAAAAWLAAAAsAAAAAAAAAFgB+AAAHgAYAAAADAjBngQ3kJZqEACqAyVzdACQhCsxgqIXeCG4ZBiAaBJAzfEUhAhoGALISCIcAAAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-(5,7-dimethyl-[1,2,4]triazolo[1,5-a]pyrimidin-6-yl)acetohydrazide > InChI=1/C16H20N8O2/c1-9-5-10(2)23(22-9)7-15(26)21-20-14(25)6-13-11(3)19-16-17-8-18-24(16)12(13)4/h5,8H,6-7H2,1-4H3,(H,20,25)(H,21,26)/f/h20-21H > -1.5 > 356.170922 > C16H20N8O2 > 356.3824 > CC1=CC(=NN1CC(=O)NNC(=O)CC2=C(N3C(=NC=N3)N=C2C)C)C > CC1=CC(=NN1CC(=O)NNC(=O)CC2=C(N3C(=NC=N3)N=C2C)C)C > 119 > 356.170922 > 0 > 26 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 10 15 8 10 25 8 11 12 8 11 14 8 18 21 8 21 22 8 3 12 8 3 15 8 3 8 8 4 18 8 4 7 8 5 14 8 5 15 8 7 22 8 8 25 8 $$$$ 9425037 -OEChem-01150805002D 42 43 0 0 0 0 0 0 0999 V2000 8.7788 5.5580 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -1.4420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 -0.4420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 3.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -3.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.4781 -4.4365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -1.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -0.4420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 1.5580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 2.5580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 2.6691 -3.0353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -2.9420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 -3.7784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5000 -4.6444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.4612 -2.0571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -1.9420 0.0000 C 0 0 0 0 0 0 0 0 0 
0 0 0 2.0933 -5.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 1.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 0.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 2.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 4.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 4.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 4.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.6608 -3.5246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.0593 -2.8343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 -3.7136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0677 -1.9282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3323 -1.4507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8548 -2.1860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6597 -5.8102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8411 -6.1244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5269 -5.3058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.8517 -1.7520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.9687 1.6406 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.5702 0.9503 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.7778 -0.1320 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5837 1.2480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.3759 4.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1818 4.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.1818 2.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 23 1 0 0 0 0 2 16 2 0 0 0 0 3 19 2 0 0 0 0 4 20 2 0 0 0 0 5 6 1 0 0 0 0 5 11 1 0 0 0 0 5 12 1 0 0 0 0 6 14 2 0 0 0 0 7 8 1 0 0 0 0 7 16 1 0 0 0 0 7 35 1 0 0 0 0 8 19 1 0 0 0 0 8 38 1 0 0 0 0 9 18 1 0 0 0 0 9 20 1 0 0 0 0 9 39 1 0 0 0 0 10 21 2 0 0 0 0 10 25 1 0 0 0 0 11 13 2 0 0 0 0 11 15 1 0 0 0 0 12 16 1 0 0 0 0 12 26 1 0 0 0 0 12 27 1 0 0 0 0 13 14 1 0 0 0 0 13 28 1 0 0 0 0 14 17 1 0 0 0 0 15 29 1 0 0 0 0 15 30 1 0 0 0 0 15 31 1 0 0 0 0 17 32 1 0 0 0 0 17 33 1 0 0 0 0 17 34 1 0 0 0 0 18 19 1 0 0 0 0 18 36 1 0 0 0 0 18 37 1 0 0 0 0 20 21 1 0 0 0 0 21 22 1 0 0 0 0 22 23 2 0 0 0 0 22 40 1 0 0 0 0 23 24 1 0 0 0 0 24 25 2 0 0 0 0 24 41 1 0 0 0 0 25 42 1 0 0 0 0 M END > 9425037 > 1 > 506 > 6 > 3 > 5 > 
AAADceBzsAAEAAAAAAAAAAAAAAAAAWAAAAAsAAAAAAAAAAAB4AAAHgIYAAAACArBliQ+gJLqEACqATV3VACShCA3hyIa+KG4ZtgIYHLB1/GUpQhgngDIyYcYCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== > 4-chloro-N-[2-[N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazino]-2-oxo-ethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]hydrazino]-2-oxoethyl]-2-pyridinecarboxamide > 4-chloro-N-[2-[2-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazinyl]-2-oxoethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[2-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]hydrazinyl]-2-oxo-ethyl]pyridine-2-carboxamide > 4-chloro-N-[2-[N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]hydrazino]-2-keto-ethyl]picolinamide > InChI=1/C15H17ClN6O3/c1-9-5-10(2)22(21-9)8-14(24)20-19-13(23)7-18-15(25)12-6-11(16)3-4-17-12/h3-6H,7-8H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/f/h18-20H > 0.4 > 364.105066 > C15H17ClN6O3 > 364.78688 > CC1=CC(=NN1CC(=O)NNC(=O)CNC(=O)C2=NC=CC(=C2)Cl)C > CC1=CC(=NN1CC(=O)NNC(=O)CNC(=O)C2=NC=CC(=C2)Cl)C > 118 > 364.105066 > 0 > 25 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 10 21 8 10 25 8 11 13 8 13 14 8 21 22 8 22 23 8 23 24 8 24 25 8 5 11 8 5 6 8 6 14 8 $$$$ 9425040 -OEChem-01150805002D 49 51 0 0 0 0 0 0 0999 V2000 9.6449 1.1920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 -1.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 -1.8080 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 0.6920 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 10.5109 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.5827 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 -0.3080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 -0.8080 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.4781 0.6865 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 1.1920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.6449 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2429 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2429 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.5109 0.6920 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3769 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.7788 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.1369 1.2267 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.1369 -0.8427 
0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.0429 0.7128 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.0429 -0.3288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4487 -0.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.6691 -0.7147 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 0.0284 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.5000 0.8944 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.3147 -0.3080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.4612 -1.6929 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0933 1.8080 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.9784 1.6670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.7754 1.6670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.0434 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.2463 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3803 0.1670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.1774 0.1670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.1297 1.8466 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.1297 -1.4626 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5787 1.0249 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.5787 -0.6409 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8472 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0502 -1.2829 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 -0.0364 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.0468 0.3120 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1808 -1.4280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8548 -1.5640 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3323 -2.2993 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.0677 -1.8218 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.6597 2.0602 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5269 1.5558 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8411 2.3744 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 14 2 0 0 0 0 2 15 2 0 0 0 0 3 19 2 0 0 0 0 4 26 2 0 0 0 0 5 11 1 0 0 0 0 5 14 1 0 0 0 0 5 15 1 0 0 0 0 6 9 1 0 0 0 0 6 22 1 0 0 0 0 6 23 1 0 0 0 0 7 8 1 0 0 0 0 7 19 1 0 0 0 0 7 42 1 0 0 0 0 8 26 1 0 0 0 0 8 43 1 0 0 0 0 9 25 2 0 0 0 0 10 12 1 0 0 0 0 10 14 1 0 0 0 0 10 29 1 0 0 0 0 10 30 1 0 0 0 0 11 16 1 0 0 0 0 11 31 1 0 0 0 0 11 32 1 0 0 0 0 12 13 1 0 0 0 0 12 17 2 0 0 0 0 13 15 1 0 0 0 0 13 18 2 0 0 0 0 16 19 1 0 0 0 0 16 33 1 0 0 0 0 16 34 1 0 0 0 0 17 20 1 0 0 0 0 17 35 
1 0 0 0 0 18 21 1 0 0 0 0 18 36 1 0 0 0 0 20 21 2 0 0 0 0 20 37 1 0 0 0 0 21 38 1 0 0 0 0 22 26 1 0 0 0 0 22 39 1 0 0 0 0 22 40 1 0 0 0 0 23 24 2 0 0 0 0 23 27 1 0 0 0 0 24 25 1 0 0 0 0 24 41 1 0 0 0 0 25 28 1 0 0 0 0 27 44 1 0 0 0 0 27 45 1 0 0 0 0 27 46 1 0 0 0 0 28 47 1 0 0 0 0 28 48 1 0 0 0 0 28 49 1 0 0 0 0 M END > 9425040 > 1 > 640 > 6 > 2 > 5 > AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAADAjBngQygJNqAACqAyVyVACSBAAlggIa+CG4ZNgIYDrA1fGUpQhgniDIyYcYi4COgAAAAAAQAAAAAAAAACAAAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-3-(1,3-dioxo-4H-isoquinolin-2-yl)propanehydrazide > 3-(1,3-diketo-4H-isoquinolin-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]propionohydrazide > InChI=1/C19H21N5O4/c1-12-9-13(2)24(22-12)11-17(26)21-20-16(25)7-8-23-18(27)10-14-5-3-4-6-15(14)19(23)28/h3-6,9H,7-8,10-11H2,1-2H3,(H,20,25)(H,21,26)/f/h20-21H > 0.6 > 383.159354 > C19H21N5O4 > 383.40114 > CC1=CC(=NN1CC(=O)NNC(=O)CCN2C(=O)CC3=CC=CC=C3C2=O)C > CC1=CC(=NN1CC(=O)NNC(=O)CCN2C(=O)CC3=CC=CC=C3C2=O)C > 113 > 383.159354 > 0 > 28 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 12 13 8 12 17 8 13 18 8 17 20 8 18 21 8 20 21 8 23 24 8 24 25 8 6 23 8 6 9 8 9 25 8 $$$$ 9425041 -OEChem-01150805002D 45 47 0 1 0 0 0 0 0999 V2000 8.9073 -1.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -3.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -0.2718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.7282 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -3.7718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 2.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 -0.2718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 2.3214 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -2.2718 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0 7.1753 -1.7718 0.0000 C 0 
0 0 0 0 0 0 0 0 0 0 0 8.0413 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -2.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 2.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 3.7227 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -1.7372 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -3.8065 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 3.9306 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 3.0646 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 1.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -2.2510 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -3.2926 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 4.3918 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9601 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -1.6518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9632 -2.3544 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5647 -1.6642 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -4.3918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6553 2.8108 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0538 2.1205 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -1.1172 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -4.4264 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 4.4970 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7723 -0.5818 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -1.9389 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -3.6047 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8462 1.0382 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6307 3.9311 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 4.8067 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 4.8526 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 3.5767 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 2.3434 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 2.8952 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 10 1 0 0 0 0 1 13 1 0 0 0 0 2 12 2 0 0 0 0 3 15 2 0 0 0 0 4 22 2 0 0 0 0 5 12 1 0 0 0 0 5 14 1 0 0 0 0 5 30 1 0 0 0 0 6 8 1 0 0 0 0 6 16 1 0 0 0 0 6 17 1 0 0 0 0 7 9 1 0 0 0 0 7 15 1 0 0 0 0 7 36 1 0 0 0 0 8 21 2 0 0 0 0 9 22 1 0 0 0 0 9 39 1 0 0 0 0 10 11 1 1 0 0 0 10 12 1 0 0 0 0 10 27 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 13 14 
1 0 0 0 0 13 18 2 0 0 0 0 14 19 2 0 0 0 0 16 22 1 0 0 0 0 16 31 1 0 0 0 0 16 32 1 0 0 0 0 17 20 2 0 0 0 0 17 25 1 0 0 0 0 18 23 1 0 0 0 0 18 33 1 0 0 0 0 19 24 1 0 0 0 0 19 34 1 0 0 0 0 20 21 1 0 0 0 0 20 35 1 0 0 0 0 21 26 1 0 0 0 0 23 24 2 0 0 0 0 23 37 1 0 0 0 0 24 38 1 0 0 0 0 25 40 1 0 0 0 0 25 41 1 0 0 0 0 25 42 1 0 0 0 0 26 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 M END > 9425041 > 1 > 557 > 6 > 3 > 4 > AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAACBzhlgYyxpLqBACqASVyVAKSDAAhogIa+CH/bJgOZjbE8f+XvCjm/BHY6AeVQAAAAAAAAAAAEAAAAAAAAAAgAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-[(2R)-3-oxo-4H-1,4-benzoxazin-2-yl]ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2R)-3-keto-4H-1,4-benzoxazin-2-yl]acetohydrazide > InChI=1/C17H19N5O4/c1-10-7-11(2)22(21-10)9-16(24)20-19-15(23)8-14-17(25)18-12-5-3-4-6-13(12)26-14/h3-7,14H,8-9H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/t14-/m1/s1/f/h18-20H > 0.4 > 357.143704 > C17H19N5O4 > 357.36386 > CC1=CC(=NN1CC(=O)NNC(=O)CC2C(=O)NC3=CC=CC=C3O2)C > CC1=CC(=NN1CC(=O)NNC(=O)C[C@@H]2C(=O)NC3=CC=CC=C3O2)C > 114 > 357.143704 > 0 > 26 > 1 > 0 > 0 > 0 > 0 > 1 > 8 > 10 11 5 13 14 8 13 18 8 14 19 8 17 20 8 18 23 8 19 24 8 20 21 8 23 24 8 6 17 8 6 8 8 8 21 8 $$$$ 9425042 -OEChem-01150805002D 45 47 0 1 0 0 0 0 0999 V2000 8.9073 -1.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -3.7718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -0.2718 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 0.7282 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -3.7718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5772 2.7282 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 -0.2718 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3.6637 2.3214 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.3092 0.7282 0.0000 N 0 0 0 0 
0 0 0 0 0 0 0 0 8.0413 -2.2718 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0 7.1753 -1.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -2.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.7734 -3.2718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.1753 -0.7718 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 2.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.4727 3.7227 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -1.7372 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 10.6673 -3.8065 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.4945 3.9306 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.9945 3.0646 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4432 1.2282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -2.2510 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.5734 -3.2926 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2158 4.3918 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9601 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.0413 -1.6518 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.9632 -2.3544 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.5647 -1.6642 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.9073 -4.3918 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6553 2.8108 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.0538 2.1205 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -1.1172 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.6601 -4.4264 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 3.2423 4.4970 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.7723 -0.5818 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -1.9389 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1091 -3.6047 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8462 1.0382 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6307 3.9311 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.6766 4.8067 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8010 4.8526 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.9352 3.5767 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0648 2.3434 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.3834 2.8952 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 10 1 0 0 0 0 1 13 1 0 0 0 0 2 12 2 0 0 0 0 3 15 2 0 0 0 0 4 22 2 0 0 0 0 5 12 1 0 0 0 0 5 14 1 0 0 0 0 5 30 1 0 0 0 0 6 8 1 0 0 0 0 6 16 1 0 0 0 0 6 17 1 0 0 0 0 7 9 1 0 0 0 0 7 15 1 0 0 0 0 7 36 1 0 0 0 0 8 21 2 0 0 0 0 9 22 1 0 0 0 0 9 39 1 0 0 0 0 10 11 1 6 0 
0 0 10 12 1 0 0 0 0 10 27 1 0 0 0 0 11 15 1 0 0 0 0 11 28 1 0 0 0 0 11 29 1 0 0 0 0 13 14 1 0 0 0 0 13 18 2 0 0 0 0 14 19 2 0 0 0 0 16 22 1 0 0 0 0 16 31 1 0 0 0 0 16 32 1 0 0 0 0 17 20 2 0 0 0 0 17 25 1 0 0 0 0 18 23 1 0 0 0 0 18 33 1 0 0 0 0 19 24 1 0 0 0 0 19 34 1 0 0 0 0 20 21 1 0 0 0 0 20 35 1 0 0 0 0 21 26 1 0 0 0 0 23 24 2 0 0 0 0 23 37 1 0 0 0 0 24 38 1 0 0 0 0 25 40 1 0 0 0 0 25 41 1 0 0 0 0 25 42 1 0 0 0 0 26 43 1 0 0 0 0 26 44 1 0 0 0 0 26 45 1 0 0 0 0 M END > 9425042 > 1 > 557 > 6 > 3 > 4 > AAADceB7uAAAAAAAAAAAAAAAAAAAAWAAAAA8QAAAAAAAAACxwAAAHgAYAAAACBzhlgYyxpLqBACqASVyVAKSDAAhogIa+CH/bJgOZjbE8f+XvCjm/BHY6AeVQAAAAAAAAAAAEAAAAAAAAAAgAAAAAAAAAA== > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]acetohydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-2-[(2S)-3-oxo-4H-1,4-benzoxazin-2-yl]ethanehydrazide > N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-2-[(2S)-3-keto-4H-1,4-benzoxazin-2-yl]acetohydrazide > InChI=1/C17H19N5O4/c1-10-7-11(2)22(21-10)9-16(24)20-19-15(23)8-14-17(25)18-12-5-3-4-6-13(12)26-14/h3-7,14H,8-9H2,1-2H3,(H,18,25)(H,19,23)(H,20,24)/t14-/m0/s1/f/h18-20H > 0.4 > 357.143704 > C17H19N5O4 > 357.36386 > CC1=CC(=NN1CC(=O)NNC(=O)CC2C(=O)NC3=CC=CC=C3O2)C > CC1=CC(=NN1CC(=O)NNC(=O)C[C@H]2C(=O)NC3=CC=CC=C3O2)C > 114 > 357.143704 > 0 > 26 > 1 > 0 > 0 > 0 > 0 > 1 > 8 > 10 11 6 13 14 8 13 18 8 14 19 8 17 20 8 18 23 8 19 24 8 20 21 8 23 24 8 6 17 8 6 8 8 8 21 8 $$$$ 9425045 -OEChem-01150805002D 50 52 0 0 0 0 0 0 0999 V2000 4.6783 3.2611 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0 9.7619 1.5904 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 10.7619 -1.8737 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 12.7619 -1.8737 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.6783 1.6517 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 9.7619 -0.1417 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 10.7619 -0.1417 0.0000 N 0 0 0 0 0 
0 0 0 0 0 0 0 13.7564 -1.9783 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.7619 1.5904 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.7619 1.5904 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.2619 2.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 8.2619 0.7244 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2619 2.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 9.2619 0.7244 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 1.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.2619 -1.0077 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 12.3551 -2.7873 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.0983 -3.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 13.9643 -2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.2619 -1.0077 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 3.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 1.4564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 11.3770 -2.9952 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 2.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0000 1.9564 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 14.8779 -3.3631 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 6.1793 1.3783 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8695 0.9798 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3445 1.8024 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6542 2.2010 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.1542 3.0670 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8445 2.6685 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3695 0.1138 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.6793 0.5123 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.1542 -0.3971 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 12.8445 -0.7956 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 9.4519 -0.6786 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 13.0335 -4.0730 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.0719 0.3953 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 4.0764 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 0.8364 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.2481 -2.3887 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 10.7705 -3.1241 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 11.5059 -3.6016 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 14.6257 -3.9295 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 15.1300 -2.7967 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 15.4443 -3.6153 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4631 3.2664 0.0000 H 
0 0 0 0 0 0 0 0 0 0 0 0 1.4631 1.6464 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 13 1 0 0 0 0 1 15 1 0 0 0 0 2 14 2 0 0 0 0 3 21 2 0 0 0 0 4 8 1 0 0 0 0 4 17 1 0 0 0 0 4 18 1 0 0 0 0 5 13 2 0 0 0 0 5 16 1 0 0 0 0 6 7 1 0 0 0 0 6 14 1 0 0 0 0 6 38 1 0 0 0 0 7 21 1 0 0 0 0 7 40 1 0 0 0 0 8 20 2 0 0 0 0 9 10 1 0 0 0 0 9 11 1 0 0 0 0 9 28 1 0 0 0 0 9 29 1 0 0 0 0 10 12 1 0 0 0 0 10 30 1 0 0 0 0 10 31 1 0 0 0 0 11 13 1 0 0 0 0 11 32 1 0 0 0 0 11 33 1 0 0 0 0 12 14 1 0 0 0 0 12 34 1 0 0 0 0 12 35 1 0 0 0 0 15 16 1 0 0 0 0 15 22 2 0 0 0 0 16 23 2 0 0 0 0 17 21 1 0 0 0 0 17 36 1 0 0 0 0 17 37 1 0 0 0 0 18 19 2 0 0 0 0 18 24 1 0 0 0 0 19 20 1 0 0 0 0 19 39 1 0 0 0 0 20 27 1 0 0 0 0 22 25 1 0 0 0 0 22 41 1 0 0 0 0 23 26 1 0 0 0 0 23 42 1 0 0 0 0 24 43 1 0 0 0 0 24 44 1 0 0 0 0 24 45 1 0 0 0 0 25 26 2 0 0 0 0 25 49 1 0 0 0 0 26 50 1 0 0 0 0 27 46 1 0 0 0 0 27 47 1 0 0 0 0 27 48 1 0 0 0 0 M END > 9425045 > 1 > 520 > 5 > 2 > 7 > AAADceB7sABAAAAAAAAAAAAAAAAAAWLAAAAwAAAAAAAAAFgB/AAAHgQYAAAACAjB1gQywbJqEAiuASVyVACT9KBhijpa+D24ZJgIYLLg0fGUpAhgmADoyAcYCAAAAAAAAAAAAQAAAAAAAAACAAAAAAAAAA== > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]pentanehydrazide > 5-(1,3-benzothiazol-2-yl)-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]valerohydrazide > InChI=1/C19H23N5O2S/c1-13-11-14(2)24(23-13)12-18(26)22-21-17(25)9-5-6-10-19-20-15-7-3-4-8-16(15)27-19/h3-4,7-8,11H,5-6,9-10,12H2,1-2H3,(H,21,25)(H,22,26)/f/h21-22H > 2 > 385.157246 > C19H23N5O2S > 385.48322 > CC1=CC(=NN1CC(=O)NNC(=O)CCCCC2=NC3=CC=CC=C3S2)C > CC1=CC(=NN1CC(=O)NNC(=O)CCCCC2=NC3=CC=CC=C3S2)C > 88.9 > 385.157246 > 0 > 27 > 0 > 0 > 0 > 0 > 0 > 1 > 8 > 1 13 8 1 15 8 15 16 8 15 22 8 16 23 8 18 19 8 19 20 8 22 25 8 23 26 8 25 26 8 4 18 8 4 8 8 5 13 8 5 16 8 8 20 8 $$$$ 9425046 
-OEChem-01150805002D 40 41 0 0 0 0 0 0 0999 V2000 2.0000 -3.5580 0.0000 Br 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 0.9420 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -4.5580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -0.0580 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 6.3301 2.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 6.4347 3.9365 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 0.9420 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -0.0580 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 7.2437 2.5353 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 2.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.9128 3.2784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4128 4.1444 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.4516 1.5571 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.4641 1.4420 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 7.8195 5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -1.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -0.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -2.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 3.7321 -3.5580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 2.8660 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -3.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 4.5981 -5.0580 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 5.2520 3.0246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.8535 2.3343 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.5294 3.2136 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 6.8451 1.4282 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.5805 0.9507 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0580 1.6860 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 7.2531 5.3102 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.0717 5.6244 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 8.3859 4.8058 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.0611 1.2520 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -0.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 2.3291 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -1.7480 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -3.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.2881 -5.5949 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 5.1350 -5.3680 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 4.9081 -4.5211 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 1 21 1 0 0 0 0 2 14 2 0 0 0 0 3 20 1 0 0 0 0 3 23 
1 0 0 0 0 4 17 2 0 0 0 0 5 6 1 0 0 0 0 5 9 1 0 0 0 0 5 10 1 0 0 0 0 6 12 2 0 0 0 0 7 8 1 0 0 0 0 7 14 1 0 0 0 0 7 33 1 0 0 0 0 8 17 1 0 0 0 0 8 34 1 0 0 0 0 9 11 2 0 0 0 0 9 13 1 0 0 0 0 10 14 1 0 0 0 0 10 24 1 0 0 0 0 10 25 1 0 0 0 0 11 12 1 0 0 0 0 11 26 1 0 0 0 0 12 15 1 0 0 0 0 13 27 1 0 0 0 0 13 28 1 0 0 0 0 13 29 1 0 0 0 0 15 30 1 0 0 0 0 15 31 1 0 0 0 0 15 32 1 0 0 0 0 16 17 1 0 0 0 0 16 18 2 0 0 0 0 16 19 1 0 0 0 0 18 21 1 0 0 0 0 18 35 1 0 0 0 0 19 22 2 0 0 0 0 19 36 1 0 0 0 0 20 21 2 0 0 0 0 20 22 1 0 0 0 0 22 37 1 0 0 0 0 23 38 1 0 0 0 0 23 39 1 0 0 0 0 23 40 1 0 0 0 0 M END > 9425046 > 1 > 437 > 5 > 2 > 4 > AAADceBzsAAAEAAAAAAAAAAAAAAAAWAAAAAwAAAAAAAAAAABwAAAHgBYAAABrAzBngYyhpJqBACqAyVyVACSDAAlogYa+CG+bPgMZjLE9fuUtShk2BHI65eY3ADOIAAAEAAABABAAAAgAAAIAAAAAAAAAA== > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxy-benzohydrazide > 3-bromo-N'-[2-(3,5-dimethyl-1-pyrazolyl)-1-oxoethyl]-4-methoxybenzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxybenzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)ethanoyl]-4-methoxy-benzohydrazide > 3-bromo-N'-[2-(3,5-dimethylpyrazol-1-yl)acetyl]-4-methoxy-benzohydrazide > InChI=1/C15H17BrN4O3/c1-9-6-10(2)20(19-9)8-14(21)17-18-15(22)11-4-5-13(23-3)12(16)7-11/h4-7H,8H2,1-3H3,(H,17,21)(H,18,22)/f/h17-18H > 2.5 > 380.048403 > C15H17BrN4O3 > 381.22448 > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)OC)Br)C > CC1=CC(=NN1CC(=O)NNC(=O)C2=CC(=C(C=C2)OC)Br)C > 85.3 > 380.048403 > 0 > 23 > 0 > 0 > 0 > 0 > 0 > 1 > 4 > 11 12 8 16 18 8 16 19 8 18 21 8 19 22 8 20 21 8 20 22 8 5 6 8 5 9 8 6 12 8 9 11 8 $$$$ chemfp-1.1p1/tests/queries.fps0000644000077000000240000006413311660452124016616 0ustar dalkestaff00000000000000#FPS1 #num_bits=1021 #software=OpenBabel/2.2.0 #type=OpenBabel-FP2/1 #source=queries.smi #date=2011-02-20T21:23:41 
102280000040a2240300048040020000210000b000c0c00000040000011420101e00100880a04017100e04102000000001a200029020c023000000800000a2100c9028080c14088002008a020a80021e00f0015000222040a880100140200004420400284008000201082000c0001cc000070180080d90020050048140040500 22525101 10060046004082a403022e000000100400018c8000425a0001040084001020059e00408100c01041c00622180081c000012a030090207b0020811022040601200c22082804040000000021a02280061640b8c040400a3408408a9800800d04810f061028a10290220b908280611009600206014003091020289c04c300044100 22525102 102280460040022c03040c00404210006005049040724800020400800014e0041a08100840a04045901e04902180000001a2020090206f4060010080000083204c941800140400820001a3a00a80068610f04141400e3448c8001000002400014604102aa00318000518000400001940000f4980080d10002190448942004300 22525103 102280460040022c03040c00404210006005049040704800020400800014e0141a08100840a04047901e04902180000001a2020090206f4060010080000083204c141900141400820001a3a00a80068610f04141400e3448c8001000002400014604102aa00318000518000400001940000f4980080d10002190448902004100 22525104 102280460040022c03040c00404210006005049040704800020400800014e0041a08100840a04045901e04902180000001a202009020ef40600100800000a3204c941800140400820001aba00a80068e10f04141400e3448c8001000002400014604102aa00318000518000400001940000f4980080d10002190448942004100 22525105 100000040000a2240a00000040020000200000a40060400000040000001420001600000800a04005001a00102000100001a0000090a0403100000080000082000c902a2804040800000082022880021c00d0004000020440280010010020200042040008002880020108200040003cc000038180080d00000000048140000500 22525106 100080240000a2260a00808040420000002000a00050400000140000001420001600000000804001800a0011204000000120000090604d3100010080040082000c10280904040802010082220880029481d0004000030040688010010000000106040018000800020108200040003c4001030180000900000040048140000500 22525107 
10068002800006e40b000600000350002000008000404880003400c0031ca4001c004008d2800027208e1218200100c4012200001820c0332101808200a0a2180890081245040400040089022a88021c01d000440402304848081801406b80004214002a4000000200480000d000380000070180080d9082a85880814a000510 22525108 00060a32800c02c40b5606410a82489481108012900ac8082a206181010400541c21500182280833700d345928411040010242018820c00a20008882808086080e101e1249101f010000b9000288200408d0015aa4113c08d00c0800040b4004200710000020200340e04204d0304820820781800a91106a0852000142660100 22525109 100000040000a2240a00000040020000200000a40060400000040000001420001600000800a04005001a00102000100001a0000090a0403100000080000082000c902a2804040800000082022880021c00d0004000020440280010010020200042040008002880020108200040003cc000038180080d00000000048140000500 22525110 100000040000a2240a00000040020000200000a40060400000040000001420001600000800a04005001a00102000100001a0000090a0403100010080000082000c902a2804040800000082022880021c00d0004000020440280010010020200042040008002880020108200040003cc000038180080d00000000048140000500 22525111 c00cf117000ae0e70e68029003430120650042a040507a14892210c6008c6208144c54a840c10015e00e12182505c1c105ba364a0a606d17249150224404e20808b2286327440db3a428a1023adac29640d94240000110186a88d84141696481468602028538146f11d480ea533039802b0321801b0705a4391c05cf81e2251c 22525112 10288202000002240a00000001020002600000800020000000000008000410001000000c00000005001b001020000000018000001020401300010000000082000890080004040800000080020880001c00900040000000000800100300200000400400000000000200000000c020780200030180080500000100008140000f00 22525113 102282460040022403140c00400210006005049000624800020400000014a0041a08100800a04045901e04102180000001a2020090206b4062010001000083204c901800140400800001a3a00a80060610b04141400a344088001000002400014604102aa00118000518300400001d40000f4980080d10002190448942004300 22525114 
102280460040022403140c00400210006005049000624800020400000014a0041a08100800a04045901e04102180000001a2020090206b4062010001000083204c901800140400800001a3a00a80060610b04141400a344088001000002400014604102aa00118000518300400001d40000f4980080d10002190448942004300 22525115 00060a26800c82c40b5e06c10a8248948110a012900ac8080a2060c1010400541c01d00182280833700d345928411240010240018828c44a20048882808086081e101e1248111e200000b9008088010408d0015aa4113c0cf00c0800040b4014200710000028200140e04204d0204820820781800291900a1052400142660101 22525116 102280460040022401144c00400210084005049000604800020400200010a0041a08100a01a04045900e04300180000021a2020090206b4062010001000083204c001800140400800121a3a00a80060610b04141400a344088001000002400010604102aa00118000518300400001d40000f4984080d10002190448902004100 22525117 0026127701000a240a0016800352086e001a00d0004468604024108843280d0810085000c09900e1100a011801011040012266030820d01230a10002004982e018802c02c00d02080000a1433e980294009005501c00370ac10838000088001002640228102010570006200010004810200f2180c2c130401a82048103080100 22525118 102200000040a2240300048040020000210000b000c0c00000040000011420101e00100880a04017100e04102000000001a200029020c003000000800000a2100c9028080c14088002008a020a80021600f0015000222040a880100140200004420400284008000001002000c0000cc000070180080d90020050048140040100 22525119 000000000000022402000200020000601000808a1040400000400000001000001200080000804001000a000800000000010000009020490000000102200000000c000810000400000000a22008840244001000400000100400001800004802010004200000400000010000000100084000420000020108000000240100000100 22525120 100000020000026402000200420200601000808a1044480000400000001400001200080000804001000a000820000000012000009020490000000502200082080c100852000400000000b3202884024400900840000018444880180000490a01000420000040000001000000110008c000430100020108000000248140000100 22525121 
10000000000002240200000000020000200000800040400000000000000420001000000800800005000a000020000000012000001020401000000000000082000890080004040000000080022880020c009000400000000008001000000000010204000000000002000000000000288000030180000500000000008140000500 22525122 10000002000002240200008000020000000000800060400000000000000420001400000000800001000a00002000000001a000001020401000000080000082000810080804040800000080022880020480d000400000000008001200000000010204000000000002000000008000280000030180000100000000008140000500 22525123 10000000000002240200000000020000200000800062400000000000000420001000000800800005000a000020000000018000001020401000000000000082000890080004040800000080022880020c009040400000000008001000000000010204000000000002000000000000290000030100000500000000008140000500 22525124 0000000000000224060004011002100111000088006240c000000000000421101000000000a00013101a0010e200000001a010010020c110800000000000a2100810091004140820000080032880020400b041400802000088011000080000000204000001000000040000008100390000170380000900200400108140000700 22525125 0000000000000224000000000000000000000080004240000004000000102000100800000080000100000000010000000100020010204000000000000000000008000800000400000000800200000200000040400000000000001000000000000204000000000000000000000000090080000000000000000000008102000000 22525126 10000000000002240200000000020000000000800062400000000000001420001000000000a00001001a00002000000001a0000010204130000000000000820008100800040408000000800228800204009040400002004008001000000000010204000000000002000000008000290000030180000900000000008140000500 22525127 00000000000002240200000000020000000000800060400000000000000420001000000000a00001001a00002000000001a0000000204010000000000000820008100800040408000000800328800204009040400002000008001000000000000204000000000000000000008000290000030180000900000000008140000500 22525128 
10000000000022240200000000020000000000800062400000000000000420001000000000a00001001a00002000000001a0200010204000000000000000820008100800040488000000800208800204009040400002000008001440010000010204000000000042010000000000090500030100000900000000008150002500 22525129 110120000000a2340a00020140424000200400e40220c00000008080201400001000500800000005901a00182000100001c2000018e04131000000020404820408902a0106040002000080020280008c0080014040000048e80850010020a001400421088028028200000000c100288001038180090500000000048140008400 22525130 110120000000a2340a00020140424000200400e40220c20000008080201400001000500801000005901a00182000100001c2020018e04131000000020404820408902a0106040002000080020280008c0080014040000048e80850010020a001400421088028028200000000c100288001038180090500200800048140008400 22525131 110120000000a2340a01020140424000200400e40220c20000008080201400001000500801000005901a00182000100001c2020018e04931000000020404820408902a01070400030000800202c0008c0080114040000048e80850010020a001400421088028028200000000c11028800103818009050020080004c140008400 22525132 10288202000002240a00008101020002600000800060400000040008001430001408000c00800005001b001021000000018102021020403300010080000082000890080004040800000080032880021400d000400000004008801003000000000204000800000042000800004020780200030100000500000100008141000f00 22525133 10288202000012240a00000001020000200000800040400000bc0208001420001000000c00800005000b001020000000012000001020403300010000000082000890080004040800000080020880021c019000400002004008801003002000004204002800000002000c0000c000780000030180080d00000000008140000700 22525134 10008000000002240a00000000020000200000800040400000140000001420001000000800800005000a001020000000012000001020403300010000000082000890080004040800000080020880021c01900040000200400880100100200000420400280000000200080000c000380000030180080d00000000008140000500 22525135 
300001000000a2240000040000020000200001a4000200000000000000040000100000080000000710120010200010000180000010a04011000100000000800008902a000404088002008003288400140090014000200000a80010010020200040060008002800020000200040003c800005d180000000000000008140000500 22525136 102282460040022401144c00400210084005049000604800020400200010a0041a08100a01a04045900e04300180000021a2020090206b4062010001000083204c001800140400800121a3a00a80060610b04141400a344088001000002400010604102aa00118000518300400001d40000f4984080d10002190448902004100 22525137 102280460040022401144c00400210084005049000604800020400200010a0041a08100a01a04045900e04300180000021a2020090206b4062010001000083204c001800140400800121a3a00a80060610b04141400a344088001000002400010604102aa00118000518300400001d40000f4984080d10002190448902004100 22525138 00000000000002240a00000000020000200000800040400000000000000420001000000800800005000a000020000000012000000020400100000000000082000890080000040000000080000080020000800040000000000800100100200000420400000000000000000000c000080000030180080500000000008140000000 22525139 e0044384040006e408294e188c433205c41e0150400057600000008e002444081408218300a88223a093035c312010400090a27090616d00a040c09a008007004c641d00d72e0302010060202488028210c81050040a3009d13811100c4740031a940212000180c803c0000401204c4111064902a909002028020e5007860b00 22525140 00400000000016240000000000020000208000800043400000008000000420001000000810a40005001000002080000001a000000020400220000000000080000898480004040800040080020000020000802040400200000800100000400000160400000000000000000000000008000001018010080000a000008140000000 22525141 10000002000002240200008000020000000000800060400000040000001420001400000000a00001001a00002000000001a000001020401000000080000082000810080804040800000080022880020480d000400002000008001200000000010204002800000002000000008000280000030180000900000000008140000500 22525142 
0000000000000224000000800002000000000080004240000000000000042000100000000080000180020000200000000120000000204100000000000000800008100800000400000000800028800204009040400000000008001000000000010204000080000000000000008000090000010180000000000000008140000100 22525143 00ba001000c0022400000d0111020004010000920040604082004000180020001000100000800053100a0110000008800102000000244100001000000040821008000800803428a00a0080022a80020600b0495018002100890110000000001002e40021012000800008000080015b00001d0120000010000000400105808108 22525144 0000000000000224000000800002000000000080004240000000000000042000100000000080000180020000200000000120000000204100000000000000800008100800000400000000800028800204009040400000000008001000000000010204000080000000000000008000090000010180000000000000008140000100 22525145 0000000000000224000000000002000000000080004040000000000000042000100000000080000100000000200000000120000000204010000000000000800008100800000400000000800200000200008040400000000008001000000000000204000000000000000000008000290000010180000000000000008140000000 22525146 1000800000a002240a00000002420000000000a04040400000140080001460001008000000800001800a0011210000000120a2001060413300000000100082000814280001040802010080020080028000900040000104486800104100000001021400088008000200080000c1017800810b899000010000000000814a002400 22525147 102280460040022401140c00400210004005049000604800020400000010a0041a08100800a04045900e04100180000001a2020090206b4062010001000083204c001800140400800001a3a00a80060610b04141400a344088001000002400010604102aa00118000518300400001d40000f4980080d10002190448902004100 22525148 20088010400200e5020027001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525149 
182e01c720f002ec1b0e2e0242c34501a40704f018785a4041070080203400059e01640021c01061e00f073830c55401010a035294605f3d21c150b6040682285c702c3a27440c078000a1a202c88688f0dd414408223568cb18d80140dd2185571e3038e10576c20bf08a4241506960010711d00b0fd122389e04c301124804 22525150 100600040000843e03000e8140430000200100a040524800000000c0001040001a0840080080404500c600080120008001aa000290604d0000011082000080200c04280006040002004083222288029400d0004000022448e888d800000000010014010201011200010080000101084001078180000100000080848100040500 22525151 122e832620700a241d08ee06570b0800c49704981064c9888345108a01582d651a091308a0e05065108f15501180102c03a20030992cc910e041000e204686284e025d02361582812109a3e8cb90071e10f0436010aa3c4090609888016cc1910704102a21e411910b3d230445039960113fc1921a0f1000389054810a804910 22525152 122e812620704a241d08ee06560f0800c49704981064c9888345108205582d051a081308a0e05065108e15501180142c03ba0030982cc930e041000e604687284e025c02361582812109a3e0cb90070614b0476010aabc4090609888016cc1910704102a21e410910b39016445039960113fc1921a0f1000389054810a804910 22525153 20088030400200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525154 20088010408200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525155 20088050400200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525156 
20088010400201e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525157 20088010400204e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525158 20088018400200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525159 20088010400200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525160 20088010404200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525161 20088012400200e5020026001c80460002120010082400080140108007380400140050004040a021108a100810001005558000600e0460180430000220000c000c061c080140020024c823080080000d901a016004001100910008080009c0005804100a003200b112a0210820001020100a8055100300401008030002400010 22525162 10000002000002040a00000140020000200000000000400000000000001400001400000800004005000a0010200000000100000090204933000000800000820008900a0000000000000082222080000400d00040000004400000000100200001400400000000000201000000c000288000030100080500000000040140000500 22525163 
1806030600a402e40b030e2058c34d01841044a158c85e601a0001802044000016a0400000a00061980e031830411442093e02009260490d008100b6840886281e703a3845c50e0300a1d12a23c88680f8dd04680c023c484908d801010b2105401604240008025a4101224a41304c70010701dc49019160288804e307520005 22525164 00020006004000240700040000020000010002100020c80000000000000400101800100000a00013801e04102000000001820401000041000001000000008270181008000c100000000001011681000600b00340000220008000100000002001000400200002100c0000000080000800000701800011104010d0000150000100 22525165 101e0204084002a40f04060040824000a100209000cac400029c4080003404103a00510880a24037500f141820031044012e02019028cc594001380a8000a6100c901b004f140a810009ba2b2e81021600b801400402b848d8089801002a4000500400880100080201a0a248c0103c74000701800805106008d444cb42c00500 22525166 100e0204004002a40f04060040824000a100209000cac400029c0080003404103a005108c0a04037500f141820011044012e02019028cc594001180a8000a6100c901b004f140a810009ba2b2e80021600b801400402b848d8089801002a4000500400880100080201a0a248c0103c740007018008051060085444cb42c00500 22525167 10000000000002040a00000140020000200000000000400000000000001400001000000800004005000a0010200000000100000090204933000000000000820008900a0000000000000082222080000400d00040000004400000000100200001400400000000000201000000c000288000030100000100000000040140000500 22525168 100a40200000e2b60100848042420008202008b4025ac00002040080001420101808200800a00017501e06312140100001a00e0310e0c43320010a804400a20018903a010c1408220500c8023c80038410d040c0000b2048e080108102632015660400380038000200888300c101398081058180000810403070008142000500 22525169 100a40200000e2b60300848042420008202008b4025ac00002040080001420101808200800a00017d01e06312140100001a00e0310e0c53320010a804400a20018903a010c1408220500c8023c80038410d040c0000b2048e880128102632015660400388038000200888300c101398001078180080d10403070048142000500 22525170 
10000004000002040a00000140020000200000800000400000000000001400001200000800804005000a001020000000010000009020493300000000000082000c900a0004000000000082222080020400d00040000204400000100100200001400400000000000201000000c00028c000030100000100000000040140000500 22525171 00000001000080040000020820404000000080c0002040000080208000100000100840000082000000000008100000000000000098000000000000020004080008000800020200000000008000000084008000400000000040081000002000000004000080000080010000000020400000020100000100000000040100040000 22525172 c0420020004000c0000846000c0000040000000000000000000600000110140014000010c0008023400a2110000c0000010220c0180040800200000e000000100c010c18000002010008100002800006003801500480212000020900800940100128000800b0000000c01000e000000c300e881900081800001880001a000104 22525173 100080000000a2240a00000000020000200000a00242400000140000001420001000000800a00005801a00102000000001a00000102041310000000000008200089028000404000000008002288002140190404000020040280010010020000142040028800a0002000800004000398000030180080d00000000008140000500 22525174 100a40200000e2b60100848042420008202008b4025ac00002040080001420101808200800a00017501e06312140100001a00e0310e0c43320010a804400a20018903a010c1408220500c8023c80038410d040c0000b2048e080108102632015660400380038000200888300c101398081058180000810403070008142000500 22525175 10000004000002040a00000140020000200000800000400000140000001400001200000800a04005801a001020000000018000009020493300000000000082000c900a0004000000000082220080020801800040000204400000100100200001400400200002000201080000c00028c000030100000100000000040140000400 22525176 10000004000002040a00000140020000200000800000400000140000001400001200000800a04005801a001020000000018000009020493300000000000082000c900a0004000000000082220080020801800040000204400000100100200001400400200002000201080000c00028c000030100000100000000040140000400 22525177 
00200200000000040040060000004000000000400020400000000080001000000000400000000020000a0008000000400000000018800100000000020004000008000800420000000000000000800004008000400400000040080000000080010004000080000080000000000000000000020100000100000008000002000000 22525178 00000004000000040000000000000000000000000000400000100000001000000000000000000000000a0000000000000000000010002100000000000000000008000800000000000000000000800000000000400000000000000000000000010004000000000000000000000000000000000000000000000000000000000000 22525179 68089925611200a406e036c814595019109244e108604049014013e2c138240910085302e1c18823008e103811601005cd2032328c0ce0108463405a208c86348e021c000244070da0a8250952a2170c00100360041011129bc01d2c228bc0641a04301e287002f10320031caa001863302e905c4063200c520a069952225911 22525180 28089925611200a406c0268814595019109245e108604048014013a2c138240910085306e1c18023008e103811601005cd3002728c0ce0100463405a208c04248e021c0002440e0da0a8250952a2160c00100360041011129bc01b2d220bc0001a04301e287002b10320031ce8001a62302e905c4067200c528a068152025911 22525181 a0a281ca00009a75590446800002500c700507904042480e020000e0000420141801500e01808027300e003c200008c0237200030020e13906018003a0002f304ccf1e00c0140a002021eb831a910604029003c00c802005f08890010045201002340020604e100010501800e4003c00000e05851001108000d8400112004300 22525182 00224006004004ec0504068000030008210102900080c8100aa400c0011420101800500281800013708e04382000000001220ec31020c400484100020008a27018331a020c141880012189000e89031e00b0414100122000c8481100000380002a1c002a0100180601c0211090000d00000701840009106019f0408958800100 22525183 00224006004004ec0504068000030008210102900080c8100aa400c0011420101800500281800013708e04382000000001220ec31020c400484100020008a27018331a020c141880012189000e89031e00b0414100122000c8481100000380002a1c002a0100180601c0211090000d00000701840009106019f0408958800100 22525184 
80a2a1ca00009a75190446800002100c700507900042480e02000060002420141801100e01808027300e003c200008c0237206030020e1390a018001a0002f304ccf1e00c1140a012021cb831e910604009003c00c802005f0889001004520100a340020604e180610501800e4103c00000e0185180010a008d0400112004300 22525185 02aa0142000040011500040200000000000005100000000200010000001000401800000400000003000e081400000000031300001000400103000000200002110802080000000800200006000280000c0050004000002000a00102020000000000000020004080000000100084001000000e1000000250000014000000000400 22525186 0000000000000224000000010000000000000080000000000000000000000000100000000000000100000000000000000100000000204003000000000000000008000800040400000000800200000000000000400000000008001000000000000004000000000000000000000000280000000000000000000000008100000000 22525187 50040102402800440a48028001430501a00001a000425a04012700c0003c20401204440820e00001a02a01082017040003aa030a9820692e24015086000082081cb1281225040c538400a12202c0829540d80040080310486a98d800484b248543060008c10014470f44a042513148400203b1800a0dc021280404cb41100504 22525188 200801000000020000000001008000280000000000000040000000000500000510000002a00000050000002000000000010000000020c002000000000008000008000800020000000200800000000000000000400020000000000000000000000008000000000080000000000000080000004000000000000000000100000000 22525189 002200000000020000000400000000000000001000000000000000000010000010001000000000001002001000000000010200001020000200000000000000000000000000000000000000001a80000c001001500000200080000000000000000000002000000000000000000000000000040000000010000000000100000100 22525190 002200000000020000000400000000000000001000000000000000000100000010001000800000011000001000000000010200000020c002000000000000000008000800000000000000800102000000000001500000200080000000000000000000002000000000000000000000080000040000000010000000000100000000 22525191 
100080020000a2e40a00020140020000201000a40040480000200080001400001600400800804005200a0218200010000100000090a0493300000002000082000c90280200040000000083220080020800800040000000406008100100012001000400080028800201400000d000384000030100000500800008040140000400 22525192 002ec022800002240a0006800282080c8010009000424c0000200080002804001300508341800021108a20380001104001320e030820490a0001008200c0a6601a000c00410502010020a1213e8002040090035004003008c00a180000082010080400200020100f502000c000101820000701846a011060188200c322400300 22525193 002e0080004006241b080601020340080000009000400008000000c0001c04001800500223408023308e04b8200102c0010200001820601204010002000082180c100e0080041c800330a0009a88000e00b0014004003009e889190000080000001400220040000000000000c0103c00100701140249100000d680014a000510 22525194 602f0020840202400ac88600070308048410801000021208012200c0006c04001015d083c240c0a35080311800153004010202400824e80f04808006008280dc1ea00e1221008e11808aa121824080084089815a0c813028d0180801800b50004182022000300013424400c2e15318001007a5500a051222084a00630a100006 22525195 002200000000ea000a00040000020000200000740000004804000000200400001000000880000007008a0110200030200100000200a4c011000000001000a20018902a000800000000008802288000140090004000002100b100000140a03000002000084029000000002000c0083c80000f8100000590020000000141000100 22525196 500a02020000ee800a00060180820c04e10120348080084b021000800124020010001889c0018037708b0438200010c00506000002a9e019080218000008a6180c903f00c8908c000001c8032e8800160098074004003000e009000100223000401000a810a8000000802008c0003c8040070180041510040102c0014a800100 22525197 000a00000000e2800a00040000020000610020340000000002100000010400001000100880000017500a0010200010000102000000a8c011000008000000a21008903b0008100800000188032a8000140090014000002000a000000100222000400000a80028000000802008c0003c8000070180000510000100400140000100 22525198 
500a02020000ee800a00060180820c0ce10120348080084b02100080012402001000188bc1018037708b0438200010c00506000002a9e019080218000008a6180c903f00c8909c000021c8032e8800160098074004003000e009000100223000401000a810a8000000802008c0003c8040070184041510040102c0014a800100 22525199 002e8004004000200b0006000000000d1000001000000140000000c0000800001800508303000021300e2438000102400102000008204000000000020008000018000c00400110a0092000013a80010600b003401400300ce2080a00800820140c000020000001000000000002001800000e014400011000103a000102000300 22525200 chemfp-1.1p1/tests/simple.fps0000644000077000000240000000017511660452124016426 0ustar dalkestaff00000000000000#FPS1 00000000 zeros 02000000 bit1 03000000 two_bits 0f0000a4 several deadbeef deadbeef DEADdead DEADdead deafbeef Deaf Beef chemfp-1.1p1/tests/strange.sdf0000644000077000000240000001147711660452124016573 0ustar dalkestaff00000000000000tryptophan.pdb DalkeSci.12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 
1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > tryptophan2 > This is a simple data line > This line is not followed by a blank line > This contains two lines but the reader will only get the first > This is the first version. > This is the second version. (only the first will be returned) > I have tags in the data line > (REGID) This line contains some of the strange junk that might exist on the tag line > > $$$$ tryptophan1.pdb DalkeSci.12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 
1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > > $$$$
chemfp-1.1p1/tests/support.py0000644000077000000240000002743212055226641016520 0ustar dalkestaff00000000000000
import sys
import os
from cStringIO import StringIO
import tempfile

# Ignore the close. io.write_fps1_output() auto-closes its output.
class SIO(object):
    def __init__(self):
        self.sio = StringIO()
    def write(self, s):
        return self.sio.write(s)
    def writelines(self, lines):
        return self.sio.writelines(lines)
    def close(self):
        # Ignore this
        pass
    def getvalue(self):
        return self.sio.getvalue()

# Given a filename in the "tests/" directory, return its full path

_dirname = os.path.dirname(__file__)
def fullpath(name):
    path = os.path.join(_dirname, name)
    assert os.path.exists(path), path
    return path

PUBCHEM_SDF = fullpath("pubchem.sdf")
PUBCHEM_SDF_GZ = fullpath("pubchem.sdf.gz")
PUBCHEM_ANOTHER_EXT = fullpath("pubchem.should_be_sdf_but_is_not")

MISSING_TITLE = fullpath("missing_title.sdf")

real_stdin = sys.stdin
real_stdout = sys.stdout
real_stderr = sys.stderr

class Runner(object):
    def __init__(self, main):
        self.main = main

    def pre_run(self):
        pass
    def post_run(self):
        pass

    def run(self, cmdline, source=PUBCHEM_SDF):
        if isinstance(cmdline, basestring):
            args = cmdline.split()
        else:
            args = cmdline
            assert isinstance(args, list) or isinstance(args, tuple)
        if source is not None:
            # list() so that a tuple of arguments can also be extended
            args = list(args) + [source]
        self.pre_run()

        try:
            sys.stdout = stdout = SIO()
            self.main(args)
        finally:
            sys.stdout = real_stdout

        self.post_run()

        result = stdout.getvalue().splitlines()
        if result:
            self.verify_result(result)
        return result

    def verify_result(self, result):
        assert result[0] == "#FPS1", result[0]
        # TODO: .. verify more of the line format ...
    def run_stdin(self, cmdline):
        raise NotImplementedError("Implement in the derived class")

    def run_fps(self, cmdline, expect_length=None, source=PUBCHEM_SDF):
        result = self.run(cmdline, source)
        while result[0].startswith("#"):
            del result[0]
        if expect_length is not None:
            assert len(result) == expect_length, (len(result), expect_length)
        return result

    def run_split(self, cmdline, expect_length=None, source=PUBCHEM_SDF):
        "split into dict of headers and list of values"
        result = self.run(cmdline, source)
        headers = {}
        fps = []
        result_iter = iter(result)
        # I know the first line is correct (it was tested in verify_result)
        # Plus, this lets the SimsearchRunner use run_split
        result_iter.next()
        for line in result_iter:
            if line.startswith("#"):
                k, v = line.split("=", 1)
                assert k not in headers, k
                headers[k] = v
                continue
            fps.append(line)
            break
        fps.extend(result_iter)
        if expect_length is not None:
            assert len(fps) == expect_length, (len(fps), expect_length)
        return headers, fps

    def run_exit(self, cmdline, source=PUBCHEM_SDF):
        sys.stderr = stderr = SIO()
        try:
            try:
                self.run(cmdline, source)
            except SystemExit:
                pass
            else:
                raise AssertionError("should have exited: %r" % (cmdline,))
        finally:
            sys.stderr = real_stderr
        return stderr.getvalue()

    def run_split_capture(self, cmdline, expect_length=None, source=PUBCHEM_SDF):
        sys.stderr = stderr = SIO()
        try:
            try:
                headers, fps = self.run_split(cmdline, expect_length, source)
            except SystemExit:
                raise AssertionError("unexpected SystemExit")
        finally:
            sys.stderr = real_stderr
        return headers, fps, stderr.getvalue()

####

def can_skip(name):
    s = os.environ.get("TOX_CHEMFP_TEST", "")
    return not (s.startswith(name) or (","+name) in s)

#### fingerprint encoding

def set_bit(n):
    assert n <= 16
    bytes = [0, 0, 0]
    bytes[n//8] = 1<<(n%8)
    return "%02x%02x%02x" % tuple(bytes)

class TestIdAndErrors(object):
    #
    # One of the records doesn't have an XLOGP field
    #
    # NOTE: test_missing_id_tag, test_missing_id_tag_report and
    # test_missing_id_tag_ignore are defined again further down in this
    # class; the later definitions replace these, so these three never run.
    def test_missing_id_tag(self):
        errmsg = self._runner.run_exit("--id-tag PUBCHEM_CACTVS_XLOGP")
        self.assertIn("ERROR: Missing id tag 'PUBCHEM_CACTVS_XLOGP' for record #7 ", errmsg)
        self.assertIn("pubchem.sdf", errmsg)

    # Should be the same as the previous code.
    def test_missing_id_strict(self):
        errmsg = self._runner.run_exit("--id-tag PUBCHEM_CACTVS_XLOGP --errors strict")
        self.assertIn("ERROR: Missing id tag 'PUBCHEM_CACTVS_XLOGP' for record #7 ", errmsg)
        self.assertIn("pubchem.sdf", errmsg)

    def test_missing_id_tag_report(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag PUBCHEM_CACTVS_XLOGP --errors report", 18)
        self.assertIn("ERROR: Missing title for record #1", errmsg)
        self.assertIn("missing_title.sdf", errmsg)
        self.assertEquals(fps[-1], "")

    def test_missing_id_tag_ignore(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag PUBCHEM_CACTVS_XLOGP --errors ignore", 18)
        self.assertNotIn("ERROR: Missing title for record #1", errmsg)
        self.assertNotIn("missing_title.sdf", errmsg)
        ids = [fp.split("\t")[1] for fp in fps]
        self.assertEquals(ids, ['2.8', '1.9', '1', '3.3', '1.5', '2.6', '-0.9', '2', '2.1', '2.9',
                                '1.7', '-1.5', '0.4', '0.6', '0.4', '0.4', '2', '2.5'])

    #
    # Various ways of having a strange title
    #
    def test_missing_title(self):
        errmsg = self._runner.run_exit("", MISSING_TITLE)
        self.assertIn("ERROR: Missing title for record #1", errmsg)

    def test_missing_title_strict(self):
        errmsg = self._runner.run_exit("--errors strict", MISSING_TITLE)
        self.assertIn("ERROR: Missing title for record #1", errmsg)

    def test_missing_title_report(self):
        headers, fps, errmsg = self._runner.run_split_capture("--errors report", 1, MISSING_TITLE)
        self.assertIn("ERROR: Missing title for record #1", errmsg)
        self.assertNotIn("ERROR: Missing title for record #2", errmsg)
        self.assertIn("ERROR: Missing title for record #3", errmsg)
        self.assertEquals(len(fps), 1)
        self.assertEquals(fps[0].split("\t")[1], "Good")

    def test_missing_title_ignore(self):
        headers, fps, errmsg = self._runner.run_split_capture("--errors ignore", 1, MISSING_TITLE)
        self.assertNotIn("ERROR: Missing title for record #1", errmsg)
        self.assertNotIn("ERROR: Missing title for record #2", errmsg)
        self.assertNotIn("ERROR: Missing title for record #3", errmsg)
        self.assertEquals(len(fps), 1)
        self.assertEquals(fps[0].split("\t")[1], "Good")

    #
    # Various ways of handling a missing id in a tag
    #
    def test_missing_id_tag(self):
        errmsg = self._runner.run_exit("--id-tag Blank", MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Blank' for record #1", errmsg)

    def test_missing_id_tag_strict(self):
        errmsg = self._runner.run_exit("--id-tag Blank --errors strict", MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Blank' for record #1", errmsg)
        self.assertIn("missing_title.sdf", errmsg)

    def test_missing_id_tag_report(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag Blank --errors report", 1, MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Blank' for record #1", errmsg)
        self.assertIn("ERROR: Empty id tag 'Blank' for record #2", errmsg)
        self.assertNotIn("ERROR: Empty id tag 'Blank' for record #3", errmsg)
        self.assertEquals(fps[0].split("\t")[1], "This is not Blank")

    def test_missing_id_tag_ignore(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag Blank --errors ignore", 1, MISSING_TITLE)
        self.assertNotIn("ERROR: Empty id tag 'Blank' for record #1", errmsg)
        self.assertNotIn("ERROR: Empty id tag 'Blank' for record #2", errmsg)
        self.assertNotIn("ERROR: Empty id tag 'Blank' for record #3", errmsg)
        self.assertEquals(fps[0].split("\t")[1], "This is not Blank")

    #
    # Various ways of handling tab characters in an id tag
    #
    def test_tab_id_tag(self):
        errmsg = self._runner.run_exit("--id-tag Tab", MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Tab' for record #2", errmsg)

    def test_tab_id_tag_strict(self):
        errmsg = self._runner.run_exit("--id-tag Tab --errors strict", MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Tab' for record #2", errmsg)
        self.assertIn("missing_title.sdf", errmsg)

    def test_tab_id_tag_report(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag Tab --errors report", 2, MISSING_TITLE)
        self.assertIn("ERROR: Empty id tag 'Tab' for record #2", errmsg)
        self.assertEquals(fps[0].split("\t")[1], "Leading tab")
        self.assertEquals(fps[1].split("\t")[1], "This does not")

    def test_tab_id_tag_ignore(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag Tab --errors ignore", 2, MISSING_TITLE)
        self.assertNotIn("ERROR: Empty id tag 'Tab'", errmsg)
        self.assertEquals(fps[0].split("\t")[1], "Leading tab")
        self.assertEquals(fps[1].split("\t")[1], "This does not")

    def test_contains_tab_id_tag(self):
        headers, fps = self._runner.run_split("--id-tag ContainsTab", 3, MISSING_TITLE)
        ids = [fp.split("\t")[1] for fp in fps]
        self.assertEquals(ids, ["ThreeTabs", "tabseparated", "twotabs"])

    def test_contains_tab_id_tag_strict(self):
        headers, fps = self._runner.run_split("--id-tag ContainsTab --errors strict", 3, MISSING_TITLE)
        ids = [fp.split("\t")[1] for fp in fps]
        self.assertEquals(ids, ["ThreeTabs", "tabseparated", "twotabs"])

    def test_contains_tab_id_tag_report(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag ContainsTab --errors report", 3, MISSING_TITLE)
        self.assertNotIn("ContainsTab", errmsg)
        self.assertNotIn("ERROR", errmsg)
        ids = [fp.split("\t")[1] for fp in fps]
        self.assertEquals(ids, ["ThreeTabs", "tabseparated", "twotabs"])

    def test_contains_tab_id_tag_ignore(self):
        headers, fps, errmsg = self._runner.run_split_capture("--id-tag ContainsTab --errors ignore", 3, MISSING_TITLE)
        self.assertNotIn("ERROR: Empty id tag 'ContainsTab'", errmsg)
        ids = [fp.split("\t")[1] for fp in fps]
        self.assertEquals(ids, ["ThreeTabs", "tabseparated", "twotabs"])

    #
    # Handling bad files
    #
    def test_handles_missing_filename(self):
        errmsg = self._runner.run_exit("this_file_does_not_exist.sdf", PUBCHEM_SDF)
        self.assertIn("Structure file '", errmsg)
        self.assertIn("this_file_does_not_exist.sdf", errmsg)
        self.assertIn("' does not exist", errmsg)
        self.assertNotIn("pubchem", errmsg)

    def test_handles_missing_filename_at_end(self):
        errmsg = self._runner.run_exit([PUBCHEM_SDF, "this_file_does_not_exist.sdf"])
        self.assertIn("Structure file '", errmsg)
        self.assertIn("this_file_does_not_exist.sdf", errmsg)
        self.assertIn("' does not exist", errmsg)
        self.assertNotIn("pubchem", errmsg)

    def test_unreadable_file(self):
        tf = tempfile.NamedTemporaryFile(suffix="unreadable.sdf")
        try:
            os.chmod(tf.name, 0222)
            errmsg = self._runner.run_exit([PUBCHEM_SDF, tf.name])
            self.assertIn("Problem reading structure fingerprints", errmsg)
            self.assertIn("unreadable.sdf", errmsg)
            self.assertNotIn("pubchem", errmsg)
        finally:
            tf.close()
chemfp-1.1p1/tests/targets.fps0000644000077000000240000006413311660452124016612 0ustar dalkestaff00000000000000
#FPS1
#num_bits=1021
#software=OpenBabel/2.2.0
#type=OpenBabel-FP2/1
#source=targets.smi
#date=2011-02-20T21:23:29
002ec222800002240a0006800282080c8010009000425a0000200080002804001300508341800021108a20380001104001360e030820410a0001008280c0a6601a000c00410502010020a1091e8002040090035004003008c00a180000082010000400200020140f5020000000101820000701846a0110601882008122400300 22525001
002a0000000002000e00040100020000001000100002000000000000010400001000100080000001100a001020000000010200800020c013000000000000a20008100800000000000000880302800000008001500000200080000001000000000000002000000000000000004000380000070100000110000000000100000000 22525002
002a00000000e2000e00040102020000211000340002000000000000010400001000100880000017100a0010200010000102008000a0c013000000004000a210081028000010080000008803028000000080015000002000a000000100202000400000280038000004000000c000380000078100080510000000000100000000 22525003
002a8002000002c00704060102820000801000101002080000200080010400401c01500880200027300e061820001240010200800020c01b000000020000a6100a10080240000a0000009903028000040090015084002800c0080001000180000000082000000000506001009000382000070100000110000018000102400310 22525004
002a0002000002c00f00060100020000201000100002080000200080010400001c00500880000005300e161820000000010200800020c013000000020000a200081008020000000000008903028000000080015000002000c0080001000100000000002000000000004000005000380000070100000510800018000100000000 22525005 002200000000020000000401000200000010001000000000000000000100000010001000800000011000001000000000010200800020c003000000000000a00008000800000000000000800302000000008001500000200080000001000000000000002000000000000000000000080000050100000110000000000100000000 22525006 88260113006024a43ac886000013050d110001b400483a58010100c0000604701484449f6547003300aaa03820030760031b52100080480424a10002080646bc08a53e0a29141dd180202460c6db8183d0b9855000003048f1585f03800be00441968022001004620e8b03c0e11008306006119c2a275322089e00477b000004 22525007 88260113006024a43ac886000013050d110001b400483a58010100c0000604701484449f6547003300aaa03820030760031b52100080480424a10002080646bc08a53e0a29141dd180202460c6db8183d0b9855000003048f1585f03800be00441968022001004620e8b03c0e11008306006119c2a275322089e00477b000004 22525008 00000000000002000a00000100020000000000000000000000000000000400001000000000000001000a0000200000000100000000204013000100000000820008100800000000000000800028800004009000400000040000000201000000000000000000000000000000004000280000030100000100000000000140000500 22525009 0026127701000a240a0016800352086e001a00d0004468604024108843280d0810085000c09900e1100a011801011040012266030820d01230a10002004982e018802c02c00d02080000a1433e980294009005501c00370ac10838000088001002640228102010570006200010004810200f2180c2c130401a82048103080100 22525010 00040002000002400a000200020200603100800a1004080000400080000400001000080840000005000a0008200000000120000000204013000105022000820808900852000008000000b102288400540090084000001c0440000a0100490a0000002000004000000000200051002c8000430100020108000000208140000500 22525011 
800201871040002c21040c440202000871250e900060c800300000000196001c1a81100a81a34053921e0038a080004009a270219020c30180830020000402344c0348001c14118001200f812ac3071600b005c0813a2400c0401002000400010c041020a00210002110000080000941400631840353160a00e0048130004100 22525012 506ea642c04066e6030ea60144e31045a11596bcd0f07a60012000840834301d9481fc8d02f0b1b7f89e36392043d54303bad74010e0ff1fe40090cec48ea71cdcfd2c3b4d540e939a04f92336cd03de41fae1c4c50b3948e99a98c1826f2c974b1e2862a52a80222ec8b282f7133eb0030fe1d00b4f18e2392c24cf42404104 22525013 00260000002002000600060000020000001000100000000000000080010400001000500080000003100a001820000100010200000020c015000000020000a2180810080000000c4000008802028000110090015000002008c008080100000000400000200000000010000000c000280000070100000110000000000140000000 22525014 1806030600a402e40b000e0040c345018400c0a048485e2000000080200400001680400800a00065880e031820411440012e02009060490d008100b6840086281c70283845442e038000912a22c08280f8d100600c023c484808580100082101401600000000024201000246411008700107019009019120088804c303520004 22525015 102200000040a2240300048040020000010000b000c0c00000040000011420101e00100880a04017100e04102000000001a200029020c003000000800000a2100c1028080c14088002008a020a80021600f0015000222040a880100140000004020400284008000001002000c0000cc000070180000990020050048140040100 22525016 000a80000000fa80100004105002800aa00008340000034002010000020400001000100a8100002710200130202010000100080000a0d013000000000008a02008903a000000000008008803001600010080014010002010a2010001000220000220000c0428810000002000c0003880000d0100003110000000000148000000 22525017 0020010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 22525018 
11002080000002240000000000020400200000800048400000040000001420001400000800a00005801200002000002001e000001020413320030080040080040890082804040000040088022880021402d04040000204400880100102600001420420288002000200080200c000298000010180000800002000008140000500 22525019 00040000000000000200060000000004000000000000120000000080000800001000408140000003000a001800010000010002000800400000000002000000000800080000000000000000000280000400b000400000100840080800000800000000000000000002000000000000000000060100000100200800000000000000 22525020 00040000000000000200060000000004000000000000120000000080000800001000408140000003000a001800010000010002000800400000000002000000000800080000000000000000000280000400b000400000100840080800000800000000000000000002000000000000000000060100000100200800000000000000 22525021 00040000000000000200060000000004000000000000120000000080000800001000408140000003000a001800010000010002000800400000000002000000000800080000000000000000000280000400b000400000100840080800000800000000000000000002000000000000000000060100000100200800000000000000 22525022 00044060800000040e1006a080804004801304801020020000000080000000001000400000000023000a01788001104001400014080240080040009301800d200800080140010200000030000080000400b004400c023c08c0081800000c000000040000a00008830000300000000800000601021a0110100002008102004100 22525023 00044060800000040e1006a080804004801304801020020000000080000000001000400000000023000a01788001104001400014080240080040009301800d200800080140010200000030000080000400b004400c023c08c0081800000c000000040000a00008830000300000000800000601021a0110100002008102004100 22525024 00044060800000040e1006a080804004801304801020020000000080000000001000400000000023000a01788001104001400014080240080040009301800d200800080140010200000030000080000400b004400c023c08c0081800000c000000040000a00008830000300000000800000601021a0110100002008102004100 22525025 
0002000000000004000404012000000000000090000040000000200000000000100010000080000100000010000000000100000200004100400000000000080008001800040400000000000000000000000001410002200088001000000200000204000000000800000000000000080000040000000810000000000900000000 22525026 008c000008000e00024206000200000000c00030000000000010008040000000100040200000000010800c18008100300100000000200000000010021b000002010030010030000000002201000800000080014801001008f688080000080000301000400430000000100000180000000806c100020100182000002180000000 22525027 700501000000060002000620020400002c0000000000004004000080000000021800401800000000108000180001008001640000002040000000100200200000000000000000608001002001000800000080016000201008c018480000080000001000000100000400000000080008000006b101020108010010800110000000 22525028 700501000000060002000620020400002c0000000000004004000080000000021800401800000000108000180001008001640000002040000000100200200000000000000000608001002001000800000080016000201008c018480000080000001000000100000400000000080008000006b101020108010010800110000000 22525029 70040100000002000a000620020400002c00000000000000040000800000000218004018000000001000001800010080016c0000002040000000000200200000000000000000608001002001000000000080016000201008c008480000080000000000000000000000000000080008000006b101020100010012800110000000 22525030 700001000000020000000620000400002c0000000000000004000080000000021800401800000000100000180000008001640000002040020000000200200000000000000000608001000001000000000080016000200000c008400000000000000000000000000000000000080008000006b101000100010010800110000000 22525031 00cc000408001e00024206000200000000c000300001000000108080400000001000402010040001188a0c18008180300100001000204080000010221b00000209003801003000000000a201088800040090214841001008f688080000080000341000400430000000100001180008000806c10012010018a080002180000100 22525032 
008c000008000e00024206000200000000c00030000000000010008040000000100040200000000010800c18008100300100000000200000000010021b000002010030010030000000002201000800000080014801001008f688080000080000301000400430000000100000180000000806c100020100182000002180000000 22525033 008c000008000e00024206000203000000c0003000000000001000c040000000100440200000000010800c18008110300100020000200000000010021b000002092030030134000100002201020800004080014801001008f688080000080000301600400430000200110000181000200806c1000a0100382800002180000000 22525034 008c000008000e00024206000200000000c00030000000000010008040000000100040200000000110800c18008100300100000000204000000010021b00000209003801003000000000a201000800000080014801001008f688080000080000301000400430000000100000180008000806c100020100182000002180000000 22525035 000ec008080004a003000600020300000000009000021200000000c0000000001c00500000000001008e0618000100800102000000204000000800020020820008000800000400010000a0011a8800040090014000003008c0080800000800000010002200000000000000000000180000070100020110000098800100400300 22525036 0026c000004004e00300060002030004010020900000800002200080000800001c00508140000013708e2418000100800102000088284000000408020000021008001a0200100880000021011a88000600b0034000003008c0080800800b3000041000220000000000c000008000000000070180020110000858c00100000301 22525037 0026c000004004e003000600020300080000209000000000022000c0000000001c00500203000000708e0438000100800102000080280000000408020000020008001a0200040080012021011a88000600b0034000003008c0080900000b3000081000220000000010c000000000000000060184020110000038c00100000301 22525038 0006c000000004c00300060002030000000020900000000002200080000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200000000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525039 
002ec000000004e003000600020300000000209000020000022000c0000000001c00500000000001708e0418000100800102000080284000000408020000820008001a02000400000000a1011a8800040090034000003008c0080800000b3000001000220000000000c000000000180000070180020110000018c00100000301 22525040 0006c000000004c00300060002030000000020900000000002200080000000001c00500000000000708e0418000100800100000080280000000408020000020008001a020000000000002101188800040090034000003008c0080800000b3000001000020000000000c000000000000000060180020110000018c00100000301 22525041 0006c000000004c00300060002030000000020900000000002200080000000001c00500000000000708e0418000100800100000080280000000408020000020008001a020000000000002101188800040090034000003008c0080800000b3000001000020000000000c000000000000000060180020110000018c00100000301 22525042 008c000008000c24024206000200000000c000b0000000000030008040000000100040200000000000800c18008100300100000000200100000010021b000002010030010034000000002201000800000080014801001008f688180000080010301400400430020000100000180000000806c100020100182000002180000000 22525043 008c400008000c00024206000202000000c000300002000000100080400000001000402000000001008a0c18008100300100000000204000000010021b00820209003801003000000000a201188800040090014801001008f688080000080000301000400430080000100000180018000807c100020100182000002180000300 22525044 000ec008080004a003000600020300000000009000021200000000c0000000001c00500000000001008e0618000100800102000000204000000800020020820008000800000400010000a0011a8800040090014000003008c0080800000800000010002200000000000000000000180000070100020110000098800100400300 22525045 0026c000000004e003000600020300000000209000000000022000c0000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200040000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525046 
002ec000000004e003000600020300000000209000020000022000c0000000001c00500000000001708e0418000100800102000080284000000408020000820008001a02000400000000a1011a8800040090034000003008c0080800000b3000001000220000000000c000000000180000070180020110000018c00100000301 22525047 0026c000000004e003000600020300000000209000000000022000c0000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200040000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525048 0026c000000004e003000600020300000000209000000000022000c0000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200040000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525049 00060000004004e403000600020300082001209000000000022000c0000000001800500201000000308e0438000100800102000090280100000400020040020008001a0200040000002021010288000800a0034000003008c0081900000bb0100814002200000000004001000000000000060104020110000030c00100004001 22525050 0026c000004004e00300060002030004010020900000800002200080000800001c00508140000013708e2418000100800102000088284000000408020000021008001a0200100880000021011a88000600b0034000003008c0080800800b3000041000220000000000c000008000000000070180020110000858c00100000301 22525051 0006c000000004c00300060002030000000020900000000002200080000000001c00500000000000708e0418000100800100000080280000000408020000020008001a020000000000002101188800040090034000003008c0080800000b3000001000020000000000c000000000000000060180020110000018c00100000301 22525052 0026c000004004e00300060002030000010020900000800002200080000000001c00500000000013708e0418000100800102000080284000000408020000821008001a0200100880000021011a88000600b0034000003008c0080800000b3000001000220000000010c000008000080000070180020110000058c00100000301 22525053 
0026c000004004e003000600020300080000209000000000022000c0000000001c00500203000000708e0438000100800102000080280000000408020000020008001a0200040080012021011a88000600b0034000003008c0080900000b3000081000220000000010c000000000000000060184020110000038c00100000301 22525054 008c000008000c00024206000200000000c00030000000000010008040000000100040200000000100800c18008100300100000000204000000010021b00000209003801003000000000a201000800000080014801001008f688080000080000301000400430000000100000180018000806c100020100182000002180000000 22525055 008c001008000c04024206000000000000c20090004040000010008040002000100040200080000000800408008000300108000000202000000110021b0000220101100100140000000012000008020002800042010000085688980000000000b21400000530000000108000180000000806c100000100182000012110002000 22525056 00cc004008001c0402c206008900000000c2001040014000007c80a040100000101040201080000108800408008080300008001890000280000114121b0000220900180100100000100012010008000002802042010000085788980020000800b4140008a534000000108001184000000807c1001001401ca080012100063000 22525057 0006c000004004e003000600020300082000209000000000022000c0000000001800500201000000308e0438000100800102000090280000000400020000020008001a0200040000002021011a88000c00b0034000003008c0080900000bb0000810002200000000004001000000000000060104020110000030c00100000301 22525058 00060024000006e0030006000203c0800060009000000000002000c000000c001c0c500000000020308e4618040102c00102000000200000004000020020020018000812400510200000210102880180008801400400300ce0080800000980140010002200000000004000000000000000060540022110001018800102000010 22525059 008c000008000c24024206000200000000c000b0000000000030008040000000100040200000000000800c18008100300100000000200100000010021b000002010030010034000000002201000800000080014801001008f688180000080010301400400430020000100000180000000806c100020100182000002180000000 22525060 
0006c000000004c00300060002030000000020900000000002200080000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200000000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525061 0006c008080004c00300060002830000800020900000220002200080000000001c00500000200020708e0418000110c00100000080280008000408020000060008001a024000020000003101188800040090034004003808c0080800000b3000001000020000000000e000000000002800060180020110000098c00102400301 22525062 0006c000000004e003000600020300000000209000000000022000c0000000001c00500000000000708e0418000100800102000080280000000408020000020008001a0200040000000021011a8800040090034000003008c0080800000b3000001000220000000000c000000000000000060180020110000018c00100000301 22525063 0026c000004004e003000600020300080000209000000000022000c0000000001c00500203000000708e0438000100800102000080280000000408020000020008001a0200040080012021011a88000600b0034000003008c0080900000b3000081000220000000010c000000000000000060184020110000038c00100000301 22525064 000ec000000004c00300060002030000000020900002000002200080000000001c00500000000001708e0418000100800100000080284000000408020000820008001a02000000000000a101188800040090014000003008c0080800000b0000001000020000000000c000000000180000070180020110000018c00100000301 22525065 0026c000004004e00300060002030004010020900000800002200080000800001c00508140000013708e2418000100800102000088284000000408020000021008001a0200100880000021011a88000600b0034000003008c0080800800b3000041000220000000000c000008000000000070180020110000858c00100000301 22525066 0026c000004004e003000600020300080000209000000000022000c0000000001c00500203000000708e0438000100800102000080280000000408020000020008001a0200040080012021011a88000600b0034000003008c0080900000b3000081000220000000010c000000000000000060184020110000038c00100000301 22525067 
008c000000000a24024206000203400000c000b000200000001000c040000000100440200000000010000c18000100300100020000200000000010021b000002092030010134000100002201020000004080014800001008f688180004081010300600408030028200100000181020000806c1000a0100380800000180000000 22525068 008c000000000a24020206000202400000c000b0002012000010008040000000100040200100000010000c18000100300100020000200000000010021b000002090030010134000100002201020000000080014800001008f688180000081010300400408030068200100000181000000806c1000a0100380800000180000000 22525069 002a0000004022e00b0006000002000001042010000a800002200080010400051800500080100013700f143820000000010260019028c011000608020000821008101a0200100880000081030a80000e00b4014000002000c008144101030000000000200000000001c00200c000380500070100000110000050400140002501 22525070 08020100800a00c000900a800c0000006500000080000054000800800000000208504000000004260000000800004880000028206006400000300002008208000e2008000000020001102001000000820088006000400100c10800000041000000008801000088021400000401000013000601020c0310ca0880480020804800 22525071 08000100800a00c000100a800c0000006400000080000054000800800000000208504000000004260000000880000880000028206006400000300002008000000ea008000000028001002001000000830088006000c00100c1080000004100000000880100008802040000080100001300060100000100ca0810480020805800 22525072 2000c2008050048008020e800c01000544100010808002448202008000600102000050830108842701a101180008080000002a000004480000508082089800082c200c00000806800004a001241800060198014008020000e1080000004310000030100100808800260000004080180310060104800100680802484004800100 22525073 200000008000048008020e800c0100054410001080800244000000c000400102000050830108842700a001180000080000002a000004480000108082088000080c200c00000806000000a001201800000188014008020000e1080000004310800010000100808c00060000004000080310060104800110680802484004000000 22525074 
00000000800000c000000a800c0000004410000080000044000800800000000200004000000004260020000800000800000028000004400000100002008000000c200800000002000000200100000080008800400000000041080000004100000000000100008800040000000100001300020100000100480800480000000000 22525075 2000c2008050048008020e800c01000544100010808002448002008000400102000050830108842701a101180000080000002a000004480000108082088000082c200c00000806800000a001201800060198014008020000e1080000004310000010100100808800060000004080180310060104800100680802484004000100 22525076 0083000000482a6003809e00080000080100001000000040000840800000210018005000001004401006053800010a80010890000024000201200002001200000800080080001000001000002890008400b021d058102000c12801100081000012600801050041001000000000204000800e010200011000003000010c000300 22525077 e0000104000002400200420104020000402000000000000000000010000400001001480040019021000a000820002000010000002020400200000002000082040c100c00500002000000948000802000008800400430000080400000000106000000000000010000004000000000090000271500210900000008000102000000 22525078 e0000104000002400200420104020000402000000000000000000010000400001001480040019021000a000820002000010000002020400200000002000082040c100c00500002000000948000802000008800400430000080400000000106000000000000010000004000000000090000271500210900000008000102000000 22525079 000a8000000002000100040000020000000000100002000000000000000000001800100000000001100e041000000000010200000020400200000000000082000800080000000000000080000a800004009001400000200080000000000000000000002000000000000000000000180000050100000010000010000100000300 22525080 102a8200000004240104068002830004810000900042c008000400c0011420101880540080a0003330860419200110c0012602021060c94a60010002904024100810080049140a0105001809128802140090034104012c00c8881000004020100214003a000008021020200080101c300007018008111022285080cd42400700 22525081 
102a8200000004240104068002830004810000900042c008000400c0011420101880540080a0003330860419200110c0012602021060c94a60010002904024100810080049140a0105001809128802140090034104012c00c8881000004020100214003a000008021020200080101c300007018008111022285080cd42400700 22525082 102a8200000004240104068002830004810000900042c008000400c0011420101880540080a0003330860419200110c0012602021060c94a60010002904024100810080049140a0105001809128802140090034104012c00c8881000004020100214003a000008021020200080101c300007018008111022285080cd42400700 22525083 102a8200000004240104068002830004810000900042c008000400c0011420101880540080a0003330860419200110c0012602021060c94a60010002904024100810080049140a0105001809128802140090034104012c00c8881000004020100214003a000008021020200080101c300007018008111022285080cd42400700 22525084 102a8200000000240104068002830004810000981042d00c000400402114a0101800580080a0003330860419200110c0012602021060c14a60010002904024100810080069140a010d0018091a8c02540090034134012c00c8881000004120100214203a000008121020200080101c3400070180081118222850a089c2400700 22525085 102a8200000000240104068002830004810000981042d00c000400402114a0101800580080a0003330860419200110c0012602021060c14a60010002904024100810080069140a010d0018091a8c02540090034134012c00c8881000004120100214203a000008121020200080101c3400070180081118222850a089c2400700 22525086 102a8200000000240104068002830004810000981042d00c000400402114a0101800580080a0003330860419200110c0012602021060c14a60010002904024100810080069140a010d0018091a8c02540090034134012c00c8881000004120100214203a000008121020200080101c3400070180081118222850a089c2400700 22525087 102a8200000000240104068002830004810000981042d00c000400402114a0101800580080a0003330860419200110c0012602021060c14a60010002904024100810080069140a010d0018091a8c02540090034134012c00c8881000004120100214203a000008121020200080101c3400070180081118222850a089c2400700 22525088 
02024060000004a401408608260100008000049800401004030220e0000094001c005204004000208086061810041280010a004008001000024c000600200400040104002004020008002008828c00040298004010802100d00802000041800001340c22800400911880100004000004d00e011200011880041880012a004100 22525089 02064060000004a402408608a4010000800004d800001004030200e000009400100052040040002000800018100400000102004008000000024c000600000000040104002004020008002008820c00040298004010802108d0080a000041800001340c23000400810800100004000004d00e011200011880040000010a004100 22525090 600001040000020400020622000600000c0000a02080600004000040004020021800400800b00021008a0008001003c001e1000000024000000100000021422018010800440130a0010000000080030400900060042200046008100040000014020400004000800000000000000008000007b101001880011000800012000000 22525091 000400000000a20002000201022200620100902e10000000014000805004000010005800484000010002000a200100000100000000a04013000000022000800008102810800000008000a002288400540090404820001000600008410048200200000008004880000001600051003c0000030100020108000000200100000500 22525092 10000000000002240200000100020000001000800020000000000000000400001000000000000001000a00002000000001800000102040000001000000008200081008000404080000008002288000140090004000000000088010000000000100040000000000020000200000000c8000030100000100000000000140000500 22525093 002200c20000022401040400000010040005069000485800000000000000200418011002408000003006041000808000010200010020010222010001004001204001004000040400044001000281020000000150c000300080041000004400100604102020001000000010000000000000060000000010002090020910004000 22525094 002601a00000222402048600060140048000009040444008032000c000002c001004540000c0802010880018000132400102200008242002640000060042020004000c02200412000408200282880200000081510480300dea0018400048501603140022006808014200000020110400100690101271100020c010290a000010 22525095 
002a0000000002000600040000020000000000100000000000000000010400001000100080000003100a001020000000010200000020c013000000000000a21008100800000008000000880302800000008001500000200080000000000000000000002000000000000000008000380000070100000110000000000100000000 22525096 0000000000008004080000000004000000008002000040000000400000000008000000000000000002000000000000000000000080000000000000000000000000000000000000000000000000080000000000400000000000000000000000000005000000000000010000000024400000000000000000000000040000060000 22525097 00260002000002040a000600020000040000001000001a0001000080000000001000508100400005100a2018000100000102020000204003000100020002000008000c0220000000000021000280001400d0815000003008c0080800800800000104002000000002020000402000200000060100020110200802006300000400 22525098 01013000000000060000000000420100200000802010c00000048080001402001400000900800005800a0a002040000001e000001840650004000080040000040890282c00000002000000000080029000d2404040000000400010004000000500042018c00100000100200000100d800101210001000002000000c000008000 22525099 10000000000000040000000000020000000000a0000040000004000000100000100000000080000000000000000000000100000010000000000000000000000008002800000000000000000000000200008000400000004000001000000000000004000800000000000800000000000000000000000000000000000000000000 22525100 chemfp-1.1p1/tests/test_align.py0000644000077000000240000002413011665575604017141 0ustar dalkestaff00000000000000from __future__ import with_statement from cStringIO import StringIO import unittest2 import ctypes import chemfp from chemfp import arena, bitops import _chemfp zeros = ("0000\tfirst\n" "0010\tsecond\n" "0000\tthird\n") ordered_zeros = ("0000\tfirst\n" "0000\tthird\n" "0010\tsecond\n") class TestUnsortedAlignment(unittest2.TestCase): # Python strings are 1-, 2-, and 4-byte aligned def test_1_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=False, alignment=1) self.assertEquals(a.start_padding, 0) 
self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x10\x00\x00") self.assertEquals(a.storage_size, 2) def test_2_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=False, alignment=2) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x10\x00\x00") self.assertEquals(a.storage_size, 2) def test_4_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=False, alignment=4) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00") self.assertEquals(a.storage_size, 4) # A Python string might be 8-aligned, but no guarantee def test_8_alignment(self): arenas = [chemfp.load_fingerprints(StringIO(zeros), reorder=False, alignment=8) for i in range(10)] for a in arenas: if a.start_padding == a.end_padding == 0: s = a.arena else: self.assertEquals(a.arena[:a.start_padding], "\x00" * a.start_padding) self.assertEquals(a.arena[-a.end_padding:], "\x00" * a.end_padding) s = a.arena[a.start_padding:-a.end_padding] self.assertEquals(s, "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x10\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00") self.assertEquals(a.storage_size, 8) class TestReorderedAlignment(unittest2.TestCase): def test_1_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=True, alignment=1) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x10") def test_2_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=True, alignment=2) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 1) # The code overallocates one byte self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x10\x00") def test_4_alignment(self): a = chemfp.load_fingerprints(StringIO(zeros), reorder=True, alignment=4) self.assertEquals(a.start_padding, 0) 
self.assertEquals(a.end_padding, 3) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00") class TestAlreadyOrderedAlignment(unittest2.TestCase): def test_1_alignment(self): a = chemfp.load_fingerprints(StringIO(ordered_zeros), reorder=True, alignment=1) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x10") def test_2_alignment(self): a = chemfp.load_fingerprints(StringIO(ordered_zeros), reorder=True, alignment=2) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x10") def test_4_alignment(self): a = chemfp.load_fingerprints(StringIO(ordered_zeros), reorder=True, alignment=4) self.assertEquals(a.start_padding, 0) self.assertEquals(a.end_padding, 0) self.assertEquals(a.arena, "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00") def test_16_alignment(self): arenas = [chemfp.load_fingerprints(StringIO(ordered_zeros), reorder=True, alignment=16) for i in range(10)] for a in arenas: if a.start_padding == 0 and a.end_padding == 0: s = a.arena else: self.assertEquals(a.start_padding + a.end_padding + 1, 16) self.assertEquals(a.arena[:a.start_padding], "\x00" * a.start_padding) self.assertEquals(a.arena[-a.end_padding:], "\x00" * a.end_padding) s = a.arena[a.start_padding:-a.end_padding] self.assertEquals(s, ("\x00"*16 + "\x00"*16 + "\x00\x10" + "\x00"*14)) class TestOptimalAlignment(unittest2.TestCase): def setUp(self): self._has_popcnt = arena._has_popcnt self._has_ssse3 = arena._has_ssse3 self.get_alignment_method = bitops.get_alignment_method bitops.get_alignment_method = self.hook def tearDown(self): arena._has_popcnt = self._has_popcnt arena._has_ssse3 = self._has_ssse3 bitops.get_alignment_method = self.get_alignment_method def hook(self, name): return self.data[name] def test_tiny(self): for i in range(1, 9): self.assertEquals(arena.get_optimal_alignment(i), 1) def test_small(self): 
for i in range(9, 33): self.assertEquals(arena.get_optimal_alignment(i), 4) def test_medium(self): arena._has_popcnt = False arena._has_ssse3 = False for i in range(35, 225): self.assertEquals(arena.get_optimal_alignment(i), 8) def test_popcnt_ssse3_combinations(self): arena._has_popcnt = True arena._has_ssse3 = True self.data = {"align8-large": "POPCNT", "align8-small": "POPCNT", "align-ssse3": "ssse3"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 8) self.data = {"align8-large": "POPCNT", "align8-small": "LUT8-1", "align-ssse3": "ssse3"} self.assertEquals(arena.get_optimal_alignment(300), 64) self.assertEquals(arena.get_optimal_alignment(800), 8) self.data = {"align8-large": "LUT8-1", "align8-small": "POPCNT", "align-ssse3": "ssse3"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 64) self.data = {"align8-large": "LUT8-1", "align8-small": "LUT8-1", "align-ssse3": "ssse3"} self.assertEquals(arena.get_optimal_alignment(300), 64) self.assertEquals(arena.get_optimal_alignment(800), 64) # And now, with SSSE3 disabled self.data = {"align8-large": "POPCNT", "align8-small": "POPCNT", "align-ssse3": "POPCNT"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 8) self.data = {"align8-large": "POPCNT", "align8-small": "LUT8-1", "align-ssse3": "POPCNT"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 8) self.data = {"align8-large": "LUT8-1", "align8-small": "POPCNT", "align-ssse3": "POPCNT"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 8) self.data = {"align8-large": "LUT8-1", "align8-small": "LUT8-1", "align-ssse3": "POPCNT"} self.assertEquals(arena.get_optimal_alignment(300), 8) self.assertEquals(arena.get_optimal_alignment(800), 8) # I can't find a better solution than this. (!?) 
def _addressof(s):
    t = str(ctypes.c_char_p(s))
    after_open_paren = t.split("(")[1]
    return int(after_open_paren.strip(")"))

class TestFingerprintAlignment(unittest2.TestCase):
    def test_different_cases(self):
        for query in (
            ("1", 4, 8),
            ("12", 8, 8),
            ("123", 16, 16),
            ("abcd", 4, 8),
            ("abcd", 8, 8),
            ("abcd", 16, 16),
            ):
            fp, alignment, storage_size = query
            result = _chemfp.align_fingerprint(*query)
            start_padding, end_padding, s = result
            i = _addressof(s) + start_padding
            self.assertEquals(i % alignment, 0, (query, result))
            expected = fp + "\0" * (storage_size - len(fp))
            self.assertEquals(s[start_padding:-end_padding], expected, (query, expected, result))
            self.assertEquals(s[:start_padding], "\0"*start_padding)
            self.assertEquals(s[-end_padding:], "\0"*end_padding)

    def test_identical(self):
        # This fingerprint is aligned; no need to create a new one
        s = "blah"
        start_padding, end_padding, t = _chemfp.align_fingerprint(s, 4, 4)
        self.assertEquals(start_padding, 0)
        self.assertEquals(end_padding, 0)
        self.assertIs(s, t)

    def test_errors(self):
        with self.assertRaisesRegexp(ValueError, "must be a character buffer"):
            _chemfp.align_fingerprint(1, 4, 4)
        with self.assertRaisesRegexp(ValueError, "storage size is too small"):
            _chemfp.align_fingerprint("too long", 4, 4)
        with self.assertRaisesRegexp(ValueError, "storage size must be positive"):
            _chemfp.align_fingerprint("", 1, 0)
        with self.assertRaisesRegexp(ValueError, "storage size must be positive"):
            _chemfp.align_fingerprint("X", 1, -12)
        with self.assertRaisesRegexp(ValueError, "alignment must be a positive power of two"):
            _chemfp.align_fingerprint("1234", 3, 4)

if __name__ == "__main__":
    unittest2.main()
chemfp-1.1p1/tests/test_api.py0000644000077000000240000030072512106306046016606 0ustar dalkestaff00000000000000
from __future__ import absolute_import, with_statement
import os
import sys
import unittest2
from cStringIO import StringIO
import tempfile
import gzip
import shutil
import itertools
import random

import chemfp
from chemfp import bitops, io

try:
    import openbabel
    has_openbabel = True
except ImportError:
    has_openbabel = False

try:
    # I need to import 'oechem' to make sure I load the shared libraries
    from openeye import oechem
    if not oechem.OEChemIsLicensed():
        raise ImportError
    from chemfp import openeye
    has_openeye = True
except ImportError:
    has_openeye = False
    openeye = None

try:
    from rdkit import Chem
    has_rdkit = True
except ImportError:
    has_rdkit = False

from support import fullpath, PUBCHEM_SDF, PUBCHEM_SDF_GZ

DBL_MIN = 2.2250738585072014e-308 # Assumes 64 bit doubles
assert DBL_MIN > 0.0

CHEBI_TARGETS = fullpath("chebi_rdmaccs.fps")
CHEBI_QUERIES = fullpath("chebi_queries.fps.gz")
MACCS_SMI = fullpath("maccs.smi")

# Backwards compatibility for Python 2.5
try:
    next
except NameError:
    def next(it):
        return it.next()

def _tmpdir(testcase):
    dirname = tempfile.mkdtemp()
    testcase.addCleanup(shutil.rmtree, dirname)
    return dirname

QUERY_ARENA = next(chemfp.open(CHEBI_QUERIES).iter_arenas(10))

class CommonReaderAPI(object):
    _open = None

    def _check_target_metadata(self, metadata):
        self.assertEqual(metadata.num_bits, 166)
        self.assertEqual(metadata.num_bytes, 21)
        self.assertEqual(metadata.software, "OEChem/1.7.4 (20100809)")
        self.assertEqual(metadata.type, "RDMACCS-OpenEye/1")
        self.assertEqual(metadata.sources, ["/Users/dalke/databases/ChEBI_lite.sdf.gz"])
        self.assertEqual(metadata.date, "2011-09-16T13:49:04")
        self.assertEqual(metadata.aromaticity, "mmff")

    def _check_query_metadata(self, metadata):
        self.assertEqual(metadata.num_bits, 166)
        self.assertEqual(metadata.num_bytes, 21)
        self.assertEqual(metadata.software, "OEChem/1.7.4 (20100809)")
        self.assertEqual(metadata.type, "RDMACCS-OpenEye/1")
        self.assertEqual(metadata.sources, ["/Users/dalke/databases/ChEBI_lite.sdf.gz"])
        self.assertEqual(metadata.date, "2011-09-16T13:28:43")
        self.assertEqual(metadata.aromaticity, "openeye")

    def test_uncompressed_open(self):
        reader = self._open(CHEBI_TARGETS)
        self._check_target_metadata(reader.metadata)
        num =
sum(1 for x in reader) self.assertEqual(num, 2000) def test_compressed_open(self): reader = self._open(CHEBI_QUERIES) self._check_query_metadata(reader.metadata) num = sum(1 for x in reader) self.assertEqual(num, 154) def test_iteration(self): assert self.hit_order is not sorted, "not appropriate for sorted arenas" reader = iter(self._open(CHEBI_TARGETS)) fields = [next(reader) for i in range(5)] self.assertEqual(fields, [("CHEBI:776", "00000000000000008200008490892dc00dc4a7d21e".decode("hex")), ("CHEBI:1148", "000000000000200080000002800002040c0482d608".decode("hex")), ("CHEBI:1734", "0000000000000221000800111601017000c1a3d21e".decode("hex")), ("CHEBI:1895", "00000000000000000000020000100000000400951e".decode("hex")), ("CHEBI:2303", "0000000002001021820a00011681015004cdb3d21e".decode("hex"))]) def test_iter_arenas_default_size(self): assert self.hit_order is not sorted, "not appropriate for sorted arenas" reader = self._open(CHEBI_TARGETS) count = 0 for arena in reader.iter_arenas(): self._check_target_metadata(arena.metadata) if count == 0: # Check the values of the first arena self.assertEqual(arena.ids[-5:], ['CHEBI:16316', 'CHEBI:16317', 'CHEBI:16318', 'CHEBI:16319', 'CHEBI:16320']) self.assertEqual(len(arena), 1000) # There should be two of these count += 1 self.assertEqual(count, 2) self.assertEqual(arena.ids[-5:], ['CHEBI:17578', 'CHEBI:17579', 'CHEBI:17580', 'CHEBI:17581', 'CHEBI:17582']) def test_iter_arenas_select_size(self): assert self.hit_order is not sorted, "not appropriate for sorted arenas" reader = self._open(CHEBI_TARGETS) count = 0 for arena in reader.iter_arenas(100): self._check_target_metadata(arena.metadata) if count == 0: self.assertEqual(arena.ids[-5:], ['CHEBI:5280', 'CHEBI:5445', 'CHEBI:5706', 'CHEBI:5722', 'CHEBI:5864']) self.assertEqual(len(arena), 100) count += 1 self.assertEqual(count, 20) self.assertEqual(arena.ids[:5], ['CHEBI:17457', 'CHEBI:17458', 'CHEBI:17459', 'CHEBI:17460', 'CHEBI:17464']) def 
test_read_from_file_object(self): f = StringIO("""\ #FPS1 #num-bits=8 F0\tsmall """) reader = self._open(f) self.assertEqual(sum(1 for x in reader), 1) self.assertEqual(reader.metadata.num_bits, 8) def test_read_from_empty_file_object(self): f = StringIO("") reader = self._open(f) self.assertEqual(sum(1 for x in reader), 0) self.assertEqual(reader.metadata.num_bits, 0) def test_read_from_header_only_file_object(self): f = StringIO("""\ #FPS1 #num_bits=100 """) reader = self._open(f) self.assertEqual(sum(1 for x in reader), 0) self.assertEqual(reader.metadata.num_bits, 100) # # Count tanimoto hits using a fingerprint # def test_count_tanimoto_hits_fp_default(self): reader = self._open(CHEBI_TARGETS) num_hits = reader.count_tanimoto_hits_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex")) self.assertEqual(num_hits, 176) def test_count_tanimoto_hits_fp_set_default(self): # This is set to the default value reader = self._open(CHEBI_TARGETS) num_hits = reader.count_tanimoto_hits_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 0.7) self.assertEqual(num_hits, 176) def test_count_tanimoto_hits_fp_set_threshold(self): reader = self._open(CHEBI_TARGETS) num_hits = reader.count_tanimoto_hits_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 0.8) self.assertEqual(num_hits, 108) def test_count_tanimoto_hits_fp_set_max_threshold(self): reader = self._open(CHEBI_TARGETS) num_hits = reader.count_tanimoto_hits_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.0) self.assertEqual(num_hits, 1) def test_count_tanimoto_hits_fp_set_min_threshold(self): reader = self._open(CHEBI_TARGETS) num_hits = reader.count_tanimoto_hits_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = DBL_MIN) # It isn't 2000 since there are some scores of 0.0 self.assertEqual(num_hits, 1993) def test_count_tanimoto_hits_fp_0(self): reader = self._open(CHEBI_TARGETS) num_hits = 
reader.count_tanimoto_hits_fp("000000000000000000000000000000000000000000".decode("hex"), threshold = 1./1000) self.assertEqual(num_hits, 0) def test_count_tanimoto_hits_fp_threshold_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.count_tanimoto_hits_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.1): raise AssertionError("Should not happen!") reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.count_tanimoto_hits_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = -0.00001): raise AssertionError("Should not happen") # # Count tanimoto hits using an arena # def test_count_tanimoto_default(self): targets = self._open(CHEBI_TARGETS) counts = targets.count_tanimoto_hits_arena(QUERY_ARENA) self.assertSequenceEqual(counts, [4, 179, 40, 32, 1, 3, 28, 11, 46, 7]) def test_count_tanimoto_set_default(self): targets = self._open(CHEBI_TARGETS) counts = targets.count_tanimoto_hits_arena(QUERY_ARENA, threshold=0.7) self.assertSequenceEqual(counts, [4, 179, 40, 32, 1, 3, 28, 11, 46, 7]) def test_count_tanimoto_set_threshold(self): targets = self._open(CHEBI_TARGETS) counts = targets.count_tanimoto_hits_arena(QUERY_ARENA, threshold=0.9) self.assertSequenceEqual(counts, [0, 97, 7, 1, 0, 1, 1, 0, 1, 1]) def test_count_tanimoto_hits_threshold_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.count_tanimoto_hits_arena(QUERY_ARENA, threshold = 1.1): raise AssertionError("Shouldn't get here") reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.count_tanimoto_hits_arena(QUERY_ARENA, threshold = -0.00001): raise 
AssertionError("Shouldn't get here!") # # Threshold tanimoto search using a fingerprint # def test_threshold_tanimoto_search_fp_default(self): reader = self._open(CHEBI_TARGETS) result = reader.threshold_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex")) self.assertEqual(len(result), 176) hits = result.get_ids_and_scores() first_hits = [('CHEBI:3139', 0.72277227722772275), ('CHEBI:4821', 0.71134020618556704), ('CHEBI:15345', 0.94505494505494503), ('CHEBI:15346', 0.92307692307692313), ('CHEBI:15351', 0.96703296703296704), ('CHEBI:15371', 0.96703296703296704)] last_hits = [('CHEBI:17383', 0.72164948453608246), ('CHEBI:17422', 0.73913043478260865), ('CHEBI:17439', 0.81000000000000005), ('CHEBI:17469', 0.72631578947368425), ('CHEBI:17510', 0.70526315789473681), ('CHEBI:17552', 0.71578947368421053)] if self.hit_order is not sorted: self.assertEqual(hits[:6], first_hits) self.assertEqual(hits[-6:], last_hits) else: for x in first_hits + last_hits: self.assertIn(x, hits) def test_threshold_tanimoto_search_fp_set_default(self): # This is set to the default value reader = self._open(CHEBI_TARGETS) result = reader.threshold_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 0.7) self.assertEqual(len(result), 176) hits = result.get_ids_and_scores() first_hits = [('CHEBI:3139', 0.72277227722772275), ('CHEBI:4821', 0.71134020618556704), ('CHEBI:15345', 0.94505494505494503), ('CHEBI:15346', 0.92307692307692313), ('CHEBI:15351', 0.96703296703296704), ('CHEBI:15371', 0.96703296703296704)] last_hits = [('CHEBI:17383', 0.72164948453608246), ('CHEBI:17422', 0.73913043478260865), ('CHEBI:17439', 0.81000000000000005), ('CHEBI:17469', 0.72631578947368425), ('CHEBI:17510', 0.70526315789473681), ('CHEBI:17552', 0.71578947368421053)] if self.hit_order is not sorted: self.assertEqual(hits[:6], first_hits) self.assertEqual(hits[-6:], last_hits) else: for x in first_hits + last_hits: self.assertIn(x, hits) def 
test_threshold_tanimoto_search_fp_set_threshold(self): reader = self._open(CHEBI_TARGETS) result = reader.threshold_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 0.8) self.assertEqual(len(result), 108) hits = result.get_ids_and_scores() first_hits = [('CHEBI:15345', 0.94505494505494503), ('CHEBI:15346', 0.92307692307692313), ('CHEBI:15351', 0.96703296703296704), ('CHEBI:15371', 0.96703296703296704), ('CHEBI:15380', 0.92391304347826086), ('CHEBI:15448', 0.92391304347826086)] last_hits = [('CHEBI:15982', 0.81818181818181823), ('CHEBI:16304', 0.81000000000000005), ('CHEBI:16625', 0.94565217391304346), ('CHEBI:17068', 0.90526315789473688), ('CHEBI:17157', 0.94505494505494503), ('CHEBI:17439', 0.81000000000000005)] if self.hit_order is not sorted: self.assertEqual(hits[:6], first_hits) self.assertEqual(hits[-6:], last_hits) else: for x in first_hits + last_hits: self.assertIn(x, hits) def test_threshold_tanimoto_search_fp_set_max_threshold(self): reader = self._open(CHEBI_TARGETS) hits = reader.threshold_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.0) self.assertEqual(hits.get_ids_and_scores(), [('CHEBI:15523', 1.0)]) def test_threshold_tanimoto_search_fp_set_min_threshold(self): reader = self._open(CHEBI_TARGETS) results = reader.threshold_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = DBL_MIN) self.assertEqual(len(results), 1993) def test_threshold_tanimoto_search_fp_0_on_0(self): zeros = ("0000\tfirst\n" "0010\tsecond\n" "0000\tthird\n") f = StringIO(zeros) reader = self._open(f) result = reader.threshold_tanimoto_search_fp("0000".decode("hex"), threshold=0.0) hits = result.get_ids_and_scores() self.assertEqual(self.hit_order(hits), self.hit_order([ ("first", 0.0), ("second", 0.0), ("third", 0.0) ])) def test_threshold_tanimoto_search_fp_0(self): reader = self._open(CHEBI_TARGETS) results = reader.threshold_tanimoto_search_fp( 
"000000000000000000000000000000000000000000".decode("hex"), threshold = 1./1000) self.assertEqual(len(results), 0) def test_threshold_tanimoto_search_fp_threshold_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: reader.threshold_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.1) reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: reader.threshold_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = -0.00001) # # Threshold tanimoto search using an arena # def test_threshold_tanimoto_arena_default(self): targets = self._open(CHEBI_TARGETS) results = targets.threshold_tanimoto_search_arena(QUERY_ARENA) hits = [result.get_ids_and_scores() for result in results] self.assertEqual(map(len, results), [4, 179, 40, 32, 1, 3, 28, 11, 46, 7]) self.assertEqual(list(hits[0]), [('CHEBI:16148', 0.7142857142857143), ('CHEBI:17034', 0.8571428571428571), ('CHEBI:17302', 0.8571428571428571), ('CHEBI:17539', 0.72222222222222221)]) def test_threshold_tanimoto_arena_set_default(self): targets = self._open(CHEBI_TARGETS) results = targets.threshold_tanimoto_search_arena(QUERY_ARENA, threshold=0.7) self.assertEqual(map(len, results), [4, 179, 40, 32, 1, 3, 28, 11, 46, 7]) hits = [result.get_ids_and_scores() for result in results] self.assertEqual(self.hit_order(list(hits[-1])), self.hit_order([('CHEBI:15621', 0.8571428571428571), ('CHEBI:15882', 0.83333333333333337), ('CHEBI:16008', 0.80000000000000004), ('CHEBI:16193', 0.80000000000000004), ('CHEBI:16207', 1.0), ('CHEBI:17231', 0.76923076923076927), ('CHEBI:17450', 0.75)])) def test_threshold_tanimoto_arena_set_threshold(self): targets = self._open(CHEBI_TARGETS) results = targets.threshold_tanimoto_search_arena(QUERY_ARENA, threshold=0.9) self.assertEqual(map(len, results), [0, 
97, 7, 1, 0, 1, 1, 0, 1, 1]) hits = [result.get_ids_and_scores() for result in results] self.assertEqual(self.hit_order(list(hits[2])), self.hit_order([('CHEBI:15895', 1.0), ('CHEBI:16165', 1.0), ('CHEBI:16292', 0.93333333333333335), ('CHEBI:16392', 0.93333333333333335), ('CHEBI:17100', 0.93333333333333335), ('CHEBI:17242', 0.90000000000000002), ('CHEBI:17464', 1.0)])) def test_threshold_tanimoto_search_0_on_0(self): zeros = ("0000\tfirst\n" "0010\tsecond\n" "0000\tthird\n") query_arena = next(chemfp.open(StringIO(zeros)).iter_arenas()) self.assertEqual(query_arena.ids, ["first", "second", "third"]) targets = self._open(StringIO(zeros)) results = targets.threshold_tanimoto_search_arena(query_arena, threshold=0.0) self.assertEqual(map(len, results), [3, 3, 3]) targets = self._open(StringIO(zeros)) results = targets.threshold_tanimoto_search_arena(query_arena, threshold=0.000001) self.assertEqual(map(len, results), [0, 1, 0]) def test_threshold_tanimoto_search_threshold_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.threshold_tanimoto_search_arena(QUERY_ARENA, threshold = 1.1): raise AssertionError("should never get here") reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.threshold_tanimoto_search_arena(QUERY_ARENA, threshold = -0.00001): raise AssertionError("should never get here!") # # K-nearest tanimoto search using a fingerprint # def test_knearest_tanimoto_search_fp_default(self): reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp( "00000000100410200290000b03a29241846163ee1f".decode("hex")) hits = result.get_ids_and_scores() self.assertEqual(hits, [('CHEBI:8069', 1.0), ('CHEBI:6758', 0.78723404255319152), ('CHEBI:7983', 0.73999999999999999)]) def test_knearest_tanimoto_search_fp_set_default(self): # This is set to the 
default values reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), k = 3, threshold = 0.7) self.assertEqual(len(result), 3) hits = result.get_ids_and_scores() if hits[1][0] == "CHEBI:15483": self.assertEqual(hits, [('CHEBI:15523', 1.0), ('CHEBI:15483', 0.98913043478260865), ('CHEBI:15480', 0.98913043478260865)]) else: self.assertEqual(hits, [('CHEBI:15523', 1.0), ('CHEBI:15480', 0.98913043478260865), ('CHEBI:15483', 0.98913043478260865)]) def test_knearest_tanimoto_search_fp_set_knearest(self): reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), k = 5, threshold = 0.8) hits = result.get_ids_and_scores() expected = [('CHEBI:15523', 1.0), ('CHEBI:15483', 0.98913043478260865), ('CHEBI:15480', 0.98913043478260865), ('CHEBI:15478', 0.98901098901098905), ('CHEBI:15486', 0.97802197802197799)] if hits[1][0] == "CHEBI:15480" and hits[2][0] == "CHEBI:15483": expected[1], expected[2] = expected[2], expected[1] self.assertEqual(list(hits), expected) def test_knearest_tanimoto_search_fp_set_max_threshold(self): reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.0) hits = result.get_ids_and_scores() self.assertEqual(hits, [('CHEBI:15523', 1.0)]) def test_knearest_tanimoto_search_fp_set_knearest_1(self): reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), k = 1) self.assertEqual(result.get_ids_and_scores(), [('CHEBI:15523', 1.0)]) def test_knearest_tanimoto_search_fp_set_knearest_0(self): reader = self._open(CHEBI_TARGETS) result = reader.knearest_tanimoto_search_fp( "000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), k = 0) self.assertFalse(result) def 
test_knearest_tanimoto_search_fp_knearest_threshold_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive"): reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = 1.1) reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive"): reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), threshold = -0.00001) def test_knearest_tanimoto_search_fp_knearest_k_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "k must be non-negative") as e: reader.knearest_tanimoto_search_fp("000000102084322193de9fcfbffbbcfbdf7ffeff1f".decode("hex"), k = -1) # # K-nearest tanimoto search using an arena # def test_knearest_tanimoto_default(self): targets = self._open(CHEBI_TARGETS) results = targets.knearest_tanimoto_search_arena(QUERY_ARENA) self.assertEqual(map(len, results), [3, 3, 3, 3, 1, 3, 3, 3, 3, 3]) hits = [result.get_ids_and_scores() for result in results] first_hits = hits[0] if first_hits[0][0] == 'CHEBI:17302': self.assertEqual(list(first_hits), [('CHEBI:17302', 0.8571428571428571), ('CHEBI:17034', 0.8571428571428571), ('CHEBI:17539', 0.72222222222222221)]) else: self.assertEqual(list(first_hits), [('CHEBI:17034', 0.8571428571428571), ('CHEBI:17302', 0.8571428571428571), ('CHEBI:17539', 0.72222222222222221)]) def test_knearest_tanimoto_set_default(self): targets = self._open(CHEBI_TARGETS) results = targets.knearest_tanimoto_search_arena(QUERY_ARENA, k=3, threshold=0.7) self.assertEqual(map(len, results), [3, 3, 3, 3, 1, 3, 3, 3, 3, 3]) self.assertEqual(results[-1].get_ids_and_scores(), [('CHEBI:16207', 1.0), ('CHEBI:15621', 0.8571428571428571), ('CHEBI:15882', 0.83333333333333337)]) def test_knearest_tanimoto_set_threshold(self): targets = self._open(CHEBI_TARGETS) results = 
targets.knearest_tanimoto_search_arena(QUERY_ARENA, threshold=0.8) self.assertEqual(map(len, results), [2, 3, 3, 3, 1, 1, 3, 3, 3, 3]) self.assertEqual(results[6].get_ids_and_scores(), [('CHEBI:16834', 0.90909090909090906), ('CHEBI:17061', 0.875), ('CHEBI:16319', 0.84848484848484851)]) def test_knearest_tanimoto_search_knearest_range_error(self): reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.knearest_tanimoto_search_arena(QUERY_ARENA, threshold = 1.1): raise AssertionError("What?!") reader = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "threshold must between 0.0 and 1.0, inclusive") as e: for x in reader.knearest_tanimoto_search_arena(QUERY_ARENA, threshold = -0.00001): raise AssertionError("What2?!") class TestFPSReader(unittest2.TestCase, CommonReaderAPI): hit_order = staticmethod(lambda x: x) _open = staticmethod(chemfp.open) def test_row_iteration(self): reader = chemfp.open(CHEBI_TARGETS) num = sum(1 for x in reader.iter_rows()) self.assertEqual(num, 2000) row_reader = chemfp.open(CHEBI_TARGETS).iter_rows() fields = [next(row_reader) for i in range(5)] self.assertEqual(fields, [ ['00000000000000008200008490892dc00dc4a7d21e', 'CHEBI:776'], ['000000000000200080000002800002040c0482d608', 'CHEBI:1148'], ['0000000000000221000800111601017000c1a3d21e', 'CHEBI:1734'], ['00000000000000000000020000100000000400951e', 'CHEBI:1895'], ['0000000002001021820a00011681015004cdb3d21e', 'CHEBI:2303']]) def test_iter_blocks(self): reader = chemfp.open(CHEBI_TARGETS) line_counts = 0 has_776 = False has_17582 = False for block in reader.iter_blocks(): line_counts += block.count("\n") if "00000000000000008200008490892dc00dc4a7d21e\tCHEBI:776" in block: has_776 = True if "00000000020012008008000104000064844ca2521c\tCHEBI:17582" in block: has_17582 = True self.assertEqual(line_counts, 2000) self.assertTrue(has_776, "Missing CHEBI:776") self.assertTrue(has_17582, 
"Missing CHEBI:17582") def test_reiter_open_handle_arena_search(self): reader = chemfp.open(CHEBI_TARGETS) # The main goal is to prevent people from searching a # partially open file. This reflects an implementation # problem; the iterator should be shared across all instances. it = iter(reader) arena = next(it) for method in (reader.threshold_tanimoto_search_arena, reader.knearest_tanimoto_search_arena): with self.assertRaisesRegexp(TypeError, "FPS file is not at the start"): for x in method(arena): break def test_reiter_open_handle_fp_search(self): reader = chemfp.open(CHEBI_TARGETS) # The main goal is to prevent people from searching a # partially open file. This reflects an implementation # problem; the iterator should be shared across all instances. it = iter(reader) arena = next(it) fp = arena[0][1] # Get the fingerprint term for method in (reader.threshold_tanimoto_search_fp, reader.knearest_tanimoto_search_fp): with self.assertRaisesRegexp(TypeError, "FPS file is not at the start"): for x in method(fp): break def test_open_not_valid_object(self): with self.assertRaisesRegexp(ValueError, r"Unknown source type \(1\+4j\)"): reader = self._open(1+4j) _cached_fingerprint_load = {} class TestLoadFingerprints(unittest2.TestCase, CommonReaderAPI): hit_order = staticmethod(lambda x: x) # Hook to handle the common API def _open(self, name): try: return _cached_fingerprint_load[name] except KeyError: arena = chemfp.load_fingerprints(name, reorder=False) _cached_fingerprint_load[name] = arena return arena def test_slice_ids(self): fps = self._open(CHEBI_TARGETS) self.assertEqual(fps.ids[4:10], fps[4:10].ids) self.assertEqual(fps.ids[5:20][1:5], fps[6:10].ids) def test_slice_fingerprints(self): fps = self._open(CHEBI_TARGETS) self.assertEqual(fps[5:45][0], fps[5]) self.assertEqual(fps[5:45][0], fps[5]) self.assertEqual(fps[5:45][3:6][0], fps[8]) def test_slice_negative(self): fps = self._open(CHEBI_TARGETS) self.assertEqual(fps[len(fps)-1], fps[-1]) 
self.assertEqual(fps.ids[-2:], fps[-2:].ids) self.assertEqual(fps.ids[-2:], fps[-2:].ids) self.assertEqual(list(fps[-2:]), [fps[-2], fps[-1]]) self.assertEqual(fps[-5:-2][-1], fps[-3]) def test_slice_past_end(self): fps = self._open(CHEBI_TARGETS) self.assertSequenceEqual(fps[1995:], [ ('CHEBI:17578', '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x16'), ('CHEBI:17579', '\x00\x00\x00\x00\x00\x00\x02\x00\x02\n\x00\x00\x04\x88,\x80\x00\x105\x80\x14'), ('CHEBI:17580', '\x00\x00\x00\x00\x02\x00\x02\x00\x02\n\x00\x02\x84\x88,\x00\x08\x14\x94\x94\x08'), ('CHEBI:17581', '\x00\x00\x00\x00\x00\x000\x01\x80\x00\x02O\x030\x90d\x9c\x7f\xf3\xff\x1d'), ('CHEBI:17582', '\x00\x00\x00\x00\x02\x00\x12\x00\x80\x08\x00\x01\x04\x00\x00d\x84L\xa2R\x1c'), ]) self.assertSequenceEqual(fps[2000:], []) def test_slice_errors(self): arena = self._open(CHEBI_TARGETS) with self.assertRaisesRegexp(IndexError, "arena fingerprint index out of range"): arena[len(arena)] with self.assertRaisesRegexp(IndexError, "arena fingerprint index out of range"): arena[-len(arena)-1] with self.assertRaisesRegexp(IndexError, "arena slice step size must be 1"): arena[4:45:2] with self.assertRaisesRegexp(IndexError, "arena slice step size must be 1"): arena[45:5:-1] def test_search_in_slice(self): fps = self._open(CHEBI_TARGETS) for i, (id, fp) in enumerate(fps): subarena = fps[i:i+1] self.assertEqual(len(subarena), 1) self.assertEqual(subarena[0][0], id) self.assertEqual(subarena[0][1], fp) self.assertEqual(subarena.ids[0], id) results = subarena.threshold_tanimoto_search_arena(subarena) self.assertEqual(len(results), 1) self.assertEqual(results[0].get_ids_and_scores(), [(id, 1.0)]) hits = [result.get_ids_and_scores() for result in results] self.assertEqual(hits, [[(id, 1.0)]]) results = subarena.knearest_tanimoto_search_arena(subarena) query_ids, hits = zip(*result) self.assertEqual(len(hits), 1) self.assertEqual(results[0].get_ids_and_scores(), [(id, 1.0)]) hits = 
[result.get_ids_and_scores() for result in results] self.assertEqual(hits, [[(id, 1.0)]]) counts = subarena.count_tanimoto_hits_arena(subarena) self.assertEqual(len(counts), 1) self.assertEqual(counts[0], 1) self.assertEqual(list(counts), [1]) self.assertEqual(i, len(fps)-1) def test_missing_metatdata_size(self): pairs = [("first", "1234".decode("hex")), ("second", "ABCD".decode("hex"))] with self.assertRaisesRegexp(ValueError, "metadata must contain at least one of num_bits or num_bytes"): chemfp.load_fingerprints(pairs, chemfp.Metadata(type="Blah!")) def test_read_from_id_fp_pairs_num_bytes(self): pairs = [("first", "1234".decode("hex")), ("second", "ABCD".decode("hex"))] arena = chemfp.load_fingerprints(pairs, chemfp.Metadata(type="Blah!", num_bytes=2)) self.assertEqual(len(arena), 2) self.assertEqual(arena[0], ("first", "1234".decode("hex"))) self.assertEqual(arena[1], ("second", "ABCD".decode("hex"))) def test_read_from_id_fp_pairs_num_bits(self): pairs = [("first", "1234".decode("hex")), ("second", "ABCD".decode("hex"))] arena = chemfp.load_fingerprints(pairs, chemfp.Metadata(type="Blah!", num_bits=16)) self.assertEqual(len(arena), 2) self.assertEqual(arena[0], ("first", "1234".decode("hex"))) self.assertEqual(arena[1], ("second", "ABCD".decode("hex"))) def test_declared_size_mismatch(self): pairs = [("first", "1234".decode("hex"))] with self.assertRaisesRegexp(ValueError, "Fingerprint for id 'first' has 2 bytes " "while the metadata says it should have 4"): arena = chemfp.load_fingerprints(pairs, chemfp.Metadata(type="Blah!", num_bytes=4)) # Use this to verify the other implementations from chemfp.slow import SlowFingerprints _cached_slow_fingerprint_load = {} class TestSlowFingerprints(unittest2.TestCase, CommonReaderAPI): hit_order = staticmethod(lambda x: x) def _open(self, name): try: return _cached_slow_fingerprint_load[name] except KeyError: reader = chemfp.open(name) slow_arena = SlowFingerprints(reader.metadata, list(reader)) 
_cached_slow_fingerprint_load[name] = slow_arena return slow_arena _cached_ordered_fingerprint_load = {} class TestLoadFingerprintsOrdered(unittest2.TestCase, CommonReaderAPI): hit_order = staticmethod(sorted) # Hook to handle the common API def _open(self, name): try: return _cached_ordered_fingerprint_load[name] except KeyError: arena = chemfp.load_fingerprints(name, reorder=True) _cached_ordered_fingerprint_load[name] = arena return arena def test_iteration(self): expected = [("CHEBI:776", "00000000000000008200008490892dc00dc4a7d21e".decode("hex")), ("CHEBI:1148", "000000000000200080000002800002040c0482d608".decode("hex")), ("CHEBI:1734", "0000000000000221000800111601017000c1a3d21e".decode("hex")), ("CHEBI:1895", "00000000000000000000020000100000000400951e".decode("hex")), ("CHEBI:2303", "0000000002001021820a00011681015004cdb3d21e".decode("hex"))] found = [] for x in self._open(CHEBI_TARGETS): try: found.append(expected.index(x)) except ValueError: pass self.assertEqual(sorted(found), [0, 1, 2, 3, 4]) def test_arena_is_ordered_by_popcount(self): arena = self._open(CHEBI_TARGETS) prev = 0 for id, fp in arena: popcount = bitops.byte_popcount(fp) self.assertTrue(prev <= popcount, (prev, popcount)) prev = popcount def test_iter_arenas_default_size(self): arena = self._open(CHEBI_TARGETS) ids = [id for (id, fp) in arena] for subarena in arena.iter_arenas(): self.assertEqual(len(subarena), 1000) subids = [id for (id, fp) in subarena] self.assertEqual(ids[:1000], subids) del ids[:1000] self.assertFalse(ids) def test_iter_arenas_select_size(self): arena = self._open(CHEBI_TARGETS) ids = [id for (id, fp) in arena] prev = 0 for subarena in arena.iter_arenas(100): self._check_target_metadata(subarena.metadata) self.assertEqual(len(subarena), 100) subids = [] for id, fp in subarena: subids.append(id) popcount = bitops.byte_popcount(fp) self.assertTrue(prev <= popcount, (prev, popcount)) prev = popcount self.assertEqual(ids[:100], subids) del ids[:100] self.assertFalse(ids) 
def test_iter_arenas_arena_size_None(self): arena = self._open(CHEBI_TARGETS) # use None to read all fingerprints into a single arena subarenas = list(arena.iter_arenas(None)) self.assertEqual(len(subarenas), 1) self.assertEqual(len(subarenas[0]), 2000) _expected_records = dict((rec[0], rec) for rec in QUERY_ARENA) _expected_ids = set(_expected_records) class TestArenaCopy(unittest2.TestCase): def _check_by_id(self, arena, id): # The identifier lookup is a bit tricky with copies. # This helps me feel a bit better. self.assertEqual(arena.get_by_id(id), _expected_records[id]) self.assertEqual(arena.get_fingerprint_by_id(id), _expected_records[id][1]) i = arena.get_index_by_id(id) self.assertNotEqual(i, None) self.assertEqual(arena[i], _expected_records[id]) # And as long as I'm here, make sure I can look up a random id .. new_id = random.choice(arena.ids) self.assertEqual(arena.get_by_id(new_id), _expected_records[new_id]) # .. and a record which doesn't exist missing_ids = _expected_ids - set(arena.ids) if missing_ids: missing_id = missing_ids.pop() else: missing_id = "spam" self.assertEqual(arena.get_by_id(missing_id), None) self.assertEqual(arena.get_fingerprint_by_id(missing_id), None) self.assertEqual(arena.get_index_by_id(missing_id), None) def _compare(self, arena1, arena2): self.assertEqual(len(arena1), len(arena2)) for i in range(len(arena1)): self.assertEqual((i, arena1[i]), (i, arena2[i])) def _anti_compare(self, arena1, arena2): # These are supposed to be different, where the second is already ordered by popcount assert arena2.popcount_indices != "", "arena2 must be sorted!" 
self.assertEqual(len(arena1), len(arena2)) values1 = list(arena1) values2 = list(arena2) self.assertNotEqual(values1, values2) indices = range(len(arena1)) indices.sort(key = lambda i: (bitops.byte_popcount(values1[i][1]), i)) ordered_values1 = [values1[i] for i in indices] self.assertEqual(ordered_values1, values2) def test_simple_copy_of_unordered_arena(self): # A copy of an unordered arena leaves things unordered arena1 = QUERY_ARENA.copy() self.assertEqual(arena1.popcount_indices, "") # internal API; make sure it's unsorted self._compare(QUERY_ARENA, arena1) # Do it again to make sure. arena2 = arena1.copy() self._compare(arena1, arena2) self._check_by_id(arena1, "CHEBI:17586") self._check_by_id(arena2, "CHEBI:17586") def test_reordered_copy_of_unordered_arena(self): arena1 = QUERY_ARENA.copy(reorder=True) self._anti_compare(QUERY_ARENA, arena1) self.assertNotEqual(arena1.popcount_indices, "") # internal API; make sure it's sorted # Do another copy. This triggers a different path through the code. 
arena2 = arena1.copy() self._compare(arena1, arena2) self.assertIs(arena1.popcount_indices, arena2.popcount_indices) # internal API; share popcounts self._check_by_id(arena1, "CHEBI:17586") self._check_by_id(arena2, "CHEBI:17586") def test_simple_copy_of_aligned_unordered_arena(self): arena1 = chemfp.load_fingerprints(QUERY_ARENA, alignment=128, reorder=False) self.assertEqual(arena1.alignment, 128) self.assertEqual(arena1.start_padding + arena1.end_padding, 128-1) arena2 = arena1.copy(reorder=False) self._compare(arena1, arena2) arena3 = arena2.copy(reorder=True) self._anti_compare(arena1, arena3) self._check_by_id(arena1, "CHEBI:17586") self._check_by_id(arena3, "CHEBI:17586") def test_copy_of_reordered_arena_slice(self): arena1 = QUERY_ARENA.copy(reorder=True) arena2_slice = arena1[1:8] arena2_copy = arena2_slice.copy() self._compare(arena2_slice, arena2_copy) self._check_by_id(arena2_slice, "CHEBI:17587") self._check_by_id(arena2_copy, "CHEBI:17587") def test_copy_of_unordered_arena_slice(self): arena1 = QUERY_ARENA.copy(reorder=False) arena2_slice = arena1[1:8] arena2_copy = arena2_slice.copy() self._compare(arena2_slice, arena2_copy) self._check_by_id(arena2_slice, "CHEBI:17586") self._check_by_id(arena2_copy, "CHEBI:17586") def test_empty_input(self): arena1 = QUERY_ARENA[2:8][2:4][1:1] arena2 = arena1.copy() self._compare(arena1, arena2) ##### Work with indices def test_select_all(self): arena1 = QUERY_ARENA.copy() arena2 = arena1.copy(indices=range(len(arena1)), reorder=False) self._compare(arena1, arena2) self._check_by_id(arena1, "CHEBI:17586") self._check_by_id(arena2, "CHEBI:17586") def test_select_all_reversed(self): arena1 = QUERY_ARENA.copy() arena2 = arena1.copy(indices=range(len(arena1)-1, -1, -1), reorder=False) self.assertEqual(arena1.ids, arena2.ids[::-1]) self.assertEqual(list(arena1), list(reversed(arena2))) self._check_by_id(arena2, "CHEBI:17586") def test_subset_equals_slice(self): arena1 = QUERY_ARENA.copy(indices=range(2, 8), 
reorder=False) self._compare(arena1, QUERY_ARENA[2:8]) self._check_by_id(arena1, "CHEBI:17589") def test_double_subset_equals_slice(self): arena1 = QUERY_ARENA.copy(indices=range(2, 8), reorder=False) arena2 = arena1.copy(indices=range(1, 3), reorder=False) self._compare(arena2, QUERY_ARENA[3:5]) self._check_by_id(arena2, "CHEBI:17588") def test_negative_subset(self): arena1 = QUERY_ARENA.copy(indices=[-4, -3, -2], reorder=False) self._compare(arena1, QUERY_ARENA[6:9]) self._check_by_id(arena1, "CHEBI:17592") def test_duplicate_indices(self): arena1 = QUERY_ARENA.copy(indices=[0, 0, 0], reorder=False) self.assertEqual(len(arena1), 3) self.assertEqual(arena1[0], arena1[1]) self.assertEqual(arena1[0], arena1[2]) self._check_by_id(arena1, "CHEBI:17585") def test_ordered_copy_with_indicies_from_unordered(self): arena1 = QUERY_ARENA.copy(indices=[3, 5, 9], reorder=True) self.assertNotEqual(arena1.popcount_indices, "") popcounts = [bitops.byte_popcount(fp) for (id, fp) in arena1] self.assertEqual(popcounts, sorted(popcounts)) arena2 = chemfp.load_fingerprints((QUERY_ARENA[i] for i in [3, 5, 9]), QUERY_ARENA.metadata) self._compare(arena1, arena2) self._check_by_id(arena1, "CHEBI:17597") self._check_by_id(arena2, "CHEBI:17597") def test_unordered_copy_with_indicies_from_ordered(self): arena1 = QUERY_ARENA.copy(reorder=True) self.assertNotEqual(arena1.popcount_indices, "") arena2 = arena1.copy(indices=[2,3,6,7,9], reorder=False) self.assertEqual(arena2.popcount_indices, "") self._check_by_id(arena2, "CHEBI:17597") def test_ordered_copy_with_indicies_from_ordered(self): arena1 = QUERY_ARENA.copy(reorder=True) self.assertNotEqual(arena1.popcount_indices, "") arena2 = arena1.copy(indices=[2,3,6,7,9]) self.assertNotEqual(arena2.popcount_indices, "") self.assertEqual(arena1[2], arena2[0]) self.assertEqual(arena1[3], arena2[1]) # make sure that reorder=True changes nothing arena2 = arena1.copy(indices=[2,3,6,7,9], reorder=True) self.assertNotEqual(arena2.popcount_indices, "") 
self.assertEqual(arena1[2], arena2[0]) self.assertEqual(arena1[3], arena2[1]) def test_empty_input_with_indices(self): arena1 = QUERY_ARENA.copy(indices=[]) self._compare(arena1, QUERY_ARENA[8:7]) def test_indices_out_of_range(self): with self.assertRaisesRegexp(IndexError, "arena fingerprint index 100 is out of range"): QUERY_ARENA.copy(indices=[100]) with self.assertRaisesRegexp(IndexError, "arena fingerprint index 0 is out of range"): QUERY_ARENA[4:4].copy(indices=[0]) def test_reorder_is_none_when_arena_is_unordered(self): arena1 = QUERY_ARENA.copy(reorder=None) self.assertEqual(arena1.popcount_indices, "") def test_reorder_is_none_when_arena_is_ordered(self): arena1 = QUERY_ARENA.copy(reorder=True) self.assertNotEqual(arena1.popcount_indices, "") arena2 = arena1.copy() self.assertNotEqual(arena2.popcount_indices, "") # These tests use part of the private, internal API (the "_id_lookup"). # That's the only way I could figure out to make sure I'm caching correctly. class TestArenaGetById(unittest2.TestCase): def test_get_by_id(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None record = arena.get_by_id("CHEBI:17586") self.assertEqual(record, ("CHEBI:17586", "002000102084302197d69ecfbbf3b4ffdf6ffeff1f".decode("hex"))) assert arena._id_lookup is not None def test_get_by_id_not_present(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None record = arena.get_by_id("spam") self.assertIs(record, None) assert arena._id_lookup is not None def test_get_index_by_id(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None index = arena.get_index_by_id("CHEBI:17586") self.assertEqual(index, 1) assert arena._id_lookup is not None def test_get_index_by_id_not_present(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None index = arena.get_index_by_id("spam") self.assertIs(index, None) assert arena._id_lookup is not None def test_get_fingerprint_by_id(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None fp = 
arena.get_fingerprint_by_id("CHEBI:17586") self.assertEqual(fp, "002000102084302197d69ecfbbf3b4ffdf6ffeff1f".decode("hex")) assert arena._id_lookup is not None def test_get_fingerprint_by_id_not_present(self): arena = QUERY_ARENA.copy() assert arena._id_lookup is None fp = arena.get_fingerprint_by_id("spam") self.assertIs(fp, None) assert arena._id_lookup is not None def test_duplicate_id(self): arena = chemfp.load_fingerprints([("id1", "ABCD"), ("id2", "EFGH"), ("id3", "IJKL"), ("id2", "MNOP")], metadata = chemfp.Metadata(num_bytes=4), reorder=False) self.assertEqual(arena.get_fingerprint_by_id("id1"), "ABCD") self.assertEqual(arena.get_fingerprint_by_id("id3"), "IJKL") self.assertIn(arena.get_fingerprint_by_id("id2"), ("EFGH", "MNOP")) self.assertEqual(arena.get_index_by_id("id1"), 0) self.assertEqual(arena.get_index_by_id("id3"), 2) self.assertIn(arena.get_index_by_id("id2"), (1, 3)) SDF_IDS = ['9425004', '9425009', '9425012', '9425015', '9425018', '9425021', '9425030', '9425031', '9425032', '9425033', '9425034', '9425035', '9425036', '9425037', '9425040', '9425041', '9425042', '9425045', '9425046'] class ReadStructureFingerprints(object): def _read_ids(self, *args, **kwargs): reader = chemfp.read_structure_fingerprints(*args, **kwargs) self.assertEqual(reader.metadata.num_bits, self.num_bits) ids = [id for (id, fp) in reader] self.assertEqual(len(fp), self.fp_size) return ids def test_read_simple(self): ids = self._read_ids(self.type, source=PUBCHEM_SDF) self.assertEqual(ids, SDF_IDS) def test_read_simple_compressed(self): ids = self._read_ids(self.type, source=PUBCHEM_SDF_GZ) self.assertEqual(ids, SDF_IDS) def test_read_missing_filename(self): with self.assertRaises(IOError): self._read_ids(self.type, "this_file_does_not_exist.sdf") def test_read_metadata(self): metadata = chemfp.Metadata(type=self.type) ids = self._read_ids(metadata, source=PUBCHEM_SDF_GZ) self.assertEqual(ids, SDF_IDS) def test_read_sdf_gz(self): ids = self._read_ids(self.type, 
source=PUBCHEM_SDF_GZ, format="sdf.gz") self.assertEqual(ids, SDF_IDS) def test_read_sdf(self): ids = self._read_ids(self.type, source=PUBCHEM_SDF, format="sdf") self.assertEqual(ids, SDF_IDS) def test_read_bad_format(self): with self.assertRaisesRegexp(ValueError, "Unknown structure format 'xyzzy'"): self._read_ids(self.type, source=PUBCHEM_SDF, format="xyzzy") def test_read_bad_compression(self): with self.assertRaisesRegexp(ValueError, "Unsupported compression in format 'sdf.Z'"): self._read_ids(self.type, source=PUBCHEM_SDF, format="sdf.Z") def test_read_bad_format_specification(self): with self.assertRaisesRegexp(ValueError, "Incorrect format syntax '@'"): self._read_ids(self.type, source=PUBCHEM_SDF, format="@") def test_read_id_tag(self): ids = self._read_ids(self.type, source=PUBCHEM_SDF, id_tag = "PUBCHEM_MOLECULAR_FORMULA") self.assertEqual(ids, ["C16H16ClFN4O2", "C18H20N6O3", "C14H19N5O3", "C23H24N4O3", "C18H23N5O3S", "C19H21ClN4O4", "C18H31N6O4S+", "C18H30N6O4S", "C16H20N4O2", "C19H21N5O3S", "C18H22N4O2", "C18H20ClN5O3", "C16H20N8O2", "C15H17ClN6O3", "C19H21N5O4", "C17H19N5O4", "C17H19N5O4", "C19H23N5O2S", "C15H17BrN4O3"]) # I decided to not check this. The failure is that you'll get a "can't find id" error. Oh well. 
# def test_read_invalid_id_tag(self): # self._read_ids(self.type, PUBCHEM_SDF, id_tag = "This\tis\ninvalid>") def test_read_smiles(self): # Need at least one test with some other format ids = self._read_ids(self.type, source=MACCS_SMI) self.assertEqual(ids, ["3->bit_2", "4->bit_3", "5->bit_4", "6->bit_5", "10->bit_9", "11->bit_10", "17->bit_16"] ) def test_read_unknown_format(self): # XXX Fix this - figure out the right way to handle filename-extension/format-option with self.assertRaisesRegexp(ValueError, "Unknown structure (format|filename extension).*should_be_sdf_but_is_not"): self._read_ids(self.type, fullpath("pubchem.should_be_sdf_but_is_not")) def test_read_known_format(self): ids = self._read_ids(self.type, fullpath("pubchem.should_be_sdf_but_is_not"), "sdf") self.assertEqual(ids, SDF_IDS) def test_read_errors(self): with self.assertRaisesRegexp(chemfp.ParseError, "Missing title for record #1 .*missing_title.sdf"): self._read_ids(self.type, fullpath("missing_title.sdf")) def test_read_errors_strict(self): with self.assertRaisesRegexp(chemfp.ParseError, "Missing title for record #1 .*missing_title.sdf"): self._read_ids(self.type, fullpath("missing_title.sdf"), errors="strict") def test_read_errors_ignore(self): ids = self._read_ids(self.type, fullpath("missing_title.sdf"), errors="ignore") self.assertEqual(ids, ["Good"]) def test_read_errors_report(self): import sys from cStringIO import StringIO old_stderr = sys.stderr sys.stderr = new_stderr = StringIO() try: ids = self._read_ids(self.type, fullpath("missing_title.sdf"), errors="report") finally: sys.stderr = old_stderr errmsg = new_stderr.getvalue() self.assertEqual(ids, ["Good"]) self.assertIn("ERROR: Missing title for record #1", errmsg) self.assertNotIn("record #2", errmsg) self.assertIn("ERROR: Missing title for record #3", errmsg) self.assertIn("Skipping.\n", errmsg) def test_read_errors_wrong_setting(self): with self.assertRaisesRegexp(ValueError, "'errors' must be one of ignore, report, strict"): 
self._read_ids(self.type, PUBCHEM_SDF, errors="this-is-not.a.valid! setting") # Test classes for the different toolkits class TestOpenBabelReadStructureFingerprints(unittest2.TestCase, ReadStructureFingerprints): type = "OpenBabel-FP2/1" num_bits = 1021 fp_size = 128 TestOpenBabelReadStructureFingerprints = ( unittest2.skipUnless(has_openbabel, "Open Babel not available")(TestOpenBabelReadStructureFingerprints) ) class TestOpenEyeReadStructureFingerprints(unittest2.TestCase, ReadStructureFingerprints): type = "OpenEye-Path/" + getattr(openeye, "OEGRAPHSIM_API_VERSION", "1") num_bits = 4096 fp_size = 512 # I haven't yet figured out how 'aromaticity' is exposed in the high-level interface # For now, test the toolkit-specific API def test_read_unknown_aromaticity(self): with self.assertRaisesRegexp(ValueError, "Unsupported aromaticity model 'smelly'"): openeye.read_structures(PUBCHEM_SDF, aromaticity="smelly") def test_default_aromaticity(self): mol_reader = openeye.read_structures(PUBCHEM_SDF) default_smiles = [oechem.OECreateIsoSmiString(mol) for (id, mol) in mol_reader] mol_reader = openeye.read_structures(PUBCHEM_SDF, aromaticity="openeye") openeye_smiles = [oechem.OECreateIsoSmiString(mol) for (id, mol) in mol_reader] self.assertSequenceEqual(default_smiles, openeye_smiles) mol_reader = openeye.read_structures(PUBCHEM_SDF, aromaticity="mdl") mdl_smiles = [oechem.OECreateIsoSmiString(mol) for (id, mol) in mol_reader] for (smi1, smi2) in zip(default_smiles, mdl_smiles): if (smi1 == smi2): break else: raise AssertionError("MDL aromaticity model should not be the same as OpenEye's") TestOpenEyeReadStructureFingerprints = ( unittest2.skipUnless(has_openeye, "OpenEye not available")(TestOpenEyeReadStructureFingerprints) ) class TestRDKitReadStructureFingerprints(unittest2.TestCase, ReadStructureFingerprints): type = "RDKit-Fingerprint/1" num_bits = 2048 fp_size = 256 def test_read_from_compressed_input_using_default_type(self): from StringIO import StringIO f = 
StringIO("\x1f\x8b\x08\x00\xa9\\,O\x02\xff3042vt\xe3t\xccK)J-\xe7\x02\x00\xfe'\x16\x99\x0e\x00\x00\x00") f.name = "test.gz" values = list(chemfp.open(f)) self.assertEqual(values, [("Andrew", "0123AF".decode("hex"))]) TestRDKitReadStructureFingerprints = ( unittest2.skipUnless(has_rdkit, "RDKit not available")(TestRDKitReadStructureFingerprints) ) class ReadStructureFingerprintsErrors(unittest2.TestCase): def test_metadata_without_type(self): with self.assertRaisesRegexp(ValueError, "Missing fingerprint type information in metadata"): chemfp.read_structure_fingerprints(chemfp.Metadata(num_bits=13)) class TestBitOps(unittest2.TestCase): def test_byte_union(self): for (fp1, fp2, expected) in ( ("ABC", "ABC", "ABC"), ("ABC", "BBC", "CBC"), ("BA", "12", "ss")): self.assertEqual(bitops.byte_union(fp1, fp2), expected) def test_byte_intersect(self): for (fp1, fp2, expected) in ( ("ABC", "ABC", "ABC"), ("ABC", "BBC", "@BC"), ("AB", "12", "\1\2"), ("BA", "12", "\0\0")): self.assertEqual(bitops.byte_intersect(fp1, fp2), expected) def test_byte_difference(self): for (fp1, fp2, expected) in ( ("A", "C", "\2"), ("ABC", "ABC", "\0\0\0"), ("ABC", "BBC", "\3\0\0"), ("BA", "12", "ss")): self.assertEqual(bitops.byte_difference(fp1, fp2), expected) def test_empty(self): self.assertEqual(bitops.byte_union("", ""), "") self.assertEqual(bitops.byte_intersect("", ""), "") self.assertEqual(bitops.byte_difference("", ""), "") def test_failures(self): for func in (bitops.byte_union, bitops.byte_intersect, bitops.byte_difference): with self.assertRaisesRegexp(ValueError, "byte fingerprints must have the same length"): func("1", "12") # This also tests 'count_tanimoto_hits' class TestFPSParser(unittest2.TestCase): def test_open_with_unknown_format(self): with self.assertRaisesRegexp(ValueError, "Unable to determine fingerprint format type from 'spam.pdf'"): chemfp.open("spam.pdf") with self.assertRaisesRegexp(ValueError, "Unknown fingerprint format 'pdf'"): chemfp.open("spam.sdf", 
format="pdf") def test_fpb_failure(self): with self.assertRaisesRegexp(NotImplementedError, "fpb format support not implemented"): chemfp.open("spam.fpb") def test_base_case(self): values = list(chemfp.open(StringIO("ABCD\tfirst\n"))) self.assertSequenceEqual(values, [("first", "ABCD".decode("hex"))]) def test_unsupported_whitespace(self): with self.assertRaisesRegexp(chemfp.ChemFPError, "Unsupported whitespace at line 1"): list(chemfp.open(StringIO("ABCD first\n"))) def test_missing_id(self): with self.assertRaisesRegexp(chemfp.ChemFPError, "Missing id field at line 1"): list(chemfp.open(StringIO("ABCD\n"))) with self.assertRaisesRegexp(chemfp.ChemFPError, "Missing id field at line 2"): list(chemfp.open(StringIO("0000\tXYZZY\nABCD\n"))) def test_error_properties(self): from StringIO import StringIO f = StringIO("1234 first\n") f.name = "spam" try: list(chemfp.open(f)) raise AssertionError("Should not get here") except chemfp.ChemFPError, err: self.assertEqual(str(err), "Unsupported whitespace at line 1 of 'spam'") self.assertEqual(repr(err), "FPSParseError(-30, 1, spam)") self.assertEqual(err.lineno, 1) self.assertEqual(err.filename, "spam") self.assertEqual(err.errcode, -30) def test_count_size_mismatch(self): query_arena = chemfp.load_fingerprints(StringIO("AB\tSmall\n")) targets = chemfp.open(StringIO("1234\tSpam\nABBA\tDancingQueen\n")) with self.assertRaisesRegexp(ValueError, "query_arena has 8 bits while target_reader has 16 bits"): chemfp.count_tanimoto_hits(query_arena, targets, 0.1) def test_count_size_changes(self): query_arena = chemfp.load_fingerprints(StringIO("ABCD\tSmall\n")) targets = chemfp.open(StringIO("1234\tSpam\nABBA\tDancingQueen\n" * 200 + "12\tNo-no!\n")) with self.assertRaisesRegexp(chemfp.ChemFPError, "Fingerprint is not the expected length at line 401 of ''"): list(chemfp.count_tanimoto_hits(query_arena, targets, 0.1)) def test_count_size_bad_target_fps(self): query_arena = chemfp.load_fingerprints(StringIO("ABCD\tSmall\n")) targets = 
chemfp.open(StringIO("1234\tSpam\nABBA DancingQueen\n")) with self.assertRaisesRegexp(chemfp.ChemFPError, "Unsupported whitespace at line 2 of ''"): list(chemfp.count_tanimoto_hits(query_arena, targets, 0.1)) def test_count_size_bad_query_fps(self): from StringIO import StringIO f = StringIO("DBAC\tLarge\nABCD Small\n") f.name = "query.fps" queries = chemfp.open(f) targets = chemfp.open(StringIO("1234\tSpam\nABBA\tDancingQueen\n")) with self.assertRaisesRegexp(chemfp.ChemFPError, "Unsupported whitespace at line 2 of 'query.fps'"): list(chemfp.count_tanimoto_hits(queries, targets, 0.1)) def test_count_size_bad_fps_later_on(self): queries = chemfp.open(StringIO("ABCD\tSmall\n" * 200 + "AACE Oops.\n")) targets = chemfp.load_fingerprints(StringIO("1234\tSpam\nABBA\tDancingQueen\n")) results = chemfp.count_tanimoto_hits(queries, targets, 0.1) for i in range(10): results.next() with self.assertRaisesRegexp(chemfp.ChemFPError, "Unsupported whitespace at line 201 of ''"): list(results) def test_empty_input(self): queries = chemfp.load_fingerprints([], chemfp.Metadata(num_bytes=16)) targets = chemfp.load_fingerprints(StringIO("1234\tSpam\nABBA\tDancingQueen\n")) results = chemfp.count_tanimoto_hits(queries, targets, 0.34) for x in results: raise AssertionError("Should not get here") class CountTanimotoHits(unittest2.TestCase): def test_with_initial_offset(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) it = targets.iter_arenas(10) # Skip the first arena and use only the second next(it) subarena = next(it) hits = list(chemfp.count_tanimoto_hits(subarena, subarena, 0.2, arena_size=3)) self.assertEqual(hits, [("CHEBI:15343", 6), ("CHEBI:15858", 4), ("CHEBI:16007", 3), ("CHEBI:16052", 4), ("CHEBI:16234", 7), ("CHEBI:16382", 4), ("CHEBI:16716", 1), ("CHEBI:16842", 5), ("CHEBI:17051", 4), ("CHEBI:17087", 4)]) self.assertEqual(subarena.ids, ["CHEBI:15343", "CHEBI:15858", "CHEBI:16007", "CHEBI:16052", "CHEBI:16234", "CHEBI:16382", "CHEBI:16716", "CHEBI:16842", 
"CHEBI:17051", "CHEBI:17087"]) self.assertEqual(len(subarena.arena_ids), len(targets)) self.assertEqual(subarena.ids, targets.arena_ids[10:20]) def test_with_large_input(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) for (query_id, count) in chemfp.count_tanimoto_hits(targets, targets, threshold=0.9, arena_size=10): pass self.assertEqual(query_id, "CHEBI:16379") self.assertEqual(count, 5) class ThresholdTanimotoSearch(unittest2.TestCase): def test_with_fps_reader_as_targets(self): queries = chemfp.open(CHEBI_QUERIES).iter_arenas(10).next() targets = chemfp.open(CHEBI_TARGETS) fps_results = chemfp.threshold_tanimoto_search(queries, targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(fps_results) results = list(fps_results) self.assertEqual(len(results), 10) query_id, result = results[0] self.assertEqual(query_id, "CHEBI:17585") self.assertEqual(len(result), 4) result.reorder("increasing-score") self.assertEqual(result.get_scores(), [0.7142857142857143, 0.72222222222222221, 0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:16148', 'CHEBI:17539', 'CHEBI:17034', 'CHEBI:17302']) query_id, result = results[3] self.assertEqual(query_id, "CHEBI:17588") self.assertEqual(len(result), 32) result.reorder("decreasing-score") expected_scores = [1.0, 0.88, 0.88, 0.88, 0.85185185185185186, 0.85185185185185186] expected_ids = ['CHEBI:17230', 'CHEBI:15356', 'CHEBI:16375', 'CHEBI:17561', 'CHEBI:15729', 'CHEBI:16176'] self.assertEqual(result.get_scores()[:6], expected_scores) self.assertEqual(result.get_ids()[:6], expected_ids) self.assertEqual(result.get_ids_and_scores()[:6], zip(expected_ids, expected_scores)) def test_with_arena_as_targets(self): queries = chemfp.load_fingerprints(CHEBI_QUERIES, reorder=False)[:10] targets = chemfp.load_fingerprints(CHEBI_TARGETS) arena_results = chemfp.threshold_tanimoto_search(queries, targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(arena_results) results = 
list(arena_results) self.assertEqual(len(results), 10) query_id, result = results[0] self.assertEqual(query_id, "CHEBI:17585") self.assertEqual(len(result), 4) result.reorder("increasing-score") self.assertSequenceEqual(result.get_scores(), [0.7142857142857143, 0.72222222222222221, 0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:16148', 'CHEBI:17539', 'CHEBI:17034', 'CHEBI:17302']) query_id, result = results[3] self.assertEqual(query_id, "CHEBI:17588") self.assertEqual(len(result), 32) result.reorder("decreasing-score") expected_scores = [1.0, 0.88, 0.88, 0.88, 0.85185185185185186, 0.85185185185185186] expected_ids = ['CHEBI:17230', 'CHEBI:15356', 'CHEBI:16375', 'CHEBI:17561', 'CHEBI:15729', 'CHEBI:16176'] self.assertSequenceEqual(result.get_scores()[:6], expected_scores) self.assertEqual(result.get_ids()[:6], expected_ids) self.assertEqual(result.get_ids_and_scores()[:6], zip(expected_ids, expected_scores)) def test_with_different_parameters(self): queries = chemfp.load_fingerprints(CHEBI_QUERIES, reorder=False)[:10] targets = chemfp.load_fingerprints(CHEBI_TARGETS) results = list(chemfp.threshold_tanimoto_search(queries, targets, threshold=0.8)) query_id, result = results[0] result.reorder("increasing-score") self.assertEqual(query_id, "CHEBI:17585") self.assertSequenceEqual(result.get_scores(), [0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:17034', 'CHEBI:17302']) def test_with_initial_offset(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) it = targets.iter_arenas(10) # Skip the first arena and use only the second next(it) subarena = next(it) hits = list(chemfp.threshold_tanimoto_search(subarena, subarena, 0.2, arena_size=3)) self.assertEqual([(query_id, len(hit)) for (query_id, hit) in hits], [("CHEBI:15343", 6), ("CHEBI:15858", 4), ("CHEBI:16007", 3), ("CHEBI:16052", 4), ("CHEBI:16234", 7), ("CHEBI:16382", 4), ("CHEBI:16716", 1), ("CHEBI:16842", 5), ("CHEBI:17051", 4), 
("CHEBI:17087", 4)]) self.assertEqual(subarena.ids, ["CHEBI:15343", "CHEBI:15858", "CHEBI:16007", "CHEBI:16052", "CHEBI:16234", "CHEBI:16382", "CHEBI:16716", "CHEBI:16842", "CHEBI:17051", "CHEBI:17087"]) self.assertEqual(len(subarena.arena_ids), len(targets)) self.assertEqual(subarena.ids, targets.arena_ids[10:20]) def test_with_large_input(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) for (query_id, result) in chemfp.threshold_tanimoto_search(targets, targets, threshold=0.9): pass self.assertEqual(query_id, "CHEBI:16379") result.reorder("increasing-score") self.assertSequenceEqual(result.get_scores(), [0.956989247311828, 0.96739130434782605, 0.97826086956521741, 0.97826086956521741, 1.0]) self.assertSequenceEqual(result.get_ids(), ["CHEBI:17439", "CHEBI:15982", "CHEBI:15852", "CHEBI:16304", "CHEBI:16379"]) def test_with_empty_queries(self): targets = chemfp.load_fingerprints(CHEBI_QUERIES) queries = targets[len(targets):] for x in chemfp.threshold_tanimoto_search(queries, targets): raise AssertionError class KNearestTanimotoSearch(unittest2.TestCase): def test_with_fps_reader_as_targets(self): queries = chemfp.open(CHEBI_QUERIES).iter_arenas(10).next() targets = chemfp.open(CHEBI_TARGETS) fps_results = chemfp.knearest_tanimoto_search(queries, targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(fps_results) results = list(fps_results) self.assertEqual(len(results), 10) query_id, result = results[0] self.assertEqual(query_id, "CHEBI:17585") self.assertEqual(len(result), 3) # The default is in decreasing score self.assertEqual(result.get_scores(), [0.8571428571428571, 0.8571428571428571, 0.72222222222222221]) self.assertEqual(result.get_ids(), ['CHEBI:17302', 'CHEBI:17034', 'CHEBI:17539']) result.reorder("increasing-score") self.assertEqual(result.get_scores(), [0.72222222222222221, 0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:17539', 'CHEBI:17034', 'CHEBI:17302']) query_id, result = results[3] 
self.assertEqual(query_id, "CHEBI:17588") self.assertEqual(len(result), 3) expected_scores = [1.0, 0.88, 0.88] expected_ids = ['CHEBI:17230', 'CHEBI:15356', 'CHEBI:16375'] self.assertEqual(result.get_scores(), expected_scores) self.assertEqual(result.get_ids(), expected_ids) self.assertEqual(result.get_ids_and_scores(), zip(expected_ids, expected_scores)) def test_with_arena_as_targets(self): queries = chemfp.load_fingerprints(CHEBI_QUERIES, reorder=False)[:10] targets = chemfp.load_fingerprints(CHEBI_TARGETS) arena_results = chemfp.knearest_tanimoto_search(queries, targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(arena_results) results = list(arena_results) self.assertEqual(len(results), 10) query_id, result = results[0] self.assertEqual(query_id, "CHEBI:17585") self.assertEqual(len(result), 3) # The default is in decreasing score self.assertSequenceEqual(result.get_scores(), [0.8571428571428571, 0.8571428571428571, 0.72222222222222221]) self.assertEqual(result.get_ids()[2], 'CHEBI:17539') self.assertEqual(sorted(result.get_ids()), sorted(['CHEBI:17302', 'CHEBI:17034', 'CHEBI:17539'])) result.reorder("increasing-score") self.assertSequenceEqual(result.get_scores(), [0.72222222222222221, 0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:17539', 'CHEBI:17034', 'CHEBI:17302']) query_id, result = results[3] self.assertEqual(query_id, "CHEBI:17588") self.assertEqual(len(result), 3) expected_scores = [1.0, 0.88, 0.88] expected_ids = ['CHEBI:17230', 'CHEBI:15356', 'CHEBI:16375'] self.assertSequenceEqual(result.get_scores(), expected_scores) self.assertEqual(result.get_ids(), expected_ids) self.assertEqual(result.get_ids_and_scores(), zip(expected_ids, expected_scores)) def test_with_different_parameters(self): queries = chemfp.load_fingerprints(CHEBI_QUERIES, reorder=False)[:10] targets = chemfp.load_fingerprints(CHEBI_TARGETS) results = list(chemfp.knearest_tanimoto_search(queries, targets, k=8, threshold=0.3)) 
query_id, result = results[0] result.reorder("increasing-score") self.assertEqual(query_id, "CHEBI:17585") self.assertSequenceEqual(result.get_scores(), [0.61904761904761907, 0.66666666666666663, 0.66666666666666663, 0.66666666666666663, 0.7142857142857143, 0.72222222222222221, 0.8571428571428571, 0.8571428571428571]) self.assertEqual(result.get_ids(), ['CHEBI:15843', 'CHEBI:7896', 'CHEBI:15894', 'CHEBI:16759', 'CHEBI:16148', 'CHEBI:17539', 'CHEBI:17034', 'CHEBI:17302']) def test_with_initial_offset(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) it = targets.iter_arenas(10) # Skip the first arena and use only the second next(it) subarena = next(it) hits = list(chemfp.knearest_tanimoto_search(subarena, subarena, k=5, threshold=0.2, arena_size=3)) self.assertEqual([(query_id, len(hit)) for (query_id, hit) in hits], [("CHEBI:15343", 5), ("CHEBI:15858", 4), ("CHEBI:16007", 3), ("CHEBI:16052", 4), ("CHEBI:16234", 5), ("CHEBI:16382", 4), ("CHEBI:16716", 1), ("CHEBI:16842", 5), ("CHEBI:17051", 4), ("CHEBI:17087", 4)]) self.assertEqual(subarena.ids, ["CHEBI:15343", "CHEBI:15858", "CHEBI:16007", "CHEBI:16052", "CHEBI:16234", "CHEBI:16382", "CHEBI:16716", "CHEBI:16842", "CHEBI:17051", "CHEBI:17087"]) self.assertEqual(len(subarena.arena_ids), len(targets)) self.assertEqual(subarena.ids, targets.arena_ids[10:20]) def test_with_large_input(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS) for (query_id, result) in chemfp.knearest_tanimoto_search(targets, targets, threshold=0.9): pass self.assertEqual(query_id, "CHEBI:16379") result.reorder("increasing-score") self.assertSequenceEqual(result.get_scores(), [0.97826086956521741, 0.97826086956521741, 1.0]) self.assertSequenceEqual(result.get_ids(), ["CHEBI:15852", "CHEBI:16304", "CHEBI:16379"]) def test_with_empty_queries(self): targets = chemfp.load_fingerprints(CHEBI_QUERIES) queries = targets[len(targets):] for x in chemfp.knearest_tanimoto_search(queries, targets): raise AssertionError class 
TestCountSymmetric(unittest2.TestCase): def test_count_with_fps_reader(self): targets = chemfp.open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.count_tanimoto_hits_symmetric(targets) def test_count_without_indices(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.count_tanimoto_hits_symmetric(targets) def test_count(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = chemfp.count_tanimoto_hits_symmetric(targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 20) self.assertSequenceEqual(results, [ ('CHEBI:15399', 4), ('CHEBI:15400', 4), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 4), ('CHEBI:15389', 4), ('CHEBI:15392', 3), ('CHEBI:15396', 5), ('CHEBI:15397', 5), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 6), ('CHEBI:15393', 5), ('CHEBI:15394', 5), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_count_with_explicit_default_threshold(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = chemfp.count_tanimoto_hits_symmetric(targets, threshold=0.7) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 20) self.assertSequenceEqual(results, [ ('CHEBI:15399', 4), ('CHEBI:15400', 4), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 4), ('CHEBI:15389', 4), 
('CHEBI:15392', 3), ('CHEBI:15396', 5), ('CHEBI:15397', 5), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 6), ('CHEBI:15393', 5), ('CHEBI:15394', 5), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_count_with_different_threshold(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 300, 350), fps.metadata) results = chemfp.count_tanimoto_hits_symmetric(targets, threshold=0.95) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 50) self.assertSequenceEqual(results[:15], [ ('CHEBI:15522', 28), ('CHEBI:15496', 40), ('CHEBI:15501', 40), ('CHEBI:15507', 40), ('CHEBI:15490', 29), ('CHEBI:15492', 29), ('CHEBI:15504', 29), ('CHEBI:15512', 41), ('CHEBI:15515', 35), ('CHEBI:15524', 29), ('CHEBI:15531', 29), ('CHEBI:15535', 29), ('CHEBI:15537', 35), ('CHEBI:15497', 27), ('CHEBI:15500', 31)]) class TestThresholdSymmetric(unittest2.TestCase): def test_search_with_fps_reader(self): targets = chemfp.open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.threshold_tanimoto_search_symmetric(targets) def test_search_without_indices(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.threshold_tanimoto_search_symmetric(targets) def test_search(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = chemfp.threshold_tanimoto_search_symmetric(targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) 
self.assertEqual(len(results), 20) self.assertSequenceEqual([(id, len(result)) for (id, result) in results], [ ('CHEBI:15399', 4), ('CHEBI:15400', 4), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 4), ('CHEBI:15389', 4), ('CHEBI:15392', 3), ('CHEBI:15396', 5), ('CHEBI:15397', 5), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 6), ('CHEBI:15393', 5), ('CHEBI:15394', 5), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_search_with_explicit_defaults(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = chemfp.threshold_tanimoto_search_symmetric(targets, threshold=0.7) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 20) self.assertSequenceEqual([(id, len(result)) for (id, result) in results], [ ('CHEBI:15399', 4), ('CHEBI:15400', 4), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 4), ('CHEBI:15389', 4), ('CHEBI:15392', 3), ('CHEBI:15396', 5), ('CHEBI:15397', 5), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 6), ('CHEBI:15393', 5), ('CHEBI:15394', 5), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_search_with_different_threshold(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 300, 350), fps.metadata) results = chemfp.threshold_tanimoto_search_symmetric(targets, threshold=0.95) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 50) self.assertSequenceEqual([(id, len(result)) for (id, result) in results[:15]], [ ('CHEBI:15522', 28), ('CHEBI:15496', 40), 
('CHEBI:15501', 40), ('CHEBI:15507', 40), ('CHEBI:15490', 29), ('CHEBI:15492', 29), ('CHEBI:15504', 29), ('CHEBI:15512', 41), ('CHEBI:15515', 35), ('CHEBI:15524', 29), ('CHEBI:15531', 29), ('CHEBI:15535', 29), ('CHEBI:15537', 35), ('CHEBI:15497', 27), ('CHEBI:15500', 31)]) class TestKNearestSymmetric(unittest2.TestCase): def test_search_with_fps_reader(self): targets = chemfp.open(CHEBI_TARGETS) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.knearest_tanimoto_search_symmetric(targets) def test_search_without_indices(self): targets = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) with self.assertRaisesRegexp(ValueError, "`fingerprints` must be a FingerprintArena " "with pre-computed popcount indices"): chemfp.knearest_tanimoto_search_symmetric(targets) def test_search(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = chemfp.knearest_tanimoto_search_symmetric(targets) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 20) self.assertSequenceEqual([(id, len(result)) for (id, result) in results], [ ('CHEBI:15399', 3), ('CHEBI:15400', 3), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 3), ('CHEBI:15389', 3), ('CHEBI:15392', 3), ('CHEBI:15396', 3), ('CHEBI:15397', 3), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 3), ('CHEBI:15393', 3), ('CHEBI:15394', 3), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_search_with_explicit_defaults(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 200, 220), fps.metadata) results = 
chemfp.knearest_tanimoto_search_symmetric(targets, k=3, threshold=0.7) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 20) self.assertSequenceEqual([(id, len(result)) for (id, result) in results], [ ('CHEBI:15399', 3), ('CHEBI:15400', 3), ('CHEBI:15404', 0), ('CHEBI:15385', 0), ('CHEBI:15386', 0), ('CHEBI:15388', 3), ('CHEBI:15389', 3), ('CHEBI:15392', 3), ('CHEBI:15396', 3), ('CHEBI:15397', 3), ('CHEBI:15402', 3), ('CHEBI:15403', 3), ('CHEBI:15387', 3), ('CHEBI:15398', 3), ('CHEBI:15393', 3), ('CHEBI:15394', 3), ('CHEBI:15401', 0), ('CHEBI:15390', 1), ('CHEBI:15391', 1), ('CHEBI:15395', 0)]) def test_search_with_different_settings(self): # Work around a bug: cannot do the symmetric search on a subarena fps = chemfp.open(CHEBI_TARGETS) targets = chemfp.load_fingerprints(itertools.islice(fps, 300, 350), fps.metadata) results = chemfp.knearest_tanimoto_search_symmetric(targets, threshold=0.95, k=28) with self.assertRaisesRegexp(TypeError, ".*has no len.*"): len(results) results = list(results) self.assertEqual(len(results), 50) self.assertSequenceEqual([(id, len(result)) for (id, result) in results[:15]], [ ('CHEBI:15522', 28), ('CHEBI:15496', 28), ('CHEBI:15501', 28), ('CHEBI:15507', 28), ('CHEBI:15490', 28), ('CHEBI:15492', 28), ('CHEBI:15504', 28), ('CHEBI:15512', 28), ('CHEBI:15515', 28), ('CHEBI:15524', 28), ('CHEBI:15531', 28), ('CHEBI:15535', 28), ('CHEBI:15537', 28), ('CHEBI:15497', 27), ('CHEBI:15500', 28)]) try: import bz2 has_bz2 = True except ImportError: has_bz2 = False class TestSave(unittest2.TestCase): def test_save_to_fps(self): filename = os.path.join(_tmpdir(self), "output.fps") arena = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) arena.save(filename) arena2 = chemfp.load_fingerprints(filename, reorder=False) self.assertEqual(arena.metadata.type, arena2.metadata.type) self.assertEqual(len(arena), len(arena2)) arena_lines = open(CHEBI_TARGETS).readlines() 
arena2_lines = open(filename).readlines() self.assertSequenceEqual(arena_lines, arena2_lines) def test_save_to_file_object(self): arena = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) f = StringIO() arena.save(f) s = f.getvalue() f.close() arena_lines = open(CHEBI_TARGETS).readlines() arena2_lines = s.splitlines(True) self.assertSequenceEqual(arena_lines, arena2_lines) def test_save_to_fps_gz(self): filename = os.path.join(_tmpdir(self), "output.fps.gz") arena = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) arena.save(filename) arena2 = chemfp.load_fingerprints(filename, reorder=False) self.assertEqual(arena.metadata.type, arena2.metadata.type) self.assertEqual(len(arena), len(arena2)) arena_lines = open(CHEBI_TARGETS).readlines() arena2_lines = gzip.GzipFile(filename).readlines() self.assertSequenceEqual(arena_lines, arena2_lines) def test_save_to_fps_bz2(self): filename = os.path.join(_tmpdir(self), "output.fps.bz2") arena = chemfp.load_fingerprints(CHEBI_TARGETS, reorder=False) arena.save(filename) arena2 = chemfp.load_fingerprints(filename, reorder=False) self.assertEqual(arena.metadata.type, arena2.metadata.type) self.assertEqual(len(arena), len(arena2)) arena_lines = open(CHEBI_TARGETS).readlines() arena2_lines = bz2.BZ2File(filename).readlines() self.assertSequenceEqual(arena_lines, arena2_lines) test_save_to_fps_bz2 = unittest2.skipUnless(has_bz2, "bz2 module not available")(test_save_to_fps_bz2) def test_save_id_with_tab(self): arena = chemfp.load_fingerprints([("A\tB", "1234")], chemfp.Metadata(num_bytes=4)) f = StringIO() with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not contain a tab: 'A\\\\tB' in record 1"): arena.save(f) def test_save_id_with_newline(self): arena = chemfp.load_fingerprints([("AB", "1234"), ("C\nD", "1324")], chemfp.Metadata(num_bytes=4)) f = StringIO() with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not contain a newline: 'C\\\\nD' in record 2"): arena.save(f) def 
test_save_empty_id(self): arena = chemfp.load_fingerprints([("AB", "1234"), ("", "1324")], chemfp.Metadata(num_bytes=4)) f = StringIO() with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not be the empty string"): arena.save(f) # These help improve code coverage. They aren't yet part of the main public API. class TestIO(unittest2.TestCase): def test_save_with_reader_metadata(self): fps = chemfp.FingerprintIterator(chemfp.Metadata(type="Spam/1", num_bytes=4), [("AB", "1234")]) f = StringIO() io.write_fps1_output(fps, f) self.assertEqual(f.getvalue(), "#FPS1\n#num_bits=32\n#type=Spam/1\n31323334\tAB\n") def test_save_id_with_tab(self): fps = [("A\tB", "1234")] with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not contain a tab: 'A\\\\tB' in record 1"): io.write_fps1_output(fps, StringIO(), chemfp.Metadata(num_bytes=4)) def test_save_id_with_newline(self): fps = [("AB", "1234"), ("C\nD", "1324")] with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not contain a newline: 'C\\\\nD' in record 2"): io.write_fps1_output(fps, StringIO(), chemfp.Metadata(num_bytes=4)) def test_save_empty_id(self): fps = [("AB", "1234"), ("", "1324")] with self.assertRaisesRegexp(ValueError, "Fingerprint ids must not be the empty string"): io.write_fps1_output(fps, StringIO(), chemfp.Metadata(num_bytes=4)) def test_write_fps1_output_to_file(self): fps = [("AB", "1234")] dirname = _tmpdir(self) filename = os.path.join(dirname, "output.fps") io.write_fps1_output(fps, filename, chemfp.Metadata(num_bytes=4, type="Blah/21")) s = open(filename).read() self.assertEqual(s, "#FPS1\n#num_bits=32\n#type=Blah/21\n31323334\tAB\n") # This is hard to test through the main() API since the main() API # changes stdout / uses an alternate output. 
class TestOpenCompression(unittest2.TestCase): def test_open_output(self): self.assertIs(io.open_output(None), sys.stdout) f = StringIO() self.assertIs(io.open_output(f), f) def test_open_compressed_output_uncompressed(self): self.assertIs(io.open_compressed_output(None, None), sys.stdout) filename = os.path.join(_tmpdir(self), "spam.out") f = io.open_compressed_output(filename, None) try: self.assertEqual(f.name, filename) f.write("Check that it's writeable.\n"); finally: f.close() f = StringIO() self.assertIs(f, io.open_compressed_output(f, None)) def test_open_compressed_output_gzip_stdout(self): old_stdout = sys.stdout sys.stdout = f = StringIO() try: g = io.open_compressed_output(None, ".gz") g.write("Spam and eggs.") g.close() finally: sys.stdout = old_stdout t = gzip.GzipFile(fileobj=StringIO(f.getvalue())).read() self.assertEqual(t, "Spam and eggs.") def test_open_compressed_output_gzip_filename(self): filename = os.path.join(_tmpdir(self), "spam_gz") f = io.open_compressed_output(filename, ".gz") try: if hasattr(f, "name"): # doesn't work before Python 2.7 self.assertEqual(f.name, filename) f.write("Check that it's writeable.\n"); f.close() f = gzip.GzipFile(filename) s = f.read() self.assertEqual(s, "Check that it's writeable.\n") finally: f.close() def test_open_compressed_output_gzip_filelike(self): f = StringIO() g = io.open_compressed_output(f, ".gz") g.write("This is a test.\n") g.close() s = f.getvalue() t = gzip.GzipFile(fileobj=StringIO(s)).read() self.assertEqual(t, "This is a test.\n") ## def test_open_compressed_output_bzip_stdout(self): ## # This cannot be tested from Python since the bz2 library only ## # takes a filename. The chemfp interface uses "/dev/stdout" ## # as a hack, but that is not interceptable. 
## # I can't even check that I can connect to stdout since ## # this emits a header ## #g = io.open_compressed_output(None, ".bz2") def test_open_compressed_output_bzip_filename(self): filename = os.path.join(_tmpdir(self), "spam_bz") f = io.open_compressed_output(filename, ".bz2") try: self.assertEqual(f.name, filename) f.write("Check that it's writeable.\n"); f.close() f = bz2.BZ2File(filename) s = f.read() self.assertEqual(s, "Check that it's writeable.\n") finally: f.close() test_open_compressed_output_bzip_filename = unittest2.skipUnless(has_bz2, "bz2 module not available")( test_open_compressed_output_bzip_filename) def test_open_compressed_output_bzip_filelike(self): with self.assertRaisesRegexp(NotImplementedError, "bzip2 compression to file-like objects is not supported"): io.open_compressed_output(StringIO(), ".bz2") test_open_compressed_output_bzip_filelike = unittest2.skipUnless(has_bz2, "bz2 module not available")( test_open_compressed_output_bzip_filelike) def test_open_compressed_output_xz(self): with self.assertRaisesRegexp(NotImplementedError, "xz compression is not supported"): io.open_compressed_output(StringIO(), ".xz") def test_unsupported_compression(self): with self.assertRaisesRegexp(ValueError, "Unknown compression type '.Z'"): io.open_compressed_output(StringIO(), ".Z") ###### def test_cannot_read_bzip_input_file(self): with self.assertRaisesRegexp(NotImplementedError, "bzip decompression from file-like objects is not supported"): io.open_compressed_input_universal(StringIO(), ".bz2") test_cannot_read_bzip_input_file = unittest2.skipUnless(has_bz2, "bz2 module not available")( test_cannot_read_bzip_input_file) def test_cannot_read_xz_input_file(self): with self.assertRaisesRegexp(NotImplementedError, "xz decompression is not supported"): io.open_compressed_input_universal(StringIO(), ".xz") def test_unsupported_decompression(self): with self.assertRaisesRegexp(ValueError, "Unknown compression type '.Z'"): 
            io.open_compressed_input_universal(StringIO(), ".Z")

# Improve code coverage by testing the parts not tested elsewhere
class TestMetadata(unittest2.TestCase):
    def test_bit_byte_incompatibility(self):
        with self.assertRaisesRegexp(ValueError, "num_bits of 9 is incompatible with num_bytes of 1"):
            chemfp.Metadata(num_bits=9, num_bytes=1)

    def test_sources_as_string_is_allowed(self):
        metadata = chemfp.Metadata(sources="Spam")
        metadata.sources = ["Spam"]

    def test_basic_repr(self):
        s = repr(chemfp.Metadata())
        self.assertEqual(s, "Metadata(num_bits=None, num_bytes=None, type=None, aromaticity=None, sources=[], software=None, date=None)")

    def test_full_repr(self):
        s = repr(chemfp.Metadata(num_bits=14, num_bytes=2, type="1-Adam/12", aromaticity="smelly",
                                 sources=["one", "two"], software="My head", date="1970-08-22T18:12:30"))
        self.assertEqual(s, "Metadata(num_bits=14, num_bytes=2, type='1-Adam/12', " +
                         "aromaticity='smelly', sources=['one', 'two'], software='My head', " +
                         "date='1970-08-22T18:12:30')")

    def test_basic_str(self):
        s = str(chemfp.Metadata())
        self.assertEqual(s, "")

    def test_full_str(self):
        s = str(chemfp.Metadata(num_bits=14, num_bytes=2, type="1-Adam/12", aromaticity="smelly",
                                sources=["one", "two"], software="My head", date="1970-08-22T18:12:30"))
        lines = s.splitlines()
        self.assertSequenceEqual(lines,
                                 ["#num_bits=14", '#type=1-Adam/12', '#software=My head',
                                  '#aromaticity=smelly', '#source=one', '#source=two',
                                  '#date=1970-08-22T18:12:30'])

if __name__ == "__main__":
    unittest2.main()

# chemfp-1.1p1/tests/test_chemfp.py
import unittest2
import re
import os

import chemfp
import _chemfp

version_pattern = re.compile(r"\d+\.\d+(\.\d)?((a|b|p)\d+)?$")

class SystemTests(unittest2.TestCase):
    def test_version(self):
        m = version_pattern.match(_chemfp.version())
        self.assertNotEqual(m, None, "bad version: %s" % (_chemfp.version(),))

skip_omp = (chemfp.get_max_threads() == 1)

class OpenMPTests(unittest2.TestCase):
    def setUp(self):
        self._num_threads = chemfp.get_num_threads()

    def tearDown(self):
        chemfp.set_num_threads(self._num_threads)

    def test_num_threads_is_max_threads(self):
        self.assertEquals(chemfp.get_num_threads(), chemfp.get_max_threads())

    def test_set_to_zero(self):
        chemfp.set_num_threads(0)
        self.assertEquals(chemfp.get_num_threads(), 1)

    def test_set_to_one(self):
        chemfp.set_num_threads(1)
        self.assertEquals(chemfp.get_num_threads(), 1)

    def test_set_to_two(self):
        chemfp.set_num_threads(2)
        self.assertEquals(chemfp.get_num_threads(), 2)
    test_set_to_two = unittest2.skipIf(skip_omp, "Multiple OMP threads not available")(
        test_set_to_two)

    def test_set_to_max(self):
        chemfp.set_num_threads(chemfp.get_max_threads())
        self.assertEquals(chemfp.get_num_threads(), chemfp.get_max_threads())

    def test_set_beyond_max(self):
        chemfp.set_num_threads(chemfp.get_max_threads()+1)
        self.assertEquals(chemfp.get_num_threads(), chemfp.get_max_threads())

if __name__ == "__main__":
    unittest2.main()

# chemfp-1.1p1/tests/test_docstrings.py
import sys
import unittest2
import doctest

def load_tests(loader, tests, ignore):
    tests.addTests(doctest.DocTestSuite("chemfp.encodings"))
    return tests

if __name__ == "__main__":
    unittest2.main()

# chemfp-1.1p1/tests/test_memory.py
# These tests check for memory leaks.

import unittest2
import resource
import sys
import array

import _chemfp
import chemfp

def get_memory():
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def memory_growth(f, *args, **kwargs):
    x = f(*args, **kwargs)
    m1 = get_memory()
    count = 0
    for i in xrange(1000):
        x = f(*args, **kwargs)
        m2 = get_memory()
        if m2 > m1:
            count += 1
            m1 = m2
    #print "count", count
    if count > 900:
        raise AssertionError("Memory growth in %r (%d/%d)!"
                             % (f, count, 1000))
    return x

def fps_parse_id_fp():
    memory_growth(_chemfp.fps_parse_id_fp, 4, "ABCD\tSpam\n")

# I mistakenly used BuildValue("O") instead of BuildValue("N").
# This added an extra refcount to some of the returned strings,
# which meant they would never become freed.

class TestBuildValueMemoryLeaks(unittest2.TestCase):
    def test_make_unsorted_aligned_arena_when_aligned(self):
        s = "SpamAndEggs!"
        n1 = sys.getrefcount(s)
        memory_growth(_chemfp.make_unsorted_aligned_arena, s, 4)
        n2 = sys.getrefcount(s)
        if n2 > n1 + 10:
            raise AssertionError("Aligned growth has an extra refcount")

    def test_make_unsorted_aligned_arena_when_unaligned(self):
        SIZE = 1024*8
        s = "S" * SIZE
        n1 = sys.getrefcount(s)
        memory_growth(_chemfp.make_unsorted_aligned_arena, s, SIZE)
        n2 = sys.getrefcount(s)
        if n2 > n1 + 10:
            raise AssertionError("Aligned growth has an extra refcount: %d, %d" % (n1, n2))

    def test_align_fingerprint_same_size(self):
        s = "abcd"
        n1 = sys.getrefcount(s)
        memory_growth(_chemfp.align_fingerprint, s, 4, 4)
        n2 = sys.getrefcount(s)
        if n2 > n1 + 10:
            raise AssertionError("align_fingerprint with alignment has extra refcount: %d, %d" % (n1, n2))

    def test_align_fingerprint_needs_new_string(self):
        x = memory_growth(_chemfp.align_fingerprint, "ab", 16, 1024)
        # I don't know why the above doesn't find the leak.
        # I can probe the count directly.
        i = sys.getrefcount(x[2])
        if i != 2:
            raise AssertionError("Unexpected refcount: %d" % (i,))

    def test_make_sorted_aligned_arena_trivial(self):
        ordering = array.array("c", "\0\0\0\0"*16) # space for a ChemFPOrderedPopcount (and extra)
        popcounts = array.array("c", "\0\0\0\0"*36) # num_fingerprints + 2 (and extra)
        arena = "BLAH"
        n1 = sys.getrefcount(arena)
        memory_growth(_chemfp.make_sorted_aligned_arena, 32, 4, arena, 0, ordering, popcounts, 4)
        n2 = sys.getrefcount(arena)
        if n1 != n2:
            # This one doesn't need "N" because it borrowed the input refcount.
raise AssertionError("Borrowed refcount should have been okay.") def test_make_sorted_aligned_arena_already_sorted_and_aligned(self): ordering = array.array("c", "\0\0\0\0"*16) # space for a ChemFPOrderedPopcount (and extra) popcounts = array.array("c", "\0\0\0\0"*36) # num_fingerprints + 2 (and extra) arena = "BDFL" n1 = sys.getrefcount(arena) memory_growth(_chemfp.make_sorted_aligned_arena, 32, 4, arena, 1, ordering, popcounts, 4) n2 = sys.getrefcount(arena) if n1 != n2: # This one doesn't need "N" because it borrowed the input refcount. raise AssertionError("Borrowed refcount should have been okay. %d != %d" % (n1, n2)) def test_make_sorted_aligned_arena_already_sorted_but_not_aligned(self): ordering = array.array("c", "\0\0\0\0"*16) # space for a ChemFPOrderedPopcount (and extra) popcounts = array.array("c", "\0\0\0\0"*40) # num_fingerprints + 2 (and extra) arena = "QWOP" + "\0" * (4096-4) n1 = sys.getrefcount(arena) memory_growth(_chemfp.make_sorted_aligned_arena, 32, 4096, arena, 1, ordering, popcounts, 1024) n2 = sys.getrefcount(arena) if n1 != n2: # This one shouldn't touch the input arena raise AssertionError("Borrowed refcount should have been okay. %d != %d" % (n1, n2)) def test_make_sorted_aligned_arena_when_not_sorted(self): ordering = array.array("c", "\0\0\0\0"*16) # space for a ChemFPOrderedPopcount (and extra) popcounts = array.array("c", "\0\0\0\0"*40) # num_fingerprints + 2 (and extra) arena = "QWON" + "\0" * (4096-4) n1 = sys.getrefcount(arena) memory_growth(_chemfp.make_sorted_aligned_arena, 32, 4, arena, 2, ordering, popcounts, 1024) n2 = sys.getrefcount(arena) if n1 != n2: # This one shouldn't touch the input arena raise AssertionError("Borrowed refcount should have been okay. 
%d != %d" % (n1, n2))

    def test_arena_copy(self):
        data = "ABCD\t0\n" * 200
        from cStringIO import StringIO
        arena = chemfp.load_fingerprints(StringIO(data))
        def make_subarena_copy():
            arena[1:].copy()
        memory_growth(make_subarena_copy)

if __name__ == "__main__":
    unittest2.main()

# File: chemfp-1.1p1/tests/test_methods.py
# Tests for the different alignment methods in bitops

from __future__ import absolute_import, with_statement

import unittest2
from cStringIO import StringIO

from support import fullpath

import chemfp
import chemfp.bitops
import _chemfp

set_alignment_method = chemfp.bitops.set_alignment_method
get_alignment_method = chemfp.bitops.get_alignment_method

CHEBI_TARGETS = fullpath("chebi_rdmaccs.fps")
CHEBI_QUERIES = fullpath("chebi_queries.fps.gz")

targets = chemfp.load_fingerprints(CHEBI_TARGETS, alignment=8)
targets_64 = chemfp.load_fingerprints(CHEBI_TARGETS, alignment=64)

available_methods = chemfp.bitops.get_methods()
alignment_methods = chemfp.bitops.get_alignment_methods()

all_methods = dict.fromkeys("LUT8-1 LUT8-4 LUT16-4 Lauradoux POPCNT Gillies ssse3".split())

class TestMethods(unittest2.TestCase):
    def test_no_duplicates(self):
        methods = chemfp.bitops.get_methods()
        self.assertEquals(len(methods), len(set(methods)))

    def test_for_unknown_methods(self):
        for method in chemfp.bitops.get_methods():
            self.assertIn(method, all_methods)

    def test_for_possible_missing_popcnt(self):
        if len(all_methods) == 4:
            # get_methods() lives in chemfp.bitops, as used everywhere else in this file
            self.assertNotIn("POPCNT", chemfp.bitops.get_methods())

    def test_internal_bad_args(self):
        with self.assertRaisesRegexp(IndexError, "method index is out of range"):
            _chemfp.get_method_name(-1)
        with self.assertRaisesRegexp(IndexError, "method index is out of range"):
            _chemfp.get_method_name(_chemfp.get_num_methods())

all_alignments = dict.fromkeys("align1 align4 align8-small align8-large align-ssse3".split())

class TestAlignments(unittest2.TestCase):
    def test_no_duplicates(self):
        alignments =
chemfp.bitops.get_alignments() self.assertEquals(len(alignments), len(set(alignments))) self.assertEquals(len(alignments), len(all_alignments)) def test_for_unknown_alignments(self): for alignment in chemfp.bitops.get_alignments(): self.assertIn(alignment, all_alignments) def test_get_set_alignment_method(self): for alignment in chemfp.bitops.get_alignments(): method = get_alignment_method(alignment) self.assertIn(method, all_methods) set_alignment_method(alignment, "LUT8-1") self.assertEqual(get_alignment_method(alignment), "LUT8-1") set_alignment_method(alignment, method) self.assertEqual(get_alignment_method(alignment), method) def test_internal_bad_args(self): with self.assertRaisesRegexp(IndexError, "alignment index is out of range"): _chemfp.get_alignment_name(-1) with self.assertRaisesRegexp(IndexError, "alignment index is out of range"): _chemfp.get_alignment_name(_chemfp.get_num_methods()) # I didn't want a better error code for this with self.assertRaisesRegexp(ValueError, "Bad argument"): _chemfp.get_alignment_name(_chemfp.get_alignment_method(-1)) with self.assertRaisesRegexp(ValueError, "Bad argument"): _chemfp.get_alignment_name(_chemfp.get_alignment_method(100)) with self.assertRaisesRegexp(ValueError, "Bad argument"): _chemfp.get_alignment_name(_chemfp.set_alignment_method(-1, 0)) with self.assertRaisesRegexp(ValueError, "Bad argument"): _chemfp.get_alignment_name(_chemfp.set_alignment_method(100, 0)) def test_cannot_use_64_bit_method_for_shorter_bit_alignment(self): msg = "Mismatch between popcount method and alignment type" available_methods = chemfp.bitops.get_methods() for method in ("Lauradoux", "Gillies", "POPCNT"): if (method == "POPCNT") and ("POPCNT" not in available_methods): continue with self.assertRaisesRegexp(ValueError, msg): set_alignment_method("align1", method) with self.assertRaisesRegexp(ValueError, msg): set_alignment_method("align4", method) @unittest2.skipIf("ssse3" not in available_methods, "CPU does not implement SSSE3") def 
test_ssse3(self): method = get_alignment_method("align-ssse3") # This disables SSSE3 support set_alignment_method("align-ssse3", "LUT8-1") self.assertEquals(get_alignment_method("align-ssse3"), "LUT8-1") set_alignment_method("align-ssse3", "ssse3") self.assertEquals(get_alignment_method("align-ssse3"), "ssse3") set_alignment_method("align-ssse3", method) class TestAlign8SmallMethods(unittest2.TestCase): def setUp(self): self.small_method = get_alignment_method("align8-small") self.large_method = get_alignment_method("align8-large") self.ssse3_method = get_alignment_method("align-ssse3") def tearDown(self): set_alignment_method("align8-small", self.small_method) set_alignment_method("align8-large", self.large_method) set_alignment_method("align-ssse3", self.ssse3_method) def _doit(self, method): for alignment in ("align8-small", "align8-large", "align-ssse3"): set_alignment_method(alignment, method) self.assertEquals(get_alignment_method(alignment), method) hits = targets.knearest_tanimoto_search_fp("00000000100410200290000b03a29241846163ee1f".decode("hex"), k=12, threshold=0.2).get_ids_and_scores() self.assertEqual(hits, [('CHEBI:8069', 1.0), ('CHEBI:6758', 0.78723404255319152), ('CHEBI:7983', 0.73999999999999999), ('CHEBI:8107', 0.6956521739130435), ('CHEBI:17568', 0.6904761904761905), ('CHEBI:16294', 0.6818181818181818), ('CHEBI:16964', 0.673469387755102), ('CHEBI:17477', 0.6458333333333334), ('CHEBI:17025', 0.62), ('CHEBI:15901', 0.6122448979591837), ('CHEBI:16742', 0.6122448979591837), ('CHEBI:4888', 0.6078431372549019)]) def test_lut8_1(self): self._doit("LUT8-1") def test_lut8_4(self): self._doit("LUT8-4") def test_lut16_4(self): self._doit("LUT16-4") def test_lauradoux(self): with self.assertRaisesRegexp(ValueError, "Mismatch between popcount method and alignment type"): set_alignment_method("align8-small", "Lauradoux") @unittest2.skipIf("POPCNT" not in alignment_methods, "CPU does not implement POPCNT") def test_popcnt(self): self._doit("POPCNT") class 
TestAlign8LargeMethods(unittest2.TestCase): def setUp(self): self.large_method = get_alignment_method("align8-large") self.ssse3_method = get_alignment_method("align-ssse3") def tearDown(self): set_alignment_method("align8-large", self.large_method) set_alignment_method("align-ssse3", self.ssse3_method) def _doit(self, method, use_ssse3=False): set_alignment_method("align8-large", method) self.assertEquals(get_alignment_method("align8-large"), method) if use_ssse3: set_alignment_method("align-ssse3", "ssse3") self.assertEquals(get_alignment_method("align-ssse3"), "ssse3") else: set_alignment_method("align-ssse3", "LUT8-1") self.assertEquals(get_alignment_method("align-ssse3"), "LUT8-1") hits = targets_64.knearest_tanimoto_search_fp("00000000100410200290000b03a29241846163ee1f".decode("hex"), k=12, threshold=0.2).get_ids_and_scores() self.assertEqual(hits, [('CHEBI:8069', 1.0), ('CHEBI:6758', 0.78723404255319152), ('CHEBI:7983', 0.73999999999999999), ('CHEBI:8107', 0.6956521739130435), ('CHEBI:17568', 0.6904761904761905), ('CHEBI:16294', 0.6818181818181818), ('CHEBI:16964', 0.673469387755102), ('CHEBI:17477', 0.6458333333333334), ('CHEBI:17025', 0.62), ('CHEBI:15901', 0.6122448979591837), ('CHEBI:16742', 0.6122448979591837), ('CHEBI:4888', 0.6078431372549019)]) def test_lut8_1(self): self._doit("LUT8-1") def test_lut8_4(self): self._doit("LUT8-4") def test_lut16_4(self): self._doit("LUT16-4") def test_lauradoux(self): self._doit("Lauradoux") def test_gillies(self): self._doit("Gillies") @unittest2.skipIf("ssse3" not in available_methods, "CPU does not implement SSSE3") def test_ssse3(self): self._doit("Lauradoux", use_ssse3=True) @unittest2.skipIf("POPCNT" not in available_methods, "CPU does not implement POPCNT") def test_popcnt(self): self._doit("POPCNT") class TestSelectFastestMethod(unittest2.TestCase): def setUp(self): self._alignment_methods = chemfp.bitops.get_alignment_methods() def tearDown(self): for k,v in self._alignment_methods.items(): 
            set_alignment_method(k, v)

    def test_select_fastest(self):
        for alignment in all_alignments:
            set_alignment_method(alignment, "LUT8-1")
            self.assertEquals(get_alignment_method(alignment), "LUT8-1")
        chemfp.bitops.select_fastest_method()
        best_methods1 = chemfp.bitops.get_alignment_methods()

        for alignment in all_alignments:
            set_alignment_method(alignment, "LUT8-1")
            self.assertEquals(get_alignment_method(alignment), "LUT8-1")
        chemfp.bitops.select_fastest_method()
        best_methods2 = chemfp.bitops.get_alignment_methods()

        self.assertEquals(best_methods1, best_methods2) # This might fail if two methods have nearly identical timings

# Tests based on coverage analysis
class TestErrorConditions(unittest2.TestCase):
    def test_get_unknown_alignment(self):
        with self.assertRaisesRegexp(ValueError, "Unknown alignment 'Blah'"):
            get_alignment_method("Blah")

    def test_set_unknown_alignment(self):
        with self.assertRaisesRegexp(ValueError, "Unknown alignment 'Blah'"):
            set_alignment_method("Blah", "LUT8-1")

    def test_set_unknown_alignment_method(self):
        with self.assertRaisesRegexp(ValueError, "Unknown method 'LUT8-7'"):
            set_alignment_method("align1", "LUT8-7")

    def test_too_large_repeat(self):
        with self.assertRaisesRegexp(ValueError, "repeat size is meaninglessly large"):
            chemfp.bitops.select_fastest_method(2**30)

    def test_negative_repeat(self):
        with self.assertRaisesRegexp(ValueError, "repeat size must be 1 or larger.*"):
            chemfp.bitops.select_fastest_method(-10)

class TestPrintReport(unittest2.TestCase):
    def test_print_report(self):
        stdout = StringIO()
        chemfp.bitops.print_report(stdout)
        output = stdout.getvalue()

class TestEnvironmentVariables(unittest2.TestCase):
    def setUp(self):
        self._alignment_methods = chemfp.bitops.get_alignment_methods()

    def tearDown(self):
        for k,v in self._alignment_methods.items():
            set_alignment_method(k, v)

    def set_align4(self):
        set_alignment_method("align4", "LUT8-4")
        chemfp.bitops.use_environment_variables({"CHEMFP-ALIGN4": "LUT8-1"})
        self.assertEquals(get_alignment_method("align4"), "LUT8-1")

if __name__ == "__main__":
    chemfp.bitops.use_environment_variables()
    unittest2.main()

# File: chemfp-1.1p1/tests/test_ob2fps.py
from __future__ import with_statement

import os
import shutil
import tempfile
import unittest2

import support

try:
    import openbabel
    has_openbabel = True
    skip_openbabel = False
except ImportError:
    has_openbabel = False
    skip_openbabel = True
    if not support.can_skip("ob"):
        skip_openbabel = False
        import openbabel

if has_openbabel:
    import chemfp.openbabel
    from chemfp.commandline import ob2fps
    VERSION = chemfp.openbabel._ob_version

    runner = support.Runner(ob2fps.main)
    run = runner.run
    run_fps = runner.run_fps
    run_split = runner.run_split
    run_exit = runner.run_exit
    HAS_MACCS = chemfp.openbabel.HAS_MACCS
else:
    HAS_MACCS = False
    runner = None

class TestFingerprintTypes(unittest2.TestCase):
    def test_unspecified(self):
        # Should give the same results as FP2
        headers, fps = run_split("", 19)
        self.assertEquals(headers["#type"], "OpenBabel-FP2/1")
        self.assertEquals(fps[0], "200206000000000402800e00040140010100014008206200000200c0082200080200500201c9804100270538000000402000a2040080c1240001c2c2004600200c200c04020800200410a0001490000200a803c018005400c80c00000000810100840000880064a0124010000000080102060142400110200a00000004800000\t9425004")

    def test_FP2(self):
        headers, fps = run_split("--FP2", 19)
        self.assertEquals(headers["#type"], "OpenBabel-FP2/1")
        self.assertEquals(fps[0], "200206000000000402800e00040140010100014008206200000200c0082200080200500201c9804100270538000000402000a2040080c1240001c2c2004600200c200c04020800200410a0001490000200a803c018005400c80c00000000810100840000880064a0124010000000080102060142400110200a00000004800000\t9425004")

    def test_FP3(self):
        headers, fps = run_split("--FP3", 19)
        self.assertEquals(headers["#type"], "OpenBabel-FP3/1")
        self.assertEquals(fps[0], "0400000000b001\t9425004")

    def test_FP4(self):
headers, fps = run_split("--FP4", 19) self.assertEquals(headers["#type"], "OpenBabel-FP4/1") # Sigh. OpenBabel post 2.3 added stereo support. This structure is # Clc1c(/C=C/C(=O)NNC(=O)Cn2nc(cc2C)C)c(F)ccc1\t9425004\n # FP4 bits 289 and 290 are "on" for post 2.3.0 releases # Those bits are: # 289 Cis_double_bond: */[D2]=[D2]\* # 290 Trans_double_bond: */[D2]=[D2]/* if (VERSION.startswith("2.2") or VERSION == "2.3.0"): self.assertEquals(fps[0], "1100000000000000000080000000000000010000000c9800000000000000000000000640407800\t9425004") # old (without stereo) else: self.assertEquals(fps[0], "1100000000000000000080000000000000010000000c9800000000000000000000000640437800\t9425004") # new (with stereo) @unittest2.skipUnless(HAS_MACCS, "Missing MACCS support") def test_MACCS(self): headers, fps = run_split("--MACCS", 19) if headers["#type"] == "OpenBabel-MACCS/1": # Running on a buggy 2.3.0 release or earlier self.assertEquals(fps[0], "800400000002080019cc40eacdec980baea378ef1b\t9425004") else: self.assertEquals(headers["#type"], "OpenBabel-MACCS/2") # Running on a corrected post-2.3.0 release self.assertEquals(fps[0], "000000000002080019c444eacd6c980baea178ef1f\t9425004") @unittest2.skipIf(HAS_MACCS, "check for missing MACCS support") def test_MACCS_does_not_exist(self): run_exit("--MACCS") def test_rdmaccs(self): headers, fps = run_split("--rdmaccs", 19) self.assertEquals(headers["#type"], "RDMACCS-OpenBabel/1") self.assertEquals(fps[0], "000000000002080019c444eacd6c981baea178ef1f\t9425004") self.assertEquals(fps[1], "000000002000082159d404eea968b81b8ea17eef1f\t9425009") self.assertEquals(fps[2], "000000000000080159c404efa9689a1b8eb1faef1b\t9425012") self.assertEquals(fps[3], "000000000000082019c404ee8968b81b8ea1ffef1f\t9425015") self.assertEquals(fps[4], "000000000000088419c6b5fa8968981b8eb37aef1f\t9425018") def test_substruct(self): headers, fps = run_split("--substruct", 19) self.assertEquals(headers["#type"], "ChemFP-Substruct-OpenBabel/1") self.assertEquals(fps[0], 
"07de8d002000000000000000000000000080060000000c000000000000000080030000f8401800000030508379344c014956000055c0a44e2a0049200084e140581f041d661b10064483cb0f2925100619001393e10001007000000000008000000000000000400000000000000000\t9425004") self.assertEquals(fps[1], "07de0d000000000000000000000000000080460300000c0000000000000000800f0000780038000000301083f920cc09695e0800d5c0e44e6e00492190844145dc1f841d261911164d039b8f29251026b9401313e0ec01007000000000000000000000000000000000000000000000\t9425009") TestFingerprintTypes = unittest2.skipIf(skip_openbabel, "OpenBabel not installed")( TestFingerprintTypes) class TestIO(unittest2.TestCase, support.TestIdAndErrors): _runner = runner def test_compressed_auto(self): header, fps = run_split("--FP3", 19, support.PUBCHEM_SDF_GZ) self.assertEquals(fps[0], "0400000000b001\t9425004") def test_compressed_specified(self): header, fps = run_split("--FP3 --in sdf.gz", 19, support.PUBCHEM_SDF_GZ) self.assertEquals(fps[0], "0400000000b001\t9425004") def test_format_specified(self): header, fps = run_split("--FP3 --in sdf", 19, support.PUBCHEM_ANOTHER_EXT) self.assertEquals(fps[0], "0400000000b001\t9425004") def test_output(self): dirname = tempfile.mkdtemp(prefix="test_ob2fps") output_filename = os.path.join(dirname, "blah.fps") assert len(output_filename.split()) == 1 # ensure no whitespace try: output = runner.run("--FP3 -o " + output_filename) assert len(output) == 0 with open(output_filename) as f: result = f.readlines() finally: shutil.rmtree(dirname) self.assertEquals(result[0], "#FPS1\n") fps = [line for line in result if not line.startswith("#")] self.assertEquals(len(fps), 19) self.assertEquals(fps[0], "0400000000b001\t9425004\n") def test_missing_filename(self): errmsg = run_exit("--FP2", "does_not_exist.smi") self.assertIn("Structure file", errmsg) self.assertIn("does not exist", errmsg) self.assertIn("does_not_exist.smi", errmsg) def test_bad_extension(self): errmsg = run_exit("--FP2 --in xyzzy") self.assertIn("Unsupported format 
specifier", errmsg)
        self.assertIn("xyzzy", errmsg)

TestIO = unittest2.skipIf(skip_openbabel, "OpenBabel not installed")(TestIO)

class TestMACCS(unittest2.TestCase):
    @unittest2.skipIf(not HAS_MACCS, "need MACCS support")
    def test_bitorder(self):
        result = runner.run_fps("--MACCS", 7, support.fullpath("maccs.smi"))
        # The fingerprints are constructed to test the first few bytes.
        self.assertEquals(result[0][:6], support.set_bit(2))
        self.assertEquals(result[1][:6], support.set_bit(3))
        self.assertEquals(result[2][:6], support.set_bit(4))
        self.assertEquals(result[3][:6], support.set_bit(5))
        self.assertEquals(result[4][:6], support.set_bit(9))
        ## This appears to be a bug in the OpenBabel MACCS definition
        if VERSION in ("2.2.3", "2.3.0"):
            # This is WRONG, since OB has an off-by-one error in the ring sizes
            self.assertEquals(result[5][:6], "000020")
        else:
            # which is fixed in the SVN version
            self.assertEquals(result[5][:6], support.set_bit(10))
        self.assertEquals(result[6][:6], support.set_bit(16))

TestMACCS = unittest2.skipIf(skip_openbabel, "OpenBabel not installed")(TestMACCS)

if __name__ == "__main__":
    unittest2.main()

# File: chemfp-1.1p1/tests/test_oe2fps.py
import unittest2
import sys
import os
from cStringIO import StringIO as SIO

import support

try:
    from openeye import oechem # These tests require OEChem
    if not oechem.OEChemIsLicensed():
        print >>sys.stderr, "oechem library available but no license found. Skipping its tests."
        raise ImportError
    has_oechem = True
    skip_oechem = False
except ImportError:
    has_oechem = False
    skip_oechem = True
    if not support.can_skip("oe"):
        skip_oechem = False
        from openeye import oechem

if has_oechem:
    from chemfp.commandline import oe2fps
    import chemfp.openeye
    OEGRAPHSIM_API_VERSION = chemfp.openeye.OEGRAPHSIM_API_VERSION
    chemfp.openeye._USE_SELECT = False # Grrr. Needed to automate testing.
real_stdout = sys.stdout real_stderr = sys.stderr PUBCHEM_SDF = support.fullpath("pubchem.sdf") PUBCHEM_SDF_GZ = support.fullpath("pubchem.sdf.gz") PUBCHEM_ANOTHER_EXT = support.fullpath("pubchem.should_be_sdf_but_is_not") oeerrs = oechem.oeosstream() oechem.OEThrow.SetOutputStream(oeerrs) def convert_v1_atom_names_to_v2(s): return (s.replace("Aromaticity", "Arom") .replace("AtomicNumber", "AtmNum") .replace("EqAromatic", "EqArom") .replace("EqHalogen", "EqHalo") .replace("FormalCharge", "FCharge") .replace("HvyDegree", "HvyDeg") .replace("Hybridization", "Hyb") .replace("DefaultAtom", "Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb")) def convert_v1_bond_names_to_v2(s): return (s.replace("DefaultBond", "Order|Chiral") .replace("BondOrder", "Order")) def convert_v1_names_to_v2(s): return convert_v1_atom_names_to_v2(convert_v1_bond_names_to_v2(s)) def convert_type_string(s): if " " not in s: word = s rest = None else: word, rest = s.split(" ", 1) assert word.endswith("/1") word = word[:-1] + "2" if rest is None: return word else: return word + " " + convert_v1_names_to_v2(rest) def _check_for_oe_errors(): lines = oeerrs.str().splitlines() for line in lines: if line.startswith("Warning: Stereochemistry corrected on atom number"): continue if line.startswith("Warning: Unknown file format set in input stream"): # There's a bug in OEChem where it generates this warning on unknown # file extensions even after SetFormat has been called continue raise AssertionError("Unexpected message from OEChem: %r" % (line,)) # I build the fingerprints using bit offsets to ensure that the test # data matches the actual bit results from OEChem. While I could # reproduce the method in chemfp.openeye.get_maccs_fingerprinter, that # would be cheating. I could test against ToHexString() but then I # would have the nagging feeling that I got the ordering # backwards. Instead, I do it from scratch using the bit offset. 
def _construct_test_values(fp_func = None, num_bits=4096): from openeye.oechem import oemolistream from openeye.oegraphsim import OEFingerPrint, OEMakePathFP fp = OEFingerPrint() ifs = oemolistream() assert ifs.open(PUBCHEM_SDF) hex_data = [] if fp_func is None: fp_func = OEMakePathFP def _convert_to_chemfp_order(s): # The FPS format allows either case but I prefer lowercase s = s.lower() # OpenEye orders hex values on nibbles. Chemfp orders on bytes. return "".join( (s[i+1]+s[i]) for i in range(0, len(s), 2)) for mol in ifs.GetOEGraphMols(): fp_func(fp, mol) # Set the byte values given the bit offsets bytes = [0] * (num_bits//8) i = fp.FirstBit() while i >= 0: bytes[i//8] |= 1<<(i%8) i = fp.NextBit(i) as_hex = "".join("%02x" % i for i in bytes) assert len(as_hex) == 2*(num_bits//8), len(as_hex) # Double-check that it matches the (reordered) ToHexString() oe_hex = fp.ToHexString()[:-1] assert as_hex == _convert_to_chemfp_order(oe_hex), ( as_hex, _convert_to_chemfp_order(oe_hex)) hex_data.append("%s\t%s" % (as_hex, mol.GetTitle())) return hex_data # I have this to flag any obvious changes in the OEChem algorithm and # to help with figuring out how to build a test case. 
_fp1 = "00001002200200000000000000000000000008400020000300801002300000000200000840000000000080000000000000204008000000000010000c10000000400000010100000210800002000000009400000000020020088000000000010000918000200000580400002000010020002440000008001001404000000200010000a8c00020400200002000004084000000030100820000000000000002000000510001800000010001000081100110000800480000100400000c00004c000800000808000100000022000228800020004000000200182100000100000000101000010004004808000000800000000001010010201000000090400000100000020010000010201000000040300100000000580000000000000200000000401000000000000008040004000000008002080820000310280200004040a000000010000080005000004010010018000000800000020008208040000400000200000000000000000800080050000008400100004000000200ac001000000000800100200060900010002000000040200000000000040808000048400040000000020000001001000000000302002008200000a044000180800000100000000200000049004080080000100022a00084000400280480000000402400080400404100000000040000020c10000000000c000100002000080010002080100002000600" if has_oechem: hex_test_values = _construct_test_values() assert hex_test_values[0].startswith(_fp1) class OERunner(support.Runner): def pre_run(self): oeerrs.clear() def post_run(self): _check_for_oe_errors() def run_stdin(self, cmdline, source): fd = os.open(source, os.O_RDONLY) oechem.oein.openfd(fd, 0) try: return self.run(cmdline, None) finally: oechem.oein.openfd(0, 0) os.close(fd) if has_oechem: runner = OERunner(oe2fps.main) run = runner.run run_stdin = runner.run_stdin run_fps = runner.run_fps run_exit = runner.run_exit else: runner = None def headers(lines): assert lines[0] == "FPS1" del lines[0] return [line for line in lines if line.startswith("#")] class TestMACCS(unittest2.TestCase): def test_bitorder(self): result = run_fps("--maccs166", 7, support.fullpath("maccs.smi")) # The fingerprints are constructed to test the first few bytes. 
        self.assertEquals(result[0][:6], support.set_bit(2))
        self.assertEquals(result[1][:6], support.set_bit(3))
        self.assertEquals(result[2][:6], support.set_bit(4))
        self.assertEquals(result[3][:6], support.set_bit(5))
        self.assertEquals(result[4][:6], support.set_bit(9))
        self.assertEquals(result[5][:6], support.set_bit(10))
        self.assertEquals(result[6][:6], support.set_bit(16))

TestMACCS = unittest2.skipIf(skip_oechem, "OEChem not installed")(TestMACCS)

class TestPath(unittest2.TestCase):
    def test_default(self):
        result = run_fps("", 19)
        hexfp, id = result[0].split()
        self.assertEquals(len(hexfp), 4096//4)
        self.assertEquals(result[0], hex_test_values[0])
        self.assertEquals(result, hex_test_values)

    def test_path_option(self):
        result = run_fps("--path", 19)
        self.assertEquals(result, hex_test_values)

    def test_num_bits(self):
        result = run_fps("--numbits 16", 19)
        self.assertEquals(result[0][:5], "ff1f\t")

    def test_min_bonds_default(self):
        result = run_fps("--minbonds 0", 19)
        self.assertEquals(result, hex_test_values)

    def test_min_bonds_changed(self):
        result = run_fps("--minbonds 1", 19)
        self.assertNotEquals(result, hex_test_values)

    def test_max_bonds_default(self):
        result = run_fps("--maxbonds 5", 19)
        self.assertEquals(result, hex_test_values)

    def test_max_bonds_changed(self):
        # Exercise --maxbonds here; --minbonds is covered by test_min_bonds_changed
        result = run_fps("--maxbonds 4", 19)
        self.assertNotEquals(result, hex_test_values)

    def test_atype_default_named(self):
        result = run_fps("--atype DefaultAtom", 19)
        self.assertEquals(result, hex_test_values)

    def test_atype_default_flags(self):
        result = run_fps(atom_type_converter(
            "--atype Aromaticity|AtomicNumber|Chiral|EqHalogen|FormalCharge|HvyDegree|Hybridization"), 19)
        self.assertEquals(result, hex_test_values)

    def test_atype_default_flags_with_duplicates(self):
        result = run_fps(atom_type_converter(
            "--atype Aromaticity|Chiral|AtomicNumber|AtomicNumber|EqHalogen|HvyDegree|FormalCharge|Hybridization"), 19)
        self.assertEquals(result, hex_test_values)

    # Make sure that each of the flags returns some other
answer def test_atype_different_1(self): result = run_fps(atom_type_converter( "--atype AtomicNumber|Chiral|EqHalogen|FormalCharge|HvyDegree|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_2(self): result = run_fps(atom_type_converter( "--atype Aromaticity|Chiral|EqHalogen|FormalCharge|HvyDegree|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_3(self): result = run_fps(atom_type_converter( "--atype Aromaticity|AtomicNumber|EqHalogen|FormalCharge|HvyDegree|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_4(self): result = run_fps(atom_type_converter( "--atype Aromaticity|AtomicNumber|Chiral|FormalCharge|HvyDegree|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_5(self): result = run_fps(atom_type_converter( "--atype Aromaticity|AtomicNumber|Chiral|EqHalogen|HvyDegree|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_6(self): result = run_fps(atom_type_converter( "--atype Aromaticity|AtomicNumber|Chiral|EqHalogen|FormalCharge|Hybridization"), 19) self.assertNotEquals(result, hex_test_values) def test_atype_different_7(self): result = run_fps(atom_type_converter( "--atype Aromaticity|AtomicNumber|Chiral|EqHalogen|FormalCharge|HvyDegree"), 19) self.assertNotEquals(result, hex_test_values) def test_btype_default_named(self): result = run_fps("--btype DefaultBond", 19) self.assertEquals(result, hex_test_values) def test_btype_default_flags(self): result = run_fps(bond_type_converter("--btype BondOrder|Chiral"), 19) self.assertEquals(result, hex_test_values) def test_btype_different_1(self): result = run_fps(bond_type_converter("--btype BondOrder"), 19) self.assertNotEquals(result, hex_test_values) def test_btype_different_2(self): result = run_fps("--btype Chiral", 19) self.assertNotEquals(result, hex_test_values) TestPath = unittest2.skipIf(skip_oechem, 
"OEChem not installed")(TestPath) class TestPatterns(unittest2.TestCase): def test_rdmaccs(self): headers, fps = runner.run_split("--rdmaccs", 19) self.assertEquals(headers["#type"], "RDMACCS-OpenEye/1") self.assertEquals(fps[0], "000000000002080019c444eacd6c981baea178ef1f\t9425004") self.assertEquals(fps[1], "000000002000082159d404eea968b81b8ea17eef1f\t9425009") self.assertEquals(fps[2], "000000000000080159c404efa9689a1b8eb1faef1b\t9425012") self.assertEquals(fps[3], "000000000000082019c404ee8968b81b8ea1ffef1f\t9425015") self.assertEquals(fps[4], "000000000000088419c6b5fa8968981b8eb37aef1f\t9425018") def test_substruct(self): headers, fps = runner.run_split("--substruct", 19) self.assertEquals(headers["#type"], "ChemFP-Substruct-OpenEye/1") self.assertEquals(fps[0], "07de8d002000000000000000000000000080060000000c000000000000000080030000f8401800000030508379344c014956000055c0a44e2a0049200084e140581f041d661b10064483cb0f2925100619001393e10001007000000000008000000000000000400000000000000000\t9425004") # Note: not the same as OpenBabel's answer; bit 260 (>= 3 hetero-aromatic rings) is different. # openeye_patterns doesn't handle this. self.assertEquals(fps[1], "07de0d000000000000000000000000000080460300000c000000000000000080070000780038000000301083f920cc09695e0800d5c0e44e6e00492190844145dc1f841d261911164d039b8f29251026b9401313e0ec01007000000000000000000000000000000000000000000000\t9425009") if skip_oechem: TestPatterns = unittest2.skipIf(True, "OEChem not installed")(TestPatterns) elif chemfp.openeye.OEGRAPHSIM_API_VERSION == "1": TestPatterns = unittest2.skipIf(True, "OEGraphSim is not new enough")(TestPatterns) class TestIO(unittest2.TestCase, support.TestIdAndErrors): _runner = runner def test_compressed_input(self): result = run_fps("", source=PUBCHEM_SDF_GZ) ### XXX Fix how I handle unknown extensions. # def test_unknown_extension(self): # # OEChem's default assumes SMILES. This will parse some of the # # SD file lines as SMILES and skip the ones it doesn't know. 
#        # The error output will have a bunch of warnings, starting
#        # with the "Unknown file format ... " warning, and then this
#        # string about a SMILES parse error.
#        try:
#            run("--errors ignore", source=PUBCHEM_ANOTHER_EXT)
#        except AssertionError, x:
#            self.assertEquals("Problem parsing SMILES" in str(x), True, str(x))

    def test_specify_input_format(self):
        result = run_fps("--in sdf", source=PUBCHEM_ANOTHER_EXT)

    def test_from_stdin(self):
        run_stdin("--in sdf", source=PUBCHEM_SDF)

    def test_from_gziped_stdin(self):
        run_stdin("--in sdf.gz", source=PUBCHEM_SDF_GZ)

    def test_unknown_format(self):
        msg = run_exit("--in blah")
        self.assertEquals("Unsupported format specifier: 'blah'" in msg, True, msg)

    def test_file_does_not_exist(self):
        msg = run_exit("", source="/asdfaserwe.does.not.exist")
        self.assertEquals("Structure file '/asdfaserwe.does.not.exist' does not exist" in msg, True, repr(msg))

# XXX how to test that this generates a warning?
#    def test_specify_input_format_with_dot(self):
#        result = run_fps("--in .sdf", source=PUBCHEM_ANOTHER_EXT)

TestIO = unittest2.skipIf(skip_oechem, "OEChem not installed")(TestIO)


class TestArgErrors(unittest2.TestCase):
    def _run(self, cmd, expect):
        msg = run_exit(cmd)
        self.assertIn(expect, msg)

    def test_two_fp_types(self):
        self._run("--maccs166 --path", "Cannot specify both --maccs166 and --path")

    def test_num_bits_too_small(self):
        self._run("--numbits 0", "between 16 and 65536 bits")
        self._run("--numbits 1", "between 16 and 65536 bits")
        self._run("--numbits 15", "between 16 and 65536 bits")

    def test_num_bits_too_large(self):
        self._run("--numbits 65537", "between 16 and 65536 bits")
        # Check for overflow, even though I know it won't happen in Python
        self._run("--numbits %(big)s"%dict(big=2**32+32), "between 16 and 65536 bits")

    def test_min_bonds_too_small(self):
        self._run("--minbonds=-1", "0 or greater")

    def test_min_bonds_larger_than_default_max_bonds(self):
        self._run("--minbonds=6", "--maxbonds must not be smaller than --minbonds")

    def test_min_bonds_too_large(self):
        self._run("--minbonds=4 --maxbonds=3", "--maxbonds must not be smaller than --minbonds")

    def test_bad_atype(self):
        self._run("--atype spam", "Unknown path atom type 'spam'")

    def test_bad_atype2(self):
        self._run("--atype DefaultAtom|spam", "Unknown path atom type 'spam'")

    def test_bad_atype3(self):
        self._run("--atype DefaultAtom|", "Missing path atom flag")

    def test_bad_btype(self):
        self._run("--btype eggs", "Unknown path bond type 'eggs'")

    def test_bad_btype2(self):
        self._run("--btype DefaultBond|eggs", "Unknown path bond type 'eggs'")

    def test_bad_btype3(self):
        self._run("--btype DefaultBond|", "Missing path bond flag")

TestArgErrors = unittest2.skipIf(skip_oechem, "OEChem not installed")(TestArgErrors)

if has_oechem:
    if OEGRAPHSIM_API_VERSION == "1":
        type_converter = str
        atom_type_converter = str
        bond_type_converter = str
    else:
        type_converter = convert_type_string
        atom_type_converter = convert_v1_atom_names_to_v2
        bond_type_converter = convert_v1_bond_names_to_v2


class TestHeaderOutput(unittest2.TestCase):
    def _field(self, s, field):
        try:
            result = run(s)
        except SystemExit, err:
            raise
            raise AssertionError("Should not die: %r" % (err,))
        filtered = [line for line in result if line.startswith(field)]
        self.assertEquals(len(filtered), 1, result)
        return filtered[0]

    def test_software(self):
        result = self._field("", "#software")
        self.assertEquals("#software=OEGraphSim/" in result, True, result)
        self.assertIn("(", result)
        self.assertIn(")", result)
        result = self._field("--maccs166", "#software")
        self.assertIn("#software=OEGraphSim/", result)

    def test_type(self):
        result = self._field("", "#type")
        self.assertEquals(result, type_converter(
            "#type=OpenEye-Path/1 numbits=4096 minbonds=0 maxbonds=5 atype=DefaultAtom btype=DefaultBond"))

    def test_default_atom_and_bond(self):
        result = self._field(
            atom_type_converter("--atype=Aromaticity|AtomicNumber|Chiral|EqHalogen|FormalCharge|HvyDegree|Hybridization") +
            " " +
            bond_type_converter("--btype=BondOrder|Chiral"), "#type")
        self.assertEquals(result, type_converter(
            "#type=OpenEye-Path/1 numbits=4096 minbonds=0 maxbonds=5 atype=DefaultAtom btype=DefaultBond"))

    # different flags. All flags? and order
    def test_num_bits(self):
        result = self._field("--numbits 38", "#num_bits")
        self.assertEquals(result, "#num_bits=38")

    def test_atype_flags(self):
        result = self._field(atom_type_converter("--atype FormalCharge|FormalCharge"), "#type") + " "
        self.assertIn(atom_type_converter(" atype=FormalCharge "), result)

    def test_btype_flags(self):
        result = self._field(bond_type_converter("--btype Chiral|BondOrder"), "#type") + " "
        self.assertIn(bond_type_converter(" btype=DefaultBond "), result)
        result = self._field(bond_type_converter("--btype BondOrder|Chiral"), "#type") + " "
        self.assertIn(bond_type_converter(" btype=DefaultBond "), result)

    def test_pipe_or_comma(self):
        result = self._field(atom_type_converter("--atype HvyDegree,FormalCharge") + " " +
                             bond_type_converter("--btype Chiral,BondOrder"), "#type") + " "
        self.assertIn(atom_type_converter(" atype=FormalCharge|HvyDegree "), result)
        self.assertIn(bond_type_converter(" btype=DefaultBond "), result)

    def test_maccs_header(self):
        result = self._field("--maccs166", "#type")
        self.assertEquals(result, type_converter("#type=OpenEye-MACCS166/1"))

TestHeaderOutput = unittest2.skipIf(skip_oechem, "OEChem not installed")(TestHeaderOutput)

if has_oechem and OEGRAPHSIM_API_VERSION == "2":
    from openeye.oegraphsim import (
        OEMakePathFP, OEFPAtomType_DefaultPathAtom, OEFPBondType_DefaultPathBond,
        OEMakeCircularFP, OEFPAtomType_DefaultCircularAtom, OEFPBondType_DefaultCircularBond,
        OEMakeTreeFP, OEFPAtomType_DefaultTreeAtom, OEFPBondType_DefaultTreeBond,
        OEFPAtomType_Aromaticity, OEFPAtomType_AtomicNumber, OEFPAtomType_EqHalogen,
        OEFPAtomType_HvyDegree, OEFPAtomType_FormalCharge,
        OEFPBondType_Chiral, OEFPBondType_InRing, OEFPBondType_BondOrder,
        )


class TestOEGraphSimVersion2(unittest2.TestCase):
    def test_hash(self):
        header, result = runner.run_split("--path", 19)
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")
        self.assertEquals(result, _construct_test_values())

    def test_path_defaults(self):
        header, result = runner.run_split("--path --numbits 4096 --minbonds 0 --maxbonds 5 "
                                          "--atype AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo --btype Order|Chiral", 19)
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")
        self.assertEquals(result, _construct_test_values())

    def test_change_all_path_fields(self):
        header, result = runner.run_split("--path --numbits 1024 --minbonds 2 --maxbonds 4 "
                                          "--atype AtmNum|EqHalo --btype InRing|Order", 19)
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=1024 minbonds=2 maxbonds=4 atype=AtmNum|EqHalo btype=Order|InRing")
        def compute_path_fingerprints(fp, mol):
            OEMakePathFP(fp, mol, 1024, 2, 4, OEFPAtomType_AtomicNumber|OEFPAtomType_EqHalogen, OEFPBondType_InRing|OEFPBondType_BondOrder)
        self.assertEquals(result, _construct_test_values(compute_path_fingerprints, 1024))

    def test_path_default_type(self):
        result = run("--atype DefaultPathAtom")
        # (DefaultPathAtom is the same as DefaultAtom)
        typename = [line for line in result if line.startswith("#type=")][0]
        self.assertEquals(typename,
                          "#type=OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")

    def test_check_path_default_types(self):
        header, result = runner.run_split("--path --atype Default --btype Default")
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")
        header, result = runner.run_split("--path --atype DefaultCircularAtom --btype DefaultCircularBond")
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HCount btype=Order")
        header, result = runner.run_split("--path --atype DefaultPathAtom --btype DefaultPathBond")
        self.assertEquals(header["#type"],
                          "OpenEye-Path/2 numbits=4096 minbonds=0 maxbonds=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")

    ########################

    # Note: The documentation says that OEFPBondType_DefaultCircularBond is Order|Chiral
    # but the code says it's only Order.
    def test_circular(self):
        header, result = runner.run_split("--circular", 19)
        def compute_circular_fingerprints(fp, mol):
            OEMakeCircularFP(fp, mol, 4096, 0, 5, OEFPAtomType_DefaultCircularAtom, OEFPBondType_DefaultCircularBond)
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=4096 minradius=0 maxradius=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HCount btype=Order")
        self.assertEquals(result, _construct_test_values(compute_circular_fingerprints))

    def test_circular_defaults(self):
        # Make sure that when I specify the defaults then I get the same results
        header, result = runner.run_split("--circular --numbits 4096 --minradius 0 --maxradius 5 "
                                          "--atype AtmNum|Arom|Chiral|FCharge|HCount|EqHalo --btype Order")
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=4096 minradius=0 maxradius=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HCount btype=Order")
        def compute_circular_fingerprints(fp, mol):
            OEMakeCircularFP(fp, mol, 4096, 0, 5, OEFPAtomType_DefaultCircularAtom, OEFPBondType_DefaultCircularBond)
        self.assertEquals(result, _construct_test_values(compute_circular_fingerprints))

    def test_change_all_circular_fields(self):
        header, result = runner.run_split("--circular --numbits 1024 --minradius 2 --maxradius 4 --atype Arom --btype Chiral", 19)
        def compute_circular_fingerprints(fp, mol):
            OEMakeCircularFP(fp, mol, 1024, 2, 4, OEFPAtomType_Aromaticity, OEFPBondType_Chiral)
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=1024 minradius=2 maxradius=4 atype=Arom btype=Chiral")
        self.assertEquals(result, _construct_test_values(compute_circular_fingerprints, 1024))

    def test_check_circular_default_types(self):
        header, result = runner.run_split("--circular --atype Default --btype Default")
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=4096 minradius=0 maxradius=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HCount btype=Order")
        header, result = runner.run_split("--circular --atype DefaultCircularAtom --btype DefaultCircularBond")
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=4096 minradius=0 maxradius=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HCount btype=Order")
        header, result = runner.run_split("--circular --atype DefaultPathAtom --btype DefaultPathBond")
        self.assertEquals(header["#type"],
                          "OpenEye-Circular/2 numbits=4096 minradius=0 maxradius=5 atype=Arom|AtmNum|Chiral|EqHalo|FCharge|HvyDeg|Hyb btype=Order|Chiral")

    ########################

    def test_tree(self):
        header, result = runner.run_split("--tree", 19)
        self.assertEquals(header["#type"],
                          "OpenEye-Tree/2 numbits=4096 minbonds=0 maxbonds=4 atype=Arom|AtmNum|Chiral|FCharge|HvyDeg|Hyb btype=Order")
        def compute_tree_fingerprints(fp, mol):
            OEMakeTreeFP(fp, mol, 4096, 0, 4, OEFPAtomType_DefaultTreeAtom, OEFPBondType_DefaultTreeBond)
        self.assertEquals(result, _construct_test_values(compute_tree_fingerprints))

    def test_tree_defaults(self):
        # Make sure that when I specify the defaults then I get the same results
        header, result = runner.run_split("--tree --numbits 4096 --minradius 0 --maxradius 5 "
                                          "--atype FCharge|HvyDeg|AtmNum|Arom|Chiral|Hyb --btype Order")
        self.assertEquals(header["#type"],
                          "OpenEye-Tree/2 numbits=4096 minbonds=0 maxbonds=4 atype=Arom|AtmNum|Chiral|FCharge|HvyDeg|Hyb btype=Order")
        def compute_tree_fingerprints(fp, mol):
            OEMakeTreeFP(fp, mol, 4096, 0, 4, OEFPAtomType_DefaultTreeAtom, OEFPBondType_DefaultTreeBond)
        self.assertEquals(result, _construct_test_values(compute_tree_fingerprints))

    def test_change_all_tree_fields(self):
        header, result = runner.run_split("--tree --numbits 1024 --minbonds 1 --maxbonds 2 --atype HvyDeg|FCharge --btype InRing", 19)
        def compute_circular_fingerprints(fp, mol):
            OEMakeTreeFP(fp, mol, 1024, 1, 2, OEFPAtomType_HvyDegree | OEFPAtomType_FormalCharge, OEFPBondType_InRing)
        self.assertEquals(header["#type"],
                          "OpenEye-Tree/2 numbits=1024 minbonds=1 maxbonds=2 atype=FCharge|HvyDeg btype=InRing")
        self.assertEquals(result, _construct_test_values(compute_circular_fingerprints, 1024))

if skip_oechem:
    TestOEGraphSimVersion2 = unittest2.skipIf(skip_oechem, "OEChem not installed")(TestOEGraphSimVersion2)
else:
    TestOEGraphSimVersion2 = unittest2.skipIf(OEGRAPHSIM_API_VERSION == "1", "OEGraphSim library is too old")(TestOEGraphSimVersion2)

if __name__ == "__main__":
    unittest2.main()

# chemfp-1.1p1/tests/test_openeye_patterns.py

from __future__ import with_statement
import sys
import unittest2

import support

try:
    from openeye.oechem import OEGraphMol, OEParseSmiles, OEChemIsLicensed
    if not OEChemIsLicensed():
        raise ImportError
    skip_oechem = False
except ImportError:
    skip_oechem = support.can_skip("oe")

if not skip_oechem:
    from chemfp import openeye_patterns
    from chemfp.openeye import OEGRAPHSIM_API_VERSION

import test_patterns
from chemfp import types

MACCS_SMI = support.fullpath("maccs.smi")

def parse_smiles(smiles):
    mol = OEGraphMol()
    OEParseSmiles(mol, smiles)
    return mol

def _count(it):
    return sum(1 for item in it)

class ReferenceMixin(object):
    def test_reference_data_set(self):
        largest = min(self.reference_limit, max(v for (k,v) in self.reference_cases))
        matcher = self.reference_class(largest)
        for (smiles, expected) in self.reference_cases:
            mol = parse_smiles(smiles)
            self.assertEquals(matcher.SingleMatch(mol), bool(expected), smiles)
            expected = min(expected, largest)
            self.assertGreaterEqual(_count(matcher.Match(mol)), expected, smiles)

    def test_match_limit(self):
        largest = min(5, self.reference_limit)
        for max_count in range(1, largest+1):
            matcher = self.reference_class(max_count)
            for (smiles, expected) in self.reference_cases:
                mol = parse_smiles(smiles)
                expected = min(expected, max_count)
                self.assertGreaterEqual(_count(matcher.Match(mol)), expected, smiles)

class TestHydrogenMatcher(ReferenceMixin, unittest2.TestCase):
    if not skip_oechem:
        reference_class = openeye_patterns.HydrogenMatcher
        reference_cases = test_patterns.hydrogen_test_cases
        reference_limit = 100

TestHydrogenMatcher = unittest2.skipIf(skip_oechem, "OEChem not installed")(
    TestHydrogenMatcher)

class TestAromaticRingMatcher(ReferenceMixin, unittest2.TestCase):
    if not skip_oechem:
        reference_class = openeye_patterns.AromaticRings
        reference_cases = test_patterns.aromatic_ring_cases
        reference_limit = 2

TestAromaticRingMatcher = unittest2.skipIf(skip_oechem, "OEChem not installed")(
    TestAromaticRingMatcher)

# XXX These are too low-level. The tests should really be done through the
# file interface and shared with the other implementations.

class TestAromaticRings(unittest2.TestCase):
    def _count(self, matcher, smiles):
        return sum(1 for x in matcher.Match(parse_smiles(smiles)))

    def test_at_least_one_aromatic_ring(self):
        matcher = openeye_patterns.AromaticRings(1)
        self.assertFalse(matcher.SingleMatch(parse_smiles("C")))
        self.assertTrue(matcher.SingleMatch(parse_smiles("c1ccccc1")))
        self.assertTrue(matcher.SingleMatch(parse_smiles("c1ccccc1c1ccccc1")))
        self.assertEqual(self._count(matcher, "C"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1"), 1)
        self.assertIn(self._count(matcher, "c12cccccc1ccccc2"), (1, 2)) # Matcher can return >count; XXX Why?
        self.assertEqual(self._count(matcher, "c1ccccc1c1ccccc1"), 1)
        self.assertEqual(self._count(matcher, "c1ccccc1C1CCCCC1"), 1)

    def test_at_least_two_aromatic_rings(self):
        matcher = openeye_patterns.AromaticRings(2)
        self.assertEqual(self._count(matcher, "C"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1"), 1)
        self.assertEqual(self._count(matcher, "c12cccccc1ccccc2"), 2)
        self.assertEqual(self._count(matcher, "c1ccccc1c1ccccc1"), 2)
        self.assertEqual(self._count(matcher, "c1ccccc1C1CCCCC1"), 1)
        self.assertGreaterEqual(self._count(matcher, "c1ccc-2cccc-2cc1"), 1)
        self.assertGreaterEqual(self._count(matcher, "c1ccc-2cccc-2c3c1cccn3"), 2)
        self.assertGreaterEqual(self._count(matcher, "c1ccccc1.c1ccccc1"), 2)

    def test_aromatic_rings_failure(self):
        with self.assertRaisesRegexp(NotImplementedError, "No support for >=3 aromatic rings"):
            openeye_patterns.AromaticRings(3)

class TestHeteroAromaticRings(unittest2.TestCase):
    def _count(self, matcher, smiles):
        return sum(1 for x in matcher.Match(parse_smiles(smiles)))

    def test_at_least_one_heteroaromatic_ring(self):
        matcher = openeye_patterns.HeteroAromaticRings(1)
        self.assertFalse(matcher.SingleMatch(parse_smiles("C")))
        self.assertTrue(matcher.SingleMatch(parse_smiles("c1cccnc1")))
        self.assertTrue(matcher.SingleMatch(parse_smiles("c1cccnc1c1ccccc1")))
        self.assertTrue(matcher.SingleMatch(parse_smiles("c1ccccc1c1ncccc1")))
        self.assertEqual(self._count(matcher, "C"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1"), 0)
        self.assertEqual(self._count(matcher, "c1ccncc1"), 1)
        self.assertEqual(self._count(matcher, "c12cccccc1ccccc2"), 0)
        self.assertEqual(self._count(matcher, "c12ccnc1ccccc2"), 1)
        self.assertEqual(self._count(matcher, "c12cncc1ccccc2"), 1)
        self.assertEqual(self._count(matcher, "c12ccccc1cncc2"), 1)
        self.assertIn(self._count(matcher, "c12cccnc1cncc2"), (1, 2)) # a matcher can return >count. Why? XXX
        self.assertEqual(self._count(matcher, "c1ccccc1c1ccccc1"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1C1CCCNC1"), 0)
        self.assertEqual(self._count(matcher, "c1cnccc1C1CCCNC1"), 1)
        self.assertEqual(self._count(matcher, "c1cnccc1C1CCCNC1"), 1)

    def test_at_least_two_heteroaromatic_rings(self):
        matcher = openeye_patterns.HeteroAromaticRings(2)
        self.assertEqual(self._count(matcher, "C"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1"), 0)
        self.assertEqual(self._count(matcher, "c1ccncc1"), 1)
        self.assertEqual(self._count(matcher, "c12cccccc1ccccc2"), 0)
        self.assertEqual(self._count(matcher, "c12ccnc1ccccc2"), 1)
        self.assertEqual(self._count(matcher, "c12cncc1ccccc2"), 1)
        self.assertEqual(self._count(matcher, "c12ccccc1cncc2"), 1)
        self.assertEqual(self._count(matcher, "c12cccnc1cncc2"), 2)
        self.assertEqual(self._count(matcher, "c1ccccc1c1ccccc1"), 0)
        self.assertEqual(self._count(matcher, "c1ccccc1C1CCCNC1"), 0)
        self.assertEqual(self._count(matcher, "c1cnccc1C1CCCNC1"), 1)
        self.assertEqual(self._count(matcher, "c1cnccc1C1CCCNC1"), 1)
        self.assertEqual(self._count(matcher, "c1ccccn1.c1ccoc1"), 2)
        self.assertEqual(self._count(matcher, "c1ccc-2cccc-2cc1"), 0)
        self.assertEqual(self._count(matcher, "c1ccn-2ccc-2cc1"), 1)
        self.assertEqual(self._count(matcher, "c1ccccc2c1cncnc2"), 1)

    def test_heteroaromatic_rings_failure(self):
        with self.assertRaisesRegexp(NotImplementedError, "No support for >=3 hetero-aromatic rings"):
            openeye_patterns.HeteroAromaticRings(3)

class TestNumFragments(unittest2.TestCase):
    def _count(self, matcher, smiles):
        return sum(1 for x in matcher.Match(parse_smiles(smiles)))

    def test_single_match(self):
        matcher = openeye_patterns.NumFragments(10)
        self.assertEqual(matcher.SingleMatch(OEGraphMol()), 0)
        self.assertEqual(matcher.SingleMatch(parse_smiles("C")), 1)
        self.assertEqual(matcher.SingleMatch(parse_smiles("C.C")), 1)

    def test_multiple_matches(self):
        matcher = openeye_patterns.NumFragments(10)
        self.assertEqual(self._count(matcher, ""), 0)
        self.assertEqual(self._count(matcher, "C"), 1)
        self.assertEqual(self._count(matcher, "CC"), 1)
        self.assertEqual(self._count(matcher, "C.C"), 2)
        self.assertEqual(self._count(matcher, "C.C.O.[U]"), 4)

reason = None
if skip_oechem:
    reason = "OEChem not installed"
elif OEGRAPHSIM_API_VERSION == "1":
    reason = "OEGraphSim is not new enough"

TestAromaticRings = unittest2.skipIf(reason, reason)(TestAromaticRings)
TestHeteroAromaticRings = unittest2.skipIf(reason, reason)(TestHeteroAromaticRings)
TestNumFragments = unittest2.skipIf(reason, reason)(TestNumFragments)

if __name__ == "__main__":
    unittest2.main()

# chemfp-1.1p1/tests/test_patterns.py

hydrogen_test_cases = [
    ("[C]", 0),
    ("[CH]", 1),
    ("[CH2]", 2),
    ("[CH3]", 3),
    ("[CH4]", 4),
    ("[C][C]", 0),
    ("[C][CH]", 1),
    ("[CH][CH2]", 3),
    ("[H]", 1),
    ("[HH]", 2),
    ("[H][CH3]", 4),
    ("C", 4),
    ("[CH]([2H])([3H])[H]", 4),
    ("[H][CH3]", 4),
    ("[H][H]", 2),
    ]

aromatic_ring_cases = [
    ("C1CCCCC1", 0),
    ("c1ccccc1", 1),
    ("c1ccccc1.c1ccccc1", 2),
    ("c1cncc2c1nncn2", 2),
    ("c1csc2c1csc2", 2),
    ("c1csc2c1csc2.c1csc2c1csc2", 4),
    ("c1ccccc1.c1ccccc1.c1ccccc1", 3),
    ("c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1", 4),
    ("c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1", 5),
    ("c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1.c1ccccc1", 6),
    ]

# chemfp-1.1p1/tests/test_rdkit2fps.py

from __future__ import with_statement
import sys
import unittest2
import tempfile
import shutil
import os
from cStringIO import StringIO

import support

try:
    import chemfp.rdkit
    has_rdkit = True
    skip_rdkit = False
except ImportError:
    has_rdkit = False
    skip_rdkit = True
    if not support.can_skip("rdkit"):
        skip_rdkit = False
        import rdkit

if has_rdkit:
    from chemfp.commandline import rdkit2fps
    runner = support.Runner(rdkit2fps.main)
else:
    runner = None

MACCS_SMI = support.fullpath("maccs.smi")
TRP = open(support.fullpath("tryptophan.sdf")).read()

class TestMACCS(unittest2.TestCase):
    def test_bitorder(self):
        result = runner.run_fps("--maccs166", 7, MACCS_SMI)
        # The fingerprints are constructed to test the first few bytes.
        self.assertEquals(result[0][:6], support.set_bit(2))
        self.assertEquals(result[1][:6], support.set_bit(3))
        self.assertEquals(result[2][:6], support.set_bit(4))
        self.assertEquals(result[3][:6], support.set_bit(5))
        self.assertEquals(result[4][:6], support.set_bit(9))
        self.assertEquals(result[5][:6], support.set_bit(10))
        self.assertEquals(result[6][:6], support.set_bit(16))

    def test_type(self):
        for line in runner.run("--maccs166", MACCS_SMI):
            if line.startswith("#type="):
                self.assertEquals(line, "#type=RDKit-MACCS166/1")
                return
        self.assertEquals("could not find", "#type line")

TestMACCS = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestMACCS)

_fp1 = "32bb93145be9598dc6f22cbd1c781196e1733f7a53ed6f09e9e55e22bd3d3ac9e3be17f187fbcaefea8d2982ba7dab47ae1a3fd1aca52b48c70f540f964f79cd79afd9dc9871717341eaf7d7abe6febbc9bee9a971855ec7d960ecb2dacdbbb9b9b6d05f8ce9b7f4bc57fa7fa4573e95fe5a7dc918883f7fd9a3a825ef8e2fb2df944b94a2fb36c023cef883e967d9cf698fbb927cfe4fcbbaff71f7ada5ced97d5d679764bba6be8ff7d762f98d26bfbb3cb003647e1180966bc7eaffdad9a2ce47c6169bf679639e67e1bf50bd8bf30d3438dc877e67ba4e786fedfb831e56f34abc27bdfdce02c7aa57b36f761deb9d9bd5b2579df169ab0eae547515d2a7"

assert len(_fp1) == 2048 // 4

def get_field_and_first(cmdline, field):
    result = runner.run(cmdline)
    field_value = None
    first = None
    for line in result:
        if line.startswith(field):
            field_value = line
        if not line.startswith("#"):
            first = line
            break
    return (field_value, first)

class TestRDKFingerprints(unittest2.TestCase):
    def assertIn(self, substr, str):
        self.assertEquals(substr in str, True, str)

    def test_is_default(self):
        result = runner.run_fps("", 19)
        self.assertEquals(result[0], _fp1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _fp1)
        # All must have the same length (since the fp lengths and ids lengths are the same)
        self.assertEquals(len(set(map(len, result))), 1, set(map(len, result)))

    def test_as_rdk(self):
        result = runner.run_fps("--RDK", 19)
        self.assertEquals(result[0], _fp1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _fp1)
        # All must have the same length (since the fp lengths and ids lengths are the same)
        self.assertEquals(len(set(map(len, result))), 1, set(map(len, result)))

    def test_num_bits_default(self):
        result = runner.run_fps("--fpSize 2048", 19)
        self.assertEquals(result[0], _fp1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _fp1)

    def test_num_bits_16(self):
        field, first = get_field_and_first("--fpSize 16", "#num_bits=")
        self.assertEquals(field, "#num_bits=16")
        self.assertEquals(first, "ffff\t9425004")

    def test_num_bits_1(self):
        field, first = get_field_and_first("--fpSize 1", "#num_bits=")
        self.assertEquals(field, "#num_bits=1")
        self.assertEquals(first, "01\t9425004")

    def test_num_bits_2(self):
        field, first = get_field_and_first("--fpSize 2", "#num_bits=")
        self.assertEquals(field, "#num_bits=2")
        self.assertEquals(first, "03\t9425004")

    def test_num_bits_too_small(self):
        result = runner.run_exit("--fpSize 0")
        self.assertIn("fpSize must be 1 or greater", result)

    def test_bits_per_hash_default(self):
        field, first = get_field_and_first("--nBitsPerHash 4", "#type=")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=1 maxPath=7 fpSize=2048 nBitsPerHash=4 useHs=1")
        self.assertEquals(first.split()[0], _fp1)

    def test_bits_per_hash(self):
        field, first = get_field_and_first("--nBitsPerHash 1", "#type")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=1 maxPath=7 fpSize=2048 nBitsPerHash=1 useHs=1")
        self.assertNotEquals(first.split()[0], _fp1)

    def test_bits_per_hash_too_small(self):
        result = runner.run_exit("--nBitsPerHash 0")
        self.assertIn("nBitsPerHash must be 1 or greater", result)

    def test_min_path_default(self):
        field, first = get_field_and_first("--minPath 1", "#type")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=1 maxPath=7 fpSize=2048 nBitsPerHash=4 useHs=1")
        self.assertEquals(first.split()[0], _fp1)

    def test_min_path_2(self):
        field, first = get_field_and_first("--minPath 2", "#type")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=2 maxPath=7 fpSize=2048 nBitsPerHash=4 useHs=1")
        self.assertNotEquals(first.split()[0], _fp1)

    def test_min_path_too_small(self):
        result = runner.run_exit("--minPath 0")
        self.assertIn("minPath must be 1 or greater", result)

    def test_min_path_too_large(self):
        result = runner.run_exit("--minPath 5 --maxPath 4")
        self.assertIn("--minPath must not be greater than --maxPath", result)

    def test_max_path_default(self):
        field, first = get_field_and_first("--maxPath 7", "#type")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=1 maxPath=7 fpSize=2048 nBitsPerHash=4 useHs=1")
        self.assertEquals(first.split()[0], _fp1)

    def test_max_path_6(self):
        field, first = get_field_and_first("--maxPath 6", "#type")
        self.assertEquals(field,
                          "#type=RDKit-Fingerprint/1 minPath=1 maxPath=6 fpSize=2048 nBitsPerHash=4 useHs=1")
        self.assertNotEquals(first.split()[0], _fp1)

    # def test_ignore_Hs(self):
    #   I don't have a good test case for this... XXX

TestRDKFingerprints = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestRDKFingerprints)

_morgan1 = "00000080000200000010010000000040000000400000000000000000000000000000000200000000800000400040000400000000000000004200000000080000000000000000020000000000000000000004000000004008000000000002000000000000800000000800000100080800000000048000000000400000000000000081000002000000010000000000000001000020000000000000000000020000000000000100000020040800100000000000000000000000000000000000000000000000000000040000800000000000000008000000000408004000000000000000000000000100000002000000002000010000100000000000000000000000"

_morgan_radius3 = "00000080000200000110010000000040000100400000000000000000000000004000000201000000800000400040000401000000000000004200000000080000000000000000020000000000000000000004400000004008000000000002000000000000800000000800000100080800000000048000000000408000000000000081000002000000010020000000000001000020000000000002000000020080000000000100000020040800100000000000000000000000000000000000200000000040000000040000800000000000000008000000040408004000000000000000000000000100000002000000002800010000100000000000000000000000"

class TestRDKMorgan(unittest2.TestCase):
    def test_as_morgan(self):
        result = runner.run_fps("--morgan", 19)
        self.assertEquals(result[0], _morgan1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _morgan1)
        # All must have the same length (since the fp lengths and ids lengths are the same)
        self.assertEquals(len(set(map(len, result))), 1, set(map(len, result)))

    def test_num_bits_default(self):
        result = runner.run_fps("--morgan --fpSize 2048", 19)
        self.assertEquals(result[0], _morgan1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _morgan1)

    def test_num_bits_16(self):
        field, first = get_field_and_first("--morgan --fpSize 16", "#num_bits=")
        self.assertEquals(field, "#num_bits=16")
        self.assertEquals(first, "fbff\t9425004")

    def test_num_bits_1(self):
        field, first = get_field_and_first("--morgan --fpSize 1", "#num_bits=")
        self.assertEquals(field, "#num_bits=1")
        self.assertEquals(first, "01\t9425004")

    def test_num_bits_2(self):
        field, first = get_field_and_first("--morgan --fpSize 2", "#num_bits=")
        self.assertEquals(field, "#num_bits=2")
        self.assertEquals(first, "03\t9425004")

    def test_num_bits_too_small(self):
        result = runner.run_exit("--morgan --fpSize 0")
        self.assertIn("fpSize must be 1 or greater", result)

    def test_radius_default(self):
        result = runner.run_fps("--morgan --radius 2", 19)
        self.assertEquals(result[0], _morgan1 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _morgan1)

    def test_radius_3(self):
        result = runner.run_fps("--morgan --radius 3", 19)
        self.assertEquals(result[0], _morgan_radius3 + "\t9425004")
        self.assertNotEquals(result[1].split()[0], _morgan1)

    def test_radius_too_small(self):
        result = runner.run_exit("--morgan --radius -1")
        self.assertIn("radius must be 0 or greater", result)

    def test_default_use_options(self):
        field, first = get_field_and_first("--morgan --useFeatures 0 --useChirality 0 --useBondTypes 1",
                                           "#type")
        self.assertEquals(field,
                          "#type=RDKit-Morgan/1 radius=2 fpSize=2048 useFeatures=0 useChirality=0 useBondTypes=1")
        self.assertEquals(first, _morgan1 + "\t9425004")

    # This isn't a complete test of the different options. I don't think it's worth the effort
    def test_useChirality(self):
        field, first = get_field_and_first("--morgan --useFeatures 1 --useChirality 1 --useBondTypes 0",
                                           "#type=")
        self.assertEquals(field,
                          "#type=RDKit-Morgan/1 radius=2 fpSize=2048 useFeatures=1 useChirality=1 useBondTypes=0")
        self.assertNotEquals(first, _morgan1 + "\t9425004")

TestRDKMorgan = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestRDKMorgan)

_atom_pair_fingerprints = {
    None: None,
    "1": {"2048": "100000100008007000045020000008b0080220a4420084c000054010800300e02040000088c080010000800101404023000000100000000020000000004000a00002060400040800000000000002c00108000000000030009d00100002001001900080003081058010000400200209004000000000050b00800008084042060801000800000000010200030000000000040000000080000000000400000000021000708000100000600010008080200008000c8020000004040008000000600000000008000000100000000402000000400080300000600090004020000020000002008100000800000100020000000000000000000008100000000002000000",
          "128": "fd42febdfaddfff5ff05df3fe3c3fffb",
          "minLength": "dd02bebd328cbff5be055f3e6242ff32",
          "maxLength": "3042c000e8d1e141f101d02181c160cb",
          },
    "2": {"2048": "0100070010000000101100000013010000000000001100000010000001001703100000000007000011000301000310001000000010000000003000110100731000310001300000000000101000033110100000010000001000037311000000370313033003010000000101070000130030000010330000000000170031001077000013301000000300003133030000300030133003000131011100100f000010000013010300000000030310310000000300101030000011033010077100100000300003000000000011000000000110000010000301000037300000000101000001303000000000003000000010000010000001100000073001100100101010",
          "128": "77f7fff7ff17017f7fffff7fff3fffff",
          "minLength": "71777f777317003377ffff37733fff77",
          "maxLength": "073033737f0001370100337f7f101077",
          },
    }

if not (skip_rdkit or chemfp.rdkit.ATOM_PAIR_VERSION is None):
    _atom_pair_fps = _atom_pair_fingerprints[chemfp.rdkit.ATOM_PAIR_VERSION]
    PAIR_TYPE = "RDKit-AtomPair/" + chemfp.rdkit.ATOM_PAIR_VERSION + " "

class TestAtomPairFingerprinter(unittest2.TestCase):
    def test_pair_defaults(self):
        header, output = runner.run_split("--pair", 19)
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=1 maxLength=30")
        self.assertEqual(output[0], _atom_pair_fps["2048"] + "\t9425004")

    def test_pair_explicit_defaults(self):
        header, output = runner.run_split("--pair --fpSize 2048 --minLength 1 --maxLength 30", 19)
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=1 maxLength=30")
        self.assertEqual(output[0], _atom_pair_fps["2048"] + "\t9425004")

    def test_num_bits_128(self):
        header, output = runner.run_split("--pair --fpSize 128", 19)
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=128 minLength=1 maxLength=30")
        self.assertEqual(output[0], _atom_pair_fps["128"] + "\t9425004")

    def test_num_bits_error(self):
        errmsg = runner.run_exit("--pair --fpSize 0")
        self.assertIn("fpSize must be 1 or greater", errmsg)
        errmsg = runner.run_exit("--pair --fpSize 2.3")
        self.assertIn("fpSize must be 1 or greater", errmsg)

    def test_min_length(self):
        header, output = runner.run_split("--pair --fpSize 128 --minLength 4", 19)
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=128 minLength=4 maxLength=30")
        self.assertEqual(output[0], _atom_pair_fps["minLength"] + "\t9425004")

    def test_max_length(self):
        header, output = runner.run_split("--pair --fpSize 128 --maxLength 3", 19)
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=128 minLength=1 maxLength=3")
        self.assertEqual(output[0], _atom_pair_fps["maxLength"] + "\t9425004")

    def test_min_length_error(self):
        errmsg = runner.run_exit("--pair --minLength spam")
        self.assertIn("minLength must be 0 or greater", errmsg)

    def test_max_length_error(self):
        errmsg = runner.run_exit("--pair --maxLength -3")
        self.assertIn("maxLength must be 0 or greater", errmsg)
        errmsg = runner.run_exit("--pair --maxLength spam")
        self.assertIn("maxLength must be 0 or greater", errmsg)

    def test_invalid_min_max_lengths(self):
        errmsg = runner.run_exit("--pair --maxLength 0") # default minLength is 1
        self.assertIn("--minLength must not be greater than --maxLength", errmsg)
        errmsg = runner.run_exit("--pair --minLength 4 --maxLength 3")
        self.assertIn("--minLength must not be greater than --maxLength", errmsg)

    def test_valid_min_max_lengths(self):
        header, output = runner.run_split("--pair --minLength 0")
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=0 maxLength=30")
        header, output = runner.run_split("--pair --minLength 0 --maxLength 0")
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=0 maxLength=0")
        header, output = runner.run_split("--pair --minLength 5 --maxLength 5")
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=5 maxLength=5")
        header, output = runner.run_split("--pair --minLength 6 --maxLength 8")
        self.assertEqual(header["#type"], PAIR_TYPE + "fpSize=2048 minLength=6 maxLength=8")

if skip_rdkit:
    TestAtomPairFingerprinter = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestAtomPairFingerprinter)
else:
    TestAtomPairFingerprinter = unittest2.skipIf(chemfp.rdkit.ATOM_PAIR_VERSION is None, "This version of RDKit has a broken GetHashedAtomPairFingerprintAsBitVect")(TestAtomPairFingerprinter)

_torsion_fingerprints = {
    "1": {
        "2048": "000000010100000000000040800008000000000000000000000000000000000040004000000000000c0000000000000040000000003010000000000000000800000000000000000000000000000000000000000000100000000000000000000000000080000000000000000000000000000000000000000000000000400000000000000000000000000000000000000000000002000000000000000000000000000000000000002000000000000000000000000000000020200080000000000000100000000000000000000000000000800000000000008400000000000000000200000000000000000000020000000000000000010008000000000200000000",
        "128": "c2104083013018a42c008042c0000800",
        "targetSize": "1491150001c0f010648000086245052c",
        },
    "2": {
        "2048": "00000000000003000000000001000000010000000000000000000000000000000000000000000000000000000000000700000010000000000010000000000000000000100000000000000000100000000000000000000100000000000001003000000000000000000000000000000000000000000000000000000000000000001000000100000000000000010000001000000000000000000000000000001000001100000000000000000000100000000000000000000000000000000000000000000001000000000000000000000000010000000000330000100100000010000000100000000000000000101000000000000001000000000030000000000000",
        "128": "13111033000037000070011131013037",
        "targetSize": "33037307030103730303131100331100",
        }
    }

if not skip_rdkit:
    _torsion_fps = _torsion_fingerprints[chemfp.rdkit.TORSION_VERSION]
    TORSION_TYPE = "RDKit-Torsion/" + chemfp.rdkit.TORSION_VERSION + " "

class TestTorsionFingerprinter(unittest2.TestCase):
    def test_torsion_defaults(self):
        header, output = runner.run_split("--torsion", 19)
        self.assertEqual(header["#type"], "RDKit-Torsion/1 fpSize=2048 targetSize=4")
        self.assertEqual(output[0], _torsion_fps["2048"] + "\t9425004")

    def test_torsion_explicit_defaults(self):
        header, output = runner.run_split("--torsion --fpSize 2048 --targetSize 4", 19)
        self.assertEqual(header["#type"], "RDKit-Torsion/1 fpSize=2048 targetSize=4")
        self.assertEqual(output[0], _torsion_fps["2048"] + "\t9425004")

    def test_num_bits_128(self):
        header, output = runner.run_split("--torsion --fpSize 128 --targetSize 4", 19)
        self.assertEqual(header["#type"], "RDKit-Torsion/1 fpSize=128 targetSize=4")
        self.assertEqual(output[0], _torsion_fps["128"] + "\t9425004")

    def test_num_bits_error(self):
        errmsg = runner.run_exit("--torsion --fpSize 0")
        self.assertIn("fpSize must be 1 or greater", errmsg)
        errmsg = runner.run_exit("--torsion --fpSize 2.3")
        self.assertIn("fpSize must be 1 or greater", errmsg)

    def test_target_size(self):
        header, output = runner.run_split("--torsion --fpSize 128 --targetSize 5", 19)
        self.assertEqual(header["#type"], "RDKit-Torsion/1 fpSize=128 targetSize=5")
self.assertEqual(output[0], _torsion_fps["targetSize"] + "\t9425004") def test_target_size_error(self): errmsg = runner.run_exit("--torsion --fpSize 128 --targetSize -1") self.assertIn("targetSize must be 1 or greater", errmsg) errmsg = runner.run_exit("--torsion --fpSize 128 --targetSize spam") self.assertIn("targetSize must be 1 or greater", errmsg) errmsg = runner.run_exit("--torsion --fpSize 128 --targetSize 0") self.assertIn("targetSize must be 1 or greater", errmsg) TestTorsionFingerprinter = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestTorsionFingerprinter) class TestIO(unittest2.TestCase, support.TestIdAndErrors): _runner = runner def test_input_format(self): def without_source_header(cmdline, source): return [line for line in runner.run(cmdline, source) if not line.startswith("#source=") and not line.startswith("#date=")] result1 = without_source_header("", support.PUBCHEM_SDF) result2 = without_source_header("", support.PUBCHEM_SDF_GZ) self.assertEquals(result1, result2) result3 = without_source_header("--in sdf.gz", support.PUBCHEM_SDF_GZ) self.assertEquals(result1, result3) result4 = without_source_header("--in sdf", support.PUBCHEM_ANOTHER_EXT) self.assertEquals(result1, result4) def test_output(self): dirname = tempfile.mkdtemp(prefix="test_rdkit2fps") output_filename = os.path.join(dirname, "blah.fps") assert len(output_filename.split()) == 1 try: result = runner.run("-o " + output_filename) assert len(result) == 0 with open(output_filename) as f: result = f.readlines() finally: shutil.rmtree(dirname) self.assertEquals(result[0], "#FPS1\n") while result and result[0].startswith("#"): del result[0] self.assertEquals(len(result), 19) self.assertEquals(result[0], _fp1 + "\t9425004\n") def test_bad_format(self): result = runner.run_exit("--in spam") self.assertIn("Unsupported format specifier: 'spam'", result) TestIO = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestIO) class TestBadStructureFiles(unittest2.TestCase): def setUp(self): 
self.dirname = tempfile.mkdtemp() def tearDown(self): shutil.rmtree(self.dirname) def _make_datafile(self, text, ext): filename = os.path.join(self.dirname, "input."+ext) with open(filename, "w") as f: f.write(text) return filename def test_blank_line_in_smiles(self): filename = self._make_datafile("C methane\n\nO water\n", "smi") msg = runner.run_exit([filename]) self.assertIn("Unexpected blank line at line 2 of", msg) def test_bad_smiles(self): filename = self._make_datafile("C methane\nQ Q-ane\nO water\n", "smi") msg = runner.run_exit([filename]) self.assertIn("Cannot parse the SMILES 'Q' at line 2", msg) def test_smiles_without_title(self): filename = self._make_datafile("C methane\nO water\n[235U]\n", "smi") msg = runner.run_exit([filename]) self.assertIn("Missing SMILES name (second column) at line 3", msg) def test_sdf_with_bad_record(self): # Three records, second one is bad input = TRP + TRP.replace("32 28", "40 21") + TRP filename = self._make_datafile(input, "sdf") msg = runner.run_exit([filename]) self.assertIn("Could not parse molecule block at line 70", msg) self.assertIn("input.sdf", msg) def test_sdf_with_bad_record_checking_id(self): # This tests a different code path than the previous input = TRP + TRP.replace("32 28", "40 21") + TRP filename = self._make_datafile(input, "sdf") msg = runner.run_exit(["--id-tag", "COMPND", filename]) self.assertIn("Could not parse molecule block at line 70", msg) self.assertIn("input.sdf", msg) def test_sdf_with_missing_id(self): filename = self._make_datafile(TRP, "sdf") msg = runner.run_exit(["--id-tag", "SPAM", filename]) self.assertIn("Missing id tag 'SPAM' for record #1 at line 1", msg) self.assertIn("input.sdf", msg) def test_ignore_errors(self): input = TRP + TRP.replace("32 28", "40 21") + TRP.replace("COMPND", "BLAH") filename = self._make_datafile(input, "sdf") header, output = runner.run_split(["--errors", "ignore", "--id-tag", "BLAH"], source=filename) self.assertEqual(len(output), 1) def 
test_unsupported_format(self): filename = self._make_datafile("Unknown", "xyzzy") result = runner.run_exit([filename]) self.assertIn("Unknown structure filename extension", result) self.assertIn("input.xyzzy", result) TestBadStructureFiles = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestBadStructureFiles) # Some code to test the internal interface class TestInternals(unittest2.TestCase): def test_make_rdk_fingerprinter(self): # Make sure that I can call with the defaults chemfp.rdkit.make_rdk_fingerprinter() def test_make_rdk_fingerprinter_bad_fpSize(self): with self.assertRaisesRegexp(ValueError, "fpSize must be positive"): chemfp.rdkit.make_rdk_fingerprinter(fpSize=0) with self.assertRaisesRegexp(ValueError, "fpSize must be positive"): chemfp.rdkit.make_rdk_fingerprinter(fpSize=-10) def test_make_rdk_fingerprinter_min_path(self): with self.assertRaisesRegexp(ValueError, "minPath must be positive"): chemfp.rdkit.make_rdk_fingerprinter(minPath=0) with self.assertRaisesRegexp(ValueError, "minPath must be positive"): chemfp.rdkit.make_rdk_fingerprinter(minPath=-3) def test_make_rdk_fingerprinter_max_path(self): chemfp.rdkit.make_rdk_fingerprinter(minPath=2, maxPath=2) with self.assertRaisesRegexp(ValueError, "maxPath must not be smaller than minPath"): chemfp.rdkit.make_rdk_fingerprinter(minPath=3, maxPath=2) def test_make_rdk_fingerprinter_bits_per_hash(self): with self.assertRaisesRegexp(ValueError, "nBitsPerHash must be positive"): chemfp.rdkit.make_rdk_fingerprinter(nBitsPerHash=0) with self.assertRaisesRegexp(ValueError, "nBitsPerHash must be positive"): chemfp.rdkit.make_rdk_fingerprinter(nBitsPerHash=-1) def test_make_morgan_fingerprinter(self): chemfp.rdkit.make_morgan_fingerprinter() def test_make_morgan_fingerprinter_bad_fpSize(self): with self.assertRaisesRegexp(ValueError, "fpSize must be positive"): chemfp.rdkit.make_morgan_fingerprinter(fpSize=0) with self.assertRaisesRegexp(ValueError, "fpSize must be positive"): 
chemfp.rdkit.make_morgan_fingerprinter(fpSize=-10) def test_make_morgan_fingerprinter_bad_radius(self): with self.assertRaisesRegexp(ValueError, "radius must be positive or zero"): chemfp.rdkit.make_morgan_fingerprinter(radius=-1) with self.assertRaisesRegexp(ValueError, "radius must be positive or zero"): chemfp.rdkit.make_morgan_fingerprinter(radius=-10) TestInternals = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestInternals) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_reorder.py0000644000077000000240000000773512055226641017511 0ustar dalkestaff00000000000000# Test the fingerprint reordering implementation # Note: this is the ordering by popcount, and NOT the ordering of the results! from __future__ import absolute_import, with_statement import unittest2 from cStringIO import StringIO import struct import chemfp from chemfp.bitops import byte_popcount def _load(fingerprints, reorder): if len(fingerprints) == 0: num_bits = 16 else: num_bits = len(fingerprints[0])*8 id_fps = ((str(i), fp) for (i, fp) in enumerate(fingerprints)) return chemfp.load_fingerprints(id_fps, metadata=chemfp.Metadata(num_bits=num_bits), reorder=reorder, alignment=1) def verify_popcount_indices(arena): assert len(arena.popcount_indices) % 4 == 0 format = "i"*(len(arena.popcount_indices)//4) values = struct.unpack(format, arena.popcount_indices) assert len(values) == arena.metadata.num_bits+2, (len(values), arena.metadata.num_bits) assert values[-1] == len(arena), (values[-1], len(arena), values) for i in range(len(values)-1): start = values[i] end = values[i+1] for j in range(start, end): fp = arena[j][1] assert byte_popcount(fp) == i, (byte_popcount(fp), i, start, end) class TestReorder(unittest2.TestCase): def test_empty(self): arena = _load([], True) self.assertEquals(arena.arena, "") verify_popcount_indices(arena) arena = _load([], False) self.assertEquals(arena.arena, "") self.assertEquals(arena.popcount_indices, "") def test_single(self): arena = 
_load(["1234"], True) self.assertEquals(arena.arena, "1234") verify_popcount_indices(arena) arena = _load(["1234"], False) self.assertEquals(arena.arena, "1234") def test_two(self): arena = _load(["AA", "CC"], True) self.assertEquals(arena.arena, "AACC") verify_popcount_indices(arena) arena = _load(["AA", "CC"], False) self.assertEquals(arena.arena, "AACC") arena = _load(["CC", "AA"], True) self.assertEquals(arena.arena, "AACC") verify_popcount_indices(arena) arena = _load(["CC", "AA"], False) self.assertEquals(arena.arena, "CCAA") def test_every_bit_unsorted(self): arena = _load([chr(i) for i in range(256)], False) self.assertEquals(arena.arena, "".join(chr(i) for i in range(256))) def test_every_bit_sorted(self): # This is a bit trickier since there's no guaranteed order # of the contents of the arena all_bytes = [chr(i) for i in range(256)] expected = sorted(byte_popcount(fp) for fp in all_bytes) arena = _load(all_bytes, True) popcounts = map(byte_popcount, arena.arena) self.assertEquals(popcounts, expected) verify_popcount_indices(arena) arena = _load(all_bytes[::-1], True) popcounts = map(byte_popcount, arena.arena) self.assertEquals(popcounts, expected) verify_popcount_indices(arena) def test_every_bit_sorted_tripled(self): # This is a bit trickier since there's no guaranteed order # of the contents of the arena all_bytes = [chr(i) for i in range(256)] all_bytes *= 3 expected = sorted(byte_popcount(fp) for fp in all_bytes) arena = _load(all_bytes, True) popcounts = map(byte_popcount, arena.arena) self.assertEquals(popcounts, expected) verify_popcount_indices(arena) arena = _load(all_bytes[::-1], True) popcounts = map(byte_popcount, arena.arena) self.assertEquals(popcounts, expected) verify_popcount_indices(arena) def test_all_ones(self): arena = _load([chr(255), chr(255), chr(255), chr(255)], True) self.assertEquals(arena.arena, "\xff\xff\xff\xff") verify_popcount_indices(arena) if __name__ == "__main__": unittest2.main() 
chemfp-1.1p1/tests/test_sdf2fps.py0000644000077000000240000002776112055226641017417 0ustar dalkestaff00000000000000import sys import unittest2 from cStringIO import StringIO as SIO from chemfp.commandline import sdf2fps import support real_stdin = sys.stdin real_stdout = sys.stdout real_stderr = sys.stderr DECODER_SDF = support.fullpath("decoder.sdf") def run(s): args = s.split() try: sys.stdin = open(DECODER_SDF) sys.stdout = stdout = SIO() sdf2fps.main(args) finally: sys.stdout = real_stdout sys.stdin = real_stdin result = stdout.getvalue().splitlines() if result: assert result[0] == "#FPS1" return result def run_failure(s): sys.stderr = stderr = SIO() try: try: run(s) except SystemExit: pass else: raise AssertionError("should have exited: %r" % (s,)) finally: sys.stderr = real_stderr return stderr.getvalue() def run_warning(s): sys.stderr = stderr = SIO() try: try: run(s) finally: sys.stderr = real_stderr except SystemExit, e: raise AssertionError("unexpected SystemExit: %r and %r" % (e, stderr.getvalue())) return stderr.getvalue() def run_fps(s, expect_length=None): result = run(s) while result[0].startswith("#"): del result[0] if expect_length is not None: assert len(result) == expect_length return result import support _runner = support.Runner(sdf2fps.main) run_exit = _runner.run_exit class TestDecoderFlags(unittest2.TestCase): def test_cactvs(self): result = run_fps("--cactvs --fp-tag PUBCHEM_CACTVS_SUBSKEYS") self.assertEquals(result, ["07de8d002000000000000000000000000080060000000c000000000000000080030000f8401800000030508379344c014956000055c0a44e2a0049200084e140581f041d661b10064483cb0f2925100619001393e10001007000000000008000000000000000400000000000000000\t9425004", "07de0d000000000000000000000000000080460300000c0000000000000000800f0000780038000000301083f920cc09695e0800d5c0e44e6e00492190844145dc1f841d261911164d039b8f29251026b9401313e0ec01007000000000000000000000000000000000000000000000\t9425009"]) def test_binary40(self): result = run_fps("--binary --fp-tag 
binary40", 2) self.assertEquals(result[0], "000500c000\t9425004") self.assertEquals(result[1], "00fab75300\t9425009") def test_binary_msb40(self): result = run_fps("--binary-msb --fp-tag binary40", 2) self.assertEquals(result[0], "000300a000\t9425004") self.assertEquals(result[1], "00caed5f00\t9425009") def test_binary3(self): result = run_fps("--binary --fp-tag binary3", 2) self.assertEquals(result[0], "04\t9425004") self.assertEquals(result[1], "03\t9425009") def test_binary_msb3(self): result = run_fps("--binary-msb --fp-tag binary3", 2) self.assertEquals(result[0], "01\t9425004") self.assertEquals(result[1], "06\t9425009") def test_binary8(self): result = run_fps("--binary --fp-tag binary8", 2) self.assertEquals(result[0], "76\t9425004") self.assertEquals(result[1], "bc\t9425009") def test_binary_msb8(self): result = run_fps("--binary-msb --fp-tag binary8", 2) self.assertEquals(result[0], "6e\t9425004") self.assertEquals(result[1], "3d\t9425009") def test_binary17(self): result = run_fps("--binary --fp-tag binary17", 2) self.assertEquals(result[0], "38b701\t9425004") self.assertEquals(result[1], "489d01\t9425009") def test_binary_msb17(self): result = run_fps("--binary-msb --fp-tag binary17", 2) self.assertEquals(result[0], "db3900\t9425004") self.assertEquals(result[1], "732500\t9425009") def test_binary_failure(self): errmsg = run_exit("--binary --fp-tag PUBCHEM_CACTVS_SUBSKEYS", DECODER_SDF) self.assertIn("Could not binary decode 'PUBCHEM_CACTVS_SUBSKEYS' value 'AAADceB7sQ", errmsg) self.assertIn("Not a binary string at line 1 of", errmsg) self.assertIn("decoder.sdf", errmsg) def test_binary_msb_failure(self): errmsg = run_exit("--binary-msb --fp-tag PUBCHEM_CACTVS_SUBSKEYS", DECODER_SDF) self.assertIn("Could not binary_msb decode 'PUBCHEM_CACTVS_SUBSKEYS' value 'AAADceB7sQ", errmsg) self.assertIn("Not a binary string at line 1 of", errmsg) self.assertIn("decoder.sdf", errmsg) def test_hex2(self): result = run_fps("--hex --fp-tag hex2", 2) 
self.assertEquals(result[0], "ab\t9425004") self.assertEquals(result[1], "01\t9425009") def test_hex_lsb2(self): # 0xab == 0b10101011 # 10101011 with LSB first is 5 d => "d5" # 0x01 == 0b00000001 => 80 when in LSB first result = run_fps("--hex-lsb --fp-tag hex2", 2) self.assertEquals(result[0], "d5\t9425004") self.assertEquals(result[1], "80\t9425009") def test_hex_msb2(self): # With 2 nibbles the result is the same as hex result = run_fps("--hex-msb --fp-tag hex2", 2) self.assertEquals(result[0], "ab\t9425004") self.assertEquals(result[1], "01\t9425009") def test_hex16(self): result = run_fps("--hex --fp-tag hex16", 2) self.assertEquals(result[0], "0123456789abcdef\t9425004") self.assertEquals(result[1], "abcdef0123456789\t9425009") def test_hex_lsb16(self): result = run_fps("--hex-lsb --fp-tag hex16", 2) # 0123456789abcdef in LSB form => # 084c2a6e195d3b7f when nibbles bits are in MSB form but nibbles are LSB # 80 c4 a2 e6 91 d5 b3 f7 when byte bits are in MSB and bytes are LSB self.assertEquals(result[0], "80c4a2e691d5b3f7\t9425004") # abcdef0123456789 in LSB form => # 5d3b7f084c2a6e19 => # d5 b3 f7 80 c4 a2 e6 91 self.assertEquals(result[1], "d5b3f780c4a2e691\t9425009") def test_hex_msb16(self): # Just a bit of reordering result = run_fps("--hex-msb --fp-tag hex16", 2) self.assertEquals(result[0], "efcdab8967452301\t9425004") self.assertEquals(result[1], "8967452301efcdab\t9425009") def test_base64_16(self): result = run_fps("--base64 --fp-tag base64_16", 2) self.assertEquals(result[0], "Greetings, human".encode("hex") + "\t9425004") self.assertEquals(result[1], "blahblahspamblah".encode("hex") + "\t9425009") def test_daylight1(self): result = run_fps("--daylight --fp-tag daylight1", 2) self.assertEquals(result[0], "PyDaylight".encode("hex") + "\t9425004") self.assertEquals(result[1], "chemfptest".encode("hex") + "\t9425009") def test_daylight2(self): result = run_fps("--daylight --fp-tag daylight2", 2) self.assertEquals(result[0], "Okie dokie 
pokie!".encode("hex") + "\t9425004") self.assertEquals(result[1], "Testing 1, 2, 3".encode("hex") + "\t9425009") def test_daylight3(self): result = run_fps("--daylight --fp-tag daylight3", 2) self.assertEquals(result[0], "\t9425004") self.assertEquals(result[1], "\t9425009") def test_daylight_end_error(self): errmsg = run_exit("--daylight --fp-tag daylight-end-illegal", DECODER_SDF) self.assertIn("Could not daylight decode 'daylight-end-illegal' value '1P!_P'", errmsg) self.assertIn("Last character of encoding must be 1, 2, or 3, not 'P' at line 1", errmsg) self.assertIn("decoder.sdf", errmsg) def test_daylight_symbol_error(self): errmsg = run_exit("--daylight --fp-tag daylight-illegal", DECODER_SDF) self.assertIn("Could not daylight decode 'daylight-illegal' value '1P!_3'", errmsg) self.assertIn("Unknown encoding symbol at line 1", errmsg) self.assertIn("decoder.sdf", errmsg) def test_daylight_length_error(self): errmsg = run_exit("--daylight --fp-tag PUBCHEM_CACTVS_SUBSKEYS", DECODER_SDF) self.assertIn("Could not daylight decode 'PUBCHEM_CACTVS_SUBSKEYS' value 'AAADceB7sQ", errmsg) self.assertIn("Daylight binary encoding is of the wrong length at line 1 of", errmsg) self.assertIn("decoder.sdf", errmsg) def test_bad_decoding(self): msg = run_warning("--base64 --fp-tag binary17 --errors report") self.assertIn("Could not base64 decode 'binary17' value", msg) self.assertIn("Skipping.", msg) class TestBitSizes(unittest2.TestCase): def test_exact_fingerprint_bits(self): result = run("--binary --fp-tag binary3") self.assertIn("#num_bits=3", result) def test_user_bits_match_fingerprint_bits(self): result = run("--binary --fp-tag binary3 --num-bits 3") self.assertIn("#num_bits=3", result) self.assertIn("04\t9425004", result) self.assertIn("03\t9425009", result) def test_user_bits_disagree_with_fingerprint_bits(self): errmsg = run_failure("--binary --fp-tag binary3 --num-bits 2") self.assertIn("has 3 bits", errmsg) self.assertIn(" 2", errmsg) def 
test_implied_from_fingerprint_bytes(self): result = run("--hex --fp-tag hex2") self.assertIn("#num_bits=8", result) def test_user_bits_matches_fingerprint_bytes(self): result = run("--hex --fp-tag hex2 --num-bits 8") self.assertIn("#num_bits=8", result) def test_user_bits_too_large_for_bytes(self): result = run_failure("--hex --fp-tag hex2 --num-bits 9") self.assertIn("1 <= num-bits <= 8, not 9", result) def test_user_bits_acceptably_smaller_than_bytes(self): result = run("--hex --fp-tag hex2 --num-bits 6") self.assertIn("#num_bits=6", result) def test_user_bits_too_much_smaller_than_bytes(self): result = run_failure("--hex --fp-tag hex16 --num-bits 56") self.assertIn("57 <= num-bits <= 64, not 56", result) class TestTitleProcessing(unittest2.TestCase): def test_title_from_title_tag(self): result = run("--hex --fp-tag hex2 --id-tag binary3") self.assertIn("ab\t001", result) def test_missing_title_from_title_line(self): warning = run_warning("--hex --fp-tag hex2 --id-tag FAKE_TITLE --errors report") self.assertIn("Missing id tag 'FAKE_TITLE' in the record starting at line 160 of ", warning) self.assertIn("decoder.sdf", warning) self.assertIn("title='9425009'", warning) self.assertIn("Skipping.", warning) def test_missing_all_titles(self): warning = run_warning("--hex --fp-tag hex2 --id-tag DOES_NOT_EXIST --errors report") self.assertIn("Missing id tag 'DOES_NOT_EXIST'", warning) self.assertIn("line 1 of", warning) self.assertIn("line 160 of", warning) class TestShortcuts(unittest2.TestCase): def test_pubchem(self): result = run("--pubchem") self.assertIn("#num_bits=881", result) self.assertIn("#software=CACTVS/unknown", result) self.assertIn("#type=CACTVS-E_SCREEN/1.0 extended=2", result) self.assertIn("07de8d002000000000000000000000000080060000000c000000000000000080030000f8401800000030508379344c014956000055c0a44e2a0049200084e140581f041d661b10064483cb0f2925100619001393e10001007000000000008000000000000000400000000000000000\t9425004", result) 
self.assertIn("07de0d000000000000000000000000000080460300000c0000000000000000800f0000780038000000301083f920cc09695e0800d5c0e44e6e00492190844145dc1f841d261911164d039b8f29251026b9401313e0ec01007000000000000000000000000000000000000000000000\t9425009", result) class TestBadArgs(unittest2.TestCase): def test_missing_fp_tag(self): msg = run_failure("") self.assertIn("argument --fp-tag is required", msg) def test_num_bits_positive(self): msg = run_failure("--fp-tag SPAM --num-bits 0") self.assertIn("--num-bits must be a positive integer", msg) msg = run_failure("--fp-tag SPAM --num-bits -1") self.assertIn("--num-bits must be a positive integer", msg) def test_bad_char(self): msg = run_failure("--fp-tag SPAM --software this\bthat") self.assertIn("--software", msg) self.assertIn("'\\x08'", msg) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_sdf_reader.py0000644000077000000240000002714411660452125020140 0ustar dalkestaff00000000000000from __future__ import with_statement import sys import unittest2 from cStringIO import StringIO as SIO import support # Make sure the module correctly implements __all__ before = after = None before = set(globals()) from chemfp.sdf_reader import * after = set(globals()) assert len(after - before) == 4, ("wrong import * count", after-before) # Needed for access to the experimental FileLocation from chemfp import sdf_reader TRYPTOPHAN_SDF = support.fullpath("tryptophan.sdf") PUBCHEM_SDF = support.fullpath("pubchem.sdf") PUBCHEM_SDF_GZ = support.fullpath("pubchem.sdf.gz") STRANGE_SDF = support.fullpath("strange.sdf") expected_identifiers = ["9425004", "9425009", "9425012", "9425015", "9425018", "9425021", "9425030", "9425031", "9425032", "9425033", "9425034", "9425035", "9425036", "9425037", "9425040", "9425041", "9425042", "9425045", "9425046"] expected_linenos = [1, 191, 401, 592, 817, 1027, 1236, 1457, 1678, 1872, 2086, 2288, 2493, 2700, 2894, 3103, 3305, 3507, 3722] assert len(expected_identifiers) == len(expected_linenos) 
expected_locs = [dict(title=title, lineno=lineno) for (title, lineno) in zip(expected_identifiers, expected_linenos)] class TestReadRecords(unittest2.TestCase): def test_reads_the_only_record(self): n = sum(1 for x in open_sdf(TRYPTOPHAN_SDF)) self.assertEquals(n, 1) def test_reads_all_records(self): n = sum(1 for x in open_sdf(PUBCHEM_SDF)) self.assertEquals(n, 19) def test_reads_all_compressed_records(self): n = sum(1 for x in open_sdf(PUBCHEM_SDF_GZ)) self.assertEquals(n, 19) def test_reads_from_stdin(self): old_stdin = sys.stdin sys.stdin = open(PUBCHEM_SDF, "rb") try: n = sum(1 for x in open_sdf()) finally: sys.stdin = old_stdin self.assertEquals(n, 19) def test_reads_from_gzip_stdin(self): old_stdin = sys.stdin sys.stdin = open(PUBCHEM_SDF_GZ, "rb") try: n = sum(1 for x in open_sdf(None, "gzip")) finally: sys.stdin = old_stdin self.assertEquals(n, 19) def test_reads_from_fileobj(self): f = open(PUBCHEM_SDF, "rU") n = sum(1 for x in open_sdf(f)) self.assertEquals(n, 19) def test_reads_from_uncompressed_fileobj(self): f = open(PUBCHEM_SDF, "rU") n = sum(1 for x in open_sdf(f, "none")) self.assertEquals(n, 19) def test_reads_from_gzip_fileobj(self): f = open(PUBCHEM_SDF_GZ, "rb") n = sum(1 for x in open_sdf(f, "gzip")) self.assertEquals(n, 19) def test_handles_loc(self): loc = sdf_reader.FileLocation() results = [] for x in open_sdf(PUBCHEM_SDF_GZ, location=loc): if sys.version_info[:3] > (2, 5, 4): # Earlier versions of the gzip library didn't # keep track of the .name attribute self.assertEquals(loc.name, PUBCHEM_SDF_GZ) else: self.assertEquals(getattr(loc, "name"), None) results.append(dict(title=loc.title, lineno=loc.lineno)) self.assertEquals(results, expected_locs) def test_when_using_wrong_compression(self): try: n = sum(1 for x in open_sdf(PUBCHEM_SDF, "gzip")) raise AssertionError("parsed a gzip'ed file?") except IOError: pass tryptophan = open(TRYPTOPHAN_SDF).read() class ReadReturnsSmallAmounts(object): def __init__(self): self.f = open(PUBCHEM_SDF, 
"rU") def read(self, n): return self.f.read(1) class ReadReturnsOneRecord(object): def __init__(self): self.f = open_sdf(PUBCHEM_SDF) def read(self, n): try: rec = self.f.next() except StopIteration: return "" assert rec.endswith("\n$$$$\n") rec = rec[:-6] + ">spam\n" + ("X" * (n-len(rec)-6)) + "\n$$$$\n" assert len(rec) == n return rec class ReadReturnsTwoRecords(object): def __init__(self): self.f = open_sdf(PUBCHEM_SDF) def read(self, n): try: rec = self.f.next() except StopIteration: return "" try: rec2 = self.f.next() except StopIteration: return rec rec = rec + rec2 assert rec.endswith("\n$$$$\n") rec = rec[:-6] + ">spam\n" + ("X" * (n-len(rec)-6)) + "\n$$$$\n" assert len(rec) == n return rec class TestBoundaryConditions(unittest2.TestCase): def test_missing_terminal_newline(self): f = SIO(tryptophan.rstrip("\n")) n = sum(1 for x in open_sdf(f)) self.assertEquals(n, 1) def test_small_amounts(self): # the O(n**2) behavior really hits hard here - this takes a full second to work n = sum(1 for x in open_sdf(ReadReturnsSmallAmounts())) self.assertEquals(n, 19) def test_exact_record_boundary_reads(self): loc = sdf_reader.FileLocation() titles = [loc.title for x in open_sdf(ReadReturnsOneRecord(), location=loc)] self.assertEquals(titles, expected_identifiers) def test_two_record_boundary_reads(self): loc = sdf_reader.FileLocation() titles = [loc.title for x in open_sdf(ReadReturnsTwoRecords(), location=loc)] self.assertEquals(titles, expected_identifiers) class TestReadErrors(unittest2.TestCase): def test_wrong_format(self): f = SIO("Spam\n") try: for x in open_sdf(f): raise AssertionError("Bad parse") except sdf_reader.SDFParseError, err: self.assertEquals("Could not find a valid SD record" in str(err), True) self.assertEquals("line 1" in str(err), True, str(err)) def test_record_too_large(self): f = SIO( (tryptophan * ((2000000 // len(tryptophan)) + 1)).replace("$$$$", "1234")) try: for x in open_sdf(f): raise AssertionError("should not be able to read the first 
record") except sdf_reader.SDFParseError, err: self.assertIn("too large", str(err)) self.assertIn("at line 1", str(err)) def test_has_extra_data(self): f = SIO(tryptophan + tryptophan + "blah") try: for i, x in enumerate(open_sdf(f)): if i > 1: raise AssertionError("bad record count") except sdf_reader.SDFParseError, err: self.assertIn("unexpected content", str(err)) expected_lineno = (tryptophan.count("\n")*2) + 1 expected_lineno_msg = "at line %d" % expected_lineno self.assertIn(expected_lineno_msg, str(err)) def test_bad_format(self): f = SIO(tryptophan + tryptophan.replace("V2000", "V4000")) try: for i, x in enumerate(open_sdf(f)): if i > 0: raise AssertionError("bad record count") except sdf_reader.SDFParseError, err: self.assertIn("incorrectly formatted record", str(err)) self.assertIn("at line 70", str(err)) def test_my_error_handler(self): class CaptureErrors(object): def __init__(self): self.errors = [] def __call__(self, msg, loc): self.errors.append( (msg, loc.info()) ) my_error_handler = CaptureErrors() loc = sdf_reader.FileLocation() f = SIO(tryptophan + tryptophan.replace("V2000", "V4000") + tryptophan) titles = [loc.lineno for rec in open_sdf(f, location=loc, errors=my_error_handler)] self.assertEquals(titles, [1, 137]) self.assertEquals(my_error_handler.errors, [("incorrectly formatted record", {"name": None, "lineno": 70, "title": "tryptophan.pdb"})]) expected_hbond_donors = ["2","2","2","2","2","2","4","4","2", "2", "2","2","2","3","2","3","3","2","2"] expected_complexity = ["491", "513", "419", "597", "545", "590", "660", "660", "394", "544", "458", "589", "532", "506", "640", "557", "557", "520", "437"] expected_xlogp = ["2.8", "1.9", "1", "3.3", "1.5", "2.6", None, "-0.9", "2", "2.1", "2.9", "1.7", "-1.5", "0.4", "0.6", "0.4", "0.4", "2", "2.5"] assert len(expected_complexity) == len(expected_hbond_donors) == len(expected_linenos) assert len(expected_xlogp) == len(expected_linenos) class TestIterTwoTags(unittest2.TestCase): def 
test_read_two_existing_tags(self): fields = list(iter_two_tags(open_sdf(PUBCHEM_SDF), "PUBCHEM_CACTVS_HBOND_DONOR", "PUBCHEM_CACTVS_COMPLEXITY")) self.assertEquals(fields, zip(expected_hbond_donors, expected_complexity)) def test_read_tag_missing_data_field1(self): fields = list(iter_two_tags(open_sdf(PUBCHEM_SDF), "PUBCHEM_CACTVS_XLOGP", "PUBCHEM_CACTVS_HBOND_DONOR")) self.assertEquals(fields, zip(expected_xlogp, expected_hbond_donors)) def test_read_tag_missing_data_field2(self): fields = list(iter_two_tags(open_sdf(PUBCHEM_SDF), "PUBCHEM_CACTVS_HBOND_DONOR", "PUBCHEM_CACTVS_XLOGP")) self.assertEquals(fields, zip(expected_hbond_donors, expected_xlogp)) def test_edge_conditions1(self): fields = list(iter_two_tags(open_sdf(STRANGE_SDF), "noblank", "twolines")) self.assertEquals(fields, [("This line is not followed by a blank line", "This contains two lines"), (None, None)]) def test_edge_conditions2(self): fields = list(iter_two_tags(open_sdf(STRANGE_SDF), "duplicate", "embedded-tags")) self.assertEquals(fields, [("This is the first version.", "I have tags in the data line "), (None, None)]) def test_edge_conditions3(self): fields = list(iter_two_tags(open_sdf(STRANGE_SDF), "junk", "blank lines")) self.assertEquals(fields, [ ("This line contains some of the strange junk that might exist on the tag line", ""), (None, None)]) def test_edge_conditions4(self): fields = list(iter_two_tags(open_sdf(STRANGE_SDF), "nada", "fini")) self.assertEquals(fields, [(None, None), ("", "")]) def test_bad_tags(self): for tag in ("<", ">", "\n", "\t", "1<2", "2<1", "blah\t"): self.assertRaises(TypeError, iter_two_tags([], tag, "fini")) self.assertRaises(TypeError, iter_two_tags([], "fini", tag)) class TestReadTitleAndTag(unittest2.TestCase): def test_read_existing_tag(self): fields = list(iter_title_and_tag(open_sdf(PUBCHEM_SDF), "PUBCHEM_CACTVS_HBOND_DONOR")) self.assertEquals(fields, zip(expected_identifiers, expected_hbond_donors)) def test_missing_tag(self): fields = 
list(iter_title_and_tag(open_sdf(PUBCHEM_SDF), "PUBCHEM_CACTVS_XLOGP")) self.assertEquals(fields, zip(expected_identifiers, expected_xlogp)) def test_bad_tags(self): for tag in ("<", ">", "\n", "\t", "1<2", "2<1", "blah\t"): self.assertRaises(TypeError, iter_title_and_tag([], tag)) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_search_results.py0000644000077000000240000011342012101627403021053 0ustar dalkestaff00000000000000from __future__ import with_statement import math import unittest2 import random import re import array from chemfp.search import SearchResults from chemfp.fps_search import FPSSearchResults, FPSSearchResult try: next except NameError: # Compatibility with Python 2.5 def next(it): return it.next() if hasattr(math, "isnan"): isnan = math.isnan else: # math.isnan was added in Python 2.6 # NaN is a number which isn't equal to itself def isnan(x): return x != x random_scores = [ 0.676, 0.384, 0.740, 0.970, 0.148, 0.361, 0.621, 0.715, 0.698, 0.009, 0.667, 0.760, 0.743, 0.807, 0.772, 0.074, 0.622, 0.218, 0.594, 0.247, 0.680, 0.214, 0.721, 0.590, 0.433, 0.725, 0.917, 0.401, 0.818, 0.381, 0.039, 0.214, 0.133, 0.014, 0.072, 0.254, 0.515, 0.965, 0.145, 0.548, 0.468, 0.205, 0.631, 0.132, 0.710, 0.367, 0.313, 0.866, 0.611, 0.640, 0.727, 0.910, 0.057, 0.619, 0.160, 0.390, 0.868, 0.101, 0.525, 0.689, 0.945, 0.473, 0.448, 0.705, 0.399, 0.731, 0.214, 0.575, 0.721, 0.867, 0.514, 0.801, 0.415, 0.742, 0.628, 0.686, 0.117, 0.016, 0.411, 0.336, 0.447, 0.774, 0.028, 0.283, 0.937, 0.341, 0.348, 0.404, 0.956, 0.391, 0.822, 0.976, 0.162, 0.422, 0.260, 0.688, 0.596, 0.298, 0.927, 0.412, 0.979, 0.180, 0.258, 0.779, 0.893, 0.367, 0.219, 0.658, 0.084, 0.966, 0.264, 0.024, 0.795, 0.703, 0.092, 0.007, 0.463, 0.028, 0.567, 0.815, 0.403, 0.084, 0.760, 0.738, 0.125, 0.067, 0.200, 0.044, 0.307, 0.696, 0.314, 0.244, 0.420, 0.135, 0.741, 0.770, 0.047, 0.678, 0.186, 0.704, 0.732, 0.796, 0.017, 0.316, 0.377, 0.256, 0.866, 0.964, 0.651, 0.046, 0.073, 0.787, 0.043, 
0.115, 0.269, 0.171, 0.374, 0.572, 0.448, 0.733, 0.845, 0.366, 0.777, 0.057, 0.009, 0.145, 0.755, 0.939, 0.417, 0.813, 0.658, 0.985, 0.866, 0.171, 0.425, 0.849, 0.316, 0.981, 0.689, 0.665, 0.872, 0.342, 0.326, 0.057, 0.755, 0.903, 0.111, 0.618, 0.980, 0.313, 0.829, 0.566, 0.876, 0.456, 0.954, 0.122, 0.598, 0.616, 0.735, 0.319, 0.482, 0.680, 0.437, 0.025, 0.281, 0.688, 0.859, 0.472, 0.321, 0.048, 0.601, 0.654, 0.991, 0.918, 0.754, 0.832, 0.352, 0.036, 0.184, 0.089, 0.534, 0.875, 0.651, 0.482, 0.135, 0.958, 0.805, 0.999, 0.655, 0.373, 0.092, 0.906, 0.919, 0.464, 0.588, 0.752, 0.268, 0.907, 0.936, 0.215, 0.551, 0.848, 0.324, 0.533, 0.787, 0.369, 0.695, 0.550, 0.594, 0.775, 0.731, 0.774, 0.881, 0.445, 0.034, 0.946, 0.979, 0.051, 0.494, 0.247, 0.475, 0.650, 0.452, 0.482, 0.691, 0.018, 0.191, 0.773, 0.859, 0.240, 0.782, 0.321, 0.613, 0.507, 0.453, 0.438, 0.461, 0.707, 0.973, 0.621, 0.882, 0.216, 0.232, 0.832, 0.456, 0.383, 0.507, 0.100, 0.422, 0.807, 0.008, 0.525, 0.962, 0.912, 0.329, 0.929, 0.041, 0.107, 0.897, 0.810, 0.666, 0.560, 0.593, 0.934, 0.223, 0.460, 0.447, 0.555, 0.649, 0.210, 0.297, 0.056, 0.854, 0.007, 0.024, 0.914, 0.547, 0.620, 0.981, 0.369, 0.877, 0.320, 0.564, 0.767, 0.672, 0.731, 0.386, 0.136, 0.569, 0.870, 0.950, 0.905, 0.504, 0.130, 0.477, 0.850, 0.823, 0.765, 0.378, 0.065, 0.694, 0.733, 0.283, 0.391, 0.084, 0.123, 0.362, 0.720, 0.327, 0.969, 0.077, 0.179, 0.732, 0.747, 0.621, 0.784, 0.116, 0.570, 0.964, 0.064, 0.138, 0.696, 0.596, 0.308, 0.256, 0.491, 0.678, 0.775, 0.958, 0.202, 0.342, 0.426, 0.946, 0.969, 0.236, 0.435, 0.231, 0.345, 0.242, 0.096, 0.146, 0.142, 0.723, 0.035, 0.449, 0.208, 0.123, 0.596, 0.499, 0.797, 0.146, 0.115, 0.547, 0.678, 0.989, 0.200, 0.042, 0.971, 0.732, 0.775] class TestCase(unittest2.TestCase): def assertListEquals(self, left, right, msg=None): self.assertEquals(list(left), right, msg) class CreateSearchResults(object): def _create(self, i, values): results = SearchResults(i) for value in values: results._add_hit(*value) 
return results class CreateFPSSearchResults(object): def _create(self, i, values): all_ids = [[] for x in range(i)] all_scores = [[] for x in range(i)] results = FPSSearchResults(i) for (row, id, score) in values: all_ids[row].append(id) all_scores[row].append(score) return FPSSearchResults([FPSSearchResult(rows, scores) for (rows, scores) in zip(all_ids, all_scores)]) class TestBasicAPI(object): def test_len(self): for i in (0, 1, 4, 5): results = SearchResults(i) self.assertEquals(len(results), i) def test_row_len(self): results = self._create(5, [ (0, 1, 0.1), (1, 2, 0.2), (1, 3, 0.25), (2, 1, 0.15), (2, 5, 0.7), (2, 6, 0.8), (3, 8, 0.9)]) self.assertEquals(len(results[0]), 1) self.assertEquals(len(results[1]), 2) self.assertEquals(len(results[2]), 3) self.assertEquals(len(results[3]), 1) self.assertEquals(len(results[4]), 0) self.assertEquals(len(results[-5]), 1) self.assertEquals(len(results[-4]), 2) self.assertEquals(len(results[-3]), 3) self.assertEquals(len(results[-2]), 1) self.assertEquals(len(results[-1]), 0) def test_negative_index(self): results = self._create(3, [ (0, 1, 0.0), (1, 12, 1.0)]) self.assertListEquals(results[1], [(12, 1.0)]) self.assertListEquals(results[-2], [(12, 1.0)]) self.assertListEquals(results[-3], [(1, 0.0)]) def test_clear(self): results = self._create(3, [ (0, 1, 0.0), (1, 12, 1.0)]) self.assertTrue(results[0]) self.assertTrue(results[1]) results.clear_all() self.assertFalse(results[0]) self.assertListEquals(results[0], []) self.assertFalse(results[1]) def test_clear_row(self): results = self._create(3, [ (0, 1, 0.0), (1, 12, 1.0)]) self.assertListEquals(results[1], [(12, 1.0)]) results[1].clear() self.assertListEquals(results[0], [(1, 0.0)]) self.assertTrue(results[0]) self.assertEquals(len(results[0]), 1) self.assertFalse(results[1]) results[0].clear() self.assertFalse(results[0]) self.assertEquals(len(results[0]), 0) self.assertFalse(results[1]) def test_clear_negative_row(self): results = self._create(3, [ (0, 1, 0.0), (1, 
12, 1.0)]) self.assertListEquals(results[1], [(12, 1.0)]) results[-2].clear() self.assertListEquals(results[0], [(1, 0.0)]) self.assertListEquals(results[1], []) results[-3].clear() self.assertListEquals(results[0], []) self.assertListEquals(results[1], []) def test_unknown_ordering(self): results = self._create(1, [(0, 1, 0.9)]) with self.assertRaisesRegexp(ValueError, "Unknown sort order"): results.reorder_all("blah") class TestArenaBasicAPI(TestCase, CreateSearchResults, TestBasicAPI): pass class TestFPSBasicAPI(TestCase, CreateFPSSearchResults, TestBasicAPI): pass class TestIterAPI(TestCase): def setUp(self): results = SearchResults(2) results.target_ids = map(str, range(40)) all_scores = [random_scores[:10], random_scores[30:35]] # First row has columns 0, 1, 2, ..., 9 # second row has columns 0, 2, 4, 6, 8 for row, scores in enumerate(all_scores): for i, score in enumerate(scores): results._add_hit(row, i*(row+1), score) self.results = results def assertRealListEquals(self, left, right): # Make sure it's really a list - the ListEquals allows iterators self.assertTrue(isinstance(left, list) or isinstance(left, array.array)) self.assertListEquals(left, right) def test_iter_indices(self): results = self.results it = results.iter_indices() self.assertRealListEquals(next(it), range(10)) self.assertRealListEquals(next(it), [0, 2, 4, 6, 8]) with self.assertRaisesRegexp(StopIteration, ""): next(it) def test_iter_ids(self): results = self.results it = results.iter_ids() self.assertRealListEquals(next(it), map(str, range(10))) self.assertRealListEquals(next(it), map(str, [0, 2, 4, 6, 8])) with self.assertRaisesRegexp(StopIteration, ""): next(it) def test_iter_scores(self): results = self.results it = results.iter_scores() self.assertRealListEquals(next(it), random_scores[:10]) self.assertRealListEquals(next(it), random_scores[30:35]) with self.assertRaisesRegexp(StopIteration, ""): next(it) def test_iter_indices_and_scores(self): results = self.results it = 
results.iter_indices_and_scores() self.assertRealListEquals(next(it), zip(range(10), random_scores[:10])) self.assertRealListEquals(next(it), zip([0, 2, 4, 6, 8], random_scores[30:35])) with self.assertRaisesRegexp(StopIteration, ""): next(it) def test_iter_ids_and_scores(self): results = self.results it = results.iter_ids_and_scores() self.assertRealListEquals(next(it), zip(map(str, range(10)), random_scores[:10])) self.assertRealListEquals(next(it), zip(map(str, [0, 2, 4, 6, 8]), random_scores[30:35])) with self.assertRaisesRegexp(StopIteration, ""): next(it) class TestErrors(TestCase): def test_bad_order(self): results = SearchResults(5) with self.assertRaisesRegexp(ValueError, "Unknown sort order"): results.reorder_all("xyzzy") def test_bad_row_order(self): results = SearchResults(5) with self.assertRaisesRegexp(ValueError, "Unknown sort order"): results[0].reorder("xyzzy") def test_index_out_of_range(self): results = SearchResults(5) with self.assertRaisesRegexp(IndexError, "row index is out of range"): results[5] with self.assertRaisesRegexp(IndexError, "row index is out of range"): results[98] def test_illegal_negative_index(self): results = SearchResults(3) with self.assertRaisesRegexp(IndexError, "row index is out of range"): results[-4] class TestGetHitInfo(TestCase): def setUp(self): results = SearchResults(4) results._add_hit(1, 1, 0.1) results._add_hit(2, 7, 1.0) results._add_hit(3, 3, 0.2) results._add_hit(3, 4, 0.5) results._add_hit(3, 5, 0.6) results._add_hit(3, 6, 0.7) self.results = results def test_index(self): self.assertListEquals(self.results[0], []) self.assertListEquals(self.results[1], [(1, 0.1)]) self.assertListEquals(self.results[3], [(3, 0.2), (4, 0.5), (5, 0.6), (6, 0.7)]) self.assertListEquals(self.results[-2], [(7, 1.0)]) def test_get_indices(self): self.assertListEquals(self.results[0].get_indices(), []) self.assertListEquals(self.results[1].get_indices(), [1]) self.assertListEquals(self.results[3].get_indices(), [3, 4, 5, 6]) 
self.assertListEquals(self.results[-2].get_indices(), [7]) def test_get_scores(self): self.assertListEquals(self.results[0].get_scores(), []) self.assertListEquals(self.results[1].get_scores(), [0.1]) self.assertListEquals(self.results[3].get_scores(), [0.2, 0.5, 0.6, 0.7]) self.assertListEquals(self.results[-2].get_scores(), [1.0]) def test_get_indices_and_scores(self): self.assertListEquals(self.results[0].get_indices_and_scores(), []) self.assertListEquals(self.results[1].get_indices_and_scores(), [(1, 0.1)]) self.assertListEquals(self.results[3].get_indices_and_scores(), [(3, 0.2), (4, 0.5), (5, 0.6), (6, 0.7)]) self.assertListEquals(self.results[-2].get_indices_and_scores(), [(7, 1.0)]) def test_get_ids_and_scores(self): self.results.target_ids = map(str, range(8)) self.assertListEquals(self.results[0].get_ids_and_scores(), []) self.assertListEquals(self.results[1].get_ids_and_scores(), [("1", 0.1)]) self.assertListEquals(self.results[3].get_ids_and_scores(), [("3", 0.2), ("4", 0.5), ("5", 0.6), ("6", 0.7)]) self.assertListEquals(self.results[-2].get_ids_and_scores(), [("7", 1.0)]) def test_get_ids_and_scores_missing_target_ids(self): with self.assertRaisesRegexp(TypeError, "target_ids are not available"): self.assertListEquals(self.results[-2].get_ids_and_scores(), [("7", 1.0)]) class TestGetHitInfoRow(TestCase): def setUp(self): results = SearchResults(4) results._add_hit(1, 1, 0.1) results._add_hit(2, 7, 1.0) results._add_hit(3, 3, 0.2) results._add_hit(3, 4, 0.5) results._add_hit(3, 5, 0.6) results._add_hit(3, 6, 0.7) self.results = results def test_get_indices(self): self.assertListEquals(self.results[0].get_indices(), []) self.assertListEquals(self.results[1].get_indices(), [1]) self.assertListEquals(self.results[3].get_indices(), [3, 4, 5, 6]) self.assertListEquals(self.results[-2].get_indices(), [7]) def test_get_scores(self): self.assertListEquals(self.results[0].get_scores(), []) self.assertListEquals(self.results[1].get_scores(), [0.1]) 
self.assertListEquals(self.results[3].get_scores(), [0.2, 0.5, 0.6, 0.7]) self.assertListEquals(self.results[-2].get_scores(), [1.0]) def test_get_indices_and_scores(self): self.assertListEquals(self.results[0].get_indices_and_scores(), []) self.assertListEquals(self.results[1].get_indices_and_scores(), [(1, 0.1)]) self.assertListEquals(self.results[3].get_indices_and_scores(), [(3, 0.2), (4, 0.5), (5, 0.6), (6, 0.7)]) self.assertListEquals(self.results[-2].get_indices_and_scores(), [(7, 1.0)]) def test_get_ids_and_scores(self): self.results.target_ids = map(str, range(8)) self.assertListEquals(self.results[0].get_ids_and_scores(), []) self.assertListEquals(self.results[1].get_ids_and_scores(), [("1", 0.1)]) self.assertListEquals(self.results[3].get_ids_and_scores(), [("3", 0.2), ("4", 0.5), ("5", 0.6), ("6", 0.7)]) self.assertListEquals(self.results[-2].get_ids_and_scores(), [("7", 1.0)]) def test_get_ids_and_scores_missing_target_ids(self): with self.assertRaisesRegexp(TypeError, "target_ids are not available"): self.assertListEquals(self.results[-2].get_ids_and_scores(), [("7", 1.0)]) _get_sort_key = { "increasing-score": lambda (index, score): (score, index), "decreasing-score": lambda (index, score): (-score, index), "increasing-index": lambda (index, score): index, "decreasing-index": lambda (index, score): -index, } class TestSortOrder(object): def test_size_0(self): results = self._create(5, []) results.reorder_all() self.assertListEquals(results[0], []) def test_size_1(self): results = self._create(5, [ (1, 5, 0.2),]) results.reorder_all() self.assertListEquals(results[1], [(5, 0.2)]) def test_size_2(self): results = self._create(5, [ (1, 5, 0.2), (1, 6, 0.4),]) self.assertListEquals(results[1], [(5, 0.2), (6, 0.4)]) results.reorder_all("increasing-score") self.assertListEquals(results[1], [(5, 0.2), (6, 0.4)]) results.reorder_all("decreasing-score") self.assertListEquals(results[1], [(6, 0.4), (5, 0.2)]) def test_default_ordering_2(self): results = 
self._create(5, [
            (1, 5, 0.2),
            (1, 6, 0.4),])
        results.reorder_all()
        self.assertListEquals(results[1], [(6, 0.4), (5, 0.2)])

    def test_size_3(self):
        results = self._create(5, [
            (1, 5, 0.2),
            (1, 6, 0.4),
            (1, 7, 0.2),])
        self.assertListEquals(results[1], [(5, 0.2), (6, 0.4), (7, 0.2)])
        results.reorder_all("increasing-score")
        self.assertListEquals(results[1], [(5, 0.2), (7, 0.2), (6, 0.4)])
        results.reorder_all("decreasing-score")
        self.assertListEquals(results[1], [(6, 0.4), (5, 0.2), (7, 0.2)])


class TestArenaTestSortOrder(TestCase, CreateSearchResults, TestSortOrder):
    def test_index_as_secondary_sort(self):
        # Timsort preserves input order. test_random_values uses
        # sequentially ordered indices, so it can't tell the difference
        # between input order and index order. Here I reverse the
        # order so I can really test tie-breaking.
        for name in ("increasing-score", "decreasing-score",
                     "increasing-index", "decreasing-index"):
            results = SearchResults(1)
            expected = []
            for i in range(300):
                score = random_scores[i]
                results._add_hit(0, 400-i, score)
                expected.append((400-i, score))
            results.reorder_all(name)
            expected.sort(key = _get_sort_key[name])
            self.assertListEquals(results[0], expected, "error in %s (300)" % (name,))

    def test_random_values(self):
        # The underlying timsort does merge sorts of 64-element
        # blocks. Hence some of the code is not exercised unless the
        # input is at least 128 elements long.
for size in (3, 5, 10, 20, 70, 100, 400): results = SearchResults(1) expected = [] for i in range(size): score = random_scores[i] expected.append((i, score)) results._add_hit(0, i, score) self.assertListEquals(results[0], expected) for name in ("increasing-score", "decreasing-score", "increasing-index", "decreasing-index"): results.reorder_all(name) expected.sort(key = _get_sort_key[name]) self.assertListEquals(results[0], expected, "error in %s:%d" % (name, size)) def test_regression_error_where_duplicate_indices_did_not_sort_correctly(self): # The id case doesn't happen in real code, since duplicate # indices are not possible. However, I suspect that the real # issue is with duplicate primary keys, so duplicate scores # might trigger the same problem. It's easiest to test with # indices. results = SearchResults(1) ids = range(5) * 2 for id in ids: results._add_hit(0, id, id/10.0) results.reorder_all("increasing-index") self.assertListEquals(results[0].get_indices(), sorted(ids)) class TestFPSSortOrder(TestCase, CreateFPSSearchResults, TestSortOrder): def test_order_by_id(self): results = self._create(2, [ (0, "one", 0.4), (0, "two", 0.5), (0, "three", 0.6), (1, "ett", 0.3), (1, "tvaa", 0.2), (1, "tre", 0.1),]) results.reorder_all("increasing-id") self.assertListEquals(results[0], [("one", 0.4), ("three", 0.6), ("two", 0.5)]) self.assertListEquals(results[1], [("ett", 0.3), ("tre", 0.1), ("tvaa", 0.2)]) results.reorder_all("decreasing-id") self.assertListEquals(results[0], [("one", 0.4), ("three", 0.6), ("two", 0.5)][::-1]) self.assertListEquals(results[1], [("ett", 0.3), ("tre", 0.1), ("tvaa", 0.2)][::-1]) class TestSortOrderRow(object): def test_size_2(self): results = self._create(5, [ (1, 5, 0.2), (1, 6, 0.4),]) self.assertListEquals(results[1], [(5, 0.2), (6, 0.4)]) for result in results: result.reorder("increasing-score") self.assertListEquals(results[1], [(5, 0.2), (6, 0.4)]) for result in results: result.reorder("decreasing-score") 
self.assertListEquals(results[1], [(6, 0.4), (5, 0.2)]) def test_default_ordering_2(self): results = self._create(5, [ (1, 5, 0.2), (1, 6, 0.4),]) for result in results: result.reorder() self.assertListEquals(results[1], [(6, 0.4), (5, 0.2)]) def test_size_3(self): results = self._create(5, [ (1, 5, 0.2), (1, 6, 0.4), (1, 7, 0.2),]) self.assertListEquals(results[1], [(5, 0.2), (6, 0.4), (7, 0.2)]) for result in results: result.reorder("increasing-score") self.assertListEquals(results[1], [(5, 0.2), (7, 0.2), (6, 0.4)]) for result in results: result.reorder("decreasing-score") self.assertListEquals(results[1], [(6, 0.4), (5, 0.2), (7, 0.2)]) def test_reorder_row(self): results = self._create(2, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 6, 0.1), (1, 7, 0.8), (1, 8, 0.6),]) self.assertListEquals(results[0], [(1, 0.1), (2, 0.8), (3, 0.6)]) self.assertListEquals(results[1], [(6, 0.1), (7, 0.8), (8, 0.6)]) results[1].reorder("increasing-score") self.assertListEquals(results[0], [(1, 0.1), (2, 0.8), (3, 0.6)]) self.assertListEquals(results[1], [(6, 0.1), (8, 0.6), (7, 0.8)]) results[0].reorder("decreasing-score") self.assertListEquals(results[0], [(2, 0.8), (3, 0.6), (1, 0.1)]) self.assertListEquals(results[1], [(6, 0.1), (8, 0.6), (7, 0.8)]) # Check that the default works results[0].reorder("increasing-score") # ensure the default only affects one row results[1].reorder() self.assertListEquals(results[0], [(1, 0.1), (3, 0.6), (2, 0.8)]) self.assertListEquals(results[1], [(7, 0.8), (8, 0.6), (6, 0.1)]) def test_sort_negative_row(self): results = self._create(2, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 6, 0.1), (1, 7, 0.8), (1, 8, 0.6),]) self.assertListEquals(results[0], [(1, 0.1), (2, 0.8), (3, 0.6)]) self.assertListEquals(results[1], [(6, 0.1), (7, 0.8), (8, 0.6)]) results[-1].reorder("increasing-score") self.assertListEquals(results[0], [(1, 0.1), (2, 0.8), (3, 0.6)]) self.assertListEquals(results[1], [(6, 0.1), (8, 0.6), (7, 0.8)]) 
results[-2].reorder("decreasing-score") self.assertListEquals(results[0], [(2, 0.8), (3, 0.6), (1, 0.1)]) self.assertListEquals(results[1], [(6, 0.1), (8, 0.6), (7, 0.8)]) results[1].reorder() # default is decreasing score self.assertListEquals(results[0], [(2, 0.8), (3, 0.6), (1, 0.1)]) self.assertListEquals(results[1], [(7, 0.8), (8, 0.6), (6, 0.1)]) class TestArenaTestSortOrderRow(TestCase, CreateSearchResults, TestSortOrderRow): pass class TestFPSSortOrderRow(TestCase, CreateFPSSearchResults, TestSortOrderRow): pass class TestMoveClosestFirst(object): def test_empty(self): results = self._create(2, []) results.reorder_all("move-closest-first") self.assertEquals(len(results), 2) self.assertEquals(len(results[0]), 0) self.assertEquals(len(results[1]), 0) def test_one(self): results = self._create(2, [ (0, 9, 0.1), (1, 8, 0.8)]) results.reorder_all("move-closest-first") self.assertListEquals(results[0], [(9, 0.1)]) self.assertListEquals(results[1], [(8, 0.8)]) def test_two(self): results = self._create(2, [ (0, 1, 0.1), (0, 2, 0.8), (1, 2, 0.8), (1, 3, 0.6),]) results.reorder_all("move-closest-first") self.assertListEquals(results[0], [(2, 0.8), (1, 0.1)]) self.assertListEquals(results[1], [(2, 0.8), (3, 0.6)]) def test_three(self): results = self._create(3, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 12, 0.8), (1, 22, 0.1), (1, 32, 0.6), (2, 12, 0.6), (2, 22, 0.1), (2, 32, 0.8),]) results.reorder_all("move-closest-first") self.assertListEquals(results[0], [(2, 0.8), (1, 0.1), (3, 0.6)]) self.assertListEquals(results[1], [(12, 0.8), (22, 0.1), (32, 0.6)]) self.assertListEquals(results[2], [(32, 0.8), (22, 0.1), (12, 0.6)]) def test_row(self): results = self._create(3, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 12, 0.8), (1, 22, 0.1), (1, 32, 0.6), (2, 12, 0.6), (2, 22, 0.1), (2, 32, 0.8),]) results[0].reorder("move-closest-first") self.assertListEquals(results[0], [(2, 0.8), (1, 0.1), (3, 0.6)]) self.assertListEquals(results[1], [(12, 0.8), (22, 0.1), (32, 
0.6)]) self.assertListEquals(results[2], [(12, 0.6), (22, 0.1), (32, 0.8)]) results[-1].reorder("move-closest-first") self.assertListEquals(results[0], [(2, 0.8), (1, 0.1), (3, 0.6)]) self.assertListEquals(results[1], [(12, 0.8), (22, 0.1), (32, 0.6)]) self.assertListEquals(results[2], [(32, 0.8), (22, 0.1), (12, 0.6)]) class TestArenaMoveClosestFirst(TestCase, CreateSearchResults, TestMoveClosestFirst): pass class TestFPSMoveClosestFirst(TestCase, CreateFPSSearchResults, TestMoveClosestFirst): pass class TestReverse(object): def test_empty(self): results = self._create(2, []) results.reorder_all("reverse") self.assertEquals(len(results), 2) self.assertEquals(len(results[0]), 0) self.assertEquals(len(results[1]), 0) def test_one(self): results = self._create(2, [ (0, 9, 0.1), (1, 8, 0.8)]) results.reorder_all("reverse") self.assertListEquals(results[0], [(9, 0.1)]) self.assertListEquals(results[1], [(8, 0.8)]) def test_two(self): results = self._create(2, [ (0, 1, 0.1), (0, 2, 0.8), (1, 2, 0.8), (1, 3, 0.6)]) results.reorder_all("reverse") self.assertListEquals(results[0], [(2, 0.8), (1, 0.1)]) self.assertListEquals(results[1], [(3, 0.6), (2, 0.8)]) def test_three(self): results = self._create(3, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 12, 0.8), (1, 22, 0.1), (1, 32, 0.6), (2, 12, 0.6), (2, 32, 0.1), (2, 22, 0.8),]) results.reorder_all("reverse") self.assertListEquals(results[0], [(3, 0.6), (2, 0.8), (1, 0.1)]) self.assertListEquals(results[1], [(32, 0.6), (22, 0.1), (12, 0.8)]) self.assertListEquals(results[2], [(22, 0.8), (32, 0.1), (12, 0.6)]) def test_row(self): results = self._create(3, [ (0, 1, 0.1), (0, 2, 0.8), (0, 3, 0.6), (1, 12, 0.8), (1, 22, 0.1), (1, 32, 0.6), (2, 12, 0.6), (2, 32, 0.1), (2, 22, 0.8)]) results[0].reorder("reverse") self.assertListEquals(results[0], [(3, 0.6), (2, 0.8), (1, 0.1)]) self.assertListEquals(results[1], [(12, 0.8), (22, 0.1), (32, 0.6)]) self.assertListEquals(results[2], [(12, 0.6), (32, 0.1), (22, 0.8)]) 
results[-1].reorder("reverse") self.assertListEquals(results[0], [(3, 0.6), (2, 0.8), (1, 0.1)]) self.assertListEquals(results[1], [(12, 0.8), (22, 0.1), (32, 0.6)]) self.assertListEquals(results[2], [(22, 0.8), (32, 0.1), (12, 0.6)]) class TestArenaReverse(TestCase, CreateSearchResults, TestReverse): pass class TestFPSReverse(TestCase, CreateFPSSearchResults, TestReverse): pass neg_inf = float("-inf") pos_inf = float("inf") def _count_scores(func, scores): return sum(1 for score in scores if func(score)) def _add_scores(func, scores): return sum(score for score in scores if func(score)) def _range_searches(func): expected_count = [ _count_scores(func, (0.1, 0.9, 0.2, 0.3, 0.15, 1.0)), 0.0, _count_scores(func, (1.0, 0.0, 0.5, 0.14, 0.28)), _count_scores(func, (neg_inf, 0.0001, pos_inf))] expected_cumulative = [ _add_scores(func, (0.1, 0.9, 0.2, 0.3, 0.15, 1.0)), 0.0, _add_scores(func, (1.0, 0.0, 0.5, 0.14, 0.28)), _add_scores(func, (neg_inf, 0.0001, pos_inf))] return expected_count, expected_cumulative class TestRangeSearches(TestCase): def setUp(self): self.results = self._create() def _create(self): results = SearchResults(4) for i, score in enumerate((0.1, 0.9, 0.2, 0.3, 0.15, 1.0)): results._add_hit(0, i, score) for i, score in enumerate((1.0, 0.0, 0.5, 0.14, 0.28)): results._add_hit(2, i, score) for i, score in enumerate((neg_inf, 0.0001, pos_inf)): results._add_hit(3, i, score) return results def _test(self, func, *args, **kwargs): expected_count, expected_cumulative = _range_searches(func) results = self.results self._compare_lists([result.count(*args, **kwargs) for result in results], expected_count) self._compare(results.count_all(*args, **kwargs), sum(expected_count)) self._compare_lists([result.cumulative_score(*args, **kwargs) for result in results], expected_cumulative) self._compare(results.cumulative_score_all(*args, **kwargs), sum(expected_cumulative)) def _compare_lists(self, got, expected): self.assertEqual(len(got), len(expected)) for got_term, 
expected_term in zip(got, expected):
            n = isnan(got_term) + isnan(expected_term)
            if n != 2:
                self.assertAlmostEqual(got_term, expected_term)

    def _compare(self, got, expected):
        n = isnan(got) + isnan(expected)
        if n != 2:
            self.assertAlmostEqual(got, expected)

    def test_empty(self):
        self._test(lambda score: 1)

    def test_impossible_to_match(self):
        self._test((lambda score: 1.0 <= score <= 0.0), min_score=1.0, max_score=0.0)
        self._test((lambda score: 1.0 <= score <= 0.0), min_score=1.0, max_score=0.0, interval="[]")
        self._test((lambda score: 1.0 <= score <= 0.0), min_score=1.0, max_score=0.0, interval="[)")
        self._test((lambda score: 1.0 <= score <= 0.0), min_score=1.0, max_score=0.0, interval="(]")
        self._test((lambda score: 1.0 <= score <= 0.0), min_score=1.0, max_score=0.0, interval="()")

    def test_single_point(self):
        self._test((lambda score: 1.0 <= score <= 1.0), min_score=1.0, max_score=1.0, interval="[]")
        self._test((lambda score: 1.0 <= score < 1.0), min_score=1.0, max_score=1.0, interval="[)")
        self._test((lambda score: 1.0 < score <= 1.0), min_score=1.0, max_score=1.0, interval="(]")
        self._test((lambda score: 1.0 < score < 1.0), min_score=1.0, max_score=1.0, interval="()")

    def test_min_default(self):
        self._test((lambda score: score >= 0.15), min_score=0.15)
        self._test((lambda score: score >= 0.14), 0.14)

    def test_min_oo(self):
        self._test((lambda score: 0.15 < score < pos_inf), min_score=0.15, interval="()")
        self._test((lambda score: 0.14 < score < pos_inf), min_score=0.14, interval="()")

    def test_min_oc(self):
        self._test((lambda score: 0.15 < score <= pos_inf), min_score=0.15, interval="(]")
        self._test((lambda score: 0.14 < score <= pos_inf), min_score=0.14, interval="(]")

    def test_min_co(self):
        self._test((lambda score: 0.15 <= score < pos_inf), min_score=0.15, interval="[)")
        self._test((lambda score: 0.14 <= score < pos_inf), min_score=0.14, interval="[)")

    def test_min_cc(self):
        self._test((lambda score: 0.15 <= score), min_score=0.15, interval="[]")
        self._test((lambda
score: 0.14 <= score), min_score=0.14, interval="[]") def test_max_default(self): self._test((lambda score: score <= 0.15), max_score=0.15) self._test((lambda score: score <= 0.14), None, 0.14) def test_max_oo(self): self._test((lambda score: neg_inf < score < 0.15), max_score=0.15, interval="()") self._test((lambda score: neg_inf < score < 0.14), max_score=0.14, interval="()") def test_max_oc(self): self._test((lambda score: neg_inf < score <= 0.15), max_score=0.15, interval="(]") self._test((lambda score: neg_inf < score <= 0.14), max_score=0.14, interval="(]") def test_max_co(self): self._test((lambda score: score < 0.15), max_score=0.15, interval="[)") self._test((lambda score: score < 0.14), max_score=0.14, interval="[)") def test_max_cc(self): self._test((lambda score: score <= 0.15), max_score=0.15, interval="[]") self._test((lambda score: score <= 0.14), max_score=0.14, interval="[]") # The same min/max tests but using -inf as the lower bound/+inf as the upper def test_min_default_max_inf(self): self._test((lambda score: 0.15 <= score <= pos_inf), min_score=0.15, max_score=pos_inf) self._test((lambda score: 0.14 <= score <= pos_inf), 0.14, pos_inf) def test_min_oo_max_inf(self): self._test((lambda score: 0.15 < score < pos_inf), min_score=0.15, max_score=pos_inf, interval="()") self._test((lambda score: 0.14 < score < pos_inf), min_score=0.14, max_score=pos_inf, interval="()") def test_min_oc_max_inf(self): self._test((lambda score: 0.15 < score <= pos_inf), min_score=0.15, max_score=pos_inf, interval="(]") self._test((lambda score: 0.14 < score <= pos_inf), min_score=0.14, max_score=pos_inf, interval="(]") def test_min_co_max_inf(self): self._test((lambda score: 0.15 <= score < pos_inf), min_score=0.15, max_score=pos_inf, interval="[)") self._test((lambda score: 0.14 <= score < pos_inf), min_score=0.14, max_score=pos_inf, interval="[)") def test_min_cc_max_inf(self): self._test((lambda score: 0.15 <= score <= pos_inf), min_score=0.15, max_score=pos_inf, 
interval="[]") self._test((lambda score: 0.14 <= score <= pos_inf), min_score=0.14, max_score=pos_inf, interval="[]") def test_max_default_min_ninf(self): self._test((lambda score: neg_inf <= score <= 0.15), min_score=neg_inf, max_score=0.15) self._test((lambda score: neg_inf <= score <= 0.14), neg_inf, 0.14) def test_max_oo_min_ninf(self): self._test((lambda score: neg_inf < score < 0.15), max_score=0.15, min_score=neg_inf, interval="()") self._test((lambda score: neg_inf < score < 0.14), max_score=0.14, min_score=neg_inf, interval="()") def test_max_oc_min_ninf(self): self._test((lambda score: neg_inf < score <= 0.15), max_score=0.15, min_score=neg_inf, interval="(]") self._test((lambda score: neg_inf < score <= 0.14), max_score=0.14, min_score=neg_inf, interval="(]") def test_max_co_min_ninf(self): self._test((lambda score: neg_inf <= score < 0.15), max_score=0.15, min_score=neg_inf, interval="[)") self._test((lambda score: neg_inf <= score < 0.14), max_score=0.14, min_score=neg_inf, interval="[)") def test_max_cc_min_ninf(self): self._test((lambda score: neg_inf <= score <= 0.15), max_score=0.15, min_score=neg_inf, interval="[]") self._test((lambda score: neg_inf <= score <= 0.14), max_score=0.14, min_score=neg_inf, interval="[]") # Specify both min and max def test_min_max_default(self): self._test((lambda score: 0.15 <= score <= 1.0), min_score=0.15, max_score=1.0) self._test((lambda score: 0.14 <= score <= 1.0), 0.14, 1.0) self._test((lambda score: 0.14 <= score <= 0.9), 0.14, 0.9) def test_min_max_oo(self): self._test((lambda score: 0.15 < score < 1.0), min_score=0.15, max_score=1.0, interval="()") self._test((lambda score: 0.14 < score < 1.0), min_score=0.14, max_score=1.0, interval="()") self._test((lambda score: 0.14 < score < 0.9), 0.14, 0.9, "()") def test_min_max_oc(self): self._test((lambda score: 0.15 < score <= 1.0), min_score=0.15, max_score=1.0, interval="(]") self._test((lambda score: 0.14 < score <= 1.0), min_score=0.14, max_score=1.0, 
interval="(]") self._test((lambda score: 0.14 < score <= 0.9), 0.14, 0.9, "(]") def test_min_max_co(self): self._test((lambda score: 0.15 <= score < 1.0), min_score=0.15, max_score=1.0, interval="[)") self._test((lambda score: 0.14 <= score < 1.0), min_score=0.14, max_score=1.0, interval="[)") self._test((lambda score: 0.14 <= score < 0.9), 0.14, 0.9, "[)") def test_min_max_cc(self): self._test((lambda score: 0.15 <= score <= 1.0), min_score=0.15, max_score=1.0, interval="[]") self._test((lambda score: 0.14 <= score <= 1.0), min_score=0.14, max_score=1.0, interval="[]") self._test((lambda score: 0.14 <= score <= 0.9), 0.14, 0.9, "[]") def test_bad_interval(self): results = self.results for interval, msg in (("[", "Second interval character must be ')' or ']'"), ("(", "Second interval character must be ')' or ']'"), ("]", "First interval character must be '(' or '['"), (")", "First interval character must be '(' or '['"), ("a", "First interval character must be '(' or '['"), ("((", "Second interval character must be ')' or ']'"), ("()(", "The interval may only contain two characters")): for method in (results.count_all, results[0].count, results.cumulative_score_all, results[0].cumulative_score): with self.assertRaisesRegexp(ValueError, re.escape(msg)): method(interval=interval) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_simsearch.py0000644000077000000240000005473712055226641020031 0ustar dalkestaff00000000000000import unittest2 import sys import chemfp from chemfp.commandline import simsearch SOFTWARE = "chemfp/" + chemfp.__version__ from cStringIO import StringIO import support class SimsearchRunner(support.Runner): def verify_result(self, result): assert result[0] == "#Simsearch/1", result[0] class CountRunner(support.Runner): def verify_result(self, result): assert result[0] == "#Count/1", result[0] runner = SimsearchRunner(simsearch.main) run = runner.run run_split = runner.run_split run_exit = runner.run_exit count_runner = 
CountRunner(simsearch.main) count_run_split = count_runner.run_split count_run_exit = count_runner.run_exit SIMPLE_FPS = support.fullpath("simple.fps") def run_split_stdin(input, cmdline, expect_length=None, source=SIMPLE_FPS): old_stdin = sys.stdin sys.stdin = StringIO(input) try: return run_split(cmdline, expect_length, source) finally: sys.stdin = old_stdin # The values I get using gmpy are: # [(1.0, 'deadbeef'), # (0.95999999999999996, 'Deaf Beef'), # (0.83999999999999997, 'DEADdead'), # (0.23999999999999999, 'several'), # (0.041666666666666664, 'bit1'), # (0.040000000000000001, 'two_bits'), # (0.0, 'zeros')] class TestOptions(unittest2.TestCase): def test_default(self): header, lines = run_split("--hex-query deadbeef -t 0.1", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.1"}) self.assertEquals(len(lines), 1, lines) fields = lines[0].split("\t") self.assertEquals(fields[:2], ["4", "Query1"]) hits = zip(fields[2::2], fields[3::2]) hits.sort() self.assertEquals(hits, [("DEADdead", "0.840"), ("Deaf Beef", "0.960"), ("deadbeef", "1.000"), ('several', '0.240')]) def test_k_3(self): header, lines = run_split("--hex-query deadbeef -k 3 --threshold 0.8", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.8"}) self.assertEquals(lines, ["3\tQuery1\tdeadbeef\t1.000\tDeaf Beef\t0.960\tDEADdead\t0.840"]) def test_k_2(self): header, lines = run_split("--hex-query deadbeef -k 2 --threshold 0.9", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=2 threshold=0.9"}) self.assertEquals(lines, ["2\tQuery1\tdeadbeef\t1.000\tDeaf Beef\t0.960"]) def test_k_1(self): header, lines = run_split("--hex-query deadbeef -k 1 -t 0.0", 1, 
SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=1 threshold=0.0"}) self.assertEquals(lines, ["1\tQuery1\tdeadbeef\t1.000"]) def test_knearest_1(self): header, lines = run_split("--hex-query deadbeef --k-nearest 1", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=1 threshold=0.0"}) self.assertEquals(lines, ["1\tQuery1\tdeadbeef\t1.000"]) def test_k_10(self): # Asked for 10 but only 7 are available header, lines = run_split("--hex-query deadbeef -k 10 --threshold 0.0", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=10 threshold=0.0"}) self.assertEquals(lines, ["7\tQuery1\tdeadbeef\t1.000\tDeaf Beef\t0.960\tDEADdead\t0.840\t" "several\t0.240\tbit1\t0.042\ttwo_bits\t0.040\tzeros\t0.000"]) def test_knearest_all(self): header, lines = run_split("--hex-query deadbeef --k-nearest all", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.0"}) self.assertEquals(lines, ['7\tQuery1\tzeros\t0.000\tbit1\t0.042\ttwo_bits\t0.040\tseveral\t0.240\tdeadbeef\t1.000\tDEADdead\t0.840\tDeaf Beef\t0.960']) def test_threshold(self): header, lines = run_split("--hex-query deadbeef --threshold 0.9", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.9"}) self.assertEquals(lines, ["2\tQuery1\tdeadbeef\t1.000\tDeaf Beef\t0.960"]) def test_threshold_and_k(self): header, lines = run_split("--hex-query deadbeef -t 0.9 -k 1", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", 
"#software": SOFTWARE, "#type": "Tanimoto k=1 threshold=0.9"}) self.assertEquals(lines, ["1\tQuery1\tdeadbeef\t1.000"]) def test_threshold_and_k_all(self): header, lines = run_split("--hex-query deadbeef --threshold 0.9 --k-nearest all", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.9"}) self.assertEquals(lines, ["2\tQuery1\tdeadbeef\t1.000\tDeaf Beef\t0.960"]) def test_stdin(self): header, lines = run_split_stdin("deadbeef\tspam\n", "", 1, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.7"}) self.assertEquals(lines, ["3\tspam\tdeadbeef\t1.000\tDeaf Beef\t0.960\tDEADdead\t0.840"]) def test_stdin2(self): header, lines = run_split_stdin("deadbeef\tspam\nDEADBEEF\teggs\n", "-k 3 --threshold 0.6", 2, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.6"}) self.assertEquals(lines, ["3\tspam\tdeadbeef\t1.000\tDeaf Beef\t0.960\tDEADdead\t0.840", "3\teggs\tdeadbeef\t1.000\tDeaf Beef\t0.960\tDEADdead\t0.840"]) def test_stdin3(self): header, lines = run_split_stdin("deadbeef\tspam\n87654321\tcountdown\n", "-k 3 --threshold 0.9", 2, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.9"}) self.assertEquals(lines, ["2\tspam\tdeadbeef\t1.000\tDeaf Beef\t0.960", "0\tcountdown"]) class _AgainstSelf: def test_with_threshold(self): header, lines = run_split( ["--queries", SIMPLE_FPS, "-k", "3", "--threshold", "0.8"] + self.extra_arg, 7, SIMPLE_FPS) self.assertEquals(lines, ["0\tzeros", "1\tbit1\tbit1\t1.000", "1\ttwo_bits\ttwo_bits\t1.000", "1\tseveral\tseveral\t1.000", "3\tdeadbeef\tdeadbeef\t1.000\tDeaf 
Beef\t0.960\tDEADdead\t0.840", "3\tDEADdead\tDEADdead\t1.000\tdeadbeef\t0.840\tDeaf Beef\t0.808", "3\tDeaf Beef\tDeaf Beef\t1.000\tdeadbeef\t0.960\tDEADdead\t0.808"]) def test_with_count_and_threshold(self): header, lines = count_run_split( ["--queries", SIMPLE_FPS, "--threshold", "0.8", "--count"] + self.extra_arg, 7, SIMPLE_FPS) self.assertEquals(lines, ["0\tzeros", "1\tbit1", "1\ttwo_bits", "1\tseveral", "3\tdeadbeef", "3\tDEADdead", "3\tDeaf Beef"]) class TestAgainstSelf(unittest2.TestCase, _AgainstSelf): extra_arg = [] class TestAgainstSelfInMemory(unittest2.TestCase, _AgainstSelf): extra_arg = ["--memory"] class TestAgainstSelfFileScan(unittest2.TestCase, _AgainstSelf): extra_arg = ["--scan"] class TestNxN(unittest2.TestCase): def test_default(self): header, lines = run_split("--NxN", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.7 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '0\tbit1', '0\ttwo_bits', '0\tseveral', '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', '2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808']) def test_specify_default_values(self): header, lines = run_split("--NxN -k 3 --threshold 0.7", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=3 threshold=0.7 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '0\tbit1', '0\ttwo_bits', '0\tseveral', '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', '2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808']) def test_k_2(self): # This sets the threshold to 0.0 header, lines = run_split("--NxN -k 2 ", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, 
{"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=2 threshold=0.0 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '2\tbit1\ttwo_bits\t0.500\tseveral\t0.143', '2\ttwo_bits\tbit1\t0.500\tseveral\t0.286', '2\tseveral\ttwo_bits\t0.286\tDEADdead\t0.261', '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', '2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808']) def test_threshold(self): header, lines = run_split("--NxN --threshold 0.5 ", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.5 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '1\tbit1\ttwo_bits\t0.500', '1\ttwo_bits\tbit1\t0.500', '0\tseveral', # The order here is implementation dependent... '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', #'2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808', '2\tDeaf Beef\tDEADdead\t0.808\tdeadbeef\t0.960', ]) def test_count_with_threshold(self): header, lines = count_run_split("--NxN --count --threshold 0.5 ", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Count threshold=0.5 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '1\tbit1', '1\ttwo_bits', '0\tseveral', '2\tdeadbeef', '2\tDEADdead', '2\tDeaf Beef', ]) def test_count_with_default_threshold(self): header, lines = count_run_split("--NxN --count", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Count threshold=0.7 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '0\tbit1', '0\ttwo_bits', '0\tseveral', '2\tdeadbeef', '2\tDEADdead', '2\tDeaf 
Beef', ]) def test_threshold_with_low_batch_size(self): header, lines = run_split("--NxN --threshold 0.5 --batch-size 1", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=all threshold=0.5 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '1\tbit1\ttwo_bits\t0.500', '1\ttwo_bits\tbit1\t0.500', '0\tseveral', # The order here is implementation dependent... '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', #'2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808', '2\tDeaf Beef\tDEADdead\t0.808\tdeadbeef\t0.960', ]) def test_knearest_with_low_batch_size(self): # This is the same as test_k_2 but with a batch-size of 1. # This tests a bug where I wasn't incrementing the offset # to the start of each batch location in the results. header, lines = run_split("--NxN -k 2 --batch-size 1", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Tanimoto k=2 threshold=0.0 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '2\tbit1\ttwo_bits\t0.500\tseveral\t0.143', '2\ttwo_bits\tbit1\t0.500\tseveral\t0.286', '2\tseveral\ttwo_bits\t0.286\tDEADdead\t0.261', '2\tdeadbeef\tDeaf Beef\t0.960\tDEADdead\t0.840', '2\tDEADdead\tdeadbeef\t0.840\tDeaf Beef\t0.808', '2\tDeaf Beef\tdeadbeef\t0.960\tDEADdead\t0.808']) def test_count_with_low_batch_size(self): header, lines = count_run_split("--NxN --count --batch-size 1", 7, SIMPLE_FPS) self.assertIn("simple.fps", header.pop("#targets")) self.assertEquals(header, {"#num_bits": "32", "#software": SOFTWARE, "#type": "Count threshold=0.7 NxN=full"}) self.assertEquals(len(lines), 7, lines) self.assertEquals(lines, ['0\tzeros', '0\tbit1', '0\ttwo_bits', '0\tseveral', '2\tdeadbeef', '2\tDEADdead', '2\tDeaf Beef', ]) class 
TestCompatibility(unittest2.TestCase): def test_incompatible_fingerprint(self): errmsg = run_exit(["--hex-query", "dead"], SIMPLE_FPS) self.assertIn("error: query fingerprint contains 2 bytes but", errmsg) self.assertIn("simple.fps", errmsg) self.assertIn("has 4 byte fingerprints", errmsg) def test_targets_is_not_an_fps_file(self): errmsg = run_exit(["--queries", SIMPLE_FPS]) self.assertIn("Cannot open targets file:", errmsg) self.assertIn("Unable to determine fingerprint format type from", errmsg) self.assertIn("pubchem.sdf", errmsg) def test_targets_does_not_exist(self): errmsg = run_exit(["--queries", SIMPLE_FPS], "/this/file/does_not_exist_t") self.assertIn("Cannot open targets file:", errmsg) self.assertIn("No such file or directory", errmsg) # Mac specific? self.assertIn("does_not_exist_t", errmsg) def test_queries_is_not_an_fps_file(self): errmsg = run_exit(["--queries", support.PUBCHEM_SDF], SIMPLE_FPS) self.assertIn("Cannot open queries file:", errmsg) self.assertIn("Unable to determine fingerprint format type from ", errmsg) self.assertIn("pubchem.sdf", errmsg) def test_queries_does_not_exist(self): errmsg = run_exit(["--queries", "/this/file/does_not_exist_q"], SIMPLE_FPS) self.assertIn("Cannot open queries file:", errmsg) self.assertIn("No such file or directory", errmsg) # Mac specific? 
self.assertIn("does_not_exist_q", errmsg) class TestCommandlineErrors(unittest2.TestCase): def test_mix_count_and_knearest(self): errmsg = count_run_exit("--count --hex-query beefcafe --k-nearest 4", SIMPLE_FPS) self.assertIn("--count search does not support --k-nearest", errmsg) def test_negative_k(self): errmsg = run_exit("--hex-query beefcafe -k -1", SIMPLE_FPS) self.assertIn("--k-nearest must be non-negative or 'all'", errmsg) def test_negative_threshold(self): errmsg = run_exit("--hex-query beefcafe --threshold -0.1", SIMPLE_FPS) self.assertIn("--threshold must be between 0.0 and 1.0, inclusive", errmsg) errmsg = run_exit("--hex-query beefcafe --threshold -1.0", SIMPLE_FPS) self.assertIn("--threshold must be between 0.0 and 1.0, inclusive", errmsg) def test_too_large_threshold(self): errmsg = run_exit("--hex-query beefcafe --threshold 1.1", SIMPLE_FPS) self.assertIn("--threshold must be between 0.0 and 1.0, inclusive", errmsg) def test_non_positive_batch_size(self): errmsg = run_exit("--hex-query beefcafe --batch-size 0", SIMPLE_FPS) self.assertIn("--batch-size must be positive", errmsg) errmsg = run_exit("--hex-query beefcafe --batch-size -1", SIMPLE_FPS) self.assertIn("--batch-size must be positive", errmsg) def test_NxN_with_scan(self): errmsg = run_exit("--NxN --scan", SIMPLE_FPS) self.assertIn("Cannot specify --scan with an --NxN search", errmsg) def test_NxN_with_hex_query(self): errmsg = run_exit("--NxN --hex-query feedfeed", SIMPLE_FPS) self.assertIn("Cannot specify --hex-query with an --NxN search", errmsg) def test_NxN_with_queries(self): errmsg = run_exit("--NxN --queries ignored", SIMPLE_FPS) self.assertIn("Cannot specify --queries with an --NxN search", errmsg) def test_scan_with_memory(self): errmsg = run_exit("--scan --memory", SIMPLE_FPS) self.assertIn("Cannot specify both --scan and --memory", errmsg) def test_hex_query_with_queries(self): errmsg = run_exit("--hex-query faceb00c --queries not_important", SIMPLE_FPS) self.assertIn("Cannot 
specify both --hex-query and --queries", errmsg) def test_hex_query_with_bad_character(self): errmsg = run_exit("--hex-query faceb00k", SIMPLE_FPS) self.assertIn("--hex-query is not a hex string: Non-hexadecimal digit found", errmsg) def test_hex_query_with_bad_length(self): errmsg = run_exit("--hex-query deadbeef2", SIMPLE_FPS) self.assertIn("--hex-query is not a hex string: Odd-length string", errmsg) def test_query_id_with_bad_character(self): for (bad_id, name) in (("A\tB", "tab"), ("C\nD", "newline"), ("E\rF", "control-return"), ("G\0H", "NUL")): errmsg = run_exit(["--hex-query", "abcd1234", "--query-id", bad_id], SIMPLE_FPS) self.assertIn("--query-id must not contain the %s character" % name, errmsg) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_symmetric.py0000644000077000000240000002501212101627573020050 0ustar dalkestaff00000000000000# Test the symmetric code import unittest2 from cStringIO import StringIO import array import chemfp from chemfp import search, bitops from support import fullpath, PUBCHEM_SDF, PUBCHEM_SDF_GZ fps = chemfp.load_fingerprints(fullpath("queries.fps")) zeros = chemfp.load_fingerprints(StringIO("""\ 0000\tA 0000\tB 0001\tC 0002\tD FFFE\tE FFFF\tF """)) def slow_counts(counts, fps, threshold, query_start, query_end, target_start, target_end): N = len(fps) query_end = min(N, query_end) target_end = min(N, target_end) for row in range(query_start, query_end): row_fp = fps[row][1] for col in range(max(row+1, target_start), target_end): col_fp = fps[col][1] if bitops.byte_tanimoto(row_fp, col_fp) >= threshold: if row != col: #print row, col, "X" counts[row] += 1 counts[col] += 1 def slow_threshold_search(results, fps, threshold, query_start, query_end, target_start, target_end): N = len(fps) query_end = min(N, query_end) target_end = min(N, target_end) for row in range(query_start, query_end): row_fp = fps[row][1] for col in range(max(row+1, target_start), target_end): col_fp = fps[col][1] score = 
bitops.byte_tanimoto(row_fp, col_fp) if score >= threshold: if row != col: #print row, col, "X" results[row].append((col, score)) #results[col].append((row, score)) # only do the upper triangle class TestCounts(unittest2.TestCase): #maxDiff = 0 def test_symmetric(self): # query[i] always matches target[i] so x[i] will be at least one x = search.count_tanimoto_hits_arena(fps, fps, 0.6) # This only processes the upper-triangle, and not the diagonal y = search.count_tanimoto_hits_symmetric(fps, 0.6) self.assertEquals(len(x), len(y)) for i in range(len(x)): self.assertEquals(x[i]-1, y[i]) def test_zeros(self): y = search.count_tanimoto_hits_symmetric(zeros, 0.9) self.assertSequenceEqual(y, [0, 0, 0, 0, 1, 1]) y = search.count_tanimoto_hits_symmetric(zeros, 0.001) self.assertSequenceEqual(y, [0, 0, 1, 2, 2, 3]) def test_partial_counts_with_zero_threshold(self): threshold = 0.0 N = len(fps) counts = array.array("i", [0]*N) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 1, 5, 0, N) expected = [0] * N slow_counts(expected, fps, threshold, 1, 5, 0, N) self.assertSequenceEqual(counts, expected) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 3, 33, 0, N) slow_counts(expected, fps, threshold, 3, 33, 0, N) self.assertSequenceEqual(counts, expected) def test_partial_counts_in_columns(self): threshold = 0.6 N = len(fps) counts = array.array("i", [0]*N) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 1, 5, 0, N) expected = [0] * N slow_counts(expected, fps, threshold, 1, 5, 0, N) self.assertSequenceEqual(counts, expected) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 3, 33, 0, N) slow_counts(expected, fps, threshold, 3, 33, 0, N) self.assertSequenceEqual(counts, expected) def test_partial_counts_in_rows(self): threshold = 0.2 N = len(fps) counts = array.array("i", [0]*N) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 1, 5, 9, 30) expected = [0] * N slow_counts(expected, 
fps, threshold, 1, 5, 9, 30) self.assertSequenceEqual(counts, expected) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, 20, 26, 30, 50) slow_counts(expected, fps, threshold, 20, 26, 30, 50) self.assertSequenceEqual(counts, expected) def test_partial_counts(self): threshold = 0.2 N = len(fps) counts = array.array("i", [0]*N) expected = [0] * N for i in range(0, N, 10): for j in range(i, N, 8): slow_counts(expected, fps, threshold, i, i+10, j, j+8) search.partial_count_tanimoto_hits_symmetric(counts, fps, threshold, i, i+10, j, j+8) self.assertSequenceEqual(counts, expected) normal = search.count_tanimoto_hits_symmetric(fps, threshold) self.assertSequenceEqual(normal, expected) def _compare_search_results(self, result, expected): q=list(result.iter_indices_and_scores()) for i in range(len(result)): self.assertEquals(len(q[i]), len(expected[i]), "length of %d" % i) self.assertEquals(sorted(q[i]), sorted(expected[i]), "contents of %d" % i) class TestThreshold(unittest2.TestCase): def test_upper_only(self): # query[i] always matches target[i] so x[i] will always contain i x = search.threshold_tanimoto_search_arena(fps, fps, 0.9) x = list(x) # This only processes the upper-triangle, and not the diagonal y = search.threshold_tanimoto_search_symmetric(fps, 0.9, include_lower_triangle=False) rows = list(row.get_indices_and_scores() for row in y) row_sizes = map(len, rows) # Move elements to the lower triangle for rowno, (row, row_size) in enumerate(zip(rows, row_sizes)): for (colno, score) in row[:row_size]: assert colno > rowno, (rowno, colno) rows[colno].append( (rowno, score) ) # Fill in the diagonal row.append((rowno, 1.0)) # Put into a consistent order row.sort() # Match with the NxM algorithm expected_row = x[rowno] expected_row.reorder("increasing-index") self.assertEquals(row, list(expected_row), rowno) def test_upper_and_lower(self): # query[i] always matches target[i] so x[i] will always contain i x = 
search.threshold_tanimoto_search_arena(fps, fps, 0.9) # This processes both the upper and lower triangles, but not the diagonal y = search.threshold_tanimoto_search_symmetric(fps, 0.9) for i, (x_row, y_row) in enumerate(zip(x, y)): x_row = x_row.get_indices_and_scores() y_row = y_row.get_indices_and_scores() y_row.append((i, 1.0)) x_row.sort() y_row.sort() self.assertEquals(x_row, y_row) def test_partial_search_with_zero_threshold(self): threshold = 0.0 N = len(fps) result = search.SearchResults(N, fps.arena_ids) search.partial_threshold_tanimoto_search_symmetric(result, fps, threshold, 1, 5, 0, N) expected = [[] for i in range(N)] slow_threshold_search(expected, fps, threshold, 1, 5, 0, N) _compare_search_results(self, result, expected) def test_partial_search_in_rows(self): threshold = 0.2 N = len(fps) result = search.SearchResults(N, fps.arena_ids) search.partial_threshold_tanimoto_search_symmetric(result, fps, threshold, 1, 9, 0, N) expected = [[] for i in range(N)] slow_threshold_search(expected, fps, threshold, 1, 9, 0, N) _compare_search_results(self, result, expected) def test_partial_search_in_cols(self): threshold = 0.1 N = len(fps) result = search.SearchResults(N, fps.arena_ids) search.partial_threshold_tanimoto_search_symmetric(result, fps, threshold, 0, N, 7, 69) expected = [[] for i in range(N)] slow_threshold_search(expected, fps, threshold, 0, N, 7, 69) _compare_search_results(self, result, expected) def test_partial_threshold_search(self): threshold = 0.1 N = len(fps) result = search.SearchResults(N, fps.arena_ids) expected = [[] for i in range(N)] for i in range(0, N, 13): for j in range(i, N, 8): search.partial_threshold_tanimoto_search_symmetric(result, fps, threshold, i, i+13, j, j+8) slow_threshold_search(expected, fps, threshold, i, i+13, j, j+8) _compare_search_results(self, result, expected) counts_before = map(len, result) search.fill_lower_triangle(result) counts_after = map(len, result) 
self.assertNotEqual(counts_before, counts_after) self.assertSequenceEqual(counts_after, search.count_tanimoto_hits_symmetric(fps, threshold)) normal = search.threshold_tanimoto_search_symmetric(fps, threshold) _compare_search_results(self, result, list(normal.iter_indices_and_scores())) class TestKNearest(unittest2.TestCase): def test_symmetric(self): # query[i] always matches target[i] so x[i] will always contain element[i] x = search.knearest_tanimoto_search_arena(fps, fps, 31, 0.9) # This only processes the upper-triangle, and not the diagonal y = search.knearest_tanimoto_search_symmetric(fps, 30, 0.9) for i, (x_row, y_row) in enumerate(zip(x, y)): x_row = x_row.get_indices_and_scores() y_row = y_row.get_indices_and_scores() y_row.append((i, 1.0)) x_row.sort() y_row.sort() self.assertEquals(x_row, y_row, "Problem in %d" % i) def test_symmetric2(self): # query[i] always matches target[i] so x[i] will always contain element[i] x = fps.knearest_tanimoto_search_arena(fps, 81, 0.9) # This only processes the upper-triangle, and not the diagonal y = search.knearest_tanimoto_search_symmetric(fps, 80, 0.9) for i, (x_row, y_row) in enumerate(zip(x, y)): x_row = x_row.get_indices_and_scores() y_row = y_row.get_indices_and_scores() y_row.append((i, 1.0)) x_row.sort() y_row.sort() self.assertEquals(x_row, y_row, "Problem in %d" % i) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/test_tox_version.py0000644000077000000240000000617312055226641020421 0ustar dalkestaff00000000000000# This tests that the tox environment is set up the way it should be # set up. 
import sys import os import unittest2 import support envstr = os.environ.get("TOX_CHEMFP_TEST", None) if not envstr: versions = [] else: versions = envstr.split(",") def check_py25(): version = sys.version_info[:2] assert version == (2,5), version assert not support.can_skip("py25") def check_py26(): version = sys.version_info[:2] assert version == (2,6), version assert not support.can_skip("py26") def check_py27(): version = sys.version_info[:2] assert version == (2,7), version assert not support.can_skip("py27") def check_x32(): assert sys.maxint == 2147483647 assert not support.can_skip("x32") def check_x64(): assert sys.maxint == 9223372036854775807 assert not support.can_skip("x64") def check_oe174(): from openeye.oechem import OEChemGetRelease version = OEChemGetRelease() assert version == "1.7.4", version assert not support.can_skip("oe") assert not support.can_skip("oe174") def check_oe2011Oct(): from openeye.oechem import OEChemGetRelease version = OEChemGetRelease() assert version == "1.7.6", version assert not support.can_skip("oe") assert not support.can_skip("oe2011Oct") def check_ob223(): import openbabel from chemfp import openbabel assert openbabel._ob_version == "2.2.3", openbabel._ob_version assert not support.can_skip("ob") assert not support.can_skip("ob223") def check_ob230(): import openbabel from chemfp import openbabel assert openbabel._ob_version == "2.3.0", openbabel._ob_version assert not support.can_skip("ob") assert not support.can_skip("ob230") def check_ob23svn1(): import openbabel from chemfp import openbabel assert openbabel._ob_version == "2.3.90", openbabel._ob_version assert not support.can_skip("ob") assert not support.can_skip("ob23svn1") def check_rd201106(): from rdkit.rdBase import rdkitVersion assert rdkitVersion[:7] == "2011.06", rdkitVersion def check_rd201103(): import rdkit.rdBase from rdkit.rdBase import rdkitVersion assert rdkitVersion[:7] == "2011.03", rdkitVersion def check_rd201012(): import rdkit.rdBase from 
rdkit.rdBase import rdkitVersion assert rdkitVersion[:7] == "2010.12", rdkitVersion def check_rd201112_svn(): import rdkit.rdBase from rdkit.rdBase import rdkitVersion assert rdkitVersion[:7] == "2011.12", rdkitVersion def _check(required): req = required.split() for name in versions: if name in req: return raise AssertionError("Missing one of %r: %r" % (required, versions)) class TestToxVersion(unittest2.TestCase): def test_enough_specifications(self): _check("x32 x64") _check("py25 py26 py27 py32") _check("oe174 oe2011Oct ob223 ob230 ob23svn1 rd201106 rd201103 rd201012 rd201112_svn") def test_version(self): for name in versions: func = globals()["check_" + name] func() TestToxVersion = unittest2.skipUnless(envstr, "Not building under the tox environment")( TestToxVersion) chemfp-1.1p1/tests/test_types.py0000644000077000000240000000731612055226641017206 0ustar dalkestaff00000000000000from __future__ import with_statement import unittest2 from chemfp import types try: from chemfp import openeye except ImportError: skip_openeye = True else: skip_openeye = False try: from chemfp import rdkit except ImportError: skip_rdkit = True else: skip_rdkit = False # These tests are to improve overall test coverage class ParseType(unittest2.TestCase): def test_no_type_string(self): with self.assertRaisesRegexp(ValueError, r"Empty fingerprint type \(''\)"): types.parse_type("") with self.assertRaisesRegexp(ValueError, r"Empty fingerprint type \(' '\)"): types.parse_type(" ") def test_unknown_fingerprint_family(self): with self.assertRaisesRegexp(ValueError, "Unknown fingerprint family 'Blah-Blah/1'"): types.parse_type("Blah-Blah/1") class TestOEFingerprintTypes(unittest2.TestCase): def test_missing_equals(self): with self.assertRaisesRegexp(ValueError, "Term 'atype' in type 'OpenEye-Path atype' must have one and only one '='"): types.parse_type("OpenEye-Path atype") def test_extra_equals(self): with self.assertRaisesRegexp(ValueError, "Term 'atype=DefaultAtom=DefaultAtom' in type 
'OpenEye-Path atype=DefaultAtom=DefaultAtom' must have one and only one '='"): types.parse_type("OpenEye-Path atype=DefaultAtom=DefaultAtom") def test_duplicate_names(self): with self.assertRaisesRegexp(ValueError, "Duplicate name 'atype' in type 'OpenEye-Path atype=DefaultAtom atype=InRing'"): types.parse_type("OpenEye-Path atype=DefaultAtom atype=InRing") def test_unknown_name(self): with self.assertRaisesRegexp(ValueError, "Unknown name 'atomtype' in type 'OpenEye-Path atomtype=DefaultAtom'"): types.parse_type("OpenEye-Path atomtype=DefaultAtom") def test_negative_minbonds(self): with self.assertRaisesRegexp(ValueError, "Unable to parse minbonds value '-1' in type 'OpenEye-Path minbonds=-1'"): types.parse_type("OpenEye-Path minbonds=-1") def test_make_fingerprinter_from_type_with_empty_string(self): family = types.get_fingerprint_family("OpenEye-Path") with self.assertRaisesRegexp(ValueError, r"Empty fingerprint type \(''\)"): family.make_fingerprinter_from_type("") def test_compare_types_for_equality(self): t1 = types.parse_type("OpenEye-Path") self.assertEquals(t1, t1) t2 = types.parse_type("OpenEye-Path") self.assertEquals(t1, t2) def test_compare_types_for_inequality(self): t1 = types.parse_type("OpenEye-Path") t2 = types.parse_type("OpenEye-Path atype=InRing") self.assertNotEquals(t1, t2) TestOEFingerprintTypes = unittest2.skipIf(skip_openeye, "OEChem not installed")(TestOEFingerprintTypes) class TestRDKitFingerprintTypes(unittest2.TestCase): def test_passing_0_to_positive_int(self): with self.assertRaisesRegexp(ValueError, "Unable to parse fpSize value '0' in type 'RDKit-Fingerprint fpSize=0'"): types.parse_type("RDKit-Fingerprint fpSize=0") def test_passing_negative_to_nonnegative_int(self): with self.assertRaisesRegexp(ValueError, r"Unable to parse fpSize value '\+2' in type 'RDKit-Fingerprint fpSize=\+2'"): types.parse_type("RDKit-Fingerprint fpSize=+2") def test_passing_something_other_than_0_or_1(self): with self.assertRaisesRegexp(ValueError, 
r"Unable to parse useHs value '2' in type 'RDKit-Fingerprint useHs=2'"): types.parse_type("RDKit-Fingerprint useHs=2") TestRDKitFingerprintTypes = unittest2.skipIf(skip_rdkit, "RDKit not installed")(TestRDKitFingerprintTypes) if __name__ == "__main__": unittest2.main() chemfp-1.1p1/tests/tryptophan.sdf0000644000077000000240000000434711660452125017337 0ustar dalkestaff00000000000000tryptophan.pdb OpenBabel12160706013D 32 28 0 0 0 0 0 0 0 0999 V2000 -1.4520 0.0061 0.5210 C 0 0 0 0 0 -2.4914 -0.8349 0.5204 C 0 0 0 0 0 -1.8035 -2.4517 0.5212 N 0 0 0 0 0 -0.3733 -2.1482 0.5219 C 0 0 0 0 0 -0.1506 -0.7458 0.5219 C 0 0 0 0 0 0.7298 -3.0423 0.5358 C 0 0 0 0 0 2.0556 -2.5339 0.5496 C 0 0 0 0 0 2.2782 -1.1316 0.4969 C 0 0 0 0 0 1.1752 -0.2375 0.5094 C 0 0 0 0 0 -1.5379 1.1027 0.5311 H 0 0 0 0 0 -3.3041 -0.7044 -0.2093 H 0 0 0 0 0 -2.1610 -3.1400 -0.1867 H 0 0 0 0 0 0.5574 -4.1286 0.5262 H 0 0 0 0 0 2.9098 -3.2257 0.5915 H 0 0 0 0 0 3.3041 -0.7392 0.4369 H 0 0 0 0 0 1.3478 0.8488 0.5189 H 0 0 0 0 0 -1.8300 2.4262 -0.8409 C 0 0 0 0 0 -2.9276 1.9220 -0.9182 O 0 0 0 0 0 -1.4612 3.7787 -1.3892 O 0 0 0 0 0 -0.5853 1.8470 -0.2139 C 0 0 0 0 0 0.4864 1.7183 -1.2087 N 0 0 0 0 0 -0.8878 0.4700 0.3625 C 0 0 0 0 0 0.3735 -0.1058 0.9925 C 0 0 0 0 0 -2.2660 4.1287 -1.8071 H 0 0 0 0 0 -0.2498 2.5432 0.5869 H 0 0 0 0 0 1.3167 1.3258 -0.7652 H 0 0 0 0 0 0.7762 2.6459 -1.5182 H 0 0 0 0 0 -1.2352 -0.2039 -0.4524 H 0 0 0 0 0 -1.6809 0.5598 1.1380 H 0 0 0 0 0 0.1617 -1.1127 1.4166 H 0 0 0 0 0 0.7313 0.5625 1.8071 H 0 0 0 0 0 1.1746 -0.1970 0.2253 H 0 0 0 0 0 28 1 1 0 0 0 22 1 1 0 0 0 1 29 1 0 0 0 2 3 1 0 0 0 11 2 1 0 0 0 3 4 1 0 0 0 12 3 1 0 0 0 4 6 2 0 0 0 5 23 1 0 0 0 5 30 1 0 0 0 6 7 1 0 0 0 6 13 1 0 0 0 7 8 2 0 0 0 7 14 1 0 0 0 8 9 1 0 0 0 8 15 1 0 0 0 9 16 1 0 0 0 9 23 1 0 0 0 22 10 1 0 0 0 17 20 1 0 0 0 19 17 1 0 0 0 18 17 2 0 0 0 24 19 1 0 0 0 20 21 1 0 0 0 20 25 1 0 0 0 27 21 1 0 0 0 21 26 1 0 0 0 23 31 1 0 0 0 M END > tryptophan.pdb $$$$ 
chemfp-1.1p1/tests/unit20000755000077000000240000000012411660452125015405 0ustar dalkestaff00000000000000#! /usr/bin/env python __unittest = True from unittest2.main import main_ main_()chemfp-1.1p1/tests/unittest2/0000755000077000000240000000000012106315372016360 5ustar dalkestaff00000000000000chemfp-1.1p1/tests/unittest2/__init__.py0000644000077000000240000000454611660452125020504 0ustar dalkestaff00000000000000""" unittest2 unittest2 is a backport of the new features added to the unittest testing framework in Python 2.7. It is tested to run on Python 2.4 - 2.6. To use unittest2 instead of unittest simply replace ``import unittest`` with ``import unittest2``. Copyright (c) 1999-2003 Steve Purcell Copyright (c) 2003-2010 Python Software Foundation This module is free software, and you may redistribute it and/or modify it under the same terms as Python itself, so long as this copyright message and disclaimer are retained in their original form. IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS CODE, EVEN IF THE AUTHOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE CODE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THERE IS NO OBLIGATION WHATSOEVER TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. 
""" __all__ = ['TestResult', 'TestCase', 'TestSuite', 'TextTestRunner', 'TestLoader', 'FunctionTestCase', 'main', 'defaultTestLoader', 'SkipTest', 'skip', 'skipIf', 'skipUnless', 'expectedFailure', 'TextTestResult', '__version__', 'collector'] __version__ = '0.5.1' # Expose obsolete functions for backwards compatibility __all__.extend(['getTestCaseNames', 'makeSuite', 'findTestCases']) from unittest2.collector import collector from unittest2.result import TestResult from unittest2.case import ( TestCase, FunctionTestCase, SkipTest, skip, skipIf, skipUnless, expectedFailure ) from unittest2.suite import BaseTestSuite, TestSuite from unittest2.loader import ( TestLoader, defaultTestLoader, makeSuite, getTestCaseNames, findTestCases ) from unittest2.main import TestProgram, main, main_ from unittest2.runner import TextTestRunner, TextTestResult try: from unittest2.signals import ( installHandler, registerResult, removeResult, removeHandler ) except ImportError: # Compatibility with platforms that don't have the signal module pass else: __all__.extend(['installHandler', 'registerResult', 'removeResult', 'removeHandler']) # deprecated _TextTestResult = TextTestResult __unittest = Truechemfp-1.1p1/tests/unittest2/__main__.py0000644000077000000240000000024611660452125020456 0ustar dalkestaff00000000000000"""Main entry point""" import sys if sys.argv[0].endswith("__main__.py"): sys.argv[0] = "unittest2" __unittest = True from unittest2.main import main_ main_() chemfp-1.1p1/tests/unittest2/case.py0000644000077000000240000012343411660452125017656 0ustar dalkestaff00000000000000"""Test case implementation""" import sys import difflib import pprint import re import unittest import warnings from unittest2 import result from unittest2.util import ( safe_repr, safe_str, strclass, unorderable_list_difference ) from unittest2.compatibility import wraps __unittest = True DIFF_OMITTED = ('\nDiff is %s characters long. 
' 'Set self.maxDiff to None to see it.') class SkipTest(Exception): """ Raise this exception in a test to skip it. Usually you can use TestResult.skip() or one of the skipping decorators instead of raising this directly. """ class _ExpectedFailure(Exception): """ Raise this when a test is expected to fail. This is an implementation detail. """ def __init__(self, exc_info): # can't use super because Python 2.4 exceptions are old style Exception.__init__(self) self.exc_info = exc_info class _UnexpectedSuccess(Exception): """ The test was supposed to fail, but it didn't! """ def _id(obj): return obj def skip(reason): """ Unconditionally skip a test. """ def decorator(test_item): if not (isinstance(test_item, type) and issubclass(test_item, TestCase)): @wraps(test_item) def skip_wrapper(*args, **kwargs): raise SkipTest(reason) test_item = skip_wrapper test_item.__unittest_skip__ = True test_item.__unittest_skip_why__ = reason return test_item return decorator def skipIf(condition, reason): """ Skip a test if the condition is true. """ if condition: return skip(reason) return _id def skipUnless(condition, reason): """ Skip a test unless the condition is true. 
""" if not condition: return skip(reason) return _id def expectedFailure(func): @wraps(func) def wrapper(*args, **kwargs): try: func(*args, **kwargs) except Exception: raise _ExpectedFailure(sys.exc_info()) raise _UnexpectedSuccess return wrapper class _AssertRaisesContext(object): """A context manager used to implement TestCase.assertRaises* methods.""" def __init__(self, expected, test_case, expected_regexp=None): self.expected = expected self.failureException = test_case.failureException self.expected_regexp = expected_regexp def __enter__(self): return self def __exit__(self, exc_type, exc_value, tb): if exc_type is None: try: exc_name = self.expected.__name__ except AttributeError: exc_name = str(self.expected) raise self.failureException( "%s not raised" % (exc_name,)) if not issubclass(exc_type, self.expected): # let unexpected exceptions pass through return False self.exception = exc_value # store for later retrieval if self.expected_regexp is None: return True expected_regexp = self.expected_regexp if isinstance(expected_regexp, basestring): expected_regexp = re.compile(expected_regexp) if not expected_regexp.search(str(exc_value)): raise self.failureException('"%s" does not match "%s"' % (expected_regexp.pattern, str(exc_value))) return True class _TypeEqualityDict(object): def __init__(self, testcase): self.testcase = testcase self._store = {} def __setitem__(self, key, value): self._store[key] = value def __getitem__(self, key): value = self._store[key] if isinstance(value, basestring): return getattr(self.testcase, value) return value def get(self, key, default=None): if key in self._store: return self[key] return default class TestCase(unittest.TestCase): """A class whose instances are single test cases. By default, the test code itself should be placed in a method named 'runTest'. If the fixture may be used for many test cases, create as many test methods as are needed. 
When instantiating such a TestCase subclass, specify in the constructor arguments the name of the test method that the instance is to execute. Test authors should subclass TestCase for their own tests. Construction and deconstruction of the test's environment ('fixture') can be implemented by overriding the 'setUp' and 'tearDown' methods respectively. If it is necessary to override the __init__ method, the base class __init__ method must always be called. It is important that subclasses should not change the signature of their __init__ method, since instances of the classes are instantiated automatically by parts of the framework in order to be run. """ # This attribute determines which exception will be raised when # the instance's assertion methods fail; test methods raising this # exception will be deemed to have 'failed' rather than 'errored' failureException = AssertionError # This attribute sets the maximum length of a diff in failure messages # by assert methods using difflib. It is looked up as an instance attribute # so can be configured by individual tests if required. maxDiff = 80*8 # This attribute determines whether long messages (including repr of # objects used in assert methods) will be printed on failure in *addition* # to any explicit message passed. longMessage = True # Attribute used by TestSuite for classSetUp _classSetupFailed = False def __init__(self, methodName='runTest'): """Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name. 
""" self._testMethodName = methodName self._resultForDoCleanups = None try: testMethod = getattr(self, methodName) except AttributeError: raise ValueError("no such test method in %s: %s" % \ (self.__class__, methodName)) self._testMethodDoc = testMethod.__doc__ self._cleanups = [] # Map types to custom assertEqual functions that will compare # instances of said type in more detail to generate a more useful # error message. self._type_equality_funcs = _TypeEqualityDict(self) self.addTypeEqualityFunc(dict, 'assertDictEqual') self.addTypeEqualityFunc(list, 'assertListEqual') self.addTypeEqualityFunc(tuple, 'assertTupleEqual') self.addTypeEqualityFunc(set, 'assertSetEqual') self.addTypeEqualityFunc(frozenset, 'assertSetEqual') self.addTypeEqualityFunc(unicode, 'assertMultiLineEqual') def addTypeEqualityFunc(self, typeobj, function): """Add a type specific assertEqual style function to compare a type. This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages. Args: typeobj: The data type to call this function on when both values are of the same type in assertEqual(). function: The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal. """ self._type_equality_funcs[typeobj] = function def addCleanup(self, function, *args, **kwargs): """Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success. Cleanup items are called even if setUp fails (unlike tearDown).""" self._cleanups.append((function, args, kwargs)) def setUp(self): "Hook method for setting up the test fixture before exercising it." @classmethod def setUpClass(cls): "Hook method for setting up class fixture before running tests in the class." 
@classmethod def tearDownClass(cls): "Hook method for deconstructing the class fixture after running all tests in the class." def tearDown(self): "Hook method for deconstructing the test fixture after testing it." def countTestCases(self): return 1 def defaultTestResult(self): return result.TestResult() def shortDescription(self): """Returns a one-line description of the test, or None if no description has been provided. The default implementation of this method returns the first line of the specified test method's docstring. """ doc = self._testMethodDoc return doc and doc.split("\n")[0].strip() or None def id(self): return "%s.%s" % (strclass(self.__class__), self._testMethodName) def __eq__(self, other): if type(self) is not type(other): return NotImplemented return self._testMethodName == other._testMethodName def __ne__(self, other): return not self == other def __hash__(self): return hash((type(self), self._testMethodName)) def __str__(self): return "%s (%s)" % (self._testMethodName, strclass(self.__class__)) def __repr__(self): return "<%s testMethod=%s>" % \ (strclass(self.__class__), self._testMethodName) def _addSkip(self, result, reason): addSkip = getattr(result, 'addSkip', None) if addSkip is not None: addSkip(self, reason) else: warnings.warn("Use of a TestResult without an addSkip method is deprecated", DeprecationWarning, 2) result.addSuccess(self) def run(self, result=None): orig_result = result if result is None: result = self.defaultTestResult() startTestRun = getattr(result, 'startTestRun', None) if startTestRun is not None: startTestRun() self._resultForDoCleanups = result result.startTest(self) testMethod = getattr(self, self._testMethodName) if (getattr(self.__class__, "__unittest_skip__", False) or getattr(testMethod, "__unittest_skip__", False)): # If the class or method was skipped. 
try: skip_why = (getattr(self.__class__, '__unittest_skip_why__', '') or getattr(testMethod, '__unittest_skip_why__', '')) self._addSkip(result, skip_why) finally: result.stopTest(self) return try: success = False try: self.setUp() except SkipTest, e: self._addSkip(result, str(e)) except Exception: result.addError(self, sys.exc_info()) else: try: testMethod() except self.failureException: result.addFailure(self, sys.exc_info()) except _ExpectedFailure, e: addExpectedFailure = getattr(result, 'addExpectedFailure', None) if addExpectedFailure is not None: addExpectedFailure(self, e.exc_info) else: warnings.warn("Use of a TestResult without an addExpectedFailure method is deprecated", DeprecationWarning) result.addSuccess(self) except _UnexpectedSuccess: addUnexpectedSuccess = getattr(result, 'addUnexpectedSuccess', None) if addUnexpectedSuccess is not None: addUnexpectedSuccess(self) else: warnings.warn("Use of a TestResult without an addUnexpectedSuccess method is deprecated", DeprecationWarning) result.addFailure(self, sys.exc_info()) except SkipTest, e: self._addSkip(result, str(e)) except Exception: result.addError(self, sys.exc_info()) else: success = True try: self.tearDown() except Exception: result.addError(self, sys.exc_info()) success = False cleanUpSuccess = self.doCleanups() success = success and cleanUpSuccess if success: result.addSuccess(self) finally: result.stopTest(self) if orig_result is None: stopTestRun = getattr(result, 'stopTestRun', None) if stopTestRun is not None: stopTestRun() def doCleanups(self): """Execute all cleanup functions. 
Normally called for you after tearDown.""" result = self._resultForDoCleanups ok = True while self._cleanups: function, args, kwargs = self._cleanups.pop(-1) try: function(*args, **kwargs) except Exception: ok = False result.addError(self, sys.exc_info()) return ok def __call__(self, *args, **kwds): return self.run(*args, **kwds) def debug(self): """Run the test without collecting errors in a TestResult""" self.setUp() getattr(self, self._testMethodName)() self.tearDown() while self._cleanups: function, args, kwargs = self._cleanups.pop(-1) function(*args, **kwargs) def skipTest(self, reason): """Skip this test.""" raise SkipTest(reason) def fail(self, msg=None): """Fail immediately, with the given message.""" raise self.failureException(msg) def assertFalse(self, expr, msg=None): "Fail the test if the expression is true." if expr: msg = self._formatMessage(msg, "%s is not False" % safe_repr(expr)) raise self.failureException(msg) def assertTrue(self, expr, msg=None): """Fail the test unless the expression is true.""" if not expr: msg = self._formatMessage(msg, "%s is not True" % safe_repr(expr)) raise self.failureException(msg) def _formatMessage(self, msg, standardMsg): """Honour the longMessage attribute when generating failure messages. If longMessage is False this means: * Use only an explicit message if it is provided * Otherwise use the standard message for the assert If longMessage is True: * Use the standard message * If an explicit message is provided, plus ' : ' and the explicit message """ if not self.longMessage: return msg or standardMsg if msg is None: return standardMsg try: return '%s : %s' % (standardMsg, msg) except UnicodeDecodeError: return '%s : %s' % (safe_str(standardMsg), safe_str(msg)) def assertRaises(self, excClass, callableObj=None, *args, **kwargs): """Fail unless an exception of class excClass is thrown by callableObj when invoked with arguments args and keyword arguments kwargs. 
If a different type of exception is thrown, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception. If called with callableObj omitted or None, will return a context object used like this:: with self.assertRaises(SomeException): do_something() The context manager keeps a reference to the exception as the 'exception' attribute. This allows you to inspect the exception after the assertion:: with self.assertRaises(SomeException) as cm: do_something() the_exception = cm.exception self.assertEqual(the_exception.error_code, 3) """ if callableObj is None: return _AssertRaisesContext(excClass, self) try: callableObj(*args, **kwargs) except excClass: return if hasattr(excClass,'__name__'): excName = excClass.__name__ else: excName = str(excClass) raise self.failureException, "%s not raised" % excName def _getAssertEqualityFunc(self, first, second): """Get a detailed comparison function for the types of the two args. Returns: A callable accepting (first, second, msg=None) that will raise a failure exception if first != second with a useful human readable error message for those types. """ # # NOTE(gregory.p.smith): I considered isinstance(first, type(second)) # and vice versa. I opted for the conservative approach in case # subclasses are not intended to be compared in detail to their super # class instances using a type equality func. This means testing # subtypes won't automagically use the detailed comparison. Callers # should use their type specific assertSpamEqual method to compare # subclasses if the detailed comparison is desired and appropriate. # See the discussion in http://bugs.python.org/issue2578. 
# if type(first) is type(second): asserter = self._type_equality_funcs.get(type(first)) if asserter is not None: return asserter return self._baseAssertEqual def _baseAssertEqual(self, first, second, msg=None): """The default assertEqual implementation, not type specific.""" if not first == second: standardMsg = '%s != %s' % (safe_repr(first), safe_repr(second)) msg = self._formatMessage(msg, standardMsg) raise self.failureException(msg) def assertEqual(self, first, second, msg=None): """Fail if the two objects are unequal as determined by the '==' operator. """ assertion_func = self._getAssertEqualityFunc(first, second) assertion_func(first, second, msg=msg) def assertNotEqual(self, first, second, msg=None): """Fail if the two objects are equal as determined by the '==' operator. """ if not first != second: msg = self._formatMessage(msg, '%s == %s' % (safe_repr(first), safe_repr(second))) raise self.failureException(msg) def assertAlmostEqual(self, first, second, places=None, msg=None, delta=None): """Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by checking that the difference between the two objects is more than the given delta. Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit). If the two objects compare equal then they will automatically compare almost equal. 
""" if first == second: # shortcut return if delta is not None and places is not None: raise TypeError("specify delta or places not both") if delta is not None: if abs(first - second) <= delta: return standardMsg = '%s != %s within %s delta' % (safe_repr(first), safe_repr(second), safe_repr(delta)) else: if places is None: places = 7 if round(abs(second-first), places) == 0: return standardMsg = '%s != %s within %r places' % (safe_repr(first), safe_repr(second), places) msg = self._formatMessage(msg, standardMsg) raise self.failureException(msg) def assertNotAlmostEqual(self, first, second, places=None, msg=None, delta=None): """Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by checking that the difference between the two objects is less than the given delta. Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit). Objects that are equal automatically fail. """ if delta is not None and places is not None: raise TypeError("specify delta or places not both") if delta is not None: if not (first == second) and abs(first - second) > delta: return standardMsg = '%s == %s within %s delta' % (safe_repr(first), safe_repr(second), safe_repr(delta)) else: if places is None: places = 7 if not (first == second) and round(abs(second-first), places) != 0: return standardMsg = '%s == %s within %r places' % (safe_repr(first), safe_repr(second), places) msg = self._formatMessage(msg, standardMsg) raise self.failureException(msg) # Synonyms for assertion methods # The plurals are undocumented. Keep them that way to discourage use. # Do not add more. Do not remove. # Going through a deprecation cycle on these would annoy many people. 
assertEquals = assertEqual assertNotEquals = assertNotEqual assertAlmostEquals = assertAlmostEqual assertNotAlmostEquals = assertNotAlmostEqual assert_ = assertTrue # These fail* assertion method names are pending deprecation and will # be a DeprecationWarning in 3.2; http://bugs.python.org/issue2578 def _deprecate(original_func): def deprecated_func(*args, **kwargs): warnings.warn( ('Please use %s instead.' % original_func.__name__), PendingDeprecationWarning, 2) return original_func(*args, **kwargs) return deprecated_func failUnlessEqual = _deprecate(assertEqual) failIfEqual = _deprecate(assertNotEqual) failUnlessAlmostEqual = _deprecate(assertAlmostEqual) failIfAlmostEqual = _deprecate(assertNotAlmostEqual) failUnless = _deprecate(assertTrue) failUnlessRaises = _deprecate(assertRaises) failIf = _deprecate(assertFalse) def assertSequenceEqual(self, seq1, seq2, msg=None, seq_type=None, max_diff=80*8): """An equality assertion for ordered sequences (like lists and tuples). For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator. Args: seq1: The first sequence to compare. seq2: The second sequence to compare. seq_type: The expected datatype of the sequences, or None if no datatype should be enforced. msg: Optional message to use on failure instead of a list of differences. max_diff: Maximum size of the diff; larger diffs are not shown """ if seq_type is not None: seq_type_name = seq_type.__name__ if not isinstance(seq1, seq_type): raise self.failureException('First sequence is not a %s: %s' % (seq_type_name, safe_repr(seq1))) if not isinstance(seq2, seq_type): raise self.failureException('Second sequence is not a %s: %s' % (seq_type_name, safe_repr(seq2))) else: seq_type_name = "sequence" differing = None try: len1 = len(seq1) except (TypeError, NotImplementedError): differing = 'First %s has no length. Non-sequence?' 
% ( seq_type_name) if differing is None: try: len2 = len(seq2) except (TypeError, NotImplementedError): differing = 'Second %s has no length. Non-sequence?' % ( seq_type_name) if differing is None: if seq1 == seq2: return seq1_repr = repr(seq1) seq2_repr = repr(seq2) if len(seq1_repr) > 30: seq1_repr = seq1_repr[:30] + '...' if len(seq2_repr) > 30: seq2_repr = seq2_repr[:30] + '...' elements = (seq_type_name.capitalize(), seq1_repr, seq2_repr) differing = '%ss differ: %s != %s\n' % elements for i in xrange(min(len1, len2)): try: item1 = seq1[i] except (TypeError, IndexError, NotImplementedError): differing += ('\nUnable to index element %d of first %s\n' % (i, seq_type_name)) break try: item2 = seq2[i] except (TypeError, IndexError, NotImplementedError): differing += ('\nUnable to index element %d of second %s\n' % (i, seq_type_name)) break if item1 != item2: differing += ('\nFirst differing element %d:\n%s\n%s\n' % (i, item1, item2)) break else: if (len1 == len2 and seq_type is None and type(seq1) != type(seq2)): # The sequences are the same, but have differing types. 
return if len1 > len2: differing += ('\nFirst %s contains %d additional ' 'elements.\n' % (seq_type_name, len1 - len2)) try: differing += ('First extra element %d:\n%s\n' % (len2, seq1[len2])) except (TypeError, IndexError, NotImplementedError): differing += ('Unable to index element %d ' 'of first %s\n' % (len2, seq_type_name)) elif len1 < len2: differing += ('\nSecond %s contains %d additional ' 'elements.\n' % (seq_type_name, len2 - len1)) try: differing += ('First extra element %d:\n%s\n' % (len1, seq2[len1])) except (TypeError, IndexError, NotImplementedError): differing += ('Unable to index element %d ' 'of second %s\n' % (len1, seq_type_name)) standardMsg = differing diffMsg = '\n' + '\n'.join( difflib.ndiff(pprint.pformat(seq1).splitlines(), pprint.pformat(seq2).splitlines())) standardMsg = self._truncateMessage(standardMsg, diffMsg) msg = self._formatMessage(msg, standardMsg) self.fail(msg) def _truncateMessage(self, message, diff): max_diff = self.maxDiff if max_diff is None or len(diff) <= max_diff: return message + diff return message + (DIFF_OMITTED % len(diff)) def assertListEqual(self, list1, list2, msg=None): """A list-specific equality assertion. Args: list1: The first list to compare. list2: The second list to compare. msg: Optional message to use on failure instead of a list of differences. """ self.assertSequenceEqual(list1, list2, msg, seq_type=list) def assertTupleEqual(self, tuple1, tuple2, msg=None): """A tuple-specific equality assertion. Args: tuple1: The first tuple to compare. tuple2: The second tuple to compare. msg: Optional message to use on failure instead of a list of differences. """ self.assertSequenceEqual(tuple1, tuple2, msg, seq_type=tuple) def assertSetEqual(self, set1, set2, msg=None): """A set-specific equality assertion. Args: set1: The first set to compare. set2: The second set to compare. msg: Optional message to use on failure instead of a list of differences. 
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method). """ try: difference1 = set1.difference(set2) except TypeError, e: self.fail('invalid type when attempting set difference: %s' % e) except AttributeError, e: self.fail('first argument does not support set difference: %s' % e) try: difference2 = set2.difference(set1) except TypeError, e: self.fail('invalid type when attempting set difference: %s' % e) except AttributeError, e: self.fail('second argument does not support set difference: %s' % e) if not (difference1 or difference2): return lines = [] if difference1: lines.append('Items in the first set but not the second:') for item in difference1: lines.append(repr(item)) if difference2: lines.append('Items in the second set but not the first:') for item in difference2: lines.append(repr(item)) standardMsg = '\n'.join(lines) self.fail(self._formatMessage(msg, standardMsg)) def assertIn(self, member, container, msg=None): """Just like self.assertTrue(a in b), but with a nicer default message.""" if member not in container: standardMsg = '%s not found in %s' % (safe_repr(member), safe_repr(container)) self.fail(self._formatMessage(msg, standardMsg)) def assertNotIn(self, member, container, msg=None): """Just like self.assertTrue(a not in b), but with a nicer default message.""" if member in container: standardMsg = '%s unexpectedly found in %s' % (safe_repr(member), safe_repr(container)) self.fail(self._formatMessage(msg, standardMsg)) def assertIs(self, expr1, expr2, msg=None): """Just like self.assertTrue(a is b), but with a nicer default message.""" if expr1 is not expr2: standardMsg = '%s is not %s' % (safe_repr(expr1), safe_repr(expr2)) self.fail(self._formatMessage(msg, standardMsg)) def assertIsNot(self, expr1, expr2, msg=None): """Just like self.assertTrue(a is not b), but with a nicer default message.""" if expr1 is expr2: standardMsg = 'unexpectedly identical: 
%s' % (safe_repr(expr1),) self.fail(self._formatMessage(msg, standardMsg)) def assertDictEqual(self, d1, d2, msg=None): self.assert_(isinstance(d1, dict), 'First argument is not a dictionary') self.assert_(isinstance(d2, dict), 'Second argument is not a dictionary') if d1 != d2: standardMsg = '%s != %s' % (safe_repr(d1, True), safe_repr(d2, True)) diff = ('\n' + '\n'.join(difflib.ndiff( pprint.pformat(d1).splitlines(), pprint.pformat(d2).splitlines()))) standardMsg = self._truncateMessage(standardMsg, diff) self.fail(self._formatMessage(msg, standardMsg)) def assertDictContainsSubset(self, expected, actual, msg=None): """Checks whether actual is a superset of expected.""" missing = [] mismatched = [] for key, value in expected.iteritems(): if key not in actual: missing.append(key) elif value != actual[key]: mismatched.append('%s, expected: %s, actual: %s' % (safe_repr(key), safe_repr(value), safe_repr(actual[key]))) if not (missing or mismatched): return standardMsg = '' if missing: standardMsg = 'Missing: %s' % ','.join(safe_repr(m) for m in missing) if mismatched: if standardMsg: standardMsg += '; ' standardMsg += 'Mismatched values: %s' % ','.join(mismatched) self.fail(self._formatMessage(msg, standardMsg)) def assertItemsEqual(self, expected_seq, actual_seq, msg=None): """An unordered sequence specific comparison. It asserts that expected_seq and actual_seq contain the same elements. It is the equivalent of:: self.assertEqual(sorted(expected_seq), sorted(actual_seq)) Raises with an error message listing which elements of expected_seq are missing from actual_seq and vice versa if any. Asserts that each element has the same count in both sequences. Example: - [0, 1, 1] and [1, 0, 1] compare equal. - [0, 0, 1] and [0, 1] compare unequal. """ try: expected = sorted(expected_seq) actual = sorted(actual_seq) except TypeError: # Unsortable items (example: set(), complex(), ...) 
expected = list(expected_seq) actual = list(actual_seq) missing, unexpected = unorderable_list_difference( expected, actual, ignore_duplicate=False ) else: return self.assertSequenceEqual(expected, actual, msg=msg) errors = [] if missing: errors.append('Expected, but missing:\n %s' % safe_repr(missing)) if unexpected: errors.append('Unexpected, but present:\n %s' % safe_repr(unexpected)) if errors: standardMsg = '\n'.join(errors) self.fail(self._formatMessage(msg, standardMsg)) def assertMultiLineEqual(self, first, second, msg=None): """Assert that two multi-line strings are equal.""" self.assert_(isinstance(first, basestring), ( 'First argument is not a string')) self.assert_(isinstance(second, basestring), ( 'Second argument is not a string')) if first != second: standardMsg = '%s != %s' % (safe_repr(first, True), safe_repr(second, True)) diff = '\n' + ''.join(difflib.ndiff(first.splitlines(True), second.splitlines(True))) standardMsg = self._truncateMessage(standardMsg, diff) self.fail(self._formatMessage(msg, standardMsg)) def assertLess(self, a, b, msg=None): """Just like self.assertTrue(a < b), but with a nicer default message.""" if not a < b: standardMsg = '%s not less than %s' % (safe_repr(a), safe_repr(b)) self.fail(self._formatMessage(msg, standardMsg)) def assertLessEqual(self, a, b, msg=None): """Just like self.assertTrue(a <= b), but with a nicer default message.""" if not a <= b: standardMsg = '%s not less than or equal to %s' % (safe_repr(a), safe_repr(b)) self.fail(self._formatMessage(msg, standardMsg)) def assertGreater(self, a, b, msg=None): """Just like self.assertTrue(a > b), but with a nicer default message.""" if not a > b: standardMsg = '%s not greater than %s' % (safe_repr(a), safe_repr(b)) self.fail(self._formatMessage(msg, standardMsg)) def assertGreaterEqual(self, a, b, msg=None): """Just like self.assertTrue(a >= b), but with a nicer default message.""" if not a >= b: standardMsg = '%s not greater than or equal to %s' % (safe_repr(a), 
safe_repr(b)) self.fail(self._formatMessage(msg, standardMsg)) def assertIsNone(self, obj, msg=None): """Same as self.assertTrue(obj is None), with a nicer default message.""" if obj is not None: standardMsg = '%s is not None' % (safe_repr(obj),) self.fail(self._formatMessage(msg, standardMsg)) def assertIsNotNone(self, obj, msg=None): """Included for symmetry with assertIsNone.""" if obj is None: standardMsg = 'unexpectedly None' self.fail(self._formatMessage(msg, standardMsg)) def assertIsInstance(self, obj, cls, msg=None): """Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.""" if not isinstance(obj, cls): standardMsg = '%s is not an instance of %r' % (safe_repr(obj), cls) self.fail(self._formatMessage(msg, standardMsg)) def assertNotIsInstance(self, obj, cls, msg=None): """Included for symmetry with assertIsInstance.""" if isinstance(obj, cls): standardMsg = '%s is an instance of %r' % (safe_repr(obj), cls) self.fail(self._formatMessage(msg, standardMsg)) def assertRaisesRegexp(self, expected_exception, expected_regexp, callable_obj=None, *args, **kwargs): """Asserts that the message in a raised exception matches a regexp. Args: expected_exception: Exception class expected to be raised. expected_regexp: Regexp (re pattern object or string) expected to be found in error message. callable_obj: Function to be called. args: Extra args. kwargs: Extra kwargs. 
""" if callable_obj is None: return _AssertRaisesContext(expected_exception, self, expected_regexp) try: callable_obj(*args, **kwargs) except expected_exception, exc_value: if isinstance(expected_regexp, basestring): expected_regexp = re.compile(expected_regexp) if not expected_regexp.search(str(exc_value)): raise self.failureException('"%s" does not match "%s"' % (expected_regexp.pattern, str(exc_value))) else: if hasattr(expected_exception, '__name__'): excName = expected_exception.__name__ else: excName = str(expected_exception) raise self.failureException, "%s not raised" % excName def assertRegexpMatches(self, text, expected_regexp, msg=None): """Fail the test unless the text matches the regular expression.""" if isinstance(expected_regexp, basestring): expected_regexp = re.compile(expected_regexp) if not expected_regexp.search(text): msg = msg or "Regexp didn't match" msg = '%s: %r not found in %r' % (msg, expected_regexp.pattern, text) raise self.failureException(msg) def assertNotRegexpMatches(self, text, unexpected_regexp, msg=None): """Fail the test if the text matches the regular expression.""" if isinstance(unexpected_regexp, basestring): unexpected_regexp = re.compile(unexpected_regexp) match = unexpected_regexp.search(text) if match: msg = msg or "Regexp matched" msg = '%s: %r matches %r in %r' % (msg, text[match.start():match.end()], unexpected_regexp.pattern, text) raise self.failureException(msg) class FunctionTestCase(TestCase): """A test case that wraps a test function. This is useful for slipping pre-existing test functions into the unittest framework. Optionally, set-up and tidy-up functions can be supplied. As with TestCase, the tidy-up ('tearDown') function will always be called if the set-up ('setUp') function ran successfully. 
""" def __init__(self, testFunc, setUp=None, tearDown=None, description=None): super(FunctionTestCase, self).__init__() self._setUpFunc = setUp self._tearDownFunc = tearDown self._testFunc = testFunc self._description = description def setUp(self): if self._setUpFunc is not None: self._setUpFunc() def tearDown(self): if self._tearDownFunc is not None: self._tearDownFunc() def runTest(self): self._testFunc() def id(self): return self._testFunc.__name__ def __eq__(self, other): if not isinstance(other, self.__class__): return NotImplemented return self._setUpFunc == other._setUpFunc and \ self._tearDownFunc == other._tearDownFunc and \ self._testFunc == other._testFunc and \ self._description == other._description def __ne__(self, other): return not self == other def __hash__(self): return hash((type(self), self._setUpFunc, self._tearDownFunc, self._testFunc, self._description)) def __str__(self): return "%s (%s)" % (strclass(self.__class__), self._testFunc.__name__) def __repr__(self): return "<%s testFunc=%s>" % (strclass(self.__class__), self._testFunc) def shortDescription(self): if self._description is not None: return self._description doc = self._testFunc.__doc__ return doc and doc.split("\n")[0].strip() or None chemfp-1.1p1/tests/unittest2/collector.py0000644000077000000240000000044111660452125020721 0ustar dalkestaff00000000000000import os import sys from unittest2.loader import defaultTestLoader def collector(): # import __main__ triggers code re-execution __main__ = sys.modules['__main__'] setupDir = os.path.abspath(os.path.dirname(__main__.__file__)) return defaultTestLoader.discover(setupDir) chemfp-1.1p1/tests/unittest2/compatibility.py0000644000077000000240000000406411660452125021611 0ustar dalkestaff00000000000000import os import sys try: from functools import wraps except ImportError: # only needed for Python 2.4 def wraps(_): def _wraps(func): return func return _wraps __unittest = True def _relpath_nt(path, start=os.path.curdir): """Return a 
relative version of a path""" if not path: raise ValueError("no path specified") start_list = os.path.abspath(start).split(os.path.sep) path_list = os.path.abspath(path).split(os.path.sep) if start_list[0].lower() != path_list[0].lower(): unc_path, rest = os.path.splitunc(path) unc_start, rest = os.path.splitunc(start) if bool(unc_path) ^ bool(unc_start): raise ValueError("Cannot mix UNC and non-UNC paths (%s and %s)" % (path, start)) else: raise ValueError("path is on drive %s, start on drive %s" % (path_list[0], start_list[0])) # Work out how much of the filepath is shared by start and path. for i in range(min(len(start_list), len(path_list))): if start_list[i].lower() != path_list[i].lower(): break else: i += 1 rel_list = [os.path.pardir] * (len(start_list)-i) + path_list[i:] if not rel_list: return os.path.curdir return os.path.join(*rel_list) # default to posixpath definition def _relpath_posix(path, start=os.path.curdir): """Return a relative version of a path""" if not path: raise ValueError("no path specified") start_list = os.path.abspath(start).split(os.path.sep) path_list = os.path.abspath(path).split(os.path.sep) # Work out how much of the filepath is shared by start and path. 
i = len(os.path.commonprefix([start_list, path_list])) rel_list = [os.path.pardir] * (len(start_list)-i) + path_list[i:] if not rel_list: return os.path.curdir return os.path.join(*rel_list) if os.path is sys.modules.get('ntpath'): relpath = _relpath_nt else: relpath = _relpath_posix chemfp-1.1p1/tests/unittest2/loader.py0000644000077000000240000003215511660452125020210 0ustar dalkestaff00000000000000"""Loading unittests.""" import os import re import sys import traceback import types import unittest from fnmatch import fnmatch from unittest2 import case, suite try: from os.path import relpath except ImportError: from unittest2.compatibility import relpath __unittest = True def _CmpToKey(mycmp): 'Convert a cmp= function into a key= function' class K(object): def __init__(self, obj): self.obj = obj def __lt__(self, other): return mycmp(self.obj, other.obj) == -1 return K # what about .pyc or .pyo (etc) # we would need to avoid loading the same tests multiple times # from '.py', '.pyc' *and* '.pyo' VALID_MODULE_NAME = re.compile(r'[_a-z]\w*\.py$', re.IGNORECASE) def _make_failed_import_test(name, suiteClass): message = 'Failed to import test module: %s' % name if hasattr(traceback, 'format_exc'): # Python 2.3 compatibility # format_exc returns two frames of discover.py as well message += '\n%s' % traceback.format_exc() return _make_failed_test('ModuleImportFailure', name, ImportError(message), suiteClass) def _make_failed_load_tests(name, exception, suiteClass): return _make_failed_test('LoadTestsFailure', name, exception, suiteClass) def _make_failed_test(classname, methodname, exception, suiteClass): def testFailure(self): raise exception attrs = {methodname: testFailure} TestClass = type(classname, (case.TestCase,), attrs) return suiteClass((TestClass(methodname),)) class TestLoader(unittest.TestLoader): """ This class is responsible for loading tests according to various criteria and returning them wrapped in a TestSuite """ testMethodPrefix = 'test' 
sortTestMethodsUsing = cmp suiteClass = suite.TestSuite _top_level_dir = None def loadTestsFromTestCase(self, testCaseClass): """Return a suite of all test cases contained in testCaseClass""" if issubclass(testCaseClass, suite.TestSuite): raise TypeError("Test cases should not be derived from TestSuite." " Maybe you meant to derive from TestCase?") testCaseNames = self.getTestCaseNames(testCaseClass) if not testCaseNames and hasattr(testCaseClass, 'runTest'): testCaseNames = ['runTest'] loaded_suite = self.suiteClass(map(testCaseClass, testCaseNames)) return loaded_suite def loadTestsFromModule(self, module, use_load_tests=True): """Return a suite of all test cases contained in the given module""" tests = [] for name in dir(module): obj = getattr(module, name) if isinstance(obj, type) and issubclass(obj, unittest.TestCase): tests.append(self.loadTestsFromTestCase(obj)) load_tests = getattr(module, 'load_tests', None) tests = self.suiteClass(tests) if use_load_tests and load_tests is not None: try: return load_tests(self, tests, None) except Exception, e: return _make_failed_load_tests(module.__name__, e, self.suiteClass) return tests def loadTestsFromName(self, name, module=None): """Return a suite of all test cases given a string specifier. The name may resolve either to a module, a test case class, a test method within a test case class, or a callable object which returns a TestCase or TestSuite instance. The method optionally resolves the names relative to a given module. 
""" parts = name.split('.') if module is None: parts_copy = parts[:] while parts_copy: try: module = __import__('.'.join(parts_copy)) break except ImportError: del parts_copy[-1] if not parts_copy: raise parts = parts[1:] obj = module for part in parts: parent, obj = obj, getattr(obj, part) if isinstance(obj, types.ModuleType): return self.loadTestsFromModule(obj) elif isinstance(obj, type) and issubclass(obj, unittest.TestCase): return self.loadTestsFromTestCase(obj) elif (isinstance(obj, types.UnboundMethodType) and isinstance(parent, type) and issubclass(parent, case.TestCase)): return self.suiteClass([parent(obj.__name__)]) elif isinstance(obj, unittest.TestSuite): return obj elif hasattr(obj, '__call__'): test = obj() if isinstance(test, unittest.TestSuite): return test elif isinstance(test, unittest.TestCase): return self.suiteClass([test]) else: raise TypeError("calling %s returned %s, not a test" % (obj, test)) else: raise TypeError("don't know how to make test from: %s" % obj) def loadTestsFromNames(self, names, module=None): """Return a suite of all tests cases found using the given sequence of string specifiers. See 'loadTestsFromName()'. """ suites = [self.loadTestsFromName(name, module) for name in names] return self.suiteClass(suites) def getTestCaseNames(self, testCaseClass): """Return a sorted sequence of method names found within testCaseClass """ def isTestMethod(attrname, testCaseClass=testCaseClass, prefix=self.testMethodPrefix): return attrname.startswith(prefix) and \ hasattr(getattr(testCaseClass, attrname), '__call__') testFnNames = filter(isTestMethod, dir(testCaseClass)) if self.sortTestMethodsUsing: testFnNames.sort(key=_CmpToKey(self.sortTestMethodsUsing)) return testFnNames def discover(self, start_dir, pattern='test*.py', top_level_dir=None): """Find and return all test modules from the specified start directory, recursing into subdirectories to find them. Only test files that match the pattern will be loaded. 
(Using shell style pattern matching.) All test modules must be importable from the top level of the project. If the start directory is not the top level directory then the top level directory must be specified separately. If a test package name (directory with '__init__.py') matches the pattern then the package will be checked for a 'load_tests' function. If this exists then it will be called with loader, tests, pattern. If load_tests exists then discovery does *not* recurse into the package, load_tests is responsible for loading all tests in the package. The pattern is deliberately not stored as a loader attribute so that packages can continue discovery themselves. top_level_dir is stored so load_tests does not need to pass this argument in to loader.discover(). """ set_implicit_top = False if top_level_dir is None and self._top_level_dir is not None: # make top_level_dir optional if called from load_tests in a package top_level_dir = self._top_level_dir elif top_level_dir is None: set_implicit_top = True top_level_dir = start_dir top_level_dir = os.path.abspath(top_level_dir) if not top_level_dir in sys.path: # all test modules must be importable from the top level directory # should we *unconditionally* put the start directory in first # in sys.path to minimise likelihood of conflicts between installed # modules and development versions? 
sys.path.insert(0, top_level_dir) self._top_level_dir = top_level_dir is_not_importable = False if os.path.isdir(os.path.abspath(start_dir)): start_dir = os.path.abspath(start_dir) if start_dir != top_level_dir: is_not_importable = not os.path.isfile(os.path.join(start_dir, '__init__.py')) else: # support for discovery from dotted module names try: __import__(start_dir) except ImportError: is_not_importable = True else: the_module = sys.modules[start_dir] top_part = start_dir.split('.')[0] start_dir = os.path.abspath(os.path.dirname((the_module.__file__))) if set_implicit_top: self._top_level_dir = os.path.abspath(os.path.dirname(os.path.dirname(sys.modules[top_part].__file__))) sys.path.remove(top_level_dir) if is_not_importable: raise ImportError('Start directory is not importable: %r' % start_dir) tests = list(self._find_tests(start_dir, pattern)) return self.suiteClass(tests) def _get_name_from_path(self, path): path = os.path.splitext(os.path.normpath(path))[0] _relpath = relpath(path, self._top_level_dir) assert not os.path.isabs(_relpath), "Path must be within the project" assert not _relpath.startswith('..'), "Path must be within the project" name = _relpath.replace(os.path.sep, '.') return name def _get_module_from_name(self, name): __import__(name) return sys.modules[name] def _match_path(self, path, full_path, pattern): # override this method to use alternative matching strategy return fnmatch(path, pattern) def _find_tests(self, start_dir, pattern): """Used by discovery. 
Yields test suites it loads.""" paths = os.listdir(start_dir) for path in paths: full_path = os.path.join(start_dir, path) if os.path.isfile(full_path): if not VALID_MODULE_NAME.match(path): # valid Python identifiers only continue if not self._match_path(path, full_path, pattern): continue # if the test file matches, load it name = self._get_name_from_path(full_path) try: module = self._get_module_from_name(name) except: yield _make_failed_import_test(name, self.suiteClass) else: mod_file = os.path.abspath(getattr(module, '__file__', full_path)) realpath = os.path.splitext(mod_file)[0] fullpath_noext = os.path.splitext(full_path)[0] if realpath.lower() != fullpath_noext.lower(): module_dir = os.path.dirname(realpath) mod_name = os.path.splitext(os.path.basename(full_path))[0] expected_dir = os.path.dirname(full_path) msg = ("%r module incorrectly imported from %r. Expected %r. " "Is this module globally installed?") raise ImportError(msg % (mod_name, module_dir, expected_dir)) yield self.loadTestsFromModule(module) elif os.path.isdir(full_path): if not os.path.isfile(os.path.join(full_path, '__init__.py')): continue load_tests = None tests = None if fnmatch(path, pattern): # only check load_tests if the package directory itself matches the filter name = self._get_name_from_path(full_path) package = self._get_module_from_name(name) load_tests = getattr(package, 'load_tests', None) tests = self.loadTestsFromModule(package, use_load_tests=False) if load_tests is None: if tests is not None: # tests loaded from package file yield tests # recurse into the package for test in self._find_tests(full_path, pattern): yield test else: try: yield load_tests(self, tests, pattern) except Exception, e: yield _make_failed_load_tests(package.__name__, e, self.suiteClass) defaultTestLoader = TestLoader() def _makeLoader(prefix, sortUsing, suiteClass=None): loader = TestLoader() loader.sortTestMethodsUsing = sortUsing loader.testMethodPrefix = prefix if suiteClass: loader.suiteClass 
= suiteClass return loader def getTestCaseNames(testCaseClass, prefix, sortUsing=cmp): return _makeLoader(prefix, sortUsing).getTestCaseNames(testCaseClass) def makeSuite(testCaseClass, prefix='test', sortUsing=cmp, suiteClass=suite.TestSuite): return _makeLoader(prefix, sortUsing, suiteClass).loadTestsFromTestCase(testCaseClass) def findTestCases(module, prefix='test', sortUsing=cmp, suiteClass=suite.TestSuite): return _makeLoader(prefix, sortUsing, suiteClass).loadTestsFromModule(module) chemfp-1.1p1/tests/unittest2/main.py0000644000077000000240000002225311660452125017664 0ustar dalkestaff00000000000000"""Unittest main program""" import sys import os import types from unittest2 import loader, runner try: from unittest2.signals import installHandler except ImportError: installHandler = None __unittest = True FAILFAST = " -f, --failfast Stop on first failure\n" CATCHBREAK = " -c, --catch Catch control-C and display results\n" BUFFEROUTPUT = " -b, --buffer Buffer stdout and stderr during test runs\n" USAGE_AS_MAIN = """\ Usage: %(progName)s [options] [tests] Options: -h, --help Show this message -v, --verbose Verbose output -q, --quiet Minimal output %(failfast)s%(catchbreak)s%(buffer)s Examples: %(progName)s test_module - run tests from test_module %(progName)s test_module.TestClass - run tests from test_module.TestClass %(progName)s test_module.TestClass.test_method - run specified test method [tests] can be a list of any number of test modules, classes and test methods. Alternative Usage: %(progName)s discover [options] Options: -v, --verbose Verbose output %(failfast)s%(catchbreak)s%(buffer)s -s directory Directory to start discovery ('.' default) -p pattern Pattern to match test files ('test*.py' default) -t directory Top level directory of project (default to start directory) For test discovery all test modules must be importable from the top level directory of the project. """ USAGE_FROM_MODULE = """\ Usage: %(progName)s [options] [test] [...] 
Options: -h, --help Show this message -v, --verbose Verbose output -q, --quiet Minimal output %(failfast)s%(catchbreak)s%(buffer)s Examples: %(progName)s - run default set of tests %(progName)s MyTestSuite - run suite 'MyTestSuite' %(progName)s MyTestCase.testSomething - run MyTestCase.testSomething %(progName)s MyTestCase - run all 'test*' test methods in MyTestCase """ class TestProgram(object): """A command-line program that runs a set of tests; this is primarily for making test modules conveniently executable. """ USAGE = USAGE_FROM_MODULE # defaults for testing failfast = catchbreak = buffer = progName = None def __init__(self, module='__main__', defaultTest=None, argv=None, testRunner=None, testLoader=loader.defaultTestLoader, exit=True, verbosity=1, failfast=None, catchbreak=None, buffer=None): if isinstance(module, basestring): self.module = __import__(module) for part in module.split('.')[1:]: self.module = getattr(self.module, part) else: self.module = module if argv is None: argv = sys.argv self.exit = exit self.verbosity = verbosity self.failfast = failfast self.catchbreak = catchbreak self.buffer = buffer self.defaultTest = defaultTest self.testRunner = testRunner self.testLoader = testLoader self.progName = os.path.basename(argv[0]) self.parseArgs(argv) self.runTests() def usageExit(self, msg=None): if msg: print msg usage = {'progName': self.progName, 'catchbreak': '', 'failfast': '', 'buffer': ''} if self.failfast != False: usage['failfast'] = FAILFAST if self.catchbreak != False and installHandler is not None: usage['catchbreak'] = CATCHBREAK if self.buffer != False: usage['buffer'] = BUFFEROUTPUT print self.USAGE % usage sys.exit(2) def parseArgs(self, argv): if len(argv) > 1 and argv[1].lower() == 'discover': self._do_discovery(argv[2:]) return import getopt long_opts = ['help', 'verbose', 'quiet', 'failfast', 'catch', 'buffer'] try: options, args = getopt.getopt(argv[1:], 'hHvqfcb', long_opts) for opt, value in options: if opt in 
('-h','-H','--help'): self.usageExit() if opt in ('-q','--quiet'): self.verbosity = 0 if opt in ('-v','--verbose'): self.verbosity = 2 if opt in ('-f','--failfast'): if self.failfast is None: self.failfast = True # Should this raise an exception if -f is not valid? if opt in ('-c','--catch'): if self.catchbreak is None and installHandler is not None: self.catchbreak = True # Should this raise an exception if -c is not valid? if opt in ('-b','--buffer'): if self.buffer is None: self.buffer = True # Should this raise an exception if -b is not valid? if len(args) == 0 and self.defaultTest is None: # createTests will load tests from self.module self.testNames = None elif len(args) > 0: self.testNames = args if __name__ == '__main__': # to support python -m unittest ... self.module = None else: self.testNames = (self.defaultTest,) self.createTests() except getopt.error, msg: self.usageExit(msg) def createTests(self): if self.testNames is None: self.test = self.testLoader.loadTestsFromModule(self.module) else: self.test = self.testLoader.loadTestsFromNames(self.testNames, self.module) def _do_discovery(self, argv, Loader=loader.TestLoader): # handle command line args for test discovery self.progName = '%s discover' % self.progName import optparse parser = optparse.OptionParser() parser.prog = self.progName parser.add_option('-v', '--verbose', dest='verbose', default=False, help='Verbose output', action='store_true') if self.failfast != False: parser.add_option('-f', '--failfast', dest='failfast', default=False, help='Stop on first fail or error', action='store_true') if self.catchbreak != False and installHandler is not None: parser.add_option('-c', '--catch', dest='catchbreak', default=False, help='Catch ctrl-C and display results so far', action='store_true') if self.buffer != False: parser.add_option('-b', '--buffer', dest='buffer', default=False, help='Buffer stdout and stderr during tests', action='store_true') parser.add_option('-s', '--start-directory', 
dest='start', default='.', help="Directory to start discovery ('.' default)") parser.add_option('-p', '--pattern', dest='pattern', default='test*.py', help="Pattern to match tests ('test*.py' default)") parser.add_option('-t', '--top-level-directory', dest='top', default=None, help='Top level directory of project (defaults to start directory)') options, args = parser.parse_args(argv) if len(args) > 3: self.usageExit() for name, value in zip(('start', 'pattern', 'top'), args): setattr(options, name, value) # only set options from the parsing here # if they weren't set explicitly in the constructor if self.failfast is None: self.failfast = options.failfast if self.catchbreak is None and installHandler is not None: self.catchbreak = options.catchbreak if self.buffer is None: self.buffer = options.buffer if options.verbose: self.verbosity = 2 start_dir = options.start pattern = options.pattern top_level_dir = options.top loader = Loader() self.test = loader.discover(start_dir, pattern, top_level_dir) def runTests(self): if self.catchbreak: installHandler() if self.testRunner is None: self.testRunner = runner.TextTestRunner if isinstance(self.testRunner, (type, types.ClassType)): try: testRunner = self.testRunner(verbosity=self.verbosity, failfast=self.failfast, buffer=self.buffer) except TypeError: # didn't accept the verbosity, buffer or failfast arguments testRunner = self.testRunner() else: # it is assumed to be a TestRunner instance testRunner = self.testRunner self.result = testRunner.run(self.test) if self.exit: sys.exit(not self.result.wasSuccessful()) main = TestProgram def main_(): TestProgram.USAGE = USAGE_AS_MAIN main(module=None) chemfp-1.1p1/tests/unittest2/result.py0000644000077000000240000001376511660452125020266 0ustar dalkestaff00000000000000"""Test result object""" import sys import traceback import unittest from StringIO import StringIO from unittest2 import util from unittest2.compatibility import wraps __unittest = True def failfast(method): 
@wraps(method) def inner(self, *args, **kw): if getattr(self, 'failfast', False): self.stop() return method(self, *args, **kw) return inner STDOUT_LINE = '\nStdout:\n%s' STDERR_LINE = '\nStderr:\n%s' class TestResult(unittest.TestResult): """Holder for test result information. Test results are automatically managed by the TestCase and TestSuite classes, and do not need to be explicitly manipulated by writers of tests. Each instance holds the total number of tests run, and collections of failures and errors that occurred among those test runs. The collections contain tuples of (testcase, exceptioninfo), where exceptioninfo is the formatted traceback of the error that occurred. """ _previousTestClass = None _moduleSetUpFailed = False def __init__(self): self.failfast = False self.failures = [] self.errors = [] self.testsRun = 0 self.skipped = [] self.expectedFailures = [] self.unexpectedSuccesses = [] self.shouldStop = False self.buffer = False self._stdout_buffer = None self._stderr_buffer = None self._original_stdout = sys.stdout self._original_stderr = sys.stderr self._mirrorOutput = False def startTest(self, test): "Called when the given test is about to be run" self.testsRun += 1 self._mirrorOutput = False if self.buffer: if self._stderr_buffer is None: self._stderr_buffer = StringIO() self._stdout_buffer = StringIO() sys.stdout = self._stdout_buffer sys.stderr = self._stderr_buffer def startTestRun(self): """Called once before any tests are executed. See startTest for a method called before each test. 
""" def stopTest(self, test): """Called when the given test has been run""" if self.buffer: if self._mirrorOutput: output = sys.stdout.getvalue() error = sys.stderr.getvalue() if output: if not output.endswith('\n'): output += '\n' self._original_stdout.write(STDOUT_LINE % output) if error: if not error.endswith('\n'): error += '\n' self._original_stderr.write(STDERR_LINE % error) sys.stdout = self._original_stdout sys.stderr = self._original_stderr self._stdout_buffer.seek(0) self._stdout_buffer.truncate() self._stderr_buffer.seek(0) self._stderr_buffer.truncate() self._mirrorOutput = False def stopTestRun(self): """Called once after all tests are executed. See stopTest for a method called after each test. """ @failfast def addError(self, test, err): """Called when an error has occurred. 'err' is a tuple of values as returned by sys.exc_info(). """ self.errors.append((test, self._exc_info_to_string(err, test))) self._mirrorOutput = True @failfast def addFailure(self, test, err): """Called when an error has occurred. 
'err' is a tuple of values as returned by sys.exc_info().""" self.failures.append((test, self._exc_info_to_string(err, test))) self._mirrorOutput = True def addSuccess(self, test): "Called when a test has completed successfully" pass def addSkip(self, test, reason): """Called when a test is skipped.""" self.skipped.append((test, reason)) def addExpectedFailure(self, test, err): """Called when an expected failure/error occurred.""" self.expectedFailures.append( (test, self._exc_info_to_string(err, test))) @failfast def addUnexpectedSuccess(self, test): """Called when a test was expected to fail, but succeeded.""" self.unexpectedSuccesses.append(test) def wasSuccessful(self): "Tells whether or not this result was a success" return (len(self.failures) + len(self.errors) == 0) def stop(self): "Indicates that the tests should be aborted" self.shouldStop = True def _exc_info_to_string(self, err, test): """Converts a sys.exc_info()-style tuple of values into a string.""" exctype, value, tb = err # Skip test runner traceback levels while tb and self._is_relevant_tb_level(tb): tb = tb.tb_next if exctype is test.failureException: # Skip assert*() traceback levels length = self._count_relevant_tb_levels(tb) msgLines = traceback.format_exception(exctype, value, tb, length) else: msgLines = traceback.format_exception(exctype, value, tb) if self.buffer: output = sys.stdout.getvalue() error = sys.stderr.getvalue() if output: if not output.endswith('\n'): output += '\n' msgLines.append(STDOUT_LINE % output) if error: if not error.endswith('\n'): error += '\n' msgLines.append(STDERR_LINE % error) return ''.join(msgLines) def _is_relevant_tb_level(self, tb): return '__unittest' in tb.tb_frame.f_globals def _count_relevant_tb_levels(self, tb): length = 0 while tb and not self._is_relevant_tb_level(tb): length += 1 tb = tb.tb_next return length def __repr__(self): return "<%s run=%i errors=%i failures=%i>" % \ (util.strclass(self.__class__), self.testsRun, len(self.errors), 
len(self.failures)) chemfp-1.1p1/tests/unittest2/runner.py0000644000077000000240000001514511660452125020253 0ustar dalkestaff00000000000000"""Running tests""" import sys import time import unittest from unittest2 import result try: from unittest2.signals import registerResult except ImportError: def registerResult(_): pass __unittest = True class _WritelnDecorator(object): """Used to decorate file-like objects with a handy 'writeln' method""" def __init__(self,stream): self.stream = stream def __getattr__(self, attr): if attr in ('stream', '__getstate__'): raise AttributeError(attr) return getattr(self.stream,attr) def writeln(self, arg=None): if arg: self.write(arg) self.write('\n') # text-mode streams translate to \r\n if needed class TextTestResult(result.TestResult): """A test result class that can print formatted text results to a stream. Used by TextTestRunner. """ separator1 = '=' * 70 separator2 = '-' * 70 def __init__(self, stream, descriptions, verbosity): super(TextTestResult, self).__init__() self.stream = stream self.showAll = verbosity > 1 self.dots = verbosity == 1 self.descriptions = descriptions def getDescription(self, test): doc_first_line = test.shortDescription() if self.descriptions and doc_first_line: return '\n'.join((str(test), doc_first_line)) else: return str(test) def startTest(self, test): super(TextTestResult, self).startTest(test) if self.showAll: self.stream.write(self.getDescription(test)) self.stream.write(" ... 
") self.stream.flush() def addSuccess(self, test): super(TextTestResult, self).addSuccess(test) if self.showAll: self.stream.writeln("ok") elif self.dots: self.stream.write('.') self.stream.flush() def addError(self, test, err): super(TextTestResult, self).addError(test, err) if self.showAll: self.stream.writeln("ERROR") elif self.dots: self.stream.write('E') self.stream.flush() def addFailure(self, test, err): super(TextTestResult, self).addFailure(test, err) if self.showAll: self.stream.writeln("FAIL") elif self.dots: self.stream.write('F') self.stream.flush() def addSkip(self, test, reason): super(TextTestResult, self).addSkip(test, reason) if self.showAll: self.stream.writeln("skipped %r" % (reason,)) elif self.dots: self.stream.write("s") self.stream.flush() def addExpectedFailure(self, test, err): super(TextTestResult, self).addExpectedFailure(test, err) if self.showAll: self.stream.writeln("expected failure") elif self.dots: self.stream.write("x") self.stream.flush() def addUnexpectedSuccess(self, test): super(TextTestResult, self).addUnexpectedSuccess(test) if self.showAll: self.stream.writeln("unexpected success") elif self.dots: self.stream.write("u") self.stream.flush() def printErrors(self): if self.dots or self.showAll: self.stream.writeln() self.printErrorList('ERROR', self.errors) self.printErrorList('FAIL', self.failures) def printErrorList(self, flavour, errors): for test, err in errors: self.stream.writeln(self.separator1) self.stream.writeln("%s: %s" % (flavour, self.getDescription(test))) self.stream.writeln(self.separator2) self.stream.writeln("%s" % err) def stopTestRun(self): super(TextTestResult, self).stopTestRun() self.printErrors() class TextTestRunner(unittest.TextTestRunner): """A test runner class that displays results in textual form. It prints out the names of tests as they are run, errors as they occur, and a summary of the results at the end of the test run. 
""" resultclass = TextTestResult def __init__(self, stream=sys.stderr, descriptions=True, verbosity=1, failfast=False, buffer=False, resultclass=None): self.stream = _WritelnDecorator(stream) self.descriptions = descriptions self.verbosity = verbosity self.failfast = failfast self.buffer = buffer if resultclass is not None: self.resultclass = resultclass def _makeResult(self): return self.resultclass(self.stream, self.descriptions, self.verbosity) def run(self, test): "Run the given test case or test suite." result = self._makeResult() result.failfast = self.failfast result.buffer = self.buffer registerResult(result) startTime = time.time() startTestRun = getattr(result, 'startTestRun', None) if startTestRun is not None: startTestRun() try: test(result) finally: stopTestRun = getattr(result, 'stopTestRun', None) if stopTestRun is not None: stopTestRun() else: result.printErrors() stopTime = time.time() timeTaken = stopTime - startTime if hasattr(result, 'separator2'): self.stream.writeln(result.separator2) run = result.testsRun self.stream.writeln("Ran %d test%s in %.3fs" % (run, run != 1 and "s" or "", timeTaken)) self.stream.writeln() expectedFails = unexpectedSuccesses = skipped = 0 try: results = map(len, (result.expectedFailures, result.unexpectedSuccesses, result.skipped)) expectedFails, unexpectedSuccesses, skipped = results except AttributeError: pass infos = [] if not result.wasSuccessful(): self.stream.write("FAILED") failed, errored = map(len, (result.failures, result.errors)) if failed: infos.append("failures=%d" % failed) if errored: infos.append("errors=%d" % errored) else: self.stream.write("OK") if skipped: infos.append("skipped=%d" % skipped) if expectedFails: infos.append("expected failures=%d" % expectedFails) if unexpectedSuccesses: infos.append("unexpected successes=%d" % unexpectedSuccesses) if infos: self.stream.writeln(" (%s)" % (", ".join(infos),)) else: self.stream.write("\n") return result 
chemfp-1.1p1/tests/unittest2/signals.py0000644000077000000240000000322411660452125020375 0ustar dalkestaff00000000000000
import signal
import weakref

from unittest2.compatibility import wraps

__unittest = True


class _InterruptHandler(object):
    def __init__(self, default_handler):
        self.called = False
        self.default_handler = default_handler

    def __call__(self, signum, frame):
        installed_handler = signal.getsignal(signal.SIGINT)
        if installed_handler is not self:
            # if we aren't the installed handler, then delegate immediately
            # to the default handler
            self.default_handler(signum, frame)

        if self.called:
            self.default_handler(signum, frame)
        self.called = True
        for result in _results.keys():
            result.stop()

_results = weakref.WeakKeyDictionary()
def registerResult(result):
    _results[result] = 1

def removeResult(result):
    return bool(_results.pop(result, None))

_interrupt_handler = None
def installHandler():
    global _interrupt_handler
    if _interrupt_handler is None:
        default_handler = signal.getsignal(signal.SIGINT)
        _interrupt_handler = _InterruptHandler(default_handler)
        signal.signal(signal.SIGINT, _interrupt_handler)


def removeHandler(method=None):
    if method is not None:
        @wraps(method)
        def inner(*args, **kwargs):
            initial = signal.getsignal(signal.SIGINT)
            removeHandler()
            try:
                return method(*args, **kwargs)
            finally:
                signal.signal(signal.SIGINT, initial)
        return inner

    global _interrupt_handler
    if _interrupt_handler is not None:
        signal.signal(signal.SIGINT, _interrupt_handler.default_handler)

chemfp-1.1p1/tests/unittest2/suite.py0000644000077000000240000002230611660452125020070 0ustar dalkestaff00000000000000
"""TestSuite"""

import sys
import unittest
from unittest2 import case, util

__unittest = True


class BaseTestSuite(unittest.TestSuite):
    """A simple test suite that doesn't provide class or module shared fixtures.
    """
    def __init__(self, tests=()):
        self._tests = []
        self.addTests(tests)

    def __repr__(self):
        return "<%s tests=%s>" % (util.strclass(self.__class__), list(self))

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return list(self) == list(other)

    def __ne__(self, other):
        return not self == other

    # Can't guarantee hash invariant, so flag as unhashable
    __hash__ = None

    def __iter__(self):
        return iter(self._tests)

    def countTestCases(self):
        cases = 0
        for test in self:
            cases += test.countTestCases()
        return cases

    def addTest(self, test):
        # sanity checks
        if not hasattr(test, '__call__'):
            raise TypeError("%r is not callable" % (repr(test),))
        if isinstance(test, type) and issubclass(test,
                                                 (case.TestCase, TestSuite)):
            raise TypeError("TestCases and TestSuites must be instantiated "
                            "before passing them to addTest()")
        self._tests.append(test)

    def addTests(self, tests):
        if isinstance(tests, basestring):
            raise TypeError("tests must be an iterable of tests, not a string")
        for test in tests:
            self.addTest(test)

    def run(self, result):
        for test in self:
            if result.shouldStop:
                break
            test(result)
        return result

    def __call__(self, *args, **kwds):
        return self.run(*args, **kwds)

    def debug(self):
        """Run the tests without collecting errors in a TestResult"""
        for test in self:
            test.debug()


class TestSuite(BaseTestSuite):
    """A test suite is a composite test consisting of a number of TestCases.

    For use, create an instance of TestSuite, then add test case instances.
    When all tests have been added, the suite can be passed to a test
    runner, such as TextTestRunner. It will run the individual test cases
    in the order in which they were added, aggregating the results. When
    subclassing, do not forget to call the base class constructor.
    """

    def run(self, result):
        self._wrapped_run(result)
        self._tearDownPreviousClass(None, result)
        self._handleModuleTearDown(result)
        return result

    def debug(self):
        """Run the tests without collecting errors in a TestResult"""
        debug = _DebugResult()
        self._wrapped_run(debug, True)
        self._tearDownPreviousClass(None, debug)
        self._handleModuleTearDown(debug)

    ################################
    # private methods
    def _wrapped_run(self, result, debug=False):
        for test in self:
            if result.shouldStop:
                break

            if _isnotsuite(test):
                self._tearDownPreviousClass(test, result)
                self._handleModuleFixture(test, result)
                self._handleClassSetUp(test, result)
                result._previousTestClass = test.__class__

                if (getattr(test.__class__, '_classSetupFailed', False) or
                    getattr(result, '_moduleSetUpFailed', False)):
                    continue

            if hasattr(test, '_wrapped_run'):
                test._wrapped_run(result, debug)
            elif not debug:
                test(result)
            else:
                test.debug()

    def _handleClassSetUp(self, test, result):
        previousClass = getattr(result, '_previousTestClass', None)
        currentClass = test.__class__
        if currentClass == previousClass:
            return
        if result._moduleSetUpFailed:
            return
        if getattr(currentClass, "__unittest_skip__", False):
            return

        try:
            currentClass._classSetupFailed = False
        except TypeError:
            # test may actually be a function
            # so its class will be a builtin-type
            pass

        setUpClass = getattr(currentClass, 'setUpClass', None)
        if setUpClass is not None:
            try:
                setUpClass()
            except Exception, e:
                if isinstance(result, _DebugResult):
                    raise
                currentClass._classSetupFailed = True
                className = util.strclass(currentClass)
                errorName = 'setUpClass (%s)' % className
                self._addClassOrModuleLevelException(result, e, errorName)

    def _get_previous_module(self, result):
        previousModule = None
        previousClass = getattr(result, '_previousTestClass', None)
        if previousClass is not None:
            previousModule = previousClass.__module__
        return previousModule

    def _handleModuleFixture(self, test, result):
        previousModule = self._get_previous_module(result)
        currentModule = test.__class__.__module__
        if currentModule == previousModule:
            return

        self._handleModuleTearDown(result)

        result._moduleSetUpFailed = False
        try:
            module = sys.modules[currentModule]
        except KeyError:
            return
        setUpModule = getattr(module, 'setUpModule', None)
        if setUpModule is not None:
            try:
                setUpModule()
            except Exception, e:
                if isinstance(result, _DebugResult):
                    raise
                result._moduleSetUpFailed = True
                errorName = 'setUpModule (%s)' % currentModule
                self._addClassOrModuleLevelException(result, e, errorName)

    def _addClassOrModuleLevelException(self, result, exception, errorName):
        error = _ErrorHolder(errorName)
        addSkip = getattr(result, 'addSkip', None)
        if addSkip is not None and isinstance(exception, case.SkipTest):
            addSkip(error, str(exception))
        else:
            result.addError(error, sys.exc_info())

    def _handleModuleTearDown(self, result):
        previousModule = self._get_previous_module(result)
        if previousModule is None:
            return
        if result._moduleSetUpFailed:
            return

        try:
            module = sys.modules[previousModule]
        except KeyError:
            return

        tearDownModule = getattr(module, 'tearDownModule', None)
        if tearDownModule is not None:
            try:
                tearDownModule()
            except Exception, e:
                if isinstance(result, _DebugResult):
                    raise
                errorName = 'tearDownModule (%s)' % previousModule
                self._addClassOrModuleLevelException(result, e, errorName)

    def _tearDownPreviousClass(self, test, result):
        previousClass = getattr(result, '_previousTestClass', None)
        currentClass = test.__class__
        if currentClass == previousClass:
            return
        if getattr(previousClass, '_classSetupFailed', False):
            return
        if getattr(result, '_moduleSetUpFailed', False):
            return
        if getattr(previousClass, "__unittest_skip__", False):
            return

        tearDownClass = getattr(previousClass, 'tearDownClass', None)
        if tearDownClass is not None:
            try:
                tearDownClass()
            except Exception, e:
                if isinstance(result, _DebugResult):
                    raise
                className = util.strclass(previousClass)
                errorName = 'tearDownClass (%s)' % className
                self._addClassOrModuleLevelException(result, e, errorName)


class _ErrorHolder(object):
    """
    Placeholder for a TestCase inside a result. As far as a TestResult
    is concerned, this looks exactly like a unit test. Used to insert
    arbitrary errors into a test suite run.
    """
    # Inspired by the ErrorHolder from Twisted:
    # http://twistedmatrix.com/trac/browser/trunk/twisted/trial/runner.py

    # attribute used by TestResult._exc_info_to_string
    failureException = None

    def __init__(self, description):
        self.description = description

    def id(self):
        return self.description

    def shortDescription(self):
        return None

    def __repr__(self):
        return "<ErrorHolder description=%r>" % (self.description,)

    def __str__(self):
        return self.id()

    def run(self, result):
        # could call result.addError(...) - but this test-like object
        # shouldn't be run anyway
        pass

    def __call__(self, result):
        return self.run(result)

    def countTestCases(self):
        return 0

def _isnotsuite(test):
    "A crude way to tell apart testcases and suites with duck-typing"
    try:
        iter(test)
    except TypeError:
        return True
    return False


class _DebugResult(object):
    "Used by the TestSuite to hold previous class when running in debug."
    _previousTestClass = None
    _moduleSetUpFailed = False
    shouldStop = False

chemfp-1.1p1/tests/unittest2/util.py0000644000077000000240000000540511660452125017715 0ustar dalkestaff00000000000000
"""Various utility functions."""

__unittest = True


_MAX_LENGTH = 80
def safe_repr(obj, short=False):
    try:
        result = repr(obj)
    except Exception:
        result = object.__repr__(obj)
    if not short or len(result) < _MAX_LENGTH:
        return result
    return result[:_MAX_LENGTH] + ' [truncated]...'

def safe_str(obj):
    try:
        return str(obj)
    except Exception:
        return object.__str__(obj)

def strclass(cls):
    return "%s.%s" % (cls.__module__, cls.__name__)

def sorted_list_difference(expected, actual):
    """Finds elements in only one or the other of two, sorted input lists.

    Returns a two-element tuple of lists.
    The first list contains those elements in the "expected" list but not
    in the "actual" list, and the second contains those elements in the
    "actual" list but not in the "expected" list. Duplicate elements in
    either input list are ignored.
    """
    i = j = 0
    missing = []
    unexpected = []
    while True:
        try:
            e = expected[i]
            a = actual[j]
            if e < a:
                missing.append(e)
                i += 1
                while expected[i] == e:
                    i += 1
            elif e > a:
                unexpected.append(a)
                j += 1
                while actual[j] == a:
                    j += 1
            else:
                i += 1
                try:
                    while expected[i] == e:
                        i += 1
                finally:
                    j += 1
                    while actual[j] == a:
                        j += 1
        except IndexError:
            missing.extend(expected[i:])
            unexpected.extend(actual[j:])
            break
    return missing, unexpected

def unorderable_list_difference(expected, actual, ignore_duplicate=False):
    """Same behavior as sorted_list_difference but
    for lists of unorderable items (like dicts).

    As it does a linear search per item (remove) it
    has O(n*n) performance.
    """
    missing = []
    unexpected = []
    while expected:
        item = expected.pop()
        try:
            actual.remove(item)
        except ValueError:
            missing.append(item)
        if ignore_duplicate:
            for lst in expected, actual:
                try:
                    while True:
                        lst.remove(item)
                except ValueError:
                    pass
    if ignore_duplicate:
        while actual:
            item = actual.pop()
            unexpected.append(item)
            try:
                while True:
                    actual.remove(item)
            except ValueError:
                pass
        return missing, unexpected

    # anything left in actual is unexpected
    return missing, actual

chemfp-1.1p1/THANKS0000644000077000000240000000445012071432403014167 0ustar dalkestaff00000000000000
The thoughts, ideas, help, and support of many people went into
making this project. In no particular order they are:

Noel O'Boyle: I used his cinfony code as the base to understand how
the different projects generate fingerprints. He also encouraged me to
develop these tools and put them out to the world.

Geoff Hutchison for pointing out that he most often works with small
data sets, and human-readable/text formats are more important than a
binary format which is designed for million-compound data sets.
(Thinking about this was the source for developing both a text and a
binary format.)

The OpenBabel developers, for developing a widely used tool in
cheminformatics with both hash and substructure fingerprints.

Greg Landrum, for RDKit and for discussions about what's needed in a
fingerprint format. I hope some day to include some features he needs,
like sparse and count fingerprints. RDKit's MACCS definitions are
(excepting minor portability tweaks) the definitions used for the
RDMACCS patterns. Greg's definitions are also used in CDK and
OpenBabel. Greg also contributed additional support for RDKit
fingerprints and reported bugs.

OpenEye, for a non-commercial license to their tools.

Phil Evans, for being the first external person to test the code and
give feedback about installation and Python 2.5 compatibility
problems.

Evan Bolton and Wolf-Dietrich Ihlenfeldt for their help in
understanding the PubChem/CACTVS substructure keys.

Rajarshi Guha for implementing the PubChem/CACTVS substructure keys in
CDK (and the anonymous person at the NIH Chemical Genomics Center who
developed the precursor version). The CDK's independent implementation
helped me cross-check my implementation. He also added support for the
FPS format in his fingerprint analysis tools.

Dmitry Pavlov for insights into how Indigo works and improvements in
the Indigo implementations.

Roche, for helping fund part of the project, and especially to Jérôme
Hert, for his work on getting chemfp working for their needs and in
helping track down a subtle memory bug.

Kim Walisch, for suggestions and code for faster popcount
implementations.

To Chris Morely, Jörg Kurt Wegner, Phil Evans, Björn Grüning, Andrew
Henry, Brian McClain and others for patches, bug reports, and
feedback.

To you, for reading these thanks and using the code.

chemfp-1.1p1/TODO0000644000077000000240000000341711660452123013752 0ustar dalkestaff00000000000000
Include Indigo and CDK support

I use an 8-bit lookup table for the popcount because it's portable and
pretty fast. On some machines a 16-bit LUT is faster. It's also
possible that the GCC __builtin_popcount is faster, or one of the
approaches which make use of Intel-specific assembly instructions.
Should there be a way to toggle which one to use? What about a way to
get a popcount function pointer specialized to, say, 64-bit aligned
2048 bit fingerprints?

A "structure2fps" command-line tool which uses metadata from an
existing fingerprint file to generate new fingerprints. The goal is to
support:

  structure2fps --reference targets.fps query.sdf | simsearch targets.fps

Add support for Tversky, Euclidean and other search algorithms

Add support for a "max_score" in doing searches. (This comes from
looking at Pipeline Pilot, which has a default upper limit of 0.999.
But frankly, I'm not sure that's useful. I think the point is to
ignore identical hits, so perhaps that's the real requirement.)

Work on the pattern definition format. Should it have a distinction
between "unique" and "non-unique" matches? Eg, C=* with count >= 2 is
the same as C=* (non-unique) with count >= 3, and the latter is likely
faster.

I think people will want a way to specify a pattern file as input.
Perhaps "--patterns"? How do I assign those a type name?

There should be some way to get a description of each set bit in the
pattern fingerprints. What about a way to list all of the match atoms
which go into a different bit? That will take a lot of new code, and
many of the fingerprinters don't support it. Perhaps limit it to the
pattern fingerprints?

Add other fingerprints? Eg, the PaDEL substructure descriptors are
available as a list of SMARTS, and would be easy to add as a pattern
definition.