pax_global_header00006660000000000000000000000064124150561770014522gustar00rootroot0000000000000052 comment=8cd571f6f48c5d86d07571ff04d8382320a8a658 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/000077500000000000000000000000001241505617700177465ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/LICENSES.txt000066400000000000000000000031221241505617700217120ustar00rootroot00000000000000Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Pacific Biosciences nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/Makefile000066400000000000000000000020251241505617700214050ustar00rootroot00000000000000SHELL = /bin/bash -e all: build install build: python setup.py build --executable="/usr/bin/env python" bdist: python setup.py build --executable="/usr/bin/env python" python setup.py bdist --formats=egg install: python setup.py install develop: python setup.py develop test: # Unit tests find tests/unit -name "*.py" | xargs nosetests # End-to-end tests @echo pbalign cram tests require blasr installed. find tests/cram -name "*.t" | xargs cram doc: sphinx-apidoc -T -f -o doc src/ && cd doc && make html docs: doc doc-clean: rm -f doc/*.html clean: doc-clean rm -rf dist/ build/ *.egg-info rm -rf doc/_build find . 
-name "*.pyc" | xargs rm -f rm -rf dist/ rm -f nostests.xml pip-install: @which pip > /dev/null @pip freeze|grep 'pbalign=='>/dev/null \ && ( pip uninstall -y pbalign \ || pip uninstall -y pbtools.pbalign ) \ || true @pip install --no-index \ --install-option="--install-scripts=$(PREFIX)/bin" \ ./ .PHONY: all build bdist install develop test doc clean pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/README.md000066400000000000000000000012501241505617700212230ustar00rootroot00000000000000###pbalign maps PacBio reads to reference sequences.### **Q: Want to know how to install and run pbalign?** A: Please refer to [pbalign readme document](https://github.com/PacificBiosciences/pbalign/blob/master/doc/howto.rst) **Q: 'pbalign.py' does not work?** A: The main script has been changed from 'pbalign.py' to 'pbalign'. Please use 'pbalign' instead. **Q: Can pbalign handle large datasets with many movies?** A: pbalign is not designed to handle large datasets, you should follow a [divide and conquer way](https://github.com/PacificBiosciences/pbalign/wiki/Tutorial:-How-to-divide-and-conquer-large-datasets-using-pbalign) to align many movies to a reference. pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/doc/000077500000000000000000000000001241505617700205135ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/doc/Makefile000066400000000000000000000127001241505617700221530ustar00rootroot00000000000000# Makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = SPHINXBUILD = sphinx-build PAPER = BUILDDIR = _build # Internal variables. PAPEROPT_a4 = -D latex_paper_size=a4 PAPEROPT_letter = -D latex_paper_size=letter ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . # the i18n builder cannot share the environment and doctrees with the others I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . 
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext help: @echo "Please use \`make ' where is one of" @echo " html to make standalone HTML files" @echo " dirhtml to make HTML files named index.html in directories" @echo " singlehtml to make a single large HTML file" @echo " pickle to make pickle files" @echo " json to make JSON files" @echo " htmlhelp to make HTML files and a HTML help project" @echo " qthelp to make HTML files and a qthelp project" @echo " devhelp to make HTML files and a Devhelp project" @echo " epub to make an epub" @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" @echo " latexpdf to make LaTeX files and run them through pdflatex" @echo " text to make text files" @echo " man to make manual pages" @echo " texinfo to make Texinfo files" @echo " info to make Texinfo files and run them through makeinfo" @echo " gettext to make PO message catalogs" @echo " changes to make an overview of all changed/added/deprecated items" @echo " linkcheck to check all external links for integrity" @echo " doctest to run all doctests embedded in the documentation (if enabled)" clean: -rm -rf $(BUILDDIR)/* html: $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." dirhtml: $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." singlehtml: $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml @echo @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." pickle: $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle @echo @echo "Build finished; now you can process the pickle files." json: $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json @echo @echo "Build finished; now you can process the JSON files." 
htmlhelp: $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp @echo @echo "Build finished; now you can run HTML Help Workshop with the" \ ".hhp project file in $(BUILDDIR)/htmlhelp." qthelp: $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp @echo @echo "Build finished; now you can run "qcollectiongenerator" with the" \ ".qhcp project file in $(BUILDDIR)/qthelp, like this:" @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/pbalign.qhcp" @echo "To view the help file:" @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/pbalign.qhc" devhelp: $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp @echo @echo "Build finished." @echo "To view the help file:" @echo "# mkdir -p $$HOME/.local/share/devhelp/pbalign" @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/pbalign" @echo "# devhelp" epub: $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub @echo @echo "Build finished. The epub file is in $(BUILDDIR)/epub." latex: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." @echo "Run \`make' in that directory to run these through (pdf)latex" \ "(use \`make latexpdf' here to do that automatically)." latexpdf: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo "Running LaTeX files through pdflatex..." $(MAKE) -C $(BUILDDIR)/latex all-pdf @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." text: $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text @echo @echo "Build finished. The text files are in $(BUILDDIR)/text." man: $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man @echo @echo "Build finished. The manual pages are in $(BUILDDIR)/man." texinfo: $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo @echo @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." @echo "Run \`make' in that directory to run these through makeinfo" \ "(use \`make info' here to do that automatically)." 
info: $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo @echo "Running Texinfo files through makeinfo..." make -C $(BUILDDIR)/texinfo info @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." gettext: $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale @echo @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." changes: $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes @echo @echo "The overview file is in $(BUILDDIR)/changes." linkcheck: $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck @echo @echo "Link check complete; look for any errors in the above output " \ "or in $(BUILDDIR)/linkcheck/output.txt." doctest: $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest @echo "Testing of doctests in the sources finished, look at the " \ "results in $(BUILDDIR)/doctest/output.txt." pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/doc/conf.py000077500000000000000000000175041241505617700220240ustar00rootroot00000000000000# -*- coding: utf-8 -*- # # pbalign documentation build configuration file, created by # sphinx-quickstart on Wed Jul 17 13:09:03 2013. # # This file is execfile()d with the current directory set to its containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. import sys, os # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. #sys.path.insert(0, os.path.abspath('.')) # -- General configuration ----------------------------------------------------- # If your documentation needs a minimal Sphinx version, state it here. #needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. 
They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = ['sphinx.ext.autodoc', 'sphinx.ext.doctest', 'sphinx.ext.intersphinx', 'sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.viewcode'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8-sig' # The master toctree document. master_doc = 'index' # General information about the project. project = u'pbalign' copyright = u'2013, 2013, pbiDevNet' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. version = '0.5' # The full version, including alpha/beta/rc tags. release = '0.5' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. #language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. #today_fmt = '%B %d, %Y' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = ['_build'] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). #add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. 
pygments_style = 'sphinx' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. html_theme = 'default' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. #html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [] # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". #html_title = None # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. #html_use_smartypants = True # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_domain_indices = True # If false, no index is generated. 
#html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. #html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. #html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = None # Output file base name for HTML help builder. htmlhelp_basename = 'pbaligndoc' # -- Options for LaTeX output -------------------------------------------------- latex_elements = { # The paper size ('letterpaper' or 'a4paper'). #'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). #'pointsize': '10pt', # Additional stuff for the LaTeX preamble. #'preamble': '', } # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]). latex_documents = [ ('index', 'pbalign.tex', u'pbalign Documentation', u'2013, pbiDevNet', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # If true, show page references after internal links. #latex_show_pagerefs = False # If true, show URL addresses after external links. #latex_show_urls = False # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. 
#latex_domain_indices = True # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', 'pbalign', u'pbalign Documentation', [u'2013, pbiDevNet'], 1) ] # If true, show URL addresses after external links. #man_show_urls = False # -- Options for Texinfo output ------------------------------------------------ # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ('index', 'pbalign', u'pbalign Documentation', u'2013, pbiDevNet', 'pbalign', 'One line description of project.', 'Miscellaneous'), ] # Documents to append as an appendix to all manuals. #texinfo_appendices = [] # If false, no module index is generated. #texinfo_domain_indices = True # How to display URL addresses: 'footnote', 'no', or 'inline'. #texinfo_show_urls = 'footnote' # Example configuration for intersphinx: refer to the Python standard library. intersphinx_mapping = {'http://docs.python.org/': None} pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/doc/howto.rst000066400000000000000000000317641241505617700224200ustar00rootroot00000000000000How to install and run pbalign ============================== pbalign is a tool for aligning PacBio reads to reference sequences. It is part of the PacBio Bioinformatics tools, and will be bundled in the 2.1 release of SMRTanalysis. You may also follow the instructions below to install pbalign. 
*Note: the pseudo namespace pbtools has been removed in version 0.2.0.* *Note: program name has been changed from `pbalign.py` in version 0.1.0 to `pbalign` in version 0.2.0.* *Note: please install this software on an isolated machine that does not have SMRTanalysis installed.* Background ---------- **pbalign** aligns PacBio reads to reference sequences, filters aligned reads according to user-specified filtering criteria, and converts the output to either the SAM format or PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file will be compatible with Quiver if the ``--forQuiver`` option is specified. Required software ----------------- pbalign is available through the ``pbalign`` script from the ``pbalign`` package. To use pbalign, the following PacBio software is required: - ``pbalign``, containing ``pbalign`` - ``pbcore``, a package providing access to PacBio data files - ``blasr``, a package of PacBio aligner blasr, containing C++ executables for processing PacBio data, such as ``blasr``, ``pls2fasta``, ``samFilter``, ``samtoh5`` and ``loadPulses`` The following software is required only if the ``--forQuiver`` option is used to make the output Compare HDF5 file compatible with Quiver. - ``pbh5tools.cmph5tools``, a PacBio Bioinformatics tool that manipulates Compare HDF5 files. - ``h5repack``, an HDF5 tool to compress and repack HDF5 files. The default aligner that pbalign uses is ``blasr``. If you want to use bowtie2 as the aligner, then the bowtie2 package also needs to be installed. Required libraries and tools ---------------------------- - Python 2.7.3 - virtualenv (builds isolated Python environments) - numpy 1.6.1 (required by pbcore) - h5py 2.0.1 (required by pbcore) If you are within PacBio, these requirements are already installed within the cluster environment. Otherwise, you will need to install them yourself. Data file requirements ---------------------- pbalign distinguishes input and output file formats by file extensions. 
The input PacBio reads can be in FASTA, Base HDF5, Pulse HDF5, Circular Consensus Sequence (CCS) HDF5 or file of file names (FOFN) format. The supported input file extensions are as follows. - FASTA : .fa or .fasta - PacBio BASE HDF5 : .bas.h5 or .bax.h5 - PacBio PULSE HDF5 : .pls.h5 or .plx.h5 - PacBio CCS HDF5 : .ccs.h5 - File of file names : .fofn The input reference sequences can be in a FASTA file or a reference deposit directory created by referenceUploader (a PacBio tool for uploading references to the server and data preprocessing). The output file can either be a SAM file or a Compare HDF5 file. The output Compare HDF5 file cannot be consumed by Quiver directly unless the ``--forQuiver`` option is specified. The supported output file extensions are as follows. - SAM : .sam - PacBio Compare HDF5: .cmp.h5 Manual installation instructions -------------------------------- Step 1: Set up your Python virtual environment ``````````````````````````````````````````````````` To install ``Python 2.7``, please visit :: http://www.python.org/ , or if you have root permission on Ubuntu, execute :: sudo apt-get install python To install ``pip``, please visit :: https://pypi.python.org/pypi/pip , or if you have root permission on Ubuntu, execute :: sudo apt-get install python-pip To install ``virtualenv``, please visit :: https://pypi.python.org/pypi/virtualenv , or execute :: pip install virtualenv To set up a new virtualenv, do :: $ cd; virtualenv -p python2.7 --no-site-packages my_env , and activate the virtualenv using :: $ source ~/my_env/bin/activate To install ``git``, please visit :: http://git-scm.com/. 
Step 2: Install required software and library ````````````````````````````````````````````` To install blasr, please execute :: $ git clone https://github.com/PacificBiosciences/blasr , and follow instructions at :: https://github.com/PacificBiosciences/blasr/blob/master/README.md Before installing pbcore, you may need to install numpy and h5py from :: http://www.numpy.org/ https://code.google.com/p/h5py/ , or if you have root permission on Ubuntu, do :: $ pip install numpy $ sudo apt-get install libhdf5-serial-dev $ pip install h5py To install pbcore, execute :: $ pip install git+https://github.com/PacificBiosciences/pbcore Step 3: Install optionally required software and library ```````````````````````````````````````````````````````` To install pbh5tools, execute :: $ pip install git+https://github.com/PacificBiosciences/pbh5tools To install ``HDF5 tools``, visit :: http://www.hdfgroup.org/products/hdf5_tools/ , or if you have root permission on Ubuntu, do :: $ sudo apt-get install hdf5-tools Step 4: Install pbalign ``````````````````````` To *uninstall* pbalign, execute :: $ pip uninstall pbalign To install pbalign, execute :: $ pip install git+https://github.com/PacificBiosciences/pbalign , or to download the whole pbalign package with examples :: $ git clone https://github.com/PacificBiosciences/pbalign.git $ cd pbalign $ pip install . Examples -------- (1) Basic usage of pbalign. - Example (1.1) :: $ pbalign tests/data/example_read.fasta \ tests/data/example_ref.fasta \ example.sam - Example (1.2) :: $ pbalign tests/data/example_read.fasta \ tests/data/example_ref.fasta \ example.cmp.h5 - Example (1.3) - with optional arguments :: $ pbalign --maxHits 10 --hitPolicy all \ tests/data/example_read.fasta \ tests/data/example_ref.fasta \ example.sam (2) Advanced usage of pbalign. 
- Example (2.1) - Import pre-defined options from a config file :: $ pbalign --configFile=tests/data/1.config \ tests/data/example_read.fasta \ tests/data/example_ref.fasta \ example.sam - Example (2.2) - Pass options through to aligner :: $ pbalign --algorithmOptions='-nCandidates 10 -sdpTupleSize 12' \ tests/data/example_read.fasta \ tests/data/example_ref.fasta \ example.sam - Example (2.3) - Create a cmp.h5 file with --forQuiver option :: # The output cmp.h5 file will be loaded with quality values (pulses) # from the input bas/bax.h5 file, sorted and repacked, and therefore # can be consumed by Quiver directly. (Note that in order to use # --forQuiver option, cmph5tools and h5repack are required.) $ pbalign --forQuiver your_movie.bas.h5 your_reference.fasta out.cmp.h5 (3) Use pbalign as a library through Python API. - Example (3.1) :: $ python >>> from pbalign.pbalignrunner import PBAlignRunner >>> # Specify arguments in a list. >>> args = ['--maxHits', '20', 'tests/data/example_read.fasta',\ ... 'tests/data/example_ref.fasta', 'example.sam'] >>> # Create a PBAlignRunner object. >>> a = PBAlignRunner(args) >>> # Execute. >>> exitCode = a.start() >>> # Show all files used. >>> print a.fileNames Usage ----- :: usage: pbalign [-h] [--verbose] [--version] [--profile] [--debug] [--regionTable REGIONTABLE] [--configFile CONFIGFILE] [--algorithm {blasr,bowtie}] [--maxHits MAXHITS] [--minAnchorSize MINANCHORSIZE] [--useccs {useccs,useccsall,useccsdenovo}] [--noSplitSubreads] [--nproc NPROC] [--algorithmOptions ALGORITHMOPTIONS] [--maxDivergence MAXDIVERGENCE] [--minAccuracy MINACCURACY] [--minLength MINLENGTH] [--scoreFunction {alignerscore,editdist,blasrscore}] [--scoreCutoff SCORECUTOFF] [--hitPolicy {randombest,allbest,random,all}] [--forQuiver] [--seed SEED] [--tmpDir TMPDIR] inputFileName referencePath outputFileName Mapping PacBio sequences to references using an algorithm selected from a selection of supported command-line alignment algorithms. 
Input can be a fasta, pls.h5, bas.h5 or ccs.h5 file or a fofn (file of file names). Output is in either cmp.h5 or sam format. positional arguments: inputFileName The input file can be a fasta, pls.h5, bas.h5, ccs.h5 file or a fofn. referencePath Either a reference fasta file or a reference repository. outputFileName The output cmp.h5 or sam file. optional arguments: -h, --help show this help message and exit --verbose, -v Set the verbosity level --version show program's version number and exit --profile Print runtime profile at exit --debug Run within a debugger session --regionTable REGIONTABLE Specify a region table for filtering reads. --configFile CONFIGFILE Specify a set of user-defined argument values. --algorithm {blasr,bowtie} Select an aligorithm from ('blasr', 'bowtie'). Default algorithm is blasr. --maxHits MAXHITS The maximum number of matches of each read to the reference sequence that will be evaluated. Default value is 10. --minAnchorSize MINANCHORSIZE The minimum anchor size defines the length of the read that must match against the reference sequence. Default value is 12. --useccs {useccs,useccsall,useccsdenovo} Map the ccsSequence to the genome first, then align subreads to the interval that the CCS reads mapped to. useccs: only maps subreads that span the length of the template. useccsall: maps all subreads. useccsdenovo: maps ccs only. --noSplitSubreads Do not split reads into subreads even if subread regions are available. Default value is False. --nproc NPROC Number of threads. Default value is 8. --algorithmOptions ALGORITHMOPTIONS Pass alignment options through. --maxDivergence MAXDIVERGENCE The maximum allowed percentage divergence of a read from the reference sequence. Default value is 30. --minAccuracy MINACCURACY The minimum percentage accuracy of alignments that will be evaluated. Default value is 70. --minLength MINLENGTH The minimum aligned read length of alignments that will be evaluated. Default value is 50. 
--scoreFunction {alignerscore,editdist,blasrscore} Specify a score function for evaluating alignments. alignerscore : aligner's score in the SAM tag 'as'. editdist : edit distance between read and reference. blasrscore : blasr's default score function. Default value is alignerscore. --scoreCutoff SCORECUTOFF The worst score to output an alignment. --hitPolicy {randombest,allbest,random,all} Specify a policy for how to treat multiple hit random : selects a random hit. all : selects all hits. allbest : selects all the best score hits. randombest: selects a random hit from all best alignment score hits. Default value is randombest. --forQuiver The output cmp.h5 file which will be sorted, loaded with pulse information, and repacked, so that it can be consumed by quiver directly. This requires the input file to be in PacBio bas/pls.h5 format. Default value is False. --seed SEED Initialize the random number generator with a none-zero integer. Zero means that current system time is used. Default value is 1. --tmpDir TMPDIR Specify a directory for saving temporary files. Default is /scratch. pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/doc/index.rst000066400000000000000000000004411241505617700223530ustar00rootroot00000000000000.. pbalign documentation master file, created by sphinx-quickstart on Wed Jul 17 13:09:03 2013. You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. pbalign ======= Contents: .. toctree:: :maxdepth: 3 howto pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/000077500000000000000000000000001241505617700213625ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/__init__.py000077500000000000000000000056751241505617700235130ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
############################################################################### # Author: Yuan Li """Initialization.""" from __future__ import absolute_import import os.path as op _changelist = "$Change: 141024 $" def _get_changelist(perforce_str): """Extract change list from p4 str""" import re rx = re.compile(r'Change: (\d+)') match = rx.search(perforce_str) if match is None: v = 'UnknownChangelist' else: try: v = int(match.group(1)) except (TypeError, IndexError): v = "UnknownChangelist" return v def get_changelist(): """Return changelist""" return _get_changelist(_changelist) def get_dir(): """Return lib directory.""" return op.dirname(op.realpath(__file__)) VERSION = (0, 2, 0, get_changelist()) def get_version(): """Return the version as a string. "O.7" This uses a major.minor Each python module of the system (e.g, butler, detective, siv_butler.py) will use this version + individual changelist. This allows top level versioning, and sub-component to be versioned based on a p4 changelist. .. note:: This should be improved to be compliant with PEP 386. """ return ".".join([str(i) for i in VERSION]) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/000077500000000000000000000000001241505617700240355ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/__init__.py000077500000000000000000000036071241505617700261570ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. 
# * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """Initialization.""" from __future__ import absolute_import pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/align.py000077500000000000000000000212541241505617700255100ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """This script defines class AlignService.""" from __future__ import absolute_import import logging from copy import copy from pbalign.options import importDefaultOptions from pbalign.utils.tempfileutil import TempFileManager from pbalign.service import Service class AlignService (Service): """Super class for all alignment services. 
AlignService takes argument options as input and generates a SAM file as output. Non-abstract subclasses should define the following properties. name : name of the subclass align service availability: availability of the subclass align service scoreSign : score sign of the subclass align service Subclasses should override the following virtual methods. _preProcess() _toCmd() _postProcess() If --algorithmOptions needs to be supported by a subclass, override _resolveAlgorithmOptions(). """ @property def scoreSign(self): """Align service score sign can be -1 or 1. -1: negative scores are better than positive ones. 1: positive scores are better than negative ones. """ raise NotImplementedError( "Virtual property scoreSign() for AlignService must be " + "overridden.") def _resolveAlgorithmOptions(self, options, fileNames): """A virtual method to resolve options specified within --algorithmOptions and options parsed from the command-line (including the config file). Input: options: options parsed from a command-line and a config file. fileNames: a PBAlignFiles object. Output: new options """ if options.algorithmOptions is None or options.algorithmOptions == "": return copy(options) raise NotImplementedError( "_resolveAlgorithmOptions() method for AlignService must be " + "overridden if --algorithmOptions is specified.") def __init__(self, options, fileNames, tempFileManager=None): """Initialize an AlignService object. Need to resolve options specified within algorithmOptions; patch default options if not specified by the user; inherit or initialize a temporary file manager. Input: options : options parsed from (a list of arguments and a config file if --configFile is specified). fileNames : an object of PBAlignFiles tempFileManager: a temporary file manager. If it is None, create a new temporary file manager. """ self._options = options # Verify and assign input & output files. 
self._fileNames = fileNames self._fileNames.SetInOutFiles(self._options.inputFileName, self._options.referencePath, self._options.outputFileName, self._options.regionTable, self._options.pulseFile) # Resolve options specified within --algorithmOptions with # options parsed from the argument list (e.g. the command-line) # or a config file. self._options = self._resolveAlgorithmOptions(self._options, self._fileNames) # Patch PBalign default options if they haven't been specified yet. self._options = importDefaultOptions(self._options)[0] if tempFileManager is None: self._tempFileManager = TempFileManager(self._options.tmpDir) else: self._tempFileManager = tempFileManager self._tempFileManager.SetRootDir(self._options.tmpDir) # self._options is finalized. logging.debug("Parsed arguments considering configFile and " + "algorithmOptions: " + str(self._options)) @property def cmd(self): """String of a command line to align reads.""" return self._toCmd(self._options, self._fileNames, self._tempFileManager) def _toCmd(self, options, fileNames, tempFileManager): """A virtual method to generate a command line string. Generate a command line of the aligner to use in bash based on options and PBAlignFiles. Input: options : arguments parsed from the command-line, the config file and --algorithmOptions. fileNames: a PBAlignFiles object. tempFileManager: temporary file manager. Output: a command-line string which can be used in bash. """ raise NotImplementedError( "_toCmd() method for AlignService must be overridden") def _preProcess(self, inputFileName, referenceFile, regionTable, noSplitSubreads, tempFileManager, isWithinRepository): """A virtual method to prepare inputs for the aligner. Input: inputFileName : a PacBio BASE/PULSE/FOFN file. referenceFile : a FASTA reference file. regionTable : a region table RGN.H5/FOFN file. noSplitSubreads: whether to split subreads or not. tempFileManager: temporary file manager. 
isWithinRepository: whether or not the reference is within a refererence repository. Output: String, a FASTA file which can be used by the aligner. """ raise NotImplementedError( "_preProcess() method for AlignService must be overridden") def _postProcess(self): """A virtual method to post process the generated output file. """ raise NotImplementedError( "_postProcess() method for AlignService must be overridden") def run(self): """AlignService starts to run. """ logging.info(self.name + ": Align reads to references using " + "{prog}.".format(prog=self.progName)) # Prepare inputs for the aligner. self._fileNames.queryFileName = self._preProcess( self._fileNames.inputFileName, self._fileNames.targetFileName, self._fileNames.regionTable, self._options.noSplitSubreads, self._tempFileManager, self._fileNames.isWithinRepository) self._fileNames.alignerSamOut = self._tempFileManager.\ RegisterNewTmpFile(suffix=".sam") # Generate and execute cmd. try: output, errCode, errMsg = self._execute() except RuntimeError as e: raise RuntimeError(str(e)) # Post process the results. self._postProcess() return output, errCode, errMsg pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/blasr.py000077500000000000000000000310771241505617700255250ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. 
# * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines class BlasrService which calls blasr to align reads.""" # Author: Yuan Li from __future__ import absolute_import from pbalign.alignservice.align import AlignService from pbalign.utils.fileutil import FILE_FORMATS, real_upath import logging class BlasrService(AlignService): """Class BlasrService calls blasr to align reads.""" @property def name(self): """Blasr Service name.""" return "BlasrService" @property def progName(self): """Program to call.""" return "blasr" @property def scoreSign(self): """score sign for blasr is -1, the lower the better.""" return -1 def __parseAlgorithmOptionItems(self, optionstr): """Given a string of algorithm options, reconstruct option items. First, split the string by white space, then reconstruct path with white spaces. 
""" items = optionstr.split(' ') ret = [] for index, item in enumerate(items): if item.endswith('\\'): item = '{x} '.format(x=item) if index > 0 and items[index-1].endswith('\\'): ret[-1] = "{x}{y}".format(x=ret[-1], y=item) else: ret.append(item) return ret def _resolveAlgorithmOptions(self, options, fileNames): """ Resolve options specified within --algorithmOptions with options parsed from the command-line or the config file. Return updated options. If find conflicting values of the following options, error out. (1) --maxHits and blasr -bestn (2) --maxAnchorSize and blasr -minMatch (3) --useccs and blasr -useccs/-useccsall/useccsdenovo If find conflicting values of sawriter, regionTable and nproc, it does not matter which value is used. Input: options : the original pbalign options from argumentList and configFile. fileNames: an PBAlignFiles object Output: new options by resolving options specified within --algorithmOptions and the original pbalign options """ if options.algorithmOptions is None: return options ignoredBinaryOptions = ['-m', '-out', '-V'] ignoredUnitaryOptions = ['-h', '--help', '--version', '-v', '-vv', '-sam'] items = self.__parseAlgorithmOptionItems(options.algorithmOptions) i = 0 try: while i < len(items): infoMsg, errMsg, item = "", "", items[i] if item == "-sa": val = real_upath(items[i+1]) if fileNames.sawriterFileName != val: infoMsg = "Over write sa file with {0}".format(val) fileNames.sawriterFileName = val elif item == "-regionTable": val = real_upath(items[i+1]) if fileNames.regionTable != val: infoMsg = "Over write regionTable with {0}.\n"\ .format(val) fileNames.regionTable = val elif item == "-bestn": val = int(items[i+1]) if options.maxHits is not None and \ int(options.maxHits) != val: errMsg = "blasr -bestn specified within " + \ "--algorithmOptions is equivalent to " + \ "--maxHits. Conflicting values of " + \ "--algorithmOptions '-bestn' and " +\ "--maxHits have been found." 
else: options.maxHits = val elif item == "-minMatch": val = int(items[i+1]) if options.minAnchorSize is not None and \ int(options.minAnchorSize) != val: errMsg = "blasr -minMatch specified within " + \ "--algorithmOptions is equivalent to " + \ "--minAnchorSize. Conflicting values " + \ "of --algorithmOptions '-minMatch' and " + \ "--minAnchorSize have been found." else: options.minAnchorSize = val elif item == "-nproc": val = int(items[i+1]) # The number of threads is not critical. if options.nproc is None or \ int(options.nproc) != val: infoMsg = "Overwrite nproc with {n}.".format(n=val) options.nproc = val elif item == "-noSplitSubreads": if not options.noSplitSubreads: infoMsg = "Overwrite noSplitSubreads with True." logging.info(self.name + ": Resolve algorithmOptions. " + infoMsg) options.noSplitSubreads = True del items[i] continue elif item == "-concordant": if not options.concordant: infoMsg = "Overwrite concordant with True." logging.info(self.name + ": Resolve algorithmOptions. " + infoMsg) options.concordant = True # -concordant takes no value: remove only this item and skip the # two-item deletion at the end of the loop body. del items[i] continue elif "-useccs" in item: # -useccs, -useccsall, -useccsdenovo val = item.lstrip('-') if options.useccs != val and options.useccs is not None: errMsg = "Found conflicting options in " + \ "--algorithmOptions '{v}' \nand --useccs={u}"\ .format(v=item, u=options.useccs) else: options.useccs = val elif item == "-seed" or item == "-randomSeed": val = int(items[i+1]) if options.seed is None or int(options.seed) != val: infoMsg = "Overwrite random seed with {0}.".format(val) options.seed = val elif item in ignoredBinaryOptions: pass elif item in ignoredUnitaryOptions: del items[i:i+1] continue else: i += 1 continue if errMsg != "": logging.error(errMsg) raise ValueError(errMsg) if infoMsg != "": logging.info(self.name + ": Resolve algorithmOptions. 
" + infoMsg) del items[i:i+2] except Exception as e: errMsg = "An error occured during parsing algorithmOptions " + \ "'{ao}': ".format(ao=options.algorithmOptions) logging.error(errMsg + str(e)) raise ValueError(errMsg + str(e)) # Update algorithmOptions when resolve is done options.algorithmOptions = " ".join(items) return options def _toCmd(self, options, fileNames, tempFileManager): """ Generate a command line for blasr based on options and PBAlignFiles, and return a command-line string which can be used in bash. Input: options : arguments parsed from the command-line, the config file and --algorithmOptions. fileNames: an PBAlignFiles object. tempFileManager: temporary file manager. Output: a command-line string which can be used in bash. """ cmdStr = "blasr {queryFile} {targetFile} -sam -out {outFile} ".format( queryFile=fileNames.queryFileName, targetFile=fileNames.targetFileName, outFile=fileNames.alignerSamOut) if ((fileNames.sawriterFileName is not None) and (fileNames.sawriterFileName != "")): cmdStr += " -sa {sawriter} ".format( sawriter=fileNames.sawriterFileName) if ((fileNames.regionTable != "") and (fileNames.regionTable is not None)): cmdStr += " -regionTable {regionTable} ".format( regionTable=fileNames.regionTable) if options.maxHits is not None and options.maxHits != "": cmdStr += " -bestn {n}".format(n=options.maxHits) if (options.minAnchorSize is not None and options.minAnchorSize != ""): cmdStr += " -minMatch {0} ".format(options.minAnchorSize) if options.nproc is not None and options.nproc != "": cmdStr += " -nproc {0} ".format(options.nproc) if options.minLength is not None: cmdStr += " -minSubreadLength {n} -minReadLength {n} ".\ format(n=options.minLength) if options.noSplitSubreads: cmdStr += " -noSplitSubreads " if options.concordant: cmdStr += " -concordant " if options.seed is not None and options.seed != 0: cmdStr += " -randomSeed {0} ".format(options.seed) if options.hitPolicy == "randombest": cmdStr += " -placeRepeatsRandomly " if 
options.useccs is not None and options.useccs != "": cmdStr += " -{0} ".format(options.useccs) # When input is a FASTA file, blasr -clipping = soft if fileNames.inputFileFormat == FILE_FORMATS.FASTA: cmdStr += " -clipping soft " if options.algorithmOptions is not None: cmdStr += " {0} ".format(options.algorithmOptions) return cmdStr def _preProcess(self, inputFileName, referenceFile=None, regionTable=None, noSplitSubreads=None, tempFileManager=None, isWithinRepository=None): """Preprocess input files. Input: inputFilieName : a PacBio BASE/PULSE/FOFN file. referenceFile : a FASTA reference file. regionTable : a region table RGN.H5/FOFN file. noSplitSubreads: whether to split subreads or not. tempFileManager: temporary file manager. Output: string, a file which can be used by blasr. """ # For blasr, nothing needs to be done, return the input PacBio # PULSE/BASE/FOFN reads directly. return inputFileName def _postProcess(self): """ Postprocess after alignment is done. """ logging.debug(self.name + ": Postprocess after alignment is done. ") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/bowtie.py000077500000000000000000000251651241505617700257140ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. 
# * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines class BowtieService, which uses bowtie to align reads.""" # Author: Yuan Li from __future__ import absolute_import from pbalign.alignservice.fastabasedalign import FastaBasedAlignService from os import path from pbcore.util.Process import backticks import logging def bt2BaseName(tempDir, refFile): """Return basename of bowtie2 index files. Input: tempDir: a temporary directory for saving bowtie2 index files. refFile: the reference sequence file. Output: string, the basename of bowtie2 index files. """ return path.join(tempDir, path.splitext(path.basename(refFile))[0]) def bt2IndexFiles(baseName): """Return a list of bowtie2 index files. Input: baseName: the base name of bowtie2 index files. Output: list of strings, bowtie2 index files. 
""" exts = ['1.bt2', '2.bt2', '3.bt2', '4.bt2', 'rev.1.bt2', 'rev.2.bt2'] return [".".join([baseName, ext]) for ext in exts] class BowtieService(FastaBasedAlignService): """BowtieService calls bowtie to align reads.""" @property def name(self): """Name of the service.""" return "BowtieService" @property def progName(self): """Program name.""" return "bowtie2" @property def scoreSign(self): """Score sign for bowtie2 is 1, the larger the better.""" return 1 def _resolveAlgorithmOptions(self, options, fileNames): """Resolve options specified within --algorithmOptions with options parsed from the command-line or the config file, and return updated options. Input: options : the original pbalign options from argumentList and configFile. fileNames: an PBAlignFiles object Output: new options built by resolving options specified within --algorithmOptions and the original pbalign options """ if options.algorithmOptions is None: return options ignoredBinaryOptions = ['-x', '-S', # Needs to be computed. '-1', '-2', '-U', # No paired-end input. '-r', '-q', '--qseq', # Only accepts FASTA. '--seed'] ignoredUnitaryOptions = ['--version', '--help'] items = options.algorithmOptions.split(' ') i = 0 try: while i < len(items): infoMsg, errMsg, item = "", "", items[i] if item == "-k": val = int(items[i+1]) if options.maxHits is not None and \ int(options.maxHits) != val: errMsg = "bowtie2 -k specified within " + \ "--algorithmOptions is equivalent to " + \ "--maxHits. Conflicting values of " + \ "--algorithmOptions '-k' and " +\ "--maxHits have been found." elif item == "-L": val = int(items[i+1]) if options.minAnchorSize is not None and \ int(options.minAnchorSize) != val: errMsg = "bowtie2 -L specified within " + \ "--algorithmOptions is equivalent to " + \ "--minAnchorSize. Conflicting values " + \ "of --algorithmOptions '-L' and " + \ "--minAnchorSize have been found." elif item == "-p": val = int(items[i+1]) # The number of threads is not critical. 
if options.nproc is None or \ int(options.nproc) != val: infoMsg = "Overwrite nproc with {n}.".format(n=val) options.nproc = val elif item in ignoredBinaryOptions: pass elif item in ignoredUnitaryOptions: del items[i:i+1] continue else: i += 1 continue if errMsg != "": logging.error(errMsg) raise ValueError(errMsg) if infoMsg != "": logging.info(self.name + ": Resolve algorithmOptions. " + infoMsg) del items[i:i+2] except Exception as e: errMsg = "An error occurred during parsing algorithmOptions '{ao}': "\ .format(ao=options.algorithmOptions) + str(e) logging.error(errMsg) raise ValueError(errMsg) # Update algorithmOptions when resolve is done options.algorithmOptions = " ".join(items) return options def _bt2BuildIndex(self, tempDir, referenceFile): """Build bt2 index files. Input: tempDir : a temporary directory for saving bowtie2 index files. referenceFile: the reference sequence file. Output: list of strings, bowtie2 index files. """ refBaseName = bt2BaseName(tempDir, referenceFile) cmdStr = "bowtie2-build -q -f {0} {1}".\ format(referenceFile, refBaseName) logging.info(self.name + ": Build bowtie2 index files.") logging.debug(self.name + ": Call {0}".format(cmdStr)) _output, errCode, errMsg = backticks(cmdStr) if (errCode != 0): logging.error(self.name + ": Failed to build bowtie2 " + "index files.\n" + errMsg) raise RuntimeError(errMsg) return bt2IndexFiles(refBaseName) def _preProcess(self, inputFileName, referenceFile, regionTable, noSplitSubreads, tempFileManager, isWithinRepository): """Preprocess inputs and pre-build reference index files for bowtie2. For bowtie2, we need to (1) index the reference sequences, (2) convert the input PULSE/BASE/FOFN file to FASTA. Input: inputFileName : a PacBio BASE/PULSE/FOFN file. referenceFile : a FASTA reference file. regionTable : a region table RGN.H5/FOFN file. noSplitSubreads: whether to split subreads or not. tempFileManager: temporary file manager. 
isWithinRepository: whether or not the reference is within a reference repository. Output: String, a FASTA file which can be used by bowtie2. """ # Build bt2 index files and return files that have been built. indexFiles = self._bt2BuildIndex(tempFileManager.defaultRootDir, referenceFile) # Register bt2 index files in the temporary file manager. for indexFile in indexFiles: tempFileManager.RegisterExistingTmpFile(indexFile, own=True) # Return a FASTA file that can be used by bowtie2 directly. return self._pls2fasta(inputFileName, regionTable, noSplitSubreads) def _toCmd(self, options, fileNames, tempFileManager): """Return a bowtie2 command line to run in bash. Generate a bowtie2 command line based on options and PBAlignFiles. Input: options : arguments parsed from the command-line, the config file and --algorithmOptions. fileNames: a PBAlignFiles object. tempFileManager: temporary file manager. Output: a command-line string which can be used in bash. """ cmdStr = "bowtie2 " if options.maxHits is not None and options.maxHits != "": cmdStr += " -k {maxHits}".format(maxHits=options.maxHits) if options.nproc is not None and options.nproc != "": cmdStr += " -p {nproc}".format(nproc=options.nproc) if options.algorithmOptions is not None: cmdStr += " {opts} ".format(opts=options.algorithmOptions) if options.seed is not None and options.seed != "": cmdStr += " --seed {seed} ".format(seed=options.seed) refBaseName = bt2BaseName(tempFileManager.defaultRootDir, fileNames.targetFileName) cmdStr += "-x {refBase} -f {queryFile} -S {outFile} ".\ format(refBase=refBaseName, queryFile=fileNames.queryFileName, outFile=fileNames.alignerSamOut) return cmdStr def _postProcess(self): """Postprocess after alignment is done.""" logging.debug("Postprocess after alignment is done. 
") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/fastabasedalign.py000077500000000000000000000101211241505617700275150ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
############################################################################### """This script defines FastaBasedAlignService, a subclass of AlignService, which converts PacBio reads in BASE/PULSE/FOFN formats into FASTA format before align.""" # Author: Yuan Li from __future__ import absolute_import from pbalign.alignservice.align import AlignService from pbalign.utils.fileutil import getFileFormat, FILE_FORMATS from pbcore.util.Process import backticks import logging class FastaBasedAlignService(AlignService): """An abstract class for aligners that do not support PacBio reads in BASE/PULSE/FOFN formats. All subclasses need to call _pls2fasta in preprocess to convert input PacBio reads to FASTA.""" def _pls2fasta(self, inputFileName, regionTable, noSplitSubreads): """ Call pls2fasta to convert a PacBio BASE/PULSe/FOFN file to FASTA. Input: inputFilieName : a PacBio BASE/PULSE/FOFN file. regionTable : a region table RGN.H5/FOFN file. noSplitSubreads: whether to split subreads or not. Output: a FASTA file which can be used as an input by an aligner. """ # If the incoming file is a FASTA file, no conversion is needed. if getFileFormat(inputFileName) == FILE_FORMATS.FASTA: return inputFileName # Otherwise, create a temporary FASTA file to write. outFastaFile = self._tempFileManager.RegisterNewTmpFile( suffix=".fasta") cmdStr = "pls2fasta {plsFile} {fastaFile} ".format( plsFile=inputFileName, fastaFile=outFastaFile) if regionTable is not None and regionTable != "": cmdStr += " -regionTable {rt} ".format(rt=regionTable) if noSplitSubreads: cmdStr += " -noSplitSubreads " logging.info(self.name + ": Convert {inFile} to FASTA format.". 
format(inFile=inputFileName)) logging.debug(self.name + ": Call \"{cmd}\"".format(cmd=cmdStr)) _output, errCode, errMsg = backticks(cmdStr) if errCode != 0: errMsg += "Failed to convert {i} to {o}.".format( i=inputFileName, o=outFastaFile) logging.error(errMsg) raise RuntimeError(errMsg) # Return the converted FASTA file which can be used by an aligner. return outFastaFile pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/alignservice/gmap.py000077500000000000000000000331751241505617700253470ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines class GMAPService which calls GMAP to align reads.""" # Author: Yuan Li from __future__ import absolute_import from os import path from pbalign.alignservice.fastabasedalign import FastaBasedAlignService from pbalign.utils.fileutil import isExist from pbcore.util.Process import backticks from time import sleep from random import randint import logging class GMAPService(FastaBasedAlignService): """Class GMAPService calls gmap to align reads.""" def __init__(self, options, fileNames, tempFileManager=None): super(GMAPService, self).__init__(options, fileNames, tempFileManager) self.dbRoot = None # If a GMAP DB is within a PacBio reference repository, its name should # always be 'gmap_db'. However, if it is not within a repository, its # name should be randomized. Otherwise, if multiple calls of # 'gmap_build' running simultaneously will fail. self.dbName = "gmap_db" @property def name(self): """GMAP Service name.""" return "GMAPService" @property def progName(self): """Program to call.""" return "gmap" @property def scoreSign(self): """Using edit distance as align score for GMAP, the lower the better. """ return -1 def _resolveAlgorithmOptions(self, options, fileNames): """ Resolve options specified within --algorithmOptions with options parsed from the command-line or the config file. Return updated options. 
If conflicting values are found for the following options, error out. (1) --maxHits and gmap -n (2) --minAnchorSize and gmap -k Input: options : the original pbalign options from argumentList and configFile. fileNames: a PBAlignFiles object Output: new options by resolving options specified within --algorithmOptions and the original pbalign options """ if options.algorithmOptions is None: return options unsupportedOptions = ['-1', '--selfalign', '-2', '--pairalign', '--cmdline', '--cmetdir', '--atoidir', '-v', '--use-snps', '-V', '--snpsdir'] # Ignore options to specify the input database ignoredBinaryOptions = ['-D', '-d'] # Ignore options to specify output types, and always output in SAM # with full headers and the 'sam-use-0M' option. # Ignore --kmer, --nthreads, and --npaths # Ignore help and version. ignoredUnitaryOptions = ['-S', '-A', '-3', '-4', '-Z', '-E', '-P', '-Q', '-5', '--no-sam-headers', '-f', '--sam-use-0M', '--dir', '--db', '--kmer', '--nthreads', '--npaths', '--help', '--version'] items = options.algorithmOptions.split(' ') i = 0 try: while i < len(items): infoMsg, errMsg, item = "", "", items[i].split("=")[0] if item in unsupportedOptions: raise ValueError("Unsupported option: {i}".format(i=item)) elif item in ignoredUnitaryOptions: infoMsg = "Ignore option: {i}".format(i=item) del items[i:i+1] elif item in ignoredBinaryOptions: infoMsg = "Ignore option: {i}".format(i=item) del items[i:i+2] elif item == "-k": # kmer size val = int(items[i+1]) if options.minAnchorSize == val: del items[i:i+2] else: errMsg = "Found conflicting options: " + \ "--algorithmOptions '-k={k}'".format(k=val) +\ " and --minAnchorSize={v}.".format( v=options.minAnchorSize) elif item == "-t": val = int(items[i+1]) # The number of threads is not critical. 
if options.nproc is None or \ int(options.nproc) != val: infoMsg = "Overwrite nproc with {n}.".format(n=val) options.nproc = val del items[i:i+2] elif item == "-n": val = int(items[i+1]) if options.maxHits == val: del items[i:i+2] else: errMsg = "Found conflicting options: " + \ "--algorithmOptions '-n={n}' and ".format(n=val) + \ "--maxHits={v}.".format(v=options.maxHits) if errMsg != "": logging.error(errMsg) raise ValueError(errMsg) if infoMsg != "": logging.info(self.name + ": Resolve algorithmOptions. " + infoMsg) except Exception as e: errMsg = "An error occurred during parsing algorithmOptions " + \ "'{ao}': ".format(ao=options.algorithmOptions) logging.error(errMsg + str(e)) raise ValueError(errMsg + str(e)) # Update algorithmOptions when resolve is done options.algorithmOptions = " ".join(items) return options def _toCmd(self, options, fileNames, tempFileManager): """ Generate a command line for GMAP based on options and PBAlignFiles, and return a command-line string which can be used in bash. Input: options : arguments parsed from the command-line, the config file and --algorithmOptions. fileNames: a PBAlignFiles object. tempFileManager: temporary file manager. Output: a command-line string which can be used in bash. 
""" cmdStr = "gmap -D {dbRoot} ".format(dbRoot=self.dbRoot) + \ "-d {dbName} -f samse ".format(dbName=self.dbName) + \ "--sam-use-0M {inFa} ".format(inFa=fileNames.queryFileName) if options.maxHits is not None and options.maxHits != "": cmdStr += "-n {n} ".format(n=options.maxHits) if (options.minAnchorSize is not None and options.minAnchorSize != ""): cmdStr += "-k {0} ".format(options.minAnchorSize) if options.nproc is not None and options.nproc != "": cmdStr += "-t {0} ".format(options.nproc) if options.algorithmOptions is not None: cmdStr += "{0} ".format(options.algorithmOptions) cmdStr += "> {outSam} ".format(outSam=fileNames.alignerSamOut) return cmdStr def _releaseLock(self, dbLock): """Release dbLock.""" _o, errCode, _m = backticks("rm -f {dbLock}".format(dbLock=dbLock)) if errCode == 0: logging.debug(self.name + ": Release the lock for DB creation.") else: raise RuntimeError(self.name + ": Failed to release lock " + dbLock + ". Please delete the lock manually.") def _gmapCreateDB(self, referenceFile, isWithinRepository, tempRootDir): """ Create gmap database for reference sequences if no DB exists. Wait for gmap DB to be created if gmap_db.lock exists. return (gmap_DB_root_path, gmap_DB_name). """ # Determine dbRoot according to whether the reference file is wihtin # a reference repository. 
if isWithinRepository: # If the reference file is within a reference repository, create # gmap_db under the root of the repository, then the gmap DB root # is the repo root, and gmap DB name is 'gmap_db', e.g., # refrepo/ # --------sequence/ # --------gmap_db/ # --------reference.info.xml dbRoot = path.split(path.dirname(referenceFile))[0] dbName = "gmap_db" else: # Otherwise, create gmap_db under the tempRootDir, and give the # gmap DB a random name dbRoot = tempRootDir dbName = "gmap_db_{sfx}".format(sfx=randint(100000, 1000000)) dbPath = path.join(dbRoot, dbName) dbLock = dbPath + ".lock" # Check if DB already exists if isExist(dbPath) and not isExist(dbLock): # gmap_db already exists logging.info(self.name + ": GMAP database {dbPath} found".format( dbPath=dbPath)) return (dbRoot, dbName) # Check if DB is being created by other pbalign calls while isExist(dbLock): logging.info(self.name + ": Waiting for GMAP database to be " + \ "created for {inFa}".format(inFa=referenceFile)) sleep(10) # Create DB if it does not exist if not isExist(dbPath): # Touch the lock file _output, errCode, errMsg = backticks("touch {dbLock}".format( dbLock=dbLock)) logging.debug(self.name + ": Create a lock when GMAP DB is " + "being built.") if (errCode != 0): logging.error(self.name + ": Failed to create {dbLock}.\n".format( dbLock=dbLock) + errMsg) self._releaseLock(dbLock) raise RuntimeError(errMsg) logging.info(self.name + ": Create GMAP DB for {inFa}.".format( inFa=referenceFile)) cmdStr = "gmap_build -k 12 --db={dbName} --dir={dbRoot} {inFa}".\ format(dbName=dbName, dbRoot=dbRoot, inFa=referenceFile) _output, errCode, errMsg = backticks(cmdStr) logging.debug(self.name + ": Call {cmdStr}".format(cmdStr=cmdStr)) if (errCode != 0): logging.error(self.name + ": Failed to build GMAP db.\n" + errMsg) self._releaseLock(dbLock) raise RuntimeError(errMsg) # Delete the lock file to notify other pbalign processes that are # waiting for this DB to be created. 
self._releaseLock(dbLock) return (dbRoot, dbName) def _preProcess(self, inputFileName, referenceFile, regionTable, noSplitSubreads, tempFileManager, isWithinRepository): """Preprocess inputs and pre-build reference index files for gmap. For gmap, we need to (1) create indices for reference sequences, (2) convert the input PULSE/BASE/FOFN file to FASTA. Input: inputFileName : a PacBio BASE/PULSE/FOFN file. referenceFile : a FASTA reference file. regionTable : a region table RGN.H5/FOFN file. noSplitSubreads: whether to split subreads or not. tempFileManager: temporary file manager. Output: String, a FASTA read file which can be used by gmap. """ # Create a gmap database, update gmap DB root path and db name. (self.dbRoot, self.dbName) = self._gmapCreateDB(referenceFile, isWithinRepository, tempFileManager.defaultRootDir) # DO NOT delete gmap_db if it is within a reference repository; # otherwise, delete it. if not isWithinRepository: tempFileManager.RegisterExistingTmpFile(path.join(self.dbRoot, self.dbName), own=True, isDir=True) # Return a FASTA file that can be used by gmap as query directly. return self._pls2fasta(inputFileName, regionTable, noSplitSubreads) def _postProcess(self): """ Postprocess after alignment is done. """ logging.debug(self.name + ": Postprocess after alignment is done. ") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/filterservice.py000077500000000000000000000135621241505617700246140ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. 
# * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines FilterService, which calls samFilter to remove alignments in an input SAM file according to filtering criteria.""" # Author: Yuan Li from __future__ import absolute_import import logging from pbalign.service import Service from pbalign.utils.fileutil import isExist class FilterService(Service): """ Call samFilter to filter low quality hits and apply multiple hits policy. 
""" @property def name(self): """Name of filter service.""" return "FilterService" @property def progName(self): """Program to call.""" return "samFilter" def __init__(self, inSamFile, refFile, outSamFile, alnServiceName, scoreSign, options, adapterGffFile=None): """Initialize a FilterService object. Input: inSamFile: an input SAM file refFile : the reference FASTA file outSAM : an output SAM file alnServiceName: the name of the align service scoreSign: score sign of the aligner, can be -1 or 1 options : pbalign options adapterGffFile: a GFF file storing all the adapters """ self.inSamFile = inSamFile self.refFile = refFile self.outSamFile = outSamFile self.alnServiceName = alnServiceName self.scoreSign = scoreSign self.options = options self.adapterGffFile = adapterGffFile @property def cmd(self): """String of a command-line to execute.""" return self._toCmd(self.inSamFile, self.refFile, self.outSamFile, self.alnServiceName, self.scoreSign, self.options, self.adapterGffFile) def _toCmd(self, inSamFile, refFile, outSamFile, alnServiceName, scoreSign, options, adapterGffFile): """ Generate a samFilter command line from options. 
Input: inSamFile : the input SAM file refFile : the reference FASTA file outSamFile: the output SAM file alnServiceName: aligner service name scoreSign : score sign, can be -1 or 1 options : argument options Output: a command-line string """ cmdStr = self.progName + \ " {inSamFile} {refFile} {outSamFile} ".format( inSamFile=inSamFile, refFile=refFile, outSamFile=outSamFile) if options.maxDivergence is not None: maxDivergence = int(options.maxDivergence if options.maxDivergence > 1.0 else (options.maxDivergence * 100)) cmdStr += " -minPctSimilarity {0}".format(100 - maxDivergence) if options.minAccuracy is not None: minAccuracy = int(options.minAccuracy if options.minAccuracy > 1.0 else (options.minAccuracy * 100)) cmdStr += " -minAccuracy {0}".format(minAccuracy) if options.minLength is not None: cmdStr += " -minLength {0}".format(options.minLength) if options.seed is not None: cmdStr += " -seed {0}".format(options.seed) if scoreSign in [1, -1]: cmdStr += " -scoreSign {0}".format(scoreSign) else: logging.error("{0}'s score sign is neither 1 nor -1.".format( alnServiceName)) if options.scoreCutoff is not None: cmdStr += " -scoreCutoff {0}".format(options.scoreCutoff) if options.hitPolicy is not None: cmdStr += " -hitPolicy {0}".format(options.hitPolicy) if options.filterAdapterOnly is True and \ isExist(adapterGffFile): cmdStr += " -filterAdapterOnly {gffFile}".format( gffFile=adapterGffFile) return cmdStr def run(self): """ Run the filter service. """ logging.info(self.name + ": Filter alignments using {0}.". 
format(self.progName)) return self._execute() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/000077500000000000000000000000001241505617700247655ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/__init__.py000077500000000000000000000036071241505617700271070ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """Initialization.""" from __future__ import absolute_import pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/forquiver.py000077500000000000000000000101051241505617700273610ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. 
THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines ForQuiverService, which post-processes a cmp.h5 file so that it can be used by quiver directly. ForQuiverService sorts the file, loads pulse information into it and finally repacks it.""" # Author: Yuan Li from __future__ import absolute_import import logging from pbalign.forquiverservice.sort import SortService from pbalign.forquiverservice.loadpulses import LoadPulsesService from pbalign.forquiverservice.loadchemistry import LoadChemistryService from pbalign.forquiverservice.repack import RepackService class ForQuiverService(object): """ Uses SortService, LoadPulsesService, LoadChemistryService, RepackService to post process a cmp.h5 file so that the file can be used by quiver directly. """ @property def name(self): """Name of ForQuiverService.""" return "ForQuiverService" def __init__(self, fileNames, options): """Initialize a ForQuiverService object. 
Input: fileNames : pbalign file names options : pbalign options """ self.fileNames = fileNames self.options = options self._loadpulsesService = LoadPulsesService( self.fileNames.pulseFileName, self.fileNames.outputFileName, self.options) self._loadchemistryService = LoadChemistryService( self.fileNames.pulseFileName, self.fileNames.outputFileName, self.options) self._sortService = SortService( self.fileNames.outputFileName, self.options) self._repackService = RepackService( self.fileNames.outputFileName, self.fileNames.outputFileName + ".TMP") def run(self): """ Run the ForQuiver service.""" logging.info(self.name + ": Sort.") self._sortService.checkAvailability() self._sortService.run() logging.info(self.name + ": LoadPulses.") self._loadpulsesService.checkAvailability() self._loadpulsesService.run() logging.info(self.name + ": LoadChemistry.") self._loadchemistryService.checkAvailability() self._loadchemistryService.run() logging.info(self.name + ": Repack.") self._repackService.checkAvailability() self._repackService.run() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/loadchemistry.py000077500000000000000000000061231241505617700302130ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. 
# * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### from __future__ import absolute_import import logging from pbalign.service import Service class LoadChemistryService(Service): @property def name(self): return "LoadChemistryService" @property def progName(self): return "loadChemistry.py" # Quiver only uses the following five metrics. def __init__(self, basFofnFile, cmpFile, options): """ Input: basFofnFile: the input BASE.H5 (or fofn) files cmpFile : an input CMP.H5 file options : pbalign options """ self.basFofnFile = basFofnFile self.cmpFile = cmpFile self.options = options @property def cmd(self): """String of a command-line to execute.""" return self._toCmd(self.basFofnFile, self.cmpFile) def _toCmd(self, basFofnFile, cmpFile): """ Generate a loadChemistry command line. 
""" cmdStr = self.progName + \ " {basFofnFile} {cmpFile} ".format( basFofnFile=basFofnFile, cmpFile=cmpFile) return cmdStr def run(self): """Run the loadChemistry service.""" logging.info(self.name + ": Load pulses using {progName}.". format(progName=self.progName)) return self._execute() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/loadpulses.py000077500000000000000000000076231241505617700275250ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """Define LoadPulsesService class, which calls loadPulses to load PacBio pulse metrics to a cmp.h5 file. Five metrics, including DeletionQV, DeletionTag, InsertionQV, MergeQV, and SubstitutionQV, are loaded by default unless --metrics is specified.""" # Author: Yuan Li from __future__ import absolute_import import logging from pbalign.service import Service class LoadPulsesService(Service): """ LoadPulsesService calls loadPulses to load PacBio pulse information to a cmp.h5 file. """ @property def name(self): """Name of LoadPulsesService.""" return "LoadPulsesService" @property def progName(self): """Program to call.""" return "loadPulses" # Quiver only uses the following five metrics. def __init__(self, basFofnFile, cmpFile, options): """Initialize a LoadPulsesService object. Input: basFofnFile: the input BASE.H5 (or fofn) files cmpFile : an input CMP.H5 file options : pbalign options """ self.basFofnFile = basFofnFile self.cmpFile = cmpFile self.options = options @property def cmd(self): """String of a command-line to execute.""" return self._toCmd(self.basFofnFile, self.cmpFile) def _toCmd(self, basFofnFile, cmpFile): """Generate a loadPulses command line. 
Input: basFofnFile: a BAX/PLX.H5 (or fofn) file with pulses cmpFile : an input CMP.H5 file Output: a command-line string """ cmdStr = self.progName + \ " {basFofnFile} {cmpFile} ".format( basFofnFile=basFofnFile, cmpFile=cmpFile) metrics = self.options.metrics.replace(" ", "") cmdStr += " -metrics {metrics} ".format(metrics=metrics) if self.options.byread: cmdStr += " -byread " return cmdStr def run(self): """Run the loadPulses service.""" logging.info(self.name + ": Load pulses using {progName}.". format(progName=self.progName)) return self._execute() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/repack.py000077500000000000000000000064371241505617700266210ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """Define RepackService class, which calls h5repack to repack a cmp.h5 file.""" # Author: Yuan Li from __future__ import absolute_import import logging from pbalign.service import Service class RepackService(Service): """RepackService calls h5repack to repack a PacBio cmp.h5 file.""" @property def name(self): """Name of RepackService.""" return "RepackService" @property def progName(self): """Program to call.""" return "h5repack" def __init__(self, cmpFile, tmpcmpFile): """Initialize a RepackService object. Input: cmpFile : an input CMP.H5 file tmpcmpFile : a temporary CMP.H5 file """ self.cmpFile = cmpFile self.tmpcmpFile = tmpcmpFile @property def cmd(self): """String of a command-line to execute.""" return self._toCmd(self.cmpFile, self.tmpcmpFile) def _toCmd(self, cmpFile, tmpcmpFile): """Generate an h5repack command line. Input: cmpFile : an input CMP.H5 file tmpcmpFile : a temporary CMP.H5 file Output: a command-line string """ cmdStr = self.progName + " -f GZIP=1 {i} {o} && mv {o} {i}".\ format(i=cmpFile, o=tmpcmpFile) return cmdStr def run(self): """Run the Repack service.""" logging.info(self.name + ": Repack a cmp.h5 file using {progName}.". 
format(progName=self.progName)) return self._execute() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/forquiverservice/sort.py000077500000000000000000000065071241505617700263410ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
############################################################################### """This script defines SortService, which calls cmph5tools.py sort to sort a cmp.h5 file.""" # Author: Yuan Li from __future__ import absolute_import import logging from pbalign.service import Service class SortService(Service): """Call cmph5tools.py sort to sort a PacBio cmp.h5 file.""" @property def name(self): """Name of SortService.""" return "SortService" @property def progName(self): """Program to call.""" return "cmph5tools.py" def __init__(self, cmpFile, options): """Initialize a SortService object. Input: cmpFile : an input CMP.H5 file options : pbalign options """ self.cmpFile = cmpFile self.options = options @property def cmd(self): """String of a command-line to execute.""" return self._toCmd(self.cmpFile, self.options) def _toCmd(self, cmpFile, options): """Generate a cmph5tools.py sort command line. Input: cmpFile : an input CMP.H5 file options : pbalign options Output: a command-line string """ cmdStr = self.progName if options.verbosity > 1: cmdStr += " -vv " cmdStr += " sort --deep --inPlace {cmpFile} ".format(cmpFile=cmpFile) return cmdStr def run(self): """Run the sort service.""" logging.info(self.name + ": Sort a cmp.h5 file using {progName}.". format(progName=self.progName)) return self._execute() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/options.py000077500000000000000000000551011241505617700234340ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. 
# * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """This script defines functions for parsing PBAlignRunner options.""" from __future__ import absolute_import import argparse from copy import copy # The first candidate 'blasr' is the default. ALGORITHM_CANDIDATES = ('blasr', 'bowtie', 'gmap') # The first candidate 'randombest' is the default. HITPOLICY_CANDIDATES = ('randombest', 'allbest', 'random', 'all', 'leftmost') # The first candidate 'alignerscore' is the default. 
SCOREFUNCTION_CANDIDATES = ('alignerscore', 'editdist', #'blasrscore', 'userscore') 'blasrscore') DEFAULT_METRICS = ("DeletionQV", "DeletionTag", "InsertionQV", "MergeQV", "SubstitutionQV") # Default values of arguments DEFAULT_OPTIONS = {"regionTable": None, "configFile": None, # Choose an aligner "algorithm": ALGORITHM_CANDIDATES[0], # Aligner options "maxHits": 10, "minAnchorSize": 12, "noSplitSubreads": False, "concordant": False, "algorithmOptions": None, "useccs": None, # Filter options "maxDivergence": 30, "minAccuracy": 70, "minLength": 50, "scoreFunction": SCOREFUNCTION_CANDIDATES[0], "scoreCutoff": None, "hitPolicy": HITPOLICY_CANDIDATES[0], "filterAdapterOnly": False, # Cmp.h5 writer options "readType": "standard", "forQuiver": False, "loadQVs": False, "byread": False, "metrics": str(",".join(DEFAULT_METRICS)), # Miscellaneous options "nproc": 8, "seed": 1, "tmpDir": "/scratch"} def constructOptionParser(parser=None): """Construct and return an argument parser. If a parser is specified, use it. Otherwise, create a parser instead. Add PBAlignRunner arguments to construct the parser, and finally return it. """ desc = "Mapping PacBio sequences to references using an algorithm \n" desc += "selected from a selection of supported command-line alignment\n" desc += "algorithms. Input can be a fasta, pls.h5, bas.h5 or ccs.h5\n" desc += "file or a fofn (file of file names). Output is in either\n" desc += "cmp.h5 or sam format.\n" if (parser is None): parser = argparse.ArgumentParser() parser.description = desc parser.argument_default = argparse.SUPPRESS parser.formatter_class = argparse.RawTextHelpFormatter # Optional input. 
parser.add_argument("--regionTable", dest="regionTable", type=str, default=None, action="store", help="Specify a region table for filtering reads.") parser.add_argument("--configFile", dest="configFile", default=None, type=str, action="store", help="Specify a set of user-defined argument values.") helpstr = "When input reads are in fasta format and output is a cmp.h5\n" + \ "this option can specify pls.h5 or bas.h5 or \n" + \ "FOFN files from which pulse metrics can be loaded for Quiver." parser.add_argument("--pulseFile", dest="pulseFile", default=None, type=str, action="store", help=helpstr) # Choose an aligner. helpstr = "Select an algorithm from {0}.\n".format(ALGORITHM_CANDIDATES) helpstr += "Default algorithm is {0}.".format(DEFAULT_OPTIONS["algorithm"]) parser.add_argument("--algorithm", dest="algorithm", type=str, action="store", choices=ALGORITHM_CANDIDATES, default=ALGORITHM_CANDIDATES[0], help=helpstr) # Aligner options. helpstr = "The maximum number of matches of each read to the \n" + \ "reference sequence that will be evaluated. Default\n" + \ "value is {0}.".format(DEFAULT_OPTIONS["maxHits"]) parser.add_argument("--maxHits", dest="maxHits", type=int, default=None, # Set as None instead of a real number. action="store", help=helpstr) helpstr = "The minimum anchor size defines the length of the read\n" + \ "that must match against the reference sequence. Default\n" + \ "value is {0}.".format(DEFAULT_OPTIONS["minAnchorSize"]) parser.add_argument("--minAnchorSize", dest="minAnchorSize", type=int, default=None, # Set as None to avoid conflicts with # --algorithmOptions action="store", help=helpstr) # Aligner options: Use ccs or not? helpstr = "Map the ccsSequence to the genome first, then align\n" + \ "subreads to the interval that the CCS reads mapped to.\n" + \ " useccs: only maps subreads that span the length of\n" + \ " the template.\n" + \ " useccsall: maps all subreads.\n" + \ " useccsdenovo: maps ccs only." 
parser.add_argument("--useccs", type=str, choices=["useccs", "useccsall", "useccsdenovo"], action="store", default=None, help=helpstr) helpstr = "Do not split reads into subreads even if subread \n" + \ "regions are available. Default value is {0}."\ .format(DEFAULT_OPTIONS["noSplitSubreads"]) parser.add_argument("--noSplitSubreads", dest="noSplitSubreads", default=DEFAULT_OPTIONS["noSplitSubreads"], action="store_true", help=helpstr) helpstr = "Map subreads of a ZMW to the same genomic location.\n" parser.add_argument("--concordant", dest="concordant", default=DEFAULT_OPTIONS["concordant"], action="store_true", help=helpstr) helpstr = "Number of threads. Default value is {v}."\ .format(v=DEFAULT_OPTIONS["nproc"]) parser.add_argument("--nproc", type=int, dest="nproc", default=DEFAULT_OPTIONS["nproc"], #default=15, action="store", help=helpstr) parser.add_argument("--algorithmOptions", type=str, dest="algorithmOptions", default=None, action="append", help="Pass alignment options through.") # Filtering criteria and hit policy. helpstr = "The maximum allowed percentage divergence of a read \n" + \ "from the reference sequence. Default value is {0}." \ .format(DEFAULT_OPTIONS["maxDivergence"]) parser.add_argument("--maxDivergence", dest="maxDivergence", type=float, default=DEFAULT_OPTIONS["maxDivergence"], #default=30, action="store", help=helpstr) helpstr = "The minimum percentage accuracy of alignments that\n" + \ "will be evaluated. Default value is {v}." \ .format(v=DEFAULT_OPTIONS["minAccuracy"]) parser.add_argument("--minAccuracy", dest="minAccuracy", type=float, default=DEFAULT_OPTIONS["minAccuracy"], #default=70, action="store", help=helpstr) helpstr = "The minimum aligned read length of alignments that\n" + \ "will be evaluated. Default value is {v}." 
\ .format(v=DEFAULT_OPTIONS["minLength"]) parser.add_argument("--minLength", dest="minLength", type=int, default=DEFAULT_OPTIONS["minLength"], action="store", help=helpstr) helpstr = "Specify a score function for evaluating alignments.\n" helpstr += " alignerscore : aligner's score in the SAM tag 'as'.\n" helpstr += " editdist : edit distance between read and reference.\n" helpstr += " blasrscore : blasr's default score function.\n" helpstr += "Default value is {0}.".format(DEFAULT_OPTIONS["scoreFunction"]) parser.add_argument("--scoreFunction", dest="scoreFunction", type=str, choices=SCOREFUNCTION_CANDIDATES, default=DEFAULT_OPTIONS["scoreFunction"], action="store", help=helpstr) #" userscore : user-defined score matrix (by -scoreMatrix).\n") #parser.add_argument("--scoreMatrix", # dest="scoreMatrix", # type=str, # default=None, # help= # "Specify a user-defined score matrix for " # "scoring reads.The matrix\n"+\ # "is in the format\n" # " ACGTN\n" # " A abcde\n" # " C fghij\n" # " G klmno\n" # " T pqrst\n" # " N uvwxy\n" # ". The values a...y should be input as a " # "quoted space separated\n" # "string: "a b c ... y". Lower scores are better," # "so matches\n" # "should be less than mismatches e.g. 
a,g,m,s " # "= -5 (match),\n" # "mismatch = 6.\n") parser.add_argument("--scoreCutoff", dest="scoreCutoff", type=int, default=None, action="store", help="The worst score to output an alignment.\n") helpstr = "Specify a policy for how to treat multiple hit\n" + \ " random : selects a random hit.\n" + \ " all : selects all hits.\n" + \ " allbest : selects all the best score hits.\n" + \ " randombest: selects a random hit from all best score hits.\n" + \ " leftmost : selects a hit which has the best score and the\n" + \ " smallest mapping coordinate in any reference.\n" + \ "Default value is {v}.".format(v=DEFAULT_OPTIONS["hitPolicy"]) parser.add_argument("--hitPolicy", dest="hitPolicy", type=str, choices=HITPOLICY_CANDIDATES, default=DEFAULT_OPTIONS["hitPolicy"], action="store", help=helpstr) helpstr = "If specified, do not report adapter-only hits using\n" + \ "annotations with the reference entry." parser.add_argument("--filterAdapterOnly", dest="filterAdapterOnly", default=DEFAULT_OPTIONS["filterAdapterOnly"], action="store_true", help=helpstr) # Output. helpstr = "Specify the ReadType attribute in the cmp.h5 output.\n" + \ "Default value is {v}.".format(v=DEFAULT_OPTIONS["readType"]) parser.add_argument("--readType", dest="readType", type=str, action="store", default=DEFAULT_OPTIONS["readType"], help=argparse.SUPPRESS) #help=helpstr) helpstr = "The output cmp.h5 file which will be sorted, loaded\n" + \ "with pulse QV information, and repacked, so that it \n" + \ "can be consumed by quiver directly. This requires\n" + \ "the input file to be in PacBio bas/pls.h5 format,\n" + \ "and --useccs must be None. Default value is False." parser.add_argument("--forQuiver", dest="forQuiver", action="store_true", default=DEFAULT_OPTIONS["forQuiver"], help=helpstr) helpstr = "Similar to --forQuiver, the only difference is that \n" + \ "--useccs can be specified. Default value is False." 
parser.add_argument("--loadQVs", dest="loadQVs", action="store_true", default=DEFAULT_OPTIONS["loadQVs"], help=helpstr) helpstr = "Load pulse information using -byread option instead\n" + \ "of -bymetric. Only works when --forQuiver or \n" + \ "--loadQVs are set. Default value is False." parser.add_argument("--byread", dest="byread", action="store_true", default=DEFAULT_OPTIONS["byread"], help=helpstr) helpstr = "Load the specified (comma-delimited list of) metrics\n" + \ "instead of the default metrics required by quiver.\n" + \ "This option only works when --forQuiver or \n" + \ "--loadQVs are set. Default: {m}".\ format(m=DEFAULT_OPTIONS["metrics"]) parser.add_argument("--metrics", dest="metrics", type=str, action="store", default=DEFAULT_OPTIONS["metrics"], help=helpstr) # Miscellaneous. helpstr = "Initialize the random number generator with a non-zero \n" + \ "integer. Zero means that current system time is used.\n" + \ "Default value is {v}.".format(v=DEFAULT_OPTIONS["seed"]) parser.add_argument("--seed", dest="seed", type=int, default=DEFAULT_OPTIONS["seed"], action="store", help=helpstr) helpstr = "Specify a directory for saving temporary files.\n" + \ "Default is {v}.".format(v=DEFAULT_OPTIONS["tmpDir"]) parser.add_argument("--tmpDir", dest="tmpDir", type=str, action="store", default=DEFAULT_OPTIONS["tmpDir"], help=helpstr) # Keep all temporary & intermediate files. parser.add_argument("--keepTmpFiles", dest="keepTmpFiles", action="store_true", default=False, help=argparse.SUPPRESS) # Required options: inputs and outputs. helpstr = "The input file can be a fasta, plx.h5, bax.h5, ccs.h5\n" + \ "file or a fofn." parser.add_argument("inputFileName", type=str, action="store", help=helpstr) helpstr = "Either a reference fasta file or a reference repository." 
parser.add_argument("referencePath", type=str, action="store", help=helpstr) parser.add_argument("outputFileName", type=str, action="store", help="The output cmp.h5 or sam file.") return parser def importConfigOptions(options): """ Import options from options.configFile if the file exists, and overwrite a copy of the incoming options with options imported from the config file. Finally, return the new options and an info message. """ newOptions = copy(options) # No config file exists. if 'configFile' not in options or options.configFile is None: return newOptions, "" # There exists a config file optionsDictView = vars(newOptions) configFile = options.configFile infoMsg = "ConfigParser: Import options from a config file {0}: "\ .format(configFile) # The following arguments are defined in PBToolRunner, and may # not exist in the input options (if the input options is parsed # by a parser created in constructOptionParser). specialArguments = ("--version", "--configFile", "--verbose", "--debug", "--profile", "-v", "-vv", "-vvv", "--keepTmpFiles") try: with open(configFile, 'r') as cf: for line in cf: line = line.strip() errMsg = "" # First parse special arguments and comments if (line.startswith("#") or line == "" or line in specialArguments): pass else: # Parse binary arguments try: k, v = line.split("=") k = k.lstrip().lstrip('-').strip() v = v.strip().strip('\"').strip('\'') except ValueError as e: errMsg = "ConfigParser: could not find '=' when " + \ "parsing {0}.".format(line) raise ValueError(errMsg) # Always use options' values from the configFile. 
if k not in optionsDictView: errMsg = "{k} is an invalid option.".format(k=k) raise ValueError(errMsg) else: infoMsg += "{k}={v}, ".format(k=k, v=v) optionsDictView[k] = v except IOError as e: errMsg = "ConfigParser: Could not open a config file {0}.\n".\ format(configFile) raise IOError(errMsg + str(e)) return newOptions, infoMsg def importDefaultOptions(parsedOptions, additionalDefaults=DEFAULT_OPTIONS): """Import default options and return (updated_options, an_info_message). After parsing the arguments and resolving algorithmOptions, we need to patch the default pbalign options, if they have not been overwritten on the command-line nor in the config file nor within algorithmOptions. """ newOptions = copy(parsedOptions) infoMsg = "Importing default options: " optionsDictView = vars(newOptions) for k, v in additionalDefaults.iteritems(): if (k not in optionsDictView) or (optionsDictView[k] is None): infoMsg += "{k}={v}, ".format(k=k, v=v) optionsDictView[k] = v return newOptions, infoMsg.rstrip(', ') def parseOptions(argumentList, parser=None): """Parse a list of arguments, return the parser, options and an info message. If a parser is not specified, create a new parser, otherwise, use the specified parser. If there exists a config file, import options from the config file and finally overwrite these options with options from the argument list. """ # Obtain a constructed argument parser. parser = constructOptionParser(parser) # Parse argumentList for the first time in order to # get a config file. options = parser.parse_args(args=argumentList) # Import options from the specified config file, if it exists. configOptions, infoMsg = importConfigOptions(options) # Parse argumentList for the second time in order to # overwrite config options with options in argumentList. 
newOptions = copy(configOptions) newOptions.algorithmOptions = None newOptions = parser.parse_args(namespace=newOptions, args=argumentList) # Overwrite config algorithmOptions if it is specified in argumentList if newOptions.algorithmOptions is None: if configOptions.algorithmOptions is not None: newOptions.algorithmOptions = configOptions.algorithmOptions else: newOptions.algorithmOptions = \ " ".join(newOptions.algorithmOptions) # Return the updated options and an info message. return parser, newOptions, infoMsg #if __name__ == "__main__": # import sys # parser = argparse.ArgumentParser() # parser, options, info = parseOptions(argumentList = sys.argv[1:], # parser=parser) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/pbalignfiles.py000077500000000000000000000214301241505617700243760ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. 
THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """This script defines class PBAlignFiles.""" from __future__ import absolute_import import logging from pbalign.utils.fileutil import checkInputFile, getRealFileFormat, \ checkOutputFile, checkReferencePath, checkRegionTableFile, \ getFileFormat, FILE_FORMATS class PBAlignFiles: """PBAlignFiles contains files that will be used by pbalign.""" def __init__(self, inputFileName=None, referencePath=None, outputFileName=None, regionTable=None, pulseFileName=None): """ Initialize an instance of PBAlignFiles. Input: inputFileName : The user-specified input PacBio read file, which can be in FASTA/BASE/PULSE/FOFN format. referencePath : The user-specified reference path or file. outputFileName: The user-specified output file in CMP.H5 or SAM format. regionTable : The user-specified region table. It can be None if region table is not specified. pulseFileName : The user-specified pulse file. It can be None if pulse file is not specified. """ self.inputFileName = None # The input PacBio read files. self.referencePath = None # The reference file or repository. self.outputFileName = None # The output CMP.H5 or SAM file. self.regionTable = None # The region table. # The query file that will be used by an aligner. 
queryFileName # and inputFileName can be different, because PacBio BASE/PULSE/FOFN # files need to be converted to FASTA for aligners that do not accept # PacBio reads. self.queryFileName = None # Load pulses from the pulse file. When input reads are in # BASE/PULSE/CCS.H5 files, pulseFileName=inputFileName; # otherwise, use '--pulseFile'. self.pulseFileName = None # File format of inputFileName if it is not a FOFN; otherwise, # file format of the first file in FOFN: FASTA/BAS.H5/PLS.H5/CCS.H5. self.inputFileFormat = None # The target (reference) file that will be used by an aligner. # referencePath can be a directory but targetFileName should always # be a FASTA file. self.targetFileName = None self.sawriterFileName = None self.isWithinRepository = False self.alignerSamOut = None # The sam output file by an aligner self.filteredSam = None # The filtered sam file. # There might be an adapter file in the reference repository in # directory 'annotations', which can be used by the # 11k_Unrolled_Resequencing protocol to filter reads that # only map to adapter regions. self.adapterGffFileName = None # Verify and assign the input & output files. self.SetInOutFiles(inputFileName, referencePath, outputFileName, regionTable, pulseFileName) def SetInputFile(self, inputFileName): """Verify and assign input file name and input file format.""" # Validate the user-specified input PacBio read file and get # the absolute and expanded path. Validate file format. if inputFileName is not None and inputFileName != "": self.inputFileName = checkInputFile(inputFileName) self.inputFileFormat = getRealFileFormat(inputFileName) def SetPulseFileName(self, inputFileName, pulseFileName): """Verify and assign the pulse file from which pulses can be extracted. When inputFileName is a Base/Pulse/CCS.H5 file or a fofn of Base/Pulse/CCS.H5, pulse file is inputFileName. 
Otherwise, pulse file is pulseFileName.""" self.pulseFileName = None if inputFileName is not None and inputFileName != "": inputFormat = getRealFileFormat(inputFileName) if inputFormat in [FILE_FORMATS.BAS, FILE_FORMATS.BAX, FILE_FORMATS.PLS, FILE_FORMATS.PLX, FILE_FORMATS.CCS]: self.pulseFileName = checkInputFile(inputFileName) if self.pulseFileName is None: if pulseFileName is not None and pulseFileName != "": self.pulseFileName = checkInputFile(pulseFileName) def SetReferencePath(self, referencePath): """Validate the user-specified referencePath and get the absolute and expanded path for referencePath, targetFileName and sawriterFileName. targetFileName is the target reference FASTA file to be used by an aligner. sawriterFileName is the reference sawriter file that can be used by an aligner (e.g. blasr), its value can be None if absent. """ if referencePath is not None and referencePath != "": (self.referencePath, self.targetFileName, self.sawriterFileName, self.isWithinRepository, self.adapterGffFileName) = \ checkReferencePath(referencePath) def SetOutputFileName(self, outputFileName): """Validate the user-specified output file and get the absolute and expanded path. """ if outputFileName is not None and outputFileName != "": self.outputFileName = checkOutputFile(outputFileName) def SetRegionTable(self, regionTable): """Validate the user-specified region table and get the absolute and expanded path. The value can be None if regionTable is not given. 
""" if regionTable is not None and regionTable != "": self.regionTable = checkRegionTableFile(regionTable) def SetInOutFiles(self, inputFileName, referencePath, outputFileName, regionTable, pulseFileName=None): """Verify and assign the input & output files.""" self.SetInputFile(inputFileName) self.SetReferencePath(referencePath) self.SetOutputFileName(outputFileName) self.SetRegionTable(regionTable) self.SetPulseFileName(inputFileName, pulseFileName) def __repr__(self): """ Represent PBAlignFiles.""" desc = "Input file : {i}\n".format(i=self.inputFileName) desc += "Reference path: {r} ".format(r=self.referencePath) desc += "is {res}within a reference repository.\n".format( res="" if self.isWithinRepository else "not ") desc += "Output file: {o}\n".format(o=self.outputFileName) desc += "Query file : {q}\n".format(q=self.queryFileName) desc += "Target file: {t}\n".format(t=self.targetFileName) desc += "Suffix array file: {s}\n".format(s=self.sawriterFileName) desc += "regionTable:{s}\n".format(s=self.regionTable) if self.pulseFileName is not None: desc += "Pulse files: {s}\n".format(s=self.pulseFileName) desc += "Aligner's SAM out: {t}\n".format(t=self.alignerSamOut) desc += "Filtered SAM file: {t}\n".format(t=self.filteredSam) if self.adapterGffFileName is not None: desc += "Adapter GFF file: {t}\n".format( t=self.adapterGffFileName) return desc #if __name__ == "__main__": # p = PBAlignFiles("lambda.fasta", "lambda_ref.fasta", "tmp.sam") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/pbalignrunner.py000077500000000000000000000305411241505617700246100ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines class PBAlignRunner. PBAlignRunner uses AlignService to align PacBio reads in FASTA/BASE/PULSE/FOFN formats to reference sequences, then uses FilterServices to filter out alignments that do not satisfy filtering criteria, and finally generates a SAM or CMP.H5 file. 
""" # Author: Yuan Li import logging import time import sys import shutil from pbalign.__init__ import get_version from pbalign.options import parseOptions, ALGORITHM_CANDIDATES from pbalign.alignservice.blasr import BlasrService from pbalign.alignservice.bowtie import BowtieService from pbalign.alignservice.gmap import GMAPService from pbalign.utils.fileutil import getFileFormat, FILE_FORMATS, real_ppath from pbalign.utils.tempfileutil import TempFileManager from pbalign.pbalignfiles import PBAlignFiles from pbalign.filterservice import FilterService from pbalign.forquiverservice.forquiver import ForQuiverService from pbcore.util.Process import backticks from pbcore.util.ToolRunner import PBToolRunner class PBAlignRunner(PBToolRunner): """Tool runner.""" def __init__(self, argumentList): """Initialize a PBAlignRunner object. argumentList is a list of arguments, such as: ['--debug', '--maxHits', '10', 'in.fasta', 'ref.fasta', 'out.sam'] """ desc = "Utilities for aligning PacBio reads to reference sequences." super(PBAlignRunner, self).__init__(desc) self._argumentList = argumentList self._alnService = None self._filterService = None self.fileNames = PBAlignFiles() self._tempFileManager = TempFileManager() self.parser, self.args, _infoMsg = parseOptions( argumentList=self._argumentList, parser=self.parser) # args.verbosity is computed by counting # of 'v's in '-vv...'. # However in parseOptions, arguments are parsed twice to import config # options and then overwrite them with argumentList (e.g. command-line) # options. self.args.verbosity = 0 if (self.args.verbosity is None) else \ int(self.args.verbosity) / 2 def getVersion(self): """Return version.""" return get_version() def _createAlignService(self, name, args, fileNames, tempFileManager): """ Create and return an AlignService by algorithm name. 
Input: name : an algorithm name such as blasr fileNames : a PBAlignFiles object args : pbalign options tempFileManager: a temporary file manager Output: an object of AlignService subclass (such as BlasrService). """ if name not in ALGORITHM_CANDIDATES: errMsg = "ERROR: unrecognized algorithm {algo}".format(algo=name) logging.error(errMsg) raise ValueError(errMsg) service = None if name == "blasr": service = BlasrService(args, fileNames, tempFileManager) elif name == "bowtie": service = BowtieService(args, fileNames, tempFileManager) elif name == "gmap": service = GMAPService(args, fileNames, tempFileManager) else: errMsg = "Service for {algo} is not implemented.".\ format(algo=name) logging.error(errMsg) raise ValueError(errMsg) service.checkAvailability() return service def _makeSane(self, args, fileNames): """ Check whether the input arguments make sense or not. """ errMsg = "" if args.useccs == "useccsdenovo": args.readType = "CCS" if fileNames.inputFileFormat == FILE_FORMATS.CCS: args.readType = "CCS" if args.forQuiver: if args.useccs is not None: errMsg = "Options --forQuiver and --useccs should not " + \ "be used together, since Quiver is not designed to " + \ "polish ccs reads. If you want to align ccs reads " + \ "in cmp.h5 format with pulse QVs loaded, use " + \ "--loadQVs with --useccs instead." raise ValueError(errMsg) args.loadQVs = True if args.loadQVs: if fileNames.pulseFileName is None: errMsg = "The input file has to be in bas/pls/ccs.h5 " + \ "format, or --pulseFile needs to be specified, " if getFileFormat(fileNames.outputFileName) != FILE_FORMATS.CMP: errMsg = "The output file has to be in cmp.h5 format, " if errMsg != "": errMsg += "in order to load pulse QVs." logging.error(errMsg) raise ValueError(errMsg) def _parseArgs(self): """Overwrite ToolRunner.parseArgs(self). Parse PBAlignRunner arguments considering both args in argumentList and args in a config file (specified by --configFile). 
""" pass def _output(self, inSam, refFile, outFile, readType=None, smrtTitle=False): """Generate a sam or a cmp.h5 file. Input: inSam : an input SAM file. (e.g. fileName.filteredSam) refFile : the reference file. (e.g. fileName.targetFileName) outFile : the output SAM or CMP.H5 file. (i.e. fileName.outputFileName) readType: standard or cDNA or CCS (can be None if not specified) Output: output, errCode, errMsg """ output, errCode, errMsg = "", 0, "" if getFileFormat(outFile) == FILE_FORMATS.SAM: #`mv inSam outFile` logging.info("OutputService: Genearte the output SAM file.") logging.debug("OutputService: Move {src} as {dst}".format( src=inSam, dst=outFile)) try: shutil.move(real_ppath(inSam), real_ppath(outFile)) except shutil.Error as e: output, errCode, errMsg = "", 1, str(e) elif getFileFormat(outFile) == FILE_FORMATS.CMP: #`samtoh5 inSam outFile -readType readType logging.info("OutputService: Genearte the output CMP.H5 " + "file using samtoh5.") prog = "samtoh5" cmd = "samtoh5 {samFile} {refFile} {outFile}".format( samFile=inSam, refFile=refFile, outFile=outFile) if readType is not None: cmd += " -readType {0} ".format(readType) if smrtTitle: cmd += " -smrtTitle " # Execute the command line logging.debug("OutputService: Call \"{0}\"".format(cmd)) output, errCode, errMsg = backticks(cmd) if errCode != 0: errMsg = prog + " returned a non-zero exit status." + errMsg logging.error(errMsg) raise RuntimeError(errMsg) return output, errCode, errMsg def _cleanUp(self, realDelete=False): """ Clean up temporary files and intermediate results. 
""" logging.debug("Clean up temporary files and directories.") self._tempFileManager.CleanUp(realDelete) # def _setupLogging(self): # LOG_FORMAT = "%(asctime)s [%(levelname)s] %(message)s" # if self.args.verbosity >= 2: # print "logLevel = debug" # logLevel = logging.DEBUG # elif self.args.verbosity == 1: # print "logLevel = info" # logLevel = logging.INFO # else: # print "logLevel = warn" # logLevel = logging.WARN # logging.basicConfig(level=logLevel, format=LOG_FORMAT) def run(self): """ The main function, it is called by PBToolRunner.start(). """ startTime = time.time() logging.info("pbalign version: {version}".format(version=get_version())) logging.debug("Original arguments: " + str(self._argumentList)) # Create an AlignService by algorithm name. self._alnService = self._createAlignService(self.args.algorithm, self.args, self.fileNames, self._tempFileManager) # Make sane. self._makeSane(self.args, self.fileNames) # Run align service. try: self._alnService.run() except RuntimeError: return 1 # Create a temporary filtered SAM file as output for FilterService. self.fileNames.filteredSam = self._tempFileManager.\ RegisterNewTmpFile(suffix=".sam") # Call filter service. self._filterService = FilterService(self.fileNames.alignerSamOut, self.fileNames.targetFileName, self.fileNames.filteredSam, self._alnService.name, self._alnService.scoreSign, self.args, self.fileNames.adapterGffFileName) try: self._filterService.run() except RuntimeError: return 1 # Output all hits either in SAM or CMP.H5. try: useSmrtTitle = False if (self.args.algorithm != "blasr" or self.fileNames.inputFileFormat == FILE_FORMATS.FASTA): useSmrtTitle = True self._output( self.fileNames.filteredSam, self.fileNames.targetFileName, self.fileNames.outputFileName, self.args.readType, useSmrtTitle) except RuntimeError: return 1 # Call post service for quiver. 
if self.args.forQuiver or self.args.loadQVs: postService = ForQuiverService(self.fileNames, self.args) try: postService.run() except RuntimeError: return 1 # Delete temporary files unless keepTmpFiles is set. self._cleanUp(False if (hasattr(self.args, "keepTmpFiles") and self.args.keepTmpFiles is True) else True) endTime = time.time() logging.info("Total time: {:.2f} s.".format(float(endTime - startTime))) return 0 def main(): pbobj = PBAlignRunner(sys.argv[1:]) return pbobj.start() if __name__ == "__main__": # For testing PBAlignRunner. # PBAlignRunner inherits PBToolRunner. So PBAlignRunner.start() parses args, # sets up logging and finally returns run(). sys.exit(main()) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/service.py000077500000000000000000000061131241505617700234000ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE.
THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines a virtual class Service.""" from __future__ import absolute_import from pbalign.utils.progutil import Availability, \ CheckAvailability, Execute class Service(object): """A Service object class.""" @property def name(self): """Service name.""" raise NotImplementedError( "Virtual property name() for {0} must be overwritten.". format(type(self))) @property def progName(self): """Program to call.""" raise NotImplementedError( "Virtual property progName() for {0} must be overwritten.". format(type(self))) @property def availability(self): """Return True if self.progName is available, otherwise False.""" return Availability(self.progName) @property def cmd(self): """Command to execute for this service.""" raise NotImplementedError( "Virtual property cmd() for {0} must be overwritten.".
format(type(self))) def checkAvailability(self): """Raise a runtime error if program for this service is not available.""" CheckAvailability(self.progName) def _execute(self): """Execute a command (self.cmd).""" return Execute(self.name, self.cmd) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/000077500000000000000000000000001241505617700225225ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/__init__.py000077500000000000000000000036411241505617700246420ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """Migrate pbpy scripts to bioinformatics.""" from __future__ import absolute_import pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/createChemistryHeader.py000077500000000000000000000154261241505617700273530ustar00rootroot00000000000000#!/usr/bin/env python """createChemistryHeader.py gets chemistry triple information for movies in a BLASR-produced SAM file. It writes a new SAM header file that contains the chemisty information. This header can be used with samtools reheader. Most of the work is actually done by BasH5Reader. """ import argparse import copy import logging import sys import pysam from pbcore.io import BasH5IO, FofnIO log = logging.getLogger('main') MOVIENAME_TAG = 'PU' class ChemistryLoadingException(Exception): """Exception when chemistry lookup fails.""" pass def format_rgds_entries(rgds_entries): """Turn the RG DS dictionary into a list of strings that can be placed into a header object. """ rgds_strings = {} for rg_id in rgds_entries: rgds_string = ("BINDINGKIT:{b};SEQUENCINGKIT:{s};" "SOFTWAREVERSION:{v}" .format(b=rgds_entries[rg_id][0], s=rgds_entries[rg_id][1], v=rgds_entries[rg_id][2])) rgds_strings[rg_id] = rgds_string return rgds_strings def extend_header(old_header, new_rgds_strings): """Create a new SAM/BAM header, adding the RG descriptions to the old_header. 
""" new_header = copy.deepcopy(old_header) for rg_entry in new_header['RG']: try: new_ds_string = new_rgds_strings[rg_entry['ID']] except KeyError: continue if 'DS' in rg_entry: rg_entry['DS'] += ';' + new_ds_string else: rg_entry['DS'] = new_ds_string return new_header def get_chemistry_info(sam_header, input_filenames, fail_on_missing=False): """Get chemistry triple information for movies referenced in a SAM header. Args: sam_header: a pysam.Samfile.header, which is a multi-level dictionary. Movie names are read from RG tags in this header. input_filenames: a list of bas, bax, or fofn filenames. fail_on_missing: if True, raise an exception if the chemistry information for a movie in the header cannot be found. If False, just log a warning. Returns: a list of strings that can be written as DS tags to RG entries in the header of a new SAM or BAM file. For example, ['BINDINGKIT:xxxx;SEQUENCINGKIT:yyyy;SOFTWAREVERSION:2.0'] Raises: ChemistryLoadingException if chemistry information cannot be found for a movie in the header and fail_on_missing is True. """ # First get the full list of ba[sx] files, reading through any fofn or xml # inputs bas_filenames = [] for filename in input_filenames: bas_filenames.extend(FofnIO.enumeratePulseFiles(filename)) # Then get the chemistry triple for each movie in the list of bas files triple_dict = {} for bas_filename in bas_filenames: bas_file = BasH5IO.BasH5Reader(bas_filename) movie_name = bas_file.movieName chem_triple = bas_file.chemistryBarcodeTriple triple_dict[movie_name] = chem_triple # Finally, find the movie names that appear in the header and create CO # lines with the chemistry triple if 'RG' not in sam_header: return [] rgds_entries = {} for rg_entry in sam_header['RG']: rg_id = rg_entry['ID'] rg_movie_name = rg_entry[MOVIENAME_TAG] try: rg_chem_triple = triple_dict[rg_movie_name] rgds_entries[rg_id] = rg_chem_triple except KeyError: err_msg = ("Cannot find chemistry information for movie {m}." 
.format(m=rg_movie_name)) if fail_on_missing: raise ChemistryLoadingException(err_msg) else: log.warning(err_msg) rgds_strings = format_rgds_entries(rgds_entries) return rgds_strings def get_parser(): """Return an ArgumentParser for createChemistryHeader options.""" desc = ("createChemistryHeader creates a SAM header that contains the " "chemistry information used by Quiver.") parser = argparse.ArgumentParser( prog='createChemistryHeader.py', description=desc, formatter_class=argparse.ArgumentDefaultsHelpFormatter) parser.add_argument("--debug", help="Output detailed log information.", action='store_true') def sam_or_bam_filename(val): """Check that val names a SAM or BAM file.""" if not (val.endswith(".bam") or val.endswith(".sam")): raise argparse.ArgumentTypeError( "File must end with .sam or .bam. {f} doesn't " "end with either of those." .format(f=val)) return val parser.add_argument( "input_alignment_file", help="A SAM or BAM file produced by BLASR.", type=sam_or_bam_filename) parser.add_argument( "output_header_file", help=("Name of the SAM or BAM header file that will be created with " "chemistry information loaded."), type=sam_or_bam_filename) parser.add_argument( "--bas_files", help=("The bas or bax files containing the reads that were aligned in " "the input_alignment_file.
Also can be a fofn of bas or bax " "files."), nargs='+', required=True) return parser def setup_log(alog, file_name=None, level=logging.DEBUG, str_formatter=None): """Util function for setting up logging.""" if file_name is None: handler = logging.StreamHandler(sys.stderr) else: handler = logging.FileHandler(file_name) if str_formatter is None: str_formatter = ('[%(levelname)s] %(asctime)-15s ' '[%(name)s %(funcName)s %(lineno)d] %(message)s') formatter = logging.Formatter(str_formatter) handler.setFormatter(formatter) alog.addHandler(handler) alog.setLevel(level) def main(): """Entry point.""" parser = get_parser() args = parser.parse_args() if args.debug: setup_log(log, level=logging.DEBUG) else: setup_log(log, level=logging.INFO) input_file = pysam.Samfile(args.input_alignment_file, 'r') input_header = input_file.header log.debug("Read header from {f}.".format(f=input_file.filename)) chemistry_rgds_strings = get_chemistry_info( input_header, args.bas_files) new_header = extend_header(input_header, chemistry_rgds_strings) if args.output_header_file.endswith('.bam'): output_file = pysam.Samfile(args.output_header_file, 'wb', header=new_header) elif args.output_header_file.endswith('.sam'): output_file = pysam.Samfile(args.output_header_file, 'wh', header=new_header) output_file.close() if __name__ == '__main__': main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/extractUnmappedSubreads.py000077500000000000000000000161611241505617700277410ustar00rootroot00000000000000#!/usr/bin/env python """ Starts with the filtered_reads.fa file. Reads in the control.cmp.h5 and reference.cmp.h5, removing any subreads that map. Writes resulting fasta entries to stdout. 
""" import sys import logging import os.path as op import re import h5py from pbcore.io import FastaReader from pbcore.util.ToolRunner import PBToolRunner __version__ = "0.1.0.133504" class ExtractRunner(PBToolRunner): """ExtractUnmappedReads Runner.""" def __init__(self): """Handle command line argument parsing""" desc = "Extract unmapped subreads from a fasta file." PBToolRunner.__init__(self, desc) self.set_parser(self.parser) self.fastaFN = None self.cmpH5FNs = [] def set_parser(self, parser): """Set parser.""" parser.add_argument("fasta", type=str, help="a fasta file containing all subreads.") parser.add_argument("cmph5", metavar="cmp.h5", nargs="+", help="input cmp.h5 files.") return parser def getVersion(self): """Get version string.""" return __version__ def _getFastaReadsInfo(self, fastaReads): """ Get reads' info (not including alignments) from the fasta. """ pattern = re.compile(r"(m.+)\/(\d+)\/(\d+)_(\d+)") #for entry in FastaIO.SimpleFastaReader(self.fastaFN): with FastaReader(self.fastaFN) as reader: for entry in reader: match = pattern.search( entry.name.strip() ) if not match: continue movie, holeNumber, srStart, srEnd = match.groups() holeNumber, srStart, srEnd = \ int(holeNumber), int(srStart), int(srEnd) fastaReads.setdefault(movie, {}) fastaReads[movie].setdefault(holeNumber, []) fastaReads[movie][holeNumber].append((srStart, srEnd)) def _loadMappedSubreads(self, subreads, cmpH5FN): """Loads all subreads from the specified cmpH5 into the subread data structure.""" cmpFile = h5py.File(cmpH5FN, 'r') movieInfo = cmpFile["/MovieInfo"] movieDict = dict(zip(movieInfo["ID"], movieInfo["Name"])) numAln = cmpFile["/AlnInfo/AlnIndex"].shape[0] movieIdIdx, holeIdx, startIdx, endIdx = 2, 7, 11, 12 if numAln != 0: for row in cmpFile["/AlnInfo/AlnIndex"].value: movie = movieDict[row[movieIdIdx]] subreads.setdefault(movie, {}) subreads[movie].setdefault(row[holeIdx], []) subreads[movie][row[holeIdx]].append( (row[startIdx], row[endIdx])) logging.info("Loaded 
{n} subreads from {f}".format(n=numAln, f=cmpH5FN)) cmpFile.close() def _rmMappedReads( self, fastaReadsPos, cmpReadsPos): """ Remove fasta reads that are mapped. """ # For a read of a hole, the number of subreads that can map to reference # is usually small (e.g. 1 or 2). Actually, it is not profitable at all # to "sort and binary search for mapped subreads" based on experiments. i = 0 while i < len(fastaReadsPos): fStart, fEnd = fastaReadsPos[i] for cStart, cEnd in cmpReadsPos: if cStart >= fStart and cEnd <= fEnd: logging.debug("{0} {1} in {2} {3} ?".\ format(cStart, cEnd, fStart, fEnd)) fastaReadsPos.pop(i) i -= 1 break i += 1 def _printUnMappedReads(self, fastaReads): """Print unmapped subreads.""" pattern = re.compile(r"(m.+)\/(\d+)\/(\d+)_(\d+)") with FastaReader(self.fastaFN) as reader: for entry in reader: match = pattern.search( entry.name.strip() ) if not match: continue movie, holeNumber, srStart, srEnd = match.groups() holeNumber, srStart, srEnd = \ int(holeNumber), int(srStart), int(srEnd) if movie in fastaReads and \ holeNumber in fastaReads[movie] and \ (srStart, srEnd) in fastaReads[movie][holeNumber]: entry.COLUMNS=70 print str(entry) def run(self): """Executes the body of the script.""" logging.info("Running {f} v{v}.".format(f=op.basename(__file__), v=self.getVersion)) args = self.args logging.info("Extracting unmapped reads from a fasta file.") self.fastaFN = args.fasta self.cmpH5FNs = args.cmph5 logging.debug("Input fasta is {f}.".format(f=self.fastaFN)) logging.debug("Input fasta is {f}.".format(f=self.cmpH5FNs)) fastaReads = {} self._getFastaReadsInfo(fastaReads) subreads = {} for cmpH5FN in self.cmpH5FNs: subreads = { } self._loadMappedSubreads(subreads, cmpH5FN) for movie in subreads: for holeNumber in subreads[movie]: logging.debug("Movie: {m}".format(m=movie)) if movie not in fastaReads: break elif holeNumber not in fastaReads[movie]: continue # Remove mapped reads from fastaReads[movie][holeNumber] 
self._rmMappedReads(fastaReads[movie][holeNumber], subreads[movie][holeNumber]) # Print unmapped reads self._printUnMappedReads(fastaReads) # The following code can cut the memory used in half (if there is only # one cmpH5FN) at the expense of increased running time # (by 80% ~ 100% if there are multiple movies). # def run( self ): # """Executes the body of the script.""" # # logging.info("Log level set to INFO") # logging.debug("Log Level set to DEBUG") # # subreads = { } # for cmpH5FN in self.cmpH5FNs: # self._loadMappedSubreads( subreads, cmpH5FN ) # # subreadId = re.compile("(m.+)\/(\d+)\/(\d+)_(\d+)") # for entry in FastaIO.SimpleFastaReader( self.fastaFN ): # match = subreadId.search( entry.name.strip() ) # if not match: # continue # movie, read, srStart, srEnd = [match.group(i) for i in range(1,5)] # read, srStart, srEnd = int(read), int(srStart), int(srEnd) # mapped = False # if movie in subreads and read in subreads[ movie ]: # ids = subreads[ movie ][ read ] # logging.debug("Movie: %s" % movie) # for id in ids: # logging.debug("%s inside %s?" % # ( str(id), "(%s,%s)" % ( srStart, srEnd))) # if id[0] >= srStart and id[1] <= srEnd: # mapped = True # break # if not mapped: # print str(entry) # # return 0 def main(): """Main entry""" runner = ExtractRunner() return runner.start() if __name__ == "__main__": sys.exit(main()) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/loadChemistry.py000077500000000000000000000046631241505617700257170ustar00rootroot00000000000000#!/usr/bin/env python USAGE = \ """ loadChemistry.py Load chemistry info into a cmp.h5, just copying the triple. Note that there is no attempt to "decode" chemistry barcodes here---this is a dumb pipe. 
usage: % loadChemistry [input.fofn | list of input.ba[sx].h5] aligned_reads.cmp.h5 """ import sys, h5py, numpy as np from pbcore.io import * class ChemistryLoadingException(BaseException): pass STRING_DTYPE = h5py.special_dtype(vlen=bytes) def safeDelete(group, dsName): if dsName in group: del group[dsName] def writeTriples(movieInfoGroup, triplesByMovieName): movieNamesInCmpH5 = list(movieInfoGroup["Name"]) if not set(movieNamesInCmpH5).issubset(set(triplesByMovieName.keys())): raise ChemistryLoadingException, "Mismatch between movies in input.fofn and cmp.h5 movies" safeDelete(movieInfoGroup, "BindingKit") safeDelete(movieInfoGroup, "SequencingKit") safeDelete(movieInfoGroup, "SoftwareVersion") shape = movieInfoGroup["Name"].shape bindingKit = movieInfoGroup.create_dataset("BindingKit" , shape=shape, dtype=STRING_DTYPE, maxshape=(None,)) sequencingKit = movieInfoGroup.create_dataset("SequencingKit" , shape=shape, dtype=STRING_DTYPE, maxshape=(None,)) softwareVersion = movieInfoGroup.create_dataset("SoftwareVersion", shape=shape, dtype=STRING_DTYPE, maxshape=(None,)) for (movieName, triple) in triplesByMovieName.items(): if movieName in movieNamesInCmpH5: idx = movieNamesInCmpH5.index(movieName) bindingKit[idx] = triple[0] sequencingKit[idx] = triple[1] softwareVersion[idx] = triple[2] assert all(bindingKit.value != "") assert all(sequencingKit.value != "") assert all(softwareVersion.value != "") def main(): if len(sys.argv) < 3: print USAGE return -1 inputFilenames = sys.argv[1:-1] cmpFname = sys.argv[-1] if len(inputFilenames) == 1 and inputFilenames[0].endswith(".fofn"): basFnames = list(enumeratePulseFiles(inputFilenames[0])) else: basFnames = inputFilenames f = h5py.File(cmpFname, "r+") movieInfoGroup = f["MovieInfo"] triples = {} for basFname in basFnames: bas = BasH5Reader(basFname) movieName = bas.movieName chemTriple = bas.chemistryBarcodeTriple triples[movieName] = chemTriple writeTriples(movieInfoGroup, triples) if __name__ == '__main__': main() 
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/tools/mask_aligned_reads.py000077500000000000000000000150531241505617700266770ustar00rootroot00000000000000#!/usr/bin/env python """Takes in a rgn.fofn and corresponding cmp.h5. Uses the alignments from the cmp.h5 to mask corresponding regions of the rgn.h5s. Writes output to a new rgn.fofn.""" import os import sys import logging import argparse import re from pbcore.io import CmpH5Reader, EmptyCmpH5Error import traceback from pbalign.utils.RgnH5IO import RgnH5Reader, RgnH5Writer __VERSION__ = "0.3" class AlignedReadsMasker(object): """Mask aligned reads in a region table. Input: inCmpFile - a cmp.h5 file with alignments. inRgnFofn - a input fofn of region table files. Output: outRgnFofn - a output fofn of region table files. Generate new rgn.h5 files, which mask aligned reads in `inRgnFofn` by overwritting their corresponding HQ regions to (0, 0). The generated new rgn.h5 files have to be stored in the same directory as `outRgnFofn`. """ def __init__(self, inCmpFile, inRgnFofn, outRgnFofn): self.inCmpFile = inCmpFile self.inRgnFofn = inRgnFofn self.outRgnFofn = outRgnFofn def maskAlignedReads(self): """Mask aligned zmws in region tables.""" logging.info("Log level set to INFO") logging.debug("Log Level set to DEBUG") alignedReads = self._extractAlignedReads() nreads = sum([len(v) for v in alignedReads.values()]) logging.info("Extracted {r} reads ({m} movies) from {f}".format( r=nreads, m=len(alignedReads), f=self.inCmpFile)) outDir = os.path.splitext(self.outRgnFofn)[0] if not os.path.exists(outDir): os.mkdir(outDir) outRgnFofn = open(self.outRgnFofn, 'w') # Check for new format generated from bax files. 
# m130226_022844_...131362_s1_p0.3.rgn.h5 rx = re.compile(r'\.[0-9].rgn\.h5') for rgnH5FN in [line.strip() for line in open(self.inRgnFofn, 'r')]: if not rgnH5FN.endswith("rgn.h5"): logging.error("Region table file " + "{0} should be a rgn.h5 file.".format(rgnH5FN)) return 1 rgnReader = RgnH5Reader(rgnH5FN) basename = os.path.basename(rgnH5FN) # Default movie name movieName = rgnReader.movieName # 'movieId' is used to write the file compatible with bax style. # m130226_022844_ethan_c100471672550000001823071906131362_s1_p0.3 if rx.search(basename): movieId = re.split(r'.rgn\.h5', basename)[0] else: # old format # m130226_022844_....131362_s1_p0.rgn.h5 movieId = movieName outH5FN = os.path.abspath(os.path.join(outDir, movieId + ".rgn.h5")) outRgnFofn.write("{o}\n".format(o=outH5FN)) rgnWriter = RgnH5Writer(outH5FN) rgnWriter.writeScanDataGroup(rgnReader.scanDataGroup) logging.info("Processing {f}...".format(f=rgnH5FN)) for rt in rgnReader: if movieName in alignedReads and \ rt.holeNumber in alignedReads[movieName]: rt.setHQRegion(0, 0) rgnWriter.addRegionTable(rt) rgnReader.close() rgnWriter.close() outRgnFofn.close() return 0 def _extractAlignedReads(self): """Grab a mapping of all movie names of aligned reads to hole numbers. and return { Movie: [HoleNumbers ...] }. """ alignedReads = {} try: reader = CmpH5Reader(self.inCmpFile) for movie in reader.movieInfoTable.Name: alignedReads.setdefault(movie, set()) for i in reader: alignedReads[i.movieInfo.Name].add(i.HoleNumber) reader.close() except (IndexError, EmptyCmpH5Error): msg = "No aligned reads found in {x}".format(x=self.inCmpFile) sys.stderr.write(msg + "\n") logging.warn(msg) return alignedReads def getParser(): """Add arguments to an argument parser and return it. usage = "%prog [--help] [options] cmp.h5 rgn.fofn rgn_out.fofn" """ desc = "Use in.cmp.h5 to mask corresponing regions of files in " + \ "in.rgn.h5, write output to a new rgn.fofn." 
parser = argparse.ArgumentParser( description=desc, version=__VERSION__, formatter_class=argparse.ArgumentDefaultsHelpFormatter) parser.add_argument( "-l", "--logFile", default=None, help="Specify a file to log to. Defaults to stderr.") parser.add_argument( "-d", "--debug", default=False, action="store_true", help="Increases verbosity of logging") parser.add_argument( "-i", "--info", default=False, action="store_true", help="Display informative log entries") parser.add_argument( "inCmpFile", type=str, help="An input cmp.h5 file.") parser.add_argument( "inRgnFofn", type=str, help="A fofn of input region table files.") parser.add_argument( "outRgnFofn", type=str, help="A fofn of output region table files.") return parser def configLog(isInfo, isDebug, logFile): """Sets up logging based on command line arguments. Allows for three levels of logging: logging.error( ): always emitted logging.info( ): emitted with --info or --debug logging.debug( ): only with --debug """ logLevel = logging.DEBUG if isDebug else \ logging.INFO if isInfo else logging.ERROR logFormat = "%(asctime)s [%(levelname)s] %(message)s" if logFile is not None: logging.basicConfig(filename=logFile, level=logLevel, format=logFormat) else: logging.basicConfig(stream=sys.stderr, level=logLevel, format=logFormat) def run(inCmpFile, inRgnFofn, outRgnFofn): """Main function to run mask aligned reads().""" masker = AlignedReadsMasker(inCmpFile, inRgnFofn, outRgnFofn) try: masker.maskAlignedReads() except Exception as e: logging.error(e, exc_info=True) traceback.print_exc(file=sys.stderr) return 1 return 0 def main(): """Main function.""" parser = getParser() args = parser.parse_args() # configLog expects (isInfo, isDebug, logFile), in that order. configLog(args.info, args.debug, args.logFile) rcode = run(args.inCmpFile, args.inRgnFofn, args.outRgnFofn) logging.info("Exiting {f} {v} with return code {r}.".format( r=rcode, f=os.path.basename(__file__), v=__VERSION__)) return rcode if __name__ == "__main__": sys.exit(main())
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/000077500000000000000000000000001241505617700225225ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/RgnH5IO.py000077500000000000000000000227101241505617700242540ustar00rootroot00000000000000#!/usr/bin/env python # Author: Yuan Li """ Region table reader and writer. """ __all__ = ["RgnH5Reader", "RgnH5Writer"] import h5py import os.path as op import numpy as np from pbcore.io.BasH5IO import ADAPTER_REGION, INSERT_REGION, HQ_REGION, \ REGION_TABLE_DTYPE, toRecArray, _makeRegionTableIndex __version__ = "1.0" REGION_COLUMN_NAMES = ( "HoleNumber", "TypeIndex", "Start", "End", "Score" ) REGION_TYPES = ( "Adapter", "Insert", "HQRegion" ) REGION_SOURCES = ( "AdapterFinding", "AdapterFinding", "PulseToBase Region classifer" ) REGION_DESCRIPTIONS = ( "Adapter Hit", "Insert Region", "High Quality bases region. Score is 1000 * predicted accuracy, " + "where predicted accuary is 0 to 1.0" ) class Region(object): """ A `Region` represents a row in /PulseData/Regions, including five fields: HoleNumber, TypeIndex, Start, End and Score. 
""" __slots__ = ['holeNumber', 'typeIndex', 'start', 'end', 'score'] def __init__(self, l): self.holeNumber = np.int32(l[0]) self.typeIndex = np.int32(l[1]) self.start = np.int32(l[2]) self.end = np.int32(l[3]) self.score = np.int32(l[4]) def __repr__(self): return "ZMW=%r, %r, start=%r, end=%r, score=%r\n" % ( self.holeNumber, REGION_TYPES[self.typeIndex], self.start, self.end, self.score) def toTuple(self): """Convert a Region object to a tuple.""" return (self.holeNumber, self.typeIndex, self.start, self.end, self.score) def setStartAndEnd(self, newStart, newEnd): """Reset start and end.""" self.start = newStart self.end = newEnd @property def isHqRegion(self): """Is this a HQ region?""" return self.typeIndex == HQ_REGION @property def isAdapter(self): """Is this an adapter?""" return self.typeIndex == ADAPTER_REGION @property def isInsert(self): """Is this an insert region?""" return self.typeIndex == INSERT_REGION class RegionTable(object): """ A `RegionTable` represents a list of all regions of a ZMW. """ def __init__(self, holeNumber, regions): self.holeNumber = holeNumber for r in regions: assert self.holeNumber == r.holeNumber, \ "RegionTable instantiated with holeNumber %i != " \ "region holeNumber %i" % (self.holeNumber, r.holeNumber) self.regions = regions def __str__(self): ret = "ZMW: %s, regions are:\n" % self.holeNumber for r in self.regions: ret += " (%s, %s, %s, %s)" % (REGION_TYPES[r.typeIndex], r.start, r.end, r.score) return ret def setHQRegion(self, newHQStart, newHQEnd): """ If a HQ region exists, reset HQ region; otherwise add one. """ hqRegionsFound = 0 for r in self.regions: if r.isHqRegion: r.setStartAndEnd(newHQStart, newHQEnd) hqRegionsFound += 1 if hqRegionsFound == 0: # No HQ Region exists in the region table, add one to the table. self.regions.append( Region([self.holeNumber, HQ_REGION, newHQStart, newHQEnd, 0])) elif hqRegionsFound > 1: # If more than one HQ region exists, give a warning. 
print "WARNING: Found more than one HQ region in ZMW %s." % \ self.holeNumber def __len__(self): return self.regions.__len__() def __getitem__(self, key): return self.regions.__getitem__(key) def __delitem__(self, key): return self.regions.__delitem__(key) def __setitem__(self, key, value): return self.regions.__setitem__(key, value) @property def numRegions(self): """Return the number of regions in a ZMW's region table.""" return len(self.regions) def toList(self): """Return a list of regions.""" return [r.toTuple() for r in self.regions] class RgnH5Reader(object): """ The `RgnH5Reader` class provides access to rgn.h5 files. Region tables are usually small (e.g. a few MB), so we can cache all data. To use RgnH5Reader and RgnH5Writer: reader = RgnH5Reader(inFileName) writer = RgnH5Writer(outFileName) writer.writeScanDataGroup(reader.scanDataGroup) for rt in reader: writer.addRegionTable(rt) reader.close() writer.close() """ def __init__(self, filename): self.filename = op.abspath(op.expanduser(filename)) self.file = h5py.File(self.filename, 'r') if "Regions" in self.file["/PulseData"]: self._regionsGroup = self.file["/PulseData/Regions"] else: raise TypeError("Unsupported region table which does not " + "contain /PulseData/Regions: %s " % self.filename) self._regionsData = toRecArray( REGION_TABLE_DTYPE, self._regionsGroup.value) self._regionTableIndex = _makeRegionTableIndex( self._regionsData.holeNumber) self.holeNumbers = set(self._regionsData.holeNumber) def __iter__(self): for holeNumber in self.holeNumbers: startRow, endRow = self._regionTableIndex[holeNumber] yield RegionTable( holeNumber, [Region(r) for r in self._regionsData[startRow:endRow]]) def __enter__(self): return self def __exit__(self, exc_type, exc_value, traceback): self.close() @property def movieName(self): """Copied from BasH5Reader, written by David Alexander.""" movieNameAttr = self.file["/ScanData/RunInfo"].attrs["MovieName"] # In old bas.h5 files, attributes of ScanData/RunInfo are stored
as # strings in arrays of length one. if (isinstance(movieNameAttr, (np.ndarray, list)) and len(movieNameAttr) == 1): movieNameString = movieNameAttr[0] else: movieNameString = movieNameAttr if not isinstance(movieNameString, basestring): raise TypeError("Unsupported movieName {m} of type {t}." .format(m=movieNameString, t=type(movieNameString))) return movieNameString @property def numZMWs(self): """Return the number of ZMWs in the region table.""" return len(self.holeNumbers) @property def scanDataGroup(self): """Return /ScanData Group.""" return self.file["/ScanData"] def close(self): """Close the file.""" if hasattr(self, "file") and self.file is not None: self.file.close() self.file = None def addStrListAttr(obj, attrName, attrList): """Add a string list as an attribute to a hdf5 object.""" obj.attrs[attrName] = np.array(attrList, dtype=h5py.new_vlen(str)) class RgnH5Writer(object): """Region table writer.""" def __init__(self, filename): self.filename = op.abspath(op.expanduser(filename)) if not self.filename.endswith("rgn.h5"): raise TypeError("File extension of region table: " + "%s should be rgn.h5" % self.filename) self.file = h5py.File(self.filename, 'w') self.regions = [] def _addVersion(self): """Add version to file.""" self.file.attrs['Version'] = __version__ def _addRegionsDataset(self): """Add /PulseData/Regions dataset.""" # Create /PulseData group. pulseDataGroup = self.file.create_group("PulseData") # Get the total number of regions in region table. numRegions = len(self.regions) shape = (max(1, numRegions), len(REGION_COLUMN_NAMES)) # Add /PulseData/Regions dataset. # The datatype is int32 instead of uint32 because scores can be -1. regionsDataset = pulseDataGroup.create_dataset( "Regions", shape, np.int32, maxshape=(None, len(REGION_COLUMN_NAMES))) # Add attributes to Regions. 
addStrListAttr(regionsDataset, "ColumnNames", REGION_COLUMN_NAMES) addStrListAttr(regionsDataset, "RegionTypes", REGION_TYPES) addStrListAttr(regionsDataset, "RegionDescriptions", REGION_DESCRIPTIONS) addStrListAttr(regionsDataset, "RegionSources", REGION_SOURCES) # Fill Regions dataset. if len(self.regions) == 0: self.regions = [(0, 0, 0, 0, 0)] regionsDataset[:] = np.array(self.regions) def writeScanDataGroup(self, scanDataGroup=None): """Copy /ScanData group if not None.""" if scanDataGroup is not None: self.file.copy(scanDataGroup, "/ScanData") def addRegionTable(self, regionTable): """Add a ZMW's region table to the writer's region table list.""" self.regions.extend(regionTable.toList()) def write(self): """Write the region table list to file.""" # ensure the output is sorted by hole number, de facto "spec" for rgn.h5 self.regions.sort(key=lambda x:x[0]) self._addVersion() self._addRegionsDataset() def close(self): """Close the file.""" if hasattr(self, "file") and self.file is not None: self.write() self.file.close() self.file = None def __enter__(self): return self def __exit__(self, exc_type, exc_value, traceback): self.close() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/__init__.py000077500000000000000000000035621241505617700246440ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. 
# * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li from __future__ import absolute_import pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/fileutil.py000077500000000000000000000353251241505617700247240ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. 
# * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
############################################################################### # Author: Yuan Li """This script defines functions for handling input and output files.""" from __future__ import absolute_import import os.path as op import logging from xml.etree import ElementTree as ET from pbcore.util.Process import backticks def enum(**enums): """Simulate enum.""" return type('Enum', (), enums) FILE_FORMATS = enum(FASTA="FASTA", PLS="PLS_H5", PLX="PLX_H5", BAS="BAS_H5", BAX="BAX_H5", FOFN="FOFN", SAM="SAM", CMP="CMP_H5", RGN="RGN_H5", SA="SA", XML="XML", UNKNOWN="UNKNOWN", CCS="CCS_H5") VALID_INPUT_FORMATS = (FILE_FORMATS.FASTA, FILE_FORMATS.PLS, FILE_FORMATS.PLX, FILE_FORMATS.BAS, FILE_FORMATS.BAX, FILE_FORMATS.FOFN, FILE_FORMATS.CCS) VALID_REGIONTABLE_FORMATS = (FILE_FORMATS.RGN, FILE_FORMATS.FOFN) VALID_OUTPUT_FORMATS = (FILE_FORMATS.CMP, FILE_FORMATS.SAM) def real_ppath(fn): """Return real 'python-style' path of a file. Consider files with white spaces in their paths, such as 'res\ with\ space/out.sam' or 'res with space/out.sam', 'res\ with\ space/out.sam' is unix-style file path. 'res with space/out.sam' is python style file path. We need to convert all '\_' in path to ' ' so that python can handle files with space correctly, which means that 'res\ with\ space/out.sam' will be converted to 'res with space/out.sam'. """ return op.abspath(op.expanduser(fn)).replace(r'\ ', ' ') def real_upath(fn): """Return real 'unix-style' path of a file. Consider files with white spaces in their paths, such as 'res\ with\ space/out.sam' or 'res with space/out.sam', 'res\ with\ space/out.sam' is unix-style file path. 'res with space/out.sam' is python style file path. We need to convert all ' ' to '\ ' so that unix can handle files with space correctly, which means that 'res with space/out.sam' will be converted to 'res\ with\ space/out.sam'. """ return real_ppath(fn).replace(' ', r'\ ') def isExist(ff): """Return whether a file or a dir ff exists or not. 
Call ls instead of python os.path.exists to eliminate NFS errors. """ if ff is None: return False cmd = "ls %s" % real_upath(ff) _output, errCode, _errMsg = backticks(cmd) return (errCode == 0) def isValidInputFormat(ff): """Return True if ff is a valid input file format.""" return ff in VALID_INPUT_FORMATS def isValidOutputFormat(ff): """Return True if ff is a valid output file format.""" return ff in VALID_OUTPUT_FORMATS def isValidRegionTableFormat(ff): """Return True if ff is a valid region table file format.""" return ff in VALID_REGIONTABLE_FORMATS def getFileFormat(filename): """Verify and return a file's format. If a file format is supported, return the format. Otherwise, return FILE_FORMATS.UNKNOWN. """ base, ext = op.splitext(filename) ext = ext.lower() if ext in [".fa", ".fasta", ".fsta", ".fna"]: return FILE_FORMATS.FASTA elif ext in [".sam"]: return FILE_FORMATS.SAM elif ext in [".sa"]: return FILE_FORMATS.SA elif ext in [".fofn"]: return FILE_FORMATS.FOFN elif ext in [".xml"]: return FILE_FORMATS.XML elif ext in [".h5"]: ext = op.splitext(base)[1].lower() if ext in [".pls"]: return FILE_FORMATS.PLS elif ext in [".plx"]: return FILE_FORMATS.PLX elif ext in [".bas"]: return FILE_FORMATS.BAX elif ext in [".bax"]: return FILE_FORMATS.BAX elif ext in [".cmp"]: return FILE_FORMATS.CMP elif ext in [".rgn"]: return FILE_FORMATS.RGN elif ext in [".ccs"]: return FILE_FORMATS.CCS return FILE_FORMATS.UNKNOWN def getFilesFromFOFN(fofnname): """ Given a fofn file, return a sorted list of unix-style absolute paths of all files in the fofn. """ lines = [] with open(real_ppath(fofnname), 'r') as f: lines = f.readlines() lines.sort() return [real_upath(l.strip()) for l in lines] def getFileFormatsFromFOFN(fofnname): """ Given a fofn file, return a list of file formats of all files in this fofn. 
""" fs = getFilesFromFOFN(fofnname) return [getFileFormat(f) for f in fs] def checkInputFile(filename, validFormats=VALID_INPUT_FORMATS): """ Check whether an input file has the valid file format and exists. If an input file is a fofn, check whether all files names in the fofn exist. Return a list of absolute paths of all input files. """ filename = real_ppath(filename) if not getFileFormat(filename) in validFormats: errMsg = "The input file format can only be {fm}.".format( fm=",".join(validFormats)) logging.error(errMsg) raise IOError(errMsg) if not isExist(filename): errMsg = "Input file {fn} does not exist.".format(fn=filename) logging.error(errMsg) raise IOError(errMsg) if getFileFormat(filename) == FILE_FORMATS.FOFN: fileList = getFilesFromFOFN(filename) if len(fileList) == 0: errMsg = "FOFN file {fn} is empty.".format(fn=filename) logging.error(errMsg) raise ValueError(errMsg) fileListRet = [] for f in fileList: if not isExist(f): errMsg = "A file in the fofn {fn} does not exist.".format(fn=f) logging.error(errMsg) raise IOError(errMsg) else: fileListRet.append(f) return real_upath(filename) def getRealFileFormat(filename): """Return file format if filename is not a FOFN, otherwise return format of the first file within FOFN.""" if getFileFormat(filename) == FILE_FORMATS.FOFN: fileList = getFilesFromFOFN(filename) assert len(fileList) != 0 return getFileFormat(fileList[0]) else: return getFileFormat(filename) def checkRegionTableFile(filename): """ Check whether the specified region table has the right format and exists. Return absolute path of the region table file. """ if filename is None: return None return checkInputFile(filename, validFormats=VALID_REGIONTABLE_FORMATS) def checkOutputFile(filename): """ Check whether an output file is writable or not. Return absolute path of the output file. """ filename = real_ppath(filename) if not isValidOutputFormat(getFileFormat(filename)): errMsg = "The output file format can only be SAM or CMP.H5." 
logging.error(errMsg) raise ValueError(errMsg) try: with open(filename, "a"): pass except IOError as e: errMsg = "Could not access output file {fn}.".format(fn=filename) logging.error(errMsg) raise IOError(errMsg + str(e)) return real_upath(filename) class ReferenceInfo: """Parse reference.info.xml in reference path.""" def __init__(self, fileName): fileName = real_ppath(fileName) if getFileFormat(fileName) != FILE_FORMATS.XML: errMsg = "The reference info file is not in XML format." raise ValueError(errMsg) self.dirname = op.dirname(fileName) self.refFastaFile = None self.refSawriterFile = None self.desc = None self.adapterGffFile = None self.fileName = real_upath(fileName) self._parse() def __repr__(self): """ Represent a reference info object.""" desc = "Reference Info Object:" desc += "File Name: {f}".format(f=self.fileName) desc += "Reference FASTA File: {f}".format(f=self.refFastaFile) desc += "Reference Suffix Array File: {f}".format( f=self.refSawriterFile) desc += "Description: {d}".format(d=self.desc) if self.adapterGffFile is not None: desc += "Adapter GFF file: {f}".format(f=self.adapterGffFile) return desc def _parse(self): """Parse reference.info.xml in reference folder.""" fileName = real_ppath(self.fileName) if isExist(fileName): try: tree = ET.parse(fileName) root = tree.getroot() ref = root.find("reference") refFile = ref.find("file") refFormat = refFile.get("format") if refFormat.lower().find("text/fasta") != -1: self.refFastaFile = real_upath(op.join( self.dirname, op.relpath(refFile.text))) else: errMsg = "Could not find the reference fasta " + \ "file in reference.info.xml." 
raise IOError(errMsg) for node in ref.getchildren(): if node.tag == "description": self.desc = node.text if node.tag == "index_file" and \ node.get("type").lower() == "sawriter": self.refSawriterFile = real_upath(op.join( self.dirname, op.relpath(node.text))) # Get the adapter annotation GFF file annotations = root.findall("annotations/annotation") for annotation in annotations: if annotation.get("type") == "adapter": self.adapterGffFile = real_upath(op.join( self.dirname, annotation.find("file").text)) break except IOError as e: raise IOError(str(e)) except ET.ParseError as e: errMsg = "Failed to parse {f}".format(f=fileName) raise ET.ParseError(errMsg) else: errMsg = "{fn} is not a valid reference info file."\ .format(fn=fileName) raise IOError(errMsg) def checkReferencePath(inRefpath): """Validate input reference path. Check whether the input reference path exists or not. Input : can be a FASTA file or a reference repository. Output: [refpath, FASTA_file, None, False, gff], if input is a FASTA file, and it is not located within a reference repository. [refpath, FASTA_file, SA_file, True, gff], if input is a FASTA file, and it is located within a reference repository. [refpath, FASTA_file, SA_file, True, gff], if input is a reference repository produced by PacBio referenceUploader. """ fastaFile, sawriterFile, refinfoxml = None, None, None isWithinRepository, adapterGffFile = None, None refpath = real_ppath(inRefpath) if not isExist(refpath): # The inRefpath does not exist. 
errMsg = "The input path {refpath} does not exist.".format( refpath=refpath) logging.error(errMsg) raise IOError(errMsg) if getFileFormat(refpath) == FILE_FORMATS.FASTA: fastaFile = refpath # Assume the input FASTA file is also located within a # reference repository refinfoxml = op.join(op.split(op.dirname(refpath))[0], "reference.info.xml") else: refinfoxml = op.join(refpath, "reference.info.xml") # Check if refpath is a reference repository produced by # referenceUploader or is a FASTA file located within a reference # repository. try: refinfoobj = ReferenceInfo(refinfoxml) isWithinRepository = True fastaFile = refinfoobj.refFastaFile sawriterFile = refinfoobj.refSawriterFile adapterGffFile = refinfoobj.adapterGffFile except Exception as e: isWithinRepository = False if fastaFile is None: errMsg = "Could not find reference fasta file, please " + \ "check %s.\n" % refpath + str(e) logging.error(errMsg) raise IOError(errMsg) if getFileFormat(fastaFile) != FILE_FORMATS.FASTA: errMsg = "The reference file specified is not in FASTA format. " + \ "Please check %s." 
% refpath logging.error(errMsg) raise IOError(errMsg) if (sawriterFile is not None) and \ ((getFileFormat(sawriterFile) != FILE_FORMATS.SA) or (not isExist(sawriterFile))): errMsg = "Could not find the sawriter file {f}".format(f=sawriterFile) logging.warn(errMsg) sawriterFile = None if (sawriterFile is not None): sawriterFile = real_upath(sawriterFile) return real_upath(refpath), real_upath(fastaFile), sawriterFile, \ isWithinRepository, adapterGffFile #if __name__ == "__main__": # refPath = "/opt/smrtanalysis" + \ # "/common/references/lambda/" # refpath, faFile, saFile, isWithinRepository = checkReferencePath(refPath) # assert(faFile == refPath + "sequence/" + "lambda.fasta") # assert(saFile == refPath + "sequence/" + "lambda.fasta.sa") # assert(checkInputFile(faFile) == faFile) # assert(checkOutputFile("abc.sam") != "") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/progutil.py000077500000000000000000000057111241505617700247500ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. 
THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### """This script defines facilitating functions for calling programs.""" # Author: Yuan Li from __future__ import absolute_import from pbcore.util.Process import backticks import logging def Availability(progName): """Return True if a program is available, otherwise False.""" return (backticks("which {0}".format(progName))[1] == 0) def CheckAvailability(progName): """Raise a runtime error if a program is not available.""" if not Availability(progName): raise RuntimeError("{0} is not available.".format(progName)) def Execute(name, cmd): """Execute the specified command in bash. Raise a RuntimeError if execution of cmd fails. Input: cmd: a command-line string to execute in bash Output: output : the cmd output errCode: the error code (zero means normal exit) errMsg : the error message """ logging.debug(name + ": Call \"{0}\"".format(cmd)) output, errCode, errMsg = backticks(cmd) if errCode != 0: errMsg = name + " returned a non-zero exit status. 
" + errMsg logging.error(errMsg) raise RuntimeError(errMsg) return output, errCode, errMsg pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/pbalign/utils/tempfileutil.py000077500000000000000000000234651241505617700256140ustar00rootroot00000000000000#!/usr/bin/env python ############################################################################### # Copyright (c) 2011-2013, Pacific Biosciences of California, Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of Pacific Biosciences nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY # THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT # NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR # ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF # ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ############################################################################### # Author: Yuan Li """This script defines class TempFile and class TempFileManager for managing temporary files and directories.""" from os import path, makedirs, remove, fdopen import shutil import logging import tempfile import time from pbalign.utils.fileutil import isExist class TempFile(): """Class of temporary files and directories.""" def __init__(self, name, own=False, isDir=False): """ Initialize a TempFile object. Set file name, indicate whether we own it or not. """ self.name = name self.own = own self.isDir = isDir def __repr__(self): fileOrDir = "directory" if self.isDir else "file" return "TempFile({name}, {type}, own = {own})"\ .format(name=self.name, type=fileOrDir, own=('True' if self.own else 'False')) class TempFileManager(): """ Manage all temporary files and directories. 
""" def __init__(self, rootDir=""): self.defaultRootDir = rootDir if (self.defaultRootDir != ""): self.defaultRootDir = path.abspath(path.expanduser(rootDir)) self.fileDB = [] self.dirDB = [] self.SetRootDir(self.defaultRootDir) def __repr__(self): return "TempFileManager:\n" + \ " the default root dir is: {0}\n".\ format(self.defaultRootDir) + \ " registered files are : {0}\n".\ format(",".join([obj.__repr__() for obj in self.fileDB])) + \ " registered folders are : {0}\n".\ format(",".join([obj.__repr__() for obj in self.dirDB])) def SetRootDir(self, rootDir): """ Set default root directory for temporary files. """ changeRootDir = True if (rootDir != ""): rootDir = path.abspath(path.expanduser(rootDir)) if path.isdir(rootDir): # self.dirDB.append(TempFile(rootDir, own=False, isDir=True)) # In case a dir (such as /scratch) is specified, create # another layer of sub-dir, and use it as the real rootDir. rootDir = tempfile.mkdtemp(dir=rootDir) self.dirDB.append(TempFile(rootDir, own=True, isDir=True)) changeRootDir = False elif not isExist(rootDir): # Make the user-specified temporary directory. try: makedirs(rootDir) self.dirDB.append(TempFile(rootDir, own=True, isDir=True)) changeRootDir = False except (IOError, OSError): # If fail to make the user-specified temp dir, # create a new temp dir using tempfile.mkdtemp changeRootDir = True if changeRootDir: try: rootDir = tempfile.mkdtemp() self.dirDB.append(TempFile(rootDir, own=True, isDir=True)) except (IOError, OSError): # If fail to make temp dir rootDir = "" self.defaultRootDir = rootDir def _isRegistered(self, tempFileName): """ Is this a registered file or directory? """ tempFileName = path.abspath(path.expanduser(tempFileName)) if tempFileName in [obj.name for obj in self.fileDB] or \ tempFileName in [obj.name for obj in self.dirDB]: return True else: return False def _RegisterTmpFile(self, tmpFile): """ Register a TmpFile obj. 
""" if tmpFile.isDir: self.dirDB.append(tmpFile) else: self.fileDB.append(tmpFile) return tmpFile.name def RegisterNewTmpFile(self, isDir=False, rootDir="", suffix="", prefix=""): """Create a new temporary file/directory under rootDir and register it in self.fileDB/self.dirDB. """ if rootDir == "": if self.defaultRootDir == "": raise IOError("TempManager default root dir not set.") rootDir = self.defaultRootDir fileOrDir = "directory" if isDir else "file" thisPath = "" if isDir: thisPath = tempfile.mkdtemp(dir=rootDir, suffix=suffix, prefix=prefix) else: fHandler, thisPath = tempfile.mkstemp(dir=rootDir, suffix=suffix, prefix=prefix) f = fdopen(fHandler) f.close() thisPath = path.abspath(path.expanduser(thisPath)) if self._isRegistered(thisPath): errMsg = "Failed to register a temporary {0} {1} twice.".\ format(fileOrDir, thisPath) logging.error(errMsg) raise IOError(errMsg) return self._RegisterTmpFile(TempFile(thisPath, own=True, isDir=isDir)) def RegisterExistingTmpFile(self, thisPath, own=False, isDir=False): """Register an existing temporary file/directory if it exists. Input: thisPath: path of the temporary file/directory to register. own : Whether this object owns this file. isDir : True = directory, False = file. Output: the abosolute expanded path of the input """ errMsg = "" thisPath = path.abspath(path.expanduser(thisPath)) fileOrDir = "directory" if isDir else "file" if not isDir and not isExist(thisPath): errMsg = "Failed to register a directory as a file." if isDir and not path.isdir(thisPath): errMsg = "Failied to register a file as a directory." 
if self._isRegistered(thisPath): errMsg = "Failed to register {0} {1} as it has been registered.".\ format(fileOrDir, thisPath) if not isExist(thisPath): errMsg = "Failed to register {0} {1} as it does not exist.".\ format(fileOrDir, thisPath) if errMsg != "": logging.error(errMsg) raise IOError(errMsg) return self._RegisterTmpFile(TempFile(thisPath, own=own, isDir=isDir)) def CleanUp(self, realDelete=True): """Deregister all temporary files and directories, and delete them from the file system if realDelete is True. """ # Always clean up temp files first. while len(self.fileDB) > 0: obj = self.fileDB.pop() if realDelete and obj.own and isExist(obj.name): logging.debug("Remove a temporary file {0}".format(obj.name)) remove(obj.name) # Then clean up temp dirs while len(self.dirDB) > 0: obj = self.dirDB.pop() if realDelete and obj.own and isExist(obj.name): logging.debug("Remove a temporary dir {0}".format(obj.name)) # bug 25074, in some systems occasionally there might be an NFS # lock error: "Device or resource busy, unable to delete # .nfsxxxxxx". # This is because although all temp files have been deleted, # nfs still takes a while to send back an ack for the rpc call. # In that case, wait a few seconds before deleting the temp # directory, and try this several times. # If the temporary dir could not be deleted anyway, print a # warning instead of exiting with an error. times = 0 maxTry = 5 while times < maxTry: try: shutil.rmtree(obj.name) break except (IOError, OSError): times += 1 # wait 3 seconds time.sleep(3) if times >= maxTry: logging.warn("Unable to remove a temporary dir {0}". 
format(obj.name)) self.defaultRootDir = "" pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/setup.py000077500000000000000000000014461241505617700214700ustar00rootroot00000000000000from setuptools import setup, Extension, find_packages import os import sys setup( name = 'pbalign', version='0.2.0', author='Pacific Biosciences', author_email='devnet@pacificbiosciences.com', license='LICENSE.txt', package_dir={'pbalign':'pbalign'}, packages = find_packages(), zip_safe = False, install_requires=[ 'pbcore >= 0.8.5', 'pysam' ], entry_points={'console_scripts': [ 'pbalign=pbalign.pbalignrunner:main', 'maskAlignedReads.py = pbalign.tools.mask_aligned_reads:main', 'loadChemistry.py = pbalign.tools.loadChemistry:main', 'extractUnmappedSubreads.py = pbalign.tools.extractUnmappedSubreads:main', 'createChemistryHeader.py = pbalign.tools.createChemistryHeader:main' ]} ) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/000077500000000000000000000000001241505617700211105ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/cram/000077500000000000000000000000001241505617700220325ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/cram/test_mask_aligned_reads.t000066400000000000000000000034221241505617700270530ustar00rootroot00000000000000Test mask_aligned_reads.py $ CURDIR=$TESTDIR $ DATDIR=/mnt/secondary-siv/testdata/BlasrTestData/pbalign/data/test_mask_aligned_reads $ OUTDIR=$CURDIR/../out $ STDDIR=$DATDIR/../../stdout #Test mask_aligned_reads.py with movie.rgn.h5 $ CMPFILE=$DATDIR/in.cmp.h5 $ INRGNFOFN=$DATDIR/in_rgn.fofn $ OUTRGNFOFN=$OUTDIR/out_rgn.fofn $ TMP1=$OUTDIR/test_mask_aligned_reads.tmp1 $ TMP2=$OUTDIR/test_mask_aligned_reads.tmp2 $ rm -rf $OUTDIR/out_rgn $ maskAlignedReads.py $CMPFILE $INRGNFOFN $OUTRGNFOFN $ echo $? 
0 $ cat $OUTRGNFOFN | xargs ls -1 */out_rgn/m121215_065521_richard_c100425710150000001823055001121371_s1_p0.rgn.h5 (glob) */m121215_065521_richard_c100425710150000001823055001121371_s2_p0.rgn.h5 (glob) $ h5dump -d /PulseData/Regions $OUTDIR/out_rgn/m121215_065521_richard_c100425710150000001823055001121371_s1_p0.rgn.h5 | sed '1d' > $TMP1 $ h5dump -d /PulseData/Regions $STDDIR/out_rgn/m121215_065521_richard_c100425710150000001823055001121371_s1_p0.rgn.h5 | sed '1d' > $TMP2 $ diff $TMP1 $TMP2 $ h5dump -d /PulseData/Regions $OUTDIR/out_rgn/m121215_065521_richard_c100425710150000001823055001121371_s2_p0.rgn.h5 | sed '1d' > $TMP1 $ h5dump -d /PulseData/Regions $STDDIR/out_rgn/m121215_065521_richard_c100425710150000001823055001121371_s2_p0.rgn.h5 | sed '1d' > $TMP2 $ diff $TMP1 $TMP2 #Test while cmp.h5 alignments are generated from multiple movie.bax.h5 $ CMPFILE=$DATDIR/in_2.cmp.h5 $ INRGNFOFN=$DATDIR/in_rgn_2.fofn $ OUTRGNFOFN=$OUTDIR/out_rgn_2.fofn $ maskAlignedReads.py $CMPFILE $INRGNFOFN $OUTRGNFOFN $ echo $? 0 $ cat $OUTRGNFOFN | xargs ls -1 */out_rgn_2/m130322_020628_ethan_c100499142550000001823070408081367_s1_p0.1.rgn.h5 (glob) */m130322_020628_ethan_c100499142550000001823070408081367_s1_p0.2.rgn.h5 (glob) pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/cram/test_pbalign.t000066400000000000000000000236401241505617700246770ustar00rootroot00000000000000Note: Program name has been changed from `pbalign.py` in version 0.1.0 to `pbalign` in 0.2.0, pseudo namespace pbtools has been removed also. 
Test pbalign $ CURDIR=$TESTDIR $ DATDIR=$CURDIR/../data $ OUTDIR=$CURDIR/../out $ STDDIR=$CURDIR/../stdout #Test pbalign with all combinations of input & output formats #input, reference and output formats are: fasta, fasta, and sam/cmp.h5 $ READFILE=$DATDIR/lambda_query.fasta $ REFFILE="/mnt/secondary-siv/references/lambda/sequence/lambda.fasta" $ SAMOUT=$OUTDIR/lambda.sam $ CMPOUT=$OUTDIR/lambda.cmp.h5 $ rm -f $SAMOUT $CMPOUT $ pbalign $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum ea31763bc847a6c75d3ddb5fb6036489 - $ pbalign $READFILE $REFFILE $CMPOUT $ h5dump -g /ref000001 $CMPOUT > tmpfile $ sed -n '2,11p' tmpfile GROUP "/ref000001" { GROUP "m120619_015854_42161_c100392070070000001523040811021231_s1_p0" { DATASET "AlnArray" { DATATYPE H5T_STD_U8LE DATASPACE SIMPLE { ( 48428 ) / ( H5S_UNLIMITED ) } DATA { (0): 34, 136, 17, 17, 34, 17, 136, 136, 136, 17, 136, 34, 136, 68, (14): 128, 34, 17, 136, 34, 17, 136, 17, 2, 34, 136, 136, 34, 34, (28): 68, 17, 68, 34, 17, 136, 136, 136, 17, 136, 136, 1, 17, 68, (42): 34, 17, 136, 136, 136, 34, 68, 34, 136, 17, 136, 17, 17, 68, $ rm tmpfile #input, reference and output formats are: fasta, folder and sam/cmp.h5 $ READFILE=$DATDIR/lambda_query.fasta $ REFFILE=/mnt/secondary-siv/references/lambda/ $ SAMOUT=$OUTDIR/lambda2.sam $ CMPOUT=$OUTDIR/lambda2.cmp.h5 $ rm -f $SAMOUT $CMPOUT $ pbalign $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum ea31763bc847a6c75d3ddb5fb6036489 - $ pbalign $READFILE $REFFILE $CMPOUT $ h5dump -g /ref000001 $CMPOUT > tmpfile $ sed -n '2,11p' tmpfile GROUP "/ref000001" { GROUP "m120619_015854_42161_c100392070070000001523040811021231_s1_p0" { DATASET "AlnArray" { DATATYPE H5T_STD_U8LE DATASPACE SIMPLE { ( 48428 ) / ( H5S_UNLIMITED ) } DATA { (0): 34, 136, 17, 17, 34, 17, 136, 136, 136, 17, 136, 34, 136, 68, (14): 128, 34, 17, 136, 34, 17, 136, 17, 2, 34, 136, 136, 34, 34, (28): 68, 17, 68, 34, 17, 136, 136, 136, 17, 136, 136, 1, 17, 68, (42): 34, 
17, 136, 136, 136, 34, 68, 34, 136, 17, 136, 17, 17, 68, $ rm -f tmpfile #input, reference and output formats are: fofn, fasta and sam/cmp.h5 $ READFILE=$DATDIR/lambda_bax.fofn $ REFFILE="/mnt/secondary-siv/references/lambda/sequence/lambda.fasta" $ SAMOUT=$OUTDIR/lambda3.sam $ CMPOUT=$OUTDIR/lambda3.cmp.h5 $ rm -f $SAMOUT $CMPOUT $ pbalign $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum e5c29fba1efbbfbe164fa2797408dbf6 - $ pbalign $READFILE $REFFILE $CMPOUT $ h5dump -g /ref000001 $CMPOUT > tmpfile $ sed -n '2,11p' tmpfile GROUP "/ref000001" { GROUP "m130220_114643_42129_c100471902550000001823071906131347_s1_p0" { DATASET "AlnArray" { DATATYPE H5T_STD_U8LE DATASPACE SIMPLE { ( 79904 ) / ( H5S_UNLIMITED ) } DATA { (0): 136, 34, 136, 34, 136, 68, 34, 68, 68, 68, 17, 68, 136, 68, (14): 136, 2, 34, 68, 68, 68, 17, 17, 136, 17, 17, 136, 136, 1, 17, (29): 17, 17, 34, 68, 17, 136, 68, 2, 17, 34, 17, 34, 17, 68, 68, (44): 68, 8, 136, 136, 17, 68, 34, 68, 34, 68, 136, 17, 2, 17, 130, $ rm -f tmpfile #input, reference and output formats are: fofn, folder and sam/cmp.h5 $ READFILE=$DATDIR/lambda_bax.fofn $ REFFILE=/mnt/secondary-siv/references/lambda/ $ SAMOUT=$OUTDIR/lambda4.sam $ CMPOUT=$OUTDIR/lambda4.cmp.h5 $ rm -f $SAMOUT $CMPOUT $ pbalign $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum e5c29fba1efbbfbe164fa2797408dbf6 - $ pbalign $READFILE $REFFILE $CMPOUT $ h5dump -g /ref000001 $CMPOUT > tmpfile $ sed -n '2,11p' tmpfile GROUP "/ref000001" { GROUP "m130220_114643_42129_c100471902550000001823071906131347_s1_p0" { DATASET "AlnArray" { DATATYPE H5T_STD_U8LE DATASPACE SIMPLE { ( 79904 ) / ( H5S_UNLIMITED ) } DATA { (0): 136, 34, 136, 34, 136, 68, 34, 68, 68, 68, 17, 68, 136, 68, (14): 136, 2, 34, 68, 68, 68, 17, 17, 136, 17, 17, 136, 136, 1, 17, (29): 17, 17, 34, 68, 17, 136, 68, 2, 17, 34, 17, 34, 17, 68, 68, (44): 68, 8, 136, 136, 17, 68, 34, 68, 34, 68, 136, 17, 2, 17, 130, $ rm tmpfile #Test --maxDivergence 
--minAnchorSize --minAccuracy $ READFILE=$DATDIR/lambda_query.fasta $ REFFILE="/mnt/secondary-siv/references/lambda/sequence/lambda.fasta" $ SAMOUT=$OUTDIR/lambda5.sam $ rm -f $SAMOUT $ pbalign --maxDivergence 40 --minAnchorSize 20 --minAccuracy 80 $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum 29f8897b8ee6d4b7fff126d374edb306 - #Test whether pbalign interprets minAccuracy and maxDivergence correctly. $ rm -f $SAMOUT $ pbalign --maxDivergence 0.4 --minAnchorSize 20 --minAccuracy 0.8 $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum 29f8897b8ee6d4b7fff126d374edb306 - #Test --hitPolicy random $ SAMOUT=$OUTDIR/lambda_hitPolicy_random.sam $ rm -f $SAMOUT $ pbalign --hitPolicy random --seed 1 $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum ea31763bc847a6c75d3ddb5fb6036489 - #Test --hitPolicy all $ SAMOUT=$OUTDIR/lambda_hitPolicy_all.sam $ rm -f $SAMOUT $ pbalign --hitPolicy all $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum 2022614eb99fe3288c332aadcfefe739 - #Test --hitPolicy randombest $ SAMOUT=$OUTDIR/lambda_hitPolicy_randombest.sam $ rm -f $SAMOUT $ pbalign --hitPolicy randombest --seed 1 $READFILE $REFFILE $SAMOUT $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum ea31763bc847a6c75d3ddb5fb6036489 - #Test --scoreFunction $ SAMOUT=$OUTDIR/lambda_scoreFunction_editdist.sam $ rm -f $SAMOUT $ pbalign $READFILE $REFFILE $SAMOUT --scoreFunction editdist $ tail -n+6 $SAMOUT | cut -f 1-11 | sort | md5sum ea31763bc847a6c75d3ddb5fb6036489 - #Test --hitPolicy allbest $ READFILE=$DATDIR/example_read.fasta $ REFFILE=$DATDIR/example_ref.fasta $ SAMOUT=$OUTDIR/hitPolicy_allbest.sam $ rm -f $SAMOUT $ pbalign --hitPolicy allbest $READFILE $REFFILE $SAMOUT $ tail -n+8 $SAMOUT | cut -f 1-11 | sort | md5sum 6e68a0902f282c25526e14e5516e663b - #Test --useccs=useccsdenovo, whether attribute /ReadType is 'CCS' $ READFILE=$DATDIR/lambda_bax.fofn $
REFFILE="/mnt/secondary-siv/references/lambda/sequence/lambda.fasta" $ CMPOUT=$OUTDIR/lambda_denovo.cmp.h5 $ rm -f $CMPOUT $ pbalign $READFILE $REFFILE $CMPOUT --useccs=useccsdenovo --algorithmOptions=" -holeNumbers 0-100" $ h5dump -a /ReadType $CMPOUT | grep "CCS" (0): "CCS" #Test --forQuiver can not be used together with --useccs $ pbalign $READFILE $REFFILE $CMPOUT --useccs=useccsdenovo --algorithmOptions=" -holeNumbers 0-100" --forQuiver 1>/dev/null 2>/dev/null || echo 'fail as expected' fail as expected #Test whether pbalign can produce sam output for non-PacBio reads # $ READFILE=$DATDIR/notSMRT.fasta # $ REFFILE="/mnt/secondary-siv/references/lambda/sequence/lambda.fasta" # $ SAMOUT=$OUTDIR/notSMRT.sam # # $ rm -f $SAMOUT $CMPOUT # $ pbalign $READFILE $REFFILE $SAMOUT # Test whether (ccs.h5) produces # identical results as (bas.h5 and --useccs=useccsdenovo). $ READFILE=$DATDIR/test_ccs.fofn $ REFFILE=/mnt/secondary-siv/references/ecoli_k12_MG1655/sequence/ecoli_k12_MG1655.fasta $ CCS_CMPOUT=$OUTDIR/test_ccs.cmp.h5 $ rm -f $CCS_CMPOUT $ pbalign $READFILE $REFFILE $CCS_CMPOUT $ READFILE=$DATDIR/test_bas.fofn $ BAS_CMPOUT=$OUTDIR/test_bas.cmp.h5 $ rm -f $BAS_CMPOUT $ pbalign $READFILE $REFFILE $BAS_CMPOUT --useccs=useccsdenovo $ h5diff $CCS_CMPOUT $BAS_CMPOUT /AlnGroup /AlnGroup $ h5diff $CCS_CMPOUT $BAS_CMPOUT /AlnInfo /AlnInfo $ h5diff $CCS_CMPOUT $BAS_CMPOUT /MovieInfo /MovieInfo $ h5diff $CCS_CMPOUT $BAS_CMPOUT /RefInfo /RefInfo $ h5diff $CCS_CMPOUT $BAS_CMPOUT /ref000001 /ref000001 #Test pbalign with -filterAdapterOnly $ READFILE=$DATDIR/test_filterAdapterOnly.fofn $ REFDIR=/mnt/secondary-siv/testdata/BlasrTestData/pbalign/data/references/H1_6_Scal_6x/ $ OUTPUT=$OUTDIR/test_filterAdapterOnly.sam $ rm -f $OUTPUT $ pbalign $READFILE $REFDIR $OUTPUT --filterAdapterOnly --algorithmOptions=" -holeNumbers 10817,14760" --seed=1 $ tail -n+6 $OUTPUT | cut -f 1-4 # Test pbalign with --pulseFile # This is an experimental option which goes only with gmap, # it enables 
users to bypass the pls2fasta step and use their own fasta # file instead, while keep the ability of generating cmp.h5 files with pulses # (i.e., --forQuiver). Eventually, we need to support --algorithm=blasr. $ OUTFILE=$OUTDIR/test_pulseFile.cmp.h5 $ REFPATH=/mnt/secondary-siv/references/Ecoli_K12_DH10B/ $ REFFILE=/mnt/secondary-siv/references/Ecoli_K12_DH10B/sequence/Ecoli_K12_DH10B.fasta $ pbalign $DATDIR/test_pulseFile.fasta $REFPATH $OUTFILE --pulseFile $DATDIR/test_pulseFile.fofn --forQuiver --algorithm gmap --byread $ echo $? 0 $ OUTFILE=$OUTDIR/test_pulseFile.cmp.h5 $ REFPATH=/mnt/secondary-siv/references/Ecoli_K12_DH10B/ $ REFFILE=/mnt/secondary-siv/references/Ecoli_K12_DH10B/sequence/Ecoli_K12_DH10B.fasta $ rm -f $OUTFILE $ pbalign $DATDIR/test_pulseFile.fasta $REFPATH $OUTFILE --pulseFile $DATDIR/test_pulseFile.fofn --forQuiver --algorithm blasr --byread $ echo $? 0 #Test pbalign with space in file names. $ FA=$DATDIR/dir\ with\ spaces/reads.fasta $ pbalign "$FA" "$FA" $OUTDIR/with_space.sam $ echo $? 0 #Test pbalign with -hitPolicy leftmost $ Q=$DATDIR/test_leftmost_query.fasta $ T=$DATDIR/test_leftmost_target.fasta $ O=$OUTDIR/test_leftmost_out.sam $ pbalign $Q $T $O --hitPolicy leftmost $ echo $? 0 $ tail -n+6 $O | cut -f 4 1 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/000077500000000000000000000000001241505617700220215ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/1.config000066400000000000000000000016631241505617700233560ustar00rootroot00000000000000# This is a config file for pbalign where users can specify # values for an arbitary subset of optional arguments for pbalign. # Lines which start with '#' are comments. Otherwise each line # should specify the value for exactly one optinal argument. # Note that: # [1] Positional arguments, including inputFileName, outputFileName, # and referencePath, can not be specified in a config file. 
# [2] Special arguments, including --verbose, --version (-v..), # --debug and --profile, in a config file will be ignored. # [3] Arguments specified in a config file will be overwritten # by arguments on command-line. # Aligner's options. --maxHits = 20 --minAnchorSize = 1 --minLength = 100 --algorithmOptions = "-noSplitSubreads -maxMatch 30 -nCandidates 30" # SamFilter's filtering criteria and hit policy. --scoreFunction = blasr --hitPolicy = randombest --maxDivergence = 40 # Miscellaneous --seed = 10 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/dir with spaces/000077500000000000000000000000001241505617700247725ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/dir with spaces/reads.fasta000066400000000000000000000004161241505617700271110ustar00rootroot00000000000000>m120619_015854_42161_c100392070070000001523040811021231_s1_p0/13 GCTGTTTTCTCCAGCGCAGCACCGTAAATTACTGCTGAGCCATCATGACG CCGATGGAGCCTGTCCGGGCGGTCCCTGGCGTGACCAGACGCCGGAGGCG GCACTGGCAAGCAACTACCTGCACTGCAGTTCATGTGTTGGCAACCGCCC ATACCGGTTTATGTCCACGCACACGGGCGATAGTCAGCGCCGTCCAAATG pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/ecoli.fasta000066400000000000000000001430301241505617700241350ustar00rootroot00000000000000>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/7/0_1165 GAATCGTCGTCAAATACTCGCCATGTACCCTCTTCTTCCCATCCCGCTCA AATCCGACATCAATGCAGGCCTTCATCGCTTCTACGCGGGTCCATAGTTG GCAAGTACCACGCATTTTGTTCGCGCGTCACCCACGGACTGCCTAGTTAC TGCTACCCGGCCATCGAAGGCTGACTTTATGGCCTCCGAAACCACCCGCA GCCGCCCGGCAACTTCCATGAAATCCCGGAGGCTAAACGGCATTTCAGTT TCAAGGACTCGTTGCCACGTCACTGCAATAAACCATCGGAGACAGCAGGC GGGTACACGCATACTTTCGTCGCGATAGATGATCGGGGATTCGTAACAGT TCACACCGAGCCCGCGAGATATGAATTCAAACAACGGGTTCCTGACGTCG CTCTCACGCTTACTCGTTTTCCCCAGGCCAGTGGCTTTAGCGTACCTCCG GGACCACACCGGTGCAAACCTCAGCAAGGCAGGGTGTGGAAGTAGGAACT TTCATGTCAGTCCACTTCTCCTTCTCGCCGCGAGCGGGTTTTGCTCATCC CGTTGTGACCTCTGAAGCGGTGATTGACGGCGAGCCAAGTACCGATTTTG CCACGCATCATGCCCTGTTCGACCAGCTCTCCATCGATCCCGGTACCGCG
GCCCTGGCAGGATATCGCTCCGGTCGTCACTGCCTGCCACCTTCTGCTCT GCGGCTTTCTGTTTTCAGGACTCCAAGAGCTTTTACTGCTCGCCTGTGTC CAGCCTTCTCGCGACGATGCACGAATGTCCCGCGGCCGAAATATCTGGCG ACCAGAGGCGGCAATAAGTCGCGTCATCCATGTTTGTTATCCAGGGCCGA TCAGCAGAGTGTTAATCTCCTGCATGGTTTCATCGTTACCCGGACGGATG TCGCGTTCCGGCGGACGGTTCTGCCAGTGTATTGCAGTATTTTCGTACAA TCGCTCGGCTTCATCCTTGTCATGCAGATAGCCCAGCAATCCGAAGGCCA GACGGGCCACACTGAATCAATGGCTTTATCCGTCAACACTGTTCTGGGCT GCTGACTGACACGGCCCCGTGATTCCTCTCTGCTCTTTCGCGAGCGTTTT GAATGGTTCTCGCGGCGGCATTCATTCCATCCATTCGGGTAACTCGCGAT CGATGATTTACGGTC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/8/2069_2682 CGGTGTATCAGGCAAGGCCACTCATATCAGGTGCAGCTTGACTGTCATCA ACACGGCCTTTCAGCAACACCCGATACTTCTTCCAGGGCCTTCCAGCAAC GAGGTGTGCCTTCCTTCGTTGCAGTTTCCAGATCTCAGCATCCTGAAGCG GCCGCGATATGCTCACTGGGCTACCTTGCATCAGGCCTTTTTTTTGCTTT TCTTCAAAAAAAAGAAAATGATGTTGCCATCGTAGAGAACATGCTGCTAA CGTGAGAGAAGAAGAGATTGAATCTCAGAGAGAGAGACAGAGCGGTCATA CAGCAGCTTAACAGTGCGGGACCAGGTGGGTTAGAGAGAGGTTCTGGATT AGCATCGAGAGAGAAGCGCGATATGCTGCGCTGCTGGCATCCTTGAATAA GAGAGACTACGCACGCTTTCTCGCAACTCTCCCCACAGCTTCTGTTTTGG GCAATATCAACCGCCGCTAGTACCATGGCAATCTCTGCTCTTGCCCCCGG CGTCGCGGCACTCGGCATATCCGCATAAGCGAATGTTGCGAGCATTGCGT ACCGTTTGCCTTAGTATTTCCTTAAGCTTTGGGCCACACCACGGTATTTC CGATACCTGTGTG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/8/2724_3021 CACAAGGTATCGGGGAAATCACCGATGGTGTGGCAAAGCTTGAAGGAATA CTAGGCAAAGGTACTGGCAAGTGGCTCGCGACATTCGCTTATGCGGATTA TTGCCGTAGTGCCGCGCGCCGGGGGCAAGATGCAGAGATGCATGGCTAAA GGCGTGCGGTTGAGATTGCGAAAAAAGCTGTGGGGTAAGTTGTCGAGAAG AGTGCGGAGATGCAAGGCAGTCGGCTAATTCAAGGATGCAGCAAGCGCAG GGAATATCGCGCTGTGACGATGCTAATACCAAACCTTACCAACCACT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/0_278 GTATAGAAATGGATCCACTCGTTATTCTTCGGGACGAGGTGTTCAGTTGA CCTCTGGAGAGAACCATGTATCATGAATCGTTATCTGGGTTGGACTTCTG CTTTTAGCCAGAATACATGGCCTGATATGTTACATGAGAGAATGGTATTC CTCATGTGAGTGGCTGTCTTTCGTCTTTCTCTTTGCATTTTCGCTAGCAA CTTATGTGCATCGATTATCAAGCTATTGCCGCGGCCAGATAGTAAGCGAT TTCAAGCTAAGAAACACCGACCATTAGC 
>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/327_954 CTTAAGCGATTTATCTTCAAGAATTAATACAGCTATAATCTGGCGCTAGG ACATAAGCTAAATAAATCCGATGCACATTATATGATACGCGAAAATGCAA GAGCAAGCAGAAAACATAAGCCACACATGAAGAGAATAACAGATTCTCTC CATTAACATATTCAGGCCAGTTAAAAAACAATCATGAGAGCTTAAAAGAC AGAAGTCCAACCCAAGATAACGAAATAATATACACTGGTTCTCTCCAGAG GGTTTCATTACTGAACACTCTCCGAGAATAACGAGGGATTCCATTTCTAT ACTACATCAACTGTAGGGGTTGTAATAGTTTATCCGATTTCTATCGCTGT AGGGGACACGAGAACCACCCGAGCCTGATGTGGTTAAAAGACAGGAACAA TCTTTACTACAGCAATACACTATTTAAGGTGATATATGGAAAAGAATTTT GAAAGAGTTCGAAGAAGCATACCTCAAGGATGTGATGGAACAATACCAGA GACTATCCGTATGACTAAACGACTATTGATAAAAATCAATGGTGTGGACA ATTCAAGCGATGCAATGGAGCAACGCTGCCCATCGGAATGCATGGTTAAG CTGAACGAAATGTCTTCCTTGTACATC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/1004_1580 TACGGAAACATTTCTTTCAAGGCTTAACCATGCATTCCCGATTGCAGCTT GCATCCATTGCCATCGCTTGAATTGTCCACACCATTGATTCGTTTATCAA TAGTCGATAGTCTACGATACGGTCTGGTATCTGTTCCATCACATCCTGAG AGATGCTCTTCGAACTATCTTCAATTCATTCCTTCCATATATCACCTTAA TAGTGGATTGCGGTAGTAAAGATTGTGCCTGTCTTTTAACCACATCAGGC TCTGTTCCTCGTGTACCCCTACAGCGAGAAATCGGATAAACTATTTACAC CTACAGTTTGATGAGTATAGAAATGGATCACTCGTTTATTTCGGACGAGT GTTCAGTAATGAACCTCTGGAGAGAACATGTATATGGTTATCTGGGTTGG ACTTCTTGCTTTTGCCAGATAACTGGCCTGATATGTTAATGAGAGATCGT AATTCCCTCATGTGTGGATGTTTTTCGTCTTGCTCTTGCATTTTCGCTAG CCAATTAATGTGCCTCGATTATCAGCTATTGCCAGCGCCAGATATAAGCG ATTTAAGCTAAGAAAACGCATTAAGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/1625_2202 CTTAATGCGTTTTCTTAGCTTAAATCGCTTATATCTGGCGCTGGCATAGC TGATAATCGATGCACATTATTCTAGCGAAAATGCAAGAGAAAAGACGAAA CCATGCCCACATGAGGAATACGATTCTCTCATTAACATATTCAGGCCCAG TTTATCTGGCTTAAAAGCAGAAGTCGCAACCCAGATACGATCATATACAT GGTTCTCTCCAGAGGTTCATTACTGAACACTCGTCGAGATAAGCGAGTGG ATCCATTTCTATACTCTCAAACATGTAGGGGTGTAATAGTTTATCCGATT TCCTCGCCTGTAGGGTACACGAGAACACTCGAGCCTGATGTGGTTAAAAG ACAGGCACCATTTTACTACACGCAATCATATTTTAAGGTGATTAATGGAA GAAGAATTTGAAGAGTTCGAAGAGCATCCTCAGGATGTGATGGAACCAAT ACCAGGCACTCATCCGTATGCACTACGACTATTGATAAAAATCAATGGTG 
TGGACAATTCCAGGATGCAATGATGCAAGCCATGCAATCGGAATGCATGG TTAAACTGAAGAAATGTTTCCTGTAAT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/2251_2836 ATTACAGGAAACATTTCTTCAGGCCTTACCATGCATTCCGTTGCAGCTTG CATCCTTGCATCGCTTGAATGTCCCACATATGATTTTTAATCAATAGTCC GTAGTCATAGGATAAGTCCTGGTATTGGTCTCATCACATCCTGAGGATGC CTCTTACGAACTCTTCATTCTTCTTCCATATATCACCATTTAAATAGTGG ATTGCGTAGTAAGATTGTGCCTGTTCTTTAACCACATCAGGCTCGGTGGT TCTCGTGTACCCCCTACAGCGAGAACTCGGATAAACTAGTTACAACCCCT ACAGTTTGATGAGTATACGAAATGGATCCACTCGTTATTCTTCAGGACGA GTGTTTCAGATAATGACCTCTGGAGAGAACCATGTAATTGATCGTTATCT GGGTTGGACTTCTGCTTTTAAGCCCAGGATAACTGGCTGATATGTTATGA GAGAATCGGTAATTCCTCATGGTGGCCATGTTTCGTCTTTGCCCTTGCAT TTCGGCTAGCATTAAATGTGCATCGATTAATAGACTATTGCGCAGCCAGA TATAAGCGATTTAAGCTAAGAAAACGCATAATAAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/2879_3129 CTTTAAATGCGTTTTTCTTGCTTAAATCGCTTATATCTGGCGCTGGCCAT AGCTGATAATCATGCACATTAATTGCTAGCGAAAATGCCAAGAGCAAAGA CGAAAACAAATGCCCACATGAGGATACCGATTCTCTCATTACATATTAGG CAGTTATCGGGCCTTAAAGCAAGAAGTCCAACCAAATAACGATCCAATAT ACAATGGTTTCCTCGTCCAGAGGTTACATTACGAACCTCCGCCGAGAATA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/14/2522_3325 CGTTGTCTTGTATACGGGGAAGTTCCTGGCTTTCGGGTGGCGAGCATCTG CGCCACACCACCGAGCCATACTGGCACCGAGAGAAAAAGGATGCCGGTCA TAACACCGGCCCAATGGCTGGCCCCCATGCTGCAGGGTGGCTCCGGGTAA AGAATGATCCGGCATGGCGGGCAGCCCCCAAGGACAATCTGGAATACGCC ACCTGACTTGGCCCCGGCGAACTCTGGGAAACAATATGAATTACAGCGCC CATCAAGGCAGAGTCTCATTGTAACTGCGCCGTTAACCCGGACGTGGCTG ACGTCCCGCCCGGCAATCGCGTACCTGATACCAGCCGTCGCTCAGTTTCT GACGAAACGCCGGGAGCTGTGGTGGCCAGTGCCCGGATGGCCTTCAGCCC CCGTTTTTCCACCGAAGGTGATGCGTCGACAAATCGTTGGTAAATCCCCG TAAAAGCAGATGCGATGCCCGGTTGACGCCAGAGGGAGTGTTGTGTCGTC GCCTGCCATTTGTCGGTGTACCTCTCCGTTTGCTCAGTTGTTCAGGAATA TGTGCAGCACTCGCCGTCGCGCAGTAATGCGGCGTGATTCGCACTGATGA ACCAAAACAGCAACAGCAGCACATCGCCCGGCTGTGCCGCTGACAACGGC AGCCTGATACAGCCCCGTCGCCTCCAGATAATCCAGATAGAGATTCTGGC CGTTCTACGCCACCAGTCATCCTCACGATGAAAGTCCGGCATCTTCATCC CCGCCAGATGATAAGCATCCGGAACAAGTGTGTAACAGTCGTCACACGTG CTC 
>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/14/3370_4191 GAAGCACGGTGTGACGGACGTTACACACGTTCGGGATTGCTTATCATCTG GCGGGGATTGAATGCCGGACTTTCATCGTGAGGGATGAACTGAACGGCAG AATCTCCTATCTGGATAATCTTGGAGGCGATCGGGGCTGTAGATCAGTGC GTTTGTCAAGCGGCACAGCCGGGCGATGTGCTGCTGTGCTGTTTTGGTTC ATCAGTGCCGAAACGCGCAATTTATATTGCGGCGAGGCGAGCTGCTGCAC CATATTCCTGAACAACTGGAGGCAAACGCAGAGAGGTACAACCGACACAG GCAGCGACGCACACACTCCCTCTGGCGTCACCCCGGGGCATGGCCGCGCA TCTGCCTTTACGGGATTTTCAAAGATTTGGTTCGCCGCATCGACTTCGTG TGAAAACGGGGGGGGCTGAAGCCATCCCGGCACTGGCCACACAGCTCCGG CGTTTCGTCAGAAACTGAGGCGCGGCTTGGTATCAGGTTTTACGGATTGC CGGGCGGGACGTCAGCACGTCCGGTTTAAGGCGCAGATTACAGAGACTTG CCTGATGGCGGCTGTAATTCATATTGTTCCCAGAGTCGCCGGGCGCAAGT CAGGTGGCGCCGTATTCCAGATTGTCCTGGGGGCTGCCGCCATTGCCGGA TCATTCCTTTACCCGGCGCGGGCCACCCTTGCAGCATGGGGGGCAGTCCT TGGGGGCCGGTGGATGACCGGCATCCTGTTTTCTGCTCTCGGTGGCGCAG TATGGTGCTCGGTGGTGTGGCGCAGATGCTGGCACCGAAAGCCAGAACTC CCGCGTATACACGACAACGGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/14/4238_4294 TCCGTTGTCTGTATACGGGGAGTTCTGGCTTTCGGTGCGCAGCATCTGCG CCACAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/15/6210_6438 TCCCCCCCGGGGGGGGGGCGGAGTCTTTTGGGGCCCCCCCCGGGGGGTTT GGACAAACCCAGGAAAAACCCCGCGGGGAGAGAAAAGGGCCACGGCCCCA AACCCTTTTGGGCCCCAAAAAGCAAAAACCTTTTGACCCCACCCCCCCCC TTTTTTTTTGCCACCTGCAAAAAGTTTTCAAATGTCCGTTTTTTTGGCAG CCCGCCCCATAACCGGTTTTTTTTTTTA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/15/6494_7499 TTTAAAAAACCGGTAAAATGAGGGCCCGGCTTTTTGCCCCAAAACGAAAC ATTGAACTGGCAAAGTTTGCAAAGGTCCACCCCAAGTTTTTTTTATTTGT GCTTTTTGCCAAGTGGGGCCGGCCTCGCGGCGGGGGTCCTGGTCCACGCC AAGGAAACCCGCCCGGAAAACAGGCTTTTCCCATCCCGGCCGGTCCCCAA ATGGGAATGGCCTCAACAGAATTTTAACGGTTTTGCTGCGCTTGGGAGAA AACAAAGGGGTTTGTTGGAAAAATTTCACGCTTGAATTTTTTTTTTTACA AGCGGCCCAGCCAATAAAGGTGGAGAGATGGCACCCCTACAGCCCATTCC TTTTTCCGGATGAACGTCCGCGGGGAGACCACTTGCCAGTCCCGGAATGG GAACGCCAACCCCGCCCAGATTGTTTTTTTTTGCGCAGAAGAGGGGGTGT CCCCGGCCAAAATAAAATACCCGGCCTGTCCCGTGCAGGTTATTTGTGCT TGGAATACCGAGGGCCCTTTTGCAAAAAAAAGTGTAACAGCGGGTTTCCA 
GGAAGGCCAATTGAAAAATGCCGGGACTTGGCCTTTGAATTTGAACGTTT GTTAACAGCAAAAGCCGGATGGCCGATCAACCGTCCAAAAATGCGTGATG CAACCTTGATTGCACCGTAAAATTCCCGTCTCCTCAGGAGGGCGAAAAAA TGAAAACAACCAAAGAGACCTTCAAATCAACAAAAACTTTTGTTTTTCAG CCCACTGCTTTTTTTTTTCGCAGGCCTGACCGTTTCTTGACGTGGTTGCC CCAAGCGGACGAGGCGGAGAACGGCCAAAGCGCGGCGCAGCCGGAAACCG TTGAAAACGAGAAAAAAGATCCACGCAGCGGTTTTTGCGGCAAGAAAAAA GACAGCCCGCCCAATTTATTGGGGATCCTCAACTGGTTGAGGACAAGGCT TCCCAAGCGGAACAGCGAAAGAACAGGCAAAACGCGTGCCTTGGCAAAGA AAACA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/15/7563_8532 TTTGTTTTTGCTGCCAGCAAAAAACGCGGTGGGATTTGTTTCTTTTTTCG CGTCCGTGAGCCCTTTTCCCCCCCTCACAGTTTTGAGGAATTCCCAATAT GCGGCTTGTTTTTTCCCTATGCCGCAACCGCTTGCCCGGTTTGATTTCTT ATTTGTGCGCCGTTTTGTTTTCAACCGTCCGCTTTGCCGCCGCGCTGGGC GTTTTTTTTTTTTTCCTCCCCGCCCTTCCCGTCGCTTGCAAAAAAACCCG TCAAAGTAAAACGTCCCAAAAAAGCCTGCGAAGCCAGTTGGCCTTTTTGG AAACAGTTTTTTTTTTTTGGTTTTGGGAATTGAAAAGTCCTCTTTGGTCC ATTATCGCCCCCTCCCTGAAGAGACGGGAAAAATTTTTACGTTTGCATCC CAGTGCATCAAAAAACAGCAATGACCGGTGATTTCGCAATCGGTTTGGCT GTTAAAAACAAAGTTCATCAGCCCAGTTTCCCGGCAAAATTCAAAAAATG GCCCCCTTCCTGACCGCCTTGTTTAAACACTGCAAGGCCCCTCGGGGTTT ATCCAAAAGCCCCAAAAAACAACCTGCACGGAAAAAACAAAGGCCGGTAA AAAAAATATGCGAACAAACCTTCTTGCGCAAAACATTCTTGGGGGATTTT TTTTTTTTTTGCGGTCCCCATCCGAGGACTGGCAAGTGGTCTTCCCCGGG AACGTCAATCCGGAAAAGATTGGCTGGGTAGGGGGGTTTTTTTTTTTGTT TTTTTTTTTTTTTTTGGCCCAAAATCCCAACCTAAATTGGCTTGTCCCGC TGTAAATCAGCCGTTGGATTTATCCAACCAAACCCTGTTTTCTCCAGCGA CAAAAAAAGCAAAACCGTAAAAATTTTTACTGTTGAAGCCAATCGGAATT TGAAAACGCCGATGGAGCCTGTATCCCGGGCCGGTCCCCCTGCGTTGGAC CCAGAAATACGCCGGGAGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/0_404 TGCACGAGGCAGCGGCAAGTTTCTCTCTGTTCGCGTCATTTCAATCCTTC CGGATATAGCGCACGTGGCGTTAATCGTGCCATGTTTTCTGTTGGTTGTC CTGCACCATCCTCTCCTGCAGCCTCGCCAGCAGCGCACTGAGATTCTTCA GCTGCAGCGGGAAATACTGATGCGCAGTCGCCGCCAGCGCATAAATCCGA AGCAAGTCGTGTGCCTCATTGCGTCGCTTTTTGCTGTCCCTTACATTGTA TTTTTTTCCTGCTCATCCACCATTTTTCGACGCTCTTCAGCAGTCAGTCC TGCTTGCTGCTTCGTTCAATCAAAAATATCTCTGGTTATTTTCGGGAAGT 
GAACCGGCACCGGGAGCAGGTTCATCCTCCTTCCGGCGTCAGTGTGAATT GCGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/447_1003 GCCGCTTCACACTGAAGCCGGAAGGGGATGAACCGCTCCGGTGCTTGTTC AGCTCCGAAAAATTGCCGGATATTTTTTGATCTGACCGAATGCGCAGCAG CTGACTGCTGAATGAGCAGGTCGAAAAATGGGTTGATGGCAGGAAAAATA TACTGTGGGACAGCAAAAAGCGACGCAATGAGGCACTCGGACTGCTTCGT TTATTCGCTGGCTGCGCTGCGCATCAGTATTTCCTCGCTGGCAGCTGGAT CTCCAGTGCGCTGCTGGCTGAGCCGTGCAGAGAAGGTAGTGATGGTTGCA GTCAACCAATCAAGAAAACACTGGCAGAATTTACACCGTGTGCCTTATCC GGAGAGGATGATGACGCGACAGGTATAGAATTGCGCTTGTCCCGTGCGGC ACTGCATGACCTGATGACAGGTAATTACGGGTGGCAACAGTACTAGAAGA CGTGACGAAGTGGTGGAGTTTACGGCCACTTCCGTGTCTACTGAAAAATA ATATTTGCTAGATTGCTGGAAGTGCAGACCGGCATGAACACAGGGCAGGA CTGCAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/1051_1636 CTGCAGGTTCTCTGCGTCGCTGTGTCATGCCGGATCTGCTATCTTCCAGC TCTGCAATGATTATTTTTTCAGGTCAGACACGAAGTGTGCCGTAATACTC CAACCTTCCGTCTACCACGTTCGTTTCTGTATGTTGCCACCCGTTACCTC TGATCATTCAGGTCCATGCAGTGCCGCAGGGTGGGGTGGGTTGGGGCGGG CAGCGGCAAGTTCTTCCTGTCGCGTCATTCATCCTCTCCGGATAAGGCAC GGGCGTAATCTGCCAGTGTTTTCTTGTTGGTTGCTGCACCAACCTCCCTT CCTGCAGTCCGCTCGCCCAGCAGCGCACTGAGAATCCAGCTCCAGCCGGG AAATACTGATGCGCAGCGCCCCAGCGCATAAACGAAGCAGTCGAGTGCCT CATTGCGTCGCTTTTTGCTGTCCCACAGTATTTTTTTCCTGCCCATCCAC CCATTTTTTTCGACCTGCTCTTCCAGCAGTCAGCTGCTGCGCTTCGGTCC AAGATCAAAAAATATCCGGGTTAGTTCGGGAAGTGAACGGCACCGGGAAG CGGTTCATCCCCTTCCGGCCGTCAGTGTGAAGCGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/1680_2264 CCGCTTCCATGACGCCCGGAAGGGGATGAACCGCTTCCCCGGTGCCGTTC ACTTCCCGAAATAACCCGGATATTTTTGATCTGACCGAAGCGCAGCAGCT GACTTGCTGAAGAGCAGGTCGAAAAATGGGTGGATGGCAAGGAAAAAATA CTGTGGGACAGCAAAAGCGACCGCAATGAGGCACTCCGACTGCTTCGTTT AATGGCCCGCTGGGCGGGCGCTGCCGCATCAGTTATTTCCCGCTGGCAGC TGGATCTCAAGTGGCGCTGCGGCCTGAGGCCTGCAAAGGAAGAGGATGGT TGCCAGCAACCCACAGAAAACACTGGCAGATTACCGCCCGTGCCCTTACC GGAGAGGATGAAATGACGCGACAAAGGAAGAACTTGCCCGCTGCCCGTGC GGGCAACTGCATGACCTGGATGGACAGGTAAACGGGTGGCATCAGTAACA GAAAGACGGACGAAGGGGGTGGGAGTTTACGGCCCACTTCCGTGTCTGAC 
CTGAAAAAATATATTGCAGAGCTTGGAAGTGCAGACCCGGCCATGGACAA CCAGCGGGGAACTGCAGGGGGGGGGACCTGCAAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/2310_2884 TGCGAGGTCCCTGGCGTCGCTGTGTCATGCCGGTCTGCCACTTCCAGCTC TGCAATAATATTTTTTCAGGTCGACACGAAGTGGCCCGTAAACTCTTCGC CGTCTTTCGTACTTGTTGCTACCCGTTTACCTGTCATCCAGGTCCATGCC AGTGCCCGCAGGGCAGCGGGCAAAGGTTCTTCCTGTCCGCGTCCCCATTC ATCCTCTCGGGATAAGGCACGGGGGTAATCCTGCCAGTGGTTTTCTTTGT TGGTTGCTGCACCATCCTCTTCCCTTGCAGGCCGGCCAGCAGCGCACCCC TGAAGATCCAGCCTGCCAGCCCGGGAAATACTTGGATGCGCAGCGCCGGC CAGCGCATAAACCGAAGCAGTCGGAGTGCCCATTGCCGTCGCTTTTTTTT GCTGTCCCACAGTATTTTTTTTCCTGCCCATCCCACCCATTTTTCGACCT GCCCTCTTCAGCAGTCAGCTGCTGCGCTTCGGTTTCAGATTCAAAAATAT CCGGTTATTCGGGAAGTGGAACGGCACCCGGGAAGCGGTTCATCCCCTTC CGGCGGTCCAGTGGTGGAAGCGGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/2931_3514 CCGCTTTCACACTGACGCGGAAGGGGATGAACCCGCCTTCCCGGTGGCCG TTTCACTTCCCGAATAACCCGGATATTTTTGATCTGACCGAAGCGCCAGC CAGCTGAAATGCCTGAAGAGCAGGTGAAAAAATGGGTGGATGCAGTGAAA AAAATACTGTGGGACAGGCAAAAAGCCCGACGCAATGAGGCAACTCGACT GCCTTCGTTTATGCTGCTGGCGGCGCTGCGCATCAGTAGTTTCCCCGCTG GGGCAAGCCCTGGATTCTCAGTGGCGCTGGCTGGCCCGAGCCCCTGCAGG AAGAGGATGGTGGGCAAGCAAACCACAAGTAAAACACTGGCAGATTACCC CCGGTGCCCTTATCCGGAGAGGATAATTGACGGCGACAGGAAGGAACTTG CCGCTGCCCGTGCGGCACTGCATGACCTGATGACAAGGTAAACGGGTGGC AACAAGTACAGAAAGAACGGACGAAGGGTGGAGTTTACGCCACTCTCCGT GTCCTGACCTGAAAAAAAATATATGCAGAGTCTTGGAAGGTGCCAGACCG GCATGACACAAGCGACGCAGGGGGACCTGACAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/3564_4141 CTGCAGGTCCCCTGCCGTCCGCTGTGTCATGCCTGTCTGCACTTCCAGCT CTGCAATATTATTTTTTCAGGTCAGGAACCACGGAAGGTGGCCGTAACTT CCCACCCTTCCGTCCGTCTTTCTGTACTGGTTGCCCGTTTACTGTCATCA GGTCATGCAGTGCCCAAGGGCACGGCAAGTTCTTCCCTGTCGGCGTCATT CATCCTCTTCCGGAAAGGCACGGGGCCGTTATCTGCAAGTGTTTTTCTTG TTTGGTTGCCTGCACCATCCTTTTTCCTGCAAGGCTCGCCAGCAGCGCAC TGAGATCCAGCTTGCCCAGCGGGAAATACTGATTGCGGCCAGCGCCCGCC CCAGCCGCATAAACGGCAAGCCAGTCGAAGGTGCCTAATTGGCGTCCGCT TTTTTGCTGTCCCACAGTATTTTTTTCCTGCCATCCACCCATTTTTTCGA 
CCTGCTCTTCAGCAGTCAGCTGCTGCGCTTCCGGTTCCAGATCAAAAATA TCCGGGTTATTCGGGAAGTGAACGGCAACCGGGAAGCGGGGTTCATCCCC TTCCGGCGTCAAAAGTGGGTGAAGCGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/4191_4774 TCGCGCCTTCCACACTGGACCGCCGGGAAGGGGATGAACCGCTTTCCCGG TGCCGTTTCACTTCCCGAAATAACCCGGATATTTTTGATCTGACCGAAGG CGCAGCCAAGCCTGACTGCTGAAAGAGCAGGGTGAAAAATGGGTGGATGG CAGGAAAAAAATACTTGTGGGAACAGGCCAAAAAGCGGACGCAATGAGGC CACTCGACTGCTTCGTTTATGCGCTGGCGGCGCTGCCGCATCAGTATTTC CCGCTGGCAGCTGGATATCAGTTGCGCTGCCTGGTCGAGCCTGGCAGGAA GAGGAGGTGCCCAGCCAACCAACCAAGGAAAACAACGGCCAGATTACGCC CGTGGCCTTATCCGGAGAGGGATGAATTGACGCGCAGGAAGAACTTGCCC GGCTGCCCCGGGCGGCACTGCATGACCTGATGACAAGGTAAAGGGTGGCA ACAGTACAGGAAAGACGGACGAAGGGGTGGGAGTTTTTTACGGCCACTTC CGTGTCCTGACCCTGAAAAATATATTGCAGAAGCTGGAGTTGCAGACCGG CATGACACAAGCGGGACGCAGGGGACCTTGCAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/16/4818_5217 CTGCAGGTTCGTCGCTGTGTCATGCCGGTCGCACTTCCAGCTCTGGCAAT ATATTTTTTCAGGTCAGACCACGGAAGTGCCCGTAAACTCCCAACCCTTC GTCCGTCTTTCTGTACTGTTGGCACCCGTTTACCTGTCCATCAGGTCATG CAGTGCGCACGCAGCGGCAAGTTCTTCCCTGTCCGCGTCATTCATCCTCT CCCGGGATAAAGGCACGGGGTTAAATCTGCCCCCAGTGGTTTTCTTGTTG GTTGCCTGCACATCGCCTCTTTCTGCAGGCCGCCGCCCCAGCGCACTGAG GATCCAGCTGCCAGCGGGGGAATACTGATGCGCCAGCGCGCCAGCCGCAT AAACGAAGCAAGTCGAGTGCCTCCATTGAGGTCGCTTTTTGCCTGTCCC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/17/0_336 GATGAAGGGTAAAGTTAAACGATGCTGATTGCCGTTCCGGCAAACGCGGT CCGTTTTTTCGTCCTCGTCGCTGGCAGCCTCCGGCCAGAGCACATCCTCA TAACGGAACGTGCCGGACTTGTAGAACGTCAGCGTGGTGCTGGTCCTGGT CCAGCAGCACCCGCAAGAATGCCAAGGCAGCACCGTCGGTGGTGCCATGC CCACGCAACCAGCTTAACGGGCTGGAGGTGTCCAGCATCAGCGGGGTCAT TGCAGGCGCTTTCGCCCTCAATCCGCGCGGGGCGCGGTGCGGTATGACCG GGTCATGTTGCCCTGCGGCTGGTAATGGGTAAAGGT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/17/380_1046 ATCCTTTACCCATTCCAGCCGCAGGGCAAACAGTGAAACCCGGCTCATGA CCGCAACCGCGCCCGGCTGGATTGAGTGCGGAAAGCGCCTGCAATGACCC CGCTGATGCTGGACCACCTCCAGCCGTAAAGCTGGTTGCGTGGATGGCAC CACCGACGGTGCTGCCGTTGGCATTCTTGCGGTTGCCTGCTGACCAGACC 
AGCACCACGTCTGGACGTTCACAAGTCCGGCACGTTCCGTATGAGGATGT GCCTCTGGCCGGAGCTGCCAGCGACGAGACGAAAAAAAACGGACCGCGTT TGCCGGAACGGCAATCAGCATCGTTTAACTTTACCCTTCATCACTAAAGG CCGCCTGTGCGGCTTTTTTTACGGGATTTTTTTATGTCGATGTACACAAC CGCCCCAACTGCTGGCGGCAAATGAGCAGAAATTTAAGTTGATCCGCTGT TTCTGCGTCTCTTTTTCGTGAGAGCTATCCCTTCCACCACGGAGAAAGTC TATCTCTCCACAAATTCCGGGACTGGTAAACATGGCGCTGTACGTTCCGC GAATTGGTTTCCGGTGAGGTTATTCCGTTCCCCGTGGCGGCTCCACCTCT GAATTTACGCCGGGATAATGTCAAGCCGAAAGCATGAAGTGAATCCGCAA TCTCTCTCAACAACAA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/0_398 CCTCTCCCCCACCCCCCCCCCCCCGACTCCCCCCACCACCCCCCCCCCCA ACCGATGGTTTGAGTACGGTCATCATGCTGACACTAAGACTCTGGCATCG CTGTGAAGACGACGCGAAATTCAGCATTATATCACAGCGTTTCTTTTACA AACCGTTGTCTCCCTCTGCCTTTGATGGCGAATGCCGAGCGATCAGACAC TCATAATGCGAGATACTCACGTGCATCCTGAACCCATTGACCATCCACCC GTAATAGCGATGCGTAATGATGTCGATAGTTACTAACGGGTCTTGTTCGA TTAACTGCGCAGAAACTCTTTCCAGGTGACCAGTGCAGTGCTTGATAAAG GAGTCTTCCCAGGATGGCGAACAACAAGAAACGGTTTCCTCTTCCGTC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/447_1210 GACGAGACGAACGCAGTTCTTTGTTGTGTCGCCATCCTGAGGAAGACTCC CTGTTTATCAAGCGGACACGGTTGCACTTCTGATACGGGACGCTGGATAG GAAGTTGTCTGCGGCAGTTAATCGAACAAGACGCCCGTACGTGGCTTTGT TTATCCGCAGACGTAACCCTGATCGACAGACGAAAGCTCTTTGAGTTACG CCATCGGCTGGAAGGTATTACGCATCGCTATACGCGGGTTGAAGTCCAAT GGGGTTTTGCCGAGTTGCATCAAAAAAAAAAAAACACATAACAAACAAAA CAGAGATGGAAAACAGCCAAGTGAACGTAGGAGCAGTGACGGGAGGGGAG CAGACACGCGACTGGGCTATGCACGATGTCTGACGGCCGGAAGCTGGCGC GTACGACGAAGATGTGCGTGCGGTGCTTGCGGGAATAAAACAAAAGCCCC GTTGTCGGTGCAGTGGTATGATCACGACGGCGTTTGCTGGCGGCAAGGTT CACATCGTGTATCGGTAGGGTATTGGGCGGTGCAGAATTACGCAGGCAGT TGCGAGAGTGCGAATTTCGGAGAATCATTGGACAGAAACGGTGTGGGCCG CTGTCTGGAACTTGTCAGTCGCCGCGGGATCCACGCGCCTGATGAAACGG AGGAGAACAGGACCGGAGTGCGCCGAAACTGGCGTAATACGTGAAGACGC CAAGCCGAACCGGCGCGTTTAGCGGTTGGCGATGTCTGTGCTCCACCGCA GAAGACGCAGAGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/1246_2761 TCTGTCTTTCTGGAGGGTAGGCAAGGGAAATTCGTCACAAGAACGGCCGG GCGGATTGGCGTCTTCCACGAAGGAACGTGAGAGTGGGTGTCTCTTACGG 
CCTTGAGAGAGAAGAGATGAGCAGGGAGTAGCGATGCTTGGTCGAAATGT TGCTGAATTCTTCGCGTCGTCTTCACAGCGATGCAGAGTCTGTAGTGTCC CAGATGATGACCGTACTCAACATCGGGTTGATGTATTATCTTACTGTGTT CCTTTACGATAAACATTGCTGCATAGTGCTGAAGCTTCGACAGATAGGTA TGCAAGAGATATATCAGTGAGCTGACGCTGAGTTTCGCAATGAGTTTTAC GAGTCTGGAGCAAGAGATTGACATTGCGCGCGTTAGTTGCAATGTCTGGA TTTGCAGCTCCGTGTCTGTGTCTGGCAGAGCAGTGCACGAACGGTGTGTG TAGTAGCTAGATAAGGGTCTAAACTAGCAAGTAAGAACTAGCTCGAGCTA TTACCCAAGATATTACGGATGGATGACAGTTTTAAGTCTGAATGAAGACA AGATCCCCATTTCTGCAATGTGCGAGAATGACTGATACGTGCAAATTATT ATCACGGACGTACGTGGAAACGATACATTGCCTCTCTGGTAGCAAAACAT ATAGATGATTTAAACCAATATTACATAACAATCCTCGCACTCGCGGAATT TATTTATCTGAACACTCGCTACGGCGGTTTTGTTTTATGGAGATGATAAA TGCACTTCCGAGTCCACAGGAGAATGGAATGGAGAGCCCATTCAACAGAG TTATCGAAGCGGAGAAACATCAACGACTGCTACGACACTGATGATATGGG CGCAGATAGCACATGCAGACGTAACCAATATTCGATTGAAGAACTGAAAG AACACGCAAGCCGCTGATGGCGTTTCTTCTGCGTGTAATTTGCGGAGACT TTTGCGATGTCTTTGACACTTCAGGAGTGGAACGCACGCCAGCGACGTCC AAGAAGCCTTGAAAGCAGTTCGTCGATGGGCTTCGGGAATGCAGGATATT CCCACCTGCCGGTTAAGGATGGAAGAGAGTATCTGTTACGAATCAGCGTA AAGTTTGACTTAAATCGACCAGTAACAGGTGGCCTTTTGAAGAGGATAAG AAATGGAAGAAGGCGAAGTCATGAGGTCGCCGGATTTTACCCGCCTACAA CCTTTATATAAGAAACAATGGATAATTACTGCTACAGGGACCCAAGGACG GTAAAGAGTTTGGATTAGGCAGAGACAGGCGAATCGCATCACTGAAGCTA GTGACGGACGCAAGATCTGATGATCAGTGGGTATTCAGGGCATCGACAAA CTTAGCAAGCACTACGGCGTTGGGGGGGGAAGCGGAACTCCCGACAGCTG ACTCAGATATCGTGTTCCTCGTAGTGTTTCTGGGCTTGATCGTCATACGC CTGCTCCTGCGGCCCTAGGGAAGGCACGTCAGAGCGAGAGTGCAGCTAGC AGCTCATAACTACAA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/2809_4330 GTTTGGCCTACAGCTGACCTGTATTGGGCAGAGTCGGCGAGTCTGTCTTA GAATTCTCTGCGGGTCGCAGCGTGCGAGTTGGTTTCGTGAGGCGATCAAG CGCTAGCCGTGTGACAGCGTAAGCGAGAGATTATGGTCTGACGCGATTGG ATTCTCAACGACTGTCGCCGGCTTGTGTGTGTGTCTGAAAACTAGCTGTG ATGGCTTGGCCTGATAGTAACTTTGCTTCAGGTGAGTTGCGATTCGCCCT GGTTCATATCTTGCTATGATTCTATATTATTCCTGGCGTTGTAGCCAGCC GCTCCGTTTGGGTCGCCTGTGGGCCGCGCCAGTATATCCATTGTTTCTTA TATAGCCGTGTATCCTTGGAAGGGGTAAATCCCGGGAGTGCGCGCTGATG TTCGACTGACCGTGCAGAGCTATCTAATCATGGTTTCTTGGCACTCTGCT 
GTCAAGAAGGCACTGCTGTATGTATGTCGAGTCCGATCTTTATGTCAGAG CCAGTTGTATCCGCGTCGATTCGGTAGAACAGATACTCGGTCGTATCCAG TGCCTTGAGGCGGAGATGCTCAGATGCCGTGCAATATGTCACTGAATCCA TCGACGAGTGGGATATTCGAAGCTTCTTCGCCGAGACGAATCGCCGTGGC GTGCTCGCTATCGTGCTCGTGAACGGTTGCTTTCAAGTGACGCATATCGC AAGTCTCGCCGCATGATACACGCATGAAAAACGCTCGCAGCTCAAGGCCG GGGGAGGAGGTGTATCGTTATCAGTATACGATAGACATTTCGAGTGTTTT GGCGTGAGAGTCTTTGGCATTGTGCTATCTGCGCCATACCATCATGCCAC TCCAGTGCGCTCGCTAAGCCAGCGCGTGCGGTTGGATGTGCTTGGTCCGG GTCTTCGATTCAATCTACGGTGAACTCCTGGTCTTCAGTTTGCACTTCTT TCCGCTGTCTCTGTGACTGCGAGAGAGGTGGCATGTTATGCAGTACTCCA TAGAAACGCACAACCTGCCGTAGGCGCGAGCTATCCAGGGATAACAAATG AAATCGCCGCAAAAGGATGTGGAGATGTGGTTATTAGTATATTCGGTTTA ATACATTCTATAGATGTCTATTTGTACAGGAGAGGTGCAAGTATCAGCGG CTATCTTATCCACGCGTAGCTACTCGGGTGATATATAAACTTTTTGGCGG CACTGGTATGAGGGAGGTTTTCATCTGTCATCGCAGGCACTTTCGACAGA ATAGGGATTTTGTTACTTCATGATAGACTGCTACTGAGAAAGGCCATATG CATTGAGACTGTAGTGTTGATGGCCCGCCTGTTCAGGCTGAATATTCTGA GTATTACACACTGTGCAGTCTACATACAAGCGAGGTGGCGCTGTTATTCG ATGATGTACTGCATCGCGCAGACGAGGACACCAGGAGAGTCTGCAACAAT CGATTGGATGGTACGCGCCCAATCCTTGTGTGCTAAGCGCAGACTCTACC GGATCAGTTGATGAGTCAGTT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/4375_4894 ACGCTGCGTTGGGCCTTTGCAAACTGTGGTATGCGTAAGTCCCGTATCAA TCGAGCCTGTAACGGTATCAGTCGTGAGTTGATTGCGGAATAAACAGTAA AGCATAAGACCTCAACCCCGGTTCGGATCAGATGTTGTGAGTATCGAAGT CTCATCTGACACTACAGACGTGACGCGCTGGCAGAGGTCGCTGTGGAGCA CGCGCAGACGGAGGATTCTGTCTTCTTCCAGCAGAGCGTTATCTCTGTGT TATTAACGAACCGATCTGCACTCTGGTCCTTGTGTTGAATGGCGTGGCCG GCCGGTCAGAAACGATCAACTCGCAGATACTCACCTGCATCCTGAACCAT TGAACCTCCAACCCCGTAATAGCGGATGCGTAATGAATGTCGGATAGTTA CTAACGGTCTTGTTCGATTAACGGCGCCAGACACTCTCCGGTCACCAGTG CGTGCTTGGATAACGGAGTCTTCCCAGGATGGCGAGACACACAATTGAGA GACGATGATGTTCCGCTGC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/18/4953_5418 GCGGACAAACAGTTATCTTGTGTGTTGTGTCCATGCCTGGGAGTGCTGCT ACTATCATGCACTGCGACTGGCTGTACCCTTGGAACAGAGTTTCCTCGCG TGCCAGTTAATCGAACCCAGGCTCTAGGTTTATATACTAGCGACCTTATC ATTCTGCCGCTCAGTGCGCTATTAACGGGGGATATGGGAGGCTGCGAATC 
GAGGTATCAGAGAGTCGACAGGCTCGGGGTATAAGCTGCATTACTTGAGT GGGATGTAGACGCTGGGCATTCGCATGCCGGAAGAGCTGAGGTCGCGGGT TTTTGTAAACATAGTGTAGACGCTTAGTGAAACGATTGCTGAGATTTTCG CCGCGATACAGTCTTCAACTTCGCGATCGCCAGAGATGTAGTGAGTGCTC GAGTATGATTGACTCTGATACTACAAACGTGGAGGTGAGAGGTAAATAGC GAGTTAGTTATATCT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/21/0_506 TACTGCCCTGCGTCGCCTGCCGGTATGAAGATCCGCGAATCTGGCGGACG GCTTACCAGCCGCCGTCGCATCACAATGCAGCGAACTATGCGTGACGAAG AGCTGGCCATTGCCTCAGGTCCGAAGAGATGCAGGCAAAGTTCTGGCCGT GCTTAAGGGCAAATACACCCATGACCGGTGGAAGCCTTCATCGTGAGGGG ATATGGCCGCAGTGAGGAGAATAACATCACGCAGTCCGGCGCACGGAGTG GAGCAAGCGTGACAAGTCCACGTATGTACCCACCGACGAATCCGAAGCCT ACGGCGCTGAACGCCAGCGGTGTGGTGAATATCATCGTGTTCGATCCGAA AGGCCTGGGCGCTGTTCCGGTTCCTTCAAAGCCGTCAAGGAGAAGCTGGA TACCGTCGGGCTCTAATTCCGAGCTGGAGACAGCGGTGAAAGACCTGGGC AAAGCGGTGTCCTATAAAAGGGGATGTATGGCCGATGTGGCCATCGTCCG TGTATT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/21/553_2265 TAATACACGACGATGGGCCACATCGCCATAACATCCCTTATAGGACAGCC GCTTTGCCCAGGTCTTTCACCGCTGTCCCAGCTCGAATTAGAGCCACGAC GGGTATCCGCTCCCTTGACGGCTTTGGAAAGGAACGACAGCGCCCAGCCT TTCGGAATTCGGAACACGATGATGATTTCACTCACACCGCTGGCGTTCCA AGCGCGTAGGCTTCGATATCGGCGTCGGGTCAACGTGGACTTGTCACGTT TGCTCCACTCCGTGCCGGCCGGACCCTGCGGTGATGTTATCTCCTCACTG CGGCCCATATCCACCTCAACCGGATCGAGGCTTCACCGGTCATGGTGTAT TTGCCCTTAAGCACGGCAAGGAAACTGCCCTGCATCTCTTCGACCTGAGC AAATGGCCAGCTCTTCGTCACGCAATGTTTGCATGAAGATTGCGACGGCG GCGTAAGCCGGTCCGCCAGATCTGCGGATCTCATCCGCAGGCGACGCAGG TCATCTGCGGATTCACTTCATGCTTTCGGCTGACATATCCCGGCGGTAAA ATTTCAGAGTGGAGCCCGCACGGGAACGGATAACCTCACCGGAAAACAAT CGGCGAAACGTACAGCGCCATGTTTACCCAGTCCGCGAATTTGTGAGAGA ATAGACTCTTCTCCCCGTGGTGAAGGGATAGGCTCTCACGGAAAAAGCAG ACGCAGGAAACAGCGGATCAAACTTAAATTCTGCTCATTGTTGCCGCAAG CAGTTGGGCGGTTGTGTACATCGACATAAAAAAAATCCCGCTAAAAAAAG CCGCACAGGCGCCTTTAGTGATGAAGGGGTAAAGTTAAACGATGCTGATT GCCGTTCCGGCAAACGCGGTCCGTTTTTTTCGTCTCGTCGCTGGCAGCCC TCCGGCCAGAGCACAATCCTCATAACGGACGTGCCGGACTTGTAGAACGT CAGCGTGTGCTGGTCTGGTCAGCAGCAACCGCAAAGAATGCCAACGGCAG 
CACCGTCGGTGGCCCATCCCACGCAACCAGCCTACAGGCTGGAGGTGTCC AGCAATCAGCGGGGTCATTGCAGGCGCTTTCGCCACTCAATCCGCCGGGC GCGGTTGCGGTATGGAGCCGGGTTCACTGTTGCCCTGCGGCTGGTAATGG GTAAATGGTTTCTTGCTCGTCATAAACACTCTTACACTGGTGTTCCAGCA AATCGTTAACGCATCAGAATGCCGGGTTACCTGCAGCCAGCGGTGCCGGT GCCCCTGCATCAGACCGATCCAGCGCAGTGTCACTGCGCGCCTGTGCACT CGTTGGTGCTGCGGCCAGATGGCGGCGGGCCGTTTTCACGTCATACCGGG GGTTCTGCCAGCAGGCGTGCCTGTTCTTCGCGTCCGTGAGCCTCTCACAG TTGAGGATCCCCATAAGCGGCTGTTTTCTGCGCAACCGCTGCGGGATCTG CGCGTTCACGTCCGGCTGCGCCCGCGCTGGCGTTCTCGCCCTCCGTCGCT GGCACCACGTCAGAACGTCAGCCTGCCGAAGCAGTGGGCTGAAACATTGT GATTGAAGTCCCTTGGGTCATTCCGCCCTCCTGAGAGACGGGATTTAACG TGCATCCAGTGCATCACGCATGACGGTGATCGCCATCGGTGCTGTTAACA AGTTCATCAGCCAGTCCGGCATCAATGGCCCTCCTGACCGCTGTACCACC TGCAGCCCGGTA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/21/2313_4029 ACCCGAGGCCTGCCCAGTGTACACGCGGTCAGGAGGCCATTGATGCCGGA CTGGCTGATGAACCTTGTTAACAGCACCGATCGCGATCACCGTCATGGCG TGATGCACTGGATGCACGTAAATCCGTCTCGTCAGGAGGGCGAATGACCA AAGAGACTCAATCAACAACGTTCAGCCACTGCTTCGCGCAGCTACCGTTA CGACGTGGGCCAGCGACGAGGGCGAGAACGCCAGCGCGCGCAGCCGGACG TGAACAGGCGCAGATCACCGCAGCGGTTGCCGGCAGAAACAGCCGCCTTA TGGGGATCTCAACTGTGAGGAGGCTCCGGACGCGAAGAACAGGCCACGCG TGCTGGCAGAAACCCCCGTGTATGACCGTGAAAAACGGTCCCGCGCATTC TGGCCCCGCAGCACCACAGAGTGCACAGGCGCGCAGTGACACTGCGCTGG ATCGCTGATGGCAGTGGGGCACCGGCACGCTGGCGCAGGTAACCCGGCAT CCTGATGCCGTTAACGATTTGCTGAACACAACCAAGTGTGTAAGGGATGG TTATGACGGCAAGAAACCTTTAGACCCATTACCAGTCCGCAGGGCAACAG TGGACCCGGCTCATACGCCAACCGCGCCCAGGCGGATTGAGTGCGAAAGC GCCTGGCAATGACCCCGCTGATGCTGACACCTCCAGCCGTAACTGGTGCC GTGGGAGGCACCACACCGACGGTGCTGCCGTTGCATTCTTGCGGTTGCTG GCTGACCAGACCAGCCCACGCTGACGTTCTACAAGTCCGGCACGTTCCGT TATGAGGATTGTGCTCTGGCCGGAGGCTGCCAGCACGAGACCGAAAAACG GACCGCGTTGCCGGAACGGCAGATCAGCATCGTTAACTTTAACCCTTCAT CACTGAAAAGGCCGCCTGTGCGGGGCTTTTTTACGGATTTTTTTATGTCC GATGTACACCAACCGCCCAACTGCTGGCGGCAAATGAGGCAAGAAATTTA AGTTTGAATCCGCTGTTTCTGCGTCTCTTTTTCCGTGAGGCTATCCCTTG CACCACGGAGAAAGTCCTATCTCTCACAAATTCCGGGACTGGTAAACATG GCGCTGTACGTTTCCGGCCGATTGTTCCGGTGAGGTTATCCTTCCCGTGG 
CGGCTCACCTCTGAATTACGCGGATATGTCCAAGCCGAAGCATAAGTGAA TCCGCCAGATGACCCTGGCGCGCCTCCGGATGAAGACCGCAGGAATCTGG CGGACGCCGGCTTACCTGCCGCCGTCGCATCATCATCGCAGACATTGCGT GACGAAGAGCTGGCCATATGCTCAGGTCCAAGAGATGCATGGCAGTTTCT GCCCGTGCTTAAGGGCAAATACACCAATGACCGGTGAAGCCTTCGATCCG GTTGAAGGTGAGATGGCCGCAGTGAGGAGAATAACATCACGGCAGTCCGG CGGCACGGGAGTTGGAGCAAGCGTGACAAGTCACGTATGGACCCGACGAC GGATATCGAAGCTTACGCGCTGAACGCAGCGGTGTGGGTGAATACATCGT GTTCGATCCGAAGGCTGGGCGCTTGTTCCGTTCCTTCAAAGCCGTCAAGG AGAAGCTGGATTACCCGTGTGGCTCAATTCCGCAGCTGGAGACAGCGGTG AAAGACCTGGGCAAAGCGGTGTCCTATAAGGGGAGTGTATGGCGATGTGG CCCATCGTCGTGTATT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/21/4078_5075 TAATACACGACGATGGCCACATCGCCATACATCCCCTTATAGGACACCGC TTTGCCCAGGTCTTTCACCGCCTGTCTCCAGCTCCGATTAGAGCCACGAC GGGTATCCAGCTCTTCCTGACGGCTTTGAAGGGAAGGAACAGCGCCCAGC CTTCCCCCTCGGATCGAACACGATGATATTTCACCACACCCGCTGGCGTT CAGCGCGTAGGCTTCGATATCGTCGGTCGGGTCATACGTGGACTGTCAAC GCTGCTCCACCCGTCCGCCGATGCCGTGATGTTATTCTCGCTCACTGCGG CCATATCCACCTTCAACCGGATCGAAGGCTTCACCGGTCATGGGTGGAAT TTGCCTTAAGCAGCGGCCAGAAACTGCCTGCATCTCTTCGACCTGAGCAA TGGCCAGCTCTTACGTCACGCATGTTCTGCATGATGATGCGACGGCGGCG GTAAGCCGGGTCCCGCCAGAGTTCTGCGGATGCTCATCCTGGCAGGCGTA CGCAGGGTCATCTGCGGATTCACTTCATGCTTCGGCTTGACATATCCAGG CGTAATTCAGGAGGTGGAGCCGCCACGGGAACGGATAACCTCACCGGAAA CAATCGGCGAAACGGTACAGCGGCCATGTTACCAGTCCCGGAATTTGTGA GAGATAGACTTTCTCCGTGGTGAAGGGATAGCGTCTCACGGAAAAAAGAG ACGCAGAAACAGCGGATCAAACAGTTAATTTTCTGCTCATTGGCCGCCAG CAGTTGCGGTTGGTACATCGACATAAAAAAATCCCGTAAAAAAAGCCGCA CGGCGGCCCTTTAAGTGGATCAAGGGTAATAGTTAGACGATGCTGAATGC CGTTCCGGCAAACGCGGCACCGTTTTTTTCGTCTCGTCGCTGGCAGCCTC CCGGCCAGAGCACATCCTCATAACGGAAACCGTGCGGACTTGTAGAACGT CACGTGGTGCTGTCGGCAACAGCACCGCATAGAATGCCAACGGCTAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/22/0_346 TTGGAAAGCCAGTTGATCATCAGCGCAGGTTAATCTGGAACCGCGAACGA ATCACGCACTCACAAAACGGGATCGTGAAAGAAATCAAAGCGCCGGACAC GTTCAATCTTTGGTCCTGACGCCAGCAGTGAAACCACTCAAGTTTGCCAA CCAAATGTATATCGATACCGGCGCAGTGTTCGGGAAACCTAACATTGATT CAGTACAGGGGGAGAAGGCGCATGAGACTCGAAAGCGGTACGCTAAATTT 
CATTCGCCCAAAAAAGCCCCGAATGATGAGCGACTCACCACGGCCCACGG CTTCTGACTCTCTTTCCGGTTACTGGATGTGATGGCTTGCTATGGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/22/390_999 CCCCCATAGCAGGCATCACATAGTAACCGGAAGAGCAGTCCAGAAGCCGT GGCCCGTGGTGAGTCGCTCATCATCGGGCTTTTTGGCGAATGAAATTTAG ACTACGCATTTCCGAGTCTCATTGCGGCCTTCTCCCTGTACCTGAACTCA ATGTTAGGTTTCCGCAGACACTGGCGCCGGTATTCGATGATACATTTGGT TGGCAAACTTGAAGTGGATTTCACGTGCTGGCGTATGACAAAGATGAACG TGTCGCCGGCCTTTGAATTTCTTTCACGATCCGTTTTGTGAGTTGCTGAT TCGTTCGCGGTTGCCAGATTACCTGCTGATGATCAGACTGAGCTTTCCAA ACTCGTATTCGTCAAAGGGATAATCGCGGTGGCAGAGAGAGAGTATTTTT TATTTTGCCTCACCAGGTTTCGGATGATGTAACGGAAGTTCATCCTGCTT TAATGGGCAAGAGCTTTAGCCAGAATTTCTTTGTCGTAATCCGAGATAAA GAACAGCCACCGCCATTAAGCAGCCAGTGAATTAACGTTTCCACGCTCCT TGATAAGCCATCAATCATCCATTTGCCTCATGGTTTCACGTGACAGCTCT GAACCAGGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/22/1051_1684 CAGCCCTGGGTTCAGAGCTGTACGTGGGGAAACCATGAGCAAATGATGAT TGATGGCTTTATCAGAGCGTGGAAACGTTCAATCCTGGGCTGCTTAATGC GGTGGCTGGTTCTTTAATCTGCGATTACGACAAAGAAATTCATGGCCTAA AGGCCTTGCCCATAAAGCCAGATGAACTTCCGTTAATCATCGAACTGGTG AGGCGAAAGATAACAAAATATGTAGTCTGCCACGGCCGATTATCCTTTGA CGAATAGCGGTTGGAAAGCCAGTTGATCATGCAGAAGGTAATCGGAACCC GGGCGAAGCGAATCAGCAACGTCACAAAACGGGATCGTGAAAGAAAGTCA AAGGCGCGCGGACACGTTCATCTTTGGTCATACGCCAGCAGTGAAACCAC TCAAGTGTTGCCAACCCAAAATGTATATCGATACCGGCGCAGTGTTCTGC GGAAACCTAACATTATTCAGGTACAGGGAGAAGGCGCCATGACGACTCGA AACGTCAGCTAAATTTCATTCGCCAAAAAGCCCGATGATGAGCGACTCAC CACGGGCGCACGGCTTCGACTCTCTTCTAGCACCGCGACGCACACGCGAG CGGGGTACTGAGATGTGATGGCTGCTATGCGGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/22/1732_2248 CCCAAAAGCCAGCCTCACATCAGTACCGGAAGAGAGTAGAAGCCGTGGCC CGTGGCTGAGTCGCTCATCATCGGGCGTTTTTTGGCGAATGAAATTGTAG CGTACGCTTTCGAGTCTCATGCGCCTTCTCCTGTACTGATCAATGTTAGG TTTCCGCAGAACACTGCGCGGTATCGATATACAATTTGGTTGCAAACTTG AGTGGTTTCACTGCTGGCGTATGGACCAACCAGATGAACGTGTCCCGCGG CCTTTGATTTCTTTCACCGATCCCGGTTTTGTGAGGTTGCTGATTCGTTC GCCGGTTCCAGATTACCTGCTGATGATCAACTGGCTTTCCAAATCGTATT CCGATCAAAGGGATTAATCGGCGTGGCAGGATAACATATGTTTTTATCTT 
TGCTCACCATTCGATGATTACGTGAAGTTCATCTGCTTTATGGGCCAAGA GCTTTAGCCAGAAGTTCTTTGGTCGTATCGAGGATAAAGAACCAGCCACG CCATGGCCGTGATTAA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/0_354 ATCACGAATAGTCGGCTCCAACGTGGGTTTTCATAAAGTCTCGGCATCAC CATCCGTCGGCAAACCAGATAAGGGTGTTGCGCTGCTTATGCTCTATAAA GTAGGCATAAACACCCAGGCAGCATTTTGGAATAGACCGACACGGGCAGA CTTAACCAACATTCACCTCCACGGATGTAGTCGCTGCCATCGCATTCATG ATGGCCCGCTGAAAGGGCAGTGTTTCCCCAGCGCCCTTTCCCTGGTATGC GGATTCTTTCCGGGAGGATAGTAATTAGCATCCGCCCATTCAACGGCGGT CTGTTGGCTCCGGCCTGAACAGTTGAAGCGAAGCCCGGCGCGGACAAAAT GCCG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/401_1026 CGGCATTTGTCCCGCGCCGGCTTCGCTCACTGTTCAGGCCGGAGCCACAG ACCGCCGTTGAATGGGCCGGATGCTAATACTATCTCCCGAAAGAATCCGC ATACAGGATAGGGCGCTGGGAAACACTGCCTTTCAGCGGCCATCAGAATG CGATGGGAGGACTACATCCGTGAGGTGAATGTGGTGAAGTCTGCCTGTCC GGTTATTCCAAAATGCTGCTGGGTGTTTATGCTACCTTTATAGAGCATAA GCAGCGCAACACCCTTACTGGTTGCCCCGACGGATGGTGATGCCGAGAAC TTTATTGAAAACCCACGTTGAGCCGACTATTCGTGATATTCCGCGCTGCT GGCGCTGGCCCCGTGGTATTGGCAAAAAGCACCCGGGATAACACGCGTCA CCATGAAGCGTTTCAACTAATGGGCGTGGACTTCTGGTGACTGGGCGGTA AAGCGGCCAAAAAACTACCGTGAAAAGTCCGGTGGATGTGGCCGGGTTAT GATGAACTTGCTGCTTTGATGATATATTGAACAGGAAAGACGCTTCCGAC GTTCCTGGGTGACAGCGTATTGAAGGCTCGGTCTGGCCAAAGTCCATGCC GTGGCTCCACGCCAAAAGTGAGAGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/1074_1708 CCTCTCACTTTTGGCGTGGGAGCCACGGATGGACTTTGGCCCAGACCCGA GCCCTTCAATACGCATGTCACCCAGGAACGTCGGAGAGCCTTCCTGTTCA AATATCACATCAAAAGCAGCAAGTTCATCATAACCCGCACAATCCACCGA CTTTTCACGGTAGTTTTTTTGCCGCTTTAGCCGCCCAGGCCCAACAGAAG CCACGCCCATTAGTGAAACGCTTCATGGTGAGCGTGTTTATCCCGTCTTT GCCATACCACGGGGCCAGCGCCAGCAGCGACGGGAATATCACGAATAGTC GGCTCAACGTGGGTTTTCATAAAGTTCTCGGCATCACATCCGTCGGCAAC CAGATAAGGGTGTTGCGCTGCTTATGCTCTATAAAGTAGGCATAAACACC CAGCAGCCATTTTGGAATAACCGACCACGGCAGACTTCAACCACATTCAC CTCACGGATGTAGTCGCTGCCCATCGCATTCCATGATGGCCGCTGAAAGG GCAGTGTTTCCGCACGCCCCTTCATGGTATGGCGGATTCTTTCGGGAGAT AGTAAATTAGCATCCGCCCATTCAAAGGCGGTCTGGTGGCTCCGGCCTGA ACAGTGAGCGAAGCCCGGGCGGACCAAAATGCCG 
>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/1752_2380 CGGCATTTTGTCCGCGCGGGCTTCGCTCACTGTTCAGGCCCGGAGCCACA GACCGCCCGTTGAATGGGCGGATGCTAATTACTATCTCCCGAAAGATCCC ATACCAGGAAGGGCGCTGGGAAACACTGCCCTTTCAGCGGGGCCATCATG AATGCGATGGGCAGCGACTACCATCCGTGAGGTGAATGTGGTGAAGTCTG CCCGTGTCGGTTATTCCAAAATGCTGCTGGGTGTTTTATGCCTACTTATA GAGCATAAGCAGCCCGCAACACCCTTATCTGGGTTGCCGACGGATGGTGG ATGCCGAGAACTTTATGAAAACCCACGTTGAGCCGACTATTCGTGGATAT CCGTCGTGCTGCGCTGGCCGTGGTATGGCAAAAAGCAACCGGGAATAACA CGCCTCACCATGAAGCGTTTCACTAATGGCGTGGCTTCTGGTGCCTGGGC GGTAAAGCGGCAAAAAACTACCGTGAAAAGTCGGTGGATGTGGCGGGTAT GATGAACTTTGCCTGCTTTTGATGATGATATTGAACAGGAAGGCTCCCGA CGTTTCCTGGGTGACAAGCGTATTGAAGGGCTCGGTCTGGCCAAAGTCCA TCCGTGGCTCCACGCCAAAAGTGAGAGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/2426_3071 CCTCTCACTTTTGGCGTGGAGCCACGGAAATGGACTTTGTGCCAGACCCG AGCCTTTCAATACGGCTTGGTCACCCACGGAACGTCGAGAGCCTTCCTGT TCAATATCATCATCCAAAAGCAGCAAGTTCATCATAACCCGCCCCCACAT CCACCGACTTTTCAACGGATAGTTTTTGCCGCTTTACCGCCCAGGCACCA GAAGCCACGCCATTAGTGAAACGCAATCTCATGGTGAGCTGTGTTATCCC GGTGCTTTTGCCATACCACGGGCCCAGCGCCAGCAGCGACGGAATATCAC GAATAGTCGGCTTCAACGTGGGGTTTTCAATAAAGTTCTCGGCATCACCA TCCGTCGGACACCAGATAAGGGTGGTTGCGCTGCTTATGCTCTATAAAGT AGGCCTAAACACTCCCAACAGCATTGTGGAATAACCGACACGGGCAGACT TCACTCACATTCACCTCACGGATGTAGTCGCTGCCCATCGCATTCATGAT GCCCCTGGAAAGGGCAGTTGTTTCCCAGCGCCTTCTGGTATCGGATTCTT TCGGGAGATAGTAAATTAGCACCGCCCATTCAACGGCGGTCTGTGGCTCC GGCCTGAAACAGTGAAGCGAAGCCCAGGCGCGGGACAAAATGCCG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/3116_3741 CGGCATTTTGTCCGCGCCGGCTTCGCTCAACTGTTCAGCCGGAGCCACAC GCCGTTGAATGGGCGGATGCTAATTACTATCTCCCGAAAGAAATCCGCTA CCAGGAAGGGCGCTGGAAAAACTGCCCTTTCAGGGGCCATCTGAATGCGA TGGGCAGCGACTAACATCCGTGAGGTGAATGTGGTGAAGTGCTGCCCCCG TGTCGGTTATTCCAAAATGCTGCTGGGTGTTTATGCCTACTTTATAGAGC ATAAGCAGCGCAACACCCTTTATCTGGTTGCCGACGGATGGTGATCCGAG AACTTTATGAAAACGCAACGTTGAGCCGACTAATGTCGTGATTTCCGTCG CTGCTGGCGCTGGCCCCGTGGTATGGCAAAAAGCACCGGGATAACCACGC TCACCCATTGAAGCGTTTCACTATGGCGTGGCTTCTGGTGCCTGGGCGGT 
AAAGCGGCAAAAAAACTACCGTGAAAGTCGGTGGATGGTGCGGGTTATGA TGAACTTGCTGATTTTGATGATGATATTTGAACAGGAAGGCTCTCCGAAC GTTCCTGGGTGACAAGCGTATTGAAGGCTCGGTCTGGCCAAAGTCCATCC GTGCTCCACGTCCAAAAGTGAGAGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/24/3790_4270 CCTCTCACTTTGGCGTGGAGCCACCGGAATGGATTTGGCCAGACCGAGCC TTCAATACGCTTGTCCAACCCAGGAACGTCGGAAGCCTTCCTGTTTCAAT ATCATCATCAAAAGCAGCAAGTTCATCATAACCCGCCACATGCCACGCGA CTTTTCCAGCGGTAGTTTTTTGCGCTTTACCGCCCAGGCACCAGAAGCCA CCGCCCATTAGTGAAACGGCTTCATGGGTGAGCGTGTTATCGCGGTGCTT TTTGCCATACCCCACGGGGCCAGGCGGCCAGCAGCGACGGAATATTCACG AAATAGTCGGCTCAACGTGGGGTTTTCATAAAGTTCTCGGCATCATCCAT ACGTCGGCAAACCAGATAAGGGTGTTTGCCGCTGCTTATGCTCTATAAAG TAGGCATAAACACCCAGCAGCATTTTTGGAATAACCGACACGGGCAGACT TCACCACATTCACCTCCACGGATGTAGTCG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/138_620 ACATCCGGTTGCCCAGGGCAATAGTTCTGCGTTCTGTACGGGGGGAAATG CGCTGTGGGGTCACGCGTTGGTGTTCTCAGGAGATCAGCACGGCAGACGA GGGGACTGTGTCAGGATTCGTGGTATTGGTCCGCTGATGCAAAATGTATT ATGTGAAACGCCTGCCGGCGTTTTTGTCATTATGGAGCGTCGAGGAATGG GTAAAGGAGAGCAGCTAAGGGCATACCCCGGCGCCGAAGCGAAGGAGCAA CCTGAAGTCCACGCAGTTGCTGAGTGTGATCGATCGCCATCAGCGAAGGG CCCCCCGCCGCCCGCTACACATGCGTGCTGCTGAACCAGTACGACCGGTG CTGGGACATGCAGGGGAATACCAACATATCCCGGTGTCACGGCTGGTGCT CCGGCTGGTGAGCAGCTGAGCAGACTCCGCCGGAGGCGGATTTGCGAATC CTCCGGCTCCAGCACGACGCGTGGCTGGGTAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/675_3031 GTACCCAGCACCGGTCTCGGAGCACGGAGGATTCAAATCCCTCGGCGGAG TTCTGCTCCCTGCTCACCAGCCCCGGAACACCACCGTCGCACACGCGATA TGTTGGCTATCCCTCAGTTCCAGCACGGCGTACTGTTCAGCCAGCGCGCA CGCTTTTTAAGCCGTCCACCGGACCTTCAATCGGCCCTGTGACGCTGATG AAAACTACAACGAGCATACACATCTCACGCAATCTGAGCAGTGTGTTCTC CAGCTGCAGGTGTGCTTCCCTTCTCGCTCATCGCGCCGCCGCGTTTCACA CTCAGAGTCGCGAGGTTCCTCTGCTCTACCTCTATCCGTCCACGCCGCGA GCATCCCCAACAACATAGCGCTCCCAAAAACCTGGCCCGAGGGCGTTCAC ACTAAGGAACCTGAATGCCTGTTATATGCACGATCAGTCCTGACATAGCT CCGACCCAATCGACAACGCGACCACCCGGTCCGTTCAGTCCTTGCGCCCT GTGGCCGTCATCTGCACTGCGTAAAGGCCCCCTATCCGTACCGTGACCCC CACGAGACGAGAGACGAGCATCCTCCCGTACGCAACCAGGCAAGAAACAT 
GGGCGCGCGTGGGCGAGGCCAATGTGGTTAAGGTAAACCAACCCGCAGTG CGCCTGCTGAGAATGTGACGGGGGGCATTCTGTCTATACCCGATATTGCT CGCGTTGCGTGTCGCTAGAATACACGACACCGGTGGATGCCACAGTCCTG GTCTTTCCCGGGTGCAACCCATGTCTGCGGCCATCACTCCAGACACCGAC GGGCACCAATCTCTCTCGAGCCAGCGAACCCAATCAGAGCCATGTCGAAA GAGCCGGACGGAGGAGAACAGATGCAGACCGAAACAAGGCCGCCCTCGGT GAGTGTATGCGTGACGTCGATCCTGTTCTTGGGGCCTTCCCGAACGCCAG AGTATACGACAACCGGATACAGCGCGGGTAAGCAGAACACTATCTCCTCA CCTGGATAACATGGCTTGCCCAGGCAATGGTCTGCCTGGTTCTGTAACGG AGGGGAGTGCGCGTGCCGCGGCTCGCGTGGTTTCTCAGGAGATCAGCGCA CGGCAGACGGCGGGGACGGGGTCAGGTTGTGGTGCATGGCTCGGCTGATG CCCAAAAATGTTTTTATGTGAACCGCCTGCGGGCGGTTTTTGTCGATTTA TGGGACGCGTGAGGACATGGGAAAGAAGCAGCGTTGAAGGGCCATACACC GCCGAAGCGAAAGGACAACCTGAAGCCACGCAAAGTTGCGAGTGTGAATC GGAGCCACCAGCGAAGGGCCGATTGGAAGGTCCGGTTGATGCCGCCTTAC AAAACGGCGCGCGGCTGCTGAACAGCGTCACCGCGGTGCATGGACCACTG AGGGGACGATACCAACATATCCGGGGCACGGTGTTCCCGGGCTGGTGGAG CAGGAGCAGACTTCCGCCGGAGCGGATTTGAATCCTCCGGTCCGAGACGG GTGCTGGGTAGGTCTCTGTGTTTTTTCAATTTTCATATCATTAAATCTAA TTTGCAGATATTTAAATTTTAAGCACCGAGAGGAGGACAGAGAAAGAGAG AGTATATATCGTATATATTCCTTCTCTTTCTCTGTTTCGTGATACATTAT AGGATTTTACTAGGTTTTGTATCTATTTTTCTCTCTTGTGTTATTATAAA GAAGATATATCTTCTTAGTTGTTGTTATGTTCGTTGCTTCTGTGTATGTT GTTTCTAGTTGTGCTTAGCGTTGCTTGCTTTTTTTTTTTTTTTTTTTTAT AGTTTTTTAGTCTTCGTGATTTTTCTTAATTATCTGGTCTTTTGTTCTTT GTATATGTTTGTTCTTATTTTTTTCGATCCACTCAGCAAACCTGTTTGTT CTTTGGTTTTTGTGGTACTTTCTTTTTTTTGTTTTTTTTTATTGTTGTTG TTTGCGCTTCGTCGCTTTCGTCTTGGCTTGGTGGTTTATGTCTTCTTTCT TCTTATTTCTTTTTGTCTTCTCTTTACCCATCCTCCCCGGCTCCCATAAA ATGGCAAAACCGCCGCAAGGGCGCGTTTCACATAAAACATTTTTGCATCA GCGACCCAATCCACCACAACCTGACCACCGTCCCCCTTGCGGTCTGCCGT GCCTGAGTCCCTGGCAGAAACCCGGACGCGTGACCCACGCGCAAGTTTCA CACGTAACAGATACCAGGCCAGGAACATTGCCTGGGCCATACCATGTTAT CCCAGTTGAGGGAGAAAAAGTGTTCTGGCTTACCGTCATCGTTGTCTTAT ACGGGAGTTCTGGCTTTCGGTGCCCCAGCATCCTTGGCCCACACCACCCG AGGCAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/3077_3717 GTGCTCGGTGGTGTGGGCGGCAGACTGGCTGGCACCGAAACCAGAAACTC CCGTATACATACAGACAACCGGATAACGGGTAAGCAGACACACCGTATTT 
CGTCCTCTGGAAACAAGGTTTGCCCCAGGGCCAAATGTTCTGCCCTGTTT GTACGGGGGGGAAATCCGCGTGGGGTCAACGCGTGGTTTCCCCCTCCCAG GAGATCAAGCACGGCAGAAGCCGAAGGGGACCCGGTGGGTCAGGTGTGGG GATTGGCGCTGATGCAAAATGCTTTATGGTGAAGACCGCCCTGCCCCGGC TGGTTTGTCATTTTATGGAGCGTCGGAGGAAGTGGGTAAAGGGAAGCCAG GTAAAGGCATACCCCGCGCGAAGCGAAGGGACAACCCTGTGAAGTCCCAC GATTGCTGAGTGTGGGATCGAATGCCAATCCGCGAGGCGCCGATGAAGGT CCGGTGGGATGGCTTTGTTTAAAAAGCGTGCTGGCTGAACAGTACGCCCC GGGTGTGGACACCTGAGGGGAATACCCAACCATATCCGGTGGTCCACGTG GTGGTGTTTCCCGGGCCCTGGTGAGCAGGACAGACCTCCGCCGGAGGGAT TTGAATCCTCCCGGCTCCCCGAGACGGGGTGGCCGGGTAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/3771_4379 GGGTACCAGGCAACCCGTCTCGGAGCCCGGAAGGATTCAAATCCCTCCGG CCGGATGTCTGCTCTGCTCACCAAGCTCCGGGAACACCACCCGTGGACAC CGAATATGGTATCCCTCAGTGTCAGCAACCGGCGTACTGTGTTCAGCAGC ACGCTTTTTTTAAGCCATCCACGGACCTCAATCGGCCCCTTCGGCTGACT GCATCCCGATCACACTCAGCAAGCTGCGTGGACTGTCAGGGTTGTCCCTT CGCTTGCCGCGCGCGGGTAGCCTTGACTGCTTCTTTACCCACTTCTCACG CTCCCAATAAATGACAAAAACCGCCGCCCGCAGGCCGGTTTGCACATAAA GACATTTTGCCATCGCGACCCAATCACCACAACTGACCACCGTCCCCCTT CGCTCTGCCGTGGGGCCTGATCTCCTGGAGAAAACCGACGGCCGTGACCA CGCGCATTTCCCCGTACAGGAACCAGGCAGAACATGCCCTGGCAACCATG AGTATCCGTGGAGGAAATAGGTGTTCTGCTTACCGTATCCGTTTGTCCCT GTCATACGGGAGTCCGGCTTTCGGTGCCAGCATCTTGCGCCACACCCACC GAGCCACA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/4430_5057 GTGGCTCGGTGGGTGCTGGCGCAAAGCATTGGGCACCGAAAGCCCAGAAC GTCCCCGTATACAGGACAACCGGAAATAACGGTAAGCAGAAACCACCCCT ATGTTCTCTCCTGGATACAATGGTTGCCCAGGGCAATGTTCTGCCCTGTT CTGTAGGGGGAAAAATGCGCGGGGTGGTGGCGTCCCCACCGCGGTGGTTT CCAGGAGAGCAGCGACGGGCAGGACGAAGGGGACGGTGGTCCCAGGTTTG GGTGATTGTGGCCGCTGATGCCAAAGTTTATGTGGAACCGTGCGGGCCGG TTTTTGTCATTGATTGGAGCGGTGGAGAACCGGGTAAAGGAACAGATAAA GGGGCATACGCGCGAAAGCCCCGAAGGAACAACCTGAGCCCACGCAGTTG CTGAGTGTGATCCGATGCCCCATCCAGCCGAGGGCCCGATTGAAGGTCCC GGTGGGATGGCTTTAAAAATGCCCGTGCCTGCTGAAACAGTACGCCCGGG CCTGGGACACTGAGCGGGAAGTACCAACCATATCCGTGGTCACGGGTGGG TGTCCGGGCCTGGGTGAGCAGGAGCAGCTCCGCCGGAAGGGATTGGAATC CCCGGTCCGAGACGGTGGCTTGGCGTA 
>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/5106_5726 GTTACCCAGCACGTCGGTCGGAGCCGGAGAGGCGAGAGAGCCTCCGGCGG ACGCTGCTCCTGCTCCCACGCAGCCAAGGAACACCACCGTGACACCGGAT ATGGGTATTTCCCCTCAGTGTCCCAACGCACCGGCCGTAGCTGTTCAAGC CAGCACGCTTTTTAAGCCCCATCCCACCCGGACCTTCAGTCGAGCGCCTT CGCTGATGGCAATCGGATCACACTCAGCAACTGCGTGGACTTCAGGTGTC CCCTTCGCTTCGCGCGGGGGTATGCCCCTTACTGCTTCCTTACCCATTCC TCACGCCCCAAAATGACAACAACCGCCGCAGGGCGGTTCACATAAAACAT TTTGCTGCATCAGCCGACCAATCACCACAACCTGACCACCGGTCCCCCTT CGTCTGCCGTGATGATCCTCTGAGAACCCACGCGGGACGCCCACGCGCAT TTCCCGTACAGAAAACAGCCCAGAACATGCTGCCCTGGGCCAAACCATGT TAATCCAGTGAGGAGAAAAATAGGTGTTCCGGCTTACCGTTATCTCCGTG GTCGTATACACGGGAGTCTGGCTGTTCGGTGCCAGCCCATCTGGGCGCGG GCGCCCACACACGAGGCCAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/25/5774_5927 GTGCTCGGTGTGGTCGGGCGCCAGATGCTGGCACCCCGAAAGCCAGAAAC TCCGTTATACAGACAAGGGATTAAGGGTAAGCAGAAACCCTATTTTCCCT CACTGGATAACATTGGTTGCCAGGGCAAACTGTTCTGCTGCCCTGTTTCT GTA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/26/0_67 TCGACATTTCCTCGCACACATTCCGGCCCGCACTGGCCGGGGAAAACAGT ACGAGAACGACGCCAGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/26/109_954 TCTGGCGTTCGTTCTCCGTACTTGCTTTACCACCAGGCCAGTGTTAGCGT TAAACTTCCGGAGGGCCACACCGGTGCAAACTCAGGTCAAGCAGGGTGGT GGAAGTAGGAATTTTCATGTCAGCCACTTCTTTCGGAGCGGGGTTTTTGC TATCACGTTGTGAACTTCTGAAGCGGTGATGACGGGAGCCGTAATTTTGT GACGATTCATCCCACCTGTTCGACAGCTCTCACATCGATCCCGGTACGCT GCAGGATAATGATCCGGTGTCATGCTGCACGGACACCTTGTCTGCTCTGC GGCTTGCCCTTGCTTTTCAAGGAATCCAAGAGCTTTTACTGTTCGGCCCT GTGTCAGTTCTGACGATGCACAGAATGCGCGCGGAAATATCGGGAACAGA GCCAATAAGTCTCATCCCATGTTTTATCCAGGGCGATCCAGCAAGGTTAA TTCCTGCTGGTTTCATCGTTAACGCCGGAGTGATGTCTGCGTTCGGCTGA CCGTTCTGGCAGTGTACTGCAGTATGTTTTCGACAATGCGCTCGGCTTCA TCCTTGTCATAGATAACCCCAGCAAATCCGAAGGCCAAGACGGGCACACT AATCATGGCTTTATGACGTAAATCCGTTTGGGTGCGACTCGCCACGGCCC CGTGATTTCCTGCCTCGCGAGTTTTTGAATGGTTCGCGGCGGCATTCATC CATCCATTCGGTAACGCCAGATCGGATGAATTACGGTAAACTTGCGGTAA AATCCGCGCATGTACAGGATCATTGTCCTGCCCTCAAAGCCATGCCATCG 
GCAACTGCTGGTTTTCATTGATGTGCGGGACCAAGCCATCCAACT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/27/195_501 TGATGAGCGTTCGTTTCTGGATGATTTTGCTGCCTCTTTTGAGGCCACCG CATCTCGTGTGAAGTGGCGCGCCTCATGGAGCCTTCGTGGCGCGGTGGAG GCGACGTGCGGGCTGATTGTTGTGACGCTGCGCATTCGTTTCTGACGTTT TCGCCCGCACCGGCACTGGTGGCCGGCCGCGTTTTGCGAGGATTCTGCGG CTGCGGCACTTTTTCCGCTTCATGGCCTTTGCTGATGCGCTTCCTGCGCC CGGAGGACGCTTTCCTGAGGCTTGACGATGCAGCCTGTCCGGCGGACGTG CGGCGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/42/2357_2612 CCGCGCCCGAACGTCGCGCAGAGCAAACAGTGCCTCAATGGAAAGCAGCA AATCCCCTGTTGGTTTGGGGTAAGCGCAAAACCAGTTAACCGCCTATTCT CTTAGCTGAATCTTCAAACCCGAAATCACGAGTAGATAAGCGCACTAAAT CCGATAGACCGTTTACAGTGCTGGCTGAATACCACAAAAAACCCCCCCAA AAGAAAGTTTGAAAGCAACCTGCAACGTATTGAGGCGCAAGAATCAGCGC ACATG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/42/2655_3333 CATGTGCCTGATTCTTGCGCTCAATACGTTGCAGGTTGCTTTCATCTGTT TGTGTATTCAGCCAGCACTGTAAGGTCTATCGGATTTAGTGCGCTTTCCC TACTCGTGAATTTCGGTTGCGATTCAGCGAAGAGAATAGGGCGGGTTAAC TGGTTTTTGCCGCTTACCCCACCAACAGGGGATTTGCTGCTGTTCCATTG AGCCTGTTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGTGCATCCA TCTGGTATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTTGGCAGTTTGTAG TCCTGAACGAAACCCCCGCGATTGGCCACATTGGGCAGCTAACCGGAATC GCACTTACGGCCAATGCTTCGTTTCGTATCACACCACCAAAGCCTTCTGC TTTGAATGCTGCCCTTGTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTT CACTGGTGGTCAGTGCCGTCTGCCTGATGTGCTCAGTATCACCGCCAGTG GTATTTATGTCAAACACCGCCAGAGATAATTTTATCACCGCAGATGGTTT CTGTATGTTTTTTATATGTGAATTTTATTTTTTTGGCAGGGGGGCATTGT TGGTAGGTTGGAGAGATCTGAATTGCTATGTTTAAGTGAGTTGTATCTAT TTATTTAGTTCAATAAATACCAATTGGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/42/3378_4072 CCCAATGTATTATTGAAAAATAAATAGATACAACTCACTAAAACCATAGC AATTCCAGATCTCTCACCTACCAAACATGCCCCCCTGCAAAGAATAAATT CATAAAAAAAACATACAGATTGAACCATCTGCGGTGATAAGATTATCTAT GGCGGTGTTGACAAAAAAAAAAAAATAAATACCACTGGCGGTGAATACTG AGCACATCAGCAGGACGCACTGACCACCAAATGAAGGTGACCGCTCTTAA AAAATTAAGCCCTGGAAGAGGGCAGCATTCAAAGCAGAAAAGGCTTTTGG GTGTGTGGTGATACGAAACGAAGCATTGCCGTAAAGTGCGATTCCGGATT 
ATGGCTTGCCAATGTGCCAATCGCCGGGGGGTTTTCGTTCAGGACTACAA CTGCCACACACCACCAAAGCTAACTGTACAGGAGAATCCAGATGGATTGC ACAAACACGCCGCCGCGAACGTCGCGCAGAGAAACAGGCTCAATGGAAAG CAGCAAATCCCCCTGTTGGTTGGGGTAAGCGCAAAAACCAGTTAACCGCC CTATTCTTCGCTGAATCGCAAACCGAAATCACGAGGTAGAAAGCCGCACT AAATCCGATAGACCGTTACAGTGCTGGCTGAAATACCACAAACAGATTGA AGCAACCTGCAACGTATTGAGGCCGGCAAGAACAGGCGCACATG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/42/4118_4795 CATGTGCCGCTGATTCTTGCGCTCATACGTTGCCAGGTTTGCTTCAATCT GTTTGTGGTATTCAGCCAGCATCTGTAAGGTCTATCGGATTTAGTGCGCT TCTACTCGTGATTTTCCGGTTTGCGATTCAGCGAGAGAATAGGGCGGTTA ACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTTCCCCA TTGGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCC ATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTC CTGAACGAAAACCCCCCGCGATTGGCACATTGGCGCAGCTAATCCGGAAT CCGCACTTACGCCAAATGCTTCGTTGTTTCCGTTCACACAACCCCAAAGC CTCCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTTAATTTTTAAGAGCGG TCACCTCATGGTGGTCAGTGCGTCCCTGCTGATGTGCTCAGTATCACCGC CAGTGGGTATTTATGTCAACACCGCCCAGGAGATAATTTATCACCGCAGA ATTGGGTTATCTGTATGTTTTTTATATGCATTTAATTTTTTGGCAGGGGG CATTGTTGGTAGGTGAGAGATCTGAATTGCTATGTTTAGTGAGTTGTATC TATTTATTTTTCAATAAATACAATTGG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/42/4841_5464 CCAAATTTCAATGGTAAGTAATTATTGAAAAAATAAATAGATACAACTCC TAAACATAGCAATTCAGATTCTCCTACAAACAATGCCCCCCTGCAAAAAA TAAATAATATAAAAAACACTAGCAGATAACCGATCTGCGGTGATAAATTA TCTCTGGCCGGTGTTGACATAAATACCACGTGGTGCGTGATACTGCGAGC ACATCAGCAGGACGCACTGACCACCATGAAGGTGACGCTCTTTAAAAATT AAGCCCTGAAGAAGGCAGCATTCAAAGCAGAAGGCTTTTGGGCTGGTCGT GATACGAAACGAAGCCATTGAGCCGTAATGCGGATTCCGGCATTCAGCTG CCAATGTGCCCAATCGTGGGGGGTTTCGCGTTTCAGCGACTCTGCCCGCC CACACCGACCCAAAGCTAACTGAGCGAGGGAGCTCAGACATGATGCACAA ACACGCGGCTGACACGGCCGCCGCGAACGTCGCGCAAGAGAAACAGGCGT CTAATGGAAAGCAGCAAATCCCCTGTTGGTTGGGGTAAGCGCAAAACAGT TAACATCGCCCTATTCTCTCGCTGAATCGCAAACCGAAATCACGAGTAGA TAAGCGCACTAAATCCGATAGAC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/44/507_877 CTGAGAGTTAATTTCGCTCACTTCGAAAACCTCTCTGTTTACTGATAAAG 
CTTCCAGATCCTCCTGGCAACTTGCACAAGTGACAACCCTGAACGACCAG GCGGTCTCCGTTCATCCTATCGGATCGCCACACTCACCAACAATGAGTGC CAGATATAGCCTGGTGGTTCAGGCGGCGCATTTTTATTGCTGTGTTGCGT GTAATTCTTCTATTTCTGATGCTGAATCAAGATGCGTCTGCCATCTTTCA TTAATCCCCTGAACTGCTTGGTTAATACGGCTTGAGGGTGGGGGAAAATG CGGAATAATAAAAAAGGAGCCTGTAGCTCCGCTGATCGATTTTGCTTTTC ATGTTCATCGTTCCTTAAAG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/44/923_2037 CTTTAAGGAACCGAGTGAAAACAGGAAAAAGCAAAATCATCAGGGAGCTA CAGGCTCCTTTTTTATTAATTCGCATCACCCTCAAGCGTATTAACCAACA GTTCAGGGATTAATGAAAAGATGGCAGACATCATTGAATTCAGCATCAGA AATAAGAAGAATTACAGCGAAACACAGCAAATAAAAATGGCCGCCGCCTG AACCACCGGCTATATCCTGCTGCCAACTCATTGTTGTGAGTTGTGGCGAT CCGATAGATGAAACGAAGACGCCTGGTCGTTCAGGGTTGTCGGACTGTGT GCAAGTTGCCAGGAGGATCTGGAACTTATCGTAAACCAGAGAGGTTGAAG TGAGGCAAATTAACTCTCCAGGCACTGCCGTGAAGCGGCAGGAGCAGGCA ATGCATGACCGACTGGGGATTGTGACGCAGACCTTTTCCATGAATTGGTA ACACCATCGATTGTGCTGGAACTGCCTGGATGAACCGGGAAAGAAACCAG CAAATACATCAAACGCCGCCGACCAGGAGAACGAGGATATTGCGCTAACA GTAGGGAAACTGCGTGTTGAGCCTTGAAACAGCAAAATCAAACACCAACG AGCAGCTGAGTATTACGAAGGTGTTATCTCGGATGGGGTAAGCGTATTGC TAAACTGGAAAGCAACGAAGTCCGGGGAAGACGAGAAACCAGTTCTTTGT TGTTCCGTCCAATCCTGGGAACGACTCCTGCGTTATCAAGCACCTGCAAC TGGTGACTGGAAGAGTTTCTGCGGCAGTTAATCGAACAAGACCCGTTAGT AACTATCGACATCATTACGCATCGCTATTACGGGGTTGGAGGTCAATGGG TTCAAGGATGCCAGGTGAGTATCCTGCATATGAATGTCTGACCGCTGGCA TTCGCATCCAAAGGAGAGTGAGATCGGTTTTAGTAAAAAGATAACGCTTG TGAAATGCTGAATTTCGCGTCGGTCTTCACAAGAGATGCCAGAGTCTGAG TGTCAGAATGATGACCGTACTCAAACATCGGGTTGAGTAATTATCTTACT GTTTCTTTACATAAACATTGCTGATACCGTTTTAGCCCTGAAACGACATA CATTGCAAGGAGTT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/44/2083_3176 AAACTCCTTGCAATGTAGTCGTTTCAGCTAAAACGGTATCAGCAAGCTTT ATGTACAAGAAAGCAGTAAGATAATAAGCTCAACCCGATGTTTGAGGTAC GGTCATCATCTGACACTAGCAGACTCTGGCTCGCTGTGAAGACGACGCGA AATTCAGCATTTTCACAAGCGTTACTCTTTTACAAACGCGATCTCACTCT CCTTTGATGCGATGCCAGCGTCAGACATCATATGCAGATACTCACCTGCA TCCTGAACCCATTACCTCCACCCCGTAATAGCGATGCCGTAATGATGTCG ATAGTTACTAACGGGTCTTGTTTCGATAACTGCCGCAGAAACTCTTCCAG 
GCACCAGTGCAAGTGCTTGAATAACAGGAGTCTTCCCCAGGATGGCGACA ACAAGAAACCTGGTTTCCCGTCTTCACGGACTTCGTTGCTTTCCAGTTCT AGCAATACGCTTATCCCATCCGAGAATACACCTTCGTAATACTCACGCTG CTCGTTGAGTTTTGATTTTGCTGTTTCAAGCTCACACGCAGTTTCCCTAC CTGTTAGCGCAATATCCCTCGTTCTCCTGGTCGCGGCGTTTGATGTATTT GCTGGTTTTTTTCCCGTTCATCCAGCATTCAGCACAATCGATGGTGTTAC CAATTCATGGAAAAGGTCTGCGTCAAAATCCCCAGTCGTCATGCATTGCC TGCTCTGCCGCTTAAACGCAGTGCCTGAGAGTTAATTTCGCTCACTTCGA ACCCTCTCCTTGTTTTACTGATAAGTTCCAATCCTCCTGGCAACTTGCAC AGTCCGACAACCCTGAACCGACCAGGCGTCTTCGTTCCATCATCGGATCC GCACACTCACAAACAAATGAGTGGCAGATATCAGCCTGGTGGTCAGGCGG CGCATTTTTTACTTGCTGTGTTGCGCTGTAATTCTTCTATTTCTGATGCT GAATCATGATGTCTCGCATCTTTTCATCTACATCCTGAACTGTTGGTTAA TACGCTTGAGGGTGGGAAATGCGAATAATAAAAAAGGAGCCCTGTAGCTT CCCTGATGATTTGTCTTTTCAGCTGTTCATCGTTCCTTAAAGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/44/3224_4396 CTTTAGGAACGACTGAACTGAAAAAAAGCAAAATCATCAGGAGCTACAGG CTCCCTTTTTATTATTCCGCAATTCCCCCTCAAGCGTATTACCAACCAGT TTTCAGGGATTAAATGGAAAGATGGCAGACCATCATTGATTCAGCCCATC AGAAATAGAAGAATTACAGCGCAACCAGCAAATAAAAATGCGGCCGCCTG ACCACCAGGCTATATCTGCCACCTCATGTGTGTGAAGTTGTGGCGTCCGA TAGATTTTGAACGAAGACGCCTGGTTCCGTTCAGGGTTTTTGTCCGCGAA CTTGTGCCAAGTTGCAGGAGCGATCCTTAGGAACTTATCAGTAACAGAGA GGGTTCGAAAGTGAGCCGAAATTAACTCTCAGGCACCTGCGGGAAAGCGG CAAGAGCAGGCAATGGCAATGCGACTGGGGATTTGACGCAGATTTTCCCC CCTGAAATTGGTAGACACCATCCGATTGTGCGGGAACTTGGCTGGACTGG AACGGGAAAAGAAACCAGGCATACATCCAAACGCGGCGACCAGGAGAACG AGGAATATTGCGCTAACAGTAGGGAACTGCGGTGTGAGCTTGAAAACAGG CAAAATCTCAAAACTCAACGAGCTGCGGTGAGGCTATTTACGAAGGGTGG TTATCTCGGATGGGAGTAAAAAGCCGTATTGCTAAACTGCGAAAGCAACT GAAGGGTCCGTGGAGACCGGAAAACAGTGTTCTTTGTTGTTCGCCATCCT GGGAGACTCCCTGTTATCAAGCACTGCACTGGGGACCTGAAGAGTTTCTG GCGGCAGTAATCGAACAAGACCCGTTAGTAACCTATCGAACATCATTACG CAATCGGCTATTACGCGGGTGGAGGTCACATGGGTTTCCAGGATGCCAGG GTGGGAGTAAATCTGCCCCATATGAATGTCTGACCGCTGGCCATTCGCAT CAAAGGGGAGAGTGAGGATCGGGTTTTTTAAAAGATTAACGTTCGTGAAA ATGCTGAATTTCGGCCGGTCCGTCCTTCACAGCGATGCCCCAGAGTCTGT AATGTCCGGATGAAATGCCCGTATCAACATCCGGGGTTGAGTATTATCTT 
ACTGTTTCCTTTTACATAAACCATTTGCTGATACCGTTTAGTTCTGAAAC GACATACCATTGCAAGGAGTTT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/44/4449_4913 AAACTCCTTGCCAATGTATGTCCCGTTTTCAGCTAAACCGTAATCACAAT GTTTTAAATGTAAAGAAACAGTAAGATAATACCTCCAACCCGATGTTTGA AGTAACGGTCATCATCTGGACAACTACAGACTCTGGCATCTGCTTGTGCA GAACCGACGCGAAAATTCAGCATTTCAGCAAGCGTTATTCCTTTTACAAA ACCGATCCTCAACTCTCCCTTTGATGCCGAATGCCAGCGGGTCCAGACAT CATCATTGCAGAATACTCACCTGCCCATCCTGAACCCATGACCTCCAACC CCGTAATAGCGATCGCGTAATGATGTCGATATACTAACGGGGTCCTTGTT CATTAACCTGCCGCAGAAACCTCTTCCAGGTCCCACCCAGTGCCAGTGGC TTTGCAACAACCAGGAGTCCCTTCCCAGATGGCGGAAACAAAAACAAAGA AACTTGGTTTCCCA pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/ecoli_lp.fofn000066400000000000000000000004161241505617700244620ustar00rootroot00000000000000/home/UNIXHOME/yli/yliWorkspace/private/yli/data/testLoadPulses/m121215_065521_richard_c100425710150000001823055001121371_s1_p0.pls.h5 /home/UNIXHOME/yli/yliWorkspace/private/yli/data/testLoadPulses/m121215_065521_richard_c100425710150000001823055001121371_s2_p0.pls.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/example_read.fasta000066400000000000000000000004251241505617700254700ustar00rootroot00000000000000>m120619_015854_42161_c100392070070000001523040811021231_s1_p0/13/0_3797 GCTGTTTTCTCCAGCGCAGCACCGTAAATTACTGCTGAGCCATCATGACG CCGATGGAGCCTGTCCGGGCGGTCCCTGGCGTGACCAGACGCCGGAGGCG GCACTGGCAAGCAACTACCTGCACTGCAGTTCATGTGTTGGCAACCGCCC ATACCGGTTTATGTCCACGCACACGGGCGATAGTCAGCGCCGTCCAAATG pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/example_ref.fasta000066400000000000000000000070501241505617700253320ustar00rootroot00000000000000>example_reference CCGCATATTGCCAGCATGGCCTTTAATGAGCCGCTGATGCTTGAACCCGCCTATGCGCGG GTTTTCTTTTGTGCGCTTGCAGGCCAGCTTGGGATCAGCAGCCTGACGGATGCGGTGTCC GGCGACAGCCTGACTGCCCAGGAGGCACTCGCGACGCTGGCATTATCCGGTGATGATGAC GGACCACGACAGGCCCGCAGTTATCAGGTCATGAACGGCATCGCCGTGCTGCCGGTGTCC GGCACGCTGGTCAGCCGGACGCGGGCGCTGCAGCCGTACTCGGGGATGACCGGTTACAAC 
GGCATTATCGCCCGTCTGCAACAGGCTGCCAGCGATCCGATGGTGGACGGCATTCTGCTC GATATGGACACGCCCGGCGGGATGGTGGCGGGGGCATTTGACTGCGCTGACATCATCGCC CGTGTGCGTGACATAAAACCGGTATGGGCGCTTGCCAACGACATGAACTGCAGTGCAGGT CAGTTGCTTGCCAGTGCCGCCTCCCGGCGTCTGGTCACGCAGACCGCCCGGACAGGCTCC ATCGGCGTCATGATGGCTCACAGTAATTACGGTGCTGCGCTGGAGAAACAGGGTGTGGAA ATCACGCTGATTTACAGCGGCAGCCATAAGGTGGATGGCAACCCCTACAGCCATCTTCCG GATGACGTCCGGGAGACACTGCAGTCCCGGATGGACGCAACCCGCCAGATGTTTGCGCAG AAGGTGTCGGCATATACCGGCCTGTCCGTGCAGGTTGTGCTGGATACCGAGGCTGCAGTG TACAGCGGTCAGGAGGCCATTGATGCCGGACTGGCTGATGAACTTGTTAACAGCACCGAT GCGATCACCGTCATGCGTGATGCACTGGATGCACGTAAATCCCGTCTCTCAGGAGGGCGA ATGACCAAAGAGACTCAATCAACAACTGTTTCAGCCACTGCTTCGCAGGCTGACGTTACT GACGTGGTGCCAGCGACGGAGGGCGAGAACGCCAGCGCGGCGCAGCCGGACGTGAACGCG CAGATCACCGCAGCGGTTGCGGCAGAAAACAGCCGCATTATGGGGATCCTCAACTGTGAG GAGGCTCACGGACGCGAAGAACAGGCACGCGTGCTGGCAGAAACCCCCGGTATGACCGTG AAAACGGCCCGCCGCATTCTGGCCGCAGCACCACAGAGTGCACAGGCGCGCAGTGACACT GCGCTGGATCGTCTGATGCAGGGGGCACCGGCACCGCTGGCTGCAGGTAACCCGGCATCT GATGCCGTTAACGATTTGCTGAACACACCAGTGTAAGGGATGTTTATGACGAGCAAAGAA ACCTTTACCCATTACCAGCCGCAGGGCAACAGTGACCCGGCTCATACCGCAACCGCGCCC GGCGGATTGAGTGCGAAAGCGCCTGCAATGACCCCGCTGATGCTGGACACCTCCAGCCGT AAGCTGGTTGCGTGGGATGGCACCACCGACGGTGCTGCCGTTGGCATTCTTGCGGTTGCT ATCGGCGTCATGATGGCTCACGGGAAATACGGTGCTGCGCTGGAGAAAAAGGGTGTGGAA ATCACGTTGATTTACAGCGGCAGCCATAAGGTGGATGGCAACCCCTACAGCCATCTTCCG GATGACGTCCGGGAGACACTGCAGTCCCGGATGGACGCAACCCGCCAGATGTTTGCGCAG >2 CCGCATATTGCCAGCATGGCCTTTAATGAGCCGCTGATGCTTGAACCCGCCTATGCGCGG GTTTTCTTTTGTGCGCTTGCAGGCCAGCTTGGGATCAGCAGCCTGACGGATGCGGTGTCC GGCGACAGCCTGACTGCCCAGGAGGCACTCGCGACGCTGGCATTATCCGGTGATGATGAC GGACCACGACAGGCCCGCAGTTATCAGGTCATGAACGGCATCGCCGTGCTGCCGGTGTCC GGCACCCTAGTGAGCCGGACGCGGGCGCTGCAGCCGTACTCGGGGATGACCGGTTACAAC GGGATTATCGCCGGTCTGCAACAGGCTGCCAGCGATCCGATGGTGGACGGCATTCTGCTC GATATGGACACTCCCGGCGGGATGGTGGCGGGGGCATTTGACTGCGCTGACATCATCGCC CGTGTGCGTGACATAAAACCGGTATGGGCGCTTGCCAACGACATGAACTGCAGTGCAGGT CAGTTGCTTGCCAATAGAACCTCCCGGCGTCTGGTCACGCAGACCGCCCGGACAGGCTCC 
ATCGGCGTCATGATGGCTCACAAAGGTTACGGTGCTGCGCTGGAGAAACAGGGTGTGGAA ATCACGCTGATTTACAGCGGCAGCCAGGGGTGGATGGCAACCCCTACAGCCATCTTCCGA GATGACGTCCGGGAGACACTGATCTCCGGAATGGACGCAACCCGCCAGATGTTTGCGCAG AAGGTGTCGGCATATACCGGCCTGTCCGTGCAGGTTGTGCTGGATACCGAGGCTGCAGTG TACAGCGGTCAGGAGGCCATTGATGCCGGACTGGCTGATGAACTTGTTAACAGCACGAAT GCGATCACCGTCATGCGTGATGCACTGGATGCACGTAAATCCCGTCTCTCAGGAGGCAGA >3 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAAAAAAAAAAAAAAAAAAAAAAAAAA CCGCATATTGCCAGCATGGCCTTTAATGAGCCGCTGATGCTTGAACCCGCCTATGCGCGG GTTTTCTTTTGTGCGCTTGCAGGCCAGCTTGGGATCAGCAGCCTGACGGATGCGGTGTCC GGCGACAGCCTGACTGCCCAGGAGGCACTCGCGACGCTGGCATTATCCGGTGATGATGAC GGACCACGACAGGCCCGCAGTTATCAGGTCATGAACGGCATCGCCGTGCTGCCGGTGTCC GGCACCCTAGTGAGCCGGACGCGGGCGCTGCAGCCGTACTCGGGGATGACCGGTTACAAC GGGATTATCGCCGGTCTGCAACAGGCTGCCAGCGATCCGATGGTGGACGGCATTCTGCTC GATATGGACACTCCCGGCGGGATGGTGGCGGGGGCATTTGACTGCGCTGACATCATCGCC CGTGTGCGTGACATAAAACCGGTATGGGCGCTTGCCAACGACATGAACTGCAGTGCAGGT CAGTTGCTTGCCAATAGAACCTCCCGGCGTCTGGTCACGCAGACCGCCCGGACAGGCTCC ATCGGCGTCATGATGGCTCACAAAGGTTACGGTGCTGCGCTGGAGAAACAGGGTGTGGAA ATCACGCTGATTTACAGCGGCAGCCAGGGGTGGATGGCAACCCCTACAGCCATCTTCCGA GATGACGTCCGGGAGACACTGATCTCCGGAATGGACGCAACCCGCCAGATGTTTGCGCAG AAGGTGTCGGCATATACCGGCCTGTCCGTGCAGGTTGTGCTGGATACCGAGGCTGCAGTG TACAGCGGTCAGGAGGCCATTGATGCCGGACTGGCTGATGAACTTGTTAACAGCACGAAT GCGATCACCGTCATGCGTGATGCACTGGATGCACGTAAATCCCGTCTCTCAGGAGGCAGA pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/lambda.rgn.h5000066400000000000000000000000461241505617700242640ustar00rootroot00000000000000# This is an aritifcial region table. 
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/lambda.sam000066400000000000000000001710451241505617700237530ustar00rootroot00000000000000@HD VN:1.3.1 @SQ SN:lambda_NEB3011 LN:48502 M5:a1319ff90e994c8190a4fe6569d0822a @RG ID:f7d81fcc44 PU:/net/usmp-data3-10g/ifs/data/unixhome/yli/yliWorkspace/software/bioinformatics/tools/pbalign/tests/data/lambda.fasta SM:/net/usmp-data3-10g/ifs/data/unixhome/yli/yliWorkspace/software/bioinformatics/tools/pbalign/tests/data/lambda.fasta @PG ID:BLASR VN:1.3.1.121192 CL:blasr /net/usmp-data3-10g/ifs/data/unixhome/yli/yliWorkspace/software/bioinformatics/tools/pbalign/tests/data/lambda.fasta /net/usmp-data3-10g/ifs/data/unixhome/yli/yliWorkspace/software/bioinformatics/tools/pbalign/tests/data/lambda_ref.fasta -sam -out /scratch/vUoNZr.sam -bestn 10 -nproc 15 @PG ID:SAMFILTER VN:v0.1.0 CL:samFilter /home/UNIXHOME/yli/yliWorkspace/software/assembly/cpp/pbihdfutils/bin/samFilter /scratch/vUoNZr.sam /net/usmp-data3-10g/ifs/data/unixhome/yli/yliWorkspace/software/bioinformatics/tools/pbalign/tests/data/lambda_ref.fasta /scratch/bP2PvR.sam -minPctSimilarity 70 -minAnchorSize 12 -minAccuracy 70 -minLength 50 -seed 1 -scoreSign -1 -hitPolicy randombest m120619_015854_42161_c100392070070000001523040811021231_s1_p0/105/0_2152/0_2152 0 lambdaf7d81fcc44 AS:i:-9105 XS:i:1 XE:i:2152 XL:i:2151 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/1304_3321/0_2017 16 lambdaf7d81fcc44 AS:i:-8408 XS:i:4 XE:i:2007 XL:i:2003 XT:i:1 NM:i:1 FI:i:4 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/3365_7852/0_4487 0 lambdaf7d81fcc44 AS:i:-18994 XS:i:1 XE:i:4488 XL:i:4487 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/7896_11181/0_3285 16 lambda_NEB3011 28177 254 
14M1I7M1I32M1D6M1I5M1I22M1I34M1D15M1D6M1I13M1I4M1I4M1I7M1I1M1I1M1D21M1I3M1I6M1I5M1I20M1D3M1I4M1D24M1I5M1I7M1I6M2I4M1D6M1I7M1I7M1D5M1I3M2I23M1D12M1I6M1I2M1I4M1D35M1I1M1I2M1D17M1D4M1D5M1I12M1D4M1D12M1I10M1I1M1I15M1I2M1I34M1I4M1I5M1I12M1I17M1I5M1D3M1D12M1I4M1I3M1I9M1I2M1I4M1D4M1I4M1I8M1D11M1D19M1I6M1I17M1D29M1D14M1I20M1I4M1D6M1D17M1I25M1I29M1I1M1I5M1I2M1D2M1I4M1D9M1I23M1D4M2I4M1D6M1I9M1I2M1I3M1I4M1I3M1D4M1I4M1D2M1D5M1I5M1I12M1D2M1D7M1I20M1D21M1D1M1I3M1D26M1I13M1I7M1I6M1I3M2I4M1D22M1I15M1I3M1D2M1I10M1I9M1I20M1D11M1I29M1I3M1I12M2I14M1I14M1I14M1I17M1I11M2I18M1D12M1D42M1I13M2D4M1D9M1I22M1I10M1I9M1I2M1I13M1D4M1D2M1I3M1I8M1D4M1I35M1I10M1D11M1I75M1I10M1I13M1I11M1I1M1I7M1D4M1D4M1I9M1I9M1D12M1I40M1D2M1I5M1I4M1I2M1I20M1D13M1I8M1I3M1I9M1I9M1I35M1D1M1I6M1D27M1I14M1D3M1D14M2D11M1D6M1I14M1I1M1I4M1I3M1I21M1I4M1I24M1I7M2I6M1D1M1I24M1I13M1I4M1I9M1D2M1I3M1I6M1I14M1D7M1D42M1I4M1I38M1D2M1I3M1I16M1D2M1I4M1D2M1I23M1D3M1I6M1I19M1I19M1I4M1I2M1I22M1I25M1D2M1I9M1I17M1I9M1I5M1D4M2D61M1I2M1I6M1I32M1I1M1I24M1I1M1I10M1I10M1I2M2I11M1I1M1I5M1I8M1D11M1I1M1I5M1D4M1I51M1I4M1I19M1D5M1I9M1I15M1I5M1D2M1I15M1D9M1I2M1I52M1I26M2I27M1I2M1I3M1D38M5I1M2I8M1I24M3I9M1I4M1I4M1D1M1I16M1I2M2I3M1I1M1I17M1I33M1I1M1I24M2I2M1D18M1I4M1I1M1I1M1I2M1I1M1I40M1D16M1D5M1I16M1D8M1I21M * 0 3145 
TTTTGCTTTGCTCGCACATAAAAGATATCCATCTACGATATCAGACCACTTCATTCGCATATAATCATCCAACTCGTTGCCCGGTAACAAACAGCCAGTTCCATTGCAAGTCTGAGCCAACATGGGATGATTCTGCTGCTGATAAAATTTTCAGCTATTCCGTCAAGCCGGTAAGTCTCTTGTCTCGTTACCTCTGATTTTGCCTGCCGCGAGTTGGCAGGCGACATGGTTTGTTGTTATAGGCGCTTCGCTATTGCCTCTCGGAATGCATCGGCTCAGGTGTTGATACTGATTCTAACTGGCTGAGCGCCGCCCTTGCCCTGTCTACTGTTTATCCATTGAGCACTGCCGCAATTCTGTTGTGGTGAATGTCTTTCATAGTGAGCATCAGGCAGACCCCTCCTTATTGCTTTAATTTCTTGCCATGTAATTTATGAGTGCTTCGCTTGGATTCCTCTGCTGCCAGATTTATTCGTAGGCGTTCAAGCCGAATGAATGTAACGTAACAGGGAATTATCACTGTTGATTCTCGCTGTCAGAGGCTTAGCGTGTTGTGGTCCTGAAAATAAGCTCAATGTTGGCCTGTACTAGCTCAGGATTGCTATTCGTCCTGGTCTTCTGCCTAATTCCCAAACCTTTTACCCCGTCCTTGGTCCCTGTAGCATAATATCCATTGTTTCTTAATATAAAAGGTTAGGGGGTAAATCCGGCGCTCATGACTTCGCCTTCTTCCCATTCTGATCCTCTTCAACAAGGCCACCTGTAACTGGTCTGATTAAGTCACCTTTACCGCTGATTCGCTGGAACAGATACTCTCTTCCATCCTCTAACGGGAGTTGGGGAGATCCGGCATTCCGCTGAAGGGCGCGGACGACTGTTTCAACGGCTCCTTGGACGTCGCTGGCGTCGTTTACCACCCTGAAAGTGTCAAGTTACCATCCGCAACAGTTCCGGCAATACCGCAACGAAAATAACCGCCATCAGCGCTTGGTGCTTCTTTCAGTTCTTCAATTCAATATTGGTTACGTCTGCATGGTCTACTGCGCCAATATCATCCAGTGTTCGTTAGCAGTCGTTGATCGTTCTCCCGCTTCGGATACCACTCGTTGAATGGCTCTCCATTCCATGTCTCCTGTGACTCGGGAAGGCCATTTATCATCATCAATAAAAACAAAGCCCGCCGTAGCGAGTCAGATAAAATACAATCCCCGCGAGTGCGAGGATTGTTATGTTAATTATTGGGTTTAATTGCATCTATATGTTTTCGTACAGAGAGGGCACAGTATCGTTTCCACACGTACTCGTGATAATAAATTTTGCACCGTGCATCAGTCATTTCTCGCACTTGCAGAATGGGATTTGTCTTCATTAGACTTATAAACCTTCATGGAATATTTGTTATGCCGACTCTATCTATCCTTCATCTTACATAAACACCTTCGTGATGTCATGCATGGAGAACAAGACACCCGGCATCTGCACAACATGATACGACCCCAATCTTTTGCTCCAGACTCAAACTCATTGATACTCATTTATAAACTCCATTGCAATGTAGTCGTTTCAGCATAAACGGTATCAGCAATGTTTATGTAAAGAAACAGTAAGATAATACTCAACCCGATGTTTGAGTACGGTCATCATTCTGACACTACCAGACTCTGGCATCCGCTGTGAAGACCGGACGCGAATTCACATTGTTCACAAGCAGTTATATTTACAAAAACGATCCTCACTCGCCTTTGATGCAAATGCCAGCGTCAGACTTCATAGCCAGATAACTCCGCCCTGCATCCTGAACCCATTGACTCCAACCCCGTAAATAGCGATGTCGTTAATGATGTCAGATAGTTACCTAACGGGTCTTGTTCGATTAACTGCCGCATTAACTTCTCCAGGCACCAGGGCAGTGCTTGATAACAGGAGGTCTTCCCAGGATGGGAAAACAAGAAACTGGTCCGTCTTCACGACTTCGGTTGCTTTCCAGTTTTAAGCAAATA
CCGCTTACTCCCATCCGAGATAAACACCCTTCGTAATACTCACGCTGCTCGTTTGAGTTTTCTGATTTTCGTGTTTCAAGCTCAACACGCAGTTTTCCCTACTGTTAGCAGCAAATATCCTCGTCTTCCTAGGTCGCCGGCGTTTGATGTATGCTGGTTCTTTCCCGTTCATCCAGCAGTTCCAGCACAATCGATGGTGTTTACCATATTCATGGAAAGGGTCTGCGTCAAATCCCCAGTCGTCAGCGATTCGCCTGCTCTGCCGCTGACCGCAGGCTCTGAGAGTTAATTTCGCTCACTTGAAACCTCTCATGTTTACAGATAAGTTCCATGATCCTCCTGGCAACTTGCCACAAAGTTCCGACAACCCTGAACGACCAGGGCGTCTTCGTTCATCTATCGGATCGCACCACTCACAACTAATGAGTGGCAGATATATGACTGGTGGGTTCAGCGGCATTTTTATTGCTGTGTTGCGCTGTAATTCTTCTATTTCTGATGCTGAATCAATGATGTCTGACCCATCTTTTCAGTAATCCCTGAACTGTTGGTTAATACGCTTTGCAGGGTGAATGCGAATAATAAAAAAAGTGAGCCTTTAGGCTCCCTGATGGATGCTTTGCTTTTCACTCGTTCAGTCGTTCCTAAAGACTCCATCTCTAACAGCCGCATTGCCAGGCTTAAATGAGTCCGTGTGAATCCCATCAGCGTTACCGTTTCGGCGGGCGCTTCTTCAGTACGCTACGCAAATTGTCATCGACTGTTTTTATCCGGAAAACTGCTTCCTGGCTTTTTTTGATTCAGAATTAGGCCATGACGGGCAATGCTCCGAAGGGCGTTTTCCTGCTGAGGTGTCATTGAACAAGCTCCCATGTCGGCAAGCATAAGCACACGCAGAATCTGAAGCCCGCTGCCAGAAAAAATGGCATCCGTGGTTGTCATACCTGGTTTCTCTCATCTGCTTCTGGGGGGCGCTTTCGCCATCCATCATTTCCAGCTTTTGTGAAAACAGGGATGCGGGCTAAACGTAGCAAATTCTTCGTCTGTTATCCATACCTAGGTATTGGCACAAACCTTGATTCCAATTTGAGCAAGGCTATGTGCCATCTCCGGATACTCGTTCTTAACTCAACAGAAACGAGCTTTGTGCATACAGGCCGCTCGGTCTCTCCTATATTTATCTCCTCAGCCAGCCGCTGTGCTTTCAGTGGATTCGGCTAACAGAAAGGCGGGAATATACCCAGCCTCGCTTGTAACGGACGTAGACGAAAGTGATTGCGCC * RG:Z:f7d81fcc44 AS:i:-13267 XS:i:1 XE:i:3283 XL:i:3282 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/13/0_3797/0_3797 16 lambdaf7d81fcc44 AS:i:-15373 XS:i:2 XE:i:3798 XL:i:3796 XT:i:1 NM:i:1 FI:i:2 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/14/0_1593/0_1593 16 lambdaf7d81fcc44 AS:i:-6697 XS:i:3 XE:i:1592 XL:i:1589 XT:i:1 NM:i:1 FI:i:3 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/147/0_6711/0_6711 16 lambdaf7d81fcc44 AS:i:-28663 XS:i:1 XE:i:6712 XL:i:6711 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/171/2019_3396/0_1377 0 lambda_NEB3011 38857 254 
3M1D22M1D2M1I43M1D4M1I28M1I2M2I3M1I7M2I7M1D4M1I17M1I8M2I5M1I10M1D9M2I5M1I4M1I23M1I5M1D3M1D5M1I5M1I22M1I4M1D8M1D10M1D3M1D4M1D21M1I6M1D7M1D5M1I4M1I8M2I5M1I2M1D6M1D12M1I4M1I1M1I7M1D10M1D2M1D19M1I2M1I5M2I11M1I15M1D8M1I4M2I4M1I25M1I7M1I11M1D8M3I6M1I12M1I2M1I6M1I5M1I3M1I1M1D5M1I1M1D6M1I2M3I17M1D4M1I8M1I1M1I2M1D2M1D7M2I20M2I3M2I12M1I2M1I2M1I1M1I2M1D3M1D1M1D4M1I7M1D9M2I5M1D23M1D4M1D13M2D6M2I8M1D9M1I19M1I4M1D13M1D8M1I15M4I4M1I7M1D26M2I11M1I2M1I9M1I2M3I8M1D13M1I4M1D5M1I4M1D4M1I25M1I13M1D6M1I15M1D6M2D1M1D10M1D4M1I21M1D13M1I1M1D6M1D8M1D16M1I17M1I2M1I1M1I4M1I5M1I24M1I23M1I3M3I5M3I10M1D4M1D1M1I7M2I5M3I10M1I7M1I7M1D7M1I5M1D16M1D8M1I20M1I41M * 0 1310 CGTAAACCTATGGGTGGAATAAAACATGGGACAGAATCACCGATTCTCAACTTAGCGAGATTACAAAGTTACTGTCCAAACGGTGCAATGAAGCCAAGTTAGAACCTCCCGTCCAGAATGAAAATATTACAAGCCAGCAAGGCGGCATGTTGGGGACCAAGGGTAAAGAACATCTCAGATGGTGCATCCCCCTCAAAAACCGAGGGAAAATCCCCTAAAACGAGGGGATAAAATCCCGTCAAATTTGGGGGATTGCTATCCCTCAATAACAGGGGACACAAAGACACTATACAAAGAAAAAGAAAAGATTATTCGCCAAGAGAATCTGGCGATCCTCCTGACCCAGCCAGAAAAAACGGACTTTCTTGGTGAGATCGGAATGCTTGGCAATTCGAGCGGCAGCAGAGGGAACAGCAGAAGACCTGGACCCGCCGAGCAGAGTGGATCGTTTGACATGGTGAAACTATCGCAACCATCTCAGCCCAGAAAACCGAATTTTGCTGGGTGAGGCTAACGGATATCCGCCTATGCGTGACGGACGTGACCGGACGTAACCACCCTGCGACAGTGTGTGGTGACGTTCCAGTGGGCATTGCCCCCAGGACAACTTCTGGTCGGTAAACGTGCTGGTAGCCGCCAAACTTTCCGCGATAAGTGGACCCAACCCTCGGGAAATCAACCGTAAAGCACAGGCACAGAGGCGGTGACAGCAGCAAACCAAAAAACCGACCTGCCCCACACAGACTGGATTACGGGTGGATCTATGAAACATCCCGCCGCACGATGGTTAACCTTTGACCCTGAGCAGATGCCGTCGATCGCCAACAACTGCCGGAACCAGTACGACGAAAAGCCCCCCGCAAGGTACACAGGTAGCGCAGATCATCAACGGTGTGGGTTCAGCCAGTCTACCTGGCAACTCTTCCCCCCGGCGACCTGGCTAACCGTGGACCGAACGAAAGTAACGAAAATCCGTCGCCAGTGGGTTCTGGCTTTTTCGGGAAAACGGATCACCCACGATGGAACAGGTAACGCGATGCGCGTAGCCGTCCGGCAGAATCGACCATTTCTGCATCACCCCGGCATGTTGTTGATGGTGCCGGAAGGAGCATCCGTTAACCGCCGGACTGCCAAACCGTTCCAGCGGAGCTGGGTTGATATGGTTTACGATTATTGCCCGGAAGCGAGGCCTGTATCCGGAATGCCTCGGAGTCCTCTTATCCGTGAAACAAAACGCGCCCACTATTCCTGGCTGGTTAACCAACCTTGTATCAAACATGCGGGGCCATGCGCTTACTGATGCGAATTACGCCCGTAAGGCCGCAGATGAGCTTTGTCCATATGACGGCGAGAATTA
CCCGTGGTGAGCCGATC * RG:Z:f7d81fcc44 AS:i:-5177 XS:i:3 XE:i:1377 XL:i:1374 XT:i:1 NM:i:1 FI:i:3 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/180/0_2599/0_2599 0 lambdaf7d81fcc44 AS:i:-9644 XS:i:1 XE:i:2589 XL:i:2588 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/180/2645_6512/0_1950 16 lambda_NEB3011 40647 254 29M2I5M1D20M1I7M1I10M1I4M1I1M1I25M1I18M1I18M1I2M1D3M1I25M2I2M2I23M1D2M1I2M1I15M2I6M1I4M1I5M1D17M1I51M1I8M1I17M1D4M1I6M1D3M1I8M1I9M1I1M1I2M1I3M1I28M1I5M1I2M1D15M1I10M1I2M1I5M1I13M1D8M1I11M1D3M1D8M1I12M1I22M1D24M1D11M1D1M1D14M1I19M1I17M1I58M1D18M1I6M1I7M2I5M1I9M1D11M1D15M1I3M1I34M1I15M1D23M1I9M1D10M1I33M1D9M1I12M1I2M1I17M1D10M1D7M1I4M1D10M1D12M1I7M1D1M1D14M1I6M1I38M1I5M1I7M1I21M1D3M1D6M1I9M1D27M1I24M1I15M1D18M1I3M1I4M1I8M1I3M1I16M1I3M1D3M1D9M1I4M2I24M1D9M1D25M1D5M1D12M1I6M1D33M2D12M1D5M1D6M1I10M1D14M1D18M1I10M2D3M1D18M2I4M1D7M1I5M2I10M1D6M1I7M1I3M1I12M1D8M1I17M2I14M1I4M1I10M1D3M1D17M1I36M1D11M1I6M1D4M1D2M1D9M1I5M1D7M1I15M1I13M1I28M1D30M1I8M1I22M1I7M1D6M1I35M1I2M1I2M1D13M1D15M1I1M1I13M1D7M1I12M * 0 1885 
AAAAAACTAACCTTTGGAATTCGATATCCCCAGCACTCAGCAAAACGCTATTCACGGCAGTACAAGCAAATCCTTACCAGCATCCCAACCAAACCAATCGTAGTAACCCATTCAGGAACGCAACCGCGAGCTTAGACCAAAACAGGGAACTAATGGGCCTGCTTAGGTGACGTCTCTCTCGTTTCAGGTTGAATGGCATGGTCGCTGCTTGGGAGGCAGAAAGCTGGAGCAGTGTGCTGTTCTACCGAGCATTAAAGCAGCAGGGATGTTGTTCCTAAGCTTGCCGGGAATGGCTTTGTGGTAATAGGCCAGTCAAACCAGCAGGCATGCGTGTAGGGGAATTGCGGCAGCTATAGAAGCTTATACCAGGCATTCGTGGTAACAGGAGCGTGGCGTTAAGTGGTCAGACGAAGCTGAGATATGCTCTGTAGTGGAAAGGCGAGATGGGGGAGGACAGGCGCTGCATGATAAAGTCGTTAGGTTTCTCCGGTGCAGACGTCAGCTATATTTGCTCTGCGCTAATGGAGCAAAAGCGACGGCAGGTAAAGACGTGCATTACGTTTCATGGATACAGTGTGAACATCCAATGCACATATCGGTTTGTCAGGGGAAGTTGTGAAGTTCTGGTGATATACCGGTCACCGTATTGCAGGTTGATATCAACCCGGAGCTTGGACAGCCAAATGTTATACGGTATGGGAACCCAAAGGACTATTCAGGGACGCGTAATGCCTGTCTGAAGCCATTATCGATATGGTAAAGGAAAATATGGCACTCCATACGTCGGCGGCGCGTTCTGCAGCTGACAGATAAAAACCGTTCCCTTCACCAAATACTGTGCATGACCATTCGGGCGAGGGGAATTACACCACGTGGATTGGCATCAGAGCTGATAACCGAAGCCGGCTAAAGCCAATAGGCCTGGAATCAGATATCTGCTGAACTGTAGACTTTCGAGAGGAAGAAATCTCGCATGGTGGATAGCAACACATTCGATTTGCAAAATACCGGGAACATCTCGGTAACTGCATATTCTGCATTAAAAAATCAGACGCATAAAAATCAGGACTTGCCTGCAAAGATGAGAGGATGGCAAGCGTGTTTTAATGAGGTCATCACGGGATCCCATGTGACGTGACGGACATCGGGAAACGCCCCAAGGATATTATGTACGAGGAAGAATGTCGCTGGGACGCGTATTCGCGAAAAATGTTATTCAGAAAATGATTAATCAGCCTGTATCAGGGACATTTGGTACGAGCTAAAAGATTCGATACGGCTCTTGTCTGAGTCATGCGAAATATTTGGAGGCAGCTGATTGCGACTTCAGGGAGGAAGCGGCATGATGCGATGGTATCGGTGCGGTGAGCAAAGAAGATACCGCTCCGACCCAAATCAACCTACTGGAATCGATGGGTCTCCGGTGTGAAAGAAACACCAACAGGTGTACCACTACCGCAGGAAAAACGGAGACGTGTGCGCGAGGGACAGCGACGAGTATCAGCCGACATTAATTCTGCGAAAACTGAAATACCCGTCCAACGAAACGCACCGTTGAAATAAACCCAAGTCCAAATCCCAAAAGATCTACGTAAAAACCTTCAACCTACACGGCTCACCTGTGGGATATCCGATGGCTAAGAGTCGTGCGAGGGAAAACAGGTGTTACCAAAATCCGAAGTACTAACAGAGAAAGCGTCGAGCGGAGCTTTAACGTGCTGCTAACTGCGGTCAGAAGCGGCATGTGCGGAAGTTCACGTGTGTGAGCACTGCTGCGCTAGAACTGACTGATCGATCCGAATAGCTCGATTGCACGAGAAGAAGGATGATGGCTAAACCAGCGCGAAGACGCTGTAAAAAACGGATAATGCCGGGAATGTTTCACCCTGCATTCCGGCTAATCAGTGGTGTGCTCTCTCAGAGTGTGGAA * RG:Z:f7d81fcc44 AS:i:-8032 XS:i:1 XE:i:1931 XL:i:1930 XT:i:1 
NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/28/0_1464/0_1464 0 lambdaf7d81fcc44 AS:i:-6274 XS:i:1 XE:i:1465 XL:i:1464 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/45/0_2526/0_2526 16 lambdaf7d81fcc44 AS:i:-10503 XS:i:1 XE:i:2527 XL:i:2526 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/47/0_1953/0_1953 16 lambda_NEB3011 11636 254 9M1I2M1D13M1I1M3I5M3I28M1D6M1I28M1I18M1I1M1I8M1I4M1D9M1I5M1I2M1D4M1I53M1I3M1I11M1D3M1I8M1I14M1D37M1I35M1I14M1D32M1I2M1I6M1D12M1I2M1D37M1I2M1I13M1I1M1I46M1D8M1D10M1I38M1D2M1I28M1D15M1D3M1I5M1I18M1I2M1I4M1I1M1I23M1D4M1D5M1D1M1D3M1D4M1I6M1I30M1I16M1I4M1D26M1I5M1I7M1D4M1I13M1I16M1I3M1D48M1I7M1D11M1D1M1D19M1I28M1D6M1D16M2I10M1D2M1I6M1I11M1I10M2I6M1I10M1I5M1I1M1I19M1I5M1I12M1I16M1D1M1D7M1I3M1I6M1I2M1D4M1I4M1D3M1I1M2I2M1I4M1I2M1I9M1I2M1D1M1I1M1I12M1I5M1D14M3I1M2I7M1I7M1I3M1I12M1D14M1I8M1I12M2I4M1D9M1D10M1D1M1I10M1I1M1I15M2I1M1I4M1D26M1I3M1I43M1I23M1I8M1I21M1D20M1I4M1I11M1I19M1D3M1D3M1I1M1I7M1I5M1D4M1D16M1I18M1D9M1I14M1I41M1I5M1I12M1I4M1I22M1D5M1D15M1I19M1I6M1I1M1I3M1I4M1I31M1D23M1I3M1D1M1I4M1I8M1I15M1I7M1D20M1I1M1I6M1I5M1D1M1D9M1D7M1I18M1I37M * 0 1873 
GCGCTTGAAAGCGCCCGAAAGAAGGTCCCCTGAGCCCCAGCAGACTCAACAGGACAAAAATGCGCACAGCAGGAGCGATACCGAAGCGTCACGGCTGAAATTATACCGAAGAGGCGCAGATAAGGCTTACGGAACGCTGCAGACGGCCGCTTGGGAAAATATACCGCCCGTCAGGAAGAACTGAACAAGGCACTGAAAGACGGGAAAATCCTTGCAAGGCGGATTACACACAGCTGATGGGCGGCGGCGAAAAAGATTATGAAGCGACGCTGAAAAAGCCGAAACAGTCCAGACGTGAAGGTGTCTGCGGGCGATCGTCAGGAAGACAAGTGCTCATGCTGCCTGCTGACGCTTCAGGCAGAACTCCGGACGCTGAGAAGAAGCAGCCGGAGCAAATTGAAAAATCAGCCAGCAGCGCCGGGATTTGTGGAAGGCGGGAGGAGTCAGTTCGCGGGTTACTGGAGGAGGCGGCGCAACGTCGCCAGCTGTCTGCACAGTAGAAACCCTGCTGCGCATAAAGAATGAGACGCTGGAGTACAAACGCCAGCTGGCTGCACTTGCGGACAAGGTTACGTATCAGGAGCGCCTGAAGCGCTGGCGCAGCAGCGGGATAAAATTCGCACAGCAGCAACGGAGCCAAAATCCGGGCCGCCATTGATGCGAAAAGCGGGGCTGACGCCGCAGCTCATAACCGGGAAGCCACGGAACAGCGCCTGAAGGAACCAGTATGGCGATAATCCCGCTGCGCTGAATAACGTCATGTCAGAGCAGTAAAAAAGACCTGGCGGCATGAAGACCAGCTTTCGCGGGAACTGGATGGGCAGCCTGAAGGCCGGCTGGAGTGAGTGGGAAGAGAGCGCCACGGACAGTATTGTCGCAGTAAAAAGTGCACACGCAGACCTTTGATGGTAATTGCACAGAATATGGCGGCGATGCTGACGGCAGTAGCAGACCTGGCGCAGGGCTTCACCCGTCCTGTGCTGCTCCATGATGACGAGAAATGCTGGTCTTAAGGCAGGCAATGGGTGGGGGAATTGTCGGGAGTATCGGCAGTCGCCAGTTGGCGTGGCTGGTTGGTGGCGGCGCATCGGTCAGGCTGGTTACAGCCGATCAGGGCCGCGCGTGGGCGGAAATGTCTCATTTTGCAGACGAGTAGGATTTACGGGGAACCGCGGCAAATATGAGCGTTCCCAGCGGGGGATTGTTCCACCCGTGGTGAGTTTGCTTCACGAAGGAGGTCAACCAGCACGGATTGGGGTATTGGGATCTTTACCGCTGATGCGCGCGTATGCCACCGCGGCGGTTATGTCGGTACGCAACCGGCAGCATGGCAGACAGCCAGTCGCAGGGCGTGCCGGGACGTTTGAGCAGAATAACCATGTGGTGATTAACAACGAGCGGCACGAACGGGCAGATATGTCACGGCTGCTTCTGAAGGCGGTGTATGACATGCCCGCAAGGGTGCCCGTGATTGAAAATTCAGACACAGGATGCGTGATGGTGGCCTGTCTCGGAAGTGTGGACGGATGAAACCTCCGCTGGAAAGTGAAAACCCGGTATGGATGTTGCTCGGTCCCTTTCTGTAAGAAAGGTGGCGCTTTGGTGATGGCTATTCTCAGCGAGCGCCTGCCGGGCTTGAATGGCCAACCTGAAAAACGTAACAGCGTGACGCTTTCTGTCCCCGTGAGAGGCCACGGTACTTGTAGTCGTTTCTGGAAGAGCAACGGGGGGCCTGGGAAATGCCTGTCTGTGGACGCCGCCTTATGAGTGGCGCAGATAAAGGTGACCTGCGCAAATATGTGCGTCTGCGGGTCAAGTATGCTGCGTGTTGGAGTTCAGGCAGAGTTTGAACAGGTGGTAGGAACTGAATGCAGAATCCGGCAGAAACATTCGAATGAATGCACCCGTGCCGGAGCAGTCGGCCAGCGTGGTGCTCTGGGAAATCGAC * RG:Z:f7d81fcc44 AS:i:-8030 XS:i:1 XE:i:1949 
XL:i:1948 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/49/0_7870/0_7870 256 lambdaf7d81fcc44 AS:i:-6841 XS:i:6261 XE:i:7871 XL:i:1610 XT:i:1 NM:i:1 FI:i:6261 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/53/0_1306/0_1306 16 lambda_NEB3011 35500 254 11M1I1M1I1M1I8M1I4M1I4M1I22M1I2M1I4M1D2M1I24M1D39M2D5M1I10M1I5M1I15M1I13M1I18M1D16M1I3M1I14M1I1M1I4M1I4M1I19M1D7M1I7M1I2M1I46M1I7M1D14M1D3M1I1M1I18M1D8M2I40M1D17M1D13M1D10M1I4M1I20M1I26M1I10M1D13M2I5M1I10M1I4M1I2M1I8M1I26M2I9M1D25M1I26M1D22M1I2M1D2M1D3M1I14M1I2M1I21M1D8M1D21M1D4M1I11M2I25M1D8M1I8M1I15M1I13M1D9M1D6M2I5M1D1M1I3M1I5M1I21M1D21M2D30M1I8M1I25M1I15M1I36M1I12M1I1M1I16M1I7M1D1M1I22M3I27M1D10M1I1M1I6M1I57M1I1M1I15M1D6M1I3M1D21M1I1M1I10M1D2M1I8M * 0 1253 GCCTTCTGCTTCTTGGAATGCTGCTCCTTTCTTCCAGGGCTTAATTTTTAAGAGCGTTCAGCCGGATAGGTGGTCAGTGCGTCCTGCTGATGGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCGAGATTAATTTATCACGCGCAGTATGGTTATCTGTATGGTTTTTTATATGAAATTTATTTTTTGCAGGGGGCATTGTTTGGTAGGTGGAGAAGATCTGAATTGCTACTTGTTTTAGTGGAGTTGTATCTATTTATTTTCAATAAAATACAATTTGGATTATGTGTTTGGGGGGCGATCGTGAGGCAAAGAAAACCCGTCGCTGGAGGCCGGTTATTCTTGTTCTCGGTTCCAAATTATATAGTTGGAAACAAGGATGGGCATATATGAATGAACGATGCAGATGCAATGCCGATGGCGAAGTGGGTATCATGTAGCGCTTATGCTGGAAGAAGCAATAAACCCGACAGAAAAACAAAGCTCCAAGGCTCAACAAAACTAAGGGCATAGACAAGTTACTACCGAGTCATATACCCATTTACTCTTCTAATCTTGGGCCAGGTCCGGCGCGTTGCTGCTTCCGATTAGAAACGTCAAGGCGGAGCAATCAGATTGCAATCATGGTTCCTGCATATGGATGACAATGTCGCCCCAAGACCATCTTATGAGCTGAAAAAGAAACACCCAGAAGTAAGTGGCGGAAAAAGAAGAGTAGCAAATGCTTACGATAACGAAGGAATTTTACTATGTAAACACCAGGCAGATTTCTGTTCCGCATTTAATTACTCCTGATAATTAATCCTTACTTTGCCCCACCTGCCTATTTAAAACATTCCAGCTATATCACTTTTCTTCTTGCGTGCAATACATGCCACATCTCTCCGCCTATCTCAGCATTGGTGACCTTTTCAGAGGCGCTGAGAGATGGTTTTTCTGATAGATAATGTTCTGTTAAAATGATCTCCGGGCCTCATCTTTTGCCCGCAGGCTAATCGTCTGAAAATTGAGGGTGACGGGTTAAAAATAATATCCTTGGCAACCTTTTTGTATATCCCTTTTTACAATTTTGGCTTAATGATCTATATCAGATGAGTCAAAAAGCTCCCCTTCCCCAATATCTGTTGCCCCTAAGACCTTTAAATATCGCCCACACTACAGGGTAGCTTGGCTTCTACCTTCACCGTTGTTCGGCCGATGAAATGCATATGCATAACATCCGGTCT
TGGGTGGTTCCCTCATCAAGTGCTCTATCTGAACGCGCTCTCCGAGCTGCTTAATGCACTTCCTTTC * RG:Z:f7d81fcc44 AS:i:-5445 XS:i:1 XE:i:1300 XL:i:1299 XT:i:1 NM:i:1 FI:i:1 m120619_015854_42161_c100392070070000001523040811021231_s1_p0/70/0_3728/0_3728 0 lambdaf7d81fcc44 AS:i:-15047 XS:i:5 XE:i:3719 XL:i:3714 XT:i:1 NM:i:1 FI:i:5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/lambda_bax.fofn000066400000000000000000000002051241505617700247420ustar00rootroot00000000000000/home/UNIXHOME/yli/yliWorkspace/private/yli/data/testLoadPulses/m130302_011223_pd1_c000000092559900001500000112311511_s1_p0.2.bax.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/lambda_query.fasta000066400000000000000000001434571241505617700255240ustar00rootroot00000000000000>m120619_015854_42161_c100392070070000001523040811021231_s1_p0/13/0_3797 GCTGTTTTCTCCAGCGCAGCACCGTAAATTACTGCTGAGCCATCATGACG CCGATGGAGCCTGTCCGGGCGGTCCCTGGCGTGACCAGACGCCGGAGGCG GCACTGGCAAGCAACTACCTGCACTGCAGTTCATGTGTTGGCAACCGCCC ATACCGGTTTATGTCCACGCACACGGGCGATAGTCAGCGCCGTCCAAATG CCACCCGCCACCATCTCCGCCGCGTGTCCCATATCGAGCAGAATAGCCGT CCACACATCGATGCTGGCAGCCTGTTGCCAGAGGGCGATAATGCCGTGTA AACCGTCATCATCCCCGGAGTACGGCTGCAGCGCTCCGCGGTCCGCCTGA CAGCGTGCCGACACCGGCAGCACGGCGATGCCGTTCATGACGATAACTGC GGGCCTGTCAGAGTTGGTCCCTCATCATCACCGGATAAATGAGGCCAGCG TCGCGAGTGCCTCCTGGGCAGTCAGGCTGTAGCCGGACACCGCATCCGTC AGGCTGCGATCCAAGCGGCCTGCAAGCCGCCACAAAAGAAAACCCGCGCA TAGGCGGTTCAAGCATCAGCCGGCTCACTTAAAGCCATGCTGGTCAATAT GCGGGAGATTACGGCAGCTCTGCTGTCACTCTTCTCCTCCTCTGTTGATT GTCCGCAGCCCGGATTGATCAAATGCCGCGCCCGCCCAGTGCGGCCGGTT TAAGACCGGCTGCACGGCGTCTCCATCGTTTCACGGACCTGCCTGGCAAA AATTTCCTGATAGTCGTCCACCGCGTTTTTGCGCACCTTCTCGTGGTAAC TCAGCCGGCTTTCTATCACATCCACCGCTTCCCTGACTTTCTTCAGACCA TCCGATGCCATACGACGGGGGGGGGGGGCGGGGGGGAGCCTATCCAGTCG CAGTTCGCGCGCCAGCACTGCGGGCTTCTGAAAACGAAGCGCGCTTTGAA GGTAACGTCACACGCGCGACGATGGCCTCTTCCAGCCAGCACAGAAACAT CTGGCTCGCCTGACGGGATGCGAGCGAATTTTTCGCCTGTTGCCGCCCCA TAAAGTAACGTCCCAGCGACTGCGTTCGCACTGGCCCGTGCCCGTGGAGT AGCTCATCTGGGCGTAATCTCCGGGAAAAGCTCCTCATACAGACCCTCCA 
GCCGGCAAGCGATATACCCGCAGCAGTGACGCTCAAACACGGAGTAGCCC GTTATCCGTATCCTTGAGCCGTCTGCAGGTTCAGTGAGTCCACCCGGCAT CAGTGCAGTAGCTTTGCGCCTCCCAGCCGGACGGCGCTGCGGCTTAATAC GCGGCAATTCACCCATCCACCGCGGTCAGCCTTTCCCGCGCGCCTGATGT TCGCGCCAGAATACAAATACCATCGCTGACTGCTATCCAGCTCACCTCCA ATGGTGCGCGGGCATACAATCGCCTTTCCACAATGGCGCTCTGCAGCTGC GTTGTTCTTGCAGCGTGTCCCGAGCATCTTCATCTGCTCCATCACGCTGT AAAACACCATTTGCACCGGCGGAGTCTGCCCCGTCCTCCACGGAGTTCAA AAACGGAGAGAGCGCGCCGCCGTGTAATCACGGGTATTCCATGTCCCATT TCTGCGGCAATCAGCCAGGATAACCCGTCCTCGCTGACGTAATATCCCAG CCCCGGCACCGCTGTCATTAATCTGCACACCGGCCACAGGCGTTCCGGCT GTCCGCGTATTGTTCGGGTTGCTATGCCGCTTCCGGCTGACCATCCGAAC CTGTGTCCGGAAAAGCCGCGACGAATGTATCCCAGGTGGCCTGAACGAAA TTCACCCGTAAAGCAGTCTGCAACAACCTTCACGAACATCATGGTAAACG TGCAGTTTCCGCTCAACGTCAATGAAAGCAGCATCATCCTCGGCAAACTC TTTCATGCCTTCAACCTCCGCGGGGAAAAGCACGGCTCTTCCTCCCCGAT CCCCATATAGCGCCAGCTTGGGCGATGACTGAGCGCCGGAAAAAAGACCC GACGATATGATCTGATGCAGCGGATGGCGTTGCGGCATAGCCGTTATGCG TACCAGATCGTCTGACGCGGGCATTGCCACGGTAAAGTGGGTCACATGCT GCATCCACACTTTCCACTCGGTGGGTCCACAACCGCAACTGCCTCCAAAG TCGCATCGCCACCGCCGTGATAATCCGGCATATTCGCGAGCGATTGTCAT GCCGTCCGGCCCAGAAGGGTGGAATGGTCCTTCCCATAAAAAATGCCTGC AGTCCCCTGCTCGCTGTGTCATGCCGGTCTGCACTTCCAGCTCTGCAATA TAATTTTTTCAGGTCAGACACGGAAGGGCCGTAACTCCACCCTTCGTACT CTTTCTGTACTGTTGCCACCCGTTTACCTTCACAGGTCATGCAGTCGCAA CGGGGCAGCGCGCAATCTCTTCCCTGTCCCCGCGTCATTCATGCCCTCTC CGATAAGGCACGGGCGGTAATCTGCCAGTGTTTTGTCTTTAGTTGCGTGC TGCCCATCCTCCTTCCTGACAGGCCTCCAGCAGCCACTGCGACCAGCTGC CAGCGGGAATACCTGATGCGCAGCCCGCCGGACCAGCCGCATAAACGAAG CAGTCGAGTGCCTCATATTCGTCGCTTTTTGCTGTCCCAACAGTATTATT TTTCCTGCCATCCCACCCATTTTTCGACCTGCTCTTCAGCAGTCAGTCGC TGCGCTTCGGTCAGATCACAAATAATCCGGTTATTCGGAAGTGAACGGCA CCGGGAAGCGGTCCATACCCCTTCCCGCCGTCAGTGTGAAGCGGTTATAA TATCTGCCTCTTTTCGCGTATCCGTACCGATTTCGGTAAGGTAAACTCCG TTTTTGTTTCGCTACGTGCATGCTGGCCACACGGCCTTTCCGTAGACATC CCCTTTGAATGGGGACACCCCGGAAAGCCCCATGGTTTTTTCGAGCGTTC ATACCACAATGTCGGGTCAATCCGCCAGTATCGCCAGCAGAATACGGTTA TCACATTTCTGCACCATTCCGGCGGTATAGGTTTATTGATGGCCTCATCC ACACGCAGCCAGCCGTCCTGTCATACGTCGTGGCCGGGCCCATAATAATT 
CCTGCCGGTCAATCAGCCAGCTTCCTCCCCGGCCCATCCCATACGCGCAT TTCCGTAGCGGTCCAGCTGGGAGTCGATACCGCGGCAGGTAAGCCACAAG GTCAGCGAACGGGCGCGAATAAATGCTCTTTCCGCTCTGCCATCACTTCA CTCCGGCCGTTCGCCTCAATTTTCGCCTCCCACCGGTCTCACCGAGCGTC GGGTGTTTACGAAAGGTTTTACGTTTTCCCGTATCGCCTTTCGTTCATCC AGGTCTTTGACAATCTGCACCTCAGTGGTGAAGCGGGCTGTACATCTGTC CCAGATGTGAAAGCACACTGTCCAGGTGGCTCAATCTCTTCACCGATGGA CGAAACCAGAGAAGCCCATCACGGGTCCCAGATCCTCGGTCCTTTGTCCG CAATATATACGGGCATCAGTAAATTCCAGATCTGCTGGCGATGACGCAGC ATTATGCTCGCAGAGATAAAACACTGCTTGAGGGGTCATCCGGCGTCCAT TTGAGGCCCAACGGGTCCTTCTTTGTCGCCAAATTTAAGATACCTGCTCC TCCCCAGCAATGCGGGCAGGCAACATGAAAACGCATAAAATGCGGGGGGG ATTCACTTGCTTGCACCGCTCAATCTGACAGGTGACCTCTCAACTTTTGC GTGGAGCCACGGATGGACTTGGCCAGACCGAGCCTTCAATACCTTGT >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/14/0_1593 ACCGACAGCAAGGTTTTGGGTAGGCTGATTTGTCAAAACGCCTGCCCCTG CATCAGACGTAGCCAGACGGAACGATATCTGATGGCCACGGGATCGGCGC AACCTGCCGGAAAGGGACGAATCTCACCTCGGCCCCAAGTAATTCAAGAA CATCTGCAACGGAATTTTTGCCAGAATATCCCTGCCAACCTGAGTTCAGT TCAAGTCAGGCTTGGCGGCATCATTTTCCGCAAATACGGTAATTTATTTT TCGCCGTGGAAAGCCCTGCCAGCGGCCGTCCAGTGTCGCATTCTTCGGTT GTTTACCCGCAAGCGCGTTAGTCAATGGTGGTTAGCCAAAATCTGGAATC ATTCCGAGCGCTGCGGCCAGTTCATTCAGCGTATCAGTGCCGTCCATGGG TGACAGCGTCGATAACATTCTGCAATCGCGGCCAGTACAAAAGCGGTGTT CGGCAATCTGGGTATTGTTTGTTCCCCTGAGCGCGGTTTTGGTCTGTTGG CGTTCCGGTCCAGTGCCGGACTGTCCAGTGGGCTTTTCTGTTCGTTCATC ATTAACAACCTTAAACCGCATTTTGGCGTGCAGCAAGCGTTTCAGACGAT GCTGTTGGTTGCACTGCTGAGCTGCACTATCCCTTTCTCGTTGTGTCCGC ATCATCAAGCGCGACAGTGCTGAAGCTATATCTTCTGCAACGTTTTGCCC GAATTTTTTTGCACGTATTGCCGCCGCTTCTGCCAGCACTTTGCTCTGCG GATGCTGATACCGCACCTTCCCGCAGCCTCTGTCGCCTTCGTGGATGGCA CGTTGACGCACTCCCGCCCGCCGCTGTTTTTGCGTCTGCCGCGGCAGAGG CGCTCCGTTCCGGCTGGCTGTTTCAGATGAAGCGCTGGGCATTCGTCTCG ACCCGTTTTTTGCGCCCTGGCAGAATTTTCTGCCGCCGTTGCCGAGGAAG GGCTGGGCAAGCGACCGGCACTTGATGATGCGTTCGTTCTGATGATTTTG CTGGGGGCCGTCTTTTGAAGGGCCAGCCGCATCTCGGCTGAAGTGGCCCG GACTTGACGGCTTTCGTGGCCGCGTGGAGGGCAGACGTGGCGGCCTGATT GTTGTGACGCTGGCAGCATTCTTTTCTGACGTTTCGCGACCGCACTGGTG 
GCCGCCGCGTTTTTTGAGGACTCTGCGGCTGCGGGCACTTTTTCGCGCTT CAGTGGACTTTGCTGATGCCGCTTCTGCGCCGAGGACGCTTCTGAGCTGA CGAAATGCAGCTGTCGCGGGCGGACGTGCTGGCGGCGCAGTGGCTGAGTC AGTTGCATCAGTTCACAGGGCCGCGACCCGTGAGGCAGCTGATGGCAGCT GCATCGCCGGCTGATTTCTTGCGCGTCTGCCGTAGCTCTGTGCACCACGG ACGCGTTACGCGGGCCACCTCTTCCACCGCATCAGTTCAAGACGAGCGCA GCACTTCCGGCCGGGCATCATCCTCCGTCATGGCGCACAGAGAAATCATT CGTCGTCCCCGGTTGTGAAATCTTTCATACACGGTATGGTCGCCGGCGTG CGATGGTGGAAAACCGTCAACGCTGACAGGATGACACTGTATT >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/28/0_1464 GCAAAGAGGCATTATTCATTACCGCTGGGGACGCATAATAGCTTCTGTGC GCCGGGACGTTGCCGCGCTAACAGGCGAACCAGTAACAGCCATAAATCAG GCGCGGCCTAAACTGGCACGCGGCCCGCGTCTTCTGGTTATCCGAAGGTA AAGTCCTGGCGAACGGTGTATTACCCGGTTTGCTAACAGGGAAGAACGGC GAAGGAAAGATGAGCACGAACCTGGTTTTTAAGGAGTGTCGGCCAGAGTT GCCGCGCTGAAACGGGTATATGGCGGTATATGGCAGTTAAACAGATGACC ATCTACATTACTGAGCTAATACAGGCCTGCCTGGTAAATCGCAGGCCTTT TTATTTGAGGGGAGAGGGGAAAGTCATGAAAACTAACCTTTGAAATTCGA TCTCCAGCACATCACAAAACGCTTTCCACGCAGTAAAGCAAAGTCCTTCC AGACCCAACCAAACCAATCGTAGTAAACCATTAAGGAACGCAACCGCAGC TTAGACGCCAAAACAGGAAGCTATGGGCCTGCATTTAGGTGACGTCCTCT CGTCAGGTTGACATGGCATGGTCGCTGGCTGGATGCAGAAAGCTGGAAGT GTGTGGTTTACCGCAGCATTAAAGCAGCAGGATGTTGTTCTAACCTTGCC GGGAATTGGCTTTGTGGTAATAGGCCAGTCAACCAGCAGTGAATGCGTGA GGGAATTTGCGGAGCTATTTAGAGCTTATACCAGGCTTCGGTACACAGCG GTGGCGTTAAGTGGTCAGACGAAGCGAGACTGAGGCTCTGGGAGTGGGAA AGCTGAGATGGGGAGAGCGGGCTGCATGATTAAATGTCGTTAGTTTCCCT CCGGTGGCAGGACCGTCAGCATATTTGCTCTGGCTAATGGAGCGCAAAGC GACGGGCCCGGTAAAGTACGTGCATTACGTTTTCATGGGATAAGGTTGTG AACTCCAATGACATATCGGGTTTGTCAGGGAAGTTGTGAAGTTCTGGGAT ATACGCTCACCGTATTGCAGTTGATATCAACCCGGAGCTTGGGACAGCAA ATGGTTAATACGGTATGGGAACCAAAGATATTCAGACGCGAATGCCTGTT ACTGAAGCCATTTATCGATATGGTAAAGAAATATGGCACTCCATACGTCG GCGGCGCGTTCTGCACTGACACGATAAAACTCGTTCCCTTCCCAAATACT GTGGAATGACCATTTCGGGCGAGGGAAATTACACCACGTGGATTGGCATC AGAGCCTGATTGAACCCGAAGCGGCCTAAAGCCAAAGCCTGGAATCAGAT ATCCTTGGCTGAACTGTCCCAGACTTTGAGAAGGCAGATATCCTCGCATG GGTGGAGCAACAACATTCGATTTGCAAATACCGAGAACATCTCGGTACTG 
CATATTTCCTGCCTGCATTAAAAAAATTCAACGCAAAAAATTCGGACTTG CCTGCAAAGATGAG >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/45/0_2526 GCATAGGCGGGTCAAGCATCCAGCGGCTCATTAAAGGCCATGCTGGCCAA TATGGCCGGGAGATTACGCGCTCAGCTCTGCTGTCCACTCTTCTCCTCCT CTGGTTGATTGTCGCAGCCGGATTCAAATGCTGCAGTCGCCCAGGCGGGC GGTTTACAGACCGGCCTGCACGGCGCTCCATCGTTTCACGGACTGCTGGG CAAAAACTTTCCGATAAAGTCGTCACCGCGTTTTGCGCACGTCTTTCTCG TAGGTACTCATCCGGCTTCTATCACATCACCTGCTTCCTGACCTTCTTTC AGACCATCGATGGCCATACGACCGGAGTCCTATCCAGTCCAGTTCCCCCA GGCACTGGCGCTCCTGAAAACGAAGCGGCGCTTTTGAAGGTAACGTCACC ACCGCGAACGAATGGCCTCTCCAGCCAGCTACAGAAACTCTGCTCGCCTG AACGGATGCGAACCGAATTTTCGCCGCCCATAAAGTCGCCACGACATCGT TCGCACTGGCCCGTGCCGTGAGTAGCTCATCTGGCGTAATTCCGGGAAAG CTGCCTCATGACGGAACACCCAGCCCGCAGCGATAGTACCCGCAGCAGTG ACTGCTCAACACGGAGCTAGCCGTTATCCGTATCCTGAGCCGTCTGCAGG TTCAGTGAGTCCACACAGGCATCAGGTGCGGTTACTTTTGCGCCTCCCAG CCGGACCGGCGCTGCGCGTAATACGCGGCAATTTCCACCAATCAACGCCG GTCAAGCCTTTCCCGCTGCTCCTGACTTGTTCGCAGCCCAGACATAAAAA TCCCATCGCTGACTTGCGTATCACAGCCACTCTCAATGGTGGCGCATACA TCCGCCTTCACAATGGCGCTCTGCAGCTGCGTGTTCTGCAGCGTGCGAGC ATCTTCATCTGCTCCATCACGCTGTAAAACAATTTGGCACCGCGGATCTG CCCGTCCTCCACGGTTCAAAAAACGTCGAATGAACGAGGCCGCGCCCGCC GGGTAACTGCACAGGGGTTATCCCATCGTCCAATTTCTGCGGCATCCAGC CAGATACCCGTCCTCGCTGACGTCAATTCCCAGCGCCGCACCCGCTGTCC CATTAATCTGCACACCGCACGGCAGTTCCGGCTGGTCCGCGGGTATTGAT TCGTGTTGCTGATGCGCTTCGGGCTGACCATCCGGACTGTGGTCCGGAAA GCCGCGACGAACTGGTATCACAGGTGGGCCTAACGACAGTTCACCGTTAA AGGCGTGCATCGCGCCACACCTTCCCGAATCATCATGGTAGAACGTGCGT TTTCGCTCAACGCTCAATGCCCGCAGCAGTCATCCCTCGCAAACTCTCTC CATGCCCGCTTCACCGTCGCGGAAAAAGGCACGGCCTTCTTCCTGCCCGA TGCCCCATGATAGCGCCAGCTTGGCGATGACTGAGCCCGAAAAAAGACCC GACGATATGATCCTGATGCAGCTGGATGGCGTTGGCGGCATAGCCCGTTT ATTGCGTAGCGCAGATCGTTCTGCGCGGGCGATGGGCCACGTAAAGTTGG GTCAACAGGGCTGCATCCACACTTTCACTGCGTGGGTTTCCAACCGCAAC GTGCCCCTCACAAATCTCGCTGCCACCGCCGTCCGATAACCGCATATTTC GCGCAGCGATGTCAAAAGCCGTCGCCGGCCCCGAAGGGGTGGGAAATAGG TGGGCGTTTCATACATAAATCCTGCAGGTCCCCCTGCGTCGCTGTGTCAT GCCGGTCTGCACTTCCAGCTCTGCAATATACTTTTTTCAGTCCAGAACAA 
ACGAAGTGCCCGTAACTCCACCCTTCCGTGTCGTCGTCTTTCTGTACTGT GCCACCCGTGTTACCTGTCATTCAGGCTCATGCAGTGCCGCACGGGCAGC GGCAAGTTCTTCCTGTCCGCGTCATTGCATCCTCTCCGATAAGGCCGGCG TAATCTGCCAGTGTTTTCTTGTTGTTGCTGCCACCATCCTCTTCCCTGCA GGCTCGCCAGCAGCGCACTGAGATCCGCAGCTGCCAGCGGAAATACTGAT GCCAGCGCCGCCAGCGCATAAACGAAGTCAGTGCCCGAGTGGCCTCATCT GCGTCCGCTTTTTGCTGTCCCACCAGTATTTTTTTCCTTGCCATCCATCT CCATTTTTCGACCTGCTCTTCGCGTCAGCTGCTGCGCTTCGTCAAATCAG AAAATATCTCGGTTAAGTTCCGGGAAGTGAACGGACCGGAACGGTTCATC CCCTTCCGGCGTCCAGTGTTGAAGGCGGGTTAAAATCTCTCTTCGGGTAT CCGTACCCGATTTTCGGTAAGGTAAACCCCGTTTTGGTTTTCGCTTACGG CATGCTGGCCACCGCTTTCCCGTAGACGGATGCCCCTTTAATGGGATCAC CACGGAACAGCCCAGTTTTTTCGAGCGGTCTACACATGGTCGGGTCAATC CCGCCAGTAATCCCAGCAGATACGGGATATCGACATTTAATGCACCGTTC CGCGGGTAATAAGGTTTCTATTGATG >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/47/0_1953 GTCGATTTCCCAGAGCACCACGCTGGCCGACTGCTCCGGCACGGGTGCAT TCATTCGAATGTTTCTGCCGGATTCTGCATTCAGTTCCTACCACCTGTTC AAACTCTGCCTGAACTCCAACACGCAGCATACTTGACCCGCAGACGCACA TATTTGCGCAGGTCACCTTTATCTGCGCCACTCATAAGGCGGCGTCCACA GACAGGCATTTCCCAGGCCCCCCGTTGCTCTTCCAGAAACGACTACAAGT ACCGTGGCCTCTCACGGGGACAGAAAGCGTCACGCTGTTACGTTTTTCAG GTTGGCCATTCAAGCCCGGCAGGCGCTCGCTGAGAATAGCCATCACCAAA GCGCCACCTTTCTTACAGAAAGGGACCGAGCAACATCCATACCGGGTTTT CACTTTCCAGCGGAGGTTTCATCCGTCCACACTTCCGAGACAGGCCACCA TCACGCATCCTGTGTCTGAATTTTCAATCACGGGCACCCTTGCGGGCATG TCATACACCGCCTTCAGAAGCAGCCGTGACATATCTGCCCGTTCGTGCCG CTCGTTGTTAATCACCACATGGTTATTCTGCTCAAACGTCCCGGCACGCC CTGCGACTGGCTGTCTGCCATGCTGCCGGTTGCGTACCGACATAACCGCC GCGGTGGCATACGCGCGCATCAGCGGTAAAGATCCCAATACCCCAATCCG TGCTGGTTGACCTCCTTCGTGAAGCAAACTCACCACGGGTGGAACAATCC CCCGCTGGGAACGCTCATATTTGCCGCGGTTCCCCGTAAATCCTACTCGT CTGCAAAATGAGACATTTCCGCCCACGCGCGGCCCTGATCGGCTGTAACC AGCCTGACCGATGCGCCGCCACCAACCAGCCACGCCAACTGGCGACTGCC GATACTCCCGACAATTCCCCCACCCATTGCCTGCCTTAAGACCAGCATTT CTCGTCATCATGGAGCAGCACAGGACGGGTGAAGCCCTGCGCCAGGTCTG CTACTGCCGTCAGCATCGCCGCCATATTCTGTGCAATTACCATCAAAGGT CTGCGTGTGCACTTTTTACTGCGACAATACTGTCCGTGGCGCTCTCTTCC CACTCACTCCAGCCGGCCTTCAGGCTGCCCATCCAGTTCCCGCGAAAGCT 
GGTCTTCATGCCGCCAGGTCTTTTTTACTGCTCTGACATGACGTTATTCA GCGCAGCGGGATTATCGCCATACTGGTTCCTTCAGGCGCTGTTCCGTGGC TTCCCGGTTATGAGCTGCGGCGTCAGCCCCGCTTTTCGCATCAATGGCGG CCCGGATTTTGGCTCCGTTGCTGCTGTGCGAATTTTATCCCGCTGCTGCG CCAGCGCTTCAGGCGCTCCTGATACGTAACCTTGTCCGCAAGTGCAGCCA GCTGGCGTTTGTACTCCAGCGTCTCATTCTTTATGCGCAGCAGGGTTTCT ACTGTGCAGACAGCTGGCGACGTTGCGCCGCCTCCTCCAGTAACCCGCGA ACTGACTCCTCCCGCCTTCCACAAATCCCGGCGCTGCTGGCTGATTTTTC AATTTGCTCCGGCTGCTTCTTCTCAGCGTCCGGAGTTCTGCCTGAAGCGT CAGCAGGCAGCATGAGCACTTGTCTTCCTGACGATCGCCCGCAGACACCT TCACGTCTGGACTGTTTCGGCTTTTTCAGCGTCGCTTCATAATCTTTTTC GCCGCCGCCCATCAGCTGTGTGTAATCCGCCTTGCAAGGATTTTCCCGTC TTTCAGTGCCTTGTTCAGTTCTTCCTGACGGGCGGTATATTTTCCCAAGC GGCCGTCTGCAGCGTTCCGTAAGCCTTATCTGCGCCTCTTCGGTATAATT TCAGCCGTGACGCTTCGGTATCGCTCCTGCTGTGCGCATTTTTGTCCTGT TGAGTCTGCTGGGGCTCAGGGGACCTTCTTTCGGGCGCTTTCAAGCGCCC CCG >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/49/0_7870 TTTGACTCATTGGAATATAGTCCATTCAAAGGCCAAATGTAAAAAGGATA TAAAAAAGGTTGCCCAGGATATTATTTTTAACCCGTCCCAGTCCATTTTC AGACATTAAGCCCTGCGGGCAAAAGAGAGACGGAGATATTTTAAAACAGA ACATTATCTATCAGAAAAGGCATCCTCTCAGACCCTCTGAAACAAGCGGG TCACCAATTGCTGAGATAAGCTGAAGAGATGCATATTGCTACGCAAGAAA TGAAAAGGTGATATTACTGGAAGCTTTTAAAAGGCAGGTGGTGCAAAAGT TAAGGATTAATTATCAGGAGTAATTATGCGAACAGAATCCATGCCTGGTG TTTACATAGTAATAATTCCCTTACGTCTTATCGAAAGCATTTGCCTATCT CTGCTTTTCCGCCACTACATTCCTGGTGTTTCCTTTTTCAGCTCATAGAA GAGGTCCTTGGGCGACATTGTCATCAATATCAGGAACCAATGATTGCAAT CCTGATTGCTCGCCTGACGTTTTCCTAAATCGGAAGCAGAAACGCGCCGA CTGGCCAAATTAGGAGAGTATCGGTATATGACATCGTAGGTTATTGTCTA ATGCCCTTAGTTTTGTTGAGCTTGGAGCTTGTTTTTCTGCGGGTTATTTG CTTCTTTCCAGCATAAGCGGCTACATGATACCACTATCGCCATCGGCATT GGCCTCTGCATCGTTCATTCTATATGCATCCTTGTTTTCCAACTATATAA TTTGACCAGAGAACAAGAATAACCCGGCCCTCAGCGCCGGTTTTCTTTGC CCCACATCGCCCCCCCAAAACACATACCAATTGTAATTTATATGAAAAAT AAAAGAGTACAACTCACTAAACATAGCAATTCAGATCTCCTCACTACCAA AGACAATGCCCCTGCCAAAAAATAAATTCATATAAAAGAACATACAGATA TCACCATCTGCGGTGATAAATTTATCTCTGGCGTGTTGAAATAAATACCA CTGTGGGGGCGGTGATACTGGCCATCCAAGCAGGGACGCAGGCTGACCAC 
CATGAAGTGACGCTCTAAAAATTAAGCCCTGAAGAACGGGCGCAATTCAA AGCAGAGAAGGCTTTGGGTGGGTGTGATCACAAACGAAAGCATTGGCCGT AATGGCGATTCCGGATTAGCTGCCAATGTGCCAATCCCGCGCGGGGTTTT CGTTCAGGACTACAACTGCCACACACACCACCAAAAGCTAACTTGACAGG CAGAATCCAGATGGATGCACAAACACTCCGCCGCGGAACGTCGCCGCAGC AAAACCAGGCTCAATGGAAAGCAGCAATCCCACTGTGGTTGGGGTAAGGC GCAAAACCAGTTAACCGCCCCTATTCTCTCGCTGATAGCAAACCGAAATC AAACGAGTAGAAAGGCGCACTAAATCCGTAAGACCCTTAACAGGTGCTGG CTGAATACCACAAACAGATTGAAAGCACCTGCACGTATTGAGCGCAAGAA TCAGCGCACATGGTACAGCAAGCCTGGCAACGCGGCATAACAATACAGGT GACGCCCAGAAAATTAAGGGAAAATCGATTCCTCTTATCTAGTTACTCTT AGATATTGCCTTGGCTTTATCTCAATATTAATATGGATCATAGCTGGCAA CTAATTCAGATCCAGTAAATATCCTCAATAGGGAATAATAATGCTTTTCC CATTCATCGGGAAAAAGTTTTGTTTCAACACACCAAGCTCAATCAACCCT CACTAATGTATGGGTAATTGTTCTTTGATGTAACCACAATACTTCCTGCC TTCATTAAGGGCTGCGCCCACAAAACCATAGATGCTCTTCTGTAAGTTTG AATTAACTGATCGCACTTTATCGTTGTTTGCAAATCTTAATGGCGTTTTT CTTAGCTAAATGCTATATCTTGGCGCTGGCAATAGCTGATAATCGATGCA CAGTTAATTCTAGCGAAAATGCAAGAGCAAAGACGAAAACATGCACACAA TGAGGAATACCGATTCTCTCATTAACATATTCAGGCCCAGTTATCTGGGC TTAAAAGCAGAAGTCAACCCCAGATAACGATTCATATACATGGTTTCTCT CCAGAGGTTCATTACTGGAACACTCGTCCGAGAATAACGAGTGGATCCAT TTCTATACCTCATCAAACTGTAGGGTGTAATAGTTTATCCGATTTTCCTC GCTGTAGGGGTACATCGAGAACCACCGAGCCTTGATGGTGGTTAAAAGAC AGGCAGCAATCTTTACTACCGCAATCCACTATTTAAGGTGATATATGGAA GAAAGAAAGAATTTGAAGGAGTTCGAAGAGCATCCTCAGGATGTGATGAA CAATACAGGACTATCCGTAATGACTACGACTATTGGGATAAAAATCAATG GTGGTGGACAATTCCAAGCGATGCAATGGATGCAAGCGCAATCGGAATGC ATGGTTACAGCCTGAAAGAAATGTTTTCCCTGTTAATGGAAGATGGGAAA GTATGTCGATAAATGGGCAATACGAACGCACGGAATGATGCCAGAGAACT TGGGGTAAACAGAACAAACAAAGCTGCTGATAGTGGTCCTTTATTTTTGC ATAAATAACCAAATAAACACTGCACTGGTATTTCATTCCAACGAGTGGAA ATACACGGAGCAATGGTCGCTTCGTAACTAAACAGGAGCCGACTTGTTCT GATTATGAAATCTTCTTTGCCTCCCAGTGGTGAGGAGCGGACTTTTTTAG CTGTGAGGATATGAAACAGATGTCAAACATCAAAAAATACATCATGATTA ACGACTGGAAAGCATCAATAGAAATTGAAAATCAGACCATGACGTAATGA CAGAGGAAAAACTTCACCAGATAATAATTTCTGGTCAGACTCTGAAACGA CTCAATAAACACGGCTCTGTATTAAATGACTGTATTAACATGCTGGGCGC CACAATGCTCTGCTTATAGCAATTTCAAGCTGACTTAAATGCATATGGGT 
TGGTTGTGGTGTGATTCGACTGAATGGATGGAAATGGGTCGGAAGATGGC CTCGCAATTGGATGGTAACGAAGGATAAGAATTACCGATATCGATACATC CAGGAATATTTGATTCAGATGATATGAACCTATCAAGGCCGGCCTGAGTG CGGTTTTACCGCATATCCAATAACCCGCGTTCCACTCGAGCGTTTTTCGT TATGCTATGAAATAAGGAAGCACACTCATGCAAATATGCCATTGCAGGGT GGGCCTGGTTGCTGGCCTCCCTTCGCGAGCGTTACTTGAACGAATCAGCC CGTAAATTACGTGACGGATGGAAACGCCTTATTCGACATACTTAATCAGC CAGGATCCCAAAGAAATGGATCAAACACTTATGGCTATCCAGACTAAATT CACTATCGGCGCACTTTTATGGCGATGAAAAGACGTGTTCGTGAATCCGT CGACGCTTATAAAAATGGATATTAATACTTGAAAACTGAAGATCAAGCAA AAGCATTCACTAACCCCTTTCCTGTTTTCCTAATCAAGCCCGGCATTTCG CGGGCCGATATTTTCACAGCTATTTCGGAGTTCAGACCATGAACGAGCTT ATTACATTAGGATCGTCTTGAGGCTCAGAGCTGGGCGCGTCCCTACCAGC AGCTCGCCCGTGAAGAGAAAGAGGCGAACTGGGCAGACGACATGCGAAAA AGGCCTGCCCCAGCACCTGTTTGAATCGCTATGCCATCGATCATCTTGCA ACGCCACGGGCCCAGCAAAAAATCCATTACCGTGTCGTTTGATGACGATG TTGAGTTTCAGGAGCTGCATGGCAAACACATCCGGTAACATGGTTGAAAC CATTGCTTCACGCACCCAGGTTTGATTTGATCAGAGGTATAAACAATGAG TACTTGCACTCCGCACGCTGGCTGGGAAGATGGCTGAACTGTCGGCATGG ATTCTGTCGACCCACCAGGAACTGATCACCACTCTTCCCCAGACGGCAAT TTTAAAGGTGATGCCAGCGATCGCAGTTCAATCGCATACTGATCGTTGCC AACCAGTTAACGCGCCTTAAATCCGTGGACGAAAGCAATTTACGCTTTCC TGGATAAGCAGAATGGCATCCGTTTCCGTGGTGGGCGTTGTATGGCTGGT CCCCGCATCATCAATGAACCAGCATGTTTGATGCCATGGACTTTGAAGCG ACAATGATCTGTACATGCCGGATTTACCGCAAGGACCGTAATCATCCGAA TCTTCGTTTACCGATGATGGCTGAATGACCGCCGCGAACCATTCAAAACT CCGCGAAGGCAGAGAAATTCACGGGGCCGTGGCAGGCCCACCGCACCCAA ACGGATGTTACGTCATAAAGCCATGTGATTCAGTGTGCCCCCGTCTGGCC TTCGGATTTGCTTGGTATCTATGGACAAGGATGAAGCCGAGCGCATTGTC GAAAATACCCTGCATACACTGCAGAACGTCAGCCGGAACGCGACATCACT CCGTTAACGATGAAAACCATGCAGTGAGATTACACTGCTGCGTCGTCCTG GATAAATAAAAATGGGAATGACGACTTAATTGCCCGCTCTTGTTCCCCAG ATATTCCCGCCGACATTCGTCATCGTAGAACTGACACGGCCGAAGCAGGT AAAAAGCTCTTGGATCCTGAAACAGAGAAGCCGCAGAGCAAAGGTGGCAG CATGGACACCGGACATTATCCTGCAGCGTACGGAATCGATGTGAGAGCTG TCGAACAGGGGGATGATGACGTGGCACAAATTACGGCTCGGCGTCATCAC CCGCTTCAGAGAGTTCACAACGTGATAGCAGAAACCCCGGCTCCGGAAAG AAGGTGCCCTGACATGAAATGTCCTACTTCACACACCCTGCCTTGCTGAG GGTTTTGCACCGGTGGCTCCGTGAAGTTAACGCTAAAGCATGCCTGGGGA 
AGACAGTACGAGAACGACGCCAGAACCCTGTTTGAATTCACTTCCGGCGT GGAATGTTACTGAATCCCCGATCTCTATCGCGACGAAAGTATGCGTACCC GCCTGCTTCCTCCCGATGTTTATGCAGTGACGGCAACGGCCTGAACTGAA ATGCCCGTTTACCTCGCCGGGATTTCATGAAGTTCCGGCTCGGTGGTTTT CGAAGGCGCATAAGTCAGCTTACATGGCCCAGGTGCAGTACAGCATGTGG TGACGCGAAAAAAATCCTGGTACTTTTGCCCAACTAATGACCCGCGTATG AAGCTGAAGGCCTGGCATTATGTCGTGATTGAGCGGGATGAAGAATACAT GGCGAGTTTTGACGAGATCGTCCGGAGTTCCATCGAAAAAAATGGACGAG GCACTGGCTGAAATTGTTTTGTATTTTGGGGAGCAATGCGATGACGCATC CTCACGATAATTATCCGGGGTAGGCGCAATCACTTCGGTGCTACTCCGTT ACAAAGCGAGGCTGGGTTTCCCGCCTTTCTGTTTATCCGAGAATCCACTG AAAGCACAGCGCTGGGCTGAGGAGAGTAAATAATAAACGAGGGGCTGTAT GCCAAAGCATCTTCGTTGAGTTAGAGAACGATTTCGGAGATGCACATAAG CCTTGCTCAATTGAATCAGGTTTGTGCCAATACACAGTAGAAACCAGATA CAGACGAAGAATTTCCATACTTTAGGCCGCATCCCCCTTTCCAAAAGCTG AAATGATGGTGGCGAAAGCAGAGCAGATGAGAGACAACCATGTATGACAA CCACGAATGCATGTTTTCTTGGCAAGCGGGCTTCATATTCTGTGTGCTAT GCTTGCCGACATGGGAGCTTGTTCAAATGACACCTCAGCCAGGTGAAAAC GCCCTTTCGGCAGCATTGCCGCTGCAGGCTAATTCTGAAATCAAAAAAAA GCCAGACAGCAGTTTCCGGATAAAAACGTCGATGACATTTGCGCGTAGCG TAACTGAAGAAGCCCGCGAAACGGTAACGCTGATGGATTCACACGACTCA TTAAGCCTGGCAATCGGCCTGTAAACGCGCGTCTTTAAGGACGATGAACA TGAAAAGCAAAATCATTCAGGGCAGCTACAGGCCTTCTCCTTTTTTAATT ATTCGCATCACTGCCTCAAGCCGTATTAAAACCAAACCAGTTCAGGGATT TAAGAAAGATGGCAGACATCCATTGATCAGCATCAGAAGATAGAAGAATT ACAGCGCAACACAGCAATAAAAATGCGGCCGCCTGACCAACCAGGCCCTA TATCGTGCCAACTCACAACAATGAGTGGCAGATATATGCCTGGTGTTCAG GCGGCGCATTTTTTATTGCTGTGTTGCGCTGTAATTCTTTCTATTTCTGA TGCTGAATCAATATGCTGCCCATCCTTTCATTTATCCCTGAACTGTTGGA TTAATACGCTTGAGGTGAAATGCGAATAATAAAAAAGGAGCCTGTAGCTC CCTGATGATTTTGCTTTTCATGTTCATCGTTCCTTAAAGACGCCGTTAAC ATGCCGATTGACCCAGGCTTAAATGAGTCGGTGTGAAATCCCATCAGCGT TACAGTTTCGGCGGTGCTTCTTCCAGTACGCACGGCAAAATGTCAATCGA CGTTTTTATCCGAAACTGCTGTCTGGCTTTTTTTTGATTTCAGAATTAGC CTGACGGGCAAATGCTGCGAAAGGGCGTTTTTCCTGCGAAGGTTCATTGA ACAGCCGCATGTCGGACAAGCATAAGCACACAGAATATGAAGCCCGCTGC CAAGAAAGAATGCATTCCGTGGTTGTCCAGTACCTGGTTCTCCTCACTGC TTCTGCTTTCGCCAACCATCATGTTCCAGCTTTTGTGAAAGGGATGCGCT AAACGTATGAAATTCTTCGTCTGTTTCTACTGTAATTGGCACAAACCTGA 
GTTTCCAATTTGAGCAAGGCATGTGGCATCTCGATACTCGTCTGTAACTC ACAGAAGATGCTTTTGTGCAACAGCCCCTCGTTTATTATTTAATGCTCCC TCAGCCAGCCGCTGTGCTTTCAGGGATGTTCGGATACAGAAAGGCCGGAA ATACCCAGCCTCGCTTGGTAACGAGTTAGACGAAAGTGATTGCGCTACCC CGGATGAATTATCGTGAGATGCGTCCCATCGCCATTGCTCCCCAAATACA AAACCAATTTCACCAGTGGCCTCGTCCATTTTTTCGATGAACTCCGCACG ATCTCGTCCAACTCCGCCCATGTACTTTTCTCCCGCTCATCACGACATAA TGCAGGCTTCACGCTTCATACGCGGTCATAGATTGGCAAAGTACCAGGCA TTTTTTCGCGCACCCACATCTGTACTGCACTGGCCAATGGTAAGCTGCAC TTTATGGCCCTCGAAACCACCGAGCGGAACTTCCATGAAATTCCGGAGGT AAACGGGCATTTCAGTGTCAGGCCGTTGTCCGTCACTGCATAAACCATAA GGGAGAGACAGGCGGTTATCGCATACTTATCGGTCGCGATAGATGATCGG ACTTCAGCTAACATTCACGCCGGAGTGAATTACAAACAGGGTTCTGGCGG TCGTTCTCGGGTACTGTTTTCCCCAGCCAGTGCTTTAGCGGTTAACTCCC GGAGCCACACCGGTGCAACCTCACAAGCAGGTGTGGAAGTAGGACAATCT TTCATGTCAGTTGGCCAACTTCTTTTCCGGAGCGGGTTTTGCTATCACGT TGTGAACTTCTGAAGCGGTGGATGACGCCGAGCCGTAATTTGTGCCACGC ATCATCTCCTCCCTGTTCGAACAGCTTCTCCACATGCGATCCGGTACGCT GCATGATAAATGTCCGGTGTCATGGCTGCCACCTTCTGCTCTTGGCGGCT TTCTGTTTCAGGAATCCAAG >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/53/0_1306 GAAAGGAAGTGCATTAAGCAGCTCGGAGAGCGCGTTCAGATAGAGCACTT GATGAGGGAACCACCCAAGACCGGATGTTATGCATATGCATTTCATCGGC CGAACAACGGTGAAGGTAGAAGCCAAGCTACCCTGTAGTGTGGGCGATAT TTAAAGGTCTTAGGGGCAACAGATATTGGGGAAGGGGAGCTTTTTGACTC ATCTGATATAGATCATTAAGCCAAAATTGTAAAAAGGGATATACAAAAAG GTTGCCAAGGATATTATTTTTAACCCGTCACCCTCAATTTTCAGACGATT AGCCTGCGGGCAAAAGATGAGGCCCGGAGATCATTTTAACAGAACATTAT CTATCAGAAAAACCATCTCTCAGCGCCTCTGAAAAGGTCACCAATGCTGA GATAGGCGGAGAGATGTGGCATGTATTGCACGCAAGAAGAAAAGTGATAT AGCTGGAATGTTTTAAATAGGCAGGTGGGGCAAAGTAAGGATTAATTATC AGGAGTAATTAAATGCGGAACAGAAATCTGCCTGGTGTTTACATAGTAAA ATTCCTTCGTTATCGTAAGCATTTGCTACTCTTCTTTTTCCGCCACTTAC TTCTGGGTGTTTCTTTTTCAGCTCATAAGATGGTCTTGGGGCGACATTGT CATCCATATGCAGGAACCATGATTGCAATCTGATTGCTCCGCCTTGACGT TTCTAATCGGAAGCAGCAACGCGCCGGACCTGGCCCAAGATTAGAAGAGT AAATGGGTATATGACTCGGTAGTAACTTGTCTATGCCCTTAGTTTTGTTG AGCCTTGGAGCTTTGTTTTTCTGTCGGGTTTATTGCTTCTTCCAGCATAA GCGCTACATGATACCCACTTCGCCATCGGCATTGCATCTGCATCGTTCAT 
TCATATATGCCCATCCTTGTTTCCAACTATATAATTTGGAACCGAGAACA AGAATAACCGGCCTCCAGCGACGGGTTTTCTTTGCCTCACGATCGCCCCC CAAACACATAATCCAAATTGTATTTTATTGAAAATAAATAGATACAACTC CACTAAAACAAGTAGCAATTCAGATCTTCTCCACCTACCAAACAATGCCC CCTGCAAAAAATAAATTTCATATAAAAAACCATACAGATAACCATACTGC GCGTGATAAATTAATCTCGCGGTGTTGACATAAATACCACTGGCGGTGAT ACTGAGCCATCAGCAGGACGCACTGACCACCTATCCGGCTGAACGCTCTT AAAAATTAAGCCCTGGAAGAAAGGAGCAGCATTCCAAGAAGCAGAAGGCA TTTGGC >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/70/0_3728 CGCCGTACGCATACTTCTCGTCGCGATAGATGATCGGGGGATTCGTAACA TCACGGGCCGGAAGTGAATTCAAACAGGGTTCCTGGCGTCCGTTCTCGTA TGTTTTCCCCAGGCAGTGCTTTTAGCGTTAAACTTCCCGGAGCCACACCC GTGCAACCGCTCAGCAAGCAGGGTGTGGAAGTAGGAAACAGATTTTCAAT GTCAGAGCACTTCTTTTCCGGAGCGGGGTATTTTGCTATCACGTTTTGTG GCAACTTTCCTGAGCGGTGATGACGCCGAGCCAGTATTTGTTGCCACGCA TCATTCCCCCCCTGTTCGACAGCTCTCACAATCGATCCCGGTACGCTGCC AGGATAATGTTCCGGTGTCATGCTGCCACTTCTCTGCTCTGCGGCTTTCT GTTTCAGGAATCCAAGAGCTTTTACTCGCTTCGGCTGTGTCAGTTCTGAC GATCCACAATGTCGCGGCGATATATCTGGGACAGAGCGGTCATAAGTCGT CATCCCTGTTTTTGTCCAGGGCGATCAGAAGAGTGTTTAATCTCTGCATG GTTTTCATCGTTAACACGAGTGATGGTCGCGTTCGGCTGACGTTCTGGCA AGTGTAAATGCAGTATTTTCCGCAATGGTCGGCTTCATCCTTGTCATAGA TACAAGCAATCCAGAAGGCCAGACGGCACACTGAATCATGGCTTTATGAG TAACATCCGTTCTGGGATGCCGACTGGCACGGCCGTAGATTTCTCTGCCT TGCCGAGTTTTGAAATGGTTTTTTCGCGGCGGATTATCCATCCATTCGTA ACTCGCAGATCGGATGATTACGGTCCTTGCGGTAATCCGGCATGCTCAAG GATTCATTGTCCCTGCTCAAGTCCCCATGCATCAAACTGCTGGTTTTCAT TGATGATGCGGGACCAGCCATCAACGCCCACCACCGGAACGATGCCATCT CTGCTTATCAGGAAGGCTACTTTCCTTTCGTCCCACGGATTAAGGCCCGT ACTGGTTGGCACGATCAGTAATGCGATGAACTGCGCATCGCTGGCATCAA CATTTAAATGCCGTCTGGCGCGAGTGGTGATCAGTTCCTGTGGGGCGAAG AATCCATGCCCGACACGTTCGCCAGCTTCCCAGCCAGCGTTGCGAGTGCA GTACTATTCGTTTTATACCTCTGAATCAAATCAACCTGGGGTGAGCAATG GGTTTCACCATGTACCAGGATGTGTTCTGCATGCGCTCAACTGAAATAAC ATCGTCTCAACGAACGGGTAATGATTTTTTGCTGGCCCCCAGTGGCGTTT GCAAATGGATCGTGCCATAGCGATTCAAACAGGTGCTGGGGCAGGCCTTT TTCCCATGTCGTCCTGCCAGTTCTGCCTCTTTCCTCTTCAAGGGCGAAGC GCCTGGTAGTGACGCGCCCAGCTCTGAGCTCAAGACATTGACATGTATAA 
GACGTTCATTGGCTGAACTCCTGAAATAGCTGTGAAAATATCGCCCGCGA AATGCCGGGCTGATTAGGAAAACAGGAAAGGGGTTAGTGAATGCTTTTGC TTGATCTCAGTGTTTCCCCCCCCAGGTAAAAAATTAAATATCCATTTTTT ATAGCGTCGACGGCCTTCACGAAACATCTTTTCATCGCCAATAAAATGGC GATAGTGACATTTAGTCTGGATAAGATAAGTGTGTTTGATACCCATTCTT TGGGACTCCATGGCTGATTAGTATCGTGCGATAAGGCGTTTCCATCGTCA CGTAAGTTCCCGGGGTGATTCGTTCAAGTAAAGATTCGGAAGGGCAGCCA GCAACAGGCACCCTGCAATGCATATTGCAATGGTGGTGCTCTCATCTTAT ACATAACGAAAACGCCCTCGGAGTGAAGTCGTTATTGGTATCGAGGTAAA ACCGCACTCAGGCGCCTTGGATAAGTCCATATCATCTGATCAAATATTCC TGAAGTATCGATATACTGTAATTCTTATTCCTTCGCTACCATCCATGTGG AGGCAATACCTTCTGACCATTTCATCATTCAGTGACTCACACAACACCAT ATGACATTTAAGTTCGTTTGATTGCCTATAAGCAGAGCATGTCTGCGCCA GCATGAATCTAATACCAGATTTAATACAGAGCCGTGTTTATTGAGTCCGG CTATTACAGAGGTCTGACAGAAATTATTAATCTGTGGAAGTTTTTTCCCT GCCATTAAACGTACATGGCGATTTCAATTTCTATTGATGCTTTCCAGCGT AATCAATGATGTATTTTTGATGTTTGACATCTGTTCATATCCTCACAGAA TAAAAAATCGACCTCACTGGAGGCCAAAGAAGATTTCCAATATCAAGAAA AGTCGGCCCTGTTTAGTTAACGAGCGAATTGCTCCGTGTATTCCTTCTTG GAATGAATACACGTGGCAGTGTTTATTCTGTTATTATGCCAAAATAAAGG CAACTATCAGCAGCTTTGTTGTTCTGTTTACCAAGGTTCTCTGGCCATCA TTGCGTCGTTCGTATTGCCATTTATCGAACATATTTCCCATCTCCATAAA AGGAAACATTTCTTCAGGCTTAACCTGCATTTCCGATTGCGCTTGCATCA TTTGCATCGCTTTGAATTGTCCAGACATTGTTTTTTTATCCATAGTCGTA GTCTACGGATAGTCCTGGTAGTTGTTCCATCACATCCCTGAGGATGCTTC GAACTATCCCAACTTCTTCTTCCATATATCACCTTAAAATAGTGGATTGC GGTAGTAAAGATTGGTGGCCCTGTCCTTTAACCACATCAGGGCTCGGTGG TTCCTCAGTGTACCCCCTACAGCGAGAAAATACGGATAAATATGTAAACC CTACAGTTTGATGAGTATAGAAAAATGAGATCACTCGTTTCTCGGAACCG GTGATTCCAGTAATGAACCTCGGAGAAGAAACATGTCTTATGCATCGGTA TATCTGGTTCGGACTTCTGCTTATTTAAGACCCATGATAACTGGACCCTG ATATGTTAAATAGAGAGAATCGGTATTCCTCATGTGTGCATGTTTTCGTC TTTGCCTTGCTTTTCGCTAGCATTAATGTGCCATCGATTATAGCTATTGC CAGGGCCAGATATAAGCGATTAAGCTAGAAAGCATTAAGATGCAAACGAT AAAGTGCGATCAGTAATTCAAAACCTTTACAGAAAGGCAATCTATGGTTT TGTGCGCAGCCCTTAATGAAGGCAGGGAGTATGTGGGTTACATCAAAACA ATTCCATACATTAGTGGAATGTTGATGAGCTTGGTGGTGTTGAACAAAAC TTTTCCCTGATGGAATGGAAAGCATACTATTATTCCTATTGAGGCTATTT AGCTGGACTGAATTAGTTGCCAGGCTAATGATCCATCATAATATTAGAGA 
TAAAAGCCAAGGGCCAATATCTAAGTAACTAGATAAGAGGAATCCCGATT TTCCCTTAATTTTCCTGGCGTCCCAATGCATGTTATGCCGCGGTTCGCCA GGCTGTGCTGTACCATGTGTGCCGCTTGATTCTTGCCGCTCAATAACGTT TGCGGTTGCTTTCAAATCTGTTTGCTGGTATTCGCCCCAGACACTGAAGT CTATCGGATTTAGTGCGCCTTTCTATCGTGATTTCGGGATTTTTGCGATT CGACGAGAGGAAATAGGGCGGTTAATGGTTTTGCGCTTACCCTGCCAACA CAACAGGGGATTTGCTGCCGTTTTCACC >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/105/0_2152 CTAACATTTATCTGTCATCATACTTCCGAGCATTTATTAGCATTTCGCTA TAAGTTCTCGCCTGGAAGAGGTAGTTTTTTCCTTTGGTACTTTCACCTGT TCATCCTCTGTTCACGTTATCTATCGCTTTTAAAACGGTTCGACCTTCTA AATCCATCCTGATCCATTAAATATATTTTTTAGAATGGTTCATCAAGAAA GCCCTCTGAATCAACGACTTGCATATTAAAGTGGTGTATGCGCAGATTGT CCACTTCAAGTAAAACACCTCCAACCGAGTTAAAACACCTTAAGTTCACC CGAATGTCCTCAATATCCGGACGGATAATATTTATTTGCTTCCTCATTGA CCGTAGGACTTTCCACATGCAGGATTTGTGCGAACCTCTTGCAGTACTAC TGGGAATGAGTTGCAATTATTGCACACATTGGCGTGCATCCGAGTAAGTC GCTAATGTTCGTAAAAAAGCAGAGGAGCAAAGGGTGGATGCAGATGAAAC TCTGGTTCATTCGAATAAAACTAATGACTTTTCGCCAACGACATCTACTA ATCTTGTGATAGTAAATATAAACAATTGCATGTCCAAGAGCGTCAATTCG AAGCAGATATTTCCTGGATATTGTCATAAAAACAATTTAGTAGATTTATC ATCGTCCACTGAATCTGGTGGTGTCATTTACGTCTTAACTCTTCATATTT AGGAATGAGCACTGATGAGTTCCATATTTGAAAAGTTTCATCACTACTTA GTTTTTTGATAGCTTCCAAGCCAGAGTTGTCCTTTTTCTATCTACACTCT CATACAACCAATAAATTGCTGAAATGAATTCTAAGCGCGAGAACGCCTAG TGATTTTAAACTATGCTGGCAAGCATTCCTTTGAGTCGCAATATAAAAGT ATTGGTGTACCATTTTGCTGGGTCTGAGGTTGTTCTTTAGGAGGAGTAAA AGGATCAAATGCAAACTAAACGAAACGGAAACAAGCGATCGAAAATATCC CTTTGGGATTTCTGGAACTGCGATAAGTCTATTTAAAGTTTTCAGAGAAA AATATTCATTGTTTTCTGGTTGGGTGATTGCCACCAATCTTCCCATTCAG AAATTGTTGTTTTACCACACCATTCGCCCGATAAAAGCATAATGTCGTGC TGGGCATAGAATTAACCGTCACCTCCAAAAGGTATAGTTATAATCACTTG AAACCGGAGAGGCACTTTTTCTATTAAATGAAAAGTGGAAATCTGACCAA TTTCTGCAAAATCCATTTTACACCCGTGCGAACTGTCCATGAGATTTCCT GAAGAGTTACCCCCTCTAAGTAATGAGGTGTTAAGGACGCTTTTCATTTC ATGTCGGCTAATCGATTTGGCCATACTACTAAACCTGAATAGCTTTAAGA AGGTTATGTTTAAAACCATCGCGTTAATTTGCGTGAGATTAACCATAGTA GTCAATGCTTCACCATAAGGAAAAAAACATTTCAGTGGATTGACTGTAAT TTTTTATCTATTAATGGAATAAGTGCTTACTCTTCTTTTTTGACCTACAA 
AACCCAATTTTACATTCCGATATCGCATTTTCACCATGCTCATCAAGACA GTAAGAATAAAACATTGTAACAAAGGAAATAGTCATTCGCAACCATCTGC TCGTAGGAATGCCTTATTTTTTTTCTACTGCAGAATAATAGCCCGCCTCT TTCAATAAACAACAAAACTCCAACATATAGTAACCCTTAATTTTATTAAA ATAACCGCCAATTTATTTGGCGGCAACACAGGATCGTCTCTTTTAAGTTA CTCTCTATACATACGTTTTCCATCTAAAAATTAGTAGGTATTTCGAAACT TAAACGGGCATCGTATTGTAGTTTTTCCATATTTTAGCTTCTGCTTCCTT TTGGATAACCCACTGTTATTCAATGTTGCATGGTGCACTTGTTTATACCA ACGATATAGTCTATTAATGCATATATTAGTATCGCCGAACGATTAGCTCT TCAGGCTTCCTGAAGAAGCGTTTCAGTACTAAATAAGCCGATAGAGAGCC ACGCGACTTCGTAGCTCATTTTCCATAAGTGTTAACTTCCGCTCCTCGCT GCTATAACAGACATTCACTACAGTTATGGCGAAAGGTTGCATGCTGGGTG TT >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/1304_3321 CGTGCCTTTTGGAAGGAGGATCATGAAATGGGAAGAAGGCGAAGTCATGA GCGCCGGATTTAACCGCGCTAACCCTTTTATATACGAGACACAATGATAT TAACTGCTACAGGGACCCAAGGACGGGGTAAGAGTTGGATTAGGCAGAGA CAGGCGAATCGAGCAATCACTGCACGCTATACAGAGCCAACATTGAGTTA TTTTCAGGACACAACACAAGCCTCTGACCAGCGGAAATCAACAGGAAATT CGTTACGTTACATTCATCGGCTTGATCGCTACGAAAAAATCCTGGCCAGC AGAGGAATCAAGCAGAAGACACTCATAATTACATGAGCACAATTAAAGCA ATAGGAGGGGTCTGCCTGAGCTCCACTTGAGACCATCACCACAAAAGAAT TGCGGCAATGCTCAATGGATACATAGTACGAGGGCCAAGCGGGCGTCAGC CAAGTTAAATCAGATCAACACTGAGCGATGCATTCGAGAGGCAATAGCGA ATGGCCATATAACAACCAAACCATGTCGCTGCCACTCGCGCAGCAAAATC AGAAGGTAAGGAGGATCAGAGACTTACGGCTGACGAATCCTGAAAATTTA TCAAGCAGCAGAATCATCGCACCATGTTGGCTCAGACTTGCAATGGAACC TGGGCTGTTGGGGGGGGGGGGGGGGGGGGGTTACCGGGCAACGAGTTGTG ATGTTATTGCGAAATGAAGTGGTCTGGAATCGTAGATTGGATATCTTTAT GTCGAGCAAAGCAAAACGGCGTAAAAATTGCATCCCAACAGCATCTGCAA TATTTGATGCTCTCGCGGAATATCCATGAAGGAATACACTTGATAAATGA CAAAGAGATTCTTGGCGGAGAAACCATAATGCATCGTACTCGTCCGCGGA ACCGGCTTTCCATCGGCACAGTATCAATGGTATTTATGCGCGCACGAAAA GCATCGGTCTTTCTTCGAAGGGGGATCCGCCTACCCTTTCACGAGTTGCG CAGTTTGTCTGCCAGACTCTATGAGCAAGCAGATAAGCGATAAGTTTGCC CCTCAACATCTTCTCGGGCATAAGTCGGACACCATGGCATCAACAGTATC GTGATGACAGAGGCAGGGAGTGGGACACAATTGAACATCAAATAATGCCT TCTTCTAATTTGACTGATAGTGACCTGGTTCGTTGGCAACAAATGATAAG CATGCTTTTTTTATAATGCCAACTTAGTAATAAAAAAAAAGCTGAACGAG 
AAAACGTAAAAGATATAAATATCAATATGATTAAAGTTAGATTTTGCATA AAAAACAGACTACATAAATACTGTAAAACACAACATATGCAGGTCACGTA TGAATCAACTACTTAGAGAGTATTAGTTGGACCTGTTAACAGAGATTAGC GCAAGGTGATTTTTGTCTTCTTCGCGCTATTTTTCTGTCATCAAACTGTC GCACTCCCAGAGAAGCACAAAGCCTCGCAATCCAGTGGCAGAAGCTTTTG TGTGCACCCACTACGACCTGCATAACCAGTAGAAAGATAGCAAGTGATGT CAAACGACGCCAGCTGACTTCTTTGTCTTCACGACTTCCCCACACCCAGC ATGCATACCTTTTCCGCCATAACTGTAGTGAATGTCTGTTATGGGAGCGA GGAGCCGGAAGTTTCTAACACTATGGAAAAATGGCTACGAAAGTCCGTGC TATCTATCGGCTTATCTAGTAGCTTGAAACGCTTCTCAGAAGCCTGAAGA GCTAATCGTTCGGCGATACTATATATGCATTAATAGACTATACGTTGGTA TAAACAGTGCACCATGCAACATCGAAGAACAGTGGGTTTATCCAAAAGGA AGCAGAAAGCTAAATACTGGCAAACTACAATACGATGCCCGGTAAGTTCA TACTACGTAATTTTTAGATGGAAAACGTATGTACAATAGGAGTAACTTAA AGAGAGATCCTGTGTTGCCGCCAAATAAATTGCGGTTATTTTAATAAAAT TAAGGGGGTTACTATAT >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/3365_7852 ATATAGTAACCCTTAATGCTTATTAAAATAACCGCAATTATTTGGCGGCA ACACATGGATCTCTCTTTTAAGTTACTGCCTCTATTACATACGTTTTTCC ATCTAAAATTAGTAGTATTGAACTAACGGGGCCTCGTATTGTATTTTCCC ATATTTAGCTTTTCTGCTTCCTTTTTGGAATAACCCACTGTTATTCATGT TGCATGGTGCATGTTTATACCACGATATAGTCCTATTAAGTGCACTATAT AGTATCGCCGAACGATTTAGCTCTTCAGGCTTCTGAAGAAGCGTTTCAGA CTAATAAGCCGATAATAAGCCACGGACTGTCGTAGCCATTTTTCATGAAG TGTTAACTTCCCTCCTCGCTCATAAAGACATTCCACTACAGTTATGGCGG ACGGTATTGCATGCTGGTGTGGGAAGTCGTGAAAAGAAAAGAAGGTCAGC TGCCGTCGTTTGACATCACTGCTAGTCTTCTTACTGGTTATGCAGGTCCG TAGTGGGGTGGCACACAAAGCTTTGCCACTGGATTGCGAGGCTTTGTCGG CTTCTTCTGTGAGTGCCGACAGTTTGATGACAAAAAATTAGCGCAAGAGA CAAAAATCACCTTGCGCTAATGCTCTGGTACAGGTCCTACTACCATCTCA AGTGTTGATTCTAGTGAGCTGCTATGTTTGTGTTTTACAGTATTATGTAG TCTGTTTTTTATGCAAATCTAATTTAATATATTGAATATTTATATCATTT TTACGTTCTCGTTCAGCTTTTTTTATACTAAGTTGGCATTATAAAAAGCA ATCTGCTGTTATCAATCTTGTTGACAGACGAAAGGCACTATCAGTCAAAA TAAAATCATTATTTGATTTCAATTTTGTCCCACTCCCTGCCTCTGTTCAT CCACGATACGTGATGCCATGGGTGGTCCGACTTATGCCCGAGAAGATGTT GAAGCAAACTTATCGCGTTATGCTGCTTCTCATAGATGTCTTGGCAGACA AACTGCGCAACTCGCTGAAAGGTAGGCGGATCCCCTTCGAAGGAAATGAC CTGATGCTTTTTCGTGCCGCGCATAAAATACTTGATCCTGTGCCTGATGA 
AAGCGGTTCGCGACGAGTAGATGCAATTATGGTTTCTCTCGCCAAGGAAT CTCCTTTGCATTTATCCAAGTGTGTTCCTTCATTGATATTCCGAGAGCAT GCAATATGCATGTCAAGCTGTTGGGATGGCAATTTTTACGCCTGTTTTGC TTTGCTCGACATAAAGATATCATCTACGATATCAGACCACTATCATTTCG CATAAATCACCAAACTCGTTGCCCGGTAAAACAACAGCCAGCTTCCATTC AAGTCTGAGCCAACATGGTGATGATTGCTGCTTGCTTGATAAATTTTCAG GTATTCGTGAACAGCCGTAAGTCCTTGATCTCCTTACCCCTTCTGATTTT GCTGCGCGAAGTGGCAGCGACAATGGTTTTTGTTTATATGGCTTCAGCTA TGTGCCTCTCGGAATGCATCGCTCAAGTGTTGATCTGGAATTAACTTGCT GACGCCGCCTTGCCCTCGTCTATTATACCATTGAGCATTGCCGCAATTTC TTTTGTGGTAATGTCTTCAAGTGGAGCATCAGCAGACCCCTCCCCTTAAT TGCTTTAATTTTGCACATGTAATTTAATAGTGTCTTCTGCTGATTTCCTC TGCTGGGCCAGGATTTTTCCCCGTGTAGCGATCAAGCATAGGAATGTAAA CGTAACGGGAATAATCACTTGGTTGACTTCTCGCTGTCAGAGGCTTGTGT TTGTGTCGCTGAAAAATAAGCTCAATGTGGCCTGTATTAGCTTCAGGTGA TTGCGATCGCTGTCTGTCTCTGCCCTAATCCAAACTCTTTACCCGTCCTT GGTCCCTGTACAGAATATCCATTGTTTTCTTAGGTATACAGGGATAGGGG GTTAAATATCCCGGCGCTCATGACTTTCGCTTCTTCCCATTTCCTGATCC CTCTCAAAAGGCCACCTGTTACTGGGTCGATTTAAGTCAAACCTTTTACC GCTGATTCCGTGGAACAATACTCTCTTCCATCCTTTAACCGGAGGTGGGA ATATCCTGCATTCCCGAACCCATCGACGAACTGTTCAAGGCTTCGTTGGA CGTCGCTGGCGTGCGTTCCACTCCTGAAGTGTCAAGTACATCGCAAAGTC TCCGCAATTACACGCAAGAAAACGCCATCAGGCGGCTGGTGTTCTTTCAG TTTTCTTCAATTCGGAATATTGGTTACGTCTGCATGTGCTACTGCGCCCA TAGTCATCAAGTGTGTTCGTAGCAGTCGCTTATGTTCCTCCGCTTCGATA GACTCTGGTTGAATGGGTCTCATTCCATTCTCATGTGTGGACTCGCGAAG TGCATTTATCATCTCATAAACAAAACCCGCCGTAGCGGAGTTCAGATAAA ATCAAACCCCGCGAGTGCGAGGATTGTTATGTAATATTGGGTTTGAATCA TCTATATGTTTTGAGTACAGAGAGGGCAAGTATCGTTTCCACCGTACCTC GTGATTAATAATTTTGCACGGTATCAGTCATTTCCGCACATTGCAGAATG GGGATTTGTCTTCATTAGACTTATAAACTTCATGGAATATTTGTATGCCG ACTCTATATCTATACCTTCATCTACATAAACACCTTCAGTGATGTCTGGC ATGGAGACAAGACACGATCTGCACGACATTGATAACAGCCCAATCTTTTG CTCAGCACCTCTAACTCATTATACTCCATTTATAAACTCCCTTGCAATGT AAATGTCGTTTCAGCTAAACGGTATCAGCAATGTTTATTAAGAAAAGTAA GTAATACGTCAACCCGATGTTTGAGTACGGTCGATCATCTGACACTTACA ATGACTCTGGCATCGCTGTGAAGACGACGCGGAAATTCAGCATTTTTCAC AAGCGTTATCTTTTACAAAACCGATCTCACTCCTCCTTTGATGCGAATGC CAGCGTCAGACTCATATGCGATACTCACCTGCATCCTGAACCCATTGCCC 
TCCAACCCGTAATAGGCGATGCGTAATGATGTCCGTAAGTTACTAACGGG TCTTGTTCGATTAACTGCGCAGAAAACTCTCAGGTCACAGTGGCAGTGCC TTGATAACAAGGAGTCTTCCGAGGGATGGCGAACAACAAGAAACTGTTTC CCGTCTTCCCGGACTTCGTTGCTTCAAGTTTTAGCAATACGCCTTCCTCC CATCCGAGAGTATCCACCTTCGTATCTCACGCTGCCGTTGAGTTTTGACT TTTGCTTGTTTCAAGCTCAACACGCAGTCCCTACTGTTAGCGCAATACTC CTCGTTCGTCCTGGTAGCGGCGTTTGATGTATGGCTGGTTCTTTCCCGTT CATCCAGCAGTTCCAGCCACAATCGATGTGGTTGAGCCATTCATGAAAAG GTCTGCGTCAAATCCCCAGTCGTCTGCATTGCCTGCTCGTGCGCTTCACG CAGTGCCTGAGAGTTAATTTCGCTCACTTCGAACTCTCTGTTTACTGCAT AAGTTCCAGATCCTCCCTGGCATTGCACAAGTCCGGAACAACCCTGACGA CCAGGCGTCCTTCGTTTCATCTATCGGATCGCCACACTCACAACAATGAG GTGGCAGATATAGCCTGGTGTTCAGGCGCGCATTTTTATTGCTGTTGTTG CGCTGTATTCATTCATTTCTGATGCTTGAATCAATGACTGTCTCCATCTT TCATTACCCGGAACTGTCTGGTTAAATACGCTTGAGGCGTGAATGCGAAT AATAAAAAAGGAGCCTGTAGCTCCCTGATGATTGTTTGCTTTTCATGCGT TCATCGTTCCATTAAAGAGGCCGTTTAACATTGCCGAGTGCCAGGCTTAA ATGAGTCGGTGTGAATCCCATCAGCGTTACCGTTTCGCGGTGCTTCTCTC AGTAACGCTACGGCAAATGTCCTCGACGTTTATTATCCGGAAAGCTTGCT GTCTGGACTTTTTTTGATCTTCCGAAATTCAGCCTCGACGTGGCACTGCC GCGCAGGGCGTTTTTCCTGCTGAGGGTGTCATTGAACCAAGTCCCTCATG TCGGCAAGCATAGGCACACAGAATATTGAAGACCGCTGCTCAGAAAAATG CATTCCGTGGTTGGTCATGAACCTGTTTCTCTCATCTGCTTTCTGCTTTC GCCACATCATTTCCAGCTTTTGTGAAAGGGATGCGGCCTAACGTATGAAA TTCCTTCGTCTGTTTCTACCTGGTATTGGCAACAAACCTGATCCCGATTT GGCAAGGCTATGTGCCATCTCGATACTCGTTCTTAACTCAACAGAAGATG CTTTGTGCATACAGCCCTAGTTTATTATATTTATCTCCTCAGCAGCCCAC TGTGCTTTCCAGTGGATTTCGGGATAACAGAAGGGGCCGGGAAAATACCC GCTCGTTTGTAACGGAGTAGACCGAATGAATTGCGCC >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/115/7896_11181 GGCGCAATCACTTTCGTCTACGTCCGTTACAAGCGAGGCTGGGTATATTC CCGCCTTTCTGTTAGCCGAATCCACTGAAAGCACAGCGGCTGGCTGAGGA GATAAATATAGGAGAGACCGAGCGGCCTGTATGCACAAAGCTCGTTTCTG TTGAGTTAAGAACGAGTATCCGGAGATGGCACATAGCCTTGCTCAAATTG GAATCAAGGTTTGTGCCAATACCTAGGTATGGATAACAGACGAAGAATTT GCTACGTTTAGCCCGCATCCCTGTTTTCACAAAAGCTGGAAATGATGGAT GGCGAAAGCGCCCCCCAGAAGCAGATGAGAGAAACCAGGTATGACAACCA CGGATGCCATTTTTTCTGGCAGCGGGCTTCAGATTCTGCGTGTGCTTATG 
CTTGCCGACATGGGAGCTTGTTCAATGACACCTCAGCAGGAAAACGCCCT TCGGAGCATTGCCCGTCATGGCCTAATTCTGAATCAAAAAAAGCCAGGAA GCAGTTTTCCGGATAAAAACAGTCGATGACAATTTGCGTAGCGTACTGAA GAAGCGCCCGCCGAAACGGTAACGCTGATGGGATTCACACGGACTCATTT AAGCCTGGCAATGCGGCTGTTAGAGATGGAGTCTTTAGGAACGACTGAAC GAGTGAAAAGCAAAGCATCCATCAGGGAGCCTAAAGGCTCACTTTTTTTA TTATTCGCATTCACCCTGCAAAGCGTATTAACCAACAGTTCAGGGATTAC TGAAAAGATGGGTCAGACATCATTGATTCAGCATCAGAAATAGAAGAATT ACAGCGCAACACAGCAATAAAAATGCCGCTGAACCCACCAGTCATATATC TGCCACTCATTAGTTGTGAGTGGTGCGATCCGATAGATGAACGAAGACGC CCTGGTCGTTCAGGGTTGTCGGAACTTTGTGGCAAGTTGCCAGGAGGATC ATGGAACTTATCTGTAAACATGAGAGGTTTCAAGTGAGCGAAATTAACTC TCAGAGCCTGCGGTCAGCGGCAGAGCAGGCGAATCGCTGACGACTGGGGA TTTGACGCAGACCCTTTCCATGAATATGGTAAACACCATCGATTGTGCTG GAACTGCTGGATGAACGGGAAAGAACCAGCATACATCAAACGCCGGCGAC CTAGGAAGACGAGGATATTTGCTGCTAACAGTAGGGAAAACTGCGTGTTG AGCTTGAAACACGAAAATCAGAAAACTCAAACGAGCAGCGTGAGTATTAC GAAGGGTGTTTATCTCGGATGGGAGTAAGCGGTATTTGCTTAAAACTGGA AAGCAACCGAAGTCGTGAAGACGGACCAGTTTCTTGTTTTCCCATCCTGG GAAGACCTCCTGTTATCAAGCACTGCCCTGGTGCCTGGAGAAGTTAATGC GGCAGTTAATCGAACAAGACCCGTTAGGTAACTATCTGACATCATTAACG ACATCGCTATTTACGGGGTTGGAGTCAATGGGTTCAGGATGCAGGGCGGA GTTATCTGGCTATGAAGTCTGACGCTGGCATTTGCATCAAAGGCGAGTGA GGATCGTTTTTGTAAATATAACTGCTTGTGAACAATGTGAATTCGCGTCC GGTCTTCACAGCGGATGCCAGAGTCTGGTAGTGTCAGAATGATGACCGTA CTCAAACATCGGGTTGAGTATTATCTTACTGTTTCTTTACATAAACATTG CTGATACCGTTTATGCTGAAACGACTACATTGCAATGGAGTTTATAAATG AGTATCAATGAGTTTGAGTCTGGAGCAAAAGATTGGGGTCGTATCATGTT GTGCAGATGCCGGGTGTCTTGTTCTCCATGCATGACATCACGAAGGTGTT TATGTAAGATGAAGGATAGATAGAGTCGGCATAACAAATATTCCATGAAG GTTTATAAGTCTAATGAAGACAAATCCCATTCTGCAAGTGCGAGAAATGA CTGATGCACGGTGCAAAATTTATTATCACGAGTACGTGTGGAAACGATAC TGTGCCCTCTCTGTACGAAAACATATAGATGCAATTAAACCCAATAATTA ACATAACAATCCTCGCACTCGCGGGGATTGTATTTTATCTGACTCGCTAC GGCGGGCTTTGTTTTTATTGATGATGATAAATGGCCTTCCCGAGTCACAG GAGACATGGAATGGAGAGCCATTCAACGAGTGGTATCCGAAGCGGGAGAA CGATCAACGACTGCTAACGAACACTGGATGATATTGGCGCAGTAGACCAT GCAGACGTAACCAATATTGAATTGAAGAACTGAAAGAAGCACCAAGCGCT GATGGCGGTTATTTTCGTTGCGGTATTGCCGGAACTGTTGCGGATGGTAA 
CTTGACACTTTCAGGGTGGTAAACGACGCCAGCGACGTCCAAGGAGCCGT TGAAACAGTCGTCCGCGCCCTTCAGCGGAATGCCGGATCTCCCCAACTCC CGTTAGAGGATGGAAGAGAGTATCTGTTCCAGCGAATCAGCGGTAAAGGT GACTTAATCAGACCAGTTACAGGTGGCCTTGTTGAAGAGGATCAGAATGG GAAGAAGGCGAAGTCATGAGCGCCGGATTTACCCCCTAACCTTTTATATT AAGAAACAATGGATATTATGCTACAGGGACCAAGGACGGGGTAAAAGGTT TGGGAATTAGGCAGAAGACCAGGACGAATAGCAATCCTGAGCTAGTACAG GCCAACATTGAGCTTATTTTCAGGACCACAACACGCTAAGCCTCTGACAG CGAGAATCAACAGTGATAATTCCCTGTTACGTTACATTCATTCGGCTTGA ACGCCTACGAATAAATCTGGCAGCAGAGGAATCCAAGCGAAGCACTCATA AATTACATGGCAAGAAATTAAAGCAATAAGGAGGGGTCTGCCTGATGCTC ACTATGAAAGACATTCACCACAACAGAATTGCGGCAGTGCTCAATGGATA AACAGTAGACAGGGCAAGGGCGGCGCTCAGCCAGTTAGAATCAGTATCAA CACCTGAGCCGATGCATTCCGAGAGGCAATAGCGAAGCGCCTATAACAAC AAACCATGTCGCCTGCCAACTCGCGGCAGGCAAAATCAGAGGTAACGAGA CAAGAGACTTACCGGCTTGACGGAATAGCTGAAAATTTTATCAGCAGCAG AATCATCCCATGTTGGCTCAGACTTGCAATGGAACTGGCTGTTTGTTACC GGGCAACGAGTTGGATGATTATATGCGAATGAAGTGGTCTGATATCGTAG ATGGATATCTTTTATGTGCGAGCAAAGCAAAAGCA >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/147/0_6711 CGTGGCCTCCTCACGGGGACAGAAAGCGTCCCACGCTGTACGTTTTCAGT TGGCATTCAGCCCGGCAGGCGTCGCTGAGAATAGCCATCACCAAAGCGCA CCTTTCTTACGAAGGGACCGAAGGCCACATCCATACCGGGTTTCACTTTC CAGCGGAAGGTCTTTCATCGGTCCACCTCCGGAGAAACAGGCCACCATCA CGCATGCTGTGTCTGAATTGTCATCCACGGGCACCCTTGGGCGGCCCATG TCCATACCACCAGCCTTCAGAGCGCCGGACTATCTGCCCGTTCGTGCCGT CGTTGTTAATCACCCATGGTTATTCTGCTCAACGTCTCCGCGGACGCTGC GACGGCTGTTGCCATGCTGCCCGGTGGTACCGACATAACCGTCCGGTGAC ATAAGCCCGCGCATCAGCCGGTAAAGATTCCCCACGACCAATCGCGGCTG GTTGCCGTCCTTCGTGGAAGACCAACTCACCACTGGTGAACAATCCCCGC TGGCTCATATTTGCCGCCGGTTCCCCGTAAATCCCTCCGGTTGCAAAATG GAATTTCGCCGCAGCGGCCTGAATGAGACTGTACCGCCTGACGCGATGCC GCCGCCACCAACAGCCCCCCGCCAATGGCGCTGCCGATACTCCGACAATC CCACCATTGCCTGCTTAAGCGAATTTCTGTCATCATGGACAGCACGGAAA CGGGTGAAGCTGGCGCCAGTTCTTGCACACCTGCCCGTCAGCACTCGCCC GCACATACTTCTGTGCCAATACCTTCAAAAGGTCTGCGTGGCTGCCCTTT TTACCTGCGAACATACTGTCCGTGGCGCTCTCTTCCCACTCACGATCCAG CCGGACTTCCAGGCCTGCCATCCAGTTCCCGCGAAGCTGGTCTTCAGCCG CCCCAGGTCTTTTTCTGCTCTGACATATGACGTATTTCAGCGCCCAGCGG 
ATTAGTCGCATACTGTTCCTTCAGGCGCTGTTCCGTGGCTTCCCGTTCTG CCTGCCGGTCAGTCAGCCCCCGGCTTTTCGCATCATGGGGCCCGTTTTGC CGTTGCTGCCTGTGCGAATTTATTCCGCCTGCTGCGCCAGCGCGTTCAGG CGCTCCCTGATACGTAACCTTTGTCGCCAAGTGCAGCCCAGCTGCGTTTG GTACTCCAGCGGTCTCATCTTTATGCGCCAGCAGGGATTTCTCCCTGTGC AGACAGGCCTTGCGACGTTGCGCCGCCTCCTCCAGTACCGGCGAACTGAC TCTCCGCCTTTCCCACAATCCCGGCGCTGCTGGCTATTTTCTCATTTGCT CCCCGGCAATGGCTTCTCCAGCGTCCGGAGTTCTGCCTGAAGCGTCAGCA GGGCAGCATGAGCACTGTCCTCCTGACGAATCGCCCGCAACCACCTTCAG CTGGACCTGTTTCGGCTTTTTCAGCGTCGCTTCATAATCCTTTCTTTTTC GCCGCCGCCATCAGCGTGTTGTAAATCCGCCTGCAGGATTTTCCCGTCTT TCAGTGCCTGTTTCAGTTCTTCCTGACGGGCGGTATATTTCTCCCAGCGG CGTCCTGCAGCCGTTCCGTAAGCCTTCCCCTGCGCCTCTTCGGTATATTT CAGAGCCGTGACGCTTCGGTATCGGCTCTGCTGCTGCGCATTTTTGTCCT GTTGAGTACTGCTGCTCAGCCTTCCTTTCGGGCGGCTTCCAGCGCAAGAC GGGCCTTTTCCAGATCATCCCAGTAACGCGCCCGCGCTTCATCGTTACAA ATAATCATCCTTGCGCAGATTCCAGTATGTCGTCTGCTTTCTTATACGCA GCCTCTGCCTTAAATCAAGCATCTCCTGCGCGTATCAAGGACGACCAATA TCCAGTCACCGCATCCACATGGATTTTGAATGCCCGCGCAGTAACCCTGT CCTGCCCAGGTCTCCAGCGGTGCCCATGTTCTCTTTCAGTGCGGCGGGTC TGGTCATCAAACCCTTTCGTTGCGGCTCGTCCGCCGCCTTGCAATGCCCC GGCTTCATCGCCGGAACGCATGCACTGAGGCACATACGCAATCTGCTCCG CCGACACGTTATGGGAAACTGGCGAGCCATCGCCCGTCAGCCCCGACGTC GGGTCTGTGGTCACTTCCCGAAGGCTTCAGCGACCTTGTCCACCTCCACG CCGGATTGCAGAGGAGAAAACGCGCCACACTCTGGCCTGGTGGACTCAAT CCCCTCGAGCTCACTCGCTTACCCCAGCCTTAACCAGTGCGCTGGAAGGT GGACTCGCTGGTCCTGGTTAAACGTCAGCCCTGCGCCTGCCCGGCTCTGG ACAGCCAGCATACGATCTGTTTGCCGTCAGTCCCGCTGATTGCGGAAAGG GACCAGCGTTTGGTTGAATCGGACAGGGGTTGAGTTGCCCTGATACCAGC ATTACCCAGCGCAACCGGGTCGCCACGCCAGGCGAGGTGGCCCCCCCATC GGCAGGGTGATCGCAACCGGCAAGCCCTGGAACCATGGGGATCATCCCGC CGAAGGAGTCCTTCACCTGCCCCCCTGTTGCAGAGGATTCAGCCACGGAC TTTGCCCGCCTGGCAAGCTGCCGTGGCCACGGTCGGTGAACTGTGCAGGC AGCATACGCATGGCGGCTTTATACTGCCCCGACGGAAATCCCCTCCCCCG CTTTCTGTGCAGCAGCGCCTGTCCCGGCTCAGCGACTGTTCAACGACCTG CCGCTGTATTTTTTCGCATCACTTTCCGTACCCTAGAAAAATGACGCCTG ACTCTGGCCTCTGCCTCGTTCAAATCTGGCCGCATCCAGACTCAAATCAA CGACCAGATCGCTACCGGTTCAGCCATACCGACTCCTCCTGCGATACCTT CTGATACTGATCATAAGATGTACGTCATCCTCCGTCATGTCGGCCACCCT 
CCGGGGAAGCGGGGATAAACGTTCATTCCCGTCCGGGCCCAAAGCGGACC TTCCGGAGCCTGCCAGCCTTTCTGCATCAGCCATCATCTTCAGGTCTTCG TCCAGCCCTCGCCCGGTTCAAGCAGACTGAAATCGCATGCGGCTGCATAT CCGGATCGCTGAAAACAGGCTGGAGCACGGTGTACGTAGCCCGGAAAAGT GCATATCCATAGCAGAACATCATGAAAATAATGGGTACTGTAAAGCGGGC CAGTCGGCATACTCCGTGGATGACATCCCGGCAAGCATGGACGCCAGTCG GGTCGCCCATCTCACGCGGCCAGTTTCAGGGCAAAACTCAGCTCCCGTCG AACACTTCCCGCAGAAACAGGCTCTGCGGGCCCCGAGCGTCCTCTGTCTG TCTCAGGGGCATAATTTCACCAAAAACTTCCATACTACCAGACAGCCGGT ACACCACGTTTTCAGCATGAGAAATTGCCCTCGTGGGCCAGGTTGGGTAA GCACTTCCTGTCTCAATCGTTTAACGGCTTCATTCATGGACGGCATCCTG CGTCTTCTGCGGATGGTTATGCACAGGGACAGTCGCCACCAGAAACGCGC CGGTTCCAGGCGTCCTTACCACAGTAACGTTCCCGGTTGCTGTCTGATCC GCCTGTTTGCCTGCGTTTCATCAGGGCGAGATGCTCAATGCGCTGCAGGG CCTGACAGGTTCAGAAAGGCGTGACGGTTCACACCCCCCCGGTTATGTTC AAATGATTTCCGGGTTTTCAGCGAACATGCTGGACTCTCACGATTAACTC GGTGACGGTAATTTCTGCAACCGGCAGCAACTCACCATTACCGGTACACC GGGAATGTTGACCTTGCCTGCGCAACGCCGTTCAACGGTGATTGGTCATA CAACTGACCGACAACGGTGCTTTGTTTTTATACGCAGACACGCACCGAAA GCTCTTGTTCGGTTACGCCCTGTCCGGCTGGAAGGGCAACGGTCAGCGTG GTGCTCTGCTTTCACCACCGAGGTGCTGGCAGTGCGTCAGGTCCATTGCC GCGTTTGCCGCTGTTACCGTGCTGCGATCTTCGCCATCGACGGGACGTTC CCACATTGGTGACTTTCCACCGTGCCGGGGTGAATCACTTCACCTTCGCG TCACCGCCTTACCGGATACTGCTGACCCAGCCACGGAACACATCGACCGT GCGTTCGGGAAGCGATTTTTATAGGCACGGGTATCGCCTTCATTAAACAC GCCGCAGCGCCTGCTGCCCCTGCTCTCCGGCATCCACGCCCAGCGTGAAA GCTGGTATCTCACGGCAGATTTCTGCACTGCCCGGTCGCAGTCCAGTCTG CATCTTCATCATCGAGAAGCTGTCGTCATAGGGACTCAGCGGTCAGTTTC GCCGGGCGTCAGGTCTTTCAACTTTGCCGACGCGACCAGGTCAACGTCTG AAAGCGGATTCGCGTAGGGTCACCGCTCCCCTATATAAACCCACAGGGTG GTCCCGGCACCTTTTCCCGGCATTTGTAGGATTTGGTACAGGCATAGCGT CCTCACATTCATAGGTAATGACATAAGTCAGGATCGGCTGAACTCACAGC CCGCATCATCGTCGCGCCGGTAGTCAATAGCCGCTGGCTCACCATACTGG TGATCAAATTCTGCAGTGCGGGATATCGTTCATCACCGGATAAATTCGGG GACCCCATCCACGCATCAAGCTCTGAACTCGGCACCTGACAGGCAGGAAA ACTTCGATATGCAGCTCCGCCTGCCAGGTATCGCTGTCCAGCTTTCGCCC GTGTATCAGCGCCGTGAGATAAACGGCAACTGCCGGAAAATCCGCCTCAT CAAAAACAGCGGGGGCGACATCAAAAACGTCGCTCCGTGTCATGCTTCTC GTGCATCCGTACGGCTGCAGCGGAGTTCAGTATGTTTCATCGCTTTATTA 
CCATCCTCAGTTTGAGTGCTGCAGCGCATAGCACAGCTTTCGGAAGACGT TCACGCCGTATCCGCTCATATTTGTTTAAACGCCGTGGTCAGCGGCACCG CATCGATTTTCCACCACATCAATGGGGTAACGGTTTTTTCCCAGCCACAC GCTGATGACATGCCACCGGCCATTTTTCAGATTGCTGAATAAACCGCCGG GAATACGACGGTTACCACCAAAACACCTGCCGGCCACCTTTCAGGGATGA ACGCTGCCCCTTTTTACGACCTGCGGCGCGAAAGGACACCCGCGCTTACC CCAGCTTGATTACGGGCAATCCCCCGGTTAACTTTTGAACTCTGGCCTGC GGAATTTTTGACCGTGGCCCTTTTCACCTGGCCCTTTCCTTTACCAGTTT TCCGGCGTACCTTTGTCTCACGGGCACCTTGTGAACGCCGACCTGCGGAT ATCGCAGGGATGAAGCCAACGCGGTTAATGCATTGCGGCTGGCACCAGGA ACCCGCCGTTTTTGCTGATAGGCTGAGGTTTTCAACGGCTCTGCTCAAGA CCTTTTATGGCCATACCATCCCCTTTCAGCGGCGACGGTTAAACGCGCAG GCGGGTACGCCCGTCCAAGCCAGAGATACAACTTCCGCCATCATCCGGCG AAACCCGATATACCCATGAATTTTTCCTCACCGATGGTCCCAGCGTGTCT CCAGCGCCGCAGCTGGCCTCCCTCATCAGTCCGGCAACAGGGACGGGCTG GAGCCTTCAACGCGCACGCCCTGTCCCGGCTAGCTGATAATTTTCAGGGT CATCAAAAACACCACGTATCACCGCACCTGACTGCTCACCGGATGTAAGT GGTGGCTGACGTTTCCCATTTACCCGCGTATGTTTTTCATCGGCCGCGGG CATGGCAGCATCGAACAGGTTATCGAAATCAGCCACACCGCCTCCCGTTA TTGCTTCTGGCCAGGCCGCGCTCTGTCATTTCCGGCTGCCAACCGGCAGA GACACGAAACGCCGTTCCCGGCGCTACAAATGCCACAGGTTCATCCGCTC GTGGCGTGAAGTGCATCAGTATGCAGTTCAACCAGTGCCACGAACGTGAA CAGTTCAGACGTATCCCAGATCACGGTATCCGGCTGCGGCTGATCCCACC TCATGTTCCATGTGCCGGTCAGCACATTTTCCGGGCTGAGAGGGGTGTCC CTGACCGGCGTTTCATGTCCGTAGTCCTCAAGCCCTCTTTCAGCTCTGCC ATAACGGGAGCGCCAGTTTCTTTCTTCGTCCCGTCAGGCTGACATCACGA GTTCCCCAGTTTGTTACCCAGCCGAGCGGGAGACGGGCAATCAGTTCATC TTTCGTCATGGACATCCCTCCCACAGAGGAAAACAATGGCCCGAAGGGCC ATGATTAACGCCAGTTGTACGGACACGAACTACATCAGGGTCAGCCAGCA GCATCAGCGGGTGCTGACTCGATCATGGTGAACTCACGCGCCGGGATTCT GCCGGTGGTCACCAGTTTTTTCGGGTACGGGCAAGAGGGTTAATGCCTTC GCGCTGGTGCGTCTCGCATCAACCTGAAATGCAGCCATAGGTGCGCAGAC CGCGTGGCCTGAGTGGTTCCCCAGCACCATCGTGTTGTCCGGCAGGAAGT TCTTTTTTGACGCCCGTTTTTCCCACGTACTGTCCGGAATACACGACGGA TGGCCACATCGCCTACATCCCCTTATAGGACCGCCGCTTTGCCCAGGTCT TGTCACCGCTGTCTCACCAGCTCGAAATTAGAGACCCGACGGGTATCTCC AGCTTCTCCTTGACGGCTTTGAAGGGAACGGAAAAGCGCCAGCCTTTTCA GGTCGAACACG >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/171/2019_3396 
GCCGTAAACCTATGGGTGGAATAAAACATGGGACAGAATCACCGATTCTC AACTTAGCGAGATTACAAAGTTACTGTCCAAACGGTGCAATGAAGCCAAG TTAGAACCTCCCGTCCAGAATGAAAATATTACAAGCCAGCAAGGCGGCAT GTTGGGGACCAAGGGTAAAGAACATCTCAGATGGTGCATCCCCCTCAAAA ACCGAGGGAAAATCCCCTAAAACGAGGGGATAAAATCCCGTCAAATTTGG GGGATTGCTATCCCTCAATAACAGGGGACACAAAGACACTATACAAAGAA AAAGAAAAGATTATTCGCCAAGAGAATCTGGCGATCCTCCTGACCCAGCC AGAAAAAACGGACTTTCTTGGTGAGATCGGAATGCTTGGCAATTCGAGCG GCAGCAGAGGGAACAGCAGAAGACCTGGACCCGCCGAGCAGAGTGGATCG TTTGACATGGTGAAACTATCGCAACCATCTCAGCCCAGAAAACCGAATTT TGCTGGGTGAGGCTAACGGATATCCGCCTATGCGTGACGGACGTGACCGG ACGTAACCACCCTGCGACAGTGTGTGGTGACGTTCCAGTGGGCATTGCCC CCAGGACAACTTCTGGTCGGTAAACGTGCTGGTAGCCGCCAAACTTTCCG CGATAAGTGGACCCAACCCTCGGGAAATCAACCGTAAAGCACAGGCACAG AGGCGGTGACAGCAGCAAACCAAAAAACCGACCTGCCCCACACAGACTGG ATTACGGGTGGATCTATGAAACATCCCGCCGCACGATGGTTAACCTTTGA CCCTGAGCAGATGCCGTCGATCGCCAACAACTGCCGGAACCAGTACGACG AAAAGCCCCCCGCAAGGTACACAGGTAGCGCAGATCATCAACGGTGTGGG TTCAGCCAGTCTACCTGGCAACTCTTCCCCCCGGCGACCTGGCTAACCGT GGACCGAACGAAAGTAACGAAAATCCGTCGCCAGTGGGTTCTGGCTTTTT CGGGAAAACGGATCACCCACGATGGAACAGGTAACGCGATGCGCGTAGCC GTCCGGCAGAATCGACCATTTCTGCATCACCCCGGCATGTTGTTGATGGT GCCGGAAGGAGCATCCGTTAACCGCCGGACTGCCAAACCGTTCCAGCGGA GCTGGGTTGATATGGTTTACGATTATTGCCCGGAAGCGAGGCCTGTATCC GGAATGCCTCGGAGTCCTCTTATCCGTGAAACAAAACGCGCCCACTATTC CTGGCTGGTTAACCAACCTTGTATCAAACATGCGGGGCCATGCGCTTACT GATGCGAATTACGCCCGTAAGGCCGCAGATGAGCTTTGTCCATATGACGG CGAGAATTACCCGTGGTGAGCCGATCA >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/180/0_2599 GGCCGCAGAAAATGAGCTTCGTCCATATGGACTGCGAGAATTAAACCGTG GTGAGGCGGTATCCCTGAAACCAGTAAAACAACTTCCTGTCATGGGGGCG GTAGACTCTAAATCGTGCCAGGCTCTGGCCCGAAGATCGCAGAAATCAAA GCAAGTTCGATACTGAAATGGAGCAAGTGTTGACGGGCAACGAGAAGGCA ATTATTCATTACGCCCGTGGGGACGAACCGATAATGAGCGCCATTCTGTG CGCCGGACCGTTTGCCAGCGCTAACAGCGCAACACGTAACCGCAAACTCA GGCAGGGCCGCGGCTAAAATGGCCACGGGCAGGGTTCTTCTGGGTGACTC GAAGGTATGGCTGGCGAACGGTGGTATTACCGCTTTGCTACCAGGGAACC AACGGAAGGCCGCTCGACGCCCGAACTTGGTTTTTAAGGAGTGTCGCCAG AGTGCCGCGATGAAACGGGGTATTGGCGGTAATGGGGAGTAAAAGATGAC 
CGATCTACATTACTGAGCTAATAACAGGCCCTCTGGTAATCCGCAGGCTT TTTATTTGGGGGAGAGGGAAGTATAAGAAAAAACTAACCCTTTGAAATCA GATCTCCAGGCATCAGTCAGCAAATACAGCTATTTGCCAACGGCAGTACG ATTGGCAAAATCCTTTGATTTTACCAGAACCAACCATACCATAAACCTCG GTAGTAAACTTCCAGTTAACGGTTTTAAACGCATCGACAGCCGCAGCTTT TAGCCCTGGTCAAAACACCTGATTAGCTATTGGGTCCTCGCTTTAGGTGA CGTCTCCTCGTCAGTTGAATGGCAATGGTCCCGCAGAGGATGGGCGATGC CAGCTATGCAAAGCGCTTATGTCGGTTATGTTTCAGCGCGCACCTTAACG CCACGGCCATCTTGATAGCGCAGATGTTGTTCCTAAGATATGCGGCGATG TCTTTTATAGGTAATAGGCCAGTCACCAGACATAGCGAATGCGTTGCGAC TGCCTGCGAATTTGCGGAGTATTAGAGCTTATACAGGCTTACGGTACAGA GCGTGGCGTTAAGTGGTCATACGAAGCGAGAACCTGGCTCTAGGAGTGGA AAGCCGAGATGGGGAGAACAGGGCTGCATGATAAATGTCGTTAGTTTTCT CCGGATGGCAGCGACGGTCCAGCATATTTGCTCTGCTAATGGGAGCAAAA GGACGGCCAGGTAAAACGTGCTTACGTTTTCATGGATACAGGTTTTGTGA ACATAAGTGACATATCGGTTTTGTCAGGGAAAGTTGTGAAGTTCTGGGAT ATACCGCATCACCGTATTGCAAGGTTGATATCAACCCGGAGCTTGGACAG ACAAATGGTTATACGGTATGGGAACCAAGGTCTCATTCAAAGCGAATGCC TGTTCTGAAGCCATTTATCGATATGGTTAAAGAATATATGGCACTCCATA ACGTCGGCGGCGCGTTCTGCACTGACAGATAAAAACTCGGTTCCCTTCAC CAAATCAAATGTGATGACCATTTCGGGCGAGGGAATTAACACACGTGGAA TTTGGCTACAGAGCTGATGAACCGAAGCGGCTAAAGCCAAAGCCTGGAAT CAGATATCTTGCTGAACTGTCAGACTTTGAGAAAGGAAGCTATCCTCGCA ATGGTGGAAAGCAACAACCCATTCGATTTGCAAATACCGGAACTCTCGGA ACTGCATATTCTGCATATAAAAAATCAACGCAAAAAATCGGACTTCCTGC AAAGATGAGGAGGGATTGCAGCCGTGTTTTTAATGAGGTCATCACGGGGA TCGCCATGTCGTGAACGGACATCGGGAAACGCCAAAGGAGATTATGTACA GAGGAGAATGTCACCTGGCCGGTATCGCGAAAATGTATTCGAAAATGATT ATCAACCTGGTATCGAGGACATGGTAGAGCTAAAAGATTCGATACCGGCT CTTGTTCTGAGTCATGCGAAATATTTGGAGGTGCAAGCTTTTGATTTCGA CTTCGGGAGGGAAGCTGCATGGATGCCGATGTTATCGTGCGGTGACTGCA AAGAAGATAACCGCTTCCGACCAATCAACCTACTGTAAATCGATGGTGTC TCGTGTGAAAGAACACCAACAGGGTGTTACCACTACCGCAGGAAAAGGCG GACGTGTGGCGAGACAGGACGAAGTATACGACCCATAATCCTGCGAAAAC TGCAAATACCTTCCACGAACGCACCAGAATAAACCAAGCCAATCCCAAAA GATCTGACCGCTAAAACCTTCAACTACACGGCTCACCTGTGGGATATCCG GTTGAGCTAGACGTGTGGCGAGGAAAACAAGGTGATTGACCAAAATCGAA GTTACGAACAGAAAGCGTGCGAGCGAGCTTTAACGTGGCGCTAACGCGGT CAGCAGCTGCATGTTGCCTGGAAGTTCACGTGGGTAGCCACTGCTGCGCA 
GAACTGAATGAGCGATCCGAATAGCTCGATGCACGAGGAAGAGAATGATG GCTAAACAGCGCGAAGACCGATGTAAAAATCGATGAATGCCGGGAATGTT TCACCCTTGCATTCGCTAATCACGTGGTGGGTGCTCTCAGAGTGGTGAA >m120619_015854_42161_c100392070070000001523040811021231_s1_p0/180/2645_6512 TTCCACACTCTGAGAGAGCACACCACTGATTAGCCGGAATGCAGGGTGAA ACATTCCCGGCATTATCCGTTTTTTACAGCGTCTTCGCGCTGGTTTAGCC ATCATCCTTCTTCTCGTGCAATCGAGCTATTCGGATCGATCAGTCAGTTC TAGCGCAGCAGTGCTCACACACGTGAACTTCCGCACATGCCGCTTCTGAC CGCAGTTAGCAGCACGTTAAAGCTCCGCTCGACGCTTTCTCTGTTAGTAC TTCGGATTTTGGTAACACCTGTTTTCCCTCGCACGACTCTTAGCCATCGG ATATCCCACAGGTGAGCCGTGTAGGTTGAAGGTTTTTACGTAGATCTTTT GGGATTTGGACTTGGGTTTATTTCAACGGTGCGTTTCGTTGGACGGGTAT TTCAGTTTTCGCAGAATTAATGTCGGCTGATACTCGTCGCTGTCCCTCGC GCACACGTCTCCGTTTTTCCTGCGGTAGTGGTACACCTGTTGGTGTTTCT TTCACACCGGAGACCCATCGATTCCAGTAGGTTGATTTGGGTCGGAGCGG TATCTTCTTTGCTCACCGCACCGATACCATCGCATCATGCCGCTTCCTCC CTGAAGTCGCAATCAGCTGCCTCCAAATATTTCGCATGACTCAGACAAGA GCCGTATCGAATCTTTTAGCTCGTACCAAATGTCCCTGATACAGGCTGAT TAATCATTTTCTGAATAACATTTTTCGCGAATACGCGTCCCAGCGACATT CTTCCTCGTACATAATATCCTTGGGGCGTTTCCCGATGTCCGTCACGTCA CATGGGATCCCGTGATGACCTCATTAAAACACGCTTGCCATCCTCTCATC TTTGCAGGCAAGTCCTGATTTTTATGCGTCTGATTTTTTAATGCAGAATA TGCAGTTACCGAGATGTTCCCGGTATTTTGCAAATCGAATGTGTTGCTAT CCACCATGCGAGATTTCTTCCTCTCGAAAGTCTACAGTTCAGCAGATATC TGATTCCAGGCCTATTGGCTTTAGCCGGCTTCGGTTATCAGCTCTGATGC CAATCCACGTGGTGTAATTCCCCTCGCCCGAATGGTCATGCACAGTATTT GGTGAAGGGAACGGTTTTTATCTGTCAGCTGCAGAACGCGCCGCCGACGT ATGGAGTGCCATATTTTCCTTTACCATATCGATAATGGCTTCAGACAGGC ATTACGCGTCCCTGAATAGTCCTTTGGGTTCCCATACCGTATAACATTTG GCTGTCCAAGCTCCGGGTTGATATCAACCTGCAATACGGTGACCGGTATA TCACCAGAACTTCACAACTTCCCCTGACAAACCGATATGTGCATTGGATG TTCACACTGTATCCATGAAACGTAATGCACGTCTTTACCTGCCGTCGCTT TTGCTCCATTAGCGCAGAGCAAATATAGCTGACGTCTGCACCGGAGAAAC CTAACGACTTTATCATGCAGCGCCTGTCCTCCCCCATCTCGCCTTTCCAC TACAGAGCATATCTCAGCTTCGTCTGACCACTTAACGCCACGCTCCTGTT ACCACGAATGCCTGGTATAAGCTTCTATAGCTGCCGCAATTCCCCTACAC GCATGCCTGCTGGTTTGACTGGCCTATTACCACAAAGCCATTCCCGGCAA GCTTAGGAACAACATCCCTGCTGCTTTAATGCTCGGTAGAACAGCACACT 
GCTCCAGCTTTCTGCCTCCCAAGCAGCGACCATGCCATTCAACCTGAAAC GAGAGAGACGTCACCTAAGCAGGCCCATTAGTTCCCTGTTTTGGTCTAAG CTCGCGGTTGCGTTCCTGAATGGGTTACTACGATTGGTTTGGTTGGGATG CTGGTAAGGATTTGCTTGTACTGCCGTGAATAGCGTTTTGCTGAGTGCTG GGGATATCGAATTCCAAAGGTTAGTTTTTTTTTCATGACTTACTTCCCAT pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_bas.fofn000066400000000000000000000002061241505617700244750ustar00rootroot00000000000000/mnt/secondary-siv/testdata/BlasrTestData/pbalign/data/ccs_P4-C2/m130328_211423_ethan_c100499512550000001823070408081371_s1_p0.bas.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_ccs.fofn000066400000000000000000000001021241505617700244730ustar00rootroot00000000000000/mnt/secondary-siv/testdata/BlasrTestData/pbalign/data/new.ccs.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_filterAdapterOnly.fofn000077500000000000000000000001641241505617700273660ustar00rootroot00000000000000/mnt/data3/vol57/2820011/0006/Analysis_Results/m130302_124313_42130_c100502672550000001523078308081365_s1_p0.bas.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_leftmost_query.fasta000066400000000000000000000060331241505617700271640ustar00rootroot00000000000000>ref000001|gi|49175990|ref|NC_000913.2| Escherichia coli str. K-12 substr. 
MG1655 chromosome, complete genome AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC 
AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_leftmost_target.fasta000066400000000000000000000437521241505617700273160ustar00rootroot00000000000000>ref000001|gi|49175990|ref|NC_000913.2| Escherichia coli str. K-12 substr. 
MG1655 chromosome, complete genome AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC 
AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT 
GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT 
GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG 
ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG 
CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC 
TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG 
CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT 
ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC ACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGC ATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATG GTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTG ATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTG CGAGCTGAACGGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCACC TTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTC GCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTAAATAACGATGATGCG ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTG TTTGTGATTGGCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACTCGAAGGCT CTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCC AAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAAAGAATATCATCTGCTGAAC CCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTG CGCGAAGGTTTCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTAC TACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCTATGACACCAAC GTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAA TTGATGAAGTTCTCCGGCATTCTTTCTGGTTCGCTTTCTTATATCTTCGGCAAGTTAGAC 
GAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCG GACCCGCGAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGT GAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTGTGCTGCCCGCAGAG TTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTC TTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAAT ATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTG TTCAAAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTG CCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAGCTGCCGGTGTCTTTGCT GATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCC GGCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGT TGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCGGCAGAGACATTCAGTCTCAA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_pulseFile.fasta000066400000000000000000001051641241505617700260370ustar00rootroot00000000000000>m121004_000921_42130_c100440700060000001523060402151341_s1_p0/7/0_1165 GAATCGTCGTCAAATACTCGCCATGTACCCTCTTCTTCCCATCCCGCTCA AATCCGACATCAATGCAGGCCTTCATCGCTTCTACGCGGGTCCATAGTTG GCAAGTACCACGCATTTTGTTCGCGCGTCACCCACGGACTGCCTAGTTAC TGCTACCCGGCCATCGAAGGCTGACTTTATGGCCTCCGAAACCACCCGCA GCCGCCCGGCAACTTCCATGAAATCCCGGAGGCTAAACGGCATTTCAGTT TCAAGGACTCGTTGCCACGTCACTGCAATAAACCATCGGAGACAGCAGGC GGGTACACGCATACTTTCGTCGCGATAGATGATCGGGGATTCGTAACAGT TCACACCGAGCCCGCGAGATATGAATTCAAACAACGGGTTCCTGACGTCG CTCTCACGCTTACTCGTTTTCCCCAGGCCAGTGGCTTTAGCGTACCTCCG GGACCACACCGGTGCAAACCTCAGCAAGGCAGGGTGTGGAAGTAGGAACT TTCATGTCAGTCCACTTCTCCTTCTCGCCGCGAGCGGGTTTTGCTCATCC CGTTGTGACCTCTGAAGCGGTGATTGACGGCGAGCCAAGTACCGATTTTG CCACGCATCATGCCCTGTTCGACCAGCTCTCCATCGATCCCGGTACCGCG GCCCTGGCAGGATATCGCTCCGGTCGTCACTGCCTGCCACCTTCTGCTCT GCGGCTTTCTGTTTTCAGGACTCCAAGAGCTTTTACTGCTCGCCTGTGTC CAGCCTTCTCGCGACGATGCACGAATGTCCCGCGGCCGAAATATCTGGCG ACCAGAGGCGGCAATAAGTCGCGTCATCCATGTTTGTTATCCAGGGCCGA TCAGCAGAGTGTTAATCTCCTGCATGGTTTCATCGTTACCCGGACGGATG TCGCGTTCCGGCGGACGGTTCTGCCAGTGTATTGCAGTATTTTCGTACAA TCGCTCGGCTTCATCCTTGTCATGCAGATAGCCCAGCAATCCGAAGGCCA 
GACGGGCCACACTGAATCAATGGCTTTATCCGTCAACACTGTTCTGGGCT GCTGACTGACACGGCCCCGTGATTCCTCTCTGCTCTTTCGCGAGCGTTTT GAATGGTTCTCGCGGCGGCATTCATTCCATCCATTCGGGTAACTCGCGAT CGATGATTTACGGTC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/8/2069_2682 CGGTGTATCAGGCAAGGCCACTCATATCAGGTGCAGCTTGACTGTCATCA ACACGGCCTTTCAGCAACACCCGATACTTCTTCCAGGGCCTTCCAGCAAC GAGGTGTGCCTTCCTTCGTTGCAGTTTCCAGATCTCAGCATCCTGAAGCG GCCGCGATATGCTCACTGGGCTACCTTGCATCAGGCCTTTTTTTTGCTTT TCTTCAAAAAAAAGAAAATGATGTTGCCATCGTAGAGAACATGCTGCTAA CGTGAGAGAAGAAGAGATTGAATCTCAGAGAGAGAGACAGAGCGGTCATA CAGCAGCTTAACAGTGCGGGACCAGGTGGGTTAGAGAGAGGTTCTGGATT AGCATCGAGAGAGAAGCGCGATATGCTGCGCTGCTGGCATCCTTGAATAA GAGAGACTACGCACGCTTTCTCGCAACTCTCCCCACAGCTTCTGTTTTGG GCAATATCAACCGCCGCTAGTACCATGGCAATCTCTGCTCTTGCCCCCGG CGTCGCGGCACTCGGCATATCCGCATAAGCGAATGTTGCGAGCATTGCGT ACCGTTTGCCTTAGTATTTCCTTAAGCTTTGGGCCACACCACGGTATTTC CGATACCTGTGTG >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/8/2724_3021 CACAAGGTATCGGGGAAATCACCGATGGTGTGGCAAAGCTTGAAGGAATA CTAGGCAAAGGTACTGGCAAGTGGCTCGCGACATTCGCTTATGCGGATTA TTGCCGTAGTGCCGCGCGCCGGGGGCAAGATGCAGAGATGCATGGCTAAA GGCGTGCGGTTGAGATTGCGAAAAAAGCTGTGGGGTAAGTTGTCGAGAAG AGTGCGGAGATGCAAGGCAGTCGGCTAATTCAAGGATGCAGCAAGCGCAG GGAATATCGCGCTGTGACGATGCTAATACCAAACCTTACCAACCACT >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/0_278 GTATAGAAATGGATCCACTCGTTATTCTTCGGGACGAGGTGTTCAGTTGA CCTCTGGAGAGAACCATGTATCATGAATCGTTATCTGGGTTGGACTTCTG CTTTTAGCCAGAATACATGGCCTGATATGTTACATGAGAGAATGGTATTC CTCATGTGAGTGGCTGTCTTTCGTCTTTCTCTTTGCATTTTCGCTAGCAA CTTATGTGCATCGATTATCAAGCTATTGCCGCGGCCAGATAGTAAGCGAT TTCAAGCTAAGAAACACCGACCATTAGC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/327_954 CTTAAGCGATTTATCTTCAAGAATTAATACAGCTATAATCTGGCGCTAGG ACATAAGCTAAATAAATCCGATGCACATTATATGATACGCGAAAATGCAA GAGCAAGCAGAAAACATAAGCCACACATGAAGAGAATAACAGATTCTCTC CATTAACATATTCAGGCCAGTTAAAAAACAATCATGAGAGCTTAAAAGAC AGAAGTCCAACCCAAGATAACGAAATAATATACACTGGTTCTCTCCAGAG GGTTTCATTACTGAACACTCTCCGAGAATAACGAGGGATTCCATTTCTAT 
ACTACATCAACTGTAGGGGTTGTAATAGTTTATCCGATTTCTATCGCTGT AGGGGACACGAGAACCACCCGAGCCTGATGTGGTTAAAAGACAGGAACAA TCTTTACTACAGCAATACACTATTTAAGGTGATATATGGAAAAGAATTTT GAAAGAGTTCGAAGAAGCATACCTCAAGGATGTGATGGAACAATACCAGA GACTATCCGTATGACTAAACGACTATTGATAAAAATCAATGGTGTGGACA ATTCAAGCGATGCAATGGAGCAACGCTGCCCATCGGAATGCATGGTTAAG CTGAACGAAATGTCTTCCTTGTACATC >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/1004_1580 TACGGAAACATTTCTTTCAAGGCTTAACCATGCATTCCCGATTGCAGCTT GCATCCATTGCCATCGCTTGAATTGTCCACACCATTGATTCGTTTATCAA TAGTCGATAGTCTACGATACGGTCTGGTATCTGTTCCATCACATCCTGAG AGATGCTCTTCGAACTATCTTCAATTCATTCCTTCCATATATCACCTTAA TAGTGGATTGCGGTAGTAAAGATTGTGCCTGTCTTTTAACCACATCAGGC TCTGTTCCTCGTGTACCCCTACAGCGAGAAATCGGATAAACTATTTACAC CTACAGTTTGATGAGTATAGAAATGGATCACTCGTTTATTTCGGACGAGT GTTCAGTAATGAACCTCTGGAGAGAACATGTATATGGTTATCTGGGTTGG ACTTCTTGCTTTTGCCAGATAACTGGCCTGATATGTTAATGAGAGATCGT AATTCCCTCATGTGTGGATGTTTTTCGTCTTGCTCTTGCATTTTCGCTAG CCAATTAATGTGCCTCGATTATCAGCTATTGCCAGCGCCAGATATAAGCG ATTTAAGCTAAGAAAACGCATTAAGA >m121004_000921_42130_c100440700060000001523060402151341_s1_p0/13/1625_2202 CTTAATGCGTTTTCTTAGCTTAAATCGCTTATATCTGGCGCTGGCATAGC TGATAATCGATGCACATTATTCTAGCGAAAATGCAAGAGAAAAGACGAAA CCATGCCCACATGAGGAATACGATTCTCTCATTAACATATTCAGGCCCAG TTTATCTGGCTTAAAAGCAGAAGTCGCAACCCAGATACGATCATATACAT GGTTCTCTCCAGAGGTTCATTACTGAACACTCGTCGAGATAAGCGAGTGG ATCCATTTCTATACTCTCAAACATGTAGGGGTGTAATAGTTTATCCGATT TCCTCGCCTGTAGGGTACACGAGAACACTCGAGCCTGATGTGGTTAAAAG ACAGGCACCATTTTACTACACGCAATCATATTTTAAGGTGATTAATGGAA GAAGAATTTGAAGAGTTCGAAGAGCATCCTCAGGATGTGATGGAACCAAT ACCAGGCACTCATCCGTATGCACTACGACTATTGATAAAAATCAATGGTG TGGACAATTCCAGGATGCAATGATGCAAGCCATGCAATCGGAATGCATGG TTAAACTGAAGAAATGTTTCCTGTAAT >m130406_011850_42141_c100513442550000001823074308221310_s1_p0/5/0_14503 AAAGAGAGAGATAGAGAACAACTGATCAAGTGATTTCTTTGATTTTCCAG ACCTGCCACGTGTTGCACGCCGCTGTTCCAGACGATCAACGCCCCGATAT CATCCGGACAGAGCCGTCCAGCGGTGACAGATGCCCCCTCAACACCAAAA TAAACGACCTGCAAAACTGCCCGTCTGCTTCAATGCGAGATGTCCCGAGT CGCGAACTCTCCACACGAGCAGCGCTTCATCTTTGTATCGCGTTTTTTAT 
TCTCTCGGACGCCAGCGGACGTATCGGCGCGCAATCCCGATACACAGAGG TAGAGTACTGAGACACAAGACTTCCTCGCTCGGGTGATAAGTGCGGCAAT CGGCGCAGTGGCCGATTCCCCACCAAACACGCAGCCGCAGCACACGCCGG CATTAGCCGCGCGACGACCGCACGCGACTCCCAACCGTTAACGGACTTCT CGCTTTCTGAATCTAGCGTACATACAGTAACTGCGTTACAGCCCTTCGGC CAACGACCCTCGGACAGGACACTTTCTGTGTACTTCCTCAGCCGCGATGT GCCTATCCTACTGCTGGCAGACAGCTGCCGTTCCCCAGCGGCCTGGTGGG TTGGCAATCAGAACCACAGGCCATCCTGTACAATGCAAGCGACGCGCTGC ACTCCTCTCTGAGCCAATCATCACAGGCAAGACCGAGAGCTTTTGCGCTT ATTATCGCATCGACAGACATATTACTTTAAATAGACGCCTGCTAGTCGTC GTGCACAGATCCTGCACCCACCAGCGGGATCTGATCCCGCGATACAATCA TCACTCGAGTCGTTAACGCCTCACAGCCAGCGCGTCGGACAACCGCCTAC TCCCTGAAGCGACGGTGGTGCGTGCTCGCTACTCGAATCATGGTGCACCG CGGTTAATAGGCCACAACTTTACCAGACCCTGCGCCAGAATTCTCGACAG GCACGCCTGGACTTACCTCCGCTGCGCGATCTCGCTTCCGTCTAAGCACG CGTGCATTTCACTCTAGCCGATGCTTCCCAAGACTGAGTCAGGATCAACC TGCCACCAGTACTCGCTGGCTTGCCGCGAATCCAGCTGAGCATTCGCTCG AAACAACATGGTGACCAATAGAAAACTGACGATAGAGCCTAACCAAAACA CCACAGTAGGGATTACTCCTCTCCATCAAAAGCCTCATATCGGGTACGTA ACCGAGACTGACCGAGTACCACAGCGCGATTACTTACGCTCCAAAGACGC GAGCAGCCCCTTGAATTCTCCACCATCCTCAGCTCCGCACTATCGAGGAT GAATCACGACGCAGTCGTCTTGATTAATACGGGCAAAATCCATTTTCTTA TCCACAGCCAATGACACCACGCAGCCTGGCCCATTCGTTCTTCTCCGAAC CTCGCCACGACAGACAGCGTACCCAGCGGCGACACCTCCCGTCGCGACTA CCAACATCACCGTATTTCTATAGGCCAGCACGCATCAGTCCCTGAACACA TACCGTTTGAACCGTCTGTAACATTTGACCACAGCACGACCGACATAGGT CAGTGTCTATGCCTATGCACGCGGCGCGGGTGTTCCTCCAAAGCAATAGC CCGCACGAGCAGGAGGCGCCAAAGACATGGCAGACACTACGTGCGTTGTC GCTCGCGCGAAGCCTCTTCTGCCAGAACAATACATTAAACCCGACGACCT GTGTATCCCACGGAGGAGATCGCAAGTTTTGGTAGCGAATTGCTTCCCCT GTCTGCACCAGCCCTACGTGGTATTCGCGCACCCTAACAAATGGCATGTA AGGGTGCACGAGAGCGGGGGGGGGGTGCCGCTAGCTTCGCCGCCAGTTCC GGCGTTTTTCCATGTTCCAGCGCGCGTTGTTGTTCGTCAGGAGCAGCTTC CGCGACAGGCGACACACACCCGATCGGAAGATAGCGCTGCACCATCTCCG CAGTTGGTCCGAGTCCACGCGATTACACGCAGTGCGTGTATTCTTTGCTG CGCCAGTCGCGTTCCCTCACGCAGAGGTGCGCACCCGGAGCGCGCCTGAA CGCGAAGTTGATCCCAGACCTCGAGCCAGGACCGCGCAGTCTTGGTCCAG CGCTAGTCTAAACACCGGCCCTGCGCGATATGATCTGCTCCAGGTTCACT CTCACTCACCTTTCTAGCGTGGTCATCACACCTCCTGTCATCCGCTGAGC 
GAACTGCCTGGCGCGCCCGCAGCACCTCACGCAGCAGCACGCGGGAACCA TCCTGAGTCTGTCACTTTTCGCGGAGCTTGGTTTTATGGTCCGGATAGCC TCTTCAGTAGCTGGCGTCACTATCGCGATTCTGCGTCTGACGGCGGCCCT GGGTTGAGCGTACATTACTGAGGCCGTAAGTCTGCTTGCGCACGTGGTCA CCAAGCCGTGCGACCGTGCACTATCCTCGCTCAGCAGCGGCATACGCGAA GTCGCGGAACATCGGCGCGCACACACGCGGTCGCGAGGCCGTTCATCGCG TCAATCGCATTGGCGGTAGACATCAACCGGCGTTAGCGCTGGAACGTCCA GCGCTATTACCCTGTTCCAGCTATCAATCGGTCTCATGCCAGATACCTGC CAAACAGCGACACAAAGTAGATCGAATCCGAACAGAATCTGATCATTCCG CCGGTATGCGACGCTCGCGGACCAGTTACACGCCAGAGCCTGCAAGCTGC GGCGATCTGGCTGTCCATCATATTCGCCGCCACGTAAGCGGGATATCCTC CCTGCGCGATCGTTTGGCCTATCCCGAGAGCTAGCAGGCTGAAGAGATTA ACCACACCGCACCATATCCATACAACGGAAATGGCGACTGAGCTGAGTGT CCTCACACATCTCTCACACAGCACCTGCTCCGACTGCGATAGAACCTACG TCTCGTCGCTTCATGCGGACAGTCAACGGCAGCTTCTGCCAGCATGCACG CCGTGTTTTGGTTTGCACCCCATTCAGCCCTGCTCAACACATCCGCACAC GCCTATGTGAGCGCGAATCTCGTCGAACAGGGCACGTGACGCATGGCATC GCACTTCGAGCGTCTATGGGTTGAACACTGCGGCGGGAGTTCACCATCGC TAGCGGCAAAGACGCCTGTTCAGCACATACCTATGGTAGCAACTCGCGAG GAGCTTACCTCGAGTCGTCGAGCGTGTGCCCAGCGAAATATGCACGTCGC TCGCAGCCCGTCTTGTCGCAGCACGCGATTAGTGTTTCGCATCATTCGAG GAAGCCGGGCGTCAGCAGGCGGAGACAGAATTGCTCGCAGTCTAACTTCC AGCGCCCTGGAATGACCGGCGGTGCGAGCATCGTCTAGGAATAAAACTTC TGCGCACCGCAGTTTGAGGAACAAATTTTCGCTCGAGACCCCCCTTCTGC GAACAGCTATGATCGAGACAGCAGATAGACAATTAACGTGTTCATTCTCA CGCAAAAATCGGCGATCTCGCCCTGAAAGCGAGACAATCTATCAGACGAG TGCGTTACCGGTGCGTTCATACGTCGGTTGTCAGCAGAGCCAGACATAAG ACTTGGACTGTTCAGGCTGAGCACCGCTTGCGGCTTGCTGGTTATTACAG CGGTAACTTCCTGTGCTTTTACCCACTTGCGTCCAGAACCAGGACGCATC TCAGTCATGTAAACTCAGCCTCTTGAACACGTACCTATTACTACGCGATC GCCTGGCCGCCGATGCCGAGACTCCGTACTTCTGTCACCCAGCCACGGTT TCAACCTCGGTCATCAACCAGCTCTTGCCGAGACTTGCGTGTGCGCGCGG CGTCGCGTCAGCAGTCTATACCGACATTCTTAAAAGCTCGCCTCATGCTC CGACCGCTGCCAACCTCCACCCACCGTAGTATTACACCACAGTGGCCGCA CCCTGCGTCGAGCGTCGAGACAGTACCCGGGTGGATTTCGCAGTTTAGCC CCTTCGCCCCAGCAGCTCGGCGTGATTGACGGACTAAAATTAGCCGTTTC CTGAGAGCAGGCCCGGTATCATTAAGCAGACCTGACGCATAGTCCGAACA TAATACGCCCTCCACAGACAGGAACGAATCGGTACTCGCCTACACAGGTG CCGGCGTCGAGAGAGATCAATGTTCCTGCGCGGCCAGAGCGGATAGACCG 
CCCACAATTATCAAGGATTGTGCGCCATTTCATACCGTGCATAAACAGAT TCAGGAGATAATATCCACCATGGCAGTGACGCTCCTCGCGAACGGGCATG ACATGTAATTGCGGCCAGTCACTGAAGGGGTATGGCAACACAGCCCCCGT CAGAACGGCGAGCTGAATGCACATCTTGGCGTCCGTAGCGTGGGCAGCAC AAGACGGCGATTAATCAGTATAGTACGAGGCCCGCACTGTGTACCATTGA CGGCTTGCGCTACGGTAATGCCCTGTTTTTCTCGTATGATCCGAGCGCCT ATTCGTCTGCGCCCAGGTATCTCTGATACAGCAGTGCGCAGGATGACGTG CTCGGAGCAGCGTGCCAACACATCCAATCTCGACCACCGCAGACGACCGT ACCAGTTATTATCAAACCGCTGCGCTGCACTGGTTCCTCCGACTGAGCAG CGCGGCGTGCCCGCTCGGTCCGCTGCTACAGCCTGTCCACCACCGCGCCT TCACAGATCACGTTTGTCGCCCGGTTTGATGGCATCAACGTGGAGTCTCT CACGCTGATATACATAGGTGCCTGCGAGCCCTATTGCTAATCCAGCGGTC GTAAGATGGCACGATGCTATGTGGTCGACGCAGGACCAGGTCCTTAGGCC CGGTTTAGACGGACGGTACAATCAGCCACACACACGCATACCTCAACGGC CCAGCTTATTCTGCCCGCGAGTGAACGCGCCAGCAACCATAGTTAAGCCA CCCGCTTGCAGTCGCGACTCCAGCCAGAAGCACACATTTGTACAGATACT CTTCCGCTCATTATCGATAATCGCTCAACCAGCAAGTCGAACTTCTGAGT ACCAGAGCCGTAAGTTTCGGGGTCTCTCAGTAAAGCTGTGGCAATCGCGG AAAGAATGCCGGTTTTGGGTAGTCCTGAATGCGTATGTCTGTGATGTCCA TTGTATTAGTGAGCAGTAATTATCCATAGCTGCTGTGCACGTCAACGCGT CCATAAGTGCACTGATATGCCTGCCTGCTGTTAGCGGCGTGGACGCACTG CTGTCGTCTCACAGTAACAAATAGCTAATCCCAGTTTTTATACGCCACCT TGAGCCGCGGTCCTGCAGCCTGCCTGCGCTGCGTAAGATACTGCCTACAC AGAATGGAAGTATCTAGGACTGGCACTGGCAAGAGAAGACAGCCTGCTCC GAATTCTGCGGTCAACACCTGGCGATTGTTGTGCTTCAGTGGCTTTCTTG CTTTTCCATCTCTAATCTACCCTTCTTGAAATTACGCTTCAGGATAATCC CGCTAACTGCAGGACACTTCCTGCGATCCGGTAAGGCCACTGGTGCGACC CGGAAGCTCGGTGGTGACTGCAGAGGTCTCAGTTTTGACTGTTACCTACG CCCAACGGCGGGCATCCTGCTGGCCACCTGGTAGATCCCGGCCTATATGT CGTCACATATTCCTAGTGATAAGGAGCTAAACTATGGCCGAGAGAGCATT ACCCAGAGACGACCAGCCATAGAGAGGCCGCTACACACCTCGTCGTATCT GTAGTAAACCACAATGAAAGAAAAACCTGGCAAACGTGAGAAATTACAGA GCCGTAATCCGCGCCTCTGGTCAAATTGGTCAACCGCGCGGCATCCATGA TGTTTCTGTGCCAGACACAACCGAAGAATAAGGCTGAGCCAACATGCACG CTGTGACACAGCAACGCGTTATATTATCGTCGTGCTACTGGTACATACAT TCACCAATGTATGTAAATCTAACGCCTGTAAATTCACGAACATATGCAAG AAAAACCAAACAAGAACGCGCAAGAAACGCGCCAACACCATCCTCGATGT GGCTCTACGTCTTTTCCACAGCAGGGGGTATCCATCCACCTCGCTGGGCG AGGATGCAAAAGCAGCTGCGTTACGCGCAGTGCAAGCTACGGCATTTTAA 
GAGACAACTCGGGATTTGTTTTCAGTGGAGAATTGGGAACTTGTCTAGAA TCCAATATTGGTGAACTAGAGCTGAGTATCAGGGCAAAATTCCCTGCGAT CCACTCTCAGTATAAGAGAGATATTAATTTTTTCAAAAAAATTCCGTTTC TTGAATCCACGTGACAGAAGAACGGCGTCGATTATTGATGGAGATTATAA TTCCACAAAATGCGAATTGTGTCGGAGAAATGGCTGTTGTGCAACAGGCA ACAAACGTATCTCTGTCTGGAATAGTTACTGACCGTATAGAACGAAACGT TAAAACATGATTGAAGCGAAAATGTTGCCTGCCGGATTTAATGACGCGTC GCGCAGCAATTATTATCGCGGCTATACTTTCCGGCCTGATGGAAAACTGG CTCTTTGCCCCGCAATCTTTTGATCTTAAAAAAGAAGCCCGCGATTACGT TCGCCATCTTACTGGAAGATGTATCTCCTGTGCCCCAACGCTTTTCGAAC TCCTGCCACTTAACCGAATAACCCTGAATCTGACATCCAGTGATTTCTTC CCTGGACATTTTCGTCGTTGCTATTCTGGTTCACTGCGTCGTGATATTCA TTAGCGGTTTTGACTTTTTCAGGTCGTTCTCAGGTTCGAAACCTTCATCC TCATCATGACTATGTTCCAGTTATTACAAACGATCACGAGCATTTTGTTT TTTCCAGCATTTATTGCTTTTGTTTTTGTCTGTTATGCCAGAACACGGCG TTTGCGCGGCGTCATCGAATGGTGATCTGCCGCAAAAGCGACCTGCCAGG CGCAACTTGAACTCCAACTAAATTGAACAAAAAAGAGTCTTTCTGCTCAG GACAAACTTGGTGCAGCAGATCTGACAGAACATTCAGCCACCCTCGATAA AATCGATCGCATAAAAGAAGAGACATTTCAGCTAACGGCAAAAGTCGCTG AAGCGCCGGAAAAAATGCGCCAGGCGACCGCCGGCCGTTAAACAGCACTT AGCGATGTCGATAACGGAGAAGAAACTGCGCAAAATTGTCTGAGCCACGC GTCCGTGTGCCGCCAAGCTGGAAACTCGCGTTGCCCCAGGCGCTGGACCG ATTGCAAAAACGCACAACAACGATCATGCGTCTTATAACAGCCAGCTGGT TTCGTTTACAGACGCAGCCCCGAAACGCGTGCAAATGCGAGTATAACGCT TCGCAGCAAGCTGCAACAAAATTAGCAAGTCGTCTGGATGGGACTGTTCG GCGAGACCAGCCTTACGTCCACAGCCCAGAAAGTGCTTAAATGCAGGCCA GCAGGCCGTTTCTGAACGGAGATTGACCAGCAGCGTAAAAGCCTGAAGGG AACCCACCGTCCTGCAGGATACCATTGCAAAAGCAACGTGATTACGTGAA CGGCGAACAGCCGCTCGTCTGGTAGCACCAGTTACAACTGTTGCAAGAAG CCGGTAAACAGCAAACGCCTGAACTTTAACCGAAAAAACGGCGCAGGAAG CCGTCTCCCCGGATGAAGCCGCGCTATTCAGGCTAATCGCGCGGTGAAGC AGAACTGGAAATTAACCAGCGTTAAGTCAGCGTCGATACCCGCGACTGAA AACGTAATCAGTGATGCAGCAAAACATTAAAGTCAAAAACTGGCTGGAAG CGGGCGCTGCAAATCGAACGCAATATTAAAGAGCAGATTGCCGTGTCCTG AAGGGCAGGCCTGCTGTTGTCTCGTATCCTTTACCAGGCAACAACAAACG CTGCCCCTCGGCGGATGAACTGGAAACATGACCCAAACCGCATCGCGGAT TTTGCGTCTCGAAACATTTGAAGTTAACCAGCAGCGTGATCACTCTTTCC CAGAGCGATGCGTAAACATCAACAAAACTGGAATGAAGGTCCTACCAACC GAAGCACAGCGAAGGTTCACGATCGTTATTGCAAGTGGTTGATATGCGCG 
CGAATGCTGGATGTCAACCTTCAAACAAACAAGGTTGGGTAACCAGCTGA TGATGGCCCATTAACCTGCAAATCAACCAGCAAGCAGTTAATGAGGTGTC GAAAACACCTGAAAATCCATCCTGACCAGCAAATCTTTTGGTGAACAGTA ACCGTCCAATGGAGGGACCTGGAATCAAAAGCGTTCCCGAAAGCCTGGAA AAGTGAATTTAAGTACAAGAAAACACGGTGAACTGGCAAAAATGGCCCGC CGTTTTTATCGCTTTCCTCGCTGGTTGCCGCTGCTGTTGATTGCCGGGCT GATCCCACTGCCGTCTGGGCTGGCAATGAAAGCGTATCAACAAAAACTGG CTTTCCGCCTGTGGGTTCCCTGCGGTAACGACCAGCCAGCTCAAAACTCC AAAAGCGATCCTTATCAGACCTGACTTCCGTGCGCCTGCGGTGTGCCTGA TTATTCTCGCGGTTGGCCTGATTCTGTTGACCATGCAGCCTCAACAATCC AGGAACTGCTATGCGTCGTTCCAGTCAAAAAAACTGGCGCATATTCGGCT GGTGGTTGCCTCGTGCGGAAGGTAGCTGGAGAAAAACGGCGTTGCGTACG TCACTTGACTCGGCATGCCGGAACAGCAGACCAGCCACTGGCGTCGGCAA ATTGTCCCCGCATCAGTCTCGCATTGCTGCCTATCCATTTCGTCTGTGGT GGCAGAACTTTCCCCGCTGCACTCTGATGGAGAGTGCTGGGGCAAGCGAT GATTTTCCTCAACCTGCTGCTGATTGCCCTTCCTGGTATGCCGATGTGCG CGAAAAGCTGGCGTGATAAAGAGTCGCAGCCCAATGCGACTGTCCACCAT TACAGTGCTGTCGATAATCCCGATTGCGCTGATGGTGCTGACTGCTACAG GCTAACTTCTACACTACGCTCTGCGTCTGGCAGGAGCTGGATAGAAACCG TTTATCTGGTGATCATCTGGAACCGGCTGTACCAACGTTTACTGCGTGCG GCCTTAAGCGTAGCGGCGCGGCGCGTATCGCCTGCCGTCGTGCGCTGGCG GACCGTCGGCAGAATCTGTGAAAGAGGCCGCAGAAGGTGCTAACCGCCGA AGAACCACCATTGCACCTGAAAGCCAAGTTAACCAGCAGACGCTGCGTAT TACCATGTTGCTGATGTTTGCGCTGTTGCGGTGTCATGTTCTGGGCAATT TGGTCGATTTATCACCGTAGTTTCCAGCCTATCTCGACAGCATCACGCAT CTGGCATTACACACGGCAACGAAGCTGGCGCTGCGGTGGTTGAAAAAAAC GTCACCACTGGCAGTCCCTGTTGTTTGCGATTATCGCCCTCAATGGGTGG CCTGGGCGTTGATTCGCAACACCTGCCTGGTTACCTGGAAGCTGGTGCTC TCCGCGACTGAATATGCGCCACGGGCGCGTCGTATGCCATACTACTCACC CATCCATTAACACAATCATTATTTGCTGTTGGTGCAGAACCGGTGTTCGG ATCGACTGGCCGTCTCTGGGATAACCCAGTGCTGGCCGCAAGCATTATCC GTAGGGTCTTGGTTTTGCGTTTACAAGACAATTTCGGTAACTTCGTCCTC CGGTTTGATCATGTCTATCCGAACACAGCCGCTCGCGGTGCGTTATTGGC GATACGGTAACCATGGTAGCTTCTCGGGGGAACGGTAAGAAGACCGTATT CGTGCGACAGACGATTTACCGATTTCGATTCGCAGAAGTGATCATCGCCG AACAATAGCGTTTGTTACCGAGCGTCTGATTCAACCTGGTCGTTGACTGA CACTACTACGCGTCTGGTGAATCCGTCGTCGGCGTGGCCCATGGCTCCGA GTCCTGGAAAAAGTGCGTAAAGTGTTACTTGAGGCGCGACTGACCAACCC AAGGGTGATGCACAACCAACTGCCGAAGTCTTCTTTACGCGCATTTGTGC 
CAGCACGCTTGGAGATCATGAGCTGCGTCTGTATGATGCGTGAACTGCGG ACCGTAGTCGTACTGTCGATGAGCGAACCGTATTATCGATACGCTGATGC CGTGAAACCGACATCAACATTGCCATGTTAACCAGCTTGAAGTGGACATC TGCACAAACGAGAAGGGCGATGAGGTGCCCGGAAGTAAAACGCGACACAA AGGCGATGACCCGACCGCCAGCGGTAGGGTAAAAACGAAGGGGCAACATT TAGTTGCCCCGAGATTGCTAACAAAGGTGCGCGTTGTTCATCGCCGATGC GGCGTGACCAGCCTTATCCGGCCTAACGAAACGCAAGAATTCAATATAAT TGCAGGAGCGGGTGTAGGCCTGATAAGCGTCAGCGCATCATGCAGTTTTG CGTTTGCCCCAACCTTAGGGGACATTCACTACGCCGACCCCAATTTATCT TCTCACTTTCCGCCTCATCATCGCGCGTTAATTTCCTTTTCATAATCACC GCTTTACAATAATCCTAGCGCGCGCAGCACGGTACTGGCAGGGATTGATT TTCCTCCCAGCAGCACAATCAATCGGACAGCCAGTTTGACATCGTCAGGG GCCATGTTTCCAGTGCACAATATTCTCTCCATCTTGCTAGCGGGTTAAAC GCGCCTACCTGTTTTCGATTTTTCCTAGCGCATGGCGGCAGCGTGGCCAG GCGCGCTTCATAGGCTTCCACTTCAACGACTGCAAGCGTTTGCTGTTCCA CGAGATCGGCTCAACAACCTGCGCTCAGACGGGCTCTGCGTCGGCAACGC ATCTCCACGCTCAGCCGCCCGCTCCAAAATGCTGACTGCTGAATACGTTT ACGCTGCCAGCGGGCCATCTTTTCGAGGGTGCACTGTCCCACTCGCGCAT GACCAGGCGGGAGGCTTCACGCGCGATGCTTCCAGTTTGTTGCCGCCAGA TGTTCCGCCAGCCCAGGCCAACCTGCCGGCAGCTGTCTGCTGCTCAACTG CATACGAAGCGCAGCCAGATTATCGCCCGCCCTCGTCGAGACAAGCCTGT AGTGTTGTCGCACGAAGTCTGAAAAAGATGCCGTCGAAACGAGCACTCAC TACGTGGCGAACTGTGACACCCGGGGACAACAGACGCTGGACGCAGCGTA GCGAGCTTGTCCTTCCCAGTTTTTCCAGCAGCCCCAGGGGCGTTTTCCGA TAAGACCTCAATGCAACCATGATAATTGTCTATGCTCAAAGTACCACTCT CGTATAGCTGACACACTCAATAGTGCTAACGAATCATTTTAACATCATTG GCTGGCTGGCGGTAGTGCTGGGTACGCTGGGCGTTGGTAATTACCGTTAT TACCACCCGACGCGTTATCCTGCTGGCGGCCTGGTGCTTTGCCCGTTCTT CCCGCGCTTTCACGCCCTGGTTGCCTGTACCCGCTCATGGTTTGGCAGCC TATCTACGTTTCCTGGCAAAACATCATGGATGCCGCGCGGCGTGCCAACG CGGGCGATTTTTTGCTTATTTTGCTACGTTTGCCATGTTCTCTGTGGTCG TCCAGAGTGCCATGGTGCGCATCATGTTGCTGGTAATTCTCGCCTGTTTG CTTTTCTATATGTGGCGAATTCCGGTGATTGATGAAAAAGCAAGTAAAAG CACTGAAGCACAACAATCCGCAGTTGCAATTATTGCGACAGCCAGTACAT TCTGGCGTTTTCGAGCACAGGCCAGGCGGTCAACAGGTTAAACAACTGTT ACTTTTGAACGTTTTTAAAACCGCCGTGAGTAACCCACGCGTAACAAGCA GGCATACACTTAATGACCGCGACTGCAACAGCAGTTGAGTATCTCAAAAA ATTAGCATCAAAGCATTCAGGACACCCAAACCCGCATTCTTTTCCGCGAT GATCACCAGCTTACTGGAAGACCCGAAAGGCTTACGCTCTCAGCATCGAC 
TTGCTGGTTGAGCGTTACCAAAAATGCGGGCATTACCCAAAGTTGGTCGG CACCGAGCGCGTGGTCTTGTTGGGCCGGCTCCGGTAGCTTCTGGTCTGGG CGTTGGCTTGTACCGGTCCCGTAAACCGGCAAACTTGCCGCGTGAAACCA TGCAGTGAAACTTACGAGCCTGGAATACGGCACCGATCAGCTGGAGAGTT CCACCGTTGATGCCATCAAACCGGGCACAAAGTTCTGGTGTGGCGACTGC TTGGCAACCGGCGGCAAACTATCGACAACGACGTTAAACTGATCCGCGTC TGGGTGGTGAAGTGGCTGACGCTGCGTCATATCAACCTGTTCGATCTCGC GGCGAACAGCGTCTCGAAAAGACAGGGCATTACCAGCTCAGCCTTGTCCC GTTCCACGGGCCATTAATTATCCCAGTCTGTGCTCCACGCTACGGACAGC ACAAGAATGTGCATTCAGCCTCGCCGTTCTGACGGGGCCTGTTTAGCATT ACCTCCTTCGTGTAATCCACCTTCCAGCGTTTCAGTAGCCTGGCCAATGA GTTATCAGGTCTTTAGCCGAAATTGCGCCACAAACTTTGCGACGATCGTC GGCCAGGAACATGTGCGACCGCACTGGCGAACGGCTTGTCCGTAGGCGTA TTCATCTTGGCCTTATTTTTTTCCGGCACCCGTGCGTCGGAAAAACCTCT TTCTGCCCGACCTGCTGGCGAAGGGTAAACGCGAAACCCGCATTCCGCGG ACCCCCGTGGCGGCGTGGTGCGATAACTGTCGTGAAATGCGAGCAGGGGC GTGTCCGATCTGATTGAAATCGACGCCCGCCTCGCGCAACCAAAGTTGAA GATACCCGGACCTGCTGGATTAACGGTCCAGTAGCTCCGGCGCGTGTCGT TTCAAGTTATCTGATCGCGACGAAGTGCATATCTGTCGCGCCACAGCTTT TAACGCACTGTTAAAAACCCTTGAAGAGCCGCCGGAGCACGTTTAAGTTT CTGCCTGCGACGACCGATCCACATGAAATGCCGGTGACGATTATGTCCGC TTGTCTGCATTCATCTCAAGGCCGCCGGATGCGAGCAATTCGCCATCAGC TTGAGCACATCCTCAACGAGAACATATCGCTTCACGGAGCCGCGGGCCGC CTGCAAATGCTGGCACGCGCCCGCTGAAGGCAAGCCTGCGAGTATTGCCT TGAAGTCCTGACCGACCAGGCGATTGCCAGAGGGTGACGGCCAGGTTTCA ACCCAGGGGTCAGTGCGATGCTGGGTACGCTTGACGTACGATCAGGCCCG CTGTCGCTGTAAGCGATGTCGAGGCCAACAGGCGAGCCGTATGGCGGCTG ATTAATGAAGGCCGCTGCCCGTGGTTCGAGGGGAAAGCGTTTGCTGGTGG AATGTCGGCCTGTTGCATCCGTATTGCGATGTACAACTTTCGCCTGCTGC ACTTTGGCACGGACATGGCCCGCCATCGGAGCTGCGGGATGCGTGAACTG GCGGCGGCACCATACCGCACCGGATATCAGCTTACTATCAGACGCGTGTT GATTGGTCGCAAAGAATTACCGTAATGCGTCCGGACCGTCCGCATGGCGT TGAGATGACGCTGCTGCGCGCCGCTGGCCCATTCCATTCCGCGTATCCCG CTGCCTGAGCCAGAAGTGCCACGACAGTCCTTTTGCACCCGTTCGCGCCA ACGGCAGTAATGGACGCAACCCAGGTGCCCGCCCAACCGCATATCAGCGC CCGCAGCAGGCACCGACTGTACCCTGCCCCGGAAACCACCAGCCCAGGTG CTGGCGGCGCGCCAGCAGTTGCAGCGCAGTGCCAGGGAGCAACCAAGCAA GAAAAGAGGAACCGTCAGCCCGCTAACCTGCCGCGCGCCGCGGTGGACAT AACGCTGCGCTGGAAAGACTGCTTCCGGTCACCGATTCGCGTTCAGGCGC 
GTCGGGCCATCGGCGGCTGGAAAAAGCGGCCAGCCCAAAAAAAGAAGCGT ATCCGCTGGAAGGCCGACACTCCGGTGATGCAGCAAAAAGAAGTGGTCGC CACGCCGAAGGCGATGAAAAAAGCCTGGAACCAGAAAAACGCCGGAACTG GCGCGAAGCCTAGCGCAGAAGCATTGGGGAGCGCCGGACCCGTGGGCGGC CACAGGTAGCCAACTTTCGCTACCAAAACTGGTCGAACAGGTGGCCGTTA AAATGCTGGAAAAGAGGAGAGCGACCAACGCAGTCATGTCTGCATTGCCT CCTCCTCAGCGGCATTTGAACAACCCCGGGTGCACATGCAAAAAACTGGC GAAGCGTTGACATGTTAAAAGGTCAACGTTGAGACTGCTATCGTTGAAGA TGAAATCCCGCGTGGCGTACGCCGCGGGAAAGTGGCGTCCCAGCGATATA CGAAGAAAAAACTTGCGCAGGCGCGCGAGTCCATTATGGGATAATAATAT TCAGACCCCTGCCGTCGGTCCTTTCGAGCCGGAGCGGATGAAAGAAAGTT ATCCGCCCCCATTTGATTCGAACACAGCTTACGTTCGTCATCCTTCACAA CCC >m130406_011850_42141_c100513442550000001823074308221310_s1_p0/6/0_5194 AAAAAAGAAGAGAAGAATTCCCGGTTCAAAAATTAGTGACATATACATCA TATAGACTACTTCATCGAGATTAATAAGAAAATCACAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAGTACGAATCCAATATTCATCTCCAGAGCCATG ACGAAACGAAAACAAAAAAAAAACGGAATAGAATACAAAAAAAAATGCGT CGGGAATAATACAGCTATCAGAACGAACTTATCAACTAAGACAAACTAAC AGAATGGGGGAAAAGGACAAATTAATATCAGAGCCAAAATATTAGAAATG AAAAAACCGATACAAATACAACTAATCGCGCGCGGCAATTAAGCACAATA TGCTTAACATGACCGGATATAAAAAAAAAAAAACTCAGTAAAAGGATGAT AAAACGCCGCATTAAGAGAAAAATCGAATAGCCAGACGGTGCCCATAAAC AAAATATATAAATCGAAGGTGAATGCTCGAAGGCATAAAAAATATCATAG CCAGTTTGACCGTAGAAATGAGGACGAATATTATTTGACGAACGTATCCG CTGGCATACAGAGAATAGAATTATAGTAATCGTATACAAAAAAAAAACAG AATTCGGCAATAAGCGCAATTGCGAAAATAATGACAAGGACGTTAAATAG AGAATCATCACCAAAAAAGAAATAGATAAAGTTAATTAAAGCAGAGAGCA GAAGGGAAACCATAACTAGCAAACAGATACATAAAAGCCAGCATCAAAGA TGATCGAATCGCCATACCAATAAGAGAGGATAAAACAATAACGAACAGCA ATATCATGCAATCAGATCTTTCATATGTGGGAGATATAGATAGAAAGTTA GTATATATGATACGATAGGGCGGAAAAATAGAGAATATCACTATTTATTC ACAACCCAATAACGCCATAAAGAAATAACAGAATGACAGAGATTCCAGCA TGTTCCAAGAAATAGACCAAGAAAAAAAAAAAATCTTTAGCGAATGGTTG TGCCTGTGGTGTACAAAGATAATATGCGCAATTCTCACCAGAACAATAAA TAATAACAACTAGCTATTATAATGCAGATAACTCCAGAATGGGCTGCCGA CAAAGAAGGTTTCATGAGAGCGTGAAATTAACCCTGATTTAAAAGGTATA TACAAAATGAATAAACATCGCAAAGACAAAGATAACAATATCAACAAAAA AAAACGATATAAATACACAGTATATACAAGATAATTAAGCGACACAAATC 
GATAAGAAGACAATATTGTGAATAGATGGAAATATAGGGGAAAGAAAAAA TCACATCGGGACATAAATTGTAATCACCGCAAAAATGCATAACGAGTAAT ATAGCGAAAGGATTATATATATAGACGGTACGGAACAAAACAGCAAGCGG GATTCAGAATCCTTTACATAAACAAAAGTGTACGACGAATAGACACATTG CTGATCGAATGCAGACGCGAATGGCGACAGAACGACCCGCGGTTTAAAGA GCCAGCATAGGAACGAAGCCCACGCAACCGCCCTTAGCAATAGAGCCTGA AGTAGTAGTCGAAAGGCACGTATAAAAAAAAACGAGAAATATGCCAGACA GCACTAAAGGCCAAGATAATTAGACAAAGAACACAAGGCGATAACAGAAG CGACGAGATGATGAAGTGCGGTTAAACACAATAAAAAAAAAAAAAAAAAC ATATTAATCATATAATATACAGAAATGGCGAGAAGTACCTAGTATAGACC ATGAATGAAGAACCGTAAAAAGCGCATCAGGAATTCTAAATAAAAAAGAT ACTTCAACGGAGAAAAAATCCTTAAATGAGATCTCCGCTGTTGAAAAATC GACATCACCTGACAAGCAGTGTGTCTGTATAATGTTACATGTCCGGTGCA AAAGTACGAACGCCTATAGAGACTGAGCGAAGAAAACAGCAGCAATATAA TACAAGCACAAATCTGATACTCGCAGCCGCAGTAAATATAGAGCGGATTA CATAAAAAATTGAATCGAATGTAAAAAAAATACATGCATCGCACTATAGA AATCCGAGAATATATATGAATACCACAGCTAATAGGTCGCAGTATATGTG GCAACAGATGACTAAGGAAAACAATCGCAGAAATTCAGACAGAGAAATTA ACCAAAACAGAGAGCAATACCGTGAATAGACACGAAAATAGACGGATACT AAAACACACAGAAATCCAGACAGACCAGATAGATTGAGTCAGCCGACGCA CACACAAAAAATAAAATCAGCTGCGCTGTAAGACCAGAAGTATGTGTATG CTAGCAACCTTATGCGGATATTCGGAGTATGTACTGTAATAAAAAAATGC GCCACTAGAAATCACCAGTCAGCGGAATTACAATCAACAGCGAAAAATAT ACACGCGCACAACAGGAAATAACCCAACCAGAATTCACCATGATACGAAA CGGTCAAAAAATCGCCTAAAGAGCTGAGCAGATCAGCTGAATAAAACAGA ATCCTTTATGCTATTACCGGAGCTGTAAAAGACCCAAAGCAAAAATCCAC GCGACAAAAAAAGATTTCAATCTACCCACGAGAGCGCAGAAAAAAAAAAA AAAAAAAAAAAAAAAGCTTTGACATCGCTAAGTGGCGTCAGCCCAAAAAC AAGATACGCATGACTAATATCACCATACTTGACGAGAGCTTAATTCTTGT GCACATTTGCGGCTTTGTCGCTCTGAATATCGACAACCGCGACGCGATAA CCCCATACGGCAGCCAGACCAGATGGCACAAAGGACAGCGCCATAGGTAA TTAGCCCACCACGACTGACAACGCAACCTGAATTCATTTTTTACTCCTTA ACAGATTCAAACTTCAAACCGAAAAAATCCCGAGCGCGATATCGATCCGA GAGACAGGGCTAGCCAACAAATGGACAGTGCCGGAATTCCGCCTTCGTTT ACCTCGAAGACGGAGGAGAATGACTGAACCAACAGTTCGCGAGGTTGTCT TCCAGCACGAACTAAGCACCAAGCGGAGATTCCACCGGATAGCGAATGCA TGCCCAGAGTTGAAAATTTGCAACCAGGAGTTAAGGATTAGCACCTTTCA AACAATCGCCGATGGCAAAAAATGAAATGAAGCAAAAATACATACATTAC GAGGATCCGCAAAAAAAAGGACAGCGCTTCACGGAACAGGTGATGAGACA 
TACTAGTACAAGGGGCGTCATTGGCACTCGCGCCGATACGGGTGAAATGG ATGTCTGATAAATAAAAAAAACAGGATCATAAAAAGTTATCAGCCTCTAT TGATAGAATAAAAAACACAGGAAGAAAACCAGAGCGATCAGTCGATAACG AGACGCGGACGATATAGAGGCTCACCAGTAAAAAAGAAGAGAGACACACA AAAAGAAGCGAACAGTGTCCATGCGGGCTTACCGCCAAGCAGACAAAAAC CAGACCGAGAGAATGAAGAATACGCAGAAGCCGCACTGCAGGCGATATGA TGGAAAAAACAGTGCCGGATAAGCCAGAAGCGGCAGGAATATTTGCGAGA CAAATCTGCAGCGAACATACAAAATACAGAACCCGATACCTGTGCGAATA AACTGCGCCTGGGCCCGAGGAAAAAAGGAAGAAAAGCAAGATGGGAGAGA CGAATGAGCAGAGCATAGACCAAAAAAGACCAAGAATGAAGGCATTCGCC AGCGGAAGCAAGCCGTGGAGCAATACCAGATCACAAGGAAGAGCATAAAA TGAATGCCAAATGAGCGCCAGAGACGAAAAATACCATAACAGGCAAAATA GATATTTTTAATACAGATAGGAATCGAATGGTGTCACGCACCAAGATTGA ACAGCACAGCAACGTGCCAAACAATGCCCATTACCACCTTTACCGCCAGG TAAAACCAATTCCGCCTTGATATCAGAGAAATGATTTTCTTACTGGATAG TCTAAGTACAGAAGCCAACATGAAGAGGTAATGATGGGTGTCAAGCATCA CATAACTACAAGTGATGTTTCTTCTATATACGCCAAGACAATTAAAAAAT ATCTTAATCCACAAATGTAAACTAGCGCCAGCGACCAGGACTATGCAAGT CGAGTGGATAATTAAAATGGTAGGGAAATAACGTATAAAATCCGTATATC GGATAAGAATGCCGCAGCGTAATTAGATGGCCGCACAGTCGATTACAAGC GACAAAACAATTCCGCTCCGCGGGATATCCATTCTATATAAATCACAGAA TCAATAGCCTGCCAGCACACCGTGCTGGATGCCAGTATGATCAAATCGCA GGCCAGAGGTCCAGAGCGGTGAAATATAGACGATTTTAATTTTCCCACGG GACGTAAAAAATAAGAGAGCTATATCAAGCCTAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAACAAAAACAACAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAACAAAAAAAAAAAACAAAAATAAAAACACAAAAACCCC >m130406_011850_42141_c100513442550000001823074308221310_s1_p0/7/0_2243 AAAAAAAAAAATCAAAAAAAAAAAAAAAACAGAAAAACAAGAAGAAAAGA AAAAAATAACTGAAAAAAAAAAAAAAGAAGAAAAAAAATAAACAGACAAT AAGAAAATAGCAAAAAAAAAAAAGAGAAAAAAAATAAGAATAAAAAGACA AAAAAATAGCAAAAACAACAAAAAATTAATAAAAAGAAAAGAACAGAACC AATAAAGATAAAAAGATAAATAAAAGAAATAAAAAAGAATATAAAAGAAA CAAACAAATAAAAAAGACAAAAACAAAACAATAAAAACAATAAATATAAG ACACAACATAAATAAATAAAAGAAATACAAAATTACAATAAGAAAAATAA ATAAAAACAATAAAGAAAAAAAAAAAAAAAACAAAAAAAAAAAAACACAA AAAGAAACAAAAGAAAAAAAAAAAGAAACAAAAACATAATACAAAAACAA AAGAATAAATACAAAGAAAAAAAATAGTCATAAAAGAAAAGAAAAAGAAA AAAACAAACAAAATAGTAAAAAAAAAGAAAAAAAAAAAAAACAGAAAAAA AAAAAGTAAAACAAAAAATTAACAAACAAAAGAGCAAATAAGTTGAAAAA AAAGAAAAGAAAAAAAAAAAAAGAAAAAAAATCAGAAAAAATAAAAACAA ATAATAAGAACGAATAAACATAATAAAAAAAAAAGAAAAAAAAGAAAAAA AGAGAATAAAAAAACAAAATAAAAAAAAAAATAAAAAATAAAAGAAATAA AAAAAGTAAATAAACAAAGAACAAAAAATACATAAAAAAAACAATAAAAA AAGAACAAAAATATAAGAAAAAATAAAACAAAAAAAAACGAAAGAACACA AAGTACTAAAAATAAAAACATAAAAAAGCAACACAAAAAGATAACAAATT AACACAAAAAAATAGAAAAAAAAACAGAAGTAATAAACAAAAAAATAAAG AAGGTAAAATAAAAGAAAAGAATGAAAATAAAACAGAAAAAACAGATATA GACATATAGAATATAAATAAAAAAATACATAATAACAAGAATAAAAAAAA GAAAGAAAAAAAAGAAAACAAAATACTAATAAAAAAATGAAACAACAAAG AAGTAATACAAAATATAAATAAGTGGAACACGCAAAAATAAAAAAAAAGA AAAAAAAAACAAAAAAAAGAAATCGGCAAATAAATTAATAAAATCAGAAA CGAGAACGAGAAATAAAAAACAACTAAGGAAGAAACATCAAAAACAACAA AAGAGTAAATCAAAATAAGAAGAAGAGTAAAAAATTCACGCAAAGAAACA AAATCAAAAGAACAACAAGAAAAAATAAAAAAGAAATATGCAAATAAACA AAAATGTTATTAGAAAAAGTATAATAAATACAATAAGAAAAAAAAAACAC AAATAGAAGAAAAAGAGAAAAAAAGAAAAGAAAAAAAATTAAAAAAAATA AAAAAAAAACAACAAAAAAAAGGAGAGAAAGATAAACAAAATAGAAAATA GAAATAAACAGAAAAAGAGAAAAAAACAAAGAAAAAAAACAAATAAAGTA AAAAGAAACAAAAAAAAAACAAAAAACAAGAGAAAAAAAATAAATACAGA AAATAAAAGAAAACAATACATAAGAATAACAAAAAATAAAAACAATAATA TAAGAATAAGAAATAGAAAAAATAAAGAAAAAAGTAAAAAAAAAAAGAAA GAAAAAAAAAAAAAAGAAAAAAGAAAAAAAGAAGAAAAAAAATATAATAA TAAAAGATAATAAACAAATAAATAACATATAAAGAAAGAGAAGCAGAAAA 
AAAAAAAAAATAAAAAAACAGAAAGAAAAACTGCAAAATAAGAAAATCAA ATAAAAAAAAAACAATAAAAAAATGAATAACAAAGAAAAATATAAAAGAA ACAAAAACAATAAAAAAAACATAAACATAAAAAGAATAATAAGAAAAAAA AAAAAAATAAATGAAATGAGTATAACAAAGAAAGCAAAAAACAAAAAAAA AAAAAAAAAAAAAACATTAAAAAATAAATAAAAAAAATAAAAAAAGAAGG AAAAAAGAAAAATTAAAAAAAAAGTAAAAGCACAAGTCAGAAGAATAACG TAAGAGTGAAGAAAACTAAAAAAAAGAAGAAAAAAAAAGTAAAGATAAAG AAAAAAAAAAAAACAGTAAAGAAAAACAAATAAAAGAAAGGAGAAAAAAG GTCCTGTGGTTTCAGCTGAGTTTGCAGTGTCAGCATTCCTTTC >m130406_011850_42141_c100513442550000001823074308221310_s1_p0/8/0_7878 AAAAAAAAAGAGAGAGGAATGAGTCAGAAGGCACCCCTGATAAGAAAATC TTCGTGGAGGCTGGCGCGAATTTTAGCATTTGTTCAATGGCGTAAACGAA CGTATCCATTTGGGTATGACCTCCATTAATTTCAGCCGATATAGTAACGC AGTGACAAATAAAGTAAATGTAATTGTTTTAGAAAAAATGATTCTTGTGG TATATAATCGCCATCACTTCCAGGCAGCCCATAAGGCCAGCGCACTAAGC AGTGTTGGGATGATAACGCTGCGTGTTTTAAGAAACTGGCACCCAGCTCC GCGAAGCCAACCACAGCGTGGGCACGAAACGGCGTGTAAATCGGCATCAC TTCTGGTGCGGTAGAGACAACAGCAGGAGCCGCATATCGAGGCATGCCAA TGGTGTCGAGCAAAATACCTACCGCGCCACGTTTGGTTGGGCGGGCATTA CCCACACAGCAGGCGCAGCGGCAAATAGCGGAAGCAATAATTCGCCACGC CAACTAGTAACCCAAGCACAGAACCTCATAGCTCCATCGGGCGCCCTTTG CCAGAATGCCTGGATTAACGCCAGTGAGGCAGCCACAAGACAATGCCTGC CAGAATTGCCGACGAATTAGAAAATCAGCGTTACGCCTGCAAGGGCACCA ACTAACCTGCGGTAAACGCAAAGAGATGTTTGCGCTGAAAGAGGCCGGCA GGAAACTCATAAAGAGTGGGCCGGAAGCATAAAACCTAATGCAGCTTCAA CGGCGGGATAAACCTTGCAGCAAGCCGCTGCCGGAGAATGCCCCTATTAC CCGTACAAATAATCCCACGTGACCATGAACTGAAAGGCATGCGATATCCC CAGTTTCTCGCTCCGAGCGGCCGATTATTTGCGTACAAGTTTTGCGGTTG CGAGCGGCAAAAAACCCTCATCCGTCCAGGCAAACGCCCACAGGGCGGTT TTCGATTTTTTGCAGACGCTGAATAATACGGCTACGCAGTGACGGGCCAT ACAACACATGGCGAACATCCATTGCCATGACGGTCAGTGCAGCAATCCCC ACAAAACTACCCCGGCACTGCCCAGCATCGCGTAAGACCGAAACTGGCTC GCGCCTGCATAATGATGCAGGAGAAAAAAACGCTTTCGAGAGGAAGAGAA TCCCAGACCGGGTCCGCATTTCAGACGAACGCAAAGGCCACCGGAATTAT AACTAATAACAATCGGTAAACTGTCCTTTGCAATCCTTCCATGAAAGGTC GCCGAACCAGGAGCAGCTGTGGAGTAGGGCTTTCCATAGAGATGTACGCT ACCGATTGTTAATATTATAAATAGACTGAATGAATATCCTTAACCTTATC AGACTGAGGCTTCTTTAACACCACTTATAAGTGTAAAGCCACGAAAACCG 
TTGTGATTGCTTAATTTTGCGGGCTTCCTGAACGATGAAACCTCGCCACC CAGACCGTAAAAGTTAACCAGGGCAATCGGTCGCGCCAGCCAGACAAACC AGGCCCAAACCGCCATGTTGCCAGGCTGAGGCTAATTAGCCGAACCGGCG GGCACCGCCAATAAAGTAGCTAGTCATGTAACCTGCGGTCAGGCGATTGC GCGCATCAGGATGTATCCGATAAATTACCGTCTGGTTTAGTGATATGCAC GCCCTGCACGGTGAGATCCAGCACCAGGATTCCGATAAATCAAACGCCAG TACGGAAGTGTGACCAAACCAGATCGCCAGCCATGAAATAATAGCAGCAG CAGACCGAAAGTTGTGGTGTGGTGCGATTTGCCCTTATCGCAAACCGCCC GCCGGACGAGCGCCCATACGCTCCGGCAGCTCCCGCAAGTCTCAAACGAC CAATGACACCATCCGCTGTAGTTAAAAGGTGGAGGCGGGCAAGCCAAAAA GGCCCATTGAGGTCCAGAGAATGCTGAGAATGGCAAGGTCAGGCAGCCCA GCAACGCGCGGTACGCAGAATTTTATCGCTGATAACTATGAAACGGAACC CAACAACTGTGGGTAGTTGAGGTGGGTTTCGATTCATTTGTGGCAGACCA CGCCATAATGCCAGCGCCATTCAGTGCCGATTAACGACCCGAAGCAACCA AAAGACGGTGCGCCAGCCGCCGAGATTCGCCAAGCAATCCGGCAACTGTC CGTGCAATGCAAGAATCCCAACAGCAGACCGCTCATAATAGTGCCAACCA CCTTTGCCGCGTTTACCCGGTGAACCAGCGTCGCTGCCAGCGGAACAGAA TTTGTGCCACGACTGAGAATAAAACCGGTTAATGCCGGTACCGAGGATCA TCATCGCCAGCTGACGACTTGCTGGCGGGTAATCAACATGCCGCCAGTAA GGTATCGAGACAATCAGGCGGCGGCCGTTCAAACATATACCGAGGGGAAC AAGAAACAGTAGACCTGCCGGCATAGCCCAACTGCGCGGCGGTAACAATA AAGCCTGCGAACTGCGGAAAGAGGAAAAGTTACGCGAGATGGTGTCGAGC AATGGCTGGGCGTAAATAGTTACTGGCTACCAGCCAGACCGGTGGCGATA GACATCAGCACGATCAGCGCCGGGCAAGCTCAATGATTTAGGTTTAGTCA TTAATTTTTGTTCATCAGGTGATGATTAAGCTGGAAATCCAATAATAGCG GATGGTTGATGAATATGGGATGAAAGATAGATTGTTGAAAGTGCGATGTG GTTTGTTAGGATGATTAAGACGCGCAACAAGCGTCGCATCAGGCATTGTG CACAGAACGGCGGAATGCGGCAAAAACGCCTTATCGCGCCGAATAAAAAT TAACTTCTGCGTGCCAGCGCCTCCATTCACCCCAAGCCATCGAACTGCCT GCTGGTGGGCTTTGATCCAACATCAACCGTTGTCCCTGAAATATCGCCCT TCTGAGGCTTTGCCGTCATGCATAATGGCGTTCTGGGCCGTTAAAATCTG CCACTGGCAACCTGCATAATGGCAACAGTTTCGCTGCTGCCGGGTTTTTC TGCGGCCCAGGCTTTGTTGGCAACGATATGCATGGTGCTGACCCGGGAAG CCATAATTCCGCAAACCATTCAGGCAGTTTTGGTACGGCGTTTTTATCGC ACCCGGCCAGTGCGGAGATACGGCACCTGCAACCAGACGAAACCATCTTT TTTGCCCGGCTTCATTCGTAACTCACCCAGTACGGCGTCCAGGTGAATAA AAACACCGGTTTGCCCTCTTTGGTACGACCTGATGGTGTCGGCCATCATC GCTGCGTAGTTCCCCTGATTAAATGCGTCACGGTGTTGGTCAGTTCATAC GCGGCAAGCGGTGGTTGATCGCACCTTTCGCAGGCCCCAGCCAGGGTTAC 
AACCGGTAAATCAGCTTTTCCGTCGCCGTTGGTATCGAACCAGTTTGGCG ATCTTCGGATCATTTCAGTTGTGCGATGTTGGTGATTTTGTACTGGTCGG CGTTTTTCTTATCGATCAGGTAACCCTGTCCGCGCGTTAACAAATACCCC TTCACGATAAAATTTCTTATCGCCACCGCAGCTTCGTACATGTTGTCATG CAGTGGCGTCCAGTTCACGGCGTGAAGGGTTCATCGCCGGAAGCAACGAG GTGTACCAACGTTGTAAATCTATCGCTTGGGTTTGTTGACGGTATAACCT AAAATTCTCAGCGCACGACTGACAGCAGCGTCTGGAAGGTTTCTTCAGTG GATGGTGCTCTGAACTGGATTAACAGTAATGCTTTGCCCGGCAGATCGCA GCAAAAGTTTGTGTAAGAGATAAGCGTGGCAACGCTGTCGCAAAAAGTAC CGCCTATGTCGCATTGTTATTCCTTTTTTTGGCAGGGTGAATAATGCCGC GTCACCGGGCAAGTGCCAGAGTTATTAAGAATGGGCGGGTCAGCAGACCA ACAGGGCCAAGTTGGTGTACCAGCGACGGTTGCCGCGACTGGCGTGAGTC GCGCCCAACCGGCCTGCGTCAGACGATCGAGGATAATGGCGAGGATCCAA AATCCCGACCGCCGCCCAACGGTGGCAAGCCCCATATCCAGACGACCGAT ACCGCGAAGAAAAACTACCATCTGACCACCCGCCGACGGCAAATCCATCG AGGCGATGACCACCATAGAAAGGGGCCAGCATCAGCGTCCTGGTTAACGC CCGCCCATAATGGTCGGCATCGCCAAGCGGTAACTGAAACTTTGAACAGC ATCTGGCGCGGGCTGGCACCTGAATGAGCGCGAGGCTTCAATCCAGATCC CGCCGGAACCTGGTTAAATCCCCAAATGGTCAGACGGATAATCGGCGCAG AGCAAAGAATTGATCTCACCACCACGCCCCGGCACGTTACCGATACCAAA TAGCATGACGATGGCACCAGATAAACAAACGCTGCGTGGTCTGCAATGGC CATCAAGCAAGTGGACCGAATAATTTTTCGCCGCTCGCGGACTTCTCGCC CAGCCATATCCCAACGGCAAACCGAATGACGATTACAGAACAGCAGGGCG GTTAACACCAGCGCCAGAGTCACCATTGCCTGCGACCAGGCACCGATTGC GCCCGATGGCAATCAGCGAAACCAGCGGTCGCCACACCCCATTCCGACCC CCGGAAATCTGCCAGGCGATGAGAAGTCGAAAACGATAATCGCCACCGGT GCGGCATACCAGCAGCAATTGCTGGAAACCGTTGAAGGATATAATCAACC GGAACGAGCCGCCCTGGAAGACTGGGACGGAATGGGTAACGACCCAGTCG ATCCCTTCAGTGACCCAACTGTCGATGCGGGATCAAGGTTTTTGGAACGG AATGAGAATATTAAAATGCTCTGCGTTTGGCGCAGGCGTACTGGTCAGCC AGTCAGCACCACCGCCGCAGTCGGTGCAGTCGTTCGGTGTACCCCAGGCG TCTGCGGTATTGCGCGGCACTGTCCCGCCGCTGGCGTGGTATCCCACGGG ATTATTTTGATCAGACCATTATTTACCCCCTCAACGATCTAAAGCGCGCA TGCAGCATTCCTTTCGAAATGATGCCGACATACTGTGGTCCTCCGTCGAC CACGGGCACCGCACAGGGTGCCTGTCCCGAACATGAGAGAGCAACTCGCT AAGAGGCGTTTGTGCATCGACTGCTAACGGCGCCATCAAATCAGCGCCCG CATCAAGACCTTGCGGCTGCGTTAACGCGGTTTTAAAGCGAATCGATGGA GACTGCGCCGACAAACTTTATTACCGGCGTTCGATAACGTAGCCATATTT CGCGATCTTCATCCCTGCAATAATTTCAGTGCCGAACGTGGCCGAAGCCA 
GGGGTTTTACGAAAATAAGCCATATCGGTGTCCGGCGGGCAATATCTTTC GCACTGATACCTGACTAATATCCAACGCCACGGAAGAAGTACGGAACATA ATCATTCGCCGGATTATGAGAATTTCATCCGGTGTGTCCGACCTGTACAC TTCACCATTTTTATAATGGCAATCGGTCGCCAATACGCATGGCTTCATCA AGATCGTGGAAAATAAAGACAATGGTGCGCTGAATGTTTCGCCTGTAATT TTCCAGCTCATCCTGCATCTCGGTGCGAATTAAATGATCGAGCGCCGAGA AGGCTTCGTCCATTAATAATATATCCGGATTATATCGCTAACGCGCGGGC TAATCCCACACGTTTGGACGCCATCCCGCCAGAGAGTTCCATCCGGGTGA CTGTGGGCATAATTTTCCACCCGACCTGACGCAGTGCATCAAGGGCTTTT TCCCGGCGTTCTTCGGCATTAATCCGGCCAATTCCATACCGAAGCGCAGT ATTGTCCAGCATCGGTCATAAATGCGGCATTAAGGCAAAGGACTGGAATA CCATCGCAATCTTTTTCTTGCGCACCTCACCGGGTTCGGCGTCGGATATT TTGGCAATATCCACACCATCAATCAGTCCACTTGCCCGCGGGTGGGTTCA ATCAGGCGATTGAGACAGCCGTACCATTGTGGATTTACCAGCCGGATAAT CCATTGGATGACAAATATACTCGCCTTCTTCAATGGCCAGACTGGCGTCT TTTACGCCAAGCGATAGCCCAGTTTTTTCCCAGAATTTGGTTCTTTTGAA AGTCCTTGTTCGATATATTTGAACGCTCGCTGTGGATGCTCGCCAAATAT TTTATAAAGAATTTTTAATTTTCTAATTTAATTGCATGCAATAGAAAGAT TCCTTTATTTGTCTATGTCGATATATTACCCAATGGAAATAGTCACTTTT TTCTAACCCTAACATACTGAGAATAACTGAGGGCAACCCCTGATGGCAAA TGGTGGCATGAATATTTGATGCGAGCAAATCGCCGGAATTTCCCGTGATA TAAAGGGCTGGACAATATCGAGAAAGTTGGATAATTTTTTGTCAAAGATA GCGAAAAAACGTCCTGAATATTGTGGTTTCAGCGAATCAATGCAACAGAG TATCAGTGTAGATCACCACAAATTAATTTGCGTGATAAAAAGCTATTTGG CGGATTATTTCCCTGCTGCGGGTAGTGATATTTTTGGAAAATAACACCCT CAGAAATTCCAGTCTTCATCTTCTGTTTCAACCGCTTTCCCCAATCACAA ATAAGAGGAGCCTGAACCGGAAAAAGAATCGTGATTTTCATCGGCATTCG CGAAAGCGCGGCGAGGATTGCCGGATTCACTTCCGCATTTTCGGGAAATA AACGGTTCGTAGCCCAGATTTCCCCCCCAAAAATCCCAAAAAGCCTTATT GGCGTTGTAACAGAGAAACGCTTTTCCACATCGTCACCGACGGGGTTTCG GCGTACAGCTCCATCGGTGTATGTGAACTCCGTTGTTCGTAGAGTTCCAG CAACAAATCGAAGGCGAAACTCTCAACTTCCTCACGTTGTCCAGAGATAT CTTTTCCCATGTTTTTCTGATATTTAATAGGCCTATGTAGTAACCGTGAC TGCTTCATCGCGGATAATCAGGACGGATTCAGGTTCCGCGGTATTGGTCA GCTTTCCGCGGCTGGAAAAATACAATCGGCAGCCAGAAACCGGAATAGAA TCAAAAAAGATTCAAGAACAACTGGCGATTTTCTTTTTCAGCGGATCATC AACCGCGATAATGTTTGCTGAAATAATCTGAGCTTTCGCTGCAACGGTGC GTTTTCTTCACTCCAGGCGTAGCGGCATCGACATCTTTTGGTCTGGCAAT AGCGTCGAGAAAATCGAACCTGTAAGAGCGGCATGAACCGCTTCGCATAA 
AGCTGATATTCGAATAATCCGCTTCTTTCATGAGGCGTGGAGTGCAATCG GGCAATCAGAGAAGGGCGCGCCGTAACATTTTGCAGCGTGTCGAGCAGCG TCAGGCCAGTAAAAACGCGCATCGTCCAAGTTGTTGTTCTACGACAGTTA AATGTCTGCCCAAGGCAGGAAATATCGTTCGACAGCGGCACCTTTTTCTG GTAGCCAGAATGCCTGGTCAGGCGATTCCACACCTCCAGATCTTTATCGT ACAGATTCTTTGTTCCAGTTTTGATGGGCGCTGATACGTGAAGTTTCATA GATATCCCTTAAGTGCACAGGAGGACGCCAGCCTTCAATTTCAGTGCCTT CCAGCGCCATCTGACGCAGGCGGAATGTAATAGAGCGTTTTGATACCCTT GGCGCCAGGCGTAAATCTGCCGCTTTGTTGATATCGCGAGTGGTGGCGGT ATCGGGGAAAAAAAGCCGTCCAGCGACAGCCCCTTGACGACAATGGCTGA GTCCGCCTTCCGCGTAGGTTCGTGATACTTTTCTGCGCCAATTTCGTAAG GCCGTCCTGATACAGCGCAGATTCCGTTAGTCTAATAAACGGCAGGGTAG TAAACCGCGTCCTGTTTTGCCCCGCCCC pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/data/test_pulseFile.fofn000066400000000000000000000007261241505617700256670ustar00rootroot00000000000000/mnt/data3/vol57/2650250/0001/Analysis_Results/m121004_000921_42130_c100440700060000001523060402151341_s1_p0.bas.h5 /mnt/data3/vol53/2450496/0001/Analysis_Results/m130406_011850_42141_c100513442550000001823074308221310_s1_p0.1.bax.h5 /mnt/data3/vol53/2450496/0001/Analysis_Results/m130406_011850_42141_c100513442550000001823074308221310_s1_p0.2.bax.h5 /mnt/data3/vol53/2450496/0001/Analysis_Results/m130406_011850_42141_c100513442550000001823074308221310_s1_p0.3.bax.h5 pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/out/000077500000000000000000000000001241505617700217175ustar00rootroot00000000000000pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/out/readme000066400000000000000000000001311241505617700230720ustar00rootroot00000000000000# This directory saves output files generated by pbalign's # unit tests and cram tests. 
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/000077500000000000000000000000001241505617700220675ustar00rootroot00000000000000
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_fileutil.py000077500000000000000000000141661241505617700253300ustar00rootroot00000000000000
"""Test pbalign/utils/fileutil.py"""

import unittest
from os import path
from pbalign.utils.fileutil import getFileFormat, \
    isValidInputFormat, isValidOutputFormat, getFilesFromFOFN, \
    checkInputFile, checkOutputFile, checkReferencePath, \
    real_upath, real_ppath, isExist
import filecmp


class Test_fileutil(unittest.TestCase):
    """Test pbalign/utils/fileutil.py"""

    def setUp(self):
        # Root of the tests/ directory (the parent of tests/unit).
        self.rootDir = path.dirname(path.dirname(path.abspath(__file__)))

    def test_isValidInputFormat(self):
        """Test isValidInputFormat()."""
        self.assertTrue(isValidInputFormat(getFileFormat("ab.fasta")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.fa")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.pls.h5")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.plx.h5")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.bas.h5")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.bax.h5")))
        self.assertTrue(isValidInputFormat(getFileFormat("ab.fofn")))
        self.assertFalse(isValidInputFormat(getFileFormat("ab.sam")))
        self.assertFalse(isValidInputFormat(getFileFormat("ab.cmp.h5")))
        self.assertFalse(isValidInputFormat(getFileFormat("ab.xyz")))

    def test_isValidOutputFormat(self):
        """Test isValidOutputFormat()."""
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.fasta")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.fa")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.pls.h5")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.plx.h5")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.bas.h5")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.bax.h5")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.fofn")))
        self.assertTrue(isValidOutputFormat(getFileFormat("ab.sam")))
        self.assertTrue(isValidOutputFormat(getFileFormat("ab.cmp.h5")))
        self.assertFalse(isValidOutputFormat(getFileFormat("ab.xyz")))

    def test_getFilesFromFOFN(self):
        """Test getFilesFromFOFN()."""
        fofnFN = "{0}/data/ecoli_lp.fofn".format(self.rootDir)
        fns = ["/home/UNIXHOME/yli/yliWorkspace/private/yli/data" + \
               "/testLoadPulses/m121215_065521_richard_c10042571" + \
               "0150000001823055001121371_s1_p0.pls.h5",
               "/home/UNIXHOME/yli/yliWorkspace/private/yli/data" + \
               "/testLoadPulses/m121215_065521_richard_c10042571" + \
               "0150000001823055001121371_s2_p0.pls.h5"]
        self.assertEqual(fns, getFilesFromFOFN(fofnFN))

    def test_checkInputFile(self):
        """Test checkInputFile()."""
        fastaFN = "{0}/data/ecoli.fasta".format(self.rootDir)
        plsFN = "/home/UNIXHOME/yli/yliWorkspace/private/yli/" + \
                "data/testLoadPulses/m121215_065521_richard_" + \
                "c100425710150000001823055001121371_s1_p0.pls.h5"
        self.assertTrue(filecmp.cmp(fastaFN, checkInputFile(fastaFN)))
        self.assertTrue(filecmp.cmp(plsFN, checkInputFile(plsFN)))

        fofnFN = "{0}/data/ecoli_lp.fofn".format(self.rootDir)
        self.assertTrue(filecmp.cmp(fofnFN, checkInputFile(fofnFN)))

    def test_checkOutputFile(self):
        """Test checkOutputFile()."""
        samFN = "{0}/out/lambda_out.sam".format(self.rootDir)
        cmpFN = "{0}/out/lambda_out.cmp.h5".format(self.rootDir)
        self.assertTrue(filecmp.cmp(samFN, checkOutputFile(samFN)))
        self.assertTrue(filecmp.cmp(cmpFN, checkOutputFile(cmpFN)))

    def test_checkReferencePath(self):
        """Test checkReferencePath()."""
        refDir = "/mnt/secondary/Smrtanalysis/opt/smrtanalysis/common/" + \
                 "references"
        refPath = path.join(refDir, "lambda")
        refPath, refFastaOut, refSaOut, isWithinRepository, annotation = \
            checkReferencePath(refPath)
        self.assertTrue(filecmp.cmp(refFastaOut,
                                    path.join(refPath,
                                              "sequence/lambda.fasta")))
        self.assertTrue(filecmp.cmp(refSaOut,
                                    path.join(refPath,
                                              "sequence/lambda.fasta.sa")))
        self.assertTrue(isWithinRepository)

        refpath, refFastaOut, refSaOut, isWithinRepository, annotation = \
            checkReferencePath(refFastaOut)
        self.assertTrue(filecmp.cmp(refFastaOut,
                                    path.join(refPath,
                                              "sequence/lambda.fasta")))
        self.assertTrue(filecmp.cmp(refSaOut,
                                    path.join(refPath,
                                              "sequence/lambda.fasta.sa")))
        self.assertTrue(isWithinRepository)

        fastaFN = "{0}/data/ecoli.fasta".format(self.rootDir)
        refpath, refFastaOut, refSaOut, isWithinRepository, annotation = \
            checkReferencePath(fastaFN)
        self.assertTrue(filecmp.cmp(refpath, refFastaOut))
        self.assertIsNone(refSaOut)
        self.assertFalse(isWithinRepository)

        refPathWithAnnotation = "/mnt/secondary-siv/" + \
            "testdata/BlasrTestData/pbalign/data/references/H1_6_Scal_6x/"
        _refPath, _refFaOut, _refSaOut, _isWithinRepository, annotation = \
            checkReferencePath(refPathWithAnnotation)
        self.assertEqual(path.abspath(annotation),
                         path.abspath(path.join(
                             refPathWithAnnotation,
                             "annotations/H1_6_Scal_6x_adapters.gff")))

    def test_isExist(self):
        """Test isExist(ff)."""
        self.assertFalse(isExist(None))

    def test_realpath(self):
        """Test real_upath and real_ppath."""
        print(real_upath("ref with space"))
        self.assertTrue(real_upath("ref with space").endswith("ref\ with\ space"))
        self.assertTrue(real_upath("ref\ with\ space").endswith("ref\ with\ space"))
        self.assertTrue(real_ppath("ref with space").endswith("ref with space"))
        self.assertTrue(real_ppath("ref\ with\ space").endswith("ref with space"))


if __name__ == "__main__":
    unittest.main()
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_filterservice.py000077500000000000000000000051671241505617700263620ustar00rootroot00000000000000
"""Test pbalign.filterservice."""

from pbalign.filterservice import FilterService
from os import path
import unittest


class Opt(object):
    """The Option class."""

    def __init__(self, maxDivergence, minAccuracy, minLength,
                 seed, scoreCutoff, hitPolicy):
        self.maxDivergence = maxDivergence
        self.minAccuracy = minAccuracy
        self.minLength = minLength
        self.seed = seed
        self.scoreCutoff = scoreCutoff
        self.hitPolicy = hitPolicy
        self.filterAdapterOnly = None


class Test_FilterService(unittest.TestCase):
    """Test pbalign.filterservice."""

    def setUp(self):
        self.testDir = path.dirname(path.dirname(path.abspath(__file__)))
        self.alignedSam = path.join(self.testDir, "data/lambda.sam")
        self.targetFileName = "/mnt/secondary/Smrtanalysis/opt/" + \
                              "smrtanalysis/common/references/" + \
                              "lambda/sequence/lambda.fasta"
        self.filteredSam = path.join(self.testDir, "out/lambda_filtered.sam")

    def test_init(self):
        """Test FilterService.__init__()."""
        options = Opt(30, 70, 50, 1, None, "random")
        obj = FilterService(self.alignedSam, self.targetFileName,
                            self.filteredSam, "BlasrService", -1, options)
        self.assertTrue(obj.availability)  # samFilter should be available
        self.assertIn("-minPctSimilarity 70", obj.cmd)
        self.assertIn("-minAccuracy 70", obj.cmd)
        self.assertIn("-scoreSign -1", obj.cmd)

    def test_run(self):
        """Test FilterService.run()."""
        options = Opt(30, 70, 50, 1, None, "random")
        obj = FilterService(self.alignedSam, self.targetFileName,
                            self.filteredSam, "BlasrService", -1, options)
        _output, errCode, _errMsg = obj.run()
        self.assertEqual(errCode, 0)

    def test_run_without_scoreCutoff(self):
        """Test FilterService.run() without score cutoff."""
        options2 = Opt(40, 50, None, None, None, "allbest")
        obj2 = FilterService(self.alignedSam, self.targetFileName,
                             self.filteredSam, "BowtieService", 1, options2)
        self.assertNotIn("-seed", obj2.cmd)
        self.assertNotIn("-scoreCutoff", obj2.cmd)
        self.assertIn("-scoreSign 1", obj2.cmd)
        _output, errCode, _errMsg = obj2.run()
        self.assertEqual(errCode, 0)


if __name__ == "__main__":
    unittest.main()
pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_forquiverservice.py000077500000000000000000000031121241505617700271030ustar00rootroot00000000000000
"""Test pbalign.forquiverservice.forquiver."""

import unittest
from os import path, remove
from shutil import copyfile
from pbalign.forquiverservice.forquiver import ForQuiverService
from pbalign.pbalignfiles import
PBAlignFiles from tempfile import mkstemp class Opt(object): """Simulate PBAlign options.""" def __init__(self): """Option class.""" self.verbosity = 2 self.metrics = "DeletionQV, InsertionQV" self.byread = None class Test_ForQuiverService(unittest.TestCase): """Test pbalign.forquiverservice.forquiver.""" def setUp(self): self.rootDir = "/mnt/secondary-siv/" + \ "testdata/BlasrTestData/pbalign" self.inCmpFile = path.join(self.rootDir, "data/testforquiver.cmp.h5") #self.outCmpFile = path.join(self.rootDir, "out/testforquiver.cmp.h5") self.outCmpFile = mkstemp(suffix=".cmp.h5")[1] copyfile(self.inCmpFile, self.outCmpFile) self.basFile = path.join(self.rootDir, "data/lambda_bax.fofn") refpath = "/mnt/secondary/Smrtanalysis/opt/" + \ "smrtanalysis/common/references/lambda/" self.fileNames = PBAlignFiles() self.fileNames.SetInOutFiles(self.basFile, refpath, self.outCmpFile, None, None) self.options = Opt() self.obj = ForQuiverService(self.fileNames, self.options) def tearDown(self): remove(self.outCmpFile) def test_run(self): """Test ForQuiverService.run().""" self.obj.run() if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_gmap.py000077500000000000000000000050131241505617700244260ustar00rootroot00000000000000"""Test pbalign.alignservice.gmap.""" import unittest from os import path from pbalign.alignservice.gmap import GMAPService from pbalign.pbalignfiles import PBAlignFiles from pbalign.options import parseOptions import argparse class Test_GMAPService(unittest.TestCase): """Test pbalign.alignservice.gmap.""" def setUp(self): """Set up test data.""" self.rootDir = path.dirname(path.dirname(path.abspath(__file__))) self.outDir = path.join(self.rootDir, "out/") self.outSam = path.join(self.outDir, "test_gmap_01.sam") self.dataDir = path.join(self.rootDir, "data/") self.queryFofn = path.join(self.dataDir, "ecoli_lp.fofn") self.refFa = path.join(self.dataDir, "ecoli.fasta") self.repoPath = 
"/mnt/secondary/Smrtanalysis/opt/smrtanalysis/" + \ "common/references/ecoli/" def test_gmapCreateDB_case1(self): """Test _gmapCreateDB(refFile, isWithinRepository, tempRootDir). Condition: the reference is not within a reference repository. """ # Case 1: the reference is not within a reference repository files = PBAlignFiles() parser = argparse.ArgumentParser() argumentList = [self.queryFofn, self.refFa, self.outSam, '--algorithm', 'gmap'] parser, options, _info = parseOptions(argumentList=argumentList, parser=parser) service = GMAPService(options, files) dbRoot, dbName = service._gmapCreateDB(self.refFa, False, self.outDir) self.assertTrue(path.exists(dbRoot)) self.assertTrue(path.exists(path.join(dbRoot, dbName))) def test_gmapCreateDB_case2(self): """Test _gmapCreateDB(refFile, isWithinRepository, tempRootDir). Condition: the reference is within a reference repository. """ # Case 2: the reference is within a reference repository files = PBAlignFiles() parser = argparse.ArgumentParser() argumentList = [self.queryFofn, self.repoPath, self.outSam, '--algorithm', 'gmap'] parser, options, _info = parseOptions(argumentList=argumentList, parser=parser) service = GMAPService(options, files) dbRoot, dbName = service._gmapCreateDB(files.targetFileName, True, self.outDir) self.assertEqual(path.abspath(dbRoot), path.abspath(self.repoPath)) self.assertEqual(dbName, "gmap_db") pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_loadpulsesservice.py000077500000000000000000000023711241505617700272420ustar00rootroot00000000000000"""Test pbalign.forquiverservice.loadpulses.""" import unittest from os import path, remove from shutil import copyfile from pbalign.forquiverservice.loadpulses import LoadPulsesService from argparse import Namespace from tempfile import mkstemp class Test_LoadPulsesService(unittest.TestCase): """Test pbalign.forquiverservice.loadpulses.""" def setUp(self): """Set up tests.""" self.rootDir = "/mnt/secondary-siv/" + \ 
"testdata/BlasrTestData/pbalign" self.inCmpFile = path.join(self.rootDir, "data/testloadpulses.cmp.h5") #self.outCmpFile = path.join(self.rootDir, "out/testloadpulses.cmp.h5") self.outCmpFile = mkstemp(suffix=".cmp.h5")[1] self.basFile = path.join(self.rootDir, "data/lambda_bax.fofn") copyfile(self.inCmpFile, self.outCmpFile) self.options = Namespace(metrics="DeletionQV", byread=False) self.obj = LoadPulsesService(self.basFile, self.outCmpFile, self.options) def tearDown(self): remove(self.outCmpFile) def test_run(self): """Test LoadPulsesService.__init__().""" _output, errCode, _errMsg = self.obj.run() self.assertEqual(errCode, 0) if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_options.py000077500000000000000000000126031241505617700252000ustar00rootroot00000000000000from pbalign.options import * from argparse import * from os import path import filecmp import unittest rootDir = path.dirname(path.dirname(path.abspath(__file__))) configFile = path.join(rootDir, "data/2.config") configFile3 = path.join(rootDir, "data/3.config") def writeConfigFile(configFile, configOptions): """Write configs to a file.""" with open (configFile, 'w') as f: f.write("\n".join(configOptions)) class Test_Options(unittest.TestCase): def test_importConfigOptions(self): """Test importConfigOptions().""" configOptions = ("--minAccuracy = 40", "--maxHits = 20") writeConfigFile(configFile, configOptions) options = Namespace(configFile=configFile, minAccuracy=10, maxHits=12) newOptions, infoMsg = importConfigOptions(options) self.assertEqual(int(newOptions.maxHits), 20) self.assertEqual(int(newOptions.minAccuracy), 40) def test_ConstructOptionParser(self): """Test constructOptionParser().""" ret = constructOptionParser() self.assertEqual(type(ret), argparse.ArgumentParser) def test_parseOptions(self): """Test parseOptions().""" configOptions = ( "--maxHits = 20", "--minAnchorSize = 15", "--minLength = 100", "--algorithmOptions = 
'-noSplitSubreads " + \ "-maxMatch 30 -nCandidates 30'", "# Some comments", "--scoreFunction = blasr", "--hitPolicy = random", "--maxDivergence = 40", "--debug") writeConfigFile(configFile, configOptions) def test_parseOptions_with_config(self): """Test parseOptions with a config file.""" # With the above config file argumentList = ['--configFile', configFile, '--maxHits', '30', '--minAccuracy', '50', 'readfile', 'reffile', 'outfile'] parser, options, infoMsg = parseOptions(argumentList) self.assertTrue(filecmp.cmp(options.configFile, configFile)) self.assertEqual(int(options.maxHits), 30) self.assertEqual(int(options.minAccuracy), 50) self.assertEqual("".join(options.algorithmOptions), "-noSplitSubreads -maxMatch 30 -nCandidates 30") self.assertEqual(options.scoreFunction, "blasr") self.assertEqual(options.hitPolicy, "random") self.assertEqual(int(options.maxDivergence), 40) def test_parseOptions_without_config(self): """Test parseOptions without any config file.""" argumentList = ['--maxHits=30', '--minAccuracy=50', 'readfile', 'reffile', 'outfile'] parser, options,infoMsg = parseOptions(argumentList) self.assertIsNone(options.configFile) self.assertEqual(int(options.maxHits), 30) self.assertEqual(int(options.minAccuracy), 50) self.assertIsNone(options.algorithmOptions) self.assertIsNone(options.minAnchorSize) def test_parseOptions_multi_algorithmOptions(self): """Test parseOptions with multiple algorithmOptions.""" algo1 = " -holeNumbers 1" algo2 = " -nCandidate 25" algo3 = " ' -bestn 11 '" argumentList = ['--algorithmOptions=%s' % algo1, '--algorithmOptions=%s' % algo2, 'readfile', 'reffile', 'outfile'] print argumentList parser, options, infoMsg = parseOptions(argumentList) # Both algo1 and algo2 should be in algorithmOptions. print options.algorithmOptions #self.assertTrue(algo1 in options.algorithmOptions) #self.assertTrue(algo2 in options.algorithmOptions) # Construct a config file. 
configOptions = ("--algorithmOptions = \"%s\"" % algo3) writeConfigFile(configFile3, [configOptions]) argumentList.append("--configFile={0}".format(configFile3)) print argumentList parser, options, infoMsg = parseOptions(argumentList) # Make sure algo3 have been overwritten. print options.algorithmOptions self.assertTrue(algo1 in options.algorithmOptions) self.assertTrue(algo2 in options.algorithmOptions) self.assertFalse(algo3 in options.algorithmOptions) def test_parseOptions_without_some_options(self): """Test parseOptions without specifying maxHits and minAccuracy.""" # Test if maxHits and minAccuracy are not set, # whether both options.minAnchorSize and maxHits are None argumentList = ["--minAccuracy", "50", 'readfile', 'reffile', 'outfile'] parser, options,infoMsg = parseOptions(argumentList) self.assertIsNone(options.minAnchorSize) self.assertIsNone(options.maxHits) def test_importDefaultOptions(self): """Test importDefaultOptions""" options = Namespace(configFile=configFile, minAccuracy=10, maxHits=12) defaultOptions = {"minAccuracy":30, "maxHits":14} newOptions, infoMsg = importDefaultOptions(options, defaultOptions) self.assertEqual(newOptions.minAccuracy, 10) self.assertEqual(newOptions.maxHits, 12) if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_pbalign.py000077500000000000000000000050561241505617700251250ustar00rootroot00000000000000import unittest from os import path from pbalign.pbalignrunner import PBAlignRunner class Test_PBAlignRunner(unittest.TestCase): def setUp(self): self.rootDir = path.dirname(path.dirname(path.abspath(__file__))) self.queryFile = path.join(self.rootDir, "data/lambda_query.fasta") self.referenceFile = "/mnt/secondary/Smrtanalysis/opt/" + \ "smrtanalysis/common/references/" + \ "lambda/sequence/lambda.fasta" self.configFile = path.join(self.rootDir, "data/1.config") self.samOut = path.join(self.rootDir, "out/lambda_out.sam") self.cmph5Out = path.join(self.rootDir, 
"out/lambda_out.cmp.h5") def test_init(self): """Test PBAlignRunner.__init__().""" argumentList = ['--minAccuracy', '70', '--maxDivergence', '30', self.queryFile, self.referenceFile, self.samOut] pbobj = PBAlignRunner(argumentList = argumentList) self.assertEqual(pbobj.start(), 0) def test_init_with_algorithmOptions(self): """Test PBAlignRunner.__init__() with --algorithmOptions.""" argumentList = ['--algorithmOptions', '-minMatch 10 -useccsall', self.queryFile, self.referenceFile, self.cmph5Out] pbobj = PBAlignRunner(argumentList = argumentList) self.assertEqual(pbobj.start(), 0) def test_init_with_config_algorithmOptions(self): """Test PBAlignRunner.__init__() with a config file and --algorithmOptions.""" argumentList = ['--algorithmOptions', '-maxMatch 20 -nCandidates 30', '--configFile', self.configFile, self.queryFile, self.referenceFile, self.cmph5Out] pbobj = PBAlignRunner(argumentList = argumentList) self.assertEqual(pbobj.start(), 0) def test_init_expect_conflicting_options(self): """Test PBAlignRunner.__init__() with a config file and --algorithmOptions and expect a ValueError for conflicting options.""" argumentList = ['--algorithmOptions', '-minMatch 10 -useccsall', '--configFile', self.configFile, self.queryFile, self.referenceFile, self.cmph5Out] pbobj = PBAlignRunner(argumentList = argumentList) with self.assertRaises(ValueError) as cm: # Expect a ValueError since -minMatch and --minAnchorSize conflicts. 
pbobj.start() if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_pbalignfiles.py000077500000000000000000000071321241505617700261450ustar00rootroot00000000000000import unittest import filecmp from os import path from pbalign.pbalignfiles import PBAlignFiles class Test_PbAlignFiles_Ecoli(unittest.TestCase): def setUp(self): self.rootDir = path.dirname(path.dirname(path.abspath(__file__))) self.inputFileName = path.join(self.rootDir, "data/ecoli.fasta") self.referencePath = "/mnt/secondary/Smrtanalysis/opt/" + \ "smrtanalysis/common/references/ecoli_K12_MG1655/" self.targetFileName = path.join(self.referencePath, "sequence/ecoli_K12_MG1655.fasta") self.sawriterFileName = self.targetFileName + ".sa" self.outputFileName = path.join(self.rootDir, "out/tmp.sam") def test_init(self): """Test PBAlignFiles.__init__() with a reference repository.""" # Without region table p = PBAlignFiles(self.inputFileName, self.referencePath, self.outputFileName) self.assertTrue(filecmp.cmp(p.inputFileName, self.inputFileName)) self.assertTrue(p.referencePath, path.abspath(path.expanduser(self.referencePath))) self.assertTrue(filecmp.cmp(p.targetFileName, self.targetFileName)) self.assertTrue(filecmp.cmp(p.outputFileName, self.outputFileName)) self.assertIsNone(p.regionTable) class Test_PbAlignFiles(unittest.TestCase): def setUp(self): self.rootDir = path.dirname(path.dirname(path.abspath(__file__))) self.inputFileName = path.join(self.rootDir, "data/lambda_bax.fofn") self.referenceFile = "/mnt/secondary/Smrtanalysis/opt/" + \ "smrtanalysis/common/references/" + \ "lambda/sequence/lambda.fasta" self.outputFileName = path.join(self.rootDir, "out/tmp.sam") def test_init(self): """Test PBAlignFiles.__init__().""" # Without region table p = PBAlignFiles(self.inputFileName, self.referenceFile, self.outputFileName) self.assertTrue(filecmp.cmp(p.inputFileName, self.inputFileName)) self.assertTrue(filecmp.cmp(p.referencePath, 
self.referenceFile)) self.assertTrue(filecmp.cmp(p.targetFileName, self.referenceFile)) self.assertTrue(filecmp.cmp(p.outputFileName, self.outputFileName)) self.assertIsNone(p.regionTable) def test_init_region_table(self): """Test PBAlignFiles.__init__() with a region table.""" # With an artificial region table regionTable = path.join(self.rootDir, "data/lambda.rgn.h5") p = PBAlignFiles(self.inputFileName, self.referenceFile, self.outputFileName, regionTable) self.assertTrue(filecmp.cmp(p.regionTable, regionTable)) def test_setInOutFiles(self): """Test PBAlignFiles.SetInOutFiles().""" p = PBAlignFiles() self.assertIsNone(p.inputFileName) self.assertIsNone(p.outputFileName) self.assertIsNone(p.referencePath) p.SetInOutFiles(self.inputFileName, self.referenceFile, self.outputFileName, None) self.assertTrue(filecmp.cmp(p.inputFileName, self.inputFileName)) self.assertTrue(filecmp.cmp(p.referencePath, self.referenceFile)) self.assertTrue(filecmp.cmp(p.targetFileName, self.referenceFile)) self.assertTrue(filecmp.cmp(p.outputFileName, self.outputFileName)) self.assertIsNone(p.regionTable) if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_progutil.py000077500000000000000000000006501241505617700253510ustar00rootroot00000000000000import unittest from os import path from pbalign.utils.progutil import * class Test_progutil(unittest.TestCase): def setUp(self): self.prog = "blasr" def testAvailability(self): self.assertTrue(Availability(self.prog)) def testCheckAvailability(self): CheckAvailability(self.prog) def testExecute(self): Execute("ls", "ls") if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_referenceInfo.py000077500000000000000000000014661241505617700262640ustar00rootroot00000000000000from pbalign.utils.fileutil import ReferenceInfo import unittest class Test_ReferenceInfo(unittest.TestCase): def test_init(self): """Test ReferenceInfo.__init__() with a 
valid reference.info.xml.""" rootDir = "/mnt/secondary/Smrtanalysis/opt/" + \ "smrtanalysis/common/references/lambda/" r = ReferenceInfo(rootDir + "reference.info.xml") self.assertEqual(r.refFastaFile, rootDir + "sequence/lambda.fasta") self.assertEqual(r.refSawriterFile, rootDir + "sequence/lambda.fasta.sa") def test_init_with_errors(self): with self.assertRaises(ValueError) as cm: r2 = ReferenceInfo("noexist.txt") with self.assertRaises(IOError) as cm: r2 = ReferenceInfo("noexist.xml") if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_repackservice.py000077500000000000000000000023171241505617700263340ustar00rootroot00000000000000"""Test pbalign.forquiverservice.repack.""" import unittest from os import path, remove from shutil import copyfile from pbalign.forquiverservice.repack import RepackService from tempfile import mkstemp class Test_RepackService(unittest.TestCase): """Test pbalign.forquiverservice.repack.""" def setUp(self): """Set up the tests.""" self.rootDir = "/mnt/secondary-siv/" + \ "testdata/BlasrTestData/pbalign" self.inCmpFile = path.join(self.rootDir, "data/testrepack.cmp.h5") #self.outCmpFile = path.join(self.rootDir, "out/testrepack.cmp.h5") self.outCmpFile = mkstemp(suffix=".cmp.h5")[1] self.tmpCmpFile = self.outCmpFile + ".tmp" #self.tmpCmpFile = path.join(self.rootDir, "out/testrepack.cmp.h5.tmp") copyfile(self.inCmpFile, self.outCmpFile) self.options = {} self.obj = RepackService(self.outCmpFile, self.tmpCmpFile) def tearDown(self): remove(self.outCmpFile) def test_run(self): """Test RepackService.run().""" print self.obj.cmd _output, errCode, _errMsg = self.obj.run() self.assertEqual(errCode, 0) if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_rgnh5io.py000077500000000000000000000107131241505617700250600ustar00rootroot00000000000000"""Test pbalign.utils.RgnH5IO.py.""" import unittest from pbcore.io.BasH5IO import 
ADAPTER_REGION, INSERT_REGION, HQ_REGION from pbalign.utils.RgnH5IO import Region, RegionTable, RgnH5Reader, \ RgnH5Writer, addStrListAttr from os import path import h5py class Test_RgnH5IO(unittest.TestCase): """Test RgnH5Reader and RgnH5Writer.""" def setUp(self): """Set up test data.""" self.rootDir = path.dirname(path.dirname(path.abspath(__file__))) self.inRgnFN = "/mnt/secondary-siv/testdata/" + \ "BlasrTestData/pbalign/data/test_rgnh5io.rgn.h5" self.outRgnFN = path.join(self.rootDir, "out/test_rgnh5io_out.rgn.h5") self.outTmpFN = path.join(self.rootDir, "out/test_rgnh5io_tmp.h5") self.movieName = "m130427_152935_42178_" + \ "c100518602550000001823079209281316_s1_p0" def test_Region(self): """Test class Region.""" rtuple = (0, 2, 2, 3, 4) r = Region(list(rtuple)) self.assertTrue(r.toTuple() == rtuple) r.setStartAndEnd(10, 20) self.assertTrue(r.toTuple() == (0, 2, 10, 20, 4)) self.assertTrue(r.isHqRegion) self.assertTrue(r.holeNumber == 0) self.assertTrue(r.typeIndex == HQ_REGION) self.assertTrue(r.start == 10) self.assertTrue(r.end == 20) self.assertTrue(r.score == 4) # Create an instance of Region with type = ADAPTER_REGION. self.assertTrue(Region([11, ADAPTER_REGION, 30, 50, -1]).isAdapter) # Create an instance of Region with type = INSERT_REGION. 
self.assertTrue(Region([11, INSERT_REGION, 30, 50, -1]).isInsert) def test_RegionTable(self): """Test class RegionTable.""" l = [Region([11, 1, 1634, 7207, -1]), Region([11, 2, 1634, 7207, 872])] rt = RegionTable(11, l) self.assertEqual(rt.numRegions, 2) self.assertEqual(rt.toList(), [(11, 1, 1634, 7207, -1), (11, 2, 1634, 7207, 872)]) def test_reader(self): """Test RgnH5Reader.""" reader = RgnH5Reader(self.inRgnFN) for rt in reader: if (rt.holeNumber in [0, 1, 81740]): self.assertEqual( rt.toList(), [(rt.holeNumber, 2, 0, 0, 0)]) elif rt.holeNumber == 11: self.assertEqual( rt.toList(), [(11, 1, 1634, 7207, -1), (11, 2, 1634, 7207, 882)]) elif rt.holeNumber == 30: self.assertEqual( rt.toList(), [(30, 1, 14046, 17047, -1), (30, 1, 17092, 19610, -1), (30, 0, 17047, 17092, 955), (30, 2, 14046, 19610, 890)]) # Reset HQRegion and test. rt.setHQRegion(0, 0) self.assertEqual( rt.toList(), [(30, 1, 14046, 17047, -1), (30, 1, 17092, 19610, -1), (30, 0, 17047, 17092, 955), (30, 2, 0, 0, 890)]) self.assertEqual(reader.movieName, self.movieName) self.assertEqual(reader.numZMWs, 81741) reader.close() def test_writer(self): """Test RgnH5Writer().""" reader = RgnH5Reader(self.inRgnFN) writer = RgnH5Writer(self.outRgnFN) writer.writeScanDataGroup(reader.scanDataGroup) for rt in reader: writer.addRegionTable(rt) if rt.holeNumber == 1000: break reader.close() writer.close() reader1 = RgnH5Reader(self.inRgnFN) reader2 = RgnH5Reader(self.outRgnFN) self.assertTrue("PulseData" in reader2.file) self.assertTrue("ScanData" in reader2.file) for (rt1, rt2) in zip(reader1, reader2): self.assertEqual(rt1.toList(), rt2.toList()) if rt1.holeNumber == 1000: break def test_addStrListAttr(self): """Test function addStrListAttr(obj, name, strlist).""" f = h5py.File(self.outTmpFN, 'w') obj = f.create_group("PulseData") addStrListAttr(obj, "addedAttr", ["val1", "val2"]) f.close() f = h5py.File(self.outTmpFN, 'r') self.assertTrue("addedAttr" in f["PulseData"].attrs) attrList = 
f["PulseData"].attrs["addedAttr"] self.assertEqual(attrList[0], 'val1') self.assertEqual(attrList[1], 'val2') f.close() if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_sortservice.py000077500000000000000000000021571241505617700260600ustar00rootroot00000000000000"""Test pbalign.forquiverservice.sort.""" import unittest from os import path, remove from shutil import copyfile from pbalign.forquiverservice.sort import SortService from tempfile import mkstemp class Opt(object): """Simulate pbalign options.""" def __init__(self): """Option class.""" self.verbosity = 2 class Test_SortService(unittest.TestCase): """Test pbalign.forquiverservice.sort.""" def setUp(self): """Set up tests.""" self.rootDir = "/mnt/secondary-siv/" + \ "testdata/BlasrTestData/pbalign" self.inCmpFile = path.join(self.rootDir, "data/testsort.cmp.h5") self.outCmpFile = mkstemp(suffix=".cmp.h5")[1] copyfile(self.inCmpFile, self.outCmpFile) self.options = Opt() self.obj = SortService(self.outCmpFile, self.options) def tearDown(self): remove(self.outCmpFile) def test_run(self): """Test SortService.run().""" print self.obj.cmd _output, errCode, _errMsg = self.obj.run() self.assertEqual(errCode, 0) if __name__ == "__main__": unittest.main() pbalign-8cd571f6f48c5d86d07571ff04d8382320a8a658/tests/unit/test_tmpfileutil.py000077500000000000000000000054631241505617700260510ustar00rootroot00000000000000from pbalign.utils.tempfileutil import TempFileManager import unittest from os import path def keep_writing_to_file(fn): """Keep writing 'a' to a file.""" f = open(fn, 'w') while True: # Keep writing 'a' to a file; # note that this function will never end unless # it is killed by the main process. 
f.write('a') class Test_TempFileManager(unittest.TestCase): def test_init(self): """Test TempFileManager all functions.""" t = TempFileManager() t.SetRootDir("/scratch") newFN = t.RegisterNewTmpFile() self.assertTrue(path.isfile(newFN)) existingDir = t.RegisterExistingTmpFile("/tmp", isDir=True) self.assertTrue(path.isdir(existingDir)) with self.assertRaises(IOError) as cm: t.RegisterExistingTmpFile("filethatdoesnotexist") newDN = t.RegisterNewTmpFile(isDir=True) self.assertTrue(path.isdir(newDN)) self.assertTrue(t._isRegistered(newDN)) newTxt = t.RegisterNewTmpFile(suffix=".txt") self.assertTrue(newTxt.endswith(".txt")) t.SetRootDir("~/tmp/") t.CleanUp() self.assertFalse(path.exists(newFN)) self.assertFalse(path.exists(newDN)) self.assertEqual(t.fileDB, []) self.assertEqual(t.dirDB, []) # def test_CleanUp(self): # """Create a temp directory and register several tmp files. # Then, manually open another file under temp dir without # registering it. Finally, call CleanUp to delete temp dirs # and files. Note that by doing this, an nfs lock is created # manually, CleanUp should exit with a warning instead of an # error.""" # t = TempFileManager() # t.SetRootDir("/scratch") # newFN = t.RegisterNewTmpFile() # with open(newFN, 'w') as writer: # writer.write("x") # # newFN2 = t.RegisterNewTmpFile() # with open(newFN2, 'w') as writer: # writer.write("x") # # # Manually create a temp file under temp dir without register # rootDir = t.defaultRootDir # extraFN = path.join(t.defaultRootDir, 'extra_file.txt') # # Keep the file open while trying to Clean up temp dir. 
# from multiprocessing import Process # p = Process(target=keep_writing_to_file, args=(extraFN, )) # p.start() # self.assertTrue(p.is_alive()) # # t.CleanUp() # # self.assertFalse(path.exists(newFN)) # self.assertFalse(path.exists(newFN2)) # # self.assertTrue(p.is_alive()) # self.assertTrue(path.exists(extraFN)) # self.assertEqual(t.fileDB, []) # self.assertEqual(t.dirDB, []) # # p.terminate() # # import time # time.sleep(1) # import shutil # shutil.rmtree(rootDir) # if __name__ == "__main__": unittest.main()