pax_global_header 0000666 0000000 0000000 00000000064 14121001445 0014502 g ustar 00root root 0000000 0000000 52 comment=feacb685fe8465f1ba94c530028812b5fdd18b1f
unikmer-0.18.8/ 0000775 0000000 0000000 00000000000 14121001445 0013252 5 ustar 00root root 0000000 0000000 unikmer-0.18.8/.gitignore 0000664 0000000 0000000 00000000503 14121001445 0015240 0 ustar 00root root 0000000 0000000 # Binaries for programs and plugins
*.exe
*.exe~
*.dll
*.so
*.dylib
# Test binary, build with `go test -c`
*.test
# Output of the go coverage tool, specifically when used with LiteIDE
*.out
*.directory
unikmer/unikmer*
unikmer/binaries*
doc/site/*
*ssshtest
*.unik
t_*
*.nextflow.log*
*.brename_detail.txt
*/Rplots.pdf
unikmer-0.18.8/CHANGELOG.md 0000664 0000000 0000000 00000020226 14121001445 0015065 0 ustar 00root root 0000000 0000000 # Changelog
- v0.18.8
- `unikmer info`: fix typoes.
- v0.18.7
- `unikmer`: better counting speed by upstream optimization of FASTA/Q parsing.
- `unikmer concat`: fix parsing flag `-n`.
- v0.17.3
- `unikmer`: fix buiding for 386. #21
- v0.17.2
- `unikmer`: slightly speedup for computing LCA.
- `unikmer rfilter:`
- flag `-E/--equal-to` supports multiple values.
- new flag `-n/--save-predictable-norank`: do not discard some special ranks without order when using -L, where rank of the closest higher node is still lower than rank cutoff.
- v0.17.1
- `unikmer rfilter:` change handling of black list.
- v0.17.0
- syncmer value changed with different hash method.
- `unikmer count`: syncmer value changed.
- v0.16.1
- change Header.Number from `int64` to `uint64`
- `unikmer info`: fix recounting problem for unsorted kmers but with Number.
- v0.16.0
- `unikmer`:
- binary file format change: fix reading long description, and bump version to `5.0`.
- better binary file parsing performance.
- v0.15.0
- `unikmer`:
- binary file minor change: increase description maximal length from 128 B to 1KB.
- separating k-mers (sketches) indexing and searching from `unikmer`, including `unikmer db info/index/search`.
- `unikmer count`: fix syncmer.
- `unikmer dump`: new flag `--hashed`.
- rename `unikmer stats` to `unikmer info`, and add new column `description`.
- v0.14.0
- `unikmer union`: fix bug when flag `-s` not given.
- `unikmer count/uniqs/locate`: performance improvement on generating k-mers.
- `unikmer count/db`: support scaled/minizimer/syncmer sketch.
- `unikmer stats`: change format.
- v0.13.0
- new command `unikmer common`: Finding k-mers shared by most of multiple binary files.
- `unikmer common/count/diff/grep/rfilter/sort/split/union`: faster sorting.
- `unikmer uniqs`: better result for flag `--circular`.
- `unikmer search`: fix a bug when searching on database with more than one hash.
- v0.12.0
- `unikmer`:
- support longer k (k>32) by saving ntHash.
- new flag `-nocheck-file` for not checking binary file.
- new commands:
- `unikmer db index`: constructing index from binary files
- `unikmer db info`: printing information of index file
- `unikmer db search`: searching sequence from index database
- `unikmer rfilter`: change format of rank order file.
- `unikmer inter/union`: speedup for single input file.
- `unikmer concat`:
- new flag `-t/--taxid` for assigning global taxid, this can slightly reduce file size.
- new flag `-n/--number` for setting number of k-mers.
- `unikmer num`:
- new flag `-f/--force` for counting k-mers.
- `unikmer locate`: output in BED6.
- `unikmer locate/uniqs`: support multiple genome files.
- `unikmer uniqs`:
- stricter multiple mapping limit.
- new flag `-W/--seqs-in-a-file-as-one-genome`.
- `unikmer count`:
- new flag `-u/--unique` for output unique (single copy) kmers
- v0.11.0
- new command: `unikmer rfilter` for filtering k-mers by taxonomic rank.
- `unikmer inter`: new flag `-m/--mix-taxid` allowing part of files being whithout taxids.
- `unikmer dump`: fix a nil pointer bug.
- `unikmer count`:
- fix checking taxid in sequence header.
- fix setting global taxid.
- `unikmer count/diff/union`: slightly reduce memory and speedup when sorting k-mers.
- `unikmer filter`: change scoring.
- `unikmer count/locate/uniqs`: remove flag `--circular`.
- v0.10.0
- `unikmer`: fix loading custom taxonomy files.
- `unikmer count`:
- new flag `-d` for only count duplicated k-mers, for removing singleton in FASTQ.
- fix nil pointer bug of `-t`.
- `unikmer split`: fix memery and last odd k-mer mising bug for given ONE sorted input file.
- `unikmer sort`: skip loading taxonomy data when neither `-u` or `-d` given.
- `unikmer diff`: 2X speedup, and requiring 1th file being sorted.
- `unikmer inter`: 2-5X speedup, and requiring all files being sorted, sorted output by default.
- v0.9.0
- `unikmer`: **new binary format supporting optional Taxids**.
- deleted command: `unikmer subset`.
- new command: `unikmer head` for extracting the first N k-mers.
- new command: `unikmer tsplit` for splitting k-mers according to taxid.
- `unikmer grep`: support searching with taxids.
- `unikmer count`: support parsing taxid from FASTA/Q header.
- v0.8.0
- `unikmer`:
- new option `-i/--infile-list`, if given, files in the list file are appended to files from cli arguments.
- improve performance of binary file reading and writing.
- `unikmer sort/split/merge`: safer forcing deletion of existed outdir, and better log.
- `unikmer split`: performance improvement for single sorted input file.
- `unikmer sort`: performance improvement for using `-m/--chunk-size`.
- `unikmer grep`: rewrite, support loading queries from .unik files.
- `unikmer dump`: fix number information in output file.
- `unikmer concat`: new flag `-s/--sorted`.
- v0.7.0
- new command `unikmer filter`: filter low-complexity k-mers.
- new command `unikmer split`: split k-mers into sorted chunk files.
- new command `unikmer merge`: merge from sorted chunk files.
- `unikmer view`:
- new option `-N/--show-code-only` for only showing encoded integers.
- fix output error for `-q/--fastq`.
- `unikmer uniqs`:
- new option `-x/--max-cont-non-uniq-kmers` for limiting max continuous non-unique k-mers.
- new option `-X/--max-num-cont-non-uniq-kmers` for limiting max number of continuous non-unique k-mers.
- fix bug for `-m/--min-len`.
- `unikmer union`:
- new option `-d/--repeated` for only printing duplicate k-mers.
- `unikmer sort`:
- new option `-u/--unique` for removing duplicated k-mers.
- new option `-d/--repeated` for only printing duplicate k-mers.
- new option `-m/--chunk-size` for limiting maximum memory for sorting.
- `unikmer diff`:
- small speed improvements.
- v0.6.2
- `unikmer encode`: better output for bits presentation of encoded k-mers (`-a/--all`)
- v0.6.1
- `unikmer dump`:
- new option `-K/--canonical` to keep the canonical k-mers.
- new option `-k/--canonical-only` to only keep the canonical k-mers.
- new option `-s/--sorted` to save sorted k-mers.
- `unikmer encode`: add option `-K/--canonical` to keep the canonical k-mers.
- v0.6.0
- `unikmer`: check encoded integer overflow
- new command `unikmer encode`: encode plain k-mer text to integer
- new command `unikmer decode`: decode encoded integer to k-mer text
- v0.5.3
- `unikmer count/dump`: check file before handling them.
- v0.5.2
- `unikmer locate`: fix bug.
- `unikmer`: doc update.
- v0.5.1
- `unikmer locate/uniqs`: fix options checking.
- v0.5.0
- `unikmer diff`: fix concurrency bug when cloning kmers from first file.
- new command `unikmer locate`: locate Kmers in genome.
- new command `unikmer uniqs`: mapping Kmers back to genome and find unique subsequences.
- v0.4.4
- `unikmer`: add global option `-L/--compression-level`.
- `unikmer diff`: reduce memory occupation, speed not affected.
- v0.4.3
- `unikmer diff`: fix bug of hanging when the first file having no Kmers.
- v0.4.2
- `unikmer stats/diff`: more intuitional output
- v0.4.1
- Better performance of writing and reading binary files
- v0.4.0
- **Binary serialization format changed.**
- new command `unikmer sort`: sort binary files
- `unikmer count/diff/union/inter`: better performance, add option to sort Kmers which significantly reduces file size
- `unikmer dump`: changed option
- `unikmer count`: changed option
- v0.3.1
- **Binary serialization format changed.**
- new command `unikmer stats`: statistics of binary files.
- `unikmer`: adding global option `-i/--infile-list` for reading files listed in file.
- `unikmer diff`: fixed a concurrency bug when no diff found.
- v0.2.1
- `unikmer count`: performance improvement and new option `--canonical` for only keeping canonical Kmers.
- v0.2
- new command `unikmer sample`: sample Kmers from binary files.
- new global options:
- `-c, --compact`: write more compact binary file with little loss of speed.
- `-C, --no-compress`: do not compress binary file (not recommended).
- some improvements.
- v0.1.0
- first release
unikmer-0.18.8/LICENSE 0000664 0000000 0000000 00000002111 14121001445 0014252 0 ustar 00root root 0000000 0000000 Copyright (c) 2018-2019 Wei Shen (shenwei356@gmail.com)
The MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
unikmer-0.18.8/README.md 0000664 0000000 0000000 00000041074 14121001445 0014537 0 ustar 00root root 0000000 0000000 # unikmer
`unikmer` is a golang package and a toolkit for nucleic acid [k-mer]((https://en.wikipedia.org/wiki/K-mer)) analysis, providing functions
including set operation k-mers (sketch) optional with
TaxIds but without count information.
K-mers are either encoded (k<=32) or hashed (arbitrary k) into `uint64`,
and serialized in binary file with extension `.unik`.
TaxIds can be assigned when counting k-mers from genome sequences,
and LCA (Lowest Common Ancestor) is computed during set opertions
including computing union, intersecton, set difference, unique and
repeated k-mers.
## Table of Contents
- [unikmer](#unikmer)
- [Table of Contents](#table-of-contents)
- [The package](#the-package)
- [Installation](#installation)
- [Benchmark](#benchmark)
- [The toolkit](#the-toolkit)
- [Installation](#installation-1)
- [Commands](#commands)
- [Binary file (.unik)](#binary-file-unik)
- [Compression rate comparison](#compression-rate-comparison)
- [Quick Start](#quick-start)
- [Contributing](#contributing)
- [License](#license)
## The package
[](https://godoc.org/github.com/shenwei356/unikmer)
[](https://goreportcard.com/report/github.com/shenwei356/unikmer)
The unikmer package provides basic manipulations of K-mers (sketch)
optional with TaxIds but without frequency information,
and also provides serialization methods.
### Installation
go get -u github.com/shenwei356/unikmer
### Benchmark
CPU: AMD Ryzen 7 2700X Eight-Core Processor, 3.7 GHz
$ go test . -bench=Bench* -benchmem \
| grep Bench \
| perl -pe 's/\s\s+/\t/g' \
| csvtk cut -Ht -f 1,3-5 \
| csvtk add-header -t -n test,time,memory,allocs \
| csvtk pretty -t -r
test time memory allocs
------------------------------------------ ------------ --------- -----------
BenchmarkEncodeK32-16 18.66 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeFromFormerKmerK32-16 8.030 ns/op 0 B/op 0 allocs/op
BenchmarkMustEncodeFromFormerKmerK32-16 1.702 ns/op 0 B/op 0 allocs/op
BenchmarkDecodeK32-16 78.95 ns/op 32 B/op 1 allocs/op
BenchmarkMustDecodeK32-16 76.86 ns/op 32 B/op 1 allocs/op
BenchmarkRevK32-16 3.639 ns/op 0 B/op 0 allocs/op
BenchmarkCompK32-16 0.7971 ns/op 0 B/op 0 allocs/op
BenchmarkRevCompK32-16 3.831 ns/op 0 B/op 0 allocs/op
BenchmarkCannonalK32-16 4.210 ns/op 0 B/op 0 allocs/op
BenchmarkKmerIterator/1.00_KB-16 12625 ns/op 160 B/op 1 allocs/op
BenchmarkHashIterator/1.00_KB-16 8118 ns/op 232 B/op 3 allocs/op
BenchmarkProteinIterator/1.00_KB-16 14324 ns/op 480 B/op 3 allocs/op
BenchmarkMinimizerSketch/1.00_KB-16 62497 ns/op 688 B/op 6 allocs/op
BenchmarkSyncmerSketch/1.00_KB-16 99390 ns/op 1456 B/op 8 allocs/op
BenchmarkProteinMinimizerSketch/1.00_KB-16 24888 ns/op 728 B/op 5 allocs/op
## The toolkit
### Installation
1. Downloading [executable binary files](https://github.com/shenwei356/unikmer/releases) (Latest version).
1. Via Bioconda (not available now)
conda install unikmer
1. Via Homebrew (not lastest version)
brew install brewsci/bio/unikmer
### Commands
1. Counting
count Generate k-mers (sketch) from FASTA/Q sequences
1. Information
stats Statistics of binary files
num Quickly inspect number of k-mers in binary files
1. Format conversion
encode Encode plain k-mer text to integer
decode Decode encoded integer to k-mer text
view Read and output binary format to plain text
dump Convert plain k-mer text to binary format
1. Set operations
head Extract the first N k-mers
concat Concatenate multiple binary files without removing duplicates
inter Intersection of multiple binary files
common Find k-mers shared by most of multiple binary files
union Union of multiple binary files
diff Set difference of multiple binary files
grep Search k-mers from binary files
sort Sort k-mers in binary files to reduce file size
split Split k-mers into sorted chunk files
tsplit Split k-mers according to TaxId
merge Merge k-mers from sorted chunk files
sample Sample k-mers from binary files
filter Filter low-complexity k-mers
rfilter Filter k-mers by taxonomic rank
1. Searching on genomes
locate Locate k-mers in genome
uniqs Mapping k-mers back to genome and find unique subsequences
1. Misc
genautocomplete Generate shell autocompletion script
help Help about any command
version Print version information and check for update
### Binary file (.unik)
K-mers (represented in `uint64` in RAM ) are serialized in 8-Byte
(or less Bytes for shorter k-mers in compact format,
or much less Bytes for sorted k-mers) arrays and
optionally compressed in gzip format with extension of `.unik`.
TaxIds are optionally stored next to k-mers with 4 or less bytes.
#### Compression rate comparison
No TaxIds stored in this test.

label |encoded-kmera|gzip-compressedb|compact-formatc|sortedd|comment
:---------------|:----------------------:|:-------------------------:|:------------------------:|:----------------:|:------------------------------------------------------
`plain` | | | | |plain text
`gzip` | |✔ | | |gzipped plain text
`unik.default` |✔ |✔ | | |gzipped encoded k-mers in fixed-length byte array
`unik.compat` |✔ |✔ |✔ | |gzipped encoded k-mers in shorter fixed-length byte array
`unik.sorted` |✔ |✔ | |✔ |gzipped sorted encoded k-mers
- a One k-mer is encoded as `uint64` and serialized in 8 Bytes.
- b K-mers file is compressed in gzip format by default,
users can switch on global option `-C/--no-compress` to output non-compressed file.
- c One k-mer is encoded as `uint64` and serialized in 8 Bytes by default.
However few Bytes are needed for short k-mers, e.g., 4 Bytes are enough for
15-mers (30 bits). This makes the file more compact with smaller file size,
controled by global option `-c/--compact `.
- d One k-mer is encoded as `uint64`, all k-mers are sorted and compressed
using varint-GB algorithm.
- In all test, flag `--canonical` is ON when running `unikmer count`.
### Quick Start
# memusg is for compute time and RAM usage: https://github.com/shenwei356/memusg
# counting (only keep the canonical k-mers and compact output)
# memusg -t unikmer count -k 23 Ecoli-IAI39.fasta.gz -o Ecoli-IAI39.fasta.gz.k23 --canonical --compact
$ memusg -t unikmer count -k 23 Ecoli-MG1655.fasta.gz -o Ecoli-MG1655.fasta.gz.k23 --canonical --compact
elapsed time: 0.897s
peak rss: 192.41 MB
# counting (only keep the canonical k-mers and sort k-mers)
# memusg -t unikmer count -k 23 Ecoli-IAI39.fasta.gz -o Ecoli-IAI39.fasta.gz.k23.sorted --canonical --sort
$ memusg -t unikmer count -k 23 Ecoli-MG1655.fasta.gz -o Ecoli-MG1655.fasta.gz.k23.sorted --canonical --sort
elapsed time: 1.136s
peak rss: 227.28 MB
# counting and assigning global TaxIds
$ unikmer count -k 23 -K -s Ecoli-IAI39.fasta.gz -o Ecoli-IAI39.fasta.gz.k23.sorted -t 585057
$ unikmer count -k 23 -K -s Ecoli-MG1655.fasta.gz -o Ecoli-MG1655.fasta.gz.k23.sorted -t 511145
$ unikmer count -k 23 -K -s A.muciniphila-ATCC_BAA-835.fasta.gz -o A.muciniphila-ATCC_BAA-835.fasta.gz.sorted -t 349741
# counting minimizer and ouputting in linear order
$ unikmer count -k 23 -W 5 -H -K -l A.muciniphila-ATCC_BAA-835.fasta.gz -o A.muciniphila-ATCC_BAA-835.fasta.gz.m
# view
$ unikmer view Ecoli-MG1655.fasta.gz.k23.sorted.unik --show-taxid | head -n 3
AAAAAAAAACCATCCAAATCTGG 511145
AAAAAAAAACCGCTAGTATATTC 511145
AAAAAAAAACCTGAAAAAAACGG 511145
# view (hashed k-mers needs original FASTA/Q file)
$ unikmer view --show-code --genome A.muciniphila-ATCC_BAA-835.fasta.gz A.muciniphila-ATCC_BAA-835.fasta.gz.m.unik | head -n 3
CATCCGCCATCTTTGGGGTGTCG 1210726578792
AGCGCAAAATCCCCAAACATGTA 2286899379883
AACTGATTTTTGATGATGACTCC 3542156397282
# find the positions of k-mers
$ seqkit locate -M A.muciniphila-ATCC_BAA-835.fasta.gz \
-f <(unikmer view -a -g A.muciniphila-ATCC_BAA-835.fasta.gz A.muciniphila-ATCC_BAA-835.fasta.gz.m.unik | seqkit head -n 5 ) \
| csvtk sort -t -k start:n | head -n 6 | csvtk pretty -t
seqID patternName pattern strand start end
----------- ------------------- ----------------------- ------ ----- ---
NC_010655.1 2090893901864583115 ATCTTATAAAATAACCACATAAC + 3 25
NC_010655.1 696051979077366638 TTATAAAATAACCACATAACTTA + 6 28
NC_010655.1 390297872016815006 TATAAAATAACCACATAACTTAA + 7 29
NC_010655.1 2582400417208090837 AAAATAACCACATAACTTAAAAA + 10 32
NC_010655.1 3048591415312050785 TAACCACATAACTTAAAAAGAAT + 14 36
# stats
$ unikmer stats *.unik -a -j 10
file k canonical hashed scaled include-taxid global-taxid sorted compact gzipped version number description
A.muciniphila-ATCC_BAA-835.fasta.gz.m.unik 23 ✓ ✓ ✕ ✕ ✕ ✕ ✓ v5.0 860,900
A.muciniphila-ATCC_BAA-835.fasta.gz.sorted.unik 23 ✓ ✕ ✕ ✕ 349741 ✓ ✕ ✓ v5.0 2,630,905
Ecoli-IAI39.fasta.gz.k23.sorted.unik 23 ✓ ✕ ✕ ✕ 585057 ✓ ✕ ✓ v5.0 4,902,266
Ecoli-IAI39.fasta.gz.k23.unik 23 ✓ ✕ ✕ ✕ ✕ ✓ ✓ v5.0 4,902,266
Ecoli-MG1655.fasta.gz.k23.sorted.unik 23 ✓ ✕ ✕ ✕ 511145 ✓ ✕ ✓ v5.0 4,546,632
Ecoli-MG1655.fasta.gz.k23.unik 23 ✓ ✕ ✕ ✕ ✕ ✓ ✓ v5.0 4,546,632
# concat
$ memusg -t unikmer concat *.k23.sorted.unik -o concat.k23 -c
elapsed time: 1.020s
peak rss: 25.86 MB
# union
$ memusg -t unikmer union *.k23.sorted.unik -o union.k23 -s
elapsed time: 3.991s
peak rss: 590.92 MB
# or sorting with limited memory.
# note that taxonomy database need some memory.
$ memusg -t unikmer sort *.k23.sorted.unik -o union2.k23 -u -m 1M
elapsed time: 3.538s
peak rss: 324.2 MB
$ unikmer view -t union.k23.unik | md5sum
4c038832209278840d4d75944b29219c -
$ unikmer view -t union2.k23.unik | md5sum
4c038832209278840d4d75944b29219c -
# duplicated k-mers
$ memusg -t unikmer sort *.k23.sorted.unik -o dup.k23 -d -m 1M
elapsed time: 1.143s
peak rss: 240.18 MB
# intersection
$ memusg -t unikmer inter *.k23.sorted.unik -o inter.k23
elapsed time: 1.481s
peak rss: 399.94 MB
# difference
$ memusg -t unikmer diff -j 10 *.k23.sorted.unik -o diff.k23 -s
elapsed time: 0.793s
peak rss: 338.06 MB
$ ls -lh *.unik
-rw-r--r-- 1 shenwei shenwei 9.5M 2月 13 00:55 A.muciniphila-ATCC_BAA-835.fasta.gz.sorted.unik
-rw-r--r-- 1 shenwei shenwei 46M 2月 13 00:59 concat.k23.unik
-rw-r--r-- 1 shenwei shenwei 8.7M 2月 13 01:00 diff.k23.unik
-rw-r--r-- 1 shenwei shenwei 11M 2月 13 01:04 dup.k23.unik
-rw-r--r-- 1 shenwei shenwei 18M 2月 13 00:55 Ecoli-IAI39.fasta.gz.k23.sorted.unik
-rw-r--r-- 1 shenwei shenwei 21M 2月 13 00:48 Ecoli-IAI39.fasta.gz.k23.unik
-rw-r--r-- 1 shenwei shenwei 17M 2月 13 00:55 Ecoli-MG1655.fasta.gz.k23.sorted.unik
-rw-r--r-- 1 shenwei shenwei 19M 2月 13 00:48 Ecoli-MG1655.fasta.gz.k23.unik
-rw-r--r-- 1 shenwei shenwei 9.5M 2月 13 00:59 inter.k23.unik
-rw-r--r-- 1 shenwei shenwei 27M 2月 13 01:04 union2.k23.unik
-rw-r--r-- 1 shenwei shenwei 27M 2月 13 00:58 union.k23.unik
$ unikmer stats *.unik -a -j 10
file k canonical hashed scaled include-taxid global-taxid sorted compact gzipped version number description
A.muciniphila-ATCC_BAA-835.fasta.gz.m.unik 23 ✓ ✓ ✕ ✕ ✕ ✕ ✓ v5.0 860,900
A.muciniphila-ATCC_BAA-835.fasta.gz.sorted.unik 23 ✓ ✕ ✕ ✕ 349741 ✓ ✕ ✓ v5.0 2,630,905
concat.k23.unik 23 ✓ ✕ ✕ ✓ ✕ ✓ ✓ v5.0 -1
diff.k23.unik 23 ✓ ✕ ✕ ✓ ✕ ✕ ✓ v5.0 2,326,096
dup.k23.unik 23 ✓ ✕ ✕ ✓ ✓ ✕ ✓ v5.0 0
Ecoli-IAI39.fasta.gz.k23.sorted.unik 23 ✓ ✕ ✕ ✕ 585057 ✓ ✕ ✓ v5.0 4,902,266
Ecoli-IAI39.fasta.gz.k23.unik 23 ✓ ✕ ✕ ✕ ✕ ✓ ✓ v5.0 4,902,266
Ecoli-MG1655.fasta.gz.k23.sorted.unik 23 ✓ ✕ ✕ ✕ 511145 ✓ ✕ ✓ v5.0 4,546,632
Ecoli-MG1655.fasta.gz.k23.unik 23 ✓ ✕ ✕ ✕ ✕ ✓ ✓ v5.0 4,546,632
inter.k23.unik 23 ✓ ✕ ✕ ✓ ✓ ✕ ✓ v5.0 2,576,170
union2.k23.unik 23 ✓ ✕ ✕ ✓ ✓ ✕ ✓ v5.0 6,872,728
union.k23.unik 23 ✓ ✕ ✕ ✓ ✓ ✕ ✓ v5.0 6,872,728
# -----------------------------------------------------------------------------------------
# mapping k-mers to genome
g=Ecoli-IAI39.fasta
f=inter.k23.unik
# to fasta
unikmer view $f -a -o $f.fa.gz
# make index
bwa index $g; samtools faidx $g
ncpu=12
ls $f.fa.gz | rush -j 1 -v ref=$g -v j=$ncpu \
' bwa aln -o 0 -l 17 -k 0 -t {j} {ref} {} \
| bwa samse {ref} - {} \
| samtools view -bS > {}.bam; \
samtools sort -T {}.tmp -@ {j} {}.bam -o {}.sorted.bam; \
samtools index {}.sorted.bam; \
samtools flagstat {}.sorted.bam > {}.sorted.bam.flagstat; \
/bin/rm {}.bam '
## Contributing
We welcome pull requests, bug fixes and issue reports.
## License
[MIT License](https://github.com/shenwei356/unikmer/blob/master/LICENSE)
unikmer-0.18.8/go.mod 0000664 0000000 0000000 00000002257 14121001445 0014366 0 ustar 00root root 0000000 0000000 module github.com/shenwei356/unikmer
go 1.17
require (
github.com/dustin/go-humanize v1.0.0
github.com/klauspost/compress v1.13.6
github.com/klauspost/pgzip v1.2.5
github.com/mattn/go-colorable v0.1.8
github.com/mitchellh/go-homedir v1.1.0
github.com/pkg/errors v0.9.1
github.com/shenwei356/bio v0.3.1
github.com/shenwei356/breader v0.3.1
github.com/shenwei356/go-logging v0.0.0-20171012171522-c6b9702d88ba
github.com/shenwei356/util v0.3.2
github.com/shenwei356/xopen v0.1.0
github.com/spf13/cobra v1.2.1
github.com/tatsushid/go-prettytable v0.0.0-20141013043238-ed2d14c29939
github.com/twotwotwo/sorts v0.0.0-20160814051341-bf5c1f2b8553
github.com/will-rowe/nthash v0.4.0
github.com/zeebo/wyhash v0.0.1
)
require (
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/mattn/go-isatty v0.0.12 // indirect
github.com/mattn/go-runewidth v0.0.13 // indirect
github.com/rivo/uniseg v0.2.0 // indirect
github.com/shenwei356/bpool v0.0.0-20160710042833-f9e0ee4d0403 // indirect
github.com/shenwei356/natsort v0.0.0-20190418160752-600d539c017d // indirect
github.com/spf13/pflag v1.0.5 // indirect
golang.org/x/sys v0.0.0-20210510120138-977fb7262007 // indirect
)
unikmer-0.18.8/go.sum 0000664 0000000 0000000 00000170133 14121001445 0014412 0 ustar 00root root 0000000 0000000 cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
cloud.google.com/go v0.34.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
cloud.google.com/go v0.38.0/go.mod h1:990N+gfupTy94rShfmMCWGDn0LpTmnzTp2qbd1dvSRU=
cloud.google.com/go v0.44.1/go.mod h1:iSa0KzasP4Uvy3f1mN/7PiObzGgflwredwwASm/v6AU=
cloud.google.com/go v0.44.2/go.mod h1:60680Gw3Yr4ikxnPRS/oxxkBccT6SA1yMk63TGekxKY=
cloud.google.com/go v0.45.1/go.mod h1:RpBamKRgapWJb87xiFSdk4g1CME7QZg3uwTez+TSTjc=
cloud.google.com/go v0.46.3/go.mod h1:a6bKKbmY7er1mI7TEI4lsAkts/mkhTSZK8w33B4RAg0=
cloud.google.com/go v0.50.0/go.mod h1:r9sluTvynVuxRIOHXQEHMFffphuXHOMZMycpNR5e6To=
cloud.google.com/go v0.52.0/go.mod h1:pXajvRH/6o3+F9jDHZWQ5PbGhn+o8w9qiu/CffaVdO4=
cloud.google.com/go v0.53.0/go.mod h1:fp/UouUEsRkN6ryDKNW/Upv/JBKnv6WDthjR6+vze6M=
cloud.google.com/go v0.54.0/go.mod h1:1rq2OEkV3YMf6n/9ZvGWI3GWw0VoqH/1x2nd8Is/bPc=
cloud.google.com/go v0.56.0/go.mod h1:jr7tqZxxKOVYizybht9+26Z/gUq7tiRzu+ACVAMbKVk=
cloud.google.com/go v0.57.0/go.mod h1:oXiQ6Rzq3RAkkY7N6t3TcE6jE+CIBBbA36lwQ1JyzZs=
cloud.google.com/go v0.62.0/go.mod h1:jmCYTdRCQuc1PHIIJ/maLInMho30T/Y0M4hTdTShOYc=
cloud.google.com/go v0.65.0/go.mod h1:O5N8zS7uWy9vkA9vayVHs65eM1ubvY4h553ofrNHObY=
cloud.google.com/go v0.72.0/go.mod h1:M+5Vjvlc2wnp6tjzE102Dw08nGShTscUx2nZMufOKPI=
cloud.google.com/go v0.74.0/go.mod h1:VV1xSbzvo+9QJOxLDaJfTjx5e+MePCpCWwvftOeQmWk=
cloud.google.com/go v0.78.0/go.mod h1:QjdrLG0uq+YwhjoVOLsS1t7TW8fs36kLs4XO5R5ECHg=
cloud.google.com/go v0.79.0/go.mod h1:3bzgcEeQlzbuEAYu4mrWhKqWjmpprinYgKJLgKHnbb8=
cloud.google.com/go v0.81.0/go.mod h1:mk/AM35KwGk/Nm2YSeZbxXdrNK3KZOYHmLkOqC2V6E0=
cloud.google.com/go/bigquery v1.0.1/go.mod h1:i/xbL2UlR5RvWAURpBYZTtm/cXjCha9lbfbpx4poX+o=
cloud.google.com/go/bigquery v1.3.0/go.mod h1:PjpwJnslEMmckchkHFfq+HTD2DmtT67aNFKH1/VBDHE=
cloud.google.com/go/bigquery v1.4.0/go.mod h1:S8dzgnTigyfTmLBfrtrhyYhwRxG72rYxvftPBK2Dvzc=
cloud.google.com/go/bigquery v1.5.0/go.mod h1:snEHRnqQbz117VIFhE8bmtwIDY80NLUZUMb4Nv6dBIg=
cloud.google.com/go/bigquery v1.7.0/go.mod h1://okPTzCYNXSlb24MZs83e2Do+h+VXtc4gLoIoXIAPc=
cloud.google.com/go/bigquery v1.8.0/go.mod h1:J5hqkt3O0uAFnINi6JXValWIb1v0goeZM77hZzJN/fQ=
cloud.google.com/go/datastore v1.0.0/go.mod h1:LXYbyblFSglQ5pkeyhO+Qmw7ukd3C+pD7TKLgZqpHYE=
cloud.google.com/go/datastore v1.1.0/go.mod h1:umbIZjpQpHh4hmRpGhH4tLFup+FVzqBi1b3c64qFpCk=
cloud.google.com/go/firestore v1.1.0/go.mod h1:ulACoGHTpvq5r8rxGJ4ddJZBZqakUQqClKRT5SZwBmk=
cloud.google.com/go/pubsub v1.0.1/go.mod h1:R0Gpsv3s54REJCy4fxDixWD93lHJMoZTyQ2kNxGRt3I=
cloud.google.com/go/pubsub v1.1.0/go.mod h1:EwwdRX2sKPjnvnqCa270oGRyludottCI76h+R3AArQw=
cloud.google.com/go/pubsub v1.2.0/go.mod h1:jhfEVHT8odbXTkndysNHCcx0awwzvfOlguIAii9o8iA=
cloud.google.com/go/pubsub v1.3.1/go.mod h1:i+ucay31+CNRpDW4Lu78I4xXG+O1r/MAHgjpRVR+TSU=
cloud.google.com/go/storage v1.0.0/go.mod h1:IhtSnM/ZTZV8YYJWCY8RULGVqBDmpoyjwiyrjsg+URw=
cloud.google.com/go/storage v1.5.0/go.mod h1:tpKbwo567HUNpVclU5sGELwQWBDZ8gh0ZeosJ0Rtdos=
cloud.google.com/go/storage v1.6.0/go.mod h1:N7U0C8pVQ/+NIKOBQyamJIeKQKkZ+mxpohlUTyfDhBk=
cloud.google.com/go/storage v1.8.0/go.mod h1:Wv1Oy7z6Yz3DshWRJFhqM/UCfaWIRTdp0RXyy7KQOVs=
cloud.google.com/go/storage v1.10.0/go.mod h1:FLPqc6j+Ki4BU591ie1oL6qBQGu2Bl/tZ9ullr3+Kg0=
dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY=
github.com/armon/circbuf v0.0.0-20150827004946-bbbad097214e/go.mod h1:3U/XgcO3hCbHZ8TKRvWD2dDTCfh9M9ya+I9JpbB7O8o=
github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da/go.mod h1:Q73ZrmVTwzkszR9V5SSuryQ31EELlFMUz1kKyl939pY=
github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310/go.mod h1:ufUuZ+zHj4x4TnLV4JWEpy2hxWSpsRywHrMgIH9cCH8=
github.com/bgentry/speakeasy v0.1.0/go.mod h1:+zsyZBPWlz7T6j88CTgSN5bM796AkVf0kBD4zp0CCIs=
github.com/bketelsen/crypt v0.0.4/go.mod h1:aI6NrJ0pMGgvZKL1iVgXLnfIFJtfV+bKCoqOes/6LfM=
github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI=
github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI=
github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc=
github.com/cncf/udpa/go v0.0.0-20200629203442-efcf912fb354/go.mod h1:WmhPx2Nbnhtbo57+VJT5O0JRkEi1Wbu0z5j0R8u5Hbk=
github.com/cncf/udpa/go v0.0.0-20201120205902-5459f2c99403/go.mod h1:WmhPx2Nbnhtbo57+VJT5O0JRkEi1Wbu0z5j0R8u5Hbk=
github.com/coreos/go-semver v0.3.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk=
github.com/coreos/go-systemd/v22 v22.3.2/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cznic/mathutil v0.0.0-20181122101859-297441e03548/go.mod h1:e6NPNENfs9mPDVNRekM7lKScauxd5kXTr1Mfyig6TDM=
github.com/cznic/sortutil v0.0.0-20181122101858-f5f958428db8 h1:LpMLYGyy67BoAFGda1NeOBQwqlv7nUXpm+rIVHGxZZ4=
github.com/cznic/sortutil v0.0.0-20181122101858-f5f958428db8/go.mod h1:q2w6Bg5jeox1B+QkJ6Wp/+Vn0G/bo3f1uY7Fn3vivIQ=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/dustin/go-humanize v1.0.0 h1:VSnTsYCnlFHaM2/igO1h6X3HA71jcobQuxemgkq4zYo=
github.com/dustin/go-humanize v1.0.0/go.mod h1:HtrtbFcZ19U5GC7JDqmcUSB87Iq5E25KnS6fMYU6eOk=
github.com/edsrzf/mmap-go v1.0.0/go.mod h1:YO35OhQPt3KJa3ryjFM5Bs14WD66h8eGKpfaBNrHW5M=
github.com/envoyproxy/go-control-plane v0.9.0/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
github.com/envoyproxy/go-control-plane v0.9.4/go.mod h1:6rpuAdCZL397s3pYoYcLgu1mIlRU8Am5FuJP05cCM98=
github.com/envoyproxy/go-control-plane v0.9.7/go.mod h1:cwu0lG7PUMfa9snN8LXBig5ynNVH9qI8YYLbd1fK2po=
github.com/envoyproxy/go-control-plane v0.9.9-0.20201210154907-fd9021fe5dad/go.mod h1:cXg6YxExXjJnVBQHBLXeUAgxn2UodCpnH306RInaBQk=
github.com/envoyproxy/go-control-plane v0.9.9-0.20210217033140-668b12f5399d/go.mod h1:cXg6YxExXjJnVBQHBLXeUAgxn2UodCpnH306RInaBQk=
github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c=
github.com/fatih/color v1.7.0/go.mod h1:Zm6kSWBoL9eyXnKyktHP6abPY2pDugNf5KwzbycvMj4=
github.com/fsnotify/fsnotify v1.4.9/go.mod h1:znqG4EE+3YCdAaPaxE2ZRY/06pZUdp0tY4IgpuI1SZQ=
github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04=
github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1/go.mod h1:vR7hzQXu2zJy9AVAgeJqvqgH9Q5CA+iKCZ2gyEVpxRU=
github.com/go-gl/glfw/v3.3/glfw v0.0.0-20191125211704-12ad95a8df72/go.mod h1:tQ2UAYgL5IevRw8kRxooKSPJfGvJ9fJQFa0TUsXzTg8=
github.com/go-gl/glfw/v3.3/glfw v0.0.0-20200222043503-6f7a984d4dc4/go.mod h1:tQ2UAYgL5IevRw8kRxooKSPJfGvJ9fJQFa0TUsXzTg8=
github.com/godbus/dbus/v5 v5.0.4/go.mod h1:xhWf0FNVPg57R7Z0UbKHbJfkEywrmjJnf7w5xrFpKfA=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q=
github.com/golang/groupcache v0.0.0-20190702054246-869f871628b6/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/groupcache v0.0.0-20191227052852-215e87163ea7/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A=
github.com/golang/mock v1.2.0/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A=
github.com/golang/mock v1.3.1/go.mod h1:sBzyDLLjw3U8JLTeZvSv8jJB+tU5PVekmnlKIyFUx0Y=
github.com/golang/mock v1.4.0/go.mod h1:UOMv5ysSaYNkG+OFQykRIcU/QvvxJf3p21QfJ2Bt3cw=
github.com/golang/mock v1.4.1/go.mod h1:UOMv5ysSaYNkG+OFQykRIcU/QvvxJf3p21QfJ2Bt3cw=
github.com/golang/mock v1.4.3/go.mod h1:UOMv5ysSaYNkG+OFQykRIcU/QvvxJf3p21QfJ2Bt3cw=
github.com/golang/mock v1.4.4/go.mod h1:l3mdAwkq5BuhzHwde/uurv3sEJeZMXNpwsxVWU71h+4=
github.com/golang/mock v1.5.0/go.mod h1:CWnOUgYIOo4TcNZ0wHX3YZCqsaM1I1Jvs6v3mP3KVu8=
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.3/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw=
github.com/golang/protobuf v1.3.4/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw=
github.com/golang/protobuf v1.3.5/go.mod h1:6O5/vntMXwX2lRkT1hjjk0nAC1IDOTvTlVgjlRvqsdk=
github.com/golang/protobuf v1.4.0-rc.1/go.mod h1:ceaxUfeHdC40wWswd/P6IGgMaK3YpKi5j83Wpe3EHw8=
github.com/golang/protobuf v1.4.0-rc.1.0.20200221234624-67d41d38c208/go.mod h1:xKAWHe0F5eneWXFV3EuXVDTCmh+JuBKY0li0aMyXATA=
github.com/golang/protobuf v1.4.0-rc.2/go.mod h1:LlEzMj4AhA7rCAGe4KMBDvJI+AwstrUpVNzEA03Pprs=
github.com/golang/protobuf v1.4.0-rc.4.0.20200313231945-b860323f09d0/go.mod h1:WU3c8KckQ9AFe+yFwt9sWVRKCVIyN9cPHBJSNnbL67w=
github.com/golang/protobuf v1.4.0/go.mod h1:jodUvKwWbYaEsadDk5Fwe5c77LiNKVO9IDvqG2KuDX0=
github.com/golang/protobuf v1.4.1/go.mod h1:U8fpvMrcmy5pZrNK1lt4xCsGvpyWQ/VVv6QDs8UjoX8=
github.com/golang/protobuf v1.4.2/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI=
github.com/golang/protobuf v1.4.3/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI=
github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk=
github.com/golang/protobuf v1.5.1/go.mod h1:DopwsBzvsk0Fs44TXzsVbJyPhcCPeIwnvohx4u74HPM=
github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.4.1/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.1/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.2/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.3/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs=
github.com/google/martian/v3 v3.0.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
github.com/google/martian/v3 v3.1.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc=
github.com/google/pprof v0.0.0-20190515194954-54271f7e092f/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc=
github.com/google/pprof v0.0.0-20191218002539-d4f498aebedc/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM=
github.com/google/pprof v0.0.0-20200212024743-f11f1df84d12/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM=
github.com/google/pprof v0.0.0-20200229191704-1ebb73c60ed3/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM=
github.com/google/pprof v0.0.0-20200430221834-fc25d7d30c6d/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM=
github.com/google/pprof v0.0.0-20200708004538-1a94d8640e99/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM=
github.com/google/pprof v0.0.0-20201023163331-3e6fc7fc9c4c/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20201203190320-1bf35d6f28c2/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20210122040257-d980be63207e/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20210226084205-cbba55b83ad5/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg=
github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk=
github.com/gopherjs/gopherjs v0.0.0-20181017120253-0766667cb4d1/go.mod h1:wJfORRmW1u3UXTncJ5qlYoELFm8eSnnEO6hX4iZ3EWY=
github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw=
github.com/hashicorp/consul/api v1.1.0/go.mod h1:VmuI/Lkw1nC05EYQWNKwWGbkg+FbDBtguAZLlVdkD9Q=
github.com/hashicorp/consul/sdk v0.1.1/go.mod h1:VKf9jXwCTEY1QZP2MOLRhb5i/I/ssyNV1vwHyQBF0x8=
github.com/hashicorp/errwrap v1.0.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/go-cleanhttp v0.5.1/go.mod h1:JpRdi6/HCYpAwUzNwuwqhbovhLtngrth3wmdIIUrZ80=
github.com/hashicorp/go-immutable-radix v1.0.0/go.mod h1:0y9vanUI8NX6FsYoO3zeMjhV/C5i9g4Q3DwcSNZ4P60=
github.com/hashicorp/go-msgpack v0.5.3/go.mod h1:ahLV/dePpqEmjfWmKiqvPkv/twdG7iPBM1vqhUKIvfM=
github.com/hashicorp/go-multierror v1.0.0/go.mod h1:dHtQlpGsu+cZNNAkkCN/P3hoUDHhCYQXV3UM06sGGrk=
github.com/hashicorp/go-rootcerts v1.0.0/go.mod h1:K6zTfqpRlCUIjkwsN4Z+hiSfzSTQa6eBIzfwKfwNnHU=
github.com/hashicorp/go-sockaddr v1.0.0/go.mod h1:7Xibr9yA9JjQq1JpNB2Vw7kxv8xerXegt+ozgdvDeDU=
github.com/hashicorp/go-syslog v1.0.0/go.mod h1:qPfqrKkXGihmCqbJM2mZgkZGvKG1dFdvsLplgctolz4=
github.com/hashicorp/go-uuid v1.0.0/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/hashicorp/go-uuid v1.0.1/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/hashicorp/go.net v0.0.1/go.mod h1:hjKkEWcCURg++eb33jQU7oqQcI9XDCnUzHA0oac0k90=
github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ=
github.com/hashicorp/logutils v1.0.0/go.mod h1:QIAnNjmIWmVIIkWDTG1z5v++HQmx9WQRO+LraFDTW64=
github.com/hashicorp/mdns v1.0.0/go.mod h1:tL+uN++7HEJ6SQLQ2/p+z2pH24WQKWjBPkE0mNTz8vQ=
github.com/hashicorp/memberlist v0.1.3/go.mod h1:ajVTdAv/9Im8oMAAj5G31PhhMCZJV2pPBoIllUwCN7I=
github.com/hashicorp/serf v0.8.2/go.mod h1:6hOLApaqBFA1NXqRQAsxw9QxuDEvNxSQRwA/JwenrHc=
github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/inconshreveable/mousetrap v1.0.0 h1:Z8tu5sraLXCXIcARxBp/8cbvlwVa7Z1NHg9XEKhtSvM=
github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8=
github.com/json-iterator/go v1.1.11/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4=
github.com/jstemmer/go-junit-report v0.0.0-20190106144839-af01ea7f8024/go.mod h1:6v2b51hI/fHJwM22ozAgKL4VKDeJcHhJFhtBdhmNjmU=
github.com/jstemmer/go-junit-report v0.9.1/go.mod h1:Brl9GWCQeLvo8nXZwPNNblvFj/XSXhF0NWZEnDohbsk=
github.com/jtolds/gls v4.20.0+incompatible/go.mod h1:QJZ7F/aHp+rZTRtaJ1ow/lLfFfVYBRgL+9YlvaHOwJU=
github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/klauspost/compress v1.13.6 h1:P76CopJELS0TiO2mebmnzgWaajssP/EszplttgQxcgc=
github.com/klauspost/compress v1.13.6/go.mod h1:/3/Vjq9QcHkK5uEr5lBEmyoZ1iFhe47etQ6QUkpK6sk=
github.com/klauspost/pgzip v1.2.5 h1:qnWYvvKqedOF2ulHpMG72XQol4ILEJ8k2wwRl/Km8oE=
github.com/klauspost/pgzip v1.2.5/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=
github.com/kr/fs v0.1.0/go.mod h1:FFnZGqtBN9Gxj7eW1uZ42v5BccTP0vu6NEaFoC2HwRg=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pretty v0.2.1 h1:Fmg33tUaq4/8ym9TJN1x7sLJnHVwhP33CNkpYV/7rwI=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/magiconair/properties v1.8.5/go.mod h1:y3VJvCyxH9uVvJTWEGAELF3aiYNyPKd5NZ3oSwXrF60=
github.com/mattn/go-colorable v0.0.9/go.mod h1:9vuHe8Xs5qXnSaW/c/ABM9alt+Vo+STaOChaDxuIBZU=
github.com/mattn/go-colorable v0.1.8 h1:c1ghPdyEDarC70ftn0y+A/Ee++9zz8ljHG1b13eJ0s8=
github.com/mattn/go-colorable v0.1.8/go.mod h1:u6P/XSegPjTcexA+o6vUJrdnUu04hMope9wVRipJSqc=
github.com/mattn/go-isatty v0.0.3/go.mod h1:M+lRXTBqGeGNdLjl/ufCoiOlB5xdOkqRJdNxMWT7Zi4=
github.com/mattn/go-isatty v0.0.12 h1:wuysRhFDzyxgEmMf5xjvJ2M9dZoWAXNNr5LSBS7uHXY=
github.com/mattn/go-isatty v0.0.12/go.mod h1:cbi8OIDigv2wuxKPP5vlRcQ1OAZbq2CE4Kysco4FUpU=
github.com/mattn/go-runewidth v0.0.13 h1:lTGmDsbAYt5DmK6OnoV7EuIF1wEIFAcxld6ypU4OSgU=
github.com/mattn/go-runewidth v0.0.13/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh153qUoGf23w=
github.com/miekg/dns v1.0.14/go.mod h1:W1PPwlIAgtquWBMBEV9nkV9Cazfe8ScdGz/Lj7v3Nrg=
github.com/mitchellh/cli v1.0.0/go.mod h1:hNIlj7HEI86fIcpObd7a0FcrxTWetlwJDGcceTlRvqc=
github.com/mitchellh/go-homedir v1.0.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/go-testing-interface v1.0.0/go.mod h1:kRemZodwjscx+RGhAo8eIhFbs2+BFgRtFPeD/KE+zxI=
github.com/mitchellh/gox v0.4.0/go.mod h1:Sd9lOJ0+aimLBi73mGofS1ycjY8lL3uZM3JPS42BGNg=
github.com/mitchellh/iochan v1.0.0/go.mod h1:JwYml1nuB7xOzsp52dPpHFffvOCDupsG0QubkSMEySY=
github.com/mitchellh/mapstructure v0.0.0-20160808181253-ca63d7c062ee/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
github.com/mitchellh/mapstructure v1.4.1/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/pascaldekloe/goe v0.0.0-20180627143212-57f6aae5913c/go.mod h1:lzWF7FIEvWOWxwDKqyGYQf6ZUaNfKdP144TG7ZOy1lc=
github.com/pelletier/go-toml v1.9.3/go.mod h1:u1nR/EPcESfeI/szUZKdtJ0xRNbUoANCkoOuaOx1Y+c=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/sftp v1.10.1/go.mod h1:lYOWFsE0bwd1+KfKJaKeuokY15vzFx25BLbzYYoAxZI=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/posener/complete v1.1.1/go.mod h1:em0nMJCgc9GFtwrmVmEMR/ZL6WyhyjMBndrE9hABlRI=
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/remyoudompheng/bigfft v0.0.0-20200410134404-eec4a21b6bb0/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo=
github.com/rivo/uniseg v0.2.0 h1:S1pD9weZBuJdFmowNwbpi7BJ8TNftyUImj/0WQi72jY=
github.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJtxc=
github.com/rogpeppe/fastuuid v1.2.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6LYCDYWNEvQ=
github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/ryanuber/columnize v0.0.0-20160712163229-9b3edd62028f/go.mod h1:sm1tb6uqfes/u+d4ooFouqFdy9/2g9QGwK3SQygK0Ts=
github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529/go.mod h1:DxrIzT+xaE7yg65j358z/aeFdxmN0P9QXhEzd20vsDc=
github.com/shenwei356/bio v0.3.1 h1:rdSoslv8HahfJpkzhN6W1Ky1lQ++gaBIfuJAC4f65cc=
github.com/shenwei356/bio v0.3.1/go.mod h1:rHA8DoaDooXKdqX7bqoksQDKm3mEAQ3MsD9nivVTSf0=
github.com/shenwei356/bpool v0.0.0-20160710042833-f9e0ee4d0403 h1:/3JklLnHXiWUBxWc3joQYavDQJpncRhRA909cUb7eOw=
github.com/shenwei356/bpool v0.0.0-20160710042833-f9e0ee4d0403/go.mod h1:YkgdTWfNnJgv5HVJbVSDmxQtkK3/jZWDoqcG26BVU8k=
github.com/shenwei356/breader v0.3.1 h1:OjgfeHhpNGQPkS1+lgsl4eNuuO//Y16N6TkqG5oxO5U=
github.com/shenwei356/breader v0.3.1/go.mod h1:UR11JJCxU9s7eUdU4xn3L/VodxoXzWhjJPh8WZbb+us=
github.com/shenwei356/go-logging v0.0.0-20171012171522-c6b9702d88ba h1:UvnrxFDPmz7agYX0eQ2JEorTKn1ORnZ9dT5OzbjPvK8=
github.com/shenwei356/go-logging v0.0.0-20171012171522-c6b9702d88ba/go.mod h1:LiqYp/K5yCEWOi7Ux/iOF/kjDxtsdYjOGcKHDbEOXFU=
github.com/shenwei356/natsort v0.0.0-20190418160752-600d539c017d h1:eeXLHcXyGEr72V1SOSEI7vSzUOTJvHutwF7Ykm+hscQ=
github.com/shenwei356/natsort v0.0.0-20190418160752-600d539c017d/go.mod h1:SiiGiRFyRtV7S9RamOrmQR5gpGIRhWJM1w0EtmuQ1io=
github.com/shenwei356/util v0.3.2 h1:3qXkcO2erNKnPCnV8zxjN2JL5sGnOqW+muj1x4XxkuM=
github.com/shenwei356/util v0.3.2/go.mod h1:pY/f5wR/0o0dJcodhw1Df/ghzqNt2wFSATW0zymI4mA=
github.com/shenwei356/xopen v0.1.0 h1:PizY52rLA7A6EdkwKZ6A8h8/a+c9DCBXqfLtwVzsWnM=
github.com/shenwei356/xopen v0.1.0/go.mod h1:6EQUa6I7Zsl2GQKqcL9qGLrTzVE+oZyly+uhzovQYSk=
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
github.com/smartystreets/assertions v0.0.0-20180927180507-b2de0cb4f26d/go.mod h1:OnSkiWE9lh6wB0YB77sQom3nweQdgAjqCqsofrRNTgc=
github.com/smartystreets/goconvey v1.6.4/go.mod h1:syvi0/a8iFYH4r/RixwvyeAJjdLS9QV7WQ/tjFTllLA=
github.com/spf13/afero v1.6.0/go.mod h1:Ai8FlHk4v/PARR026UzYexafAt9roJ7LcLMAmO6Z93I=
github.com/spf13/cast v1.3.1/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE=
github.com/spf13/cobra v1.2.1 h1:+KmjbUw1hriSNMF55oPrkZcb27aECyrj8V2ytv7kWDw=
github.com/spf13/cobra v1.2.1/go.mod h1:ExllRjgxM/piMAM+3tAZvg8fsklGAf3tPfi+i8t68Nk=
github.com/spf13/jwalterweatherman v1.1.0/go.mod h1:aNWZUN0dPAAO/Ljvb5BEdw96iTZ0EXowPYD95IqWIGo=
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/spf13/viper v1.8.1/go.mod h1:o0Pch8wJ9BVSWGQMbra6iw0oQ5oktSIBaujf1rJH9Ns=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/subosito/gotenv v1.2.0/go.mod h1:N0PQaV/YGNqwC0u51sEeR/aUtSLEXKX9iv69rRypqCw=
github.com/tatsushid/go-prettytable v0.0.0-20141013043238-ed2d14c29939 h1:BhIUXV2ySTLrKgh/Hnts+QTQlIbWtomXt3LMdzME0A0=
github.com/tatsushid/go-prettytable v0.0.0-20141013043238-ed2d14c29939/go.mod h1:omGxs4/6hNjxPKUTjmaNkPzehSnNJOJN6pMEbrlYIT4=
github.com/twotwotwo/sorts v0.0.0-20160814051341-bf5c1f2b8553 h1:DRC1ubdb3ZmyyIeCSTxjZIQAnpLPfKVgYrLETQuOPjo=
github.com/twotwotwo/sorts v0.0.0-20160814051341-bf5c1f2b8553/go.mod h1:Rj7Csq/tZ/egz+Ltc2IVpsA5309AmSMEswjkTZmq2Xc=
github.com/will-rowe/nthash v0.4.0 h1:YiHdqR0phP9o/kKVMJJiuXYY9qOH8QHofptDqUCOxrU=
github.com/will-rowe/nthash v0.4.0/go.mod h1:5ezweuK0J5j+/7lih/RkrSmnxI3hoaPpQiVWJ7rd960=
github.com/yuin/goldmark v1.1.25/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.1.32/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
github.com/zeebo/wyhash v0.0.1 h1:VEByEMek3iHhV65CgG3SRAWVtg/6TcmbEKj5jPOKDrc=
github.com/zeebo/wyhash v0.0.1/go.mod h1:Ti+OwfNtM5AZiYAL0kOPIfliqDP5c0VtOnnMAqzuuZk=
go.etcd.io/etcd/api/v3 v3.5.0/go.mod h1:cbVKeC6lCfl7j/8jBhAK6aIYO9XOjdptoxU/nLQcPvs=
go.etcd.io/etcd/client/pkg/v3 v3.5.0/go.mod h1:IJHfcCEKxYu1Os13ZdwCwIUTUVGYTSAM3YSwc9/Ac1g=
go.etcd.io/etcd/client/v2 v2.305.0/go.mod h1:h9puh54ZTgAKtEbut2oe9P4L/oqKCVB6xsXlzd7alYQ=
go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU=
go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8=
go.opencensus.io v0.22.2/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=
go.opencensus.io v0.22.3/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=
go.opencensus.io v0.22.4/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=
go.opencensus.io v0.22.5/go.mod h1:5pWMHQbX5EPX2/62yrJeAkowc+lfs/XD7Uxpq3pI6kk=
go.opencensus.io v0.23.0/go.mod h1:XItmlyltB5F7CS4xOC1DcqMoFqwtC6OG2xF7mCv7P7E=
go.uber.org/atomic v1.7.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc=
go.uber.org/multierr v1.6.0/go.mod h1:cdWPpRnG4AhwMwsgIHip0KRBQjJy5kYEpYjJxpXp9iU=
go.uber.org/zap v1.17.0/go.mod h1:MXVU+bhUf/A7Xi2HNOnopQOrmycQ5Ih87HtOu4q5SSo=
golang.org/x/crypto v0.0.0-20181029021203-45a5f77698d3/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20190510104115-cbcb75029529/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20190605123033-f99c8df09eb5/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20190820162420-60c769a6c586/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8=
golang.org/x/exp v0.0.0-20190829153037-c13cbed26979/go.mod h1:86+5VVa7VpoJ4kLfm080zCjGlMRFzhUhsZKEZO7MGek=
golang.org/x/exp v0.0.0-20191030013958-a1ab85dbe136/go.mod h1:JXzH8nQsPlswgeRAPE3MuO9GYsAcnJvJ4vnMwN/5qkY=
golang.org/x/exp v0.0.0-20191129062945-2f5052295587/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4=
golang.org/x/exp v0.0.0-20191227195350-da58074b4299/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4=
golang.org/x/exp v0.0.0-20200119233911-0405dc783f0a/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4=
golang.org/x/exp v0.0.0-20200207192155-f17229e696bd/go.mod h1:J/WKrq2StrnmMY6+EHIKF9dgMWnmCNThgcyBT1FY9mM=
golang.org/x/exp v0.0.0-20200224162631-6cc2880d07d6/go.mod h1:3jZMyOhIsHpP37uCMkUooju7aAi5cS1Q23tOzKc+0MU=
golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js=
golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU=
golang.org/x/lint v0.0.0-20190301231843-5614ed5bae6f/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
golang.org/x/lint v0.0.0-20190409202823-959b441ac422/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
golang.org/x/lint v0.0.0-20190909230951-414d861bb4ac/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
golang.org/x/lint v0.0.0-20190930215403-16217165b5de/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
golang.org/x/lint v0.0.0-20191125180803-fdd1cda4f05f/go.mod h1:5qLYkcX4OjUUV8bRuDixDT3tpyyb+LUpUlRWLxfhWrs=
golang.org/x/lint v0.0.0-20200130185559-910be7a94367/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY=
golang.org/x/lint v0.0.0-20200302205851-738671d3881b/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY=
golang.org/x/lint v0.0.0-20201208152925-83fdc39ff7b5/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY=
golang.org/x/lint v0.0.0-20210508222113-6edffad5e616/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY=
golang.org/x/mobile v0.0.0-20190312151609-d3739f865fa6/go.mod h1:z+o9i4GpDbdi3rU15maQ/Ox0txvL9dWGYEHz965HBQE=
golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028/go.mod h1:E/iHnbuqvinMTCcRqshq8CkpyQDoeVncDDYHnLhea+o=
golang.org/x/mod v0.0.0-20190513183733-4bf6d317e70e/go.mod h1:mXi4GBBbnImb6dmsKGUJ2LatrhH/nqhxcFungHvyanc=
golang.org/x/mod v0.1.0/go.mod h1:0QHyrYULN0/3qlju5TqG8bIK38QM8yzMo5ekMj3DlcY=
golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg=
golang.org/x/mod v0.1.1-0.20191107180719-034126e5016b/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg=
golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.1/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.2/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181023162649-9b4f9f5ad519/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181201002055-351d144fa1fc/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190108225652-1e06a53dbb7e/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190213061140-3a22650c66bd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190501004415-9ce7a6920f09/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190503192946-f4e77d36d62c/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190603091049-60506f45cf65/go.mod h1:HSz+uSET+XFnRR8LxR5pz3Of3rY3CfYBVs4xY44aLks=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20190628185345-da137c7871d7/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20190724013045-ca1201d0de80/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20191209160850-c0dbc17a3553/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200114155413-6afb5195e5aa/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200202094626-16171245cfb2/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200222125558-5a598a2470a0/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200301022130-244492dfa37a/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200501053045-e0ff5e5a1de5/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200506145744-7e3656a0809f/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200513185701-a91f0712d120/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200520182314-0ba52f642ac2/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200625001655-4c5254603344/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
golang.org/x/net v0.0.0-20200707034311-ab3426394381/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
golang.org/x/net v0.0.0-20200822124328-c89045814202/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20201031054903-ff519b6c9102/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20201110031124-69a78807bb2b/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20201209123823-ac852fbbde11/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20210119194325-5f4716e94777/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20210316092652-d523dce5a7f4/go.mod h1:RBQZq4jEuRlivfhVLdyRGr576XBO4/greRjx4P4O3yc=
golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20191202225959-858c2ad4c8b6/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20200902213428-5d25da1a8d43/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20201109201403-9fd604954f58/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20201208152858-08078c50e5b5/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20210218202405-ba52d332ba99/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20210220000619-9bb904979d93/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20210313182246-cd4f82c27b84/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/oauth2 v0.0.0-20210402161424-2e8d93401602/go.mod h1:KelEdhl1UZF7XfJ4dDtk6s++YSgaE7mD/BuKKDLBl4A=
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190227155943-e225da77a7e6/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20200317015054-43a5402ce75a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20200625203802-6e8e738ad208/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180823144017-11551d06cbcc/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181026203630-95b1ffbd15a5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190312061237-fead79001313/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190502145724-3ef323f4f1fd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190507160741-ecd444e8653b/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190606165138-5da285871e9c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190624142023-c5567b49c5d0/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190726091711-fc99dfbffb4e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191001151750-bb3f8db39f24/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191005200804-aed5e4c7ecf9/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191204072324-ce4227a45e2e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191228213918-04cbcbbfeed8/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200113162924-86b910548bc1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200116001909-b77594299b42/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200122134326-e047566fdf82/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200202164722-d101bd2416d5/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200212091648-12a6c2dcc1e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200223170610-d5e6a3e2c0ae/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200302150141-5c8b2ff67527/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200331124033-c3d80250170d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200501052902-10377860bb8e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200511232937-7e40ca221e25/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200515095857-1151b9dac4a9/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200523222454-059865788121/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200803210538-64077c9b5642/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200905004654-be1d3432aa8f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201201145000-ef89a241ccb3/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210104204734-6f8348627aad/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210119212857-b64e53b001e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210220050731-9a76102bfb43/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210305230114-8fe3ee5dd75b/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210315160823-c6e025ad8005/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210320140829-1e4c9ba3b0c4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210403161142-5e06dd20ab57/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210510120138-977fb7262007 h1:gG67DSER+11cZvqIMb8S8bt0vZtiN6xWYARwirrOSfE=
golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.4/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.5/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190226205152-f727befe758c/go.mod h1:9Yl7xja0Znq3iFh3HoIrodX9oNMXvdceNzlUR8zjMvY=
golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20190312151545-0bb0c0a6e846/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20190312170243-e65039ee4138/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20190328211700-ab21143f2384/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20190425150028-36563e24a262/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q=
golang.org/x/tools v0.0.0-20190506145303-2d16b83fe98c/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q=
golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q=
golang.org/x/tools v0.0.0-20190606124116-d0a3d012864b/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc=
golang.org/x/tools v0.0.0-20190621195816-6e04913cbbac/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc=
golang.org/x/tools v0.0.0-20190628153133-6cdbf07be9d0/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc=
golang.org/x/tools v0.0.0-20190816200558-6889da9d5479/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20190911174233-4f2ddba30aff/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191012152004-8de300cfc20a/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191112195655-aa38f8e97acc/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191113191852-77e3bb0ad9e7/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191115202509-3a792d9c32b2/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191125144606-a911d9008d1f/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191130070609-6e064ea0cf2d/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191216173652-a0e659d51361/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20191227053925-7b8e75db28f4/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200117161641-43d50277825c/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200122220014-bf1340f18c4a/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200130002326-2f3ba24bd6e7/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200204074204-1cc6d1ef6c74/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200207183749-b753a1ba74fa/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200212150539-ea181f53ac56/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200224181240-023911ca70b2/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200227222343-706bc42d1f0d/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200304193943-95d2e580d8eb/go.mod h1:o4KQGtdN14AW+yjsvvwRTJJuXz8XRtIHtEnmAXLyFUw=
golang.org/x/tools v0.0.0-20200312045724-11d5b4c81c7d/go.mod h1:o4KQGtdN14AW+yjsvvwRTJJuXz8XRtIHtEnmAXLyFUw=
golang.org/x/tools v0.0.0-20200331025713-a30bf2db82d4/go.mod h1:Sl4aGygMT6LrqrWclx+PTx3U+LnKx/seiNR+3G19Ar8=
golang.org/x/tools v0.0.0-20200501065659-ab2804fb9c9d/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20200512131952-2bc93b1c0c88/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20200515010526-7d3b6ebf133d/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20200618134242-20370b0cb4b2/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20200729194436-6467de6f59a7/go.mod h1:njjCfa9FT2d7l9Bc6FUM5FLjQPp3cFF28FI3qnDFljA=
golang.org/x/tools v0.0.0-20200804011535-6c149bb5ef0d/go.mod h1:njjCfa9FT2d7l9Bc6FUM5FLjQPp3cFF28FI3qnDFljA=
golang.org/x/tools v0.0.0-20200825202427-b303f430e36d/go.mod h1:njjCfa9FT2d7l9Bc6FUM5FLjQPp3cFF28FI3qnDFljA=
golang.org/x/tools v0.0.0-20200904185747-39188db58858/go.mod h1:Cj7w3i3Rnn0Xh82ur9kSqwfTHTeVxaDqrfMjpcNT6bE=
golang.org/x/tools v0.0.0-20201110124207-079ba7bd75cd/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.0.0-20201201161351-ac6f37ff4c2a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.0.0-20201208233053-a543418bbed2/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.0.0-20210105154028-b0ab187a4818/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.1.0/go.mod h1:xkSsbof2nBLbhDlRMhhhyNLN/zl3eTqcnHD5viDpcZ0=
golang.org/x/tools v0.1.2/go.mod h1:o0xws9oXOQQZyjljx8fwUC0k7L1pTE6eaCbjGeHmOkk=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/api v0.4.0/go.mod h1:8k5glujaEP+g9n7WNsDg8QP6cUVNI86fCNMcbazEtwE=
google.golang.org/api v0.7.0/go.mod h1:WtwebWUNSVBH/HAw79HIFXZNqEvBhG+Ra+ax0hx3E3M=
google.golang.org/api v0.8.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg=
google.golang.org/api v0.9.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg=
google.golang.org/api v0.13.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI=
google.golang.org/api v0.14.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI=
google.golang.org/api v0.15.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI=
google.golang.org/api v0.17.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE=
google.golang.org/api v0.18.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE=
google.golang.org/api v0.19.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE=
google.golang.org/api v0.20.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE=
google.golang.org/api v0.22.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE=
google.golang.org/api v0.24.0/go.mod h1:lIXQywCXRcnZPGlsd8NbLnOjtAoL6em04bJ9+z0MncE=
google.golang.org/api v0.28.0/go.mod h1:lIXQywCXRcnZPGlsd8NbLnOjtAoL6em04bJ9+z0MncE=
google.golang.org/api v0.29.0/go.mod h1:Lcubydp8VUV7KeIHD9z2Bys/sm/vGKnG1UHuDBSrHWM=
google.golang.org/api v0.30.0/go.mod h1:QGmEvQ87FHZNiUVJkT14jQNYJ4ZJjdRF23ZXz5138Fc=
google.golang.org/api v0.35.0/go.mod h1:/XrVsuzM0rZmrsbjJutiuftIzeuTQcEeaYcSk/mQ1dg=
google.golang.org/api v0.36.0/go.mod h1:+z5ficQTmoYpPn8LCUNVpK5I7hwkpjbcgqA7I34qYtE=
google.golang.org/api v0.40.0/go.mod h1:fYKFpnQN0DsDSKRVRcQSDQNtqWPfM9i+zNPxepjRCQ8=
google.golang.org/api v0.41.0/go.mod h1:RkxM5lITDfTzmyKFPt+wGrCJbVfniCr2ool8kTBzRTU=
google.golang.org/api v0.43.0/go.mod h1:nQsDGjRXMo4lvh5hP0TKqF244gqhGcr/YSIykhUk/94=
google.golang.org/api v0.44.0/go.mod h1:EBOGZqzyhtvMDoxwS97ctnh0zUmYY6CxqXsc1AvkYD8=
google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM=
google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
google.golang.org/appengine v1.5.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
google.golang.org/appengine v1.6.1/go.mod h1:i06prIuMbXzDqacNJfV5OdTW448YApPu5ww/cMBSeb0=
google.golang.org/appengine v1.6.5/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc=
google.golang.org/appengine v1.6.6/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc=
google.golang.org/appengine v1.6.7/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc=
google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc=
google.golang.org/genproto v0.0.0-20190307195333-5fe7a883aa19/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE=
google.golang.org/genproto v0.0.0-20190418145605-e7d98fc518a7/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE=
google.golang.org/genproto v0.0.0-20190425155659-357c62f0e4bb/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE=
google.golang.org/genproto v0.0.0-20190502173448-54afdca5d873/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE=
google.golang.org/genproto v0.0.0-20190801165951-fa694d86fc64/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc=
google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc=
google.golang.org/genproto v0.0.0-20190911173649-1774047e7e51/go.mod h1:IbNlFCBrqXvoKpeg0TB2l7cyZUmoaFKYIwrEpbDKLA8=
google.golang.org/genproto v0.0.0-20191108220845-16a3f7862a1a/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20191115194625-c23dd37a84c9/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20191216164720-4f79533eabd1/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20191230161307-f3c370f40bfb/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20200115191322-ca5a22157cba/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20200122232147-0452cf42e150/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc=
google.golang.org/genproto v0.0.0-20200204135345-fa8e72b47b90/go.mod h1:GmwEX6Z4W5gMy59cAlVYjN9JhxgbQH6Gn+gFDQe2lzA=
google.golang.org/genproto v0.0.0-20200212174721-66ed5ce911ce/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200224152610-e50cd9704f63/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200228133532-8c2c7df3a383/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200305110556-506484158171/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200312145019-da6875a35672/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200331122359-1ee6d9798940/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200430143042-b979b6f78d84/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200511104702-f5ebc3bea380/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200513103714-09dca8ec2884/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
google.golang.org/genproto v0.0.0-20200515170657-fc4c6c6a6587/go.mod h1:YsZOwe1myG/8QRHRsmBRE1LrgQY60beZKjly0O1fX9U=
google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013/go.mod h1:NbSheEEYHJ7i3ixzK3sjbqSGDJWnxyFXZblF3eUsNvo=
google.golang.org/genproto v0.0.0-20200618031413-b414f8b61790/go.mod h1:jDfRM7FcilCzHH/e9qn6dsT145K34l5v+OpcnNgKAAA=
google.golang.org/genproto v0.0.0-20200729003335-053ba62fc06f/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20200804131852-c06518451d9c/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20200825200019-8632dd797987/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20200904004341-0bd0a958aa1d/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20201109203340-2640f1f9cdfb/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20201201144952-b05cb90ed32e/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20201210142538-e3217bee35cc/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20201214200347-8c77b98c765d/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20210222152913-aa3ee6e6a81c/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20210303154014-9728d6b83eeb/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20210310155132-4ce2db91004e/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20210319143718-93e7006c17a6/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
google.golang.org/genproto v0.0.0-20210402141018-6c239bbf2bb1/go.mod h1:9lPAdzaEmUacj36I+k7YKbEc5CXzPIeORRgDAUOu28A=
google.golang.org/genproto v0.0.0-20210602131652-f16073e35f0c/go.mod h1:UODoCrxHCcBojKKwX1terBiRUaqAsFqJiF615XL43r0=
google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c=
google.golang.org/grpc v1.20.1/go.mod h1:10oTOabMzJvdu6/UiuZezV6QK5dSlG84ov/aaiqXj38=
google.golang.org/grpc v1.21.1/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM=
google.golang.org/grpc v1.23.0/go.mod h1:Y5yQAOtifL1yxbo5wqy6BxZv8vAUGQwXBOALyacEbxg=
google.golang.org/grpc v1.25.1/go.mod h1:c3i+UQWmh7LiEpx4sFZnkU36qjEYZ0imhYfXVyQciAY=
google.golang.org/grpc v1.26.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk=
google.golang.org/grpc v1.27.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk=
google.golang.org/grpc v1.27.1/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk=
google.golang.org/grpc v1.28.0/go.mod h1:rpkK4SK4GF4Ach/+MFLZUBavHOvF2JJB5uozKKal+60=
google.golang.org/grpc v1.29.1/go.mod h1:itym6AZVZYACWQqET3MqgPpjcuV5QH3BxFS3IjizoKk=
google.golang.org/grpc v1.30.0/go.mod h1:N36X2cJ7JwdamYAgDz+s+rVMFjt3numwzf/HckM8pak=
google.golang.org/grpc v1.31.0/go.mod h1:N36X2cJ7JwdamYAgDz+s+rVMFjt3numwzf/HckM8pak=
google.golang.org/grpc v1.31.1/go.mod h1:N36X2cJ7JwdamYAgDz+s+rVMFjt3numwzf/HckM8pak=
google.golang.org/grpc v1.33.1/go.mod h1:fr5YgcSWrqhRRxogOsw7RzIpsmvOZ6IcH4kBYTpR3n0=
google.golang.org/grpc v1.33.2/go.mod h1:JMHMWHQWaTccqQQlmk3MJZS+GWXOdAesneDmEnv2fbc=
google.golang.org/grpc v1.34.0/go.mod h1:WotjhfgOW/POjDeRt8vscBtXq+2VjORFy659qA51WJ8=
google.golang.org/grpc v1.35.0/go.mod h1:qjiiYl8FncCW8feJPdyg3v6XW24KsRHe+dy9BAGRRjU=
google.golang.org/grpc v1.36.0/go.mod h1:qjiiYl8FncCW8feJPdyg3v6XW24KsRHe+dy9BAGRRjU=
google.golang.org/grpc v1.36.1/go.mod h1:qjiiYl8FncCW8feJPdyg3v6XW24KsRHe+dy9BAGRRjU=
google.golang.org/grpc v1.38.0/go.mod h1:NREThFqKR1f3iQ6oBuvc5LadQuXVGo9rkm5ZGrQdJfM=
google.golang.org/protobuf v0.0.0-20200109180630-ec00e32a8dfd/go.mod h1:DFci5gLYBciE7Vtevhsrf46CRTquxDuWsQurQQe4oz8=
google.golang.org/protobuf v0.0.0-20200221191635-4d8936d0db64/go.mod h1:kwYJMbMJ01Woi6D6+Kah6886xMZcty6N08ah7+eCXa0=
google.golang.org/protobuf v0.0.0-20200228230310-ab0ca4ff8a60/go.mod h1:cfTl7dwQJ+fmap5saPgwCLgHXTUD7jkjRqWcaiX5VyM=
google.golang.org/protobuf v1.20.1-0.20200309200217-e05f789c0967/go.mod h1:A+miEFZTKqfCUM6K7xSMQL9OKL/b6hQv+e19PK+JZNE=
google.golang.org/protobuf v1.21.0/go.mod h1:47Nbq4nVaFHyn7ilMalzfO3qCViNmqZ2kzikPIcrTAo=
google.golang.org/protobuf v1.22.0/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU=
google.golang.org/protobuf v1.23.0/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU=
google.golang.org/protobuf v1.23.1-0.20200526195155-81db48ad09cc/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU=
google.golang.org/protobuf v1.24.0/go.mod h1:r/3tXBNzIEhYS9I1OUVjXDlt8tc493IdKGjtUeSXeh4=
google.golang.org/protobuf v1.25.0/go.mod h1:9JNX74DMeImyA3h4bdi1ymwjUzf21/xIlbajtzgsN7c=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI=
gopkg.in/ini.v1 v1.62.0/go.mod h1:pNLf8WUiyNEtQjuu5G5vTm06TEv9tsIgeAvK8hOrP4k=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.3/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.0-20190106161140-3f1c8253044a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.0-20190418001031-e561f6794a2a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.1-2019.2.3/go.mod h1:a3bituU0lyd329TUQxRnasdCoJDkEUEAqEt0JzvZhAg=
honnef.co/go/tools v0.0.1-2020.1.3/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9vFzvIQ3k=
honnef.co/go/tools v0.0.1-2020.1.4/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9vFzvIQ3k=
rsc.io/binaryregexp v0.2.0/go.mod h1:qTv7/COck+e2FymRvadv62gMdZztPaShugOCi3I+8D8=
rsc.io/quote/v3 v3.1.0/go.mod h1:yEA65RcK8LyAZtP9Kv3t0HmxON59tX3rD+tICJqUlj0=
rsc.io/sampler v1.3.0/go.mod h1:T1hPZKmBbMNahiBKFy5HrXp6adAjACjK9JXDnKaTXpA=
unikmer-0.18.8/index/ 0000775 0000000 0000000 00000000000 14121001445 0014361 5 ustar 00root root 0000000 0000000 unikmer-0.18.8/index/serialization.go 0000664 0000000 0000000 00000017663 14121001445 0017602 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package index
import (
"encoding/binary"
"errors"
"fmt"
"io"
"strings"
)
// Version is the version of index format
const Version uint8 = 2
// Magic number of index file.
var Magic = [8]byte{'.', 'u', 'n', 'i', 'k', 'i', 'd', 'x'}
// ErrInvalidIndexFileFormat means invalid index format.
var ErrInvalidIndexFileFormat = errors.New("unikmer/index: invalid index format")
// ErrUnfishedWrite means writing not finished
var ErrUnfishedWrite = errors.New("unikmer/index: index not fished writing")
// ErrTruncateIndexFile means the file is truncated
var ErrTruncateIndexFile = errors.New("unikmer/index: truncated index file")
// ErrWrongWriteDataSize means the size of data to write is invalid
var ErrWrongWriteDataSize = errors.New("unikmer/index: write data with wrong size")
// ErrVersionMismatch means version mismatch between files and program
var ErrVersionMismatch = errors.New("unikmer/index: version mismatch")
// ErrNameAndSizeMismatch means size of names and sizes are not equal.
var ErrNameAndSizeMismatch = errors.New("unikmer/index: size of names and sizes unequal")
// ErrNameAndIndexMismatch means size of names and sizes are not equal.
var ErrNameAndIndexMismatch = errors.New("unikmer/index: size of names and indices unequal")
var be = binary.BigEndian
// Header contains metadata
type Header struct {
Version uint8 // uint8
K int // uint8
Canonical bool // uint8
NumHashes uint8 // uint8
NumSigs uint64
Names []string
Indices []uint32
Sizes []uint64
NumRowBytes int // length of bytes for storing one row of signiture for n names
}
func (h Header) String() string {
return fmt.Sprintf("unikmer index file v%d: k: %d, canonical: %v, #hashes: %d, #signatures: %d, names: %s",
h.Version, h.K, h.Canonical, h.NumHashes, h.NumSigs, strings.Join(h.Names, ", "))
}
// Compatible checks compatibility
func (h Header) Compatible(b Header) bool {
if h.Version == b.Version &&
h.K == b.K &&
h.Canonical == b.Canonical &&
h.NumHashes == b.NumHashes {
return true
}
return false
}
// Reader is for reading KmerCode.
type Reader struct {
Header
r io.Reader
count uint64
}
// NewReader returns a Reader.
func NewReader(r io.Reader) (reader *Reader, err error) {
reader = &Reader{r: r}
err = reader.readHeader()
if err != nil {
return nil, err
}
reader.NumRowBytes = int((len(reader.Names) + 7) / 8)
return reader, nil
}
func (reader *Reader) readHeader() (err error) {
// check Magic number
var m [8]byte
r := reader.r
err = binary.Read(r, be, &m)
if err != nil {
return err
}
same := true
for i := 0; i < 8; i++ {
if Magic[i] != m[i] {
same = false
break
}
}
if !same {
return ErrInvalidIndexFileFormat
}
// 4 bytes meta info
var meta [4]uint8
err = binary.Read(r, be, &meta)
if err != nil {
return err
}
// check compatibility
if Version != meta[0] {
return ErrVersionMismatch
}
reader.Version = meta[0]
reader.K = int(meta[1])
if meta[2] > 0 {
reader.Canonical = true
}
reader.NumHashes = meta[3]
// 8 bytes signature size
err = binary.Read(r, be, &reader.NumSigs)
if err != nil {
return err
}
// 4 bytes length of Names
var n uint32
err = binary.Read(r, be, &n)
if err != nil {
return err
}
// Names
namesData := make([]byte, n)
err = binary.Read(r, be, &namesData)
if err != nil {
return err
}
names := strings.Split(string(namesData), "\n")
names = names[0 : len(names)-1]
reader.Names = names
// Indices
indicesData := make([]uint32, len(names))
err = binary.Read(r, be, &indicesData)
if err != nil {
return err
}
reader.Indices = indicesData
// Sizes
sizesData := make([]uint64, len(names))
err = binary.Read(r, be, &sizesData)
if err != nil {
return err
}
reader.Sizes = sizesData
return nil
}
// Read reads one code.
func (reader *Reader) Read() ([]byte, error) {
data := make([]byte, reader.NumRowBytes)
nReaded, err := io.ReadFull(reader.r, data)
if err != nil {
if err == io.EOF {
if reader.count != reader.NumSigs {
return nil, ErrTruncateIndexFile
}
}
return nil, err
}
if nReaded < reader.NumRowBytes {
return nil, ErrTruncateIndexFile
}
reader.count++
return data, nil
}
// Writer writes KmerCode.
type Writer struct {
Header
w io.Writer
wroteHeader bool
count uint64
}
// NewWriter creates a Writer.
func NewWriter(w io.Writer, k int, canonical bool, numHashes uint8, numSigs uint64, names []string, indices []uint32, sizes []uint64) (*Writer, error) {
if len(names) != len(sizes) {
return nil, ErrNameAndSizeMismatch
}
if len(names) != len(indices) {
return nil, ErrNameAndIndexMismatch
}
writer := &Writer{
Header: Header{
Version: Version,
K: k,
Canonical: canonical,
NumHashes: numHashes,
NumSigs: numSigs,
Names: names,
Indices: indices,
Sizes: sizes,
},
w: w,
}
writer.NumRowBytes = int((len(names) + 7) / 8)
return writer, nil
}
// WriteHeader writes file header
func (writer *Writer) WriteHeader() (err error) {
if writer.wroteHeader {
return nil
}
w := writer.w
// 8 bytes magic number
err = binary.Write(w, be, Magic)
if err != nil {
return err
}
// 4 bytes meta info
var canonical uint8
if writer.Canonical {
canonical = 1
}
err = binary.Write(w, be, [4]uint8{writer.Version, uint8(writer.K), canonical, writer.NumHashes})
if err != nil {
return err
}
// 8 bytes signature size
err = binary.Write(w, be, writer.NumSigs)
if err != nil {
return err
}
// 4 bytes length of Names
var n int
for _, name := range writer.Names {
n += len(name) + 1
}
err = binary.Write(w, be, uint32(n))
if err != nil {
return err
}
// Names
for _, name := range writer.Names {
err = binary.Write(w, be, []byte(name+"\n"))
if err != nil {
return err
}
}
// Indices
err = binary.Write(w, be, writer.Indices)
if err != nil {
return err
}
// Sizes
err = binary.Write(w, be, writer.Sizes)
if err != nil {
return err
}
writer.wroteHeader = true
return nil
}
// Write writes some thing
func (writer *Writer) Write(data []byte) (err error) {
if len(data) != writer.NumRowBytes {
return ErrWrongWriteDataSize
}
// lazily write header
if !writer.wroteHeader {
err = writer.WriteHeader()
if err != nil {
return err
}
writer.wroteHeader = true
}
_, err = writer.w.Write(data)
if err != nil {
return err
}
writer.count++
return nil
}
// WriteBatch writes a batch of data
func (writer *Writer) WriteBatch(data []byte, n int) (err error) {
// lazily write header
if !writer.wroteHeader {
err = writer.WriteHeader()
if err != nil {
return err
}
writer.wroteHeader = true
}
_, err = writer.w.Write(data)
if err != nil {
return err
}
writer.count += uint64(n)
return nil
}
// Flush check completeness
func (writer *Writer) Flush() (err error) {
if !writer.wroteHeader {
writer.WriteHeader()
}
if writer.count != writer.NumSigs {
return ErrUnfishedWrite
}
return nil
}
unikmer-0.18.8/index/serialization_test.go 0000664 0000000 0000000 00000007471 14121001445 0020635 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package index
import (
"bufio"
"bytes"
"fmt"
"io"
"os"
"testing"
)
func TestIndexReadAndWrite(t *testing.T) {
file := "test.unikidx"
defer func() {
err := os.Remove(file)
if err != nil {
t.Errorf("clean error %s", err)
}
}()
k := 31
canonical := true
numHashes := uint8(1)
numSigs := uint64(2)
names := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i"}
indices := []uint32{1, 2, 3, 4, 5, 6, 7, 8, 9}
sizes := []uint64{1, 2, 3, 4, 5, 6, 7, 8, 9}
data := [][]byte{[]byte("aa"), []byte("bb")}
err := write(file, k, canonical, numHashes, numSigs, names, indices, sizes, data)
if err != nil {
t.Errorf("write error %s", err)
}
reader, datas, err := read(file)
if err != nil {
t.Errorf("read error %s", err)
}
if reader.K != k {
t.Errorf("unmatch k")
}
if reader.Canonical != canonical {
t.Errorf("unmatch canonical")
}
if reader.NumHashes != numHashes {
t.Errorf("unmatch NumHashes")
}
if reader.NumSigs != numSigs {
t.Errorf("unmatch NumSigs")
}
if len(reader.Names) != len(names) {
t.Errorf("unmatch names length")
}
for i, n := range names {
if reader.Names[i] != n {
t.Errorf("unmatch name")
}
}
if len(reader.Indices) != len(indices) {
t.Errorf("unmatch indices length")
}
for i, n := range indices {
if reader.Indices[i] != n {
t.Errorf("unmatch index")
}
}
if len(reader.Sizes) != len(sizes) {
t.Errorf("unmatch sizes length")
}
for i, n := range sizes {
if reader.Sizes[i] != n {
t.Errorf("unmatch size")
}
}
if len(datas) != len(data) {
t.Errorf("unmatch data length")
}
for i, d := range data {
if bytes.Compare(d, datas[i]) != 0 {
t.Errorf("unmatch data")
}
}
}
func write(file string, k int, canonical bool, numHashes uint8, numSigs uint64, names []string, indices []uint32, sizes []uint64, datas [][]byte) error {
w, err := os.Create(file)
if err != nil {
return err
}
defer w.Close()
outfh := bufio.NewWriter(w)
defer outfh.Flush()
writer, err := NewWriter(outfh, k, canonical, numHashes, numSigs, names, indices, sizes)
if err != nil {
return err
}
for _, data := range datas {
err = writer.Write(data)
if err != nil {
return err
}
}
err = writer.Flush()
if err != nil {
return err
}
return nil
}
func read(file string) (*Reader, [][]byte, error) {
r, err := os.Open(file)
if err != nil {
return nil, nil, err
}
defer r.Close()
infh := bufio.NewReader(r)
reader, err := NewReader(infh)
if err != nil {
return reader, nil, err
}
fmt.Println(reader.Header)
datas := make([][]byte, 0, 10)
var data []byte
for {
data, err = reader.Read()
if err != nil {
if err == io.EOF {
break
}
return nil, nil, err
}
datas = append(datas, data)
}
return reader, datas, nil
}
unikmer-0.18.8/iterator-protein.go 0000664 0000000 0000000 00000005117 14121001445 0017114 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"sync"
"github.com/shenwei356/bio/seq"
"github.com/zeebo/wyhash"
)
var poolProteinIterator = &sync.Pool{New: func() interface{} {
return &ProteinIterator{}
}}
// ProteinIterator is a iterator for protein sequence.
type ProteinIterator struct {
s0 *seq.Seq // only used for KmerProteinIterator
s *seq.Seq // amino acid
k int
finished bool
end int
idx int
}
// NewProteinIterator returns an iterator for hash of amino acids
func NewProteinIterator(s *seq.Seq, k int, codonTable int, frame int) (*ProteinIterator, error) {
if k < 1 {
return nil, ErrInvalidK
}
if len(s.Seq) < k*3 {
return nil, ErrShortSeq
}
// iter := &ProteinIterator{s0: s, k: k}
iter := poolProteinIterator.Get().(*ProteinIterator)
iter.s0 = s
iter.k = k
iter.finished = false
iter.idx = 0
var err error
if s.Alphabet != seq.Protein {
iter.s, err = s.Translate(codonTable, frame, false, false, true, false)
if err != nil {
return nil, err
}
} else {
iter.s = s
}
iter.end = len(iter.s.Seq) - k
return iter, nil
}
// Next return's a hash
func (iter *ProteinIterator) Next() (code uint64, ok bool) {
if iter.finished {
return 0, false
}
if iter.idx > iter.end {
iter.finished = true
poolProteinIterator.Put(iter)
return 0, false
}
code = wyhash.Hash(iter.s.Seq[iter.idx:iter.idx+iter.k], 1)
iter.idx++
return code, true
}
// Index returns current 0-baesd index.
func (iter *ProteinIterator) Index() int {
return iter.idx - 1
}
unikmer-0.18.8/iterator-protein_test.go 0000664 0000000 0000000 00000003632 14121001445 0020153 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"testing"
"github.com/shenwei356/bio/seq"
)
func TestProteinIterator(t *testing.T) {
_s := "AAGTTTGAATCATTCAACTATCTAGTTTTCAGAGAACAATGTTCTCTAAAGAATAGAAAAGAGTCATTGTGCGGTGATGATGGCGGGAAGGATCCACCTG"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 10
iter, err := NewProteinIterator(sequence, k, 1, 1)
if err != nil {
t.Errorf("fail to create aa iter rator")
}
var code uint64
var ok bool
// var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok = iter.Next()
if !ok {
break
}
// idx = iter.Index()
// fmt.Printf("aa: %d-%s, %d\n", idx, iter.s.Seq[idx:idx+k], code)
codes = append(codes, code)
}
if len(codes) != len(_s)/3-k+1 {
t.Errorf("k-mer hashes number error")
}
}
unikmer-0.18.8/iterator.go 0000664 0000000 0000000 00000012676 14121001445 0015446 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"fmt"
"sync"
"github.com/pkg/errors"
"github.com/shenwei356/bio/seq"
"github.com/will-rowe/nthash"
)
// ErrInvalidK means k < 1.
var ErrInvalidK = fmt.Errorf("unikmer: invalid k-mer size")
// ErrEmptySeq sequence is empty.
var ErrEmptySeq = fmt.Errorf("unikmer: empty sequence")
// ErrShortSeq means the sequence is shorter than k
var ErrShortSeq = fmt.Errorf("unikmer: sequence too short")
var poolIterator = &sync.Pool{New: func() interface{} {
return &Iterator{}
}}
// Iterator is a kmer code (k<=32) or hash iterator.
type Iterator struct {
s *seq.Seq // only used for KmerIterator
k int
kUint uint // uint(k)
kP1 int // k -1
kP1Uint uint // uint(k-1)
canonical bool
circular bool
hash bool
finished bool
revcomStrand bool
idx int
// for KmerIterator
length int
end, e int
first bool
kmer []byte
codeBase uint64
preCode uint64
preCodeRC uint64
codeRC uint64
// for HashIterator
hasher *nthash.NTHi
}
// NewHashIterator returns ntHash Iterator.
func NewHashIterator(s *seq.Seq, k int, canonical bool, circular bool) (*Iterator, error) {
if k < 1 {
return nil, ErrInvalidK
}
if len(s.Seq) < k {
return nil, ErrShortSeq
}
// iter := &Iterator{s: s, k: k, canonical: canonical, circular: circular}
iter := poolIterator.Get().(*Iterator)
iter.s = s
iter.k = k
iter.canonical = canonical
iter.circular = circular
iter.finished = false
iter.revcomStrand = false
iter.idx = 0
iter.hash = true
iter.kUint = uint(k)
iter.kP1 = k - 1
iter.kP1Uint = uint(k - 1)
var err error
var seq2 []byte
if circular {
seq2 = make([]byte, len(s.Seq), len(s.Seq)+k-1)
copy(seq2, s.Seq) // do not edit original sequence
seq2 = append(seq2, s.Seq[0:k-1]...)
} else {
seq2 = s.Seq
}
iter.hasher, err = nthash.NewHasher(&seq2, uint(k))
if err != nil {
return nil, err
}
return iter, nil
}
// NextHash returns next ntHash.
func (iter *Iterator) NextHash() (code uint64, ok bool) {
code, ok = iter.hasher.Next(iter.canonical)
if !ok {
poolIterator.Put(iter)
}
iter.idx++
return code, ok
}
// NewKmerIterator returns k-mer code iterator.
func NewKmerIterator(s *seq.Seq, k int, canonical bool, circular bool) (*Iterator, error) {
if k < 1 {
return nil, ErrInvalidK
}
if len(s.Seq) < k {
return nil, ErrShortSeq
}
var s2 *seq.Seq
if circular {
s2 = s.Clone() // do not edit original sequence
s2.Seq = append(s2.Seq, s.Seq[0:k-1]...)
} else {
s2 = s
}
// iter := &Iterator{s: s2, k: k, canonical: canonical, circular: circular}
iter := poolIterator.Get().(*Iterator)
iter.s = s2
iter.k = k
iter.canonical = canonical
iter.circular = circular
iter.finished = false
iter.revcomStrand = false
iter.idx = 0
iter.length = len(s2.Seq)
iter.end = iter.length - k + 1
iter.kUint = uint(k)
iter.kP1 = k - 1
iter.kP1Uint = uint(k - 1)
iter.first = true
return iter, nil
}
// NextKmer returns next k-mer code.
func (iter *Iterator) NextKmer() (code uint64, ok bool, err error) {
if iter.finished {
return 0, false, nil
}
if iter.idx == iter.end {
if iter.canonical || iter.revcomStrand {
iter.finished = true
poolIterator.Put(iter)
return 0, false, nil
}
iter.s.RevComInplace()
iter.idx = 0
iter.revcomStrand = true
iter.first = true
}
iter.e = iter.idx + iter.k
iter.kmer = iter.s.Seq[iter.idx:iter.e]
if !iter.first {
iter.codeBase = base2bit[iter.kmer[iter.kP1]]
if iter.codeBase == 4 {
err = ErrIllegalBase
}
// compute code from previous one
code = iter.preCode&((1<<(iter.kP1Uint<<1))-1)<<2 | iter.codeBase
// compute code of revcomp kmer from previous one
iter.codeRC = (iter.codeBase^3)<<(iter.kP1Uint<<1) | (iter.preCodeRC >> 2)
} else {
code, err = Encode(iter.kmer)
iter.codeRC = MustRevComp(code, iter.k)
iter.first = false
}
if err != nil {
return 0, false, errors.Wrapf(err, "encode %s", iter.kmer)
}
iter.preCode = code
iter.preCodeRC = iter.codeRC
iter.idx++
if iter.canonical && code > iter.codeRC {
code = iter.codeRC
}
return code, true, nil
}
// Next is a wrapter for NextHash and NextKmer.
func (iter *Iterator) Next() (code uint64, ok bool, err error) {
if iter.hash {
code, ok = iter.NextHash()
return
}
code, ok, err = iter.NextKmer()
return
}
// Index returns current 0-baesd index.
func (iter *Iterator) Index() int {
return iter.idx - 1
}
unikmer-0.18.8/iterator_test.go 0000664 0000000 0000000 00000011735 14121001445 0016500 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"math/rand"
"testing"
"github.com/shenwei356/bio/seq"
"github.com/shenwei356/util/bytesize"
)
func TestKmerIterator(t *testing.T) {
_s := "AAGTTTGAATCATTCAACTATCTAGTTTTCAGAGAACAATGTTCTCTAAAGAATAGAAAAGAGTCATTGTGCGGTGATGATGGCGGGAAGGATCCACCTG"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 10
iter, err := NewKmerIterator(sequence, k, true, false)
if err != nil {
t.Errorf("fail to create aa iter rator")
}
var code uint64
var ok bool
// var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok, err = iter.Next()
if err != nil {
t.Error(err)
}
if !ok {
break
}
// idx = iter.Index()
// fmt.Printf("kmer: %d-%s, %d\n", idx, iter.s.Seq[idx:idx+k], code)
codes = append(codes, code)
}
if len(codes) != len(_s)-k+1 {
t.Errorf("k-mers number error")
}
}
func TestHashIterator(t *testing.T) {
_s := "AAGTTTGAATCATTCAACTATCTAGTTTTCAGAGAACAATGTTCTCTAAAGAATAGAAAAGAGTCATTGTGCGGTGATGATGGCGGGAAGGATCCACCTG"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 10
iter, err := NewHashIterator(sequence, k, true, false)
if err != nil {
t.Errorf("fail to create aa iter rator")
}
var code uint64
var ok bool
// var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok, err = iter.Next()
if err != nil {
t.Error(err)
}
if !ok {
break
}
// idx = iter.Index()
// fmt.Printf("kmer: %d-%s, %d\n", idx, iter.s.Seq[idx:idx+k], code)
codes = append(codes, code)
}
if len(codes) != len(_s)-k+1 {
t.Errorf("k-mer hashes number error")
}
}
var benchSeqs []*seq.Seq
var _code uint64
func init() {
rand.Seed(11)
sizes := []int{1 << 10} //, 1 << 20, 10 << 20}
benchSeqs = make([]*seq.Seq, len(sizes))
var err error
for i, size := range sizes {
sequence := make([]byte, size)
// fmt.Printf("generating pseudo DNA with length of %s ...\n", bytesize.ByteSize(size))
for j := 0; j < size; j++ {
sequence[j] = bit2base[rand.Intn(4)]
}
benchSeqs[i], err = seq.NewSeq(seq.DNA, sequence)
if err != nil {
panic("should not happen")
}
// fmt.Println(benchSeqs[i])
}
// fmt.Printf("%d DNA sequences generated\n", len(sizes))
}
func BenchmarkKmerIterator(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
for j := 0; j < b.N; j++ {
iter, err := NewKmerIterator(benchSeqs[i], 31, true, false)
if err != nil {
b.Errorf("fail to create hash iterator. seq length: %d", size)
}
for {
code, ok, err = iter.NextKmer()
if err != nil {
b.Errorf("fail to get kmer code: %d-%s", iter.Index(),
benchSeqs[i].Seq[iter.Index():iter.Index()+31])
}
if !ok {
break
}
_code = code
}
}
})
}
}
func BenchmarkHashIterator(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
for j := 0; j < b.N; j++ {
iter, err := NewHashIterator(benchSeqs[i], 31, true, false)
if err != nil {
b.Errorf("fail to create hash iterator. seq length: %d", size)
}
for {
code, ok = iter.NextHash()
if !ok {
break
}
_code = code
}
}
})
}
}
func BenchmarkProteinIterator(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
for j := 0; j < b.N; j++ {
iter, err := NewProteinIterator(benchSeqs[i], 10, 1, 1)
if err != nil {
b.Errorf("fail to create hash iterator. seq length: %d", size)
}
for {
code, ok = iter.Next()
if !ok {
break
}
_code = code
}
}
})
}
}
unikmer-0.18.8/kmer-sort.go 0000664 0000000 0000000 00000005264 14121001445 0015533 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
// KmerCodeSlice is a slice of KmerCode, for sorting
type KmerCodeSlice []KmerCode
// Len return length of the slice
func (codes KmerCodeSlice) Len() int {
return len(codes)
}
// Swap swaps two elements
func (codes KmerCodeSlice) Swap(i, j int) {
codes[i], codes[j] = codes[j], codes[i]
}
// Less simply compare two KmerCode
func (codes KmerCodeSlice) Less(i, j int) bool {
return codes[i].Code < codes[j].Code
}
// func splitKmer(code uint64, k int) (uint64, uint64, uint64, uint64) {
// // -====, k = 4: ---, -, =, ===
// return code >> 2, code & 3, code >> (uint(k-1) << 1) & 3, code & ((1 << (uint(k-1) << 1)) - 1)
// }
// CodeSlice is a slice of Kmer code (uint64), for sorting
type CodeSlice []uint64
// Len return length of the slice
func (codes CodeSlice) Len() int {
return len(codes)
}
// Swap swaps two elements
func (codes CodeSlice) Swap(i, j int) {
codes[i], codes[j] = codes[j], codes[i]
}
// Less simply compare two KmerCode
func (codes CodeSlice) Less(i, j int) bool {
return codes[i] < codes[j]
}
// CodeTaxid is the code-taxid pair
type CodeTaxid struct {
Code uint64
// _ uint32 // needed? to test
Taxid uint32
}
// CodeTaxidSlice is a list of CodeTaxid, just for sorting
type CodeTaxidSlice []CodeTaxid
// Len return length of the slice
func (pairs CodeTaxidSlice) Len() int {
return len(pairs)
}
// Swap swaps two elements
func (pairs CodeTaxidSlice) Swap(i, j int) {
pairs[i], pairs[j] = pairs[j], pairs[i]
}
// Less simply compare two KmerCode
func (pairs CodeTaxidSlice) Less(i, j int) bool {
return pairs[i].Code < pairs[j].Code
}
unikmer-0.18.8/kmer.go 0000664 0000000 0000000 00000031325 14121001445 0014543 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//b
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"bytes"
"errors"
)
// ErrIllegalBase means that base beyond IUPAC symbols are detected.
var ErrIllegalBase = errors.New("unikmer: illegal base")
// ErrKOverflow means K > 32.
var ErrKOverflow = errors.New("unikmer: k-mer size (1-32) overflow")
// ErrCodeOverflow means the encode interger is bigger than 4^k.
var ErrCodeOverflow = errors.New("unikmer: code value overflow")
// slice is much faster than switch and map.
var base2bit = [256]uint64{
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 0, 1, 1, 0, 4, 4, 2, 0, 4, 4, 2, 4, 0, 0, 4,
4, 4, 0, 1, 3, 3, 0, 0, 4, 1, 4, 4, 4, 4, 4, 4,
4, 0, 1, 1, 0, 4, 4, 2, 0, 4, 4, 2, 4, 0, 0, 4,
4, 4, 0, 1, 3, 3, 0, 0, 4, 1, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
}
// var base2bit []uint64
// MaxCode is the maxinum interger for all Ks.
var MaxCode []uint64
func init() {
MaxCode = make([]uint64, 33)
for i := 1; i <= 32; i++ {
MaxCode[i] = 1< 32 {
return 0, ErrKOverflow
}
var v uint64
for _, b := range kmer {
code <<= 2
v = base2bit[b]
// if v > 3 {
if v == 4 {
return code, ErrIllegalBase
}
code |= v
}
return code, nil
}
// ErrNotConsecutiveKmers means the two k-mers are not consecutive.
var ErrNotConsecutiveKmers = errors.New("unikmer: not consecutive k-mers")
// MustEncodeFromFormerKmer encodes from former the k-mer,
// assuming the k-mer and leftKmer are both OK.
func MustEncodeFromFormerKmer(kmer []byte, leftKmer []byte, leftCode uint64) (uint64, error) {
v := base2bit[kmer[len(kmer)-1]]
// if v > 3 {
if v == 4 {
return leftCode, ErrIllegalBase
}
// retrieve (k-1)*2 bits and << 2, and then add v
return leftCode&((1<<(uint(len(kmer)-1)<<1))-1)<<2 | v, nil
}
// EncodeFromFormerKmer encodes from the former k-mer, inspired by ntHash
func EncodeFromFormerKmer(kmer []byte, leftKmer []byte, leftCode uint64) (uint64, error) {
if len(kmer) == 0 {
return 0, ErrKOverflow
}
if len(kmer) != len(leftKmer) {
return 0, ErrKMismatch
}
if !bytes.Equal(kmer[0:len(kmer)-1], leftKmer[1:]) {
return 0, ErrNotConsecutiveKmers
}
return MustEncodeFromFormerKmer(kmer, leftKmer, leftCode)
}
// MustEncodeFromLatterKmer encodes from the latter k-mer,
// assuming the k-mer and rightKmer are both OK.
func MustEncodeFromLatterKmer(kmer []byte, rightKmer []byte, rightCode uint64) (uint64, error) {
v := base2bit[kmer[0]]
// if v > 3 {
if v == 4 {
return rightCode, ErrIllegalBase
}
return v<<(uint(len(kmer)-1)<<1) | rightCode>>2, nil
}
// EncodeFromLatterKmer encodes from the former k-mer.
func EncodeFromLatterKmer(kmer []byte, rightKmer []byte, rightCode uint64) (uint64, error) {
if len(kmer) == 0 {
return 0, ErrKOverflow
}
if len(kmer) != len(rightKmer) {
return 0, ErrKMismatch
}
if !bytes.Equal(rightKmer[0:len(kmer)-1], kmer[1:len(rightKmer)]) {
return 0, ErrNotConsecutiveKmers
}
return MustEncodeFromLatterKmer(kmer, rightKmer, rightCode)
}
// Reverse returns code of the reversed sequence.
func Reverse(code uint64, k int) (c uint64) {
if k <= 0 || k > 32 {
panic(ErrKOverflow)
}
// for i := 0; i < k; i++ {
// c = (c << 2) | (code & 3)
// code >>= 2
// }
// return
// https: //www.biostars.org/p/113640, with a little modification
c = code
c = ((c >> 2 & 0x3333333333333333) | (c&0x3333333333333333)<<2)
c = ((c >> 4 & 0x0F0F0F0F0F0F0F0F) | (c&0x0F0F0F0F0F0F0F0F)<<4)
c = ((c >> 8 & 0x00FF00FF00FF00FF) | (c&0x00FF00FF00FF00FF)<<8)
c = ((c >> 16 & 0x0000FFFF0000FFFF) | (c&0x0000FFFF0000FFFF)<<16)
c = ((c >> 32 & 0x00000000FFFFFFFF) | (c&0x00000000FFFFFFFF)<<32)
return (c >> (2 * (32 - k)))
}
// MustReverse is similar to Reverse, but does not check k.
func MustReverse(code uint64, k int) (c uint64) {
// for i := 0; i < k; i++ {
// c = (c << 2) | (code & 3)
// code >>= 2
// }
// return
// https: //www.biostars.org/p/113640, with a little modification
c = code
c = ((c >> 2 & 0x3333333333333333) | (c&0x3333333333333333)<<2)
c = ((c >> 4 & 0x0F0F0F0F0F0F0F0F) | (c&0x0F0F0F0F0F0F0F0F)<<4)
c = ((c >> 8 & 0x00FF00FF00FF00FF) | (c&0x00FF00FF00FF00FF)<<8)
c = ((c >> 16 & 0x0000FFFF0000FFFF) | (c&0x0000FFFF0000FFFF)<<16)
c = ((c >> 32 & 0x00000000FFFFFFFF) | (c&0x00000000FFFFFFFF)<<32)
return (c >> (2 * (32 - k)))
}
// Complement returns code of complement sequence.
func Complement(code uint64, k int) uint64 {
if k <= 0 || k > 32 {
panic(ErrKOverflow)
}
return code ^ (1< 32 {
panic(ErrKOverflow)
}
// for i := 0; i < k; i++ {
// c = (c << 2) | (code&3 ^ 3)
// code >>= 2
// }
// return
// https://www.biostars.org/p/113640/#9474334
c = ^code
c = ((c >> 2 & 0x3333333333333333) | (c&0x3333333333333333)<<2)
c = ((c >> 4 & 0x0F0F0F0F0F0F0F0F) | (c&0x0F0F0F0F0F0F0F0F)<<4)
c = ((c >> 8 & 0x00FF00FF00FF00FF) | (c&0x00FF00FF00FF00FF)<<8)
c = ((c >> 16 & 0x0000FFFF0000FFFF) | (c&0x0000FFFF0000FFFF)<<16)
c = ((c >> 32 & 0x00000000FFFFFFFF) | (c&0x00000000FFFFFFFF)<<32)
return (c >> (2 * (32 - k)))
}
// MustRevComp is similar to RevComp, but does not check k.
func MustRevComp(code uint64, k int) (c uint64) {
// for i := 0; i < k; i++ {
// c = (c << 2) | (code&3 ^ 3)
// code >>= 2
// }
// return
// https://www.biostars.org/p/113640/#9474334
c = ^code
c = ((c >> 2 & 0x3333333333333333) | (c&0x3333333333333333)<<2)
c = ((c >> 4 & 0x0F0F0F0F0F0F0F0F) | (c&0x0F0F0F0F0F0F0F0F)<<4)
c = ((c >> 8 & 0x00FF00FF00FF00FF) | (c&0x00FF00FF00FF00FF)<<8)
c = ((c >> 16 & 0x0000FFFF0000FFFF) | (c&0x0000FFFF0000FFFF)<<16)
c = ((c >> 32 & 0x00000000FFFFFFFF) | (c&0x00000000FFFFFFFF)<<32)
return (c >> (2 * (32 - k)))
}
// Canonical returns code of its canonical kmer.
func Canonical(code uint64, k int) uint64 {
if k <= 0 || k > 32 {
panic(ErrKOverflow)
}
var rc uint64
// c := code
// for i := 0; i < k; i++ {
// rc = (rc << 2) | (c&3 ^ 3)
// c >>= 2
// }
// https://www.biostars.org/p/113640/#9474334
c := ^code
c = ((c >> 2 & 0x3333333333333333) | (c&0x3333333333333333)<<2)
c = ((c >> 4 & 0x0F0F0F0F0F0F0F0F) | (c&0x0F0F0F0F0F0F0F0F)<<4)
c = ((c >> 8 & 0x00FF00FF00FF00FF) | (c&0x00FF00FF00FF00FF)<<8)
c = ((c >> 16 & 0x0000FFFF0000FFFF) | (c&0x0000FFFF0000FFFF)<<16)
c = ((c >> 32 & 0x00000000FFFFFFFF) | (c&0x00000000FFFFFFFF)<<32)
rc = (c >> (2 * (32 - k)))
if rc < code {
return rc
}
return code
}
// MustCanonical is similar to Canonical, but does not check k.
func MustCanonical(code uint64, k int) uint64 {
var rc uint64
c := code
for i := 0; i < k; i++ {
rc = (rc << 2) | (c&3 ^ 3)
c >>= 2
}
if rc < code {
return rc
}
return code
}
// bit2base is for mapping bit to base.
var bit2base = [4]byte{'A', 'C', 'G', 'T'}
// bit2str is for output bits string
var bit2str = [4]string{"00", "01", "10", "11"}
// Decode converts the code to original seq
func Decode(code uint64, k int) []byte {
if k <= 0 || k > 32 {
panic(ErrKOverflow)
}
if code > MaxCode[k] {
panic(ErrCodeOverflow)
}
kmer := make([]byte, k)
for i := 0; i < k; i++ {
kmer[k-1-i] = bit2base[code&3]
code >>= 2
}
return kmer
}
// MustDecode is similar to Decode, but does not check k and code.
func MustDecode(code uint64, k int) []byte {
kmer := make([]byte, k)
for i := 0; i < k; i++ {
kmer[k-1-i] = bit2base[code&3]
code >>= 2
}
return kmer
}
// KmerCode is a struct representing a k-mer in 64-bits.
type KmerCode struct {
Code uint64
K int
}
// NewKmerCode returns a new KmerCode struct from byte slice.
func NewKmerCode(kmer []byte) (KmerCode, error) {
code, err := Encode(kmer)
if err != nil {
return KmerCode{}, err
}
return KmerCode{code, len(kmer)}, err
}
// NewKmerCodeFromFormerOne computes KmerCode from the Former consecutive k-mer.
func NewKmerCodeFromFormerOne(kmer []byte, leftKmer []byte, preKcode KmerCode) (KmerCode, error) {
code, err := EncodeFromFormerKmer(kmer, leftKmer, preKcode.Code)
if err != nil {
return KmerCode{}, err
}
return KmerCode{code, len(kmer)}, err
}
// NewKmerCodeMustFromFormerOne computes KmerCode from the Former consecutive k-mer,
// assuming the k-mer and leftKmer are both OK.
func NewKmerCodeMustFromFormerOne(kmer []byte, leftKmer []byte, preKcode KmerCode) (KmerCode, error) {
code, err := MustEncodeFromFormerKmer(kmer, leftKmer, preKcode.Code)
if err != nil {
return KmerCode{}, err
}
return KmerCode{code, len(kmer)}, err
}
// Equal checks wether two KmerCodes are the same.
func (kcode KmerCode) Equal(kcode2 KmerCode) bool {
return kcode.K == kcode2.K && kcode.Code == kcode2.Code
}
// Rev returns KmerCode of the reverse sequence.
func (kcode KmerCode) Rev() KmerCode {
return KmerCode{MustReverse(kcode.Code, kcode.K), kcode.K}
}
// Comp returns KmerCode of the complement sequence.
func (kcode KmerCode) Comp() KmerCode {
return KmerCode{MustComplement(kcode.Code, kcode.K), kcode.K}
}
// RevComp returns KmerCode of the reverse complement sequence.
func (kcode KmerCode) RevComp() KmerCode {
return KmerCode{MustRevComp(kcode.Code, kcode.K), kcode.K}
}
// Canonical returns its canonical kmer
func (kcode KmerCode) Canonical() KmerCode {
rcKcode := kcode.RevComp()
if rcKcode.Code < kcode.Code {
return rcKcode
}
return kcode
}
// Bytes returns k-mer in []byte.
func (kcode KmerCode) Bytes() []byte {
return Decode(kcode.Code, kcode.K)
}
// String returns k-mer in string
func (kcode KmerCode) String() string {
return string(Decode(kcode.Code, kcode.K))
}
// BitsString returns code to string
func (kcode KmerCode) BitsString() string {
var buf bytes.Buffer
for _, b := range Decode(kcode.Code, kcode.K) {
buf.WriteString(bit2str[base2bit[b]])
}
return buf.String()
}
unikmer-0.18.8/kmer_test.go 0000664 0000000 0000000 00000015165 14121001445 0015606 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"bytes"
"fmt"
"math/rand"
"testing"
)
var randomMers [][]byte
var randomMersN = 100000
var benchMer = []byte("ACTGactgGTCAgtcaactgGTCAACTGGTCA")
var codeBenchMer uint64 = 2170370756141391540
var benchMer2 = []byte("CTGactgGTCAgtcaactgGTCAACTGGTCAC")
var codeBenchMer2 uint64 = 8681483024565566161
var benchCode uint64
var benchKmerCode KmerCode
func init() {
randomMers = make([][]byte, randomMersN)
for i := 0; i < randomMersN; i++ {
randomMers[i] = make([]byte, rand.Intn(32)+1)
for j := range randomMers[i] {
randomMers[i][j] = bit2base[rand.Intn(4)]
}
}
// for benchmark
var err error
benchCode, err = Encode(benchMer)
if err != nil {
panic(fmt.Sprintf("init: fail to encode %s", benchMer))
}
benchKmerCode, err = NewKmerCode(benchMer)
if err != nil {
panic(fmt.Sprintf("init: fail to create KmerCode from %s", benchMer))
}
}
// TestEncodeDecode tests encode and decode
func TestEncodeDecode(t *testing.T) {
var kcode KmerCode
var err error
for _, mer := range randomMers {
kcode, err = NewKmerCode(mer) // encode
if err != nil {
t.Errorf("Encode error: %s", mer)
}
if !bytes.Equal(mer, kcode.Bytes()) { // decode
t.Errorf("Decode error: %s != %s ", mer, kcode.Bytes())
}
}
}
// TestEncodeFromFormerKmer tests TestEncodeFromFormerKmer
func TestEncodeFromFormerKmer(t *testing.T) {
var err error
k := 5
first := true
var code, code0, pCode uint64
var kmer, pKmer []byte
for i := 0; i < len(benchMer)-k; i++ {
kmer = benchMer[i : i+k]
if first {
code, err = Encode(kmer)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
pCode = code
first = false
continue
}
pKmer = benchMer[i-1 : i+k-1]
code, err = EncodeFromFormerKmer(kmer, pKmer, pCode)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
code0, err = Encode(kmer)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
if code0 != code {
t.Errorf("EncodeFromFormerKmer error for %s: wrong %d != right %d", kmer, code, code0)
}
pCode = code
}
}
func TestEncodeFromLatterKmer(t *testing.T) {
var err error
k := 5
first := true
var code, code0, pCode uint64
var kmer, pKmer []byte
for i := len(benchMer) - k - 1; i >= 0; i-- {
kmer = benchMer[i : i+k]
if first {
code, err = Encode(kmer)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
pCode = code
first = false
continue
}
pKmer = benchMer[i+1 : i+k+1]
code, err = EncodeFromLatterKmer(kmer, pKmer, pCode)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
code0, err = Encode(kmer)
if err != nil {
t.Errorf("Encode error: %s", kmer)
}
if code0 != code {
t.Errorf("EncodeFromLatterKmer error for %s: wrong %d != right %d", kmer, code, code0)
}
pCode = code
}
}
// TestRevComp tests revcomp
func TestRevComp(t *testing.T) {
var kcode KmerCode
for _, mer := range randomMers {
kcode, _ = NewKmerCode(mer)
// fmt.Printf("%s, rev:%s\n", kcode, kcode.Rev())
}
for _, mer := range randomMers {
kcode, _ = NewKmerCode(mer)
if !kcode.Rev().Rev().Equal(kcode) {
t.Errorf("Rev() error: %s, Rev(): %s", kcode, kcode.Rev())
}
if !kcode.Comp().Comp().Equal(kcode) {
t.Errorf("Comp() error: %s, Comp(): %s", kcode, kcode.Comp())
}
if !kcode.Comp().Rev().Equal(kcode.RevComp()) {
t.Errorf("Rev().Comp() error: %s, Rev(): %s, Comp(): %s, RevComp: %s", kcode, kcode.Rev(), kcode.Comp(), kcode.RevComp())
}
}
}
var result uint64
// BenchmarkEncode tests speed of Encode()
func BenchmarkEncodeK32(b *testing.B) {
var code uint64
var err error
for i := 0; i < b.N; i++ {
code, err = Encode(benchMer)
if err != nil {
b.Errorf("Encode error: %s", benchMer)
}
if code != codeBenchMer {
b.Errorf("wrong result: %s", benchMer)
}
}
result = code
}
// BenchmarkEncode tests speed of EncodeFromFormerKmer
func BenchmarkEncodeFromFormerKmerK32(b *testing.B) {
var code uint64
var err error
for i := 0; i < b.N; i++ {
code, err = EncodeFromFormerKmer(benchMer2, benchMer, benchCode)
if err != nil {
b.Errorf("Encode error: %s", benchMer)
}
if code != codeBenchMer2 {
b.Errorf("wrong result: %s", benchMer)
}
}
result = code
}
// BenchmarkEncode tests speed of MustEncodeFromFormerKmer
func BenchmarkMustEncodeFromFormerKmerK32(b *testing.B) {
var code uint64
var err error
for i := 0; i < b.N; i++ {
code, err = MustEncodeFromFormerKmer(benchMer2, benchMer, benchCode)
if err != nil {
b.Errorf("Encode error: %s", benchMer)
}
if code != codeBenchMer2 {
b.Errorf("wrong result: %s", benchMer)
}
}
result = code
}
var result2 []byte
// BenchmarkDecode tests speed of decode
func BenchmarkDecodeK32(b *testing.B) {
var r []byte
for i := 0; i < b.N; i++ {
r = Decode(benchCode, len(benchMer))
}
result2 = r
}
func BenchmarkMustDecodeK32(b *testing.B) {
var r []byte
for i := 0; i < b.N; i++ {
r = MustDecode(benchCode, len(benchMer))
}
result2 = r
}
var result3 KmerCode
// BenchmarkRevK32 tests speed of rev
func BenchmarkRevK32(b *testing.B) {
var r KmerCode
for i := 0; i < b.N; i++ {
r = benchKmerCode.Rev()
}
result3 = r
}
// BenchmarkRevK32 tests speed of comp
func BenchmarkCompK32(b *testing.B) {
var r KmerCode
for i := 0; i < b.N; i++ {
r = benchKmerCode.Comp()
}
result3 = r
}
// BenchmarkRevCompK32 tests speed of revcomp
func BenchmarkRevCompK32(b *testing.B) {
var r KmerCode
for i := 0; i < b.N; i++ {
r = benchKmerCode.RevComp()
}
result3 = r
}
func BenchmarkCannonalK32(b *testing.B) {
var r KmerCode
for i := 0; i < b.N; i++ {
r = benchKmerCode.Canonical()
}
result3 = r
}
unikmer-0.18.8/serialization.go 0000664 0000000 0000000 00000046734 14121001445 0016474 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"encoding/binary"
"errors"
"fmt"
"io"
)
// MainVersion is the main version number.
const MainVersion uint8 = 5
// MinorVersion is the minor version number.
const MinorVersion uint8 = 0
// Magic number of binary file.
var Magic = [8]byte{'.', 'u', 'n', 'i', 'k', 'm', 'e', 'r'}
// ErrInvalidFileFormat means invalid file format.
var ErrInvalidFileFormat = errors.New("unikmer: invalid binary format")
// ErrBrokenFile means the file is not complete.
var ErrBrokenFile = errors.New("unikmer: broken file")
// ErrKMismatch means K size mismatch.
var ErrKMismatch = errors.New("unikmer: K mismatch")
// ErrDescTooLong means length of description two long
var ErrDescTooLong = errors.New("unikmer: description too long, 128 bytes at most")
// ErrCallOrder means WriteTaxid/ReadTaxid should be called after WriteCode/ReadCode
var ErrCallOrder = errors.New("unikmer: WriteTaxid/ReadTaxid should be called after WriteCode/ReadCode")
// ErrCallLate means SetMaxTaxid/SetGlobalTaxid should be called before writing KmerCode/code/taxid
var ErrCallLate = errors.New("unikmer: SetMaxTaxid/SetGlobalTaxid should be called before writing KmerCode/code/taxid")
// ErrCallReadWriteTaxid means flag UnikIncludeTaxID is off, but you call ReadTaxid/WriteTaxid
var ErrCallReadWriteTaxid = errors.New("unikmer: can not call ReadTaxid/WriteTaxid when flag UnikIncludeTaxID is off")
// ErrInvalidTaxid means zero given for a taxid.
var ErrInvalidTaxid = errors.New("unikmer: invalid taxid, 0 not allowed")
// ErrVersionMismatch means version mismatch between files and program
var ErrVersionMismatch = errors.New("unikmer: version mismatch")
var be = binary.BigEndian
var descMaxLen = 1024
var conservedDataLen = 64
// Header contains metadata
type Header struct {
MainVersion uint8
MinorVersion uint8
K int
Flag uint32
Number uint64 // Number of Kmers, may not be accurate
globalTaxid uint32 // universal taxid, 0 for no record
maxTaxid uint32
Description []byte // let's limit it to 128 Bytes
Scale uint32 // scale of down-sampling
MaxHash uint64 // max hash for scaling/down-sampling
}
const (
// UnikCompact means k-mers are serialized in fix-length (n = int((K + 3) / 4) ) of byte array.
UnikCompact = 1 << iota
// UnikCanonical means only canonical k-mers kept.
UnikCanonical
// UnikSorted means k-mers are sorted
UnikSorted // when sorted, the serialization structure is very different
// UnikIncludeTaxID means a k-mer is followed its LCA taxid
UnikIncludeTaxID
// UnikHashed means ntHash value are saved as code.
UnikHashed
// UnikScaled means only hashes smaller than or equal to max_hash are saved.
UnikScaled
)
func (h Header) String() string {
return fmt.Sprintf("unikmer binary k-mer data file v%d.%d with K=%d and Flag=%d",
h.MainVersion, h.MinorVersion, h.K, h.Flag)
}
// Reader is for reading KmerCode.
type Reader struct {
Header
r io.Reader
buf []byte
compact bool // saving KmerCode in variable-length byte array.
bufsize int
sorted bool
hasPrev bool
prev uint64
buf2 []byte
offset uint64
includeTaxid bool
bufTaxid []byte
taxidByteLen int
prevTaxid uint32 // buffered taxid
hasPrevTaxid bool
justReadACode bool
lastRecord bool
hashValue bool
}
// NewReader returns a Reader.
func NewReader(r io.Reader) (reader *Reader, err error) {
reader = &Reader{r: r}
err = reader.readHeader()
if err != nil {
return nil, err
}
return reader, nil
}
// IsSorted tells if the k-mers in file sorted
func (reader *Reader) IsSorted() bool {
return reader.Flag&UnikSorted > 0
}
// IsCanonical tells if the only canonical k-mers stored
func (reader *Reader) IsCanonical() bool {
return reader.Flag&UnikCanonical > 0
}
// IsCompact tells if the k-mers are stored in a compact format
func (reader *Reader) IsCompact() bool {
return reader.Flag&UnikCompact > 0
}
// IsIncludeTaxid tells if every k-mer is followed by its taxid
func (reader *Reader) IsIncludeTaxid() bool {
return reader.Flag&UnikIncludeTaxID > 0
}
// IsHashed tells if ntHash values are saved.
func (reader *Reader) IsHashed() bool {
return reader.Flag&UnikHashed > 0
}
// IsScaled tells if hashes is scaled
func (reader *Reader) IsScaled() bool {
return reader.Flag&UnikHashed > 0 && reader.Flag&UnikScaled > 0
}
// HasGlobalTaxid means the file has a global taxid
func (reader *Reader) HasGlobalTaxid() bool {
return reader.globalTaxid > 0
}
// HasTaxidInfo means the binary file contains global taxid or taxids for all k-mers
func (reader *Reader) HasTaxidInfo() bool {
return reader.IsIncludeTaxid() || reader.HasGlobalTaxid()
}
// GetGlobalTaxid returns the global taxid
func (reader *Reader) GetGlobalTaxid() uint32 {
return reader.globalTaxid
}
// GetTaxidBytesLength returns number of byte to store a taxid
func (reader *Reader) GetTaxidBytesLength() int {
return reader.taxidByteLen
}
// GetScale returns the scale of down-sampling
func (reader *Reader) GetScale() uint32 {
if reader.Scale == 0 {
return uint32(1)
}
return reader.Scale
}
// GetMaxHash returns the max hash for scaling.
func (reader *Reader) GetMaxHash() uint64 {
if reader.MaxHash == 0 {
return ^uint64(0)
}
return reader.MaxHash
}
func (reader *Reader) readHeader() (err error) {
buf := make([]byte, 56)
r := reader.r
// check Magic number
_, err = io.ReadFull(r, buf[:8])
if err != nil {
return err
}
same := true
for i := 0; i < 8; i++ {
if Magic[i] != buf[i] {
same = false
break
}
}
if !same {
return ErrInvalidFileFormat
}
// read metadata
_, err = io.ReadFull(r, buf[:4])
if err != nil {
return err
}
// check compatibility?
if (buf[0] == 0 && buf[1] == 0) ||
MainVersion != buf[0] {
return ErrVersionMismatch
}
reader.MainVersion = buf[0]
reader.MinorVersion = buf[1]
reader.K = int(buf[2])
_, err = io.ReadFull(r, buf[:4])
if err != nil {
return err
}
reader.Flag = be.Uint32(buf[:4])
reader.buf = make([]byte, 8)
if reader.IsCompact() {
reader.compact = true
reader.bufsize = int((reader.K + 3) / 4)
}
if reader.IsSorted() {
reader.sorted = true
reader.buf2 = make([]byte, 17)
}
if reader.IsIncludeTaxid() {
reader.includeTaxid = true
reader.bufTaxid = make([]byte, 4)
}
// number
_, err = io.ReadFull(r, buf[:8])
if err != nil {
return err
}
reader.Number = be.Uint64(buf[:8])
// taxid
_, err = io.ReadFull(r, buf[:4])
if err != nil {
return err
}
reader.globalTaxid = be.Uint32(buf[:4])
// taxid byte length
_, err = io.ReadFull(r, buf[1:2])
if err != nil {
return err
}
buf[0] = 0
reader.taxidByteLen = int(be.Uint16(buf[:2]))
// length of description
var lenDesc uint16
_, err = io.ReadFull(r, buf[:2])
if err != nil {
return err
}
lenDesc = be.Uint16(buf[:2])
desc := make([]byte, lenDesc)
_, err = io.ReadFull(r, desc)
if err != nil {
return err
}
reader.Description = desc
// scale
_, err = io.ReadFull(r, buf[:4])
if err != nil {
return err
}
reader.Scale = be.Uint32(buf[:4])
// max hash
_, err = io.ReadFull(r, buf[:8])
if err != nil {
return err
}
reader.MaxHash = be.Uint64(buf[:8])
reserved := make([]byte, conservedDataLen)
_, err = io.ReadFull(r, reserved)
if err != nil {
return err
}
return nil
}
// Read reads one KmerCode.
func (reader *Reader) Read() (KmerCode, error) {
code, err := reader.ReadCode()
return KmerCode{Code: code, K: reader.K}, err
}
// ReadWithTaxid reads a KmerCode, also return taxid if having.
func (reader *Reader) ReadWithTaxid() (KmerCode, uint32, error) {
code, taxid, err := reader.ReadCodeWithTaxid()
return KmerCode{Code: code, K: reader.K}, taxid, err
}
// ReadCodeWithTaxid reads a code, also return taxid if having.
func (reader *Reader) ReadCodeWithTaxid() (code uint64, taxid uint32, err error) {
code, err = reader.ReadCode()
if err != nil {
return 0, 0, err
}
if reader.includeTaxid {
taxid, err = reader.ReadTaxid()
if err != nil {
return 0, 0, err
}
} else {
taxid = reader.globalTaxid
}
return code, taxid, err
}
// ReadTaxid reads on taxid
func (reader *Reader) ReadTaxid() (taxid uint32, err error) {
if !reader.includeTaxid {
return 0, ErrCallReadWriteTaxid
}
if !reader.justReadACode {
return 0, ErrCallOrder
}
if reader.sorted {
if reader.lastRecord {
_, err = io.ReadFull(reader.r, reader.bufTaxid)
if err != nil {
return 0, err
}
reader.hasPrevTaxid = false
reader.justReadACode = false
return be.Uint32(reader.bufTaxid), nil
}
if reader.hasPrevTaxid {
c := reader.prevTaxid
reader.hasPrevTaxid = false
reader.justReadACode = false
return c, nil
}
_, err = io.ReadFull(reader.r, reader.bufTaxid[4-reader.taxidByteLen:])
if err != nil {
return 0, err
}
taxid = be.Uint32(reader.bufTaxid)
_, err = io.ReadFull(reader.r, reader.bufTaxid[4-reader.taxidByteLen:])
if err != nil {
return 0, err
}
reader.prevTaxid = be.Uint32(reader.bufTaxid)
reader.hasPrevTaxid = true
return taxid, nil
} else if reader.compact {
_, err = io.ReadFull(reader.r, reader.bufTaxid[4-reader.taxidByteLen:])
} else {
_, err = io.ReadFull(reader.r, reader.bufTaxid)
}
if err != nil {
return 0, err
}
reader.justReadACode = false
return be.Uint32(reader.bufTaxid), nil
}
// ReadCode reads one code.
func (reader *Reader) ReadCode() (uint64, error) {
var err error
if reader.sorted {
if reader.hasPrev {
c := reader.prev
// reader.prev = 0
reader.hasPrev = false
reader.justReadACode = true
return c, nil
}
buf2 := reader.buf2
r := reader.r
// read control byte
var nReaded int
nReaded, err = io.ReadFull(r, buf2[0:1])
if err != nil {
return 0, err
}
ctrlByte := buf2[0]
if ctrlByte&128 > 0 { // last one
nReaded, err = io.ReadFull(r, buf2[0:8])
if err != nil {
return 0, err
}
reader.lastRecord = true
reader.justReadACode = true
return be.Uint64(buf2[0:8]), nil
}
// parse control byte
encodedBytes := ctrlByte2ByteLengths[ctrlByte]
nEncodedBytes := int(encodedBytes[0] + encodedBytes[1])
// read encoded bytes
nReaded, err = io.ReadFull(r, buf2[0:nEncodedBytes])
if err != nil {
return 0, err
}
if nReaded < nEncodedBytes {
return 0, ErrBrokenFile
}
decodedVals, nDecoded := Uint64s(ctrlByte, buf2[0:nEncodedBytes])
if nDecoded == 0 {
return 0, ErrBrokenFile
}
code := decodedVals[0] + reader.offset
reader.prev = code + decodedVals[1]
reader.hasPrev = true
reader.offset = code + decodedVals[1]
reader.justReadACode = true
return code, nil
} else if reader.compact {
_, err = io.ReadFull(reader.r, reader.buf[8-reader.bufsize:])
} else {
_, err = io.ReadFull(reader.r, reader.buf)
}
if err != nil {
return 0, err
}
reader.justReadACode = true
return be.Uint64(reader.buf), nil
}
// Writer writes KmerCode.
type Writer struct {
Header
w io.Writer
wroteHeader bool
buf []byte
// saving KmerCode in compact fixlength byte array.
compact bool
bufsize int
// sortred mode
sorted bool
offset uint64
prev uint64 // buffered code
hasPrev bool
buf2 []byte
buf3 []byte
ctrlByte byte
nEncodedByte int
// for taxid
includeTaxid bool
bufTaxid []byte
justWrittenACode bool
taxidByteLen int
prevTaxid uint32 // buffered taxid
hasPrevTaxid bool
}
// NewWriter creates a Writer.
func NewWriter(w io.Writer, k int, flag uint32) (*Writer, error) {
if k == 0 {
return nil, ErrKOverflow
}
writer := &Writer{
Header: Header{MainVersion: MainVersion, MinorVersion: MinorVersion, K: k, Flag: flag, Number: 0},
w: w,
}
// prevent wrong use of compact
if writer.Flag&UnikCompact > 0 && (writer.Flag&UnikSorted > 0 || writer.Flag&UnikHashed > 0) {
writer.Flag ^= UnikCompact
}
writer.buf = make([]byte, 8)
if writer.Flag&UnikCompact > 0 &&
writer.Flag&UnikSorted == 0 &&
writer.Flag&UnikHashed == 0 {
writer.compact = true
writer.bufsize = int(k+3) / 4
} else if writer.Flag&UnikSorted > 0 {
writer.sorted = true
writer.buf2 = make([]byte, 16)
writer.buf3 = make([]byte, 32)
}
if writer.Flag&UnikIncludeTaxID > 0 {
writer.includeTaxid = true
writer.bufTaxid = make([]byte, 4)
}
return writer, nil
}
// WriteHeader writes file header
func (writer *Writer) WriteHeader() (err error) {
if writer.wroteHeader {
return nil
}
w := writer.w
// 8 bytes magic number
err = binary.Write(w, be, Magic)
if err != nil {
return err
}
// 4 bytes meta info
err = binary.Write(w, be, [4]uint8{writer.MainVersion, MinorVersion, uint8(writer.K), 0})
if err != nil {
return err
}
// 4 bytes flags
err = binary.Write(w, be, writer.Flag)
if err != nil {
return err
}
// 8 bytes number
err = binary.Write(w, be, writer.Number)
if err != nil {
return err
}
// 4 bytes taxid
err = binary.Write(w, be, writer.globalTaxid)
if err != nil {
return err
}
// 1 byte taxid bytes len
if writer.maxTaxid <= 0 {
writer.taxidByteLen = 4
} else {
writer.taxidByteLen = int(byteLength(uint64(writer.maxTaxid)))
}
err = binary.Write(w, be, uint8(writer.taxidByteLen))
if err != nil {
return err
}
// description length (2 byte)s and data (128 bytes)
lenDesc := len(writer.Description)
if lenDesc > descMaxLen {
return ErrDescTooLong
}
err = binary.Write(w, be, uint16(lenDesc))
if err != nil {
return err
}
err = binary.Write(w, be, writer.Description)
if err != nil {
return err
}
// scale
err = binary.Write(w, be, writer.Scale)
if err != nil {
return err
}
// max hash
err = binary.Write(w, be, writer.MaxHash)
if err != nil {
return err
}
// reserved 24 bytes
reserved := make([]byte, conservedDataLen)
err = binary.Write(w, be, reserved)
if err != nil {
return err
}
// header has 192 bytes
writer.wroteHeader = true
return nil
}
// SetGlobalTaxid sets the global taxid
func (writer *Writer) SetGlobalTaxid(taxid uint32) error {
if writer.wroteHeader {
return ErrCallLate
}
writer.globalTaxid = taxid
return nil
}
// SetMaxTaxid set the maxtaxid
func (writer *Writer) SetMaxTaxid(taxid uint32) error {
if writer.wroteHeader {
return ErrCallLate
}
writer.maxTaxid = taxid
return nil
}
// SetScale set the scale
func (writer *Writer) SetScale(scale uint32) error {
if writer.wroteHeader {
return ErrCallLate
}
if writer.Flag&UnikHashed == 0 {
writer.Flag |= UnikHashed
}
if writer.Flag&UnikScaled == 0 {
writer.Flag |= UnikScaled
}
writer.Scale = scale
return nil
}
// SetMaxHash set the max hash
func (writer *Writer) SetMaxHash(maxHash uint64) error {
if writer.wroteHeader {
return ErrCallLate
}
if writer.Flag&UnikHashed == 0 {
writer.Flag += UnikHashed
}
if writer.Flag&UnikScaled == 0 {
writer.Flag += UnikScaled
}
writer.MaxHash = maxHash
return nil
}
// WriteKmer writes one k-mer.
func (writer *Writer) WriteKmer(mer []byte) error {
kcode, err := NewKmerCode(mer)
if err != nil {
return err
}
return writer.Write(kcode)
}
// WriteKmerWithTaxid writes one k-mer and its taxid
func (writer *Writer) WriteKmerWithTaxid(mer []byte, taxid uint32) error {
err := writer.WriteKmer(mer)
if err != nil {
return nil
}
return writer.WriteTaxid(taxid)
}
// Write writes one KmerCode.
func (writer *Writer) Write(kcode KmerCode) (err error) {
if writer.K != kcode.K {
return ErrKMismatch
}
return writer.WriteCode(kcode.Code)
}
// WriteWithTaxid writes one KmerCode and its taxid.
// If UnikIncludeTaxID is off, taxid will not be written.
func (writer *Writer) WriteWithTaxid(kcode KmerCode, taxid uint32) (err error) {
err = writer.Write(kcode)
if err != nil {
return nil
}
return writer.WriteTaxid(taxid)
}
// WriteCodeWithTaxid writes a code and its taxid.
// If UnikIncludeTaxID is off, taxid will not be written.
func (writer *Writer) WriteCodeWithTaxid(code uint64, taxid uint32) (err error) {
err = writer.WriteCode(code)
if err != nil {
return nil
}
if !writer.includeTaxid { // if no taxid, just return.
return nil
}
return writer.WriteTaxid(taxid)
}
// WriteTaxid appends taxid to the code
func (writer *Writer) WriteTaxid(taxid uint32) (err error) {
if !writer.includeTaxid {
return ErrCallReadWriteTaxid
}
if !writer.justWrittenACode {
return ErrCallOrder
}
if writer.sorted {
if !writer.hasPrevTaxid { // write it later
writer.prevTaxid = taxid
writer.hasPrevTaxid = true
writer.justWrittenACode = false
return nil
}
be.PutUint32(writer.bufTaxid, writer.prevTaxid)
_, err = writer.w.Write(writer.bufTaxid[4-writer.taxidByteLen:])
// fmt.Printf("write taxid: %d, %d\n", writer.prevTaxid, writer.bufTaxid[4-writer.taxidByteLen:])
be.PutUint32(writer.bufTaxid, taxid)
_, err = writer.w.Write(writer.bufTaxid[4-writer.taxidByteLen:])
writer.hasPrevTaxid = false
} else if writer.compact {
be.PutUint32(writer.bufTaxid, taxid)
_, err = writer.w.Write(writer.bufTaxid[4-writer.taxidByteLen:])
} else {
be.PutUint32(writer.bufTaxid, taxid)
_, err = writer.w.Write(writer.bufTaxid)
}
writer.justWrittenACode = false
return nil
}
// WriteCode writes one code
func (writer *Writer) WriteCode(code uint64) (err error) {
// lazily write header
if !writer.wroteHeader {
err = writer.WriteHeader()
if err != nil {
return err
}
writer.wroteHeader = true
}
if writer.sorted {
if !writer.hasPrev { // write it later
writer.prev = code
writer.hasPrev = true
writer.justWrittenACode = true
return nil
}
writer.ctrlByte, writer.nEncodedByte = PutUint64s(writer.buf2, writer.prev-writer.offset, code-writer.prev)
writer.buf3[0] = writer.ctrlByte
copy(writer.buf3[1:writer.nEncodedByte+1], writer.buf2[0:writer.nEncodedByte])
_, err = writer.w.Write(writer.buf3[0 : writer.nEncodedByte+1])
writer.offset = code
// writer.prev = 0
writer.hasPrev = false
} else if writer.compact {
be.PutUint64(writer.buf, code)
_, err = writer.w.Write(writer.buf[8-writer.bufsize:])
} else {
be.PutUint64(writer.buf, code)
_, err = writer.w.Write(writer.buf)
}
if err != nil {
return err
}
writer.justWrittenACode = true
return nil
}
// Flush write the last k-mer
func (writer *Writer) Flush() (err error) {
if !writer.wroteHeader {
writer.Number = 0
writer.WriteHeader()
}
if !writer.sorted || !writer.hasPrev {
return nil
}
// write last k-mer
err = binary.Write(writer.w, be, uint8(128))
if err != nil {
return err
}
err = binary.Write(writer.w, be, writer.prev) // last code
if err != nil {
return err
}
if writer.includeTaxid && writer.hasPrevTaxid { // last taxid
err = binary.Write(writer.w, be, writer.prevTaxid)
if err != nil {
return err
}
}
writer.hasPrev = false
writer.hasPrevTaxid = false
return nil
}
unikmer-0.18.8/serialization_test.go 0000664 0000000 0000000 00000006526 14121001445 0017526 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"bufio"
"bytes"
"fmt"
"io"
"math/rand"
"os"
"sort"
"testing"
"github.com/shenwei356/util/byteutil"
)
func genKmers(k int, num int, sorted bool) [][]byte {
mers := make([][]byte, num)
var j int
for i := 0; i < num; i++ {
mers[i] = make([]byte, k)
for j = 0; j < k; j++ {
mers[i][j] = bit2base[rand.Intn(4)]
}
}
sort.Sort(byteutil.SliceOfByteSlice(mers))
return mers
}
// TestWriterReader tests Writer and Writer
func TestWriter(t *testing.T) {
var file string
var mers, mers2 [][]byte
var err error
ns := []int{10001, 10001, 10001, 10000}
for k := 1; k <= 31; k++ {
for i, flag := range []uint32{0, UnikCompact, UnikSorted} { //, UnikSorted
func(flag uint32) {
mers = genKmers(k, ns[i], flag&UnikSorted > 0)
file = fmt.Sprintf("t.k%d.unik", k)
err = write(mers, file, flag)
if err != nil {
t.Error(err)
}
defer func() {
err = os.Remove(file)
if err != nil {
t.Error(err)
}
}()
mers2, err = read(file)
if err != nil {
t.Error(err)
}
if len(mers2) != len(mers) {
t.Errorf("write and read: number err")
}
for i := 0; i < len(mers); i++ {
if !bytes.Equal(mers[i], mers2[i]) {
t.Errorf("write and read: data mismatch. %d: %d vs %d", i, mers[i], mers2[i])
}
}
}(flag)
}
}
}
func write(mers [][]byte, file string, flag uint32) error {
w, err := os.Create(file)
if err != nil {
return err
}
defer w.Close()
outfh := bufio.NewWriter(w)
defer outfh.Flush()
writer, err := NewWriter(outfh, len(mers[0]), flag)
if err != nil {
return err
}
for _, mer := range mers {
err = writer.WriteKmer(mer)
if err != nil {
return err
}
}
err = writer.Flush()
if err != nil {
return err
}
return nil
}
func read(file string) ([][]byte, error) {
r, err := os.Open(file)
if err != nil {
return nil, err
}
defer r.Close()
infh := bufio.NewReader(r)
reader, err := NewReader(infh)
if err != nil {
return nil, err
}
// fmt.Println(reader.Header)
mers := make([][]byte, 0, 1000)
var kcode KmerCode
for {
kcode, err = reader.Read()
if err != nil {
if err == io.EOF {
break
}
return nil, err
}
mers = append(mers, kcode.Bytes())
}
return mers, nil
}
unikmer-0.18.8/sketch-protein.go 0000664 0000000 0000000 00000011170 14121001445 0016540 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"sort"
"sync"
"github.com/shenwei356/bio/seq"
"github.com/zeebo/wyhash"
)
// ProteinMinimizerSketch is a protein k-mer minimizer iterator
type ProteinMinimizerSketch struct {
s *seq.Seq // amino acid
k int
end0 int
idx int
// ----------------------
skip bool
w int
end int
r int // L-s
i, mI int
mV uint64
preMinIdx int
buf []IdxValue
i2v IdxValue
flag bool
t, b, e int
}
var poolProteinMinimizerSketch = &sync.Pool{New: func() interface{} {
return &ProteinMinimizerSketch{}
}}
// NewProteinMinimizerSketch returns a ProteinMinimizerSketch
func NewProteinMinimizerSketch(S *seq.Seq, k int, codonTable int, frame int, w int) (*ProteinMinimizerSketch, error) {
if k < 1 {
return nil, ErrInvalidK
}
if len(S.Seq) < k*3 {
return nil, ErrShortSeq
}
if w < 1 || w > (1<<31)-1 {
return nil, ErrInvalidW
}
if len(S.Seq) < k*3+w-1 {
return nil, ErrShortSeq
}
// s := &ProteinMinimizerSketch{s0: S, k: k, w: w}
s := poolProteinMinimizerSketch.Get().(*ProteinMinimizerSketch)
s.k = k
s.w = w
var err error
if S.Alphabet != seq.Protein {
s.s, err = S.Translate(codonTable, frame, false, false, true, false)
if err != nil {
return nil, err
}
} else {
s.s = S
}
s.idx = 0
s.end0 = len(s.s.Seq) - k
s.skip = w == 1
s.end = len(s.s.Seq) - 1
s.r = w - 1 // L-k
s.buf = make([]IdxValue, 0, w)
s.preMinIdx = -1
return s, nil
}
// Next returns next hash value
func (s *ProteinMinimizerSketch) Next() (code uint64, ok bool) {
for {
// if s.idx > s.end {
// return 0, false
// }
if s.idx > s.end0 {
poolProteinIterator.Put(s)
return 0, false
}
code = wyhash.Hash(s.s.Seq[s.idx:s.idx+s.k], 1)
if s.skip {
s.mI = s.idx
s.idx++
return code, true
}
// in window
if s.idx < s.r {
s.buf = append(s.buf, IdxValue{Idx: s.idx, Val: code})
s.idx++
continue
}
// end of w
if s.idx == s.r {
s.buf = append(s.buf, IdxValue{Idx: s.idx, Val: code})
sort.Sort(idxValues(s.buf)) // sort
s.i2v = s.buf[0]
s.mI, s.mV = s.i2v.Idx, s.i2v.Val
s.preMinIdx = s.mI
s.idx++
return s.i2v.Val, true
}
// find min k-mer
// remove k-mer not in this window.
// have to check position/index one by one
for s.i, s.i2v = range s.buf {
if s.i2v.Idx == s.idx-s.w {
if s.i < s.r {
copy(s.buf[s.i:s.r], s.buf[s.i+1:])
} // happen to be at the end
s.buf = s.buf[:s.r]
break
}
}
// add new k-mer
s.flag = false
// using binary search, faster han linear search
s.b, s.e = 0, s.r-1
for {
s.t = s.b + (s.e-s.b)/2
if code < s.buf[s.t].Val {
s.e = s.t - 1 // end search here
if s.e <= s.b {
s.flag = true
s.i = s.b
break
}
} else {
s.b = s.t + 1 // start here
if s.b >= s.r {
s.flag = false
break
}
if s.b >= s.e {
s.flag = true
s.i = s.e // right here
break
}
}
}
if !s.flag { // it's the biggest one, append to the end
s.buf = append(s.buf, IdxValue{s.idx, code})
} else {
if code >= s.buf[s.i].Val { // have to check again
s.i++
}
s.buf = append(s.buf, blankI2V) // append one element
copy(s.buf[s.i+1:], s.buf[s.i:s.r]) // move right
s.buf[s.i] = IdxValue{s.idx, code}
}
s.i2v = s.buf[0]
if s.i2v.Idx == s.preMinIdx { // deduplicate
s.idx++
continue
}
s.mI, s.mV = s.i2v.Idx, s.i2v.Val
s.preMinIdx = s.mI
s.idx++
return s.i2v.Val, true
}
}
// Index returns current 0-baesd index.
func (s *ProteinMinimizerSketch) Index() int {
return s.mI
}
unikmer-0.18.8/sketch-protein_test.go 0000664 0000000 0000000 00000003551 14121001445 0017603 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"testing"
"github.com/shenwei356/bio/seq"
)
func TestProteinMinimizer(t *testing.T) {
_s := "AAGTTTGAATCATTCAACTATCTAGTTTTCAGAGAACAATGTTCTCTAAAGAATAGAAAAGAGTCATTGTGCGGTGATGATGGCGGGAAGGATCCACCTG"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 10
w := 3
sketch, err := NewProteinMinimizerSketch(sequence, k, 1, 1, w)
if err != nil {
t.Errorf("fail to create minizimer sketch")
}
var code uint64
var ok bool
// var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok = sketch.Next()
if !ok {
break
}
// idx = sketch.Index()
// fmt.Printf("aa: %d-%s, %d\n", idx, sketch.s.Seq[idx:idx+k], code)
codes = append(codes, code)
}
}
unikmer-0.18.8/sketch.go 0000664 0000000 0000000 00000027052 14121001445 0015070 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"fmt"
"sort"
"sync"
"github.com/shenwei356/bio/seq"
"github.com/will-rowe/nthash"
// hasher "github.com/zeebo/wyhash"
)
// ErrInvalidS means s >= k.
var ErrInvalidS = fmt.Errorf("unikmer: invalid s-mer size")
// ErrInvalidW means w < 2 or w > (1<<32)-1
var ErrInvalidW = fmt.Errorf("unikmer: invalid minimimzer window")
// ErrBufNil means the buffer is nil
var ErrBufNil = fmt.Errorf("unikmer: buffer slice is nil")
// ErrBufNotEmpty means the buffer has some elements
var ErrBufNotEmpty = fmt.Errorf("unikmer: buffer has elements")
// Sketch is a k-mer sketch iterator
type Sketch struct {
S []byte
k int
s int
circular bool
hasher *nthash.NTHi
kMs int // k-s, just for syncmer
r int // L-s
idx int // current location, 0-based
end int
i, mI int
v, mV uint64
preMinIdx int
buf []IdxValue
i2v IdxValue
flag bool
t, b, e int
// ------ just for syncmer -------
hasherS *nthash.NTHi
bsyncmerIdx int
lateOutputThisOne bool
preMinIdxs []int
// ------ just for minimizer -----
skip bool
minimizer bool
w int
}
var poolSketch = &sync.Pool{New: func() interface{} {
return &Sketch{}
}}
// NewMinimizerSketch returns a SyncmerSketch Iterator.
// It returns the minHashes in all windows of w (w>=1) bp.
func NewMinimizerSketch(S *seq.Seq, k int, w int, circular bool) (*Sketch, error) {
if k < 1 {
return nil, ErrInvalidK
}
if w < 1 || w > (1<<31)-1 {
return nil, ErrInvalidW
}
if len(S.Seq) < k+w-1 {
return nil, ErrShortSeq
}
// sketch := &Sketch{S: S.Seq, w: w, k: k, circular: circular}
sketch := poolSketch.Get().(*Sketch)
sketch.minimizer = true
sketch.k = k
sketch.w = w
sketch.circular = circular
sketch.skip = w == 1
var seq2 []byte
if circular {
seq2 = make([]byte, len(S.Seq), len(S.Seq)+k-1)
copy(seq2, S.Seq) // do not edit original sequence
seq2 = append(seq2, S.Seq[0:k-1]...)
sketch.S = seq2
} else {
seq2 = S.Seq
}
sketch.idx = 0
sketch.end = len(seq2) - 1
sketch.r = w - 1 // L-k
var err error
sketch.hasher, err = nthash.NewHasher(&seq2, uint(k))
if err != nil {
return nil, err
}
if sketch.buf == nil {
sketch.buf = make([]IdxValue, 0, w)
} else {
sketch.buf = sketch.buf[:0]
}
if sketch.preMinIdxs == nil {
sketch.preMinIdxs = make([]int, 0, 8)
} else {
sketch.preMinIdxs = sketch.preMinIdxs[:0]
}
sketch.preMinIdx = -1
return sketch, nil
}
// NewSyncmerSketch returns a SyncmerSketch Iterator.
// 1<=s<=k.
func NewSyncmerSketch(S *seq.Seq, k int, s int, circular bool) (*Sketch, error) {
if k < 1 {
return nil, ErrInvalidK
}
if s > k || s == 0 {
return nil, ErrInvalidS
}
if len(S.Seq) < k*2-s-1 {
return nil, ErrShortSeq
}
// sketch := &Sketch{S: S.Seq, s: s, k: k, circular: circular}
sketch := poolSketch.Get().(*Sketch)
sketch.minimizer = false
sketch.k = k
sketch.s = s
sketch.circular = circular
sketch.skip = s == k
var seq2 []byte
if circular {
seq2 = make([]byte, len(S.Seq), len(S.Seq)+k-1)
copy(seq2, S.Seq) // do not edit original sequence
seq2 = append(seq2, S.Seq[0:k-1]...)
sketch.S = seq2
} else {
seq2 = S.Seq
}
sketch.idx = 0
sketch.end = len(seq2) - 2*k + s + 1 // len(sequence) - L (2*k - s - 1)
sketch.r = 2*k - s - 1 - s // L-s
sketch.kMs = k - s // k-s
sketch.w = k - s
var err error
sketch.hasher, err = nthash.NewHasher(&seq2, uint(k))
if err != nil {
return nil, err
}
sketch.hasherS, err = nthash.NewHasher(&seq2, uint(s))
if err != nil {
return nil, err
}
if sketch.buf == nil {
sketch.buf = make([]IdxValue, 0, (k-s)<<1)
} else {
sketch.buf = sketch.buf[:0]
}
if sketch.preMinIdxs == nil {
sketch.preMinIdxs = make([]int, 0, 8)
} else {
sketch.preMinIdxs = sketch.preMinIdxs[:0]
}
sketch.preMinIdx = -1
return sketch, nil
}
// NextMinimizer returns next minimizer.
func (s *Sketch) NextMinimizer() (code uint64, ok bool) {
for {
if s.idx > s.end {
return 0, false
}
// nthash of current k-mer
code, ok = s.hasher.Next(true)
if !ok {
poolSketch.Put(s)
return code, false
}
if s.skip {
s.mI = s.idx
s.idx++
return code, true
}
// in window
if s.idx < s.r {
s.buf = append(s.buf, IdxValue{Idx: s.idx, Val: code})
s.idx++
continue
}
// end of w
if s.idx == s.r {
s.buf = append(s.buf, IdxValue{Idx: s.idx, Val: code})
sort.Sort(idxValues(s.buf)) // sort
s.i2v = s.buf[0]
s.mI, s.mV = s.i2v.Idx, s.i2v.Val
s.preMinIdx = s.mI
s.idx++
return s.i2v.Val, true
}
// find min k-mer
// remove k-mer not in this window.
// have to check position/index one by one
for s.i, s.i2v = range s.buf {
if s.i2v.Idx == s.idx-s.w {
if s.i < s.r {
copy(s.buf[s.i:s.r], s.buf[s.i+1:])
} // happen to be at the end
s.buf = s.buf[:s.r]
break
}
}
// add new k-mer
s.flag = false
// using binary search, faster han linear search
s.b, s.e = 0, s.r-1
for {
s.t = s.b + (s.e-s.b)/2
if code < s.buf[s.t].Val {
s.e = s.t - 1 // end search here
if s.e <= s.b {
s.flag = true
s.i = s.b
break
}
} else {
s.b = s.t + 1 // start here
if s.b >= s.r {
s.flag = false
break
}
if s.b >= s.e {
s.flag = true
s.i = s.e // right here
break
}
}
}
if !s.flag { // it's the biggest one, append to the end
s.buf = append(s.buf, IdxValue{s.idx, code})
} else {
if code >= s.buf[s.i].Val { // have to check again
s.i++
}
s.buf = append(s.buf, blankI2V) // append one element
copy(s.buf[s.i+1:], s.buf[s.i:s.r]) // move right
s.buf[s.i] = IdxValue{s.idx, code}
}
s.i2v = s.buf[0]
if s.i2v.Idx == s.preMinIdx { // deduplicate
s.idx++
continue
}
s.mI, s.mV = s.i2v.Idx, s.i2v.Val
s.preMinIdx = s.mI
s.idx++
return s.i2v.Val, true
}
}
// NextSyncmer returns next syncmer.
func (s *Sketch) NextSyncmer() (code uint64, ok bool) {
for {
if s.idx > s.end {
return 0, false
}
// nthash of current k-mer
code, ok = s.hasher.Next(true)
if !ok {
poolSketch.Put(s)
return code, false
}
// fmt.Printf("\nidx: %d, %s, %d\n", s.idx, s.S[s.idx:s.idx+s.s], code)
// fmt.Printf("idx: %d, pres: %v, pre: %d\n", s.idx, s.preMinIdxs, s.preMinIdx)
if s.skip {
s.idx++
return code, true
}
if len(s.preMinIdxs) > 0 && s.idx == s.preMinIdxs[0] {
// we will output this one in this round
s.lateOutputThisOne = true
} else {
s.lateOutputThisOne = false
}
// find min s-mer
if s.idx == 0 {
for s.i = s.idx; s.i <= s.idx+s.r; s.i++ {
// fmt.Printf("s: %d\n", s.i)
s.v, ok = s.hasherS.Next(true)
if !ok {
return code, false
}
s.buf = append(s.buf, IdxValue{Idx: s.i, Val: s.v})
}
sort.Sort(idxValues(s.buf))
} else {
// remove s-mer not in this window.
// have to check position/index one by one
for s.i, s.i2v = range s.buf {
if s.i2v.Idx == s.idx-1 {
if s.i < s.r {
copy(s.buf[s.i:s.r], s.buf[s.i+1:])
} // happen to be at the end
s.buf = s.buf[:s.r]
break
}
}
// add new s-mer
// fmt.Printf("s: %d\n", s.idx+s.r)
s.v, ok = s.hasherS.Next(true)
if !ok {
return code, false
}
s.flag = false
// using binary search, faster han linear search
s.b, s.e = 0, s.r-1
for {
s.t = s.b + (s.e-s.b)/2
if s.v < s.buf[s.t].Val {
s.e = s.t - 1 // end search here
if s.e <= s.b {
s.flag = true
s.i = s.b
break
}
} else {
s.b = s.t + 1 // start here
if s.b >= s.r {
s.flag = false
break
}
if s.b >= s.e {
s.flag = true
s.i = s.e // right here
break
}
}
}
if !s.flag { // it's the biggest one, append to the end
s.buf = append(s.buf, IdxValue{s.idx + s.r, s.v})
} else {
if s.v >= s.buf[s.i].Val { // have to check again
s.i++
}
s.buf = append(s.buf, blankI2V) // append one element
copy(s.buf[s.i+1:], s.buf[s.i:s.r]) // move right
s.buf[s.i] = IdxValue{s.idx + s.r, s.v}
}
}
s.i2v = s.buf[0]
s.mI, s.mV = s.i2v.Idx, s.i2v.Val
// fmt.Printf(" smer: %d: %d\n", s.mI, s.mV)
// find the location of bounded syncmer
if s.mI-s.idx < s.w { // syncmer at the beginning of kmer
s.bsyncmerIdx = s.mI
// fmt.Printf(" bIdx: start: %d\n", s.bsyncmerIdx)
} else { // at the end
s.bsyncmerIdx = s.mI - s.kMs
// fmt.Printf(" bIdx: end: %d\n", s.bsyncmerIdx)
}
// ----------------------------------
// duplicated
if len(s.preMinIdxs) > 0 && s.bsyncmerIdx == s.preMinIdxs[0] {
// fmt.Printf(" duplicated: %d\n", s.bsyncmerIdx)
if s.lateOutputThisOne {
// remove the first element
copy(s.preMinIdxs[0:len(s.preMinIdxs)-1], s.preMinIdxs[1:])
s.preMinIdxs = s.preMinIdxs[0 : len(s.preMinIdxs)-1]
s.idx++
s.preMinIdx = s.bsyncmerIdx
return code, true
}
s.idx++
// s.preMinIdx = s.bsyncmerIdx
continue
}
if s.lateOutputThisOne {
// remove the first element
copy(s.preMinIdxs[0:len(s.preMinIdxs)-1], s.preMinIdxs[1:])
s.preMinIdxs = s.preMinIdxs[0 : len(s.preMinIdxs)-1]
if s.preMinIdx != s.bsyncmerIdx {
s.preMinIdxs = append(s.preMinIdxs, s.bsyncmerIdx)
}
// fmt.Printf(" late2: %d\n", s.preMinIdxs[0])
s.idx++
s.preMinIdx = s.bsyncmerIdx
return code, true
}
// is it current kmer?
if s.bsyncmerIdx == s.idx {
// fmt.Printf(" current: %d\n", s.bsyncmerIdx)
if len(s.preMinIdxs) > 0 {
// remove the first element
copy(s.preMinIdxs[0:len(s.preMinIdxs)-1], s.preMinIdxs[1:])
s.preMinIdxs = s.preMinIdxs[0 : len(s.preMinIdxs)-1]
}
s.idx++
s.preMinIdx = s.bsyncmerIdx
return code, true
}
if s.preMinIdx != s.bsyncmerIdx {
s.preMinIdxs = append(s.preMinIdxs, s.bsyncmerIdx)
}
// fmt.Printf(" return it later: %d\n", s.bsyncmerIdx)
s.idx++
s.preMinIdx = s.bsyncmerIdx
}
}
// Next returns next sketch
func (s *Sketch) Next() (uint64, bool) {
if s.minimizer {
return s.NextMinimizer()
}
return s.NextSyncmer()
}
// Index returns current 0-baesd index
func (s *Sketch) Index() int {
if s.minimizer {
return s.mI
}
return s.idx - 1
}
// IdxValue is for storing k-mer hash and it's location when computing k-mer sketches.
type IdxValue struct {
Idx int // index
Val uint64 // hash
}
var blankI2V = IdxValue{0, 0}
type idxValues []IdxValue
func (l idxValues) Len() int { return len(l) }
func (l idxValues) Less(i int, j int) bool { return l[i].Val < l[j].Val }
func (l idxValues) Swap(i int, j int) { l[i], l[j] = l[j], l[i] }
unikmer-0.18.8/sketch_test.go 0000664 0000000 0000000 00000012353 14121001445 0016125 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"testing"
"github.com/shenwei356/bio/seq"
"github.com/shenwei356/util/bytesize"
)
var _syncmer uint64
var _syncmerIdx int
func TestMinimizer(t *testing.T) {
_s := "GGCAAGTTCGTCA"
// _s := "GGCAAGTTC"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 5
w := 3
sketch, err := NewMinimizerSketch(sequence, k, w, false)
if err != nil {
t.Errorf("fail to create minizimer sketch")
}
var code uint64
var ok bool
var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok = sketch.NextMinimizer()
if !ok {
break
}
idx = sketch.Index()
_syncmerIdx = idx
_syncmer = code
codes = append(codes, code)
// fmt.Printf("minizimer: %d-%s, %d\n", idx, _s[idx:idx+k], code)
}
if len(codes) == 5 &&
codes[0] == 973456138564179607 &&
codes[1] == 2645801399420473919 &&
codes[2] == 1099502864234245338 &&
codes[3] == 6763474888237448943 &&
codes[4] == 2737971715116251183 {
} else {
t.Errorf("minizimer error")
}
}
func TestSyncmer(t *testing.T) {
_s := "GGCAAGTTCGTCATCGATC"
// _s := "GGCAAGTTC"
sequence, err := seq.NewSeq(seq.DNA, []byte(_s))
if err != nil {
t.Errorf("fail to create sequence: %s", _s)
}
k := 5
s := 2
sketch, err := NewSyncmerSketch(sequence, k, s, false)
if err != nil {
t.Errorf("fail to create syncmer sketch")
}
var code uint64
var ok bool
var idx int
codes := make([]uint64, 0, 1024)
for {
code, ok = sketch.NextSyncmer()
// fmt.Println(sketch.Index(), code, ok)
if !ok {
break
}
idx = sketch.Index()
_syncmerIdx = idx
_syncmer = code
codes = append(codes, code)
// fmt.Printf("syncmer: %d-%s, %d\n", idx, _s[idx:idx+k], code)
}
// if len(codes) == 5 &&
// codes[0] == 7385093395039290540 &&
// codes[1] == 1099502864234245338 {
// } else {
// t.Errorf("syncmer error")
// }
}
func BenchmarkMinimizerSketch(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
// var n int
for j := 0; j < b.N; j++ {
iter, err := NewMinimizerSketch(benchSeqs[i], 31, 15, false)
if err != nil {
b.Errorf("fail to create minizimer sketch. seq length: %d", size)
}
// n = 0
for {
code, ok = iter.NextMinimizer()
if !ok {
break
}
// fmt.Printf("minizimer: %d-%d\n", iter.Index(), code)
_code = code
// n++
}
}
// fmt.Printf("minizimer for %s DNA, c=%.6f\n", bytesize.ByteSize(size).String(), float64(size)/float64(n))
})
}
}
// go test -v -test.bench=BenchmarkSyncmerSketch -cpuprofile profile.out -test.run=damnit
// go tool pprof -http=:8080 profile.out
func BenchmarkSyncmerSketch(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
// var n int
for j := 0; j < b.N; j++ {
iter, err := NewSyncmerSketch(benchSeqs[i], 31, 16, false)
if err != nil {
b.Errorf("fail to create syncmer sketch. seq length: %d", size)
}
// n = 0
for {
code, ok = iter.NextSyncmer()
if !ok {
break
}
// fmt.Printf("syncmer: %d-%d\n", iter.Index(), code)
_code = code
// n++
}
}
// fmt.Printf("syncmer for %s DNA, c=%.6f\n", bytesize.ByteSize(size).String(), float64(size)/float64(n))
})
}
}
func BenchmarkProteinMinimizerSketch(b *testing.B) {
for i := range benchSeqs {
size := len(benchSeqs[i].Seq)
b.Run(bytesize.ByteSize(size).String(), func(b *testing.B) {
var code uint64
var ok bool
// var n int
for j := 0; j < b.N; j++ {
iter, err := NewProteinMinimizerSketch(benchSeqs[i], 10, 1, 1, 5)
if err != nil {
b.Errorf("fail to create minizimer sketch. seq length: %d", size)
}
// n = 0
for {
code, ok = iter.Next()
if !ok {
break
}
// fmt.Printf("minizimer: %d-%d\n", iter.Index(), code)
_code = code
// n++
}
}
// fmt.Printf("minizimer for %s Protein, c=%.6f\n", bytesize.ByteSize(size).String(), float64(size)/float64(n))
})
}
}
unikmer-0.18.8/taxonomy.go 0000664 0000000 0000000 00000034556 14121001445 0015474 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"bufio"
"errors"
"fmt"
"strconv"
"strings"
"sync"
"github.com/shenwei356/xopen"
)
// Taxonomy holds relationship of taxon in a taxonomy.
type Taxonomy struct {
file string
rootNode uint32
Nodes map[uint32]uint32 // child -> parent
DelNodes map[uint32]struct{}
MergeNodes map[uint32]uint32 // from -> to
Names map[uint32]string
taxid2rankid map[uint32]uint8 // taxid -> rank id
ranks []string // rank id -> rank
Ranks map[string]interface{}
hasRanks bool
hasDelNodes bool
hasMergeNodes bool
hasNames bool
cacheLCA bool
lcaCache sync.Map
maxTaxid uint32
}
// ErrIllegalColumnIndex means column index is 0 or negative.
var ErrIllegalColumnIndex = errors.New("unikmer: illegal column index, positive integer needed")
// ErrRankNotLoaded means you should reate load Taxonomy with NewTaxonomyWithRank before calling some methods.
var ErrRankNotLoaded = errors.New("unikmer: taxonomic ranks not loaded, please call: NewTaxonomyWithRank")
// ErrNamesNotLoaded means you should call LoadNames before using taxonomy names.
var ErrNamesNotLoaded = errors.New("unikmer: taxonomy names not loaded, please call: LoadNames")
// ErrTooManyRanks means number of ranks exceed limit of 255
var ErrTooManyRanks = errors.New("unikmer: number of ranks exceed limit of 255")
// ErrUnkownRank indicate an unknown rank
var ErrUnkownRank = errors.New("unikmer: unknown rank")
// NewTaxonomyFromNCBI parses nodes relationship from nodes.dmp
// from ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz .
func NewTaxonomyFromNCBI(file string) (*Taxonomy, error) {
return NewTaxonomy(file, 1, 3)
}
// NewTaxonomy only loads nodes from nodes.dmp file.
func NewTaxonomy(file string, childColumn int, parentColumn int) (*Taxonomy, error) {
if childColumn < 1 || parentColumn < 1 {
return nil, ErrIllegalColumnIndex
}
maxColumns := maxInt(childColumn, parentColumn)
fh, err := xopen.Ropen(file)
if err != nil {
return nil, fmt.Errorf("unikmer: %s", err)
}
defer func() {
fh.Close()
}()
nodes := make(map[uint32]uint32, 1024)
n := maxColumns + 1
childColumn--
parentColumn--
items := make([]string, n)
scanner := bufio.NewScanner(fh)
var _child, _parent int
var child, parent uint32
var maxTaxid uint32
var root uint32
for scanner.Scan() {
stringSplitN(scanner.Text(), "\t", n, &items)
if len(items) < n {
continue
}
_child, err = strconv.Atoi(items[childColumn])
if err != nil {
continue
}
_parent, err = strconv.Atoi(items[parentColumn])
if err != nil {
continue
}
child, parent = uint32(_child), uint32(_parent)
// ----------------------------------
nodes[child] = parent
if child == parent {
root = child
}
if child > maxTaxid {
maxTaxid = child
}
}
if err := scanner.Err(); err != nil {
return nil, fmt.Errorf("unikmer: %s", err)
}
return &Taxonomy{file: file, Nodes: nodes, rootNode: root, maxTaxid: maxTaxid}, nil
}
// NewTaxonomyWithRankFromNCBI parses Taxonomy from nodes.dmp
// from ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz .
func NewTaxonomyWithRankFromNCBI(file string) (*Taxonomy, error) {
return NewTaxonomyWithRank(file, 1, 3, 5)
}
// NewTaxonomyWithRank loads nodes and ranks from nodes.dmp file.
func NewTaxonomyWithRank(file string, childColumn int, parentColumn int, rankColumn int) (*Taxonomy, error) {
if childColumn < 1 || parentColumn < 1 || rankColumn < 1 {
return nil, ErrIllegalColumnIndex
}
maxColumns := maxInt(childColumn, parentColumn, rankColumn)
taxid2rankid := make(map[uint32]uint8, 1024)
ranks := make([]string, 0, 128)
rank2rankid := make(map[string]int, 128)
ranksMap := make(map[string]interface{}, 128)
fh, err := xopen.Ropen(file)
if err != nil {
return nil, fmt.Errorf("unikmer: %s", err)
}
defer func() {
fh.Close()
}()
nodes := make(map[uint32]uint32, 1024)
n := maxColumns + 1
childColumn--
parentColumn--
rankColumn--
items := make([]string, n)
scanner := bufio.NewScanner(fh)
var _child, _parent int
var child, parent uint32
var maxTaxid uint32
var rank string
var ok bool
var rankid int
var root uint32
for scanner.Scan() {
stringSplitN(scanner.Text(), "\t", n, &items)
if len(items) < n {
continue
}
_child, err = strconv.Atoi(items[childColumn])
if err != nil {
continue
}
_parent, err = strconv.Atoi(items[parentColumn])
if err != nil {
continue
}
child, parent, rank = uint32(_child), uint32(_parent), items[rankColumn]
// ----------------------------------
nodes[child] = parent
if child == parent {
root = child
}
if child > maxTaxid {
maxTaxid = child
}
if rankid, ok = rank2rankid[rank]; ok {
taxid2rankid[child] = uint8(rankid)
} else {
ranks = append(ranks, rank)
if len(ranks) > 255 {
return nil, ErrTooManyRanks
}
rank2rankid[rank] = len(ranks) - 1
taxid2rankid[child] = uint8(len(ranks) - 1)
ranksMap[rank] = struct{}{}
}
}
if err := scanner.Err(); err != nil {
return nil, fmt.Errorf("unikmer: %s", err)
}
return &Taxonomy{file: file, Nodes: nodes, rootNode: root, maxTaxid: maxTaxid,
taxid2rankid: taxid2rankid, ranks: ranks, hasRanks: true, Ranks: ranksMap}, nil
}
// Rank returns rank of a taxid.
func (t *Taxonomy) Rank(taxid uint32) string {
if !t.hasRanks {
panic(ErrRankNotLoaded)
}
if i, ok := t.taxid2rankid[taxid]; ok {
return t.ranks[int(i)]
}
return "" // taxid not found int db
}
// AtOrBelowRank returns whether a taxid is at or below one rank.
func (t *Taxonomy) AtOrBelowRank(taxid uint32, rank string) bool {
if !t.hasRanks {
panic(ErrRankNotLoaded)
}
var ok bool
var i uint8
rank = strings.ToLower(rank)
if _, ok = t.Ranks[rank]; !ok {
return false
}
if i, ok = t.taxid2rankid[taxid]; ok {
if rank == t.ranks[int(i)] {
return true
}
}
// continue searching towards to root node
var child, parent, newtaxid uint32
child = taxid
for {
parent, ok = t.Nodes[child]
if !ok { // taxid not found
// check if it was deleted
if _, ok = t.DelNodes[child]; ok {
return false
}
// check if it was merged
if newtaxid, ok = t.MergeNodes[child]; ok {
child = newtaxid
parent = t.Nodes[child]
} else { // not found
return false
}
}
if parent == 1 {
break
}
if rank == t.ranks[t.taxid2rankid[parent]] {
return true
}
child = parent
}
return false
}
// LoadNamesFromNCBI loads scientific names from NCBI names.dmp
func (t *Taxonomy) LoadNamesFromNCBI(file string) error {
return t.LoadNames(file, 1, 3, 7, "scientific name")
}
// LoadNames loads names.
func (t *Taxonomy) LoadNames(file string, taxidColumn int, nameColumn int, typeColumn int, _type string) error {
if taxidColumn < 1 || nameColumn < 1 || typeColumn < 1 {
return ErrIllegalColumnIndex
}
maxColumns := maxInt(nameColumn, nameColumn, typeColumn)
fh, err := xopen.Ropen(file)
if err != nil {
return fmt.Errorf("unikmer: %s", err)
}
defer func() {
fh.Close()
}()
m := make(map[uint32]string, 1024)
n := maxColumns + 1
taxidColumn--
nameColumn--
typeColumn--
filterByType := _type != ""
items := make([]string, n)
scanner := bufio.NewScanner(fh)
var taxid uint64
for scanner.Scan() {
stringSplitN(scanner.Text(), "\t", n, &items)
if len(items) < n {
continue
}
if filterByType && items[typeColumn] != _type {
continue
}
taxid, err = strconv.ParseUint(items[taxidColumn], 10, 32)
if err != nil {
continue
}
m[uint32(taxid)] = items[nameColumn]
}
if err := scanner.Err(); err != nil {
return fmt.Errorf("unikmer: %s", err)
}
t.Names = m
t.hasNames = true
return nil
}
// LoadMergedNodesFromNCBI loads merged nodes from NCBI merged.dmp.
func (t *Taxonomy) LoadMergedNodesFromNCBI(file string) error {
return t.LoadMergedNodes(file, 1, 3)
}
// LoadMergedNodes loads merged nodes.
func (t *Taxonomy) LoadMergedNodes(file string, oldColumn int, newColumn int) error {
if oldColumn < 1 || newColumn < 1 {
return ErrIllegalColumnIndex
}
maxColumns := maxInt(oldColumn, newColumn)
fh, err := xopen.Ropen(file)
if err != nil {
return fmt.Errorf("unikmer: %s", err)
}
defer func() {
fh.Close()
}()
m := make(map[uint32]uint32, 1024)
n := maxColumns + 1
oldColumn--
newColumn--
items := make([]string, n)
scanner := bufio.NewScanner(fh)
var from, to int
for scanner.Scan() {
stringSplitN(scanner.Text(), "\t", n, &items)
if len(items) < n {
continue
}
from, err = strconv.Atoi(items[oldColumn])
if err != nil {
continue
}
to, err = strconv.Atoi(items[newColumn])
if err != nil {
continue
}
m[uint32(from)] = uint32(to)
}
if err := scanner.Err(); err != nil {
return fmt.Errorf("unikmer: %s", err)
}
t.MergeNodes = m
t.hasMergeNodes = true
return nil
}
// LoadDeletedNodesFromNCBI loads deleted nodes from NCBI delnodes.dmp.
func (t *Taxonomy) LoadDeletedNodesFromNCBI(file string) error {
return t.LoadDeletedNodes(file, 1)
}
// LoadDeletedNodes loads deleted nodes.
func (t *Taxonomy) LoadDeletedNodes(file string, column int) error {
if column < 1 {
return ErrIllegalColumnIndex
}
fh, err := xopen.Ropen(file)
if err != nil {
return fmt.Errorf("unikmer: %s", err)
}
defer func() {
fh.Close()
}()
m := make(map[uint32]struct{}, 1024)
n := column + 1
column--
items := make([]string, n)
scanner := bufio.NewScanner(fh)
var id int
for scanner.Scan() {
stringSplitN(scanner.Text(), "\t", n, &items)
if len(items) < n {
continue
}
id, err = strconv.Atoi(items[column])
if err != nil {
continue
}
m[uint32(id)] = struct{}{}
}
if err := scanner.Err(); err != nil {
return fmt.Errorf("unikmer: %s", err)
}
t.DelNodes = m
t.hasDelNodes = true
return nil
}
// MaxTaxid returns maximum taxid
func (t *Taxonomy) MaxTaxid() uint32 {
return t.maxTaxid
}
// CacheLCA tells to cache every LCA query result
func (t *Taxonomy) CacheLCA() {
t.cacheLCA = true
}
// LCA returns the Lowest Common Ancestor of two nodes, 0 for unknown taxid.
func (t *Taxonomy) LCA(a uint32, b uint32) uint32 {
if a == 0 || b == 0 {
return 0
}
if a == b {
return a
}
// check cache
var ok bool
var query uint64
var tmp interface{}
if t.cacheLCA {
query = pack2uint32(a, b)
tmp, ok = t.lcaCache.Load(query)
if ok {
return tmp.(uint32)
}
}
mA := make(map[uint32]struct{}, 16)
var child, parent, newTaxid uint32
var flag bool
child = a
for {
parent, ok = t.Nodes[child]
if !ok {
flag = false
if t.hasMergeNodes { // merged?
if newTaxid, ok = t.MergeNodes[child]; ok { // merged
child = newTaxid // update child
parent, ok = t.Nodes[child]
if ok {
flag = true
}
}
}
if !flag {
if t.cacheLCA {
t.lcaCache.Store(query, uint32(0))
}
return 0
}
}
if parent == child { // root
mA[parent] = struct{}{}
break
}
if parent == b { // b is ancestor of a
if t.cacheLCA {
t.lcaCache.Store(query, b)
}
return b
}
mA[parent] = struct{}{}
child = parent
}
child = b
for {
parent, ok = t.Nodes[child]
if !ok {
flag = false
if t.hasMergeNodes { // merged?
if newTaxid, ok = t.MergeNodes[child]; ok { // merged
child = newTaxid // update child
parent, ok = t.Nodes[child]
if ok {
flag = true
}
}
}
if !flag {
if t.cacheLCA {
t.lcaCache.Store(query, uint32(0))
}
return 0
}
}
if parent == child { // root
break
}
if parent == a { // a is ancestor of b
if t.cacheLCA {
t.lcaCache.Store(query, a)
}
return a
}
if _, ok = mA[parent]; ok {
if t.cacheLCA {
t.lcaCache.Store(query, parent)
}
return parent
}
child = parent
}
return t.rootNode
}
// LineageNames returns nodes' names of the the complete lineage.
func (t *Taxonomy) LineageNames(taxid uint32) []string {
taxids := t.LineageTaxIds(taxid)
if taxids == nil {
return nil
}
if !t.hasNames {
panic(ErrNamesNotLoaded)
}
names := make([]string, len(taxids))
for i, tax := range taxids {
names[i] = t.Names[tax]
}
return names
}
// LineageTaxIds returns nodes' taxid of the the complete lineage.
func (t *Taxonomy) LineageTaxIds(taxid uint32) []uint32 {
var child, parent, newtaxid uint32
var ok bool
child = taxid
list := make([]uint32, 0, 16)
for {
parent, ok = t.Nodes[child]
if !ok { // taxid not found
// check if it was deleted
if _, ok = t.DelNodes[child]; ok {
return nil
}
// check if it was merged
if newtaxid, ok = t.MergeNodes[child]; ok {
child = newtaxid
parent = t.Nodes[child]
} else { // not found
return nil
}
}
list = append(list, child)
if parent == 1 {
break
}
child = parent
}
// reversing
for i, j := 0, len(list)-1; i < j; i, j = i+1, j-1 {
list[i], list[j] = list[j], list[i]
}
return list
}
func pack2uint32(a uint32, b uint32) uint64 {
if a < b {
return (uint64(a) << 32) | uint64(b)
}
return (uint64(b) << 32) | uint64(a)
}
func minInt(a int, vals ...int) int {
min := a
for _, v := range vals {
if v < min {
min = v
}
}
return min
}
func maxInt(a int, vals ...int) int {
min := a
for _, v := range vals {
if v > min {
min = v
}
}
return min
}
func stringSplitN(s string, sep string, n int, a *[]string) {
if a == nil {
tmp := make([]string, n)
a = &tmp
}
n--
i := 0
for i < n {
m := strings.Index(s, sep)
if m < 0 {
break
}
(*a)[i] = s[:m]
s = s[m+len(sep):]
i++
}
(*a)[i] = s
(*a) = (*a)[:i+1]
}
unikmer-0.18.8/taxonomy_test.go 0000664 0000000 0000000 00000002721 14121001445 0016520 0 ustar 00root root 0000000 0000000 // Copyright © 2018-2021 Wei Shen
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package unikmer
import (
"testing"
)
func TestPackTwoTaxids(t *testing.T) {
type Test struct {
a, b uint32
c uint64
}
tests := []Test{
{0, 0, 0},
{1, 1, 1<<32 + 1},
{2, 1, 1<<32 + 2},
}
for _, test := range tests {
c := pack2uint32(test.a, test.b)
if c != test.c {
t.Errorf("pack2uint32 error: %d != %d ", c, test.c)
}
}
}
unikmer-0.18.8/testdata/ 0000775 0000000 0000000 00000000000 14121001445 0015063 5 ustar 00root root 0000000 0000000 unikmer-0.18.8/testdata/README.md 0000664 0000000 0000000 00000001432 14121001445 0016342 0 ustar 00root root 0000000 0000000 ## Compression rate comparison
No Taxids stored.
1. Prepare a genome sequence, I used human genome chromosome X (`t_chrX.fa.gz`)
2. Computation
f=t_chrX.fa.gz
./cr2.sh $f > table.tsv
3. Plot
cat table.tsv \
| csvtk -t mutate2 -L 1 -n r_gzip -e '$gzip/$plain*100' \
| csvtk -t mutate2 -L 1 -n r_unik.default -e '$unik/$plain*100' \
| csvtk -t mutate2 -L 1 -n r_unik.compact -e '$cunik/$plain*100' \
| csvtk -t mutate2 -L 1 -n r_unik.sorted -e '$sunik/$plain*100' \
| csvtk -t cut -F -f k,num,r_* \
| csvtk -t gather -k group -v value -F -f 'r_*' \
| csvtk -t replace -f group -p 'r_' \
| csvtk -t replace -f num -p '^(.+)$' -k size.tsv -r '{kv} k-mers' \
> table.r.tsv
./plot.R
unikmer-0.18.8/testdata/cr.jpg 0000664 0000000 0000000 00000757631 14121001445 0016214 0 ustar 00root root 0000000 0000000 JFIF ,, C
$.' ",#(7),01444'9=82<.342 C
2!!22222222222222222222222222222222222222222222222222 F"
} !1AQa"q2#BR$3br
%&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz
w !1AQaq"2B #3Rbr
$4%&'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz ? (
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(_ֿ*wk"{8٢F'<4k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BGI:0e=5so
JDHSU'A QHt(Z _Уֿ+2F'<g @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?ȣ @k ?BZ _Ъfx |>kֿ(uE }
h?Ȩm[iH
pB ITOB
:i?fUk ?B=
bA٣4H璠
?Z _Уֿ*٠ 1 "A