debian/0000755000000000000000000000000012163334107007166 5ustar debian/vcf-merge.10000644000000000000000000000204412151612073011121 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-MERGE "1" "July 2011" "vcf-merge 0.1.5" "User Commands" .SH NAME vcf-merge \- merge the bgzipped and tabix indexed VCF files .SH SYNOPSIS .B merge-vcf [\fIOPTIONS\fR] \fIfile1.vcf file2.vcf.gz \fR... \fI> out.vcf\fR .SH DESCRIPTION About: Merge the bgzipped and tabix indexed VCF files. (E.g. bgzip file.vcf; tabix \fB\-p\fR vcf file.vcf.gz) .SH OPTIONS .TP \fB\-c\fR, \fB\-\-chromosomes\fR Same as \fB\-r\fR, left for backward compatibility. Please do not use as it will be dropped in the future. .TP \fB\-d\fR, \fB\-\-remove\-duplicates\fR If there should be two consecutive rows with the same chr:pos, print only the first one. .TP \fB\-H\fR, \fB\-\-vcf\-header\fR Use the VCF header .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .TP \fB\-r\fR, \fB\-\-regions\fR Do only the given regions (comma\-separated list or one region per line in a file). .TP \fB\-s\fR, \fB\-\-silent\fR Try to be a bit more silent, no warnings about duplicate lines. debian/fill-fs.10000644000000000000000000000240012151612073010576 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH FILL-FS "1" "February 2013" "fill-fs" "User Commands" .SH NAME fill-fs \- Annotate VCF with flanking sequence .SH SYNOPSIS .B fill-fs [\fIOPTIONS\fR] \fIfile.vcf\fR .SH DESCRIPTION About: Annotate VCF with flanking sequence (INFO/FS tag) .SH OPTIONS .TP \fB\-b\fR, \fB\-\-bed\-mask\fR Regions to mask (tabix indexed), multiple files can be given .TP \fB\-c\fR, \fB\-\-cluster\fR Do self\-masking of clustered variants within this range. .TP \fB\-l\fR, \fB\-\-length\fR Flanking sequence length [100] .TP \fB\-m\fR, \fB\-\-mask\-char\fR The character to use or "lc" for lowercase. This option must precede \fB\-b\fR, \fB\-v\fR or \fB\-c\fR in order to take effect.
With multiple files works .IP as a switch on the command line, see the example below [N] .TP \fB\-r\fR, \fB\-\-refseq\fR The reference sequence. .TP \fB\-v\fR, \fB\-\-vcf\-mask\fR Mask known variants in the flanking sequence, multiple files can be given (tabix indexed) .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SS "Example:" .IP # Mask variants from the VCF file with N's and use lowercase for the bed file regions fill\-fs file.vcf \fB\-v\fR mask.vcf \fB\-m\fR lc \fB\-b\fR mask.bed .IP debian/manpages0000644000000000000000000000100512151612073010676 0ustar debian/vcftools.1 debian/vcf-annotate.1 debian/vcf-compare.1 debian/vcf-concat.1 debian/vcf-convert.1 debian/vcf-isec.1 debian/vcf-merge.1 debian/vcf-query.1 debian/vcf-sort.1 debian/vcf-stats.1 debian/vcf-subset.1 debian/vcf-to-tab.1 debian/vcf-validator.1 debian/fill-aa.1 debian/fill-an-ac.1 debian/fill-rsIDs.1 debian/fill-fs.1 debian/fill-ref-md5.1 debian/vcf-consensus.1 debian/vcf-contrast.1 debian/vcf-fix-ploidy.1 debian/vcf-indel-stats.1 debian/vcf-phased-join.1 debian/vcf-shuffle-cols.1 debian/vcf-tstv.1 debian/control0000644000000000000000000000166712163234417010606 0ustar Source: vcftools Section: science Priority: extra Build-Depends: debhelper (>= 9), zlib1g-dev Standards-Version: 3.9.4 Homepage: http://vcftools.sourceforge.net Maintainer: Debian Med Packaging Team Uploaders: Thorsten Alteholz , Andreas Tille Vcs-Browser: http://anonscm.debian.org/viewvc/debian-med/trunk/packages/vcftools/trunk/ Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/vcftools/trunk Package: vcftools Architecture: any Depends: ${shlibs:Depends}, ${misc:Depends}, ${perl:Depends} Recommends: tabix Description: Collection of tools to work with VCF files VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. 
The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing and calculating some basic population genetic statistics. debian/vcf-tstv.10000644000000000000000000000042112151612073011017 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-TSTV "1" "February 2013" "vcf-tstv" "User Commands" .SH NAME vcf-tstv \- vcf-tstv .SH SYNOPSIS .B cat \fIfile.vcf | vcf-tstv\fR .SH OPTIONS .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .IP debian/vcf-phased-join.10000644000000000000000000000171512151612073012227 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-PHASED-JOIN "1" "February 2013" "vcf-phased-join" "User Commands" .SH NAME vcf-phased-join \- take multiple overlapping pre\-phased chunks and concatenates them into one VCF .SH SYNOPSIS .B vcf-phased-join [\fIOPTIONS\fR] \fIA.vcf B.vcf C.vcf\fR .SH DESCRIPTION About: The script takes multiple overlapping pre\-phased chunks and concatenates them into one VCF .IP using heterozygous calls from the overlaps to determine correct phase. .SH OPTIONS .TP \fB\-j\fR, \fB\-\-min\-join\-quality\fR Quality threshold for gluing the pre\-phased blocks together [10] .TP \fB\-l\fR, \fB\-\-list\fR List of VCFs to join. .TP \fB\-o\fR, \fB\-\-output\fR Output file name.
When "\-" is supplied, STDOUT and STDERR will be used .TP \fB\-q\fR, \fB\-\-min\-PQ\fR Break pre\-phased segments if PQ value is lower in input VCFs [0.6] .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message .IP debian/copyright0000644000000000000000000000227012151612073011120 0ustar Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ Upstream-Name: vcftools Source: http://vcftools.sourceforge.net/index.html Files-Excluded: *.pdf Files: * Copyright: 2009 - 2011, Adam Auton (cpp) 2009 - 2011, Petr Danecek (perl) License: GPL-3 This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. . This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. . You should have received a copy of the GNU General Public License along with this program. If not, see . . On Debian systems, the complete text of the GNU General Public License version 3 can be found in "/usr/share/common-licenses/GPL-3". 
Files: debian/* Copyright: 2011 Thorsten Alteholz License: GPL-3 This program is free software: you can redistribute it and/or modify debian/changelog0000644000000000000000000000503512163234526011047 0ustar vcftools (0.1.11+dfsg-1) unstable; urgency=low * new upstream version * debian/patches/use-dpkg-buildflags.patch: adatpt for new version * debian/control: adapt VCS*-lines according to lintian -- Thorsten Alteholz Fri, 28 Jun 2013 18:00:00 +0200 vcftools (0.1.10+dfsg-1) unstable; urgency=low [ Thorsten Alteholz ] * add man pages for new scripts * debian/patches/use-dpkg-buildflags.patch: add LDFLAGS to link stage * debian/vcftools.lintian-overrides: false positive * debian/control: standard bumped to 3.9.4 (no changes) * debian/control: DM-Upload-Allowed removed [ Dominique Belhachemi ] * New upstream version * Updated debian/patches/use-dpkg-buildflags.patch * Removed debian/patches/perl.patch (applied upstream) -- Thorsten Alteholz Sun, 24 Feb 2013 16:00:00 +0100 vcftools (0.1.9+dfsg-2) unstable; urgency=low * debian/control: description changed (Closes: #689058) * debian/patches/perl.patch for new syntax (Closes: #689059) * fix get-orig-source to handle dfsg -- Thorsten Alteholz Sun, 02 Dec 2012 19:00:00 +0100 vcftools (0.1.9+dfsg-1) UNRELEASED; urgency=low * debian/copyright: - DEP5 - Add Files-Excluded to document what was removed from original source * debian/watch: Enable handling dfsg suffix -- Andreas Tille Wed, 12 Sep 2012 16:31:09 +0200 vcftools (0.1.9-1) unstable; urgency=low * New upstream version * debian/upstream: Added citations * debian/control:Standards-Version: 3.9.3 (no changes needed) * debian/{get-orig-source,rules,watch}: Make sure we always get the latest version * debhelper 9 (control+compat) * debian/{rules,install,examples}: use debhelper files to install files (upstream changed some targets so the old copying mechanism did not worked any more) * debian/patches/use-dpkg-buildflags.patch: Enable propagation of hardening flags -- Andreas 
Tille Sat, 12 May 2012 09:31:58 +0200 vcftools (0.1.7-1) unstable; urgency=low * new upstream version * rearrange debian/copyright to calm lintian -- Thorsten Alteholz Wed, 19 Oct 2011 18:00:00 +0200 vcftools (0.1.6-1) unstable; urgency=low [ Thorsten ] * Initial release (Closes: #633142). [ Steffen as the package's sponsor ] * Added README.source * Removed Andreas and myself from uploaders * Followed FTPmaster's advice to remove the source-less .pdf from source tree -- Thorsten Alteholz Sun, 10 Jul 2011 15:18:25 +0200 debian/vcf-contrast.10000644000000000000000000000250612151612073011662 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-CONTRAST "1" "February 2013" "vcf-contrast" "User Commands" .SH NAME vcf-contrast \- finds differences amongst samples .SH SYNOPSIS .B vcf-contrast \fI+ - \fR[\fIOPTIONS\fR] \fIfile.vcf.gz\fR .SH DESCRIPTION About: Finds differences amongst samples adding NOVEL* annotation to INFO field. .SH OPTIONS .TP + List of samples where unique variant is expected .TP \- List of background samples .TP \fB\-d\fR, \fB\-\-min\-DP\fR Minimum depth across all \- samples .TP \fB\-f\fR, \fB\-\-apply\-filters\fR Skip sites with FILTER column different from PASS or "." .TP \fB\-n\fR, \fB\-\-novel\-sites\fR Print only records with novel genotypes .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. 
.SS "Example:" .IP # Test if any of the samples A,B is different from all C,D,E vcf\-contrast +A,B \fB\-C\fR,D,E \fB\-m\fR file.vcf.gz .IP # Same as above but printing only sites with novel variants and table output vcf\-contrast \fB\-n\fR +A,B \fB\-C\fR,D,E \fB\-m\fR file.vcf.gz | vcf\-query \fB\-f\fR '%CHROM %POS\et%INFO/NOVELTY\et%INFO/NOVELAL\et%INFO/NOVELGT[\et%SAMPLE %GTR %PL]\en' .IP # Similar to above but require minimum mapping quality of 20 vcf\-annotate \fB\-f\fR MinMQ=20 file.vcf.gz | vcf\-contrast +A,B,C \fB\-D\fR,E,F \fB\-f\fR .IP debian/vcf-sort.10000644000000000000000000000050612151612073011012 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-SORT "1" "July 2011" "vcf-sort 0.1.5" "User Commands" .SH NAME vcf-sort \- sort VCF file .SH SYNOPSIS .B vcf-sort \fI> out.vcf\fR .SH DESCRIPTION .IP cat file.vcf | vcf\-sort > out.vcf .SH OPTIONS .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. debian/compat0000644000000000000000000000000212151612073010362 0ustar 9 debian/vcf-consensus.10000644000000000000000000000057112151612073012045 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-CONSENSUS "1" "February 2013" "vcf-consensus" "User Commands" .SH NAME vcf-consensus \- vcf-consensus .SH SYNOPSIS .B cat \fIref.fa | vcf-consensus \fR[\fIOPTIONS\fR] \fIin.vcf.gz > out.txt\fR .SH OPTIONS .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .HP \fB\-s\fR, \fB\-\-sample\fR .IP debian/examples0000644000000000000000000000001312151612073010717 0ustar examples/* debian/vcf-query.10000644000000000000000000000162412151612073011172 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-QUERY "1" "July 2011" "vcf-query 0.1.5" "User Commands" .SH NAME vcf-query \- query VCF files .SH SYNOPSIS .B query-vcf [\fIOPTIONS\fR] \fIfile.vcf.gz\fR .SH OPTIONS .TP \fB\-c\fR, \fB\-\-columns\fR List of comma\-separated column names. 
.TP \fB\-f\fR, \fB\-\-format\fR The default is '%CHROM:%POS\et%REF[\et%SAMPLE=%GT]\en' .TP \fB\-l\fR, \fB\-\-list\-columns\fR List columns. .TP \fB\-r\fR, \fB\-\-region\fR chr:from\-to Retrieve the region. (Runs tabix.) .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SH EXAMPLES .IP query\-vcf file.vcf.gz 1:1000\-2000 \fB\-c\fR NA001,NA002,NA003 query\-vcf file.vcf.gz \fB\-r\fR 1:1000\-2000 \fB\-f\fR '%CHROM:%POS\et%REF[\et%SAMPLE:%*=,]\en' query\-vcf file.vcf.gz \fB\-f\fR '[%GT\et]%LINE\en' query\-vcf file.vcf.gz \fB\-f\fR '%CHROM\et%POS\et%INFO/DP\et%FILTER\en' debian/vcf-concat.10000644000000000000000000000145012151612073011271 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-CONCAT "1" "July 2011" "vcf-concat 0.1.5" "User Commands" .SH NAME vcf-concat \- concatenate VCF files .SH SYNOPSIS .B vcf-concat [\fIOPTIONS\fR] \fIA.vcf.gz B.vcf.gz C.vcf.gz > out.vcf\fR .SH DESCRIPTION About: Convenience tool for concatenating VCF files. In the basic mode it does not .IP do anything fancy except for a sanity check that all files have the same columns. When run with the \fB\-s\fR option, it will perform a partial merge sort, looking at a limited number of open files simultaneously. .SH OPTIONS .TP \fB\-f\fR, \fB\-\-files\fR Read the list of files from a file. .TP \fB\-s\fR, \fB\-\-merge\-sort\fR Allow small overlaps in N consecutive files. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. debian/vcf-stats.10000644000000000000000000000242212151612073011160 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4.
.TH VCF-STATS "1" "July 2011" "vcf-stats 0.1.5" "User Commands" .SH NAME vcf-stats \- statistic of VCF file .SH SYNOPSIS .B vcf-stats [\fIOPTIONS\fR] \fIfile.vcf.gz\fR .SH OPTIONS .TP \fB\-d\fR, \fB\-\-dump\fR Take an existing dump file and recreate the files (works with \fB\-p\fR) .TP \fB\-f\fR, \fB\-\-filters\fR List of filters such as column/field (any value), column/field=bin:max (cluster in bins),column/field=value (exact value) .TP \fB\-p\fR, \fB\-\-prefix\fR Prefix of output files. If slashes are present, directories will be created. .TP \fB\-s\fR, \fB\-\-samples\fR Process only the listed samples, \- for none. Excluding unwanted samples may increase performance considerably. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SH EXAMPLES .IP # Calculate stats separately for the filter field, quality and non\-indels vcf\-stats file.vcf.gz \fB\-f\fR FILTER,QUAL=10:200,INFO/INDEL=False \fB\-p\fR out/ .IP # Calculate stats for all samples vcf\-stats file.vcf.gz \fB\-f\fR FORMAT/DP=10:200 \fB\-p\fR out/ .IP # Calculate stats only for the sample NA00001 vcf\-stats file.vcf.gz \fB\-f\fR SAMPLE/NA00001/DP=1:200 \fB\-p\fR out/ .IP vcf\-stats file.vcf.gz > perl.dump debian/vcf-isec.10000644000000000000000000000357712151612073010761 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-ISEC "1" "July 2011" "vcf-isec 0.1.5" "User Commands" .SH NAME vcf-isec \- create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files .SH SYNOPSIS .B vcf-isec [\fIOPTIONS\fR] \fIfile1.vcf file2.vcf \fR... .SH DESCRIPTION About: Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab\-delimited files. .IP Note that lines from all files can be intermixed together on the output, which can yield unexpected results. .SH OPTIONS .TP \fB\-C\fR, \fB\-\-chromosomes\fR Process the given chromosomes (comma\-separated list or one chromosome per line in a file). 
.TP \fB\-c\fR, \fB\-\-complement\fR Output positions present in the first file but missing from the other files. .TP \fB\-d\fR, \fB\-\-debug\fR Debugging information .TP \fB\-f\fR, \fB\-\-force\fR Continue even if the script complains about differing columns. .TP \fB\-o\fR, \fB\-\-one\-file\-only\fR Print only entries from the left\-most file. Without \fB\-o\fR, all unique positions will be printed. .TP \fB\-n\fR, \fB\-\-nfiles\fR [+\-=] Output positions present in this many (=), this many or more (+), or this many or fewer (\-) files. .TP \fB\-p\fR, \fB\-\-prefix\fR If present, multiple files will be created with all possible isec combinations. (Suitable for Venn Diagram analysis.) .TP \fB\-t\fR, \fB\-\-tab\fR Tab\-delimited file with indexes of chromosome and position columns. (1\-based indexes) .TP \fB\-w\fR, \fB\-\-win\fR In repetitive sequences, the same indel can be called at different positions. Consider records this far apart as matching (be it a SNP or an indel). .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SH EXAMPLES .IP bgzip file.vcf; tabix \fB\-p\fR vcf file.vcf.gz bgzip file.tab; tabix \fB\-s\fR 1 \fB\-b\fR 2 \fB\-e\fR 2 file.tab.gz debian/watch0000644000000000000000000000021112151612073010207 0ustar version=3 opts=dversionmangle=s/[~\+]dfsg// \ http://sf.net/vcftools/vcftools_(\d*\.\d*\.\d*)\.tar\.gz \ debian debian/get-orig-source debian/vcf-compare.10000644000000000000000000000324212151612073011451 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-COMPARE "1" "July 2011" "vcf-compare 0.1.5" "User Commands" .SH NAME vcf-compare \- compare bgzipped and tabix indexed VCF files .SH SYNOPSIS .B compare-vcf [\fIOPTIONS\fR] \fIfile1.vcf file2.vcf \fR... .SH DESCRIPTION About: Compare bgzipped and tabix indexed VCF files. (E.g. bgzip file.vcf; tabix \fB\-p\fR vcf file.vcf.gz) .SH OPTIONS .TP \fB\-c\fR, \fB\-\-chromosomes\fR Same as \fB\-r\fR, left for backward compatibility. 
Please do not use as it will be dropped in the future. .TP \fB\-d\fR, \fB\-\-debug\fR Debugging information. Giving the option multiple times increases verbosity .TP \fB\-H\fR, \fB\-\-cmp\-haplotypes\fR Compare haplotypes, not only positions .TP \fB\-m\fR, \fB\-\-name\-mapping\fR Use with \fB\-H\fR when comparing files with differing column names. The argument to this options is a comma\-separated list or one mapping per line in a file. The names are colon separated and must appear in the same order as the files on the command line. .TP \fB\-R\fR, \fB\-\-refseq\fR Compare the actual sequence, not just positions. Use with \fB\-w\fR to compare indels. .TP \fB\-r\fR, \fB\-\-regions\fR Process the given regions (comma\-separated list or one region per line in a file). .TP \fB\-s\fR, \fB\-\-samples\fR Process only the listed samples. Excluding unwanted samples may increase performance considerably. .TP \fB\-w\fR, \fB\-\-win\fR In repetitive sequences, the same indel can be called at different positions. Consider records this far apart as matching (be it a SNP or an indel). .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. debian/upstream0000644000000000000000000000117512151612073010753 0ustar Reference: Author: Petr Danecek and Adam Auton and Goncalo Abecasis and Cornelis A. Albers and Eric Banks and Mark A. DePristo and Robert E. Handsaker and Gerton Lunter and Gabor T. Marth and Stephen T. 
Sherry and Gilean McVean and Richard Durbin Title: The variant call format and VCFtools Journal: Bioinformatics Year: 2011 Volume: 27 Number: 15 Pages: 2156-8 DOI: 10.1093/bioinformatics/btr330 PMID: 21653522 URL: http://bioinformatics.oxfordjournals.org/content/early/2011/06/07/bioinformatics.btr330 eprint: http://bioinformatics.oxfordjournals.org/content/early/2011/06/07/bioinformatics.btr330.full.pdf+html debian/vcftools.10000644000000000000000000004647712151612073011127 0ustar .TH VCFTOOLS "1" "July 2011" "vcftools 0.1.5" "User Commands" .SH NAME vcftools \- analyse VCF files .SH SYNOPSIS .B vcftools \fR[\fIOPTIONS\fR] .SH DESCRIPTION The vcftools program is run from the command line. The interface is inspired by PLINK, and so should be largely familiar to users of that package. Commands take the following form: vcftools \-\-vcf file1.vcf \-\-chr 20 \-\-freq The above command tells vcftools to read in the file file1.vcf, extract sites on chromosome 20, and calculate the allele frequency at each site. The resulting allele frequency estimates are stored in the output file, out.freq. As in the above example, output from vcftools is mainly sent to output files, as opposed to being shown on the screen. Note that some commands may only be available in the latest version of vcftools. To obtain the latest version, you should use SVN to checkout the latest code, as described on the home page. Also note that polyploid genotypes are not currently supported. .SS Basic Options .TP \fB\-\-vcf\fR This option defines the VCF file to be processed. The files need to be decompressed prior to use with vcftools. vcftools expects files in VCF format v4.0, a specification of which can be found here. .TP \fB\-\-gzvcf\fR This option can be used in place of the \-\-vcf option to read compressed (gzipped) VCF files directly. Note that this option can be quite slow when used with large files. 
.TP \fB\-\-out\fR This option defines the output filename prefix for all files generated by vcftools. For example, if is set to output_filename, then all output files will be of the form output_filename.*** . If this option is omitted, all output files will have the prefix 'out.'. .SS Site Filter Options .TP \fB\-\-chr\fR Only process sites with a chromosome identifier matching .TP \fB\-\-from\-bp\fR .TP \fB\-\-to\-bp\fR These options define the physical range of sites that will be processed. Sites outside of this range will be excluded. These options can only be used in conjunction with \-\-chr. .TP \fB\-\-snp\fR Include SNP(s) with matching ID. This command can be used multiple times in order to include more than one SNP. .TP \fB\-\-snps\fR Include a list of SNPs given in a file. The file should contain a list of SNP IDs, with one ID per line. .TP \fB\-\-exclude\fR Exclude a list of SNPs given in a file. The file should contain a list of SNP IDs, with one ID per line. .TP \fB\-\-positions\fR Include a set of sites on the basis of a list of positions. Each line of the input file should contain a (tab-separated) chromosome and position. The file should have a header line. Sites not included in the list are excluded. .TP \fB\-\-bed\fR .TP \fB\-\-exclude\-bed\fR Include or exclude a set of sites on the basis of a BED file. Only the first three columns (chrom, chromStart and chromEnd) are required. The BED file should have a header line. .TP \fB\-\-remove\-filtered\-all\fR .TP \fB\-\-remove\-filtered\fR .TP \fB\-\-keep\-filtered\fR These options are used to filter sites on the basis of their FILTER flag. The first option removes all sites with a FILTER flag. The second option can be used to exclude sites with a specific filter flag. The third option can be used to select sites on the basis of specific filter flags. The second and third options can be used multiple times to specify multiple FILTERs.
The \-\-keep\-filtered option is applied before the \-\-remove\-filtered option. .TP \fB\-\-minQ\fR Include only sites with Quality above this threshold. .TP \fB\-\-min\-meanDP\fR .TP \fB\-\-max\-meanDP\fR Include sites with mean Depth within the thresholds defined by these options. .TP \fB\-\-maf\fR .TP \fB\-\-max\-maf\fR Include only sites with Minor Allele Frequency within the specified range. .TP \fB\-\-non\-ref\-af\fR .TP \fB\-\-max\-non\-ref\-af\fR Include only sites with Non-Reference Allele Frequency within the specified range. .TP \fB\-\-hwe\fR Assesses sites for Hardy-Weinberg Equilibrium using an exact test, as defined by Wigginton, Cutler and Abecasis (2005). Sites with a p-value below the threshold defined by this option are taken to be out of HWE, and therefore excluded. .TP \fB\-\-geno\fR Exclude sites on the basis of the proportion of missing data (defined to be between 0 and 1). .TP \fB\-\-min\-alleles\fR .TP \fB\-\-max\-alleles\fR Include only sites with a number of alleles within the specified range. For example, to include only bi\-allelic sites, one could use: vcftools \-\-vcf file1.vcf \-\-min\-alleles 2 \-\-max\-alleles 2 .TP \fB\-\-mask\fR .TP \fB\-\-invert\-mask\fR .TP \fB\-\-mask\-min\fR Include sites on the basis of a FASTA-like file. The provided file contains a sequence of integer digits (between 0 and 9) for each position on a chromosome that specify if a site at that position should be filtered or not. An example mask file would look like: >1 0000011111222... In this example, sites in the VCF file located within the first 5 bases of the start of chromosome 1 would be kept, whereas sites at position 6 onwards would be filtered out. The threshold integer that determines if sites are filtered or not is set using the \-\-mask\-min option, which defaults to 0. The chromosomes contained in the mask file must be sorted in the same order as the VCF file.
The \-\-mask option is used to specify the mask file to be used, whereas the \-\-invert\-mask option can be used to specify a mask file that will be inverted before being applied. .SS Individual Filters .TP \fB\-\-indv\fR Specify an individual to be kept in the analysis. This option can be used multiple times to specify multiple individuals. .TP \fB\-\-keep\fR Provide a file containing a list of individuals to include in subsequent analysis. Each individual ID (as defined in the VCF headerline) should be included on a separate line. .TP \fB\-\-remove\-indv\fR Specify an individual to be removed from the analysis. This option can be used multiple times to specify multiple individuals. If the \-\-indv option is also specified, then the \-\-indv option is executed before the \-\-remove\-indv option. .TP \fB\-\-remove\fR Provide a file containing a list of individuals to exclude in subsequent analysis. Each individual ID (as defined in the VCF headerline) should be included on a separate line. If both the \-\-keep and the \-\-remove options are used, then the \-\-keep option is executed before the \-\-remove option. .TP \fB\-\-min\-indv\-meanDP\fR .TP \fB\-\-max\-indv\-meanDP\fR Calculate the mean coverage on a per-individual basis. Only individuals with coverage within the range specified by these options are included in subsequent analyses. .TP \fB\-\-mind\fR Specify the minimum call rate threshold for each individual. .TP \fB\-\-phased\fR First excludes all individuals having all genotypes unphased, and subsequently excludes all sites with unphased genotypes. The remaining data therefore consists of phased data only. .SS Genotype Filters .TP \fB\-\-remove\-filtered\-geno\-all\fR .TP \fB\-\-remove\-filtered\-geno\fR The first option removes all genotypes with a FILTER flag. The second option can be used to exclude genotypes with a specific filter flag. .TP \fB\-\-minGQ\fR Exclude all genotypes with a quality below the threshold specified by this option (GQ).
.TP \fB\-\-minDP\fR Exclude all genotypes with a sequencing depth below that specified by this option (DP) .SS Output Statistics .TP \fB\-\-freq\fR .TP \fB\-\-counts\fR .TP \fB\-\-freq2\fR .TP \fB\-\-counts2\fR Output per\-site frequency information. The \-\-freq option outputs the allele frequency in a file with the suffix '.frq'. The \-\-counts option outputs a similar file with the suffix '.frq.count', that contains the raw allele counts at each site. The \-\-freq2 and \-\-counts2 options are used to suppress allele information in the output file. In this case, the order of the freqs/counts depends on the numbering in the VCF file. .TP \fB\-\-depth\fR Generates a file containing the mean depth per individual. This file has the suffix '.idepth'. .TP \fB\-\-site\-depth\fR .TP \fB\-\-site\-mean\-depth\fR Generates a file containing the depth per site. The \-\-site\-depth option outputs the depth for each site summed across individuals. This file has the suffix '.ldepth'. Likewise, the \-\-site\-mean\-depth outputs the mean depth for each site, and the output file has the suffix '.ldepth.mean'. .TP \fB\-\-geno\-depth\fR Generates a (possibly very large) file containing the depth for each genotype in the VCF file. Missing entries are given the value \-1. The file has the suffix '.gdepth'. .TP \fB\-\-site\-quality\fR Generates a file containing the per\-site SNP quality, as found in the QUAL column of the VCF file. This file has the suffix '.lqual'. .TP \fB\-\-het\fR Calculates a measure of heterozygosity on a per\-individual basis. Specifically, the inbreeding coefficient, F, is estimated for each individual using a method of moments. The resulting file has the suffix '.het'. .TP \fB\-\-hardy\fR Reports a p\-value for each site from a Hardy\-Weinberg Equilibrium test (as defined by Wigginton, Cutler and Abecasis (2005)). The resulting file (with suffix '.hwe') also contains the Observed numbers of Homozygotes and Heterozygotes and the corresponding Expected numbers under HWE.
.TP \fB\-\-missing\fR Generates two files reporting the missingness on a per\-individual and per\-site basis. The two files have suffixes '.imiss' and '.lmiss' respectively. .TP \fB\-\-hap\-r2\fR .TP \fB\-\-geno\-r2\fR .TP \fB\-\-ld\-window\fR .TP \fB\-\-ld\-window\-bp\fR .TP \fB\-\-min\-r2\fR These options are used to report Linkage Disequilibrium (LD) statistics as summarised by the r2 statistic. The \-\-hap\-r2 option informs vcftools to output a file reporting the r2 statistic using phased haplotypes. This is the traditional measure of LD often reported in the population genetics literature. If phased haplotypes are unavailable then the \-\-geno\-r2 option may be used, which calculates the squared correlation coefficient between genotypes encoded as 0, 1 and 2 to represent the number of non-reference alleles in each individual. This is the same as the LD measure reported by PLINK. The haplotype version outputs a file with the suffix '.hap.ld', whereas the genotype version outputs a file with the suffix '.geno.ld'. The haplotype version implies the option \-\-phased. The \-\-ld\-window option defines the maximum SNP separation for the calculation of LD. Likewise, the \-\-ld\-window\-bp option can be used to define the maximum physical separation of SNPs included in the LD calculation. Finally, the \-\-min\-r2 sets a minimum value for r2 below which the LD statistic is not reported. .TP \fB\-\-SNPdensity\fR Calculates the number and density of SNPs in bins of size defined by this option. The resulting output file has the suffix '.snpden'. .TP \fB\-\-TsTv\fR Calculates the Transition / Transversion ratio in bins of size defined by this option. The resulting output file has the suffix '.TsTv'. A summary is also supplied in a file with the suffix '.TsTv.summary'. .TP \fB\-\-FILTER\-summary\fR Generates a summary of the number of SNPs and Ts/Tv ratio for each FILTER category. The output file has the suffix '.FILTER.summary'.
.TP \fB\-\-filtered\-sites\fR Creates two files listing sites that have been kept or removed after filtering. The first file, with suffix '.kept.sites', lists sites kept by vcftools after filters have been applied. The second file, with the suffix '.removed.sites', lists sites removed by the applied filters. .TP \fB\-\-singletons\fR This option will generate a file detailing the location of singletons, and the individual they occur in. The file reports both true singletons, and private doubletons (i.e. SNPs where the minor allele only occurs in a single individual and that individual is homozygotic for that allele). The output file has the suffix '.singletons'. .TP \fB\-\-site\-pi\fR .TP \fB\-\-window\-pi\fR These options are used to estimate levels of nucleotide diversity. The first option does this on a per\-site basis, and the output file has the suffix '.sites.pi'. The second option calculates the nucleotide diversity in windows, with the window size defined in the option argument. Output for this option has the suffix '.windowed.pi'. The windowed version requires phased data, and hence use of this option implies the \-\-phased option. .SS Output in Other Formats .TP \fB\-\-012\fR This option outputs the genotypes as a large matrix. Three files are produced. The first, with suffix '.012', contains the genotypes of each individual on a separate line. Genotypes are represented as 0, 1 and 2, where the number represents the number of non-reference alleles. Missing genotypes are represented by \-1. The second file, with suffix '.012.indv' details the individuals included in the main file. The third file, with suffix '.012.pos' details the site locations included in the main file. .TP \fB\-\-IMPUTE\fR This option outputs phased haplotypes in IMPUTE reference\-panel format. As IMPUTE requires phased data, using this option also implies \-\-phased. Unphased individuals and genotypes are therefore excluded. Only bi\-allelic sites are included in the output.
Using this option generates three files. The IMPUTE haplotype file has the suffix '.impute.hap', and the IMPUTE legend file has the suffix '.impute.hap.legend'. The third file, with suffix '.impute.hap.indv', details the individuals included in the haplotype file, although this file is not needed by IMPUTE. .TP \fB\-\-ldhat\fR .TP \fB\-\-ldhat\-geno\fR These options output data in LDhat format. Use of these options also requires the \-\-chr option to be used. The \-\-ldhat option outputs phased data only, and therefore also implies \-\-phased, leading to unphased individuals and genotypes being excluded. Alternatively, the \-\-ldhat\-geno option treats all of the data as unphased, and therefore outputs LDhat files in genotype/unphased format. In either case, two files are generated with the suffixes '.ldhat.sites' and '.ldhat.locs', which correspond to the LDhat 'sites' and 'locs' input files respectively. .TP \fB\-\-BEAGLE\-GL\fR This option outputs genotype likelihood information for input into the BEAGLE program. This option requires the VCF file to contain the FORMAT GL tag, which can generally be output by SNP callers such as the GATK. Use of this option requires a chromosome to be specified via the \-\-chr option. The resulting output file (with the suffix '.BEAGLE.GL') contains genotype likelihoods for biallelic sites, and is suitable for input into BEAGLE via the 'like=' argument. .TP \fB\-\-plink\fR This option outputs the genotype data in PLINK PED format. Two files are generated, with suffixes '.ped' and '.map'. Note that only bi\-allelic loci will be output. Further details of these files can be found in the PLINK documentation. Note: This option can be very slow on large datasets. Using the \-\-chr option to divide up the dataset is advised. .TP \fB\-\-plink\-tped\fR The \-\-plink option above can be extremely slow on large datasets. An alternative that might be considerably quicker is to output in the PLINK transposed format.
This can be achieved using the \-\-plink\-tped option, which produces two files with suffixes '.tped' and '.tfam'. .TP \fB\-\-recode\fR The \-\-recode option is used to generate a VCF file from the input VCF file having applied the options specified by the user. The output file has the suffix '.recode.vcf'. By default, the INFO fields are removed from the output file, as the INFO values may be invalidated by the recoding (e.g. the total depth may need to be recalculated if individuals are removed). This default functionality can be overridden by using the \-\-keep\-INFO option, whose argument defines the INFO key to keep in the output file. The \-\-keep\-INFO flag can be used multiple times. Alternatively, the option \-\-keep\-INFO\-all can be used to retain all INFO fields. .SS Miscellaneous .TP \fB\-\-extract\-FORMAT\-info\fR Extract information from the genotype fields in the VCF file relating to a specified FORMAT identifier. For example, using the option '\-\-extract\-FORMAT\-info GT' would extract all of the GT (i.e. Genotype) entries. The resulting output file has the suffix '..FORMAT'. .TP \fB\-\-get\-INFO\fR This option is used to extract information from the INFO field in the VCF file. The argument specifies the INFO tag to be extracted, and the option can be used multiple times in order to extract multiple INFO entries. The resulting file, with suffix '.INFO', contains the required INFO information in a tab\-separated table. For example, to extract the NS and DB flags, one would use the command: vcftools \-\-vcf file1.vcf \-\-get\-INFO NS \-\-get\-INFO DB .SS VCF File Comparison Options The file comparison options are currently in a state of flux and likely buggy. If you find a bug, please report it. Note that genotype\-level filters are not supported in these options. .TP \fB\-\-diff\fR .TP \fB\-\-gzdiff\fR Select a VCF file for comparison with the file specified by the \-\-vcf option.
Outputs two files describing the sites and individuals common / unique to each file. These files have the suffixes '.diff.sites_in_files' and '.diff.indv_in_files' respectively. The \-\-gzdiff version can be used to read compressed VCF files. .TP \fB\-\-diff\-site\-discordance\fR Used in conjunction with the \-\-diff option to calculate discordance on a site by site basis. The resulting output file has the suffix '.diff.sites'. .TP \fB\-\-diff\-indv\-discordance\fR Used in conjunction with the \-\-diff option to calculate discordance on a per-individual basis. The resulting output file has the suffix '.diff.indv'. .TP \fB\-\-diff\-discordance\-matrix\fR Used in conjunction with the \-\-diff option to calculate a discordance matrix. This option only works with bi\-allelic loci with matching alleles that are present in both files. The resulting output file has the suffix '.diff.discordance.matrix'. .TP \fB\-\-diff\-switch\-error\fR Used in conjunction with the \-\-diff option to calculate phasing errors (specifically 'switch errors'). This option generates two output files describing switch errors found between sites, and the average switch error per individual. These two files have the suffixes '.diff.switch' and '.diff.indv.switch' respectively. .SS Options still in development The following options are yet to be finalised, are likely to contain bugs, and are likely to change in the future. .TP \fB\-\-fst\fR .TP \fB\-\-gzfst\fR Calculate FST for a pair of VCF files, with the second file being specified by this option. FST is currently calculated using the formula described in the supplementary material of the Phase I HapMap paper. Currently, only pairwise FST calculations are supported, although this will likely change in the future. The \-\-gzfst option can be used to read compressed VCF files. .TP \fB\-\-LROH\fR Identify Long Runs of Homozygosity. .TP \fB\-\-relatedness\fR Output Individual Relatedness Statistics. 
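.SS "Example:"
.IP
# Illustrative sketch only: file1.vcf and file2.vcf are placeholder names and the numeric thresholds are arbitrary values chosen for illustration; the options are those described above.
.IP
# Report genotype r2 for SNPs at most 50000 bp apart, omitting values below 0.2 (writes a file with the suffix '.geno.ld')
vcftools \-\-vcf file1.vcf \-\-geno\-r2 \-\-ld\-window\-bp 50000 \-\-min\-r2 0.2
.IP
# Compare two files and also report discordance site by site (writes a file with the suffix '.diff.sites')
vcftools \-\-vcf file1.vcf \-\-diff file2.vcf \-\-diff\-site\-discordance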
debian/vcf-convert.10000644000000000000000000000111212151612073011475 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-CONVERT "1" "July 2011" "vcf-convert 0.1.5" "User Commands" .SH NAME vcf-convert \- convert between VCF versions .SH SYNOPSIS .B cat \fIin.vcf | vcf-convert \fR[\fIOPTIONS\fR] \fI> out.vcf\fR .SH DESCRIPTION About: Convert between VCF versions, currently to VCFv4.0 only. .SH OPTIONS .TP \fB\-r\fR, \fB\-\-refseq\fR The reference sequence in samtools faindexed fasta file. (Not required with SNPs only.) .TP \fB\-v\fR, \fB\-\-version\fR 4.0 .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. debian/vcf-annotate.10000644000000000000000000000552612151612073011643 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-ANNOTATE "1" "July 2011" "vcf-annotate 0.1.5" "User Commands" .SH NAME vcf-annotate \- annotate VCF file, add filters or custom annotations .SH SYNOPSIS .B cat \fIin.vcf | vcf-annotate \fR[\fIOPTIONS\fR] \fI> out.vcf\fR .SH DESCRIPTION About: Annotates VCF file, adding filters or custom annotations. Requires tabix indexed file with annotations. .IP Currently annotates only the INFO column, but it will be extended on demand. .SH OPTIONS .TP \fB\-a\fR, \fB\-\-annotations\fR The tabix indexed file with the annotations: CHR\etFROM[\etTO][\etVALUE]+. .TP \fB\-c\fR, \fB\-\-columns\fR The list of columns in the annotation file, e.g. CHROM,FROM,TO,\-,INFO/STR,INFO/GN. The dash in this example indicates that the third column should be ignored. If TO is not present, it is assumed that TO equals FROM. .TP \fB\-d\fR, \fB\-\-description\fR Header annotation, e.g. key=INFO,ID=HM2,Number=0,Type=Flag,Description='HapMap2 membership'. The descriptions can be read from a file, one annotation per line. .TP \fB\-f\fR, \fB\-\-filter\fR Apply filters, list is in the format flt1=value/flt2/flt3=value/etc. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message.
.SS "Filters:" .TP + Apply all filters with default values (can be overridden, see the example below). .TP \fB\-X\fR Exclude the filter X .TP 1, StrandBias FLOAT Min P\-value for strand bias (given PV4) [0.0001] .TP 2, BaseQualBias FLOAT Min P\-value for baseQ bias [1e\-100] .TP 3, MapQualBias FLOAT Min P\-value for mapQ bias [0] .TP 4, EndDistBias FLOAT Min P\-value for end distance bias [0.0001] .TP a, MinAB INT Minimum number of alternate bases [2] .TP c, SnpCluster INT1,INT2 Filters clusters of 'INT1' or more SNPs within a run of 'INT2' bases [] .TP D, MaxDP INT Maximum read depth [10000000] .TP d, MinDP INT Minimum read depth [2] .TP q, MinMQ INT Minimum RMS mapping quality for SNPs [10] .TP Q, Qual INT Minimum value of the QUAL field [10] .TP r, RefN Reference base is N [] .TP W, GapWin INT Window size for filtering adjacent gaps [10] .TP w, SnpGap INT SNP within INT bp around a gap to be filtered [10] .SS "Example:" .IP zcat in.vcf.gz | vcf\-annotate \fB\-a\fR annotations.gz \fB\-d\fR descriptions.txt | bgzip \fB\-c\fR >out.vcf.gz zcat in.vcf.gz | vcf\-annotate \fB\-f\fR +/\-a/c=3,10/q=3/d=5/\-D \fB\-a\fR annotations.gz \fB\-d\fR descriptions.txt | bgzip \fB\-c\fR >out.vcf.gz .SS "Where descriptions.txt contains:" .IP key=INFO,ID=GN,Number=1,Type=String,Description='Gene Name' key=INFO,ID=STR,Number=1,Type=Integer,Description='Strand' debian/vcf-fix-ploidy.10000644000000000000000000000217112151612073012107 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-FIX-PLOIDY "1" "February 2013" "vcf-fix-ploidy" "User Commands" .SH NAME vcf-fix-ploidy \- vcf-fix-ploidy .SH SYNOPSIS .B cat \fIbroken.vcf | vcf-fix-ploidy \fR[\fIOPTIONS\fR] \fI> fixed.vcf\fR .SH OPTIONS .TP \fB\-a\fR, \fB\-\-assumed\-sex\fR M or F, required if the list is not complete in \fB\-s\fR .TP \fB\-l\fR, \fB\-\-fix\-likelihoods\fR Add or remove het likelihoods (not the default behaviour) .TP \fB\-p\fR, \fB\-\-ploidy\fR Ploidy definition. The default is shown below. 
.TP \fB\-s\fR, \fB\-\-samples\fR List of sample sexes (sample_name [MF]). .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SS "Default ploidy definition:" .IP ploidy => { .IP X => [ .IP # The pseudoautosomal regions 60,001\-2,699,520 and 154,931,044\-155,270,560 with the ploidy 2 { from=>1, to=>60_000, M=>1 }, { from=>2_699_521, to=>154_931_043, M=>1 }, .IP ], Y => [ .IP # No chrY in females and one copy in males { from=>1, to=>59_373_566, M=>1, F=>0 }, .IP ], MT => [ .IP # Haploid MT in males and females { from=>1, to => 16_569, M=>1, F=>1 }, .IP ], .IP } debian/vcf-indel-stats.10000644000000000000000000000101512151612073012246 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-INDEL-STATS "1" "February 2013" "vcf-indel-stats" "User Commands" .SH NAME vcf-indel-stats \- vcf-indel-stat .SH SYNOPSIS .B vcf-indel-stats [\fIOPTIONS\fR] \fI< in.vcf > out.txt\fR .SH DESCRIPTION About: Currently calculates in\-frame ratio. .SH OPTIONS .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .TP \fB\-e\fR, \fB\-\-exons\fR Tab\-separated file with exons (chr,from,to; 1\-based, inclusive) .HP \fB\-v\fR, \fB\-\-verbose\fR .IP debian/fill-an-ac.10000644000000000000000000000045612151612073011156 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH FILL-AN-AC "1" "July 2011" "fill-an-ac 0.1.5" "User Commands" .SH NAME fill-an-ac \- fill or recalculate AN and AC INFO fields. .SH SYNOPSIS .B fill-an-ac .SH DESCRIPTION .IP zcat file.vcf.gz | fill-an-ac | bgzip \-c > out.vcf.gz debian/get-orig-source0000755000000000000000000000141012151612073012121 0ustar #!/bin/bash #set -x PKG=`dpkg-parsechangelog | awk '/^Source/ { print $2 }'` if !
echo $@ | grep -q upstream-version ; then DVERSION=`dpkg-parsechangelog | awk '/^Version:/ { print $2 }' | cut -d- -f1` else DVERSION=`echo $@ | sed "s?^.*--upstream-version \([0-9.]\+\) .*${PKG}.*?\1?"` if echo "$DVERSION" | grep -q "upstream-version" ; then echo "Unable to parse version number" exit 1 fi fi cd ../tarballs tar -xzf ${PKG}_${DVERSION}.tar.gz cd ${PKG}_${DVERSION} find . -name CVS -type d | xargs rm -rf find . -name '.svn' -type d | xargs rm -rf find . -name "*.pdf" | xargs rm -f cd .. GZIP="--best --no-name" tar --owner=root --group=root --mode=a+rX -czf "$PKG"_"$DVERSION"+dfsg.orig.tar.gz ${PKG}_${DVERSION} rm -rf ${PKG}_${DVERSION} debian/fill-aa.10000644000000000000000000000057012151612073010555 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH FILL-AA "1" "July 2011" "fill-aa 0.1.5" "User Commands" .SH NAME fill-aa \- fill in ancestral alleles .SH SYNOPSIS .B fill-aa [OPTIONS] .SH DESCRIPTION .IP zcat file.vcf.gz | fill-aa \-a ancestral-alleles.fa.gz | bgzip \-c > out.vcf.gz .SH OPTIONS .TP \fB\-a\fR file containing ancestral alleles debian/vcf-subset.10000644000000000000000000000200312151612073011322 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-SUBSET "1" "July 2011" "vcf-subset 0.1.5" "User Commands" .SH NAME vcf-subset \- create subset of VCF file .SH SYNOPSIS .B vcf-subset [\fIOPTIONS\fR] \fIin.vcf.gz > out.vcf\fR .SH OPTIONS .TP \fB\-c\fR, \fB\-\-columns\fR File or comma\-separated list of columns to keep in the vcf file. If file, one column per row .TP \fB\-e\fR, \fB\-\-exclude\-ref\fR Exclude rows not containing variants. .TP \fB\-p\fR, \fB\-\-private\fR Print only rows where only the subset columns carry an alternate allele. .TP \fB\-r\fR, \fB\-\-replace\-with\-ref\fR Replace the excluded types with reference allele instead of dot. .TP \fB\-t\fR, \fB\-\-type\fR Comma\-separated list of variant types to include: SNPs,indels.
.TP \fB\-u\fR, \fB\-\-keep\-uncalled\fR Do not exclude rows without calls. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SH EXAMPLES .IP cat in.vcf | vcf\-subset \fB\-r\fR \fB\-t\fR indels \fB\-e\fR \fB\-c\fR SAMPLE1 > out.vcf debian/makeman0000755000000000000000000000335312151612073010527 0ustar #!/bin/bash # helper script to create first version of man pages help2man -n "annotate VCF file, add filters or custom annotations" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-annotate > vcf-annotate.1 help2man -n "compare bgzipped and tabix indexed VCF files" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-compare > vcf-compare.1 help2man -n "concatenate VCF files" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-concat > vcf-concat.1 help2man -n "convert between VCF versions" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-convert > vcf-convert.1 help2man -n "create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-isec > vcf-isec.1 help2man -n "merge the bgzipped and tabix indexed VCF files" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-merge > vcf-merge.1 help2man -n "query VCF files" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-query > vcf-query.1 help2man -n "sort VCF file" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-sort > vcf-sort.1 help2man -n "statistic of VCF file" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-stats > vcf-stats.1 help2man -n "create subset of VCF file" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-subset > vcf-subset.1 help2man -n "convert to tabix" -N --help-option="-h" --no-discard-stderr --version-string="0.1.5" vcf-to-tab > vcf-to-tab.1 help2man -n "validate VCF file" -N --help-option="-h" 
--no-discard-stderr --version-string="0.1.5" vcf-validator > vcf-validator.1 debian/patches/0000755000000000000000000000000012163334107010615 5ustar debian/patches/series0000644000000000000000000000004512151612073012027 0ustar make.patch use-dpkg-buildflags.patch debian/patches/make.patch0000644000000000000000000000121712151612073012552 0ustar Description: do not use MAKEFLAGS Author: Thorsten Alteholz Last-Update: 2011-07-03 Index: vcftools-0.1.7/Makefile =================================================================== --- vcftools_0.1.7.org/Makefile 2011-07-04 15:22:33.000000000 +0200 +++ vcftools_0.1.7/Makefile 2011-07-04 15:22:56.000000000 +0200 @@ -22,7 +22,7 @@ DIRS = cpp perl install: @mkdir -p $(BINDIR); mkdir -p $(MODDIR); \ - for dir in $(DIRS); do cd $$dir && $(MAKE) $(MAKEFLAGS) && cd ..; done + for dir in $(DIRS); do cd $$dir && $(MAKE) && cd ..; done clean: @for dir in $(DIRS); do cd $$dir && $(MAKE) clean && cd ..; done debian/patches/use-dpkg-buildflags.patch0000644000000000000000000000222712163241374015475 0ustar Author: Andreas Tille Date: Sat, 12 May 2012 09:31:58 +0200 Description: Enable propagation of hardening flags Index: vcftools_0.1.11/cpp/Makefile =================================================================== --- vcftools_0.1.11.orig/cpp/Makefile 2013-06-13 16:40:54.000000000 +0200 +++ vcftools_0.1.11/cpp/Makefile 2013-06-27 19:54:45.000000000 +0200 @@ -12,9 +12,9 @@ VCFTOOLS_PCA = 0 endif # Compiler flags -CFLAGS = -O2 -m64 +CFLAGS += -O2 #CFLAGS = -Wall -O2 -pg -m64 -CPPFLAGS = -O2 -D_FILE_OFFSET_BITS=64 +CPPFLAGS += -O2 -D_FILE_OFFSET_BITS=64 #CPPFLAGS = -O2 -Wall -pg -D_FILE_OFFSET_BITS=64 # Included libraries (zlib) LIB = -lz @@ -38,7 +38,7 @@ endif vcftools: $(OBJS) - $(CPP) $(CPPFLAGS) $(OBJS) -o vcftools $(LIB) + $(CPP) $(CPPFLAGS) $(OBJS) $(LDFLAGS) -o vcftools $(LIB) ifdef BINDIR cp $(CURDIR)/$@ $(BINDIR)/$@ endif @@ -50,8 +50,8 @@ -include $(OBJS:.o=.d) %.o: %.cpp - $(CPP) -c $(CPPFLAGS) $*.cpp -o $*.o - $(CPP) -MM 
$(CPPFLAGS) $*.cpp > $*.d + $(CPP) -c $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $*.cpp -o $*.o + $(CPP) -MM $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $*.cpp > $*.d # remove compilation products clean: debian/fill-ref-md5.10000644000000000000000000000214212151612073011430 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH FILL-REF-MD5 "1" "February 2013" "fill-ref-md5" "User Commands" .SH NAME fill-ref-md5 \- computes MD5 sum of the reference sequence .SH SYNOPSIS .B fill-ref-md5 [\fIOPTIONS\fR] \fIin.vcf.gz out.vcf.gz\fR .SH DESCRIPTION About: The script computes MD5 sum of the reference sequence and inserts .IP \&'reference' and 'contig' tags into header as recommended by VCFv4.1. The VCF file must be compressed and tabix indexed, as it takes advantage of the lightning fast tabix reheader functionality. .SH OPTIONS .TP \fB\-d\fR, \fB\-\-dictionary\fR Where to read/write computed MD5s. Opened in append mode, existing records are not touched. .TP \fB\-i\fR, \fB\-\-info\fR Optional info on reference assembly (AS), species (SP), taxonomy (TX) .TP \fB\-r\fR, \fB\-\-refseq\fR The reference sequence in fasta format indexed by samtools faidx .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .SH EXAMPLES .TP fill\-ref\-md5 \-i AS:NCBIM37,SP:"Mus\e Musculus" \-r NCBIM37_um.fa \-d NCBIM37_um.fa.dict in.vcf.gz out.vcf.gz .IP debian/vcf-validator.10000644000000000000000000000070612151612073012012 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH VCF-VALIDATOR "1" "July 2011" "vcf-validator 0.1.5" "User Commands" .SH NAME vcf-validator \- validate VCF file .SH SYNOPSIS .B vcf-validator [\fIOPTIONS\fR] \fIfile.vcf.gz\fR .SH OPTIONS .TP \fB\-d\fR, \fB\-\-duplicates\fR Warn about duplicate positions. .TP \fB\-u\fR, \fB\-\-unique\-messages\fR Output all messages only once. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. 
debian/README.source0000644000000000000000000000074312151612073011347 0ustar vcftools - changes to source tree ================================= A series of files and directories needed to be removed - nothing dramatic: $ find . -name ".svn" ./website/img/.svn ./website/src/.svn ./website/.svn ./cpp/.svn ./examples/.svn ./perl/.svn $ find . -name ".svn" | xargs -r rm -r $ find . -name "*.pdf" ./website/VCF-poster.pdf $ find . -name "*.pdf" | xargs -r rm Also, the source tree has now the version number attached with a hyphen instead of an underscore. debian/vcftools.lintian-overrides0000644000000000000000000000033212163236256014411 0ustar # the compiler flags are used vcftools: hardening-no-fortify-functions usr/bin/vcftools # this string does not appear in sources, it is just some binary blob vcftools: spelling-error-in-binary usr/bin/vcftools teH the debian/rules0000755000000000000000000000054612151612073010251 0ustar #!/usr/bin/make -f # -*- makefile -*- # Uncomment this to turn on verbose mode. # export DH_VERBOSE=1 %: dh $@ override_dh_clean: dh_clean # [ -r Makefile ] && make clean ? override_dh_installchangelogs: dh_installchangelogs perl/ChangeLog get-orig-source: mkdir -p ../tarballs uscan --verbose --force-download --destdir=../tarballs --no-symlink debian/fill-rsIDs.10000644000000000000000000000056212151612073011221 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. .TH FILL-RSIDS "1" "July 2011" "fill-rsIDs 0.1.5" "User Commands" .SH NAME fill-rsIDs \- fill missing rsIDs .SH SYNOPSIS .B fill-rsIDs [OPTIONS] .SH DESCRIPTION .IP zcat file.vcf.gz | fill-rsIDs \-r dbSNP_ids_129.txt.bgz | bgzip \-c > out.vcf.gz .SH OPTIONS .TP \fB\-r\fR file containing ids debian/vcf-to-tab.10000644000000000000000000000055612151612073011216 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.40.4. 
.TH VCF-TO-TAB "1" "July 2011" "vcf-to-tab 0.1.5" "User Commands" .SH NAME vcf-to-tab \- convert to tabix .SH SYNOPSIS .B vcf-to-tab [\fIOPTIONS\fR] \fI< in.vcf > out.tab\fR .SH OPTIONS .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .TP \fB\-i\fR, \fB\-\-iupac\fR Use one\-letter IUPAC codes debian/vcf-shuffle-cols.10000644000000000000000000000101212151612073012406 0ustar .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.41.1. .TH VCF-SHUFFLE-COLS "1" "February 2013" "vcf-shuffle-cols" "User Commands" .SH NAME vcf-shuffle-cols \- reorder columns .SH SYNOPSIS .B vcf-shuffle-cols [\fIOPTIONS\fR] \fI-t template.vcf.gz file.vcf.gz > out.vcf\fR .SH DESCRIPTION About: Reorder columns to match the order in the template VCF. .SH OPTIONS .TP \fB\-t\fR, \fB\-\-template\fR The file with the correct order of the columns. .TP \fB\-h\fR, \-?, \fB\-\-help\fR This help message. .IP debian/source/0000755000000000000000000000000012163334107010466 5ustar debian/source/format0000644000000000000000000000001412151612073011672 0ustar 3.0 (quilt) debian/install0000644000000000000000000000006012151612073010551 0ustar bin usr lib/perl5/site_perl/* usr/share/perl5