NAME
SeqMule an automatic pipeline for next-generation sequencing data analysis
SYNOPSIS
seqmule stats
For details, please use 'seqmule stats -h':
Options:
--prefix,-p <STRING> output prefix. Mandatory for multiple input files.
--bam <BAM> a sorted BAM file (used with --capture, --aln)
--capture [BED] a BED file for capture regions (or any other regions of interest). Effective for --bam and --vcf.
--vcf <VCF> output variant stats for a VCF file. If a BED file is supplied, extract variants based on the BED file.
--aln output alignment stats for a BAM file
--consensus,-c <LIST> comma separated list of files for extracting consensus calls.
VCF4 and SOAPsnp *.consensus format or ANNOVAR *.avinput required.
--union,-u <LIST> comma separated list of files for pooling variants (same format as above).
--venn <LIST> comma separated list of files for Venn diagram plotting (same format as above).
--c-vcf <LIST> comma separated list of SORTED VCF files for extracting consensus calls. *.vcf or *.vcf.gz suffix required
--u-vcf <LIST> comma separated list of SORTED VCF files for extracting union calls. *.vcf or *.vcf.gz suffix required
--ref <FASTA> reference file in FASTA format. Effective for --c-vcf and --u-vcf.
-s,--sample <STRING> sample name for VCF file, used for -vcf, -u, -venn, -c options.
--plink convert VCF to PLINK format (PED,MAP). Only works with --vcf option.
--mendel-stat generate Mendelian error statistics
--paternal <STRING> sample ID for paternal ID (case-sensitive). Rest are either maternal or offspring. Only one family allowed.
--maternal <STRING> sample ID for maternal ID (case-sensitive). Rest are either paternal or offspring. Only one family allowed.
-N <INT> extract variants appearing in at least N input files. Currently only effective for --c-vcf option.
--jmem <STRING> max java memory. Only effective for --c-vcf and --u-vcf. Default: 1750m
--jexe <STRING> Java executable path. Default: java
-t <INT> number of threads. Only effective for --aln, --c-vcf and --u-vcf. Default: 1
--tmpdir <DIR> use DIR for storing large temporary files. Default: $TMPDIR(in your ENV variables) or /tmp
--nofilter If specified, consider all variants, otherwise, only unfiltered variants.
-h,--help help
--noclean do not clean temporary files
-v,--verbose verbose
EXAMPLES
#draw Venn Diagram to examine overlapping between different VCF files
seqmule stats -p gatk-soap-varscan -venn gatk.vcf,soap.avinput,varscan.vcf
#extract union of all variants, ouput in ANNOVAR format
seqmule stats -p gatk-soap-varscan -u gatk.vcf,soap.avinput,varscan.vcf
#extract consensus of all variants, output in ANNOVAR format
seqmule stats -p gatk_soap_varscan -c gatk.vcf,soap.avinput,varscan.vcf
#extract consensus of all variants, output in VCF format
seqmule stats -p gatk_soap_varscan -c-vcf gatk.vcf,soapsnp.vcf,varscan.vcf -ref hg19.fa
#extract union of all variants, output in VCF format
seqmule stats -p gatk_soap_varscan -u-vcf gatk.vcf,soapsnp.vcf,varscan.vcf -ref hg19.fa
#generate coverage statistics for specified region (region.bed)
seqmule stats -p sample -capture region.bed --bam sample.bam
#generate alignment statistics
seqmule stats -bam sample.bam -aln
#generate variant statistics
seqmule stats -vcf sample.vcf
#extract variants in specified region generate variant statistics
seqmule stats -vcf sample.vcf -capture region.bed
#generate Mendelian error statistics
#NOTE, sample.vcf contains 3 samples!
seqmule stats -vcf sample.vcf --plink --mendel-stat --paternal father --maternal mother
OPTIONS
-
--capture
SeqMule automatizes analysis of next-generation sequencing data by simplifying program installation, downloading of various databases, generation of analysis script, and customization of your pipeline.