HISAT2-ALIGN-S

NAME

hisat2-align-s − graph-based alignment of short nucleotide reads to many genomes, wrapper script

DESCRIPTION

HISAT2 version 2.0.5 by Daehwan Kim (infphilo AT gmail DOT com, www.ccb.jhu.edu/people/infphilo)

Usage:

hisat2 [options]* −x <ht2−idx> {−1 <m1> −2 <m2> | −U <r>} [−S <sam>]

<ht2−idx>

Index filename prefix (minus trailing .X.ht2).

<m1>

Files with #1 mates, paired with files in <m2>. Could be gzip’ed (extension: .gz) or bzip2’ed (extension: .bz2).

<m2>

Files with #2 mates, paired with files in <m1>. Could be gzip’ed (extension: .gz) or bzip2’ed (extension: .bz2).

<r>

Files with unpaired reads. Could be gzip’ed (extension: .gz) or bzip2’ed (extension: .bz2).

<sam>

File for SAM output (default: stdout)

<m1>, <m2>, <r> can be comma−separated lists (no whitespace) and can be specified many times. E.g. ’−U file1.fq,file2.fq −U file3.fq’.
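As a minimal sketch of the usage line above, the following Python helper assembles an argument list for hisat2. The helper name `build_hisat2_command` and the file names are hypothetical, and the code only builds the command without running it. Note that the options are written with plain ASCII hyphens, as a shell expects, rather than the typographic minus signs shown on this page.

```python
import shlex


def _join_files(files):
    """Join file names into one comma-separated list (no whitespace)."""
    for name in files:
        # Conservative assumption: reject names that would break the list syntax.
        if "," in name or any(ch.isspace() for ch in name):
            raise ValueError(f"file name not usable in a comma-separated list: {name!r}")
    return ",".join(files)


def build_hisat2_command(index_prefix, mates1=None, mates2=None,
                         unpaired=None, sam_out=None):
    """Assemble an argv list following the usage line (hypothetical helper)."""
    mates1, mates2, unpaired = mates1 or [], mates2 or [], unpaired or []
    if len(mates1) != len(mates2):
        raise ValueError("every #1 mate file needs a matching #2 mate file")
    if not mates1 and not unpaired:
        raise ValueError("need paired files (-1/-2) or unpaired files (-U)")
    argv = ["hisat2", "-x", index_prefix]
    if mates1:
        argv += ["-1", _join_files(mates1), "-2", _join_files(mates2)]
    if unpaired:
        argv += ["-U", _join_files(unpaired)]
    if sam_out:
        argv += ["-S", sam_out]  # without -S, SAM output goes to stdout
    return argv


cmd = build_hisat2_command("genome", unpaired=["file1.fq", "file2.fq"],
                           sam_out="out.sam")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems with unusual paths.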

Options (defaults in parentheses):

Input:

−q

query input files are FASTQ .fq/.fastq (default)

−−qseq

query input files are in Illumina’s qseq format

−f

query input files are (multi−)FASTA .fa/.mfa

−r

query input files are raw one−sequence−per−line

−c

<m1>, <m2>, <r> are sequences themselves, not files

−s/−−skip <int>

skip the first <int> reads/pairs in the input (none)

−u/−−upto <int>

stop after first <int> reads/pairs (no limit)

−5/−−trim5 <int>

trim <int> bases from 5’/left end of reads (0)

−3/−−trim3 <int>

trim <int> bases from 3’/right end of reads (0)

−−phred33

qualities are Phred+33 (default)

−−phred64

qualities are Phred+64

−−int−quals

qualities encoded as space−delimited integers
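The trimming and quality-encoding options above can be illustrated with a short Python sketch. It shows what the options mean for a single read, not how hisat2 implements them; the function names are hypothetical. In practice the quality string would be trimmed together with the sequence.

```python
def trim_read(seq, trim5=0, trim3=0):
    """Drop trim5 bases from the 5'/left end and trim3 from the 3'/right end."""
    end = max(len(seq) - trim3, 0)  # guard against trimming more than the read
    return seq[trim5:end]


def decode_qualities(qual, offset=33):
    """Convert an ASCII quality string to Phred scores (offset 33 or 64)."""
    scores = [ord(ch) - offset for ch in qual]
    if any(score < 0 for score in scores):
        raise ValueError("negative score: the offset is probably wrong")
    return scores


def parse_int_quals(line):
    """Parse qualities given as space-delimited integers."""
    return [int(token) for token in line.split()]


print(trim_read("ACGTACGTAC", trim5=2, trim3=3))   # GTACG
print(decode_qualities("II5!"))                    # [40, 40, 20, 0]
print(decode_qualities("hhT@", offset=64))         # [40, 40, 20, 0]
print(parse_int_quals("40 40 20 0"))               # [40, 40, 20, 0]
```

The three quality examples encode the same scores, which is why choosing the wrong encoding option silently shifts every quality value by 31.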

Alignment:

−−n−ceil <func>

func for max # non−A/C/G/Ts permitted in aln (L,0,0.15)

−−ignore−quals

treat all quality values as 30 on Phred scale (off)

−−nofw

do not align forward (original) version of read (off)

−−norc

do not align reverse−complement version of read (off)
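Several options on this page take a <func> argument such as L,0,0.15. This page does not define the syntax; the sketch below assumes the convention documented in the HISAT2 and Bowtie 2 manuals, where the first letter selects the function type (C constant, L linear, S square root, G natural logarithm) and the result is A + B * F(x). The helper name `eval_func` is hypothetical, and how hisat2 rounds or applies the result is not covered here.

```python
import math


def eval_func(spec, x):
    """Evaluate a '<type>,<A>,<B>' function spec at x (assumed convention)."""
    # Accept specs pasted from this page, which uses typographic minus signs.
    parts = spec.replace("\u2212", "-").split(",")
    kind = parts[0].strip().upper()
    a = float(parts[1])
    b = float(parts[2]) if len(parts) > 2 else 0.0
    if kind == "C":
        return a                      # constant: ignores x
    if kind == "L":
        return a + b * x              # linear
    if kind == "S":
        return a + b * math.sqrt(x)   # square root
    if kind == "G":
        return a + b * math.log(x)    # natural logarithm
    raise ValueError(f"unknown function type: {kind!r}")


read_length = 100
print(eval_func("L,0,0.15", read_length))     # n-ceil default for a 100 bp read
print(eval_func("L,0.0,-0.2", read_length))   # score-min default for a 100 bp read
print(eval_func("G,-8,1", 1000))              # intron-length default at x = 1000
```

Under these assumptions, the default for −−n−ceil evaluates to about 15 for a 100-base read, and the default for −−score−min evaluates to about −20.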

Spliced Alignment:

−−pen−cansplice <int>

penalty for a canonical splice site (0)

−−pen−noncansplice <int>

penalty for a non−canonical splice site (12)

−−pen−canintronlen <func>

penalty for long introns (G,−8,1) with canonical splice sites

−−pen−noncanintronlen <func>

penalty for long introns (G,−8,1) with noncanonical splice sites

−−min−intronlen <int>

minimum intron length (20)

−−max−intronlen <int>

maximum intron length (500000)

−−known−splicesite−infile <path>

provide a list of known splice sites

−−novel−splicesite−outfile <path>

report a list of splice sites

−−novel−splicesite−infile <path>

provide a list of novel splice sites

−−no−temp−splicesite

disable the use of splice sites found

−−no−spliced−alignment

disable spliced alignment

−−rna−strandness <string>

Specify strand−specific information (unstranded)

−−tmo

Reports only those alignments within known transcriptome

−−dta

Reports alignments tailored for transcript assemblers

−−dta−cufflinks

Reports alignments tailored specifically for cufflinks

Scoring:

−−ma <int>

match bonus (0 for −−end−to−end, 2 for −−local)

−−mp <int>,<int>

max and min penalties for mismatch; lower qual = lower penalty (6,2)

−−sp <int>,<int>

max and min penalties for soft−clipping; lower qual = lower penalty (2,1)

−−no−softclip

no soft−clipping

−−np <int>

penalty for non−A/C/G/Ts in read/ref (1)

−−rdg <int>,<int>

read gap open, extend penalties (5,3)

−−rfg <int>,<int>

reference gap open, extend penalties (5,3)

−−score−min <func>

min acceptable alignment score w/r/t read length (L,0.0,−0.2)

Reporting:

−k <int>

report up to <int> alns per read (5)

Paired−end:

−I/−−minins <int>

minimum fragment length (0), only valid with −−no−spliced−alignment

−X/−−maxins <int>

maximum fragment length (500), only valid with −−no−spliced−alignment

−−fr/−−rf/−−ff

−1, −2 mates align fw/rev, rev/fw, fw/fw (−−fr)

−−no−mixed

suppress unpaired alignments for paired reads

−−no−discordant

suppress discordant alignments for paired reads

Output:

−t/−−time

print wall−clock time taken by search phases

−−un <path>

write unpaired reads that didn’t align to <path>

−−al <path>

write unpaired reads that aligned at least once to <path>

−−un−conc <path>

write pairs that didn’t align concordantly to <path>

−−al−conc <path>

write pairs that aligned concordantly at least once to <path>

(Note: for −−un, −−al, −−un−conc, or −−al−conc, add ’−gz’ to the option name, e.g. −−un−gz <path>, to gzip compress output, or add ’−bz2’ to bzip2 compress output.)

−−quiet

print nothing to stderr except serious errors

−−met−file <path>

send metrics to file at <path> (off)

−−met−stderr

send metrics to stderr (off)

−−met <int>

report internal counters & metrics every <int> secs (1)

−−no−head

suppress header lines, i.e. lines starting with @

−−no−sq

suppress @SQ header lines

−−rg−id <text>

set read group id, reflected in @RG line and RG:Z: opt field

−−rg <text>

add <text> ("lab:value") to @RG line of SAM header.

Note: @RG line only printed when −−rg−id is set.
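The relationship between −−rg−id, −−rg, and the @RG header line can be sketched in Python as follows. The helper names and the sample values are hypothetical. The header layout follows the SAM specification (tab-separated TAG:VALUE fields starting with ID); the exact line hisat2 writes is an assumption here, not taken from this page.

```python
def read_group_args(rg_id, fields=None):
    """Build the --rg-id / --rg arguments for one read group."""
    argv = ["--rg-id", rg_id]
    for key, value in (fields or {}).items():
        argv += ["--rg", f"{key}:{value}"]  # one --rg per "lab:value" pair
    return argv


def expected_rg_header(rg_id, fields=None):
    """Sketch of the resulting @RG line: tab-separated TAG:VALUE fields."""
    parts = ["@RG", f"ID:{rg_id}"]
    parts += [f"{key}:{value}" for key, value in (fields or {}).items()]
    return "\t".join(parts)


fields = {"SM": "sample1", "PL": "ILLUMINA"}
print(read_group_args("grp1", fields))
print(expected_rg_header("grp1", fields))
```

Because the @RG line is only printed when −−rg−id is set, the helper always emits that option first and treats the −−rg fields as optional.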

−−omit−sec−seq

put ’*’ in SEQ and QUAL fields for secondary alignments.

Performance:

−o/−−offrate <int>

override offrate of index; must be >= index’s offrate

−p/−−threads <int>

number of alignment threads to launch (1)

−−reorder

force SAM output order to match order of input reads

−−mm

use memory−mapped I/O for index; many ’hisat2’s can share

Other:

−−qc−filter

filter out reads that are bad according to QSEQ filter

−−seed <int>

seed for random number generator (0)

−−non−deterministic

seed rand. gen. arbitrarily instead of using read attributes

−−remove−chrname

remove ’chr’ from reference names in alignment

−−add−chrname

add ’chr’ to reference names in alignment

−−version

print version information and quit

−h/−−help

print this usage message

64−bit

Built on Debian unstable

Fri Dec 9 12:45:26 UTC 2016

Compiler: gcc version 6.2.1 20161202 (Ubuntu 6.2.1−5ubuntu2)

Options: −O3 −funroll−loops −g3 −Wdate−time −D_FORTIFY_SOURCE=2 −DPOPCNT_CAPABILITY

Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
