SCYTHE

NAME

scythe − Bayesian adaptor trimmer

SYNOPSIS

scythe -q sanger -a /path/to/adaptors.fasta [options] <sequences.fastq.gz>

Trim 3'-end adaptor contaminants from sequence files. If no output file is specified, scythe writes to stdout.

OPTIONS

-p, --prior          prior (default: 0.300)
-q, --quality-type   quality type, either illumina, solexa, or sanger (default: sanger)
-m, --matches-file   matches file (default: no output)
-o, --output-file    output trimmed sequences file (default: stdout)
-t, --tag            add a tag to the header indicating Scythe cut a sequence (default: off)
-n, --min-match      smallest contaminant to consider (default: 5)
-M, --min-keep       filter out sequences less than or equal to this length (default: 35)
    --quiet          don't output statistics about trimming to stdout (default: off)
    --help           display this help and exit
    --version        output version information and exit
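
As a rough illustration of how the options above combine, the following Python sketch assembles an argument list for scythe. The helper name build_scythe_command and its parameters are hypothetical; only the flags themselves come from this page, and the code builds the command without running it.

```python
from typing import List, Optional


def build_scythe_command(
    adaptors: str,
    sequences: str,
    quality_type: str = "sanger",
    prior: Optional[float] = None,
    output_file: Optional[str] = None,
    matches_file: Optional[str] = None,
    min_match: Optional[int] = None,
    min_keep: Optional[int] = None,
    tag: bool = False,
    quiet: bool = False,
) -> List[str]:
    """Assemble a scythe argument list (hypothetical helper, not part of scythe)."""
    # The option list names exactly these three values for --quality-type.
    if quality_type not in ("illumina", "solexa", "sanger"):
        raise ValueError(f"unsupported quality type: {quality_type}")

    cmd = ["scythe", "-a", adaptors, "-q", quality_type]

    # Options left as None are omitted so scythe falls back to its defaults.
    if prior is not None:
        cmd += ["-p", str(prior)]
    if output_file is not None:
        cmd += ["-o", output_file]
    if matches_file is not None:
        cmd += ["-m", matches_file]
    if min_match is not None:
        cmd += ["-n", str(min_match)]
    if min_keep is not None:
        cmd += ["-M", str(min_keep)]
    if tag:
        cmd.append("-t")
    if quiet:
        cmd.append("--quiet")

    # The input sequence file comes last, as in the synopsis.
    cmd.append(sequences)
    return cmd
```

If scythe is installed, the returned list could be passed to subprocess.run(cmd, check=True). Keeping the command as a list avoids shell quoting problems with unusual file names.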

These are the quality encoding schemes scythe recognises (see '--quality-type'):

phred      PHRED quality scores (e.g. from Roche 454). ASCII with no
           offset, range: [4, 60].
sanger     Sanger qualities are PHRED ASCII qualities with an offset of 33,
           range: [0, 93]. From NCBI SRA, or Illumina pipeline 1.8+.
solexa     Solexa (also very early Illumina, pipeline < 1.3).
           ASCII offset of 64, range: [-5, 62]. Uses a different
           quality-to-probability conversion than the other schemes.
illumina   Illumina output from pipeline versions between 1.3 and 1.7.
           ASCII offset of 64, range: [0, 62].
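
To make the offsets and the differing Solexa conversion concrete, here is a minimal Python sketch that turns a single quality character into an error probability for the three schemes accepted by the quality type option. It uses the standard Phred and Solexa definitions and is not taken from scythe's source; the names OFFSETS and error_probability are hypothetical.

```python
# ASCII offsets as listed above for each encoding scheme.
OFFSETS = {"sanger": 33, "solexa": 64, "illumina": 64}


def error_probability(char: str, scheme: str) -> float:
    """Convert one quality character to a base-call error probability."""
    q = ord(char) - OFFSETS[scheme]
    if scheme == "solexa":
        # Solexa scores are log-odds: Q = -10 * log10(p / (1 - p)).
        return 1.0 / (1.0 + 10 ** (q / 10))
    # Phred-style scores (sanger, illumina): Q = -10 * log10(p).
    return 10 ** (-q / 10)
```

For example, 'I' under sanger and 'h' under illumina both decode to Q = 40, an error probability of 1 in 10,000, while a Solexa score of 0 ('@') corresponds to an error probability of 0.5.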

FILES

adaptors.fasta: Provide contaminant sequences as a FASTA-formatted file.
See '/usr/share/doc/scythe/illumina_adaptors.fa'.
N.B.: Replace the Ns in the example adaptor file with your
index/barcode sequences.
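
The barcode substitution described in the note can be sketched in Python as below. This is a sketch under assumptions: the function name fill_barcode is hypothetical, each run of Ns in a sequence line is treated as a single placeholder, and the same barcode is used for every placeholder.

```python
import re


def fill_barcode(fasta_text: str, barcode: str) -> str:
    """Replace each run of Ns in FASTA sequence lines with the given barcode."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header lines are left untouched, even if they contain an 'N'.
            out.append(line)
        else:
            # Assumption: one run of Ns marks one index/barcode position.
            out.append(re.sub(r"N+", barcode, line))
    return "\n".join(out) + "\n"
```

With a made-up record such as ">adaptor_with_index" followed by "ACGTNNNNNNTTGCA", calling fill_barcode(text, "ATCACG") yields the sequence line "ACGTATCACGTTGCA". Libraries sequenced with several barcodes would need one adaptor record per barcode, which this sketch does not handle.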

AUTHOR

Vince Buffalo, https://github.com/vsbuffalo
