alignment-thin

NAME

alignment-thin - Remove sequences or columns from an alignment.

SYNOPSIS

alignment-thin alignment-file [OPTIONS]

DESCRIPTION

Remove sequences or columns from an alignment.

GENERAL OPTIONS:

-h, --help

Print usage information.

-v, --verbose

Output more log messages on stderr.

SEQUENCE FILTERING OPTIONS:

--protect arg

Sequences that cannot be removed (comma-separated).

--remove arg

Remove sequences in comma-separated list arg.

--longer-than arg

Remove sequences not longer than arg.

--shorter-than arg

Remove sequences not shorter than arg.

--cutoff arg

Remove similar sequences that have fewer than arg mismatches.

--down-to arg

Remove similar sequences down to arg sequences.

--remove-crazy arg

Remove arg outlier sequences, defined as sequences that are missing too many conserved sites.

--conserved arg (=0.75)

Fraction of sequences that must contain a letter for it to be considered conserved.

COLUMN FILTERING OPTIONS:

--min-letters arg

Remove columns with fewer than arg letters.

--remove-unique arg

Remove insertions in a single sequence if longer than arg letters.

OUTPUT OPTIONS:

--sort

Sort partially ordered columns to group similar gaps.

--show-lengths

Just print out sequence lengths.

--find-dups arg

For each sequence, find the closest other sequence.

EXAMPLES:

Remove columns without a minimum number of letters:

% alignment-thin --min-letters=5 file.fasta > file-thinned.fasta
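
To make the column filter concrete, here is a minimal Python sketch of what "remove columns with fewer than arg letters" means. It is an illustration only, not the tool's implementation. The function name and the choice of which characters count as gaps are assumptions.

```python
# Assumption: '-' and '?' are the characters that do NOT count as letters.
GAP_CHARS = set("-?")


def remove_sparse_columns(seqs, min_letters):
    """Return a copy of the alignment without columns that have fewer
    than min_letters non-gap characters.

    seqs maps sequence names to equal-length aligned strings.
    """
    if not seqs:
        return {}
    n_cols = len(next(iter(seqs.values())))
    # Indices of the columns that have enough letters to be kept.
    keep = [
        c for c in range(n_cols)
        if sum(s[c] not in GAP_CHARS for s in seqs.values()) >= min_letters
    ]
    return {name: "".join(s[c] for c in keep) for name, s in seqs.items()}


alignment = {"a": "AC-T", "b": "A--T", "c": "AG-T"}
print(remove_sparse_columns(alignment, 2))
```

In this toy alignment the third column contains no letters, so it is the only column dropped when the threshold is 2.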

Remove sequences by name:

% alignment-thin --remove=seq1,seq2 file.fasta > file2.fasta

Remove short sequences:

% alignment-thin --longer-than=250 file.fasta > file-long.fasta
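
The length filter can be sketched in Python as follows. This assumes that a sequence's length is its number of letters, with gap characters excluded. The page does not spell this out, so treat it as an assumption. The function names are hypothetical.

```python
def ungapped_length(seq, gap_chars="-?"):
    """Count the letters in an aligned sequence, ignoring gap characters.

    Assumption: '-' and '?' are the gap characters.
    """
    return sum(ch not in gap_chars for ch in seq)


def keep_longer_than(seqs, threshold):
    """Keep only sequences whose ungapped length is strictly greater
    than threshold, i.e. remove sequences not longer than threshold."""
    return {name: s for name, s in seqs.items()
            if ungapped_length(s) > threshold}


alignment = {"a": "AC--", "b": "ACGT"}
print(keep_longer_than(alignment, 2))
```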

Remove sequences with <= 5 differences from the closest other sequence:

% alignment-thin --cutoff=5 file.fasta > more-than-5-differences.fasta
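
The "differences from the closest other sequence" can be illustrated with a short Python sketch. It only computes the distances and does not apply a threshold or decide which sequence of a similar pair to drop. It assumes that a column counts as a mismatch only when both sequences have a letter there and the letters differ. The tool may count gaps differently, and the function names are hypothetical.

```python
def mismatches(s1, s2, gap_chars="-?"):
    """Count columns where both sequences have a letter and they differ.

    Assumption: columns with a gap in either sequence are ignored.
    """
    return sum(
        a != b
        for a, b in zip(s1, s2)
        if a not in gap_chars and b not in gap_chars
    )


def closest_other(seqs):
    """For each sequence, return (mismatch count, name) of its closest
    other sequence. Ties are broken by name."""
    result = {}
    for name, s in seqs.items():
        others = [(mismatches(s, t), other)
                  for other, t in seqs.items() if other != name]
        if others:
            result[name] = min(others)
    return result


alignment = {"a": "ACGT", "b": "ACGA", "c": "TTGA"}
print(closest_other(alignment))
```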

Like --cutoff, but stop when we have the right number of sequences:

% alignment-thin --down-to=30 file.fasta > file-30taxa.fasta

Remove dissimilar sequences that are missing conserved columns:

% alignment-thin --remove-crazy=10 file.fasta > file2.fasta
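
The notion of a sequence "missing conserved columns" can be sketched in Python using the definition given for --conserved. A column is treated as conserved when at least the given fraction of sequences has a letter in it. Whether the tool uses ">=" or ">" at the boundary, and how it ranks outliers, is not stated here, so both are assumptions. The function names are hypothetical.

```python
def conserved_columns(seqs, conserved=0.75, gap_chars="-?"):
    """Return indices of columns where at least `conserved` of the
    sequences have a letter (assumption: '>=' at the boundary)."""
    rows = list(seqs.values())
    if not rows:
        return []
    return [
        c for c in range(len(rows[0]))
        if sum(r[c] not in gap_chars for r in rows) >= conserved * len(rows)
    ]


def missing_conserved(seqs, conserved=0.75, gap_chars="-?"):
    """For each sequence, count the conserved columns where it has a gap."""
    cols = conserved_columns(seqs, conserved, gap_chars)
    return {name: sum(s[c] in gap_chars for c in cols)
            for name, s in seqs.items()}


alignment = {"a": "ACGT", "b": "ACGT", "c": "ACG-", "d": "A---"}
print(missing_conserved(alignment))
```

Under these assumptions, sequence "d" is the one missing the most conserved columns, which makes it the most likely outlier candidate in this toy alignment.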

Protect some sequences from being removed:

% alignment-thin --down-to=30 file.fasta --protect=seq1,seq2 > file2.fasta

REPORTING BUGS:

BAli-Phy online help: <http://www.bali-phy.org/docs.php>.

Please send bug reports to <bali-phy-users AT googlegroups DOT com>.

AUTHORS

Benjamin Redelings.