CLUSTALW

NAME

clustalw − Multiple alignment of nucleic acid and protein sequences

SYNOPSIS

clustalw [−infile] file.ext [OPTIONS]

clustalw [−help | −fullhelp]

DESCRIPTION

Clustal W is a general purpose multiple alignment program for DNA or proteins.

The program performs simultaneous alignment of many nucleotide or amino acid sequences. It is typically run interactively, with a menu and online help. To run it in command−line (batch) mode instead, you must supply options, at minimum −infile.
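As a rough illustration of batch mode, the following Python sketch assembles an argument list in which −infile is the only required option. The helper name build_clustalw_command and the file name seqs.fasta are hypothetical, and the sketch only builds the list; it assumes a clustalw executable on your PATH if you decide to run it. Options are typed with an ASCII hyphen on a real command line.

```python
import shlex


def build_clustalw_command(infile, *flags, **settings):
    """Assemble a clustalw argument list (hypothetical helper, not part of Clustal).

    flags    -- valueless options such as "align" or "quicktree"
    settings -- option=value pairs such as output="PHYLIP"
    """
    if not infile:
        raise ValueError("batch mode needs at least -infile")
    argv = ["clustalw", f"-infile={infile}"]
    argv += [f"-{flag}" for flag in flags]
    argv += [f"-{name}={value}" for name, value in settings.items()]
    return argv


cmd = build_clustalw_command("seqs.fasta", "align", output="PHYLIP")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be handed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids shell quoting problems with unusual file names.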

OPTIONS

DATA (sequences)
−infile=file.ext

Input sequences.

−profile1=file.ext and −profile2=file.ext

Profiles (old alignment)

VERBS (do things)
−options

List the command line parameters.

−help or −check

Outline the command line parameters.

−fullhelp

Output full help content.

−align

Do full multiple alignment.

−tree

Calculate NJ tree.

−pim

Output percent identity matrix (while calculating the tree).

−bootstrap=n

Bootstrap an NJ tree (n = number of bootstraps; default = 1000).

−convert

Output the input sequences in a different file format.
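The −pim option above reports pairwise percent identities. As a hedged illustration of what such a matrix contains, the sketch below computes percent identity for sequences that are already aligned, counting only columns where neither sequence has a gap. That denominator is an assumption made here; the manual does not state which definition Clustal W uses, and the function names are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Symmetric matrix of pairwise percent identities."""
    return [[percent_identity(a, b) for b in aligned] for a in aligned]


rows = ["ACGT-A", "ACGTTA", "AC-TTT"]
for row in identity_matrix(rows):
    print(" ".join(f"{value:6.1f}" for value in row))
```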

PARAMETERS (set things)
General settings:

−interactive

Read command line, then enter normal interactive menus.

−quicktree

Use FAST algorithm for the alignment guide tree.

−type=

PROTEIN or DNA sequences.

−negative

Protein alignment with negative values in matrix.

−outfile=

Sequence alignment file name.

−output=

GCG, GDE, PHYLIP, PIR or NEXUS.

−outputorder=

INPUT or ALIGNED

−case

LOWER or UPPER (for GDE output only).

−seqnos=

OFF or ON (for Clustal output only).

−seqnos_range=

OFF or ON (NEW: for all output formats).

−range=m,n

Sequence range to write, starting at m and ending at m+n.

−maxseqlen=n

Maximum allowed input sequence length.

−quiet

Reduce console output to minimum.

−stats=file

Log some alignment statistics to file.

Fast Pairwise Alignments:

−ktuple=n

Word size.

−topdiags=n

Number of best diagonals.

−window=n

Window around best diagonals.

−pairgap=n

Gap penalty.

−score

PERCENT or ABSOLUTE.
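The fast pairwise parameters above refer to words (k-tuples) and diagonals. As a simplified illustration of the idea, and not Clustal W's actual implementation, the sketch below indexes the words of one sequence, finds exact word matches in the other, tallies them by diagonal (the offset j - i), and keeps the best-scoring diagonals. The function name and the mapping of its arguments to −ktuple and −topdiags are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def top_diagonals(seq_a, seq_b, ktuple=2, topdiags=3):
    """Count exact k-tuple matches per diagonal and return the best ones.

    A diagonal is identified by the offset j - i, where i and j are the
    start positions of a shared word in seq_a and seq_b.
    """
    # Index every word of seq_a by its start positions.
    index = defaultdict(list)
    for i in range(len(seq_a) - ktuple + 1):
        index[seq_a[i:i + ktuple]].append(i)

    # Look up every word of seq_b and tally hits by diagonal.
    hits = Counter()
    for j in range(len(seq_b) - ktuple + 1):
        for i in index.get(seq_b[j:j + ktuple], ()):
            hits[j - i] += 1
    return hits.most_common(topdiags)


print(top_diagonals("ACGTACGT", "CGTACG", ktuple=2, topdiags=1))
```

A larger word size gives fewer, more specific matches and runs faster; a smaller one is more sensitive but noisier.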

Slow Pairwise Alignments:

−pwmatrix=

Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

−pwdnamatrix=

DNA weight matrix=IUB, CLUSTALW or filename.

−pwgapopen=f

Gap opening penalty.

−pwgapext=f

Gap extension penalty.
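The two penalties above describe an affine gap model, in which opening a gap costs more than extending it. A minimal sketch under one common convention (cost = opening penalty + extension penalty × gap length) follows. The manual does not spell out Clustal W's exact formula, and the numeric values below are arbitrary examples rather than program defaults.

```python
def affine_gap_cost(length, gap_open, gap_ext):
    """Cost of one gap of `length` residues: gap_open + gap_ext * length.

    This is one common affine convention, assumed here for illustration.
    """
    if length < 0:
        raise ValueError("gap length cannot be negative")
    return 0.0 if length == 0 else gap_open + gap_ext * length


# One gap of four residues is cheaper than four separate single-residue gaps.
one_long = affine_gap_cost(4, gap_open=10.0, gap_ext=0.5)
four_short = 4 * affine_gap_cost(1, gap_open=10.0, gap_ext=0.5)
print(one_long, four_short)
```

Under this convention, raising the opening penalty discourages many scattered gaps, while raising the extension penalty discourages long ones.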

Multiple Alignments:

−newtree=

File for new guide tree.

−usetree=

File for old guide tree.

−matrix=

Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

−dnamatrix=

DNA weight matrix=IUB, CLUSTALW or filename.

−gapopen=f

Gap opening penalty.

−gapext=f

Gap extension penalty.

−endgaps

No end gap separation penalty.

−gapdist=n

Gap separation penalty range.

−nogap

Residue−specific gaps off.

−nohgap

Hydrophilic gaps off.

−hgapresidues=

List of hydrophilic residues.

−maxdiv=n

Percent identity for delay.

−type=

PROTEIN or DNA

−transweight=f

Transitions weighting.

−iteration=

NONE or TREE or ALIGNMENT.

−numiter=n

Maximum number of iterations to perform.

Profile Alignments:

−profile

Merge two alignments by profile alignment.

−newtree1=

File for new guide tree for profile1.

−newtree2=

File for new guide tree for profile2.

−usetree1=

File for old guide tree for profile1.

−usetree2=

File for old guide tree for profile2.

Sequence to Profile Alignments:

−sequences

Sequentially add profile2 sequences to profile1 alignment.

−newtree=

File for new guide tree.

−usetree=

File for old guide tree.
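The two profile modes above differ only in the verb given alongside −profile1 and −profile2. The sketch below builds an argument list for either mode using only options listed in this manual. The helper name and the file names are hypothetical, and the code does not run clustalw itself.

```python
def profile_command(profile1, profile2, add_sequentially=False):
    """Build a clustalw argument list for the two profile modes (hypothetical helper).

    add_sequentially=False -> -profile   (merge two alignments)
    add_sequentially=True  -> -sequences (add profile2 sequences one by one)
    """
    mode = "-sequences" if add_sequentially else "-profile"
    return ["clustalw", f"-profile1={profile1}", f"-profile2={profile2}", mode]


print(profile_command("family.aln", "new_members.fasta", add_sequentially=True))
```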

Structure Alignments:

−nosecstr1

Do not use secondary structure−gap penalty mask for profile 1.

−nosecstr2

Do not use secondary structure−gap penalty mask for profile 2.

−secstrout=STRUCTURE or MASK or BOTH or NONE

Output in alignment file.

−helixgap=n

Gap penalty for helix core residues.

−strandgap=n

Gap penalty for strand core residues.

−loopgap=n

Gap penalty for loop regions.

−terminalgap=n

Gap penalty for structure termini.

−helixendin=n

Number of residues inside helix to be treated as terminal.

−helixendout=n

Number of residues outside helix to be treated as terminal.

−strandendin=n

Number of residues inside strand to be treated as terminal.

−strandendout=n

Number of residues outside strand to be treated as terminal.

Trees:

−outputtree=nj OR phylip OR dist OR nexus

−seed=n

Seed number for bootstraps.

−kimura

Use Kimura's correction.

−tossgaps

Ignore positions with gaps.

−bootlabels=node

Position of bootstrap values in tree display.

−clustering=

NJ or UPGMA.
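Bootstrapping a tree relies on resampling alignment columns with replacement, and a fixed seed makes the replicates reproducible. The sketch below shows that resampling step only. It uses Python's own random number generator, so it will not reproduce the replicates Clustal W draws for the same −seed value, and the function name is hypothetical.

```python
import random


def bootstrap_columns(aligned, seed):
    """Resample alignment columns with replacement; same seed, same replicate."""
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("aligned sequences must have equal length")
    rng = random.Random(seed)  # private generator, so results depend only on seed
    picks = [rng.randrange(width) for _ in range(width)]
    # Apply the same column picks to every row to keep columns intact.
    return ["".join(row[i] for i in picks) for row in aligned]


alignment = ["ACGTAC", "ACGTTC", "AC-TTC"]
print(bootstrap_columns(alignment, seed=111))
```

Each replicate has the same width as the original alignment; a tree is then built from every replicate and the groupings are counted across trees.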

BUGS

The Clustal bug tracking system can be found at http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.

SEE ALSO

clustalx(1).

REFERENCES

• Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. [1] Bioinformatics, 23, 2947−2948.

• Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003). Multiple sequence alignment with the Clustal series of programs. [2] Nucleic Acids Res., 31, 3497−3500.

• Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998). Multiple sequence alignment with Clustal X. [3] Trends Biochem Sci., 23, 403−405.

• Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. [4] Nucleic Acids Res., 25, 4876−4882.

• Higgins DG, Thompson JD, Gibson TJ. (1996). Using CLUSTAL for multiple sequence alignments. [5] Methods Enzymol., 266, 383−402.

• Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position−specific gap penalties and weight matrix choice. [6] Nucleic Acids Res., 22, 4673−4680.

• Higgins DG. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. [7] Methods Mol Biol., 25, 307−318.

• Higgins DG, Bleasby AJ, Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment. [8] Comput. Appl. Biosci., 8, 189−191.

• Higgins,D.G. and Sharp,P.M. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. [9] Comput. Appl. Biosci., 5, 151−153.

• Higgins,D.G. and Sharp,P.M. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. [10] Gene, 73, 237−244.

AUTHORS

Des Higgins

Copyright holder for Clustal.

Julie Thompson

Copyright holder for Clustal.

Toby Gibson

Copyright holder for Clustal.

Charles Plessy <plessy@debian.org>

Prepared this manpage in DocBook XML for the Debian distribution.

COPYRIGHT

Copyright © 1988–2010 Des Higgins, Julie Thompson & Toby Gibson (Clustal)
Copyright © 2008–2010 Charles Plessy (This manpage)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/, or on Debian systems, /usr/share/common−licenses/LGPL−3.

This manual page and its XML source can be used, modified, and redistributed as if they were in the public domain.

NOTES

1.

Clustal W and Clustal X version 2.0.

http://www.ncbi.nlm.nih.gov/pubmed/17846036

2.

Multiple sequence alignment with the Clustal series of programs.

http://www.ncbi.nlm.nih.gov/pubmed/12824352

3.

Multiple sequence alignment with Clustal X

http://www.ncbi.nlm.nih.gov/pubmed/9810230

4.

The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.

http://www.ncbi.nlm.nih.gov/pubmed/9396791

5.

Using CLUSTAL for multiple sequence alignments.

http://www.ncbi.nlm.nih.gov/pubmed/8743695

6.

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

http://www.ncbi.nlm.nih.gov/pubmed/7984417

7.

CLUSTAL V: multiple alignment of DNA and protein sequences.

http://www.ncbi.nlm.nih.gov/pubmed/8004173

8.

CLUSTAL V: improved software for multiple sequence alignment.

http://www.ncbi.nlm.nih.gov/pubmed/1591615

9.

Fast and sensitive multiple sequence alignments on a microcomputer.

http://www.ncbi.nlm.nih.gov/pubmed/2720464

10.

CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.

http://www.ncbi.nlm.nih.gov/pubmed/3243435
