clustalw − Multiple alignment of nucleic acid and protein sequences
clustalw [−infile] file.ext [OPTIONS]
clustalw [−help | −fullhelp]
Clustal W is a general purpose multiple alignment program for DNA or proteins.
The program performs simultaneous alignment of many nucleotide or amino acid sequences. It is typically run interactively, providing a menu and online help. To use it in command−line (batch) mode instead, you must supply options on the command line, the minimum being −infile.
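As a rough illustration of batch mode, the following Python sketch assembles a command line that contains −infile plus any further options. The helper name clustalw_command and the file names are hypothetical, the executable is assumed to be on the PATH as clustalw, and option prefixes are written with the ASCII hyphen. Nothing is executed; the sketch only builds and prints the command.

```python
import shlex


def clustalw_command(infile, *flags, **settings):
    """Build an argument list for a batch-mode run (hypothetical helper)."""
    cmd = ["clustalw", f"-infile={infile}"]
    # Options without a value, such as -align or -quiet, become bare flags.
    cmd += [f"-{flag}" for flag in flags]
    # Options with a value use the -name=value form shown in this page.
    cmd += [f"-{name}={value}" for name, value in settings.items()]
    return cmd


# Hypothetical input and output file names.
cmd = clustalw_command("seqs.fasta", "align", outfile="seqs.aln")
print(shlex.join(cmd))
```

On a system where clustalw is installed, the resulting list could be passed to subprocess.run(cmd, check=True).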
DATA (sequences)
−infile=file.ext
Input sequences.
−profile1=file.ext and −profile2=file.ext
Profiles (old alignment)
VERBS (do things)
−options
List the command line parameters.
−help or −check
Outline the command line parameters.
−fullhelp
Output full help content.
−align
Do full multiple alignment.
−tree
Calculate NJ tree.
−pim
Output percent identity matrix (while calculating the tree).
−bootstrap=n
Bootstrap an NJ tree (n = number of bootstraps; default = 1000).
−convert
Output the input sequences in a different file format.
PARAMETERS (set things)
General settings:
−interactive
Read command line, then enter normal interactive menus.
−quicktree
Use FAST algorithm for the alignment guide tree.
−type=
PROTEIN or DNA sequences.
−negative
Protein alignment with negative values in matrix.
−outfile=
Sequence alignment file name.
−output=
GCG, GDE, PHYLIP, PIR or NEXUS.
−outputorder=
INPUT or ALIGNED
−case
LOWER or UPPER (for GDE output only).
−seqnos=
OFF or ON (for Clustal output only).
−seqnos_range=
OFF or ON (NEW: for all output formats).
−range=m,n
Sequence range to write, from position m to m+n.
−maxseqlen=n
Maximum allowed input sequence length.
−quiet
Reduce console output to minimum.
−stats=file
Log some alignment statistics to file.
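Several of the general settings accept only a fixed set of keywords. The sketch below checks values against the keywords listed in this page before building a format-conversion command. The table ALLOWED and the helper general_settings are hypothetical names, the file names are placeholders, and the installed program may accept values that are not listed here.

```python
# Keywords copied from the option descriptions in this page.
ALLOWED = {
    "type": {"PROTEIN", "DNA"},
    "output": {"GCG", "GDE", "PHYLIP", "PIR", "NEXUS"},
    "outputorder": {"INPUT", "ALIGNED"},
    "seqnos": {"OFF", "ON"},
}


def general_settings(**settings):
    """Return -name=value arguments, rejecting keywords not listed above."""
    args = []
    for name, value in settings.items():
        if name in ALLOWED:
            # Keyword options are normalised to upper case before checking.
            value = str(value).upper()
            if value not in ALLOWED[name]:
                raise ValueError(f"-{name} does not list {value!r} as a value")
        args.append(f"-{name}={value}")
    return args


# Convert a (hypothetical) input file to PHYLIP format.
cmd = ["clustalw", "-infile=seqs.fasta", "-convert"]
cmd += general_settings(type="protein", output="phylip", outfile="seqs.phy")
print(" ".join(cmd))
```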
Fast Pairwise Alignments:
−ktuple=n
Word size.
−topdiags=n
Number of best diagonals.
−window=n
Window around best diagonals.
−pairgap=n
Gap penalty.
−score
PERCENT or ABSOLUTE.
Slow Pairwise Alignments:
−pwmatrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
−pwdnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
−pwgapopen=f
Gap opening penalty.
−pwgapext=f
Gap extension penalty.
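The fast and slow pairwise stages take different parameters, so it can help to keep them apart when scripting. The sketch below assumes that −quicktree is what selects the fast method, which is an interpretation of the option descriptions rather than something this page states outright. The helper pairwise_args is hypothetical and the numeric values are illustrative only. The −score option is left out because this page does not show how its value is written.

```python
FAST_PARAMS = {"ktuple", "topdiags", "window", "pairgap"}
SLOW_PARAMS = {"pwmatrix", "pwdnamatrix", "pwgapopen", "pwgapext"}


def pairwise_args(fast, **params):
    """Return pairwise-stage arguments for the fast or the slow method."""
    allowed = FAST_PARAMS if fast else SLOW_PARAMS
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"not a parameter of this method: {sorted(unknown)}")
    # Assumption: -quicktree switches the guide tree to the fast method.
    args = ["-quicktree"] if fast else []
    args += [f"-{name}={value}" for name, value in params.items()]
    return args


print(pairwise_args(True, ktuple=2, window=4))
print(pairwise_args(False, pwmatrix="GONNET", pwgapopen=10.0))
```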
Multiple Alignments:
−newtree=
File for new guide tree.
−usetree=
File for old guide tree.
−matrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
−dnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
−gapopen=f
Gap opening penalty.
−gapext=f
Gap extension penalty.
−endgaps
No end gap separation penalty.
−gapdist=n
Gap separation penalty range.
−nopgap
Residue−specific gaps off.
−nohgap
Hydrophilic gaps off.
−hgapresidues=
List of hydrophilic residues.
−maxdiv=n
Percent identity for delay.
−type=
PROTEIN or DNA
−transweight=f
Transitions weighting.
−iteration=
NONE or TREE or ALIGNMENT.
−numiter=n
Maximum number of iterations to perform.
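One possible use of −newtree and −usetree is to save the guide tree on a first run and reuse it while trying other gap penalties. The sketch below builds both command lines. The helper guide_tree_runs, the file names and the penalty values are hypothetical, and combining these options with −align in this way is an assumption based on the descriptions above.

```python
def guide_tree_runs(infile, treefile, gapopen, gapext):
    """Return two commands: one saving a guide tree, one reusing it."""
    base = ["clustalw", f"-infile={infile}", "-align"]
    # First run: write the newly computed guide tree to treefile.
    save_tree = base + [f"-newtree={treefile}"]
    # Second run: read that tree back and change only the gap penalties.
    reuse_tree = base + [
        f"-usetree={treefile}",
        f"-gapopen={gapopen}",
        f"-gapext={gapext}",
    ]
    return save_tree, reuse_tree


first, second = guide_tree_runs("seqs.fasta", "seqs.dnd", 12.0, 0.2)
print(" ".join(first))
print(" ".join(second))
```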
Profile Alignments:
−profile
Merge two alignments by profile alignment.
−newtree1=
File for new guide tree for profile1.
−newtree2=
File for new guide tree for profile2.
−usetree1=
File for old guide tree for profile1.
−usetree2=
File for old guide tree for profile2.
Sequence to Profile Alignments:
−sequences
Sequentially add profile2 sequences to profile1 alignment.
−newtree=
File for new guide tree.
−usetree=
File for old guide tree.
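Both profile modes take their input from −profile1 and −profile2 and differ only in the verb. The sketch below builds either command. The helper profile_command and the file names are hypothetical, and nothing is executed.

```python
def profile_command(profile1, profile2, mode="profile", outfile=None):
    """Build a profile-mode command (hypothetical helper).

    mode="profile" merges two alignments; mode="sequences" adds the
    sequences of profile2 to the profile1 alignment one after another.
    """
    if mode not in ("profile", "sequences"):
        raise ValueError("mode must be 'profile' or 'sequences'")
    cmd = [
        "clustalw",
        f"-profile1={profile1}",
        f"-profile2={profile2}",
        f"-{mode}",
    ]
    if outfile is not None:
        cmd.append(f"-outfile={outfile}")
    return cmd


print(" ".join(profile_command("family.aln", "new.fasta", mode="sequences")))
```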
Structure Alignments:
−nosecstr1
Do not use secondary structure−gap penalty mask for profile 1.
−nosecstr2
Do not use secondary structure−gap penalty mask for profile 2.
−secstrout=STRUCTURE or MASK or BOTH or NONE
Output in alignment file.
−helixgap=n
Gap penalty for helix core residues.
−strandgap=n
Gap penalty for strand core residues.
−loopgap=n
Gap penalty for loop regions.
−terminalgap=n
Gap penalty for structure termini.
−helixendin=n
Number of residues inside helix to be treated as terminal.
−helixendout=n
Number of residues outside helix to be treated as terminal.
−strandendin=n
Number of residues inside strand to be treated as terminal.
−strandendout=n
Number of residues outside strand to be treated as terminal.
Trees:
−outputtree=nj OR phylip OR dist OR nexus
−seed=n
Seed number for bootstraps.
−kimura
Use Kimura's correction.
−tossgaps
Ignore positions with gaps.
−bootlabels=node
Position of bootstrap values in tree display.
−clustering=
NJ or UPGMA.
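The tree options can be combined with −bootstrap in a script. The sketch below builds such a command, using the default of 1000 bootstraps stated above and the −outputtree keywords listed in this section. The helper bootstrap_command, the file name and the seed value are hypothetical.

```python
# Keywords listed for -outputtree in this page.
TREE_FORMATS = {"nj", "phylip", "dist", "nexus"}


def bootstrap_command(alignment, n=1000, seed=None, kimura=False,
                      tossgaps=False, outputtree=None):
    """Build a bootstrap command for an existing alignment file."""
    cmd = ["clustalw", f"-infile={alignment}", f"-bootstrap={n}"]
    if seed is not None:
        # A fixed seed is given here so that a run can be repeated.
        cmd.append(f"-seed={seed}")
    if kimura:
        cmd.append("-kimura")
    if tossgaps:
        cmd.append("-tossgaps")
    if outputtree is not None:
        if outputtree not in TREE_FORMATS:
            raise ValueError(f"unlisted tree format: {outputtree!r}")
        cmd.append(f"-outputtree={outputtree}")
    return cmd


print(" ".join(bootstrap_command("seqs.aln", seed=42, kimura=True,
                                 outputtree="phylip")))
```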
The Clustal bug tracking system can be found at http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.
• Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. [1] Bioinformatics, 23, 2947−2948.
• Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003). Multiple sequence alignment with the Clustal series of programs. [2] Nucleic Acids Res., 31, 3497−3500.
• Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998). Multiple sequence alignment with Clustal X. [3] Trends Biochem Sci., 23, 403−405.
• Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. [4] Nucleic Acids Res., 25, 4876−4882.
• Higgins DG, Thompson JD, Gibson TJ. (1996). Using CLUSTAL for multiple sequence alignments. [5] Methods Enzymol., 266, 383−402.
• Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position−specific gap penalties and weight matrix choice. [6] Nucleic Acids Res., 22, 4673−4680.
• Higgins DG. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. [7] Methods Mol Biol., 25, 307−318.
• Higgins DG, Bleasby AJ, Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment. [8] Comput. Appl. Biosci., 8, 189−191.
• Higgins DG, Sharp PM. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. [9] Comput. Appl. Biosci., 5, 151−153.
• Higgins DG, Sharp PM. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. [10] Gene, 73, 237−244.
Des Higgins
Copyright holder for Clustal.
Julie Thompson
Copyright holder for Clustal.
Toby Gibson
Copyright holder for Clustal.
Charles Plessy <plessy@debian.org>
Prepared this manpage in DocBook XML for the Debian distribution.
Copyright © 1988–2010 Des Higgins, Julie Thompson & Toby Gibson (Clustal)
Copyright © 2008–2010 Charles Plessy (This manpage)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/, or on Debian systems, /usr/share/common−licenses/LGPL−3.
This manual page and its XML source can be used, modified, and redistributed as if they were in the public domain.
1. Clustal W and Clustal X version 2.0.
   http://www.ncbi.nlm.nih.gov/pubmed/17846036
2. Multiple sequence alignment with the Clustal series of programs.
   http://www.ncbi.nlm.nih.gov/pubmed/12824352
3. Multiple sequence alignment with Clustal X.
   http://www.ncbi.nlm.nih.gov/pubmed/9810230
4. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
   http://www.ncbi.nlm.nih.gov/pubmed/9396791
5. Using CLUSTAL for multiple sequence alignments.
   http://www.ncbi.nlm.nih.gov/pubmed/8743695
6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
   http://www.ncbi.nlm.nih.gov/pubmed/7984417
7. CLUSTAL V: multiple alignment of DNA and protein sequences.
   http://www.ncbi.nlm.nih.gov/pubmed/8004173
8. CLUSTAL V: improved software for multiple sequence alignment.
   http://www.ncbi.nlm.nih.gov/pubmed/1591615
9. Fast and sensitive multiple sequence alignments on a microcomputer.
   http://www.ncbi.nlm.nih.gov/pubmed/2720464
10. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
   http://www.ncbi.nlm.nih.gov/pubmed/3243435