Numerous Web servers have exploited the command line interface of ClustalW, notably the EBI's ClustalWWW Web server, which currently runs between 2000 and 10 000 jobs/day, and the SRS server at the same site (http://srs.ebi.ac.uk/), which has ClustalW built in. The EBI ClustalWWW interface provides extensive help, ranging from an introduction to multiple alignments for new users to detailed descriptions of each alignment option. An important factor in obtaining a high-quality alignment is the ability to change the numerous alignment parameters available in ClustalW. While the default values of the parameters have been optimised to work in the majority of cases, they are not necessarily optimal for any given alignment problem. In the ClustalWWW interface, all the options are easily accessible on the top page.
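As an illustration of how a Web front end might drive the ClustalW command line interface with user-selected parameters, the following is a minimal sketch and not the EBI implementation. It only assembles the argument list and does not run the program. The executable name `clustalw`, the helper name `build_clustalw_command` and the file names are assumptions made for this example; the options used (`-INFILE`, `-ALIGN`, `-OUTFILE`, `-OUTPUT`, `-GAPOPEN`, `-GAPEXT`) are taken from the ClustalW command line help, and their exact spelling should be checked against the installed version.

```python
# Output formats that this sketch passes to -OUTPUT=. The default ALN
# (Clustal) format is requested by omitting the option altogether.
NON_DEFAULT_FORMATS = {"GCG", "GDE", "PHYLIP", "PIR"}


def build_clustalw_command(infile, outfile, output_format="ALN",
                           gap_open=None, gap_ext=None,
                           executable="clustalw"):
    """Assemble (but do not run) a ClustalW multiple alignment command.

    `executable` is an assumed program name; adjust it for the local
    installation. Gap penalties are only passed when explicitly given,
    so the program's own defaults apply otherwise.
    """
    fmt = output_format.upper()
    if fmt != "ALN" and fmt not in NON_DEFAULT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")

    cmd = [executable, f"-INFILE={infile}", "-ALIGN", f"-OUTFILE={outfile}"]
    if fmt != "ALN":
        cmd.append(f"-OUTPUT={fmt}")
    if gap_open is not None:
        cmd.append(f"-GAPOPEN={gap_open}")
    if gap_ext is not None:
        cmd.append(f"-GAPEXT={gap_ext}")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_clustalw_command("seqs.fasta", "seqs.phy",
                                          output_format="PHYLIP",
                                          gap_open=12.0)))
```

The returned list can be handed to `subprocess.run` on a machine where ClustalW is installed; keeping command construction separate from execution makes the parameter handling easy to check in isolation.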
Sequences can be entered either by pasting them or by uploading a file from the user's local computer. In both cases, the sequences should be in one of seven different formats (GCG, FASTA, EMBL, GenBank, NBRF/PIR, Phylip or SWISS-PROT). Although users are encouraged to submit large numbers of sequences, there is no guarantee that the alignment will be completed within the job run limits. Therefore, users who experience problems when attempting to make very large alignments are advised to download the software and run it locally. In addition to the input format, the user can also specify the preferred output format for the multiple sequence alignment. The options are currently ALN, GCG, PHYLIP, PIR and GDE. It is also possible to configure the browser to automatically load the results files from ClustalW into a suitable external application. A list of some example URLs for obtaining such applications for MS-Windows, Macintosh and UNIX systems is provided in the table below. Many commercial packages, e.g. the GCG package (Wisconsin Package, Genetics Computer Group, Madison, WI) and its X Window graphical user interface, SeqLab, can also accept ClustalW alignments.
Example URLs of some external applications compatible with the output from ClustalW and ClustalX
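For users who want to post-process results themselves rather than load them into an external application, the default ALN output is simple enough to read with a few lines of code. The sketch below assumes the usual ALN layout: a header line beginning with "CLUSTAL", followed by blocks in which each line holds a sequence name and an alignment segment, and a consensus line that starts with white space. The function name `parse_clustal_aln` is hypothetical, and the example input is made up for illustration.

```python
def parse_clustal_aln(text):
    """Parse ALN-format text into a dict of {name: aligned sequence}.

    Assumes the conventional layout: a 'CLUSTAL' header line, then
    blocks of 'name  segment' lines. Consensus lines begin with white
    space and are skipped, as are blank lines.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("input does not look like a Clustal ALN file")

    sequences = {}
    for line in lines[1:]:
        if not line.strip() or line[0].isspace():
            continue  # blank separator or consensus line
        fields = line.split()
        name, segment = fields[0], fields[1]
        # Segments for the same name in later blocks are concatenated.
        sequences[name] = sequences.get(name, "") + segment
    return sequences


if __name__ == "__main__":
    example = "\n".join([
        "CLUSTAL W (1.8) multiple sequence alignment",
        "",
        "seqA      MKVLA",
        "seqB      MRILS",
        "          *::* ",
        "",
        "seqA      GG-T",
        "seqB      GGAT",
        "          ** *",
    ])
    for name, seq in parse_clustal_aln(example).items():
        print(name, seq)
```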
The resulting multiple alignments can be displayed as either black and white or colour-coded text. An example of the colour-coded display is shown in Figure 1. The alignment consists of four oxidoreductase NAD binding domains. Residues are coloured according to physicochemical criteria, highlighting conserved positions in the sequences. A consensus line is also displayed below the alignment, with the following symbols denoting the degree of conservation observed in each column: ‘*’ (identical residues in all sequences), ‘:’ (highly conserved column), ‘.’ (weakly conserved column).
Figure 1 A multiple alignment of four oxidoreductase NAD binding domain protein sequences. Residues are coloured according to the following criteria: AVFPMILW are shown in red, DE are blue, RHK are magenta, STYHCNGQ are green and all other residues are grey.
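The consensus symbols described above can be reproduced with a short routine. The following is a sketch under stated assumptions rather than the ClustalW source: the "strong" and "weak" residue groups are those listed in the ClustalW/ClustalX documentation as recalled here and should be checked against the version in use, and columns containing a gap are assumed to receive no symbol. The function names are hypothetical.

```python
# Residue groups assumed from the ClustalW/ClustalX documentation.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = {c.upper() for c in column}
    if "-" in residues:
        return " "  # assumption: gapped columns are left unmarked
    if len(residues) == 1:
        return "*"  # identical residue in all sequences
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


def consensus_line(aligned_sequences):
    """Build the consensus line for equal-length aligned sequences."""
    if len({len(s) for s in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(column_symbol(col) for col in zip(*aligned_sequences))


if __name__ == "__main__":
    alignment = ["MKVLAW", "MRILSD", "MKLLCG"]
    for seq in alignment:
        print(seq)
    print(consensus_line(alignment))
```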
A recent enhancement to the ClustalWWW interface has been the addition of an option that allows the user to upload the results of ClustalW into an alignment editor, using a Java applet called JalView (http://www.compbio.dundee.ac.uk/). JalView is a fully featured multiple sequence alignment editor which allows the user to perform further alignment analysis. Special features include the definition of sequence sub-groups, links to the SRS server at the EBI and an option to output the alignment as a colour PostScript file for printing purposes.
As well as constructing multiple alignments, ClustalWWW can also calculate trees from a multiple alignment using the neighbour-joining (NJ) method, a widely used and relatively fast algorithm that clusters sequences by minimising the sum of branch lengths. The resulting evolutionary relationships can be viewed either as cladograms or phylograms, with the option to display branch lengths (or ‘tree graph distances’).
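To make the clustering step concrete, the following is a minimal sketch of the textbook neighbour-joining procedure applied to a pairwise distance matrix. It is not the ClustalW source code, and it omits the distance calculation and any distance corrections applied before tree building. The function name `neighbour_joining`, the internal node labels and the example distance matrix are illustrative assumptions.

```python
def neighbour_joining(names, dist):
    """Build an unrooted tree from a symmetric distance matrix.

    Returns a list of (node, neighbour, branch_length) edges. Internal
    nodes are labelled 'node0', 'node1', ... (hypothetical labels).
    """
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others still active.
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}

        # Choose the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best

        # Branch lengths from the joined pair to their new parent node.
        u = f"node{next_id}"
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))

        # Distances from the new node to every remaining node.
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]

    # Connect the last two nodes with a single branch.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


if __name__ == "__main__":
    # Made-up additive distances for five sequences, for illustration.
    names = ["a", "b", "c", "d", "e"]
    dist = [[0, 5, 9, 9, 8],
            [5, 0, 10, 10, 9],
            [9, 10, 0, 8, 7],
            [9, 10, 8, 0, 3],
            [8, 9, 7, 3, 0]]
    for child, parent, length in neighbour_joining(names, dist):
        print(f"{child} -- {parent}: {length:.2f}")
```

The edge list carries the branch lengths that a phylogram display would draw to scale; a cladogram would use the same topology and ignore the lengths.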