SGD Help: Worm-Yeast Protein Comparison Results Displays


Display Options

Trees
Protein similarity groups shared between worm and yeast can be displayed as rooted trees, unrooted trees, or dot trees. Dot trees are identical to rooted trees except that the gene names are not shown. Clicking on a gene name (or a dot) in a tree links to a database entry for the gene; see the "Sequence" definition below.

The protein sequences found in the similarity group are also described in a table shown below the tree. Table entries are explained in the "Codes and Definitions" section.

Note: In the rooted trees, the placement of the root is arbitrary.

Alignments
Multiple sequence alignments can be viewed as plain text or with HTML formatting. Conserved (identical or similar) amino acids are identified by symbols below the aligned sequences, and are also color coded in the HTML-formatted display. The symbols and color codes are defined in the "Sequence Similarity Categories" table below. The HTML alignment also includes the table of protein sequences found in the alignment, as described for the similarity trees.

Methods

Multiple sequence alignments were generated using Clustal W (version 1.74, Thompson, Higgins, Gibson).
Reference: Thompson, Higgins, and Gibson. (1994). CLUSTAL W. Nucleic Acids Res, 22: 4673-80.

All tree figures were generated by the programs drawgram (rooted trees) and drawtree (unrooted trees) included in the Phylip Package (version 3.5c, J. Felsenstein). Alignments generated using Clustal W were used to produce input files for the Phylip programs.

More information about how the alignments and trees were generated is available on the General Description of Methods page.

Table Definitions

Trees and alignments are accompanied by a table that contains the following headings:

Sequence - For each gene, the sequence name is given, and the gene name is included if one exists. For yeast genes, the sequence name corresponds to the ORF name. Clicking on a yeast gene name links to the SGD Locus entry for the gene; clicking on a worm gene links to the appropriate entry in the Webace database at the Sanger Centre.

Description - The entries in the description field are color-coded. Black is used for yeast proteins, with descriptions taken primarily from SGD. Worm protein descriptions, in brown type, are taken from the Webace version of the C. elegans database at the Sanger Centre.

Length - The length of each protein in amino acid residues.

% Ident, % Strong, % Weak - See the "Sequence Similarity Categories" table below.

BLAST - Links to the results of the BLAST search that compared the gene against the protein complement of the other organism. More information can be found on the General Description of Methods page.

COG - Clusters of Orthologous Groups is a site at NCBI which describes gene families found in the complete genomes of yeast and several bacteria. For each yeast gene, a link is provided to the COG cluster containing the gene.

Sequence Similarity Categories

The symbols used in the alignments, and the color codes in the HTML versions, reflect the degree of sequence similarity, as described below. The "strong" and "weak" groups are all the positively scoring groups that occur in the Gonnet PAM250 matrix. Strong: score > 0.5; weak: score <= 0.5. The composition of these groups is defined below.

Symbol Meaning
* Indicates positions which have a single, fully conserved residue
: Indicates that one of the following 'strong' groups is fully conserved:
                 STA
                 NEQK
                 NHQK
                 NDEQ
                 QHRK
                 MILV
                 MILF
                 HY
                 FYW
. Indicates that one of the following 'weaker' groups is fully conserved:
                 CSA
                 ATV
                 SAG
                 STNK
                 STPA
                 SGND
                 SNDEQK
                 NDEQHK
                 NEQHRK
                 FVLIM
                 HFY
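
The symbol rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Clustal W source code: the function names column_symbol and consensus_line are hypothetical, the group lists are copied from the table above, and the handling of gaps (a column containing "-" is left unmarked) is an assumption that this page does not state.

```python
# Groups copied from the "Sequence Similarity Categories" table.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.', or ' ' for one alignment column.

    `column` is a string holding the residue from each aligned
    sequence at a single position.
    """
    residues = set(column.upper())
    # Assumption: a column that contains a gap is never marked.
    if "-" in residues:
        return " "
    # A single, fully conserved residue.
    if len(residues) == 1:
        return "*"
    # Every residue in the column falls inside one 'strong' group.
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    # Every residue in the column falls inside one 'weaker' group.
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "


def consensus_line(aligned_sequences):
    """Build the symbol line for a list of equal-length aligned sequences."""
    columns = zip(*aligned_sequences)
    return "".join(column_symbol("".join(col)) for col in columns)


if __name__ == "__main__":
    print(consensus_line(["MILK", "MLVR"]))  # prints *:::
```

In the toy example, the first column (M, M) is identical, and each remaining column falls inside a single strong group (I/L and L/V in MILV, K/R in QHRK), so those positions are marked with ":".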


Last update 1998-12-08 MAH