Trees
Protein similarity groups shared between worm and yeast can be displayed as rooted trees, unrooted trees, or dot trees. Dot trees are identical to rooted trees except that the gene names are not shown. Clicking on a gene name (or a dot) in a tree links to a database entry for the gene; see the "Sequence" definition below.
The protein sequences found in the similarity group are also described in a table shown below the tree. Table entries are explained in the "Codes and Definitions" section.
Note: In the rooted trees, the placement of the root is arbitrary.
Alignments
Multiple sequence alignments can be viewed as plain text or with HTML formatting. Conserved (identical or similar) amino acids are identified by symbols below the aligned sequences, and are also color coded in the HTML-formatted display. The symbols and color codes are defined in the "Sequence Similarity Categories" table below. The HTML alignment also includes the table of protein sequences found in the alignment, as described for the similarity trees.
Multiple sequence alignments were generated using Clustal W (version 1.74; Thompson, Higgins, and Gibson).
Reference: Thompson, Higgins, and Gibson. (1994). CLUSTAL W. Nucleic Acids Res, 22: 4673-80.
All tree figures were generated by the programs drawgram (rooted trees) and drawtree (unrooted trees) included in the Phylip Package (version 3.5c, J. Felsenstein). Alignments generated using Clustal W were used to produce input files for the Phylip programs.
More information about how the alignments and trees were generated is available on the General Description of Methods page.
Codes and Definitions
Trees and alignments are accompanied by a table that contains the following headings:
Sequence - For each gene, the sequence name is given, and the gene name is included if one exists. For yeast genes, the sequence name corresponds to the ORF name. Clicking on a yeast gene name links to the SGD Locus entry for the gene; clicking on a worm gene links to the appropriate entry in the Webace database at the Sanger Centre.
Description - The entries in the description field are color-coded. Black is used for yeast proteins, with descriptions taken primarily from SGD. Worm protein descriptions, in brown type, are taken from the Webace version of the C. elegans database at the Sanger Centre.
Length - The length of each protein in amino acid residues.
% Ident, % Strong, % Weak - See the "Sequence Similarity Categories" table below.
BLAST - Links to the results of the BLAST search performed to compare the gene to the protein complement of the other organism. More information can be found on the General Description of Methods page.
COG - Clusters of Orthologous Groups is a site at NCBI which describes gene families found in the complete genomes of yeast and several bacteria. For each yeast gene, a link is provided to the COG cluster containing the gene.
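As a rough illustration of how the % Ident, % Strong, and % Weak columns could be derived, the sketch below counts the `*`, `:`, and `.` symbols in an alignment's symbol line and divides by a protein length. The function name `similarity_percentages` and the choice of denominator (the value in the Length column) are assumptions made for this example; this page does not state exactly how the percentages were calculated.

```python
def similarity_percentages(symbols, length):
    """Return percent identical/strong/weak positions.

    symbols: the symbol line printed below an alignment ('*', ':', '.', ' ').
    length:  protein length in residues (assumed denominator, see lead-in).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    counts = {
        "ident": symbols.count("*"),   # single, fully conserved residue
        "strong": symbols.count(":"),  # a 'strong' group is conserved
        "weak": symbols.count("."),    # a 'weaker' group is conserved
    }
    return {name: 100.0 * n / length for name, n in counts.items()}
```

For example, `similarity_percentages("*::. ", 5)` counts one identical, two strong, and one weak position out of five residues.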
Sequence Similarity Categories
The symbols used in the alignments, and the color codes in the HTML versions, reflect the degree of sequence similarity, as described below. The "strong" and "weak" groups are all the positively scoring groups that occur in the Gonnet PAM250 matrix. Strong: score > 0.5; weak: score <= 0.5. The composition of these groups is defined below.
| Symbol | Meaning |
|---|---|
| * | Indicates positions which have a single, fully conserved residue |
| : | Indicates that one of the following 'strong' groups is fully conserved: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW |
| . | Indicates that one of the following 'weaker' groups is fully conserved: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY |
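The rules in the table can be expressed as a short Python sketch that assigns a symbol to each alignment column. The group lists are copied from the table; the function names `column_symbol` and `consensus_line`, and the treatment of any column containing a gap (`-`) as unconserved, are assumptions made for illustration rather than a description of the Clustal W source code.

```python
# Groups copied from the Sequence Similarity Categories table.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.', or ' ' for one alignment column.

    column: a string holding the residue from each aligned sequence.
    """
    residues = set(column.upper())
    if "-" in residues:
        # Assumption: a column containing a gap gets no symbol.
        return " "
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one 'strong' group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one 'weaker' group
    return " "


def consensus_line(sequences):
    """Build the symbol line for a list of equal-length aligned sequences."""
    return "".join(column_symbol("".join(col)) for col in zip(*sequences))
```

For the toy alignment `["MKSA", "MRTV", "MKAT"]`, the columns are M/M/M (identical), K/R/K (within QHRK), S/T/A (within STA), and A/V/T (within ATV), so `consensus_line` returns `"*::."`.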
Last update 1998-12-08 MAH