Tips and tricks
Users already familiar with Webminer's basic functions may like to learn some of these advanced features and shortcuts:

Switching rapidly between experiments
Use the command, ctrl, alt, or shift key to select subsets of experiments

Frequently, users want to perform many searches on different pairwise combinations of a small number of experiments. For instance, there are several experiments that reflect gene expression in the S phase of the cell cycle (alpha-factor release at 35, 42, and 98 min; cdc15ts release at 50, 70, 150 and 160 min; cdc28ts release at 30, 40, 130 and 140 min; and others).

A rapid way to pick and choose within a palette of experiments is to load, for example, all S-phase experiments into your search list, then hold the shift, command, ctrl, or alt key while clicking on them to highlight a subset. When you click the blue arrow, only the highlighted choices will be passed to the next page. You can then do the search, see the result, and then use the back button on your browser to return to the first page and select a different subset of experiments.


Boolean searches
How to do AND, NOT, and OR searches

Webminer automatically performs an AND search on the criteria you provide - that is, it displays only ORFs that match criterion 1 AND 2 AND 3 etc.

In most cases you can also force Webminer to perform a NOT search. For example, to find genes induced more than 2-fold by pheromone but NOT induced more than 2-fold in G1, do the following search:

With a little more work you can perform an OR search as well. Suppose you want to find genes expressed early in meiosis, at either 30 min OR 2 h after shift to sporulation medium. If you simply do two searches, one for genes induced after 30 min and a second for genes induced after 2 h, and then combine the lists, you will end up with many duplicate entries.

A better approach is to search first for genes induced after 30 min, and to save this list as a spreadsheet (see below). Then, search for genes induced after 2 h but NOT after 30 min. Adding this to the first list will give you a nonredundant set of ORFs expressed at 30 min OR 2 h into sporulation.
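The same set logic can be sketched outside Webminer. The Python fragment below is only an illustration, assuming two hypothetical lists of ORF names (the names and their memberships are placeholders, not Webminer output). It shows that adding the "2 h but NOT 30 min" list to the 30 min list yields a combined list with no duplicates.

```python
# Hypothetical ORF lists; the names are placeholders, not real search results.
induced_30min = ["YAL001C", "YBR002W", "YCL003W"]
induced_2h = ["YBR002W", "YDR004C", "YCL003W", "YEL005W"]


def not_search(hits, excluded):
    """Keep hits that are absent from excluded, preserving order (a NOT search)."""
    excluded_set = set(excluded)
    return [orf for orf in hits if orf not in excluded_set]


def or_search(first, second):
    """Combine two lists as described above: first + (second NOT first)."""
    return list(first) + not_search(second, first)


combined = or_search(induced_30min, induced_2h)
print(combined)
```

Every ORF found by either search appears exactly once in the combined list, which is the behavior the two-step Webminer procedure is meant to achieve.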


Finding multiple regulatory elements or motifs
Use the pattern "motif.*motif.*motif"

Many motifs in biology are most meaningful when found in clusters. For example, many pheromone-regulated genes contain repeats of a pheromone response element (PRE) in their upstream sequence. Likewise, many kinase-regulated proteins contain multiple phosphorylation sites rather than a single one. It is therefore often useful to search not just for the presence of a site but for multiple occurrences of it.

This search is simple to implement using the standard regular expression rules outlined below. For example, to find three or more PREs, search for a motif with the pattern

TGAAACA.*TGAAACA.*TGAAACA

This pattern will match any promoter containing at least three PREs separated by any distance from each other. (Note that they must all be oriented in the same sense. It is not straightforward to search for all possible orientations at once.) To find motifs clustered more closely, say within 10 bp of each other, search for

TGAAACA.{0,10}TGAAACA
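To see how these two patterns behave, here is a minimal sketch using Python's re module, which accepts the same basic syntax for these expressions. The promoter sequences and their names are invented for illustration and are not taken from Webminer.

```python
import re

PRE = "TGAAACA"

# The two patterns from the text above, built from the PRE consensus.
three_or_more = re.compile(f"{PRE}.*{PRE}.*{PRE}")
clustered = re.compile(f"{PRE}.{{0,10}}{PRE}")

# Made-up upstream sequences, for illustration only.
promoters = {
    "orf_three_sites": "CC" + PRE + "G" * 15 + PRE + "AT" + PRE + "T",
    "orf_two_far": PRE + "C" * 25 + PRE,
    "orf_two_close": "A" + PRE + "CCCC" + PRE + "G",
}

for name, seq in promoters.items():
    has_three = bool(three_or_more.search(seq))
    has_cluster = bool(clustered.search(seq))
    print(name, has_three, has_cluster)
```

Only the first sequence satisfies the three-site pattern, while the clustered pattern rejects the sequence whose two sites lie 25 bp apart.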


Displaying values without sorting by them
Demand values greater than zero

For the purpose of scanning quickly through a long list, you may want to include variables for display purposes only. For instance, while searching for sporulation-specific genes you may also want to know how each one behaves in response to overexpression of the meiotic transcription factor NDT80, but not use this response as an absolute criterion for your search.

The simplest solution is to include the experiment of interest as a criterion in your search, but set the cut-off such that all ORFs will be included. For the example above, you would search for:

This search will force the display of the NDT80 data without introducing any new limits on your search.


Importing search results to a spreadsheet
Check the "Output as tab-delimited" option on the search page

For a quick glance it is convenient to see the results of your search directly in your web browser, but when you have a set of genes you want to save and work with more extensively, it is often easier to import them into a spreadsheet program like Microsoft Excel.

To do that, click the "Output as tab-delimited instead of HTML table" option on the page where you enter your search parameters. The first time you do this, your browser will probably notify you that you are downloading a file of type "spreadsheet/tab-delimited" and give you the option of cancelling, saving the file, or picking an application to open it with.

Click "Pick App" and double-click on a spreadsheet program like Excel in the window that comes up next. This teaches your browser to open Webminer files in Excel. Most spreadsheets will either recognize the tab-delimited format automatically, or may ask you for some help in importing the data. In most versions of Excel, simply click the "Finish" button and it will take care of everything for you.

The tab-delimited file is simply a plain text file where each line is one row in the table, and columns are separated by tabs. You can also use any word processing program like SimpleText or Word to view it.
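Because the format is this simple, the file can also be read by a short script. The sketch below uses Python's standard csv module. The sample text stands in for a saved file, and its column headings and values are invented. Treating the first line as a header row is an assumption about the output, not something stated here.

```python
import csv
import io

# Stand-in for a saved Webminer file; headings and values are invented.
sample = (
    "ORF\texperiment_1\texperiment_2\n"
    "YAL001C\t2.5\t0.8\n"
    "YBR002W\t3.1\t1.2\n"
)


def read_tab_delimited(handle):
    """Return (header, rows) from a tab-delimited table.

    Assumes the first line holds the column headings.
    """
    reader = csv.reader(handle, delimiter="\t")
    rows = list(reader)
    return rows[0], rows[1:]


header, rows = read_tab_delimited(io.StringIO(sample))
print(header)
for row in rows:
    print(row)
```

To read a real file, pass an opened file object instead of the io.StringIO wrapper, for example open(path, newline="").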


Motif searching with regular expressions
Most regular expressions from the Perl language are accepted

Consensus sites are frequently degenerate. Webminer is written in Perl and takes advantage of that language's powerful pattern matching capabilities. Here are some basic examples showing how to use them:

Symbol       Matches
.            any character
[ST]         a single S or T (as in serine or threonine)
[^DENQKR]    any single character that is not D, E, N, Q, K, or R
A*           any number of occurrences, including zero, of A
A?           zero or one occurrences of A
A+           one or more occurrences of A
A{3,5}       three to five occurrences of A

In its output Webminer will show you both the pattern you searched for (as the column heading) and the exact match that was found for each ORF. You can use these pattern matching terms both for searching promoter sequences and protein sequences.
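The symbols in the table can be tried out in any language with Perl-style regular expressions. The following sketch uses Python's re module, which shares this basic syntax. The sequences are invented, and first_match is a hypothetical helper that mimics reporting the exact match found.

```python
import re


def first_match(pattern, sequence):
    """Return the text of the first match, or None if there is no match."""
    found = re.search(pattern, sequence)
    return found.group(0) if found else None


# (pattern, invented sequence) pairs, one per kind of symbol.
examples = [
    ("[ST]P", "MKSPLT"),      # S or T followed by P
    ("[^DENQKR]", "DEA"),     # first residue outside the excluded set
    ("GA*C", "TTGC"),         # zero A's between G and C is allowed
    ("GA?C", "TTGAAC"),       # at most one A allowed, so this fails
    ("GA+C", "TTGAAC"),       # one or more A's between G and C
    ("A{3,5}", "CAAAAAAT"),   # takes at most five of the six A's
]

for pattern, sequence in examples:
    print(pattern, sequence, first_match(pattern, sequence))
```

Note that the quantifiers are greedy: A{3,5} reports five A's when six are available, because it takes as many as it is allowed.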


Contributing new modules to Webminer
Send me a tab-delimited, two-column text file and reference information

Webminer is written to be easily expanded as new genomic information accumulates. If you would like to contribute to the Webminer project, your effort would be welcome. Webminer data modules should contain the following three parts:

  1. A plain text, tab-delimited file in which each line contains: ORF name (tab) data point (newline)
    In most cases the data points will be fold-induction values of gene expression, but other kinds of data are welcome as well.
  2. A one-sentence general description of the experiment, to appear on the Webminer home page
  3. Reference information for yourself, including your web site, if any, and either links to the PubMed and electronic journal listing of the paper or, if the data is unpublished, instructions on how it should be referenced.
Send contributions or questions to
SGD Curators
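
As an illustration of the file format requested in part 1, the sketch below writes a two-column, tab-delimited table with Python's csv module. The ORF names, the values, and the write_module helper are hypothetical. Only the line layout (ORF name, tab, data point, newline) comes from the description above.

```python
import csv
import io

# Hypothetical fold-induction values keyed by ORF name.
data_points = {"YAL001C": 2.5, "YBR002W": 0.8}


def write_module(handle, values):
    """Write one 'ORF name <tab> data point' line per entry."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for orf, value in values.items():
        writer.writerow([orf, value])


buffer = io.StringIO()
write_module(buffer, data_points)
print(buffer.getvalue(), end="")
```

To produce a file on disk, pass a file opened for writing in place of the in-memory buffer.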