The Illinois Mangrove Project | v 3 7/2/12

How to use the MTDB search capabilities.
Overview
There are two types of searches through the web interface: first, searches against the transcriptome database and second, BLASTn searches against mangrove sequences. Transcriptome searches via the web interface can be performed on sequences, queries, proteins, and genes. BLASTn searches are allowed against species-specific databases within MTDB, or against all mangroves. FASTA sequences can be entered in the space provided or uploaded as a text file. The BLAST search can be optimized by altering parameters such as E-value, matrix, and filtering of low-complexity regions. The results are displayed as a graphical overview and pairwise alignments, in the fashion of NCBI BLASTn searches.
For definitions of the individual search boxes, see "More information on the search fields" further down this page.
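Before pasting or uploading sequences for a BLASTn search, it can help to check that the text really is in FASTA format. The following is a minimal Python sketch, not part of MTDB: the function names and the example record are hypothetical, and it assumes nucleotide input written with the IUPAC nucleotide codes.

```python
# Hypothetical helper, not part of MTDB: a quick local sanity check of
# FASTA text before it is pasted into or uploaded to a BLASTn form.

# IUPAC nucleotide codes (assumption: the input is nucleotide, not protein).
NUCLEOTIDES = set("ACGTUNRYKMSWBDHV")


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            if header is None:
                raise ValueError("sequence data found before any '>' header line")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def non_nucleotide_chars(sequence):
    """Return the set of characters that are not IUPAC nucleotide codes."""
    return set(sequence.upper()) - NUCLEOTIDES


# Made-up example record, for illustration only.
example = """>contig_0001 hypothetical example record
ATGGCGTACGTTAGC
GGATCCNNNTTAGA
"""

records = parse_fasta(example)
for header, sequence in records:
    bad = non_nucleotide_chars(sequence)
    status = "ok" if not bad else "unexpected characters: " + "".join(sorted(bad))
    print(header, len(sequence), status)
```

A record that fails this check (for example, sequence text with no ">" header line) is worth fixing before submitting it.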
Database Searches
Note: no two searches against the database return completely overlapping results. The four different types of searches allow one to look at the data from different starting points.
Protein
Use the Protein search when you have a specific protein in mind and you want to see all the hits for it (e.g. ATP synthase). Only protein records are returned.
Genes
These searches are limited to RefSeq sequences. The records returned by a Gene search based on annotation (e.g. ATP synthase) will not be the same as those returned by the Protein search.
Queries
This search feature returns hits based on the original annotation for a sequence. Query name searches, for example, return hits only if the target keyword was in the annotation originally submitted to public databases. In the case of our R. mangle and H. littoralis sequences, Query searches require a singlet or contig number. Query searches can be further restricted by specifying a minimum sequence length (size) or bit score. This search feature is also useful if, for example, you want to find out what information is available on a particular mangrove species (use the drop-down box to restrict the results).
Sequences
You wouldn't want to use this for a long sequence (use BLASTn), but for short sequences and exact matches, this is fast. This is the only type of search that will return the unknown unknowns (Rumsfeld class 3), since they carry no annotation for the other searches to find.
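To make "exact match" concrete: a query is found only where it occurs letter for letter inside a stored sequence, so a single mismatch means no hit, whereas BLASTn tolerates mismatches and gaps. The sketch below illustrates the idea in Python. It is not MTDB's implementation, which this page does not describe, and the function name and sequences are made up for the example.

```python
# Illustration only: what an exact-match sequence search does in principle.
# MTDB's actual implementation is not described here; names and data are
# hypothetical.

def find_exact_matches(query, sequences):
    """Return {name: [0-based start positions]} for sequences containing query."""
    query = query.upper()
    hits = {}
    for name, seq in sequences.items():
        seq = seq.upper()
        positions = []
        start = seq.find(query)
        while start != -1:
            positions.append(start)
            # Continue one position later so overlapping matches are found too.
            start = seq.find(query, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Made-up sequences standing in for database records.
sequences = {
    "singlet_A": "ATGGCGTACGTTAGC",
    "contig_B": "TTAGCTTAGC",
}

hits = find_exact_matches("TTAGC", sequences)
print(hits)

# One changed base is enough to lose every hit.
print(find_exact_matches("TTAGG", sequences))
```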
No hit sequences
This isn't actually a search, but it allows you to download all the sequences for which there is no annotation, not even one that says it is an unknown, hypothetical, or putative protein. In Rumsfeldian terminology, these are the unknown unknowns. The resulting file will be 15.8 MB.
More information on the search fields
Accession (exact) - refers to the uniform accession number shared among all public databases.
GI (exact) - the GI number, an identifier assigned by NCBI only. Do not include "gi" when using this box. Use this if you are looking for hits annotated with a specific GI number, e.g. a particular ATP synthase entry.
Protein name - any name or abbreviation that has (or may have) a protein record in a public database. The search is by substring, so any protein name that has the search string anywhere in it will be returned.
Gene ID (exact) - a unique number for a record in the NCBI Gene database (this returns different information than that returned by a protein search).
Locus (exact) - if you have a locus ID from a reference genome and want to search for it, use this (e.g. the AT number for an Arabidopsis locus).
Annotation (in Gene Search) - This is usually, but not always, a protein name or some substring (key word) thereof.
Alternative names - for genes with multiple names. Previous and obsolete designators for a gene may show up when alternative names are used in the search.
KEGG - any keyword in a metabolic pathway that has a KEGG assignment will return the genes in MTDB so annotated.
GO - again, a keyword search. If any standard GO name or annotation is entered, all hits with that GO annotation will be returned.
Query Name - if the sequence comes from R. mangle or H. littoralis, this is the singlet or contig number. If it comes from GenBank, it carries the species name and GI number for that record. This is whatever name the original submission had. It is a substring field, i.e. you don't need the entire name to get the hits.
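The fields above behave in two ways: those marked "(exact)" match only when the whole value is identical, while the name and keyword fields match on any substring. The Python sketch below illustrates the difference and shows one way to strip a "gi" label before using the GI box. It is an illustration under stated assumptions, not MTDB code: the records and GI numbers are invented, the function names are hypothetical, and treating substring matches as case-insensitive is an assumption this page does not confirm.

```python
# Illustration only: exact fields versus substring fields.
# Records, GI numbers, and function names are invented for this example.

def normalize_gi(value):
    """Strip a leading 'gi' label (e.g. 'gi|1234567') and return the digits."""
    cleaned = value.strip()
    if cleaned.lower().startswith("gi"):
        cleaned = cleaned[2:].lstrip("|: ")
    if not cleaned.isdigit():
        raise ValueError("a GI number should contain digits only: %r" % value)
    return cleaned


def matches(field_value, search_term, exact):
    """Exact fields need an identical value; other fields match any substring."""
    if exact:
        return field_value == search_term
    # Assumption: substring matching ignores letter case.
    return search_term.lower() in field_value.lower()


def search(records, field, term, exact):
    """Return the records whose given field matches the search term."""
    return [r for r in records if matches(r[field], term, exact)]


# Made-up records standing in for database entries.
records = [
    {"gi": "1234567", "protein": "ATP synthase subunit beta"},
    {"gi": "7654321", "protein": "putative ATP synthase CF1 alpha subunit"},
    {"gi": "1111111", "protein": "ribulose bisphosphate carboxylase"},
]

# Substring field: both ATP synthase records are returned.
protein_hits = search(records, "protein", "ATP synthase", exact=False)

# Exact field: the full number is required, without the "gi" label.
gi_hits = search(records, "gi", normalize_gi("gi|1234567"), exact=True)

# A partial number returns nothing from an exact field.
partial_gi_hits = search(records, "gi", "12345", exact=True)

print(len(protein_hits), len(gi_hits), len(partial_gi_hits))
```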