THE FOLLOWING LINKS TO DATABASES PROVIDE A STARTING POINT FOR VARIOUS BIOINFORMATICS QUERIES. IF ANY OF THE LINKS IS NOT CURRENT, USE A SEARCH ENGINE TO FIND THE CURRENT LINK. SOME OF THE DATABASE PROVIDERS MAY REQUIRE REGISTRATION. COMMERCIAL USE OF ANY OF THESE DATABASES MAY ALSO REQUIRE A SPECIAL LICENSE. INQUIRE WITH THE DATABASE PROVIDER.
Human genome project (HGP) http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
Explore this site for information about the Human
Genome Project (1990-2003).
For each known human gene, HGNC approves a gene name
and symbol (short-form abbreviation). All approved symbols are stored in
the HGNC database. Each symbol is unique, and each gene is given only
one approved gene symbol. Where possible, each symbol maintains
parallel construction across the members of a gene family and can also be
used in other species, especially the mouse.
The Human Genome Project (HGP) was one of the great
feats of exploration in history - an inward voyage of discovery rather than an
outward exploration of the planet or the cosmos; an international research
effort to sequence and map all of the genes - together known as the genome - of
members of our species, Homo sapiens.
Completed in April 2003, the HGP gave us the ability, for the first time, to
read nature's complete genetic blueprint for building a human being.
The European Bioinformatics Institute (EBI) is a non-profit academic organization
that forms part of the European Molecular Biology Laboratory (EMBL).
The EBI is a centre for research and services in bioinformatics. The Institute
manages databases of biological data including nucleic acid, protein sequences
and macromolecular structures.
Established in 1988 as a national resource for
molecular biology information, NCBI creates public databases, conducts research
in computational biology, develops software tools for analyzing genome data,
and disseminates biomedical information - all for the better understanding of
molecular processes affecting human health and disease.
Eukaryotic promoter database (EPD) http://www.epd.isb-sib.ch/
The Eukaryotic Promoter
Database is an annotated
non-redundant collection of eukaryotic POL II promoters, for which the
transcription start site has been determined experimentally. Access to promoter
sequences is provided by pointers to positions in nucleotide sequence entries.
The annotation part of an entry includes description of the initiation site
mapping data, cross-references to other databases, and bibliographic
references. EPD is structured in a way that facilitates dynamic extraction of
biologically meaningful promoter subsets for comparative sequence analysis.
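The idea of a promoter being stored as a pointer into a nucleotide sequence entry can be illustrated with a small sketch. This is not EPD's actual data format or software: the `PromoterPointer` record, its field names, and the `promoter_window` helper are hypothetical, and the sketch assumes a 1-based transcription start site (TSS) position plus a strand flag.

```python
from dataclasses import dataclass

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class PromoterPointer:
    """Hypothetical pointer record: which entry, where the TSS is, which strand."""
    entry_id: str  # identifier of the nucleotide sequence entry (illustrative)
    tss: int       # 1-based position of the transcription start site
    strand: str    # "+" or "-"


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(sequence: str, pointer: PromoterPointer,
                    upstream: int = 10, downstream: int = 5) -> str:
    """Return the bases from -upstream to +downstream around the TSS.

    The TSS itself counts as position +1. The result is written 5'->3'
    on the promoter's own strand and is clipped at the entry boundaries.
    """
    i = pointer.tss - 1  # convert to a 0-based index
    if pointer.strand == "+":
        start = max(0, i - upstream)
        end = min(len(sequence), i + downstream)
        return sequence[start:end]
    if pointer.strand == "-":
        start = max(0, i - downstream + 1)
        end = min(len(sequence), i + upstream + 1)
        return reverse_complement(sequence[start:end])
    raise ValueError("strand must be '+' or '-'")


# Toy sequence entry, not real data.
entry = "AAAACCCCGGGGTTTT"
plus = promoter_window(entry, PromoterPointer("TOY1", 9, "+"), 4, 4)
minus = promoter_window(entry, PromoterPointer("TOY1", 4, "-"), 2, 2)
```

Storing only the position and strand keeps the promoter record small and lets the same sequence entry serve any window size chosen at analysis time.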
The Transcript Sequence
Retriever (TRASER - see http://genome-www6.stanford.edu/cgi-bin/Traser/traser)
provides rapid retrieval of transcript
and upstream (putative promoter-containing) sequences for predicted human
genome mRNAs. The underlying database is built using the human genome annotation
files provided by the National Center for
TRANSFAC is the database on eukaryotic transcription factors,
their genomic binding sites and DNA-binding profiles. http://www.gene-regulation.com/pub/databases.html#transfac
The new J. Craig Venter Institute was formed in
October 2006 through the merger of several affiliated and legacy
organizations--The Institute for Genomic Research (TIGR) and The Center for the
Advancement of Genomics (TCAG), The J. Craig Venter Science Foundation, The
Joint Technology Center, and the Institute for Biological Energy Alternatives
(IBEA). Today all these organizations have become one large multidisciplinary
genomic-focused organization. With more than 500 scientists and staff, more than
250,000 square feet of laboratory space, and locations in Rockville, Maryland
and La Jolla, California, the new JCVI is a world leader in genomic research.
For protein data-mining tools see ExPASy http://ca.expasy.org/
The ExPASy (Expert Protein Analysis System)
proteomics server of the Swiss Institute of Bioinformatics (SIB) is
dedicated to the analysis of protein sequences and structures as well as 2-D
Proteomic tools at ExPASy. See http://ca.expasy.org/tools/
UCSC genome browser http://genome.ucsc.edu/ - This site contains the reference sequence
and working draft assemblies for a large collection of genomes. It also
provides a portal to the ENCODE project.
Entrez Gene (previously Locus Link)
is a searchable database of genes, from RefSeq genomes, and defined by
sequence and/or located in the NCBI Map Viewer.
to information: GeneCards http://www.genecards.org/index.shtml
GeneCards® is an integrated database of human genes
that includes automatically-mined genomic, proteomic and transcriptomic
information, as well as orthologies, disease relationships, SNPs, gene
expression, gene function, and service links for ordering assays and
The goal of the NCI's Cancer Genome Anatomy Project is
to determine the gene expression profiles of normal, precancer, and cancer
cells, leading eventually to improved detection, diagnosis, and treatment for
the patient. By collaborating with scientists worldwide, CGAP seeks to increase
its scientific expertise and expand its databases for the benefit of all cancer
provide access to all CGAP data, bioinformatic analysis tools, and biological
resources allowing the user to find "in silico" answers to biological
questions in a fraction of the time it once took in the laboratory.
Gene information, clone resources, SNP500Cancer, GAI, and transcriptome
FISH-mapped BAC clones, SNP500Cancer, and the Mitelman database of chromosome
cDNA library information, methods, and EST-based gene expression analysis
Analysis of gene expression using long and
short SAGE tag data for both human and mouse
Diagrams of biological pathways and protein complexes, with links to genetic
resources for each known protein
Direct access to all analytic and data mining tools developed for the project
RNA-interference constructs, targeted specifically against cancer relevant
SOURCE is a unification tool which dynamically
collects and compiles data from many scientific databases, and thereby attempts
to encapsulate the genetics and molecular biology of genes from the genomes of
Homo sapiens, Mus musculus, Rattus norvegicus into easy to navigate
intron exon information along with a comprehensive summary for a gene on the
genome. See http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?
provides an alternative view for genomic information, upstream promoter region
in a highly integrated form with other relevant links. See http://www.dsi.univ-paris5.fr/genatlas/
The Alternative Splicing Database (ASD) Project aims to understand the mechanism of
alternative splicing on a genome-wide scale by creating a database of
alternative splice events and the resultant isoform splice patterns of genes
from human and other model species. See http://www.ebi.ac.uk/asd/. Also see http://hollywood.mit.edu/Dgene.php
An Organized View of the Transcriptome. Each
UniGene entry is a set of transcript sequences that appear to come from the
same transcription locus (gene or expressed pseudogene), together with
information on protein similarities, gene expression, cDNA clone reagents, and
genomic location. See http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene
The Online Mendelian Inheritance in Man (OMIM) database is a catalog of human genes and
genetic disorders authored and edited by Dr. Victor A. McKusick and his colleagues
at Johns Hopkins and elsewhere, and developed for the World Wide Web by NCBI,
the National Center for Biotechnology Information. The database contains
textual information and references. It also contains copious links to MEDLINE
and sequence records in the Entrez system, and links to additional related
resources at NCBI and elsewhere. See http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
NCBI Map Viewer for chromosomal position of a gene in a graphical display. See http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606
Thousands of molecular targets have been measured in
the NCI panel of 60 human tumor cell lines. Measurements include protein
levels, RNA measurements, mutation status and enzyme activity levels. You can
choose to search for a target of interest, or you may browse through a list of
targets. Follow the links for a target to retrieve the 60 cell line data
(either text or graphical), to run COMPARE (find Targets or Compounds whose
patterns correlate with a Target of interest) and to link to various databases
with information (function, sequences, disease associations) about the target.
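The pattern-matching idea behind COMPARE can be sketched as a correlation ranking. The sketch below is an assumption-laden illustration, not the NCI implementation: it uses a plain Pearson correlation across cell lines, and the function names and toy profiles are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no pattern to correlate
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(seed, candidates):
    """Rank candidate profiles by their correlation with the seed profile.

    `candidates` maps a name (target or compound) to its measurements,
    given in the same cell-line order as `seed`.
    """
    scored = [(name, pearson(seed, profile)) for name, profile in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data over four cell lines (the real panel has 60); values are made up.
seed_target = [1.0, 2.0, 3.0, 4.0]
profiles = {
    "compound_up": [2.0, 4.0, 6.0, 8.0],
    "compound_down": [4.0, 3.0, 2.0, 1.0],
    "compound_flat": [5.0, 5.0, 5.0, 5.0],
}
ranking = rank_by_correlation(seed_target, profiles)
```

The top of the ranking lists the entries whose pattern across cell lines most closely follows the target of interest; strongly negative values at the bottom indicate inverse patterns.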
GENATLAS contains relevant information with respect to gene mapping and genetic
diseases. The database was created in 1986 by Jean Frézal and is located on
and Proteome databases: ArrayExpress
ArrayExpress is a public repository for microarray
data, which is aimed at storing
MIAME-compliant data in accordance with MGED
recommendations. The ArrayExpress Data Warehouse stores gene-indexed expression
profiles from a curated subset of
experiments in the repository.
MIAME (Minimum Information
About a Microarray Experiment) is a community standard for microarray data
developed by the MGED (Microarray Gene Expression Data) Society (http://www.mged.org/miame). ArrayExpress
and related tools are MIAME supportive. You can find a
publication on MIAME here: Nature
Genetics 29(4): 365-371. MIAME is a developing standard.
SymAtlas. This online tool is a public installation
of the gene-centric database of integrated gene and genome annotation.
SymAtlas presents annotation collated from the public domain alongside gene
expression data generated at GNF from humans and various rodents. In
particular, the “GeneAtlas” data set displays the expression pattern for more than
20,000 transcripts across an anatomically diverse panel of tissues. This
application has been widely used as a candidate gene prioritization tool for
gene expression and genetics studies.
SNPview. Browse or search the large-scale SNP
collections GNF researchers generated for most of the commonly used inbred
laboratory strains. These SNP collections are a key tool towards the goal of
in silico mapping of phenotypic and complex disease traits.
Druggable Genome BLAT. It provides a convenient
interface to search against our expanded dataset of slightly more than 3000
unique human protein-encoding "druggable" genes.
Comprehensive microarray services
Oncomine combines a rapidly growing compendium
of 20,000+ cancer transcriptome profiles with a sophisticated analysis engine
and a powerful web application for data-mining and visualization. Oncomine
facilitates rapid and reliable biomarker and therapeutic target discovery,
validation and prioritization.
Open Proteomic database (OPD) is a
public database for storing and disseminating mass spectrometry-based
proteomics data. The database currently contains roughly 3,000,000 spectra
representing experiments from 5 different organisms. See http://bioinformatics.icmb.utexas.edu/OPD/
Other Protein Expression Data
2D Gel Databases
Partial List of Web
2D Electrophoretic Gel Databases
NCI 2DWG Image Meta-Database
Argonne National Lab Protein Mapping Group
Weinstein: NCI-60 cancer cell lines
A Protein Expression
Database for the Molecular Pharmacology of Cancer
SNP HapMap http://www.hapmap.org/
The International HapMap Project is a partnership of
scientists and funding agencies from Canada, China, Japan, Nigeria, the United
Kingdom and the United States to develop a public resource that will help
researchers find genes associated with human disease and response to
pharmaceuticals. See "About the International HapMap Project" for more information.
dbSNP. See http://www.ncbi.nlm.nih.gov/projects/SNP/
The Single Nucleotide Polymorphism database (dbSNP) is a public-domain archive for a broad collection of
simple genetic polymorphisms.
Pharmacogenomics http://www.pharmgkb.org/ - PharmGKB curates
information that establishes knowledge about the relationships among drugs,
diseases and genes, including their variations and gene products. Our mission
is to catalyze pharmacogenomics research.
Primer 3 http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi
To calculate the Tm of your primers, please fill out the
requested information for Primer #1 (Yellow) and Primer #2 (Grey). Upon
submission you will be presented with a summary of the values you've submitted
in addition to the optimized Tm calculated for Applied Biosystems products.
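For a rough idea of what a primer Tm calculation involves, the sketch below uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a crude estimate for short oligonucleotides. It is not the optimized calculation used by the tool described above, and the function names are illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees Celsius with the Wallace rule: 2*(A+T) + 4*(G+C).

    A rule of thumb for short primers only; it ignores salt and primer
    concentration and nearest-neighbor effects.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def tm_difference(primer_1: str, primer_2: str) -> int:
    """Absolute Tm difference between two primers of a pair."""
    return abs(wallace_tm(primer_1) - wallace_tm(primer_2))


# Made-up primer sequences for illustration.
tm_primer_1 = wallace_tm("ACGTACGTACGTACGT")
pair_gap = tm_difference("AAAA", "GGGG")
```

A small Tm difference between the two primers of a pair is generally preferred, so that both anneal at a common temperature.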
A comprehensive repository of cells, bacteria and cDNA. See http://www.atcc.org/.
MGC (Mammalian Gene Collection) full-length clones - http://mgc.nci.nih.gov/
The goal of the Mammalian Gene Collection (MGC), a
trans-NIH initiative, is to provide full-length open reading frame (FL-ORF)
clones for human, mouse, and rat genes. In 2005, the project added the cow
cDNAs generated by Genome Canada.
Initially, cDNA libraries provided the source of the clones. Recently,
alternative methods based on gene-specific amplification have been developed to
target the recovery of human and mouse genes absent from the MGC collection.
- The Cooperative Human Tissue Network (CHTN)
was initiated by the Cancer Diagnosis Program of the National Cancer Institute
(NCI) in 1987 to provide increased access to human cancer tissue for basic and
applied scientists from academia and industry, to accelerate the advancement of
discoveries in cancer diagnosis and treatment. The CHTN provides prospective
investigator-defined procurement of malignant, benign, diseased and uninvolved
(normal adjacent) tissues. The investigator can also choose how the
specimen is preserved: fresh, frozen, or chemically fixed. The
CHTN also produces tissue microarrays (TMA) ranging from multiple tissue types
to disease-specific blocks. Recently, the CHTN has approved development
within its divisions of the isolation and distribution of raw nucleic acid, to expand
resources and to more readily serve investigators' interests. Tissues are
annotated with patient demographics including gender, age, and race. Additional
patient information can be requested where applicable.
NIGMS Human Genetic Cell Repository
- By providing the resources
for human genome research, the HUMAN GENETIC CELL REPOSITORY, sponsored by the National
Institute of General Medical Sciences (NIGMS), supplies scientists with the
materials for accelerating disease gene discovery. The resources available
include highly-characterized, viable, and contaminant-free cell cultures and
high quality, well-characterized DNA samples derived from these cultures, both
subjected to rigorous quality
If you are looking for additional links
or alternatives see BioToolKit http://www.biosupplynet.com/biotoolkit/
The revised and updated BioToolKit provides access to
over 1200 bioinformatics and neuroinformatics resources for the analysis and
visualization of the genome, transcriptome, and proteome, and to
neuroinformatics tools enabling visualization of neuroanatomic structure.
- Nucleic Acid Analysis Primer design and
restriction site analysis tools. Literature gateways and sequence retrieval
programs. Applications for multiple sequence alignment, the analysis of gene
expression regulation, localization of transcription promoter sites, and
identification of exons and alternative splice sites.
- Genomics Resources Genomics and gene expression
databases. Bioinformatics resources. Applications for resolving gene symbols
and synonyms, and for the annotation of genes. Tools for comparing genomes and
visualizing phylogenomic relationships.
- Protein Structural Imaging and Analysis Tools for
predicting, classifying, comparing and visualizing protein structures. Proteome
databases. Applications for motif identification, sequence alignment, and the
analysis of protein-protein interactions. Small molecule and metabolomics
- Neuroinformatics Neuroimaging resources,
including multi-scale imaging. Connectivity databases and applications for
interactive display of the neuroanatomy of gene expression. Neuroinformatics
databases and software.
Micro RNA databases - RNAi resources - http://www.rnaiweb.com/
is the new home of microRNA data on the web, providing data previously
accessible from the miRNA Registry.
The RNAi Consortium shRNA Library
The RNAi Consortium (TRC) - Short hairpin RNA (shRNA) clones
produced by the TRC, as well as protocols for handling and conducting
screens with shRNA molecules. The RNAi consortium shRNA library is
distributed as bacterial glycerol stocks, plasmid DNA or lentiviral
particles by Sigma-Aldrich and as bacterial glycerol stocks by Open