Databases in Biology
(largely taken from the Bioinformatics Course, UCL/Birkbeck College, London,
June 2001)
Pages Summarising Current Databases
Literature/General
- OVID
- PubMED
- UCL Online Journal Catalogue
- ENTREZ - sequence, structure, molbio, literature
Primary Databases
Nucleic Acids
- GENBANK - accession number stays constant for each entry
- EMBL
- DDBJ
Protein
- PIR - Protein Information Resource, integrates other databases
- MIPS
- SWISSPROT - manually annotated (late, but nice), lots of information
- NCBI databases
- PDB - Protein Data Bank
- Molecular Structure Database
- TrEMBL - up to date, but automatically annotated
- SP-TrEMBL
- REM-TrEMBL - Igs, TCRs, ...
- NRL-3D
- GenPept (NCBI)
Secondary Databases
Note: Databases require different formats (raw sequence or FASTA, accession
number or gi number). The first link is not always the best one.
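As a small illustration of the format issue in the note above, the following sketch converts between a raw sequence and a FASTA record. The function names and the identifier used in the usage line are hypothetical and chosen only for illustration; individual servers may have further requirements (line length, allowed header characters) that are not covered here.

```python
def to_fasta(identifier, sequence, description="", width=60):
    """Wrap a raw sequence into a single FASTA record (illustrative helper)."""
    # Raw-sequence pages often contain spaces, line breaks and position
    # numbers; keep letters only and normalise to upper case.
    cleaned = "".join(ch for ch in sequence if ch.isalpha()).upper()
    header = ">" + identifier + (" " + description if description else "")
    # Break the sequence into fixed-width lines.
    chunks = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([header] + chunks)


def from_fasta(record):
    """Return (header, raw_sequence) from a single FASTA record."""
    lines = [ln.strip() for ln in record.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record: header line must start with '>'")
    return lines[0][1:], "".join(lines[1:])
```

For example, to_fasta("my_seq", "mkt ay iak 10") gives a two-line record with the header ">my_seq" and the sequence "MKTAYIAK"; from_fasta recovers the raw sequence for servers that do not accept a header line.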
Note: The choice of the best method depends on the degree of sequence
similarity: 100-50 % - automatic pairwise alignment; 50-30 % - consensus
methods; 30-15 % - profile methods; below 15 % - structure prediction.
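The rule of thumb in the note above can be sketched as a small lookup. This is only one reading of the thresholds: the assumption is that the ranges are 100-50 %, 50-30 %, 30-15 % and below 15 %, and that a boundary value belongs to the higher range. The identity calculation (matches over aligned, ungapped columns) is also just one of several common definitions, and both function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two already aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Only columns without a gap in either sequence are compared (assumption).
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def suggest_method(identity):
    """Map a percent identity onto the method classes named in the note."""
    if not 0 <= identity <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity >= 50:
        return "automatic pairwise alignment"
    if identity >= 30:
        return "consensus methods"
    if identity >= 15:
        return "profile methods"
    return "structure prediction"
```

A call such as suggest_method(percent_identity(seq_a, seq_b)) then names the class of method to try first; it is a starting point, not a substitute for looking at the alignment.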
- PROSITE - good functional annotation, from patterns, with supplement PROSITE Profiles - HMM library
- PFAM - HMMs, retrieve alignment from domain name
- BLOCKS - from sequence patterns
- PRODOM - domain families as trees, from consensus sequences
- PRINTS - from collection of regular expressions (fingerprints)
- SMART - occurrence of domain in the three kingdoms, find proteins with similar domain architecture/composition !!!
- IDENTIF/eMOTIF - from fuzzy regular expressions
- INTERPRO - annotated motif database
- SignalP - signal sequences
- TMHMM2 - transmembrane helices
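Several of the secondary databases above are built on sequence patterns or regular expressions. As a rough sketch of what such a pattern amounts to, the function below translates a pattern written in PROSITE syntax into a Python regular expression. It covers only the basic elements (x, [..], {..}, repeat counts, the < and > anchors) and is not the code used by any of these databases; the function name and the example pattern in the usage note are made up for illustration.

```python
import re

# One pattern element: x, a residue, [allowed], or {forbidden},
# optionally followed by a repeat count such as (3) or (2,4).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a basic PROSITE-style pattern into a regex string."""
    parts = []
    # A trailing period ends the pattern; elements are separated by '-'.
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, low, high = match.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)
```

With the made-up pattern "C-x(2,4)-C-x(3)-[LIVM]", the result is "C.{2,4}C.{3}[LIVM]", which can be passed to re.search to scan a raw protein sequence.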
Tertiary Databases
Composite Databases
- NRDB = PIR + SP + PDB + GenPept
- OWL = PIR + SP + GENBANK + NRL-3D
- MIPSX (PIR+PATCHX) = PIR+SP+MIPSOWN+NRL-3D+MIPSH+PIRMOD+MIPSTRN+EMTRANS+GBTRANS+KABAT+PseqIP
- SP+TrEMBL = TrEMBL + SP
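The idea behind the composite databases listed above (several primary sources merged into one non-redundant set) can be sketched as follows. This is a minimal illustration under stated assumptions: redundancy is taken to mean exactly identical sequences, and the order of the sources decides which entry is kept. The real composite databases each apply their own redundancy criteria, and the function name and record layout here are hypothetical.

```python
def merge_nonredundant(sources):
    """Merge sequence sets, keeping the first entry for each distinct sequence.

    sources: list of (source_name, {identifier: sequence}) pairs, given in
    priority order (earlier sources win when sequences are identical).
    """
    first_seen = {}  # upper-cased sequence -> (source_name, identifier)
    for source_name, records in sources:
        for identifier, sequence in records.items():
            key = sequence.upper()
            if key not in first_seen:
                first_seen[key] = (source_name, identifier)
    # Label each kept entry with the source it came from.
    return {
        source_name + ":" + identifier: sequence
        for sequence, (source_name, identifier) in first_seen.items()
    }
```

Called with a list such as [("PIR", pir_records), ("SP", sp_records)], it returns one dictionary in which a sequence present in both sources appears only once, under its PIR identifier.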
Pathway Databases
- KEGG
- WIT
- PathDB (June 2001)
- Gene Ontology - tries to bring everything together: yeast (SGD), fly (FlyBase), mouse (MGD) - functional classification
- TIGR - comparative genomics, multiple microbial organisms
- PEDANT - yeast scheme applied to other organisms (at MIPS)
Other Programmes and Downloads
- Rasmol - manual
- ClustalW
- MMDB - NCBI Molecular Modelling Database
- HMMer - HMMs (guess what), multiple alignments etc.
- SAM - HMM application
- Swiss-Model - automated modelling server (gives first suggestions)
Plant Specials