Flowchart of Sequence Data Management: Primary & Secondary Databases in Bioinformatics

Bioinformatics Ayesha M. Khan 22 Feb, 2012 Lec-4

Flowchart of sequence data from labs and literature to primary sequence database and subsequent secondary databases
• Sources: sequencing centers, literature, researchers
• Primary sequence databases: nucleic acid (e.g. GenBank, EMBL, DDBJ) and amino acid (e.g. SwissProt and PIR)
• Secondary sequence databases: protein domains & families, metabolic pathways (e.g. RefSeq and Conserved Domain Database (CDD) within NCBI)

Always remember that:
• The data within primary databases are only as reliable as the data submitted.
• That reliability depends primarily on the methods used to produce the data.
• Regardless of who obtains the sequence data, nucleic acid and amino acid sequencing results are subject to errors.

Protein Sequence databases
• The protein sequence database was developed at the National Biomedical Research Foundation (NBRF).
• It was started in the early 1960s by Margaret Dayhoff to investigate evolutionary relationships among proteins.
• From 1988 onwards, it has been maintained collectively by the Protein Information Resource (PIR) at NBRF, the International Protein Information Database of Japan (JIPID), and the Martinsried Institute for Protein Sequences (MIPS).

[Figure: Examples of molecular sequence types in NCBI records]

Protein Sequence databases
SWISS-PROT
• Started in 1986 at the University of Geneva and EMBL.
• It is now maintained by the Swiss Institute of Bioinformatics (SIB) and EBI/EMBL.
TrEMBL
• Started in 1996; follows the SWISS-PROT format and contains translations of coding sequences in EMBL.
• It also contains entries not destined for SWISS-PROT: synthetic sequences, short amino acid fragments, and translations of coding sequences that do not encode real proteins.
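To make "translations of coding sequences" concrete, here is a minimal sketch of a conceptual translation using the standard genetic code. It is not the actual TrEMBL pipeline; the names `CODON_TABLE` and `translate_cds` are hypothetical, and the sketch assumes the input is already a coding sequence in the correct reading frame.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Assumption: the sequence starts at the first base of the first codon.
    Codons containing ambiguous bases are translated as 'X'.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_cds("ATGGCCAAATGA"))
```

A translation produced this way is only as good as the coding-sequence annotation it starts from, which is why such entries are computer-derived rather than manually curated.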

Composite protein sequence databases
• A composite database merges a variety of different primary sources.
• Composite databases obviate the need to interrogate multiple resources.
• The merge can eliminate identical sequence copies only, or eliminate both identical and highly similar sequences.
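The merging step can be sketched as follows. This is a toy illustration under stated assumptions, not how any particular composite database is built: the function name `merge_non_redundant`, the database labels, and the accessions are all hypothetical. Only exact duplicates are collapsed here; removing highly similar sequences would additionally require a sequence-similarity measure, which is not shown.

```python
from typing import Dict, List, Tuple


def merge_non_redundant(
    sources: Dict[str, List[Tuple[str, str]]]
) -> Dict[str, List[str]]:
    """Merge records from several source databases, collapsing identical sequences.

    `sources` maps a database name to a list of (accession, sequence) pairs.
    The result maps each distinct sequence to the "database:accession" labels
    of every record that carried it.
    """
    merged: Dict[str, List[str]] = {}
    for db_name, records in sources.items():
        for accession, sequence in records:
            key = sequence.strip().upper()  # normalise case and whitespace
            merged.setdefault(key, []).append(f"{db_name}:{accession}")
    return merged


# Hypothetical toy input: two sources sharing one identical sequence.
toy_sources = {
    "dbA": [("A0001", "MKTAYIAK"), ("A0002", "MSLLTEVE")],
    "dbB": [("B0001", "mktayiak")],
}
composite = merge_non_redundant(toy_sources)
for sequence, labels in composite.items():
    print(sequence, labels)
```

Keeping the source labels alongside each distinct sequence means a single query against the composite still reveals which primary databases held the entry.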
