DNA Sequence Analysis (DNA 序列分析) - PowerPoint PPT Presentation

hamish
Presentation Transcript

  1. DNA Sequence Analysis • David Shiuan • Department of Life Science, Institute of Biotechnology and Interdisciplinary Program of Bioinformatics, National Dong Hwa University

  2. DNA Sequence Analysis (I) • BLAST comparison • ORF (open reading frame) Finder • Promoter Search - Promoter Prediction (BCM) - EPD (Eukaryote Promoter Database) - NNPP prokaryote promoter prediction (BCM) - ProtScan (BIMAS)

  3. DNA Sequence Analysis (II) • Sequence Alignment (Clustal W) • Tree Analysis (MEGA, PAUP, UPGMA) • Motif Prediction • Restriction Analysis (TCGA) • RNAFOLD (GCG)

  4. Basic Local Alignment Search Tool • A sequence comparison algorithm, optimized for speed, that is used to search sequence databases for optimal local alignments to a query. • Algorithm: a fixed procedure embodied in a computer program.

  5. Basic Local Alignment Search Tool • The initial search is done for a word of length "W" that scores at least "T" when compared to the query using a substitution matrix. Word hits are then extended in either direction in an attempt to generate an alignment with a score exceeding the threshold of "S". The "T" parameter dictates the speed and sensitivity of the search.
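The seed-and-extend idea on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the NCBI implementation: it seeds only on exact word matches of length W (so the neighborhood threshold "T" is not modeled), extends without gaps, and stops extending once the running score falls more than a hypothetical `x_drop` below the best score seen. The function names and the match/mismatch scores are assumptions chosen for the example.

```python
def find_word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, w, match=1, mismatch=-3, x_drop=5):
    """Extend an exact word hit in both directions without gaps.

    Returns (score, query_start, query_end, subject_start); query_end is exclusive.
    """
    best = score = w * match
    # Extend to the right of the word.
    qi, sj, best_end = i + w, j + w, i + w
    while qi < len(query) and sj < len(subject) and best - score <= x_drop:
        score += match if query[qi] == subject[sj] else mismatch
        qi += 1
        sj += 1
        if score > best:
            best, best_end = score, qi
    # Extend to the left of the word, starting from the best score so far.
    score = best
    qi, sj, best_start = i - 1, j - 1, i
    while qi >= 0 and sj >= 0 and best - score <= x_drop:
        score += match if query[qi] == subject[sj] else mismatch
        if score > best:
            best, best_start = score, qi
        qi -= 1
        sj -= 1
    return best, best_start, best_end, j - (i - best_start)


def search(query, subject, w=5, s=5):
    """Keep only extended hits whose score reaches the threshold s."""
    results = set()
    for i, j in find_word_hits(query, subject, w):
        result = extend_hit(query, subject, i, j, w)
        if result[0] >= s:
            results.add(result)
    return sorted(results)


print(search("GGGACGTACCC", "TTTACGTATTT", w=5, s=5))
```

A smaller W (or, in the real algorithm, a lower T) produces more seeds, which makes the search more sensitive but slower; that is the trade-off the slide attributes to the "T" parameter.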

  6. Calculating alignment scores

  7. BLOSUM62 Substitution Scoring Matrix • The BLOSUM62 matrix shown here is a 20 x 20 matrix in which every possible identity and substitution is assigned a score based on the observed frequencies of such occurrences in alignments of related proteins. • Identities are assigned the most positive scores.
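As a rough illustration of how an ungapped alignment score is calculated from a substitution matrix, the sketch below sums the per-position scores of two aligned peptides. Only a small excerpt of BLOSUM62 is included (the handful of pairs the example needs); the names `BLOSUM62_EXCERPT`, `pair_score`, and `alignment_score` are hypothetical, and a real program would load the full 20 x 20 matrix.

```python
# Excerpt of BLOSUM62: only the residue pairs used in the example below.
# The matrix is symmetric, so each pair is stored once.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4,
    ("K", "K"): 5, ("R", "R"): 5, ("W", "W"): 11,
    ("I", "L"): 2, ("K", "R"): 2,
}


def pair_score(a, b, matrix=BLOSUM62_EXCERPT):
    """Look up the score for residues a and b in either order."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]  # raises KeyError for pairs missing from the excerpt


def alignment_score(seq1, seq2, matrix=BLOSUM62_EXCERPT):
    """Sum the substitution scores of an ungapped, equal-length alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(pair_score(a, b, matrix) for a, b in zip(seq1, seq2))


# A/A = 4, I/L = 2, K/R = 2, W/W = 11
print(alignment_score("AIKW", "ALRW"))  # 19
```

Note how the rare residue tryptophan (W) contributes far more to the total than the identity A/A, while the conservative substitutions I/L and K/R still score positively.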

  8. The NCBI BLAST family of programs • blastp compares an amino acid query sequence against a protein sequence database • blastn compares a nucleotide query sequence against a nucleotide sequence database • blastx compares a nucleotide query sequence translated in all reading frames against a protein sequence database • tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames • tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
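The "translated in all reading frames" step used by blastx, tblastn, and tblastx can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and an unambiguous DNA alphabet (A, C, G, T); the function names are hypothetical and this is not how the BLAST programs themselves are implemented.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

The same six translations are also the starting point for an ORF search: an open reading frame is a stretch within one frame that runs from a start codon to the next stop (`*`).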

  9. Peptide Sequence Databases for BLAST search • nr • All non-redundant GenBank CDS translations + PDB + SwissProt + PIR + PRF • month • All new or revised GenBank CDS translations + PDB + SwissProt + PIR + PRF released in the last 30 days. • swissprot • Last major release of the SWISS-PROT protein sequence database (no updates)

  10. Filtering of low-complexity segments

  11. E-value for the score S • The expected number of HSPs with score at least S is given by the formula $E = K m n e^{-\lambda S}$ • HSP: high-scoring segment pair • m and n: sequence lengths • K and $\lambda$: parameters
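The E-value formula is straightforward to evaluate directly. The sketch below is only an illustration: the function name is hypothetical, and the values of K and lambda in the usage lines are placeholders chosen for the example, not values given in the slides (in practice they depend on the scoring system).

```python
import math


def expected_hsps(score, m, n, k, lam):
    """Expected number of HSPs with score >= score: E = K * m * n * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)


# Placeholder parameters for illustration only.
K, LAMBDA = 0.134, 0.318
for s in (30, 40, 50):
    print(s, expected_hsps(s, m=250, n=1_000_000, k=K, lam=LAMBDA))
```

E grows linearly with the two sequence lengths and falls exponentially with the score, so searching a database twice as large doubles the E-value of the same alignment.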

  12. Promoter Search • ProtScan (at BIMAS) • EPD (Eukaryote Promoter Database) • Promoter Prediction (BCM) • NNPP (Prokaryote Promoter Prediction at BCM)