Genomics: READING genome sequences, ASSEMBLY of the sequence, ANNOTATION of the sequence

kellsie

Presentation Transcript


  1. For bioinformatics, start with genomics: READING genome sequences (carry out dideoxy sequencing); ASSEMBLY of the sequence (connect sequences to make whole chromosomes); ANNOTATION of the sequence (find the genes!).

  2. The Human Genome E. coli Genome

  3. Reading: shotgun DNA sequencing of the whole genome (WGS). [Figure: DNA target sample, SHEAR, LIGATE & CLONE into vector, SEQUENCE from primer, reads]
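
The shear-and-read step can be mimicked in a few lines. This is only a rough sketch: the function name, the 2000 bp fragment length, and the random seed are assumptions made for illustration, and the 700 bp read length is the maximum quoted on slide 24. Cloning and the sequencing chemistry are not modeled.

```python
import random

def shotgun_reads(genome, n_fragments, frag_len=2000, read_len=700, seed=0):
    """Shear `genome` at random positions and read one end of each fragment.

    frag_len and seed are hypothetical illustration values.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_fragments):
        # Pick a random break point; the fragment runs frag_len bases from there.
        start = rng.randrange(0, max(1, len(genome) - frag_len + 1))
        fragment = genome[start:start + frag_len]
        # A single sequencing run only reaches read_len bases into the insert.
        reads.append(fragment[:read_len])
    return reads

# Usage: a random 5 kb "genome" sheared into 10 fragments.
rng = random.Random(1)
toy_genome = "".join(rng.choice("ACGT") for _ in range(5000))
toy_reads = shotgun_reads(toy_genome, 10)
```

Each read is a short window from an unknown position in the target, which is why an assembly step is needed afterwards.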

  4. Reading to Assembly:

  5. Assembly: the challenge of eukaryotic genomes. E. coli genome: 4 million bp. Human genome: 3 billion bp. 50% of the genome is repeat sequences!
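
To make "connecting sequences" concrete, here is a minimal greedy overlap-merge sketch. It is a toy under stated assumptions, not the method used by any production assembler: the function names and the 3-base minimum overlap are hypothetical, reads are assumed error-free and all from one strand, and reads wholly contained in other reads are not handled.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])
```

When a repeat is longer than a read, several reads share the same overlap and the "best" merge is no longer unique, which is the difficulty this slide points at for repeat-rich genomes.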

  6. Assembly of the sequence of each chromosome from end to end. [Lecture marker: END, Jan 14; begin]

  7. Annotation: Genomics: READING genome sequences (robotically do dideoxy-dye data collection); ASSEMBLY of the sequence (whole genome shotgun OR ordered clones); ANNOTATION of the sequence (find the genes!).

  8. Annotation: 10/1/5. Genomics: READING genome sequences; ASSEMBLY of the sequence; ANNOTATION of the sequence. Find the genes, in two ways: ab initio, or by evidence.

  9. Annotation: For bacterial genomes, ab initio is adequate. Ab initio: “from the beginning”, יש מאין (“something from nothing”), from first principles. ORFs are MOST of the prokaryotic genome.

  10. Annotation: ab initio – finding ORFs
      • 85-88% of the nucleotides are associated with coding sequence in the bacterial genomes that have been completely sequenced.
      • Example: in Escherichia coli there are 4288 genes that have an average of 950 bp of coding sequence and are separated by an average of just 118 bp.
      • So first, to find genes in prokaryotic DNA, search for ORFs!
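
A minimal ORF search might look like the sketch below. It is an illustration rather than what any particular gene finder does: it assumes ATG as the only start codon and the three standard stop codons, and it scans all six reading frames. The function names are hypothetical, and the default of 60 codons mirrors the threshold quoted on slide 12.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=60):
    """Return (strand, start, end, orf) for each ATG...stop ORF in six frames.

    Coordinates are 0-based on the scanned strand, and `end` is exclusive.
    min_codons counts the start codon but not the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs

# Usage on a toy sequence, with the length cutoff lowered so the short ORF is kept.
toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
toy_orfs = find_orfs(toy, min_codons=2)
```

With the default cutoff the toy ORF is rejected as too short, which is the point of a length filter: short ORFs occur by chance in any sequence.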

  12. Annotation: ab initio – beyond ORFs
      • Prokaryotes have short, simple promoters that are easy to recognize.
      • Transcriptional terminators often consist of short inverted repeats followed by a run of Ts.
      • Therefore, programs that find prokaryotic genes search for:
          • ORFs 60 or more codons long, and codon usage
          • Promoters at the 5' end
          • Terminators at the 3' end
          • Homology to known genes from other prokaryotes
          • Shine-Dalgarno sequences
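
The terminator signal described here (a short inverted repeat followed by a run of Ts) can be searched for directly. The sketch below is a simplified illustration: the function name and the stem length (5), loop range (3 to 8), and T-run length (4) are assumed values chosen for the example, and a real terminator predictor would also score hairpin stability and tolerate mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_terminators(seq, stem=5, loop_range=(3, 8), min_ts=4):
    """Return (start, end) of each perfect inverted repeat followed by a T run.

    stem, loop_range and min_ts are hypothetical illustration values.
    """
    hits = []
    min_loop, max_loop = loop_range
    for i in range(len(seq) - 2 * stem - min_loop - min_ts + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop                      # start of the right arm
            right = seq[j:j + stem]
            tail = seq[j + stem:j + stem + min_ts]   # bases just after the hairpin
            if (len(right) == stem
                    and right == reverse_complement(left)
                    and tail == "T" * min_ts):
                hits.append((i, j + stem + min_ts))
    return hits

# Usage: a 5 bp stem (GCCGC / GCGGC) around a 4-base loop, then a run of Ts.
toy = "CCC" + "GCCGC" + "TTCG" + "GCGGC" + "TTTTT" + "A"
toy_hits = find_terminators(toy)
```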

  13. Annotation: ab initio – automated. Prokaryotic gene finder examples: Glimmer (interpolated Markov model method); GrailII (neural network method). (See BioInfo text, Fig 8.8)
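
To give a feel for the Markov-model idea, the sketch below trains a plain fixed-order Markov chain on example sequences and scores a candidate by its log-odds against a background model. This is deliberately not Glimmer's algorithm: Glimmer interpolates between several context lengths, whereas this toy uses one fixed order. The function names, the order of 2, and the pseudocount are all assumptions.

```python
import math
from collections import defaultdict

def train_markov(seqs, order=2, pseudocount=1.0):
    """Return a function log_prob(context, base) estimated from `seqs`."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1

    def log_prob(context, base):
        ctx = counts.get(context, {})
        total = sum(ctx.values()) + 4 * pseudocount  # 4 possible bases
        return math.log((ctx.get(base, 0.0) + pseudocount) / total)

    return log_prob

def log_odds(seq, coding_lp, background_lp, order=2):
    """Sum of per-base log-likelihood ratios; positive favors the coding model."""
    return sum(
        coding_lp(seq[i - order:i], seq[i]) - background_lp(seq[i - order:i], seq[i])
        for i in range(order, len(seq))
    )

# Usage with made-up training data (not real genes).
coding_model = train_markov(["ATGGCTGCTGCTGCT" * 3])
background_model = train_markov(["ATATATATATATAT" * 3])
score_coding_like = log_odds("GCTGCTGCT", coding_model, background_model)
score_background_like = log_odds("ATATATATA", coding_model, background_model)
```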

  14. Annotation: results

  15. Annotation: Multicellular eukaryotes Done too 10/1/5

  18. Annotation: 2 ways to annotate eukaryotic genomes:
      • ab initio gene finders work on basic biological principles: open reading frames, codon usage, consensus splice sites, Met start codons, …
      • Genes based on previous knowledge (EVIDENCE):
          • cDNA sequence of the gene’s message
          • cDNA of a closely related gene’s message sequence
          • Protein sequence of the known gene: the same gene’s, the same gene’s from another species, a related gene’s protein, …
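
As one example of an ab initio signal, the sketch below lists candidate introns using only the GT...AG rule (introns usually begin with GT and end with AG). It is a crude illustration: the function name and length limits are assumptions, and real gene finders score a longer consensus around each site rather than just the two dinucleotides.

```python
def candidate_introns(seq, min_len=20, max_len=10000):
    """Return (start, end) for every GT...AG span within the length limits.

    `start` is the index of the G in GT and `end` is exclusive, just past the AG.
    min_len and max_len are hypothetical illustration values.
    """
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [
        (d, a + 2)
        for d in donors
        for a in acceptors
        if min_len <= a + 2 - d <= max_len
    ]

# Usage: one 24-base GT...AG span between two short exon-like flanks.
toy = "ATGCC" + "GT" + "C" * 20 + "AG" + "CCTAA"
toy_introns = candidate_introns(toy)
```

On real sequence this produces far too many candidates, which is one reason evidence such as cDNA alignments is used alongside ab initio prediction.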

  19. [Figure labels: start and stop site predictions; unique identifiers; splice site predictions; homology-based exon predictions; computational exon predictions; tracking information; consensus gene structure (both strands)]

  20. Automatically generated annotation

  21. A zebrafish hit shows a gene model protein encoded by a 6 exon gene. This gene structure (intron/exon) is seen in other species, as is the protein size. The proteins, if corresponding to MSP in S. gal., must be heavily glycosylated (likely). At least some have a signal peptide.

  22. The zebrafish hit can be viewed at higher resolution, and…

  23. The zebrafish hit can be viewed down to nucleotide resolution

  24. Genomics: READING genome sequences (carry out dideoxy sequencing; 700 bp per read, MAX); ASSEMBLY of the sequence (connect sequences to make whole chromosomes); ANNOTATION of the sequence.
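
A quick back-of-the-envelope calculation shows what the 700 bp limit implies. It uses the rounded genome sizes from slide 5 and computes only the bare minimum number of reads needed to tile each genome once, end to end; the function name is hypothetical.

```python
def reads_for_one_pass(genome_bp, read_bp=700):
    """Minimum number of reads that could tile the genome once, end to end."""
    return -(-genome_bp // read_bp)  # ceiling division with integers

ecoli_reads = reads_for_one_pass(4_000_000)      # E. coli, 4 million bp
human_reads = reads_for_one_pass(3_000_000_000)  # human, 3 billion bp
print(ecoli_reads, human_reads)  # 5715 4285715
```

Shotgun reads start at random positions, so a real project needs several times this many reads to cover every base and still leave overlaps for assembly.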

  25. Genomics: READING genome sequences (carry out dideoxy sequencing); ASSEMBLY of the sequence (connect sequences to make whole chromosomes); ANNOTATION of the sequence (find the genes!).

  26. Annotation: cDNAs & ESTs (Expressed Sequence Tags). Each cDNA provides sequence from the two ends: two ESTs. [Figure: RNA target sample, cDNA library, SEQUENCE from primer, end reads (mates)]
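
The "two ESTs per cDNA" idea can be written down as a small sketch. The function names are hypothetical and the 700 bp read length is borrowed from slide 24. The 3' read is returned as a reverse complement on the assumption that it is sequenced from the opposite strand, reading inward from the far end of the clone.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def est_pair(cdna, read_len=700):
    """Return the (5' EST, 3' EST) mate pair read from the two ends of a cDNA."""
    five_prime = cdna[:read_len]
    # Assumption: the 3' read comes off the opposite strand, so it is
    # the reverse complement of the last read_len bases.
    three_prime = reverse_complement(cdna[-read_len:])
    return five_prime, three_prime

# Usage with a toy 15-base "cDNA" and 5-base reads.
toy_pair = est_pair("ATGGCCTTTAAACCC", read_len=5)
```

For a cDNA longer than two read lengths the mates do not overlap, so the middle of the message stays unsequenced while both ends are tagged.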

  27. Who Gets Sequenced? Models; pathogens; agriculturals.

  28. Array analysis: see animation from Griffiths

  29. Protein Structure Database See Swiss-pdb viewer
