
Low-Cost/High-Accuracy Microbial Genome Synthesis and Monitoring

Low-Cost/High-Accuracy Microbial Genome Synthesis and Monitoring. 1-Feb-2005 9:15-10 MITRE. Thanks to: DARPA & DOE-GtL Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, Xeotron/Invitrogen. For more info see: arep.med.harvard.edu.



Presentation Transcript


  1. Low-Cost/High-Accuracy Microbial Genome Synthesis and Monitoring. 1-Feb-2005 9:15-10 MITRE. Thanks to: DARPA & DOE-GtL Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, Xeotron/Invitrogen. For more info see: arep.med.harvard.edu

  2. Synthetic - homologous recombination testing of DNA motifs. [Figure: RNA ratio (motif to wild type) for each flanking gene: 1.3, 2.4 (1.3 in ΔargR), 1.1, 1.3, 0.7, 2.5, 0.2, 1.4, 1.4, 3.5.] Bulyk, McGuire, Masuda, Church. Genome Res. 14:201–208

  3. Synthetic Genomes & Proteomes. Why? • Test or engineer cis-DNA/RNA-elements • Access to any protein (complex), including post-transcriptional modifications • Affinity agents for the above • Protein design, vaccines, solubility screens • Utility of molecular biology DNA -- RNA -- Protein in vitro "kits" (e.g. PCR -- T7 -- Roche) • Toward these goals, design a chassis: 115 kbp genome; 150 genes; nearly all 3D structures known; comprehensive functional data.

  4. (PURE) translation utility. Removing tRNA-synthetases, translational release-factors, RNases & proteases. • Selection of scFvs [antibodies] specific for HBV DNA polymerase using ribosome display. Lee et al. 2004 J Immunol Methods 284:147 • Programming peptidomimetic syntheses by translating genetic codes designed de novo. Forster et al. 2003 PNAS 100:6353 • High level cell-free expression & specific labeling of integral membrane proteins. Klammt et al. 2004 Eur J Biochem 271:568 • Cell-free translation reconstituted with purified components. Shimizu et al. 2001 Nat Biotechnol 19:751-5

  5. In vitro genetic codes. [Figure: tRNAs charged with yU, mS and eU (anticodons UUG, UGG, CAG) aligned over the mRNA 5'-AUG AAC ACC GUU GAA-3', whose natural reading is fM N T V E; a codon-table panel is labeled "Second base".] 80% average yield per unnatural coupling. • eU = 2-amino-4-pentenoic acid • yU = 2-amino-4-pentynoic acid • mS = O-methylserine • gS = O-GlcNAc–serine • bK = biotinyl-lysine. Forster et al. (2003) PNAS 100:6353-7
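
The 80% average yield per unnatural coupling compounds with every additional unnatural residue. A minimal sketch of that arithmetic follows. It assumes each coupling succeeds independently at the same average yield, which the slide does not state, and the function name is hypothetical.

```python
def full_length_yield(per_coupling_yield: float, n_couplings: int) -> float:
    """Fraction of product carrying all n unnatural couplings.

    Assumption (not from the slide): couplings are independent and each
    succeeds with the same average yield.
    """
    return per_coupling_yield ** n_couplings


if __name__ == "__main__":
    # 0.80 is the average yield quoted on the slide.
    for n in (1, 2, 3, 5, 10):
        print(f"{n:2d} couplings -> {full_length_yield(0.80, n):.3f}")
```

Under this assumption, three consecutive unnatural couplings would leave roughly half of the product full length.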

  6. Oligos for 150 & 776 synthetic genes (for E. coli minigenome & M. mobile whole genome, respectively). Forster & Church

  7. Up to 760K Oligos/Chip. 18 Mbp for $700 raw (6-18K genes). • <1K: Oxamer, electrolytic acid/base • 8K: Atactic/Xeotron/Invitrogen, photo-generated acid; Sheng, Zhou, Gulari, Gao (U. Houston) • 24K: Agilent, ink-jet standard reagents • 48K: Febit • 100K: Metrigen • 380K: Nimblegen, photolabile 5' protection; Nuwaysir, Smith, Albert. Tian, Gong, Church

  8. Improve DNA Synthesis Cost. Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12) and bimolecular kinetics slow with the square of the concentration decrease! Solution: amplify the oligos, then release them. ss-70-mer (chip; 10 + 50 + 10) => ds-90-mer => ds-50-mer, using 20-mer PCR primers with restriction sites at the 50-mer junctions. Tian, Gong, Sheng, Zhou, Gulari, Gao, Church
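
To see why amplification is needed, the kinetic penalty can be worked through numerically. This is a rough sketch under two assumptions that are not on the slide: both oligo preparations sit in the same reaction volume (so molecule counts scale like concentrations), and PCR doubles the pool perfectly each cycle. The function names are hypothetical.

```python
import math


def relative_bimolecular_rate(n_chip: float, n_standard: float) -> float:
    """Rate of a second-order step (rate proportional to [A][B]) on
    chip-derived oligos, relative to standard amounts.

    Assumption: equal reaction volume, so counts stand in for concentrations.
    """
    conc_ratio = n_chip / n_standard
    return conc_ratio ** 2


def doublings_to_restore(n_chip: float, n_standard: float) -> int:
    """Ideal PCR doublings needed to bring n_chip up to n_standard.

    Assumption: perfect doubling every cycle.
    """
    return math.ceil(math.log2(n_standard / n_chip))


if __name__ == "__main__":
    # 1e6 and 1e12 molecules are the amounts quoted on the slide.
    print(f"relative rate: {relative_bimolecular_rate(1e6, 1e12):.0e}")
    print(f"ideal doublings: {doublings_to_restore(1e6, 1e12)}")
```

A million-fold drop in concentration costs about twelve orders of magnitude in bimolecular rate, while around twenty ideal doublings would make up the difference in amount.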

  9. Improve DNA Synthesis Accuracy via mismatch selection. Other mismatch methods: MutS (& H, L). Tian & Church

  10. Genome assembly. Sizes: 50, 75, 125, 225, 425, 825 … 100*2^(n-1). Moving forward: 1. Tandem, inverted and dispersed repeats (hierarchical assembly, size-selection and/or scaffolding) 2. Reduce mutations (goal <1e-6 errors) to reduce # of intermediates 3. 15 kb to 5 Mb by homologous recombination (Nick Reppas) 4. Phage integrase site-specific recombination, also for counters. Stemmer et al. 1995. Gene 164:49-53; Mullis 1986 CSHSQB.
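
The size series on this slide (50, 75, 125, 225, 425, 825) is reproduced by repeatedly joining two equal-length fragments that share a 25 bp overlap. The sketch below encodes that reading. The 25 bp overlap is inferred from the numbers rather than stated on the slide, and the helper names are hypothetical.

```python
from typing import List


def assembly_sizes(start: int = 50, overlap: int = 25, rounds: int = 6) -> List[int]:
    """Product length after each round of pairwise joining.

    Assumption (inferred, not stated on the slide): each round joins two
    equal-length fragments sharing `overlap` bp, so new = 2 * old - overlap.
    """
    sizes = [start]
    for _ in range(rounds - 1):
        sizes.append(2 * sizes[-1] - overlap)
    return sizes


def rounds_to_reach(target_bp: int, start: int = 50, overlap: int = 25) -> int:
    """Number of joining rounds until the product is at least target_bp long."""
    size, rounds = start, 0
    while size < target_bp:
        size = 2 * size - overlap
        rounds += 1
    return rounds


if __name__ == "__main__":
    print(assembly_sizes())
    print(rounds_to_reach(15_000))  # rounds needed to pass the 15 kb mark
```

Because the length roughly doubles per round, the number of intermediates grows only logarithmically with the target size, which is why a lower error rate (fewer rounds of repair and re-selection) matters so much.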

  11. All 30S-Ribosomal-protein DNAs (codon re-optimized). [Gel figure labels: 1.7 kb; 0.3 kb; Atactic <4K chip; s19 0.3 kb; Nimblegen 95K chip.] Tian, Gong, Sheng, Zhou, Gulari, Gao, Church

  12. Improving synthesis accuracy. Method: bp/error • Chip assembly only: 160 • Hybridization-selection: 1,400 • MutS-gel-shift: 10,000 • MutHLS cleavage: 30,000 (10X better than PCR). Tian & Church 2004; Carr & Jacobson 2004; Smith & Modrich 1997
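
The bp/error figures translate directly into the chance that a synthesized construct is entirely correct. The sketch below assumes errors are independent and uniformly distributed along the sequence, a simplification the slide does not make. The 1,000 bp construct length is a hypothetical example, not a number from the talk.

```python
# bp-per-error values as listed on the slide.
BP_PER_ERROR = {
    "Chip assembly only": 160,
    "Hybridization-selection": 1_400,
    "MutS-gel-shift": 10_000,
    "MutHLS cleavage": 30_000,
}


def fraction_error_free(length_bp: int, bp_per_error: float) -> float:
    """Probability that a construct of length_bp contains no errors.

    Assumption: each base is wrong independently with probability
    1 / bp_per_error.
    """
    return (1.0 - 1.0 / bp_per_error) ** length_bp


if __name__ == "__main__":
    length = 1_000  # hypothetical construct length in bp
    for method, bpe in BP_PER_ERROR.items():
        print(f"{method:25s} {fraction_error_free(length, bpe):.4f}")
```

Under these assumptions, almost no 1 kb constructs survive chip assembly alone without an error, while the large majority do after MutHLS cleavage, so far fewer clones need to be screened.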

  13. Extreme mRNA makeover for protein expression in vitro. RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially. RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable. Solution: iteratively resynthesize all mRNAs with less mRNA structure. Western blot based on His-tags. Tian & Church

  14. Safety Proposals Church, G.M. A synthetic biohazard non-proliferation proposal. http://arep.med.harvard.edu/SBP/Church_Biohazard04c.doc (2004) 1. Monitor oligo synthesis via expansion of Controlled substances, Select Agents, &/or Recombinant DNA 2. Computational tools for the above 3. System modeling checks for synthetic biology projects 4. Multi-auxotroph, novel genetic code for the host genome, prevents functional transfer of DNA to other cells.

  15. Why sequence? • Synthetic biology & laboratory selections • Pathogen "weather map", biowarfare sensors • Cancer: mutation sets for individual clones, loss-of-heterozygosity • RNA splicing & chromatin modification patterns • Antibodies or "aptamers" for any protein • B & T-cell receptor diversity: temporal profiling, clinical • Preventative medicine & genotype–phenotype associations • Cell-lineage during development • Phylogenetic footprinting, biodiversity. Shendure et al. 2004 Nature Rev Gen 5, 335.

  16. Personal genomics & cancer therapy. Mutations G719S, L858R, Del746ELREA in red. EGFR mutations in lung cancer: correlation with clinical response to gefitinib [Iressa] therapy. Paez, … Meyerson (Apr 2004) Science 304:1497. Lynch … Haber (Apr 2004) N Engl J Med 350:2129. Pao … Mardis, Wilson, Varmus H (Aug 2004) PNAS 101:13306-11. Dulbecco R. (1986) A turning point in cancer research: sequencing the human genome. Science 231:1055-6.

  17. Why 'single molecule' sequencing? (1) Single cells: preimplantation (PGD), uncultivatable (2) Co-occurrence on a molecule, complex, cell: RNA splice-forms & DNA haplotypes (3) Cost: $1K-100K "personal genomes" http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-003.html (4) Precision: counting 10^9 RNA tags (to reduce variance) (~5e5 RNAs per human cell). Fixed costs: EST 5e3; SAGE 5e4; MPSS 5e6; Polony-FISSeq (polymerase colony) 5e9 (goal)
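
The point about counting more tags to reduce variance can be made concrete with simple counting statistics. This sketch assumes tags are sampled at random so that counts are Poisson distributed, and it considers a hypothetical transcript present at one copy per cell among the ~5e5 RNAs per cell quoted on the slide. Neither the Poisson model nor the one-copy example comes from the talk.

```python
import math

RNAS_PER_CELL = 5e5  # approximate figure quoted on the slide


def expected_count(total_tags: float, copies_per_cell: float = 1.0) -> float:
    """Expected number of tags observed for one transcript species."""
    return total_tags * copies_per_cell / RNAS_PER_CELL


def poisson_cv(mean_count: float) -> float:
    """Coefficient of variation of a Poisson count: sd / mean = 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    # Tag depths listed on the slide for each method.
    depths = {"EST": 5e3, "SAGE": 5e4, "MPSS": 5e6, "Polony-FISSeq (goal)": 5e9}
    for method, tags in depths.items():
        mean = expected_count(tags)
        print(f"{method:22s} expected count {mean:10.2f}  CV {poisson_cv(mean):.3f}")
```

With these assumptions a single-copy transcript is usually missed entirely at EST or SAGE depth, while at the 5e9 goal it would be counted about ten thousand times with roughly 1% sampling noise.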

  18. CD44 Exon Combinatorics (Zhu & Shendure) • Alternatively Spliced Cell Adhesion Molecule • Specific variable exons are up-or-down-regulated in various cancers (>2000 papers) • v6 & v7 enable direct binding to chondroitin sulfate, heparin… Zhu,J, et al. Science. 301:836-8.

  19. CD44 RNA isoforms Eph4 = murine mammary epithelial cell line Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic) Zhu J, Shendure J, Mitra RD, Church GM. Science 301:836-8. Single molecule profiling of alternative pre-mRNA splicing.

  20. 60-Mb Chromosome-wide haplotyping. [Figure: Human Chr. 7 (150 Mb); loci IL6-3572 : A and CD36-4366 : A/T; haplotype counts A..A 7, 3, 1; A..T 3.]

  21. Convergence on non-electrophoretic tag-sequencing methods? • Tag length: EST >400; SAGE 14-26; MPSS 20; 454 (Ronaghi) 100; Polony-Seq 26 bp (2-ends) • Single-molecule vs. amplified single molecule • Array vs. bead packing vs. random • Rapid scans vs. long scans (chemically limited, 454) • Number of immobilized primers: 0: Chetverin'97 "Molecular Colonies"; 1: Mitra'99 > Agencourt "Bead Polonies"; 2: Kawashima'88, Adams'97 > Lynx/Solexa "Clusters". http://arep.med.harvard.edu/Polonator/Plone.htm

  22. Bead Polony Sequencing Pipeline • In vitro libraries via paired tag manipulation • Bead polonies via emulsion PCR [Dre03] • Enrichment of amplified beads • Monolayered immobilization in acrylamide • FISSEQ or “wobble” sequencing • Epifluorescence scope with integrated flow cell • SOFTWARE: Images → Tag Sequences; Tag Sequences → Genome

  23. Polony Fluorescent In Situ Sequencing Libraries. [Diagram labels: 1 to 100 kb genomic; 2x20 bp after MmeI (BceAI, AcuI); L, R, M, M; sequencing primers; PCR bead; selector bead; emulsion.] Greg Porreca, Abraham Rosenbaum. Dressman et al. PNAS 2003

  24. Cleavable dNTP-Fluorophore (& terminators). Reduce or photo-cleave. Mitra, RD, Shendure, J, Olejnik, J, Olejnik, EK, and Church, GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65

  25. Polony-FISSeq: up to 2 billion beads/slide. Cy5 primer (666 nm); Cy3 dNTP (570 nm). Self-organizing monolayer. Jay Shendure

  26. Polony FISSeq Stats • # of bases sequenced (total): 23,703,953 • # bases sequenced (unique): 73 • Avg fold coverage: 324,711X • Pixels used per bead (analysis): ~3.6 • Read length per primer: 14-15 bp • Insertions: 0.5% • Deletions: 0.7% • Substitutions (raw): 4e-5 • Throughput: 360,000 bp/min • Current capillary sequencing: 1400 bp/min (600X speed/cost ratio, ~$5K/1X) • (This may omit: PCR, homopolymer, context errors.) Shendure
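
The coverage and throughput figures on this slide can be cross-checked with two divisions. The snippet is only a consistency check on the quoted numbers. It does not attempt to reproduce the 600X speed/cost ratio, which also depends on cost inputs the slide does not give.

```python
# Values as quoted on the slide.
TOTAL_BASES = 23_703_953
UNIQUE_BASES = 73
POLONY_BP_PER_MIN = 360_000
CAPILLARY_BP_PER_MIN = 1_400


def fold_coverage(total_bases: int, unique_bases: int) -> float:
    """Average number of times each unique base was sequenced."""
    return total_bases / unique_bases


def raw_speedup(fast_bp_per_min: float, slow_bp_per_min: float) -> float:
    """Ratio of raw throughputs (speed only; cost is not included)."""
    return fast_bp_per_min / slow_bp_per_min


if __name__ == "__main__":
    print(f"fold coverage: {fold_coverage(TOTAL_BASES, UNIQUE_BASES):,.1f}")
    print(f"raw speed ratio: {raw_speedup(POLONY_BP_PER_MIN, CAPILLARY_BP_PER_MIN):.0f}X")
```

The computed coverage agrees with the 324,711X on the slide once the fractional part is dropped.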

  27. High accuracy special case: homopolymers (e.g. AAA, CC, etc.) • Use "compressed" tags, ACG = ACCG = ACCCG • Quantitate incorporation • Reversible terminators • FRET between adjacent 3' bases • Wobble sequencing. All five of these work. • Maintenance of amplification fidelity using linear amplification from initial genomic fragment
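
The "compressed" tag idea, where ACG, ACCG and ACCCG are treated as the same tag, amounts to collapsing each run of identical bases to a single base before comparing reads. A minimal sketch follows. The function name is hypothetical, and applying the same compression to a reference before matching is an assumed usage, not something the slide describes.

```python
from itertools import groupby


def compress_homopolymers(seq: str) -> str:
    """Collapse every run of identical bases to one base (ACCCG -> ACG)."""
    return "".join(base for base, _run in groupby(seq))


if __name__ == "__main__":
    for tag in ("ACG", "ACCG", "ACCCG", "AAACCGTT"):
        print(tag, "->", compress_homopolymers(tag))
```

The trade-off is that compressed tags carry less information per base, so some tags that were distinct before compression become indistinguishable.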

  28. Degenerate (aka “wobble”) sequencing • “single tipped” vs “double tipped” • length of anchoring sequence • natural vs. universal nucleotides (i.e. deoxyinosine) • single fluor vs. four-color fluor • mixtures of dNTPs for extensions • Sequenase vs Klenow vs BST • Exonuclease stripping vs heat stripping. Primers (anchor, degenerate positions, “tip”): CTAGCGAGCTAGNNNNNNNNA; CTAGCGAGCTAGNNNNNNNNG; CTAGCGAGCTAGNNNNNNNNC; CTAGCGAGCTAGNNNNNNNNT
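
The four primers on this slide share one layout: a fixed anchor, a stretch of degenerate N positions, and a single defined "tip" base. The sketch below builds that set for the single-tipped case and counts how many concrete sequences each degenerate primer stands for. The anchor and the eight N positions are taken from the slide. The function names and the idea of parameterizing the layout are illustrative assumptions.

```python
from typing import List

ANCHOR = "CTAGCGAGCTAG"  # anchoring sequence shown on the slide


def wobble_primers(anchor: str = ANCHOR, n_degenerate: int = 8,
                   tips: str = "AGCT") -> List[str]:
    """Single-tipped primers: anchor + N...N + one defined tip base each."""
    return [anchor + "N" * n_degenerate + tip for tip in tips]


def sequences_per_primer(n_degenerate: int = 8) -> int:
    """Number of concrete sequences one degenerate primer represents,
    assuming each N is an equal mix of the four natural bases."""
    return 4 ** n_degenerate


if __name__ == "__main__":
    for primer in wobble_primers():
        print(primer)
    print("sequences per primer:", sequences_per_primer())
```

Changing `n_degenerate` moves the interrogated position, which is one way to picture how a read is built up one tip position at a time.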

  29. Wobble vs Simple base-extension • 1/4 vs 2.5/4 base/cycle • >8 vs 14-200 base reads • 3e-3 vs 4e-5 non-homopolymer errors • 3e-3 vs 1e-1 homopolymer errors • 40' per cycle, 60 hr per 20 cycles

  30. Sequencing single molecules. Ecosystem studies need single-cell amplification because of multiple chromosomes (& RNAs) per cell. Many cells are hard to grow. Microbes exchange genome subsets. (Even 80% genome coverage is better than 100 kb BACs.) Many input molecules required to sequence one molecule vs. one molecule sufficient to sequence via many copies of it.

  31. Single cell sequencing. [Figure labels: no template control; φ29 real-time amplification; Affymetrix quantitation of independent amplifications.]
