
Genomics



Presentation Transcript


  1. Genomics 20

  2. Key Concepts • Once a genome has been completely sequenced, researchers use a variety of techniques to identify which sequences code for products and which act as regulatory sites. • Bacterial and archaeal genomes are relatively small. Among species, there is a positive correlation between total gene number and metabolic capabilities. Gene transfer between species is also common.

  3. Key Concepts • Eukaryotic genomes are large and complex. They include many sequences that have little to no effect on the fitness of the organism, and many transcribed sequences whose function is not known. • Data and techniques derived from genome sequencing projects are being used to analyze cancer cells.

  4. Introduction • The complete DNA sequence of an organism is its genome. The human genome sequence was published in February 2001 as part of the Human Genome Project. • Genomics is the scientific effort to sequence, interpret, and compare whole genomes. • Genomics provides a list of the genes present in an organism. Functional genomics looks at when those genes are expressed and how their products interact.

  5. Whole-Genome Sequencing • Improved automation has increased the speed and reduced the cost of DNA sequencing. • The primary international repositories for DNA sequence data now contain over 194 billion nucleotides. • With about 3 billion nucleotides, humans have the largest haploid genome sequenced to date. • The size of the database increases by about 30 percent every year.
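
As a rough illustration of what growth of about 30 percent per year implies, the sketch below projects the database size and estimates its doubling time. It assumes steady compound growth from the 194 billion figure quoted on the slide, which is a simplification, and the function names are illustrative only.

```python
import math

START_NUCLEOTIDES = 194e9  # figure quoted on the slide
ANNUAL_GROWTH = 0.30       # "about 30 percent every year"


def projected_size(years, start=START_NUCLEOTIDES, rate=ANNUAL_GROWTH):
    """Database size after `years` of compound growth (assumed constant rate)."""
    return start * (1 + rate) ** years


def doubling_time(rate=ANNUAL_GROWTH):
    """Years needed for the database to double at a constant growth rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time():.2f} years")
    print(f"Size after 5 years: {projected_size(5):.3e} nucleotides")
```

Under these assumptions the database would double roughly every 2.6 years.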

  6. How Are Complete Genomes Sequenced? • Most genome sequencing projects use a whole-genome shotgun sequencing approach. • In this process, the genome is broken up into a set of overlapping fragments that are sequenced, and these sequences are then put in order.

  7. The Shotgun Sequencing Process 1. Sonication (use of high-frequency sound waves) breaks a genome into pieces approximately 160 kilobases long. 2. Each piece is inserted into a plasmid called a bacterial artificial chromosome (BAC). A BAC library is created by inserting each BAC into a different Escherichia coli cell. Colonies of each cell are allowed to grow, creating multiple copies of each BAC. 3. Each 160-kb DNA segment is broken into 1-kb segments.

  8. The Shotgun Sequencing Process • Each 1-kb segment is cloned into a plasmid. These plasmids are then inserted into E. coli cells and replicated, producing shotgun clones. • The fragments from each clone are then sequenced and analyzed by computer programs. • The computer puts the sequences in order, thus reconstructing the BACs. • The ends of the reconstructed BACs are similarly analyzed. The goal is to arrange each 160-kb segment in its correct position along the chromosome, based on regions of overlap.

  9. The Shotgun Sequencing Process • In essence, the shotgun strategy consists of breaking a genome into tiny fragments, sequencing the fragments, and then putting the sequence data back into the correct order.
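
The "put the sequences in order" step can be sketched as a toy overlap assembler. This is a minimal illustration, not the software used by genome projects: it assumes error-free reads from one strand, no repeated regions, and a simple greedy merge. The names `fragment`, `overlap`, and `greedy_assemble` are hypothetical.

```python
def fragment(genome, read_len, step):
    """Cut a sequence into overlapping reads (a stand-in for random shearing)."""
    reads = [genome[i:i + read_len]
             for i in range(0, len(genome) - read_len + 1, step)]
    # Add a final read so the end of the sequence is always covered.
    if (len(genome) - read_len) % step != 0:
        reads.append(genome[-read_len:])
    return reads


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that is a prefix of `b`, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


if __name__ == "__main__":
    genome = "ATGCGTACCTGAAGTCCGATAGCTTGCA"  # made-up example sequence
    reads = fragment(genome, read_len=10, step=5)
    print(greedy_assemble(reads))
```

For this small example the reads reassemble into the original sequence. Real genomes contain repeats that defeat a purely greedy merge, which is one reason the hierarchical BAC step described above is useful.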

  10. The Role of Next-Generation Sequencing Strategies • Pyrosequencing is a cheaper and faster alternative to traditional sequencing. • It takes place on a single DNA fragment rather than multiple copies of the same fragment. • However, it only works with fragments that are too small to be pieced back together to reconstruct a complete genome accurately. • If the entire genome of the organism is known, pyrosequencing produces the sequence of an individual for comparison to the “master genome.”
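
Comparing an individual's short reads to a known "master genome" can be sketched as follows. This is a brute-force illustration under strong assumptions (reads from one strand, at most one mismatch per read, no insertions or deletions). Production read mappers use indexed data structures instead, and the function names here are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(reference, read, max_mismatches=1):
    """Return (position, mismatches) of the best placement, or None."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(reference[pos:pos + len(read)], read)
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (pos, d)
    return best


def call_differences(reference, reads, max_mismatches=1):
    """Map each read and report positions where it differs from the reference."""
    diffs = {}
    for read in reads:
        hit = map_read(reference, read, max_mismatches)
        if hit is None:
            continue  # read could not be placed on the reference
        pos, _ = hit
        for offset, base in enumerate(read):
            ref_base = reference[pos + offset]
            if ref_base != base:
                diffs[pos + offset] = (ref_base, base)
    return diffs


if __name__ == "__main__":
    reference = "ATGCGTACCTGAAGTCCGATAGCTTGCA"  # made-up reference sequence
    reads = ["CCTGTAGT", "GATAGCTT"]            # first read carries one change
    print(call_differences(reference, reads))
```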

  11. How Are Complete Genomes Sequenced? • Bioinformatics is the effort to manage, analyze, and interpret biological information, and is key to managing the vast quantity of data generated by genome sequencing.

  12. Which Genomes Are Being Sequenced, and Why? • The first genome of an organism to be sequenced was that of the bacterium Haemophilus influenzae in 1995; it consists of about 1.8 million base pairs. • The first eukaryotic genome to be sequenced was that of the yeast Saccharomyces cerevisiae in 1996. • To date, complete genomes have been sequenced from over 800 species. • Most of the organisms that have been sequenced cause disease or have other interesting biological properties.

  13. Which Sequences Are Genes? • The most basic task in annotating or interpreting a genome is to identify which bases constitute genes. • Identifying genes is relatively straightforward in bacteria and archaea but is much more difficult in eukaryotes, which have many noncoding sequences in their genomes.

  14. Identifying Genes in Bacterial and Archaeal Genomes • Computer programs are used to scan a genome sequence in both directions in order to identify open reading frames (ORFs). ORFs are possible genes: long stretches of sequence that begin with a start codon, contain no internal stop codon, and end with a stop codon. • The computer programs also look for sequences typical of promoters, operators, and other regulatory sites. • Researchers can confirm that an ORF is actually a gene by analyzing its product or by finding that it is homologous (similar due to common ancestry) to a known gene.
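
A minimal sketch of an ORF scan in both directions is shown below. It assumes the standard start codon ATG and the stop codons TAA, TAG, and TGA, checks all three reading frames on each strand, and uses an arbitrary minimum length. Real gene finders also weigh promoter signals and codon usage, which this example ignores. The function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Find ATG...stop stretches in the three reading frames of one strand."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop, including the start codon.
                if (i - start) // 3 >= min_codons:
                    found.append(seq[start:i + 3])
                start = None
    return found


def find_orfs(seq, min_codons=3):
    """Scan both strands; '+' is the given strand, '-' its reverse complement."""
    seq = seq.upper()
    return {
        "+": orfs_in_strand(seq, min_codons),
        "-": orfs_in_strand(reverse_complement(seq), min_codons),
    }


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC"))  # made-up example sequence
```

In practice `min_codons` would be set much higher (on the order of a hundred codons), because short ATG-to-stop stretches occur frequently by chance.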

  15. Identifying Genes in Eukaryotic Genomes • In eukaryotic organisms, genes contain introns, and most of the genome does not code for a product—thus, it is not possible to scan for ORFs. • The most effective strategy for identifying genes is to use reverse transcriptase to produce a cDNA version of each mRNA, and sequence a portion of the resulting molecule to produce an expressed sequence tag, or EST. ESTs represent protein-coding genes.

  16. Human Genome Sequencing Strategies Web Activity: Human Genome Sequencing Strategies

  17. Bacterial and Archaeal Genomes • By sequencing the genomes of various strains of the same prokaryotic species, researchers can now compare the genomes of closely related organisms that have different ways of life.

  18. The Natural History of Prokaryotic Genomes • In bacteria, there is a general correlation between the size of the genome and the metabolic capabilities of the organism. • The function of many bacterial genes is still unknown. • There is tremendous genetic diversity among bacteria and archaea. About 15 percent of the genes in a prokaryotic genome are unique to its own species. • Redundancy among genes is common. Some genes are found multiple times within a prokaryotic genome.

  19. The Natural History of Prokaryotic Genomes • Multiple chromosomes and plasmids are more common than expected. • In many bacterial and archaeal species, a significant portion of the genome appears to have been acquired from other, often distantly related, species.

  20. Lateral Gene Transfer • The movement of DNA from one species to another species is called lateral gene transfer. • Recent evidence suggests that over 50 percent of archaeal species and 30 to 50 percent of bacterial species have at least one gene acquired by lateral gene transfer.

  21. Evidence for Lateral Gene Transfer • Two general criteria support the hypothesis that sequences in bacterial or archaeal genomes originated in another species: • A gene is much more similar to genes in distantly related species than it is to those in closely related species. • The proportion of G-C base pairs to A-T base pairs in a particular gene or series of genes is markedly different from the base composition of the rest of the genome.
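
The base-composition criterion can be illustrated with a short sketch that flags genes whose G-C content departs from the genome-wide value. It uses the G-C fraction (G plus C over all bases) as a proxy for the G-C to A-T proportion, and the 0.10 cutoff is an arbitrary assumption rather than an established threshold. The function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes, genome_gc, threshold=0.10):
    """Return names of genes whose GC fraction differs from the genome's
    by more than `threshold` (an assumed, illustrative cutoff)."""
    return [name for name, seq in genes.items()
            if abs(gc_fraction(seq) - genome_gc) > threshold]


if __name__ == "__main__":
    genes = {  # made-up sequences for illustration
        "geneA": "ATGCATGCATGC",
        "geneB": "GGGCCCGCGGCA",
    }
    print(flag_atypical_genes(genes, genome_gc=0.50))
```

A flagged gene is only a candidate: unusual base composition is suggestive of foreign origin, and the sequence-similarity criterion above would normally be checked as well.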

  22. How Does Lateral Gene Transfer Occur? • Lateral gene transfer often results because genes are carried on plasmids. • Another way lateral gene transfer occurs is through transformation, taking up DNA fragments from the environment. • Thus, mutation and genetic recombination within species are not the only sources of genetic variation in bacteria and archaea.

  23. Environmental Sequencing • Environmental sequencing, or metagenomics, is the practice of cataloging all of the genes present in a community of bacteria and archaea. The subject of these studies is genes—not organisms. • This method resulted in the discovery of nearly 150 new species of bacteria, and over 1 million new alleles in the Sargasso Sea.

  24. Eukaryotic Genomes • Many eukaryotic genomes are dominated by repeated DNA sequences that occur between genes or inside introns and do not code for products used by the organism. • Sequencing eukaryotic genomes presents two particular challenges: • Eukaryotic genomes are much larger than the genomes of bacteria and archaea. • They contain many noncoding repetitive sequences.

  25. Parasitic and Repeated Sequences • Protein-coding sequences constitute a very small percentage of the human genome, and repetitive sequences make up more than 50 percent. In contrast, over 90 percent of the prokaryotic genome consists of genes. • Repeated sequences in the human genome are often the result of transposable elements—segments of DNA that can move from one location in a genome to another.

  26. Characteristics of Transposable Elements • Transposable elements are examples of selfish genes—parasitic DNA sequences that survive and reproduce but that do not increase the fitness of the host genome. • Transposable elements are classified as parasitic because they decrease their host’s fitness: • It takes time and resources to copy them along with the rest of the genome. • They can disrupt gene function when they insert in a new location.

  27. How Do Transposable Elements Work? • Long interspersed nuclear elements (LINEs) are one type of transposable element. • An active LINE contains all the sequences required to make copies of itself and insert them into a new location in the genome. • Analyses of the human genome have revealed that only a handful of LINEs appear to be complete and potentially active. • However, virtually every prokaryotic and eukaryotic genome examined to date contains at least some transposable elements.

  28. Repeated Sequences • Eukaryotic genomes have several thousand loci called short tandem repeats (STRs). These are small sequences repeated down the length of a chromosome. There are two types of STRs. • Microsatellites, or simple sequence repeats, are repeating units of 1 to 5 bases. • Minisatellites, or variable number tandem repeats (VNTRs), are repeating units of 6 to 500 bases. • Repeated sequences are hypervariable and vary among individuals much more than any other type of sequence.
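
A simple way to locate microsatellites in a sequence is a regular expression with a backreference. The sketch below looks for units of 1 to 5 bases, as defined on the slide, repeated at least three times in a row. The minimum copy number and the function name are assumptions made for illustration.

```python
import re


def find_tandem_repeats(seq, min_unit=1, max_unit=5, min_copies=3):
    """Return (start, unit, copies) for each run of a short tandem repeat.

    The lazy quantifier makes the regex prefer the shortest repeating unit,
    so "AAAAAA" is reported as six copies of "A" rather than three of "AA".
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    print(find_tandem_repeats("GGTCACACACATTG"))  # made-up example sequence
```

Raising `min_unit` to 6 and `max_unit` to 500 would search for minisatellite-sized units in the same way, though a regex scan becomes slow on long sequences.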

  29. Repeated Sequences • One hypothesis for why microsatellites and minisatellites have so many different alleles is that these highly repetitive stretches may misalign when chromosomes synapse during meiosis. • This misalignment then causes unequal crossover. • Chromosomes produced by unequal crossover contain different numbers of repeats.
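
The bookkeeping behind unequal crossover can be sketched in a few lines. The model is deliberately simplified: each chromosome is reduced to a count of repeat units, and a crossover joins the left part of one chromosome to the right part of the other. The function and parameter names are hypothetical.

```python
def unequal_crossover(repeats_a, repeats_b, break_a, break_b):
    """Return the repeat counts of the two recombinant chromosomes.

    break_a and break_b are the numbers of repeat units lying before the
    crossover point on chromosomes A and B. When the repeats are aligned
    correctly the two values are equal; misalignment makes them differ.
    """
    if not (0 <= break_a <= repeats_a and 0 <= break_b <= repeats_b):
        raise ValueError("break point must lie within the repeat array")
    recombinant_1 = break_a + (repeats_b - break_b)  # left of A + right of B
    recombinant_2 = break_b + (repeats_a - break_a)  # left of B + right of A
    return recombinant_1, recombinant_2


if __name__ == "__main__":
    print(unequal_crossover(10, 10, 5, 5))  # aligned: counts unchanged
    print(unequal_crossover(10, 10, 6, 4))  # misaligned by two units
```

With correct alignment both products keep 10 repeats. With a two-unit misalignment one product gains two repeats and the other loses two, which creates new alleles at that locus.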

  30. Repeated Sequences and DNA Fingerprinting • DNA fingerprinting refers to any technique for identifying individuals on the basis of unique features of their genomes. • Because microsatellite and minisatellite loci vary so much among individuals, they are now the markers of choice for DNA fingerprinting.

  31. DNA Fingerprinting Process • A sample of DNA is acquired from the individual. • PCR is performed using primers that flank a region containing an STR. • The region is cloned. • The region can be analyzed to determine the number of repeats present.
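
The final step, determining the number of repeats between the primer sites, can be sketched in silico. This is an illustration under simplifying assumptions: both flanking sequences are given on the same strand, they match exactly, and the region between them consists of uninterrupted copies of a known repeat unit. The sequences and names are made up.

```python
def count_str_repeats(dna, left_flank, right_flank, unit):
    """Count copies of `unit` between two flanking (primer-site) sequences.

    Returns None if either flank is not found.
    """
    start = dna.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = dna.find(right_flank, start)
    if end == -1:
        return None
    region = dna[start:end]
    copies = 0
    # Step through the region one unit at a time while the unit repeats.
    while region.startswith(unit, copies * len(unit)):
        copies += 1
    return copies


if __name__ == "__main__":
    left, right, unit = "TTGACC", "CCTGAA", "GATA"  # hypothetical locus
    person_1 = left + unit * 7 + right
    person_2 = left + unit * 11 + right
    print(count_str_repeats(person_1, left, right, unit))
    print(count_str_repeats(person_2, left, right, unit))
```

Repeating the count across several STR loci gives a profile of repeat numbers, and it is the combination of loci that makes a fingerprint distinctive.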

  32. DNA Fingerprinting BLAST Animation: DNA Fingerprinting

  33. Gene Families • In eukaryotes, the major source of new genes is duplication of existing genes. • Within a species, genes that are extremely similar to each other in structure and function are considered to be part of the same gene family. • Genes that make up gene families are hypothesized to have arisen from a common ancestral sequence through gene duplication.

  34. How Do Gene Families Arise? • When gene duplication occurs, an extra copy of a gene is added to the genome. • The most common type of gene duplication results from unequal crossing over during meiosis. • The redundancy of duplicated genes may allow one copy to mutate to create a new gene with different function or regulation, possibly leading to the evolution of novel traits.

  35. New Genes—New Functions? • Gene duplication is important because the original gene is still functional and produces a normal product. • The duplicated gene may: • Retain its original function and provide additional quantities of the same product. • Undergo mutation resulting in a beneficial altered protein, thus creating an important new gene. • Be a nonfunctional pseudogene, a remnant of a functional copy of the gene that does not produce a working product.

  36. Insights from the Human Genome Project • Scientists do not know the function of more than half of the genes found in the human genome. • Two recent discoveries are changing biologists’ thinking about the human genome: • Genes for miRNAs are much more common than previously thought. • A much larger proportion of the genome is transcribed than previously thought. Many of these sequences are referred to as Transcripts of Unknown Function (TUFs) because their role in the cell is unknown.
