Determining DNA Sequence • Two methods were originally invented around 1976, but only one is widely used: the chain-termination method invented by Fred Sanger. • It uses DNA polymerase to synthesize a second, labeled DNA strand. Recall that DNA polymerase always adds new bases to the 3' end of a primer. • It also uses chain-terminator nucleotides: dideoxynucleotides (ddNTPs), which lack the -OH group on the 3' carbon of the deoxyribose. When DNA polymerase inserts a ddNTP into the growing DNA chain, the chain terminates, because nothing can be added to its 3' end.
Sequencing Reaction • The template DNA is usually single-stranded DNA, which can be produced from plasmid cloning vectors that contain the origin of replication from a single-stranded bacteriophage such as M13 or fd. Infecting bacteria that contain this vector with a “helper phage” causes single-stranded phage to be produced. The phage DNA contains the cloned insert. • The primer is complementary to the region of the vector adjacent to the multiple cloning site. • Sequencing is done in 4 separate reactions, one for each DNA base. • All 4 reactions contain the 4 normal dNTPs, but each reaction also contains one of the ddNTPs. • In each reaction, DNA polymerase starts creating the second strand beginning at the primer. • When DNA polymerase reaches a base for which the ddNTP is present, the chain will either: • terminate if the ddNTP is added, or • continue if the corresponding dNTP is added. • Which one happens is random, based on the ratio of dNTP to ddNTP in the tube. • However, all the second strands in, say, the A tube will end at some A base: you get a collection of DNAs that end at each of the A's in the region being sequenced.
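The random choice between dNTP and ddNTP can be illustrated with a small simulation. This is only a sketch under stated assumptions: the template sequence, the 20% ddNTP fraction, and the function names `extend_once` and `run_tube` are all made up for illustration, and fragment lengths are counted from the first base added after the primer.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_once(template, dd_base, dd_fraction, rng):
    """Copy the template once; return the length at which a ddNTP ended the chain.

    `template` is the region downstream of the primer, written 3'->5', so the
    new strand grows 5'->3' as we walk left to right. Returns None if the
    polymerase reached the end without ever incorporating a ddNTP.
    """
    for length, template_base in enumerate(template, start=1):
        new_base = COMPLEMENT[template_base]
        # Only this tube's ddNTP can terminate, and only some of the time.
        if new_base == dd_base and rng.random() < dd_fraction:
            return length
    return None


def run_tube(template, dd_base, molecules=2000, dd_fraction=0.2, seed=0):
    """Return the sorted set of terminated fragment lengths seen in one tube."""
    rng = random.Random(seed)
    lengths = set()
    for _ in range(molecules):
        length = extend_once(template, dd_base, dd_fraction, rng)
        if length is not None:
            lengths.add(length)
    return sorted(lengths)


if __name__ == "__main__":
    template = "TACGTTAGCA"  # hypothetical template; the new strand is ATGCAATCGT
    for base in "ACGT":
        print(base, run_tube(template, base))
```

With enough molecules per tube, every position holding that tube's base shows up as a fragment length. This is why each tube yields a collection of DNAs ending at each occurrence of one base.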
Electrophoresis • The newly synthesized DNA from the 4 reactions is then run (in separate lanes) on an electrophoresis gel. • The DNA bands fall into a ladder-like sequence, spaced one base apart. Because the shortest fragments migrate farthest, the actual sequence can be read from the bottom of the gel up. • Automated sequencers use 4 different fluorescent dyes as tags and run all 4 reactions in the same lane of the gel. • Radioactive nucleotides (32P) are used for non-automated sequencing. • Sequencing reactions usually produce about 500 bp of good sequence.
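Reading the ladder from the bottom up amounts to sorting the bands by fragment length and noting which lane each band is in. A minimal sketch, assuming each lane is given as a list of fragment lengths (the `lanes` data and the function name `read_gel` are hypothetical):

```python
def read_gel(lanes):
    """Read a four-lane sequencing gel from the bottom (shortest fragment) up.

    `lanes` maps each base to the fragment lengths seen in that base's lane.
    Returns the sequence of the newly synthesized strand, 5'->3'.
    """
    bands = []
    for base, lengths in lanes.items():
        for length in lengths:
            bands.append((length, base))
    # Shortest fragments run farthest, so ascending length = bottom to top.
    bands.sort()
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    # Hypothetical band positions for a 10-base read.
    lanes = {"A": [1, 5, 6], "T": [2, 7, 10], "G": [3, 9], "C": [4, 8]}
    print(read_gel(lanes))
```

The sequence read this way is that of the synthesized strand, which is the complement of the template.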
Single Nucleotide Polymorphisms • Looking at many individuals, you can see that most bases in their DNA are the same in everyone. However, some bases are different in different individuals. These differences are single nucleotide polymorphisms (SNPs). • SNPs are found everywhere in the genome, and they are inherited in a regular Mendelian fashion. These characteristics make them good markers for finding disease genes and determining their inheritance. • There are many ways to detect SNPs, and many of them are easy to automate. • Primer extension: make a primer that ends 1 base short of the SNP site, and then extend the primer using DNA polymerase with nucleotides having different fluorescent tags.
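Both ideas can be sketched in a few lines: finding SNPs by comparing the same region across individuals, and calling an allele by primer extension. The sequences, the dye colors in `DYE`, and the function names below are illustrative assumptions, not a real assay. The primer is assumed to have the same sequence as the strand shown, so the single tagged base added is the base at the SNP site.

```python
# Hypothetical dye assignment; real kits define their own colors.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def find_snps(sequences):
    """Return {position: alleles} for every column where individuals differ.

    All sequences must cover the same region and have the same length;
    positions are 0-based.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must all have the same length")
    snps = {}
    for position, column in enumerate(zip(*sequences)):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            snps[position] = alleles
    return snps


def primer_extension_signal(sequence, snp_position):
    """Return the dye color seen after extending the primer by one base.

    The primer covers sequence[:snp_position], so it ends one base short
    of the SNP and the polymerase adds exactly the SNP base.
    """
    return DYE[sequence[snp_position]]


if __name__ == "__main__":
    individuals = ["ATGCCGTA", "ATGCTGTA", "ATGCCGTA"]  # made-up data
    print(find_snps(individuals))
    for person in individuals:
        print(primer_extension_signal(person, 4))
```

The sketch treats each individual as a single sequence. A real diploid individual can carry two alleles, in which case both colors would appear.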
Gene Detection • It is surprisingly hard to be sure that a given genomic sequence is a gene: that it is ever expressed as RNA. • Protein-coding regions are open reading frames (ORFs): they contain no in-frame stop codons. • But human genes often contain long introns and very short exons, and some parts of genes are introns in one cell type but exons in other cell types. So, finding all the pieces of a gene can be a challenge. • Three questions: • is a given DNA sequence ever expressed? • is the sequence expressed in a given cell type or set of conditions? • what is the intron/exon structure of the sequence?
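The "no stop codons" criterion is easy to check by computer. The sketch below scans the three forward reading frames for the longest run of codons without a stop. It is deliberately simplified: it ignores the reverse strand, does not require a start codon, and knows nothing about introns. The function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_stretch(dna):
    """Find the longest stop-free run of codons in the three forward frames.

    Returns (frame, start, end) so that dna[start:end] is the open stretch.
    Ties are resolved in favor of the first run found.
    """
    dna = dna.upper()
    best_codons, best_frame, best_start = 0, 0, 0
    for frame in range(3):
        run_start = frame
        run_codons = 0
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                # A stop codon closes the current run; start a new one after it.
                run_start = i + 3
                run_codons = 0
            else:
                run_codons += 1
                if run_codons > best_codons:
                    best_codons, best_frame, best_start = run_codons, frame, run_start
    return best_frame, best_start, best_start + 3 * best_codons


if __name__ == "__main__":
    dna = "ATGAATGACTGGTAA"  # made-up sequence
    frame, start, end = longest_open_stretch(dna)
    print(frame, dna[start:end])
```

In real genomic DNA a long open stretch is only a hint. Short exons can be missed entirely, which is why the expression-based methods are also needed.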
Evolutionarily Conserved Sequences • When looking across different species, most DNA sequences are not conserved. • However, the exons of genes are often highly conserved, because their function is necessary for life. • Zoo blot: a Southern blot containing genomic DNA from many species. Probe it with the sequence in question: exons will hybridize with other species’ DNA, while introns and non-gene DNA won’t. • Computer-based homology search: BLAST search. Do similar sequences appear in the nucleotide databases? Especially chimpanzee and mouse, which have complete genome sequences available.
Detecting Gene Expression • Northern blots: RNA extracted from various tissues or experimental conditions, run on an electrophoresis gel, then probed with a specific DNA sequence.
Detecting Gene Expression • Real-time PCR: • first convert all mRNA in a sample to cDNA using reverse transcriptase, • then amplify the region of interest using specific primers. • Use a fluorescent probe to detect and quantitate the specific product as it is being made by the PCR reaction. • The two components of the fluorescent tag interact so that fluorescence is quenched. When one part is removed by the Taq polymerase, the quenching stops and fluorescence can be detected.
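Why watching the product "as it is being made" gives a quantity can be seen from an idealized model: if the product roughly doubles every cycle, a sample that starts with more template becomes detectable at an earlier cycle. The sketch below assumes perfect doubling and an arbitrary detection threshold. Both numbers and the function name are illustrative, not taken from any instrument.

```python
def cycles_to_threshold(start_copies, threshold_copies, efficiency=1.0, max_cycles=40):
    """Return the first PCR cycle at which the product reaches the threshold.

    `efficiency` = 1.0 means perfect doubling each cycle (an idealization).
    Returns None if the threshold is never reached within `max_cycles`.
    """
    copies = float(start_copies)
    for cycle in range(1, max_cycles + 1):
        copies *= 1.0 + efficiency
        if copies >= threshold_copies:
            return cycle
    return None


if __name__ == "__main__":
    threshold = 1e9  # hypothetical number of copies needed for a detectable signal
    low = cycles_to_threshold(1000, threshold)
    high = cycles_to_threshold(8000, threshold)
    # An 8-fold difference in starting template shifts detection by 3 cycles (2**3 = 8).
    print(low, high, low - high)
```

Under this model, each cycle of difference between two samples corresponds to a 2-fold difference in starting cDNA.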
Expressed Sequence Tags • ESTs are cDNA clones that have had a single round of sequencing done from one end. • First extract mRNA from a given tissue. Then convert it to cDNA and clone it. • Sequence thousands of EST clones and save the results in a database. • A search can then show whether your sequence was expressed in that tissue. • Quantitation issues: some mRNAs are present in much higher concentration than others. Many EST libraries are “normalized” by removing duplicate sequences. • You can also get data on transcription start sites and exon/intron boundaries by comparing ESTs to genomic DNA, • but sometimes you need to obtain the clone and sequence the rest of it yourself.
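A toy version of an EST database search, as a sketch only: real searches use alignment tools that tolerate mismatches, while this one looks for exact matches on either strand. The tissue names, reads, and function names are invented for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def normalize(library):
    """Remove duplicate reads from a library, keeping first-seen order."""
    return list(dict.fromkeys(library))


def expressed_in(query, est_db):
    """Return the tissues whose ESTs contain the query on either strand."""
    rc = reverse_complement(query)
    hits = []
    for tissue, reads in est_db.items():
        if any(query in read or rc in read for read in reads):
            hits.append(tissue)
    return hits


if __name__ == "__main__":
    # Hypothetical EST libraries, one list of reads per tissue.
    est_db = {
        "liver": ["ATGGCCAAAGGG", "ATGGCCAAAGGG", "TTTTCCCC"],
        "brain": ["GGGGAAAA"],
    }
    print(normalize(est_db["liver"]))
    print(expressed_in("GCCAAA", est_db))
```

Note that once a library is normalized, the number of matching reads no longer reflects how abundant the mRNA was, so a hit only shows that the sequence was expressed.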
Etc. • New techniques in DNA/RNA technology are being developed constantly. The main goal is to increase reliability and decrease cost. Primarily the aim is to automate as much as possible. • Just a few techniques we are not going to discuss: RACE, SAGE, differential display, S1 nuclease protection
Protein Methods • It is important to be sure that the protein product of a gene is made, and to know where in the tissue or cell it is made, and how much is made. • Most protein detection is based on either making antibodies to the protein of interest or making a fusion protein: your protein fused to a fluorescent protein. • GFP: green fluorescent protein. Isolated from jellyfish. Several variants give different colors. It still works when it is fused to other proteins. • Often done in conjunction with confocal microscopy: examining the same image with visible light and fluorescence.
Antibodies • If you inject rabbits (usually) with your protein, the rabbit will develop an immune response against it. The antibodies can be isolated from the blood. • Antibodies bind very specifically to the antigen. The antibodies can be detected by a labeled second antibody that binds to the first antibody: fluorescein-labeled goat anti-rabbit antibody for instance.