Molecular Biology Techniques – A Primer

The methods depend upon, and were developed from, an understanding of the properties of biological macromolecules themselves.
• Hybridization: the base-pairing characteristics of DNA and RNA
• DNA cloning: DNA polymerase, restriction endonucleases and DNA ligase
• PCR: thermophilic DNA polymerase

Topic 1: Nucleic acids
• Electrophoresis
• Restriction
• Hybridization
• DNA cloning and gene expression
• PCR
• Genome sequence and analysis

Electrophoresis
1. Gel electrophoresis separates DNA and RNA molecules according to size, shape and topological properties.
The gel matrix is an inert, jello-like porous material that supports macromolecules and allows them to move through it.
Agarose and polyacrylamide are two different gel matrices.

Electrophoresis
• DNA and RNA molecules are negatively charged, so they move through the gel matrix toward the positive pole (+).
• Linear DNA molecules are separated according to size.
• The mobility of circular DNA molecules is affected by their topological structure. For DNA molecules of the same molecular weight but different shapes, mobility is: supercoiled > linear > nicked or relaxed.

[Figure: DNA separation by gel electrophoresis. Band labels: large, moderate, small.]

Electrophoresis
To separate DNA of different size ranges:
• Narrow size range of DNA: use polyacrylamide gel
• Wide size range of DNA: use agarose gel
• Very large DNA (>30-50 kb): use pulsed-field gel electrophoresis

Electrophoresis: pulsed-field gel electrophoresis
The electric field switches between two orientations: the larger the DNA molecule, the longer it takes to reorient.

Nucleic acid: Restriction digestion
Restriction endonucleases cleave DNA molecules at particular sites.
• Why use endonucleases? To break large DNA molecules into manageable fragments.

Restriction digestion
• Restriction endonucleases: nucleases that cleave DNA at particular sites by recognizing specific sequences.
• The target site recognized by a restriction endonuclease is usually palindromic, e.g. EcoRI:
5’…GAATTC…3’
3’…CTTAAG…5’
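
The palindrome property can be checked mechanically: a site is palindromic when it reads the same as its own reverse complement. The following is a minimal Python sketch of that check; the helper names `reverse_complement` and `is_palindromic_site` are illustrative and do not come from the slides.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of `seq`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(site: str) -> bool:
    """A site is palindromic if both strands read the same 5' to 3'."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    print(is_palindromic_site("GAATTC"))  # EcoRI site: True
    print(is_palindromic_site("GAATTA"))  # not its own reverse complement: False
```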

Restriction digestion
• To name a restriction endonuclease, e.g. EcoRI:
Eco: Escherichia coli (species)
R: strain (RY13)
I: the 1st such enzyme found in that strain

Restriction digestion
• Frequency of occurrence of a given hexameric sequence in random DNA: 1/4096 (4^-6)

Digestion of a DNA fragment with endonuclease EcoRI
• Consider a linear DNA molecule with 6 copies of GAATTC: it will be cut into 7 fragments, which can be separated by size using gel electrophoresis.
[Figure: gel with bands labeled from the largest fragment to the smallest fragment]
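
The 6-sites-give-7-fragments count can be reproduced with a short Python sketch of an in-silico digest. It assumes EcoRI cuts each strand between G and A (G^AATTC), encoded here as `cut_offset=1`; the function name `digest_linear` and the test sequence are made up for illustration.

```python
def digest_linear(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the distance from the start of the site to the cut
    on the top strand (1 for EcoRI, an assumption stated in the text).
    Only the top strand is modeled; overhangs are ignored.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical molecule: six GAATTC sites separated by runs of C.
    spacers = (5, 12, 3, 20, 8, 15)
    dna = "".join("C" * n + "GAATTC" for n in spacers) + "C" * 10
    fragments = digest_linear(dna)
    print(len(fragments))  # 7
    # Sizes ordered as on a gel, largest first.
    print(sorted((len(f) for f in fragments), reverse=True))
    # [26, 21, 18, 15, 14, 9, 6]
```

A circular molecule with the same 6 sites would give 6 fragments rather than 7, because there are no free ends to begin with.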

Restriction digestion
• Endonucleases are used to make restriction maps, e.g. with the combination EcoRI + HindIII.
• A restriction map allows different regions of one molecule to be isolated and a given molecule to be identified.
• A given molecule generates a characteristic series of fragment patterns when digested with a set of different enzymes.

Restriction digestion
Different enzymes recognize their specific target sites with different frequencies.
• EcoRI recognizes a hexameric sequence: frequency 4^-6
• Sau3AI recognizes a tetrameric sequence: frequency 4^-4
• Thus Sau3AI cuts the same DNA molecule more frequently.
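
The arithmetic behind these frequencies can be sketched in Python. The sketch assumes a random sequence in which all four bases are equally likely, which real genomes only approximate; the function names and the 50,000 bp molecule are hypothetical.

```python
def site_frequency(site_length: int) -> float:
    """Probability that a given position starts one specific n-base site,
    assuming each base occurs independently with probability 1/4."""
    return 0.25 ** site_length


def expected_sites(dna_length: int, site_length: int) -> float:
    """Expected number of sites in a random linear molecule."""
    positions = max(dna_length - site_length + 1, 0)
    return positions * site_frequency(site_length)


if __name__ == "__main__":
    print(site_frequency(6) == 1 / 4096)  # True
    print(site_frequency(4) == 1 / 256)   # True
    # Hypothetical 50,000 bp molecule.
    print(round(expected_sites(50_000, 6), 1))  # about 12 sites for a 6 bp site
    print(round(expected_sites(50_000, 4), 1))  # about 195 sites for a 4 bp site
```

Under these assumptions a 4-base site occurs 4^2 = 16 times as often as a 6-base site.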

[Figure: recognition sequences and cut sites of various endonucleases, producing blunt ends or sticky ends]

Restriction digestion
• The 5’ protruding ends are said to be “sticky” because they readily anneal through base-pairing to DNA molecules cut with the same enzyme.
• A sticky end can reanneal with its complementary strand or with other strands carrying the same cut.

Nucleic acid: DNA hybridization
DNA hybridization can be used to identify specific DNA molecules.
Hybridization: the process of base-pairing between complementary single-stranded DNA or RNA from two different sources.

Probe: a labeled, defined sequence used to search mixtures of nucleic acids for molecules containing a complementary sequence
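
As a rough computational analogy, searching with a probe amounts to looking for the probe's reverse complement in the target strand. The Python sketch below only models perfect, full-length base-pairing of DNA; real hybridization tolerates mismatches depending on temperature and salt, which is not modeled. The names `hybridization_sites` and `reverse_complement` and the example sequences are illustrative.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridization_sites(target: str, probe: str) -> list[int]:
    """0-based positions in single-stranded `target` (5' to 3') where
    `probe` can base-pair perfectly along its whole length."""
    footprint = reverse_complement(probe)
    target = target.upper()
    sites = []
    pos = target.find(footprint)
    while pos != -1:
        sites.append(pos)
        pos = target.find(footprint, pos + 1)
    return sites


if __name__ == "__main__":
    target = "ATGCCGTAAGCT"
    probe = "CTTACGG"  # complementary to CCGTAAG in the target
    print(hybridization_sites(target, probe))  # [3]
```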

Labeling of DNA or RNA probes
• Radioactive labeling: display and/or magnify the signals by radioactivity
• Non-radioactive labeling: display and/or magnify the signals by antigen labeling, antibody binding, enzyme binding, then substrate application (signal release)
• End labeling: put the labels at the ends
• Uniform labeling: put the labels internally

End labeling (single-stranded DNA/RNA)
• 5’-end labeling: polynucleotide kinase (PNK)
• 3’-end labeling: terminal transferase

Labeling at both ends by kinase, then removing one labeled end by restriction digestion
[Figure: an EcoRI-cut duplex; the cut leaves ends of the form …G / …CTTAAp5’ and 5’pAATTC… / G…]

J1 Characterization of clones
Uniform labeling of DNA/RNA
• Nick translation: DNase I introduces random nicks; DNA pol I then removes dNMPs ahead of each nick (5’→3’ exonuclease activity) and adds new dNMPs, including labeled nucleotides, at the 3’ end.
• Hexanucleotide-primed labeling: denature DNA → add random hexanucleotide primers and DNA pol → synthesis of a new strand incorporating labeled nucleotides.

Strand-specific DNA probes: e.g. with M13 DNA as template, the missing strand can be re-synthesized by incorporating radioactive nucleotides.
Strand-specific RNA probes: labeled by transcription.

J1-5 Southern and Northern blotting
Step | DNA on blot (Southern) | RNA on blot (Northern)
1 | Genomic DNA preparation | RNA preparation
2 | Restriction digestion | -
3 | Denature with alkali | -
4 | Agarose gel electrophoresis | same
5 | DNA blotting/transfer and fixation | RNA blotting/transfer and fixation
6 | Probe labeling | same
7 | Hybridization (temperature) | same
8 | Signal detection (X-ray film or antibody) | same

Southern analysis

Southern blot hybridization

[Figure: Northern analysis of COB RNAs in S. cerevisiae. Labels: bI1, bI2, bI3, bI4, bI5; pre-mRNAs; mRNA.]


Nucleic acid: Sequencing
Two ways of sequencing:
1. DNA molecules (radioactively labeled at their 5’ termini) are subjected to 4 chemical regimens that break them preferentially at Gs, Cs, Ts and As, separately.
2. Chain-termination method

chain-termination method • ddNTPs are chain-terminating nucleotides: the synthesis of a DNA strand stops when a ddNTP is added to the 3’ end

The absence of a 3’-hydroxyl prevents the nucleophilic attack on the next incoming substrate molecule, so the chain cannot be extended.

Tell from the gel the position of each G
DNA synthesis aborts at a frequency of about 1/100 each time the polymerase is due to add a G, that is, whenever it incorporates a ddGTP instead of a dGTP.
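
A small Python sketch shows why a low termination frequency still yields a band at every G. It assumes each G position terminates independently with the same probability p = 1/100 (a simplification of the real reaction); the function name `termination_probability` is illustrative.

```python
def termination_probability(k: int, p: float = 0.01) -> float:
    """Probability that a growing chain stops exactly at the k-th G:
    it must pass the first k-1 Gs (each with probability 1-p) and then
    incorporate a ddGTP at the k-th (probability p)."""
    if k < 1:
        raise ValueError("k counts G positions starting from 1")
    return (1 - p) ** (k - 1) * p


if __name__ == "__main__":
    for k in (1, 10, 50):
        print(k, termination_probability(k))
    # Fraction of chains that have terminated at one of the first 50 Gs.
    print(sum(termination_probability(k) for k in range(1, 51)))
```

Because p is small, the probability falls only slowly with k, so chains ending at distant Gs remain abundant enough to be seen on the gel.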

Fluorescence automated sequencing system: slab gel electrophoresis

Fluorescence automated sequencing system: capillary gel electrophoresis

Computerized visualization from a single lane of an automated sequencer. Method uses non-radioactive fluorescent labelling.

DNA sequencing gel
• 4 separate reactions: dNTPs + ddGTP, dNTPs + ddATP, dNTPs + ddCTP, dNTPs + ddTTP
• “Read” the sequencing gel to get the sequence of the DNA
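
Reading the gel can be sketched in Python as an idealized model: each of the four reactions yields fragments ending at every position of its own base, and reading the bands from shortest to longest across the four lanes recovers the sequence of the newly synthesized strand. Fragment lengths here are counted from the first added nucleotide, ignoring the primer; the function names are illustrative.

```python
def chain_termination_lanes(new_strand: str) -> dict[str, list[int]]:
    """For each ddNTP reaction, list the fragment lengths it produces.

    Idealized: every position of a base gives one band in that base's
    lane. Expects only the characters G, A, T and C.
    """
    lanes = {base: [] for base in "GATC"}
    for length, base in enumerate(new_strand.upper(), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read bands from the shortest fragment to the longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = chain_termination_lanes("GATTACA")
    print(lanes["A"])       # [2, 5, 7]
    print(read_gel(lanes))  # GATTACA
```

The sequence read this way is that of the synthesized strand, which is complementary to the template.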

The shotgun strategy permits a partial assembly of large genome sequences
• What can we do if we want to sequence a much larger and more complex eukaryotic genome using the shotgun strategy?
• First, libraries at different levels should be constructed.

The DNA fragments can be easily extracted and sequenced automatically.
• Sophisticated computer programs have been developed to assemble the random DNA fragments into contigs.
• A single contig is about 50,000 to 200,000 bp. This is useful for analyzing the fruit fly genome, which contains an average of one gene every 10 kb.
• To analyze the human genome, contigs should be assembled into scaffolds.
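
To give a feel for what assembly into contigs means, here is a toy Python sketch that repeatedly merges the two reads with the longest suffix-prefix overlap. This greedy approach is an illustration only, not the method used by real assembly programs: it ignores sequencing errors, reverse-complement reads and repeats. The function names and the example reads are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`
    (0 if shorter than `min_len`)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads pairwise by best overlap until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]


if __name__ == "__main__":
    reads = ["ATGGCGTA", "GCGTACGTT", "CGTTAGC"]
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

If the reads do not overlap, they simply remain as separate contigs.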

1-16 The paired-end strategy permits the assembly of large genome sequences
• The main limitation to producing large contigs is the occurrence of repetitive sequences. (Why?)
• To solve this problem, paired-end sequencing was developed.
• The same genomic DNA is also used to produce recombinant libraries composed of large fragments of 3 to 100 kb.

The ends of each clone can be sequenced easily, and these larger clones can be assembled together first.

If a larger scaffold is needed, use a cloning vector that can carry large DNA fragments (at least 100 kb). A BAC is a good choice.

1-17 Genome-wide analysis
• The purpose of this analysis is to predict the coding sequences and other functional sequences in the genome.
• For the genomes of bacteria and simple eukaryotes, finding ORFs is simple and effective.
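
A minimal Python sketch of ORF finding is given below. It scans the three reading frames of one strand for an ATG followed in frame by a stop codon (TAA, TAG or TGA, the standard genetic code). It is a simplification: a real search would also scan the reverse-complement strand and apply a much larger minimum length. The name `find_orfs` and the `min_codons` parameter are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) pairs of ORFs on the given strand.

    `start` is the 0-based index of the A in ATG; `end` is one past the
    stop codon, so seq[start:end] is the whole ORF. `min_codons` counts
    the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])  # 2 17 ATGAAATTTGGGTAA
```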

For animal genomes, a variety of bioinformatics tools are required to identify genes and other functional fragments, but their accuracy is low.

The most important method for validating protein-coding regions, and for identifying those missed by current gene-finder programs, is the use of cDNA sequence data.
• The mRNAs are first reverse transcribed into cDNAs, and these cDNAs, both full-length and partial, are sequenced using the shotgun method. These sequences are used to generate an EST (expressed sequence tag) database, and the ESTs are aligned onto genomic scaffolds to help identify genes.

Part II proteins

2-1 Specific proteins can be purified from cell extracts
• The purification of individual proteins is critical to understanding their function. (Why?)
• Although there are thousands of proteins in a single cell, each protein has unique properties that make its purification somewhat different from that of others.

The purification of a protein is designed to exploit its unique characteristics, such as size, charge, shape and, in many instances, function.
