Human Genetic Variation
Weibin Shi
Genetic variations underlie phenotypic differences
• Wilt Chamberlain, a famous NBA basketball player (7 feet, 1 inch; 275 pounds)
• Willie Shoemaker, a famous horse racing jockey (4 feet, 11 inches; barely 100 pounds)
- Environment
- Genes
Genetic variations cause inherited diseases
• Environmental diseases: influenza, hepatitis, measles
• Complex diseases: Alzheimer disease, cardiovascular disease, diabetes (type 2), Parkinson disease
• Genetic diseases: cystic fibrosis, Down syndrome, sickle cell disease, Turner syndrome
Basic terminology
• Locus – location of a gene/marker on the chromosome.
• Allele – one variant form of a gene/marker at a particular locus.
Example: Locus1, possible alleles: A1, A2. Locus2, possible alleles: B1, B2, B3.
A little more basic terminology
Polymorphism:
• Variations in DNA sequence (substitutions, deletions, insertions, etc.) that are present at a frequency greater than 1% in a population.
• Typically have a WEAK EFFECT or NO EFFECT at all.
• Ancient and COMMON.
Mutation:
• Variations in DNA sequence (substitutions, deletions, insertions, etc.) that are present at a frequency lower than 1% in a population.
• Can produce a gain of function or a loss of function.
• Recent and RARE.
Some facts
• In human beings, 99.9% of bases are the same from person to person
• The remaining 0.1% makes a person unique
  • Different attributes / characteristics / traits
  • How a person looks
  • Diseases he or she develops
• These variations can be:
  • Harmless (change in phenotype)
  • Harmful (diabetes, cancer, heart disease, Huntington's disease, and hemophilia)
  • Latent (variations in coding and regulatory regions that are not harmful on their own; the effect of the change in each gene only becomes apparent under certain conditions, e.g. susceptibility to heart attack)
Forms of genetic variations
• Single nucleotide substitution: replacement of one nucleotide with another
• Microsatellites or minisatellites: these tandem repeats often present high levels of inter- and intra-specific polymorphism
• Deletions or insertions: loss or addition of one or more nucleotides
• Changes in chromosome number, segmental rearrangements and deletions
How many variations are present in the average human genome?
• SNPs appear at least once every 0.3-1 kb on average. Given the size of the entire human genome (3.2 × 10^9 bp), the total number of SNPs is around 5-10 million.
• More than 100,000 potentially polymorphic microsatellites are spread across the human genome.
• Insertions/deletions are very difficult to quantify; their number is likely to fall between those of SNPs and microsatellites.
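The SNP estimate is simple arithmetic: genome size divided by the average spacing between SNPs. A minimal sketch in Python (the function and variable names are hypothetical, chosen only for illustration) suggests that a spacing of 0.3-1 kb corresponds to roughly 3 to 11 million SNPs, a range that brackets the 5-10 million figure quoted on the slide.

```python
# Genome size and SNP spacing come from the slide; all names here are illustrative.
GENOME_SIZE_BP = 3.2e9


def expected_snp_count(genome_size_bp, spacing_bp):
    """Expected number of SNPs if one SNP occurs every `spacing_bp` bases on average."""
    if spacing_bp <= 0:
        raise ValueError("spacing_bp must be positive")
    return genome_size_bp / spacing_bp


low_estimate = expected_snp_count(GENOME_SIZE_BP, 1000)   # one SNP per 1 kb
high_estimate = expected_snp_count(GENOME_SIZE_BP, 300)   # one SNP per 0.3 kb

print(f"{low_estimate / 1e6:.1f} to {high_estimate / 1e6:.1f} million SNPs")
# prints: 3.2 to 10.7 million SNPs
```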
How do we find sequence variations?
• Look at multiple sequences from the same genome region
• Use base quality values to decide if mismatches are true polymorphisms or sequencing errors
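The idea of using base quality values can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say which quality scale is used, so the sketch assumes Phred-style scores (Q = -10 log10 of the error probability) and an arbitrary cutoff of Q20. The function names and the cutoff are hypothetical.

```python
def phred_to_error_probability(quality):
    """Convert a Phred-style quality score to the probability that the base call is wrong."""
    return 10 ** (-quality / 10)


def candidate_polymorphisms(read_a, qual_a, read_b, qual_b, min_quality=20):
    """Return 0-based positions where two aligned reads differ and both
    base calls meet the quality cutoff (assumed cutoff: Q20)."""
    if not (len(read_a) == len(qual_a) == len(read_b) == len(qual_b)):
        raise ValueError("reads and quality lists must all have the same length")
    return [
        i
        for i, (base_a, base_b) in enumerate(zip(read_a, read_b))
        if base_a != base_b
        and qual_a[i] >= min_quality
        and qual_b[i] >= min_quality
    ]


# Two made-up aligned reads: they differ at positions 2 and 4,
# but the call at position 4 in the first read has low quality.
positions = candidate_polymorphisms(
    "ACGTA", [30, 30, 30, 30, 10],
    "ACCTT", [30, 30, 35, 30, 40],
)
print(positions)  # prints: [2]
```

Low-quality mismatches are treated as likely sequencing errors and dropped; real variant callers combine evidence from many reads rather than a single pair.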
Vcam1: coding, non-synonymous SNP between B6 and C3H
B6   AGGAAAAGAACATAACAAGAACTATTTTTCGCCCGAACTC
C3H  AGGAAAAGAACATAACAAGGACTATTTTTCGCCCGAACTC
                        ^
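Comparing the two strain sequences position by position locates the variant. The sketch below assumes the sequences are already aligned and of equal length; the function name is hypothetical.

```python
def find_substitutions(reference, other):
    """Return (1-based position, reference base, other base) for every mismatch
    between two aligned, equal-length sequences."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, other_base)
        for position, (ref_base, other_base) in enumerate(zip(reference, other), start=1)
        if ref_base != other_base
    ]


b6 = "AGGAAAAGAACATAACAAGAACTATTTTTCGCCCGAACTC"
c3h = "AGGAAAAGAACATAACAAGGACTATTTTTCGCCCGAACTC"

print(find_substitutions(b6, c3h))  # prints: [(20, 'A', 'G')]
```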
Human Genetic Variation
Most abundant type: SNPs (Single Nucleotide Polymorphisms)
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
          ^
SNPs make up about 90% of all human genetic variations
What is the difference between a SNP and a mutation?
For a variation to be considered a SNP, it must occur in at least 1% of the population.
Life cycle of a SNP (a long way from mutation to SNP)
• Appearance of a new variant by mutation
• Survival of the rare allele
• Increase in allele frequency as the population expands
• New allele becomes established in the population as a novel polymorphism
Basic facts about SNPs
• SNPs occur every 300-1000 bases in the human genome
• Two of every three SNPs involve the replacement of cytosine (C) with thymine (T)
• SNPs can occur in both coding (gene) and noncoding regions of the genome
• Many SNPs have no effect on cell function, but others could predispose people to disease or influence their response to a drug
Single base changes
• Transitions
  • Purine to purine or pyrimidine to pyrimidine
  • A to G or G to A; T to C or C to T
• Transversions
  • Purine to pyrimidine or pyrimidine to purine
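The transition/transversion rule is easy to express in code. A minimal sketch (the function name is hypothetical):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref_base, alt_base):
    """Classify a single base change as a 'transition' or a 'transversion'."""
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    valid_bases = PURINES | PYRIMIDINES
    if ref_base not in valid_bases or alt_base not in valid_bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref_base == alt_base:
        raise ValueError("the two bases are identical, so there is no substitution")
    # Same chemical class on both sides means transition; otherwise transversion.
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # prints: transition
print(classify_substitution("C", "A"))  # prints: transversion
```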
SNP Databases
• NCBI dbSNP: http://www.ncbi.nlm.nih.gov/SNP/index.html
• Human Genome Variation Database (HGVbase): http://hgvbase.org/
• International HapMap Project: http://snp.cshl.org/
Classification of SNPs
1. Coding SNPs
  • Synonymous: the single base substitution does not change the resulting amino acid
  • Non-synonymous: the single base substitution changes the resulting amino acid
2. Non-coding SNPs that influence gene expression
3. Non-coding silent SNPs
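For coding SNPs, the synonymous versus non-synonymous distinction follows from translating the codon before and after the change. The sketch below assumes the standard genetic code and takes two codons that differ at one position; the function name is hypothetical. The example uses GAG to GTG (glutamate to valine), the substitution underlying sickle cell disease.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Return 'synonymous' or 'non-synonymous' for two codons differing at one base."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases (A, C, G, T)")
    differences = sum(1 for a, b in zip(ref_codon, alt_codon) if a != b)
    if differences != 1:
        raise ValueError("codons must differ at exactly one position")
    same_residue = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_residue else "non-synonymous"


print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu, prints: synonymous
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val, prints: non-synonymous
```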
SNPs as gene mapping markers
• SNPs are used as genetic markers to identify genes responsible for disease susceptibility or a particular trait.
Point mutations
• Not all single base pair differences are SNPs
• A difference is considered a mutation if the least abundant allele has a frequency < 1% in the population
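The 1% convention can be applied directly to allele counts from a population sample. A minimal sketch, assuming counts are supplied per allele; the function names and the example counts are hypothetical:

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the least abundant allele, given a mapping of allele -> observed count."""
    if len(allele_counts) < 2:
        raise ValueError("need counts for at least two alleles")
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("total allele count must be positive")
    return min(allele_counts.values()) / total


def classify_by_frequency(allele_counts, threshold=0.01):
    """Label a variant 'polymorphism' if the least abundant allele reaches the
    1% threshold, otherwise 'mutation'."""
    frequency = minor_allele_frequency(allele_counts)
    return "polymorphism" if frequency >= threshold else "mutation"


# Made-up counts from 1000 and 2000 sampled chromosomes.
print(classify_by_frequency({"A1": 950, "A2": 50}))   # 5%, prints: polymorphism
print(classify_by_frequency({"A1": 1998, "A2": 2}))   # 0.1%, prints: mutation
```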
Consequences of mutations
• Most mutations are neutral
  • 97% of DNA neither codes for protein or RNA nor indirectly affects gene function
  • A new variant in the 1.5% of DNA that is coding may not result in a change in amino acid
  • Variants that change an amino acid may not affect function
• Certain mutations have a functional effect and even cause disease
  • Gain-of-function mutations often produce dominant disorders
  • Loss-of-function mutations often result in recessive disease
Consequences of mutations
• Missense mutations differ in severity
  • A conservative amino acid substitution replaces an amino acid with a chemically similar one and is less likely to alter function
  • A nonconservative amino acid substitution replaces an amino acid with a chemically different one and is more likely to alter function
  • Consequences for function are often context-specific
• A nonsense mutation results in premature termination of translation
  • Truncated polypeptides are often nonfunctional
• A point mutation in a non-coding region may affect transcription, RNA splicing, and protein assembly
Microsatellites: di-, tri-, and tetra-nucleotide repeats
• The second most abundant genetic variation in the human genome
• Usually have no functional effect, but some do
Dinucleotide repeat (CA):
TGCCACACACACACACACAGC
TGCCACACACACA------GC
Trinucleotide repeat (TCA):
TGCTCATCATCATCAGC
TGCTCATCA------GC
Tetranucleotide repeat (TCAG):
TGCTCAGTCAGTCAGTCAGGC
TGCTCAGTCAG--------GC
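Microsatellite alleles differ in how many times the repeat unit occurs in tandem. The sketch below counts the longest uninterrupted run of a given unit, ignoring alignment gap characters, and is applied to the example alleles from the slide. The function name is hypothetical.

```python
import re


def longest_tandem_repeat(sequence, unit):
    """Number of copies of `unit` in its longest uninterrupted tandem run.
    Alignment gaps ('-') are removed before counting."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    ungapped = sequence.replace("-", "").upper()
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    run_lengths = [len(match.group(0)) for match in pattern.finditer(ungapped)]
    return max(run_lengths) // len(unit) if run_lengths else 0


print(longest_tandem_repeat("TGCCACACACACACACACAGC", "CA"))    # prints: 8
print(longest_tandem_repeat("TGCCACACACACA------GC", "CA"))    # prints: 5
print(longest_tandem_repeat("TGCTCATCATCATCAGC", "TCA"))       # prints: 4
print(longest_tandem_repeat("TGCTCAGTCAG--------GC", "TCAG"))  # prints: 2
```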
Trinucleotide repeat-associated diseases
• Characterized by expansion of three-base-pair repeats
  • From a few repeats to hundreds of repeats
  • Expansion results in an abnormal protein and disease
  • The number of repeats may expand in subsequent generations
Triplet repeat expansion
Disorder                  Repeat        Normal repeats  Disease repeats  Gene
Huntington disease        CAG           9-35            37-100           Huntingtin
Kennedy disease           CAG           17-24           40-55            androgen receptor
Spino-cerebellar ataxia   CAG           19-36           43-81            Ataxin 1
Machado-Joseph disease    CAG           12-36           67-75            SCA
Myotonic dystrophy        CTG           5-35            50-400           DM
Fragile X                 CGG CCG GCC   6-50            200-1000         FMR1
• Many result in neurodegeneration
• Severity of many diseases increases with the number of repeats
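The normal and disease ranges listed for each disorder can be used as a simple lookup. The sketch below copies three rows from the slide purely for illustration (the dictionary and function names are hypothetical); repeat counts that fall between or outside the listed ranges are reported as such rather than guessed.

```python
# Repeat-count ranges copied from the slide: (lowest, highest), inclusive.
REPEAT_RANGES = {
    "Huntington disease": {"normal": (9, 35), "disease": (37, 100)},
    "Myotonic dystrophy": {"normal": (5, 35), "disease": (50, 400)},
    "Fragile X": {"normal": (6, 50), "disease": (200, 1000)},
}


def classify_repeat_count(disorder, repeat_count):
    """Return 'normal', 'disease', or 'outside listed ranges' for a repeat count."""
    ranges = REPEAT_RANGES[disorder]  # raises KeyError for a disorder not in the table
    for label, (lowest, highest) in ranges.items():
        if lowest <= repeat_count <= highest:
            return label
    return "outside listed ranges"


print(classify_repeat_count("Huntington disease", 20))  # prints: normal
print(classify_repeat_count("Huntington disease", 45))  # prints: disease
print(classify_repeat_count("Huntington disease", 36))  # prints: outside listed ranges
```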
Minisatellite
• 6-64 bp repeating pattern
  1 tgattggtct ctctgccacc gggagatttc cttatttgga ggtgatggag gatttcagga
 61 attttttagg aattttttta atggattacg ggattttagg gttctaggat tttaggatta
121 tggtatttta ggatttactt gattttggga ttttaggatt gagggatttt agggtttcag
181 gatttcggga tttcaggatt ttaagttttc ttgattttat gattttaaga ttttaggatt
241 tacttgattt tgggatttta ggattacggg attttagggt ttcaggattt cgggatttca
301 ggattttaag ttttcttgat tttatgattt taagatttta ggatttactt gattttggga
361 ttttaggatt acgggatttt agggtgctca ctatttatag aactttcatg gtttaacata
421 ctgaatataa atgctctgct gctctcgctg atgtcattgt tctcataata cgttcctttg
• These occur at more than 1000 locations in the human genome
• Usually have no functional effect
Transposons and mutation
• Transposons are interspersed DNA repeats that can cause mutations and change the amount of DNA in the genome
Down Syndrome
• 1 per 800 births
• Large tongue
• Flat face
• Slanted eyes
• Single crease across the palm
• Intellectual disability (some individuals are not affected)
Turner Syndrome
• Short stature
• Absence of a menstrual period
• Little estrogen production
• Sterile
• Extra skin on the neck
Polymorphism
• Gene confers an increased risk, but does not directly cause the disorder
• No clear inheritance pattern
• Common in the population
Mutation
• Gene directly leads to the disorder
• Mendelian pattern of inheritance
• Rare