The New Biology: From Science in the Modern World to the Genetics of Diabetes Gilbert S. Omenn, M.D., Ph.D. University of Michigan, Ann Arbor, MI, USA SuperCourse of Science Conference 6 January 2009 Bibliotheca Alexandrina, Egypt
A Call for Renewal of Science in Muslim Countries Our Muslim forefathers first held up the torch of rationality, tolerance, and advancement of knowledge throughout the Dark Ages of medieval Europe. [astronomy, math, chemistry] Ibn Al-Haytham (10th C) laid down rules for the scientific method of observation, experiment, and search for truth. Ibn Al-Nafis (13th C) emphasized respect for contrarian views to be tested with evidence. Then came Taqlid. Science requires freedom to enquire, challenge, think, and envision the unimagined. --Ismail Serageldin, SCIENCE 8-08-08
Education is the most powerful weapon which you can use to change the world. Nelson Mandela
The Bibliotheca Alexandrina A beacon and compass for science, education, and peace in the Muslim world and the broader developing world. An institution with a stunning legacy, magnificent architecture, a splendid leader, fully digitalized resources, and remarkable, diverse initiatives, including, among many others, the SuperCourse of Science. A leading force for cooperation and collaboration among equals between North and South.
Europe: Investing in Intelligence “Research and innovation are the main keys to Europe’s development. They are also the most efficient way to respond to the challenges set by Asia’s large emerging economies and to lay the foundation for sustainable development for the entire planet.” ---Nicolas Sarkozy 14 May, 2008
FRONTIER SCIENCE AND GRAND CHALLENGES: INVESTING IN HIGH-POTENTIAL INDIVIDUALS AND HIGH-PAYOFF SCIENTIFIC FIELDS Gilbert S. Omenn University of Michigan French Presidency of the EU Symposium Celebrating Frontier Science Paris, 7 October, 2008
Kudos to the EU on the Launch of the Frontiers of Science Program • Investments in young scientists and their individual investigator-initiated projects • Sufficient funding to make a difference • High standards • The “Ideas Program”, complementary to the 7th Framework cooperative networks • Congratulations to those honored today • The rest of the world has noticed!
Grand Challenges for S&T and Society • Pursue the unknowns in each scientific discipline from math to biology to education. • Mobilize multidisciplinary research and development for food security, energy, health, green chemistry. • Combine S&T with political will and social purpose to overcome poverty and hunger, scarcity of water, and climate change, for sustainable economic development. --G.S. Omenn, SCIENCE 15 Dec 2006
Obama Statement on Science • Saturday, December 13 announcement of Presidential Science and Technology Adviser John Holdren; Co-Chairs of the President’s Council of Advisors on Science and Technology (PCAST), genetics pioneers Harold Varmus and Eric Lander; and ecologist Jane Lubchenco • Affirmation of the importance of science • Commitment to integrity of review of scientific issues: expect support for stem cell research, teaching of evolution, and control of greenhouse gases/climate change.
U.N. MILLENNIUM DEVELOPMENT GOALS These goals for peace, security, development, human rights and fundamental freedoms (1990 to 2015) are people-centered, time-bound, and measurable. • Eradicate extreme poverty (<$1/day; 1 billion people) and hunger: reduce by 50% • Achieve universal primary education for boys and girls • Promote gender equality and empower women • Reduce child mortality rate before age 5 by 67% • Improve maternal health: reduce mortality ratio by 75% • Combat HIV/AIDS, malaria and other diseases: begin to reverse incidence and spread • Ensure environmental sustainability: 50% reduction in those without safe drinking water • Develop a global partnership for development
GRAND CHALLENGES IN GLOBAL INFECTIOUS DISEASES (7 Goals, 14 Challenges)—Gates Foundation • Improve childhood vaccines (3) • Create new vaccines (3) • Control insects that transmit agents of disease (2) • Improve nutrition to promote health (1) • Improve drug treatment of infectious diseases (1) • Cure latent and chronic infection (2) • Measure health status accurately and economically (2)
It’s a New World in Life Sciences New Biology---New Technology Genome Expression Microarrays Comparative Genomics, Epigenetics, miRNA Gene Regulation Proteomics, incl alternative splice isoforms Bioinformatics Systems Biology Path to predictive, personalized, preventive (P3) healthcare
Biology as an Information Science: Historical Milestones • The molecule of inheritance is DNA, not protein: 1944 • The Watson-Crick double-helix model of DNA permits transcription and replication and mutations: 1953 • 46, not 48, human chromosomes: 1956 • The triplet code for proteins demonstrated: 1960 • The principle of “unity in diversity” applies to all living things---at all levels from molecules to cells to organ functions to ecosystems • Systems biology combines the digital code of genetics with environmental and behavioral inputs and perturbations (Leroy Hood) • Latest: Synthetic Biology (George Church)
U.S. Leaders of the Human Genome Project: Eric Lander; J. Craig Venter and Francis Collins; Ari Patrinos
Avalanche of Genomic Information • The International HapMap Consortium aims to genotype 1 million SNPs from 270 individuals. • Direct associations of individual SNP alleles with disease phenotypes (including linkage disequilibrium, LD) are more powerful than linkage-based indirect association analyses. • dbSNP has >10 million validated SNPs. • Haplotype structures can be obtained via genome-wide LD, haplotype blocks (1 kb to 1 Mb), and haplotype-tagging SNPs, respecting recombination hotspots and variable LD.
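The linkage disequilibrium (LD) mentioned above is usually summarized with the pairwise measures D, D' and r^2. The following is a minimal sketch of those standard formulas for two biallelic SNPs; the function name `ld_stats` and the example frequencies are hypothetical and are not taken from HapMap data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: population frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic (0 < frequency < 1)")
    # D: deviation of the observed haplotype frequency from independence
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2: squared correlation between the alleles at the two SNPs
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies, for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.30, p_a=0.40, p_b=0.50)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

A high r^2 between two SNPs is what allows one of them to serve as a haplotype-tagging SNP for the other.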
ESTIMATED COSTS OF GENOTYPING • When Human Genome sequence published in 2001, along with 10M common SNPs identified, proposed case/control studies of 1000 + 1000 participants with 20B genotypes @ $0.50 had cost estimate of $10B. • HapMap brought cost of 300,000 tagging SNPs @ $0.003 to $2M per common disease (5000x decrease in 4 years). • Now we have even more powerful analyses with “next-generation sequencing of the genome” • Computational muscle: “Skate where the puck is gonna be” (Gretzky) in planning big studies
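The cost figures above can be checked with a few lines of arithmetic. This is a sketch that assumes both estimates refer to the same 1000 + 1000 case/control design; the function name `study_cost` is hypothetical.

```python
def study_cost(participants, snps_per_person, cost_per_genotype):
    """Return (number of genotypes, total cost in dollars) for one study."""
    genotypes = participants * snps_per_person
    return genotypes, genotypes * cost_per_genotype


# 2001 proposal: 1000 cases + 1000 controls, 10 million SNPs each, $0.50 per genotype
genotypes_2001, cost_2001 = study_cost(2000, 10_000_000, 0.50)

# After HapMap: the same 2000 participants, 300,000 tagging SNPs, $0.003 per genotype
genotypes_hapmap, cost_hapmap = study_cost(2000, 300_000, 0.003)

print(f"2001:   {genotypes_2001:,} genotypes, ${cost_2001:,.0f}")
print(f"HapMap: {genotypes_hapmap:,} genotypes, ${cost_hapmap:,.0f}")
print(f"Fold decrease: {cost_2001 / cost_hapmap:,.0f}x")
```

The HapMap-era product comes to about $1.8M, which the slide rounds to $2M; $10B divided by the rounded $2M gives the quoted 5000x decrease.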
A Golden Age for the Public Health Sciences Sequencing and analyzing the human genome is generating genetic information that must be linked with information about: • Nutrition and metabolism • Lifestyle behaviors • Diseases and medications • Microbial, chemical, physical exposures Every discipline of public health sciences needed.
NIH National Centers for Biomedical Computing Physics-Based Simulation of Biological Structures (SIMBIOS), Russ Altman, PI; Informatics for Integrating Biology and the Bedside (i2b2), Isaac Kohane, PI; National Center for Integrative Biomedical Informatics (NCIBI), Brian D. Athey, PI; National Alliance for Medical Imaging Computing (NA-MIC), Ron Kikinis, PI; The National Center for Biomedical Ontology (NCBO), Mark Musen, PI; Multiscale Analysis of Genomic and Cellular Networks (MAGNet), Andrea Califano, PI; Center for Computational Biology (CCB), Arthur Toga, PI
Multi- and Interdisciplinary Research will be Required to Solve the “Puzzle” of Complex Diseases and Conditions, such as Diabetes [Diagram labels: Genes; Behavior; Diet/Nutrition; Infectious agents; Environment; Society; ???]
Global Health Network 44,000 Faculty 3500 Universities 174 Countries
Supercourse Mirror Sites 42 Mirrored Sites, MOH Egypt, Sudan, China, Mongolia, Russia
A.Husseini (Birzeit University, West Bank): “Diabetes in the Arab World”, from the SuperCourse
Prevalence Estimates of Diabetes in Selected Arab Countries, Adults > 20 Years Old, in the Year 2025 [Chart categories: Developing Countries; World; Tunisia; Oman; Saudi Arabia; Egypt]
Genetics of Diabetes and Its Complications: Layers of Complexity Craig L. Hanis, Ph.D., University of Texas at Houston; delivered at Univ Pittsburgh, 23 October, 2001 #1 ranked “Genetics and Diabetes” lecture at www.pitt.edu/~super1/
Rising Interest in the Genetics of Diabetes and Its Complications
A Brief History of the Genetics of Diabetes Nightmare Disequilibrium Headache Linkage Interactions Heterogeneity Complexity
Complex Inheritance • Model Free Linkage Approaches • Affected Pairs • Concordant Sib Pairs • Discordant Sib Pairs • Association Based Mapping • Transmission Disequilibrium Testing • Parent - Offspring Trios (pairs) • Traditional Associations • SNP-based mapping
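Of the methods listed, the transmission disequilibrium test (TDT) on parent-offspring trios is compact enough to sketch. The standard statistic compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring, as a McNemar-type chi-square with 1 degree of freedom. The function name `tdt` and the counts below are hypothetical.

```python
import math


def tdt(transmitted, not_transmitted):
    """Return (chi-square, p-value) for the transmission disequilibrium test.

    transmitted / not_transmitted: number of times heterozygous parents
    did / did not pass the candidate allele to an affected child.
    """
    n = transmitted + not_transmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - not_transmitted) ** 2 / n
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = tdt(transmitted=60, not_transmitted=40)
print(f"TDT chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Because the comparison is made within families, the test is robust to population stratification, which is one reason it appears alongside traditional case/control associations in this list.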
Fine Mapping • Ultimately a search for association of disease with single-nucleotide polymorphisms (SNP) • Criteria for selecting samples • Affected/Unaffected • Segregating/Non-segregating • Haplotype Determination • enhanced by pedigrees?
Genome-Wide Association (GWA) Studies • GWA studies represent a systematic search with nucleic acid probes (chips) for variants in the genome statistically associated with particular diseases or traits. • “Next-generation sequencing” is replacing chip arrays. • Only 2% of the DNA codes for protein products, so few of these variants actually occur in protein-coding sequence, but they may still influence regulation of gene function. • Tremendous investment and output over the past several years have transformed the genetic side of molecular epidemiology, but have neglected non-genetic variables. • Variants give clues to unsuspected genes and pathways potentially involved in diseases like diabetes mellitus. I focus the rest of the lecture on genomics and diabetes, as a bridge to the WHO course starting today on Epidemiology of Diabetes.
First GWA Studies for T2DM In 2007, five GWA studies were reported: They replicated earlier evidence for three genome variants: TCF7L2, PPARG, and KCNJ11. They identified at least six additional variants in or near these loci: SLC30A8, IGF2BP2, FTO, HHEX-IDE, CDKAL1, CDKN2A-CDKN2B. Only one (SLC30A8) is a likely functional variant at the protein level. Variants in FTO are associated also with body mass index.
Interpretation of GWA Studies of Type 2 Diabetes • These studies are unbiased by previous hypotheses of predisposing genes. • The results are limited by modest effects and the need for stringent statistical thresholds and very large sample sizes. • The largest allelic odds ratio (OR) for any established variant is <= 1.35, for TCF7L2; at least nine others (now about 20) have OR 1.1-1.2. • The aggregate attributable risk is <10 percent.
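To make the allelic odds ratios above concrete, here is a minimal sketch of how an allelic OR and its 95% confidence interval (Woolf's log method) are computed from a 2x2 table of allele counts. The function name `allelic_odds_ratio` and the counts are hypothetical and do not come from any of the studies cited.

```python
import math


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Return (OR, lower 95% limit, upper 95% limit) from allele counts."""
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) by Woolf's method
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - 1.96 * se_log_or)
    upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 1000 cases and 1000 controls, two alleles per person.
odds_ratio, lower, upper = allelic_odds_ratio(
    case_risk=700, case_other=1300, control_risk=600, control_other=1400
)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With effects this modest, the interval only excludes 1 when allele counts are large, which is why the slide stresses very large sample sizes.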
Meta-Analysis of GWA Data for Susceptibility Loci for Type 2 Diabetes [Zeggini et al, Nature Genetics 2008] • Common variants at multiple loci have modest but reproducible association with risk of T2DM. • Three studies combined (DGI, FUSION, WTCCC): 10,128 individuals of European descent; 2.2 million SNPs, directly genotyped or imputed from haplotype variation • Used both Affymetrix 500K and Illumina 317K chips • Attempted to replicate findings for 11 variants with p < 10^-5 in 53,975 samples • Found at least six more previously unknown loci: JAZF1, CDC123-CAMK1D, TSPAN8-LGR5, THADA, ADAMTS9, NOTCH2. The first three are probably associated with insulin release.
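One common way to combine per-study results for a single SNP is fixed-effect, inverse-variance weighting of the log odds ratios. The sketch below illustrates that general technique only; it is not claimed to be the exact procedure of Zeggini et al, and the function name `fixed_effect_meta` and the input values are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (log_odds_ratio, standard_error) pairs by inverse-variance weighting.

    Returns (pooled log OR, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical per-study estimates for one SNP (log OR, standard error).
studies = [(math.log(1.15), 0.06), (math.log(1.10), 0.05), (math.log(1.20), 0.07)]
pooled, pooled_se, p_value = fixed_effect_meta(studies)
print(f"Pooled OR = {math.exp(pooled):.3f}, SE(log OR) = {pooled_se:.3f}, p = {p_value:.2e}")
```

Pooling shrinks the standard error relative to any single study, which is how modest effects that miss the threshold individually can become detectable in combination.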
Complementary Strategy: GWA Studies of Risk Factors for T2 Diabetes [Mohlke et al, Hum Mol Genetics 2008] • Classic genetic epidemiology studies estimate that genetic effects explain 25% of variance for 20 measures of cardiovascular function, 51% for five anthropometric measures, and 40% for 38 blood tests, including cholesterol and metabolism. • The authors reviewed GWA studies of >200,000 SNPs that reported at least one SNP exceeding the statistical significance threshold of p < 5 x 10^-8 for cholesterol and lipid levels, obesity, myocardial infarction, or coronary heart disease.
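The genome-wide significance threshold of 5 x 10^-8 quoted above is commonly motivated as a Bonferroni correction of a 0.05 error rate across roughly one million independent common-variant tests. The one-million figure is an assumption of this sketch, not a number stated on the slide, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumption: about one million effectively independent tests across the genome.
GENOME_WIDE_THRESHOLD = bonferroni_threshold(0.05, 1_000_000)


def is_genome_wide_significant(p_value, threshold=GENOME_WIDE_THRESHOLD):
    """True if a single-SNP p-value falls below the genome-wide threshold."""
    return p_value < threshold


print(f"Genome-wide threshold: {GENOME_WIDE_THRESHOLD:.1e}")
print(is_genome_wide_significant(1e-9))  # well below the threshold
print(is_genome_wide_significant(1e-5))  # suggestive only; needs replication
```

A p-value near 10^-5 would be striking in a single-hypothesis study but is expected by chance many times over in a genome-wide scan, hence the emphasis on replication.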
Cholesterol, Lipoproteins, Lipids, and CRP [Mohlke et al, Hum Mol Genetics 2008] • Glucokinase regulator (GCKR) initially associated with triglycerides (TG) • Then with HDL-C, LDL-C, TG and 11 additional previously reported SNP variants and 7 new loci • SNPs near SORT1-PSRC1-CELSR2 loci were associated with LDL-C; a SNP explained 58-86% of the inter-individual variability in transcript levels for these three neighboring genes. • 7 variants are associated with C-reactive protein levels, including CRP itself, APOE, leptin receptor, and HNF1 homeobox A (HNF1A).
Fat Mass and Obesity Genes • A 2005 review cited 127 gene candidates and 253 quantitative trait loci reported from linkage studies of obesity. Hardly any were confirmed. • In 2007 two independent GWA studies identified obesity-associated variants in the first intron of the FTO gene; now replicated many times. FTO encodes a 2-oxoglutarate-dependent nucleic acid demethylase whose relation to obesity or BMI is not yet understood.
Informative Heterogeneity • The initial association of FTO with diabetes was not replicated in several well-powered GWA studies. • Whether or not FTO turns up in T2DM GWA studies depends entirely on the inclusion criteria for cases—if obese individuals are excluded, as in the GWA studies above, FTO is not associated; if they are included, FTO is associated (indirectly) with T2DM.
Obesity and MC4R (chromosome 18q21) • Two recent large GWA studies for obesity-related traits identified associated SNPs near the melanocortin-4 receptor (MC4R) gene. This receptor is a major target in drug development for obesity. Mutations in MC4R can produce a rare extreme form of childhood obesity. • BMI, insulin resistance, and waist circumference were associated with these variants 188 kb downstream of MC4R. What is actually happening with these allelic substitutions is unknown, but under investigation. • Together FTO and MC4R account for only 1.2 kg/m2 variation in BMI in adults.
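To put the 1.2 kg/m2 figure in everyday terms, body mass index is weight in kilograms divided by the square of height in meters, so a BMI difference converts to a weight difference at a given height. The sketch below assumes an adult height of 1.70 m purely for illustration; the function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def weight_for_bmi_difference(delta_bmi, height_m):
    """Weight difference (kg) that corresponds to a BMI difference at a fixed height."""
    return delta_bmi * height_m ** 2


# Assumed height of 1.70 m, for illustration only.
delta_kg = weight_for_bmi_difference(1.2, 1.70)
print(f"A BMI difference of 1.2 kg/m^2 at 1.70 m is about {delta_kg:.1f} kg")
```

At that height the combined effect amounts to roughly 3.5 kg, small relative to the range of adult body weights.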
Other Quantitative Metabolic Variables • For fasting glucose level, there are common sequence variants in the glucokinase (GCK) promoter and in islet-specific glucose-6-phosphatase, catalytic 2 (G6PC2). • Uric acid levels are associated with variants at solute carrier/glucose transporter SLC2A9. • Surprisingly, no associated variants have been found for high blood pressure or for systolic or diastolic blood pressure.
Evidence for Association of T2DM with Several Traits on Chromosome 9p21: SNPs in 10,128 GWA samples. Arrows = locations of SNPs. Black bars = recombination hotspots. Genes and transcripts at the bottom.
Stature/Height: Heritability >0.8 [Sanna et al and Lettre et al, Nature Genetics 2008] • Body mass index comprises height and weight measures. • Several rare mutations are definitely associated with height in Mendelian syndromes. • Common variants in transcription factor HMGA2 are associated with height in the general population. • GWA studies from Finland and Sardinia reveal an association of height with the osteoarthritis-associated locus GDF5-UQCC, perhaps through bone growth [Sanna et al] • With six populations, 10 additional loci have now been associated [Lettre et al], and the two above confirmed; however, together they (and others) account for just 2 percent of population variation in height. They do expand our ideas of biological regulation of height.
Classic Approach of Detecting Large-Effect Rare Mutations • Three of the T2DM-associated variant loci were actually discovered through analysis of the heterogeneity of the disorder • Rare Mendelian mutants of KCNJ11, WFS1, and HNF1B can cause diabetes, including Maturity-Onset Diabetes of the Young. These variants have been confirmed repeatedly by GWA. • Their potential pathways relevant to diabetes biology are shown in next slide. • Rare or small-effect loci may still be clues to underlying pathophysiology and targets to treat. • Copy-number variants are also missed in GWA studies.
Processes involved in genetic predisposition to type 2 diabetes, based on the best candidates within each signal and human physiological studies. Most genes implicated in diabetes susceptibility act through effects on beta-cell function or mass. [McCarthy and Hattersley, 2008]
Resources to Keep Up with the Field • U.S. NIH (NCI-NHGRI) maintains an ongoing catalog of published genome-wide association studies. • There are many databases of gene sequences and variants, and protein variants, to assist in annotation of the potential biological roles of variants in or near mapped genes. • Statistical compendia for tests and adjustments for bias due to selection, misclassification, and population stratification are established; see McCarthy et al, Nature Reviews Genetics 2008. • GWAS Graphical User Interface: graphical browser [Chen et al, Bioinformatics 2008]