This presentation from Harvard Medical School's Center for Biomedical Informatics draws on computerized medical records, data mining, and natural language processing to examine the effect of COX-2 inhibitors on myocardial infarction incidence, to identify patient cohorts (treatment-resistant depression, rheumatoid arthritis) for genome-wide association studies, and to analyze gene expression in models of obesity and diabetes. The case histories underscore the need for more advanced informatics to understand complex disease mechanisms and patient treatment pathways.
Translational Case Histories • Harvard Medical School Center for Biomedical Informatics • i2b2 National Center for Biomedical Computing • Isaac S. Kohane, MD, PhD; John Glaser, PhD; Susanne Churchill, PhD
First signal: • 1 year after Celecoxib • 8 months after Rofecoxib
For every million prescriptions, 0.5% increase in MI (95% CI 0.1 to 0.9) • 50.3% of the deviance explained
Effect on patient age • Negative association between mean age at MI and prescription volume • Spearman correlation -0.67, P<0.05
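The Spearman statistic quoted above is a Pearson correlation computed on ranks. The sketch below shows that calculation using only the standard library. The prescription volumes and mean ages are hypothetical placeholders, not the study's data, and the helper names (`ranks`, `pearson`, `spearman`) are illustrative.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for illustration only (not from the slides).
prescriptions_millions = [0.2, 0.5, 0.9, 1.4, 1.8, 2.1]
mean_age_at_mi = [68.9, 69.1, 68.2, 67.5, 67.9, 66.8]

rho = spearman(prescriptions_millions, mean_age_at_mi)
print(f"Spearman rho = {rho:.2f}")
```

A negative `rho` means that periods with higher prescription volume tend to have a lower mean age at MI, which is the direction of the association reported on the slide.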
I2B2: Test RelNet Project • Correlate available GEO expression data for the GPL96 platform, which contains expression values for more than 22K human genes • Number of gene pairs for this gene chip: ~250 million • Multi-threaded application running on a high-performance cluster environment from HP • Bottleneck: the back-end database • The current, fine-tuned version of the application takes about 2-3 months to complete the calculation for one data set
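The pair count on this slide follows from choosing 2 genes out of n. The sketch below checks the order of magnitude and shows the all-pairs loop on a toy matrix. It assumes n = 22,000 (the slide says only "more than 22K"), uses Pearson correlation as a stand-in for whatever measure the project applied, and uses made-up gene names and values.

```python
import math
from itertools import combinations
from statistics import mean

# Assumption: 22,000 genes as a lower bound for "more than 22K".
n_genes = 22_000
n_pairs = math.comb(n_genes, 2)  # n * (n - 1) / 2 unordered pairs
print(f"{n_pairs:,} gene pairs")


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy expression matrix: gene -> values across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.5, 1.0],
}

# One correlation per unordered gene pair, the same shape of work as the
# full chip but with 3 pairs instead of hundreds of millions.
correlations = {
    (g1, g2): pearson(expression[g1], expression[g2])
    for g1, g2 in combinations(sorted(expression), 2)
}
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Because the number of pairs grows quadratically with the number of genes, writing one result row per pair is a plausible reason for the back-end database to become the bottleneck.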
Recurrent Themes • Access to large numbers of phenotyped specimens • Inadequacy of informatics at the cutting edge • Inadequacy of software solutions alone • A persistent multidisciplinary requirement
Overall Remission Rate with Citalopram = 32.9% (N = 943/2876) • QIDS-SR: Quick Inventory of Depressive Symptomatology, self-report • [Figure: percent (%) of patients by last QIDS-SR score category: no depression, mild symptoms, moderate symptoms, severe symptoms, very severe symptoms] • Trivedi MH, et al. Am J Psychiatry 2006;163:28-40.
Aims: • Identify a cohort of patients with TRD, and a matched cohort with SSRI-responsive MDD, using data-mining tools and natural language processing. • Conduct the first genome-wide association study of TRD.
Scan computerized medical records (DataMart) • ICD9 RA x 3 plus one of: CCP or RF; erosions on x-ray; DMARD treatment • Crimson “discarded” blood samples (cases and controls) • CCP on all samples (and bank serum) • DNA on all samples for genetic studies • Adds >95% specificity • www.i2b2.org/disease/arthritis.html
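The record-screening rule above can be read as a simple predicate over structured fields. The sketch below is one interpretation, assuming "ICD9 RA x 3" means at least three RA ICD-9 codes on record. The `PatientRecord` class and its field names are hypothetical and do not reflect the DataMart schema.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical summary of one patient's structured record."""
    ra_icd9_count: int            # number of RA ICD-9 codes on record
    ccp_positive: bool = False    # anti-CCP antibody positive
    rf_positive: bool = False     # rheumatoid factor positive
    erosions_on_xray: bool = False
    on_dmard: bool = False        # receiving DMARD treatment


def meets_ra_criteria(p: PatientRecord) -> bool:
    """Assumed reading: >= 3 RA codes plus at least one supporting finding."""
    has_codes = p.ra_icd9_count >= 3
    has_support = (
        p.ccp_positive or p.rf_positive  # "CCP or RF"
        or p.erosions_on_xray            # "Erosions on x-ray"
        or p.on_dmard                    # "DMARD treatment"
    )
    return has_codes and has_support


cohort = [
    PatientRecord(ra_icd9_count=4, rf_positive=True),
    PatientRecord(ra_icd9_count=3),                 # codes only, no support
    PatientRecord(ra_icd9_count=1, on_dmard=True),  # too few codes
]
selected = [p for p in cohort if meets_ra_criteria(p)]
print(len(selected), "of", len(cohort), "records selected")
```

Requiring a supporting finding in addition to the billing codes is what makes the rule stricter than a code-only search.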
Association in population samples • SNP frequency in cases (affecteds) compared to controls • Positive controls: MHC, PTPN22, STAT4, TRAF1-C5, TNFAIP3
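Comparing SNP frequency in cases and controls is commonly done with a chi-square test on a 2x2 table of allele counts. The sketch below shows that test as a generic illustration, not necessarily the analysis used in this study. The allele counts are hypothetical and the function name is illustrative.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are counts of allele A and allele B.
    Returns (chi2, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts (2 alleles per person, 500 cases, 500 controls).
chi2, p_value = allelic_chi_square(case_a=600, case_b=400, ctrl_a=500, ctrl_b=500)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

In a genome-wide scan this test is repeated for every SNP, so the significance threshold has to be far stricter than 0.05. Known loci such as the positive controls listed above are a sanity check that the pipeline recovers established signals.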
[Figure: allele-specific methylation assay. Allele ‘A’ (…acgt…ggaatac…) and Allele ‘B’ (…acgt…ggattac…), each flanked by NspI sites. Genotype calls after MSRE digestion are compared with a control (no digestion). Methylation outcomes: ‘A’ methylated, ‘B’ methylated, both unmethylated. Expression outcomes: ‘B’ expressed, ‘A’ expressed, both expressed.]
Gene Network Enrichment Analysis • Microarray data • Protein-protein interaction network • Molecular Function • Biological Process
Diabetes Genome Anatomy Project: Mouse Models of Insulin Resistance, Insulin Deficiency and Obesity • Knockouts: insulin receptor, insulin receptor substrates, leptin, PGC1A • Environmental: high-fat diets, drug treatments (streptozotocin) • Tissues • 67 conditions total
Three Functional Sets Are Consistently Over-represented In Disease Models • Insulin signaling, interleukins, and nuclear receptors. • Insulin signaling is consistent with the given disease models, but it was not identified using standard techniques. • Interleukins and nuclear receptors are consistent with the inflammation and disordered metabolism associated with type 2 diabetes. • Conditions in which each set is over-represented: insulin signaling, 45 of 67; interleukins, 38 of 67; nuclear receptors, 31 of 67.
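To put the three counts on a common scale, the short sketch below converts each one to a share of the 67 conditions. The counts come from the slide. The variable names are illustrative.

```python
TOTAL_CONDITIONS = 67

# Number of conditions in which each functional set was over-represented.
counts = {
    "Insulin signaling": 45,
    "Interleukins": 38,
    "Nuclear receptors": 31,
}

# Share of all conditions for each set.
fractions = {name: count / TOTAL_CONDITIONS for name, count in counts.items()}

# Print from most to least frequently over-represented.
for name, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {counts[name]} of {TOTAL_CONDITIONS} ({frac:.1%})")
```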