620 likes | 628 Vues
This article explores the applications of microarrays and artificial intelligence in the diagnosis, prognosis, and selection of therapeutic targets for cancer. It discusses the use of gene expression profiles and neural networks for accurate diagnosis and prediction of patient outcome. The potential of these technologies in personalized medicine is also highlighted.
E N D
The Applications of Microarrays and Artificial Intelligence for Diagnosis, Prognosis and Selection of Therapeutic Targets Oncogenomics Section Javed Khan M.D. February 2005
Case History • 5 year old male referred to NCI second opinion • Injury to R groin/inguinal while playing • Rapid evolving mass with initial resolution • Diagnosis= hematoma • Treatment observation • Several weeks later mass enlarged • Suspected malignancy • Biopsy performed
Despite availability of immunohistochemistry, cytogenetics and molecular techniques, in some cases incorrect diagnoses are made Lymphoma? Alveolar Rhabdomyosarcoma
Small Round Blue Cell Tumors (SRBCT) Lymphoma Ewing’s Diagnostic Dilemmas Rhabdomyosarcoma Neuroblastoma
Rhabdomyosarcoma non-Hodgkin's Lymphoma Muscle Lymphoid Chemotherapy Chemotherapy Yes Rarely Yes No 20-60% survival 50-90% survival Accurate Diagnosis is Essential for the Treatment Small Blue Round Cell Tumors Origin Treatment Lumbar Puncture Intra-thecal drugs Surgery Radiation Prognosis
Evolution of Translational Applications of Microarrays Schena et al.
Translational Applications of MicroarraysOutline • Cancer Diagnosis using Artificial Neural Networks (ANN) • Prognosis Prediction using ANNs • Array-CGH Investigation of Genomic Imbalances & Characterization of Tumor Progression Models
Or • Intrinsic genomic instability of tumors leads to extensive random fluctuations in global gene expression such that no two cancers have similar profiles. Applications of Microarrays for Tumor Diagnosis • Hypothesis • Cancers belonging to a given diagnosis have diagnostic specific gene expression profiles
Khan et al., Cancer Research , 58, 5009-5013, 1998 Multidimensional Scaling (MDS) Do Cancers Exhibit Diagnostic Specific Gene Expression Profiles? • Alveolar Rhabdomyosarcoma (ARMS cell lines) • 1238 element cDNA microarray • 2-Class problem comparing ARMS vs. “others” • First report to demonstrate that cancers of a given diagnosis have similar gene expression profiles • Utilized visualization (MDS) and clustering tools
Using unsupervised clustering methods we demonstrated:Cancers (ARMS) belonging to a specific type have similar gene expression profiles, raising the possibility for it’s application for diagnostics. Can gene expression profiles be used to reliably diagnose cancers belonging to multiple classes?
Lymphoma (n=8) Ewing’s (n=23) 6567 element cDNA Microarray Rhabdomyosarcoma (n=20) Neuroblastoma (n=12) Unknown n=25 Non-SRBCT n=5 Khan, Wei, Ringnér et al. Nat Med. 7: 673-9, 2001
RMS NB EWS BL Unsupervised clustering Principal Component Analysis (PCA) showed no diagnostic specific clustering
Why Artificial Neural Networks (ANNs)? • Supervised • Pattern recognition algorithms • Modeled on the human neuron/brain • Learning from prior experience by error minimization • Input = any type data, e.g. gene expression • Output = any given number (1) • APPLICATIONS • Defense • Voice and handwriting recognition • Fingerprint Recognition • Diagnosis of Arrhythmias • Diagnosis of Myocardial Infarcts • Interpreting Mammograms, Radiographs/MRI
ANN Training & Validation of 63 SRBCT samples 25 Unknown Test Samples 3750Trained Models Ideal output: e.g. for EWS EWS RMS BL NB 1 0 0 0 Output: (0-1)
Gene Minimization Identified minimal top 96 (1.5%) that perfectly classified all 4 SRBCT classes Increasing Number of Ranked Genes in Order
EWS NB RMS BL Hierarchical Clustering using Top 96 Ranked Clones Resulted in “Perfect” Clustering Identified 41genes not previously reportedto be expressed in SRBCTs
25 Unknown Test Samples Cancer Diagnosis?
Diagnostic Classification 25 Unknown Test Samples Ideal output: EWS RMS BL NB 1 0 0 0 Ideal Distance=0 • Highest Output Determines Classification • Euclidean Distance from Ideal • Construct Probability Distribution • of Distances • Calculate 95th Percentile • Diagnosis Confirmed if Distance • if within 95th Percentile Distance from Ideal
ANN Diagnostic Classification Cancer EWS BL NB RMS Sensitivity (%) 93 100 100 96 Specificity (%) 100 100 100 100 The expression profile of 96 genes can predict the diagnosis of SRBCT using ANNs
Cancer Prognosis Hypothesis • The expression profiles of cancer contains “prognostic information” at presentation prior to treatment. • Computer algorithms can utilize this to predict outcome of individual patients with no prior knowledge
Therapy Presentation Remission Relapse Cancer Prognosis Converse Hypothesis • Only a fraction of the original pretreatment tumor mass contains the “resistant clone” and this cannot be detected by whole tumor microarray experimentation.
Incidence: 1 per 100,000 in children < 15 yrs in US The most common solid tumor for children <1 yr 7-10% of cancers of childhood Neuroblastoma • Survival: • 75% under 1 year of age • < 30% of children over 1 year old with advanced disease despite aggressive therapy • Known Prognostic Factors • Age, Stage, Shimada Histology, MYCN amplification,Ploidy • Other genes, such asTRKA, TRKB, hTERT, BCL2,FYN, CD44and • caspases
Neuroblastoma Prognosis Age Stage MYCN amplification Ploidy Shimada Histology Risk John Maris Current Opinion in Pediatrics 2005, 17:7–13
Children’s Oncology Group (COG) Neuroblastoma Risk Stratification • Low: 90% survival • Intermediate : 70-90% • High: 10-30%
7 biological repeats Wei JS, Greer BT et al. Cancer Research Oct 1, 2004; 64(19) 42k cDNA Microarray
Survival Probability Curves of 49 Patients . Can ANNs predict survival status of each individual patient?
Leave-one-out (Test) Gene Expression Profiling 42578 Clones 25933 Unigenes PCA & Train ANN Output: 0=Alive 1=Dead ANN Experimental Design using all clones 49 NB Samples 30 Alive (>3yrs) 19 Deceased
Alive Dead Output 0=Alive 1=Dead ANN prediction (88%) of 49 patients using 38k High Quality Clones in a leave-one-out strategy . 0=Alive 1=Dead
ANNs (using all 38k genes) were able to predict outcome without prior knowledge- of known risk factors .
TEST 21 NB samples (16 Alive, 5 Deceased)
Clone Minimization . (19 Genes)
PCA of NB tumor samples using 19 genes . Alive Dead
How well do these top 19 genes perform for predicting the outcome? ANNs were retrained on the 35 NB training samples using the expression profiles the 19 genes and outcome predicted on 21 test samples
Alive Dead St2-NA-NB18 St4-A-NB14 ANN prediction (98%) using 19 top ranked genes . Output 0=Alive 1=Dead
Which patients will benefit the most from the 19-gene prediction models? . • Ultra High-Risk • Good High-Risk
Survival probability of high-risk patients using the top 19 genes . All high-risk Without MYCN-A Patients with COG high-risk disease will benefit most from 19-gene prediction models
The expression of the 24 predictor clones (19 genes) .
Details of 19 predictor genes • 12 known genes, 7 ESTs, 2 previously described (MYCN & CD44) • 8 out of the 12 neural specific genes • DLK1 human homologue Drosophilia Delta gene • -expressed by developing neuroblasts • -activates Notch signaling pathway, inhibits neuronal differentiation • ARC, MYCN, and SLIT3 also neural development • -higher expression in the poor-outcome tumors suggest a more • aggressive less differentiated phenotype, reminiscent of proliferating • and migrating neural crest progenitors • SLIT3 neuron axon repellant gene was high; ROBO2, of one of its receptors was low in poor risk suggesting these NB cells secrete a substrate to repel connecting axons and potentially prevent differentiation • Three secreted proteins DLK1, SLIT3, and PRSS3 in poor risk tumors
Summary Gene expression profiles contain prognostic information in the pre-treatment diagnostic samples for patients with NB • 2. Identified 19 prognostic specific genes that • performed the best 3. Ability to further partition current COG high risk patients into ultra-high and survivors groups
Future Direction Develop a multiplex PCR based prognosis prediction assay / serum markers test 2. Validate in a larger cohort of patients: national trials 3. Biological studies of the role of selected genes in the tumorigenic process 4. Isotope-coded affinity tags (ICAT) analysis of ultra-high risk tumor samples 5. New treatment trial for ultra-high risk patients
Comparative Genomic Hybridization (CGH) Known Genomic Alterations that Correlate with Poor Prognosis in NB • MYCN amplification: 20-25% • 1p deletion: 30-36% • 17q gain: 70-80% • 11q deletion: 44%
Perform a systematic survey of genomic copy number alterations in Neuroblastoma Identify genomic alterations that correlate with stage and MYCN amplifications Infer a model for tumor progression Objectives Chen QR, Bilke S et al.
CANCER Harvest DNA Metaphase spread DNA DNA Klenow Labeling DNA DNA Fold Sensitivity Resolution Metaphase CGH 2 5-10MB BAC 2 <1MB cDNA 3-5 Gene CGH Preparation of genomic DNA BAC array DNA array DNA labeling Hybridization Image analysis
Neuroblastoma Sample and Dataset • 12 neuroblastoma cell lines (MYCN-A & NA) • 73 tumor samples • 20 Stage 1 tumor samples • 53 Stage 4 tumors - 15 Stage 4S - 20 Stage 4 without MYCN amplification (4-) - 18 Stage 4 with MYCN amplification (4+) • 42 k clone cDNA microarray
Sensitivity of cDNA A-CGH to Detect Single Copy Number Changes Ref 0.5 (Expected) equal to 0.9 (Observed)