
Structural Modelling and Bioinformatics in Drug Discovery and Infectious Disease


Presentation Transcript


  1. Structural Modelling and Bioinformatics in Drug Discovery and Infectious Disease • Shoba Ranganathan • Professor and Chair – Bioinformatics, Dept. of Chemistry and Biomolecular Sciences & Biotechnology Research Institute, Macquarie University, Sydney, Australia (shoba.ranganathan@mq.edu.au) • Adjunct Professor, Dept. of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore (shoba@bic.nus.edu.sg) • Visiting scientist @ Institute for Infocomm Research (I2R), Singapore

  2. Bioinformatics is ….. • Bioinformatics is the study of living systems through computation

  3. Data in Bioinformatics (in the main) and their management and analysis • Data: Sequences; Structures; Genomes; Transcriptomes; Genetics and populations; Networks, pathways and systems • Management and analysis: Databases, ontologies; Data & text mining; Algorithms; Maths/Stats; Physics/Chemistry; Evolution and phylogenetics

  4. What is Immunoinformatics? • Using Bioinformatics to address problems in Immunology • Application of bioinformatics to accelerate immune system research has the potential to deliver vaccines and address immunotherapeutics. • Computational systems biology of immune response

  5. Immunoinformatics (Diagram labels: Immunology; Computer Science; Biology)

  6. Summary • Introduction • Structural Immunoinformatic Database development • Data Analysis • Computational models • Applications (Diagram labels: Basic immunology; Clinical immunology; Networks, pathways and systems; -omics; Genetics and populations)

  7. The immune system • Composed of many interdependent cell types, organs and tissues • 2nd most complex system in the human body • Two types: • Innate Immune System • Adaptive Immune System (Figure by Dr. Standley LJ)

  8. It is a numbers game… • >10^13 MHC class I haplotypes (IMGT-HLA) • 10^7–10^15 T cell receptors (Arstila et al., 1999) • >10^9 combinatorial antibodies (Jerne, 1993) • 10^12 B cell clonotypes (Jerne, 1993) • 10^11 linear epitopes composed of nine amino acids • >>10^11 conformational epitopes

  9. Adaptive immune system • Major Histocompatibility Complex (MHC Class I and II) • Human Leukocyte Antigen (HLA in human) • Peptide binding to MHC • Recognition of pMHC complex by the TCR • Activation of T cells • MHC Class I – CD8+ cytotoxic T cells • MHC Class II – CD4+ helper T cells (Source: www.immunologygrid.org)

  10. How to generate a T cell-mediated immune response • 1. Epitope • 2. MHC • 3. T cell receptor

  11. Antigen processing pathway: peptides, MHC, T-cells • Degradation of antigen • Peptide binding to MHC • Recognition of peptide-MHC complex by T-cells • Yewdell et al., Ann. Rev. Immunol. (1999): 20% processed × 0.5% bind MHC × 50% CTL response = 0.05% chance of immunogenicity

  12. Physico-chemical properties affect MHC-peptide binding

  13. Computational models can help identify T cell epitopes • Suggest candidate epitopes by in silico screening of entire proteins and even proteomes with specificity at: • the allele level • the supertype level • disease-implicated alleles alone. • Minimize the number of wet-lab experiments • Cut down the lead time involved in epitope discovery and vaccine design

  14. Predicting MHC-binding peptides Tong, Tan and Ranganathan (2007) Briefings in Bioinformatics 8: 96-108 • Sequence-based approach • Pattern recognition techniques • binding motif, matrices, ANN, HMM, SVM • Main limitations: • Require large amount of data for training • Preclude data with limited sequence conservation • Structure-based approach • Rigid backbone modeling techniques • Flexible docking techniques • Main advantage: large training datasets unnecessary

  15. Our aim: Structure-based prediction of MHC-binding peptides

  16. Why structure? • Great potential to: • generate biologically meaningful data for analysis • predict candidate peptides for alleles that have not been widely studied, where sequence-based approaches fail or are not attempted • predict binding affinity of peptides • predict non-contiguous epitopes • Structure determination through experimental methods is both expensive and time-consuming • Has not been extensively studied due to high computational costs and development complexity

  17. Existing Structure-based Prediction Techniques • Protein Threading [Altuvia et al. 1995; Schueler-Furman et al. 2000] • Homology Modeling [Michielin et al. 2000] • Rigid/Flexible Docking [Rosenfeld et al. 1993; Sezerman et al. 1996; Rognan et al. 1999; Desmet et al. 2000; Michielin et al. 2003]

  18. Hypothesis for epitope selection • Peptides bound to MHC alleles are similar to substrates bound to enzymes • “Lock-and-key” mechanism for peptide selection • Shape • Size • Electrostatic characteristics

  19. • Introduction • Structural Immunoinformatic Database development • Data Analysis • Computational models • Applications (Diagram labels: Basic immunology; Sequences; Structures; Genetics and populations; Databases, ontologies)

  20. MPID: MHC-Peptide Interaction Database • Govindarajan et al. (2003) Bioinformatics, 19: 309-310 • RDB of 82 curated pMHC complexes (Class I: 64 & Class II: 18)

  21. Peptide/MHC interaction characteristics • Peptide length • Interface area • Gap volume • Gap index = gap volume / interface area • Interacting residues • Intermolecular hydrogen bonds

  22. MPID-T: MHC-Peptide-T Cell Receptor Interaction Database • Tong et al. (2006) Applied Bioinformatics, 5: 111-114 • 187 curated pMHC • 16 with TCR • Human: 110, Murine: 74 and Rat: 3 • Alleles: 40 (interface area, H bonds, gap volume and gap index)

  23. Distribution of MHC by allele • 101 new entries • 187 entries (Human: 110; Murine: 74; Rat: 3) • 134 non-redundant entries (class I: 100; class II: 34) • 121 class I and 41 class II entries • 26 HLA alleles (class I: 18; class II: 8) • 14 rodent alleles (class I: 8; class II: 6) • 16 TCR/peptide/MHC complexes

  24. Peptide/MHC binding motifs • Conserved peptide properties in solution structures • Classified according to • Alleles • Peptide length (Figure legend: Polar; Amide; Basic; Acidic; Hydrophobic)

  25. How to obtain structures of experimentally unsolved alleles? • There were only 36 crystal structures of unique MHC alleles (2006) vs. 1765 unique MHC alleles identified in the IMGT/HLA database • Structure determination through experimental methods is both expensive and time-consuming • Homology model building for alleles with no structural data!

  26. • Introduction • Structural Immunoinformatic Database development • Data Analysis of pMHC Class I complexes • Computational models • Applications (Diagram labels: Structures; Data & text mining; Maths/Stats)

  27. Conservation of nonamer peptide backbone conformation • Class I peptides • N-termini residues 0.02 – 0.29 Å • C-termini residues 0.00 – 0.25 Å • Class II binding registers • Only 9 residues fit in the binding groove • N-termini residues 0.01 – 0.22 Å • C-termini residues 0.02 – 0.27 Å

  28. • Introduction • Structural Immunoinformatic Database development • Data Analysis • Computational models • Applications (Diagram labels: Structures; Sequences; Physics/Chemistry; Maths/Stats)

  29. Two-step approach to predict MHC-binding peptides • Finding the best fit conformation (docking) of peptides within the MHC binding groove • Screening potential binders from the background

  30. Docking is a computationally exhaustive procedure • Large number of possible peptide conformations • 3 global translational degrees of freedom • 3 global rotational degrees of freedom • 1 conformational degree of freedom for each rotatable bond • >10^10 possible conformations for a 10-residue peptide (Figure: peptide backbone atoms N, Cα, C, O and side chain R on x, y, z axes)
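The counting argument on this slide can be sketched in a few lines of Python. This is only an illustration: the choice of two backbone torsions per residue and three sampled states per torsion is an assumption made here, not a figure taken from the slides, and the function names are hypothetical.

```python
def degrees_of_freedom(n_rotatable_bonds: int) -> int:
    """3 global translations + 3 global rotations + 1 torsion per rotatable bond."""
    return 3 + 3 + n_rotatable_bonds


def torsion_conformations(n_rotatable_bonds: int, states_per_bond: int) -> int:
    """Number of discrete conformations if every torsion is sampled independently."""
    return states_per_bond ** n_rotatable_bonds


# Assumption: a 10-residue peptide with 2 backbone torsions (phi, psi) per residue,
# each sampled at only 3 discrete values. Side-chain torsions are ignored.
backbone_bonds = 2 * 10
print(degrees_of_freedom(backbone_bonds))
print(torsion_conformations(backbone_bonds, 3))

# Adding a single side-chain torsion already pushes the count past 10^10.
print(torsion_conformations(backbone_bonds + 1, 3) > 10**10)
```

Even with this coarse sampling the search space grows exponentially with the number of rotatable bonds, which is why a rapid docking procedure is needed.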

  31. Rapid docking of peptide to MHC • Tong, Tan & Ranganathan (2004) Protein Sci. 13:2523-2532 (Figure: procedure shown in steps 1 to 3)

  32. Benchmarking with existing techniques (table not preserved) • Footnote a: RMSD of peptide backbone obtained from respective authors. • Footnote b: RMSD of peptide backbone obtained in our work from redocking bound complexes and single template respectively.

  33. Quantitative separation of binders from non-binders: empirical free energy scoring function • DQ3.2β is involved in several autoimmune diseases: • Celiac disease • insulin-dependent diabetes mellitus • IDDM-associated periodontal disease • autoimmune polyendocrine syndrome type II

  34. Quantitative separation of binders from non-binders: empirical free energy scoring function • Gbind = αGH + βGS + γGEL + C • Gbind = binding free energy • GH = hydrophobic term • GS = decrease in side chain entropy • GEL = electrostatic term • C = entropy change in system due to external factors • α, β, γ optimized by least-squares multivariate regression against experimental binding affinities (IC50) of MHC-peptides in the training dataset (Rognan et al., 1999)
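A minimal sketch of how the coefficients of such a linear scoring function could be fitted by least squares is given below. It is not the authors' implementation: the function names, the synthetic training terms and the target values are all hypothetical, and the conversion from IC50 to a target energy is left out. The fit solves the normal equations in pure Python so that the example has no dependencies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_coefficients(terms, g_target):
    """Least-squares fit of (alpha, beta, gamma, C) from rows of (G_H, G_S, G_EL)."""
    rows = [list(t) + [1.0] for t in terms]  # trailing 1.0 carries the constant C
    k = 4
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, g_target)) for i in range(k)]
    return solve_linear(ata, atb)


def score(coeffs, g_h, g_s, g_el):
    """Evaluate G_bind = alpha*G_H + beta*G_S + gamma*G_EL + C."""
    alpha, beta, gamma, c = coeffs
    return alpha * g_h + beta * g_s + gamma * g_el + c


# Hypothetical training data generated from known coefficients, so the fit can be checked.
true_coeffs = (0.5, -0.3, 0.2, -1.0)
terms = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (2, 1, 0), (0, 2, 1)]
g_target = [score(true_coeffs, *t) for t in terms]
fitted = fit_coefficients(terms, g_target)
print(fitted)
```

With real data the target values would come from measured affinities and the fit would not be exact; the residuals then indicate how well the three energy terms explain binding.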

  35. Test case: MHC Class II DQ8 • DQ3.2β (DQA1*0301/DQB1*0302) is involved in several autoimmune diseases: • Celiac disease • insulin-dependent diabetes mellitus • IDDM-associated periodontal disease • autoimmune polyendocrine syndrome type II

  36. Data used • Structure: 1JK8 - DQ3.2β–insulin B9-23 complex • Dataset I: 127 peptides with experimentally determined IC50 values [70 high-affinity (IC50 < 500 nM), 13 medium-affinity (500 nM < IC50 < 1500 nM) and 23 low-affinity (1500 nM < IC50 < 5000 nM) binders and 21 non-binders (IC50 > 5000 nM)] derived from biochemical studies. • 87 with known binding registers. • Dataset II: 12 Dermatophagoides pteronyssinus (Der p 2) peptides with experimental T-cell proliferation values from functional studies, with 7 peptides eliciting DQ3.2β-restricted T-cell proliferation.

  37. Scoring: Training & testing datasets • Training • 56 binding conformations with known registers • 30 non-binding conformations from 3 non-binders • Testing • Test set 1 – 68 peptides from biochemical studies • 16 strong; 13 medium; 21 weak; 18 non-binders • Test set 2 – 12 peptides from functional studies • 7 elicit T-cell proliferation

  38. Screening class II binding register: a sliding window approach E285B 112-126 peptide Y Q T I E E N I K I F E E D A
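The sliding window idea can be illustrated with a short Python sketch. It assumes a 9-residue binding core, in line with the earlier slide stating that only 9 residues fit in the class II binding groove; the function name is hypothetical and the scoring of each window (docking plus energy evaluation) is not shown.

```python
def binding_registers(peptide: str, core_length: int = 9):
    """Return (start, end, core) for every window of core_length residues.

    Positions are 1-based and inclusive, as in register notation such as "aa 4-12".
    """
    n_windows = len(peptide) - core_length + 1
    return [
        (start + 1, start + core_length, peptide[start:start + core_length])
        for start in range(max(n_windows, 0))
    ]


# The 15-residue E285B 112-126 peptide from the slide.
peptide = "YQTIEENIKIFEEDA"
registers = binding_registers(peptide)
for start, end, core in registers:
    print(f"aa {start}-{end}: {core}")
# A 15-mer yields 7 candidate registers, from YQTIEENIK to NIKIFEEDA.
```

Each candidate core would then be docked and scored, and the best-scoring window taken as the predicted binding register.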

  39. 4-step protocol used (Figure: steps A to D)

  40. Accuracy estimates • Sensitivity (SE) = fraction of binders correctly predicted = TP/AP = TP/(TP+FN) • Specificity (SP) = fraction of non-binders correctly predicted = TN/AN = TN/(TN+FP) • Area under ROC (receiver operating characteristic) curve: >90% excellent; >80% good
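These accuracy measures can be computed as in the sketch below. The scores and labels are made-up illustrative values, and the convention that a lower (more negative) score means a predicted binder is an assumption of this example. The area under the ROC curve is computed through its rank interpretation: the probability that a randomly chosen binder scores better than a randomly chosen non-binder.

```python
def confusion(scores, is_binder, threshold):
    """Count TP, FN, TN, FP; a peptide is predicted to bind if score <= threshold."""
    tp = fn = tn = fp = 0
    for s, binder in zip(scores, is_binder):
        predicted = s <= threshold
        if binder and predicted:
            tp += 1
        elif binder:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


def roc_auc(scores, is_binder):
    """Fraction of (binder, non-binder) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, b in zip(scores, is_binder) if b]
    neg = [s for s, b in zip(scores, is_binder) if not b]
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores (lower = stronger predicted binding) and experimental labels.
scores = [-9.1, -8.0, -7.2, -6.5, -5.0, -4.1]
is_binder = [True, True, False, True, False, False]

tp, fn, tn, fp = confusion(scores, is_binder, threshold=-7.0)
print(sensitivity(tp, fn), specificity(tn, fp), roc_auc(scores, is_binder))
```

Moving the threshold trades sensitivity against specificity, while the area under the ROC curve summarises performance over all thresholds.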

  41. Results for Training set (table not preserved) • High SE (good for most predictions) • Very few FPs, but also fewer predictions

  42. Screening class II binding register: HLA-DQ8 prediction accuracy for Test Set I • Classification of binding peptides • High-affinity binders (H): IC50 ≤ 500 nM • Medium-affinity binders (M): 500 nM < IC50 ≤ 1500 nM • Low-affinity binders (L): 1500 nM < IC50 ≤ 5000 nM
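The affinity classes above translate directly into a small helper. The thresholds are those on the slide; the function name and the label "N" for non-binders (IC50 above 5000 nM, as in the dataset description) are choices made for this sketch.

```python
def classify_ic50(ic50_nm: float) -> str:
    """Map an IC50 value in nM to an affinity class: H, M, L, or N (non-binder)."""
    if ic50_nm <= 500:
        return "H"  # high-affinity binder
    if ic50_nm <= 1500:
        return "M"  # medium-affinity binder
    if ic50_nm <= 5000:
        return "L"  # low-affinity binder
    return "N"      # non-binder


for value in (120, 500, 900, 4200, 12000):
    print(value, classify_ic50(value))
```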

  43. Test Set 1: Improved detection of binders lacking position specific binding motifs

  44. Binding registers • 20/23 (87%) binding registers • Only register (aa 4-12) from Test Set 2 (Der p 2: 1-20) (SE=0.80; SP(LMH)=0.90) • Top 5 predictions are experimental positives at very stringent threshold criteria (SE=0.95; SP(H)=0.63) T-cell proliferation

  45. Multiple registers • (SP = 0.95, SE(LMH) = 0.81): 58% of Test Set 1 • Mainly for medium and high binders • Experimental support: Sinha et al. for DRB1*0402 • Is this why binding motifs are unsuccessful?

  46. • Introduction • Structural Immunoinformatic Database development • Data Analysis • Computational models developed • Applications

  47. Pemphigus vulgaris (PV) • Autoimmune blistering skin disorder • Characterized by autoantibodies targeting desmoglein-3 (Dsg3) • Strong association with DR4 and DR6 alleles (Image sources: www.aafp.org; adam.about.com; http://www.medscape.com)

  48. Who are the major players in PV? • DR4 PV implicated alleles (for Semitic) • DRB1*0401 • DRB1*0402 • DRB1*0404 • DRB1*0406 • DR6 PV implicated alleles (for Caucasians) • DRB1*1401 • DRB1*1404 • DRB1*1405 • DQB1*0503

  49. What is known about DR4? DR4 PV implicated alleles (DRB1*0401, *0402, *0404, *0406) • High sequence conservation • 97.9 – 99.0% identity • 98.4 – 99.5% similarity • High structural conservation • Cα RMSD <0.22 Å for all key binding pockets • 7 polymorphic residues within binding cleft • Pocket 1 (β86), • Pocket 4 (β70, 71, 74) • Pocket 6 (β11) • Pocket 7 (β71) • Pocket 9 (β37)

  50. What is known about DR6? DR6 PV implicated alleles (DRB1*1401, *1404, *1405, DQB1*0503) • High sequence conservation • 85.8 – 94.1% identity • 83.2 – 97.3% similarity • High structural conservation • Cα RMSD <0.22 Å for all key binding pockets • 14 polymorphic residues within binding clefts • Pocket 1 (β86) • Pocket 4 (β13, 70, 71, 74, 78) • Pocket 6 (β11) • Pocket 7 (β28, 30, 67, 71) • Pocket 9 (β9, 37, 57, 60)
