
Functional 3-D modelling of G protein coupled receptors

Functional 3-D modelling of G protein coupled receptors. Uğur Sezerman. DNA. Transcription. mRNA. Translation. PROTEINS. Central Dogma. Motivation. Knowing the structure of molecules enables us to understand their mechanisms of function. Current experimental techniques: X-ray crystallography



Presentation Transcript


  1. Functional 3-D modelling of G protein coupled receptors Uğur Sezerman

  2. Central Dogma • DNA → (transcription) → mRNA → (translation) → proteins

  3. Motivation • Knowing the structure of molecules enables us to understand their mechanisms of function • Current experimental techniques • X-ray crystallography • NMR

  4. X-Ray Crystallography • crystallize and immobilize a single, perfect protein • bombard with X-rays, record scattering diffraction patterns • determine the electron density map from scattering and phase via Fourier transform • use the electron density and biochemical knowledge of the protein to refine and determine a model “All crystallographic models are not equal. ... The brightly colored stereo views of a protein model, which are in fact more akin to cartoons than to molecules, endow the model with a concreteness that exceeds the intentions of the thoughtful crystallographer. It is impossible for the crystallographer, with vivid recall of the massive labor that produced the model, to forget its shortcomings. It is all too easy for users of the model to be unaware of them. It is also all too easy for the user to be unaware that, through temperature factors, occupancies, undetected parts of the protein, and unexplained density, crystallography reveals more than a single molecular model shows.” - Rhodes, “Crystallography Made Crystal Clear”, p. 183.
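The formula on this slide did not survive in the transcript. As a reminder only, and not necessarily in the notation the slide used, the standard textbook relation between structure factors and electron density is sketched below. Here V is the unit-cell volume, |F(hkl)| the measured structure-factor amplitude, and α(hkl) the phase, which must be obtained separately.

```latex
\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l}
  \left| F(hkl) \right| \, e^{\, i \alpha(hkl)} \, e^{-2 \pi i (hx + ky + lz)}
```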

  5. NMR Spectroscopy • protein in aqueous solution, motile and tumbles/vibrates with thermal motion • NMR detects chemical shifts of atomic nuclei with non-zero spin; shifts are due to the nearby electronic environment • determine distances between specific pairs of atoms based on shifts (“constraints”) • use constraints and biochemical knowledge of the protein to determine an ensemble of models [Figure panels: determining constraints; using constraints to determine secondary structure]

  6. Biology/Chemistry of Protein Structure (structure → process) • Primary → Assembly • Secondary → Folding • Tertiary → Packing • Quaternary → Interaction

  7. Protein Assembly • occurs at the ribosome • involves dehydration synthesis and polymerization of amino acids attached to tRNA: $\mathrm{NH_3^+} - \{A + B \rightarrow A\text{-}B + \mathrm{H_2O}\}_n - \mathrm{COO^-}$ • yields primary structure

  8. Amino Acids

  9. Forces driving protein folding • It is believed that hydrophobic collapse is a key driving force for protein folding • Hydrophobic core • Polar surface interacting with solvent • Minimum volume (no cavities), van der Waals interactions • Disulfide bond formation stabilizes the fold • Hydrogen bonds • Polar and electrostatic interactions

  10. PROTEIN FOLDING PROBLEM • STARTING FROM AMINO ACID SEQUENCE FINDING THE STRUCTURE OF PROTEINS IS CALLED THE PROTEIN FOLDING PROBLEM

  11. Secondary Structure • non-linear • 3 dimensional • localized to regions of an amino acid chain • formed and stabilized by hydrogen bonding, electrostatic and van der Waals interactions

  12. The α-helix

  13. Ramachandran Plot • Pauling built models based on the following principles, codified by Ramachandran: • (1) bond lengths and angles – should be similar to those found in individual amino acids and small peptides • (2) peptide bond – should be planar • (3) overlaps – not permitted; pairs of atoms no closer than the sum of their covalent radii • (4) stabilization – have sterics that permit hydrogen bonding • Two degrees of freedom: • φ (phi) angle = rotation about N – Cα • ψ (psi) angle = rotation about Cα – C • A linear amino acid polymer with some folds is better but still neither functional nor completely energetically favorable → packing!
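The two backbone angles plotted on a Ramachandran plot are ordinary dihedral angles, so they can be computed from four atom positions each. The sketch below is illustrative and not taken from the slides: the function name `dihedral` is hypothetical, coordinates are assumed to be plain (x, y, z) tuples, and the sign follows the usual convention (positive when the fourth atom is rotated clockwise from the first, looking down the central bond). With it, φ would use the atoms C(i-1), N, Cα, C and ψ the atoms N, Cα, C, N(i+1).

```python
import math


def dihedral(p0, p1, p2, p3):
    """Return the signed dihedral angle (degrees) defined by four 3-D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # first bond vector
    b1 = sub(p2, p1)          # central bond (rotation axis)
    b2 = sub(p3, p2)          # last bond vector
    n1 = cross(b0, b1)        # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)        # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(dot(b1, b1))
    x = dot(n1, n2)
    y = dot(b1, cross(n1, n2)) / b1_len
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not real atoms): a right-angle twist about the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Computing (φ, ψ) for every residue and scattering the pairs gives the plot itself; the allowed regions follow from the steric rules listed on the slide.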

  14. Chou-Fasman Parameters

  15. HOMOLOGY MODELLING • Using database search algorithms find the sequence with known structure that best matches the query sequence • Assign the structure of the core regions obtained from the structure database to the query sequence • Find the structure of the intervening loops using loop closure algorithms

  16. Homology Modeling: How it works • Find template • Align target sequence with template • Generate model: • - add loops • - add sidechains • Refine model

  17. 1esr

  18. TURALIGN: Constrained Structural Alignment Tool For Structure Prediction

  19. Motif Alignment Using Dynamic Algorithm [Figure: two template/target alignment panels]

  20. RESULTS • For all the experiments done, our algorithm perfectly matched the functional sites and motifs given as input to the program. • 1csh vs 1iomA: RMSD = 2.50 • 1csh vs 1k3pA: RMSD = 2.12 • 1k3pA vs 1iomA: RMSD = 3.03 • 1b6a vs 1xgsA: RMSD = 2.23 • 1fp2A vs 1fp1D: RMSD = 2.98 • On average, the best results over the 5 experiments were: RMSD = 2.57 with ac: 0.4, sc: 0.4, tc: 0.2, cc: 0
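As a quick check on the numbers in this slide, the sketch below averages the five reported RMSD values and also shows how an RMSD between two structures is typically computed. The `rmsd` helper is a hypothetical illustration, not the TURALIGN code: it assumes the two coordinate lists are already superposed and paired atom by atom, which the slides do not describe.

```python
import math
from statistics import mean

# RMSD values exactly as reported on the slide, keyed by structure pair.
reported_rmsd = {
    ("1csh", "1iomA"): 2.50,
    ("1csh", "1k3pA"): 2.12,
    ("1k3pA", "1iomA"): 3.03,
    ("1b6a", "1xgsA"): 2.23,
    ("1fp2A", "1fp1D"): 2.98,
}

average_rmsd = mean(reported_rmsd.values())
print(f"{average_rmsd:.2f}")  # agrees with the 2.57 quoted on the slide


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superposed 3-D points."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```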

  21. Thanks to • Tural Aksel

  22. Why Functional Classification? • Huge amount of data accumulated via genome sequencing projects. • Experimental structure determination methods (X-ray & NMR) are costly and take months to years. • Computational structure prediction methods are also not accurate enough.

  23. G-protein coupled receptors (GPCRs) • Vital protein bundles with versatile functions. • Play a key role in cellular signaling and the regulation of basic physiological processes; more than 50% of prescription drugs interact with them. • Therefore an excellent potential therapeutic target for drug design and the focus of current pharmaceutical research.

  24. GPCR Functional Classification Problem • Although thousands of GPCR sequences are known, a crystal structure has been solved for only one GPCR, at medium resolution, to date. • For many of them, the activating ligand is unknown. • Functional classification methods for automated characterization of such GPCRs are imperative.

  25. Relationship between the specific binding of GPCRs to their ligands and their functional classification • According to the ligand types they bind, GPCRs are classified into at least six families. • The correlation between sub-family classification and the specific binding of GPCRs to their ligands can be computationally explored for Level 2 subfamily classification of the Amine Level 1 subfamily. • Subfamily classifications in GPCRDB are defined according to which ligands the receptor binds (based on chemical interactions rather than sequence homology).

  26. Benchmark Dataset • Dataset: 352 amines, 595 peptides, 1898 olfactory, 355 rhodopsin, 56 prostanoid • Retrieve GPCR proteins from GPCRDB & SWISS-PROT over the internet • Group the proteins according to their ligand specificity (i.e. amines, peptides, olfactory, rhodopsin, prostanoid) • Separate proteins into train and test groups with a 2:1 ratio, respectively • Derive the ecto-domains using TMHMM (i.e. n-terminal, loop1, loop2, loop3) • Rewrite the sequences using an 11-letter alphabet
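The 2:1 train/test separation can be sketched as follows. This is a minimal illustration under stated assumptions: the slides do not say whether the split was random or how ties in the rounding were handled, so the seeded shuffle and the floor rounding here are choices made for the example, and `split_two_to_one` is a hypothetical name.

```python
import random


def split_two_to_one(proteins, seed=0):
    """Split one ligand group into train and test lists at roughly 2:1.

    Assumption: a seeded random shuffle; the original procedure is not given.
    """
    shuffled = list(proteins)
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3   # two thirds, rounded down
    return shuffled[:n_train], shuffled[n_train:]


# Group sizes from the slide; the identifiers are placeholders, not real entries.
group_sizes = {"amines": 352, "peptides": 595, "olfactory": 1898,
               "rhodopsin": 355, "prostanoid": 56}
for group, size in group_sizes.items():
    placeholder_ids = [f"{group}_{i}" for i in range(size)]
    train, test = split_two_to_one(placeholder_ids)
    print(group, len(train), len(test))
```

Splitting each ligand group separately, rather than the pooled dataset, keeps the 2:1 ratio within every group even though the groups differ greatly in size.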

  27. Classification of Amino acids

  28. Snake plot of the human beta-2 adrenoceptor

  29. PROTEIN DATABASE Train proteins; Ligand group: amines

  30. FINDING MOST COMMON PATTERNS FOR EACH LIGAND GROUP • Form triplets for n-terminal, loop1, loop2 and loop3 separately • An 11-letter alphabet gives 1331 different triplets • For each triplet, find the proteins in a given ligand group that contain the triplet at a given location, and keep the data in vectors • Find the ratio of occurrence of each triplet in a given GPCR protein type (e.g. amines) at a given location (e.g. loop1) • Insert the triplets into an SQL database with their ratios • Sort the triplets according to their ratios
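The counting step above can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the letters A to K are assumed for the 11-letter alphabet (the triplets shown later in the deck use only those letters), the "ratio of occurrence" is read as the fraction of proteins in a group whose region contains the triplet at least once, and the SQL storage step is left out. The names `triplets_in` and `occurrence_ratios` are hypothetical.

```python
from itertools import product

ALPHABET = "ABCDEFGHIJK"  # assumed letters for the 11-letter alphabet
ALL_TRIPLETS = ["".join(t) for t in product(ALPHABET, repeat=3)]  # 11**3 = 1331


def triplets_in(segment):
    """Set of overlapping triplets present in one reduced-alphabet segment."""
    return {segment[i:i + 3] for i in range(len(segment) - 2)}


def occurrence_ratios(segments):
    """Fraction of proteins whose segment (e.g. loop1) contains each triplet.

    `segments` holds one string per protein, all from the same ligand group
    and the same location.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    counts = dict.fromkeys(ALL_TRIPLETS, 0)
    for segment in segments:
        for triplet in triplets_in(segment):
            if triplet in counts:
                counts[triplet] += 1
    return {t: c / len(segments) for t, c in counts.items()}


# Toy loop1 segments for one ligand group (made-up strings).
ratios = occurrence_ratios(["AAAB", "CAAA", "BBB", "AAB"])
ranked = sorted(ratios.items(), key=lambda item: (-item[1], item[0]))
print(ranked[:3])
```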

  31. VECTORS

  32. FINDING DISTINGUISHING MOTIFS I • Compare the ratio of each triplet in a certain ligand group with the occurrence of this triplet in each of the other ligand groups, one by one (e.g. aaa in amines = 0.5; in peptides = 0.1; r = 0.5/0.1) • Keep the motifs with the n (= 150) highest “r”s for each pair of ligand groups. These are the motifs that distinguish the given group from the other groups
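The pairwise comparison can be sketched as below, using the slide's own example (0.5 in amines against 0.1 in peptides gives r = 5). The function name `distinguishing_motifs` is hypothetical, and the handling of triplets that never occur in the other group is an assumption: the slides do not say what happens when the denominator is zero, so a small floor `eps` is used here.

```python
def distinguishing_motifs(ratios_group, ratios_other, n=150, eps=1e-6):
    """Rank triplets by r = ratio in the group / ratio in the other group.

    Assumption: a denominator of zero is replaced by `eps`; triplets absent
    from the group itself are skipped.
    """
    scored = []
    for triplet, ratio in ratios_group.items():
        if ratio == 0:
            continue
        other = ratios_other.get(triplet, 0.0)
        scored.append((triplet, ratio / max(other, eps)))
    # Highest r first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:n]


# Toy ratios; only the AAA values come from the slide's example.
amines = {"AAA": 0.5, "AAB": 0.2, "ABA": 0.0}
peptides = {"AAA": 0.1, "AAB": 0.4, "ABA": 0.3}
top = distinguishing_motifs(amines, peptides, n=2)
print(top)
```

Running the same comparison for every ordered pair of ligand groups yields one motif list per pair, as the slide describes.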

  33. RESULTS • Success rates for Information theory

  34. CART RESULTS • The classification table, showing only the patterns that distinguish amines from all others

  35. Index  Triplet  Family
      1      CAA      Amine
      2      AIB      Amine
      3      HIJ      Prostanoid
      4      AEA      Hormone-protein
      5      JAA      Hormone-protein
      6      AAD      TRH
      7      ADA      TRH
      8      JCK      Melatonin

  36. Variable importance of the amine-determining patterns

  37. Occurrence of EIG in Loop2 in Rhodopsin Family

  38. Triplet JJI at exo-loop 2 in olfactory sub-family.

  39. Conclusion • Exploiting the fact that there is a non-promiscuous relationship between the specific binding of GPCRs to their ligands and their functional classification, our method classifies Level 1 subfamilies of GPCRs with a high predictive accuracy of 98%. • The presented machine learning approach bridges the gulf between the excess amount of GPCR sequence data and their poor functional characterization. • The method also finds binding motifs of GPCRs to their specific ligands, which can be exploited in drug design to block these sites. • With such an accurate and automated GPCR classification method, we hope to accelerate the pace of identifying proper GPCRs and their ligand binding schemes to facilitate drug discovery, especially for neurological diseases.

  40. • Ligand binding motifs and their site information can be used as constraints to build better models. • Highly conserved sites from alignment of GPCR families can also be used as constraints.

  41. Thanks to • Murat Can Çobanoğlu

  42. Class A Rhodopsin like • The largest and most diverse family of GPCRs • Conserved sequence motifs • Unique signal-transduction activities • Important members: • Adrenergic Receptors • Adenosine Receptors • Chemokine Receptors • Dopamine Receptors • Histamine Receptors • Opsins

  43. Highlighted 4 GPCRs for Structure Comparison
