C ONCLUSIONS

Mechanism Legend Prediction Legend RS-Predictor Core Model Training Set – Unknown Response Calculate MIRank Models – Right Panel Calculate Descriptors – Central Panel Modeling the regioselectivity of cytochrome P450-mediated metabolism: Development of an effective in silico solution – “RS-Predictor” Jed Zaretzki, Charles Bergeron, Tao-wei Huang, Kristen Bennett, Curt Breneman* Chemistry Department, Rensselaer Polytechnic Institute, Troy, NY 12180 e-mail: zaretj@gmail.com Descriptor theory and implementation Descriptor Generation Flowchart Implemented in SVL MIRank theory and modeling procedure ABSTRACT A small number of cytochrome P450 isozymes are responsible for metabolizing 90% of all marketed drugs. CYP isozymes are capable of catalyzing a variety of transformations, including aromatic and aliphatic oxidation, N-,O-,and S-dealkylation, N-oxidation, sufloxide/sulfone formation, oxidative deamination, desulfuration, and dehalogenation, among others. We present RS-Predictor, a method that specifically addresses the unique machine learning and reactivity descriptor challenges that must be solved in order to provide reliable site-specific prediction of the products of lead compound oxidative metabolism. Initial Database in .sdf format A standard .txt format to capture molecular databases. The format that has been slightly modified to incorporate experimental sight of metabolism information. Databases exist on an isozyme specific basis, containing all molecules for which experimental metabolic regioselectivity is known. Results INTRODUCTION Database in .mdb format • A molecule of Lidocaine: • 7 metabolophores • Mechanism Types: • Csp3 Hydroxylation • N – Hydroxylation • Aromatic Hydroxylation • 1 Experimental site of metabolism • Csp3 Hydroxylation Import .sdf file into a .mdb file. The fields include molecular structure, ID, as well as all experimentally known sites of metabolism. ni- the occupation number of the i-th MO - MO coefficients for atomic orbitals m and n Generate conformations for each molecule in the database Sample Analysis 3A4 Experimentally Known Sites Mechanism Breakdown - 376 Sites For each entry in isozyme database generate a conformational database with up to 25 conformations, using default settings for the StochasticCSearch function. The goal is to design a system able to predict for a given molecule which metabolophore, or potential site of metabolism, is modified by a specific CYP450 isozyme. 3A4 Potential Sites Mechanism Breakdown - 4147 Sites • Building a model universally representative across different molecules is made difficult by two facts. • Limited kinetic data means only binary response information, labeling which metabolophores have high reaction rates, is known. Differences between molecules through response is hard to gauge because response values are binary. • Difference between molecules’ metabolophore compositions, and their corresponding differences in relative environmental or electronic properties, make it hard to build a universal model. This concept is symbolically illustrated the figure on the left. Generate atom specific descriptors for each conformation in database These descriptors are calculated with MOPAC, using the SVL function mopac_Run with the following parameters: 'XYZ','MMOK','VECTORS','BONDS','PI','PRECISE', 'ENPART','EF','MULLIK',GNORM,'version':2007. Details of descriptors are provided on panel to the left. Boltzmannn average atom specific descriptors over all conformations and calculate topological descriptors Individual conformation All possible conformations RS-Predictor Flowchart Boltzmann’s formula averages descriptor values according to the energetic likelihood of a conformation occuring amongst all possible (max 25) conformations. Topological descriptors are conformationally invariant. Molecule, Group, and Hydrogen ID’s are calculated to give each atom entry a unique key for MIRank modeling. RS-Predictor Prediction Set – Unknown Response Metabolic Site Predictions Calculate Descriptors – Central Panel • CONCLUSIONS • Using a training set of protein-ligand complexes structurally and chemically similar to a query complex at the binding site and constructing a locally applicable model results in improved accuracy in binding constant prediction • TAE augmented scoring functions are able to predict pKd/pKi values of enzymes with lower mean error and greater ranking accuracy than other available scoring functions using current methodologies • Water was not included in building scoring functions, resulting in inability to identify native-like binding poses for ligands binding to bridged water molecules in native poses • Descriptors which capture conformation information more accuratelyand account for water could improve success rate in prediction of native-like poses REFERENCES 1) Wang et al. J. Med. Chem. 2003, 46, 2287-2303 2) Wang et al. J. Chem. Inf. Comput. Sci. 2004, 44, 2114-2125 3) Gold et. al. J. Chem. Inf. Model. 2006, 46, 736-742 4) Wang et al. J. Med. Chem. 2005, 48, 4111-4119 5) Whitehead et al. J.Comput. Chem.2003, 24, 512-529 Prediction Analysis – Right Panel Descriptor quantified database in .txt space separated format Descriptor databases for each molecule are exported and merged into a final space seperated .txt file. This file is used as input to the MIRank modeling algorithm, implemented in MATLAB.

C ONCLUSIONS

C ONCLUSIONS

Presentation Transcript

c = - 30 c = - 38 c = - 22 c = 20

C ONCLUSIONS Results from this exploratory study indicate that successful treatment of

C reation C ovenant C onquest ‘ C’ ingdom C ollapse/Restoration C hrist C hurch C ulmination

C hange, C hange, C hange, C hange

C/C

Make Inferences and Draw C onclusions

C onclusions

C C

C c

Low-C High-C Low-C High-C

C hallenged C ommissioned C ompelled C ompensated

C, C++

C, C++

C = 7 C = 10 C = 8 C = - 10 C = - 8 C = - 7

C C C Bible Study

C ontent C overage C urrentness C ompleteness C onsistency C orrectness C ompatibility