TWO-LAYER QSPR MODEL FOR PREDICTION OF ORGANIC COMPOUNDS' AQUEOUS SOLUBILITY AT VARIOUS TEMPERATURES
Klimenko K. a), Ognichenko L. b), Polishchuk P. b), Novoselska N. a), Gorb L. c), Kuzmin V. a,b), Leszczynski J. d)
a) I. I. Mechnikov National University, Chemistry Department, Dvorianskaya 2, Odessa 65026, Ukraine, e-mail email@example.com
b) Department of Molecular Structure and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute, National Academy of Sciences of Ukraine, Lustdorfskaya Doroga 86, Odessa 65080, Ukraine
c) Badger Technical Services, LLC, Vicksburg, Mississippi, USA
d) Interdisciplinary Center for Nanotoxicity, Department of Chemistry, Jackson State University, Jackson, Mississippi, 39217, USA
Presented by: Klimenko K.
2013
3 Challenges of aqueous solubility determination
• Other factors that can affect solubility
• Pressure
• Solution equilibrium
• pH
• State of substance
• Methods for removing excess solute
• These factors are frequently not taken into account when solubility is determined. Moreover, there is no universally recognized experimental method, so reported solubility data can be inconsistent.
4 Temperature-solubility relationship: solubility temperature coefficient (kj). Example
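A minimal sketch of how a solubility temperature coefficient could be obtained, assuming that lg(x) is modelled as a straight line in temperature, lg(x) = a + k*t, and fitted by ordinary least squares. The function name and the data points are hypothetical illustrations, not values from the study.

```python
def fit_temperature_coefficient(temps_c, lg_x):
    """Least-squares fit of lg(x) = a + k*t.

    temps_c: temperatures in degrees Celsius
    lg_x:    log10 of solubility measured at those temperatures
    Returns (k, a): slope (temperature coefficient) and intercept.
    """
    n = len(temps_c)
    if n < 2 or n != len(lg_x):
        raise ValueError("need at least two paired observations")
    mean_t = sum(temps_c) / n
    mean_y = sum(lg_x) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    if sxx == 0:
        raise ValueError("temperatures must not all be equal")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(temps_c, lg_x))
    k = sxy / sxx              # slope: change of lg(x) per degree
    a = mean_y - k * mean_t    # intercept: extrapolated lg(x) at t = 0
    return k, a


# Made-up measurements for one hypothetical compound.
temps = [15, 25, 35, 45, 55]
lg_solubility = [-3.30, -3.20, -3.10, -3.00, -2.90]
k, a = fit_temperature_coefficient(temps, lg_solubility)
```

A positive k means solubility rises with temperature; the fit needs at least two distinct temperatures per compound.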
5 Assessment of regression equation fit
6 Two-layer QSPR approach for aqueous solubility model development
• Molecular descriptors
• QSPR of solubility temperature coefficient (kj)
• QSPR of aqueous solubility at 25 °C (lg(xj)25)
• Aqueous solubility prediction in range 0<t<100 °C: lg(xj)t = f(lg(xj)25, kj, t)
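The slide gives only the general form lg(xj)t = f(lg(xj)25, kj, t). As an illustration, the sketch below assumes the simplest such function, a linear shift from the 25 °C value: lg(x)_t = lg(x)_25 + k*(t - 25). This linear form and the function name are assumptions made for the example, not the authors' stated model.

```python
def predict_lg_x(lg_x_25, k, temp_c, ref_temp_c=25.0):
    """Estimate lg(x) at temp_c from the 25 degC value and the coefficient k.

    Assumes a linear dependence of lg(x) on temperature (illustrative only).
    """
    if not 0 < temp_c < 100:
        raise ValueError("temperature must lie in the range 0 < t < 100 degC")
    return lg_x_25 + k * (temp_c - ref_temp_c)


# Hypothetical inputs: lg(x) at 25 degC and a temperature coefficient.
lg_x_45 = predict_lg_x(lg_x_25=-3.2, k=0.01, temp_c=45)
```

At the reference temperature the correction term vanishes, so the function returns the 25 °C value unchanged.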
7 Feature net procedure for QSPR solubility model development
• Solubility temperature coefficient (kj) calculation from experimental data
• Generating Simplex descriptors
• QSPR model for coefficient (kj) prediction
• Prediction of kj value for all compounds in the set
• Calculation of descriptor kj(t-25) to account for the impact of the temperature factor
• QSPR solubility model, 0<t<100 °C
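The descriptor step of this procedure can be sketched as follows: the predicted coefficient is multiplied by (t - 25) and appended to a compound's structural descriptors, giving one row per temperature for the second-layer model. The dictionary layout, the key name "k_dt", and the numbers are hypothetical; the actual Simplex descriptor set is not reproduced here.

```python
def temperature_descriptor(k_pred, temp_c, ref_temp_c=25.0):
    """Return the temperature descriptor k*(t - 25) for one data point."""
    return k_pred * (temp_c - ref_temp_c)


def build_rows(structural_descriptors, k_pred, temps_c):
    """Create one descriptor row per temperature for a single compound.

    structural_descriptors: dict of descriptor name -> value (placeholder
    for the structural descriptors of the compound)
    k_pred: coefficient predicted by the first-layer model
    """
    rows = []
    for t in temps_c:
        row = dict(structural_descriptors)   # copy, keep the input intact
        row["k_dt"] = temperature_descriptor(k_pred, t)
        rows.append(row)
    return rows


# Hypothetical compound with two placeholder structural descriptors.
rows = build_rows({"d1": 1.0, "d2": 0.5}, k_pred=0.02, temps_c=[15, 25, 35])
```

The structural descriptors are identical across the rows of one compound; only the temperature descriptor changes, and it is zero at 25 °C.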
8 Statistical characteristics of QSPR models for solubility temperature coefficients. n = number of data points; T(1-5) = test sets
9 Obs. vs Pred. solubility coefficient plot
10 Statistical characteristics of feature net QSPR models for solubility in the temperature range 0<t<100 °C. m = number of compounds
11 Obs. vs Pred. solubility model plot
12 Distribution of prediction error for compounds with various molecular mass
13 Physicochemical parameters' relative influence on solubility in general model
14 Prediction of aqueous solubility for compounds from external test set (t = 25 °C, m = 28)
15 Prediction of aqueous solubility at different temperatures
Plot panels (CAS number, temperature range):
• 484-12-8, t = 15-55 °C
• 98634-28-7, t = 20-40 °C
• 75885-58-4, t = 15-55 °C
• 482-44-0, t = 22-63 °C
• 87-69-4, t = 15-65 °C
m = 5, k = 35, %acc.pred.comp = 75, %acc.pred.data points = 71.4
16 Conclusion
• SiRMS allows developing QSPR models that successfully predict aqueous solubility in the temperature range 0-100 °C.
• A linear regression equation best describes the dependence of the solubility logarithm on temperature. It is also useful for defining the solubility temperature coefficient.
• Electrostatics (25%) and lipophilicity (18%) have the largest impact on solubility. The temperature factor’s influence is also notable and equals 3%.
• Information derived from the 2D structure is sufficient for aqueous solubility prediction.