

  1. Good Morning

  2. RESPONSE SURFACE METHODOLOGY (RSM) By Mariam MAHFOUZ

  3. Organization of the talk • 10h00 - 10h30: Part I • 10h30 - 10h45: Questions about Part I • 10h45 - 11h45: Part II • 11h45 - 12h00: Questions and discussion

  4. General Planning • Part I A - Introduction to the RSM method B - Techniques of the RSM method C - Terminology D - A review of the method of least squares • Part II A - Procedure to determine optimum conditions – steps of the RSM method B - Illustration of the method with an example

  5. Part I

  6. A – Introduction to the RSM method • The experimenter frequently faces the task of exploring the relationship between some response y and a number of predictor variables x = (x1, x2, … , xk)'. • We then speak of a twofold purpose: • To determine and quantify the relationship between the values of one or more measurable response variable(s) and the settings of a group of experimental factors presumed to affect the response(s), and • To find the settings of experimental factors that produce the best value or best set of values of the response(s).

  7. Drug manufacturing example Combinations of two drugs, each known to reduce blood pressure in humans, are to be studied. Clinical trials involve 100 high blood pressure patients. Each patient is given some predetermined combination of the two drugs. The purpose of administering the different combinations of the drugs to the individuals is to find the specific combination that gives the greatest reduction in the patient's blood pressure reading within some specified interval of time.

  8. B - Techniques of the Response Surface Methodology (RSM) Three principal techniques: • Setting up a series of experiments. • Determining a mathematical model that best fits the data collected. • Determining the optimal settings of the experimental factors that produce the optimum value of the response.

  9. C – Terminology • Factors • Response • The response function • The polynomial representation of a response surface • The predicted response function • The response surface • Contour representation of a response surface • The operability region and the experimental region

  10. Factors Factors are processing conditions or input variables whose values or settings can be controlled by the experimenter. Factors can be qualitative or quantitative. The specific factors studied in detail in this course are quantitative, and their levels are assumed to be fixed or controlled by the experimenter. Factors and their levels will be denoted by X1, X2, …, Xk, respectively.

  11. Response The response variable is the measured quantity whose value is assumed to be affected by changing the levels of the factors. The true value of the response is denoted by η. Because of experimental error, the observed value of the response Y differs from η. This difference from the true value is written as Y = η + ε, where ε denotes the experimental error.

  12. [Diagram: factors (input variables) → experiments → response (output variable)]

  13. Response function When we say that the value of the true response η depends upon the levels X1, X2, …, Xk of k quantitative factors, we are saying that there exists some function of these levels, η = f(X1, X2, …, Xk). The function f is called the true response function (unknown), and is assumed to be a continuous, smooth function of the Xi.

  14. The polynomial representation of a response surface Let us consider the response function η = f(X1) for a single factor. If f is a continuous, smooth function, then it is possible to represent it locally, to any required degree of approximation, with a Taylor series expansion about some arbitrary point X1,0: η = f(X1,0) + f'(X1,0)(X1 - X1,0) + (1/2) f''(X1,0)(X1 - X1,0)² + …, where f'(X1,0), f''(X1,0), … are, respectively, the first, second, … derivatives of f(X1) with respect to X1, evaluated at X1,0.

  15. This expansion reduces to a polynomial of the form η = β0 + β1X1 + β11X1² + …, where the coefficients β0, β1, β11, … are parameters which depend on X1,0 and the derivatives of f(X1) at X1,0. First-order model with one factor: η = β0 + β1X1. Second-order model with one factor: η = β0 + β1X1 + β11X1². The second-order model with two factors has the form (equation 1): η = β0 + β1X1 + β2X2 + β11X1² + β22X2² + β12X1X2. And so on …
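
To make the low-order polynomial models concrete, here is a minimal Python sketch (not part of the original slides); the coefficient values and the evaluation point are purely illustrative.

```python
# A minimal sketch: evaluating the first- and second-order one-factor models
# with made-up coefficients (10.0, 1.5, -0.2 are illustrative only).
def first_order(X1, b0, b1):
    return b0 + b1 * X1

def second_order(X1, b0, b1, b11):
    return b0 + b1 * X1 + b11 * X1 ** 2

print(first_order(2.0, 10.0, 1.5))         # 13.0
print(second_order(2.0, 10.0, 1.5, -0.2))  # 12.2
```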

  16. Predicted response function The structural form of f is usually unknown. The steps taken in obtaining the approximating model are as follows: A form of model equation is proposed. Some number of combinations of the levels X1, X2, …, Xk of the k factors are selected for use as the design. At each factor level combination chosen, one or more observations are collected.

  17. The observations are used to obtain estimates of the parameters in the proposed model. Tests are then performed. If the model is considered to be satisfactory, it can be used as a prediction equation. Let us assume the true response function is represented by equation 1. Estimates of the parameters β0, β1, … are obtained using the method of least squares.

  18. If these estimates, denoted by b0, b1, … respectively, are used instead of the unknown parameters0, 1, …, we obtain the prediction equation: where , called “Yhat”, denotes the predicted response value for given values of X1 and X2.
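
A minimal Python sketch of this prediction equation (not from the original slides); the estimates collected in b are hypothetical placeholders for values obtained from a real fit.

```python
import numpy as np

def design_row(X1, X2):
    # (1, X1, X2, X1^2, X2^2, X1*X2), matching the order of b below
    return np.array([1.0, X1, X2, X1 ** 2, X2 ** 2, X1 * X2])

b = np.array([50.0, 3.0, 2.0, -1.5, -1.0, 0.5])   # b0, b1, b2, b11, b22, b12
y_hat = design_row(1.0, -1.0) @ b                  # "Y-hat" at (X1, X2) = (1, -1)
print(y_hat)                                       # 48.0
```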

  19. The response surface With k factors, the response surface is a subset of (k+1)-dimensional Euclidean space, and has as equation Ŷ = f(x1, x2, …, xk), where the xi, i = 1, …, k, are called coded variables.

  20. Contour representation of a response surface Geometrically, each contour is a projection onto the x1x2 plane of a cross-section of the response surface made by a plane, parallel to the x1x2 plane, cutting through the surface.

  21. [Figure: the predicted response surface Ŷ and its contours over the x1x2 plane]

  22. Contour plotting is not limited to three-dimensional surfaces. The geometrical representation for two and three factors enables the general situation for k > 3 factors to be more readily understood, although such surfaces cannot be visualized geometrically.
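
A minimal Python/matplotlib sketch of a contour plot (not part of the original slides); the fitted surface used here is hypothetical, with illustrative coefficients in coded variables.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical fitted surface:
# Y-hat = 50 + 3*x1 + 2*x2 - 1.5*x1^2 - 1.0*x2^2 + 0.5*x1*x2
x1, x2 = np.meshgrid(np.linspace(-2, 2, 100), np.linspace(-2, 2, 100))
y_hat = 50 + 3 * x1 + 2 * x2 - 1.5 * x1 ** 2 - 1.0 * x2 ** 2 + 0.5 * x1 * x2

cs = plt.contour(x1, x2, y_hat, levels=10)
plt.clabel(cs)                     # label each contour with its Y-hat value
plt.xlabel("x1 (coded)")
plt.ylabel("x2 (coded)")
plt.title("Contours of the predicted response")
plt.show()
```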

  23. Operability and experimental regions The operability region O is the region in the factor space in which the experiments can be performed. The experimental region R is the limited region of interest, which is entirely contained within the operability region O.

  24. In most experimental programs, the design points are positioned inside or on the boundary of the region R. Typically R is defined as a cubical region or as a spherical region.
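
A minimal Python sketch of what such regions look like in coded variables (not from the original slides); the cube side of ±1 and the sphere radius √k are common conventions, stated here as assumptions.

```python
import numpy as np

def in_cuboidal(x):
    return bool(np.all(np.abs(x) <= 1.0))      # |x_i| <= 1 for every factor

def in_spherical(x):
    x = np.asarray(x, dtype=float)
    return bool(np.sum(x ** 2) <= len(x))      # ||x||^2 <= k (radius sqrt(k))

print(in_cuboidal([0.5, -0.9]), in_spherical([0.5, -0.9]))   # True True
```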

  25. D – A review of the method of least squares Let us assume provisionally that N observations of the response are expressible by means of the first-order model in k variables (Eq. 2): Yu = β0 + β1Xu1 + β2Xu2 + … + βkXuk + εu, u = 1, 2, …, N, where Yu denotes the observed response for the uth trial, Xui represents the level of factor i at the uth trial, β0 and βi are unknown parameters, εu represents the random error in Yu, and N is the number of observations (experiments).

  26. Assumptions made about the errors are: • Random errors εu have zero mean and common variance σ². • Random errors εu are mutually independent in the statistical sense. • For tests of significance (t- and F-statistics) and confidence interval estimation procedures, an additional assumption must be satisfied: the random errors εu are normally distributed.
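
A minimal Python sketch (not part of the original slides) that simulates data satisfying these assumptions for a first-order model with k = 2; the parameter values, N and the factor ranges are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma = 12, 1.0
beta = np.array([10.0, 2.0, -1.0])             # beta0, beta1, beta2 (illustrative)
X = np.column_stack([np.ones(N),
                     rng.uniform(-1, 1, N),
                     rng.uniform(-1, 1, N)])   # column of ones plus 2 factors
eps = rng.normal(0.0, sigma, N)                # independent, zero mean, common variance
Y = X @ beta + eps                             # observations under the first-order model
```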

  27. Parameter estimates and properties Remember the form of the first-order model in (Eq. 2). This equation can be expressed, in matrix notation, as Y = Xβ + ε, where Y is the N×1 vector of observations, X is the N×(k+1) matrix of the settings of the input variables (augmented by a column of ones), β is the (k+1)×1 vector of unknown parameters, and ε is the N×1 vector of random errors.

  28. The method of least squares selects as estimates for the unknown parameters in (Eq. 2) those values, b0, b1, …, bk respectively, which minimize the quantity R(β0, β1, …, βk) = Σu (Yu - β0 - β1Xu1 - … - βkXuk)² = (Y - Xβ)'(Y - Xβ).

  29. The parameter estimates b0, b1, …, bk which minimize R(β0, β1, …, βk) are the solutions to the (k+1) normal equations, which can be expressed, in matrix notation, as X'Xb = X'Y, where X' is the transpose of the matrix X and b = (b0, b1, …, bk)'.

  30. The matrix X is assumed to be of full column rank. Then b = (X'X)⁻¹X'Y, where (X'X)⁻¹ is the inverse of X'X. If the model used is correct, b is an unbiased estimator of β. The variance-covariance matrix of the vector of estimates, b, is Var(b) = σ²(X'X)⁻¹.
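
A minimal Python sketch of the least squares solution (not from the original slides), applied to simulated first-order data with hypothetical parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 12
X = np.column_stack([np.ones(N),
                     rng.uniform(-1, 1, N),
                     rng.uniform(-1, 1, N)])
Y = X @ np.array([10.0, 2.0, -1.0]) + rng.normal(0.0, 1.0, N)

XtX_inv = np.linalg.inv(X.T @ X)       # (X'X)^-1, X assumed of full column rank
b = XtX_inv @ X.T @ Y                  # b = (X'X)^-1 X'Y
print(b)                               # should be close to (10, 2, -1)
# Var(b) = sigma^2 (X'X)^-1; sigma^2 is later replaced by its estimate s^2.
```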

  31. Geometrically, how can the least squares method and the normal equations be explained? • e ⊥ Im(X) ⟺ ⟨Xj, e⟩ = 0 for every column Xj of X • ⟺ X'(Y - Xb) = 0 • ⟺ X'Xb = X'Y (the normal equations). Im(X) is the Euclidean subspace generated by the column vectors of the matrix X; the fitted vector Xb is the orthogonal projection of Y onto Im(X).

  32. Properties of the estimator of β It is easy to show that the least squares estimator, b, produces minimum variance estimates of the elements of β in the class of all linear unbiased estimators of β. As stated by the Gauss-Markov theorem, b is the best linear unbiased estimator (BLUE) of β.

  33. Predicted response values Let x' denote a 1×p vector whose elements correspond to the elements of a row of the matrix X. The expression for the predicted value of the response at a point x in the experimental region is Ŷ(x) = x'b. Hereafter we shall use the notation Ŷ(x) to denote the predicted value of Y at the point x. A measure of the precision of the prediction, defined as the variance of Ŷ(x), is expressed as Var(Ŷ(x)) = σ² x'(X'X)⁻¹x.
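
A minimal Python sketch of the prediction and its variance (not part of the original slides); the values of b, (X'X)⁻¹ and s² below are hypothetical stand-ins for quantities obtained from an actual fit.

```python
import numpy as np

b = np.array([10.1, 1.9, -1.2])
XtX_inv = np.array([[0.09, 0.00, 0.01],
                    [0.00, 0.30, 0.02],
                    [0.01, 0.02, 0.28]])
s2 = 0.9                                   # estimate of sigma^2

x = np.array([1.0, 0.5, -0.5])             # (1, X1, X2) at the point of interest
y_hat = x @ b                              # Y-hat(x) = x'b
var_y_hat = s2 * x @ XtX_inv @ x           # estimated Var(Y-hat(x)) = s^2 x'(X'X)^-1 x
print(y_hat, var_y_hat)
```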

  34. Estimation of σ² Let eu = Yu - Ŷu, u = 1, …, N = number of experiments; eu is called the uth residual. For the general case where the fitted model contains p parameters, the total number of observations is N (N > p), and the matrix X is assumed to be of full column rank, the estimate s² of σ² is computed from s² = SSE/(N - p), where SSE = Σu eu² is the sum of squared residuals. The divisor N - p is the number of degrees of freedom of the estimator s². When the true model is given by Y = Xβ + ε, then s² is an unbiased estimator of σ².
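
A minimal Python sketch of this estimate (not from the original slides); the helper name estimate_sigma2 is mine, chosen only for illustration.

```python
import numpy as np

def estimate_sigma2(X, Y, b):
    residuals = Y - X @ b                  # e_u = Y_u - Y-hat_u
    SSE = residuals @ residuals            # sum of squared residuals
    N, p = X.shape                         # N observations, p fitted parameters
    return SSE / (N - p)                   # unbiased for sigma^2 if the model holds
```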

  35. The analysis of variance table The entries in the ANOVA table represent measures of information concerning the separate sources of variation in the data. The total variation in a set of data is called the total sum of squares (SST): SST = Σu (Yu - Ȳ)², where Ȳ is the mean of Y. The total sum of squares can be partitioned into two parts: the sum of squares due to regression, SSR (or sum of squares explained by the fitted model), and the sum of squares not accounted for by the fitted model, SSE (or the sum of squares of the residuals): SSR = Σu (Ŷu - Ȳ)² and SSE = Σu (Yu - Ŷu)².

  36. If the fitted model contains p parameters, then the number of degrees of freedom associated with SSR is p - 1, and that associated with SSE is N - p. Short-cut formulas for SST, SSR, and SSE are possible using matrix notation. Letting 1' be a 1×N vector of ones, we have: SST = Y'Y - (1'Y)²/N, SSR = b'X'Y - (1'Y)²/N, SSE = Y'Y - b'X'Y. Note that SST = SSR + SSE.
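
A minimal Python sketch checking the short-cut formulas and the identity SST = SSR + SSE (not from the original slides); the simulated first-order data and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 12
X = np.column_stack([np.ones(N),
                     rng.uniform(-1, 1, N),
                     rng.uniform(-1, 1, N)])
Y = X @ np.array([10.0, 2.0, -1.0]) + rng.normal(0.0, 1.0, N)
b = np.linalg.solve(X.T @ X, X.T @ Y)

correction = Y.sum() ** 2 / N              # (1'Y)^2 / N
SST = Y @ Y - correction
SSR = b @ (X.T @ Y) - correction
SSE = Y @ Y - b @ (X.T @ Y)
print(np.isclose(SST, SSR + SSE))          # True
```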

  37. The usual test of the significance of the fitted regression equation is a test of the null hypothesis H0: "all values of βi (excluding β0) are zero". The alternative hypothesis is Ha: "at least one value of βi (excluding β0) is not zero". Assuming normality of the errors, the test of H0 involves first calculating the value of the F-statistic F = MSR/MSE, where MSR = SSR/(p - 1) is called the mean square regression and MSE = SSE/(N - p) is called the mean square residual.
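
A minimal Python sketch of the overall F test (not part of the original slides); the values of SSR, SSE, N and p are hypothetical numbers standing in for quantities computed from a real fit.

```python
from scipy.stats import f

SSR, SSE, N, p = 180.0, 9.5, 12, 3
MSR = SSR / (p - 1)                        # mean square regression
MSE = SSE / (N - p)                        # mean square residual
F = MSR / MSE
p_value = f.sf(F, p - 1, N - p)            # upper-tail probability of F(p-1, N-p)
print(F, p_value)
```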

  38. ANOVA table
Source of variation | Degrees of freedom | Sum of squares | Mean square | F
Regression (fitted model) | p - 1 | SSR | MSR = SSR/(p - 1) | MSR/MSE
Residual | N - p | SSE | MSE = SSE/(N - p) |
Total | N - 1 | SST | |

  39. Tests of hypotheses concerning the individual parameters in the model Let us denote the least squares estimate of βi by bi and the estimated standard error of bi by est.s.e.(bi). Then a test of the null hypothesis H0: βi = 0 is performed by calculating the value of the test statistic t = bi / est.s.e.(bi) and comparing the value of t against a tabled value from the Student t-table (with N - p degrees of freedom).
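
A minimal Python sketch of these individual t tests (not from the original slides); est.s.e.(bi) is computed as sqrt(s² [(X'X)⁻¹]ii), and the numerical values are hypothetical.

```python
import numpy as np
from scipy.stats import t

b = np.array([10.1, 1.9, -1.2])
XtX_inv_diag = np.array([0.09, 0.30, 0.28])    # diagonal of (X'X)^-1
s2, N, p = 0.9, 12, 3

se = np.sqrt(s2 * XtX_inv_diag)                # est.s.e.(b_i)
t_stats = b / se
p_values = 2 * t.sf(np.abs(t_stats), N - p)    # two-sided test, N - p df
print(t_stats, p_values)
```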

  40. Testing lack of fit of the fitted model using replicated observations An inadequacy of the model is due to either or both of the following causes: • Factors (other than those in the proposed model) that are omitted from the proposed model but which affect the response, and/or • The omission of higher-order terms involving the factors in the proposed model which are needed to adequately explain the behavior of the response.

  41. Detail Two conditions must be met regarding the collection (design) of the data values: • The number of distinct design points, n, must exceed p, the number of terms in the fitted model. • An estimate of the experimental error variance that does not depend on the form of the fitted model is required. This can be achieved by collecting at least two replicate observations at one or more of the design points and calculating the variation among the replicates at each such point.

  42. Detail The residual sum of squares, SSE, can be partitioned into two sources of variation: • the variation among the replicates at those design points where replicates are collected, • and the variation arising from the lack of fit of the fitted model.

  43. The sum of squares due to the replicate observations is called the sum of squares for pure error (abbreviated SSPE): SSPE = Σl Σu (Yul - Ȳl)², where Yul is the uth observation at the lth design point, u = 1, 2, …, rl, l = 1, 2, …, n, and Ȳl is the mean of the rl replicates at the lth point. The number of degrees of freedom associated with SSPE is Σl (rl - 1) = N - n, where N is the total number of observations.

  44. The sum of squares due to lack of fit is found by subtraction: SSLOF = SSE - SSPE. The number of degrees of freedom associated with SSLOF is obtained by (N - p) - (N - n) = n - p. An expanded analysis-of-variance table which displays the partitioning of the residual sum of squares is the following:
Source of variation | Degrees of freedom | Sum of squares | Mean square | F
Regression (fitted model) | p - 1 | SSR | MSR |
Residual | N - p | SSE | MSE |
  Lack of fit | n - p | SSLOF | MSLOF = SSLOF/(n - p) | MSLOF/MSPE
  Pure error | N - n | SSPE | MSPE = SSPE/(N - n) |
Total | N - 1 | SST | |
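
A minimal Python sketch of the partition SSE = SSLOF + SSPE (not part of the original slides), using a hypothetical one-factor, first-order fit with replicates at n = 3 distinct design points; all data values are made up.

```python
import numpy as np
from scipy.stats import f

design = [(-1.0, [8.1, 7.9]), (0.0, [10.2, 10.0, 9.8]), (1.0, [11.9, 12.3])]

X = np.array([[1.0, lvl] for lvl, obs in design for _ in obs])
Y = np.array([y for _, obs in design for y in obs])
b = np.linalg.solve(X.T @ X, X.T @ Y)
SSE = float(np.sum((Y - X @ b) ** 2))          # residual sum of squares

# Pure error: variation among the replicates at each distinct design point.
SSPE = sum(np.sum((np.array(obs) - np.mean(obs)) ** 2) for _, obs in design)
N, n, p = len(Y), len(design), X.shape[1]
SSLOF = SSE - SSPE                              # lack of fit, n - p df
F_lof = (SSLOF / (n - p)) / (SSPE / (N - n))
print(F_lof, f.sf(F_lof, n - p, N - n))         # lack-of-fit F statistic and p-value
```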

  45. Finally, some notes about the use of coded variables in the fitted model • The use of coded variables in place of the input variables facilitates the construction of experimental designs. • Coding removes the units of measurement of the input variables, and as such, distances measured along the axes of the coded variables in k-dimensional space are standardized (or defined in the same metric). • Other advantages of using coded variables rather than the original input variables when fitting polynomial models are: computational ease and increased accuracy in estimating the model coefficients, and enhanced interpretability of the coefficient estimates in the model.
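
A minimal Python sketch of such a coding (not from the original slides); the helper name code, the factor range and the example values are illustrative only.

```python
import numpy as np

def code(X, low, high):
    # Map the interval [low, high] of the input variable onto [-1, 1].
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (np.asarray(X, dtype=float) - center) / half_range

# A hypothetical factor varied between 100 and 200 (e.g. a temperature).
print(code([100, 150, 200], low=100, high=200))   # [-1.  0.  1.]
```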

  46. Thank you. Questions about Part I?
