# GASP Models and Bayesian Regression
##### Presentation Transcript

1. GASP Models and Bayesian Regression David M. Steinberg Dizza Bursztyn Tel Aviv University Ashkelon College SAMSI March 2007

2. PREVIEW

- GASP: The Random Field Regression Model
- The RFR Model and Bayesian Regression
- RFR to Bayes – What is the Model?
- Example 1: A Simple One-Factor Model
- Example 2: Nuclear Waste Repository
- From Bayes to RFR
- Conclusions

3. GASP: The Random Field Regression Model What kind of model should be used for data from computer experiments? We need to consider:

- attention to bias, not to variance;
- nonlinear effects and interactions;
- high-dimensional inputs.

The RFR model is one possible solution.

4. The Random Field Regression Model Also known as the kriging model, from its roots in geostatistics, or GASP, for Gaussian stochastic process.

5. The Random Field Regression Model Let y denote a response and x a vector of factor settings or covariates. Treat y as the realization of a random field with a fixed regression component:

y(x) = β₀ + Σⱼ βⱼfⱼ(x) + ε(x)

The regression part is often limited to just the constant term.

6. The Random Field Regression Model The random field ε(x) is used to represent the departure of the true response function from the regression model. Typical assumptions:

E{ε(x)} = 0

E{ε(x₁)ε(x₂)} = C(x₁,x₂) = σ²R(x₁,x₂)

7. The Random Field Regression Model

- We can estimate the response at a new input site using the Best Linear Unbiased Predictor (BLUP).
- The estimator is also the posterior mean if we assume that all random terms have normal distributions.
- The estimator is much more flexible than the standard regression model; it smoothly interpolates the output data.
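As a small illustration (not the talk's implementation), the BLUP for the constant-mean case can be sketched in a few lines of NumPy; the Gaussian correlation and the scale θ = 5 are illustrative choices:

```python
import numpy as np

def gauss_corr(a, b, theta=5.0):
    """Gaussian correlation R(x1, x2) = exp(-theta * (x1 - x2)^2)."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def rfr_predict(x_train, y_train, x_new, theta=5.0):
    """BLUP / posterior mean for a constant-mean RFR model."""
    R = gauss_corr(x_train, x_train, theta)
    R_inv = np.linalg.inv(R)
    one = np.ones_like(x_train)
    beta0 = (one @ R_inv @ y_train) / (one @ R_inv @ one)  # GLS estimate of the constant
    r = gauss_corr(x_new, x_train, theta)                  # correlations with the design
    return beta0 + r @ R_inv @ (y_train - beta0)

x = np.linspace(-1.0, 1.0, 8)
y = np.sin(2.0 * x)
yhat = rfr_predict(x, y, x)
# At the design points the predictor reproduces the output exactly:
# r equals the corresponding rows of R, so r @ R_inv picks out y - beta0.
```

This makes the interpolation property concrete: evaluated at a design point, the predictor returns the observed output without error.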

8. The Random Field Regression Model Typically the correlation function R includes parameters that can be estimated by maximum likelihood or by cross-validation. One popular recommendation:

R(x₁,x₂) = Πⱼ exp{ −θⱼ |x₁,ⱼ − x₂,ⱼ|^p(j) }
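A minimal sketch of this product correlation, assuming a design matrix X with one row per run; the θⱼ and p(j) values below are illustrative, not estimated:

```python
import numpy as np

def power_exp_corr(X, theta, p):
    """Power exponential correlation: for rows x1, x2 of X,
    R(x1, x2) = prod_j exp(-theta_j * |x1_j - x2_j| ** p_j)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])       # pairwise |x1_j - x2_j|, shape (n, n, d)
    return np.exp(-np.sum(theta * diff ** p, axis=-1)) # product of exps = exp of sum

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5, 3))               # 5 runs, 3 factors (illustrative)
R = power_exp_corr(X, theta=np.array([1.0, 0.5, 2.0]), p=np.array([2.0, 2.0, 1.5]))
# R is symmetric with unit diagonal, as a correlation matrix must be.
```

Each exponent p(j) controls smoothness in factor j (p = 2 gives the Gaussian case), while θⱼ controls how quickly correlation decays along that factor.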

9. The Random Field Regression Model An example from Welch et al., Technometrics, 1992.

10. The Random Field Regression Model The RFR model has been found to generate good predictors in applications. But it…

- is difficult to interpret;
- does not relate to "classical" models;
- does not make clear "what it does to the data".

11. RFR Model and Bayesian Regression We will show that the RFR model can be understood as a Bayesian regression model. Suppose we want to represent the response y using a regression model:

y(x) = β₀ + Σⱼ βⱼfⱼ(x)

12. RFR Model and Bayesian Regression Take a Bayesian view and assign priors to the coefficients. Assign a vague prior to the constant. Assume that the remaining terms are independent, with βⱼ ~ N(0, σⱼ²).

13. RFR Model and Bayesian Regression We now have

y(x) = β₀ + Σⱼ βⱼfⱼ(x) = β₀ + ε(x)

14. RFR Model and Bayesian Regression The term ε(x) is a random field whose distribution is induced by the prior assumptions on the regression coefficients:

E{ε(x)} = 0

E{ε(x₁)ε(x₂)} = C(x₁,x₂) = Σⱼ σⱼ² fⱼ(x₁)fⱼ(x₂)
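The induced covariance can be checked numerically. The sketch below uses an illustrative basis and prior variances (not from the talk), builds C from the formula above, and compares it with the sample covariance of simulated draws of ε(x):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative basis functions f_j and prior variances sig_j^2 (not from the talk).
basis = [lambda x: x, lambda x: x**2, lambda x: np.cos(np.pi * x)]
sig2 = np.array([4.0, 2.0, 1.0])

x = np.linspace(-1.0, 1.0, 7)
F = np.column_stack([f(x) for f in basis])   # f_j evaluated at the design points

# C(x1, x2) = sum_j sig_j^2 f_j(x1) f_j(x2), in matrix form:
C_exact = F @ np.diag(sig2) @ F.T

# Monte Carlo check: draw beta_j ~ N(0, sig_j^2), form eps(x) = sum_j beta_j f_j(x).
beta = rng.normal(0.0, np.sqrt(sig2), size=(200_000, 3))
eps = beta @ F.T
C_mc = np.cov(eps, rowvar=False)
# C_mc agrees with C_exact up to Monte Carlo error.
```

The simulation makes the slide's point tangible: the "random field" is nothing more than the prior uncertainty in the regression coefficients, propagated through the basis functions.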

15. RFR Model and Bayesian Regression The RFR model is equivalent to a Bayesian regression model. The number of regression functions can be as large as we desire, even a full series expansion.

16. RFR Model and Bayesian Regression The importance of each regression function in the Bayesian model is reflected by its prior variance: important terms are assigned large variances. A regression component in the RFR model corresponds to assigning diffuse priors to the appropriate coefficients (i.e., giving them "infinite" prior variances); those terms are then left out of the random field component.

17. RFR to Bayes – What is the Model? Suppose we fit an RFR model to data from a computer experiment. Can we find an associated Bayesian regression model? Finding the Bayes model may be helpful in understanding the RFR model.

18. RFR to Bayes – What is the Model? Some simple data analysis provides an answer. Our algorithm:

1. Compute the correlation matrix R(xᵢ,xⱼ) at all pairs of design points.
2. Compute the eigenvalues and eigenvectors of the correlation matrix.
3. For the leading eigenvalues, find out how the associated eigenvectors are related to the design factors.

19. Example 1: A One-factor Design Consider a computer experiment with just one factor. The design includes 50 points spread uniformly on the interval [−1,1]. The correlation function is estimated from the power exponential family:

R(x₁,x₂) = exp{ −0.05 |x₁ − x₂|² }
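The three-step algorithm above can be run on this example directly; a sketch, assuming NumPy:

```python
import numpy as np

# Example 1 re-created: 50 points uniform on [-1, 1], Gaussian correlation
# R(x1, x2) = exp(-0.05 * (x1 - x2)^2).
x = np.linspace(-1.0, 1.0, 50)
R = np.exp(-0.05 * (x[:, None] - x[None, :]) ** 2)

vals, vecs = np.linalg.eigh(R)                  # eigh returns ascending order
vals, vecs = vals[::-1], vecs[:, ::-1]          # sort eigenvalues in decreasing order

share = vals[:4] / vals.sum()
# The spectrum decays very fast: the leading eigenvector alone carries most of
# the trace, and plotting vecs[:, k] against x shows smooth, near-polynomial
# shapes (constant, then near-linear, quadratic, cubic) -- the implicit
# regression basis of this correlation function on this design.
```

This is exactly what the next four slides display: the first few eigenvectors, plotted against the input factor, look like low-order polynomial terms.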

20. Example 1: A One-factor Design The eight leading eigenvalues of the correlation matrix:

21. Example 1: A One-factor Design The first eigenvector, plotted against the input factor from our design:

22. Example 1: A One-factor Design The second eigenvector:

23. Example 1: A One-factor Design The third eigenvector:

24. Example 1: A One-factor Design The fourth eigenvector:

25. RFR to Bayes – What is the Model? Why does the algorithm work? Let Y denote the output vector. The Bayesian regression model says that

Y = β₀1 + β₁f₁ + … + β_T f_T,

where f₁, f₂, …, f_T are the columns of the regression matrix. Then Y ~ N(β₀1, C), where

C = Σⱼ σⱼ² fⱼfⱼ′.

26. RFR to Bayes – What is the Model? The algorithm merely reverses the logic. Given the correlation matrix, it identifies regression vectors and prior variances. The regression vectors depend on intrinsic properties of the correlation function and on the experimental design. For example, if the design "confounds" two effects, we might get a regression vector that is explained by either of the two or by a linear combination of them.

27. Example 2: Nuclear Waste Repository We included 26 input factors. The design is a 900-point Latin hypercube, generated automatically by RESRAD. Several pairs of factors should be equal to one another; RESRAD allowed us to enforce a 0.99 rank correlation between such pairs. Other pairs should be similar, and we used a 0.3 rank correlation for them.

28. Example 2: Nuclear Waste Repository The response is the maximal equivalent annual dose of radiation in the drinking water (in millirem) during a 10,000 year time window. IAEC standards stipulate that this dose should be at most 30 millirem. The goal is to identify factors that affect the outcome and should be subject to further study at a proposed repository site.

29. Example 2: Nuclear Waste Repository The output data show no leaching at all for more than 75% of the input vectors. When leaching does occur, the maximal annual dose has a highly skewed distribution:

30. Example 2: Nuclear Waste Repository We fitted an RFR model to the log of the maximal annual dose, using only those input vectors with an outcome of at least 0.1 (n=163). We selected the 8 strongest input factors as predictors. Most of these factors are also related to the presence/absence of leaching, so the design for the RFR model is no longer uniform in the input space.

31. Example 2: Nuclear Waste Repository Two of the strongest factors related to presence or absence of leaching.

32. Example 2: Nuclear Waste Repository An RFR model was fitted using the 8 strong factors with Brian Williams's PErK software. The power exponential correlation function was used.

33. Example 2: Nuclear Waste Repository The fitted model:

34. Example 2: Nuclear Waste Repository The eigenvalues:

35. Example 2: Nuclear Waste Repository The leading eigenvector versus Thickness:

36. Example 2: Nuclear Waste Repository The next 5 eigenvectors are almost linear functions of the 5 input factors with the largest scale parameters. Dominant factors in red.

37. Example 2: Nuclear Waste Repository Adding a few nonlinear effects increases the R² values to above 95%. The first vector has small quadratic effects of the first 3 factors. The 6th vector has clear nonlinear effects of factor 7 (Precipitation – low exponent in model).

38. Example 2: Nuclear Waste Repository The 7th eigenvector is not a linear function of the input factors. Adding second-order effects shows a strong quadratic effect of Precipitation.

39. Example 2: Nuclear Waste Repository Plot of the vector against Precipitation:

40. Example 2: Nuclear Waste Repository Regressing the eigenvector against a "tent function with a plateau" in Precipitation gives an R² of 89.9%. The remaining scatter is most closely related to a linear effect in factor 5 (Effective Porosity in the Saturated Zone) and a quadratic effect in factor 3 (Kd for U238 in the Saturated Zone).

41. Example 2: Nuclear Waste Repository The 8th eigenvector is not a linear function of the input factors. It can be largely explained by a linear term in Effective Porosity (Saturated Zone), a quadratic dependence on the Kd for U238 (Saturated Zone), the interaction of the last factor with Thickness, and nonlinear terms in Precipitation.

42. Example 2: Nuclear Waste Repository Plot vs. Effective Porosity (Saturated Zone); residuals vs. Precipitation. The outlier is in a "corner" of the Thickness by Kd projection.

43. Example 2: Nuclear Waste Repository We can also apply the idea "in reverse". Suppose there is a linear effect in one of the input factors. Is the effect a part of the RFR model?

44. Example 2: Nuclear Waste Repository Results from regressing linear effects on the 12 leading eigenvectors.

45. Example 2: Nuclear Waste Repository Results from regressing pure cubic effects on the 12 leading eigenvectors.

46. From Bayes to RFR The ideas here can also be used to derive covariance functions for RFR models: write down a Bayesian regression model, then compute the resulting covariance function.

47. From Bayes to RFR Example 1:

- Hermite polynomials.
- Decay to 0 away from the origin.
- Priors on the coefficients that shrink exponentially.
- Result: the power exponential family with all exponents equal to 2.

48. From Bayes to RFR Example 2:

- Fourier series.
- Priors on the coefficients that shrink polynomially.
- Result: the family of spline covariances.

49. Some Special Models Gaussian correlation and Hermite polynomials. Consider a single univariate term in this product.

50. The scaled Hermite polynomials are orthonormal with respect to the N(0,1) density. [The slide's definitions and assumptions appear as equations not reproduced in this transcript.]
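The Hermite example above can be verified numerically via Mehler's formula: with the orthonormal basis Heₙ/√n! and prior variances ρⁿ, the induced correlation works out to exp{−θ(x−y)²} with θ = ρ/(2(1−ρ²)). A sketch of that check (ρ = 0.5 and the evaluation points are illustrative choices):

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermeval

RHO, N_TERMS = 0.5, 40   # prior-variance decay rate; series truncation

def he(n, x):
    """Probabilists' Hermite polynomial He_n(x)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return hermeval(x, c)

def K(x, y):
    """Truncated kernel sum_n rho^n * He_n(x) He_n(y) / n!, i.e. the covariance
    induced by the orthonormal basis He_n / sqrt(n!) with prior variances rho^n."""
    return sum(RHO**n * he(n, x) * he(n, y) / factorial(n) for n in range(N_TERMS))

x, y = 0.4, -0.3
corr = K(x, y) / np.sqrt(K(x, x) * K(y, y))    # normalize covariance to correlation
theta = RHO / (2.0 * (1.0 - RHO**2))
gauss = np.exp(-theta * (x - y) ** 2)
# corr and gauss agree to high precision: the induced correlation is the
# power exponential family with all exponents equal to 2, as the slides state.
```

The geometric decay ρⁿ makes the truncation error negligible by n = 40, so the agreement is limited only by floating-point precision.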