ISSUES IN THE DESIGN AND ANALYSIS OF COMPUTER EXPERIMENTS


Presentation Transcript


  1. ISSUES IN THE DESIGN AND ANALYSIS OF COMPUTER EXPERIMENTS. David M. Steinberg, Tel Aviv University. SAMSI Working Group, March 2007.

  2. COLLABORATORS: Dennis Lin, Dizza Bursztyn, Ron Kenett, Henry Wynn, Ron Bates, Sigal Levy, Einat Neuman Ben Ari, Gideon Leonard, Tamir Reisin, Eyal Hashavia, Zeev Somer. THANK YOUS: Noga Alon, Ronit Steinberg.

  3. PREVIEW
     • Some Applications
       • Nuclear Waste Repository
       • Ground Response to an Earthquake
       • Chemotherapy Simulator
       • Optimizing a Piston
     • Designing Computer Experiments
       • Latin Hypercube Designs
       • Rotated Factorial Designs
       • LHD’s as Rotated Factorial Designs
       • Near LHD’s from Rotated Factorials
     • Nuclear Waste Disposal: Quandaries
     • Chemotherapy: Quandaries
     • Ground Shaking: Quandaries
     • GASP Models and Bayesian Regression

  4. Example: Nuclear Waste Repository. RESRAD computes leaching of radioactive isotopes from the repository into the food and water supply. The time frame is thousands of years, so a field study is impossible.

  5. Example: Nuclear Waste Repository
     • Inputs
       • Initial isotope concentrations
       • Distribution coefficients of the isotopes
       • Lithology of the repository
     • Outputs
       • Maximal dose during 10,000 years

  6. Example: Ground Shaking. What will be the ground response to an earthquake? An engineering simulator uses a finite element scheme to simulate ground motion. Shaking of the bedrock generates surface motion. We wish to study the output from the program to aid earthquake preparedness plans.

  7. Example: Ground Shaking
     • Inputs
       • Geometry of the ground surface
       • Layers of hard/soft soil below the surface
       • Shear velocity, density, elasticity of the soil in each layer
       • Amplitude and spectrum at bedrock
     • Outputs
       • Displacement along the surface
       • Acceleration along the surface

  8. Example: Chemotherapy Simulator. What is the effect of chemotherapy treatment? The treatment affects both cancerous and healthy cells in the body. The goal is to develop treatment protocols that will put the cancer into remission with minimal damage to healthy cells.

  9. Example: Chemotherapy Simulator
     • Inputs
       • Treatment protocol: dosage and timing
       • Rate of drug decay
       • Rate of cell death
       • Rate of cell regeneration
     • Outputs
       • Number of healthy and malignant cells, as a fraction of the initial count

  10. Example: Piston Performance. The piston simulator was written by Kenett and Zacks as a teaching tool for their textbook. The simulator describes the cycle time of a piston and is based on the physics governing the piston. Variation in output is related to tolerances in the inputs. The goal was to achieve a target cycle time with minimal variation.

  11. Example: Piston Performance
      • Output
        • Cycle time

  12. Latin Hypercube Designs. Latin hypercubes are the most popular class of experimental plans for computer experiments. LHD’s place the input levels for each factor on a uniform grid, then “mate” the levels across factors by randomly permuting the column for each factor. Reference: McKay, Beckman and Conover, Technometrics, 1979.
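
The construction on this slide can be sketched in a few lines of Python. The function name `latin_hypercube`, the choice of cell midpoints as the grid, and the scaling to (0, 1) are illustrative assumptions, not details taken from the talk.

```python
import random

def latin_hypercube(n_runs, n_factors, seed=None):
    """Return an n_runs x n_factors Latin hypercube design on (0, 1)."""
    rng = random.Random(seed)
    # Uniform grid of levels: the midpoints of n_runs equal-width cells (assumed).
    levels = [(i + 0.5) / n_runs for i in range(n_runs)]
    columns = []
    for _ in range(n_factors):
        column = levels[:]     # every factor uses every level exactly once
        rng.shuffle(column)    # "mate" the factors by a random permutation
        columns.append(column)
    # Transpose so that each row is one run of the simulator.
    return [list(run) for run in zip(*columns)]

design = latin_hypercube(n_runs=8, n_factors=3, seed=1)
for run in design:
    print(run)
```

Each column is a permutation of the same 8 levels, so every one-dimensional projection of the design is uniform on the grid, whatever the random mating turns out to be.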

  13. Latin Hypercube Designs. Example of a Latin Hypercube design for 3 factors.

  14. Latin Hypercube Designs. Some 2-factor projections from a 250-run LHD.

  15. Latin Hypercube Designs. Other mating schemes have been suggested to obtain columns with low correlation. Ye showed how to get 2m-2 fully orthogonal columns with 2^m runs. Butler showed how to get orthogonality with respect to a trigonometric regression model and 2m runs. How many orthogonal columns are possible?

  16. Rotated Factorial Designs. Bursztyn and Steinberg developed experimental plans with many levels in which linear effects are orthogonal. Start with a “standard” first-order orthogonal design D, like a 2^(k-p) fractional factorial. “Rotate” the design using a rotation matrix R: D → DR. Then (DR)’(DR) = R’D’DR = nR’R = nI.
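
The identity (DR)’(DR) = nI can be checked numerically. The sketch below is only an illustration: the 2^2 full factorial, the angle `theta = 0.3`, and the helper names `matmul` and `transpose` are assumptions chosen for brevity, not the designs used in the talk.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# D: 2^2 full factorial in -1/+1 coding (n = 4 runs, first-order orthogonal).
D = [[-1, -1], [1, -1], [-1, 1], [1, 1]]

theta = 0.3  # arbitrary illustrative rotation angle (assumption)
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

DR = matmul(D, R)                  # rotated design
gram = matmul(transpose(DR), DR)   # equals n * I = 4 * I up to rounding

for row in gram:
    print([round(v, 12) for v in row])
```

For a generic angle each rotated column takes four distinct levels instead of two, while the columns stay orthogonal.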

  17. LHD’s as Rotated Factorial Designs. Steinberg and Lin showed how to rotate two-level factorials into Latin Hypercube designs with a large number of first-order orthogonal columns. This work combines a rotation idea in Bursztyn and Steinberg with another rotation idea developed by Lin and Beattie.

  18. LHD’s as Rotated Factorial Designs
      • Lin and Beattie: rotate 2^k factorials to Latin Hypercube designs. The intuition:
        • Each column of an LHD is an arithmetic sequence, in some order.
        • Each entry in a column of DR is a linear combination of the entries in a row of D (the 2^k design).
        • The rows of D are a binary expansion of the odd integers.
        • Using appropriate powers of 2 as the elements in R, each column in DR is an integer sequence.
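
The intuition above can be illustrated for a single column. This is a small sketch assuming -1/+1 coding and k = 3; the variable names are illustrative.

```python
from itertools import product

k = 3
# Rows of the 2^k full factorial in -1/+1 coding.
D = list(product([-1, 1], repeat=k))

# Powers of 2 as weights: 1, 2, 4.
weights = [2 ** i for i in range(k)]

# One column of DR: the weighted sum of each row of D.
column = [sum(w * d for w, d in zip(weights, row)) for row in D]

print(sorted(column))  # [-7, -5, -3, -1, 1, 3, 5, 7]
```

Every odd integer between -(2^k - 1) and 2^k - 1 appears exactly once, so the column is equally spaced, which is the Latin hypercube property for that column.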

  19. LHD’s as Rotated Factorial Designs

  20. LHD’s as Rotated Factorial Designs. Weights

  21. LHD’s as Rotated Factorial Designs. Weights; Weighted Sums

  22. LHD’s as Rotated Factorial Designs
      • Lin and Beattie: rotate 2^k factorials to Latin Hypercube designs.
      • Can we organize weights for multiple columns in a rotation matrix R?
      • Yes – provided R is t by t, where t is a power of 2.
      • A simple recursive scheme gives the rotation matrices.
      • Original proposal limited to full factorial designs 2^k, where k is a power of 2.
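
The slides do not spell out the recursion. The sketch below uses one doubling scheme that has the stated properties (t a power of 2, powers of 2 as weights, orthogonal columns); treat the scheme and the names `rotation` and `matmul` as assumptions rather than a transcription of the authors' matrices. The matrix is left unnormalised, so it is a rotation only up to a constant scale factor.

```python
from itertools import product

def rotation(c):
    """Unnormalised t x t matrix, t = 2**c, built by doubling (assumed scheme)."""
    V = [[1]]
    for step in range(1, c + 1):
        s = 2 ** (2 ** (step - 1))   # shifts the existing weights to higher powers of 2
        top = [row + [-s * v for v in row] for row in V]
        bottom = [[s * v for v in row] + row for row in V]
        V = top + bottom
    return V

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

c = 2
t = 2 ** c
V = rotation(c)                                         # 4 x 4, entries +-1, +-2, +-4, +-8
D = [list(row) for row in product([-1, 1], repeat=t)]   # 2^4 full factorial, 16 runs
X = matmul(D, V)                                        # rotated design, 16 x 4

for row in V:
    print(row)
```

With these choices each column of `X` contains the odd integers from -15 to 15 exactly once, and all pairs of columns have zero inner product, so the 16-run design is a Latin hypercube with four orthogonal columns.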

  23. LHD’s as Rotated Factorial Designs. Lin and Beattie: rotate 2^k factorials to Latin Hypercube designs.

  24. LHD’s as Rotated Factorial Designs. Bursztyn and Steinberg showed that fractional factorial designs can also be rotated. First, the design must be decomposed into sets of factors, each of which is a full factorial.

  25. LHD’s as Rotated Factorial Designs. Steinberg and Lin combine the Bursztyn & Steinberg rotation with the Lin & Beattie rotation. The resulting design is an orthogonal Latin hypercube.

  26. LHD’s as Rotated Factorial Designs. The construction requires that each set of columns be a full factorial design. Suppose we start with a saturated fractional factorial with 2^m runs. How can we “group” the columns to achieve the maximum number of full factorials?

  27. LHD’s as Rotated Factorial Designs. We can order the columns so that each set of m consecutive columns is a full factorial.
      1. Identify the columns as the non-zero points in GF(2^m).
      2. All non-zero points (hence all columns) can be obtained as x^j mod p(x), where p(x) is a primitive polynomial of GF(2^m).
      3. Order the columns by the order of the powers.
      4. A set of m consecutive columns is not a full factorial if it has a linear dependency. It is easy to show that this implies a linear dependency in the first m columns. Since the first m columns are the main effects, which have no dependency, every such set is a full factorial.

  28. LHD’s as Rotated Factorial Designs
      1. Identify the columns as the non-zero points in GF(2^m), the Galois Field of binary vectors of length m.
         • The column of 1’s is matched with (0,0,…,0).
         • The column for A is matched with (1,0,…,0).
         • The column for B is matched with (0,1,0,…,0).
         • The column for AB is matched with (1,1,0,…,0).
         • In general, the column for any interaction is matched with a vector with 1’s marking the factors involved in the interaction.

  29. LHD’s as Rotated Factorial Designs
      1. Identify the columns as the non-zero points in GF(2^m), the Galois Field of binary vectors of length m.
         • Each binary vector is used to represent a polynomial with binary coefficients.
         • AC ↔ (1,0,1,0,0,0) ↔ 1 + x^2
         • BDF ↔ (0,1,0,1,0,1) ↔ x + x^3 + x^5

  30. LHD’s as Rotated Factorial Designs
      2. All non-zero points (hence all columns) can be obtained as x^j mod p(x), where p(x) is a primitive polynomial of GF(2^m).
      GF theory: there exists a primitive polynomial, p(x), that can be used to generate all the non-zero polynomials in GF(2^m). The primitive polynomial is a binary polynomial of degree m. Recall that m is the number of factors, so we want to generate all polynomials of degree m-1 or less. All calculations are carried out modulo 2.

  31. LHD’s as Rotated Factorial Designs
      2. All non-zero points (hence all columns) can be obtained as x^j mod p(x), where p(x) is a primitive polynomial of GF(2^m).
      For example, with m=4, a primitive polynomial is 1 + x + x^4.
         x^0 ≡ 1 (A)
         x^1 ≡ x (B)
         x^2 ≡ x^2 (C)
         x^3 ≡ x^3 (D)
         x^4 ≡ 1 + x (AB)
         x^5 ≡ x + x^2 (BC)
         etc.
      If we continue, we find all the non-zero polynomials. Every set of m successive columns is a full factorial.
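
The m=4 example can be reproduced with integer bitmasks, where bit i holds the coefficient of x^i and also marks factor A, B, C, ... as on the earlier slides. The function names `powers_of_x`, `label`, and `gf2_rank` are illustrative, and the rank check over GF(2) is one assumed way to test whether a set of columns forms a full factorial.

```python
M = 4
PRIMITIVE = 0b10011   # bit i is the coefficient of x^i, so this is 1 + x + x^4

def powers_of_x(m, poly):
    """Return x^0, x^1, ..., x^(2^m - 2) reduced mod poly, as bitmasks."""
    out, current = [], 1
    for _ in range(2 ** m - 1):
        out.append(current)
        current <<= 1                  # multiply by x
        if (current >> m) & 1:         # degree reached m: reduce mod p(x)
            current ^= poly
    return out

def label(mask):
    """Factor letters for a bitmask: bit 0 -> A, bit 1 -> B, ..."""
    return "".join(chr(ord("A") + i) for i in range(mask.bit_length()) if (mask >> i) & 1)

def gf2_rank(vectors):
    """Rank over GF(2) of a list of bitmask vectors, by elimination on the top bit."""
    pivots = {}                        # highest set bit -> stored basis vector
    for v in vectors:
        while v:
            high = v.bit_length() - 1
            if high not in pivots:
                pivots[high] = v
                break
            v ^= pivots[high]
    return len(pivots)

columns = powers_of_x(M, PRIMITIVE)
labels = [label(c) for c in columns]
print(labels)

# Every window of M consecutive columns should have full rank M.
windows_ok = all(
    gf2_rank(columns[j:j + M]) == M for j in range(len(columns) - M + 1)
)
print(windows_ok)
```

The first six labels come out as A, B, C, D, AB, BC, matching the slide, and all 15 non-zero points appear before the sequence repeats.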

  32. LHD’s as Rotated Factorial Designs. The rotated designs are a special class of Latin Hypercubes with an external orthogonal array structure (U-designs). For each pair of columns, ¼ of all the points are in each quadrant. For many pairs, finer divisions hold.

  33. LHD’s as Rotated Factorial Designs. Some 2-factor projections from the design of the ground-shaking study.

  34. LHD’s as Rotated Factorial Designs. Points may “clump” in low-dimensional projections. In high dimensions, points do not clump. The rotation is isometric, so the inter-point distances are like those in the original factorial, except for “shrinking” the final design back to a hypercube.

  35. LHD’s as Rotated Factorial Designs. Steinberg and Lin show that these rotated designs have good statistical properties as screening designs. Main effects have low aliasing with second order effects (by comparison with randomly mated LHC designs or randomly chosen U-designs).

  36. LHD’s as Rotated Factorial Designs. Suppose you use the design to fit a simple first-order regression model, to “screen” the most influential factors: Y = Xβ + ε. But the true dependence involves additional regression terms: Y = Xβ + Zγ. Then β-hat = β + (X’X)^(-1) X’Zγ = β + Aγ. The matrix A = (X’X)^(-1) X’Z is known as the alias matrix.
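
As a small worked case of the alias matrix, the sketch below computes A = (X’X)^(-1) X’Z for a 4-run half fraction with generator C = AB, with a first-order model in X and the two-factor interactions in Z. The design and the helper names `solve`, `matmul`, and `transpose` are illustrative assumptions; this is not one of the 16-run designs compared in the talk.

```python
from fractions import Fraction
from itertools import product

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    aug = [[Fraction(v) for v in ra] + [Fraction(v) for v in rb] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Half fraction 2^(3-1) with generator C = AB (4 runs).
runs = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]
X = [[1, a, b, c] for a, b, c in runs]             # intercept and main effects A, B, C
Z = [[a * b, a * c, b * c] for a, b, c in runs]    # extra terms AB, AC, BC

Xt = transpose(X)
A = solve(matmul(Xt, X), matmul(Xt, Z))            # alias matrix (X'X)^(-1) X'Z

for name, row in zip(["1", "A", "B", "C"], A):
    print(name, [int(v) for v in row])
```

Rows are the fitted terms (1, A, B, C) and columns are the omitted terms (AB, AC, BC). The entries equal to 1 show complete aliasing: A with BC, B with AC, and C with AB, which is the worst case a screening design tries to avoid.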

  37. LHD’s as Rotated Factorial Designs. The alias matrix depends on the design, the model used for screening, and the extra terms in Z. A good screening design should have small values in A for simple screening models and somewhat more complex extra terms. Reference: Bursztyn and Steinberg, JSPI (2006), 1103-1119.

  38. LHD’s as Rotated Factorial Designs. We compared 16-run, 12-factor designs, with a first-order screening model and extra terms of second order. The alternatives: a standard LHD (best of 100 random choices) and an OA-based LHD (best of 100 random choices).

  39. LHD’s as Rotated Factorial Designs. The percent of entries in A that were < 0.1:

  40. LHD’s as Rotated Factorial Designs. For the standard and OA-based LHD’s, the results shown are the best found for 100 random designs. For the orthogonal LHD, all non-isomorphic groupings of columns into 3 sets of 4 columns were found. Results were very similar for all groupings.

  41. LHD’s as Rotated Factorial Designs. A design with n/2 columns, all orthogonal to each other and to all possible second-order effects, can be constructed using the same ideas. The trick is in the choice of the starting design. We rotate the resolution IV “foldover” design. The rotation preserves the foldover property and that, in turn, guarantees the orthogonality properties. The GF(2^m) structure again provides a way to group the columns into full factorials.

  42. Near LHD’s from Rotated Factorials. Orthogonal designs that are nearly LHD’s can be obtained by rotating other base designs. Example: use the 48-run Plackett-Burman design as the base design. Rotate 40 factors in 5 groups of 8. The rotated design has all columns orthogonal. It is also a U-design. It is nearly a Latin Hypercube.

  43. Near LHD’s from Rotated Factorials. Below is a q-q plot for one of the factors against a uniform distribution.

  44. Nuclear Waste Repository: Quandaries
      • Main goal is to assess which input factors have greatest influence on output: Sensitivity Analysis.
      • For example: given a proposed site, which factors should be measured?
      • Output data are highly skewed, with many 0’s (configurations with no leaching into the drinking water).
      • What is the best way to summarize the results?

  45. RESRAD
      • RESRAD is a computer model designed to estimate radiation doses and risks from RESidual RADioactive materials.
      • RESRAD simulates radiation doses and cancer risks for a variety of pathways in the environment (e.g. drinking water, food chain, atmosphere).
      Developed at Argonne National Laboratory. http://web.ead.anl.gov/resrad/

  46. RESRAD
      • Number of input parameters can reach hundreds.
      • Most parameters are difficult/expensive to measure or control and are subject to wide ranges of uncertainty.

  47. RESRAD. Typical RESRAD output.

  48. Our Case Study
      • Twenty-seven input parameters.
      • Initial radionuclide is U238 buried at a depth of 2 meters.
      • Lithology is one-dimensional, with contaminated, unsaturated and saturated layers above groundwater.

  49. Our Case Study
      • Wide uncertainties for inputs.
      • Many have log-normal distributions as a reflection of scientific uncertainty.
      • The distribution coefficients for U234 and U238 should be identical.
      • Outcome: maximal annual dose during 10,000 years.

  50. Our Case Study. Use RESRAD’s built-in capability for sensitivity analysis. Options include:
      • One-factor-at-a-time analysis.
      • Random samples of input settings.
      • Latin Hypercube samples.
      • Different input parameter distributions (e.g. uniform, normal, log-normal).
      • Specified rank correlations of inputs.
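
To make the combination of Latin Hypercube sampling with a log-normal input concrete, here is a generic sketch of the standard inverse-CDF approach for a single input. It is not RESRAD's implementation; the function name `lhs_lognormal` and the parameters `mu` and `sigma` (mean and standard deviation on the log scale) are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist

def lhs_lognormal(n_runs, mu, sigma, seed=None):
    """Latin hypercube sample of one log-normal input (hypothetical helper)."""
    rng = random.Random(seed)
    eps = 1e-12  # keeps probabilities strictly inside (0, 1) for inv_cdf
    # One uniform draw inside each of n_runs equal-probability strata.
    probs = [(i + eps + (1 - 2 * eps) * rng.random()) / n_runs for i in range(n_runs)]
    rng.shuffle(probs)  # random order, so the column can be mated with other inputs
    normal = NormalDist(mu, sigma)
    # Map each probability through the normal quantile function, then exponentiate.
    return [math.exp(normal.inv_cdf(p)) for p in probs]

sample = lhs_lognormal(n_runs=10, mu=0.0, sigma=1.0, seed=0)
print([round(x, 3) for x in sample])
```

Each of the 10 equal-probability intervals of the log-normal distribution contributes exactly one value, so the skewed tail is represented even in a small sample.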
