250 likes | 340 Vues
Principles in the design of multiphase experiments with a later laboratory phase: orthogonal designs. Chris Brien 1 , Bronwyn Harch 2 , Ray Correll 2 & Rosemary Bailey 3 1 University of South Australia, 2 CSIRO Mathematics, Informatics & Statistics , 3 Queen Mary University of London.
E N D
Principles in the design of multiphase experiments with a later laboratory phase: orthogonal designs • Chris Brien1, Bronwyn Harch2, Ray Correll2 & Rosemary Bailey3 • 1University of South Australia, 2CSIRO Mathematics, Informatics & Statistics, 3Queen Mary University of London http://chris.brien.name/multitier Chris.brien@unisa.edu.au
Outline Primary experimental design principles Factor-allocation description for standard designs. Principles for simple multiphase experiments. Principles leading to complications, even with orthogonality. Summary
1) Primary experimental design principles • Principle 1 (Evaluate designs with skeleton ANOVA tables) • Use whether or not data to be analyzed by ANOVA. • Principle 2 (Fundamentals): Use randomization, replication and blocking or local control. • Principle 3 (Minimize variance): Block entities to form new entities, within new entities being more homogeneous; assign treatments to least variable entity-type. • Principle 4 (Split units): confound some treatment sources with more variable sources if some treatment factors: • require larger units than others, • are expected to have a larger effect, or • are of less interest than others.
A standard agricultural example • 8 treatments — combinations of 4 harvest dates and 2 different fertilizers – are to be investigated for a plant crop. • During harvest the yield is to be measured. • Assume differences between harvest dates expected to be larger than between fertilizers. • A split-plot design is to be used: • It will have 3 blocks each containing 4 main plots, with each main plot split into 2 subplots; • Randomize 4 harvest dates to 4 main plots in a block and 2 fertilizers to 2 subplots in a main plot. • This design employs Principles 2, 3 and 4(ii).
2) Factor-allocation description for standard designs (Nelder, 1965; Brien, 1983; Brien & Bailey, 2006) • Standard designs involve a single allocation in which a set of treatments is assigned to a set of units: • treatments are whatever are allocated; • units are what treatments are allocated to; • treatments and units each referred to as a set of objects; • Often do by randomization using a permutation of the units. • More generally treatments are allocated to units e.g. using a spatial design or systematically • Each set of objects is indexed by a set of factors: • Unit or unallocated factors (indexing units); • Treatment or allocated factors (indexing treatments). • Represent the allocation using factor-allocation diagrams that have apanel for each set of objects with: • a list of the factors; their numbers of levels; their nesting relationships.
3 Blocks 4 Mainplotsin B 2 Subplotsin B, M 4 Harvests 2 Fertilizers 8 treatments 24 subplots Factor-allocation diagram for the standard agricultural experiment • One allocation (randomization): • a set of treatments to a set of subplots. • The set of factors belonging to a set of objects forms a tier: • {Harvests, Fertilizers} or {Blocks, Mainplots, Subplots}; • they have the same status in the allocation (randomization); • Textbook experiments are two-tiered. • Cf. single set of factors uniquely indexing observations: • {Blocks, Harvests, Fertilizers}. • A crucial feature of diagram is that it shows EU and restrictions on randomization/allocation. (e.g. Searle et al., 1992)
3 Blocks 4 Mainplotsin B 2 Subplotsin B, M 4 Harvests 2 Fertilizers 8 treatments 24 subplots Some derived items • Sets of generalized factors (terms in the mixed model): • Blocks, BlocksMainplots, BlocksMainplotsSubplots; • Harvests, Fertilizers, HarvestsFertilizers. • Corresponding types of entities (groupings of objects): • block, main plot, subplot (last two are EUs); • harvest, fertilizer, treatment (harvest-fertilizer combination). • Corresponding sources (in an ANOVA): • Blocks, Mainplots[B], Subplots[BM]; • Harvests, Fertilizers, Harvests#Fertilizers.
3 Blocks 4 Mainplotsin B 2 Subplotsin B, M 4 Harvests 2 Fertilizers 8 treatments 24 subplots Skeleton ANOVA Harvests is confounded with the more-variable Mainplots[B] & Fertilizers with Subplots[B^M].
3) Principles for simple multiphase experiments • Suppose in the agricultural experiment: • in addition to yield measured during harvest, • the amount of Na in the plant is to be measured for each subplot, and • the harvested material is transported to the laboratory for drying and analysis. • The experiment is two phase: field and laboratory phases. • The outcome of the field phase is yield and batches of plant material. • The outcome of the laboratory phase is the amount of Na. • How to process the specimens from the first phase in the laboratory phase?
Some principles • Principle 5 (Simplicity desirable): assign first-phase units to laboratory units so that each first-phase source is confounded with a single laboratory source. • Use composed randomizations with an orthogonal design. • Principle 6 (Preplan all): if possible. • Principle 7 (Allocate all and randomize in laboratory): always allocate all treatment and unit factors and randomize first-phase units and lab treatments. • Principle 8 (Big with big): • Confound big first-phase sources with big laboratory sources, provided no confounding of treatment with first-phase sources.
3 Blocks 4 Mainplotsin B 2 Subplotsin B, M 24 Locations 4 Harvests 2 Fertilizers 24 locations 8 treatments 24 subplots A simple two-phase agricultural experiment Composed randomizations (Brien & Bailey, 2006) Simplest is to randomize batches from a subplot to locations (in time or space) during the laboratory phase.
A simple two-phase agricultural experiment (cont’d) No. subplots = no. locations = 24 and so subplots sources exhaust locations sources. Cannot separately estimate locations and subplots variability, but can estimate their sum. However want to block the laboratory phase.
3 Blocks 4 Mainplots in B 2 Subplotsin B, M 3 Intervals 8 Locations in I 4 Harvests 2 Fertilizers 8 treatments 24 subplots 24 locations A simple two-phase agricultural experiment (cont’d) Composed randomizations Simplest is to align lab-phase and first-phase blocking. • Note Blocks confounded with Intervals (i.e. Big with Big).
The multiphase law • DF for first phase sources unaffected. DF for sources from a previous phase can never be increased as a result of the laboratory-phase design. However, it is possible that first-phase sources are split into two or more sources, each with fewer degrees of freedom than the original source.
4) Principles leading to complications, even with orthogonality • Principle 9 (Use pseudofactors): • An elegant way to split sources (as opposed to introducing grouping factors unconnected to real sources of variability). • Principle 10 (Compensating across phases): • Sometimes, if something is confounded with more variable first-phase source, can confound with less variable lab source. • Principle 11 (Laboratory replication): • Replicate laboratory analysis of first-phase units if lab variability much greater than 1st-phase variation; • Often involves splitting product from the first phase into portions (e.g. batches of harvested crop, wines, blood specimens into aliquots, drops, lots, samples and fractions). • Principle 12 (Laboratory treatments): • Sometimes treatments are introduced in the laboratory phase and this involves extra randomization.
A three-phase agricultural experiment with laboratory treatments Piephoet al. (2003) • Suppose that material from a harvest date cannot be stored and must be analysed before the next harvest. • Also, 3 different methods are to be used to measure Na. • For this, the material from each subplot is divided into 3 portions and the Methods randomized to them. • For each harvest, there are 18 portions to be dried and analysed. • There are three ovens to be used simultaneously for the drying: they differ in speed of drying and can only fit 6 portions for drying — blocks will be randomized to ovens.
4 Harvests 2 Fertilizers 4 Mainplots in B 3 Blocks 2 Subplotsin B, M 3 Portions in B, M, S 4 Dates 3 Ovens in D 6 Locations in D, O 8 field treatments 72 portions 72 locations A three-phase agricultural experiment with laboratory treatments – drying phase 3 Methods 3 lab treatments 4 M1 • Problem: 12 main plots to assign to 4 dates: • Use M1 to group the 4 main plots to be dried on the same date. • An alternative is to introduce a grouping factor, say MainGroups, • but, while in the analysis, not an anticipated variability source. • Dashed arrow indicates systematic allocation. • The blocks are randomized to ovens (Big with Big). • The 2 subplots x 3 portions in a main plot are randomized to the 6 locations in an oven on a date. • The methods are randomized to portions.
4 Harvests 2 Fertilizers 4 Mainplots in B 3 Blocks 2 Subplotsin B, M 3 Portions in B, M, S 4 Dates 3 Ovens in D 6 Locations in D, O 4 Times 3 Intervals in T 6 Analyses in T, I 8 field treatments 72 portions 72 locations 72 analyses A three-phase agricultural experiment with laboratory treatments – labphase 3 Methods 3 lab treatments 4 M1 • After drying, the analysis to measure Na must be done. • In what order will this be done? • All 18 dried portions for a harvest will be done together. • Then, could randomize the order of analysis for these or could group according to ovens. • The latter, so that analyses are performed in batches of 6.
4 Harvests 2 Fertilizers 4 Mainplots in B 3 Blocks 2 Subplotsin B, M 3 Portions in B, M, S 4 Dates 3 Ovens in D 6 Locations in D, O 4 Times 3 Intervals in T 6 Analyses in T, I 8 field treatments 72 portions 72 locations 72 analyses A three-phase agricultural experiment with laboratory treatments (cont’d) 3 Methods 3 lab treatments 4 M1 Note split source
A three-phase agricultural experiment with laboratory treatments (cont’d) • Harvests confounded with Times & Dates. • No. analyses = no. locations = no. portions = 72: so portions sources exhaust locations sources exhaust analyses sources. • Consequently, not all terms are estimable.
Mixed models • Mixed models can be easily obtained from the factor-allocation description using Brien & Demétrio’s (2009) method: • In each panel, form terms as the set of generalized factors for that panel • all combinations of the factors, subject to nesting restrictions; • For each term from each panel, add to either fixed or random model.
4 Harvests 2 Fertilizers 4 Mainplots in B 3 Blocks 2 Subplotsin B, M 3 Portions in B, M, S 4 Dates 3 Ovens in D 6 Locations in D, O 4 Times 3 Intervals in T 6 Analyses in T, I 8 field treatments 72 portions 72 locations 72 analyses Mixed model for 3-phase experiment 3 Methods 3 lab treatments 4 M1 • Full mixed model is: • Harvests + Fertilizers + HarvestsFertilizers + Methods | Blocks + BlocksMainplots + BlocksMainplotsSubplots + BlocksMainplotsSubplotsPortions + Dates + DatesOvens + DatesOvensLocations + Times + TimesIntervals + TimesIntervalsAnalyses. • Pseudofactors not needed in mixed model (cf. grouping factors).
Mixed model of convenience (for fitting) • Model will not fit because of confounding (see ANOVA). • Mixed model of convenience (a term for each line in the ANOVA): • Harvests + Fertilizers + HarvestsFertilizers + Methods | BlocksMainplots + BlocksMainplotsSubplots + BlocksMainplotsSubplotsPortions. • This model of convenience does not reflect all the sources of variability in the experiment. • For example, the Residual is not just portions variability, but has locations and analyses variability contributing to it also. • Full mixed model is: • Harvests + Fertilizers + HarvestsFertilizers + Methods | Blocks + BlocksMainplots + BlocksMainplotsSubplots + BlocksMainplotsSubplotsPortions + Dates + DatesOvens + DatesOvensLocations + Times + TimesIntervals + TimesIntervalsAnalyses.
5)Summary • Have provided 4 standard principles and 8 principles specific to orthogonal, multiphase designs. • Extension to nonorthogonal designs is coming.
References • Brien, C. J. (1983). Analysis of variance tables based on experimental structure. Biometrics,39, 53–59. • Brien, C.J., and Bailey, R.A. (2006) Multiple randomizations (with discussion). J. Roy. Statist. Soc., Ser. B, 68, 571–609. • Brien, C.J. and Demétrio, C.G.B. (2009) Formulating mixed models for experiments, including longitudinal experiments. Journal of Agricultural, Biological and Environmental Statistics, 14, 253–80. • Brien, C.J., Harch, B.D., Correll, R.L. and Bailey, R.A. (2011) Multiphase experiments with laboratory phases subsequent to the initial phase. I. Orthogonal designs. Journal of Agricultural, Biological and Environmental Statistics, 16, 422–450. • Nelder, J. A. (1965). The analysis of randomized experiments with orthogonal block structure. Proceedings of the Royal Society of London, Series A, 283(1393), 147-162, 163–178. • Piepho, H.P., Büchse, A. and Emrich, K. (2003) A hitchhiker’s guide to mixed models for randomized experiments. Journal of Agronomy and Crop Science, 189, 310–322. • Searle, S. R., Casella, G. & Mcculloch, C. E. (1992) Variance components. New York, Wiley. Web address for link to Multitiered experiments site: http://chris.brien.name/multitier