Comprehensive Guide to Experimental Design in HCI Studies

Controlled User studies HCI - 4163/6610 Winter 2013

Usability Experiments • Predict the relationship between two or more variables. • Independent variable is manipulated by the researcher. • Dependent variable depends on the independent variable. • Typical experimental designs have one or two independent variables. • Validated statistically & replicable.

True Experiment • Experimental control • Control as many potential threats to validity as possible • Random assignment of participants/data to conditions • Could be within-subjects or between-subjects

Control • True experiment = complete control over the subject assignment to conditions and the presentation of conditions to subjects • Control over the who, what, when, where, how • Control of the who => random assignment to conditions • Only by chance can other variables be confounded with IV • Control of the what/when/where/how => control over the way the experiment is conducted
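Control of the "who" can be sketched in a few lines. The following is a minimal illustration, not a prescribed procedure: the function name `assign_to_conditions`, the participant IDs, and the fixed seed are all assumptions made for the example. It shuffles the participant list and deals participants out to conditions in turn, so group sizes differ by at most one.

```python
import random

def assign_to_conditions(participants, conditions, seed=None):
    """Randomly assign participants to conditions in near-equal group sizes."""
    rng = random.Random(seed)          # seed only so the example is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # chance, not the researcher, decides the order
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        # Deal participants out round-robin after shuffling.
        groups[conditions[index % len(conditions)]].append(participant)
    return groups

# Hypothetical participant IDs and condition names.
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = assign_to_conditions(participants, ["old system", "new system"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because assignment depends only on the shuffle, any other participant variable can be confounded with the IV only by chance.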

Quasi-Experiment • When you can’t achieve complete control • Lack of complete control over conditions • Subjects for different conditions come from potentially non-random pre-existing groups (smokers vs nonsmokers)

It’s a matter of control • True Experiment: random assignment of subjects to condition; manipulate the IV; control allows ruling out of alternative hypotheses • Quasi-Experiment: selection of subjects for the conditions; observe categories of subjects; if the subject variable is the IV, it’s a quasi-experiment; don’t know whether differences are caused by the IV or by differences in the subjects

Other features • In some instances cannot completely control the what, when, where, and how • Need to collect data at a certain time or not at all • Practical limitations to data collection, experimental protocol

Validity • Internal validity is reduced due to the presence of uncontrolled/confounding variables • But not necessarily invalid • It’s important for the researcher to evaluate the likelihood that there are alternative hypotheses for observed differences • Need to convince self and audience of the validity

External validity • If the experimental setting more closely replicates the setting of interest, external validity can be higher than a true experiment run in a controlled lab setting • Often comes down to what is most important for the research question • Control or ecological validity?

Terminology • Factors: Independent Variables (IVs) of an experiment • Level: particular value of an IV • Condition: a group or treatment (technique) • e.g., Condition 1: old system, Condition 2: new system • Treatment: a condition of an experiment • Subject: participant (can also think more broadly of data sets that are ‘subjected’ to a treatment)

Factors to Treatments • At least 1 Factor (IV) has to vary to have an experiment • Effect of screen size and input technique on performance (speed, accuracy) • An IV must always have at least 2 levels • Condition refers to a particular way that subjects are treated • Between subjects: experimental conditions are the same as the groups • Within subjects: only 1 group, which experiences every condition (can be many conditions in an experiment)
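To make the step from factors to conditions concrete, here is a small sketch that crosses the levels of two factors. The factor names follow the screen size and input technique example, but the specific levels ("small", "large", "touch", "stylus", "mouse") and the helper `enumerate_conditions` are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical levels for the two IVs in the example.
factors = {
    "screen_size": ["small", "large"],
    "input_technique": ["touch", "stylus", "mouse"],
}

def enumerate_conditions(factors):
    """Cross every level of every factor to list the experimental conditions."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

conditions = enumerate_conditions(factors)
for number, condition in enumerate(conditions, start=1):
    print(f"Condition {number}: {condition}")
```

A 2 x 3 design like this one yields six conditions. Between subjects, that means six groups; within subjects, one group experiences all six.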

Good Experimental Design • Two-Group, Post-Test Design • Two conditions • Two groups: • Between subjects: random allocation • Treatment • Post-test: measure the DV • What’s really important?
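As one possible way to analyse a two-group, post-test design, the sketch below computes Welch's t statistic for the DV measured in each group. The slides do not name a specific test, so the choice of Welch's t is an assumption, as are the function name `welch_t` and the sample values, which are made up purely for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Made-up task completion times in seconds, for illustration only.
old_system = [41.0, 38.5, 44.2, 40.1, 39.7, 43.0]
new_system = [35.2, 37.8, 33.9, 36.4, 38.0, 34.5]
t, df = welch_t(old_system, new_system)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The t statistic would then be compared against a t distribution with `df` degrees of freedom to obtain a p-value, typically with a statistics package.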

Experimental designs • Between subjects: Different participants - single group of participants is allocated randomly to the experimental conditions. • Within subjects: Same participants - all participants appear in both conditions. • Matched participants - participants are matched in pairs, e.g., based on expertise, gender, etc.
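A matched-participants allocation might look like the sketch below: participants are ranked on a matching variable, adjacent participants form a pair, and a coin flip decides which member of each pair goes to which condition. The function name `matched_pair_assignment`, the expertise scores, and the handling of an odd participant (left unassigned) are assumptions for the example.

```python
import random

def matched_pair_assignment(scores, seed=None):
    """Pair participants with adjacent matching scores, then split each pair at random.

    scores maps participant ID -> matching score (e.g. years of expertise).
    Returns two lists, one per condition. With an odd number of participants,
    the highest-scoring one is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)
    group_a, group_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)              # random assignment within the matched pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical expertise scores.
expertise = {"P1": 1, "P2": 7, "P3": 2, "P4": 8, "P5": 4, "P6": 5}
group_a, group_b = matched_pair_assignment(expertise, seed=3)
print(group_a, group_b)
```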

Within-subjects • Similar to the one-group pre-test-post-test design • It solves the individual differences issues • But raises other problems: • Need to look at the impact of experiencing the two conditions • Will they get tired? Gain practice? Learn what is expected? • Need to control for order and sequence effects?

Order Effects • Changes in performance resulting from (ordinal) position in which a condition appears in an experiment (always first?) • Arises from warm-up, learning, fatigue, etc. • Effect can be averaged and removed if all possible orders are presented in the experiment and there has been random assignment to orders
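Presenting all possible orders with random assignment to orders can be sketched as follows. The function name `full_counterbalance` and the participant and condition labels are assumptions. The sketch requires the number of participants to be a multiple of the number of orders so that every order is used equally often.

```python
import random
from itertools import permutations

def full_counterbalance(participants, conditions, seed=None):
    """Assign each participant one of all possible condition orders, equally often."""
    rng = random.Random(seed)
    orders = list(permutations(conditions))
    if len(participants) % len(orders) != 0:
        raise ValueError("participant count must be a multiple of the number of orders")
    pool = orders * (len(participants) // len(orders))
    rng.shuffle(pool)                  # random assignment of participants to orders
    return dict(zip(participants, pool))

# Hypothetical labels: 3 conditions give 3! = 6 orders, so 12 participants use each order twice.
participants = [f"P{i:02d}" for i in range(1, 13)]
schedule = full_counterbalance(participants, ["A", "B", "C"], seed=7)
for participant, order in schedule.items():
    print(participant, " -> ".join(order))
```

With every order used equally often, each condition appears in each ordinal position the same number of times, which is what lets the order effect average out. The number of orders grows factorially, so this only stays practical for a small number of conditions.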

Sequence effects • Changes in performance resulting from interactions among conditions (e.g., if done first, condition 1 has an impact on performance in condition 2) • Effects viewed may not be main effects of the IV, but interaction effects • Can be controlled by arranging each condition to follow every other condition equally often
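One common way to have each condition follow every other condition equally often is a balanced Latin square (Williams design). The slide does not name this construction, so treat it as an assumed illustration. The function name `balanced_latin_square` is likewise hypothetical. For an even number of conditions it produces n orders; for an odd number it also adds the reversed orders, giving 2n.

```python
def balanced_latin_square(conditions):
    """Build orders in which every condition immediately follows every other equally often."""
    n = len(conditions)
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Remaining rows shift every entry of the first row by 1, 2, ... (mod n).
    rows = [[(x + shift) % n for x in first] for shift in range(n)]
    if n % 2 == 1:
        # Odd n needs the mirrored rows as well to balance immediate successors.
        rows += [list(reversed(row)) for row in rows]
    return [[conditions[i] for i in row] for row in rows]

for order in balanced_latin_square(["A", "B", "C", "D"]):
    print(" -> ".join(order))
```

Participants would then be randomly assigned to the rows in equal numbers.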

Counterbalancing • Controlling order and sequence effects by arranging subjects to experience the various conditions (levels of the IV) in different orders • Self-directed learning: investigate the different counterbalancing methods • Randomization • Block Randomization • Reverse counter-balancing • Latin squares and Graeco-Latin squares (when you can’t fully counterbalance) • http://www.experiment-resources.com/counterbalanced-measures-design.html
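Two of the listed methods are short enough to sketch directly. The function names `block_randomize` and `reverse_counterbalance` are assumptions, as are the condition labels. In block randomization each block contains every condition once in a freshly shuffled order. In reverse counterbalancing the conditions are presented in one order and then again in the opposite order (ABBA).

```python
import random

def block_randomize(conditions, n_blocks, seed=None):
    """Return a trial sequence of n_blocks blocks, each a random order of all conditions."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)             # new random order inside every block
        sequence.extend(block)
    return sequence

def reverse_counterbalance(conditions):
    """Return the conditions followed by the same conditions in reverse (e.g. A B B A)."""
    return list(conditions) + list(reversed(conditions))

print(block_randomize(["A", "B", "C"], n_blocks=4, seed=1))
print(reverse_counterbalance(["A", "B"]))
```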

Between, within, matched participant design

Key points 1 • Usability testing is done in controlled conditions. • Usability testing is an adapted form of experimentation. • Experiments aim to test hypotheses by manipulating certain variables while keeping others constant. • The experimenter controls the independent variable(s) but not the dependent variable(s).
