
CS 594: Empirical Methods in HCC Experimental Research in HCI (Part 2)


Presentation Transcript


  1. CS 594: Empirical Methods in HCC Experimental Research in HCI (Part 2) Dr. Debaleena Chattopadhyay Department of Computer Science debchatt@uic.edu debaleena.com hci.cs.uic.edu

  2. Agenda • Discuss course project details • Revisiting Parametric Statistics • Non-Parametric Statistics • Categorical Data

  3. Course Project Details • Part 1 (15%) Research proposal. Research design and conceptualization of a chosen research topic. – Due 9/26 • Part 2 (25%) Data analysis. Results and discussion. – Due at midterm and finals

  4. Course Project Details (cont.) • You may deal with the same research topic for part 1 and part 2: conceptualize, collect, and analyze data. • You may use different topics for part 1 and part 2. For example, use data that you had collected before but not yet analyzed (instructor approval required beforehand).

  5. Course Project Details (cont.) • Part 1 • Scope of research • Conceptualization – research questions • Operationalization • Metrics: what would you measure? How will you address validity and reliability? • Hypotheses • How would you collect data? • How would you analyze data? Why is this methodology suitable? • Explain how the data collected and the anticipated results will help you answer the research questions.

  6. Course project; Part 1 • Research scope must not be trivial • NO simple usability tests • Your proposal will be evaluated on the following • Correctness of operationalization and RQs • Quality of metrics • Quality of data collection plan • Correctness of rationale for the chosen empirical method • Degree of difficulty of the research proposal (10%) • Proposals will be evaluated along two dimensions: degree of difficulty and execution appraisal

  7. Revisiting Parametric Statistics

  8. Q1 • What does a significant test statistic tell us? • There is an important effect • The null hypothesis is false • There is an effect in the population of sufficient magnitude to warrant interpretation • All of the above

  9. Q1 • What does a significant test statistic tell us? • There is an important effect • The null hypothesis is false • There is an effect in the population of sufficient magnitude to warrant interpretation • All of the above

  10. Q2 • A Type II error is when: • We conclude that there is an effect in the population when in fact there is not. • We conclude that there is not an effect in the population when in fact there is. • We conclude that the test statistic is significant when in fact it is not. • The data we have entered in R is different from the data collected.

  11. Q2 • A Type II error is when: • We conclude that there is an effect in the population when in fact there is not. • We conclude that there is not an effect in the population when in fact there is. • We conclude that the test statistic is significant when in fact it is not. • The data we have entered in R is different from the data collected.

  12. Q3 • Which of these statements about statistical power is not true? • Power is the ability of a test to detect an effect, given that an effect of a certain size exists in the population. • We can use power to determine how big a sample is required to detect an effect of a certain size. • Power is linked to the probability of making a Type II error. • All of the above are true.

  13. Q3 • Which of these statements about statistical power is not true? • Power is the ability of a test to detect an effect, given that an effect of a certain size exists in the population. • We can use power to determine how big a sample is required to detect an effect of a certain size. • Power is linked to the probability of making a Type II error. • All of the above are true.

  14. Q4 • Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)? • Some feature of the data should be normally distributed. • The samples being tested should have approximately equal variances. • Your data should be at least interval level. • All of the above.

  15. Q4 • Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)? • Some feature of the data should be normally distributed. • The samples being tested should have approximately equal variances. • Your data should be at least interval level. • All of the above.

  16. Q5 • The Shapiro-Wilk test can be used to test: • Whether data are normally distributed. • Whether group variances are equal. • Whether scores are measured at the interval level • Whether group means differ

  17. Q5 • The Shapiro-Wilk test can be used to test: • Whether data are normally distributed. • Whether group variances are equal. • Whether scores are measured at the interval level • Whether group means differ

  18. Q6 • The correlation between two variables A and B is .12 with a significance of p < .01. What can we conclude? • That there is a substantial relationship between A and B. • That there is a small relationship between A and B. • That variable A causes variable B. • All of the above.

  19. Q6 • The correlation between two variables A and B is .12 with a significance of p < .01. What can we conclude? • That there is a substantial relationship between A and B. • That there is a small relationship between A and B. • That variable A causes variable B. • All of the above.

  20. Normality
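A minimal sketch of how the normality check on this slide might be run in R, using the Shapiro-Wilk test mentioned in Q5; the data frame hciData and the variable completionTime are hypothetical, not from the slides:

    # Hypothetical task-completion times for 30 participants
    set.seed(42)
    hciData <- data.frame(completionTime = rnorm(30, mean = 12, sd = 3))

    # Shapiro-Wilk test: p > .05 suggests no significant deviation from normality
    shapiro.test(hciData$completionTime)

    # A Q-Q plot is worth inspecting alongside the test
    qqnorm(hciData$completionTime); qqline(hciData$completionTime)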

  21. Homogeneity of variance
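One common way to check homogeneity of variance in R is Levene's test from the car package; this sketch uses hypothetical two-group data (the group labels and scores are assumptions, not the slide's data):

    library(car)

    # Hypothetical scores for two independent groups
    set.seed(1)
    hciData <- data.frame(
      group = factor(rep(c("mouse", "trackpad"), each = 15)),
      score = c(rnorm(15, mean = 50, sd = 10), rnorm(15, mean = 55, sd = 12))
    )

    # Levene's test: a non-significant result is consistent with equal group variances
    leveneTest(score ~ group, data = hciData)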

  22. T-test (independent)
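A minimal sketch of an independent-samples t-test in R on hypothetical two-group data (variable names are assumptions):

    set.seed(2)
    hciData <- data.frame(
      group = factor(rep(c("mouse", "trackpad"), each = 15)),
      score = c(rnorm(15, mean = 50, sd = 10), rnorm(15, mean = 55, sd = 10))
    )

    t.test(score ~ group, data = hciData, var.equal = TRUE)  # Student's t (assumes equal variances)
    t.test(score ~ group, data = hciData)                    # Welch's t (R's default)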

  23. T-test (dependent)
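A minimal sketch of the dependent (paired) t-test in R, with hypothetical pre/post scores from the same 20 participants:

    set.seed(3)
    pre  <- rnorm(20, mean = 40, sd = 8)
    post <- pre + rnorm(20, mean = 3, sd = 4)   # simulated within-subject change

    # paired = TRUE pairs each participant's two scores
    t.test(post, pre, paired = TRUE)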

  24. Non-Parametric Statistics

  25. When to use non-parametric tests? • Data are not normally distributed • Data are not measured at interval level • Non-parametric tests are sometimes referred to as distribution-free tests, on the grounds that they make no assumptions about the distribution of the data (a simplification: they make fewer assumptions, not none)

  26. Common Non-parametric Tests in use • Wilcoxon rank-sum test/ Mann–Whitney test (similar to independent t-test) • Wilcoxon signed-rank test (similar to dependent t-test) • Friedman’s test (similar to repeated-measures ANOVA) • Kruskal–Wallis test (similar to one-way ANOVA)

  27. Comparing two independent conditions: the Wilcoxon rank-sum test • When you want to test differences between two conditions and different participants have been used in each condition, you have two choices • Wilcoxon rank-sum test • Mann–Whitney test

  28. Wilcoxon rank-sum test • If the data for the two groups are stored in a single column (with a grouping factor as the predictor): newModel <- wilcox.test(outcome ~ predictor, data = dataFrame) • If the data for the two groups are stored in two separate columns: newModel <- wilcox.test(scoresGroup1, scoresGroup2, paired = FALSE) • paired = FALSE (the default) gives the independent rank-sum test; paired = TRUE gives the signed-rank test instead

  29. Example output • For example, a neurologist might collect data to investigate the depressant effects of certain recreational drugs. She tested 20 clubbers in all: 10 were given an ecstasy tablet to take on a Saturday night and 10 were allowed to drink only alcohol. Levels of depression were measured using the Beck Depression Inventory (BDI) the day after and midweek.
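The slide's output is not reproduced here, but a runnable sketch of that analysis might look like the following; the BDI scores are simulated stand-ins, not the study's actual data:

    # Simulated day-after BDI scores for 10 ecstasy and 10 alcohol participants
    set.seed(4)
    drugData <- data.frame(
      drug      = factor(rep(c("Ecstasy", "Alcohol"), each = 10)),
      sundayBDI = c(rpois(10, lambda = 25), rpois(10, lambda = 16))
    )

    # Independent groups, so no pairing; W is the rank-sum statistic
    wilcox.test(sundayBDI ~ drug, data = drugData)

Report group medians and an effect size alongside the test statistic and p-value.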

  30. Wilcoxon signed-rank test • Used in situations in which there are two sets of scores to compare, but these scores come from the same participants. As such, think of it as the nonparametric equivalent of the dependent t-test.
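A minimal sketch of the signed-rank test in R, with hypothetical day-after and midweek scores from the same participants:

    set.seed(5)
    dayAfter <- rpois(10, lambda = 25)
    midweek  <- dayAfter + sample(-5:10, size = 10, replace = TRUE)

    # paired = TRUE gives the Wilcoxon signed-rank test
    wilcox.test(dayAfter, midweek, paired = TRUE)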

  31. Kruskal–Wallis test • The one-way independent ANOVA has a non-parametric counterpart called the Kruskal–Wallis test. • When the data are collected using different participants in each group, we input the data using a coding variable. So the data frame will have two columns: the first is a coding variable (a factor) identifying the group, and the second contains the scores.
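A minimal sketch of that two-column layout and the test in R, with hypothetical data for three groups:

    set.seed(6)
    kwData <- data.frame(
      condition = factor(rep(c("A", "B", "C"), each = 10)),   # coding variable (factor)
      score     = c(rpois(10, 10), rpois(10, 12), rpois(10, 15))
    )

    kruskal.test(score ~ condition, data = kwData)

    # One option for pairwise follow-ups, with a correction for multiple comparisons
    pairwise.wilcox.test(kwData$score, kwData$condition, p.adjust.method = "bonferroni")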

  32. Kruskal–Wallis test (example output)

  33. Kruskal–Wallis test (example output)

  34. Differences between severalrelated groups: Friedman’s ANOVA • Used for testing differences between conditions when there are more than two conditions and the same participants have been used in all conditions. • If you have violated some assumption of parametric tests then this test can be a useful way around the problem.

  35. Friedman’s ANOVA (example)
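The example output is not reproduced here; a minimal runnable sketch with hypothetical data for three repeated-measures conditions follows. friedman.test() accepts a matrix with one row per participant and one column per condition:

    set.seed(7)
    scores <- cbind(
      condA = rpois(12, lambda = 10),
      condB = rpois(12, lambda = 12),
      condC = rpois(12, lambda = 11)
    )

    friedman.test(scores)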

  36. Categorical Data

  37. Chi-square test; contingency table

  38. Chi-square test; contingency table
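The slides' tables are not reproduced here, but a minimal sketch of a chi-square test on a hypothetical 2x2 contingency table in R (the counts and labels are assumptions):

    # Hypothetical counts: task completion (yes/no) under two interfaces
    contingency <- matrix(c(28, 12,
                            18, 22),
                          nrow = 2, byrow = TRUE,
                          dimnames = list(Interface = c("A", "B"),
                                          Completed = c("Yes", "No")))

    chisq.test(contingency)                   # 2x2: Yates' continuity correction applied by default
    chisq.test(contingency, correct = FALSE)  # without the correction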

  39. Upcoming: • Proposal due Sep 26, 11:59pm • Start working on your annotated bibliography • Post your slides on Piazza after class presentations
