Asking users & experts and Testing & modeling users Ref: Ch. 13-14
The aims • Discuss the role of interviews & questionnaires in evaluation. • Teach basic questionnaire design. • Describe how to do interviews, heuristic evaluation & walkthroughs. • Describe how to collect, analyze & present data. • Discuss strengths & limitations of these techniques
Interviews • Unstructured - are not directed by a script. Rich but not replicable. • Structured - are tightly scripted, often like a questionnaire. Replicable but may lack richness. • Semi-structured - guided by a script but interesting issues can be explored in more depth. Can provide a good balance between richness and replicability.
Basics of interviewing • Determine the goals the evaluation addresses. • Explore the specific questions to be answered. • Choose the evaluation paradigm and techniques to answer the questions. • Identify the practical issues. • Decide how to deal with the ethical issues. • Evaluate, interpret and present the data. • Remember the DECIDE framework • Goals and questions guide all interviews • Two types of questions: ‘closed questions’ have a predetermined answer format, e.g., ‘yes’ or ‘no’; ‘open questions’ do not have a predetermined format • Closed questions are quicker and easier to analyze
Things to avoid when preparing interview questions • Long questions • Compound sentences - split into two • Jargon & language that the interviewee may not understand • Leading questions that make assumptions e.g., why do you like …? • Unconscious biases e.g., gender stereotypes
Components of an interview • Introduction - introduce yourself, explain the goals of the interview, reassure about the ethical issues, ask to record, present an informed consent form. • Warm-up - make first questions easy & non-threatening. • Main body - present questions in a logical order. • A cool-off period - include a few easy questions to defuse tension at the end. • Closure - thank interviewee, signal the end, e.g., switch recorder off.
The interview process • Use the DECIDE framework for guidance • Dress in a similar way to participants • Check recording equipment in advance • Devise a system for coding names of participants to preserve confidentiality. • Be pleasant • Ask participants to complete an informed consent form
Probes and prompts • Probes - devices for getting more information, e.g., ‘would you like to add anything?’ • Prompts - devices to help interviewee, e.g., help with remembering a name • Remember that probing and prompting should not create bias. • Too much can encourage participants to try to guess the answer.
Group interviews • Also known as ‘focus groups’ • Typically 3-10 participants • Provide a diverse range of opinions • Need to be managed to ensure that: - everyone contributes - discussion isn’t dominated by one person - the agenda of topics is covered
Analyzing interview data • Depends on the type of interview • Structured interviews can be analyzed like questionnaires • Unstructured interviews generate data like that from participant observation • It is best to analyze unstructured interviews as soon as possible to identify topics and themes from the data
Asking Users: Questionnaires • Questions can be closed or open • Closed questions are easiest to analyze, and may be done by computer • Can be administered to large populations • Paper, email & the web used for dissemination • Advantage of electronic questionnaires is that data goes into a database & is easy to analyze • Sampling can be a problem when the size of a population is unknown, as is common online
Questionnaire style • Varies according to goal so use the DECIDE framework for guidance • Questionnaire format can include: - ‘yes’, ‘no’ checkboxes - checkboxes that offer many options - Likert rating scales 1, 2, 3, 4, 5 - semantic scales - open-ended responses • Likert scales have a range of points • 3, 5, 7 & 9 point scales are common • Debate about which is best • Example of a semantic scale:
Attractive |___|_X_|___|___|___| Ugly
Clear      |___|___|_X_|___|___| Confusing
Dull       |___|___|___|___|___| Colorful
Exciting   |___|_X_|___|___|___| Boring
Annoying   |___|___|___|___|_X_| Pleasing
Poor       |___|___|___|_X_|___| Well-designed
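One way to turn marks on a semantic scale like this into numbers is to record the position of each X (1-5, counted from the left) and reverse the pairs whose favourable word sits on the left, so that a higher score always means a more favourable rating. The sketch below assumes that convention. The function name, the record layout and the decision about which word of each pair counts as favourable are illustrative assumptions, not something the slides specify.

```python
# Positions are counted 1-5 from the left-hand word to the right-hand word.
# Each entry: (left word, right word, marked position or None, favourable word on left?)
RESPONSES = [
    ("Attractive", "Ugly", 2, True),
    ("Clear", "Confusing", 3, True),
    ("Dull", "Colorful", None, False),   # no mark on this row
    ("Exciting", "Boring", 2, True),
    ("Annoying", "Pleasing", 5, False),
    ("Poor", "Well-designed", 4, False),
]


def favourability(position, favourable_on_left, points=5):
    """Return a score where `points` is most favourable and 1 is least."""
    if position is None:
        return None  # unanswered item: leave it out rather than guess
    return points + 1 - position if favourable_on_left else position


scores = {
    f"{left}/{right}": favourability(pos, fav_left)
    for left, right, pos, fav_left in RESPONSES
}
answered = [s for s in scores.values() if s is not None]
mean_score = sum(answered) / len(answered)

for pair, score in scores.items():
    print(f"{pair:<22} {score}")
print(f"Mean over answered items: {mean_score:.1f}")
```

Mixing the direction of the word pairs, as this example scale does, is why the reversal step is needed before scores are averaged.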
Developing a questionnaire • Provide a clear statement of purpose & guarantee participants anonymity • Plan questions - if developing a web-based questionnaire, design off-line first • Decide on whether phrases will all be positive, all negative or mixed • Pilot test questions - are they clear, is there sufficient space for responses • Decide how data will be analyzed & consult a statistician if necessary
Encouraging a good response • Make sure purpose of study is clear • Promise anonymity • Ensure questionnaire is well designed • Offer a short version for those who do not have time to complete a long questionnaire • If mailed, include a s.a.e. • Follow-up with emails, phone calls, letters • Provide an incentive • 40% response rate is high, 20% is often acceptable
Advantages of online questionnaires • Responses are usually received quickly • No copying and postage costs • Data can be collected in database for analysis • Time required for data analysis is reduced • Errors can be corrected easily • Disadvantage - sampling problematic if population size unknown • Disadvantage - preventing individuals from responding more than once
Problems with online questionnaires • Sampling is problematic if population size is unknown • Preventing individuals from responding more than once
Developing a web-based questionnaire • Produce an error-free interactive electronic version from the original paper-based one. • Make the questionnaire accessible from all common browsers and readable from different-size monitors and different network locations. • Make sure information identifying each respondent will be captured and stored confidentially because the same person may submit several complete surveys. • User-test the survey with pilot studies before distributing.
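The point about identifying respondents confidentially can be sketched as follows: store a keyed hash of whatever identifies a respondent instead of the identifier itself, and use it to spot repeat submissions. Everything named here (the `SECRET_KEY`, the use of an e-mail address, the keep-the-first-submission policy) is an assumption made for illustration. A real survey would need its own decision on what counts as identifying information and how the key is protected.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-private-key"  # hypothetical; store it apart from the data


def respondent_token(email: str) -> str:
    """Keyed hash of a normalised e-mail address, stored in place of the address."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalised, hashlib.sha256).hexdigest()


def drop_repeat_submissions(submissions):
    """Keep only the first submission seen for each respondent token."""
    seen = set()
    kept = []
    for email, answers in submissions:
        token = respondent_token(email)
        if token in seen:
            continue  # this respondent has already submitted a complete survey
        seen.add(token)
        kept.append({"respondent": token, "answers": answers})
    return kept


submissions = [
    ("ana@example.org", {"q1": "yes"}),
    ("Ana@Example.org ", {"q1": "no"}),   # same person, second attempt
    ("ben@example.org", {"q1": "no"}),
]
unique = drop_repeat_submissions(submissions)
print(len(unique), "unique respondents")
```

Keeping the first submission is only one possible policy; keeping the latest, or flagging repeats for manual review, would need a small change to the loop.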
Questionnaire data analysis & presentation • Present results clearly - tables may help • Simple statistics can say a lot, e.g., mean, median, mode, standard deviation • Percentages are useful, but always give the population size • Bar graphs show categorical data well • More advanced statistics can be used if needed
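As a small illustration of these simple statistics, the sketch below summarises one closed question using Python's standard `statistics` module. The ratings are made-up data for ten hypothetical respondents. Note that each percentage is printed together with the population size.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Hypothetical answers from 10 respondents to one 5-point Likert item.
ratings = [4, 5, 3, 4, 2, 4, 5, 3, 4, 1]

summary = {
    "n": len(ratings),
    "mean": mean(ratings),
    "median": median(ratings),
    "mode": mode(ratings),
    "stdev": stdev(ratings),  # sample standard deviation
}
print(summary)

# Percentage choosing each answer, always reported with the population size n.
counts = Counter(ratings)
for value in sorted(counts):
    share = 100 * counts[value] / len(ratings)
    print(f"rating {value}: {share:.0f}% (n={len(ratings)})")
```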
Asking experts • Inspections • Experts use their knowledge of users & technology to review software usability • Expert critiques (crits) can be formal or informal reports • Heuristic evaluation is a review guided by a set of heuristics • Walkthroughs involve stepping through a pre-planned scenario noting potential problems
Heuristic evaluation • Developed by Jakob Nielsen in the early 1990s • Based on heuristics distilled from an empirical analysis of 249 usability problems • These heuristics have been revised for current technology, e.g., HOMERUN for web • Heuristics still needed for mobile devices, wearables, virtual worlds, etc. • Design guidelines form a basis for developing heuristics
Nielsen’s heuristics • Visibility of system status • Match between system and real world • User control and freedom • Consistency and standards • Help users recognize, diagnose, recover from errors • Error prevention • Recognition rather than recall • Flexibility and efficiency of use • Aesthetic and minimalist design • Help and documentation
Discount evaluation • Heuristic evaluation is referred to as discount evaluation when 5 evaluators are used. • Empirical evidence suggests that on average 5 evaluators identify 75-80% of usability problems.
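The 75-80% figure can be related to a model often quoted alongside discount evaluation, the problem-discovery curve of Nielsen & Landauer, which is not part of these slides. If each evaluator independently finds a fraction λ of the problems, then i evaluators together are expected to find 1 - (1 - λ)^i of them. The sketch below assumes λ = 0.25, a value picked only because it lands inside the range quoted above. The real rate varies with the product and the evaluators.

```python
def proportion_found(evaluators: int, detection_rate: float = 0.25) -> float:
    """Expected share of problems found by independent evaluators.

    detection_rate is the assumed chance that one evaluator finds a given
    problem (an illustrative value, not a figure taken from the slides).
    """
    return 1 - (1 - detection_rate) ** evaluators


for n in (1, 3, 5, 10):
    print(f"{n:>2} evaluators: {proportion_found(n):.0%}")
```

The curve flattens quickly, which is the usual argument for stopping at about five evaluators: each additional expert mostly rediscovers problems that are already known.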
3 stages for doing heuristic evaluation • Briefing session to tell experts what to do • Evaluation period of 1-2 hours in which: - each expert works separately - takes one pass to get a feel for the product - takes a second pass to focus on specific features • Debriefing session in which experts work together to prioritize problems
Advantages and problems • Few ethical & practical issues to consider • Can be difficult & expensive to find experts • Best experts have knowledge of application domain & users • Biggest problems: - important problems may get missed - many trivial problems are often identified
Cognitive walkthroughs • Focus on ease of learning • Designer presents an aspect of the design & usage scenarios • One or more experts walk through the design prototype with the scenario • Expert is told the assumptions about user population, context of use, task details • Experts are guided by 3 questions
The 3 questions • Will the correct action be sufficiently evident to the user? • Will the user notice that the correct action is available? • Will the user associate and interpret the response from the action correctly? As the experts work through the scenario they note problems
Pluralistic walkthrough • Variation on the cognitive walkthrough theme • Performed by a carefully managed team • The panel of experts begins by working separately • Then there is managed discussion that leads to agreed decisions • The approach lends itself well to participatory design
Key points • Structured, unstructured, semi-structured interviews, focus groups & questionnaires • Closed questions are easiest to analyze & can be replicated • Open questions are richer • Check boxes, Likert & semantic scales • Expert evaluation: heuristic & walkthroughs • Relatively inexpensive because no users • Heuristic evaluation relatively easy to learn • May miss key problems & identify false ones
The aims • Describe how to do user testing. • Discuss the differences between user testing, usability testing and research experiments. • Discuss the role of user testing in usability testing. • Discuss how to design simple experiments. • Describe GOMS, the keystroke level model, Fitts’ law and discuss when these techniques are useful. • Describe how to do a keystroke level analysis.
Experiments, user testing & usability testing • Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more things – i.e., variables. • User testing is applied experimentation in which developers check that the system being developed is usable by the intended user population for their tasks. • Usability testing uses a combination of techniques, including user testing & user satisfaction questionnaires.
User testing is not research
User testing • Aim: improve products • Few participants • Results inform design • Not perfectly replicable • Controlled conditions • Procedure planned • Results reported to developers
Research experiments • Aim: discover knowledge • Many participants • Results validated statistically • Replicable • Strongly controlled conditions • Experimental design • Scientific paper reports results to community
User testing • Goals & questions focus on how well users perform tasks with the product • Comparison of products or prototypes common • Major part of usability testing • Focus is on time to complete task & number & type of errors • Informed by video & interaction logging • User satisfaction questionnaires provide data about users’ opinions
Testing conditions • Usability lab or other controlled space • Major emphasis on: - selecting representative users - developing representative tasks • 5-10 users typically selected • Tasks usually last no more than 30 minutes • The test conditions should be the same for every participant • Informed consent form explains ethical issues
Type of data (Wilson & Wixon, ‘97) • Time to complete a task • Time to complete a task after a specified time away from the product • Number and type of errors per task • Number of errors per unit of time • Number of navigations to online help or manuals • Number of users making a particular error • Number of users completing task successfully
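Most of these measures can be computed from a simple per-session record. The sketch below is one possible layout with made-up data for five hypothetical participants. The `Session` fields are assumptions for illustration, and "users with errors" counts anyone who made at least one error, which simplifies "users making a particular error".

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    seconds: float      # time to complete, or time until the task was abandoned
    errors: int
    help_visits: int    # navigations to online help or manuals
    completed: bool


sessions = [
    Session("P1", 95.0, 2, 1, True),
    Session("P2", 140.0, 5, 3, False),
    Session("P3", 80.0, 0, 0, True),
    Session("P4", 110.0, 1, 0, True),
    Session("P5", 125.0, 3, 2, True),
]

completed = [s for s in sessions if s.completed]
total_errors = sum(s.errors for s in sessions)
total_minutes = sum(s.seconds for s in sessions) / 60

measures = {
    "mean_time_completed_s": sum(s.seconds for s in completed) / len(completed),
    "errors_per_task": total_errors / len(sessions),
    "errors_per_minute": total_errors / total_minutes,
    "help_visits": sum(s.help_visits for s in sessions),
    "users_with_errors": sum(1 for s in sessions if s.errors > 0),
    "users_completing": len(completed),
}

for name, value in measures.items():
    print(f"{name}: {value}")
```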
How many participants is enough for user testing? • The number is largely a practical issue • Depends on: - schedule for testing - availability of participants - cost of running tests • Typically 5-10 participants • Some experts argue that testing should continue until no new insights are gained
Experiments • Predict the relationship between two or more variables • Independent variable is manipulated by the researcher • Dependent variable depends on the independent variable • Typical experimental designs have one or two independent variables
Experimental designs • Different participants - single group of participants is allocated randomly to the experimental conditions • Same participants - all participants appear in all conditions • Matched participants - participants are matched in pairs, e.g., based on expertise, gender
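The first two designs can be sketched in a few lines. The code below randomly splits hypothetical participants between two conditions for a different-participants design, and gives everyone both conditions for a same-participants design. Alternating the order of conditions in the second case is an added assumption (a common way to offset ordering effects), not something stated on the slide. Matched participants would additionally need a pairing attribute such as expertise, so that design is left out.

```python
import random

participants = [f"P{i}" for i in range(1, 9)]   # eight hypothetical participants
conditions = ["A", "B"]                          # e.g., two interface designs


def different_participants(people, conds, seed=0):
    """Randomly allocate each person to exactly one condition (equal group sizes)."""
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = people[:]
    rng.shuffle(shuffled)
    return {c: shuffled[i::len(conds)] for i, c in enumerate(conds)}


def same_participants(people, conds):
    """Everyone does both conditions; the order alternates from person to person."""
    orders = [conds, conds[::-1]]
    return {p: orders[i % 2] for i, p in enumerate(people)}


between = different_participants(participants, conditions)
within = same_participants(participants, conditions)

print(between)
print(within)
```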
Predictive models • Provide a way of evaluating products or designs without directly involving users • Psychological models of users are used to test designs • Less expensive than user testing • Usefulness limited to systems with predictable tasks - e.g., telephone answering systems, mobiles, etc. • Based on expert behavior
GOMS (Card et al., 1983) • Goals - the state the user wants to achieve e.g., find a website • Operators - the cognitive processes & physical actions performed to attain those goals, e.g., decide which search engine to use • Methods - the procedures for accomplishing the goals, e.g., drag mouse over field, type in keywords, press the go button • Selection rules - determine which method to select when there is more than one available
Benefits and limitations of GOMS • Help make decisions about the effectiveness of new products • Allow comparative analysis to be performed for different interfaces • Difficult or impossible to predict how an average user will carry out their tasks
Keystroke level model GOMS has also been developed further into a quantitative model - the keystroke level model. This model allows predictions to be made about how long it takes an expert user to perform a task.
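A keystroke level analysis lists the operators an expert needs for a task and adds up a standard time for each one. The sketch below uses operator times that are commonly cited from Card, Moran & Newell's work. Treat them as assumed averages, since the slides do not list them. The task breakdown is a hypothetical example.

```python
# Commonly cited operator times in seconds (assumed averages, not from the slides).
OPERATOR_SECONDS = {
    "K": 0.20,  # press a key or button (average skilled typist)
    "P": 1.10,  # point with a mouse at a target on screen
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mentally prepare for the next step
}


def predict_seconds(sequence: str) -> float:
    """Sum the operator times for a sequence such as 'MHPK'."""
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical task: think, reach for the mouse, point at a search field, click,
# return to the keyboard, think, type a four-letter keyword, press Enter.
search_task = "M" + "H" + "P" + "K" + "H" + "M" + "K" * 4 + "K"
estimate = predict_seconds(search_task)

print(f"{search_task}: about {estimate:.1f} s for an expert, error-free user")
```

Comparing the totals for two candidate designs of the same task is the typical use: the absolute number matters less than which design needs fewer or cheaper operators.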
Fitts’ Law (Paul Fitts 1954) • The law predicts that the time to point at an object using a device is a function of the distance from the target object & the object’s size. • The further away & the smaller the object, the longer the time to locate it and point. • Useful for evaluating systems for which the time to locate an object is important such as handheld devices like mobile phones
Fitts’ Law • T = k ln(D/S + 0.5), k ~ 100 ms • T: time to move the hand to a target • D: the distance between hand and target • S: size of target
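The formula can be tried out directly. The sketch below implements it as written on the slide, with a natural logarithm and k = 100 ms. Other statements of the law use a base-2 logarithm, which changes only the scale of k. The distances and sizes are made-up values in the same arbitrary units.

```python
import math


def pointing_time_ms(distance: float, size: float, k: float = 100.0) -> float:
    """Predicted pointing time T = k * ln(D/S + 0.5), as written on the slide.

    distance and size must be in the same units; k ~ 100 ms is the slide's value.
    """
    if distance <= 0 or size <= 0:
        raise ValueError("distance and size must be positive")
    return k * math.log(distance / size + 0.5)


near_large = pointing_time_ms(distance=100, size=50)   # close, big target
far_small = pointing_time_ms(distance=400, size=10)    # distant, small target

print(f"near, large target: {near_large:.0f} ms")
print(f"far, small target:  {far_small:.0f} ms")
```

The comparison shows the qualitative prediction of the law: the further away and the smaller the target, the longer it takes to point at it.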
Key points • User testing is a central part of usability testing • Testing is done in controlled conditions • User testing is an adapted form of experimentation • Experiments aim to test hypotheses by manipulating certain variables while keeping others constant • The experimenter controls the independent variable(s) but not the dependent variable(s) • There are three types of experimental design: different-participants, same-participants, & matched participants • GOMS, Keystroke level model, & Fitts’ Law predict expert, error-free performance • Predictive models are used to evaluate systems with predictable tasks such as telephones