BADM 531: Survey sampling: General techniques and Special population

Audrey Pattee, Avinish Chaturvedi

Sudman and Blair – Ch 1 • Sampling • Who is sampled matters more than how many • Every member of the population is given a fair chance of being selected + control • Purpose of sampling • It is almost always impossible to study a whole population • Sample = small group of individuals, usually randomly selected from a population • Conclusions from the study of the sample are applied to the whole population

Sudman and Blair – Ch 1 • Example: you only need a sip of coffee to tell if it needs more sugar, you don't have to drink the whole cup.

Sudman and Blair – Ch 1 • Research errors: • Non-sampling error (incorrect answers, failures or fluctuations in measurement, errors in coding or data entry) • Sampling error (the sample differs from the population by chance; larger samples tend to reduce this type of error)

Sudman and Blair – Ch 1 • Sample bias (systematic difference between the population and the sample) • Coverage bias: inappropriate exclusion of a part of the population • Selection bias: disproportionate chances of selection • Non-response bias: disproportionate failure to answer within one segment of the population

Sudman and Blair – Ch 1 • Probability vs non-probability samples • Probability sample = random sample, a random process is used to select population elements • Simple random sampling (srs): every member of the population has an equal chance of selection; in the systematic variant, every ith member is chosen after a random start • Stratified sampling: subgroups (strata) are sampled separately, possibly at different rates • Cluster sampling: whole groups (clusters) of population members are sampled

Sudman and Blair – Ch 1 • Exercise 1.2.2 (p16): • 20 elementary → 5 rooms → 100 • 10 middle → 10 rooms → 100 • 5 high → 20 rooms → 100 → Any given student has a 1/100 chance of being selected

Sudman and Blair – Ch 1 • Non-probability samples = tend to underestimate the true variability in a population • Judgment sampling: the researcher judges the level and representativeness of elements • Convenience sampling: uses easily available population members • Volunteer samples: may produce a good enough sample for the research purposes, but volunteers are not typical of the broader population • Quota sampling

Sudman and Blair – Ch 1 • Contexts in which different sampling procedures are efficient • Probability samples • Non-probability samples: • Judgment sampling • Convenience sampling • Volunteer samples • Quota sampling

Case Study • For this research in Jaipur, a sample of about 175,000 households, representing nearly 50% of households in the city, was chosen. This constituted almost 100% of the households subscribing to a newspaper in the city. The objective of the research was to create a newspaper for the people, of the people and by the people. The research would also help create awareness, establish a brand image and develop a better understanding of the readers.

Sudman and Blair – Ch 1 • Good sampling methods • Minimize: coverage bias, selection bias, non-response bias (maximize participation), sampling error • “A smaller sample with less potential for coverage/non-response bias is usually better than a larger sample with more potential for those biases” • Avoid inferences that go beyond what the sample can support

Sudman and Blair – Ch 1 • Exercise 1.3.1 (a) (p25): • This sample carries 3 major risks of bias: • Volunteer sample • Convenience sample • Judgment sample

Sudman and Blair – Ch 2 • Population, 2 levels of definition: • Unit = depends on topic and purpose • Boundaries = determines who is or is not of interest for the study • Choose the potential consumers and not only the actual consumers.

Sudman and Blair – Ch 2 • Exercise 2.1.2 (a) (p35): A researcher wishes to study key business leaders to learn their opinions about issues facing a metropolitan area. Define this population in specific operational terms. → Area: big cities and suburbs (what size? what counts as a suburb?) → Business leaders: CEOs? Others? Which firms (all? only big ones?)

Sudman and Blair – Ch 2 • Determining the target population • Frame = enables the researcher to identify population members in order to draw an adequate sample (e.g., lists) • Different types of lists: • National lists of the general population • National lists of population sub-groups (limited use) • National lists of the online population (working addresses and agreeing individuals)

Sudman and Blair – Ch 2 • Local lists of the general population, but local telephone directories have a 35% non-listed rate in certain cities • Lists of organization members (do not include potential members!) • Local/Regional/National lists of businesses • List deficiencies: • Omission = population members missing from the list → Ignore the omissions and hope the resulting bias is not serious → Random digit dialing: avoids omissions by dialing within working telephone exchanges so that unlisted numbers can be included → Open intervals: give unlisted elements the same chance of being included → Stratification: break the population into listed and unlisted groups

Sudman and Blair – Ch 2 • Ineligibility = listed elements that are not members of the population → screen the list before sampling to remove ineligible elements → or screen the selected elements for eligibility after sampling → adjust the sample size for the expected eligibility rate; note that researchers should not simply stop once they reach the desired number of eligibles, because the first observations obtained tend to be the easiest ones → adjust the sample size for the expected cooperation rate
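
The two sample-size adjustments on this slide can be combined into one small calculation. The sketch below is only an illustration: the function name and the example rates are assumptions, not figures from Sudman and Blair, and it assumes eligibility and cooperation act independently.

```python
import math

def adjusted_sample_size(target_obs, eligibility_rate, cooperation_rate):
    """Listings to draw so that, on average, target_obs usable observations remain.

    Assumes (hypothetically) that eligibility and cooperation are independent,
    so the expected yield per drawn listing is the product of the two rates.
    """
    if not (0 < eligibility_rate <= 1 and 0 < cooperation_rate <= 1):
        raise ValueError("rates must be in (0, 1]")
    return math.ceil(target_obs / (eligibility_rate * cooperation_rate))

# Hypothetical example: 200 observations wanted, half of the listings
# eligible, half of the eligible elements cooperating.
print(adjusted_sample_size(200, 0.5, 0.5))
```

Drawing the whole adjusted sample up front and working all of it is what avoids the "stop at the quota" problem noted on the slide.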

Sudman and Blair – Ch 2 • Duplication = several listings for the same individual → cross-check the list, identify duplicates and remove them → draw the sample, then check how many duplicates it contains → ask people how many times they appear in the list, a technique that should be avoided when possible • Clustering = several population members share one listing, which leads to unfair representation of some groups → get data from every member of each selected cluster → sample cluster members at a fixed rate → select one individual per cluster and weight it in the study according to the size of the cluster
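
The last clustering remedy (one individual per cluster, weighted by cluster size) can be sketched as follows. The function name and the toy data are hypothetical, and the weighted mean shown is just one simple way to apply the weights.

```python
import random

def weighted_mean_one_per_cluster(clusters, rng):
    """Pick one member at random from each cluster and weight it by cluster size.

    clusters: list of lists, each inner list holding the measured values of
    the members that share one listing (for example, one household).
    """
    weighted_total = 0.0
    weight_sum = 0
    for members in clusters:
        value = rng.choice(members)   # one individual per cluster
        weight = len(members)         # weight = size of the cluster
        weighted_total += weight * value
        weight_sum += weight
    return weighted_total / weight_sum

# Hypothetical households of size 1, 2 and 3.
households = [[10], [20, 20], [30, 30, 30]]
print(weighted_mean_one_per_cluster(households, random.Random(0)))
```

Without the weights, a person living in a three-member household would count as much as a person living alone, even though that person had only one third of the chance of being picked.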

Sudman and Blair – Ch 2 • Exercise 2.2.3 (a) (p47): Telephone survey of people who visited an open-air art festival. Visitors are members of the general population, but only 4% of the local population attended, and 25% of those who attended registered for a prize drawing. Should the researcher use the registrations for sampling purposes? Other techniques? → use the list of registered visitors → use RDD in the local population to find out who attended, then collect the information.

Case Study 2 • Problem: Measure the branding value of online advertising The largest consumer electronics superstore in the country wanted to understand consumer perceptions and attitudes toward their brand. In addition, they wanted to measure the branding impact of the campaign in a control/test environment to determine the overall value of Internet advertising against other mediums before committing more marketing dollars to online media buys.

Case Study 2 • The survey was presented to high-income, tech-savvy consumers via an ad targeting system. The blind survey contained questions about brand awareness, message recall, intent to purchase, and perception of quality against the competitors, and was presented to two groups: consumers who saw the advertisement but did not click (the test group), and consumers who did not see the advertisement (the control group).

Sudman and Blair – Ch 3 • Drawing the sample: • Simple random sampling • Physical selection procedures = list each member of the population and randomly pick some of them to form the sample • Use of random numbers (random number tables/computer) • The smaller the sample, the more biased it is likely to be when these methods are used.
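
A computer-generated simple random sample can be sketched with Python's standard `random` module. The helper name and the toy frame are assumptions for illustration; `Random.sample` draws without replacement, so no element is selected twice.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame, each with an equal chance."""
    rng = random.Random(seed)     # a seed makes the draw reproducible
    return rng.sample(frame, n)   # sampling without replacement

# Hypothetical frame of 50 listed units.
frame = [f"unit{i}" for i in range(1, 51)]
print(simple_random_sample(frame, 5, seed=42))
```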

Sudman and Blair – Ch 3 • Systematic sampling: sample every ith member of the population • Sampling interval i = N/n (population size divided by sample size) • Random start between 1 and i • After the random start, pick every (N/n)th element in the population • If the list has a periodicity that coincides with the sampling interval, the sample becomes unrepresentative • Physical sampling • Sampling from directories • Sampling from file drawers
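
The systematic procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the interval is rounded down to a whole number when N/n is not an integer, and the selection is truncated to n elements.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Every i-th element after a random start, with i = N // n."""
    N = len(frame)
    interval = N // n                  # sampling interval i, rounded down
    if interval < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randint(1, interval)   # random start between 1 and i (1-based)
    positions = range(start, N + 1, interval)
    # Rounding the interval down can yield extra positions; keep the first n.
    return [frame[p - 1] for p in positions][:n]

# Hypothetical list of 100 names, sample of 10: interval i = 10.
names = list(range(1, 101))
print(systematic_sample(names, 10, seed=7))
```

Sorting or shuffling the list before selection is one way to guard against periodicity, at the cost of losing any implicit stratification the original order provided.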

Sudman and Blair – Ch 3 • Exercise 3.1.3 (a) (p73): Using the Student/Staff directory of the University of Illinois, specifically the student directory section. • Sample: n = 500, 246 pages => 500/246 = 2.03, rounded down to 2 names per page • 5 columns/page, about 30 names/column • Select 13 as the random number between 1 and 30 • Select 2 and 5 as the random columns between 1 and 5 • Work through the section of the directory to get 2 × 246 = 492 names, just short of the target of 500.

Sudman and Blair – Ch 3 • While executing the research • Non-response bias: • Response rates are higher with personal visits and telephone surveys • For procedures without callbacks, using quotas is preferable and limits bias • Increase participation in follow-up research with cash incentives • Reporting the sample: • Detailed definition of the target population • Describe the list or other procedures used, the sampling design and the sample size • Response rates • Completion rate = number of observations / (n – ineligibles) • Cooperation rate = number of observations / (n – ineligibles – non-contacts)
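
The two reporting rates defined on this slide translate directly into code. The function names and the counts in the example are hypothetical.

```python
def completion_rate(n_obs, n_drawn, n_ineligible):
    """Completion rate = observations / (n - ineligibles)."""
    return n_obs / (n_drawn - n_ineligible)

def cooperation_rate(n_obs, n_drawn, n_ineligible, n_noncontact):
    """Cooperation rate = observations / (n - ineligibles - non-contacts)."""
    return n_obs / (n_drawn - n_ineligible - n_noncontact)

# Hypothetical fieldwork outcome: 1,000 listings drawn, 200 ineligible,
# 300 never contacted, 400 completed observations.
print(completion_rate(400, 1000, 200))
print(cooperation_rate(400, 1000, 200, 300))
```

Because non-contacts are removed from the denominator, the cooperation rate is never lower than the completion rate for the same study.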

Sudman and Blair – Ch 4 • Sample size depends on: • Sample mean • Its expected value should be as close as possible to the population mean • Sampling error (σ_x̄ = √(σ²/n) = σ/√n) • Under some conditions, costs force us to accept wider confidence intervals. How far are researchers ready, or should they be ready, to go? When do we decide that the research is unrealistic as designed?
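
The sampling-error formula shows why precision is expensive: the error shrinks only with the square root of n. The sketch below is an illustration under stated assumptions: the function names are hypothetical, σ is treated as known, and z = 1.96 is the usual normal multiplier for a 95% confidence interval.

```python
import math

def standard_error(sigma, n):
    """Sampling error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def required_sample_size(sigma, margin, z=1.96):
    """Smallest n whose confidence-interval half-width z * sigma / sqrt(n) is <= margin."""
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical population standard deviation of 10.
print(standard_error(10, 100))      # error with n = 100
print(standard_error(10, 400))      # four times the sample halves the error
print(required_sample_size(10, 2))  # n needed for a margin of +/- 2
```

Accepting a wider interval (a larger margin) lowers the required n quickly, which is the cost trade-off the slide raises.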

Sudman and Blair – Ch 4 • Value of research = difference between the firm’s profit with and without research • Exercise 4.4.1 (a) (p93): if the hit rate is improved to 85%: • 85 × 1M = 85M; 15 × 1M = 15M; benefit = 70M • 85 × 2M = 170M; 15 × 3M = 45M; benefit = 125M • 85 × 20,000 = 1,700,000; 15 × 30,000 = 450,000; benefit = 1,250,000
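
The benefit figures in the exercise follow one pattern: gains on the hits minus losses on the misses. The sketch below reproduces the first two cases and then applies the "with minus without" definition of the value of research. It assumes the slide's numbers refer to 100 decisions; the function names are hypothetical, and the 70% baseline hit rate in the last call is an invented placeholder, since the slide does not state the hit rate without research.

```python
def net_payoff(hits, misses, gain_per_hit, loss_per_miss):
    """Total gain on successful decisions minus total loss on failed ones."""
    return hits * gain_per_hit - misses * loss_per_miss

def value_of_research(hits_with, hits_without, total, gain_per_hit, loss_per_miss):
    """Payoff with research minus payoff without it, over `total` decisions."""
    with_research = net_payoff(hits_with, total - hits_with, gain_per_hit, loss_per_miss)
    without_research = net_payoff(hits_without, total - hits_without, gain_per_hit, loss_per_miss)
    return with_research - without_research

# First two cases from the slide, with an 85% hit rate on 100 decisions.
print(net_payoff(85, 15, 1_000_000, 1_000_000))
print(net_payoff(85, 15, 2_000_000, 3_000_000))

# Hypothetical baseline of 70 hits without research (NOT from the slide).
print(value_of_research(85, 70, 100, 1_000_000, 1_000_000))
```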

Sudman and Blair – Ch 4 • Value of information • Prior uncertainty = research that contradicts prior perceptions tends to be ignored • Gains or losses = Who is paying? Usually, those who will benefit from the research and lose without it. • Nearness to breakeven: the chances that research will influence the decision

Sudman and Blair – Ch 4 • Informal rules to determine sample size • Use previous/typical practice • Use the « magic number » • Optimize the sample size in the case of sub-group analyses • Influence of the resources on the sample size (time, money) • What do you understand about the magic number?

Sudman and Blair – Ch 4 • Exercise 4.5.4 (a) (p101): A college instructor is planning a class project that involves a telephone survey. 40 students in the class, two-week period, sufficient number of questionnaires per student. → 2-week period = 10 days → No more than 2 hours a day, so 20 h/student → 30-minute survey, so 40 surveys/student/period → 40 students × 40 surveys = 1,600 surveys (sample size)

Sudman and Blair – Ch 5 & 6 • Comparison between Stratified and Cluster sampling

Sudman, Journal of Marketing Research • Identifying zero segments • Cost-effective ways of screening • What could have been an efficient way of screening in case 1?

Sudman, Journal of Marketing Research • Systematic sampling - Similar to simple random sampling, but instead of selecting random numbers from tables, you move through a list (sampling frame) picking every nth name. For example, pick every 10th name from an alphabetical list of students enrolled in a school. • Random Route Sampling - Used in market research surveys, mainly for sampling households, shops, garages and other premises in urban areas. A starting address is randomly selected and, taking alternate left- and right-hand turns at road junctions, every nth address is selected. • Stratified Sampling - All people in the sampling frame are divided into "strata" (groups or categories). Within each stratum, a simple random sample or systematic sample is selected. For example, a politician wishes to poll his/her constituents regarding taxation. The constituents are broken into income brackets and then each bracket is polled. • Cluster or Area Random Sampling - In cluster sampling, the population is divided into clusters (usually along geographic boundaries), the clusters are randomly sampled and all units within the sampled clusters are measured. For example, a survey of town governments that requires visiting the towns in person could use county boundaries as the clusters and randomly select five counties. All the town governments in the selected counties would then be measured. • Multi-stage cluster sampling - As the name implies, this involves drawing several different samples. The first stage is a cluster sample as described above, and then another sample is taken within the selected clusters. For example, a face-to-face survey of the residents of a state could be done by first selecting a sample of counties and then drawing another sample, such as a systematic sample, of the residents of the selected counties. This keeps the cost of interviewing down.
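
The difference between stratified and cluster selection is easiest to see side by side. The sketch below is a minimal illustration, not the procedure from the source: the function names, the equal allocation per stratum and the toy frame of constituents are all assumptions.

```python
import random
from collections import defaultdict

def group_by(frame, key):
    """Split the frame into groups according to key(unit)."""
    groups = defaultdict(list)
    for unit in frame:
        groups[key(unit)].append(unit)
    return groups

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Simple random sample drawn separately inside every stratum."""
    rng = random.Random(seed)
    strata = group_by(frame, stratum_of)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(frame, cluster_of, n_clusters, seed=None):
    """Randomly select whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    clusters = group_by(frame, cluster_of)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical frame: (person id, income bracket, county).
frame = [(i, ("low", "middle", "high")[i % 3], f"county{i % 4}") for i in range(24)]
print(stratified_sample(frame, lambda u: u[1], 2, seed=1))  # 2 people per bracket
print(cluster_sample(frame, lambda u: u[2], 2, seed=1))     # everyone in 2 counties
```

A multi-stage design would apply a second selection step, such as a systematic sample, to the units returned by `cluster_sample` instead of measuring all of them.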
