This presentation outlines a microsimulation framework focused on improving Computer-Assisted Telephone Interviewing (CATI) efficiency. The key motivation is to enable proactive collection management while addressing issues with time, control, and interpretation costs. It introduces core aspects such as modeling individual units, a virtual collection system, and call scheduling rules. The system's simulation parameters, call outcomes derived from data, and the implementation of multinomial logistic regression are discussed. Results from simulations with various parameters highlight impacts on response rates and case outcomes. Future work aims to refine models and explore potential collaborations.
Microsimulation of Survey Collection Yves Bélanger Kristen Couture 26 January 2010
Outline • Motivation • Main aspects of microsimulation • Overview of the system • A short demo • A few results • Future work
Motivation • Ultimate goal: make CATI collection more efficient • proactive collection management • Recent initiatives in the field • Experimentation with time slices, cap on calls, calling priorities, Z-groups, ... • Takes time, lack of control, costly(?), results not always easy to interpret • Need for a controlled environment, where the impact of each aspect can be isolated
Main Aspects of Microsimulation • What is microsimulation? • A modelling technique that operates at the level of individual units, such as persons, households, vehicles, etc. • For us: a "virtual collection" system • What elements are we considering? • The cases (sampled units) • The servers (interviewers) • The call attempts • The waiting queue(s) • The rules of the call scheduler (flows and priorities)
Main Aspects of Microsimulation (cont'd) • What do we want to simulate? • A random component: the result of each call attempt • Use existing BTH data with appropriate statistical models • A deterministic component: how the cases flow through the system • Use simulation software to replicate Blaise: SAS Simulation Studio
Overview of the System • [Diagram; surviving labels: Simulation, Collection, Parameters]
Overview of the System (cont'd) • Call outcome • Modeled using CSGVP 2004 BTH data • Five outcomes derived from BTH outcome codes • Unresolved (e.g. busy signal, wrong number) • Out of Scope (e.g. cell phone, business) • Refusal • Other Contact (e.g. answering machine, appointment) • Respondent
Overview of the System (cont'd) • Used Multinomial Logistic Regression • 7 parameters entered into model: • Afternoon – 1 if call made between 12pm and 5pm • Evening – 1 if call made between 5pm and 9pm • Weekend – 1 if call made on weekend • Resid – 1 if initial status was residential • Unresolved – 1 if call history is only unresolved • Refusal – 1 if history shows at least one refusal • Contact – 1 if history shows at least one contact • Model indices: i = 1..n, j = 1..k
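The model formula from this slide did not survive extraction; only the index ranges remain. As a hedged reconstruction, a standard multinomial logit with one reference outcome could be written as below. The symbols are assumptions, not taken from the slides: x_i is the vector of the seven indicators (plus an intercept) for call attempt i, beta_j is the coefficient vector for outcome j, and outcome 1 is arbitrarily chosen as the reference category.

```latex
P(Y_i = j) = \frac{\exp\left(x_i^{\top}\beta_j\right)}{\sum_{l=1}^{k} \exp\left(x_i^{\top}\beta_l\right)},
\qquad i = 1,\dots,n, \quad j = 1,\dots,k, \quad \beta_1 = 0 \ \text{(reference category)}
```

With the five outcomes listed earlier, k = 5 and the probabilities for any call attempt sum to 1.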
Overview of the System (cont'd) • Calculate probability for each of the five possible outcomes using estimated betas and collection parameters
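The probability calculation can be sketched as follows. This is a minimal illustration, assuming a multinomial logit with "Unresolved" as the reference outcome; the coefficient values, the `BETAS` table, and the function name `outcome_probabilities` are hypothetical and are not the estimates obtained from the CSGVP 2004 BTH data.

```python
import math

# Covariate order: Afternoon, Evening, Weekend, Resid, Unresolved, Refusal, Contact
# Hypothetical coefficients (intercept first), NOT the estimates from the BTH data.
# "Unresolved" is treated as the reference outcome, so its coefficients are all 0.
BETAS = {
    "Unresolved":    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "Out of Scope":  [-2.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0],
    "Refusal":       [-2.5, 0.1, 0.3, 0.1, 0.5, -0.5, 1.0, 0.3],
    "Other Contact": [-1.0, 0.1, 0.4, 0.2, 0.5, -0.5, 0.2, 0.8],
    "Respondent":    [-1.5, 0.2, 0.6, 0.3, 0.8, -0.7, -0.5, 0.6],
}


def outcome_probabilities(x, betas=BETAS):
    """Return {outcome: probability} for a list x of seven 0/1 indicators."""
    scores = {
        outcome: b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
        for outcome, b in betas.items()
    }
    top = max(scores.values())  # subtract the max score for numerical stability
    weights = {outcome: math.exp(s - top) for outcome, s in scores.items()}
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}


# Evening weekday call to a residential number whose history is only unresolved
probs = outcome_probabilities([0, 1, 0, 1, 1, 0, 0])
```

The returned dictionary holds one probability per outcome, and the five values sum to 1.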
Overview of the System (cont'd) • Call duration • Modeled using existing CSGVP 2004 BTH data • Modeled distributions for each of the 5 outcomes
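Drawing a call duration by outcome could look like the sketch below. The slides do not say which distributions were fitted, so the lognormal family, the parameter values in `DURATION_PARAMS`, and the function name `simulate_duration` are all assumptions chosen for illustration.

```python
import random

# Hypothetical lognormal parameters (mu, sigma) for call duration in minutes.
# The distributions fitted to the CSGVP 2004 BTH data are not given in the slides.
DURATION_PARAMS = {
    "Unresolved":    (-0.7, 0.4),
    "Out of Scope":  (0.0, 0.5),
    "Refusal":       (0.3, 0.6),
    "Other Contact": (0.0, 0.5),
    "Respondent":    (3.0, 0.3),
}


def simulate_duration(outcome, rng):
    """Draw one call duration (minutes) for the given outcome."""
    mu, sigma = DURATION_PARAMS[outcome]
    return rng.lognormvariate(mu, sigma)


rng = random.Random(2010)  # fixed seed so a run can be repeated
durations = {o: simulate_duration(o, rng) for o in DURATION_PARAMS}
```

Keeping one parameter pair per outcome mirrors the slide's point that each of the five outcomes has its own duration distribution.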
Overview of the System (cont'd) • Components of model • Input • Allows user to enter parameters via SAS data sets
Overview of the System (cont'd) • Clock • Creates Time Parameters including Afternoon, Evening, Weekend, and Time Slice by reading the current simulation time
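A minimal sketch of the clock step is shown below. It assumes the simulation time is available as a Python `datetime` and that the intervals are half-open (for example, 5pm counts as Evening, not Afternoon); the function name `time_parameters` is hypothetical. Time Slice is left out because the slides do not define the slices.

```python
from datetime import datetime


def time_parameters(now):
    """Derive the time indicators used by the outcome model from the clock."""
    return {
        "afternoon": int(12 <= now.hour < 17),  # 12pm to 5pm
        "evening": int(17 <= now.hour < 21),    # 5pm to 9pm
        "weekend": int(now.weekday() >= 5),     # Saturday = 5, Sunday = 6
    }


params = time_parameters(datetime(2010, 1, 26, 18, 30))  # a Tuesday evening
```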
Overview of the System (cont'd) • Queuing System • Cases are created and enter a queue waiting to be interviewed
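The queuing step can be illustrated with a simple first-in, first-out queue. This is only a sketch: the FIFO order is an assumption (the real scheduler applies flows and priorities), and the case IDs and the helper `next_batch` are hypothetical.

```python
from collections import deque


def next_batch(queue, free_interviewers):
    """Remove and return up to `free_interviewers` cases from the front of the queue."""
    batch = []
    while queue and len(batch) < free_interviewers:
        batch.append(queue.popleft())
    return batch


queue = deque(range(1, 8))    # seven hypothetical case IDs waiting to be called
batch = next_batch(queue, 3)  # the first three cases go to free interviewers
queue.extend(batch)           # cases that are not finalized rejoin at the back
```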
Overview of the System (cont'd) • Determining Call Outcome • Uses probability formulas to determine call outcome: Unresolved, Out of Scope, Other Contact, Refusal, Respondent
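Once the five probabilities are known, one way to pick the outcome is to compare a uniform random draw against the cumulative probabilities. The sketch below assumes the probabilities arrive as a dictionary that sums to 1; the function name `draw_outcome` and the example values are hypothetical.

```python
import random


def draw_outcome(probabilities, rng):
    """Pick one outcome by comparing a uniform draw to cumulative probabilities."""
    u = rng.random()  # uniform on [0, 1)
    cumulative = 0.0
    chosen = None
    for outcome, p in probabilities.items():
        chosen = outcome
        cumulative += p
        if u < cumulative:
            break
    return chosen  # falls back to the last outcome if rounding leaves a gap


# Hypothetical probabilities for one call attempt
example = {
    "Unresolved": 0.40,
    "Out of Scope": 0.05,
    "Refusal": 0.10,
    "Other Contact": 0.25,
    "Respondent": 0.20,
}
outcome = draw_outcome(example, random.Random(42))
```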
Overview of the System (cont'd) • Call Center • Interview takes place • Call duration is simulated • Ability to control interviewer schedule
Overview of the System (cont'd) • Finalizing Cases • Case exits system when… • Outcome code = Out of Scope (OOS) or Respondent • Cap on Calls is reached • Cap of 20 for Residential Status • Cap of 5 for Unknown Status • Number of Refusals = 3 • A BTH file is created as output, in the form of a SAS data set
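The exit rules on this slide can be expressed as a single check. The thresholds come from the slide; the `Case` structure, its field names, and the function name `is_finalized` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass

CALL_CAP = {"Residential": 20, "Unknown": 5}  # cap on calls by initial status
MAX_REFUSALS = 3


@dataclass
class Case:
    initial_status: str      # "Residential" or "Unknown"
    calls: int = 0           # call attempts made so far
    refusals: int = 0        # refusals recorded so far
    last_outcome: str = ""   # outcome of the most recent call attempt


def is_finalized(case):
    """Return True when the case should leave the system."""
    if case.last_outcome in ("Out of Scope", "Respondent"):
        return True
    if case.refusals >= MAX_REFUSALS:
        return True
    return case.calls >= CALL_CAP[case.initial_status]
```

For example, a case with unknown initial status leaves after its fifth call even if every attempt was unresolved, while a residential case in the same situation stays in the queue until its twentieth call.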
A Few Results • Simulation with 10,000 cases for 30 days of collection • Interviewer Agenda • Shift 1 (9am-12pm): 10 interviewers • Shift 2 (12pm-5pm): 10 interviewers • Shift 3 (5pm-9pm): 10 interviewers * Note: No Time Slices in this example
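The interviewer agenda above can be written as data, which makes it easy to vary in later runs. The shift hours and staffing come from the slide; the names `SHIFTS`, `interviewers_on_duty`, and `interviewer_hours_per_day`, and the half-open treatment of shift boundaries, are assumptions for this sketch.

```python
# (start hour, end hour, interviewers) for each shift, from the slide
SHIFTS = [(9, 12, 10), (12, 17, 10), (17, 21, 10)]


def interviewers_on_duty(hour, shifts=SHIFTS):
    """Number of interviewers working at a given hour of the day (0-23)."""
    return sum(n for start, end, n in shifts if start <= hour < end)


def interviewer_hours_per_day(shifts=SHIFTS):
    """Total staffed hours in one day of collection."""
    return sum((end - start) * n for start, end, n in shifts)
```

Under this agenda, `interviewer_hours_per_day()` gives 120 interviewer-hours for each day of collection.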
A Few Results (cont'd) • [Chart: Finalized Cases and Response Rate] • [Chart: Distribution of Outcome Codes]
A Few Results (cont'd) • Impact of Changing Parameters • Number of Interviewers • Length of Collection Period
A Few Results (cont'd) • Changing the Time Per Unit • Cap on Calls is in Effect
Future Work • Continue improvements to system • To outcome model • More explanatory variables • Distinguish between household and person contacts • To simulation system • Implement time slices • Improve priorities • Presentation at JSM (incl. article) • Potential cooperation with Census • Other?... will depend on available budget