Lecture 8

Lecture 8. MARK2039 Summer 2006 George Brown College Wednesday 9-12.

Presentation Transcript


  1. Lecture 8 MARK2039 Summer 2006 George Brown College Wednesday 9-12

  2. Assignment 6 • Backend: H4B2E5STRUGER • Marketing list: H4B2E5STRUGERJOHN4849MAYFAIR • Unaddressed campaign: H4B2E5

  3. Assignment 6

     Id     Total Amount    # of months since last trans.
     456    1280            6 months
     123    300             5 months
     789    76              8 months
     12     10              10 months

  4. Assignment 6 • The data needs to be standardized so that each gender outcome is represented by exactly one value.

  5. Assignment 6 • Use the purchase behaviour field and look at a purchase window (say 3 months, April 2006 to June 2006). A purchase in the window means the customer is a non-defector (0), while no purchase in the window means the customer is a defector (1). I would use the other information (income, region, age, and tenure) as potential variables to help predict defection.

  6. Classification/Profiling vs. Predictive Modelling • [Diagram with two panels] • Profiling (classification): defectors and non-defectors are compared on age, tenure, income, and transaction behaviour. • Predictive modelling (prediction): age, tenure, income, and transaction behaviour from the pre period are the independent variables; defector vs. non-defector status in the post period is the dependent variable.

  7. Predictive Modelling • Examples: Discrete Models • Response Models • Cross-Sell • Upsell • Acquisition • Attrition Models • Product Affinity Models • Risk Models

  8. Predictive Modelling • Examples: Continuous Models • Profitability/Value Models • Spending Models

  9. Types of Predictive Models • An acquisition campaign with no targeting was conducted in January. The available information is as follows: • Mail files containing name and address • Responder files containing name and address • 2001 Stats Can census data, available at the enumeration-area level • A conversion table that maps enumeration areas to postal codes • How would you use the above information to better target prospects to become new customers? • Describe how the analytical file would be created: • 1) Define the objective function, which here means creating the response variable. • 2) Create the response variable by matching the responder file to the mail file, using postal code and last name as the match key. Assign a value of 1 to matches (responders) and 0 to non-matches (non-responders). This field will be created on the mail file or analytical file. • 3) Match the analytical file to the Stats Can conversion file (which contains the enumeration area) by postal code. Then match the new output file by enumeration area to the Stats Can census file, which contains the very rich demographic information. • Remember that the end deliverable is a table with the dependent variable (the objective function) and examples of independent (predictor) variables.
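
The three matching steps on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layouts, the helper names `match_key` and `build_analytical_file`, the enumeration-area codes, and the census values are all hypothetical, and the match key is normalized (upper case, no spaces), which the slide does not specify.

```python
# Hypothetical sample records; only the layout matters, the values are made up.
mail_file = [
    {"last_name": "Struger", "postal_code": "H4B2E5"},
    {"last_name": "Lee", "postal_code": "M5V1A1"},
    {"last_name": "Roy", "postal_code": "K1A0B1"},
]
responder_file = [{"last_name": "STRUGER", "postal_code": "h4b 2e5"}]
postal_to_ea = {"H4B2E5": "EA001", "M5V1A1": "EA002"}      # conversion table
census_by_ea = {"EA001": {"median_income": 52000},          # census data by
                "EA002": {"median_income": 61000}}          # enumeration area


def match_key(record):
    """Match key from the slide: postal code plus last name (normalized)."""
    postal = record["postal_code"].replace(" ", "").upper()
    return postal, record["last_name"].strip().upper()


def build_analytical_file(mail, responders, postal_to_ea, census_by_ea):
    responder_keys = {match_key(r) for r in responders}
    rows = []
    for record in mail:
        postal, last_name = match_key(record)
        ea = postal_to_ea.get(postal)  # step 3a: postal code -> enumeration area
        row = {
            "postal_code": postal,
            "last_name": last_name,
            # step 2: 1 for responders (matches), 0 for non-responders
            "response": 1 if (postal, last_name) in responder_keys else 0,
            "enumeration_area": ea,
        }
        row.update(census_by_ea.get(ea, {}))  # step 3b: attach demographics
        rows.append(row)
    return rows


analytical_file = build_analytical_file(
    mail_file, responder_file, postal_to_ea, census_by_ea
)
for row in analytical_file:
    print(row)
```

Records whose postal code is missing from the conversion table keep `enumeration_area` as `None` and receive no demographic fields, so they can be reviewed separately rather than silently dropped.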

  10. Types of Predictive Models • You have been asked to create programs that better target existing customers for insurance products. You have the following info (the table was not captured in this transcript). What would you do, and how would you create the analytical file? • 1) Define the objective function, which here means creating the insurance response variable. • 2) Create the insurance response variable by looking at the amount spent in a certain transaction type within a certain time frame. Assign a value of 1 if this condition is met and 0 if not. This field will be created on the analytical file. • 3) Create independent model predictors by deriving recency, frequency, and amount variables, overall and by transaction type, from the transaction file. Create demographic variables from the customer file, such as region of country, tenure, age, income, etc. • Remember that the end deliverable is a table with the dependent variable (the objective function) and examples of independent (predictor) variables.
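
As a rough sketch of steps 2 and 3, the Python below derives a response flag plus recency, frequency, and amount variables from a transaction file. The transactions, the April to June 2006 response window, and the choice to build predictors only from transactions before the window are assumptions made for illustration; the slide leaves the transaction type and time frame open.

```python
from datetime import date

# Hypothetical transaction records for illustration only.
transactions = [
    {"customer_id": 1, "date": date(2006, 1, 10), "type": "retail", "amount": 80.0},
    {"customer_id": 1, "date": date(2006, 3, 5), "type": "retail", "amount": 40.0},
    {"customer_id": 1, "date": date(2006, 5, 20), "type": "insurance", "amount": 300.0},
    {"customer_id": 2, "date": date(2005, 11, 2), "type": "retail", "amount": 25.0},
]

# Assumed response window; the slide only says "a certain timeframe".
WINDOW_START = date(2006, 4, 1)
WINDOW_END = date(2006, 6, 30)


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def build_analytical_row(customer_id, txns):
    mine = [t for t in txns if t["customer_id"] == customer_id]
    # Dependent variable: any insurance spend inside the response window.
    in_window = [
        t for t in mine
        if t["type"] == "insurance" and WINDOW_START <= t["date"] <= WINDOW_END
    ]
    # Predictors use only the history before the window opens.
    history = [t for t in mine if t["date"] < WINDOW_START]
    last_date = max((t["date"] for t in history), default=None)
    return {
        "customer_id": customer_id,
        "response": 1 if sum(t["amount"] for t in in_window) > 0 else 0,
        "recency_months": months_between(last_date, WINDOW_START) if last_date else None,
        "frequency": len(history),
        "amount": sum(t["amount"] for t in history),
    }


analytical_file = [build_analytical_row(cid, transactions) for cid in (1, 2)]
for row in analytical_file:
    print(row)
```

Demographic variables from the customer file (region, tenure, age, income) would be joined onto these rows by customer id in the same way.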

  11. Types of Predictive Models • You have been asked to build a targeting tool for a cross-sell campaign to get existing customers to purchase an insurance policy. A campaign was conducted in May 2005. What questions do you need to ask in order to design a proper tool? • Was the campaign data captured? Are responders clearly identified, or do we have to infer them from the database, based on the transactions that occurred within a certain time frame of the campaign?

  12. Types of Predictive Models • You have been asked to target customers who will not only purchase insurance but will also purchase the largest premiums. • What type of model would be built here? • A two-stage model that targets both insurance response and premium. The objective function is the expected value of the premium: Pr(Response) X Premium
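
A small Python sketch of the expected-value objective, Pr(Response) X Premium. The three customers and their scores are invented for illustration; in practice the probability would come from the response model and the premium from a separate continuous model.

```python
def expected_premium(p_response, predicted_premium):
    """Objective function from the slide: Pr(Response) x Premium."""
    return p_response * predicted_premium


# Hypothetical outputs of the two model stages.
customers = [
    {"id": "A", "p_response": 0.10, "predicted_premium": 500.0},
    {"id": "B", "p_response": 0.04, "predicted_premium": 2000.0},
    {"id": "C", "p_response": 0.20, "predicted_premium": 100.0},
]

ranked = sorted(
    customers,
    key=lambda c: expected_premium(c["p_response"], c["predicted_premium"]),
    reverse=True,
)
ranked_ids = [c["id"] for c in ranked]
print(ranked_ids)
```

Customer C has the highest response probability but ranks last, because the expected premium is small. This is why a response model alone would not answer the question on the slide.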

  13. Types of Predictive Models • Creating The Analytical File • Defining the objective function • Defining the Model predictors • Once this is done, the first diagnostic that can be done is the correlation matrix.

  14. Correlation • We want to determine which variables have the strongest relationship with response. • Run the correlation of the dependent variable with all the independent variables (in your reduced set). • Select the best variables based on the highest correlation coefficients (usually those that meet a statistical confidence level of at least 95%). • Correlation can be negative or positive. • Serves as a great pre-screening tool.
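
The pre-screening routine might look like the following Python sketch. The sample data is made up, the helper names are hypothetical, and the confidence level is approximated with the Fisher z-transform and a normal distribution, which is one common choice rather than necessarily the routine used in the course.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def confidence_level(r, n):
    """Approximate two-sided confidence that the true correlation is non-zero.

    Assumption: Fisher z-transform with a normal approximation (needs n > 3).
    """
    if abs(r) >= 1:
        return 1.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def screen(response, predictors, min_confidence=0.95):
    """Return {name: (r, confidence)} and the names passing the threshold."""
    results = {}
    for name, values in predictors.items():
        r = pearson_r(values, response)
        results[name] = (r, confidence_level(r, len(response)))
    selected = sorted(
        (name for name, (_, conf) in results.items() if conf >= min_confidence),
        key=lambda name: abs(results[name][0]),
        reverse=True,
    )
    return results, selected


# Made-up sample: younger customers respond, household size is unrelated.
response = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predictors = {
    "age": [22, 25, 28, 30, 35, 40, 45, 50, 55, 60],
    "household_size": [2, 3, 2, 3, 2, 2, 3, 2, 3, 2],
}
results, selected = screen(response, predictors)
print(results)
print(selected)
```

Variables are ranked by the absolute value of the coefficient, since a strong negative correlation is as useful for screening as a strong positive one.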

  15. The Concept of Correlation • Using correlation analysis to select variables for our response model. • The analytical file contains one dependent (modelled) variable, Response, and six independent variables: • Age • Tenure • # of Products • # of Promotions • Income • Household Size • The key diagnostics in this routine are: • Correlation coefficient • Confidence level

  16. Correlation Coefficient

  17. Correlation Analysis • The male gender variable has a perfect correlation of +1. • The female gender variable has a perfect correlation of -1. • Household size has no correlation with response, hence the correlation coefficient is 0.

  18. Correlation Results • Show the level of confidence that a given variable has a relationship with the modelled behaviour, i.e. response. • [Table columns: Correlation coefficient, Confidence Interval]

  19. Correlation • Why couldn’t we just use the results of the correlation to create the model and create index values for each significant variable? • Age • Tenure • # of products purchased • # of promotions since last purchase • Because there is interaction between variables that needs to be accounted for in the modelling exercise (multicollinearity). You can review this concept in more detail in any introductory stats textbook.

  20. Examples: Correlation, Response Model • Listed below is an example of a correlation matrix. • Answer the following: • Is each variable relevant? • All, with the exception of live in Quebec, # in household, and # of months since last purchase. • What is the relationship or impact of each variable with response? • The sign of the correlation coefficient tells you the direction of the relationship, while its size tells you the impact. • What is the strongest variable and what is the weakest variable? • Strongest variable: # of months since last promoted. Weakest variable: live in Quebec.

  21. More examples of correlation • Younger people are more likely to respond. People with higher income are more likely to respond. Males are less likely to respond. Would the correlation values against response be highly positive, close to zero, or negative for age, income, and females? • Age: highly negative • Income: highly positive • Females: highly positive • People who live in Quebec exhibit no impact on response; people with high tenure and a high number of months since last promotion are less likely to respond. Would the correlation values against response for each variable be highly positive, close to zero, or negative? • Quebec: close to zero • Tenure: highly negative • Number of months since last promotion: highly negative

  22. More examples of correlation • Previous analysis has indicated the following trends. • Would the correlations be closer to 1, -1, or 0 here for both variables? • Spending: close to 0. Tenure: close to -1.

  23. More examples of correlation • Would the correlations be closer to 1, -1, or 0 here for both variables? • Spending: close to 1. Tenure: close to 0. • What is the learning here vs. the previous slide? The variables have changed in their impact on response.

  24. Exploratory Data Analysis Reports(EDA) • After looking at the correlation reports, we also need to create EDA reports which help to better understand the relationship of a given variable with the desired marketing behaviour. • It helps the business people and marketers to get inside the so-called black box of modelling.

  25. Exploratory Data Analysis Reports(EDA)

  26. Exploratory Data Analysis Reports (EDA) • Let’s take a look at an example of a binary variable:

      Male       # of Observations    Response Rate
      Yes        50000                2.00%
      No         50000                2.60%
      Average    100000               2.30%

      On the next page are some examples of EDA reports of variables that are not statistically significant according to the correlation matrix.
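
A Python sketch of how such an EDA report could be produced for the Male variable. The responder counts (1,000 and 1,300) are implied by the 2.00% and 2.60% rates in the table; the index column (group response rate relative to the overall rate, times 100) is an added assumption and does not appear in the slide's table.

```python
def eda_report(groups):
    """groups maps a category label to (observations, responders)."""
    total_n = sum(n for n, _ in groups.values())
    total_responders = sum(r for _, r in groups.values())
    overall_rate = total_responders / total_n
    report = {}
    for label, (n, responders) in groups.items():
        rate = responders / n
        report[label] = {
            "observations": n,
            "response_rate": rate,
            # Index of 100 means the group responds at the overall average rate.
            "index": round(100 * rate / overall_rate),
        }
    return report, overall_rate


male = {"Yes": (50000, 1000), "No": (50000, 1300)}
report, overall_rate = eda_report(male)

for label, stats in report.items():
    print(f"Male={label}: {stats['observations']} obs, "
          f"{stats['response_rate']:.2%} response, index {stats['index']}")
print(f"Overall: {overall_rate:.2%}")
```

On this data the index comes out at 87 for males and 113 for non-males, which gives business users a quick reading of the gap without having to compare raw percentages.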

  27. Exploratory Data Analysis Reports (EDA) • EDAs of variables that are not statistically significant

  28. Exploratory Data Analysis Reports • What does this tell us?

  29. Exploratory Data Analysis Reports • What does this mean?

  30. Creating the Final Model • Why couldn’t we just use the results of the correlation to create the model and create index values for each significant variable? • Age • Tenure • # of products purchased • # of promotions since last purchase • Think statistics here.

  31. The Data Mining Process: Application of Data Mining Techniques, Creating the Final Model • Problems with Multicollinearity • Example: Years of Education and Income on Response Rate • Regression equation: Response = .50 + .00001*Income - .03*Yrs. of Education

      Correlation with Response    Years of Education    Income
      Correlation Coefficient      0.11                  0.12
      Confidence Interval          99%                   99.50%

      What is the problem here and what do you do?

  32. Continuing to build the model • Multivariate analytical techniques such as multiple regression, logistic regression, etc. may be employed to produce the final model. • Final equation: Predicted Response Rate = A - B1*Age + B2*Tenure • What is the problem here?

  33. Continuing to build the model

      Variable           Correlation
      Spend              0.6
      Live in Ontario    0.5
      Number in House    -0.3

      Response = A + .05 X Spend - .03 X Live in Ontario - .01 X Number in House

      Variable           Correlation
      # of products      0.6
      Credit Score       0.4
      Tenure             -0.2

      Response = A - .03 X Number of Products + .08 X Credit Score - .01 X Tenure
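
One way to read this slide is to compare the sign of each variable's correlation with the sign of its coefficient in the equation. The Python sketch below does that with the slide's own numbers; the helper name `sign_conflicts` is hypothetical, and treating a sign flip as a multicollinearity warning is an interpretation rather than something the slide states.

```python
def sign(x):
    """Return 1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_conflicts(correlations, coefficients):
    """Variables whose regression coefficient disagrees in sign with
    their simple correlation against response."""
    return [
        name for name in correlations
        if sign(correlations[name]) != sign(coefficients[name])
    ]


# First model on the slide.
corr_1 = {"Spend": 0.6, "Live in Ontario": 0.5, "Number in House": -0.3}
coef_1 = {"Spend": 0.05, "Live in Ontario": -0.03, "Number in House": -0.01}

# Second model on the slide.
corr_2 = {"# of products": 0.6, "Credit Score": 0.4, "Tenure": -0.2}
coef_2 = {"# of products": -0.03, "Credit Score": 0.08, "Tenure": -0.01}

conflicts_1 = sign_conflicts(corr_1, coef_1)
conflicts_2 = sign_conflicts(corr_2, coef_2)
print(conflicts_1)
print(conflicts_2)
```

In both models one variable is positively correlated with response yet enters the equation with a negative coefficient, which suggests it overlaps with another predictor and is a candidate for removal or for a derived variable.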

  34. Continuing to build the model • After observing the correlation results and EDAs, what can we begin to do at this point? • Derive new variables: EDAs • Derive new variables: multicollinearity • Derive new variables: factor analysis • Derive new variables: CHAID (will explore later) • Reference material: Factor analysis: look up in any statistics handbook. Regression: look up in a textbook under Regression and Statistics.

  35. Continuing to build the model • By running further statistical routines, we are able to develop a final model. The marketer or business person should receive a report that looks as follows: • For those of you who have statistics training, how is the % Contribution to model calculated?

  36. Continuing to Build the Model

      Variable Entered    Partial R-Square    Model R-Square
      var 4               0.0036              0.0036
      var 3               0.0034              0.007
      var 1               0.0016              0.0086
      var 2               0.0007              0.0092
      var 6               0.0009              0.0102
      var 5               0.0003              0.0105
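
One common convention, assumed here rather than confirmed by the slides, is to compute each variable's % contribution as its partial R-square divided by the final model R-square. A Python sketch using the partial R-square values from this table:

```python
# Partial R-square values as listed on the slide, in order of entry.
partial_r_square = {
    "var 4": 0.0036,
    "var 3": 0.0034,
    "var 1": 0.0016,
    "var 2": 0.0007,
    "var 6": 0.0009,
    "var 5": 0.0003,
}

# The final model R-square is the sum of the partial contributions.
model_r_square = sum(partial_r_square.values())

# Assumed convention: contribution = partial R-square / final model R-square.
contribution_pct = {
    name: 100 * value / model_r_square
    for name, value in partial_r_square.items()
}

for name, pct in contribution_pct.items():
    print(f"{name}: {pct:.1f}%")
print(f"Model R-square: {model_r_square:.4f}")
```

Under this convention var 4 and var 3 together account for roughly two thirds of the model's explanatory power, even though the overall R-square of about 0.0105 is small in absolute terms.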

  37. Continuing to Build the Model What would be the final equation in terms of the sign?

  38. Continuing to build the model • What would you do here

  39. Continuing to build the model • Suppose we have the following equation: Response = .09 + .05 X Income + .06 X Tenure + .08 X Product Spend - .04 X Male • What is the problem here?
