Portrait Quadstone Webinar: Scorecard Secrets • Please join the teleconference call; for any problems, contact support@quadstone.com • Issue 5.2-1
How to ask questions • Use Q&A (not Chat, please): • Click on the Q&A Panel icon at the bottom-right of your screen • Type in your question
Webinar: Scorecard Secrets • Presenter: Patrick Surry, VP Technology • Agenda: • Predictive modeling process • How do you assess a given model (scorecard)? • How do you pick the weights in the boxes? • How do you pick the boxes for each field? • How do you pick the fields? • Why use a scorecard (why boxes?), e.g. vs. ‘traditional’ regression
Predictive Modeling Process • What is the business problem (what are we predicting)? • How is success measured (when is one model ‘better’ than another)? • What modeling approach to use? • Preprocessing: • Variable creation • Variable selection • Variable transformation • Core solver: fitting algorithm to generate “best” model • Postprocessing to transform model output into desired prediction (score) • Final model
Typical Business Problems in Marketing (We’ll focus mainly on binary outcomes; approaches are similar for continuous case)
How do we measure success? • The score given to a customer is equivalent to either: • The estimated probability of a binary outcome • The estimated value of a continuous outcome • Sometimes we only care about performance at a cutoff score (e.g. a bank deciding to make a loan or not) • Sometimes we only care about ranking or classifying customers (e.g. outbound marketing wants to call customers most likely to buy first) • Sometimes we care about some quantitative measure of accuracy (e.g. bank wants to predict level of reserves to keep against future bad loans)
How good is a given model? • Nominal non-parametric measures: How good at a cutoff? • Two-by-two contingency tables • Information gain, chi-squared significance, Cramér’s V • Ranked non-parametric measures: How well-ordered? • Gini / ROC • Kolmogorov-Smirnov • Parametric measures: How accurate for each customer? • Divergence statistic • Maximum likelihood measures • Linear regression • Logistic regression • Probit regression • NB. We tend to choose what is mathematically tractable, not what is business relevant • Luckily the measures are typically highly correlated
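As a rough illustration of the ranked measures listed above, the following Python sketch computes a Gini coefficient (taken here as 2 × AUC − 1) and a Kolmogorov-Smirnov statistic from a list of scores and binary outcomes. The function names and the toy data are illustrative assumptions, not part of the product described in the webinar.

```python
def gini(scores, is_good):
    """Gini = 2*AUC - 1, where AUC is the chance a random good outscores a random bad."""
    goods = [s for s, g in zip(scores, is_good) if g]
    bads = [s for s, g in zip(scores, is_good) if not g]
    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5  # ties count as half a win
    auc = wins / (len(goods) * len(bads))
    return 2.0 * auc - 1.0


def ks_statistic(scores, is_good):
    """Largest gap between the cumulative score distributions of bads and goods."""
    n_good = sum(1 for g in is_good if g)
    n_bad = len(is_good) - n_good
    best = 0.0
    for cutoff in sorted(set(scores)):
        good_frac = sum(1 for s, g in zip(scores, is_good) if g and s <= cutoff) / n_good
        bad_frac = sum(1 for s, g in zip(scores, is_good) if not g and s <= cutoff) / n_bad
        best = max(best, abs(bad_frac - good_frac))
    return best


# Toy data (made up): higher scores should mean "more likely good".
toy_scores = [10, 20, 30, 40, 50, 60]
toy_is_good = [False, False, True, False, True, True]
toy_gini = gini(toy_scores, toy_is_good)
toy_ks = ks_statistic(toy_scores, toy_is_good)
```

Both measures depend only on how the score orders the customers, not on the score values themselves, which is why they are called ranked measures.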
Scorecard performance • Target rate = Accept rate = (A + B) / (A + B + C + D) • Hit rate = Bad rate = B / (A + B) • Often can directly assign a financial value to each category • [Figure: count of goods and bads plotted against score, with a cutoff dividing the two distributions into regions A, B, C and D] • Ranked metrics (Gini, KS) measure how well the score sorts goods to the right and bads to the left • Parametric metrics (R², MLE) measure how accurate each prediction is
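The two rates on this slide can be sketched in a few lines of Python. The sketch assumes that customers scoring at or above the cutoff are accepted, so that A is the number of accepted goods and B the number of accepted bads, as the formulas imply; the function name and toy data are hypothetical.

```python
def cutoff_rates(scores, is_bad, cutoff):
    """Return (accept_rate, bad_rate) when everyone scoring >= cutoff is accepted.

    accept_rate = (A + B) / (A + B + C + D)
    bad_rate    = B / (A + B)
    """
    accepted = [bad for s, bad in zip(scores, is_bad) if s >= cutoff]
    accept_rate = len(accepted) / len(scores)
    # Avoid dividing by zero when the cutoff rejects everyone.
    bad_rate = sum(accepted) / len(accepted) if accepted else 0.0
    return accept_rate, bad_rate


# Toy data (made up): six customers, three of them bad.
toy_scores = [10, 20, 30, 40, 50, 60]
toy_is_bad = [True, True, False, True, False, False]
toy_accept_rate, toy_bad_rate = cutoff_rates(toy_scores, toy_is_bad, cutoff=30)
```

Raising the cutoff lowers the accept rate and, for a well-ordered score, lowers the bad rate among those accepted.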
Scorecards
What is a scorecard? • [Figure: a scorecard with its fields, bins and scores (weights) labelled, followed by the “Apply scorecard” step]
Scorecard ingredients • How do you assess a given scorecard (model)? • How do you pick the weights in the boxes? • How do you pick the boxes for each field? • How do you pick the fields? • Why a scorecard (why boxes?) – scorecard vs regression
What numbers in the boxes?
Linear model • Linear model (multiple regression), perhaps with manually transformed variables: y = w x + b (e.g. y is Response, x is Age, Income, w are coefficients, b is intercept) • Scorecard builder doesn’t implement this form • Even with continuous outcomes we use transformed inputs • [Diagram: Inputs (x, y) → Optimizer (Linear solver) → Output (w, b)]
Generalized linear model • Generalized linear model (including link function, e.g. f() as log-odds): f(y) = w x + b, so y = f⁻¹(w x + b) • Although the core is still linear, finding w to optimize the quality metric typically isn’t • Scorecard builder doesn’t implement this form • [Diagram: Inputs (x, y) → Optimizer (Non-linear solver) → Postprocessing (Output Transformation & Rescaling) → Output (w, b)]
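To make the link function concrete, here is a minimal Python sketch of the log-odds link f and its inverse, applied to a linear predictor w x + b. The function names are illustrative assumptions; the slide only states the general form.

```python
import math


def logit(p):
    """Log-odds link: f(y) = log(y / (1 - y)), for 0 < y < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Inverse link: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, w, b):
    """y = f^-1(w x + b) for one customer with feature vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return inverse_logit(z)


# A linear predictor of zero corresponds to even odds.
even_odds_probability = predict_probability([0.0, 0.0], [1.0, 2.0], 0.0)
```

The model stays linear on the log-odds scale, but the fitted probabilities are a non-linear function of w, which is why a non-linear solver is needed.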
Generalized additive model • Generalized additive model (arbitrary functions of the independent variables): f(y) = w F(x) + b • Scorecard builder uses a very simple class of functions F(x): piecewise constant fit of x to the observed outcome, or indicator variables • For continuous outcomes, we use this form without the link function • For binary outcomes, we always use the link function (“linear regression” just uses an approximation of the non-linear solution) • [Diagram: Inputs (x, y) → Preprocessing (Variable Transformation F) → Optimizer (Non-linear or linear approx. solver) → Postprocessing (Output Transformation & Rescaling) → Output (w, b)]
Core Solver • [Pipeline: Variable Transformation → Core Solver → Output Transformation & Rescaling] • Choose weights (numbers in boxes) to maximize likelihood of observing actual outcomes (based on quality measure) • Solver window: controls optimization parameters • Singular value decomposition provides robust solution with correlated variables • Can still see sensitivity with very small categories (though shouldn’t impact predictions unless those categories become large when scoring)
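The idea of choosing weights to maximize the likelihood of the observed outcomes can be illustrated with a small Python sketch. Note that this uses plain gradient ascent on the logistic log-likelihood, which is a stand-in chosen for brevity; it is not the SVD-based solver mentioned on the slide, and the function name, step count and learning rate are all assumptions.

```python
import math


def fit_logistic(rows, outcomes, steps=2000, rate=0.5):
    """Fit weights w and intercept b by gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    n = len(rows)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, outcomes):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = y - p  # gradient of the log-likelihood w.r.t. z
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi + rate * g / n for wi, g in zip(w, grad_w)]
        b += rate * grad_b / n
    return w, b


# Toy data (made up): one indicator variable; outcome rate is 25% when it is 0
# and 75% when it is 1.
toy_rows = [[0], [0], [0], [0], [1], [1], [1], [1]]
toy_outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
toy_w, toy_b = fit_logistic(toy_rows, toy_outcomes)
```

On this toy data the fitted intercept approaches the log-odds of the 25% group and the weight approaches the difference in log-odds between the two groups, which is what a maximum-likelihood fit on a single indicator should recover.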
Postprocessing • Model types: Risk, Response, Churn, Satisfaction • No change to ‘core’ statistics, just flipping signs and labels • Scaling of final score via two constants: • Even-odds point: log(odds) = 0 (50% likelihood) • Odds-doubling factor: log(odds) increment (e.g. +20 points double odds) • Core model always fits odds • Always ‘logistic’ form (except with ‘continuous’ model) • Prediction y is rescaled as Ay + B to give best logistic fit with desired scaling • “Linear regression” quality measure solves linear approximation to logistic
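The two scaling constants can be illustrated with a short Python sketch. It assumes the common convention score = even-odds point + (points to double the odds) × log2(odds); the even-odds score of 500 is an arbitrary assumption, while the 20 points per doubling is the example from the slide. The sign convention would flip for model types where a higher score should mean lower odds.

```python
import math


def scaled_score(log_odds, even_odds_point=500.0, points_to_double=20.0):
    """Rescale natural log-odds linearly so that doubling the odds adds a fixed number of points."""
    return even_odds_point + points_to_double * log_odds / math.log(2.0)


def score_to_probability(score, even_odds_point=500.0, points_to_double=20.0):
    """Invert the scaling: recover the probability implied by a score."""
    log_odds = (score - even_odds_point) * math.log(2.0) / points_to_double
    return 1.0 / (1.0 + math.exp(-log_odds))


score_at_even_odds = scaled_score(0.0)            # odds of 1:1
score_at_double_odds = scaled_score(math.log(2))  # odds of 2:1
```

Because the rescaling is linear in the log-odds, it has the form Ay + B and leaves the ranking of customers unchanged.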
What boxes for each field?
Variable transformation • Generalized additive model: f(y) = w F(x) + b • What are the input variables x or F(x)? • In traditional regression, x are raw variables, or manually transformed, e.g. Income, log(Income) • In scorecard building, either: • One weight per bin: indicator (dummy) variables, one per bin in source variable, representing bin membership • More fitting power (but also more free parameters) • One weight per field: continuous variables, one per field, transformed from the source variable based on the outcome rate in each bin • Implicit transform (to the observed bad rate) gives a significant advantage over “standard” regression techniques
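The two transformations described on this slide can be sketched in Python as follows. The bin edges, field values and function names are made up for illustration; the sketch only shows the shape of the transformed inputs, not how the product computes them.

```python
import bisect


def bin_index(value, edges):
    """Index of the bin holding value; edges are sorted cut points (lower-inclusive bins)."""
    return bisect.bisect_right(edges, value)


def indicator_encoding(value, edges):
    """One weight per bin: a 0/1 indicator variable for each bin."""
    idx = bin_index(value, edges)
    return [1 if i == idx else 0 for i in range(len(edges) + 1)]


def bin_outcome_rates(values, outcomes, edges):
    """Observed outcome rate in each bin (0.0 for an empty bin)."""
    totals = [0] * (len(edges) + 1)
    positives = [0] * (len(edges) + 1)
    for v, y in zip(values, outcomes):
        i = bin_index(v, edges)
        totals[i] += 1
        positives[i] += y
    return [p / t if t else 0.0 for p, t in zip(positives, totals)]


def rate_encoding(value, edges, rates):
    """One weight per field: replace the value by its bin's observed outcome rate."""
    return rates[bin_index(value, edges)]


# Toy data (made up): Age with bins <30, 30-49, 50+ and a binary "bad" outcome.
toy_ages = [22, 25, 35, 45, 55, 65]
toy_bad = [1, 1, 0, 1, 0, 0]
toy_edges = [30, 50]
toy_rates = bin_outcome_rates(toy_ages, toy_bad, toy_edges)
```

With indicator encoding the solver fits one weight per bin; with rate encoding it fits a single weight per field, applied to the piecewise-constant transformed value.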
Optimized binning • Maximize a measure of (categorical) association with the outcome • The default technique is a hierarchical merge • Similar to that used to generate a decision tree • Maximize the information gain at each step
Optimized binning • [Figure: mean of objective field plotted against Age, with target number of bins = 5] • Attempts to maximize univariate predictiveness, that is, minimize loss of predictiveness • Uses either iterative splitting (like decision tree) or exhaustive search
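A hierarchical merge of the kind described in these two slides might look like the Python sketch below. It assumes an ordered field whose fine-grained bins are given as (positives, total) counts, and at each step merges the adjacent pair of bins that keeps the information gain highest. The function names and the stopping rule (a fixed target number of bins) are illustrative assumptions.

```python
import math


def entropy(pos, total):
    """Binary entropy (in bits) of a bin with pos positives out of total."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(bins):
    """Entropy of the whole population minus the weighted entropy within bins."""
    pos = sum(p for p, _ in bins)
    total = sum(t for _, t in bins)
    within = sum(t / total * entropy(p, t) for p, t in bins)
    return entropy(pos, total) - within


def merge_bins(bins, target):
    """Greedily merge adjacent bins until only `target` bins remain."""
    bins = list(bins)
    while len(bins) > target:
        best = None
        for i in range(len(bins) - 1):
            merged = (bins[i][0] + bins[i + 1][0], bins[i][1] + bins[i + 1][1])
            candidate = bins[:i] + [merged] + bins[i + 2:]
            gain = information_gain(candidate)
            if best is None or gain > best[0]:
                best = (gain, candidate)
        bins = best[1]
    return bins


# Toy data (made up): the first two bins behave identically, so merging them
# loses no information.
toy_bins = [(10, 10), (10, 10), (0, 10)]
toy_merged = merge_bins(toy_bins, target=2)
```

Merging bins can never increase the information gain, so the procedure is a trade-off between fewer bins and retained predictiveness.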
Summary: Scorecard vs Traditional (linear) regression • scorecard is a generalized additive model based on indicator functions • model is thus piecewise constant in each of the independent variables • automatically captures non-linear relationships, more robust to outliers • increases in lift of 2% or more in real-world risk, response and retention modeling applications • simpler to build, understand & explain / socialize
What fields?
Scorecard Builder: stepwise inclusion / exclusion • Why not use all available fields? • More fields typically increase training performance but risk overfit on test • Larger model is more difficult to explain & socialize • Build time scales by square of number of (transformed) variables • Build a set of trial scorecards: • Include a candidate field • Build a scorecard • Compute the quality measure • Include the next field… • Choose the field that creates the best trial scorecard • Linear Fit uses residuals to compute the marginal sum-of-squares error • Quality Measure uses a hybrid • Similar to traditional -score technique
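The trial-scorecard loop for stepwise inclusion can be sketched generically in Python. Here `quality_of` is a hypothetical callback that builds a trial scorecard on the given fields and returns its quality measure; the toy quality function below simply adds up made-up per-field contributions and is not how a real scorecard would be assessed.

```python
def forward_select(candidate_fields, quality_of, max_fields):
    """Greedy stepwise inclusion: repeatedly add the field whose trial model scores best."""
    selected = []
    remaining = list(candidate_fields)
    while remaining and len(selected) < max_fields:
        # Build one trial model per remaining candidate and keep the best.
        best_field = max(remaining, key=lambda f: quality_of(selected + [f]))
        selected.append(best_field)
        remaining.remove(best_field)
    return selected


# Toy quality measure (made up): each field contributes a fixed amount.
toy_contribution = {"age": 0.30, "income": 0.20, "region": 0.05}


def toy_quality(fields):
    return sum(toy_contribution[f] for f in fields)


toy_selected = forward_select(["region", "age", "income"], toy_quality, max_fields=2)
```

Each pass builds one trial model per remaining field, so the number of builds grows quickly with the number of candidates, which is one reason the slide cautions against using all available fields.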
“Right-Size” Scorecard • Select point at which model quality exhibits diminishing returns on test data: (1) Generate test/training split (2) Build logistic model on training data using remaining variables (initially all) (3) Measure quality (Gini measure) when applied to test data (4) Exclude least contributory variable (based on training data) (5) Repeat from step 2 until no variables remain (6) Choose last point where test-set performance increases by minimum threshold (7) Refit model with selected number of variables using all data
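The elimination loop and the choice of the point of diminishing returns can be sketched in Python as below. The callbacks `test_quality_of` (build on training data, measure on test data) and `least_contributory` (pick the variable to drop) are hypothetical placeholders, the 0.005 threshold is an arbitrary assumption, and the toy functions use made-up contributions rather than fitted models.

```python
def elimination_path(fields, test_quality_of, least_contributory):
    """Drop one variable at a time, recording test quality for each model size."""
    current = list(fields)
    path = []
    while current:
        path.append((list(current), test_quality_of(current)))
        current.remove(least_contributory(current))
    path.reverse()  # smallest model first
    return path


def right_size(path, min_gain=0.005):
    """Pick the last model whose test quality beats the next-smaller model by min_gain."""
    chosen = path[0][0]
    for (_, prev_quality), (fields, quality) in zip(path, path[1:]):
        if quality - prev_quality >= min_gain:
            chosen = fields
    return chosen


# Toy stand-ins (made up): fixed contribution per variable.
toy_contribution = {"age": 0.40, "income": 0.12, "tenure": 0.06, "region": 0.001}


def toy_test_quality(fields):
    return sum(toy_contribution[f] for f in fields)


def toy_least_contributory(fields):
    return min(fields, key=lambda f: toy_contribution[f])


toy_path = elimination_path(
    ["age", "income", "tenure", "region"], toy_test_quality, toy_least_contributory
)
toy_chosen = right_size(toy_path)
```

In the toy example, adding the fourth variable improves test quality by less than the threshold, so the three-variable model is selected; the final step on the slide would then refit that model on all the data.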
Automating scorecard building workflow • [Diagram: Analytic Dataset → Optimize Binnings → Variable Reduction → Right-size Model → Final Model, with (optional) parameter overrides] • Optimized binning of each independent variable: “best” piecewise constant transformation • Variable reduction using recursive step-wise exclusion • Model “right-sizing” by seeking point of diminishing returns on test data • Many variables (1000s) require automated tools to help focus effort • Time is money • To perform same steps by hand would take several days (or weeks) • Automation completes in minutes
Further Reading • Generalized additive models (GAM) • Generalized linear models (GLM) • Gini • ROC • Kolmogorov-Smirnov • Probit • Logit • Singular value decomposition (SVD) • McCullagh P, Nelder JA, Generalised Linear Models (2nd edition), Chapman and Hall, 1989.
After the webinar • These slides, and a recording of this webinar will be available via http://support.quadstone.com/info/events/webinars/ • Any problems or questions, please contact support@quadstone.com
Upcoming webinars See http://support.quadstone.com/info/events/webinars/