Data Mining For Credit Card Fraud: A Comparative Study

Presentation Transcript


  1. Data Mining For Credit Card Fraud: A Comparative Study
     Xxxxxxxx | DSCI 5240 | Dr. Nick Evangelopoulos | Graduate Presentation

  2. Overview
     • Credit Card Fraud
     • Data Mining Techniques
     • Data
     • Experimental Setup
     • Results
     Graduate Presentation | DSCI 5240 | Xxxxxxx

  3. Credit Card Fraud
     • Two types:
       • Application fraud
         • Obtaining new cards using false information
       • Behavioral fraud
         • Mail theft
         • Stolen/lost card
         • Counterfeit card

  4. Credit Card Fraud
     • Online revenue loss due to fraud (source: cybersource.com)

  5. Data Mining Techniques
     • Logistic Regression
       • Used to predict the outcome of a categorical dependent variable
       • The fraud variable is binary
     • Support Vector Machines
     • Random Forest
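As a minimal sketch of how a logistic regression model turns transaction features into a fraud probability for a binary outcome, the snippet below applies the logistic function to a weighted sum. The coefficients, bias, and feature values are hypothetical and are not taken from the presentation.

```python
import math

def fraud_probability(features, weights, bias):
    """Logistic model: P(fraud = 1 | x) = 1 / (1 + exp(-(w . x + b)))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and feature values, for illustration only.
weights = [0.8, -1.2]
bias = -0.5
p = fraud_probability([2.0, 0.5], weights, bias)

# A 0.5 cutoff is a common default; in a heavily imbalanced problem the
# threshold would normally be tuned rather than fixed.
label = 1 if p >= 0.5 else 0
print(round(p, 4), label)
```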

  6. Support Vector Machines (SVM)
     • Supervised learning models with associated learning algorithms that analyze data and recognize patterns
     • Linear classifiers that operate in a high-dimensional feature space, which is a non-linear mapping of the input space
     • Two properties of SVM:
       • Kernel representation
       • Margin optimization
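The kernel representation can be illustrated with the Gaussian radial basis function, which the experimental setup later names as the kernel used for the SVM. The sketch below computes that kernel for two feature vectors; the `gamma` value is an assumption, since the transcript does not give the kernel width.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).

    gamma is a hypothetical width parameter chosen for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical points have similarity 1; similarity decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
near = rbf_kernel([0.0, 0.0], [1.0, 1.0])
far = rbf_kernel([0.0, 0.0], [3.0, 3.0])
print(same, near, far)
```

The kernel value acts as an inner product in the implicit high-dimensional feature space, so the classifier never has to compute the mapping explicitly.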

  7. Random Forest (RF)
     • Ensemble of classification trees
     • Performs well when individual members are dissimilar
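To make the ensemble idea concrete, here is a small sketch of majority voting over several classifiers. The three "trees" are hand-written stand-ins rather than trained models, and the attribute names (`amount`, `txns_last_day`, `foreign`) are hypothetical.

```python
from collections import Counter

def majority_vote(trees, x):
    """Return the class predicted by the largest number of trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained classification trees. Each looks at
# different attributes, so the members are deliberately dissimilar.
trees = [
    lambda x: 1 if x["amount"] > 500 else 0,
    lambda x: 1 if x["txns_last_day"] > 10 else 0,
    lambda x: 1 if x["amount"] > 300 and x["foreign"] else 0,
]

txn = {"amount": 650.0, "txns_last_day": 2, "foreign": True}
prediction = majority_vote(trees, txn)  # two of three trees vote fraud
print(prediction)
```

An odd number of members avoids ties in a two-class vote.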

  8. Data: Datasets
     • 13 months of data (Jan 2006 – Jan 2007)
     • 50 million credit card transactions on 1 million credit cards
     • 2,420 known fraudulent transactions involving 506 credit cards
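The counts above imply a very strong class imbalance. The short calculation below works out the rates, treating the rounded slide figures (50 million, 1 million) as exact, which is an assumption.

```python
# Figures as stated on the slide; the totals are rounded, so the rates
# below are approximate.
total_transactions = 50_000_000
fraud_transactions = 2_420
total_cards = 1_000_000
fraud_cards = 506

txn_fraud_rate = fraud_transactions / total_transactions
card_fraud_rate = fraud_cards / total_cards
fraud_txns_per_fraud_card = fraud_transactions / fraud_cards

print(f"Fraudulent transactions: {txn_fraud_rate:.4%}")
print(f"Cards with fraud:        {card_fraud_rate:.4%}")
print(f"Fraudulent transactions per affected card: {fraud_txns_per_fraud_card:.2f}")
```

Roughly 0.005% of transactions and 0.05% of cards are known to be fraudulent, with just under five fraudulent transactions per affected card on average.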

  9. Percentage of transactions by transaction type

  10. Data Selection

  11. Primary attributes in Dataset

  12. Derived Attributes

  13. Experimental Setup
     • For SVM, a Gaussian radial basis function was used as the kernel function
     • For Random Forest, the number of attributes considered at each node and the number of trees were set
     • Data were sampled at different rates using random undersampling of the majority class
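Random undersampling of the majority class can be sketched as follows: keep every fraud record and draw only as many legitimate records as needed to reach a chosen fraud rate. The function name, the record layout, and the 10% target are assumptions for illustration; the transcript does not list the sampling rates that were used.

```python
import random

def undersample_majority(records, label_key="fraud", target_fraud_rate=0.1, seed=0):
    """Keep all fraud records and a random subset of legitimate ones so that
    fraud makes up roughly target_fraud_rate of the returned sample."""
    if not 0 < target_fraud_rate <= 1:
        raise ValueError("target_fraud_rate must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    fraud = [r for r in records if r[label_key] == 1]
    legit = [r for r in records if r[label_key] == 0]
    # Solve fraud / (fraud + legit_kept) = target_fraud_rate for legit_kept.
    n_legit = round(len(fraud) * (1 - target_fraud_rate) / target_fraud_rate)
    n_legit = min(n_legit, len(legit))
    sample = fraud + rng.sample(legit, n_legit)
    rng.shuffle(sample)
    return sample

# Toy data: 20 fraud records out of 1,000 (2% fraud before sampling).
records = [{"id": i, "fraud": 1 if i < 20 else 0} for i in range(1000)]
train = undersample_majority(records, target_fraud_rate=0.10)
fraud_share = sum(r["fraud"] for r in train) / len(train)
print(len(train), fraud_share)
```

Only the training data would normally be resampled this way, so that evaluation still reflects the original class distribution.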

  14. Training and testing data

  15. Results

  16. Proportion of fraud captured at different depths
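One way to compute such a measure is sketched below, assuming that "depth" means the fraction of transactions examined after ranking them from highest to lowest model score. That reading, the function name, and the toy scores are assumptions; the slide's chart did not survive in the transcript.

```python
def fraud_captured_at_depth(scores, labels, depth):
    """Share of all fraud cases found in the top `depth` fraction of
    transactions when ranked by descending fraud score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))
    total_fraud = sum(labels)
    if total_fraud == 0:
        return 0.0
    captured = sum(label for _, label in ranked[:n_top])
    return captured / total_fraud

# Toy scores and labels (1 = fraud), for illustration only.
scores = [0.95, 0.80, 0.70, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

for depth in (0.1, 0.3, 1.0):
    print(depth, fraud_captured_at_depth(scores, labels, depth))
```

A model that ranks fraud well captures most cases at shallow depths, which matters when only a small share of flagged transactions can be investigated.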

  17. Fraud Capture Rate with Different Fraud Rates in Training Data

  18. Conclusion
     • Examined the performance of two data mining techniques, SVM and RF, together with logistic regression
     • Used a real-life dataset from Jan 2006 – Jan 2007
     • Used random undersampling of the majority class to sample the data
     • Random forest showed much higher performance at upper file depths
     • SVM performance at the upper file depths tended to increase with a lower proportion of fraud in the training data
     • Random forest demonstrated better overall performance

  19. Questions