
Crime Forecasting Using Boosted Ensemble Classifiers


Presentation Transcript


  1. Crime Forecasting Using Boosted Ensemble Classifiers Presented by: Chung-Hsien Yu Advisor: Prof. Wei Ding Department of Computer Science, University of Massachusetts Boston 2012 GRADUATE STUDENTS SYMPOSIUM

  2. Abstract • Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data. • Training base-line learners on the resulting clusters. • Adapting a greedy algorithm to build a rule-based ensemble classifier in each boosting round. • Pruning the ensemble classifier to prevent it from overfitting. • Constructing a strong hypothesis from the ensemble classifiers obtained in each round.

  3. Original Data

  4. Aggregated Data [figure: crime incident counts aggregated into grid cells]

  5. Monthly Data [figure: grids of aggregated crime counts per cell, one per month]
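
For illustration, the aggregation step might be sketched as below: individual incidents are binned into a regular grid, one count matrix per month. The grid size, spatial extent, and synthetic incident data are assumptions for the example, not the study's actual parameters.

```python
# Illustrative sketch of aggregation: binning individual crime incidents
# into a regular spatial grid, producing one count matrix per month.
# (Synthetic data; grid shape and extent are assumptions.)
import numpy as np

rng = np.random.default_rng(1)
xy = rng.uniform(0, 10, size=(5000, 2))        # incident coordinates
month = rng.integers(0, 12, size=5000)         # incident month

grids = [np.histogram2d(xy[month == m, 0], xy[month == m, 1],
                        bins=(20, 20), range=[[0, 10], [0, 10]])[0]
         for m in range(12)]
print(grids[0].shape, grids[0].sum())          # (20, 20) counts for month 0
```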

  6. Monthly Clusters (k=3)

  7. Monthly Clusters (k=4)
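
As a sketch of the multi-clustering step, the monthly per-cell counts can be clustered with k-means for the k values shown on these slides. The synthetic counts and the use of scikit-learn's KMeans are illustrative assumptions, not the authors' stated implementation.

```python
# Sketch of multi-clustering monthly aggregated counts with k-means
# (synthetic data; one row per grid cell, one column per month).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
monthly_counts = rng.poisson(lam=3.0, size=(400, 12))

for k in (3, 4):                               # k = 3 and k = 4, as on the slides
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(monthly_counts)
    print(f"k={k}: cluster sizes = {np.bincount(labels)}")
```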

  8. Flow Chart

  9. Algorithm (Part I)

  10. Algorithm (Part II)

  11. Confidence Value From AdaBoost (Schapire & Singer 1998) we have $Z_t = \sum_i D_t(i)\,e^{-\alpha_t y_i h_t(x_i)}$. Let $\alpha_t = 1$ and ignore the boosting round $t$. $C_R$ is defined as the confidence value for the rule $R$, and $h(x) = C_R$ if $x$ satisfies $R$ ($h(x) = 0$ otherwise).
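
A minimal sketch of the confidence-value computation, assuming $W_+$ and $W_-$ are the total weights of the positive and negative examples covered by the rule; the smoothing term eps is our addition to guard against zero weights, not something stated on the slide.

```python
import numpy as np

def confidence_value(w_pos, w_neg, eps=1e-8):
    # C_R = (1/2) ln(W+ / W-); the eps smoothing is an assumption
    return 0.5 * np.log((w_pos + eps) / (w_neg + eps))

print(confidence_value(0.30, 0.05))   # rule mostly covers positives -> C_R > 0
```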

  12. Objective Function Split the total weight into $W_0$ (examples not covered by $R$), $W_+$ (covered and correctly classified), and $W_-$ (covered but misclassified). Therefore, $Z = W_0 + W_+ e^{-C_R} + W_- e^{C_R}$.

  13. Minimum Z Value $Z$ has the minimum value $Z = W_0 + 2\sqrt{W_+ W_-}$ when $C_R = \frac{1}{2}\ln\frac{W_+}{W_-}$, obtained by setting $\partial Z/\partial C_R = -W_+ e^{-C_R} + W_- e^{C_R} = 0$.
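
A quick numeric check of this closed-form minimum, with arbitrarily chosen example weights: evaluate $Z(C_R)$ on a grid and compare against $C_R = \frac{1}{2}\ln(W_+/W_-)$.

```python
import numpy as np

w0, w_pos, w_neg = 0.65, 0.30, 0.05            # example weights summing to 1

def Z(c):                                      # Z = W0 + W+ e^{-C} + W- e^{C}
    return w0 + w_pos * np.exp(-c) + w_neg * np.exp(c)

c_star = 0.5 * np.log(w_pos / w_neg)           # closed-form minimizer
grid = np.linspace(-3, 3, 10001)
print(np.isclose(grid[np.argmin(Z(grid))], c_star, atol=1e-3))  # True
print(Z(c_star), w0 + 2 * np.sqrt(w_pos * w_neg))               # both ~0.8949
```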

  14. BuildChain Function Repeatedly add a base classifier to $R$ until $\sqrt{W_+} - \sqrt{W_-}$ is maximized. Because the weights sum to one, $Z = W_0 + 2\sqrt{W_+ W_-} = 1 - (\sqrt{W_+} - \sqrt{W_-})^2$, so this minimizes $Z$ as well.
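
A sketch of one greedy BuildChain pass, under the assumption that a "chain" acts as a conjunction of simple feature-threshold tests; the candidate set, data layout, and helper names are hypothetical.

```python
# Greedy chaining sketch: keep extending the chain while the objective
# sqrt(W+) - sqrt(W-) strictly improves (which minimizes Z, per slide 14).
import numpy as np

def z_tilde(w_pos, w_neg):
    return np.sqrt(w_pos) - np.sqrt(w_neg)

def build_chain(X, y, w, candidates):
    covered = np.ones(len(y), dtype=bool)      # examples satisfying the chain
    chain, best = [], -np.inf
    while True:
        step = None
        for f, thr in candidates:              # try each base classifier
            cov = covered & (X[:, f] > thr)
            score = z_tilde(w[cov & (y == 1)].sum(), w[cov & (y == -1)].sum())
            if score > best:
                best, step = score, (f, thr, cov)
        if step is None:                       # no strict improvement: stop
            return chain
        f, thr, covered = step
        chain.append((f, thr))

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] + X[:, 2] > 0, 1, -1)
w = np.full(200, 1 / 200)
cands = [(f, t) for f in range(5) for t in (-0.5, 0.0, 0.5)]
print(build_chain(X, y, w, cands))
```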

  15. PruneChain Function Loss Function: $L = (1 - V_+ - V_-) + V_+ e^{-C_R} + V_- e^{C_R}$. $C_R$ is obtained from GrowSet; $V_+$ and $V_-$ are obtained by applying $R$ to PruneSet. Minimize $L$ by repeatedly removing the last classifier from $R$.
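
A sketch of PruneChain under the same conjunction-of-tests assumption: trailing classifiers are dropped while the loss on PruneSet does not increase. Here X, y, w come from PruneSet and c_r from GrowSet; the covers helper is our hypothetical stand-in for evaluating a chain.

```python
import numpy as np

def covers(chain, X):
    # examples satisfying every (feature, threshold) test in the chain
    cov = np.ones(len(X), dtype=bool)
    for f, thr in chain:
        cov &= X[:, f] > thr
    return cov

def prune_loss(v_pos, v_neg, c_r):
    # (1 - V+ - V-) + V+ e^{-C_R} + V- e^{C_R}, per the slide's loss function
    return (1.0 - v_pos - v_neg) + v_pos * np.exp(-c_r) + v_neg * np.exp(c_r)

def prune_chain(chain, X, y, w, c_r):
    def loss(ch):
        cov = covers(ch, X)
        return prune_loss(w[cov & (y == 1)].sum(), w[cov & (y == -1)].sum(), c_r)
    while len(chain) > 1 and loss(chain[:-1]) <= loss(chain):
        chain = chain[:-1]                     # dropping the last classifier helps
    return chain
```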

  16. Update Weights Calculate $Z_t$ with the ensemble classifier $R$ on the entire data set, then update $D_{t+1}(i) = D_t(i)\,e^{-y_i h_t(x_i)} / Z_t$, where $h_t(x_i) = C_R$ if $x_i$ satisfies $R$ and $0$ otherwise.
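
A minimal sketch of the weight update: the standard confidence-rated AdaBoost step, with $h(x) = C_R$ on covered examples and $0$ elsewhere.

```python
import numpy as np

def update_weights(w, y, covered, c_r):
    h = np.where(covered, c_r, 0.0)    # confidence-rated output of the chain
    w = w * np.exp(-y * h)             # D_{t+1}(i) ~ D_t(i) e^{-y_i h_t(x_i)}
    return w / w.sum()                 # normalizing by the sum divides by Z_t
```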

  17. Strong Hypothesis At the end of boosting, there are $T$ chains, and the strong hypothesis is $H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} h_t(x)\right)$.
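
A sketch of the final prediction: the sign of the summed confidence-rated outputs of the $T$ pruned chains. The covers helper and the per-chain confidences are hypothetical stand-ins, as in the sketches above.

```python
import numpy as np

def strong_hypothesis(X, chains, confidences, covers):
    # H(x) = sign(sum_t h_t(x)), with h_t(x) = C_{R_t} on covered examples
    votes = np.zeros(len(X))
    for chain, c_r in zip(chains, confidences):
        votes += np.where(covers(chain, X), c_r, 0.0)
    return np.sign(votes)
```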

  18. SUMMARY Grid cells with similar crime counts that fall into the same cluster are also geographically close to each other on the map; moreover, the clustering separates high-crime-rate areas from low-crime-rate areas. In each boosting round, the data set is randomly divided into two subsets (GrowSet and PruneSet). The greedy weak-learn algorithm uses confidence-rated evaluation to “chain” the base-line classifiers on one subset, and then “trims” the chain using the other. The resulting strong hypothesis is easy to calculate.

  19. Q & A THANK YOU!!
