
Efficient Image Classification on Vertically Decomposed Data







  1. Efficient Image Classification on Vertically Decomposed Data Taufik Abidin, Aijuan Dong, Hongli Li, and William Perrizo Computer Science North Dakota State University The 1st IEEE International Workshop on Multimedia Databases and Data Management (MDDM-06)

  2. Outline • Image classification • The application of the SMART-TV algorithm in image classification • The SMART-TV algorithm • Experimental results • Summary

  3. Image Classification • Why classify images? • The proliferation of digital images • The need to organize them into semantic categories for effective browsing and retrieval • Techniques for image classification: • SVM, Bayesian, Neural Network, KNN

  4. Image Classification Cont. • In this work, we focus on the KNN method • KNN is widely used in image classification: • Simple and easy to implement • Good classification results • Problems: • Classification time is linear in the size of the image repository • When the repositories are very large, containing millions of images, KNN is impractical

  5. Our Contributions • We apply our recently developed classification algorithm, SMART-TV, to the image classification task and analyze its performance • We demonstrate that SMART-TV, a classification algorithm that uses the P-tree vertical data structure, is fast and scalable to very large image databases • We show that for Corel images a combination of color and texture features is a good alternative for representing the low-level content of images

  6. Image Preprocessing • We extracted color and texture features from the original pixels of the images • We created a 54-dimension color histogram in HSV (6x3x3) color space for the color features and created 8 multi-resolution Gabor filters (4 orientations and 2 scales) to extract the texture features of the images (see B.S. Manjunath, IEEE Trans. on Pattern Analysis and Machine Intelligence, 1996, for more detail about the filters)

  7. Image Preprocessing Cont. Color Features • Convert RGB to HSV • HSV corresponds to the way humans tend to perceive color • Each component value is in the range 0..1 • Quantize the image into 54 bins, i.e. (6 x 3 x 3) bins • Record the frequency of the HSV values of the pixels in the image
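The quantization step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the pixels have already been converted to HSV (e.g. with `colorsys.rgb_to_hsv`) so that each component lies in [0, 1], and bins them into 6 hue, 3 saturation, and 3 value levels.

```python
def hsv_histogram(pixels, h_bins=6, s_bins=3, v_bins=3):
    """Build a normalized 54-bin (6 x 3 x 3) HSV color histogram.

    `pixels` is an iterable of (h, s, v) tuples, each component in [0, 1].
    """
    hist = [0] * (h_bins * s_bins * v_bins)
    n = 0
    for h, s, v in pixels:
        # min() guards the boundary case where a component equals exactly 1.0
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[hi * s_bins * v_bins + si * v_bins + vi] += 1
        n += 1
    # Normalize counts to frequencies so histograms of different-sized
    # images are comparable
    return [c / n for c in hist] if n else hist
```

Normalizing by the pixel count makes the 54 color features scale-invariant across image sizes.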

  8. Image Preprocessing Cont. Texture Features • Transform the images into the frequency domain using the 8 generated filters (4 orientation and 2 scale parameters) and record the standard deviation and the mean of the pixels in each image after transformation • This process produces 16 texture features for each image
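The reduction from 8 filter responses to 16 features can be sketched like this. The Gabor filtering itself is omitted (the deck refers to Manjunath's filter bank for that); the sketch assumes the 8 filtered images are already available as lists of response magnitudes, and simply records a mean and a standard deviation per filter.

```python
import math

def texture_features(responses):
    """Collapse 8 Gabor filter responses (4 orientations x 2 scales)
    into 16 texture features: the mean and standard deviation of each
    filtered image's response magnitudes.

    `responses` is a list of 8 lists of per-pixel magnitudes.
    """
    feats = []
    for resp in responses:
        n = len(resp)
        mean = sum(resp) / n
        var = sum((x - mean) ** 2 for x in resp) / n
        feats.append(mean)            # one mean per filter
        feats.append(math.sqrt(var))  # one std-dev per filter
    return feats
```

Concatenated with the 54 color-histogram bins, this yields the 70-dimensional feature vector used later in the experiments.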

  9. Overview of SMART-TV • Preprocessing Phase (on the large training set): Compute root counts → Measure the TV of each object in each class → Store the root count and TV values • Classifying Phase (on an unclassified object): Approximate the candidate set of NNs → Search the k-nearest neighbors from the candidate set → Vote

  10. SMART-TV Algorithm • SMART-TV: SMall Absolute diffeRence of ToTal Variation • Approximates a candidate set of nearest neighbors by examining the absolute difference between the total variation of each data object in the training set and the total variation of the unclassified object • The k-nearest neighbors are then searched from the candidate set • Computing Total Variation (TV):
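The TV formula on this slide was an image in the original deck and did not survive transcription. A definition consistent with the surrounding slides (and with the P-tree literature's notion of the total variation of a set about a point) would be the sum of squared distances from a point to every object in the set; note this is an assumed reconstruction, not the slide's own rendering:

```latex
\mathrm{TV}(X, a) \;=\; \sum_{x \in X} (x - a) \circ (x - a)
             \;=\; \sum_{x \in X} \sum_{i=1}^{d} (x_i - a_i)^2
```

Here $X$ is a set of training objects (e.g. a class), $a$ is the point the variation is measured about, and $d$ is the number of feature dimensions.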

  11. SMART-TV Algorithm

  12. The Independence of RC • The root count operations are independent of the unclassified object, which allows us to run the operations once in advance and retain the count results • In a classification task, the set of classes is known and unchanged. Thus, the total variation of an object about its class can be pre-computed

  13. Preprocessing Phase Preprocessing: • Compute the root counts of each class Cj, where 1 ≤ j ≤ number of classes. O(kdb2), where k is the number of classes, d is the number of dimensions, and b is the bit-width • Compute the total variation of each training object about its class, 1 ≤ j ≤ number of classes. O(n), where n is the number of images in the training set
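Why the root counts can be precomputed is easiest to see by expanding the total variation of a class C about a point a: TV(C, a) = Σ|x|² − 2·a·Σx + |C|·|a|², so only the per-class count, vector sum, and sum of squared norms are needed, and none of them depend on a. The sketch below illustrates this with plain sums standing in for the quantities that SMART-TV actually derives from P-tree root counts of the vertical bit slices; it is an assumption-laden illustration, not the paper's P-tree code.

```python
def class_summaries(training, labels):
    """Precompute per class: n = |C|, s1 = vector sum, s2 = sum of squared norms.
    These are the query-independent quantities that root counts supply."""
    summaries = {}
    for x, c in zip(training, labels):
        n, s1, s2 = summaries.get(c, (0, [0.0] * len(x), 0.0))
        summaries[c] = (n + 1,
                        [a + b for a, b in zip(s1, x)],
                        s2 + sum(v * v for v in x))
    return summaries

def tv(summary, a):
    """TV(C, a) = sum_{x in C} |x - a|^2 = s2 - 2 a.s1 + n |a|^2,
    evaluated in O(d) per query from the precomputed summary."""
    n, s1, s2 = summary
    return (s2
            - 2 * sum(ai * si for ai, si in zip(a, s1))
            + n * sum(ai * ai for ai in a))
```

Once the summaries exist, computing the TV of a new image about every class costs O(kd) rather than a pass over the training set.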

  14. Classifying Phase Classifying: • For each class Cj, where 1 ≤ j ≤ number of classes, do: a. Compute the total variation of the unclassified image's feature vector about Cj b. Find the hs images in Cj whose total variations are closest to that of the unclassified image, i.e. those for which the absolute difference between the two total variations is smallest c. Store the IDs of these images in an array TVGapList

  15. Classifying Phase (Cont.) • For each object IDt, 1 ≤ t ≤ Len(TVGapList), where Len(TVGapList) is equal to hs times the total number of classes, retrieve the corresponding object features from the training set, measure the pair-wise Euclidean distance between the unclassified image and each candidate, and determine the k nearest neighbors of the unclassified image • Vote on the class label for the unclassified image from its k nearest neighbors
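The two classifying steps can be sketched end to end. This is a simplified illustration, not the P-tree implementation: the names (`per_class_objects`, `query_tv_per_class`, `hs`) are hypothetical, and it assumes each training object's TV about its own class was precomputed in the preprocessing phase.

```python
from collections import Counter

def classify(query, query_tv_per_class, per_class_objects, hs, k):
    """SMART-TV classifying phase (sketch).

    per_class_objects: {class: [(obj_id, features, tv_about_class), ...]}
    query_tv_per_class: {class: TV of `query` about that class}
    """
    # Step 1: candidate set -- the hs objects per class whose TV is
    # closest to the query's TV about that class (the "TV gap").
    candidates = []
    for c, objs in per_class_objects.items():
        by_gap = sorted(objs, key=lambda o: abs(o[2] - query_tv_per_class[c]))
        candidates.extend((oid, feats, c) for oid, feats, _ in by_gap[:hs])

    # Step 2: exact k-NN search over the candidates only, by Euclidean
    # distance (squared distance preserves the ordering).
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(candidates, key=lambda o: dist2(query, o[1]))[:k]

    # Step 3: majority vote among the k nearest neighbors.
    return Counter(c for _, _, c in nearest).most_common(1)[0][0]
```

Because the expensive distance computations touch only hs x (number of classes) candidates instead of the whole repository, the per-query cost no longer grows with the training-set size.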

  16. Dataset We used Corel images (http://wang.ist.psu.edu/docs/related) • 10 categories • Originally, each category has 100 images • Number of feature attributes: 70 (54 from color and 16 from texture) • We randomly generated several larger datasets to evaluate the speed and scalability of the algorithms • 50 images for the testing set, 5 from each category

  17. Dataset Cont.

  18. Experimental Results Experimental Setup: Intel P4 2.6 GHz CPU, 3.8 GB RAM, running Red Hat Linux • Classification Accuracy Comparison

  19. Example on Corel Dataset

  20. Experimental Results Cont. • Loading Time • Classification Time

  21. Summary • We have presented SMART-TV, a classification algorithm that uses a vertical data structure, and applied it to the image classification task • We found that our algorithm is substantially faster than the classical KNN algorithm • Our method scales well to large image repositories, and its classification accuracy is comparable to that of the KNN algorithm
