
Unsupervised learning: Clustering






Presentation Transcript


  1. Unsupervised learning: Clustering Ata Kaban The University of Birmingham http://www.cs.bham.ac.uk/~axk

  2. The Clustering Problem Unsupervised Learning Data (input)  ‘Interesting structure’ (output) • Should contain the essential traits • Discard unessential details • Provide a compact summary of the data • Be interpretable for humans • … We need an objective function that expresses our notion of interestingness for this data

  3. Here is some data…

  4. Formalising • Data points xn, n = 1, 2, …, N • Assume K clusters • Binary indicator variables zkn associated with each data point and cluster: zkn = 1 if xn is in cluster k and 0 otherwise • Define a measure of cluster compactness as the total distance from the cluster means: J = Σn Σk zkn ‖xn − mk‖²

  5. Cluster quality objective (the smaller the better): J = Σn Σk zkn ‖xn − mk‖² • Two sets of parameters – the cluster means mk and the cluster allocation indicator variables zkn • Alternately minimise the above objective over one set of variables while holding the other fixed → This is exactly what the K-means algorithm does! (can you prove it?)

  6. Pseudo-code of the K-means algorithm:
  Begin
    initialise μ1, μ2, …, μK (randomly selected)
    do
      classify the n samples according to the nearest μi
      recompute each μi
    until no change in any μi
    return μ1, μ2, …, μK
  End
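
The pseudo-code above can be sketched in NumPy as follows (a minimal illustration, not the lecture's reference implementation; the toy data, seeds and the `kmeans` helper name are assumptions):

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate the classify and recompute steps."""
    rng = np.random.default_rng(seed)
    # initialise the means with K randomly selected data points
    means = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iter):
        # classify each sample according to the nearest mean
        dists = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        z = dists.argmin(axis=1)
        # recompute each mean (keep the old one if a cluster empties)
        new_means = np.array([X[z == k].mean(axis=0) if np.any(z == k)
                              else means[k] for k in range(K)])
        if np.allclose(new_means, means):   # no change -> converged
            break
        means = new_means
    return means, z

# two well-separated blobs as toy data
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(5.0, 0.5, (20, 2))])
means, z = kmeans(X, K=2)
```

Because the objective is only minimised locally, different random initialisations can converge to different (worse) solutions, which is why the algorithm can get stuck.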

  7. Other forms of clustering • Often, clusters are not disjoint: a cluster may have subclusters, which in turn have sub-subclusters. • Hierarchical clustering

  8. Given any two samples x and x’, they will be grouped together at some level; if they are grouped at level k, they remain grouped at all higher levels • Hierarchical clustering → a tree representation called a dendrogram
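
In Python, the tree and its levels can be built with SciPy's hierarchical-clustering routines (a sketch assuming SciPy is available; the toy points are made up):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# five 1-D points: two tight pairs and an outlier
X = np.array([[0.0], [0.2], [5.0], [5.1], [10.0]])

# Z records the merge history: each row holds the two merged clusters,
# the distance at which they merge, and the new cluster's size --
# exactly the levels the dendrogram draws
Z = linkage(X, method='single')

# cut the tree so that at most 2 groups remain
labels = fcluster(Z, t=2, criterion='maxclust')
```

Calling `scipy.cluster.hierarchy.dendrogram(Z)` plots the tree itself when a plotting backend is available.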

  9. The similarity values may help to determine whether the groupings are natural or forced; but if they are evenly distributed, no information can be gained • Another representation is based on sets, e.g., Venn diagrams

  10. Hierarchical clustering can be divided into agglomerative and divisive approaches. • Agglomerative (bottom-up, clumping): start with N singleton clusters and form the sequence by successively merging clusters • Divisive (top-down, splitting): start with all of the samples in one cluster and form the sequence by successively splitting clusters

  11. Agglomerative hierarchical clustering • The procedure terminates when the specified number of clusters has been obtained, and returns the clusters as sets of points, rather than a mean or a representative vector for each cluster
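
A small, self-contained sketch of that procedure, stopping at a requested number of clusters and returning clusters as sets of points (single-link distance; the function name and example points are illustrative):

```python
import numpy as np

def single_linkage(X, n_clusters):
    """Bottom-up clustering: start with singletons, repeatedly merge the
    two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single link: minimum pairwise distance between clusters
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge cluster b into cluster a
        del clusters[b]
    return [sorted(c) for c in clusters]

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(single_linkage(pts, 2))  # -> [[0, 1], [2, 3]]
```

The two tight pairs merge first, so stopping at two clusters recovers them as sets of point indices rather than as mean vectors.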

  12. Application to image segmentation

  13. Application to clustering face images Cluster centres = face prototypes

  14. The problem of the number of clusters • Typically, the number of clusters is known. • When it is not, we face a hard problem called model selection. There are several ways to proceed. • A common approach is to repeat the clustering with K = 1, K = 2, K = 3, etc., and compare the resulting objective values
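
One sketch of that approach, under assumptions (toy data, and a simple K-means written just for this illustration): run the clustering for increasing K and watch the within-cluster sum of squares; the "elbow" where the decrease flattens suggests a reasonable K.

```python
import numpy as np

def kmeans_sse(X, K, n_iter=50, seed=0):
    """Run a simple K-means and return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        z = d.argmin(axis=1)
        means = np.array([X[z == k].mean(axis=0) if np.any(z == k)
                          else means[k] for k in range(K)])
    # final assignment, then total squared distance to the assigned means
    z = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return float(((X - means[z]) ** 2).sum())

# three well-separated blobs as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, (15, 2)) for c in (0.0, 4.0, 8.0)])
sse = {K: kmeans_sse(X, K) for K in (1, 2, 3, 4)}
# the SSE shrinks as K grows; the elbow is where the drop levels off
```

Note that the SSE alone keeps decreasing as K grows, so it cannot be minimised over K directly; the elbow heuristic (or a penalised criterion) is needed.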

  15. What did we learn today? • Data clustering • K-means algorithm in detail • How K-means can get stuck and how to take care of that • The outline of Hierarchical clustering methods
