Understanding the K-Means Clustering Algorithm with Euclidean Distance
This resource provides a comprehensive overview of the K-means clustering algorithm, a popular method used for partitioning datasets into clusters based on similarity. The process involves selecting a number of cluster centers (k), assigning each data point (or gene) to the nearest cluster center, and iteratively recalculating the cluster centers based on the assigned data points until convergence is achieved. The examples provided illustrate each step of the algorithm, highlighting the use of Euclidean distance as the distance metric for determining similarity among data points.
Understanding the K-Means Clustering Algorithm with Euclidean Distance
E N D
Presentation Transcript
K-means algorithm • Pick a number (k) of cluster centers • Assign every gene to its nearest cluster center • Move each cluster center to the mean of its assigned genes • Repeat 2-3 until convergence Slides from Wash Univ. BIO5488 lecture, 2004
k1 k2 k3 Clustering: Example 2, Step 1 Algorithm: k-means, Distance Metric: Euclidean Distance
k1 k2 k3 Clustering: Example 2, Step 2 Algorithm: k-means, Distance Metric: Euclidean Distance
k1 k2 k3 Clustering: Example 2, Step 3 Algorithm: k-means, Distance Metric: Euclidean Distance
k1 k2 k3 Clustering: Example 2, Step 4 Algorithm: k-means, Distance Metric: Euclidean Distance
k1 k2 k3 Clustering: Example 2, Step 5 Algorithm: k-means, Distance Metric: Euclidean Distance