
Exploiting Data Topology in Visualization and Clustering of Self-Organizing Maps

Exploiting Data Topology in Visualization and Clustering of Self-Organizing Maps. Kadim Taşdemir and Erzsébet Merényi, TNN, Vol. 20, No. 4, 2009, pp. 549-562. Presenter: Wei-Shen Tai, 2009/5/21.


Presentation Transcript


  1. Exploiting Data Topology in Visualization and Clustering of Self-Organizing Maps. Kadim Taşdemir and Erzsébet Merényi, TNN, Vol. 20, No. 4, 2009, pp. 549-562. Presenter: Wei-Shen Tai, 2009/5/21

  2. Outline • Introduction • Previous work on visualization of SOM knowledge • Topology visualization through connectivity matrix of SOM prototypes • Clustering through CONNvis • Discussions and conclusion • Comments

  3. Motivation • Exploit an underutilized component of the SOM’s knowledge: data topology. • Including data topology in the SOM visualization provides more sophisticated clues to cluster structure than existing SOM visualization approaches.

  4. Objective • Integrate data topology into the visualization of the SOM. • Improve cluster extraction from the SOM via a “connectivity matrix” and its specific rendering over the SOM.

  5. Visualization for SOM • SOM is a topology-preserving mapping • Ideally, prototypes (neurons) that are neighbors in the SOM map are also neighbors (centroids of neighboring Voronoi polyhedra) in data space, and vice versa. • Growing SOM • It appears less robust than the Kohonen SOM because of the large number of parameters that need adjustment. • ViSOM • It requires a relatively large number of prototypes even for small data sets.

  6. Topology visualization through connectivity matrix of SOM prototypes • Induced Delaunay triangulation • It can be determined from the relationships between the best matching units (BMUs) and the second BMUs; A denotes its binary adjacency matrix. • CONN • It is a weighted analog of A, where the weights indicate the density distribution of the input data among the prototypes adjacent in the data manifold M. • CONN(i, j) = |RF_ij| + |RF_ji|, where RF_ij is the set of data vectors for which w_i is the BMU and w_j is the second BMU.
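
The construction on this slide can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: it assumes Euclidean distance and assumes that each CONN entry counts the data vectors for which w_i and w_j are the BMU and second BMU in either order (the symmetric sum |RF_ij| + |RF_ji|). The function name `conn_matrix` and the toy data are hypothetical.

```python
import math

def conn_matrix(data, prototypes):
    """Return CONN as a nested list, with CONN[i][j] = |RF_ij| + |RF_ji|."""
    n = len(prototypes)
    rf = [[0] * n for _ in range(n)]  # rf[i][j] = |RF_ij|
    for v in data:
        # Rank prototype indices by Euclidean distance to the data vector v.
        order = sorted(range(n), key=lambda k: math.dist(v, prototypes[k]))
        bmu, second = order[0], order[1]
        rf[bmu][second] += 1
    # Symmetrize: a connection counts hits in both directions.
    return [[rf[i][j] + rf[j][i] for j in range(n)] for i in range(n)]

# Toy 2-D example (hypothetical values): three prototypes, four data vectors.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
data = [(0.1, 0.0), (0.4, 0.0), (0.9, 0.0), (4.0, 0.0)]
conn = conn_matrix(data, prototypes)
print(conn)  # [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
```

A nonzero entry marks a pair of prototypes that are adjacent in data space, and its size reflects how many data vectors lie between them; in the toy data the pair (0, 1) is strongly connected and the pair (1, 2) only weakly.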

  7. CONNvis: visualization of the connectivity matrix • Line width • Shows the strength of the connection and reflects the density distribution among the connected units. • Line color • Shows the rank of each connection among the connectivity strengths of w_i. • Reveals the most-to-least dense regions local to w_i in data space.
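
The local ranking behind the line colors can be illustrated as follows. This is a sketch under assumptions: the connectivity matrix is given as a symmetric nested list of nonnegative counts, rank 1 means the strongest connection of a prototype, and ties keep index order. The name `connection_ranks` is hypothetical, and the mapping from ranks to actual colors is left out.

```python
def connection_ranks(conn):
    """For each prototype i, map neighbor j -> rank (1 = strongest connection of i)."""
    ranks = []
    for i, row in enumerate(conn):
        # Neighbors of i are the prototypes with a nonzero connection to it.
        neighbors = [j for j, strength in enumerate(row) if strength > 0 and j != i]
        # Strongest connection first; sort is stable, so ties keep index order.
        neighbors.sort(key=lambda j: row[j], reverse=True)
        ranks.append({j: r for r, j in enumerate(neighbors, start=1)})
    return ranks

# Toy connectivity matrix (hypothetical values).
conn = [
    [0, 3, 0],
    [3, 0, 1],
    [0, 1, 0],
]
ranks = connection_ranks(conn)
print(ranks)  # [{1: 1}, {0: 1, 2: 2}, {1: 1}]
```

Note that the ranking is local: the connection between prototypes 1 and 2 is the second strongest for prototype 1 but the strongest for prototype 2, whereas the line width, which depends only on the strength itself, is the same from both ends.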

  8. Assessment of topology preservation with CONNvis • Topology violations • Connected neural units that are not immediate neighbors in the map (forward topology violations). • Unconnected neural units that are immediate neighbors in the map (backward topology violations).
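
Both kinds of violation can be listed directly from a connectivity matrix and the lattice positions of the prototypes. The sketch below assumes a rectangular lattice and treats the 8-neighborhood as "immediate neighbors"; both choices, along with the name `topology_violations` and the toy values, are assumptions made for illustration.

```python
from itertools import combinations

def topology_violations(conn, grid_pos):
    """Return (forward, backward) lists of prototype index pairs."""
    forward, backward = [], []
    for i, j in combinations(range(len(conn)), 2):
        (ri, ci), (rj, cj) = grid_pos[i], grid_pos[j]
        # Assumption: immediate neighbors are the 8 surrounding lattice cells.
        adjacent_on_grid = max(abs(ri - rj), abs(ci - cj)) == 1
        connected = conn[i][j] > 0
        if connected and not adjacent_on_grid:
            forward.append((i, j))   # connected, but not lattice neighbors
        elif adjacent_on_grid and not connected:
            backward.append((i, j))  # lattice neighbors, but not connected
    return forward, backward

# Toy 1 x 4 lattice with a hypothetical connectivity matrix.
grid_pos = [(0, 0), (0, 1), (0, 2), (0, 3)]
conn = [
    [0, 5, 0, 2],
    [5, 0, 0, 0],
    [0, 0, 0, 4],
    [2, 0, 4, 0],
]
forward, backward = topology_violations(conn, grid_pos)
print(forward)   # [(0, 3)]
print(backward)  # [(1, 2)]
```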

  9. Clustering through CONNvis • Remove weak connections that link any two coarse clusters X and Y at their boundary • Step 1) Remove all weak connections to cluster X if the number of weak connections to X is less than the number of weak connections to the other cluster Y. • Step 2) Remove the weakest connection if the connections of the prototype to the two clusters have different widths. • Step 3) Remove the lowest ranking connection if the number of weak connections to both clusters is the same and all connections at the boundary of these clusters are weak. • Step 4) Repeat Steps 1)–3) until this prototype has been disconnected from one of the clusters. • Step 5) Repeat Steps 1)–4) for all prototypes at this boundary.
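
The effect of removing weak connections can be illustrated with a much cruder stand-in. The sketch below does not implement Steps 1)–5): it drops every connection below a single global threshold and reports the remaining connected components as clusters, using a small union-find. The threshold `min_strength`, the function name `clusters_after_pruning`, and the toy matrix are all hypothetical.

```python
def clusters_after_pruning(conn, min_strength):
    """Keep connections with strength >= min_strength; return connected components."""
    n = len(conn)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if conn[i][j] >= min_strength:
                parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Two dense groups joined by one weak connection between prototypes 1 and 3.
conn = [
    [0, 9, 8, 0, 0],
    [9, 0, 7, 1, 0],
    [8, 7, 0, 0, 0],
    [0, 1, 0, 0, 6],
    [0, 0, 0, 6, 0],
]
clusters = clusters_after_pruning(conn, min_strength=2)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

A single global threshold ignores the local ranking and the per-boundary comparisons that the steps on the slide rely on, so this only conveys the general idea that clusters separate once the weak links at their boundary are gone.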

  10. A Real-Data Application • A real remote sensing spectral image of Ocean City

  11. Comparison with U-matrix and ISOMAP

  12. Discussion and conclusions • CONNvis • Integrates data distribution into the customary Delaunay triangulation. • Shows both forward and backward topology violations on the SOM grid. • Makes cluster extraction more efficient.

  13. Comments • Advantage • The proposed method improves SOM visualization by combining the induced Delaunay triangulation with connection strength. • It adopts the training process of the conventional SOM, but renders the resulting map through the connections between neurons after removing weak connections and boundary neurons. • Drawback • Much of the paper's terminology, such as “data vectors”, differs from the terms generally used in the SOM literature. • If a connection between two neurons in the same cluster crosses over an unrelated neuron (one that is not a boundary neuron of this cluster and is therefore not removed by the proposed method), the user may be confused about the relation among these three neurons. • Application • Data clustering.
