
A Polygon-based Clustering and Analysis Framework for Mining Spatial Dataset

A Polygon-based Clustering and Analysis Framework for Mining Spatial Dataset. Name: Sujing Wang. Advisor: Dr. Christoph F. Eick. Data Mining & Machine Learning Group. Outline: Introduction, Framework Architecture, Methodology, Case Study, Conclusion and Future Work.


Presentation Transcript


  1. A Polygon-based Clustering and Analysis Framework for Mining Spatial Dataset
     Name: Sujing Wang
     Advisor: Dr. Christoph F. Eick
     Data Mining & Machine Learning Group

  2. Outline
     • Introduction
     • Framework Architecture
     • Methodology
     • Case Study
     • Conclusion and Future Work

  3. Introduction
     • Spatial Data Mining (SDM): the process of analyzing and discovering interesting and useful patterns, associations, or relationships in large spatial datasets.
     • Spatial object structure: (<spatial attributes>; <non-spatial attributes>)
     • Example: [figure not reproduced in the transcript]
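The (<spatial attributes>; <non-spatial attributes>) structure can be illustrated with a small Python sketch. The class name `SpatialObject`, its field names, and the sample values are all hypothetical; the slides do not say how the framework represents its objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class SpatialObject:
    """Hypothetical container pairing spatial and non-spatial attributes."""

    # Spatial attributes: a polygon boundary as an ordered list of (x, y) vertices.
    boundary: List[Point]
    # Non-spatial attributes: named measurements attached to the region.
    attributes: Dict[str, float] = field(default_factory=dict)


# Illustrative values only; they are not taken from the presentation.
region = SpatialObject(
    boundary=[(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    attributes={"ozone_ppb": 85.0},
)
```

A point or a trajectory would fit the same shape with a one-vertex or an open vertex list in place of the closed polygon boundary.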

  4. Introduction
     • Spatial objects: point, trajectory (line), polygon (region)

  5. Introduction
     • Challenges:
       • Complexity of spatial data types
       • Spatial relationships
       • Spatial autocorrelation
     • Motivation:
       • Polygons, especially overlapping polygons, are very important for mining spatial datasets.
       • Traditional clustering algorithms do not work for spatial polygons.
     • Research goal:
       • Develop new distance functions and new spatial clustering algorithms for polygon clustering.
       • Implement novel post-clustering techniques with plug-in reward functions to capture domain experts' notion of interestingness.

  6. A Polygon-based Clustering and Analysis Framework for Mining Spatial Datasets
     [Architecture diagram; labels: Geospatial Datasets, Domain Experts, Post-processing, DCONTOUR, Notion of Interestingness, Spatial Clusters, Poly_SNN, Reward Functions, Meta Clusters, Summaries and Interesting Patterns]

  7. Methodology 1. Domain-Driven Final Clustering Generation Methodology
     Inputs:
     • A meta-clustering $M = \{X_1, \ldots, X_k\}$; at most one object will be selected from each meta-cluster $X_i$ ($i = 1, \ldots, k$).
     • An individual cluster reward function $\mathrm{Reward}_U$, provided by the user, whose values are in $[0, \infty)$.
     • A reward threshold $\theta_U$; clusters with low rewards are not included in the final clustering.
     • A cluster distance threshold $\theta_d$, which expresses to what extent the user would like to tolerate cluster overlap.
     • A cluster distance function $\mathrm{dist}$.
     Find $Z \subseteq X_1 \cup \ldots \cup X_k$ that maximizes [objective formula not reproduced in the transcript] subject to:
     • $\forall x \in Z\ \forall x' \in Z\ (x \neq x' \Rightarrow \mathrm{dist}(x, x') > \theta_d)$
     • $\forall x \in Z\ (\mathrm{Reward}_U(x) > \theta_U)$
     • $\forall x \in Z\ \forall x' \in Z\ ((x \in X_i \wedge x' \in X_j \wedge x \neq x') \Rightarrow i \neq j)$
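The selection problem on this slide can be sketched as a brute-force search. The transcript does not preserve the objective being maximized, so the sketch assumes it is the total reward of the selected clusters. The function name, the tuple representation of clusters, and all sample values are hypothetical, and exhaustive enumeration is only practical for a handful of small meta-clusters.

```python
from itertools import product


def select_final_clustering(meta_clusters, reward, dist,
                            reward_threshold, dist_threshold):
    """Pick at most one cluster per meta-cluster by exhaustive search.

    ASSUMPTION: the objective is the sum of the rewards of the selected
    clusters; the slide transcript does not preserve the actual formula.
    """
    # Each meta-cluster offers "nothing" (None) or one sufficiently rewarded cluster.
    options = [
        [None] + [c for c in meta if reward(c) > reward_threshold]
        for meta in meta_clusters
    ]
    best, best_score = [], 0.0
    for choice in product(*options):
        selected = [c for c in choice if c is not None]
        # Every pair of selected clusters must be farther apart than the threshold.
        far_enough = all(
            dist(a, b) > dist_threshold
            for i, a in enumerate(selected)
            for b in selected[i + 1:]
        )
        if not far_enough:
            continue
        score = sum(reward(c) for c in selected)
        if score > best_score:
            best, best_score = selected, score
    return best, best_score


# Hypothetical clusters: (name, 1-D centre, reward).
meta_clusters = [
    [("a1", 0.0, 5.0), ("a2", 1.0, 3.0)],
    [("b1", 0.5, 4.0), ("b2", 10.0, 2.0)],
    [("c1", 20.0, 0.5)],
]
final, total = select_final_clustering(
    meta_clusters,
    reward=lambda c: c[2],
    dist=lambda a, b: abs(a[1] - b[1]),
    reward_threshold=1.0,
    dist_threshold=2.0,
)
print([c[0] for c in final], total)
```

Here `a1` and `b1` each carry a high reward but lie too close together, so the search pairs `a1` with the more distant `b2`; `c1` is dropped because its reward does not exceed the reward threshold.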

  8. Methodology 2. Finding interesting clusters with respect to a continuous non-spatial variable $v$:
     Let
     • $X_i \in 2^A$ be a cluster in the A-space,
     • $\sigma^2$ be the variance of $v$ in dataset $D$,
     • $\sigma^2(X_i)$ be the variance of variable $v$ in a cluster $X_i$,
     • $m_v(X_i)$ be the mean value of variable $v$ in a cluster $X_i$,
     • $t_1 \geq 0$ be a mean value reward threshold and $t_2 \geq 1$ be a variance reward threshold.
     Interestingness function $\varphi$ for each cluster:
     $\varphi(X_i) = \max(0, |m_v(X_i)| - t_1) \times \max(0, \sigma^2 - (\sigma^2(X_i) \times t_2))$
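The interestingness function above rewards clusters whose mean of the variable is large in absolute value and whose variance is small compared with the whole dataset. A minimal Python sketch follows; the function name and sample numbers are hypothetical, and the use of population variance is an assumption, since the slide does not say which estimator is meant.

```python
from statistics import mean, pvariance


def interestingness(cluster_values, dataset_variance, t1=0.0, t2=1.0):
    """Product of a mean term and a variance term, each clipped at zero.

    ASSUMPTION: population variance is used for the cluster; the slide
    does not specify the estimator.
    """
    if t1 < 0 or t2 < 1:
        raise ValueError("expected t1 >= 0 and t2 >= 1")
    # Positive only when the cluster mean is further than t1 from zero.
    mean_term = max(0.0, abs(mean(cluster_values)) - t1)
    # Positive only when t2 times the cluster variance stays below the dataset variance.
    variance_term = max(0.0, dataset_variance - pvariance(cluster_values) * t2)
    return mean_term * variance_term


# Hypothetical values: the dataset variance is 9.0 in both calls.
tight = interestingness([4.0, 6.0], dataset_variance=9.0, t1=1.0, t2=2.0)
loose = interestingness([0.0, 10.0], dataset_variance=9.0, t1=1.0, t2=2.0)
print(tight, loose)
```

Both sample clusters have mean 5, but only the tight one scores above zero: its variance of 1 gives (5 - 1) × (9 - 2) = 28, while the loose cluster's variance of 25 zeroes the variance term.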

  9. Case Study 1. Meta-clusters generated from multiple spatial datasets: [figure not reproduced in the transcript]

  10. Case Study 2. Final clusters with area of polygons as plug-in reward function
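A reward function based on polygon area could look like the sketch below. It assumes simple (non-self-intersecting) polygons given as ordered vertex lists and uses the shoelace formula; the slides do not state how the area was computed in the case study, and the function name is hypothetical.

```python
def polygon_area_reward(vertices):
    """Area of a simple polygon via the shoelace formula, used as a reward.

    ASSUMPTION: vertices are (x, y) pairs listed in boundary order; the
    result is non-negative regardless of orientation.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
print(polygon_area_reward(rectangle), polygon_area_reward(triangle))
```

Because the area is never negative, the function's values lie in the range required of a cluster reward function, so larger polygons are preferred when one cluster is picked from each meta-cluster.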

  11. Case Study 3. Finding interesting meta-clusters with respect to solar radiation: [figure not reproduced in the transcript]

  12. Conclusion & Future Work
     • Conclusions:
       • Our framework can effectively cluster overlapping spatial polygons that are similar in size, shape, and location.
       • Our post-clustering techniques with different plug-in reward functions can guide the extraction of interesting patterns and generate summaries from large spatial datasets.
     • Future work:
       • Develop novel spatio-temporal clustering techniques and embed them in our framework.
       • Investigate novel change analysis techniques to identify spatial and temporal changes in spatial data.
       • Evaluate our framework in challenging case studies.

  13. Publications:
     • S. Wang, C.S. Chen, V. Rinsurongkawong, F. Akdag, C.F. Eick, “Polygon-based Methodology for Mining Related Spatial Datasets”, ACM SIGSPATIAL GIS Workshop on Data Mining for Geoinformatics (DMG), in conjunction with ACM SIGSPATIAL GIS 2010, San Jose, CA, Nov. 2010. NSF travel award for ACM GIS 2010.
     • S. Wang, C. Eick, Q. Xu, “A Space-Time Analysis Framework for Mining Geospatial Datasets”, CyberGIS’12, the First International Conference on Space, Time, and CyberGIS, University of Illinois at Urbana-Champaign, Champaign, IL, Aug. 6-9, 2012. NSF travel award for CyberGIS 2012.
     • C. Eick, G. Forestier, S. Wang, Z. Cao, S. Goyal, “A Methodology for Finding Uniform Regions in Spatial Data”, CyberGIS’12, the First International Conference on Space, Time, and CyberGIS, University of Illinois at Urbana-Champaign, Champaign, IL, Aug. 6-9, 2012.
     • S. Wang, C.F. Eick, “A Polygon-based Clustering and Analysis Framework for Mining Spatial Datasets”, Geoinformatica (under review).

  14. Thank you!
