Spatial Data Mining
Ashkan Zarnani, Sadra Abedinzadeh, Farzad Peyravi
From DM to KDD
• Data mining (DM) is one step in the knowledge discovery in databases (KDD) process
• Its goal is to extract useful, meaningful patterns from data
• NASA collects five terabytes of data each day
• This data is used to discover stars, galaxies, etc.
Spatial Data
• Any kind of data with one or more fields concerning location, shape, area, or similar attributes
• Geometry types: point, line, polygon
• Spatial Access Methods (SAMs)
• Information in a GIS is organized in "layers"
• For example, a map may have layers for "roads", "train stations", "suburbs", and "water bodies"
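The point, line, and polygon types and the idea of layers can be illustrated with a minimal Python sketch. The `Feature` class, the helper functions, and all feature names and coordinates are hypothetical, chosen only for illustration; a real GIS would use a spatial database and an index (a SAM) rather than plain lists.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coord = Tuple[float, float]


@dataclass
class Feature:
    """One spatial object: a name, a geometry type, and its vertices."""
    name: str
    kind: str            # "point", "line", or "polygon"
    coords: List[Coord]


def line_length(coords: List[Coord]) -> float:
    """Sum of the segment lengths along a polyline."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(coords: List[Coord]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(coords):
        x2, y2 = coords[(i + 1) % len(coords)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Each layer groups features of one theme (illustrative data only).
layers: Dict[str, List[Feature]] = {
    "train stations": [Feature("Central", "point", [(2.0, 3.0)])],
    "roads": [Feature("Main St", "line", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)])],
    "suburbs": [Feature("Northside", "polygon",
                        [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])],
}

road_length = line_length(layers["roads"][0].coords)
suburb_area = polygon_area(layers["suburbs"][0].coords)
```

Shape and area are derived from the geometry itself, which is what distinguishes spatial attributes from ordinary table columns.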
Layers in GIS
• People
• Commercial
• Governmental
• Geographical
• Traffic
• Business
Spatial Data Mining Methods
• Spatial OLAP and spatial data warehousing
  • Drilling, dicing, and pivoting on multi-dimensional spatial databases
• Generalization and characterization of spatial objects
  • Summarize and contrast data characteristics, e.g., dry vs. wet regions
• Spatial association
  • Find rules like "inside(x, city) → near(x, highway)"
• Spatial classification and prediction
  • Classify countries based on climate
• Spatial clustering and outlier analysis
  • Cluster houses to find distribution patterns
• Similarity analysis in spatial databases
  • Find similar regions in a large set of maps
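As a rough illustration of the spatial association example, the sketch below computes the support and confidence of the rule inside(x, city) → near(x, highway). It assumes a simplified world that is not in the slides: the city is an axis-aligned rectangle, the highway is a horizontal line, and "near" means within a fixed vertical distance. All names and numbers are hypothetical.

```python
def inside(point, rect):
    """True if the point lies in the rectangle (xmin, ymin, xmax, ymax)."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def near(point, highway_y, max_dist):
    """True if the point is within max_dist of the horizontal line y = highway_y."""
    return abs(point[1] - highway_y) <= max_dist


def rule_stats(points, rect, highway_y, max_dist):
    """Support and confidence of inside(x, city) -> near(x, highway)."""
    n_inside = sum(1 for p in points if inside(p, rect))
    n_both = sum(1 for p in points
                 if inside(p, rect) and near(p, highway_y, max_dist))
    support = n_both / len(points)
    confidence = n_both / n_inside if n_inside else 0.0
    return support, confidence


# Illustrative data: five objects, one city, one highway.
points = [(1, 1), (2, 4), (3, 9), (8, 1), (12, 5)]
city = (0, 0, 10, 10)
support, confidence = rule_stats(points, city, highway_y=2, max_dist=2)
```

Four of the five objects are inside the city and three of those are near the highway, so the rule holds with support 0.6 and confidence 0.75 on this toy data.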
SDM: State of the Art
Progressive Refinement
• Find coarse relationships first, then exclude the non-candidates, so that complex spatial operations need not be run on all objects
• [Figure: g_close_to → candidates → detailed process]
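The coarse-then-detailed idea can be sketched as a filter-and-refine search. This is an assumption-laden illustration, not the method from the cited thesis: the coarse test compares minimum bounding rectangles (MBRs), and the "detailed" test is a vertex-to-vertex distance standing in for a genuinely expensive spatial operation. Object names and coordinates are hypothetical.

```python
import math


def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a vertex list."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_distance(a, b):
    """Distance between two rectangles; never larger than the exact distance."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


def exact_distance(p, q):
    """Stand-in for the expensive step: minimum vertex-to-vertex distance."""
    return min(math.dist(u, v) for u in p for v in q)


def close_to(target, objects, max_dist):
    """Return (coarse candidates, refined result) for objects close to target."""
    tbox = mbr(target)
    # Coarse step: cheap test on bounding rectangles.
    candidates = [name for name, pts in objects.items()
                  if mbr_distance(tbox, mbr(pts)) <= max_dist]
    # Detailed step: expensive test, run on the candidates only.
    result = [name for name in candidates
              if exact_distance(target, objects[name]) <= max_dist]
    return candidates, result


target = [(0, 0), (2, 0), (2, 2), (0, 2)]
objects = {
    "park": [(3, 0), (4, 0), (4, 1)],
    "lake": [(3, 6), (6, 6), (6, 3)],
    "mall": [(10, 10), (11, 11)],
}
candidates, result = close_to(target, objects, max_dist=1.5)
```

Because the rectangle distance is a lower bound on the vertex distance used here, the coarse step can only produce false positives (such as "lake"), which the detailed step then removes.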
SDM: State of the Art
Multilevel Rules
• Find rules at several levels of the concept hierarchies
• Continent → Country → Province → City → Zone → Block
• Water: flow (river, channel), nonflow (sea, lake, ocean)
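A small sketch of how a concept hierarchy supports multilevel mining, using the water hierarchy from the slide. The `PARENT` table, the function names, and the list of observations are hypothetical; the point is only that counts which are small at a low level add up at higher levels.

```python
from collections import Counter

# Child -> parent links of the water concept hierarchy.
PARENT = {
    "river": "flow", "channel": "flow",
    "sea": "nonflow", "lake": "nonflow", "ocean": "nonflow",
    "flow": "water", "nonflow": "water",
}


def generalize(concept, levels=1):
    """Climb the hierarchy by the given number of levels (stops at the root)."""
    for _ in range(levels):
        concept = PARENT.get(concept, concept)
    return concept


def count_at_level(observations, levels):
    """Count observations after generalizing each one by `levels` steps."""
    return Counter(generalize(c, levels) for c in observations)


# Illustrative data: water bodies found near some set of objects.
observations = ["river", "lake", "sea", "channel", "ocean", "lake"]
level0 = count_at_level(observations, 0)
level1 = count_at_level(observations, 1)
level2 = count_at_level(observations, 2)
```

A pattern involving "sea" alone appears once here, but the generalized concept "nonflow" appears four times, so a rule that fails a support threshold at the detailed level may pass it one level up.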
SDM: State of the Art
Quantitative Rules
• The challenge is treating continuous attributes: splitting them into intervals creates sharp boundaries
• Fuzziness is applied for more realistic knowledge extraction
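The sharp-boundary problem can be shown with a continuous attribute such as distance. In this sketch the threshold of 1000 and the fuzzy range of 800 to 1200 are made-up values, and the linear membership function is just one common choice, not necessarily the one the authors had in mind.

```python
def sharp_close(distance, threshold=1000.0):
    """Crisp predicate: an object is either close (1.0) or not (0.0)."""
    return 1.0 if distance <= threshold else 0.0


def fuzzy_close(distance, full=800.0, zero=1200.0):
    """Fuzzy membership: 1.0 up to `full`, falling linearly to 0.0 at `zero`."""
    if distance <= full:
        return 1.0
    if distance >= zero:
        return 0.0
    return (zero - distance) / (zero - full)


# Two objects that are almost equally far away.
sharp_gap = sharp_close(999.0) - sharp_close(1001.0)
fuzzy_gap = fuzzy_close(999.0) - fuzzy_close(1001.0)
```

With the crisp predicate, two objects 2 units apart land on opposite sides of the boundary; with the fuzzy membership their degrees of closeness differ only slightly.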
SDM: State of the Art
OLAM
• OnLine Analytical Mining: the user can interact with the mining process
• The user can choose data sets, concept hierarchies, interestingness measures, the type of knowledge, and the representation
• GMQL has been proposed and is being extended