Explore the use of data mining techniques in data-driven social simulation, with a focus on methodology, case study application, and conclusions presented at SS@IJCAI 2009. Discover how data mining aids in extracting patterns, cluster finding, and post-processing of simulation output.
Re-thinking Modelling: a Call for the Use of Data Mining in Data-driven Social Simulation
Samer Hassan, Javier Arroyo, Celia Gutiérrez
Universidad Complutense de Madrid
Contents • Data-driven ABM • DM-assisted Methodology • Case Study: Mentat • Application • Conclusions SS@IJCAI 2009
Research Aim
• Data-driven
  • Non-KISS
  • Empirical validation
  • Specific (case study)
  • Expressive
• Theoretical
  • KISS
  • Structural validation
  • Abstract
  • General
Classical Logic of Simulation
Data-Driven Logic
Data-driven Approach
• Complexity
• Large amounts of data
• Auxiliary AI:
  • Fuzzy Logic
  • Ontologies
  • Evolutionary Computation
  • Data Mining
Data Mining
• Data Mining: extracting patterns and relevant information from large amounts of data
• Pre-processing of empirical data
  • Cluster finding
  • Discovery of hidden patterns
  • Locating redundancies
• Post-processing of simulation output
  • Clustering:
    • Discovery of hidden patterns
    • Validation of clusters
    • Locating inconsistencies
  • Classification
    • Cluster matching
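One simple way to locate redundancies in empirical data is to look for pairs of variables that are strongly correlated. The following is only a sketch under stated assumptions: the variable names, the toy values and the 0.9 threshold are hypothetical and are not taken from the presentation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def redundant_pairs(table, threshold=0.9):
    """Pairs of variables whose absolute correlation reaches the threshold."""
    names = sorted(table)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(table[a], table[b])) >= threshold
    ]


# Hypothetical toy data, for illustration only (not survey data).
table = {
    "education": [1, 2, 3, 4, 5],
    "economy": [2, 4, 5, 8, 10],
    "religiosity": [5, 1, 4, 2, 3],
}
pairs = redundant_pairs(table)
```

A flagged pair is only a candidate: whether one of the two variables can really be dropped remains a decision for the domain expert.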
Methodology for DM-assisted ABM
Methodology for DM-assisted ABM
• Data Collection
  • Initial point
  • Validation points
    • Must differ from the initial point
  • Type
    • Explicit
    • Externalised
    • Empirical distributions
    • Secondary sources
  • Methods
    • Quantitative (e.g. surveys)
    • Qualitative (e.g. interviews)
Methodology for DM-assisted ABM
• Analysis
  • Pre-processing of empirical data
  • Roles
    • Domain expert
      • Guides DM exploration
      • Interpretation
    • DM expert
      • Confirms or refines theories
Methodology for DM-assisted ABM
• Selection of Relevant Data
  • Filtering
  • Adaptation of data
    • Normalisation
    • Discretisation
  • Domain expert
    • Theory
  • DM
    • Redundancies
    • Overlooked independent variables
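The two adaptation steps named above, normalisation and discretisation, can be sketched as follows. This is a minimal illustration that assumes min-max scaling and equal-width bins; the slides do not say which variants were used, and the sample values are hypothetical.

```python
def min_max_normalise(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretise(values, n_bins):
    """Assign each value in [0, 1] to one of n_bins equal-width bins."""
    # min(...) keeps the value 1.0 inside the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Hypothetical answers on a 1-10 scale (illustrative, not survey data).
answers = [1, 2, 6, 10]
normalised = min_max_normalise(answers)
bins = discretise(normalised, n_bins=3)
```

Normalising before clustering keeps variables measured on wide scales from dominating the distance computation.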
Methodology for DM-assisted ABM
• Data Analysis
  • Large data collections
  • Guided by theory
  • Types
    • Cluster analysis
    • Principal Component Analysis
    • Time series methods
    • Association rules
Methodology for DM-assisted ABM
• Interpretation of results
  • Theory expert
    • Relates results to theory
  • New findings are added to the findings base
Methodology for DM-assisted ABM
• ABM Building
  • Based on findings
  • Modeller
  • Steps
    • Formalisation
    • Data-driven design
    • Implementation
    • Initialisation
Methodology for DM-assisted ABM
• Simulation
  • Fine-tuning the ABM
    • Sensitivity analysis
    • Intensive testing
  • Output
    • Record agent trace
Methodology for DM-assisted ABM
• Validation
  • Analysis of the results
    • Empirical validation
    • Theoretical consistency
  • Roles
    • DM expert
      • Analyses the data
    • Domain expert
      • Extracts conclusions
  • Iterative cycle
The Problem
• Aim: simulate the process of change in social values
  • in a period
  • in a society
• Plenty of factors involved
• Inertia of generational change:
  • To what extent do demographic dynamics explain the mental change?
• Inter-generational:
  • Agent characteristics remain constant
  • Macro aggregation evolves
Mentat: architecture
• Agent:
  • Mental state attributes
  • Life cycle patterns
  • Demographic micro-evolution:
    • Couples
    • Reproduction
    • Inheritance
Mentat: architecture
• World:
  • 3000 agents
  • 100×100 grid
  • Demographic model
  • 8 independent parameters
• Social Network:
  • Communication within the Moore neighbourhood
  • Friends network
  • Family network
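The world description above can be illustrated with a small sketch of a Moore neighbourhood lookup on a 100×100 grid holding 3000 agents. This is not Mentat's code. It assumes a bounded (non-wrapping) grid and at most one agent per cell, neither of which is stated on the slide, and all function names are hypothetical.

```python
import random

GRID_SIZE = 100   # 100x100 grid, as on the slide
N_AGENTS = 3000   # population size, as on the slide


def moore_neighbourhood(x, y, size=GRID_SIZE):
    """Return the up to 8 cells surrounding (x, y) on a bounded grid."""
    cells = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                cells.append((nx, ny))
    return cells


def place_agents(n_agents=N_AGENTS, size=GRID_SIZE, seed=0):
    """Place agents on distinct random cells; returns {cell: agent_id}."""
    rng = random.Random(seed)
    all_cells = [(x, y) for x in range(size) for y in range(size)]
    chosen = rng.sample(all_cells, n_agents)
    return {cell: agent_id for agent_id, cell in enumerate(chosen)}


def neighbours_of(cell, occupied):
    """Ids of the agents located in the Moore neighbourhood of a cell."""
    return [occupied[c] for c in moore_neighbourhood(*cell) if c in occupied]


world = place_agents()
```

With 3000 agents on 10,000 cells, roughly a third of the cells are occupied, so under these assumptions most agents have a few neighbours to communicate with.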
Data Collection in Mentat
• Initial data:
  • EVS-1980
    • Representative sample of Spain
  • Qualitative info
  • Empirically-grounded demographic equations
• Validation data:
  • EVS-1990
  • EVS-1999
Analysis in Mentat
• Selection of relevant data
  • EVS-1980, 1990, 1999
  • Options:
    (1) Algorithm for the best subset of variables
    (2) Rely on domain expert
      • Tested domain knowledge
  • Option (2) chosen
• Variables adaptation
  • Normalisation
Analysis in Mentat
• Data Analysis
  • Algorithm selection
    • Wrapped k-means
    • Explore different k (number of clusters)
  • Discarded variables
    • Gender and Age cause irrelevant clusters to appear
      • E.g. widowed women
    • Economy is redundant
      • High correlation with Education
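Exploring different values of k can be sketched as below. This uses plain k-means (Lloyd's algorithm) with a deterministic start, not the wrapped variant named on the slide, and the 2-D points are hypothetical toy data. The within-cluster sum of squares is one common criterion for comparing values of k; the slides do not state which criterion was used.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100):
    """Plain k-means; returns (centroids, labels). Assumes k <= len(points)."""
    # Assumption: the first k points seed the centroids, for reproducibility.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, labels


def within_cluster_ss(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical toy data: three well-separated groups of two points each.
points = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (0.0, 1.0), (5.0, 6.0), (10.0, 1.0)]

wcss_by_k = {}
labels_by_k = {}
for k in range(1, 5):
    centroids, labels = k_means(points, k)
    wcss_by_k[k] = within_cluster_ss(points, centroids, labels)
    labels_by_k[k] = labels
```

On this toy data the within-cluster sum of squares drops sharply up to k = 3 and barely improves afterwards, which is the kind of signal that helps choose the number of clusters before the domain expert checks whether they carry meaning.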
Analysis in Mentat
• Interpretation
  • Sociological research
    • Religious typology (RLGTYPE)
    • Based on 3 variables
    • Ecclesiastical, low-intensity, alternatives & non-religious
  • Clusters found (1980, 1999)
    • Based on the 9 − 3 = 6 remaining variables
    • 5 clusters with sociological meaning
    • Consistent with RLGTYPE
  • Theoretical observations of the pattern evolution:
    • Religiosity strength falls
    • Ideological spectrum shifts to the left
      • Education & economy
    • The newest type of religiosity, “alternatives”, rises
      • Young people
Analysis in Mentat
Validation in Mentat
• Mentat rebuilt and simulation explored
• Mentat output clustered
  • Same 5 clusters found
  • Similar evolution trends
    • The 3 theoretical observations are reproduced
• Inconsistencies detected
  • Liberal cluster percentages do not match
    • although they do when aggregated
  • Charts show fewer young people
    • Liberal clusters deeply affected
  • Guides the re-design
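The kind of inconsistency described above, where individual cluster shares differ but their aggregate matches, can be checked with a sketch like the following. The cluster names, the label lists and the 0.05 tolerance are hypothetical and chosen only to reproduce the pattern; they are not Mentat results.

```python
from collections import Counter


def cluster_shares(labels):
    """Fraction of individuals falling into each cluster."""
    counts = Counter(labels)
    total = len(labels)
    return {name: n / total for name, n in counts.items()}


def mismatches(empirical, simulated, tolerance):
    """Clusters whose share differs by more than the tolerance."""
    names = set(empirical) | set(simulated)
    return sorted(
        name
        for name in names
        if abs(empirical.get(name, 0.0) - simulated.get(name, 0.0)) > tolerance
    )


def aggregated_share(shares, group):
    """Combined share of a group of clusters."""
    return sum(shares.get(name, 0.0) for name in group)


# Hypothetical cluster assignments (illustrative only).
empirical_labels = ["other"] * 4 + ["liberal_1"] * 3 + ["liberal_2"] * 3
simulated_labels = ["other"] * 4 + ["liberal_1"] * 5 + ["liberal_2"] * 1

empirical = cluster_shares(empirical_labels)
simulated = cluster_shares(simulated_labels)

flagged = mismatches(empirical, simulated, tolerance=0.05)
liberal = ["liberal_1", "liberal_2"]
gap = abs(aggregated_share(empirical, liberal) - aggregated_share(simulated, liberal))
```

Here both liberal clusters are flagged individually while their combined share is the same in both data sets, which points at how agents are distributed between the two clusters rather than at the overall trend.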
Conclusions
• DM-assisted ABM methodology
  • Suitable for DDABM
    • Complexity
    • Large amounts of data
  • Limitations
    • KISS
    • Qualitative sources
  • Uses
    • Build new ABM
    • Re-think existing DDABM
    • Reveal hidden facts
    • Detect inconsistencies
Thanks for your attention!
Samer Hassan (samer@fdi.ucm.es), Universidad Complutense de Madrid
License
• This presentation is licensed under a Creative Commons Attribution 3.0 license: http://creativecommons.org/licenses/by/3.0/
• You are free to copy, modify and distribute it as long as the original work and author are cited