Dive into the world of web mining and social networking, understanding the theoretical backgrounds, data models, similarity functions, and performance measures. Discover how to extract useful insights and recommendations from web data, navigate web communities, and evaluate web recommendations. Explore the basics of social network metrics such as size, centrality, density, and clique behavior.
Web Mining and Social Networking: Introduction and Theoretical Backgrounds
Introduction • Background • With the explosive growth of information on the Internet, the WWW has become a powerful platform for mining useful knowledge • Problems in Web-related research • Finding relevant information • Search engines – low precision and low recall • Finding needed information • Query-based search – does not handle homographs (words spelled alike but with different meanings) • Learning useful knowledge • Utilizing the Web as a knowledge base • Recommendation/personalization of information • Learning user navigational patterns • Web communities and social networking • Relationships among Web objects
Introduction • Data Mining and Web Mining • Data Mining • Discovering hidden or previously unseen knowledge, in the form of patterns, in large volumes of data • Web Mining • Applying data mining methods to extract useful information from Web data • Web content mining • Web structure mining • Web usage mining • Semantic Web mining
Introduction • Characteristics of Web Data • The data on the web is huge • The data is distributed • The data is unstructured • The data is dynamic • Web community and Social Networking • An aggregation of web pages, users, and data
Theoretical Backgrounds • Web Data Model • Web data can be expressed as a matrix, a directed graph, a click sequence, and so on.
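The three representations named on this slide can be sketched from a single toy data set. The session data, page IDs, and function names below are hypothetical, and the directed graph here is built from click transitions (a hyperlink graph would be built the same way from link pairs).

```python
from collections import defaultdict

# Hypothetical click sequences: one ordered list of page IDs per session.
sessions = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "C"],
    "u3": ["B", "C", "B"],
}

pages = sorted({page for seq in sessions.values() for page in seq})
page_index = {page: i for i, page in enumerate(pages)}


def usage_matrix(sessions):
    """Matrix view: rows are sessions, columns are pages, entries are visit counts."""
    matrix = {}
    for user, seq in sessions.items():
        row = [0] * len(pages)
        for page in seq:
            row[page_index[page]] += 1
        matrix[user] = row
    return matrix


def click_graph(sessions):
    """Directed-graph view: an edge src -> dst for each pair of consecutive clicks."""
    graph = defaultdict(set)
    for seq in sessions.values():
        for src, dst in zip(seq, seq[1:]):
            graph[src].add(dst)
    return dict(graph)


matrix = usage_matrix(sessions)
graph = click_graph(sessions)
```

The click-sequence view is simply the `sessions` dictionary itself; the other two views discard ordering (matrix) or session boundaries (graph).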
Theoretical Backgrounds • Similarity Functions • Correlation-based Similarity • Cosine-based Similarity
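A minimal sketch of the two similarity functions, assuming both inputs are equal-length numeric vectors with no missing entries (recommender systems usually restrict the correlation to co-rated items, which is not shown here). Function names are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def correlation_similarity(x, y):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centred_x = [a - mean_x for a in x]
    centred_y = [b - mean_y for b in y]
    return cosine_similarity(centred_x, centred_y)
```

Cosine similarity ignores vector length but not offset; the correlation-based variant also removes each vector's mean, which helps when users rate on systematically different scales.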
Theoretical Backgrounds • Eigenvector, Principal Eigenvector
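One common way to approximate the principal eigenvector (the eigenvector of the largest-magnitude eigenvalue) is power iteration. The sketch below is an illustration, not necessarily the method the slides had in mind; it assumes a square matrix given as nested lists with a unique dominant eigenvalue, and the function name and parameters are hypothetical.

```python
import math


def power_iteration(matrix, iterations=100, tol=1e-10):
    """Return (eigenvalue, eigenvector) estimates for the dominant eigenpair."""
    n = len(matrix)

    def multiply(vec):
        return [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]

    vector = [1.0 / math.sqrt(n)] * n  # unit-length starting guess
    eigenvalue = 0.0
    for _ in range(iterations):
        product = multiply(vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0:
            break  # the start vector was mapped to zero; no estimate possible
        vector = [x / norm for x in product]
        # Rayleigh quotient v^T A v of the unit vector estimates the eigenvalue.
        estimate = sum(a * b for a, b in zip(vector, multiply(vector)))
        converged = abs(estimate - eigenvalue) < tol
        eigenvalue = estimate
        if converged:
            break
    return eigenvalue, vector


eigenvalue, eigenvector = power_iteration([[2.0, 1.0], [1.0, 3.0]])
```

Link-analysis scores on Web graphs are typically defined as the principal eigenvector of a matrix derived from the link structure, which is why this concept appears among the theoretical backgrounds.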
Theoretical Backgrounds • Singular Value Decomposition (SVD)
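For reference, the standard definition of the SVD can be written as follows. The symbols are assumptions introduced here: A is an m × n real matrix of rank r (for example a page-by-user matrix).

```latex
A = U \Sigma V^{\mathsf{T}}, \qquad
U \in \mathbb{R}^{m \times m},\;
\Sigma \in \mathbb{R}^{m \times n},\;
V \in \mathbb{R}^{n \times n},
\\
U^{\mathsf{T}} U = I_m, \qquad
V^{\mathsf{T}} V = I_n, \qquad
\Sigma_{ii} = \sigma_i, \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0, \quad
\sigma_i = 0 \text{ for } i > r .
```

The columns of U and V are the left and right singular vectors, and the σ_i are the singular values.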
Theoretical Backgrounds • Latent Semantic Analysis (LSA)
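LSA is usually described as a rank-k truncation of the singular value decomposition of a term-by-document matrix. The notation below is an assumption introduced for illustration: A is the term-by-document matrix, k is the number of latent dimensions kept, and q is a query expressed as a term vector.

```latex
A \approx A_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad
U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k),\;
V_k \in \mathbb{R}^{n \times k},
\\
\hat{q} = \Sigma_k^{-1} U_k^{\mathsf{T}} q .
```

Each row of V_k is a document in the k-dimensional latent space, and the folded-in query \hat{q} can be compared with those rows using cosine similarity.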
Theoretical Backgrounds • Tensor Expression and Decomposition
Theoretical Backgrounds • Performance measure • Precision • Recall • F-measure
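A minimal sketch of the three measures for a single query, assuming results and ground truth are given as collections of item IDs. The function name is illustrative, and the F-measure shown is the balanced F1 (harmonic mean of precision and recall).

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Two of the four retrieved items are relevant; two of the three relevant items were found.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
```

Precision penalizes irrelevant results, recall penalizes missed ones, and F1 is high only when both are high.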
Theoretical Backgrounds • Mean Average Precision (MAP) • Discounted Cumulative Gain (DCG) • Used when relevance is judged on a graded scale
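Both ranking measures can be sketched in a few lines. The function names are illustrative, and the DCG variant is an assumption: it uses the common log2(rank + 1) discount, whereas other formulations leave the first position undiscounted or use exponential gains.

```python
import math


def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_items) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


def dcg(gains):
    """Discounted cumulative gain of graded relevance scores in ranked order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))
```

MAP works with binary relevance judgments, while DCG accepts graded scores such as 0 to 3 and rewards placing the highest grades near the top.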
Theoretical Backgrounds • Web Recommendation Evaluation Metrics • Mean Absolute Error (MAE) • Hit Ratio • Weighted Average Visit Percentage
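The first two metrics can be sketched as follows. The definitions are the usual ones but are assumptions relative to the slides: MAE averages absolute rating errors, and hit ratio is the fraction of test cases whose held-out target appears in the recommendation list. Weighted Average Visit Percentage is omitted because its exact definition is not given here.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def hit_ratio(recommendation_lists, targets):
    """Fraction of test cases whose held-out target appears in its recommendation list."""
    if len(recommendation_lists) != len(targets):
        raise ValueError("need exactly one target per recommendation list")
    hits = sum(1 for recs, target in zip(recommendation_lists, targets) if target in recs)
    return hits / len(targets)
```

A lower MAE is better, since it measures prediction error; a higher hit ratio is better, since it measures how often the recommender surfaces what the user actually visited.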
Theoretical Backgrounds • Basic Metrics of Social Network • Size – number of vertices in the network • Centrality – betweenness, closeness, degree • Density – existing edges / total possible edges in the network • Degree (of network) – number of edges in the network • Betweenness and closeness • Clique – a subset of the network in which every pair of vertices is directly connected
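Several of these metrics can be computed directly from an adjacency structure. The sketch below assumes an undirected, unweighted graph stored as a dictionary of neighbor sets; the toy graph and function names are hypothetical, and betweenness is left out to keep the example short.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected network: vertex -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def size(g):
    return len(g)


def edge_count(g):
    return sum(len(neighbors) for neighbors in g.values()) // 2


def density(g):
    possible = len(g) * (len(g) - 1) // 2
    return edge_count(g) / possible if possible else 0.0


def degree_centrality(g, v):
    """Degree of v divided by the maximum possible degree, n - 1."""
    return len(g[v]) / (len(g) - 1)


def closeness_centrality(g, v):
    """(Number of reachable vertices) / (sum of shortest-path distances from v)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search gives shortest paths in unweighted graphs
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def is_clique(g, vertices):
    """True if every pair of the given vertices is directly connected."""
    return all(b in g[a] for a, b in combinations(list(vertices), 2))
```

In this toy network, vertex `c` touches every other vertex, so it has the highest degree and closeness centrality, and `{a, b, c}` forms a clique while `{a, b, d}` does not.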
Theoretical Backgrounds • Social Network over the Web • Each web page = social entity, hyperlink = relationship • Centrality – closeness, degree, betweenness • Prestige – a prestigious actor is one that receives many in-links
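Treating pages as actors and hyperlinks as directed ties, the simplest prestige measure is normalized in-degree. The link data and function name below are hypothetical, and dividing by n - 1 is one common normalization, assumed here.

```python
# Hypothetical hyperlink data: page -> pages it links to.
links = {
    "p1": ["p2", "p3"],
    "p2": ["p3"],
    "p3": ["p1"],
    "p4": ["p3"],
}


def degree_prestige(links):
    """In-degree of each page divided by n - 1; self-links and duplicate links are ignored."""
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    in_degree = {page: 0 for page in pages}
    for src, targets in links.items():
        for dst in set(targets):
            if dst != src:
                in_degree[dst] += 1
    denominator = max(len(pages) - 1, 1)
    return {page: count / denominator for page, count in in_degree.items()}


prestige = degree_prestige(links)
```

Here `p3` is linked from every other page and gets the maximum score, while `p4` receives no in-links and scores zero, regardless of how many pages it links out to.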