
Introduction to Web Mining


zasha


Presentation Transcript


  1. Introduction to Web Mining Spring 2013

  2. What is data mining? • Data mining is the extraction of useful patterns from data sources, e.g., databases, texts, the web, images, etc. • Patterns must be valid, novel, potentially useful, and understandable.

  3. Classic data mining tasks • Classification: mining patterns that can classify future (new) data into known classes. • Association rule mining: mining any rule of the form X → Y, where X and Y are sets of data items. • Clustering: identifying a set of similarity groups in the data.
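
To make the association rule form X → Y on slide 3 concrete, here is a minimal sketch of the two measures usually attached to such a rule, support and confidence. The slide does not define these measures, and the basket data and function names below are made up for illustration only.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[Transaction],
               x: FrozenSet[str], y: FrozenSet[str]) -> float:
    """Among transactions containing X, the fraction that also contain Y."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0
    return support(transactions, x | y) / support_x


# Hypothetical toy data: four shopping baskets.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_x = frozenset({"bread"})
rule_y = frozenset({"milk"})
print("support(bread, milk):", support(baskets, rule_x | rule_y))
print("confidence(bread -> milk):", confidence(baskets, rule_x, rule_y))
```

On this toy data, bread and milk appear together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain milk.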

  4. Classic data mining tasks (contd) • Sequential pattern mining: a sequential rule A → B says that event A will be immediately followed by event B with a certain confidence. • Deviation detection: discovering the most significant changes in data. • Data visualization. [Slide footer: CS583, Bing Liu, UIC]
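
The confidence of a sequential rule A → B, as described on slide 4, can be estimated from event logs. The sketch below assumes one particular counting convention that the slide does not specify: every occurrence of A counts, and an occurrence at the end of a sequence counts as not followed by B. The session data and function name are hypothetical.

```python
from typing import List, Sequence


def sequential_confidence(sequences: List[Sequence[str]], a: str, b: str) -> float:
    """Fraction of occurrences of event `a` that are immediately followed by `b`."""
    a_count = 0
    ab_count = 0
    for seq in sequences:
        for i, event in enumerate(seq):
            if event == a:
                a_count += 1
                # "Immediately followed" means b sits at the very next position.
                if i + 1 < len(seq) and seq[i + 1] == b:
                    ab_count += 1
    return ab_count / a_count if a_count else 0.0


# Hypothetical click sessions on a web site.
sessions = [
    ["home", "search", "product"],
    ["home", "product"],
    ["search", "product", "cart"],
    ["search", "home"],
]

print("confidence(search -> product):",
      sequential_confidence(sessions, "search", "product"))
```

Here "search" occurs three times and is immediately followed by "product" twice.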

  5. Why is data mining important? • Huge amount of data • How to make best use of data? • Knowledge discovered from data can be used for competitive advantage. • Many interesting things that one wants to find cannot be found using database queries, e.g., “find people likely to buy my products”

  6. WWW • The Web is an Internet-based information system that allows users of one computer to access information stored on another through the Internet. • Client-server model, hypertext documents • Invented in 1989 by Tim Berners-Lee at CERN, together with HTTP and HTML • Mosaic (1993), Netscape (1994), Internet Explorer (1995) • Related to the Internet (ARPANET, TCP/IP)

  7. Web mining • Traditional data mining: data is structured and relational, with well-defined tables, columns, rows, keys, and constraints. • Web data: readily available data, rich in features and patterns; content, link, and usage data.
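
As a rough illustration of the content and link data named on slide 7, the sketch below pulls both out of a single HTML page using Python's standard `html.parser` module. The page snippet and the `PageParser` class are made up for this example. Usage data, such as server access logs, comes from a different source and is not shown.

```python
from html.parser import HTMLParser
from typing import List


class PageParser(HTMLParser):
    """Collects link targets (link data) and visible text (content data)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []
        self.text_parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Anchor tags carry the hyperlink structure of the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Non-empty text between tags is the page content.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


# Hypothetical page used only for illustration.
page = (
    "<html><body><h1>Web mining</h1>"
    '<p>See <a href="/crawling">crawling</a> and '
    '<a href="/usage">usage mining</a>.</p></body></html>'
)

parser = PageParser()
parser.feed(page)
parser.close()

print("links:", parser.links)
print("text:", " ".join(parser.text_parts))
```

Unlike a relational table, nothing here arrives with predefined columns or keys; the structure has to be recovered from the markup.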

  8. Topic Description • Introduction to basic data mining: association and sequential mining, classification, clustering • Crawling, Web search, and information retrieval • Social network analysis • Structured data extraction, information integration • Opinion mining and sentiment analysis • Web usage mining

  9. Related fields • Web mining is a multidisciplinary field: machine learning, statistics, databases, information retrieval, visualization, natural language processing, etc.
