Welcome! MSCIT 521: Knowledge Discovery and Data Mining

Presentation Transcript


  1. Welcome! MSCIT 521: Knowledge Discovery and Data Mining • Qiang Yang • Hong Kong University of Science and Technology • qyang@cs.ust.hk • http://www.cs.ust.hk • Course Introduction

  2. Data Mining: An Example • KDDCUP from past years • 2007: Predict whether a user is going to rate a movie; predict how many users are going to rate a movie • 2006: Predict whether a patient has cancer from medical images • 2005: Given a web query (“Apple”), predict the categories (IT, Food) • 1998: Given a person, predict whether this person is going to donate money • In general, we wish to • Input: Data • Output: Build model • Apply model to future data
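The general pattern on this slide (take past data as input, build a model, apply the model to future data) can be sketched in a few lines of Python. This is only an illustration, not a method from the course: the function names `build_model` and `apply_model`, the feature values, and the toy donation records are all hypothetical, and the "model" is just a lookup table of the most common past label per feature value.

```python
from collections import Counter, defaultdict

def build_model(records):
    """Learn, for each feature value, the most common label in past data."""
    counts = defaultdict(Counter)
    overall = Counter()
    for feature, label in records:
        counts[feature][label] += 1
        overall[label] += 1
    # Fallback prediction for feature values never seen in the past data.
    default = overall.most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in counts.items()}
    return table, default

def apply_model(model, feature):
    """Predict a label for a new (future) record."""
    table, default = model
    return table.get(feature, default)

# Hypothetical past data: (type of contact, what the person did).
past = [
    ("frequent_donor", "donate"),
    ("frequent_donor", "donate"),
    ("new_contact", "no_donate"),
    ("new_contact", "no_donate"),
    ("new_contact", "donate"),
]

model = build_model(past)
print(apply_model(model, "frequent_donor"))  # donate
print(apply_model(model, "new_contact"))     # no_donate
```

Real predictive models replace the lookup table with something that generalizes better, but the two-step shape (build from the past, apply to the future) stays the same.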

  3. Data Mining: Convergence of Three Technologies

  4. Definition: Predictive Model • A “black box” that makes predictions about the future based on information from the past and present • Large number of inputs usually available

  5. How are Models Built and Used? • High Level View:

  6. The Data Mining Process

  7. What Does the Real World Look Like?

  8. Predictive Models are… • Decision Trees • Nearest Neighbor Classification • Neural Networks • Rule Induction • Clustering
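As a concrete instance of one item on this list, here is a minimal sketch of nearest neighbor classification (the 1-nearest-neighbor variant, using Euclidean distance). The function name `nearest_neighbor`, the choice of distance, and the toy points are assumptions made for illustration; the slides do not specify any of them.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training point closest to `query`."""
    best_label, best_dist = None, math.inf
    for point, label in train:
        d = math.dist(point, query)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical labeled points in two dimensions.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]

print(nearest_neighbor(train, (1.2, 1.1)))  # A
print(nearest_neighbor(train, (5.5, 5.0)))  # B
```

There is no separate training step here: the stored examples are the model, and all the work happens at prediction time.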

  9. Course Description • Data Mining and Knowledge Discovery • Focus: • Focus 1: Theoretical foundations in Pattern Recognition and Machine Learning • Algorithms: • How do they differ? • Where do they apply? • Focus 2: Broad survey of recent research • Focus 3: Hands-on: apply algorithms to KDD data sets

  10. Topic 1: Foundations • Classification algorithms • Clustering algorithms • Association algorithms • Sequential Data Mining • Novel Applications • Web • Customer Relationship Management • Biological Data

  11. Topic 2: Hands On • Apply learned algorithms to selected data sets • Homework assignments • Get familiar with existing system packages and libraries • In-class workshops • Programming Assignments

  12. Important Sites • Instructor Web Site • http://www.cse.ust.hk/~qyang/521 • TA: Kaixiang Mo • Assignment Hand-in: online • csit5210@ust.hk • Course Discussion Site: • Check out the web site

  13. Prerequisites • Statistics and Probability would help, but are not necessary • Pattern Recognition would help, but is not necessary • Databases • Knowledge of SQL and relational algebra would help, but is not necessary • One programming language • One of Java, C++, Perl, Matlab, etc. • Will need to read Java libraries

  14. Grading • Grade Distribution: • Assignments 20% • Course Project 20% • Exams 60% • Midterm 20% • Final 40%
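The grade distribution above can be read as a weighted average. The sketch below assumes each component is scored on a 0-100 scale and that the components are combined linearly; the slide gives only the percentages, so that combination rule, the function name `course_grade`, and the example scores are all assumptions.

```python
# Weights in percent, taken from the Grading slide
# (Exams 60% = Midterm 20% + Final 40%).
WEIGHTS = {"assignments": 20, "project": 20, "midterm": 20, "final": 40}

def course_grade(scores):
    """Combine component scores (each 0-100) into one weighted grade."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly: " + ", ".join(WEIGHTS))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

# Hypothetical scores for one student.
example = {"assignments": 80, "project": 90, "midterm": 70, "final": 85}
print(course_grade(example))  # 82.0
```

With these numbers the final exam contributes 34 of the 82 points, which shows how much of the grade rests on the exams.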

  15. More info • Textbooks: For reference only • Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Pearson International Edition, 2005. • Data Mining by Ian Witten and Eibe Frank. (Google books) • Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber. Morgan Kaufmann Publishers. • Available in our bookstore