CS/CMPE 536 - Data Mining: Outline
Description
• A comprehensive introduction to the concepts and techniques in data mining
  • the data mining process: its need and motivation
  • data mining tasks and functionalities
  • association rule mining
  • clustering
  • text and web mining
  • mining sequential data
  • evaluation of DM tools and programming of algorithms in C/C++/Java
• Emphasis on concept building, algorithm evaluation and applications
CS 536 - Data Mining (Au 2004/2005) - Asim Karim @ LUMS
Goals
• To provide a comprehensive introduction to data mining
• To develop conceptual and theoretical understanding of the data mining process
• To provide hands-on experience in the implementation and evaluation of data mining algorithms and tools
• To develop interest in data mining research
After Taking This Course…
You should be able to…
• understand the need and motivation for data mining
• understand the characteristics of different data mining tasks
• decide what data mining task and algorithm to use for a given problem/data set
• implement and evaluate data mining solutions
• use commercially available DM tools
Before Taking This Course…
You should be comfortable with…
• Data structures and algorithms!
  • CS213 is a prerequisite
  • You should be comfortable with algorithm descriptions and implementations in a high-level programming language
• Databases
  • Understanding of the database concept and familiarity with database terminology
  • CS341 is recommended, not required
• Basic math background
  • Algebra, calculus, etc.
• Programming in a high-level language
  • C/C++ or Java
Grading
• Points distribution
  Quizzes (5 to 6)                 10%
  Assignments (hand + computer)    20%
  Project                          15%
  Midterm exam                     25%
  Final exam (comprehensive)       30%
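As a rough illustration of how the points distribution combines into a single course score, here is a minimal Python sketch. It assumes the project carries 15% (the value that makes the weights sum to 100%) and that each component is scored as a percentage. The names `WEIGHTS` and `course_score` and the example scores are hypothetical, not part of the course material.

```python
# Component weights from the points distribution.
# Assumption: the project weight is 15%, so that the weights sum to 100%.
WEIGHTS = {
    "quizzes": 0.10,
    "assignments": 0.20,
    "project": 0.15,
    "midterm": 0.25,
    "final": 0.30,
}


def course_score(component_scores):
    """Return the weighted course score on a 0-100 scale.

    component_scores maps each component name to a percentage (0-100).
    """
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example = {"quizzes": 80, "assignments": 90, "project": 85, "midterm": 70, "final": 75}
print(round(course_score(example), 2))
```

With these made-up scores the weighted sum is 8 + 18 + 12.75 + 17.5 + 22.5 = 78.75.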
Policies (1)
• Quizzes
  • Most quizzes will be announced a day or two in advance
  • Unannounced quizzes are also possible
• Sharing
  • No copying is allowed for assignments. Discussions are encouraged; however, you must submit your own work
  • Violators can face mark reduction and/or be reported to the Disciplinary Committee
• Plagiarism
  • Do NOT pass off someone else’s work as yours! Write in your own words and cite the reference. This applies to code as well.
Policies (2)
• Submission policy
  • Submissions are due on the day and at the time specified
  • Late penalties: 1 day late = 10%; 2 days late = 20%; not accepted after 2 days
  • An extension will be granted only if there is a genuine need and it is requested several days in advance
• Classroom behavior
  • Maintain classroom sanctity by remaining quiet and attentive
  • If you need to talk and gossip, please leave the classroom so as not to disturb others
  • Dozing is allowed provided you do not snore loudly
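The late-penalty rule above can be sketched in Python as follows. This is only one reading of the policy: it assumes that any part of a day past the deadline counts as a full day late, which the slide does not state. The function name `late_penalty` and the dates are hypothetical.

```python
import math
from datetime import datetime


def late_penalty(due, submitted):
    """Return the fraction of marks deducted, or None if the work is not accepted.

    Assumption: any part of a day past the deadline counts as a full day late.
    """
    if submitted <= due:
        return 0.0
    days_late = math.ceil((submitted - due).total_seconds() / 86400)
    if days_late > 2:
        return None  # not accepted after 2 days
    return 0.10 * days_late  # 10% per day late


# Hypothetical deadline and submission times, for illustration only.
due = datetime(2004, 10, 1, 17, 0)
print(late_penalty(due, datetime(2004, 10, 1, 16, 0)))  # on time
print(late_penalty(due, datetime(2004, 10, 2, 9, 0)))   # within 1 day
print(late_penalty(due, datetime(2004, 10, 3, 9, 0)))   # within 2 days
print(late_penalty(due, datetime(2004, 10, 4, 9, 0)))   # more than 2 days late
```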
Project
• Design, implementation and evaluation of a data mining application
• You may choose a problem of your liking (after consultation with me) or select one suggested by me
• You may do the project in groups (of 2)
• Start thinking about the project now
Summarized Course Contents
• Introduction and motivation
• The data mining process: tasks and functionalities
• Data preprocessing for data mining: data cleaning, reduction, summarization, normalization, etc.
• Mining association rules: algorithms and applications
• Mining by clustering: algorithms and applications
• Mining text and web data
• Mining sequential data
Course Material
• Required textbook
  • Data Mining: Concepts and Techniques, Han and Kamber, 2001.
• Supplementary material
  • Data Mining: Introductory and Advanced Topics, Dunham, Pearson Education, 2003.
  • Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Witten et al., Morgan Kaufmann, 006.3 W829D, 2000.
  • Handouts (as and when necessary)
• Other resources
  • Books in library
  • Web
Course Web Site
• For announcements, lecture slides, handouts, assignments, quiz solutions, web resources: http://suraj.lums.edu.pk/~cs536a04/
• The resource page has links to information available on the Web. It is basically a meta-list for finding further information.
Other Stuff
• How to contact me?
  • Office hours: 3:00 to 4:30 PM TR (office: 429)
  • E-mail: akarim@lums.edu.pk
  • By appointment: e-mail me for an appointment before coming
• Philosophy
  • Knowledge cannot be taught; it is learned.
  • Be excited. That is the best way to learn. I cannot teach everything in class. Develop an inquisitive mind, ask questions, and go beyond what is required.
  • I don’t believe in strict grading. But… there has to be a way of rewarding performance.
Reference Books in LUMS Library (1)
• Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic, 006.3 K167D, 2003.
• Principles of Data Mining, Hand and Mannila, 006.3 H236P, 2001.
• The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani and Jerome Friedman, 006.31 H356E, 2001.
• Data Mining and Uncertain Reasoning: An Integrated Approach, Zhengxin Chen, 006.321 C518D, 2001.
• Graphical Models: Methods for Data Analysis and Mining, Christian Borgelt and Rudolf Kruse, 006.3 B732G, 2001.
• Information Visualization in Data Mining and Knowledge Discovery, Usama Fayyad (ed.), 006.3 I434, 2002.
• Intelligent Data Warehousing: From Data Preparation to Data Mining, Zhengxin Chen, 005.74 C518I, 2002.
• Machine Learning and Data Mining: Methods and Applications, Ryszard S. Michalski, Ivan Bratko and Miroslav Kubat (eds.), 006.31 M149, 1999.
Reference Books in LUMS Library (2)
• Managing and Mining Multimedia Databases, Bhavani Thuraisingham, 006.7 T536M, 2001.
• Mastering Data Mining: The Art and Science of Customer Relationship Management, Michael J. A. Berry and Gordon Linoff, 006.3 B534M, 2000.
• Data Mining Explained: A Manager's Guide to Customer-Centric Business Intelligence, Rhonda Delmater and Monte Hancock, 006.3 D359D, 2001.
• Data Mining Solutions: Methods and Tools for Solving Real-World Problems, Christopher Westphal and Teresa Blaxton, 006.3 W537D, 1998.