1 / 19

Data Mining: P enelitian Data Mining

Data Mining: P enelitian Data Mining. Romi Satria Wahon o romi@romisatriawahono.net http://romisatriawahono.net +6281586220090. Romi Satria Wahono. SD Sompok Semarang (1987) SMPN 8 Semarang (1990) SMA Taruna Nusantara , Magelang (1993)

avidan
Télécharger la présentation

Data Mining: P enelitian Data Mining

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Mining:Penelitian Data Mining Romi Satria Wahonoromi@romisatriawahono.nethttp://romisatriawahono.net+6281586220090

  2. Romi Satria Wahono • SD Sompok Semarang (1987) • SMPN 8 Semarang (1990) • SMA Taruna Nusantara, Magelang (1993) • S1, S2 dan S3 (on-leave)Department of Computer SciencesSaitama University, Japan (1994-2004) • Research Interests: Software EngineeringandIntelligent Systems • Founder IlmuKomputer.Com • Peneliti LIPI (2004-2007) • Founder dan CEO PT Brainmatics Cipta Informatika

  3. Course Outline • Pengenalan Data Mining • Proses Data Mining • Evaluasi dan Validasi pada Data Mining • Metode dan Algoritma Data Mining • Penelitian Data Mining

  4. Penelitian Data Mining

  5. Penelitian Data Mining • Standard Proses Penelitian pada Data Mining • Journal Publications on Data Mining • Research on Classification • Research on Clustering • Research on Prediction • Research on Association Rule

  6. Standard Proses Penelitian pada Data Mining

  7. Data Mining Standard Process(CRISP–DM) • A cross-industry standard was clearly required that is industryneutral,tool-neutral, and application-neutral • The Cross-Industry Standard Processfor Data Mining (CRISP–DM) was developed in 1996 (Chapman, 2000) • CRISP-DM provides a nonproprietary and freely availablestandard process for fitting data mining into the general problem-solving strategyof a business or research unit

  8. CRISP-DM

  9. 1. Business UnderstandingPhase • Enunciate the project objectives and requirements clearly in terms of thebusiness or research unit as a whole • Translate these goals and restrictions into the formulation of a data mining problem definition • Prepare a preliminary strategy for achieving these objectives

  10. 2. Data UnderstandingPhase • Collectthedata • Use exploratory data analysis to familiarize yourself with the data and discoverinitialinsights • Evaluatethe quality of the data • If desired, select interesting subsets that may contain actionable patterns

  11. 3. Data PreparationPhase • Prepare from the initial raw data the final data set that is to be used for allsubsequent phases. This phase is very labor intensive • Select the cases and variables you want to analyze and that are appropriate for youranalysis • Perform transformations on certain variables, if needed • Clean the raw data so that it is ready for the modeling tools

  12. 4. Modeling phase • Select and apply appropriate modeling techniques • Calibrate model settings to optimize results • Remember that often, several different techniques may be used for the same data miningproblem • If necessary, loop back to the data preparation phase to bring the form ofthe data into line with the specific requirements of a particular data miningtechnique

  13. 5. Evaluationphase • Evaluate the one or more models delivered in the modeling phase for qualityand effectiveness before deploying them for use in the field • Determine whether the model in fact achieves the objectives set for it in thefirstphase • Establish whether some important facet of the business or research problemhas not been accounted for sufficiently • Come to a decision regarding use of the data mining results

  14. 6. Deploymentphase • Make use of the models created: Model creation does not signify the completion of a project • Example of a simple deployment: Generate a report • Example of a more complex deployment: Implement a parallel data miningprocessinanotherdepartment • For businesses, the customer often carries out the deployment based on your model

  15. Latihan • Pelajari dan pahami Case Study 1-5 dari buku Larose (2005) Chapter 1 • Pelajari dan pahami bagaimana menerapkan CRISP-DM pada tesis Firmansyah (2011) tentang penerapan algoritma C4.5 untuk penentuan kelayakan kredit

  16. Journal Publications on Data Mining

  17. TransactionsandJournals • Review Paper (surveyandstate-of-the-art): • ACM ComputingSurveys (CSUR) • Research Paper (technical): • ACM Transactions on Knowledge Discovery from Data (TKDD) • ACM Transactions on Information Systems (TOIS) • IEEE Transactions on Knowledge and Data Engineering • SpringerData Mining and Knowledge Discovery • International Journal of Business Intelligence and Data Mining (IJBIDM)

  18. CognitiveAssignmentIII • Baca 1 paper ilmiah yang diterbitkan di journal 2010-2012 yang berhubungan dengan metode data mining yang sudah kita pelajari • Rangkumkan masing-masing dalam bentuk slidedengan struktur: • Latar Belakang Masalah (Research Background) • Pernyataan Masalah (Problem Statements) • Pertanyaan Penelitian (Research Questions) • Tujuan Penelitian (Research Objective) • Metode-Metode yang Sudah Ada (Existing Methods) • Metode yang Diusulkan (ProposedMethod) • Hasil (Results) • Kesimpulan (Conclusion) • Presentasikan di depan kelas pada mata kuliah berikutnya

  19. Referensi • Ian H. Witten, Frank Eibe, Mark A. Hall, Data mining: PracticalMachineLearning Toolsand Techniques3rd Edition, Elsevier, 2011 • Daniel T. Larose, Discovering Knowledgein Data: an Introductionto DataMining, John Wiley & Sons, 2005 • Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer, 2011 • JiaweiHanandMichelineKamber, Data Mining:Concepts and TechniquesSecond Edition, Elsevier, 2006 • OdedMaimonandLiorRokach, Data Mining and Knowledge Discovery Handbook Second Edition, Springer, 2010 • Warren Liao and EvangelosTriantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithmsand Applications, World Scientific, 2007

More Related