Chapter 8: Extensions and Applications

This chapter discusses the implications of using massive datasets in machine learning, exploring key challenges such as memory constraints and the efficiency of various learning methods. It highlights the naive Bayes method and the differences between incremental and non-incremental learning schemes. Important concepts include the impact of dataset size on model performance, the risk of overfitting, and the benefits of parallelization in learning algorithms. Furthermore, the chapter delves into incorporating domain knowledge through metadata and practical applications like text and web mining and adversarial situations.

amato

Presentation Transcript


  1. Chapter 8: Extensions and Applications

  2. Learning from Massive Datasets
     • Can the data be held in main memory? The Naïve Bayes method, for example, needs only counts, not the full dataset.
     • Some learning schemes are incremental; some are not.
     • How long does it take to build the model? Training time should be linear or near linear in the number of instances.
     • What to do when the dataset is too large?
       • Use a small subset of the data for training (law of diminishing returns). Some schemes do better with more data, but there is also a danger of overfitting.
       • Parallelization is another way: develop parallelized versions of learning schemes.
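
The point about incremental schemes and main memory can be illustrated with a minimal sketch of Naïve Bayes over categorical attributes. It is not taken from the slides: the class name `IncrementalNaiveBayes`, the smoothing constant `alpha`, and the toy weather-style stream are all assumptions made for illustration. The model stores only counts, so each instance can be read once and discarded, and training time grows linearly with the number of instances.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Naive Bayes for categorical attributes, updated one instance at a time.

    Hypothetical illustration: only counts are kept in memory, never the data.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # Laplace smoothing (assumed default)
        self.n = 0                            # instances seen so far
        self.class_counts = defaultdict(int)  # label -> count
        self.value_counts = defaultdict(int)  # (attr index, value, label) -> count
        self.values_seen = defaultdict(set)   # attr index -> distinct values

    def update(self, attributes, label):
        """Fold a single instance into the counts (constant work per attribute)."""
        self.n += 1
        self.class_counts[label] += 1
        for i, value in enumerate(attributes):
            self.value_counts[(i, value, label)] += 1
            self.values_seen[i].add(value)

    def predict(self, attributes):
        """Return the label with the highest smoothed log-probability."""
        best_label, best_score = None, -math.inf
        for label, class_count in self.class_counts.items():
            score = math.log(class_count / self.n)
            for i, value in enumerate(attributes):
                k = max(len(self.values_seen[i]), 1)
                count = self.value_counts.get((i, value, label), 0)
                score += math.log(
                    (count + self.alpha) / (class_count + self.alpha * k)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def stream():
    """Toy stand-in for a dataset too large to load at once (made-up data)."""
    yield ("sunny", "hot"), "no"
    yield ("sunny", "mild"), "no"
    yield ("rainy", "mild"), "yes"
    yield ("overcast", "hot"), "yes"
    yield ("rainy", "cool"), "yes"


model = IncrementalNaiveBayes()
for attrs, label in stream():
    model.update(attrs, label)  # one pass; the instance is not stored

print(model.predict(("sunny", "hot")))
print(model.predict(("rainy", "cool")))
```

Because the counts are simple sums, counts gathered separately on different partitions of the data could be added together, which is one way a scheme like this lends itself to parallelization.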

  3. Incorporating Domain Knowledge
     • Metadata (data about data): semantic, causal, and functional
     • Text and web mining
     • Adversarial situations, for example junk email filtering
