1 / 10

Statistical Learning Problems

Explore the fundamentals of supervised learning, classification, regression, and data analysis to predict outcomes accurately. Learn about risk factors for prostate cancer, heart attack prediction, email spam detection, image classification, and more.

cbosco
Télécharger la présentation

Statistical Learning Problems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Statistical Learning Problems • Identify the risk factors for prostate cancer, based on clinical and demographic variables. ESL Chap1 - Introduction

  2. Classify a recorded phoneme, based on a log-periodogram.

  3. Predict whether someone will have a heart attack on the basis of demographic, diet and clinical measurements

  4. Customize an email spam detection system.

  5. Identify the numbers in a handwritten zip code, from a digitized image

  6. Classify a tissue sample into one of several cancer classes, based on a gene expression profile.

  7. Classify the pixels in a LANDSAT image, according to usage:{red soil, cotton, vegetation stubble, mixture, gray soil, damp gray soil, very damp gray soil}

  8. The Supervised Learning Problem • Starting point: • Outcome measurement Y (also called dependent variable, response, • target) • Vector of p predictor measurements X (also called inputs, regressors, • covariates, features, independent variables) • In the regression problem, Y is quantitative (e.g price, blood • pressure) • In the classification problem, Y takes values in a finite, unordered set • (survived/died, digit 0-9, cancer class of tissue sample) • We have training data (x1, y1)……(x , y). These are • observations (examples, instances) of these measurements.

  9. Objectives • On the basis of the training data we would like to: • Accurately predict unseen test cases • Understand which inputs affect the outcome, and how • Assess the quality of our predictions and inferences

  10. Philosophy • It is important to understand the ideas behind the various techniques, in order to know how and when to use them. • One has to understand the simpler methods first, in order to grasp the more sophisticated ones. • It is important to accurately assess the performance of a method, to know how well or how badly it is working [simpler methods often perform as well as fancier ones!] • This is an exciting research area, having important applications in science, industry and finance.

More Related