This chapter uses MS Excel and Weka

Presentation Transcript

  1. This chapter uses MS Excel and Weka

  2. Statistical Techniques Chapter 10

  3. 10.1 Linear Regression Analysis Equation 10.1

  4. 10.1 Linear Regression Analysis • A supervised technique that generalizes a set of numeric data by creating a mathematical equation relating one or more input variables to a single output variable. • With linear regression we attempt to model variation in a dependent variable as a linear combination of one or more independent variables. • Linear regression is appropriate when the relationship between the dependent and independent variables is nearly linear.

  5. Simple Linear Regression (slope-intercept form) Equation 10.2

  6. Simple Linear Regression (least squares criterion) Equation 10.3
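
The slope-intercept form and the least squares criterion can be sketched together in a few lines of Python. This is a minimal illustration, not the chapter's Excel procedure: the function name `fit_line` and the data points are made up for the example, and the closed-form solution used is the standard one for a single input variable.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

Because the sample points are exactly linear, the fitted line reproduces them; with noisy data the same function returns the line with the smallest total squared vertical distance to the points.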

  7. Multiple Linear Regression with Excel

  8. Try to estimate the value of a building

  9. A Regression Equation for the District Office Building Data

  10. 10.1 Linear Regression Analysis • How accurate are the results? • Use a scatterplot diagram together with the line given by the formula. • Which independent variables are linearly related to the dependent variable? Use the statistics. • A coefficient of determination of 1 means there is no difference between the actual values (in the table) and the computed values of the dependent variable. (It represents the correlation between actual and computed values.) • Standard error for the estimate of the dependent variable.
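
As a rough illustration of the coefficient of determination, the sketch below uses the standard definition, one minus the ratio of residual to total sum of squares. The function name `r_squared` and the sample values are assumptions made for this example and do not come from the district office building data.

```python
def r_squared(actual, computed):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - c) ** 2 for a, c in zip(actual, computed))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total

# Hypothetical values: a perfect fit, and a model that always predicts the mean.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A perfect fit gives 1, while a model that only predicts the mean of the actual values gives 0.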

  11. F statistic for the regression analysis • Used to establish whether the coefficient of determination is significant. • Look up F critical values (459) from one-tailed F tables in statistics books using v1 (number of independent variables, 4) and v2 (number of instances minus number of variables, 11 - 5 = 6). • The regression equation is able to correctly determine the assessed values of office buildings that are part of the training data.
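
One common way to obtain the F statistic is from the coefficient of determination and the two degrees of freedom. The sketch below assumes that textbook relationship; the function name `f_statistic` and the value 0.9 are hypothetical, and only the counts (11 instances, 4 independent variables) come from the slide.

```python
def f_statistic(r2, n_instances, n_inputs):
    """Return (F, v1, v2) for a regression with the given coefficient of determination."""
    v1 = n_inputs                     # number of independent variables
    v2 = n_instances - n_inputs - 1   # instances minus all variables (inputs plus output)
    f_value = (r2 / v1) / ((1.0 - r2) / v2)
    return f_value, v1, v2

# ASSUMPTION: r2 = 0.9 is a made-up value; 11 and 4 are the counts given on the slide.
f_value, v1, v2 = f_statistic(0.9, 11, 4)
```

The computed value is then compared with the critical value from a one-tailed F table for (v1, v2); a larger computed value indicates a significant coefficient of determination.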

  12. Figure 10.1 A simple linear regression equation

  13. Regression Trees

  14. Figure 10.2 A generic model tree

  15. Regression Tree • Essentially a decision tree whose leaf nodes hold numeric values • The value at an individual leaf node is the numeric average of the output attribute for all instances passing through the tree to that leaf node position • Regression trees are more accurate than linear regression when the data is nonlinear • But they are more difficult to interpret • Sometimes regression trees are combined with linear regression to form model trees
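
The idea that a leaf stores the average output of the instances reaching it can be sketched with a one-split tree (a stump). This is an illustration only: the names `best_split` and `predict`, the squared-error split criterion, and the data are assumptions for the example, not the algorithm used by Weka.

```python
def mean(values):
    return sum(values) / len(values)

def squared_error(values):
    """Sum of squared deviations from the mean of the values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the split with the lowest total squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        error = squared_error(left) + squared_error(right)
        if best is None or error < best[0]:
            best = (error, threshold, mean(left), mean(right))
    return best[1:]

def predict(x, stump):
    """Each leaf predicts the average output of the training instances that reached it."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

# Hypothetical data with two clearly separated groups.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
stump = best_split(xs, ys)
```

A full regression tree would apply the same split search recursively to each side; a model tree would fit a linear regression equation in each leaf instead of storing the mean.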

  16. Model Trees • Regression tree + linear regression • Each leaf node represents a linear regression equation instead of an average value • Model trees simplify regression trees by reducing the number of nodes in the tree. • A more complex tree means a less linear relationship between the dependent and independent variables.

  17. Figure 10.3 A model tree for the deer hunter dataset (output attribute yes)

  18. 10.2 Logistic Regression

  19. Logistic Regression • Using linear regression to model problems whose observed outcome is restricted to two values (e.g. yes/no) is seriously flawed. The value restriction placed on the output variable is not observed in the regression equation: linear regression produces a straight line that is unbounded on both ends. • Therefore the linear equation must be transformed to restrict the output to [0,1]. The regression equation can then be thought of as producing the probability of occurrence or nonoccurrence of a measured event. • Logistic regression applies a logarithmic transformation.

  20. Transforming the Linear Regression Model Logistic regression is a nonlinear regression technique that associates a conditional probability with each data instance. 1 denotes observation of one class (yes); 0 denotes observation of the other class (no). Thus p(y=1|x) is the conditional probability of seeing the class associated with y=1 (yes), given the values in the feature vector x.

  21. The Logistic Regression Model Determine the coefficients of x in (ax+c) using an iterative method (it tries to minimize the sum of logarithms of predicted probabilities). Convergence occurs when the logarithmic summation is close to 0 or when it does not change from iteration to iteration. Equation 10.7
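
The iterative fitting described on this slide can be sketched as follows. The transcript does not say which iterative method is used, so this example substitutes plain gradient ascent on the log-likelihood for a single input variable; the function names, learning rate, tolerance, and data are all assumptions for illustration.

```python
import math

def logistic(z):
    """Map any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, c, xs, ys):
    """Sum of log predicted probabilities of the observed classes (always <= 0)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(a * x + c)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit(xs, ys, rate=0.1, max_iter=5000, tol=1e-9):
    """Gradient ascent; stops when the log-likelihood no longer changes between iterations."""
    a, c = 0.0, 0.0
    previous = log_likelihood(a, c, xs, ys)
    for _ in range(max_iter):
        grad_a = sum((y - logistic(a * x + c)) * x for x, y in zip(xs, ys))
        grad_c = sum(y - logistic(a * x + c) for x, y in zip(xs, ys))
        a += rate * grad_a
        c += rate * grad_c
        current = log_likelihood(a, c, xs, ys)
        if abs(current - previous) < tol:
            break
        previous = current
    return a, c

# Hypothetical, deliberately non-separable data (x = 2 is a "yes", x = 3 is a "no").
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
a, c = fit(xs, ys)
```

After fitting, `logistic(a * x + c)` plays the role of p(y=1|x): it stays between 0 and 1 and rises with x for this data set.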

  22. Figure 10.4 The logistic regression equation

  23. Logistic Regression: An Example Credit card example: CreditCardPromotionNet file. The life insurance promotion (LifeIns Pro) is the output attribute. CreditCardIns and Sex are the most influential attributes.

  24. Logistic Regression • Classify a new instance using logistic regression • income=35K • Credit card insurance=1 • Sex=0 • Age=39 • P(y=1|x)=0.999

  25. 10.3 Bayes Classifier • Supervised classification technique with a categorical output attribute • All input variables are assumed to be independent and of equal importance • P(H|E): probability that hypothesis H (the dependent variable representing a predicted class) is true given evidence E • P(E|H): conditional probability of evidence E given that H is true (computed from training data) • P(H): a priori probability, the probability of H before the presentation of evidence E (computed from training data) Equation 10.9
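
Equation 10.9 is not reproduced in the transcript. Given the three quantities defined on this slide, it is presumably Bayes' theorem in its standard form, shown here together with the product that follows from the independence assumption (the symbols e_1, ..., e_n for the individual attribute values are introduced only for this illustration):

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E \mid H) = \prod_{i=1}^{n} P(e_i \mid H)
```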

  26. Bayes Classifier: An Example Credit card promotion data set Sex is output

  27. The Instance to be Classified Magazine Promotion = Yes Watch Promotion = Yes Life Insurance Promotion = No Credit Card Insurance = No Sex = ? Two hypotheses: sex = female, sex = male

  28. Computing The Probability For Sex = Male Equation 10.10

  29. Conditional Probabilities for Sex = Male P(magazine promotion = yes | sex = male) = 4/6 P(watch promotion = yes | sex = male) = 2/6 P(life insurance promotion = no | sex = male) = 4/6 P(credit card insurance = no | sex = male) = 4/6 P(E | sex =male) = (4/6) (2/6) (4/6) (4/6) = 8/81

  30. The Probability for Sex=Male Given Evidence E P(sex = male | E) ≈ 0.0593 / P(E)

  31. The Probability for Sex=Female Given Evidence E P(sex = female | E) ≈ 0.0281 / P(E) P(sex = male | E) > P(sex = female | E) The instance is most likely a male credit card customer
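
The comparison for the two hypotheses can be reproduced with a short calculation. This is a sketch under one stated assumption: the prior P(sex = male) = 6/10 is not given in the transcript and is inferred from the denominators of the male conditional probabilities and the reported value 0.0593. The function name `evidence_score` is hypothetical.

```python
from fractions import Fraction

def evidence_score(conditionals, prior):
    """Multiply the conditional probabilities together, then by the prior."""
    score = prior
    for probability in conditionals:
        score *= probability
    return score

# Conditional probabilities for sex = male, as listed in the transcript.
male_conditionals = [Fraction(4, 6), Fraction(2, 6), Fraction(4, 6), Fraction(4, 6)]
# ASSUMPTION: prior P(sex = male) = 6/10, inferred rather than stated.
male_score = evidence_score(male_conditionals, Fraction(6, 10))

# Score for sex = female, taken directly from the transcript.
female_score = 0.0281

# With only two hypotheses, P(E) is the sum of the two scores.
p_evidence = float(male_score) + female_score
posterior_male = float(male_score) / p_evidence
posterior_female = female_score / p_evidence
```

Dividing by P(E) does not change which hypothesis wins, which is why the slides leave P(E) unevaluated; normalizing only turns the two scores into probabilities that sum to 1.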

  32. Zero-Valued Attribute Counts A problem with the Bayes classifier arises when one of the counts is 0. To solve this problem, a small constant is added to the numerator and denominator: n/d becomes (n + kp)/(d + k) (Equation 10.12), where p is 0.5 for an attribute with 2 possible values; the example uses k = 1. Example: P(E | sex = female) = (3/4)(2/4)(1/4)(3/4) = 9/128 becomes P(E | sex = female) = (3.5/5)(2.5/5)(1.5/5)(3.5/5) ≈ 0.0735
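
The adjustment can be checked numerically. The sketch below assumes the correction has the form (n + k·p)/(d + k) with k = 1 and p = 0.5, which is what the example fractions on this slide imply (0.5 added to each numerator, 1 to each denominator); the function name `smoothed` is hypothetical.

```python
def smoothed(n, d, k=1.0, p=0.5):
    """Adjusted ratio (n + k*p) / (d + k); never zero, even when the count n is 0."""
    return (n + k * p) / (d + k)

# (numerator, denominator) pairs from the example on the slide.
counts = [(3, 4), (2, 4), (1, 4), (3, 4)]

raw_product = 1.0
adjusted_product = 1.0
for n, d in counts:
    raw_product *= n / d
    adjusted_product *= smoothed(n, d)

# A zero count no longer wipes out the whole product.
zero_count_ratio = smoothed(0, 4)
```

Without the adjustment, a single zero count would force the entire product, and therefore the class probability, to 0 regardless of the other attributes.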

  33. Missing Data With Bayes classifier missing data items are ignored.

  34. Missing Data • Example

  35. Numeric Data

  36. Numeric Data Probability Density Function (attribute values are assumed to be normally distributed), where e = the exponential function, μ = the class mean for the given numerical attribute, σ = the class standard deviation for the attribute, and x = the attribute value. Equation 10.13

  37. Numeric Data • Magazine Promotion = Yes • Watch Promotion = Yes • Life Insurance Promotion = No • Credit Card Insurance = No • Age = 45 • Sex = ? • … • P(E|sex=male) = …. P(age=45|sex=male) • σ = 7.69, μ = 37, x = 45 • P(age=45|sex=male) = 1/(….) = 0.03 • P(sex=male|E) = 0.0018/P(E) • P(sex=female|E) = 0.0016/P(E) • The instance is classified as male
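
The value for age can be verified with the normal probability density function. The transcript elides the formula itself, so this sketch assumes the standard normal density; the function name `normal_density` is hypothetical, while the mean 37, standard deviation 7.69, and the score 0.0593 come from the slides.

```python
import math

def normal_density(x, mean, std):
    """Normal probability density evaluated at x."""
    exponent = -((x - mean) ** 2) / (2.0 * std ** 2)
    return math.exp(exponent) / (math.sqrt(2.0 * math.pi) * std)

# Class mean and standard deviation of age for sex = male, from the slide.
p_age_given_male = normal_density(45, 37, 7.69)

# Multiply into the categorical-attribute score for sex = male from the earlier slides.
male_score = 0.0593 * p_age_given_male
```

The density evaluates to roughly 0.03, and multiplying it into the earlier score for sex = male gives about 0.0018, matching the figures on the slide.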

  38. 10.4 Clustering Algorithms

  39. Agglomerative Clustering 1. Place each instance into a separate partition. 2. Until all instances are part of a single cluster: a. Determine the two most similar clusters. b. Merge the clusters chosen into a single cluster. 3. Choose a clustering formed by one of the step 2 iterations as a final result.
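
The steps above can be sketched in plain Python. The slide does not say how similarity between clusters is measured, so this example assumes Euclidean distance between cluster centroids (smaller distance means more similar); the function names and the five sample points are made up for illustration.

```python
from itertools import combinations

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return [sum(coords) / len(cluster) for coords in zip(*cluster)]

def distance(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(points):
    """Return the list of clusterings, from one cluster per point down to a single cluster."""
    clusters = [[p] for p in points]                 # step 1: one partition per instance
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:                         # step 2: repeat until one cluster remains
        # Step 2a: find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: distance(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        # Step 2b: merge the chosen pair into a single cluster.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append([list(c) for c in clusters])
    return history                                   # step 3 chooses one entry of this list

# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.5, 8.0), (1.0, 1.5)]
history = agglomerate(points)
```

Each entry of `history` is one candidate clustering; step 3 of the algorithm picks among them using a heuristic. Here the second-to-last entry holds the two groups that are visible in the data.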

  40. Agglomerative Clustering: An Example

  41. Agglomerative Clustering The final step of the algorithm is to choose a final clustering from among all those formed. (This requires heuristics.) Use the similarity measure that was used for creating the clusters to compare the average within-cluster similarity with the overall similarity of all instances in the dataset (domain similarity). This technique is best used to eliminate clusterings rather than to choose a final result.

  42. Agglomerative Clustering The final step of the algorithm is to choose a final clustering from among all those formed. (This requires heuristics.) Compare the within-cluster similarity measure with the within-cluster similarities of pairwise-combined clusters in the cluster set, and look for the highest similarity. This technique is best used to eliminate clusterings rather than to choose a final result.

  43. Agglomerative Clustering The final step of the algorithm is to choose a final clustering from among all those formed. (This requires heuristics.) Use the previous two techniques to eliminate some of the clusterings. Feed each remaining clustering to a rule generator. The clustering with the best defining rules is chosen. A fourth technique is the Bayesian Information Criterion.