
Neural network and learning machines






Presentation Transcript


  1. Neural network and learning machines Logistic Regression Model Lec 4 Instructor: Dr. Emad Nabil

  2. Linear regression Logistic regression Multivariate regression Polynomial regression

  3. Note: Logistic regression is used for classification not for regression (prediction) like linear/polynomial regression.

  4. Classification vs. Prediction (regression)

  5. Source: https://www.slideshare.net/AdilAslam4/bayesian-classification-in-data-mining-73034171

  6. Classification applications. Email: Spam / Not Spam? Online Transactions: Fraudulent (Yes / No)? Tumor: Malignant / Benign? 0: "Negative Class" (e.g., benign tumor); 1: "Positive Class" (e.g., malignant tumor). Training dataset: m examples, n features.

  7. Classification using Regression. The most basic step would be to fit a regression line and see whether classification can be achieved with the same approach. (Plot: Malignant? (Yes) 1 / (No) 0 vs. Tumor Size.) Apply linear regression; for a certain instance x, compute hθ(x). If hθ(x) ≥ 0.5, x is malignant; otherwise x is benign.
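The thresholding idea on this slide can be sketched in Python. This is a minimal illustration, not from the slides: the tumor-size data and labels are hypothetical, and NumPy's least squares stands in for "apply linear regression".

```python
import numpy as np

# Hypothetical 1-D tumor-size data: label 0 = benign, 1 = malignant
sizes = np.array([1.0, 1.5, 2.0, 2.5, 4.0, 4.5, 5.0, 5.5])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit ordinary least squares: h_theta(x) = theta0 + theta1 * x
X = np.column_stack([np.ones_like(sizes), sizes])
theta, *_ = np.linalg.lstsq(X, labels, rcond=None)

# Classify by thresholding the regression output at 0.5.
# Note: adding an outlier (e.g., a very large malignant tumor) would tilt
# the fitted line and shift this threshold crossing, as the next slides show.
preds = (X @ theta >= 0.5).astype(int)
print(preds)  # matches labels on this clean, outlier-free data
```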

  8. Classification using Regression. Say an outlier such as the blue data point is present; then the fitted line shifts and the same decision threshold moves to the right. Problem 1: two data points are misclassified under the same rule (for an instance x, compute hθ(x); if hθ(x) ≥ 0.5, x is malignant, otherwise benign).

  9. Classification using Regression. With the outlier present, the decision boundary shifts to the right. Problem 1: two data points are misclassified. Problem 2: according to the problem definition we need 0 ≤ hθ(x) ≤ 1 to be able to assign a label, but in this case hθ(x) > 1 for some inputs.

  10. Conclusion about Classification using Regression. Applying linear regression to a classification problem might work in some cases, but it is not advisable because it does not scale with complexity. We need a model where 0 ≤ hθ(x) ≤ 1. Solution: logistic regression.

  11. The sigmoid function (logistic function): g(z) = 1 / (1 + e^(−z)). The plot of the sigmoid function shows that, no matter what the value of z, the function returns a value between 0 and 1.
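The sigmoid on this slide can be written directly from its formula. A minimal sketch (the sample z values are arbitrary):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); output is always in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))    # 0.5: the midpoint
print(sigmoid(10))   # close to 1 for large positive z
print(sigmoid(-10))  # close to 0 for large negative z
```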

  12. Logistic regression. For the hypothesis to satisfy 0 ≤ hθ(x) ≤ 1, we need a squashing function, i.e., a function that limits the output of the hypothesis to a given range. For logistic regression, the sigmoid function is used as the squashing function. The hypothesis for logistic regression is given by hθ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx)).

  13. Logistic regression. The value of the hypothesis is interpreted as the probability that the input x belongs to class y = 1, i.e., hθ(x) = P(y = 1 | x; θ): the probability that y = 1, given x, parametrized by θ.
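The probabilistic reading above can be sketched as code. The parameter and feature values here are hypothetical, chosen only to illustrate that hθ(x) and 1 − hθ(x) are the two class probabilities:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x), read as P(y = 1 | x; theta)."""
    return sigmoid(theta @ x)

theta = np.array([-3.0, 1.0])  # hypothetical parameters
x = np.array([1.0, 5.0])       # x0 = 1 (bias term), x1 = feature value
p = h(theta, x)                # P(y = 1 | x; theta)
print(p, 1 - p)                # the two class probabilities sum to 1
```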

  14. Decision Boundary

  15. Decision Boundary. For the given hypothesis of logistic regression, say δ = 0.5 is chosen as the threshold for binary classification: predict y = 1 when hθ(x) ≥ 0.5 (equivalently, θᵀx ≥ 0) and y = 0 otherwise. The set of points where θᵀx = 0 is the decision boundary.

  16. Linear Decision Boundary. With hθ(x) = g(θ₀ + θ₁x₁ + θ₂x₂), the decision boundary is the straight line θ₀ + θ₁x₁ + θ₂x₂ = 0.

  17. Non-linear Decision Boundary. Non-linear decision boundaries can be achieved by using higher-order polynomial terms, incorporated in a way similar to how multivariate linear regression handles polynomial regression.

  18. Non-linear Decision Boundary. Say the hypothesis of the logistic regression has higher-order polynomial terms and is given by hθ(x) = g(θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁² + θ₄x₂²). Let the optimal θ = [−1, 0, 0, 1, 1]ᵀ. Substituting, the decision boundary is x₁² + x₂² = 1 (the unit circle).
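The slide's circle example can be checked numerically. A minimal sketch using the θ vector from the slide; the probe points are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Slide's example: h_theta(x) = g(theta0 + theta1*x1 + theta2*x2 + theta3*x1^2 + theta4*x2^2)
theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])

def predict(x1, x2):
    features = np.array([1.0, x1, x2, x1**2, x2**2])
    return int(sigmoid(theta @ features) >= 0.5)  # boundary: x1^2 + x2^2 = 1

print(predict(0.0, 0.0))  # 0: inside the unit circle
print(predict(2.0, 0.0))  # 1: outside the unit circle
```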

  19. As the order of the features is increased, more and more complex decision boundaries can be achieved by logistic regression. Be aware of overfitting! Gradient descent is used to search for the best parameter values of θ, which determine the decision boundary.

  20. Logistic Regression Cost Function

  21. The Cost function is convex

  22. The cost function of multivariate (linear) regression would not work well for logistic regression: because the hypothesis for logistic regression is the nonlinear sigmoid function, the squared-error cost gives a non-convex curve with many local minima, as shown in the plot. Gradient descent will not work properly in such a case, so it would be very difficult to minimize this function. That cost function is NOT convex; we will use another cost function, which is convex.

  23. Logistic regression cost function: Cost(hθ(x), y) = −log(hθ(x)) if y = 1, and −log(1 − hθ(x)) if y = 0. It is clear that the new cost function can be minimized because it is convex.

  24. Logistic regression cost function. This cost function is arrived at using the principle of maximum likelihood estimation. Below, y is the actual (observed) value.

  25. Logistic regression summary

  26. Logistic regression cost function. Since y ∈ {0, 1}, the two-case cost is equivalent to Cost(hθ(x), y) = −y log(hθ(x)) − (1 − y) log(1 − hθ(x)). The total error is then J(θ) = −(1/m) Σᵢ [ y⁽ⁱ⁾ log hθ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − hθ(x⁽ⁱ⁾)) ].
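The combined cost above translates directly to code. A minimal sketch; the toy dataset is hypothetical, and θ = 0 is used because then hθ(x) = 0.5 everywhere and the cost is exactly log 2:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Hypothetical toy data: bias column x0 = 1 plus one feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
print(cost(np.zeros(2), X, y))  # log(2) ~ 0.693, since h = 0.5 everywhere
```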

  27. Logistic regression cost function. So now, to get the optimal θ, we need to minimize J(θ). For gradient descent we need to compute the partial derivatives ∂J(θ)/∂θⱼ.

  28. Remember !

  29. Gradient Descent for Logistic Regression. Want: min over θ of J(θ). Repeat { θⱼ := θⱼ − α ∂J(θ)/∂θⱼ } (simultaneously update all θⱼ). Substituting the derivative: Repeat { θⱼ := θⱼ − α (1/m) Σᵢ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾ } (simultaneously update all θⱼ). The algorithm looks identical to linear regression!
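The update rule above can be sketched as vectorized batch gradient descent. This is an illustrative implementation, not the instructor's code; the toy dataset, α, and iteration count are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent for logistic regression.

    Update rule (all theta_j updated simultaneously):
      theta_j := theta_j - alpha * (1/m) * sum_i (h_theta(x_i) - y_i) * x_ij
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = (X.T @ (sigmoid(X @ theta) - y)) / m  # vectorized gradient
        theta -= alpha * grad
    return theta

# Hypothetical separable 1-D data with a bias column
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(int)
print(preds)  # separable data, so the training points end up classified correctly
```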

  30. Gradient Descent for logistic Regression Note: Feature Scaling is as important for logistic regression as it is for linear regression as it helps the process of gradient descent.
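Feature scaling as mentioned on this slide is commonly done by standardization. A minimal sketch with hypothetical raw feature values on very different scales:

```python
import numpy as np

def standardize(X):
    """Rescale each feature to zero mean and unit variance,
    so gradient descent converges faster on comparably scaled features."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

X = np.array([[100.0, 0.1], [200.0, 0.3], [300.0, 0.5]])  # hypothetical raw features
Xs, mu, sigma = standardize(X)
print(Xs.mean(axis=0))  # ~[0, 0]
print(Xs.std(axis=0))   # [1, 1]
```

The saved `mu` and `sigma` must be reused to scale any new example before prediction.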

  31. Advanced optimization algorithms (such as conjugate gradient, BFGS, and L-BFGS) automatically find a good value of the learning rate α.

  32. Multiclass Logistic Regression

  33. Multiclass logistic regression. Multiclass logistic regression is an extension of binary classification that makes use of the one-vs-all (one-vs-rest) classification strategy.

  34. Multiclass logistic regression. Email foldering/tagging: Work, Friends, Family, Hobby (y = 1, 2, 3, 4). Medical diagnosis: Not ill, Cold, Flu (y = 1, 2, 3). Weather: Sunny, Cloudy, Rain, Snow (y = 1, 2, 3, 4).

  35. Multiclass logistic regression. Binary classification: +ve samples (y = 1) and −ve samples (y = 0). Multi-class classification: three or more classes in the feature space (x₁, x₂).

  36. Multiclass logistic regression We have three classes

  37. Multiclass logistic regression. Train one classifier per class, treating that class's samples as positive (y = 1) and all other samples as negative (y = 0). Each classifier hθ⁽ⁱ⁾(x) returns the probability that an observation belongs to class i. To predict the class of an observation, all we have to do is select the class of whichever classifier returns the highest probability.
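The one-vs-all prediction step can be sketched as follows. The three θ vectors here are hypothetical stand-ins for already-trained per-class classifiers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(thetas, x):
    """Given one trained theta vector per class, pick the class whose
    classifier h_i(x) = g(theta_i^T x) returns the highest probability."""
    probs = [sigmoid(theta @ x) for theta in thetas]
    return int(np.argmax(probs))

# Hypothetical trained parameters for 3 classifiers over [bias, x1, x2]
thetas = [np.array([0.5, -1.0, 0.0]),   # class 0 vs. rest
          np.array([-0.5, 1.0, -1.0]),  # class 1 vs. rest
          np.array([-1.0, 0.0, 2.0])]   # class 2 vs. rest
x = np.array([1.0, 0.0, 2.0])           # bias term plus two feature values
print(predict_one_vs_all(thetas, x))    # class 2 has the highest probability here
```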

  38. Multiclass logistic regression on the MNIST dataset: each 28×28 image gives 784 features; 10 classifiers are trained, hθ⁽¹⁾(x) to hθ⁽¹⁰⁾(x), one per digit class.
