Nurissaidah Ulinnuha

A Study of Academic Performance using Random Forest, Artificial Neural Network, Naïve Bayesian and Logistic Regression


Presentation Transcript


  1. A Study of Academic Performance using Random Forest, Artificial Neural Network, Naïve Bayesian and Logistic Regression. Nurissaidah Ulinnuha

  2. Introduction

  3. LITERATURE REVIEW

  4. Artificial Neural Network Superiority • ANN is useful in several application areas, including pattern recognition, classification, forecasting, and process control. • Robust to noisy datasets.

  5. Limitation • ANNs do not have parametric statistical properties (e.g. they have no individual coefficient or model significance tests based on the t and F distributions). • An ANN may converge to a local rather than the global minimum, thereby producing non-optimal data fits.

  6. Logistic Regression Superiority • LR provides a significance value for each predictor. • There are no assumptions about the normality of the dataset.

  7. Limitation • Works only with a binary criterion variable.

  8. Naïve Bayesian Superiority • Naïve Bayesian requires less training data than other classification methods. Limitation • The dataset should satisfy the independence assumption.

  9. Random Forest Decision Tree Superiority • Random Forest runs efficiently on large databases. • Random Forest can handle thousands of input variables without variable deletion. • Random Forest gives estimates of which variables are important in the classification. • Random Forest has an effective method for estimating missing data and maintains accuracy when a large proportion of the data is missing. • Random Forest is able to perform classification, clustering, and outlier detection.
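The variable-importance estimates mentioned above can be sketched with scikit-learn's `RandomForestClassifier`; the data below is synthetic (not the ITS dataset), sized to echo the 7-feature, 104-record setting described later:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in: 7 features, only 2 of them informative.
X, y = make_classification(n_samples=104, n_features=7,
                           n_informative=2, n_redundant=0,
                           random_state=42)
rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)
for i, imp in enumerate(rf.feature_importances_):
    print(f"feature {i}: importance {imp:.3f}")
```

The importances sum to 1, so they can be read directly as each feature's relative contribution to the forest's splits.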

  10. Limitation • Random forests have been observed to overfit for some datasets with noisy classification/regression tasks. • Unlike decision trees, the classifications made by Random Forests are difficult for humans to interpret.

  11. Prior Research

  12. Mukta Paliwal and Usha Kumar Title Academic performance of business school graduates using neural network and statistical techniques. Overview This research compares ANN with several statistical techniques. Paliwal concludes that neural network techniques outperform regression analysis on the prediction problem, whereas the performance of the neural network is comparable to logistic regression and discriminant analysis on the classification problem.

  13. J. Zimmerman Title Predicting graduate-level performance from undergraduate achievements Result This research predicts graduate-level performance using a random forest decision tree. From this research, we learn that random forest is not only able to classify but can also explain the significance of each variable.

  14. Data and Methods

  15. Raw Data Graduation data of Informatics Engineering master's students, ITS (2008-2011)

  16. Preprocessing (165 records) • Filter out records with null values • Convert all attributes to numeric values • Convert the class attribute to a nominal value
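The three preprocessing steps above can be sketched with pandas; the column names and values here are hypothetical stand-ins, since the slides do not list the actual attributes:

```python
import pandas as pd

# Hypothetical raw records; column names are illustrative, not from the slides.
raw = pd.DataFrame({
    "entrance_score": [80.0, None, 75.5, 90.0],
    "undergrad_gpa":  ["3.2", "3.6", None, "3.8"],
    "grad_gpa":       [3.4, 3.7, 3.1, 3.6],
})

clean = raw.dropna()                    # 1. filter out records with null values
clean = clean.apply(pd.to_numeric)      # 2. convert all attributes to numbers
clean["class"] = pd.Categorical(        # 3. nominal class attribute (A/B by GPA)
    ["A" if g > 3.5 else "B" for g in clean["grad_gpa"]])
print(clean)
```

After dropping the two rows containing nulls, two numeric records remain, each tagged with a nominal class label.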

  17. Dataset Graduation data of Informatics Engineering master's students, ITS (2008-2011)

  18. Information of Dataset Features 7 features and 104 records

  19. Class • A : GPA > 3.5 • B : GPA <= 3.5 Tools Weka
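The study runs the four classifiers in Weka; purely as an illustration of the comparison, an equivalent scikit-learn sketch on synthetic stand-in data (not the ITS dataset) might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

# Stand-in for the 7-feature, 104-record graduation dataset.
X, y = make_classification(n_samples=104, n_features=7, random_state=0)

models = {
    "Random Forest":       RandomForestClassifier(random_state=0),
    "ANN (MLP)":           MLPClassifier(max_iter=2000, random_state=0),
    "Naive Bayesian":      GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

Cross-validation matters here because, as the discussion slide notes, the training-data composition influences classifier performance on a dataset this small.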

  20. Result

  21. Discussion and Future Work

  22. Discussion • The composition of the training data influences the performance of the classification techniques. • Random Forest overfits for some datasets. • Random Forest accuracy is not better than the other methods on datasets with few features.

  23. Future Works • Discard unimportant dataset attributes using Principal Component Analysis. • Find a method to solve the overfitting problem of the Random Forest decision tree.
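The proposed PCA step can be sketched with scikit-learn; again the data is a synthetic stand-in shaped like the 7-feature, 104-record dataset:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

X, y = make_classification(n_samples=104, n_features=7, random_state=0)
pca = PCA(n_components=0.95)      # keep components explaining 95% of variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)
```

Passing a float to `n_components` lets PCA choose the smallest number of components whose cumulative explained variance reaches that threshold, which is one way to "discard unimportant attributes" without picking a fixed dimension by hand.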

  24. Thank you
