This project presentation explores the implementation of data mining techniques to classify stock quotes using decision trees. The aim is to develop a model that groups companies based on historical stock prices, helping to predict future trends. We’ll discuss the theory behind classification, demonstrate the application interface for data processing, and analyze stock data from various companies like Pfizer and Exxon. The final outcomes include enhanced user interfaces, visualization of decision trees, and the next steps to improve the model's usability and effectiveness.
First Presentation, Final Year Project, 2013 Analyzing Stock Quotes using Data Mining Techniques Name of Student: To Yi Fun University Number: 2010149103
Flow of Presentation • Aim of this classification for stock trading • Theory of Classification • Decision Tree making • Introduction of the application • Structure and technologies used in this application • Preparation • Interface
Flow of Presentation • Demonstration • Data Analysis • What to do next • Q&A
Aim • Find a model that expresses the class attribute as a function of the other attributes, so that previously unseen records can be assigned a class • e.g. find a classifier for historical stock prices; group companies into different classes for inspection • Classifiers: decision tree, rule-based classifier
Theory for Decision Tree • A series of test conditions that sorts instances into classes • Greedy: at each step, split the records on the attribute that best satisfies the splitting criterion • Attribute (discrete) setting: 2-way split or multi-way split
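The two ways of splitting on a discrete attribute can be sketched as follows. This is a minimal illustration, not the project's C# code: the function names, the sample records and the attribute values "High", "Mid" and "Low" are all hypothetical.

```python
from collections import defaultdict


def multiway_split(records, attribute):
    """Multi-way split: one group per distinct value of the attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record)
    return dict(groups)


def two_way_split(records, attribute, left_values):
    """2-way split: records whose value is in left_values vs. all others."""
    left = [r for r in records if r[attribute] in left_values]
    right = [r for r in records if r[attribute] not in left_values]
    return left, right


# Made-up records for illustration only; the values are assumptions.
records = [
    {"HL_VolChange": "High", "buy": True},
    {"HL_VolChange": "Mid", "buy": False},
    {"HL_VolChange": "Low", "buy": False},
    {"HL_VolChange": "High", "buy": True},
]

by_value = multiway_split(records, "HL_VolChange")
left, right = two_way_split(records, "HL_VolChange", {"High"})
```

A greedy tree builder would generate candidate splits like these for every attribute and keep the one that scores best under the chosen criterion.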
Theory for Decision Tree • Best split - Gini index: a generalization of variance impurity - Entropy: the amount of impurity in a set • Aim: use a training set to build a classifier for classifying the testing set
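Both impurity measures can be computed from the class counts of a set. The sketch below uses the standard textbook definitions (Gini = 1 - sum of squared class proportions, entropy = -sum of p * log2 p). The function names are chosen here for illustration and are not taken from the project.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: minus the sum of p * log2(p) over the classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


def weighted_impurity(groups, impurity):
    """Impurity of a split: group impurities weighted by group size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups if group)


mixed = [True, True, False, False]   # evenly mixed set: maximally impure
pure_split = [[True, True], [False, False]]  # split that separates the classes
```

The best split is the candidate with the lowest weighted impurity. Here the evenly mixed set has Gini 0.5 and entropy 1 bit, while the split that separates the classes perfectly scores 0 under both measures.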
Application Structure (diagram) • Components shown: Download, Raw data, Data processing, CSV2MYSQLGENERATOR, Processed Data, Filter, Query (Splitting), Information presentation and arithmetic operation
Preparation • Download historical stock data for the 30 Dow shares, e.g. Pfizer, Bank of America, American Express, Exxon • Convert the data to .csv files to be processed by the CSV2MYSQLGENERATOR program; the result is a lengthy list of SQL commands
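The CSV-to-SQL step might look roughly like the sketch below. It is not the CSV2MYSQLGENERATOR program itself, whose internals the slides do not show. The CSV header (Date, Open, High, Low, Close, Volume), the table and column names and the sample prices are all assumptions made for illustration.

```python
import csv
import io


def csv_to_insert_statements(csv_text, table, symbol):
    """Turn each CSV row of daily quotes into one SQL INSERT statement."""
    if not symbol.isalnum():
        raise ValueError("symbol must be alphanumeric")
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # float()/int() reject non-numeric input before it reaches the SQL text.
        statements.append(
            "INSERT INTO {table} (symbol, trade_date, open_price, high_price, "
            "low_price, close_price, volume) VALUES "
            "('{symbol}', '{date}', {o}, {h}, {l}, {c}, {v});".format(
                table=table,
                symbol=symbol,
                date=row["Date"],
                o=float(row["Open"]),
                h=float(row["High"]),
                l=float(row["Low"]),
                c=float(row["Close"]),
                v=int(row["Volume"]),
            )
        )
    return statements


# Made-up sample rows, not real quotes.
sample_csv = (
    "Date,Open,High,Low,Close,Volume\n"
    "2013-01-02,25.0,25.5,24.8,25.2,1000000\n"
    "2013-01-03,25.2,25.9,25.1,25.7,1200000\n"
)
statements = csv_to_insert_statements(sample_csv, "quotes", "PFE")
```

One statement per trading day across 30 companies explains why the generated SQL is lengthy. A loader that talks to the database directly would normally use parameterized queries instead of building SQL text.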
Data Processing • Categorize stocks into different types by industry • Dow 30 as the training set and 8 more stocks as the testing set, mainly large-scale companies
Data Processing Class: -B_RiseMore3Perc5Day: Buy Signal • Attributes Setting -HL_30DaysAverage: Tendency -HL_ChangeDaily: Change -HL_ChangePerc: Difference -HL_VolChange: Popularity
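The slides name the class and attributes but do not define them, so the sketch below is one possible reading and not the project's actual rule. It assumes that B_RiseMore3Perc5Day is true when the closing price 5 trading days later is more than 3% above the current close, and that the HL_ prefix marks a High/Low discretization (shown here for the 30-day average). The function names and sample prices are hypothetical.

```python
def rise_more_than_3_percent_in_5_days(closes, horizon=5, threshold=0.03):
    """Assumed buy signal: close `horizon` days ahead exceeds today's by > 3%."""
    labels = []
    for i, price in enumerate(closes):
        j = i + horizon
        if j >= len(closes):
            labels.append(None)  # not enough future data to label this day
        else:
            labels.append(closes[j] > price * (1 + threshold))
    return labels


def high_low_vs_trailing_average(closes, window=30):
    """Assumed HL attribute: 'H' if close is above its trailing average, else 'L'."""
    values = []
    for i, price in enumerate(closes):
        if i < window - 1:
            values.append(None)  # not enough history for a full window
        else:
            average = sum(closes[i - window + 1 : i + 1]) / window
            values.append("H" if price > average else "L")
    return values


# Made-up closing prices for illustration only.
closes = [100, 101, 102, 103, 104, 104, 103]
labels = rise_more_than_3_percent_in_5_days(closes)
trend = high_low_vs_trailing_average([1, 2, 3, 2], window=3)
```

Days without enough history or future data are left unlabeled (None) rather than guessed, so they can be filtered out before training.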
Data Processing • Attributes Setting
User Interface • Use the MySQL connector to load the processed data into the C# application • Three Major Components: • -Input • -Result Log • -Test
Demonstration • Use the MySQL connector to load the processed data into the C# application • Three Major Components: • -Input • -Result Log • -Test
Result Analysis Attributes Setting -HL_30DaysAverage: Tendency -HL_ChangeDaily: Change -HL_ChangePerc: Difference -HL_VolChange: Popularity
What to do Next • Implement a more user-friendly UI for presenting stock prices, visualizing the tree and providing a query service • Implement a splitting algorithm using the Gini index and compare the results generated by the different algorithms