Pump it Up: Data Mining the Water Table Project Plan Milestone 1

mayhugh

Presentation Transcript


  1. Pump it Up: Data Mining the Water Table • Project Plan, Milestone 1 • Team Members: Alex Pang, Terry Scates, Ibrahim Oyekan, Patrick Merker

  2. Tools Chosen • Patrick Merker (Rattle, based on R) • Ibrahim Oyekan (RapidMiner) • Terry Scates (Vowpal Wabbit) • Alex Pang (Prediction I/O)

  3. Rattle on R • Rattle is a GUI built on R that lets the user analyze data and use many of the available R packages. • Rattle simplifies manipulating data sets and lets the user perform tasks easily without getting lost in the details. • One of its most productive features is using R to build decision trees that predict outcomes from the data. • Once we as a team determine which parts of the tool we want to use, we will extract the functions being used and create a standalone R program that will process the data and give us the decision tree that we will use.
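
The decision-tree idea on this slide can be illustrated with a small standalone sketch. It is written in Python rather than R, uses only the standard library, and is not the code Rattle generates. The feature names (`quantity`, `pump_type`), the class labels, and the five rows are made-up toy data, and the Gini-based splitting rule is one common choice assumed here for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, value) equality test that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for v in {r[f] for r in rows}:
            left = [y for r, y in zip(rows, labels) if r[f] == v]
            right = [y for r, y in zip(rows, labels) if r[f] != v]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (f, v), score
    return best

def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Recursively build a tree of nested dicts; leaves are plain class labels."""
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, v = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != v]
    return {
        "test": (f, v),
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes], features, depth + 1, max_depth),
        "no": build_tree([r for r, _ in no], [y for _, y in no], features, depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk the tree from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        f, v = tree["test"]
        tree = tree["yes"] if row[f] == v else tree["no"]
    return tree

# Toy data, invented for this sketch (not taken from the project's data set).
rows = [
    {"quantity": "enough", "pump_type": "hand"},
    {"quantity": "enough", "pump_type": "motor"},
    {"quantity": "dry", "pump_type": "hand"},
    {"quantity": "dry", "pump_type": "motor"},
    {"quantity": "enough", "pump_type": "hand"},
]
labels = ["functional", "functional", "non functional", "non functional", "functional"]
tree = build_tree(rows, labels, ["quantity", "pump_type"])
```

On this toy data the root test ends up on `quantity`, because splitting on it separates the two classes completely while `pump_type` does not. A standalone R program along the lines the slide describes would presumably rely on an R tree package instead of hand-written splitting code.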

  4. Rattle on R Demonstration

  5. Vowpal Wabbit

  6. Vowpal Wabbit Demonstration

  7. RapidMiner • RapidMiner is a popular, open-source integrated environment for predictive analysis, data science, and machine learning. • It combines a handy, extensively customizable GUI with scripting functionality to help train on and visualize statistical input in order to predict derived outputs. • It provides embedded data mining and machine learning algorithms (regression, random forests, decision trees, etc.), called operators, which can be applied to your datasets, and it gives you the option to add your own algorithms or modify the defaults.

  8. RapidMiner • It also provides functionality for visualizing statistical data in graphs, histograms, scatter plots, etc. • After your dataset is loaded and wired up to an operator, it lets you build a process on the dashboard, which is then executed and produces the requested output. • The free edition only allows up to 10,000 rows of data, and the lowest paid tier costs $2,500 a year, but I applied for an educational license that supports unlimited functionality, and it was just approved.
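
For anyone working without the educational license, one possible workaround for the 10,000-row cap is to draw a random subset of the data before loading it. The sketch below is an assumption-laden illustration, not part of the team's stated plan: the function name `sample_rows`, the fixed seed, and the 25,000 synthetic rows are all hypothetical.

```python
import random

FREE_EDITION_ROW_LIMIT = 10_000  # row cap quoted on the slide

def sample_rows(rows, limit=FREE_EDITION_ROW_LIMIT, seed=0):
    """Return at most `limit` rows, chosen uniformly at random without replacement."""
    rows = list(rows)
    if len(rows) <= limit:
        return rows  # already small enough, keep everything
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(rows, limit)

# Synthetic stand-in for the real training data (the row count is made up).
all_rows = [{"id": i} for i in range(25_000)]
subset = sample_rows(all_rows)
```

Uniform sampling can shift the class balance slightly; if rare classes matter, sampling each class separately (stratified sampling) would be a reasonable alternative.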

  9. RapidMiner Demonstration

  10. Prediction I/O • A machine learning server, deployed as a web service, that creates predictive models in real time • Separated into three layers: the Platform, the Event Server, and the Engine • The platform controls and maintains the event server and the engine • Data is input via the event server • The engine builds predictive models based on the input • Each engine for creating predictive models is highly customizable
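
The division of labor between the event server and the engine can be sketched with a small in-memory model. This is a conceptual illustration only and does not use the tool's actual API: the class names, method names, and event fields below are hypothetical, and the "model" is a deliberately trivial lookup of the most common label per feature value.

```python
from collections import Counter, defaultdict

class EventServer:
    """Hypothetical in-memory stand-in for the event server: it only collects events."""

    def __init__(self):
        self.events = []

    def record(self, entity_id, properties, label):
        self.events.append({"entity_id": entity_id, "properties": properties, "label": label})

class Engine:
    """Hypothetical engine: predicts the most common label seen for one feature's value."""

    def __init__(self, feature):
        self.feature = feature  # the customizable part in this sketch
        self.model = {}
        self.default = None

    def train(self, event_server):
        counts = defaultdict(Counter)
        overall = Counter()
        for e in event_server.events:
            counts[e["properties"][self.feature]][e["label"]] += 1
            overall[e["label"]] += 1
        self.model = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Fall back to the overall majority label for unseen feature values.
        self.default = overall.most_common(1)[0][0] if overall else None

    def query(self, properties):
        return self.model.get(properties.get(self.feature), self.default)

# Made-up events: data goes in through the event server, the engine trains on it.
server = EventServer()
server.record("pump-1", {"quantity": "enough"}, "functional")
server.record("pump-2", {"quantity": "dry"}, "non functional")
server.record("pump-3", {"quantity": "enough"}, "functional")

engine = Engine(feature="quantity")
engine.train(server)
```

In this sketch, calling `engine.train(server)` again after recording new events rebuilds the model, which loosely mirrors the slide's point that models are created from whatever the event server has collected.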

  11. Prediction I/O Demonstration

  12. Design of Predictive Software

  13. Progress Report

  14. Goals for Milestone 2