100 likes | 119 Vues
Big data is data sets that are so big and complex that traditional data-processing application software are inadequate to deal with them. ... There are a number of concepts associated with big data: originally there were 3 concepts volume, variety, velocity.
E N D
Basic Concepts in BigData By ProfessionalGuru
What is “bigdata”? • "Big Data are high-‐volume, high-‐velocity, and/or high-‐variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization” (Gartner2012) • Complicated (intelligent) analysis of data may make a small data “appear” to be“big” • Bottom line: Any data that exceeds our current capability of processing can be regarded as“big” http://professional-guru.com
Why is “big data” a “bigdeal”? • Government • Obama administration announced “big data”initiative • Many different big data programslaunched • PrivateSector • Walmart handles more than 1 million customer transactions every hour, which is imported into databases estimated to contain more than 2.5 petabytes ofdata • Facebook handles 40 billion photos from its userbase. • Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide • Science • Large Synoptic Survey Telescopewillgenerate 140Terabyte of data every 5days. • Biomedical computation like decoding human Genome & personalizedmedicine • Social science revolution • – -… http://professional-guru.com
Lifecycle of Data: 4“A”s Aggregation Analysis Acquisition Application http://professional-guru.com
Computational View of BigData DataVisualization DataAnalysis DataAccess DataIntegration DataUnderstanding Forma&ng,Cleaning Storage Data http://professional-guru.com
Big Data & RelatedTopics/Courses CS199 Human-‐ComputerInteraction DataVisualization MachineLearning InformationRetrieval Databases DataAnalysis DataAccess DataMining ComputerVision SpeechRecognition DataIntegration DataUnderstanding Natural LanguageProcessing DataWarehousing Forma&ng,Cleaning SignalProcessing ManyApplications! Data Storage InformationTheory http://professional-guru.com
Some Data AnalysisTechniques Visualization Classification PredictiveModeling TimeSeries Clustering http://professional-guru.com
Example of Analysis: Clustering & Latent FactorAnalysis GroupM2 GroupM1 GroupU1 GroupU2 http://professional-guru.com
Example of Analysis: PredictiveModeling GroupM2 GroupM1 GroupU1 GroupU2 (Binary)Classification Does user2 like moviem? What rating is user2 likely going to give moviem? Regression http://professional-guru.com
Some topics we’llcover http://professional-guru.com