Big Data and Data Science (red I590)

lloyd
Presentation Transcript


  1. Big Data and Data Science (red I590)
     • What fields, if any, would not benefit from using Big Data style computing? Can it really be an aid to all of academia and industry?
     • It dawned on me how much expertise is actually needed in this field: from business knowledge to programming to analytical skills, it seems never-ending.
     • Data science, in a lot of ways, gives us a chance to understand and answer questions that were previously unexplained.
     • Looking at all the data from different angles to examine what the information truly represents, then providing the best options to innovate or fix the issues at hand.
     • The 3 big V’s! Volume, Velocity, Variety.
     • Extracting meaningful patterns from big data and analyzing this information helps the data scientist manage the end-to-end scientific method process.
     • However, data science means more than finding patterns in the data. Even management of data is a part of data science. Hence, data science can be briefly defined as collecting, analyzing and modeling the data that we receive.
     • Gartner analyst Doug Laney introduced the 3Vs concept in a 2001 META Group research publication, "3D Data Management: Controlling Data Volume, Velocity, and Variety".
     • Various technologies like parallelism, clustering, MapReduce and Iterative MapReduce are applied to these huge amounts of data to obtain knowledge from them and to make wise decisions for the organization. Machine learning is also one of the main technologies for big data.
     • This field requires expertise in the domain, in data mining to extract information from the data, and in programming to design tools to extract this information. In all, it takes skilled professionals to extract meaningful information from raw data to help in the decision-making process.
     • The four A’s of data: a) data Architecture, b) data Acquisition, c) data Analysis, d) data Archiving.
     • The Big Data market, as measured by vendor revenue derived from sales of related hardware, software and services, reached $18.6 billion in calendar year 2013. That represents a growth rate of 58% over the previous year. http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017

  2. 51 Use Cases I
     • The recommender systems (7) or web search (8) use cases interest me. For example, Amazon looks at what kind of books we read, matches us with other readers who read the same book, and recommends the books those users have read.
     • (7) Netflix was the most popular use case.
     • Netflix is interesting to me because personally it affects me all of the time.
     • I never thought about everything that goes into the decision to present that choice to me.
     • In my case, Netflix has rarely picked out something I like and recommended it to me.
     • I sometimes forget how big Netflix really is.
     • I have never given much thought to what goes into the algorithm for recommending new shows/movies to users, but I cannot imagine how much time it took to make that code. It will be interesting to see what the future brings, but possibly also a little uncomfortable if algorithms get to be too good.
     • It is incredible to think about how much video is stored and used through Netflix.
     • Whether or not the recommender is accurate all the time, I find it fascinating to see what it comes up with, and whether it actually matches my interests.
     • I think it's crazy how much information Netflix has to store and how it personally categorizes movies depending on what you've recently watched.
     • The recommendation service on Netflix isn't always as accurate as it could be, but given the few components it has to make those suggestions, the system is doing its job.
     • I do not even have cable anymore due to my excessive use of Netflix. I use the "recommended" feature every day because I think it shows accurate suggestions based on the movies I watch.
     • It seems really interesting how they can use all the data accessible to them to create the recommender.
     • It’s very interesting to know how much information/data goes into making the website work.

  3. 51 Use Cases II
     • I liked Netflix because I use it so often and did not know its complexity. I only assumed it collected and used a lot of data, which it does, but was unaware of the other factors surrounding it.
     • Most of us are really familiar with Netflix; however, we hardly get to know how big and how powerful it really is. When we use the recommended function within it, we never think about how it can know us so well.
     • I find the Netflix case particularly interesting because I use it at least once or twice a week. I have always been curious about what types of algorithms they use to refer you to other movies. Big data is a huge part of what makes Netflix such a popular form of entertainment.
     • YouTube has a feature where the videos recommended while you are watching a video help you see all the similar videos that are available.
     • Netflix has an excellent data analytics system in the form of its recommender system, as it looks through the metadata stored in its huge database and gives us similar results.
     • The Netflix use case intrigues me as it is something that we use on a daily basis.
     • IMDb is an internet movie database. It also uses a recommender system similar to Netflix's.
     • One field which amazes me is the recommender systems used by many large companies. For example, Amazon gives suggestions of what to buy; we get things recommended for us on the site. The same goes for its Kindle: it compares what we read and suggests books read by people who read similar books. Even Netflix does this; it recommends videos...
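
The matching idea described in these comments (find readers who read the same book, then suggest what else those readers have read) can be sketched as a simple co-occurrence count. This is only a minimal illustration under stated assumptions: the data is made up, the names `recommend` and `history` are hypothetical, and it is not the method Amazon, Netflix, YouTube or IMDb actually uses.

```python
from collections import Counter


def recommend(history, user, top_n=3):
    """Suggest items read by users who share at least one item with `user`.

    `history` maps a user name to the set of items that user has read.
    Each candidate item is scored by the total overlap between `user`
    and every other reader who has that item.
    """
    seen = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(seen & items)  # number of items both users have read
        if overlap == 0:
            continue  # no shared taste, so ignore this reader
        for item in items - seen:  # only suggest items the user has not read
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical reading histories, invented for this example.
history = {
    "ann": {"Dune", "Neuromancer"},
    "bob": {"Dune", "Neuromancer", "Snow Crash"},
    "cat": {"Dune", "Foundation"},
    "dan": {"Emma"},
}

print(recommend(history, "ann"))  # ['Snow Crash', 'Foundation']
```

Here "Snow Crash" ranks first because bob shares two books with ann, while cat shares only one. A reader with no overlap with anyone, such as dan, gets an empty list, which hints at why real systems need far more data and additional signals.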

  4. 51 Use Cases III
     • (13-15) Defense case. There are so many issues around surveillance and privacy rights these days.
     • I think that having a fast and reliable cloud-based system for the Army is essential and worth the cost, despite negative publicity today about the NSA and their surveillance program.
     • Privacy rights regarding social networks and/or government surveillance are an ongoing topic of debate and concern in our society.
     • How we see and identify people and how fast that process can happen. Surveillance is evolving and becoming so advanced.
     • The example of a tank needing to process whether it could go through given terrain stuck in my head.
     • (10) I thought the cargo shipping case was interesting because more and more people are purchasing goods online, and they must be delivered somehow to the customers' homes by shipping companies.
     • Because Amazon would be using drones for this delivery service, there would be a lot of consumer privacy discussion.
     • Commercial (5-12): maximize resources by deriving patterns and business strategies from analyzing the data collected from all kinds of sources, such as consumers.
     • (51) Smart grid, involved with energy distribution. A more efficient way to give people exactly the amount of energy that they will use seems like a great area for improvement in our economy today.
     • I researched the Smart Grid and think it's interesting how it can take the information of all consumers and providers of energy on the grid and use the data to help make better decisions about energy consumption.
     • (16-25) Healthcare
     • "Personalized medicine" is growing in popularity and potential practicality.
     • I think the healthcare example is the most relevant to our society today and to my interests. I find it so cool how all this data can be applied in neat ways to help others.
     • Being able to understand healthcare information at a deeper level helps us to form new hypotheses, test new ideas and find solutions to the problems that we have today.
     • I never really thought that using big data could make your healthcare experience more tailored to your specific needs.
     • I think the use of data science in healthcare can really help patients and doctors manage disease.
     • (23) Studying how epidemics happen across different populations in our world is interesting.
     • It would be very interesting to see some of the data behind the statistics that I have been exposed to.

  5. 51 Use Cases IV
     • (5) Use case of finance. I've started to do my own investing and trading and think it is amazing how some technical analysts can predict indexes/stocks.
     • Technical analysis is stock value driven.
     • Fundamental analysis is company property driven.
     • One area that I think is important to cover is online shopping with credit card information and things of that nature, because of the amount of personal data to be stored...
     • (24) Social media trends on the internet are being used to sway the marketing environment. In a world where privacy is deteriorating, we are seeing how Google is able to establish trends.
     • (28) "Truthy: Information diffusion research from Twitter Data", mostly because it is a study that has been done here at Indiana University. I am also intrigued by the tracking of large amounts of data to detect trends and make information from social media useful.
     • The vast amount of data collected through social networking is always surprising to me.
     • The idea of big data is easily defined, whereas data science is a limitless field of research and study. The way I think of it is that big data is the pooling of raw information, and the field of data science is deriving knowledge from said information on a holistic scale.
     • I feel social media plays a big role in understanding people and their moods and thoughts. This research, if done anonymously, can provide information that is useful to the government as well as businesses. What is challenging in this use case is the real-time processing of data.
     • (26) I have a keen interest in Artificial Intelligence, and by using deep learning we can revisit our path of trying to make robots more human-like. I would personally like to do some research in this field.
     • (27) 3D image reconstruction using MapReduce is exciting!
     • (36-40) Big data usage in the fields of Astronomy and Physics. Astronomy has been one of the first areas of science to embrace and learn from big data. It is transforming our knowledge of the cosmos in several ways.
     • (41-50) "Earth, Environmental, and Polar Science" use case. I think that the application of big data collected in this field is very important because it can provide possible solutions to environmental issues and provide us with knowledge about the earth that would otherwise be left undiscovered.
     • With CReSIS, we are creating new technologies to learn more about ice sheets and sea level change (and to predict future changes). This will prove very important since it is such an important part of our planet.
     • I am really interested in environmental studies, so thinking about solutions to help the earth is amazing to me. Having data make a difference is really a remarkable thing, and I hope it can help with a lot of things to come.
