BIG DATA PowerPoint Presentation

BIG DATA

Presentation Transcript

  1. BIG DATA MBUS 626-01 - G4 Zoe Mayhook Bailee Neyland Crystal Side Michael Stuber

  2. Big Data: Finding that Diamond in the Rough • The most common interpretation of big data is the systematic analysis of huge volumes of data to find patterns and behaviors that are not readily apparent. • It has rapidly created an entire sub-industry that generated $11.59 billion in 2012, according to the research community Wikibon. • Wikibon predicts the big data market will be worth $47 billion by 2017.
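The two Wikibon figures above imply a yearly growth rate, which can be worked out with a few lines of Python. This is only a rough sketch: it assumes smooth compound growth between 2012 and 2017, and the function name `implied_cagr` is made up for illustration.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, an end value, and a span in years."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Figures from the slide, in billions of dollars (2012 actual, 2017 forecast).
    rate = implied_cagr(11.59, 47.0, 2017 - 2012)
    print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption the forecast works out to roughly 32% growth per year.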

  3. Big Data Defined “A massive volume of both structured and unstructured data that is so large that it’s difficult to process using traditional databases and software techniques.” 3-V Model: volume, velocity, and variety

  4. Big Data Continued… • What makes data big? • Origin • Growth

  5. Major Sources of Big Data • Social Media • Server Logs • Web/clickstream • Machine/sensor • Geolocation

  6. Important factors to consider • Big data needs to be mediated by the human touch and common sense • Human beings and human-oriented decisions must play a fundamental role in any big data strategy or companies risk alienating their customers and damaging their brands.

  7. Who is using Big Data? LEADERS: • Amazon • Uses big data to drive innovation, with scalable services for data collection, storage, integration, analytics and collaboration • Handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers • Walmart • Handles more than one million customer transactions every hour • Uses big data to reach customers, or friends of customers, who have mentioned a product online, informing them about that exact product and including a discount • Netflix • Uses big data to more accurately predict the behavior of its subscribers and potential subscribers

  8. Who else is using Big Data? And who should? • Startups • Healthcare

  9. Big Data Companies • Cloudera - Leader in Apache Hadoop-based software and services; offers a data platform that enables enterprises and organizations to look at all their data and ask bigger questions for unprecedented insight at the speed of thought. • MapR - Delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. • Splunk - Founded to pursue a disruptive new vision: make machine data accessible, usable and valuable to everyone. By monitoring and analyzing everything from customer clickstreams and transactions to network activity and call records, Splunk turns machine data into valuable insights no matter the business. • Palantir - Delivers big data technology to improve crisis response

  10. Up-and-Coming Big Data Companies

  11. Big Data Technologies Wide-scale digitization of information has created many new sources of data Traditional approaches to managing data don’t support volume, velocity, and variety New approaches are needed: • NoSQL Databases • MapReduce & Hadoop

  12. What Do Data Sources Look Like?

  13. What Do Data Sources Look Like?

  14. What Do Data Sources Look Like?

  15. What Do Data Sources Look Like?

  16. NoSQL • Not Only SQL (Structured Query Language) • Data can be unstructured • Data is typically organized in key-value pairs • Values can be anything from images, songs, and documents to lists or traditional data types • Examples include Cassandra & Redis
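The key-value idea on the NoSQL slide can be sketched with a plain in-memory Python dictionary. This is an illustration only, not how Cassandra or Redis are implemented; the class `KeyValueStore`, its methods, and the sample keys are all hypothetical.

```python
class KeyValueStore:
    """Toy in-memory key-value store: every record is just a key and an arbitrary value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store any value (bytes, list, dict, number, ...) under a key."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or a default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed, False otherwise."""
        if key in self._data:
            del self._data[key]
            return True
        return False


if __name__ == "__main__":
    store = KeyValueStore()
    # Values need not share a schema: raw bytes, a list, and a nested document.
    store.put("image:42", b"\x89PNG...")
    store.put("user:1:playlist", ["song-a", "song-b"])
    store.put("user:1:profile", {"name": "Ada", "visits": 3})
    print(store.get("user:1:playlist"))
```

Unlike a relational table, nothing here forces the three values to share columns or types, which is the point the slide makes about unstructured data.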

  17. MapReduce & Hadoop • All processing is done on key/value pairs • Basic approach is to organize very large sets of data (map) and then crunch them (reduce) • Many algorithms can be implemented within MapReduce architecture • Hadoop & MapReduce systems provide task management & file systems to distribute jobs across hundreds (or thousands) of commodity servers
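The map and reduce steps described on this slide can be sketched as a single-process word count in plain Python. This is not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the phases in parallel across many servers.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) key/value pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three phases in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(documents)))


if __name__ == "__main__":
    logs = ["big data big insight", "data about data"]
    print(word_count(logs))
```

Because each document is mapped independently and each key is reduced independently, both phases can be split across machines, which is what the task management and distributed file system in Hadoop take care of.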

  18. Mini Case Discussion The San Leandro, California Police Department uses cameras mounted on squad cars to routinely photograph license plates while patrolling the area. Millions of these pictures are passed on to the Northern California Regional Intelligence Center and are analyzed using big data software developed by Palantir. 1. What are the benefits of photographing, saving, and analyzing license plate information? 2. What do you find most concerning?

  19. Ethics for Big Data • Big data itself is ethically neutral • Its uses might not align with how we feel, but should align with core values • Ethical inquiry should take place due to the sheer volume, variety and velocity of big data

  20. Framework for Big Data Ethics • Identity • Relationship between offline and online identity • Privacy • Who should control access to data? • Ownership • Who owns data, can rights be transferred? • Reputation • Can we determine what data is trustworthy?

  21. Alignment of Methodology • Inquiry • Discussion of core values • Analysis • Review current practices, and assess how well they align with core values • Articulation • Explicit, written expression of alignment and misalignment between values and practices • Action • Tactical plan to close alignment gaps

  22. Ethical Guidelines - Proposals • Radical Transparency • Explain what data is being collected and how it will be used • Simplicity by Design • Allow users to adjust any privacy settings to determine what they want shared or not • Privacy policies should be simple and understandable • Preparation and Security • Define what information and data you need, and what information you can do without • Develop a crisis strategy in case the company's systems get hacked • Make Privacy Part of the DNA • Hire a chief privacy officer or chief data officer • Address privacy at all levels of the organization

  23. Benefits of Adopting Big Data Ethics • Reduction in risk of unintended consequences • Faster consumer adoption (reducing fear of unknown) • Increased pace of innovation • Reduced friction from legislation

  24. Big data is about “building new analytic applications based on new types of data, in order to better serve your customers and drive a better competitive advantage.” …Thank you