Big Data: Local Data. What it is & how CAPs can apply Big Data analytic practices to their work
What we will be covering today • What is meant by Big Data • The who, what, how, and when of big data • Current applications in the corporate and academic worlds • What CAPs can learn from these applications • Issues about Big Data: the good, the bad, and the ugly • How these issues are likely to impact CAA clients • Discussion of practical points for starting to apply Big Data analytics locally
What is Big Data? • Big data is a blanket term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications.
What is Big Data? • We swim in a sea of data ... and the sea level is rising rapidly • Tens of millions of connected people, billions of sensors, trillions of transactions now work to create unimaginable amounts of information. An equivalent amount of data is generated by people simply going about their lives, creating what the McKinsey Global Institute calls “digital exhaust”—data given off as a byproduct of other activities such as their Internet browsing and searching or moving around with their smartphone in their pocket.
Characteristics of the Industry • According to the FTC, this is what the world of big data looks like in 2014 • Data Brokers Collect Consumer Data from Numerous Sources, Largely Without Consumers’ Knowledge • The Data Broker Industry is Complex, with Multiple Layers of Data Brokers Providing Data to Each Other • Data Brokers Collect and Store Billions of Data Elements Covering Nearly Every U.S. Consumer • Data Brokers Combine and Analyze Data About Consumers to Make Inferences About Them, Including Potentially Sensitive Inferences • Data Brokers Combine Online and Offline Data to Market to Consumers Online
Sources of Included Data Again from the FTC: Data brokers collect data from commercial, government, and other publicly available sources. Data collected could include bankruptcy information, voting registration, consumer purchase data, web browsing activities, warranty registrations, and other details of consumers’ everyday interactions. Data brokers do not obtain this data directly from consumers, and consumers are thus largely unaware that data brokers are collecting and using this information. While each data broker source may provide only a few data elements about a consumer’s activities, data brokers can put all of these data elements together to form a more detailed composite of the consumer’s life.
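The compositing step the FTC describes can be sketched in a few lines. This is only an illustration: the source names, field names, and the shared `consumer_id` key are all hypothetical, and real data brokers match records across sources in far more elaborate ways.

```python
from collections import defaultdict

# Hypothetical records from three unrelated sources.
# Assumption: each record carries a shared consumer ID.
SOURCES = {
    "voter_file": [{"consumer_id": "c1", "registered_voter": True}],
    "warranty_cards": [{"consumer_id": "c1", "product": "infant car seat"}],
    "court_records": [
        {"consumer_id": "c1", "bankruptcy_year": 2011},
        {"consumer_id": "c2", "bankruptcy_year": 2009},
    ],
}


def build_composites(sources):
    """Merge the few elements each source holds into one profile per consumer."""
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            profile = profiles[record["consumer_id"]]
            for field, value in record.items():
                if field != "consumer_id":
                    profile[field] = value
            # Remember which sources contributed to this profile.
            profile.setdefault("sources", []).append(source_name)
    return dict(profiles)


composites = build_composites(SOURCES)
print(composites["c1"])
```

Each source contributes only one data element here, yet the merged profile for consumer "c1" already links voting registration, a recent purchase, and a bankruptcy.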
Who is doing all this data gathering? • Acxiom: Acxiom provides consumer data and analytics for marketing campaigns and fraud detection. Its databases contain information about 700 million consumers worldwide with over 3,000 data segments for nearly every U.S. consumer. • Corelogic: Corelogic provides data and analytic services to businesses and government based primarily on property information, as well as consumer and financial information. Its databases include over 795 million historical property transactions, over ninety-three million mortgage applications, and property-specific data covering over ninety-nine percent of U.S. residential properties, in total exceeding 147 million records. • Datalogix: Datalogix provides businesses with marketing data on almost every U.S. household and more than one trillion dollars in consumer transactions. In September 2012, Facebook announced a partnership with Datalogix to measure how often Facebook’s one billion users see a product advertised on the social site and then complete the purchase in a brick and mortar retail store.
Who is doing all this data gathering - 2 • eBureau: eBureau provides predictive scoring and analytics services for marketers, financial services companies, online retailers, and others. eBureau primarily offers products that predict whether someone is likely to become a profitable customer or whether a transaction is likely to conclude in fraud. It provides clients with information drawn from billions of consumer records, adding over three billion new records each month. • ID Analytics: ID Analytics provides analytics services designed principally to verify people’s identities or to determine whether a transaction is likely fraudulent. The ID Analytics network includes hundreds of billions of aggregated data points, 1.1 billion unique identity elements, and it covers 1.4 billion consumer transactions. • Intelius: Intelius provides businesses and consumers with background check and public record information. Its databases contain more than twenty billion records. • Rapleaf: Rapleaf is a data aggregator that has at least one data point associated with over eighty percent of all U.S. consumer email addresses. Rapleaf supplements email lists with the email address owner's age, gender, marital status, and thirty other data points.
How Could Big Data Be Used? A 2011 industry report by the management consulting firm McKinsey argued that five new kinds of value might come from abundant data: • creating transparency in organizational activities that can be used to increase efficiency; • enabling more thorough analysis of employee and systems performance in ways that allow experiments and feedback; • segmenting populations in order to customize actions; • replacing/supporting human decision making with automated algorithms; and • innovating new business models, products, and services.
The Good… Consumers Benefit from Many of the Purposes for Which Data Brokers Collect and Use Data: • Data broker products help to prevent fraud, improve product offerings, and deliver tailored advertisements to consumers. • Risk mitigation products provide significant benefits to consumers by, for example, helping prevent fraudsters from impersonating unsuspecting consumers. • Similarly, people search products allow individuals to connect with old classmates, neighbors, and friends.
The Good… Consumers Benefit from Many of the Purposes for Which Data Brokers Collect and Use Data: • In the past several months we’ve heard reports about how health care costs can be reduced through large scale analyses made possible by big data. • Other researchers have reported how sophisticated analyses of traffic patterns and congestion can be analyzed for “smart routing,” which could be designed to save consumers’ time. • And most recently we learned of uses of Big Data that really make a difference in people’s lives, such as predicting infections in newborn babies – where having this information in real time can save lives.
The Good… • New startups are being formed to provide “Sentiment Analysis” - insights gleaned from social networks about how consumers feel about certain brands and products. • Sentiment analysis has other uses, including identifying public health concerns and other areas of need. A new initiative by the United Nations analyzes public social media information and text messages to predict job losses, spending reductions or disease outbreaks in the developing world. Used this way, sentiment analysis can tease out early warning signs to aid better planning and target assistance programs in a region on the brink of possible crisis.
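For readers who have not seen sentiment analysis before, the idea can be illustrated with a toy word-list scorer. This is a deliberately simplified sketch, not how commercial tools or the United Nations initiative work; the word lists and sample posts are made up for illustration.

```python
import re

# Hypothetical, tiny word lists; real systems use far larger lexicons or trained models.
POSITIVE = {"love", "great", "good", "happy"}
NEGATIVE = {"hate", "bad", "terrible", "broken"}


def sentiment_score(text):
    """Return (positive word count) minus (negative word count) for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


posts = [
    "I love this brand, great service",
    "Terrible product, arrived broken",
    "It arrived on Tuesday",
]
scores = [sentiment_score(p) for p in posts]
print(scores)
```

A score above zero suggests a favorable post, below zero an unfavorable one, and zero a neutral one. Summed over many posts from a region, the same kind of score is what lets analysts look for early warning signs.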
The Bad… Many of the Purposes for Which Data Brokers Collect and Use Data Pose Risks to Consumers: • Marketers could even use the seemingly innocuous inferences about consumers in ways that raise concerns. For example, while data brokers have a data category for “Diabetes Interest” that a manufacturer of sugar-free products could use to offer product discounts, an insurance company could use that same category to classify a consumer as higher risk.
The Bad… Privacy Concerns • Sentiment analysis and other uses of data claim to be “deidentified” when that means only that name and address have been taken out. • We must ensure that collection and use of sensitive information – such as information related to health, finances, or sexual orientation – triggers the heightened protection it deserves. • The extent to which the analysis of vast amounts of data results in consumer profiles that will be used to deny consumers important benefits.
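Why removing only name and address is weak protection can be shown with a small check. The sketch below counts how many records share the same combination of remaining fields (a simple k-anonymity style test). The records and field names are hypothetical, and the choice of ZIP code, birth year, and gender as the identifying combination is an assumption for illustration.

```python
from collections import Counter

# Hypothetical "deidentified" records: name and address are gone,
# but ZIP code, birth year, and gender remain.
records = [
    {"zip": "04101", "birth_year": 1980, "gender": "F", "condition": "diabetes"},
    {"zip": "04101", "birth_year": 1980, "gender": "F", "condition": "asthma"},
    {"zip": "04101", "birth_year": 1975, "gender": "M", "condition": "diabetes"},
    {"zip": "04240", "birth_year": 1962, "gender": "F", "condition": "none"},
]

# Assumption: these fields can be matched against outside lists (e.g. voter files).
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return the records whose combination appears only once in the data."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


sizes = group_sizes(records)
k = min(sizes.values())  # smallest group size; k = 1 means someone stands alone
exposed = unique_records(records)
print(f"k = {k}; {len(exposed)} of {len(records)} records are unique")
```

Any record that is unique on these fields could be tied back to a person by anyone holding a list with the same fields plus names, which is why a CAP sharing client data should look at more than whether names were removed.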
The Ugly • Storing Data About Consumers Indefinitely May Create Security Risks - For some products, data brokers report that they need to keep older data; however, the risk of keeping the data may outweigh the benefits. • To the Extent Data Brokers Offer Consumers Choices About Their Data, the Choices are Largely Invisible and Incomplete - There is a fundamental lack of transparency about data broker industry practices.
Examples of Uses Reported Recently When a Health Plan Knows How You Shop (NYT 6/28/2014)
Examples of Uses, cont’d • “There could be people who want to use the data for purposes that widen health disparities, not solve them.” MedSeek, a software and analytics company in Birmingham, Ala., offers services intended to help hospitals “virtually influence” the behavior of current and would-be patients. According to MedSeek.com, the company offers a “21st-century tool kit” that can refine health care marketing pitches based on sex, age, race, income, risk assessment, culture, religious beliefs and family status. One client, Trinity Health System in Michigan, used MedSeek’s services “to scientifically identify well-insured prospects,” among others, and encourage them to schedule screening tests and doctor visits, a company case study said.
Examples Reported Recently Big Data 101: Colleges are hoping predictive analytics can fix their dismal graduation rates (Vox, July 14, 2014) "It sounds almost like science fiction," Wagner says. "But the reality is there's a lot that every one of us can be doing right now by simply looking at patterns of information." “So once colleges know the students who are most likely to drop out, the hope is that they can help them avoid that fate. The path is strewn with potential unintended consequences. Studies show teachers expend more time and attention with students they know will succeed; will professors neglect students data shows are likely to fail? States are under pressure to improve their graduation rates; if they can identify the students least likely to graduate, will it be too tempting to shut them out rather than admit them and help them through?”
Another Example Who gets shot in America: What I learned compiling records of carnage for the New York Times. By Jennifer Mascia, Tuesday, July 15, 2014 7:00 EDT. The project began with just a couple of items unearthed during a Google news search, but after a few months I was searching ten pages deep — for “shooting,” “man shot,” “woman shot,” “child shot,” “teen shot” and “accidentally shot.” By the project’s end, the report featured 40 shootings a day. Of course, there were more, but I was only finding the ones reported in the news. This naturally limited my scope — most suicides aren’t reported at all.
Who gets shot in America: What I learned compiling records of carnage for the New York Times Before starting work on Gun Report, I had my own ideas about gun violence: Most of it probably resulted from gang activity, I assumed, along with the marital domestic shootings we so often read about. A few months in, I noticed how shootings spiked over the weekends. Summer was the worst. Holiday weekends were full of needless shootings — arguments, stray bullets, kids finding their parents’ guns. Compiling weekend reports took me 10 hours every Sunday. More than 350 posts and 40,000 deaths later, here is what I learned.
Who gets shot in America: What I learned compiling records of carnage for the New York Times Gang shootings are prevalent, especially in former hubs of industry now in economic decline in Ohio; the Flint/Tri-Cities region of Michigan; in Indianapolis and Fort Wayne, Indiana; Newport News, Va.; and Milwaukee, Wisconsin. Carjackings and home invasions often appeared in my Google news searches. I was surprised to learn that suburbs were a magnet for gun violence, perhaps mirroring the housing implosion, which decimated the suburbs and propelled people to cities, where there are always jobs.
Who gets shot in America: What I learned compiling records of carnage for the New York Times Not that the nation’s largest cities are exempt: Miami, Chicago, St. Louis, Detroit, Newark, New Orleans, Philadelphia, and Dallas are notable examples. (Less so New York, possibly because of the NYPD’s stop-and-frisk policy, which was ruled unconstitutional.) Drive-by shootings still plague northern and southern California; Los Angeles, Fresno and the entire east side of the state are rife with gang activity. Tennessee, Alabama, and Missouri also frequently popped up in this regard. What was also notable was where the shootings aren’t: Maine, Hawaii, Vermont, Wyoming, Montana, and New Hampshire were rarely mentioned in the report. Why?
Who gets shot in America: What I learned compiling records of carnage for the New York Times But while half of the shootings I featured were the result of a crime, the other half, I was most surprised to learn, resulted from arguments — often fueled by alcohol — among friends, neighbors, family members and romantic partners. More and more, people are solving their differences not with their fists but with guns. Husbands and wives are shooting each other, as are sisters and brothers. In many homes across America, loaded guns are easily accessible, and children find them, accidentally shooting themselves or each other. One hundred children died in unintentional shootings in the year after Newtown, which breaks down to two every week.
Issues about Big Data for CAA clients • Having seen the good, bad, and ugly of Big Data uses of consumer data, do CAA clients need better protections from being harmed (or discriminated against) by business uses? • If they do, is there a role for CAPs in advocating for these protections on clients’ behalf? • Is there a role for CAPs in getting “positive” consumer data about CAA clients into the data brokers’ data-gathering channels?
Issues about Big Data for CAPs • What equivalent data points do CAPs collect about CAA clients that can be used for analytics? • To better understand clients’ needs • To improve the programs provided • To complete your Community Needs Assessment
Issues about Big Data for CAPs • As we apply big data analytics to CAA clients, how do we ensure that best practices are followed? • Be transparent with clients about how their data is used • Keep clients in control of their data • Don’t unintentionally harm clients
Discussion Topics • Do CAPs have a responsibility to advocate for clients in the developing regulation of this field? • What are some data points we are already collecting that could improve programs? • What are some data points we are already collecting that could benefit clients in the consumer marketplace? • What are some data points we are already collecting that could be predictive of success in programs? Predictive of success in self-sufficiency and moving out of poverty?
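As a starting point for the last question, a CAP could compare program completion rates across a data point it already collects. The sketch below is a minimal illustration only: the client records, the field names (`has_transportation`, `completed`), and the program are all hypothetical, and a difference in rates on a handful of records is a prompt for discussion, not proof that the data point predicts success.

```python
from collections import defaultdict

# Hypothetical client records; field names are illustrative only.
clients = [
    {"program": "job training", "has_transportation": True, "completed": True},
    {"program": "job training", "has_transportation": True, "completed": True},
    {"program": "job training", "has_transportation": False, "completed": False},
    {"program": "job training", "has_transportation": False, "completed": True},
    {"program": "job training", "has_transportation": True, "completed": False},
]


def completion_rate_by(rows, field):
    """Return the share of clients who completed, grouped by the given field."""
    totals = defaultdict(int)
    completions = defaultdict(int)
    for row in rows:
        totals[row[field]] += 1
        completions[row[field]] += int(row["completed"])
    return {value: completions[value] / totals[value] for value in totals}


rates = completion_rate_by(clients, "has_transportation")
for value, rate in rates.items():
    print(f"has_transportation={value}: {rate:.0%} completed")
```

The same function works for any field already in the client file, so staff can try several data points and bring the most striking differences to the discussion.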
For Further Information • http://www.forbes.com/sites/lisaarthur/2013/08/15/what-is-big-data/ • http://www.pewinternet.org/2012/07/20/the-future-of-big-data/ • http://www.ftc.gov/system/files/documents/reports/data-brokers-call-transparency-accountability-report-federal-trade-commission-may-2014/140527databrokerreport.pdf • http://www.ftc.gov/sites/default/files/documents/public_statements/big-data-big-issues/120228fordhamlawschool.pdf
Questions? Kate Martin KMartinWorks@Yahhoo.com