
Scientists See Promise in Deep-Learning Programs


raanan


Presentation Transcript


  1. Scientists See Promise in Deep-Learning Programs • Microsoft Seeks an Edge in Analyzing Big Data • The Age of Big Data • Why Hire a Lawyer? Computers Are Cheaper • Armies of Expensive Lawyers, Replaced by Cheaper Software • Google Offers Big-Data Analytics • Jeff Hawkins Develops a Brainy Big Data Company • How Big Data Became So Big

  2. The total amount of digital data in the world is estimated to exceed 1.8 zettabytes (1.8 trillion gigabytes). The digital universe is doubling every 2 years. 85% of that data is owned or controlled by corporations at some point in its lifecycle. Source: International Data Corporation (IDC) Study, 2012
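
The unit conversion and the growth rate on this slide can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal (SI) prefixes and a steady doubling period; the function names are illustrative and not from the presentation.

```python
# Unit sizes in bytes, using decimal (SI) prefixes (an assumption).
GIGABYTE = 10**9
ZETTABYTE = 10**21

def zettabytes_to_gigabytes(zb):
    """Convert zettabytes to gigabytes."""
    return zb * (ZETTABYTE // GIGABYTE)

def projected_size(size_now, years, doubling_period=2.0):
    """Project a size forward, assuming it doubles every `doubling_period` years."""
    return size_now * 2 ** (years / doubling_period)

# 1.8 zettabytes expressed in gigabytes (about 1.8 trillion).
print(zettabytes_to_gigabytes(1.8))

# Size in zettabytes after 10 years if the doubling rate held steady.
print(projected_size(1.8, 10))
```

At the stated rate, ten years means five doublings, so the total would grow by a factor of 32.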

  3. Big Data is Here And it’s coming soon to a litigation near you… What’s changed?

  4. The Great Commingling

  5. Redefining scalability in eDiscovery. 1 × 10^12 • 1,000 • 1

  6. Predictive Coding is a Form of Machine Learning What is Machine Learning?

  7. It’s already a part of our lives. . . • voice recognition software, e.g., calling your bank or credit card company • handwriting, facial or fingerprint recognition • analyzing market trends and guiding investment decisions • making decisions on applications for credit or loans • modeling and predicting severe weather patterns • filtering spam in your email inbox • targeted marketing on the internet • robotics

  8. KEY POINT: Predictive coding is just one part of a continuum of technology-assisted review (TAR) methods that we are already very familiar with from searching and analyzing data. The continuum: Key Words • Concept Clustering • Concept Search • Predictive Coding. Three supporting propositions: • Each successive approach incorporates the preceding approaches. • Each successive approach contains more supporting criteria. • All are ultimately based on the concept of pattern matching.

  9. Key Words = Simple pattern matching. External input: “wild,” “wolf,” “pet.” Diagram terms: dog, rhino, wolf, domestic, wild, ferret, cat, goldfish, cow, pet.
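
Keyword search as simple pattern matching can be sketched as follows. This is an illustration only: the mini-corpus is hypothetical, built from the slide's example terms, and the function name is an assumption.

```python
import re

def keyword_hits(documents, keywords):
    """Return, per document id, the (lowercase) keywords found as whole words."""
    hits = {}
    for doc_id, text in documents.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        matched = words & keywords
        if matched:
            hits[doc_id] = matched
    return hits

# Hypothetical documents; only the keywords come from the slide.
documents = {
    "doc1": "The wolf is a wild animal.",
    "doc2": "A goldfish makes a quiet pet.",
    "doc3": "The cow grazed near the barn.",
}

print(keyword_hits(documents, {"wild", "wolf", "pet"}))
```

The document about the cow is not returned, because it contains none of the supplied terms. Keyword search finds only what the external input names.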

  10. Concept Clustering = Organization based on internal relationships. Bit strings shown: 01110111011010010110110001100100 (wild), 011001000110111101100111 (dog), 011100000110010101110100 (pet). Diagram terms: tiger, rhino, dog, cat, ferret, wolf, goldfish, cow, grouped under wild, domesticated, and pet.
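
The bit strings on this slide appear to be the 8-bit ASCII encodings of the labeled words. The sketch below reproduces them; the helper names are illustrative.

```python
def to_bits(word):
    """Encode an ASCII word as a string of 8-bit binary values."""
    return "".join(format(byte, "08b") for byte in word.encode("ascii"))

def from_bits(bits):
    """Decode a string of 8-bit binary values back into an ASCII word."""
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int(chunk, 2) for chunk in chunks).decode("ascii")

for word in ("wild", "dog", "pet"):
    print(word, to_bits(word))
```

To the software, each word is only a pattern of bits. Any grouping into concepts has to come from relationships among those patterns in the documents themselves.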

  11. Concept Searching = Key words + Concept organization. External input: “zoo,” “wild,” “domesticated.” Bit strings shown: 011110100110111101101111 (zoo), 01110111011010010110110001100100 (wild), 011001000110111101101101011001010111001101110100011010010110001101100001011101000110010101100100 (domesticated). Diagram terms: rhino, dog, wolf, tiger, cat, ferret, goldfish, cow, grouped under zoo, wild, domestic, domesticated, pet, and farm.
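
Concept searching can be pictured as keyword search in which each query term is first expanded to the other terms in its concept group. In the sketch below the groups are hard-coded and hypothetical, loosely following the slide's diagram. A real tool would derive them from the document set.

```python
import re

# Hypothetical concept groups (an assumption for illustration).
CONCEPTS = {
    "zoo": {"rhino", "tiger", "wolf"},
    "wild": {"rhino", "tiger", "wolf", "ferret"},
    "domesticated": {"dog", "cat", "ferret", "goldfish", "cow"},
}

def expand_query(terms):
    """Add every term in the same concept group as each query term."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= CONCEPTS.get(term, set())
    return expanded

def concept_search(documents, terms):
    """Return ids of documents containing any term from the expanded query."""
    query = expand_query(terms)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if query & set(re.findall(r"[a-z]+", text.lower()))
    )

# Hypothetical documents.
docs = {
    "a": "The tiger paced its enclosure.",
    "b": "Our cat sleeps all day.",
    "c": "Quarterly sales report.",
}
print(concept_search(docs, ["zoo"]))
```

The query "zoo" retrieves the tiger document even though the word "zoo" never appears in it. A plain keyword search would miss that document.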

  12. Predictive Coding = document-level input + probabilistic modeling. External input: human-coded documents. Output: document-level probability rankings. Bit strings shown: 011110100110111101101111 (zoo), 01110111011010010110110001100100 (wild), 011001000110111101101101011001010111001101110100011010010110001101100001011101000110010101100100 (domesticated). Diagram terms: rhino, dog, wolf, tiger, cat, ferret, goldfish, cow, grouped under zoo, wild, domestic, domesticated, pet, and farm.

  13. Step 1: Sample documents from the entire set. Diagram label: Infer.
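
Drawing the sample can be sketched as a simple random draw without replacement. The slide does not say how the sample is drawn, so the sampling method, the fixed seed, and the function name are all assumptions.

```python
import random

def draw_sample(doc_ids, sample_size, seed=0):
    """Draw a simple random sample of document ids, without replacement.

    A fixed seed makes the draw repeatable, so the same sample can be
    reproduced if the process is later questioned.
    """
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), sample_size)

# Hypothetical collection of 1,000 document ids.
sample = draw_sample(range(1000), 50)
print(len(sample), "documents selected for attorney review")
```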

  14. Step 2: Attorney review of sample documents to create training and control set. In the European mind, wolves long stood as a symbol of baneful, uncontrollable nature. As far back as the time of Aesop in 500 BCE (Before the Common Era), wolves in literature are portrayed as wicked villains and long-fanged, terrible beasts. Before the Middle Ages, wolves were nearly always the greedy thief, criminal trickster, or cruel remorseless murderer. The wolf does not fare well in the European imagination. Can the wolf be domesticated? The domesticated dog is descended from the wolf found in the wild. While some people have occasionally attempted to raise wolves as pets, their 2½-inch fangs and tendency to eat nearby small animals such as cats can create socially awkward situations with neighbors. Coding options: Responsive • Not Responsive

  15. Step 3: Create model from human-coded training set (responsive and not responsive). Bit string shown repeatedly: 011001000110111101100111. In the European mind, wolves long stood as a symbol of baneful, uncontrollable nature. As far back as the time of Aesop in 500 BCE (Before the Common Era), wolves in literature are portrayed as wicked villains and long-fanged, terrible beasts. Before the Middle Ages, wolves were nearly always the greedy thief, criminal trickster, or cruel remorseless murderer. The wolf does not fare well in the European imagination. Can the wolf be domesticated? The domesticated dog is descended from the wolf found in the wild. While some people have occasionally attempted to raise wolves as pets, their 2½-inch fangs and tendency to eat nearby small animals such as cats can create socially awkward situations with neighbors. Terms shown: dances, raise, wolf, wolves, werewolf, pet, costner.

  16. Step 4: Test model against sample (human-coded) set. Wolves are sometimes kept as exotic pets, and on some rarer occasions, as working animals. Although closely related to dogs (which are believed to have split from wolves between 10,000 and 100,000 years ago), wolves do not show the same tractability as dogs in living alongside humans. Wolves also need much more space than dogs, about 10-15 sq. miles. "Dances With Wolves" has the makings of a great work, one that recalls a variety of literary antecedents, everything from "Robinson Crusoe" and "Walden" to "Tarzan of the Apes." Michael Blake's screenplay touches both on man alone in nature and on the 19th-century white man's assuming his burden among the less privileged.

  17. Apply model to the remainder of the documents that have not been reviewed. Decision shown: Responsive? Yes: Responsive • No: Non-responsive

  18. Step 5: Apply model to entire set and rank documents. Ranking scale: 100% down to 0%, in 10% steps.
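
Steps 3 through 5 can be sketched as a tiny text classifier that learns from human-coded documents and ranks unreviewed ones by estimated probability of responsiveness. The slides do not name the model. This sketch uses multinomial naive Bayes with add-one smoothing purely as an illustrative assumption, and the training texts, labels, and function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(coded_docs):
    """Build word counts per label from (text, is_responsive) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in coded_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}

def responsive_probability(model, text):
    """Estimate P(responsive | text) using add-one smoothing."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label in (True, False):
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # Smoothed prior for this label.
        score = math.log((model["doc_counts"][label] + 1) / (total_docs + 2))
        for word in tokenize(text):
            if word in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
        log_scores[label] = score
    # Normalize the two log scores into a probability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])

def rank(model, documents):
    """Return (probability, doc_id) pairs, most likely responsive first."""
    scored = [(responsive_probability(model, t), d) for d, t in documents.items()]
    return sorted(scored, reverse=True)

# Hypothetical human-coded training set (True = responsive).
training = [
    ("Can the wolf be domesticated and kept as a pet", True),
    ("Some people raise wolves as pets", True),
    ("Dances With Wolves stars Kevin Costner", False),
    ("The werewolf film opens next week", False),
]
# Hypothetical unreviewed documents.
unreviewed = {
    "doc_a": "Wolves are sometimes kept as exotic pets",
    "doc_b": "Costner directed the film",
}

model = train(training)
for probability, doc_id in rank(model, unreviewed):
    print(f"{doc_id}: {probability:.0%}")
```

With this toy data, the document about keeping wolves as pets ranks above the film document. The output is a probability ranking rather than a yes/no answer, so a cutoff still has to be chosen and validated against the control set.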

  19. PREDICTIVE CODING AND BIG DATA NYLJ/Pangea3 Webinar April 15, 2013

  20. OUTLINE • Mitigating Big Data in E-Discovery • Stakeholder Analysis • The New Reality of Predictive Coding • Long-Term Trends

  21. Predictive Coding and Big Data: Mitigating Big Data in e-discovery

  22. BIG DATA IN E-DISCOVERY • Bigger haystack—more documents in general • Corporate data culture—more relevant documents • More sources—poses collection/preservation challenges

  23. MITIGATING BIG DATA IN E-DISCOVERY • Some mitigating factors: • Principles of proportionality and cooperation • Information governance tools and document management • Technology-assisted review and predictive coding

  24. Predictive Coding and Big Data: Stakeholder analysis

  25. PREDICTIVE CODING STAKEHOLDER ANALYSIS • Judges: generally receptive • Clients: cost efficiencies vs. risk management • Lawyers: new model, building expertise

  26. Predictive Coding and Big Data: The new reality of predictive coding

  27. NEW REALITY OF PREDICTIVE CODING

  28. Predictive Coding and Big Data: Long-term trends

  29. LONG-TERM TRENDS • Over time, Big Data growth > predictive coding benefits • Some document-by-document human review necessary • Strategic nuances in a new discovery battleground

  30. CONTACT PANGEA3

  31. SEARCH (1) How do we search for discoverable ESI? • Manually? • With automated assistance? • Which is “better” and why? • Maura R. Grossman & Gordon V. Cormack, “The Grossman-Cormack Glossary of Technology-Assisted Review,” 7 Fed. Cts. L. Rev. 1 (2013) • Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review,” XVII Rich. J.L. & Tech. 11 (2011) (available at http://jolt.richmond.edu/v17i3/article11.pdf) • For a “shorter” discussion, see Efficient E-Discovery, ABA Journal 31 (Apr. 2012)

  32. SEARCH (2) • Using search terms? How accurate are these? See In re National Ass’n of Music Merchants, Musical Instruments and Equipment Antitrust Litig., 2011 WL 6372826 (S.D. Cal. Dec. 19, 2011)

  33. SEARCH (3) Automated review or “predictive coding” as an alternative to the use of search terms. For decisions that address automated review, see: • EORHB, Inc. v. HOA Holdings LLC, C.A. No. 7409 (Del. Ch. Oct. 15, 2012) • In re Actos (Pioglitazone) Prod. Liability Litig., MDL No. 6:11-md-2299 (W.D. La. July 27, 2012) • Da Silva Moore v. Publicis Groupe SA, 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24), aff’d, 11 Civ. 1279 (ALC) (AJP) (S.D.N.Y. Apr. 26, 2012) • Global Aerospace Inc. v. Landow Aviation, L.P., Consol. Case No. CL 61040 (Va. Cir. Ct. Apr. 23, 2012)

  34. SEARCH (4) WHAT LESSONS CAN BE DRAWN FROM THE DECISIONS? • Judge approved automated search at a “threshold” level. “Results” may be subject to challenge and later rulings. • Threshold superiority of automated vs. manual review recognized given volume of ESI and attorney review costs. • Large volumes of ESI in issue. • Party seeking to do automated review must offer “transparency of process” or something close to it. • “Reasonableness” of methodology is key. • Speculation by the opposing party is insufficient to defeat threshold approval.

  35. SEARCH (5) LET’S TAKE A DEEP BREATH AND RECAP WHERE WE ARE TODAY, VENDOR HYPE NOTWITHSTANDING: • We have yet to see a judicial analysis of process and results in a contested matter. • Safe to assume that the proponent of a process will bear the burden of proof (whatever that burden might be). • Safe to assume at least some transparency of process may/will be expected. • If “reasonableness” is the standard, how reasonable must the results be? Is “precision” of 80% enough? 90%? Remember, there are no agreed-on standards.
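
For readers unfamiliar with the metric, precision is the share of documents the process flagged as responsive that human reviewers also coded responsive. The sketch below computes it alongside recall, the share of truly responsive documents that were found. Recall is not named on the slide and is included here as an assumption, because the two measures are usually read together. The document ids are hypothetical.

```python
def precision_recall(predicted, actual):
    """Compare predicted-responsive ids against human-coded responsive ids."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical control set: the model flags 5 documents,
# and reviewers coded 8 documents as responsive.
predicted = {1, 2, 3, 4, 5}
actual = {1, 2, 3, 4, 6, 7, 8, 9}

precision, recall = precision_recall(predicted, actual)
print(f"precision {precision:.0%}, recall {recall:.0%}")
```

In this example precision is 80%, yet half of the responsive documents were missed. A precision figure alone therefore says little about whether a production is complete.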

  36. INTERLUDE Assume a party makes production of ESI based on search terms proposed by an adversary. Assume further that the adversary suspects “something” is missing. Is suspicion enough to warrant direct access to the party’s databases by a consultant retained by the adversary? If not, what proofs should be required? • Will an attorney’s certification or affidavit suffice? • Will/should the attorney become a witness? • Will experts be needed? Note, with regard to proofs, S2 Automation LLC v. Micron Technology, Inc., No. 11-0884 (D.N.M. Aug. 9, 2012), where the court, relying on Rule 26(g)(1), required a party to disclose its search methodology.

  37. INTERLUDE A collision between search and ethics? • Assume a party’s attorney knows that search terms proposed by adversary counsel, if applied to the party’s ESI, will not lead to the production of relevant (perhaps highly relevant) ESI. • Absent a lack of candor to adversary counsel or the court under RPC 3.4 (which implies, if not requires, some affirmative statement), does not RPC 1.6 require the party’s attorney to remain silent? • What if the “nonproduction” is learned of later? If nothing else, will the party’s attorney suffer bad “PR”? • If the party’s attorney wants to advise the adversary, should the attorney secure her client’s informed consent? What if the client says “no”? (with thanks to the Hon. John M. Facciola)

  38. INTERLUDE AS WE THINK ABOUT SEARCH, THINK ABOUT THE ETHICS ISSUES THAT USE OF A NONPARTY VENDOR MAY LEAD TO!
