The process of electronic discovery

Herbert L. Roitblat, Ph.D., OrcaTec LLC. What is relevance? How do we explain the poor agreement on document relevance? H1: No objective relevance, only subjective (post-modern view). H2: There is relevance, but it is difficult to measure reliably.



Presentation Transcript

  1. The process of electronic discovery • Herbert L. Roitblat, Ph.D., OrcaTec LLC

  2. What is relevance? • How do we explain the poor agreement on document relevance? • H1: No objective relevance, only subjective (post-modern view) • H2: There is relevance, but it is difficult to measure reliably • Implications for how we approach machine support for discovery

  3. Subjective relevance • No inherent wrong decisions • Relevance determined by the project manager (e.g., lead attorney) • Reviewer’s job is not to be right, but to represent the designated opinion • Conundrum: On what basis can one challenge the decision of the leader? • Work cannot be replicated

  4. Objective relevance • Decision making is fallible • Key is to discern what document features make a document relevant • Limited by lack of world and case knowledge • Limited by the flexibility of human language • What level of quality is reasonable?

  5. Measurement • Recall nominally important • Recall is difficult to measure • Use sampling techniques to estimate • Focus on whole project • Right query terms? • Right technology? • Right selection methods? • Measuring performance is more important than the technology used
     "All documents constituting or reflecting discussions about unfair or discriminatory allocations of [Brand X] products or the fear of such unfair or discriminatory allocations."

  6. Elusion • Proportion of rejected documents that are relevant • Sampling for elusion leads directly to an “accept on zero” quality assessment

  7. Measures • Contingency table • Precision: A/C • Recall: A/G • Elusion: D/F
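
The cell letters above refer to the slide's contingency table, which is not reproduced in this transcript. As a minimal sketch, assuming the usual two-by-two layout (retrieved vs. rejected, relevant vs. not relevant), the three measures could be computed as follows. The class, field, and function names are hypothetical, and the counts are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Hypothetical two-by-two contingency table for a review process."""
    relevant_retrieved: int       # relevant documents the process retrieved
    not_relevant_retrieved: int   # non-relevant documents the process retrieved
    relevant_rejected: int        # relevant documents the process rejected
    not_relevant_rejected: int    # non-relevant documents the process rejected


def _ratio(numerator: int, denominator: int) -> float:
    # Each measure is undefined when its denominator is empty.
    if denominator == 0:
        raise ValueError("measure is undefined when the denominator is zero")
    return numerator / denominator


def precision(c: ReviewCounts) -> float:
    """Proportion of retrieved documents that are relevant."""
    return _ratio(c.relevant_retrieved,
                  c.relevant_retrieved + c.not_relevant_retrieved)


def recall(c: ReviewCounts) -> float:
    """Proportion of all relevant documents that were retrieved."""
    return _ratio(c.relevant_retrieved,
                  c.relevant_retrieved + c.relevant_rejected)


def elusion(c: ReviewCounts) -> float:
    """Proportion of rejected documents that are relevant."""
    return _ratio(c.relevant_rejected,
                  c.relevant_rejected + c.not_relevant_rejected)


# Invented counts, for illustration only.
counts = ReviewCounts(relevant_retrieved=80, not_relevant_retrieved=20,
                      relevant_rejected=20, not_relevant_rejected=880)
print(f"precision={precision(counts):.3f}")
print(f"recall={recall(counts):.3f}")
print(f"elusion={elusion(counts):.3f}")
```

Elusion is computed only from the rejected set, which is why it can be estimated by sampling the documents that were set aside, without re-reviewing those that were retrieved.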

  8. Documents to review The estimated number of documents to review to achieve specified levels of confidence and maximum acceptable prevalence rates (ps).
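
The table this slide refers to is not included in the transcript. One common way to arrive at such numbers, offered here as an assumption rather than as the author's method, is the zero-acceptance calculation: if a simple random sample of n rejected documents contains no relevant ones, then with confidence c the true prevalence is at most p whenever (1 - p)^n <= 1 - c. The sketch below solves for the smallest such n. It assumes a collection large enough that sampling with or without replacement makes little difference, and the function name is hypothetical.

```python
import math


def sample_size_accept_on_zero(confidence: float, max_prevalence: float) -> int:
    """Smallest n such that (1 - max_prevalence) ** n <= 1 - confidence.

    Assumption: binomial (large-collection) sampling; finding zero relevant
    documents in a random sample of this size supports the claim that
    prevalence does not exceed max_prevalence at the given confidence.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if not 0.0 < max_prevalence < 1.0:
        raise ValueError("max_prevalence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_prevalence))


# Print a small grid of confidence levels and maximum acceptable prevalences.
for confidence in (0.95, 0.99):
    for max_prevalence in (0.05, 0.01):
        n = sample_size_accept_on_zero(confidence, max_prevalence)
        print(f"confidence={confidence:.2f} max_prevalence={max_prevalence:.2f} n={n}")
```

Under these assumptions, 95% confidence with a maximum acceptable prevalence of 1% calls for a sample of 299 documents. The required size depends on the confidence level and prevalence bound, not on the size of the collection.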

  9. Why is this important? • Manual review has reached the end of its useful life—volume, volume, volume • Need powerful tools to augment human review (AI, statistical, neural networks) • Need to be able to assess quality • Need to communicate to audience in terms they care about

  10. What form will assistance take? • Semantic web approach • eDiscovery ontology • Requires adaptation for each case • Several services exploit similar approach • Adaptationist approach • Statistical (LSI, Bayes, language modeling) • Neural network • Other machine learning techniques • Syntactic

  11. Why is this so hard?

  12. What properties does language have? • Systematicity, atomicity, semantic transparency: Words are independent symbols • Compositionality and syntax: Words can be combined according to rich rules
      “There was a desert wind blowing that night. It was one of those hot, dry, Santa Anas that come down through the mountain passes and curl your hair and make your nerves jump and your skin itch. On nights like that, every booze party ends in a fight. Meek little wives feel the edge of the carving knife and study their husbands' necks. Anything can happen.” (Raymond Chandler)

  13. Semantic transparency • How long did the Hundred Years' War last? 116 years, from 1337 to 1453 • Which country makes Panama hats? Ecuador • What is a camel's hair brush made of? Squirrel fur • The Canary Islands in the Atlantic are named after what animal? Insularia Canaria: Island of the Dogs

  14. Real mother: fun with systematicity • I was adopted, I don’t know who my real mother was. • I am not a nurturing person, so I don’t think that I could ever be a real mother to anyone. • My real mother died when I was an embryo, and I was frozen and later implanted in the womb of a woman who gave birth to me. • I had a genetic mother who contributed the egg that was implanted in the womb of my real mother who gave birth to me.

  15. Systematicity & Atomicity (answer quickly) • What is Mr. Baron’s first name? • What currency is used in Italy? • How many animals of each kind did Moses take on the Ark? • What is the nationality of Thomas Edison, inventor of the telephone? • In what biblical story was Job swallowed by a whale? • What do cows drink?

  16. Conclusion • Lawyers need eDiscovery help • Any approach that relies on the systematicity of language is limited • Can work with a lot of human input • Human review is of unknown accuracy, but high subjective confidence • Measurement is essential to evaluating reasonableness
