Query-based Opinion Summarization for Legal Blog Entries

Query-based Opinion Summarization for Legal Blog Entries. Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi. Corporate Technology Research & Development. Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009), Barcelona, Spain, 8-12 June 2009.

Presentation Transcript


  1. Query-based Opinion Summarization for Legal Blog Entries • Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi • Corporate Technology Research & Development • Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009) • Barcelona, Spain • 8-12 June 2009

  2. OUTLINE • INTRODUCTION • RELATED WORK • SYSTEM • METHODOLOGY • RESULTS • CONCLUSIONS • FUTURE WORK • AI & LAW PROPOSAL

  3. INTRODUCTION (1/4) — Motivations • Amount and rate of legal information flow increasing • Demands on attorneys for work products very high • Essential for productivity tools to be efficient • Legal blogs provide a more immediate forum • Unmoderated, instantaneous, candid, terse • Contain rich viewpoints, individual or in aggregate • Missing piece: ability to summarize blog entries • Legal professionals busy synthesizing traditional legal materials (cases, statutes, analytical documents) • Pressures due to case load and schedules immense • Increasingly impossible to keep up with information bandwidth • Means of consolidating, summarizing artifacts invaluable

  5. INTRODUCTION (3/4) • Key contributions • First work to perform multi-document opinion-based summarization on legal blog entries • Extends the TAC evaluation of opinion summarization task to assess the accuracy of measured polarity, using expert reviewers • Presents a proposal to the AI & Law community — host a formal track to pursue the topic in a more structured, in-depth manner

  6. INTRODUCTION (4/4) • Opinion Mining for Legal Blogs • Prospective Applications • Monitoring — follow what communities are saying about firms, products, services, topics • Alerting — inform subscribers of unfavorable developments • Profiling — represent litigation patterns of attorneys, courts ... • Tracking — study decisions of judges, reputations of firms ... • Exploration/Education — present law students with contrasting opinions

  8. RELATED WORK (1/2) • Ashley & Aleven (1991 ff.): intelligent tutoring • Hachey & Grover (2006): argumentative zoning • Saravanan & Raman (2006): Conditional Random Fields • Conrad & Schilder (2007): blawg polarity classification • Lerman & McDonald (2009): sentiment-modeled summarizers • Conrad, Leidner, Schilder and Kondadadi (2009): blawg sentiment summarization • Research areas and venues: Summarization (TREC, TAC, et al.); Legal Domain (ICAIL, JURIX); Sentiment Analysis (ICWSM)

  9. RELATED WORK (2/2)— TAC, the Text Analysis Conference (www.nist.gov/tac/) • a new annual international workshop sponsored by NIST • the US National Institute of Standards & Technology • organizers disseminate NLP-type tasks and datasets • participants develop systems that solve the tasks • submit their results to NIST for evaluation • members can also propose new tasks for future workshops • the sentiment summarization pilot task consisted of producing short, coherent sentiment summaries of blog text • Thomson Reuters R&D addressed the task • system produced multi-document summaries

  11. SYSTEM (1/3): Workflow Diagram for Blawg Opinion Summarization • Sample Query: Has Google been a consistent supporter of Net neutrality? • Sample Target: Google Net Neutrality

  12. SYSTEM (2/3): FastSum, design and application • Thomson Reuters’ legal blog opinion summarization system • multi-document summarization system • harnesses a regression Support Vector Machine (SVM) to rank candidate sentences • original system extended to sentiment • current system applied to legal domain (blawgs) • Timeline: Summarization (2007), Sentiment Summarization (2008), Legal Summarization (2009)

  13. SYSTEM (3/3) FastSum Blog Opinion Summarization Processing • Key Modifications • A.1 HTML parsing & clean-up module • B.1 Question sentiment & target analyzer • C.1 Sentence tagger • C.2 Target overlap
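
The slides name a "Target overlap" module but do not define it. A minimal sketch of one plausible reading follows, assuming the overlap is the fraction of distinct target terms that also occur in a candidate sentence. The function names, the tokenizer, and the example sentence are hypothetical; only the target string comes from the sample on slide 11.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def target_overlap(sentence, target):
    """Fraction of distinct target terms that also occur in the sentence.

    Assumption: this definition is illustrative, not FastSum's actual one.
    """
    target_terms = set(tokenize(target))
    if not target_terms:
        return 0.0
    sentence_terms = set(tokenize(sentence))
    return len(target_terms & sentence_terms) / len(target_terms)

target = "Google Net Neutrality"  # sample target from slide 11
sentence = "I think Google still backs net neutrality."  # made-up sentence
score = target_overlap(sentence, target)
```

A score like this could serve as one input feature for the sentence ranker; a sentence that mentions none of the target terms scores zero.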

  15. METHODOLOGY (1/7) • Application of Thomson Reuters’ legal blog opinion summarization system • Data collection via Web-based queries • submitted to Web Search Engine, Blog Search Engine • Summary generation • via modified FastSum System • Evaluation • human assessment • two assessors rated each summary • measures modeled on TAC metrics

  16. METHODOLOGY (2/7) Blog Search Engines Examined along with Their Properties

  17. METHODOLOGY (3/7)

  19. METHODOLOGY (5/7): Evaluation • Metrics modeled on TAC (et al.) evaluation • Two metrics used: • Responsiveness • Linguistic Quality • Scale: five-point Likert [1-5] • 5 = high • 1 = low • Scores generally track those of TAC, though the task is not completely identical

  20. METHODOLOGY (6/7)— Evaluation: Responsiveness Reviewer Guidelines for Responsiveness [1-5]

  21. METHODOLOGY (7/7)— Evaluation: Linguistic Quality Reviewer Guidelines for Linguistic Quality [1-5]

  23. RESULTS (1/2): Baseline Averages • Scores comparable to those of TAC 2008 (in 2-3 range) • Caveat: we scored for correct sentiment polarity; TAC didn’t • Kappa statistic for inter-rater agreement between the pair of assessors, κ = 0.75
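
The slide reports a kappa statistic without naming the variant. The sketch below assumes Cohen's kappa for two raters, computed as (observed agreement - chance agreement) / (1 - chance agreement). The ratings are invented for illustration and are not the study's data.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: share of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal rate, per category.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical five-point Likert ratings from two assessors.
assessor_1 = [2, 3, 2, 1, 4, 2, 3, 2]
assessor_2 = [2, 3, 2, 2, 4, 2, 3, 3]
kappa = cohen_kappa(assessor_1, assessor_2)
```

Unweighted kappa treats a 2-versus-3 disagreement the same as 1-versus-5; for ordinal Likert scores a weighted variant is a common alternative.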

  24. RESULTS (2/2) — Sample FastSum Summary

  25. RESULTS (2/2): Sample FastSum Summary • Deficient • Topical overlap • Display of sentiment • Useful to researcher

  27. CONCLUSIONS (1/1) • Amount, rate of legal information flow growing • Summarization, identification of trends increasingly valuable • Forums like TAC-opinion summarization beginning to study topic • For certain legal research, such synopses can be very helpful • Viewpoints, individually or in aggregate, can expand arguments, comprehension of underlying legal issues • First effort to produce automatic opinion summaries for entries in legal blog space • Based on multiple documents • For pre-specified polarity • Trained on general, homogeneous news documents (okay) • Trained on specific heterogeneous legal blogs (better) • Assessed by expert legal reviewers • Baseline scores in the low 2.0s out of 5 (comparable to TAC)

  29. FUTURE WORK (1/1) • Compare to other summarization systems/techniques • From TAC or elsewhere • Test against model summaries and use the nugget pyramid evaluation method • Train the ML component of FastSum on various blog entries, rather than general news • Formalize the role input data has on result sets; and the impact output length has on results • Incorporate more structure • Qualitative — best template to harness? • Quantitative — optimal length for each section? • Leverage features from the legal domain • E.g., use a legal dictionary to help rank sentences

  31. AI & LAW PROPOSAL (1/1) • For AI & Law (IAAIL) and TAC (NIST) • NIST offers research groups a shared task in multi-document summarization • Why not focus on a shared task in the legal domain? • Needs to be assessed by the IAAIL and NIST communities to determine interest • Who would benefit? • Legal practitioners: potentially highly beneficial results • Legal researchers: thanks to a valuable testbed • AI & Law community: can breathe in new life, attract new members • What data collections could be used? • TAC uses the very large BLOG06 collection • Text Entailment uses the RTE collection; a hybrid is also possible

  32. Query-based Opinion Summarization for Legal Blog Entries • Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi • Research & Development • Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009) • Barcelona, Spain • 8-12 June 2009 • Thank you! Questions?

  33. INTRODUCTION (1/4) — Motivations • Modern legal information environment increasingly dynamic, fast-paced • Blawgs (legal blogs) provide a more immediate forum • Generally unmoderated, instantaneous, candid, terse • Viewpoints to be gleaned, in aggregate or individually, are rich • Missing piece: ability to summarize blog entries • Legal professionals busy simply synthesizing traditional legal materials (cases, statutes, analytical documents) • Pressures due to case load and schedules immense • Increasingly impossible to keep up with information bandwidth • Means of consolidating, summarizing artifacts invaluable

  34. AI & LAW PROPOSAL (1/1) • For AI & Law (IAAIL) and TAC (NIST) • NIST offers research groups a shared task in multi-document summarization • Why not focus on such a shared task in the legal domain? • Needs to be assessed by the IAAIL and NIST communities to determine interest • Potentially of great benefit to legal practitioners • Could raise the bar on current baseline system • Could use: • Blog-based data set like BLOG06, as used in TAC 2008, with a legal component • RTE (Recognizing Text Entailment) data set, again with a legal component • a combination of the two

  35. RELATED WORK (1/2) • Ashley & Aleven (1991 ff.): produce intelligent tutoring applications to teach law students how to argue in the context of caselaw • Farzindar & Lapalme (2004): present the LetSum system to summarize Canadian court decisions • Hachey & Grover (2006): apply argumentative zoning to summarize decisions from the House of Lords • Saravanan & Raman (2006): use statistical graphical models (CRFs) for legal summarization, while extracting rhetorical roles • Lerman, Blair-Goldensohn, and McDonald (2009): show users have a strong preference for summarizers that model sentiment over non-sentiment baselines

  36. SYSTEM (4/5) • FastSum’s legal blog opinion summarization system • Sequence of operation • Pre-processing • tokenization • sentence splitting • boilerplate expression removal (e.g., ‘Response by ...’) • Question analysis • sentiment analysis (tagging) • target analysis (matching) • Sentiment Filter • sentences with proper polarity selected; else, filtered out
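
The pre-processing and sentiment-filter steps above can be sketched as follows. This is a toy illustration under stated assumptions: the word lists, the regex sentence splitter, the `posted by` pattern, and the sample post are all hypothetical stand-ins, and the slides do not say how FastSum tags polarity.

```python
import re

# Hypothetical mini-lexicons; a real system would use a full sentiment resource.
POSITIVE = {"support", "supports", "good", "great", "agree"}
NEGATIVE = {"oppose", "opposes", "bad", "poor", "disagree"}

# 'Response by ...' is the slide's example; 'posted by' is an added assumption.
BOILERPLATE = re.compile(r"^(response by|posted by)\b.*$", re.IGNORECASE)

def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def polarity(sentence):
    """Label a sentence by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def sentiment_filter(text, wanted):
    """Drop boilerplate, then keep only sentences with the wanted polarity."""
    kept = []
    for sentence in split_sentences(text):
        if BOILERPLATE.match(sentence):
            continue
        if polarity(sentence) == wanted:
            kept.append(sentence)
    return kept

post = ("Response by Anonymous. I agree with the ruling. "
        "The brief was poor. It was filed Monday.")
positive_only = sentiment_filter(post, "positive")
```

In this sketch the wanted polarity is passed in directly; in the system described on the slides it would come from the question analysis step.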

  37. RELATED WORK (2/2)— TAC, the Text Analysis Conference • a new annual international workshop sponsored by NIST • the National Institute of Standards & Technology • organizers disseminate NLP-type tasks and datasets • participants develop systems that solve the tasks • submit their results to NIST for evaluation • members can also propose new tasks for future workshops • the sentiment summarization pilot task consisted of producing short, coherent sentiment summaries of blog text • our system produced multi-document summaries • Related Conferences: • TREC — the Text Retrieval Conference (started in mid-90s) • DUC — Document Understanding Conference (from 2001-07) • evaluated many automatic summarization systems during period

  38. SYSTEM (5/5) • FastSum’s legal blog opinion summarization system • Sequence of operation (cont.) • Feature extraction • focus largely on correspondence with terms in query • at different levels of granularity: title, description, document • also harness sentence-based features • length, position • Sentence ranker • trained regression SVM on feature set — goal: summary worthiness • Redundancy removal • basic idea — change relative importance of remaining sentences w.r.t. currently selected sentences
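
The redundancy-removal idea above (re-weighting the remaining sentences against those already selected) can be sketched as a greedy loop. The discount formula, the Jaccard similarity, and the scores below are assumptions for illustration; the scores stand in for the output of the trained regression SVM, which is not reproduced here.

```python
import re

def token_set(text):
    """Distinct lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_sentences(scored, max_sentences, penalty=1.0):
    """Greedily pick sentences, discounting those similar to earlier picks.

    scored: list of (sentence, ranker_score) pairs.
    """
    remaining = [(s, score, token_set(s)) for s, score in scored]
    selected = []
    while remaining and len(selected) < max_sentences:
        best_index = max(range(len(remaining)), key=lambda i: remaining[i][1])
        sentence, _, best_tokens = remaining.pop(best_index)
        selected.append(sentence)
        # Lower each remaining score in proportion to its overlap with the pick.
        remaining = [
            (s, score * (1.0 - penalty * jaccard(toks, best_tokens)), toks)
            for s, score, toks in remaining
        ]
    return selected

# Hypothetical ranker scores for three candidate sentences.
scored = [
    ("Google supports net neutrality.", 0.9),
    ("Google supports net neutrality rules.", 0.85),
    ("Carriers oppose the proposal.", 0.6),
]
summary = select_sentences(scored, max_sentences=2)
```

Here the second sentence is nearly a duplicate of the first, so its score is discounted and the third sentence is chosen instead.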

  39. SYSTEM (5/5) • FastSum’s legal blog opinion summarization system • Sequence of operation (cont.) • Feature extraction • topic word frequency (title, description) • content word frequency • document frequency • headline frequency • sentence-based features (length, position) • Sentence ranker • trained regression SVM on feature set (goal: summary worthiness) • Redundancy removal • basic idea: change relative importance of remaining sentences w.r.t. currently selected sentences • Sample topic: <topic> <num> D0703A </num> <title> age discrimination </title> <narr> This expose documents the increasing occurrence of age discrimination in the workplace in Canada ... </narr> </topic>
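
As a rough illustration of how a topic like the sample above might feed the "topic word frequency" feature, the sketch below parses the topic and computes the share of a sentence's tokens that appear in the topic title. The feature definition, the function names, and the candidate sentence are assumptions; the narrative text is left elided exactly as on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Sample topic from the slide; the "..." elision is kept as-is.
TOPIC_XML = (
    "<topic> <num> D0703A </num> <title> age discrimination </title> "
    "<narr> This expose documents the increasing occurrence of age "
    "discrimination in the workplace in Canada ... </narr> </topic>"
)

def parse_topic(xml_text):
    """Return a dict mapping each child tag of <topic> to its stripped text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}

def topic_word_frequency(sentence, topic_text):
    """Share of sentence tokens that also appear in the topic text.

    Assumption: an illustrative definition, not FastSum's exact feature.
    """
    topic_words = set(re.findall(r"[a-z0-9]+", topic_text.lower()))
    sentence_tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    if not sentence_tokens:
        return 0.0
    hits = sum(tok in topic_words for tok in sentence_tokens)
    return hits / len(sentence_tokens)

topic = parse_topic(TOPIC_XML)
candidate = "Age discrimination claims are rising."  # made-up sentence
title_feature = topic_word_frequency(candidate, topic["title"])
```

The same function can be applied to `topic["narr"]` to obtain the description-level variant of the feature.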
