Iuriservice II Ontology Development

Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez • Workshop on Artificial Intelligence and Law, XXII World Congress of Philosophy of Law and Social Philosophy, Granada, May 2005

Agenda • Introduction to SEKT Project and Legal Case Study • Methodology • OPJK • Improving knowledge discovery on the competency questions • Architecture

The inSEKTs: Vrije Universiteit Amsterdam, Empolis, University of Sheffield, Universität Karlsruhe, BT, Ontoprise, Kea-pro, Universität Innsbruck, iSOCO, Sirma AI, Universitat Autònoma de Barcelona, Jozef Stefan Institute

SEKT • Main goals of SEKT • European Leadership in Semantic Technologies • Core Research • Combine Human Language Technologies, Knowledge Discovery and Ontology Technologies • Provide intelligent knowledge access

Description of the Problem: Legal Domain • In General: • Complaints about the diligence of the legal administration. • The judges are overworked. • In Particular: • New judges • A lot of theoretical knowledge, but little practical knowledge • On duty. • When they are confronted with situations in which they are not sure what to do • They “disturb” experienced judges with typical questions. • Usually their former tutor (Preparador) • Existing Technology • Legal databases • Essential in their daily work • Based on keywords and boolean operators • A search retrieves a huge number of hits

Description of the Problem: Legal Domain • Solution: • Design an intelligent system to help new judges with their typical problems. • Extended FAQ system using Semantic Web technologies • Connect the FAQ system with the existing jurisprudence. • Search jurisprudence using Semantic Web technologies.

State of the Art in Legal Ontologies • LLD [Language for Legal Discourse, L.T. McCarty, 1989]: Atomic formula, Rules and Modalities. • NOR [Norma, R.K. Stamper, 1991, 1996]: Agents, Behavioral invariants, Realizations. • LFU [Functional Ontology for Law, R.W. van Kralingen; P.R.S. Visser, 1995]: Normative knowledge, World knowledge, Responsibility knowledge, Reactive knowledge and Creative knowledge. • FBO [Frame-Based Ontology of Law, A. Valente, 1995]: Norms, Acts and Concept Descriptions. • LRI-Core Legal Ontology [J. Breuker et al., 2002]: Objects, Processes, Physical entities, Mental entities, Agents, Communicative Acts. • IKF-IF-LEX Ontology for Norm Comparison [A. Gangemi et al., 2001]: Agents, Institutive norms, Instrumental provisions, Regulative norms, Open-textured legal notions, Norm dynamics.

Conceptual distinctions • Professional Knowledge (PK) • Legal Knowledge (LK) → Legal Core Ontologies (LCO) [based on General Theories of Law] • Legal Professional Knowledge (LPK) → OPLK • Judicial Professional Knowledge (JPK) → OPJK

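Ethnographic survey [Map: counts per Autonomous Community; the community labels were lost in extraction: 10, 6, 1, 16, 7, 8, 14, 8, 5, 10, 16, 12, 8, 29] • Total Autonomous Communities: 14 (out of 17)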

Preliminary exploitation of data • Statistical analysis of results • Judicial units: heterogeneity • Judge’s profile • Protocols of analysis • Literal transcripts • Completed questionnaires • List of extracted questions

OPJK Modeling • Identification of possible concepts through ALCESTE’s results and TextToOnto conceptual distribution • Domain detection • Competency questions discussion and concept extraction

Intuitive ontological subdomains: CRIMINAL LAW, GENDER VIOLENCE, ON-DUTY, FAMILY ISSUES, ORDER OF PROTECTION / INJUNCTION, JUDGE, CONTRACT LAW, IMMIGRATION, COMMERCIAL LAW, REAL ESTATE, JUDICIAL CLERKS, PROCEEDINGS, DECISION-MAKING & JUDGMENTS

Term extraction using TextToOnto

Term extraction using TextToOnto and Spanish GATE

Identify important concepts that should be represented • Hierarchy construction • Identify relations between them • Redefine the ontology, repeating steps 1-4

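Competency question discussion • Selecting (underlining) all the nouns (usually concepts) and adjectives (usually properties) contained in the competency questions. • ¿Cuál es el tratamiento de las denuncias manifiestamente inverosímiles o relativas a hechos que evidentemente carecen de tipicidad? [What is the treatment of complaints that are manifestly implausible or that relate to facts that evidently do not constitute an offence?] • ¿Y si se trata de una querella que reúne todos los demás presupuestos procesales pero los hechos objeto de la misma carecen de relevancia penal o manifiestamente falsos? [And what if it is a criminal complaint (querella) that meets all the other procedural requirements, but the facts it concerns lack criminal relevance or are manifestly false?] • ¿Qué ocurre si comparece en el juzgado una persona que quiere denunciar hechos difícilmente creíbles, sin relación entre sí, dudándose por el juez de la capacidad mental del denunciante? [What happens if a person appears in court wishing to report facts that are hardly credible and unrelated to each other, and the judge doubts the complainant's mental capacity?] • ¿Ante quién debe interponerse el recurso de reforma contra la prisión, delante del juez de guardia o del juez que dictó el correspondiente auto de prisión? [Before whom should the appeal for reconsideration (recurso de reforma) against imprisonment be filed: the on-duty judge or the judge who issued the corresponding imprisonment order?]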

OPJK classes identified

OPJK and Proton Integration

Improving knowledge discovery on the competency questions

Data and Method • Data: 3 text corpora (judges’ questions): • Corpus 1: Scholar “on duty” questions (Spanish Judicial School = 99) • Corpus 2: Practical “on duty” questions (= 163) (field work) • Corpus 3: All practical questions (= 756) (field work) • Method: • TEXT GARDEN (J. Stefan Institute, Ljubljana) • ALCESTE: Analysis of the co-occurring lexemes within the simple statements of a text [Reinert 2002, 2003]

Analysis of Text The text needs to be represented in an appropriate way for statistical analysis: • Breaking text into “units” (lines, sentences, …) • Morphological categorization (adjectives, prepositions, …) • Putting words into canonical form: • Lemmatization (is,was,are → be) • Stemming (loved, loving → lov+) • Analysis: • Clustering • Latent semantic indexing • Correspondence analysis • Classification • Visualization
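The preparation steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEMMAS` table, the function names and the sample sentence are hypothetical, and the sketch is not how TEXT GARDEN or ALCESTE implement these steps. It splits a text into units, puts words into a canonical form, and builds the unit-by-term counts on which clustering or correspondence analysis would then operate.

```python
import re
from collections import Counter

# Hypothetical lemma table; a real system would use a morphological analyser.
LEMMAS = {"is": "be", "was": "be", "are": "be", "judges": "judge"}

def split_units(text):
    """Break the text into sentence-like units on '.', '?' or '!'."""
    return [u.strip() for u in re.split(r"[.?!]+", text) if u.strip()]

def canonical_tokens(unit):
    """Lowercase, tokenize, and map each word to its lemma when one is known."""
    words = re.findall(r"[a-záéíóúñü]+", unit.lower())
    return [LEMMAS.get(w, w) for w in words]

def unit_term_counts(text):
    """One Counter per unit: the rows of a unit-by-term table."""
    return [Counter(canonical_tokens(u)) for u in split_units(text)]

rows = unit_term_counts("The judge was on duty. Judges are overworked!")
```

Each entry of `rows` is one row of the table; the statistical methods listed on the slide take such a table as input.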

ALCESTE (Reinert, 1988; Folch & Habert, 2000) • Corpus → Segmented in chunks • Hierarchical descending clustering → Classes of related chunks → List of typical words related to each class • Correspondence analysis → Geometric representation

Example of Correspondence Analysis and Visualization (TEXT GARDEN, ALCESTE) [Figure: two-dimensional correspondence-analysis plot of word stems from the judges' questions. Stems such as ejecucion+, embarg+ and quiebra+ fall in one region, violencia, proteccion and alejamiento in another, and guardia+, policia and detenido+ in a third.]

Example of Clustering • Class 1: Judicial unit: funcionar+(21), juzgar(26), oficina(11), trabaj+(13), decir(26), llam+(16), mand+(12), acudir(11), adjunto(4), busc+(4), consult+(4), dato(6), hablar(4), jurisprudencia(3), local+(3), material(6), necesit+(7), policia(14), prensa(4), sala(4), funerari+(2), hurto(3), informacion(5), miedo(3), robo(3), servicio+(7), sustitu+(4), tecnico(2), venir(15) • Class 2: Family law: alejamiento(22), malo(22), medida(16), orden+(23), proteccion(17), senor+(13), trat+(22), victima(11), mujer(11), padre(7), denunci+(12), domestico(8), violencia(8), agresor(4), dict+(10), madre(7), marido(6), nino(5), pension(4), psicolog+(5), separacion(5), abus+(5), alimento(3), ayud+(4), casa(3), cautelar+(3), divorcio(2), empresa(3), hijo(4), lesion+(6) • Class 3: Proceedings: escrit+(9), fiscal+(13), instruccion(9), ordinario(5), seguir(11), acumular(5), audiencia-provincia(2), conform+(2), contradictori+(3), criterio+(10), cuantia(5), falt+(7), injusto(3), interpretacion(3), ley(6), motiv+(3), pendiente(2), perito(5) • Class 4: Enforcement (judgment): ejecucion(14), ejecut+(15), embarg+(11), finca+(9), depositar+(6), interes+(6), pago(6), suspension(5), deposito(6), entreg+(6), quiebra(5), sentencia(9), solicit+(9), vehiculo(4), acreedor(3), administracion(4), cantidad(4), conden+(4), cost+(4), dinero(4), edicto(2), imposibilidad(3), multa(3), notificacion(4), pagar+(4)

Stemming vs Lemmatization • Stemming: the longest string of characters that is common to different words. For all the variants of ‘love’, but also for ‘lover’ (noun) and ‘lovely’ (adjective), it offers the same stem: lov+ • Lemmatization respects the category: 3 different lemmas: love (verb), lover (noun), lovely (adjective) • If we apply stemming to Spanish or Catalan (or any Romance language), which are richly inflected (60 forms for verbs, without counting the compound forms), it hides a lot of information. • EXAMPLES:
Stem | Lemma
acumulacion | acumulación
acumularse | acumular
acumul+ | ---
admision | admisión
admit+ | admitir
celebracion | celebración
celebr+ | celebrar
misma+ | mismo
mismo+ | ---
suspenderse | suspender
suspend+ | ---
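The contrast on this slide can be reproduced with a toy example. The sketch below takes the slide's definition of a stem literally (the longest character string shared by the word forms) and uses a hypothetical hand-written `LEMMA_TABLE`; it is not the stemmer or lemmatizer actually used in the study, and real stemmers apply suffix-stripping rules rather than a plain common prefix.

```python
import os

def common_prefix_stem(words):
    """Longest shared prefix of the word forms, marked with '+'."""
    return os.path.commonprefix(list(words)) + "+"

# Hypothetical lemma lookup keyed by (word form, part of speech).
LEMMA_TABLE = {
    ("loved", "verb"): "love",
    ("loving", "verb"): "love",
    ("lover", "noun"): "lover",
    ("lovely", "adj"): "lovely",
}

forms = [form for form, _pos in LEMMA_TABLE]

# Stemming collapses all four forms into a single stem.
stem = common_prefix_stem(forms)

# Lemmatization keeps one lemma per category-respecting dictionary entry.
lemmas = set(LEMMA_TABLE.values())
```

Here the four forms share one stem but map to three distinct lemmas, which is the information stemming hides.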

Quantitative Comparison • Lemmatized corpus has fewer word-forms than the stemmed version. • The LSI on the lemmatized corpus is able to reconstruct documents better, especially in few dimensions. • The lemmatized corpus clustering is more detailed.

Comparison of Clustering Results • Clustering with the stemmed corpus yields 4 classes: • ‘On-duty’ actions (mixed with Judicial Office) (54.06%) • Proceedings and Trial (18.10%) • Enforcement (judgments) (14.39%) • Family Law (gender violence, divorce, separation…) (13.46%) • Clustering with the lemmatized corpus is more detailed and yields 6 classes: • Judicial Office (20.11%) • ‘On-duty’ actions (27.25%) • Family Law (gender violence, divorce, separation…) (14.55%) • Proceedings (15.61%) • Trial (8.47%) • Enforcement (judgments) (14.02%)

Take-Home Messages • Do text analysis of legal documents! • If you do, use lemmatization!

Methodology

Initial Methodology + Based on 800 competency questions + Questions were clustered + Middle-out strategy – Usage of ontology not considered – Repetitive discussions – Long discussions

Considering the “Why” • No normative knowledge • Stick to the questions as sources • Model the questions, not the answers

Wiki visualization

DILIGENT Argumentation Ontology • Argumentation ontology defined • Based on case studies to identify the most effective types of arguments • Argument type recognition based on RST (Rhetorical Structure Theory)

Methodology changes • Using DILIGENT made the ontology engineering… • … much faster • … amenable to distributed development • … better documented • … trackable • … more manageable • DILIGENT itself also changed!

Outlook • Better tool support – off-the-shelf wiki had weaknesses • Moderator support in discussions • Competency question clustering • Gathering further experience from legal and other case studies

Architecture

High Level Requirements • Judges should not be bothered with a complex user interface. • A simple natural language interface is probably appropriate. • The decision as to whether a new question is similar to a stored question (with its corresponding answer) should be based on semantics rather than on simple word matching. • An ontology can be used to perform this semantic matching of questions. • The questions included in the system should be of high quality. • They should be rather exhaustive and reflect the actual situation. • An extensive survey with more than 250 Spanish judges forms the basis for the questions. • Justify the answer provided by the system with existing jurisprudence. • Jurisprudence databases. • Metadata and ontology processing of documents. • Knowledge management at all levels

Example Question-Answer • Question: • What problems can we foresee with the analysis of small amounts of drugs, where the identification test destroys the drugs? • Answer: • This is an unrepeatable piece of evidence at the trial. In these cases, the Spanish Criminal Procedure Act states that the adversarial principle should be respected. While the trial proceedings are prepared, the judge must explain to all parties that they may choose an expert to perform these tests.

Example of judgment: parts • Court and docket number • Grounds of Decision • Names of the magistrates • Date and place • Prefatory statement • History of the Case

Relations between the Question/Answer & Judgment [Diagram labels: Judgment (Summary, Case History, Decision Grounds, Ruling); FAQ (Question, Answer); OPJK; Practical Knowledge; Instances]

Architecture [Diagram labels: Web browser; Natural Language; DB 1 … DB N (Decisions); Questions-Answers; Ontology; Learning & feeding; Semantic Matching; Ontology Merging; Ontology Alignment; Expert Knowledge; Jurisprudence]

Expert Knowledge Retrieval • Design: technological considerations • iFAQ System • Multistage Searching Subsystem (accuracy, efficiency): Ontology Domain Detection, Keyword Matching, Ontology Graph Path Matching • Natural Language Processing • Ontology Technology • Caching subsystem • Persistence subsystem

Expert Knowledge Retrieval • Plugged Searching Stages • Chain of Responsibility pattern • Stages: Ontology Domain Detection, Keyword/synonym matching stage, Ontology graph path matching • User Question → iFAQ Search Engine → FAQ Candidates • Search Factory: iFAQ Search Engine, other search engines ...
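A minimal sketch of how pluggable search stages could be chained with the Chain of Responsibility pattern. All class names, the `DOMAIN_TERMS` table and the FAQ records are hypothetical illustrations, not the iFAQ implementation; the third stage named on the slide (ontology graph path matching) is left out for brevity, and stopping once at most one candidate remains is an assumption of this sketch.

```python
class SearchStage:
    """One stage in the chain: narrows the FAQ candidates, then delegates."""

    def __init__(self, next_stage=None):
        self.next_stage = next_stage

    def narrow(self, question, candidates):
        raise NotImplementedError

    def handle(self, question, candidates):
        remaining = self.narrow(question, candidates)
        # Stop when this is the last stage or at most one FAQ is left.
        if self.next_stage is None or len(remaining) <= 1:
            return remaining
        return self.next_stage.handle(question, remaining)


# Hypothetical trigger words per legal domain.
DOMAIN_TERMS = {"family": {"divorce", "custody"}, "criminal": {"arrest", "drugs"}}


class DomainDetection(SearchStage):
    def narrow(self, question, candidates):
        words = set(question.lower().split())
        domains = {d for d, terms in DOMAIN_TERMS.items() if words & terms}
        if not domains:
            return candidates  # no domain detected: pass everything on
        return [c for c in candidates if c["domain"] in domains]


class KeywordMatching(SearchStage):
    def narrow(self, question, candidates):
        words = set(question.lower().split())
        return [c for c in candidates if words & c["keywords"]]


faqs = [
    {"id": 1, "domain": "criminal", "keywords": {"drugs", "evidence"}},
    {"id": 2, "domain": "criminal", "keywords": {"arrest", "warrant"}},
    {"id": 3, "domain": "family", "keywords": {"custody"}},
]
chain = DomainDetection(KeywordMatching())
result = chain.handle("how to analyse small amounts of drugs", faqs)
```

Because each stage only knows its successor, a further stage can be plugged in by passing it to the constructor of the current last stage, without touching the others.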

Expert Knowledge Retrieval • Semantic Similarity: Main steps • NL query → NLP → POS list (lemmas) • Ontology Linking (against the Ontology) • Semantic Distance Calculation: semantic distance between queries • Term Coverage Calculation between queries • Best match of stored queries
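The slides do not define how term coverage is computed. As one plausible reading, the sketch below assumes coverage is the fraction of the user query's lemmas that also occur in a stored query, and picks the stored query with the highest coverage. The function names and the sample lemma lists are hypothetical.

```python
def term_coverage(query_lemmas, stored_lemmas):
    """Assumed definition: share of the query's lemmas found in the stored query."""
    query, stored = set(query_lemmas), set(stored_lemmas)
    if not query:
        return 0.0
    return len(query & stored) / len(query)


def best_match(query_lemmas, stored_queries):
    """Name of the stored query with the highest term coverage."""
    return max(
        stored_queries,
        key=lambda name: term_coverage(query_lemmas, stored_queries[name]),
    )


# Hypothetical stored questions, already reduced to lemma lists.
stored = {
    "faq_protection_order": ["victim", "request", "protection", "order"],
    "faq_drug_analysis": ["analysis", "drug", "evidence"],
}
query = ["drug", "analysis", "destroy", "evidence"]

coverage = term_coverage(query, stored["faq_drug_analysis"])
match = best_match(query, stored)
```

In a fuller system this score would presumably be combined with the semantic distance between queries rather than used on its own.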

Expert Knowledge Retrieval • Semantic Similarity [Diagram labels: Ontology; Actions; Denounce; Accuse; Follow; Mother; Son] • The semantic distance is based on the weighted navigation distance between terms in the ontology. • Navigating the ontology means moving from one concept to another via one of its relations or attributes, e.g. Is a, Follows, Actor, etc. • The task of associating distance costs: • Is domain specific • Needs to be performed by a legal expert.
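A weighted navigation distance of this kind can be sketched as a shortest-path search over the ontology graph. The triples, the per-relation costs in `RELATION_COST` and the choice of Dijkstra's algorithm are assumptions made for illustration; the slides state only that the costs are domain specific and must be set by a legal expert, and the concept names are borrowed from the diagram labels without claiming to reproduce its structure.

```python
import heapq

# Hypothetical cost per relation type (in practice assigned by a legal expert).
RELATION_COST = {"is_a": 1.0, "actor": 2.0, "follows": 3.0}

# Hypothetical ontology fragment as (concept, relation, concept) triples.
TRIPLES = [
    ("Denounce", "is_a", "Action"),
    ("Accuse", "is_a", "Action"),
    ("Mother", "actor", "Denounce"),
    ("Son", "actor", "Accuse"),
    ("Accuse", "follows", "Denounce"),
]


def build_graph(triples):
    """Adjacency list; navigation is treated as undirected in this sketch."""
    graph = {}
    for a, relation, b in triples:
        cost = RELATION_COST[relation]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def semantic_distance(graph, start, goal):
    """Cost of the cheapest weighted path between two concepts (Dijkstra)."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return float("inf")  # concepts are not connected


GRAPH = build_graph(TRIPLES)
d_actions = semantic_distance(GRAPH, "Denounce", "Accuse")
d_actors = semantic_distance(GRAPH, "Mother", "Son")
```

With these costs, two actions that share a parent concept come out closer than two actors who are linked only through those actions; changing the relation costs changes that ranking, which is why the cost assignment needs expert input.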

Conclusions • Decision support system for inexperienced judges • Using Semantic Web technology for handling knowledge • Provide knowledge for the decision-making process • Capture knowledge from experts • Share knowledge among all users • Extended understanding capacities • Background knowledge: Professional Legal Ontology • Decision Explanation • Improved Knowledge Acquisition
