
CPE 641 Natural Language Processing

CPE 641 Natural Language Processing. Ontologies Asst. Prof. Nuttanart Facundes, Ph.D. Introduction. Tim Berners-Lee, creator of the WWW, foresees a future when the Web will be more than just a collection of web pages (Berners-Lee et al.,2001).


Presentation Transcript


  1. CPE 641 Natural Language Processing Ontologies Asst. Prof. Nuttanart Facundes, Ph.D.

  2. Introduction • Tim Berners-Lee, creator of the WWW, foresees a future when the Web will be more than just a collection of web pages (Berners-Lee et al., 2001). • This means that computers will be able to consider the meaning, or semantics, of information sources on the web and to meaningfully interpret the wealth of knowledge available there.

  3. Semantic Web • This has a lot to do with agents. • Agents will search and perform tasks for human users. This is done on the Semantic Web. • To realize the Semantic Web is to transform the current web into an information space with a semantic organizational foundation, that is, an information space that makes information accessible to machines by taking its meaning into account.

  4. Ontologies • Information resources on the Semantic Web will contain not only data, but also metadata that describe what the data are about. • This will allow agents and human users to identify, collect and process suitable information sources by interpreting the semantic metadata based on the task at hand. • The semantic foundation will be provided by ontologies.

  5. Origin • Ontology (uncountable noun with no plural) – philosophical discipline that studies the nature of being, a system of categories. • AI researchers borrow the term to mean: a designed artefact consisting of shared vocabulary used to describe entities in some domain of interest

  6. What is ontology? • An explicit specification of a conceptualisation. • Ontological Commitment • The agreements among agents about the objects and relations being talked about

  7. Definitions • An ontology is a shared conceptualization of a domain • An ontology is a set of definitions in a formal language for terms describing the world

  8. Motivation • select EMPDAT from PERSTAB where POS='mgmnt' • What does it mean? • PERSTAB is a table which lists employee data • What’s an employee? How is an employee different from a contractor? What if I want data on both? • Even if this information is available in English, a human has to read it

  9. Motivation (2) • "Parenthood is a more general relationship than motherhood." • "Mary is the mother of Bill." • "Who are Bill's parents?" • "Mary is the parent of Bill." • That fact is not stated anywhere, but can be derived by a DAML application. Example from "Why Use DAML?" <http://www.daml.org/2002/04/why.html>

  10. Motivation (2) continued • More formally, the statements (motherOf subProperty parentOf) and (Mary motherOf Bill), • when stated in DAML, allow you to conclude (Mary parentOf Bill) • Java code or a stored procedure could do this sort of inference for facts in XML or SQL • But the DAML spec itself says the conclusion is true • In contrast, different Java code could reach a different conclusion
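The subProperty inference on this slide can be sketched in a few lines of Python. This is only an illustration of the rule, not DAML itself: the triple representation and the function name `infer_superproperties` are assumptions made for the example.

```python
def infer_superproperties(facts, subproperty_of):
    """Close a set of (subject, property, object) triples under the rule:
    (s p o) and (p subProperty q)  =>  (s q o)."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(closed):
            for q in subproperty_of.get(p, ()):
                if (s, q, o) not in closed:
                    closed.add((s, q, o))
                    changed = True
    return closed


# The two statements from the slide.
facts = {("Mary", "motherOf", "Bill")}
subproperty_of = {"motherOf": {"parentOf"}}  # (motherOf subProperty parentOf)

closed = infer_superproperties(facts, subproperty_of)
print(("Mary", "parentOf", "Bill") in closed)
```

The point of the slide is that in DAML this rule belongs to the language specification, whereas here it lives in ad hoc code that another programmer could write differently.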

  11. Motivation (2) continued • (Mary motherOf Bill) • (parentOf inverse childOf) • (Bill childOf ?X) • ?X = Mary • The semantics of inverse is part of the DAML spec
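A minimal Python sketch of the derivation on this slide follows. It assumes the subProperty statement from the previous slide is also available, since the answer needs (Mary parentOf Bill) as an intermediate step. The dictionaries `SUBPROPERTY` and `INVERSE` and the helper names are hypothetical, chosen only for illustration.

```python
# Assumed rule tables: (motherOf subProperty parentOf), (parentOf inverse childOf).
SUBPROPERTY = {"motherOf": "parentOf"}
INVERSE = {"parentOf": "childOf", "childOf": "parentOf"}


def close(facts):
    """Apply the subProperty and inverse rules until no new triple appears."""
    closed = set(facts)
    frontier = list(facts)
    while frontier:
        s, p, o = frontier.pop()
        derived = []
        if p in SUBPROPERTY:
            derived.append((s, SUBPROPERTY[p], o))
        if p in INVERSE:
            derived.append((o, INVERSE[p], s))
        for fact in derived:
            if fact not in closed:
                closed.add(fact)
                frontier.append(fact)
    return closed


def objects_of(facts, subject, predicate):
    """Answer a query of the form (subject predicate ?X)."""
    return sorted(o for s, p, o in facts if s == subject and p == predicate)


closed = close({("Mary", "motherOf", "Bill")})
answers = objects_of(closed, "Bill", "childOf")  # the query (Bill childOf ?X)
print(answers)
```

Under these assumptions the query binds ?X to Mary, matching the slide.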

  12. Language Formality and Expressiveness [Figure: languages positioned along Formality and Expressiveness axes: Human Language, CycL, F-Logic, KIF, OWL, SQL, DAML and XML, with regions labeled Human Consumption, Machine Processing and Machine Inference]

  13. Content Formality and Size [Figure: resources positioned along Formality and Size axes: Cyc, WordNet, SUMO+domain, SUMO, UMLS, Yahoo! and DOLCE, with regions labeled Taxonomy, Lexicons and Formal Ontology]

  14. Many Ways to Use Ontology • As an information engineering tool • Create a database schema • Map the schema to an upper ontology • Use the ontology as a set of reminders for additional information that should be included • As more formal comments • Define an ontology that is used to create a DB or OO system • Use a theorem prover at design time to check for inconsistencies • For taxonomic reasoning • Do limited run-time inference in Prolog, a description logic, or even Java • For first order logical inference • Full-blown use of all the axioms at run time

  15. Upper Ontology • An attempt to capture the most general and reusable terms and definitions

  16. Motivation • Ontologies may have different names for the same things • type – a relation between a class and an instance • instance – a relation between a class and an instance • isa – a relation between a class and an instance • … • Ontologies may have the same name for different things, and no corresponding terms • before – a relation between two time points • before – a relation between two time intervals • Either use the same upper ontology, or at least map to a common upper ontology
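The idea of mapping each ontology's local terms to a common upper ontology can be sketched as a lookup table in Python. The ontology names (`ontA`, `ontB`, `ontC`) and the upper-level names `beforeTimePoint` and `beforeTimeInterval` are hypothetical placeholders, not terms taken from any real upper ontology.

```python
# Hypothetical mapping from (ontology, local term) to a shared upper-ontology term.
UPPER_MAPPING = {
    ("ontA", "type"): "instance",
    ("ontB", "instance"): "instance",
    ("ontC", "isa"): "instance",
    ("ontA", "before"): "beforeTimePoint",     # relates two time points
    ("ontB", "before"): "beforeTimeInterval",  # relates two time intervals
}


def to_upper(ontology, term):
    """Translate a local term; raises KeyError if the term has no mapping."""
    return UPPER_MAPPING[(ontology, term)]


def same_meaning(a, b):
    """Two local terms mean the same thing if they map to the same upper term."""
    return to_upper(*a) == to_upper(*b)


print(same_meaning(("ontA", "type"), ("ontC", "isa")))
print(same_meaning(("ontA", "before"), ("ontB", "before")))
```

Different names that share an upper term are recognised as equivalent, while identical names with different upper terms are kept apart.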

  17. Formal Upper Ontologies • DOLCE • Cyc • SUMO

  18. Simple Methodology • Extract nouns and verbs from a source text • Find classes in SUMO for the nouns and verbs • Record a mapping as being either equal, subsuming or instance. • type a single word that relates to the UBL term in the "SUMO term" or "English Word" text areas in the SUMO browser • Create a subclass of SUMO if it's a subsuming mapping • Add properties to the subclass • reusing SUMO properties • extending SUMO properties by creating a &%subrelation of an existing property • Add English definition to the class • define constraints that express how the subclass is more specific than the superclass • Express the classes and properties in KIF and begin creating axioms, based on the English definitions created previously

  19. First Exercise (1) • “Seven Turkish nationals of Chechen origin hijacked a Russia-bound Panamanian ferry in Trabzon. The hijackers initially threatened to kill all Russians on board unless Chechen separatists being held in Dagestan, Russia, were released. On 19 January 1998, the hijackers surrendered to Turkish authorities outside the entrance to the Bosporus. The passengers were unharmed.” • Identify items that need formalization – start with nouns and verbs

  20. First Exercise (2) • “Seven Turkish nationals of Chechen origin hijacked a Russia-bound Panamanian ferry in Trabzon. The hijackers initially threatened to kill all Russians on board unless Chechen separatists being held in Dagestan, Russia, were released. On 19 January 1998, the hijackers surrendered to Turkish authorities outside the entrance to the Bosporus. The passengers were unharmed.” • Now create terms that correspond to the nouns and verbs • Remove redundancy • Are there any “background” notions that are not explicit in the text?

  21. First Exercise (3) • “Seven Turkish nationals of Chechen origin hijacked a Russia-bound Panamanian ferry in Trabzon. The hijackers initially threatened to kill all Russians on board unless Chechen separatists being held in Dagestan, Russia, were released. On 19 January 1998, the hijackers surrendered to Turkish authorities outside the entrance to the Bosporus. The passengers were unharmed.” • Turkey, Chechnya, Nationality, Hijacking, Threatening, Killing, Releasing, Holding, Dagestan, Russia, Separatist, Entrance, Bosporus, Unharmed, Panama, Trabzon, Authority, Outside, boundFor, Ferry, onBoard

  22. SUMO Overview • Understanding what’s in the upper ontology, in order to use it effectively

  23. High Level Distinctions The first fundamental distinction is that between ‘Physical’ (things which have a position in space/time) and ‘Abstract’ (things which don’t) Physical Abstract

  24. High Level Distinctions Partition of ‘Physical’ into ‘Objects’ and ‘Processes’ Physical Object Process
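The top-level distinctions can be read as a small class hierarchy over which taxonomic reasoning is possible. The Python sketch below encodes only the Physical, Object and Process classes named on the slides; the individuals `Ferry1` and `Hijacking1` and the helper names are hypothetical, introduced for the example.

```python
# Subclass links taken from the slide: Object and Process partition Physical.
SUBCLASS_OF = {"Object": "Physical", "Process": "Physical"}

# Hypothetical individuals, each asserted as an instance of one class.
INSTANCE_OF = {"Ferry1": "Object", "Hijacking1": "Process"}


def superclasses(cls):
    """Return every class above cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance(individual, cls):
    """An individual belongs to its asserted class and to all its superclasses."""
    direct = INSTANCE_OF[individual]
    return cls == direct or cls in superclasses(direct)


print(is_instance("Ferry1", "Physical"))
print(is_instance("Ferry1", "Process"))
```

Because Object and Process are disjoint parts of the partition, an instance of one is never inferred to be an instance of the other, and nothing under Physical is inferred to be Abstract.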

  25. Objects Object SelfConnectedObject Substance CorpuscularObject Region Collection

  26. Processes IntentionalProcess IntentionalPsychologicalProcess RecreationOrExercise OrganizationalProcess Guiding Keeping Maintaining Repairing Poking ContentDevelopment Making Searching SocialInteraction Maneuver Motion BodyMotion DirectionChange Transfer Transportation Radiating DualObjectProcess Substituting Transaction Comparing Attaching Detaching Combining Separating InternalChange BiologicalProcess QuantityChange Damaging ChemicalProcess SurfaceChange Creation StateChange ShapeChange

  27. Abstract SetOrClass Relation Proposition Quantity Number PhysicalQuantity Attribute Graph GraphElement

  28. A Little Bit of Logic • Instance – GeorgeBush, Iraq, BobsRightBigToe • Class – Human, Nation • Relation – WWI before WWII, Bill childOf Mary • => (read as “implies”) - if X then Y • and – X and Y are true • or – X or Y (or both) are true • not – not X – the opposite of the truth of X • exists ?X – there exists something about which the following is true

  29. A Little “Structural” Ontology • (instance GeorgeBush Human) – GeorgeBush is an instance of the class of humans • (exists (?X) (parent ?X GeorgeBush)) – there exists something of which George Bush is the parent • (instance parent BinaryPredicate) – the relation of parent is a binary relation • (domain parent 1 Organism) – the first argument to the parent relation must be an instance of the class Organism • (domain parent 2 Organism) – similarly for the second argument
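The domain statements act as type constraints on a relation's arguments, which a program can check. The Python sketch below is an illustration only: the individual `SomeChild` is hypothetical, and Human is linked directly to Organism as a shortcut, skipping any intermediate classes an actual ontology would have.

```python
# (instance GeorgeBush Human); Iraq appears as an instance on the logic slide.
INSTANCE_OF = {"GeorgeBush": "Human", "SomeChild": "Human", "Iraq": "Nation"}

# Assumed shortcut: Human is treated as a direct subclass of Organism.
SUBCLASS_OF = {"Human": "Organism"}

# (domain parent 1 Organism) and (domain parent 2 Organism).
DOMAIN = {"parent": {1: "Organism", 2: "Organism"}}


def is_a(individual, cls):
    """True if the individual's class is cls or lies below cls."""
    current = INSTANCE_OF[individual]
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def domain_violations(relation, *args):
    """Return the 1-based argument positions that break a domain constraint."""
    return [
        position
        for position, arg in enumerate(args, start=1)
        if not is_a(arg, DOMAIN[relation][position])
    ]


print(domain_violations("parent", "SomeChild", "GeorgeBush"))
print(domain_violations("parent", "Iraq", "GeorgeBush"))
```

A statement such as (parent Iraq GeorgeBush) is flagged at argument 1, because a Nation is not an Organism in this toy hierarchy.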

  30. Linking to SUMO Terms • Nation, Confining, Committing, SocialRole, TransportationDevice, Killing, Near, Injuring, citizen, (not…), (exists…) • Terms from the exercise (may or may not be the same as SUMO terms): • Turkey, Chechnya, Nationality, Hijacking, Threatening, Killing, Releasing, Holding, Dagestan, Russia, Separatist, Entrance, Bosporus, Unharmed, Panama, Trabzon, Authority, Outside, boundFor, Ferry, onBoard • Use the terms in the first bullet to define the terms in the second bullet • Use Nation to state: (instance Turkey Nation)

  31. Formalization (exists (?TURK …) (and (citizen ?TURK Turkey)) … )

  32. Formalization (exists (?TURK ?FERRY …) (and (citizen ?TURK Turkey) (instance ?FERRY FerryBoat) … )

  33. Formalization (exists (?TURK ?FERRY ?HIJACK) (and (citizen ?TURK Turkey) (instance ?FERRY FerryBoat) (instance ?HIJACK Hijacking) (agent ?HIJACK ?TURK) (patient ?HIJACK ?FERRY) (earlier ?HIJACK (DayFn 19 (MonthFn January (YearFn 1998))))))
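To see what the existential formula asserts, it can be checked against a small set of ground facts. The Python sketch below is a simple conjunctive pattern matcher, not a KIF reasoner: the individuals `Hijacker1`, `Ferry1` and `Event1` are hypothetical, and the string "1998-01-19" stands in for the nested (DayFn 19 (MonthFn January (YearFn 1998))) term.

```python
def is_var(term):
    """Variables are written with a leading '?', as in the KIF formula."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None on failure."""
    if len(pattern) != len(fact):
        return None
    extended = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if extended.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return extended


def solve(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns (a conjunction)."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


# Hypothetical ground facts describing one hijacking event.
FACTS = [
    ("citizen", "Hijacker1", "Turkey"),
    ("instance", "Ferry1", "FerryBoat"),
    ("instance", "Event1", "Hijacking"),
    ("agent", "Event1", "Hijacker1"),
    ("patient", "Event1", "Ferry1"),
    ("earlier", "Event1", "1998-01-19"),
]

# The conjuncts of the formula, in the same order as on the slide.
QUERY = [
    ("citizen", "?TURK", "Turkey"),
    ("instance", "?FERRY", "FerryBoat"),
    ("instance", "?HIJACK", "Hijacking"),
    ("agent", "?HIJACK", "?TURK"),
    ("patient", "?HIJACK", "?FERRY"),
    ("earlier", "?HIJACK", "1998-01-19"),
]

solutions = list(solve(QUERY, FACTS))
print(solutions)
```

The formula holds in this toy fact base exactly when at least one binding is found; removing any one of the facts leaves the conjunction unsatisfied.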
