A Contextual Clustering Approach for Theory Manipulation of RKF Knowledge Bases
Mala Mehrotra
Pragati Synergetic Research Inc., Cupertino, CA
mm@pragati-inc.com
www.pragati-inc.com
RKF Team Review Meeting, SRI, CA, 11th Oct 2000
SRI Team’s Primary Focus: Provide components for formation of KBS
Multi-ViewPoint-Clustering Analysis (MVP-CA) Technology Focus: Provide an analysis tool for aiding componentization of existing KBS
Multi-ViewPoint-Clustering Analysis (MVP-CA) Approach
• Agglomerative clustering algorithms produce semantically related axiom clusters
• “Similarity” is defined by a set of heuristic distance metrics
• Meaningful clusters are identified with the aid of statistical and semantics-based cluster information
• Clustering provides support for reverse engineering of KBs by exposing semantic contexts for terms in the pre-existing axioms:
  • anomaly checking
  • comprehension
  • building intermediate concept nodes and mid-level theories
  • …
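The slides do not specify MVP-CA's distance metrics, so the following is only a minimal sketch of the general idea, assuming a Jaccard distance over the `#$` terms of each axiom and average-linkage merging. The function names and the `threshold` parameter are hypothetical and are not part of the MVP-CA tool.

```python
import re
from itertools import combinations

# Assumption: an axiom's "vocabulary" is the set of #$ constants it mentions.
TERM = re.compile(r"#\$([A-Za-z][\w-]*)")

def terms(axiom):
    """Set of #$ constants in an axiom, ignoring logical connectives."""
    return set(TERM.findall(axiom)) - {"implies", "and"}

def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over the two axioms' term sets."""
    ta, tb = terms(a), terms(b)
    union = ta | tb
    return 1.0 - len(ta & tb) / len(union) if union else 0.0

def cluster_distance(c1, c2):
    """Average-linkage distance between two clusters of axioms."""
    total = sum(jaccard_distance(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerate(axioms, threshold=0.7):
    """Merge the closest pair of clusters until none is within threshold."""
    clusters = [[a] for a in axioms]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        if cluster_distance(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe: i < j, so index i is unaffected
    return clusters
```

Under these assumptions, axioms that share a term such as `touches` or `objectFoundInLocation` end up in the same cluster, which is the kind of grouping the later slides inspect by hand.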
SRI Team’s Short-Term Objective: Formulate spatial representation components
MVP-CA Technology’s Potential Contribution: Extract components from IKB dealing with spatial concepts
Status of Work in Progress
• IKB slice for spatial vocabulary obtained from SRI in mid-September
• Slice was divided into two files:
  • 288 assertion axioms
  • 599 term-definition axioms
• Focus on:
  • Exposing redundant and overloaded concepts
  • Identifying reusable concepts
• Current work focuses on analyzing the assertion axioms; results so far are reported next
First Stage: KB cleanup
• Eliminated axioms with :ignore t
  • Identified at the parse stage
  • 65 such axioms eliminated
• Eliminated duplicate axioms
  • Identified using the MVP-CA tool’s redundancy feature
  • 59 such axioms eliminated
• The cleaned-up version has 164 assertions
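The two cleanup passes can be sketched as follows. This is an illustration only: it assumes the `:ignore t` flag appears as literal tokens inside the axiom text and that duplicates are axioms that are identical after whitespace is normalized. How the MVP-CA parser and redundancy feature actually detect these cases is not described in the slides, and `canonical` and `clean` are hypothetical names.

```python
import re

def canonical(axiom):
    """Tokenize on parentheses and whitespace, then rejoin with single spaces,
    so that '(#$implies(#$and' and '(#$implies (#$and' compare equal."""
    return " ".join(re.findall(r"[()]|[^\s()]+", axiom))

def clean(axioms):
    """Drop axioms flagged ':ignore t', then drop duplicates, keeping order.
    Returns (kept axioms, number ignored, number of duplicates)."""
    kept, seen = [], set()
    ignored = duplicates = 0
    for axiom in axioms:
        form = canonical(axiom)
        if ":ignore t" in form:      # assumption: flag is literal text
            ignored += 1
        elif form in seen:
            duplicates += 1
        else:
            seen.add(form)
            kept.append(form)
    return kept, ignored, duplicates
```

The counts on the slide are consistent with running two such passes in sequence: 288 - 65 - 59 = 164.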
Second Stage: Cluster Formation
• MVP-CA tool’s clustering of assertions has produced axiom clusters that reveal the context of usage of a few salient terms
  • Some problem areas
  • Some useful concepts
• Such exposition can help with intermediate concept node formation for:
  • Better maintenance
  • Reorganization, and
  • Presentation of concept terms to SME/KE
• Work is still in progress
• Some plausible scenarios with these clusters will be presented next
Potentially Redundant Axioms
(#$implies (#$and (#$touchesDirectly ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$touches ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
touchesDirectly and touches are essentially the same concept in the context of objectFoundInLocation.
Concept of touches and touchesDirectly
(#$implies (#$formsBorderBetween ?BORDER ?Y ?Z) (#$touchesDirectly ?BORDER ?Y))
(#$implies (#$and (#$isa ?INSIDEOUT #$InsideSurface) (#$isa ?OUTSIDEIN #$ExternalSurface-WholeThing) (#$physicalParts ?OUT ?INSIDEOUT) (#$externalParts ?IN ?OUTSIDEIN) (#$in-Snugly ?IN ?OUT)) (#$touches ?INSIDEOUT ?OUTSIDEIN))
(#$implies (#$in-ImmersedGeneric ?OBJ ?FLUID) (#$touches ?FLUID ?OBJ))
(#$implies (#$touches ?X ?Y) (#$near ?X ?Y))
(#$implies (#$and (#$touchesDirectly ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$touches ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$bordersOn ?X ?Y) (#$touchesDirectly ?X ?Y))
(#$implies (#$and (#$touches ?X ?Y) (#$physicalParts ?Z ?Y)) (#$touches ?X ?Z))
(#$implies (#$touchesDirectly ?X ?Y) (#$touches ?X ?Y))
(#$implies (#$and (#$touchesDirectly ?PRT ?THING) (#$externalParts ?WHL ?PRT)) (#$touchesDirectly ?THING ?WHL))
(#$implies (#$in-Embedded ?X ?Y) (#$touchesDirectly ?X ?Y))
(#$implies (#$in-ContFullOf ?X ?Y) (#$touchesDirectly ?X ?Y))
(#$implies (#$touchesDirectly ?X ?Y) (#$distanceBetween ?X ?Y (#$Foot-UnitOfMeasure 0)))
(#$implies (#$in-Held ?OBJ ?HOLDER) (#$touches ?HOLDER ?OBJ))
(#$implies (#$adjacentTo ?REG1 ?REG2) (#$touches ?REG1 ?REG2))
(#$implies (#$on-Physical ?TOP ?BOT) (#$touches ?BOT ?TOP))
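One way to see why the touchesDirectly version of the objectFoundInLocation axiom is a redundancy candidate: this cluster also contains touchesDirectly implies touches, so the touches version already covers every touchesDirectly case. The sketch below checks this with a naive forward chainer over ground facts. The individuals `CupA`, `TableA`, and `KitchenA` and all function names are hypothetical, and this is not how the MVP-CA tool detects redundancy.

```python
def closure(facts, rules):
    """Naive forward chaining: apply every rule until no new fact appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for new_fact in rule(facts):  # rule returns a fresh set
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

def touches_directly_implies_touches(facts):
    """touchesDirectly(x, y) => touches(x, y)."""
    return {("touches", x, y) for (p, x, y) in facts if p == "touchesDirectly"}

def location_rule(relation):
    """relation(x, y) and objectFoundInLocation(x, loc)
    => objectFoundInLocation(y, loc)."""
    def rule(facts):
        return {("objectFoundInLocation", y, loc)
                for (p, x, y) in facts if p == relation
                for (q, x2, loc) in facts
                if q == "objectFoundInLocation" and x2 == x}
    return rule

# Hypothetical ground facts, for illustration only.
facts = {("touchesDirectly", "CupA", "TableA"),
         ("objectFoundInLocation", "CupA", "KitchenA")}

base = [touches_directly_implies_touches, location_rule("touches")]
without_axiom = closure(facts, base)
with_axiom = closure(facts, base + [location_rule("touchesDirectly")])
print(without_axiom == with_axiom)
```

On these facts the two closures coincide, so dropping the touchesDirectly version loses no conclusions as long as touchesDirectly implies touches stays in the KB.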
Pivot Concept: distanceBetween
(assertion (#$implies (#$bordersOn ?X ?Y) (#$distanceBetween ?X ?Y (#$Kilometer 0))))
(assertion (#$implies (#$bordersOn ?X ?Y) (#$distanceBetween ?X ?Y (#$Meter 0))))
(assertion (#$implies (#$bordersOn ?X ?Y) (#$touchesDirectly ?X ?Y)))
(assertion (#$implies (#$touchesDirectly ?X ?Y) (#$distanceBetween ?X ?Y (#$Foot-UnitOfMeasure 0))))
Intermediate Concept Node Formation
[Diagram: bordersOn implies distanceBetween (KM), distanceBetween (M), and touchesDirectly; touchesDirectly implies distanceBetween (F), so bordersOn also implies distanceBetween (F). Grouping F | M | KM under an intermediate node distanceUnit gives bordersOn implies distanceBetween (distanceUnit).]
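The distanceBetween axioms differ only in the unit of their zero-distance argument, which is what suggests an intermediate distanceUnit node. A minimal sketch of that generalization step, assuming the unit always appears in the form `(#$Unit 0)`; the `?UNIT` placeholder and the function name are hypothetical:

```python
import re
from collections import defaultdict

# Assumption: the unit shows up as a one-argument term applied to 0.
UNIT_ARG = re.compile(r"\(#\$([\w-]+) 0\)")

def generalize_units(axioms):
    """Replace the unit term with a placeholder and group axioms that
    become identical, recording which units each template covers."""
    groups = defaultdict(list)
    for axiom in axioms:
        match = UNIT_ARG.search(axiom)
        template = UNIT_ARG.sub("(?UNIT 0)", axiom)
        groups[template].append(match.group(1) if match else None)
    return dict(groups)

axioms = [
    "(#$implies (#$bordersOn ?X ?Y) (#$distanceBetween ?X ?Y (#$Kilometer 0)))",
    "(#$implies (#$bordersOn ?X ?Y) (#$distanceBetween ?X ?Y (#$Meter 0)))",
    "(#$implies (#$touchesDirectly ?X ?Y) (#$distanceBetween ?X ?Y (#$Foot-UnitOfMeasure 0)))",
]
for template, units in generalize_units(axioms).items():
    print(units, template)
```

Here the two bordersOn axioms collapse into one template covering Kilometer and Meter, while the touchesDirectly axiom keeps its own template with Foot-UnitOfMeasure.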
Pivot Concept: objectFoundInLocation
(#$implies (#$objectFoundInLocation ?OBJ ?LOC) (#$near ?LOC ?OBJ))
(#$implies (#$and (#$touchesDirectly ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$touches ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$on-Physical ?X ?Y) (#$objectFoundInLocation ?Y ?LOC)) (#$objectFoundInLocation ?X ?LOC))
(#$implies (#$and (#$groupMembers ?C ?MEM) (#$objectFoundInLocation ?C ?LOC)) (#$objectFoundInLocation ?MEM ?LOC))
(#$implies (#$and (#$physicalParts ?X ?PART) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?PART ?LOC))
(#$implies (#$and (#$physicalParts ?LOC ?PART) (#$objectFoundInLocation ?X ?PART)) (#$objectFoundInLocation ?X ?LOC))
(#$implies (#$and (#$in-ContGeneric ?OBJ ?CONT) (#$objectFoundInLocation ?CONT ?REG)) (#$objectFoundInLocation ?OBJ ?REG))
(#$implies (#$in-ImmersedFully ?OBJ ?FLU) (#$objectFoundInLocation ?OBJ ?FLU))
(#$implies (#$and (#$isa ?FLUID #$Place) (#$in-ImmersedGeneric ?OBJECT ?FLUID)) (#$objectFoundInLocation ?OBJECT ?FLUID))
(#$implies (#$and (#$objectFoundInLocation ?PER ?LOC) (#$covers-Hairlike ?STUFF ?LOC)) (#$in-Among ?PER ?STUFF))
(#$implies (#$and (#$in-Floating ?OB ?LIQ) (#$surfaceParts ?LIQ ?SURF)) (#$objectFoundInLocation ?OB ?SURF))
(#$implies (#$and (#$isa ?WATER #$BodyOfWater) (#$in-Floating ?OBJ ?WATER)) (#$objectFoundInLocation ?OBJ ?WATER))
(#$implies (#$and (#$in-ContGeneric ?OBJ ?CONT) (#$containsCavity ?CONT ?CAV)) (#$objectFoundInLocation ?OBJ ?CAV))
(#$implies (#$geographicalSubRegions ?REG ?PLACE) (#$objectFoundInLocation ?PLACE ?REG))
(#$implies (#$and (#$isa ?Y #$GeographicalRegion) (#$on-Physical ?X ?Y)) (#$objectFoundInLocation ?X ?Y))
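A cluster like this one is read by asking which other terms appear alongside the pivot. The sketch below counts the co-occurring terms for a chosen pivot. It assumes every term is written with the `#$` prefix and treats `implies` and `and` as connectives rather than vocabulary. It is an illustration of the reading step, not a description of the MVP-CA tool's cluster statistics, and `cooccurring_terms` is a hypothetical name.

```python
import re
from collections import Counter

TERM = re.compile(r"#\$([A-Za-z][\w-]*)")
CONNECTIVES = {"implies", "and"}

def cooccurring_terms(axioms, pivot):
    """Count, over the axioms that mention the pivot, how many of them
    mention each other term."""
    counts = Counter()
    for axiom in axioms:
        found = set(TERM.findall(axiom)) - CONNECTIVES
        if pivot in found:
            counts.update(found - {pivot})
    return counts

axioms = [
    "(#$implies (#$geographicalSubRegions ?REG ?PLACE) (#$objectFoundInLocation ?PLACE ?REG))",
    "(#$implies (#$and (#$isa ?Y #$GeographicalRegion) (#$on-Physical ?X ?Y)) (#$objectFoundInLocation ?X ?Y))",
    "(#$implies (#$touches ?X ?Y) (#$near ?X ?Y))",
]
print(cooccurring_terms(axioms, "objectFoundInLocation"))
```

Terms that recur with the pivot, or that fall into recognizable families, are the natural candidates for the sub-groupings shown on the following slides.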
objectFoundInLocation: relationship to positional terms
(#$implies (#$objectFoundInLocation ?OBJ ?LOC) (#$near ?LOC ?OBJ))
(#$implies (#$and (#$touchesDirectly ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$touches ?X ?Y) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?Y ?LOC))
(#$implies (#$and (#$on-Physical ?X ?Y) (#$objectFoundInLocation ?Y ?LOC)) (#$objectFoundInLocation ?X ?LOC))
objectFoundInLocation: relationship to partonomic terms
(#$implies (#$and (#$physicalParts ?X ?PART) (#$objectFoundInLocation ?X ?LOC)) (#$objectFoundInLocation ?PART ?LOC))
(#$implies (#$and (#$physicalParts ?LOC ?PART) (#$objectFoundInLocation ?X ?PART)) (#$objectFoundInLocation ?X ?LOC))
(#$implies (#$and (#$in-ContGeneric ?OBJ ?CONT) (#$objectFoundInLocation ?CONT ?REG)) (#$objectFoundInLocation ?OBJ ?REG))
(#$implies (#$and (#$in-ContGeneric ?OBJ ?CONT) (#$containsCavity ?CONT ?CAV)) (#$objectFoundInLocation ?OBJ ?CAV))
(#$implies (#$and (#$in-Floating ?OB ?LIQ) (#$surfaceParts ?LIQ ?SURF)) (#$objectFoundInLocation ?OB ?SURF))
objectFoundInLocation: relationship to group membership terms
(#$implies (#$and (#$objectFoundInLocation ?PER ?LOC) (#$covers-Hairlike ?STUFF ?LOC)) (#$in-Among ?PER ?STUFF))
(#$implies (#$and (#$groupMembers ?C ?MEM) (#$objectFoundInLocation ?C ?LOC)) (#$objectFoundInLocation ?MEM ?LOC))
objectFoundInLocation: relationship to geographical terms
(#$implies (#$geographicalSubRegions ?REG ?PLACE) (#$objectFoundInLocation ?PLACE ?REG))
(#$implies (#$and (#$isa ?Y #$GeographicalRegion) (#$on-Physical ?X ?Y)) (#$objectFoundInLocation ?X ?Y))
objectFoundInLocation: relationship to fluid terms
(#$implies (#$in-ImmersedFully ?OBJ ?FLU) (#$objectFoundInLocation ?OBJ ?FLU))
(#$implies (#$and (#$isa ?FLUID #$Place) (#$in-ImmersedGeneric ?OBJECT ?FLUID)) (#$objectFoundInLocation ?OBJECT ?FLUID))
(#$implies (#$and (#$isa ?WATER #$BodyOfWater) (#$in-Floating ?OBJ ?WATER)) (#$objectFoundInLocation ?OBJ ?WATER))
(#$implies (#$and (#$in-Floating ?OB ?LIQ) (#$surfaceParts ?LIQ ?SURF)) (#$objectFoundInLocation ?OB ?SURF))
Intermediate Concept Node Identification
[Diagram: objectFoundInLocation and the terms from its cluster: near, in-Among, covers-Hairlike, touches, touchesDirectly, groupMembers, on-Physical, in-Floating, surfaceParts, physicalParts, BodyOfWater, containsCavity, in-ImmersedFully, in-ImmersedGeneric, in-ContGeneric, geographicalSubRegions, GeographicalRegion]
Intermediate Concept Node Identification
[Diagram: the same terms as on the previous slide, now with the intermediate concept nodes positional, partonomic, groupMembership, geographic, and fluids added around objectFoundInLocation]
Achievements & Plans
• Parser built for MELD axioms
• MVP-CA tool adapted for ontology representation in MELD
• Clustering results with Virus KB released
• Clustering of the IKB spatial ontology is ongoing (assertions only):
  • Duplicate axioms identified
  • Clusters being studied for
    • redundant axioms
    • intermediate concepts
• Next steps:
  • Cluster the term-definition file
  • Provide support for concept graph
• Long-term goal is to develop criteria for component identification using clusters