 
                
Ways to Incorporate Ontology and Bayesian Network
Presented By: Asma Sanam Larik
Three Approaches
The following approaches have been used to combine ontologies with Bayesian networks (BN):
1) Ontology mapping enhancement using BN
2) Extending ontology queries with BN reasoning
3) Semi-automated construction of a BN from a domain ontology
Purpose of the Mapping Approach
Ontology designers apply different views of the same domain during ontology development. This yields heterogeneity at the ontology level, which is the main obstacle to semantic interoperability. Ontology mapping is the approach that tries to solve this problem.
Research in Ontology Mapping
1. OMEN (Ontology Mapping ENhancer, 2004): "OMEN: A Probabilistic Ontology Mapping Tool" by Prasenjit Mitra and Anuj Jaiswal, Pennsylvania State University and Stanford University
2. BayesOWL (2005): "A Bayesian Network Approach to Ontology Mapping" by Zhongli Ding and Yun Peng, UMBC
3. OntoBayes (2006): "OntoBayes: An Ontology-Driven Uncertainty Model" by Yi Yang and Jacques Calmet, University of Karlsruhe, Germany
Purpose of the Extending Ontology Reasoning Approach
Existing ontology query languages cannot answer queries that involve probabilities, such as:
• What is the likelihood of default of a company given that it is limited and has branches outside Europe?
• What is the likelihood of a particular project type given that it is led by male managers working for a limited company?
BNs are therefore applied for this kind of probabilistic reasoning.
Proposed by Andrea Bellandi and Franco Turini, University of Pisa, Italy, April 2009.
Purpose of the Automatic BN Construction Approach
Creating a BN requires identifying the variables of interest, determining their influence on each other, and constructing the conditional probability tables (CPTs). These methods propose a methodology for generating Bayesian networks from existing domain ontologies.
Research on Automated BN Construction
• Stefan Fenz (University of Vienna, Austria), 2008
• Ann Devitt and K. Mutosikova (Network Management Research Centre, Ireland), 2006
• Hai-tao Zhang and B-Yoeng Kang (Seoul National University, Korea), 2007
BayesOWL Approach
A probabilistic ontology is an annotated ontology that contains a set of prior and conditional distributions. The approach takes a plain ontology file and a probability file and maps both to a Bayesian network. The purpose is to use Bayesian inference for OWL reasoning.
Purpose / Direction of the Approach
• In domain modeling: given that A is a subclass of B, one may wish to express the probability that an instance of B also belongs to A. If A and B are not logically related, one may still wish to express how much A overlaps with B.
• In ontology reasoning: one may wish to know the degree of similarity of A to B even if neither subsumes the other.
• In concept mapping between two ontologies: a concept defined in one ontology often matches a concept in the other ontology only partially.
How to Incorporate a Probabilistic Ontology (PO)?
• Probabilistic information markups
• Structural translation
• Constructing CPTs for L-Nodes
• Constructing CPTs for concept nodes
CPT for Concept Nodes
Example taken from Zhongli Ding's thesis. Next, a constraint on B is applied: R1(B) = (0.61, 0.39).
[Tables: initial knowledge base, marginal on B, constraint on B, new JPD]
The new joint probability distribution (JPD) satisfies the constraint on B. Next, another constraint on C is applied: R2(C) = (0.83, 0.17).
[Tables: initial knowledge base, marginal on C, constraint on C, new JPD]
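The constraint steps above read like iterative proportional fitting, where the joint distribution is rescaled so that one marginal at a time matches its target. The following is a minimal sketch under that assumption. Only R1(B) and R2(C) come from the slides; the initial joint table, the three binary variables A, B, C and the function names are hypothetical.

```python
import numpy as np

def apply_constraint(joint, axis, target):
    """One fitting step: rescale `joint` so its marginal on `axis` equals `target`."""
    other_axes = tuple(i for i in range(joint.ndim) if i != axis)
    current = joint.sum(axis=other_axes)          # current marginal on `axis`
    shape = [1] * joint.ndim
    shape[axis] = -1                              # broadcast the ratio along `axis`
    ratio = (np.asarray(target) / current).reshape(shape)
    return joint * ratio

def fit(joint, constraints, sweeps=50):
    """Cycle through all constraints repeatedly until the marginals settle."""
    for _ in range(sweeps):
        for axis, target in constraints:
            joint = apply_constraint(joint, axis, target)
    return joint

# Hypothetical initial JPD over binary A, B, C (axes 0, 1, 2); entries sum to 1.
joint = np.array([[[0.10, 0.05], [0.20, 0.15]],
                  [[0.05, 0.10], [0.15, 0.20]]])

R1_B = [0.61, 0.39]   # constraint on B, from the slides
R2_C = [0.83, 0.17]   # constraint on C, from the slides

after_b = apply_constraint(joint, 1, R1_B)        # new JPD after the first constraint
fitted = fit(joint, [(1, R1_B), (2, R2_C)])       # JPD satisfying both constraints

print(after_b.sum(axis=(0, 2)))   # marginal on B after step one
print(fitted.sum(axis=(0, 1)))    # marginal on C after fitting
```

Applying the constraint on C can disturb the marginal on B, which is why the sketch cycles through both constraints rather than applying each one once.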
Reasoning
The BayesOWL framework supports common ontology reasoning tasks as probabilistic inference in the translated BN:
• Concept overlapping
• Concept subsumption
Ontology-based Generation of Bayesian Networks
By Stefan Fenz and Min Tjoa, University of Vienna
S. Fenz, "Ontology and Bayesian-based information security risk management", PhD thesis, Vienna University of Technology, Oct 2008
Motivation
• Creating a BN involves at least three challenging tasks:
  - Determining the relevant influence factors
  - Determining the relationships between the identified influence factors
  - Calculating the CPTs
• Ontologies are a potential solution to these challenges
Example Security Ontology
1) Security attribute: Confidentiality, Integrity, Availability
2) Threat source: Accidental, Deliberate
3) Threat origin: Human, Natural
4) Vulnerability: Physical, Technical, Administrative
5) Control type: Preventive, Corrective, Recovery
6) Severity level: High, Medium, Low
Methodology Proposed
• Concepts → nodes in the BN
• Relations → links
• Axioms → node states
• Instances → findings
Concept Nodes
The following factors have been identified:
1) Predecessor threats (PT1_Ti, ..., PTn_Ti) influence the considered threat Ti, which in turn influences its successor threats (ST1_Ti, ..., STn_Ti).
2) Each threat Ti requires one or more vulnerabilities (V1, ..., Vn) to become effective.
Axioms: Node Scales and Weights
Node states use a three-point Likert scale (High, Medium, Low).
For CPT construction, a severity rating S_Vi is defined for each vulnerability. A numerical weight W_PPVi is then obtained for each vulnerability by dividing its severity by the sum of the severities of all vulnerabilities relevant to the threat: W_PPVi = S_Vi / (S_V1 + ... + S_Vn).
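The weighting rule can be illustrated with a few lines of Python. This is a sketch only: the numeric encoding of the severity levels (High = 3, Medium = 2, Low = 1) and the vulnerability names V1 to V3 are assumptions made for the example, not values from the slides.

```python
# Assumed numeric encoding of the three-point severity scale.
SEVERITY_SCORE = {"High": 3, "Medium": 2, "Low": 1}

def vulnerability_weights(severities):
    """Weight of each vulnerability = its severity / sum of severities for the threat."""
    scores = {name: SEVERITY_SCORE[level] for name, level in severities.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Hypothetical vulnerabilities relevant to one threat.
weights = vulnerability_weights({"V1": "High", "V2": "Medium", "V3": "Low"})
print(weights)
```

By construction the weights of all vulnerabilities relevant to one threat sum to 1, so they can be used directly as mixing coefficients when the CPT of the threat node is filled in.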
Concept Nodes (continued)
3) Controls can be used to mitigate identified vulnerabilities. Mitigation depends on the effectiveness of a potential control combination (CCE_Vi), which in turn depends on the actual effectiveness of the control implementations (CE1, ..., CEn).
4) a) In case of deliberate threat sources, the vulnerability exploitation probability PP_Vi is determined by the effectiveness of a potential attacker (AE_Vi), which is in turn determined by the motivation (AM_Vi) and the capabilities (AC_Vi) of the attacker.
   b) In case of accidental threat sources, PP_Vi is determined by the a priori probability (AP_Ti) of the corresponding threat Ti.
Limitations
• Functions for calculating the CPTs are not provided by the ontology and have to be modeled externally.
• Human intervention is necessary if the ontology provides a knowledge model that does not exactly fit the domain of interest.
Strategy
• Extracting the BN directly from the ontology:
  - Defining the ontology compiling process that extracts the Bayesian network structure directly from the schema of the knowledge base
  - Learning the initial probability distributions
• Providing a Bayesian query language for answering queries involving probabilities
• Using inference over the BN for answering queries involving probabilities:
  - Defining the operational semantics of the language, based on the well-known Bayesian network reasoning schemas
Extracting a Bayesian Network from an Ontology: An Example (1)
• An ontology O is a pair <T, R> where
  - T = {T1, ..., Ti, ..., Tn} is a set of hierarchies called domain concepts
  - R ⊆ Ti × Tj is a set of arcs binding elements of T such that the resulting graph is acyclic
Domain concepts (is-a hierarchies):
T1 = COMPANY = {Company, Vendor, Jointventure, Competitor, Supplier}
T2 = PERSON = {Person, Man, Woman}
T3 = PROJECT = {Project, Research, Patent}
T4 = SECTOR = {Sector, Services, Financial, CreditCard, LifeInsurance}
Object properties:
R1 = hasCeo : COMPANY → PERSON
R2 = leads : PERSON → PROJECT
R3 = hasSector : COMPANY → SECTOR
[Figure: the four hierarchies drawn as is-a trees and connected by the object properties R1, R2 and R3]
Extracting a Bayesian Network from an Ontology: An Example (2)
[Figure: the Bayesian network extracted from the ontology, with high-level and low-level nodes and relations annotated with their probability tables]
Priors on the domain concepts: P(COMPANY) (shown as P(Company=c)), P(PERSON), P(PROJECT), P(SECTOR)
High-level relations (object properties):
• R1: P(Person=p | hasCeo Company=c)
• R2: P(Project=pr | leads Person=p)
• R3: P(Sector=s | hasSector Company=c)
Low-level relations (is-a arcs):
• T1 COMPANY: P(VENDOR|COMPANY), P(JOINTVENTURE|COMPANY), P(SUPPLIER|VENDOR), P(COMPETITOR|VENDOR)
• T2 PERSON: P(MAN|PERSON), P(WOMAN|PERSON)
• T3 PROJECT: P(RESEARCH|PROJECT), P(PATENT|PROJECT)
• T4 SECTOR: P(SERVICES|SECTOR), P(FINANCIAL|SECTOR), P(CREDITCARD|FINANCIAL), P(LIFEINSURANCE|FINANCIAL)
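Ontology Compiling Process
• It is composed of two phases:
  - Phase one: compiling the TBox of the ontology into the structural part of the 2lBN
  - Phase two: compiling the ABox of the ontology into the probabilistic part of the 2lBN
[Figure: high-level nodes HLN_COMPANY and HLN_PERSON connected by the high-level relation HLR_Ceo (hasCeo); labels mark the LLN, LLRT and HLRT]
Probability table shown in the figure:
          Company  Customer  Partnership  Jointventure
Man       0.56     0.86      0.36         0.29
Woman     0.44     0.14      0.64         0.71
Person    1        1         1            1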
2lBN Structural Part
• The compiling process of a TBox component maps:
  - Each ontology class to a boolean random variable (LLN)
  - Each concept domain to a multi-valued random variable (HLN)
  - Each object property to a Bayesian arc (HLR)
2lBN Probabilistic Part
• The initial probability distribution is computed from the distribution of the instances, that is, the ABox component.
• Two kinds of probability tables exist:
  - Low Level Relation Probability Table (LLRT)
    · A prior probability P(A) is the probability that an arbitrary ontology instance belongs to the class A.
    · A conditional probability P(A|B) is the probability that an arbitrary ontology instance belonging to the class B also belongs to the class A.
  - High Level Relation Probability Table (HLRT)
    · A conditional probability P(A|RB) is the probability that there exists a relation R (i.e., an object property or a path of object properties) between arbitrary ontology instances of A and arbitrary ontology instances of B.
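The structural mapping can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' compiler: the input layout (each domain as a dictionary from class to parent class), the function name compile_structure and the output keys are hypothetical. The is-a arcs are recorded as low-level relations, following the P(VENDOR|COMPANY)-style tables in the example network.

```python
def compile_structure(domains, object_properties):
    """Map a TBox to the structural part of a two-level network.

    domains: {domain_name: {class_name: parent_class_or_None}}
    object_properties: {property_name: (source_domain, target_domain)}
    """
    lln = []   # low-level nodes: one boolean variable per ontology class
    llr = []   # low-level relations: is-a arcs, parent class -> child class
    hln = {}   # high-level nodes: one multi-valued variable per concept domain
    for domain, classes in domains.items():
        hln[domain] = sorted(classes)          # the values the HLN can take
        for cls, parent in classes.items():
            lln.append(cls)
            if parent is not None:
                llr.append((parent, cls))
    # high-level relations: one arc per object property, source -> target
    hlr = dict(object_properties)
    return {"LLN": lln, "LLR": llr, "HLN": hln, "HLR": hlr}

# Two of the domains and one object property from the example ontology.
domains = {
    "COMPANY": {"Company": None, "Vendor": "Company", "Jointventure": "Company",
                "Supplier": "Vendor", "Competitor": "Vendor"},
    "PERSON": {"Person": None, "Man": "Person", "Woman": "Person"},
}
structure = compile_structure(domains, {"hasCeo": ("COMPANY", "PERSON")})
print(structure["HLR"])
print(structure["LLR"])
```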
Low Level Relations probability distribution - example Starting from this table, we can compute the probability distribution by applying the Bayes formula in the following form:
High Level Relations Probability Distribution: Example
The example uses two counts:
• the number of triples satisfying the TBox schema <Company, hasCeo, Person>
• the number of instances belonging to the sub-space of Company corresponding to "Company with a CEO"
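Both kinds of table can be estimated by counting ABox instances. The sketch below is one plausible reading of the definitions above, not the authors' exact formula: the instance data, the triples and the function names are hypothetical, and the high-level estimate assumes that each matching triple counts once.

```python
def conditional(instances, cls_a, cls_b):
    """Low-level estimate of P(A|B): share of instances of B that also belong to A."""
    in_b = [types for types in instances.values() if cls_b in types]
    if not in_b:
        raise ValueError(f"no instances of {cls_b}")
    return sum(cls_a in types for types in in_b) / len(in_b)

def relation_conditional(instances, triples, prop, target_cls, source_cls):
    """High-level estimate: among `prop` triples whose subject is in `source_cls`,
    the share whose object is in `target_cls`."""
    matching = [obj for subj, p, obj in triples
                if p == prop and source_cls in instances[subj]]
    if not matching:
        raise ValueError(f"no {prop} triples starting from {source_cls}")
    return sum(target_cls in instances[obj] for obj in matching) / len(matching)

# Hypothetical ABox: each instance with the classes it belongs to.
instances = {
    "acme": {"Company", "Vendor"},
    "globex": {"Company", "Vendor"},
    "initech": {"Company", "Jointventure"},
    "umbrella": {"Company"},
    "alice": {"Person", "Woman"},
    "bob": {"Person", "Man"},
    "carol": {"Person", "Woman"},
}
triples = [("acme", "hasCeo", "alice"),
           ("globex", "hasCeo", "bob"),
           ("initech", "hasCeo", "carol")]

p_vendor = conditional(instances, "Vendor", "Company")
p_woman_ceo = relation_conditional(instances, triples, "hasCeo", "Woman", "Company")
print(p_vendor, p_woman_ceo)
```

Note that "umbrella" has no CEO triple, so it is excluded from the high-level estimate; this corresponds to restricting the count to the "Company with a CEO" sub-space.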
Inference over Bayesian Network (1)
[Figure: network with root nodes A and B, node C with parent B, and node D with parents A and B; tables P(A), P(B), P(C|B), P(D|A,B)]
• Top-down: causal reasoning
  P(D | A) = P(D, B | A) + P(D, ¬B | A)
• Bottom-up: diagnostic reasoning
  P(A | D) = P(D | A) * P(A) / P(D)
• Top-down/bottom-up: explaining-away reasoning
  P(A | B, D) = P(D | B, A) * P(B | A) * P(A) / P(B, D) = P(D, B | A) * P(A) / P(B, D)
Inference over Bayesian networks is, in general, NP-hard.
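The three reasoning patterns can be checked numerically on a small network with the tables P(A), P(B), P(C|B) and P(D|A,B). The sketch below uses brute-force enumeration of the joint distribution, which is only practical for tiny networks. All probability values are hypothetical.

```python
from itertools import product

# Hypothetical CPT entries; each value is the probability of "true".
P_A = 0.3
P_B = 0.6
P_C_GIVEN_B = {True: 0.8, False: 0.1}
P_D_GIVEN_AB = {(True, True): 0.95, (True, False): 0.7,
                (False, True): 0.6, (False, False): 0.05}

def bern(p_true, value):
    """Probability of a boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint(a, b, c, d):
    """Joint probability, factorised along the network structure."""
    return (bern(P_A, a) * bern(P_B, b)
            * bern(P_C_GIVEN_B[b], c) * bern(P_D_GIVEN_AB[(a, b)], d))

def prob(query, evidence=None):
    """P(query | evidence), summing the joint over all consistent assignments."""
    evidence = evidence or {}
    num = den = 0.0
    for values in product([True, False], repeat=4):
        world = dict(zip("ABCD", values))
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(*values)
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
    return num / den

causal = prob({"D": True}, {"A": True})                  # top-down
diagnostic = prob({"A": True}, {"D": True})              # bottom-up
explaining = prob({"A": True}, {"B": True, "D": True})   # explaining away

print(causal, diagnostic, explaining)
```

With these numbers, observing B in addition to D lowers the belief in A, which is the explaining-away effect: B already accounts for D, so A is less needed as an explanation.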
Inference over Bayesian Network (2)
• Polytrees are a class of Bayesian networks for which inference can be solved efficiently, in time linear in the number of nodes. Polytree property: there is a unique path between each pair of nodes.
• For a fixed node D, it is always possible to partition all the other nodes into two disjoint sets:
  - the set over D, which contains the nodes that are connected to D only through the parents of D
  - the set under D, which contains the nodes that are connected to D only through the children of D
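The polytree property can be tested by ignoring arc directions and checking that the resulting undirected graph is a tree. A minimal sketch, assuming a connected network given as a list of distinct (parent, child) arcs; the function name and the example arcs are hypothetical.

```python
def is_polytree(nodes, arcs):
    """True if the undirected skeleton of the network is a tree,
    i.e. exactly one path exists between every pair of nodes."""
    nodes = list(nodes)
    # A tree on n nodes has exactly n - 1 edges.
    if len(arcs) != len(nodes) - 1:
        return False
    neighbours = {n: set() for n in nodes}
    for parent, child in arcs:
        neighbours[parent].add(child)
        neighbours[child].add(parent)
    # With n - 1 edges, the skeleton is a tree exactly when it is connected.
    seen, stack = set(), [nodes[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return len(seen) == len(nodes)

# Four-node example: A and B are parents of D, B is the parent of C.
arcs = [("A", "D"), ("B", "D"), ("B", "C")]
print(is_polytree("ABCD", arcs))
# Adding A -> C creates a second path between A and B (a loop in the skeleton).
print(is_polytree("ABCD", arcs + [("A", "C")]))
```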
Bayesian Query Structure
• The general structure of a probabilistic query is P(QUERY | path EVIDENCE), where:
  - QUERY is a node of the polytree
  - EVIDENCE can be one node over the query, one node under the query, or one of each
• EVIDENCE can refer to:
  - is-a ontology relations (classical Bayesian conditioning; the path is empty)
  - object properties (the Bayesian conditioning is annotated with the path binding the query to the evidence)
[Figure: polytree with QUERY node D, EVIDENCE over QUERY (A, B, C, E) and EVIDENCE under QUERY (F, G, H, I)]
What is the probability that a Patent project is led by a person who is the CEO of a company operating in the financial sector?
P(PROJECT=patent | (leads.hasCeo.hasSector) SECTOR=financial)
[Figure: the extracted network with the query path highlighted: FINANCIAL, hasSector, COMPANY, hasCeo, PERSON, leads, PATENT]
The query is answered in three steps:
1) EVIDENCE FINANCIAL is under QUERY Company: bottom-up inference (Bayes formula)
   P(Company | (hasSector) FINANCIAL) = P(FINANCIAL | (hasSector) Company) * P(Company) * 1 / P(FINANCIAL)
   where P(FINANCIAL) is the normalisation factor.
2) EVIDENCE FINANCIAL is over QUERY Person: top-down inference
   P(Person | (hasCeo.hasSector) FINANCIAL) = P(Person | (hasCeo) Company) * P(Company | (hasSector) FINANCIAL)
3) EVIDENCE FINANCIAL is over QUERY PATENT: top-down inference
   P(PATENT | (leads.hasCeo.hasSector) FINANCIAL) = P(PATENT | (leads) Person) * P(Person | (hasCeo.hasSector) FINANCIAL)
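The chain of one bottom-up step followed by two top-down steps can be traced with plain arithmetic. The sketch below only mirrors the decomposition of the query. Every probability value and every function name is hypothetical.

```python
def bottom_up(likelihood, prior, normaliser):
    """Bayes formula: P(query | evidence) = P(evidence | query) * P(query) / P(evidence)."""
    return likelihood * prior / normaliser

def top_down(conditional, upstream):
    """Chain one relation table onto the belief already computed upstream."""
    return conditional * upstream

# Hypothetical table entries.
p_financial_given_company = 0.4   # P(FINANCIAL | (hasSector) Company)
p_company = 0.5                   # P(Company)
p_financial = 0.25                # P(FINANCIAL), the normalisation factor
p_person_given_company = 0.9      # P(Person | (hasCeo) Company)
p_patent_given_person = 0.25      # P(PATENT | (leads) Person)

# Step 1: the evidence is under the query Company, so apply Bayes.
p_company_given_fin = bottom_up(p_financial_given_company, p_company, p_financial)
# Step 2: the evidence is over the query Person, so chain hasCeo.
p_person_given_fin = top_down(p_person_given_company, p_company_given_fin)
# Step 3: the evidence is over the query PATENT, so chain leads.
p_patent_given_fin = top_down(p_patent_given_person, p_person_given_fin)

print(p_company_given_fin, p_person_given_fin, p_patent_given_fin)
```

Each step reuses the result of the previous one, so the cost of answering the query grows with the length of the path (leads.hasCeo.hasSector) rather than with the size of the whole network.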