Inductive Logic Programming


  1. Inductive Logic Programming Luis Tari

  2. Logic Programming • Consider the following example of a logic program: parent_of(charles,george). parent_of(george,diana). parent_of(bob,harry). parent_of(harry,elizabeth). grandparent_of(X,Y) :- parent_of(X,Z), parent_of(Z,Y). • From the program, we can ask queries about grandparents. • Query: grandparent_of(X,Y)? • Answers: • grandparent_of(charles,diana). • grandparent_of(bob,elizabeth).
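
To make the query on this slide concrete, here is a minimal Python sketch that mimics how the grandparent_of rule joins two parent_of facts on the shared variable Z. It is only an illustration, not how a Prolog engine resolves queries; the names PARENT_OF and the Python function grandparent_of are assumptions introduced for this example.

```python
# Facts from the slide, stored as (parent, child) pairs.
PARENT_OF = {
    ("charles", "george"),
    ("george", "diana"),
    ("bob", "harry"),
    ("harry", "elizabeth"),
}


def grandparent_of(parent_of):
    """Return every (X, Y) with parent_of(X, Z) and parent_of(Z, Y) for some Z."""
    return {
        (x, y)
        for (x, z_left) in parent_of
        for (z_right, y) in parent_of
        if z_left == z_right  # the shared variable Z must match in both literals
    }


# Answers to the query grandparent_of(X,Y)?
for x, y in sorted(grandparent_of(PARENT_OF)):
    print(f"grandparent_of({x},{y}).")
```

The two printed answers are the same two facts listed on the slide.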

  3. What is ILP? • Inductive Logic Programming (ILP) • Automated learning of logic rules from examples and background knowledge • Example: learn the rule for grandparents, given background knowledge of parents and examples of grandparents • ILP can be used for classification and prediction • ILP generates explicit, readable hypotheses, unlike the “black box” approach of classifiers such as SVMs

  4. B U He for every e E+. • B U Hf for every f  E-. • B U H is consistent. Assume that Be for some e  E+. ILP – formal definitions • Given • a logic program B representing background knowledge • a set of positive examples E+ • a set of negative examples E- • Find hypothesis H such that:

  5. Example • Background knowledge B: • parent_of(charles,george). • parent_of(george,diana). • parent_of(bob,harry). • parent_of(harry,elizabeth). • Positive examples E+: • grandparent_of(charles,diana). • grandparent_of(bob,elizabeth). • Generate hypothesis H: • grandparent_of(X,Y) :- parent_of(X,Z), parent_of(Z,Y).
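
The conditions from the formal definition can be checked mechanically on this example. The sketch below is a simplification under stated assumptions: a hypothesis is modelled as a Python function that derives grandparent pairs from the background facts, the NEGATIVES set is hypothetical (the slides list no negative examples), and the consistency condition is not checked separately. All function and variable names are illustrative.

```python
BACKGROUND = {
    ("charles", "george"),
    ("george", "diana"),
    ("bob", "harry"),
    ("harry", "elizabeth"),
}
POSITIVES = {("charles", "diana"), ("bob", "elizabeth")}
# Hypothetical negative examples, added only for illustration.
NEGATIVES = {("charles", "george"), ("diana", "charles")}


def chain_rule(parent_of):
    """H: grandparent_of(X,Y) :- parent_of(X,Z), parent_of(Z,Y)."""
    return {(x, y) for (x, z) in parent_of for (w, y) in parent_of if z == w}


def overgeneral_rule(parent_of):
    """A deliberately wrong H: grandparent_of(X,Y) :- parent_of(X,Y)."""
    return set(parent_of)


def is_acceptable(hypothesis, background, positives, negatives):
    """Check that every positive and no negative example is derivable."""
    derived = hypothesis(background)
    covers_all_positives = positives <= derived
    covers_no_negatives = derived.isdisjoint(negatives)
    return covers_all_positives and covers_no_negatives


print(is_acceptable(chain_rule, BACKGROUND, POSITIVES, NEGATIVES))
print(is_acceptable(overgeneral_rule, BACKGROUND, POSITIVES, NEGATIVES))
```

The first call accepts the hypothesis from the slide; the second rejects the over-general rule because it derives none of the positives and one of the hypothetical negatives.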

  6. ILP systems • Two of the most popular ILP systems: • Progol • FOIL • Progol [Muggleton95] • Developed by S. Muggleton et al. • Learns first-order Horn clauses (no negation in head and body literals of hypotheses) • FOIL [Quinlan93] • Developed by J. Quinlan et al. • Learns first-order rules (no negation in head literals of the hypotheses)

  7. Rule Learning (Intuition) • How do we come up with the rule for grandparent_of(X,Y)? • Take the example grandparent_of(bob,elizabeth). • Find the subset of background knowledge relevant to this example: parent_of(bob,harry), parent_of(harry,elizabeth). • Form a rule from these facts: grandparent_of(bob,elizabeth) :- parent_of(bob,harry), parent_of(harry,elizabeth). • Generalize the rule: grandparent_of(X,Y) :- parent_of(X,Z), parent_of(Z,Y). • Check whether this rule is valid with respect to the positive and negative examples
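
The generalization step above can be sketched as replacing each distinct constant in the ground rule with its own variable, consistently across head and body. This is a naive illustration of the intuition only, not the search that Progol or FOIL performs; the atom encoding as tuples and the helper names generalize and show are assumptions made for this example.

```python
# Variable names handed out in order; this sketch supports at most six constants.
VARIABLE_NAMES = ["X", "Y", "Z", "U", "V", "W"]


def generalize(head, body):
    """Replace every distinct constant with a variable, consistently."""
    mapping = {}

    def lift(atom):
        predicate, *args = atom
        for constant in args:
            if constant not in mapping:
                mapping[constant] = VARIABLE_NAMES[len(mapping)]
        return (predicate, *(mapping[arg] for arg in args))

    return lift(head), [lift(atom) for atom in body]


def show(head, body):
    """Render a rule in Prolog-like syntax."""
    def fmt(atom):
        return f"{atom[0]}({','.join(atom[1:])})"

    return f"{fmt(head)} :- {', '.join(fmt(atom) for atom in body)}."


# Ground rule formed from the example and its relevant background facts.
ground_head = ("grandparent_of", "bob", "elizabeth")
ground_body = [("parent_of", "bob", "harry"), ("parent_of", "harry", "elizabeth")]

print(show(*generalize(ground_head, ground_body)))
```

The printed rule is the generalized clause from the slide. Lifting every constant is not always the right generalization, which is why the final validity check against the examples is still needed.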

  8. Progol Algorithm Outline • Step 1: From a positive example, construct the most specific rule r_s. • Step 2: Based on r_s, find a generalized form r_g of r_s such that score(r_g) is the highest among all candidates. • Step 3: Remove all positive examples that are covered by r_g. • Step 4: Go to step 1 if there are still positive examples that are not yet covered.
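
The outline is a covering loop, which can be sketched in Python as follows. This is not Progol itself: the construction of the most specific rule and the search over its generalizations are replaced by a hypothetical candidates_for function that returns a fixed pool, the NEGATIVES set is invented for illustration, and the score is computed against the still-uncovered positives. The scoring formula is the one given on the next slide.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    text: str                     # human-readable clause
    derive: Callable[[set], set]  # background facts -> derived grandparent pairs
    body_literals: int            # number of literals in the rule body


BACKGROUND = {("charles", "george"), ("george", "diana"),
              ("bob", "harry"), ("harry", "elizabeth")}
POSITIVES = {("charles", "diana"), ("bob", "elizabeth")}
NEGATIVES = {("charles", "george")}  # hypothetical; the slides list none


def chain(parent_of):
    return {(x, y) for (x, z) in parent_of for (w, y) in parent_of if z == w}


def candidates_for(seed):
    """Stand-in for the real search: the seed's ground rule plus a fixed pool."""
    return [
        Rule("most specific rule for the seed", lambda _facts: {seed}, 2),
        Rule("grandparent_of(X,Y) :- parent_of(X,Y).", lambda facts: set(facts), 1),
        Rule("grandparent_of(X,Y) :- parent_of(X,Z), parent_of(Z,Y).", chain, 2),
    ]


def score(rule, positives, negatives):
    derived = rule.derive(BACKGROUND)
    return len(derived & positives) - (len(derived & negatives) + rule.body_literals)


def learn(positives, negatives):
    uncovered, learned = set(positives), []
    while uncovered:                            # step 4: repeat while positives remain
        seed = min(uncovered)                   # step 1: pick a positive example
        best = max(candidates_for(seed),        # step 2: highest-scoring candidate
                   key=lambda rule: score(rule, uncovered, negatives))
        covered = best.derive(BACKGROUND) & uncovered
        if not covered:                         # guard so the sketch always terminates
            break
        learned.append(best)
        uncovered -= covered                    # step 3: remove covered positives
    return learned


print([rule.text for rule in learn(POSITIVES, NEGATIVES)])
```

With this data the loop stops after one iteration, because the chained rule covers both positive examples.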

  9. Scoring hypotheses • score(r) is a measure of how well a rule r explains all the examples, with preference given to shorter rules. • p_r = number of positive examples correctly deducible from r • n_r = number of negative examples incorrectly deducible from r • c_r = number of body literals in rule r • score(r) = p_r - (n_r + c_r)
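
As a worked example of the formula, consider two candidate rules for the grandparent data from the earlier slides. The single negative example grandparent_of(charles,george) is an assumption added for illustration; the slides do not list any negatives.

```latex
\begin{align*}
r_1 &:\ \texttt{grandparent\_of(X,Y) :- parent\_of(X,Z), parent\_of(Z,Y).} \\
    &p_{r_1} = 2,\quad n_{r_1} = 0,\quad c_{r_1} = 2
     \quad\Rightarrow\quad \mathrm{score}(r_1) = 2 - (0 + 2) = 0 \\[4pt]
r_2 &:\ \texttt{grandparent\_of(X,Y) :- parent\_of(X,Y).} \\
    &p_{r_2} = 0,\quad n_{r_2} = 1,\quad c_{r_2} = 1
     \quad\Rightarrow\quad \mathrm{score}(r_2) = 0 - (1 + 1) = -2
\end{align*}
```

The shorter rule r_2 loses despite having fewer body literals, because it covers no positive example and does cover the assumed negative one.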

  10. Applications of ILP • Constructing Biological Knowledge Bases by Extracting Information from Text Sources (M. Craven & J. Kumlien) [Craven99] • The automatic discovery of structural principles describing protein fold space (A. Cootes, S.H. Muggleton, and M.J.E. Sternberg) [Cootes03] • More from UT-ML group (Ray Mooney) • http://www.cs.utexas.edu/~ml/publication/ilp.html

  11. Extraction of relations in biomedical text [Craven99] • Applied on biomedical text • Used FOIL system to learn rules describing relations of interest

  12. Relations of interest

  13. Example of relation from text • We want to extract the following relation: • Sample sentence from biomedical articles:

  14. From parse trees to background knowledge

  15. Example of a learned rule • The first two literals indicate that the phrase referencing the subcellular localization follows the phrase referencing the protein, and there is one phrase separating them • Other literals indicate that the sentence must satisfy a particular Naïve Bayes classifier.

  16. Protein fold space [Cootes03] • Folds of a protein are described “in terms of the spatial and topological arrangements of their regular secondary structure elements.” • Why use ILP? (The Progol system was used.) • “Classification alone will not explain why some types of fold are more prevalent than others or why some potential protein folds are not observed at all.” • Goal: Automatically generate descriptions for fold classes

  17. Protein fold space • One of the rules generated for Rossmann fold: • Fold A belongs to this fold class if: • A has a total number of helices between 3 and 4; • A has helices B of type h and at core positions b respectively; • B contains a glycine in the nterm region; • B contains a glycine in the middle region.

  18. References • [Quinlan93] J. R. Quinlan and R. M. Cameron-Jones. FOIL: A Midterm Report. Proceedings of Machine Learning: ECML-93, 1993. • [Muggleton95] S. Muggleton. Inverse Entailment and Progol. New Generation Computing Journal, 13:245-286, 1995. • [Craven99] M. Craven and J. Kumlien. Constructing Biological Knowledge Bases by Extracting Information from Text Sources. ISMB 99, 1999. • [Cootes03] A. Cootes, S.H. Muggleton, and M.J.E. Sternberg. The automatic discovery of structural principles describing protein fold space. Journal of Molecular Biology, 330(4):839-850, 2003.