Reading to Learn Q3 Review


  1. Reading to Learn Q3 Review • Peter Clark, John Thompson, Tom Jenkins, Phil Harrison, Bill Murray

  2. Agenda • This Seedling and Mobius • Major lessons learned • Reformulations in CPL • Whole 5 pages • Key Sentences • How do other texts compare? • Generics • How to identify “important” text • Principles for an extensible KB • Evaluation discussion • Tuples as another source of knowledge

  3. SRI-Boeing’s Reading to Learn Seedling • Goal: • study issues in learning through reading by working with a reduced version of the problem, namely working with controlled, rather than unrestricted, natural language. The NLP task is factored into two: • full NL → CL, CL → logic • Rationale: • by sidestepping some of the linguistic issues of full NLP, we can focus on knowledge integration issues • methods for full NL → CL can be studied separately from this project

  4. SRI-Boeing’s Reading to Learn Seedling • Approach: • Rewrite 5 pages of chemistry text into our controlled language, CPL • Extend and use our CPL interpreter to generate logic • Integrate this new knowledge with an existing chemistry knowledge base (from the Halo Pilot), which has the new knowledge surgically deleted from it • Evaluate the performance of the CPL-extended KB with the original • Report on the problems encountered and solutions developed

  5. This Seedling in Mobius [Figure: the Mobius components Test Generation, Natural Language Processing, Introspection, and Knowledge Integration, with a marker labeled “This seedling”.]

  6. Summary • Q3: • Completed coding of key sentences in CPL • Demonstration of inference with that knowledge • Study of cues for identifying important text • Assembly of key lessons learned • Interaction with ISI • Exploration of shallow knowledge extraction • Q4: • Finish interpretation of additional sentences • Assemble qualitative and quantitative evaluations • Continue interaction with ISI: Side-by-side study • Final report

  7. Main Results and Messages • With some hand-holding, part of the “Mobius loop” can be done • But: chemistry is a formidable domain • Contributions: • 10 key lessons learned for a larger project • Qualitative and quantitative evaluation data

  8. 10 Key Lessons • Much of the text is irrelevant (“fluff”) • Much important knowledge is conveyed by examples & diagrams • General principles are rarely spelt out clearly • Text is full of ambiguity, metaphor, and metonymy/“loosespeak” • Declarative knowledge may be hidden in procedural descriptions • Text creates disconnected knowledge, which may not chain well • Discourse structure is important • Generic sentences are ubiquitous • Many sentences pose major representational challenges • Traditional KR structures are difficult to extend

  9. Two Reformulations into CPL… • Reformulation of the whole 5 pages into CPL • Approximately 250 sentences • Syntactic conversion + pseudo-logic • generally not inference capable, esp. generics • Re-reformulation of first subsection into explicit if-thens • Inference capable but greater distance from source text • Reformulation of key pieces into CPL • approximately 10 if-then rules • inference capable • barely recognizable from the original source text

  10. Agenda • This Seedling and Mobius • Major lessons learned • Reformulations in CPL • Whole 5 pages • Key Sentences • How do other texts compare? • Generics • How to identify “important” text • Principles for an extensible KB • Evaluation discussion • Tuples as another source of knowledge

  11. Some CPL Rules IF a substance is an acid THEN the substance tastes sour. IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye. IF a substance is a base THEN the substance tastes bitter. IF a substance is a base THEN the substance feels slippery. IF a substance is an acid THEN the substance contains hydrogen. IF a thing is a base THEN the thing is a substance. IF an Arrhenius base contacts water THEN the base emits OH-minus ions in the water. IF an Arrhenius acid is dissolving in water THEN the dissolving is increasing the concentration of H-plus ions in the water. IF an Arrhenius base is dissolving in water THEN the dissolving is increasing the concentration of OH-minus ions in the water IF a substance is a HCl substance THEN the substance is an Arrhenius acid. IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water. IF hydrogen chloride gas is in water THEN the gas reacts with the water.

  12. Reformulation of the 5 pages… • Note: introductory material, flowery language, fluff, complex sentences, parentheticals.

  13. IF a substance is an acid THEN the substance tastes sour. IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye. IF a substance is a base THEN the substance tastes bitter. IF a substance is a base THEN the substance feels slippery.

  14. IF a substance is a HCl substance THEN the substance is an Arrhenius acid. IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water. IF hydrogen chloride gas is in water THEN the gas reacts with the water. HCl is the chemical symbol for hydrogen chloride. IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid. IF a substance is concentrated hydrochloric acid THEN 37 percent of the mass of the substance is HCl. IF a substance is concentrated hydrochloric acid THEN the concentration of HCl in the substance is 12 M. ←(Implied but not explicit)

  15. Surface logical form: the'(e1,x1,e2) & aqueous'(e3,x1) & solution'(e2,x1) & of'(e4,x1,x2) & hcl'(e5,x2) & know'(e6,z1,x1,x3) & as'(e7,e6,x3) & hydrochloric'(e8,x3) & acid'(e9,x3) • CPL: IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid. • Halo KB style: (every Hydrochloric-Acid has-definition (instance-of (Aqueous-Solution)) (has-solute ((a HCl-Substance))))

  16. Summary of Interpretation Challenges • Interpreting generics. • "Acids cause some dyes to change color." • how to handle negation. • "Some substances containing hydrogen are not acids." • "The transfer leaves no undissociated acid molecules" • Vague attributes ("properties", "due to") • "Properties of aqueous solutions of Arrhenius acids are due to H-plus ions" • coreference with nominalizations ("react"/"reaction") • "Hydrogen chloride reacts... The reaction produces..." • naming: how to represent both the name and the symbol for a chemical. • "An aqueous solution of HCl is called hydrochloric acid." • how to get new technical vocabulary + meanings into the system. • "NaOH dissociates in water." • "H2O abstracts the proton from HX" • how to represent definitions. • "Arrhenius acids are defined..." • how to state that one category is more general than another. • "Bronsted-Lowry acids are more general than Arrhenius acids."

  17. Summary of Interpretation Challenges (cont) • how to represent "sometimes". • "An H3O-plus ion sometimes reacts with an H2O molecule." • how to represent modals/tendencies like "can". • "A molecule of a Bronsted-Lowry acid can donate a proton..." • how to represent an argument (proof), and generalize from it. • "Therefore, the H2O molecule acts as a Bronsted-Lowry base." • "Substances with negligible acidity contain hydrogen, but the substances do not behave as acids in water." • vagueness ("is mostly", "nearby", "some") • "The NH4Cl is mostly solid particles." • "Some acids are better proton donors than other acids." • "A weak acid partly transfers the acid's protons to the water." • "Proton-transfer reactions are governed by the relative strengths of the bases" • "The solution has a negligible concentration of HCl molecules." • "An aqueous solution of acetic acid consists mainly of HC2H3O2 molecules" • "The aqueous solution has relatively few H3O-plus ions" • metonymy • "The H2O molecule in Equation 16.5 donates a proton" • "In Equation 16.9 HX dissolves in water." • "Equation 16.9 describes the behavior of a strong acid in water."

  18. Summary of Interpretation Challenges (cont) • definitions with negation. • "An H-plus ion is a proton with no valence electron." • presuppositions • "Acids cause some dyes to change color." • "A Bronsted-Lowry acid always reacts with a nearby Bronsted-Lowry base." • generalized formulae and equations • "In Equation 16.6 the symbol HX denotes an acid." • how to compute and represent differences • "An acid and a base differing only in a proton are called a conjugate pair" • how to handle definite references ("the" base) that haven't been introduced. • "Removing a proton from the acid produces the conjugate base." • change over time • "The HNO2 molecule becomes the NO2-minus ion." • "The H2O molecule changes into the hydronium ion" • "Acids cause some dyes to change color." • semi-malformed sentences • "A stronger acid has a weaker conjugate base." • How to state and represent hypothetical situations. • "Assume that H2O is a stronger base than X-minus in Equation 16.9."

  19. Summary of Interpretation Challenges (cont) • Generalization from examples • “In any reaction we can identify two sets of conjugate acid-base pairs. For example, consider the reaction…” • Information in tables and diagrams

  20. Agenda • This Seedling and Mobius • Major lessons learned • Reformulations in CPL • Whole 5 pages • Key Sentences • How do other texts compare? • How to identify “important” text • Principles for an extensible KB • Evaluation discussion • Tuples as another source of knowledge

  21. Recall from Last Time … • Most of the textbook sentences are “fluff” and examples • and are not needed to solve test questions • A few key sentences (and a table) are the heart of this section of the textbook • and are often given in italics • These key sentences are not worded as precisely as needed for automatic translation into axioms that can chain together to solve a problem • in fact, some parts are not stated at all • students look at diagrams and examples and figure it out

  22. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases

  23. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases

  24. A Key Sentence in Our Textbook • Let’s look at one example of a key sentence: • “From these examples we conclude that in every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” • Restated in Sample Exercise 16.3: • “Thus, the equilibrium favors the direction in which the proton moves from the stronger acid and becomes bonded to the stronger base.” • “In other words, the reaction favors consumption of the stronger acid and stronger base and formation of the weaker acid and weaker base.”

  25. Rewriting a Sentence into CPL Textbook “In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” Naïve Encoding 1 IF there is a reaction AND one base in the reaction is stronger than the other base in the reaction THEN the direction of the reaction is away from the stronger base. [“favors transfer to” → “direction is away from”] Naïve Encoding 2 IF there is a reaction AND there is a base on the left side of the reaction AND there is a base on the right side of the reaction AND the first base is stronger than the second base THEN the direction of the reaction is to the right.

  26. Further Refinement of the CPL Naïve Encoding 2 IF there is a reaction AND there is a base on the left side of the reaction AND there is a base on the right side of the reaction AND the first base is stronger than the second base THEN the direction of the reaction is to the right. “The chemical entity whose formula is on the left side of the equation of the reaction and which plays a base role”

  27. Final CPL Rule That Worked! IF there is an equation of a reaction AND a first chemical entity has a chemical formula AND the first chemical formula is part of the left side of the equation AND the first chemical entity is playing a base role AND a second chemical entity has a second chemical formula AND the second chemical formula is part of the right side of the equation AND the second chemical entity is playing a base role AND the first chemical entity is stronger than the second chemical entity THEN the direction of the reaction is right [to the right] AND the equilibrium side of the reaction is right. [lies on the right] “the base on the LHS” “the base on the RHS” (means “stronger base than”) (UT’s rep. uses Reaction, but should use Equation)
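
The final rule above can be sketched in Python as follows. This is a hedged illustration only: the `Equation` class, the `direction` function, and the encoding of "stronger base than" as a set of ordered pairs are all assumptions made for the example, not the KM structures generated from CPL. The sample reaction (HNO3 + H2O giving H3O-Plus + NO3-Minus) is chosen because the slides' strength table states that H2O is a stronger base than NO3-Minus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equation:
    """Hypothetical stand-in for an equation with a left and a right side."""
    left: frozenset   # chemical formulas on the left side
    right: frozenset  # chemical formulas on the right side


def direction(equation, bases, stronger_base_than):
    """Return "right" if a base on the left side is stronger than a base
    on the right side; return None when the rule does not fire."""
    for first in equation.left & bases:
        for second in equation.right & bases:
            if (first, second) in stronger_base_than:
                return "right"
    return None


equation = Equation(left=frozenset({"HNO3", "H2O"}),
                    right=frozenset({"H3O-Plus", "NO3-Minus"}))
bases = {"H2O", "NO3-Minus"}                 # chemicals playing a base role
stronger = {("H2O", "NO3-Minus")}            # one pair from the strength table
result = direction(equation, bases, stronger)
```

Only the rule that concludes "right" is sketched; the mirrored rule that concludes "left" would swap the two sides.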

  28. Compare Sentence to Final CPL • In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base. • IF there is an equation of a reaction AND a first chemical entity has a chemical formula AND a second chemical entity has a second chemical formula AND the first chemical formula is part of the left side of the equation AND the second chemical formula is part of the right side of the equation AND the first chemical entity is playing a base role AND the second chemical entity is playing a base role AND the first chemical entity is stronger than the second chemical entity THEN the direction of the reaction is right AND the equilibrium side of the reaction is right. • (There’s a 2nd rule like this that concludes the direction is left) not actually used!

  29. KM Generated from CPL • IF • (_Equation7461 equation-of _Reaction7462) • [chem. on LHS] • (|_Chemical Entity7468| has-chemical-formula |_Chemical Formula7469|) • (|_Chemical Formula7469| equal _Part7485) • (_Part7485 is-part-of |_Left Side7483|) • (|_Left Side7483| is-region-of _Equation7461) • [chem. on RHS] • (|_Chemical Entity7475| has-chemical-formula |_Chemical Formula7476|) • (|_Chemical Formula7476| equal _Part7494) • (_Part7494 is-part-of |_Right Side7492|) • (|_Right Side7492| is-region-of _Equation7461) • (|_Chemical Entity7468| plays |_Base Role7501|) • (|_Chemical Entity7475| plays |_Base Role7508|) • (|_Chemical Entity7468| stronger-base-than |_Chemical Entity7475|) • THEN • (_Direction7518 value *right) • (_Direction7518 direction-of _Reaction7462) • (|_Equilibrium Side7524| property *right) • (|_Equilibrium Side7524| equilibrium-side-of _Reaction7462)

  30. Structure of the CPL Axioms 1. Find equilibrium side (or direction) of equation 2. Find out if a chemical is playing a base role in the equation 4. Check whether one base is stronger than another base 4a. Look in Table 3. Find out if a chemical is the conjugate base of another chemical 3b. Check whether one formula differs from another in an H+ 3a. Look in Table, or … (not in CPL)

  31. Notes on our CPL Rule • The wording is way different from the original text! • The literal sentence translation would not have produced anything that could solve a problem, given an equation • “In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” • this would create a Favoring event • the position of the equilibrium is the agent • the transfer of the proton is the object • what does this mean?

  32. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases

  33. How UT Encoded This • "In acid/base equilibrium reactions, the reaction proceeds in the direction of the side where equilibrium lies" [their comment for use in explanations] • (every Reaction has … (direction ( (if (not (the direction of Self)) then (a Direction-Value with (value ((if (the output of (a Compute-Equilibrium-Position with (input (Self)))) then (if ((the output of (a Compute-Equilibrium-Position with (input (Self)))) = (the raw-material of Self)) then *left else *right))))) To find the direction of a reaction… Compute the equilibrium position … If the chemicals match the raw materials Then the direction is left, else right

  34. UT’s Compute-Equilibrium-Position (every Compute-Equilibrium-Position has (input ((a Reaction))) (output ( ;; See if both the strong acid and base are on the LHS. (if (;; Check the acids. ((the output of (a Compare-Relative-Strengths-of-Acids with (input ( (oneof (the raw-material of (the input of Self)) where (the Acid-Role plays of It)) (oneof (the result of (the input of Self)) where (the Acid-Role plays of It)))))) = (oneof (the raw-material of (the input of Self)) where (the Acid-Role plays of It))) and ;; Check the bases. ((the output of (a Compare-Relative-Strengths-of-Bases with (input ( (oneof (the raw-material of (the input of Self)) where (the Base-Role plays of It)) (oneof (the result of (the input of Self)) where (the Base-Role plays of It)))))) = (oneof (the raw-material of (the input of Self)) where (the Base-Role plays of It)))) then (the result of (the input of Self)) else (the raw-material of (the input of Self)))))) If the stronger of… the raw material acid… and the result acid… is the raw material acid… (same for bases) then equilibrium is on the result side else the raw material side
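
For comparison, the procedure described on the last two slides can be paraphrased in Python. This is a sketch under stated assumptions, not UT's code: the function names, the list-based reaction sides, and the numeric rank dictionaries standing in for Compare-Relative-Strengths-of-Acids and Compare-Relative-Strengths-of-Bases are all hypothetical.

```python
def stronger(a, b, rank):
    """Return whichever of a and b has the higher (hypothetical) strength rank."""
    return a if rank[a] >= rank[b] else b


def compute_equilibrium_position(raw_material, result, acids, bases,
                                 acid_rank, base_rank):
    """If the stronger acid and the stronger base are both raw materials,
    equilibrium lies on the result side; otherwise on the raw-material side."""
    raw_acid = next(c for c in raw_material if c in acids)
    result_acid = next(c for c in result if c in acids)
    raw_base = next(c for c in raw_material if c in bases)
    result_base = next(c for c in result if c in bases)
    if (stronger(raw_acid, result_acid, acid_rank) == raw_acid
            and stronger(raw_base, result_base, base_rank) == raw_base):
        return result
    return raw_material


def reaction_direction(raw_material, result, *strength_info):
    """Left if equilibrium lies with the raw materials, else right."""
    position = compute_equilibrium_position(raw_material, result, *strength_info)
    return "left" if position == raw_material else "right"


raw = ["HNO3", "H2O"]
products = ["H3O-Plus", "NO3-Minus"]
acids = {"HNO3", "H3O-Plus"}
bases = {"H2O", "NO3-Minus"}
acid_rank = {"HNO3": 2, "H3O-Plus": 1}   # illustrative ranks only
base_rank = {"H2O": 2, "NO3-Minus": 1}   # H2O stronger than NO3-Minus, per the table
forward = reaction_direction(raw, products, acids, bases, acid_rank, base_rank)
backward = reaction_direction(products, raw, acids, bases, acid_rank, base_rank)
```

Even in this compact form the logic is a procedure over raw materials and results, with no left or right side of an equation and no stronger-base-than relation.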

  35. Notes on UT’s Encoding • Very procedural! • Various procedural methods are encoded • both qualitative and quantitative • Nothing like the textbook sentences • Their representation does not match the natural conceptual model we expected • see the next slide

  36. Mismatches between UT and CPL • UT put a “direction” slot on a Reaction; we expected it to be on an Equation • UT has no model of the left and right sides of an Equation, only the “raw-materials” and “result” slots of a Reaction • UT has a Conjugate-Acid-Base-Pair concept, but lacks the conjugate-base & conjugate-acid relations we expected • UT has no slot for the “equilibrium-side” of an Equation, only the “direction” of a reaction

  37. More Mismatches between UT and CPL • UT gives us no primitives to use for formula manipulation (adding an H+); it’s buried within their Compute-Conjugate-Acid • UT’s model of Formula does not include a “charge” slot; they’ve only attached it to the Chemical itself • UT has no notion of “stronger-base-than”; they only label a chemical with “intensity” = strong or weak. • So, it would help if the conceptual model were closer to natural language!

  38. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases

  39. ISI’s Shallow Logical Form for our Sentence “From these examples we conclude that in every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” from'(e1,e2,x1) & these'(e3,s1,e4) & example'(e4,x1) & plural'(e7,x1,s1) & we'(e8,x2) & plural'(e9,x2,s2) & conclude'(e2,x2,x3,z1) & that'(e10,e2,e11) & in'(e12,e11,x4) & every'(e13,x4,e14) & acid-base'(e15,x4) & reaction'(e14,x4) & the'(e16,x5,e17) & position'(e17,x5) & of'(e18,x5,x6) & the'(e19,x6,e20) & equilibrium'(e20,x6) & favor'(e11,x5,x7,z2) & transfer'(e21,x7) & of'(e22,x7,x8) & the'(e23,x8,e24) & proton'(e24,x8) & to'(e25,x7,x9) & the'(e26,x9,e27) & strong'(e28,x9) & base'(e27,x9)
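
A shallow logical form of this shape is easy to load into a program, which helps show how close it stays to the syntax. The sketch below is an assumption-laden illustration: the regular expression, the function names, and the tuple encoding are invented here and are not ISI's tooling.

```python
import re

# Matches predicates written as name'(arg1,arg2,...), e.g. acid-base'(e15,x4).
PREDICATE = re.compile(r"([\w-]+)'\(([^)]*)\)")


def parse_logical_form(text):
    """Turn the conjunction into a list of (predicate, argument-tuple) pairs."""
    return [(name, tuple(arg.strip() for arg in args.split(",")))
            for name, args in PREDICATE.findall(text)]


def predicates_about(variable, parsed):
    """List the predicate names that mention the given variable."""
    return [name for name, args in parsed if variable in args]


fragment = "to'(e25,x7,x9) & the'(e26,x9,e27) & strong'(e28,x9) & base'(e27,x9)"
parsed = parse_logical_form(fragment)
about_x9 = predicates_about("x9", parsed)
```

Grouping predicates by variable recovers little beyond "x9 is a strong base that something is transferred to", which is essentially the phrase structure of the sentence.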

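  40. Graph of ISI’s Shallow Logical Form [Figure: graph of the logical form; several links are marked “?” in the original.] • z1 = conclude(x2, x3), with from(x1) • x2 = we • x3 = [missing!] • x1 = example, with these • “that” links conclude to z2 = favor(x5, x7), with in(x4) • x4 = reaction, with every(x4), acid-base(x4) • x5 = position, with of(x5, x6); x6 = equilibrium • x7 = transfer, with of(x7, x8) and to(x7, x9); x8 = proton; x9 = base, with strong(x9)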

  41. Notes on ISI’s Shallow Logical Form • Not far removed from a syntactic parse • They plan to do much more development of this • Will probably produce a literal translation • there will be a Favoring event, with agent & object • As with the naïve CPL sentence, a literal translation would not help solve a Chemistry problem

  42. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases

  43. CPL for 2nd Key Sentence • “In any acid-base (proton transfer) reaction we can identify two sets of conjugate acid-base pairs.” • IF there is an equation of a reaction AND a first chemical entity has a chemical formula AND a second chemical entity has a second chemical formula AND the first chemical formula is part of the left side of the equation AND the second chemical formula is part of the right side of the equation AND the first chemical entity is the conjugate base of the second chemical entity THEN the first chemical entity is playing a base role AND the second chemical entity is playing an acid role. • (There’s a 2nd rule like this with first & second reversed)
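
The role-assignment rule can be sketched in Python as below. The function name `assign_roles` and the dictionary mapping each acid to its conjugate base are hypothetical, and the reading of the mirrored second rule (the chemical on the right is the conjugate base of the one on the left) is an assumption, since the slide does not spell it out.

```python
def assign_roles(left, right, conjugate_base_of):
    """Assign base/acid roles to chemicals on opposite sides of an equation.

    conjugate_base_of maps an acid's formula to its conjugate base's formula.
    """
    roles = {}
    for first in left:
        for second in right:
            # Rule from the slide: first is the conjugate base of second.
            if conjugate_base_of.get(second) == first:
                roles[first], roles[second] = "base", "acid"
            # Assumed mirrored rule: second is the conjugate base of first.
            if conjugate_base_of.get(first) == second:
                roles[second], roles[first] = "base", "acid"
    return roles


# Conjugate pairs as given in the slides' CPL table.
conjugates = {"HCl": "Cl-Minus", "H3O-Plus": "H2O"}
roles = assign_roles(["HCl", "H2O"], ["H3O-Plus", "Cl-Minus"], conjugates)
```

For this reaction the sketch finds both conjugate acid-base pairs that the textbook sentence says can be identified.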

  44. UT Code for 2nd Key Sentence (every Chemical has (plays ( (if ((the term of (the atomic-chemical-formula of (the has-basic-structural-unit of Self))) and (not (the Base-Role plays of Self))) then (if ((has-value (oneof (the result of (the Reaction raw-material-of of Self)) where (((the elements of (the term of (the atomic-chemical-formula of (the has-basic-structural-unit of It)))) = (forall2 (the elements of (the term of (the atomic-chemical-formula of (the has-basic-structural-unit of Self)))) (if ((the2 of It2) = H) then (:pair ((the1 of It2) + 1) H) else It2))) or... then (a Base-Role) jump to the other side of the equation! Reaction result raw-material Chemical Chemical “IF one of the chemicals on the other side of the reaction…” “… has an extra H” “…THEN this chemical’s a base”

  45. Overview • 4 key pieces of knowledge in the Section: • Computing the direction of the reaction • Rewriting in CPL • Compare to UT’s KM encoding • Compare to ISI’s shallow logical form • Identifying the acids/bases in a reaction • Computing the conjugate of an acid/base • Comparing the strengths of two acids/bases • These last two items are presented in a table

  46. Conjugate Acid-Base Pairs • Textbook: [table, not captured in this transcript] • CPL: IF there is an HCl and a Cl-Minus THEN the conjugate base of the HCl is the Cl-Minus. IF there is an H3O-Plus and an H2O THEN the conjugate base of the H3O-Plus is the H2O. Etc.
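
As an alternative to listing every pair, the check "one formula differs from another in an H+" can be computed. The sketch below assumes a formula is represented as a pair of element counts and net charge; that representation and both function names are hypothetical, chosen to show the kind of formula-manipulation primitive discussed earlier.

```python
def remove_proton(formula):
    """Return the formula left after removing one H+ (one H, one unit of charge)."""
    counts, charge = formula
    if counts.get("H", 0) < 1:
        raise ValueError("formula has no hydrogen to remove")
    new_counts = dict(counts)
    new_counts["H"] -= 1
    if new_counts["H"] == 0:
        del new_counts["H"]
    return new_counts, charge - 1


def is_conjugate_base(base, acid):
    """True if base is exactly acid minus one proton."""
    if acid[0].get("H", 0) < 1:
        return False
    return remove_proton(acid) == base


HCL = ({"H": 1, "Cl": 1}, 0)
CL_MINUS = ({"Cl": 1}, -1)
H3O_PLUS = ({"H": 3, "O": 1}, 1)
H2O = ({"H": 2, "O": 1}, 0)
```

With this representation, `is_conjugate_base(CL_MINUS, HCL)` and `is_conjugate_base(H2O, H3O_PLUS)` both hold, matching the two CPL rules above. Note that the charge has to live on the formula for the comparison to work.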

  47. Relative Strengths of Bases • Textbook: [table, not captured in this transcript] • CPL: IF there is a Cl-Minus and an HSO4-Minus THEN the HSO4-Minus is a stronger base than the Cl-Minus. IF there is an HSO4-Minus and an NO3-Minus THEN the NO3-Minus is a stronger base than the HSO4-Minus. IF there is an NO3-Minus and an H2O THEN the H2O is a stronger base than the NO3-Minus. Etc.
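
The pairwise rules above only relate neighbouring entries of the table. A hedged Python sketch of how the remaining comparisons could be derived follows; it assumes that "stronger base than" is transitive, which the slides do not state, and the function name `transitive_closure` is illustrative.

```python
def transitive_closure(pairs):
    """Extend (stronger, weaker) pairs with every pair implied by transitivity."""
    closure = set(pairs)
    while True:
        implied = {(a, d)
                   for (a, b) in closure
                   for (c, d) in closure
                   if b == c} - closure
        if not implied:
            return closure
        closure |= implied


# (stronger base, weaker base) pairs taken from the three CPL rules above.
STATED = {
    ("HSO4-Minus", "Cl-Minus"),
    ("NO3-Minus", "HSO4-Minus"),
    ("H2O", "NO3-Minus"),
}
ALL_PAIRS = transitive_closure(STATED)
```

With the closure in hand, a comparison such as H2O against Cl-Minus no longer needs its own rule.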

  48. Lessons from Key Sentences - 1 • The key sentences did not translate literally into useful logic • they had to be carefully rewritten in CPL • and knowledge was added from studying diagrams and examples • and they were tested with each other to chain together • It was difficult to make use of the UT representations • they were very procedural • their representations were further removed from the English • so, we should use more natural representations • ISI’s shallow logical forms may produce literal translations • again, not useful for solving problems