CPSC 503 Computational Linguistics

CPSC 503 Computational Linguistics
Natural Language Generation
Lecture 15
Giuseppe Carenini
CPSC503 Winter 2008

Knowledge-Formalisms Map for Generation
[Figure: the knowledge-formalisms map, with arrows for the Understanding and Generation directions; generation runs from intended meaning to discourse (English). Knowledge levels: Morphology, Syntax, Semantics, Pragmatics, Discourse and Dialogue. Formalisms: State Machines, Rule systems (features and unification), Logical formalisms (First-Order Logics), AI planners.]

NLG Systems (see handout)
Inputs to an NLG system:
• Communicative Goals
• Domain Knowledge
• Context Knowledge
Output: Text
Examples
• FOG – Input: numerical data about future weather. Output: textual weather forecasts
• IDAS – Input: KB describing a piece of machinery (e.g., a bike), user’s level of expertise. Output: hypertext help messages
• ModelExplainer – Input: OO model. Output: textual description of information on aspects of the model
• STOP – Input: user history and attitudes toward smoking. Output: personalized smoking-cessation letter

GEA: the Generator of Evaluative Arguments
- NLG pipeline architecture
- Research methodology
- Extrinsic evaluation (vs. intrinsic)
- Research / commercialization cycle

Four Basic Types of Persuasive Text: “Arguments” (by main claim)
• Factual Argument (e.g., Canada is the only country outside of Asia to record SARS-related deaths…)
• Causal Argument (e.g., Travelers from Hong Kong brought SARS to Toronto…)
• Recommendation (e.g., You should not go to China in the next few weeks…)
• Evaluative Argument (e.g., Some Asian governments were inefficient in stopping the SARS outbreak…)

Sample Textual Evaluative Arguments
• Single entity: House-A is great! Although it is somewhat old, the house is spacious and is in an excellent location.
• Comparison: Vancouver is better than Seattle. There is less crime. Also, social services are more accessible.

Evaluative Arguments: Importance
• Natural Language Generation theory: a model of an argument type that is pervasive in natural human communication
• The ability to generate evaluative arguments is crucial in large classes of systems:
  • Personal assistants (e.g., travel advisor)
  • Recommender systems (e.g., movie, book)
  • Tutoring systems
  • …

Limitations of Previous Research
[Ardissono and Goy 99] [Chu-Carroll and Carberry 98] [Elhadad 95] [Kolln 95] [Klein 94] [Morik 89]
• Focus on specific aspects of generation
  • Selection of content
  • Realization of content into language
• Lack of systematic evaluation
  • Proof-of-concept systems
  • Analyzed on a few examples

Methodology
• Develop generator of evaluative arguments
  • complete
  • integrate and extend previous work
• Develop evaluation framework
• Perform experiment within framework to test generator

Outline
• Generator of Evaluative Arguments (GEA)
• Evaluation Framework
• Experiment
• More recent results from others

Generator Architecture
Content Selection and Organization
• Inputs: Knowledge Sources (User Model, Domain Model) and Communicative Strategies
• Text Planner → Text Plan
Content Realization
• Inputs: Linguistic Knowledge Sources (Lexicon, Grammar)
• Text Plan → Text Micro-planner → Sentence Generator → English

GEA User Model
• Argumentation Theory tells us [Miller 96, Mayberry 96]:
  • Supporting (opposing) evidence depends on values and preferences of audience
  • Evidence arranged according to importance (i.e., strength of support or opposition)
  • Concise: only important evidence included
• User Model must:
  • Represent values and preferences of user
  • Enable identification of supporting and opposing evidence
  • Provide measure of evidence importance
• … and can be elicited in practice

Model of User’s Preferences
• Additive Multi-attribute Value Function (AMVF)
• Decision Theory and Psychology (Consumer’s Behavior)
• Can be elicited in practice [Edwards and Barron 1994]
[Figure: tree of OBJECTIVES rooted at House Value, with objectives Location, Neighborhood, Park-Distance, Amenities, Deck-Size and Porch-Size; edge weights 0.4, 0.7, 0.6, 0.3, 0.8, 0.2; COMPONENT VALUE FUNCTIONS attached to the leaf objectives.]
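
A minimal sketch of how an additive multi-attribute value function can be evaluated: each leaf objective maps an attribute to a value in [0, 1] through a component value function, and each internal objective takes the weighted sum of its children. The tree shape, the assignment of weights to branches, and all four component value functions below are assumptions made for illustration; the extracted figure does not pin them down.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Objective:
    name: str
    weight: float = 1.0  # weight relative to siblings; sibling weights sum to 1
    children: List["Objective"] = field(default_factory=list)
    value_fn: Optional[Callable[[object], float]] = None  # leaves only


def value(obj, entity):
    """Value in [0, 1] of `entity` (a dict of attributes) for objective `obj`."""
    if not obj.children:
        return obj.value_fn(entity[obj.name])
    return sum(child.weight * value(child, entity) for child in obj.children)


# Hypothetical tree and component value functions (assumptions, not the slide's).
house_value = Objective("House Value", children=[
    Objective("Location", 0.6, children=[
        Objective("Neighborhood", 0.7,
                  value_fn=lambda n: {"Westend": 0.9}.get(n, 0.5)),
        Objective("Park-Distance", 0.3,
                  value_fn=lambda km: max(0.0, 1.0 - km)),
    ]),
    Objective("Amenities", 0.4, children=[
        Objective("Deck-Size", 0.8, value_fn=lambda m2: min(1.0, m2 / 80)),
        Objective("Porch-Size", 0.2, value_fn=lambda m2: min(1.0, m2 / 60)),
    ]),
])

house_a = {"Neighborhood": "Westend", "Park-Distance": 0.5,
           "Deck-Size": 20, "Porch-Size": 36}

print(round(value(house_value, house_a), 3))
```

Because the model is additive, the value of any subtree (for example Location alone) can be computed with the same function, which is what lets the generator reason about evidence at every level of the tree.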

AMVF application
[Figure: the AMVF applied to House-A (Neighborhood: Westend; Park-Distance: 0.5 km; Deck-Size: 20 m2; Porch-Size: 36 m2). Edge weights 0.4, 0.7, 0.6, 0.3, 0.8, 0.2; values shown on the nodes: 0.78, 0.6, 0.9, 0.32, 0.25, 0.6, 0.64, 0.5. Legend: + likes it, - does not like it.]

Supporting and Opposing Evidence
o    Parent(o)    relation
+    +            supporting
-    -            supporting
-    +            opposing
+    -            opposing
Legend: + likes it, - does not like it
[Figure: the AMVF for House-A from the previous slide, with each objective marked + or - and labelled as supporting or opposing evidence.]

Measure of Importance [Klein 94]
[Figure: the AMVF for House-A with an importance score on each piece of evidence (0.24, 0.55, 0.54, 0.2, 0.6, 0.12), plus a plot over the objective value v_o with axis marks at 0, 0.5 and 1. Legend: + likes it (supporting), - does not like it (opposing).]
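
A small sketch of how evidence could be labelled and scored from an AMVF. The supporting/opposing rule follows the o / Parent(o) relation (same sign means supporting, different sign means opposing). Two things are assumptions here, since the slides do not spell them out: "likes it" is taken to mean a value above the 0.5 midpoint, and importance is taken to be the weight times the distance of the value from the opposite extreme, `w * max(v, 1 - v)`. The function names are hypothetical.

```python
def likes(v, midpoint=0.5):
    """Assumption: the user 'likes' an objective when its value exceeds 0.5."""
    return v > midpoint


def evidence(parent_value, children):
    """Label and score each child objective as evidence about its parent.

    children: list of (name, weight, value) tuples.
    Returns a list of (name, relation, importance) tuples.
    """
    result = []
    for name, weight, v in children:
        # Same sign as the parent: supporting; different sign: opposing.
        relation = "supporting" if likes(v) == likes(parent_value) else "opposing"
        # Assumed importance measure: heavier weights and more extreme
        # values make a piece of evidence more important.
        importance = weight * max(v, 1.0 - v)
        result.append((name, relation, importance))
    return result


for item in evidence(0.32, [("Deck-Size", 0.8, 0.25), ("Porch-Size", 0.2, 0.6)]):
    print(item)
```

With a disliked parent (value 0.32), the small deck counts as supporting evidence for that negative evaluation, while the decent porch counts as opposing evidence.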

Why AMVF? - summary
An AMVF
• Represents user’s values and preferences
• Enables identification of supporting and opposing evidence
• Provides measure of evidence importance
  • Evidence arranged according to importance
  • Concise arguments can be generated
• Can be elicited in practice

GEA Architecture
Content Selection and Organization
• Inputs: Knowledge Sources (User Model: AMVF; Domain Model) and Communicative Strategies
• Text Planner → Text Plan
Content Realization
• Inputs: Linguistic Knowledge Sources (Lexicon, Grammar)
• Text Plan → Text Micro-planner → Sentence Generator → English

Argumentative Strategy
Based on guidelines from argumentation theory [Miller 96, Mayberry 96]
Selection: include only “important” evidence (i.e., above threshold on measure of importance)
Organization:
(1) Main Claim (e.g., “This house is interesting”)
(2) Opposing evidence
(3) Most important supporting evidence
(4) Further supporting evidence, ordered by importance with strongest last
Strategy applied recursively on supporting evidence
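
The selection and organization rules can be sketched as a single function over scored evidence. This is an illustration under stated assumptions, not GEA's implementation: the tuple layout, the function name `organize`, keeping opposing evidence in input order, and omitting the recursive step are all simplifications introduced here.

```python
def organize(main_claim, evidence, threshold):
    """Return the ordered content of a one-level evaluative argument.

    evidence: list of (text, relation, importance) tuples, where relation
    is "supporting" or "opposing".
    """
    # Selection: keep only evidence above the importance threshold.
    kept = [e for e in evidence if e[2] > threshold]
    opposing = [e for e in kept if e[1] == "opposing"]
    supporting = sorted((e for e in kept if e[1] == "supporting"),
                        key=lambda e: e[2], reverse=True)

    plan = [main_claim]                       # (1) main claim
    plan += [e[0] for e in opposing]          # (2) opposing evidence
    if supporting:
        plan.append(supporting[0][0])         # (3) most important support
        # (4) the rest in ascending importance, so the strongest comes last
        plan += [e[0] for e in sorted(supporting[1:], key=lambda e: e[2])]
    return plan


print(organize(
    "House-A is great",
    [("it is somewhat old", "opposing", 0.3),
     ("it is spacious", "supporting", 0.5),
     ("it is in an excellent location", "supporting", 0.9),
     ("it has a garden", "supporting", 0.4),
     ("it has a porch", "supporting", 0.05)],
    threshold=0.1,
))
```

Raising the threshold drops more evidence and yields a more concise argument, which is the knob the experiment later varies.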

Sample GEA Text Plan
EVALUATIVE ARGUMENT
  MAIN-CLAIM: (VALUE (House-A) 0.72)
  SUPPORTING EVIDENCE
    SUB-CLAIM: (VALUE (Location) 0.7)
    OPPOSING EVIDENCE: (VALUE (distance-from-park 1.8m) 0.3)
    SUPPORTING EVIDENCE: (VALUE (distance-from-rap-trans 0.5 mi) 0.75) (VALUE (distance-from-work 1mi) 0.75)
Legend (link types in the figure): decomposition, ordering, rhetorical relations

GEA Architecture
Content Selection and Organization
• Inputs: Knowledge Sources (User Model: AMVF; Domain Model) and Communicative Strategies (Argumentative Strategy)
• Text Planner → Text Plan
Content Realization
• Inputs: Linguistic Knowledge Sources (Lexicon, Grammar)
• Text Plan → Text Micro-planner → Sentence Generator → English

Text Micro-Planner
• Aggregation: combining multiple propositions in one single sentence [Shaw 98]
• Lexicalization:
  • Scalar Adjectives (e.g., nice, far, convenient) [Elhadad 93]
  • Discourse cues (e.g., although, because, in fact) [Knott 96; Di Eugenio, Moore and Paolucci 97]
• Pronominalization: deciding whether to use a pronoun to refer to an entity (centering [Grosz, Joshi and Weinstein 95])

Aggregation (Logical Forms)
• Conjunction via shared participants
  “House B-11 is far from a shopping area” + “House B-11 is far from public transportation” = “House B-11 is far from a shopping area and public transportation”.
• Syntactic embedding
  “House B-11 offers a nice view” + “House B-11 offers a view on the river” = “House B-11 offers a nice view on the river”.
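
Conjunction via shared participants can be illustrated with a toy sketch. It works on (subject, predicate, object) string triples rather than on logical forms, which is a simplification assumed here for brevity; the function name `aggregate` is hypothetical.

```python
def aggregate(propositions):
    """Merge propositions that share subject and predicate into one sentence.

    propositions: list of (subject, predicate, object) string triples.
    """
    groups = {}  # insertion-ordered: (subject, predicate) -> list of objects
    for subject, predicate, obj in propositions:
        groups.setdefault((subject, predicate), []).append(obj)

    sentences = []
    for (subject, predicate), objects in groups.items():
        if len(objects) == 1:
            joined = objects[0]
        else:
            # Conjoin the non-shared participants: "a, b and c".
            joined = ", ".join(objects[:-1]) + " and " + objects[-1]
        sentences.append(f"{subject} {predicate} {joined}.")
    return sentences


print(aggregate([
    ("House B-11", "is far from", "a shopping area"),
    ("House B-11", "is far from", "public transportation"),
]))
```

Syntactic embedding needs access to the internal structure of the noun phrase (merging "nice" and "on the river" as modifiers of the same head), so it is not covered by this string-level sketch.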

Scalar Adjectives Selection
Example for HOUSE-LOCATION: “The house has an excellent location”
Value > 0.8            … an excellent …
0.65 < Value < 0.8     … a convenient …
0.5 < Value < 0.65     … a reasonable …
0.35 < Value < 0.5     … an average …
0.2 < Value < 0.35     … a bad …
Value < 0.2            … a terrible …
Attributes shown in the figure: HOUSE-LOCATION, HAS_PARK_DISTANCE, HAS_COMMUTING_DISTANCE, HAS_SHOPPING_DISTANCE, HOUSE-AMENITIES, …
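
The value bands for HOUSE-LOCATION translate directly into a threshold lookup. A minimal sketch: the slide uses strict inequalities and leaves the boundary values unassigned, so placing a boundary value in the lower band is an assumption made here, as are the function names.

```python
# (lower bound, adjective) pairs for HOUSE-LOCATION, highest band first.
LOCATION_BANDS = [
    (0.8, "excellent"),
    (0.65, "convenient"),
    (0.5, "reasonable"),
    (0.35, "average"),
    (0.2, "bad"),
]


def location_adjective(value):
    """Pick the scalar adjective for a location value in [0, 1]."""
    for lower, adjective in LOCATION_BANDS:
        if value > lower:  # assumption: a boundary value falls in the lower band
            return adjective
    return "terrible"


def location_sentence(value):
    adjective = location_adjective(value)
    article = "an" if adjective[0] in "aeiou" else "a"
    return f"The house has {article} {adjective} location."


print(location_sentence(0.9))
print(location_sentence(0.7))
```

Each attribute (park distance, commuting distance, amenities, and so on) would carry its own table of bands and adjectives; only the location table is given on the slide.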

Discourse Cues Selection
Selection features: Rel-type, Type-of-nesting, Typed-ordering → Discourse cue
• Although (placed on contributor): typed-ordering ("CORE" "CONCESSION" "EVIDENCE")
• However (placed on core): typed-ordering ("CONCESSION" "CORE" "EVIDENCE")
• Even though (placed on contributor): typed-ordering ("CORE" "CONCESSION" "EVIDENCE")
Feature values shown in the figure: CONCESSION, ROOT, EVIDENCE, SEQUENCE

Simple Pronominalization Strategy inspired by Centering Theory
Centering tells us: the entity providing the link is preferentially realized as a pronoun (within a discourse segment)
• Our Strategy:
  • Within a discourse segment, successive references are always pronouns
  • First reference in a segment is a definite description, unless
    • the segment boundary is explicitly marked by a discourse cue, and
    • no pronoun was used in the previous sentence
• Example: “House B-11 is an interesting house. In fact, it has a reasonable…”.
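
One way to read this strategy is as a three-way decision per reference. The sketch below encodes that reading; the function name and its boolean parameters are hypothetical, and the interpretation of "unless" (both conditions must hold for the first reference to become a pronoun) is an assumption.

```python
def referring_form(first_in_segment,
                   boundary_marked_by_cue=False,
                   pronoun_in_previous_sentence=False):
    """Choose between a pronoun and a definite description for a reference."""
    if not first_in_segment:
        # Successive references within a segment are always pronouns.
        return "pronoun"
    if boundary_marked_by_cue and not pronoun_in_previous_sentence:
        # The cue already signals the new segment, so a pronoun is safe.
        return "pronoun"
    return "description"


# "House B-11 is an interesting house. In fact, it has a reasonable ..."
# Sentence 1: first reference, no cue -> definite description ("House B-11").
print(referring_form(first_in_segment=True))
# Sentence 2: new segment marked by "In fact", no pronoun before -> "it".
print(referring_form(first_in_segment=True, boundary_marked_by_cue=True))
```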

Output of MicroPlanning
Sequence of Lexicalized Functional Descriptions (LFDs)
Example: “House B-11 is close to shops and reasonably close to work”
((CAT CLAUSE) (PROCESS ((TYPE ASCRIPTIVE) (MODE ATTRIBUTIVE)((POLARITY POSITIVE(EPISTEMIC-MODALITY NONE))) (PARTICIPANTS ((CARRIER ((CAT NP)(COMPLEX APPOSITION) (RESTRICTIVE YES) (DISTINCT ((AND ((CAT COMMON)(DENOTATION ZERO-ARTICLE-THING)(HEAD ((LEX "house")))) ((CAT PROPER) (LEX "B-11")))(CDR NONE)))) (ATTRIBUTE (AND((CAT AP)(HEAD ((CAT ADJ)(LEX "close"))) (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP((CAT COMMON) (NUMBER PLURAL)(DEFINITE NO) (HEAD ((CAT NOUN) (LEX "shop"))))))))) ((CAT AP)(HEAD ((CAT ADJ)(LEX "reasonably close"))) (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP ((CAT COMMON)(DEFINITE NO) (HEAD ((CAT NOUN)(LEX "work"))))))) )))))))))))

Last Step: Sentence Generator
• Unify LFDs with large grammar of English (FUF/SURGE [Elhadad 93, Robin 94])
  • fill in syntactic constraints (e.g., agreement, ordering)
  • choose closed class words (e.g., prepositions, articles)
• Apply morphology
• Linearize as English sentences

GEA Highlights
• GEA implements a computational model of generating evaluative arguments
• All aspects covered in a principled way:
  • argumentation theory (argumentative strategy and requirements on user model)
  • decision theory (user model and elicitation method)
  • computational linguistics (architecture, micro-planning techniques and sentence generator)

Evaluation Framework: Task Efficacy
Subtask 1: user is presented with info about a set of alternatives
• Select preferred N alternatives
• Order them by preference
• Result: Hot List (1st best, 2nd best, …, nth best)
Subtask 2: a NewInstance is created and the user is presented with an evaluative argument about it
• Include NewInstance in the Hot List?
  • YES: where (1st best, 2nd best, …, nth best)?
  • NO: end
Finally: fill out final questionnaire

Selection Task in Real-Estate
• Why Real-Estate?
  • No sophisticated background or expertise
  • But still presents challenging decision task
• Instructions
  • Move to new town
  • Buy house
  • Use system for data exploration

Data Exploration System

Argument is presented…

Measures of Effectiveness
• Behavior and Attitude change
  • Record of user actions
    • Whether or not the user adopts the new instance
    • Position in Hot List
  • Final Questionnaire (→ Satisfaction Z-score)
    • How much the user likes the new instance
    • How much the user likes the instances in the Hot List
• Others (Final questionnaire)
  • Decision Confidence
  • Decision Rationale
SAMPLE SELF-REPORT
How would you judge the new house? The more you like the house the closer you should put a cross to “good choice”
bad choice: ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
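
The slide does not give the formula for the satisfaction z-score. A plausible sketch, assuming it standardizes the user's self-reported rating of the new instance against that same user's ratings of the Hot List instances; the use of the sample standard deviation and the function name are assumptions as well.

```python
import statistics


def satisfaction_z_score(new_instance_rating, hot_list_ratings):
    """Standardize the new instance's rating against the Hot List ratings."""
    mean = statistics.mean(hot_list_ratings)
    spread = statistics.stdev(hot_list_ratings)  # assumption: sample std dev
    if spread == 0:
        raise ValueError("Hot List ratings have no variance; z-score undefined")
    return (new_instance_rating - mean) / spread


# Ratings on the 9-point self-report scale (1 = bad choice, 9 = good choice).
print(satisfaction_z_score(9, [6, 7, 8]))
```

Standardizing per user compensates for people using the rating scale differently: a positive score means the user rates the new instance above what they already chose, in units of their own rating spread.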

Two Empirical Questions
• Argument content, structure and phrasing are tailored to the user-specific AMVF, but…
  • Does this tailoring actually contribute to argument effectiveness?
• Arguments should be concise. Conciseness can be varied, but…
  • What is the optimal level of conciseness?

Experimental Conditions
• Tailored-Concise (~ 50% of objectives)
• Tailored-Verbose (~ 80% of objectives)
• Non-Tailored-Concise (~ 50% of objectives)
• No-Argument

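Experimental Hypotheses
[Figure: hypothesized effectiveness relations among the four conditions (Tailored-Verbose, Tailored-Concise, Non-Tailored-Concise, No-Argument); some pairs are marked “>” and others are left open (“?”).]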

Experimental Procedure
40 subjects (10 for each condition)
• PHASE 1
  • Online questionnaire to acquire preferences (AMVF: 19 objectives, 3 layers) [Edwards and Barron 1994]
• PHASE 2
  • Subject is randomly assigned to a condition
  • Interacts with evaluation framework
  • Fills out questionnaire

AMVF used in the experiment
[Figure: objectives tree in three layers.
Layer 1: House-value.
Layer 2: Location, Neighborhood, Amenities, Quality.
Layer 3: Distance-park, Distance-rapid-trans, Distance-work, Distance-shopping, Street-traffic, #-of-bars, Crime, Garden-size, Deck-size, Porch-size, Appearance-quality, Architectural-style, View-quality, View-object.
Attribute values shown: Architectural-style (modern, deco, victorian); View-object (river, park, houses, university).]

Experiment Results
• Satisfaction Z-score
• Decision Confidence
• Decision Rationale

Results: Satisfaction Z-score (Dunnett test)
[Figure: pairwise comparisons among Tailored-Verbose, Tailored-Concise, Non-Tailored-Concise and No-Argument, with three “>” relations labelled p=0.02, p=0.08 and p=0.08; other values shown: 0.05, 0.28, 1, 0.28.]

Summary
• Generator of Evaluative Arguments (GEA): generates concise arguments tailored to a model of the user’s preferences (AMVF)
• Evaluation Framework
  • Basic decision tasks
  • Evaluate wide range of generation techniques
• Experiment
  • Differences in conciseness influence effectiveness
  • Tailoring to AMVF seems to be effective

Future Work (in 2001!)
• Extend Argument Generator
  • More Complex Textual Arguments
  • Speech
  • Other domains
  • Other languages
  • Arguments combining text and graphics
• More Experiments to test:
  • Whether tailoring to AMVF is actually effective
  • Extensions
Slide annotation: AT&T MATCH system

Multimodal Access to City Help (MATCH)
(AT&T: Johnston, Ehlen, Bangalore, Walker, Stent, Maloor and Whittaker 2002)
• Multimodal interface
• Portable Fujitsu tablet
• Input: pen for deictic gestures, and speech
• Output: text, speech and graphics

MATCH Example
• User: “Show me Italian restaurants in the West Village”
• User: “Compare”
  • Comparison: evaluative argument comparing at most five alternatives (reasons for choosing each of them)
• User: “Recommend”
  • Recommendation: evaluative argument about the best alternative
MATCH generates responses using techniques inspired by GEA

MATCH Evaluation [CogSci 2004]
• 16 subjects “overheard” 4x2 dialogues each about selecting a restaurant
• In each dialogue 6 arguments are generated (3 tailored and 3 non-tailored)
• Subjects rate each argument’s information quality on a 0-5 scale: “…is easy to understand and it provides exactly the info I am interested in when choosing a restaurant”
• 768 judgments (16 subjects × 8 dialogues × 6 arguments), vs. 36 in our experiment
Result: tailored preferred (p < .05)

Commercial application
• Product by CoGenTex (an NLG company) in 2003
• Recommender: explaining product recommendations for Active Decisions, the leading provider of web-based guided-selling solutions

Conclusions
• Computational framework for generating and testing user-tailored evaluative arguments:
  • Argumentation Theory
  • Decision Theory
  • Computational Linguistics
  • Interactive Data Exploration
  • Social Psychology
• Independent experiments indicate that the proposed tailoring influences users’ behavior and attitudes
