Gaiku Generating Haiku with Word Associations Norms

Gaiku: Generating Haiku with Word Associations Norms. Yael Netzer, David Gabay, Yoav Goldberg and Michael Elhadad, Department of Computer Science, Ben Gurion University of the Negev, Israel. CALC’09, May 35th 2009.

Presentation Transcript


  1. Gaiku: Generating Haiku with Word Associations Norms. Yael Netzer, David Gabay, Yoav Goldberg and Michael Elhadad, Department of Computer Science, Ben Gurion University of the Negev, Israel. CALC’09, May 35th 2009

  2. Creativity: “the forming of associative elements into new combinations which either meet specified requirements or are in some way useful…” [Mednick 1969]. Three main paths to a creative solution:
     • serendipity
     • similarity
     • mediation

  3. WAN Computational Creativity Poetry Generating Haiku!

  4. Haiku

  5. Haiku
     • A form of poetry that originated in Japan in the 16th century
     • Three lines of 5, 7, 5 phonetic units (mora)
     • Uses the present tense and no judgmental words
     • Adopted in Western languages in the 20th century
     • 5, 7, 5 → 3 short lines
     • Traditionally refers to nature and the seasons, but modern Haiku are not restricted
     • Basho Haiku: 古池や蛙飛込む水の音
       old pond . . . / a frog leaps in / water’s sound

  6. Example Haiku
     • fishing guides boat in the background a new trip
     • iced over pond I skip a rock the entire width
     • a holy cow a carton of milk seeking a church
     • blind snakes on the wet grass tombstoned terror
     • blossomless but not unloved the old magnolia
     • first date — the little pile of anchovies

  7. Poetry Generation

  8. Body / Soul

  9. Body (Structure) / Soul

  10. Body (Structure) / Soul (Content)

  11. Body: 3 lines, Grammatical, Haiku-like / Soul: Inspiring, Interesting, Intriguing, Joyful, …

  12. Previous works • Manurung [2003] • Manurung et al. [2000] • Gervas [2001]. Emphasis on Structure, less on Content

  13. Body / Structure • Haiku Corpus • ~3,500 Haiku in English • Various sources: • amateurish sites • children’s writings • translations of classic Japanese Haiku by Basho and others • ‘official’ sites of Haiku associations (e.g., Haiku Path - Haiku Society of America).

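  14. Body / Structure: the lines of the Haiku corpus are POS tagged, and the tag pattern of each line is counted.
     • Tagged lines (examples): NN IN_of NNP; DT_a NN IN_of; NNS NN NN; NNS CC NNS; IN_on DT_a NN NN; …
     • Line 1 Patterns (count, pattern): 280 JJ NN; 276 NN NN; ...
     • Line 2 Patterns: 64 DT_the JJ NN; …
     • Line 3 Patterns: …
     • Pattern Transitions: P(line2 == DT_the NN | line1 == JJ NN) = ... …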
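The counting step on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the toy `tagged_haiku` corpus and the function names are hypothetical, and the transition probability is estimated as a plain relative frequency, which the slide does not specify.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each Haiku is a tuple of three POS patterns,
# one per line (the real corpus has about 3,500 Haiku).
tagged_haiku = [
    ("JJ NN", "DT_the JJ NN", "NN IN_of NNP"),
    ("JJ NN", "DT_the NN", "NNS CC NNS"),
    ("NN NN", "DT_the JJ NN", "NNS NN NN"),
    ("JJ NN", "DT_the JJ NN", "NNS CC NNS"),
]


def line_pattern_counts(corpus):
    """Return one Counter per line position (line 1, line 2, line 3)."""
    counts = [Counter(), Counter(), Counter()]
    for haiku in corpus:
        for position, pattern in enumerate(haiku):
            counts[position][pattern] += 1
    return counts


def transition_probs(corpus, src=0, dst=1):
    """Estimate P(line[dst] == q | line[src] == p) by relative frequency."""
    pair_counts = defaultdict(Counter)
    for haiku in corpus:
        pair_counts[haiku[src]][haiku[dst]] += 1
    return {
        first: {second: n / sum(nexts.values()) for second, n in nexts.items()}
        for first, nexts in pair_counts.items()
    }


counts = line_pattern_counts(tagged_haiku)
probs = transition_probs(tagged_haiku)
print(counts[0].most_common(2))
print(probs["JJ NN"])
```

Keeping a separate counter per line position mirrors the slide, where line 1, line 2 and line 3 each have their own pattern inventory.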

  15. Body / Structure • Google 1T-Web / Project Gutenberg: POS tagged, then matched against the patterns • Line 1 Patterns: 280 JJ NN; 276 NN NN; ... • Line 2 Patterns: 64 DT_the JJ NN; … • Line 3 Patterns: … • Pattern Transitions: P(line2 == DT_the NN | line1 == JJ NN) = ... …

  16. Body / Structure • Google 1T-Web / Project Gutenberg: POS tagged, then matched against the patterns • Line 1 Patterns: 280 JJ NN; 276 NN NN; ... • Line 2 Patterns: 64 DT_the JJ NN; … • Line 3 Patterns: … • Pattern Transitions: P(line2 == DT_the NN | line1 == JJ NN) = ... … • Patterns: JJ NNS / DT_a JJ NN / IN_of NN

  17. Body / Structure • Google 1T-Web / Project Gutenberg: POS tagged, then matched against the patterns • Line 1 Patterns: 280 JJ NN; 276 NN NN; ... • Line 2 Patterns: 64 DT_the JJ NN; … • Line 3 Patterns: … • Pattern Transitions: P(line2 == DT_the NN | line1 == JJ NN) = ... … • Patterns: JJ NNS / DT_a JJ NN / IN_of NN • Matched text: pouring cats / a pilot care / of fighter

  18. Body / Structure • Google 1T-Web / Project Gutenberg: POS tagged, then matched against the patterns • Line 1 Patterns: AA BB CC / 12; BB CC DD / 10; … • Line 2 Patterns: CC DD EE / 20; … • Line 3 Patterns: … • Pattern Transitions: P(Line2 = AA BB | Line1 = XX YY) … • Patterns: JJ NNS / DT_a JJ NN / IN_of NN • Matched text: pouring cats / a pilot care / of fighter • Grammatical output • Preserves Haiku “texture”
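The matching step above can be sketched like this. The tiny `tagged_ngrams` list stands in for POS-tagged n-grams drawn from a large corpus, and the slot convention (a bare tag such as `JJ`, or a tag plus a fixed word such as `DT_a`) is read off the slides. The helper names are hypothetical.

```python
# Hypothetical POS-tagged n-grams: each n-gram is a tuple of (word, tag) pairs.
tagged_ngrams = [
    (("pouring", "JJ"), ("cats", "NNS")),
    (("a", "DT"), ("pilot", "JJ"), ("care", "NN")),
    (("of", "IN"), ("fighter", "NN")),
    (("the", "DT"), ("old", "JJ"), ("pond", "NN")),
]


def matches(ngram, pattern):
    """True if the n-gram fits the pattern slot by slot.

    A slot is either a bare tag ("JJ") or tag_word ("DT_a"), in which case
    the word itself must match as well.
    """
    slots = pattern.split()
    if len(slots) != len(ngram):
        return False
    for (word, tag), slot in zip(ngram, slots):
        slot_tag, _, slot_word = slot.partition("_")
        if tag != slot_tag:
            return False
        if slot_word and word.lower() != slot_word:
            return False
    return True


def candidates(pattern, ngrams):
    """Return the text of every n-gram that fits the given line pattern."""
    return [" ".join(word for word, _ in ng) for ng in ngrams if matches(ng, pattern)]


for line_pattern in ("JJ NNS", "DT_a JJ NN", "IN_of NN"):
    print(line_pattern, "->", candidates(line_pattern, tagged_ngrams))
```

Because every candidate line is an attested n-gram with the right tag sequence, the output stays locally grammatical, which is the property the slide calls preserving Haiku “texture”.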

  19. Soul? • Requirements: good “story” • cohesive • surprising • provoke feelings/emotions • metaphorical • “Should leave the reader wondering…” … Creative!

  20. Soul? • An idea: capture the “story” seed as a sequence of concepts:
     • butterfly, spring, flower
     • thief, steal, jail
     • mosquito, blood, vampire
     But not any seed will do:
     • cat, feline, claw → too cohesive
     • computer, coat, queen → too divergent

  21. Soul? Is WordNet a good soul? Not really: it may give cohesiveness, but bad stories.

  22. Soul? Is WordNet a good soul? Not really. We actually measured it in the Haiku Corpus.

  23. Butterfly Spring Flower • The connection between these words can be reconstructed by a human • It is not available in WordNet • Where can we find such relations?

  24. Word Association Norms

  25. Word Association Norms (WAN)
     • A collection of cue words, each with a set of free associations (targets) and quantitative and statistical measures, e.g. mouse: CAT 0.5, RAT 0.08, CHEESE 0.07, HOLE 0.05, …
     • Given a cue, collect the immediate responses: the first word that comes to mind.
     • The largest WAN we know of for English is the University of South Florida Free Association Norms (Nelson et al., 1998). http://w3.usf.edu/FreeAssociation/
     • 5,019 cue words and 10,469 additional targets, collected from more than 6,000 participants since 1973.
     • WAN as a weighted directed graph; nodes are stemmed words.
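A weighted directed graph of this kind can be held in a plain dictionary of dictionaries. The sketch below uses the `mouse` entry from the slide. The `cat` entry and its weights are invented for illustration, the helper names are hypothetical, and no stemming is applied.

```python
# Cue -> {target: association strength}. The "mouse" row is the slide's example.
wan = {
    "mouse": {"cat": 0.5, "rat": 0.08, "cheese": 0.07, "hole": 0.05},
    "cat": {"dog": 0.4, "mouse": 0.2},  # hypothetical values, illustration only
}


def strongest_targets(graph, cue, k=3):
    """Return the k targets most strongly associated with the cue."""
    targets = graph.get(cue, {})
    return sorted(targets, key=targets.get, reverse=True)[:k]


def is_associated(graph, a, b):
    """True if one of the two words is a cue for the other."""
    return b in graph.get(a, {}) or a in graph.get(b, {})


print(strongest_targets(wan, "mouse", k=2))
print(is_associated(wan, "cheese", "mouse"))
```

Words that appear only as targets (such as `cheese` here) have no outgoing edges, which matches the distinction the slide draws between cue words and additional targets.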

  26. [Figure: fragment of the word-association graph; words shown: water, spring, fall, flower, butterfly, green, bloom]

  27. Why Word Associations • Added value of WAN: insight into language that is not found in WordNet and is hard to acquire from corpora [Sinopalnikova & Smrz 2004] • Associative thinking takes part in the process of writing and reading poetry. • Haiku, being so short, relies on lexical associations for concept progression. Hypothesis: word associations are good catalysts for creativity and can be used as a building block in the creative process of Haiku generation.

  28. We first test this hypothesis by analyzing a corpus of existing Haiku poems. • Can the creativity of text as reflected in word associations be quantified? • Are Haiku poems indeed more associative than newswire text or prose?

  29. • Two nodes are connected iff one of them is a cue for the other.
     • Associative distance: the number of edges in the shortest path between the words in the associations graph.
     • WordNet distance: the number of edges in the shortest path between any synset of one word and any synset of the other word.
     • Associativity of a text: the number of associated word pairs in the text, normalized by the number of word pairs in the text of which both words are in the WAN.
     • WordNet-relations level: the number of WordNet-related word pairs in the text.
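The two WAN-based measures can be sketched with a breadth-first search over the undirected version of the graph. The toy `wan` data and the function names are hypothetical. The sketch also assumes that an "associated pair" means two words joined by a direct edge and that pairs are taken over distinct word types, neither of which the slide spells out.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy norms: cue -> list of targets.
wan = {
    "spring": ["flower", "water"],
    "flower": ["bloom", "butterfly"],
    "water": ["fall"],
    "butterfly": ["flower"],
}


def undirected(graph):
    """Connect two nodes iff one of them is a cue for the other."""
    adj = {}
    for cue, targets in graph.items():
        adj.setdefault(cue, set())
        for target in targets:
            adj[cue].add(target)
            adj.setdefault(target, set()).add(cue)
    return adj


def associative_distance(adj, a, b):
    """Number of edges on the shortest path, or None if there is no path."""
    if a not in adj or b not in adj:
        return None
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def associativity(adj, words):
    """Associated pairs divided by all pairs whose words are both in the WAN."""
    known = sorted({w for w in words if w in adj})
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if b in adj[a]) / len(pairs)


adj = undirected(wan)
print(associative_distance(adj, "spring", "butterfly"))
print(associativity(adj, ["spring", "flower", "butterfly", "computer"]))
```

Words missing from the norms (here `computer`) are dropped before pairs are formed, which is what the normalization on the slide implies.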

  30. Average Associativity: we measured the associativity and WordNet-relations levels of 200 of the Haiku in our Haiku Corpus, as well as of random 12-word sequences from Project Gutenberg and from the NANC newswire corpus.

  31. Filling body with soul: Theme Selection • Generating the seed of the story: • Start with a word • Random walk on a word graph. Many possible variants. We currently: • start with the node of the seed word • do several short random walks • keep the resulting word set
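The theme-selection variant described above can be sketched as follows. The toy graph, the default walk count and length, the uniform choice among targets, and the decision to follow only cue-to-target edges are all assumptions made for illustration. The slide gives only the outline of the procedure.

```python
import random

# Hypothetical toy norms: cue -> list of targets.
wan = {
    "spring": ["flower", "water", "fall"],
    "flower": ["bloom", "butterfly"],
    "water": ["fall"],
    "fall": ["water"],
    "bloom": ["flower", "green"],
}


def theme_words(graph, seed, walks=5, steps=2, rng=None):
    """Collect every word visited by several short random walks from the seed."""
    rng = rng or random.Random()
    theme = {seed}
    for _ in range(walks):
        node = seed
        for _ in range(steps):
            targets = graph.get(node)
            if not targets:
                break  # dead end: this word is never a cue
            node = rng.choice(targets)
            theme.add(node)
    return theme


theme = theme_words(wan, "spring", rng=random.Random(0))
print(sorted(theme))
```

Passing an explicit `random.Random` instance makes a run reproducible. Weighting the choice by association strength would be a natural alternative to the uniform step used here.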

  32. [Figure: word-association graph; words shown: water, spring, fall, flower, butterfly, green, bloom] Spring → {flower, butterfly, …}

  33. Filling body with soul • For a given structure: • Choose a first line containing the seed word • Choose other lines containing a word from the set • This is adequate, but the relations might be straightforward. Searching for a better soul: generate several poems for the pattern, then rerank them based on the associativity measure. Reranking catches further “residual” relations.
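A minimal sketch of the generate-then-rerank idea follows. The candidate lines, the `links` set of associated pairs and the function names are hypothetical, and associativity is again taken to be the share of known word pairs joined by a direct association.

```python
from itertools import combinations, product

# Hypothetical associated word pairs (undirected) and the vocabulary they cover.
links = {frozenset(p) for p in [
    ("spring", "flower"), ("flower", "bloom"),
    ("spring", "water"), ("water", "rain"),
]}
vocabulary = {w for pair in links for w in pair}


def associativity(words):
    """Share of known word pairs that are directly associated."""
    known = sorted(set(words) & vocabulary)
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(frozenset(p) in links for p in pairs) / len(pairs)


def build_poems(slots, seed, theme):
    """First line must contain the seed; other lines must contain a theme word."""
    first = [line for line in slots[0] if seed in line.split()]
    rest = [[line for line in slot if theme & set(line.split())] for slot in slots[1:]]
    return [list(poem) for poem in product(first, *rest)]


def score(poem):
    return associativity(" ".join(poem).split())


def rerank(poems):
    """Order poems from most to least associative."""
    return sorted(poems, key=score, reverse=True)


# Hypothetical candidate lines per line slot of one pattern.
slots = [
    ["early spring", "old computer"],
    ["a flower in bloom", "a cup of water"],
    ["in the rain", "in the bloom"],
]
poems = build_poems(slots, "spring", {"flower", "bloom", "water", "rain"})
ranked = rerank(poems)
for poem in ranked:
    print(round(score(poem), 2), " / ".join(poem))
```

Scoring the whole poem rather than each line separately is what lets the reranking pick up relations between lines that the per-line filter does not check.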

  34. Pattern: NN NN / DET_a NN of NNS / PP_in DET_the NN
     • 6: alligator pear / a handful of whites / in the spring
     • 8: avocado pear / a kind of boots / in the fall
     • 10: pear salad / a season of tears / in the summer
     • 10: pear tree / a seasoning of spices / in the fall
     • 10: alligator pear / a spring of tears / in the blackness

  35. Evaluation Method • ‘Turing test’: • Was this Haiku written by a human or by a computer? • How would you grade it, from 1 to 5? • Settings: • AUTO set: 15 Haiku created by Gaiku without any manual selection, and 10 random human Haiku on the same subjects • SEL set: 17 Haiku created by Gaiku, selected manually out of several runs, and 9 award-winning human Haiku • 52 subjects

  36. Results: AUTO set

  37. Results: SEL set

  38. The Best of Gaiku early dew the water contains teaspoons of honey • Best in SEL. Classified as human - 77.2%, average grade 3.09 • Best in AUTO. Classified as human - 72.2%, average grade 2.75 cherry tree poisonous flowers lie blooming

  39. Conclusions • Word Association Norms have good potential in creative content generation. Future Work: Lots!
     • Haiku: improve theme selection
     • Additional forms of creative texts
     • Test WAN in general NLP tasks: use WAN for (non-creative) generation; Word Sense Disambiguation; lexical chains; ‘Guess the word’ given associations (for people with SLI)

