
Extractive Spoken Document Summarization - Models and Features


Presentation Transcript


  1. Extractive Spoken Document Summarization - Models and Features Yi-Ting Chen(陳怡婷) Department of Computer Science & Information Engineering National Taiwan Normal University, Taipei, Taiwan 2007/02/08

  2. Outline • Introduction • Conventional Extractive Summarization Approaches • Probabilistic Generative Summarization Approaches • Summarization Compaction • Evaluation Metrics • Experimental Results • Conclusions and Future Work

  3. Introduction (1/3) • The World Wide Web has led to a renaissance of research on automatic document summarization and has extended it to cover a wider range of new tasks • Speech is one of the most important sources of information about multimedia content • However, spoken documents associated with multimedia are unstructured, lacking titles and paragraphs, and are thus difficult to retrieve and browse • Spoken documents are merely audio/video signals, or very long sequences of transcribed words that include recognition errors • It is inconvenient and inefficient for users to browse through each of them from beginning to end

  4. Introduction (2/3) • Spoken document summarization, which aims to automatically generate a summary for a spoken document, is key to better speech understanding and organization • Extractive vs. abstractive summarization • Extractive summarization selects a number of indicative sentences or paragraphs from the original document and sequences them to form a summary • Abstractive summarization rewrites the content as a concise abstract that reflects the key concepts of the document • Extractive summarization has received much more attention in recent years

  5. Introduction (3/3)

  6. Background of Summarization (Mani and Maybury 1999) • Text-document summarization • An early system using a surface-level approach (1958) • The first entity-level approaches based on syntactic analysis (1961) • The use of location features (1969); the surface-level approach extended to include the use of cue phrases • The emergence of more extensive entity-level approaches (1972) • The first discourse-based approaches based on story grammars (1980) • A variety of work: entity-level approaches based on AI, logic and production rules, semantic networks, and hybrid approaches • The first trainable approach (1995) and the first SVD-based approach (1995) • More natural language generation work begins to focus on text summarization • A renewed interest in earlier surface-level approaches; recent work has focused almost exclusively on extracts rather than abstracts • Spoken-document summarization • The emergence of new areas such as multi-document summarization (1997), multilingual summarization, and multimedia summarization (1997)

  7. Extraction Based on Sentence Locations/Structures • Sentence extraction using sentence location information • Lead (Hajime and Manabu 2000) • Focusing on the introductory and concluding segments (Hirohata et al. 2005) • Specific structure in some domains (Maskey et al. 2003) • E.g., for broadcast news programs: sentence position, speaker type, previous-speaker type, next-speaker type, speaker change

  8. Statistical Summarization Approaches (1/8) • Spoken sentences are ranked and selected based on similarity measures or significance scores (a) Similarity Measures • Vector Space Model (VSM) (Ho 2003) • The document and each of its sentences are represented in vector form • The sentences that have the highest relevance scores to the whole document are selected • To summarize more important and diverse concepts in a document • Relevance measure (Gong et al. 2001) • Maximum Marginal Relevance (MMR) (Murray et al. 2005)
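The VSM ranking just described can be illustrated with a minimal sketch (an illustration, not the exact system cited on the slide): the document and each sentence become term-frequency vectors, and sentences are ranked by cosine similarity to the document vector. The function names are ours.

```python
# Minimal VSM-style sentence ranking sketch: represent the document and each
# sentence as term-frequency vectors and rank sentences by cosine similarity.
import math
from collections import Counter

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_sentences_vsm(sentences):
    """sentences: list of token lists; returns sentence indices, most relevant first."""
    doc_vec = Counter(t for s in sentences for t in s)   # the whole document
    sent_vecs = [Counter(s) for s in sentences]          # one vector per sentence
    scores = [cosine(sv, doc_vec) for sv in sent_vecs]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```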

  9. Statistical Summarization Approaches (2/8) (a) Similarity Measures • Relevance measure (Gong et al. 2001) • Maximum Marginal Relevance, MMR (Murray et al. 2005)
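MMR can be sketched as a greedy selection that trades relevance against redundancy. This is a minimal illustration (not the cited implementation); the interpolation weight lam and the term-frequency/cosine representation are assumptions.

```python
# MMR-style selection sketch: greedily pick sentences that are relevant to the
# whole document but not redundant with what has already been selected.
import math
from collections import Counter

def _cos(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def mmr_select(sentences, k, lam=0.7):
    """sentences: list of token lists; k: summary length in sentences;
    lam trades off relevance to the document against redundancy."""
    doc_vec = Counter(t for s in sentences for t in s)
    vecs = [Counter(s) for s in sentences]
    selected, candidates = [], set(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = _cos(vecs[i], doc_vec)
            redundancy = max((_cos(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```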

  10. Statistical Summarization Approaches (3/8) (b) SVD-based Methods • Each sentence can also be represented as a semantic vector in a latent topic space • The sentences that carry more topical or semantic information are selected • LSA (Gong et al. 2001) • DIM (Hirohata et al. 2005)
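The LSA idea can be sketched as follows (a simplified version in the spirit of Gong et al. 2001, not their exact procedure): take the SVD of a term-by-sentence matrix and, for each leading latent topic, pick the sentence with the largest weight in the corresponding right singular vector.

```python
# LSA-based selection sketch (simplified): SVD of the term-by-sentence matrix;
# for each leading latent topic, pick the sentence with the largest weight.
import numpy as np

def lsa_select(sentences, k):
    """sentences: list of token lists; k: number of sentences to pick."""
    vocab = sorted({t for s in sentences for t in s})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for t in s:
            A[index[t], j] += 1.0                      # raw term frequency
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    picked = []
    for topic in range(min(k, Vt.shape[0])):           # one sentence per topic
        order = np.argsort(-np.abs(Vt[topic]))
        chosen = next(j for j in order if j not in picked)
        picked.append(int(chosen))
    return picked
```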

  11. Statistical Summarization Approaches (4/8) (b) SVD-based Methods • Embedded LSA, eLSA (黃耀民 2005) • Not only the sentences of the document to be summarized but also the document itself are involved in constructing the latent topic space • The sentences that have the highest relevance scores to the whole document in the latent topic space are selected

  12. Statistical Summarization Approaches (5/8) (c) Sentence Significance Score (SIG) (Kikuchi et al. 2003) • Each sentence in the document is represented as a sequence of terms, each of which can simply be given a significance score • Features such as the confidence score, linguistic score, or prosodic information can also be integrated • Sentence selection is then performed based on this score • E.g., for a given sentence, a linguistic score (e.g., an N-gram language-model score) and a term significance score are computed and combined • Or the sentence significance score of (Hirohata et al. 2005)
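A simplified sketch of the term significance idea, in the spirit of the cited work but not a reproduction of it: each term is weighted by its frequency in the document times a corpus-based rarity factor, and a sentence is scored by the length-normalized sum of its term weights. The linguistic and confidence terms mentioned above are omitted here.

```python
# Simplified sentence significance score: weight each term by its document
# frequency times a corpus rarity factor, then average over the sentence.
import math

def sentence_significance(sentence, doc_tf, corpus_tf, corpus_total):
    """sentence: token list; doc_tf: term counts in the document;
    corpus_tf: term counts in a large corpus; corpus_total: total corpus count."""
    score = 0.0
    for t in sentence:
        rarity = math.log(corpus_total / corpus_tf.get(t, 1))   # rare terms weigh more
        score += doc_tf.get(t, 0) * rarity
    return score / max(len(sentence), 1)                        # length normalization
```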

  13. Statistical Summarization Approaches (6/8) (c) Sentence Significance Score (Kong and Lee 2006) • Each sentence is scored by combining several term-level measures: • a statistical measure, such as TF-IDF • a linguistic measure, e.g., based on named entities and POS tags • a confidence score • an N-gram score, calculated from the grammatical structure of the sentence • The statistical measure can also be evaluated using PLSA (Probabilistic Latent Semantic Analysis) • Topic significance • Term entropy
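The term-entropy idea can be sketched as follows, assuming PLSA has already produced, for each term, a posterior distribution over the latent topics: a term whose probability mass is concentrated in a few topics has low entropy and is treated as more topic-indicative. This is our illustration of the concept, not the cited formulation.

```python
# Term entropy over PLSA topics (sketch): low entropy means the term is
# concentrated in a few latent topics and is assumed to be more significant.
import math

def term_entropy(p_topic_given_term):
    """p_topic_given_term: list of P(T_k | term) values summing to 1."""
    return -sum(p * math.log(p) for p in p_topic_given_term if p > 0.0)
```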

  14. Statistical Summarization Approaches (7/8) (d) Classification-based Methods • These methods need a set of training documents (labeled data) to train classifiers that label each sentence as summary or non-summary • Naïve Bayes classifier / Bayesian network classifier (Kupiec 1995, Koumpis et al. 2005, Maskey et al. 2005) • Support Vector Machine (SVM) (Zhu and Penn 2005) • Logistic Regression (Zhu and Penn 2005) • Gaussian Mixture Models (GMM) (Murray et al. 2005)
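A minimal sketch of the classification-based setup (not any specific cited system): each sentence is described by a feature vector with a summary/non-summary label, and a standard classifier, here scikit-learn's logistic regression, scores unseen sentences. The feature set and values below are hypothetical.

```python
# Classification-based sketch: train on labeled sentences, then rank new
# sentences by the predicted probability of belonging to the summary class.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per sentence:
# [normalized position, sentence length, TF-IDF sum, confidence score]
X_train = np.array([[0.0, 12, 3.1, 0.9],
                    [0.5,  7, 1.2, 0.8],
                    [0.9,  9, 0.8, 0.7]])
y_train = np.array([1, 0, 0])             # 1 = summary, 0 = non-summary

clf = LogisticRegression().fit(X_train, y_train)

X_new = np.array([[0.1, 10, 2.5, 0.85]])  # features of an unseen sentence
summary_prob = clf.predict_proba(X_new)[0, 1]
```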

  15. Statistical Summarization Approaches (8/8) (e) Combined Methods (Hirohata et al. 2005) • Sentence Significance Score (SIG) combined with Location Information • Latent semantic analysis (LSA) combined with Location Information • DIM combined with Location Information

  16. Probabilistic Generative Approaches (1/7) • MAP criterion for sentence selection: each sentence is scored by the product of a sentence model (the likelihood of the sentence generating the document) and a sentence prior • The sentence prior is simply set to uniform here • Or it may depend on sentence duration/position, correctness of the sentence boundary, confidence scores, prosodic information, etc. • Each sentence of the document can then be ranked by this value
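Written out, the ranking criterion annotated on the slide (sentence model times sentence prior) is the standard MAP decomposition; this is a reconstruction using generic notation for the document D and a candidate sentence S_i:

```latex
P(S_i \mid D) \;=\; \frac{P(D \mid S_i)\, P(S_i)}{P(D)}
\;\propto\; \underbrace{P(D \mid S_i)}_{\text{sentence model}}\;
            \underbrace{P(S_i)}_{\text{sentence prior}}
```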

  17. Probabilistic Generative Approaches (2/7) • Hidden Markov Model, HMM (黃耀民 2005, 陳怡婷 et al. 2005) • Each sentence of the spoken document is treated as a probabilistic generative model of N-grams, while the spoken document is the observation • The sentence model is estimated from the sentence itself • The collection model is estimated from a large corpus (so that every term in the vocabulary receives some probability) • A weighting parameter linearly interpolates the two models
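The interpolated generative model described above can be sketched as a log-likelihood computation, assuming unigram sentence and collection models; the weighting parameter lam and the function name are illustrative.

```python
# HMM-style sentence ranking sketch: each sentence is a unigram model smoothed
# with a collection model; sentences are ranked by the log-likelihood of
# generating the whole document.
import math
from collections import Counter

def sentence_log_likelihood(doc_tokens, sent_tokens, collection_model, lam=0.5):
    """collection_model: dict mapping term -> P(term | collection)."""
    sent_tf = Counter(sent_tokens)
    sent_len = sum(sent_tf.values())
    log_like = 0.0
    for w, count in Counter(doc_tokens).items():
        p_sent = sent_tf[w] / sent_len              # ML estimate from the sentence
        p_coll = collection_model.get(w, 1e-12)     # background probability
        log_like += count * math.log(lam * p_sent + (1.0 - lam) * p_coll)
    return log_like
```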

  18. Probabilistic Generative Approaches (3/7) • Relevance Model, RM (Chen et al. 2006) • In the HMM approach, the true sentence model might not be accurately estimated by MLE • Since a sentence consists of only a few terms • In order to improve the estimation of the sentence model • Each sentence has its own associated relevance model, constructed from the subset of documents in the collection that are relevant to the sentence • The relevance model is then linearly combined with the original sentence model to form a more accurate sentence model
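The combination step can be sketched as a simple linear interpolation of two unigram distributions; the mixing weight alpha and how the relevance model is estimated from the retrieved documents are assumptions here.

```python
# Relevance-model smoothing sketch: linearly combine the sentence's own unigram
# model with a relevance model estimated from documents retrieved for it.
def combine_with_relevance_model(sentence_model, relevance_model, alpha=0.5):
    """Both arguments are dicts mapping term -> probability."""
    vocab = set(sentence_model) | set(relevance_model)
    return {w: alpha * sentence_model.get(w, 0.0)
               + (1.0 - alpha) * relevance_model.get(w, 0.0)
            for w in vocab}
```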

  19. Probabilistic Generative Approaches (4/7) • A schematic diagram of extractive spoken document summarization jointly using the HMM and RM models • [Diagram: the spoken documents to be summarized are issued to an IR system over general and contemporary text news collections; the relevant documents retrieved for each sentence S (local feedback) are used to build S's RM model, which is combined with S's HMM model to compute the document likelihood]

  20. Probabilistic Generative Approaches (5/7) • Topical Mixture Model, TMM (Chen et al. 2006) • Build a probabilistic latent topical space • Measure the likelihood of a sentence generating a given document in that space • [Diagram: the TMM model for a specific sentence Si generates the document D = w1 w2 … wn … wN through K latent topics, with topic-word distributions P(wn|Tk) weighted by sentence-specific topic weights P(Tk|Si)]
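A minimal sketch of the TMM scoring, reconstructed from the slide's diagram: each word of the document is generated by mixing K latent topics with sentence-specific weights, P(w|Si) = sum_k P(w|Tk) P(Tk|Si). The parameter names are illustrative and would come from training.

```python
# TMM-style scoring sketch: the likelihood of sentence S_i generating the
# document is a product over document words of a K-topic mixture.
import math

def tmm_log_likelihood(doc_tokens, topic_word, topic_weight, floor=1e-12):
    """topic_word: list of K dicts giving P(w | T_k);
    topic_weight: list of K values P(T_k | S_i) for the sentence being scored."""
    log_like = 0.0
    for w in doc_tokens:
        p = sum(tw.get(w, 0.0) * pk for tw, pk in zip(topic_word, topic_weight))
        log_like += math.log(max(p, floor))   # floor avoids log(0)
    return log_like
```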

  21. Probabilistic Generative Approaches (6/7) • Word Topical Mixture Model (wTMM) (Chiu and Chen 2007) • To explore the co-occurrence relationship between words of the language • Each word of the language is treated as a topical mixture model for predicting the occurrence of other words • Each sentence of the spoken document to be summarized is treated as a composite word TMM model for generating the document, and the document is scored by the likelihood of being generated by this composite model

  22. Probabilistic Generative Approaches (7/7) • Word Topical Mixture Model (wTMM)

  23. Comparison of Extractive Summarization Methods • Literal Term Matching vs. Concept Matching • Literal term matching: • Extraction using degree of similarity (VSM, MMR) • Extraction using feature scores (sentence significance score) • HMM, HMMRM • Concept matching: • Extraction using latent semantic analysis (LSA, DIM) • TMM, wTMM

  24. Comparison of Extractive Summarization Methods • Summarizing Speech Without Text Using HMM (Maskey and Hirschberg 2006) • A position-sensitive HMM with L position bins, giving 2L states • Feature extraction and HMM training

  25. Summarization Compaction (1/3) • Two-Stage Summarization Method (Furui et al. 2004) • Sentence extraction • Sentence compaction • A word-dependency score measures the dependency between two words and is obtained from a phrase-structure grammar, a stochastic dependency context-free grammar (SDCFG) • A set of words that maximizes a weighted sum of these scores is selected according to the compression ratio and connected to create a summary using a two-stage DP technique
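A much-simplified sketch of the word-selection idea behind compaction, not the SDCFG-based two-stage DP of the cited work: keep exactly m words of a sentence, in their original order, maximizing the sum of per-word importance scores plus pairwise connection scores between words that become adjacent in the output. The scoring inputs are assumed to be given.

```python
# Simplified compaction DP: select m words (order preserved) maximizing
# word scores plus connection scores between consecutive selected words.
def compact_sentence(words, word_score, connect_score, m):
    """words: token list; word_score[i]: importance of words[i];
    connect_score[i][j] (i < j): how well words[i] connects to words[j];
    m: number of words to keep (set by the compression ratio)."""
    n = len(words)
    m = min(m, n)
    if m <= 0:
        return []
    NEG = float("-inf")
    # dp[i][k]: best score selecting k words from words[:i+1] with words[i] last
    dp = [[NEG] * (m + 1) for _ in range(n)]
    back = [[None] * (m + 1) for _ in range(n)]
    for i in range(n):
        dp[i][1] = word_score[i]
        for k in range(2, m + 1):
            for j in range(i):
                if dp[j][k - 1] == NEG:
                    continue
                cand = dp[j][k - 1] + connect_score[j][i] + word_score[i]
                if cand > dp[i][k]:
                    dp[i][k] = cand
                    back[i][k] = j
    # pick the best final word and backtrack the selected positions
    end = max(range(n), key=lambda i: dp[i][m])
    picked, i, k = [], end, m
    while i is not None:
        picked.append(i)
        i, k = back[i][k], k - 1
    return [words[i] for i in reversed(picked)]
```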

  26. Summarization Compaction (2/3)

  27. Summarization Compaction (3/3) • Using acoustic, prosodic, and semantic information together with a dynamic programming search to find the summarized result (Huang et al. 2005) • A noisy-channel model for sentence compression (Knight and Marcu 2001) • A decision-based model for sentence compression (Knight and Marcu 2001), which decomposes the rewriting operation into a sequence of actions

  28. Evaluation Metrics (1/4) • Subjective Evaluation Metrics (direct evaluation) • Conducted by human subjects • Different levels • Objective Evaluation Metrics • Automatic summaries are evaluated with objective metrics • Automatic Evaluation • Summaries are evaluated through an IR task

  29. Evaluation Metrics (2/4) • Objective Evaluation Metrics • Summarization accuracy (Hori et al. 2004) • All the human summaries are merged into a single word network • The word accuracy of the automatic summary, measured against the closest word string extracted from the word network, is then reported as the summarization accuracy • Problem: the variation among manual summaries can be large (at either high or low summarization ratios)

  30. Evaluation Metrics (3/4) • Objective Evaluation Metrics • Sentence recall/precision (Hirohata et al. 2004) • Sentence recall/precision is commonly used in evaluating sentence-extraction-based text summarization • Since sentence boundaries are not explicitly indicated in input speech, boundaries estimated from recognition results do not always agree with those in the manual summaries (Kitade et al. 2004) • F-measure, F-measure/max, F-measure/ave.
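Sentence-level precision, recall, and F-measure against a manual extract can be computed as in the following minimal sketch over sets of selected sentence IDs (the single-reference setting is an assumption; the /max and /ave variants take the best or average score over multiple references).

```python
# Sentence precision / recall / F-measure sketch: compare the set of
# automatically selected sentence IDs with those in one manual summary.
def sentence_prf(auto_ids, manual_ids):
    auto, manual = set(auto_ids), set(manual_ids)
    hits = len(auto & manual)
    precision = hits / len(auto) if auto else 0.0
    recall = hits / len(manual) if manual else 0.0
    f = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f
```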

  31. Evaluation Metrics (4/4) • Objective Evaluation Metrics • ROUGE-N (Lin et al. 2003) • ROUGE-N is an N-gram recall between an automatic summary and a set of manual summaries • Cosine Measure (Saggion et al. 2002, Ho 2003) • E.g., comparing the word-segmented Chinese strings 「昨天 馬英九 訪問 中國大陸」 ("Ma Ying-jeou visited mainland China yesterday") and 「昨天馬英九 結束 訪問 回國」 ("Yesterday Ma Ying-jeou ended his visit and returned home")
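A simplified ROUGE-N recall sketch for a single reference summary (the official ROUGE toolkit additionally handles multiple references, stemming, and other options):

```python
# Simplified ROUGE-N: recall of reference n-grams that also appear in the
# automatic summary, with clipped counts, against a single reference.
from collections import Counter

def rouge_n(auto_tokens, ref_tokens, n=2):
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    auto_ng, ref_ng = ngrams(auto_tokens), ngrams(ref_tokens)
    overlap = sum(min(count, auto_ng[g]) for g, count in ref_ng.items())
    total = sum(ref_ng.values())
    return overlap / total if total else 0.0
```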

  32. Experimental Results (1/2) • Preliminary Tests on 200 radio broadcast news stories collected in Taiwan (automatic transcripts with 14.17% character error rate) • Development Set (100) • Test Set (100) • ROUGE-2 measure was used to evaluate the performance levels of different models • Development set results

  33. Experimental Results (2/2) • Test set results • Test set with non-uniform sentence prior probabilities

  34. Conclusions and Future Work • Various (spoken) document summarization approaches and features have been extensively investigated in the past several years • The probabilistic generative framework seems promising for extractive (spoken) document summarization. We are currently investigating how to • Improve the proposed sentence models • Improve the estimation of the sentence prior • Consider the relevance between sentences

  35. Reference (1/4) • (Mani and Maybury 1999) Inderjeet Mani and Mark T. Maybury, “Advances in Automatic Text Summarization”, The MIT Press, Cambridge, Massachusetts, 1999. • (Hajime and Manabu 2000) Mochizuki Hajime, Okumura Manabu, “A Comparison of Summarization Methods Based on Task-based Evaluation”, 2nd International Conference on Language Resources and Evaluation (LREC 2000), Athens, Greece. • (Hirohata et al. 2005) Makoto Hirohata, Yousuke Shinnaka, Koji Iwano and Sadaoki Furui, “Sentence Extraction-Based Presentation Summarization Techniques and Evaluation Metrics”, ICASSP 2005. • (Maskey et al. 2003) Sameer Raj Maskey, Julia Hirschberg, “Automatic Summarization of Broadcast News using Structural Features”, EUROSPEECH 2003. • (Ho 2003) Y. Ho, “An initial study on automatic summarization of Chinese spoken documents”, Master's Thesis, National Taiwan University, July 2003. • (Gong et al. 2001) Y. Gong and X. Liu, “Generic text summarization using relevance measure and latent semantic analysis,” in Proc. ACM SIGIR Conference on R&D in Information Retrieval, 2001, pp. 19-25. • (Murray et al. 2005) Gabriel Murray, Steve Renals, Jean Carletta, “Extractive Summarization of Meeting Recordings”, in Proc. Eurospeech 2005. • (Kikuchi et al. 2003) T. Kikuchi, S. Furui, and C. Hori, “Two-stage automatic speech summarization by sentence extraction and compaction,” in Proc. IEEE and ISCA Workshop on Spontaneous Speech Processing and Recognition, 2003, pp. 207-210. • (Furui et al. 2004) Sadaoki Furui, Tomonori Kikuchi, Yousuke Shinnaka, Chiori Hori, “Speech-to-Text and Speech-to-Speech Summarization of Spontaneous Speech”, IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 4, July 2004.

  36. Reference (2/4) • (Kong and Lee 2006) Sheng-Yi Kong and Lin-shan Lee, “Improved Spoken Document Summarization using Probabilistic Latent Semantic Analysis (PLSA)”, the 31st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), Toulouse, France, May 14-19, 2006. • (Kupiec 1995) Julian Kupiec, Jan Pedersen and Francine Chen, “A Trainable Document Summarizer”, Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1995. • (Koumpis et al. 2005) K. Koumpis, S. Renals, “Automatic Summarization of Voicemail Messages Using Lexical and Prosodic Features”, ACM Transactions on Speech and Language Processing, 2(1), 2005. • (Zhu and Penn 2005) X. Zhu, G. Penn, “Evaluation of Sentence Selection for Speech Summarization”, in Proc. the 2nd International Conference on Recent Advances in Natural Language Processing (RANLP 2005), pp. 39-45, September 2005. • (Chen et al. 2006) Yi-Ting Chen, Suhan Yu, Hsin-min Wang, Berlin Chen, “Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models,” the Fifth International Symposium on Chinese Spoken Language Processing (ISCSLP 2006), Singapore, December 13-16, 2006. • (Chen et al. 2006) Berlin Chen, Yao-Ming Yeh, Yao-Min Huang, Yi-Ting Chen, “Chinese Spoken Document Summarization Using Probabilistic Latent Topical Information,” the 31st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), Toulouse, France, May 14-19, 2006. • (Chiu and Chen 2007) Hsuan-Sheng Chiu, Berlin Chen, “Word Topical Mixture Models for Dynamic Language Model Adaptation,” the 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Hawaii, USA, April 15-20, 2007.

  37. Reference (3/4) • (Maskey and Hirschberg 2006) Sameer Maskey, Julia Hirschberg, “Summarizing Speech Without Text Using Hidden Markov Models”, HLT-NAACL 2006. • (Huang et al. 2005) Chien-Lin Huang, Chia-Hsin Hsieh and Chung-Hsien Wu, “Spoken Document Summarization Using Acoustic, Prosodic and Semantic Information,” in Proceedings of ICME 2005, Amsterdam, The Netherlands, 2005. • (Knight and Marcu 2001) Kevin Knight, Daniel Marcu, “Summarization beyond sentence extraction: A probabilistic approach to sentence compression”, Artificial Intelligence, 139(1): 91-107, 2002. • (Hori et al. 2004) C. Hori, T. Hirao and H. Isozaki, “Evaluation measures considering sentence concatenation for automatic summarization by sentence or word extraction,” in Proc. ACL, pp. 82-88, 2004. • (Kitade et al. 2004) T. Kitade et al., “Automatic extraction of key sentences from CSJ presentations using discourse markers and topic words”, in Proc. Third Spontaneous Speech Science and Technology Workshop, pp. 111-118, 2004. • (Hirohata et al. 2004) Makoto Hirohata, Yosuke Shinnaka, Koji Iwano and Sadaoki Furui, “Sentence-extractive automatic speech summarization and evaluation techniques”, Speech Communication, in press (available online 5 June 2006). • (Lin et al. 2003) C.Y. Lin, “ROUGE: Recall-oriented Understudy for Gisting Evaluation,” 2003, http://www.isi.edu/~cyl/ROUGE/. • (Saggion et al. 2002) Horacio Saggion and Dragomir Radev, “Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics”, COLING 2002.

  38. Reference (4/4) • (黃耀民 2005) 黃耀民, “A Study on Automatic Summarization Based on Sentence Extraction with Applications to Document Classification” (in Chinese), Master's Thesis, Graduate Institute of Computer Science and Information Engineering, National Taiwan Normal University, 2005. • (陳怡婷 et al. 2005) 陳怡婷, 黃耀民, 葉耀明, 陳柏琳, “Summarization Models for Automatic Summarization of Chinese Spoken Documents” (in Chinese), in Proc. the 10th Conference on Artificial Intelligence and Applications, December 2-3, 2005.
