




Explorations in Zero-Resource Spoken Term Detection
Justin Chiu, Language Technologies Institute, Carnegie Mellon University

What is STD?
• Given a query (text or audio), detect the query within a target audio collection
• Common approach: recognize, then search
• Evaluation metric: ATWV (Actual Term-Weighted Value)

Why Zero Resource?
• The common approach relies on a high-quality recognizer
• For some languages there is not enough knowledge to construct a recognizer:
• Dictionary
• Language model

Preliminary Attempt
1. Extract MFCC features from the audio
2. Clustering
• Goal: obtain a better, higher-level representation of the audio
• K-means clustering
• GMM clustering
3. Representation
• Hard representation: each vector mapped to a fixed label (available for both clustering methods)
• Soft representation: each vector mapped to a different vector (available for GMM clustering)
4. Segmental Dynamic Time Warping
• Hard distance: mismatch = 1, match = 0; can use inverted-frequency weighting
• Soft distance: −log(a·q)
• A distance below a certain threshold is treated as a detection

Preliminary Results
• DET graph: development set (figure)
• DET graph: evaluation set (figure)

Proposed Approach
• Use the Successive State Splitting algorithm to train an HMM:
• Initially use a 1-state HMM to describe the training audio
• Split HMM states, then prune states according to maximum likelihood
• Decide the number of splitting/pruning iterations
• Train an "acoustic model" to model these sub-word units
• Represent queries and audio by decoding
• Index and search to perform term detection
• Advantage: increases the system's robustness by making stronger assumptions compared to the preliminary approach

Conclusions
• We present explorations in zero-resource STD
• Pattern matching does not work in the speaker-independent case
• We propose a modeling approach to address this issue
• Stronger assumptions might make the system more robust
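The clustering and representation steps of the preliminary attempt can be sketched as follows. This is a minimal illustration, not the poster's implementation: it assumes MFCC frames are already extracted as plain Python vectors, uses k-means for the hard representation, and approximates a soft (posterior-style) representation with a softmax over negative distances to the centroids; all function names and parameters here are illustrative.

```python
import math
import random

def kmeans(frames, k, iters=20, seed=0):
    """Cluster feature frames with k-means; return centroids and hard labels."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: each frame takes the label of its nearest centroid.
        for i, f in enumerate(frames):
            labels[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(f, centroids[c])))
        # Update step: move each centroid to the mean of its assigned frames.
        for c in range(k):
            members = [frames[i] for i in range(len(frames)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def soft_representation(frame, centroids, tau=1.0):
    """Posterior-like soft vector over clusters (softmax of negative distances)."""
    d = [sum((a - b) ** 2 for a, b in zip(frame, c)) for c in centroids]
    e = [math.exp(-di / tau) for di in d]
    z = sum(e)
    return [ei / z for ei in e]
```

With a GMM instead of k-means, the soft representation would be the vector of component posteriors for each frame; the softmax above is a stand-in for that idea.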
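The frame-level distances used for matching can be sketched directly from the poster's definitions: a hard distance (mismatch = 1, match = 0) over cluster labels, and a soft distance −log(a·q) over posterior vectors. The exact form of the inverted-frequency weighting is not given on the poster; the IDF-style weight below is one plausible reading, labeled as an assumption.

```python
import math
from collections import Counter

def if_weights(labels):
    """Inverse-frequency weight per cluster label (assumed IDF-style form)."""
    counts = Counter(labels)
    n = len(labels)
    return {lab: math.log(n / c) for lab, c in counts.items()}

def hard_distance(a, b, w=None):
    """Match = 0; mismatch = 1, optionally scaled by inverse-frequency weights."""
    if a == b:
        return 0.0
    if w is None:
        return 1.0
    return 0.5 * (w.get(a, 0.0) + w.get(b, 0.0))

def soft_distance(a_vec, q_vec, eps=1e-12):
    """-log(a . q) between two posterior vectors; small when frames are similar."""
    dot = sum(x * y for x, y in zip(a_vec, q_vec))
    return -math.log(max(dot, eps))
```

A distance below a chosen threshold is then treated as a detection, as the poster states.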
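The segmental DTW matching step can be sketched as below. This is a simplified sliding-window variant (full segmental DTW constrains alignments to diagonal bands over the whole distance matrix); the frame distance is passed in so either the hard or the soft distance can be used, and the threshold is a free parameter.

```python
def dtw_cost(query, segment, dist):
    """Standard DTW alignment cost between two frame sequences, length-normalized."""
    n, m = len(query), len(segment)
    INF = float('inf')
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(query[i - 1], segment[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)

def segmental_dtw(query, audio, dist, threshold):
    """Slide a query-sized window over the audio; below-threshold cost => detection."""
    hits = []
    win = len(query)
    for start in range(0, len(audio) - win + 1):
        cost = dtw_cost(query, audio[start:start + win], dist)
        if cost < threshold:
            hits.append((start, cost))
    return hits
```

On hard labels, an exact occurrence of the query aligns with zero cost, so it falls under any positive threshold.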
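The ATWV evaluation metric named above can be computed as in the sketch below, which follows the NIST STD 2006 definition: ATWV = 1 − mean over query terms of (P_miss + β·P_FA), with β ≈ 999.9 and the number of non-target trials approximated by total speech duration in seconds minus the number of true occurrences. The dictionary field names are illustrative, not from the poster.

```python
def atwv(terms, beta=999.9):
    """Actual Term-Weighted Value over a list of per-term count dicts.

    Each dict holds: n_true (true occurrences), n_correct (correct detections),
    n_fa (false alarms), t_speech (total speech duration in seconds).
    """
    total = 0.0
    for t in terms:
        p_miss = 1.0 - t['n_correct'] / t['n_true']          # missed fraction
        p_fa = t['n_fa'] / (t['t_speech'] - t['n_true'])     # false-alarm rate
        total += p_miss + beta * p_fa
    return 1.0 - total / len(terms)
```

A perfect system scores 1.0; because β is large, even a few false alarms pull the score down sharply, which is what the DET graphs trade off.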
