Sequence Package Analysis: A New Data Mining Tool to Speed Up Wiretap Analysis

Amy Neustein, Ph.D., Linguistic Technology Systems, lingtec@banet.net


Presentation Transcript


  1. Sequence Package Analysis: A New Data Mining Tool to Speed Up Wiretap Analysis. Amy Neustein, Ph.D., Linguistic Technology Systems, lingtec@banet.net

  2. Sequence Package Analysis: A New Data Mining Tool to Speed Up Wiretap Analysis
     Question: Why do we need a new tool?
     Answer:
     1) In the real world, speakers do not always use “key” words that can be spotted in a dialog.
     2) In crime- or terror-related dialog, speakers deliberately avoid key words that could identify names, places, dates, etc.
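The limitation described in this answer can be illustrated with a minimal sketch of conventional keyword spotting. The lexicon contents and the `spot_keywords` helper are hypothetical and are not part of SPA; the utterances are taken from the dialog example on slide 6.

```python
import re

# Hypothetical preset lexicon; a real deployment would define its own.
PRESET_LEXICON = {"airport", "station", "bridge"}

def spot_keywords(utterance, lexicon=PRESET_LEXICON):
    """Return the lexicon words that occur in the utterance (case-insensitive)."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return [w for w in words if w in lexicon]

dialog = [
    "Come to the intersection near Juniors?",
    "You know the thoroughfare with the big traffic light?",
    "Juniors, yeah.",
]

# "Juniors" is the crucial location, but it is not in the lexicon,
# so keyword spotting returns no hits for any of the three turns.
hits = [spot_keywords(turn) for turn in dialog]
```

Every entry in `hits` is empty here, which is the gap that a sequence-based approach is meant to address.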

  3. Sequence Package Analysis: How Does SPA Work?
     1) Adds rather than replaces: SPA adds a layer of intelligence to standard dialog systems.
     2) Mines audio data: SPA goes beyond a conventional search for words and word strings.
     3) Examines a series of related speaking turns that are discretely packaged as a sequence.

  4. WHAT DOES SPA DO?
     1) SPA permits the discovery of “key” words (e.g., the name of a location where a crucial meeting among terrorists will take place) that are not in the preset lexicon.
     2) SPA permits rapid and efficient data mining of large volumes of audio text by spotting sequence packages in the dialog.

  5. ADVANTAGES OF SPA
     • Can be applied to different languages: SPA works by identifying interactional features of dialog (conversational sequence patterns) rather than a preset glossary of words.
     • Can perform data mining in real time: a human analyst can be brought in immediately when high-alarm content is being produced in the dialog.
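The real-time point can be sketched as an incremental monitor that raises an alert the moment a sequence package completes. This is only an illustration under stated assumptions: the component labels are hypothetical names for the utterance components listed on slide 9, an upstream annotator is assumed to emit them one at a time, and treating a completed package as the trigger for calling in an analyst is an assumption, not something the slides specify.

```python
from collections import deque

# Hypothetical component labels, assumed to arrive one at a time
# from an upstream annotator as the call proceeds.
PACKAGE = ("referent_rising", "brief_pause", "silence",
           "clarification", "referent_repeat", "recognition_marker")

def monitor(label_stream, alert):
    """Call alert(position) as soon as the most recent labels form PACKAGE."""
    window = deque(maxlen=len(PACKAGE))  # keeps only the latest labels
    for position, label in enumerate(label_stream):
        window.append(label)
        if tuple(window) == PACKAGE:
            alert(position)

alerts = []
live_labels = iter(["other", "referent_rising", "brief_pause", "silence",
                    "clarification", "referent_repeat", "recognition_marker",
                    "other"])
# In practice the callback would notify a human analyst;
# here it records the stream position of the completed package.
monitor(live_labels, alerts.append)
```

Because the monitor consumes an iterator and keeps only a fixed-size window, it never needs the whole recording in memory.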

  6. Dialog Example
     Speaker “A”: Come to the intersection near Juniors? (the question mark shows an upward intonation)
     0.2-0.5 second pause (the speaker pauses briefly)
     Speaker “B”: 1.2 second pause
     Speaker “A”: You know the thoroughfare with the big traffic light?
     Speaker “B”: Juniors, yeah.

  7. THE SEQUENCE PACKAGE
     Speaker “A”: Come to the intersection near Juniors?
     0.2-0.5 second pause
     Speaker “B”: 1.2 seconds of silence
     • A noun referent (“Juniors”) with an upward intonation.
     • A brief pause, giving the listener the chance to show recognition or ask for clarification.
     • Silence by the listener, which indicates lack of understanding or confusion.

  8. Speaker “A”: You know the thoroughfare with the big traffic light?
     Speaker “B”: Juniors, yeah.
     • Clarification of the noun referent (“You know the thoroughfare with...”).
     • Repeat of the noun referent (“Juniors”), the source of the recognition trouble, followed by a recognitional marker (“Yeah”).
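The two kinds of gap in this package differ only in duration, which suggests a simple labeling step. The sketch below is an assumption-laden illustration: the 0.2-0.5 second range for a brief pause comes from the dialog example, while the 1.0 second minimum for silence and the `classify_gap` helper are hypothetical (the slides give only the single 1.2 second value).

```python
def classify_gap(seconds, brief=(0.2, 0.5), silence_min=1.0):
    """Label a gap between speaking turns by its duration in seconds."""
    low, high = brief
    if seconds >= silence_min:   # assumed threshold; slides show 1.2 s
        return "silence"
    if low <= seconds <= high:   # range given in the dialog example
        return "brief_pause"
    return "other"               # too short, or between the two ranges

labels = [classify_gap(0.3), classify_gap(1.2)]
```

For the dialog example this labels the gap after Speaker “A” as a brief pause and Speaker “B”'s gap as silence.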

  9. Finding the Sequence Package in the Dialog Example
     Look for a concatenation of these utterance components:
     • noun referent with upward intonation
     • brief pause
     • silence
     • clarification of noun referent
     • repeat of the noun referent that was the initial source of the recognition trouble
     • recognitional marker
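The search for this concatenation can be sketched as a match over a list of labeled components. This is a minimal illustration, not the author's implementation: the label names, the `Component` class, and the `find_sequence_packages` function are hypothetical, and an upstream step is assumed to have already annotated intonation, pauses, and referents.

```python
from dataclasses import dataclass

# Component labels in the order listed on the slide. The label names are
# hypothetical; an upstream annotator is assumed to produce them.
PACKAGE = [
    "referent_rising",     # noun referent with upward intonation
    "brief_pause",
    "silence",
    "clarification",       # clarification of the noun referent
    "referent_repeat",     # repeat of the noun referent
    "recognition_marker",  # e.g. "yeah"
]

@dataclass
class Component:
    label: str
    text: str = ""  # the noun referent, for labels that carry one

def find_sequence_packages(components):
    """Return (start_index, referent) for each run that matches PACKAGE."""
    found = []
    n = len(PACKAGE)
    for i in range(len(components) - n + 1):
        window = components[i:i + n]
        if [c.label for c in window] != PACKAGE:
            continue
        first, repeat = window[0], window[4]
        # The repeated referent must be the one that caused the trouble.
        if first.text.lower() == repeat.text.lower():
            found.append((i, first.text))
    return found

# The dialog example, already reduced to labeled components.
example = [
    Component("referent_rising", "Juniors"),
    Component("brief_pause"),
    Component("silence"),
    Component("clarification"),
    Component("referent_repeat", "Juniors"),
    Component("recognition_marker"),
]
packages = find_sequence_packages(example)
```

The returned referent (“Juniors”) is a candidate key word even though no lexicon was consulted, because the match depends only on the interactional pattern.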

  10. Private Industry Applications for SPA
     • Technical Support Centers
     • Help Desks
     • Call Centers
     • Broadcast Media
     • Depositions
     • Courtroom Testimony
     • Corporate Communications
     • Conference Managers

  11. Sequence Package Analysis: A New Data Mining Tool to Speed Up Wiretap Analysis. Amy Neustein, Ph.D., Linguistic Technology Systems, lingtec@banet.net
