A Framework for Effective Annotation of Information from Closed Captions Using Ontologies

Authors: Latifur Khan, Dennis McLeod, Eduard Hovy. Presenter: Mohamed Mustafa Khimani.


Presentation Transcript


  1. A Framework for Effective Annotation of Information from Closed Captions Using Ontologies
     - Authors: Latifur Khan, Dennis McLeod, Eduard Hovy
     - Presenter: Mohamed Mustafa Khimani

  2. TOPICS
     - Introduction
     - Related work
     - Content extraction
     - Ontologies
     - Metadata acquisition and management of metadata
     - Experimental implementation
     - Conclusions

  3. INTRODUCTION
     - Keyword-based techniques and use of a query expansion mechanism
     - Ontology-based model
     - Extraction of semantic concepts from keywords
     - Document indexing
     - Precision and recall
     - Effective selection/retrieval of audio information

  4. RELATED WORK
     - Query expansion through use of semantically related terms, e.g. using WordNet
     - Use of a conceptual distance measure between query and document to model relevance

  5. CONTENT EXTRACTION
     - Fully automated content extraction: converting speech to equivalent text
     - Selected content extraction: word spotting
     - An audio object is composed of a sequence of contiguous segments
     - Audio object Oi is defined as (Idi, Si, Ei, Vi, Ai)
     - Vi (description) is a finite set of tags or labels
     - e.g. {10, 1145.59, 1356.00, {Gretzky Wayne}, *}
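
The audio object tuple above can be sketched as a small record type. This is only an illustration: the field names are hypothetical, and it assumes that Si and Ei are the start and end times of the segment and that Ai holds the audio data (shown as * on the slide).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AudioObject:
    """Sketch of O_i = (Id_i, S_i, E_i, V_i, A_i); field names are assumptions."""
    object_id: int                 # Id_i
    start: float                   # S_i, assumed to be the start time
    end: float                     # E_i, assumed to be the end time
    description: FrozenSet[str]    # V_i, the set of tags or labels
    audio: Optional[bytes] = None  # A_i, the audio data ("*" on the slide)

    def duration(self) -> float:
        """Length of the segment, under the start/end-time assumption."""
        return self.end - self.start


# The example object from the slide: {10, 1145.59, 1356.00, {Gretzky Wayne}, *}
clip = AudioObject(10, 1145.59, 1356.00, frozenset({"Gretzky Wayne"}))
```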

  6. ONTOLOGIES
     - An ontology defines a set of representational terms called concepts
     - Interrelationships among these concepts describe a target world
     - Ontology as a DAG: each node in the DAG represents a concept
     - Concept = unique label name + synonyms list (l1, l2, l3, …, li, …, ln); user requests are matched against an element li of this list
     - Interrelationships
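
A minimal sketch of how such an ontology might be stored and queried. The class and method names are hypothetical, the three concepts are an illustrative fragment rather than the presenters' ontology, and matching the label as well as the synonyms is an assumption. Requiring every parent to exist before a child is added keeps the graph acyclic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    label: str                                         # unique label name
    synonyms: List[str] = field(default_factory=list)  # l1 ... ln
    parents: List[str] = field(default_factory=list)   # several parents allowed (DAG)


class Ontology:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}

    def add(self, label, synonyms=(), parents=()):
        if label in self.concepts:
            raise ValueError(f"duplicate concept label: {label}")
        for parent in parents:
            # Parents must already exist, so no cycle can ever be formed.
            if parent not in self.concepts:
                raise KeyError(f"unknown parent concept: {parent}")
        self.concepts[label] = Concept(label, list(synonyms), list(parents))

    def match(self, term: str) -> List[str]:
        """Return labels of concepts whose label or a synonym equals the term."""
        wanted = term.lower()
        return [
            c.label
            for c in self.concepts.values()
            if wanted in [c.label.lower()] + [s.lower() for s in c.synonyms]
        ]


onto = Ontology()
onto.add("NBA")
onto.add("Los Angeles Lakers", synonyms=["Lakers"], parents=["NBA"])
onto.add("Bryant Kobe", synonyms=["Kobe Bryant", "Bryant"], parents=["Los Angeles Lakers"])
```

With this fragment, `onto.match("lakers")` returns the single label "Los Angeles Lakers", and a term with no matching element returns an empty list.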

  7. NPC Region = League + Team + Player; disjoint concepts

  8. ONTOLOGIES
     - Each league, together with its teams and players, forms a region
     - During annotation of concepts, a particular region is chosen
     - Due to the disjoint property, objects are associated with only one disjoint concept rather than two
     - A player may play in several leagues; this can be modeled as either:
       - multiple instances of the player in the ontology (sub-tree), or
       - a single instance with two parents (DAG)

  9. Disjoint, Is-A, Part-Of

  10. METADATA ACQUISITION
      - The process through which descriptions are provided
      - Extract concepts from keywords
      - Concept scoring
      - Stemming: "comput" covers computer, computation, etc.
      - Keyword to concept mapping is 1 : many
      - Disambiguation: a set of keywords occurring together determine a context for one another
      - Disambiguation methods:
        - Co-occurrence (disambiguate across several regions)
        - Semantic closeness (disambiguate within a region)
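
To illustrate the stemming step, here is a deliberately crude suffix stripper. It is a toy written for this example, not the stemmer used in the work; a real system would more likely use an established algorithm such as the Porter stemmer. The suffix list and function names are assumptions.

```python
# Toy suffix list, checked in order; chosen only to cover the slide's example.
SUFFIXES = ("ation", "ing", "er", "e")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(words):
    """Group surface forms under their shared stem."""
    groups = {}
    for w in words:
        groups.setdefault(toy_stem(w), []).append(w)
    return groups


groups = group_by_stem(["computer", "computation", "compute"])
```

All three words end up under the stem "comput", so a caption containing any of them would be looked up under the same key.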

  11. METADATA ACQUISITION
      - E.g. "Lakers keep grooving with 8th straight win. Kobe Bryant scores 21 points as the Lakers remain perfect on their eastern road trip with a 97-89 triumph over the Nets. Bryant discussed the eight game win streak and his performance in the All Star game."
      - Lakers: Los Angeles Lakers
      - Nets: New Jersey Nets
      - Bryant: Reeves Bryant, Bryant Mark, Bryant Kobe
      - Eastern: Eastern Washington, Eastern Michigan
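
The co-occurrence idea can be sketched on this caption as follows. The candidate table mirrors the keyword-to-concept mapping on the slide, but the region names ("NBA", "NCAA") and the region assigned to each concept are assumptions made for illustration, and a simple count of distinct matching keywords stands in for the region score.

```python
from collections import defaultdict

# keyword -> list of (candidate concept, region); regions are assumed.
CANDIDATES = {
    "Lakers": [("Los Angeles Lakers", "NBA")],
    "Nets": [("New Jersey Nets", "NBA")],
    "Bryant": [("Reeves Bryant", "NBA"), ("Bryant Mark", "NBA"), ("Bryant Kobe", "NBA")],
    "Eastern": [("Eastern Washington", "NCAA"), ("Eastern Michigan", "NCAA")],
}


def pick_region(keywords):
    """Choose the region supported by the most distinct keywords."""
    support = defaultdict(set)
    for kw in keywords:
        for _concept, region in CANDIDATES.get(kw, []):
            support[region].add(kw)
    if not support:
        return None
    return max(support, key=lambda region: len(support[region]))


def candidates_in_region(keywords, region):
    """Keep only the candidate concepts that fall inside the chosen region."""
    kept = {}
    for kw in keywords:
        concepts = [c for c, r in CANDIDATES.get(kw, []) if r == region]
        if concepts:
            kept[kw] = concepts
    return kept


keywords = ["Lakers", "Nets", "Bryant", "Eastern"]
region = pick_region(keywords)
kept = candidates_in_region(keywords, region)
```

Under these assumptions the "Eastern" candidates are dropped because they lie outside the chosen region, while "Bryant" still has three candidates. Separating those is the job of the second method, semantic closeness within a region.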

  12. FORMAL DEFINITIONS
      - Element-score (Escoreij): score of element lj for a particular concept Ci
      - Concept-score (Scorei): Scorei = max Escoreij, where 1 <= j <= n
      - Region-score (CscoreR): for a region R, the sum of the concept-scores of the selected concepts that belong to this region
      - Semantic distance SD(Ci, Cj): the shortest path between two concepts Ci and Cj
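
These definitions can be sketched directly in code. The element scores are taken as given inputs, since the slide does not say how an Escore is computed. The function names, the sample scores, and the small edge list are hypothetical, and semantic distance is computed here as the number of edges on the shortest path, ignoring edge direction.

```python
from collections import deque


def concept_score(element_scores):
    """Score_i = max over j of Escore_ij."""
    return max(element_scores)


def region_score(concept_scores, region_of, region):
    """Sum of the concept-scores of selected concepts that belong to the region."""
    return sum(s for c, s in concept_scores.items() if region_of[c] == region)


def semantic_distance(edges, a, b):
    """Shortest path length (edge count) between a and b, or None if unreachable."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:  # breadth-first search
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Illustrative ontology fragment and scores (assumed values).
EDGES = [
    ("NBA", "Los Angeles Lakers"),
    ("NBA", "New Jersey Nets"),
    ("Los Angeles Lakers", "Bryant Kobe"),
]
scores = {"Los Angeles Lakers": 1.0, "New Jersey Nets": 1.0, "Eastern Michigan": 0.5}
region_of = {"Los Angeles Lakers": "NBA", "New Jersey Nets": "NBA", "Eastern Michigan": "NCAA"}
```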

  13. FORMAL DEFINITIONS
      - Propagated-score (Si): Si = Scorei + Scorej / SD(Ci, Cj) + …
      - Smax: for an object, Smax is the largest propagated-score Si among all its selected concepts
      - Threshold-score (γscore): a certain fraction of the object's Smax, i.e. Smax * threshold-constant, where the constant lies between 0 and 1
      - For high values of the threshold, we discard many irrelevant concepts but may also lose some relevant ones
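
A small worked sketch of propagated scores and thresholding. It assumes that the trailing terms in the formula stand for one Scorej / SD(Ci, Cj) term per other selected concept Cj. The concept scores, the distances, and the threshold constant of 0.7 are made-up values for illustration only.

```python
def propagated_scores(scores, distance):
    """S_i = Score_i + sum over other concepts j of Score_j / SD(C_i, C_j)."""
    result = {}
    for ci, score_i in scores.items():
        total = score_i
        for cj, score_j in scores.items():
            if cj == ci:
                continue
            d = distance(ci, cj)
            if d:  # skip unreachable (None) or zero distances
                total += score_j / d
        result[ci] = total
    return result


def apply_threshold(propagated, threshold_constant):
    """Keep concepts whose propagated score reaches Smax * threshold_constant."""
    s_max = max(propagated.values())
    cutoff = s_max * threshold_constant
    return {c: s for c, s in propagated.items() if s >= cutoff}


# Assumed concept scores and pairwise semantic distances.
SCORES = {"Los Angeles Lakers": 1.0, "Bryant Kobe": 0.5, "Bryant Mark": 0.5}
DIST = {
    frozenset({"Los Angeles Lakers", "Bryant Kobe"}): 1,
    frozenset({"Los Angeles Lakers", "Bryant Mark"}): 3,
    frozenset({"Bryant Kobe", "Bryant Mark"}): 4,
}


def sd(a, b):
    return DIST.get(frozenset({a, b}))


prop = propagated_scores(SCORES, sd)
kept = apply_threshold(prop, 0.7)
```

With these numbers, "Bryant Kobe" gets 0.5 + 1.0/1 + 0.5/4 = 1.625, while "Bryant Mark" gets only about 0.96 because it is far from the other selected concepts, so it falls below the cutoff of 0.7 * Smax and is discarded.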

  14. CHARACTERISTICS
      - Relevant concepts may be discarded along with irrelevant ones: a relevant concept that does not correlate with other concepts will have a low Si
      - If there is no correlation, the algorithm fails to resolve the ambiguity, and all selected concepts are kept
      - Due to the incompleteness of ontologies, some irrelevant concepts may be associated with an object
      - Disambiguation fails when there is little or no correlation among the selected concepts

  15. IMPLEMENTATION

  16. IMPLEMENTATION
      - Total number of clips: 2,481
      - Maximum length of a clip: 5 min
      - Average size of the closed caption for a clip: 25 words
      - Total number of concepts in the ontologies: 7,000
      - Average number of concepts associated with an object: 4.47

  17. RESULTS

  18. RESULTS

  19. CONCLUSION
      - The proposed ontology can be used to generate information selection requests in database queries
      - Can be extended to video with closed captions
      - Better than the keyword-based technique
      - Cost of building domain-specific ontologies and connecting domain data to them automatically
      - Evolving ontologies, extracting highlighted sections of audio, addressing retrieval questions in the video domain, facilitation of cross-media indexing

  20. Thank you
