Explore the effectiveness of unsupervised and supervised tracking models for multilingual stories using native language comparisons. The study involves creating training corpora with English and multilingual topics, developing unsupervised tracking ideas and models like Vector Space and TF-IDF, comparing relevance models, and discussing adaptation and incremental thresholds. Results reveal that native language comparisons and supervised models produce better outcomes. Future work includes exploring feedback methods and incorporating judgment costs in tracking systems.
UMASS-Amherst at TDT 2004 Unsupervised and Supervised Tracking Hema Raghavan
Outline • Create a training corpus • Unsupervised tracking • Supervised Tracking • Discussion
Creating a training corpus • For Tracking • 50% of topics are English • 50% are multilingual • Created a training corpus (for both supervised and unsupervised tracking) • 30 topics from TDT4 • 50% stories with primarily English topics • 50% multilingual stories
Unsupervised Tracking Ideas • Models • Vector Space • Relevance Models • Adaptation • Native Language comparisons
Unsupervised Tracking Models • Vector Space • TF-IDF • IDF is incremental • Relevance Models • State of the art, high performance system • Adaptation
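The "IDF is incremental" bullet can be illustrated with a minimal sketch in which document frequencies are updated as each story arrives, so term weights depend only on the stories seen so far. The class name, the smoothing constants in the IDF formula, and the raw term-frequency weighting are assumptions made for illustration; the slides do not give the exact weighting used in the UMASS runs.

```python
import math
from collections import Counter


class IncrementalTfidf:
    """TF-IDF weighting whose document frequencies grow as stories arrive."""

    def __init__(self):
        self.num_docs = 0
        self.doc_freq = Counter()

    def add_story(self, tokens):
        """Update the collection statistics with one newly seen story."""
        self.num_docs += 1
        self.doc_freq.update(set(tokens))  # count each term once per story

    def vector(self, tokens):
        """Weight a story using only the statistics seen so far."""
        tf = Counter(tokens)
        return {
            # Smoothed IDF (assumed form); always positive because df <= num_docs.
            term: count * math.log((self.num_docs + 1) / (self.doc_freq[term] + 0.5))
            for term, count in tf.items()
        }


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

In a tracking loop, each incoming story would be passed to `add_story` before it is weighted with `vector` and compared against the topic vector with `cosine`.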
Native Language Hypothesis • TDT tasks involve comparisons of models: • Story link detection: sim(Si, Sj) • Topic tracking: sim(Si, Tj) • It is more effective to measure similarity between models in the original language of the stories, than after machine translation into English • Quality of translation • Differences in score distributions • Trivially obvious? Hard to demonstrate in tracking
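One way to read the native-language hypothesis in code: keep one topic model per language, and score a story against the model in its own language when one exists, falling back to the machine-translated English text otherwise. This is only a sketch; the story layout (`lang`, `native`, `translated`), the language codes, and the fallback rule are hypothetical and are not taken from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def track_score(story, topic_models, fallback_lang="eng"):
    """Compute sim(S_i, T_j), preferring a native-language comparison.

    story: dict with keys 'lang', 'native' (term weights in the original
           language) and 'translated' (term weights of the English MT output).
    topic_models: dict mapping a language code to that topic's term weights.
    All of these names are illustrative assumptions.
    """
    lang = story["lang"]
    if lang in topic_models:
        # A topic model exists in the story's own language: compare natively.
        return cosine(story["native"], topic_models[lang])
    # Otherwise compare the translated story with the English topic model.
    return cosine(story["translated"], topic_models[fallback_lang])
```

Because native and translated comparisons can produce different score distributions, as the slide notes, a real system would likely also need per-language thresholds or score normalization, which this sketch omits.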
Submitted Runs • TF-IDF (UMASS4) • TF-IDF + adaptation (UMASS1) • TF-IDF + adaptation + native models (UMASS2) • Relevance Models + adaptation (UMASS5) • All submissions for primary evaluation condition.
Supervised Tracking • Creating a newswire-only training corpus. • Ideas • Models • Vector Space • Relevance Models • Native Language comparisons • Incremental Thresholds • Negative Feedback
Incremental Thresholds • Utility • Relevance judgments for both Hits and False-Alarms • Increment the YES/NO threshold when Utility falls below zero.
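The threshold rule can be sketched as follows. The utility formula and the increment size did not survive on this slide, so the linear utility (a credit of 2 per hit, a penalty of 1 per false alarm) and the `step` value below are assumptions, as is the choice to raise the threshold after every judged story for which the running utility is negative.

```python
def track_with_incremental_threshold(scored_stories, threshold, step=0.05,
                                     hit_credit=2.0, fa_penalty=1.0):
    """Make YES/NO decisions, raising the threshold while utility is negative.

    scored_stories: iterable of (score, is_on_topic) pairs in arrival order.
    step, hit_credit and fa_penalty are assumed values, not from the slides.
    """
    utility = 0.0
    decisions = []
    for score, is_on_topic in scored_stories:
        deliver = score >= threshold
        decisions.append(deliver)
        if not deliver:
            continue  # judgments are only available for delivered stories
        # The relevance judgment says whether this was a hit or a false alarm.
        utility += hit_credit if is_on_topic else -fa_penalty
        if utility < 0:
            threshold += step  # become more conservative
    return decisions, threshold, utility
```

For example, with a starting threshold of 0.25, a false alarm scored 0.30 pushes the threshold to about 0.30, so a later story scored 0.28 is rejected even though it would have passed originally.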
Negative Feedback • Relevance judgments for both Hits and False-Alarms • for a hit. • for a false alarm.
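The update rules for hits and false alarms are missing from this slide. As a stand-in, the sketch below uses a Rocchio-style update, a standard relevance-feedback technique that is swapped in here for illustration and is not confirmed to be the formula used in the UMASS system. The weights `alpha` and `beta` are hypothetical.

```python
def apply_feedback(topic, story, is_hit, alpha=0.5, beta=0.25):
    """Rocchio-style topic update from one relevance judgment (assumed form).

    topic, story: sparse term-weight dictionaries.
    A hit moves the topic toward the story; a false alarm moves it away.
    """
    weight = alpha if is_hit else -beta
    updated = dict(topic)  # leave the caller's topic model untouched
    for term, value in story.items():
        updated[term] = updated.get(term, 0.0) + weight * value
    # Drop terms whose weight is no longer positive, as is common with Rocchio.
    return {term: w for term, w in updated.items() if w > 0.0}
```

A hit therefore adds new terms from the story to the topic model, while a false alarm can only shrink or remove terms the topic already contains.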
Submitted Runs • Rel. Models (UMASS-2) • Optimized for TDT cost • Rel. Models + Inc. Thresholds (UMASS-1) • TF-IDF + adaptation + neg. feedback + inc. thresholds (UMASS-3) • TF-IDF + adaptation + native models (UMASS-4) • TF-IDF + adaptation + native models + neg. feedback + inc. thresholds (UMASS-7) • Optimized for T11SU
Supervised Tracking Results Cost: 0.0467
Results and Discussion • Supervision clearly helps. • Relevance models – a clear winner. • Negative Feedback helps. • Training set did not reflect test very well. • Min-cost versus T11SU
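The two metrics contrasted in the last bullet can be sketched side by side. The constants below (C_miss = 1.0, C_fa = 0.1, P_target = 0.02 for the TDT detection cost; a credit of 2 per hit, a penalty of 1 per false alarm, and a floor of -0.5 for T11SU) are the values commonly cited for the TDT and TREC-11 filtering evaluations; they do not appear in the slides and should be treated as assumptions to check against the official evaluation plans.

```python
def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """TDT-style normalized detection cost (constants are assumed defaults)."""
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of the better trivial system (always NO or always YES).
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


def t11su(hits, false_alarms, total_on_topic, min_nu=-0.5):
    """Scaled linear utility in the TREC-11 filtering style (assumed form)."""
    if total_on_topic <= 0:
        raise ValueError("total_on_topic must be positive")
    utility = 2.0 * hits - false_alarms          # linear utility
    normalized = utility / (2.0 * total_on_topic)  # 1.0 for a perfect system
    return (max(normalized, min_nu) - min_nu) / (1.0 - min_nu)
```

The detection cost is driven by error rates, while the utility is driven by counts of delivered stories, so a threshold tuned for one metric need not be a good choice for the other.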
Future Work • Exploration Exploitation trade-off. • What about feedback that is less on demand? • more realistic • Can add costs for judgments. • What about feedback like in the HARD task – Clarification forms?