This presentation reports results for adaptive and non-adaptive topic tracking systems built for TDT-2004 by the University of Maryland and the National Security Agency. It covers the system designs, the effect of score normalization on tracking cost, and open questions about incomplete relevance judgments. In the adaptive runs, one-pass score normalization raised the cost from 0.2438 to 0.3789; the non-adaptive run, which used no normalization, had a cost of 0.6507. Proposed next steps include continuous renormalization, parameter tuning on devtest data, and richer training sets.
Adaptive Topic Tracking at Maryland Tamer Elsayed, Douglas W. Oard, David Doermann University of Maryland, College Park Gary Kuhn National Security Agency TDT-2004
Outline • Results • System design • Interpreting the results • Next steps
Non-Adaptive Topic Tracking • No score normalization • Cost=0.6507 (plot: bottom left is better)
Adaptive Topic Tracking • No score normalization, unjudged treated as firmly off-topic • Cost=0.2438
Adaptive Topic Tracking • One-pass score normalization, unjudged treated as firmly off-topic • Cost=0.3789
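The slides report Cost values without saying how they are computed. As a rough sketch, the function below assumes the standard TDT normalized detection cost with the commonly used constants C_miss = 1, C_fa = 0.1, and P_target = 0.02. Those constants and the function name are assumptions for illustration, not values taken from the slides.

```python
def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Normalized detection cost for given miss and false-alarm rates.

    The default constants are assumed (conventional TDT settings),
    not taken from the slides.
    """
    # Expected cost: misses weighted by topic prior, false alarms by its complement.
    raw = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Divide by the cost of the cheaper trivial system
    # (always answer "off-topic" or always answer "on-topic").
    baseline = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return raw / baseline


# A system that never says "on-topic" misses everything and costs exactly 1.0.
print(normalized_detection_cost(p_miss=1.0, p_fa=0.0))
```

Under this definition a cost below 1.0 means the system beats the trivial baseline, which is one way to read the three Cost values above.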
Non-Adaptive System Design • TDT-5: Training Epoch, Evaluation Epoch • Compute log-odds n-gram weights • Compute story scores
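The two processing steps on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes word bigrams, add-one smoothing, and a length-averaged sum of weights as the story score. The function names, the smoothing constant, and the choice of n are all hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Lowercased word n-grams of a story (word bigrams are an assumption)."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_odds_weights(on_topic_stories, background_stories, n=2, alpha=1.0):
    """Log-odds weight per n-gram: log P(g | on-topic) - log P(g | background)."""
    on = Counter(g for s in on_topic_stories for g in ngrams(s, n))
    bg = Counter(g for s in background_stories for g in ngrams(s, n))
    vocab = set(on) | set(bg)
    # Additive smoothing keeps both probabilities non-zero.
    on_total = sum(on.values()) + alpha * len(vocab)
    bg_total = sum(bg.values()) + alpha * len(vocab)
    return {
        g: math.log((on[g] + alpha) / on_total) - math.log((bg[g] + alpha) / bg_total)
        for g in vocab
    }


def story_score(story, weights, n=2):
    """Average weight of the story's n-grams; unseen n-grams contribute 0."""
    grams = ngrams(story, n)
    if not grams:
        return 0.0
    return sum(weights.get(g, 0.0) for g in grams) / len(grams)


weights = log_odds_weights(
    ["earthquake hits coastal city", "earthquake damages coastal city"],
    ["stock market falls again", "football team wins final"],
)
print(story_score("earthquake hits coastal town", weights))
```

Scores produced this way are not comparable across topics, which is where score normalization would come in.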
Adaptive System Design • Corpora: TDT-4, TDT-5 • Epochs: Extended Training Epoch, Training Epoch, Evaluation Epoch • Compute log-odds n-gram weights • Compute story scores • Compute normalization factor • Normalize story scores
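The slide does not say which normalization was used. As one possible reading of the "compute normalization factor, then normalize story scores" steps, the sketch below fits a mean and standard deviation once on scores from earlier data and applies them unchanged afterwards. The z-score form and both function names are assumptions.

```python
import statistics


def fit_normalizer(training_scores):
    """One pass over earlier scores: return (mean, std), computed once."""
    mean = statistics.fmean(training_scores)
    std = statistics.pstdev(training_scores)
    # Guard against a zero spread so normalize() never divides by zero.
    return mean, (std if std > 0 else 1.0)


def normalize(score, normalizer):
    """Apply the fixed statistics to a new story score."""
    mean, std = normalizer
    return (score - mean) / std


normalizer = fit_normalizer([1.0, 2.0, 3.0])
print(normalize(3.0, normalizer))
```

Because the statistics are frozen after the first pass, they cannot follow a topic model that keeps changing during adaptation.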
Interpreting Non-Adaptive Results • Lack of normalization probably hurt! • What can we say about the effect of incomplete judgments?
Interpreting Adaptive Results (plots: not normalized vs. normalized) • Normalization hurt! • One-pass design is the problem • DET has limitations • Changing the threshold changes our topic model! • Threshold selection is now a critical path item • How does judgment density affect the results?
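A toy example may help show why the threshold affects the topic model and not just the final decisions. The sketch assumes a deliberately simple tracker: the model is a set of words, the score is the fraction of a story's words already in the model, and every story scored at or above the threshold is folded into the model. None of this is the authors' design; in a supervised adaptive setting, judgments for the stories declared on-topic would also gate the update.

```python
def track_adaptively(seed_story, stream, threshold):
    """Toy adaptive tracker (hypothetical): returns the final model and decisions."""
    model = set(seed_story.lower().split())
    decisions = []
    for story in stream:
        tokens = story.lower().split()
        # Score: share of the story's words already known to the topic model.
        score = sum(t in model for t in tokens) / len(tokens) if tokens else 0.0
        on_topic = score >= threshold
        decisions.append(on_topic)
        if on_topic:
            # Adaptation step: accepted stories extend the topic model.
            model.update(tokens)
    return model, decisions


stream = ["earthquake rescue city teams", "rescue teams arrive"]
loose_model, loose = track_adaptively("earthquake hits city", stream, threshold=0.5)
strict_model, strict = track_adaptively("earthquake hits city", stream, threshold=0.6)
print(loose, strict)
```

With the looser threshold the first story is accepted, its words enter the model, and the second story is then accepted as well. With the stricter threshold neither story is accepted and the model never changes. Each point on a DET curve therefore corresponds to a different model, not only a different cut-off on one fixed score list.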
Next Steps • Further explore normalization • Implement continuous renormalization • Tune parameters on devtest data • Decide between TDT-5 and TDT-4 • Is incomplete judging harmful? • Define richer training sets • Explicit queries • Many known on-topic/off-topic training stories • Models of (imperfect) behavioral feedback
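The slides do not describe how continuous renormalization would be implemented. One way to sketch the idea is to keep running score statistics that are updated after every story, so each new score is normalized against everything seen so far. The class below uses Welford's online mean and variance update; the class name and the z-score form are assumptions.

```python
class RunningNormalizer:
    """Online mean/variance (Welford's method) for renormalizing scores as they arrive."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, score):
        """Fold one new raw score into the running statistics."""
        self.n += 1
        delta = score - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (score - self.mean)

    def normalize(self, score):
        """Z-score against the statistics so far; 0.0 until a spread is available."""
        if self.n < 2:
            return 0.0
        std = (self.m2 / self.n) ** 0.5
        return (score - self.mean) / std if std > 0 else 0.0


normalizer = RunningNormalizer()
for raw in [1.0, 2.0, 3.0]:
    normalizer.update(raw)
print(normalizer.normalize(3.0))
```

In an adaptive tracker the statistics would probably need to be reset or decayed whenever the topic model changes, since scores produced by different models are not directly comparable.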
Our Favorite Quote of the Day • “It takes time to get the implementation correct” [Yiming] • We had 30 days from project initiation to non-adaptive submission