This paper discusses the implementation of text classification and clustering techniques in conference management systems. By automating tasks such as paper submissions, topic assignments, and review processes, we enhance the overall workflow. Focusing on case studies from ECDL 2005 and the ECR, we analyze the effectiveness of automatic topic classification based on abstracts. Results indicate that quality improves with the volume of training documents, allowing for better thematic arrangements and accessibility post-conference. Our findings demonstrate significant accuracy rates in topic classification, indicating potential for further automation.
Applying Text Classification in Conference Management: Some Lessons Learned
Andreas Pesenhofer, Helmut Berger, Michael Dittenbach, Andreas Rauber
Overview • Conference Management Systems • Classification & Clustering • Case Studies • ECDL 2005 • ECR • Conclusions
Conference Management Systems • Set of tools to support the conference workflow • Basic support for paper submission & review collection • Many tasks for further automation • Selection of the program committee • Topic assignment of submissions • Paper-to-reviewer assignment • Support in review generation • Poster arrangement • Post-conference access to papers
Classification & Clustering • Topic assignment of submissions • Problem: authors uncertain about precise topic assignment (conference terminology) • Solution: support by automatic assignment • Method: ATC based on abstracts • Poster arrangement & post-conference access to papers • Problem: topic-based arrangement • Solution: clustering • Method: SOM (self-organizing map) & Mnemonic SOM
ATC for topic assignment • Train model based on previous conferences • Abstract submission • Automatic assignment • Confirmation
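The four steps above (train, submit, assign, confirm) can be sketched as follows. This is only an illustration: the slides do not name the classifier, so a small multinomial Naive Bayes model stands in for it, and the class `TopicSuggester`, the function `final_topic`, and the toy abstracts and topic names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class TopicSuggester:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in)."""

    def fit(self, abstracts, topics):
        self.term_counts = defaultdict(Counter)  # topic -> term frequencies
        self.doc_counts = Counter(topics)        # topic -> number of abstracts
        self.vocab = set()
        for text, topic in zip(abstracts, topics):
            tokens = tokenize(text)
            self.term_counts[topic].update(tokens)
            self.vocab.update(tokens)
        return self

    def suggest(self, abstract):
        # Terms never seen during training carry no evidence and are skipped.
        tokens = [t for t in tokenize(abstract) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_topic, best_score = None, -math.inf
        for topic, n_docs in self.doc_counts.items():
            counts = self.term_counts[topic]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


def final_topic(suggested, author_choice=None):
    """Confirmation step: the author may accept or override the suggestion."""
    return author_choice if author_choice is not None else suggested


# Step 1: train on abstracts of previous conferences (toy data).
previous_abstracts = [
    "metadata harvesting for digital library interoperability",
    "metadata schemas and interoperability of repositories",
    "user interface evaluation with library users",
    "usability study of search interface for users",
]
previous_topics = ["interoperability", "interoperability", "usability", "usability"]
model = TopicSuggester().fit(previous_abstracts, previous_topics)

# Steps 2-4: a new abstract arrives, a topic is suggested, the author confirms.
suggested = model.suggest("evaluation of a new interface with users")
confirmed = final_topic(suggested)
```

Keeping the confirmation step separate from the model reflects the workflow on the slide: the classifier only proposes a topic, and the author has the last word.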
Clustering for organization • Arrange posters thematically • Non-rectangular SOMs reflecting conference site • Mnemonic SOMs simplify post-conference paper access
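To make the idea of a non-rectangular map concrete, here is a minimal self-organizing map sketch in which the units are given as an arbitrary set of grid positions (an L shape in the toy example). It is not the authors' Mnemonic SOM implementation; the function names, the training schedule, and the poster vectors are assumptions chosen for illustration.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, units, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a SOM whose units sit at the given (row, col) positions.

    Because `units` is just a list of positions, the map can take any shape,
    for example one that follows the floor plan of a poster area.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {u: [rng.random() for _ in range(dim)] for u in units}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1 - frac), 0.5)  # neighbourhood shrinks
        for x in data:
            bmu = best_unit(weights, x)
            for u, w in weights.items():
                d2 = (u[0] - bmu[0]) ** 2 + (u[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical L-shaped layout: three units in a column, then two to the right.
units = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]

# Hypothetical term-weight vectors for four posters (two rough themes).
posters = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.1], [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]

weights = train_som(posters, units)
placement = [best_unit(weights, p) for p in posters]
```

After training, `placement` holds one map position per poster; posters with similar term vectors tend to land on the same or neighbouring units, which is what a thematic arrangement relies on.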
ECDL 2005 – ATC data • English abstracts of previous ECDL conferences • Topics of the conference call -> seven defined categories • Pre-processing (removal of all numbers, punctuation marks and special characters; transformation to lower case) • tf-idf weighting • 4,141 unique terms • Information gain (IG) term selection: 3,460 top-ranked terms, average accuracy over all categories of 58.60%
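The pre-processing and weighting steps listed above can be sketched in a few lines. The slides do not state which tf-idf variant was used, so the common form tf × log(N / df) is assumed here; the function names and the two sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lower-case the text and drop numbers, punctuation and special characters.

    Simplification: only the letters a-z survive, which suits English abstracts.
    """
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return cleaned.split()


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weighted = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weighted.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weighted


docs = [
    "Digital Libraries: 2 new metadata models!",
    "Metadata for digital archives (2005)",
]
vectors = tfidf(docs)
unique_terms = set().union(*vectors)
```

A term that occurs in every document, such as "digital" in this toy corpus, receives weight zero under this variant and so cannot help to tell the categories apart.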
ECDL 2005 – SOM data • Poster and Paper Organization: • full text of accepted posters of ECDL 2005 • term selection based on minimal word length and document frequencies • 30 posters - 569 terms • Post-conference access • 71 papers and posters – 5,654 terms
ECR - Data • Abstracts of the ECR: European Congress of Radiology • Training set: ECR 2003 & 2004 - 1,952 documents • Test set: ECR 2005 - 924 documents • Same steps as for the ECDL data • Resulting in 14,887 unique terms • Information gain (IG) term selection: 5,720 top-ranked terms, average accuracy over all categories of 73.57%
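Ranking terms by information gain, as in the last bullet, can be sketched as follows. The code uses the standard definition IG(t) = H(C) − P(t)·H(C | t present) − P(t absent)·H(C | t absent); the function names, the toy documents and the category labels are hypothetical and not taken from the ECR data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of category labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(term, docs, labels):
    """Reduction in category entropy from knowing whether `term` occurs.

    docs is a list of term sets, labels the category of each document.
    """
    present = [lab for d, lab in zip(docs, labels) if term in d]
    absent = [lab for d, lab in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - len(present) / n * entropy(present)
        - len(absent) / n * entropy(absent)
    )


def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain (ties: alphabetical)."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: (-information_gain(t, docs, labels), t))[:k]


# Toy training set: each document is the set of terms in one abstract.
train_docs = [{"ct", "lung"}, {"ct", "liver"}, {"mri", "lung"}, {"mri", "liver"}]
train_labels = ["chest", "abdomen", "chest", "abdomen"]

selected = top_terms(train_docs, train_labels, k=2)
```

In this toy set the terms "lung" and "liver" separate the two categories perfectly, while "ct" and "mri" occur equally often in both and therefore have zero gain.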
Conclusions • Classification quality improves with the amount of training documents • Structure of the classes (overlapping?) • The bulk of submissions can be dealt with automatically • May be used for session assignment • Arrange posters & papers thematically • Easy to memorize & find
Questions? E-Commerce Competence Center Donau-City-Strasse 1 1220 Vienna Austria Phone: +43/1/522 71 71-20 Fax: +43/1/522 71 71-71 Internet: http://www.ec3.at/ E-Mail: office@ec3.at