
Unlock the books with IntelligentCAPTURE

Xavier Baumgartner, University of St. Gallen


Presentation Transcript


  1. Unlock the books with IntelligentCAPTURE
     Xavier Baumgartner, University of St. Gallen

  2. Outline
     • 1 Background of the Project:
       - Euregio Bodensee - Library Cooperation
       - Project AGI and VLB = Vorarlberger Landesbibliothek
       - IBH = Internationale Bodenseehochschule
     • 2 Project Partners:
       - AGI: http://www.agi-imc.de/
       - Libraries

  3. Outline
     • 3 Project Tools:
       - intelligentCAPTURE
       - IC CAI-Engine
       - intelligentSEARCH
     • 4 Project Results:
       - Library catalogue: http://www.vorarlberg.at/vlb/
       - Portal: http://www.dandelon.com

  4. 1 Background: Euregio Bodensee
     • Region extending roughly 50 km around Lake Constance (Bodensee)
     • Covers the southern German districts of Konstanz, Sigmaringen, Ravensburg, Lindau, Oberallgäu, and Bodenseekreis
     • Austrian province of Vorarlberg
     • Swiss cantons of St. Gallen, Schaffhausen, Appenzell Innerrhoden, and Appenzell Ausserrhoden
     • Principality of Liechtenstein

  5. 1 Background: Euregio Bodensee - Library Cooperation
     • http://www.ub.uni-konstanz.de/euregio/bodkat.htm
     • http://www.ub.uni-konstanz.de/boddb/

  6. 1 Background: IBH = Internationale Bodensee-Hochschule (International Lake Constance University)
     • Virtual university
     • Network of 24 independent universities
     • Aim: promote cooperation among member universities in the fields of science, research, and infrastructure
     • Use synergies to mutual advantage

  7. 2 Project Partners: AGI - Information Management Consultants
     • Focused on information and knowledge management
     • Consulting
     • Software development and long-term maintenance
     • Uses advanced recognition technologies in:
       - Automatic indexing and text mining (CAI)
       - Machine translation (MT)
       - Optical character recognition (OCR)
       - Recognition of text structures in PDF documents
       - Voice recognition

  8. 2 Project Partners: AGI - Information Management Consultants
     • Products:
       - Based on the IBM technical platform Lotus Notes & Domino
       - intelligentCAPTURE -> tool for document capturing and machine indexing
       - IC INDEX -> tool for developing topic maps, taxonomies, thesauri, and classifications
       - intelligentSEARCH -> tool for information retrieval and visualization

  9. 2 Project Partners: Libraries
     • University of Applied Sciences Dornbirn
     • University of Applied Sciences Kempten
     • University of Applied Sciences Liechtenstein
     • Central Library Zurich for University Zurich
     • University of Applied Sciences Konstanz
     • University of St. Gallen

  10. 3 Project tools: intelligentCAPTURE
      • Software intelligentCAPTURE installed locally and connected to a scanner
      • Workflow:
        - Identification of the document via barcode
        - Scanning of the book's table of contents
        - Character recognition process (OCR)
        - Quick check of the OCR result

  11. 3 Project tools: intelligentCAPTURE
      • Workflow (cont.):
        - Generation of PDF file
        - Compression of files
        - Automatic indexing (CAI engine)
        - Transfer of PDF file to file system
        - Export of indexing results and PDF files:
          to the local library system
          to the local intelligentSEARCH database
          to the central database, hosted by AGI
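The workflow on slides 10 and 11 can be sketched as a chain of small steps. This is only an illustrative outline, not the actual intelligentCAPTURE implementation: the `CaptureJob` class, all function names, and the word-length rule standing in for the CAI engine are hypothetical, and scanning and OCR are simulated by passing the recognized text in directly.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureJob:
    """One scanned table of contents, tracked through the workflow (hypothetical)."""
    barcode: str
    ocr_text: str = ""
    pdf_name: str = ""
    index_terms: list = field(default_factory=list)
    exported_to: list = field(default_factory=list)
    log: list = field(default_factory=list)


def identify(job):
    job.log.append("identified via barcode " + job.barcode)


def scan_and_ocr(job, page_text):
    # Stand-in for scanning plus OCR: the recognized text is passed in.
    job.ocr_text = page_text
    job.log.append("scanned and OCR processed")


def quick_check(job):
    # Reject empty OCR output; in practice an operator would inspect the result.
    if not job.ocr_text.strip():
        raise ValueError("OCR produced no text for " + job.barcode)
    job.log.append("OCR result checked")


def make_pdf(job):
    job.pdf_name = job.barcode + ".pdf"
    job.log.append("PDF generated and compressed")


def index(job):
    # Placeholder for the CAI engine: keep distinct words longer than 3 letters.
    words = {w.strip(".,").lower() for w in job.ocr_text.split()}
    job.index_terms = sorted(w for w in words if len(w) > 3)
    job.log.append("automatically indexed")


def export(job, targets):
    job.exported_to = list(targets)
    job.log.append("exported to " + ", ".join(targets))


def run_workflow(barcode, page_text):
    job = CaptureJob(barcode)
    identify(job)
    scan_and_ocr(job, page_text)
    quick_check(job)
    make_pdf(job)
    index(job)
    export(job, ["local library system",
                 "local intelligentSEARCH database",
                 "central AGI database"])
    return job


job = run_workflow("B0001", "1 Library cooperation. 2 Automatic indexing.")
print(job.index_terms)
```

Each step only records what it did, so the log mirrors the order of the bullets above; the three export targets are the ones named on slide 11.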

  12. 3 Project tools: IC CAI-Engine
      • Automatic indexing is much more specific and comprehensive than title indexing and intellectual indexing with a controlled vocabulary
      • Document analysis based on linguistic methods and procedures from computational linguistics
      • All words are reduced to their linguistic base form (morphemes)
      • Uses large semantic nets (thesauri, topic maps, etc.)
      • Statistical rules for relevance ranking
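A toy version of the three ideas on this slide (base-form reduction, thesaurus lookup, statistical ranking) might look as follows. It is a minimal sketch under strong assumptions: the `BASE_FORMS` table, the `THESAURUS` mapping, and the stopword list are invented for illustration, and plain term frequency stands in for whatever statistical rules the CAI engine really applies.

```python
from collections import Counter

# Hypothetical base-form table; a real engine would use morphological analysis.
BASE_FORMS = {
    "libraries": "library",
    "catalogues": "catalogue",
    "indexed": "index",
    "indexing": "index",
}

# Hypothetical mini-thesaurus mapping base forms to preferred descriptors.
THESAURUS = {
    "library": "Libraries",
    "catalogue": "Library catalogues",
    "index": "Indexing",
}

STOPWORDS = {"the", "of", "and", "in", "a", "for", "are"}


def base_form(word):
    """Reduce a word to its base form, or return it unchanged if unknown."""
    return BASE_FORMS.get(word, word)


def rank_descriptors(text):
    """Return (descriptor, count) pairs, most frequent first."""
    tokens = [t.strip(".,;:").lower() for t in text.split()]
    lemmas = [base_form(t) for t in tokens if t and t not in STOPWORDS]
    counts = Counter(THESAURUS[l] for l in lemmas if l in THESAURUS)
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


sample = "Indexing of libraries: catalogues are indexed in the library."
print(rank_descriptors(sample))
```

Because "indexing" and "indexed" share a base form, both count toward the same descriptor, which is the point of reducing words before consulting the thesaurus.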

  13. 3 Project tools: IC CAI-Engine
      • Output of the most important terms in groups:
        - geographical terms
        - personal/corporate terms
        - branches / areas of activity
        - descriptors: words from the internal thesaurus
        - important words and phrases from the text
      • Libraries: use a broad generic thesaurus of approx. 300,000 German terms and a smaller English thesaurus
      • Languages: German and English in use; French and Spanish available
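The grouped output listed on this slide could be represented as a mapping from group name to terms. The sketch below assumes simple lookup sets per group; the set contents and the `group_terms` function are hypothetical and far smaller than the thesaurus of roughly 300,000 terms mentioned above.

```python
# Hypothetical lookup sets, one per output group named on the slide.
GROUPS = {
    "geographical terms": {"vorarlberg", "konstanz", "liechtenstein"},
    "personal/corporate terms": {"agi", "ibm"},
    "branches / areas of activity": {"publishing", "education"},
    "descriptors": {"indexing", "thesaurus"},
}
FALLBACK = "important words and phrases"


def group_terms(terms):
    """Assign each term to the first group whose set contains it."""
    result = {name: [] for name in GROUPS}
    result[FALLBACK] = []
    for term in terms:
        for name, members in GROUPS.items():
            if term.lower() in members:
                result[name].append(term)
                break
        else:
            # No lookup set matched: keep the term as free text.
            result[FALLBACK].append(term)
    return result


grouped = group_terms(["Vorarlberg", "AGI", "education", "thesaurus", "table of contents"])
for name, members in grouped.items():
    print(name + ": " + ", ".join(members))
```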

  14. [Diagram] Library 1, Library 2, and Library 3 each run iCAPT alongside their ILS; each sends indexing results and PDF files to AGI.

  15. 3 Project tools: intelligentSEARCH
      • Search engine with a simple (Google-like) interface and IBM GTR (Global Text Retrieval) as the core engine
      • Search terms input -> automatically expanded semantically
      • Main features of GTR:
        - Operators: Boolean, adjacency, near, paragraph, sentence, right and left truncation, wildcard, fuzzy searching, sorting by relevance
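Semantic expansion of search terms can be illustrated with a small lookup: each term is replaced by an OR group of itself and its related terms, and the groups are combined with AND. The `SEMANTIC_NET` contents and the generic `AND`/`OR` notation are assumptions for illustration only; they are not GTR's actual query syntax or AGI's semantic nets.

```python
# Hypothetical semantic net: each term maps to related terms.
SEMANTIC_NET = {
    "library": ["libraries", "bibliothek"],
    "catalogue": ["catalog", "opac"],
}


def expand_query(query):
    """Expand each search term into an OR group; groups are joined by AND."""
    groups = []
    for term in query.lower().split():
        variants = [term] + SEMANTIC_NET.get(term, [])
        if len(variants) == 1:
            # No related terms known: keep the term as typed.
            groups.append(term)
        else:
            groups.append("(" + " OR ".join(variants) + ")")
    return " AND ".join(groups)


print(expand_query("library catalogue search"))
# prints: (library OR libraries OR bibliothek) AND (catalogue OR catalog OR opac) AND search
```

Terms without an entry in the net pass through unchanged, so the expansion never narrows what the user typed.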

  16. 3 Project tools: intelligentSEARCH
      • AGI-developed features:
        - Highlighting
        - Interfaces to the library system, booksellers, and the web via Google
        - Query expansion by semantic nets
        - Visualization and browsing of topic maps
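Highlighting of search terms in a result can be sketched with a regular expression. This is a generic illustration rather than AGI's implementation: the `highlight` function and the bracket markers are hypothetical, and matching is limited to whole words, case-insensitively.

```python
import re


def highlight(text, terms, left="[", right="]"):
    """Wrap whole-word, case-insensitive matches of any term in markers."""
    if not terms:
        return text
    # Longest terms first so that phrases win over their sub-words.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: left + m.group(0) + right, text)


print(highlight("Library catalogues and library portals", ["library", "portals"]))
# prints: [Library] catalogues and [library] [portals]
```

In combination with query expansion, the expanded terms rather than only the typed ones would be passed as `terms`, so that semantically matched words are marked as well.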

  17. 4 Project Results
      • Library OPAC Vorarlberger Landesbibliothek: http://vlb-katalog.vorarlberg.at
      • Portal: www.dandelon.com

  18. 4 Project Results: www.dandelon.com
      • Portal with semantic search engine (intelligentSEARCH)
      • Content: automatically indexed contents pages of books and other publications; PDF files of contents pages
      • Search terms expanded semantically
      • Relevance ranking
      • Highlighting

  19. 4 Project Results: www.dandelon.com
      • Links to libraries holding the book, to booksellers, and to internet search engines
      • View topic maps
