Prototyping Digital Libraries Handling Heterogeneous Data Sources – An ETANA-DL Case Study. ECDL 2004, Bath, England, September 2004. Unni Ravindranathan, Rao Shen, Marcos André Gonçalves, Weiguo Fan, Edward A. Fox, James W. Flanagan fox@vt.edu http://fox.cs.vt.edu
Prototyping Digital Libraries Handling Heterogeneous Data Sources – An ETANA-DL Case Study ECDL 2004, Bath, England, September 2004 Unni Ravindranathan, Rao Shen, Marcos André Gonçalves, Weiguo Fan, Edward A. Fox, James W. Flanagan fox@vt.edu http://fox.cs.vt.edu Virginia Tech, Blacksburg, VA, USA (and CWRU)
Acknowledgements (Selected) • Sponsors: NSF grant ITR-0325579; AOL, ASOR, CWRU, ETANA, Vanderbilt U., Virginia Tech • Faculty/Staff: Lillian Cassel, Debra Dudley, Roger Ehrich, Manuel Perez, Naren Ramakrishnan • VT (Former) Students: Aaron Krowne, Ming Luo, Fernando Das Neves, Ricardo Torres, Hussein Suleman
Acknowledgements (contd.) • Karen Borstad, MPP • Douglas Clark, Walla Walla College • Joanne Eustis, CWRU • Nick Fischio, CWRU • Paul Gherman, Vanderbilt U. • Andrew Graham, U. Toronto • Tim Harrison, U. Toronto • Larry Herr, Canadian University College • Christopher Holland, LRP • Paul Jacobs, Mississippi State U. • Douglas Knight, Vanderbilt U. • Stan LaBianca, Andrews U. • David McCreery, Willamette U. • Eric Meyers, Duke U. • Adam Porter, Illinois College • Jack Sasson, Vanderbilt U. • Tom Schaub, Indiana U. of Penn. • Randall Younker, Andrews U.
Outline • Problems • Background • Approach • ETANA-DL • ETANA-DL Prototype System • Modeling ETANA-DL • ETANA-DL Services • Analysis • Conclusions • Future Work
Problems • Interoperability among heterogeneous archaeological systems • Delay in publication of primary archaeological data • Lack of sustainable solutions to long-term preservation of valuable information • Lack of services useful to the archaeology community, including “traditional DL services” • Difficulty in understanding complex archaeological information systems • Difficulty in requirements elicitation for archaeological systems
Open Archives Initiative • Promotes interoperability among DLs • Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) • Data Provider • possesses metadata and shares it (internally / externally) • via well-defined OAI protocols (e.g., database servers) • Service Provider • harvests data from Data Providers • provides higher-level services to users
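The Data Provider / Service Provider split can be illustrated with a minimal Python sketch of the service-provider side: building an OAI-PMH `ListRecords` request and reading identifiers and titles out of a response. The base URL, the record identifier, and the sample response are made up for illustration and are not taken from ETANA-DL; only the `verb`, `metadataPrefix`, and `set` parameters and the two XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a service provider would request to harvest records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical response from a data provider (illustration only).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:bone-1</identifier>
        <datestamp>2004-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample bone record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract identifier and title from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL over HTTP and follow `resumptionToken` elements for large result sets; both are omitted here to keep the sketch self-contained.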
Traditional Digital Libraries [Figure: users reach the digital library's digital objects (programs, video, images, documents) through a monolithic and/or custom-built web-based application.]
Introduction to ODL (Open Digital Libraries) • Open Digital Libraries • Framework for componentized Digital Libraries • Design principles for components • Protocols for inter-component communications • Built upon OAI
Open Digital Libraries Approach [Figure: users reach the ETANA-DL site collections (bone, seed, figurine, pottery) through a user interface built on separate components: Search, Browse, Recent, Filter, Union.]
Basic ODL Model: An application for Archaeology [Figure: component diagram. Labels, in slide order: WWW Interface; User Interface; ODL Protocol; ETANA-DL Search Engine; ODL Protocol; ODL Service Provider Component; ODL Protocol; ETANA-DL Union Catalog; OAI-PMH; OAI-PMH; OAI Data Provider; Nimrin.]
Componentized services example [Figure: search flow diagram. Labels: User; User Interface; Query; Parsed XML Query; Search Handler Servlet; Query in the IRDB query language; IRDB Search Engine; Index DB; Results in XML; Results.]
5S Model – Informally • Digital libraries are complex information systems that: • help satisfy info needs of users (societies) • provide info services (scenarios) • organize info in usable ways (structures) • present info in usable ways (spaces) • communicate info with users (streams)
Solution – our approach • Applying and extending Digital Library (DL) techniques to solve the following problems: interoperability, making primary data available, data preservation • Modeling archaeological information systems using 5S theory to better understand the domain and design the system and the supported services • Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks: requirements elicitation, provide useful services.
ETANA-DL • Archaeological Digital Library • Applies and extends the OAI-PMH • Open Archives Initiative Protocol for Metadata Harvesting • Design considerations • Componentized • Distributed architecture • Extensible • Portable
ETANA Digital Library Core Components - DigBase • DigBase (DB) • Central repository - stores metadata • Union catalog - for the collections in ETANA-DL • Various kinds of digital objects – excavation records, images, text collections, etc. • General services - Search, Browse, Annotate, Recommend, etc. • Archaeology-specific services - artifact analysis, visualizations, artifact interpretation, workflows, etc.
ETANA Digital Library Core Components - DigKit • DigKit (DK) • A suite of tools for collecting and recording archaeological data in the field, that can be used for a new dig • Metadata will migrate to DigBase (DB). • Real-time collaborative archaeology: Metadata in DB will be rapidly available to others.
ETANA-DL Architecture [Figure: architecture diagram. Labels: DigBase; DigKit; Archaeological Site; OAI Data Provider; XOAI; Data Mapping Component; Union Catalog; OAI; Web Interface; Search Component; Index DB; Inverted Files; Browse Engine; Browse DB; Index; Configure; DB used by Services; Other ETANA-DL Services.]
Modeling ETANA-DL – An Archaeological DL Meta-model • Society model: Archaeologist, Service Manager, General public • Scenario model: Services (Repository building, Information Satisfaction, Value added, Domain specific) • Space model: Geographic space, Metric space, User interface • Structure model: Taxonomies (Spatial, Temporal, Artifact-specific); Metadata; Region, *Site, *Partition, *Sub-partition, *Locus, *Container, *Artifact • Stream model: Text, Video, Audio, Drawing, Photo, 3D
Modeling ETANA-DL – The ETANA-DL model • Society model: Archaeologist, ETANA-DL Service Manager, General public • Scenario model: Services (Harvesting, Converting; Searching, Browsing; Annotation, binding; Object comparison, marking item for analysis) • Space model: Web interface, Site-specific coordinate system, Vector space • Structure model: Taxonomies (Spatial, Archaeological periods, Bone type, Seed species); Field record, locus sheet; Jordan, Umayri: *Field *Square *Locus *Pail *Bone; Jordan Valley, Nimrin: *Quadrant *Square *Bag *Locus *Seed; Southern Israel, Halif: *Field *Area *Locus *Basket *Figurine • Stream model: Site/field plan (drawing), Figurine image (photo), Preliminary/Final Report (application/pdf)
Modeling ETANA-DL – Mapping heterogeneous data to the structural model
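A minimal Python sketch of what such a mapping could look like: each site's own excavation vocabulary is translated onto the generic levels of the structure model (Partition, Sub-partition, Locus, Container, Artifact). The level-by-level alignment is an assumption inferred from the order of terms on the model slides; in particular, the Nimrin slide lists Bag before Locus, and the sketch assumes Bag is the container inside a Locus, as Pail and Basket are at the other sites. The function name, record layout, and sample values are hypothetical.

```python
# Generic levels of the structure model, outermost first.
GENERIC_LEVELS = ["partition", "subpartition", "locus", "container", "artifact"]

# Site-specific term for each generic level.
# ASSUMPTION: alignment inferred from slide order, not confirmed by the source.
SITE_TERMS = {
    "Umayri": ["Field", "Square", "Locus", "Pail", "Bone"],
    "Nimrin": ["Quadrant", "Square", "Locus", "Bag", "Seed"],
    "Halif": ["Field", "Area", "Locus", "Basket", "Figurine"],
}


def to_generic(site, local_record):
    """Translate a record keyed by site-specific terms into generic levels.

    Levels missing from the local record are simply left out.
    """
    generic = {"site": site}
    for level, term in zip(GENERIC_LEVELS, SITE_TERMS[site]):
        if term in local_record:
            generic[level] = {"local_name": term, "value": local_record[term]}
    return generic


if __name__ == "__main__":
    # Hypothetical field values, for illustration only.
    print(to_generic("Umayri", {"Field": "A", "Square": "7K50", "Pail": "12"}))
    print(to_generic("Halif", {"Field": "IV", "Area": "G8", "Basket": "45"}))
```

Keeping the site's own term next to each value (`local_name`) lets a union catalog search across sites on the generic level while still displaying the vocabulary each excavation team uses.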
ETANA-DL Schema Design [Figure: schema tree. Labels: ETANA-DL Object; Owner; Collection; Object ID; Partition; Subpartition; Locus; Container; artifact types Seed, Figurine, Bone, ……; fields Name, Animal, Dimensions, Species, Count, Description, ……]
ETANA-DL Services: Categories • Information satisfaction • Searching • Browsing • Recommendation • Archaeology (Domain) specific • Object comparison • Marking items • Value-added • Annotation • Items of interest (Binding service) • Recent searches/discussions • User management
Multi Dimensional Browsing User context Site structure Temporal Object-specific
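One way to read multi-dimensional browsing is as narrowing a record set along several independent dimensions at once (site structure, time period, object type), with counts shown for the values still available. The Python sketch below illustrates that idea only; it is not the ETANA-DL browse engine, and the field names and sample records are hypothetical.

```python
def browse(records, **selected):
    """Return the records that match every selected dimension value."""
    return [
        r for r in records
        if all(r.get(dim) == value for dim, value in selected.items())
    ]


def facet_counts(records, dimension):
    """Count how many records fall under each value of one dimension."""
    counts = {}
    for r in records:
        key = r.get(dimension)
        if key is not None:
            counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical records, for illustration only.
RECORDS = [
    {"site": "Umayri", "period": "Iron Age", "object_type": "Bone"},
    {"site": "Umayri", "period": "Bronze Age", "object_type": "Bone"},
    {"site": "Nimrin", "period": "Iron Age", "object_type": "Seed"},
    {"site": "Halif", "period": "Iron Age", "object_type": "Figurine"},
]

if __name__ == "__main__":
    iron_age = browse(RECORDS, period="Iron Age")
    print(len(iron_age))                        # records left after one choice
    print(facet_counts(iron_age, "site"))       # remaining choices by site
    print(browse(RECORDS, site="Umayri", object_type="Bone"))
```

Because each dimension is just another key, a user can start from any of them and combine them in any order.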
Other services • Items of Interest (Binding service) • Recent searches/discussions • Recommendation • User management • Account creation • Login