Explore the challenges and solutions encountered at Algoma University in managing its institutional repository. This comprehensive overview discusses the significance of a repository as a storage service for published outputs, delving into issues like unrelated data, storage capacity, and accessibility of massive datasets. Learn how technologies such as Fedora, Drupal, and OpenSRF enhance data handling and retrieval. Discover strategies for making legacy data and diverse datasets searchable while considering digital rights and public access to information.
The Algoma University Experience By Robin Isard, Algoma University Repositories
What is an institutional repository? • A storage service for a given institution's published output • Can be physical or virtual
Why are repositories difficult to deal with? • Unrelated data • Storage • Amount of data • Length of time • No obvious approach • Record management? • Open publishing? • Digital archive?
Access • How do you make massive datasets searchable? • How do you make legacy data available? (The WordPerfect Problem) • How do you provide access to completely unrelated datasets? • Should people even be able to access the data? Are we “publishing” it? • Digital Rights Management?
The Algoma Experience • A variety of data in a variety of places • Archival data in a legacy system • 23,000 photos • Health Informatics • Geo-Information Data • Faculty Research • Process published in the repository • Output published to the web
Technologies we're looking at • Fedora • Drupal • OpenSRF • Solr
The Datastore – Fedora • Scales to millions of objects • Can manage unrelated data • Many access options: • REST, RDF • Many storage options • Databases • Filesystems • Define services
The interface – Drupal • Content Management System • Extremely popular • Support from the Library community • Modular flexibility • Can interact with Solr
Network Services – OpenSRF • Micro Services • Develop OpenSRF services to provide processing and connectivity services • OpenSRF::OAI_Harvester
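To make the harvester idea concrete, here is a minimal sketch of what an OAI-PMH harvesting step might do. It is written in Python for illustration only and is not OpenSRF code; the endpoint `https://repository.example.org/oai`, the function names, and the sample XML are all assumptions. Only the protocol details (the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the OAI-PMH 2.0 namespace) come from the OAI-PMH standard.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 standard.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a (hypothetical) OAI-PMH endpoint."""
    if resumption_token:
        # When resuming, the token is an exclusive argument:
        # metadataPrefix must not be sent alongside it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up sample response, so the sketch runs without a network connection.
SAMPLE_XML = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:photo-1</identifier></header></record>
    <record><header><identifier>oai:example.org:photo-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repository.example.org/oai")
ids, token = parse_list_records(SAMPLE_XML)
next_url = list_records_url("https://repository.example.org/oai", resumption_token=token)
```

A real service would fetch `first_url`, hand each record to the datastore, and keep requesting `next_url` until no resumption token comes back.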
Access – Solr • Excellent for indexing • Faceted searching • Enterprise scale • Runs on Apache/Tomcat • Replication • Caching • Many APIs • XML / HTTP • JSON • Talks to the interface (Drupal)
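As a rough illustration of faceted searching over Solr's HTTP/JSON interface, the sketch below builds a faceted query URL and reads the facet counts out of a JSON response. The core name `repository`, the field name `collection`, the host and port, and the sample response are assumptions made for the example; the request parameters (`q`, `wt`, `rows`, `facet`, `facet.field`) and the flat value/count layout of `facet_fields` are standard Solr behaviour.

```python
import json
from urllib.parse import urlencode


def build_facet_query(base_url, query, facet_field, rows=10):
    """Build a Solr select URL that asks for JSON output and one facet field."""
    params = [
        ("q", query),
        ("wt", "json"),          # request a JSON response
        ("rows", rows),
        ("facet", "true"),       # switch faceting on
        ("facet.field", facet_field),
    ]
    return f"{base_url}/select?{urlencode(params)}"


def parse_facets(response_text, facet_field):
    """Convert Solr's flat [value, count, value, count, ...] list to a dict."""
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_fields"][facet_field]
    return dict(zip(flat[0::2], flat[1::2]))


# Made-up sample response, so the sketch runs without a Solr server.
SAMPLE_RESPONSE = json.dumps({
    "response": {"numFound": 3, "docs": []},
    "facet_counts": {
        "facet_fields": {"collection": ["photos", 2, "faculty_research", 1]}
    },
})

url = build_facet_query(
    "http://localhost:8983/solr/repository", "title:algoma", "collection"
)
facets = parse_facets(SAMPLE_RESPONSE, "collection")
```

An interface layer such as Drupal could use counts like these to render "narrow by collection" links next to the search results.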
Putting it all together • Fedora can store all kinds of heterogeneous data • Drupal + Solr provide an excellent generic interface • Customizable • Multi-site capable • Integrates with the library website • OpenSRF used to connect various processing services