Tools for Repositories: Microsoft Research & the Scholarly Information Ecosystem. Lee Dirks Director, Education & Scholarly Communications Microsoft External Research Microsoft Corporation. Microsoft External Research.
Tools for Repositories: Microsoft Research & the Scholarly Information Ecosystem Lee Dirks Director, Education & Scholarly Communications Microsoft External Research Microsoft Corporation
Microsoft External Research
• Organization within Microsoft Research that engages in strong partnerships with academia, industry and government to advance computer science, education, and research in fields that rely heavily upon advanced computing
• Initiatives that focus on the research process and its role in the innovation ecosystem, including support for open access, open tools, open technology, and interoperability
• Developers of advanced technologies and services to support every stage of the research process
Mission • Optimize and extend Microsoft software to meet the specific needs of the academic community • Our approach: • Conduct applied projects to enhance academic productivity by evolving Microsoft’s scholarly communication offerings • Microsoft External Research is uniquely positioned to drive this initiative across Microsoft
Transforming Scholarly Communication
• Interoperability is essential
• Actively lobby and drive for consensus around technical standards and standardized protocols proactively adopted by the community; enable broad community engagement
• Customers have told Microsoft that interoperability is OUR responsibility
• Leverage existing community protocols, practices, guidelines, etc.
• Example – metadata conventions / taxonomies / ontologies: a traditional strength for libraries – and a critical component in enabling Web 2.0
• Optimize for data-driven research
• Applies both to data (scientific) and to information (scholarly publications)
• Reproducible research + computational science
• Properly document / annotate scholarly output
• Data preservation (and provenance) should be baseline
• Documentation of the data’s provenance
• Preservation needs to be like “accessibility” features – i.e., assumed as required
• Semantic knowledge discovery & social networking
• Harnessing collective intelligence must be a consideration, since accessing research is a core step in the life-cycle. Enable knowledge discovery
• Optimize for Web 2.0 scenarios and allow end-users/experts to find things more easily
The Scholarly Communication Lifecycle Excel 2007 Windows Server HPC “Astoria” / “Pop Fly” Collaboration SharePoint LiveMeeting Office Live • Office 2007: • Word • PowerPoint • Excel • OneNote • Tablet PC/UMPC Office OpenXML XPS Format SQL Server & Entity Framework Rights Management Data Protection Manager Discoverability Libra 2.0 “Bookweb” SharePoint Word 2007 + PowerPoint 2007 WPF & Silverlight “Sea Dragon” / “PhotoSynth” / “Deep Zoom”
Scholarly Communications: Project Overview
• Current or Completed Projects
• Cornell – arXiv.org + Word 2007 (and repository interoperability via SWORD)
• MIT / Broad Institute – Authoring (Word 2007) + data for research reproducibility
• MSR – CMT++ interoperability with data + metadata transfer/exchange (conference management tool enhancements)
• LiveLabs – eJournal publishing online service (community publishing tool)
• UC San Diego / PLoS – Semantic mark-up of scholarly articles (+ submission)
• Chem4Word with Office & Cambridge University – Create add-in to Word 2007 to facilitate drawing of chemical compounds and equations
• Johns Hopkins University – Digital Archive for Astronomy/Astrophysics data (storage, preservation and access)
• Planets Project / EU (with MSR – Cambridge) – OpenXML and file format preservation + interoperability
• eChemistry Project (Cornell, Penn State, Indiana, Cambridge, Southampton) – ORE exemplar: access to compound chemical info objects (cross-repository access to open chemistry data)
• British Library – Researcher Information Centre (RIC) online workflow tool for scientists and researchers
• Creative Commons Add-in for Office 2007 – evolving the Word 2003 effort
• University of Southampton (UK) – Port ePrints Repository Software for installation on the Windows platform
• University of Manchester / “MyExperiment” Project – social networking for scientists
• ORE Acceleration Project (OAI – Object Reuse & Exchange) – Alpha spec development
• UK National Archives – Virtual PC / Emulation of legacy systems to facilitate preservation
• National Library of Medicine / NCBI – “PubMed Int’l” UK version of PubMed + NLM DTD
• Pipeline
• DRIVER 2 (EU) – Infrastructure integration across a network of European research repositories
Our goals for working in this community • For Microsoft end-users, making it easier to use our software for all aspects of their research process • For Microsoft developers, demonstrating the toolset and showing how our platform can be extended • For non-Microsoft end-users, working to ensure the ability to interoperate with our software across all phases of the research process, as necessary • For non-Microsoft developers, enabling transparency to our efforts in this space and encouraging a dialogue
Who’s here & why Goals / Intentions Approach
AGENDA
• 12:30 p.m. Welcome & Overview – Lee Dirks, Director, Education & Scholarly Communication, Microsoft Research
• 1:00 p.m. Zentity - Repository Platform – Alex Wade, Director, Scholarly Communication, Microsoft
• 2:00 p.m. Services for Repositories (RIC, Electronic Journals Service, Live Translator, Document Conversion Service) – Pablo Fernicola, Group Manager, Microsoft & Alex Wade
• 3:00 p.m. Break
• 3:15 p.m. Programming with Zentity – Savas Parastatidis, Software Philosopher, Microsoft
• 4:30 p.m. Tools for Authors (AA, Ontology, Creative Commons, ORE, Submission Wizard, etc.) – Pablo Fernicola & Alex Wade
• 5:30 p.m. Wrap-up & Futures Discussion
Questions? Lee Dirks Director—Education & Scholarly Communication Microsoft External Research ldirks@microsoft.com URL – http://www.microsoft.com/scholarlycomm/
Zentity 1.0 Open Repositories ‘09 Workshop Alex Wade Director, Scholarly Communication Microsoft External Research Microsoft Corporation
Agenda Ecosystem of Tool/Services • Visualization • Discovery • Entity Extraction • etc. Peer-Review Translation Conversion Repositories User Environment • Search • Desktop Tools • ELNs • etc. Authoring Collaboration/VREs
Agenda • Goals • System Requirements • Architectural Stack • Installation • Repository Demo • UI • Services • Extensibility
Zentity – Goals
Quick
• Easy to install
• ‘Scholarly Works’ data model
  • Authors, Papers, Data, Videos, Code, Lectures, Books, etc.
• Default Web UI
Extensible
• UI Toolkit
• Intuitive programming experience
• Extensible Data Model (entities, relationships)
• RDFS for new data models
Interoperable
• BibTeX Import
• RSS/Atom Syndication
• METS support
• OAI-PMH Provider
• OAI-ORE
• Simple Search API
• Atom Publishing Protocol
• SWORD
Free & Open
• Freely available
• Based on open standards
• SQL Server and Developer tools available via DreamSpark
System Requirements
• Supported Processor Architectures
  • x86 and x64
• Supported Operating Systems
  • Microsoft Windows Server 2008 (x86 and x64)
  • Microsoft Windows Vista SP1 (x86 and x64)
• Installation Requirements
  • Microsoft .NET Framework 3.5
• Supported Microsoft SQL Server
  • Microsoft SQL Server 2008 Enterprise Edition
  • Microsoft SQL Express 2008 with Advanced Services
• User and Configuration Requirements
  • The user installing Zentity must have Site Admin privileges
  • The selected Microsoft SQL Server instance must have “Windows Authentication” enabled
  • The user running the installer must have ‘database creation’ permissions on the Microsoft SQL Server instance
Application Stack ScholarlyWorks Application Web UI Services UI.Toolkit Zentity.Search Zentity.Security Zentity.Core ADO.NET 3.5 Entity Framework SQL Server 2008 (including Express edition)
Zentity - Store
• A Semantic Computing platform
• A hybrid between a relational database and a triple store
• Triple stores
  • Evolution friendly
  • Poor performance
  • No need to model everything in advance
  • Semantic interpretation at the application level
• Relational schema
  • Evolution not so easy
  • Great opportunities for optimization
  • Model everything in advance
• Zentity Store
  • Maintain a balance
  • Try to model the frequently used entities in our app domain
  • Try to capture the frequently used relationships
  • Allow for extensibility (Relationships, Properties)
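The hybrid idea above can be illustrated with a minimal sketch. This is not Zentity's actual schema or API: the table names, column names, and helper functions are hypothetical, and SQLite stands in for SQL Server. A typed table models a frequently used entity in advance (the relational side), while a generic relationship table accepts new predicates without any schema change (the triple-store side).

```python
import sqlite3

# In-memory database; SQLite is only a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Relational side: a frequently used entity, modelled in advance.
    CREATE TABLE resource (
        id    INTEGER PRIMARY KEY,
        kind  TEXT NOT NULL,
        title TEXT NOT NULL
    );
    -- Triple-store side: (subject, predicate, object) rows, so new kinds
    -- of relationship need no schema change.
    CREATE TABLE relationship (
        subject_id INTEGER NOT NULL REFERENCES resource(id),
        predicate  TEXT NOT NULL,
        object_id  INTEGER NOT NULL REFERENCES resource(id)
    );
""")

def add_resource(kind, title):
    """Insert a typed resource and return its generated id."""
    cur = conn.execute(
        "INSERT INTO resource (kind, title) VALUES (?, ?)", (kind, title))
    return cur.lastrowid

def relate(subject_id, predicate, object_id):
    """Record one subject-predicate-object triple."""
    conn.execute(
        "INSERT INTO relationship VALUES (?, ?, ?)",
        (subject_id, predicate, object_id))

def related(subject_id, predicate):
    """Return titles of resources linked from subject_id via predicate."""
    rows = conn.execute(
        "SELECT r.title FROM relationship AS rel "
        "JOIN resource AS r ON r.id = rel.object_id "
        "WHERE rel.subject_id = ? AND rel.predicate = ? ORDER BY r.id",
        (subject_id, predicate))
    return [title for (title,) in rows]

# Hypothetical sample data.
paper = add_resource("Paper", "Sample paper")
author = add_resource("Person", "Sample author")
relate(paper, "authored by", author)
# A predicate nobody planned for is just another row.
relate(paper, "reviewed by", author)
```

The trade-off named on the slide shows up directly: queries on `resource` can use ordinary typed columns and indexes, while anything stored only as a predicate string has to be interpreted by the application.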
Research Output Repository Platform PDF file Lecture on 2/19/2008 contains is representation of PowerPoint presentation authored by organized by tony presented by Elizabeth, Sebastien, Matthew, Norman, Brian, Sarah, George, Roy
OAI-PMH database localhost\SQLExpress
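As a rough illustration of how a harvester might address an OAI-PMH provider, the sketch below builds request URLs from the verbs and the `oai_dc` metadata prefix defined by the OAI-PMH 2.0 protocol. The base URL is a placeholder assumption; the actual endpoint path of a Zentity installation is not given in these slides.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

def oai_pmh_url(base_url, verb, **params):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # Sort the remaining arguments so the URL is deterministic.
    query.update(sorted(params.items()))
    return base_url + "?" + urlencode(query)

# Hypothetical endpoint, for illustration only.
BASE_URL = "http://localhost/oaipmh"

# Harvest all records as unqualified Dublin Core.
list_records = oai_pmh_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
```

A harvester would typically start with `Identify` to confirm that the endpoint responds, then page through `ListRecords` responses.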
Search • Basic Search • Search Filters • Advanced Query Syntax (AQS) • Field Support • Advanced Search
Syndication • http://<myserver>/Syndication/Syndication.ashx?resourcetype: book author:(tony hey)
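A feed URL like the one above contains spaces, so a client would normally percent-encode it before sending the request. The sketch below assumes, based only on the example on this slide, that the search text follows the `?` directly; the helper name and the `myserver` host are hypothetical.

```python
from urllib.parse import quote

def syndication_url(server, search_text):
    """Build a syndication feed URL, percent-encoding the search text.

    Assumption: the search text is appended directly after '?', as in
    the slide's example. Colons and parentheses are left readable.
    """
    encoded = quote(search_text, safe=":()")
    return f"http://{server}/Syndication/Syndication.ashx?{encoded}"

# The search text is copied verbatim from the slide.
feed = syndication_url("myserver", "resourcetype: book author:(tony hey)")
```

A feed reader pointed at the resulting URL would then receive updates for resources matching that search.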
Extensibility ScholarlyWorks Application • Web UI & UI Toolkit • CSS • ASP.NET Controls • Services • Search • Security • Data Model Web UI Services UI.Toolkit Zentity.Search Zentity.Security Zentity.Core ADO.NET 3.5 Entity Framework SQL Server 2008 (including Express edition)
Further Information and Resources
http://research.microsoft.com
• The site provides access to, and downloads of, relevant tools and resources for the worldwide academic research community. A small set of examples includes:
• Research Output Repository: building blocks, tools, and services for developers who are tasked with creating and maintaining an organization’s repository ecosystem. http://research.microsoft.com/zentity
• Tools and Services for Research Collaboration: http://research.microsoft.com/en-us/collaboration/tools/default.aspx