Learn how Microsoft is driving innovation in scholarly communication through various initiatives, projects, and collaborations. Explore the importance of optimizing research data, interoperability, and data preservation in advancing scientific discovery. Get insights into Microsoft's commitment to global collaboration and technology excellence in the scholarly community.
 
                
Report on Scholarly Communication Initiatives @ Microsoft
Lee Dirks
Director, Scholarly Communications
Technical Computing
MSR External Research
Microsoft Corporation
Agenda
• Context
• Our Mission & Mandate
• Engagement Model / Methodology
• Some Project Examples
• Future Directions
• Your Questions & Feedback
Technical Computing @ Microsoft
Accelerating Discovery
• Life Sciences
• Social Sciences
• Earth Sciences
• New Materials, Technologies & Processes
• Multidisciplinary Research
• Computer & Information Sciences
• Math and Physical Science
Our Commitment to Science
• Advancement of Science
• Global Collaboration
• Technology Excellence
• Interoperability
• Putting computing into science…
  • Applying Microsoft products and research technologies to advance the scientific research and engineering innovation process
• Putting science into computing…
  • Investing in potentially breakthrough computer science research to address the Multicore challenges facing the IT industry
The Scholarly Communication Lifecycle
[Diagram: Microsoft products and technologies arranged around the stages of the scholarly communication lifecycle. Stage labels that survive: Collaboration, Discoverability. Products and technologies shown: Excel 2007; Windows Compute Cluster Server; “Astoria” / “Pop Fly”; SharePoint; LiveMeeting; Tablet PC/UMPC; Office 2007 (Word, PowerPoint, Excel); OpenXML; XPS; SQL Server; Rights Management; Data Protection Manager; Live Search Academic & Books; Libra 2.0; Word 2007 + PowerPoint 2007; WPF & Silverlight; “Sea Dragon” / “PhotoSynth”.]
Why Scholarly Communication?
• Science + computation are not the entire equation
  • Authoring, Analysis, Publishing, Discoverability, and Data Storage/Preservation are key components to scientists’ everyday work…and Microsoft’s core businesses
• The scholarly community has made it clear to us:
  • Microsoft must improve its offerings throughout the scholarly communication lifecycle
• MSR/TCI is uniquely positioned to drive this initiative within Microsoft
• Our approach: Conduct prototyping projects and proofs-of-concept to evolve Microsoft’s scholarly communication offerings
Audiences We Focus On
• Academics / Scholars (higher education setting)
• Researchers / Scientists
• Libraries / Archives
  • Academic, Research and National institutions
• Scholarly Publishers & Societies
  • Both Open Access and For-Profit enterprises
• Governments / Related Organizations
  • EU, NIH/NLM, NSF, NASA, etc.
  • JISC (UK), OCLC, CNI, DLF, NISO, etc.
Goal: Transform Scholarly Communication
• Optimize for data-driven research & science (open data/access)
  • To both data (scientific) and to information (scholarly publications)
  • Reproducible research + computational science
  • Properly document / annotate scholarly output
• Interoperability is paramount
  • Actively lobby and drive for consensus around technical standards and standardized protocols proactively adopted by the community; enable broad community engagement
  • Customers have told Microsoft that interoperability (and intellectual property) is OUR responsibility
• Data preservation (and provenance) should be baseline
  • Documentation of the data’s provenance
  • Reliable and secure long-term storage – at a massive scale
  • Preservation needs to be like “accessibility” features – i.e., assumed as required
• Social networking & semantic knowledge discovery
  • Harnessing collective intelligence must be a consideration, since accessing research is a core step in the life-cycle. Enable knowledge discovery
  • Optimize for Web 2.0 scenarios and allow end-users/experts to find things more easily
• Metadata conventions / taxonomies / ontologies
  • This is a crucial strength for libraries – and a critical component in enabling Web 2.0
Our Engagement Model: “Dual Benefit”
• Work with researchers around the world
  • Facilitate/advise on the application of technology
  • Link MSR researchers with (non-CS) researchers
• Work with product groups
  • Provide feedback on the use of MS technologies
  • Identify research-driven requirements for products
• Terms & Conditions
  • Microsoft typically shares IP (via BSD-type license) or makes source code available on http://www.codeplex.com
  • Microsoft will not develop on a Linux platform
• Project Execution Models
  • Internal Development (FTE)
  • External Development (Vendor)
  • External Development (Institutional)
  • Mixed Model
Scholarly Communications: Current & Upcoming Projects
• Current or Completed Projects
  • Cornell – arXiv.org + Word 2007 (and repository interoperability)
  • MIT / Broad Institute – Authoring (Word 2007) + data for research reproducibility
  • MSR – CMT++ interoperability with data + metadata transfer/exchange (conference management tool enhancements)
  • UC San Diego / PLoS – Semantic mark-up of scholarly articles (+ submission)
  • LiveLabs – eJournal publishing online service (community publishing tool)
  • Johns Hopkins University – Digital Archive for Astronomy/Astrophysics data (storage, preservation and access)
  • Planets Project / EU (with MSR – Cambridge) around OpenXML and file format preservation and interoperability
  • eChemistry Project (Cornell, Penn State, Indiana, Cambridge, Southampton) – ORE exemplar: access to compound chemical info objects (cross-repository access to open chemistry data)
  • Indiana University – Toolbox for Social Networking (SRT)
  • British Library – Researcher Information Centre (RIC) online workflow tool for scientists and researchers
  • University of Southampton (UK) – Port ePrints Repository Software for installation on the Windows platform
  • University of Manchester / “MyExperiment” Project – social networking for scientists
  • ORE Acceleration Project (OAI – Object Reuse & Exchange)
  • UK National Archives – Virtual PC / Emulation of legacy systems to facilitate preservation
  • National Library of Medicine / NCBI – “PubMed Int’l” UK version of PubMed + NLM DTD
  • Creative Commons Add-in for Office 2007 – evolving the Word 2003 effort
• Pipeline
  • Chem4Word with Office & Cambridge University – Create add-in to Word 2007 to facilitate drawing of chemical compounds and equations
  • DRIVER 2 (EU) – Infrastructure integration across a network of European research repositories
Project Example: GenePattern for Word 2007
• Integrate data and images from GenePattern workflows into research papers. Allow for research reproducibility by combining data with the text.
• Highlights OpenXML and Office 2007 technologies as well as breaking new research ground with the integration of data & workflows with research papers.
• MIT Broad Institute (http://www.broad.mit.edu/)
• Contracted Work
  • Infusion for development work via SOW
  • Broad for GenePattern Development for integration
NIH National Library of Medicine
• NLM’s PubMed Central repository contains the full text of research papers resulting from work funded by NIH
• Working with NCBI using Word 2007 to author the NLM-DTD tag set
• TCI assisted in deployment of PMC International in the UK, Japan, Italy, China and South Africa
Research Community Publishing
• eJournal Project
  • Extending existing MSR ‘CMT’ Conference Management Tool to offer eJournal service
  • Developing a toolset for ‘self-publishing’ of workshop and conference proceedings and small journals
• Research Repositories
  • Adapting ‘arXiv’ repository to accommodate Word 2007 and interoperable web services interfaces
  • Developed an open source (BSD) Windows version of ‘EPrints’ software at Southampton
The British Library’s “Researcher Information Center” Virtual Research Environment
• Identify information sources, tools and services to support research in STM
• Explore the application of new services
• Collaborative filtering of literature, continual queries and more…
• Intuitive to use and navigate, user configurable
International Virtual Observatory
• Working with the Astronomy community to build the IVO
  • Goal is to have all astronomy data and literature online and cross-indexed
  • Tools to analyze it
• OpenSkyQuery Federation of ~20 observatories
  • Works and is used every day
  • Spatial extensions in SQL 2005
  • Good example of Data Grid
  • Good example of Web Services
• TCI is facilitating a library project to link astronomy publications to the data
Other Efforts & Initiatives
• “Global Research Library 2020” with University of Washington (Oct07)
• Planning to participate in application(s) to the NSF “DataNet” solicitation (as an unfunded partner)
• Sponsoring BioMed Central’s 2008 Research Awards (Mar08)
• Aug07 Issue of CT Watch Quarterly (v. 3, no. 3)
  • “The Coming Revolution in Scholarly Communications & Cyberinfrastructure”
  • http://www.ctwatch.org/quarterly/articles/2007/08/
• New Scholarly Publishing website at:
  • http://www.microsoft.com/mscorp/tc/scholarly-publishing.mspx
Questions / Feedback?
Lee Dirks
ldirks@microsoft.com
http://www.microsoft.com/science