
Jim Myers, Carmen Pancerella

Collaboratory for Multi-scale Chemical Science (CMCS): A Knowledge Grid/Adaptive Informatics Infrastructure



Presentation Transcript


  1. Collaboratory for Multi-scale Chemical Science (CMCS): A Knowledge Grid/Adaptive Informatics Infrastructure
     Jim Myers, Carmen Pancerella

  2. CMCS – Enabling New Forms of Research and Communication
     • Distributed Research Groups
     • Chemical Databases
     • Rich Publication
     • Community Annotation
     • Informatics Analysis
     • Cross-scale Communication
     • Peer Data Review
     • Pedigree Analysis
     • Automated informatics
     • Automated monitoring/analysis

  3. Adaptive Informatics Infrastructure
     • Infrastructure – a well-designed, scalable, reusable, flexible set of tools, middleware, and services
     • Informatics – the emerging use of semi-automated means to derive new knowledge from the analysis of (large amounts of) heterogeneous data, annotating existing data with its newly discovered meaning
     • Adaptive – able to dynamically change to incorporate new knowledge and support new activities
     • Low Barriers
       • Many access points
       • Storage of data in original formats with dynamic metadata extraction and translation
     • Powerful
       • Arbitrary formats (binary, ASCII, XML)
       • Integrated data, metadata, pedigree across internal and external tools
     • Evolvable
       • Schema can be changed/extended as needed
       • Metadata, translations, viewers, portal, etc. can be dynamically configured

  4. SAM Architecture
     [Architecture diagram. Labels: Database, Web, Notebook Services, Semantic Services, Metadata Services, DataGrid. Protocols shown: DAV, DASL, JMS, SAM Extensions; DAV, JDBC, GridFTP.]

  5. SAM Metadata Services Layer
     • Jakarta Slide DAV server plus configurable:
       • MIME Type Assignment
         • CMCS default: based on dc:format tag within .xml file
       • Property Generation from binary/ASCII/XML files
         • 12 types → standard CMCS properties
       • Resource Translation
         • 12+ Viewers/Translators for CMCS, including Interactive Applets
       • Mapping to Data Store(s)
         • NIST Kinetics DB
       • JMS Events for access and changes
         • Feeds events to CMCS NED Email Notification daemon
       • Authentication/Authorization model
         • Single sign-on with CMCS Portal (username/password or GridCert)
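The CMCS default for MIME type assignment (reading the dc:format tag inside an .xml file) can be illustrated with a minimal Python sketch. This is not the SAM implementation, which the slides describe as a configurable Jakarta Slide server. The function name, the fallback type, and the sample record (including its `chemical/x-hypothetical-thermo` value) are assumptions made for illustration; only the Dublin Core namespace URI is standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # standard Dublin Core element namespace


def mime_type_from_dc_format(xml_text, default="application/octet-stream"):
    """Return the text of the first non-empty dc:format element, or a default.

    Hypothetical helper: the fallback MIME type is an assumption.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so no dc:format tag can be read
        return default
    # root.iter() visits the root element and all of its descendants
    for elem in root.iter(f"{{{DC_NS}}}format"):
        if elem.text and elem.text.strip():
            return elem.text.strip()
    return default


# Hypothetical record; the MIME type value is made up for this sketch
sample = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example thermochemistry record</dc:title>
  <dc:format>chemical/x-hypothetical-thermo</dc:format>
</record>"""

print(mime_type_from_dc_format(sample))
```

Files that carry no dc:format tag, or that are not XML at all, fall through to the default type in this sketch.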

  6. Extensible Scientific Interchange Language (XSIL) / Binary Format Description (BFD) language
     • XSIL (Roy Williams, Caltech) – XML encoding and Java code for scientific data
       • Ints, floats, vectors, arrays, time series, …
       • Can describe the byte structure of external data files/streams (encoding, byte order, …)
       • Can have link(s) to external data
     • BFD (Alan Chappell, Jim Myers, PNNL) – XML encoding and Java code for describing binary/ASCII files
       • Bug fixes, removed ambiguities
       • Parameterized logic (if, while, for, …)
       • Parameterized Stream interface
       • Being used as input for the Grid Forum Data Format Description Language (DFDL) standard

     <XSIL>
       <Param Name="date" Type="String" />
       <Param Name="Program Version" Type="float" />
       <Param Name="numColumns" Type="int" />
       <Array Name="data" Type="float">
         <Dim>
           <XBFDvalue-of select="/XSIL/Param[@Name='numColumns']" />
         </Dim>
         <Dim>6</Dim>
       </Array>
       <Stream Encoding="Binary" Type="Remote" XBFDstreamnumber="0" />
     </XSIL>
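The idea behind the descriptor above (an array whose first dimension is a parameter read from the stream itself, with a fixed second dimension of 6) can be sketched in Python. This is not the XSIL/BFD Java code. The byte layout is an assumption made purely for illustration: a big-endian 4-byte integer `numColumns` followed immediately by `numColumns × 6` 4-byte floats. The slide does not specify how the `date` and `Program Version` parameters are encoded, so they are left out.

```python
import struct


def read_parameterized_array(buf, inner_dim=6, byte_order=">"):
    """Read an int32 'numColumns', then a numColumns x inner_dim float32 array.

    Assumed layout (not taken from the slides): the integer sits at offset 0
    and the floats follow directly at offset 4.
    """
    (num_columns,) = struct.unpack_from(byte_order + "i", buf, 0)
    count = num_columns * inner_dim
    values = struct.unpack_from(f"{byte_order}{count}f", buf, 4)
    # Reshape the flat tuple into num_columns rows of inner_dim values each
    return [
        list(values[i * inner_dim:(i + 1) * inner_dim])
        for i in range(num_columns)
    ]


# Build a small test stream: numColumns = 2, then 12 floats
stream = struct.pack(">i", 2) + struct.pack(">12f", *range(12))
data = read_parameterized_array(stream)
print(len(data), len(data[0]))
```

A declarative description such as BFD lets this kind of reader be generated from the XML rather than hand-written for each file format.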

  7. Demo

  8. Example
     • Binary → XML → Properties
     • Translation of Chemistry Data
     • SAM-based Electronic Notebook
     • CMCS Portal/Pedigree Browser
     [Diagram. Labels: ELN, DAV+, Fortran Application, SAM, DAV, JMS, ‘Local Disk’, DataGrid.]

  9. CMCS Provenance: de facto standards
     • Cmcs:hasinputs – workflow
     • Cmcs:hasoutputs – workflow
     • Sam:hastranslations – virtual workflow
     • Cmcs:ispartofproject – hierarchy
     • Eln:children – hierarchy
     • (Dav:collection) – hierarchy
     • Dcterms:references – scientific pedigree
     • Dcterms:isreferencedby – scientific pedigree
     • Eln:references – informal/private scientific pedigree
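Because the provenance properties listed above are links between resources, a pedigree can be assembled by following one property transitively. The sketch below walks `cmcs:hasinputs` over an in-memory dictionary. The resource paths and the dictionary representation are hypothetical; in CMCS these properties would be read from the DAV store rather than from a Python dict.

```python
def upstream_pedigree(resources, start, prop="cmcs:hasinputs"):
    """Return every resource reachable from start by following prop.

    resources maps a resource path to a dict of property name -> list of
    target paths. Each reachable resource is returned once, in the order
    it was first discovered.
    """
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        for parent in resources.get(node, {}).get(prop, []):
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                order.append(parent)
                stack.append(parent)
    return order


# Hypothetical resources, for illustration only
resources = {
    "/proj/mechanism.xml": {
        "cmcs:hasinputs": ["/proj/thermo.xml", "/proj/kinetics.xml"],
    },
    "/proj/thermo.xml": {"cmcs:hasinputs": ["/proj/raw.dat"]},
    "/proj/kinetics.xml": {},
}

print(upstream_pedigree(resources, "/proj/mechanism.xml"))
```

Passing a different property name, such as `dcterms:references`, would trace the scientific pedigree instead of the workflow with the same traversal.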

  10. Applications/Chemistry Services
      • Extensible Computational Chemistry Environment
        • Export to CMCS with pedigree/metadata
      • Active Thermochemical Tables
        • Portlet/web service using CMCS data store
      • RIOT – adaptive mechanism reduction
        • Portlet/web service using CMCS data store – asynchronous invocation mechanism

  11. Standard Protocol and API
      • WebDAV: An early web service (XML commands over HTTP)
        • A widely adopted standard for metadata/data transport
        • Put/Get data with arbitrary properties (dynamic)
        • Properties can be discovered and accessed independently
        • DASL, Versioning, Transactions, …
      • JSR 170: Java Content Repository
        • An API for working with nodes with properties (versioning, queries, typing, notification, …)
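The "XML commands over HTTP" style of WebDAV can be made concrete with a small sketch that builds the body of a PROPPATCH request, the WebDAV method for setting arbitrary properties on a resource. The `propertyupdate`/`set`/`prop` structure and the `DAV:` namespace come from the WebDAV specification. The CMCS namespace URI, the property value, and the function name are assumptions, since the slides do not give them.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # namespace defined by the WebDAV specification
CMCS_NS = "http://example.org/cmcs#"  # hypothetical; the real URI is not in the slides


def build_proppatch_body(properties, namespace=CMCS_NS):
    """Build a WebDAV propertyupdate document that sets the given properties.

    properties maps a local property name to its string value.
    """
    update = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(update, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        ET.SubElement(prop_el, f"{{{namespace}}}{name}").text = value
    return ET.tostring(update, encoding="unicode")


# Hypothetical property value, for illustration only
body = build_proppatch_body({"ispartofproject": "/projects/combustion-demo"})
print(body)
```

The resulting document would be sent as the body of an HTTP request whose method is PROPPATCH; a PROPFIND request can later retrieve the same property without fetching the data itself, which is what allows properties to be discovered and accessed independently.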

  12. Path Forward
      • Pilot groups doing “real” chemistry
        • Exploring new practice
      • Peer-Review/Endorsement Mechanisms/Interfaces
        • Digital publication, third-party annotation
      • Activity Reporting tools
      • Scoping Searches, Notifications
        • Based on user-defined notion of provenance/hierarchy
      • Notebook Views of Other Hierarchies
        • E.g. a notebook sharing a computational chemistry project hierarchy
      • Validation of Chemical networks
        • E.g. Active Thermochemical Tables
      • Workflow by Example…
      • Informatics Data File Assembly Tool

  13. URLs/Team Members
      • http://cmcs.org/
      • http://www.scidac.org/SAM/
      CMCS Team Members: Thomas C. Allison, Kaizar Amin, Sandra Bittner, Brett Didier, Michael Frenklach, William H. Green, Jr., Yen-Ling Ho, John Hewson, Wendy Koegler, Carina Lansing, David Leahy, Michael Lee, Renata McCoy, Michael Minkoff, James D. Myers, Sandeep Nijsure, Gregor von Laszewski, David Montoya, Carmen Pancerella, Reinhardt Pinzon, William Pitz, Larry Rahn, Branko Ruscic, Karen Schuchardt, Eric Stephan, Al Wagner, Baoshan Wang, Theresa Windus, Lili Xu, Christine Yang
