Data Curation Issues and Challenges

Data Curation Issues and Challenges. ARL/CNI Fall Forum 2008 Sayeed Choudhury sayeed@jhu.edu. Pixel data collected by telescope. Sent to Fermilab for processing. Beowulf Cluster produces catalog. Loaded in a SQL database. Data Flow (Levels of Data). Courtesy of Alex Szalay.

Presentation Transcript


  1. Data Curation Issues and Challenges. ARL/CNI Fall Forum 2008. Sayeed Choudhury, sayeed@jhu.edu

  2. Data Flow (Levels of Data) • Pixel data collected by telescope • Sent to Fermilab for processing • Beowulf Cluster produces catalog • Loaded in a SQL database • Courtesy of Alex Szalay

  3. Key Considerations • Work with existing scientific systems • Consider gateways for these systems as part of infrastructure development • Focus on both human and technical components of infrastructure • Human interoperability is more difficult than technical interoperability • Trust

  4. Questions (1) • How do we transfer principles into new practices, especially given scale and complexity? • What are the fundamental differences between data and collections? Human readable vs. machine readable? • What about the “cloud” or the “crowd”? • Can Flickr help us with data curation?

  5. Questions (2) • How does a partnership audit data (and associated services) distributed across the network? • Are audits about “completeness” or perhaps about transparency and reliability? • Where are the existing data curators? Maybe we shouldn’t use the terms data librarian or data scientist or humanist.

  6. Questions (3) • What are the requirements? Are there common requirements, which may be the most appropriate area for libraries? • Are there unifying concepts or themes? “One scientist’s noise is another scientist’s signal…” • What are we trying to sustain? Data? Scholarship? Our organizations?
