North Carolina Geospatial Data Archiving Project

This project aims to collect and preserve at-risk digital geospatial data in North Carolina through a partnership between NCSU Libraries and NC Center for Geographic Information & Analysis, supported by NDIIPP.

Presentation Transcript


  1. North Carolina Geospatial Data Archiving Project / NDIIPP: Collection and preservation of at-risk digital geospatial data • Partners: NCSU Libraries (Project Lead: Steve Morris); NC Center for Geographic Information & Analysis (Project Lead: Zsolt Nagy) • NSDI Partnership Community Meeting, March 1, 2006

  2. Outline • Risks to Digital Geospatial Data • Overview of NC Geospatial Data Archiving Project and NDIIPP • Preservation Challenges and Possible Solutions • Points of Engagement with Spatial Data Infrastructure and Industry Note: Percentages based on the actual number of respondents to each question

  3. Risks to Digital Geospatial Data .shp .mif .gml .e00 .dwg .dgn .bsb .bil .sid

  4. Risks to Digital Geospatial Data • Producer focus on current data • Archiving data does not guarantee “permanent access” • Future support of data formats in question • Need to migrate formats or allow for emulation • Data failure • “Bit rot”, media failure • Preservation metadata requirements • Descriptive, administrative, technical, DRM • Shift to “streaming data” for access
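
One common guard against the “bit rot” and media failure listed on this slide is periodic fixity checking: record a checksum for every archived file, then recompute and compare later. The sketch below is only an illustration of that idea, not the project's documented method. The function names are hypothetical, and SHA-256 is an assumed choice of digest.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or whose bytes changed."""
    damaged = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel_path)
    return sorted(damaged)
```

A manifest built at ingest time and stored apart from the data lets a later `find_damaged` run flag files to restore from a replica.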

  5. Time series – vector data • Parcel Boundary Changes 2001-2004, North Raleigh, NC • Temporal data to support business needs in: real estate analysis, land use change analysis, economic planning

  6. Time series – Ortho imagery • Vicinity of Raleigh-Durham International Airport, 1993-2002 • Even static orthophotos are at risk.

  7. Today’s geospatial data as tomorrow’s cultural heritage • Future uses of data are difficult to anticipate (as with Sanborn Maps).

  8. NC Geospatial Data Archiving Project • Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP) • One of 8 initial NDIIPP partnerships (only state project) • Focus on state and local geospatial content in North Carolina (state demonstration) • Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories • Objective: engage existing state/federal geospatial data infrastructures in preservation

  9. Targeted Content • Resource Types • GIS data (vector, etc.) • Digital orthophotography • Digital maps • Tabular data (e.g. assessment data) • Content Producers • Mostly state, local, regional agencies • Some university, not-for-profit, commercial • Selected local federal projects

  10. Work plan in a Nutshell • Work from existing data inventories • NC OneMap Data Sharing Agreements as the “blanket”, individual agreements as the “quilt” • Partnership: work with existing geospatial data infrastructures (state and federal) • Technical approach • Metadata: FGDC, METS, PREMIS?, GeoDRM? • Repository-independent: DSpace initially • Web services consumption for archival development (in future?)

  11. NCGDAP Philosophy of Engagement • Provide feedback to producer organizations / inform state geospatial infrastructure • Take the data as is, in the manner in which it can be obtained • “Wrangle” and archive data • Note the ‘Project’ in ‘North Carolina Geospatial Data Archiving Project’: the process, the learning experience, and the engagement with industry and infrastructure are more important than the archive … What is the long term solution?

  12. Big Technical Challenges • Format migration paths • Management of data versions over time • Preservation metadata • Harnessing geospatial web services • Preserving cartographic representation • Keeping content repository-agnostic • Preserving geodatabases • More …

  13. Vector Data Format Issues • Vector data much more complicated than image data • ‘Archiving’ vs. ‘Permanent access’ • An ‘open’ pile of XML might make an archive, but if using it requires a team of programmers to do digital archaeology then it does not provide permanent access • Piles of XML need to be widely understood piles • GML: need widely accepted application schemas (like OSMM?) • The Geodatabase conundrum • Export feature classes, and lose topology, annotation, relationships, etc. • … or use the Geodatabase as the primary archival platform (some are now thinking this way)

  14. Managing Time-versioned Content • Continuously updated data: Frequency of snapshots? • Different for various framework layers?
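
The snapshot-frequency question on this slide can be made concrete with a small sketch: given the dates on which a continuously updated layer could be captured, keep only those that are at least a chosen interval apart, with a different interval per framework layer. The layer names and interval values below are purely illustrative assumptions, not figures from the presentation.

```python
from datetime import date, timedelta

# Hypothetical per-layer snapshot intervals in days (illustrative values only).
SNAPSHOT_INTERVAL_DAYS = {
    "parcels": 90,
    "street_centerlines": 180,
    "orthoimagery": 365,
}


def select_snapshots(available: list[date], interval_days: int) -> list[date]:
    """Keep the earliest date, then each date at least interval_days after the last kept one."""
    kept: list[date] = []
    for day in sorted(available):
        if not kept or day - kept[-1] >= timedelta(days=interval_days):
            kept.append(day)
    return kept


if __name__ == "__main__":
    captures = [date(2004, 1, 1), date(2004, 2, 1), date(2004, 4, 15), date(2004, 7, 20)]
    print(select_snapshots(captures, SNAPSHOT_INTERVAL_DAYS["parcels"]))
```

A shorter interval preserves more of a layer's history at the cost of storage, which is why the appropriate value may differ between layers.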

  15. Metadata Availability – Limited at Local Level • February 2005

  16. Harnessing Geospatial Web Services • Image atlases from WMS services? • Capturing cartographic representation? • Recording records from decision-making processes? • Later: data transfer via WFS & GML? Other?
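
As a rough illustration of what building an “image atlas” from a WMS service might involve, the sketch below splits a bounding box into a grid and builds one WMS 1.1.1 `GetMap` request URL per cell. The endpoint URL, layer name, and function name are hypothetical. The query parameters are the standard `GetMap` ones, and the height calculation simply keeps pixels square in coordinate units, which is a simplification for geographic coordinates.

```python
from urllib.parse import urlencode


def getmap_urls(base_url, layer, bbox, rows, cols,
                tile_width_px=512, srs="EPSG:4326", image_format="image/png"):
    """Build one WMS 1.1.1 GetMap URL per cell of a rows x cols grid over bbox.

    bbox is (min_x, min_y, max_x, max_y) in the units of srs.
    """
    min_x, min_y, max_x, max_y = bbox
    step_x = (max_x - min_x) / cols
    step_y = (max_y - min_y) / rows
    # Keep pixels square in coordinate units (an assumed simplification).
    tile_height_px = max(1, round(tile_width_px * step_y / step_x))

    urls = []
    for row in range(rows):
        for col in range(cols):
            tile_bbox = (
                min_x + col * step_x,
                min_y + row * step_y,
                min_x + (col + 1) * step_x,
                min_y + (row + 1) * step_y,
            )
            params = {
                "SERVICE": "WMS",
                "VERSION": "1.1.1",
                "REQUEST": "GetMap",
                "LAYERS": layer,
                "STYLES": "",  # empty value requests the server's default style
                "SRS": srs,
                "BBOX": ",".join(f"{v:.6f}" for v in tile_bbox),
                "WIDTH": tile_width_px,
                "HEIGHT": tile_height_px,
                "FORMAT": image_format,
            }
            urls.append(f"{base_url}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # Hypothetical endpoint and layer name, for illustration only.
    for url in getmap_urls("https://example.org/wms", "orthos_2002",
                           (-79.0, 35.5, -78.0, 36.5), rows=2, cols=2):
        print(url)
```

Fetching and storing the returned images, along with the request parameters and capture date, would be a separate step that this sketch leaves out.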

  17. “Web mash-ups” and the New Mainstream Geospatial Web Services • How does temporal data fit into emerging WMS caching and tiling schemes? • Capture of tiles and caches for archive?

  18. Preserving Cartographic Representation • Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

  19. Needed: Efficient Content Replication • Content replication also needed for: • Disaster preparedness • State and federal data improvement projects • Aggregation by regional geospatial web service providers • WFS, e.g.: efficiency in complete content transfer? • Need rsync-like function, informed by: rights management, inventory processes, metadata management, data update cycles • Archiving delta files vs. complete replication – need to avoid requiring “digital archaeology” in the future
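
The core of an “rsync-like function” is comparing what the source holds against what the replica holds and transferring only the differences. The sketch below shows that comparison over two manifests, each mapping a file path to a checksum string. It is a simplified illustration with hypothetical names: it does not model the rights management, inventory, metadata, or update-cycle inputs the slide calls for, and it is not rsync's actual block-level algorithm.

```python
def plan_replication(source: dict[str, str], replica: dict[str, str]) -> dict[str, list[str]]:
    """Compare two path-to-checksum manifests and list the work needed to sync them.

    "copy":   paths present at the source but absent from the replica
    "update": paths present in both whose checksums differ
    "retire": paths present only in the replica (an archive might keep
              these as superseded versions rather than delete them)
    """
    return {
        "copy": sorted(p for p in source if p not in replica),
        "update": sorted(p for p in source if p in replica and source[p] != replica[p]),
        "retire": sorted(p for p in replica if p not in source),
    }


if __name__ == "__main__":
    # Checksums are shortened placeholders, for illustration only.
    source = {"parcels.shp": "aa11", "roads.shp": "bb22", "zoning.shp": "cc33"}
    replica = {"parcels.shp": "aa11", "roads.shp": "ff99", "floodplain.shp": "dd44"}
    print(plan_replication(source, replica))
```

Whole files are still transferred here, so each replica remains complete and directly usable, which sidesteps the concern about reconstructing data from chains of delta files.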

  20. Points of Engagement with the Open Geospatial Consortium (OGC) • GML for archiving (PDF/A version of GML?) • GeoDRM • Adding preservation use cases • Content Packaging • Will there be an industry solution? • Web Map Context Documents • Can we save data state as well as application state? • Content Replication • Is this a layer in the overall architecture? • Persistent Identifiers

  21. Points of Engagement with Spatial Data Infrastructure • Framework data communities • Snapshot frequency, naming schemes, classification, GML application schemas, format strategies • Metadata standards and outreach • Persistent identifiers, versioning, feedback on metadata quality • Content replication/transfer • For data improvement projects, disaster preparedness, aggregation by regional service providers, … and archives • Where do archiving and preservation fit into the NSDI, GOS, etc.?

  22. Points of Engagement with Industry • Software vendors • Better support for temporal data management • Tools for retrospective data conversion • Web mashup and open source communities • WMS caching schemes • Standard tiling schemes with temporal component? • Data vendors • Cultivate market for older data (scaled pricing?) • Tech transfer on archiving practices?

  23. Cultivating a market for older data. Project Status

  24. Project Status Cultivating tools for retrospective conversion.

  25. Expected Project Outcomes • Demonstration archive • Outreach activity – planting seeds • International, national, state, local, commercial • Learning experience, informing: • Spatial data infrastructure • Commercial vendors (data/software/consulting) • Repository software communities • Metadata practice (both GIS & preservation) • Rights management developments • Data and interoperability standards

  26. Project Status • Storage system and backup deployed • DSpace deployed • FGDC Metadata workflow finalized • Ingest workflow near finalization • Content migration workflow plan near finalization • Regional site visits planned for coming months • Wide range of outreach/collaboration: FGDC, ESRI, EDINA (JISC), USGS, OGC, TRB, etc. • Pilot project, georegistering digital archival geologic maps

  27. Questions? • Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries • Steven_Morris@ncsu.edu • Web site: http://www.lib.ncsu.edu/ncgdap/
