Steve Morris, Jim Tuttle, Rob Farrell, Jeff Essic. Preservation Partnership with Library of Congress: NDIIPP and the North Carolina Geospatial Data Archiving Project. EPA Meeting
What is NDIIPP? Why NCSU Libraries? • NDIIPP = National Digital Information Infrastructure and Preservation Program • Responding to concern that we might be in the middle of a “digital dark age,” Congress earmarked $100 million for digital preservation efforts through 2010 • Timeline • Aug. 2003: Library of Congress (LC) puts out call for proposals for “preservation partners” • Sept. 2004: LC finalizes agreements with eight principal partners, including NCSU • Oct. 2004: the three-year projects begin • A cooperative agreement … not a grant • Emphasis on ongoing interaction with LC and other partners, with transfer of learning experience to LC as primary outcome
NC Geospatial Data Archiving Project (NCGDAP) • Partner: NC Center for Geographic Information & Analysis (state agency) • Focus: State and local agency digital geospatial data in NC as state demonstration • Objective: Engage existing spatial data infrastructure (SDI) in the problem of preservation • Tied to the NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
Geospatial Data Types: Vector & Attribute Data. Time series: Parcel Boundary Changes 2001-2004, North Raleigh, NC
Geospatial Data Types: Vector Data. Time series: Parcel Boundary Changes 2001-2004, North Raleigh, NC
Geospatial Data Types: Aerial Imagery • 85+ NC counties with orthophotos • 1-5 flights per county • 30-200 GB per flight
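The imagery figures on this slide imply a wide range of storage needs. The sketch below multiplies them out as a rough bound only: it assumes exactly 85 counties (the slide says "85+"), treats every county as sitting at the same end of each range, and uses decimal terabytes. The function name is illustrative and not part of the project.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def imagery_volume_gb(counties: int, flights_per_county: int, gb_per_flight: int) -> int:
    """Total volume if every county has the same flight count and flight size."""
    return counties * flights_per_county * gb_per_flight


# Lower bound: 85 counties, 1 flight each, 30 GB per flight.
low_gb = imagery_volume_gb(85, 1, 30)
# Upper bound: 85 counties, 5 flights each, 200 GB per flight.
high_gb = imagery_volume_gb(85, 5, 200)

print(f"{low_gb / GB_PER_TB:.2f} TB to {high_gb / GB_PER_TB:.0f} TB")
# prints "2.55 TB to 85 TB"
```

The real total would fall somewhere between these bounds, since flight counts and sizes vary by county.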
Today’s Geospatial Data as Tomorrow’s Cultural Heritage • Future uses of data are difficult to anticipate (as with Sanborn Maps).
Digital Preservation Points of Failure • Data is not saved, or … • can’t be found, or … • media is obsolete, or … • media is corrupt, or … • format is obsolete, or … • file is corrupt, or … • meaning is lost. Solutions: Migration, Emulation, Encapsulation, XML
Risks to Digital Geospatial Data • Producer focus on current data • Data overwrite as common practice • Future support of data formats in question • No open, supported format for vector data • Shift to web services-based access • Data becoming more ephemeral • Inadequate or nonexistent metadata • Impedes discovery and use • Increasing use of spatial databases for data management • Complex entities: the whole is greater than the sum of the parts
NCGDAP Approach to Preservation • Technical solutions: How do we archive acquired content over the long term? • Build a data repository: not as an end in itself but as a catalyst for discussion within the data community • Develop a repository ingest workflow: create technical points of engagement with the NDIIPP partners • Cultural/Organizational solutions: How do we make the data more preservable—and more prone to be archived—from point of production? • Engage data producer community and spatial data infrastructure through outreach and engagement; influence practice • Sell the problem to software vendors and standards development • Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc. • Start a discussion about roles at the local, state, and federal level
Repository Ingest Workflow • Flexible, extensible processes • Clear, documented procedures • Adherence to standard practices, where they exist • Automation
Technical Solution: Building a Digital Repository • Three “Rights”: • Right format • Right tags (metadata) • Right relationship • Oh, and of course, valid for the rest of the Digital Age! • NCGDAP is about researching methodologies…
What is the “Right” Format??? • Well, it’s complicated… • Open Source • Developments • Multi-part datasets • Web Services • Databases
Our Format Methodology • Decide on archival format(s) • Migrate non-archival formats • Archive both versions of the data set • We need a methodology that can do this a few hundred thousand times…initially.
Needles in the Haystack • Computer Programs Written • Utilize functionality of GIS • Iterate through the data sets • Create “bundles” for deposit • Process Steps • Locate a data set • Determine the format • Make appropriate conversion • Create and isolate “bundle” with new and original format • Repeat
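The process steps above can be sketched as a small loop. This is a minimal illustration and not the project's actual programs: the extension tables, the `convert` placeholder, and the `original`/`archival` bundle layout are all assumptions made for the example. A real conversion would call GIS functionality, and multi-part data sets are treated here as single files for simplicity.

```python
import shutil
from pathlib import Path

# Hypothetical policy tables (illustrative only, not the project's choices):
# extensions treated as already archival, and migration targets for others.
ARCHIVAL = {".shp", ".tif"}
MIGRATE_TO = {".e00": ".shp", ".sid": ".tif"}


def convert(src: Path, dest: Path) -> None:
    """Placeholder for a real GIS conversion; this stub only copies bytes."""
    shutil.copyfile(src, dest)


def bundle_data_sets(source_root: Path, deposit_root: Path) -> list[Path]:
    """Create one bundle directory per recognized data set under deposit_root.

    deposit_root is assumed to lie outside source_root.
    """
    bundles = []
    for src in sorted(source_root.rglob("*")):        # locate a data set
        if not src.is_file():
            continue
        ext = src.suffix.lower()                      # determine the format
        if ext not in ARCHIVAL and ext not in MIGRATE_TO:
            continue                                  # unrecognized: skip it
        bundle = deposit_root / src.stem              # isolate the bundle
        (bundle / "original").mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, bundle / "original" / src.name)
        if ext in MIGRATE_TO:                         # make the conversion
            (bundle / "archival").mkdir(exist_ok=True)
            convert(src, bundle / "archival" / (src.stem + MIGRATE_TO[ext]))
        bundles.append(bundle)                        # repeat for the next one
    return bundles
```

Keeping the original next to the migrated copy follows the "archive both versions" step of the format methodology, so a poor conversion choice made today can be redone later from the source file.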
Geologic and Historic Topographic Maps: Georeferencing and Preservation
Historic Topographic Map Preservation • 165 Historic 15-minute series topographic maps for NC • Date range: 1892-1959 • Documentation at http://www.lib.ncsu.edu/gis/historictopos.html • Available on NCSU Libraries Geodata server
Geologic Map Preservation • 290 Geologic Maps for NC • Map sources are US Geologic Survey, NC Geologic Survey, theses and dissertations • Documentation at http://www.lib.ncsu.edu/gis/geolmaps.html • Public download at http://wfs.enr.state.nc.us/NCGeologicMaps/
Geologic Map Preservation • Map scale ranges shown: 1,200 – 24,000; 1:31,680 – 1:430,000; 1:500,000 – 1:2.5 M
NCGS Project Summary • Project came to us: workplan and intern already identified • Preservation risk: data was stored on an external drive • Content is in high demand by patrons, available in hardcopy only, and hard to obtain • Collection acquired at no cost to Libraries • Data files publicly available for download • Partnership with NC Dept. of Environment and Natural Resources; increasing interest in preservation • Early raster dataset for NCGDAP: test for large data volumes, ingest process, metadata creation • NCGS Open File Report forthcoming
NCGDAP: Engagement with the Data Community • Engaging spatial data infrastructure • Evaluating metadata and content standard adherence • Cultivation of content exchange networks • Sept. 2006 survey of current practice in local agencies • External partnerships • Partners on JISC-funded effort in the UK (Edinburgh) • Engaging software vendors • Meetings with ESRI development teams • Engaging standards development processes • Nov. 2005, partnered with University of Edinburgh on presenting the preservation problem to the Open Geospatial Consortium (OGC) Technical Committee • Oct. 2006, partnered with NARA on initiating a formal working group on digital preservation within the OGC
NCGDAP on the Road • Presentations, posters, and workshops, Jan. 2005 - Sept. 2006 • Highlights: O’Reilly Where 2.0; OGC Meeting (Germany); Digital Curation Center (UK); IS&T Archiving (Canada); IASSIST (UK); ESRI International; Joint NDIIPP & JISC Meeting • National/International: 37 • State/Local: 21
NCGDAP: Future Directions • Project shifting to data acquisition mode • Current contract ends Oct. 2007 • Likely continuation of project funding through Oct. 2010 • Four responses to additional LC “Requests for Expression of Interest (RFEI)” • Development of content exchange networks • Development of tool for automated capture of web mapping services • Participation in repository exchange tests • Multi-state project involving State Archives … RFEI status pending
Questions? • North Carolina Geospatial Data Archiving Project website: http://www.lib.ncsu.edu/ncgdap • Library of Congress NDIIPP website: http://www.digitalpreservation.gov/