Geospatial Integration
Jim Barrett – data.gov/NGP PMOs
March 3rd, 2011
Enterprise Planning Solutions LLC
Presentation Outline
Integration discussion context
• Data Integration vs. Data Interoperability
What do users want and need to know?
What is stopping us from meeting needs?
How can we think about the problem differently?
Case Studies – NGP, Data.gov, Indonesian NSDI
What can and should be done?
Value Chain – data.gov – Integration Context
[Diagram: the data.gov value chain, running from the Supply Side (Community of Suppliers) to the Use Side (Community of Users). Stages in order: Acquire Data, Build Dataset, Publish Dataset, Enable Discovery, Discover, Connect, Participate, Enable Use. Labels: "Data.gov Supply Chain Management – no geo integration focus"; "Access and Interoperability Focused".]
Typical Spatial Data Integration
Data Qualities
• Temporal – currentness, vintage…
• Semantic – meaning of the object and its attributes
• Spatial dimensions (X, Y, Z)
• Accuracy (positional)
• Topology/modeling
• Resolution
• Representation
All important qualities – how we attain them will require not only technology but improvement to how we manage
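The data qualities listed above can be sketched as a small record plus a comparison that reports where two datasets disagree before they are integrated. This is an illustration only: the class name, field names, units, thresholds and the two sample datasets are all assumptions made for the example, not part of any data.gov or NGP specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetQualities:
    """Hypothetical record of the seven qualities named on the slide."""
    name: str
    vintage_year: int             # temporal: currentness, vintage
    schema: str                   # semantic: meaning of objects and attributes
    dimensions: int               # spatial dimensions: 2 for (X, Y), 3 for (X, Y, Z)
    positional_accuracy_m: float  # accuracy (positional), assumed to be in metres
    has_topology: bool            # topology/modeling
    resolution_m: float           # resolution, assumed to be in metres
    representation: str           # e.g. "vector" or "raster"


def integration_mismatches(a, b, max_vintage_gap_years=5, max_ratio=2.0):
    """Return the qualities on which datasets a and b disagree.

    The thresholds are arbitrary placeholders chosen for the example.
    """
    issues = []
    if abs(a.vintage_year - b.vintage_year) > max_vintage_gap_years:
        issues.append("temporal")
    if a.schema != b.schema:
        issues.append("semantic")
    if a.dimensions != b.dimensions:
        issues.append("spatial dimensions")
    accuracies = (a.positional_accuracy_m, b.positional_accuracy_m)
    if max(accuracies) > max_ratio * min(accuracies):
        issues.append("accuracy")
    if a.has_topology != b.has_topology:
        issues.append("topology/modeling")
    resolutions = (a.resolution_m, b.resolution_m)
    if max(resolutions) > max_ratio * min(resolutions):
        issues.append("resolution")
    if a.representation != b.representation:
        issues.append("representation")
    return issues


# Two made-up datasets, used only to exercise the comparison.
roads = DatasetQualities("roads", 2010, "transport-v1", 2, 5.0, True, 10.0, "vector")
schools = DatasetQualities("schools", 2003, "facilities-v1", 2, 12.0, False, 10.0, "vector")

mismatches = integration_mismatches(roads, schools)
print(mismatches)
```

A report like this only surfaces the technical gaps; as the slide notes, closing them also depends on how the data are managed.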
Simple supply-side questions that are very hard to answer
Who produces the information I need?
Are they “the” recognized authority? How can I tell?
How often will it be re-published?
• Is the supply predictable and reliable? Can I count on it?
Do the data have a geospatial characteristic?
• What are its geospatial qualities (specs) and provenance?
• Is it consistently defined in its meaning?
• What is the scope of its coverage?
Will the data be maintained?
• Geometry and models
• Attributes and metadata
Where do I get it and in what forms?
Barriers to integration
What is preventing our data from being integrated?
• Acquisition
  • Uncoordinated data acquisition strategies at the national level
  • Barrier between business data and geospatial data, e.g., schools, minerals
  • Few means to broker and optimize requirements from consumers
• Production
  • Quality of our metadata, and when and how we get it
  • Unclear operational roles in a national data framework (NSDI)
  • Absence of a granular or meaningful trustworthy data chain of authority
  • Absence of a schedule to communicate what is going to happen
Barriers
What is preventing our data from being integrated?
• Data Management
  • Cataloging
  • Fundamental Semantics (A16)
• Policy, Organization and Culture
  • Federated political and government collection and production environments
  • Divergent data quality requirements – national, state, local, regional
  • Stove-piped national Geodetic policy (A16)
  • Shifting market expectations and tolerances for lower quality in favor of access?
  • Legacy institutional barriers and thinking
  • They are national assets, not just a program’s data.
Where are the problems occurring in the Value Chain?
[Diagram: the value chain from the Supply Side (Community of Suppliers) to the Use Side (Community of Users). Stages in order: Acquire Data, Build / Intra Dataset Integration, Publish Dataset, Enable Discovery, Discover, Connect, Participate, Enable Use. Problem callouts: "Ambiguous Cataloging and semantics"; "Gap in what gets integrated"; "Downstream Data Integration $$$"; "Gap in planning view of Acquisition". Labels: "Data.gov Supply Chain Management"; "Data Integration Focused"; "Access and Interoperability Focused".]
What we have is many value chains running in parallel. It is hard to do integration without a systematic collaborative approach.
We need to integrate the supply chain.
How can we think about the problem differently?
Organizing Principles
A supply chain is a system of organizations, people, technology, activities, information and resources involved in moving a product or service from supplier to customer.
Supply chain activities transform natural resources, raw materials and components into a finished product that is delivered to the end customer. In our case, the product is information.
In sophisticated supply chain systems, used products may re-enter the supply chain at any point where residual value is recyclable.
Supply chains link value chains.
How to think about the supply chain?
Supply Chain Models:
• Make to Stock – standardized, inventory driven, off the shelf – e.g., USTopo, Gazetteers
• Make to Order – customer driven, configurable, longer lead times – L5 and 7 near line
• Engineer to Order – custom per unique requirements
Nationally, what are we?
We hover between make to order and make to stock without addressing the data integration parameters.
• We push integration too far down the value chain
• We don’t think of our data assets as stock inventory
• Focus on collection and build - not integration and use
• Myopic vs. Synoptic
Objective - Foundational data should be integrated and made to stock
Case Study NGP
USGS National Geospatial Program – NGP
• Using Supply Chain Management principles to establish an integrated National Topographic Baseline.
NGP
Objective – provide NGP a framework to cost-effectively and sustainably manage the feature content of the National Topographic Baseline (better than 24,000 map scale).
Approach: Define the necessary supplier relationship types, evaluation criteria, and the roles and responsibilities needed to manage the lifecycle of the feature content more cost-effectively.
Types of Suppliers and Roles
Stewardship – Communities of Use/Supply have a “vested interest” and some expertise in managing and integrating the data to meet their specific objectives. NGP has a policy obligation to build and maintain the asset.
• NGP: shared Integrator and Aggregator roles
Source Suppliers – The Community of Supply provides data and information uni-directionally, used either as a product in and of itself or as critical input to processes that support deriving and integrating feature information. NGP does not fundamentally edit the source but uses it to achieve higher value-added information.
• NGP: Integrator and Aggregator
Collaborative Suppliers – Suppliers who have a shared information requirement with the NGP, who are funded (or soon to be funded) in a sustainable fashion, and who will provide feature data and attribute aggregation and standardization value.
• Integrator and Aggregator: roles negotiated
Licensed Supplier – NGP buys the information and incurs any opportunity loss with updating, integrating or enriching the dataset or datasets. Typically, licensed suppliers fill the voids when other types of relationships cannot be brokered effectively.
• NGP: no Integrator and no Aggregator role
Volunteer Geographic Information – At this point, considered a method that can be used to collect and edit feature data that cannot be obtained adequately or cost-effectively through other means.
• NGP: Integrator QA/QC and no Aggregator role
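The supplier types and the NGP role attached to each can be sketched as a lookup table. This is one reading of the slide, not an NGP artifact: the identifiers `SupplierType`, `NgpRole`, `NGP_ROLES` and `ngp_may_integrate` are hypothetical, and the role strings paraphrase the slide's wording.

```python
from enum import Enum
from typing import NamedTuple


class SupplierType(Enum):
    STEWARDSHIP = "Stewardship"
    SOURCE = "Source Supplier"
    COLLABORATIVE = "Collaborative Supplier"
    LICENSED = "Licensed Supplier"
    VGI = "Volunteer Geographic Information"


class NgpRole(NamedTuple):
    """NGP's integrator and aggregator role for one supplier type."""
    integrator: str
    aggregator: str


# Role values paraphrase the slide; "none" stands for "No ... role".
NGP_ROLES = {
    SupplierType.STEWARDSHIP: NgpRole("shared", "shared"),
    SupplierType.SOURCE: NgpRole("NGP", "NGP"),
    SupplierType.COLLABORATIVE: NgpRole("negotiated", "negotiated"),
    SupplierType.LICENSED: NgpRole("none", "none"),
    SupplierType.VGI: NgpRole("QA/QC", "none"),
}


def ngp_may_integrate(supplier_type):
    """True when NGP has, shares, or may negotiate an integrator role."""
    return NGP_ROLES[supplier_type].integrator != "none"


# Supplier types for which NGP has some integration involvement.
integrating_types = [t.value for t in SupplierType if ngp_may_integrate(t)]
print(integrating_types)
```

Keeping the roles in one table makes it easy to ask, for any dataset's supplier relationship, who is expected to integrate and who is expected to aggregate.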
Core Principles
NGP Supplier Framework
Geospatial datasets (or single features or classes of features) are the core transaction unit.
Leverage datasets that are managed by business systems where the aggregation role and business data are maintained regularly.
Follow the money and regulation.
Work towards breaking down the traditional view of business systems vs. a mapping system or GIS.
Datasets are inconsistently managed between the two “system” paradigms, and they needn’t be.
Traditional mapping acquisition strategies are too costly and cannot be afforded.
Mockup Supply Chain Plan
Strategy
Build and publish a national Supply Chain Plan for the data.
Suppliers and Integrators - make it a part of program and budget planning
Communicate, Communicate, Communicate
Build and manage to robust inventories, not just metadata records
Rethink A16
• Semantics – break the theme model - policy needs to be set towards Web 2.0, 3.0
• Create an actionable A16 – assign roles and responsibility at the operating-unit level, and identify the associated aggregation, production, integration and delivery nodes/systems for datasets
• Assess gaps in Roles and Responsibilities for the granular datasets.
Don’t do it in IT. Business problem. Business owns and drives IT.
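The gap assessment in the last A16 bullet can be sketched as a check over a per-dataset plan: each dataset should name a responsible operating unit for aggregation, production, integration and delivery. The plan layout, the function name `role_gaps` and the unit names below are hypothetical placeholders; only the four role names come from the slide.

```python
# The four node/system roles named on the slide.
REQUIRED_ROLES = ("aggregation", "production", "integration", "delivery")


def role_gaps(plan):
    """Map each dataset to the roles that have no responsible operating unit."""
    gaps = {}
    for dataset, assignments in plan.items():
        missing = [role for role in REQUIRED_ROLES if not assignments.get(role)]
        if missing:
            gaps[dataset] = missing
    return gaps


# Made-up plan entries; "Unit A", "Unit B" and "Unit C" are placeholders.
example_plan = {
    "hydrography": {
        "aggregation": "Unit A",
        "production": "Unit A",
        "integration": "Unit B",
        "delivery": "Unit B",
    },
    "transportation": {
        "aggregation": "Unit C",
        "production": "Unit C",
    },
}

example_gaps = role_gaps(example_plan)
print(example_gaps)
```

In this made-up plan, transportation has no unit assigned for integration or delivery, which is the kind of gap the strategy asks to surface at the granular dataset level.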
The Real Challenge in a nutshell…
We have an unbounded problem.
• Need to bring it to a managed state
• Need to focus it on specified value-based outputs or results
Supply Chain Key Points
Move towards - Make to Stock
Develop inventory practices up front as function planning in addition to our metadata
Need published semantically defined supply chain plan for the Plan
Geo - Supply Side Barriers
• Islands of Data – Fragmentation of content
• Stored locally, shelved or just tucked away
• Project oriented
• Spatial – temporal – positional accuracy
• Data Supply Chain is still not tapped
• Incentive to participate?
• Seen as more work with little benefit
• Numerous inventories of data not exposed (collections, series, imagery)
• GeoData, especially base data, is often duplicative
• A lot of data is still not spatially enabled
A16
• (1) What are data themes?
Data themes are electronic records and coordinates for a topic or subject, such as elevation or vegetation. This Circular requires the development, maintenance, and dissemination of a standard core set of digital spatial information for the Nation that will serve as a foundation for users of geographic information. This set of data consists of themes of national significance (see Appendix E). Themes providing the core, most commonly used set of base data are known as framework data, specifically geodetic control, orthoimagery, elevation and bathymetry, transportation, hydrography, cadastral, and governmental units. Other themes of national significance are also an important part of the NSDI, and must be available to share with others. Additional data themes may be added with the approval of the FGDC.
(5) Coordinate and work in partnership with federal, state, tribal and local government agencies, academia and the private sector to efficiently and cost-effectively collect, integrate, maintain, disseminate, and preserve spatial data, building upon local data wherever possible.
Supplying necessary information to the interagency coordinating committee concerning its surveying, mapping, and related spatial data requirements, programs, activities, and products; and
• http://www.whitehouse.gov/omb/circulars_a016_rev
Binding Thoughts
“All” data is local – especially geospatial
GeoData is expensive
• We must collect once and use many times
• IT ownership
GeoData is different:
• “They” don’t get it; they don’t understand; it’s hard
It’s the other agency’s responsibility;
• I shouldn’t have to do it.
Dimensions of the Sonar Diagram
National Topographic Baseline (NTB)
Supplier Relationship Types and criteria for the NTB
Datasets (geometry and attribute qualities) – the basic working element comprising the NTB
Spheres of Influence - Data Integration and Maintenance Responsibilities of NTB Datasets
Lifecycle cost planning and regulatory dimensions