
Pipeline Architecture

Pipeline Architecture. Eugene Feld, Vladimir Zhukov. Introduction. What it is: a platform for integration of external (primarily bulk) content sources, currently used for core Map and DiCi Transit sourcing. What it has: a set of tools for data cleansing and normalization.


Presentation Transcript


  1. Pipeline Architecture Eugene Feld, Vladimir Zhukov

  2. Introduction • What it is: • Platform for integration of external (primarily bulk) content sources • Used currently for core Map and DiCi Transit sourcing • What it has: • Set of tools for data cleansing and normalization • Engine(s) for cross layer spatial and non-spatial data conflation • Configurable workflows for source-specific processing • Supports various standard GIS data formats as well as custom formats • Allows for integration of COTS ETL tools into the processing pipeline
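As a rough illustration of "configurable workflows" built from cleansing and normalization tools, the sketch below applies an ordered, per-source list of normalization steps to an attribute value. It is only a minimal sketch under assumptions: the class name `NormalizationSketch`, the individual steps, and the idea of representing a workflow as a list of functions are all hypothetical and do not come from the presentation.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical sketch: names and steps are illustrative assumptions, not the pipeline's actual tools.
class NormalizationSketch {

    // Example cleansing steps; a real source would configure its own list.
    static final UnaryOperator<String> TRIM = String::trim;
    static final UnaryOperator<String> COLLAPSE_SPACES = s -> s.replaceAll("\\s+", " ");
    static final UnaryOperator<String> UPPER_CASE = s -> s.toUpperCase(Locale.ROOT);

    // Applies the configured steps to the value, in list order.
    static String normalize(String value, List<UnaryOperator<String>> steps) {
        String out = value;
        for (UnaryOperator<String> step : steps) {
            out = step.apply(out);
        }
        return out;
    }

    public static void main(String[] args) {
        // A source-specific configuration is just a different list of steps.
        List<UnaryOperator<String>> streetNameSteps = List.of(TRIM, COLLAPSE_SPACES, UPPER_CASE);
        System.out.println(normalize("  Main   st ", streetNameSteps));
    }
}
```

Keeping each step as a small function means a source-specific workflow differs only in which steps it lists and in what order.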

  3. Pipeline Data Flow 2009

  4. Technology Stack 2009 [diagram; labels: Source Formats (Transit XML, CSV, ESRI Formats); Workflow; Visualization and Editing; Apache Ant; Computational Logic; Spatial ETL; Java SE; ArcGIS Desktop; ArcObjects Plug-ins; FME; ArcSDE; Oracle PL/SQL; Spatial Indexes]

  5. Architectural Characteristics 2009 • All manual activities precede batch processing and integration to Core. No support for review/approve tasks within workflow • Manual process kick-off, even for recurring source deliveries • Core Map insert/delete via R2R/ClipTool. No support for update. • Monolithic computational components. No support for scaling through distributed architecture. • Exclusive write lock on Core Repos required for integration • Low visibility in process status with ANT workflows • Brittle and lengthy Reference NQ creation process from IWs

  6. Business Reasons For Change • Significant map expansion of 2009-2010 shifts focus to rapid update of the newly added areas through external sources • Map Update ability is required in addition to Add/Delete which exists today • Emergence of online change collection • Need for event-driven, transactional source processing and integration • High volume of source deliveries calls for greater efficiency of map update process • Automated change detection • Visibility into automated processing workflows • Complementary update detection logic between Postal, Map and Transit data processing is identified • Emergence of common platform for source based change detection and processing
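To make the Add/Delete versus Update distinction concrete, here is a minimal sketch of attribute-level change detection between a reference data set and a new source delivery. It assumes, hypothetically, that features carry a stable ID and a flat map of string attributes; the class `ChangeDetectionSketch`, the `ChangeType` enum, and the comparison rule are illustrative assumptions, not the pipeline's actual conflation logic, which the slides describe as graph-based and spatial.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: names and the ID-based matching rule are assumptions, not taken from the slides.
class ChangeDetectionSketch {

    enum ChangeType { ADD, DELETE, UPDATE }

    // Compares attribute maps keyed by an assumed stable feature ID.
    // Features whose attributes are identical on both sides are not reported.
    static Map<String, ChangeType> detect(Map<String, Map<String, String>> reference,
                                          Map<String, Map<String, String>> source) {
        Map<String, ChangeType> changes = new TreeMap<>();
        // Union of IDs from both sides, sorted for deterministic output.
        TreeSet<String> ids = new TreeSet<>(reference.keySet());
        ids.addAll(source.keySet());
        for (String id : ids) {
            Map<String, String> ref = reference.get(id);
            Map<String, String> src = source.get(id);
            if (ref == null) {
                changes.put(id, ChangeType.ADD);       // only in the new delivery
            } else if (src == null) {
                changes.put(id, ChangeType.DELETE);    // missing from the new delivery
            } else if (!ref.equals(src)) {
                changes.put(id, ChangeType.UPDATE);    // present in both, attributes differ
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> reference = Map.of(
                "a", Map.of("name", "Main St"),
                "b", Map.of("name", "Oak Ave"));
        Map<String, Map<String, String>> source = Map.of(
                "b", Map.of("name", "Oak Avenue"),
                "c", Map.of("name", "Elm St"));
        System.out.println(detect(reference, source));
    }
}
```

An insert/delete-only integration would have to express the change to feature "b" as a delete plus an add; reporting it as an update is what the "Map Update ability" bullet asks for.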

  7. Pipeline Data Flow 2010+ [diagram; labels as extracted: BPM ArcDesktop Map Sources (ESRI) TrX Generation Change Candidates EGIS Turbo Map Geometry Change Detection Attribute Change Detection Asset Management and Access Interface DiCi Transit Sources (XML) NQ IW Source Normalization Attribute Derivation Refresh TBD Reference NQ Graph Conflation Engine]

  8. Functional View 2010+ [diagram; labels as extracted: Human Activities EGIS Asset Management Source2NTMapping Reference NQ New Coverage NQ New Coverage tNQ Sources (PGDB, SHP) Change Detection Normalization Correlation Tools jBPM Workflow Conflation Review Graph Conflation Attr Change Detection New Geo Detection Geo Change Detection Arc SDE Change Candidate Management Change Candidate DB Change Review TurboMap TrX generation Change Correlation Turbo Map Core Repos Production Cleanup Map IW DiCi RMOB2NQ DC Sync]

  9. Improvements circa 2010+ • Manual review/approval tasks interspersed with computational tasks • Automated change detection through Graph Conflation Engine • Change Candidates are tracked, reviewed, and approved by users prior to TM submission • Process kick-off of recurrent source deliveries is triggered by AM notifications • Rich configuration for Reference NQ creation
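The rule that Change Candidates are reviewed and approved before TM submission can be pictured as a small state machine. The sketch below is only an illustration under assumptions: the state names (`DETECTED`, `IN_REVIEW`, `APPROVED`, `REJECTED`, `SUBMITTED`), the allowed transitions, and the class `ChangeCandidateLifecycle` are hypothetical and are not taken from the slides or from jBPM.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: state names and transitions are assumptions, not the deck's actual workflow.
class ChangeCandidateLifecycle {

    enum State { DETECTED, IN_REVIEW, APPROVED, REJECTED, SUBMITTED }

    // Allowed next states for each state; submission is reachable only from APPROVED.
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.DETECTED, EnumSet.of(State.IN_REVIEW));
        ALLOWED.put(State.IN_REVIEW, EnumSet.of(State.APPROVED, State.REJECTED));
        ALLOWED.put(State.APPROVED, EnumSet.of(State.SUBMITTED));
        ALLOWED.put(State.REJECTED, EnumSet.noneOf(State.class));
        ALLOWED.put(State.SUBMITTED, EnumSet.noneOf(State.class));
    }

    private State state = State.DETECTED;

    State state() {
        return state;
    }

    // Moves to the next state, or fails if the workflow does not permit the step.
    void transitionTo(State next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }

    public static void main(String[] args) {
        ChangeCandidateLifecycle candidate = new ChangeCandidateLifecycle();
        candidate.transitionTo(State.IN_REVIEW);
        candidate.transitionTo(State.APPROVED);
        candidate.transitionTo(State.SUBMITTED);
        System.out.println(candidate.state());
    }
}
```

Encoding the transitions as data keeps the "approve before submit" rule in one place, so an unreviewed candidate cannot reach submission by accident.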
