
Spatial Big Data Challenges Intersecting Cloud Computing and Mobility

Spatial Big Data Challenges Intersecting Cloud Computing and Mobility. Shashi Shekhar McKnight Distinguished University Professor Department of Computer Science and Engineering University of Minnesota www.cs.umn.edu/~shekhar. Shortest Paths. Storing graphs in disk blocks.


Presentation Transcript


  1. Spatial Big Data Challenges Intersecting Cloud Computing and Mobility Shashi Shekhar McKnight Distinguished University Professor Department of Computer Science and Engineering University of Minnesota www.cs.umn.edu/~shekhar

  2. Spatial Databases: Representative Projects • Shortest Paths • Storing graphs in disk blocks • Evacuation Route Planning (figure legend: only in old plan, only in new plan, in both plans) • Parallelize Range Queries

  3. Why cloud computing for spatial data? • Geospatial Intelligence [ Dr. M. Pagels, DARPA, 2006] • Estimated at 140 terabytes per day, 150 peta-bytes annually • Annual volume is 150x historical content of the entire internet • Analyze daily data as well as historical data

  4. Eco-Routing U.P.S. Embraces High-Tech Delivery Methods (July 12, 2007) “The research at U.P.S. is paying off. … saving roughly three million gallons of fuel in good part by mapping routes that minimize left turns.” • Minimize fuel consumption and GHG emissions • rather than proxies, e.g. distance, travel-time • avoid congestion, idling at red lights, turns and elevation changes, etc.

  5. Real-time and Historic Travel-time, Fuel Consumption, GPS Tracks

  6. Eco-Routing Research Challenges • Frames of Reference • Absolute to moving object based (Lagrangian) • Data model of Lagrangian graphs • Conceptual – generalize time-expanded graph • Logical – Lagrangian abstract data types • Physical – clustering, index, Lagrangian routing algorithms • Flexible Architecture • Allow inclusion of new algorithms, e.g., GPS-track mining • Merge solutions from different algorithms • Geo-sensing of events • e.g., volunteered geographic information (e.g., OpenStreetMap) • social unrest (Ushahidi), flash-mob, … • Geo-Prediction • e.g., predict track of a hurricane or a vehicle • Challenges: auto-correlation, non-stationarity • Geo-privacy
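
The slide names the time-expanded graph as the conceptual model that Lagrangian graphs generalize, without defining it. As a rough illustration only, the sketch below builds a plain time-expanded graph (one copy of each node per time step, plus waiting edges) and finds the earliest arrival by breadth-first search. The road network, the travel-time series, and all function names are hypothetical; the Lagrangian generalization itself is not shown.

```python
from collections import defaultdict, deque

def time_expanded_edges(travel_time, horizon):
    """Build edges between (node, step) copies.

    travel_time[(u, v)][t] is the (hypothetical) number of steps needed to
    traverse u->v when departing at step t; each series needs >= horizon entries.
    """
    nodes = sorted({n for edge in travel_time for n in edge})
    edges = []
    for t in range(horizon):
        for n in nodes:
            edges.append(((n, t), (n, t + 1)))  # waiting edge: stay put one step
        for (u, v), series in sorted(travel_time.items()):
            arrival = t + series[t]
            if arrival <= horizon:
                edges.append(((u, t), (v, arrival)))  # travel edge
    return edges

def earliest_arrival(edges, source, target, depart=0):
    """Earliest step at which target is reachable from (source, depart), or None."""
    adjacency = defaultdict(list)
    for tail, head in edges:
        adjacency[tail].append(head)
    seen = {(source, depart)}
    queue = deque(seen)
    best = None
    while queue:
        node, t = queue.popleft()
        if node == target and (best is None or t < best):
            best = t
        for nxt in adjacency[(node, t)]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return best

# Hypothetical congestion: A->B is slow if you leave at step 0, fast afterwards.
TRAVEL_TIME = {
    ("A", "B"): [5, 1, 1, 1, 1, 1],
    ("B", "C"): [1, 1, 1, 1, 1, 1],
}
EDGES = time_expanded_edges(TRAVEL_TIME, horizon=6)
```

In this toy network, waiting one step at A before departing reaches C earlier than leaving immediately, which is the kind of time-dependent behavior a static graph cannot express.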

  7. Cloud Computing and Spatial Big Data • Motivation • Case Study 1: Simpler to Parallelize • Case Study 2 – Harder • Case Study 3 – Hardest • Wrap up

  8. Simpler: Land-cover Classification • Multiscale Multigranular Image Classification into land-cover categories • Figure: inputs; output at 2 scales

  9. Parallelization Choice
       Initialize parameters and memory
       for each Spatial Scale
         for each Quad
           for each Class
             Calculate Quality Measure
           end for Class
         end for Quad
       end for Spatial Scale
       Post-processing
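
One possible answer to the parallelization choice in the loop nest above is to treat each quad as an independent task and keep the class loop sequential inside the task. The sketch below shows only that structure. The scales, class names, quad layout, and the placeholder `quality_measure` are all assumptions, not the talk's actual classifier, and a thread pool stands in for whatever parallel platform is used (CPU-bound work in Python would normally need processes or native code).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Hypothetical inputs; the slide does not specify any of these.
SCALES = [1, 2]
CLASSES = ["water", "forest", "urban"]

def quads_at(scale):
    """Return quad identifiers for a scale: a 2^scale x 2^scale grid."""
    n = 2 ** scale
    return [(row, col) for row in range(n) for col in range(n)]

def quality_measure(scale, quad, cls):
    """Placeholder score; a real classifier would evaluate a statistical model."""
    row, col = quad
    return (row * 31 + col * 17 + len(cls) * 7 + scale) % 100

def best_class_for_quad(scale, quad):
    """Innermost loop (over classes), run sequentially inside one task."""
    return max(CLASSES, key=lambda cls: quality_measure(scale, quad, cls))

def classify(max_workers=4):
    """Parallelize the 'for each Quad' loop; scales are processed in order."""
    labels = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for scale in SCALES:
            quads = quads_at(scale)
            results = pool.map(partial(best_class_for_quad, scale), quads)
            for quad, label in zip(quads, results):
                labels[(scale, quad)] = label
    return labels

LABELS = classify()
```

Because no quad depends on another quad's result, the tasks need no communication, which is consistent with the talk calling this case "simpler".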

  10. Harder: Parallelizing Vector GIS • (1/30) second response time constraint on range query • Parallel processing necessary since best sequential computer cannot meet requirement • Blue rectangle = a range query; polygon colors show processor assignment • Diagram labels: Local Terrain Database, Remote Terrain Databases, Set of Polygons, High Performance GIS Component, Graphics Engine, Display, 8 Km X 8 Km Bounding Box, 25 Km X 25 Km Bounding Box, 2 Hz, 30 Hz View Graphics

  11. Data-Partitioning Approach • Initial Static Partitioning • Run-Time dynamic load-balancing (DLB) • Platforms: Cray T3D (Distributed), SGI Challenge (Shared Memory)
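
To make the two steps above concrete, here is a small simulation, offered as a sketch rather than the scheme used on the Cray T3D or SGI Challenge. It assumes round-robin static partitioning and a shared pool from which the least-loaded processor pulls the next work item; the per-polygon costs and the function names are hypothetical, and communication overhead is deliberately ignored.

```python
import heapq

def static_partition(costs, num_procs):
    """Round-robin static assignment; returns the total load per processor."""
    loads = [0.0] * num_procs
    for i, cost in enumerate(costs):
        loads[i % num_procs] += cost
    return loads

def makespan_with_pool(costs, num_procs, pool_size):
    """Finish time when the last `pool_size` items are balanced dynamically.

    All other items are assigned statically up front. Each pooled item goes
    to whichever processor currently has the smallest load.
    """
    if not 0 <= pool_size <= len(costs):
        raise ValueError("pool_size must be between 0 and len(costs)")
    split = len(costs) - pool_size
    loads = static_partition(costs[:split], num_procs)
    heap = [(load, proc) for proc, load in enumerate(loads)]
    heapq.heapify(heap)
    for cost in costs[split:]:
        load, proc = heapq.heappop(heap)       # least-loaded processor
        heapq.heappush(heap, (load + cost, proc))
    return max(load for load, _ in heap)

# Hypothetical per-polygon clipping costs: expensive and cheap polygons alternate,
# which is a bad case for round-robin on two processors.
COSTS = [9, 1, 9, 1, 9, 1, 9, 1]
```

With these costs and two processors, a larger pool gives a shorter finish time in this idealized model. A real system also pays for every pull from the pool, which this sketch does not model.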

  12. DLB Pool-Size Choice is Challenging!

  13. Hardest – Location Prediction • Nest locations • Distance to open water • Vegetation durability • Water depth

  14. Ex. 3: Hardest to Parallelize • Maximum Likelihood Estimation • Need cloud computing to scale up to large spatial datasets • However, computing the determinant of a large matrix is an open problem!
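
The slide does not say which model is being estimated. Assuming a spatial autoregressive style likelihood, which contains a term of the form ln|I - ρW| for a neighborhood matrix W, the sketch below computes that log-determinant with dense Gaussian elimination. The function names and the model assumption are mine, not the talk's. The cubic cost and the step-by-step dependence between pivots hint at why this step is hard to scale.

```python
import math

def log_abs_det(matrix):
    """Return ln|det(matrix)| via Gaussian elimination with partial pivoting.

    Dense O(n^3) method; returns -inf for a singular matrix.
    """
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    total = 0.0
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        pivot_row = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot_row][k] == 0.0:
            return float("-inf")
        a[k], a[pivot_row] = a[pivot_row], a[k]
        total += math.log(abs(a[k][k]))
        # Eliminate column k below the pivot; step k+1 depends on this result.
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return total

def sar_matrix(w, rho):
    """Build I - rho * W for a neighborhood matrix W (assumed model term)."""
    n = len(w)
    return [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
            for i in range(n)]

# Two mutually neighboring locations (hypothetical W).
W = [[0.0, 1.0], [1.0, 0.0]]
LOG_DET = log_abs_det(sar_matrix(W, 0.5))
```

A maximum likelihood search would evaluate this term for many candidate values of ρ, so for a W with millions of rows the dense approach shown here is not practical.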

  15. Cloud Computing and Spatial Big Data • Motivation: Spatial Big Data in National Security & Eco-routing • Case Study 1: Simpler to Parallelize • Map-reduce is okay • Should it provide spatial declustering services? • Can query-compiler generate map-reduce parallel code? • Case Study 2 – Harder • Need dynamic load balancing beyond map-reduce • Case Study 3 – Hardest • Need new computer science, e.g., • Eco-routing algorithms • determinant of large matrix • Parallel formulation of evacuation route planning

  16. Acknowledgments • HPC Resources, Research Grants • Army High Performance Computing Research Center-AHPCRC • Minnesota Supercomputing Institute - MSI • Spatial Database Group Members • Mete Celik, Sanjay Chawla, Vijay Gandhi, Betsy George, James Kang, Baris M. Kazar, QingSong Lu, Sangho Kim, Sivakumar Ravada • USDOD • Douglas Chubb, Greg Turner, Dale Shires, Jim Shine, Jim Rodgers • Richard Welsh (NCS, AHPCRC), Greg Smith • Academic Colleagues • Vipin Kumar • Kelley Pace, James LeSage • Junchang Ju, Eric D. Kolaczyk, Sucharita Gopal
