Lab: Advanced Network Technology and Services Laboratory • Presenter: 黃福銘 (Angus F.M. Huang)

TMSG • Adaptive Fastest Path Computation on a Road Network: A Traffic Mining Approach • Lab: Advanced Network Technology and Services Laboratory • Presenter: 黃福銘 (Angus F.M. Huang) • 2013.10.02

Historical traffic data or driving patterns are often more useful than the simple Euclidean distance-based computation because people must have good reasons to choose these routes

Publication • Conference • VLDB '07, September 23-28, 2007, Vienna, Austria • Authors • Hector Gonzalez, Jiawei Han, Xiaolei Li, Margaret Myslinska, John Paul Sondag • Department of Computer Science • University of Illinois at Urbana-Champaign

Outline • INTRODUCTION • PROBLEM DEFINITION • TRAFFIC DATABASE • ROAD NETWORK PARTITIONING • TRAFFIC MINING • PRE-COMPUTATION AND UPGRADES • FASTEST PATH COMPUTATION • EXPERIMENTAL EVALUATION • CONCLUSIONS

Introduction • MapQuest, MapPoint, Google Maps • Route planning systems • MapQuest served 10 billion route queries from 1996 to 2006 • Current speed conditions alone are not enough to find the fastest route • Road speed limits, average speed, … • Example 1: Importance of driving patterns • Local experts consider a multitude of important factors that are difficult to incorporate explicitly into a path-finding algorithm • Example 2: Importance of speed patterns • Time of departure, weather conditions, car pool lane, etc.

Introduction • Solution • Traffic-mining-based path-finding method • Speed and driving patterns from historic traffic data • Technical Contributions • Road hierarchy-based partitioning • Speed rule mining • Driving pattern mining • Adaptive pre-computation • Road upgrading • Adaptive fastest path algorithm

Problem Definition • Def.: Road network • G(V, E) • Def.: Speed pattern • <edge_id, t_start, t_end, (d1,d2,…,dk):m> • di is a value for speed factor Di • m is an aggregate function computed on edge speed • Def.: Driving pattern • A sequence s of edges e(1),e(2),…,e(l) • that appears more than min_sup times • support(s), the number of paths containing the sequence • length(s), the number of edges that it contains • Def.: Edge forecast model • F(edge_id, t) • Returns a tuple (d1,d2,…,dk) with the expected driving conditions for edge edge_id at time t
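
The definitions above can be sketched as plain data structures. This is only an illustration: the class name SpeedPattern, the helper lookup_speed, the use of hours for time, and the choice to treat a driving pattern as a contiguous run of edges are all assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

Conditions = Tuple[str, ...]  # (d1, ..., dk), e.g. ("rain", "car")


@dataclass(frozen=True)
class SpeedPattern:
    edge_id: int
    t_start: float          # assumed unit: hours since midnight
    t_end: float
    conditions: Conditions  # values for the speed factors D1..Dk
    speed: float            # m, the aggregate computed on edge speed


def lookup_speed(patterns, edge_id, t, conditions):
    """Return the aggregate speed of the first matching pattern, or None."""
    for p in patterns:
        if (p.edge_id == edge_id and p.t_start <= t < p.t_end
                and p.conditions == conditions):
            return p.speed
    return None


def support(seq, paths):
    """Number of paths containing seq as a contiguous run of edges (assumption)."""
    seq = tuple(seq)
    k = len(seq)
    return sum(
        any(tuple(p[i:i + k]) == seq for i in range(len(p) - k + 1))
        for p in paths
    )


def is_driving_pattern(seq, paths, min_sup):
    """A sequence is a driving pattern if it appears more than min_sup times."""
    return support(seq, paths) > min_sup
```

For example, with paths [[1, 2, 3], [2, 3, 4], [1, 3]], the sequence (2, 3) has support 2, so it is a driving pattern for min_sup = 1 but not for min_sup = 2.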

Time-of-day • D1 = weather • D2 = vehicle-type • Forecast function example • At 5 pm [time], for highway 74 between Champaign and Normal [edge], Weather = rain, and Construction = no [conditions] • Larger roads are shown in bold • 24,123 edges • 18,496 nodes • TIGER line files

Problem Statement • Given a road network G(V,E), a set of speed patterns S, an edge forecast model F, and a query q ← (s, e, start_time) • Compute a fast route q_r between nodes s and e, starting from s at time start_time, such that q_r contains a large number of frequent driving patterns

Traffic Database • (edge_id, time, speed) • Basic traffic observation • (car_id, edge_id, time, speed) • Radio-frequency tags • (edge_id, start_time, end_time, (d1,d2,…,dk):m) • Augment each traffic observation with the driving factors

Road Network Partitioning • Road hierarchy • Highway, interstate road, multi-lane road, small road,… • Grid-based partitioning is bad • The natural partition induced by the road hierarchy itself can be used to divide the network into semantically meaningful areas • With well defined driving and speed patterns • Given a road hierarchy with l levels, we can construct a hierarchy of areas as a tree of depth l-1 • Road class 1 is the largest, and road class l the smallest

San Joaquin Partitioned Map • a:b • a is the area number when roads of level 1 are used • b is the subarea of a when roads of level 2 are used to subdivide a • The upper left !! • Quite a few strong connections

Area Partitioning Algorithm • Generate semantically meaningful partitions • By using road hierarchy information • Flood filling technique • Identify strongly connected components • class(n) > k • It automatically identifies.. • Interior nodes, those with a single area in their area set • Border nodes, those with multiple areas in their area set • O(n) • O(1), interior node, n nodes • O(|a|), border nodes, a areas • O(n x |a|) • |a| << n
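
A minimal sketch of the flood-filling idea, under several assumptions: the graph is treated as undirected (the slide speaks of strongly connected components), class(n) is taken to be the smallest class number of any edge touching n, and the function name partition and the input layout are hypothetical. Nodes with class(n) > k are flooded; nodes on larger roads stop the flood but record every area that reaches them.

```python
from collections import defaultdict, deque


def partition(edges, k):
    """edges: iterable of (u, v, road_class), where class 1 is the largest road.
    Returns {node: set of area ids} for partitioning level k."""
    adj = defaultdict(list)
    node_class = {}
    for u, v, c in edges:
        adj[u].append(v)
        adj[v].append(u)
        # Assumed definition: a node takes the class of its largest road.
        node_class[u] = min(c, node_class.get(u, c))
        node_class[v] = min(c, node_class.get(v, c))

    areas = defaultdict(set)
    next_area = 0
    for seed in node_class:
        # Only unvisited nodes off the large roads can start a new area.
        if node_class[seed] <= k or areas[seed]:
            continue
        areas[seed].add(next_area)
        queue = deque([seed])
        while queue:
            n = queue.popleft()
            for m in adj[n]:
                if next_area in areas[m]:
                    continue
                areas[m].add(next_area)
                # Nodes on large roads are labelled but not expanded.
                if node_class[m] > k:
                    queue.append(m)
        next_area += 1
    return dict(areas)
```

In the result, a node whose area set has one element is an interior node, and a node with several elements is a border node, matching the distinction on the slide.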

Traffic Mining • Speed pattern mining • See the mining as a classification problem • Where we would like to predict edge speed based on time and feature values d1,…,dk • “if area = a1 and weather = icy and time = rush hour then speed = 1/4 × base speed” • Abstraction level, general representation • Run a preprocessing step to discretize speed factors, which will be treated as our class label • Use decision tree induction to perform rule induction
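
The discretize-then-classify pipeline can be illustrated without a full decision-tree learner. The sketch below swaps in a much simpler technique, a majority vote per (area, weather, time bucket) combination, purely to show the shape of the resulting rules. The function names, the four speed levels, and the observation layout are assumptions.

```python
from collections import Counter, defaultdict


def discretize(speed, base_speed, levels=(0.25, 0.5, 0.75, 1.0)):
    """Map a raw speed to the nearest fraction of base speed (the class label)."""
    ratio = speed / base_speed
    return min(levels, key=lambda lv: abs(lv - ratio))


def mine_rules(observations, base_speed):
    """observations: iterable of (area, weather, time_bucket, speed).
    Returns {(area, weather, time_bucket): most common speed class}."""
    votes = defaultdict(Counter)
    for area, weather, bucket, speed in observations:
        votes[(area, weather, bucket)][discretize(speed, base_speed)] += 1
    return {cond: c.most_common(1)[0][0] for cond, c in votes.items()}
```

With a base speed of 60, observations of 15, 17 and 40 for ("a1", "icy", "rush") produce the rule "speed = 1/4 × base speed" for that condition. A decision tree would additionally generalize across conditions that share a feature value, which this stand-in does not.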

Traffic Mining • Driving pattern mining • Ask local people for route tips in an unfamiliar area • Frequent pattern mining • Minimum support level • Uniform mining support level is difficult to define • And it may filter many important local roads, or may keep infrequently traveled high-level roads • Use a frequent pattern mining method guided by the area and road hierarchies • Frequent edges are mined according to different area level • To distinguish different level-supports
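
To make the point about level-dependent support concrete, here is a small sketch that mines frequent single edges with a different minimum support per area. The names frequent_edges, edge_area and min_sup_by_area are hypothetical, and mining longer sequences is left out.

```python
from collections import Counter


def frequent_edges(paths, edge_area, min_sup_by_area):
    """paths: list of edge-id lists; edge_area: edge_id -> area id;
    min_sup_by_area: area id -> minimum support used inside that area."""
    counts = Counter()
    for path in paths:
        counts.update(set(path))  # count each edge at most once per path
    return {e for e, c in counts.items()
            if c > min_sup_by_area[edge_area[e]]}
```

With paths [[1, 2], [1, 3], [1, 2], [4]], highway edges 1 and 4 held to a support of 2, and local edges 2 and 3 held to a support of 1, the result is {1, 2}. A uniform threshold of 2 would have dropped local edge 2, which is the filtering problem the slide describes.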

Pre-computation and Upgrades • To improve the performance both in terms of run time and path accuracy • Area level pre-computation • A*, Floyd-Warshall, … • When edge speed is a function of factors • The fastest path between two nodes may be different for different times and conditions • We can check two conditions to determine pre-computing benefits • How many fastest path queries will go through nodes of the pre-computed path • How stable is the path • To compute certain fastest paths only within the nodes inside the area
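
As an illustration of area-level pre-computation, the sketch below runs Floyd-Warshall over the nodes of a single area. It assumes travel times are already fixed for one set of conditions; the function name and the input layout are hypothetical, and the two benefit checks from the slide (query volume, path stability) are not modelled.

```python
def area_fastest_times(nodes, travel_time):
    """Floyd-Warshall restricted to the nodes of one area.
    travel_time: {(u, v): time} for directed edges inside the area."""
    inf = float("inf")
    dist = {(u, v): (0.0 if u == v else travel_time.get((u, v), inf))
            for u in nodes for v in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via = dist[i, k] + dist[k, j]
                if via < dist[i, j]:
                    dist[i, j] = via
    return dist
```

Because edge speed depends on time and driving conditions, a table like this would need to be stored per condition set, which is why the stability of the path matters when deciding what to pre-compute.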

Pre-computation and Upgrades • Assumption: drivers take the largest road available to reach destination • Exception !! : if there is a small road that is faster than a large road • Small road upgrades • If under some driving conditions small roads have a significantly higher speed • To upgrade the internal edges to upper level
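
One possible reading of the upgrade rule, as a sketch: an interior edge is promoted by one road class when its speed under the current conditions reaches a chosen fraction of the border-road speed. The function name, the one-level promotion, and the default ratio of 0.8 (borrowed from the figure quoted in the evaluation) are assumptions.

```python
def upgraded_class(edge_class, edge_speed, border_speed, ratio=0.8):
    """Return the road class to use for an interior edge.
    Class 1 is the largest road, so upgrading lowers the class number."""
    if edge_class > 1 and edge_speed >= ratio * border_speed:
        return edge_class - 1
    return edge_class
```

For a class-3 edge next to a 60 km/h border road, a speed of 50 gives class 2, while a speed of 40 leaves the edge at class 3.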

Fastest Path Computation • Properties of the (approximate) fastest routes • Be well supported by the historical driver behavior • Larger road first, significant smaller road second • Account for all relevant factors affecting driving speed • Before computation… • Road network partitioning • Speed patterns are mined • get_edge_speed(edge_id, t, (d1,…,dk)) • Driving patterns are mined • is_frequent(edge_seq, t, (d1,…,dn)) • Area-level paths are pre-computed • Internal roads are upgraded • get_edge_class(edge_id, t, (d1,…,dk))

Fastest Path Algorithm • It is a variation of A* • Algorithm strategies • Priority queue of expanded paths • Pick the frequent node with lowest g(n)+h(n) • g(n), the current travel time cost <start, n> • h(n), the expected travel time cost, <n, end> • Ascending search to find the bigger road • Descending search to find the smaller road • Simple estimation policy, h(n) = distance(n, end) / max_speed • Online path re-computation • Lemma • The adaptive fastest path algorithm, when computing a path between (start, end) nodes, in areas ai, aj respectively will consider at most O(|ai|+|aj|+|bn|+|un|) distinct nodes
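
The base search that the algorithm varies can be sketched as plain A* with the simple estimate h(n) = distance(n, end) / max_speed. This is not the adaptive algorithm itself: the frequent-pattern filter, the ascending and descending searches, and the pre-computed areas are omitted. The function name, the graph layout, and the use of straight-line distance on planar coordinates are assumptions.

```python
import heapq
import math


def fastest_path(graph, coords, start, end, max_speed):
    """graph: {u: [(v, length, speed), ...]}; coords: {node: (x, y)}.
    Returns (travel_time, path), or (inf, []) if end is unreachable."""
    def h(n):
        # Optimistic estimate: straight line at the highest possible speed.
        return math.dist(coords[n], coords[end]) / max_speed

    best = {start: 0.0}
    queue = [(h(start), 0.0, start, [start])]
    while queue:
        _, g, n, path = heapq.heappop(queue)
        if n == end:
            return g, path
        if g > best.get(n, math.inf):
            continue  # stale queue entry
        for v, length, speed in graph.get(n, []):
            g2 = g + length / speed  # g(n): travel time from start
            if g2 < best.get(v, math.inf):
                best[v] = g2
                heapq.heappush(queue, (g2 + h(v), g2, v, path + [v]))
    return math.inf, []
```

The estimate never overestimates as long as no edge is faster than max_speed and no edge is shorter than the straight-line distance between its endpoints, so the first time the end node is popped its travel time is the minimum.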

Experimental Evaluation • Comparisons • A*, basic A* • Hier, the algorithm without area pre-computation • Adapt, the algorithm • Data synthesis • San Joaquin, 18,496 nodes, 24,123 edges • San Francisco Bay area, 175,343 nodes, 223,606 edges • Illinois, 831,524 nodes, 1,048,080 edges • Traffic simulator • Network-based Generator of Moving Objects by Thomas Brinkhoff • Rush hour: 10,000 objects, Non-rush: 1,000 objects • Include weather factor to slow down speeds • Two car classes: Cars with faster speeds, Trucks with slower speeds • Simulation output was a list of edge observations • <edge_id, car_id, time, weather, speed> • Then, mine the speed patterns for each edge

Network-based Generator of Moving Objects • Thomas Brinkhoff • Institut für Angewandte Photogrammetrie und Geoinformatik (IAPG)

Query Length • We varied the average distance between the starting and ending nodes • The longer the distance, the larger the search space • The distance is expressed as a percentage of the map diameter • 20% upgraded roads • Pre-compute fastest paths in 30% of the lowest level areas • Figure 5, Adapt only expands slightly more nodes than Hier • Figure 6, Adapt is as good as the A*’s fastest path, efficiency & accuracy • Figure 7, the same pattern as in the expanded nodes

Upgraded Paths • Vary the percentage of lowest level areas that contain a path that is faster than the border paths and thus needs to be upgraded • Figure 8&10: A* and Hier are significantly affected (???) • Figure 8&10: Adapt also slows down as more edges are upgraded, but only gradually • Figure 9: when no edges are upgraded, Hier and Adapt perform equally; as the number of upgraded edges increases, Adapt starts closing the gap with A* • We can use a fairly aggressive edge upgrading strategy to improve path quality without incurring any significant performance penalty • Interior edges can be upgraded as long as they are 80% as fast as border edges

Area Pre-computation • Examine the performance gain for different levels of pre-computation • Adapt vs. Adapt_nopre • The same algorithm but without using pre-computed areas • Select a percentage of the lowest level areas in which to pre-compute fastest paths, 0% to 100% • The performance improvement is very significant • If we had used higher level areas, the improvement would have been more noticeable

Road Network Size • Compare query processing efficiency for 3 road network sizes • sj, 18496 nodes and 24123 edges • sf, 175343 nodes and 223606 edges • il, 831524 nodes and 1048080 edges • sj < sf < il • Adapt has excellent scalability in terms of road network size • The number of nodes usually grows much more slowly than the number of small roads

Conclusion • We developed an adaptive fastest path algorithm that bases routing decisions on driving and speed patterns mined from historical data. • The partitioning algorithm yields very natural partitions, where larger areas are observed in regions with low road densities, and much finer areas are observed in dense regions such as big cities.

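Angus Comments • If the road hierarchy is poorly planned, would this approach still perform well? • This paper also does not address the sparsity of historical data. • The power of Number & Trajectory • http://www.youtube.com/watch?v=cTiJaWCaKas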

Thanks for listening…
