Analyzing Large-Scale Wireless Network Traces for User Modeling and Protocol Design

Analysis of Large-scale Wireless Network Traces and Its Impact on User Modeling and Protocol Design
Wei-jen Hsu
Advised by Dr. Ahmed Helmy

Emerging Wireless Communication • Opportunities • Challenges • Dynamic network structure • Decentralized service paradigm • Tight coupling between the devices and individuals

Problem Statement • To understand user behavioral patterns in mobile networks from empirical, large-scale data sets • Individual mobility characteristics • Pair-wise similarity • Global encounter pattern • To incorporate the findings in modeling and protocol design • User mobility model • Classify behavioral groups; profile-cast • Efficient broadcast

The TRACE framework • Trace • Represent • Analyze • Characterize • Employ (apply)

Outline • Detailed behavior analysis • Complete case • Future Work

[Trace] Trace Sets • In this work we mainly use WLAN traces • Mostly from university campuses or corporate networks (4 universities, 1 corporate network) • The largest data sets about wireless network users available to date (in number of users and trace length) • No bias: not “special-purpose”; the data come from all users in the network • For comparison we also look at some vehicular movement traces and human encounter traces

[Trace] Trace Sets • Available information from WLAN traces • MAC addresses of the devices as identifiers • Location/Time of users (our main focus)

Node: e0_12_29_fc_ba_8c
Association start time   Location_ID           Duration
2197745                  172.16.8.244_11009    4433
2230200                  172.16.8.244_11009    13320
2257917                  172.16.8.244_11009    643
2285119                  172.16.8.244_11009    1017
2297134                  172.16.8.244_11009    7153
2304287                  172.16.8.244_11023    6744
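
A minimal sketch of how a per-node trace block like the one on this slide could be read into records. The record layout (a "Node:" header followed by rows of start time, location ID, and duration) is taken from the slide; the `Association` class and `parse_trace` function are hypothetical names introduced only for illustration, and the time units are whatever the trace records (the slide does not state them).

```python
from dataclasses import dataclass


@dataclass
class Association:
    node: str       # MAC address used as the device identifier
    start: int      # association start time, in the trace's own time units
    location: str   # location ID as it appears in the trace
    duration: int   # association duration, in the same time units


def parse_trace(text):
    """Parse one node's block: a 'Node:' header, then 'start location duration' rows."""
    node = None
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Node:"):
            node = line.split(":", 1)[1].strip()
            continue
        parts = line.split()
        # Skip anything that is not a three-field numeric row (e.g. the column header).
        if len(parts) != 3 or not (parts[0].isdigit() and parts[2].isdigit()):
            continue
        start, location, duration = parts
        records.append(Association(node, int(start), location, int(duration)))
    return records


SAMPLE = """Node: e0_12_29_fc_ba_8c
Association start time  Location_ID  Duration
2197745 172.16.8.244_11009 4433
2230200 172.16.8.244_11009 13320
2257917 172.16.8.244_11009 643
2285119 172.16.8.244_11009 1017
2297134 172.16.8.244_11009 7153
2304287 172.16.8.244_11023 6744
"""

records = parse_trace(SAMPLE)
```

Each record keeps the node identifier alongside the row, so blocks from several nodes can be concatenated into one list for later analysis.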

Case study I – individual mobility

Goal • To understand the mobility/usage pattern of individual wireless network users • To observe how environments/user type/trace-collection techniques impact the observations • To propose a realistic mobility model based on empirical observations • That is mathematically tractable • That matches with multiple scenarios

Mobility Models • Mobility models are of crucial importance for the evaluation of wireless mobile network protocols • Requirements for mobility models • Realism (detailed behavior from traces) • Parameterized, tunable behavior • Mathematical tractability

[Represent] Metrics for Mobility Models • How often are the nodes present? • Percentage of “online” time • What kind of preference do users show in space? • The fraction of time spent at the most frequently visited locations • What kind of repetition do users show in time? • The probability of re-appearance
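
The first two metrics can be sketched directly from association records. This is an illustrative sketch only: the function names are hypothetical, the sample rows are the six records shown on the trace slide, and `TRACE_LENGTH` is an assumed observation window, not a value from the talk. The re-appearance probability is not covered here.

```python
from collections import defaultdict

# (start time, location ID, duration) rows from the example node on the trace slide.
SESSIONS = [
    (2197745, "172.16.8.244_11009", 4433),
    (2230200, "172.16.8.244_11009", 13320),
    (2257917, "172.16.8.244_11009", 643),
    (2285119, "172.16.8.244_11009", 1017),
    (2297134, "172.16.8.244_11009", 7153),
    (2304287, "172.16.8.244_11023", 6744),
]

# Assumption: length of the observation window, in the same time units as the trace.
TRACE_LENGTH = 333100


def online_fraction(sessions, trace_length):
    """Fraction of the observation window during which the node is associated."""
    return sum(duration for _, _, duration in sessions) / trace_length


def location_time_fractions(sessions):
    """Share of online time per location, sorted from most to least visited."""
    per_location = defaultdict(int)
    for _, location, duration in sessions:
        per_location[location] += duration
    total = sum(per_location.values())
    return sorted((t / total for t in per_location.values()), reverse=True)


online = online_fraction(SESSIONS, TRACE_LENGTH)
fractions = location_time_fractions(SESSIONS)
```

For this node the sorted fractions are heavily skewed toward a single location, which is the kind of location preference the metric is meant to expose.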

[Characterize] Mobility Characteristics from WLANs • Skewed location preference • On/off activity pattern (plot: Prob.(online time fraction > x)) • Periodic re-appearance • Simple existing models are very different from the characteristics in WLANs

[Employ (apply)] Time-variant Community (TVC) Model [Spy06, Hsu07] • Skewed location visiting preferences: create “communities” to be the preferred areas of movement • Periodic re-appearance: create structure in time (periods, repetitive structure) • (Figure labels: 75%, 25%)

Theoretical Tractability • For the TVC model, we can derive • Nodal spatial distribution – the demographic profile of the mobility model • Average node degree – important for cluster maintenance and geographic routing • Hitting time/ Meeting time – important for routing performance analysis • With low error when the communication range is small compared to the community sizes (communication disk < 25% of community)

Theoretical Tractability (figure panels) • Spatial distribution • Avg. node degree • Hitting time • Meeting time

Using the TVC Model – Reproducing Mobility Characteristics • (STEP1) Identify the popular locations; assign communities • (STEP2) Assign parameters to the communities according to the statistics • (STEP3) Add user on-off patterns (e.g., in WLAN, users are usually off when moving)

Using the TVC Model – Reproducing Mobility Characteristics • WLAN trace (example: MIT trace) • Skewed location visiting preference • Periodic re-appearance • *Similar matches achieved for USC and Dartmouth traces.

Using the TVC Model – Reproducing Mobility Characteristics • Vehicular trace (Cab-spotting)

Using the TVC Model – Reproducing Mobility Characteristics • Human encounter trace at a conference • (Figure: A encounters B; panels: Inter-meeting time, Encounter duration)

Summary (Case Study I) • We observe some ubiquitous mobility characteristics in WLANs • These characteristics are not captured by existing synthetic mobility models (hence those models are not realistic) • We propose the Time-variant Community (TVC) model, which is realistic, theoretically tractable, and flexible

Case study II – Groups in WLAN

Goal • Identify similar users (in terms of long-run mobility preferences) from the diverse WLAN user population • Understand the constituents of the population • Identify potential groups for group-aware service • In this work we classify users based on their long-run mobility trends (or location-visiting preferences) • We consider the semester-long USC trace (spring 2006, 94 days) and the quarter-long Dartmouth trace (spring 2004, 61 days)

[Represent] Representation of User Association Patterns • We summarize a user's associations in each day by a single vector • The long-run mobility is summarized in an “association matrix” • Example day: Office, 10AM – 12PM; Library, 3PM – 4PM; Class, 6PM – 8PM • Association vector: (library, office, class) = (0.2, 0.4, 0.4)
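
The example on this slide can be checked with a short calculation: the day has 5 hours of online time (2 at the office, 1 at the library, 2 in class), so the association vector is each location's share of that time. The function name and the `(location, start_hour, end_hour)` session layout below are assumptions made for illustration.

```python
def association_vector(sessions, locations):
    """Fraction of a day's online time spent at each location, in the given order."""
    time_at = {loc: 0.0 for loc in locations}
    for loc, start_hour, end_hour in sessions:
        time_at[loc] += end_hour - start_hour
    total = sum(time_at.values())
    if total == 0:
        # A day with no associations yields an all-zero vector.
        return [0.0] * len(locations)
    return [time_at[loc] / total for loc in locations]


# The day from the slide, with hours on a 24-hour clock.
day = [("office", 10, 12), ("library", 15, 16), ("class", 18, 20)]
vector = association_vector(day, ["library", "office", "class"])
```

Stacking one such vector per day (or per time slot) as rows gives the association matrix used in the rest of this case study.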

Eigen-behavior • Eigen-behaviors: The vectors that describe the maximum remaining power in the association matrix (obtained through Singular Value Decomposition), with quantifiable importance • Eigen-behavior distance measures the similarity of users by weighted inner products of their eigen-behaviors. • Benefits: Reduced computation and noise
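
A sketch of a similarity computed from weighted inner products of eigen-behaviors. The slide does not give the exact formula, so the form used here is an assumption: Sim(U,V) is the sum over all pairs (i, j) of w_ui * w_vj * |u_i · v_j|, where the eigen-behaviors are unit vectors and each user's weights sum to 1 (for example, each vector's share of the power in the association matrix). The eigen-behaviors themselves are assumed to come from an SVD routine that is not shown.

```python
def eigen_behavior_similarity(eig_u, weights_u, eig_v, weights_v):
    """Weighted sum of absolute inner products between two users' eigen-behaviors.

    Assumed form (not spelled out on the slide):
        Sim(U, V) = sum_i sum_j w_ui * w_vj * |u_i . v_j|
    """
    similarity = 0.0
    for u, w_u in zip(eig_u, weights_u):
        for v, w_v in zip(eig_v, weights_v):
            dot = sum(a * b for a, b in zip(u, v))
            similarity += w_u * w_v * abs(dot)
    return similarity


# Hypothetical users over three locations.
user_a = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.9, 0.1])
user_b = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.9, 0.1])
user_c = ([[0.0, 0.0, 1.0]], [1.0])

sim_ab = eigen_behavior_similarity(*user_a, *user_b)  # same dominant behavior
sim_ac = eigen_behavior_similarity(*user_a, *user_c)  # disjoint locations
```

Because only a few vectors and weights per user enter the sum, comparing two users is cheap compared with comparing their full association matrices.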

Identify Similar Users • With the distance between users U and V defined as 1-Sim(U,V), we use hierarchical clustering to find similar user groups. (Figure panels: USC, Dartmouth. *AMVD = Average Minimum Vector Distance)
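
A minimal sketch of agglomerative (bottom-up) hierarchical clustering on the distance 1-Sim(U,V). The slide does not say which linkage rule or stopping criterion was used, so average linkage and a distance threshold are assumptions here, and the similarity values are made up for illustration.

```python
def hierarchical_clusters(dist, threshold):
    """Merge the closest pair of clusters until the closest pair exceeds threshold.

    dist is a symmetric matrix of pairwise user distances; average linkage is assumed.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical similarities for four users: 0 and 1 are alike, 2 and 3 are alike.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
dist = [[1.0 - s for s in row] for row in sim]
groups = hierarchical_clusters(dist, threshold=0.5)
```

This brute-force version recomputes every linkage on each pass, which is fine for a handful of users but would need a library implementation for a campus-scale population.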

Validation of User Groups • Significance of the groups – users in the same group are indeed much more similar to each other than users in randomly formed groups (0.93 vs. 0.46 for USC, 0.91 vs. 0.42 for Dartmouth) • Uniqueness of the groups – the most important group eigen-behavior is important for its own group but not for other groups

Characterize User Groups in WLAN - Observations • Skewed group size distribution – the largest 10 groups account for more than 30% of population on campus. Power-law distributed group sizes. • Most groups can be described by a list of locations with a clear ordering of importance • We also observe groups visiting multiple locations with similar importance – taking the most important location for each user is not sufficient

Enough of words! Let’s see how it works

Summary (Case Study II) • We use SVD to obtain eigen-behaviors of individual users. • We use the eigen-behavior distances and hierarchical clustering to classify WLAN users into similar groups. • These findings are useful for mobility modeling (identifying group sizes and their frequently visited locations), network management, abnormality detection, and group-aware protocols (i.e., profile-cast, our future work)

Case study III – Encounter Pattern

Encounter Events • Derived from simultaneous associations to the same locations • How many other nodes does a node encounter? (plot: Prob.(unique encounter fraction > x)) • On average, a node encounters only 2%~7% of the population
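
A sketch of how encounter events could be derived from association records: two nodes encounter each other when they are associated with the same location during overlapping time intervals. The record layout `(node, start, location, duration)`, the function names, and the sample data are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def find_encounters(records):
    """Return (node_a, node_b, location, begin, end) for each overlapping association."""
    by_location = defaultdict(list)
    for node, start, location, duration in records:
        by_location[location].append((node, start, start + duration))
    encounters = []
    for location, sessions in by_location.items():
        for (a, s1, e1), (b, s2, e2) in combinations(sessions, 2):
            if a == b:
                continue
            begin, end = max(s1, s2), min(e1, e2)
            if begin < end:  # the two intervals overlap
                encounters.append((min(a, b), max(a, b), location, begin, end))
    return encounters


def unique_encounter_fraction(records):
    """For each node, the fraction of the other nodes it ever encounters."""
    nodes = {record[0] for record in records}
    met = defaultdict(set)
    for a, b, _, _, _ in find_encounters(records):
        met[a].add(b)
        met[b].add(a)
    others = len(nodes) - 1
    return {n: (len(met[n]) / others if others else 0.0) for n in nodes}


# Hypothetical records: A and B overlap at ap1; C is elsewhere; D arrives later.
records = [
    ("A", 0, "ap1", 100),
    ("B", 50, "ap1", 100),
    ("C", 0, "ap2", 100),
    ("D", 200, "ap1", 100),
]
encounters = find_encounters(records)
fractions = unique_encounter_fraction(records)
```

The pairwise comparison per location is quadratic in the number of sessions; a sweep over sorted start times would scale better on a full trace.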

[Characterize / Represent] Encounter-Relationship (ER) graph • Draw a link to connect a pair of nodes if they ever encounter each other • Most node pairs are connected in the ER graph • The ER graphs show Small World graph characteristics • High clustering coefficient • Low average path length
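
The two Small World indicators named on this slide can be computed on an ER graph as sketched below. The graph here is a tiny made-up example, the function names are hypothetical, and the definitions used are the common ones (average of per-node local clustering coefficients; mean shortest-path length over connected node pairs), which the slide does not spell out.

```python
from collections import defaultdict, deque


def build_er_graph(encounter_pairs):
    """Undirected graph with a link between every pair of nodes that ever met."""
    adj = defaultdict(set)
    for a, b in encounter_pairs:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbor pairs that are also linked."""
    coefficients = []
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)
            continue
        nb = list(neighbors)
        links = sum(
            1 for i in range(k) for j in range(i + 1, k) if nb[j] in adj[nb[i]]
        )
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


def average_path_length(adj):
    """Mean shortest-path length (in hops) over all connected ordered node pairs."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical ER graph: a triangle A-B-C with D attached to C.
graph = build_er_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
cc = clustering_coefficient(graph)
apl = average_path_length(graph)
```

A graph is usually called Small World when its clustering coefficient is much higher than that of a random graph of the same size while its average path length stays comparably short.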

Future Work – Profile-cast

Goal • To send messages to a group of nodes within the general population • The group is defined by the intrinsic behavior patterns of the nodes (CISE students, library visitors, moviegoers) • The sender does not know the network identities (addresses) of the destinations • Different from multi-cast: No join/leave, no group maintenance

Profile-cast Use Cases • Mobility profile-cast (current work) • Targeting people who move in a particular pattern (lost-and-found, context-aware announcement) • Relies on the “similarity metric” between users • Mobility-independent profile-cast (future work) • Targeting people with certain characteristics independent of mobility (classical music lovers) • Relies on the “Small World” encounter pattern

Mobility Profile-cast (inter-group) • Scoped message spread in the mobility space • (Figure: nodes labeled S, N, and D in the mobility space, with the question “Forward??”)

Inter-group profile-cast Operation • 1. Profiling • Profiling user mobility • The mobility of a node is represented by an association matrix • Each row represents an association vector for a time slot • An entry represents the percentage of online time during time slot i at location j • Singular value decomposition provides a summary of the matrix (a few eigen-behavior vectors are sufficient; e.g., for 99% of users, at most 7 vectors describe 90% of the power in the association matrices for 94 days)

Inter-group profile-cast Operation • 1. Profiling • 2. Forwarding decision • Determining user similarity • Nodes exchange their eigen-behaviors and the corresponding weights at encounter • Similarity of user mobility is evaluated by weighted inner products of eigen-behaviors • The message is forwarded if Sim(U,V) is higher than a threshold (recall that the goal is to deliver messages to nodes with a similar profile)

Evaluation • Based on the USC WLAN trace for realistic user mobility (spring 2006, 94 days, 5000 users) • We use hierarchical clustering to identify 200 distinct groups based on mobility profile. • We pick groups with 5 or more members and randomly pick 20% of the members in these groups as senders

Evaluation • Spanning the spectrum of grouping knowledge • Complete user grouping info: Centralized protocol (highly efficient, but not practical) • Inferred user grouping info: Similarity-based protocol • No user grouping info: Epidemic and Random Tx. (simple, not optimized)

Evaluation - Result (figure panels: Success Rate, Delay, Overhead) • Centralized: Excellent success rate with only 3% overhead. • Similarity-based: • (1) 61% success rate at low overhead, 92% success rate at 45% overhead • (2) A flexible success rate – overhead tradeoff • RTx with infinite TTL: Much more overhead under similar success rate • Short RTx with many copies: Good success rate/overhead, but delay is still long

Mobility Profile-cast (intra-group) • Goal • (Figure panels, each showing a source S: Flooding, Similarity, Single long random walk, Multiple short random walks)

Mobility Profile-cast (inter-group) • Sending to a mobility profile specified by the sender • Gradient ascent followed by local flooding (in the mobility space) • The current message holder holds on to the message until it encounters a node with higher similarity to the target • When the message reaches a point close enough to the target, local flooding is triggered
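
A sketch of the two phases described on this slide, replayed over a time-ordered list of encounters. Everything concrete here is an assumption for illustration: the function name, the `flood_threshold` parameter that decides when a node is "close enough" to the target, the rule that flooding only spreads among nodes above that threshold, and the sample similarity values.

```python
def gradient_ascent_profile_cast(encounters, sim_to_target, source, flood_threshold):
    """Return (path, delivered): the single-copy relay path and the flooded nodes.

    encounters: time-ordered (node_a, node_b) pairs.
    sim_to_target: each node's similarity to the target mobility profile.
    """
    holder = source
    path = [source]
    delivered = set()
    if sim_to_target[source] >= flood_threshold:
        delivered.add(source)
    for a, b in encounters:
        if not delivered:
            # Phase 1, gradient ascent: hand the single copy to a more similar node.
            if holder not in (a, b):
                continue
            other = b if a == holder else a
            if sim_to_target[other] > sim_to_target[holder]:
                holder = other
                path.append(other)
                if sim_to_target[holder] >= flood_threshold:
                    delivered.add(holder)
        else:
            # Phase 2, local flooding: copy only between nodes close to the target.
            for src, dst in ((a, b), (b, a)):
                if src in delivered and sim_to_target[dst] >= flood_threshold:
                    delivered.add(dst)
    return path, delivered


# Hypothetical similarities to the target profile and an encounter sequence.
similarity = {"S": 0.1, "N1": 0.05, "N2": 0.4, "N3": 0.2, "T1": 0.8, "T2": 0.9}
meetings = [("S", "N1"), ("S", "N2"), ("S", "T1"), ("N2", "T1"), ("T1", "N3"), ("T1", "T2")]
path, delivered = gradient_ascent_profile_cast(meetings, similarity, "S", 0.7)
```

In this run the copy moves S, N2, T1 and is then flooded to T2; the S-T1 encounter is missed because S had already handed the copy to N2, which illustrates the extra delay a single-copy ascent can incur.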

Mobility Profile-cast (inter-group) • Goal • (Figure panels, each showing a source S and T.P.: Flooding, Flooding_sim, Gradient-ascent, Single long random walk, Multiple short random walks)

Mobility Profile-cast (inter-group)

Performance Comparison • Gradient ascent helps to overcome the difficult case, when the source is far from T.P. • A few long RWs are better when S is far from T.P., but many short RWs are better when S is close to T.P.

Performance Comparison • A few long RWs are better when S is far from T.P., but many short RWs are better when S is close to T.P. • Gradient ascent helps to overcome the difficult case, when the source is far from T.P. • Gradient ascent has some extra delay compared with flooding

Future Work • Mobility-independent profile-cast • The members of the target group are not necessarily “close” in the mobility space • The encounter pattern provides a network in which most nodes are reachable • We don’t want to flood – How can we leverage the Small World encounter pattern to reach the “neighborhood” of most nodes efficiently?

Mobility Independent Profile-cast • Goal • (Figure panels, each showing a source S: Flooding, SmallWorld-based, Single long random walk, Multiple short random walks)

Future Work • One-copy-per-clique in the “mobility space” • We expect this to work because similarity in mobility leads to frequent encounters • (Figure: source S shown across the physical space, mobility space, and interest space, with the question “Forward?”)
