
A DATA MINING APPROACH FOR LOCATION PREDICTION IN MOBILE ENVIRONMENTS*

by Gökhan Yavaş, Feb 22, 2005. *: To appear in Data and Knowledge Engineering, Elsevier. Outline: Introduction, Background Work, Mobility Prediction Based On Mobility Rules, Experimental Results, Conclusion.


Presentation Transcript


  1. A DATA MINING APPROACH FOR LOCATION PREDICTION IN MOBILE ENVIRONMENTS* by Gökhan Yavaş Feb 22, 2005 *: To appear in Data and Knowledge Engineering, Elsevier

  2. Outline • Introduction • Background Work • Mobility Prediction Based On Mobility Rules • Experimental Results • Conclusion • Future Work

  3. Introduction • Personal Communication Systems are becoming more popular • Dynamic relocation of users gives rise to the problem of Mobility Management • Methods for storing and updating the location information of users • Mobility Prediction: the prediction of a user’s next inter-cell movement

  4. Motivation • Predicted movement can be used to allocate resources effectively instead of blindly allocating excessive resources • Benefits broadcast program generation [1]: data items can be broadcast to the predicted cell • Location prediction is crucial in processing location-dependent queries [2], since the answer depends on the location of the user • Queries that depend on future positions can be answered by effective location prediction [1] Y. Saygin and O. Ulusoy. Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments. IEEE Transactions on Knowledge and Data Engineering, 14(6): 1387-1399, 2002. [2] G. Gok and O. Ulusoy. Transmission of Continuous Query Results in Mobile Computing Systems. Information Sciences, 125(1-4): 37-63, 2000.

  5. Network Model • PCS network partitioned into smaller areas called cells • Each cell has a Base Station (BS), used for broadcasting and receiving information • Home Location Register (HLR): database which keeps the inter-cell movement history of user • Visitor Location Register (VLR): each BS has a database which keeps the profiles of the users located in this cell.

  6. Problem Definition • The movement history of a mobile user can be obtained from the user's HLR • Movement trajectories have the form T=<(id1, t1) ... (idk, tk)> • Trajectories are partitioned into subsequences, named user actual paths (UAPs) • UAPs have the form U=<c1, c2, ..., cn> • We mine UAPs to find user mobility patterns (UMPs)

  7. Related Work • The roots of our method go back to the Apriori algorithm [3] • Association rule mining • Sequential pattern mining problem [4] • Ordering of the items in an itemset must be taken into consideration • Not appropriate for our domain, because it does not take the network topology into account [3] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proceedings of the Very Large Databases Conference (VLDB’94), pages 487-499, 1994. [4] R. Agrawal and R. Srikant. Mining sequential patterns. In Proceedings of the IEEE Conference on Data Engineering (ICDE’95), pages 3–14, 1995.

  8. Mobility Prediction Based On Mobility Rules • Mining UMPs from Graph Traversals: Movement data mined for discovering regularities (UMP) in inter-cell movements • Generation of Mobility Rules: Mobility rules are extracted from UMPs • Mobility Prediction: Prediction of next inter-cell movement based on mobility rules

  9. Mining UMPs from Graph Traversals • [Figure: an example coverage region and the corresponding graph G] • Vertices of G: the cells in the coverage region • Edges of G: if two cells, A and B, are neighbors in the coverage region, then there are two edges in G, A → B and B → A

  10. Mining UMPs from Graph Traversals • Subsequence definition: Assume we have two UAPs, A = <a1, a2, ..., an> and B = <b1, b2, ..., bm>. B is a subsequence of A iff all cells of B also appear in A in the same relative order as in B • Example: if A = <c3, c4, c0, c1, c6, c5>, then B = <c4, c5> is a length-2 subsequence of A. In other words, B is contained by A
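The containment test defined on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `is_subsequence` and the use of strings as cell identifiers are assumptions, not part of the presentation.

```python
from typing import Sequence


def is_subsequence(b: Sequence[str], a: Sequence[str]) -> bool:
    """Return True if every cell of b appears in a in the same relative order."""
    remaining = iter(a)
    # `cell in remaining` consumes the iterator up to the first match,
    # so each later cell of b can only match a later position of a.
    return all(cell in remaining for cell in b)


# Example from the slide: B = <c4, c5> is contained by A.
A = ["c3", "c4", "c0", "c1", "c6", "c5"]
print(is_subsequence(["c4", "c5"], A))  # True
print(is_subsequence(["c5", "c4"], A))  # False: order is not preserved
```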

  11. Mining UMPs from Graph Traversals • Every candidate has a count value that accumulates the support given to this candidate by the UAPs • This is the point where our work extends the algorithm in [5, 6] • The method in [5, 6] increments the count value of a candidate by 1 if the candidate is contained by a UAP • This is unfair: it treats in the same way • a highly corrupted candidate pattern • a slightly corrupted (or even uncorrupted) candidate pattern [5] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. A Data Mining Algorithm for Generalized Web Prefetching. IEEE Transactions on Knowledge and Data Engineering, 15(5): 1155-1169, 2003. [6] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. Effective Prediction of Web User Accesses: A Data Mining Approach. In Proceedings of the WebKDD Workshop (WebKDD’01), 2001.

  12. Mining UMPs from Graph Traversals • Should consider the degree of corruption for the mobile motion prediction context • Support assigned to a candidate pattern B by a UAP A (i.e., suppInc)

  13. Mining UMPs from Graph Traversals • Define the totDist value by means of the notion of string alignment • Definition 2.1: If x and y are each a single character or a space, then (x, y) denotes the score of aligning x and y. In our case, the scoring function is defined as follows:

  14. Mining UMPs from Graph Traversals • Definition 2.3: Let A be a UAP and B be a pattern. A containment alignment X' maps A and B into strings A' and B' where: • |A'| = |B'| • B is contained by A, and • Removal of all spaces from A' and B' leaves A and B • Total score of the alignment X':

  15. Mining UMPs from Graph Traversals • For any two patterns, there may be more than one alignment • Ex: Consider A=<c3, c4, c0, c1, c6, c5, c8, c5>, B=<c4, c5>

  16. Mining UMPs from Graph Traversals • Definition 2.4: An optimal containment alignment of UAP A and pattern B is one that has the minimum possible score for these two patterns • Total score of an alignment: sum of penalties • An optimal alignment has the minimum number of mismatches, and therefore the minimum alignment score • totDist(A, B) = score of the optimal alignment for the UAP A and pattern B

  17. Mining UMPs from Graph Traversals • Example: Given UAP A=<c3, c4, c0, c1, c6, c5, c8> and pattern B=<c4, c5, c8>, the optimal containment alignment for these: • Score of the alignment = totDist(A, B) = 3 • Support assigned to the candidate pattern B by the UAP A:
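The scoring function and the suppInc formula are not reproduced in this transcript, so the sketch below rests on stated assumptions rather than on the authors' exact definitions: a matched cell costs 0, a cell of A aligned with a space between the first and last matched cells costs 1, cells of A before the first or after the last match cost nothing, and suppInc = 1 / (1 + totDist) when B is contained by A and 0 otherwise. These assumptions reproduce the score totDist(A, B) = 3 stated on the slide. The names `tot_dist` and `supp_inc` are illustrative.

```python
from typing import Optional, Sequence


def tot_dist(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Minimum number of cells of a skipped between the first and last
    matched cells over all ways of embedding b in a (None if b is not
    contained by a). Assumed cost model: 1 per skipped cell."""
    if not b:
        return 0
    best = None
    for start in range(len(a)):
        if a[start] != b[0]:
            continue
        matched, end = 1, start
        # Greedily matching the earliest occurrences gives the shortest
        # window of a that begins at `start` and still contains b.
        for pos in range(start + 1, len(a)):
            if matched == len(b):
                break
            if a[pos] == b[matched]:
                matched += 1
                end = pos
        if matched == len(b):
            gaps = (end - start + 1) - len(b)
            if best is None or gaps < best:
                best = gaps
    return best


def supp_inc(a: Sequence[str], b: Sequence[str]) -> float:
    """Support that UAP a gives to pattern b (assumed form 1 / (1 + totDist))."""
    dist = tot_dist(a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


A = ["c3", "c4", "c0", "c1", "c6", "c5", "c8"]
B = ["c4", "c5", "c8"]
print(tot_dist(A, B))  # 3, matching the score on the slide
print(supp_inc(A, B))  # 0.25 under the assumed suppInc formula
```

Under this cost model a pattern that appears contiguously in a UAP receives the full support of 1, and each inserted cell lowers the contribution, which is the "degree of corruption" idea the slides describe.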

  18. Mining UMPs from Graph Traversals • The quality of the patterns improves, since this method counts support more accurately • Degree of corruption is taken into account • This gives rise to more accurate mobility rules • Prediction accuracy therefore improves over the accuracy obtained with rules generated by the former way of support counting • Applying different methods for totDist will affect the quality of the rules

  19. Mining UMPs from Graph Traversals • Candidate Generation: • Example: C = <c1, c2, ..., ck> • N+(ck): the set of all nodes in G, which have an incoming edge from the cell ck • A cell from N+(ck) is attached to the end of C to generate C' • Add C' to the set of Candidates
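The candidate generation step above can be sketched as follows. The adjacency map `out_neighbors` plays the role of N+(·); the toy graph, the function name `extend_candidates`, and the integer cell identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Pattern = Tuple[int, ...]


def extend_candidates(patterns: List[Pattern],
                      out_neighbors: Dict[int, Set[int]]) -> List[Pattern]:
    """Extend each pattern C = <c1, ..., ck> by every cell in N+(ck)."""
    candidates = []
    for pattern in patterns:
        last_cell = pattern[-1]
        # Only cells reachable by an edge from the last cell are attached,
        # so every candidate is a valid traversal of the graph G.
        for nxt in sorted(out_neighbors.get(last_cell, set())):
            candidates.append(pattern + (nxt,))
    return candidates


# Hypothetical 4-cell coverage region (not the one from the slides).
out_neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(extend_candidates([(0, 1)], out_neighbors))
# [(0, 1, 0), (0, 1, 2)]
```

Restricting extensions to graph neighbors is what lets the method use the network topology, which the slides cite as the reason generic sequential pattern mining does not fit this domain.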

  20. Mining UMPs from Graph Traversals • Can Apriori pruning be used? • No, due to the nature of our new support counting method • Support is no longer monotonically decreasing with the increasing size of the pattern • A length-(k-1) subpattern S of a length-k pattern P does not need to be large even if P is large • Ex: UAP <1, 6, 0, 3, 2>, P1= <1, 0, 2> and its subpattern P2= <1, 2> • UAP assigns a support • to P1 and to P2
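The support values for this example are not preserved in the transcript. As a worked illustration only, assume that totDist counts the UAP cells skipped between the first and last matched cells and that suppInc = 1 / (1 + totDist); both are assumptions consistent with the totDist = 3 example earlier in the deck, not confirmed definitions.

```latex
\begin{align*}
\mathrm{totDist}(\langle 1,6,0,3,2\rangle, P_1) &= 2 \ \text{(cells 6 and 3 are skipped)}
  &\Rightarrow\quad \mathrm{suppInc} &= \frac{1}{1+2} \approx 0.33,\\
\mathrm{totDist}(\langle 1,6,0,3,2\rangle, P_2) &= 3 \ \text{(cells 6, 0 and 3 are skipped)}
  &\Rightarrow\quad \mathrm{suppInc} &= \frac{1}{1+3} = 0.25.
\end{align*}
```

Under these assumptions the longer pattern P1 receives more support than its own subpattern P2, which is why the usual Apriori argument (every subpattern of a large pattern is large) no longer holds.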

  21. Mining UMPs from Graph Traversals • Example with suppmin = 1.33 • [Figure: a database of UAPs is given to the UMP mining algorithm, which outputs the set of all large patterns (UMPs)]

  22. Generation of Mobility Rules • Extract rules from the UMPs • For a rule R: <c1, c2, …, ci-1> → <ci, ci+1, ..., ck>, the left-hand side is the head and the right-hand side is the tail • A confidence value is calculated:

  23. Generation of Mobility Rules • The rules which have confidence higher than confmin are selected • All possible mobility rules for the UMPs given in previous example are:
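Rule extraction can be sketched as below. The confidence formula is not preserved in this transcript; the sketch assumes the usual form, confidence = support(whole pattern) / support(head) × 100, expressed as a percentage because the deck later uses confmin = 80. The UMP supports in the example are hypothetical, not the values from the slide's example.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[int, ...]
Rule = Tuple[Pattern, Pattern, float, float]  # (head, tail, confidence, support)


def mobility_rules(ump_support: Dict[Pattern, float],
                   conf_min: float) -> List[Rule]:
    """Split every UMP into head -> tail and keep rules above conf_min."""
    rules = []
    for pattern, supp in ump_support.items():
        for split in range(1, len(pattern)):
            head, tail = pattern[:split], pattern[split:]
            head_supp = ump_support.get(head)
            if not head_supp:
                # The head may not be large itself, because support is
                # not monotone under the new counting method.
                continue
            confidence = 100.0 * supp / head_supp  # assumed formula
            if confidence > conf_min:
                rules.append((head, tail, confidence, supp))
    return rules


# Hypothetical supports for illustration only.
umps = {(4,): 2.0, (5,): 1.5, (4, 5): 1.5,
        (3, 4): 1.4, (3,): 1.6, (3, 4, 5): 1.2}
for head, tail, conf, supp in mobility_rules(umps, conf_min=80.0):
    print(head, "->", tail, round(conf, 2), supp)
```

With these made-up supports only <3> → <4> and <3, 4> → <5> pass the threshold; the rules <4> → <5> and <3> → <4, 5> are dropped at 75% confidence.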

  24. Mobility Prediction • User has followed a path P=< c1, c2, …, ci-1 > up to now • Find the rules whose head parts are contained in P and the last cell in their head is ci-1 • Store the first cell of tail along with the (confidence + support) of rule as a tuple • Sort these tuples w.r.t. the (confidence + support) values in descending order • Select the first m tuples

  25. Mobility Prediction • Example: Assume that the current trajectory of the user is P=<2, 3, 0, 4> • Matching Rules: • <4> → <0> • <4> → <5> • <3, 4> → <0> • <3, 4> → <5> • Sorted tuple array is: TupleArray= [(5, 85.83), (0, 76.5)] • If m=1, then Predicted Cells Set = {5} • If m=2, then Predicted Cells Set = {5, 0}
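The prediction step can be sketched as follows. The individual confidence and support values of the four matching rules are not given in the transcript, so the numbers below are hypothetical. The sketch also assumes that when several rules predict the same cell, only the highest (confidence + support) score is kept; the slides do not say how such duplicates are resolved.

```python
from typing import Dict, List, Sequence, Tuple

Rule = Tuple[Tuple[int, ...], Tuple[int, ...], float, float]


def is_subsequence(b: Sequence[int], a: Sequence[int]) -> bool:
    """True if the cells of b appear in a in the same relative order."""
    remaining = iter(a)
    return all(cell in remaining for cell in b)


def predict_next_cells(path: Sequence[int], rules: List[Rule], m: int) -> List[int]:
    """Return up to m predicted cells, ranked by confidence + support."""
    best: Dict[int, float] = {}
    for head, tail, confidence, support in rules:
        # A rule matches if its head ends at the current cell and is
        # contained in the path followed so far.
        if head[-1] != path[-1] or not is_subsequence(head, path):
            continue
        score = confidence + support
        cell = tail[0]
        if score > best.get(cell, float("-inf")):
            best[cell] = score  # assumed tie-breaking: keep the best rule
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [cell for cell, _ in ranked[:m]]


# Hypothetical (head, tail, confidence, support) values for illustration.
rules = [
    ((4,), (0,), 60.0, 1.5),
    ((4,), (5,), 70.0, 2.0),
    ((3, 4), (0,), 75.0, 1.5),
    ((3, 4), (5,), 84.0, 1.83),
    ((1, 4), (6,), 90.0, 1.0),  # head not contained in the path
]
path = [2, 3, 0, 4]
print(predict_next_cells(path, rules, m=1))  # [5]
print(predict_next_cells(path, rules, m=2))  # [5, 0]
```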

  26. Simulation Design • Mobile users travel on a 15 by 15 hexagonal shaped network • To generate UAPs, first UMPs are generated • UMPs are taken as a random walk over the network • Two types of UAPs: • Outliers: a random walk over the network • Non-outliers: those which follow a UMP • o (outlier percentage): ratio of the number of outliers to the number of non-outliers

  27. Simulation Design • Corruption mechanism: insert random cells between the consecutive cells of a UMP • c (corruption ratio): the ratio of the number of such random cells to the number of cells in the corresponding UMP • Three possible outcomes of a prediction • Correct prediction • Incorrect prediction • No prediction • Two performance measures:
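The definitions of the two performance measures are not preserved in this transcript. The sketch below assumes the reading suggested by the later slides: recall is the number of correct predictions divided by the number of prediction requests, and precision is the number of correct predictions divided by the number of predictions made, where each predicted cell counts as one prediction. These definitions and the function name `precision_recall` are assumptions.

```python
from typing import List, Sequence, Tuple


def precision_recall(outcomes: Sequence[Tuple[List[int], int]]) -> Tuple[float, float]:
    """outcomes holds (predicted_cells, actual_next_cell) per request.
    An empty predicted_cells list means 'no prediction'."""
    requests = len(outcomes)
    predictions = sum(len(predicted) for predicted, _ in outcomes)
    correct = sum(1 for predicted, actual in outcomes if actual in predicted)
    precision = correct / predictions if predictions else 0.0
    recall = correct / requests if requests else 0.0
    return precision, recall


# Hypothetical outcomes: two correct, one incorrect, one request unanswered.
outcomes = [([5], 5), ([5, 0], 0), ([3], 4), ([], 2)]
print(precision_recall(outcomes))  # (0.5, 0.5)
```

Under this reading, a method that declines to predict when no rule matches loses recall but not precision, which fits the trade-off the conclusion slides report against TM.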

  28. Algorithms Used for Comparison • Mobility Prediction Based on Transition Matrix (TM) • A cell-to-cell transition matrix is formed • Select the m most probable cells from the transition matrix • Ignorant Prediction • Randomly select m neighboring cells of the current cell
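A minimal sketch of the TM baseline, assuming a first-order model in which transition counts are collected from consecutive cells of the UAPs and the m most frequent successors of the current cell are returned. The slides do not give implementation details, so the data structure and the names `build_transition_counts` and `predict_tm` are illustrative.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def build_transition_counts(uaps: Sequence[Sequence[int]]) -> Dict[int, Counter]:
    """Count how often each cell is followed directly by each other cell."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for path in uaps:
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return counts


def predict_tm(counts: Dict[int, Counter], current: int, m: int) -> List[int]:
    """Return the m most frequent successors of the current cell."""
    successors = counts.get(current, Counter())
    return [cell for cell, _ in successors.most_common(m)]


# Hypothetical UAPs for illustration.
uaps = [[1, 2, 3], [1, 2, 4], [5, 2, 3]]
counts = build_transition_counts(uaps)
print(predict_tm(counts, current=2, m=1))  # [3]
print(predict_tm(counts, current=2, m=2))  # [3, 4]
```

Unlike the rule-based method, this baseline returns a prediction whenever the current cell has been seen with any successor, which helps explain why the slides report higher recall but lower precision for TM.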

  29. Impact of m on Precision and Recall • Decreasing precision for both our algorithm and TM • Increasing probability of making some incorrect predictions as m increases • Increasing recall for all algorithms, but more significant increase for TM and Ignorant prediction

  30. Impact of m on Precision and Recall • Setting m as small as possible suits our method • The increase in recall when m goes from 1 to 2 is largest for TM • m ≥ 3 would waste excessive network resources • Thus choose m = 2

  31. Impact of Suppmin • Reduced recall and precision • The increase in the suppmin value leads to a decrease in the number of mined mobility rules • Number of correct predictions is reduced • Choose suppmin=0.1

  32. Impact of Confmin • Increasing precision • Higher quality rules with increasing confmin • The number of predictions decreases faster than the number of correct predictions • Decreasing recall • The number of mined rules is reduced, leading to a decrease in the number of correct predictions • Choose confmin=80

  33. Impact of Corruption Factor • Decreasing precision and recall for our method and TM • For all c, better precision than TM but worse recall than TM • For our method, as c increases: • The number of mined mobility rules decreases • No prediction in some cases because no matching rules due to the corrupted UAPs

  34. Impact of Outlier Percentage • Both performance measures not affected significantly for all methods • Rules extracted from outlier UAPs not used commonly, thus not reducing recall and precision significantly

  35. Conclusion • A data mining algorithm for the prediction of user movements in a mobile computing system • Algorithm is based on • Mining the mobility patterns of users • Then forming mobility rules from these patterns • Finally predicting a mobile user’s next movements by using the mobility rules • A good performance when compared to the performance of Ignorant Method

  36. Conclusion • Performance when compared to the TM • Better Precision: • More accurate predictions • Most of its predictions made at each request are correct • Worse Recall: • Our method may not make prediction in response to some of the prediction requests • Because there may not be any matching rule for the current trajectory of the user when a prediction request is made

  37. Future Work • For calculating the totDist value, our method: • Decreases the support given to a pattern by a UAP as the number of corrupted cells in the pattern increases • Other methods may be employed for calculating the totDist value • The time domain of the mobility patterns and mobility rules is not considered • In real life, mobility patterns might be related to time • Some specific rules may be valid only for a specific time interval • Extend our algorithm to include the time domain of mobility rules • A candidate pruning criterion suitable for our support counting method may be employed

  38. Questions & Comments
