
# Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor

##### Presentation Transcript

1. Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor • Raymond Chi-Wing Wong (Hong Kong University of Science and Technology) • M. Tamer Ozsu (University of Waterloo) • Philip S. Yu (University of Illinois at Chicago) • Ada Wai-Chee Fu (Chinese University of Hong Kong) • Lian Liu (Hong Kong University of Science and Technology) • Presented by Raymond Chi-Wing Wong

2. Outline • Introduction • Related work – Bichromatic Reverse Nearest Neighbor • Problem - MaxBRNN • Algorithm - MaxOverlap • Empirical Study • Conclusion

3. 1. Introduction • Bichromatic Reverse Nearest Neighbor (BRNN or RNN) • Given • P and O are two sets of points in the same data space • Problem • Given a point p ∈ P, a BRNN query finds all the points o ∈ O whose nearest neighbor (NN) in P is p.
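
As a rough illustration of the BRNN definition, the following Python sketch answers a BRNN query by brute force. The coordinates and the function names `nearest_neighbor` and `brnn` are assumptions made for this example and do not come from the slides; ties in distance are broken by list order.

```python
from math import dist

def nearest_neighbor(o, P):
    """Return the point of P closest to o (ties broken by order in P)."""
    return min(P, key=lambda p: dist(o, p))

def brnn(p, P, O):
    """Return every o in O whose nearest neighbor in P is p."""
    return [o for o in O if nearest_neighbor(o, P) == p]

# Hypothetical coordinates, chosen only for illustration.
P = [(0.0, 0.0), (10.0, 0.0)]                                       # stores
O = [(4.0, 1.0), (4.0, -1.0), (6.0, 1.0), (6.0, -1.0), (6.0, 0.0)]  # customers

print(brnn(P[0], P, O))  # customers whose nearest store is the first one
print(brnn(P[1], P, O))  # customers whose nearest store is the second one
```

With these coordinates the first store has two reverse nearest neighbors and the second has three. The scan costs O(|P|) per customer, so it is only meant to make the definition concrete.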

4. 1. Introduction • NN: Nearest neighbor • RNN: Reverse nearest neighbor • Example (figure): Convenience stores P = {p1, p2}, Customers O = {o1, o2, o3, o4, o5} • NN in P of o1 and o2 = p1, so the RNN of p1 = {o1, o2} • NN in P of o3, o4 and o5 = p2, so the RNN of p2 = {o3, o4, o5}

5. 1. Introduction • Suppose that we want to set up a new convenience store p • Where should we set it up? • Placement 1 (figure): RNN = {o1, o2}, Influence value = 2

6. 1. Introduction • Placement 2 (figure): RNN = {o1, o2, o3, o4, o5}, Influence value = 5 • Different placements of p may have different RNN sets • Which placement is better? Placement 2

7. 1. Introduction • Placement 3 (figure): RNN = {o1, o2, o3, o4, o5}, Influence value = 5 • Different placements of p may have the same RNN set

8. 1. Introduction • Placement 1: RNN = {o1, o2}, Influence value = 2 • Placement 2: RNN = {o1, o2, o3, o4, o5}, Influence value = 5 • Placement 3: RNN = {o1, o2, o3, o4, o5}, Influence value = 5 • Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.
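
The influence value of a candidate placement can be checked directly from the definition: a customer joins the RNN set of the new store p when p is strictly closer than the customer's current nearest store. Below is a minimal sketch; the function name `influence` and all coordinates are hypothetical, chosen so that three sample placements give influence values 2, 5 and 5, mirroring the example.

```python
from math import dist

def influence(p_new, P, O):
    """Count the customers in O that are strictly closer to p_new
    than to every existing store in P."""
    return sum(1 for o in O if dist(o, p_new) < min(dist(o, p) for p in P))

# Hypothetical coordinates, chosen only for illustration.
P = [(0.0, 0.0), (10.0, 0.0)]                                       # existing stores
O = [(4.0, 1.0), (4.0, -1.0), (6.0, 1.0), (6.0, -1.0), (6.0, 0.0)]  # customers

for placement in [(1.5, 0.0), (5.0, 0.0), (5.5, 0.0)]:
    print(placement, influence(placement, P, O))
```

Evaluating single points like this does not solve the problem by itself, because the plane contains infinitely many candidate placements. That is why the problem is stated in terms of regions.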

9. 1. Introduction • Related Work: Arrangement • Running Time = $O(|O| \log |P| + |O|^2 + 2^{\gamma(|O|)})$ where $\gamma(|O|)$ is a function on $|O|$ and is $\Omega(|O|)$ • Our Proposed Algorithm: MaxOverlap • Running Time = $O(|O| \log |P| + k^2 |O| + k |O| \log |O|)$ where $k \ll |O|$ • Significant improvement on Running Time

10. 2. Problem • Figure: a placement of p with RNN = {o1, o2, o3, o4, o5}

11. 2. Problem • Consistent region: For any two possible placements in this region, their RNN sets are the same • Figure: a consistent region with RNN = {o1, o2, o3, o4, o5}, Influence value = 5

12. 2. Problem • Figure: Convenience stores P = {p1, p2} and Customers O = {o1, o2, o3, o4, o5}

13. 2. Problem • Figure: a placement of p with RNN = {o1, o2}

14. 2. Problem • Figure: a placement of p with RNN = {o1, o2, o3, o4, o5} • Non-consistent region: a region that contains placements with different RNN sets

15. 2. Problem • Figure: a consistent region

16. 2. Problem • Figure: a consistent region with Influence value = 5 • Many consistent regions!

17. 2. Problem • Maximal consistent region: a consistent region R for which there does not exist another consistent region R’ where (1) R’ covers R and (2) the RNN sets of R and R’ are equal • Figure: a maximal consistent region with Influence value = 5

18. 2. Problem • Figure: two maximal consistent regions, Influence value = 5

19. 2. Problem • Problem: We want to find a maximal consistent region R such that the influence value of R is maximized. • We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN)

20. 2. Problem • Two challenges: • Challenge 1: It is difficult to find a maximal consistent region • Challenge 2: We need to return the maximal consistent region with the greatest influence value

21. 2. Problem • Nearest location circle (NLC) • Example (figure): Convenience stores P = {p1, p2}, Customers O = {o1, o2} • NN in P of o1 = p1: Construct a circle centered at o1 with radius |p1, o1| • NN in P of o2 = p2: Construct a circle centered at o2 with radius |p2, o2|
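
The NLC construction can be sketched in a few lines of Python. The `Circle` type, the function name `nearest_location_circles` and the coordinates are assumptions made for this illustration; the slides only state that each circle is centered at a customer and has radius equal to the distance to that customer's nearest store.

```python
from math import dist
from typing import NamedTuple

class Circle(NamedTuple):
    cx: float
    cy: float
    r: float

def nearest_location_circles(P, O):
    """Build one NLC per customer o: centered at o, with radius equal to
    the distance from o to its nearest neighbor in P."""
    return [Circle(o[0], o[1], min(dist(o, p) for p in P)) for o in O]

# Hypothetical coordinates: two stores and two customers on a line.
P = [(0.0, 0.0), (10.0, 0.0)]
O = [(3.0, 0.0), (6.0, 0.0)]

for c in nearest_location_circles(P, O):
    print(c)
```

Here the first customer is 3 units from its nearest store and the second is 4 units from its nearest store, so the two circles have radii 3 and 4 and overlap.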

22. 2. Problem • Nearest location circle (NLC) • Figure: the two NLCs, with region A labeled

23. 2. Problem • Nearest location circle (NLC) • Figure: the two NLCs, with regions A and B labeled

24. 2. Problem • Nearest location circle (NLC) • Figure: the two NLCs, with regions A, B and C labeled

25. 2. Problem • Nearest location circle (NLC) • Figure: the two NLCs, with regions A, B, C and D labeled

26. 2. Problem • Nearest location circle (NLC) • Four maximal consistent regions: • Region A (intersection between the two NLCs): RNN set = {o1, o2}, influence value = 2 • Regions B and C: RNN set = {o1} and RNN set = {o2}, influence value = 1 each • Region D: RNN set = {}, influence value = 0 • Solution: Region A
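
The link between regions and RNN sets can be made concrete: the RNN set of a placement q is the set of customers whose NLC contains q. The sketch below uses hypothetical coordinates and a hypothetical helper `rnn_set_at`; it treats a customer as won only when q lies strictly inside that customer's NLC, which is an assumption about how ties on the boundary are handled.

```python
from math import dist

def rnn_set_at(q, P, O):
    """Return the names of the customers whose NLC strictly contains q."""
    inside = set()
    for name, o in O.items():
        radius = min(dist(o, p) for p in P)  # radius of o's NLC
        if dist(q, o) < radius:
            inside.add(name)
    return inside

# Hypothetical coordinates: two stores and two customers on a line.
P = [(0.0, 0.0), (10.0, 0.0)]
O = {"o1": (3.0, 0.0), "o2": (6.0, 0.0)}

# One sample placement per kind of region: inside both NLCs, inside only
# o1's NLC, inside only o2's NLC, and outside both.
for q in [(4.5, 0.0), (1.0, 0.0), (8.0, 0.0), (20.0, 0.0)]:
    print(q, sorted(rnn_set_at(q, P, O)))
```

The four sample points yield the four RNN sets listed for the regions: both customers, one customer each, and the empty set.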

27. 2. Problem • Lemma: The solution of MaxBRNN can be represented by an intersection of multiple nearest location circles.

28. 2. Problem • We propose an algorithm called MaxOverlap to address the two challenges (finding a maximal consistent region, and returning the maximal consistent region with the greatest influence value)

29. 3. Algorithm • Make use of the principle of region-to-point transformation: the Optimal Region Search Problem is transformed into an Optimal Point Search Problem • Search a limited number of points • Find the optimal point • This optimal point can be mapped to the optimal region in the Optimal Region Search Problem

30. 3. Algorithm • Example (figure): Convenience stores P = {p1, p2, p3, p4, p5}, Customers O = {o1, o2, o3, o4, o5, o6}

31. 3. Algorithm • Figure: the stores p1, p2, p3, p4, p5 and the customers o1, o2, o3, o4, o5, o6

32. 3. Algorithm • Figure: NLCs c1, c2 and c3 among the customers o1, o2, o3, o4, o5, o6 • Solution: Intersection of c1, c2 and c3 • This is the maximal consistent region which maximizes the RNN set

33. 3. Algorithm • Algorithm MaxOverlap • Three-Step Algorithm

34. 3. Algorithm • Step 1 (Finding Intersection Point) • Figure: the customers o1, o2, o3, o4, o5, o6

35. 3. Algorithm • Step 1 (Finding Intersection Point) • Figure: the intersection points q1, q2, q3, q4, q5, q6, q7, q8, q9
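
Step 1 needs the points where two NLC boundaries cross. The slides do not show how these are computed, so the following is only a sketch using the standard circle-circle intersection formula; the function name `circle_intersections` and the `(cx, cy, r)` tuple layout are assumptions.

```python
from math import dist, sqrt

def circle_intersections(c1, c2):
    """Return the points where the boundaries of two circles (cx, cy, r) cross."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = dist((x1, y1), (x2, y2))
    # No crossing: same center, too far apart, or one circle inside the other.
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)  # distance from c1's center to the chord
    h = sqrt(max(r1 * r1 - a * a, 0.0))        # half the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d  # foot of the chord
    dx, dy = -(y2 - y1) * h / d, (x2 - x1) * h / d           # offset along the chord
    if h == 0:
        return [(mx, my)]                      # tangent circles touch at one point
    return [(mx + dx, my + dy), (mx - dx, my - dy)]

print(circle_intersections((0.0, 0.0, 5.0), (8.0, 0.0, 5.0)))
```

Two circles of radius 5 centered at (0, 0) and (8, 0) meet at (4, 3) and (4, -3), which is a convenient hand check. Applying the function to pairs of overlapping NLCs would produce candidate points such as the q's in the figure.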

36. 3. Algorithm • Step 2 (Point Query) • Point query for q4

37. 3. Algorithm • Step 2 (Point Query) • Point query for q4 • Result for q4 so far = {c1}

38. 3. Algorithm • Step 2 (Point Query) • Point query for q4 • Result for q4 = {c1, c3}

39. 3. Algorithm • Step 2 (Point Query) • Point query for q3 • Result for q4 = {c1, c3}

40. 3. Algorithm • Step 2 (Point Query) • Point query for q3 • Result for q3 so far = {c1}

41. 3. Algorithm • Step 2 (Point Query) • Point query for q3 • Result for q3 so far = {c1, c2}

42. 3. Algorithm • Step 2 (Point Query) • Point query for q3 • Result for q3 = {c1, c2, c3}

43. 3. Algorithm • Step 2 (Point Query) • Point query for q1 • Result for q1 = {c1, c2, c3}

44. 3. Algorithm • Step 2 (Point Query) • Point query for q5 • Result for q5 = {c1, c2, c3} • …
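
A point query asks which NLCs cover a given intersection point. The slides do not say how the query is implemented, so the linear scan below is a stand-in rather than the authors' method. The name `point_query`, the tolerance `EPS` and the sample circles are assumptions; the tolerance is there because an intersection point lies on the boundary of the circles that produced it, and floating-point error could otherwise exclude them.

```python
from math import dist

EPS = 1e-9  # tolerance so that points on a circle's boundary count as covered

def point_query(q, circles):
    """Return the indices of the circles (cx, cy, r) whose closed disk contains q."""
    return [i for i, (cx, cy, r) in enumerate(circles)
            if dist(q, (cx, cy)) <= r + EPS]

# Hypothetical circles; the first two cross at (4, 3) and (4, -3).
circles = [(0.0, 0.0, 5.0), (8.0, 0.0, 5.0), (4.0, 2.0, 4.0)]

print(point_query((4.0, 3.0), circles))   # covered by all three circles
print(point_query((4.0, -3.0), circles))  # outside the third circle
```

The size of the returned list plays the role of the influence value of the region around the queried point.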

45. 3. Algorithm • Step 3 (Finding Maximum Size) • Result for q4 = {c1, c3} • Result for q3 = {c1, c2, c3} • Result for q1 = {c1, c2, c3} • Result for q5 = {c1, c2, c3} • … • The intersection of c1, c2 and c3 corresponds to the solution: the answer to the Optimal Point Search Problem maps back to the Optimal Region Search Problem.
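
Step 3 amounts to picking the point whose query result is largest. A minimal sketch, using only the four results listed on the slide (the remaining points are elided there and are left out here as well); the function name `best_point` is hypothetical.

```python
def best_point(results):
    """Return the (point, covering NLCs) pair with the largest covering set.
    When several points tie, the first one in insertion order is returned."""
    return max(results.items(), key=lambda item: len(item[1]))

# Point-query results as listed on the slide (only the four shown there).
results = {
    "q4": {"c1", "c3"},
    "q3": {"c1", "c2", "c3"},
    "q1": {"c1", "c2", "c3"},
    "q5": {"c1", "c2", "c3"},
}

point, cover = best_point(results)
print(point, sorted(cover))
```

Several points share the maximum because they all lie on the boundary of the same region; any of them identifies the intersection of c1, c2 and c3.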

46. 3. Algorithm • Theorem: The running time of algorithm MaxOverlap is $O(|O| \log |P| + k^2 |O| + k |O| \log |O|)$ where k is typically much smaller than |O|

47. 3. Algorithm • Enhancement 1: We process the intersection points q in a pre-defined order • Enhancement 2: • Step 2 and Step 3 can be combined • We introduce a pruning technique such that some intersection points will not be processed.

48. 4. Empirical Study • Synthetic Dataset • P: Gaussian distribution • O: Zipfian distribution • Real Dataset • Rtree Portal (http://www.rtreeportal.org/spatial.html) • CA (62,556) • LB (53,145) • GR (23,268) • GM (36,334) • P: one of the above datasets • O: one of the above datasets

49. 4. Empirical Study • Measurements • Execution Time • Storage • Our proposed algorithms • MaxOverlap-P: MaxOverlap with pruning • MaxOverlap-NP: MaxOverlap without pruning • Comparison with adapted algorithms • Arrangement • Buffer-Adapt

50. 4. Empirical Study • Small dataset