
The Communication and Streaming Complexity of Computing the Longest Common and Increasing Subsequences

Xiaoming Sun (Tsinghua University), David Woodruff (MIT)


Presentation Transcript


  1. The Communication and Streaming Complexity of Computing the Longest Common and Increasing Subsequences Xiaoming Sun Tsinghua University David Woodruff MIT

  2. The Problem • Stream of elements a_1, …, a_n ∈ Σ • Algorithm given one pass over stream • Problem: Compute the longest increasing subsequence (LIS) • Example from the slide figure: 4 3 7 3 1 1 0, where the answer is (3, 7)

  3. Previous Work • Let k be the length of the LIS of the stream • There exists an algorithm which computes the LIS with O(k^2 log |Σ|) space [LNVZ05] • Trivial Ω(k) lower bound • Our first result: Improve both bounds to a tight Θ(k^2 log(|Σ|/k))

  4. Our Lower Bound • Reduction from the indexing function: Alice holds x ∈ {0,1}^n, Bob holds i ∈ [n] = {1, 2, …, n}, and Bob must answer: what is x_i? • Randomized 1-way communication is Ω(n)

  5. Alice holds x ∈ {0,1}^n and constructs a stream A; Bob holds i ∈ [n] = {1, 2, …, n} and constructs a stream B. The streams should satisfy: 1. From LIS(A, B), Bob can get x_i 2. |LIS(A, B)| = k, where k is an input parameter

  6. Alice (x ∈ {0,1}^n) • Alice uses x to create k-1 increasing sequences A_1, …, A_{k-1} • For each j, A_j has length j. Each bit of x is encoded in some sequence A_j • Every element in A_{k-1} is larger than every element in A_{k-2}, every element in A_{k-2} is larger than every element in A_{k-3}, etc. • Set A = A_{k-1}, …, A_2, A_1 • [Figure: value against position in stream for A; A_{k-1} comes first with the largest values, followed by …, A_2, A_1]

  7. Bob (i ∈ [n]) • Bob uses i to identify A_j, the sequence encoding x_i • Bob creates an increasing sequence B of length k-j • Every element in B is greater than every element of A_r if r < j, and less than every element of A_r if r > j • [Figure: value against position in stream; B follows A, with values between those of A_j and A_{j+1}]

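  8. Alice holds x ∈ {0,1}^n and her stream is A = A_{k-1}, …, A_2, A_1; Bob holds i ∈ [n] and his stream B follows A • LIS(A, B) = A_j, B, and |LIS(A, B)| = k • But x_i is encoded in A_j, so Bob recovers x_i • [Figure: value against position in stream; B lies between A_j and A_{j+1} in value]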
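The construction on slides 5 to 8 can be checked on a toy instance. The sketch below is illustrative only: the slides do not say how the bits of x are mapped to the sequences, so `alice_stream` simply draws each A_r at random, and the choice to place A_r in the lower half of block r and B in the upper half of block j is an assumption made here so that B exceeds A_j. The names `alice_stream`, `bob_stream`, `lis` and `width` are hypothetical.

```python
import random
from bisect import bisect_left

def lis(seq):
    """Return one longest strictly increasing subsequence of seq."""
    lists, lasts = [], []          # lists[i] has length i+1 and ends at lasts[i]
    for x in seq:
        i = bisect_left(lasts, x)  # number of stored last elements below x
        new = (lists[i - 1] if i else []) + [x]
        if i == len(lasts):
            lists.append(new)
            lasts.append(x)
        else:
            lists[i] = new
            lasts[i] = x
    return lists[-1] if lists else []

def alice_stream(k, width, rng):
    """Assumed stand-in for Alice's encoding: A_r is a random increasing
    sequence of length r in the lower half of block r (needs width // 2 >= k)."""
    seqs = {}
    for r in range(1, k):
        low = r * width
        seqs[r] = sorted(rng.sample(range(low, low + width // 2), r))
    # The stream lists the blocks in decreasing order: A_{k-1}, ..., A_2, A_1.
    stream = [v for r in range(k - 1, 0, -1) for v in seqs[r]]
    return seqs, stream

def bob_stream(k, width, j):
    """B has length k - j and sits in the upper half of block j,
    so it is above A_1..A_j and below A_{j+1}..A_{k-1}."""
    start = j * width + width // 2
    return list(range(start, start + (k - j)))
```

For every j, the LIS of A followed by B should have length k and begin with A_j, which is what lets Bob read off the sequence that encodes x_i.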

  9. Thus, any streaming algorithm must use Ω(n) space. • But what is n? We need to construct k increasing sequences that are different for different x in {0,1}^n • Assume |Σ| is large. Divide Σ into k-1 blocks of size |Σ|/(k-1) • Let A_j be a random increasing sequence of length j in block j. • The space to represent A_j is Ω(k log(|Σ|/k)) for j > k/2 • Set n = Ω(k^2 log(|Σ|/k)).

  10. Our Upper Bound • When processing the stream, keep lists A[1], A[2], …, A[k]. • A[j] is an increasing subsequence of length j in the stream seen so far with minimal last element. • Let L[1], L[2], …, L[k] be the last elements of A[1], A[2], …, A[k] • To process item x, find i for which L[i] < x < L[i+1], and replace A[i+1] with A[i], x
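The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `lis_one_pass`, the 0-based lists, and the decision to ignore items that would create a list longer than k are assumptions.

```python
from bisect import bisect_left

def lis_one_pass(stream, k):
    """One pass over the stream, keeping at most k lists.

    lists[j-1] plays the role of A[j] and lasts[j-1] the role of L[j];
    lasts is always strictly increasing.
    """
    lists, lasts = [], []
    for x in stream:
        i = bisect_left(lasts, x)      # i = number of j with L[j] < x
        if i < len(lasts) and lasts[i] == x:
            continue                   # x equals L[i+1]: no list improves
        if i >= k:
            continue                   # would need a list longer than k
        new_list = (lists[i - 1] if i > 0 else []) + [x]   # A[i], x
        if i == len(lists):
            lists.append(new_list)
            lasts.append(x)
        else:
            lists[i] = new_list        # replace A[i+1] with A[i], x
            lasts[i] = x
    return lists[-1] if lists else []

print(lis_one_pass([4, 3, 7, 3, 1, 1, 0], 2))  # [3, 7]
```

On the example stream from slide 2 the sketch returns (3, 7), matching the answer given there.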

  11. So we have k arrays A[1], …, A[k], each of length at most k. • Naively, this takes O(k^2 log |Σ|) space. • But the A[i] are increasing, so we can compress each list by storing differences. • Total space is O(k^2 log(|Σ|/k)).
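The compression step can be illustrated by storing each increasing list as its first element followed by successive gaps. The sketch below assumes the symbols are positive integers 1, …, |Σ|; the helper names are hypothetical, and `gap_bits` counts only the binary length of each gap, ignoring the overhead of a self-delimiting code.

```python
def to_gaps(increasing):
    """Encode a strictly increasing list of positive integers as gaps."""
    gaps, prev = [], 0
    for v in increasing:
        gaps.append(v - prev)
        prev = v
    return gaps

def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    out, total = [], 0
    for g in gaps:
        total += g
        out.append(total)
    return out

def gap_bits(gaps):
    """Bits needed to write every gap in binary (delimiters not counted)."""
    return sum(g.bit_length() for g in gaps)

example = [3, 10, 11, 40]
print(to_gaps(example))            # [3, 7, 1, 29]
print(gap_bits(to_gaps(example)))  # 11
```

The gaps of one list sum to at most |Σ|, so by concavity of the logarithm a list of length k needs about k log(|Σ|/k) bits in this form, compared with k log |Σ| bits when every element is stored in full.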

  12. This talk • First result: a tight space bound for the LIS problem • Second result: tight bounds for longest common subsequence (LCS)

  13. LCS Bounds • Problem: Alice has a permutation σ of [N], Bob has a permutation π of [N]. Decide if |LCS(σ, π)| ≥ k. • Previous space bound: Ω(k) [LNVZ05] • Our space bound: Ω(N) for 3 ≤ k ≤ N/2 (holds for randomized O(1)-pass algorithms)

  14. LCS Bounds • Why can we only prove Ω(N) for 3 ≤ k ≤ N/2? • If k = 2, the problem reduces to an equality test. • If k is large, there are at most O(N^{2(N-k)}) permutations π with |LCS(σ, π)| > k, so just use an equality test with error O(1/N^{2(N-k)})

  15. Our Lower Bound • Padding lemma: if for k = 3 the randomized communication complexity is Ω(N), then it’s Ω(N) for all k ≤ N/2 • Proof: just pad each of the inputs by some common subsequence of length k-3
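One way to realize the padding is to append the same k-3 fresh symbols, in the same order, to both permutations. The slide does not say where the common subsequence is placed, so appending at the end is an assumption of this sketch, as are the names `pad` and `lcs_length`.

```python
def lcs_length(a, b):
    """Textbook dynamic program for the LCS length of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def pad(perm, k):
    """Append the k-3 new symbols n+1, ..., n+k-3 to a permutation of [n]."""
    n = len(perm)
    return list(perm) + list(range(n + 1, n + k - 2))

sigma = [1, 3, 2, 4, 5, 6]
pi = [6, 5, 4, 1, 3, 2]
print(lcs_length(sigma, pi))                  # 3
print(lcs_length(pad(sigma, 5), pad(pi, 5)))  # 5
```

Because the appended symbols come after everything else in both inputs, the LCS length grows by exactly k-3, so the padded pair has LCS at least k exactly when the original pair has LCS at least 3.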

  16. Remains to show high complexity for k = 3. • We reduce from disjointness: Alice holds x ∈ {0,1}^n, Bob holds y ∈ {0,1}^n. Is there an i such that x_i = y_i = 1? • Randomized multi-way communication is Ω(n)

  17. Is there an i such that x_i = y_i = 1? • Alice holds x ∈ {0,1}^{N/3} and constructs σ; Bob holds y ∈ {0,1}^{N/3} and constructs π • Want |LCS(σ, π)| ≥ 3 iff x and y intersect, i.e. there is some i with x_i = y_i = 1

  18. Alice (x ∈ {0,1}^{N/3}): σ = σ_1, σ_2, …, σ_{N/3} • Divide 1, …, N into N/3 groups G_1 = (1, 2, 3), G_2 = (4, 5, 6), …, G_{N/3} = (N-2, N-1, N). • Use x to choose σ_1, …, σ_{N/3}; σ_i acts on G_i = (m+1, m+2, m+3) • If x_i = 0, σ_i(m+1, m+2, m+3) = (m+1, m+2, m+3). • If x_i = 1, σ_i(m+1, m+2, m+3) = (m+1, m+3, m+2).

  19. Bob (y ∈ {0,1}^{N/3}): π = π_{N/3}, …, π_1 • Divide 1, …, N into N/3 groups G_1 = (1, 2, 3), G_2 = (4, 5, 6), …, G_{N/3} = (N-2, N-1, N). • Use y to choose π_1, …, π_{N/3}; π_i acts on G_i = (m+1, m+2, m+3) • If y_i = 0, π_i(m+1, m+2, m+3) = (m+3, m+2, m+1). • If y_i = 1, π_i(m+1, m+2, m+3) = (m+1, m+3, m+2).

  20. σ = σ_1(G_1), σ_2(G_2), σ_3(G_3), …, σ_{N/3}(G_{N/3}) and π = π_{N/3}(G_{N/3}), …, π_3(G_3), π_2(G_2), π_1(G_1) • Claim: |LCS(σ, π)| ≤ 3. Proof: Use the fact that LCS(σ, π) intersects at most one G_i • Claim: |LCS(σ, π)| = 3 iff there is some i with x_i = y_i = 1. Proof: Use the way we defined σ_i and π_i • Thus, can decide disjointness, so Ω(N) communication.
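Both claims can be checked by brute force on small inputs. The sketch below follows the group orderings given on slides 18 and 19; the function names are hypothetical, and `sigma` / `pi` in the comments are placeholder names for Alice's and Bob's permutations.

```python
from itertools import product

def lcs_length(a, b):
    """Textbook dynamic program for the LCS length of two sequences."""
    prev = [0] * (len(b) + 1)
    for u in a:
        cur = [0]
        for j, v in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if u == v else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def alice_perm(x):
    """sigma: groups in increasing order; group i is (m+1, m+3, m+2) if x_i = 1."""
    out = []
    for i, bit in enumerate(x):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if bit else [m + 1, m + 2, m + 3]
    return out

def bob_perm(y):
    """pi: groups in decreasing order; group i is reversed if y_i = 0."""
    out = []
    for i in reversed(range(len(y))):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if y[i] else [m + 3, m + 2, m + 1]
    return out

def check_all(n):
    """True if, for all x, y in {0,1}^n, LCS <= 3 and LCS == 3 iff x, y intersect."""
    for x in product([0, 1], repeat=n):
        for y in product([0, 1], repeat=n):
            length = lcs_length(alice_perm(x), bob_perm(y))
            intersect = any(a and b for a, b in zip(x, y))
            if length > 3 or (length == 3) != intersect:
                return False
    return True
```

Calling `check_all(3)` enumerates all 64 input pairs for N = 9. The groups appear in opposite orders in the two permutations, so a common subsequence stays inside one group, and only the case x_i = y_i = 1 makes the two orderings of that group identical.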

  21. Other results • Tight space bounds for computing the LIS length. • Generalization to approximate LIS and LCS. Still many gaps here. • Example: for approximating the LIS length, we have Ω(1/ε) and O(k log |Σ|). Recent work [GJKK07] has shown O(sqrt(N/ε) log |Σ|), but there is still a large gap.

  22. Conclusion • First result: a tight bound for the LIS • Second result: an Ω(N) space bound for the LCS k-decision problem for 3 ≤ k ≤ N/2 • Other results for approximation problems • Another open question: extend our lower bound for LIS to randomized multi-round
