
Dimension reduction for finite trees in L1

James R. Lee and Mohammad Moharrami, University of Washington. Arnaud De Mesmay, École Normale Supérieure.


Presentation Transcript


  1. Dimension reduction for finite trees in L1. James R. Lee and Mohammad Moharrami, University of Washington. Arnaud De Mesmay, École Normale Supérieure.

  2. Dimension reduction in L_p: given an n-point subset X ⊆ R^d, find a mapping f : X → R^k such that for all x, y ∈ X, ||x − y||_p ≤ ||f(x) − f(y)||_p ≤ D · ||x − y||_p. Here n = size of X, k = target dimension, D = distortion. Dimension reduction as "geometric information theory".
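To make the distortion condition concrete, here is a small Python sketch (my own illustration, not from the talk) that computes the distortion of a given finite map under the l_p norm:

    # Minimal sketch: distortion of a finite map X[i] -> Y[i] under the l_p norm.
    import itertools
    import numpy as np

    def distortion(X, Y, p=1):
        """Smallest D such that, after rescaling, ||x - y||_p <= ||f(x) - f(y)||_p <= D ||x - y||_p."""
        ratios = []
        for i, j in itertools.combinations(range(len(X)), 2):
            d_orig = np.linalg.norm(np.asarray(X[i]) - np.asarray(X[j]), ord=p)
            d_emb = np.linalg.norm(np.asarray(Y[i]) - np.asarray(Y[j]), ord=p)
            ratios.append(d_emb / d_orig)
        # Distortion is scale-invariant: worst expansion times worst contraction.
        return max(ratios) / min(ratios)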

  3. The case p = 2: the Johnson-Lindenstrauss transform gives, for every n-point subset X ⊆ R^d and ε > 0, an embedding into dimension k = O(log n / ε²) with distortion D = 1 + ε. Applications to… • Statistics over data streams • Nearest-neighbor search • Compressed sensing • Quantum information theory • Machine learning
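A standard sketch of the Johnson-Lindenstrauss transform, via a random Gaussian projection (the constant 8 is illustrative, not taken from the slides):

    import numpy as np

    def jl_transform(X, eps, seed=None):
        """Project the rows of X into k = O(log n / eps^2) dimensions."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        k = int(np.ceil(8 * np.log(n) / eps**2))      # target dimension O(log n / eps^2)
        G = rng.standard_normal((k, d)) / np.sqrt(k)  # random Gaussian projection
        return X @ G.T  # pairwise l_2 distances preserved within a 1 +/- eps factor, w.h.p.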

  4. Dimension reduction in L1: the natural case to consider next is p = 1 (n = size of X, k = target dimension, D = distortion). History: - Carathéodory's theorem yields D = 1 and k ≤ n(n−1)/2. - [Schechtman'87, Bourgain-Lindenstrauss-Milman'89, Talagrand'90] Linear mappings (sampling + reweighting) yield D ≤ 1 + ε and k = O(n log n / ε²). - [Batson-Spielman-Srivastava'09, Newman-Rabinovich'10] Sparsification techniques yield D ≤ 1 + ε and k = O(n / ε²).

  5. The Brinkman-Charikar lower bound. [Brinkman-Charikar'03]: There are n-point subsets of L1 such that distortion D requires dimension n^{Ω(1/D²)}. Very technical argument based on LP-duality. [L-Naor'04]: One-page argument based on uniform convexity. [Brinkman-Karagiozova-L'07]: Lower bound tight for these spaces.

  6. More lower bounds. [Andoni-Charikar-Neiman-Nguyen'11]: There are n-point subsets such that distortion 1 + ε requires dimension n^{1−O(1/log(1/ε))}. [Regev'11]: Simple, elegant, information-theoretic proof of both the Brinkman-Charikar and ACNN lower bounds. Low-dimensional embedding ⇒ encoding scheme.

  7. The simplest of L1 objects: a tree metric is a graph-theoretic tree T = (V, E) together with non-negative lengths on the edges. Easy to embed isometrically into R^E equipped with the L1 norm (see the sketch below).
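The isometric embedding charges, for each vertex, the edges on its root path: coordinate e of vertex v equals length(e) if e lies on the path from the root to v, and 0 otherwise, so that ||x_u − x_v||_1 = d_T(u, v). A minimal sketch (the parent-pointer representation is my own choice, not from the slides):

    def tree_to_l1(parent):
        """parent[v] = (p, length) for every non-root vertex v; returns one l_1 coordinate per edge."""
        edges = list(parent)                        # one coordinate per edge (v, parent of v)
        index = {v: i for i, v in enumerate(edges)}
        vertices = set(parent) | {p for p, _ in parent.values()}
        coords = {}
        for v in vertices:
            x = [0.0] * len(edges)
            u = v
            while u in parent:                      # walk up the root path
                p, length = parent[u]
                x[index[u]] = length
                u = p
            coords[v] = x
        return coords

    # Example: tree_to_l1({'b': ('a', 1.0), 'c': ('a', 2.0)}) puts 'b' and 'c'
    # at l_1 distance 3.0, which equals their tree distance through 'a'.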

  8. Dimension reduction for trees in L1: Charikar and Sahai (2002) showed that for trees one can achieve nontrivial dimension reduction, and A. Gupta improved their bound. In 2003 in Princeton with Gupta and Talwar, we asked: is O(1) distortion in dimension O(log n) possible, even for complete binary trees?

  9. Dimension reduction for trees in L1. Theorem: For every n-point tree metric and every ε > 0, one can achieve distortion D = 1 + ε and target dimension k = C(ε) · log n. (An improved bound holds for "symmetric" trees.) Outline: complete binary trees using the local lemma; Schulman's tree codes; complete binary trees using re-randomization; extension to general trees.

  10. Dimension reduction for the complete binary tree: every edge gets B bits ⇒ target dimension = B · log2(n). Choose edge labels uniformly at random. Nodes at tree distance L then receive labels whose Hamming distance is, with high probability, proportional to B · L.

  11. Dimension reduction for the complete binary tree: every edge gets B bits ⇒ target dimension = B · log2(n). Choose edge labels uniformly at random. Siblings have probability 2^{−B} to have the same label, yet there are n/2 of them (see the sketch below).
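A sketch of my reading of slides 10-11: each edge of the complete binary tree receives an independent, uniformly random B-bit label, and a vertex is mapped to the concatenation of the labels on its root path, zero-padded to B · depth coordinates:

    import itertools
    import numpy as np

    def random_tree_labels(depth, B, seed=None):
        """Random labeling of the complete binary tree of the given depth; dimension = B * depth."""
        rng = np.random.default_rng(seed)
        coords = {"": np.zeros(B * depth, dtype=int)}   # root has the empty address
        for level in range(1, depth + 1):
            for bits in itertools.product("01", repeat=level):
                addr = "".join(bits)
                label = rng.integers(0, 2, size=B)      # fresh random label for the edge into addr
                x = coords[addr[:-1]].copy()            # parent's coordinates
                x[B * (level - 1): B * level] = label
                coords[addr] = x
        return coords

Two siblings differ only in their last block of B coordinates, so they are mapped to the same point exactly when their two edge labels coincide, which happens with probability 2^{−B}, as on slide 11.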

  12. Lovász Local Lemma: pairs at distance L have probability exponentially close to 1 (in B · L) to be "good", while the number of dependent "distance L" events is only exponential in L. LLL + sum over levels ⇒ good embedding.
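For reference, the symmetric form of the local lemma being invoked (the standard statement; the exact parameters used on the slide are not in the transcript):

    % Symmetric Lovász Local Lemma. If each bad event A_i satisfies
    % Pr[A_i] <= p, each A_i is mutually independent of all but at most d
    % of the other events, and e * p * (d + 1) <= 1, then
    \[
        \Pr\Big[\,\bigcap_i \overline{A_i}\,\Big] \;>\; 0,
    \]
    % i.e., with positive probability every pair of nodes is "good".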

  13. Schulman's tree codes. The LLL argument is difficult to extend to arbitrary trees. • Same as the construction of Schulman'96: tree codes for interactive communication.

  14. Re-randomization. Random isometry: for every level on the right, exchange 0's and 1's with probability half (independently for each level); a sketch follows below.
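A sketch of my reading of slide 14: the random isometry flips, independently for each level, all of that level's B coordinates (exchanging 0's and 1's) with probability one half. Exchanging 0's and 1's inside a block preserves all pairwise Hamming distances, so this is indeed an isometry (the slide applies it to the levels on the right; only the flip operation itself is shown here):

    import numpy as np

    def rerandomize(coords, depth, B, seed=None):
        """Apply the level-wise 0/1 exchange to a labeling produced as on slides 10-11."""
        rng = np.random.default_rng(seed)
        flip = rng.integers(0, 2, size=depth)            # one fair coin per level
        out = {}
        for v, x in coords.items():
            y = x.copy()
            for level in range(depth):
                if flip[level]:
                    block = slice(B * level, B * (level + 1))
                    y[block] = 1 - y[block]              # exchange 0's and 1's in this level's block
            out[v] = y
        return out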

  15. Re-randomization: pairs at distance L have probability exponentially close to 1 (in B · L) to be "good", and the number of pairs at distance L is small enough that the failure probabilities can be summed over all levels.

  16. Extension to general trees: unfortunately, the general case is technical (the paper is 50 pages). Obstacles: general trees do not have O(log n) depth, so use the "topological depth" of Matousek; and how many coordinates to change per edge, and by what magnitude? This is handled by a multi-scale entropy functional.

  17. Open problems. Close the gap: for distortion 10, say, what is the right target dimension? Other Lp norms: nothing non-trivial is known for p other than 1 and 2. Coding/dimension reduction: extend and make explicit the connection between L1 dimension reduction and information theory.
