This presentation covers property testing of metric properties, focusing on ultrametrics and tree metrics. Property testing asks whether a given object has a certain property or is far from every object that has it. The notion was first defined by Rubinfeld and Sudan in the context of program testing. The algorithms presented here sample a small set of points and query the matrix that represents the metric only on entries among the sampled points. They run in time sub-linear in the size of the matrix, with sample sizes that do not depend on the number of points.
Testing Metric Properties Michal Parnas and Dana Ron
Property Testing (Informal Definition) For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any other object having P). The task should be performed by querying the object (in as few places as possible).
Property Testing - Background • Initially defined by Rubinfeld and Sudan in the context of program testing (of algebraic functions). • Goldreich, Goldwasser and Ron initiated the study of testing properties of (undirected) graphs. • A growing body of work deals with properties of functions, graphs, strings, sets of points ... Many algorithms have complexity that is sub-linear in (or even independent of) the size of the object.
Motivation • Computational: Design testing algorithms that are (much) more efficient than exact decision algorithms for the same properties. • Combinatorial: Gain new understanding of the tested property.
Testing Metric Properties P - metric property; M - n x n rational-valued matrix; ε - distance/approximation parameter. M is said to be ε-far from property P if more than an ε fraction of its n^2 entries must be modified for M to obtain P. Otherwise M is said to be ε-close. The testing algorithm can query M on entries M[i,j]. If M has property P, it should accept; if M is ε-far from property P, it should reject with probability at least 2/3.
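The distance notion on this slide can also be written as a formula. The block below is only a restatement in symbols; the name dist is introduced here for convenience and does not appear on the slides.

```latex
\[
\operatorname{dist}(M,P) \;=\; \min_{M' \text{ has } P}
  \frac{\bigl|\{(i,j)\in[n]\times[n] \,:\, M[i,j]\neq M'[i,j]\}\bigr|}{n^{2}},
\qquad
M \text{ is } \varepsilon\text{-far from } P \iff \operatorname{dist}(M,P) > \varepsilon .
\]
```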
Tree Metrics and Ultrametrics An n x n matrix M is a tree metric (additive metric) if there exists a tree T with positive weights on its edges, such that: • There exists a mapping f from [n] into the nodes of T; • For every i,j ∈ [n]={1,…,n}, T(f(i),f(j))=M[i,j], where T(u,v) denotes the distance between nodes u and v in T; • All nodes to which no i ∈ [n] is mapped have degree greater than 2. If T is rooted, f maps only to leaves of T, and the distance from every leaf to the root is the same, then M is an ultrametric.
[Figure: two example trees. Tree metric: M[1,2]=8; M[1,3]=12; M[1,4]=10; M[1,5]=15; ... Ultrametric: M[1,2]=M[1,3]=M[2,3]=8; M[1,4]=M[1,5]=M[1,6]=12; M[4,5]=M[4,6]=6; M[5,6]=2; ...]
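For small matrices, both properties can be checked exactly through standard equivalent characterizations that are not stated on the slides: the three-point condition for ultrametrics and the four-point condition for tree metrics. The sketch below assumes M is given as a list of lists; the function names are illustrative only. It is a brute-force exact check, which is the kind of computation the testing algorithms are meant to avoid.

```python
from itertools import combinations, product

def _symmetric_zero_diagonal(M):
    n = len(M)
    return (all(M[i][i] == 0 for i in range(n)) and
            all(M[i][j] == M[j][i] for i, j in combinations(range(n), 2)))

def is_ultrametric(M):
    """Three-point condition: in every triple the two largest distances are equal."""
    if not _symmetric_zero_diagonal(M):
        return False
    for i, j, k in combinations(range(len(M)), 3):
        _, second, largest = sorted((M[i][j], M[i][k], M[j][k]))
        if second != largest:
            return False
    return True

def is_tree_metric(M):
    """Four-point condition: of the three pairings of i,j,k,l, the two largest sums are equal.

    Indices may repeat, which also enforces the triangle inequality.
    """
    if not _symmetric_zero_diagonal(M):
        return False
    for i, j, k, l in product(range(len(M)), repeat=4):
        sums = sorted((M[i][j] + M[k][l], M[i][k] + M[j][l], M[i][l] + M[j][k]))
        if sums[1] != sums[2]:
            return False
    return True
```

Every ultrametric is also a tree metric, so is_tree_metric should accept whatever is_ultrametric accepts; the converse fails, for example, for the distances between four points on a path.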
Our Results Our algorithms all work by taking a uniformly selected sample S ⊆ [n] and querying M[i,j] for i,j ∈ S. The size of the sample is always poly(1/ε) and independent of n. Specifically: • Can test ultrametrics with |S|=O(log(1/ε)/ε^3). • Can test general tree metrics with |S|=O(log(1/ε)/ε^3). • Can extend the result for ultrametrics to approximate ultrametrics. • Can test d-dimensional Euclidean metrics with |S|=O(d log d/ε).
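To get a feel for these bounds, the following sketch evaluates only their leading terms. The constants hidden by the O-notation are not given on the slides, so they are taken to be 1 here purely as an assumption, and the function names are illustrative.

```python
import math

def ultrametric_sample_term(eps):
    """Leading term log(1/eps)/eps^3 of the stated bound (hidden constant assumed to be 1)."""
    return math.log(1 / eps) / eps ** 3

def euclidean_sample_term(d, eps):
    """Leading term d*log(d)/eps of the stated bound (hidden constant assumed to be 1)."""
    return d * math.log(d) / eps

def queried_fraction(n, s):
    """Fraction of the n*n matrix entries that lie inside an s-point sample."""
    s = min(s, n)
    return (s * s) / (n * n)
```

For example, ultrametric_sample_term(0.1) is a few thousand points whatever n is, so for a matrix on a million points the sample touches only a tiny fraction of the entries.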
Our Results (continued) The testing algorithms can be used to solve relaxed versions of the corresponding search problems in time linear in n (and polynomial in 1/ε). That is, one can construct a tree that agrees with M on all but at most an ε-fraction of the entries. (Note that the running time is sub-linear in the size of the matrix M.)
Constructing an Ultrametric Tree Suppose M is an ultrametric. We can construct an ultrametric tree that agrees with M on a given subset {1,…,s} in the following manner: • Initialization: Position points 1 and 2 at equal distance M[1,2]/2 from the root node. • Iterations: For each point j = 3,…,s, add point j to the current tree by adding a new branch that emanates from j's unique point of departure from the tree. This point is determined by the point in the tree closest to j.
[Figure: step-by-step construction of the ultrametric tree for M[1,2]=8; M[1,3]=M[1,4]=M[1,5]=10; M[2,3]=M[2,4]=M[2,5]=10; M[3,4]=2; M[3,5]=6; M[4,5]=6.]
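The construction procedure can be sketched in code as follows. This is one possible reading of the slide, not the authors' implementation: the Node class, the function names, and the choice to store at each node its "level" (twice its height, i.e. the distance between any two leaves that meet there) are assumptions made for this example. Points are 0-indexed, assumed distinct, and M is assumed symmetric with a zero diagonal.

```python
class Node:
    def __init__(self, level, label=None):
        self.level = level    # distance between two leaves whose lowest common ancestor is this node
        self.label = label    # point index for leaves, None for internal nodes
        self.parent = None
        self.children = []

def _attach(parent, child):
    child.parent = parent
    parent.children.append(child)

def tree_distance(a, b):
    """Distance between leaves a and b: the level of their lowest common ancestor."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(id(node))
        node = node.parent
    node = b
    while id(node) not in ancestors:
        node = node.parent
    return node.level

def build_ultrametric_tree(M, points):
    """Return (root, leaves) for a tree agreeing with M on points, or None if none exists."""
    root = Node(0, points[0])
    leaves = {points[0]: root}
    for j in points[1:]:
        u = min(leaves, key=lambda v: M[j][v])       # closest point already in the tree
        d = M[j][u]
        node = leaves[u]
        while node.parent is not None and node.parent.level <= d:
            node = node.parent                        # climb to the point of departure
        new_leaf = Node(0, j)
        if node.level == d and node.children:
            _attach(node, new_leaf)                   # departure point is an existing node
        else:
            branch = Node(d)                          # departure point lies inside an edge
            parent = node.parent
            if parent is None:
                root = branch
            else:
                parent.children.remove(node)
                _attach(parent, branch)
            _attach(branch, node)
            _attach(branch, new_leaf)
        leaves[j] = new_leaf
        # Adding j must reproduce M on all points placed so far; otherwise no tree exists.
        if any(tree_distance(new_leaf, leaves[v]) != M[j][v] for v in leaves):
            return None
    return root, leaves
```

On the five-point example matrix from this slide (shifted to indices 0..4), the sketch produces a root at level 10 whose two subtrees hold points {1,2} and {3,4,5}; on three points of a line it returns None, since those distances are not an ultrametric.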
Consistency of points with tree For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U (if such a tree exists, it is unique). Def: Say that j ∈ [n] \ U is consistent with T_U if adding j to T_U as described in the construction procedure results in a tree that agrees with M on U ∪ {j}. Denote the set of points consistent with T_U by G_U.
The “Scaffold Partition” For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U. We refer to this tree as the scaffold. Def: Let P_U be the following partition of G_U, induced by T_U: points i and j are in the same class if and only if they have the same point of departure from T_U.
[Figure: the scaffold partition. Classes C1, C2, C3, C4 hang off the scaffold tree at their points of departure.]
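Consistency and the scaffold partition can be computed from distances to the sample alone, without building the tree explicitly. The sketch below assumes that M is symmetric, that the scaffold on U exists (M restricted to U is an ultrametric), and that distinct points have positive distance. The encoding of a departure point as a pair (d, anchor), where d is the distance to the closest sample point and anchor is the smallest-index sample point at that distance, is an assumption made for this example: the branch leaves the path from anchor to the root at height d/2.

```python
from collections import defaultdict

def departure_key(M, U, j):
    """Encode j's point of departure from the scaffold on U as (d, anchor)."""
    d = min(M[j][u] for u in U)
    anchor = min(u for u in U if M[j][u] == d)
    return d, anchor

def is_consistent(M, U, j):
    """j is consistent if hanging it at its departure point reproduces M[j][u] for all u in U."""
    d, anchor = departure_key(M, U, j)
    return all(M[j][u] == max(d, M[anchor][u]) for u in U)

def scaffold_partition(M, U):
    """Return (classes, inconsistent): classes maps a departure key to its points outside U."""
    in_sample = set(U)
    classes = defaultdict(list)
    inconsistent = []
    for j in range(len(M)):
        if j in in_sample:
            continue
        if is_consistent(M, U, j):
            classes[departure_key(M, U, j)].append(j)
        else:
            inconsistent.append(j)
    return dict(classes), inconsistent
```

Each point costs |U| queries, so partitioning all of [n] takes about n·|U| queries rather than n^2.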
Violating Pairs If M is an ultrametric, then for every subset U, and for every two points i,j that belong to different classes in P_U, the value of M[i,j] is exactly determined by the corresponding (different) departure points in T_U. Def: Say that i,j ∈ G_U that belong to different classes in P_U are a violating pair w.r.t. T_U if the distance between them according to the scaffold T_U differs from M[i,j].
[Figure: scaffold with classes C1, C2, C3, C4 and points i, j in different classes. If M is an ultrametric, must have M[i,j]=8.]
Two types of “witnesses” Suppose we have a scaffold tree T_U that agrees with M on U. (If we can't construct such a tree, clearly M is not an ultrametric.) It follows that: • If we obtain a point j that is inconsistent with T_U, then we have a witness that M is not an ultrametric. • If we obtain a pair of points i,j that are violating w.r.t. T_U, then we have a witness that M is not an ultrametric.
Testing Algorithm for Ultrametrics
1. Uniformly select s=O(log(1/ε)/ε^3) points from [n]. Denote the set by U.
2. Construct a tree T_U that agrees with M on U. If this fails, reject.
3. Uniformly select m=O(1/ε) pairs of points from [n].
4. If any of these 2m points is inconsistent with T_U, or any of the m pairs is violating w.r.t. T_U, then reject.
5. If no step caused rejection, then accept.
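The five steps can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: s and m are passed in explicitly because the constants hidden in the O-notation are not given on the slides, the sample is drawn without replacement, M is assumed symmetric with zero diagonal and positive off-diagonal entries, and a departure point is encoded as (d, anchor), meaning the distance to the closest sample point together with the smallest-index sample point at that distance. For two points in different classes, the scaffold distance then works out to max(d_i, d_j, M[anchor_i][anchor_j]).

```python
import random

def departure_key(M, U, j):
    d = min(M[j][u] for u in U)
    anchor = min(u for u in U if M[j][u] == d)
    return d, anchor

def is_consistent(M, U, j):
    d, anchor = departure_key(M, U, j)
    return all(M[j][u] == max(d, M[anchor][u]) for u in U)

def scaffold_exists(M, U):
    """Step 2: add sample points one by one; each must be consistent with the previous ones."""
    return all(is_consistent(M, U[:k], U[k]) for k in range(1, len(U)))

def is_violating(M, U, i, j):
    key_i, key_j = departure_key(M, U, i), departure_key(M, U, j)
    if key_i == key_j:
        return False                      # same class: the scaffold does not fix M[i][j]
    expected = max(key_i[0], key_j[0], M[key_i[1]][key_j[1]])
    return M[i][j] != expected

def ultrametric_tester(M, s, m, rng=random):
    """Return True for accept, False for reject."""
    n = len(M)
    U = rng.sample(range(n), min(s, n))                      # step 1
    if not scaffold_exists(M, U):                            # step 2
        return False
    for _ in range(m):                                       # step 3
        i, j = rng.randrange(n), rng.randrange(n)
        if not (is_consistent(M, U, i) and is_consistent(M, U, j)):
            return False                                     # step 4: inconsistent point
        if is_violating(M, U, i, j):
            return False                                     # step 4: violating pair
    return True                                              # step 5
```

The sketch only rejects when it has found a witness, so an ultrametric input is always accepted; an input such as the distances between points on a line is rejected already in step 2 once the sample has three points.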
Analysis of Algorithm • If M is an ultrametric, the algorithm always accepts. (No inconsistent points and no violating pairs.) • From now on assume M is ε-far from ultrametric. Will show that the algorithm rejects w.h.p. Specifically: either can't construct T_U that agrees with M; or there are many inconsistent points w.r.t. T_U; or there are many violating pairs w.r.t. T_U.
Special Case (for M ε-far from ultrametric) Suppose T_U agrees with M, and all but at most (ε/3)n^2 pairs of points in G_U belong to different classes in P_U (are separated). (In particular, this is the case if all classes are of size O(εn).) Claim: Either there are more than (ε/3)n inconsistent points w.r.t. T_U or there are more than (ε/3)n^2 violating pairs w.r.t. T_U. Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required.
Proof of Claim for special case Assume, contrary to the claim, that there are at most (ε/3)n inconsistent points and at most (ε/3)n^2 violating pairs. Will show that there exists an ultrametric tree T that agrees with M on all but at most εn^2 entries, in contradiction to the assumption on M. Tree T builds on the scaffold T_U: for every class C in P_U, create a star-shaped sub-tree with leaf set C that is rooted at the point of departure of C from T_U. Inconsistent points are added arbitrarily. By the premise of the special case and the (counter) assumptions, the number of disagreements is at most (ε/3)n·n [inconsistent points] + (ε/3)n^2 [violating pairs] + (ε/3)n^2 [unseparated pairs] = εn^2.
[Figure: scaffold tree with classes C1, C2, C3, C4.]
General Case By the special case: gain from separating points into different classes. Def: Say that a point k ∉ U is an effective separator w.r.t. T_U if adding k to U causes at least (εn/12)^2 pairs of points to be separated into different classes. [Figure: classes C1, C2, C3, C4; adding point k splits C1 into C1,1 and C1,2.]
General Case (continued) In the analysis, view the sample U as being selected in phases. In each phase, if there are many effective separators then one is selected w.h.p. After a sufficient number of phases, either we have the special case (few non-separated pairs), or U is such that there are few effective separators w.r.t. T_U. In the latter case can show that for every class C in P_U there exists a tree T_C such that for almost all pairs i,j ∈ C, M[i,j]=T_C(i,j). (The tree is star-shaped/broom-shaped.)
General Case (continued) Claim: Either there are more than (ε/4)n inconsistent points w.r.t. T_U or there are more than (ε/4)n^2 violating pairs w.r.t. T_U. Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required. The proof of the claim is similar to that in the special case: assume few inconsistent points and few violating pairs, and show that there is a tree close to M (contradicting M being ε-far from ultrametric).
Solving Relaxed Version of Search Problem The analysis implies that the testing algorithm can be used to solve a relaxed version of the corresponding search problem. That is, if M is an ultrametric then, w.h.p., can construct a tree that agrees with M on all but at most an ε-fraction of the entries, in time linear in n and polynomial in 1/ε: • Construct the scaffold T_U on a uniformly selected sample U; • Partition all points in [n]\U into classes of P_U according to their distances to points in U; • For each class C construct a star/broom-shaped tree T_C.
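The three steps above can be sketched as follows. This is a simplified illustration: it assumes M is an ultrametric (so every point is consistent with the scaffold), uses only star-shaped class trees rather than the broom-shaped ones mentioned on the slide, and represents the resulting tree implicitly through a distance function instead of explicit nodes. The (d, anchor) encoding of a departure point and all function names are assumptions made for this example.

```python
def departure_key(M, U, j):
    """(d, anchor): distance to the closest sample point, smallest-index sample point at that distance."""
    d = min(M[j][u] for u in U)
    anchor = min(u for u in U if M[j][u] == d)
    return d, anchor

def fit_tree_distances(M, U):
    """Return dist(i, j), the distance in the tree 'scaffold on U plus one star per class'."""
    # One pass over all points, |U| queries each: linear in n for a fixed sample size.
    keys = [departure_key(M, U, j) for j in range(len(M))]

    def dist(i, j):
        if i == j:
            return 0
        (di, ai), (dj, aj) = keys[i], keys[j]
        if (di, ai) == (dj, aj):
            return di                 # same class: two leaves of the star at that departure point
        return max(di, dj, M[ai][aj]) # different classes: determined by the scaffold
    return dist
```

Entries between points of different classes are reproduced exactly when M is an ultrametric; the only possible disagreements are inside classes, which is why the analysis wants few unseparated pairs.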
Testing Approximate Ultrametrics Def: For a given approximation parameter δ, we say that a matrix M is a δ-approximate ultrametric if there exists an ultrametric M' such that for every i,j ∈ [n], |M[i,j]-M'[i,j]| ≤ δ. We describe an algorithm that, for every δ and ε, accepts M if M is a δ-approximate ultrametric, and rejects M w.h.p. if M is ε-far from being a cδ-approximate ultrametric (c is a fixed constant).
Conclusions and Further Research • Presented an algorithm for testing whether a matrix is an ultrametric or far from being an ultrametric. The analysis implies a fast solution for the relaxed search problem. • Mentioned similar results for approximate ultrametrics, general tree metrics and Euclidean metrics. • We suspect that the results can be improved in terms of the dependence on 1/ε. • We conjecture that the result for general tree metrics can be extended to an approximate variant. • Testing other natural metric properties?