
Counting subgraphs

Counting subgraphs: support measures for graphs. Natalia Vanetik. This research was carried out under the supervision of Prof. Eyal S. Shimony and Prof. Ehud Gudes. Published in DAMI Journal vol. 13(2), September 2006.


Presentation Transcript


  1. Counting subgraphs: support measures for graphs. Natalia Vanetik. PhD seminar CS BGU

  2. This research was carried out under the supervision of Prof. Eyal S. Shimony and Prof. Ehud Gudes. Published in DAMI Journal vol. 13(2), September 2006.

  3. Research directions: • Multiflows in graphs • Counting functions in graphs

  4. Problem description: Let D and G be graphs. We need to measure the statistical significance of G as a subgraph of D. Observe instances (isomorphic copies) of G within D. [Figure: three database graphs D, in which G has zero significance, some significance, and high significance, respectively.]

  5. Definition: A counting function on graphs that measures the statistical significance of one graph G as a subgraph of another graph D is called a support measure. When G is not a subgraph of D, this function should return 0; otherwise, it should return a value greater than 0.

  6. Traditional support measure: An item-set X in the relational model is a set of tuples (f1,v1),…,(fn,vn), where the fi are field names and the vi are values. A transaction T supports X if the value of fi in T equals vi for every i=1…n. The support of an item-set X is the number of transactions in the database that support X.
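
The definition on this slide can be sketched in a few lines of Python. This is only an illustration: the toy database, its field names, and the function names `supports` and `support` are assumptions made here and do not come from the presentation.

```python
# Hypothetical toy database: each transaction maps field names to values.
transactions = [
    {"color": "red", "size": "L", "shape": "round"},
    {"color": "red", "size": "M", "shape": "round"},
    {"color": "blue", "size": "L", "shape": "square"},
]


def supports(transaction, itemset):
    """True if the transaction holds value v in field f for every (f, v) in the item-set."""
    return all(transaction.get(f) == v for f, v in itemset)


def support(itemset, database):
    """Number of transactions in the database that support the item-set."""
    return sum(supports(t, itemset) for t in database)


x = {("color", "red")}
y = {("color", "red"), ("shape", "round"), ("size", "L")}
print(support(x, transactions))  # 2
print(support(y, transactions))  # 1
```

In this toy case the larger item-set `y` has no more support than its subset `x`, which is the behaviour one expects from transaction-based support.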

  7. Admissibility: It is important, especially for graph mining, that a support measure is admissible, that is, has the downward closure (anti-monotonicity) property: • the support of a graph cannot be smaller than the support of any of its supergraphs, or equivalently, • the support of a graph cannot be larger than the support of any of its subgraphs.

  8. Motivation: • A significant amount of data in the world is graph-like rather than relational. • Graph data is usually represented by one or more large graphs; transaction-like graph datasets are rare. • The traditional support definition is not admissible. • Admissible support measures are required for mining graph data and for other tasks.

  9. Instance graph: • We observe all the subgraphs of D that are isomorphic to G, called instances of G. • Two instances are considered connected if they have an edge/node/subgraph in common. • The graph whose nodes are the instances of G, with an edge between every pair of connected instances, is called the instance graph of G in D.
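
A minimal sketch of building an instance graph, assuming the instances have already been found and are given as sets of edges, each edge written as a canonically ordered tuple `(u, v)`. The function names, the `overlap` parameter, and the path example are illustrative assumptions, not part of the presentation.

```python
from itertools import combinations


def nodes_of(instance):
    """Set of nodes touched by an instance given as a set of (u, v) edges."""
    return {n for edge in instance for n in edge}


def instance_graph(instances, overlap="edge"):
    """Adjacency dict of the instance graph: node i stands for instances[i]."""
    adj = {i: set() for i in range(len(instances))}
    for i, j in combinations(range(len(instances)), 2):
        if overlap == "edge":
            connected = bool(instances[i] & instances[j])
        else:  # any other value is treated as node overlap
            connected = bool(nodes_of(instances[i]) & nodes_of(instances[j]))
        if connected:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Hypothetical example: D is the path 1-2-3-4-5 and G is a path with two edges.
instances = [
    frozenset({(1, 2), (2, 3)}),
    frozenset({(2, 3), (3, 4)}),
    frozenset({(3, 4), (4, 5)}),
]
print(instance_graph(instances, "edge"))  # {0: {1}, 1: {0, 2}, 2: {1}}
print(instance_graph(instances, "node"))  # instances 0 and 2 also share node 3
```

Under edge overlap the instance graph here is a path, while under node overlap it becomes a triangle, so the choice of intersection type changes the instance graph.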

  10. Instance graph: an example. [Figure: a pattern G, a database graph D, and the instance graph of G in D.]

  11. Intuitive support measures: • Just count the instances. • Perform some sort of weighted count. [Figure: two examples of a pattern G in a database graph D, one with CountD(G) = 3 and one with WcountD(G) = CountD(G) / 3 = 1.]

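  12. The problem with the intuitive approach is that these measures are not admissible. [Figure: a database graph D with a pattern G and its subgraph g where CountD(G) = 3 but CountD(g) = 1, and another where WcountD(G) = 1 + 1 = 2 but WcountD(g) = 1/2 + 1/2 + 1/2 = 3/2.] In both cases the subgraph g has smaller support than its supergraph G.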

  13. What is going on? • A counting function can be viewed as acting on the instance graph. • A graph g and its supergraph G have different instance graphs Ig and IG, and Ig is obtained from IG by a series of graph operations. • If a counting function does not decrease under these operations, it is admissible (for the specific G and g, at least).

  14. Operations on instance graphs: We narrowed it down to the following three operations on instance graphs: • clique contraction, • node addition, • edge removal.

  15. Clique contraction: A clique is contracted into a single node. Another node is adjacent to the new node only if it was adjacent to all the nodes of the clique. [Figure: intuition behind the operation, showing instances of G and g.]
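
The contraction rule can be sketched on an adjacency-dict representation of the instance graph. The function name `contract_clique`, the representation, and the example graph are assumptions made for illustration; the sketch also assumes that `new_node` is not already a node of the graph.

```python
def contract_clique(adj, clique, new_node):
    """Return a new adjacency dict with `clique` contracted into `new_node`."""
    clique = set(clique)
    # Check that the chosen nodes really form a clique.
    for u in clique:
        if not (clique - {u}) <= adj[u]:
            raise ValueError("nodes do not form a clique")
    # Only outside nodes adjacent to every clique node stay adjacent to the new node.
    common = set(adj) - clique
    for u in clique:
        common &= adj[u]
    # Drop the clique nodes and every edge that touched them.
    result = {v: nbrs - clique for v, nbrs in adj.items() if v not in clique}
    result[new_node] = set(common)
    for v in common:
        result[v].add(new_node)
    return result


# Hypothetical instance graph: triangle {0, 1, 2}; node 3 sees all of it, node 4 only node 0.
adj = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2},
    4: {0},
}
print(contract_clique(adj, {0, 1, 2}, "k"))
# {3: {'k'}, 4: set(), 'k': {3}}
```

Node 4 loses its only edge because it was adjacent to just one node of the clique, while node 3 stays connected to the contracted node.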

  16. Node addition: A new node and some edges incident to this node are added. [Figure: intuition behind the operation, showing instances of G and g.]

  17. Edge removal: An edge is removed. [Figure: intuition behind the operation, showing instances of G and g.]

  18. The main result. Theorem: A support measure on graphs is admissible if and only if it does not decrease under the following operations on instance graphs: • clique contraction, • edge removal, • node addition.

  19. Sufficiency: To prove sufficiency of these three operations, we need to show that for every graph D and every pair of graphs G and g such that g is a subgraph of G, the instance graph Ig of g is obtained from the instance graph IG of G by these operations alone.

  20. Sufficiency: proof outline. The proof is constructive (algorithmic). The main idea is to: • build a pair of mappings, the first from instances of G to instances of g and the second from instances of g to instances of G; • perform clique contractions and node additions to obtain the vertex set of Ig from the vertex set of IG; • perform edge removals as necessary.

  21. Necessity: To prove necessity, we need to show that for every graph H and every operation (from the above list) that produces a graph h from H, there exist a database graph D and a pair of its subgraphs G and g, where g is a subgraph of G, such that H=IG and h=Ig.

  22. Necessity: proof outline. • The proof is constructive. • Specific graphs G and g are constructed. For convenience, these graphs are labeled. • Intersection types for instances of G and g in D are defined. • D is constructed accordingly.

  23. Necessity: the patterns. [Figure: the labeled patterns G and g, with node labels a, b, c, and d and parts named Top, Arms (nodes labeled d), Bottom, and Legs (nodes labeled a).]

  24. Necessity: intersection types. The following intersection types are allowed in D: • Bottom overlap: all legs of two instances overlap. • Leg overlap: two instances have exactly one leg in common. • Arm overlap: two instances have exactly one arm in common.

  25. Bottom overlap: for clique contraction. [Figure: instances G1, G2, and G3 with overlapping bottoms.]

  26. Leg overlap: for node addition. [Figure: instances G1 and G2 with one leg in common.]

  27. Arm overlap: for edge removal. [Figure: instances G1 and G2 with one arm in common.]

  28. Necessity: proof outline. • Use instances of G to construct the database graph D. • Prove that no additional instances of G arise from the overlaps. • Show that the instance graph of g arises from the instance graph of G by applying the chosen operation.

  29. MIS measure: • The MIS measure is the size of a maximum independent set (anti-clique) in the instance graph. • It satisfies the necessary conditions (a direct admissibility proof is also available). • It was used in several papers (Han, Kuramochi, etc.). • No other admissible support measure has been found to date.
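
For small instance graphs the MIS measure can be computed by exhaustive search. The sketch below assumes an adjacency-dict representation; the function names and the two example graphs are illustrative, and the search takes exponential time, so it is not a practical algorithm for large instance graphs.

```python
from itertools import combinations


def is_independent(adj, nodes):
    """True if no two of the given nodes are adjacent."""
    return all(v not in adj[u] for u, v in combinations(nodes, 2))


def mis_size(adj):
    """Size of a maximum independent set, found by exhaustive search."""
    nodes = list(adj)
    for k in range(len(nodes), 0, -1):
        if any(is_independent(adj, subset) for subset in combinations(nodes, k)):
            return k
    return 0  # empty instance graph: the pattern has no instances


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(mis_size(triangle))  # 1
print(mis_size(path))      # 2
```

The path is the triangle with one edge removed, and the value grows from 1 to 2, so in this example the measure does not decrease under edge removal.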

  30. MIS: example. [Figure: a pattern G, a database graph D, and the instance graph IG of G in D, with MIS(IG) = 1.]

  31. Extensions: • The necessary and sufficient conditions can be reformulated for different pattern intersection types (for example, a common node can be considered an intersection).

  32. Open problems and conjectures: • Is computing an admissible support measure an NP-hard problem, regardless of the measure chosen? • Is every admissible support measure a function of the MIS size? If so, what kind of function?

  33. Thank you!

  34. Questions?
