
Multi-Granularity Classification Rule Discovery Using ERID


Presentation Transcript


  1. Multi-Granularity Classification Rule Discovery Using ERID, by Seunghyun Im, Zbigniew Ras, Li-Shiang Tsay (University of Pittsburgh at Johnstown, University of North Carolina, North Carolina A&T University)

  2. Outline 1. Introduction 2. Backgrounds 3. Algorithm 4. Experiment 5. Conclusion

  3. 1. Introduction • Discovering classification rules at multiple (relevant) levels of abstraction specified by the user • e.g. Find out the dependency between the flight schedule and ticket sales to increase the profit. • 10:00 AM and PA → High sales (rule from the given data set) • Morning and PA → High sales (general) • 10:00 AM and Pittsburgh → Medium sales (specific)

  4. 2. Backgrounds and Contribution • Related Work • Generalized association rules • Multilevel Classification rules using ontology • Rule based attribute oriented induction • Multiple-level decision tree, ontology-driven decision tree learning • Feature hierarchies, Structured attributes, Variant levels of precision

  5. 2. Backgrounds and Contribution • Proposed Method • Discover more specific and more general classification rules from an incomplete information system with hierarchical attributes using ERID • * An incomplete information system allows multiple weighted values to be stored as a single attribute value. • * ERID (Extracting Rules from Incomplete Decision Table) is an algorithm designed to extract classification rules from (partially) incomplete information systems.

  6. 3. Algorithm (By example) Incomplete Information System S • Information System S = (X, A, V): X is a set of objects = {x1, x2, x3, x4, x5, x6}; A is a set of attributes = {a, b, c, d}; Va is the set of values of attribute a, where a ∈ A; and V = {Va : a ∈ A}.
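
A minimal Python sketch (separate from the slides' Matlab/VB implementation) of how such a table could be represented, with each cell holding a list of weighted values; the objects and values below are illustrative, not the slides' actual S:

```python
# Sketch of an incomplete information system S = (X, A, V): each cell holds a
# list of (value, weight) pairs whose weights sum to 1.
S = {
    "x1": {"a": [("a1", 0.5), ("a2", 0.5)], "b": [("b1", 1.0)], "d": [("d1", 1.0)]},
    "x2": {"a": [("a1", 1.0)],              "b": [("b1", 0.5), ("b2", 0.5)], "d": [("d2", 1.0)]},
    "x3": {"a": [("a2", 1.0)],              "b": [("b2", 1.0)], "d": [("d2", 1.0)]},
}

def values_of(S, attr):
    """V_a: the set of values of attribute `attr` that actually occur in S."""
    return {v for row in S.values() for (v, _) in row.get(attr, [])}

print(values_of(S, "a"))   # {'a1', 'a2'}
```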

  7. 3. Algorithm (By example) Incomplete Information System S Assumption

  8. 3. Algorithm (By example) Incomplete Information System S e.g. An attribute value for an object in an incomplete information system can be {(dark black, 0.5), (light black, 0.5)}, meaning that the object may be dark black or light black with equal weight. e.g. If {(v1, 0.2), (v2, 0.4), (v3, 0.4)} is assigned as a single attribute value in S with threshold λ1 = 0.3, then it is automatically converted to {(v2, 0.5), (v3, 0.5)}, because the confidence of v1 is below 0.3 and the sum of the weights of the vi is 1 by definition. The weights for v2 and v3 are recalculated as the ratio of each weight to the total weight after eliminating v1, that is p2 = 0.4/(0.4+0.4) = 0.5 and p3 = 0.4/(0.4+0.4) = 0.5.
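
A small sketch of that conversion on a single cell, assuming the threshold λ1 = 0.3 from the example; the helper name `renormalize` is ours:

```python
# Drop values whose weight falls below the threshold (lambda_1 in the slides)
# and rescale the remaining weights so they sum to 1 again.
def renormalize(cell, lam=0.3):
    kept = [(v, w) for (v, w) in cell if w >= lam]
    total = sum(w for _, w in kept)
    return [(v, w / total) for (v, w) in kept] if total else []

# The slides' example: {(v1,0.2), (v2,0.4), (v3,0.4)} with lambda_1 = 0.3
print(renormalize([("v1", 0.2), ("v2", 0.4), ("v3", 0.4)]))
# -> [('v2', 0.5), ('v3', 0.5)]
```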

  9. 3. Algorithm Attribute Value Hierarchy (AVH) for S An AVH t = {a[p]} is the set of node values, where a is an attribute name and p is the path from the root to the node (e.g. b[1,1] is the value of the left-most element at depth 2). a[ε], where ε is the empty sequence, is equivalent to the attribute name.
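
A tiny sketch of the a[p] notation, encoding a node as an (attribute, path) pair with the path as a tuple of child indices; this encoding is an assumption for illustration, not code from the slides:

```python
# A node a[p]: (attribute, path). The empty path is the attribute itself,
# and the depth of a node is the length of its path.
def depth(node):
    _attr, path = node
    return len(path)

def ancestor_at_depth(node, k):
    attr, path = node
    return (attr, path[:k])

b11 = ("b", (1, 1))                  # the slides' b[1,1]: left-most node at depth 2
print(depth(b11))                    # 2
print(ancestor_at_depth(b11, 1))     # ('b', (1,)) i.e. b[1]
print(ancestor_at_depth(b11, 0))     # ('b', ()) i.e. the attribute b itself
```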

  10. 3. Algorithm Replacement of Attribute Value and Weight 1. The sum of the weights for a'(x) is 1 (unchanged, by the definition of an incomplete information system). 2. If ai(x) = {(v,w)} is generalized to level k, a'(x) = {(v',w)}, where v' is the ancestor node of v at depth k, and w is carried over. 3. If ai(x) = {(v,w)} is specified to level k, each v is replaced by one or more descendant nodes {v'1, v'2, ..., v'n}, and the weight w is divided evenly among them, i.e. w' for each v' is w/n, where n is the number of sibling nodes at depth k. 4. If ti ∈ T is not balanced and the granularity of ai(x) is set to a level where no node exists, choose the closest parent node.

  11. 3. Algorithm Creating a new information system S' with specified granularity e.g. If an attribute value {(a[2,1], 0.5), (a[2,2], 0.5)} is generalized to one level higher → {(a[2], 1)}. e.g. If {(a[2], 1)} is specified to the values at depth 3 → {(a[2,1,1], 1/6), (a[2,1,2], 1/6), (a[2,2], 1/3), (a[2,3], 1/3)}. S' with granularity a = 1, b = 2, c = 2, d = 1
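
A sketch of the replacement rules from the previous slide, applied to this slide's two examples; the CHILDREN table encodes only the branch of the AVH needed here, and the function names are ours:

```python
# Generalization carries each weight to the ancestor at depth k; specification
# splits a weight evenly among children at each level down to depth k,
# stopping early at nodes with no deeper children (rule 4).
CHILDREN = {
    ("a", (2,)):   [("a", (2, 1)), ("a", (2, 2)), ("a", (2, 3))],
    ("a", (2, 1)): [("a", (2, 1, 1)), ("a", (2, 1, 2))],
}

def generalize(cell, k):
    out = {}
    for (attr, path), w in cell:
        anc = (attr, path[:k])
        out[anc] = out.get(anc, 0.0) + w           # rule 2: weight carried over
    return sorted(out.items())

def specify(cell, k):
    out = {}
    def descend(node, w):
        kids = CHILDREN.get(node, [])
        if len(node[1]) >= k or not kids:          # rule 4: closest existing node
            out[node] = out.get(node, 0.0) + w
        else:
            for kid in kids:                       # rule 3: split weight evenly
                descend(kid, w / len(kids))
    for node, w in cell:
        descend(node, w)
    return sorted(out.items())

print(generalize([(("a", (2, 1)), 0.5), (("a", (2, 2)), 0.5)], 1))  # a[2] with weight 1
print(specify([(("a", (2,)), 1.0)], 3))                             # weights 1/6, 1/6, 1/3, 1/3
```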

  12. 3. Algorithm Extract Classification Rules from S' using ERID
      Minimum support = 1, minimum confidence = 0.75
      Granules associated with a, b, c:
      a[1] = {(x1,1/2), (x3,1), (x4,1), (x5,1)}
      a[2] = {(x1,1/2), (x6,1)}
      b[1,1] = {(x2,1/2), (x4,1), (x5,1)}
      b[1,2] = {(x2,1/2)}
      b[2,1] = {(x1,1/2), (x3,1/2), (x6,1/2)}
      b[2,2] = {(x1,1/2), (x3,1/2), (x6,1/2)}
      c[1,1] = {(x1,1), (x2,1), (x3,1/2), (x6,1)}
      c[1,2] = {(x3,1/2), (x4,1)}
      Granules associated with d (decision):
      d[1] = {(x1,1), (x4,1), (x5,1)}
      d[2] = {(x2,1), (x3,1), (x6,1)}
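
A sketch of how granules could be computed from such a weighted table; S_prime below is a small illustrative table, not the slides' S':

```python
# The granule of an attribute value v collects each object whose cell contains
# v, together with the weight v has there.
S_prime = {
    "x1": {"a": [("a[1]", 0.5), ("a[2]", 0.5)], "d": [("d[1]", 1.0)]},
    "x2": {"a": [("a[1]", 1.0)],                "d": [("d[2]", 1.0)]},
    "x3": {"a": [("a[2]", 1.0)],                "d": [("d[2]", 1.0)]},
}

def granule(S, attr, value):
    return {x: w for x, row in S.items()
            for (v, w) in row.get(attr, []) if v == value}

print(granule(S_prime, "a", "a[1]"))   # {'x1': 0.5, 'x2': 1.0}
print(granule(S_prime, "d", "d[2]"))   # {'x2': 1.0, 'x3': 1.0}
```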

  13. 3. Algorithm Single Element Term Rules
      a[1]* → d[1]*   sup = 5/2, conf = 0.72   +
      a[1]* → d[2]*   sup = 1,   conf = 0.29
      a[2]* → d[1]*   sup = 1/2                -
      a[2]* → d[2]*   sup = 1,   conf = 0.67
      b[1,1]* → d[1]* sup = 2,   conf = 0.8    +
      b[1,1]* → d[2]* sup = 1/2                -
      b[2,1]* → d[1]* sup = 1/2                -
      b[2,1]* → d[2]* sup = 1,   conf = 0.67
      b[2,2]* → d[1]* sup = 1/2                -
      b[2,2]* → d[2]* sup = 1,   conf = 0.67
      c[1,1]* → d[1]* sup = 3,   conf = 0.67
      c[1,1]* → d[2]* sup = 1/2                -
      c[1,2]* → d[1]* sup = 2,   conf = 0.8    +
      c[1,2]* → d[2]* sup = 1/2                -
      Legend: "+" = rule (sup and conf meet the thresholds); "-" = no further consideration because sup is below the threshold; unmarked terms will be used in 2-element terms.
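
A sketch of the single-element step, assuming that weights multiply when a term granule is intersected with a decision granule (this reproduces the b[1,1]* → d[1]* row above); the marking scheme follows the legend:

```python
# sup of t -> d: summed weight of objects shared by both granules;
# conf divides by the support of t alone.
MIN_SUP, MIN_CONF = 1.0, 0.75

def sup_conf(g_term, g_dec):
    term_sup = sum(g_term.values())
    rule_sup = sum(w * g_dec.get(x, 0.0) for x, w in g_term.items())
    conf = rule_sup / term_sup if term_sup else 0.0
    return rule_sup, conf

def mark(rule_sup, conf):
    if rule_sup < MIN_SUP:
        return "-"                              # dropped: support too low
    return "+" if conf >= MIN_CONF else ""      # "" = kept for 2-element terms

g_b11 = {"x2": 0.5, "x4": 1.0, "x5": 1.0}       # granule of b[1,1]
g_d1  = {"x1": 1.0, "x4": 1.0, "x5": 1.0}       # granule of d[1]
s, c = sup_conf(g_b11, g_d1)
print(s, round(c, 2), mark(s, c))               # 2.0 0.8 +
```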

  14. 3. Algorithm Two Element Term Rules
      (a[1], b[2,1])* → d[2]*   sup = 1,   conf = 0.67   +
      (a[1], b[2,2])* → d[1]*   sup = 0                  -
      (a[1], c[1,1])* → d[2]*   sup = 1,   conf = 2      +
      (b[1,1], c[1,1])* → d[1]* sup = 2,   conf = 0.8    -
      (b[2,1], c[1,1])* → d[2]* sup = 1/2
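
A sketch of one plausible way to extend a term with a second attribute value, intersecting the two granules and multiplying shared weights; the slides' table may combine weights differently, so the printed numbers are not meant to match it exactly:

```python
def combine(g1, g2):
    """Granule of a 2-element term: objects in both granules, weights multiplied."""
    return {x: g1[x] * g2[x] for x in g1.keys() & g2.keys()}

def sup_conf(g_term, g_dec):
    term_sup = sum(g_term.values())
    rule_sup = sum(w * g_dec.get(x, 0.0) for x, w in g_term.items())
    return rule_sup, (rule_sup / term_sup if term_sup else 0.0)

g_a1  = {"x1": 0.5, "x3": 1.0, "x4": 1.0, "x5": 1.0}   # granule of a[1]
g_b21 = {"x1": 0.5, "x3": 0.5, "x6": 0.5}              # granule of b[2,1]
g_d2  = {"x2": 1.0, "x3": 1.0, "x6": 1.0}              # granule of d[2]

g_pair = combine(g_a1, g_b21)     # {'x1': 0.25, 'x3': 0.5}
print(sup_conf(g_pair, g_d2))     # support and confidence of (a[1], b[2,1])* -> d[2]*
```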

  15. 4. Implementation and Experiment • Data Set • Census bureau database (obtained from the UCI ML repository) • 4884 objects and 12 attributes • AVHs (one to three levels of depth) were built based on the description given by the data provider • Tables on the slide: levels of attribute granularity for the test data set; rules at different granularity

  16. 4. Implementation and Experiment • Implementation • Platform: Pentium M 1.6, Windows XP • Programming Languages: Matlab 7.1, Visual Basic .NET

  17. 5. Conclusion • Described the use of ERID for discovering multi-granularity classification rules from an incomplete information system having attribute value hierarchies. • ERID is especially well suited to extracting rules from information systems containing multiple weighted values, and this feature makes it possible to handle the partial incompleteness of attribute values that arises when the levels of abstraction are changed.

  18. Questions? Thank You
