
PROBABILISTIC AND LOGIC APPROACHES TO MACHINE LEARNING AND DATA MINING


Presentation Transcript


  1. PROBABILISTIC AND LOGIC APPROACHES TO MACHINE LEARNING AND DATA MINING Marek Perkowski Portland State University

  2. Essence of logic synthesis approach to learning

  3. Example of Logical Synthesis: John, Mark, Dave, Jim, Alan, Mate, Robert, Nick

  4. Good guys: John, Mark, Dave, Jim. Bad guys: Alan, Mate, Robert, Nick. Attributes: A - size of hair, B - size of nose, C - size of beard, D - color of eyes

  5. [Karnaugh map over AB (rows) and CD (columns); A - size of hair, B - size of nose, C - size of beard, D - color of eyes] Good guys and their minterms: Mark, John, Dave, Jim with A'B'CD, A'B'CD, A'BCD, A'BCD'

  6. [Karnaugh map over AB (rows) and CD (columns); A - size of hair, B - size of nose, C - size of beard, D - color of eyes] Bad guys and their minterms: Alan, Mate, Robert, Nick with A'B'C'D, ABCD, A'BC'D', AB'C'D. The good guys (the 1-cells) are covered by the single group A'C.

  7. [Karnaugh map over AB (rows) and CD (columns)] Generalization 1: Bald guys with beards are good (the group A'C). Generalization 2: All other guys are no good. A small check of this rule is sketched below.
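
A minimal Python sketch (not part of the original slides) that checks the learned rule good = A'C against the eight examples; the 0/1 attribute values below are read off the minterms listed on the previous slides (e.g. Mark = A'B'CD gives A=0, B=0, C=1, D=1).

```python
examples = {
    # name:    (A, B, C, D, is_good)
    "Mark":   (0, 0, 1, 1, True),
    "John":   (0, 0, 1, 1, True),
    "Dave":   (0, 1, 1, 1, True),
    "Jim":    (0, 1, 1, 0, True),
    "Alan":   (0, 0, 0, 1, False),
    "Mate":   (1, 1, 1, 1, False),
    "Robert": (0, 1, 0, 0, False),
    "Nick":   (1, 0, 0, 1, False),
}

def rule_a_prime_c(a, b, c, d):
    """Generalization 1: A'C -- no hair (A=0) and a beard (C=1)."""
    return (not a) and bool(c)

for name, (a, b, c, d, is_good) in examples.items():
    predicted = rule_a_prime_c(a, b, c, d)
    print(f"{name:6s}  predicted={'good' if predicted else 'bad':4s}  "
          f"actual={'good' if is_good else 'bad':4s}  "
          f"{'OK' if predicted == is_good else 'MISMATCH'}")
```

The rule classifies all eight examples correctly, which is exactly the generalization the slide states.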

  8. SOP (DNF) approach to learning

  9. Sum of Products: AND gates followed by an OR gate that produces the output (also, use inverters as needed). There are many algorithms to minimize SOP; they were created either in the ML community or in the Logic Synthesis community. We will illustrate three different algorithms.

  10. Method 1 SOP minimization based on graph coloring

  11. Reduction of SOP (DNF) Machine Learning to graph coloring • In the previous example there were 4 binary variables. Here there are two variables, each with 4 values. • We encode every group or minterm as a bit vector, one bit per value of each variable. • For every two groups we check whether they can be combined by taking the bitwise OR of their bit vectors. • If the combined group does not cover any zeros, the groups can be combined; if it covers zeros, they cannot. Let us try to combine a1 and b1: 1001 1000 OR 1001 0100 = 1001 1100. The combined group does not cover zeros, so a1 and b1 are compatible.

  12. Reduction of SOP (DNF) Machine Learning to graph coloring. Let us try to combine a2 and b2: 0010 1000 OR 0100 0100 = 0110 1100. The combined group covers zeros, so groups a2 and b2 are not compatible. For every pair of incompatible groups there is an edge between their nodes in the graph.

  13. Reduction of SOP (DNF) Machine Learning to graph coloring • Based on the incompatibility of groups we create the INCOMPATIBILITY GRAPH. • Between every two nodes of incompatible groups there is an edge. • We color the graph with the minimum number of colors. • The minimum number of colors is called the chromatic number. • We combine the nodes (groups) that have the same color. A small sketch of this procedure follows below.
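
A minimal Python sketch of Method 1, with partly made-up data: the bit patterns for a1, b1, a2, b2 follow the previous slides, but the ZEROS mask (the cells where the function is 0) is an assumption for illustration. Groups whose bitwise OR covers a zero get an edge in the incompatibility graph, and a simple greedy coloring then merges compatible groups.

```python
from itertools import combinations

groups = {
    "a1": 0b10011000,
    "b1": 0b10010100,
    "a2": 0b00101000,
    "b2": 0b01000100,
}
ZEROS = 0b01100000          # hypothetical zero cells of the function

def incompatible(g, h):
    """Groups cannot be merged if their OR covers a zero cell."""
    return (groups[g] | groups[h]) & ZEROS != 0

# incompatibility graph: an edge for every incompatible pair
edges = {(g, h) for g, h in combinations(groups, 2) if incompatible(g, h)}

# simple greedy coloring; nodes with the same color are merged into one group
colors = {}
for node in groups:
    used = {colors[m] for m in groups if m in colors
            and ((node, m) in edges or (m, node) in edges)}
    colors[node] = next(c for c in range(len(groups)) if c not in used)

print("edges:", edges)
print("coloring:", colors)  # number of distinct colors = number of merged groups
```

With these numbers, a1 and b1 are compatible and a2 and b2 are not, matching the two examples on the slides; a true minimum coloring would use an exact algorithm rather than this greedy pass.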

  14. The minimum coloring corresponds to the minimum number of combined groups in the final solution. These groups are usually products, but they may also be of the form PRODUCT1 * (PRODUCT2)'.

  15. Method 2 SOP minimization based on set covering with primes

  16. SOP through Set Covering. Find all prime implicants of the function. Create a table whose columns are the true minterms and whose rows are the prime implicants. This is called the covering problem: you want to find the smallest subset of rows that covers all columns. There are many algorithms for this problem; some use BDDs, some SAT, some matrices. The same method can be used for Boolean minimization, for test generation (covering all faults with a minimum number of tests), and for selecting the best positions of robots guarding a building from terrorists.

  17. SOP through Set Covering: columns correspond to minterms with value 1, rows correspond to the prime implicants T0, T1, T2, T3. T0 and T2 alone are not a solution because column b0 is not covered; T0, T2 and T3 together are a solution. A greedy sketch of this step follows below.
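
A minimal sketch of the covering step, using a simple greedy heuristic rather than the exact BDD/SAT/matrix methods mentioned above; the cover table below is illustrative and is not the actual matrix on the slide.

```python
primes = {
    "T0": {"m1", "m3"},
    "T1": {"m1", "m2"},
    "T2": {"m2", "m4"},
    "T3": {"m0", "m3"},      # covers the column missed by T0 and T2
}
to_cover = set().union(*primes.values())

chosen, uncovered = [], set(to_cover)
while uncovered:
    # greedy heuristic: pick the prime covering the most uncovered minterms
    best = max(primes, key=lambda p: len(primes[p] & uncovered))
    chosen.append(best)
    uncovered -= primes[best]

print("cover:", chosen)      # e.g. T0, T2, T3
```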

  18. Method 3 SOP minimization based on sequential finding of essential and secondary essential primes

  19. Machine Learning SOP through sequential finding of essential and secondary essential primes. Step 1: find the essential primes. [Karnaugh map over AB (rows) and CD (columns) with two essential primes marked]

  20. Step 2: remove the cells covered by the essential primes and find the secondary essential primes; the primes that no longer cover any remaining cell (the orange and yellow primes) are redundant. [Karnaugh map over AB (rows) and CD (columns)]

  21. Step 3: ITERATE. The solution consists of the essential primes and the secondary essential primes of all levels. If the algorithm does not terminate, make a random choice and iterate, or use another algorithm. A sketch of this loop follows below. [Karnaugh map over AB (rows) and CD (columns)]
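
The iteration can be sketched as follows; the prime/minterm table is hypothetical, and the random fallback corresponds to the "make random choice and iterate" step on the slide.

```python
import random

primes = {                       # hypothetical prime -> minterms it covers
    "P0": {"m0", "m1"},
    "P1": {"m1", "m2"},
    "P2": {"m2", "m3"},
    "P3": {"m3", "m4"},
}
uncovered = set().union(*primes.values())
solution = []

while uncovered:
    # a prime is (secondary) essential if it is the only remaining cover
    # of some still-uncovered minterm
    essential = [p for p in primes
                 if any(sum(m in primes[q] for q in primes) == 1
                        for m in primes[p] & uncovered)]
    if essential:
        pick = essential[0]
    else:
        pick = random.choice(list(primes))  # no essential prime: random choice
    solution.append(pick)
    uncovered -= primes.pop(pick)

print("selected primes:", solution)
```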

  22. Multivalued relations approach to learning

  23. Short introduction: multiple-valued logic, MIN and MAX. Signals can have values from some set, for instance {0,1,2} or {0,1,2,3}. {0,1} - binary logic (a special case); {0,1,2} - a ternary logic; {0,1,2,3} - a quaternary logic, etc. The MIN gate outputs the minimal value of its inputs and the MAX gate outputs the maximal value, e.g. MIN(1,2) = 1 and MAX(2,3) = 3, as sketched below.
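
A tiny sketch of MIN and MAX gates over a quaternary signal set, matching the kind of examples read off the slide (MIN(1,2) = 1, MAX(2,3) = 3); the binary AND/OR are the special case over {0,1}.

```python
def mv_min(a, b):
    """MIN gate: outputs the minimal value of its inputs."""
    return min(a, b)

def mv_max(a, b):
    """MAX gate: outputs the maximal value of its inputs."""
    return max(a, b)

print(mv_min(1, 2), mv_max(2, 3))   # 1 3
```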

  24. Functional Decomposition: evaluates the data function and attempts to decompose it into simpler functions. F(X) = H(G(B), A), where X = A ∪ B; A is the free set and B is the bound set. If A ∩ B = ∅, it is a disjoint decomposition; if A ∩ B ≠ ∅, it is a non-disjoint decomposition.

  25. Pros and cons. In generating the final combinational network, BDD decomposition (based on multiplexers) and SOP decomposition trade flexibility in circuit topology for time efficiency. Generalized functional decomposition sacrifices speed for a higher likelihood of minimizing the complexity of the final network.

  26. A Standard Map of function 'z'. Bound set: {a, b}; free set: {c}. Columns 0 and 1 and columns 0 and 2 are compatible; column compatibility = 2. [Karnaugh-style map of z omitted]

  27. Principle of finding patterns. We have a tabular representation of data and we want to find patterns; in this case we are looking for patterns in columns. Columns have the same pattern if the symbols in each row can be combined; we say that such columns are COMPATIBLE. If in one row we have 0 and 0, 1 and 1, 0 and -, 1 and -, or - and -, then the columns are compatible. If we have a 0 and a relation {0,1}, the columns are also compatible, since one can select 0 from the choice {0,1}. A sketch of this test follows below.
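
A minimal sketch of the column-compatibility test, with each cell represented as the set of values it allows, so that don't-cares and relations are handled uniformly; the two example columns are made up.

```python
DASH = {0, 1}                              # don't-care allows either value
col_a = [{0}, {1}, DASH, {0, 1}]           # {0,1} in the last row is a relation
col_b = [{0}, DASH, DASH, {0}]

def compatible(c1, c2):
    """Two columns are compatible if, row by row, their cells share a value."""
    return all(cell1 & cell2 for cell1, cell2 in zip(c1, c2))

print(compatible(col_a, col_b))            # True: a common value exists in every row
```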

  28. Decomposition of Multi-Valued Relations: F(X) = H(G(B), A), where X = A ∪ B and F, G, H may be relations. If A ∩ B = ∅, it is a disjoint decomposition; if A ∩ B ≠ ∅, it is a non-disjoint decomposition.

  29. Forming a CCG (Column Compatibility Graph) from a K-Map. Bound set: {a, b}; free set: {c}; columns C0, C1, C2. Columns 0 and 1 and columns 0 and 2 are compatible; column compatibility index = 2. [Map of z and CCG omitted]

  30. Forming a CIG (Column Incompatibility Graph) from a K-Map. Columns C0, C1, C2 of z; columns 1 and 2 are incompatible; chromatic number = 2. [Map of z and CIG omitted]

  31. CCG and CIG are complementary graphs (over the same columns C0, C1, C2): graph coloring of the CIG corresponds to clique partitioning of the CCG, and graph multi-coloring of the CIG corresponds to maximal clique covering of the CCG.

  32. clique partitioning example.

  33. Maximal clique covering example.

  34. Map of relation G after induction from the CIG: g = a high-pass filter whose acceptance threshold begins at c > 1. [Maps of G omitted]

  35. The Meaning of Attributes

  36. Attributes • Static • Facial features • Gestures • Objects to grasp • Objects to avoid • Symptoms of illness • View of body cells • Crystallization parameters of liquids • Dynamic • Changes in stock market • Changes in facial features – facial gestures. • Change of object’s view when robot approaches it. • Dynamical change of body part in motion. • Changes of moles on the skin. • Changing symptoms of an illness.

  37. Attributes in time: from dynamic to static. The attribute vectors observed at times t0, t1, t2 are represented as one long vector for Machine Learning, as sketched below.
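
A tiny illustrative sketch of this construction, with hypothetical attribute values:

```python
x_t0 = [0, 1, 1, 0]            # attribute values observed at time t0
x_t1 = [1, 1, 0, 0]            # ... at time t1
x_t2 = [1, 0, 0, 1]            # ... at time t2

long_vector = x_t0 + x_t1 + x_t2
print(long_vector)             # one 12-element static vector fed to the learner
```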

  38. Representation Models for Logic Based Machine Learning

  39. The method we are using Types of Logical Synthesis • Sum of Products • Decision Trees • Decision Diagrams • Functional Decomposition

  40. Binary Decision Diagrams There are many types of Decision Trees and many generalizations of them, used in logic and in ML

  41. Example Karnaugh Map Decision Diagrams. A decision diagram breaks down a Karnaugh map into a set of decision trees. A decision diagram ends when all of its branches have a yes, no, or don't-care solution. This diagram can become quite complex if the data is spread out, as in the following example.

  42. Decision Tree for Example Karnaugh Map

  43. BDD Representation of a function: incompletely specified function. [BDD over A, B, C, D and Karnaugh map over AB (rows) and CD (columns) omitted]

  44. BDD Representation of a function: completely specified function. The problem is how to find the minimum tree or decision diagram for your given data. [BDD over A, B, C, D and Karnaugh map over AB (rows) and CD (columns) omitted]

  45. Absolutely Minimum Background on Binary Decision Diagrams (BDD) • BDDs are based on the recursive Shannon expansion F = x·Fx + x'·Fx' • Compact data structure for Boolean logic • Can represent sets of objects (states) encoded as Boolean functions • Canonical representation: reduced ordered BDDs (ROBDD) are canonical • Essential for simulation, analysis, synthesis and verification. A small sketch of the expansion follows below.
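
A small sketch of the Shannon expansion on a concrete function, f = ac + bc (chosen for illustration, not taken from the slides): the two cofactors are obtained by fixing one variable, and the expansion F = x·Fx + x'·Fx' is checked against F on all inputs.

```python
def cofactor(f, var_index, value):
    """Positive (value=1) or negative (value=0) cofactor of f w.r.t. one variable."""
    return lambda *args: f(*args[:var_index], value, *args[var_index + 1:])

def f(a, b, c):                # example function: f = a*c + b*c
    return (a and c) or (b and c)

f_pos = cofactor(f, 0, 1)      # F_a  (a fixed to 1)
f_neg = cofactor(f, 0, 0)      # F_a' (a fixed to 0)

# Shannon expansion reconstructs f:  f = a*F_a + a'*F_a'
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            expanded = (a and f_pos(a, b, c)) or ((not a) and f_neg(a, b, c))
            assert bool(expanded) == bool(f(a, b, c))
print("Shannon expansion verified for f = ac + bc")
```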

  46. Other expansions, other trees, other diagrams. F = x·Fx + x'·Fx'. The standard Decision Tree is based on the Shannon expansion: a Shannon node for variable x splits all examples into Fx' (the x = 0 branch) and Fx (the x = 1 branch). Data separation in ML is the same concept as Shannon expansion in logic: a node testing WIND separates all examples for which WIND was WEAK from all examples for which WIND was STRONG.

  47. Absolutely Minimum Background on Binary Decision Diagrams (BDD) and Kronecker Functional Decision Diagrams • BDDs are based on the recursive Shannon expansion F = x·Fx + x'·Fx', where Fx is the positive cofactor of F with respect to variable x and Fx' is the negative cofactor • Compact data structure for Boolean logic • Can represent sets of objects (states) encoded as Boolean functions • Canonical representation: reduced ordered BDDs (ROBDD) are canonical • Essential for simulation, analysis, synthesis and verification

  48. BDD Construction • Typically done using the APPLY operator • Reduction rules: remove duplicate terminals; merge duplicate nodes (isomorphic subgraphs); remove redundant nodes • Redundant nodes are nodes with identical children. A sketch of these rules follows below. [BDD drawings omitted]
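
A minimal sketch of the two node-level reduction rules (merge duplicate nodes via a unique table, drop nodes with identical children); it is not a full APPLY implementation, and the tuple-based node representation is an assumption made for illustration.

```python
unique_table = {}

def mk_node(var, low, high):
    if low == high:                 # redundant node: both children identical
        return low
    key = (var, low, high)
    if key not in unique_table:     # merge duplicate (isomorphic) nodes
        unique_table[key] = key
    return unique_table[key]

ZERO, ONE = "0", "1"
n1 = mk_node("c", ZERO, ONE)
n2 = mk_node("c", ZERO, ONE)        # the same node is reused, not duplicated
n3 = mk_node("b", n1, n1)           # redundant node: collapses to n1
print(n1 is n2, n3 == n1)           # True True
```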

  49. BDD Construction – your first BDD. Construction of a Reduced Ordered BDD for f = ac + bc from its truth table and decision tree (solid 1-edges and dashed 0-edges over variables a, b, c). Truth table (a b c | f): 000|0, 001|0, 010|0, 011|1, 100|0, 101|1, 110|0, 111|1. A quick check of this table follows below. [decision tree drawing omitted]
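
As a quick check, the truth table above can be regenerated directly from f = ac + bc:

```python
from itertools import product

for a, b, c in product((0, 1), repeat=3):
    f = (a and c) or (b and c)
    print(a, b, c, "|", int(bool(f)))
```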

  50. BDD Construction – cont'd. f = (a+b)c. 1. Remove duplicate terminals 2. Merge duplicate nodes 3. Remove redundant nodes. [intermediate BDD drawings omitted]
