Data mining II: The fuzzy way
Włodzisław Duch, Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland
http://www.phys.uni.torun.pl/~duch
ISEP Porto, 8-12 July 2002
Basic ideas
• Complex problems cannot be analyzed precisely.
• Knowledge of an expert may be approximated using imprecise concepts. Example: if the weather is nice and the place is attractive, then not many participants stay at the workshop.
Fuzzy logic/systems include:
• Mathematics of fuzzy sets/systems, fuzzy logics.
• Fuzzy methods of knowledge representation for clustering, classification and regression.
• Extraction of fuzzy concepts and rules from data.
• Fuzzy control theory.
Types of uncertainty
• Stochastic uncertainty: rolling dice, accident, insurance risk… (probability theory).
• Measurement uncertainty: about 3 cm; 20 degrees (statistics).
• Information uncertainty: trustworthy client, known constraints (data mining).
• Linguistic uncertainty: small, fast, low price (fuzzy logic).
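Crisp sets
$\mathrm{young} = \{\, x \in M \mid \mathrm{age}(x) \le 20 \,\}$
$\mu_{young}(x) = \begin{cases} 1 & \mathrm{age}(x) \le 20 \\ 0 & \mathrm{age}(x) > 20 \end{cases}$
[Figure: step membership function for A = "young" over x [years].]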
Fuzzy sets
X: universe (space); $x \in X$.
A: linguistic variable, concept, fuzzy set.
$\mu_A$: a membership function (MF), determining the degree to which x belongs to A.
• Linguistic variables, concepts: sums of fuzzy sets.
• Logical predicate functions with continuous values.
• A membership value is different from a probability: $\mu(\text{bald}) = 0.8$ does not mean bald in 4 of 5 cases.
• Probabilities are normalized to 1, MFs are not.
• Fuzzy concepts are subjective and context-dependent.
Fuzzy examples
Crisp and fuzzy concept "young men".
[Figure: two panels showing the MF of A = "young" over x [years], one crisp and one fuzzy; marked values: x = 20, x = 23, $\mu$ = 0.8.]
"Boiling temperature" has a value around 100 degrees (depending on pressure, chemistry).
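A few definitions
• Support of a fuzzy set A: $\mathrm{supp}(A) = \{\, x \in X : \mu_A(x) > 0 \,\}$
• Core of a fuzzy set A: $\mathrm{core}(A) = \{\, x \in X : \mu_A(x) = 1 \,\}$
• $\alpha$-cut of a fuzzy set A: $A_\alpha = \{\, x \in X : \mu_A(x) > \alpha \,\}$ (the figure uses $\alpha = 0.6$)
• Height: $\max_x \mu_A(x) \le 1$
• Normal fuzzy set: $\sup_{x \in X} \mu_A(x) = 1$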
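The definitions above can be tried on a small discrete fuzzy set. The sketch below stores a fuzzy set as a Python dict from elements to membership degrees; the set `A` and all function names are illustrative assumptions, not part of the original slides. The α-cut uses the strict inequality given in the definition above.

```python
# A discrete fuzzy set as {element: membership degree}; the values are made up.
A = {1: 0.0, 2: 0.3, 3: 0.7, 4: 1.0, 5: 0.6, 6: 0.0}

def support(fs):
    """Elements with membership strictly greater than 0."""
    return {x for x, m in fs.items() if m > 0}

def core(fs):
    """Elements with full membership."""
    return {x for x, m in fs.items() if m == 1}

def alpha_cut(fs, alpha):
    """Elements with membership strictly greater than alpha."""
    return {x for x, m in fs.items() if m > alpha}

def height(fs):
    """Largest membership degree in the set."""
    return max(fs.values())

def is_normal(fs):
    """A fuzzy set is normal when its height equals 1."""
    return height(fs) == 1

print(support(A), core(A), alpha_cut(A, 0.6), height(A), is_normal(A))
```

With α = 0.6 the element 5 (membership exactly 0.6) is excluded, which is the only place where the strict and non-strict versions of the α-cut differ.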
Definitions illustrated
[Figure: a membership function over X with its core, crossover points (MF = 0.5), $\alpha$-cut and support marked.]
Types of MF
• Trapezoid: <a, b, c, d>
• Gauss/Bell: N(m, s)
[Figure: trapezoidal MF $\mu(x)$ with corner points a, b, c, d; bell-shaped MF $\mu(x)$ centered at c with width s.]
MF example
• Singleton: (a, 1) and (b, 0.5)
• Triangular: <a, b, c>
[Figure: singleton MF with values at x = a and x = b; triangular MF $\mu(x)$ over the points a, b, c.]
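A minimal sketch of the MF shapes listed above, assuming a < b ≤ c < d for the trapezoid and a Gaussian scaled so that its peak equals 1 (the slides only write N(m, s), so this scaling is an assumption). Function names are illustrative.

```python
import math

def trapezoid(x, a, b, c, d):
    """Trapezoidal MF <a,b,c,d>: rises on [a,b], equals 1 on [b,c], falls on [c,d].
    Assumes a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def triangle(x, a, b, c):
    """Triangular MF <a,b,c>: a trapezoid whose plateau shrinks to the point b."""
    return trapezoid(x, a, b, b, c)

def gaussian(x, m, s):
    """Bell-shaped MF centered at m with width s, scaled to peak at 1 (assumption)."""
    return math.exp(-((x - m) ** 2) / (2.0 * s ** 2))

# Singletons can be stored directly as points with degrees, e.g. (a, 1) and (b, 0.5).
singleton = {2.0: 1.0, 5.0: 0.5}

print(triangle(15, 10, 20, 30), trapezoid(7, 0, 2, 6, 8), gaussian(3, 3, 1))
```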
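Linguistic variables
W = 20 ⇒ Age = young (a numeric value is mapped to a linguistic value).
Form: linguistic variable = linguistic value.
Linguistic variable: temperature; its terms (fuzzy sets): { cold, warm, hot }.
[Figure: MFs $\mu(x)$ for cold, warm and hot over x [°C], with ticks at 20 and 40.]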
Fuzzy numbers
• MFs are usually convex, with a single maximum.
• MFs for similar numbers overlap.
• Numbers: the core is a single point x with $\mu(x) = 1$.
• MFs decrease monotonically on both sides of the core.
• Typically: triangular functions (a, b, c) or singletons.
Sums of fuzzy sets
A, B: fuzzy sets. The sum $A \cup B$ is a fuzzy set with the following MF:
$\mu_{A \cup B}(x) = \max\{\mu_A(x), \mu_B(x)\}$
Instead of max, S-norms S(a, b) may be used:
• Boundary: S(1, 1) = 1, S(a, 0) = S(0, a) = a
• Monotonicity: S(a, b) ≤ S(c, d) if a ≤ c and b ≤ d
• Commutativity: S(a, b) = S(b, a)
• Associativity: S(a, S(b, c)) = S(S(a, b), c)
Products of fuzzy sets
A, B: fuzzy sets. The product $A \cap B$ of two fuzzy sets has the MF:
$\mu_{A \cap B}(x) = \min\{\mu_A(x), \mu_B(x)\}$
Instead of min, T-norms T(a, b) may be used:
• Boundary: T(0, 0) = 0, T(a, 1) = T(1, a) = a
• Monotonicity: T(a, b) ≤ T(c, d) if a ≤ c and b ≤ d
• Commutativity: T(a, b) = T(b, a)
• Associativity: T(a, T(b, c)) = T(T(a, b), c)
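Sums/Products
Sum:
• $\mu_{A \cup B}(x) = \max\{\mu_A(x), \mu_B(x)\}$
• $\mu_{A \cup B}(x) = \min\{1, \mu_A(x) + \mu_B(x)\}$
Product:
• $\mu_{A \cap B}(x) = \min\{\mu_A(x), \mu_B(x)\}$
• $\mu_{A \cap B}(x) = \mu_A(x) \cdot \mu_B(x)$
[Figure: four panels showing $\mu_A(x)$, $\mu_B(x)$ and the resulting sum or product MF for each formula.]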
T-norms & S-norms
Typical T-norms and their co-norms (S-norms):
• T(a, b): AND(a, b), MIN(a, b), a•b, MAX(0, a+b-1) ...
• S(a, b): OR(a, b), MAX(a, b), a+b-a•b, MIN(1, a+b) ...
De Morgan laws:
• S(a, b) = 1 - T(1-a, 1-b)
• T(a, b) = 1 - S(1-a, 1-b)
Examples:
• max(a, b) = 1 - min(1-a, 1-b)
• a•b = 1 - (1-a) - (1-b) + (1-a)•(1-b)
• max(0, a+b-1) = 1 - min(1, 1-a+1-b)
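The De Morgan relation S(a, b) = 1 - T(1-a, 1-b) can be checked numerically for the three T-norm/S-norm pairs listed above. This is a small sketch; the dictionary names, the grid resolution and the tolerance are arbitrary choices made for illustration.

```python
# T-norms and the S-norms (co-norms) paired with them in the list above.
T_NORMS = {
    "min": lambda a, b: min(a, b),
    "product": lambda a, b: a * b,
    "lukasiewicz": lambda a, b: max(0.0, a + b - 1.0),
}
S_NORMS = {
    "min": lambda a, b: max(a, b),
    "product": lambda a, b: a + b - a * b,
    "lukasiewicz": lambda a, b: min(1.0, a + b),
}

def de_morgan_holds(t_norm, s_norm, steps=10, tol=1e-12):
    """Check S(a,b) == 1 - T(1-a, 1-b) on a regular grid over [0,1] x [0,1]."""
    grid = [i / steps for i in range(steps + 1)]
    return all(
        abs(s_norm(a, b) - (1.0 - t_norm(1.0 - a, 1.0 - b))) <= tol
        for a in grid
        for b in grid
    )

for name in T_NORMS:
    print(name, de_morgan_holds(T_NORMS[name], S_NORMS[name]))
```

A grid check is not a proof, but it quickly catches a mismatched pair, for example min combined with the bounded sum.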
Min/Max
[Figure: surface plots of MIN(a,b) and a•b, and of MAX(a,b) and a+b.]
Complements and subsets
The complement A' of a set A has the MF:
$\mu_{A'}(x) = 1 - \mu_A(x)$
For a 2-element universe the set of all fuzzy sets forms a square: crisp sets are in the corners; the middle set is maximally fuzzy.
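Sums and products of fuzzy numbers
Sums: $\mu_{A+B}(x) = \max_{x = y + z} \min\{\mu_A(y), \mu_B(z)\}$
Products: $\mu_{A \cdot B}(x) = \max_{x = y z} \min\{\mu_A(y), \mu_B(z)\}$
[Figure: MFs $\mu_A(y)$, $\mu_B(z)$ and the resulting $\mu_{A+B}(x)$ and $\mu_{A \cdot B}(x)$.]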
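A sketch of fuzzy-number addition on discrete supports, assuming the standard max-min form of the extension principle: for every decomposition x = y + z take min of the two degrees, then keep the largest value per x. The fuzzy numbers "about 2" and "about 3" are made-up examples.

```python
def fuzzy_add(A, B):
    """Max-min sum of two discrete fuzzy numbers given as {value: degree}."""
    result = {}
    for y, mu_a in A.items():
        for z, mu_b in B.items():
            x = y + z
            degree = min(mu_a, mu_b)
            # Keep the best degree over all ways of writing x as y + z.
            result[x] = max(result.get(x, 0.0), degree)
    return result

about_2 = {1: 0.5, 2: 1.0, 3: 0.5}
about_3 = {2: 0.5, 3: 1.0, 4: 0.5}
about_5 = fuzzy_add(about_2, about_3)
print(sorted(about_5.items()))
```

The result peaks at 5 and is wider than either operand, which matches the intuition that adding two imprecise numbers increases imprecision. Replacing `y + z` with `y * z` gives the product in the same way.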
Fuzzy functions
Given a fuzzy set A and a function f, what does f(A) look like?
$\mu_{f(A)}(y) = \max\{\mu_A(x) \mid y = f(x)\}$
[Figure: two panels mapping $\mu_A(x)$ through f onto $\mu_{f(A)}(y)$; where several x map to the same y, the max is taken.]
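The formula for f(A) can be illustrated on a discrete fuzzy set. In this sketch the set `A` and the function f(x) = x² are arbitrary choices; the function name `image` is an assumption made for illustration.

```python
def image(A, f):
    """Image of a discrete fuzzy set A = {x: degree} under a crisp function f.
    Each y gets the largest degree among all x with f(x) == y."""
    result = {}
    for x, mu in A.items():
        y = f(x)
        result[y] = max(result.get(y, 0.0), mu)
    return result

A = {-2: 0.2, -1: 0.6, 0: 1.0, 1: 0.8, 2: 0.4}
squared = image(A, lambda x: x * x)
print(sorted(squared.items()))
```

Because x = -1 and x = 1 both map to y = 1, the degree of y = 1 is max(0.6, 0.8); this is the non-injective case marked "max" in the figure.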
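Interval functions
For functions of numbers, if y = f(x) and x = a, then y = b. For a crisp function the image of a point lies on the curve y = f(x); for an interval the image is a range of values.
[Figure: two panels of y = f(x), one mapping the point a to b, one mapping an interval of x to a range of y.]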
Fuzzy relations
• Classical relations $R \subseteq X \times Y$: $\mu_R(x,y) = 1$ iff $(x,y) \in R$, and $\mu_R(x,y) = 0$ iff $(x,y) \notin R$.
• Fuzzy relations $R \subseteq X \times Y$: $\mu_R(x,y) \in [0,1]$.
$\mu_R(x,y)$ describes the degree of relation between x and y. Another interpretation: it reflects the degree of truth of the sentence "x R y".
Examples of fuzzy relations
Approximately equal: X ≈ Y; X depends on Y; X similar to Y ...
X = { rainy, cloudy, sunny }
Y = { beach, roller-skates, camping, reading }

X/Y      beach  roller-skates  camping  reading
rainy    0.0    0.2            0.0      1.0
cloudy   0.0    0.8            0.3      0.3
sunny    1.0    0.2            0.7      0.0

Degree? Rather correlations or probability.
Fuzzy rules
Commonsense knowledge may sometimes be captured in a natural way using fuzzy rules.
IF L-variable-1 = term-1 and L-variable-2 = term-2 THEN L-variable-3 = term-3
IF Temperature = hot and air-condition price = low THEN cooling = strong
What does a fuzzy rule "IF x is A THEN y is B" mean?
Fuzzy implication
If ⇒ means correlation, a T-norm T(A, B) is sufficient. A ⇒ B has many realizations.
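Interpretation of implication
If x is A then y is B: correlation or implication.
• Implication (A entails B): A ⇒ B ≡ not A or B
• Correlation: A ⇒ B ≡ A and B
[Figure: two panels over the x-y plane with A on the x axis and B on the y axis, one for each interpretation.]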
Various implications
Kleene-Dienes, Goguen, Sharp, restricted sum, probabilistic ... Many relations are derived from the Łukasiewicz multi-valued logic.
Single rule
If x is A then y is B. Fact: x is A'; conclusion: y is B'.
It is easy to generalize to many conditions: if x is A and y is B then z is C.
[Figure: fact A' overlaps premise A on X with degree w; the conclusion B on Y is cut at w, giving B'.]
Types of rules
• FIR, Fuzzy Implication Rules. Logic of implications between fuzzy facts.
• FMR, Fuzzy Mapping Rules. Functional dependencies, fuzzy graphs, approximation problems.
  Mamdani type: IF MF_A(x) = high THEN MF_B(y) = medium.
  Takagi-Sugeno type: IF MF_A(x) = high THEN y = f_A(x). Linear f_A(x): first-order Sugeno type.
• FIS, Fuzzy Inference Systems. Combine fuzzy rules to calculate final decisions.
Fuzzy approximation
• Fuzzy systems $F: \mathbb{R}^n \to \mathbb{R}^p$ use m rules to map a vector x onto the output F(x), a vector or a scalar.
Singleton model: $R_i$: IF x is $A_i$ THEN y is $b_i$
Rules base

Heating price \ Temperature   freezing  cold    chilly
cheap                         full      full    medium
so-so                         full      medium  weak
expensive                     medium    weak    no

Examples:
IF Temperature = freezing and Heating-price = cheap THEN heating = full
IF Temperature = chilly and Heating-price = expensive THEN heating = no
1. Fuzzification
Fuzzification: from measured values to MFs. Determine membership degrees for all fuzzy sets (linguistic variables):
• Temperature: T = 15 °C, $\mu_{chilly}(T) = 0.5$
• Heating-price: p = 48 Euro/MBtu, $\mu_{cheap}(p) = 0.3$
IF Temperature = chilly and Heating-price = cheap ...
[Figure: MF for chilly evaluated at t = 15 °C and MF for cheap evaluated at p = 48 Euro/MBtu.]
2. Term composition
Calculate the degree of rule fulfillment for all conditions, combining terms using a fuzzy AND, e.g. the MIN operator:
$\mu_A(X) = \mu_{A_1}(X_1) \wedge \mu_{A_2}(X_2) \wedge \dots \wedge \mu_{A_N}(X_N)$ for rule $R_A$
IF Temperature = chilly and Heating-price = cheap ...
$\mu_{all}(X) = \min\{\mu_{chilly}(t), \mu_{cheap}(p)\} = \min\{0.5, 0.3\} = 0.3$
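Steps 1 and 2 can be sketched as follows. The slides do not give the shapes of the MFs, so the two functions below are hypothetical: they were chosen only so that they reproduce the degrees quoted above (0.5 at 15 °C and 0.3 at 48 Euro/MBtu).

```python
def chilly(t):
    """Hypothetical triangular MF <10, 20, 30> for 'chilly' (degrees C)."""
    return max(0.0, 1.0 - abs(t - 20.0) / 10.0)

def cheap(p):
    """Hypothetical left-shoulder MF: 1 up to 20, falling linearly to 0 at 60 (Euro/MBtu)."""
    if p <= 20.0:
        return 1.0
    if p >= 60.0:
        return 0.0
    return (60.0 - p) / 40.0

def rule_degree(*degrees):
    """Fuzzy AND of all condition degrees using the MIN operator."""
    return min(degrees)

mu_chilly = chilly(15.0)   # step 1: fuzzification of the temperature
mu_cheap = cheap(48.0)     # step 1: fuzzification of the price
mu_all = rule_degree(mu_chilly, mu_cheap)  # step 2: term composition
print(mu_chilly, mu_cheap, mu_all)
```

Swapping `min` for a product inside `rule_degree` would give the product T-norm as the AND operator instead.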
3. Inference
Calculate the degree of truth of the rule conclusion: use T-norms such as MIN or product to combine the degree of fulfillment of the conditions with the MF of the conclusion.
... THEN Heating = full, with $\mu_{cond} = 0.3$:
• MIN inference: $\mu_{concl}(h) = \min\{\mu_{cond}, \mu_{full}(h)\}$
• Product inference: $\mu_{concl}(h) = \mu_{cond} \cdot \mu_{full}(h)$
[Figure: the MF $\mu_{full}(h)$ clipped at 0.3 (MIN inference) and scaled by 0.3 (product inference).]
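The two inference variants differ only in how the condition degree reshapes the conclusion MF: MIN clips it, the product scales it. A minimal sketch on a sampled MF; the sample values of `full` are invented for illustration.

```python
def clip(cond, mf_values):
    """MIN inference: cut the conclusion MF off at the condition degree."""
    return [min(cond, m) for m in mf_values]

def scale(cond, mf_values):
    """Product inference: multiply the conclusion MF by the condition degree."""
    return [cond * m for m in mf_values]

# Hypothetical samples of the MF 'full' on a coarse grid of heating levels h.
full = [0.0, 0.25, 0.5, 0.75, 1.0]
cond = 0.3

clipped = clip(cond, full)
scaled = scale(cond, full)
print(clipped)
print(scaled)
```

Both results never exceed the condition degree 0.3; clipping produces a plateau, while scaling preserves the shape of the original MF.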
4. Aggregation
Aggregate all possible rule conclusions using the MAX operator to calculate the sum.
[Figure: the conclusion MFs for "THEN Heating = full", "THEN Heating = medium" and "THEN Heating = no" combined into one MF over h.]
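On sampled MFs the MAX aggregation is a pointwise maximum over the rule conclusions. The three sampled conclusions below are invented for illustration and are assumed to share the same grid of h values.

```python
def aggregate(*conclusions):
    """Pointwise MAX over several conclusion MFs sampled on the same grid."""
    return [max(values) for values in zip(*conclusions)]

# Hypothetical clipped conclusions of three rules, sampled at five heating levels.
concl_full = [0.0, 0.0, 0.0, 0.2, 0.3]
concl_medium = [0.0, 0.4, 0.5, 0.4, 0.0]
concl_no = [0.1, 0.1, 0.0, 0.0, 0.0]

total = aggregate(concl_full, concl_medium, concl_no)
print(total)
```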
5. Defuzzification
Calculate a crisp value/decision using, for example, the "Center of Gravity" (COG) method. For discrete sets use a "center of singletons"; for continuous sets:
$h = \frac{\sum_i \mu_i A_i c_i}{\sum_i \mu_i A_i}$
where $\mu_i$ = degree of membership in set i, $A_i$ = area under the MF for set i, $c_i$ = center of gravity for set i.
[Figure: aggregated conclusion MF $\mu_{concl}(h)$ with the COG marked at h = 73.]
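The COG formula above translates directly into a weighted average. In the sketch below the degrees, areas and centers of the three output sets are hypothetical numbers, so the result does not reproduce the value shown in the figure.

```python
def center_of_gravity(degrees, areas, centers):
    """h = sum(mu_i * A_i * c_i) / sum(mu_i * A_i) over the output fuzzy sets."""
    weights = [mu * area for mu, area in zip(degrees, areas)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no output set is active, COG is undefined")
    return sum(w * c for w, c in zip(weights, centers)) / total

# Hypothetical output sets: full, medium, no (heating level in percent).
degrees = [0.3, 0.5, 0.0]
areas = [20.0, 20.0, 20.0]
centers = [90.0, 50.0, 10.0]

h = center_of_gravity(degrees, areas, centers)
print(h)
```

With equal areas the formula reduces to a "center of singletons": a mean of the centers weighted by the rule degrees only.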
FIS for heating
Pipeline: measured temperature T → fuzzification → inference → defuzzification → output v that controls the valve position.
Rule base:
• if temp = freezing then valve = open
• if temp = cold then valve = half open
• if temp = warm then valve = closed
Fuzzification example: freeze = 0.7, cold = 0.2, hot = 0.0.
[Figure: input MFs (freeze, cold, warm) over T and output MFs (full, half, closed) over v, with activation levels 0.7 and 0.2.]
Takagi-Sugeno rules
Mamdani rules conclude a fuzzy set:
IF X1 = A1 and X2 = A2 … Xn = An THEN Y = B
TS rules conclude some functional dependence f(xi):
IF X1 = A1 and X2 = A2 … Xn = An THEN Y = f(x1, x2, .., xn)
TS rules are usually based on piecewise linear functions (equivalent to linear spline approximation):
IF X1 = A1 and X2 = A2 … Xn = An THEN Y = a0 + a1 x1 + … + an xn
Fuzzy system in Matlab
rulelist = [1 1 3 1 1
            1 2 3 1 1
            1 3 2 1 1
            2 1 3 1 1
            2 2 2 1 1
            2 3 1 1 1
            3 1 2 1 1
            3 2 3 1 1
            3 3 3 1 1];
fis = addrule(fis, rulelist);
showrule(fis)
gensurf(fis);
surfview(fis);
Columns of rulelist: first input, second input, output, rule weight, operator (1 = AND, 2 = OR).
Output of showrule(fis):
1. If (temperature is cold) and (oilprice is normal) then (heating is high) (1)
2. If (temperature is cold) and (oilprice is expensive) then (heating is medium) (1)
3. If (temperature is warm) and (oilprice is cheap) then (heating is high) (1)
4. If (temperature is warm) and (oilprice is normal) then (heating is medium) (1)
5. If (temperature is cold) and (oilprice is cheap) then (heating is high) (1)
6. If (temperature is warm) and (oilprice is expensive) then (heating is low) (1)
7. If (temperature is hot) and (oilprice is cheap) then (heating is medium) (1)
8. If (temperature is hot) and (oilprice is normal) then (heating is low) (1)
9. If (temperature is hot) and (oilprice is expensive) then (heating is low) (1)
Fuzzy Inference System (FIS)
• IF speed is slow then brake = 2
• IF speed is medium then brake = 4 * speed
• IF speed is high then brake = 8 * speed
For speed = 2 the MFs give:
• R1: w1 = .3; r1 = 2
• R2: w2 = .8; r2 = 4*2
• R3: w3 = .1; r3 = 8*2
Brake = $\sum_i w_i r_i / \sum_i w_i$ = 8.6 / 1.2 ≈ 7.17
[Figure: MF(speed) for slow, medium and high, with degrees .3, .8 and .1 at speed = 2.]
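The weighted average of the three rule outputs can be recomputed in a few lines. The firing strengths are taken from the slide as given numbers (the MFs that produce them are not specified); the function name is an assumption.

```python
def sugeno_output(weights, outputs):
    """Weighted average of rule outputs: sum(w_i * r_i) / sum(w_i)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no rule fires")
    return sum(w * r for w, r in zip(weights, outputs)) / total

speed = 2
weights = [0.3, 0.8, 0.1]               # firing strengths of R1, R2, R3 at speed = 2
outputs = [2, 4 * speed, 8 * speed]     # rule conclusions r1, r2, r3

brake = sugeno_output(weights, outputs)  # 8.6 / 1.2
print(round(brake, 2))
```

The numerator is 0.3·2 + 0.8·8 + 0.1·16 = 8.6 and the denominator is 1.2, so the output is dominated by the medium-speed rule.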
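First-order TS FIS
Rules:
• IF X is A1 and Y is B1 then Z = p1*x + q1*y + r1
• IF X is A2 and Y is B2 then Z = p2*x + q2*y + r2
Fuzzy inference for the inputs x = 3, y = 2: rule i fires with strength $w_i$ (product node Π of its membership degrees) and proposes $z_i = p_i x + q_i y + r_i$; the output is
$z = \frac{w_1 z_1 + w_2 z_2}{w_1 + w_2}$
[Figure: MFs A1, B1 and A2, B2 evaluated at x = 3, y = 2, giving w1, z1 and w2, z2.]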
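A sketch of first-order Takagi-Sugeno inference for the inputs x = 3, y = 2 used above. The slides do not specify the MFs or the coefficients, so the triangular MFs and the values of p, q, r below are hypothetical; the firing strength is taken as the product of the two membership degrees.

```python
def tri(x, centre, half_width):
    """Symmetric triangular MF (hypothetical shape)."""
    return max(0.0, 1.0 - abs(x - centre) / half_width)

# Each rule: premise MFs A and B plus linear consequent coefficients p, q, r.
RULES = [
    {"A": lambda x: tri(x, 2.0, 4.0), "B": lambda y: tri(y, 1.0, 2.0),
     "p": 1.0, "q": 1.0, "r": 0.0},
    {"A": lambda x: tri(x, 4.0, 4.0), "B": lambda y: tri(y, 3.0, 2.0),
     "p": 2.0, "q": 0.0, "r": 1.0},
]

def ts_first_order(x, y, rules):
    """z = sum(w_i * z_i) / sum(w_i), with w_i = A_i(x) * B_i(y)."""
    num = 0.0
    den = 0.0
    for rule in rules:
        w = rule["A"](x) * rule["B"](y)                      # firing strength
        z = rule["p"] * x + rule["q"] * y + rule["r"]        # linear consequent
        num += w * z
        den += w
    if den == 0.0:
        raise ValueError("no rule fires for this input")
    return num / den

z = ts_first_order(3.0, 2.0, RULES)
print(z)
```

Here both rules fire with the same strength (0.375), so the output is the plain average of z1 = 5 and z2 = 7.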
Induction of fuzzy rules
Choices/adaptive parameters in fuzzy rules:
• The number of rules.
• The number of terms for each attribute.
• Position of the membership function (MF).
• MF shape for each attribute/term.
• Type of rules (conclusions).
• Type of inference and composition operators.
• Induction algorithms: incremental or refinement.
• Type of learning procedure.
Feature space partition
[Figure: two ways of partitioning the feature space: a regular grid and independent functions.]
MFs on a grid
• Advantage: simplest approach.
• Regular grid: divide each dimension into a fixed number of MFs and assign an average value from all samples that belong to the region.
• Irregular grid: find the largest error and divide the grid there in two parts, adding a new MF.
• Mixed method: start from a regular grid, adapt parameters later.
• Disadvantages: for k dimensions and N MFs in each, $N^k$ areas are created! Poor quality of approximation.
• Combs' proposal: linearize, use the same number of classes as fuzzy sets for every feature.
Optimized MFs
• Advantages: higher accuracy, better approximation, fewer functions, context-dependent MFs.
• Optimized MFs may come from:
  • Neurofuzzy systems: equivalent to an RBF network with Gaussian functions (several proofs); FSM models with triangular or trapezoidal functions; modified MLP networks with bicentral functions, etc.
  • Decision trees, fuzzy decision trees.
  • Fuzzy machine learning inductive systems.
• Disadvantages: extraction of rules is hard, optimized MFs are more difficult to create.
Improving sets of rules
How to improve known sets of rules?
• Use minimization methods to improve the parameters of fuzzy rules: usually non-gradient methods are used, most often genetic algorithms.
• Change the rules into a neural network, train the network and convert it into rules again.
• Use heuristic methods for local adaptation of the parameters of individual rules.
Fuzzy logic is good for modeling imprecise knowledge, but ...
• What do the decision borders of a FIS look like?
• Is it worthwhile to make the input fuzzy and the output crisp?
• Is it the best approximation method?
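ANFIS (Adaptive Neuro-Fuzzy Inference System)
Inference: rule 1 (A1, B1) gives z1 = p1*x + q1*y + r1 with weight w1; rule 2 (A2, B2) gives z2 = p2*x + q2*y + r2 with weight w2; the output is $z = \frac{w_1 z_1 + w_2 z_2}{w_1 + w_2}$.
[Figure: ANFIS network. Inputs x and y feed the MF nodes A1, A2, B1, B2; product nodes Π compute w1 and w2; summation nodes Σ compute $\sum_i w_i z_i$ and $\sum_i w_i$; a division node outputs z.]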
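Partition of the input space
ANFIS (Adaptive Neuro-Fuzzy Inference System) with 4 rules.
[Figure: the x-y input space partitioned by the MFs A1, A2 (on x) and B1, B2 (on y); ANFIS network with four product nodes Π producing w1 ... w4, summation nodes Σ computing $\sum_i w_i z_i$ and $\sum_i w_i$, and a division node producing z.]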
ANFIS: parameter identification
• Hybrid training methods: BP + LMS. BP adapts the nonlinear parameters, LMS the linear parameters.

                         forward pass  backward pass
MF params (nonlinear)    constant      gradient
Coefficients (linear)    LMS           constant

[Figure: ANFIS network with four rules; MF nodes hold the nonlinear parameters, consequent nodes hold the linear parameters.]
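The forward half of the hybrid scheme can be sketched as follows: the MF parameters are held constant and only the linear consequent coefficients are adapted with an LMS (Widrow-Hoff) update. This is not a full ANFIS implementation: the backward gradient step for the MFs is omitted, and the one-input, two-rule model, the Gaussian MFs, the learning rate and the target function are all assumptions made for illustration.

```python
import math

def gauss(x, m, s):
    return math.exp(-((x - m) ** 2) / (2.0 * s ** 2))

# Fixed (not trained) premise MFs as (centre, width); illustrative values.
PREMISES = [(0.0, 1.0), (2.0, 1.0)]

def features(x):
    """Regressors of the linear consequents: normalized firing strength times (x, 1)."""
    w = [gauss(x, m, s) for m, s in PREMISES]
    total = sum(w)
    phi = []
    for wi in w:
        wn = wi / total
        phi.extend([wn * x, wn])   # these multiply p_i and r_i
    return phi

def predict(theta, x):
    return sum(t * f for t, f in zip(theta, features(x)))

def mse(theta, data):
    return sum((y - predict(theta, x)) ** 2 for x, y in data) / len(data)

def lms_epoch(theta, data, eta=0.1):
    """One pass of sample-by-sample LMS updates of the linear coefficients."""
    for x, y in data:
        phi = features(x)
        err = y - sum(t * f for t, f in zip(theta, phi))
        theta = [t + eta * err * f for t, f in zip(theta, phi)]
    return theta

# Toy target y = 2x + 1 sampled at x = 0, 0.5, ..., 2.
data = [(i / 2.0, 2.0 * (i / 2.0) + 1.0) for i in range(5)]
theta = [0.0, 0.0, 0.0, 0.0]        # [p1, r1, p2, r2]
mse_before = mse(theta, data)
for _ in range(200):
    theta = lms_epoch(theta, data)
mse_after = mse(theta, data)
print(mse_before, mse_after)
```

Because the output is linear in the coefficients once the MFs are fixed, this half of the training is an ordinary linear regression problem; the MF parameters enter nonlinearly and are the part left to backpropagation.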
Neurofuzzy systems
Fuzzy: $\mu(x) \in \{0, 1\}$ (no/yes) is replaced by a degree $\mu(x) \in [0, 1]$. Triangular, trapezoidal, Gaussian ... MFs.
Feature Space Mapping (FSM) neurofuzzy system: neural adaptation, estimation of the probability density function (PDF) using a single-hidden-layer network (RBF-like) with nodes realizing separable functions.
MFs in many dimensions: