
Extending Propositional Satisfiability to Determine Minimal Fuzzy-Rough Reducts






  1. Extending Propositional Satisfiability to Determine Minimal Fuzzy-Rough Reducts

  2. Outline • The importance of feature selection • Rough set theory • Fuzzy-rough feature selection (FRFS) • FRFS-SAT • Experimentation • Conclusion

  3. Feature selection • Why dimensionality reduction/feature selection? • Growth of information - need to manage this effectively • Curse of dimensionality - a problem for machine learning • [Diagram: high-dimensional data → dimensionality reduction → low-dimensional data → processing system; feeding the high-dimensional data to the processing system directly is intractable]

  4. Rough set theory • [Diagram: set A with its lower and upper approximations, built from equivalence classes such as Rx] • Rx is the set of all points that are indiscernible from point x in terms of feature subset B (see the sketch below)
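To make the approximations concrete, here is a minimal Python sketch of crisp equivalence classes and the lower/upper approximations; the data layout and function names are illustrative assumptions, not taken from the slides.

```python
# Illustrative sketch (names are assumptions): crisp rough-set approximations.
# `data` maps object ids to {attribute: value} dicts, `B` is a feature
# subset, and `A` is a set of object ids (the concept being approximated).

def equivalence_class(data, B, x):
    """All objects indiscernible from x on the attributes in B."""
    return {y for y in data if all(data[y][a] == data[x][a] for a in B)}

def lower_approximation(data, B, A):
    """Objects whose whole equivalence class lies inside concept A."""
    return {x for x in data if equivalence_class(data, B, x) <= A}

def upper_approximation(data, B, A):
    """Objects whose equivalence class overlaps concept A."""
    return {x for x in data if equivalence_class(data, B, x) & A}
```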

  5. Discernibility approach • Decision-relative discernibility matrix • Compare each pair of objects • Examine their attribute values • For attributes that differ: • If the decision values also differ, include those attributes in the matrix entry • Else leave the entry blank • Construct the discernibility function as the conjunction of all non-empty entries, each read as a disjunction: fC(a1, …, am) = ⋀ { ⋁ Cij : Cij ≠ ∅ }, where Cij is the set of attributes that discern objects i and j (a sketch follows)
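A minimal sketch of this construction, assuming the dataset is given as (attribute dict, decision value) pairs; the names are illustrative, not from the slides.

```python
# Sketch of a decision-relative discernibility matrix.
# `rows` is a list of (attribute_dict, decision_value) pairs.

def discernibility_clauses(rows, attributes):
    clauses = []
    for i in range(len(rows)):
        for j in range(i):
            (xi, di), (xj, dj) = rows[i], rows[j]
            if di == dj:
                continue                      # same decision: leave slot blank
            diff = frozenset(a for a in attributes if xi[a] != xj[a])
            if diff:
                clauses.append(diff)          # disjunction of differing attributes
    return clauses                            # fC = conjunction of these clauses
```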

  6. Example • Remove duplicates fC(a,b,c,d) = {a ⋁ b ⋁ c ⋁ d} ⋀ {a ⋁ c ⋁ d} ⋀ {b ⋁ c} ⋀ {d} ⋀ {a ⋁ b ⋁ c} ⋀ {a ⋁ b ⋁ d} ⋀ {b ⋁ c ⋁ d} ⋀ {a ⋁ d} • Remove supersets fC(a,b,c,d) = {b ⋁ c} ⋀ {d}
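A short sketch of this simplification step: duplicates are removed, and any clause that is a proper superset of another is dropped, since it is absorbed by the smaller clause.

```python
# Sketch of clause simplification by absorption.

def simplify(clauses):
    unique = set(clauses)                       # remove duplicates
    return [c for c in unique
            if not any(d < c for d in unique)]  # drop proper supersets

# The slide's example, each clause written as a set of attribute names:
clauses = [frozenset(s) for s in
           ("abcd", "acd", "bc", "d", "abc", "abd", "bcd", "ad")]
print(simplify(clauses))   # two clauses remain: {b, c} and {d}
```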

  7. Finding reducts • Usually too expensive to search exhaustively for reducts of minimal cardinality • Reducts found through: • Converting from CNF to DNF (expensive) • Hill-climbing search using clauses (non-optimal) • Other search methods, e.g. genetic algorithms (non-optimal) • RSAR-SAT: solve directly in the SAT formulation • The DPLL approach is fast and guarantees minimal reducts (sketched below)
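The slides do not show RSAR-SAT's internals, so the following is only a hedged sketch of the general idea: a DPLL-style branch-and-bound over the (all-positive) discernibility clauses, which amounts to searching for a minimum hitting set. The bounding rule and branching heuristic here are assumptions, not the paper's algorithm.

```python
def minimal_reduct(clauses, attributes):
    """DPLL-style branch-and-bound sketch: find a smallest attribute subset
    hitting every clause. Complete because any satisfying subset must
    include at least one attribute of whichever clause we branch on."""
    best = [set(attributes)]                    # trivial reduct: all attributes

    def search(chosen, remaining):
        if len(chosen) >= len(best[0]):
            return                              # bound: cannot beat current best
        open_clauses = [c for c in remaining if not (c & chosen)]
        if not open_clauses:
            best[0] = set(chosen)               # every clause is satisfied
            return
        unit = next((c for c in open_clauses if len(c) == 1), None)
        if unit:                                # unit propagation: forced choice
            search(chosen | unit, open_clauses)
            return
        for a in min(open_clauses, key=len):    # branch on a shortest clause
            search(chosen | {a}, open_clauses)

    search(set(), clauses)
    return best[0]
```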

  8. Fuzzy discernibility matrices • Extension of the crisp approach • Previously, attributes had {0,1} membership to clauses • Now they have membership in [0,1] • Allows real-valued as well as nominal data • Fuzzy DMs can be used to find fuzzy-rough reducts
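As an illustration only (the similarity relation and the negation are assumptions; the framework admits other choices), a fuzzy clause can record, per attribute, the degree to which two objects are discernible:

```python
# Sketch: a fuzzy clause as {attribute: discernibility degree in [0, 1]}.
# Assumes the common similarity mu_Ra(x, y) = max(0, 1 - |x - y| / range_a)
# on real-valued attributes and the standard negation N(m) = 1 - m.

def fuzzy_clause(xi, xj, attributes, ranges):
    return {a: 1.0 - max(0.0, 1.0 - abs(xi[a] - xj[a]) / ranges[a])
            for a in attributes}
```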

  9. Formulation • Fuzzy satisfiability • In crisp SAT, a clause is fully satisfied if at least one variable in the clause has been set to true • For the fuzzy case, clauses may be satisfied to a certain degree depending on which variables have been assigned the value true
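A sketch of how satisfaction degrees might be computed, using max as the s-norm within a clause and min as the t-norm across clauses; this is one common choice of norms, not necessarily the paper's.

```python
def clause_satisfaction(clause, subset):
    """Degree to which one fuzzy clause is satisfied by the attributes
    assigned true (s-norm = max)."""
    return max((clause[a] for a in subset if a in clause), default=0.0)

def formula_satisfaction(clauses, subset):
    """Degree to which the whole formula is satisfied (t-norm = min).
    A fuzzy-rough reduct satisfies the formula to the same degree as
    the full attribute set does."""
    return min(clause_satisfaction(c, subset) for c in clauses)
```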

  10. Experimentation: setup • 9 benchmark datasets • Features – 10 to 39 • Objects – 120 to 690 • Methods used: • FRFS-SAT • Greedy hill-climbing: fuzzy dependency, fuzzy boundary region and fuzzy discernibility. • Evolutionary algorithms: genetic algorithms (GA) and particle swarm optimization (PSO) using fuzzy dependency • 10x10-fold cross validation • FS performed on the training folds, test folds reduced using discovered reducts
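A sketch of this evaluation protocol in scikit-learn terms; the classifier and the `select_features` callback are placeholders (the slides do not name the classifiers used).

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

def cross_validated_accuracy(X, y, select_features):
    """10x10-fold CV: run FS on each training fold only, then reduce the
    held-out fold to the same columns before testing."""
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
    scores = []
    for tr, te in cv.split(X, y):
        cols = list(select_features(X[tr], y[tr]))   # e.g. an FRFS-SAT reduct
        clf = KNeighborsClassifier().fit(X[tr][:, cols], y[tr])
        scores.append(clf.score(X[te][:, cols], y[te]))
    return float(np.mean(scores))
```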

  11. Experimentation: results

  12. Conclusion • Extended propositional satisfiability to enable search for fuzzy-rough reducts • New framework for fuzzy satisfiability • New DPLL algorithm • Fuzzy clause simplification • Future work: • Non-chronological backtracking • Better heuristics • Unsupervised FS • Other extensions in propositional satisfiability

  13. WEKA implementations of all fuzzy-rough feature selectors and classifiers can be downloaded from: • http://users.aber.ac.uk/rkj/book/weka.zip

  14. Feature selection • Feature selection (FS) is a dimensionality reduction technique that preserves data semantics (the meaning of the data) • Subset generation: forwards, backwards, random… • Evaluation function: determines the ‘goodness’ of subsets • Stopping criterion: decides when to stop the subset search • [Diagram: feature set → subset generation → subset evaluation (subset suitability) → stopping criterion (continue or stop) → validation]
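A sketch of that generic loop with greedy forward generation and a simple no-improvement stopping rule; the evaluation function is pluggable (e.g. fuzzy dependency), and the names here are illustrative.

```python
def forward_selection(features, evaluate):
    """Greedy forward subset generation: add the single feature that most
    improves the evaluation score; stop when no candidate improves it."""
    subset, best = set(), evaluate(set())
    while subset != features:
        scored = {f: evaluate(subset | {f}) for f in features - subset}
        f, score = max(scored.items(), key=lambda kv: kv[1])
        if score <= best:
            break                      # stopping criterion: no improvement
        subset, best = subset | {f}, score
    return subset
```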

  15. Algorithm

  16. Example
