This study explores the use of Genetic Algorithms (GAs) to evolve simple yet accurate binary decision trees. The approach applies straightforward genetic operators to tree structures and is evaluated on diverse UCI datasets, where it yields small trees with competitive accuracy. The authors argue that GAs have advantages over traditional hill-climbing techniques, particularly in complex search spaces with conditionally dependent or irrelevant attributes. The proposed method consists of selecting the desired tree characteristics, designing a matching fitness function, and evolving decision trees under it. Future work aims to reduce evolution time, improve node statistics, and choose the output class by majority vote over the evolved trees.
GATree: Genetically Evolved Decision Trees
Papagelis Athanasios - Kalles Dimitrios
Computer Technology Institute
Introduction
• We use GAs to evolve simple and accurate binary decision trees
• Simple genetic operators over tree structures
• Experiments with UCI datasets
  • Very good (small) tree size
  • Competitive accuracy results
Why should it work?
• GAs are not
  • Hill climbers
    • Blind on complex search spaces
  • Exhaustive searchers
    • Extremely expensive
• They are…
  • Beam searchers
    • They balance the time needed against the space searched
The question…
• Are there datasets where hill-climbing techniques are really inadequate?
  • e.g. unnecessarily big, misleading output
• Yes, there are…
  • Conditionally dependent attributes
    • e.g. XOR
  • Irrelevant attributes
    • Many solutions use GAs as a preprocessor to select adequate attributes
• Direct genetic search can prove more efficient for those datasets
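The XOR case can be made concrete with a small sketch. It assumes the hill climber is a greedy learner that picks splits by information gain (the slides do not name a specific criterion), and it uses a made-up dataset with two relevant binary attributes and one irrelevant "noise" attribute. Every single-attribute split scores zero, so the greedy step cannot tell the relevant attributes from the irrelevant one.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from splitting on the binary attribute at index attr."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[attr] == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain

# Hypothetical dataset: attributes are (a, b, noise); the class is a XOR b.
rows = [(a, b, n) for a in (0, 1) for b in (0, 1) for n in (0, 1)]
labels = [a ^ b for a, b, _ in rows]

# Gain of each attribute taken alone: all three are zero.
gains = [information_gain(rows, labels, i) for i in range(3)]
print(gains)

# Only after splitting on a does b become informative (gain of one full bit).
rows_a0 = [x for x in rows if x[0] == 0]
labels_a0 = [y for x, y in zip(rows, labels) if x[0] == 0]
gain_b_after_a = information_gain(rows_a0, labels_a0, 1)
print(gain_b_after_a)
```

A search that evaluates whole trees, rather than one split at a time, is not misled by the zero first-level gains.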
The proposed solution
• Select the desired decision tree characteristics (e.g. small size)
• Create an appropriate fitness function
• Adopt a decision tree representation with appropriate genetic operators
• Evolve for as long as you wish!
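The steps above can be sketched as a minimal evolutionary loop. This is an illustration under stated assumptions, not the GATree implementation: the tuple-based tree encoding, the subtree-swap crossover, the subtree-replacement mutation, the truncation selection, the stand-in fitness (correct count minus a small size penalty), and all parameter values are hypothetical choices made for brevity.

```python
import random

# A tree is either a class label (int leaf) or a tuple (attr, left, right):
# go left when x[attr] == 0, right otherwise (binary attributes assumed).

def random_tree(n_attrs, depth, rng):
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(0, 1)
    return (rng.randrange(n_attrs),
            random_tree(n_attrs, depth - 1, rng),
            random_tree(n_attrs, depth - 1, rng))

def classify(tree, x):
    while isinstance(tree, tuple):
        attr, left, right = tree
        tree = left if x[attr] == 0 else right
    return tree

def size(tree):
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1

def paths(tree, prefix=()):
    """Yield every node position as a tuple of child indices (1 = left, 2 = right)."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, sub):
    """Return a copy of tree with the node at path replaced by sub."""
    if not path:
        return sub
    node = list(tree)
    node[path[0]] = put(tree[path[0]], path[1:], sub)
    return tuple(node)

def crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one of b."""
    pa, pb = rng.choice(list(paths(a))), rng.choice(list(paths(b)))
    return put(a, pa, get(b, pb)), put(b, pb, get(a, pa))

def mutate(tree, n_attrs, rng):
    """Replace one randomly chosen node with a small random subtree."""
    return put(tree, rng.choice(list(paths(tree))), random_tree(n_attrs, 2, rng))

def fitness(tree, rows, labels, size_penalty=0.01):
    # Stand-in fitness: any function of accuracy and size could be used here.
    correct = sum(classify(tree, x) == y for x, y in zip(rows, labels))
    return correct - size_penalty * size(tree)

def evolve(rows, labels, generations=60, pop_size=60, max_size=31, seed=0):
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    pop = [random_tree(n_attrs, 3, rng) for _ in range(pop_size)]

    def score(tree):
        return fitness(tree, rows, labels)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        survivors = pop[:pop_size // 2]      # keep the better half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            for child in crossover(rng.choice(survivors), rng.choice(survivors), rng):
                if rng.random() < 0.2:
                    child = mutate(child, n_attrs, rng)
                if size(child) <= max_size:  # guard against unbounded growth
                    children.append(child)
        pop = (survivors + children)[:pop_size]
    return max(pop, key=score)

# Usage on a toy XOR dataset.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [a ^ b for a, b in rows]
best = evolve(rows, labels)
print(best, size(best))
```

Because the better half of each generation is carried over unchanged, the best score never decreases, so the loop can run "for as long as you wish" and be stopped at any point.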
Payoff function
• Balance between accuracy and size
• Set x depending on the desired output characteristics
  • Small trees? x near one
  • Emphasis on accuracy? x grows big
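The slide's formula is not present in this transcript. As an assumption, the sketch below uses one form that behaves as the bullets describe: the squared number of correctly classified instances, scaled by a size factor x / (size^2 + x). With x near one the size factor dominates; as x grows the factor approaches one and accuracy decides. The two trees and their numbers are made up for illustration.

```python
def payoff(correct, size, x):
    """Assumed payoff: squared correct count times the size factor x / (size^2 + x)."""
    return correct ** 2 * x / (size ** 2 + x)

# Two hypothetical trees evaluated on the same 100 instances.
small = {"correct": 85, "size": 5}
large = {"correct": 95, "size": 41}

def preferred(x):
    """Name of the tree with the higher payoff for a given x."""
    p_small = payoff(small["correct"], small["size"], x)
    p_large = payoff(large["correct"], large["size"], x)
    return "small" if p_small > p_large else "large"

for x in (1, 1000, 1_000_000):
    print(x, preferred(x))
# 1 small
# 1000 small
# 1000000 large
```

Under this assumed form, x = 1 gives the small tree a payoff of about 278 against about 5 for the large one, while x = 1,000,000 gives about 7225 against about 9010, so the more accurate tree wins only once x is large.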
Future work
• Minimize evolution time
• Improved node statistics
• Choose the output class using a majority vote over the produced tree forest
• Dynamic tuning of initial parameters
• Experiments with synthetic datasets
  • Specific characteristics
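The majority-vote idea can be illustrated with a short sketch. The slides give no detail on how the vote would be taken, so this assumes each evolved tree is available as a function from an instance to a class label and that every tree gets one unweighted vote; the three stand-in "trees" below are hypothetical.

```python
from collections import Counter

def majority_vote(forest, x):
    """Return the class predicted by most trees in the forest for instance x.

    Ties go to the class that was voted for first (Counter keeps insertion order).
    """
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Three hypothetical classifiers standing in for evolved trees.
forest = [
    lambda x: x[0],         # predicts the first attribute
    lambda x: x[1],         # predicts the second attribute
    lambda x: x[0] ^ x[1],  # predicts their XOR
]

print(majority_vote(forest, (1, 0)))  # votes are 1, 0, 1
print(majority_vote(forest, (0, 0)))  # votes are 0, 0, 0
```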