60 likes | 187 Vues
This study explores innovative data mining techniques to identify molecules that can inhibit cancer. Leveraging tools like segmentation, regression, and classification, we analyze extensive datasets to reveal potential drug candidates. A modified version of the 'Wizard of Oz' metaphor is used to elucidate our methodology, emphasizing the importance of contextual data and goal-centered strategies in drug discovery. This work aims to bridge the gap between theoretical models and real-world application, enhancing the scalability, interpretability, and effectiveness of data mining in cancer research.
E N D
Data Mining Front Side Which molecules inhibit cancer? Data ACME DATA-MINO-MATIC Segmen-tation Regression Cleaning Evalua- tion Classifi-cation Potential drug molecules Modified version of cartoon from Union of Concerned Scientists
Data Mining Backside Pay no attention to the man behind the curtain. -Wizard of Oz
Strategies ? ? Data Data ACME DATA-MINO-MATIC ACME DATA-MINO-MATIC • Input/Output funnels are largely art • Capture and exploit meaning and context not just data – semantic web • Adapt goal centered versus algorithm centered approach
Strategies ? Data ACME DATA-MINO-MATIC • Sub-boxes are scientific but narrow • Push more functionality in each box • Grow the theory • Move boxes closer to real world heterogeneous data, scalability, simplicity, sparseness, interpretability, interestingness
Strategies ? ? Data Data ACME DATA-MINO-MATIC ACME DATA-MINO-MATIC • Funnels and levers not always published • Mundane details matter • Mine the mining • Identify best practices, problem strategies, and emerging methods via data mining applications website • Social tagging and ranking