Classification in Complex Systems

Why we should look at the paper: “CAEP: Classification by Aggregating Emerging Patterns” by G. Dong, X. Zhang, L. Wong, and J. Li

What are Common Problems in Classification?
• Many variables
• Graphs that relate tuples
  • Protein-protein interactions (KDD-cup 02)
  • Citations (KDD-cup 03)
• Anything that violates standard table format

Many Variables
Solution:
• Naïve Bayes way of multiplying probabilities
• Other additive models
Problems:
• Many factors
• May be correlated
• Noise
… but it gets worse
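To make the "multiplying probabilities" bullet concrete, here is a minimal sketch of a Naïve Bayes score. Multiplying per-attribute probabilities is the same as adding their logarithms, which is why Naïve Bayes can be grouped with the additive models. The function name `naive_bayes_scores` and all probabilities are made up for illustration and do not come from the slides.

```python
import math

def naive_bayes_scores(priors, likelihoods, observed):
    """Return per-class log scores: log P(C) + sum_i log P(x_i | C).

    Summing logs is equivalent to multiplying the probabilities,
    but avoids numerical underflow when there are many variables.
    """
    scores = {}
    for cls, prior in priors.items():
        log_score = math.log(prior)
        for attr, value in observed.items():
            log_score += math.log(likelihoods[cls][attr][value])
        scores[cls] = log_score
    return scores

# Toy, made-up probabilities for two binary attributes "a" and "b".
priors = {"P": 0.5, "N": 0.5}
likelihoods = {
    "P": {"a": {1: 0.8, 0: 0.2}, "b": {1: 0.6, 0: 0.4}},
    "N": {"a": {1: 0.3, 0: 0.7}, "b": {1: 0.5, 0: 0.5}},
}

scores = naive_bayes_scores(priors, likelihoods, {"a": 1, "b": 0})
predicted = max(scores, key=scores.get)
print(predicted)
```

The independence assumption behind this product is exactly what the "may be correlated" bullet calls into question: correlated attributes get counted more than once.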

Graphs
• 2 kinds of attributes
  • Attributes within nodes
  • Attributes of neighbor and more distant nodes
• How do neighbor attributes count?
  • Take disjunction?
    • “At least one neighbor that has a particular property”
  • Probably preferable:
    • Use links or, more generally, paths as basis
• Integration into classification???
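The disjunction idea ("at least one neighbor that has a particular property") and its extension to more distant nodes can be sketched as two derived Boolean features. This is only an illustration under assumed data structures: the adjacency-dict representation, the helper names `has_neighbor_with` and `within_k_hops_with`, and the toy property `"kinase"` are all hypothetical.

```python
def has_neighbor_with(graph, node_attrs, node, prop):
    """True if at least one direct neighbor of `node` has `prop` set."""
    return any(node_attrs[n].get(prop, False) for n in graph.get(node, ()))

def within_k_hops_with(graph, node_attrs, node, prop, k):
    """True if some node reachable in 1..k links from `node` has `prop` set."""
    frontier = {node}
    seen = {node}
    for _ in range(k):
        # Expand one link further, skipping nodes already visited.
        frontier = {m for n in frontier for m in graph.get(n, ()) if m not in seen}
        if any(node_attrs[m].get(prop, False) for m in frontier):
            return True
        seen |= frontier
    return False

# Toy undirected chain A - B - C, where only C carries the property.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
node_attrs = {"A": {}, "B": {}, "C": {"kinase": True}}

print(has_neighbor_with(graph, node_attrs, "A", "kinase"))      # False
print(has_neighbor_with(graph, node_attrs, "B", "kinase"))      # True
print(within_k_hops_with(graph, node_attrs, "A", "kinase", 2))  # True
```

Features like these flatten the graph back into a table, which is convenient but discards which link or path carried the property; that loss is one reason the slide prefers links or paths as the basis.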

Idea
• Get away from strict set of n attributes
• If an attribute or combination of attributes is “interesting”, use it
• Combining rules?
  • I would have guessed as in Naïve Bayes
  • CAEP adds probabilities!?

What is “interesting”?
• CAEP paper claims “growth rate”
  • Support of a rule increases significantly from one class label to another
  • Note: Only increase, not decrease!
• What does that mean?
  • For pattern e and classes P and N
  • $\mathrm{growth\_rate}_{P \to N}(e) = \mathrm{supp}_N(e) / \mathrm{supp}_P(e)$
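The growth-rate definition can be checked on a toy example. The sketch below treats each instance as a set of items and computes support as the fraction of a class's instances containing the pattern. The handling of zero support (0 if both supports are zero, infinity if only the source support is zero) is a convention assumed here; the function names and data are invented for illustration.

```python
import math

def support(pattern, transactions):
    """Fraction of transactions (sets of items) that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    pattern = frozenset(pattern)
    return sum(pattern <= t for t in transactions) / len(transactions)

def growth_rate(pattern, from_class, to_class):
    """supp_to(pattern) / supp_from(pattern), with an assumed convention for zeros."""
    s_from = support(pattern, from_class)
    s_to = support(pattern, to_class)
    if s_from == 0:
        return 0.0 if s_to == 0 else math.inf
    return s_to / s_from

# Made-up instances for classes P and N.
class_P = [{"a", "b"}, {"a"}, {"b"}, {"c"}]
class_N = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]

print(growth_rate({"a", "b"}, class_P, class_N))  # 0.75 / 0.25 = 3.0
```

The pattern {a, b} triples its support going from P to N, so it would count as an emerging pattern for N under any growth-rate threshold below 3. In the reverse direction its growth rate is 1/3, a decrease, which the measure ignores.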

2 Things Worth Investigating
• Is “interestingness” measure related to information gain?
  • Under certain assumptions: Yes
• Can the “score” be justified?
  • Sum of P(C)!?
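The "sum of P(C)" remark can be made concrete with a sketch of the aggregate score as I understand it from the CAEP paper: each emerging pattern contained in the instance contributes growth_rate / (growth_rate + 1) times its support in the target class, and the contributions are added. Note that growth_rate / (growth_rate + 1) equals supp_N / (supp_N + supp_P), which is P(N | e) when the two classes are equally likely, so the score really is a support-weighted sum of probability-like terms. The function names and the pattern list below are made up.

```python
import math

def strength(growth_rate, support_in_class):
    """Contribution of one emerging pattern to the score of its class."""
    # An infinite growth rate means the pattern never occurs in the other class.
    frac = 1.0 if math.isinf(growth_rate) else growth_rate / (growth_rate + 1.0)
    return frac * support_in_class

def aggregate_score(instance, emerging_patterns):
    """Sum the strengths of all emerging patterns contained in `instance`.

    `emerging_patterns` is a list of (pattern, growth_rate, support_in_class).
    """
    instance = frozenset(instance)
    return sum(
        strength(gr, supp)
        for pattern, gr, supp in emerging_patterns
        if frozenset(pattern) <= instance
    )

# Hypothetical emerging patterns of class N: (items, growth rate, support in N).
eps_N = [
    ({"a", "b"}, 3.0, 0.75),
    ({"d"}, math.inf, 0.2),
]

print(aggregate_score({"a", "b", "c"}, eps_N))  # 0.75 * 0.75 = 0.5625
print(aggregate_score({"a", "b", "d"}, eps_N))  # adds 1.0 * 0.2 for pattern {d}
```

Adding these terms, rather than multiplying them as Naïve Bayes would, is the step whose justification the slide questions.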

Other Issues
• Normalization
  • Emerging patterns only consider increase in support => different number of relevant patterns
• How to mine for EPs
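Because classes can have very different numbers of emerging patterns, raw scores are not directly comparable across classes. One possible normalization, sketched here as an assumption rather than as the paper's exact procedure, divides each raw score by a per-class base score taken at a fixed position in the sorted scores of that class's training instances. The names `base_score` and `normalized_score`, the `fraction` parameter, and all numbers are hypothetical.

```python
def base_score(training_scores, fraction=0.5):
    """Score at the given fraction of the sorted training scores of one class."""
    ordered = sorted(training_scores)
    if not ordered:
        raise ValueError("need at least one training score")
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]

def normalized_score(raw, base):
    """Raw score relative to the class's base score (0 if the base is not positive)."""
    return raw / base if base > 0 else 0.0

# Made-up raw scores of training instances for their own class.
# Class N has more emerging patterns, so its raw scores run higher.
train_scores = {"P": [0.2, 0.4, 0.6, 0.8], "N": [1.0, 2.0, 3.0, 4.0]}
bases = {cls: base_score(s) for cls, s in train_scores.items()}

# Raw scores of one test instance against each class.
raw = {"P": 0.6, "N": 1.5}
norm = {cls: normalized_score(raw[cls], bases[cls]) for cls in raw}
predicted = max(norm, key=norm.get)
print(norm, predicted)
```

In this toy case the raw score favors N, but after normalization the instance looks typical for P and weak for N, so the prediction flips to P.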

Conclusions
• Idea very valuable
  • Classification split into ARM-step and rule combination
• Justification of details?
  • Not great
  • Should be possible to do it right – with poorer accuracy ;-)
