
By Abd El-Monem M. Kozea, Mohamed M. E. Abd El-Monsef, Soaad Abd El-Badie Attia El-Afify

New Approaches for Data Reduction in Generalized Multi-valued Decision Information System (GMDIS): Case Study of Rheumatic Fever Patients


Presentation Transcript


  1. New Approaches for Data Reduction in Generalized Multi-valued Decision Information System (GMDIS): Case Study of Rheumatic Fever Patients
     WRSTA2006, 13 August 2006

  2. By Abd El-Monem M. Kozea, Mohamed M. E. Abd El-Monsef, Soaad Abd El-Badie Attia El-Afify
     Mathematics Department, Faculty of Science, Tanta University, Egypt
     Email: savvymore@yahoo.com

  3. Outline
     • Motivation / Introduction
     • Basic Concepts of Rough Sets
     • Rheumatic Fever Data: Characteristics
     • New Thinking
     • Generalized Multi-Valued Decision Information System (GMDIS)
     • New Approaches for Data Reduction in GMDIS
       • Non-equivalence Relations
       • Topological Spaces
       • Degree of Dependencies in GMDIS
     • Reduct Algorithms based on GMDIS
     • Rheumatic Fever GMDIS Reduction: Worked Example
     • Conclusion and Future Work

  4. Motivation / Introduction (1)
     • Rough set (RS) theory was developed by Zdzislaw Pawlak in the early 1980s.
     • RS is based on equivalence relations, which partition the domain into classes.
     • It is a mathematical tool for dealing with incomplete data, for inducing approximations of concepts, and for discovering patterns hidden in data.
     • It can be used for feature selection, data reduction, identifying partial or total dependencies in data, handling null values and missing data, and generating decision rules.
     • Rough set features:
       • It is applicable to problems with both numeric and descriptive attributes.
       • It is capable of finding all minimal knowledge representations.
       • It is highly automated and based on strict rules.
     • A multi-valued information system (MIS) generalizes a single-valued information system (SIS).
     • In a multi-valued information system, attribute functions are allowed to map elements to sets of attribute values.

  5. Motivation / Introduction (2)
     • We initiate a new approach for data reduction in GMDIS:
       • The Single-Valued Decision Information System (SDIS) is converted to a GMDIS.
       • Two general relations are defined.
       • New classes are constructed using the general relations.
       • The measure of decision dependency on the condition attributes is studied.
     • To evaluate the performance of the approach, it is applied to Rheumatic Fever datasets.

  6. Rough Set Theory: Basic Concepts
     • Information/Decision Systems (Tables)
     • Indiscernibility
     • Set Approximation
     • Reducts and Core
     • Rough Membership
     • Dependency of Attributes

  7. Information Systems/Tables
     • An information system (IS) is a pair (U, A).
     • U is a non-empty finite set of objects.
     • A is a non-empty finite set of attributes such that $a : U \to V_a$ for every $a \in A$.
     • $V_a$ is called the value set of a.

  8. Decision Systems/Tables
     • A decision system (DS) is $T = (U, A \cup \{d\})$.
     • $d \notin A$ is the decision attribute (instead of one, several decision attributes can be considered).
     • The elements of A are called the condition attributes.
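A minimal sketch of how a decision table and its indiscernibility classes might be represented. The table values and the names `TABLE` and `ind_classes` are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict

# Toy decision table (hypothetical values, not the rheumatic fever data).
# "a" and "b" are condition attributes, "d" is the decision attribute.
TABLE = {
    "x1": {"a": 1, "b": 0, "d": "yes"},
    "x2": {"a": 1, "b": 0, "d": "no"},
    "x3": {"a": 0, "b": 1, "d": "no"},
}

def ind_classes(table, attrs):
    """Group objects that take the same value on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return sorted(classes.values(), key=sorted)

print(ind_classes(TABLE, ["a", "b"]))
```

Here x1 and x2 fall into the same class on {a, b} although their decisions differ, which is the situation rough approximations are meant to handle.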

  9. Information Systems Types
     • The first concept of IS was developed by Grzymala-Busse (1988). There are many types of IS, as follows:
     • Single-valued Information System (SIS): $(U, At, \{V_a : a \in At\}, f_a)$
       • The data take a single value for each element.
     • Single-valued Decision Information System (SDIS): $(U, At \cup D, \{V_a : a \in At\}, f_a)$
     • Multi-valued Information System (MIS)
       • An ordinary information system whose values are sets.
     • Multi-valued Decision Information System (MDIS)

  10. Rheumatic Fever Data: Characteristics
     • The Rheumatic Fever patient data were obtained from Tanta University Hospital, Egypt.
     • All patients are between 9 and 12 years old, with a history of arthritis that began at age 3-5 years.
     • The disease has many symptoms; it usually starts at a young age and stays with the patient for life.
     • The following table shows seven patients characterized by 8 symptoms (attributes), which are used to decide the diagnosis for each patient (decision attribute).

  11. Rheumatic Fever Data: Characteristics
      Attribute symbol   Refers to        Attribute values
      S                  Sex              Male; Female
      F                  Pharyngitis      Yes; No
      A                  Arthritis        No arthritis; Began in the knee; Began in the ankle
      R                  Carditis         Affected; Not affected
      K                  Chorea           Yes; No
      E                  ESR              Normal; High
      P                  Abdominal pain   Absent; Present
      H                  Headache         Yes; No
      D                  Diagnosis        Rheumatic arthritis; Rheumatic carditis; Rheumatic arthritis and carditis

  12. New Thinking (1)
     • A multi-valued information system (MIS) generalizes a single-valued information system (SIS).
     • In a multi-valued information system, attribute functions are allowed to map elements to sets of attribute values.
     • Can the SDIS be converted to an MDIS and vice versa?

  13. New Thinking (2)
     • We introduce two methods to:
       • Convert the SIS to an MIS and vice versa.
       • Convert the SDIS to an MDIS and vice versa.
     • Both methods work by collecting attributes.

  14. Worked Example 1 (SDIS): Rheumatic Fever SDIS Data
           S    F    A    R    K    E    P    H    D
      x1   s2   f1   a1   r1   k1   e1   p1   h2   d3
      x2   s1   f1   a1   r1   k1   e2   p1   h1   d3
      x3   s2   f1   a2   r1   k2   e1   p1   h2   d3
      x4   s1   f1   a1   r2   k2   e1   p1   h2   d1
      x5   s1   f2   a0   r1   k2   e1   p2   h2   d2
      x6   s1   f1   a1   r1   k2   e2   p1   h2   d3
      x7   s1   f1   a2   r1   k2   e1   p1   h2   d3

  15. Worked Example 2 (MDIS): Converted Data Description (MDIS)
      Attribute symbol   Refers to    Attribute value   Refers to
      α                  {S, K}       α1                S → s1
                                      α2                K → k1
                                      α3                {S, K} → {s2, k2}
      β                  {F, A, E}    β1                F → f1
                                      β2                A → a1
                                      β3                A → a2
                                      β4                E → e1
                                      β5                {F, A, E} → {f2, a0, e2}
      δ                  {R, P, H}    δ1                R → r1
                                      δ2                P → p1
                                      δ3                H → h1
                                      δ4                {R, P, H} → {r2, p2, h2}
      D                  Diagnosis    d1                Rheumatic arthritis
                                      d2                Rheumatic carditis
                                      d3                Rheumatic arthritis and carditis

  16. Worked Example 3 (MDIS): Rheumatic Fever MDIS Data
           α           β              δ              D
      x1   {α2}        {β1, β2, β4}   {δ1, δ2}       {d3}
      x2   {α1, α2}    {β1, β2}       {δ1, δ2, δ3}   {d3}
      x3   {α3}        {β1, β2, β4}   {δ1, δ2}       {d3}
      x4   {α1}        {β1, β2, β4}   {δ2}           {d1}
      x5   {α1}        {β4}           {δ1}           {d2}
      x6   {α1}        {β1, β2}       {δ1, δ2}       {d3}
      x7   {α1}        {β1, β3, β4}   {δ1, δ2, δ3}   {d3}
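The "collecting of attributes" step can be sketched as a small lookup. This assumes that each symbol of a collected attribute applies to an object when all the single-valued entries listed for it in Worked Example 2 are present. That reading is an interpretation, and the names `SDIS`, `RULES` and `to_mdis` are hypothetical.

```python
# SDIS rows as printed in Worked Example 1 (condition attributes only).
SDIS = {
    "x1": dict(S="s2", F="f1", A="a1", R="r1", K="k1", E="e1", P="p1", H="h2"),
    "x2": dict(S="s1", F="f1", A="a1", R="r1", K="k1", E="e2", P="p1", H="h1"),
    "x4": dict(S="s1", F="f1", A="a1", R="r2", K="k2", E="e1", P="p1", H="h2"),
    "x5": dict(S="s1", F="f2", A="a0", R="r1", K="k2", E="e1", P="p2", H="h2"),
}

# Collected attributes from Worked Example 2. Assumption: a symbol applies
# to an object only if every value listed for it matches.
RULES = {
    "α": {"α1": {"S": "s1"}, "α2": {"K": "k1"}, "α3": {"S": "s2", "K": "k2"}},
    "β": {"β1": {"F": "f1"}, "β2": {"A": "a1"}, "β3": {"A": "a2"},
          "β4": {"E": "e1"}, "β5": {"F": "f2", "A": "a0", "E": "e2"}},
    "δ": {"δ1": {"R": "r1"}, "δ2": {"P": "p1"}, "δ3": {"H": "h1"},
          "δ4": {"R": "r2", "P": "p2", "H": "h2"}},
}

def to_mdis(row):
    """Map one single-valued row to its set-valued (multi-valued) form."""
    return {
        group: {sym for sym, cond in symbols.items()
                if all(row[attr] == val for attr, val in cond.items())}
        for group, symbols in RULES.items()
    }

for name, row in SDIS.items():
    print(name, to_mdis(row))
```

For x1 this yields α = {α2}, β = {β1, β2, β4} and δ = {δ1, δ2}, the first row of the MDIS table.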

  17. Generalized Multi-Valued Decision Information System (GMDIS)

  18. Initiated a New Approach
     • Initiate a new approach for data reduction in the Generalized Multi-Valued Decision Information System (GMDIS).
     • Convert the SDIS to a GMDIS.
     • Define two general relations on the condition attributes and the decision attribute.
     • Construct new classes using the general relations; these classes are used for data reduction.
     • Study the measure of decision dependency on the condition attributes.
     • Evaluate the performance of the approach: Rheumatic Fever datasets were chosen as an application, and the reduct approach was applied to assess its ability and accuracy.

  19. Generalized Multi-valued Decision Information System
     A Generalized Multi-valued Information System can be defined as follows:
       $GMIS = (U, At, \{\psi_a : a \in At\}, f_a, \{\eta_B : B \subseteq At\})$   (1)
     A Generalized Multi-valued Decision Information System can be defined as follows:
       $GMDIS = (U, At \cup D, \{\psi_a : a \in At\}, f_a, \{\eta_B : B \subseteq At\})$   (2)

  20. Set Approximations in GMDIS (1)
     In general, $\eta_B = \{(x, y) : f_a(x) \text{ depends on } f_a(y)\}$. In particular:
       $\eta_B^{c} = \{(x, y) : f_a(x) \subseteq f_a(y),\ \forall a \in B,\ B \subseteq At\}$   (1)
       $\eta_B = \{(x, y) : f_a(y) \subseteq f_a(x),\ \forall a \in B,\ B \subseteq At\}$   (2)
       $\eta_D = \{(x, y) : f_D(x) \subseteq f_D(y)\}$   (3)
     The Meeting Point Relation (MPR) $\mu$ is defined as the set of all intersections $A_i \cap A_j$, $i \neq j$, of members of a family of sets $A_i$.   (4)
     [The full form of equation (4) is not recoverable from the transcript.]
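A minimal sketch of the two containment relations, evaluated on the MDIS rows of Worked Example 3. The function names `eta_c`, `eta` and `related` are hypothetical, and the rows are copied as printed in that table.

```python
# MDIS rows as printed in Worked Example 3 (condition attributes only).
MDIS = {
    "x1": {"α": {"α2"}, "β": {"β1", "β2", "β4"}, "δ": {"δ1", "δ2"}},
    "x2": {"α": {"α1", "α2"}, "β": {"β1", "β2"}, "δ": {"δ1", "δ2", "δ3"}},
    "x3": {"α": {"α3"}, "β": {"β1", "β2", "β4"}, "δ": {"δ1", "δ2"}},
    "x4": {"α": {"α1"}, "β": {"β1", "β2", "β4"}, "δ": {"δ2"}},
    "x5": {"α": {"α1"}, "β": {"β4"}, "δ": {"δ1"}},
    "x6": {"α": {"α1"}, "β": {"β1", "β2"}, "δ": {"δ1", "δ2"}},
    "x7": {"α": {"α1"}, "β": {"β1", "β3", "β4"}, "δ": {"δ1", "δ2", "δ3"}},
}

def eta_c(B):
    """Pairs (x, y) with f_a(x) a subset of f_a(y) for every a in B."""
    return {(x, y) for x in MDIS for y in MDIS
            if all(MDIS[x][a] <= MDIS[y][a] for a in B)}

def eta(B):
    """Pairs (x, y) with f_a(y) a subset of f_a(x) for every a in B."""
    return {(x, y) for x in MDIS for y in MDIS
            if all(MDIS[y][a] <= MDIS[x][a] for a in B)}

def related(rel, x):
    """All objects y such that (x, y) belongs to the relation."""
    return {y for u, y in rel if u == x}

print(sorted(related(eta_c(["α"]), "x1")))
print(("x2", "x1") in eta_c(["α"]))
```

Both relations are reflexive but in general not symmetric, so the sets returned by `related` may overlap instead of forming a partition. This is what distinguishes them from the equivalence relations of the classical setting.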

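  21. Set Approximations in GMDIS (2)
       $POS_{\eta_B}(D) = \bigcup_{X \in A_{\eta_D}} \underline{X}_{\eta_B}, \quad B \subseteq At$   (5)
     where, for any subset $X \subseteq U$, the lower and upper approximations are defined by
       $\underline{X}_{\eta_B} = \bigcup \{\eta_{Bx} : \eta_{Bx} \subseteq X\}, \quad B \subseteq At$
       $\overline{X}_{\eta_B} = \bigcup \{\eta_{Bx} : \eta_{Bx} \cap X \neq \Phi\}, \quad B \subseteq At$   (6)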
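The approximations and the positive region can be sketched for one collected attribute. The sketch assumes that the class of x consists of all y with f(x) ⊆ f(y). The presentation may build its classes differently, so this choice, like the names `CLASS`, `lower`, `upper` and `positive_region`, is an assumption.

```python
# Values of the collected attribute α and of the decision D (Worked Example 3).
F_ALPHA = {"x1": {"α2"}, "x2": {"α1", "α2"}, "x3": {"α3"},
           "x4": {"α1"}, "x5": {"α1"}, "x6": {"α1"}, "x7": {"α1"}}
DECISION = {"x1": "d3", "x2": "d3", "x3": "d3", "x4": "d1",
            "x5": "d2", "x6": "d3", "x7": "d3"}

# Assumed class of x: every y whose value set contains the value set of x.
CLASS = {x: {y for y in F_ALPHA if F_ALPHA[x] <= F_ALPHA[y]} for x in F_ALPHA}

def lower(X):
    """Union of the classes that lie entirely inside X."""
    out = set()
    for c in CLASS.values():
        if c <= X:
            out |= c
    return out

def upper(X):
    """Union of the classes that intersect X."""
    out = set()
    for c in CLASS.values():
        if c & X:
            out |= c
    return out

def positive_region():
    """Union of the lower approximations of all decision classes."""
    decision_classes = {}
    for x, d in DECISION.items():
        decision_classes.setdefault(d, set()).add(x)
    return set().union(*(lower(X) for X in decision_classes.values()))

print(sorted(lower({"x1", "x2", "x3", "x6", "x7"})))
print(sorted(upper({"x4"})))
```

Because the classes come from a non-symmetric relation, they can be nested, so the lower approximation is a union of possibly overlapping sets.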

  22. Suggested New Technique: Consideration (1)
     1. The set of attributes $B \subseteq At$ is called a reduct if $\tau_B \le \tau_D$ and B is minimal, where
        $\tau_B \le \tau_D$ iff $\forall G \in \tau_B,\ \exists G' \in \tau_D$ s.t. $G \subset G'$, $G, G' \neq U$.
     2. The attribute $a \in At$ is called the principal attribute (PA) if $\tau_a \succ \tau_b,\ \forall a, b \in At,\ b \neq a$, and if $\tau_a = \tau_b$ then both a and b are principal attributes.

  23. Suggested New Technique: Consideration (2)
     • The set of attributes of equal highest degree of dependency is the PA of the GMDIS.
     • If the set of all reducts of any SDIS is $\Psi = \{R_1, R_2, \ldots, R_n\}$ and the set of reducts for the GMDIS obtained with the new approach is $\Psi' = \{R'_1, R'_2, \ldots, R'_n\}$, then $\Psi'$ is said to be more refined than $\Psi$ if $\forall R'_i \in \Psi',\ \exists R_i \in \Psi$ s.t. $R'_i \subseteq R_i$.

  24. Simplified Reducts
     • The set of simplified reducts, denoted SRED(At), is the set of reducts that remains after omitting the supersets of each reduct in the set RED(At).

  25. GMDIS Reduction Algorithms
     • Algorithm 1: GMDIS Reduct
     • Algorithm 2: GMDIS PA Algorithm

  26. GMDIS Reduction Algorithms: GMDIS Reduct
     A GMDIS $= (U, At \cup D, \{\psi_a : a \in At\}, f_a, \{\eta_B : B \subseteq At\})$
     R: a minimal attribute subset, $R \subseteq At$
     (1) $R \leftarrow \{\}$
     (2) Do
     (3)   $GMDIS \leftarrow R$
     (4)   Loop $a \in (At - R)$
     (5)     If $\tau_{R \cup \{a\}} \le \tau_D$
     (6)       $GMDIS \leftarrow R \cup \{a\}$
     (7)   $R \leftarrow GMDIS$
     (8) Until $\tau_R \le \tau_D$
     (9) Return R, where $R \equiv$ Reduct GMDIS
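The loop structure above resembles a greedy forward selection. The sketch below keeps that structure but swaps in the classical positive-region dependency degree on the single-valued table as the stopping test, because the topological comparison of τ is not spelled out in the transcript. The names `gamma` and `quick_reduct` are assumptions, and the result is not claimed to match the GMDIS reduct.

```python
# Rows as printed in Worked Example 1; the last entry is the decision D.
ATTRS = "SFARKEPH"
ROWS = {
    "x1": "s2 f1 a1 r1 k1 e1 p1 h2 d3",
    "x2": "s1 f1 a1 r1 k1 e2 p1 h1 d3",
    "x3": "s2 f1 a2 r1 k2 e1 p1 h2 d3",
    "x4": "s1 f1 a1 r2 k2 e1 p1 h2 d1",
    "x5": "s1 f2 a0 r1 k2 e1 p2 h2 d2",
    "x6": "s1 f1 a1 r1 k2 e2 p1 h2 d3",
    "x7": "s1 f1 a2 r1 k2 e1 p1 h2 d3",
}
TABLE = {x: dict(zip(ATTRS + "D", row.split())) for x, row in ROWS.items()}

def gamma(B):
    """Classical dependency degree: share of objects whose B-class has one decision."""
    groups = {}
    for row in TABLE.values():
        groups.setdefault(tuple(row[a] for a in B), []).append(row["D"])
    pure = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    return pure / len(TABLE)

def quick_reduct():
    """Greedy loop: add the attribute that raises gamma most until gamma is maximal."""
    R = []                                   # step (1)
    target = gamma(ATTRS)
    while gamma(R) < target:                 # steps (2) and (8)
        rest = [a for a in ATTRS if a not in R]
        R.append(max(rest, key=lambda a: gamma(R + [a])))   # steps (4) to (7)
    return R

print(quick_reduct())
```

A greedy search of this kind stops at the first subset that reaches the target, so in general the subset it returns need not be minimal.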

  27. GMDIS Reduction Algorithms: GMDIS PA Algorithm
     A GMDIS $= (U, At \cup D, \{\psi_a : a \in At\}, f_a, \{\eta_B : B \subseteq At\})$
     PA: a set of principal attributes, $PA \subseteq At$
     (1) $PA \leftarrow \{\}$
     (2) Do
     (3)   Loop $a \in At$
     (4)     Loop $b \in At$
     (5)       If $\tau_a \succ \tau_b$
     (6)         $PA \leftarrow PA \cup \{a\}$
     (7)     End Loop
     (8)   End Loop
     (9) Return PA
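One possible reading of the PA selection, following the statement that attributes of equal highest degree of dependency form the PA. The comparison τ_a ≻ τ_b is replaced here by a caller-supplied `dominates` function, and the degrees are made-up numbers used only to show how a tie is handled.

```python
def principal_attributes(attrs, dominates):
    """Keep every attribute that no other attribute strictly dominates."""
    pa = set()
    for a in attrs:                       # outer loop over a in At
        if all(not dominates(b, a) for b in attrs if b != a):   # inner loop over b
            pa.add(a)
    return pa

# Hypothetical dependency degrees (not computed from the rheumatic fever data).
degree = {"a": 0.5, "b": 0.25, "c": 0.5}
result = principal_attributes(degree, lambda p, q: degree[p] > degree[q])
print(sorted(result))
```

With these numbers both "a" and "c" are returned, mirroring the rule that attributes with equal highest degree are all principal.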

  28. Rheumatic Fever GMDIS Reduction: Worked Example
     • The new approach is applied to the MDIS Rheumatic Fever data, treated as a GMDIS, using the relation
       $\eta_B^{c} = \{(x, y) : f_a(x) \subseteq f_a(y),\ \forall a \in B,\ B \subseteq At\}$
     • We conclude that $\{\alpha\}$ is the reduct, $RED(At) = \{\alpha\} = \{S, K\}$, and that it is the PA of the GMDIS; this is the same result obtained using the second consideration.

  29. Discernibility Matrix versus GMDIS
     Rheumatic Fever Data Discernibility Matrix
           x1            x2              x3          x4          x5          x6   x7
      x1   Φ
      x2   Φ             Φ
      x3   Φ             Φ               Φ
      x4   {S,R,K}       {R,K,E,H}       {S,A,R}     Φ
      x5   {S,F,A,K,P}   {F,A,K,E,P,H}   {S,F,A,P}   {F,A,R,P}   Φ
      x6   Φ             Φ               Φ           {R,E}       {F,A,E,P}   Φ
      x7   Φ             Φ               Φ           {A,R,H}     {F,A,P,H}   Φ    Φ

  30. The discernibility function
     $f_{At} = \{S \vee R \vee K\} \wedge \{R \vee K \vee E \vee H\} \wedge \{S \vee A \vee R\} \wedge \{S \vee F \vee A \vee K \vee P\} \wedge \{F \vee A \vee K \vee E \vee P \vee H\} \wedge \{S \vee F \vee A \vee P\} \wedge \{F \vee A \vee R \vee P\} \wedge \{R \vee E\} \wedge \{F \vee A \vee E \vee P\} \wedge \{A \vee R \vee H\} \wedge \{F \vee A \vee P \vee H\}$
     $Red(At) = \{\{S \vee R \vee K\}, \{S \vee A \vee R\}, \{S \vee F \vee A \vee P\}, \{F \vee A \vee R \vee P\}, \{R \vee E\}, \{F \vee A \vee E \vee P\}, \{A \vee R \vee H\}, \{F \vee A \vee P \vee H\}\}$
     whereas the GMDIS approach gives $RED(At) = \{\alpha\} = \{S, K\}$.
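The matrix entries and the simplified clause list can be recomputed with a short sketch. It assumes decision-relative discernibility (entries only for pairs with different diagnoses) and takes x7's H value as h1, the value implied by the matrix entries {A,R,H} and {F,A,P,H}. The names `discernibility` and `absorb` are hypothetical.

```python
from itertools import combinations

ATTRS = "SFARKEPH"
# Condition values followed by the decision. Assumption: H of x7 is h1,
# as implied by the discernibility matrix entries for x7.
ROWS = {
    "x1": "s2 f1 a1 r1 k1 e1 p1 h2 d3".split(),
    "x2": "s1 f1 a1 r1 k1 e2 p1 h1 d3".split(),
    "x3": "s2 f1 a2 r1 k2 e1 p1 h2 d3".split(),
    "x4": "s1 f1 a1 r2 k2 e1 p1 h2 d1".split(),
    "x5": "s1 f2 a0 r1 k2 e1 p2 h2 d2".split(),
    "x6": "s1 f1 a1 r1 k2 e2 p1 h2 d3".split(),
    "x7": "s1 f1 a2 r1 k2 e1 p1 h1 d3".split(),
}

def discernibility(rows):
    """For each pair with different decisions, the attributes on which they differ."""
    matrix = {}
    for x, y in combinations(rows, 2):
        if rows[x][-1] != rows[y][-1]:
            matrix[(x, y)] = {a for a, u, v in zip(ATTRS, rows[x], rows[y]) if u != v}
    return matrix

def absorb(clauses):
    """Drop every clause that strictly contains another clause."""
    unique = {frozenset(c) for c in clauses}
    return {c for c in unique if not any(other < c for other in unique)}

matrix = discernibility(ROWS)
clauses = absorb(matrix.values())
for clause in sorted(clauses, key=lambda c: (len(c), sorted(c))):
    print("".join(sorted(clause)))
```

`absorb` applies the absorption law to the conjunction of clauses, leaving the eight clauses listed after the discernibility function. It is also the superset-removal step described for SRED(At).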

  31. Final Note
     The reducts obtained with the GMDIS approach are contained in the reducts obtained on the SDIS using the discernibility matrix, which means that the new approach gives more reduction.

  32. Conclusion
     • The new approach for data reduction in GMDIS is a generalization of the MDIS case.
     • The approach reduces to Pawlak's approach if the system is single-valued and the relations are equivalence relations.
     • It opens the way for other approaches to data reduction based on recent general topological concepts such as pre-open sets, semi-open sets, etc.
     • In many real-life situations, using attributes individually does not represent their actual effect, so it is necessary to consider subsets of the attributes as multiple criteria.
     • The reduct approach was applied to Rheumatic Fever datasets to assess its ability and accuracy.

  33. Acknowledgment
     • The authors greatly appreciate and thank the following people for their valuable comments and advice:
       • Dr. K. E. Sturtz, Air Force Research Laboratory, Wright Patterson Air Force Base, Ohio
       • Prof. Aboul Ella Hassanien, Cairo University
       • Prof. E. Rady, I.S.S.R., Cairo University
       • Dr. A. S. Salama, Pure Mathematics Dept., Faculty of Science, Tanta University

  34. Thank you for your kind attention
