Discovery of Temporal Patterns in Course-of-Disease Medical Data

Discovery of Temporal Patterns in Course-of-Disease Medical Data. Jorge C. G. Ramirez, Ph.D. Candidate. Lynn L. Peterson and Diane J. Cook, Supervising Professors. Overview: Objective, Contributions, Approach, TEMPADIS, Summary and Conclusions.

Presentation Transcript


  1. Discovery of Temporal Patterns in Course-of-Disease Medical Data Jorge C. G. Ramirez Ph.D. Candidate Lynn L. Peterson and Diane J. Cook Supervising Professors

  2. Overview • Objective • Contributions • Approach • TEMPADIS • Summary and Conclusions

  3. Objective • Discover patterns that represent groups of patients that had a similar course of disease for a catastrophic or chronic illness • Motivation • Medical • AI

  4. Contributions • Data Preprocessing • Normalization • Learning Missing Data • Learning Implicit Knowledge • Exploratory Analysis • Event Set Sequence Approach

  5. Contributions • Domain Understanding • New perspective on mass of data • Identify groups of patients for further medical study

  6. Approach • Example Events • Laboratory Results • 461 L WBC 2.70 • 461 L HCT 40.10 • 461 L PLT 239.00 • 461 L CD4% 19.00 • 461 L CD4A 188.00

  7. Approach • Example Events • Visits • 468 C CV • Diagnoses • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED • Pharmacy • 469 P CTM 60 CO-TRIMOXAZOLE DS • 469 P AZT 200 ZIDOVUDINE 100MG

  8. Approach • Event Set Sequences • Events • Value Event: laboratory test result, visit • Duration Event: pharmacy, diagnosis • An Event Set is all Events that occur within a window of time • An Event Set Sequence is all Event Sets that occur over a long period of time

  9. Approach • Example Event Set • 461 L WBC 2.70 • 461 L HCT 40.10 • 461 L PLT 239.00 • 461 L CD4% 19.00 • 461 L CD4A 188.00 • 468 C CV • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED • 469 P CTM 60 CO-TRIMOXAZOLE DS • 469 P AZT 200 ZIDOVUDINE 100MG
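The event records and the Event Set definition above can be sketched in Python as follows. This is only an illustration: the field names, the `Event` class, the `build_event_sets` helper, and the 14-day window are assumptions, since the slides do not state the window size or how a window is anchored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    day: int      # day offset in the patient's record
    kind: str     # L = laboratory, C = visit, D = diagnosis, P = pharmacy
    code: str     # e.g. WBC, CV, 043.9, AZT
    detail: str   # remaining text: lab value, dose, or description


def parse_event(line: str) -> Event:
    """Parse one record such as '461 L WBC 2.70' or '468 C CV'."""
    parts = line.split(maxsplit=3)
    detail = parts[3] if len(parts) > 3 else ""
    return Event(int(parts[0]), parts[1], parts[2], detail)


def build_event_sets(events: List[Event], window_days: int = 14) -> List[List[Event]]:
    """Group events into Event Sets.

    Assumption: a new Event Set starts when an event falls more than
    `window_days` after the first event of the current set.
    """
    event_sets: List[List[Event]] = []
    for ev in sorted(events, key=lambda e: e.day):
        if event_sets and ev.day - event_sets[-1][0].day <= window_days:
            event_sets[-1].append(ev)
        else:
            event_sets.append([ev])
    return event_sets


# The nine example records from the slides.
RAW = [
    "461 L WBC 2.70",
    "461 L HCT 40.10",
    "461 L PLT 239.00",
    "461 L CD4% 19.00",
    "461 L CD4A 188.00",
    "468 C CV",
    "468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED",
    "469 P CTM 60 CO-TRIMOXAZOLE DS",
    "469 P AZT 200 ZIDOVUDINE 100MG",
]
EVENTS = [parse_event(line) for line in RAW]
ONE_SET = build_event_sets(EVENTS, window_days=14)
TWO_SETS = build_event_sets(EVENTS, window_days=5)
```

With the assumed 14-day window, all nine records fall into a single Event Set, as in the slide's example; a narrower window splits the laboratory results from the visit, diagnosis, and pharmacy events. The list of Event Sets returned for a whole patient history plays the role of the Event Set Sequence.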

  10. Approach • Normalization • Normal for each patient is different • Especially when affected by a catastrophic or chronic illness • Example: CD4A • General Population Normal: 416 - 1751 • Well HIV-positive patient: 200 - 350 • Severely immune-compromised patient: 0 - 50

  11. Approach • Normalization (continued) • Scale to -4…0…+4 • 0 is normal • Each number represents a deviation from normal • 1 and 2 are noticeable but not severe • 3 is severe • 4 is very severe
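A minimal sketch of how a raw value might be mapped onto the -4…0…+4 scale relative to a patient-specific normal range. The slides do not give the actual mapping, so the `normalize` function, its `step` parameter, and the step width of 50 used below are all assumptions made for illustration.

```python
import math


def normalize(value: float, normal_low: float, normal_high: float, step: float) -> int:
    """Map a raw value to an integer in -4..+4.

    Assumption: 0 means "inside the normal range"; outside it, each
    `step` units of distance from the nearest bound adds one level of
    deviation, capped at 4 (very severe).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if normal_low <= value <= normal_high:
        return 0
    if value < normal_low:
        distance, sign = normal_low - value, -1
    else:
        distance, sign = value - normal_high, 1
    return sign * min(4, math.ceil(distance / step))


# CD4A of 188 (from the example lab results), hypothetical step of 50.
AGAINST_PATIENT_RANGE = normalize(188, 200, 350, step=50)      # well HIV-positive range
AGAINST_POPULATION_RANGE = normalize(188, 416, 1751, step=50)  # general population range
```

Under these assumptions, the same CD4A value of 188 is a mild deviation when measured against the 200 - 350 range of a well HIV-positive patient, but saturates at the most severe level against the general-population range, which is the motivation for patient-specific normals.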

  12. Approach • Replace Missing Data • Diagnosis data very incomplete • Learn severity of condition from pharmacy data • Induce decision tree to classify conditions

  13. Approach • Create Health Status Categories • = HIV-positive asymptomatic • = Asymptomatic, on anti-HIV therapy • = Immune-compromised, on prophylactic therapy • = Active illness • = Severe active illness

  14. Approach • Learn Implicit Knowledge • Need to augment explicit knowledge • Recovery time is expert’s implicit knowledge • Use neural network to learn recovery time function • 0 = Nothing to recover from • 1-4 = weeks to recover • 5 = 5 or more weeks to recover

  15. Approach • Categorize Pharmacy Data • A myriad of drugs prescribed • Need to understand significance • Categorize by use

  16. Approach • Categories • Nucleoside Analogs • Protease Inhibitors • Prophylaxis Therapies • Intravenous antibiotics • Anti-virals • Anti-PCP/Toxoplasmosis • Anti-mycobacterials

  17. Approach • Categories (continued) • Anti-wasting syndrome • Anti-fungals • Chemotherapies

  18. Approach • Result: Understandable representation of patient data • 861 C 1.1 26.1 167 0.0 0 16 0 • 862 0.0 0.0 0 0.0 0 0 2 24: 30 38: 50 • 867 H 4.3 19.2 144 0.0 0 11 3 0: 3 22: 1 35: 2 • 868 H 2.2 26.2 144 0.0 0 5 3 0: 3 22: 1 35: 2 • 869 0.0 0.0 0 0.0 0 0 1 35: 60 • 874 C 1.3 32.4 0 0.0 0 17 0 • 889 C 1.1 30.4 154 0.0 0 36 0 • 890 0.0 0.0 0 0.0 0 0 3 22: 30 38: 50 39:480 • 923 0.0 0.0 0 0.0 0 0 1 39:480 • 933 H 3.6 20.4 182 0.0 0 11 3 0: 2 22: 1 39: 12

  19. Approach • Result: Understandable representation of patient data • 861 C 3 1 -4 -3 0 -9 -9 -1 0 0 2 0 0 0 0 0 0 0 • 867 H 4 4 0 -4 -1 -9 -9 -2 0 0 2 0 0 0 1 1 0 0 • 868 H 4 1 -2 -3 -1 -9 -9 -4 0 0 2 0 0 0 1 1 0 0 • 874 C 4 3 -4 -1 -9 -9 -9 0 0 0 2 0 0 0 1 1 0 0 • 889 C 4 2 -4 -2 -1 -9 -9 2 0 0 2 0 0 0 1 1 0 0 • 933 H 4 4 0 -4 0 -9 -9 -2 0 0 1 0 0 0 0 2 0 0

  20. Approach • Result: Understandable representation of patient data • < { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0) • (LMPH -1)(onD 0010000000) } • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1) • (LMPH -2)(onD 0010001100) } • { (EV H)(HS 4)(RT 1)(WBC -2)(HCT -3)(PLT -1) • (LMPH -4)(onD 0010001100) } • { (EV C)(HS 4)(RT 3)(WBC -4)(HCT -1) • (onD 00010001100) } • { (EV C)(HS 4)(RT 2)(WBC -4)(HCT -2)(PLT -1) • (LMPH 2)(onD 0010001100) } • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT 0) • (LMPH -2)(onD 0010000100) } >
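The step from the numeric rows on slide 19 to the Event Set notation on slide 20 can be sketched as below. The column layout is inferred from comparing the two slides and is an assumption, not something the slides state: day, event type (EV), health status (HS), recovery time (RT), six normalized laboratory values, then ten per-category drug values, with -9 treated as a missing value and each `onD` flag set when the corresponding drug value is nonzero. The function name `row_to_event_set` is hypothetical.

```python
LAB_NAMES = ["WBC", "HCT", "PLT", "CD4P", "CD4A", "LMPH"]
MISSING = -9  # assumed marker for "no value recorded"


def row_to_event_set(row: str) -> str:
    """Render one normalized row in the parenthesized Event Set notation."""
    fields = row.split()
    ev, hs, rt = fields[1], fields[2], fields[3]
    labs = [int(x) for x in fields[4:10]]
    drugs = [int(x) for x in fields[10:20]]

    parts = [f"(EV {ev})", f"(HS {hs})", f"(RT {rt})"]
    # Missing laboratory values are simply left out of the Event Set.
    parts += [f"({name} {v})" for name, v in zip(LAB_NAMES, labs) if v != MISSING]
    # One flag per drug category: 1 if the patient is on a drug of that category.
    on_d = "".join("1" if d else "0" for d in drugs)
    parts.append(f"(onD {on_d})")
    return "{ " + "".join(parts) + " }"


FIRST = row_to_event_set("861 C 3 1 -4 -3 0 -9 -9 -1 0 0 2 0 0 0 0 0 0 0")
LAST = row_to_event_set("933 H 4 4 0 -4 0 -9 -9 -2 0 0 1 0 0 0 0 2 0 0")
```

Under these assumptions the rows for days 861 and 933 render as the first and last Event Sets shown on slide 20, apart from the line-wrap bullets in the transcript.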

  21. Approach • Inexact Match • Use set difference • Partial match, feature by feature • Assumes default partial match for missing data • Use weakest-link/average-link • Require minimum degree of match • Require average degree of match
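A rough sketch of the inexact-match idea: features are compared one by one, a missing feature receives a default partial score, and a pair of sequences matches only if both the weakest Event Set pairing and the average pairing are good enough. The scoring formula, the default of 0.5 for missing data, the thresholds, and all function names are assumptions for illustration; the slides do not give these details, and this sketch does not reproduce the set-difference computation they mention.

```python
from typing import Dict, List, Optional

EventSet = Dict[str, int]   # feature name -> normalized value, e.g. {"WBC": -4}

MISSING_DEFAULT = 0.5       # assumed partial match when a feature is absent on one side
VALUE_RANGE = 8             # normalized values span -4..+4


def feature_match(a: Optional[int], b: Optional[int]) -> float:
    """Degree of match for one feature, from 0.0 (opposite extremes) to 1.0 (equal)."""
    if a is None or b is None:
        return MISSING_DEFAULT
    return 1.0 - abs(a - b) / VALUE_RANGE


def event_set_match(x: EventSet, y: EventSet) -> float:
    """Average feature-by-feature match over every feature seen in either set."""
    features = sorted(set(x) | set(y))
    if not features:
        return 1.0
    return sum(feature_match(x.get(f), y.get(f)) for f in features) / len(features)


def sequences_match(p: List[EventSet], q: List[EventSet],
                    min_degree: float = 0.6, avg_degree: float = 0.8) -> bool:
    """Weakest-link plus average-link test over aligned Event Sets."""
    if not p or len(p) != len(q):
        return False
    degrees = [event_set_match(a, b) for a, b in zip(p, q)]
    return min(degrees) >= min_degree and sum(degrees) / len(degrees) >= avg_degree


A = {"WBC": -4, "HCT": -3, "PLT": 0}
B = {"WBC": -4, "HCT": -1, "PLT": 0}
C = {"WBC": 4, "HCT": 3, "PLT": 0}
CLOSE = sequences_match([A], [B])
WEAK_LINK = sequences_match([A, A], [B, C])
```

Requiring both thresholds means that a single badly matching Event Set rejects the pair even when the average over the sequence looks acceptable, which is one reading of combining weakest-link with average-link.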

  22. Raw Target Data Data Cleaning Data Normalization Normalized Database TEMPADIS

  23. Decision Tree Normalized Database Reduced, Knowledge-Added Data Neural Net TEMPADIS

  24. Knowledge-Added Database Sequence Builder Temporal Patterns TEMPADIS

  25. Results • Validation • Results are temporal patterns that demonstrate that groups of patients had similar experiences during the course of disease • Only medical experts can assess validity of discovered patterns • These results have been validated by the experts in the HIV Clinical Research Group

  26. Results • Given a database of patients followed for 4 to 9 years • Discovered interesting patterns • Interestingness has multiple dimensions • Length • Data that appears in the patterns • Data that does not appear in the patterns

  27. Results • Advanced patients, subject to various OIs • < { (EV C)(HS 3)(RT 0)(WBC 0)(HCT -1)(PLT 0)(LMPH -3) • (onD 0000000000) } • { (EV E)(HS 3)(RT 2)(WBC 3)(HCT -1)(PLT 1)(LMPH 4) • (onD 0000000000) } • { (EV C)(HS 3)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P -3) • (CD4A -1)(LMPH 0)(onD 1010000000) } • { (EV C)(HS 3)(RT 1)(WBC -1)(HCT -1)(PLT 1)(LMPH 2) • (onD 1010000000) } • { (EV E)(HS 3)(RT 1)(WBC 2)(HCT -1)(PLT 1)(LMPH 4) • (onD 0000000000) } • { (EV C)(HS 3)(RT 1)(WBC 1)(HCT 0)(PLT 0)(CD4P -3) • (CD4A -2)(LMPH 0)(onD 1010000000) } >

  28. Advanced patients, fairly stable • < { (EV C)(HS 3)(RT 0)(WBC -1)(HCT -1)(PLT 1)(CD4P -4) • (CD4A -4)(LMPH 0)(onD 0010000000) } • { (EV C)(HS 3)(RT 0)(WBC 0)(HCT 0)(PLT -1)(CD4P -4) • (CD4A -4)(LMPH 0)(onD 1010000000) } • { (EV C)(HS 3)(RT 0)(onD 1010000000) } • { (EV C)(HS 3)(RT 0)(WBC -2)(HCT 0)(PLT -1)(CD4P -4) • (CD4A -4)(LMPH 0)(onD 0010000000) } • { (EV C)(HS 4)(RT 1)(WBC 1)(HCT -4)(PLT 0)(CD4P -4) • (CD4A -4)(LMPH -4)(onD 0011001000) } • { (EV C)(HS 3)(RT 3)(onD 0010000000) } • { (EV )(HS 3)(RT 1)(WBC 0)(HCT 0)(PLT 0)(LMPH 0) • (onD 0000000000) } • { (EV C)(HS 3)(RT 0)(CD4A -4)(onD 0010000000) } >

  29. Asymptomatic period • < { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 1)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV E)(HS 1)(RT 0)(WBC -1)(HCT 0)(PLT 1)(CD4P -1) • (CD4A -2)(LMPH 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) } • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) } • { (EV E)(HS 1)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P 0) • (CD4A 0)(LMPH 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } • { (EV C)(HS 1)(RT 0)(onD 0000000000) } >

  30. Summary • Nine Steps of KDD • Identify goal • Identify target data set • Data cleaning and preprocessing • Data reduction and projection • Identify data mining method

  31. Summary • Nine Steps of KDD • Exploratory Analysis • Data Mining • Interpretation of Mined Patterns • Acting on Discovered Knowledge

  32. Conclusions • Objective Met with Contributions • Patterns discovered representing groups of patients with similar experience in course of disease • This perspective on the data has not previously been produced • This kind of computation on this kind of data has not previously been produced

  33. Future Work • Improve discovery algorithm • Backtracking is a barrier to overcome • Improve search control • Develop heuristic for measuring interestingness • Add ability to identify clinically identical/similar patterns

  34. Future Work • Move database to new Intelligent Systems in Medicine and Biology Lab • Bring database up to date • Include more domain data in Event Sets • Explore impact of new developments in HIV treatment
