
Pathology data sharing United States Military Cancer Institute Walter Reed Army Medical Center



Presentation Transcript


  1. Pathology data sharing. United States Military Cancer Institute, Walter Reed Army Medical Center, November 16, 2004. Jules J. Berman, Ph.D., M.D., Program Director, Pathology Informatics, Cancer Diagnosis Program, NCI, NIH. Email: bermanj@mail.nih.gov

  2. UFO Abductees: Lots of them. They often say about the same thing (independent confirmations). All walks of life. Minority are a little crazy. Mostly honest and rational people. One problem: no evidence.

  3. Researchers who don’t publish their primary data: Lots of them. They often say about the same thing (independent confirmations). All walks of life. Minority are a little crazy. Mostly honest and rational people. One problem: no evidence.

  4. After your research data reaches a certain size, the data becomes the publication, and the journal articles become tiny editorials that describe or interpret the data

  5. In a data-intensive world, the data is the center of the universe. Manuscripts are satellites revolving around a central large BLOB of data.

  6. Sticks and carrots: NIH Statement on Data Sharing, http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html ; National Research Council UPSIDE (Universal Principle of Sharing Integral Data Expeditiously), http://books.nap.edu/books/0309088593/html/R1.html

  7. NIH Funding for data sharing: Shared Pathology Informatics Network, http://grants.nih.gov/grants/guide/rfa-files/RFA-CA-01-006.html ; Tools for collaborations that involve data sharing, http://grants1.nih.gov/grants/guide/pa-files/PAR-03-134.html ; Infrastructure for data sharing and archiving, http://grants.nih.gov/grants/guide/rfa-files/RFA-HD-03-032.html ; caBIG, http://cabig.nci.nih.gov/

  8. Confidentiality methods

  9. Two U.S. regulations that tell us how we can use medical records in research: the Common Rule and the HIPAA Privacy Rule. In pathology informatics, the most ambitious research typically involves hundreds of thousands or millions of patient records. Getting informed consent for these studies is not feasible. HIPAA and the Common Rule both work under the principle that medical research is good, and that it can be conducted without getting patient consent if you can come up with a way to avoid harming patients (no harm, no consent for harm).

  10. HIPAA

  11. IRB

  12. Corporate Lawyer

  13. Irate Human Subject

  14. Principal Investigator

  15. Articles: Berman JJ. Confidentiality for Medical Data Miners. Artificial Intelligence in Medicine. 26(1-2):25-36, 2002. Berman JJ. Concept-Match Medical Data Scrubbing: How pathology datasets can be used in research. Arch Pathol Lab Med. 2003 Jun;127(6):680-6. Berman JJ. Threshold protocol for the exchange of confidential medical data. BMC Medical Research Methodology, 2002, 2:12.

  16. More: Berman JJ. A tool for sharing annotated research data: the "Category 0" UMLS (Unified Medical Language System) vocabularies. BMC Med Inform Decis Mak. 2003 Jun 16;3(1):6. Berman JJ. Zero-check: a zero-knowledge protocol for reconciling patient identities across institutions. Arch Pathol Lab Med. 2004 Mar;128(3):344-6. Berman JJ. Racing to share pathology data. Am J Clin Pathol. 2004 Feb;121(2):169-71 (editorial).

  17. Standard ways of organizing data (nomenclatures, taxonomies, classifications, data structures)

  18. Director’s Challenge: Toward a molecular classification of tumors. In January 1999, the U.S. National Cancer Institute (NCI) issued a challenge to the scientific community "to harness the power of comprehensive molecular analysis technologies to make the classification of tumors vastly more informative. This challenge is intended to lay the groundwork for changing the basis of tumor classification from morphological to molecular characteristics."

  19. Impediment: Misunderstanding about the definition of classification

  20. Classifications are not: identification systems; taxonomies and nomenclatures; ontologies; achieved by analyzing gene expression array data.

  21. What is a [tumor] classification? A grouped taxonomy [listing of all tumors] with the following properties. Inheritance: hierarchical structure, with each class of tumors inheriting properties of its ancestors. Uniqueness: each tumor occurs in only one place in the classification. Comprehensive: all tumors are included. Intransitive: a tumor from one class does not change into a tumor from another class (e.g. an adenocarcinoma does not become a lymphoma).
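The inheritance, uniqueness and comprehensiveness properties listed on slide 21 can be illustrated with a small sketch. Everything here is a hypothetical toy: the class names, the property strings and the helper functions are assumptions made for illustration, not content from the classification described in the talk.

```python
# Hypothetical toy hierarchy: each class names its parent (None marks the root).
PARENT = {
    "neoplasm": None,
    "epithelial_tumor": "neoplasm",
    "hematopoietic_tumor": "neoplasm",
    "adenocarcinoma": "epithelial_tumor",
    "lymphoma": "hematopoietic_tumor",
}

# Made-up placeholder properties declared directly on a class.
OWN_PROPERTIES = {
    "neoplasm": {"property shared by all tumors"},
    "epithelial_tumor": {"property of epithelial tumors"},
    "hematopoietic_tumor": {"property of hematopoietic tumors"},
}


def lineage(cls):
    """Return the class followed by all of its ancestors, up to the root."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain


def inherited_properties(cls):
    """Inheritance: a class carries its own properties plus those of every ancestor."""
    props = set()
    for ancestor in lineage(cls):
        props |= OWN_PROPERTIES.get(ancestor, set())
    return props


def multiply_placed(assignments):
    """Uniqueness check: return tumors assigned to more than one class.

    assignments is a list of (tumor_name, class_name) pairs.
    """
    first_seen = {}
    duplicates = set()
    for tumor, cls in assignments:
        if tumor in first_seen and first_seen[tumor] != cls:
            duplicates.add(tumor)
        first_seen.setdefault(tumor, cls)
    return duplicates


def unplaced(all_tumors, assignments):
    """Comprehensiveness check: return tumors that have no class at all."""
    placed = {tumor for tumor, _ in assignments}
    return set(all_tumors) - placed
```

A classification in the sense of the slide would return empty sets from both `multiply_placed` and `unplaced`. The intransitive property concerns tumor biology over time and is not something a static check like this can verify.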

  22. Problems with current tumor classifications: 1. Created piecemeal. 2. Often based on medical disciplines. 3. A given tumor can appear redundantly when subclassifications are merged. 4. No tumor classification has been prepared in a standard format designed to exchange, merge or analyze heterogeneous biological data.

  23. New Tumor Classification: Comprehensive; ~122,000 terms (9 Megabytes). Based on developmental and molecular biologic features of tumors. Heritable class structure with a unique class location for each tumor. XML document that can be cross-annotated with molecular biology databases. Preserves current tumor names, while abandoning purely morphologic categories (e.g. epithelial/stromal).

  24. Latest version of the nomenclature (122,000+ terms): http://www.pathologyinformatics.org/informatics_r.htm Copy of paper: Berman JJ. Tumor classification: molecular analysis meets Aristotle. BMC Cancer 2004, 4:10 (17 March 2004). http://www.biomedcentral.com/1471-2407/4/10

  25. Standard ways of exchanging data

  26. XML is the greatest information organizing tool since the invention of the book. Much more important than HTML. Takes advantage of: metadata, namespaces, Internet, external links.
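As a rough illustration of how metadata tags and namespaces make a record self-describing, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The namespace URI, the element names and the sample records are all invented for this example; they are not taken from the TMA specification or from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; a real exchange format would publish its own.
EX_NS = "http://example.org/hypothetical-pathology#"

# Invented sample document: each value is wrapped in a tag (metadata) that says
# what the value means, and the namespace says who defined that tag.
DOC = (
    '<specimens xmlns:ex="' + EX_NS + '">'
    '<ex:specimen id="S1">'
    "<ex:diagnosis>adenocarcinoma</ex:diagnosis>"
    "<ex:organ>prostate</ex:organ>"
    "</ex:specimen>"
    '<ex:specimen id="S2">'
    "<ex:diagnosis>lymphoma</ex:diagnosis>"
    "<ex:organ>lymph node</ex:organ>"
    "</ex:specimen>"
    "</specimens>"
)

# Prefix-to-URI map used when searching; the prefix itself is arbitrary.
NS = {"ex": EX_NS}


def read_specimens(xml_text):
    """Parse the document and return one dictionary per specimen record."""
    root = ET.fromstring(xml_text)
    records = []
    for spec in root.findall("ex:specimen", NS):
        records.append(
            {
                "id": spec.get("id"),
                "diagnosis": spec.findtext("ex:diagnosis", namespaces=NS),
                "organ": spec.findtext("ex:organ", namespaces=NS),
            }
        )
    return records


records = read_specimens(DOC)
```

Because every element is qualified by a namespace, a receiving program can tell the hypothetical `ex:diagnosis` apart from a same-named tag defined by another group, which is what lets documents from different sources be merged.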

  27. Example: Tissue Microarray Data Exchange Specification. The TMA Specification is an open access document that can be used without any restriction. Its development was sponsored by the NCI and by the Association for Pathology Informatics.

  28. Basics of the specification: Jules J Berman, Mary Edgerton and Bruce Friedman. The tissue microarray data exchange specification: a community-based, open source tool for sharing tissue microarray data. BMC Med Inform Decis Mak. 2003 May 23;3:5. Real-world implementation example: Jules J Berman, Milton Datta, Andre Kajdacsy-Balla, Jonathan Melamed, Jan Orenstein, Kevin Dobbin, Ashok Patel, Rajiv Dhir, Michael J Becich. The tissue microarray data exchange specification: implementation by the Cooperative Prostate Cancer Tissue Resource. BMC Bioinformatics 2004 Feb 27, 5:19.

  29. LDIP (Laboratory Digital Imaging Project): Association for Pathology Informatics Pathology Image Data Exchange Specification. Information available at: http://www.pathologyinformatics.org/ldip.htm

  30. end
