
Datamining MEDLINE for Topics and Trends in Dental and Craniofacial Research

William C. Bartling, D.D.S., NIDCR/NLM Fellow in Dental Informatics, Center for Biomedical Informatics, University of Pittsburgh; Titus K. L. Schleyer, D.M.D., Ph.D., Director, Center for Dental Informatics



Presentation Transcript


  1. Datamining MEDLINE for Topics and Trends in Dental and Craniofacial Research • William C. Bartling, D.D.S., NIDCR/NLM Fellow in Dental Informatics, Center for Biomedical Informatics, University of Pittsburgh • Titus K. L. Schleyer, D.M.D., Ph.D., Director, Center for Dental Informatics, University of Pittsburgh School of Dental Medicine

  2. Overview • Goals of project • Retrieving the entire corpus of dental and craniofacial research literature from MEDLINE • Determining the characteristics of a dental research article • Machine learning to extract articles from any body of literature • Methods to categorize dental research literature to study temporal trends • Summary

  3. Goals of project • To use computerized methods to determine topics and trends in dental and craniofacial research since 1966. • Determining the structure of such research can help to identify those research areas emerging and those waning. • Identify research funding opportunities?

  4. Retrieving the dental literature • MEDLINE chosen as the database • MeSH tree searched manually for dental and craniofacial terms • Many MeSH terms were found in unusual locations in the hierarchy • Each term was reviewed and either kept or discarded • Search limited to: • English language • Journal article • Abstract present

  5. Results of search • ~450,000 English language articles in: • DENTISTRY • STOMATOGNATHIC SYSTEM (not PHARYNX) • STOMATOGNATHIC DISEASES (not PHARYNGEAL DISEASES) • ~61,000 articles indexed with dental MeSH terms not in above set • ~134,000 articles remaining after limiting to journal articles containing abstracts
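The search strategy on the two slides above can be sketched as a boolean query string. This is a minimal illustration only: the slides do not say which search interface was used, so the PubMed-style field tags (`[MeSH Terms]`, `[Language]`, `[Publication Type]`, `hasabstract`) and the helper name `build_query` are assumptions, and the ~61,000 articles found through additional dental MeSH terms are not represented.

```python
def build_query(include, exclude, limits):
    """Combine MeSH headings and limits into one boolean query string.

    Assumption: PubMed-style field tags; the original project may have
    used a different MEDLINE interface with different syntax.
    """
    included = " OR ".join(f'"{term}"[MeSH Terms]' for term in include)
    excluded = " OR ".join(f'"{term}"[MeSH Terms]' for term in exclude)
    query = f"(({included}) NOT ({excluded}))"
    for limit in limits:
        query += f" AND {limit}"
    return query


# Headings and exclusions taken from the "Results of search" slide.
query = build_query(
    include=["Dentistry", "Stomatognathic System", "Stomatognathic Diseases"],
    exclude=["Pharynx", "Pharyngeal Diseases"],
    limits=["english[Language]", '"journal article"[Publication Type]', "hasabstract"],
)
print(query)
```

Building the string from lists keeps the manually reviewed MeSH terms in one place, so a term can be kept or discarded without editing the query by hand.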

  6. What is a dental research article? • The project is currently at this phase • 1000 abstracts randomly chosen, 5 groups of 200 each • 15 expert judges • 3 judges assigned to each group • Judges categorize each article as: • Dental or craniofacial research • Dental or craniofacial, non-research • Non-dental • Not sure • Web interface for judging: PHP with MySQL

  7. Differentiation of article categories • Acceptable inter-judge reliability in each group (> 0.70) • Use results of each category to develop training set • Identify Patient Sets (IPS) software • Developed by Dr. Greg Cooper at University of Pittsburgh CBMI • Natural language processing used to find patient records of a certain type in free-text documents, e.g., hospital admission records
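The slides report a reliability above 0.70 for each group of three judges but do not name the statistic. As a sketch under that caveat, the example below computes Fleiss' kappa, one common agreement measure for a fixed number of raters and nominal categories; the choice of statistic, the short category labels, and the toy ratings are all assumptions for illustration.

```python
from collections import Counter

# Short labels standing in for the four judging categories on slide 6.
CATEGORIES = ["research", "non-research", "non-dental", "not sure"]


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' kappa for a list of items, each rated by the same number of judges."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]

    # Mean observed agreement across items.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items

    # Agreement expected by chance from the overall category proportions.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)


# Toy data: four abstracts, three judges each (not real study data).
toy_ratings = [
    ["research", "research", "research"],
    ["non-research", "non-research", "non-research"],
    ["non-dental", "non-dental", "non-dental"],
    ["research", "research", "non-research"],
]
kappa = fleiss_kappa(toy_ratings)
print(round(kappa, 3))
```

In this toy set the judges disagree on only one abstract, which is enough to pull kappa well below 1 because there are so few items.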

  8. IPS creates a document vector for each document or set of documents • Document i: (Word 1, p1), (Word 2, p2), (Word 3, p3), ..., (Word n, pn)
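A document vector of this shape can be illustrated with a few lines of code. The slide does not define the values p1 ... pn, so this sketch assumes each p is the word's relative frequency within the document; the function name `document_vector` and the tokenization rule are also illustrative, not taken from IPS.

```python
import re
from collections import Counter


def document_vector(text):
    """Map each word in the text to its relative frequency (assumed meaning of p)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


vec = document_vector("Caries risk in children: caries prevention")
for word, p in sorted(vec.items()):
    print(word, round(p, 3))
```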

  9. IDENTIFY PATIENT SETS (IPS) • Uses machine learning technique of “text classification” • All articles fed into the program • Select fields (title, abstract, MeSH terms) • Training set: • 2/3 of validated “dental research” articles • Add remaining 1/3 to original set, less the training set • Calculate success of retrieval using model created from training set • Adjust IPS and iterate, or retrain with more or fewer documents, until successful
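The train-and-evaluate loop described above can be sketched as follows. The slides do not describe the algorithm inside IPS, so a small multinomial naive Bayes classifier is used here purely as a stand-in; the function names, the "other" class, and the toy titles are assumptions, not project data.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def split_two_thirds(items, seed=0):
    """Shuffle and split into a 2/3 training part and a 1/3 held-out part."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def train(labeled_docs):
    """Count words per label; labeled_docs is a list of (text, label) pairs."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # Laplace smoothing
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy titles standing in for judged abstracts (hypothetical).
positives = [
    "dental caries trial in children", "dental implant trial outcomes",
    "dental enamel trial fluoride", "dental plaque trial results",
    "dental pulp trial healing", "dental erosion trial study",
]
negatives = ["cardiac surgery review", "cardiac imaging review", "renal cardiac review"]

train_pos, held_out = split_two_thirds(positives)
model = train([(t, "dental research") for t in train_pos]
              + [(t, "other") for t in negatives])
retrieved = [t for t in held_out if classify(model, t) == "dental research"]
recall = len(retrieved) / len(held_out)
print(f"held-out recall: {recall:.2f}")
```

Recall on the held-out third plays the role of "success of retrieval" here; if it were too low, the loop on the slide would adjust the model or the training set and run again.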

  10. Determining trends and topics in dental and craniofacial research • Entire set of dental research articles used • Knowledge visualization and bibliometric methods • Based on the assumption that articles in a given field are similar to one another (Hearst & Pedersen, 1996) • Similar articles and topics tend to cluster together

  11. Bibliometric examples from other fields • Co-word analysis • Software engineering (Coulter, Monarch, and Konda, 1998) • Co-descriptor analysis • Information science (McCain, 1995) • Co-author analysis • Information retrieval literature (Ding et al., 1999) • Co-citation analysis • Medical informatics literature (Morris & McCain, 1998)

  12. Visual methods to categorize literature • Co-occurrence vectors or weights • Weights based on co-occurrence of terms • Multidimensional scaling • Display of points in two or three dimensions • Points closer together on matrix when articles are more similar • Clustering • Groups of points in close proximity to each other are bounded to provide an intellectual grouping
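Two of the steps above, co-occurrence weights and clustering, can be sketched in a few lines. This is a simplified illustration under stated assumptions: raw co-occurrence counts serve as the weights, terms are grouped by a single-link threshold rule rather than by any method named in the slides, multidimensional scaling is not shown, and the MeSH-like terms are toy data.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(articles):
    """Count how often each pair of terms appears on the same article."""
    counts = defaultdict(int)
    for terms in articles:
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def cluster(terms, counts, min_count=2):
    """Group terms connected by at least min_count co-occurrences (single link)."""
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the group's representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), n in counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for t in terms:
        groups[find(t)].add(t)
    return sorted(sorted(g) for g in groups.values())


# Toy index terms for four articles (hypothetical).
articles = [
    ["Dental Caries", "Fluorides"],
    ["Dental Caries", "Fluorides", "Child"],
    ["Dental Implants", "Osseointegration"],
    ["Dental Implants", "Osseointegration", "Child"],
]
counts = cooccurrence(articles)
terms = sorted({t for a in articles for t in a})
clusters = cluster(terms, counts)
print(clusters)
```

The threshold `min_count` decides how strong a link must be before two terms share a cluster; in the toy data, "Child" co-occurs with everything only once and therefore stays on its own.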

  13. Medical Informatics Structure

  14. How do we cluster dental research? • Entire text of abstracts • MeSH terms only • Major headings • Subheadings • All MeSH headings • Journal titles • Combinations of the above

  15. Once clustering is done: • Cluster dental research within certain time periods (5 years) • Determine quantities of articles published for each cluster within each time period • Cluster including only journals above a given impact factor threshold • Study changes over time of different categories of research
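Counting articles per cluster in each five-year period is a simple tally. The sketch below assumes periods start in 1966, the first year of the study window named in the project goals; the function names, cluster labels, and records are illustrative only.

```python
from collections import Counter


def period_start(year, first_year=1966, width=5):
    """Return the first year of the fixed-width period containing `year`."""
    return first_year + ((year - first_year) // width) * width


def trend_table(records, first_year=1966, width=5):
    """Tally (cluster, period start) pairs from (year, cluster) records."""
    table = Counter()
    for year, cluster in records:
        table[(cluster, period_start(year, first_year, width))] += 1
    return table


# Toy (publication year, cluster label) records (hypothetical).
records = [
    (1967, "caries"), (1970, "caries"), (1971, "caries"),
    (1968, "implants"), (1999, "implants"), (2000, "implants"),
]
table = trend_table(records)
for (cluster, start), n in sorted(table.items()):
    print(f"{cluster:10s} {start}-{start + 4}: {n}")
```

Comparing a cluster's counts across periods gives the rising or waning trend the project aims to detect; restricting `records` to journals above an impact factor threshold would produce the filtered variant mentioned on the slide.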

  16. Summary • A comprehensive content analysis of the dental and craniofacial research literature has not been done. • Computerized methods can help to retrieve and categorize this literature. • Study of trends in dental research can help researchers to identify relevance of current studies and possibly reveal future research opportunities.

  17. Many thanks to the following: • Amy Gregg, MLIS, Dental Reference Librarian • Falk Library for the Health Sciences • University of Pittsburgh • Shyam Visweswaran, MD, NLM Fellow in Intelligent Systems • Center for Biomedical Informatics • University of Pittsburgh • All of my expert raters! • This research is supported by a training grant from the National Institute of Dental and Craniofacial Research and the National Library of Medicine
