Software tools for dose-volume data-mining to predict radiotherapy response

Joseph O Deasy, Angel I Blanco, Vanessa H Clark, Constantine Zakarian, Andrew Hope, Walter Bosch, James A Purdy and James Alaly
Dept. of Radiation Oncology, Washington University, St. Louis, MO

Data review: CERR

Figure 5. General (‘Kitchen sink’) model results: risk predictions for grade 2 or greater late pneumonitis/fibrosis, based on multi-metric logistic regression analysis. Patients are grouped into risk octiles. The significant terms (in order of selection) are: pretreatment chemotherapy, GEUD, V5, V30, V60, and V80. The coefficients for V30, V60, and V80 were opposite in sign to those of GEUD and V5, indicating that they ‘correct’ GEUD and V5, in a complicated way. Confidence intervals are 95%. Spearman’s rank correlation coefficient is 0.3 (p < 0.0001). Six terms were used based on the Akaike information criterion (AIC).

Abstract

Normal-tissue complication and tumor response rates have been shown to be sensitive to dose-volume factors. These factors need to be known accurately, both for treatment plan comparisons and as mathematical models to drive intensity modulated radiation therapy treatment planning. However, determining and validating such models has been hampered by (1) the difficulty of retrieving and utilizing dose distributions, anatomical structures, and CT scans from treatment planning systems, and (2) the inability of different institutions to share data due to the lack of a standard format and associated software tools. Our group has developed software tools which address both of these issues. These tools are open-source and, for research purposes, are available to anyone. The tools consist of: (1) CERR (A Computational Environment for Radiotherapy Research), and (2) database construction and front-end query programs. CERR, built using the widely used Matlab programming environment, contains a wide variety of graphical and quantitative treatment plan review tools.
CERR tools allow users to conveniently retrieve all the relevant information that resides in the treatment planning systems. The data transfer is made using widely available protocols: either the AAPM/RTOG or the DICOM protocol. CERR plans can be accessed conveniently in batch or interactive fashion to retrieve dose metrics, such as V5, V10, or mean dose values. Structure contours can be edited or outlined from scratch. The database software retrieves plan characteristics from collections of CERR-archived treatment plans, and includes a convenient query capability for finding relevant treatment plans or retrieving dose-volume data. Together, these tools comprise a robust and convenient system for the data-mining of dose-volume related factors in treatment response. We believe that these tools also provide a needed technical base for constructing long-lived inter-institutional databases.

This work was supported by NIH grant CA85181 and a grant from Computerized Medical Systems, Inc. CERR is available from http://deasylab.info.

Radiotherapy treatment plan in RTOG format imported in the CERR environment.

• CERR is currently at version 2.1, and has the following components:
  • Graphical User Interface
  • Transverse, sagittal and coronal slice viewers
  • DVH display/recalculation
  • Contouring/re-contouring toolset
  • RTOG-based import as well as DICOM-based import
  • 3-D display
  • Archive compression (typically 20-40 MB per compressed patient dataset)
  • Batch programmability based on the full Matlab language
  • A stand-alone version which can run independently of Matlab
• CERR has successfully imported treatment plans from a variety of treatment planning systems, including:
  • CMS Focus (RTOG)
  • Pinnacle (RTOG)
  • TMS Helax (RTOG)
  • Varian Helios (DICOM)
• Current plans are to expand the lung treatment planning database to other institutions (Duke University, Netherlands Cancer Center). Ultimately, all the data will be anonymized and placed into the public domain.
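The dose metrics named in the abstract (V5, V10, mean dose) can be illustrated with a minimal sketch. This is not CERR code and does not use the CERR or Matlab API; it is plain Python under stated assumptions: a structure is represented as a list of per-voxel doses in Gy, all voxels have equal volume, and Vx is taken in its usual sense as the percentage of the structure volume receiving at least x Gy. The function names and the sample doses are hypothetical.

```python
def vx(doses_gy, threshold_gy):
    """Percent of the structure receiving at least threshold_gy.

    Assumes every voxel has the same volume, so volume fraction
    equals voxel-count fraction.
    """
    if not doses_gy:
        raise ValueError("structure has no voxels")
    n_hit = sum(1 for d in doses_gy if d >= threshold_gy)
    return 100.0 * n_hit / len(doses_gy)


def mean_dose(doses_gy):
    """Arithmetic mean dose over equal-volume voxels, in Gy."""
    if not doses_gy:
        raise ValueError("structure has no voxels")
    return sum(doses_gy) / len(doses_gy)


def plan_metrics(doses_gy, thresholds_gy=(5, 10, 20)):
    """Collect Vx values and the mean dose into one dictionary."""
    metrics = {f"V{t}": vx(doses_gy, t) for t in thresholds_gy}
    metrics["mean_dose"] = mean_dose(doses_gy)
    return metrics


# Hypothetical per-voxel lung doses (Gy), for illustration only.
lung_doses = [2.0, 4.0, 6.0, 12.0, 25.0, 30.0, 8.0, 1.0]
metrics = plan_metrics(lung_doses)
print(metrics)
```

In a batch setting, the same function would be applied to each archived plan in turn and the resulting dictionaries collected into one table per cohort.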
Introduction

Intensity modulated radiation therapy (IMRT) has rapidly increased the oncologist’s ability to shape the dose distribution to potentially yield improved therapeutic results. Rational control of IMRT’s capabilities requires precise and quantitative models of dose-volume effects (i.e., normal tissue complication probability (NTCP) models), and ultimately, tumor control probability (TCP) models. NTCP and TCP dose-volume effects research is currently hampered by two main obstacles: (1) for any given endpoint, the relevant cohort’s 3-D treatment plans and follow-up data are almost never compiled into a database, and (2) even where such databases exist, they are not made publicly available. Our group is working on software tools designed to make the construction of long-lived and ever-growing multi-institutional databases of 3-D treatment planning and outcomes data feasible and even convenient.

Axial, coronal, sagittal, and 3-D viewers as well as DVH dose analysis tools and contouring/re-contouring tools are available in CERR.

Data extraction

Modeling example: NSCLC

Conclusions
• CERR/Matlab is a powerful tool for data mining of treatment plan outcomes, including the use of new plan metrics, such as those based on spatial information (e.g., lung upper vs. lower lobe irradiation).
• CERR is a powerful and flexible tool for extracting 3-D treatment planning data from many different treatment planning systems.
• It is feasible to use CERR to construct long-lived databases of treatment planning and outcomes data from multiple institutions.
• Databases of treatment plans and outcomes from multiple institutions are likely to be more powerful for modeling than databases consisting of relatively similar single-institution treatment strategies.

Acknowledgements
• This research was supported by NIH grant R29 CA85181 and a grant from Computerized Medical Systems, Inc.
• CERR can be downloaded from http://deasylab.info and links. CERR is free for research use.
Use of CERR for clinical patient care is prohibited.

References
• Spezi, Lewis, Smith, “A DICOM-RT-based toolbox for the evaluation and verification of radiotherapy plans,” Phys. Med. Biol. 47:4223–4232 (2002).
• J. O. Deasy, A. I. Blanco, and V. H. Clark, “CERR: A Computational Environment for Radiotherapy Research,” Med Phys 30:979-985 (2003).

Figure 4. As an example of the use of CERR for modeling of treatment outcomes, we analyzed 167 patient treatment plans and outcomes (pneumonitis/fibrosis) of non-small-cell lung cancer radiotherapy patients (Deasy, et al., Int J Radiat Oncol Biol Phys 57:S412, 2003 (ASTRO 2003)). Spearman’s rank correlation coefficient is plotted between the different risk factors. Dark red is perfect correlation and dark blue is slightly negative correlation (see colorbar). HLP = highest late pneumonitis (grade 2 or greater). Each square at a given row and column represents the correlation between the parameter listed on that row and the parameter listed on the row numerically equal to that column. There are strong correlations among the Vx values. Also, the generalized equivalent uniform dose term used here (a = 5.32) correlates strongly with high Vx values (V60-V75). V20 and mean dose are highly correlated.

Figure 1. CERR and CERR database tools as an archival system. CERR can extract data from many types of commercial and academic treatment planning systems via the RTOG and DICOM mechanisms.
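The Figure 4 analysis rests on two quantities, the generalized equivalent uniform dose (with a = 5.32) and Spearman's rank correlation. Both can be sketched in a few lines. The sketch below is an illustration only and is not the authors' code: it assumes equal-volume voxels and uses the standard definition gEUD = (mean of d_i^a)^(1/a), and it computes Spearman's coefficient as the Pearson correlation of average ranks. All function names and the per-patient sample values are hypothetical.

```python
import math


def geud(doses_gy, a):
    """Generalized equivalent uniform dose over equal-volume voxels."""
    if not doses_gy:
        raise ValueError("structure has no voxels")
    return (sum(d ** a for d in doses_gy) / len(doses_gy)) ** (1.0 / a)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-patient metrics, for illustration only.
v20_percent = [18.0, 25.0, 31.0, 22.0, 40.0]
mean_dose_gy = [11.0, 15.5, 19.0, 13.0, 24.0]
rho = spearman(v20_percent, mean_dose_gy)
print(rho)
```

Applying `spearman` to every pair of metrics across the cohort would give a correlation matrix of the kind plotted in Figure 4.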