National Cancer Institute: Utilizing Data for Cancer Prevention and Control

  1. National Cancer Institute: Utilizing Data for Cancer Prevention and Control
     Abdul R Shaikh, PhD, MHSc
     Program Director, Health Communication and Informatics Research Branch
     Division of Cancer Control and Population Sciences, National Cancer Institute
     @abdulrshaikh
     April 18, 2012

  2. NCI: Established by Congress in 1937, NCI is the leading Federal agency and the world’s largest organization solely dedicated to cancer-related research, training, and dissemination of information.
     DCCPS: Aims to reduce the risk, incidence, and deaths from cancer and to enhance the quality of life for cancer survivors. The Division conducts and supports an integrated program of the highest quality behavioral, epidemiologic, genetic, health services, and surveillance cancer research.
     “Much of the suffering and death from cancer could be prevented by more systematic efforts to reduce tobacco use, improve diet and physical activity, reduce obesity, and expand the use of established screening tests. The American Cancer Society estimates that in 2011 about 171,600 cancer deaths will be caused by tobacco use alone. In addition, approximately one-third of the 571,950 cancer deaths expected to occur in 2011 are attributed to poor nutrition, physical inactivity, overweight, and obesity.” [1]
     [1] American Cancer Society. Cancer Prevention & Early Detection Facts & Figures 2011. Atlanta: American Cancer Society; 2011.
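     To put the quoted American Cancer Society figures on a common scale, here is a minimal arithmetic sketch. It uses only the two numbers quoted on the slide; reading "approximately one-third" as exactly one third is an assumption made for illustration, and the two categories are not summed because the quote does not say whether they overlap.

```python
# Figures quoted on the slide (American Cancer Society estimates for 2011).
TOTAL_CANCER_DEATHS_2011 = 571_950
TOBACCO_DEATHS_2011 = 171_600


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Tobacco-attributed deaths as a share of all expected cancer deaths.
tobacco_pct = share(TOBACCO_DEATHS_2011, TOTAL_CANCER_DEATHS_2011)

# Assumption: "approximately one-third" is treated as exactly 1/3.
lifestyle_deaths = round(TOTAL_CANCER_DEATHS_2011 / 3)

print(f"Tobacco use: {tobacco_pct}% of expected cancer deaths")
print(f"Nutrition, inactivity, overweight, obesity: about {lifestyle_deaths:,} deaths")
```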

  4. Meeting the challenge of ‘Big Data’
     • The availability of massive, rich datasets offers enormous potential for cancer prevention and control
     • Basic and Translational Science:
     • Exploratory & confirmatory research for understanding factors and mechanisms influencing cancer risk & prognosis
     • Secondary use of health-related data
       - Educate, inform, and provide decision support for consumer and clinical health outcomes
     • Understand the role of behavioral and structural determinants of population health
     • Commercialization pathways (Open Innovation challenges & SBIR)
     • Data Sources
       - SEER, HINTS, CPS-Tobacco, CLASS, CHIS, ATUS, Cancer Imaging Archive, QuitNowTXT library

  5. 2010/2011 Open Innovation Challenges: Increasing the Usability of Public Data for Cancer Prevention & Control
     • PARTICIPANTS:
       - 2010: 7 registered teams, 2 winners
       - 2011: 26 registered teams, 4 semi-finalists, 2 winners
     • OUTCOMES:
       - Increasing the utility of research data for the non-research community
       - Building an ecology of scientists, developers, and entrepreneurs
       - Accelerating development and commercialization of consumer HIT products to prevent cancer and other chronic diseases

  6. Recent winners

  7. SBIR: Innovative Health IT for Broad Adoption by Healthcare Systems and Consumers (R44)
     • Purpose:
       - Accelerate development and commercialization of evidence-based consumer health IT products to prevent cancer and other chronic diseases, facilitate patient-provider communication, and improve disease outcomes
       - Facilitate third-party (i.e., large business) partnerships early in the development process (funding priority for applicants that demonstrate upfront commitment)
     • Background:
       - Federal initiatives; consumer demand & commercial investment; NIH, AHRQ, NIST, and ONC interest; tied to COMPETES open innovation challenges
     • Technical Scope:
       - R44 Phase II and Fast Track applications (up to $1 million); large business partner & LOI

  8. Surveillance, Epidemiology, and End Results (SEER)
     • Population: Children to adults
     • Method: Data collected from cancer registries that cover ~26% of the US population; follow-up with individual cases until death
     • Content: Cancer incidence, prevalence, and survival data; cancer site, stage, morphology, treatment; limited demographics (age, race/ethnicity, region)
     • Data: 100% of cancer cases in registries; six million cases, with 350,000 added each year; 1973 to 2009
     • Note:
       - Specialized software (SEER*Stat or SEER*Prep), downloaded from the website, is needed for analysis
       - A user agreement must be signed to obtain the data; use is limited to research purposes
       - Can be linked to Medicare data

  9. HINTS
     • Population: Adults (18+)
     • Method:
       - 2003, 2005: Random digit dial (RDD)
       - 2007: Dual frame/dual mode
       - 2012: Address frame/self-administered mail mode
     • Content: Health communications trends and practices; cancer information access and usage; cancer risk perception; mental models of cancer; health behaviors; numeracy
     • Data: 2003 (n = 6,469); 2005 (n = 5,586); 2007 (n = 7,674); 2012 (n = 3,956)*
     • Note: * Data available summer 2012

  10. CPS-Tobacco
     • Population: Adolescents/adults 15+ (1992-2006); adults 18+ (2007 onward)
     • Method:
       - National household address-based frame, 8 panels
       - Conducted every 3-4 years by Census for NCI
       - 65% telephone (cell phone allowed if preferred); 35% in-home
       - Translated into Spanish and 4 Asian languages
     • Content: Monitor, evaluate, and conduct research on:
       - Cigarette & other tobacco product usage patterns
       - Cessation: attempts, intentions, & treatment
       - Policy: work, home, “real” price, attitudes, & clinician advice
     • Data: ~240,000 U.S. respondents per cycle
     • Notes:
       - National, state, and sub-state estimates
       - Health disparities (e.g., race/ethnicity, low SES, rural)
       - Economic aspects, using CPS detailed occupational & health disability data
       - Panel design links to other CPS data (e.g., ASEC, ATUS, Food Security, Internet Use)
       - Panel design allows for adding prospective follow-up (2002-03, 2010-2011)
       - Linkage to outcome data (mortality and SEER data) through NLMS


  12. National Health Interview Survey (NHIS)
     • Population: Households, families, adults, and children
     • Method: Face-to-face interview
     • Content: Cancer control supplements (1987, 1992, 2000, 2005, 2010):
       - Diet and nutrition
       - Physical activity
       - Cancer screening
       - HPV
       - Sun avoidance
       - Tobacco use and control
       - Cancer survivorship
     • Data: n ~ 40,000 households (~87,000 individuals); initiated in 1957

  13. California Health Interview Survey (CHIS)
     • Population: Adult, adolescent, and child questionnaires; very diverse racial/ethnic population
     • Method: Telephone survey (landline/cell phone) of all California counties
     • Content:
       - Health behaviors (diet, drug use, sexual behavior, sun safety)
       - Health status
       - Health conditions (asthma, diabetes, etc.)
       - Cancer history and prevention
       - Health insurance
     • Data: 2001, 2003, 2005, 2007, 2009 data available; ~40,000-50,000 respondents per survey
     • Note:
       - Many Latino and Asian groups represented; oversamples of Koreans and Vietnamese
       - Fielded in five languages: English, Spanish, Chinese, Vietnamese, and Korean

  14. American Time Use Survey (ATUS)
     • Population: Adolescents/adults 15 and older
     • Method: Self-report telephone interview using 24-hour recall
     • Content:
       - Estimates of activities people do (work, childcare, socializing, exercising, eating, educational, sports, and religious activities), whom they were with, and the time spent doing them, by sex, age, educational attainment, labor force status, and other characteristics, as well as by weekday and weekend day
       - Eating and health module
     • Data: n ~ 13,000 per year; data currently available: 2003-2009
     • Note: Data files can be linked to the Current Population Survey (CPS)


  16. Other NCI data sources and tools
     • Cancer Prevalence & Cost of Care
     • Cancer Trends Progress Report
     • Health Disparities Calculator (HD*Calc)
     • PopSciGrid Community Health Data Portal
     • State Cancer Profiles

  17. Data Resource Contacts - NCI DCCPS
     • Audie Atienza (Behavioral Research Program)
     • Eric J (Rocky) Feuer (Surveillance Research Program)
     • Richard Moser (Behavioral Research Program)
     • Abdul R Shaikh (@abdulrshaikh; Behavioral Research Program)