This project provides background information on probability, an introduction to Fay’s Formula for estimating cancer risk, notation explanations, and examples using raw data. It includes tables showing age-conditional breast cancer risk and overall cancer risk. The method behind Fay’s Formula is detailed, with references for further reading and acknowledgements.
Age-Conditional Risk of Developing Cancer Comprehensive Project By Melissa Joy
You can look forward to…
• Background Information on Probability
• Intro to Fay’s Formula
• Notation
• Overview of the method behind Fay’s Formula
• Breast cancer example using raw data
• Table of age-conditional breast cancer risk
• Table of age-conditional cancer risk (all sites)
• Bibliography
• Thank you’s
A Little Background on Probability
• Probability is the likelihood or chance that something will happen.
• Conditional probability is the probability of some event A, given the occurrence of some other event B.
• It is written P(A|B).
• It is said “the probability of A, given B”.
• $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
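To make the conditional probability formula concrete, here is a small sketch using a made-up example that is not part of the original slides: one roll of a fair six-sided die, with A = "the roll is even" and B = "the roll is greater than 3". The event names and the `prob` helper are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical example (not from the slides): one roll of a fair six-sided die.
outcomes = set(range(1, 7))
A = {n for n in outcomes if n % 2 == 0}  # event A: the roll is even
B = {n for n in outcomes if n > 3}       # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(outcomes))


# P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Of the three rolls greater than 3 (4, 5, 6), two are even, which matches the 2/3 returned by the formula.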
Probability Density Functions
• A probability density function (pdf) is a function, f(x), that represents a probability distribution in terms of integrals.
• The probability that x lies in the interval [a, b] is given by $\int_a^b f(x)\,dx$
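As a rough illustration of reading a probability off a pdf, the sketch below integrates a hypothetical density numerically. The density f(x) = 2x on [0, 1] and the `integrate` helper (a simple midpoint rule) are assumptions chosen for the example, not something taken from the slides.

```python
def integrate(f, a, b, steps=1000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def pdf(x):
    """Hypothetical density f(x) = 2x on [0, 1]; it integrates to 1 there."""
    return 2.0 * x


# Probability that x lies in [0.2, 0.5]; the exact value is 0.5^2 - 0.2^2 = 0.21
p = integrate(pdf, 0.2, 0.5)
print(round(p, 6))
```

Integrating the same density over the whole interval [0, 1] gives 1, as any pdf must over its full range.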
Fay’s Formula for Estimating the Age-Conditional Probability of Getting Cancer
A(x,y): Age-conditional probability of getting cancer between ages x and y, given alive and cancer free up until age x
Or equivalently, the probability that an individual of age x will get cancer in the next (y - x) years, given alive and cancer free up until age x
Goal: Write A(x,y) in terms of data that is easily found and collected
An Introduction to Notation
Probability density functions: (For simplicity, these pdfs will be constant, so I will refer to them as probabilities)
• λ: Failure rates
• S: Survival rates
Subscripts:
• c: denotes incidence of cancer
• d: denotes incidence of death from cancer
• o: denotes death from other (non-cancer) related causes
• An asterisk (*) signifies that the data implies that the individual was cancer free up until a particular age.
A Quick Overview of the Method Behind Fay’s Formula
A(x,y): Age-conditional probability of getting cancer between x and y, given alive and cancer free up until age x
$$A(x,y) = \frac{P(\text{first cancer occurs between age } x \text{ and } y)}{P(\text{alive and cancer free at age } x \text{, given cancer free before})}$$
$$A(x,y) = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $S^*(x)$: probability that the person is alive and cancer free at age x, given they are cancer free up until age x
Goal: Rewrite A(x,y) with no * terms
• Fay, Michael P. "Estimating Age Conditional Probability of Developing Disease From Surveillance Data." Population Health Metrics 2 (2004): 6-14.
• Fay, Michael P., Ruth Pfeiffer, Kathleen A. Cronin, Chenxiong Le, and Eric J. Feuer. "Age-Conditional Probabilities of Developing Cancer." Statistics in Medicine 22 (2003): 1837-1848.
Fay’s Formula continued
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $\lambda_c^*(a)$: probability that the first cancer occurs at age a, given alive and cancer free up until age a
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a
$$A(x,y) = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
Goal: Rewrite A(x,y) with no * terms
Starting with the Numerator
It is true that $f_c(a) = \lambda_c^*(a)\,S^*(a)$
$$P(\text{first cancer occurs between age } x \text{ and } y) = \int_x^y f_c(a)\,da = \int_x^y \lambda_c^*(a)\,S^*(a)\,da$$
$$A(x,y) = \frac{\int_x^y \lambda_c^*(a)\,S^*(a)\,da}{S^*(x)}$$
Rewriting Numerator continued
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $\lambda_c(a)$: probability that the first cancer occurs at age a
• $S(a)$: probability that the person is alive and cancer free at age a
• $\lambda_c^*(a)$: probability that the first cancer occurs at age a, given alive and cancer free up until age a
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a
$$A(x,y) = \frac{\int_x^y \lambda_c^*(a)\,S^*(a)\,da}{S^*(x)}$$
It can be shown that:
$$\lambda_c(a) = \frac{f_c(a)}{S(a)} = \frac{\lambda_c^*(a)\,S^*(a)}{S(a)}$$
So by rearranging the above equation we get
$$\lambda_c(a)\,S(a) = \lambda_c^*(a)\,S^*(a)$$
We can now rewrite the numerator without * terms
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S^*(x)}$$
Goal accomplished for the numerator!
Now let’s focus on the denominator
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a
• $S_c^*(a)$: probability that the person is cancer free at age a, given they are cancer free up until age a
• $S_o^*(a)$: probability that the person did not die from non-cancer related causes at age a, given they are cancer free up until age a
• $S_o(a)$: probability that the person did not die from non-cancer related causes at age a
• $S_d(a)$: probability that the person did not die from cancer at age a
• $\lambda_c(a)$: probability that the first cancer occurs at age a
• $S(a)$: probability that the person is alive and cancer free at age a
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S^*(x)}$$
Goal: Rewrite A(x,y) with no * terms
$S^*(x) = S_c^*(x)\,S_o^*(x)$ and we know $S_o^*(x) = S_o(x)$
Through a long series of calculations we find that:
$$S_c^*(x) = 1 - \int_0^x \lambda_c(a)\,S_d(a)\,da$$
So we can rewrite the denominator as
$$S^*(x) = S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}$$
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}}$$
So we get…
A(x,y): Age-conditional probability of getting cancer between x and y, given alive and cancer free up until age x
We started from:
$$A(x,y) = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
and arrived at:
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}}$$
Goal accomplished!
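The final formula can be sketched as a small function. This is only an illustration under stated assumptions, not the implementation from Fay's papers: the names `age_conditional_risk` and `integrate` are made up here, the rates are passed in as plain Python functions of age, the integrals use a simple midpoint rule, and the constant rates in the usage lines are invented numbers.

```python
def integrate(f, a, b, steps=2000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def age_conditional_risk(x, y, lam_c, S, S_o, S_d):
    """A(x, y) with no * terms.

    lam_c, S, S_o and S_d are functions of age a, named after the slide
    notation: cancer failure rate, alive-and-cancer-free survival,
    other-cause survival, and cancer-death survival.
    """
    numerator = integrate(lambda a: lam_c(a) * S(a), x, y)
    cancer_free = 1.0 - integrate(lambda a: lam_c(a) * S_d(a), 0.0, x)
    return numerator / (S_o(x) * cancer_free)


# Hypothetical constant rates, chosen only to exercise the function.
risk = age_conditional_risk(
    20, 30,
    lam_c=lambda a: 0.001,
    S=lambda a: 0.99,
    S_o=lambda a: 0.995,
    S_d=lambda a: 0.999,
)
print(risk)
```

With constant rates both integrals reduce to "rate times interval length", so the result can be checked by hand against 10(0.001)(0.99) / (0.995{1 - 20(0.001)(0.999)}).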
Let’s Look at an Example:
What is the probability that a female age 20 will get breast cancer in the next 10 years (given she is alive and cancer free up until age 20)?
c: number of incidences of cancer ≈ 160
d: number of cancer caused deaths ≈ 20
o: number of deaths from other causes ≈ 1500
n: mid-interval population ≈ 3 million
Let’s find the failure rates
Failure rates are the probabilities that you will get cancer, die of cancer, or die from other causes
$\lambda_c(a) \approx c/n$, $\lambda_d(a) \approx d/n$, $\lambda_o(a) \approx o/n$
$\lambda_c(20) \approx$ 160/3 million = 0.00005333
$\lambda_d(20) \approx$ 20/3 million = 0.0000066667
$\lambda_o(20) \approx$ 1500/3 million = 0.0005
Approximated SEER Data 2004
Now let’s find the survival rates
• $S_c(20) = 1 - \lambda_c(20)$ = 0.99994667
• $S_d(20) = 1 - \lambda_d(20)$ = 0.999993
• $S_o(20) = 1 - \lambda_o(20)$ = 0.9995
• $S(20) = 1 - \{\lambda_c(20) + \lambda_o(20)\}$ = 0.99944667
Survival rates are the probabilities that the individual has not gotten cancer, died from cancer, or died from other causes. S (without a subscript) is the probability of being alive and cancer free.
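The failure and survival rates for the age-20 example can be reproduced with a few lines of arithmetic. The counts below are the approximated values quoted in the example; the variable names are just illustrative.

```python
# Approximated counts for females age 20, as quoted in the example.
cases = 160            # c: incidences of breast cancer
cancer_deaths = 20     # d: deaths caused by cancer
other_deaths = 1500    # o: deaths from other causes
population = 3_000_000  # n: mid-interval population

# Failure rates: counts divided by the mid-interval population.
lam_c = cases / population
lam_d = cancer_deaths / population
lam_o = other_deaths / population

# Survival rates: one minus the matching failure rate(s).
S_c = 1 - lam_c
S_d = 1 - lam_d
S_o = 1 - lam_o
S = 1 - (lam_c + lam_o)  # alive and cancer free

for name, value in [("lam_c", lam_c), ("lam_d", lam_d), ("lam_o", lam_o),
                    ("S_c", S_c), ("S_d", S_d), ("S_o", S_o), ("S", S)]:
    print(f"{name} = {value:.8f}")
```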
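Now let’s use Fay’s Formula
$$A(x,y) = \frac{\int_x^y \lambda_c(a) \cdot S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a) \cdot S_d(a)\,da\right\}}$$
Treating the rates as constant at their age-20 values:
$$A(20,30) = \frac{\int_{20}^{30} \lambda_c(20) \cdot S(20)\,da}{S_o(20)\left\{1 - \int_0^{20} \lambda_c(20) \cdot S_d(20)\,da\right\}}$$
$$= \frac{10\,\lambda_c(20) \cdot S(20)}{S_o(20)\left\{1 - 20\,\lambda_c(20) \cdot S_d(20)\right\}}$$
$$= 0.000534 = 0.0534\%$$
What does this number mean?
http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf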
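The arithmetic for A(20,30) can be checked with a short script. It assumes, as the example does, that every rate is constant at its age-20 value, so each integral becomes the rate times the length of the interval. Variable names are illustrative.

```python
# Approximated inputs from the age-20 breast cancer example.
population = 3_000_000
lam_c = 160 / population   # cancer incidence rate
lam_d = 20 / population    # cancer death rate
lam_o = 1500 / population  # other-cause death rate

S = 1 - (lam_c + lam_o)  # alive and cancer free
S_d = 1 - lam_d          # did not die from cancer
S_o = 1 - lam_o          # did not die from other causes

x, y = 20, 30

# Constant rates: the integral from x to y is (y - x) * rate,
# and the integral from 0 to x is x * rate.
numerator = (y - x) * lam_c * S
denominator = S_o * (1 - x * lam_c * S_d)
risk = numerator / denominator

print(f"A(20,30) = {risk:.6f} = {risk * 100:.4f}%")
```

Under these approximated inputs the result is about 0.000534, or roughly 5 in 10,000.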
Age-Conditional Risk of Being Diagnosed with Breast Cancer (Females only) Table from Surveillance, Epidemiology and End Results (SEER) database http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf
Age-Conditional Risk of Being Diagnosed with Cancer Table from Surveillance, Epidemiology and End Results (SEER) database http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf
Bibliography • Fay, Michael P. "Estimating Age Conditional Probability of Developing Disease From Surveillance Data." Population Health Metrics 2 (2004): 6-14. • Fay, Michael P., Ruth Pfeiffer, Kathleen A. Cronin, Chenxiong Le, and Eric J. Feuer. "Age-Conditional Probabilities of Developing Cancer." Statistics in Medicine 22 (2003): 1837-1848. • Ries LAG, Melbert D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, Clegg L, Horner MJ, Howlader N, Eisner MP, Reichman M, Edwards BK (eds). SEER Cancer Statistics Review, 1975-2004, National Cancer Institute. Bethesda, MD, http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf, based on November 2006 SEER data submission, posted to the SEER web site, 2007. • "What Is Your Risk?." Your Disease Risk. (2005). Harvard Center For Cancer Prevention. 2 Oct 2007 <http://www.yourdiseaserisk.harvard.edu/english/>.
Many thanks to… • Professor Lengyel • Professor Buckmire • Professor Knoerr • And… the entire Oxy math department THANK YOU!
Go to http://www.yourdiseaserisk.wustl.edu/ to calculate your risk and learn what could raise and lower your risk