This overview covers solving equations that involve uncertainty, using interval analysis and Bayesian strategies. It addresses three operations (deconvolution, backcalculation, and updating) that combine simple interval arithmetic with probability theory for risk analysis. It introduces probability boxes (p-boxes), which bound cumulative distribution functions (CDFs), and probability bounds analysis (PBA) for imprecisely specified distributions. It shows how updating with p-boxes gives more reasonable answers when priors are poorly known, how backcalculation finds constraints and tolerance solutions in engineering design problems, and why Monte Carlo simulation generally cannot compute backcalculations without trial and error.
Untangling equations involving uncertainty • Scott Ferson, Applied Biomathematics • Vladik Kreinovich, University of Texas at El Paso • W. Troy Tucker, Applied Biomathematics
Overview • Three kinds of operations • Deconvolutions • Backcalculations • Updates (oh, my!) • Very elementary methods of interval analysis • Low-dimensional • Simple arithmetic operations • But combined with probability theory
Probability box (p-box) • Bounds on a cumulative distribution function (CDF) • Envelope of a Dempster-Shafer structure • Used in risk analysis and uncertainty arithmetic • Generalizes probability distributions and intervals
[Figure: three p-boxes plotted as cumulative probability (0 to 1) against the variable; one panel is annotated "This is an interval, not a uniform distribution"]
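The point that an interval is not a uniform distribution can be illustrated with a minimal sketch. It assumes the usual reading of an interval as a p-box whose bounds are two step functions; the helper names `interval_pbox` and `uniform_cdf` and the endpoints 10 and 30 are hypothetical, chosen only for illustration.

```python
def interval_pbox(lo, hi):
    """Return (lower, upper) bounds on the CDF of a quantity known only to lie in [lo, hi]."""
    def upper(x):
        # All probability mass could sit at lo, so the CDF may jump to 1 there.
        return 1.0 if x >= lo else 0.0

    def lower(x):
        # All probability mass could sit at hi, so the CDF may stay 0 until then.
        return 1.0 if x >= hi else 0.0

    return lower, upper


def uniform_cdf(x, lo, hi):
    """CDF of the uniform distribution on [lo, hi], one of many CDFs inside the box."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


lower, upper = interval_pbox(10.0, 30.0)
for x in (5.0, 10.0, 20.0, 30.0, 35.0):
    print(x, lower(x), uniform_cdf(x, 10.0, 30.0), upper(x))
```

The uniform CDF always falls between the two bounds, but so does every other distribution supported on the interval, which is why the interval expresses more uncertainty than the uniform distribution does.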
Probability bounds analysis (PBA) • a = T(0, 10, 20) + [0, 5] • b = N([20, 23], [1, 12]) • c = a |+| b • c = a + b • Disagreement between theoretical and observed variance
[Figure: CDF bounds (0 to 1) for a, b, and the two sums c, with panels labeled "assuming independence" and "assuming nothing"]
PBA handles common problems • Imprecisely specified distributions • Poorly known or unknown dependencies • Non-negligible measurement error • Inconsistency in the quality of input data • Model uncertainty and non-stationarity • Plus, it’s much faster than Monte Carlo
Updating • Using knowledge of how variables are related to tighten their estimates • Removes internal inconsistency and explicates unrecognized knowledge • Also called constraint updating or editing • Also called natural extension
Example • Suppose W = [23, 33], H = [112, 150], A = [2000, 3200] • Does knowing WH = A let us say any more?
Answer • Yes, we can infer that W = [23, 28.57], H = [112, 139.13], A = [2576, 3200] • The formulas are just W = intersect(W, A/H), etc. • To get the largest possible W, for instance, let A be as large as possible and H as small as possible, and solve for W = A/H
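The interval update above can be reproduced with a short sketch. It assumes intervals are represented as `(lo, hi)` tuples and that all quantities are positive; the helpers `intersect`, `mul`, and `div` are illustrative, not taken from any particular library.

```python
def intersect(x, y):
    """Intersection of two intervals; raises if they do not overlap."""
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    if lo > hi:
        raise ValueError("intervals are inconsistent (empty intersection)")
    return (lo, hi)


def mul(x, y):
    """Interval product: range of a*b over all endpoint combinations."""
    products = [a * b for a in x for b in y]
    return (min(products), max(products))


def div(x, y):
    """Interval quotient; the divisor must not contain zero."""
    if y[0] <= 0.0 <= y[1]:
        raise ZeroDivisionError("divisor interval contains zero")
    quotients = [a / b for a in x for b in y]
    return (min(quotients), max(quotients))


W, H, A = (23.0, 33.0), (112.0, 150.0), (2000.0, 3200.0)

# Each update uses the original estimates and the constraint W * H = A.
W_new = intersect(W, div(A, H))
H_new = intersect(H, div(A, W))
A_new = intersect(A, mul(W, H))

print([round(v, 2) for v in W_new])  # [23.0, 28.57]
print([round(v, 2) for v in H_new])  # [112.0, 139.13]
print(A_new)                         # (2576.0, 3200.0)
```

Only the upper bounds of W and H and the lower bound of A tighten, because those are the endpoints that the other two estimates rule out.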
Bayesian strategy Prior Likelihood Posterior
Bayes’ rule • Concentrates mass onto the manifold of feasible combinations of W, H, and A • Answers have the same supports as intervals • Computationally complex • Needs specification of priors • Yields distributions that are not justified (come from the choice of priors) • Expresses less uncertainty than is present
Updating with p-boxes 1 1 1 A H W 0 0 0 20 30 40 120 140 160 2000 3000 4000
Answers • intersect(W, A/H) • intersect(H, A/W) • intersect(A, WH)
[Figure: updated p-boxes for W (20 to 40), H (120 to 160), and A (2000 to 4000)]
Calculation with p-boxes • Agrees with interval analysis whenever inputs are intervals • Relaxes Bayesian strategy when precise priors are not warranted • Produces more reasonable answers when priors not well known • Much easier to compute than Bayes’ rule
Backcalculation • Find constraints on B that ensure C=A+B satisfies specified constraints • Or, more generally, C = f(A1, A2,…, Ak, B) • If A and C are intervals, the answer is called the tolerance solution
Can’t just invert the equation • dose = conc × intake / body mass • conc = dose × body mass / intake • When conc is put back into the forward equation, the dose is wider than planned
Example • dose = [0, 2] milligram per kilogram • intake = [1, 2.5] liter • mass = [60, 96] kilogram • conc = dose * mass / intake = [0, 192] milligram per liter • dose = conc * intake / mass = [0, 8] milligram per kilogram • Doses 4 times larger than tolerable levels!
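The failure of naive inversion, and the backcalculated alternative, can be checked numerically. The sketch below assumes `(lo, hi)` tuples for intervals and non-negative quantities; the helper names are illustrative, and the tolerance-solution formula is the standard interval one (match lower endpoints against lower, upper against upper), not code from the presentation.

```python
def mul(x, y):
    """Interval product over all endpoint combinations."""
    p = [a * b for a in x for b in y]
    return (min(p), max(p))


def div(x, y):
    """Interval quotient; the divisor must be strictly positive here."""
    q = [a / b for a in x for b in y]
    return (min(q), max(q))


dose = (0.0, 2.0)     # mg per kg, the target
intake = (1.0, 2.5)   # liters
mass = (60.0, 96.0)   # kg

# Naive inversion: treat conc = dose * mass / intake as ordinary interval arithmetic.
naive_conc = div(mul(dose, mass), intake)
naive_dose = div(mul(naive_conc, intake), mass)
print(naive_conc)  # (0.0, 192.0)
print(naive_dose)  # (0.0, 8.0), four times the tolerable upper limit

# Backcalculation (tolerance solution): the largest conc interval whose forward
# image conc * intake / mass stays inside the dose target for every intake and mass.
tol_conc = (dose[0] * mass[1] / intake[0], dose[1] * mass[0] / intake[1])
tol_dose = div(mul(tol_conc, intake), mass)
print(tol_conc)    # (0.0, 48.0)
print(tol_dose)    # (0.0, 2.0)
```

Plugging the backcalculated concentration into the forward equation recovers the planned dose range, whereas the naively inverted concentration does not.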
Backcalculating probability distributions • Needed for engineering design problems, e.g., cleanup and remediation planning for environmental contamination • Available analytical algorithms are unstable for almost all problems • Except in a few special cases, Monte Carlo simulation cannot compute backcalculations; trial and error methods are required
Backcalculation with p-boxes • Suppose A + B = C, where A = normal(5, 1) and C = {0 ≤ C, median 15, 90th %ile 35, max 50}
[Figure: p-box for A (axis 2 to 8) and p-box for C (axis -10 to 60), each plotted as cumulative probability from 0 to 1]
Getting the answer • The backcalculation algorithm basically reverses the forward convolution • Not hard at all, but a little messy to show • Any distribution totally inside B is sure to satisfy the constraint; it’s a “kernel”
[Figure: p-box for B (axis -10 to 50)]
Check by plugging back in • A + B = C* ⊆ C
[Figure: the recomputed p-box C* lying inside the target p-box C (axis -10 to 60)]
When you know that | And you have estimates for | Use this formula to find the unknown
A + B = C | A, B | C = A + B
A + B = C | A, C | B = backcalc(A, C)
A + B = C | B, C | A = backcalc(B, C)
A - B = C | A, B | C = A - B
A - B = C | A, C | B = -backcalc(A, C)
A - B = C | B, C | A = backcalc(-B, C)
A × B = C | A, B | C = A * B
A × B = C | A, C | B = factor(A, C)
A × B = C | B, C | A = factor(B, C)
A / B = C | A, B | C = A / B
A / B = C | A, C | B = 1/factor(A, C)
A / B = C | B, C | A = factor(1/B, C)
A ^ B = C | A, B | C = A ^ B
A ^ B = C | A, C | B = factor(log A, log C)
A ^ B = C | B, C | A = exp(factor(B, log C))
2A = C | A | C = 2 * A
2A = C | C | A = C / 2
A² = C | A | C = A ^ 2
A² = C | C | A = sqrt(C)
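For plain intervals, the two primitives in the table reduce to endpoint arithmetic. The following is a minimal sketch under that assumption only: it does not implement the p-box algorithms, `factor` is restricted to strictly positive intervals, and the numeric values are hypothetical.

```python
def add(a, b):
    """Interval sum."""
    return (a[0] + b[0], a[1] + b[1])


def mul_pos(a, b):
    """Interval product for strictly positive intervals."""
    return (a[0] * b[0], a[1] * b[1])


def backcalc(a, c):
    """Interval B such that a + B equals c (tolerance solution of A + B = C)."""
    lo, hi = c[0] - a[0], c[1] - a[1]
    if lo > hi:
        raise ValueError("no solution: A is wider than C")
    return (lo, hi)


def factor(a, c):
    """Interval B such that a * B equals c, for strictly positive a and c."""
    lo, hi = c[0] / a[0], c[1] / a[1]
    if lo > hi:
        raise ValueError("no solution: A is relatively wider than C")
    return (lo, hi)


A, C = (1.0, 2.0), (4.0, 10.0)
B = backcalc(A, C)
print(B, add(A, B))            # (3.0, 8.0) (4.0, 10.0)

# Ordinary interval subtraction C - A is NOT the backcalculation:
naive = (C[0] - A[1], C[1] - A[0])
print(naive, add(A, naive))    # (2.0, 9.0) (3.0, 11.0), which overshoots C

A2, C2 = (2.0, 4.0), (8.0, 40.0)
B2 = factor(A2, C2)
print(B2, mul_pos(A2, B2))     # (4.0, 10.0) (8.0, 40.0)
```

The other rows of the table follow by composing these two calls with negation, reciprocal, log, and exp, as the formulas show.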
Kernels • Existence more likely if p-boxes are fat • Wider if we can also assume independence • Answers are not unique, even though tolerance solutions always are • Different kernels can emphasize different properties • Envelope of all possible kernels is the shell (i.e., the united solution)
Precise distributions • Precise distributions can’t express the nature of the target • Finding a conc distribution that results in a prescribed distribution of doses says we want some doses to be high (any distribution to the left would be even better) • We need to express the dose target as a p-box
Deconvolution • Uses information about dependence to tighten estimates • Useful, for instance, in correcting an estimated distribution for measurement uncertainty • For instance, suppose Y = X + ε • If X and ε are independent, σ_Y² = σ_X² + σ_ε² • Then we do an uncertainty correction
Example • Y = X + ε • Y, ε ~ normal • X ~ N(decon(μ_ε, μ_Y), sqrt(decon(σ_ε², σ_Y²))) • Y ~ N([5, 9], [2, 3]); ε ~ N([−1, +1], [½, 1]) • X ~ N(dcn([−1, 1], [5, 9]), sqrt(dcn([¼, 1], [4, 9]))) • X ~ N([6, 8], sqrt([3.75, 8]))
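The arithmetic of this example can be checked with a short sketch. It assumes the inputs are Y ~ N([5, 9], [2, 3]) and ε ~ N([−1, +1], [½, 1]), with the second parameter a standard deviation, and it assumes `dcn` acts on intervals by subtracting matching endpoints (the same rule as interval backcalculation); that reading reproduces the mean [6, 8] given for X.

```python
import math


def dcn(a, c):
    """Interval B such that a + B equals c, by subtracting matching endpoints."""
    lo, hi = c[0] - a[0], c[1] - a[1]
    if lo > hi:
        raise ValueError("no solution: the known term is wider than the sum")
    return (lo, hi)


def squared(sd):
    """Variance interval from a non-negative standard-deviation interval."""
    return (sd[0] ** 2, sd[1] ** 2)


mean_Y, sd_Y = (5.0, 9.0), (2.0, 3.0)
mean_eps, sd_eps = (-1.0, 1.0), (0.5, 1.0)

# Means add, and under independence variances add, so both are deconvolved the same way.
mean_X = dcn(mean_eps, mean_Y)
var_X = dcn(squared(sd_eps), squared(sd_Y))
sd_X = (math.sqrt(var_X[0]), math.sqrt(var_X[1]))

print(mean_X)  # (6.0, 8.0)
print(var_X)   # (3.75, 8.0)
print(sd_X)
```

Under these assumptions the corrected standard deviation of X lies between roughly 1.94 and 2.83, narrower than the [2, 3] observed for Y, as expected once measurement error is removed.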
Deconvolutions with p-boxes • As for backcalculations, computation of deconvolutions is troublesome in probability theory, but often much simpler with p-boxes • Deconvolution didn’t have an analog in interval analysis (until now via p-boxes)
Relaxing over-determination • Constraint problems almost never have solutions that are precise probability distributions • The constraints are too numerous and strict • P-boxes relax these constraints so that many problems can have solutions
P-boxes in interval analysis • P-boxes bring probability distributions into the realm of intervals • Express and solve backcalculation problems better than is possible in probability theory by itself • Generalize the notion of tolerance solutions (kernels) • Relax the unwarranted assumptions about priors that a Bayesian approach to updating problems requires • Introduce deconvolution into interval analysis
Acknowledgments • Janos Hajagos, Stony Brook University • Lev Ginzburg, Stony Brook University • David Myers, Applied Biomathematics • National Institutes of Health SBIR program