  1. KDD Group Presentation Fractal and Bayesian Networks Inference Saturday, Oct. 12, 2001 Haipeng Guo KDD Research Group Department of Computing and Information Sciences Kansas State University

  2. Presentation Outline • Simple Tutorial on Fractals • Bayesian Networks Inference Review • Joint Probability Space’s Fractal Property and its Possible Application to BBN Inference • Summary

  3. Part I Introduction to Fractals

  4. Fractals Introduction • Definition • Examples: Man-made & Natural Fractals • Fractal Dimension • Fractal Applications

  5. Fractal – “broken, fragmented, irregular” “I coined fractal from the Latin adjective fractus. The corresponding Latin verb frangere means "to break": to create irregular fragments. It is therefore sensible - and how appropriate for our need! - that, in addition to "fragmented" (as in fraction or refraction), fractus should also mean "irregular", both meanings being preserved in fragment.” B. Mandelbrot, The Fractal Geometry of Nature, 1982

  6. Definition: Self-similarity • A fractal is a geometric shape that has the property of self-similarity: each part of the shape is a smaller version of the whole shape.

  7. Mathematical fractal: the Koch Snowflake • Step One: start with a large equilateral triangle. • Step Two: make a star. 1. Divide one side of the triangle into three parts and remove the middle section. 2. Replace it with two lines, each the same length as the section you removed. 3. Do this to all three sides of the triangle. • Repeat this process infinitely. • The snowflake has a finite area bounded by a perimeter of infinite length! (A numeric check follows.)
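The finite-area, infinite-perimeter claim is easy to check numerically: each step multiplies the perimeter by 4/3 while the newly added triangles shrink geometrically. A minimal Python sketch (not from the slides; a unit side length is assumed):

```python
# Each Koch step: every segment becomes 4 segments of 1/3 the length,
# and one small equilateral triangle is added per original segment.
side = 1.0                        # initial triangle side length (assumed)
area = (3 ** 0.5) / 4 * side**2   # area of the starting equilateral triangle

segments = 3
seg_len = side
for step in range(1, 9):
    # add one new triangle of side seg_len/3 on each existing segment
    area += segments * (3 ** 0.5) / 4 * (seg_len / 3) ** 2
    segments *= 4
    seg_len /= 3
    perimeter = segments * seg_len
    print(f"step {step}: perimeter = {perimeter:.4f}, area = {area:.6f}")
# perimeter grows without bound; area converges to about 0.6928
```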

  8. Real world fractals • A cloud, a mountain, a flower, a tree or a coastline… (image: the coastline of Britain)

  9. Fractal geometry: the language of nature • Euclidean geometry: cold and dry • Nature: complex, irregular, fragmented • “Clouds are not spheres, mountains are not cones, coastlines are not circles, and bark is not smooth, nor does lightning travel in a straight line.” (Mandelbrot)

  10. Euclidean dimension • In Euclidean geometry, the dimensions of objects are integers. • 0 - a point • 1 - a curve or line • 2 - triangles, circles or surfaces • 3 - spheres, cubes and other solids

  11. Fractal dimension • Fractal dimensions can be non-integer. • Intuitively, the fractal dimension is a measure of how much space the fractal occupies. • Divide a self-similar shape into n identical parts (n is the number of segments), where the whole is s times the size of each part. The fractal dimension is then: d = log n / log s (a small sketch of this formula follows)
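A tiny sketch of the similarity-dimension formula, with sanity checks on ordinary shapes (illustrative, not from the slides):

```python
from math import log

# d = log n / log s, where the shape splits into n copies,
# each 1/s the size of the whole.
def fractal_dimension(n_pieces: int, scale: float) -> float:
    return log(n_pieces) / log(scale)

print(fractal_dimension(2, 2))   # line segment:   1.0 (sanity check)
print(fractal_dimension(4, 2))   # filled square:  2.0 (sanity check)
print(fractal_dimension(4, 3))   # Koch curve:     1.2618...
print(fractal_dimension(2, 3))   # Cantor set:     0.6309...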

  12. Example: the Koch snowflake • After the first step, each side is divided into four segments, so n = 4. • Each new segment is one third the length of the original side, so the whole is s = 3 times the length of each part. • So the fractal dimension is: d = log 4 / log 3 = 1.2618... • It takes more space than a one-dimensional line segment, but occupies less space than a filled two-dimensional square.

  13. Another example: the Cantor Set • The oldest, simplest, most famous fractal: 1. Begin with the closed interval [0,1]. 2. Remove the open interval (1/3, 2/3), leaving two closed intervals behind. 3. Repeat the procedure, removing the "open middle third" of each of these intervals. 4. Continue infinitely. • Fractal dimension: d = log 2 / log 3 = 0.63… • Uncountably many points, yet zero length • Challenge problem: is ¾ in the Cantor set? (A membership-test sketch follows.)
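One hedged way to attack the challenge problem numerically: a point of [0,1] is in the Cantor set iff it never falls strictly inside a removed middle third when the surviving interval is repeatedly rescaled back to [0,1]. A finite check, not a proof (rationals have eventually periodic orbits, so a modest step count usually suffices):

```python
from fractions import Fraction

def in_cantor_set(x: Fraction, steps: int = 60) -> bool:
    for _ in range(steps):
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False          # fell into a deleted open middle third
        # rescale the surviving third back onto [0, 1]
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True                   # survived every deletion we checked

print(in_cantor_set(Fraction(3, 4)))   # try the challenge problem yourself
```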

  14. Devil’s staircase • Take a Cantor set, which is composed of an infinite number of points. • Consider turning those points into dots and letting a Pacman eat them. • As our Pacman eats the dots, he gets heavier. Imagine that his weight after eating all the dots is 1. • Graphing his weight against time, we get the devil’s staircase.

  15. Cantor square • Fractal dimension: d = log 4 / log 3 = 1.26

  16. The Mandelbrot Set • The Mandelbrot set is a connected set of points in the complex plane • Calculate: Z1 = Z0^2 + Z0, Z2 = Z1^2 + Z0, Z3 = Z2^2 + Z0, … • If the sequence Z0, Z1, Z2, Z3, ... remains within a distance of 2 of the origin forever, then the point Z0 is said to be in the Mandelbrot set. • If the sequence diverges from the origin, then the point is not in the set (see the escape-time sketch below)
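The escape-time test above translates almost directly into code. A minimal sketch following the slide's recurrence (the iteration cap of 200 is an arbitrary choice; in practice one can only ever test "remains within distance 2 for this many steps"):

```python
def in_mandelbrot(z0: complex, max_iter: int = 200) -> bool:
    z = z0
    for _ in range(max_iter):
        if abs(z) > 2:
            return False          # diverged: z0 is outside the set
        z = z * z + z0            # z_{k+1} = z_k^2 + z_0
    return True                   # survived max_iter steps: likely inside

print(in_mandelbrot(0))           # True: the origin never escapes
print(in_mandelbrot(1))           # False: 1 -> 2 -> 5 -> ... diverges
```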

  17. Colored Mandelbrot Set • The colors are added to the points that are not inside the set. Then we just zoom in on it

  18. Applications of fractals • Astronomy: the structure of our universe • Superclusters - clusters - galaxies - star systems (e.g. the Solar System) - planets - moons • Every level of detail of the universe shows the same clustering patterns. • It can be modeled by a random Cantor square. • The fractal dimension of our universe: 1.23

  19. Applications of fractals • The rings of Saturn • Originally, it was believed to be a single ring. • After some time, a break in the middle was discovered, and scientists considered it to be 2 rings. • However, when Voyager I approached Saturn, it discovered that those two rings were also broken in the middle, and that the 4 smaller rings were broken as well. • Eventually, it identified a very large number of breaks, continually breaking even small rings into smaller pieces. • The overall structure is amazingly similar to... the Cantor set

  20. Applications of fractals • The human body • The lungs: formed by splitting lines - fractal canopies • The brain: the surface of the brain contains a large number of folds • Humans, the most intelligent animals, have the most folded brain surfaces • Geometrically, an increase in folding means an increase in dimension • In humans the fractal dimension of the brain surface is the highest, between 2.73 and 2.79

  21. Plants • A tree branch looks similar to the entire tree • A fern leaf looks almost identical to the entire fern • One classic way of creating fractal plants is by means of L-systems (Lindenmayer systems)

  22. Bacteria Cultures • A bacterial culture is all the bacteria that originated from a single ancestor and are living in the same place. • When a culture is growing, it spreads outwards in different directions from the place where the original organism was placed. • The spreading of bacteria can be modeled by fractals such as diffusion fractals.

  23. Data Compression • A full-color, full-screen GIF image of the Mandelbrot set occupies about 35 kilobytes • The formula z = z^2 + c: 7 bytes! (a 99.98% reduction) • This could work for other images as well • The goal is to find functions, each of which produces some part of the image. • IFS (Iterated Function Systems) are the key. (A toy IFS sketch follows.)
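As a toy illustration of the IFS idea (not the actual fractal image-compression algorithm), three contraction maps regenerate the entire Sierpinski gasket from a handful of coefficients via the "chaos game": the stored description is a few numbers, yet it reproduces arbitrarily much image detail.

```python
import random

# Three contractions, each moving halfway toward one triangle corner.
corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]

def chaos_game(n_points: int = 10000):
    x, y = 0.5, 0.5
    points = []
    for _ in range(n_points):
        cx, cy = random.choice(corners)    # pick one contraction at random
        x, y = (x + cx) / 2, (y + cy) / 2  # apply it
        points.append((x, y))
    return points

pts = chaos_game()
print(len(pts), pts[:3])   # plot these points to see the Sierpinski gasket
```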

  24. Weather • Weather behaves very unpredictably • Sometimes it changes very smoothly; other times it changes very rapidly • Edward Lorenz came up with three formulas that could model the changes of the weather. • Used to drive a 3D trajectory, these formulas produce a strange attractor: the famous Lorenz attractor, which is a fractal pattern. (A minimal integration sketch follows.)
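A hedged sketch of those three formulas, integrated with a crude Euler step; the classic parameter values (sigma = 10, rho = 28, beta = 8/3) are assumed, not taken from the slides:

```python
def lorenz_trajectory(steps: int = 10000, dt: float = 0.01,
                      sigma: float = 10.0, rho: float = 28.0, beta: float = 8/3):
    x, y, z = 1.0, 1.0, 1.0          # arbitrary starting point
    path = []
    for _ in range(steps):
        dx = sigma * (y - x)          # the three Lorenz equations
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        path.append((x, y, z))
    return path

print(lorenz_trajectory(3))          # plot the full path in 3D to see the attractor
```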

  25. Fractal Antenna • Practical shrinkage of 2-4 times is realizable with acceptable performance. • Smaller, yet with even better performance

  26. Electronic Transmission Errors • During electronic transmissions, electronic noise sometimes interferes with the transmitted data. • Although making the signal more powerful would drown out some of this harmful noise, some of it persisted, creating errors during transmissions. • Errors occur in clusters: a period of no errors is followed by a period with many errors. • On any scale of magnification (month, day, hour, 20 minutes, …), the proportion of error-free transmission to error-ridden transmission stays constant. • Mandelbrot studied the underlying mathematical process and showed that random Cantor dust describes the fractal structure of these batches of errors on computer lines remarkably well.

  27. Network Traffic Model • Packet delays as a function of time in a WAN environment: • top diagram - absolute values of the RTT parameter in a virtual channel; • bottom diagram - fractal structure of the packet flow that exceeded a 600 ms threshold.

  28. Fractal Art

  29. Fractal Summary • Fractals are self-similar or self-affine structures • A fractal object has a fractal dimension • Fractals model many natural objects and processes. • They are nature’s language. • They have very broad applications.

  30. Part II Bayesian Networks

  31. Bayesian Networks Review • Bayesian Networks • Examples • Belief Update and Belief Revision • The Joint Probability Space and Brute-Force Inference

  32. Bayesian Networks • Bayesian networks, also called Bayesian belief networks, causal networks, or probabilistic networks, are a network-based framework for representing and analyzing causal models involving uncertainty. • A BBN is a directed acyclic graph (DAG) with conditional probabilities for each node. • Nodes represent random variables in a problem domain. • Arcs represent conditional dependence relationships among these variables. • Each node contains a CPT (Conditional Probability Table) that holds the probabilities of the node taking specific values given the values of its parent nodes.

  33. Family-Out Example (network: Bowel-problem, Family-out, Light-on, Dog-out, Hear-bark) • "Suppose when I go home at night, I want to know if my family is home before I try the doors. (Perhaps the most convenient door to enter is double locked when nobody is home.) Now, often when my wife leaves the house, she turns on an outdoor light. However, she sometimes turns on the lights if she is expecting a guest. Also, we have a dog. When nobody is home, the dog is put in the back yard. The same is true if the dog has bowel problems. Finally, if the dog is in the back yard, I will probably hear her barking (or what I think is her barking), but sometimes I can be confused by other dogs."
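One way to picture the CPT idea on this story: store, for each node, P(node | parents) keyed by the parents' values. A minimal sketch; the node names come from the example above, but the probabilities are illustrative assumptions:

```python
# Hypothetical CPT entries for two of the five nodes (numbers are assumptions).
bbn = {
    "FamilyOut": {"parents": [], "cpt": {(): 0.15}},             # P(FamilyOut=T)
    "LightOn":   {"parents": ["FamilyOut"],
                  "cpt": {(True,): 0.60, (False,): 0.05}},       # P(LightOn=T | FamilyOut)
}

def prob(node: str, value: bool, parent_values: tuple) -> float:
    """P(node = value | parents = parent_values), read from the node's CPT."""
    p_true = bbn[node]["cpt"][parent_values]
    return p_true if value else 1.0 - p_true

print(prob("LightOn", True, (True,)))   # P(light on | family out) = 0.6
```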

  34. Asia Example from Medical Diagnostics

  35. Why is BBN important? • Offers a compact, intuitive, and efficient graphical representation of the dependence relations between entities of a problem domain (models the world more naturally than rule-based systems and neural networks) • Handles uncertain knowledge in a mathematically rigorous yet efficient and simple way • Provides a computational architecture for computing the impact of evidence nodes on the beliefs (probabilities) of query nodes of interest • A growing number of creative applications

  36. Alarm Example: the power of BBN (figure: the 37-node ALARM DAG, with nodes such as MINVOLSET, KINKEDTUBE, PULMEMBOLUS, INTUBATION, VENTLUNG, CATECHOL, HR, CO, BP, …) • The Alarm network • 37 variables, 509 parameters (instead of 2^37)

  37. Applications • Medical diagnostic systems • Real-time weapons scheduling • Jet engine fault diagnosis • Intel processor fault diagnosis (Intel) • Generator monitoring expert system (General Electric) • Software troubleshooting (Microsoft Office Assistant, Win98 print troubleshooting) • Space shuttle engine monitoring (Vista project) • Biological sequence analysis and classification • ……

  38. Bayesian Networks Inference • Given observed evidence, compute answers to queries • Evidence e is an assignment of values to a set of variables E in the domain, E = { Xk+1, …, Xn } • For example, E = e: { Visit Asia = True, Smoke = True } • Queries: • The posterior belief: compute the conditional probability of a variable given the evidence, e.g. • P(Lung Cancer | Visit Asia = TRUE AND Smoke = TRUE) = ? • This kind of inference task is called Belief Updating • MPE: compute the Most Probable Explanation given the evidence • An explanation of the evidence is a complete assignment { X1 = x1, …, Xn = xn } that is consistent with the evidence. Computing an MPE means finding an explanation such that no other explanation has higher probability • This kind of inference task is called Belief Revision

  39. Belief Updating (Asia network figure: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Tub. or Lung Cancer, Bronchitis, X-Ray, Dyspnea) • The problem is to compute P(X=x | E=e): the probability of the query nodes X given the observed values of the evidence nodes E = e. For example: suppose that a patient arrives and it is known for certain that he has recently visited Asia and has dyspnea. - What impact does this evidence have on the probabilities of the other variables in the network? P(Lung Cancer | Visit Asia = T, Dyspnea = T) = ? (See the brute-force sketch below.)
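A brute-force sketch of belief updating on a made-up two-variable stand-in (the CPT numbers are assumptions for illustration, not Asia's real parameters): compute P(X | e) = P(X, e) / P(e), where each term sums the joint over all complete assignments consistent with the evidence.

```python
# Toy model: rain -> wet grass, with invented CPT entries.
def joint(rain: bool, wet: bool) -> float:
    p_rain = 0.2 if rain else 0.8
    p_wet_given_rain = (0.9 if wet else 0.1) if rain else (0.2 if wet else 0.8)
    return p_rain * p_wet_given_rain

evidence_wet = True
p_e = sum(joint(r, evidence_wet) for r in (True, False))   # P(e)
p_query_and_e = joint(True, evidence_wet)                  # P(rain, e)
print(p_query_and_e / p_e)                                 # P(rain | e) ≈ 0.53
```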

  40. Belief Revision • Let W be the set of all nodes in a given Bayesian network (here, a small sprinkler/rain example). Let the evidence e be the observation that the roses are okay. Our goal is to determine the assignment to all nodes that maximizes P(w|e). We only need to consider assignments where the node roses is set to okay, and maximize P(w): the most likely “state of the world” given the evidence that the roses are okay in “this world”. The best solution then becomes: P(sprinklers = F, rain = T, street = wet, lawn = wet, soil = wet, roses = okay) = 0.2646 (A brute-force MPE sketch follows.)
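The same brute-force idea, but maximizing instead of summing, gives MPE: clamp the evidence, enumerate the remaining assignments, and keep the one with the highest joint probability. The two-variable joint() below is a made-up stand-in, not the slide's sprinkler network:

```python
from itertools import product

def joint(world: dict) -> float:
    r, w = world["rain"], world["wet"]
    p_rain = 0.2 if r else 0.8
    p_wet = (0.9 if w else 0.1) if r else (0.2 if w else 0.8)
    return p_rain * p_wet

def most_probable_explanation(variables, evidence):
    best, best_p = None, -1.0
    free = [v for v in variables if v not in evidence]
    for values in product([True, False], repeat=len(free)):
        world = dict(evidence, **dict(zip(free, values)))  # clamp evidence
        p = joint(world)
        if p > best_p:
            best, best_p = world, p
    return best, best_p

print(most_probable_explanation(["rain", "wet"], {"wet": True}))
```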

  41. Complexity of BBN Inference • Probabilistic inference using belief networks is NP-hard. [Cooper 1990] • Approximating probabilistic inference in Bayesian belief networks is NP-hard. [Dagum 1993] • Hardness does not mean we cannot solve inference. It implies that: • We cannot find a general procedure that works efficiently for all networks • However, for particular families of networks, we can have provably efficient algorithms, either exact or approximate • Instead of a general exact algorithm, we look for special-case, average-case, and approximate algorithms • Various approximate, heuristic, hybrid, and special-case algorithms should be taken into consideration

  42. BBN Inference Algorithms • Exact algorithms • Pearl’s message propagation algorithm (for singly connected networks only) • Variable elimination • Cutset conditioning • Clique tree clustering • SPI (Symbolic Probabilistic Inference) • Approximate algorithms • Partial evaluation methods: perform exact inference partially • Variational approach: exploit averaging phenomena in dense networks (law of large numbers) • Search-based algorithms: convert the inference problem to an optimization problem, then use heuristic search to solve it • Stochastic sampling, also called Monte Carlo algorithms (a minimal sketch follows this list)
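As promised above, a minimal sketch of the simplest member of the stochastic sampling family, rejection sampling (toy model and probabilities are assumptions, reused from the earlier belief-updating sketch): sample complete worlds from the priors, discard those inconsistent with the evidence, and estimate the query from what remains.

```python
import random

def sample_world():
    rain = random.random() < 0.2                     # P(rain) = 0.2 (made up)
    wet = random.random() < (0.9 if rain else 0.2)   # P(wet | rain)
    return rain, wet

kept, hits = 0, 0
for _ in range(100_000):
    rain, wet = sample_world()
    if wet:                  # keep only samples consistent with the evidence
        kept += 1
        hits += rain
print(hits / kept)           # ≈ P(rain | wet) ≈ 0.53
```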

  43. Inference Algorithm Conclusions • The general problem of exact inference is NP-hard. • The general problem of approximate inference is NP-hard. • Exact inference works for small, sparse networks only. • There is no single champion among exact or approximate algorithms. • The goal of research should be to identify effective approximate techniques that work well on large classes of problems. • Another direction is the integration of various exact and approximate algorithms, exploiting the best characteristics of each.

  44. Part III BBN Inference using fractals? • Joint Probability Distribution Space of a BBN • Asymmetries in Joint Probability Distributions (JPD) • Fractal Property of the JPD • How can it help with approximate inference?

  45. Joint Probability Distribution (Asia network figure: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Tub. or Lung Cancer, Bronchitis, X-Ray, Dyspnea) • Asia revisited: - Eight binary nodes - Each node has 2 states: Y or N - Total states: 2^8 = 256

  Instance#      a b c d e f g h    Probability
  Instance 0:    0 0 0 0 0 0 0 0    0.000274
  Instance 1:    0 0 0 0 0 0 0 1    0.000118
  Instance 2:    0 0 0 0 0 0 1 0    0
  Instance 3:    0 0 0 0 0 0 1 1    0
  Instance 4:    0 0 0 0 0 1 0 0    0
  Instance 5:    0 0 0 0 0 1 0 1    0
  Instance 6:    0 0 0 0 0 1 1 0    0
  Instance 7:    0 0 0 0 0 1 1 1    0
  …
  Instance 252:  1 1 1 1 1 1 0 0    0.001698023
  Instance 253:  1 1 1 1 1 1 0 1    0.015282209
  Instance 254:  1 1 1 1 1 1 1 0    0.032262442
  Instance 255:  1 1 1 1 1 1 1 1    0.290361976

  46. JPD of Asia • 256 states • Max: 0.290362 • Probabilities spread over 9 orders of magnitude: 0, 1.50E-09, …, 0.290362 • The top 30 most likely states cover 99.2524% of the total probability space. • Conclusion: there is usually a small fraction of states that covers a large portion of the total probability space, with the remaining states having practically negligible probabilities. (A small sketch of this top-k coverage measurement follows.)
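A small sketch of how one would measure this top-k coverage. The distribution below is synthetic (an artificially skewed stand-in), since the slides' actual Asia joint is not reproduced here; with a real network, joint_probs would come from enumerating all 2^n states:

```python
import random

random.seed(0)
weights = [random.random() ** 8 for _ in range(256)]   # made-up skewed weights
total = sum(weights)
joint_probs = sorted((w / total for w in weights), reverse=True)

top30 = sum(joint_probs[:30])
print(f"top 30 of 256 states cover {top30:.1%} of the probability mass")
```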

  47. Why is the JPD so skewed? • When we know nothing about the domain, the JPD should be flat. • The more we know about a domain, the more asymmetry the individual probabilities will show. • When the domain and its mechanism are well known, probability distributions tend to be extreme. • Conclusion: asymmetries in the individual distributions result in joint probability distributions exhibiting orders-of-magnitude differences in the probabilities of the various states of the model.

  48. How does this help inference? • Considering only a small number of high-probability individual states can lead to good approximations in belief updating. • The result can be refined by exploring more of the highly likely states. • Problem: where do we locate these “peaks”?

  49. The global map of the JPD of Asia • To locate these peaks, let’s first make the map. • Node order: visit to Asia? | smoking? | tuberculosis? | either tub. or lung cancer? | positive X-ray? | lung cancer? | bronchitis? | dyspnoea? • CPT arrangement: let small numbers go first (for example, 0.01 0.99 and 0.3 0.7) in order to shift high values to the same area. • Most “peaks” are in the second half of the map • Clusters of high “peaks”

  50. Self-Similarity (or Self-Affinity) Property
