
THE FOUR C's OF NEUROINFORMATION THEORY: CODING, COMPUTING, CONTROL AND COGNITION


Presentation Transcript


  1. THE FOUR C's OF NEUROINFORMATION THEORY: CODING, COMPUTING, CONTROL AND COGNITION IBM Almaden: Institute on Cognitive Computing, May 10-11, 2006 Toby Berger University of Virginia Charlottesville, VA 22903

  2. FIG. 1 BLOCK DIAGRAM OF MARKOV-MARKO BRAIN MODEL (blocks: Sensory System/Brain, Selector, Motor System, Environment)

  3. FIG. 1 BLOCK DIAGRAM OF MARKOV-MARKO BRAIN MODEL, ANNOTATED WITH THE FOUR C's: Sensory System (Coding, Computation), Selector (Cognition), Motor System (Control), Environment

  4. MY BIO-IT COLLABORATORS • Prof. William B. “Chip” Levy, UVA Med - Neuroscientist and my prime bio-collaborator. • Former Grad Students: Zhen Zhang, Yuzheng Ying, Jun Chen • PhD Candidate: Prapun Suksompong

  5. FIGURE 1 OF EVERY INFORMATION THEORY TEXTBOOK (chain: Source → Source Encoder → Channel Encoder → Channel → Channel Decoder → Source Decoder → User) • Channel is “fixed” and “given.” • Future source data does not depend on past outputs to user (open loop). • Channel behavior is independent of source statistics. • Good performance usually requires computationally intense, long-delay source and channel codes. • Source and user must exchange coding rules a priori and must share a common “language.”

  6. BUT, IN THE INTRA-ORGANISM COMMUNICATION THAT NEUROSCIENTISTS STUDY, • Channels are not fixed. They adapt their transition probabilities over eons, or over milliseconds, in response to the empirical distribution of the source. • Future source data depends on past outputs to user. • Time-varying joint source-channel coding often can be efficiently performed by biochemical subsystems of appropriate topology via simple probabilistic transformations. No coding occurs in the classical sense of information theory.

  7. WAIT! What about DNA? Long block code, discrete alphabet, extensive redundancy, perhaps to guard against the infiltration of errors. But DNA enables two organisms to communicate; it’s designed for inter-organism communication. DNA also controls gene expression, an intra-organism process, so a comprehensive theory of intra-organism communication needs to address it eventually.

  8. ROBUST SHANNON-OPTIMAL PERFORMANCE WITHOUT CODING Ex. 1: IID Gaussian Source, MSE, AWGN Channel. Equating R(D) = (1/2) log(σ_X²/D) to C = (1/2) log(1 + P/N) yields the Shannon-optimum mean distortion D_min = σ_X²/(1 + P/N). But this minimum possible MSE per unit variance can be achieved simply by scaling the signal to the available channel input power level and then scaling the channel output to produce the MMSE estimate! (Block diagram: Source → X → AWGN channel → Y → User.)
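As a sanity check on Ex. 1, here is a minimal simulation (my own sketch, not part of the talk) of the scale-and-MMSE scheme the slide describes; it assumes a unit-variance Gaussian source and illustrative power levels P = 4, N = 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
sigma2, P, N = 1.0, 4.0, 1.0        # source variance, channel input power, noise power

X = rng.normal(0.0, np.sqrt(sigma2), n)
a = np.sqrt(P / sigma2)             # scale the source up to the power constraint
Y = a * X + rng.normal(0.0, np.sqrt(N), n)      # AWGN channel
Xhat = (a * sigma2 / (a**2 * sigma2 + N)) * Y   # scale the output: linear MMSE estimate

mse = np.mean((X - Xhat) ** 2)
print(f"empirical MSE   : {mse:.4f}")
print(f"Shannon optimum : {sigma2 / (1 + P / N):.4f}")   # sigma^2 / (1 + P/N)
```

With P/N = 4 both numbers come out near 0.2, with no coding and no delay.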

  9. ROBUST SHANNON-OPTIMAL PERFORMANCE WITHOUT CODING Ex. 2: Bern-1/2 Source, Hamming Distortion, BSC(p) Channel. Equating R(D) = 1 − h(D) to C = 1 − h(p), where h(·) is the binary entropy function, yields the Shannon-optimum Hamming distortion D_min = p. This minimum possible Hamming distortion obviously can be achieved simply by feeding the source output directly into the channel and sending the channel output directly to the user – no delay, no coding!! (Block diagram: Bern-1/2 Source → X → BSC(p) → Y → User.)
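The corresponding check for Ex. 2 (again my own sketch): Bernoulli-1/2 bits fed straight through a BSC(p) incur Hamming distortion ≈ p, exactly the value at which R(D) = C:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 1_000_000, 0.1

X = rng.integers(0, 2, n)           # Bernoulli-1/2 source bits
Y = X ^ (rng.random(n) < p)         # BSC(p): flip each bit with probability p
print(f"empirical Hamming distortion: {np.mean(X != Y):.4f}  (Shannon optimum: p = {p})")
```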

  10. SHANNON OPTIMALITY IS ACHIEVED WITHOUT CODING OR DELAY IN THESE TWO EXAMPLES BECAUSE: Source is matched to the channel. Source outputs are distributed over channel input space in a way that maximizes the mutual info rate between the channel input and output subject to operative constraint(s), thereby achieving capacity. Channel is matched to the source. The channel transition probability structure is optimum for the source and distortion measure; i.e., it achieves the point on their rate-distortion function at which the rate equals the channel’s capacity. [INSPIRED BY MY ABOVE EXAMPLES 1 AND 2, B. RIMOLDI, M. GASTPAR AND M. VETTERLI HAVE DETERMINED A BROAD CLASS OF EXAMPLES THAT EXHIBIT SUCH DOUBLE MATCHING, FIRST WITHOUT AND LATER WITH NOISELESS FEEDBACK OF THE CHANNEL OUTPUTS TO THE ENCODER.]

  11. I CONTEND THAT MOST BIOLOGICAL SYSTEMS HAVE EVOLVED TO BE NEARLY DOUBLY MATCHED LIKE THIS. THUS, THEY HANDLE DATA OPTIMALLY WITH MINIMAL IF ANY CODING AND NEGLIGIBLE DELAY. Information theorists recently have come to appreciate that near-optimum performance can be obtained in many situations via relatively simple probabilistic methods that employ feedback in the source encoder and/or around the channel, and/or in the channel decoder. Biology has known this for eons.

  12. BUT THERE’S MORE! LIVING ORGANISMS ARE INGENIOUSLY ENERGY-AWARE*. THEY’RE OPTIMALLY DOUBLY MATCHED OVER A WIDE RANGE OF POWER CONSUMPTION LEVELS. THEY HAVE EVOLVED THE ABILITY TO CHANGE THEIR INTERNAL CHANNEL TRANSITION FUNCTIONS, OVER BOTH THE LONG RUN AND THE SHORT RUN, TO MEET THE INFORMATION RATE NEEDS OF THE APPLICATION AT HAND. *The brain consumes 25-50% of the total metabolic energy budget of a sedentary human. (L. Sokoloff (1989), “Circulation and energy metabolism of the brain,” in Basic Neurochemistry: Molecular, Cellular and Medical Aspects, 4th ed., G. Siegel et al., Eds.)

  13. CAPACITY C (bits/s) vs. AVERAGE POWER S (joules/s): the slope of the capacity curve is (bits/s)/(joules/s) = bits/joule. N.B. Increasing joules/s to get more bits/s requires expending more joules/bit!!
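To make the slide's N.B. concrete, the sketch below assumes an AWGN capacity curve C(S) = (1/2) log2(1 + S/N) (the talk does not commit to a particular channel; this curve is my assumption) and tabulates bits/joule as power grows:

```python
import numpy as np

N = 1.0                                  # noise power, arbitrary units
for S in (0.1, 1.0, 10.0, 100.0):        # average power, "joules/s"
    C = 0.5 * np.log2(1.0 + S / N)       # capacity in bits/s for an AWGN channel
    print(f"S={S:6.1f}  C={C:6.3f} bits/s  "
          f"bits/joule={C / S:7.4f}  joules/bit={S / C:8.3f}")
# Because the capacity curve is concave, bits/joule falls (and joules/bit
# rises) as the operating power S increases -- the slide's point exactly.
```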

  14. NEURON CARDINALITY There are approximately 10^11 neurons in the human brain. Each neuron forms synapses with between 10 and 10^5 others, resulting in a total of circa 10^15 synapses. From age -1/2 to age +2, the number of synapses increases at a net rate of a million per second, day and night; many are abandoned, too. It had long been believed that neuron and synapse formation effectively cease after age 1 and age 2, respectively, but recent studies have shown that they continue until at least age 6.

  15. MULTICASTING: Viewed as a network, the human brain simultaneously multicasts 10^11 messages that have an average of 10^4 recipients each. Each of these 10^11 x 10^4 = 10^15 destinations receives a new binary digit – spike or no spike – once every 2.5 ms, which is the effective spike width. Moreover, 2.5 ms later another petabit that depends on the outcome of processing the previous one has been multicast. (The Internet pales by comparison!) The brain does not simply use store-and-forward routing. Rather, it uses an intensive form of network coding, the exciting new information-theoretic discipline recently introduced by Raymond Yeung and Bob Li. (See, e.g., the latest IT Outstanding Paper Award winning article by Yeung, Li, Ahlswede, and Cai.)
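A quick arithmetic check of the counts quoted in slides 14 and 15 (the script only multiplies out the slides' own figures):

```python
neurons = 1e11                 # slide 14: ~10^11 neurons
fan_out = 1e4                  # ~10^4 synapses per neuron on average
synapses = neurons * fan_out
print(f"total synapses      ~ {synapses:.0e}")                # ~1e15

slot = 2.5e-3                  # effective spike width, seconds
print(f"aggregate delivery  ~ {synapses / slot:.1e} bits/s")  # one petabit per 2.5 ms

seconds = 2.5 * 365 * 24 * 3600            # age -1/2 to age +2 is ~2.5 years
print(f"synapses formed at 1e6/s over that span ~ {1e6 * seconds:.1e}")
```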

  16. Time permitting, we shall see below that the fact that neurons actually fire asynchronously in continuous time may enable them to send considerably more bits per second than their relatively low firing rates suggest.

  17. DEFINITION OF A “TEAM” OF SENSORY NEURONS THE AXONS IN A TEAM OF SENSORY NEURONS FORM MANY OF THEIR SYNAPSES WITH OTHER NEURONS IN THE TEAM (HORIZONTAL, FEEDBACK). SOMETIMES THE LOCAL CONNECTIVITY IS CLOSE TO 50%, AS OPPOSED TO ONLY 10^-7 BRAINWIDE. THE REMAINDER OF THE SYNAPSES TO WHICH A TEAM’S AXONS ARE EFFERENT ARE SPLIT BETWEEN “LOWER” NEURONS (TOP-DOWN FEEDBACK) AND “HIGHER” NEURONS (BOTTOM-UP FEEDFORWARD).

  18. TIME-DISCRETE MODEL OF A “TEAM” OF NEURONS (figure: network schematic with PSPs)

  19. MAXIMUM INFORMATION RATE HYPOTHESIS The process {X(k)} afferent to a team of neurons has the property that it maximizes the directed mutual information rate from {X(k)} to the efferent process {Y(k)} that it generates, where the maximization is over all processes that lead to the same or smaller energy expenditure in the Y-neurons. Remarks: 1) Energy is expended in the synapses both in receiving and in responding to afferent excitation, and in the axons both to restore chemical concentrations during refractory periods following action potential generation and, to a lesser extent, to drive spikes down the axonal ‘transmission lines’. 2) Time permitting, directed information will be defined in a subsequent slide.

  20. The Brain as a Markov Chain MAIN THEOREM: IF THE MAXIMUM INFORMATION RATE HYPOTHESIS IS TRUE, THEN: • {(X_k, Y_k)} is a first-order (non-homog) Markov chain • {Y_k} is a first-order (non-homog) Markov chain • {X_k} is not necessarily Markovian PROOF: Via the Berger-Ying lemmas. Joint work with Yuzheng Ying, to appear in IEEE IT Trans.

  21. REMARKS: • The max info rate hypothesis says the source {X(k)} is robustly “matched” to the channel’s transition matrix, P(y|x). • If double matching prevails, as we suspect it does, then the QSF rate parameterizes the rate-distortion function, and distortion is measured by a Weber-Fechner fidelity criterion, i.e., one that penalizes relative rather than absolute error. • The Markovianness of the Main Theorem is essential to the brain’s low-latency processing of sensory information. Without it, bottom-up delay would accumulate too fast to allow for the number of hierarchy levels needed to achieve the sophisticated distinctions of which the brain is capable.

  22. NEURAL CODING AND SYNAPTIC CLOCKS It is widely held that the principal, if not the only, information transmission task a neuron is called upon to perform is to convey continually to its efferent cohort the value of the afferent excitation intensity (a.k.a. the “bombardment”) it has recently been experiencing.

  23. FIXED THRESHOLD NEURAL MODELS ARE PLAGUED BY LARGE COEFFICIENTS OF VARIATION Several investigators have studied the statistics of the durations of interspike intervals (ISI’s) for mathematical models of leaky, fixed-threshold neurons. Both with and without a refractory period included in the model, the ISI’s coefficient of variation (i.e., the ratio σ/μ of its standard deviation to its mean) is greater than 1 over almost the entire range of afferent excitation levels of practical interest; the only exception is at the highest excitation levels that result in the neuron firing about as fast as it can (saturation).
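The claim can be explored with a small leaky integrate-and-fire simulation. Everything below (unit-height EPSPs, threshold 8, 10 ms leak, the rate grid) is an illustrative assumption of mine, not the specific models those investigators analyzed:

```python
import numpy as np

def isi_cv(rate, theta=8.0, tau=10.0, dt=0.1, t_max=100_000.0, seed=0):
    """ISI coefficient of variation for a leaky neuron with a FIXED threshold.

    rate : Poisson EPSP arrival rate (events/ms), i.e., the bombardment level.
    tau  : membrane leak time constant (ms)."""
    rng = np.random.default_rng(seed)
    steps = int(t_max / dt)
    inc = rng.poisson(rate * dt, steps)      # Poisson synaptic bombardment
    decay = np.exp(-dt / tau)
    v, last, isis = 0.0, 0.0, []
    for k in range(steps):
        v = v * decay + inc[k]               # leaky integration of unit EPSPs
        if v >= theta:                       # fixed threshold: fire, then reset
            t = (k + 1) * dt
            isis.append(t - last)
            last, v = t, 0.0
    isis = np.asarray(isis[1:])
    return isis.std() / isis.mean() if isis.size > 10 else float("nan")

for rate in (0.5, 0.8, 1.2, 2.0):            # weak -> saturating bombardment
    print(f"rate {rate:.1f}/ms  ISI CV = {isi_cv(rate):.2f}")
# In the fluctuation-driven regime the CV is large; it only becomes small once
# the mean drive exceeds threshold and the neuron fires about as fast as it can.
```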

  24. This renders timing codes virtually useless, leaving rate codes as the only means by which a neuron can reliably communicate information to its efferent neighbors about the bombardment intensity it is currently experiencing. However, that is in direct conflict with numerous recent experiments which convincingly demonstrate that many neurons in cortex and elsewhere exhibit reliable ISI’s in response to repetitions of investigator-controlled stimuli. Also, animals can respond intelligently at latencies which are substantially lower than the time it would take for a hierarchy of rate codes to achieve a useful level of statistical reliability.

  25. A compelling (?) case for this has been made by Berger and Levy, “Encoding of Excitation via Dynamic Thresholding,” NEUROSCIENCE 2004, San Diego, CA, 10/23-28/2004.

  26. MEAN PSP vs. TIME FOR VARIOUS BOMBARDMENT INTENSITIES (figure: PSP vs. time in ms, with curves rising faster as intensity increases)

  27. FILTERED POISSON PSP’s vs. TIME (figure: PSP sample paths vs. time in ms)

  28. DYNAMICALLY DESCENDING THRESHOLDS ENABLE TIMING CODES (figure: PSP vs. time in ms, comparing the spiking times of red and blue PSP’s under a fixed threshold with those under a descending threshold)

  29. A descending threshold can serve as a simple mechanism by means of which a neuron can accurately encode, into the duration of the ISI between any two of its successive AP’s, the value of the excitation intensity it has experienced during said ISI. This statement is true regardless of whether the intensity in question is strong, moderate, or weak. A neuron that possesses a fixed threshold cannot accomplish this.
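A sketch of the encode/decode idea. For concreteness it assumes an exponentially decaying threshold T(t) = T0·exp(-t/τ) and a leak-free mean PSP ramp λt; slide 32 will note that the precise decay shape is unimportant:

```python
import numpy as np
from scipy.optimize import brentq

T0, tau = 20.0, 15.0        # assumed threshold decay T(t) = T0 * exp(-t / tau)

def isi(lam):
    """First time the mean PSP ramp lam*t crosses the descending threshold."""
    return brentq(lambda t: lam * t - T0 * np.exp(-t / tau), 1e-9, 1e4)

def decode(t):
    """Invert the crossing equation: the observed ISI reveals the intensity."""
    return T0 * np.exp(-t / tau) / t

for lam in (0.2, 1.0, 5.0):  # weak, moderate, strong excitation
    t = isi(lam)
    print(f"lam={lam:4.1f}  ISI={t:8.3f} ms  decoded lam={decode(t):.3f}")
# A fixed threshold would give t = T0/lam in this leak-free sketch, which blows
# up for weak excitation (and, with leak, might never be crossed at all); the
# descending threshold keeps the ISI bounded and invertible at all intensities.
```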

  30. It is also known that synapses possess chemical “clocks” that enable them to “remember,” even for hundreds of milliseconds, how long ago their most recent and next-to-most-recent afferent spikes arrived. ALL THIS LEADS ME TO BELIEVE THAT NEURONS DO INDEED IMPLEMENT ACCURATE, LOW-LATENCY TIMING CODES BY MEANS OF DYNAMIC POST-SYNAPTIC POTENTIAL THRESHOLDS THAT DECAY WITH TIME.

  31. Alternatively, a neuron also can achieve much the same result by having a post-synaptic leakage conductance that varies inversely with PSP. (See, e.g., Brette and Gerstner, 2005.) It may well be that neurons employ a combination of threshold decay and variable leakage conductance. However, in what follows we use only threshold decay terminology.

  32. 1. The precise shape of the threshold decay curve is not important; the neurons in the efferent cohort can readily adapt to the shape of T(t). 2. The variance of the resulting estimate of the excitation intensity has a known closed form, 3. as does the variance when one instead estimates a related function of that intensity. 4. To estimate the accuracy of ISI encoding of bombardment intensity, one must take into account at least the following three sources of imprecision: i) imprecision in the instant of generation of an AP; ii) imprecision in the axonal propagation rates of two successive action potentials;

  33. iii) imprecision in the estimate of the AP’s time of arrival at the synapse. (See Berger and Suksompong, IEEE ISIT, Seattle, July 9-15, 2006.) Doing so shows that neural encoding bit rates can be meaningfully higher than had previously been thought! 5. If the excitation is a time-varying Poisson process, then its intensity is a sufficient statistic for stochastically describing it, so it is the only thing that needs to be communicated. 6. The excitation of a (cortical) neuron is indeed robustly a time-varying Poisson process, despite the individual spike trains of which it is composed not being Poisson and possibly being highly correlated. (This is a consequence of Stein-Chen Poisson approximation theory; cf. C. Stein, IMS Lecture Notes, vol. 78, Lecture VIII, IMS, Hayward, CA, 1986, and subsequent work of Barbour et al., among others.)
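Point 6 lends itself to a quick numerical illustration: pool many sparse, decidedly non-Poisson afferents and check that the superposition's counting statistics are nearly Poisson (Fano factor ≈ 1). The gamma-renewal afferents below are my assumption, chosen only because their ISIs are far from exponential:

```python
import numpy as np

rng = np.random.default_rng(2)
n_trains, T, rate, shape = 1000, 200.0, 0.5, 10.0   # gamma shape 10 -> ISI CV ~ 0.32

def gamma_train(rate, shape, T):
    """One regular (non-Poisson) renewal train: gamma-distributed ISIs."""
    isis = rng.gamma(shape, 1.0 / (rate * shape), size=int(rate * T * 2) + 50)
    t = np.cumsum(isis)
    return t[t < T]

pooled = np.concatenate([gamma_train(rate, shape, T) for _ in range(n_trains)])
counts, _ = np.histogram(pooled, np.arange(0.0, T, 0.01))   # 10 ms counting windows
print(f"pooled rate {pooled.size / T:.0f} Hz,  "
      f"Fano factor {counts.var() / counts.mean():.3f}")    # ~1: Poisson-like
```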

  34. A CHALLENGING, IMPORTANT QUESTION ABOUT RNN’s Consider a sparsely connected, feedback-heavy network of hundreds of millions of neurons, most of which have an in-degree and out-degree of circa 10,000. When galvanized by sensory inputs and exchanging their excitation histories in the manner described above, what kinds of decisions, computations, and responses can such a network generate? (N.B. The excitation history that a neuron communicates does not directly propagate beyond its first-tier neighbors.)

  35. FIG. 1 BLOCK DIAGRAM OF MARKOV-MARKO BRAIN MODEL, ANNOTATED WITH THE FOUR C's: Sensory System (Coding, Computation), Selector (Cognition), Motor System (Control), Environment

  36. ACTIVITY DURING TIME SLOT k AT THE START OF TIME SLOT k, e(k-1), s(k-1), v(k-1) AND m(k-1) ALL EXIST ALREADY. AS SLOT k PROGRESSES, FIRST v(k), NEXT m(k), NEXT e(k), AND FINALLY s(k) GET PRODUCED, IN THAT ORDER.

  37. MARKOV REVISITED THE NOTATION USED IN THE BOXES IN FIG. 1 IMPLIES THAT THE CONDITIONAL PROBABILITY OF THE RANDOM VECTOR APPEARING BEFORE THE CONDITIONING BAR WOULD NOT CHANGE IF ONE WERE TO INCLUDE AFTER THE CONDITIONING BAR TIME-PREDECESSORS OF ONE OR MORE OF THE VECTORS THAT CURRENTLY APPEAR THERE. THAT IS, THE MODEL TREATS THE (SENSORY, MOTOR, ENVIRONMENT)-DYNAMIC SYSTEM AS JOINTLY FIRST-ORDER MARKOV. SURELY, THIS IS ONLY AN APPROXIMATION TO REALITY. HOWEVER, THE NEXT TWO SLIDES DISCUSS HOW TO BUILD THE SENSORY PORTION OF THE MODEL SO THAT IT ACCURATELY RESPECTS THE NEUROBIOLOGY WHILE AT THE SAME TIME BEING FIRST-ORDER MARKOV.

  38. BRAIN STATE AS A FIRST-ORDER MARKOV PROCESS IT DOES NOT SUFFICE TO USE AS THE STATE OF THE BRAIN AT TIME k A BINARY VECTOR WHOSE jth COMPONENT EQUALS 1 IF NEURON j HAS FIRED DURING SLOT k-1 AND 0 IF IT HAS NOT. THAT’S BECAUSE THE NEURONS THAT HAVE NOT FIRED DURING THE LAST SLOT CARRY OVER INTO THE NEXT SLOT INFORMATION ABOUT THE SIZE OF THEIR SUB-THRESHOLD PSP’s AND THE STATUS OF CERTAIN OF THEIR SYNAPTIC CLOCKS. INSTEAD, WE INTRODUCE A STATE VECTOR L(k) WHOSE jth COMPONENT IS THE NUMBER OF TIME SLOTS THAT HAVE TRANSPIRED SINCE THE LAST SLOT IN WHICH NEURON j GENERATED A SPIKE. THE COMPONENTS OF L(k) THAT ARE ZERO INDEX THE SET OF NEURONS THAT HAVE JUST FIRED IN THE PREVIOUS SLOT, SO THIS SUBSUMES THE USUAL STATE VECTOR. MOREOVER, IT ALLOWS US TO TAKE DYNAMIC THRESHOLDS INTO ACCOUNT, WITH ABSOLUTE REFRACTORINESS CORRESPONDING TO A THRESHOLD THAT IS INFINITELY HIGH DURING THE SLOT IMMEDIATELY FOLLOWING ONE IN WHICH A NEURON HAS FIRED. L(k) CAPTURES EVERYTHING THAT MATTERS EXCEPT QUANTAL SYNAPTIC FAILURE (QSF), WHICH WE ADDRESS ON THE NEXT SLIDE.
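A minimal sketch of the state update this slide describes (the function name and array encoding are mine):

```python
import numpy as np

def update_state(L, fired):
    """Advance the brain-state vector L by one time slot.

    L[j]     : time slots since neuron j last generated a spike.
    fired[j] : True if neuron j spiked in the slot that just ended.
    Afterwards L[j] == 0 exactly for the neurons that just fired, so L
    subsumes the usual binary fired/not-fired state vector."""
    return np.where(fired, 0, L + 1)

L = np.array([0, 3, 1, 7])
fired = np.array([False, True, False, False])
print(update_state(L, fired))   # -> [1 0 2 8]
```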

  39. BRAIN STATE AUGMENTED BY QSF DATA QSF’s PROVIDE A POTENT MECHANISM FOR MAKING THE CONDITIONAL DISTRIBUTIONS IN THE THREE MAIN BOXES OF OUR MODEL GENUINELY PROBABILISTIC. THIS IS CRUCIAL TO MANY PHENOMENA OF NEUROSCIENTIFIC INTEREST, INCLUDING THE BUILDING OF AN INTERNAL STOCHASTIC MODEL OF THE ENVIRONMENT, THE RANDOM NATURE OF WHICH CAN BE VARIED RAPIDLY OVER A LARGE DYNAMIC RANGE. INCORPORATING QSF’s NECESSITATES INCREASING THE SIZE OF THE STATE VECTOR FROM THE NUMBER OF NEURONS TO THE NUMBER OF SYNAPSES, A FACTOR OF ABOUT 10^4 IN THE CASE OF THE HUMAN BRAIN. THE COMPONENTS THUS ADDED ARE BINARY, EQUALING 1 IF THE LAST SPIKE AFFERENT TO NEURON j FROM NEURON i WAS FAILED AND 0 IF IT WASN’T. THIS IS BECAUSE THE CONDITIONAL PROBABILITY THAT THE NEXT SPIKE TO ARRIVE AT SYNAPSE (i,j) WILL BE FAILED DEPENDS BOTH ON HOW LONG IT HAS BEEN SINCE A SPIKE LAST ARRIVED THERE AND ON WHETHER OR NOT THAT SPIKE WAS FAILED. WITH THIS AUGMENTATION WE GET A HIGHLY ACCURATE FIRST-ORDER MARKOV MODEL OF THE BRAIN.
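A companion sketch of the QSF augmentation: one binary component per synapse, with a failure probability that depends both on the time since a spike last arrived there and on whether that spike failed, as the slide stipulates. The functional form and all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def qsf_step(F, since, arriving, p0=0.3, decay=0.1, bump=0.2):
    """One slot of quantal-synaptic-failure bookkeeping (illustrative model).

    F[i, j]        : 1 if the last spike afferent to j from i was failed.
    since[i, j]    : slots since a spike last arrived at synapse (i, j).
    arriving[i, j] : True where a spike arrives during this slot."""
    p_fail = p0 * np.exp(-decay * since) + bump * F   # depends on both state bits
    failed = arriving & (rng.random(F.shape) < p_fail)
    F = np.where(arriving, failed.astype(int), F)     # update only where spikes arrive
    since = np.where(arriving, 0, since + 1)
    return F, since

F, since = np.zeros((3, 3), dtype=int), np.full((3, 3), 5)
F, since = qsf_step(F, since, rng.random((3, 3)) < 0.5)
print(F, since, sep="\n")
```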

  40. MARKO REVISITED NEURONS IN A CORTICAL REGION, SAY V2, RECEIVE SOME OF THEIR INPUTS DIRECTLY FROM OTHERS IN V2 (HORIZONTAL), SOME FROM OTHERS IN V3 AND ABOVE (TOP DOWN), AND SOME FROM OTHERS IN V1 AND BELOW (BOTTOM UP). AS A CONSEQUENCE, THE INFORMATION THESE NEURONS TRANSMIT TO OTHERS VIA THEIR AXONAL SPIKES IS DYNAMICALLY DETERMINED IN REAL TIME BY THE INPUTS THEY ARE STEADILY RECEIVING. THE NEURONS THAT CONSTITUTE V2 THEREFORE ARE NOT INFORMATION SOURCES IN THE SHANNON SENSE. THAT IS, THEY DO NOT GENERATE DATA A PRIORI AND INDEPENDENTLY OF WHAT THEY HEAR FROM THOSE WITH WHOM THEY ARE CONVERSING. THEIR OUTPUTS ARE INSTEAD HEAVILY INFLUENCED BY INPUTS THEY HAVE RECEIVED FROM OTHERS IN BOTH THE RECENT AND THE DISTANT PAST. SUCH SOURCES THUS SUBSCRIBE TO THE COMMUNICATION MODEL INTRODUCED BY MARKO. (H. Marko, The bidirectional communication theory: A generalization of information theory, IEEE Trans. Comm., vol. COM-21, pp. 1345-1351, December 1973.)

  41. CANONICAL REMOTE CONTROL PROBLEM (block diagram: NASA Houston connected to a remote system through a control link and several comm links)

  42. REPRESENTATION OF THE ENVIRONMENT We subscribe to the view that, within its brain, a healthy organism steadily builds, refines, extends and modifies a model of its environment. We view this model not as some mystical or metaphysical construct but rather as being instantiated as a collection of interacting neurons. The model may be located in a particular region or regions of the brain, but its crucial importance militates in favor of its being widely distributed over much if not all of the brain. Most of the basic infrastructure of the model is forged during gestation according to genetic prescriptions, including the design of the fundamental mechanisms by means of which the model subsequently will be extended and modified based on acquired experience. The posited model constitutes an internal representation of the external environment. As such, it is the mechanism by which the organism persistently seeks to solve the “representation problem” of neuropsychology with ever-increasing sophistication.

  43. THE REASON FOR MODEL BUILDING An organism’s principal reason for constructing and continually updating its internal model of the environment is to learn how to better control that environment. If no physical actions are taken, the organism effectively defaults on any attempt at environmental control. The sine qua non, then, is to learn to generate the most effective motor responses possible based on the environmental stimuli acquired by the sensory organs.

  44. ESTIMATING ENVIRONMENTAL RESPONSE An organism can use its internal model of the environment to generate estimates of how the environment will react to prospective motor controls. Depending upon the amounts of time, computational ability, and energy consumption that are permissible in a given situation, the organism may be able to input many prospective motor controls to the environmental model. In this connection, since the actual environment contains sources of randomness due both to stochastic natural phenomena and to the usually unpredictable actions of other denizens of the environment, an organism’s model of it should be similarly stochastic. (QSF’s may play a major role in producing this stochasticity.) Therefore, better estimates may result if a given prospective control is put into the model more than once and statistics are gathered about the set of resulting responses of the model.
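A toy rendering of that procedure, with an invented one-dimensional stochastic stand-in for the internal model and a squared-error cost (both pure assumptions of mine):

```python
import numpy as np

rng = np.random.default_rng(4)

def internal_model(control, state):
    """Stochastic stand-in for the brain's model of the environment."""
    return state + control + rng.normal(0.0, 0.5)     # response + internal noise

def expected_cost(control, state, goal, n_rollouts=100):
    """Feed one prospective control into the model repeatedly; average the cost."""
    return np.mean([(internal_model(control, state) - goal) ** 2
                    for _ in range(n_rollouts)])

state, goal = 0.0, 3.0
candidates = np.linspace(-1.0, 5.0, 13)               # prospective motor controls
best = min(candidates, key=lambda c: expected_cost(c, state, goal))
print(f"selected motor control: {best:.2f}")          # ~3.0 hits the goal on average
```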

  45. THE PERFORMANCE CRITERION Adopting the block diagram of Figure 1, and also subscribing to the view that an organism is always engaged in building and exercising a model of its environment in the manner described in the preceding slides, leads to the following conclusion: The purpose of processing sensory stimuli is less to convey to the top brain what stimuli have been sensed in the past than it is to enable the brain to better predict what stimuli will be sensed in the future.

  46. THE PERFORMANCE CRITERION (Cont.) IN SYMBOLS, THE SENTIMENT EXPRESSED IN THE PREVIOUS SLIDE IS THAT THE DISTORTION MEASURE TO BE APPLIED IN TIME SLOT k IS NOT OF THE FORM d(s(k), s^(k)) BUT INSTEAD IS OF THE FORM d(s(k+1), s^(k+1)), WHERE s^(k+1) IS THE BRAIN’S ESTIMATE OF WHAT s(k+1) WILL BE BASED ON ITS INPUTS TO THE ENVIRONMENT, AS CALCULATED DURING SLOT k ON THE BASIS OF THE v(k) DERIVED FROM PROCESSING s(k-1).

  47. MASSEY REVISITED DIRECTED INFORMATION WAS INTRODUCED IN A PAIR OF CHARACTERISTICALLY BEAUTIFUL PAPERS BY JIM MASSEY.* AMONG OTHER THINGS, MASSEY SHOWED THAT THE CAPACITY OF A CHANNEL WITH MEMORY AND FEEDBACK IS GIVEN BY THE SUPREMUM OF THE DIRECTED INFORMATION RATE FROM THE CHANNEL’S INPUT TO ITS OUTPUT THAT HE INTRODUCED THEREIN, AS OPPOSED TO THE SUPREMUM OF SHANNON’S MUTUAL INFORMATION RATE, WHICH HE SHOWED IS IN GENERAL STRICTLY GREATER. (S. TATIKONDA HAS SINCE PROVED THE CORRESPONDING CONVERSE THEOREM.) *1. J. L. Massey, Causality, feedback and directed information, Proceedings of the International Symposium on Information Theory and its Applications, Honolulu, HI, Nov. 27-30, 1990. 2. J. L. Massey, Network information theory – some tentative definitions, DIMACS Workshop on Network Information Theory, March 17, 2003.
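For reference, Massey's definition (standard, and my insertion here rather than the slide's): directed information replaces the full input block X^n in the chain-rule expansion of mutual information by the causally available prefix X^k:

```latex
I(X^n \to Y^n) = \sum_{k=1}^{n} I\left(X^k ; Y_k \mid Y^{k-1}\right),
\qquad
I(X^n ; Y^n) = \sum_{k=1}^{n} I\left(X^n ; Y_k \mid Y^{k-1}\right),
% so that I(X^n -> Y^n) <= I(X^n ; Y^n), with equality when there is no feedback.
```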

  48. MASSEY REVISITED (Cont.) BUT THE MASSEY-TATIKONDA THEOREM ASSUMES A SHANNON-STYLE SOURCE – ONE OF 2^TR PRE-GENERATED MESSAGES TO BE SENT DURING AN INTERVAL OF DURATION T. SINCE OUR (S,M,E)-MODEL USES MARKO-STYLE SOURCES, THE M-T THEOREM IS NOT APPLICABLE TO IT. PERHAPS IT WILL TURN OUT THAT DIRECTED INFORMATION IS RELEVANT TO THE PROBLEM OF NEURAL CODING AND LEARNING, BUT AT PRESENT I SEE NO COMPELLING REASON TO BELIEVE THAT IS THE CASE.

  49. BERGER-YING LEMMAS REGARDLESS OF WHETHER MUTUAL INFORMATION OR DIRECTED INFORMATION IS USED, THE BERGER-YING LEMMAS WILL APPLY. THE B-Y LEMMAS SAY THAT, IF IT IS DESIRED TO MAXIMIZE THE RATE AT WHICH EITHER INFORMATION OR DIRECTED INFORMATION IS SENT PART WAY OR ALL THE WAY AROUND THE LOOP FROM {s(k)} TO {v(k)} TO {m(k)} TO {e(k)}, THEN THE PROCESSES INVOLVED IN THAT PORTION OF THE LOOP WILL BE JOINTLY FIRST-ORDER MARKOV. MOREOVER, EACH OF THEM, EXCEPT PERHAPS {s(k)}, WILL BE INDIVIDUALLY FIRST-ORDER MARKOV. THESE FACTS REMAIN TRUE EVEN IF CONSTRAINTS ARE IMPOSED ON THE EXPECTED VALUES OF ONE OR MORE FUNCTIONS OF {s(k-1), v(k), m(k), e(k), v(k-1), m(k-1), e(k-1)}; THIS INCLUDES CONSTRAINTS ON ENERGY USAGE.

  50. “We have knowledge of the past, but we can’t control it. We can control the future, but we have no knowledge of it.” CLAUDE E. SHANNON, 1960
