The Kelly criterion and its variants: theory and practice in sports, lottery, futures & options trading

Professor William T Ziemba
Alumni Professor of Financial Modeling and Stochastic Optimization (Emeritus), UBC
ICMA Financial Markets Centre, University of Reading
Visiting Professor, Mathematical Institute, Oxford University, and President, William T Ziemba Investment Management Inc
KAIST Lecture Program, August 2011

Who I Am: RAB/Canyon 2005-07 PIWM 2007 to present WTZIMI

The two trading routes available for cash, equity and equity futures hedge funds
• The search for positive α and the generation of returns from β risk in smart index funds and active equity portfolio management (α_i = excess mean return, β_i = leveraging factor for long market exposure)
• The search for absolute returns using research on market imperfections, security market biases, mispriced derivative securities, arbitrage, risk arbitrage, and superior investment criteria
• The strategy is to win in all markets - up, down and even - and to achieve a smooth wealth path with few drawdowns.

To win consistently, you must:
• Get the “mean right”, that is, the direction of the market, adjusted for various types of hedging and positioning. This must consider your risk tolerance.
• Diversify so that regardless of the market path (scenario) the positions are not overbet.
• Bet (portfolio allocate) well. I argue that getting the mean right is the most important ingredient in winning strategies.
• This was especially crucial in the equity markets the past 10+ years: the mean was frequently negative.

The importance of getting the mean right: the mean dominates if the two distributions cross only once.
• Thm: Hanoch and Levy (1969)
• If X~F( ) and Y~G( ) have CDFs that cross only once, but are otherwise arbitrary, then F dominates G for all concave u.
• The mean of F must be at least as large as the mean of G to have dominance.
• Variance and other moments are unimportant. Only the means count.
• With normal distributions, the CDFs of X and Y cross only once iff the variance of X does not exceed that of Y.
• That’s the basic equivalence of mean-variance analysis and expected utility analysis via second order (concave, non-decreasing) stochastic dominance.

Errors in Means, Variances and Covariances

Mean Percentage Cash Equivalent Loss Due to Errors in Inputs
Risk tolerance is the reciprocal of risk aversion. When risk aversion (RA) is very low, as with log u, errors in means become 100 times as important.
Conclusion: spend your money getting good mean estimates and use historical variances and covariances.

Average turnover: percentage of portfolio sold (or bought) relative to preceding allocation
• Moving to (or staying at) a near-optimal portfolio may be preferable to incurring the transaction costs of moving to the optimal portfolio
• High-turnover strategies are justified only by dramatically different forecasts
• There are a large number of near-optimal portfolios
• Portfolios with similar risk and return characteristics can be very different in composition
• In practice (Frank Russell, for example), portfolio weights are changed only when they move considerably: by 10, 20 or 30%.
• Tests show that this leads to superior performance; see the Turner-Hensel paper in ZM (1998).

Log Utility: Bernoulli (1732)
• In the theory of optimal investment over time, it is not quadratic utility (the utility function behind the Sharpe ratio) but log utility that yields the most long-term growth.
• But the elegant results on the Kelly (1956) criterion, as it is known in the gambling literature, or capital growth theory, as it is known in the investments literature (see the surveys by Hakansson and Ziemba (1995) and MacLean and Ziemba (2006)), which were proved rigorously by Breiman (1961) and generalized by Algoet and Cover (1988), are long-run asymptotic results.
• However, the Arrow-Pratt absolute risk aversion index of log utility, R_A(w) = -u''(w)/u'(w) = 1/w, is essentially zero, where u is the utility function of wealth w and primes denote differentiation.
• Hence, in the short run, log can be an exceedingly risky utility function with wide swings in wealth values.

Long run exponential growth is equivalent to maximizing the expected log of one period’s returns

Thus the criterion of maximizing the long run exponential rate of asset growth is equivalent to maximizing the one period expected logarithm of wealth. So an optimal policy is myopic.
• Max G(f) = p log(1+f) + q log(1-f) ⇒ f* = p - q
• The optimal fraction to bet is the edge p - q

Slew O’ Gold, 1984 Breeders Cup Classic f*=64% for place/show; suggests fractional Kelly.

Maximizing long run growth
• Thus the criterion of maximizing the long run exponential rate of asset growth is equivalent to maximizing the one period expected logarithm of wealth.
• So an optimal policy is myopic - future and past do not affect current optimal decisions
• Max G(f) = p log(1+f) + q log(1-f) ⇒ f* = p - q
• The optimal fraction to bet is the edge p - q (the mean)
• So if the edge is large, the bet is larger
• p = .99, q = .01 ⇒ f* = 98% of wealth
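The step from Max G(f) to f* = p - q is a one-line first-order condition. A short derivation, assuming an even-money bet with p + q = 1 and 0 ≤ f < 1 (the slide states only the result):

```latex
\begin{align*}
G(f)   &= p\log(1+f) + q\log(1-f), \qquad p + q = 1,\\
G'(f)  &= \frac{p}{1+f} - \frac{q}{1-f} = 0
          \;\Longrightarrow\; p(1-f) = q(1+f)
          \;\Longrightarrow\; f^{*} = \frac{p-q}{p+q} = p - q,\\
G''(f) &= -\frac{p}{(1+f)^{2}} - \frac{q}{(1-f)^{2}} < 0 .
\end{align*}
```

Since G is strictly concave, the stationary point is the unique maximum. With p = .99 and q = .01 this gives f* = .98, the figure quoted on the slide.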

What does the theory tell us about long term hedge fund trading and overbetting? Kelly and fractional Kelly - explaining the overbetting that leads to hedge fund disasters: you cannot ever bet more than full Kelly and usually you should bet less

Mohnish Pabrai, investing in Stewart Enterprises - Thorp (2010) in our World Scientific book
Hedge fund manager; won the bidding for the 2008 lunch with Warren Buffett for $600K+
Stewart Enterprises, payoff ≤ 24 months:
Prob    Net Return
0.80    >100%
0.19    zero
0.01    lose all investment
Pabrai bet 10% of his fund. What’s the full Kelly bet?
f* = 0.975; half Kelly = 0.4875; quarter Kelly = 0.24375
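The full Kelly figure can be checked numerically. The sketch below is only illustrative: it assumes the ">100%" outcome is exactly +100% (the slide gives only a lower bound), and the function names are hypothetical, not from Thorp's chapter. It maximizes the expected log of wealth by bisection on the derivative.

```python
import math

def expected_log_growth(f, outcomes):
    """Expected log growth when a fraction f of wealth is bet.

    outcomes: list of (probability, net return per unit staked).
    """
    return sum(p * math.log(1.0 + f * r) for p, r in outcomes)

def kelly_fraction(outcomes, lo=0.0, hi=0.999999, tol=1e-10):
    """Maximize expected log growth on [lo, hi] by bisection on dG/df."""
    def slope(f):
        return sum(p * r / (1.0 + f * r) for p, r in outcomes)

    if slope(lo) <= 0.0:   # no positive edge: bet nothing
        return lo
    if slope(hi) >= 0.0:   # growth still rising at the cap
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumption: ">100%" is treated as exactly +100% (net return 1.0).
pabrai = [(0.80, 1.0), (0.19, 0.0), (0.01, -1.0)]

f_star = kelly_fraction(pabrai)
print("full Kelly    :", round(f_star, 4))
print("half Kelly    :", round(f_star / 2, 4))
print("quarter Kelly :", round(f_star / 4, 4))
print("growth at 10% :", round(expected_log_growth(0.10, pabrai), 4))
print("growth at f*  :", round(expected_log_growth(f_star, pabrai), 4))
```

Under this assumption the first-order condition reduces to 0.8(1 - f) = 0.01(1 + f), so f* = 0.79/0.81, which rounds to the 0.975 on the slide.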

Pabrai bet issues: should he have bet more?
• Other opportunities: must compute against all options (nonlinear or stochastic optimization) for the available wealth
• Risk tolerance: what fractional Kelly to use?
• Black Swans: we call them bad scenarios
• Long vs short run planning

Kelly betting at PIMCO
• During an interview in the Wall Street Journal (March 22-23, 2008) Bill Gross and Ed Thorp discussed turbulence in the markets, hedge funds and risk management.
• Bill considered the question of risk management after he read Beat the Dealer in 1966.
• That summer he was off to Las Vegas to beat blackjack.
• Just as Ed did some years earlier, he sized his bets in proportion to his advantage, following the Kelly Criterion as described in Beat the Dealer, and ran his $200 bankroll up to $10,000 over the summer.
• Bill has gone from managing risk for his tiny bankroll to managing risk for Pacific Investment Management Company’s (PIMCO) investment pool of almost $1 trillion.
• He still applies lessons he learned from the Kelly Criterion.
• As Bill said, “Here at PIMCO it doesn’t matter how much you have, whether it’s $200 or $1 trillion … Professional blackjack is being played in this trading room from the standpoint of risk management and that’s a big part of our success.”

Various trading records for bettors who behave like Kelly investors

Top 10 equity holdings of Soros Fund Management and Berkshire Hathaway, Sept 30, 2008 (SEC filings)

The wealth levels from December 1985 to April 2000 for the Windsor Fund of John Neff, the Ford Foundation, the Tiger Fund of Julian Robertson, the Quantum Fund of George Soros and Berkshire Hathaway, the fund run by Warren Buffett, as well as the S&P500 total return index.

Berkshire Hathaway versus Ford Foundation, monthly returns distribution, January 1977 to April 2000

Return distributions of all the funds, quarterly returns distribution, December 1985 to March 2000

Classic Breiman Results

Kelly and half Kelly medium time simulations: Ziemba-Hausch (1986). These were independent.

The good, the bad and the ugly
• 166 times out of 1000 the wealth is more than 100 times the initial wealth with full Kelly, but only once with half Kelly does the investor gain this much
• But the probability of being ahead is higher with half Kelly: 95.4% vs 87% with full Kelly
• Min wealth is 18 with full Kelly and 145 with half Kelly
• 700 bets, all independent, with a 14% edge: the result is that you can still lose over 98% of your fortune with bad scenarios
• With half Kelly, you lose half of your wealth only 1% of the time, but it is 8.40% with full Kelly
• So even after 700 plays, the strategy is still risky
• See Simulation paper for more on this and two more examples
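The qualitative pattern (a fatter upper tail but a worse lower tail for full Kelly) can be reproduced with a toy simulation. This is a sketch under stated assumptions, not the Ziemba-Hausch experiment: it assumes a single even-money wager with win probability 0.57 (a 14% edge), 200 paths instead of 1000, and initial wealth normalized to 1. All names and parameters are illustrative.

```python
import random

def simulate_final_wealth(fraction, n_bets=700, p_win=0.57, n_paths=200, seed=1):
    """Final wealth (initial wealth = 1) on each path when a fixed
    fraction of current wealth is staked on every even-money bet."""
    rng = random.Random(seed)   # same seed => same win/loss sequences
    finals = []
    for _ in range(n_paths):
        wealth = 1.0
        for _ in range(n_bets):
            stake = fraction * wealth
            if rng.random() < p_win:
                wealth += stake
            else:
                wealth -= stake
        finals.append(wealth)
    return finals

def share_ahead(finals):
    """Fraction of paths that end above the initial wealth of 1."""
    return sum(w > 1.0 for w in finals) / len(finals)

edge = 2 * 0.57 - 1                       # Kelly fraction p - q = 0.14
full = simulate_final_wealth(edge)        # full Kelly
half = simulate_final_wealth(edge / 2)    # half Kelly, same random draws

for name, finals in (("full Kelly", full), ("half Kelly", half)):
    print(name,
          "min=%.4g" % min(finals),
          "max=%.4g" % max(finals),
          "share ahead=%.3f" % share_ahead(finals))
```

Both strategies see identical win/loss sequences because the seed is shared, so differences in the output come from bet sizing alone.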

Kentucky Derby 1934-1998
• Use inefficient market system in Hausch, Ziemba, Rubinstein (1981) and Ziemba-Hausch books
• Place/show wagers made when prices are off sufficiently and EX ≥ 1.10
• w0 = $2500; 63 years; 72 wagers with 45 (62.5%) successful

Typical wealth level histories with one scenario (the actual results) from place and show betting (Dr Z system) on the Kentucky Derby, 1934-1994 with Kelly, half Kelly and betting on the favorite strategies

Overbetting
Probability of doubling and quadrupling before halving and relative growth rates versus fraction of wealth wagered for Blackjack (2% advantage, p=0.51 and q=0.49)
• Should you ever bet above 0.02? That corresponds to positive power utility. It is growth-security dominated.
• Betting more than the Kelly bet is non-optimal as risk increases and growth decreases; betting double the Kelly leads to a growth rate of zero plus the risk-free rate.
• LTCM was at this level or more, see AIMR, 2003.
• Several similar blowouts are discussed in Ziemba and Ziemba (2007) including Amaranth and Niederhoffer.
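The double-Kelly claim is easy to verify for the blackjack numbers on the slide. The sketch below evaluates G(f) = p log(1+f) + q log(1-f) at multiples of the Kelly fraction; it assumes an even-money bet and a zero risk-free rate, and the function name is illustrative.

```python
import math

def growth_rate(f, p=0.51):
    """Expected log growth per even-money bet when betting fraction f."""
    q = 1.0 - p
    return p * math.log(1.0 + f) + q * math.log(1.0 - f)

kelly = 0.51 - 0.49   # f* = p - q = 0.02

for multiple in (0.5, 1.0, 1.5, 2.0, 2.5):
    f = multiple * kelly
    print("%.1f x Kelly  f=%.3f  G(f)=%+.7f" % (multiple, f, growth_rate(f)))
```

Growth peaks at the Kelly fraction, is roughly zero at twice Kelly, and turns negative beyond that, while the variability of wealth keeps rising with f.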

Growth Rates Versus Probability of Doubling Before Halving for Blackjack

Fractional Kelly and negative power utility
• u(w) = -w^α, α < 0
• α → 0: u → log
• f = 1/(1-α) = fraction (Kelly) in log optimal portfolio, rest in cash
α = 0     f = 1      full Kelly
α = -1    f = 1/2    1/2 Kelly
α = -3    f = 1/4    1/4 Kelly (futures trading down here)
This is exact with log normality and approximate otherwise, but it can be way off.
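The "approximate otherwise" caveat can be seen on a simple coin-toss bet, where returns are not lognormal. The sketch below assumes an even-money bet with win probability p and power utility with exponent α < 1 (α = 0 read as log); the closed form comes from the first-order condition p(1+f)^(α-1) = q(1-f)^(α-1). Function names are illustrative.

```python
def fractional_kelly_weight(alpha):
    """Rule-of-thumb Kelly fraction f = 1/(1 - alpha) for alpha <= 0."""
    return 1.0 / (1.0 - alpha)

def exact_power_utility_fraction(p, alpha):
    """Exact optimal bet fraction for an even-money bet under power
    utility with exponent alpha < 1 (alpha = 0 gives the log/Kelly bet)."""
    q = 1.0 - p
    ratio = (p / q) ** (1.0 / (1.0 - alpha))   # equals (1+f)/(1-f) at the optimum
    return (ratio - 1.0) / (ratio + 1.0)

for p in (0.51, 0.80):
    kelly = exact_power_utility_fraction(p, 0.0)   # p - q
    for alpha in (-1.0, -3.0):
        approx = fractional_kelly_weight(alpha) * kelly
        exact = exact_power_utility_fraction(p, alpha)
        print("p=%.2f alpha=%+.0f  approx=%.4f  exact=%.4f" % (p, alpha, approx, exact))
```

For the small blackjack edge the two agree closely. For a large edge such as p = 0.8 with α = -1, half Kelly gives 0.30 while the exact power-utility optimum is 1/3.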

Samuelson’s critique of Kelly betting
• Correspondence: Nov 16, 2005 to Elwyn Berlekamp; Dec 13, 2006 to WTZ
• Samuelson postulated three investors, all risk averse with concave utility functions
• WTZ adds two more: Ida (the most risk averse) and Victor (the most risk accepting)
• Cash return zero; stock 50-50: $4 or $0.25 for each $1 bet (every period)
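For the stock described here, the log-optimal (Kelly) allocation has a closed form. The sketch below assumes the standard two-outcome result f* = p/a - q/b, where b is the net gain and a the net loss per $1 staked; it follows from setting the derivative of p log(1+bf) + q log(1-af) to zero. The slide does not state this calculation, and the names are illustrative.

```python
import math

def two_outcome_kelly(p_up, gain, loss):
    """Kelly fraction for a bet that returns +gain with probability p_up
    and -loss otherwise (both per unit staked)."""
    q_down = 1.0 - p_up
    return p_up / loss - q_down / gain

def growth_rate(f, p_up, gain, loss):
    """Expected log growth per period with fraction f in the stock, rest in cash."""
    return p_up * math.log(1.0 + gain * f) + (1.0 - p_up) * math.log(1.0 - loss * f)

GAIN = 4.0 - 1.0    # $1 becomes $4
LOSS = 1.0 - 0.25   # $1 becomes $0.25

f_star = two_outcome_kelly(0.5, GAIN, LOSS)
print("Kelly fraction in stock:", f_star)
print("growth at f*           :", growth_rate(f_star, 0.5, GAIN, LOSS))
print("growth all in stock    :", growth_rate(1.0, 0.5, GAIN, LOSS))
```

All cash and all stock both have zero expected log growth, since 4 × 0.25 = 1, while the even split between the two grows at 0.5 log(1.5625) per period.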

The Investors
• Tom, I believe, is overbetting and dominated and will go bankrupt
• Harriet has a limited degree of risk tolerance, which fits well with lots of empirical Wall St equity premium data

Some tests

and in Thorp (2006)

Aside
• Victor Niederhoffer: always had high returns or blowouts leveraging the S&P500
• Too close to the money for proper risk control
• See Chapter 12 of Ziemba and Ziemba (2007)
• There are more blowouts since then!

Response
• This is correct. I have no disagreement.
• The full Kelly strategy gives very high final wealth most of the time
• But it is possible to have low final wealth with no leveraging (-98%) and many times W0 lost with leveraging
• See examples in Simulation Section

Horseracing
• market in miniature
• fundamental and technical systems
• returns and odds are determined by
  1) participants -- like stock market, unlike roulette
  2) transaction costs -- track take (17%), breakage; rebates now plus Betfair (long short)
• bet to
  1) win -- must be 1st
  2) place -- must be 1st or 2nd
  3) show -- must be 1st, 2nd or 3rd

Place market in horseracing
Inefficiencies are possible since:
  1) more complex wager
  2) prob(horse places) > prob(horse wins) ==> favorites may be good bets
To investigate place bets we need:
  1) determine place payoffs
  2) their likelihood
  3) expected place payoffs
  4) betting strategy, if expected payoffs are positive
Bettors do not like place and show bets.

The Idea
• Use data in a simple market (win) to generate probabilities of outcomes
• Then use those in a complex market (place and show) to find positive expectation bets
• Then bet on them following the capital growth theory to maximize long run wealth
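The first step, turning win probabilities into place and show probabilities, can be sketched with the Harville formula. This is an assumption: the slides do not say which formula is used, and Harville is simply a common choice in this literature. It sets P(i first, j second, k third) = q_i · q_j/(1 - q_i) · q_k/(1 - q_i - q_j). The win probabilities below are made up for illustration.

```python
from itertools import permutations

def harville_place_show(win_probs):
    """Place and show probabilities from win probabilities (Harville model).

    win_probs: win probabilities summing to 1, at least three horses,
    each strictly less than 1.
    Returns (place, show): probability each horse finishes in the top
    two, and in the top three.
    """
    n = len(win_probs)
    place = [0.0] * n
    show = [0.0] * n
    for i, j, k in permutations(range(n), 3):
        qi, qj, qk = win_probs[i], win_probs[j], win_probs[k]
        prob = qi * qj / (1.0 - qi) * qk / (1.0 - qi - qj)
        # Horses i and j finish in the top two; i, j and k in the top three.
        place[i] += prob
        place[j] += prob
        show[i] += prob
        show[j] += prob
        show[k] += prob
    return place, show

# Hypothetical win probabilities, e.g. from win-pool shares.
win_probs = [0.40, 0.30, 0.20, 0.10]
place, show = harville_place_show(win_probs)
for q, pl, sh in zip(win_probs, place, show):
    print("win=%.2f  place=%.4f  show=%.4f" % (q, pl, sh))
```

These probabilities would then be combined with the place and show pool payoffs to find positive-expectation bets. That step, and the bias adjustments to the q's, are not shown here.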

Effect of transactions costs, calculation of optimal place and show Kelly bets
• Non concave program but it seems to converge.
• In practice, adjust q’s to replicate biases.
• Victor Lo did research on this in his thesis; see also the Hausch, Lo, Ziemba 1994, 2008 books

Use in a calculator
What we do in the system is to reduce the non-convex log optimization problem down to four numbers: W_i, W, and S_i, S or P_i, P. Using thousands of race results, we regress the expected value and the optimal Kelly bet on these four variables. Hence, you just find horses where the relative amount bet to place or show is below the bet in the win pool. The calculator tells you when the expected value is, say, 1.10 or better and calculates the optimal Kelly bet. So this can be done in, say, 15 seconds.
