Risk Analysis & Modelling

Risk Analysis & Modelling Lecture 3: Statistical Distributions & Dynamic Financial Analysis

www.angelfire.com/linux/riskanalysisriskcoursehq@hotmail.com

Statistical Distributions & Risk Modelling • Actuarial Science uses statistical distributions to describe the number of losses and the size of losses experienced by an insurer • Traditionally mathematics was used in conjunction with these distributions to analyse the risks faced by the insurer and how decisions taken by the insurer can mitigate these risks – this is called Risk Theory • As computer power has increased these Actuarial Distributions have been combined with Monte Carlo Simulations to build more complex, realistic models • These are known as Dynamic Financial Analysis (DFA) Models • Commercial software packages such as Remetrica ™ and Igloo Professional ™ allow practitioners to build DFA Models without the need for programming • Excel combined with VBA is perhaps the most widely used and most flexible tool to build DFA models in the Insurance Industry and is surprising simple once you learn a few basic techniques….

Risk Theory Vs DFA Risk Theory (Analysis of Risk with Calculus) Statistical Distributions Distribution of Number of Losses Distribution of Size of Losses Risk Analysis For example, what is the probability of insolvency and how can this be reduced by different reinsurance strategies? Dynamic Financial Analysis Frequency Severity Model For i = 1 to 1000000 sum = sum + RandomClaim() Next i (Analysis of Risk with Computers)

Distributions - Describing the Range of Outcomes • In the last lecture we introduced the idea of describing the possible range of outcomes and the probability of those outcomes as a useful tool in quantifying risk • This essentially gave us a probability histogram showing the relationship between the outcome and the probability of the outcome • We were dealing with discrete random variables which could only assume a finite or countable number of values • When the random variable is discrete this relationship between the probability and the outcome is known as the Probability Mass Function

Probability Histogram • The probability histogram relates the range of outcomes to the probabilities of those outcomes This random variable is discrete because there are a finite number of outcomes (5 in total)

Probability Mass Function • The Probability Mass Function (PMF) (also known as the Probability Function) specifies the relationship between the outcome and the probability of the outcome Probability Mass Function PMF(-10%) = 5% Loss of -10% 5% Probability In the case of our portfolio the PMF relates the size of a loss or gain to the probability of that loss or gain The PMF can be a mathematical function or a simple table of probabilities The values that the random variable described by the Probability Mass Function is called the Support, in this case the support is the set values (-10%,-5%,0%,5%,10%)

Properties of the Probability Mass Function • The first property of the Probability Mass Function (PMF) is that the probabilities relating to the various outcomes cannot be negative: The second property of the PMF is if we sum the probabilities for all the possible outcomes (N in total) we get a value of 1

Probability of Poor Performance • One important statistic frequently used in the measurement of risk is the probability of the outcome being worse than some specified level • For example, we might want to know the probability of the return on our portfolio being less than or equal to some level, say -5% • This can obviously be calculated by summing the probabilities of all the outcomes equal to or worse than -5% (20% + 5% = 25%) • We could do this for all the possible portfolio returns (worse than or equal to a return of -10%, -5%, 0%, 5%, 10%), this will give us the cumulative probability for the return on the portfolio

Cumulative Probability Graph

Cumulative Distribution Function • The Cumulative Distribution Function (CDF) is one of the most important statistical concepts on this course • It specifies the probability of observing a random variable less than or equal some value (A): Where CDF(A) gives the probability of observing some random variable less than or equal to a level A, this is equal to the sum of the probabilities of all the possible outcomes (given by the PMF) less than or equal to A Because the PMF is always positive or zero the CDF is always a monotonic (increasing) function

PMF & CDF Example • The following table gives values for the PMF of a random variable, evaluate the CDF at 2 and 4

Generating Discrete Random Variables • In lecture 1 we generated random samples for one of the simplest discrete random variables: the flip of a coin • We did this using intuition, cutting up the uniform random variables such that 50% of the outcomes represented a heads and 50% represented a tails • We could have also done this using the Cumulative Distribution Function for the flip of the coin…

Coin Flip Distributions Probability Mass Function Cumulative Distribution Function Notice that the y-axis of the cumulative probability function has values ranging between 0 and 1 like the uniform distribution

Generating Discrete Random Variables using the Inverse Cumulative Probability Function We read off the value from the y axis and see which outcome this relates to If Rand() is greater than 0.5 and less than or equal to 1.0 then Heads Rand() If Rand() is greater than 0.0 and less than or equal to 0.5 then Tails

Creating Excel Functions in VBA • In lecture 1 we simulated the coin flip using rand combined with the spreadsheet’s IF statement: =IF(rand() > 0.5, 1, 0) • We will now learn how to do this in an alternative way, by creating our own spread sheet functions in Excel VBA • This is easy to do and allows us to customise Excel

Custom Functions in Excel • Excel allows the user to create custom functions using VBA • These functions must be added to a Module – which is just a special page where code for an Excel workbook is written • To add a module to a workbook select the Visual Basic option under the Developer Tab (if you do not have the Developer Tab then check the Excel Options -> Popular -> Show Developer Tab in Ribbon) then select the Insert -> Module menu option in the Visual Basic editor • Lets start with a simple function that adds two numbers together

Add Numbers Function Public Function AddNumbers(NumberA, NumberB) AddNumbers = NumberA + NumberB End Function • Special VBA words are displayed in blue (keywords). • AddNumbers, NumberA and NumberB are just words selected at random. • AddNumbers is the name of the function that is used to call the function on the spreadsheet • NumberA and NumberB are used to reference the two parameters passed into the function • To call this function from the spreadsheet we would type =AddNumbers(4,7) • In this case NumberA would be 4 and NumberB would be 7

CoinFlip Function • We could replicate the IF statement for the coin flip using the following VBA function: Public Function CoinFlip(RN) If RN >= 0.0 And RN < 0.5 Then CoinFlip = 0 ElseIf RN >= 0.5 And RN < 1.0 Then CoinFlip = 1 End If End Function • This function takes a single input parameter RN (short for Random Number) which is a uniform random number generated by Rand

Underwriting Risk on an Insurance Policy Probability, Claim Size 70%, £0 5%, £800 Insurance Policy ………………… ………………… ………………… ………………… ………………… 10%, £1,000 10%, £1,300 5%, £1,500

Cumulative Probability Function for Claim Size Rand()

Random Claim Function • By using the Cumulative Probability Function for claim size we can write the following VBA function to generate random claims from a uniform random variable input (RN): Public Function RandomClaim(RN) If RN >= 0 And RN < 0.7 Then RandomClaim = 0 ElseIf RN >= 0.7 And RN < 0.75 Then RandomClaim = 800 ElseIf RN >= 0.75 And RN < 0.85 Then RandomClaim = 1000 ElseIf RN >= 0.85 And RN < 0.95 Then RandomClaim = 1300 ElseIf RN >= 0.95 And RN < 1 Then RandomClaim = 1500 End If End Function

A Simple DFA Model • Imagine a small insurance company has the opportunity to sell 200 policies annually, each of which has an annual claims distribution of the type just described • The first question the insurance company must ask is how much the Premium charged for each policy should be • The starting point for determining the Premium on any type of Policy is to the Average Claim over the period insured or the Pure Risk Premium • Using our formula from last week we can calculate that the Average Claim per Policy: • The Pure Risk Premium for this policy is £345

The risk faced by the insurer is that the total amount of claims experienced on this portfolio of policies is greater than expected meaning that the Total Income from Premiums is not sufficient to cover the Claims • When this occurs a large Underwriting Loss occurs, this is known as the Underwriting Risk • To compensate it for this Risk the Insurance Company adds a Loading Factor to the Pure Risk Premium when calculating the Premium Charged, we will assume this loading factor is 5%:

Risk of Insolvency • We will assume that the Initial Amount of Capital the Insurance Company has set aside to absorb Underwriting Losses on this portfolio of 200 policies is £24000 • The regulatory environment in which the Insurance Company operates requires the Capital to be 22% of Premium Income (Premium Basis Solvency Requirement): • If the Insurance Company’s Capital goes below this minimum level it will deemed to be Insolvent • The Insurance Company wants to know the probability of Insolvency over the next year, or the probability of their Capital going below this minimum level • In order to estimate this probability we will build a simple Dynamic Financial Analysis (DFA) model

Our DFA model will be based on some simple Financial Identities • The underwriting profit, ignoring costs and reinsurance can be written as: • Where P is the total premium income and C is the total level of claims (which is random and must be simulated) • Using the basic balance sheet identity the relationship between the Capital at the start of the year (Z0) and the end of the year (Z1 , which is also random) is: • Using a Monte Carlo Simulation in Excel it is fairly simple to calculate the probability of Z1 going below the minimum level (M*)

Change in Solvency Capital Over the Year Solvency Capital at Year Start Solvency Capital: £24000 Required Solvency Margin: 200 * 362.25 * 0.2 = £15939 Solvency Capital at Year End Premium Income: 200 * 362.25 = £72450 Claims: ? Required Solvency Margin: £15939 Solvency Capital: ?

Subroutines in VBA • In lecture 1 we constructed a simple subroutine to perform a Monte Carlo simulation • Subroutines perform a task like putting the number 2 in cell A1, or counting the number of 1’s in column B • Functions make a calculation and return a value like the sum of 3 and 5 • The subroutines we will use on this course will be attached to and operate on specific worksheets • We select the worksheet that the subroutine operates on by selecting the appropriate worksheet before typing in the routine

The Cells Keyword • In VBA we denote certain cells on a spreadsheet using the Cells keyword • This keyword takes the form Cells(Row Number, Column Number) • For example, Cells(3,2) would access the cell at B3 (row 3, column 2) • We can use the Cells keyword to either set a value in a cell or check the value in that cell

Two Cells Examples • The following subroutine will place the value 4 in cell C2: Sub SetNumber() Cells(2,3) = 4 End Sub • The following subroutine will check the value in cell A1 and if it is greater than or equal to 5 then it will add one to the value in B1: Sub CheckNumber() IfCells(1,1) > 5 Then Cells(1,2) = Cells(1,2) + 1 End If End Sub

The For loop • One of the most important concepts in computer programming is the loop • It basically tells the computer to do something a number of times, or do something until something happens • The loop is central to the Monte Carlo simulation because it allows us to instruct the computer to run the simulation thousands of times • On this course we will use the For loop which instructs the computer to perform a task for every number (integer) in a range

Loop Examples • The following subroutine will recalculate the spread sheet 100 times Sub RecalculateLoop() For i = 1 to 100 Application.ActiveSheet.Calculate Next i End Sub • The following subroutine will put the numbers from 1 to 1000 in column A, starting at cell A1 Sub PutNumbersLoop() For i = 1 to 1000 Cells(i,1) = i Next i End Sub

Insolvency Risk Subroutine • The following subroutine will recalculate the spread sheet 100 times Sub InsolvencyMonteCarlo() Cells(15, 2) = 0 For i = 1 To 100 Application.ActiveSheet.Calculate Cells(16, 2) = i IfCells(13, 3) = 1 Then Cells(15, 2) = Cells(15, 2) + 1 End If Next i End Sub Set the Insolvency count cell (B9) to zero Recalculate the worksheet Put the current simulation count into cell B10 If the value in cell C7 is 1 then this is an insolvency and add one to the insolvency count cell (B9) Loop 100 times

Discrete Vs Continuous Random Variables • Some random variables are naturally discrete such as the roll of a dice or the number of claims experienced by an insurance company over the period of a month • A larger class of random variables are not naturally discrete and have a limitless or extremely large range of values • The return on the portfolio is a good example, we have artificially made it discrete but in reality it could take on any value such as 1.654% or 5.3421% • The cost of repairing a car after an accident could be $5672.55 • If we make continuous random variables artificially discrete for the purpose applying discrete distributions it will mean we are not modelling them accurately • However, when we attempt to describe or model continuous random variables we encounter an interesting problem….

Zero Probability • The problem with continuous random variables is they can take on a limitless (or very large) range of values • What this means is that the probability of any given outcome is very small and effectively zero! • What is the probability of a person selected at random being 1.6744234 meters? • What is the probability of a random pebble picked up on a beach being 0.241312KG? • What is the probability of your portfolio having a return of 1.23542%? • What is the probability that the uniform random number will be 0.154345435123445? • For continuous random variables we can only talk about the probability of the outcome being within a range of values, such as the probability of a uniform random number being between 0.1 and 0.3

Quantifying Continuous Random Variables using the CDF • The standard way to quantify the behaviour of a continuous random variable is by using the Cumulative Distribution Function (CDF) • The CDF of a continuous random variable gives the probability of it being below a given value • The CDF for a continuous random variable is nearly always a mathematical function which takes as its input a value and from this calculates the probability of observing the random variable below that value

CDF for a Uniform Random Variable • The simplest CDF is that of a Standard Uniform Random Variable (U), and is simply: • The support of this CDF are all values between 0 an 1 (the range of values the uniformly distributed random variable can take) • So for example if we want to know the probability of a uniform random value being less that 0.65 • 0.65 or 65% of the time the uniform random variable will be below the value of 0.65 For 0 <= X <= 1

Cumulative Density for Uniformly Distributed Numbers CDF(X) = X

Probability Density Function (PDF) • Although each possible outcome for a continuous random variable is effectively zero (the probability mass is zero), some of the outcomes are more clustered than others • How clustered or dense outcomes are about a given point is measured by the Probability Density • Probability Density is measured using the Probability Density Function (PDF) and can be thought of as the continuous random variable’s equivalent to the Probability Mass Function (PMF) • The higher the density of outcomes at a point the more likely an outcome is to appear in the region of that point • Mathematically, the PDF measures the slope of the CDF at a value • Even though its meaning is abstract, the PDF is one of the most common tool with which to represent or visualise the nature of a continuous random variable – mainly because it is a bit like Probability Mass.

Probability Density for Uniformly Distributed Numbers There is a zero density outside the range 0 to 1 because there are no values there! Uniformly distributed random numbers have a constant density of 1 over the range 0 to 1, 0 elsewhere Probability Density

Uniform Density Graph Points are spread out uniformly

Uniform Distribution PDF and CDF CDF Area under the PDF equals CDF PDF Slope of the CDF (1) equals the PDF

Mathematical Relationship between PDF and CDF • Mathematically we can express the relationship between the PDF and CDF for continuous random variables using calculus • Using integration we can express the relationship between the area under the pdf and the cdf as follows (where LB is the lower bound on the random variable, 0 in the case of the uniformly distribution) Using differentiation we can express the relationship between the slope of the CDF and the PDF as

Mathematical Uses of the PDF • Traditionally, calculus was central to the manipulation of distributions for continuous random variables • Recall from lecture 2 we stated that the average of a discrete random variable was: • For a continuous random variable this is calculated using the PDF and integration:

In the case of a uniformly distributed random variable this would be: • We stated that the variance for a discrete random variable was: • We can calculate the variance of a uniformly distributed random variable as: • Lets verify these results with a Monte Carlo Simulation in Excel…

Other Types of Continuous Random Variables • Uniformly distributed random variables are just one of a large number of continuous random variables • Other types of continuous random variables that we will be looking at on this course include those described by the exponential, normal and log-normal distributions • We will start by looking at the simplest of these other continuous distributions: the Exponential Distribution

Alternative to the CDF: Cantelli’s Inequality • Cantelli’s Inequality is an important result from statistics that places an upper boundary on the probability of a random variable being less than or equal to some value • The estimate can be calculated just from the mean and the variance of the random variable: • The true probability is likely to be less than this estimate BUT it has the advantage in that we do not have to know the distribution of the random variable just its Mean and Variance!

Cantelli’s Inequality Example • Imagine we have a portfolio whose average return is 8% (0.08) and variance is 0.003 • We can use Cantelli’s inequality to estimate the probability of losing more that 5% of the portfolios value (the random return on the portfolio being less than -5%) • So Cantelli’s inequality tells us that the chance or probability of losing more than 5% of the portfolio is 15.07% (0.1507) • This is a worst cast estimate the true probability will be less than or equal to this

Exponential Distribution Function • Exponentially distributed random variables can take on any value between 0 and positive infinity (the support) • The formula for the CDF (Cumulative Distribution Function) of an Exponentially Distributed random variable (o) is Where e is the natural number (2.71828…) Where m is the average of the random variable The formula for the PDF (Probability Density Function) of the Exponential Distribution is

Exponential CDF where m=3 Probability that this exponentially distributed random variable is less than or equal to 5 is 81.1% 1 – e-(5/3)

Risk Analysis & Modelling