Role of Generalised Linear Model in Non Life Pricing - Phase3

Before reading this article, make sure that you have read Phase1 and Phase2.
Recap of Phase2: we know that the purpose of a GLM is to find the relationship between the mean of the response variable and the covariates.

In this article we are going to talk about linear predictors.
Linear Predictor: Let’s denote it by “η” (eta). The linear predictor is a function of the covariates. For example, in the normal linear model Y = B0 + B1x, the linear predictor is η = B0 + B1x. Always note that the linear predictor has to be linear in its parameters; in this case the parameters are B0 and B1. But the question remains: how did I come up with B0 + B1x as the function? First of all, note that broadly there are two types of covariates. 1. Variables: these take numerical values. For example: age of policyholder, years of ex…

Role of Generalised Linear Model in Non Life Pricing - Phase1

We will cover a series of topics on how Non Life Pricing is done using GLMs.
But first, let's see what a GLM is.

Generalised Linear Model
Before jumping into what a GLM is, let’s see what linear models are.
1. Linear Models:
Let’s take the example of Weight (Y) and Height (X). The aim of a linear model is to find the line of best fit through the data points.

Here the X axis is Height and the Y axis is Weight: Y = B0 + B1x

The line of best fit is Y = B0 + B1x, where B0 is the intercept on the Y axis and B1 is the gradient.
Now the question is: how is that line chosen?
The line is chosen so as to minimise the sum of squared error terms, where the error terms are the vertical distances from the data points to the straight line. The error terms are assumed to be normally distributed with mean 0 and variance σ².
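To make the least-squares idea concrete, here is a minimal Python sketch. It uses the standard closed-form solution for one covariate; the heights and weights are made-up numbers for illustration only, and the function name fit_line is just a label chosen here.

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimising the sum of squared errors for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Gradient = covariance of (x, y) divided by variance of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    # The fitted line always passes through the point (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: heights in cm, weights in kg (not from a real study)
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [52.0, 58.0, 66.0, 75.0, 84.0]

b0, b1 = fit_line(heights, weights)

# Error terms: vertical distances from each data point to the fitted line
residuals = [y - (b0 + b1 * x) for x, y in zip(heights, weights)]
sse = sum(r ** 2 for r in residuals)

print(f"B0 = {b0:.3f}, B1 = {b1:.3f}, sum of squared errors = {sse:.3f}")
```

Any other choice of B0 and B1 would give a larger sum of squared errors on this data, which is exactly what "line of best fit" means here.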

2. Multiple Linear Regression:
We can extend our model to allow for other predictive variables. For example, we can decide that Weight depends on both height and calories consumed per day. Here we cannot find a line of best fit, but we can find the plane of best fit through the plotted data points.

The plane of best fit is: Y = B0 + B1x + B2c, where c is calories consumed per day. Here too, the error terms follow a normal distribution with mean 0 and variance σ².
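The plane can be fitted in the same least-squares way. The sketch below solves the normal equations (X'X)B = X'Y with a small hand-written Gaussian elimination so that it needs no extra libraries. The data are hypothetical: the weights were constructed exactly from the plane Y = -60 + 0.7x + 5c (with c in thousands of kcal) so that the fit can be checked by eye. The helper names solve and fit_plane are labels chosen here.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_plane(x1, x2, y):
    """Return [b0, b1, b2] for the least-squares plane y = b0 + b1*x1 + b2*x2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]  # design matrix with intercept column
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical data: height (cm), calories per day (thousands of kcal), weight (kg)
heights = [150.0, 160.0, 170.0, 180.0, 190.0, 165.0]
calories = [1.8, 2.5, 2.0, 2.7, 2.2, 3.0]
weights = [54.0, 64.5, 69.0, 79.5, 84.0, 70.5]

b0, b1, b2 = fit_plane(heights, calories, weights)
print(f"B0 = {b0:.3f}, B1 = {b1:.3f}, B2 = {b2:.3f}")
```

In practice you would use a statistics package rather than hand-rolled algebra; the point is only that adding a covariate adds one more column to the design matrix and one more parameter to estimate.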

But suppose we want to add one more variable: then it becomes difficult to represent the model graphically. We may also want a response that is not normally distributed. That’s where the GLM comes in.

The Benefits of GLM
1. We can allow the distribution of the data to be non-normal. This is important in actuarial work, where data are often not normally distributed.
2. Let’s see the case of mortality: we use the Poisson distribution for modelling the force of mortality and the Binomial distribution for the initial rate of mortality.
3. In general insurance, the Poisson distribution is used for modelling claim frequency (i.e. number of claims) and the Gamma or Lognormal distribution for claim severity (i.e. size of the claim).
4. There are various factors that affect the premium; factors can be measurable (numerical) or categorical. A GLM helps in determining which factors to use.
5. A GLM helps in estimating the appropriate premium to charge for a particular policy given the level of risk present.
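As a rough illustration of point 3, here is a sketch of a Poisson claim-frequency model with a log link, log(μ) = B0 + B1x, fitted by Newton-Raphson on the log-likelihood. Everything in it is an assumption made for illustration: the ten policies are made up, each is assumed to have one year of exposure, x is a hypothetical "young driver" indicator, and fit_poisson_glm is a name chosen here. A real pricing exercise would use a statistical package and far more data.

```python
import math


def fit_poisson_glm(x, y, iterations=25):
    """Fit log(mu) = b0 + b1*x for Poisson counts y by Newton-Raphson."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the overall mean frequency
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: derivative of the log-likelihood w.r.t. (b0, b1)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (2 x 2, symmetric)
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Hypothetical portfolio: 1 = young driver, 0 = other; claims in one policy year
young = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
claims = [1, 0, 2, 1, 0, 0, 1, 0, 0, 1]

b0, b1 = fit_poisson_glm(young, claims)
freq_other = math.exp(b0)        # expected claims per year, other drivers
freq_young = math.exp(b0 + b1)   # expected claims per year, young drivers
print(f"other: {freq_other:.4f}, young: {freq_young:.4f}, relativity: {math.exp(b1):.4f}")
```

With a single yes/no factor the fitted frequencies simply reproduce the observed average number of claims in each group, which is a handy sanity check. The quantity exp(B1) is the multiplicative relativity applied to young drivers.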

Follow us on LinkedIn : Actuary Sense

Follow me on LinkedIn: Kamal Sardana

Let’s go a little deeper into GLM
For the pricing of motor insurance, there are a large number of factors that affect the premium, for example: age of driver, number of years of driving experience, etc.
A GLM helps in determining which of these factors are significant and should be considered when calculating the premium.

A GLM models the relationship between the response variable and the factors. Now what are “response variables” and “factors”?
Under the multiple linear regression model, we said that weight can be determined using height and calories consumed per day. So the “response variable” is “Weight”, whereas the factors are “Height” and “Calories consumed”.
Factors are also called predictors, covariates or independent variables: the quantities about which you have information.

So first of all, we have to define the distribution of the response variable (that is, Y). Then the covariates (or factors) can be related to the response. Thus, the first step is to consider the general form of distributions (known as the exponential family) used in GLMs. In other words, we should write the distribution of the response in exponential family form.
What is the exponential family:
f(y; θ, φ) = exp[{(yθ – b(θ))/a(φ)} – c(y, φ)] where θ is the natural parameter and φ is the dispersion parameter. (Many texts write the last term as + c(y, φ); that only flips the sign of c.)
The response distribution can be Gamma, Binomial, Poisson, Normal or Exponential, and we then write it in exponential family form. A Lognormal response is usually handled by modelling log(Y) as Normal.
We put it into exponential form so that, when we use the log-likelihood function, it is easy to obtain the MLE 😊
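As a worked example, take the Poisson distribution with mean μ. Matching it to the form f(y; θ, φ) = exp[{(yθ – b(θ))/a(φ)} – c(y, φ)] gives θ = log μ, b(θ) = e^θ, a(φ) = 1 and c(y, φ) = log(y!). The short Python sketch below checks this numerically and shows why the form is convenient: setting the derivative of the log-likelihood to zero gives b'(θ) = sample mean, so the MLE of μ is just the average. The claim counts are made up, and the function names are labels chosen here.

```python
import math


def poisson_pmf(y, mu):
    """Usual Poisson probability: exp(-mu) * mu^y / y!"""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def poisson_exp_family(y, mu):
    """Same probability written as exp[(y*theta - b(theta))/a(phi) - c(y, phi)]."""
    theta = math.log(mu)               # natural parameter
    b = math.exp(theta)                # b(theta) = e^theta = mu
    a = 1.0                            # a(phi) = 1 for the Poisson
    c = math.log(math.factorial(y))    # c(y, phi) = log(y!)
    return math.exp((y * theta - b) / a - c)


def log_likelihood(theta, ys):
    """Poisson log-likelihood in terms of the natural parameter theta."""
    return sum(y * theta - math.exp(theta) - math.log(math.factorial(y)) for y in ys)


# Hypothetical claim counts for eight policies
claims = [0, 1, 0, 2, 1, 0, 0, 3]

# d/dtheta of the log-likelihood is sum(y - e^theta), which is zero at e^theta = mean(y)
mu_hat = sum(claims) / len(claims)
theta_hat = math.log(mu_hat)

for y in range(4):
    print(y, poisson_pmf(y, mu_hat), poisson_exp_family(y, mu_hat))
print("MLE of mu:", mu_hat)
```

The two probability columns agree, and the log-likelihood is lower at any θ other than log(sample mean).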


