### Role of Generalised Linear Model in non-life pricing Phase3

Before reading this article, make sure that you have read Phase1 and Phase2. Here are the links:

Phase1: http://www.actuarysense.com/2018/10/role-of-generalised-linear-model-in-non.html

Phase2: http://www.actuarysense.com/2018/11/role-of-generalised-linear-model-in-non.html

So we know that the purpose of a GLM is to find the relationship between the mean of the response variable and the covariates.

In this Article we are going to talk about Linear Predictors.

**Linear Predictor**: Let’s denote it with “η” (eta). So, the linear predictor is actually a function of the covariates. For example, in the normal linear model where the function is $Y = B_0 + B_1 x$, the linear predictor will be $\eta = B_0 + B_1 x$. Always note that the linear predictor has to be linear in its parameters. In this case the parameters are $B_0$ and $B_1$.

But still the question is how I came up with $B_0 + B_1 x$ as a function?

First of all, note that broadly there are two types of covariates.

1. Variables: These take numerical values. For example: age of policyholder, years of experience etc.

2. Factors: These take categorical values. For example: sex of policyholder, car colour of policyholder etc.

Let’s see different scenarios:

1. If Age is the only covariate that exists

Linear Predictor: Age

$\eta = B_0 + B_1 x$, where $x$ is the age of the policyholder.

2. If Age and Sex are both covariates (one is a factor and the other is a variable)

Linear Predictor: Age + Sex

$\eta = a_i + B x$, where $x$ is the age of the policyholder and $i = 1$ for male and $2$ for female.

3. If Age and Sex are the covariates, with interaction between them too

Linear Predictor: Age + Sex + Age.Sex

$\eta = a_i + B_i x$, so we can see here that as “i” changes, the value of B also changes.

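
The three scenarios above can be sketched in code as rows of a design matrix. This is only a minimal illustration in Python: the policyholders, the parameter values and all function names are made up for the example and are not part of the original series.

```python
# Hypothetical policyholders as (age, sex), with sex coded 1 = male, 2 = female.
policyholders = [(25, 1), (40, 2), (60, 1)]


def row_age(age, sex):
    """Age only: eta = B0 + B1*x, columns [1, x]."""
    return [1, age]


def row_age_plus_sex(age, sex):
    """Age + Sex: eta = a_i + B*x, columns [is_male, is_female, x]."""
    return [int(sex == 1), int(sex == 2), age]


def row_age_sex_interaction(age, sex):
    """Age + Sex + Age.Sex: eta = a_i + B_i*x, one intercept and one slope per sex."""
    return [int(sex == 1), int(sex == 2), age * int(sex == 1), age * int(sex == 2)]


def linear_predictor(row, params):
    """eta is linear in the parameters: a sum of (column value * parameter)."""
    return sum(value * param for value, param in zip(row, params))


designs = {
    "Age": row_age,
    "Age + Sex": row_age_plus_sex,
    "Age + Sex + Age.Sex": row_age_sex_interaction,
}

# Number of parameters to estimate = number of columns in a row.
n_params = {name: len(make_row(25, 1)) for name, make_row in designs.items()}
print(n_params)

# Illustrative (made-up) parameters for the interaction model: a_1, a_2, B_1, B_2.
example_params = [0.1, 0.2, 0.01, 0.02]
for age, sex in policyholders:
    eta = linear_predictor(row_age_sex_interaction(age, sex), example_params)
    print(age, sex, round(eta, 4))
```

The number of columns in each row is the number of parameters that has to be estimated: 2, 3 and 4 for the three scenarios.
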
The reason that this formulation of the linear predictor is desirable is its efficiency. In the 1st case we need to estimate only 2 parameters, in the 2nd case we need to estimate 3 parameters and in the last case we need to estimate 4 parameters. So as the covariates keep on increasing, the model will become more complex and we need to estimate more and more parameters.

**Follow us on LinkedIn : Actuary Sense**

So, does it simply mean that I can estimate as many parameters as there are data points to make the perfect model? That will not be the case, as it impacts the efficiency of the model. However, we can use that type of model as a benchmark. That type of model is known as the “Saturated Model”.

**Saturated Model**: It is the model that provides a perfect fit to the data. The saturated model is not useful from a predictive point of view; however, it is a good benchmark against which to compare the fit of other models via the scaled deviance.

So now the point is that different people come up with their own models for pricing of motor insurance. Which model is good and which is not, we can check using the **Likelihood Ratio Test**.

Let’s see the example:

There are 2 models: Model P and Model Q.

Now in Model P, the scaled deviance will be $2(l_S - l_P)$, where $l_S$ represents the log-likelihood of the saturated model and $l_P$ is that of Model P. So now you can relate this to topic 1 of this series: why we put every response variable model into the exponential family, so that we can take its log-likelihood easily and use it for comparing with other models. 😊

Same thing I can do for Model Q too.
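
As a rough sketch of this calculation, assume the claim counts follow a Poisson distribution. That assumption, the claim counts and the fitted means below are invented purely for illustration. For the saturated model, the fitted mean of each data point is simply the observation itself, and for the Poisson distribution the scale parameter is 1, so the scaled deviance and the deviance coincide.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood: sum of y*log(mu) - mu - log(y!)."""
    total = 0.0
    for yi, mi in zip(y, mu):
        total += -mi - math.lgamma(yi + 1)
        if yi > 0:  # the term y*log(mu) is taken as 0 when y = 0
            total += yi * math.log(mi)
    return total


# Invented claim counts for five policies.
claims = [0, 1, 3, 2, 4]

# Hypothetical Model P: every policy has the same fitted mean.
fitted_P = [2.0] * len(claims)

# Saturated model: fitted mean equals the observed value (perfect fit).
l_S = poisson_loglik(claims, claims)
l_P = poisson_loglik(claims, fitted_P)

scaled_deviance_P = 2 * (l_S - l_P)
print(round(scaled_deviance_P, 4))
```

The closer Model P gets to the saturated model, the smaller this number becomes.
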

So, now I have the scaled deviance for both models, and we can use the deviance as a measure of the fit of the model. If the data is normally distributed, the scaled deviance has a chi-square distribution; for other distributions this holds only approximately.

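$S_P - S_Q = 2(l_S - l_P) - 2(l_S - l_Q) = 2(l_Q - l_P)$

Then, if Model P is the sub-model of Model Q, we can test this difference against a chi-square distribution, with degrees of freedom equal to the number of extra parameters in Model Q, at a 5% (say) significance level.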

Caution: We can only compare two models in this way if one of them is a subset of the other. I mean, both models should have the same distribution for the data and the same link function, and one model should be a sub-model of the other. Through this we can check whether adding a new parameter makes my model fit significantly better or not.
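
Here is a minimal sketch of such a test, again assuming Poisson claim counts. The data are invented, and the two models are chosen only for illustration: Model P has one common mean, and Model Q adds the factor Sex, so P is a sub-model of Q with one parameter fewer. With only a factor in the model, the maximum likelihood fitted mean for each group is the group's sample mean.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood: sum of y*log(mu) - mu - log(y!)."""
    total = 0.0
    for yi, mi in zip(y, mu):
        total += -mi - math.lgamma(yi + 1)
        if yi > 0:  # the term y*log(mu) is taken as 0 when y = 0
            total += yi * math.log(mi)
    return total


def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


# Invented claim counts, split by the factor Sex.
male = [3, 4, 2, 5]
female = [1, 0, 2, 1]
claims = male + female

# Model P (sub-model): one common mean, 1 parameter.
mean_all = sum(claims) / len(claims)
l_P = poisson_loglik(claims, [mean_all] * len(claims))

# Model Q: a separate mean for each sex, 2 parameters.
mean_m = sum(male) / len(male)
mean_f = sum(female) / len(female)
l_Q = poisson_loglik(claims, [mean_m] * len(male) + [mean_f] * len(female))

# Difference in scaled deviances, S_P - S_Q = 2(l_Q - l_P).
statistic = 2 * (l_Q - l_P)
p_value = chi2_sf_1df(statistic)  # 1 extra parameter, so 1 degree of freedom

CRITICAL_5PCT_1DF = 3.841  # upper 5% point of chi-square with 1 degree of freedom
reject_P = statistic > CRITICAL_5PCT_1DF
print(round(statistic, 4), round(p_value, 4), reject_P)
```

If the statistic exceeds the critical value, the extra parameter in Model Q gives a significant improvement in fit at the 5% level; otherwise the simpler Model P is preferred.
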

Seems Easy Haan 😊

Thanks and Regards

Actuary Sense

Reference: Actuarial Education Company 😊
