### Role of Generalised Linear Models in Non-Life Pricing - Phase 1

We will cover a series of topics on how Non-Life Pricing is done using Generalised Linear Models (GLMs).

But first, let's see what a GLM is.

**Generalised Linear Model**

Before jumping into what a GLM is, let’s see what __linear models__ are.

1. __Linear Models:__ Let’s take the example of Weight (Y) and Height (X). The aim of a linear model is to find the line of best fit through the data points.

Here the X axis is Height and the Y axis is Weight. The line of best fit is $Y = B_0 + B_1 x$, where $B_0$ is the intercept on the Y axis and $B_1$ is the gradient.

Now the question is: how is that line chosen?

Well, the line is chosen so as to minimise the sum of the squared error terms, where the error terms are the vertical distances from the data points to the straight line. The error terms are assumed to be normally distributed with mean 0 and variance $\sigma^2$.

2. __Multiple Linear Regression:__ We can extend our model to allow for other predictive variables. For example, we can decide that Weight depends on both height and calories consumed per day. Here we can no longer find a line of best fit, but we can find the plane of best fit through the plotted data points.

The plane of best fit is $Y = B_0 + B_1 x + B_2 c$, where $c$ is the calories consumed per day. Here too, the error terms follow a normal distribution with mean 0 and variance $\sigma^2$.
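To see the "minimise the sum of squared error terms" idea in numbers, here is a minimal Python sketch that fits both the line and the plane. The data are made up purely for illustration, and `least_squares` is a hypothetical helper name of our own; it relies on NumPy's `np.linalg.lstsq`.

```python
import numpy as np

# Made-up illustrative data (not real measurements).
height = np.array([150.0, 160.0, 165.0, 170.0, 180.0, 185.0])            # cm
calories = np.array([1800.0, 2100.0, 2000.0, 2400.0, 2600.0, 2500.0])    # kcal per day
weight = np.array([52.0, 58.0, 63.0, 67.0, 77.0, 80.0])                  # kg


def least_squares(columns, y):
    """Return (coefficients, SSE) for y = B0 + B1*col1 + B2*col2 + ...

    The coefficients are the ones that minimise the sum of squared errors.
    """
    # A leading column of ones gives the intercept B0.
    X = np.column_stack([np.ones_like(y)] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = float(np.sum((y - X @ coef) ** 2))
    return coef, sse


# Line of best fit: Weight on Height only.
line_coef, line_sse = least_squares([height], weight)

# Plane of best fit: Weight on Height and Calories.
plane_coef, plane_sse = least_squares([height, calories], weight)

print("line :  B0 = %.3f, B1 = %.3f, SSE = %.3f" % (*line_coef, line_sse))
print("plane:  B0 = %.3f, B1 = %.3f, B2 = %.4f, SSE = %.3f" % (*plane_coef, plane_sse))
```

Adding a variable can never increase the sum of squared errors on the same data, so a smaller SSE alone does not tell us that the extra variable is worth keeping.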

But suppose we want to add one more variable: it then becomes difficult to represent the fit graphically. That is where the GLM comes in.

__The Benefits of GLM__
1. We can allow the distribution of the data to be non-normal. This is important in actuarial work, where data often do not follow a normal distribution.

2. Take the case of mortality: we use the Poisson distribution for modelling the force of mortality and the binomial distribution for the initial rate of mortality.

3. In general insurance, the Poisson distribution is used for modelling claim frequency (i.e. the number of claims) and the Gamma or Lognormal distribution for claim severity (i.e. the size of the claim).

4. Various factors affect the premium, and they can be measurable (numerical) or categorical. A GLM helps in determining which factors to use.

5. A GLM helps in estimating the appropriate premium to charge for a particular policy given the level of risk present.
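To get a feel for why the normal distribution is a poor fit for claims data, here is a small Python sketch that simulates claim counts and claim sizes. The parameter values (a Poisson mean of 0.1 claims per policy, a Gamma shape of 2 and scale of 1500) and the helper name `skewness` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative parameters (not taken from any real portfolio).
claim_counts = rng.poisson(lam=0.1, size=100_000)               # claims per policy
claim_sizes = rng.gamma(shape=2.0, scale=1500.0, size=100_000)  # cost per claim


def skewness(x):
    """Sample skewness: roughly 0 for symmetric data such as normal data."""
    d = x - x.mean()
    return float(np.mean(d ** 3) / np.mean(d ** 2) ** 1.5)


print("claim counts: min =", claim_counts.min(), ", skewness =", round(skewness(claim_counts), 2))
print("claim sizes : min =", round(float(claim_sizes.min()), 2), ", skewness =", round(skewness(claim_sizes), 2))
```

Claim counts are non-negative whole numbers and claim sizes are positive and right-skewed, whereas a normal model would be symmetric and would allow negative values.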

**Follow us on LinkedIn : Actuary Sense**

__Let’s go a Little Deeper into GLM__
In the pricing of motor insurance, a large number of factors affect the premium, for example the age of the driver, the number of years of driving experience, etc.

A GLM helps in determining which of these factors are significant and should be considered when calculating the premium.

A GLM models the relationship between the response variable and the factors. Now what are “Response variables” and “Factors”?

Under the multiple linear regression model, we said that weight can be determined using height and calories consumed per day. So the “Response variable” is “Weight”, whereas the factors are “Height and Calories consumed”.

Factors are also called predictors, covariates or independent variables; they are the quantities about which you have information.
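To make "which factors are significant" concrete, here is a minimal sketch of a Poisson claim-frequency GLM written with NumPy only. Several things here are assumptions and not part of the discussion above: the log link, the iteratively reweighted least squares (IRLS) fitting routine, the helper name `fit_poisson_glm`, the simulated portfolio, and the made-up factors `young_driver` and `red_car` (the second has no true effect by construction).

```python
import numpy as np


def fit_poisson_glm(X, y, n_iter=25, tol=1e-10):
    """Fit a Poisson GLM with log link by IRLS.

    X must have a column of ones as its first column (the intercept).
    Returns (beta, standard_errors).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # fitted mean under the log link
        z = eta + (y - mu) / mu         # working response
        w = mu                          # working weights for Poisson with log link
        xtwx = X.T @ (w[:, None] * X)
        beta_new = np.linalg.solve(xtwx, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (mu[:, None] * X))   # inverse Fisher information
    return beta, np.sqrt(np.diag(cov))


# Simulated portfolio: every number below is an assumption for illustration.
rng = np.random.default_rng(42)
n = 20_000
young_driver = rng.integers(0, 2, size=n)   # 1 if the driver is young
red_car = rng.integers(0, 2, size=n)        # 1 if the car is red
true_beta = np.array([-2.0, 0.5, 0.0])      # red_car has no true effect

X = np.column_stack([np.ones(n), young_driver, red_car])
claims = rng.poisson(np.exp(X @ true_beta))

beta, se = fit_poisson_glm(X, claims)
z_scores = beta / se
for name, b, s, z in zip(["intercept", "young_driver", "red_car"], beta, se, z_scores):
    print(f"{name:>12}: estimate {b:+.3f}, std. error {s:.3f}, z = {z:+.2f}")
```

A large absolute z value (roughly above 2) suggests that the factor matters for claim frequency, while a z value near 0, which is what we would expect for `red_car`, suggests the factor could be dropped. In practice you would normally use a statistics package rather than hand-written fitting code.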

__Technicality__
First of all, we have to define the distribution of the response variable (that is, Y). The covariates (or factors) can then be related to the response. Thus, the first step is to consider the general form of the distributions used in GLMs, known as the exponential family. In other words, we should write the distribution of the response in the form of the exponential family.

What is the exponential family?

$$f(y; \theta, \varphi) = \exp\left[\frac{y\theta - b(\theta)}{a(\varphi)} - c(y, \varphi)\right]$$

where $\theta$ is the natural parameter and $\varphi$ is the dispersion parameter.

The response distribution can be Gamma, Binomial, Poisson, Normal or Exponential, and we then write it in the form of the exponential family. The Lognormal does not fit this form directly, but it can still be used, because if Y is Lognormal then log Y is Normal.

We write the distribution in exponential family form because the log-likelihood function then has a simple structure, which makes it easy to obtain the maximum likelihood estimates (MLEs) 😊
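As a worked example (the Poisson case is our own choice of illustration), take a Poisson response with mean $\mu$ and rewrite its probability function in the exponential family form given above:

$$f(y; \mu) = \frac{\mu^{y} e^{-\mu}}{y!} = \exp\left[\, y \log\mu - \mu - \log y! \,\right]$$

Matching terms with the general form gives $\theta = \log\mu$, $b(\theta) = e^{\theta}$, $a(\varphi) = 1$ and, with the minus sign in front of $c$ as written above, $c(y, \varphi) = \log y!$.

For independent observations $y_1, \dots, y_n$ sharing the same $\theta$, the log-likelihood and its derivative are

$$\ell(\theta) = \sum_{i=1}^{n} \left( y_i \theta - e^{\theta} \right) - \sum_{i=1}^{n} \log y_i!$$

$$\frac{d\ell}{d\theta} = \sum_{i=1}^{n} y_i - n e^{\theta} = 0 \quad\Longrightarrow\quad \hat{\mu} = e^{\hat{\theta}} = \bar{y}$$

So the MLE drops out in one line, which is the convenience that the exponential family form buys us.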

**Follow me on LinkedIn: Kamal Sardana**
