Statistics Phase 3 : Measures of Central Tendency


MEASURES OF CENTRAL TENDENCY
When we have a large amount of data, we need to understand and analyse it, but it is very difficult to learn anything just by looking at the raw figures. So there is a need for a single figure that represents the whole data set. Here we will look at 3 measures of central tendency which act as representatives of the data:
1. Mean
2. Median
3. Mode
Mean: It is the arithmetic average of the data. It is calculated by adding all the observations and dividing by the total number of observations. It is the most widely used measure of central tendency because it takes every observation into account and is affected by all of them. But the biggest disadvantage of the mean is that it is affected by extreme values. For example, if our data is 1,2,3,4,55,7,8,10, then the sum is 90 and the mean is 90/8 = 11.25. This is greater than 7 of the 8 observations. So we can see that the mean is pulled upwards by the extreme value 55 and does not reflect the typical central value.
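The calculation above can be checked with a short Python sketch. It uses the same data as the example; the variable names (data, avg, below) are only illustrative and are not part of the original post.

```python
from statistics import mean

# Data from the example above, including the extreme value 55
data = [1, 2, 3, 4, 55, 7, 8, 10]

# Arithmetic mean: sum of observations divided by their count
avg = mean(data)

# Count how many observations lie below the mean
below = sum(1 for x in data if x < avg)

print(avg)    # 11.25
print(below)  # 7
```

Seven of the eight observations fall below the mean, which is the effect of the single extreme value 55.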
Median: It is the middlemost value in the data, the mid point of the data once it is ordered. To calculate the median it is necessary to first arrange the data in ascending or descending order. The median is then the (n+1)/2th observation of the ordered data. For the above data, arranged in ascending order as 1,2,3,4,7,8,10,55, the median is the (8+1)/2th observation = 4.5th observation, which is taken as the arithmetic mean of the 4th and 5th observations (4 and 7). So the median is 5.5. The main benefit of the median over the mean is that it is not affected by extreme observations. We can see that the median is not affected by the extreme value 55. The disadvantage of the median is that it is not based on all observations. Even if we change the observation 55 to 50, our estimate of the median remains the same, so it does not use every observation in its computation.
For example: While conducting a census, a government may calculate the median salary or median age of the population so that outliers do not distort the figure.
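As a minimal sketch of the (n+1)/2 rule described above, the Python snippet below sorts the data and picks the middle observation, averaging the two middle observations when n is even. The function name median_by_position is hypothetical and chosen only for this illustration; the standard library's statistics.median is used as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Median via the (n+1)/2-th ordered observation (illustrative helper)."""
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: (n+1)/2 is a whole number, so take that observation
        return ordered[mid]
    # Even n: (n+1)/2 falls between two observations, so average them
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [1, 2, 3, 4, 55, 7, 8, 10]
changed = [1, 2, 3, 4, 50, 7, 8, 10]  # same data with 55 replaced by 50

print(median_by_position(data))     # 5.5
print(median_by_position(changed))  # 5.5
print(median(data))                 # 5.5
```

Replacing 55 with 50 leaves the result unchanged, which shows both the robustness of the median and the fact that it does not depend on every observation.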
Mode: Mode is the most frequently occurring observation in our data. It is generally used by businesses to find the size or pattern which is most popular in the market and then plan their production based on that. For example, if our data is 0,0,1,1,1,5,5,7,7,7,7,7,52, then our mode is 7, as its frequency (5) is the highest. A data set can also have more than 1 mode, and that is the disadvantage of the mode: it is not rigidly defined.
For example: A shoe manufacturer wants to produce more shoes of the most common size and fewer of the others. In this case the mode can be used to find the most common shoe size among people.
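The mode can be found by counting how often each value occurs. The sketch below uses the data from the example above; the second list, tied, is a made-up illustration (not from the original data) showing that a data set can have more than one mode. It assumes Python 3.8 or later, where statistics.multimode is available.

```python
from collections import Counter
from statistics import mode, multimode

# Data from the example above
data = [0, 0, 1, 1, 1, 5, 5, 7, 7, 7, 7, 7, 52]

# Frequency of each distinct value
counts = Counter(data)

most_common = mode(data)
print(most_common)          # 7
print(counts[most_common])  # 5

# Hypothetical data with a tie: both 1 and 2 occur twice
tied = [1, 1, 2, 2, 3]
print(multimode(tied))      # [1, 2]
```

When two or more values share the highest frequency, multimode returns all of them, which is why the mode is described as not rigidly defined.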


Follow us on LinkedIn : Actuary Sense
Follow me on LinkedIn: Kamal Sardana
credits : Nandini Chopra
