### Do you know every probability distribution? With cool examples

Probability Distributions: A probability distribution is a mathematical function that tells us the chances of an event occurring. If we say that a dataset follows a particular distribution, we mean that the probabilities of events in that dataset behave in a particular pattern. Say the dataset comes from a uniform distribution: then each and every event in the range has an equal chance of occurring.

Types of distributions:

Discrete Distributions: Distributions defined for discrete random variables (which can take countable values such as 0, 1, 2, 3, …) are known as discrete distributions. The following distributions are discrete:

Uniform Distribution: Data is said to be uniformly distributed if every possible outcome has the same probability.

For example: The outcomes of a die throw or a coin flip have equal probabilities. Similarly, the number of bouquets sold daily at a flower shop may be uniformly distributed between a minimum of 30 and a maximum of 40.
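A quick sketch in plain Python (the helper function here is illustrative, not from any library): every outcome in a discrete uniform range gets probability 1/n.

```python
# Discrete uniform: each of the n equally likely outcomes has probability 1/n.
def discrete_uniform_pmf(a, b):
    """P(X = k) for any k in the integer range a..b (inclusive)."""
    return 1 / (b - a + 1)

die_face_prob = discrete_uniform_pmf(1, 6)     # fair die: 1/6 per face
bouquet_prob = discrete_uniform_pmf(30, 40)    # 11 possible daily counts: 1/11 each
```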

Bernoulli Distribution: When a single random trial has only two possible outcomes, with probability of success ‘p’ and probability of failure ‘1 − p’, the trial is termed a Bernoulli trial and follows the Bernoulli distribution.

For example: A bag contains 5 black and 4 white balls. A single ball is drawn from it, so the probability of drawing a white ball is 4/9. This is a Bernoulli trial: only two outcomes are possible (a white or a black ball is drawn), and only one ball is drawn, so a single trial is conducted.
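The ball-drawing example above can be written out directly; the two-branch pmf below is the whole distribution.

```python
# Bernoulli trial: one draw, "success" = white ball with probability p = 4/9.
p_white = 4 / 9

def bernoulli_pmf(k, p):
    """P(X = k) for k in {0, 1}, where 1 = success and 0 = failure."""
    return p if k == 1 else 1 - p

prob_white = bernoulli_pmf(1, p_white)  # white ball drawn: 4/9
prob_black = bernoulli_pmf(0, p_white)  # black ball drawn: 5/9
```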

𝐒𝐮𝐛𝐬𝐜𝐫𝐢𝐛𝐞 𝐭𝐨 𝐦𝐲 𝐘𝐨𝐮𝐓𝐮𝐛𝐞 𝐂𝐡𝐚𝐧𝐧𝐞𝐥 𝐭𝐨 𝐥𝐞𝐚𝐫𝐧 𝐏𝐲𝐭𝐡𝐨𝐧 𝐚𝐧𝐝 𝐒𝐐𝐋 𝐟𝐨𝐫 𝐀𝐜𝐭𝐮𝐚𝐫𝐢𝐞𝐬

# Follow us on Instagram: Actuary Sense

Binomial Distribution: When an experiment has only two possible outcomes (e.g. pass/fail, smoker/non-smoker), where one outcome is termed success with probability ‘p’ and the other failure with probability ‘1 − p’, and ‘n’ independent trials of this experiment are conducted, the number of successes follows a binomial distribution with parameters n and p.

For example: IAI results are declared as either pass or fail; no grades or marks are given, so there are only two possible outcomes. Say the probability of a student passing is ‘p’ and the number of trials is the number of students who appeared.

[When n = 1, the binomial distribution reduces to the Bernoulli distribution.]
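A small sketch of the binomial pmf (the numbers n = 10 and p = 0.4 are made up for illustration, not taken from any real exam data):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success prob p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical numbers: 10 students appear, each passing independently with p = 0.4.
prob_exactly_6_pass = binom_pmf(6, 10, 0.4)

# With n = 1 the binomial pmf collapses to the Bernoulli pmf.
prob_single_pass = binom_pmf(1, 1, 0.4)   # = 0.4
```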

Poisson Distribution: The Poisson distribution expresses the probability of a given number of events occurring in a fixed interval of time, when these events occur at a known constant rate and independently of the time since the last event. It is widely used to model counts and arrival rates of events.

Limiting Case: The Poisson distribution can be described as a limiting case of the binomial distribution when n is sufficiently large and p is sufficiently small. As n tends to infinity and p tends to 0 with np fixed, the binomial distribution can be approximated by a Poisson distribution with λ = np. In practice this approximation is typically used when n > 100 and p < 0.05.

For example: If claims on an insurance policy arise at an average of 5 per year, and the occurrence of one claim does not affect future claims, then the number of claims follows a Poisson distribution. Other examples are the number of accidents occurring per day in a city, the number of emails received in a day, and the number of patients arriving in an emergency room between 10 and 11 pm.

Geometric Distribution: In the geometric distribution we keep performing trials until we get our first success, where the number of trials can be infinite. It is the only discrete distribution with the memoryless property, which states that failures on the first n trials do not affect the probability of success on subsequent trials.

For example: Under a life insurance policy, a person pays premiums until the event of death. Here, death is our “success”, with probability ‘p’, say.

[The description above is of the Type 1 geometric distribution, which counts the number of trials up to the first success. The Type 2 geometric distribution instead counts the number of failures before the first success.]
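Both conventions, and the memoryless property, fit in a few lines (p = 0.3 is an arbitrary illustrative value):

```python
# Type 1 geometric: number of trials up to and including the first success.
def geom_pmf_type1(k, p):
    return (1 - p)**(k - 1) * p      # k = 1, 2, 3, ...

# Type 2 geometric: number of failures before the first success.
def geom_pmf_type2(k, p):
    return (1 - p)**k * p            # k = 0, 1, 2, ...

# Memoryless property: P(X > m + n | X > m) = P(X > n).
p = 0.3
def survival(n):                      # P(X > n): the first n trials all fail
    return (1 - p)**n

memoryless_holds = abs(survival(7) / survival(4) - survival(3)) < 1e-12
```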

Negative Binomial Distribution: In the negative binomial distribution we keep performing trials until we get our kth success, and we find the probability that the kth success occurs on the xth trial. If we set k = 1, this reduces to the geometric distribution. In fact, a negative binomial variable can be expressed as a sum of k independent geometric variables.

For example: In an acceptance sampling plan, a batch of, say, 60 items is inspected, and the batch is rejected if 10 defectives are found, otherwise accepted. The probability that the 10th defective turns up in fewer than 60 inspections (so the batch is rejected) can be calculated using the negative binomial distribution.

[The description above is of the Type 1 negative binomial, which counts the number of trials up to and including the kth success, whereas Type 2 counts the failures before the kth success. So, for the example above, if we found the 10th defective on the 30th trial, then under Type 1 X = 30, as we needed 30 trials to reach the 10th success, while under Type 2 X = 20, because there were 20 failures before the 10th success.]
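The two conventions assign the same probability to the same physical event; the defect rate p = 0.2 below is an assumed value for illustration.

```python
from math import comb

# Type 1: P(the kth success occurs on trial x), x = k, k+1, ...
def nbinom_pmf_type1(x, k, p):
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

# Type 2: P(f failures occur before the kth success), f = 0, 1, 2, ...
def nbinom_pmf_type2(f, k, p):
    return comb(f + k - 1, k - 1) * p**k * (1 - p)**f

p_defective = 0.2   # assumed probability that an inspected item is defective

# Same event, two conventions: 10th defective on the 30th inspection
# (Type 1, x = 30) means 20 good items before the 10th defective (Type 2, f = 20).
same_event = abs(nbinom_pmf_type1(30, 10, p_defective)
                 - nbinom_pmf_type2(20, 10, p_defective)) < 1e-15

# With k = 1 the negative binomial reduces to the geometric distribution.
reduces_to_geom = abs(nbinom_pmf_type1(6, 1, p_defective)
                      - (1 - p_defective)**5 * p_defective) < 1e-15
```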

Hypergeometric Distribution: This distribution gives the probability of k successes in n random draws, made without replacement, from a population of size N containing a known number of successes.

For example: We want to conduct a survey about drug usage and randomly select 20 students from a university with 5000 boys and 4300 girls. The hypergeometric distribution gives the probability that 8 of these 20 randomly chosen students are girls.
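The survey example computed directly (4300 girls in a population of 9300 students, 20 drawn without replacement):

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(k successes in n draws without replacement
    from a population of N items containing K successes)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# 4300 girls among 9300 students; 20 students sampled at random:
# probability that exactly 8 of the 20 are girls.
prob_8_girls = hypergeom_pmf(8, 9300, 4300, 20)
```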

Example for Discrete Distributions:

Suppose there are 50 people, of whom 18 are graduates and the rest are not.

--If the probability of a person being a graduate is 0.6 and is known in advance, then the probability of selecting two people who are both graduates can be modeled using the binomial distribution.

--If we randomly select people from this group one at a time and keep selecting until we pick a graduate, then the probability of finding the 1st graduate on the 6th trial is given by the geometric distribution.

--The probability of finding the 3rd graduate on the 10th trial can be modeled using the negative binomial distribution.

--If we randomly select 7 people from the group without replacement, the probability of having 2 graduates among these 7 can be found using the hypergeometric distribution.
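All four bullets in one sketch, using the p = 0.6 from the example (one-line pmf definitions, illustrative only):

```python
from math import comb

def binom_pmf(k, n, p):        return comb(n, k) * p**k * (1 - p)**(n - k)
def geom_pmf(k, p):            return (1 - p)**(k - 1) * p
def nbinom_pmf(x, r, p):       return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
def hypergeom_pmf(k, N, K, n): return comb(K, k) * comb(N - K, n - k) / comb(N, n)

p = 0.6  # probability of being a graduate, known in advance

prob_both_graduates = binom_pmf(2, 2, p)      # two selected, both graduates
prob_first_on_6th = geom_pmf(6, p)            # 1st graduate on the 6th trial
prob_third_on_10th = nbinom_pmf(10, 3, p)     # 3rd graduate on the 10th trial
prob_2_of_7 = hypergeom_pmf(2, 50, 18, 7)     # 2 graduates in 7 drawn from the 50
```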

Continuous Distributions: Distributions defined for continuous random variables (which can take an uncountably infinite number of values) are known as continuous distributions. The following distributions are continuous:

Uniform Distribution: A continuous random variable is uniformly distributed over an interval if its probability is spread equally across that interval; it is then said to follow a (continuous) uniform distribution.

For example: If trains at a metro station run every five minutes, then the waiting time for a person entering the station at a random time follows a Uniform(0, 5) distribution.

[Uniform distributions are also widely used in random number simulation: for any continuous distribution with cdf F, the transformed variable F(X) follows U(0,1), so applying the inverse cdf to U(0,1) draws simulates from that distribution.]
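A sketch of both ideas: the metro waiting-time cdf, and inverse-transform sampling for an exponential (the function names are illustrative):

```python
import random
from math import log

# Metro waiting time ~ Uniform(0, 5): cdf F(t) = t / 5 on [0, 5].
def uniform_cdf(t, a=0.0, b=5.0):
    return (t - a) / (b - a)

# Inverse-transform sampling: F(X) ~ U(0, 1) for any continuous cdf F,
# so applying the inverse cdf to a U(0, 1) draw simulates from F.
# For Exponential(lam), the inverse cdf gives X = -ln(1 - U) / lam.
def sample_exponential(lam, u=None):
    if u is None:
        u = random.random()
    return -log(1 - u) / lam
```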

Gamma Distribution: The gamma distribution is a positively skewed continuous distribution with two parameters α and β, where α is the shape parameter and β is the rate parameter. When α = m/2 and β = 1/2, it is the χ2 distribution with m degrees of freedom.

For example: Gamma distributions are used to model the size of insurance claims or the size of loan defaults.

[Gamma distributions are also widely used as conjugate priors in Bayesian statistics, e.g. for the rate λ of a Poisson distribution.]

Exponential Distribution: It is the probability distribution of the time between events in a Poisson process. The sum of k independent exponential random variables follows a Gamma(k, λ) distribution, where λ is the parameter of the exponential.

For example: If we record the times between arrivals of students in a classroom or patients in a hospital, and arrivals form a Poisson process, then these inter-arrival times follow an exponential distribution.

[The exponential distribution also has the memoryless property; it is the only continuous distribution that does.]
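Memorylessness can be checked via the survival function (λ = 5 echoes the earlier claims rate; s and t are arbitrary):

```python
from math import exp

lam = 5.0   # e.g. claims arriving at an average rate of 5 per year

# Exponential survival function: P(T > t) = exp(-lam * t).
def survival(t):
    return exp(-lam * t)

# Memoryless: having already waited s, the chance of waiting a further t
# equals the unconditional chance of waiting t.
s, t = 0.4, 1.3
memoryless_holds = abs(survival(s + t) / survival(s) - survival(t)) < 1e-12
```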

Beta Distribution: The beta distribution is a continuous distribution defined on the interval (0, 1) with two positive parameters α and β. It is widely used in Bayesian statistics as a conjugate prior, e.g. for the binomial success probability.

T Distribution: This distribution is similar to the normal distribution in symmetry and shape but has heavier tails, so it produces more values far from the mean. It is used to estimate the mean µ when the sample size is small and σ2 is unknown.

χ2 Distribution: The χ2 distribution with n degrees of freedom is the distribution of a sum of n squared standard normal variates. It is widely used for testing population variances and in chi-squared tests such as goodness of fit and contingency tables.

F Distribution: The F distribution with (m, n) degrees of freedom is the ratio of two independent χ2 variables, each divided by its own degrees of freedom: (χ2 with m df / m) ÷ (χ2 with n df / n). It is used in hypothesis tests on the ratio of two population variances.

Weibull Distribution: The Weibull distribution is widely used in data analysis and has two parameters α and γ, where γ is the shape parameter and α is the scale parameter. The value of the shape parameter γ determines the hazard rate:

--If γ < 1, the hazard rate decreases with time (the process has a large number of infantile or early-life failures and fewer failures as time passes).

--If γ = 1, the hazard rate is constant, indicative of useful life or random failures.

--If γ > 1, the hazard rate increases with time (the distribution models wear-out failures, which tend to happen after some time has passed).

For example: Weibull analysis can be used to study survival models, lifetimes of medical and dental implants, components produced in a factory, warranty analysis, utility services, etc.
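The three hazard-rate regimes follow directly from the standard Weibull hazard formula (a minimal sketch; scale fixed at 1 for simplicity):

```python
# Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1),
# with shape γ and scale α.
def weibull_hazard(t, shape, scale=1.0):
    return (shape / scale) * (t / scale) ** (shape - 1)

decreasing = weibull_hazard(2.0, 0.5) < weibull_hazard(1.0, 0.5)   # γ < 1: early-life failures
constant   = weibull_hazard(2.0, 1.0) == weibull_hazard(1.0, 1.0)  # γ = 1: random failures
increasing = weibull_hazard(2.0, 2.0) > weibull_hazard(1.0, 2.0)   # γ > 1: wear-out failures
```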

Pareto Distribution: The Pareto distribution was first devised to model the distribution of income. It has two parameters α and λ, where α is the shape parameter and λ is the lower bound on the data.

For example: Pareto distributions can be widely used to model the distribution of incomes, city populations, lifetime of manufactured items, claim amounts of insurance policies.


Normal Distribution: The normal distribution is a symmetrical, bell-shaped distribution with mean µ and variance σ2. By the central limit theorem, the sum (or mean) of a large number of independent, identically distributed random variables is approximately normal, whatever their underlying distribution. For a normal distribution mean = median = mode; µ marks the middle point of the graph while σ2 determines the spread. It plays a significant role in statistics, being widely used for approximating other distributions, hypothesis testing, regression models etc.

Standard Normal Distribution: A variable Z follows the standard normal distribution, with µ = 0 and σ2 = 1, where Z = (X − µ)/σ. Normal variates are transformed into Z variates (i.e. standard normal variates) to find P(Z ≤ z), since probability tables are available only for the standard normal.
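The standardisation step can be reproduced without a table, since Φ(z) is expressible via the error function; the numbers µ = 100, σ = 4 are hypothetical.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1), computed via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical numbers: X ~ N(mu = 100, sigma^2 = 16); find P(X <= 106).
mu, sigma = 100.0, 4.0
z = (106.0 - mu) / sigma        # standardise: z = 1.5
prob = std_normal_cdf(z)        # ~0.9332, the value a Z-table gives for 1.5
```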

Lognormal Distribution: The lognormal distribution is a positively skewed continuous distribution with two parameters µ and σ2, which are not its own mean and variance. If X follows a lognormal distribution, then Y = log X has a normal distribution with mean µ and variance σ2.

Burr Distribution: The Burr distribution is a unimodal family of distributions with a wide variety of shapes. This distribution is used widely to model crop prices, household income, option market price distributions, risk (insurance) and travel time.
