The following points highlight the top three ways to measure dispersion. The ways are: 1. Variance 2. Standard Deviation 3. Coefficient of Variation.
Way # 1. Variance:
Variance is based on the deviations of the observations from the mean: the deviations are squared, summed, and then divided by the number of observations to give the sample variance. Its distinct advantage over the mean deviation is that squaring makes every deviation positive, so positive and negative deviations cannot cancel out.
$$s^2 = \frac{\sum (x - \bar{x})^2}{n}$$
where, $x$ = value of an individual observation
$\bar{x}$ = mean of the sample
$n$ = total number of observations.
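For grouped data arranged in a frequency distribution, the variance is
$$s^2 = \frac{\sum f (x - \bar{x})^2}{n}$$
where $f$ is the frequency of each class, $x$ is the mid-value of the class, and $n = \sum f$.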
Since a sample is only a part of the population, containing n items, the formula for sample variance is modified to estimate the population variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
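The three variance calculations described above (division by n, the grouped-data form, and division by n – 1) can be sketched in Python. This is a minimal illustration: the function names and the observations are hypothetical and are not taken from the original text.

```python
def variance_n(values):
    """Sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_n_minus_1(values):
    """Same sum of squared deviations, divided by n - 1 (population estimate)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def grouped_variance(mid_values, frequencies):
    """Variance of grouped data: each squared deviation is weighted by its frequency."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(mid_values, frequencies)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(mid_values, frequencies)) / n


# Hypothetical observations in cm (illustrative only)
lengths_cm = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance_n(lengths_cm))          # 4.0 (in sq. cm)
print(variance_n_minus_1(lengths_cm))  # slightly larger, since the divisor is smaller

# The same observations written as a frequency distribution
print(grouped_variance([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))  # 4.0
```

Python's standard library offers `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n – 1) for the ungrouped cases.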
Significance:
Variance is a quantitative measure expressed in the squared units of the observations.
It has two major difficulties:
1. When the deviations are large and the observations numerous, the variance becomes a very large number that is awkward to express.
2. The variance is not expressed in the same unit as the observations; for example, if the observations are made in cm, the variance is expressed in sq. cm.
Way # 2. Standard Deviation:
This is the most useful measure of the dispersion of a series. The deviations from the mean are squared and summed, the sum is divided by the number of observations, and the square root of the result is taken. Thus the standard deviation is defined as the square root of the variance.
$$s = \sqrt{\frac{\sum d^2}{n}}$$
and, for grouped data,
$$s = \sqrt{\frac{\sum f d^2}{n}}$$
where, d = deviation from the mean
n = total number of observations
f = frequency of each class
When the population standard deviation is estimated from a sample, the formula is:
$$s = \sqrt{\frac{\sum d^2}{n - 1}}$$
Why is n – 1 used to calculate the standard deviation when the sample size is small?
When the sample size is large (e.g., 1,000) and n approaches N (n = sample size, N = population size), the variance differs negligibly whether the sum of squared deviations is divided by n or by n – 1. But when the sample size is small (e.g., 30), the population variance should be estimated by dividing by n – 1, which gives a better estimate.
n – 1 denotes the number of degrees of freedom, i.e., the number of comparisons that can be made between any one observation and the remaining observations, taking them in pairs.
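To see how little the choice of divisor matters for large samples, note that the sum of squared deviations is the same in both formulas, so the two variances always differ by the factor n/(n – 1). The short sketch below tabulates that factor for a few sample sizes; the function name is a hypothetical helper chosen for illustration.

```python
def ratio_n_to_n_minus_1(n):
    """Factor by which dividing by n - 1 instead of n enlarges the variance."""
    return n / (n - 1)


for n in (5, 30, 1000):
    excess_percent = (ratio_n_to_n_minus_1(n) - 1) * 100
    print(f"n = {n:>4}: variance is {excess_percent:.2f}% larger with n - 1")
# n =    5: variance is 25.00% larger with n - 1
# n =   30: variance is 3.45% larger with n - 1
# n = 1000: variance is 0.10% larger with n - 1
```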
Computation of standard deviation needs 7 steps:
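One common way to break the computation into seven steps is sketched below in Python. The ordering of the steps, the function name, and the sample data are assumptions made for illustration; they may not match the list the text refers to.

```python
import math


def standard_deviation_steps(values, population_estimate=True):
    """Standard deviation computed step by step.

    With population_estimate=True the divisor is n - 1; otherwise it is n.
    """
    n = len(values)                                # 1. count the observations
    mean = sum(values) / n                         # 2. compute the mean
    deviations = [x - mean for x in values]        # 3. deviation of each value from the mean
    squared = [d ** 2 for d in deviations]         # 4. square each deviation
    total = sum(squared)                           # 5. sum the squared deviations
    divisor = n - 1 if population_estimate else n
    variance = total / divisor                     # 6. divide by n - 1 (or n)
    return math.sqrt(variance)                     # 7. take the square root


# Hypothetical observations (illustrative only)
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation_steps(data, population_estimate=False))  # 2.0
print(standard_deviation_steps(data))  # slightly larger, since the divisor is n - 1
```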
Merits and Demerits of Standard Deviation:
Merits:
1. The calculation is based on all observations.
2. It is rigidly defined.
3. It is less affected by sampling fluctuations than other measures of dispersion.
4. It summarizes the deviations of a large number of observations from the mean and expresses them as a single unit of variation.
Demerits:
1. It requires a lengthy calculation: the deviations are squared, and then the square root of the summed values is taken.
2. It is not very simple to understand.
3. The calculation gives more weight to extreme values.
Uses of Standard Deviation:
1. It helps in correlating and comparing different samples.
2. It helps in finding a suitable sample size for a valid conclusion.
3. It helps in finding the standard error, which determines whether the difference between the means of two similar samples is real or due to chance.
4. The values of the mean and standard deviation help to describe the population on the basis of observations from a sample (Fig. 10.3).
(a) 50% of the total observations lie within a distance of 0.6745σ on either side of the mean.
(b) Mean ± 1σ covers 68.27% of the area of the curve.
(c) Mean ± 2σ covers 95.45% of the area of the curve.
(d) Mean ± 3σ covers 99.73% of the area of the curve.
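These percentages are those of a normal curve, and they can be checked numerically. The sketch below assumes a normal distribution and uses the standard relation between the normal curve and the error function; the function name is a hypothetical helper.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (0.6745, 1, 2, 3):
    print(f"mean +/- {k} sd covers {coverage_within(k) * 100:.2f}%")
# mean +/- 0.6745 sd covers 50.00%
# mean +/- 1 sd covers 68.27%
# mean +/- 2 sd covers 95.45%
# mean +/- 3 sd covers 99.73%
```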
Way # 3. Coefficient of Variation:
The measures of dispersion above are expressed in the units of the observations. To compare the dispersion of two different characters in the same population, which may be measured in different units, the coefficient of variation is needed.
This measure is expressed as a percentage and is therefore independent of the units. For example, to compare the dispersion of pod length and number of seeds per pod in the same population, we must calculate the coefficient of variation of each character.
Coefficient of variation (C.V.) = (Standard deviation / Mean) × 100
The coefficient of variation is also helpful for comparing the dispersion of one character between two different populations. The higher the C.V., the less consistent the character is.
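As a minimal sketch of the calculation, the Python below computes the coefficient of variation for two characters measured in different units. The data are invented for illustration, and the standard deviation is assumed to be the n – 1 form (`statistics.stdev`); the function name is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation (n - 1 form) as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# Hypothetical measurements from the same population (illustrative only)
pod_length_cm = [4.0, 5.0, 6.0]
seeds_per_pod = [2, 6, 10]

print(f"pod length:    {coefficient_of_variation(pod_length_cm):.2f}%")  # 20.00%
print(f"seeds per pod: {coefficient_of_variation(seeds_per_pod):.2f}%")  # 66.67%
```

Because both results are percentages, the two characters can be compared directly even though one is measured in cm and the other is a count; in this made-up data the number of seeds per pod is the more variable character.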
Example 7:
In two different populations (Batch I and Batch II) the seed number/fruit is calculated:
From the above result, we can conclude that the second batch of fruits has a relatively higher number of seeds per fruit, and that the number of seeds per fruit is also less consistent in this batch, i.e., the variation in seed number is greater.