Hi
Hope you found my previous post on Statistics helpful. Let’s continue to learn more about it.
VARIABILITY
The measures of variability complement the measures of central tendency. They describe the extent of spread in a set of data.
RANGE: It is the difference between the highest and lowest scores in a set of data.
For example, consider the data set: 65, 80, 40, 39, 22, 10
The range for this would be 80 (highest) - 10 (lowest) = 70
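The same calculation can be sketched in a couple of lines of Python. The variable names here are just illustrative.

```python
# Data set from the example above.
scores = [65, 80, 40, 39, 22, 10]

# Range = highest score minus lowest score.
data_range = max(scores) - min(scores)
print(data_range)  # 70
```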
INTER-QUARTILE RANGE (IQR): It also measures variability, based on dividing an ordered set of data into quartiles (four equal parts).
It is the difference between the upper quartile (Q3) and the lower quartile (Q1) in a data set.
For example, consider this:
2, 4, 6, 7, 8, 9, 10, 12, 14
Here the median or middle value is 8, leaving 4 values below it and 4 values above it.
This leaves 2, 4, 6, 7 in the lower set. Q1 is the median of this set, i.e. (4+6)/2
Q1 => 10/2 = 5
Similarly, Q3 is the median of the upper set 9, 10, 12, 14, i.e. (10+12)/2
Q3 => 22/2 = 11
IQR (Q3-Q1) => 11-5 = 6
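Here is a minimal Python sketch of the quartile method used in this example (median of each half, leaving out the middle value when the count is odd). The helper name quartiles_by_halves is made up for illustration and it assumes at least two values. Other quartile conventions, such as interpolation-based ones, can give slightly different answers.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    With an odd number of values, the overall median is left out of both halves.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median
    upper = ordered[-half:]   # values above the median
    return median(lower), median(upper)

data = [2, 4, 6, 7, 8, 9, 10, 12, 14]
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, q3, iqr)  # 5.0 11.0 6.0
```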
VARIANCE: It is the average of the squared differences from the mean. This will help you in computing the standard deviation of a data set.
The variance of a population is denoted by σ², whereas the variance of a sample is denoted by s².
To calculate both types of variance, do the following:
1) Calculate the mean of the data set.
2) Calculate the squared difference for each number, i.e. (number-mean)²
3) Then calculate the average of these squared differences. For a sample, divide by N-1 instead of N.
Variance for a population => σ² = Σ(x-m)²/N where x = number, m = mean and N = number of scores
Variance for a sample => s² = Σ(x-m)²/(N-1) where x = number, m = sample mean and N = number of scores in the sample
Let’s take an example: 5, 10, 15, 20, 25, 30, 35
The mean (m) = (5+10+15+20+25+30+35)/7 => 140/7 => 20
The squared differences for each number would be (5-20)², (10-20)², (15-20)², (20-20)², (25-20)², (30-20)², (35-20)²
The variance for a population => σ² = Σ(x-m)²/N
σ² = ((-15)²+(-10)²+(-5)²+(0)²+(5)²+(10)²+(15)²)/7 => (225+100+25+0+25+100+225)/7 => 700/7 = 100
Similarly, if these seven numbers were a sample, the variance would be s² = 700/(7-1) = 700/6 ≈ 116.67
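The three steps can be sketched in Python as below, for both the population and the sample version. The variable names are illustrative. The last two lines cross-check the hand calculation against pvariance and variance from Python's standard statistics module.

```python
from statistics import mean, pvariance, variance

data = [5, 10, 15, 20, 25, 30, 35]

# Step 1: the mean.
m = mean(data)

# Step 2: squared difference of each number from the mean.
squared_diffs = [(x - m) ** 2 for x in data]

# Step 3: average the squared differences.
population_var = sum(squared_diffs) / len(data)    # divide by N
sample_var = sum(squared_diffs) / (len(data) - 1)  # divide by N-1

print(m, population_var, round(sample_var, 2))

# Cross-check with the standard library.
print(pvariance(data), round(variance(data), 2))
```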
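STANDARD DEVIATION: It is the square root of the variance (of a sample or a population).
It is denoted by σ for a population and by s for a sample.
σ = √σ² and s = √s²
For the population example above, σ = √100 = 10.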
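As a quick check, Python's standard statistics module has pstdev (population) and stdev (sample). The sketch below applies them to the data set from the variance example. The variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev, stdev

data = [5, 10, 15, 20, 25, 30, 35]

population_sd = pstdev(data)  # square root of the population variance (100)
sample_sd = stdev(data)       # square root of the sample variance (700/6)

print(population_sd, round(sample_sd, 3))
```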
NORMAL DISTRIBUTION: It is a bell-shaped curve indicating the distribution of scores. It is symmetrical, with scores concentrated more in the middle than at the ends. It is defined by two parameters, namely the mean and the standard deviation.
The empirical rule states that:
1) Almost all of the values (about 99.7%) will be within +/-3 standard deviations (SD) of the mean
2) About 68% will be within +/-1 SD of the mean and about 95% within +/-2 SD of the mean
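The shares of a normal distribution that lie within 1, 2 and 3 SD of the mean can be checked with NormalDist from Python's standard statistics module. This is only a small sketch, and the dictionary name shares is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

shares = {}
for k in (1, 2, 3):
    # Area under the curve between -k SD and +k SD.
    shares[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within +/-{k} SD: {shares[k]:.1%}")
```

This prints roughly 68%, 95% and 99.7%.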
Distributions can also be asymmetrical, i.e. skewed to the left or right, as mentioned in the previous post.
You can convert a score from a normal distribution into a standard (z) score with the formula:
z = (X-µ)/σ where X = number in the data set, µ = mean and σ is the SD.
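A small Python sketch of the z formula follows, using the data set from the variance example and treating it as a whole population (an assumption made only for this illustration). The function name z_score is made up.

```python
from statistics import mean, pstdev

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

data = [5, 10, 15, 20, 25, 30, 35]
mu = mean(data)       # 20
sigma = pstdev(data)  # 10.0

for x in (5, 20, 35):
    print(x, z_score(x, mu, sigma))
```

So a score of 35 is 1.5 SD above the mean, and a score of 5 is 1.5 SD below it.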
T-DISTRIBUTION: It is used to evaluate confidence intervals for small samples (n less than 30), typically when the population SD is not known. It is flatter in the middle and has heavier tails than the normal distribution.
DEGREES OF FREEDOM (DF): It is the number of values in a calculation that are free to vary once an estimate such as the mean has been fixed. The more degrees of freedom a sample has, the more precisely its statistics estimate the population values.
DF = N-1 where N is the size of the sample (for a single sample)
NULL HYPOTHESIS (H0): It states that no relationship exists between the dependent and independent variables. Any observed relationship is purely due to chance, or is small enough to be treated as zero.
ALTERNATIVE HYPOTHESIS (H1): It states that there is a true difference between variables. Any observed difference is too large to be attributed to chance.
Type I ERROR (False Positive): It happens when a test rejects a true null hypothesis. We conclude that a true difference exists when in reality it is due to chance. Its probability is denoted by α. It is generally considered more serious than a Type II error.
Type II ERROR (False Negative): It happens when a test fails to reject a false null hypothesis. We conclude that the relationship between variables is due to chance when it is in fact a true difference. Its probability is denoted by β.
ALPHA LEVEL (α): It is the probability of a Type I error occurring in an experiment. It is the threshold used to decide whether a result occurred by chance (sampling error) or for real. A commonly used alpha level is 0.05, which corresponds to a 95% confidence interval. It means that if the null hypothesis is actually true, there’s a 5% chance you will wrongly reject it.
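To get a feel for what α = 0.05 means, here is a rough simulation sketch. It assumes a simple two-sided z-test with a known population SD, and the function name, trial count, sample size and seed are all made up for illustration. Every simulated experiment draws from a population where the null hypothesis really is true, so any rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type_i_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Share of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, SD 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1 / sqrt(n))  # z statistic with known SD = 1
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / trials

rate = simulate_type_i_rate()
print(f"share of false positives: {rate:.3f}")
```

The share should come out close to 0.05, i.e. about 1 in 20 experiments gives a false positive.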
p-value: The probability of finding an effect at least as big as the one observed during the experiment when the null hypothesis is true. If p is larger than α, then H0 is not rejected. But if p is smaller than or equal to α, then H0 is rejected.
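The decision rule can be sketched in Python as follows. The two helper names are made up, and the p-value here comes from a two-sided z-test, which is just one assumed way of getting a p-value. The z values 1.5 and 2.5 are hypothetical test results.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a z statistic at least this far from 0 when H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Compare the p-value with the alpha level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

for z in (1.5, 2.5):  # hypothetical test statistics
    p = two_sided_p_value(z)
    print(f"z = {z}: p = {p:.4f} -> {decide(p)}")
```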
Confidence Interval (CI): It provides a range of scores with specific boundaries within which the true population value is expected to lie. The wider the interval we propose, the more confident we’ll be that the true population mean falls within it. The confidence level is expressed as a percentage, for example 95%.
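Below is a minimal sketch of a confidence interval for a mean. It assumes a sample of 30 or more scores so that the normal distribution can be used for the critical value (for smaller samples the t-distribution would be the usual choice). The function name and the 36 scores are hypothetical and only there to make the example runnable.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z_critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    margin = z_critical * standard_error
    return centre - margin, centre + margin

# Hypothetical sample of 36 scores.
scores = [50 + (i % 9) for i in range(36)]

low95, high95 = mean_confidence_interval(scores, 0.95)
low99, high99 = mean_confidence_interval(scores, 0.99)
print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval comes out wider than the 95% one, which is the trade-off described above: more confidence needs a wider range.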
Hopefully the info I’ve given so far helps you all understand Statistics more easily!
Have a good weekend.
REFERENCES:
2) Pierce, Rod. “Standard Deviation and Variance.” Math Is Fun. Ed. Rod Pierce. 10 Jul 2011. 9 Sep 2011 <http://www.mathsisfun.com/data/standard-deviation.html>
3) “Foundations of Clinical Research: Applications to Practice” by Portney & Watkins.