No statistics student can avoid standard deviation and variance, even though they'd often like to ... (well, I guess most people would like to avoid stats in general—full sympathy for that ;))

Standard deviation and variance are an important part of your basic statistical toolbox, and you will see them in both descriptive and inferential statistics. So it might be a good idea to familiarize yourself with them!

Both show how widely the data are scattered around the mean, but only the standard deviation can be used for interpretation and shows the typical deviation from the mean.

In the following, you'll learn how to calculate and interpret these two parameters. Let's do this!

## What are standard deviation & variance?

The standard deviation is the square root of the variance and thus its "daughter".

Put differently, the variance is the mama of the standard deviation ;).

Both describe or quantify the dispersion of values around the mean of a data set, and thus **tell you how much subjects differ on the characteristic in question.**

They **can only be applied to metric data**—interval, ratio, or absolute scales.

Standard deviation and variance are very sensitive to outliers—use caution when interpreting them, or use another measure of dispersion such as the interquartile range.

**The big difference between the two is that the values are squared for the variance and in the original units for the standard deviation.**

Therefore, the **variance is used as a mathematical bridge to calculate the standard deviation**, which is much more user-friendly for concrete interpretation.

In addition, the variance is the basis for further calculations, e.g. in regression or – as you may have guessed – in the analysis of variance.

These measures of dispersion can be found in most empirical studies:

Usually the standard deviation is reported as additional information to the mean. It looks like this: *M* (*SD*), where *M* is the mean and *SD* the abbreviation for standard deviation. An example could be: 5.14 (2.36).

## Which area of statistics do they belong to?

Standard deviation and variance **belong to the world of descriptive statistics, but they can also be found in inferential statistics** where they are called something else:

**At the population level, *s* (standard deviation) and *s*² (variance) become σ (sigma) and σ² (sigma squared).**

But the principle remains the same—only the calculation is slightly different.

## What does the standard deviation tell us?

The standard deviation describes or **quantifies how widely values are typically scattered around the mean of a data set:**

**This tells us how large a typical, representative deviation from the "average" is.**

**In more practical terms:**

**It tells us whether the test subjects have pretty similar values in a certain variable—or whether they differ a lot.**

**Good to know:**

**If the data are normally distributed, about 68% of all values fall between one standard deviation below and one standard deviation above the mean.**
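If you'd like to see this rule in action, here's a small Python sketch using only the standard library. The mean of 100 and standard deviation of 15 are made-up values, chosen just for illustration:

```python
import random
import statistics

# Simulate normally distributed scores (made-up parameters: mean 100, SD 15)
random.seed(42)  # fixed seed so the result is reproducible
scores = [random.gauss(100, 15) for _ in range(100_000)]

m = statistics.mean(scores)
sd = statistics.stdev(scores)

# Count how many values fall within one SD of the mean
within_one_sd = sum(1 for x in scores if m - sd <= x <= m + sd)
share = within_one_sd / len(scores)
print(f"Share within one SD of the mean: {share:.1%}")  # roughly 68%
```

With a large enough sample, the share settles close to the theoretical 68.3%.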

As described above, **the variance should not be used for interpretation**, but only as a bridge to calculate the standard deviation.

## Calculation of variance

In order to get to the standard deviation, you need to calculate the variance first.

The confusing part here is that you'll find **slight variations of formulas in every textbook:**

**Some divide only by n, some only by n – 1, and some show both.** So you'd better check what's being done at your university.

Regarding the calculations, the following formulas are essentially the same:

The mean is subtracted from each value, then the result is squared.

You add all these squared values up and divide them either by the sample size (*n*) or the sample size minus 1 (*n* – 1).

**The reason for dividing by n – 1 is that you get closer to the "true" variance in the population, i.e. you can estimate it better.**

**If you only divide by n, you underestimate the true variance.**

However, if you already have all the people in your sample that make up your population (sample = population), then there's no need for any estimation or inference to the true variance. In this case, you simply divide by *n*.

**Now, let's get into it:**

You use this formula **when you want to estimate the true variance in the population using the data from the sample**—this is called the **"sample variance":**
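Written out, with xᵢ for the individual values, x̄ for the sample mean, and n for the sample size, the sample variance is:

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```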

If you just want to calculate the variance in your specific sample **without making any inferences about the population**, meaning you use it only as a descriptive statistic, you use this formula:
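The descriptive version only swaps the denominator—you divide by n instead of n – 1:

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}
```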

In the unlikely event that you've gathered data from every single person of the population you're interested in, you use this formula to calculate the **"population variance"**.
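Using μ for the population mean and N for the population size, the population variance reads:

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
```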

Note that in this formula, you use σ² (sigma squared) instead of *s*². That's because **you use Greek letters for inferential statistics and Roman letters for descriptive statistics.**
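As an aside: Python's standard-library `statistics` module distinguishes exactly these two denominators. The data below are made-up values, purely for illustration:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values, n = 8

# variance() divides by n - 1: the sample variance used to
# estimate the population variance (inferential statistics)
sample_var = statistics.variance(data)

# pvariance() divides by n: the variance of the data at hand,
# used descriptively or when sample = population
population_var = statistics.pvariance(data)

print(round(sample_var, 2))   # 4.57
print(population_var)         # 4.0
```

The `p` in `pvariance` stands for "population"—the version without the n – 1 correction.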

## Our example: Speed dating

**Let's say you're doing a study on self-confidence at speed dating events, conducted among adults over the age of 18.** Self-confidence is scaled from 0 (none at all) to 30 (likes to indulge in fantasies of irresistible attractiveness).

Here are the data:

**1, 20, 26, 14, 9, 6, 19, 22** (*n* = 8)

**This is how you do it:**

1. Calculate the mean
2. Subtract the mean from each value and square the result
3. Add up all the squared values (this is the sum of squares, or *SS*)
4. Finally, divide by *n* – 1 (or by *n*)

The mean value of our data is 14.63.

We now insert the mean plus the above values neatly into the formula.

We'll start with the **"divided by n – 1" version, where n – 1 = 8 – 1 = 7:**

The variance equals 74.84, which is quite large for this small data set and not suitable for interpretation due to the squared values.

And now the **version "divided by n":**

This time, the variance equals 65.48.
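You can double-check both results with a few lines of Python:

```python
import statistics

# The speed-dating self-confidence data, n = 8
confidence = [1, 20, 26, 14, 9, 6, 19, 22]

var_n_minus_1 = statistics.variance(confidence)   # divides by n - 1
var_n = statistics.pvariance(confidence)          # divides by n

print(round(var_n_minus_1, 2))  # 74.84
print(round(var_n, 2))          # 65.48
```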

As you can see, this doesn't really help us much when it comes to the interpretation of self-confidence on a scale of 0 – 30 ...

So we'll move straight on to the standard deviation!

## Calculation of standard deviation

Once you have calculated the variance, the lion's share is already done.

Now all that remains is to take the square root of the variance:

**This is how you do it:**

1. Calculate the variance
2. Take the square root

In our example of self-confidence at speed dating events, the result is *s* ≈ 8.65 for the version divided by *n* – 1, and *s* ≈ 8.09 for the version divided by *n*.
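The same square-root step in Python, using the counterparts of `variance()` and `pvariance()`:

```python
import statistics

confidence = [1, 20, 26, 14, 9, 6, 19, 22]  # the speed-dating data

# The standard deviation is the square root of the variance:
sd_n_minus_1 = statistics.stdev(confidence)   # sqrt of the n - 1 variance
sd_n = statistics.pstdev(confidence)          # sqrt of the n variance

print(round(sd_n_minus_1, 2))  # 8.65
print(round(sd_n, 2))          # 8.09
```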

**What does this tell us?**

**In terms of a relatively manageable scale of 0 to 30, these standard deviations are quite high, meaning that the subjects varied quite widely in their self-confidence—some had very low and some very high self-confidence.**

**So it's not really a homogeneous sample.**

**To summarize:**

On average, the participants had a self-confidence score of about 15 (14.63), which is right in the middle of the scale.

Typically, the scores ranged between 7 and 23.

How did I get those numbers 7 and 23?

If the standard deviation is about 8 and the mean about 15, I subtract and add one standard deviation from and to the mean. This gives me the range where the values typically lie: 15 – 8 = 7 and 15 + 8 = 23
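If you use the exact (unrounded) values instead, the interval shifts only slightly, as a quick check in Python shows:

```python
import statistics

confidence = [1, 20, 26, 14, 9, 6, 19, 22]

m = statistics.mean(confidence)    # 14.625
sd = statistics.stdev(confidence)  # about 8.65

# The range where values "typically" lie: one SD below and above the mean
print(round(m - sd, 2), round(m + sd, 2))  # 5.97 23.28
```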

**Remember:**

One thing to keep in mind when interpreting the data is the scale on which the characteristic of interest was measured.

**It's always important to put the size of the standard deviation in relation to the range of the scale!**

For example, a standard deviation of 2.2 is quite high on a scale of 0 – 5, and would be very low on a scale of 1 – 100.
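One rough way to make that comparison concrete is to express the standard deviation as a fraction of the scale's total width. Note that this ratio (and the helper function below) is just an illustration, not a standard statistic:

```python
def sd_relative_to_scale(sd, scale_min, scale_max):
    """Return the SD as a fraction of the scale's total width (illustrative only)."""
    return sd / (scale_max - scale_min)

# The same SD of 2.2 looks very different on different scales:
print(sd_relative_to_scale(2.2, 0, 5))    # about 0.44 of the scale: huge spread
print(sd_relative_to_scale(2.2, 1, 100))  # about 0.02 of the scale: tiny spread
```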

**Finally, a quick summary:**

## Summary standard deviation & variance

##### Ready to implement?

Is your motivation high enough to apply what you've just read?

Then grab a small data set and start calculating like there's no tomorrow ...

And don't forget to reward yourself regularly!

The fun factor of statistics is usually very limited.

That's why you should really enjoy yourself during and after learning.
