Hey everyone! Today we’re going to take a look at standard deviation when applied to a population.
Let’s start by looking at what standard deviation can tell us about a set of data. Let’s say we have a classroom of 100 middle school students and we measure the height of each student. We now have our data set. We could find the mean, or simple average, of the data by adding all of the numbers together and dividing by how many students we measured, which in this case is 100. Let’s say the mean height of this group of students is 58 inches tall.
This number alone is quite limited. Without access to the original data, we don’t know if every single student is exactly that height, which would give us that average, or if half of them are 53 inches tall and the other half are 63 inches tall, which also gives us that average.
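To make that concrete, here is a minimal Python sketch of the two hypothetical classrooms just described. The variable names are made up for illustration; the only point is that both lists share a mean of 58 even though the heights are spread out very differently.

```python
# Two hypothetical classrooms of 100 students each (heights in inches).
all_same = [58] * 100             # every student is exactly 58 inches tall
split = [53] * 50 + [63] * 50     # half are 53 inches, half are 63 inches

mean_same = sum(all_same) / len(all_same)
mean_split = sum(split) / len(split)

print(mean_same, mean_split)       # 58.0 58.0
print(min(all_same), max(all_same))  # 58 58
print(min(split), max(split))        # 53 63
```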
In reality, our intuition would expect there to be a mix of students with most of the students around the average height and fewer students who are a bit shorter or taller and even fewer who are a lot shorter and taller than the average. And our intuition is right. This is what we call a normal distribution and it looks like this:
If we start in the center and move to the right until we reach the next vertical line, we’ve moved one standard deviation above the mean, which is the middle. In this example, the standard deviation is 3 inches. We can see in the area under the graph that this accounts for just over 34% of all the students. If we go back to the middle and move left until we reach a vertical line, we’ve moved one standard deviation below the mean, which accounts for another 34% or so of the students. So we can say that over 68% of the students are within 3 inches (our standard deviation) of the mean height for all the students. This is always true for a normal distribution. The only thing that changes is the value of the standard deviation itself.
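If you want to check those percentages yourself, here is a small Python sketch. It assumes an ideal normal distribution and uses the standard result that the fraction of values within k standard deviations of the mean is erf(k / √2). The function name `fraction_within` is just an illustrative choice.

```python
import math

def fraction_within(k):
    """Fraction of an ideal normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

both_sides = fraction_within(1)   # from one SD below the mean to one SD above
one_side = both_sides / 2         # from the mean to one SD above (or below)

print(f"{one_side:.4f}")    # 0.3413
print(f"{both_sides:.4f}")  # 0.6827

# With the mean of 58 inches and standard deviation of 3 inches from the example,
# "within one standard deviation" means heights between these two values.
mean, sd = 58, 3
print(mean - sd, mean + sd)  # 55 61
```

Notice that the mean and the standard deviation never appear inside `fraction_within`. That is why the 68% figure stays the same no matter what the standard deviation turns out to be.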
For instance, if we had calculated the standard deviation and it came out to 2 inches instead of 3, then the same percentage of students would be within one standard deviation of the mean, but when we look at the normal distribution the curve is taller and narrower, and we see that our 34% fits within 2 inches of the center rather than 3 like in our previous distribution:
That shows us what standard deviation is all about. In this case it’s the number of inches of height away from the average, on one side, that takes in about 34% of the population. In the first case it took three inches to account for that percentage of people. In our second example it only took two inches to account for that percentage.
So when we calculate the standard deviation of a data set, that’s what we’re finding. And that number tells us something about all of the data. In our revised distribution we know that most of the students (about 68%) are within two inches, or one standard deviation, of the mean of 58 inches. In our first set, it took a standard deviation of 3 inches to get that many. So you could say it’s a measure of how spread out our data is.
So at this point you might be asking: How do we calculate the standard deviation, anyway? Well, the easiest way is on a spreadsheet, where a built-in function can be used on a range of cells. In Excel and Google Sheets, for example, STDEVP gives the population standard deviation, while STDEV gives the sample version, so make sure you pick the right one. But we can calculate it manually. Let’s keep our data set small to make our lives a bit easier. Say we have 10 students and we measure their heights in inches and get values of 52, 55, 56, 56, 57, 58, 59, 61, 62, 64.
1. Step one is to find the mean. We add them all up (580) and divide by 10 to get a mean of 58.
2. Step two is to take the difference between each number and the mean and then square it. So (52 – 58)² is 36, (55 – 58)² is 9, (56 – 58)² is 4, (56 – 58)² is 4, (57 – 58)² is 1, (58 – 58)² is 0, (59 – 58)² is 1, (61 – 58)² is 9, (62 – 58)² is 16, and (64 – 58)² is 36.
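3. Step three is to take the average of those squared differences. This is called the variance. So when I add them all up (116) and divide by 10, I get 11.6. We divide by the full count of 10 because we’re treating these students as the whole population.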
4. To find the standard deviation, I simply take the square root of the variance. The square root of 11.6 is approximately 3.4. So the standard deviation, in this case, is equal to 3.4 inches. Remember, the standard deviation has units, so that’s inches in this case.
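The four steps above can be sketched in a few lines of Python. This is just one way to write it, and the function name `population_std_dev` is made up for illustration. Python's standard library also provides `statistics.pstdev`, which computes the population standard deviation and is used here only as a cross-check.

```python
import math
import statistics

# Heights in inches of the 10 students from the example.
heights = [52, 55, 56, 56, 57, 58, 59, 61, 62, 64]

def population_std_dev(values):
    # Step 1: find the mean.
    mean = sum(values) / len(values)
    # Step 2: square the difference between each value and the mean.
    squared_diffs = [(v - mean) ** 2 for v in values]
    # Step 3: average the squared differences to get the variance
    # (divide by the full count, since this is a population).
    variance = sum(squared_diffs) / len(values)
    # Step 4: take the square root of the variance.
    return math.sqrt(variance)

result = population_std_dev(heights)
print(round(result, 1))                      # 3.4
print(round(statistics.pstdev(heights), 1))  # 3.4
```

The same function works unchanged on a list of 1000 heights, which is the practical reason to let a computer handle the arithmetic.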
In reality, you probably wouldn’t calculate a standard deviation on such a small population, but this gives you an idea. The process would still be the same four steps even if there were 1000 students in your population, though that sure would take a lot longer to calculate manually.
I hope this video was helpful for understanding standard deviation. Thanks for watching. See you next time!