When you hear someone use the word "average" to describe a set of numbers, be careful. A more accurate phrase to use is "measure of central tendency." There are three common measures of central tendency, the mean, the median, and the mode, and each of them can be thought of as an average.
The mean of a set of data is probably what most people refer to as average. You find the mean of a set of data by adding up all the numbers in the data set and then dividing by the number of data points you added up. When someone says average, this is usually what they are talking about.
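The add-then-divide recipe can be sketched in a few lines of Python. The function name `mean` and the sample values are illustrative assumptions, not part of the original lesson.

```python
def mean(values):
    """Return the sum of the values divided by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample values, chosen only for illustration.
print(mean([2, 4, 6, 8]))  # (2 + 4 + 6 + 8) / 4 = 5.0
```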
However, there is another number that is often used to describe a set of data: the median. If you think about driving along an interstate highway, you often see signs that say “Keep Off Median,” meaning that you can’t drive on the grassy area in the middle of the interstate. The median in statistics is the same idea. Well, it’s not a grassy area, but the median does refer to the number in the middle of a data set. When finding the median of a data set, you have to make sure that the numbers are put in order first. If there is an odd number of data points in the list, there is only one number that is exactly in the middle of the data. But if there is an even number of data points, then there are two numbers in the middle. In that case, you have to add those two numbers together and then divide by two to find the median.
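The odd and even cases can be made concrete with a short Python sketch. The function name `median` and the sample lists are assumptions made for illustration only.

```python
def median(values):
    """Return the middle value of the ordered data.

    With an even number of data points, return the average of the
    two middle values instead.
    """
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)  # the numbers must be put in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one value sits in the middle.
        return ordered[middle]
    # Even count: add the two middle values and divide by two.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical sample values, chosen only for illustration.
print(median([7, 1, 5]))     # ordered: 1, 5, 7 -> 5
print(median([7, 1, 5, 3]))  # ordered: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```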
The mode of a data set refers to the number that occurs most often. If no number occurs more often than any other, we say there is no mode for the data. It is possible to have more than one mode for a data set.
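A tally of how often each value appears is enough to find the mode. The sketch below is one possible approach. The function name `modes` is hypothetical, and it assumes that "no mode" means no value repeats, which is one common classroom convention.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the values that occur most often.

    Assumption: if no value occurs more than once, the data set is
    treated as having no mode and an empty list is returned.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # nothing repeats, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


# Hypothetical sample values, chosen only for illustration.
print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2], two modes
print(modes([1, 2, 3]))        # [], no mode
```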
Why in the world would there be three different ways to describe a set of data? It all depends on the situation. Generally speaking, statisticians tend to follow some simple rules.
- If there are extremely high or extremely low values in the data set, those numbers can greatly affect the mean. In those cases of extreme values (in one direction or another) it is preferable to use the median because it is far less sensitive to extreme values.
- When you have categorical data, or data that appears as words instead of numbers, you need to use the mode. For example, if I ask you your favorite color and get word answers like red, blue, green, purple, it is impossible to add up those values to find the mean or to order the colors and find a median. The only option for describing this set of data is the mode. Then you can say that the most common favorite color is red (or whatever color occurs most).
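Both rules can be seen in a small sketch that uses Python's standard `statistics` module. The numbers and the color answers below are made up purely for illustration.

```python
import statistics

# Hypothetical values: four similar numbers, then the same list
# with one extremely high value added.
typical = [10, 12, 13, 15]
with_extreme = typical + [100]

# The single extreme value pulls the mean from 12.5 up to 30,
# while the median only moves from 12.5 to 13.
print(statistics.mean(typical), statistics.median(typical))
print(statistics.mean(with_extreme), statistics.median(with_extreme))

# Hypothetical survey answers: words cannot be added or ordered by
# size, but the most frequent answer can still be counted.
answers = ["red", "blue", "green", "red", "purple", "red", "blue"]
print(statistics.mode(answers))  # red
```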
You probably will not be faced with having to make a decision about which measure of central tendency will be best for a particular data set. Most books will give you a set of numbers and ask you to compute mean, median, and mode. But many students often wonder why there are three different values to be found from the same data set. This brief explanation gives you an idea of the different types of uses, just in case you were wondering.
Let's Practice:
- Given the following data set, find the mean, median, and mode.
12, 15, 16, 19, 20, 20, 22, 23, 25, 27, 29, 30, 32, 32, 35
The mean is found by adding the 15 numbers together which gives 357. Now take 357 and divide it by 15. This gives the value for the mean to be 23.8.
To find the median, we have to make sure that the data set is in order. This data is already in order so we can proceed directly to finding the middle value. In a data set with 15 numbers, the median will be the 8th value. The median for this data set is 23.
The mode is the value that occurs most often. There are two modes in this data set. The values 20 and 32 both appear twice meaning that both are considered the mode.
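If you want to check this arithmetic, a short Python sketch with the standard `statistics` module does the job. It assumes Python 3.8 or later, where `statistics.multimode` is available. The variable name `data` is just a label for the numbers in this problem.

```python
import statistics

data = [12, 15, 16, 19, 20, 20, 22, 23, 25, 27, 29, 30, 32, 32, 35]

print(len(data))                   # 15
print(sum(data))                   # 357
print(statistics.mean(data))       # 23.8
print(statistics.median(data))     # 23
print(statistics.multimode(data))  # [20, 32]
```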
- A survey of 20 students was conducted to find out how many books they had read during the past three months (including books for school). The results from those 20 students are shown below. Find the mean, median, and mode for this data.
2, 4, 5, 1, 3, 2, 5, 6, 1, 2, 4, 3, 6, 10, 12, 10, 2, 8, 6, 7
Find the mean by adding all the numbers and dividing by 20. Adding all the numbers results in 99. Divide 99 by 20 and you get a mean of 4.95.
Before finding the median, the numbers have to be put in order from smallest to largest. The ordered data set is
1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6, 7, 8, 10, 10, 12
Now the median can be found by finding the middle value. In this case, there are 20 numbers in the data set which means there are two numbers in the middle. Those numbers are 4 and 5. The median is found by adding 4 and 5 and then dividing by 2. This gives a median value of 4.5.
The mode is the value that occurs most often, which in this set is 2. It appears four times, more than any other value.
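Tallying how often each value appears is the most reliable way to identify the mode, since it is easy to miscount by eye. The Python sketch below checks all three measures for the survey data. The variable name `books` is just a label chosen for this example.

```python
import statistics
from collections import Counter

books = [2, 4, 5, 1, 3, 2, 5, 6, 1, 2, 4, 3, 6, 10, 12, 10, 2, 8, 6, 7]

print(sorted(books))             # the ordered data set
print(sum(books))                # 99
print(statistics.mean(books))    # 4.95
print(statistics.median(books))  # 4.5

# The two most frequent values with their counts, as (value, count) pairs.
print(Counter(books).most_common(2))  # [(2, 4), (6, 3)]
```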