Andre records how long it takes him (in minutes) to hike a mountain each day for 6 days.
The goal of this discussion is to ensure students understand how an outlier affects the mean.
Ask students what they notice and wonder about the mean values related to whether Andre’s grandfather goes on the hike with him or not. (Students may notice that the mean significantly increases when Andre’s grandfather goes on the hike, and that their estimate was likely inexact. They may wonder why the mean changed so drastically with only 1 new data point and, if their estimate was wrong, why the mean changed in a way that differs from what they expected.)
Point out that the mean changes so much because 130 minutes is significantly greater than all the other values. When the seventh value is closer to the rest of the data (59 minutes), the mean is not very different from the original.
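The effect described above can be checked with a short calculation. The sketch below is illustrative only: the six hike times are hypothetical stand-ins, not the values from the activity. Only the two candidate seventh values (59 and 130 minutes) come from the discussion.

```python
from statistics import mean

# Hypothetical hike times in minutes for the first 6 days.
# These are made-up values for illustration, NOT the activity's data.
times = [56, 58, 59, 60, 61, 62]

base = mean(times)                   # mean of the original 6 days
with_close = mean(times + [59])      # seventh value close to the rest of the data
with_outlier = mean(times + [130])   # seventh value far above the rest of the data

print(f"original mean:      {base:.1f}")
print(f"with 59 minutes:    {with_close:.1f}")
print(f"with 130 minutes:   {with_outlier:.1f}")
```

With these assumed times, adding 59 minutes barely moves the mean, while adding 130 minutes raises it by about 10 minutes, which mirrors the pattern students are asked to notice.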
Highlight that a value should be left in the analysis if it was collected accurately and in the right conditions for the situation at hand, or if we are unsure of why it is different. If we are sure it is a typo, or the value was collected under significantly different conditions that don’t fit the situation at hand, we might leave it out. The most important thing, though, is that students are not under the impression that data should be thrown out just because it’s different and doesn’t match what we want. The default should always be to include the data and remove it only if we are certain that it was an error or that it measures something very different from what we intended.
For each set of data, answer these questions:
The distances, in miles, of students’ homes to the nearest park:
2.3, 4, 1.6, 15, 3.8, 0.75, 1.7
Han visits a website to find out the price of the next phone he wants to get. He sees the following prices, in dollars:
200, 485, 492, 512, 453, 503
The number of points Clare scored in her last 8 basketball games:
17, 14, 16, 2, 13, 14, 15, 17
Kiran’s math test scores, as percentages:
57, 82, 80, 85, 89, 84
The height in feet of the roller coasters at the amusement park:
415, 456, 423, 442, 30
The purpose of this discussion is to clarify how to identify outliers and when outliers should be included in data analysis. Here are sample questions to promote a class discussion:
Tell students that in some cases, what can be considered an outlier is clear because it is so atypical compared to the rest of the data, which is often revealed by visual representations. At other times, whether a point should be considered an outlier is harder to decide just by looking at the data. In those cases, the rule that students are encouraged to use in this class is the one given in this activity, developed by John Tukey: use Q1 - 1.5 · IQR and Q3 + 1.5 · IQR as the boundaries, and treat any value below the first or above the second as an outlier.
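For teachers who want to check the five data sets from this activity quickly, here is a minimal sketch of the 1.5 · IQR rule attributed to Tukey. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Other quartile conventions exist and can give different boundaries. The function names `quartiles` and `tukey_outliers` are illustrative, not part of the curriculum.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumption: when the number of values is odd, the middle value
    is excluded from both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return median(lower), median(upper)

def tukey_outliers(values, k=1.5):
    """Return the values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# The five data sets listed in the activity.
datasets = {
    "park distances (miles)": [2.3, 4, 1.6, 15, 3.8, 0.75, 1.7],
    "phone prices (dollars)": [200, 485, 492, 512, 453, 503],
    "Clare's points": [17, 14, 16, 2, 13, 14, 15, 17],
    "Kiran's test scores (%)": [57, 82, 80, 85, 89, 84],
    "roller coaster heights (feet)": [415, 456, 423, 442, 30],
}

for name, values in datasets.items():
    print(f"{name}: outliers = {tukey_outliers(values)}")
```

Under this quartile convention, the rule flags 15, 200, 2, and 57 in the first four data sets and flags nothing in the roller coaster heights, even though 30 feet looks atypical. With only five values, the lower half is 30 and 415, so Q1 is 222.5 and the interquartile range is very wide. This is a useful case for discussing how the rule and a visual impression can disagree.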