The aim of this activity is to expose the limits of the mean in summarizing a data set that has gaps and values far from the center, and to motivate the need for another measure of center. Students first use a table of values and a dot plot to estimate a typical value for a data set. Then they calculate the mean and notice that it does not lie near the center of the data. A closer look helps them see that when a data set contains values that are far away from the bulk of the data, or when there are gaps in the data set, the mean can be a little or a lot higher or lower than what we would consider typical for the data.
Launch
Give students 2–3 minutes of quiet work time. Follow with a whole-class discussion.
Activity
Here are data that show the numbers of siblings of 10 students in Tyler’s class.
1, 0, 2, 1, 7, 0, 2, 0, 1, 10
Without making any calculations, estimate the center of the data based on your dot plot. What is a typical number of siblings for these sixth-grade students? Mark the location of that number on your dot plot.
Find the mean. How does the mean compare to the value that you marked on the dot plot as a typical number of siblings? (Is it a little larger, a lot larger, exactly the same, a little smaller, or a lot smaller than your estimate?)
Building on Student Thinking
Because previous lessons have used the mean as the best way to find a typical value, some students may go directly to that method from the beginning. Although this is valid at this stage, encourage them to look at the dot plot and think about what a typical value should be.
Activity Synthesis
The purpose of this discussion is to draw students’ attention to the idea that the mean may not always represent a typical value for a data set. First, invite a few students to share their estimate for a typical number of siblings. Then ask students what they calculated for the mean.
Discuss how the calculated mean compared to their estimates:
“Do you think the mean summarizes the data set well?” (No. Eight out of 10 of the data points are below the mean, and more than half of the students have either no siblings or only 1 sibling, so to say that 2.4 is a typical number of siblings is not accurate.)
“Why do you think the mean was higher than your estimate?” (Only two of the points are above the mean of 2.4 and both are quite far above it, and eight points are below 2.4, so the mean might not paint an accurate picture of what is typical in this situation.)
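The figures cited in these sample responses can be checked with a short calculation. The following is a minimal Python sketch, not part of the lesson materials; the variable names are illustrative.

```python
# Numbers of siblings for the 10 students in Tyler's class (from the task statement).
siblings = [1, 0, 2, 1, 7, 0, 2, 0, 1, 10]

# Mean: the sum of the values divided by the number of values.
mean_siblings = sum(siblings) / len(siblings)

# Count how many data points fall below and above the mean.
below = sum(1 for value in siblings if value < mean_siblings)
above = sum(1 for value in siblings if value > mean_siblings)

print(f"mean = {mean_siblings}")
print(f"points below the mean: {below}")
print(f"points above the mean: {above}")
```

The mean works out to 24 ÷ 10 = 2.4, with far more points below it than above it, which is why it sits higher than most students' estimates of a typical value.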
6.2
Activity
Standards Alignment
Building On
Addressing
6.SP.B.5.c
Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.
This activity introduces students to the term median. They learn that the median describes the middle value in an ordered list of data, and that it can capture what we consider typical for the data in some cases.
Students learn about the median through a kinesthetic activity. They line up in order of the number of letters in their name. Then, those at both ends of the line count off and sit down simultaneously until one or two people in the middle remain standing. If one person remains standing, that person has the median number of letters. If two people remain standing, the median is the mean or the average of their two values.
Students then practice identifying the median of other data sets, by analyzing both tables of values and dot plots.
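For teachers who want to check the count-off procedure before class, it can be sketched in a few lines of Python. This is only an illustration: the function name `median_by_counting_off` and the name lengths are invented for the example, and the sketch assumes the list is not empty.

```python
def median_by_counting_off(values):
    """Find the median the way the class does: line up in order, then
    remove one value from each end until one or two values remain."""
    line = sorted(values)      # students line up from fewest to most letters
    while len(line) > 2:
        line = line[1:-1]      # the students at both ends sit down
    # One value left: it is the median. Two left: take their mean.
    return sum(line) / len(line)

# Hypothetical name lengths, invented for illustration only.
odd_class = [9, 12, 7, 15, 11]         # 5 students
even_class = [9, 12, 7, 15, 11, 10]    # 6 students

print(median_by_counting_off(odd_class))    # middle of 7, 9, 11, 12, 15
print(median_by_counting_off(even_class))   # mean of the two middle values, 10 and 11
```

With an odd number of values one value is left standing, and with an even number the result is the mean of the last two, matching the two cases described for the class line.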
Launch
Explain to students that, instead of using the mean, sometimes we use the middle value in an ordered data set as a measure of center. We call this the median. Guide students through the activity:
Give each student an index card. Ask them to write their first and last names on the card and record the total number of letters in their name. Display an example for all to see.
Ask students to stand up, holding their index cards in front of them, and arrange themselves in order based on the number of letters in their name. (Consider asking students to do so without speaking at all to encourage collaboration.) Look for the student whose name has the fewest letters and ask that student to be the left end of the line. Ask the student with the longest full name to be the right end of the line. Students who have the same number of letters should stand side-by-side.
Tell students that, to find the median or the middle number, we will count off from both ends at the same time. Ask the students at the two ends of the line to say “1” at the same time and then to sit on the floor, and the students next to them to say “2” and then to sit down, and so on. Have students count off in this fashion until only one or two students are standing.
If the class has an odd number of students, one student will remain standing. Tell the class that this student’s number is the median. Give this student a sign that says “median.” If the class has an even number of students, two students will remain standing. The median will be the mean, or average, of their numbers. Ask both students to hold the sign that says “median.”
Explain that the median is the value that divides the data into 2 halves. Half of the data values are less than or equal to it and fall at or to the left of it on the number line, and half are greater than or equal to it and fall at or to the right of it.
Ask students to find the median a couple more times by changing the data set (for example, by asking a few students to leave the line or by adding new people who are not members of the class and who have extremely long or short names). Make sure that students have a chance to work with both odd and even numbers of values.
Collect the index cards and save them because they will be used again in the lesson on box plots.
Ask students to complete the rest of the questions on the Task Statement.
Representation: Develop Language and Symbols. Maintain a display of important terms and vocabulary. Invite students to suggest language or diagrams to include that will support their understanding of distributions. Terms may include “median.” Supports accessibility for: Conceptual Processing, Language
Activity
Your teacher will give you an index card. Write your first and last names on the card. Then record the total number of letters in your name. After that, pause for additional instructions from your teacher.
Here is a data set on numbers of siblings.
1, 0, 2, 1, 7, 0, 2, 0, 1, 10
Sort the data from least to greatest, and then find the median.
In this situation, do you think the median is a good measure of a typical number of siblings for this group? Explain your reasoning.
Here is the dot plot showing the travel time, in minutes, of Elena’s bus rides to school.
Find the median travel time. Be prepared to explain your reasoning.
What does the median tell us in this context?
Building on Student Thinking
When determining the median, students might group multiple data points that have the same value and treat them as a single point, instead of counting each one separately. Remind them that when they lined up to find the median number of letters in their names, every student counted off, even if their name had the same number of letters as their neighbor’s name.
Activity Synthesis
Select a few students to share their responses to the questions about number of siblings and Elena's travel times. Focus the discussion on the median as another measure of the center of a data set and whether it captures what students would estimate to be a typical value for each data set.
Emphasize to students that the median is a value and not an individual. For example, if the last person standing in the class has 10 letters in their name, the median is the number 10 and not the person standing. If there is another student who has 10 letters in their name, they might have switched places with the last person standing when lining up initially. Although the person standing could change, the median remains the same value of 10.
At this point, it is unnecessary to compare the mean and the median. Students will have many more opportunities to explore the median and think about how it differs from the mean in the upcoming activities.
6.3
Activity
In this lesson, students study distributions using dot plots and a histogram for which the mean and median can be the same, close, or far apart, and make conjectures about how the distributions affect the mean and median (MP7). Along the way, students recognize that the mean and median are equal or close when the distribution is roughly symmetrical and are farther apart when the distribution is non-symmetrical.
Students sort different distributions during this activity. A sorting task gives students opportunities to analyze representations, statements, and structures closely and to make connections (MP2, MP7).
As students work, encourage them to refine their descriptions of the distributions using more precise language and mathematical terms (MP6).
This activity uses the Critique, Correct, Clarify math language routine to advance representing and conversing as students critique and revise mathematical arguments.
Adjust the timing of this activity to 15 minutes. To move more quickly through the activity, consider shortening the initial discussion of students' categories for sorting the distributions.
Launch
Arrange students in groups of 2 and distribute pre-cut cards. Allow students to familiarize themselves with the representations on the cards.
Give students 1 minute to sort the cards into categories of their choosing.
Pause the class after students have sorted the cards.
Select groups to share their categories and how they sorted, or started sorting, their cards.
Discuss as many different types of categories as time allows.
Attend to the language that students use to describe their categories and distributions, giving them opportunities to describe their distributions more precisely. Highlight the use of terms like “symmetric” and “asymmetric.” After a brief discussion, invite students to complete the remaining questions.
If not mentioned by students, highlight that in three of the distributions, the mean and median of the data are approximately equal. In the other three distributions, the mean and median are quite different. Discuss:
“What do you notice about the shape and features of distributions that have a roughly equal mean and median?” (They are roughly symmetrical and each has one peak in the middle, with roughly the same number of values to the left and right. They may have gaps, but the gaps are somewhat evenly spaced out.)
“What about the shape and features of a distribution that has a very different mean and median?” (They are not at all symmetrical. They may have one peak, but it is off to one side, or they don't really show any peaks. They may have gaps or data values that are unusually high or low. There is more variability in these data sets.)
“In the second group, why might the mean and the median be so different?” (The mean is pulled toward the direction of unusually large or small values. The median simply tells us where the middle of the data lies when sorted, so it is not as affected by these values that are far from where most data points are.)
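The pull that far-away values exert on the mean can be illustrated with the sibling data from earlier in this lesson. The following is a hedged Python sketch using the standard `statistics` module; the comparison of the data with and without its two largest values is an illustration added here, not part of the card sort.

```python
from statistics import mean, median

# Sibling data from the earlier activity in this lesson.
all_students = [1, 0, 2, 1, 7, 0, 2, 0, 1, 10]

# The same data without the two values that are far from the rest (7 and 10).
without_far_values = [v for v in all_students if v < 7]

for label, data in [("all 10 values", all_students),
                    ("without 7 and 10", without_far_values)]:
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")
```

Including the two large values raises the mean from 0.875 to 2.4, while the median stays at 1 in both cases.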
Action and Expression: Develop Expression and Communication. To help get students started, display sentence frames such as “Cards in this category all have . . . .” or “The differences between these categories are . . . .” Supports accessibility for: Language, Organization
Activity
Activity Synthesis
Use the whole-class discussion to reinforce the idea that the distribution of a data set can tell us which measure of center best summarizes what is typical for the data set. Briefly review the answers to the statistical questions, and then focus the conversation on the last questions (how students knew which measure of center to use in each situation). Select a couple of students to share their responses. Discuss:
“For data sets with non-symmetrical distributions, why does the median turn out to be a better measure of center?” (Non-symmetrical data sets often have unusual values that pull the mean away from the center of data. The median is less influenced by these values.)
“Does it matter which measure we choose to describe a typical value? For example, in Card F, would it matter if we said that a typical age for the people who went on the field trip to D.C. was about 21 years old?” (Yes, it does matter in some cases. In that example, it wouldn’t really make sense to say that 21 years is a typical age because the vast majority of the people on the trip were teenagers.)
Use Critique, Correct, Clarify to give students an opportunity to improve a sample written response to the question, “When are median and mean likely to be close?”, by correcting errors, clarifying meaning, and adding details.
Display this first draft:
“They’re the same when most of the points are in the middle.”
Ask, “What parts of this response are unclear, incorrect, or incomplete?” As students respond, annotate the display with 2–3 ideas to indicate the parts of the writing that could use improvement.
Give students 2–4 minutes to work with a partner to revise the first draft.
Display and review these criteria:
Specific words and phrases: “symmetric” or “approximately symmetric” distributions
Select 1–2 individuals or groups to read their revised draft aloud slowly enough to record for all to see. Scribe as each student shares, then invite the whole class to contribute additional language and edits to make the final draft even more clear and more convincing.
Lesson Synthesis
In this lesson, we learned about another measure of center called the median. The discussion should focus on what the median is, how to find it, and why it is more useful in some situations.
“What is the median?” (The number in the middle of an ordered list of data.)
“How can we find it?” (We order the data values from least to greatest and find the value in the middle.)
“What does the median tell you about a data set? Why is it used as a measure of the center of a distribution?” (It tells us where to divide a data set so that half of the data points have that value or smaller values and the other half have that value or larger.)
“Why do we need another measure of center other than the mean?” (Sometimes the mean is not a good indication of what is typical for the data set.)
“For what kinds of distributions is the median the preferred measure of center?” (When there are a few values that are far from the center of the distribution or when the distribution is not very symmetric.)
Tell students that, in most situations, the mean is preferred if the two measures of center are close. The mean gives equal weight to each data point, which means there is less influence of bias. In some cases, though, a few very different values on one side or the other of the center shift the mean too far from the typical values to be useful. In these cases, the median is a better measure of center.
Student Lesson Summary
The median is another measure of center for a distribution. It is the middle value in a data set when values are listed in order. Half of the values in a data set are less than or equal to the median, and half of the values are greater than or equal to the median.
To find the median, we order the data values from least to greatest and find the number in the middle.
Suppose we have 5 dogs whose weights, in pounds, are shown in the table. The median weight for this group of dogs is 32 pounds because three dogs weigh less than or equal to 32 pounds and three dogs weigh greater than or equal to 32 pounds.
20, 25, 32, 40, 55
Now suppose we have 6 cats whose weights, in pounds, are listed here. Notice that there are 2 values in the middle: 7 and 8.
4, 6, 7, 8, 10, 10
The median weight must be between 7 and 8 pounds, because half of the cats weigh less than or equal to 7 pounds, and half of the cats weigh greater than or equal to 8 pounds.
When there is an even number of values, we take the number exactly in between the two middle values. In this case, the median cat weight is 7.5 pounds because $(7 + 8) \div 2 = 7.5$.
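The two cases, an odd and an even number of values, can be written out as one short procedure. This is a minimal Python sketch for illustration; the helper name `median_of` is an assumption of this example and does not come from the lesson.

```python
def median_of(values):
    """Return the middle value of the sorted data, or the mean of the
    two middle values when the number of values is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[middle]
    # Even count: the number exactly in between the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

dog_weights = [20, 25, 32, 40, 55]    # 5 values, in pounds
cat_weights = [4, 6, 7, 8, 10, 10]    # 6 values, in pounds

print(median_of(dog_weights))   # 32
print(median_of(cat_weights))   # 7.5
```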
The dot plot shows the number of stickers on 30 pages. The mean number of stickers is 21 (marked with a triangle). The median number of stickers is 20.5 (marked with a diamond).
A dot plot for stickers on a page. The numbers 8 through 34, in increments of 2, are indicated. A diamond is indicated at 20.5 stickers and a triangle is indicated at 21 stickers. Data are as follows: 9 stickers, 1 dot; 10 stickers, 1 dot; 11 stickers, 2 dots; 12 stickers, 1 dot; 14 stickers, 1 dot; 16 stickers, 2 dots; 17 stickers, 1 dot; 18 stickers, 2 dots; 19 stickers, 1 dot; 20 stickers, 3 dots; 21 stickers, 1 dot; 22 stickers, 3 dots; 23 stickers, 1 dot; 24 stickers, 2 dots; 26 stickers, 2 dots; 28 stickers, 1 dot; 30 stickers, 1 dot; 32 stickers, 2 dots; 33 stickers, 1 dot; 34 stickers, 1 dot.
In this case, both the mean and the median could describe a typical number of stickers on a page because they are fairly close to each other and to most of the data points.
Here is a different set of 30 pages with stickers. It has the same mean as the first set, but the median is 23 stickers.
A dot plot for “stickers on a page.” The numbers 8 through 34, in increments of 2, are indicated. A triangle is indicated at 21 stickers, and a diamond is indicated at 23 stickers. The data are as follows: 9 stickers, 1 dot; 10 stickers, 1 dot; 13 stickers, 1 dot; 14 stickers, 1 dot; 16 stickers, 1 dot; 17 stickers, 1 dot; 19 stickers, 1 dot; 20 stickers, 2 dots; 21 stickers, 2 dots; 22 stickers, 3 dots; 23 stickers, 6 dots; 24 stickers, 5 dots; 25 stickers, 4 dots; 26 stickers, 1 dot.
In this case, the median is closer to where most of the data points are clustered and is therefore a better measure of center for this distribution. That is, it is a better description of the typical number of stickers on a page. The mean number of stickers is influenced (in this case, pulled down) by a handful of pages with very few stickers, so it is farther away from most data points.
In general, when a distribution is symmetrical or approximately symmetrical, the mean and median values are close. But when a distribution is not roughly symmetrical, the two values tend to be farther apart.
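The means and medians quoted for the two sets of sticker pages can be reproduced from the dot plot data. The sketch below is an illustration in Python using the standard `statistics` module; the dictionaries transcribe the dot counts given in the two dot plot descriptions, and the helper name `expand` is invented for this example.

```python
from statistics import mean, median

# Each dictionary maps a number of stickers to how many pages (dots) have it,
# transcribed from the two dot plot descriptions.
first_set = {9: 1, 10: 1, 11: 2, 12: 1, 14: 1, 16: 2, 17: 1, 18: 2, 19: 1,
             20: 3, 21: 1, 22: 3, 23: 1, 24: 2, 26: 2, 28: 1, 30: 1, 32: 2,
             33: 1, 34: 1}
second_set = {9: 1, 10: 1, 13: 1, 14: 1, 16: 1, 17: 1, 19: 1, 20: 2, 21: 2,
              22: 3, 23: 6, 24: 5, 25: 4, 26: 1}

def expand(dot_counts):
    """Turn {value: number of dots} into a flat list of data values."""
    return [value for value, dots in dot_counts.items() for _ in range(dots)]

for name, dot_counts in [("first set", first_set), ("second set", second_set)]:
    pages = expand(dot_counts)
    print(f"{name}: {len(pages)} pages, mean = {mean(pages)}, "
          f"median = {median(pages)}")
```

Both sets contain 30 pages with a mean of 21 stickers, but the median moves from 20.5 in the roughly symmetric first set to 23 in the second set, where a handful of low values pull the mean down.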
Your teacher will give you six cards. Each has either a dot plot or a histogram. Sort the cards into 2 piles based on the distributions shown. Be prepared to explain your reasoning.
Discuss your sorting decisions with another group. Did you have the same cards in each pile? If so, did you use the same sorting categories? If not, how are your categories different?
Pause here for a class discussion.
Use the information on the cards to answer these questions.
Card A: What is a typical age of the dogs being treated at the animal clinic?
Card B: What is a typical number of people in the Irish households?
Card C: What is a typical travel time for the New Zealand students?
Card D: Would 15 years old be a good description of a typical age of the people who attended the birthday party?
Card E: Is 15 minutes or 24 minutes a better description of a typical time it takes the students in South Africa to get to school?
Card F: Would 21.3 years old be a good description of a typical age of the people who went on a field trip to Washington, D.C.?
How would you decide which measure of center to use for the dot plots on Cards A–C? What about for those on Cards D–F?