In this unit, students learn about populations and study variables associated with a population. They begin by classifying questions as either statistical or non-statistical, based on whether variable data is necessary to answer the question. This leads to further investigation into variability and data displays, such as dot plots and histograms. As students visualize data, they describe distributions more precisely by working with the mean and mean absolute deviation (MAD).
After working with those statistics, students begin to recognize that some distributions are not well-suited to description by mean and MAD. Students are introduced to median, range, and interquartile range as additional measures of center and variability that can be used to describe distributions in some situations. This work also introduces the box plot as an additional way to visualize data.
Note that mean absolute deviation is used as an introductory model for understanding variability. Although standard deviation is more mathematically useful, its calculation and meaning may be difficult for students at this level without an understanding of normal distributions. In later courses, once students have a deeper understanding of variability and have seen more kinds of distributions, they will learn about standard deviation and move beyond mean absolute deviation.
Box plots and dot plots for two sets of data: "pug weights in kilograms" and "beagle weights in kilograms". The numbers 6 through 11 are indicated and there are tick marks midway between the indicated numbers. Each box plot is above its corresponding dot plot. The approximate data for the box plot for "pug weights in kilograms" are as follows: Minimum value, 6. Maximum value, 8. Q1, 6.5. Q2, 7. Q3, 7.3. The approximate data for the dot plot for "pug weights in kilograms" are as follows: 6 kilograms, 1 dot. 6.2 kilograms, 2 dots. 6.4 kilograms, 2 dots. 6.6 kilograms, 2 dots. 6.8 kilograms, 2 dots. 7 kilograms, 3 dots. 7.2 kilograms, 3 dots. 7.4 kilograms, 1 dot. 7.6 kilograms, 2 dots. 7.8 kilograms, 1 dot. 8 kilograms, 1 dot. The approximate data for the box plot for "beagle weights in kilograms" are as follows: Minimum value, 9. Maximum value, 11. Q1, 9.6. Q2, 10. Q3, 10.5. The approximate data for the dot plot for "beagle weights in kilograms" are as follows: 9 kilograms, 1 x. 9.2 kilograms, 2 x's. 9.4 kilograms, 1 x. 9.6 kilograms, 3 x's. 9.8 kilograms, 1 x. 10 kilograms, 3 x's. 10.2 kilograms, 3 x's. 10.4 kilograms, 1 x. 10.6 kilograms, 2 x's. 10.8 kilograms, 2 x's. 11 kilograms, 1 x.
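As a rough check on the pug data described above, the following Python sketch rebuilds the list of weights from the dot plot and computes a five-number summary. The quartile method used here (Q1 and Q3 as the medians of the lower and upper halves of the data) is an assumption, and the names `pug_dots` and `five_number_summary` are illustrative only.

```python
from statistics import median

# Approximate pug weights in kilograms, read from the dot plot description
# above as (weight, number of dots) pairs.
pug_dots = [(6.0, 1), (6.2, 2), (6.4, 2), (6.6, 2), (6.8, 2), (7.0, 3),
            (7.2, 3), (7.4, 1), (7.6, 2), (7.8, 1), (8.0, 1)]
pug_weights = sorted(w for w, count in pug_dots for _ in range(count))


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumed quartile method: Q1 and Q3 are the medians of the lower and
    upper halves of the sorted data, leaving out the middle value when the
    number of values is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[len(data) - half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary(pug_weights)
# Round for display, since decimal weights are stored as binary floats.
print([round(x, 2) for x in summary])
```

Under that assumption, the rounded results agree with the approximate box plot values listed above (minimum 6, Q1 6.5, median 7, Q3 7.3, maximum 8). Other tools may use different quartile conventions and can give slightly different values for Q1 and Q3.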
Progression of Disciplinary Language
In this unit, teachers can anticipate students using language for mathematical purposes, such as justifying, representing, and interpreting. Throughout the unit, students will benefit from routines designed to grow robust disciplinary language, both for their own sense-making and for building shared understanding with peers. Teachers can formatively assess how students are using language in these ways, particularly when students are using language to:
Justify
Reasoning for matching data sets to questions (Lesson 2).
Reasoning about dot plots (Lesson 3).
Reasoning about mean and median (Lesson 13).
Reasoning about changes in mean and median (Lesson 14).
Reasoning about which information is needed (Lesson 17).
Which summaries and graphs best represent given data sets (Lesson 18).
Represent
Data using dot plots (Lessons 3 and 4).
Data using histograms (Lesson 7).
Mean using bar graphs (Lesson 9).
Data with five-number summaries (Lesson 15).
Data using box plots (Lesson 16).
Interpret
Dot plots (Lessons 4 and 11).
Histograms (Lessons 6 and 18).
Mean of a data set (Lesson 9).
Five-number summaries (Lesson 15).
Box plots (Lesson 16).
In addition, students are expected to critique the reasoning of others, describe how quantities are measured, describe and compare features and distributions of data sets, generalize about means and distances in data sets, generalize categories for sorting data sets, and generalize about statistical questions. Students are also expected to use language to compare questions that produce numerical and categorical data, compare dot plots and histograms, and compare histograms and bar graphs.
The table shows lessons where new terminology is first introduced in this course, including when students are expected to understand the word or phrase receptively and when students are expected to produce the word or phrase in their own speaking or writing. Terms that appear bolded are in the Glossary. Teachers should continue to support students’ use of a new term in the lessons that follow the one where it was first introduced.
lesson: new terminology (receptive, productive)
6.8.1: numerical data, categorical data, dot plot
6.8.2: statistical question, variability
6.8.3: distribution, frequency, bar graph
6.8.4: typical
6.8.5: center, spread, variability
6.8.6: histogram, bins, distribution, center
6.8.7: statistical question, spread
6.8.8: symmetrical, peak, cluster, unusual value, numerical data, categorical data, gap
6.8.9: average, mean, fair share
6.8.10: measure of center, balance point
6.8.11: mean absolute deviation (MAD), measure of spread, symmetrical, mean
6.8.12: mean absolute deviation (MAD), typical
6.8.13: median, measure of center
6.8.14: peak, cluster, unusual value
6.8.15: range, quartile, interquartile range (IQR), five-number summary
Comprehend and use the terms “numerical” and “categorical” to describe data sets.
Justify whether a question is “statistical” based on whether variability is expected in the data that could be collected.
Section Narrative
In this section, students collect data about themselves and their classmates, then classify survey questions in two ways. First, they distinguish the questions based on whether the information collected is numerical or categorical data. Then, they determine whether questions are statistical questions or not based on whether data they collect to answer the question shows variability. Students use dot plots to visualize numerical data in this section, but will look more closely at them in later sections.
A dot plot for "dog weights in kilograms". The numbers 5 through 40, in increments of 5, are indicated. The data are as follows: 6 kilograms, 1 dot. 7 kilograms, 3 dots. 10 kilograms, 2 dots. 32 kilograms, 1 dot. 35 kilograms, 2 dots. 36 kilograms, 1 dot.
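For teachers who want to generate a quick display from class data, here is a minimal Python sketch that turns the dog weights described above into a text dot plot. The function names are hypothetical, and the variability check is a deliberate simplification (it only asks whether the values differ at all).

```python
from collections import Counter

# Dog weights in kilograms, read from the dot plot description above.
dog_weights_kg = [6, 7, 7, 7, 10, 10, 32, 35, 35, 36]


def dot_plot_rows(values):
    """Return one text row per distinct value: the value, then one mark per occurrence."""
    counts = Counter(values)
    return [f"{value:>3} | {'*' * counts[value]}" for value in sorted(counts)]


def shows_variability(values):
    """Simplified check: the data vary if not every value is the same."""
    return len(set(values)) > 1


for row in dot_plot_rows(dog_weights_kg):
    print(row)
print("shows variability:", shows_variability(dog_weights_kg))
```

Each row lists a weight followed by one mark per dog, so the two separate groups of weights (6 to 10 kilograms and 32 to 36 kilograms) are easy to spot in the output.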
In this final section, students have the opportunity to apply their thinking from throughout the unit. Because this is a short section followed by an End-of-Unit Assessment, there are no section goals or checkpoint questions. The lesson in this section is optional because it offers additional opportunities to practice standards that are not a focus of the grade.
Describe a distribution represented by a dot plot, including informal observations about its center and spread.
Interpret a histogram to answer statistical questions about a data set.
Section Narrative
In this section, students focus on describing distributions. In particular, they learn to describe the center and spread of a distribution by using informal language to refer to a typical value for a distribution and how spread out the data are. They add histograms to the ways in which they can represent data, and use the visualization to describe features of a distribution such as clusters, peaks, gaps, and symmetry.
Note that in all histograms in this unit, the left-end boundary of each bin or interval is included and the right-end boundary is excluded. For example, the number 5 would not be included in the 0–5 bin, but would be included in the frequency count for the 5–10 bin. This is only a convention, so check any technology used to create histograms to determine if it matches this convention.
A histogram. The horizontal axis is labeled “dog weights in kilograms” and the numbers 10 through 35, in increments of 5, are indicated. On the vertical axis the numbers 0 through 10, in increments of 2, are indicated. The data represented by the bars are as follows: Weight from 10 up to 15, 5. Weight from 15 up to 20, 7. Weight from 20 up to 25, 10. Weight from 25 up to 30, 3. Weight from 30 up to 35, 5.
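The bin convention described in this section's narrative (left boundary included, right boundary excluded) can be sketched in Python as follows. The function `bin_counts` and the sample values are hypothetical, chosen only to show what happens to values that fall exactly on a boundary.

```python
def bin_counts(values, start, width, num_bins):
    """Count values per bin, where each bin includes its left boundary
    and excludes its right boundary."""
    counts = [0] * num_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < num_bins:  # ignore values outside all bins
            counts[index] += 1
    return counts


# Hypothetical data that includes the boundary values 5 and 10.
sample = [0, 4.9, 5, 5, 9.9, 10]
counts = bin_counts(sample, start=0, width=5, num_bins=3)
for i, count in enumerate(counts):
    print(f"{i * 5} up to {(i + 1) * 5}: {count}")
```

Here both values of 5 are counted in the 5–10 bin rather than the 0–5 bin, and 10 is counted in the 10–15 bin. Some tools handle boundaries differently. For example, NumPy's `histogram` function includes the right boundary of the last bin, so it is worth testing a boundary value in whatever technology is used.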
Compare the means and mean absolute deviations of different distributions.
Interpret the mean and mean absolute deviation (MAD) in the context of the data.
Section Narrative
In this section, students begin to quantify their understanding of center and spread by finding values for the mean and mean absolute deviation (MAD). The mean is interpreted both as a fair share and as a balance point, which gives additional intuition for this measure of center.
Then students see that distributions with the same mean can look very different, so a measure of variability is often needed to describe them. They use mean absolute deviation, the average distance between each data value and the mean, to describe the variability of a distribution in a meaningful way.
A dot plot for “number of stickers on a page.” The numbers 8 through 34, in increments of 2, are indicated. A triangle is indicated at 21 stickers. There are two perpendicular lines drawn, one at 18 stickers and the other at 21 stickers. A horizontal line between the two lines is labeled 3. The data are as follows: 18 stickers, 1 dot; 19 stickers, 3 dots; 20 stickers, 4 dots; 21 stickers, 5 dots; 22 stickers, 6 dots; 23 stickers, 2 dots; 24 stickers, 1 dot.
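To make the calculation concrete, the following Python sketch computes the mean and MAD for the sticker data described above. The second data set, `spread_out`, is hypothetical and is included only to illustrate two distributions that share a mean but differ in variability.

```python
# Number of stickers on a page, read from the dot plot description above
# as {value: number of dots}.
sticker_dots = {18: 1, 19: 3, 20: 4, 21: 5, 22: 6, 23: 2, 24: 1}
stickers = [value for value, count in sticker_dots.items() for _ in range(count)]


def mean(values):
    """Fair share: the total divided equally among all data values."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average distance between each data value and the mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)


print(mean(stickers))                               # 21.0
print(round(mean_absolute_deviation(stickers), 2))  # 1.18

# Hypothetical data set with the same mean but values much farther from it.
spread_out = [11, 16, 21, 26, 31]
print(mean(spread_out))                     # 21.0
print(mean_absolute_deviation(spread_out))  # 6.0
```

The mean of 21 matches the triangle marked on the dot plot, and the segment labeled 3 corresponds to the distance of the value 18 from that mean. Both data sets have a mean of 21, but the MAD of about 1.18 versus 6 shows how much more tightly the sticker data cluster around it.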
Compare and contrast distributions that are represented with box plots.
Interpret the median and interquartile range (IQR) in the context of the data.
Section Narrative
In this section, students add “median,” “range,” and “interquartile range” to the measures of center and variability that they can use to describe data. They use the symmetry of a distribution to determine whether mean or median is likely to be a better description of the center. Then they explore box plots as a way to visualize a summary of data using the five-number summary: the minimum, the maximum, the median, and 2 other quartiles. Students compare dot plots, histograms, and box plots to determine which represent the distributions best and what information is readily available from each.
Two sets of box plots for "lengths in millimeters". The numbers 4 through 16 are indicated in increments of 2. There are tick marks midway between the indicated numbers. The top box plot is for "ladybugs". The five-number summary is as follows: Minimum value, 6. Maximum value, 10.5. Q1, 8.5. Q2, 9. Q3, 10. The bottom box plot is for "beetles". The five-number summary is as follows: Minimum value, 5. Maximum value, 15.5. Q1, 7.5. Q2, 9. Q3, 13.5.
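One way to compare the two box plots described above is to compute the range and interquartile range from each five-number summary. The Python sketch below does this. The class name `FiveNumberSummary` and its properties are illustrative assumptions, not part of the course materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveNumberSummary:
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def data_range(self) -> float:
        """Distance between the maximum and the minimum."""
        return self.maximum - self.minimum

    @property
    def iqr(self) -> float:
        """Interquartile range: distance between Q3 and Q1."""
        return self.q3 - self.q1


# Lengths in millimeters, from the box plot descriptions above.
ladybugs = FiveNumberSummary(6, 8.5, 9, 10, 10.5)
beetles = FiveNumberSummary(5, 7.5, 9, 13.5, 15.5)

for name, s in [("ladybugs", ladybugs), ("beetles", beetles)]:
    print(f"{name}: median {s.median}, range {s.data_range}, IQR {s.iqr}")
```

Both groups have a median length of 9 millimeters, but the beetles have a range of 10.5 and an IQR of 6, compared with 4.5 and 1.5 for the ladybugs. The beetle lengths are therefore much more variable even though the centers match.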