Here are dot plots that show the ages of people at two different parties. The mean of each distribution is marked with a triangle.
Data set A
<p>A dot plot from 5 to 45 by 5’s. Age in years, labeled data set A. There is a red triangle indicated at 15, and the data are as follows: 8 years, 12 dots. 10 years, 3 dots. 12 years, 1 dot. 15 years, red triangle. 36 years, 1 dot. 42 years, 1 dot. 44 years, 2 dots.</p>
Data set B
<p>A dot plot from 5 to 45 by 5’s. Age in years, labeled data set B. The data are as follows: 7 years, 1 dot. 8 years, 1 dot. 9 years, 1 dot. 10 years, 2 dots. 15 years, 1 dot. 16 years, 1 dot. 20 years, 2 dots and 1 red triangle. 22 years, 1 dot. 23 years, 1 dot. 24 years, 1 dot. 28 years, 1 dot. 30 years, 1 dot. 33 years, 1 dot. 35 years, 1 dot. 38 years, 1 dot. 42 years, 1 dot.</p>
What do you notice and what do you wonder about the distributions in the two dot plots?
7.2
Activity
Here are the ages of the people at one party, listed from least to greatest.
7
8
9
10
10
11
12
15
16
20
20
22
23
24
28
30
33
35
38
42
Split the data into 4 equal parts using 3 values called quartiles.
Find the median of the data set and label it Q2 for second quartile. This splits the data into an upper half and a lower half.
Find the middle value of the lower half of the data, without including the median. Label this value Q1 for first quartile.
Find the middle value of the upper half of the data, without including the median. Label this value Q3 for third quartile.
Label the least value in the set “minimum” and the greatest value “maximum.”
The values you have identified make up the five-number summary for the data set. Record them here.
The median of this data set is 20. This tells us that half of the people at the party were 20 years old or younger, and the other half were 20 or older. What does each of these other values tell us about the ages of the people at the party?
The third quartile
The minimum
The maximum
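The quartile steps in this activity can be sketched in a few lines of Python as a way to check work done by hand. This is a minimal sketch, not part of the lesson: the function name `five_number_summary` is made up for illustration, and it assumes the list has at least two values. It follows the method described above, leaving the median out of both halves when the number of values is odd.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using the halves method.

    Hypothetical helper for illustration; assumes at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # lower half of the data
    upper = data[half + n % 2:]     # upper half; skips the median when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Ages of the people at the party, from the list in this activity.
ages = [7, 8, 9, 10, 10, 11, 12, 15, 16, 20,
        20, 22, 23, 24, 28, 30, 33, 35, 38, 42]

print(five_number_summary(ages))
```

When a half has an even number of values, `median` returns the number halfway between the two middle values, which matches how the median of the whole data set is found.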
7.3
Activity
Your teacher will give you the data on the lengths of names of students in your class. Write the five-number summary by finding the data set's minimum, Q1, Q2, Q3, and the maximum.
Pause for additional instructions from your teacher.
7.4
Activity
Twenty people participate in a study about blinking. The number of times each person blinked while watching a video for one minute is recorded. The data values are shown here, in order from smallest to largest.
3
6
8
11
11
13
14
14
14
14
16
18
20
20
20
22
24
32
36
51
Here is a dot plot showing these data.
Find the median (Q2) and mark its location on the dot plot.
Find the first quartile (Q1) and the third quartile (Q3). Mark their locations on the dot plot.
What are the minimum and maximum values?
A box plot can be used to represent the five-number summary graphically. Let’s draw a box plot for the number-of-blinks data. Above the dot plot:
Draw a box that extends from the first quartile (Q1) to the third quartile (Q3). Label the quartiles.
At the median (Q2), draw a vertical line from the top of the box to the bottom of the box. Label the median.
From the left side of the box (Q1), draw a horizontal line (a whisker) that extends to the minimum of the data set. On the right side of the box (Q3), draw a similar line that extends to the maximum of the data set.
Compare the information that can be quickly understood from each representation.
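The drawing steps above can also be written out as a short Python sketch that reports where each part of the box plot belongs on the number line. This is only an illustration under stated assumptions: `box_plot_parts` and its dictionary keys are hypothetical names, not part of the lesson, and the quartiles are found with the same halves method used in the earlier activity.

```python
from statistics import median

def box_plot_parts(values):
    """Return where each piece of a box plot sits on the number line.

    Hypothetical helper for illustration; assumes at least two values.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]               # lower half of the data
    upper = data[n // 2 + n % 2 :]       # upper half; skips the median when n is odd
    q1, q2, q3 = median(lower), median(data), median(upper)
    return {
        "left_whisker": (data[0], q1),   # from the minimum to Q1
        "box": (q1, q3),                 # from Q1 to Q3
        "median_line": q2,               # vertical line inside the box
        "right_whisker": (q3, data[-1]), # from Q3 to the maximum
    }

# Number of blinks for the twenty people in the study.
blinks = [3, 6, 8, 11, 11, 13, 14, 14, 14, 14,
          16, 18, 20, 20, 20, 22, 24, 32, 36, 51]

parts = box_plot_parts(blinks)
for name, position in parts.items():
    print(name, position)
```

The printed positions can be compared with the box plot drawn by hand above the dot plot.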
Student Lesson Summary
Earlier we learned that the mean is a measure of the center of a distribution and the MAD is a measure of the variability (or spread) that goes with the mean. There is also a measure of spread that goes with the median. It is called the interquartile range (IQR).
Finding the IQR involves splitting a data set into fourths. Each of the three values that split the data into fourths is called a quartile. For example, here is a data set with 11 values.
12
19
20 (Q1)
21
22
33 (Q2)
34
35
40 (Q3)
40
49
The median, or second quartile (Q2), splits the data into two equal halves. For this data set, the median is 33.
The first quartile (Q1) is the middle value of the lower half of the data. For this data set, the first quartile is 20.
The third quartile (Q3) is the middle value of the upper half of the data. For this data set, the third quartile is 40.
The difference between the maximum and minimum values of a data set is the range. For this data set, the range is 37 because $49 - 12 = 37$.
The difference between Q3 and Q1 is the interquartile range (IQR). For this data set, the IQR is 20 because $40 - 20 = 20$. Because the distance between Q1 and Q3 includes the middle two-fourths of the distribution, the values between those two quartiles are sometimes called the middle half of the data.
The bigger the IQR, the more spread out the middle half of the data values are. The smaller the IQR, the closer together the middle half of the data values are. This is why we can use the IQR as a measure of spread.
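As a quick check of the range and IQR for the 11-value data set in this summary, here is a minimal Python sketch. The variable names are chosen for illustration only, and the halves are formed the way the summary describes, with the median left out of both halves.

```python
from statistics import median

data = sorted([12, 19, 20, 21, 22, 33, 34, 35, 40, 40, 49])
n = len(data)

lower_half = data[: n // 2]            # 12, 19, 20, 21, 22
upper_half = data[n // 2 + n % 2 :]    # 34, 35, 40, 40, 49 (median 33 left out)

q1 = median(lower_half)                # first quartile
q3 = median(upper_half)                # third quartile

data_range = data[-1] - data[0]        # maximum minus minimum
iqr = q3 - q1                          # third quartile minus first quartile

print(data_range)  # 37
print(iqr)         # 20
```

The range uses only the two extreme values, while the IQR depends only on the middle half of the data, so a single unusually large or small value changes the range but usually not the IQR.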
A five-number summary can be used to summarize a distribution. It includes the minimum, first quartile, median, third quartile, and maximum of the data set. For the previous example, the five-number summary is 12, 20, 33, 40, and 49. These numbers are marked with diamonds on the dot plot.
A dot plot. The numbers 10 through 50, in increments of 5, are indicated. There are diamonds indicated at 12, 20, 33, 40 and 49. The data are as follows: 12, 1 dot; 19, 1 dot; 20, 1 dot; 21, 1 dot; 22, 1 dot; 33, 1 dot; 34, 1 dot; 35, 1 dot; 40, 2 dots; 49, 1 dot.
A box plot represents the five-number summary of a data set.
A box plot. The numbers 10 through 50, in increments of 5, are indicated. The data are as follows: Minimum value, 12. Maximum value, 49. Q1, 20. Q2, 33. Q3, 40.
It shows the first quartile (Q1) and the third quartile (Q3) as the left and right sides of a rectangle, or a box. The median (Q2) is shown as a vertical segment inside the box. On the left side, a horizontal line segment, sometimes called a whisker, extends from Q1 to the minimum value. On the right, a whisker extends from Q3 to the maximum value.
The rectangle in the middle represents the middle half of the data. Its width is the IQR. The whiskers represent the bottom quarter and the top quarter of the data set.
A box plot is a way to represent data on a number line with a box and some lines. The data are divided into four sections by five values: the minimum, first quartile, median, third quartile, and maximum.
The interquartile range is one way to measure how spread out a data set is. To find the IQR, subtract the first quartile (Q1) from the third quartile (Q3).
For example, the IQR of this data set is 20 because $50 - 30 = 20$.
22
29
30 (Q1)
31
32
43 (Q2)
44
45
50 (Q3)
50
59
Quartiles are the numbers that divide a data set into four sections. Each section has the same number of data values.
In this data set, the first quartile (Q1) is 30. The second quartile (Q2) is the median, 43. The third quartile (Q3) is 50.
22
29
30 (Q1)
31
32
43 (Q2)
44
45
50 (Q3)
50
59
The range is the distance between the smallest and largest values in a data set.
In the data set 3, 5, 6, 8, 11, 12, the range is 9, because $12 - 3 = 9$.