GNED 1103

Task 4

Winter 2023

Question 8 [2 marks] Assume the following min, max, measures of centrality and measures of spread values were calculated based upon a single data set. This data set has a normal distribution. Based upon these facts, answer the following questions.

Mean=100, Median=100, Mode=100, Min=0, Max = 200, Q1=50, Q3=150, SD = 30

A. Which measure(s) of centrality would be the most accurate to use when analyzing the data set? Indicate

the name and number as well as an explanation of how you determined your choice.

B. Provide the proper 2 (two) most accurate measures of spread to use when analyzing the data set.

Indicate the name and number as well as an explanation of how you determined your choice. Note: This question may require you to do some calculations. If you do any calculations, clearly show the full work of how you did the calculations.

Question 9 [2 marks] Assume the following min, max, measures of centrality and measures of spread values were calculated based upon a single data set. This data set has a negative skewed distribution. Based upon these facts, answer the following questions.

Mean=50, Median=100, Mode=120, Min=0, Max = 200, Q1=55, Q3=120, SD = 10

A. Which measure(s) of centrality would be the most accurate to use when analyzing the data set? Indicate

the name and number as well as an explanation of how you determined your choice.

B. Provide the proper 2 (two) most accurate measures of spread to use when analyzing the data set.

Indicate the name and number as well as an explanation of how you determined your choice. Note: This question may require you to do some calculations. If you do any calculations, clearly show the full work of how you did the calculations.

Question 10 [2 marks] Assume the following min, max, measures of centrality and measures of spread values were calculated based upon a single data set. This data set has a positive skewed distribution. Based upon these facts, answer the following questions.

Mean=120, Median=110, Mode=90, Min=0, Max = 200, Q1=80, Q3=125, SD = 20

A. Which measure(s) of centrality would be the most accurate to use when analyzing the data set? Indicate

the name and number as well as an explanation of how you determined your choice.

B. Provide the proper 2 (two) most accurate measures of spread to use when analyzing the data set.

Indicate the name and number as well as an explanation of how you determined your choice. Note: This question may require you to do some calculations. If you do any calculations, clearly show the full work of how you did the calculations.


Question 11 [4 marks] The following questions are based upon the box and whisker diagrams shown below. For each question, you must complete the calculation and show all your work to get full marks.

a. AA MAX – BB Q1 = ?

b. CC MIN – BB Q3 = ?

c. CC MEDIAN – AA MEDIAN = ?

d. BB Q3 – BB MEDIAN = ?


Guidance

• Question 1

o See notes “05 The Scientific method – Part 2” – slides 52 to 77

• Question 2 to Question 11

o See notes “06 Statistical Reasoning Part 1”

▪ Measures of Central Tendency – slides 44 to 73

▪ Types of distributions – slides 74 to 79

▪ Measures of Spread – slides 80 to 110

▪ Box plot/ box and whisker diagram – slides 87 to 91

Quick definitions

• What is Data?
o Data is a collection of information (numbers, words, measurements, observations) about facts, figures and statistics collected together for analysis.

• What is a Data set?
o Data set (or dataset) refers to any organised collection of data. The data set lists values for each of the variables and for each member of the dataset. Each value is known as a datum.

• Descriptive Statistics
o Descriptive statistics allow a scientist to quickly sum up major attributes of a dataset using measures such as the mean, median, mode, and standard deviation.

• Measures of Central Tendency
o Measures of central tendency are used to describe what is typical for a set of data.
o The mean, median and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others.

• Measures of spread
o The purpose of identifying a “central” value from a data set is to describe a typical value in the data set. Once we know this, we can measure the amount of dispersion or spread of the data values from the typical, central, value.
o Three main measures of dispersion for a data set are the range, standard deviation, and quartiles.

• Distribution of a data set
o The distribution of a statistical dataset is the spread of the data, which shows all possible values or intervals of the data and how often they occur.

• Box and whisker plot (or Box Plot)
o A box and whisker plot, also called a box plot, displays the five-number summary of a set of data. The five-number summary is the

minimum, first quartile, median, third quartile, and maximum.
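
As a rough illustration of the five-number summary, the Python sketch below computes it with the standard library. The function name `five_number_summary` and the sample values are made up for illustration. The quartiles use the `inclusive` method of `statistics.quantiles`, which is an assumption: other quartile conventions, including the one taught in the course notes, can give slightly different Q1 and Q3 values.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(data)
    # quantiles(n=4) returns the three cut points Q1, Q2 (the median) and Q3.
    # The "inclusive" method is an assumption; other methods exist.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical sample data, not taken from the assignment.
sample = [0, 10, 20, 30, 40]
print(five_number_summary(sample))
```

A box plot draws the box from Q1 to Q3, a line at the median, and whiskers out to the minimum and maximum.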


Rules to follow

• Determining the type of distribution based upon the measures of centrality (Mean, Median, Mode)

o Normal/Symmetric distribution

▪ Mode ≈ Mean ≈ Median or Mean ≈ Median

• Means they are the same or only slightly different from each other

• For this course I will define “slightly different” to mean a difference of at most 2 from each other (so a difference below or equal to 2 means similar, anything above 2 means not similar)

o Positively or right skewed distribution (Mean will be greater than the Median)

▪ First you must determine that the data set does not have a normal distribution. If a data set

does not have a normal distribution it has a skewed distribution.

▪ Mean > Median > Mode or Mean > Median

o Negatively or left skewed distribution (Mean will be less than the Median)

▪ First you must determine that the data set does not have a normal distribution. If a data set

does not have a normal distribution it has a skewed distribution.

▪ Mean < Median < Mode or Mean < Median
• How to determine the proper measure of centrality to use for this course based upon the distribution of a data set
o Normal/symmetric distribution
▪ You can use any measure of centrality (Mean or Median for this course). You can only use one of
the values, so pick either the mean or the median.
o Positively/Right skewed distribution
▪ Use the median
o Negative/Left skewed distribution
▪ Use the median
• How to determine the proper measure of spread to use for this course based upon the chosen measure of centrality
o If you are using the mean as a measure of centrality, then you have to use the standard deviation as a
measure of spread
▪ Lower value = mean – standard deviation
▪ Upper value = mean + standard deviation
o If you are using the median as a measure of centrality, then you have to use the quartiles as a measure
of spread.
▪ Q1 and Q3 values
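
The selection rules above can be sketched in Python as follows. This is only an illustration of the course rules, not an official marking tool: the function name `pick_measures`, the distribution labels, the dictionary keys and the `prefer` parameter are all assumptions made for this sketch. The sample numbers are the ones used in the handout's normal/symmetric worked example.

```python
def pick_measures(distribution, stats, prefer="mean"):
    """Choose the measure of centrality and the matching measure of spread.

    distribution: "normal", "positive" or "negative" (labels assumed here).
    stats: dict with the keys "mean", "median", "q1", "q3" and "sd".
    prefer: "mean" or "median"; only matters for a normal distribution,
            where the course lets you pick either one.
    Returns (name of centrality measure, its value, (lower, upper) spread).
    """
    if distribution not in ("normal", "positive", "negative"):
        raise ValueError(f"unknown distribution: {distribution!r}")
    if prefer not in ("mean", "median"):
        raise ValueError(f"unknown preference: {prefer!r}")

    if distribution == "normal" and prefer == "mean":
        # Mean goes with the standard deviation: mean - SD and mean + SD.
        centre = stats["mean"]
        return "mean", centre, (centre - stats["sd"], centre + stats["sd"])

    # Skewed data (or a normal data set where the median was picked):
    # the median goes with the quartiles Q1 and Q3.
    return "median", stats["median"], (stats["q1"], stats["q3"])

example = {"mean": 60, "median": 60, "q1": 41, "q3": 73, "sd": 12}
print(pick_measures("normal", example, prefer="mean"))
print(pick_measures("normal", example, prefer="median"))
```

Passing "positive" or "negative" always returns the median with Q1 and Q3, whatever `prefer` is set to, which mirrors the rule that skewed data sets use the median.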
Examples for determining the type of distribution based upon the given measures of centrality
o Mean=200, Median=400
▪ | Mean – Median | = | 200 – 400 | = | -200 | = 200 (not exactly the same)
▪ 200 is greater than 2 so it is skewed. Since it is skewed, we need to determine the type of skew by comparing the mean and median.
▪ Mean is less than the Median (mean < median, i.e. 200 < 400), so we have a Negative/Left skewed distribution
o Mean=303, Median=300
▪ | Mean – Median | = | 303 – 300 | = | 3 | = 3 (not exactly the same)
▪ 3 is greater than 2 so it is skewed. Since it is skewed, we need to determine the type of skew by comparing the mean and median.
▪ Mean is greater than the Median (mean > median, i.e. 303 > 300), so we have a Positive/Right skewed distribution

o Mean=298, Median=300

▪ | Mean – Median | = | 298 – 300 | = | -2 | = 2 (not exactly the same)

▪ 2 is less than or equal to 2 so it is Normal (the two values are near to each other)

▪ Both values are near to each other, so we have a Normal/symmetric distribution

o Mean=300, Median=299

▪ | Mean – Median | = | 300 – 299 | = | 1 | = 1 (not exactly the same)

▪ 1 is less than or equal to 2 so it is Normal (the two values are near to each other)

▪ Both values are near to each other, so we have a Normal/symmetric distribution

o Mean=300, Median=300

▪ | Mean – Median | = | 300 – 300 | = | 0 | = 0 (exactly the same)

▪ 0 is less than or equal to 2 so it is Normal (the two values are near to each other)

▪ Both values are exactly the same, so we have a Normal/symmetric distribution

o Mean=200, Median=400, Mode=450

▪ | Mean – Median | = | 200 – 400 | = | -200 | = 200 (not exactly the same)

▪ | Mean – Mode | = | 200 – 450 | = | -250 | = 250 (not exactly the same)

▪ | Median – Mode | = | 400 – 450 | = | -50 | = 50 (not exactly the same)

▪ All three values from above are greater than 2 so it is skewed. Since it is skewed, we need to determine

the type of skew by comparing the mean and median.

▪ Mean is less than the Median and the Median is less than the Mode (mean < median < mode, i.e. 200 < 400 < 450), so we have a Negative/Left skewed distribution
o Mean=303, Median=300, Mode=296
▪ | Mean – Median | = | 303 – 300 | = | 3 | = 3 (not exactly the same)
▪ | Mean – Mode | = | 303 – 296 | = | 7 | = 7 (not exactly the same)
▪ | Median - Mode | = | 300 – 296 | = | 4 | = 4 (not exactly the same)
▪ All three values from above are greater than 2 so it is skewed. Since it is skewed, we need to determine
the type of skew by comparing the mean and median.
▪ Mean is greater than the Median and the Median is greater than the Mode (mean > median > mode, i.e. 303 > 300 > 296), so we have a Positive/Right skewed distribution

o Mean=298, Median=300, Mode=300

▪ | Mean – Median | = | 298 – 300 | = | -2 | = 2 (not exactly the same)

▪ | Mean – Mode | = | 298 – 300 | = | -2 | = 2 (not exactly the same)

▪ | Median – Mode | = | 300 – 300 | = | 0 | = 0 (exactly the same)

▪ All three values from above are less than or equal to 2 so it is Normal (the three values are near to each

other)

▪ All three values are near to each other, so we have a Normal/symmetric distribution

o Mean=300, Median=299, Mode=301

▪ | Mean – Median | = | 300 – 299 | = | 1 | = 1 (not exactly the same)

▪ | Mean – Mode | = | 300 – 301 | = | -1 | = 1 (not exactly the same)

▪ | Median – Mode | = | 299 – 301 | = | -2 | = 2 (not exactly the same)

▪ All three values from above are less than or equal to 2 so it is Normal (the three values are near to each

other)

▪ All three values are near to each other, so we have a Normal/symmetric distribution

o Mean=300, Median=300, Mode=300

▪ | Mean – Median | = | 300 – 300 | = | 0 | = 0 (exactly the same)

▪ | Mean – Mode | = | 300 – 300 | = | 0 | = 0 (exactly the same)

▪ | Median – Mode | = | 300 – 300 | = | 0 | = 0 (exactly the same)

▪ All three values from above are less than or equal to 2 so it is Normal (the three values are near to each

other)

▪ All three values are near to each other, so we have a Normal/symmetric distribution
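
The comparison used in the examples above can be sketched as a small Python function. The name `classify_distribution` and the `tolerance` parameter are assumptions made for this sketch; the default of 2 is the course's definition of "slightly different". The mode is left out on purpose, since the rules list Mean ≈ Median on its own as sufficient and decide the direction of skew by comparing the mean and median. How to treat a data set where only some of the three differences exceed 2 is not covered by the handout, so this sketch does not attempt it.

```python
def classify_distribution(mean, median, tolerance=2):
    """Classify a data set from its mean and median using the course rule."""
    difference = abs(mean - median)
    if difference <= tolerance:
        # A difference of 2 or less counts as "near to each other".
        return "Normal/symmetric"
    if mean > median:
        return "Positive/Right skewed"
    return "Negative/Left skewed"

# Mean/median pairs taken from the worked examples in this handout.
for mean, median in [(200, 400), (303, 300), (298, 300), (300, 299), (300, 300)]:
    print(f"Mean={mean}, Median={median}: {classify_distribution(mean, median)}")
```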


Examples for determining the proper measure of centrality and measures of spread based upon the type of distribution

o This data set has a negative skewed distribution. Based upon these facts, answer the following questions (Mean=25, Median=50, Mode=60, Min=0, Max = 100, Q1=20, Q3=60, SD = 5)
▪ A. Measure(s) of centrality
• Since we have a skewed distribution, the best measure of centrality to use is the median.
• The median value for this data set is 50
▪ B. Measures of spread
• Since we have a skewed distribution, the best measure of centrality to use is the median.
• Since the median is the best measure of centrality, we need to use the quartiles (Q1 and Q3).
• Q1 = 20, Q3 = 60

o This data set has a positive skewed distribution. Based upon these facts, answer the following questions (Mean=60, Median=51, Mode=25, Min=0, Max = 100, Q1=22, Q3=63, SD = 7)
▪ A. Measure(s) of centrality
• Since we have a skewed distribution, the best measure of centrality to use is the median.
• The median value for this data set is 51
▪ B. Measures of spread
• Since we have a skewed distribution, the best measure of centrality to use is the median.
• Since the median is the best measure of centrality, we need to use the quartiles (Q1 and Q3).
• Q1 = 22, Q3 = 63

o This data set has a normal/symmetric distribution. Based upon these facts, answer the following questions (Mean=60, Median=60, Mode=60, Min=0, Max = 100, Q1=41, Q3=73, SD = 12)

▪ A. Measure(s) of centrality

• Since we have a normal distribution we can pick either Mean or Median for this course. I

can only use one of the values, so in this case I pick the median.

• The median value for this data set is 60

▪ B. Measures of spread

• Since the median is the measure of centrality we chose, we need to use the quartiles

(Q1 and Q3)

• Q1 = 41, Q3 = 73

o This data set has a normal/symmetric distribution. Based upon these facts, answer the following questions (Mean=60, Median=60, Mode=60, Min=0, Max = 100, Q1=41, Q3=73, SD = 12)

▪ A. Measure(s) of centrality

• Since we have a normal distribution we can pick either Mean or Median for this course. I

can only use one of the values, so in this case I pick the mean.

• The mean value for this data set is 60

▪ B. Measures of spread

• Since the mean is the measure of centrality we chose, we need to calculate the upper

and lower standard deviation values.

• Lower value = mean – standard deviation = 60 – 12 = 48

• Upper value = mean + standard deviation = 60 + 12 = 72


Examples Using box and whisker diagram

a. BB MIN – AA MIN = 10 – 0 = 10

b. AA MAX – BB MAX = 50 – 50 = 0

c. AA Q3 – BB Q3 = 38 – 36 = 2

d. BB MEDIAN – AA MEDIAN = 30 – 25 = 5

e. AA Q1 – BB Q1 = 13 – 24 = -11
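
The same subtractions can be checked with a short Python sketch. The five-number summaries for AA and BB below are the values that appear in the worked answers above; the dictionary layout and the helper name `difference` are assumptions made for this illustration.

```python
# Five-number summaries read from the example diagrams (values used above).
summaries = {
    "AA": {"min": 0, "q1": 13, "median": 25, "q3": 38, "max": 50},
    "BB": {"min": 10, "q1": 24, "median": 30, "q3": 36, "max": 50},
}

def difference(first_plot, first_stat, second_plot, second_stat):
    """Subtract one box plot value from another, e.g. AA Q3 - BB Q3."""
    return summaries[first_plot][first_stat] - summaries[second_plot][second_stat]

print("BB MIN - AA MIN =", difference("BB", "min", "AA", "min"))
print("AA Q1 - BB Q1 =", difference("AA", "q1", "BB", "q1"))
```

Note that the order of subtraction matters: a negative result, as in the last example, is a valid answer and should be written with its sign.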


Rubric – Breakdown for how each question will be marked

Criteria

Question 1
• 0 marks: No answer given, or fully incorrect answer.
• 1 to 5 marks: Could be due to one or more of the following: Missing or incorrect type of evidence (there are three for this course). Definitions are not in relation to cause and effect events (must not be about variables). Examples provided are incorrect or do not relate to Mount Royal University specifically. Does not include the step during the scientific method where each is established.
• 6 marks: Correct answer for question with all required information.

Question 2 to Question 7
• 0 marks: No answer given, or fully incorrect answer, or no math/work shown.
• 1 mark: Could be due to one or more of the following: Incorrect distribution type indicated, small error in math, incorrect math used to determine distribution type.
• 2 marks: Correct answer for question with all required information.

Question 8 to Question 10, Part A
• 0 marks: No answer given, or fully incorrect answer, or no math/work shown.
• 1 mark: Correct answer for question with all required information. Includes correct measure of centrality and full explanation given.

Question 8 to Question 10, Part B
• 0 marks: No answer given, or fully incorrect answer, or no math/work shown.
• 1 mark: Correct answer for question with all required information. Includes correct two measures of spread and full explanation given.

Question 11, A to D
• 0 marks: No answer given, or fully incorrect answer, or no math/work shown.
• 1 mark: Correct answer for question with all required information. Includes correct number and all work fully shown.

