Data 205

Lab Instructor: Steven Kurniawan

Spring 2023

Unit 4 Homework: Data Distributions

(42 points)

Your Name

For this homework, we will use R's built-in data. R comes with several built-in data sets related to the 50 states of the United States of America. Professor XU has combined these data sets into a single CSV file named “us_states.csv”. Below is a list of the variables in this data:

• name: the full state names.
• abb: 2-letter abbreviations for the state names.
• region: the geographic region (Northeast, South, North Central, West) that each state belongs to.
• division: the geographic division (New England, Middle Atlantic, South Atlantic, East South Central, West South Central, East North Central, West North Central, Mountain, and Pacific) that each state belongs to.
• population: population estimate as of July 1, 1975.
• income: per capita income (1974).
• illiteracy: illiteracy (1970, percent of population).
• life_exp: life expectancy in years (1969–71).
• murder: murder and non-negligent manslaughter rate per 100,000 population (1976).
• hs_grad: percent high-school graduates (1970).
• frost: mean number of days with minimum temperature below freezing (1931–1960) in capital or large city.
• area: land area in square miles.

Exercise 1: Reviewing Variable Labels and Values (14 points)

Let’s start by taking a look at the structure of the U.S. states dataset and what’s included in it. To do this, we use the str() command.
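
The call can be sketched as follows. This is only an illustration under stated assumptions: the course file us_states.csv is not reproduced here, so the sketch rebuilds a comparable table from the state data sets that ship with R, using the column names from the variable list above. The object name `states` is our own choice. With the course file you would load the data with read.csv("us_states.csv") instead; in that case region and division would typically arrive as character columns rather than factors.

```r
# Assumption: stand-in for read.csv("us_states.csv"), rebuilt from the
# state data sets bundled with R (state.name, state.abb, state.region,
# state.division, state.x77).
states <- data.frame(
  name       = state.name,
  abb        = state.abb,
  region     = state.region,
  division   = state.division,
  population = state.x77[, "Population"],
  income     = state.x77[, "Income"],
  illiteracy = state.x77[, "Illiteracy"],
  life_exp   = state.x77[, "Life Exp"],
  murder     = state.x77[, "Murder"],
  hs_grad    = state.x77[, "HS Grad"],
  frost      = state.x77[, "Frost"],
  area       = state.x77[, "Area"]
)

# str() reports the number of observations and variables, then lists each
# variable's name, storage type, and first few values.
str(states)
```

The first line of the str() output gives the counts of observations and variables; the remaining lines show the type of each variable, which is a useful starting point for deciding its level of measurement.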

Question 1. How many observations are there in this data set? How many variables? (2 points)

Question 2. Which variables are nominal? Which variables are interval or ratio? (12 points)

Exercise 2: Percentages in Tables and Charts (8 points)

In class and in your readings, we’ve covered different measures of dispersion. The most basic is the percentage, which we can read from tables and from charts. In this exercise, we’ll go one step further to characterize the dispersion in a distribution.

Here’s a frequency table for the variable division (geographic division) in the U.S. states dataset, followed by a table that reports percentages and a bar chart.
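
Tables and a chart of this kind could be produced along these lines. This is a minimal sketch, not necessarily the commands used for the handout: it assumes the built-in state.division data as a stand-in for the division column of us_states.csv, and it uses base R's barplot() to stay free of extra packages. The object names are our own.

```r
# Assumption: stand-in for the name and division columns of us_states.csv.
states <- data.frame(name = state.name, division = state.division)

# Frequency table: number of states in each geographic division.
division_counts <- table(states$division)
print(division_counts)

# Percentage table: each count as a share of all states, in percent.
division_pct <- round(100 * prop.table(division_counts), 1)
print(division_pct)

# Bar chart of the counts; las = 2 rotates the long division labels.
barplot(division_counts, las = 2, cex.names = 0.7,
        ylab = "Number of states")
```

prop.table() divides every count by the total, so the percentages sum to 100 (up to rounding).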

Question 1. What is the mode of geographic division, and what is the percentage of states that are located in this division? (4 points)

Question 2. Which geographic division includes the fewest states? What is the percentage of states that are located in this division? (2 points)

Question 3. How would you describe the distribution of geographic divisions in terms of dispersion? Low, medium, or high dispersion? Justify your answer. (2 points)

Exercise 3: Medians and Quartiles (10 points)

The summary() command provides summary statistics on continuous/numeric variables and reports on the minimum and maximum, the quartiles, and the mean and median. Here we call this command for the variable income:
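
A call of this form might look like the following. It is a sketch under the assumption that the Income column of the built-in state.x77 matrix matches the income variable in us_states.csv; `states` and `income_summary` are names chosen for illustration.

```r
# Assumption: stand-in for the name and income columns of us_states.csv.
states <- data.frame(name = state.name, income = state.x77[, "Income"])

# summary() on a numeric vector returns six values: minimum, first
# quartile, median, mean, third quartile, and maximum.
income_summary <- summary(states$income)
print(income_summary)
```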

We’ll pair this text output with a histogram of income so that we can visualize the shape of the distribution.
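
One way to draw such a histogram with ggplot2 is sketched below. It assumes the ggplot2 package is installed and again uses state.x77 as a stand-in for the course file; the number of bins is an arbitrary choice for illustration and may differ from the handout's figure.

```r
library(ggplot2)

# Assumption: stand-in for the name and income columns of us_states.csv.
states <- data.frame(name = state.name, income = state.x77[, "Income"])

# Histogram of per capita income; bins = 10 is an illustrative choice.
income_hist <- ggplot(states, aes(x = income)) +
  geom_histogram(bins = 10, color = "white") +
  labs(x = "Per capita income (1974)", y = "Number of states")
print(income_hist)
```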

Finally, we’ll also look at a boxplot for the same variable. You can do this by replacing geom_histogram with geom_boxplot in the plotting command.
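
As a sketch of that swap, assuming ggplot2 is installed and using state.x77 as a stand-in for the course file, the histogram layer is replaced by a boxplot layer while the rest of the command stays the same. The object names are our own.

```r
library(ggplot2)

# Assumption: stand-in for the name and income columns of us_states.csv.
states <- data.frame(name = state.name, income = state.x77[, "Income"])

# Same aesthetic mapping as a histogram of income, but with
# geom_boxplot() as the layer. Because income is mapped to x, the box is
# drawn horizontally.
income_box <- ggplot(states, aes(x = income)) +
  geom_boxplot() +
  labs(x = "Per capita income (1974)")
print(income_box)
```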

Because there is only one variable to examine here, R gives us a sideways rendering of the boxplot instead of an up and down one.

Use all three pieces of information (the summary output, the histogram, and the boxplot) to answer the questions in this section.

Question 1. Describe the distribution of per capita income. Use the following terms in your discussion: range, interquartile range, mean, median, skew, and outliers. (6 points: one point for each term described correctly)
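
For reference, the terms in this question can also be computed directly. The sketch below is illustrative only: it assumes the Income column of state.x77 as a stand-in for the income variable, and it flags outliers with the common 1.5 × IQR convention, which is also the default whisker rule for geom_boxplot(). All object names are our own.

```r
# Assumption: stand-in for the income column of us_states.csv.
income <- state.x77[, "Income"]

# Range: the minimum and maximum, and the distance between them.
income_range <- range(income)
income_span  <- diff(income_range)

# Interquartile range: third quartile minus first quartile.
quartiles  <- quantile(income, c(0.25, 0.75))
income_iqr <- IQR(income)

# Skew: a mean above the median suggests a right (positive) skew,
# a mean below the median suggests a left (negative) skew.
mean_minus_median <- mean(income) - median(income)

# Outliers under the 1.5 x IQR convention: values beyond the fences.
lower_fence <- quartiles[[1]] - 1.5 * income_iqr
upper_fence <- quartiles[[2]] + 1.5 * income_iqr
outliers <- income[income < lower_fence | income > upper_fence]

print(income_range)
print(income_iqr)
print(mean_minus_median)
print(outliers)
```

These numbers should agree with what you read from the summary output and the boxplot; your written answer still needs to interpret them.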

Question 2. Compare what you learned about the distribution of per capita income from the histogram and the boxplot. Which one do you find more helpful in summarizing the information and why? (4 points)

Exercise 4: Using Medians and Distributions to Compare States in Different Geographic Divisions (10 points)

In this next exercise, we will again use boxplots, this time to compare the per capita income for different geographic divisions.
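
A grouped comparison of this kind could be produced roughly as follows. This is a sketch under assumptions: ggplot2 is installed, the built-in state.division and state.x77 data stand in for the course file, and the object names are our own. It computes the mean and standard deviation of income for each division, then draws one box per division.

```r
library(ggplot2)

# Assumption: stand-in for the division and income columns of us_states.csv.
states <- data.frame(division = state.division,
                     income   = state.x77[, "Income"])

# Mean and standard deviation of per capita income within each division.
by_division <- split(states$income, states$division)
division_stats <- data.frame(
  mean = sapply(by_division, mean),
  sd   = sapply(by_division, sd)
)
print(round(division_stats, 1))

# One horizontal box per division, so medians and spreads can be compared.
division_box <- ggplot(states, aes(x = income, y = division)) +
  geom_boxplot() +
  labs(x = "Per capita income (1974)", y = "Geographic division")
print(division_box)
```

The table supports comparisons of means and standard deviations, while the boxplot shows medians, quartiles, and any outlying states within each division.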

Question 1. Use the text output on means and standard deviations. Which geographic division had the highest mean per capita income in 1974? Looking at its standard deviation, how would you describe the dispersion of its per capita income relative to other geographic divisions? (2 points)

Question 2. Now compare the income distributions using the boxplot. Which geographic division had the lowest median per capita income in 1974? How does the dispersion of the income distribution in this geographic division compare with that of other divisions? (3 points)

Question 3. Based on the text output and the boxplot, which geographic division had the highest level of income inequality in 1974, and why? (3 points)

Question 4. Judging by the distribution of per capita income, in which geographic division would you choose to live, and why? (2 points)
