Math 28 Project II: Linear Regression and Correlation
If we are given two ordered pairs $(x_1, y_1)$ and $(x_2, y_2)$, and we assume that x and y are linearly related, we can “work backwards” and find the equation of the line passing through them. The process of finding the equation of a line passing through given points is called linear regression.
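The two-point case can be sketched in a few lines of Python. This is a minimal illustration, not part of the original project; the function name `line_through_two_points` and the sample points are assumptions chosen for the example.

```python
def line_through_two_points(p1, p2):
    """Return (slope, intercept) of the line through points p1 and p2.

    Hypothetical helper for illustration; each point is an (x, y) pair.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # A vertical line has no slope-intercept form.
        raise ValueError("vertical line: slope is undefined")
    slope = (y2 - y1) / (x2 - x1)   # rise over run
    intercept = y1 - slope * x1     # solve y1 = slope * x1 + intercept
    return slope, intercept


# Sample points chosen for illustration only.
m, c = line_through_two_points((1, 2), (3, 6))
print(f"y = {m}x + {c}")
```

With exactly two points the line passes through both of them, so there is nothing to approximate; the least-squares method is only needed once there are more points than a single line can hit.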
If we are given more than two points, the points may not be collinear (meaning all on a line). In fact, when collecting real-world data with more than two data points, the points are rarely collinear. However, after plotting the scatter of points in the xy-plane, it may appear that they almost fit on a line, displaying a linear trend:
Figure 1 - Sample data indicating a roughly linear trend


When a scatter of points exhibits a linear trend, we construct a line that best approximates that trend. This line is called the Least-Squares Regression Line, or simply the Line of Best Fit. We denote it by $\hat{y} = ax + b$, where the “hat” over the y indicates that the calculated value of y is a prediction based on linear regression.
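CALCULATING THE LINE OF BEST FIT
Given a sample of n ordered pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, the line of best fit is denoted by $\hat{y} = ax + b$, where the slope a and the y-intercept b are given by

$$a = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$$

and

$$b = \bar{y} - a\bar{x},$$

where $\bar{x}$ and $\bar{y}$ denote the means of the x- and y-coordinates, respectively.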
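The slope and intercept formulas translate directly into code. The following is a minimal sketch using only the standard sums (Σx, Σy, Σx², Σxy); the function name `line_of_best_fit` and the test points are assumptions for illustration.

```python
def line_of_best_fit(points):
    """Return (a, b) for the least-squares line y_hat = a*x + b.

    Hypothetical helper; `points` is a list of (x, y) pairs.
    """
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        # Happens when there are fewer than two distinct x-values.
        raise ValueError("slope is undefined for these x-values")

    a = (n * sum_xy - sum_x * sum_y) / denom   # slope
    b = sum_y / n - a * (sum_x / n)            # intercept: y_bar - a * x_bar
    return a, b


# Three collinear points on y = 2x + 1, chosen only as a sanity check.
a, b = line_of_best_fit([(0, 1), (1, 3), (2, 5)])
print(a, b)
```

When the points happen to be collinear, as in this check, the line of best fit is simply the line through them.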

Example


Let’s look at an example: Given the ordered pairs (5, 14), (9, 17), (12, 16), (14, 18), and (17, 23), find the equation of the line of best fit, and graph it along with the data on the same coordinate system.

Example Solution


We need to organize the data and compute the appropriate sums:

(x, y)      x     x²     y     xy
(5, 14)     5     25     14    70
(9, 17)     9     81     17    153
(12, 16)    12    144    16    192
(14, 18)    14    196    18    252
(17, 23)    17    289    23    391
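As a check on the arithmetic, the table columns and their totals can be generated with a short script. This is a sketch for illustration; the variable names are assumptions, and only the five data pairs come from the example.

```python
# Data pairs from the example.
points = [(5, 14), (9, 17), (12, 16), (14, 18), (17, 23)]

# One row per point: x, x^2, y, xy (the same columns as the table).
rows = [(x, x * x, y, x * y) for x, y in points]

# Column totals: sum of x, sum of x^2, sum of y, sum of xy.
totals = tuple(sum(column) for column in zip(*rows))

print(f"{'x':>6}{'x^2':>6}{'y':>6}{'xy':>6}")
for row in rows:
    print("".join(f"{value:>6}" for value in row))
print("".join(f"{value:>6}" for value in totals), " <- sums")
```

The four totals are the only quantities, besides n, that the slope and intercept formulas need.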


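First, we find the slope a, using n = 5, $\sum x = 57$, $\sum x^2 = 735$, $\sum y = 88$, and $\sum xy = 1058$:

$$a = \frac{5(1058) - (57)(88)}{5(735) - (57)^2} = \frac{274}{426} \approx 0.643.$$

Now we compute b:

$$b = \bar{y} - a\bar{x} = \frac{88}{5} - (0.643)\left(\frac{57}{5}\right) \approx 10.27.$$

Therefore, the line of best fit is $\hat{y} = 0.643x + 10.27$.

Rounding off to one decimal place, we have $\hat{y} = 0.6x + 10.3$.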
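The same computation can be reproduced numerically. The sketch below plugs the sums for the five example pairs into the slope and intercept formulas; the variable names are assumptions for illustration.

```python
# Sums for the example data (5,14), (9,17), (12,16), (14,18), (17,23).
n = 5
sum_x, sum_x2, sum_y, sum_xy = 57, 735, 88, 1058

# Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept: y_bar - a * x_bar
b = sum_y / n - a * (sum_x / n)

print(f"unrounded: y_hat = {a:.4f}x + {b:.4f}")
print(f"rounded:   y_hat = {round(a, 1)}x + {round(b, 1)}")
```

Rounding the coefficients to one decimal place is convenient for hand calculation, but it does shift predictions slightly compared with the unrounded line.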

To graph the line, we need to plot two points. One point is the y-intercept (0, b) = (0, 10.3). To find another point, we pick an x-value at the right edge of the graph, x = 18 (just past 17, the largest x-value in the data set; we use it only to draw the line, not to make a prediction), and calculate $\hat{y}$:
$\hat{y} = 0.6(18) + 10.3 = 21.1$.
Therefore the point (18, 21.1) is on the line of best fit. We plot the line using these two points, along with the data, below:
Figure 2 - Line of Best Fit for Example 1


If our line fits the data well, we can use the line of best fit to interpolate y-values, given some x within the range of the x-values of our data set. Note that we CANNOT extrapolate, that is, predict y-values for x-values OUTSIDE of the range of our given x-values. In other words, we can use the line to “fill in” missing y-values BETWEEN our given points, but assuming that the linear trend continues outside of our data range is a very strong and likely false assumption.
For instance, since our line appears to fit our data well, we can estimate $\hat{y}$ when x = 10:
$\hat{y} = 0.6(10) + 10.3 = 16.3$.
This says that our line estimates the point (10, 16.3) between our given data points.
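One way to enforce the interpolate-only rule in code is to refuse any x outside the observed range. This is a minimal sketch under that assumption; the function name `interpolate` and its parameters are hypothetical, and the coefficients are the rounded values from the example.

```python
def interpolate(x, a, b, x_min, x_max):
    """Return y_hat = a*x + b, but only for x inside [x_min, x_max].

    Hypothetical helper: raises ValueError instead of extrapolating.
    """
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the data range [{x_min}, {x_max}]"
        )
    return a * x + b


# Rounded line from the example; the data x-values run from 5 to 17.
y_hat = interpolate(10, a=0.6, b=10.3, x_min=5, x_max=17)
print(f"{y_hat:.1f}")

try:
    interpolate(25, a=0.6, b=10.3, x_min=5, x_max=17)
except ValueError as err:
    print("refused:", err)
```

Raising an error is a deliberately strict design choice; a softer variant could return the value together with a warning flag.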

Assignment:


Based on the above discussion, complete the problem below as a team, showing your work in the spaces provided.
Suppose data on the average hourly wage and the unemployment rate in the United States are given below (from the Federal Reserve Economic Data):
Figure 3 - Year, Average Hourly Wage, and Unemployment Rate

Year    Average Hourly Wage ($)    Unemployment Rate (%)
1992    10.57                      7.5
1993    10.83                      6.9
1994    11.12                      6.1
1995    11.43                      5.6
1996    11.82                      5.4
1997    12.28                      4.9
1998    12.77                      4.5




  1. Letting x = average hourly wage and y = unemployment rate, plot the data. Do the data exhibit a linear trend?





  1. Find the line of best fit. Use the table below to help with the computations:

     (x, y)          x     x²     y     xy
     (10.57, 7.5)
     (10.83, 6.9)
     (11.12, 6.1)
     (11.43, 5.6)
     (11.82, 5.4)
     (12.28, 4.9)
     (12.77, 4.5)




  1. Using Excel or a similar tool, graph the line of best fit, along with the data, on the same coordinate system. Does the line appear to fit the data well? Copy your graph below.





  1. Predict the unemployment rate when the average hourly wage is $12.00.





  1. Can we predict the unemployment rate when the average hourly wage is $14.00? Explain.
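The assignment's calculations can be cross-checked with a short script. The sketch below is one possible approach, not an official solution: it fits the wage and unemployment data from the table, evaluates the line at $12.00 and $14.00, and labels each result as interpolation or extrapolation. The names `fit_line` and `predict` are assumptions for illustration.

```python
# Data from the assignment table: x = average hourly wage ($),
# y = unemployment rate (%), for 1992 through 1998.
wages = [10.57, 10.83, 11.12, 11.43, 11.82, 12.28, 12.77]
rates = [7.5, 6.9, 6.1, 5.6, 5.4, 4.9, 4.5]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - a * (sum_x / n)
    return a, b


def predict(x, a, b, xs):
    """Return (y_hat, inside), where inside says whether x is in the data range."""
    inside = min(xs) <= x <= max(xs)
    return a * x + b, inside


a, b = fit_line(wages, rates)
print(f"y_hat = {a:.4f}x + {b:.4f}")

for wage in (12.00, 14.00):
    y_hat, inside = predict(wage, a, b, wages)
    label = "interpolation" if inside else "extrapolation (not reliable)"
    print(f"wage ${wage:.2f}: y_hat = {y_hat:.2f}%  [{label}]")
```

A wage of $12.00 lies between the smallest and largest observed wages, so the estimate there is an interpolation. A wage of $14.00 lies above every observed wage, so the script flags that value as an extrapolation rather than a trustworthy prediction.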



References
Data Source: FRED, Federal Reserve Economic Data, Federal Reserve Bank of St. Louis: Civilian Unemployment Rate [UNRATE], Average Hourly Earnings of Production and Nonsupervisory Employees [AHETPI]; U.S. Department of Labor: Bureau of Labor Statistics; http://research.stlouisfed.org/fred2/; accessed October 14th, 2014.
Robert answered on Dec 26 2021