QM 3341 Business Statistics II
Project
Project Purpose:
Through this project you will obtain “real-world” data from the Internet and use various statistical procedures presented in this course to analyze the data. Specifically, the following procedures will be applied:


  • Hypothesis Tests for the Difference of Means

  • One-Factor Analysis of Variance

  • Randomized Block Design

  • Multiple Regression

  • Chi-Square Goodness of Fit Test



Project:
The project consists of 5 stages, with work submitted at the completion of each stage. Note that each stage says to use Excel; I will not give credit for work done with Minitab, StatTools, or any other statistical software. If you do not use Excel and turn in the spreadsheet you used to calculate your results, you will not get full credit. The steps to complete each stage of the project follow:

Stage 1



  1. Find an Internet site that sells used cars. The site you select must provide the year, miles and sales price for each car for sale. Here are a few sites that you might want to try [there are many others]:

    1. www.carmax.com

    2. www.cars.com

    3. www.bostoncars.com



  2. Randomly select about 30 used cars for sale from each of three different cities. The cars selected should be the same make and model. For example, you may choose to select 30 Toyota Camry LX cars from Los Angeles, Chicago and New York. You should have 30 from each city, 90 in all. You may have to be selective in order to find 90 cars of the same make and model. Here are some zip codes for some selected “big” cities. You may use three of these cities or pick three of your own.


Atlanta 30301
Chicago 60601
Houston 77001
Los Angeles 90001
Miami 33010
New York 10001
Philadelphia 19101

  3. Create an Excel file with the data you have collected.

  4. Conduct hypothesis tests for the difference in population means for each pair of cities. That is, test at the 5% significance level (95% confidence) for differences in the mean sales price of cars of your make and model for

    1. City 1 – City 2

    2. City 1 – City 3

    3. City 2 – City 3



  5. Use Excel to perform the necessary calculations.

  6. Provide a short write-up in Word that interprets your output.

  7. Submit the following files to complete Stage 1:

    1. The file that provides the statistical output or calculations.

    2. The file that provides your interpretations of your output.
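
The pairwise tests in this stage can be cross-checked outside the graded spreadsheet. The following is only a minimal Python sketch, not a substitute for the required Excel work. It assumes unequal population variances (Welch's test), which the assignment does not specify, and the prices and the function name `welch_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical sales prices (dollars); real samples would hold about 30 cars each
city1 = [10000, 12000, 14000, 16000, 18000]
city2 = [11000, 13000, 15000, 17000, 19000]

t, df = welch_t(city1, city2)
print(f"t = {t:.3f}, df = {df:.1f}")
```

The two-tailed p-value for the same comparison should be close to what Excel's T.TEST returns with tails = 2 and type = 3 (two-sample, unequal variance); the null hypothesis of equal means is rejected when that p-value is below 0.05.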






Stage 2



  1. Perform a one-factor ANOVA and the subsequent multiple comparisons, if necessary, to determine if there is a difference in the mean sales prices for the three cities.

  2. Provide a short write-up in Word that interprets your output.

  3. Now, do your best to match the cars for the three cities with regard to their year and miles. After doing this you may not have exactly 30 triples. Exclude any data that does not form a triple. A triple is a set of 3 cars, one from each city, that have the same year and whose mileages are within 1000 miles of each other. If you do not have at least 15 triples, expand the mileage criterion to 2000. If you still do not have 15 triples, expand to 3000, then 4000, then 5000. If you still do not have 15 triples, you will need to gather more data. With at least 15 triples, perform a randomized block design to see if there is a difference in the mean sales prices for the three cities.

  4. Use Excel to perform the necessary calculations.

  5. Provide a short write-up in Word that interprets your output.

  6. Submit the following files to complete Stage 2:

    1. An Excel file that shows the data with the triples you formed. You might want to list the data in 9 columns: City 1 Price, City 1 Miles, City 1 Year, City 2 Price, City 2 Miles, City 2 Year, City 3 Price, City 3 Miles, City 3 Year.

    2. The file that provides the statistical output or calculations.

    3. The file that provides your interpretations of your output.
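
As a cross-check on the one-factor ANOVA in this stage, the F statistic can be computed by hand. The sketch below is illustrative only (the graded calculations must still be done in Excel); the prices and the function name `one_way_anova` are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical prices (dollars) for three cities; real samples hold 30 cars each
prices = [
    [9000, 10000, 11000],
    [10000, 11000, 12000],
    [11000, 12000, 13000],
]

f_stat, df1, df2 = one_way_anova(prices)
print(f"F = {f_stat:.3f} with ({df1}, {df2}) degrees of freedom")
```

These values correspond to the F and df entries in the "Anova: Single Factor" output of Excel's Data Analysis ToolPak, and `F.DIST.RT(F, df1, df2)` gives the matching p-value.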


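
The triple-matching rule in this stage (same year, mileage within a tolerance that widens from 1000 to 5000) can be sketched as a simple greedy search. This is one possible reading, not the required method: it assumes "within 1000 miles of each other" means the highest and lowest mileage in the triple differ by at most the tolerance, and a greedy pass may find fewer triples than careful matching by hand. The data and function names are hypothetical.

```python
from itertools import product


def form_triples(city1, city2, city3, tolerance):
    """Greedily match cars given as (year, miles, price) across three cities."""
    used2, used3, triples = set(), set(), []
    for car1 in city1:
        for (i, car2), (j, car3) in product(enumerate(city2), enumerate(city3)):
            if i in used2 or j in used3:
                continue  # each car may appear in at most one triple
            same_year = car1[0] == car2[0] == car3[0]
            miles = (car1[1], car2[1], car3[1])
            if same_year and max(miles) - min(miles) <= tolerance:
                triples.append((car1, car2, car3))
                used2.add(i)
                used3.add(j)
                break
    return triples


def match_with_expanding_tolerance(city1, city2, city3, minimum=15):
    """Widen the mileage tolerance until at least `minimum` triples form."""
    for tolerance in (1000, 2000, 3000, 4000, 5000):
        triples = form_triples(city1, city2, city3, tolerance)
        if len(triples) >= minimum:
            return tolerance, triples
    return None, []  # still too few triples: gather more data


# Tiny hypothetical data set; `minimum` is lowered to 2 only for the demo
c1 = [(2018, 40000, 15000), (2019, 30000, 17000)]
c2 = [(2018, 41500, 14800), (2019, 30500, 17200)]
c3 = [(2018, 40800, 15100), (2019, 31900, 16900)]

tol, matched = match_with_expanding_tolerance(c1, c2, c3, minimum=2)
print(tol, len(matched))
```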


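
For the randomized block design in this stage, each triple is a block and each city is a treatment. The sketch below computes the two F statistics by hand as a cross-check on the Excel output; it is illustrative only, and the prices and the function name `randomized_block_f` are hypothetical.

```python
def randomized_block_f(table):
    """table[i][j] is the price for block (triple) i and treatment (city) j.

    Returns (F for treatments, F for blocks, error degrees of freedom).
    """
    b, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (b * k)
    city_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]
    ss_treat = b * sum((m - grand) ** 2 for m in city_means)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_treat - ss_block
    df_error = (b - 1) * (k - 1)
    mse = ss_error / df_error
    return (ss_treat / (k - 1)) / mse, (ss_block / (b - 1)) / mse, df_error


# Hypothetical prices: 3 triples (rows) by 3 cities (columns);
# the assignment calls for at least 15 triples
triples = [
    [11000, 12000, 13000],
    [12000, 14000, 13000],
    [13000, 13000, 16000],
]

f_city, f_block, df_err = randomized_block_f(triples)
print(f"F(cities) = {f_city:.3f}, F(blocks) = {f_block:.3f}, error df = {df_err}")
```

In Excel, the same layout (one row per triple, one column per city) can be fed to the ToolPak's "Anova: Two-Factor Without Replication" tool, where the column F value corresponds to the city effect.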


Stage 3



  1. Use all 90 cars from Stage 1; you do not need the triples from Stage 2.

  2. Combine the data into one file with separate columns for Price, Miles, Year and City. You might find it more helpful if you turn Year into an Age variable [a 2010 car would be 1 year old, a 2009 car would be 2 years old, …]. Also, you will need to create dummy variables for City.

  3. Obtain a correlation matrix.

  4. Use your data to obtain your “best” multiple regression model.

  5. Provide a short write-up in Word that interprets your output. Be sure to discuss



  • which independent variables are most strongly related to price,

  • which are least strongly related to price,

  • if the directions of the correlations make sense,

  • if there appears to be any multicollinearity,

  • your logic in excluding/including independent variables in your “best” model.



  6. Submit the following files to complete Stage 3:



    1. An Excel file that shows the data set you used, and provides the statistical output or calculations.

    2. The file that provides your interpretations of your output.
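
The data preparation for this stage (an Age variable, dummy variables for City, and a correlation matrix) can be sketched as follows. This is an illustrative cross-check only, not a replacement for the Excel work. It assumes a reference year of 2011 so that a 2010 car is 1 year old, as in the example above, and it creates two dummies for three cities, treating the first city as the baseline, because three dummies plus an intercept would be perfectly collinear. The data and function names are hypothetical.

```python
from math import sqrt

REFERENCE_YEAR = 2011  # assumption: makes a 2010 car 1 year old


def prepare_row(price, miles, year, city, cities):
    """Build one record with Age and k-1 city dummies (first city is the baseline)."""
    row = {"Price": price, "Miles": miles, "Age": REFERENCE_YEAR - year}
    for name in cities[1:]:
        row[name] = 1 if city == name else 0
    return row


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(rows):
    """Return a nested dict: corr[a][b] for every pair of variables."""
    names = list(rows[0])
    cols = {n: [r[n] for r in rows] for n in names}
    return {a: {b: pearson(cols[a], cols[b]) for b in names} for a in names}


# Hypothetical cars: (price, miles, year, city)
cities = ["Chicago", "Houston", "Miami"]
raw = [
    (15000, 20000, 2010, "Chicago"),
    (13000, 40000, 2009, "Houston"),
    (11000, 60000, 2008, "Miami"),
    (9000, 80000, 2007, "Chicago"),
]

rows = [prepare_row(*car, cities) for car in raw]
corr = correlation_matrix(rows)
print(round(corr["Price"]["Miles"], 3), round(corr["Miles"]["Age"], 3))
```

A very high correlation between two independent variables, such as Miles and Age in this made-up data, is the kind of pattern that signals possible multicollinearity in the write-up.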




Stage 4



  1. Perform a Chi-Square Goodness of Fit test to determine whether or not the prices of all 90 cars are normally distributed.

  2. Provide a short write-up in Word that interprets your output.

  3. Submit the following files to complete Stage 4:



    1. The file that provides the statistical output or calculations.

    2. The file that provides your interpretations of your output.
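
One common way to set up the Chi-Square Goodness of Fit test for normality is to split the fitted normal distribution into equal-probability intervals and compare observed with expected counts. The sketch below follows that approach as an illustrative cross-check only. The choice of 6 intervals (15 expected cars each for 90 cars) is an assumption, the prices are randomly generated rather than real, and the function name `chi_square_normality` is hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def chi_square_normality(values, bins=6):
    """Return (chi-square statistic, degrees of freedom, observed counts)."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    # Interior cut points that give each interval probability 1/bins
    cuts = [fitted.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for v in values:
        observed[sum(v > c for c in cuts)] += 1
    expected = n / bins
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    df = bins - 1 - 2  # two parameters (mean, std dev) estimated from the data
    return chi2, df, observed


# Hypothetical prices: 90 simulated values, NOT real data
random.seed(0)
prices = [random.gauss(15000, 2000) for _ in range(90)]

chi2, df, observed = chi_square_normality(prices)
print(f"chi-square = {chi2:.3f}, df = {df}, observed = {observed}")
```

Normality is rejected at the 5% level when the statistic exceeds the critical value from Excel's `CHISQ.INV.RT(0.05, df)`, or equivalently when `CHISQ.DIST.RT(chi2, df)` is below 0.05.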





Stage 5




  1. Summarize what you’ve learned about the prices of cars through the first four stages.

  2. Describe the uses and limitations of statistical analysis in this project. What does the data tell you? What does it not tell you?

  3. Submit a Word document with your conclusions to complete Stage 5.



Due Dates and Grading:
You will be required to submit your work in stages. The actual due dates for each stage coincide with the completion of each topic/procedure during the course. Below are the due dates and the possible points for each stage. There are a total of 100 points. Your project grade will be determined by dividing the total points you earn by 100.

Stage   Due Date           Statistical Procedure                                       Possible Points
1       Sunday of Week 1   Confidence Intervals for the Difference of Means            20
2       Sunday of Week 3   One-Factor Analysis of Variance; Randomized Block Design    20
3       Sunday of Week 5   Multiple Regression                                         20
4       Sunday of Week 7   Chi-Square Goodness of Fit                                  20
5       Sunday of Week 9   Final Project Interpretation                                20


May 13, 2022