
Competency


In this project, you will demonstrate your mastery of the following competency:



  • Perform regression analysis to address an authentic problem


Scenario


You are a data analyst for a basketball team and have access to a large set of historical data that you can use to analyze performance patterns. The coach of the team and your management have requested that you come up with regression models that predict the number of wins in a regular game based on the performance metrics that are included in the data set. These regression models will help make key decisions to improve the performance of the team. You will use the Python programming language to perform the statistical analyses and then prepare a report of your findings to present for the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications.



Note: This data set has been “cleaned” for the purposes of this assignment.


Reference


FiveThirtyEight. (April 26, 2019). FiveThirtyEight NBA Elo dataset. Kaggle. Retrieved from https://www.kaggle.com/fivethirtyeight/fivethirtyeight-nba-elo-dataset/

Directions


For this project, you will submit the Python script you used to make your calculations and a summary report explaining your findings.




  1. Python Script: To complete the tasks listed below, open the Project Three Jupyter Notebook link in the Assignment Information module. This notebook contains your data set and the Python scripts for your project. In the notebook, you will find step-by-step instructions and code blocks that will help you complete the following tasks:


    • Simple Linear Regression

      • Create scatterplots

      • Compute the correlation coefficient

      • Conduct a linear regression




    • Multiple Regression

      • Create scatterplots

      • Compute the correlation matrix

      • Conduct a multiple regression analysis






  2. Summary Report: Once you have completed all the steps in your Python script, you will create a summary report to present your findings. Use the provided template to create your report. You must complete each of the following sections:


    • Introduction: Set the context for your scenario and the analyses you will be performing.


    • Scatterplots and Correlation: Discuss relationships between variables using scatterplots and correlation coefficients.


    • Simple Linear Regression: Create a simple linear regression model to predict the response variable.


    • Multiple Regression: Create a multiple regression model to predict the response variable.


    • Conclusion: Summarize your findings and explain their practical implications.
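The simple and multiple regression tasks listed in the directions can be sketched as follows. This is a minimal illustration that uses NumPy least squares, not the code from the course notebook. The data values are made up, and the column names `avg_pts` and `total_wins` are assumptions; only `avg_elo_n` appears in the text of this page.

```python
import numpy as np

def fit_ols(predictors, response):
    """Fit ordinary least squares with an intercept.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    # Prepend a column of ones so the model includes an intercept term.
    X = np.column_stack([np.ones(len(response)), predictors])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((response - fitted) ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Made-up season summaries for illustration only (not from the data set).
avg_elo_n = np.array([1350.0, 1420.0, 1480.0, 1510.0, 1560.0, 1620.0, 1690.0])
avg_pts = np.array([98.0, 101.0, 99.5, 103.0, 102.0, 106.0, 108.0])
total_wins = np.array([22.0, 30.0, 36.0, 41.0, 45.0, 53.0, 61.0])

# Simple linear regression: total wins on average relative skill.
simple_coef, simple_r2 = fit_ols(avg_elo_n, total_wins)

# Multiple regression: total wins on relative skill and average points.
multi_coef, multi_r2 = fit_ols(np.column_stack([avg_elo_n, avg_pts]), total_wins)

print("simple:", simple_coef, simple_r2)
print("multiple:", multi_coef, multi_r2)
```

Adding a predictor never lowers R-squared in ordinary least squares, so comparing the two models on R-squared alone favors the larger one; a report would normally also look at the significance of each coefficient.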




What to Submit


To complete this project, you must submit the following:



Python Script

Your Jupyter Notebook Python script contains all the statistical analyses you completed for this project. You downloaded your work as an HTML file. Review the file to make sure that every step and all your outputs are included. Submit the HTML file as part of your submission. Review the Jupyter Notebook in Codio Tutorial in the Supporting Materials section if you need help.



Summary Report

Use the provided template to create your summary report. The template contains guiding questions to help you complete each section. Be sure to remove these questions before submitting your report. Your summary report should be submitted as a 3- to 5-page Microsoft Word document. It should include an APA-style cover page and APA citations for any sources used. Use double spacing, 12-point Times New Roman font, and one-inch margins.


Supporting Materials


The following resource(s) may help support your work on the project:



Document:

Jupyter Notebook in Codio Tutorial PDF

This tutorial will help you become familiar with the Jupyter Notebook interface. You will learn how to open, complete, save, and download your Jupyter Notebook for this project.



Shapiro Library:
APA Style Guide

This guide will help you format your cover page and references according to APA style. You are not required to use external resources for this project. However, if you do use any resources, you must cite them in APA format.


Project Three Rubric











Answered 1 day after Jun 16, 2022

Atreye answered on Jun 17 2022
MAT 243 Project Three Summary Report
[Full Name]
[SNHU Email]
Southern New Hampshire University
Note: Replace the bracketed text on page one (the cover page) with your personal information.
1. Introduction
The data set explored in this study contains the average points scored, the average point differential, the average relative skill, and the total number of wins for our team and for opposing teams during the regular season. Exploring these data will help predict how well our team is likely to perform in the upcoming season. The data set is expected to give a reliable indicator of a team’s success, measured by its win total. Simple linear regression and multiple linear regression will be used to produce information for the team’s upper management and coaches.
2. Data Preparation
The variable “avg_pts_differential” is the average point differential between a team and its opponents during the season, that is, the average difference between the points the team scores and the points its opponents score. For example, if our team averages 126 points per game and our opponents average 121, then our average point differential is +5. A positive value is desirable because it means that we outscore our opponents on average.
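The point differential arithmetic can be checked with a short sketch. The function name and the game scores below are hypothetical; the scores were chosen so that the two averages come out to 126 and 121.

```python
def avg_point_differential(points_for, points_against):
    """Average of (points scored - points allowed) over a set of games."""
    if not points_for or len(points_for) != len(points_against):
        raise ValueError("both lists must be non-empty and the same length")
    diffs = [pf - pa for pf, pa in zip(points_for, points_against)]
    return sum(diffs) / len(diffs)

# Hypothetical three-game sample: our team averages 126, opponents average 121.
our_points = [126, 130, 122]
opp_points = [121, 124, 118]

print(avg_point_differential(our_points, opp_points))  # 5.0
```

Swapping the two arguments gives -5.0, which is the same differential seen from the opponents' side.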
The second variable, “avg_elo_n”, is the average relative skill of a team during the regular season. Relative skill is based on where each game is played, the final score, and how the result compares with the probability of that outcome. The more skilled a team is, the higher the value. A team with greater relative skill is predicted to defeat an opponent with lower relative skill.
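To show how a skill rating translates into a predicted outcome, the sketch below uses the standard Elo expected-score formula. It is an assumption that the data set's ratings follow this standard form; the function name and the ratings are hypothetical, and any home-court adjustment is left out.

```python
def elo_win_probability(team_elo, opponent_elo):
    """Standard Elo expected score: probability that `team` beats `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - team_elo) / 400.0))

# Hypothetical ratings: a 100-point edge gives roughly a 64% win probability.
print(round(elo_win_probability(1600, 1500), 2))  # 0.64
print(elo_win_probability(1500, 1500))            # 0.5
```

The two teams' probabilities always sum to 1, so the team with the higher rating is the predicted winner.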
3. Scatterplot and Correlation for the Total Number of Wins and Average Relative Skill
Data visualization uses charts, graphs, or images to present data. Visualization techniques help reveal relationships between variables. In a scatterplot, if one variable tends to decrease as the other increases, the two variables have a negative relationship. If both variables tend to increase together, they have a positive relationship. If the points show no trend or pattern, there is no apparent association between the two variables.
The correlation coefficient takes values between -1 and +1. Its sign indicates the direction of the relationship between the two variables, and its magnitude indicates the strength of the linear relationship. A value of 0 indicates no linear association between the two variables, -1 indicates a perfect negative linear relationship, and +1 indicates a perfect positive linear relationship.
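The three reference values of the correlation coefficient can be reproduced with a small pure-Python sketch of the Pearson formula. The function name and the toy data are illustrative and are not taken from the project data set.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # y rises with x: r is +1
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises: r is -1
r_zero = pearson_r(x, [1, 2, 3, 2, 1])       # rise then fall: r is about 0

print(r_positive, r_negative, r_zero)
```

The last case has an obvious pattern but a correlation near 0, which is why a value of 0 rules out only a linear association and why the scatterplot should be inspected alongside the coefficient.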
The correlation coefficient for the data in the scatterplot is 0.9072. This value indicates a strong positive association between the two variables, average relative skill and total number of wins. The correlation is statistically significant because the p-value is less than 0.01 (1%).
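For reference, the significance of a sample correlation r is commonly assessed with a t-test of the null hypothesis that the true correlation is zero. The sample size n is not given in this excerpt, so the statistic is shown symbolically. Whether the notebook computes its p-value in exactly this way is an assumption.

```latex
t = r\sqrt{\frac{n-2}{1-r^{2}}}, \qquad \text{df} = n-2, \qquad r = 0.9072
```

The two-sided p-value is the probability that a t-distributed variable with n - 2 degrees of freedom exceeds |t| in absolute value. For a fixed r, a larger n gives a larger t and a smaller p-value.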
4....