First go to CoCalc.com and log in using the following username and password...

Question

First, go to CoCalc.com and log in with the following username and password: username [email protected], password Peaceonearth1!.

Once you are logged in to CoCalc, click on Fall2021. It contains two folders: a homework folder and a lesson folder, holding a homework file and a lesson file. The assignment is Homework 4. You can do the work directly in CoCalc, or you can do it in your own Jupyter and paste the code into CoCalc. Some of the questions might ask you to go to the lesson to get the starting code.

first-go-to-cocalc-phkrxj5j-5h4nbe45.docx

Robert · Accepted Answer

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem Setup\n",
    "*Note: this information is not included in the Canvas quiz.*\n",
    "\n",
    "The file *Airfares.xlsx* contains real data that were collected between Q3-1996 and Q2-1997. The first sheet contains variable descriptions while the second sheet contains the data. A csv file of the data is also provided (called *Airfares.csv*).\n",
    "\n",
    "We're copying the instructions from the presentation file here for ease of use.\n",
    "\n",
    "The following problem takes place in the United States in the late 1990s, when many major US cities were facing issues with airport congestion, partly as a result of the 1978 deregulation of airlines. Both fares and routes were freed from regulation, and low-fare carriers such as Southwest (SW) began competing on existing routes and starting non-stop service on routes that previously lacked it. Building new airports is not generally feasible, but sometimes decommissioned military bases or smaller municipal airports can be reconfigured as regional or larger commercial airports. There are numerous players and interests involved in the issue (airlines, city, state, and federal authorities, civic groups, the military, airport operators), and an aviation consulting firm is seeking advisory contracts with these players.\n",
    "\n",
    "A consulting firm wishes to determine the maximum average fare (FARE) as a function of three variables: COUPON, HI, and DISTANCE. COUPON, HI, and DISTANCE are things that an airline could control when determining where to locate new routes.\n",
    "\n",
    "Moreover, they need to impose constraints on\n",
    "- the number of passengers on that route (PAX) $\\leq 20000$\n",
    "- the starting city’s average personal income (S_INCOME) $\\leq 30000$\n",
    "- the ending city’s average personal income (E_INCOME) $\\geq 30000$\n",
    "\n",
    "For additional constraints:\n",
    "* restrict COUPON to no more than 1.5\n",
    "* limit HI to between 4000 and 8000, inclusive\n",
    "* consider only routes with DISTANCE between 500 and 1000 miles, inclusive.\n",
    "\n",
    "However, the variables PAX, S_INCOME, and E_INCOME are not decision variables, so the firm must first model these variables using COUPON, HI, and DISTANCE as predictors using linear regression (predictive analytics). They'll also use linear regression to model a linear relation between FARE and COUPON, HI, and DISTANCE. Armed with these predictive models, the firm will build a linear program (prescriptive analytics) to maximize the average fare.\n",
    "\n",
    "Suppose you are in the aviation consulting firm and you want to maximize airfares for the particular set of circumstances described here.\n",
    "\n",
    "*NOTE: This problem scenario is developed from pp. 170-171 in Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, by Shmueli, Bruce, Yahav, Patel, and Lichtendahl (Wiley, 2017).*\n",
    "\n",
    "## Part 1: The Predictive Models\n",
    "Since each of these models uses the same predictors and the only thing that varies is the response variable, write a function that takes in the dataframe, a list of predictors, and a response variable string, and which:\n",
    "* runs the linear regression based on the given predictors and response variable\n",
    "* returns the model\n",
    "* prints the regression equation.\n",
    "\n",
    "Use a non-repetitive approach to run multiple linear regression **through the origin** using the average number of coupons (COUPON) for that route, the Herfindahl Index (HI), and the distance between the two endpoint airports in miles (DISTANCE) as predictors. You'll build 4 multiple linear regression models, one for each of the following response variables:\n",
    "\n",
    "- the average fare (FARE)\n",
    "- the number of passengers on that route (PAX)\n",
    "- the starting city’s average personal income (S_INCOME)\n",
    "- the ending city’s average personal income (E_INCOME)\n",
    "\n",
    "For each of the models, you'll need to:\n",
    "\n",
    "* print the resulting linear equation. For instance: $FARE = X_1COUPON + X_2HI + X_3DISTANCE$ with the $X_n$ coefficients filled in.\n",
    "* print the $R^2$ for each model. (Hint: it's stored in an attribute that can be accessed by calling .rsquared on whatever variable you created when you fit the model.)\n",
    "* store the data in such a way that you can use the coefficients directly in the linear program.\n",
    "\n",
    "There are multiple ways you could do this to get full credit. You could write a function and call it 4 times. You could use a loop, without a function. You could use a combination of loop and function. Non-repetitive code means that you are not copy/pasting the same lines of code over and over again. To get full credit, you must not be replicating the same bits of code over and over."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "import statsmodels.api as sm\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "from sklearn.linear_model import LinearRegression\n",
    "import plotly.graph_objects as go\n",
    "\n",
    "airfare = pd.read_csv(\"D:\\\\New\\\\Airfares.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "     FARE    PAX  S_INCOME  E_INCOME  COUPON       HI  DISTANCE\n",
       "0   64.11   7864   28637.0   21112.0    1.00  5291.99       312\n",
       "1  174.47   8820   26993.0   29838.0    1.06  5419.16       576\n",
       "2  207.76   6452   30124.0   29838.0    1.06  9185.28       364\n",
       "3   85.47  25144   29260.0   29838.0    1.06  2657.35       612\n",
       "4   85.47  25144   29260.0   29838.0    1.06  2657.35       612"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Keep only the four response variables and the three predictors\n",
    "cols = ['FARE', 'PAX', 'S_INCOME', 'E_INCOME', 'COUPON', 'HI', 'DISTANCE']\n",
    "df = airfare[cols].copy()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# regModel() function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [],
   "source": [
    "def regModel(X, y):\n",
    "    # Fit y on the predictors in X by ordinary least squares.\n",
    "    # Note: add_constant() adds an intercept column, so this fit is NOT\n",
    "    # through the origin; omit that line for a no-intercept model.\n",
    "    X = sm.add_constant(X)\n",
    "    result = sm.OLS(y, X).fit()\n",
    "    beta_values = result.params\n",
    "    # Adjusted R^2 is reported here; plain R^2 is result.rsquared\n",
    "    rsq_val = result.rsquared_adj\n",
    "\n",
    "    # Positional access: intercept first, then the three predictors\n",
    "    B0, B1, B2, B3 = beta_values.iloc[:4]\n",
    "\n",
    "    reg_eqn = f'{y.name}={B0}+({B1})*Coupon+({B2})*HI+({B3})*DISTANCE'\n",
    "    print(reg_eqn)\n",
    "    print(rsq_val)\n",
    "    print(result.summary())\n",
    "    return beta_values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = df.iloc[:, 4:7]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FARE=0.15091723772984267+(22.472163154481805)*Coupon+(0.011792019438414687)*HI+(0.08335424286644894)*DISTANCE\n",
      "0.5090942382057464\n",
      "                            OLS Regression Results                            \n",
      "==============================================================================\n",
      "Dep. Variable:                   FARE   R-squared:                       0.511\n",
      "Model:                            OLS   Adj. R-squared:                  0.509\n",
      "Method:                 Least Squares   F-statistic:                     221.2\n",
      "Date:                Sun, 03 Oct 2021   Prob (F-statistic):           3.59e-98\n",
      "Time:                        15:11:25   Log-Likelihood:                -3439.5\n",
      "No. Observations:                 638   AIC:                             6887.\n",
      "Df Residuals:                     634   BIC:                             6905.\n",
      "Df Model:                           3                                         \n",
      "Covariance Type:            nonrobust                                         \n",
      "==============================================================================\n",
      "                 coef    std err          t      P>|t|      [0.025      0.975]\n",
      "------------------------------------------------------------------------------\n",
      "const          0.1509     18.363      0.008      0.993     -35.909      36.211\n",
      "COUPON        22.4722     15.829      1.420      0.156      -8.612      53.556\n",
      "HI             0.0118      0.001      9.002      0.000       0.009       0.014\n",
      "DISTANCE       0.0834      0.005     16.913      0.000       0.074       0.093\n",
      "==============================================================================\n",
      "Omnibus:                       31.682   Durbin-Watson:                   0.990\n",
      "Prob(Omnibus):                  0.000   Jarque-Bera (JB):               16.014\n",
      "Skew:                           0.193   Prob(JB):                     0.000333\n",

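For the prescriptive step described in the problem statement, one possible formulation is a linear program over COUPON, HI, and DISTANCE. The sketch below uses `scipy.optimize.linprog`, which is an assumption here since the notebook does not show which solver it uses. The coefficient values are hypothetical placeholders, not fitted results, and the lower bound of 0 on COUPON is also assumed because the problem only states an upper bound.

```python
from scipy.optimize import linprog

# Hypothetical through-origin coefficients in the order (COUPON, HI, DISTANCE).
# Placeholders only; in practice these come from the four fitted regressions.
coef = {
    "FARE": [10.0, 0.01, 0.1],
    "PAX": [1000.0, 1.0, 1.0],
    "S_INCOME": [10000.0, 1.0, 1.0],
    "E_INCOME": [10000.0, 2.0, 1.0],
}

# linprog minimizes, so negate the FARE coefficients to maximize FARE.
objective = [-c for c in coef["FARE"]]

# Inequality constraints in the form A_ub @ x <= b_ub.
A_ub = [
    coef["PAX"],                     # PAX <= 20000
    coef["S_INCOME"],                # S_INCOME <= 30000
    [-c for c in coef["E_INCOME"]],  # E_INCOME >= 30000, i.e. -E_INCOME <= -30000
]
b_ub = [20000, 30000, -30000]

# Simple bounds on the decision variables.
bounds = [
    (0, 1.5),      # COUPON no more than 1.5 (lower bound of 0 is assumed)
    (4000, 8000),  # HI between 4000 and 8000, inclusive
    (500, 1000),   # DISTANCE between 500 and 1000 miles, inclusive
]

res = linprog(objective, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
max_fare = -res.fun  # undo the sign flip used for maximization
print("COUPON, HI, DISTANCE =", res.x)
print("Maximum predicted FARE =", max_fare)
```

Writing the "greater than or equal" constraint with negated coefficients keeps everything in the single `A_ub @ x <= b_ub` form that `linprog` expects.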