Homework5 Marketing Segmentation—Propensity model Requirements: use google colaboratory Please kindly comment on your code to help me understand · Write a proposal to your managers about your thoughts...

Customer segmentation propensity model


Homework5 Marketing Segmentation: Propensity model
Requirements: use Google Colaboratory. Please comment your code to help me understand it.
· Write a proposal to your managers about your thoughts on:
· How are you going to segment your customers?
· What are the potential use cases?
· If you are going to build a segmentation model, what features are you going to use?
· Read the data description on the next page.
· Build a machine learning model to segment the restaurants into different revenue tiers. Propensity model (logistic regression, KNN, XGBoost)
· Select the features that are important for predicting revenue
· Understand how each feature influences revenue
Sample code: please reference the sample code in the attachment.
Data: https://www.kaggle.com/c/restaurant-revenue-prediction/data
Data description
· Dataset
· The data columns include:
· open date
· location
· city type
· categories of obfuscated data: Demographic data, Real estate data, and Commercial data
· revenue: a (transformed) revenue of the restaurant in a given year; it is the target of the predictive analysis
· Assume restaurants in the dataset are Trave stores
· File descriptions
· train.csv - the training set. Use this dataset for training your model.
· test.csv - the test set.
· sampleSubmission.csv - a sample submission file in the correct format
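The modelling tasks listed in the question (cutting revenue into tiers, selecting features, fitting a propensity model, and reading how each feature influences the result) could be sketched roughly as below. This is only an illustration under stated assumptions: the data frame `demo` is synthetic and stands in for train.csv, the revenue formula is made up, three equal-sized tiers are an arbitrary choice, and logistic regression is used as just one of the three model families the question names.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for train.csv (NOT the Kaggle data): 137 rows, six P-columns.
rng = np.random.default_rng(0)
cols = [f"P{i}" for i in range(1, 7)]
demo = pd.DataFrame(rng.integers(0, 6, size=(137, 6)), columns=cols)
# Hypothetical revenue driven by P1 plus noise, purely for illustration.
demo["revenue"] = 3_000_000 + 600_000 * demo["P1"] + rng.normal(0, 300_000, 137)

# 1. Segment: cut revenue into three equally sized tiers (0 = low, 1 = mid, 2 = high).
demo["tier"] = pd.qcut(demo["revenue"], q=3, labels=False)

# 2. Select: score the non-negative P-columns against the tier label with chi-squared.
selector = SelectKBest(score_func=chi2, k=3).fit(demo[cols], demo["tier"])
selected = [c for c, keep in zip(cols, selector.get_support()) if keep]

# 3. Model: multinomial logistic regression on the selected, standardised features.
X_tr, X_va, y_tr, y_va = train_test_split(
    demo[selected], demo["tier"], test_size=0.3, random_state=0, stratify=demo["tier"]
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)

# Propensity of each held-out restaurant to belong to each tier (each row sums to 1).
propensity = model.predict_proba(X_va)

# 4. Interpret: one coefficient row per tier; the sign gives the direction of influence.
clf = model.named_steps["logisticregression"]
coefs = pd.DataFrame(clf.coef_, columns=selected, index=clf.classes_)

print("selected features:", selected)
print("validation accuracy:", round(model.score(X_va, y_va), 3))
print(coefs.round(2))
```

On the real data the same steps would take the P1..P37 columns as features. With only 137 training rows, cross-validation would probably give a steadier accuracy estimate than a single hold-out split.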
Answered Same Day Jun 08, 2021

Bezawada Arun answered on Jun 15 2021
{
"cells": [
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [],
"source": [
"# Data handling and plotting\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
"# Evaluation metrics\n",
"from sklearn import metrics\n",
"from sklearn.metrics import confusion_matrix\n",
"from sklearn.metrics import classification_report\n",
"from sklearn.metrics import roc_auc_score\n",
"from sklearn.metrics import roc_curve\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"# Classifiers, data splitting, and feature selection\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.model_selection import train_test_split\n",
"import xgboost as xgb\n",
"from xgboost import XGBClassifier\n",
"from sklearn.feature_selection import SelectKBest\n",
"from sklearn.feature_selection import chi2\n",
"\n",
"# Notebook display helpers\n",
"from IPython.display import Image, HTML"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(137, 43)\n"
]
},
{
"data": {
"text/plain": [
" Id Open Date City City Group Type P1 P2 P3 P4 P5 ... \\\n",
"0 0 07/17/1999 İstanbul Big Cities IL 4 5.0 4.0 4.0 2 ... \n",
"1 1 02/14/2008 Ankara Big Cities FC 4 5.0 4.0 4.0 1 ... \n",
"2 2 03-09-2013 Diyarbakır Other IL 2 4.0 2.0 5.0 2 ... \n",
"3 3 02-02-2012 Tokat Other IL 6 4.5 6.0 6.0 4 ... \n",
"4 4 05-09-2009 Gaziantep Other IL 3 4.0 3.0 4.0 2 ... \n",
"\n",
" P29 P30 P31 P32 P33 P34 P35 P36 P37 revenue \n",
"0 3.0 5 3 4 5 5 4 3 4 5653753 \n",
"1 3.0 0 0 0 0 0 0 0 0 6923131 \n",
"2 3.0 0 0 0 0 0 0 0 0 2055379 \n",
"3 7.5 25 12 10 6 18 12 12 6 2675511 \n",
"4 3.0 5 1 3 2 3 4 3 3 4316715 \n",
"\n",
"[5 rows x 43 columns]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load the training set (the path is local to the author's machine)\n",
"train = pd.read_csv(\"/Users/Lenovo/Downloads/60054/train.csv\")\n",
"print(train.shape)  # (rows, columns)\n",
"train.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(100000, 42)\n"
]
},
{
"data": {
"text/plain": [
" Id Open Date City City Group Type P1 P2 P3 P4 P5 \\\n",
"0 0 01/22/2011 Niğde Other FC 1 4.0 4.0 4.0 1 \n",
"1 1 03/18/2011 Konya Other IL 3 4.0 4.0 4.0 2 \n",
"2 2 10/30/2013 Ankara Big Cities FC 3 4.0 4.0 4.0 2 \n",
"3 3 05-06-2013 Kocaeli Other IL 2 4.0 4.0 4.0 2 \n",
"4 4 07/31/2013 Afyonkarahisar Other FC 2 4.0 4.0 4.0 1 \n",
"\n",
" ... P28 P29 P30 P31 P32 P33 P34 P35 P36 P37 \n",
"0 ... 2.0 3.0 0 0 0 0 0 0 0 0 \n",
"1 ... 1.0 3.0 0 0 0 0 0 0 0 0 \n",
"2 ... 2.0 3.0 0 0 0 0 0 0 0 0 \n",
"3 ... 2.0 3.0 0 4 0 0 0 0 0 0 \n",
"4 ... 5.0 3.0 0 0 0 0 0 0 0 0 \n",
"\n",
"[5 rows x 42 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load the test set (same columns as train, minus the revenue target)\n",
"test = pd.read_csv(\"/Users/Lenovo/Downloads/60054/test.csv\")\n",
"print(test.shape)  # (rows, columns)\n",
"test.head()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(100000, 37)\n"
]
},
{
"data": {
"text/plain": [
" P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 ... P28 P29 P30 P31 P32 \\\n",
"0 1 4.0 4.0 4.0 1 2 5 4 5 5 ... 2.0 3.0 0 0 0 \n",
"1 3 4.0 4.0 4.0 2 2 5 3 4 4 ... 1.0 3.0 0 0 0 \n",
"2 3 4.0 4.0 4.0 2 2 5 4 4 5 ... 2.0 3.0 0 0 0 \n",
"3 2 4.0 4.0 4.0 2 3 5 4 5 4 ... 2.0 3.0 0 4 0 \n",
"4 2 4.0 4.0 4.0 1 2 5 4 5 4 ... 5.0 3.0 0 0 0 \n",
"\n",
" P33 P34 P35 P36 P37 \n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
"[5 rows x 37 columns]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Keep only the obfuscated feature columns P1..P37 (column positions 5 to 41)\n",
"x_test = test.iloc[:, 5:42]\n",
"print(x_test.shape)\n",
"x_test.head()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Same P1..P37 feature columns from the training set; revenue (column position 42) is excluded\n",
"x_train = train.iloc[:, 5:42]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(137, 37)\n"
]
},
{
"data": {
"text/plain": [
" P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 ... P28 P29 P30 P31 P32 \\\n",
"0 4 5.0 4.0 4.0 2 2 5 4 5 5 ... 2.0 3.0 5 3 4 \n",
"1 4 5.0 4.0 4.0 1 2 5 5 5 5 ... 3.0 3.0 0 0 0 \n",
"2 2 4.0 2.0 5.0 2 3 5 5 5 5 ... 1.0 3.0 0 0 0 \n",
"3 6 4.5 6.0 6.0 4 4 10 8 10 10 ... 2.5 7.5 25 12 10 \n",
"4 3 4.0 3.0 4.0 2 2 5 5 5 5 ... 1.0 3.0 5 1 3 \n",
"\n",
" P33 P34 P35 P36 P37 \n",
"0 5 5 4 3 4 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 6 18 12 12 6 \n",
"4 2 3 4 3 3 \n",
"\n",
"[5 rows x 37 columns]"
]
},
"execution_count": 11,
"metadata":...