assignment is in ipynb file that I have attached{ "cells": [ { "cell_type": "markdown", ...

Question

The assignment is in the ipynb file that I have attached:

## Homework 5

In this week's class, we went through the RecSys Challenge 2015:
https://2015.recsyschallenge.com/challenge.html
For this homework, we will work on task 1. So, when you create features, you will need to think at the session level, not the item level.

The click and buy datasets are uploaded to iCollege. These 2 files are sampled down to ~50k buy and ~50k non-buy sessions to simplify your homework.

In this homework, please do feature engineering to create new features from the click and buy datasets. You can use the ideas I provided in class, but I encourage you to be creative and think on your own as well.

Each feature you create will be worth 10 points, and the points from feature engineering are capped at 80. But do not limit yourself to 8 features, because the more features you create, the more they will help you in your next homework (6) on machine learning modeling.

In the end, you will need to create the Analytics Base Table (ABT), which is worth 20 points. In the ABT, each row should represent a unique click session, not a session paired with each item. This is different from my code on GitHub because we are only doing task 1 for this homework.

In [3]:

    # loading data
    import pandas as pd

    click = pd.read_csv('click_sep.csv', low_memory=False)

    buy = pd.read_csv('buy_sep.csv', low_memory=False)

In [4]:

    click.head()

Out[4]:

       SessionID                         TimeStamp     ItemID Category
    0    9293568  2014-09-01 18:07:00.855000+00:00  214853225        S
    1    9293653  2014-09-01 10:38:47.087000+00:00  214834871        S
    2    9293653  2014-09-01 10:39:49.115000+00:00  214849327        S
    3    9293653  2014-09-01 10:40:31.736000+00:00  214828970        S
    4    9293653  2014-09-01 10:41:01.640000+00:00  214849327        S

In [5]:

    print('unique click session')
    print(len(click.SessionID.unique()))

    unique click session
    99998

In [6]: [cell truncated]

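A minimal sketch of what a session-level ABT for task 1 could look like. It assumes the buy file has a `SessionID` column like the click file, which the notebook does not show. The toy rows and the feature names (`n_clicks`, `n_unique_items`, `n_unique_categories`, `repeat_click_ratio`, `bought`) are made up for illustration and are not the instructor's features.

```python
import pandas as pd

# Toy stand-ins for click_sep.csv and buy_sep.csv (made-up rows, not real data).
# Assumption: buy has a SessionID column, as click does.
click = pd.DataFrame({
    "SessionID": [1, 1, 1, 2, 3, 3],
    "ItemID": [10, 11, 10, 12, 10, 13],
    "Category": ["S", "S", "0", "0", "S", "S"],
})
buy = pd.DataFrame({"SessionID": [1, 1, 3]})

# Aggregate the click log so that each SessionID becomes exactly one row.
features = click.groupby("SessionID").agg(
    n_clicks=("ItemID", "size"),                  # total clicks in the session
    n_unique_items=("ItemID", "nunique"),         # distinct items viewed
    n_unique_categories=("Category", "nunique"),  # distinct categories viewed
).reset_index()

# Share of clicks that return to an item already seen in the same session.
features["repeat_click_ratio"] = 1 - features["n_unique_items"] / features["n_clicks"]

# Target label: 1 if the session appears anywhere in the buy data, else 0.
buy_sessions = set(buy["SessionID"])
features["bought"] = features["SessionID"].isin(buy_sessions).astype(int)

# The ABT: one row per unique click session, features plus label.
abt = features
assert abt["SessionID"].is_unique
```

Building the label with `isin` rather than a merge keeps the row count fixed at one per session, even when a session has several buy records.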
Sandeep Kumar · Accepted Answer

Answer Attached Below:

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Homework 5\n", "\n", "In this week's class, we went through the Recsys Chanllege 2015: \\\n",...

	SessionID	TimeStamp	ItemID	Category
0	9293568	2014-09-01 18:07:00.855000+00:00	214853225	S
1	9293653	2014-09-01 10:38:47.087000+00:00	214834871	S
2	9293653	2014-09-01 10:39:49.115000+00:00	214849327	S
3	9293653	2014-09-01 10:40:31.736000+00:00	214828970	S
4	9293653	2014-09-01 10:41:01.640000+00:00	214849327	S
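
The `TimeStamp` column in the sample rows above could also feed time-based session features. A rough sketch using only those five rows follows; the feature names (`duration_sec`, `start_hour`, `start_dayofweek`) are illustrative assumptions, not part of the assignment.

```python
import pandas as pd

# The five sample rows from click.head().
click = pd.DataFrame({
    "SessionID": [9293568, 9293653, 9293653, 9293653, 9293653],
    "TimeStamp": [
        "2014-09-01 18:07:00.855000+00:00",
        "2014-09-01 10:38:47.087000+00:00",
        "2014-09-01 10:39:49.115000+00:00",
        "2014-09-01 10:40:31.736000+00:00",
        "2014-09-01 10:41:01.640000+00:00",
    ],
    "ItemID": [214853225, 214834871, 214849327, 214828970, 214849327],
    "Category": ["S", "S", "S", "S", "S"],
})

# Parse the strings into timezone-aware datetimes so they can be subtracted.
click["TimeStamp"] = pd.to_datetime(click["TimeStamp"], utc=True)

# First and last click per session, still one row per SessionID.
time_features = click.groupby("SessionID").agg(
    start=("TimeStamp", "min"),
    end=("TimeStamp", "max"),
).reset_index()

# Session length in seconds (0 for single-click sessions).
time_features["duration_sec"] = (
    time_features["end"] - time_features["start"]
).dt.total_seconds()

# Hour of day and day of week (Monday = 0) of the first click.
time_features["start_hour"] = time_features["start"].dt.hour
time_features["start_dayofweek"] = time_features["start"].dt.dayofweek
```

On these rows, session 9293653 spans about 134.6 seconds, while the single-click session 9293568 has a duration of 0.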