


Final Project Part 2: Submit Your Project


In Module 10, Weeks 1 to 3, you will work on the approved project outline and use the problem-solving framework to build your final project. You can use this Python Notebook template to develop your code.


Using the insights and results you gain, create a project presentation by filling out slides 18-31. You can use the same template provided in Final Project Part 1.


You will need to sign in to my course to do this project.





Answered Same Day, Jun 15, 2021


Valupadasu answered on Jun 25 2021
Dartmouth Thayer_Data Science_Module 10_Final Project Submission.pptx
Module 10:
Final Project Template
Name: Kehinde Williams
1
Final Project Part 1:
Sentiment Analysis
2
Proposed Title of the Project
3
Sentiment analysis on Product Reviews and classification of tweets
*Please add slides as required.
Outline of the Project
4
Sentiment analysis helps us understand customer sentiment towards a product or a service
Sentiment analysis is usually applied to customer reviews, social media posts, and opinions about a service
The purpose of this project is to build a classification model that can classify the sentiment of customers towards Amazon products (mobile phones) and of social media tweets
Using sentiment analysis, we can identify hate in speech or in a review; racist and negative comments are generally treated as hate
Justification of the Project
5
Here are a few examples of why the project is important.
Business:
To understand customer preferences for certain products and services
Marketing departments form business strategies based on customer sentiment towards the products and services their company offers
Helps companies understand why customers prefer certain brands over others, and is useful in measuring the success of marketing campaigns
Measures customer satisfaction with emails
Measures the response to ad campaigns targeted at customers for various product launches
Politics:
Measures the attitude of tweeters and bloggers towards the newly elected president and the previous president
In the political sector, it is used to compare voter sentiment towards a politician before and after elections
Classification models can also be used to predict election results
Datasets
6
I used the datasets listed below; they are open source, with no usage restrictions.
Extracted 400k customer reviews of mobile phones listed for sale on Amazon, sourced from PromptCloud, to answer the following questions:
What are the top 20 mobile phone brands for sale on Amazon?
What is the distribution of mobile phone ratings?
Also used GitHub to download the tweets dataset.
Desired Outcomes
7
Amazon Product Reviews:
What are the top 20 mobile phone brands for sale on Amazon?
What is the distribution of mobile phone ratings?
The aim of this project is to build a classification model to classify the ratings of mobile phones
Twitter racist tweets classification:
Distribution of racist vs. non-racist tweets
Most frequent words used in racist and non-racist tweets
Types of hashtags associated with different kinds of tweets
Project Outline:
Check-in
8
Project Outline Check-in
9
In the field below, list all the items you have completed so far.
Data collection
Data preprocessing
Model building
Final Project Part 2: Project Presentation

10
Abstract
11
What is meant by sentiment analysis?
Why do companies need sentiment analysis?
What questions can be answered by sentiment analysis?
What are the different types of sentiment analysis available?
What are the various challenges faced during sentiment analysis?
Introduction to Sentiment Analysis
12
A sentiment is anything expressed in terms of emotions, views, feelings, or attitudes towards something
In general, it is binary in nature
For example: "I liked the product" or "I disliked the product"
"The dish was good" or "The dish was bad"
Almost all sentiments are subjective
The Problem (Context)
13
The purpose of this project is to build a classification model that can classify the sentiment of customers towards Amazon products (mobile phones) and of social media tweets, in order to understand customer emotion
As already discussed, the same approach can be applied to other products and services available on various platforms
Purpose of the Study
14
The purpose of this study is to use the data being generated in every corner of the world to understand and gauge customers better, so that companies can design products and services around customer needs
With improvements in technology, an average of 2-3 trillion pieces of data are generated, much of which goes unanalyzed. To understand our customers, we need to harness the power of data analysis
Project Description
15
Amazon Product Reviews:
What are the top 20 mobile phone brands for sale on Amazon?
What is the distribution of mobile phone ratings?
The aim of this project is to build a classification model to classify the ratings of mobile phones
Twitter racist tweets classification:
Distribution of racist vs. non-racist tweets
Most frequent words used in racist and non-racist tweets
Types of hashtags associated with different kinds of tweets
Project Description: Workflow
16
Define the problem 
Map the system 
Define suitable metrics 
Select modeling tools 
Implement model
Evaluate model 
Deploy the final model 
Project Description: Methods
17
Classification methods used:
Random Forest
Logistic Regression
k-NN classification
Natural language processing
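As a rough illustration of how one of the listed methods could be applied to review text, the sketch below chains a TF-IDF vectorizer with logistic regression using scikit-learn. The toy reviews are made up for illustration, the labels follow the notebook's scheme ('2' for good, '0' for poor), and the pipeline layout is an assumption rather than the notebook's actual configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy reviews, invented for illustration only.
reviews = [
    "great phone, battery lasts all day",
    "love the camera, great screen",
    "terrible battery, phone died in a week",
    "screen cracked, terrible quality",
]
# Labels follow the notebook's scheme: '2' = good, '0' = poor.
labels = ["2", "2", "0", "0"]

# Vectorize the text with TF-IDF, then fit a logistic regression classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(reviews, labels)

# Predict labels for two unseen (also invented) reviews.
predictions = model.predict(["battery is great", "screen broke, terrible"])
```

Swapping `LogisticRegression` for `RandomForestClassifier` or `KNeighborsClassifier` in the `"clf"` step would give a comparable sketch for the other listed methods.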
Project Description: Results and analysis
18
Amazon Product ratings
Samsung received the most reviews, followed by BLU and Apple
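A count like this can be obtained with pandas `value_counts`. The sketch below is a minimal example on made-up rows; the `Brand Name` column matches the one printed in the notebook, while the helper name `top_brands` is hypothetical.

```python
import pandas as pd

def top_brands(df, n=20, column="Brand Name"):
    """Return the n brands with the most review rows, most frequent first."""
    return df[column].value_counts().head(n)

# Toy rows, invented for illustration; the real dataset has about 400k reviews.
toy = pd.DataFrame(
    {"Brand Name": ["Samsung", "BLU", "Samsung", "Apple", "Samsung", "BLU"]}
)
counts = top_brands(toy, n=2)
```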
Project Description: Results and analysis
19
Amazon Product ratings
The majority of reviews are between 10 and 200 words long
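One way to measure review length, assuming it is counted in whitespace-separated words, is sketched below. The column name `Reviews` and the helper `review_word_counts` are assumptions for illustration; the source does not show the name of the review text column.

```python
import pandas as pd

def review_word_counts(df, text_column="Reviews"):
    """Count whitespace-separated words per review; missing reviews count as 0."""
    return df[text_column].fillna("").astype(str).str.split().str.len()

# Toy rows, invented for illustration.
toy = pd.DataFrame(
    {"Reviews": ["Great phone", "Battery died after two days", None]}
)
lengths = review_word_counts(toy)
```

Calling `lengths.hist()` on the result would then plot the distribution of review lengths.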
Project Description: Results and analysis
20
Amazon product ratings: top 10 words with the smallest and largest model coefficients
Project Description: Results and Analysis
21
Twitter racist tweets classification (most common words used in the dataset)
Project Description: Results and Analysis
22
Twitter racist tweets classification (most common words used in the dataset for non-racist tweets)
Project Description: Results and Analysis
23
Twitter racist tweets classification (most common words used in the dataset for racist tweets)
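Word frequencies for one class of tweets can be tallied with a `Counter`. The sketch below is a minimal version under simple assumptions: tweets are lowercased and split on runs of letters, and the helper `most_common_words` and its sample inputs are hypothetical placeholders, not rows from the dataset.

```python
import re
from collections import Counter

def most_common_words(tweets, n=10, stopwords=frozenset()):
    """Return the n most frequent lowercase words across the given tweets."""
    counter = Counter()
    for text in tweets:
        words = re.findall(r"[a-z']+", text.lower())
        counter.update(word for word in words if word not in stopwords)
    return counter.most_common(n)

# Placeholder tweets, invented for illustration.
sample = ["love love smile", "smile love"]
top_words = most_common_words(sample, n=2)
```

Running the helper once on the tweets of each class gives the per-class word lists that the slides compare.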
Project Description: Results and Analysis
24
Twitter racist tweets classification (top 10 most frequent hashtags for non-racist tweets)
Project Description: Results and Analysis
25
Twitter racist tweets classification (top 10 most frequent hashtags for racist tweets)
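Hashtag frequencies can be extracted with a regular expression and a `Counter`. The sketch below is one possible approach; the helper `top_hashtags` and the sample tweets are hypothetical, and hashtags are lowercased so that `#Love` and `#love` count together.

```python
import re
from collections import Counter

# Matches a '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags (lowercased, without '#')."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG.findall(text)
    )
    return counts.most_common(n)

# Placeholder tweets, invented for illustration.
sample = ["#Love this day #smile", "so much #love", "no tags here"]
hashtags = top_hashtags(sample, n=2)
```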
Conclusion
26
Sentiment towards preferred mobile phones on the Amazon portal was analyzed, along with sentiment on various news channels.
As per our analysis, the most-reviewed brands are Samsung, BLU, and Apple
The most used hashtags for non-racist comments are Love, Smile, and Healthy
The most used hashtags for racist comments are Trump, Black, and hate
Bibliography
27
https://www.promptcloud.com/product-pricing-and-reviews-from-customer-sites/
Github.com
28
SentimentAnalysis_final.ipynb
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Demo 1: Amazon phone reviews\n",
"\n",
"import pandas as pd\n",
"import numpy as np\n",
"import nltk\n",
"import future\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"\n",
"from sklearn.model_selection import train_test_split, GridSearchCV\n",
"from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer\n",
"from sklearn.naive_bayes import BernoulliNB, MultinomialNB\n",
"from sklearn import metrics\n",
"from sklearn.metrics import roc_auc_score, accuracy_score\n",
"from sklearn.preprocessing import label_binarize\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn import svm\n",
"from sklearn.svm import LinearSVC\n",
"from sklearn.svm import SVR\n",
"\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"from bs4 import BeautifulSoup \n",
"import re\n",
"from nltk.corpus import stopwords "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"data1= pd.read_csv('C:/Users/satyadurga.mandapati/Downloads/Amazonproduct_dataset.csv')\n",
"data1.head(10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data1.dtypes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print (data1['Brand Name'].unique())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Label the reviews as 0,1 and 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def label_data():\n",
" rows = pd.read_csv('C:/Users/satyadurga.mandapati/Downloads/Amazonproduct_dataset.csv', header=0, index_col=False, delimiter=',')\n",
" labels = []\n",
" for cell in rows['Rating']:\n",
" if cell >= 4:\n",
" labels.append('2') #Good\n",
" elif cell == 3:\n",
" labels.append('1') #Neutral\n",
" else:\n",
" labels.append('0') #Poor\n",
" \n",
" rows['Label'] = labels\n",
" return rows"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if __name__ == '__main__':\n",
" data = label_data()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
...