Practical 1 - Part C: Naive Bayes classifier for rainfall prediction in Australia

Practical 1 - Part C

1 Practical 1 Coding

This notebook is an exercise in developing a Naive Bayes classifier for rainfall prediction in Australia. We apply the concepts discussed in Week 3. You should follow the code from the Week 3 Practical Activity to complete this practical.

Note: this assessment is marked. Please submit your work using the submission link.

1.1 The Australia Rainfall dataset

Source: https://www.kaggle.com/gaurav896/weather-in-australia

This dataset contains daily weather observations from numerous Australian weather stations. The target variable is RainTomorrow. The prediction task is to determine whether or not it will rain tomorrow.

How to read xlsx files? See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html

Q1: How many samples are there in the dataset?
Answer:

Q2: What are the features in the dataset?
Answer:

Q3: What are the target classes? How many samples are in each target class?
Answer:

Q4: What is the data type of each feature?
Answer:

We can rename the columns for convenience. Read https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html

Q5: Show the value distribution of the following nominal attributes: {State, Max Wind gust Dir, Dir - 9:00 AM - km/h}
Answer:

Q6: Write code to encode categorical features.
Answer:

Q7: Write code to select the required features.
Answer:

Q8: Write code to divide the dataset into training and test sets. Use a 75/25 split.
Answer:

Q9: Develop a Naive Bayes classifier using the dataset.
Answer:
Note: you may need to use the max_categories parameter of MixedNB.

Q10: What is the accuracy of the model?
Answer:
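Q5 to Q10 describe a pipeline: inspect the nominal values, encode them, select features, split 75/25, fit a Naive Bayes model, and measure accuracy. The following is a minimal from-scratch sketch of those steps, assuming made-up toy rows instead of the real spreadsheet and using only the Python standard library. It is not MixedNB; the names `encode`, `fit`, `predict`, `CATEGORICAL` and `NUM_CATEGORIES` are hypothetical and exist only for this illustration. In the practical itself the same steps would normally be done with pandas and MixedNB.

```python
import math
import random
from collections import Counter

# Hypothetical toy rows shaped like the dataset:
# (State, Max Wind gust Dir, MSLP at 3pm in hPa, Rain).
# The values are invented for illustration, not read from australia-rainfall.xlsx.
WET_DIRS, DRY_DIRS = ["NW", "WNW", "W"], ["SE", "ESE", "E"]
rows = []
for i in range(40):
    rain = i % 2 == 0
    state = ["TAS", "ACT", "NSW"][i % 3]
    direction = (WET_DIRS if rain else DRY_DIRS)[i % 3]
    pressure = (990.0 if rain else 1030.0) + i % 5
    rows.append((state, direction, pressure, "Y" if rain else "N"))

# Q5: value distribution of a nominal attribute.
state_counts = Counter(r[0] for r in rows)

# Q6: encode a categorical column as integer codes (sorted labels -> 0, 1, 2, ...).
def encode(values):
    codes = {label: k for k, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

state_codes, state_map = encode([r[0] for r in rows])
dir_codes, dir_map = encode([r[1] for r in rows])

# Q7: keep only the selected features; the target is stored separately.
X = [(s, d, r[2]) for s, d, r in zip(state_codes, dir_codes, rows)]
y = [r[3] for r in rows]
CATEGORICAL = [0, 1]                               # column indices treated as categorical
NUM_CATEGORIES = [len(state_map), len(dir_map)]    # categories per categorical column

# Q8: shuffled 75/25 split with a fixed seed.
idx = list(range(len(X)))
random.Random(0).shuffle(idx)
cut = int(0.75 * len(idx))
train, test = idx[:cut], idx[cut:]

# Q9: Naive Bayes with Laplace-smoothed categorical likelihoods
# and Gaussian likelihoods for the numeric column.
def fit(X, y, rows_idx):
    model = {}
    for c in set(y[i] for i in rows_idx):
        members = [X[i] for i in rows_idx if y[i] == c]
        params = []
        for j in range(len(X[0])):
            col = [m[j] for m in members]
            if j in CATEGORICAL:
                k = NUM_CATEGORIES[CATEGORICAL.index(j)]
                counts = Counter(col)
                params.append([(counts[v] + 1) / (len(col) + k) for v in range(k)])
            else:
                mean = sum(col) / len(col)
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
        model[c] = (math.log(len(members) / len(rows_idx)), params)
    return model

def predict(model, x):
    def score(c):
        log_prior, params = model[c]
        total = log_prior
        for j, v in enumerate(x):
            if j in CATEGORICAL:
                total += math.log(params[j][v])
            else:
                mean, var = params[j]
                total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return total
    return max(model, key=score)

model = fit(X, y, train)

# Q10: accuracy on the held-out 25%.
accuracy = sum(predict(model, X[i]) == y[i] for i in test) / len(test)
print(len(train), len(test), accuracy)
```

The toy classes are deliberately well separated by pressure, so the score here says nothing about how the real data behaves. The point is only the order of the steps and the way categorical and numeric likelihoods are summed in log space.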
Answered 10 days after Mar 04, 2021


Neha answered on Mar 04 2021
{
"cells": [
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" State Station Month Min °C \\\n",
"0 TAS Hobart (Ellerslie Road) {station 094029} May 8.5 \n",
"1 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 \n",
"2 TAS Hobart (Ellerslie Road) {station 094029} December 7.7 \n",
"3 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 \n",
"4 TAS Devonport Airport {station 091126} May 6.6 \n",
"... ... ... ... ... \n",
"9934 TAS Devonport Airport {station 091126} July 5.4 \n",
"9935 ACT Canberra Airport {station 070351} July 2.5 \n",
"9936 TAS Hobart (Ellerslie Road) {station 094029} July 2.4 \n",
"9937 TAS Devonport Airport {station 091126} July 3.4 \n",
"9938 TAS Hobart (Ellerslie Road) {station 094029} July 4.1 \n",
"\n",
" Max °C Max Wind gust Dir Max wind gust Spd - km/h Temp °C- 9:00AM \\\n",
"0 15.0 NNW 57 10.1 \n",
"1 18.4 NW 56 12.8 \n",
"2 15.8 WNW 76 12.1 \n",
"3 16.6 WSW 69 12.2 \n",
"4 14.5 WSW 56 8.9 \n",
"... ... ... ... ... \n",
"9934 14.6 ENE 30 9.7 \n",
"9935 13.3 E 35 9.0 \n",
"9936 12.5 NNW 35 3.0 \n",
"9937 14.9 SSE 28 5.4 \n",
"9938 12.7 NNW 28 4.5 \n",
"\n",
" Dir - 9:00 AM - km/h MSLP- hPa - 9:00AM Temp °C- 3:00PM \\\n",
"0 NNW 989.1 14.6 \n",
"1 NNW 992.0 17.3 \n",
"2 WNW 987.4 14.1 \n",
"3 SE 984.7 12.2 \n",
"4 SSE 990.0 13.8 \n",
"... ... ... ... \n",
"9934 SSE 1040.3 13.1 \n",
"9935 ESE 1038.1 12.2 \n",
"9936 NNW 1041.0 11.5 \n",
"9937 SSE 1040.8 14.6 \n",
"9938 NNW 1041.1 12.3 \n",
"\n",
" Dir - 3:00 PM - km/h Spd - 3:00PM - km/h MSLP- hPa - 3:00PM Rain(Y/N) \n",
"0 NW 7 984.5 Y \n",
"1 WSW 20 986.2 Y \n",
"2 NW 28 987.2 Y \n",
"3 SW 35 987.2 Y \n",
"4 NW 11 987.3 Y \n",
"... ... ... ... ... \n",
"9934 NE 19 1036.7 N \n",
"9935 SE 19 1036.7 N \n",
"9936 NE 7 1037.1 N \n",
"9937 ENE 7 1039.8 N \n",
"9938 NNE 6 1040.1 N \n",
"\n",
"[9939 rows x 15 columns]\n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn import preprocessing\n",
"\n",
"# Load the dataset from the Excel workbook in the working directory\n",
"df = pd.read_excel(r'australia-rainfall.xlsx')\n",
"print(df)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"State 9939\n",
"Station 9939\n",
"Month 9939\n",
"Min °C 9939\n",
"Max °C 9939\n",
"Max Wind gust Dir 9939\n",
"Max wind gust Spd - km/h 9939\n",
"Temp °C- 9:00AM 9939\n",
"Dir - 9:00 AM - km/h 9939\n",
"MSLP- hPa - 9:00AM 9939\n",
"Temp °C- 3:00PM 9939\n",
"Dir - 3:00 PM - km/h 9939\n",
"Spd - 3:00PM - km/h 9939\n",
"MSLP- hPa - 3:00PM 9939\n",
"Rain(Y/N) 9939\n",
"dtype: int64"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
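"# How many samples are there in the dataset?\n",
"# count() returns the non-null count per column; every column reports 9939, so there are 9939 samples\n",
"df.count(axis=0)"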
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['State', 'Station', 'Month', 'Min °C', 'Max °C', 'Max Wind gust Dir',\n",
" 'Max wind gust Spd - km/h', 'Temp °C- 9:00AM', 'Dir - 9:00 AM - km/h',\n",
" 'MSLP- hPa - 9:00AM', 'Temp °C- 3:00PM', 'Dir - 3:00 PM - km/h',\n",
" 'Spd - 3:00PM - km/h', 'MSLP- hPa - 3:00PM', 'Rain(Y/N)'],\n",
" dtype='object')"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# What are the features in the dataset?\n",
"df.columns"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['State', 'Station', 'Month', 'Min °C', 'Max °C', 'Max Wind gust Dir',\n",
" 'Max wind gust Spd - km/h', 'Temp °C- 9:00AM', 'Dir - 9:00 AM - km/h',\n",
" 'MSLP- hPa - 9:00AM', 'Temp °C- 3:00PM', 'Dir - 3:00 PM - km/h',\n",
" 'Spd - 3:00PM - km/h', 'MSLP- hPa - 3:00PM', 'Rain(Y/N)'],\n",
" dtype='object')"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.keys()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" State Station Month Min °C Max °C \\\n",
"0 TAS Hobart (Ellerslie Road) {station 094029} May 8.5 15.0 \n",
"1 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 18.4 \n",
"2 TAS Hobart (Ellerslie Road) {station 094029} December 7.7 15.8 \n",
"3 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 16.6 \n",
"4 TAS Devonport Airport {station 091126} May 6.6 14.5 \n",
"\n",
" Max Wind gust Dir Max wind gust Spd - km/h Temp °C- 9:00AM \\\n",
"0 NNW 57 10.1 \n",
"1 NW 56 12.8 \n",
"2 WNW 76 12.1 \n",
"3 WSW 69 12.2 \n",
"4 WSW 56 8.9 \n",
"\n",
" Dir - 9:00 AM - km/h MSLP- hPa - 9:00AM Temp °C- 3:00PM \\\n",
"0 NNW 989.1 14.6 \n",
"1 NNW 992.0 17.3 \n",
"2 WNW 987.4 14.1 \n",
"3 SE 984.7 12.2 \n",
"4 SSE 990.0 13.8 \n",
"\n",
" Dir - 3:00 PM - km/h Spd - 3:00PM - km/h MSLP- hPa - 3:00PM Rain(Y/N) \n",
"0 NW 7 984.5 Y \n",
"1 WSW 20 986.2 Y \n",
"2 NW 28 987.2 Y \n",
"3 SW 35 987.2 Y \n",
"4 NW 11 987.3 Y \n"
]
}
],
"source": [
"print(df.head(5))"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
],
"text/plain": [
" State Station Month Min °C \\\n",
"0 TAS Hobart (Ellerslie Road) {station 094029} May 8.5 \n",
"1 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 \n",
"2 TAS Hobart (Ellerslie Road) {station 094029} December 7.7 \n",
"3 TAS Hobart (Ellerslie Road) {station 094029} December 8.5 \n",
"4 TAS Devonport Airport {station 091126} May 6.6 \n",
"... ... ... ... ... \n",
"9934 TAS Devonport Airport {station 091126} July 5.4 \n",
"9935 ACT Canberra Airport {station 070351} July 2.5 \n",
"9936 TAS Hobart (Ellerslie Road) {station 094029} July 2.4 \n",
"9937 TAS Devonport Airport {station 091126} July 3.4 \n",
"9938 TAS Hobart (Ellerslie Road) {station 094029} July 4.1 \n",
"\n",
" Max °C Max Wind gust Dir Max wind gust Spd - km/h Temp °C- 9:00AM \\\n",
"0 15.0 NNW 57 10.1 \n",
"1 18.4 NW 56 12.8 \n",
"2 15.8 WNW 76 12.1 \n",
"3 16.6 WSW 69 12.2 \n",
"4 14.5 WSW 56 8.9 \n",
"... ... ... ... ... \n",
"9934 14.6 ENE 30 9.7 \n",
"9935 13.3 E 35 9.0 \n",
"9936 12.5 NNW 35 3.0 \n",
"9937 14.9 SSE 28 5.4 \n",
"9938 12.7 NNW 28 4.5 \n",
"\n",
" Dir - 9:00 AM - km/h MSLP- hPa - 9:00AM Temp °C- 3:00PM \\\n",
"0 NNW 989.1 14.6 \n",
"1 NNW 992.0 17.3 \n",
"2 WNW 987.4 14.1 \n",
"3 SE 984.7 12.2 \n",
"4 SSE 990.0 13.8 \n",
"... ... ... ... \n",
"9934 SSE 1040.3 13.1 \n",
"9935 ESE 1038.1 12.2 \n",
"9936 NNW 1041.0 11.5 \n",
"9937 SSE 1040.8 14.6 \n",
"9938 NNW 1041.1 12.3 \n",
"\n",
" Dir - 3:00 PM - km/h Spd - 3:00PM - km/h MSLP- hPa - 3:00PM Rain \n",
"0 NW 7 984.5 Y \n",
"1 WSW 20 986.2 Y \n",
"2 NW 28 987.2 Y \n",
"3 SW 35 987.2 Y \n",
"4 NW 11 987.3 Y \n",
"... ... ... ... ... \n",
"9934 NE 19 1036.7 N \n",
"9935 SE 19 1036.7 N \n",
"9936 NE 7 1037.1 N \n",
"9937 ENE 7 1039.8 N \n",
"9938 NNE 6 1040.1 N \n",
"\n",
"[9939 rows x 15 columns]"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Rename the target column for convenience; rename() returns a new DataFrame, so df itself keeps 'Rain(Y/N)'\n",
"df.rename(columns={\"Rain(Y/N)\": \"Rain\"})"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 9939 entries, 0 to 9938\n",
"Data columns (total 15 columns):\n",
"State 9939 non-null object\n",
"Station 9939 non-null object\n",
"Month 9939 non-null object\n",
"Min °C 9939 non-null float64\n",
"Max °C 9939 non-null float64\n",
"Max Wind gust Dir 9939 non-null object\n",
"Max wind gust Spd - km/h 9939 non-null int64\n",
"Temp °C- 9:00AM 9939 non-null float64\n",
"Dir - 9:00 AM - km/h 9939 non-null object\n",
"MSLP- hPa - 9:00AM 9939 non-null float64\n",
"Temp °C- 3:00PM 9939 non-null float64\n",
"Dir - 3:00 PM - km/h 9939 non-null object\n",
"Spd - 3:00PM - km/h 9939 non-null int64\n",
"MSLP- hPa - 3:00PM 9939 non-null float64\n",
"Rain(Y/N) 9939 non-null object\n",
"dtypes: float64(6), int64(2), object(7)\n",
"memory usage: 1.1+ MB\n",
"None\n"
]
}
],
"source": [
"#What is the data type of each feature?\n",
"print(df.info())"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"#Show the value distribution of the following nominal attributes"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"(array([ 839., 1659., 830., 0., 833., 1241., 0., 1650., 1648.,\n",
" 1239.]),\n",
" array([0. , 0.7, 1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.3, 7. ]),\n",
" )"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png":...