assignment_1_notebook

SIT744 Assignment 1: Image Classification with Deep Feedforward Neural Network

Due: 8:00 pm (AEST) 19 April 2021 (Monday)

This is an individual assignment. It contributes 30% to your final mark. Read the assignment instruction carefully.

What to submit

This assignment is to be completed individually and submitted to CloudDeakin. By the due date, you are required to submit the following files to the corresponding Assignment (Dropbox) in CloudDeakin:

1. [YourID]_assignment1_solution.ipynb: This is your Python notebook solution source file.
2. [YourID]_assignment1_output.html: This is the output of your Python notebook solution exported in HTML format.
3. Extra files needed to complete your assignment, if any (e.g., images used in your answers).

For example, if your student ID is 123456, you will then need to submit the following files:

- 123456_assignment1_solution.ipynb
- 123456_assignment1_output.html

Marking criteria

Your submission will be marked using the following criteria.

- Showing good effort through completed tasks.
- Applying deep learning theory to design suitable deep learning solutions for the tasks.
- Critically evaluating and reflecting on the pros and cons of various design decisions.
- Demonstrating creativity and resourcefulness in providing unique individual solutions. (Warning: Highly similar solutions will be investigated for collusion.)
- Showing attention to details through a good quality assignment report.

Indicative weights of various tasks are provided below, but the assignment will be marked by the overall quality per the above criteria.

Assignment objective

This assignment is for you to demonstrate the knowledge in deep learning that you have acquired from the lectures and practical lab materials. Most tasks in this assignment are straightforward applications of the practical materials in weeks 1-5. Going through these materials before attempting this assignment is highly recommended.
In this assignment, you are going to work with the Fashion-MNIST dataset for image recognition. The dataset contains 10 classes of 28x28 grayscale images. You will see some examples in the visualization task below. This assignment consists of five tasks.

Task 1 Load the data (weight ~5%)

Load the Fashion MNIST dataset (https://github.com/zalandoresearch/fashion-mnist). You may get the data via Keras (keras.datasets) or TensorFlow Datasets (tfds).

Task 2 Understand the data (weight ~15%)

Display 25 images from the train set in the form of 5x5 matrix. Answer the following questions:

1. What are the data types and shapes of the features and the label?
2. What are the unique labels in this dataset?
3. How many training images and how many test images?
4. What is the size of each image? How much memory is required for holding the whole training data.
5. Find out the numeric range of the input. Do we need to rescale the input? Why?

Task 3 Construct an input pipeline (weight ~15%)

Create train/validate/test data splits and construct tf.data pipelines. Make sure that the training data is batched.

- How do you determine the batch size?
- Do we need to shuffle the training data? If yes, how do you determine the buffer size?

Task 4 Construct a deep feedforward neural network (weight ~35%)

Task 4.1 Setting up a model for training

Construct a deep feedforward neural network. You need to decide and report the following configurations:

- Output layer: How many output nodes? Which activation function?
- Hidden layers: How many hidden layers? How many nodes in each layer? Which activation function for each layer?
- Input layer: What is the input size?
- The loss function
- The metrics for model evaluation (which may be different from the loss function)
- The optimiser

Justify your model design decisions.

Plot the model structure using keras.utils.plot_model or similar tools.

What is the number of trainable parameters in the model. Explain how the total number can be estimated from the model configurations.
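The last question of Task 4.1 (estimating the trainable parameter count from the configuration) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the assignment: the function name `dense_param_count` is hypothetical, and the layer widths are an assumption chosen to match the 784-512-512-10 network used in the answer further down. It relies only on the fact that a fully connected layer has one weight per input per unit plus one bias per unit.

```python
def dense_param_count(layer_sizes):
    """Count trainable parameters of a fully connected network.

    layer_sizes lists the width of every layer, input first,
    e.g. [784, 512, 512, 10]. Each dense layer contributes
    (fan_in + 1) * units parameters: one weight per input plus one bias.
    """
    per_layer = [
        (fan_in + 1) * units
        for fan_in, units in zip(layer_sizes[:-1], layer_sizes[1:])
    ]
    return per_layer, sum(per_layer)


# Assumed widths: 784 inputs (28 x 28 pixels), two hidden layers of 512, 10 classes.
per_layer, total = dense_param_count([784, 512, 512, 10])
print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```

These figures agree with the `model.summary()` output reported in the answer, which is a quick way to confirm that the reported configuration and the built model match.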
Task 4.2 Fitting the model

Before you fit the model, think about what initialisation method you have chosen. If you did not specify the initialisation method, find out what the default one is. Choose a layer and visualise its initial weights. (Hint: You may use UMAP to visualise a collection of high-dimension vectors.)

Decide and report the following training setting:

1. The training batch size
2. The number of training epochs (at least 1,000 epochs recommended)
3. The learning rate. If you used momentum or a learning rate schedule, please report the configuration as well.

Now fit the model. Show how the training loss changes. How did you decide when to stop training?

After fitting the model, visualise the model weights again. How did the weights change? Why?

Task 4.3 Check the training using TensorBoard

Use TensorBoard to visualise the training process. Show screenshots of your TensorBoard output.

Optional task: Record the gradients during training and use TensorBoard to visualise the gradients.

Task 5 Overfitting and regularisation
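For the questions on when to stop training (Task 4.2) and at which epoch overfitting sets in (Task 5), one common rule is patience-based early stopping: stop once the validation loss has not improved for a fixed number of epochs. The sketch below is illustrative only. The function `early_stopping_epoch` and the patience value are assumptions, and the loss values are copied from the 20-epoch training log in the answer further down. The same idea is available in Keras as the `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    stop_epoch is simply the last epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# val_loss per epoch, copied from the training log of the answer notebook.
val_losses = [
    0.3990, 0.4374, 0.4658, 0.4453, 0.3952, 0.4261, 0.4091, 0.4385, 0.5819, 0.5518,
    0.4855, 0.5141, 0.4493, 0.4972, 0.5129, 0.4962, 0.4799, 0.5018, 0.7292, 0.5742,
]

# patience=5 is an assumed setting, not one reported in the answer.
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=5)
print(stop_epoch, best_epoch)  # 10 5
```

With this rule the lowest validation loss is at epoch 5, while the training loss keeps falling through epoch 20, which is the usual signature of overfitting.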
Answered 4 days after Apr 13, 2021 | SIT744 | Deakin University

Vicky answered on Apr 18 2021
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "v81NrNYlAzIm"
},
"source": [
"# SIT744 Assignment 1: Image Classification with Deep Feedforward Neural Network\n",
"\n",
"Due: 8:00 pm (AEST) 19 April 2021 (Monday)\n",
"\n",
"This is an individual assignment. It contributes 30% to your final mark. Read the assignment instruction carefully.\n",
"\n",
"## What to submit\n",
"\n",
"This assignment is to be completed individually and submitted to CloudDeakin. By the due date, you are required to submit the following files to the corresponding Assignment (Dropbox) in CloudDeakin:\n",
"\n",
"1. [YourID]_assignment1_solution.ipynb: This is your Python notebook solution source file.\n",
"2. [YourID]_assignment1_output.html: This is the output of your Python notebook solution exported in HTML format.\n",
"3. Extra files needed to complete your assignment, if any (e.g., images used in your answers).\n",
"\n",
"For example, if your student ID is 123456, you will then need to submit the following files:\n",
"\n",
"- 123456_assignment1_solution.ipynb\n",
"- 123456_assignment1_output.html\n",
"\n",
"## Marking criteria\n",
"\n",
"Your submission will be marked using the following criteria.\n",
"\n",
"- Showing good effort through completed tasks.\n",
"- Applying deep learning theory to design suitable deep learning solutions for the tasks.\n",
"- Critically evaluating and reflecting on the pros and cons of various design decisions.\n",
"- Demonstrating creativity and resourcefulness in providing unique individual solutions. (Warning: Highly similar solutions will be investigated for collusion.)\n",
"- Showing attention to details through a good quality assignment report.\n",
"\n",
"Indicative weights of various tasks are provided below, but the assignment will be marked by the overall quality per the above criteria."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JOr2bP8kAzKE"
},
"source": [
"## Assignment objective\n",
"\n",
"\n",
"\n",
"This assignment is for you to demonstrate the knowledge in deep learning that you have acquired from the lectures and practical lab materials. Most tasks in this assignment are straightforward applications of the practical materials in weeks 1-5. Going through these materials before attempting this assignment is highly recommended.\n",
"\n",
"In this assignment, you are going to work with the Fashion-MNIST dataset for image recognition. The dataset contains 10 classes of 28x28 grayscale images. You will see some examples in the visualization task below. \n",
"\n",
"This assignment consists of five tasks.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cDqvFgbRAzKG"
},
"source": [
"## Task 1 Load the data\n",
"\n",
"*(weight ~5%)*\n",
"\n",
"Load the Fashion MNIST dataset (https://github.com/zalandoresearch/fashion-mnist). You may get the data via Keras (keras.datasets) or Tensorflow Datasets (tfds). "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
}
],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from keras.datasets import fashion_mnist\n",
"from keras.utils import to_categorical"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# load dataset\n",
"(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',\n",
" 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nFbty6i5AzKm"
},
"source": [
"## Task 2 Understand the data\n",
"\n",
"*(weight ~15%)*\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PLYAtx-ZAzKo"
},
"source": [
"Display 25 images from the train set in the form of 5x5 matrix.\n",
"\n",
"Answer the following questions:\n",
"\n",
"1. What are the data types and shapes of the features and the label? \n",
"2. What are the unique labels in this dataset?\n",
"3. How many training images and how many test images?\n",
"4. What is the size of each image? How much memory is required for holding the whole training data.\n",
"5. Find out the numeric range of the input. Do we need to rescale the input? Why?\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Display 25 images from the train set in the form of 5x5 matrix.\n",
"\n",
"plt.figure(figsize=(10,10))\n",
"for i in range(25):\n",
" plt.subplot(5,5,i+1)\n",
" plt.xticks([])\n",
" plt.yticks([])\n",
" plt.grid(False)\n",
" plt.imshow(train_images[i], cmap=plt.cm.binary)\n",
" plt.xlabel(class_names[train_labels[i]])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data type of features: uint8\n",
"Data type of label: uint8\n"
]
}
],
"source": [
"# What are the data types and shapes of the features and the label?\n",
"print('Data type of features: ', train_images.dtype.name)\n",
"print('Data type of label: ', train_labels.dtype.name)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training data shape : (60000, 28, 28) (60000,)\n",
"Testing data shape : (10000, 28, 28) (10000,)\n"
]
}
],
"source": [
"# What are the shapes of the features and the label?\n",
"print('Training data shape : ', train_images.shape, train_labels.shape)\n",
"print('Testing data shape : ', test_images.shape, test_labels.shape)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unique labels: [0 1 2 3 4 5 6 7 8 9]\n"
]
}
],
"source": [
"# What are the unique labels in this dataset?\n",
"print('Unique labels:',np.unique(train_labels))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"No. of training images: 60000\n",
"No. of test images: 10000\n"
]
}
],
"source": [
"# How many training images and how many test images?\n",
"print('No. of training images:',len(train_images))\n",
"print('No. of test images:',len(test_images))"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Size of each image: (28, 28)\n"
]
}
],
"source": [
"# What is the size of each image?\n",
"print('Size of each image:',train_images[0].shape)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Memory required for holding the whole training data: 47040000 bytes\n"
]
}
],
"source": [
"# How much memory is required for holding the whole training data.\n",
"# ndarray.nbytes counts the pixel buffer itself: 60000 * 28 * 28 * 1 byte (uint8).\n",
"# sys.getsizeof can report only the array header when the array does not own its data.\n",
"print('Memory required for holding the whole training data:',train_images.nbytes,'bytes')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Numeric range of input: 0 - 255\n"
]
}
],
"source": [
"# Find out the numeric range of the input.\n",
"print('Numeric range of input:',train_images.min(),'-',train_images.max())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Do we need to rescale the input? Why?\n",
"Yes. The pixel values are integers in the range 0 to 255, and dividing by 255 maps them to [0, 1]. Rescaling does not change the number of parameters. It keeps the inputs on a scale comparable to the initial weights, so activations and gradients stay in a well-behaved range and gradient-based training is faster and more stable."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"# Flatten each 28x28 image into a vector of 784 values\n",
"dim_data = np.prod(train_images.shape[1:])\n",
"train_data = train_images.reshape(train_images.shape[0], dim_data)\n",
"test_data = test_images.reshape(test_images.shape[0], dim_data)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"train_data = train_data.astype('float32')/255.0\n",
"test_data = test_data.astype('float32')/255.0"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# Change the labels from integer to categorical data\n",
"train_labels_one_hot = to_categorical(train_labels)\n",
"test_labels_one_hot = to_categorical(test_labels)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-tkl-PwsrXip"
},
"source": [
"## Task 3 Construct an input pipeline\n",
"\n",
"*(weight ~15%)*\n",
"\n",
"Create train/validate/test data splits and construct tf.data pipelines. Make sure that the training data is batched. \n",
"\n",
"- How do you determine the batch size?\n",
"- Do we need to shuffle the training data? If yes, how do you determine the buffer size?\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create train/validate/test data splits\n",
"The dataset from Zalando Research already comes split into 60,000 training and 10,000 test images (roughly 86:14), and we use that split as is. No separate validation set is held out; the test set is passed as the validation data when fitting the model."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"from tensorflow.keras.models import Sequential\n",
"from tensorflow.keras.layers import Dense"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"model = Sequential()\n",
"model.add(Dense(512, activation='relu', input_shape=(dim_data,)))\n",
"model.add(Dense(512, activation='relu'))\n",
"model.add(Dense(10, activation='softmax'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### How do you determine the batch size?\n",
"We use a batch size of 64. Small batches give more frequent weight updates, so training tends to converge in fewer epochs, at the cost of a noisier loss curve."
]
},
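{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Do we need to shuffle the training data? If yes, how do you determine the buffer size?\n",
"We do not add an explicit shuffle step. The training data is already stored in random class order, and `model.fit` reshuffles array inputs before each epoch by default (`shuffle=True`), so no shuffle buffer size has to be chosen here."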
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lnE7ZFS_AzLg"
},
"source": [
"## Task 4 Construct a deep feedforward neural network\n",
"\n",
"*(weight ~35%)*"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "X6gil-HshhHI"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "h2qU873qfGVY"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ChlhkwhkAzLi"
},
"source": [
"### Task 4.1 Setting up a model for training"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yuxhQM6jAzLl"
},
"source": [
"Construct a deep feedforward neural network. You need to decide and report the following configurations:\n",
"\n",
"- Output layer: \n",
" - How many output nodes?\n",
" - Which activation function?\n",
"- Hidden layers:\n",
" - How many hidden layers?\n",
" - How many nodes in each layer?\n",
" - Which activation function for each layer?\n",
"- Input layer\n",
" - What is the input size?\n",
"- The loss function\n",
"- The metrics for model evaluation (which may be different from the loss function)\n",
"- The optimiser\n",
"\n",
"Justify your model design decisions.\n",
"\n",
"Plot the model structure using `keras.utils.plot_model` or similar tools.\n",
"\n",
"What is the number of trainable parameters in the model. Explain how the total number can be estimated from the model configurations."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense (Dense) (None, 512) 401920 \n",
"_________________________________________________________________\n",
"dense_1 (Dense) (None, 512) 262656 \n",
"_________________________________________________________________\n",
"dense_2 (Dense) (None, 10) 5130 \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Output layer: \n",
" - How many output nodes? 10\n",
" - Which activation function? softmax\n",
"- Hidden layers:\n",
" - How many hidden layers? 2\n",
" - How many nodes in each layer? 512 in both layers\n",
" - Which activation function for each layer? relu for both layers\n",
"- Input layer\n",
" - What is the input size? 784\n",
"- The loss function -> categorical_crossentropy\n",
"- The metrics for model evaluation (which may be different from the loss function) -> accuracy\n",
"- The optimiser -> rmsprop"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
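],
"source": [
"# tf is not imported anywhere above, so import it before using tf.keras\n",
"import tensorflow as tf\n",
"\n",
"tf.keras.utils.plot_model(\n",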
" model, to_file='model.png', show_shapes=False,\n",
" show_layer_names=True, rankdir='TB', expand_nested=False, dpi=96\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Number of trainable parameters in the model = 669,706\n",
"\n",
"input nodes = 784\n",
"\n",
"hidden_layer_1 nodes = 512\n",
"\n",
"hidden_layer_2 nodes = 512\n",
"\n",
"output_layer nodes = 10\n",
"\n",
"\n",
"Each Dense layer has (inputs + 1) * units parameters, where the +1 is the bias of each unit. Layer names follow the model summary above.\n",
"\n",
"dense layer parameters = (784+1)*512 = 401920\n",
"\n",
"dense_1 layer parameters = (512+1)*512 = 262656\n",
"\n",
"dense_2 layer parameters = (512+1)*10 = 5130\n",
"\n",
"Total trainable parameters = 401920 + 262656 + 5130 = 669706"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MZ2BVK5tAzMM"
},
"source": [
"### Task 4.2 Fitting the model\n",
"\n",
"Before you fit the model. Think about what initialisation method have you chosen? If you did not specify the initialisation method, find out what is the default one. Choose a layer and visualise its initial weights. (Hint: You may use UMAP to visualise a collection of high-dimension vectors.)\n",
"\n",
"Decide and report the following training setting:\n",
"\n",
"1. The training batch size\n",
"2. The number of training epochs (at least 1,000 epochs recommended)\n",
"3. The learning rate. If you used momentum or a learning rate schedule, please report the configuration as well.\n",
"\n",
"Now fit the model. Show how the training loss changes. How did you decide when to stop training?\n",
"\n",
"After fitting the model, visualise the model weights again. How did the weights change? Why?\n",
"\n",
"\n"
]
},
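{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### What initialisation method have you chosen?\n",
"No initialisation method was specified explicitly. `Sequential()` only builds the model container, so each `Dense` layer uses the Keras defaults: `glorot_uniform` (Xavier uniform) for the kernel weights and zeros for the biases."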
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The training batch size -> 64\n",
"\n",
"The number of training epochs -> 20\n",
"\n",
"The learning rate -> 0.001 (the Keras default for `rmsprop`, since no learning rate was passed to the optimiser)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train on 60000 samples, validate on 10000 samples\n",
"Epoch 1/20\n",
"60000/60000 [==============================] - 18s 299us/sample - loss: 0.5110 - accuracy: 0.8143 - val_loss: 0.3990 - val_accuracy: 0.8574\n",
"Epoch 2/20\n",
"60000/60000 [==============================] - 13s 210us/sample - loss: 0.3798 - accuracy: 0.8637 - val_loss: 0.4374 - val_accuracy: 0.8417\n",
"Epoch 3/20\n",
"60000/60000 [==============================] - 14s 226us/sample - loss: 0.3533 - accuracy: 0.8739 - val_loss: 0.4658 - val_accuracy: 0.8477\n",
"Epoch 4/20\n",
"60000/60000 [==============================] - 13s 208us/sample - loss: 0.3343 - accuracy: 0.8798 - val_loss: 0.4453 - val_accuracy: 0.8351\n",
"Epoch 5/20\n",
"60000/60000 [==============================] - 12s 200us/sample - loss: 0.3208 - accuracy: 0.8870 - val_loss: 0.3952 - val_accuracy: 0.8695\n",
"Epoch 6/20\n",
"60000/60000 [==============================] - 12s 198us/sample - loss: 0.3075 - accuracy: 0.8916 - val_loss: 0.4261 - val_accuracy: 0.8673\n",
"Epoch 7/20\n",
"60000/60000 [==============================] - 12s 198us/sample - loss: 0.2968 - accuracy: 0.8956 - val_loss: 0.4091 - val_accuracy: 0.8771\n",
"Epoch 8/20\n",
"60000/60000 [==============================] - 11s 187us/sample - loss: 0.2922 - accuracy: 0.8959 - val_loss: 0.4385 - val_accuracy: 0.8710\n",
"Epoch 9/20\n",
"60000/60000 [==============================] - 12s 196us/sample - loss: 0.2841 - accuracy: 0.8982 - val_loss: 0.5819 - val_accuracy: 0.8515\n",
"Epoch 10/20\n",
"60000/60000 [==============================] - 11s 186us/sample - loss: 0.2808 - accuracy: 0.9014 - val_loss: 0.5518 - val_accuracy: 0.8536\n",
"Epoch 11/20\n",
"60000/60000 [==============================] - 11s 191us/sample - loss: 0.2732 - accuracy: 0.9032 - val_loss: 0.4855 - val_accuracy: 0.8708\n",
"Epoch 12/20\n",
"60000/60000 [==============================] - 12s 199us/sample - loss: 0.2696 - accuracy: 0.9048 - val_loss: 0.5141 - val_accuracy: 0.8647\n",
"Epoch 13/20\n",
"60000/60000 [==============================] - 12s 194us/sample - loss: 0.2632 - accuracy: 0.9087 - val_loss: 0.4493 - val_accuracy: 0.8831\n",
"Epoch 14/20\n",
"60000/60000 [==============================] - 12s 192us/sample - loss: 0.2623 - accuracy: 0.9089 - val_loss: 0.4972 - val_accuracy: 0.8736\n",
"Epoch 15/20\n",
"60000/60000 [==============================] - 17s 281us/sample - loss: 0.2538 - accuracy: 0.9110 - val_loss: 0.5129 - val_accuracy: 0.8857\n",
"Epoch 16/20\n",
"60000/60000 [==============================] - 19s 313us/sample - loss: 0.2510 - accuracy: 0.9128 - val_loss: 0.4962 - val_accuracy: 0.8864\n",
"Epoch 17/20\n",
"60000/60000 [==============================] - 17s 285us/sample - loss: 0.2487 - accuracy: 0.9144 - val_loss: 0.4799 - val_accuracy: 0.8794\n",
"Epoch 18/20\n",
"60000/60000 [==============================] - 13s 217us/sample - loss: 0.2463 - accuracy: 0.9156 - val_loss: 0.5018 - val_accuracy: 0.8737\n",
"Epoch 19/20\n",
"60000/60000 [==============================] - 14s 236us/sample - loss: 0.2394 - accuracy: 0.9181 - val_loss: 0.7292 - val_accuracy: 0.8628\n",
"Epoch 20/20\n",
"60000/60000 [==============================] - 15s 243us/sample - loss: 0.2370 - accuracy: 0.9189 - val_loss: 0.5742 - val_accuracy: 0.8803\n"
]
}
],
"source": [
"history = model.fit(train_data, train_labels_one_hot, batch_size=64, epochs=20, verbose=1,\n",
" validation_data=(test_data, test_labels_one_hot))"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10000/10000 [==============================] - 2s 190us/sample - loss: 0.5742 - accuracy: 0.8803\n",
"Evaluation result on Test Data : Loss = 0.5742292147040368, accuracy = 0.880299985408783\n"
]
}
],
"source": [
"[test_loss, test_acc] = model.evaluate(test_data, test_labels_one_hot)\n",
"print(\"Evaluation result on Test Data : Loss = {}, accuracy = {}\".format(test_loss, test_acc))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 1.0, 'Loss Curves')"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"#Plot the Loss Curves\n",
"plt.figure(figsize=[8,6])\n",
"plt.plot(history.history['loss'],'r',linewidth=3.0)\n",
"plt.plot(history.history['val_loss'],'b',linewidth=3.0)\n",
"plt.legend(['Training loss', 'Validation Loss'],fontsize=18)\n",
"plt.xlabel('Epochs ',fontsize=16)\n",
"plt.ylabel('Loss',fontsize=16)\n",
"plt.title('Loss Curves',fontsize=16)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9C2VXCM-4Nw9"
},
"source": [
"### Task 4.3 Check the training using TensorBoard\n",
"\n",
"Use TensorBoard to visualise the training process. Show screenshots of your TensorBoard output.\n",
"\n",
"**Optional task:** Record the gradients during training and use TensorBoard to visualise the gradients."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4EgXZCBNAzMs"
},
"source": [
"## Task 5 Overfitting and regularisation\n",
"\n",
"*(weight ~30%)*\n",
"\n",
"Go back to the previous task. Plot the training and validation loss and accuracy. Answer the following questions:\n",
"\n",
"1. Do you see overfitting or underfitting? Why?\n",
"2. If you see overfitting, at which epoch did it happen?\n",
"\n",
"Now retrain the model with only 200 training examples. (Make sure that you reinitialise the weights before retraining.) Do you see overfitting? How did the validation loss and accuracy change?\n",
"\n",
"Neural networks are overparametrised models, meaning there can be more parameters than the training examples. Some form of regularisation is almost always necessary to obtain a useful model. Below are some options:\n",
"\n",
"1. Add dropout\n",
"2. Add Batch Normalisation\n",
"3. Add layer-specific weight regularizations\n",
"4. Change the learning rate\n",
"\n",
"In addition, you may also try changing the weight initialisation method.\n",
"\n",
"Apply different regularisation techniques to the model training. You may also try other techniques for improving training such as learning rate scheduling (see https://www.tensorflow.org/guide/keras/train_and_evaluate#using_learning_rate_schedules).\n",
"\n",
"Run **five or more** experiments of different training configurations and record the validation accuracy achieved in the Markdown table below. You may modify the table heading to match your experiment design.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "d1LRIBrEgu92"
},
"source": [
"\n",
"|Dropout (rate) | Batch Normalisation (Y/N) | Optimiser | Learning Rate | Number of Epochs | Validation Accuracy |\n",
"|---|---|---|---|---|---|\n",
"| | | | | | |\n",
"| | | | | | |\n",
"| | | | | | |\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dLjJA98Qgxg5"
},
"source": [
"\n",
"Answer the following questions:\n",
"\n",
"1. Which configuration achieved the best validation accuracy? Report the test accuracy of your final model.\n",
"2. Which setting had the most impact and which one had the least impact?"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png":...