

Objective


Gain experience empirically determining hyperparameters and evaluating models


Instructions



  • Complete this assignment individually

  • Use an accepted style (e.g., APA, Chicago, etc.) for citing any materials

  • Submit a PDF file containing your submission for this assignment

  • Please view the accompanying Assignment 4 video

  • Use the provided PDF containing the Assignment 4 notebook and the output data it generated

    • Alternatively, you may use the attached Assignment 4 notebook to generate your own outputs:

      • Download the Assignment4.ipynb notebook file to your local machine.

      • Connect to the Gateway on Scholar directly or via Scholar's landing page

      • Launch a Jupyter Notebook on Scholar using the gpu queue.

      • Upload the notebook file to Jupyter Notebook and view it.

      • Restart the kernel and run all cells (“Restart & Run All” under the Kernel tab).





  • Using the time and validation accuracy values generated via the Assignment 4 notebook, determine the optimal values for the following hyperparameters:

    • Number of filters

    • Batch size



  • Provide a rationale/justification to support your hyperparameter value selections

  • Include a three-dimensional graphical representation of the data to support your hyperparameter value selections (one possible plotting approach is sketched after this list)

  • Include any related literature to support your hyperparameter value selections

  • Reflect on the impact that a specific domain/application would have on your selections
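
One possible way to produce the three-dimensional representation requested above is sketched below: a matplotlib surface plot of validation accuracy over the number-of-filters and batch-size grid. The variable names and placeholder values are illustrative assumptions; substitute the values generated by the Assignment 4 notebook.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection

# Hyperparameter grid swept by the Assignment 4 notebook
filters = np.array([1, 2, 4, 8, 16, 32, 64, 128])
batch_sizes = np.array([8, 16, 32, 64, 128, 256, 512, 1024])

# val_acc[i, j] = validation accuracy for filters[i] at batch_sizes[j]
# (placeholder zeros; fill these in from the notebook output)
val_acc = np.zeros((len(filters), len(batch_sizes)))

# Use log2 axes so the power-of-two grid is evenly spaced
B, F = np.meshgrid(np.log2(batch_sizes), np.log2(filters))

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(F, B, val_acc, cmap='viridis')
ax.set_xlabel('log2(number of filters)')
ax.set_ylabel('log2(batch size)')
ax.set_zlabel('Validation accuracy')
plt.show()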




Assignment4: GPU benchmark on MNIST

In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import keras as k
from tensorflow.examples.tutorials.mnist import input_data
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization
from keras.optimizers import SGD, Adam
from keras.models import load_model
from keras import backend as K

# data preprocessing
(x_train, y_train), (x_test, y_test) = mnist.load_data()
img_rows, img_cols = 28, 28
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
x_test = x_test.astype('float32')
x_train = x_train.astype('float32')
mean = np.mean(x_train)
std = np.std(x_train)
x_test = (x_test - mean) / std
x_train = (x_train - mean) / std

# labels
num_classes = 10
y_train = k.utils.to_categorical(y_train, num_classes)
y_test = k.utils.to_categorical(y_test, num_classes)

In [ ]:
for i in range(8):
    n = 2**i

    # build model
    num_filter = n
    num_dense = 512
    drop_dense = 0.7
    ac = 'relu'
    learningrate = 0.001

    model = Sequential()

    model.add(Conv2D(num_filter, (3, 3), activation=ac, input_shape=(28, 28, 1), padding='same'))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(num_filter, (3, 3), activation=ac, padding='same'))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(2, 2)))  # reduces to 14x14x32

    model.add(Conv2D(2*num_filter, (3, 3), activation=ac, padding='same'))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(2*num_filter, (3, 3), activation=ac, padding='same'))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(2, 2)))  # reduces to 7x7x64 = 3136 neurons

    model.add(Flatten())
    model.add(Dense(num_dense, activation=ac))
    model.add(BatchNormalization())
    model.add(Dropout(drop_dense))
    model.add(Dense(10, activation='softmax'))

    adm = Adam(lr=learningrate, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=adm)

    for j in range(8):
        k = 8 * 2**j
        print("number of filters " + str(n))
        print("batch size " + str(k))
        model.fit(x_train, y_train, batch_size=k, epochs=1, validation_data=(x_test, y_test))
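
The results table below reports a Time (s) column, but the notebook code above does not show how that time was measured. One plausible way to capture it (an assumption, not necessarily how the notebook output was produced) is a small Keras callback that records wall-clock time per epoch:

import time
from keras.callbacks import Callback

class EpochTimer(Callback):
    """Records wall-clock seconds for each training epoch."""
    def on_train_begin(self, logs=None):
        self.epoch_times = []

    def on_epoch_begin(self, epoch, logs=None):
        self._start = time.time()

    def on_epoch_end(self, epoch, logs=None):
        self.epoch_times.append(time.time() - self._start)

# Usage with the loop above (names follow the notebook):
# timer = EpochTimer()
# model.fit(x_train, y_train, batch_size=k, epochs=1,
#           validation_data=(x_test, y_test), callbacks=[timer])
# print("epoch time (s):", timer.epoch_times[-1])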
Output data generated by the Assignment 4 notebook:

Filters  Batch Size  Time (s)  Loss   Accuracy  Val. Loss  Val. Accuracy
1        8           86        0.681  0.7903    0.1978     0.9355
1        16          43        0.311  0.9018    0.1559     0.9492
1        32          21        0.232  0.9274    0.1287     0.9571
1        64          11        0.186  0.942     0.1143     0.9629
1        128         5         0.162  0.9476    0.1069     0.9647
1        256         3         0.149  0.9521    0.1031     0.9656
1        512         2         0.14   0.9551    0.0999     0.9666
1        1024        1         0.136  0.9567    0.0991     0.9678
2        8           87        0.461  0.8619    0.1133     0.9655
2        16          43        0.182  0.944     0.0846     0.9737
2        32          21        0.12   0.963     0.0678     0.9785
2        64          11        0.092  0.9718    0.0608     0.9807
2        128         6         0.076  0.9763    0.0555     0.9819
2        256         3         0.069  0.9781    0.0527     0.983
2        512         2         0.063  0.9806    0.0527     0.9834
2        1024        1         0.058  0.9808    0.0512     0.9838
4        8           87        0.357  0.8938    0.1033     0.9672
4        16          43        0.124  0.9626    0.0497     0.9844
4        32          21        0.076  0.977     0.0419     0.9856
4        64          11        0.053  0.9834    0.0358     0.9879
4        128         6         0.042  0.9869    0.0315     0.9888
4        256         3         0.035  0.9889    0.0291     0.9895
4        512         2         0.032  0.9901    0.0287     0.9903
4        1024        1         0.03   0.9904    0.0276     0.9907
8        8           88        0.286  0.916     0.0659     0.9777
8        16          43        0.096  0.9712    0.0357     0.9874
8        32          22        0.056  0.9833    0.0305     0.99
8        64          11        0.038  0.9883    0.0243     0.9923
8        128         6         0.026  0.9919    0.0241     0.9919
8        256         3         0.02   0.9937    0.0221     0.9936
8        512         2         0.017  0.9941    0.0207     0.9932
8        1024        2         0.015  0.9954    0.0209     0.9933
16       8           89        0.247  0.9262    0.0713     0.9779
16       16          43        0.083  0.9759    0.0337     0.9888
16       32          22        0.047  0.986     0.0239     0.9932
16       64          11        0.031  0.9908    0.0224     0.9934
16       128         6         0.019  0.994     0.0173     0.9943
16       256         3         0.015  0.9954    0.0166     0.9944
16       512         2         0.011  0.9966    0.0161     0.9947
16       1024        2         0.011  0.9967    0.0153     0.9948
32       8           89        0.226  0.9328    0.0389     0.9881
32       16          44        0.072  0.9786    0.0326     0.9904
32       32          22        0.038  0.9886    0.0229     0.9937
32       64          11        0.023  0.9926    0.0163     0.9946
32       128         6         0.013  0.9961    0.0154     0.9948
32       256         4         0.009  0.9972    0.0154     0.9948
32       512         3         0.006  0.9982    0.014      0.995
32       1024        3         0.006  0.9985    0.0145     0.9951
64       8           90        0.224  0.9331    0.0466     0.9846
64       16          44        0.075  0.978     0.0246     0.9919
64       32          22        0.04   0.9885    0.0262     0.991
64       64          13        0.022  0.9932    0.0208     0.9926
64       128         8         0.014  0.9957    0.014      0.9957
64       256         6         0.009  0.9971    0.0136     0.9961
64       512         5         0.007  0.9977    0.0129     0.9965
64       1024        5         0.006  0.9983    0.0131     0.9964
128      8           91        0.256  0.9227    0.0532     0.9836
128      16          45        0.081  0.9762    0.0305     0.9892
128      32          27        0.041  0.9877    0.0319     0.9912
128      64          17        0.024  0.9925    0.0197     0.9938
128      128         13        0.015  0.9956    0.014      0.9955
128      256         11        0.01   0.997     0.0156     0.9953
128      512         10        0.007  0.998     0.0125     0.9965
128      1024        11        0.006  0.9981    0.0121     0.9959
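
One way to use the table above when selecting hyperparameter values is to load it into a pandas DataFrame and rank configurations by training time and validation accuracy. This is only an illustrative sketch: the file name, column names, and the 0.995 accuracy threshold are assumptions rather than part of the assignment.

import pandas as pd

# 'results.csv' is an assumed file holding the table above, with columns:
# filters, batch_size, time_s, loss, acc, val_loss, val_acc
df = pd.read_csv('results.csv')

# Fastest configurations that still reach at least 99.5% validation accuracy
candidates = df[df['val_acc'] >= 0.995].sort_values('time_s')
print(candidates.head())

# A simple efficiency view: validation accuracy gained per second of training
df['acc_per_second'] = df['val_acc'] / df['time_s']
print(df.sort_values('acc_per_second', ascending=False).head())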


Sandeep Kumar answered on Feb 28, 2021
Since the model used here is a two-dimensional convolutional neural network (CNN), the optimal number of filters falls roughly in the range of 32 to 128: the more filters the network has, the more it can learn, but an excessive number of filters can lead to overfitting. Also, in a CNN model, the batch size is generally chosen between 32 and 256...
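
One simple way to probe the overfitting concern raised above is to compare training and validation accuracy for each filter count in the results table; a gap that grows with the number of filters would be consistent with overfitting. A minimal sketch, assuming the same DataFrame layout as in the earlier example:

import pandas as pd

# Same assumed table layout as in the earlier sketch
df = pd.read_csv('results.csv')

# Mean gap between training and validation accuracy for each filter count
gap = (df['acc'] - df['val_acc']).groupby(df['filters']).mean()
print(gap)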