{
"cells": [
{
"cell_type": "markdown",
"source": [
"## HW 1: Math Foundation and Programming"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"This assignment intends to:\n",
"- Test your Python programming skills\n",
"- Understand gradients and backpropagation\n",
"- Think classical regression problems with a deep learning mind"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"- Suppose you have a model $\\hat{y} = \\sigma(z)$, where: \n",
" - $ z= w^T x + b,~i.e.~z=w_1*x_1+w_2*x_2+w_3*x_3+ w_4*x_4 + b$, \n",
" - $\\sigma$ is the sigmoid function, i.e. $\\sigma(z) = \\frac{1}{(1+e^{-z})}$, and\n",
" - $w, b$ are parameters. $b$ is a scalar, $x,w~\\in R^4$, specifically, $w = [w_1, w_2, w_3, w_4]^T$, $x = [x_1, x_2, x_3, x_4]^T$.\n",
"- Your ground truth lable $y=0~or~1$. With a sample $(x, y)$, You measure your model performance by two possible cost functions:\n",
" - Squared error: $L=\\frac{1}{2}(y-\\hat{y})^2$\n",
" - Cross entropy: $L=-[y*\\ln{\\hat{y}}+(1-y)*ln{(1-\\hat{y})}]$\n",
"\n"
],
"metadata": {}
},
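{
"cell_type": "markdown",
"source": [
"For reference, a quick chain-rule sketch of the gradients needed below (both cases use $\\frac{d\\sigma}{dz} = \\sigma(z)(1-\\sigma(z))$):\n",
"- Cross entropy: $\\frac{\\partial{L}}{\\partial{z}} = \\frac{\\partial{L}}{\\partial{\\hat{y}}}\\cdot\\frac{\\partial{\\hat{y}}}{\\partial{z}} = -\\left(\\frac{y}{\\hat{y}} - \\frac{1-y}{1-\\hat{y}}\\right)\\hat{y}(1-\\hat{y}) = \\hat{y} - y$\n",
"- Squared error: $\\frac{\\partial{L}}{\\partial{z}} = (\\hat{y}-y)\\,\\hat{y}\\,(1-\\hat{y})$\n",
"- $\\frac{\\partial{z}}{\\partial{w}} = x$ and $\\frac{\\partial{z}}{\\partial{b}} = 1$"
],
"metadata": {}
},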
{
"cell_type": "markdown",
"source": [
"Following the instruction below to program your solution in Python notebook step by step carefully:"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"**Q1**. Write a function to calculate each of the following partial derivatives. The inputs to the function can be all the variables in the model and the returned derivatives are expressions of these variables. An example is given below.\n",
"- `g_L_2_z(L, z, y, yhat, func)`: function to calculate $ \\frac{\\partial{L}}{\\partial{z}}$. `func` is the name of the loss function\n",
"- `g_z_2_w(z, x, w )`: calculate $ \\frac{\\partial{z}}{\\partial{w}}$\n",
"- `g_z_2_b(z, b )`: calculate $ \\frac{\\partial{z}}{\\partial{b}}$\n",
"\n",
"Note, these gradients are very simple. You really don't have to use gradient packages such as PyTorch.autograd. Just define each gradient as an expression of input variables."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 1,
"source": [
"import numpy as np\n",
"from matplotlib import pyplot as plt\n",
"\n",
"from IPython.core.interactiveshell import InteractiveShell\n",
"InteractiveShell.ast_node_interactivity = \"all\"\n"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 2,
"source": [
"def g_z_2_b():\n",
" \n",
" return 1\n",
"\n",
"def g_z_2_w(x):\n",
" \n",
" # add your code here\n",
" return x\n",
"\n",
"\n",
"def g_L_2_z(y, yhat, func):\n",
" \n",
" if func=='CrossEntropy':\n",
" \n",
" # add your code here\n",
" return yhat - y\n",
" \n",
" else:\n",
" \n",
" # add your code here\n",
" return ((yhat - y) * yhat * (1 - yhat))\n",
" \n",
"def Sigmoid(z):\n",
" return 1 / (1 + np.exp(-z))\n",
"\n",
"def squaredError(y, yhat):\n",
" return 1/2 * np.power((y - yhat), 2)\n",
"\n",
"def crossEntropy(y, yhat):\n",
" return -(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))\n",
"\n"
],
"outputs": [],
"metadata": {}
},
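{
"cell_type": "markdown",
"source": [
"An optional sanity check (not part of the assignment): the analytic $\\frac{\\partial{L}}{\\partial{z}}$ from `g_L_2_z` should agree with a central finite difference taken through the sigmoid and each loss. The values `z0 = 0.3` and `y0 = 1` below are made up for illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"# Optional sanity check with made-up values: compare the analytic dL/dz\n",
"# against a central finite difference through Sigmoid + loss.\n",
"z0, y0, eps = 0.3, 1, 1e-6\n",
"for name, loss in [('CrossEntropy', crossEntropy), ('SquaredError', squaredError)]:\n",
"    analytic = g_L_2_z(y0, Sigmoid(z0), name)\n",
"    numeric = (loss(y0, Sigmoid(z0 + eps)) - loss(y0, Sigmoid(z0 - eps))) / (2 * eps)\n",
"    print(name, analytic, numeric)"
],
"outputs": [],
"metadata": {}
},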
{
"cell_type": "markdown",
"source": [
"**Q2**. Write a forward pass function `forward(x, w, b, func)` to calculate variables $z, \\hat{y}, L$, with given $x, y, w, b$"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 3,
"source": [
"def forward(x, y, w, b, func):\n",
" \n",
" z, yhat, L = None, None, None\n",
" \n",
" # add your code here\n",
" z = np.dot(w.T, x) + b\n",
" yhat = Sigmoid(z)\n",
" if func == 'CrossEntropy':\n",
" L = crossEntropy(y, yhat)\n",
" elif func == 'SquaredError':\n",
" L = squaredError(y, yhat)\n",
" \n",
" return z, yhat, L"
],
"outputs": [],
"metadata": {}
},
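{
"cell_type": "markdown",
"source": [
"An optional quick check of `forward` with made-up inputs (the graded cases are in Q5): it should return the scalar pre-activation $z$, the prediction $\\hat{y}$, and the loss $L$."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"# Optional check of forward() with made-up values (Case A in Q5 is the real test)\n",
"x_demo = np.array([1.0, 2.0, -1.0, 0.5])\n",
"w_demo = np.array([0.1, -0.2, 0.3, 0.0])\n",
"z_demo, yhat_demo, L_demo = forward(x_demo, 1, w_demo, 0.5, 'CrossEntropy')\n",
"print(z_demo, yhat_demo, L_demo)"
],
"outputs": [],
"metadata": {}
},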
{
"cell_type": "markdown",
"source": [
"**Q3**. Write a function `gradient_desc (v, g, lr)` to adjust a parameter value $v$ by its gradient $g$, i.e. return the new value of parameter $v$ as $v$ $\\leftarrow$ $v-lr*g$, where $lr$ is the learning rate."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 4,
"source": [
"def gradient_desc(v, dev, lr):\n",
" \n",
" # add your code here\n",
" return v - lr * dev"
],
"outputs": [],
"metadata": {}
},
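{
"cell_type": "markdown",
"source": [
"A one-step example with made-up numbers: for $v=2.0$, $g=0.5$, $lr=0.1$, the updated value is $2.0 - 0.1*0.5 = 1.95$."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"# One illustrative gradient-descent step: 2.0 - 0.1 * 0.5 = 1.95\n",
"print(gradient_desc(2.0, 0.5, 0.1))"
],
"outputs": [],
"metadata": {}
},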
{
"cell_type": "markdown",
"source": [
"**Q4**. Write a function `train(x, y, w_0, b_0, func, lr, n)` as follows:\n",
" 1. Initialize $w$, $b$ with w_0, b_0\n",
" 2. Use a loop of $n$ rounds to do the following\n",
" 1. Call the forward function you defined in Q2 to calculate $z, \\hat{y}, L$ \n",
" 2. Apply backpropagation to calculate the partial derivatives $\\frac{\\partial{L}}{\\partial{w}}, \\frac{\\partial{L}}{\\partial{b}}$ using the functions you defined in Q1.\n",
" 3. Update $w, b$ using the function `gradient_desc` you defined in Q3\n",
" 4. record $\\hat{y}$, $L$\n",
" 3. Return the history of $\\hat{y}$, and $L$"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 5,
"source": [
"def train(x, y, w_0, b_0, func, lr, n):\n",
" \n",
" Yhat, C = None, None\n",
" \n",
" # add your code here\n",
" List_loss = []\n",
" List_prediction = []\n",
" for i in range(n):\n",
" z, yhat, C = forward(x, y, w_0, b_0, func)\n",
" dw = np.dot(g_L_2_z(y, yhat, func), g_z_2_w(x))\n",
" db = np.dot(g_L_2_z(y, yhat, func), g_z_2_b())\n",
" w_0 = gradient_desc(w_0, dw, lr)\n",
" b_0 = gradient_desc(b_0, db, lr)\n",
" List_loss.append(C)\n",
" if y == 1:\n",
" List_prediction.append(yhat / y)\n",
" else:\n",
" List_prediction.append((1-yhat) / (1-y))\n",
"\n",
" return yhat, C, List_loss, List_prediction, w_0, b_0"
],
"outputs": [],
"metadata": {}
},
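{
"cell_type": "markdown",
"source": [
"An optional smoke test of `train` on made-up inputs before running the graded cases: over the rounds the recorded loss should typically decrease and the probability assigned to the true class should increase."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"# Optional smoke test of train() with made-up inputs (Case A below is the graded test)\n",
"x_t = np.array([0.5, -1.0, 2.0, 0.0])\n",
"w_t = np.array([0.0, 0.0, 0.0, 0.0])\n",
"yhat_t, L_t, losses, preds, w_t, b_t = train(x_t, 1, w_t, 0.0, 'CrossEntropy', 0.1, 5)\n",
"print(losses)   # per-round loss history\n",
"print(preds)    # per-round probability assigned to the true class"
],
"outputs": [],
"metadata": {}
},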
{
"cell_type": "markdown",
"source": [
"**Q5**. Test your program with these two test cases and plot the history of loss $L$ (i.e. learning curves) under different loss functions. An example plot for case A has been given.\n",
" \n",
" \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
" **Case A**: $x=[1.0,0.5,-1.0, -2.0]^T, y=1, w_0=[-2,-2,1,2]^T, b_0=-1, lr = 0.01$"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 6,
"source": [
"def test(x, y, w_0_SE, b_0_SE, w_0_CE, b_0_CE, lr, n):\n",
" fig1 = plt.figure(1, figsize = (10, 10))\n",
" ax1 = fig1.add_subplot(211)\n",
" ax2 = fig1.add_subplot(212)\n",
"\n",
" ax1.set_title('Loss')\n",
" ax2.set_title('Prediction')\n",
"\n",
" yhat, C, List_loss, List_prediction, w_0_SE, b_0_SE = train(x, y, w_0_SE, b_0_SE, 'SquaredError', lr, n)\n",
"\n",
" x_plot = np.arange(len(List_loss))\n",
" y1_plot = np.array(List_loss)\n",
" y2_plot = np.array(List_prediction)\n",
"\n",
" ax1.plot(x_plot, y1_plot, label='SquaredError')\n",
" ax2.plot(x_plot, y2_plot, label='SquaredError')\n",
" ax1.legend(loc='upper right')\n",
" ax2.legend(loc='upper left')\n",
"\n",
" yhat, C, List_loss, List_prediction, w_0_CE, b_0_CE = train(x, y, w_0_CE, b_0_CE, 'CrossEntropy', lr, n)\n",
"\n",
" y1_plot = np.array(List_loss)\n",
" y2_plot = np.array(List_prediction)\n",
"\n",
" ax1.plot(x_plot, y1_plot, label='CrossEntropy')\n",
" ax2.plot(x_plot, y2_plot, label='CrossEntropy')\n",
" ax1.legend(loc='upper right')\n",
" ax2.legend(loc='upper left')\n",
"\n",
" plt.show()\n",
" \n",
" return w_0_SE, b_0_SE, w_0_CE, b_0_CE"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 7,
"source": [
"# Case A:\n",
"\n",
"x=np.array([1.0,0.5,-1.0, -2.0])\n",
"y=1\n",
"w_0=np.array([-2,-2,1,2])\n",
"b_0=-1\n",
"lr = 0.01\n",
"n = 500"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 8,
"source": [
"# Add your code here\n",
"w_0_SE, b_0_SE, w_0_CE, b_0_CE = w_0, b_0, w_0, b_0\n",
"w_0_SE, b_0_SE, w_0_CE, b_0_CE = test(x, y, w_0_SE, b_0_SE, w_0_CE, b_0_CE, lr, n)"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png":...