ECE 30 Introduction to Computer Engineering
Programming Project: Spring 2021
Linear Regression
April 22, 2021
Project TA: Cameron Lewis ([email protected])

1. Project Description

Write a program that implements linear regression on a 2D list of points. The program will use stochastic gradient descent (SGD) to learn a line that fits the points as closely as possible. In this project you will write the two major components of linear regression: the training loop and the loss function. The math is provided; all that needs to be done is to convert the pseudo-code into assembly code.
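As a rough illustration of the two components named above, here is a minimal Python sketch (not the required LEGv8 assembly). It assumes a mean-squared-error loss and one parameter update per point. The names loss_func and train, the learning rate lr, and the sample data are hypothetical choices made for illustration; the pseudo-code given in this handout takes precedence wherever it differs.

```python
def loss_func(m, c, dataset):
    """Assumed loss: mean squared vertical distance from each point to y = m*x + c."""
    loss_sum = 0.0
    for x, y in dataset:
        loss_sum += (y - (m * x + c)) ** 2
    return loss_sum / len(dataset)


def train(dataset, lr=0.01, epochs=1000):
    """Assumed training loop: adjust m and c a small step at a time, point by point."""
    m, c = 0.0, 0.0
    for _ in range(epochs):
        for x, y in dataset:
            err = y - (m * x + c)      # how far the estimate is from the real y
            m += lr * 2 * err * x      # step against dE/dm for this point
            c += lr * 2 * err          # step against dE/dc for this point
    return m, c


if __name__ == "__main__":
    # Hypothetical points that lie exactly on y = 2x + 1.
    points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    print("loss before training:", loss_func(0.0, 0.0, points))
    m, c = train(points, lr=0.01, epochs=2000)
    print("learned m, c:", m, c)
    print("loss after training:", loss_func(m, c, points))
```

Calling the loss function before and after training, as in the usage lines at the bottom, is one way to check that the loss is going down.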
2. Linear Regression

(1) [equation: definition of the loss function E]
(2) $\hat{y}_i = m x_i + c$

Linear regression is the process of finding a line that minimizes the distance between the line and every point in a data set. This can be used to make estimates for new data given some information. In the gif above, the x-axis is the input, and the distance from the line is measured along the y-axis alone. Equation (1) defines the loss function E. Equation (2) is the line that we are learning, which produces an estimated value $\hat{y}_i$. Specifically, we are learning an “m” and a “c” that, for a given X value, give an estimate close to the Y value.

In order to learn the line, we define a loss function E and take its derivative with respect to each parameter to learn in which direction we should adjust the parameters in order to lower the loss. We then take a small step in that direction to bring the line closer. This is called “gradient descent”. In this project you do not need to understand the math behind the technique; the final equations will be given as both math equations and pseudo-code. If you would like to learn more, review this article on linear regression: https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931

3. Implementation

To implement this, the problem is broken down into 5 sub-problems. Each sub-problem will be graded and generally depends on the previous ones. The data file “SOCRdata.txt” and the template code file “projectTemplate.txt” will be provided.

P1. Loss Function (25%)

In machine learning, a loss function measures how accurate your estimate is compared to the real data. In linear regression, the loss function is the distance from every point in the graph to the line that is being estimated. In this project, the loss function is also implemented as a function (procedure) that is called to return the loss. Below is pseudo-code representing the loss function.

● In the LEGv8 simulator, you will pass in the address corresponding to the beginning of the dataset, e.g., “LDA X2, array”.
● The length (arraySize) of the dataset is given to you in the data file.
● You should use this function for debugging, to monitor your loss (E) going down over the training loop iterations (epochs).
● Return the loss in REGISTER S7 for grading purposes.

lossFunc(m, c, dataset) {
    lossSum = 0
    for( int i =0; i