n/a
CS211 Fall 2021 Programming Assignment II David Menendez Due: Wednesday, October 27, 11:59 PM This assignment is designed to give you more experience programming in C and using the Unix environment. Your task will be to write one program that implements a simple machine-learning algorithm. This will require file I/O, dynamic memory allocation, and correctly implementing an moderately complex algorithm. Machine learning (ML) techniques are increasingly used to provide services, such as face recog- nition in photographs, spelling correction, automated translation, and predicting what YouTube videos you might want to watch next. Implementing a full ML algorithm is beyond the scope of this course, so you will implement a “one shot” learning algorithm that uses historical data to predict house prices based on particular attributes. For example, a house might have x1 bedrooms, x2 bathrooms, x3 square footage, and be built in year x4. If we had appropriate weights, we could estimate the price of the house y with the formula y = w0 + w1x1 + w2x2 + w3x3 + w4x4. (1) The goal of one-shot learning is to find values for the weights wi using a large provided set of training data. Once those weights have been found, they can be used to estimate prices for additional houses. For example, if the training data includes n houses and has k attributes, this data can be represented as an n× (k + 1) matrix X, of the form 1 x0,1 x0,2 · · · x0,k 1 x1,1 x1,2 · · · x1,k ... ... ... . . . ... 1 xn−1,1 xn−1,2 · · · xn−1,k , where each row corresponds to a house and each column corresponds to an attribute. Note that the first column contains 1 for all rows: this corresponds to the weight w0. Similarly, house prices can be represented as an n× 1 matrix Y , of the form y0 y1 ... yn−1 , where each row gives the price of a house. 1 Finally, the weights will be a (k + 1)× 1 matrix W , of the form w0 w1 ... wk+1 , where each row gives the weight of an attribute. We can relate the matrices X, Y , and W with this equation: XW = Y. (2) Our goal will be to estimate the prices Y ′ for some houses with attributes X ′. This is easily done if we know the weights W , but we do not. Instead, we can observe the attributes X for houses with known prices Y . We will use a strategy known as one-shot learning to deduce W , given X and Y . If X were a square matrix, we could find W by rewriting the equation as W = X−1Y , but in general X will not be a square matrix. Thus, we will find its pseudo-inverse by calculating W = (XTX)−1XTY, (3) where XT is the transpose of X. XTX is a square matrix, and can be inverted.1 Once W has been found, it can be used with a new set of house attributes X ′ to estimate prices for those houses by computing X ′W = Y ′. 1 Algorithm Given matrices X and Y , your program will compute (XTX)−1XTY in order to learn W . This will require (1) multiplying, (2) transposing, and (3) inverting matrices. Programming Assignment I already involved matrix multiplication; you may adapt your implementation for this assignment. Transposing an m× n matrix produces an n×m matrix. Each row of the X becomes a column of XT . To find the inverse of XTX, you will use a simplified form of Gauss-Jordan elimination. 1.1 Gauss-Jordan elimination for finding inverses Gauss-Jordan is a method for solving systems of equations by manipulating matrices with row operations. This method can also be used to find the inverse of a matrix. You will implement two of the three row operations in your program. The first multiplies all elements of a particular row by some number. The second adds the contents of one row to another, element-wise. More generally, the second operation adds a multiple of the elements of one row to another so that element xi,k will become xi,k + axj,k. The third row operation, which swaps two rows, will not be needed for this assignment. Again, the training data used to grade this assignment will not require swapping rows. Algorithm 1 is the implementation of Gauss-Jordan your program must implement. Note that the only row operations used are multiplying (or dividing) a row by a number and adding (or 1This is not true in general, but for this assignment you may assume that XTX is invertable. 2 subtracting) a multiple of one row to another. Given a matrix M , we will use Mi to refer to row i of M and Mi,j to refer to the number in row i, column j. For this assignment, we will start counting rows and columns from 0. Algorithm 1 Simplified Gauss-Jordan elimination procedure invert(M : n× n matrix) N ← n× n identity matrix for p← 0, 1, · · · , n− 1 do f ←Mp,p divide Mp by f divide Np by f for i← p+ 1, · · · , n− 1 do f ←Mi,p subtract Mp × f from Mi subtract Np × f from Ni end for end for for p← n− 1, · · · , 0 do for i← p− 1, · · · , 0 do f ←Mi,p subtract Mp × f from Mi subtract Np × f from Ni end for end for return N end procedure To illustrate Gauss-Jordan, we will walk through the process of inverting the matrix M = 1 2 41 6 8 1 1 6 . As a notational convenience, we will create an augmented matrix A = M |I by adjoining the identity matrix I to M . A = 1 2 4 1 0 01 6 8 0 1 0 1 1 6 0 0 1 It is not necessary for your program to create A; instead, you can represent the two sides of A as two separate matrices. The goal of Gauss-Jordan is to turn the left half of A into an identity matrix by applying row operations. At each step, we will identify a particular row as the pivot row. The element that lies on the diagonal (that is, element Ap,p) is the pivot element. The first step is to turn the M into an upper triangular matrix, where all elements on the diagonal are 1 and elements below the diagonal are 0. The pivot row will start at A0 and advance to A2. At each step, we will first multiply the pivot row by a constant so that the pivot element will 3 become 1. Next, we will subtract the pivot row from the rows below it, so that the elements below the pivot element become 0. Starting with row A0, we see that A0,0 is already 1. To make the elements below A0,0 become 0, we subtract A0 from A1 and A2, yielding 1 2 4 1 0 00 4 4 −1 1 0 0 −1 2 −1 0 1 . Next, for pivot A1 we see that A1,1 = 4. We divide A1 by 4 (that is, multiply A4 by 14 ). 1 2 4 1 0 00 1 1 − 14 14 0 0 −1 2 −1 0 1 . To zero out A2,1, we add A1 to A2. 1 2 4 1 0 00 1 1 − 14 14 0 0 0 3 − 54 1 4 1 . Now the pivot row is A2. To make A2,2 = 1, we divide row A2 by 3. 1 2 4 1 0 00 1 1 − 14 14 0 0 0 1 − 512 1 12 1 3 . A is now an upper triangular matrix. To turn the left side of A into an identity matrix, we will reverse the process and turn the elements above the diagonal into 0. The pivot row will start at A2 and advance in reverse to A0. For each step, we will subtract the pivot row from the rows above it so that the elements above the pivot element become 0. We begin with A2. First, we subtract A2 from A1, yielding 1 2 4 1 0 00 1 0 16 16 − 13 0 0 1 − 512 1 12 1 3 . Then we subtract 4×A2 from A0, yielding 1 2 0 8 3 − 1 3 − 4 3 0 1 0 16 1 6 − 1 3 0 0 1 − 512 1 12 1 3 . Now the elements above A2,2 are 0. Next, we subtract 2×A1 from A0, yielding 1 0 0 7 3 − 2 3 − 2 3 0 1 0 16 1 6 − 1 3 0 0 1 − 512 1 12 1 3 . 4 Now the element above A1,1 is 0. After that step, the left half of A is the identity matrix and the right half of A contains M−1. That is, A = I|M−1. The algorithm is complete and the inverse of M has been found. You are strongly encouraged to write this algorithm in more detailed pseudocode before you begin implementing it in C. Try using it to invert a small square matrix and make sure you understand the operations you must perform at each step. In particular, ask yourself (1) given a matrix A, how can I multiply (or divide) Ai by a constant c, and (2) given a matrix A, how can I add (or subtract) c×Ai to (or from) Aj? You MUST use the algorithm as described. Performing different row operations, or the same row operations in a different order, may change the result of your program due to rounding. This may cause your program to produce results different from the reference result. 2 Program You will write a program estimate that uses a training data set to learn weights for a set of house attributes, and then applies those weights to a set of input data to calculate prices for those houses. estimate takes two arguments, which are the paths to files containing the training data and input data. Training data format The first line will be the word “train”. The second line will contain an integer k, giving the number of attributes. The third line will contain an integer n, giving the number of houses. The next n lines will contain k + 1 floating-point