
Using the notation and assumptions provided in the file PLA_pseudocode_uday.pdf (under Modules on Canvas), present a formal proof of the convergence (or lack thereof) of the PLA algorithm for the case when the initial weight vector is the zero vector and the vectors are assumed to be normalized. Please do not use any notation other than the one mentioned above. You may consult Chapter 4 of Rojas’ book: https://page.mi.fu-berlin.de/rojas/neural/
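For orientation only (not a substitute for the required proof), the standard Novikoff-style convergence argument sketched in Rojas, Chapter 4, runs as follows under the stated assumptions: ~w(0) = ~0, each augmented ~xj normalized so that ‖~xj‖ = 1, and a separating ~w* with ‖~w*‖ = 1. The margin symbol δ below is introduced here for the sketch and is not part of the handout's notation.

```latex
% Assumption of this sketch: the data are linearly separable with margin
%   \delta = \min_j y_j\, \vec{w}^{*}\!\cdot\!\vec{x}_j > 0,  \|\vec{w}^{*}\| = 1.
% Each update has the form \vec{w}(t) = \vec{w}(t-1) + \eta\, y_j\, \vec{x}_j.
%
% Lower bound on the projection onto \vec{w}^{*} (by induction from \vec{w}(0)=\vec{0}):
\[
  \vec{w}(t)\!\cdot\!\vec{w}^{*}
    \;\ge\; \vec{w}(t-1)\!\cdot\!\vec{w}^{*} + \eta\,\delta
    \quad\Longrightarrow\quad
  \vec{w}(t)\!\cdot\!\vec{w}^{*} \;\ge\; t\,\eta\,\delta .
\]
% Upper bound on the norm: an update occurs only when
% y_j\, \vec{w}(t-1)\!\cdot\!\vec{x}_j \le 0, and \|\vec{x}_j\| = 1, so
\[
  \|\vec{w}(t)\|^{2}
    \;=\; \|\vec{w}(t-1)\|^{2}
          + 2\eta\, y_j\, \vec{w}(t-1)\!\cdot\!\vec{x}_j
          + \eta^{2}\|\vec{x}_j\|^{2}
    \;\le\; \|\vec{w}(t-1)\|^{2} + \eta^{2},
  \qquad\text{hence}\qquad
  \|\vec{w}(t)\|^{2} \;\le\; t\,\eta^{2}.
\]
% Combining the two bounds via Cauchy--Schwarz:
\[
  t\,\eta\,\delta \;\le\; \vec{w}(t)\!\cdot\!\vec{w}^{*}
    \;\le\; \|\vec{w}(t)\|\,\|\vec{w}^{*}\|
    \;\le\; \sqrt{t}\,\eta
  \qquad\Longrightarrow\qquad
  t \;\le\; \frac{1}{\delta^{2}} .
\]
```

The number of updates t is therefore bounded by a constant independent of n, so the PLA terminates on linearly separable, normalized data; note that η cancels out of the final bound.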


PLA
U. K. Chakraborty

Notation
d          dimension
n          number of samples (points) in the training set
i          index to represent the dimension; values 1, ..., d or 0, ..., d
j          index to represent the samples in the training set; values 1, ..., n
~xj        input point (vector) j (of dimension d) from the training data
yj         true class (label or tag) of input point (vector) j from the training data
ŷj         predicted class (label or tag) of input point (vector) j
(~xj, yj)  j-th input-output pair in the training data
R          the real number domain
~w         weight vector (of dimension d+1)
~w*        weight vector (of dimension d+1) of a solution (zero-error classifier)
t          number of updates of the weight vector ~w
~w(t)      weight vector at time (iteration) t
‖~v‖       length (Euclidean norm) of vector ~v
W          matrix (column vector) representation of the weight vector
W^T        transpose of W
η          learning rate constant (step size)

Assumptions
i = 0, ..., d
j = 1, ..., n
~xj ∈ R^d; the augmented ~xj ∈ R^(d+1)
yj ∈ {−1, +1}
~w(0) ∈ R^(d+1) may or may not be the zero vector.
~w* ∈ R^(d+1) is not the zero vector.
η > 0

Explanation
We use the notation ~xj in both the regular and the augmented senses; the context tells us which is which. The augmented ~xj ∈ R^(d+1) is [x0j, x1j, ..., xdj], with x0j = 1 for all j.

Algorithm 1: PLA
Input:  (~xj, yj) for j = 1 to n (this ~xj ∈ R^(d+1) is the augmented vector, with the first component set to 1 for all j); η
Output: ~w
 1  Initialize ~w to random values;
 2  for a predetermined number of passes through the training set do
 3      updated = False;
 4      for j = 1 to n do
 5          if yj (~w · ~xj) ≤ 0 then
 6              // yj and the activation ~w · ~xj have opposite signs, i.e. ŷj = sign(~w · ~xj) ≠ yj; the "≤" (rather than "<") is needed for the case ~w = ~0;
 7              ~w = ~w + η yj ~xj;
 8              updated = True;
 9          end
10      end
11      if updated == False then
12          break;
13      end
14  end
15  Output ~w;  // either the very first solution found or the final weights
16  if updated == False then
17      print("first solution found")
18  else
19      print("no solution found, or first solution found (but not verified) in the very last pass");
20  end
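Algorithm 1 can be sketched as a short, self-contained Python function. The names `pla`, `dot`, and the toy data set below are illustrative choices, not part of the handout; for simplicity this sketch starts from the zero vector (covered by the "≤" test on line 5) rather than random values as on line 1.

```python
# Minimal sketch of Algorithm 1 (PLA) from the pseudocode above.
# Assumptions: each x_j is the augmented vector (first component 1),
# y_j ∈ {-1, +1}, and eta > 0.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def pla(samples, eta=1.0, max_passes=100):
    """samples: list of (x_j, y_j) pairs with x_j already augmented.
    Returns (w, converged): converged is True only after a full clean pass."""
    d_plus_1 = len(samples[0][0])
    w = [0.0] * d_plus_1                # line 1 (zero vector instead of random)
    for _ in range(max_passes):         # line 2: bounded number of passes
        updated = False                 # line 3
        for x, y in samples:            # line 4
            if y * dot(w, x) <= 0:      # line 5: misclassified (or w = 0)
                w = [wi + eta * y * xi  # line 7: w = w + eta * y_j * x_j
                     for wi, xi in zip(w, x)]
                updated = True          # line 8
        if not updated:                 # line 11: clean pass => solution found
            return w, True
    return w, False                     # no verified solution within max_passes

# Hypothetical separable toy data (d = 1, augmented to d+1 = 2):
# points with positive second component labelled +1, negative labelled -1.
data = [([1.0, 2.0], 1), ([1.0, 0.5], 1), ([1.0, -1.0], -1), ([1.0, -2.5], -1)]
w, converged = pla(data)
```

On separable data such as this toy set, the final clean pass verifies the weights, matching the "first solution found" branch of the pseudocode; on non-separable data the function returns `converged = False` after `max_passes` passes.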
Oct 04, 2021