I uploaded the file
CS161 Homework 2 Summer 2022 For the most up-to-date version of this document, please see: https://www.overleaf. com/read/mrzybrkxxgsj Problem 1: Word Representations As described in class, RadixSort runs in O(d(n+b)) time, where n is the number of elements, b is the base, and d (“digits”) is the max length of the elements. (a) (2 pts.) Suppose we want to use RadixSort to sort the following list of words W , each of which comes from the uppercase English alphabet, into lexicographic order (like in a dictionary): W = [HOW, VEXINGLY, QUICK, DAFT, ZEBRAS, JUMP] Before sorting, we right-pad all words with ! characters (assume ! comes before a alphabetically) to make them the same length. NOTE: the problem originally said left-pad, but now that I think about it, it’s more natural to sort words in lexicographic / dictionary order, so I’ve switched to right padding. You can do either in your solution and we will accept either answer as long as you’re consistent.) Provide numerical values for the three variables n, b, and d. No explanation required. Your answer goes here! (b) (2 pts.) Now you happen to have the date (the day, month, year, all as base-10 num- bers) that each word inW was first published in an English dictionary. (Assume that all such dates are within the last 500 years.) You want to sort first by date (going forward in time), then use lexicographic ordering (as in part (a)) to break ties. You will do this by converting each of the original words in W into words with date information (digits) prepended, appended, or inserted somewhere in the string, in an order that you choose. Provide the string that you would use to represent the word ZEBRAS (first published, say, May 4, 1604) so that it will be correctly sorted by RadixSort for the given ob- jective, regardless of what the dates for the other words in W are. No explanation required. Your answer goes here! 1 (c) (2 pts.) Hereafter we will leave the list W behind. You have a new list V representing the text of a document (just the words, all consisting only of uppercase English alpha- bet letters). V has n words in it in all, where n > 2, and there are at least two distinct words (but words may be repeated within the list). You decide to save space and simply represent the first distinct word in V with the binary value 0, the second distinct word with 1, the third distinct word with 10, etc., continuing to increment by one each time in binary. All subsequent instances of a word you have already seen get the same number. So, for instance, the text [CS, ONE, SIXTY, ONE, CS, IS, FUN] would become [0, 1, 10, 1, 0, 11, 100]. Now you want to sort your list V (an arbitrary list, not the example above!) us- ing RadixSort. Before running RadixSort, you left-pad the strings with 0s as needed to make them equal-length. Give the time complexity of RadixSort on this converted list, as a simplified big- O expression in terms of n. No explanation required. Note that we are asking only for the time complexity of the RadixSort, not of converting the wordlist into binary. Your answer goes here! (d) (2 pts.) Now, assume the same setup as in the previous part, but because you don’t want to mess with binary conversions, you decide instead to represent the distinct words in V with ”one-hot” vectors (vectors with all 0’s except for a single ’1’ in a position corresponding to a particular word). For example, the [CS, ONE, SIXTY, ONE, CS, IS, FUN] example from before has five distinct words, and it would become [10000, 01000, 00100, 01000, 10000, 00010, 00001]. As before, give the time complexity of RadixSort on this converted list, as a sim- plified big-O expression in terms of n. No explanation required. Again, we only want the time complexity of the RadixSort part itself. Your answer goes here! Problem 2: Proof Planning Waverly has gotten into the habit of running proofs by Terry. Terry is preoccupied with a lot of work, but they still want to help Waverly as much as possible. Waverly has an estimate of how long it takes to run each proof by Terry. So she sends an array A = [a1, ..., an] of these time lengths (measured in minutes) to Terry. Terry has put aside a total of L minutes to help Waverly today, and they want to go over as many proofs as possible. 2 (a) (6 pts.) Design an algorithm, MaximumProofs, that calculates the maximum number of proofs that fit within L minutes and runs in time O(n), where n is the length of A. You may assume that all time lengths in A are distinct and positive, but (clarifica- tion added July 1) you may not assume that the times are, e.g., integers – i.e., RadixSort is not viable for this problem. You may ignore divisibility issues (such as floors and ceilings) throughout this problem. You may use any algorithm we have learned in class as a function. For this problem, please provide pseudocode (we don’t care about the exact format), or, if you prefer, code in a language like Python or C++. We think the solution, though not super complicated, has enough details that it’s worth writing out the algorithm rigorously. However, you do not need to prove correctness or justify the runtime – just the algorithm is sufficient. Input: Array A of n distinct positive values (not necessarily integers) indicat- ing the time it takes to verify each proof, and number L ≥ 0 indicating how much time Terry has put aside. Output: The maximum number of proofs that Terry can verify in ≤ L minutes. Here are some sample inputs and the correct outputs below. • For A = [5, 1, 6, 2, 3] and L = 3, the correct output is 2. • For A = [20, 10, 30] and L = 5, the correct output is 0. • For A = [4, 1, 2, 3] and L = 1000, the correct output is 4. • ForA = [1.5628768362856876378462846876384638, 1.4371231637143123621537153123615362] and L = 3, the correct output is 2. Your answer goes here! (b) (2 pts.) Now prove the runtime of the MaximumProofs algorithm which you have just designed. That is, argue using a recurrence relation, or otherwise, that your algorithm satisfies the time bound of O(n). (There is a way to do this pretty briefly.) Your answer goes here! Problem 3: Randomized Algorithms In this exercise, we’ll explore different types of randomized algorithms. Recall from class that a randomized algorithm is a Las Vegas algorithm if it is always correct (that is, it returns the right answer with probability 1), but the running time is a random variable. We 3 say that a randomized algorithm is a Monte Carlo algorithm if there is some probability that it is incorrect. For example, QuickSort (with a random pivot) is a Las Vegas algorithm, since it always returns a sorted array, but it might be slow if we get very unlucky. Consider the following three algorithms for finding a majority element in a list that cer- tainly contains a strict majority element. (That is, there is one element in the list that appears strictly more than n 2 times.) Suppose that nothing else is known about the distribution of the list. (8 pts.) Fill in the following table. You may use asymptotic notation for the running times, or ∞ for a procedure that may not terminate. For the probability of returning a majority element, give the tightest lower bound that you can given the information provided; this may entail making the most conservative assumptions possible about the contents of the list. (Remember that big-O always assumes a worst-case input, even for randomized algorithms where the randomness of the choices might vary.) No explanations required. Algorithm Monte Carlo or Las Vegas? Expected running time Worst-case running time Probability of return- ing a majority element Algorithm 1 Algorithm 2 Algorithm 3 Algorithm 1: findMajorityElement1 Input: A population P of n elements while true do Choose a random p ∈ P ; if isMajority(P , p) then return p; Algorithm 2: findMajorityElement2 Input: A population P of n elements for 100 iterations do Choose a random p ∈ P ; if isMajority(P , p) then return p; return P [0]; 4 Algorithm 3: findMajorityElement3 Input: A population P of n elements Put the elements in P in a random order. /* Assume it takes time Θ(n) to put the n elements in a random order */ for p ∈ P do if isMajority(P , p) then return p; Algorithm 4: isMajority /* This is a helper called by the other three algorithms. */ Input: A population P of n elements and a element p ∈ P Output: True if p is the majority element of the list count ← 0; for q ∈ P do if p = q then count ++; if count > n/2 then return True; else return False; Your answer goes here! Problem 4: What Happens If I...? Some algorithms have constants or properties that seem magical at best and arbitrary at worst, and it’s not clear why they’re there. In this problem, we’ll tweak a couple of the algorithms from this unit to see what breaks. (a) (5 pts.) Show that if the median-of-medians part of the k-Select algorithm is modified to use groups of 3 instead of groups of 5, the argument from lecture that the overall k-Select algorithm is O(n) does not work. To do this, first derive a recurrence T (n) for the running time of k-Select, and then show that there is no way to choose a constant c such that the recurrence is O(n). We recommend closely following what we did in Lecture 3, and seeing what breaks. You do not need to reiterate the entire argument from Lecture 3, but you do need to show where the differences arise. Also, to make your life easier: • Assume that n ≥ 9, and that all the list elements are distinct. 5 • Be very precise about the number of elements that are guaranteed to be less than the estimated median of medians, because the analysis is very tight. You do not need to argue about the equivalent greater-than case. • You do not need to address the possible issue of the length of the list (or subse- quent sublists) not being divisible by 3, which we also glossed over in lecture.