Learning objective: getting started programming in Java, re-enforcing Java concepts, getting
used to reading the Java documentation, setting up Intellij IDEA.
This assignment will be based on finding the prefix sum of an input sequence, which is defined as:
y0 = x0
y1 = x0 + x1
y2 = x0 + x1+ x2
...
So, given an array = [2, 5, 0, 3], the prefix sum would be [2, 7, 7, 10].
a) Write a class named SequentialPrefix that finds the prefix sum for the contents of an
input file, named in.txt, and writes the result to an output file, named out.txt. Rather
than reading the full contents of the input file into memory, you must assume that the size of the
input file is too large to fit in your computer’s memory and design your application accordingly.
The input file consists of list of integers separated by newline character. An example input file
has been provided to you.
b) Write a class called ParallelPrefix which gives the same output as
SequentialPrefix but uses a parallel prefix sum algorithm to speed up the process. Use
Java Threads (java.lang.Thread) to create as many threads as required to parallelize you
code. Again, assume that the size of the input file is too large to fit in your computer’s memory.
There are a few ways the parallel prefix sum can be computed. One option is via a divide-andconquer algorithm. (This is the “work-efficient” algorithm outlined here:
https://en.wikipedia.org/wiki/Prefix_sum#Algorithm_2:_Work-efficient). This algorithm
consists of 2 passes, where the first builds a binary tree of sums bottom-up, and the second
computes the prefixes by traversing the tree top-down.
For example, given an input array [1, 2, 3, 4], the first pass constructs a tree of the form:
10
3 7
1 2 3 4
In the second pass, the root node is assigned a fromLeft value of 0. Then, for each node
(starting at the root), the node takes its fromLeft value and passes its left child the same
fromLeft, and passes its right child its fromLeft plus it’s left child’s sum. So, the
fromLeft values for the tree above would be:
0
0 3
0 1 3 6
Finally, for each leaf node at array position i, output[i]=fromLeft+input[i].
This gives [0+1, 1+2, 3+3, 6+4] = [1, 3, 6, 10].
c) Use the parallelPrefix function provided by Java
(https://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#parallelPrefix-int:Ajava.util.function.IntBinaryOperator-) instead of rolling out your own parallel algorithm to
achieve the same outcome as part (b). Name this class ParallelPrefixInternal.
d) Benchmark and report the speedup of each of the three variants in a bar chart with speedup on
the y-axis and the class name on the x-axis. Note that the speedup of the sequential code is 1.
You may also consider plotting the speedup of each variant given different numbers of elements
that can fit into memory.
e) What can you do to bring your ParallelPrefix up to speed with
ParallelPrefixInternal? Propose and implement your solution in a class named
BetterParallelPrefix (Hint: consider using a ForkJoinPool, along with other
optimizations).
Benchmark and report your results for all the four variants. Briefly explain your proposed
solution and the rationale behind your decision.