Project Workflow
As mentioned in the overview, as long as (1) you implement all of the functions we declared in
Graph.h
, (2) we can compile the original starter
GraphTest.cpp
via
make
, and (3) we can run
GraphTest
as described, you are absolutely free to implement the project any way you think is best. However, here are some suggestions just in case you feel a bit lost.
Potential Development Process
Step 1: Reading the Header Comments
Before you do
anything, please read
all
of the header comments in
Graph.h
thoroughly
to make sure you understand what exactly each function is supposed to do. You should consider the following questions as you investigate each function declared
Graph.h
:
What are the parameters of this function? What do they mean? How are they formatted?
Given the parameters, what is this function supposed to actually do? What data structure(s) and algorithm(s) would be relevant to achieve this functionality?
What is the output of this function? What does it mean? How is it formatted?
Step 2: Designing Your Graph
Before you write even a single line of code, completely map out how exactly you want to implement the graph ADT. You should consider the following questions as you think about your unique design:
What are the graph operations you will need to support?
What data structure(s) will you use to represent nodes?
What data structure(s) will you use to represent edges between nodes?
Are there any helper variables, functions, classes, etc. that you will want to add?
What are the time complexities of various standard graph operations if you use the data structure(s) you've chosen? Can you modify your choices to speed things up?
I want you to literally
write out
every
data structure you can use
to represent the graph, along with each data structure's
worst-case Big-O time and space complexities, and map the functionality of these data structures to the properties of a graph. This design step is by far
the most important step, as proper design will make your coding much easier and will make your code much faster. Be sure to refer to the
"Summaries of Data Structures" section of the
Data Structures
Stepik text
to help you brainstorm.
Once you have decided on a design for your graph implementation, you will want to add any instance variables your design will need to the
Graph
class.
Step 3: Writing the Constructor
The first nontrivial code you should write will likely be the
Graph
constructor. You will need to populate the
Graph
using the edges contained within the given edge list CSV file. You should think about the following questions as you implement your constructor:
How should you initialize each of the instance variables you added to the
Graph
class?
Do you want to add any helper functions (or perhaps an overloaded constructor) to help you initialize your graph from the given CSV file?
Step 4: Implementing Basic Graph Operations
Once you are confident that your
Graph
constructor is correct, write the functions that access basic properties of a graph:
num_nodes()
,
nodes()
,
num_edges()
,
edge_weight()
,
num_neighbors()
, and
neighbors()
. You should think about the following questions as you implement these functions:
How can you use the instance variables you've added to the
Graph
class to perform these operations? What would be the worst-case time complexity of each operation?
Are there any optimizations you can make by adding even more instance variables to the
Graph
class to quickly look up certain values instead of computing them from scratch each time, but without blowing up in memory consumption?
Once you have implemented all of these functions, you should be able to use the first few tests in
GraphTest
, as well as the autograder. Once you have these basic properties working, the remaining functions can be implemented in any order (they will likely be independent of one another), so feel free to deviate from our recommended order if you prefer.
However,
all
of the following steps require your basic graph functionality to work completely correctly. Thus,
do not continue until you are able to get the basic graph functionality completely working!
Step 5: Finding a Weighted Shortest Path
In the
shortest_path_weighted()
function, you will be finding the shortest weighted path from a given start node to a given end node. You will want to implement this using
Dijkstra's Algorithm. Be sure to map out how exactly the algorithm will operate with the specific graph design you have chosen. You should think about the following questions as you implement this:
What data structure(s) will help you implement Dijkstra's Algorithm, and do they have built-in C++ implementations?
How will you use the properties of your graph design to facilitate Dijkstra's Algorithm?
What is the worst-case time complexity of your approach? Are there any optimizations you can make to speed things up?
Be sure to refer to the
"Dijkstra's Algorithm" section of the
Data Structures
Stepik text
for more information about Dijkstra's algorithm.
Step 6: Finding the Smallest Connecting Threshold
In HIV transmission clustering, a natural question is the following: Given two individuals
u
and
v, what is the smallest threshold
d
such that, if we were to only include edges with weight less than or equal to
d, there would exist a path connecting
u
and
v?
In the
smallest_connecting_threshold()
function, you will be finding the smallest connecting threshold between a given pair of nodes
u
and
v.
We will have time constraints on this, so you will need to implement an optimal solution. As a hint, you will probably want to make use of a Disjoint Set to make your code as fast as possible. Be sure to refer to the
"Disjoint Sets" section of the
Data Structures
Stepik text
for more information about Disjoint Sets.
Design Notes
Before you write even a single line of code, it's important to properly design the various components of your
Graph
class. Here, we will discuss some aspects of the design that you should be sure to think about.
Parsing a CSV File
The edge list your
Graph
constructor needs to read will be a CSV file, such as the following CSV file with 3 columns:
Niema,Moshiri,100 Ryan,Micallef,95 Felix,Garcia,95
In C++, you can parse a CSV file as follows (note that
first
,
second
, and
third
are all
string
objects):
C++12345678910ifstreammy_file(filename);//openthefilestringline;//helpervartostorecurrentlinewhile(getline(my_file,line)){//readonelinefromthefileistringstreamss(line);//createistringstreamofcurrentlinestringfirst,second,third;//helpervarsgetline(ss,first,',');//storefirstcolumnin"first"getline(ss,second,',');//storesecondcolumnin"second"getline(ss,third,'\n');//storethirdcolumncolumnin"third"}my_file.close();//closefilewhendone
Using the C++
tuple
Class
In order to represent the mathematical concept of a
tuple, C++ provides a
tuple
class. Be sure to thoroughly read the
C++
tuple documentation
, but the following code example will hopefully show you how to use it:
RunC++12345678910#includeiostream>//loadcout#includestring>//loadstringclass#includetuple>//loadtupleclassusingnamespacestd;intmain(){tuplestring,string,int>prof=make_tuple("Niema","Moshiri",1993);cout"FirstName:"get0>(prof)endl;cout"LastName:"get1>(prof)endl;cout"BirthYear:"get2>(prof)endl;}
You may or may not want to use
tuple
objects in your underlying
Graph
design, but you will likely want to use them as you implement the member functions of the
Graph
class (e.g. as you implement a graph traversal algorithm). Pro-tip: if you try to sort a
vector
that contains
tuple
objects using
std::sort
, the
tuple
objects will be sorted based on the first element.
Using the C++
pair
Class
Much like the
tuple
class, C++ provides the
pair
class that can be used to represent a pair of objects. Be sure to thoroughly read the
C++
pair
documentation, but the following code example will hopefully show you how to use it:
RunC++123456789#includeiostream>//loadcout#includestring>//loadstringclass#includeutility>//loadpairclassusingnamespacestd;intmain(){pairstring,string>prof=make_pair("Niema","Moshiri");cout"FirstName:"prof.firstendl;cout"LastName:"prof.secondendl;}
Creating a C++
struct
When examining your data types, you may find yourself adding many properties to your
tuple
objects, or may want increased readability by creating custom types. C++ provides the ability to define a
struct
, which is essentially a simple class that can be used to bundle together elements. Be sure to thoroughly read the
C++
struct
documentation, but the following code example will hopefully show you how to use it:
RunC++123456789101112131415#includeiostream>//loadcout#includestring>//loadstringclassusingnamespacestd;//definestructstructperson{stringfirst;stringlast;unsignedintyear;};intmain(){//createstructobjectonstackpersonprof={.first="Niema",
Using a Custom Comparison Class with
priority_queue
When you implement Dijkstra's Algorithm, you will likely want to store
tuple
objects representing (total path weight,
from,
to) tuples to keep track of your graph traversal. In C++, you can create a
priority_queue
with a custom comparison class. You may want to create a custom comparison class to compare these tuples in a way that uses the
priority_queue
as a min-heap with respect to total path weight. Be sure to read the
C++
priority_queue
documentation
thoroughly to see how to define a custom comparison class and use it with a
priority_queue
object.
Modifying the
Makefile
As you design your
Graph
class, you may feel as though you want to create additional code files for any helper classes you may potentially create. Note that you don't have to: you can complete this project without creating any additional files. However, if you do choose to do so, you will find that you will want to modify the
Makefile
in order to properly compile your executable. Feel free to make any modifications you wish, and you may want to refer to
this guide
to help you.