Homework 4
Hello Here is the second attempt of the description how to convert the minimum spanning tree in a traveling salesman tour. We have to use the Depth First Search, with a stack for the vertices. We have an array/list of the edges of the MST, so that for each vertex we can look up its neighbors in the MST. choose start vertex, set start = start vertex; current = start; visited[start] = TRUE; put neighbors of start on stack; while (stack is not empty) { take next vertex from stack, call it x. if( visited[x] == FALSE) { visited[x]= TRUE; draw edge from current to x in the traveling salesman tour; current = x; find the neighbors of x in the minimum spanning tree, and put them on the stack, if they are not visited; } } /* now stack is empty; we need to connect back to start */ draw edge from current to start in the traveling salesman tour. /*done*/ Hope this version is now correct. hw4.dvi Homework 4 of CSC 220 Algorithms, Spring 2020 given April 27, 2020, due May 15 It is the aim of this project to find a minimum spanning tree and Traveling Salesman Tour for points in the plane. You write a function void MST TSP(int *px, int *py, int n) that receives the number of points n, and the arrays of x- and y-coordinates of the points. Your function then uses Kruskal’s algorithm to find the Minimum Spanning Tree, and shows it by calling void tree edge(int i, int j) that marks the edge from point i to point j. After that, you use the tree to construct a traveling salesman tour, and call the function void TSP edge(int i, int j) to mark the edge of the TSP tour. Again the test code that is provided uses the X Window system to visualize the graph and the edges chosen by your function. 1 Minimum Spanning Trees and Kruskal's Algorithm There are several algorithms which compute the minimum spanning tree of a grap with edge length, all in O( e log(n) ) time for a graph eith e edges and n vertices. In this section we will discuss Kruskal's algorithm, which, like Prim's algorithm, was invented in for telephone planning applications in pre-computer times. Kruskal's algorithm appears very simple. We consider the edges in sequence of increasing edge length, and choose each edge for the minimum spanning tree that does not generate a cycle with the edges chosen earlier. We want to generate a tree, so we cannot choose an edge that makes a cycle; and we want to choose the minimum spanning tree, so er want to choose edges as short as possible; so this algorithm sounds reasonable. It does not require a heap, so it seems easier than Prim's algorithm, but it does require us to sort the edges by their length in the first step. And then comes the main difficulty:we need a way to decide whether an edge generates a cycle with the previously chosen edges. For this we introduce a new data structure, which keeps track of connected components of a graph,and supports two operations: we can merge two components (when we introduce an edge between them) and we can query for two vertices whether they are in the same component. This structure is known under different names, "set union", "disjoint set","union find", and we will see that it can perform bot operations in O(log(n)) on a graph with n vertices. If we have that data structure, Kruskal's algorithm works as follows - sort all edges in sequence of increasing edge length (takes O( e log(n)) ) - consider the edges in that sequence of increasing length. For each edge uv { if ( not same_component(u,v) ) { select uv for MST; join_components(u,v); } } so we test whether the edge uv connects two vertices which are already previously connected (in the same component), and if not, we take that edge for the Minimum Spanning Tree, and join the two components. The same_component() query and the join_components() function are called at most once for each edge, so if thhe operations are done in O( log(n) ), the entire algorithm takes O( e log(n) ). So it remains to describe the data structure, which implements same_component() and join_components(). It works as a root-directed tree: each component is represented as a tree, with edges oriented so that they all point to the root. Each vertex is represented by a node with one outgoing pointer. That pointer can point to itself, the it is a root, or it can point to another node in the same component, which is closer to the root. So, starting at any vertex, we just follow the pointers until we find a node pointing to itself, which is then the root of the component. By this, we arrive at the same node, no matter in which vertex of the component we started, so we find a unique identifier for the component, and two vertices are in the same component if the roots reached, starting from the vertices, are the same. So the same_component test works as follows: same_component(u,v) { utmp =u; vtmp =v; while( utmp->next != utmp ) utmp = utmp->next; while( vtmp->next !=vtmp ) vtmp = vtmp->next; /*now utmp and vtmp are the respective roots*/ if( utmp==vtmp) {return(true);} else {return(false);} } With this idea, the joining of two components is also clear: we just have to point one root to the other root, then from anywhere in either component we will arrive at the same root. However,we want to keep the paths from any node to the root short, so we need to make the right choice,which root to keep, and which root to point to the other. For this, each node gets an additional field, the rank. Each node initially has rank 0; the rule for joining two components is to point the root of lower rank to the root of higher rank, and if the ranks are the same, then for one root we increment the rank. With this rule, a root of rank k must have a component of at least 2^k vertices, since it resulted from merging two components with roots of rank k-1, each of which had at least 2^(k-1) vertices. Since we have in total only n vertices, we have 2^k <= n,="" so="">=><= log(n). since each edge points from a node of smaller rank to a node of larger rank, if the maximum rank is log(n), the path from any node to its root is at most log(n) long, so the time for the same_component() query and the join_components() operation each is o( log(n)), as we claimed. this shows that the algorithm has run time o( e log(n) ). it remains to show that the algorithm really finds the minimum spanning tree. suppose there is a tree tmin which has less total length than the tree tk constructed by kruskal's algorithm. kruskal's algorithm selects edges one by one, an the first edge the algorithm selects, the shortest, is always in the mst, so there must be a first edge in the sequence of edges selected by kruskal's algorithm that is a wrong choice. let xy be that first edge that is selected in error, so all edges chosen earlier by k could be extended to a true mst, but with the choice of xy that becomes impossible. let a be the set of edges selected earlier by k, and tmin be a true mst that contains a. the edge xy connects two connected components of a, since it was selected by k. the same connected components must be connected in a different way in tmin, and the edge of path that connects them in tmin contains at least one edge uv that is not present in a, otherwise they would be the same connected component. this edge uv was not selected by k at the moment xy was selected, although it joins different connected components, so uv must be longer than xy. but then tmin is not a minimum spanning tree: we can make it shorter by replacing uv with xy. so the assumption that tk was not minimal led to a contradiction, an we proved that kruskal's algorithm constructs a minimum spanning tree. log(n).="" since="" each="" edge="" points="" from="" a="" node="" of="" smaller="" rank="" to="" a="" node="" of="" larger="" rank,="" if="" the="" maximum="" rank="" is="" log(n),="" the="" path="" from="" any="" node="" to="" its="" root="" is="" at="" most="" log(n)="" long,="" so="" the="" time="" for="" the="" same_component()="" query="" and="" the="" join_components()="" operation="" each="" is="" o(="" log(n)),="" as="" we="" claimed.="" this="" shows="" that="" the="" algorithm="" has="" run="" time="" o(="" e="" log(n)="" ).="" it="" remains="" to="" show="" that="" the="" algorithm="" really="" finds="" the="" minimum="" spanning="" tree.="" suppose="" there="" is="" a="" tree="" tmin="" which="" has="" less="" total="" length="" than="" the="" tree="" tk="" constructed="" by="" kruskal's="" algorithm.="" kruskal's="" algorithm="" selects="" edges="" one="" by="" one,="" an="" the="" first="" edge="" the="" algorithm="" selects,="" the="" shortest,="" is="" always="" in="" the="" mst,="" so="" there="" must="" be="" a="" first="" edge="" in="" the="" sequence="" of="" edges="" selected="" by="" kruskal's="" algorithm="" that="" is="" a="" wrong="" choice. ="" let="" xy="" be="" that="" first="" edge="" that="" is="" selected="" in="" error,="" so="" all="" edges="" chosen ="" earlier="" by="" k="" could="" be="" extended="" to="" a="" true="" mst,="" but="" with="" the="" choice="" of="" xy="" that="" becomes="" impossible.="" let="" a="" be="" the="" set="" of="" edges="" selected="" earlier="" by="" k,="" and="" tmin="" be="" a="" true="" mst="" that="" contains="" a.="" the="" edge="" xy="" connects="" two="" connected="" components="" of="" a,="" since="" it="" was="" selected="" by="" k.="" the="" same="" connected="" components="" must="" be="" connected="" in="" a="" different="" way="" in="" tmin,="" and="" the="" edge="" of="" path="" that="" connects="" them="" in="" tmin="" contains="" at="" least="" one="" edge="" uv="" that="" is="" not="" present="" in="" a,="" otherwise="" they="" would="" be="" the="" same="" connected="" component.="" this="" edge="" uv="" was="" not="" selected="" by="" k="" at="" the="" moment="" xy="" was="" selected,="" although="" it="" joins="" different="" connected="" components,="" so="" uv="" must="" be="" longer="" than="" xy.="" but="" then="" tmin="" is="" not="" a="" minimum="" spanning="" tree:="" we="" can="" make="" it="" shorter="" by="" replacing="" uv="" with="" xy.="" so="" the="" assumption="" that="" tk="" was="" not="" minimal="" led="" to="" a="" contradiction,="" an="" we="" proved="" that="" kruskal's="" algorithm="" constructs="" a="" minimum="" spanning="">= log(n). since each edge points from a node of smaller rank to a node of larger rank, if the maximum rank is log(n), the path from any node to its root is at most log(n) long, so the time for the same_component() query and the join_components() operation each is o( log(n)), as we claimed. this shows that the algorithm has run time o( e log(n) ). it remains to show that the algorithm really finds the minimum spanning tree. suppose there is a tree tmin which has less total length than the tree tk constructed by kruskal's algorithm. kruskal's algorithm selects edges one by one, an the first edge the algorithm selects, the shortest, is always in the mst, so there must be a first edge in the sequence of edges selected by kruskal's algorithm that is a wrong choice. let xy be that first edge that is selected in error, so all edges chosen earlier by k could be extended to a true mst, but with the choice of xy that becomes impossible. let a be the set of edges selected earlier by k, and tmin be a true mst that contains a. the edge xy connects two connected components of a, since it was selected by k. the same connected components must be connected in a different way in tmin, and the edge of path that connects them in tmin contains at least one edge uv that is not present in a, otherwise they would be the same connected component. this edge uv was not selected by k at the moment xy was selected, although it joins different connected components, so uv must be longer than xy. but then tmin is not a minimum spanning tree: we can make it shorter by replacing uv with xy. so the assumption that tk was not minimal led to a contradiction, an we proved that kruskal's algorithm constructs a minimum spanning tree.>