The purpose of this assignment is to implement an ADT for a Directed Graph (digraph) and associated operations in C. PA3 will utilize your List ADT once again, and some parts of this project will look very familiar if you completed PA2.
The Adjacency List representation of a digraph also consists of an array of Lists; However, the edges in a digraph are directed (like one-way streets); there may be an edge (3, 1) between vertex 3 and vertex 1 although there is no edge (1, 3) between vertex 1 and vertex 3. You will create a Digraph ADT that represents a directed graph as an array of Lists. As in PA2, each vertex is associated with an integer label in the range 1 to numVertices, where numVertices is the number of vertices in the digraph.
Your client program will use this Digraph ADT to perform various operations on the digraph that are describedbelow. TheclientprogramusingyourDigraphADTwillbecalledDigraphProperties,and will take two command line arguments (here % denotes the unix prompt):
% DigraphProperties input_file output_file
For PA3, you are to write a makefile that creates the executable file, called DigraphProperties, similar to the makefiles that you used for PA1 and PA2. Include a clean utility in your makefile.
1
Input and Output File Formats
The first line of the input file describes the digraph, and the other lines describe operations to be performed on the digraph. Most of these operations simply print values returned by functions implemented in a Digraph ADT that you will implement in Digraph.c and Digraph.h
The first line starts with an integer that well call numVertices that tells you the number of vertices in the digraph. The rest of that line gives pair of distinct numbers in the range 1 to numVertices, separated by a space. These numbers are the vertices for an edge. There is a comma between numVertices and the first edge, and a comma between edges. Well also put a space after each comma for readability. All edges are directed.
4, 1 3, 2 4, 3 4, 4 3, 3 2
Note that the same digraph could be represented by each of the following input lines: 4, 3 2, 4 3, 1 3, 3 4, 2 4
4, 3 2, 4 3, 1 3, 3 4, 2 4, 1 3
even though the edge (1, 3) appears in the last digraph twice.
However, the following digraph is different:
4, 3 2, 4 3, 1 3, 3 4, 2 4, 3 1
because edges {1, 3} and {3, 1} are different edges.
The rest of the lines in the input file that correspond to digraph operations on operands. Each line begins with a keyword, which is followed by the operands. You should output the entire input line into the output file as a separate line. That line should be followed by another line that just has the result of the operation. If an input line does not follow one of the specified formats, the output on the second line should just say ERROR, and processing of the input file should continue. For example, its an error if the operation isnt one of the legal operations (capitalization matters), or if the operation has the wrong number of operands, or if an operand thats supposed to be a vertex isnt a number between 1 and numVertices.
So the output file should always have twice as many lines as the input file (not counting the line that describes the digraph). For example, if an input file has 7 lines, the first of which is the digraph, then the output should have 12 lines in it, containing each digraph operation lines, each followed immediately by the result of that operation (or ERROR).
As in PA2: If the digraph in the first line of the input file is illegal, then you should just print out that first line, followed by another line with the word ERROR on it. Do not process any subsequent lines in the input file. Whats an illegal digraph? A digraph is illegal if the number of vertices isnt positive, or if an edge specifies a vertex that doesnt exist in the digraph. For example, if there are 6 vertices, both vertex 0 and vertex 7 are illegal. Also, there shouldnt be any self-loop edges between a vertex and itself, so an edge (3, 3) would also make the digraph illegal.
2
Here are the operation tokens, their operands, and descriptions of the results. All of these operations simply print values returned by functions implemented in the Digraph ADT that you will implement in Digraph.c and Digraph.h.
PrintDigraph takes no operands. It prints out the current digraph, in the same format as an input line. Each edge should be printed out exactly once, following the same format as the input line for the digraph, starting with the number of vertices, then a comma, then the edges.
For PA3, you must print the edges sorted by source and then by destination. If u < v, print all the edges whose source is vertex u before all the edges whose source is vertex v. Also, if (u, v1) and (u, v2) are edges, and v1 < v2, print (u, v1) before (u, v2).Strong suggestion: Keep the destinations for each vertex as a sorted list. That will make it easy to print out all the edges in the current digraph sorted as described. GetOrder takes no operands. It returns the total number of vertices in the current digraph. GetSize takes no operands. It returns the total number of edges in the current digraph. GetOutDegree takes a vertex u as operand. It returns the number of vertices that are outgoing neighbors of u in the current digraph. The out degree of a vertex u is the number of vertices v such that (u, v) is an edge in the digraph. AddEdge takes two vertices u and v as operands. It adds the edge (u, v) to the current digraph. If that edge didnt exist in the current digraph, it returns 0. If that edge already existed in the current digraph, it returns 1, since no action was taken. (This is not treated as an error.) DeleteEdge takes two vertices u and v as operands. It deletes the edge (u, v) from the current digraph. If that edge already existed in the current digraph, it returns 0. If that edge didnt exist in the current digraph, it returns 1, since no action was taken. (This is not treated as an error.) Distance takes a pair of vertices u and v as operands, and outputs the distance from u to v in the current digraph, and INF if there is no path from u to v. Of course, this must be a directed path. Yes, if the operands u and v are the same vertex, then there is a path from u to v that has no edges in it, so the output will be 0. The distance from u to v is the length of the shortest path from u to v. Acyclic takes no operands. It outputs YES if the current digraph is acyclic, and NO otherwise. TopoSort takes no operands. It outputs a topological sort of the current digraph, if the current digraph is acyclic. (There may be more than one correct topological sort.) However, if the digraph is not acyclic, it outputs IMPOSSIBLE. 3Example: For an input file that consists of the following lines:4, 1 3, 2 4, 3 4, 4 3, 3 2 PrintDigraphGetSizeDistance 1 4Distance 3 1 Acyclic TopoSort GetOutDegree 4 GetOutDegree 5 DeleteEdge 2 4 Acyclic DeleteEdge 1 2 DeleteEdge 4 3 AddEdge 1 2 Acyclic TopoSortthe output file should be:PrintDigraph4, 1 3, 2 4, 3 2, 3 4, 4 3GetSize5Distance 1 4 2Distance 3 1 INFAcyclicNOTopoSort IMPOSSIBLE GetOutDegree 4 1GetOutDegree 5 ERROR DeleteEdge 2 4 0AcyclicNO DeleteEdge 1 2 1DeleteEdge 4 3 0AddEdge 1 20Acyclic YES TopoSort 1 34 24Your Digraph ADT will be implemented in files Digraph.c and Digraph.h. Digraph.c defines a struct called DigraphObj, and Digraph.h should define a type called Digraph that points to this struct.Your DigraphObj struct is required to have the following fields. It is okay for you to have other fields, but you should document them in your README file. numVertices, the number of vertices in the digraph, called the order of the digraph. numEdges, the number of edges in the digraph, called the size of the digraph. An array of Lists, whose jth element contains the outgoing neighbors (destinations) from sourcevertex j, the vertices k such that (j, k) is an edge in the digraph.. This is the outgoing Adjacency List for vertex j. Youll use your List ADT from PA1 and PA2 to represent the outgoing Adjacency List for each vertex. We strongly suggest that you keep the outgoing neighbors in sorted order, to make printing the digraph simpler. Other arrays similar to the marks that you used in PA2. Youll have to determine yourselves what arrays (if any) youll need for PA3.Since indexes of C arrays begin at 0, you might choose to use arrays that have length numVertices +1, but only use indices 1 through numVertices. Thats so that array indices can be directly identified with vertex numbers (and index 0 is ignored). Of course, if you prefer to have arrays of length numVertices, rather than numVertices + 1,that works, as long as you always remember to look at position j-1 when youre accessing information about vertex j.5Your Digraph ADT should export the following operations through the file Digraph.h:/*** Constructors-Destructors ***/ Digraph newDigraph(int numVertices);// Returns a Digraph that points to a newly created DigraphObj representing a digraph which has // n vertices and no edges.void freeDigraph(Digraph* pG);// Frees all dynamic memory associated with its Digraph* argument, and sets // *pG to NULL./*** Access functions ***/ int getOrder(Digraph G);// Returns the order of G, the number of vertices in G.int getSize(Digraph G);// Returns the size of G, the number of edges in G.int getOutDegree(Digraph G, int u);// Returns the number of outgoing neighbors that vertex u has in G, the number of vertices v such // that (u, v) is an edge in G. Returns -1 if v is not a legal vertex.List getNeighbors(Digraph G, int u);// Returns a list that has all the vertices that are outgoing neighbors of vertex u, i.e., // a list that has all the vertices v such that (u, v) is an edge in G.// There is no input operation that corresponds to getNeighbors./*** Manipulation procedures ***/int addEdge(Digraph G, int u, int v);// Adds v to the adjacency list of u, if that edge doesnt already exist.// If the edge does already exist, does nothing. Used when edges are entered or added. // Returns 0 if (u, v) is a legal edge, and the edge didnt already exist.// Returns 1 if (u, v) is a legal edge and the edge did already exist.// Returns -1 if (u, v) is not a legal edge.int deleteEdge(Digraph G, int u, int v);// Deletes v from the adjacency list of u, if that edge exists.// If the edge doesnt exist, does nothing. Used when edges are deleted. // Returns 0 if (u, v) is a legal edge, and the edge did already exist.// Returns 1 if (u, v) is a legal edge and the edge didnt already exist.// Returns -1 if (u, v) is not a legal edge.You probably will want to have other functions in Digraph.c/Digraph.h that are similar to the unvisitAll, getMark and setMark functions in PA). But youll have to figure out what those functions are yourselves.6/*** Other operations ***/void printDigraph(FILE* out, Digraph G);// Outputs the digraph G in the same format as an input line, including the number of vertices // and the edges. The edges should be in sorted order, as described above.void distance(FILE* out, Digraph G, int u, int v);// Outputs the distance between vertices u and v if there is a directed path between them in the // digraph. Otherwise, outputs INFvoid acyclic(FILE* out, Digraph G);// Outputs YES if the digraph is acyclic. Otherwise, outputs NO.void topoSort(FILE* out, Digraph G);// Outputs a topological sort of the digraph. If the digraph is not acyclic, outputs IMPOSSIBLE.In addition to the above prototypes, Digraph.h will define the type Digraph, as well as #define constant macros for constants that you decide that you need to implement the functions described above.7What does DigraphProperties.c do?DigraphProperties reads in the first line of the input file, and uses newDigraph() and addEdge() to create a digraph that corresponds to that line, as described above. When it reads in a subsequent line whose operation token is one of GetOrder, GetSize or GetOutDegree, it calls the corresponding function in Digraph.c (using the digraph that was read in as the digraph argument), and outputs the value returned. For other operations tokens (Distance, Acyclic, TopoSort), it calls the corresponding function in Digraph.c which performs its own output. Obvious requirements about number of operands should be checked in DigraphProperties.c; obvious requirements about operand values in should be checked in Digraph.c. (If G isnt a digraph, your program has a bug, and should exit.)About distance: Find the distance between two vertices of a digraph using BFS.About acyclic: A digraph is acyclic if and only if a DFS of the digraph produces no back edges, asdescribed in Section 22.3 of CLRS. Youll have to figure out how to implement this. About topological sort: You may do topological sort using either DFS either from in-degree 0 vertices or (more traditionally) from all vertices. Well accept any legal topological sort result, not just some specificresults. 8Submitting your SolutionSince the Digraph ADT includes uses a List to represent the outgoing neighbors of each vertex, and getNeighbors() returns a List, the file Digraph.h should #include the header file List.h. (See the handout C Header File Guidelines, which is posted under ResourcesGeneral Resources, for commonly accepted policies on using .h files.) What files should Digraph.c and DigraphProperties.c include?You will submit 7 files for this project in the directory PA3.List.c, List.h, Digraph.c, Digraph.h, DigraphProperties.c, makefile, README Well deduct credit for inefficient algorithms.Based on the makefile from PA1, you should know how to construct a makefile that will create an executable called DigraphProperties, and will include a clean utility that removes all object files, including DigraphProperties. (There is no ListClientfor PA3.) We will post some public input files that you can use for testing, but we will also test your program on other private input files.Remember that the compile operations mentioned in the Makefile must invoke the gcc compiler with the flags -std=c99. You may develop your programs on any system, using any editor or IDE. But it is a requirement of this and all other assignments in C that your program compile without warnings or errors under gcc, and run properly in the Linux computing environment on the UNIX Timeshare unix.ucsc.edu provided by ITS. In particular, you should not use the cc compiler. Your C programs must also run without memory leaks. Test them using valgrind on unix.ucsc.edu by doing:% valgrind –leak-check=full program_name argument_listYou also must submit a README file for this (and every) assignment. The README file should list each file (other than the README file itself) that you submitted, together with a brief description of its role in the project, and any special notes to the people grading your assignment. The README is essentially a table of contents for the assignment. For PA3, you must also describe all the fields that you have in your DigraphObj struct besides the ones provided to you (numVertices, numEdges and the array of Lists) earlier in this document. Also, your README must briefly explain your algorithms for distance, acyclic and topoSort, and provide complexity of your algorithms (e.g., O(n+m) ). Well deduct credit for inefficient algorithms. Added in readme: Your README must briefly explain your algorithms for of distance, acyclic and topoSort, andprovide complexity of your algorithms (e.g., O(n+m) ). 9 Note1 on Lecture 16 and PA3 This is the first of 3 notes on PA3 that should clarify and help you complete PA3.An edge means that the software package that’s the source of the edge must be installed before the software package that’s the destination of the edge. So (for example) dpkg must be installed before coreutils, and coreutils must be installed before libbz2. I said that backwards during the Feb 15 presentation of Lecture 16, and I wanted to fix that.Topological sort uses Depth-First Search, produces an ordering for installation of the software packages that always install a software package before other software packages that depend on it. So the ordering shown on what is now Slide 45 of Lecture 16 for package installation is an ordering that works, starting with dpkg and ending with multiarch_support:dpkg coreutils tarlibbz2libselinux1 multiarch_supportThat topological sort (different from ordinary sort) is obtained by starting with an empty stack, and pushing a vertex onto the stack when it gets marked as AllDone during the DFS of the graph. So multiarch_support is the first vertex pushed onto the stack (hence installed last), and dpkg is the last vertex on the stack (hence installed first).The Finish time for a vertex is the term that it gets marked as AllDone. Vertices that have smaller “Finish times” will be lower down in the stack that vertices that have larger “finish times”. For example, the “Finish time” for libbz2 is 6, and the “Finish time” for coreutils is 10. That means that libbz2 will be lower down in the stack than coreutils.Time isn’t clock time; it’s just an integer that starts at 0 and keeps getting incremented. We used Finish time to prove that topological sort works; we haven’t used Start time yet. But (since you have to implement topological sort in PA3) it’s worth noting that you don’t have to keep track of time in your implementation! Instead, you can just push vertices on your stack when they are marked as AllDone. (Of course, vertices that are marked AllDone are never visited a second time.) And the topological sort result is the list of vertices in the stack, read from the top of the stack down. Of course, in PA3 you’ll just be using vertex numbers, not strings such as “coreutils”.Note2 on Lecture 16 and PA3 will discuss DFS for Undirected Graphs, and explain how to determine if an Undirected Graph is Acyclic. The last note, Note3, will continue the discussion for Directed Graphs.10 Note2 on Lecture 16 and PA3 This is the second of 3 notes on Lecture 16 and PA3 that should clarify Lecture 16 and help you complete PA3. This note discusses DFS and detecting cycles in Undirected Graphs. Note3 discusses Directed Graphs.For Undirected Graphs, let’s review DFS (which we’ve already discussed in class), as well as how to use DFS to test whether an Undirected Graph is Acyclic.As we saw, to do a DFS of an Undirected Graph G, you begin by marking all vertices as Unvisited. Then you perform DFS recursively starting at some vertex v.DFS(v):Mark v as InProgressFor all vertices w that are Neighbors(v),If w is Unvisited DFS(w) Mark v as AllDone ReturnThat’s a recursive DFS; MA3 had you write the equivalent non-recursive DFS.But that doesn’t complete DFS for G. If Undirected Graph G is not Connected, then it has more than one Connected Component, and you’ve only “visited” the Connected Component that has vertex v in it. So we need to perform DFS(v) for every vertex v that’s unvisited. And this takes time O(n+m), where n is the number of vertices in G and m is the number of edges in G. [Reason: Every edge is traveled once in each direction, and every vertex has to be touched once as we loop through all the vertices to do DFS on each unvisited vertex.]Now how can we use this algorithm to decide if there’s a cycle in Undirected Graph G? There’s a cycle in G if an only if during DFS we reach a vertex that’s InProgress. That means that we’re following a path in the DFS tree and there’s an edge that goes back to a vertex on that same path! Such an edge is called a “Back Edge”; see page 609 of CLRS. If we find a Back Edge, then of course there must be a cycle in G. And if there is a cycle in G, then we must find a Back Edge. (That may be less obvious.)So here’s how you modify DFS to tell if G has a cycle. In what follows, Neighbors of a vertex v means Outgoing Neighbors, the vertices w such that there is an edge (v,w) in the digraph.Mark all vertices as Unvisited.Boolean DFS_Test_Acyclic(v):Mark v as InProgressFor all vertices w that are Neighbors(v)If w is InProgress Return(TRUE) If DFS(w)Return(TRUE)Mark v as AllDone Return(FALSE)// Found a Back Edge // Found a Back Edge// Didn’t find a Back EdgeIf the value returned is TRUE, G has a cycle. If the value returned is FALSE, we didn’t find a cycle. But that doesn’t prove that G is Acyclic, since it could have more than one Connected Component.Once again, we have to loop going through all the vertices of G, performing DFS_Test_Acyclic(v) foreach Unvisited vertex so that we can determine whether any Connected Component of G has a cycle. As soon as DFS_Test_Acyclic(v) returns TRUE for a vertex v, we know that there’s a cycle in G. But if DFS_Test_Acyclic(v) returns FALSE for every vertex v, then G is Acyclic. In Note3, we’ll continue this discussion for Directed Graphs, since PA3 addresses Directed Graphs.11 Note3 on Lecture 16 and PA3 This is the third of 3 notes on Lecture 16 and PA3 that should clarify Lecture 16 and help you complete PA3. This note discusses DFS and detecting cycles in Directed Graphs.In this note, we’ll assume that G is a Directed Graph (Digraph). Of course, edges in a directed graph are “one-way”. If there is an edge (u, v) in G, then (v, u) might also be an edge in G, or (v, u) might not also be an edge in G.You should already know how to do Topological Sort on a Digraph using DFS based on Lecture 16 and Note1 in this series. So we’ll explain how to determine whether a Digraph is Acyclic. You might be able to figure this out for yourselves based on Lecture 16 plus Note1 and Note2. Try to figure this out for yourselves before reading further …… Okay, if you’re still reading, here’s a simple way to determine whether a Digraph is Acyclic. We’re going to describethis in words, rather than giving the algorithm, since you already know how Topological Sort on a Digraph using DFS works. And once again, it’s not important to keep track of Start time and Finish time, but it is important to know whether a vertex is marked as Unvisited, InProgress or AllDone. (By the way, CLRS uses the terms White, Gray and Black to describe these marks.)Begin by marking all vertices as Unvisited. Now start off at a vertex that has in-degree 0. If there are no such vertices, there you know that the graph has a cycle (as we discussed in class). Perform Topological Sort, counting how many Unvisited vertices you’ve reached, including the vertex that you started from (that has in-degree 0). Of course, you’re following directed edges when you go to (outgoing) neighbors of a vertex, since this is a Directed Graph. If a neighbor is Unvisited, you’re going to mark it as InProgress and visit it using DFS (which should return count of number unvisited vertices it reached, which gets added in). If a neighbor is AllDone, you don’t have to do anything with that neighbor. That doesn’t mean that there’s a cycle.We’ll explain why below; look for ***. But if a neighbor is InProgress, then the Digraph has a cycle. No need to keep looking; you’re done. Return -1instead of count. (If your DFS function invocation returns -1, its caller should also immediately return -1, rather than adding that to the unvisited count.) If DFS returns -1, then there’s a cycle. If the DFS invocation from the starting vertex (which has in-degree 0) doesn’t return 0, no cycle was found. But remember the number that it returned, the number of Unvisited vertices that were found starting at that in-degree 0 vertex.Of course, there may be multiple vertices that have in-degree 0. Do the same thing for each of the vertices that have in-degree 0, counting the number of Unvisited vertices that you found. Of course, you keep old marks; you only marked vertices as Unvisited at the beginning, before you took care of the first vertex that has in-degree 0. And as soon as you discover a cycle, you’re done; the graph has a cycle.Now suppose that you completed this for every vertex that has in-degree 0. That does not prove that the graph is Acyclic. Example: Suppose that you had 6 vertices, with edges (A, B), (B, C), (C, A), (D, E) and (D, F). The only vertex that has in-degree 0 is D; it’s count is 3 (for D, E and F). But the graph has a cycle A B C A.So you have to add up the counts for all the vertices that have in-degree 0. If that count equals the number of vertices in G, then the digraph is Acyclic. If that count doesn’t equal the number of vertices in the digraph, then the graph has a cycle.Note that the basic idea is very similar to the way you detect cycles in an Undirected Graph: If DFS ever reaches a node that’s InProgress, then there’s a cycle. But because it’s a Directed Graph, that’s not the entire algorithm.Two additional important comments:12 *** Why is it okay for us to reach a vertex that’s marked AllDone during the DFS? That’s because there could be a vertex that is reachable from two different vertices that have in-degree 0. Here’s a simple example with thatproperty: The 3 vertex digraph that has edges (A, C) and (B, C). Both A and B have in-degree 0, and C is reachable from both of them. What happens if you do topological sort on a graph that has a cycle? You might have an infinite loop going around the cycle forever, if you’re careless; that’s very bad. [Consider what would happen on the 5 vertex graph with edges (A, B1) (B1, B2), (B2, B3), (B3, B4), (B4, B1).]Or you might not reach all of the vertices in the graph, as with the6 vertex graph (A, B), (B, C), (C, A), (D, E) and (D, F) mentioned earlier. That’s also bad.But in PA3, the output for topoSort on a graph that has a cycle is supposed to be IMPOSSIBLE, so you’ll have tohandle topoSort correctly on a graph that isn’t Acyclic. Perhaps you can do topological sort and test for cycles in the same algorithm? Hmmm … And if you want to use Start and Finish time (instead of Unvisited, InProgress and AllDone marks), then you may do that, and you can figure out if a vertex is Unvisited, InProgress or AllDone based on Start and Finish. Start and Finish times are initialized for all vertices to a special “NIL” value. (You might use a -1 constant for NIL, if all times are non-negative.) A vertex is Unvisited if Start is NIL. (In that case, Finish also is NIL.) A vertex is InProgress if Start isn’t NIL but Finish is NIL. A vertex is AlDone if Finish isn’t NIL. (In that case, Start also isn’t NIL.) So there no need to have both a) Start and Finish time and b) Unvisited, InProgress and AllDone marks for PA3; either will do.13
Reviews
There are no reviews yet.