CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
CSC263H1: Problem Set 7 Sample Solutions
Due Tuesday November 26 before 10pm
Note: solutions may be incomplete, and meant to be used as guidelines only. We encourage you to ask follow-up questions on the course forum or during office hours.
1. [10 marks] Disjoint Set Trees.
Let T1, T2 and T3 be three different disjoint-set forest implementations.
T1 uses union-by-weight, without path compression; T2 uses union-by-weight, with path compression;
T3 uses union-by-rank, with path compression.
Union-by-rank and path compression are described in the textbook discussion of disjoint-set forests. Union-by-weight, which described in the textbook as a heuristic for linked-list implementations, can be used as a heuristic for disjoint-set forests: the representative of the set containing more items becomes the representative of the union.
In all three implementations, when the rank or weight heuristic encounters a tie, the representative of the set that contains x becomes the new representative for Union(x, y).
Definition: two disjoint-set forests are equivalent iff they store the same items and can be drawn the same. Let P be a sequence of n disjoint set operations. Each operation may be a Make-Set, Find-Set, or
Union operation. Suppose that we perform P on each of T1, T2 and T3.
(a) [5 marks] What is the smallest n such that T1 and T2 are not equivalent? Fully justify your answer.
Solution
Answer: 8.
Let m1 be the length of the shortest sequence such that T1 and T2 are not equivalent after executing the sequence on them.
To show that m1 8, we give a sequence P1,,P8 such that T1 and T2 are not equivalent after executing this sequence on them: MAKESET(1), MAKESET(2), MAKESET(3), MAKE- SET(4), UNION(1,2), UNION(3,4), UNION(1,3), FINDSET(4).
After the first 7 operations, both T1 and T2 look like this:
1 / 23
| 4
After the last operation, T1 is unchanged while T2 looks like this:
1 /| 234
So T1 and T2 are not equivalent.
Page 1/6
CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
To show that m1 8, suppose we have a sequence P = P1,,Pn such that T1 and T2 are not equivalent after executing P on them. We will show that n 8.
The only difference between T1 and T2 is path compression, and path compression only makes a difference when performing a FINDSET on a node whose distance from the root is at least 2 (i.e., needs to follow at least 2 pointers to find the root). To create a tree with a node whose distance from the root is 2, we must have at least two 2-node trees. Creating a 2-node tree requires at least 3 operations (2 MAKESETs and 1 UNION). So P must at least include operations to create two 2-node trees, a UNION of those trees to create a tree with a node whose distance fromthenodeis2,andaFINDSETonthatnode. Thus3+3+1+1=8naswanted. Therefore m1 = 8.
(b) [5 marks] What is the smallest number n such that T2 and T3 are not equivalent? Fully justify your answer.
Solution
Answer: 9.
Let m2 be the length of the shortest sequence such that T2 and T3 are not equivalent after executing the sequence on them.
To show that m2 9, we give a sequence P1,,P9 such that T2 and T3 are not equivalent after executing this sequence on them: MAKESET(1), MAKESET(2), MAKESET(3), MAKE- SET(4), MAKESET(5), UNION(1,2), UNION(3,4), UNION(3,5), UNION(1,3).
After the first 8 operations, both T2 and T3 look like this:
1 (rank:1, weight:2) 3 (rank:1, weight:3) |/ 245
For the last operation, the first tree is made a subtree of the second tree in T2, while its the other way around in T3. See diagrams.
T2: 3 T3: 1 /| /
451 23 |/ 245
So T2 and T3 are not equivalent. Thus m2 9.
To show that m2 9, suppose we have a sequence P = P1,,Pn such that T2 and T3 are not equivalent after executing P on them. We will show that n 9.
The only difference between T2 and T3 is union-by-weight versus union-by-rank. Since all 2-node trees have weight 2 and rank 1, P must at least include operations to create one 3-node tree and one 2-node tree, and a UNION of these trees. Creating a 3-node tree requires at least 5 operations (3 MAKESETs and 2 UNIONs) and creating a 2-node tree requires at least 3 operations (2 MAKESETs and 1 UNION). Thus 5 + 3 + 1 = 9 n as wanted.
Therefore m2 = 9.
Page allowance: Please neatly present your solution using 12 pages. Any additional pages will not be marked.
Page 2/6
CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
2. [10 marks] Graphs: breadth-first search.
(a) [10 marks] Consider a connected undirected graph G = (V, E) represented by an adjacency matrix G.A. Design an algorithm that finds a vertex whose removal (deleting the vertex and all its incident edges) does NOT disconnect the graph. Your algorithm must be based on breadth first search (BFS).
i. Describe your algorithm in clear and precise English. ii. Explain why your algorithm works correctly.
iii. Analyse the worst-case running time of your algorithm and express the result in asymptotic notation.
Hint: Think about the BFS-tree.
Page allowance: Please neatly present your solution using a single page. Any additional pages will not be marked.
Solution
(a) Perform a complete BFS starting from an arbitrary vertex, then return any vertex v with the maximum distance values d[v]. You could also return the first BFS-tree leaf that you encounter and dont even finish building the tree.
(b) Consider the BFS-tree resulting from calling BFS. Firstly, this tree is connected and covers all vertices in V because G is connected. Secondly, the vertices with the maximum distance values must be leaf nodes in the BFS tree. Deleting a leaf node from a tree would NOT disconnect the tree. That is, after the removal of the returned vertex v, the BFS-tree must still be connected, therefore the graph G v must also be connected. Hence, the algorithm works correctly.
(c) The runtime of this algorithm is basically the runtime of the BFS. Since we use the adjacency matrix, in the worst-case, for each vertex v V we need to check |V | entries in order to check all incident edges. That gives us O(|V |2) worst-case running time.
Page 3/6
CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
3. [0 marks] Row Data Structure. This extra question is for your own edification. If you attempted it, we encourage you to upload your solution, for our data, but it will not be marked.
(a) [0 marks] Consider the following abstract data type ROW that describes the pixels on one row of a screen of width M. An object S of ROW is a subset of {1,,M}, which denotes those pixels that are illuminated. A line is a maximal1 set of consecutive integers in S. For example, If S = {3, 4, 5, 11, 12, 13, 14, 17}, then the row contains three lines: the line {3, 4, 5} whose endpoints are 3 and 5, the line {11, 12, 13, 14} whose endpoints are 11 and 14, and the line {17} whose endpoints are both 17.
ROW supports two operations:
ON(x) illuminates pixel x {1,,M}, i.e. it adds x to S, if it is not already in S.
ENDPOINTS(x) returns the endpoints of the line containing x. If x S, it returns (0,0).
Give a data structure that implements ROW such that the worst case complexity of any sequence of m ON and ENDPOINTS operations is O(m log n), where n is the size of S, and log is the iterated logarithm. Justify why your data structure is correct and has the required complexity.
Solution
Use an array A of size M and an augmented disjoint-set forest D with union-by-rank and path compression. A[i] represents pixel i and each tree in D represents a line. The root of each tree is augmented to store the endpoints of the line, denoted by root.left-endpoint and root.right- endpoint. If the pixel i is illuminated, A[i] stores a pointer to the node i in D. Otherwise, A[i] stores NIL.
English Explanation:
To perform ENDPOINTS(x),
First check A[x] to see if x is illuminated.
If A[x] is NIL, i.e., x is not illuminated, return (0,0).
Otherwise, call FIND(A[x]) to get the root of the tree that contains x. Return the left and right endpoints stored in the root.
To perform ON(x),
First check A[x] to see if x is illuminated.
If A[x] is not NIL, i.e., x is already illuminated, just return.
Otherwise, perform MAKE-SET(x), set both left and right endpoints to be x, and store a
pointer to the new node in A[x].
If x = 1 and its left neighbor x1 is illuminated, find the left endpoint a of the tree contains
A[x 1]; call UNION(A[x],A[x 1]) and let a be the left endpoint of the new root.
If x = M and its right neighbor x + 1 is illuminated, find the right endpoint b of the tree contains A[x + 1]; call UNION(A[x],A[x + 1]) and let b be the right endpoint of the new
root.
1If Q S is a set of consecutive integers, and there is no other subset of S that contains Q, then Q is maximal.
Page 4/6
CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
Pseudo Code:
ENDPOINT(x):
if A[x] == NIL:
return (0,0)
r FIND(A[x])
return (r.left-endpoint, r.right-endpoint)
ON(x):
if A[x] != NIL:
return
if x = 1 and A[x 1]! = NIL:
a FIND(A[x 1]).left-endpoint UNION(A[x],A[x 1])
new-root FIND(A[x 1]) new-root.left-endpoint a
if x = M and A[x + 1] != NIL:
b FIND(A[x + 1]).right-endpoint UNION(A[x],A[x + 1])
new-root FIND(A[x + 1]) new-root.right-endpoint b
Time Complexity: Since ENDPOINTS performs one disjoint-set forest operation, ON per- forms no more than five disjoint-set forest operations and other steps in ENDPOINTS and ON takes constant time, the total number of disjoint-set forest operations grows linearly with the number of ROW operations. Since the disjoint-set forest is union-by-rank and path-compressed, a sequence of m ROW operations is a sequence of at most 6m disjoint-set forest tree operations, and therefore has the worst-case sequence complexity O(mlog n).
(b) [0 marks] Give a data structure that implements ROW such that its space complexity is O(l) and
the time complexity of ON and ENDPOINTS is O(log l),
where l is the number of lines in S. Briefly justify why your data structure is correct and has the required complexity.
Solution
Use an interval tree where the left and right endpoints of each line are the endpoints of the interval. Note that, since each line contains at least one point, and the lines are disjoint, the number of intervals, k, is at most n.
English Explanation:
To perform ENDPOINTS(x), simply return the endpoints of the interval returned by INTERVAL- SEARCH(x).
To perform ON(x),
callINTERVAL-SEARCH(x)togetintervalwcontainingpixelxcallINTERVAL-SEARCH(x 1) to get interval u containing pixel x1 and call INTERVAL-SEARCH(x+1) to get interval
v containing pixel x + 1
if w exists i.e., pixel x is on, just return
if neither of u or v exists, call INTERVAL-INSERT(x, x)
Page 5/6
CSC263H1, Fall 2019 Problem Set 7 Sample Solutions
if both u and v exist, change the right endpoint of u to be the right endpoint of v. And then call INTERVAL-DELETE(v)
if only u exists, change the right endpoint of u to x
if only v exists, change the left endpoint of v to x Pseudo Code:
ENDPOINT(x):
return INTERVAL-SEARCH(x)
ON(x):
w INTERVAL-SEARCH(x)
u INTERVAL-SEARCH(x 1) v INTERVAL-SEARCH(x + 1)
if w != NIL: return
if u == NIL and v == NIL INTERVAL-INSERT(x, x)
if u != NIL and v != NIL: u.right-endpoint v.right-endpoint INTERVAL-DELETE(v)
if u != NIL: u.right-endpoint x
if v != NIL: u.left-endpoint x
Time Complexity:
Since INTERVAL-SEARCH, INTERVAL-INSERT, and INTERVAL-DELETE take O(log l) time and other steps take constant time, both ENDPOINTS and ON take O(log l) time.
Page 6/6
Reviews
There are no reviews yet.