[SOLVED] data structure algorithm network =============================================================================

$25

File Name: data_structure_algorithm_network_=============================================================================.zip
File Size: 1036.2 KB

5/5 - (1 vote)

=============================================================================
CSC 263 Lecture Summary for Week 08 Fall 2019
=============================================================================

READING: Sections 21.1, 21.2, 21.3.
SELF-TEST: Exercise 21.2-2.


Find-Union

Basic abstractions
set of objects
union command: connect two objects
find query: is there a path connecting one object to another?

Objects
Union-find applications involve manipulating objects of all types.
Computers in a network.
Web pages on the Internet.
Transistors in a computer chip.
Variable name aliases.
Pixels in a digital photo.
Metallic sites in a composite system.

When programming, convenient to name them 0 to N-1.
Hide details not relevant to union-find.
Integers allow quick access to object-related info.
Could use symbol table to translate from object names

Objects
0 1 2 3 4 5 6 7 8 9
grid points

Disjoint sets of objects.
0 1 { 2 3 9 } { 5 6 } 7 { 4 8 }
subsets of connected grid points

Find query: are objects 2 and 9 in the same set?
0 1 {239} {5-6} 7 {4-8}
are two grid points connected?

Union command: merge sets containing 3 and 8.
0 1 {23489} {5-6} 7
add a connection between two grid points

Physical application: Percolation
A model for many physical systems
N-by-N grid.
Each square is vacant or occupied.
Grid percolates if top and bottom are connected by vacant squares (i.e., water drains from top to bottom)

Goal.
Design efficient data structure for union-find.
Find queries and union commands may be intermixed.
Number of operations M can be huge.
Number of objects N can be huge.


Disjoint Sets

Disjoint Set ADT:

Objects: Collection of nonempty disjoint sets S = {S_1,S_2,,S_k}
each S_i is a nonempty that has _no_ element in common with any other
S_j (S_i n S_j = {} for all i != j). Each set is identified by a unique
element called its representative.

Operations:

. MAKE-SET(x): Given element x that does not already belong to one of
the sets, create a new set {x} that contains only x (and assign x as
the representative of that new set).

. FIND-SET(x): Given element x, return the representative of the set
that contains x (or NIL if x does not belong to any set).

. UNION(x,y): Given two distinct elements x and y, let S_x be the set
that contains x and S_y be the set that contains y. Form a new set
consisting of S_x u S_y and remove S_x and S_y from the collection
(since all the sets must be disjoint). Pick a representative for the
new set usually (but not necessarily) one of the representatives
for S_x or S_y. Note: if both x and y belong to the same set already
(S_x = S_y), operation has no effect.

Applications:

In Kruskals algorithm, use disjoint set ADT to keep track of connected
components:

KRUSKAL-MST(G=(V,E),w:E->R)
T <- {}sort edges so w(e_1) <= w(e_2) <= … <= w(e_m)for each v in V:Make-Set(v)for i <- 1 to m:# let (u_i,v_i) = e_iif Find-Set(u_i) != Find-Set(v_i):Union(u_i,v_i)T <- T u {e_i}Data Structures for Disjoint Sets:- General assumptions:. Disjoint-set data structure manages _sets_ only (client code manages_elements_).. Each element x maintains attribute x.pointer to data structure.. Complexity analysed using worst-case sequence complexity (WCSC) –in keeping with context above (Kruskal’s algorithm).Each analysis based on m operations where n = number of MAKE-SETs.1. Circularly-linked list.Each set = one circularly-linked list; first element = representative;requires addition attribute .first in each node.- MAKE-SET(x) takes time O(1): create new linked list with element x;set x.pointer <- node storing x.- UNION(x,y) takes time for FIND-SET(x) and FIND-SET(y), plus time O(1)for the actual union, by changing the ‘next’ pointers of the firstelements in each set.- FIND-SET(x) takes time Omega(length of list): in the worst-case, musttraverse every link in a list before finding first element.Worst-case sequence complexity for m operations with n MAKE-SETs? One”bad” sequence: n = m/4 MAKE-SETs, m/4-1 UNIONs (yields one list withm/4 elements), m/2 FIND-SETs on second element in the list. EachFIND-SET requires time Omega(n = m/4), so total time is Omega(m^2).Also, since the number of elements in the structure at any point in anysequence of m operations is <= m, complexity of each operation in asequence is O(m) so the total time is O(m^2).2. Linked list with extra pointer to front.Each set = linked list where each element stores pointer to next elementand pointer “back” to first element (representative).- MAKE-SET(x) takes time O(1), as before.- FIND-SET(x) now takes time O(1): follow the back pointer.- UNION(x,y) takes time Omega(length of appended list): append one listto the end of the other, and modify every back pointer of the secondlist.Worst-case sequence complexity for m operations: n = m/2+1 MAKE-SETs,m/2-1 UNIONs, always appending longer list to the end of single-elementlist. Total time is Omega(m^2).3. Linked list with extra pointer to front and “union-by-weight”.Idea: during UNION, make sure shorter list appended to longer (requiringfewer updated to back pointers). Details: like previous idea but alsokeep track of number of elements in each list. MAKE-SET and FIND-SET notaffected (still take time O(1)), and when we perform UNION, appendsmaller set to larger one. This is called “union-by-weight” (the “weight”of a set is simply its size).Worst-case sequence complexity for m operations: let n be the number ofMAKE-SET operations in the sequence (so there are never more than nelements in total). For some arbitrary element x, we want to prove anupper bound on the number of times that x’s back pointer can be updated.Note that this happens only when the set that contains x is UNIONed witha set that is no smaller (because we only update back pointers for thesmaller set). This means that each time x’s back pointer is updated, theresulting set must have at least doubled in size. Since there are no morethan n elements in total in all the sets, this means that x’s backpointer cannot be updated more than lg(n) times. And since this is truefor every element x, the total number of pointer updates during theentire sequence of operations is O(n log n). The time for otheroperations is still O(1), and there are m operations in total, so thetotal time for the entire sequence is O(m + n log n).4. Trees.Each set = “inverted” tree, where each element points to its parent only– root points back to itself. Representative = root. Note: trees are_not_ necessarily binary: number of children of a node can be arbitrarilylarge (or small).- MAKE-SET(x) takes time O(1): create new tree with root x; setx.pointer <- node storing x.- UNION(x,y) takes time O(1): make root of one tree child of root ofother tree.- FIND-SET(x) takes time O(depth of x): follow parent pointers back toroot of x’s tree.Worst-case sequence complexity for m operations: just like for the linkedlist with back pointers but no size, we can create a tree that is justone long chain with m/4 elements, so that FIND-SET takes time Omega(m);if we perform m/2 FIND-SET operations, we get a sequence whose total timeis Omega(m^2).5. Trees with “union-by-weight”.As before, except also keep track of weight (i.e., size) of each tree andalways append smaller tree to larger one during UNION. Complexity ofMAKE-SET = O(1), and so is complexity of UNION (when one tree appended toanother, add the two weights for weight of the new tree).Complexity of FIND-SET? During any sequence of m operations, n of whichare MAKE-SET, the maximum height of any tree is O(log n). (Proof is byinduction on the height h of the trees.) So running time of individualFIND-SET is O(log n), and total time is O(m log n) for entire sequence.6. Trees with path compression.During FIND-SET(x), keep track of nodes visited on path from x to root oftree (use a stack or queue, or write FIND-SET recursively), and once theroot is found, update parent pointers of each node encountered to pointdirectly to the root. At most doubles running time of FIND-SET, but canspeed up future operations considerably.In fact, possible to prove (but messy) that worst-case running time of asingle operation in a sequence, if there are n MAKE-SET operations (so atmost n-1 UNIONs) and f FIND-SET operations is Theta( f log n / log(1+f/n) )if f >= n

Theta( n + f log n ) if f < nBut we can do even better!7. Trees with “union-by-rank” and path compression.With trees, the measure that matters most for running time is _height_of each tree, not size. So, instead of using weight to decide how tocarry out UNION, we use “rank”: an upper bound on the height of the tree– not always exactly equal to the height (because it is not efficientto keep rank always exactly equal to height).- MAKE-SET(x): set rank of x to 0 (1 would also work fine).- UNION(x,y): the node with higher rank is the new root and its rank isunchanged; if the two nodes have the same rank, pick any one as thenew root and increase its rank by 1.- FIND-SET(x): use path compression and leave ranks unchanged (actualheight may change but time to update ranks would be too great).It is possible to prove that the worst-case time for a sequence of moperations, where there are n MAKE-SETs, is O(m log* n) — for allpractical purposes, roughly O(m).(The textbook proves a slightly different but essentially equivalentbound in Theorem 21.14.)(See pp.58-59 of the textbook for a definition of the iterated logarithmlog* and some of its properties.)

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] data structure algorithm network =============================================================================
$25