Before starting today's topic, fill in a blank:

Last week we discussed AVL trees, with three operations: INSERT, DELETE, SEARCH. AVL trees are one implementation of an abstract data type called a Dictionary, achieving O(log n) worst-case time for all three operations. There are other implementations, such as hash tables, which we will describe later on.

But first.

Augmenting Data Structures [Not in G&T text. In CLRS, Chapter 14.]

# G&T text?

General Definition:

An augmented data structure is simply an existing data structure
modified to store additional information and/or perform additional
operations.

Generally, a data structure is augmented as follows:

1. Determine additional information that needs to be maintained.

2. Check that the additional information can be maintained during each
of the original operations (and at what additional cost, if any).

3. Implement the new operations.

Example 1: Key Rank

We want a data structure that will allow us to answer two types of
rank queries on sets of values, as well as having standard operations
for maintaining the set (INSERT, DELETE, SEARCH):

. RANK(k): Given a key k, what is its rank, i.e., its position in
sorted order among the elements in the data structure?

. SELECT(r): Given a rank r, what is the key with that rank?

For example, if our set of values is {3,15,27,30,56}, then RANK(15) = 2
and SELECT(4) = 30.
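To pin down the semantics before comparing implementations, here is a minimal sketch of RANK and SELECT on a plain sorted Python list (the function names and 1-based ranks follow the example above; `bisect_left` is a standard-library call):

```python
import bisect

# Hypothetical sketch: RANK and SELECT on a plain sorted list,
# just to fix the semantics. Ranks are 1-based.
def rank(sorted_keys, k):
    """Position of key k among the elements (1-based)."""
    return bisect.bisect_left(sorted_keys, k) + 1

def select(sorted_keys, r):
    """Key whose rank is r (1-based)."""
    return sorted_keys[r - 1]

keys = [3, 15, 27, 30, 56]
print(rank(keys, 15))   # 2
print(select(keys, 4))  # 30
```

Note that a sorted list answers both queries quickly but pays Theta(n) per INSERT/DELETE, which is why we look at tree-based implementations below.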

Let's look at 3 different ways we could do this:

1. Use an AVL tree without modification.

. Queries: Simply carry out an inorder traversal of the tree, keeping
track of the number of nodes visited, until the desired rank or key
is reached.
# inorder traversal
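A minimal sketch of this unmodified-tree approach, assuming a bare-bones BST node class (names are hypothetical, not from the lecture):

```python
# Hypothetical sketch of approach 1: answer RANK/SELECT by an inorder
# walk of an unmodified BST, counting nodes as we go -- Theta(n) time.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(x):
    if x is not None:
        yield from inorder(x.left)
        yield x.key
        yield from inorder(x.right)

def rank(root, k):
    for position, key in enumerate(inorder(root), start=1):
        if key == k:
            return position
    raise KeyError("key not in tree")

def select(root, r):
    for position, key in enumerate(inorder(root), start=1):
        if position == r:
            return key
    raise IndexError("rank exceeds size of tree")

# The set {3,15,27,30,56} as a balanced BST:
root = Node(27, Node(15, Node(3)), Node(56, Node(30)))
print(rank(root, 15))   # 2
print(select(root, 4))  # 30
```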

[[Q: What will be the time for a query?]]
>>A: Theta(n) (a constant fraction of the nodes may have to be examined
for each query).<<

[[Q: Will the other operations (SEARCH/INSERT/DELETE) take any longer?]]
>>A: Updates: No additional information needs to be maintained.
Update time? No additional time.<<

[[Q: What is the problem? Could we do better?]]
>>A: Problem: Query time seems too long. We would expect to be able
to carry out both types of queries in only Theta(log n) time, since
they are very similar to a SEARCH in a 2-3-4 tree.<<
# 2-3-4 trees?

2. Augment AVL trees so that each node has an additional field rank[x]
that stores its rank in the tree.

[[Q: What will be the time for a query?]]
>>A: Observe that the relative ordering of nodes is the same whether
we use their key or their rank. Queries: Similar to SEARCH,
choosing the path according to the key or rank field (depending
on the type of query).
Query time? Theta(log n), just like for SEARCH in 2-3-4 trees.<<
# 2-3-4 trees?

[[Q: Will the other operations (SEARCH/INSERT/DELETE) take any longer?]]
>>A: Updates: Carry out the normal update procedure, then update the rank
field of all affected nodes.
Update time? Theta(n), since any insertion or deletion affects the
rank of _all_ subsequent nodes in the tree. Also, each restructuring
operation (i.e., rotations, transfers) affects the ranks of all the
nodes in the subtrees involved.<<
# restructuring preserves rank

[[Q: What is the problem? Could we do better?]]
>>A: Problem: We've achieved the Theta(log n) query time we wanted, but
at the expense of the update time, which has gone up from Theta(log n)
to Theta(n). We would like all operations to take at worst
Theta(log n) time.<<

3. Augment the tree in a more sophisticated way: augment each node x
with a new data field.

Key idea: newfield[x] depends only on leftchild(x) and rightchild(x).
In particular, newfield[x] can depend on the key or newfield of
its children.

Question: what should we store in newfield?
Answer: Let each node x store an additional field
size[x] = number of keys in the subtree rooted at x (including x itself).
* size[x] = size[leftchild(x)] + size[rightchild(x)] + 1
* "just enough data" to reconstruct rank in O(log n) time!
* easy to maintain size[x] because it can be recomputed from the children of x

- Queries:
Consider node x. rank[x] = 1 + number of keys that precede x in the tree.
Rank of x RELATIVE TO THE KEYS IN SUBTREE[x] = size[leftchild(x)] + 1.

. RANK(k): Given key k, perform SEARCH on k, keeping track of a
"current rank" r: each time you go down a level to the right, you must
add the size of the left subtree you skipped, plus 1 for the key itself
that you skipped. Notice that this is the "relative" rank of the key
within the subtree you are exploring.

Recall BST search:

SEARCH(x, key):
    if x = NIL: return NIL        (key is not in the tree)
    if x.key = key: return x
    if key < x.key: return SEARCH( leftchild(x), key )
    else:           return SEARCH( rightchild(x), key )

SIZE(x):
    if x = NIL: return 0
    else:       return x.size

// return rank of key relative to tree rooted at x
RANK(x, key):
    if x = NIL: EXCEPTION("key not in tree")
    if x.key = key:
        return SIZE( leftchild(x) ) + 1
    if key < x.key:
        return RANK( leftchild(x), key )
    else  // x.key < key
        return SIZE( leftchild(x) ) + 1 + RANK( rightchild(x), key )

// return the r'th-ranked node in tree rooted at x
SELECT(x, r):
    if x = NIL: EXCEPTION("empty tree")
    if r < 1 or r > SIZE(x): EXCEPTION("rank out of range")
    if r <= SIZE( leftchild(x) ):
        return SELECT( leftchild(x), r )
    elif r = SIZE( leftchild(x) ) + 1:
        return x
    else:
        return SELECT( rightchild(x), r - SIZE( leftchild(x) ) - 1 )

[[Q: What will be the time for a query?]]
>>A: Theta(log n), as desired, since both algorithms are essentially
like SEARCH (tracing a single path down from the root and doing a
constant amount of work each step down).<<

- Updates: INSERT and DELETE operations consist of two phases for AVL
trees: the operation itself, followed by the fix-up process. We look
at the operation phase first, and check the fix-up process only once
afterwards.

. INSERT(x): Simply increment the size of the subtree rooted at every
node that is examined when finding the position of x (since x will
be added in that subtree).

. DELETE(x): Consider the element y that is actually removed by the
operation (so y = x or y = successor(x)). We know the size of the
subtree rooted at every node on the path from the root down to y
decreases by 1, so we simply traverse that path to decrement the
size of each node on it.

For the fix-up phase: a rotation changes the subtrees of only the
rotated nodes, and each affected size field can be recomputed from its
children in O(1), so fix-up still takes Theta(log n) total.

Update time? We have only added a constant amount of extra work during
each phase of each operation, so the total time is still Theta(log n).

- Now we have finally achieved what we wanted: each operation (old or
new) takes Theta(log n) time in the worst case.

Example 2: "Most Frequent Key"
------------------------------

Assume that we have a set D, where each element of D is a (data, key)
pair. We allow duplicate keys.
We want a data structure that supports the following operations:

Insert(d, k): Insert (d, k) into D.
Delete(e): Delete an element e in D (no need to search, as e is an
element).
MostFrequent: Find an element e in D such that e.key is a most frequent
key. Note: the most frequent key is not necessarily unique; any most
frequent key is acceptable.

(1) Give an efficient implementation of this abstract data type.
(2) Analyze the running time of the operations.
(3) Give real-world use cases.

Solution: This ADT can be implemented using a priority queue for keys
and their frequencies, plus a dictionary for data and keys:
(a) a heap of keys and their frequencies, prioritized by frequency;
(b) a dictionary (e.g., a balanced tree) for the (data, key) pairs.
Note: elements with duplicate keys are chained together in a linked list.

Cost of operations (let n be the number of unique keys):

MostFrequent: O(1) -- peek at the top of the heap.

Insert(d, key):
(a) Update the dictionary:
    O(log n) for the dictionary search.
    Dictionary update: O(1) if inserting a duplicate key (prepend to
    the linked list)
    # O(log n) overall, since we still have to find the key in the dictionary
    O(log n) if inserting a new key (AVL insertion).
(b) Update the heap:
    O(log n) if inserting a new key.
    O(log n) if increasing the priority of a duplicate key.
Overall cost: O(log n).

Delete(e): No dictionary search needed, since e is an element.
# is e a pointer to the (d,k) pair?
(a) Dictionary update:
    O(1) if deleting a duplicate key (doubly-linked-list update).
    O(log n) if deleting the last element with that key (AVL deletion).
(b) Update the heap:
    O(log n) if decreasing the priority of a duplicate key.
    How to delete a key from the heap entirely? How about: O(log n) to
    increase its priority to infinity, then O(log n) to pop it from
    the heap.
Overall cost: O(log n).

# Optional?
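As a rough illustration of the two-part design (a sketch, not the lecture's exact structure): the code below substitutes a hash map for the balanced-tree dictionary and a "lazy" `heapq` max-heap, whose stale entries are discarded at query time, for an addressable heap; it also deletes by key rather than by element pointer, for brevity. With these simplifications, MostFrequent becomes amortized rather than worst-case cost:

```python
import heapq

class MostFrequentADT:
    """Hypothetical simplified sketch of the heap + dictionary design."""
    def __init__(self):
        self.count = {}   # key -> current frequency (stands in for the AVL dictionary)
        self.heap = []    # entries (-frequency, key); some may be stale

    def insert(self, key):
        self.count[key] = self.count.get(key, 0) + 1
        # Push a fresh entry instead of updating in place (lazy heap).
        heapq.heappush(self.heap, (-self.count[key], key))

    def delete(self, key):
        self.count[key] -= 1
        if self.count[key] == 0:
            del self.count[key]
        else:
            heapq.heappush(self.heap, (-self.count[key], key))

    def most_frequent(self):
        # Discard stale entries whose frequency no longer matches.
        while self.heap:
            neg_freq, key = self.heap[0]
            if self.count.get(key) == -neg_freq:
                return key
            heapq.heappop(self.heap)
        raise KeyError("empty")

d = MostFrequentADT()
for k in ["a", "b", "b", "c"]:
    d.insert(k)
print(d.most_frequent())  # "b"
```

The lazy-deletion trick trades the O(1) worst-case MostFrequent of an addressable heap for simpler code; each stale entry is popped at most once, so the amortized cost stays O(log n) per operation.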
This is optimal in the comparison-based model of computation. In
complexity theory, the "element distinctness problem" is the problem of
determining whether all the elements of a list are distinct. The problem
may be solved by sorting the list and then checking for consecutive
equal elements, in O(n log n) time, and this is known to be
asymptotically optimal. If we could do better than O(log n) per
operation above, we could "beat" the best solution for the element
distinctness problem (insert all n elements, then ask whether the most
frequent key occurs more than once), which is impossible. So either
MostFrequent or Insert must take Omega(log n) amortized time. For
RAM-based models, we can do better (i.e., hashing).
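For instance, under hashing, an insert-only workload already shows the speedup: a running maximum can be maintained in O(1) expected time per Insert. This is a hypothetical sketch; supporting Delete efficiently would need more machinery (e.g., buckets of keys grouped by frequency):

```python
class InsertOnlyMostFrequent:
    """Hypothetical hashing-based sketch: insert-only, O(1) expected per op."""
    def __init__(self):
        self.count = {}       # hash table: key -> frequency
        self.best_key = None
        self.best_count = 0

    def insert(self, key):
        c = self.count.get(key, 0) + 1   # O(1) expected via hashing
        self.count[key] = c
        if c > self.best_count:          # the max can only grow on insert
            self.best_count, self.best_key = c, key

    def most_frequent(self):
        return self.best_key

m = InsertOnlyMostFrequent()
for k in [3, 15, 15, 30]:
    m.insert(k)
print(m.most_frequent())  # 15
```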
