==============================================================================
CSC 263Lecture Summary for Week6Fall 2019
READING: Chapter 7.
SELF-TEST: Exercises 7.1-1, 7.2-2.
QuickSort
The following algorithm sorts an input sequence S in non-decreasing order.
QuickSort(S):
1. if |S| <= 1:return Selse: 2. select pivot p in S 3. partition elements of S into:L <- elements of S less than pE <- elements of S equal to pG <- elements of S greater than p 4. return QuickSort(L) + E + QuickSort(G)To fully specify algorithm, select _first_ element of S as “pivot” in line 2(other choices are possible).Worst-case analysis of QuickSort:- Count ONLY COMPARISONS between elements of S. Q: where are these comparisons performed? A: during partition step (line 3).comparisons ALWAYS involve pivot- Lower-bound:. Let C(n) denote number of comparisons performed on this input:[n, n-1, n-2, …, 2, 1].. During partition (line 3), pivot n is compared to all other elements(n-1 comparisons) yielding L = [], E = [n], G = [n-1, n-2, …, 1].. All other comparisons happen in recursive call QuickSort(G) oninput [n-1, n-2, …, 2, 1], which performs C(n-1) comparisons bydefinition of C.. Hence, C(n) satisfies recurrence:C(n) = n-1 + C(n-1) for n > 1,
C(1) = 0.
. Solution of recurrence: C(n) = n-1 + n-2 + + 1 + 0 = n(n-1)/2.
. By definition, T(n) >= C(n) = (n choose 2).
Hence, T(n) is Omega(n^2).
Upper-bound:
. each element of S is pivot at most once
(pivot put into E, never looked at again),
. at most all other elements of S are compared to pivot,
. so every pair of elements of S compared at most once,
. there are (n choose 2) pairs of elements if |S| = n,
. so T(n) <= (n choose 2).Hence, T(n) is in O(n^2).- This means T(n) is Theta(n^2). What’s so “quick” about QuickSort?Best-case analysis of QuickSort:- suppose pivot is always in middle (even partitions)-> O(log n) levels of recursion
-> O(n) comparisons at each level
-> O(n log n) comparisons
Cost of QuickSort depends on quality of partition
Average-case analysis of QuickSort:
Sample space S_n = { all permutations of [1, 2, , n] }, with uniform
probability distribution (not necessarily a good assumption in general).
Note implicit assumption in our choice of S_n: no repeated element.
Q: Why is this reasonable?
A:
Running time can only decrease with repeated values (they get put
into E and never examined again), so assumption does not
underestimate complexity for more general sample spaces.
More general sample spaces tricky to define clearly and precisely.
Random variable: t_n(S) = number of comparisons on input S in S_n.
Satisfies recurrence:
t_0(S) = 0,t_1(S) = 0,
t_n(S) = n-1 + t_{i-1}(L) + t_{n-i}(G)
where i = first element of S
(i is the rank of the pivot)
All permutations equally likely
=> each element of {1,2,,n} equally likely to be first in S
=> pivot equal 1 with probability 1/n,
pivot equal 2 with probability 1/n,
,
pivot equal n with probability 1/n.
Also, after partition, L equally likely to be in any order,
and G equally likely to be in any order.
This leads to recurrence for T'(n) = E[t_n]:
T'(0) = 0,T'(1) = 0,
n
T'(n) = SUM ( 1/n * ( n-1 + T'(i-1) + T'(n-i) ) )
i=1
(sum over all possible values of pivot, of probability this particular
pivot is picked times number of comparisons performed for this pivot;
since S uniformly random, both L and G also uniformly random for their
input sizes).
< skipping derivation… >
T'(n) = Theta(n log n).
Randomized QuickSort:
Average-case analysis works only if each permutation equally likely. In
practice, this may not be true.
Instead of relying on unknown distribution of inputs, RANDOMIZE algorithm
by picking random element as pivot. This way,
**********************************************************
RANDOM behaviour of algorithm on any FIXED input
is equivalent to
FIXED behaviour of algorithm on a UNIFORMLY RANDOM input.
**********************************************************
*******************************************************************
EXPECTED worst-case time of RANDOMIZED algorithm ON ANY FIXED INPUT
is equivalent to
AVERAGE worst-case time of DETERMINISTIC algorithm ON RANDOM INPUT
*******************************************************************
(random input = uniform random permutation)
In general, randomized algorithms are good when there are many good
choices but it is difficult to find one choice that is guaranteed to be
good.
Alternative analysis for Randomized QuickSort
Consider input [1, 2, , n]
Upper bound on EXPECTED cost:
Lets count comparisons
Comparisons ALWAYS involve pivot
Each element selected as a pivot at most once.
Q: Will i and j be compared?
A: If i or j is a pivot
AND i and j are in same partition
THEN i and j are compared ONCE
i,j, same partition only if [i+1..j-1] not yet selected as pivots
Conversely, if k in [i+1..j-1] selected as pivot before i or j,
THEN i and j are NEVER compared
Let A_{ij} be event that i and j are compared
Q: What is P(A_{ij})?
A: = P(i or j selected as pivot before any k in [i+1..j-1]
Lets completely discount the selection of pivots outside the range [i,j]. The selection of outside pivots doesnt determine the race for whether i or j are selected prior to other pivots in [i,j].
Thus, we only need to determine the probability that i or j are selected as pivots among the pivots in the range [i,j].
Since all pivots are equally likely to be selected, the probability of a specific two of these (i and j) out of (j-i+1) of these is 2 / (j i + 1).
So, P(A_{ij}) = 2 / (j i + 1)
if selection of pivots is uniform and independent
Expected number of comparisons between i and j
= 1 P(A_{ij}) + 0 (1 P(A_{ij}) = P(A_{ij})
Expected total number of comparisons (upper bound):
n-1n
Sum SumP(A_{ij})
i=1 j=i+1
n-1n 2
= Sum Sum-
i=1 j=i+1 j i + 1
< if out of time, skip to O(n log n), refer to Ch. 7.4) >
(let k = j-i)
n-1 n-i 2
= Sum Sum –
i=1 k=1 k + 1
bounded from above by (changes to summand AND summation limits!)
n-1n2
< Sum Sum —–i=1 k=1 kn-1= SumO(log n) i=1 (Appendix A, Harmonic series)= O(n log n)======= END OF LECTURE =====================================================================Working out the recurrence relation===================================Consider the recurrenceT'(0) = 0,T'(1) = 0, nT'(n) = SUM ( 1/n * ( n-1 + T'(i-1) + T'(n-i) ) )i=1(sum over all possible values of pivot, of probability this particularpivot is picked times number of comparisons performed for this pivot;since S uniformly random, both L and G also uniformly random for theirinput sizes).Distributing sum over expression inside, factoring out constants, andregrouping, we get:2 n-1T'(n) = n-1 + – * SUM T'(j)n j=1First, write down expressions for T'(n) and T'(n-1):2 n-1T'(n) = n-1 + – * SUM T'(j)n j=1 2n-2T'(n-1) = n-2 + — * SUM T'(j)n-1 j=1We can _almost_ subtract T'(n-1) from T'(n) to cancel most terms,except for multiplicative factor. So compute instead: n * T'(n) – (n-1) * T'(n-1)n-1n-2 = n(n-1) + 2 * SUM T'(j) – (n-1)(n-2) – 2 * SUM T'(j)j=1j=1 = 2(n-1) + 2 T'(n-1)This gives recurrence: n T'(n) = (n+1) T'(n-1) + 2(n-1).Rewrite as follows (divide both sides by n(n+1)):T'(n)/(n+1) = T'(n-1)/n + 2(n-1)/n(n+1)Let A(n) = T'(n)/(n+1), rewrite recurrence as:A(n) = A(n-1) + 2(n-1)/n(n+1)(and A(0) = T'(0)/1 = 0)Now solve:nA(n) = SUM ( 2(i-1)/i(i+1) ) i=1n n = 2 SUM 1/(i+1) – 2 SUM 1/i(i+1) i=1 i=1 n+1n = 2 SUM 1/i – 2 SUM (1/i – 1/(i+1)) i=2 i=1 = 2 (H(n+1) – 1) – 2 (1 – 1/(n+1)) = 2 H(n+1) – 4 + 2/(n+1)where H(n) = 1 + 1/2 + … + 1/n is the Harmonic series andH(n) = ln n + O(1).So T'(n) = (n+1) A(n) = Theta(n log n).
Reviews
There are no reviews yet.