Homework 3

Fundamental Algorithms, Fall 2001, Professor Yap

DUE: Mon Oct 22, in class.

NOTICE: The solution for this homework will be published immediately after the due date, in preparation for the midterm.

Exercise 7.4-5, page 159. [ 5+10+10+5 POINTS ]
We will break this question into subparts as follows. First, let us call the variant of Quicksort in this exercise the k-Truncated Quicksort, for any k ≥ 2. In this variant, we stop the recursion when the size of the subarray to be sorted is < k. The standard Quicksort is the case k=2. Insertion sort is a very simple algorithm described in chapter 2 (p.15).
(i) Let the array A contain the output of the k-Truncated Quicksort (A is also going to be the input for our insertion sort). Assume the input numbers are distinct. What can you say about the ordering in A, without knowing anything about how Quicksort makes its random choices?
(ii) Argue that the expected time of the k-Truncated Quicksort algorithm is O(n log(n/k)).
(iii) Argue that the worst case time of insertion sort on the output of the k-Truncated Quicksort algorithm is O(kn).
(iv) Suppose the time for (ii) is C₁ n log(n/k) and the time for (iii) is C₂ kn. Describe how you would choose k to minimize the sum

T(n) = C₁ nlog(n/k) + C₂ kn.
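For concreteness, here is a minimal Python sketch of the two phases described above. It is only an illustration, not part of the exercise: the function names, the Lomuto-style partition, and the uniformly random pivot are assumptions, and the textbook's Quicksort may differ in such details.

```python
import random

def truncated_quicksort(a, k, lo=0, hi=None):
    """Quicksort on a[lo:hi] that leaves subarrays of size < k untouched."""
    if hi is None:
        hi = len(a)
    # Stop the recursion once the subarray has fewer than k elements.
    if hi - lo < k:
        return
    # Pick a random pivot, move it to the end, then partition (Lomuto style).
    p = random.randrange(lo, hi)
    a[p], a[hi - 1] = a[hi - 1], a[p]
    pivot = a[hi - 1]
    i = lo
    for j in range(lo, hi - 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi - 1] = a[hi - 1], a[i]
    truncated_quicksort(a, k, lo, i)
    truncated_quicksort(a, k, i + 1, hi)

def insertion_sort(a):
    """Standard in-place insertion sort."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
```

Calling truncated_quicksort(A, k) and then insertion_sort(A) sorts A; printing A between the two calls is one way to experiment with part (i).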

Exercise 9.1-1, page 185. [ 10 POINTS ]
Find the second largest in at most n + ⌈lg n⌉ − 2 comparisons. HINT: set up a binary tree T with n leaves and height ⌈lg n⌉. Place the numbers in the leaves of T and use it as a ``tournament'' to compute the largest. Now, ask yourself what information you already possess about the next largest. Then, find it with the help of T.
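The following Python sketch shows one possible reading of the hint, with a comparison counter so the bound can be checked experimentally. It is an assumption-laden illustration rather than the intended solution: the tree is kept implicitly as successive rounds, an odd element out receives a bye, and the name second_largest is hypothetical.

```python
def second_largest(xs):
    """Return (second largest value, number of comparisons) for distinct xs."""
    assert len(xs) >= 2
    comparisons = 0
    # Each entry is (value, list of values it has beaten so far).
    level = [(x, []) for x in xs]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            (u, u_beaten), (v, v_beaten) = level[i], level[i + 1]
            comparisons += 1
            if u > v:
                u_beaten.append(v)
                nxt.append((u, u_beaten))
            else:
                v_beaten.append(u)
                nxt.append((v, v_beaten))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd element out advances without a match
        level = nxt
    # Scan only the values that lost directly to the overall winner.
    _, beaten = level[0]
    best = beaten[0]
    for y in beaten[1:]:
        comparisons += 1
        if y > best:
            best = y
    return best, comparisons
```

The returned count lets you compare the work done against n + ⌈lg n⌉ − 2 for inputs of your choice.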

Exercise 9.3-9, page 193. [ 10 POINTS ]
Please help Professor Olay with his oil problem.

Median analysis. [ 10+20 POINTS ]
Suppose that we have a version of the median algorithm whose recurrence is

T(n) = n + T(c₁n) + T(c₂n)

for some 0 < c₁, c₂ < 1. For instance, in the book, the version we saw was c₁=1/5 and c₂ = 7/10.
(i) Show that if c₁+c₂ < 1, we will have the solution T(n)=Θ(n). NOTE: you must show two facts: T(n)=O(n) and T(n)=Ω(n). But one of them is ``trivial'' (but you still need to note it in your answer).
(ii) Suppose c₁+c₂=1. Guess what T(n) should be, and try to prove the best upper and lower bound you can.
HINT: For both (i) and (ii), use an induction argument. To guess the correct value of T(n), think of the Master theorem. For instance, suppose you guessed that T(n) ≤ C n³ for n ≥ n₀ large enough. Attempt an induction proof, keeping C, n₀ undetermined (the proof will suggest what C, n₀ should be).
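Before attempting the induction, it can help to tabulate the recurrence numerically and look at the ratio T(n)/n. The Python sketch below does this under assumptions that the problem statement does not fix: arguments are rounded down to integers and T(0) = 0. The helper name make_T is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

def make_T(c1, c2):
    """Return T with T(n) = n + T(floor(c1*n)) + T(floor(c2*n)) and T(0) = 0."""
    @lru_cache(maxsize=None)
    def T(n):
        if n <= 0:
            return 0
        # int() truncates, which equals the floor for non-negative values.
        return n + T(int(c1 * n)) + T(int(c2 * n))
    return T

# The book's version: c1 = 1/5, c2 = 7/10.
T = make_T(Fraction(1, 5), Fraction(7, 10))
for n in (10, 1000, 100000):
    print(n, float(T(n)) / n)
```

Passing other values of c₁ and c₂ to make_T gives a quick way to form a guess before proving it.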

Exercise 11.3-2, page 236. [ 10 POINTS ]
Implementing the division method for a radix-128 number.
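As a starting point, the Python sketch below shows one common way to apply the division method to a string read as a radix-128 number, reducing modulo m after each character. Treat it as an assumption about what the exercise has in mind, not as the official answer; the function name radix128_mod is hypothetical.

```python
def radix128_mod(s, m):
    """Hash the ASCII string s, read as a radix-128 number, into 0..m-1."""
    h = 0
    for ch in s:
        d = ord(ch)
        assert d < 128, "expects 7-bit ASCII characters"
        # Horner step: reduce modulo m after every digit so h stays below m.
        h = (h * 128 + d) % m
    return h
```

For example, "pt" corresponds to the digits (112, 116), that is, the number 112·128 + 116 = 14452, so radix128_mod("pt", m) equals 14452 mod m.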

Perfect Hashing (Section 11.5, page 245). [ 20+10+15 POINTS ]
This exercise is to help you understand perfect hashing. Let U=ℤ_p for some prime p, and suppose we are given a set K ⊆ U of n keys to be stored. We use a 2-level table structure: a primary table of size n and secondary tables whose total size is at most 4n. Each hash function has the form h(x) = ((ax+b) mod p) mod m, where m is the table size, and a, b are the parameters determining h. Here the prime number p and n are fixed throughout. Let us go into implementation details.
Suppose the primary hash function ĥ: K → ℤ_n is fixed (that is, you have chosen the constants a and b). For each i ∈ ℤ_n, let B_i = ĥ⁻¹(i) = {k ∈ K: ĥ(k)=i} be the ith bucket. Let n_i = |B_i| and choose m_i = n_i². Suppose also that the secondary hash functions,

h_i: B_i → ℤ_{m_i}

have also been chosen for each i. In particular,

h_i(x) = ((a_i x + b_i) mod p) mod m_i

for some a_i, b_i, m_i.
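To make the notation concrete, the Python sketch below evaluates a hash function of the stated form and groups a key set into the primary buckets B_i with the secondary sizes m_i = n_i². The helper names make_hash and buckets are hypothetical, and the sketch deliberately leaves the data-structure layout for the questions below open.

```python
def make_hash(a, b, p, m):
    """Return h with h(x) = ((a*x + b) mod p) mod m."""
    return lambda x: ((a * x + b) % p) % m

def buckets(keys, h, n):
    """Group keys into the n primary buckets B_0, ..., B_{n-1}."""
    B = [[] for _ in range(n)]
    for k in keys:
        B[h(k)].append(k)
    return B

# Small example with p = 17, a = 3, b = 4 and a primary table of size 6.
h = make_hash(3, 4, 17, 6)
B = buckets([1, 2, 8], h, 6)
m = [len(bucket) ** 2 for bucket in B]  # secondary table sizes m_i = n_i^2
```

In this example h(8) = ((3·8 + 4) mod 17) mod 6 = (28 mod 17) mod 6 = 11 mod 6 = 5.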
(i) How would you implement this hashing scheme? There are two parts in answering this question: describe your data structure, and describe the ``Hash Search'' algorithm which uses this data structure. (In class, we call the Hash Search algorithm the ``LookUp'' algorithm.) ASSUMPTION: each k ∈ U can be stored in one entry of your array. PLEASE make explicit any other reasonable assumptions you need.
HINTS: Where would you store the constants a_i, b_i, m_i, and how would you organize the secondary tables? We suggest using two arrays: an array T for the primary table, and an array S for all the secondary tables.
(ii) Suppose your keys are ASCII strings of arbitrary length. So we no longer assume that they fit into a single array entry. Continue to assume that n=|K| can fit into an array entry. How would you modify the above scheme?
(iii) In part (i) we assumed the hash functions were given to you. But how can we find a suitable ĥ? Here is the method: pick randomly a, b ∈ ℤ_p where a ≠ 0 as the parameters for the function ĥ. Then check if

∑_{i=0}^{n−1} |ĥ⁻¹(i)|² < 2n.

If so, we have passed the ``test'' and we are done. If not, we repeat the test with another random choice of a,b.
Question A: how many times do you expect to do this test until it is passed? Question B: Give more details about how you would implement this test, and say how much time is needed for each test.
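The retry loop described above might be sketched in Python as follows. This is one possible implementation under stated assumptions, not the expected answer to Question B: the names bucket_sizes and find_primary_hash, the max_tries safety cap, and the example key set with p = 101 are all made up for illustration.

```python
import random

def bucket_sizes(keys, a, b, p, n):
    """Return the list of bucket sizes under ((a*k + b) mod p) mod n."""
    sizes = [0] * n
    for k in keys:
        sizes[((a * k + b) % p) % n] += 1
    return sizes

def find_primary_hash(keys, p, rng, max_tries=1000):
    """Draw random (a, b) until the sum of squared bucket sizes is < 2n."""
    n = len(keys)
    for tries in range(1, max_tries + 1):
        a = rng.randrange(1, p)  # a is nonzero
        b = rng.randrange(0, p)
        if sum(s * s for s in bucket_sizes(keys, a, b, p, n)) < 2 * n:
            return a, b, tries
    raise RuntimeError("no suitable (a, b) found within max_tries")

# Hypothetical key set in Z_101; the value of tries depends on the random draws.
keys = [10, 22, 37, 40, 52, 60, 70, 72, 75]
a, b, tries = find_primary_hash(keys, 101, random.Random(0))
```

Running the loop with different seeds and recording tries gives an empirical feel for Question A.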

ADDITIONAL QUESTIONS (not graded, but we will sketch answers)

Exercise 7.4-2, page 159. Show that the best case time for Quicksort is Ω(nlogn).

Exercise 9.3-3, page 192. Show how to make Quicksort run in O(nlogn) time in the WORST case. HINT: use a median algorithm somewhere.

Exercise 11.3-4, page 236. On the multiplication method.

Exercise 11-1, page 249. Longest probe for hashing.

File translated from TeX by TtH, version 3.01.
On 19 Oct 2001, 10:00.