SECOND MIDTERM (with SOLUTIONS)

Problem 1

TRUE or FALSE [5 points each]
Minus 3 points for wrong answers. Brief justification is needed for full credit.
(a) Any comparison-based algorithm for computing the median must make at least lg n comparisons.

ANSWER: TRUE. By an information-theoretic argument, the comparison tree must have at least n leaves, since any of the n inputs could be the median. A binary tree with n leaves has height at least lg n.

REMARKS: Of course, lg n is very far from the truth, since we can use other arguments to prove a lower bound of n-1. Roughly, any computation that does not look at every input number cannot guarantee to find the answer. This already gives you a lower bound of n/2 comparisons. How can you improve this to n-1?

(b) "In a certain TRUE/FALSE question, you get 5 points for a correct answer, 0 points for not attempting the question and -3 points for an incorrect guess. Suppose you have NO clue what the answer might be. It is a good idea to toss a coin and answer TRUE on heads and FALSE on tails."

ANSWER: TRUE. Your expected value is (-3)(1/2)+(5)(1/2) which is equal to 1. This is better than the 0 points you would get if you do not attempt to answer the question.

Problem 2

RECURRENCES [15+20 points]
(a) Solve the recurrence

T(n) = 2T(n/2) + n lg lg n.

ANSWER: Use the transformation s(N) = T(2^N)/2^N. Then s(N) = s(N-1) + lg N. Hence s(N) = Θ(N lg N) and T(n) = n·s(lg n). Thus T(n) = Θ(n lg n lg lg n).

REMARK: of course, our definition of s(N) has simply combined the range transformation and domain transformation into one step.
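
As a numerical sanity check of the transformation, here is a minimal Python sketch. The base value s(1) = 0 and the function names are assumptions made for illustration only; the base value does not affect the Θ-bound. The sketch sums s(N) = s(N-1) + lg N directly and compares it with N lg N. The ratio approaches 1 only slowly, because s(N) = lg(N!) = N lg N - Θ(N).

```python
import math

def s(N):
    """s(N) = s(N-1) + lg N, with the assumed base value s(1) = 0."""
    total = 0.0
    for k in range(2, N + 1):
        total += math.log2(k)
    return total

def T_at_power_of_two(N):
    """T(2^N) = 2^N * s(N), undoing the transformation s(N) = T(2^N)/2^N."""
    return (2 ** N) * s(N)

if __name__ == "__main__":
    for N in (10, 100, 1000):
        # Ratio of s(N) to N lg N; it stays bounded, consistent with Theta(N lg N).
        print(N, s(N) / (N * math.log2(N)))
```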

(b) Prove the best upper bound you can for the following recurrence:

T(n) = n² + 8 T(n/3) + 40 T(n/6).

Use real induction. HINT: You may want to begin with some upper bound first, and see if you can improve. If you arrive at the equation g(a) = (8/3^a)+(40/6^a) = 1, this is easily solved without calculators. Does case (0) of the Master Theorem suggest anything?

ANSWER: For a preliminary investigation of this recurrence, if you look at the term "40 T(n/6)" alone, the Master theorem would tell you that T(n) = Ω(n^log₆(40)). Note that log₆(40) is between 2 and 3.

Let us guess that T(n) ≤ C n^a for some a that we will determine. Then, by applying real induction, we have

T(n) ≤ n² + 8·C·(n/3)^a + 40·C·(n/6)^a = n² + C·n^a·g(a) (*)

where

g(a) = 8/3^a + 40/6^a.

Note that g(2) > 1 and g(3) < 1. Suppose we choose a so that g(a) < 1 and a > 2. Then from equation (*), we have

T(n) ≤ C n^a ( n^(2-a)/C + g(a) ).

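If we choose C = 1/(1-g(a)) then, since n^(2-a) ≤ 1 for n ≥ 1 and a > 2, the factor multiplying C n^a is at most (1-g(a)) + g(a) = 1. This implies

T(n) ≤ C n^a

for all n ≥ 1. This would verify our induction hypothesis.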

REMARK: Let a = a₀ be the unique solution to the equation g(a) = 1. The above argument shows that we can choose a to be a₀+ε for any ε > 0. But, can you choose a = a₀? Well, case (0) of the Master theorem will suggest that you modify the induction hypothesis to

T(n) ≤ C n^a₀ log n.

This can be verified also. You can even achieve

T(n) ≤ C n^a₀

but you need to strengthen your induction hypothesis in another way (how?).
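
As a numerical cross-check of the claims g(2) > 1 > g(3), and to get a rough idea of where a₀ lies, here is a short Python sketch. The function names and the bisection routine are illustrative assumptions, not part of the intended pencil-and-paper solution. It relies only on g being strictly decreasing in a.

```python
def g(a):
    """g(a) = 8/3^a + 40/6^a, as defined in the solution."""
    return 8 / 3 ** a + 40 / 6 ** a

def solve_a0(lo=2.0, hi=3.0, iters=60):
    """Bisect for the root of g(a) = 1 on [lo, hi].

    Assumes g(lo) > 1 > g(hi); g is strictly decreasing, so the root is unique.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) > 1:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return (lo + hi) / 2

if __name__ == "__main__":
    a0 = solve_a0()
    print("g(2) =", g(2), " g(3) =", g(3))
    print("a0 is approximately", a0, "with g(a0) =", g(a0))
```

Any a slightly larger than the value found this way satisfies g(a) < 1 and a > 2, which is all the induction above needs.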

Problem 3

AVL TREES [15+20 points]
You need not show the keys, just draw the nodes in the following questions.
(a) How few nodes can there be in the smallest AVL tree of height 10?

ANSWER: We chose height h = 10 so that it would be impractical to enumerate or draw the AVL tree. You need to work out the general principle. But we had already seen this principle in our lectures and notes. It is closely related to the Fibonacci sequence.

Here is the answer: Let N(h) be the smallest number of nodes in an AVL tree of height h. Thus, N(0) = 1 and N(1) = 2. For h ≥ 1, we have N(h+1) = 1+N(h)+N(h-1). Thus, N(2) = 1+2+1 = 4, which you can verify directly. Continuing,

N(3) = 7, N(4) = 12, N(5) = 20, N(6) = 33, N(7) = 54, N(8) = 88, N(9) = 143, N(10) = 232.

REMARK: You get no credit if you simply show me the number "232". How do I know that you have not gotten this number by magic, or worse? ALWAYS SHOW YOUR WORKING SOMEWHERE.
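
The recurrence N(h+1) = 1 + N(h) + N(h-1) is easy to tabulate mechanically. The following Python sketch does so; the function names are chosen here for illustration. It also checks the Fibonacci connection in the form N(h) = F(h+3) - 1, where F(1) = F(2) = 1.

```python
def min_avl_nodes(h):
    """Smallest number of nodes in an AVL tree of height h (N(0) = 1, N(1) = 2)."""
    if h == 0:
        return 1
    prev, cur = 1, 2              # N(0), N(1)
    for _ in range(h - 1):
        prev, cur = cur, 1 + cur + prev
    return cur

def fib(k):
    """Fibonacci numbers with F(1) = F(2) = 1."""
    x, y = 1, 1
    for _ in range(k - 1):
        x, y = y, x + y
    return x

if __name__ == "__main__":
    for h in range(11):
        print(h, min_avl_nodes(h), fib(h + 3) - 1)
```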

(b) Show an AVL tree of height 4, and a single deletion of a node in this tree that will result in 2 rotations.

ANSWER: Construct a tree of height 4 with N(4) = 12 nodes. Note that N(4) is the minimum number in part (a). It is not necessary to use this minimum number, but it is the simplest example. Let u be the root of this tree, and u_L and u_R be its two children. Let the heights of u_L and u_R be 3 and 2, respectively. There is a leaf x below u_R of depth 2. Delete x. This will cause a double rotation below u_R. Then, you need one further rotation (either single or double, depending on the heights of the children of u_L.)

REMARKS: I intended each of the 2 rotations to be either a "single rotation" or a "double rotation". Since I did not specify that a "double rotation" counts as "one rotation", I gave you the benefit of the doubt. If you only use a height 3 tree, which some of you did, you cannot achieve the desired effect.

Problem 4

(a,b)-TREES [10+20 points]
(a) What is the minimum number of nodes in an (a,b)-tree of height h?

ANSWER: Each non-leaf must have at least a children, except that the root may have as few as 2 children. Hence the number of nodes at level l = 0, 1, 2, 3, 4, … (respectively) is at least

1, 2, 2a, 2a², 2a³, ….

Summing from levels 0 to level l, we get

1 + 2 Σ_{i=0}^{l-1} a^i = 1 + 2 (a^l - 1)/(a - 1).

REMARKS. You get partial credit if you do not know how to put the summation in closed form. The most frequent error is not remembering that the root may have only 2 children - but you can get most of the credit even in this case.
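
To double-check the closed form against the level-by-level count, here is a small Python sketch. The function names are illustrative, and a ≥ 2 is assumed so that the division by a-1 makes sense.

```python
def min_nodes_by_levels(a, l):
    """Sum the minimum node counts 1, 2, 2a, 2a^2, ... over levels 0 through l."""
    total = 1                      # level 0: the root
    for i in range(l):             # level i+1 has at least 2 * a^i nodes
        total += 2 * a ** i
    return total

def min_nodes_closed_form(a, l):
    """Closed form 1 + 2 (a^l - 1)/(a - 1); the division is exact for integer a >= 2."""
    return 1 + 2 * (a ** l - 1) // (a - 1)

if __name__ == "__main__":
    for a in (2, 3, 4):
        for l in range(5):
            print(a, l, min_nodes_by_levels(a, l), min_nodes_closed_form(a, l))
```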

(b) In our homework, we discussed a method for splitting and merging ("fusion of") nodes in an (a,b)-tree to achieve a utilization ratio of a/b that is more than 2/3. Generalize the arguments to achieve a ratio of m/(m+1) for any integer m. You must state the needed inequalities on a and b.

ANSWER: You need to ensure that a ≥ m+1. Then, when you need to merge or split, you can be assured of at least a-1 ≥ m siblings from which to borrow (if you are short) or with which to share (if you are long) your keys.

In case of merging, your current node has a-1 keys. We may assume that the other m siblings have a keys each. We can combine all these a(m+1)-1 keys and split them into m new nodes. This requires

a ≤ (mb + 1)/(m + 1). (**)

In case of splitting, your current node has b+1 keys. You may assume that m-1 siblings have b keys each. Then we combine all these mb+1 keys, and split them into m+1 new nodes. Again, the same inequality (**) is needed.
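
The claim that merging and splitting lead to the same inequality can be checked by brute force. The Python sketch below is a minimal illustration with made-up function names. It counts keys exactly as in the two cases above: merging packs the keys into m nodes holding at most b keys each, and splitting spreads them over m+1 nodes holding at least a keys each.

```python
def satisfies_star(a, b, m):
    """Inequality (**): a <= (m*b + 1)/(m + 1), written without division."""
    return a * (m + 1) <= m * b + 1

def merge_ok(a, b, m):
    """a-1 keys plus m siblings with a keys each must fit into m nodes of capacity b."""
    return (a - 1) + m * a <= m * b

def split_ok(a, b, m):
    """b+1 keys plus m-1 siblings with b keys each must give m+1 nodes at least a keys."""
    return (b + 1) + (m - 1) * b >= (m + 1) * a

if __name__ == "__main__":
    # Example: m = 2 with (a, b) = (3, 4) meets (**), and also a >= m + 1.
    print(satisfies_star(3, 4, 2), merge_ok(3, 4, 2), split_ok(3, 4, 2))
```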

Problem 5

PROBABILITY [15+10 Points]
Consider the following silly randomized algorithm: At each step, it rolls a die (a digital one, of course) that has one of six possible outcomes: 1, 2, 3, 4, 5, 6. In the i-th step, if the outcome is less than i, we stop. Otherwise, we go to the next step. The first step is i = 1. For instance, we never stop after the first step and surely stop by the 7-th step. Let T be the random variable corresponding to the number of steps.
(a) Set up the sample space, the event space, and the probability function for T.

ANSWER: Let Ω be the sample space and Σ the event space. The first mistake is to assume Ω = {1, 2, …, 6}. In our case, we can think of each sample point as the ENTIRE history of the rolls of your die. Let i₁, i₂, … be the result of the first, second, etc., roll. For instance, if you stop after step 2, this means you rolled (i₁,i₂) where i₁ is unrestricted (i₁ = 1, …, 6) and i₂ = 1. If you stop after step 3, this means you rolled (i₁,i₂,i₃) where i₁ is unrestricted, i₂ ≥ 2 and i₃ ≤ 2. And so on. Then

Ω = {(i₁,i₂): i₁ ≥ 1, i₂ ≤ 1} ∪ {(i₁,i₂,i₃): i₁ ≥ 1, i₂ ≥ 2, i₃ ≤ 2} ∪ {(i₁,i₂,i₃,i₄): i₁ ≥ 1, i₂ ≥ 2, i₃ ≥ 3, i₄ ≤ 3} ∪ …

Then Σ = 2^Ω. So each ω ∈ Ω is a finite sequence of length between 2 and 7. The probability of any sample point ω is

Pr(ω) = 6^(-|ω|)

where |ω| is the length of the sequence.

Now, you need to define T. This is a random variable, which means T is a function from Ω to the real numbers:

T: Ω → ℝ.

In this case, it is very simple,

T(ω) = |ω|.
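
To make the sample space concrete, here is a short Python sketch that enumerates every complete roll history and checks that the probabilities, each equal to 6 raised to minus the length of the history, add up to 1. The function name and the `sides` parameter are illustrative assumptions, not part of the problem statement.

```python
from fractions import Fraction

def sample_space(sides=6):
    """Enumerate all roll histories that end at the first step i with roll < i."""
    points = []

    def extend(prefix):
        step = len(prefix) + 1            # index i of the roll about to be made
        for roll in range(1, sides + 1):
            history = prefix + (roll,)
            if roll < step:
                points.append(history)    # the algorithm stops here
            else:
                extend(history)           # otherwise go on to the next step

    extend(())
    return points

if __name__ == "__main__":
    omega = sample_space()
    total = sum(Fraction(1, 6 ** len(w)) for w in omega)
    print(len(omega), "sample points; total probability =", total)
    print("lengths range from", min(map(len, omega)), "to", max(map(len, omega)))
```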

(b) Compute the expected value of T.

ANSWER: The expected value of T is

E[T] = Σ_{ω ∈ Ω} |ω| Pr(ω).

It is a bit tedious to compute (and we would not take off many points if you make mistakes in your computation). But the correct answer is

E[T] = 1223/324 > 3.
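
The tedious computation can be delegated to exact rational arithmetic. The Python sketch below does not sum over individual sample points. Instead it groups them by length: the probability of stopping at step i is the probability of surviving steps 1 through i-1 times the probability that the i-th roll is less than i. The function names are illustrative.

```python
from fractions import Fraction

def distribution_of_T(sides=6):
    """Return {i: Pr(T = i)} for the stopping time T, as exact fractions."""
    dist = {}
    survive = Fraction(1)                          # Pr(not stopped before step i)
    for i in range(1, sides + 2):
        stop = Fraction(min(i - 1, sides), sides)  # Pr(roll < i)
        dist[i] = survive * stop
        survive *= 1 - stop
    return dist

def expected_T(sides=6):
    """E[T] as the sum of i * Pr(T = i)."""
    return sum(i * p for i, p in distribution_of_T(sides).items())

if __name__ == "__main__":
    for i, p in distribution_of_T().items():
        print(i, p)
    print("E[T] =", expected_T(), "which is about", float(expected_T()))
```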

File translated from TeX by TtH, version 2.78.
On 13 Dec 2000, 16:05.