Sample Final

  • Q5. The cost to increment from 31 to 129 is 2(98)-3 = 193, not 173. The reason: you must subtract the increase in potential, which is 3.
    Thanks to Zhao Jin for this report. Thanks to David Amouyal for correcting the arithmetic error.

Homework 8 Remarks

  • Q11. As seen in class email, part (a) is not meaningless, and its optimal cost is 25.
    Also, part (b) has 2 typos in the first row: C(1,4)=14, C(1,5)=18 (not C(1,4)=10, C(1,5)=14).
    Thanks to Csaba for pointing out the typos.

Homework 7 Remarks

  • Q3. Recall my rule that +infinity +(-infinity) = +infinity. Try to understand why I have this rule.
  • Q4. An external BST has n internal nodes iff it has n+1 external nodes. The "n+a" in the question is clearly a typo.
  • What is the * character in Q4 and Q5?
    I said that the * character is assumed to be the least letter in our alphabet (i.e., the ASCII character set). Actually, the least character in the ASCII code is usually called NUL, and its hexadecimal code is 0x00. So, in some sense, * should be this NUL character. The problem is that * is itself an ASCII character, whose hexadecimal code is 0x2A. This is not a problem until you compare * with the space character, whose hex code is 0x20. Then you MAY think that * is larger than the space character. BUT my intention is that * is smaller than EVERY other letter. SOLUTION: think of * in my problem as the NUL character. I use the * character because that is the convention in Huffman trees.
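
The hex codes quoted above can be checked directly. The following is only a small Python sketch of the suggested fix; the helper name huffman_key is hypothetical and introduced just for illustration. It maps * to NUL before comparing, so that * precedes every other character.

```python
# Codes quoted in the remark: '*' is 0x2A, space is 0x20, NUL is 0x00.
STAR, SPACE, NUL = "*", " ", "\x00"


def huffman_key(ch: str) -> int:
    """Hypothetical sort key: read '*' as NUL so it precedes every other character."""
    return ord(NUL) if ch == STAR else ord(ch)


letters = ["b", STAR, SPACE, "a"]

# Under plain ASCII ordering, '*' sorts AFTER the space character.
plain = sorted(letters)

# With '*' read as NUL, it sorts before everything else.
fixed = sorted(letters, key=huffman_key)
```

Here plain places the space before *, while fixed places * first, which is the intended ordering.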

Homework 6 Remarks

  • Q1. We said in the REMARKS that "G_n cannot be acyclic". Of course, it should be H_n, not G_n.
  • Q2(a) and (b).
    • The b value you use in part (b) must be the one you obtained in part (a).
    • Assume that (a', b')=(a,b) for this question. More generally, I want you to treat leaf blocks and internal blocks the same way.
    • Assume that the leaves use the same kind of organization (array, linked-list, etc) as the internal nodes.
    • Assume that the data part of items is just 4 bytes (this could be a pointer, for instance).
    • To compute the maximum value of b, you can simply assume the number of keys and number of pointers in a node are equal. (If you want, you can assume there is actually one less key than pointers, which is of course more accurate).
    • Recall that the ONLY balanced BST you have learned in this course is the AVL tree. Moreover, the AVL tree we have explicitly treated is the endogenous kind, not exogenous. So it is safest if you stick to this in your solutions.
  • Q6. I have decided to scratch this one -- I feel it is too hard. But for your interest, here are more details:
    The algorithm I have in mind is not the greedy linear bin packing algorithm, but the following one:
    Assume the bins are B_1, B_2, B_3, etc. Initially, all are empty. Your weights are w_1 >= w_2 >= w_3 >=.... You take each w_i in turn, and try B_1, B_2, etc, until the first bin B_j that fits w_i.
    That is the "greedy algorithm", but notice that it is no longer "linear bin packing".

    If you still want to pursue this (as extra credit) you are welcome. There are two lemmas to prove:
    • (A) There is a first index i (2<=i<=n) such that w_i is placed into B_{1+opt}. Let i* denote this critical index. Show that w_{i*} <= 1/3 for the critical index i*.
    • (B) Show n-i* <= opt.
    • From (A) and (B) conclude that the above greedy algorithm uses at most 1.5(opt) many bins.
  • Q8(b) (Extra Credit Part): you must assume a>=2 and not a>=1 as originally stated.
  • Q10. Some important clarifications:
    • First of all, I suggest you draw pictures of intervals for your counter examples. It is much easier to understand.
    • For part(a), the counter example to the Degree-of-Conflict criterion may need up to 11 intervals. Sorry for my earlier noise about no counter-example!
    • For part (b), thanks to Csaba Szabo-Dery for pointing out that one of the criteria is actually optimal. I want you to prove the optimality of this criterion (the proof is similar to the optimality proof in the text). For the others, you just have to provide counter examples.
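
For readers pursuing Q6 as extra credit, the greedy algorithm described there (take the weights in non-increasing order and put each one into the first bin that fits) can be sketched as follows. This is a minimal Python sketch, not a reference solution: the function name, the default bin capacity of 1, and the small floating-point tolerance are assumptions made for illustration.

```python
from typing import List


def first_fit_decreasing(weights: List[float], capacity: float = 1.0) -> List[List[float]]:
    """Place each weight, largest first, into the first bin B_1, B_2, ... that fits it."""
    bins: List[List[float]] = []   # bins[j] holds the weights placed in bin B_{j+1}
    loads: List[float] = []        # loads[j] is the current total weight of bins[j]
    eps = 1e-12                    # assumed tolerance for floating-point sums

    for w in sorted(weights, reverse=True):      # w_1 >= w_2 >= w_3 >= ...
        for j in range(len(bins)):               # try B_1, B_2, ... in order
            if loads[j] + w <= capacity + eps:
                bins[j].append(w)
                loads[j] += w
                break
        else:                                    # no existing bin fits: open a new one
            bins.append([w])
            loads.append(w)
    return bins
```

For example, first_fit_decreasing([2, 5, 4, 6, 3], capacity=10) fills two bins. Unlike linear bin packing, an earlier bin can still receive items after a later bin has been opened.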

Homework 5 Remarks

  • Q2 and Q3 have been scratched!
    Please do not hand in your solutions to these questions. They are considered to be zero point questions. The reason is that they duplicate questions in hw5 of Spring 2011. You can look up the solutions there.
    Thanks to Zhongdi Luo for pointing this out.
  • Q4: just do the problem for algorithms assuming c=1 (i.e., standard (a,b)-search trees).
    In particular, you just have to do borrowing, but no donation.
    The general case is not much harder, but this will simplify your task.
  • Q4: More clarifications.
    I need to clarify what I had in mind. First, we need to be very explicit about what the "Standard Insert/Delete" of (a,b)-search trees really is:

    Standard Insert: if a node u is overfull, we split into two.

    Standard Delete: if a node u is underfull, we try to borrow from any immediate sibling. If borrowing fails, we merge with this sibling.

    But actually, for this question, I had in mind an "enhanced" version of these two algorithms. The idea of enhancement is to try to donate or borrow "if at all possible":

    Enhanced Standard Insert: if a node u is overfull, we try to donate to its immediate siblings. Note that u has up to two immediate siblings, and it should try all of them. If donation fails, then do the standard stuff.

    Enhanced Standard Delete: if a node u is underfull, we try to borrow from its immediate sibling. But in the enhanced version, it should try both immediate siblings, if there are two of them! If borrowing fails, we do the standard stuff.

    So for Question 4, your goal is to extend the "Enhanced Standard Algorithms" by looking to "immediate cousins". By definition, a cousin is a node that shares your grandparent. But "immediate" means its keys are closest to yours among all nodes in your level. Note that you can have an "immediate" cousin ONLY if you have just ONE immediate sibling. BUT sometimes, you may have only ONE immediate sibling and NO immediate cousin.

    REMARK: in the question, I call "immediate cousin" the "first cousin", but I think that terminology is not accurate in human family trees -- usually "first cousin" means you share a grandparent but not parent, and "second cousin" means you share a great grandparent, but not a grandparent.
  • Q6.
    Part (a).
    I do expect you to describe two new kinds of edges in the BFS classification.
    (I have said something else to some of you who saw me in office hours today --- I said that you may consider these two kinds of edges as "the same". Please ignore that advice.)
    Part (b).
    Distinguishing between these two kinds of edges efficiently is a bit tricky. One hint is this: maintain parent pointers as you build the BFS tree, so that you can efficiently travel from any node to the root. Normally, we would like the algorithms to take time O(n+m) (which is optimal). In this case, you can do it in time O(mn). Can you do it in O(n^2)?
  • Q7. I will change the points for this problem from 12 to 24 (so each part is 4 points).
    Part(c). Remember that the graph is acyclic. That means it has no cycles. That means that in the DFS tree, there are ONLY tree edges, nothing else.
    Part(d). It is not easy to do this in O(n+m). Do the best you can.

    Back to Part(c):

    Following the hint, you are to compute the height and diameter of each subtree in the DFS tree. Moreover, we want you to stick to the DFS shell of Page 32, Para.32. There are two things I suggest you do: Assuming V={1,2,...,n}, you can maintain two or more global arrays for the height and diameter, and possibly other information, for each vertex. Moreover, in the POSTVISIT(s0) macro, I must allow you to do a loop to visit all the vertices adjacent to s0. In other words, you are doing more than O(1) work. This is OK, because you are not changing the overall complexity with this loop.
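
The suggestion for Q7 Part (c) can be sketched as below. This is a minimal Python sketch under stated assumptions, not the DFS shell from the lecture notes: the function names, the adjacency-list representation, and the conventions that a leaf has height 0 and that diameter is counted in edges are all choices made for illustration. It keeps global arrays indexed by vertex and does its work in a post-visit step that loops over the neighbours of the current vertex.

```python
from typing import Dict, List, Tuple


def heights_and_diameters(adj: Dict[int, List[int]], root: int) -> Tuple[List[int], List[int]]:
    """For a connected acyclic undirected graph on V = {1, ..., n}, return arrays
    height[u] and diam[u] for the subtree of the DFS tree rooted at each vertex u."""
    n = len(adj)
    height = [0] * (n + 1)        # index 0 is unused
    diam = [0] * (n + 1)
    parent = [None] * (n + 1)
    visited = [False] * (n + 1)

    def postvisit(u: int) -> None:
        best1 = best2 = 0         # two largest values of height[child] + 1
        for v in adj[u]:          # more than O(1) work, but O(n + m) overall
            if parent[v] == u:    # v is a child of u in the DFS tree
                h = height[v] + 1
                if h > best1:
                    best1, best2 = h, best1
                elif h > best2:
                    best2 = h
                diam[u] = max(diam[u], diam[v])
        height[u] = best1
        # The longest path through u joins its two deepest child subtrees.
        diam[u] = max(diam[u], best1 + best2)

    def dfs(u: int) -> None:
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                parent[v] = u
                dfs(v)
        postvisit(u)

    dfs(root)
    return height, diam
```

The recursion is kept for readability; for very deep trees in Python an explicit stack would be needed instead.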

Homework 4 Solution

  • Q4 the code for CheckBST: in line 10, the "else" is not needed.

Homework 4 Remarks

  • Q3. The figure is in Lecture III, not Lecture II.
  • Q6 b. The question refers to a "complete binary tree". What is that?
    By a complete binary tree, we mean one in which every leaf has the same depth. Such a tree necessarily has 2^n -1 nodes (n>=1). Note that 2^{n-1} of the nodes are leaves.
  • Q4. Can we change the "checkBST(u)" specification to "checkBST(u,L)" where L is a list or some other argument? The answer is "no".
    Please stick to the format of "checkBST(u)". Note that this routine is supposed to return some value. Hint: checkBST(u) should return a pair of values (these values can include +infinity or -infinity). Make sure that your routine works for all cases, including the case where u=nil.
  • Q9. Figure 3 is wrong (it is not even AVL). Use figures 16 or 15 instead for practice.
  • Q5: ITB is tight for n<=31 not 29. Thanks to Savvas Savides for pointing this out.
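
To illustrate the Q4 hint that checkBST(u) can return a pair of values, possibly infinite, here is one possible shape in Python. It is only a sketch under assumptions not stated in the homework: the Node class, the choice of (min key, max key) as the pair, the sentinel pairs EMPTY and FAIL, and the requirement that keys be distinct and finite are all made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    key: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


EMPTY = (math.inf, -math.inf)   # assumed (min, max) convention for the nil tree
FAIL = (-math.inf, math.inf)    # assumed sentinel pair meaning "not a BST"


def check_bst(u: Optional[Node]) -> Tuple[float, float]:
    """Return (min key, max key) of the subtree at u if it is a BST, else FAIL."""
    if u is None:
        return EMPTY
    lmin, lmax = check_bst(u.left)
    rmin, rmax = check_bst(u.right)
    # If a subtree failed, lmax = +inf or rmin = -inf, so this test fails as well.
    if not (lmax < u.key < rmin):
        return FAIL
    return (min(lmin, u.key), max(rmax, u.key))
```

With these conventions the nil case needs no special handling by the caller: an empty left subtree contributes max = -infinity, which is below every key, and an empty right subtree contributes min = +infinity.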

Homework 3 Remarks

  • Q1, part (c): I don't expect it is easy to "draw" a geometric figure that represents the formula for S_k. It should be easy to directly do the argument for any k (no need for induction).

  • Q2: just order the five functions. No explanations needed. However, you should learn how to give the justifications if asked.

  • Q3: I do want to see justifications for this one.
  • Q4: Please do not confuse my definition of "polynomial-type" with the usual concept of a "polynomial". For instance, we know that log(n) is polynomial-type, but it is surely NOT a polynomial.

  • Q8: Typo here: Jill's algorithm satisfies this equation:

    Ji(n) = Ji(2n/3) + 2 Ji(n/5) + n
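
If you want to sanity-check a conjectured bound for the corrected Q8 recurrence, you can tabulate it numerically. The sketch below is only an illustration: the base case Ji(n) = 0 for n < 1 and the rounding down of 2n/3 and n/5 to integers are assumptions, since the question does not fix them.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ji(n: int) -> int:
    """Evaluate Ji(n) = Ji(2n/3) + 2 Ji(n/5) + n with integer floors."""
    if n < 1:                     # assumed base case
        return 0
    return ji(2 * n // 3) + 2 * ji(n // 5) + n
```

Printing ji(n) for n = 10, 100, 1000, ... and comparing the growth against your closed-form guess is a quick check, not a proof.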

Homework 2 Remarks

  • Question from Z.J.: About HW2 Q4 (b): C represents the minimum cover for key words w1, w2, ..., wk. Then, what does Ci represent for? Minimum cover for wi? or for arbitrary i key words?
    Answer: No. Each Ci is a minimal cover for W={w1, ..., wk}. A cover Ci=[a,b] is minimal if it is a cover for W, and for all intervals [a',b'], if [a',b'] is strictly contained in [a,b], then [a',b'] is not a cover for W.
  • Question from Z.C: I think that on page 15 (Lect.II), in the proof of Principle of Real Induction, you mean "Given any theta > 0, for every real number x there is a smallest natural number n(x) such that x<=n(x)-theta" instead of "x<=n(x)theta".
    Answer: No, there is no mistake. E.g., if theta=0.2 and x=19.9, then n(x)=100.
  • Question 5(a) gave a hint about "parity of n". I should have explained the term:
    the parity of n is either odd or even. E.g., the parity of 6 is even, parity of 7 is odd.
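
The definition of a minimal cover given in the answer to Z.J. can be tested mechanically. The sketch below is a brute-force check of the definition, not an efficient algorithm for the homework. It assumes the text is a list of words with positions 0, 1, 2, ..., that [a,b] is an inclusive interval of positions, and that [a,b] is a cover when every keyword occurs somewhere in it; all function names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def is_cover(text: Sequence[str], words: Set[str], a: int, b: int) -> bool:
    """[a, b] is a cover if every keyword occurs at some position in a..b."""
    return 0 <= a <= b < len(text) and words <= set(text[a:b + 1])


def is_minimal_cover(text: Sequence[str], words: Set[str], a: int, b: int) -> bool:
    """A cover is minimal if no strictly contained interval is a cover.
    Any strictly contained interval lies inside [a+1, b] or [a, b-1], and a
    superset of a cover is a cover, so testing those two intervals suffices."""
    if not is_cover(text, words, a, b):
        return False
    return not is_cover(text, words, a + 1, b) and not is_cover(text, words, a, b - 1)


def minimal_covers(text: Sequence[str], words: Set[str]) -> List[Tuple[int, int]]:
    """List every minimal cover [a, b] by trying all intervals."""
    n = len(text)
    return [(a, b) for a in range(n) for b in range(a, n)
            if is_minimal_cover(text, words, a, b)]
```

For the text x a b y b a and W = {a, b}, this reports the intervals [1,2] and [4,5]. The interval [2,5] is a cover but not minimal, because it strictly contains the cover [4,5].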

Homework 1 Remarks

  • Please note that the definition of small-oh and small-omega should use "(for all C>0)" instead of "(for all C>1)". This is part of the typos for Lecture 1, and could affect your answers for Question 1.

  • Question 1 refers to "Figures 1(a) and (b)". This should be "Figures 2(a) and (b)".
  • I said during lecture that the tape sorting algorithm should aim to minimize the number of RESETs. But since I did not specify this requirement for the homework question, you do not need to minimize the number of RESETs. But make sure that the correctness of your algorithm is clear; it should be explained to get full credit.

  • In the tape model, students ask about erasing the contents of the tape. I propose to use one such command: ERASE(T) means to erase from current head position to the end of tape. E.g., to completely clean out the tape, you can do: RESET(T); ERASE(T).

  • Q: For question 2, should I give the exact calculated number or it is OK to give the mathematical expression?
    A: I need to see BOTH the expression and the calculated number! It is no good to just say "3" -- I must see the expression that produces "3". You must find some way of calculating your expression (if using a program, tell us that).

  • Q: Does question 1 only need interpretation for the Tree Program but not the Comparator Circuit Program?
    A: No, you need to interpret both programs.
  • In Question 3, "intermediate tapes" refers to any tape other than tape T0. You can use as many intermediate tapes as you like, but 2 should be sufficient.

  • My tape merge algorithm is wrong. Imagine what would happen if T1 and T2 each has only one element. I think the best fix is to introduce two boolean variables b1 and b2, where bi (i=1,2) is true iff xi contains an item that has not yet been written to the output tape. You can get a very elegant algorithm this way.

  • There are 2 questions with zero points -- Ex.7.5 and 7.6. Actually, I wanted you to look at Ex.7.6 and Ex.7.7. I think Ex.7.5 is a bit esoteric.
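
The suggested fix for the tape merge algorithm, using the booleans b1 and b2, can be sketched in Python as follows. This is not the tape model's own command set: input tapes are modelled as sorted Python lists read left to right, the output tape as a list that is only appended to, and the helper read is a hypothetical stand-in for reading the next item from a tape.

```python
from typing import Iterator, List, Optional, Tuple


def read(tape: Iterator[int]) -> Tuple[Optional[int], bool]:
    """Hypothetical read step: return (item, True), or (None, False) at end of tape."""
    try:
        return next(tape), True
    except StopIteration:
        return None, False


def tape_merge(t1: List[int], t2: List[int]) -> List[int]:
    """Merge two sorted tapes; bi is true iff xi holds an item not yet written out."""
    it1, it2 = iter(t1), iter(t2)
    out: List[int] = []
    x1, b1 = read(it1)
    x2, b2 = read(it2)
    while b1 or b2:
        # Write x1 if it is pending and either x2 is not pending or x1 is no larger.
        if b1 and (not b2 or x1 <= x2):
            out.append(x1)
            x1, b1 = read(it1)
        else:
            out.append(x2)
            x2, b2 = read(it2)
    return out
```

The case that breaks the original algorithm, where T1 and T2 each hold a single element, is handled because the loop runs as long as either variable still holds an unwritten item.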