**Remark**: From robin simon

The last day for students to withdraw is Nov. 5th.
Therefore the exam should be returned at least a week
before then.

We just studied unordered dictionaries at the end of chapter 2. Now we want to extend the study to permit us to find the "next" and "previous" items. More precisely we wish to support, in addition to findElement(k), insertItem(k,e), and removeElement(k), the new methods

- closestKeyBefore(k): Return the key of the item with largest key less than or equal to k.
- closestElemBefore(k): Return the element of the item with largest key less than or equal to k.
- closestKeyAfter(k): Return the key of the item with smallest key greater than or equal to k.
- closestElemAfter(k): Return the element of the item with smallest key greater than or equal to k.

We naturally signal an exception if no such item exists. For example, if the only keys present are 55, 22, 77, and 88, then closestKeyAfter(90) and closestElemBefore(2) each signal an exception.
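To make the semantics of the new methods concrete, here is a minimal Python sketch that works directly from the definitions with a linear scan. It is only an illustration, not the implementation developed in these notes; the function names and the `NoSuchKey` exception are assumptions made for this example.

```python
class NoSuchKey(Exception):
    """Hypothetical exception type, raised when no qualifying item exists."""


def closest_key_before(keys, k):
    """Largest key <= k, found by a plain linear scan of the keys."""
    candidates = [key for key in keys if key <= k]
    if not candidates:
        raise NoSuchKey(k)
    return max(candidates)


def closest_key_after(keys, k):
    """Smallest key >= k, found by a plain linear scan of the keys."""
    candidates = [key for key in keys if key >= k]
    if not candidates:
        raise NoSuchKey(k)
    return min(candidates)


keys = [55, 22, 77, 88]              # the keys from the example above
print(closest_key_before(keys, 60))  # 55
print(closest_key_after(keys, 60))   # 77
print(closest_key_after(keys, 77))   # 77 (equality is allowed)
```

With these keys, `closest_key_after(keys, 90)` and `closest_key_before(keys, 2)` both raise `NoSuchKey`, matching the exceptions described above.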

We begin with the most natural implementation.

We use the sorted vector implementation from chapter 2 (we used it
as a simple implementation of a priority queue).
Recall that this keeps the items sorted in key order.
Hence it is O(n) for inserts and removals, which is slow; however, we
shall see that it is fast for finding an element and for the four new
methods closestKeyBefore(k) and friends.
We call this a **lookup table**.

The space required is Θ(n) since we grow and shrink the array supporting the vector (see extendable arrays).
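The trade-off can be seen in a small Python sketch of a lookup table. This is an assumption-laden illustration rather than the book's code: the class and method names are hypothetical, a Python list stands in for the vector, and the standard `bisect` module is used to locate ranks in the sorted list.

```python
import bisect


class LookupTable:
    """Sketch of a dictionary kept as a vector sorted by key."""

    def __init__(self):
        self._keys = []   # keys in ascending order
        self._elems = []  # _elems[r] is the element stored with _keys[r]

    def insert_item(self, k, e):
        r = bisect.bisect_right(self._keys, k)  # O(log n) to find the rank
        self._keys.insert(r, k)                 # O(n) to shift later items
        self._elems.insert(r, e)

    def remove_element(self, k):
        r = bisect.bisect_left(self._keys, k)
        if r == len(self._keys) or self._keys[r] != k:
            raise KeyError(k)
        del self._keys[r]                       # O(n) to close the gap
        return self._elems.pop(r)

    def closest_key_before(self, k):
        r = bisect.bisect_right(self._keys, k) - 1  # last rank with key <= k
        if r < 0:
            raise KeyError(k)
        return self._keys[r]

    def closest_key_after(self, k):
        r = bisect.bisect_left(self._keys, k)       # first rank with key >= k
        if r == len(self._keys):
            raise KeyError(k)
        return self._keys[r]


table = LookupTable()
for key in (55, 22, 77, 88):
    table.insert_item(key, "elem%d" % key)
print(table.closest_key_before(60))  # 55
print(table.closest_key_after(60))   # 77
```

Inserts and removals pay for shifting items in the list, while the closest-key queries only need to locate a rank in the sorted keys.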

As indicated, the key favorable property of a lookup table is that it is fast for (surprise) lookups, using the binary search algorithm that we study next.

In this algorithm we are searching for the rank of the item containing a key equal to k. We are to return a special value if no such key is found.

The algorithm maintains two variables lo and hi, which are respectively lower and upper bounds on the rank where k will be found (assuming it is present).

Initially, the key could be anywhere in the vector so we start with lo=0 and hi=n-1. We write key(r) for the key at rank r and elem(r) for the element at rank r.

We then find mid, the rank (approximately) halfway between lo and hi and see how the key there compares with our desired key.

- If k = key(mid), we have found the item and return elem(mid)
- If k < key(mid), then we restrict our attention to indexes less than mid.
- If k > key(mid), then we restrict our attention to indexes greater than mid.

Some care is needed in writing the algorithm precisely, as it is easy to make an "off-by-one" error. Also, we must handle the case in which the desired key is not present in the vector. This occurs when the search range has been reduced to the empty set (i.e., when lo exceeds hi).

    Algorithm BinarySearch(S,k,lo,hi):
       Input:  An ordered vector S containing (key(r),elem(r)) at rank r
               A search key k
               Integers lo and hi
       Output: An element of S with key k and rank between lo and hi.
               NO_SUCH_KEY if no such element exists

       if lo > hi then
          return NO_SUCH_KEY                      // Not present
       mid ← ⌊(lo+hi)/2⌋
       if k = key(mid) then
          return elem(mid)                        // Found it
       if k < key(mid) then
          return BinarySearch(S,k,lo,mid-1)       // Try bottom "half"
       if k > key(mid) then
          return BinarySearch(S,k,mid+1,hi)       // Try top "half"
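For readers who want to run the algorithm, here is one possible Python transcription of the pseudocode above. It assumes S is a Python list of (key, elem) pairs sorted by key, and it represents NO_SUCH_KEY by a sentinel object; both choices are assumptions made for this sketch.

```python
NO_SUCH_KEY = object()  # sentinel value; the name follows the pseudocode


def binary_search(S, k, lo, hi):
    """Return the element with key k among ranks lo..hi of S, or NO_SUCH_KEY.

    S is assumed to be a list of (key, elem) pairs sorted by key.
    """
    if lo > hi:
        return NO_SUCH_KEY                       # empty range: not present
    mid = (lo + hi) // 2                         # floor of the midpoint
    key, elem = S[mid]
    if k == key:
        return elem                              # found it
    if k < key:
        return binary_search(S, k, lo, mid - 1)  # try bottom "half"
    return binary_search(S, k, mid + 1, hi)      # try top "half"


S = [(22, "a"), (55, "b"), (77, "c"), (88, "d")]
print(binary_search(S, 77, 0, len(S) - 1))                 # c
print(binary_search(S, 60, 0, len(S) - 1) is NO_SUCH_KEY)  # True
```

Searching for 77 inspects rank 1 (key 55) and then rank 2, where it succeeds. Searching for 60 inspects ranks 1 and 2 and then reaches lo=2, hi=1, the empty range.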

Do some examples on the board.

It is easy to see that the algorithm does just a few operations per recursive call. So the complexity of Binary Search is Θ(NumberOfRecursions). So the question is "How many recursions are possible for a lookup table with n items?"

The number of eligible ranks (i.e., the size of the range we still must consider) is hi-lo+1.

The key insight is that when we recurse, we have reduced the range to at most half of what it was before. There are two possibilities: we tried either the bottom or the top "half". Let's evaluate hi-lo+1 for the bottom and top half. Note that the only two possibilities for ⌊(lo+hi)/2⌋ are (lo+hi)/2 or (lo+hi)/2-(1/2) = (lo+hi-1)/2.

Bottom:

(mid-1)-lo+1 = mid-lo = ⌊(lo+hi)/2⌋-lo
≤ (lo+hi)/2-lo = (hi-lo)/2 < (hi-lo+1)/2

Top:

hi-(mid+1)+1 = hi-mid = hi-⌊(lo+hi)/2⌋
≤ hi-(lo+hi-1)/2 = (hi-lo+1)/2

So the range starts at n and is halved each time and remains an integer (i.e., if a recursive call has a range of size x, the next recursion will be at most ⌊x/2⌋).
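The halving claim can be checked empirically. The following Python sketch is an illustration under stated assumptions, not part of the original analysis: it takes the keys to be 0, 1, ..., n-1 (so key(r) = r), replaces the recursion by a loop that counts one iteration per recursive call, and asserts at every step that the new range is at most ⌊x/2⌋ when the old range had size x. The function name `count_calls` is made up for this example.

```python
def count_calls(n, k):
    """Count the calls binary search makes when looking for k among keys 0..n-1.

    Each loop iteration stands for one (recursive) call; the assertion checks
    the halving bound derived above.
    """
    lo, hi, calls = 0, n - 1, 0
    while True:
        calls += 1
        if lo > hi:
            return calls                  # empty range: key not present
        size = hi - lo + 1                # number of eligible ranks
        mid = (lo + hi) // 2
        if k == mid:                      # key(mid) == mid by assumption
            return calls
        if k < mid:
            hi = mid - 1                  # bottom "half"
        else:
            lo = mid + 1                  # top "half"
        assert hi - lo + 1 <= size // 2   # range at least halved


for n in (1, 4, 7, 8, 1000):
    worst = max(count_calls(n, k) for k in range(-1, n + 1))
    print(n, worst)
```

Since a range that starts at n and is at most ⌊x/2⌋ after each step becomes empty after ⌊log₂ n⌋+1 steps, the worst case printed for each n should be ⌊log₂ n⌋+2 calls, counting the final call that sees the empty range.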