**Remarks** (sent to the mailing list on Thursday):

- The lateness policy for problem sets has changed. The absolute deadline is 1 week after the due date. For problem set 1, the absolute deadline is today.
- Elif Tosun, who graded problem set 1, has graciously agreed to meet with students who have questions on the grading. Her office is room 1210 in 719 Broadway. She will be there this Monday, 14 Oct, from 3-5pm. If you have classes then, please send her email to arrange an alternate time.
- Please do **not** put problem sets or homeworks in my WWH mailbox. Strange things seem to happen.
- Write the homework solutions password on the board.

This looks trivial. Since we know n, we can compute n+1 and hence find the reference to the new node z in O(1) time. But there is a problem: the result might not be a heap, since the new key inserted at z might be less than the key stored at u, the parent of z. Reminiscent of bubble sort, we need to bubble the value in z up to the correct location.

We compare key(z) with key(u) and swap the items if necessary. In the diagram on the right we added 45 and then had to swap it with 70. But now 45 is still less than its parent, so we need to swap again. At worst we need to go all the way up to the root, but that is only Θ(log(n)) swaps, as desired. Let's slow down and see that this really works.

- We had a heap before we inserted the new element.
- When we insert the new element, it can ruin the heap because it is too small (i.e., smaller than its parent).
- So the blue node is the problem.
- Since blue is smaller than its parent, we swap them. Call the parent the victim.
- Two nodes have been swapped, so we have four things to check: for each of these two nodes we must see that it is not larger than its new children and not smaller than its new parent.
- The blue node is definitely not larger than its new children: One child is the victim, which we know is larger than the blue. The other child was not smaller than the victim so is surely not smaller than the blue.
- The blue node might be smaller than its new parent. Indeed in the diagram on the right it is. That is why we have to keep bubbling up.
- Before we did the insert we had a heap, so at that point the victim was not larger than any of its descendants. After the swap, all of the children of the victim were already descendants of the victim before, so the victim is still not larger than its new children.
- The victim is definitely not smaller than its new parent, which is the blue.

Great. It works (i.e., is a heap) and there can only be O(log(n)) swaps because that is the height of the tree.

But wait! What I showed is that it only takes O(log(n)) steps. Is each step O(1)?

Comparing is clearly O(1) and swapping two fixed elements is also O(1). Finding the parent of a node is easy (integer divide the vector index by 2). Finally, it is trivial to find the new index for the insertion point (just increase the insertion point by 1).
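These constant-time index calculations, together with the up-heap bubbling, can be sketched as follows. (This is a minimal illustration, not the course's official code; the 1-indexed array with an unused slot at index 0 is an assumption made so that the parent of index i is exactly i // 2.)

```python
# Minimal sketch of array-based heap insertion with up-heap bubbling.
# Index 0 is unused so that the parent of index i is simply i // 2,
# matching the "integer divide the vector index by 2" rule above.

def heap_insert(heap, key):
    heap.append(key)              # the new last node goes at index n+1
    i = len(heap) - 1             # index of the newly inserted node
    while i > 1 and heap[i] < heap[i // 2]:
        heap[i], heap[i // 2] = heap[i // 2], heap[i]  # swap with parent
        i //= 2                   # continue bubbling up

heap = [None]                     # dummy entry at index 0
for k in [70, 80, 90, 45]:
    heap_insert(heap, k)
print(heap)  # [None, 45, 70, 90, 80] -- 45 has bubbled up to the root
```

Note that, as in the diagram, inserting 45 forces a swap with 70 and then another swap into the root.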

**Remark**: It is not as trivial to find the new
insertion point using a linked implementation.

**Homework:** Show the steps for inserting an element
with key 2 in the heap of Figure 2.41.

Trivial, right? Just remove the root since that must contain an
element with minimum key. Also decrease n by one.

Wrong!

What remains is **TWO** trees.

We do want the element stored at the root but we must put some other element in the root. The one we choose is our friend the last node.

But the last node is likely not a valid root; i.e., it will likely destroy the heap property since it may be bigger than one of its new children. So we have to bubble this one down. It is shown in pale red on the right and the procedure is explained below. We also need to find a new last node, but that really is trivial: it is the node stored at the new value of n.

If the new root is the only internal node then we are done.

If only one child of the root is internal (it must be the left child), compare its key with the key of the root and swap if needed.

If both children of the root are internal, choose the child with the smaller key and swap with the root if needed.

The original last node became the root and has now been bubbled down to level 1. But it might still be bigger than a child, so we keep bubbling. At worst we need Θ(h) bubbling steps, which is again logarithmic in n, as desired.
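The removeMin procedure above can be sketched in the same array layout (again a minimal, unofficial illustration; the 1-indexed array with a dummy at index 0 is my assumption, so the children of index i are 2i and 2i+1):

```python
# Minimal sketch of removeMin with down-heap bubbling.
# 1-indexed array: index 0 unused, children of i at 2i and 2i+1.

def remove_min(heap):
    minimum = heap[1]             # the root holds a minimum key
    heap[1] = heap[-1]            # move the last node into the root
    heap.pop()                    # the new last node is found for free
    n = len(heap) - 1
    i = 1
    while 2 * i <= n:             # while node i has at least a left child
        c = 2 * i
        if c + 1 <= n and heap[c + 1] < heap[c]:
            c += 1                # choose the child with the smaller key
        if heap[i] <= heap[c]:
            break                 # heap property restored
        heap[i], heap[c] = heap[c], heap[i]
        i = c                     # keep bubbling down
    return minimum

heap = [None, 45, 70, 90, 80]
print(remove_min(heap))  # 45
print(heap)              # [None, 70, 80, 90]
```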

**Homework:** R-2.16

Operation | Time |
---|---|
size, isEmpty | O(1) |
minElement, minKey | O(1) |
insertItem | Θ(log n) |
removeMin | Θ(log n) |

The table on the right gives the performance of the heap implementation of a priority queue. As desired, the main operations have logarithmic time complexity. It is for this reason that heap sort is fast.

- A heap containing n elements is a complete tree T with n internal nodes,
each storing a reference to a key k and a reference to an element.
The tree also contains n+1 leaves, which are not used.

- The heap is a very fast implementation of a priority queue.
The main operations are logarithmic and the others are constant
time.
- The height of the heap is O(log(n)) since T is complete.
- The worst case complexity of the up- and down-heap bubbling are Θ(height)=Θ(log(n)).
- Finding the insertion position and updating the last node position take constant time.

- Using these insertion and removeMin algorithms makes sorting using a priority queue fast, i.e., logarithmic, as we shall state officially in the next section.

The goal is to sort a sequence S. We return to the PQ-sort where
we insert the elements of S into a priority queue and then use
removeMin to obtain the sorted version. When we use a heap to
implement the priority queue, each insertion and removal takes
O(log(n)) so the entire algorithm takes O(nlog(n)). The heap
implementation of PQ-sort is called **heap-sort** and we
have shown

**Theorem**:
The heap-sort algorithm sorts a sequence of n comparable elements in
O(nlog(n)) time.
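As a sketch, Python's standard `heapq` module supplies exactly these min-heap operations, so PQ-sort can be written directly (the function name `heap_sort` here is mine, not the book's):

```python
import heapq

# PQ-sort with a heap-based priority queue: n O(log n) insertions
# followed by n O(log n) removeMins, i.e. O(n log n) overall.

def heap_sort(S):
    pq = []
    for x in S:
        heapq.heappush(pq, x)     # insertItem: O(log n)
    # removeMin, n times, each O(log n):
    return [heapq.heappop(pq) for _ in range(len(S))]

print(heap_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```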

In place means that we use the space occupied by the input. More precisely, it means that the space required is just the input plus O(1) additional memory. The algorithm above required Θ(n) additional space to store the heap.

The in-place heap-sort of S assumes that S is implemented as an array and proceeds as follows. (This presentation, beyond the definition of "in place", is unofficial; i.e., it will not appear on problem sets or exams.)

- Logically divide the array into a portion in the front that
contains the growing heap and the rest that contains the elements
of the array that have not yet been dealt with.
- Initially the heap part is empty and the not-yet-dealt-with part of the array is the entire array.
- At each insertion we remove the leftmost entry from the array part and insert it in the heap, growing the heap to include the memory previously used by the newly inserted element. The blue line moves down.
- At the end the heap uses all the space. We are making the optimization discussed before: we store only the internal nodes of the heap and do not waste the first (index 0) component of the array used to store the heap.

- Do the insertions as with a normal heap-sort, but change the comparison so that a maximum element is in the root (i.e., a parent is no smaller than a child).
- Now do the removals from the heap, moving the blue line back up.
- The elements removed are in order big to small.
- This is perfect since we are going to store them starting at the right of the array since that is the portion of the array that is made available by the shrinking heap.
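The whole in-place procedure can be sketched as follows (unofficial, like the presentation above; here the array is 0-indexed, the "don't waste index 0" optimization, so the parent of i is (i-1)//2 and the children are 2i+1 and 2i+2):

```python
# Minimal sketch of in-place heap-sort: the front of the array holds a
# growing MAX-heap (parent no smaller than child); removals then fill
# the array from the right.

def in_place_heap_sort(a):
    n = len(a)
    # Phase 1: grow the heap over a[0:i+1], one insertion at a time.
    for i in range(1, n):
        j = i
        while j > 0 and a[j] > a[(j - 1) // 2]:    # up-heap bubble
            a[j], a[(j - 1) // 2] = a[(j - 1) // 2], a[j]
            j = (j - 1) // 2
    # Phase 2: repeatedly swap the max (root) into the shrinking tail.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        j = 0
        while True:                                 # down-heap bubble
            c = 2 * j + 1
            if c >= end:
                break
            if c + 1 < end and a[c + 1] > a[c]:
                c += 1                              # larger child
            if a[j] >= a[c]:
                break
            a[j], a[c] = a[c], a[j]
            j = c

a = [5, 2, 9, 1, 7]
in_place_heap_sort(a)
print(a)  # [1, 2, 5, 7, 9]
```

The "blue line" of the description is the boundary `i` in phase 1 (moving down) and `end` in phase 2 (moving back up).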

If you are given at the beginning all n elements that are to be
inserted, the total insertion time for all inserts can be reduced from
O(nlog(n)) to O(n). The basic idea, assuming n=2^{k}-1, is

- Take out the first element and call it r.
- Divide the remaining 2^{k}-2 elements into two parts, each of size 2^{k-1}-1.
- Recursively make each of these two parts into a heap.
- Make a tree with r as root and the two heaps as children.
- Down-heap bubble r.
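This recursive idea is equivalent to the standard bottom-up construction: down-heap bubble every internal node, deepest nodes first, so each subtree's children are already heaps when its root is bubbled. A minimal sketch (1-indexed min-heap; the details are my assumptions):

```python
# Minimal sketch of O(n) bottom-up heap construction.
# 1-indexed array: index 0 unused, children of i at 2i and 2i+1.

def build_heap(keys):
    heap = [None] + list(keys)
    n = len(heap) - 1
    for i in range(n // 2, 0, -1):      # nodes n//2+1..n are already leaves
        j = i
        while 2 * j <= n:               # down-heap bubble node i
            c = 2 * j
            if c + 1 <= n and heap[c + 1] < heap[c]:
                c += 1                  # smaller child
            if heap[j] <= heap[c]:
                break
            heap[j], heap[c] = heap[c], heap[j]
            j = c
    return heap

print(build_heap([90, 80, 70, 45, 60]))  # [None, 45, 60, 70, 80, 90]
```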

Sometimes we wish to extend the priority queue ADT to include a locator that always points to the same element even when the element moves around. So if x is in a priority queue and another item is inserted, x may move during the up-heap bubbling, but the locator of x continues to refer to x.

Method | Unsorted Sequence | Sorted Sequence | Heap |
---|---|---|---|
size, isEmpty | O(1) | O(1) | O(1) |
minElement, minKey | O(n) | O(1) | O(1) |
insertItem | O(1) | O(n) | O(log(n)) |
removeMin | O(n) | O(1) | O(log(n)) |

**Dictionaries**, as the name implies, are used to
contain data that may later be retrieved. Associated with each
element is the **key** used for retrieval.

For example, consider an element to be one student's NYU transcript and the key to be the student's id number. Given the key (id number), the dictionary returns the entire element (the transcript).

A dictionary stores **items**, which are key-element
(k,e) pairs.

We will study ordered dictionaries in the next chapter when we consider searching. Here we consider unordered dictionaries, so, for example, we do not support findSmallestKey. The methods we do support are

- findElement(k): Return an element having key k or signal an error if no such element exists.
- insertItem(k,e): Insert an item with key k and element e.
- removeElement(k): Remove an item with key k and return its element. Signal an error if no such item exists.

Just store the items in a sequence.

- Trivial (and fast) to insert: O(1)
- Minimal space: O(n)
- Slow for finding or removing elements: O(n) per operation
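A minimal sketch of this sequence-based implementation (method names follow the ADT above; returning None instead of signaling an error is a simplification of mine):

```python
# Unordered-sequence dictionary: O(1) insertItem,
# O(n) findElement and removeElement (linear scans).

class SequenceDictionary:
    def __init__(self):
        self.items = []                   # list of (key, element) pairs

    def insert_item(self, k, e):          # O(1): just append at the end
        self.items.append((k, e))

    def find_element(self, k):            # O(n): scan for the key
        for key, elem in self.items:
            if key == k:
                return elem
        return None                       # no such element

    def remove_element(self, k):          # O(n): scan, then delete
        for i, (key, elem) in enumerate(self.items):
            if key == k:
                del self.items[i]
                return elem
        return None

d = SequenceDictionary()
d.insert_item(12345, "transcript of student 12345")
print(d.find_element(12345))
```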