Basic Algorithms: Lecture 18
================ Start Lecture #18 ================
Here are the three pictures for the remaining three possibilities.
That is, the other double rotation and both single rotations. The
original configuration is shown on top and the result after the
rotation is shown immediately below.
Homework: R-3.3, R-3.4, R-3.5
What is the complexity of insertion?
-
Let n be the number of nodes in the tree before the
insertion.
-
Finding the insertion point is Θ(log n).
-
Expanding the leaf and inserting the item is Θ(1).
-
Walking up the tree looking for an imbalance is Θ(1) per
level, which is O(log n) since the tree has height Θ(log
n).
-
Performing the one needed rotation is
Θ(1).
Hence we have the following
Theorem: The complexity of insertion in an AVL
tree is Θ(log n).
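To make the walk concrete, here is a minimal sketch in Java of the
post-insertion phase. It is not the book's code: AvlNode, height, and
restructure are hypothetical names, restructure stands for whichever
of the four rotations the pictures call for, null children play the
role of the item-less leaves, and the methods are assumed to live in
one enclosing class.

    // Hypothetical node type: parent/child links plus a cached height
    // (external leaves are represented by null and have height 0).
    class AvlNode {
        int key;
        int height = 1;
        AvlNode parent, left, right;
    }

    static int height(AvlNode x) { return x == null ? 0 : x.height; }

    // Walk up from the new node's parent: Θ(1) work per level over
    // O(log n) levels, plus at most one Θ(1) restructure.
    static void rebalanceAfterInsert(AvlNode newNode) {
        for (AvlNode z = newNode.parent; z != null; z = z.parent) {
            z.height = 1 + Math.max(height(z.left), height(z.right));
            if (Math.abs(height(z.left) - height(z.right)) > 1) {
                restructure(z);  // the one needed rotation: Θ(1)
                break;           // for insertion, one rotation suffices
            }
        }
    }

    static AvlNode restructure(AvlNode z) {
        // Perform the appropriate single or double rotation at z (the
        // four cases pictured above) and return the new subtree root;
        // the rotation bodies are omitted in this sketch.
        return z;
    }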
Problem Set 3, problem 3.
Please read the entire problem before beginning.
-
Draw an AVL tree containing items with integer keys. Draw the
internal nodes as circles and write the key inside the circle.
Draw the leaves as squares; leaves do not contain items.
You will want to read the remaining parts before drawing this
tree.
-
Choose a key not present in the tree you drew in part A whose
insertion will require the "other" double rotation in order to
restore balance (i.e., the double rotation shown in the diagram
above showing one double and two single rotations).
Draw the tree after the insertion, but prior to the rotation.
Then draw the tree after the rotation.
-
Choose a key not present in the tree you drew in part A whose
insertion will require a single rotation in order to
restore balance.
Draw the tree after the insertion, but prior to the rotation.
Then draw the tree after the rotation.
-
Choose a key not present in the tree you drew in part A whose
insertion will require the "other" single rotation in order to
restore balance.
Draw the tree after the insertion, but prior to the rotation.
Then draw the tree after the rotation.
Removing an Item from an AVL Tree
In order to remove an item with key k, we begin just as we did for
an ordinary binary search tree by searching for the item and repairing
the tree. Then we must restore the height-balanced property as we did
when inserting into an AVL tree. Once again rotations are the key,
but there is an extra twist this time. The details follow; a code
sketch tying the steps together appears after this outline.
-
Search the AVL tree for the key to be removed.
-
Presumably the search succeeds in finding the key at an internal
node.
If not, the key is not present and we signal an error.
-
Call this internal node w.
-
Returning the element with the desired key is simple; it is the
element at w.
-
We need to actually remove w, but we cannot leave a hole.
We described the procedure previously when discussing removal in a
general binary search tree.
I repeat the discussion here and enhance the pictures since we must
also ensure that the resulting tree is balanced (i.e., AVL).
There are, you recall, three cases.
-
The trivial case:
If we are lucky both of w's children are leaves. Then we can
simply replace w with a leaf. (Recall that leaves do not contain
items.) Note that this procedure is the reverse of how we
insert an item.
-
The easy case:
Assume one child of w is a leaf and the other, call
it z, is an internal node.
In this case we can simply replace w by z;
that is, have the parent of w now point to z.
This removes w as desired and also removes the leaf child of w,
which is OK since leaves do not contain items.
Note that the above two cases can be considered the same.
We notice that one child of w is a leaf and replace
w by the other child (and its descendants, if any).
-
The difficult case: Both children of w are internal
nodes. What we will do is replace the item in w with the item
that has the next highest key.
-
First we must find the item y with the next highest
key.
We already solved this when we implemented insertions: The
node y we seek is the next internal node after w in an
inorder traversal.
-
Store the item in y in the node w. This removes the old
item of w, which we wanted to do.
-
Does replacing the item in w by the item formerly in y
still result in a binary search tree?
That is, is each parent still bigger than (or equal to, if we
permit duplicate keys) everything in its left subtree and smaller
than everything in its right subtree?
-
Yes. The only new parent is the item from y, which has
now moved to node w. But this item is the one
immediately after the old item in w. Since it came
from the right subtree of w, it is bigger than
everything in the left subtree, and
since it was the smallest item in the right subtree,
it is smaller than all remaining items in that
subtree.
-
But what about the old node y? Its left child is a leaf, so
it is the easy or trivial case and we just replace y by the
other child and its descendants.
-
We have now successfully removed the item in w and repaired the
structure so that we again have a binary search tree. However, we
may need to restore balance, i.e., re-establish the AVL
property. The possible trouble is that, in each of the three
cases (trivial, easy, or difficult), the light green node on
the left has been replaced by the light blue node on the right, which
is of height one less. This might cause a
problem.
-
Since the tree was balanced before the removal, the sibling of the
light green (shown in purple) had height equal to, one less than,
or one greater than, the height of the light green.
-
If the purple height was equal to the light green, it is now one
greater than the light blue; this is still in balance.
Since the parent (red) has not changed height, all is well.
This is the good case; we are done.
-
If the purple height was one less than the light green, it is
now equal to the light blue; this again remains in balance.
But now red's
height has dropped by one and might be out of balance.
This is the unknown case; we turn our attention to the parent
(red) and redo the balance check.
-
If the purple height was one greater than the light green, it
is now two greater than the light blue so we are out of
balance.
This is the bad case; we must re-balance using a rotation.
-
In the good case we are done; in the unknown case we proceed up
the tree; and in the bad case we rotate.
This sounds the same as for insertion and indeed it is similar, but
there is an important difference.
-
One similarity is that if we proceed up the tree and reach the
root, we are done (there is then no sibling, so we can't be out of
balance).
-
The more important similarity is that the rotations needed are the
same four we used for insertions, two double-rotations and two
single-rotations.
-
The red node is the parent.
-
The purple sibling is the one whose height is too high (I
sometimes called it the “problem” node when
discussing insertions). With removals, heights are decreasing
so the high node is the one whose height was not changed.
-
The third node in the rotation is the higher child of the
purple sibling (or either child, if their heights are equal).
When dealing with insertions we were raising heights, so I
sometimes referred to this child, whose height had been
raised, as the “cause”. For removals, its height
has not changed.
The reason we can use the same rotations for deletions as we used
for insertions is that we are solving the same problem (one
sibling two higher than the other) and, to paraphrase the famous
physicist and lecturer Richard Feynman, “The same problem
has the same solution”.
-
The important difference is the following.
-
Recall that after a rotation, the highest node has height one less
than the height of the highest node before the rotation (the parent).
For insertions, the parent had just had its height raised by one,
so this reduction restores the height to what it was before the
insertion began. Hence no further changes are needed higher in
the tree.
-
With deletions the highest node after the rotation again has
height one less than the highest node before the rotation, but
there was no previous increase that this reduction cancels.
Hence the height at the top has really been reduced by one, which
can cause this node to be out of balance, depending on whether
its sibling had height that was equal, one less, or one
greater.
-
If that sibling was equal in height, it is now one greater.
This is in balance and the parent's height is unchanged.
The good case, we are done.
-
If that sibling had height one less, it is now equal.
This is in balance, but the parent's height has dropped by
one.
The unknown case, we move up the tree.
-
If that sibling had height one greater, it is now two
greater.
The bad case, we must perform another rotation.
-
This second rotation occurs at a point further up the tree
from the original rotation. Hence even though this second
rotation can cause a third rotation, each subsequent rotation is
closer to the root and at the root the process must stop.
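Tying the outline together, here is a minimal sketch of the removal
phase in Java, reusing the hypothetical AvlNode, height, and
restructure from the insertion sketch above. Again null children
stand in for the item-less leaves, and this is an illustration, not
the book's code (in particular, a real tree object would also update
its root pointer when w is the root).

    // Splice child (possibly null) into w's place under w's parent.
    static void replaceInParent(AvlNode w, AvlNode child) {
        if (child != null) child.parent = w.parent;
        if (w.parent != null) {
            if (w.parent.left == w) w.parent.left = child;
            else w.parent.right = child;
        }
    }

    // Remove the item at node w (the node found by the search).
    static void removeAt(AvlNode w) {
        if (w.left != null && w.right != null) { // the difficult case
            AvlNode y = w.right;                 // find the successor,
            while (y.left != null) y = y.left;   // the next-highest key
            w.key = y.key;        // store y's item in w
            w = y;                // y's left child is a leaf, so removing
        }                         // y is the easy (or trivial) case
        // Trivial and easy cases, treated uniformly: replace w by its
        // at-most-one internal child.
        AvlNode child = (w.left != null) ? w.left : w.right;
        AvlNode z = w.parent;
        replaceInParent(w, child);
        rebalanceAfterRemove(z);
    }

    // Unlike insertion, one rotation may not be enough: a rotation can
    // lower the height at its top, re-creating the problem further up,
    // so we keep checking all the way to the root.
    static void rebalanceAfterRemove(AvlNode z) {
        while (z != null) {
            z.height = 1 + Math.max(height(z.left), height(z.right));
            if (Math.abs(height(z.left) - height(z.right)) > 1)
                z = restructure(z);  // the bad case: rotate, continue
            z = z.parent;            // good or unknown case: move up
        }
    }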
What is the complexity of a removal? Remember that the height of
an AVL tree is Θ(log(N)), where N is the number of nodes.
-
We must find a node with the key, which has complexity
Θ(height) = Θ(log(N)).
-
We must remove the item: Θ(1).
-
We must re-balance the tree, which might involve a rotation at
each level, i.e., O(height) rotations. Since each rotation is
Θ(1), the re-balancing costs O(height) = O(log(N)).
Theorem: The complexity of removal for an AVL tree
is logarithmic in the size of the tree.
Homework: R-3.6
Problem Set 3 problem 4 (end of problem set 3).
Please read the entire problem before beginning.
-
Draw an AVL tree containing items with integer keys. Draw the
internal nodes as circles and write the key inside the circle.
Draw the leaves as squares; leaves do not contain items.
You will want to read the remaining parts before drawing this
tree.
-
Choose a key present in the tree you drew in part A whose
removal will require a double rotation and a
single rotation in order to
restore balance.
Draw the tree after the removal, but prior to the rotations.
Then draw the tree after the double rotation, but prior to the
single rotation.
Finally, draw the tree after both rotations.
3.2.2 Performance
The news is good.
Searching, inserting, and removing all have logarithmic complexity.
The three operations all involve a sweep down the tree searching
for a key, and possibly an up phase where heights are adjusted and
rotations are performed. Since only a constant amount of work is
performed per level and the height is logarithmic, the complexity is
logarithmic.
3.3 Bounded-Depth Search Trees (skipped)
3.4 Splay Trees (skipped)
3.5 Skip Lists (skipped)
3.6 Java Example: AVL and Red-Black Trees (skipped)
Chapter 4 Sorting, Sets, and Selection
We already saw one sorting technique in chapter 2.
Namely, we inserted the items into a priority queue and then
repeatedly removed the minimum.
When we use a heap to implement the priority queue, the resulting
sort is called heap-sort and is asymptotically optimal.
That is, its complexity of O(N log(N)) is as fast as possible if we
only use comparisons (proved in 4.2 below).
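As a reminder of that chapter-2 idea, here is a short Java sketch
using the standard java.util.PriorityQueue, which is itself backed by
a binary heap. The book builds its own heap, so treat this only as an
illustration of the N insertions followed by N removeMins.

    import java.util.PriorityQueue;

    // Heap-sort as described above: N insertions, then N removals of
    // the minimum, each O(log N), for O(N log N) in total.
    static int[] heapSort(int[] a) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        for (int x : a) pq.add(x);        // N insertions: O(N log N)
        int[] sorted = new int[a.length];
        for (int i = 0; i < sorted.length; i++)
            sorted[i] = pq.poll();        // N removeMins: O(N log N)
        return sorted;
    }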
4.1 Merge-Sort
4.1.1 Divide-and-Conquer
The idea is that if you divide an enemy into small pieces, each
piece, and hence the enemy, can be conquered.
When applied to computer problems divide-and-conquer
involves three steps.
-
Divide the problem into smaller subproblems.
-
Solve each of the subproblems, normally via a recursive call to
the original procedure.
-
Combine the subproblem solutions into a solution for the original
problem.
In order to prevent an infinite sequence of recursions, we need to
define a stopping condition, i.e., a predicate that informs us when to
stop dividing (because the problem is small enough to solve directly).
Using Divide-and-Conquer for Sorting
This turns out to be so easy that it is perhaps surprising that it
is asymptotically optimal.
The key observation is that merging two sorted lists is fast (the time
is linear in the size of the lists).
The steps are
-
Divide (with stopping condition): If S has zero or one element,
simply return S since it is already sorted.
Otherwise S has n≥2 elements: Move the first ⌈n/2⌉
elements of S into S1 and the remaining
⌊n/2⌋ elements into S2.
-
Solve recursively: Recursively sort each of the two subsequences.
-
Combine: Merge the two (now sorted) subsequences back into S.
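Here is a minimal Java sketch of these three steps on a
List<Integer>. The book works with generic sequences and a
comparator, so this is only an illustration; the merge method shows
why the combine step is linear in the total length of the two lists.

    import java.util.ArrayList;
    import java.util.List;

    static List<Integer> mergeSort(List<Integer> s) {
        if (s.size() <= 1) return s;       // stopping condition
        int mid = (s.size() + 1) / 2;      // ceil(n/2)
        List<Integer> s1 = new ArrayList<>(s.subList(0, mid));
        List<Integer> s2 = new ArrayList<>(s.subList(mid, s.size()));
        return merge(mergeSort(s1), mergeSort(s2)); // recur, combine
    }

    // Each comparison moves one element to the output, so merging two
    // sorted lists is linear in their total length.
    static List<Integer> merge(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size())
            out.add(a.get(i) <= b.get(j) ? a.get(i++) : b.get(j++));
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

On {22, 55, 33, 44, 11} this performs exactly the divides and merges
traced in the example that follows.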
Example: Sort {22, 55, 33, 44, 11}.
-
Divide {22, 55, 33, 44, 11} into {22, 55, 33} and {44, 11}
-
Recursively sort {22, 55, 33} and {44, 11} getting {22, 33, 55}
and {11, 44}
-
Merge {22, 33, 55} and {11, 44} getting {11, 22, 33, 44, 55}
Expanding the recursion one level gives:
-
Divide {22, 55, 33, 44, 11} into {22, 55, 33} and {44, 11}
-
Recursively sort {22, 55, 33} and {44, 11} getting {22, 33, 55}
and {11, 44}
-
Divide {22, 55, 33} into {22, 55} and {33}
-
Recursively sort {22, 55} and {33} getting {22, 55} and {33}
-
Merge {22, 55} and {33} getting {22, 33, 55}
-
Divide {44, 11} into {44} and {11}
-
Recursively sort {44} and {11} getting {44} and {11}
-
Merge {44} and {11} getting {11, 44}
-
Merge {22, 33, 55} and {11, 44} getting {11, 22, 33, 44, 55}
Expanding again gives:
-
Divide {22, 55, 33, 44, 11} into {22, 55, 33} and {44, 11}
-
Recursively sort {22, 55, 33} and {44, 11} getting {22, 33, 55}
and {11, 44}
-
Divide {22, 55, 33} into {22, 55} and {33}
-
Recursively sort {22, 55} and {33} getting {22, 55} and {33}
-
Divide {22, 55} into {22} and {55}
-
Recursively sort {22} and {55} getting {22} and {55}
-
Merge {22} and {55} getting {22, 55}
-
Do NOT divide {33} since it has only one
element and hence is already sorted
-
Merge {22, 55} and {33} getting {22, 33, 55}
-
Divide {44, 11} into {44} and {11}
-
Recursively sort {44} and {11} getting {44} and {11}
-
Do NOT divide {44} since it has only one
element and hence is already sorted
-
Do NOT divide {11} since it has only one
element and hence is already sorted
-
Merge {44} and {11} getting {11, 44}
-
Merge {22, 33, 55} and {11, 44} getting {11, 22, 33, 44, 55}
Finally, there is still one recursion to do, so we get:
-
Divide {22, 55, 33, 44, 11} into {22, 55, 33} and {44, 11}
-
Recursively sort {22, 55, 33} and {44, 11} getting {22, 33, 55}
and {11, 44}
-
Divide {22, 55, 33} into {22, 55} and {33}
-
Recursively sort {22, 55} and {33} getting {22, 55} and {33}
-
Divide {22, 55} into {22} and {55}
-
Recursively sort {22} and {55} getting {22} and {55}
-
Do NOT divide {22} since it has only
one element and hence is already sorted.
-
Do NOT divide {55} since it has only
one element and hence is already sorted.
-
Merge {22} and {55} getting {22, 55}
-
Do NOT divide {33} since it has only one
element and hence is already sorted
-
Merge {22, 55} and {33} getting {22, 33, 55}
-
Divide {44, 11} into {44} and {11}
-
Recursively sort {44} and {11} getting {44} and {11}
-
Do NOT divide {44} since it has only one
element and hence is already sorted
-
Do NOT divide {11} since it has only one
element and hence is already sorted
-
Merge {44} and {11} getting {11, 44}
-
Merge {22, 33, 55} and {11, 44} getting {11, 22, 33, 44, 55}
Hopefully there is a better way to describe this action. How about
the following picture? The left tree shows the dividing; the right
shows the result of the merging.