## Lecture 21: Two-Three Trees

Weiss section 4.7 (B-trees).

Two-three trees are one form of balanced search trees. These in general have the following properties:

• The branching factor is limited by a constant bound.
• The height is proportional to log(N), where N is the number of elements.
• The operations of searching for an element, adding an element, and deleting an element, can all be carried out in time proportional to the height of the tree.

Weiss discusses several of these: AVL trees, Red-Black trees, Splay trees. We're only going to do B-trees.

### Two-three tree

A 2-3 tree has the following structure.
• All the leaves are at the same depth.
• Each item in the set is in a separate leaf. (There is a different version of 2-3 trees, more like binary search trees, where some of the items are in the leaves and some in the internal nodes. That makes the algorithms more complicated.)
• The leaves are in increasing order left to right.
• Every internal node has either 2 children or 3 children (hence the name).
• Every internal node has a label consisting of 1 or 2 values:
  If N has two children, then the label on N is the smallest value in the second subtree.
  If N has three children, then the label on N is the smallest value in the second subtree and the smallest value in the third subtree.
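As a concrete illustration, here is a minimal Python sketch of this structure (the `Leaf`/`Node` class names are my own, not from Weiss), with a small hand-built tree whose labels satisfy the rules above:

```python
class Leaf:
    """A leaf holds one item of the set."""
    def __init__(self, value):
        self.value = value

class Node:
    """An internal node has 2 or 3 children and 1 or 2 key values."""
    def __init__(self, children, keys):
        self.children = children
        self.keys = keys

def smallest(t):
    """Smallest value in the subtree rooted at t."""
    return t.value if isinstance(t, Leaf) else smallest(t.children[0])

# A 2-3 tree holding {1, 2, 3, 4, 5}; all leaves are at depth 2.
tree = Node(
    [Node([Leaf(1), Leaf(2)], keys=[2]),
     Node([Leaf(3), Leaf(4), Leaf(5)], keys=[4, 5])],
    keys=[3],   # smallest value in the second subtree
)

# The label rule: each key is the smallest value of the corresponding subtree.
assert tree.keys == [smallest(tree.children[1])]
left, right = tree.children
assert right.keys == [smallest(right.children[1]), smallest(right.children[2])]
```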

### Searching for an element

Variant on search in a binary search tree.
```
search(node T, value X) {   // look for value X under node T
    if (T is a leaf)
        if (T.value == X) return T;
        else return null;
    else if (X < T.key1)
        return search(T.firstChild, X);
    else if (T.key2 != null && X >= T.key2)
        return search(T.thirdChild, X);
    else
        return search(T.secondChild, X);
}
```
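Translating the pseudocode into runnable form (a Python sketch under my own hypothetical node layout: `key1`/`key2` become a `keys` list of length 1 or 2):

```python
class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, children, keys):
        self.children = children  # 2 or 3 children
        self.keys = keys          # 1 or 2 values

def search(t, x):
    """Return the leaf holding x under t, or None if x is absent."""
    if isinstance(t, Leaf):
        return t if t.value == x else None
    if x < t.keys[0]:                        # X < T.key1
        return search(t.children[0], x)
    if len(t.keys) == 2 and x >= t.keys[1]:  # T.key2 exists and X >= T.key2
        return search(t.children[2], x)
    return search(t.children[1], x)

# A small hand-built tree holding {1, 2, 3, 4, 5}:
tree = Node(
    [Node([Leaf(1), Leaf(2)], keys=[2]),
     Node([Leaf(3), Leaf(4), Leaf(5)], keys=[4, 5])],
    keys=[3],
)

assert search(tree, 4).value == 4
assert search(tree, 6) is None
```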

### How high is a 2-3 tree of size N?

The thinnest 2-3 tree of height H has 2 branches at each node and therefore N = 2^H.
The bushiest has 3 branches at each node and therefore N = 3^H.
Therefore for any 2-3 tree, 2^H <= N <= 3^H.
Taking logs to base 2: from 2^H <= N we get H <= log2(N).
From N <= 3^H we get log2(N) <= log2(3^H) = H * log2(3), so H >= log2(N) / log2(3).

Therefore H is O(log N).
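These bounds are easy to check numerically (a quick Python sanity check; N here is the number of leaves):

```python
import math

for h in range(1, 11):
    n_min = 2 ** h          # thinnest tree of height h
    n_max = 3 ** h          # bushiest tree of height h
    for n in (n_min, n_max):
        # 2^H <= N <= 3^H rearranges to the two height bounds:
        assert h <= math.log2(n) + 1e-9
        assert h >= math.log2(n) / math.log2(3) - 1e-9
```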

### Adding an element

```
Add(X) {
    using the same technique used in search,
        find where X would go if it were a leaf;
    if X is already there, return;
    create a node N for X, and attach it to the correct parent P;
    loop {
        if (P now has 3 children)
            return;
        else {                        // P now has 4 children
            create a sibling P1 for P;
            give the last two children of P to P1;
            // P and P1 now have 2 children each
            F = parent of P;
            if (F == null) exit the loop;
            make P1 a child of F;
            P = F;
        }
    } // end loop
    // if you have reached this point,
    //   then you have split the root
    create a new root R;
    make P and P1 children of R;
}
```

What I haven't done is explain how to fix the keys on the internal nodes. Note that the only nodes that change are those on the path from the added node to the root, together with their neighboring siblings, so at most 2 × (height of the tree) nodes. However, if you implement the key changes in the obvious way, getting them right takes time O(height) for each node being changed, and therefore O(height²) overall. By being clever about passing values up the tree, all the changes can in fact be carried out in time O(height). All that's needed is a careful case analysis, but I am not going to go through it.
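To make the shape of the algorithm concrete, here is a runnable Python sketch of bottom-up insertion (my own hypothetical `Leaf`/`Node` layout; for simplicity the internal keys are recomputed afterwards in the "obvious" naive way, not the clever O(height) version):

```python
class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, children):
        self.children = children  # 2 or 3 children (4 transiently, mid-split)
        self.keys = []

def smallest(t):
    return t.value if isinstance(t, Leaf) else smallest(t.children[0])

def fix_keys(t):
    """Recompute every internal label (the naive, 'obvious' way)."""
    if isinstance(t, Node):
        for c in t.children:
            fix_keys(c)
        t.keys = [smallest(c) for c in t.children[1:]]

def insert(root, x):
    """Insert x and return the (possibly new) root."""
    if root is None:
        return Leaf(x)
    if isinstance(root, Leaf):
        if root.value == x:
            return root
        r = Node(sorted([root, Leaf(x)], key=smallest))
        fix_keys(r)
        return r
    # Find where x would go, remembering the path of internal nodes.
    path, t = [], root
    while isinstance(t, Node):
        path.append(t)
        i = 0
        while i < len(t.keys) and x >= t.keys[i]:
            i += 1
        t = t.children[i]
    if t.value == x:
        return root                      # already present
    # Attach a new leaf to the correct parent, keeping children in order.
    p = path[-1]
    p.children.append(Leaf(x))
    p.children.sort(key=smallest)
    # Split any node with 4 children, walking back up toward the root.
    for j in range(len(path) - 1, -1, -1):
        p = path[j]
        if len(p.children) <= 3:
            break
        p1 = Node(p.children[2:])        # sibling takes the last two children
        p.children = p.children[:2]
        if j == 0:                       # we split the root: grow the tree
            root = Node([p, p1])
        else:
            f = path[j - 1]
            f.children.insert(f.children.index(p) + 1, p1)
    fix_keys(root)
    return root

def values(t):
    """In-order list of values, for checking."""
    if isinstance(t, Leaf):
        return [t.value]
    return [v for c in t.children for v in values(c)]

root = None
for v in [5, 1, 4, 2, 3, 7, 6]:
    root = insert(root, v)
assert values(root) == [1, 2, 3, 4, 5, 6, 7]
```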


### Deleting an element

```
delete(X) {
    use the search technique to find the leaf N for X;
    if (X is not in the tree) return;
    loop {
        P = parent of N;
        delete N;
        if (P now has two children) return;
        else {                          // P now has one child
            if (P is the root) {
                delete P;               // P's one child becomes the root
                return;
            }
            if (P has a neighboring sibling P1 with three children) {
                transfer the nearest child from P1 to P;
                // now both P and P1 have two children
                return;
            }
            else { // P has one child and all neighboring siblings have two
                transfer P's child to a neighboring sibling;
                N = P;
            }
        }
    } // end loop
}
```
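The deletion loop can also be sketched in runnable Python (same hypothetical two-or-three-child layout as used in the insertion sketch, redefined here so the example is self-contained; a small tree is built by hand, and keys are again fixed naively at the end):

```python
class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, children):
        self.children = children  # 2 or 3 children
        self.keys = []

def smallest(t):
    return t.value if isinstance(t, Leaf) else smallest(t.children[0])

def fix_keys(t):
    """Naively recompute every internal label."""
    if isinstance(t, Node):
        for c in t.children:
            fix_keys(c)
        t.keys = [smallest(c) for c in t.children[1:]]

def delete(root, x):
    """Delete x and return the (possibly new) root, or None if now empty."""
    if isinstance(root, Leaf):
        return None if root.value == x else root
    path, t = [], root
    while isinstance(t, Node):
        path.append(t)
        i = 0
        while i < len(t.keys) and x >= t.keys[i]:
            i += 1
        t = t.children[i]
    if t.value != x:
        return root                      # not in the tree
    n = t
    for j in range(len(path) - 1, -1, -1):
        p = path[j]
        p.children.remove(n)
        if len(p.children) == 2:
            break                        # p is still legal
        if j == 0:                       # p is the root with one child
            root = p.children[0]
            break
        f = path[j - 1]
        i = f.children.index(p)
        sibs = [f.children[k] for k in (i - 1, i + 1)
                if 0 <= k < len(f.children)]
        fat = next((s for s in sibs if len(s.children) == 3), None)
        if fat is not None:              # borrow the nearest child
            if f.children.index(fat) < i:
                p.children.insert(0, fat.children.pop())
            else:
                p.children.append(fat.children.pop(0))
            break
        # All neighboring siblings have two children:
        # give p's remaining child away, then delete p itself.
        s = sibs[0]
        if f.children.index(s) < i:
            s.children.append(p.children.pop())
        else:
            s.children.insert(0, p.children.pop())
        n = p
    fix_keys(root)
    return root

def values(t):
    if isinstance(t, Leaf):
        return [t.value]
    return [v for c in t.children for v in values(c)]

# A hand-built tree holding {1, 2, 3, 4}:
root = Node([Node([Leaf(1), Leaf(2)]), Node([Leaf(3), Leaf(4)])])
fix_keys(root)

root = delete(root, 2)   # forces a merge: the tree loses a level
assert values(root) == [1, 3, 4]
root = delete(root, 1)   # easy case: the parent keeps two children
assert values(root) == [3, 4]
```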



Again, I haven't explained how to fix the keys. Again, using an even more elaborate case analysis, you can show that this can be done in time O(height).