Skipped for now

It is interesting to compare quick-sort with merge-sort. Both are divide-and-conquer algorithms: we divide, recursively sort each piece, and then combine the sorted pieces.

In merge-sort, the divide is trivial: throw half the elements into one pile and the other half into another pile. The combine step, while easy, does do comparisons: it repeatedly picks the next element from the correct pile.

In quick-sort, the combine is trivial: pick up one pile, then the other. The divide uses comparisons to decide which pile each element should be placed into.

As usual we assume that the sequence we wish to sort contains no duplicates. It is easy to drop this condition if desired.

    Algorithm quick-sort(S)
        Input: A sequence S (of size n).
        Output: A sorted sequence T containing the same elements as S.

        Create empty sequences T, L, G

        if n = 1 then copy the element of S to T
        if n < 2 return T

        { Divide into L and G }
        Pick an element P from S               { called the pivot }
        while (not S.isEmpty())
            x ← S.remove(S.first())
            if x < P then L.insertLast(x)      { L stands for less }
            if x > P then G.insertLast(x)      { G stands for greater }

        { Recursively sort L and G }
        LS ← quick-sort(L)                     { LS stands for L sorted }
        GS ← quick-sort(G)

        { Combine LS, P, and GS }
        while (not LS.isEmpty())
            T.insertLast(LS.remove(LS.first()))
        T.insertLast(P)
        while (not GS.isEmpty())
            T.insertLast(GS.remove(GS.first()))
        return T
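For readers who want to run the algorithm, here is a minimal Python sketch of the pseudocode. The function name `quick_sort` and the `pick_pivot` parameter are our own choices for illustration, not part of the notes. We assume, as in the text, that the input has no duplicates, and we default to the last element as the pivot.

```python
from collections import deque

def quick_sort(s, pick_pivot=lambda seq: seq[-1]):
    """Return a new sorted list with the (distinct) elements of s."""
    s = deque(s)                  # plays the role of the sequence S
    if len(s) < 2:
        return list(s)            # zero or one element: already sorted
    p = pick_pivot(s)             # the pivot (assumption: last element by default)
    less, greater = [], []        # the piles L and G
    # Divide: one pass of comparisons; the pivot itself lands in neither pile.
    while s:
        x = s.popleft()           # x <- S.remove(S.first())
        if x < p:
            less.append(x)
        elif x > p:
            greater.append(x)
    # Recur on each pile, then combine: sorted L, the pivot, sorted G.
    return quick_sort(less, pick_pivot) + [p] + quick_sort(greater, pick_pivot)

print(quick_sort([33, 55, 77, 11, 66, 88, 22, 44]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```

Note that all the comparisons happen in the divide loop; the combine step is just concatenation.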

The running time of quick-sort is highly dependent on the choice of the pivots at each stage of the recursion. A very simple method is to choose the last element as the pivot. This method is illustrated in the figure on the right. The pivot is shown in red. This tree is, not surprisingly, called the quick-sort tree.

The top tree shows the dividing and recursing that occurs with input {33,55,77,11,66,88,22,44}. The tree below shows the combining steps for the same input.

As with merge sort, we assign to each node of the tree the cost (i.e., running time) of the divide and combine steps. We also assign to the node the cost of the two recursive calls, but not their execution. How large are these costs?

The two recursive calls (not including the subroutine execution itself) are trivial and cost Θ(1).

The dividing phase is a simple loop whose running time is linear in the number of elements divided, i.e., in the size of the input sequence to the node. In the diagram this is the number of numbers inside the oval.

Similarly, the combining phase just does a constant amount of work per element and hence is again proportional to the number of elements in the node.

We would like to make an argument something like this. At each level of each of the trees the total number of elements is n so the cost per level is O(n). The pivot divides the list in half so the size of the largest node is divided by two each level. Hence the number of levels, i.e., the height of the tree, is O(log(n)). Hence the entire running time is O(nlog(n)).

That argument sounds pretty good and perhaps we should try to make
it more formal. However, I prefer to try something else since the
argument is **WRONG!**

**Homework:**
Draw the quick-sort
tree for sorting the following sequence
{222 55 88 99 77 444 11 44 22 33 66 111 333}.
Assume the pivot is always the last element.

The tree on the right illustrates the worst case of quick-sort, which occurs when the input is already sorted!

The height of the tree is N-1, not O(log(N)). This is because the pivot is, in this case, the largest element and hence does not come close to dividing the input into two pieces each about half the input size.

It is easy to see that we have the worst case. Since the pivot
does not appear in the children, at least one element from level i
does not appear in level i+1 so at level N-1 you can have at most 1
element left. So we have the highest tree possible.
Note also that level i has at least i pivots missing, so it can have at
most N-i elements in all its nodes. Our tree achieves this maximum.
So the time needed is proportional to the total number of numbers
written in the diagram, which is N + (N-1) + (N-2) + ... + 1. This is
again the one summation we know: N(N+1)/2, or Θ(N^{2}).
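The count can be checked numerically. The sketch below uses a hypothetical helper, `tree_stats`, that builds the quick-sort tree implicitly with the last element as pivot. It returns the tree's height and the total number of elements written in all nodes. We assume distinct elements and count an empty or one-element node as a leaf of height 0.

```python
def tree_stats(seq):
    """Return (height, total node sizes) of the quick-sort tree, last-element pivot."""
    if len(seq) <= 1:
        return 0, len(seq)                   # a leaf, possibly empty
    pivot = seq[-1]
    less = [x for x in seq if x < pivot]
    greater = [x for x in seq if x > pivot]
    h_less, t_less = tree_stats(less)
    h_greater, t_greater = tree_stats(greater)
    # This node contributes its own size; the pivot appears in neither child.
    return 1 + max(h_less, h_greater), len(seq) + t_less + t_greater

n = 100
print(tree_stats(list(range(1, n + 1))))          # sorted input: prints (99, 5050)
print(tree_stats([33, 55, 77, 11, 66, 88, 22, 44]))  # earlier example: prints (3, 22)
```

For the sorted input the height is N-1 = 99 and the total is N(N+1)/2 = 5050, matching the argument above.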

Hence the **worst case** complexity of quick-sort is
quadratic! Why don't we call it slow sort?

Perhaps the problem was in choosing the last element as the pivot. Clearly choosing the first element is no better; the same example on the right again illustrates the worst case (the tree has its empty nodes on the left this time).

Since we are spending linear time (as opposed to constant time) on the
division step, why not count how many elements are present (say k) and
choose element number k/2? This would not change the complexity (it
is also linear). You could do that and now a sorted list is not the
worst case. But some other list is. Just put the largest element in
the middle and then put the second largest element in the middle of
the node on level 1. This does have the advantage that if you
mistakenly run quick-sort on a sorted list, you won't hit the worst
case. But the worst case is still there and it is still
Θ(N^{2}).
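To see the effect of the pivot rule on an already sorted list, here is a small sketch comparing the height of the quick-sort tree for the two rules. The names `tree_height`, `last`, and `middle` are hypothetical. We also assume that "element number k/2" means index ⌊k/2⌋ counting from 0. The sketch shows only that the sorted list stops being a bad input; it does not construct the list that is bad for the middle rule.

```python
def tree_height(seq, pick_index):
    """Height of the quick-sort tree when a k-element node uses seq[pick_index(k)] as pivot."""
    if len(seq) <= 1:
        return 0
    pivot = seq[pick_index(len(seq))]
    less = [x for x in seq if x < pivot]
    greater = [x for x in seq if x > pivot]
    return 1 + max(tree_height(less, pick_index), tree_height(greater, pick_index))

def last(k):
    return k - 1          # index of the last of k elements

def middle(k):
    return k // 2         # assumption: 0-based "element number k/2"

sorted_input = list(range(15))
print(tree_height(sorted_input, last))     # prints 14, i.e. N-1
print(tree_height(sorted_input, middle))   # prints 3, about log2(N)
```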

Why not choose the real middle element as the pivot, i.e., the median? That would work! It would cut the sizes in half as desired. But how do we find the median? We could sort, but that is the original problem. In fact there is a (difficult) algorithm for computing the median in linear time, and if this is used for the pivot, quick-sort does take O(Nlog(N)) time in the worst case. However, the difficult median algorithm is not fast in practice. That is, the constants hidden in saying it is Θ(N) are rather large.

Instead of studying the fast, difficult median algorithm, we will
consider a randomized quick-sort algorithm and show that the
**expected** running time is Θ(Nlog(N)).

**Problem Set** 4, Problem 2.
Find a sequence of size N=12 giving the worst case for quick-sort when
the pivot for sorting k elements is element number ⌊k/2⌋.

Consider running the following quick-sort-like experiment.

- Pick a positive integer N
- Pick N numbers at random (say without duplicates).
- Choose at random one of the N numbers and call it the pivot.
- Split the other numbers into two piles: those bigger than the pivot and those smaller than the pivot.
- Call this a good split if the two piles are "nearly" equal. Specifically, the split is good if the larger of the piles has no more than 3N/4 elements (which is essentially the same as the smaller pile having no fewer than N/4 elements).

Are good splits rare or common?

**Theorem**: (From probability theory). The expected
number of times that a fair coin must be flipped until it shows
"heads" k times is 2k.

We will not prove this theorem, but will apply it to analyze good splits.
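Although we skip the proof, the theorem is easy to check empirically. The following is a rough simulation sketch, not a proof. The function names, the number of trials, and the seed are all arbitrary choices of ours.

```python
import random

def flips_until_k_heads(k, rng):
    """Flip a fair coin until it has shown heads k times; return the number of flips."""
    flips = heads = 0
    while heads < k:
        flips += 1
        if rng.random() < 0.5:    # treat this outcome as "heads"
            heads += 1
    return flips

def average_flips(k, trials=20000, seed=0):
    """Average number of flips over many independent experiments."""
    rng = random.Random(seed)     # fixed seed so that runs are repeatable
    return sum(flips_until_k_heads(k, rng) for _ in range(trials)) / trials

for k in (1, 5, 10):
    print(k, average_flips(k))    # each average should be close to 2k
```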

We picked the pivot at random so if we **imagine** the
N numbers lined up in order, the pivot is equally likely to be
anywhere in this line.

Consider the picture on the right. If the pivot is anywhere in the pink, the split is good. But the pink is half the line, so the probability that we get a "pink pivot" (i.e., a good split) is 1/2. This is the same probability that a fair coin comes up heads.
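The "half the line" claim can be checked by brute force. The sketch below tries every possible rank for the pivot and counts how many give a good split under the definition above. The name `good_split_fraction` is hypothetical. Because the two piles hold N-1 elements rather than N, the exact fraction comes out slightly above 1/2 for finite N.

```python
def good_split_fraction(n):
    """Fraction of the n equally likely pivot ranks that yield a good split."""
    good = 0
    for rank in range(n):                    # rank = how many elements are smaller than the pivot
        smaller, bigger = rank, n - 1 - rank
        if 4 * max(smaller, bigger) <= 3 * n:    # larger pile has at most 3N/4 elements
            good += 1
    return good / n

for n in (100, 1000, 10000):
    print(n, good_split_fraction(n))         # approaches 1/2 as n grows
```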

Every good split divides the size of the node by at least 4/3.
Recall that if you divide N by 4/3, log_{4/3}(N) times, you
will get 1. So the maximum number of good splits possible along a
path from the root to a leaf is log_{4/3}(N).

Applying the probability theorem above, we see that the expected length
of a path from the root to a leaf is at most 2log_{4/3}(N),
which is O(log(N)). That is, the expected height is O(log(N)).

Since the time spent at each level is O(N), the
**expected** running time of randomized quick-sort is O(Nlog(N)).
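Here is a minimal sketch of randomized quick-sort that also reports the height of its quick-sort tree, so the height can be compared with the bound 2log_{4/3}(N). The function name and the choice to return the height are ours. As before, we assume no duplicates.

```python
import math
import random

def randomized_quick_sort(seq, rng):
    """Return (sorted list, tree height), choosing each pivot uniformly at random."""
    if len(seq) <= 1:
        return list(seq), 0
    pivot = rng.choice(seq)                  # every element is equally likely
    less = [x for x in seq if x < pivot]
    greater = [x for x in seq if x > pivot]
    sorted_less, h_less = randomized_quick_sort(less, rng)
    sorted_greater, h_greater = randomized_quick_sort(greater, rng)
    return sorted_less + [pivot] + sorted_greater, 1 + max(h_less, h_greater)

n = 500
rng = random.Random(1)                       # arbitrary seed, for repeatability
result, height = randomized_quick_sort(list(range(n)), rng)   # sorted input is no longer special
print(result == sorted(result), height, 2 * math.log(n, 4 / 3))
```

A single run is only one sample, so the printed height may differ from run to run with other seeds. The analysis above bounds the expectation, not every outcome.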

**Theorem**: The running time of any comparison-based
sorting algorithm is Ω(Nlog(N)).

**Proof**: We will not cover this officially.
Unofficially, the idea is that we form a binary tree with each node
corresponding to a comparison and the two children corresponding to
the two possible outcomes. This is a tree of all possible executions
(with only comparisons used for decisions). There are N! permutations
of N numbers and each must give a different execution pattern in order
to be sorted. So there are at least N! leaves. Hence the height is
at least log(N!). But N! is a product in which N/2 of the factors are
at least N/2, so N!≥(N/2)^{N/2}. Hence

height ≥ log(N!) ≥ log((N/2)^{N/2}) = (N/2)log(N/2)

So the worst-case running time, which is at least the height of this tree, is
Ω(Nlog(N)).
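The chain of inequalities can be checked for specific values of N. This sketch uses base-2 logarithms and two hypothetical helper names. It compares log(N!), the minimum height of the decision tree, with the simpler lower bound (N/2)log(N/2).

```python
import math

def log2_factorial(n):
    """log2(n!), computed as a sum so that n! itself is never formed."""
    return sum(math.log2(i) for i in range(2, n + 1))

def half_bound(n):
    """The simpler lower bound (n/2) * log2(n/2) used in the proof sketch."""
    return (n / 2) * math.log2(n / 2)

for n in (8, 100, 1000):
    print(n, round(log2_factorial(n), 1), round(half_bound(n), 1))
```

For example, with N=3 we have log(3!) ≈ 2.58, so any comparison-based sort needs at least 3 comparisons in the worst case.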

**Corollary**: Heap-sort, merge-sort, and quick-sort (with the
difficult, linear-time median algorithm) are asymptotically optimal.