So we can find the kth smallest by doing (at most) two binary searches on the ranks. Each search visits O(log n) items. Whenever an item is visited, we must determine its rank. This can be found by doing a binary search on the other array. (Here we are searching for an actual item, not a rank.) For example, suppose we want the rank of T[i]. Let S[j] be the largest item in S that is smaller than T[i]. (S[j] can be found with a binary search.) Then the rank of T[i] is i+j.
So each "visit" requires a binary search, which takes O(log n) time. Thus we have O(log n) visits at O(log n) time per visit, giving a time complexity of O(log^2 n) for the whole algorithm.
Here is some pseudo code for just the part where we do a binary search on the ranks of S:
min := 0;
max := n-1;
while (max >= min) do begin
    i := min + floor((max - min)/2);
    r := rank(S[i]);
    if (r = k) then
        ANSWER S[i];     // we're done
    else if (r < k) then
        min := i+1;
    else                 // must be that r > k
        max := i-1;
end;
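As a concrete check, here is a runnable Python sketch of the whole algorithm (names are my own, and for simplicity it assumes all elements are distinct; ranks and k are 1-based, matching the rank argument above). The binary search on ranks is run first over S and, if k's rank is not found there, over T:

```python
from bisect import bisect_left

def kth_smallest(S, T, k):
    """Return the k-th smallest (1-based) element of the union of the
    sorted lists S and T.  Two binary searches on ranks, each visit
    doing a binary search in the other list: O(log^2 n) time."""
    def search(A, B):
        lo, hi = 0, len(A) - 1
        while lo <= hi:
            i = lo + (hi - lo) // 2
            # rank of A[i] in the union: the (i+1) items of A up to and
            # including A[i], plus the items of B smaller than A[i]
            r = (i + 1) + bisect_left(B, A[i])
            if r == k:
                return A[i]
            elif r < k:
                lo = i + 1
            else:
                hi = i - 1
        return None          # the k-th smallest is not in A

    ans = search(S, T)
    if ans is None:
        ans = search(T, S)
    return ans
```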
findAllElements(k):
    create empty List;
    findAllElements(k, root, List);
    return List;

findAllElements(k, v, List):
    if v is an external node then
        return;
    if k = key(v) then begin
        List.addElement(v);
        findAllElements(k, T.leftChild(v), List);
        findAllElements(k, T.rightChild(v), List);
    end;
    else if k < key(v) then
        findAllElements(k, T.leftChild(v), List);
    else
        findAllElements(k, T.rightChild(v), List);

To show that the time complexity is O(h + s), we must bound the number of visited nodes that do not have key k by O(h).
claim: The algorithm will trace no more than two paths, from the root to the leaves, containing nodes with keys not equal to k.
proof of claim: Notice that the path being traced splits in two only when a node with key k is encountered. If one or both of that node's children has key k, or if one or both of the children is a leaf, then we have not increased the number of paths containing non-k keys. So we need to show that the algorithm will encounter a node with key k, both of whose children have non-k keys, at most once. (In fact, this can only be the first node encountered with key k.) Suppose the algorithm encounters a node, x, with key k, and it is not the first such node. Then it must be a descendant of another node, y, with key k. Node x is either in the left or right subtree of y. Suppose it is in the left subtree (a similar argument applies if it is in the right subtree). Since the tree is a binary search tree, all keys in the left subtree of y must be less than or equal to k, and all keys in the right subtree of x must be greater than or equal to k. Since the right child of x is in both, it must have key equal to k. Hence, x has at most one child with key not equal to k, traversing x cannot increase the number of paths with non-k keys, and the claim is proven.
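The pseudo-code above can also be written as runnable Python. This is a sketch under my own conventions: external (leaf) nodes are represented by None, and the minimal Node class is hypothetical, not part of the original:

```python
class Node:
    # minimal BST node; external nodes are represented by None
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def find_all_elements(k, v, out):
    """Append to `out` every node with key k in the subtree rooted at v.
    Visits O(h + s) nodes, as argued above."""
    if v is None:                       # external node
        return
    if k == v.key:
        out.append(v)
        find_all_elements(k, v.left, out)
        find_all_elements(k, v.right, out)
    elif k < v.key:
        find_all_elements(k, v.left, out)
    else:
        find_all_elements(k, v.right, out)
```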
Since the AVL tree is also a binary search tree, we can deduce something about the range of values stored in a subtree from the values in the ancestors of the root of the subtree. For example, starting at the root, we have no information, so we can just say that the keys in the tree are in the range [-infinity, +infinity]. But if the root stores key k, then the nodes in the left subtree of the root must be in the range [-infinity, k], and those in the right subtree must be in [k, +infinity]. If the left child stores key k', then the nodes in the right subtree of the left child of the root must have keys in the range [k', k], etc.
We can do a modified depth first traversal of the tree, keeping track of the range boundaries of each subtree as we descend. Suppose we are looking for the number of keys in the range [k1, k2]. If we come to a subtree whose boundary range does not intersect [k1, k2], then we can ignore that subtree. If we come to a subtree whose boundary range is a subset of [k1, k2], then we can add to our running total the size stored at the root of that subtree (that is, the number of nodes in the subtree), and that subtree also does not need to be traversed. If the boundary range of the subtree intersects, but is not a subset of, [k1, k2], then the subtree needs to be traversed to see how many of its nodes are in [k1, k2].
Here is some pseudo-code. There is a second subroutine with the same name but different parameters to implement the recursion. (In this pseudo-code, ranges are treated as simple variables.)
countAllInRange(k1, k2):
    return countAllInRange(T.root, [k1,k2], [-infinity, +infinity]);

countAllInRange(v, [k1,k2], [r1,r2]):
    // returns the number of keys in the subtree rooted at v that
    // are in [k1,k2]
    // [r1,r2] is the boundary range for the subtree rooted at v
    // (all keys in this subtree are known to be in [r1,r2])
    if [r1,r2] is a subset of [k1,k2] then
        return v.size;
    else if [r1,r2] does not intersect [k1,k2] then
        return 0;
    else begin
        if v.key is in [k1,k2] then
            count := 1;
        else
            count := 0;
        count := count + countAllInRange(T.leftChild(v), [k1,k2], [r1,v.key]);
        count := count + countAllInRange(T.rightChild(v), [k1,k2], [v.key,r2]);
        return count;
    end;

To show that the algorithm has time complexity O(log n), we must show that the traversal does not branch out "too much". Notice that the traversal only branches at a node if the current boundary range, [ri, rj], intersects [k1, k2] but is not contained in it. In other words, we branch only if k1 is in [ri, rj] but k2 isn't, or vice versa. But furthermore notice that, at any level of the tree, no two boundary ranges associated with nodes at that level can overlap, except possibly at their endpoints. Therefore, at any level of the tree, at most two ranges can contain k1 but not k2, or vice versa.
Therefore, the algorithm can only traverse two complete paths from root to leaves, which represents O(log n) operations (since it is an AVL tree). At each node on these paths, there will be a branch off, but that branch will terminate after one node, so all the side branches represent only O(log n) more operations. So finally the algorithm must have time complexity O(log n).
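As a concrete sketch of this, here is a Python version over an ordinary (unbalanced) BST whose nodes are augmented with a size field; the Node class and names are my own, and the O(log n) bound of course requires the tree to be an AVL tree:

```python
import math

class Node:
    # BST node augmented with `size`, the number of keys in its subtree
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + (left.size if left else 0) \
                      + (right.size if right else 0)

def count_in_range(v, k1, k2, r1=-math.inf, r2=math.inf):
    """Number of keys in [k1, k2] in the subtree rooted at v, all of
    whose keys are known to lie in the boundary range [r1, r2]."""
    if v is None:
        return 0
    if k1 <= r1 and r2 <= k2:       # [r1,r2] is a subset of [k1,k2]
        return v.size
    if r2 < k1 or k2 < r1:          # [r1,r2] does not intersect [k1,k2]
        return 0
    count = 1 if k1 <= v.key <= k2 else 0
    count += count_in_range(v.left, k1, k2, r1, v.key)
    count += count_in_range(v.right, k1, k2, v.key, r2)
    return count
```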
If n = 1 then zero operations are needed, and 0 = 1 log 1.
Suppose n > 1, and for all m < n mergesort of a list of m items takes O(m log m) time. Let T(n) be the running time as a function of n. Since the merge step is O(n),
    T(n)  =  2T(n/2) + O(n)
         <=  2T(n/2) + C1 n

By the induction hypothesis,

    T(n/2) <= C2 (n/2) log(n/2)
           <= C3 n log n - C3 n       (taking C3 = C2/2, since log(n/2) = log n - 1)

Substituting into the first inequality,

    T(n) <= 2[C3 n log n - C3 n] + C1 n
         <= C4 n log n - C4 n + C1 n  (taking C4 = 2 C3)
         <= C4 n log n + C5 n

which is O(n log n), completing the induction.
Every sublist will be split exactly in half. But we still need to do O(n) work at each of the O(log n) levels of the recursion tree, so this is the best that can be done.
Sort(A);   // use an O(n log n) sorting algorithm
Sort(B);
i := 0;
j := n-1;
while i < n and j >= 0 do begin
    if A[i] + B[j] > m then
        j := j - 1;
    else if A[i] + B[j] = m then
        ANSWER (A[i], B[j]);   // we're done
    else
        i := i + 1;
end;
ANSWER "none";

Proof of time complexity: We do a constant amount of work every time i is incremented or j is decremented, but this happens at most 2n times, so the loop is O(n). Sorting is O(n log n).
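The same two-pointer scan, as a runnable Python sketch (function name is my own; it returns None in place of ANSWER "none"):

```python
def pair_with_sum(A, B, m):
    """Find a in A and b in B with a + b = m, or return None.
    Sorting is O(n log n); the scan is O(n), since each step either
    advances i or retreats j and there are at most 2n such steps."""
    A, B = sorted(A), sorted(B)
    i, j = 0, len(B) - 1
    while i < len(A) and j >= 0:
        s = A[i] + B[j]
        if s > m:
            j -= 1            # A[i] + B[j'] > m for all j' >= j too
        elif s == m:
            return (A[i], B[j])
        else:
            i += 1            # A[i] + B[j'] < m for all j' <= j too
    return None
```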
For each of the m edges there are two references to it out of all the incidence containers (adjacency lists), which requires big-Theta(m) space.
Other than that, each of the m edges and n vertices requires a constant, non-zero amount of space, so the total space usage is big-Theta(m + n).
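A minimal Python sketch makes the 2m count concrete (the function name is my own; the structure stores each undirected edge once per endpoint):

```python
def adjacency_lists(n, edges):
    """Build adjacency lists for an undirected graph on vertices
    0..n-1.  Each of the m edges appears twice, once in each
    endpoint's list, so the lists hold exactly 2m edge references:
    big-Theta(m + n) space in total."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj
```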
1, 3, 5, 7, 1, 2, 4, 6, 8, 2, 3, 4, 5, 6, 7, 8, 1.
15, 22, 16, 31, 127, 141, 32, 169, 126.
Suppose that u is the current node being explored, and (u, v) is a back-edge, so that v has already been traversed. Suppose, for the sake of contradiction, that v is not an ancestor of u. This means that v is not on the path from the root to u, which means that v is not on the current recursion stack. But this means that v has already been explored completely, which means that all paths from v have been traversed previously, including the one connecting it to u. But this is a contradiction, since the node u and the edge (u,v) are only now being traversed.
Therefore, v must be an ancestor of u.
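This can be checked with a small DFS sketch (my own code, assuming an undirected graph given as an adjacency dict; each edge is marked when first traversed so it is considered only once, matching the argument above). The assertion inside the loop is exactly the claim: any non-tree edge leads back to a vertex still on the recursion stack, i.e. an ancestor:

```python
def dfs_back_edges(adj, root):
    """DFS of a connected undirected graph, returning the back edges.
    Each back edge (u, v) found goes from the current vertex u to a
    vertex v still on the recursion stack -- an ancestor of u."""
    visited, on_stack, seen_edges, back = set(), [], set(), []
    def visit(u):
        visited.add(u)
        on_stack.append(u)
        for v in adj[u]:
            e = frozenset((u, v))
            if e in seen_edges:
                continue               # this edge was already traversed
            seen_edges.add(e)
            if v not in visited:
                visit(v)               # tree edge
            else:
                assert v in on_stack   # the claim: v is an ancestor of u
                back.append((u, v))
        on_stack.pop()
    visit(root)
    return back
```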
(We must assume that G is connected, since that is the only way it can have an Euler tour.)
The claim is trivially true when the number of edges is zero, and vacuously true when it is one, since there is no graph with exactly one edge in which every node has even degree. It is also clearly true when the number of edges is two, since there is only one such graph.
Suppose the claim is true for any graph with number of edges less than m, for m > 1.
Let G be a graph with m edges, m > 1, and with each node having an even degree. Pick a node x in G, and, starting from x, traverse a path of edges until a node is reached which has no more untraversed outgoing edges.
claim: this node must be x.
proof of claim: since each node has even degree, whenever we enter a node by traversing an edge, there must be another untraversed edge by which to leave it. But we started at x without entering it through an edge, so x is the only node we could enter by an edge and not have an untraversed edge available to leave by.
Let P be the path so traversed. Form a new graph, G', by removing all edges of P from G.
Each node in G' must still have even degree, since every removed edge that was incident on a node u corresponds to another removed edge incident on u. So G' consists of one or more connected components, each of which has fewer than m edges and all nodes of even degree. By the induction hypothesis, each connected component of G' has an Euler tour.
Now we can construct an Euler tour on G by traversing the path P, except that whenever P first enters one of the connected components of G', we traverse the Euler tour of that component, and then continue traversing P.
Hence, by induction, any connected graph with even-degree nodes has an Euler tour.
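The inductive proof is constructive, and Hierholzer's algorithm realizes the same splicing idea iteratively. Here is a sketch (my own code, assuming a connected undirected graph with all degrees even, given as an adjacency dict of lists): sub-tours are spliced into the main walk as vertices with remaining edges are popped off the stack.

```python
def euler_tour(adj, start):
    """Return an Euler tour, as a list of vertices, of a connected
    undirected graph in which every vertex has even degree."""
    # copy the adjacency lists so edges can be consumed as traversed
    remaining = {u: list(vs) for u, vs in adj.items()}
    stack, tour = [start], []
    while stack:
        u = stack[-1]
        if remaining[u]:               # walk an untraversed edge from u
            v = remaining[u].pop()
            remaining[v].remove(u)     # consume the edge u-v once
            stack.append(v)
        else:                          # u is exhausted: splice it in
            tour.append(stack.pop())
    return tour[::-1]
```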