LECTURE 5
Chee Yap
Euler's formula, relating the number of vertices, edges and faces of a planar embedding of a graph, is a fundamental relation in computational geometry. It leads to all the favorable computational properties of planar graphs, as compared to general graphs. We introduce the notion of skeletons and their computation. In general, we need the language of cell complexes as in topology, where skeletons might be called 1-complexes.
The problem of point location, Kirkpatrick's elegant solution, and the alternative of Seidel are treated.
Let G = (V,E) be an undirected graph. Let the map m assign to each v ∈ V a distinct point m(v) in the plane. The points are distinct in the sense that u ≠ v implies m(u) ≠ m(v). This map extends naturally to edges, where e = (u,v) ∈ E is mapped to the open line segment m(e) = (m(u),m(v)). This extended map m is called a linear plane embedding of G if the following two conditions hold.
Let S be a set of open line segments (called ``edges'') and points (``vertices''). We call S a skeleton if it satisfies the following properties:
|
Given any graph G, we have the usual numerical quantities |V|, |E| and the number of connected components b(G). We call these quantities the embedding invariants of G, and denote them by n0(G) = |V|, n1(G) = |E| and b(G).
Remark: Two closed line segments are said to be crossing if their relative interiors intersect; otherwise they are non-crossing. Note that non-crossing line segments may share endpoints. Thus a skeleton S can alternatively be represented by a set of closed line segments that are pairwise non-crossing.
A skeleton S (or graph G) can be incrementally constructed via a sequence of |S| (or |V|+|E|) operations, where each operation comes under one of the following two types:
| (1) |
Theorem 1
The following formula of Euler holds for any skeleton S:
n0(S) - n1(S) + n2(S) = 1 + b(S). (2)
The above proof apparently depends on a choice of a sequence (1), namely, on the order in which vertices and edges are introduced. But the numerical quantity n2(S) depends only on the final set S, and not on the sequence of operations to reach S. We now show that n2(S) depends only on the graph G(S).
Corollary 2
The number n2(G) is well-defined for a planar graph G.
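Since Euler's formula (2) determines n2 from n0, n1 and b, the number of regions can be computed from the graph alone. The following is a minimal sketch under that reading; the function names `count_components` and `count_regions` are illustrative, and the input is assumed to be a planar graph with vertices numbered 0..n0-1 and no repeated edges.

```python
def count_components(num_vertices, edges):
    """Count connected components b(G) with a simple union-find."""
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def count_regions(num_vertices, edges):
    """Solve n0 - n1 + n2 = 1 + b for n2 (assumes the graph is planar)."""
    n0, n1 = num_vertices, len(edges)
    b = count_components(num_vertices, edges)
    return 1 + b - n0 + n1


triangle = [(0, 1), (1, 2), (2, 0)]
two_triangles = triangle + [(3, 4), (4, 5), (5, 3)]
print(count_regions(3, triangle))       # interior plus the unbounded region
print(count_regions(6, two_triangles))  # two interiors plus the unbounded region
```

Note that the empty skeleton gives n2 = 1, which matches the convention that the whole plane is then a single region.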
Remark: We can equally develop this section by defining the ``dual'' of a construction sequence: we may define a destruction sequence for any embedding S to be a sequence (S0, S1, ..., Sm) where S0 = S, and each Si+1 is obtained from Si by removing an arbitrary edge or by removing a vertex that is no longer incident on any edge.
Let G=(V,E) be a planar graph.
Our first goal is to bound the average degree d = d(G)
of vertices in a planar graph:
|
| (3) |
| (4) |
|
|
Let us summarize:
In a plane graph with v vertices and e edges and r
regions,
|
The information represented by an embedding (or a skeleton) needs to be rationally encoded to support algorithms for such embeddings. Several data structures have been proposed in the literature, and we now examine some of them.
The Problem with Holes. First some terminology: if f, f' are two adjacent faces, with f' contained in the closure of f, then we say that f' bounds f. The dimension of f' will be less than the dimension of f. The boundary of a region will in general comprise several boundary components, which are pairwise disjoint simple polygons (i.e., their vertices and edges). For bounded regions, there is a distinguished boundary component, namely, the outermost boundary. The non-distinguished boundary components can be seen to bound holes. Connected regions that do not have holes are said to be simply-connected. Alternatively, for a non-empty skeleton, a region is simply-connected iff it is bounded and has one boundary component.
One way to convert bounded regions into simply-connected regions is to introduce, for each hole, an edge that connects the boundary of the hole to the rest of the boundary. Such an edge (called an isthmus) is characterized by having the same region on both sides of the edge. For simplicity, we often assume that the bounded regions are simply-connected.
Requirements. Let S be a skeleton. What do we require from a data structure D(S) for P(S)? Basically, we expect to reach, from any face of P(S), its adjacent faces quickly.
Let us give an example of an operation that does not automatically fall out of such representations: given a vertex, what is the edge (if any) that is vertically above it? This query cannot be obtained in constant time using the above information.
First Attempt: Augmented Adjacency Lists. To indicate that the correct data structure is not entirely trivial, let us explore the reasonable suggestion to represent P(S) by any efficient representation of the graph G(S), augmented by additional information that arises from the embedding. One of the most useful representations of graphs is the adjacency list representation: we have a vertex list and an edge list; for each edge in the edge list, we store its pair of endpoints, and for each vertex v in the vertex list, we store an adjacency list A(v) comprising the edges that are bounded by v. We now propose to represent subdivisions by augmenting the adjacency list representation. The following additional information may be provided.
Unfortunately, this data structure does not meet one of our above requirements. How do we traverse around the boundary of a region efficiently? The adjacency list does encode information to support this traversal, but not in a form that is easily accessible. In particular, we cannot get from one bounding edge e to the ``next edge'' e' in constant time. It takes time proportional to the degree of the vertex that is incident to both e and e'. Nevertheless, this simple ``augmented adjacency list'' structure may be useful if going around the boundary of a region is not important.
An alternative solution would be to further augment our augmented adjacency list structure: store with each face, a list of all the edges bounding the face. But this is duplicating information, which is sometimes undesirable.
The DCEL Structure. There is no simple way to modify the augmented adjacency list representation above. Instead we must take it apart: the problem lies in the monolithic ``adjacency lists'' (inherited from graphs). We will take the information in these lists and distribute it among the edges. Each edge e is now given an arbitrary direction. This amounts to specifying one incident vertex of e as the start and the other as the stop vertex. Relative to this direction, we have two faces called the left and right faces of e. We define the left and right successors of e to be the two edges that are incident on e.stop and which bound the left and right (respectively) faces of e. Similarly, we can define the left and right predecessors of e; these are edges incident on e.start and bounding the left and right (respectively) faces of e. See figure 1(a).
Half-Edge Data Structure. The DCEL structure has one unsatisfactory feature: its arbitrary assignment of direction to edges. We can remove this arbitrariness by splitting each edge into two half-edges. The resulting data structure is the so-called half-edge data structure. The two half-edges are really the same edge with opposite orientations. Such a pair are called twins. For each half-edge h, let h.twin be its twin. Thus, h.twin.twin = h. The end points of h are h.start and h.stop, and we consider h to be directed from h.start to h.stop. Thus h.twin.start = h.stop and h.twin.stop = h.start. We view each half-edge as bounding a single face, denoted h.face. This is the face that lies on the right side of the directed half-edge. We also have h.succ and h.pred, referring to the half-edges that bound h.face and are incident on h.stop and h.start (respectively).
This data structure will be our default data structure for plane subdivisions (and any general surface mesh). For this reason, let us be explicit about our conventions for this data structure.
What is the complexity of this representation? It is O(n0(S)+n1(S)+n2(S)) = O(n), where n = n0(S) is usually taken to be the size of the subdivision; the equality holds because n1(S) and n2(S) are both O(n0(S)) by Euler's formula.
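The half-edge conventions stated above (twins, start and stop vertices, the face on the right, succ and pred) can be sketched as follows. This is an illustrative sketch, not the lecture's own definition: the class name `HalfEdge` and the helpers `make_twins`, `link_cycle` and `boundary` are hypothetical, and the subdivision is a single triangle built by hand.

```python
class HalfEdge:
    """One directed half of an edge; h.face lies on the right of h."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.twin = None
        self.succ = None
        self.pred = None
        self.face = None


def make_twins(u, v):
    """Create the two half-edges of the edge uv and link them as twins."""
    h, t = HalfEdge(u, v), HalfEdge(v, u)
    h.twin, t.twin = t, h
    return h, t


def link_cycle(half_edges, face):
    """Link half-edges, given in boundary order, into one cycle around face."""
    k = len(half_edges)
    for i, h in enumerate(half_edges):
        h.succ = half_edges[(i + 1) % k]
        h.pred = half_edges[(i - 1) % k]
        h.face = face


def boundary(h):
    """Vertices met while walking once around h.face, starting at h."""
    out, cur = [], h
    while True:
        out.append(cur.start)
        cur = cur.succ
        if cur is h:
            return out


# A triangle whose vertices a, b, c are in counterclockwise order.
ab, ba = make_twins("a", "b")
bc, cb = make_twins("b", "c")
ca, ac = make_twins("c", "a")
# Counterclockwise half-edges have the unbounded region on their right.
link_cycle([ab, bc, ca], "outer")
# Clockwise half-edges have the triangle's interior on their right.
link_cycle([ac, cb, ba], "inner")

print(boundary(ac))  # walks a, c, b around the interior
```

With the face-on-the-right convention, the boundary of a bounded region is traversed clockwise, and each step of the walk is a single pointer dereference, which is exactly what the augmented adjacency list could not offer.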
Variations. Some variants are called ``winged edge data structures'', where the ``wing'' terminology is suggested by the directions drawn on edges. Sometimes, redundant information may be stored to give a constant-factor speedup. In some applications, some of the links can be omitted. As noted above, we may often drop all information related to the regions. If this is done, we may achieve a significant improvement in performance (by a constant factor, of course). Another remark is that our skeletons allow ``dangling edges'' or handles, which do not appear essential. But one reason to allow them is that, in some incremental algorithms, our data structure may pass through intermediate stages with such handles, which will eventually be removed.
Quad-Edge Data Structure and Duality. The quad-edge data structure was introduced by Guibas and Stolfi. Let us now restrict attention to regions that are simply-connected. Then there is a dual graph D(G) in which the regions of G are the vertices of D(G), and the edges of D(G) are still the edges of G. In general, D(G) may no longer be a simple graph: it may have multiple (or parallel) edges. See figure 2.
The quad-edge data structure for a skeleton S has an elegant symmetry between vertices and regions, thus giving no preference to the graph G(S) or its dual D(G(S)).
Remarks. For an in-depth discussion on designing a data structure for surfaces in a general geometric library, we refer to Kettner [1].
A connected subset X ⊆ ℝ² is y-monotone if every horizontal line H(t) intersects X in a connected set H(t) ∩ X. Note that if H(t) ∩ X is empty, it is considered a connected set. Let P be a subdivision of the plane. We say P is y-monotone if every bounded face of P is y-monotone. We say P is triangulated if every bounded region is a triangle. Note that
Let S be a skeleton. It induces a partition P(S) of ℝ² into disjoint sets, which comprise the vertices and edges in S as well as the regions as defined before. Thus S ⊆ P(S), and the set of regions is P(S)\S. We call P = P(S) the planar subdivision induced by S. Each f ∈ P is called a face of the subdivision. Thus the faces are vertices (or 0-faces), edges (or 1-faces) or regions (or 2-faces).
Now assume that the underlying graph G(S) is connected, i.e., b(G(S)) = 1. The planar point location problem for S asks us to construct a data structure D(S) such that for any point q ∈ ℝ², we can use D(S) to efficiently determine the face f(q) of the subdivision P(S) that contains q. Here q is called the query point.
We now describe a beautiful data structure D(S) of Kirkpatrick to solve the planar point location problem. We assume that the subdivision P(S) is triangulated (that is, each region, except for the infinite region, is a triangle). A subset U ⊆ V(S) is called an independent set if no two vertices in U are connected by an edge.
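Kirkpatrick defines a sequence (hierarchy)
S0, S1, ..., Sh (5)
of triangulated skeletons, where S0 = S and each Si+1 is obtained from Si by removing an independent set Ui of vertices, together with their incident edges, and re-triangulating the resulting regions.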
It is instructive to understand this re-triangulation process. Say u is a removed vertex, and as a result, we have to remove the k edges that are incident on u. This creates a ``star-shaped region'' that is centered at u and has k bounding segments. To re-triangulate this region, we only need to add k-3 new edges. Note that because we assume U is an independent set, the star-shaped regions for the vertices in U are pairwise disjoint. Hence the re-triangulation for each u can proceed independently.
This completes the description of Si+1. Intuitively, Si+1 is a ``simplified version'' of Si. We stop this simplification process when |Si| is less than some constant. We may represent each P(Si) using some standard topological representation of subdivisions (e.g., the half-edge data structure).
There is another important set of links in our hierarchical data structure: each face f ∈ P(Si+1) points to its ``cause'' in P(Si). If f occurs as f' ∈ P(Si), then f' is the ``direct cause'' of f. If not, f points to the vertex u in P(Si) whose removal led to the creation of f (in this case, f is an edge or a region). Here, u is the ``indirect cause'' of f. We have now completely described D(S), except for one additional detail: how are the independent sets Ui specified?
Let us see how we can use D(S) for point location: given a query point q, we first locate the face f of P(Sh) that contains q. In general, when we have found the face fi+1 in P(Si+1) that contains q, we can find the face fi in P(Si) that contains q as follows: follow the ``cause link'' of fi+1 to some f' in P(Si). If f' is a direct cause then fi is simply f'. Otherwise, we need to search the edges and regions in P(Si) that are incident on f' to find fi. How much time does this search take? This depends on the degree of f'. We would like this degree to be bounded.
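The descent through the hierarchy can be sketched as follows. This is a simplified illustration, not Kirkpatrick's full structure: it tracks regions only (a query point on an edge or vertex is reported in some incident region), and the names `Region`, `RemovedVertex`, `in_triangle` and `locate` are hypothetical. The example hierarchy has two levels, built by hand.

```python
def orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counterclockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def in_triangle(q, tri):
    """True if q lies in the closed triangle tri (either orientation)."""
    a, b, c = tri
    signs = (orient(a, b, q), orient(b, c, q), orient(c, a, q))
    return not (min(signs) < 0 < max(signs))


class Region:
    """A triangular region; cause is a Region, a RemovedVertex, or None."""

    def __init__(self, name, tri, cause=None):
        self.name, self.tri, self.cause = name, tri, cause


class RemovedVertex:
    """Indirect cause: a removed vertex and the finer regions incident on it."""

    def __init__(self, point, incident):
        self.point, self.incident = point, incident


def locate(q, coarsest):
    """Find the finest-level region containing q, or None if there is none."""
    # The coarsest level has constant size, so a brute-force scan is fine.
    f = next((r for r in coarsest if in_triangle(q, r.tri)), None)
    while f is not None and f.cause is not None:
        if isinstance(f.cause, Region):
            f = f.cause  # direct cause: the same face one level down
        else:
            # Indirect cause: search the regions incident on the removed vertex.
            f = next((r for r in f.cause.incident if in_triangle(q, r.tri)), None)
    return f


A, B, C, u = (0, 0), (6, 0), (0, 6), (1, 1)
fine = [Region("ABu", (A, B, u)), Region("BCu", (B, C, u)), Region("CAu", (C, A, u))]
coarse = [Region("ABC", (A, B, C), cause=RemovedVertex(u, fine))]

print(locate((3, 0.5), coarse).name)
print(locate((2, 3), coarse).name)
```

Each step of the loop costs time proportional to the number of regions incident on the removed vertex, which is why a bound on the degree of removed vertices matters.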
Complexity of Kirkpatrick's Solution. This brings us to the final detail: Kirkpatrick shows that we can choose Ui so that (a) each vertex in Ui has degree at most 11, and (b) |Ui| ≥ |V(Si)|/24.
Let us deduce the computational significance of (a) and (b). From (b), it follows that the length h of the hierarchy (5) is O(log |V(S)|) = O(lg n). In fact, it is at most
h ≤ lg n / lg(24/23). (6)
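One way to pick an independent set of bounded-degree vertices is a greedy scan, sketched below. This is an illustration under stated assumptions rather than Kirkpatrick's exact procedure: the function names, the choice to scan vertices in order of increasing degree, and the `protected` parameter (for vertices such as those of the outer triangle that must never be removed) are all hypothetical. The default constants 11 and 24 are the pair (d, f) = (11, 24) quoted in this section, and the sketch does not itself prove the size guarantee.

```python
import math


def low_degree_independent_set(adj, max_degree=11, protected=()):
    """Greedily choose pairwise non-adjacent vertices of degree <= max_degree.

    adj maps each vertex to the list of its neighbours.
    """
    blocked = set(protected)
    chosen = []
    # Scanning low-degree vertices first is a heuristic choice of this sketch.
    for v in sorted(adj, key=lambda w: len(adj[w])):
        if v in blocked or len(adj[v]) > max_degree:
            continue
        chosen.append(v)
        blocked.add(v)
        blocked.update(adj[v])  # neighbours may no longer be chosen
    return chosen


def levels_bound(n, f=24):
    """Levels needed if each level keeps at most a (f-1)/f fraction of vertices."""
    return math.log(n) / math.log(f / (f - 1))


# A wheel: hub 0 joined to the rim cycle 1, 2, ..., 6.
wheel = {
    0: [1, 2, 3, 4, 5, 6],
    1: [0, 2, 6], 2: [0, 1, 3], 3: [0, 2, 4],
    4: [0, 3, 5], 5: [0, 4, 6], 6: [0, 5, 1],
}
print(low_degree_independent_set(wheel))
print(round(levels_bound(10**6)))
```

Because the chosen vertices are pairwise non-adjacent, the star-shaped regions left by their removal are disjoint and can be re-triangulated independently, as described earlier.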
Discussion. Unfortunately, this beautiful data structure does not appear to be useful in practice. There are two reasons. One is that the hidden constant in the O(log n) time performance seems to be too large. Let the time be C lg n. Note that C depends linearly on 1/lg(24/23) + 11. We will next examine more practical alternatives. The second is that the approach requires the subdivision to be a triangulation. We will show how to get around this problem.
Kirkpatrick's result motivates the question: what other numbers can be used in place of the constants (d, f) = (11, 24) in the above proof? It turns out that we can use (d, f) = (9, 35/2).
Dynamic Maps. See Teillaud [2].
Consistency Problem. Given a half-edge data structure, how can we verify that it is consistent? In practice, this is an important issue, since algorithms often construct erroneous structures because of numerical roundoff errors.
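A purely combinatorial part of such a check can be sketched as follows, using only the identities stated for the half-edge data structure (h.twin.twin = h, h.twin.start = h.stop, and the roles of succ and pred). The array-of-records layout, the helper `he` and the function `check_half_edges` are assumptions of this sketch; it does not detect geometric inconsistencies such as crossing edges, and it assumes all indices are in range.

```python
def he(start, stop, twin, succ, pred, face):
    """One half-edge record; twin, succ and pred are indices into the array."""
    return {"start": start, "stop": stop, "twin": twin,
            "succ": succ, "pred": pred, "face": face}


def check_half_edges(H):
    """Return a list of violated invariants (empty if none are found)."""
    errors = []
    for i, h in enumerate(H):
        t, s, p = H[h["twin"]], H[h["succ"]], H[h["pred"]]
        if t["twin"] != i:
            errors.append(f"{i}: twin.twin is not the half-edge itself")
        if t["start"] != h["stop"] or t["stop"] != h["start"]:
            errors.append(f"{i}: twin does not reverse the endpoints")
        if s["start"] != h["stop"] or s["pred"] != i:
            errors.append(f"{i}: succ is not linked back or not at stop")
        if p["stop"] != h["start"] or p["succ"] != i:
            errors.append(f"{i}: pred is not linked back or not at start")
        if s["face"] != h["face"]:
            errors.append(f"{i}: succ bounds a different face")
    return errors


# Triangle a, b, c (counterclockwise); faces lie to the right of half-edges.
triangle = [
    he("a", "b", 1, 2, 4, "outer"),  # 0
    he("b", "a", 0, 5, 3, "inner"),  # 1
    he("b", "c", 3, 4, 0, "outer"),  # 2
    he("c", "b", 2, 1, 5, "inner"),  # 3
    he("c", "a", 5, 0, 2, "outer"),  # 4
    he("a", "c", 4, 3, 1, "inner"),  # 5
]
print(check_half_edges(triangle))

broken = [dict(h) for h in triangle]
broken[0]["succ"] = 4  # corrupt one link
print(len(check_half_edges(broken)) > 0)
```

The check runs in time linear in the number of half-edges, since each record is examined once and every test is a constant number of lookups.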
1
This technical distinction between ``planar graph'' and ``plane graph'' is
possibly confusing in ordinary speech.