3.1 Geometric Modeling
A wide variety of approaches and techniques for geometric modeling exist, and the particular choice usually depends on the application and the difficulty of the problem. In most cases, there are generally two alternatives: 1) a boundary repre- sentation, and 2) a solid representation. Suppose we would like to define a model of a planet. Using a boundary representation, we might write the equation of a sphere that roughly coincides with the planet’s surface. Using a solid representa- tion, we would describe the set of all points that are contained in the sphere. Both alternatives will be considered in this section.
The first step is to define the world for which there are two possible choices:
W
- a 2D world, in which = R2, and 2) a 3D world, in which = R3. These choices should be sufficient for most problems; however, one might also want to allow more complicated worlds, such as the surface of a sphere or even a higher
W W
81
dimensional space. Such generalities are avoided in this book because their current applications are limited. Unless otherwise stated, the world generally contains two kinds of entities:
- Obstacles: Portions of the world that are “permanently” occupied, for example, as in the walls of a building.
- Robots: Bodies that are modeled geometrically and are controllable via a motion plan.
Based on the terminology, one obvious application is to model a robot that moves around in a building; however, many other possibilities exist. For example, the robot could be a flexible molecule, and the obstacles could be a folded protein. As another example, the robot could be a virtual human in a graphical simulation that involves obstacles (imagine the family of Doom-like video games).
This section presents a method for systematically constructing representations of obstacles and robots using a collection of primitives. Both obstacles and robots will be considered as (closed) subsets of . Let the obstacle region denote the set of all points in that lie in one or more obstacles; hence, . The next step is to define a systematic way of representing that has great expressive power while being computationally efficient. Robots will be defined in a similar way; however, this will be deferred until Section 3.2, where transformations of geometric bodies are defined.
O
W O ⊆ W
W O
- Polygonal and Polyhedral Models
In this and the next subsection, a solid representation of O will be developed in terms of a combination of primitives. Each primitive Hi represents a subset of that is easy to represent and manipulate in a computer. A complicated obstacle region will be represented by taking finite, Boolean combinations of primitives. Using set theory, this implies that can also be defined in terms of a finite number of unions, intersections, and set differences of primitives.
W
O
Convex polygons First consider for the case in which the obstacle region is a convex, polygonal subset of a 2D world, = R2. A subset X Rn is called convex if and only if, for any pair of points in X, all points along the line segment that connects them are contained in X. More precisely, this means that for any
W ⊂
O
x1, x2 ∈ X and λ ∈ [0, 1],
λx1 + (1 − λ)x2 ∈ X. (3.1)
Thus, interpolation between x1 and x2 always yields points in X. Intuitively, X contains no pockets or indentations. A set that is not convex is called nonconvex (as opposed to concave, which seems better suited for lenses).
A boundary representation of is an m-sided polygon, which can be described using two kinds of features: vertices and edges. Every vertex corresponds to a “corner” of the polygon, and every edge corresponds to a line segment between a
O
Figure 3.1: A convex polygonal region can be identified by the intersection of half-planes.
pair of vertices. The polygon can be specified by a sequence, (x1, y1), (x2, y2), . . ., (xm, ym), of m points in R2, given in counterclockwise order.
A solid representation of can be expressed as the intersection of m half-
O
planes. Each half-plane corresponds to the set of all points that lie to one side of a line that is common to a polygon edge. Figure 3.1 shows an example of an octagon that is represented as the intersection of eight half-planes.
An edge of the polygon is specified by two points, such as (x1, y1) and (x2, y2). Consider the equation of a line that passes through (x1, y1) and (x2, y2). An equation can be determined of the form ax + by + c = 0, in which a, b, c R
∈
are constants that are determined from x1, y1, x2, and y2. Let f : R2 R be
→
the function given by f(x, y) = ax + by + c. Note that f(x, y) < 0 on one side
of the line, and f(x, y) > 0 on the other. (In fact, f may be interpreted as a signed Euclidean distance from (x, y) to the line.) The sign of f(x, y) indicates a half-plane that is bounded by the line, as depicted in Figure 3.2. Without loss of generality, assume that f(x, y) is defined so that f(x, y) < 0 for all points to the
left of the edge from (x1, y1) to (x2, y2) (if it is not, then multiply f(x, y) by −1).
Let fi(x, y) denote the f function derived from the line that corresponds to the edge from (xi, yi) to (xi+1, yi+1) for 1 ≤ i < m. Let fm(x, y) denote the line
equation that corresponds to the edge from (xm, ym) to (x1, y1). Let a half-plane
Hi for 1 ≤ i ≤ m be defined as a subset of W:
Hi = {(x, y) ∈ W | fi(x, y) ≤ 0}. (3.2) Above, Hi is a primitive that describes the set of all points on one side of the
Figure 3.2: The sign of the f(x, y) partitions R2 into three regions: two half-planes given by f(x, y) < 0 and f(x, y) > 0, and the line f(x, y) = 0.
line fi(x, y) = 0 (including the points on the line). A convex, m-sided, polygonal obstacle region O is expressed as
O = H1 ∩ H2 ∩ · · · ∩ Hm. (3.3)
Nonconvex polygons The assumption that O is convex is too limited for most applications. Now suppose that O is a nonconvex, polygonal subset of W. In this case O can be expressed as
O = O1 ∪ O2 ∪ · · · ∪ On, (3.4)
in which each Oi is a convex, polygonal set that is expressed in terms of half-
planes using (3.3). Note that Oi and Oj for i j need not be disjoint. Using this
representation, very complicated obstacle regions in can be defined. Although
W
these regions may contain multiple components and holes, if is bounded (i.e., will fit inside of a big enough rectangular box), its boundary will consist of linear segments.
O O
In general, more complicated representations of can be defined in terms of any finite combination of unions, intersections, and set differences of primitives; however, it is always possible to simplify the representation into the form given by (3.3) and (3.4). A set difference can be avoided by redefining the primitive. Suppose the model requires removing a set defined by a primitive Hi that contains1
O
fi(x, y) < 0. This is equivalent to keeping all points such that fi(x, y) ≥ 0, which is
equivalent to −fi(x, y) ≤ 0. This can be used to define a new primitive Hi′, which
when taken in union with other sets, is equivalent to the removal of Hi. Given a complicated combination of primitives, once set differences are removed, the
expression can be simplified into a finite union of finite intersections by applying Boolean algebra laws.
1In this section, we want the resulting set to include all of the points along the boundary.
Therefore, < is used to model a set for removal, as opposed to ≤.
Note that the representation of a nonconvex polygon is not unique. There are many ways to decompose into convex components. The decomposition should be carefully selected to optimize computational performance in whatever algorithms that model will be used. In most cases, the components may even be allowed to overlap. Ideally, it seems that it would be nice to represent with the minimum number of primitives, but automating such a decomposition may lead to an NP-hard problem (see Section 6.5.1 for a brief overview of NP-hardness). One efficient, practical way to decompose is to apply the vertical cell decomposition algorithm, which will be presented in Section 6.2.2
O
O
O
Defining a logical predicate What is the value of the previous representation? As a simple example, we can define a logical predicate that serves as a collision detector. Recall from Section 2.4.1 that a predicate is a Boolean-valued function. Let φ be a predicate defined as φ : true, false , which returns true for a point in that lies in , and false otherwise. For a line given by f(x, y) = 0, let e(x, y) denote a logical predicate that returns true if f(x, y) 0, and false otherwise.
≤
W O
W → { }
A predicate that corresponds to a convex polygonal region is represented by a logical conjunction,
α(x, y) = e1(x, y) ∧ e2(x, y) ∧ · · · ∧ em(x, y). (3.5)
The predicate α(x, y) returns true if the point (x, y) lies in the convex polygonal region, and false otherwise. An obstacle region that consists of n convex polygons is represented by a logical disjunction of conjuncts,
φ(x, y) = α1(x, y) ∨ α2(x, y) ∨ · · · ∨ αn(x, y). (3.6)
Although more efficient methods exist, φ can check whether a point (x, y) lies in in time O(n), in which n is the number of primitives that appear in the representation of (each primitive is evaluated in constant time).
O
O
Note the convenient connection between a logical predicate representation and a set-theoretic representation. Using the logical predicate, the unions and inter- sections of the set-theoretic representation are replaced by logical ORs and ANDs. It is well known from Boolean algebra that any complicated logical sentence can be reduced to a logical disjunction of conjunctions (this is often called “sum of products” in computer engineering). This is equivalent to our previous statement
that O can always be represented as a union of intersections of primitives.
Polyhedral models For a 3D world, = R3, and the previous concepts can be nicely generalized from the 2D case by replacing polygons with polyhedra and replacing half-plane primitives with half-space primitives. A boundary represen- tation can be defined in terms of three features: vertices, edges, and faces. Every face is a “flat” polygon embedded in R3. Every edge forms a boundary between two faces. Every vertex forms a boundary between three or more edges.
W
- (b)
Figure 3.3: (a) A polyhedron can be described in terms of faces, edges, and vertices.
- The edges of each face can be stored in a circular list that is traversed in counterclockwise order with respect to the outward normal vector of the face.
Several data structures have been proposed that allow one to conveniently “walk” around the polyhedral features. For example, the doubly connected edge list [264] data structure contains three types of records: faces, half-edges, and vertices. Intuitively, a half-edge is a directed edge. Each vertex record holds the point coordinates and a pointer to an arbitrary half-edge that touches the vertex. Each face record contains a pointer to an arbitrary half-edge on its boundary. Each face is bounded by a circular list of half-edges. There is a pair of directed half-edge records for each edge of the polyhedon. Each half-edge is shown as an arrow in Figure 3.3b. Each half-edge record contains pointers to five other records: 1) the vertex from which the half-edge originates; 2) the “twin” half-edge, which bounds the neighboring face, and has the opposite direction; 3) the face that is bounded by the half-edge; 4) the next element in the circular list of edges that bound the face; and 5) the previous element in the circular list of edges that bound the face. Once all of these records have been defined, one can conveniently traverse the structure of the polyhedron.
Now consider a solid representation of a polyhedron. Suppose that is a con- vex polyhedron, as shown in Figure 3.3. A solid representation can be constructed from the vertices. Each face of has at least three vertices along its boundary. Assuming these vertices are not collinear, an equation of the plane that passes through them can be determined of the form
O
O
ax + by + cz + d = 0, (3.7) in which a, b, c, d ∈ R are constants.
Once again, f can be constructed, except now f : R3 → R and
f(x, y, z) = ax + by + cz + d. (3.8)
Let m be the number of faces. For each face of O, a half-space Hi is defined as a subset of W:
Hi = {(x, y, z) ∈ W | fi(x, y, z) ≤ 0}. (3.9)
It is important to choose fi so that it takes on negative values inside of the poly- hedron. In the case of a polygonal model, it was possible to consistently define fi by proceeding in counterclockwise order around the boundary. In the case of a polyhedron, the half-edge data structure can be used to obtain for each face the list of edges that form its boundary in counterclockwise order. Figure 3.3b shows the edge ordering for each face. For every edge, the arrows point in opposite directions, as required by the half-edge data structure. The equation for each face can be consistently determined as follows. Choose three consecutive vertices, p1, p2, p3 (they must not be collinear) in counterclockwise order on the boundary of the face. Let v12 denote the vector from p1 to p2, and let v23 denote the vector from p2 to p3. The cross product v = v12 v23 always yields a vector that points out of the polyhedron and is normal to the face. Recall that the vector [a b c] is parallel to the normal to the plane. If its components are chosen as a = v[1], b = v[2], and c = v[3], then f(x, y, z) 0 for all points in the half-space that contains the polyhedron.
×
≤
As in the case of a polygonal model, a convex polyhedron can be defined as the intersection of a finite number of half-spaces, one for each face. A nonconvex polyhedron can be defined as the union of a finite number of convex polyhedra. The predicate φ(x, y, z) can be defined in a similar manner, in this case yielding
true if (x, y, z) ∈ O, and false otherwise.
- Semi-Algebraic Models
In both the polygonal and polyhedral models, f was a linear function. In the case of a semi-algebraic model for a 2D world, f can be any polynomial with real- valued coefficients and variables x and y. For a 3D world, f is a polynomial with variables x, y, and z. The class of semi-algebraic models includes both polygonal and polyhedral models, which use first-degree polynomials. A point set determined by a single polynomial primitive is called an algebraic set; a point set that can be obtained by a finite number of unions and intersections of algebraic sets is called a semi-algebraic set.
Consider the case of a 2D world. A solid representation can be defined using
algebraic primitives of the form
H = {(x, y) ∈ W | f(x, y) ≤ 0}. (3.10)
As an example, let f = x2 + y2 4. In this case, H represents a disc of radius 2 that is centered at the origin. This corresponds to the set of points (x, y) for
−
which f(x, y) ≤ 0, as depicted in Figure 3.4a.
Example 3.1 (Gingerbread Face) Consider constructing a model of the shaded region shown in Figure 3.4b. Let the center of the outer circle have radius r1 and
+
- +
- − +
− − +
−
−
−
− − + +
++ +++
+
+
- (b)
Figure 3.4: (a) Once again, f is used to partition R2 into two regions. In this case, the algebraic primitive represents a disc-shaped region. (b) The shaded “face” can be exactly modeled using only four algebraic primitives.
be centered at the origin. Suppose that the “eyes” have radius r2 and r3 and are centered at (x2, y2) and (x3, y3), respectively. Let the “mouth” be an ellipse with major axis a and minor axis b and is centered at (0, y4). The functions are defined as
f1 = x2 + y2 − r2,
f2 = −((x − x2) + (y − y2) − r ),
1
2 2 2
2
f3 = −((x − x3) + (y − y3) − r ),
2 2 2
3
f4 = −(x2/a2 + (y − y4)2/b2 − 1).
(3.11)
For f2, f3, and f4, the familiar circle and ellipse equations were multiplied by 1 to
−
yield algebraic primitives for all points outside of the circle or ellipse. The shaded region O is represented as
O = H1 ∩ H2 ∩ H3 ∩ H4. (3.12)
.
In the case of semi-algebraic models, the intersection of primitives does not necessarily result in a convex subset of . In general, however, it might be necessary to form by taking unions and intersections of algebraic primitives.
O
W
A logical predicate, φ(x, y), can once again be formed, and collision checking is still performed in time that is linear in the number of primitives. Note that it is still very efficient to evaluate every primitive; f is just a polynomial that is evaluated on the point (x, y, z).
The semi-algebraic formulation generalizes easily to the case of a 3D world.
This results in algebraic primitives of the form
H = {(x, y, z) ∈ W | f(x, y, z) ≤ 0}, (3.13)
which can be used to define a solid representation of a 3D obstacle and a logical predicate φ.
O
Equations (3.10) and (3.13) are sufficient to express any model of interest. One may define many other primitives based on different relations, such as f(x, y, z)
≥
0, f(x, y, z) = 0, f(x, y, z) < 0, f(x, y, z) = 0, and f(x, y, z) = 0; however, most of them do not enhance the set of models that can be expressed. They might, however, be more convenient in certain contexts. To see that some primitives do not allow new models to be expressed, consider the primitive
/
H = {(x, y, z) ∈ W | f(x, y, z) ≥ 0}. (3.14)
The right part may be alternatively represented as f(x, y, z) 0, and f may be considered as a new polynomial function of x, y, and z. For an example that involves the = relation, consider the primitive
− ≤ −
H = {(x, y, z) ∈ W | f(x, y, z) = 0}. (3.15)
It can instead be constructed as H = H1 ∩ H2, in which
H1 = {(x, y, z) ∈ W | f(x, y, z) ≤ 0} (3.16)
and
H2 = {(x, y, z) ∈ W | − f(x, y, z) ≤ 0}. (3.17)
The relation < does add some expressive power if it is used to construct primitives.2 It is needed to construct models that do not include the outer boundary (for example, the set of all points inside of a sphere, which does not include points on the sphere). These are generally called open sets and are defined Chapter 4.
- Other Models
The choice of a model often depends on the types of operations that will be per- formed by the planning algorithm. For combinatorial motion planning methods, to be covered in Chapter 6, the particular representation is critical. On the other hand, for sampling-based planning methods, to be covered in Chapter 5, the par- ticular representation is important only to the collision detection algorithm, which is treated as a “black box” as far as planning is concerned. Therefore, the models given in the remainder of this section are more likely to appear in sampling-based approaches and may be invisible to the designer of a planning algorithm (although it is never wise to forget completely about the representation).
≤
2An alternative that yields the same expressive power is to still use , but allow set comple- ments, in addition to unions and intersections.
Figure 3.5: A polygon with holes can be expressed by using different orientations: counterclockwise for the outer boundary and clockwise for the hole boundaries. Note that the shaded part is always to the left when following the arrows.
Nonconvex polygons and polyhedra The method in Section 3.1.1 required nonconvex polygons to be represented as a union of convex polygons. Instead, a boundary representation of a nonconvex polygon may be directly encoded by list- ing vertices in a specific order; assume that counterclockwise order is used. Each polygon of m vertices may be encoded by a list of the form (x1, y1), (x2, y2), . . ., (xm, ym). It is assumed that there is an edge between each (xi, yi) and (xi+1, yi+1) for each i from 1 to m 1, and also an edge between (xm, ym) and (x1, y1). Ordinar- ily, the vertices should be chosen in a way that makes the polygon simple, meaning that no edges intersect. In this case, there is a well-defined interior of the polygon, which is to the left of every edge, if the vertices are listed in counterclockwise order.
−
What if a polygon has a hole in it? In this case, the boundary of the hole can be expressed as a polygon, but with its vertices appearing in the clockwise direction. To the left of each edge is the interior of the outer polygon, and to the right is the hole, as shown in Figure 3.5
Although the data structures are a little more complicated for three dimen- sions, boundary representations of nonconvex polyhedra may be expressed in a similar manner. In this case, instead of an edge list, one must specify faces, edges, and vertices, with pointers that indicate their incidence relations. Consistent ori- entations must also be chosen, and holes may be modeled once again by selecting opposite orientations.
3D triangles Suppose W = R3. One of the most convenient geometric models to express is a set of triangles, each of which is specified by three points, (x1, y1, z1), (x2, y2, z2), (x3, y3, z3). This model has been popular in computer graphics because graphics acceleration hardware primarily uses triangle primitives. It is assumed that the interior of the triangle is part of the model. Thus, two triangles are considered as “colliding” if one pokes into the interior of another. This model offers great flexibility because there are no constraints on the way in which triangles must
Figure 3.6: Triangle strips and triangle fans can reduce the number of redundant points.
be expressed; however, this is also one of the drawbacks. There is no coherency that can be exploited to easily declare whether a point is “inside” or “outside” of a 3D obstacle. If there is at least some coherency, then it is sometimes preferable to reduce redundancy in the specification of triangle coordinates (many triangles will share the same corners). Representations that remove this redundancy are called a triangle strip, which is a sequence of triangles such that each adjacent pair shares a common edge, and a triangle fan, which is a triangle strip in which all triangles share a common vertex. See Figure 3.6.
Nonuniform rational B-splines (NURBS) These are used in many engi- neering design systems to allow convenient design and adjustment of curved sur- faces, in applications such as aircraft or automobile body design. In contrast to semi-algebraic models, which are implicit equations, NURBS and other splines are parametric equations. This makes computations such as rendering easier; however, others, such as collision detection, become more difficult. These models may be defined in any dimension. A brief 2D formulation is given here.
A curve can be expressed as
C(u) =
n
wi_P_i_N_i,k(u)
-
i=0
n
-
wi_N_i,k(u)
i=0
, (3.18)
in which wi R are weights and Pi are control points. The Ni,k are normalized basis functions of degree k, which can be expressed recursively as
∈
N (u) = ( u − ti \N
i,k
t
− t
i+1,k−1
i+k
i
(u) + ( ti+k+1 − u \N
i+k+1
i+1
(u). (3.19)
i+k
i
i+k+1
i+1
The basis of the recursion is Ni,_0(u) = 1 if t_i ≤ u < ti+1, and N_i,_0(u) = 0 otherwise.
i,k−1
t
− t
A knot vector is a nondecreasing sequence of real values, t0, t1, . . . , tm , that controls the intervals over which certain basic functions take effect.
{ }
Bitmaps For either = R2 or = R3, it is possible to discretize a bounded portion of the world into rectangular cells that may or may not be occupied. The resulting model looks very similar to Example 2.1. The resolution of this discretization determines the number of cells per axis and the quality of the ap-
W W
proximation. The representation may be considered as a binary image in which
each “1” in the image corresponds to a rectangular region that contains at least one point of , and “0” represents those that do not contain any of . Although bitmaps do not have the elegance of the other models, they often arise in applica- tions. One example is a digital map constructed by a mobile robot that explores an environment with its sensors. One generalization of bitmaps is a gray-scale map or occupancy grid. In this case, a numerical value may be assigned to each cell, indicating quantities such as “the probability that an obstacle exists” or the “expected difficulty of traversing the cell.” The latter interpretation is often used in terrain maps for navigating planetary rovers.
O O
Superquadrics Instead of using polynomials to define fi, many generalizations
can be constructed. One popular primitive is a superquadric, which generalizes quadric surfaces. One example is a superellipsoid, which is given for W = R3 by
|x/a|n_1 + |y/b|_n_2 _n_1/n_2 + |z/c|_n_1 − 1 ≤ 0, (3.20) in which n1 2 and n2 2. If n1 = n2 = 2, an ellipse is generated. As n1 and n2
≥ ≥
( )
increase, the superellipsoid becomes shaped like a box with rounded corners.
Generalized cylinders A generalized cylinder is a generalization of an ordinary cylinder. Instead of being limited to a line, the center axis is a continuous spine curve, (x(s), y(s), z(s)), for some parameter s [0, 1]. Instead of a constant radius, a radius function r(s) is defined along the spine. The value r(s) is the radius of the circle obtained as the cross section of the generalized cylinder at the point (x(s), y(s), z(s)). The normal to the cross-section plane is the tangent to the spine curve at s.
∈