14.3 Sampling-Based Motion Planning Revisited

Now that the preliminary concepts have been defined for motion planning under differential constraints, the focus shifts to extending the sampling-based planning methods of Chapter 5. This primarily involves extending the incremental sampling and searching framework from Section 5.4 to incorporate differential constraints. Following the general framework, several popular methods are covered in Section 14.4 as special cases of the framework. If an efficient BVP solver is available, then it may also be possible to extend the sampling-based roadmaps of Section 5.6 to handle differential constraints.
14.3.1 Basic Components

This section describes how Sections 5.1 to 5.3 are adapted to handle phase spaces and differential constraints.

Distance and volume in X

Recall from Chapter 5 that many sampling-based planning algorithms rely on measuring distances or volumes in C. If X = C, as in the wheeled systems from Section 13.1.2, then the concepts of Section 5.1 apply directly. The equivalent is needed for a general state space X, which may include phase variables in addition to the configuration variables. In most cases, the topology of the phase variables is trivial. For example, if x = (q, q˙), then each q˙i component is constrained to an interval of R. In this case, the velocity components form an axis-aligned rectangular region in R^(n/2), if n is the dimension of X. It is straightforward in this case to extend a measure and metric defined on C up to X by forming the Cartesian product.

A metric can be defined using the Cartesian product method given by (5.4). The usual difficulty arises of arbitrarily weighting different components and combining them into a single scalar function. In the case of C, this has involved combining translations and rotations. For X, this additionally includes velocity components, which makes it more difficult to choose meaningful weights.
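As a rough illustration of the weighted Cartesian product approach, the following sketch combines translation, rotation, and velocity differences for a hypothetical planar rigid body with state x = (x, y, θ, x˙, y˙, θ˙); the state layout and weights are arbitrary placeholders, not values prescribed by the text.

```python
import math

def weighted_state_metric(xa, xb, w_trans=1.0, w_rot=0.5, w_vel=0.1):
    """Combine translation, rotation, and velocity differences into one scalar.

    States are (x, y, theta, xdot, ydot, thetadot); the weights are
    arbitrary placeholders, illustrating the difficulty of choosing them.
    """
    dx, dy = xb[0] - xa[0], xb[1] - xa[1]
    # Angular difference measured on the circle S^1 (identification at 2*pi).
    dtheta = abs(xb[2] - xa[2]) % (2 * math.pi)
    dtheta = min(dtheta, 2 * math.pi - dtheta)
    # Euclidean distance over the three velocity components.
    dvel = math.sqrt(sum((xb[i] - xa[i]) ** 2 for i in range(3, 6)))
    return w_trans * math.hypot(dx, dy) + w_rot * dtheta + w_vel * dvel
```

Note that the angular component respects the identification of 0 and 2π, as required when extending a metric from C.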

Riemannian metrics A rigorous way to define a metric on a smooth manifold is to define a metric tensor (or Riemannian tensor), which is a quadratic function of two tangent vectors. This can be considered as an inner product on X, which can be used to measure angles. This leads to the definition of the Riemannian metric, which is based on the shortest paths (called geodesics) in X [133]. An example of this appeared in the context of Lagrangian mechanics in Section 13.4.1. The kinetic energy, (13.70), serves as the required metric tensor, and the geodesics are the motions taken by the dynamical system to conserve energy. The metric can be defined as the length of the geodesic that connects a pair of points. If the chosen Riemannian metric has some physical significance, as in the case of Lagrangian mechanics, then the resulting metric provides meaningful information. Unfortunately, it may be difficult or expensive to compute its value.

The ideal distance function The ideal way to define distance on X is to use a cost functional and then define the distance from x ∈ Xfree to x′ ∈ Xfree as the optimal cost-to-go from x to x′ while remaining in Xfree. In some contexts, this has also been referred to as the nonholonomic metric, Carnot-Carathéodory metric, or sub-Riemannian metric [596]. Note that this is not a true metric, as mentioned in Section 5.1.2, because the cost may not be symmetric. For example, traveling a small distance forward with the Dubins car is much shorter than traveling a small distance backward. If there are obstacles, it may not even be possible to reach configurations behind the car.

This concept of distance should be somewhat disturbing because it requires optimally solving the motion planning problem of Formulation 14.1. Thus, it cannot be practical for efficient use in a motion planning algorithm. Nevertheless, understanding this ideal notion of distance can be very helpful in designing practical distance functions on X. For example, rather than using a weighted Euclidean metric (often called a Mahalanobis metric) for the Dubins car, a distance function can be defined based on the length of the shortest path between two configurations. These lengths are straightforward to compute and are based on the optimal curve families that will be covered in Section 15.3. This distance function neglects obstacles, but it should still provide better distance information than the weighted Euclidean metric. It may also be useful for car models that involve dynamics.

The general idea is to get as close as possible to the optimal cost-to-go without having to perform expensive computations. It is often possible to compute a useful underestimate of the optimal cost-to-go by neglecting some of the constraints, such as obstacles or dynamics. This may help in applying A∗ search heuristics.
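As a small example of such an underestimate: ignoring both obstacles and dynamics leaves straight-line travel at some maximum speed, which can never overestimate the true optimal time-to-go and is therefore admissible for A∗ search. The speed bound vmax below is an assumed property of the system, not something given in the text.

```python
import math

def euclidean_underestimate(x, x_goal, vmax):
    """Lower bound on the optimal time-to-go, obtained by neglecting
    obstacles and dynamics: straight-line distance at an assumed
    maximum speed vmax.

    Since no feasible trajectory can beat the straight line traversed
    at full speed, this never overestimates, so it is admissible as an
    A* heuristic.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_goal)))
    return dist / vmax
```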

Defining measure As mentioned already, it is straightforward to extend a measure on C to X if the topology associated with the phase variables is trivial. It may not be possible, however, to obtain an invariant measure. In most cases, C is a transformation group, in which the Haar measure exists, thereby yielding the “true” volume in a sense that is not sensitive to parameterizations of C. This was observed for SO(3) in Section 5.1.4. For a general state space X, a Haar measure may not exist. If a Riemannian metric is defined, then intrinsic notions of surface integration and volume exist [133]; however, these may be difficult to exploit in a sampling-based planning algorithm.

Sampling theory

Section 14.2.2 already covered some of the sampling issues. There are at least two continuous spaces: X and the time interval T. In most cases, the action space U is also continuous. Each continuous space must be sampled in some way. In the limit, it is important that any sample sequence is dense in the space on which sampling occurs. This was required for the resolution completeness concepts of Section 14.2.2.

Sampling of T and U can be performed by directly using the random or deterministic methods of Section 5.2. Time is just an interval of R, and U is typically expressed as a convex m-dimensional subset of R^m. For example, U is often an axis-aligned rectangular subset of R^m.
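A minimal sketch of both options for a rectangular U follows, using random sampling and a simple deterministic grid; the bounds and resolution are placeholders.

```python
import random

def sample_action(u_lo, u_hi, rng=random.Random(0)):
    """Draw one random sample from an axis-aligned rectangular action set
    U = [u_lo[0], u_hi[0]] x ... x [u_lo[m-1], u_hi[m-1]]."""
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(u_lo, u_hi))

def grid_actions(u_lo, u_hi, k):
    """Deterministic alternative: k evenly spaced values per axis,
    combined into a grid over U (shown here only for m = 1 and m = 2)."""
    axes = [[lo + (hi - lo) * i / (k - 1) for i in range(k)]
            for lo, hi in zip(u_lo, u_hi)]
    if len(axes) == 1:
        return [(a,) for a in axes[0]]
    return [(a, b) for a in axes[0] for b in axes[1]]
```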

Some planning methods may require sampling on X. The definitions of discrepancy and dispersion from Section 5.2 can be easily adapted to any measure space and metric space, respectively. Even though it may be straightforward to define a good criterion, generating samples that optimize the criterion may be difficult or impossible.

A convenient way to avoid this problem is to work in a coordinate neighborhood of X. This makes the manifold appear as an n-dimensional region in R^n, which in many cases is rectangular. This enables the sampling concepts of Section 5.2 to be applied in a straightforward manner. While this is the most straightforward approach, the sampling quality depends on the particular parameterization used to define the coordinate neighborhood. Note that when working with a coordinate neighborhood (for example, by imagining that X is a cube), appropriate identifications must be taken into account.

Collision detection

As in Chapter 5, efficient collision detection algorithms are a key enabler of sampling-based planning. If X = C, then the methods of Section 5.3 directly apply. If X includes phase constraints, then additional tests must be performed. These constraints are usually given and are therefore straightforward to evaluate. Recall from Section 4.3 that this is not efficient for the obstacle constraints on C due to the complicated mapping between obstacles in W and obstacles in C.

If only pointwise tests are performed, the trajectory segment between the points is not guaranteed to stay in Xfree. This problem was addressed in Section 5.3.4 by using distance information from collision checking algorithms. The same problem exists for the phase constraints of the form hi(x) ≤ 0. In this general form, there is no additional information that can be used to ensure that some neighborhood of x is contained in Xfree. Fortunately, the phase constraints are not complicated in most applications, and it is possible to ensure that x is at least some distance away from the constraint boundary. In general, careful analysis of each phase constraint is required to ensure that the state trajectory segments are violation-free.

In summary, determining whether x ∈ Xfree involves

          1. Using a collision detection algorithm as in Section 5.3 to ensure that κ(x) ∈ Cfree.
          2. Checking x to ensure that other constraints of the form hi(x) ≤ 0 have been satisfied.
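These two steps can be combined into a single validity test. In the sketch below, kappa, the configuration-space collision detector, and the constraint functions hi are all caller-supplied placeholders standing in for the components described above.

```python
def state_is_valid(x, kappa, in_cfree, phase_constraints):
    """Decide whether x lies in Xfree.

    kappa: projects a state x to its configuration (a placeholder for
           the mapping used in Step 1).
    in_cfree: a configuration-space collision detector, standing in for
           the methods of Section 5.3.
    phase_constraints: a list of functions h_i for which h_i(x) <= 0
           is required.
    """
    if not in_cfree(kappa(x)):                        # Step 1: collision detection
        return False
    return all(h(x) <= 0 for h in phase_constraints)  # Step 2: h_i(x) <= 0
```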

Figure 14.9: Using a system simulator, the system x˙ = f(x, u) is integrated from x(0) using u˜t : [0, t] → U to produce a state trajectory x˜t : [0, t] → X. Sometimes x˜ is specified as a parameterized path, but most often it is approximated as a sequence of samples in X.

Entire trajectory segments should theoretically be checked. In practice, however, often only individual points are checked, which is more efficient but technically incorrect.

14.3.2 System Simulator

A new component is needed for sampling-based planning under differential constraints because of (14.1). Motions are now expressed in terms of an action trajectory, but collision detection and constraint satisfaction tests must be performed in X. Therefore, the system x˙ = f(x, u) needs to be integrated frequently during the planning process. Similar to the modeling of collision detection as a “black box,” the integration process is modeled as a module called the system simulator. See Figure 14.9. Since the systems considered in this chapter are time-invariant, the starting time for any required integration can always be shifted to start at t = 0. Integration can be considered as a module that implements (14.1) by computing the state trajectory resulting from a given initial state x(0), an action trajectory u˜t, and time t. The incremental simulator encapsulates the details of integrating the state transition equation so that they do not need to be addressed in the design of planners. However, information from the particular state transition equation may still be important in the design of the planning algorithm.

Closed-form solutions According to (14.1), the action trajectory must be integrated to produce a state trajectory. In some cases, this integration leads to a closed-form expression. For example, if the system is a chain of integrators, then a polynomial expression can easily be obtained for x(t). For example, suppose q is a scalar and q¨ = u. If q(0) = q˙(0) = 0 and a constant action u = 1 is applied, then q(t) = t^2/2. If x˙ = f(x, u) is a linear system (which includes chains of integrators; recall the definition from Section 13.2.2), then a closed-form expression for the state trajectory can always be obtained. This is based on matrix exponentials and is given in many control theory texts (e.g., [192]).
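The double-integrator example can be written out directly. This sketch evaluates the closed-form expression for a constant action from an arbitrary start state; the particular start values and action are illustrative.

```python
def double_integrator_state(t, u=1.0, q0=0.0, qd0=0.0):
    """Closed-form state trajectory for the double integrator qddot = u
    under a constant action, starting from (q0, qd0):

        q(t)    = q0 + qd0 * t + u * t^2 / 2
        qdot(t) = qd0 + u * t

    Returns the state (q(t), qdot(t)).
    """
    return (q0 + qd0 * t + 0.5 * u * t * t, qd0 + u * t)
```

With q(0) = q˙(0) = 0 and u = 1, this reproduces q(t) = t^2/2 from the text.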

Euler method For most systems, the integration must be performed numerically. A system simulator based on numerical integration can be constructed by breaking t into smaller intervals and iterating classical methods for computing numerical solutions to differential equations. The Euler method is the simplest of these methods. Let ∆t denote a small time interval over which the approximation will be made. This can be considered as an internal parameter of the system simulator. In practice, this ∆t is usually much smaller than the ∆t used in the discrete-time model of Section 14.2.2. Suppose that x(0) and u(0) are given and the task is to estimate x(∆t).

By performing integration over time, the state transition equation can be used to determine the state after some fixed amount of time ∆t has passed. For example, if x(0) is given and u(t′) is known over the interval t′ ∈ [0, ∆t], then the state at time ∆t can be determined as

x(∆t) = x(0) + ∫_0^{∆t} f(x(t), u(t)) dt.    (14.14)

The integral cannot be evaluated directly because x(t) appears in the integrand and is unknown for time t > 0.

Using the fact that

f(x, u) = x˙ = dx/dt ≈ (x(∆t) − x(0)) / ∆t,    (14.15)

solving for x(∆t) yields the classic Euler integration method

x(∆t) ≈ x(0) + ∆t f(x(0), u(0)). (14.16)

The approximation error depends on how quickly x(t) changes over time and on the length of the interval ∆t. If the planning algorithm applies a motion primitive u˜p, it gives tF (u˜p) as the time input, and the system simulator may subdivide the time interval to maintain higher accuracy. This allows the developer of the planning algorithm to ignore numerical accuracy issues.
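A minimal system simulator built on the Euler step (14.16), subdividing the requested time interval internally as described above; the double-integrator check at the end is a hypothetical usage example, not part of the text.

```python
def euler_simulate(f, x0, u_traj, t, dt=1e-3):
    """Approximate the state trajectory of xdot = f(x, u) by iterating
    the Euler step (14.16) over subintervals of length roughly dt.

    f: state transition function returning a tuple xdot.
    u_traj: action trajectory, a function of time on [0, t].
    Returns the list of state samples, starting with x0.
    """
    steps = max(1, round(t / dt))
    h = t / steps                       # internal step, an approximation of dt
    x = tuple(x0)
    samples = [x]
    for i in range(steps):
        xdot = f(x, u_traj(i * h))
        x = tuple(xi + h * di for xi, di in zip(x, xdot))
        samples.append(x)
    return samples

# Double integrator q'' = u with u = 1: the exact answer is q(t) = t^2 / 2.
traj = euler_simulate(lambda x, u: (x[1], u), (0.0, 0.0), lambda t: 1.0, 1.0)
q_final = traj[-1][0]   # close to 0.5, with O(dt) error
```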

Runge-Kutta methods Although Euler integration is efficient and easy to understand, it generally yields poor approximations. Taking a Taylor series expansion of x˜ at t = 0 yields

x(∆t) = x(0) + ∆t x˙(0) + ((∆t)^2 / 2!) x¨(0) + ((∆t)^3 / 3!) x^(3)(0) + · · · .    (14.17)

Comparing to (14.16), it can be seen that the Euler method uses the Taylor series only through its first-order term, even though the full series is an exact representation (if x˜ is analytic). Thus, the neglected terms reflect the approximation error. If x(t) is roughly linear, then the error may be small; however, if x˙(t) or higher order derivatives change quickly, then poor approximations are obtained.

Runge-Kutta methods are based on using higher order terms of the Taylor series expansion. One of the most widely used and efficient numerical integration methods is the fourth-order Runge-Kutta method. It is simple to implement and yields good numerical behavior in most applications. Also, it is generally recommended over Euler integration. The technique can be derived by performing a Taylor series expansion at x((1/2)∆t). This state itself is estimated in the approximation process.

The fourth-order Runge-Kutta integration method is

x(∆t) ≈ x(0) + (∆t / 6) (w1 + 2w2 + 2w3 + w4),    (14.18)

in which

w1 = f(x(0), u(0))
w2 = f(x(0) + (1/2)∆t w1, u((1/2)∆t))
w3 = f(x(0) + (1/2)∆t w2, u((1/2)∆t))    (14.19)
w4 = f(x(0) + ∆t w3, u(∆t)).

Although this is more expensive than Euler integration, the improved accuracy is usually worthwhile in practice. Note that the action is needed at three different times: 0, (1/2)∆t, and ∆t. If the action is constant over [0, ∆t), then the same value is used at all three times.

The approximation error depends on how quickly higher order derivatives of x˜ vary over time. This can be expressed using the remaining terms of the Taylor series. In practice, it may be advantageous to adapt ∆t over successive iterations of Runge-Kutta integration. In [247], for example, it is suggested that ∆t is scaled by (∆t/∆x)^(1/5), in which ∆x = ‖x(∆t) − x(0)‖, the Euclidean distance in R^n.
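For concreteness, a single fourth-order Runge-Kutta step following (14.18) and (14.19) can be sketched as follows; the double-integrator usage at the end is illustrative. Note that the action trajectory is sampled at the three required times.

```python
def rk4_step(f, x, u_traj, t0, dt):
    """One fourth-order Runge-Kutta step (14.18)-(14.19) for xdot = f(x, u).

    The action trajectory u_traj is sampled at t0, t0 + dt/2, and t0 + dt,
    which are the three times at which (14.19) needs the action.
    """
    def offset(x, w, c):
        # x + c * w, componentwise
        return tuple(xi + c * wi for xi, wi in zip(x, w))

    w1 = f(x, u_traj(t0))
    w2 = f(offset(x, w1, 0.5 * dt), u_traj(t0 + 0.5 * dt))
    w3 = f(offset(x, w2, 0.5 * dt), u_traj(t0 + 0.5 * dt))
    w4 = f(offset(x, w3, dt), u_traj(t0 + dt))
    return tuple(xi + (dt / 6.0) * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, w1, w2, w3, w4))

# Double integrator with u = 1: RK4 recovers q(dt) = dt^2 / 2 essentially
# exactly, since the true trajectory is a low-degree polynomial.
x1 = rk4_step(lambda x, u: (x[1], u), (0.0, 0.0), lambda t: 1.0, 0.0, 0.1)
```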

Multistep methods Runge-Kutta methods represent a popular trade-off between simplicity and efficiency. However, by focusing on the integration problem more carefully, it is often possible to improve efficiency further. The Euler and Runge-Kutta methods are often referred to as single-step methods. There exist multi-step methods, which rely on the fact that a sequence of integrations will be performed, in a manner analogous to incremental collision detection in Section 5.3.3. The key issues are ensuring that the methods properly initialize, ensuring numerical stability over time, and estimating error to adaptively adjust the step size. Many books on numerical analysis cover multi-step methods [51, 440, 863]. One of the most popular families is the Adams methods.

Multistep methods require more investment to understand and implement. For a particular application, the decision to pursue this route should be based on the relative costs of planning, collision detection, and numerical integration. If integration tends to dominate and efficiency is critical, then multi-step methods could improve running times dramatically over Runge-Kutta methods.

Black-box simulators For some problems, a state transition equation might not be available; however, it is still possible to compute future states given a current state and an action trajectory. This might occur, for example, in a complex software system that simulates the dynamics of an automobile or a collection of parts that bounce around on a table. In computer graphics applications, simulations may arise from motion capture data. Some simulators may even work internally with implicit differential constraints of the form gi(x, x˙, u) = 0, instead of x˙ = f(x, u). In such situations, many sampling-based planners can be applied because they rely only on the existence of the system simulator. The planning algorithm is thus shielded from the particular details of how the system is represented and integrated.

Reverse-time system simulation Some planning algorithms require integration in the reverse-time direction. For a given x(0) and an action trajectory u˜ that runs from −t to 0, the backward system simulator computes a state trajectory x˜ : [−t, 0] → X, which when integrated from −t to 0 under the application of u˜ yields x(0). This may seem like an inverse control problem [856] or a BVP as shown in Figure 14.10; however, it is much simpler. Determining the action trajectory for given initial and goal states is more complicated; in reverse-time integration, however, the action trajectory and final state are given, and the initial state does not need to be fixed.

The reverse-time version of (14.14) is

x(−∆t) = x(0) + ∫_0^{−∆t} f(x(t), u(t)) dt = x(0) + ∫_0^{∆t} −f(x(−t), u(−t)) dt,    (14.20)

which relies on the fact that x˙ = f(x, u) is time-invariant. Thus, reverse-time integration is obtained by simply negating the state transition equation. The Euler and Runge-Kutta methods can then be applied in the usual way to −f(x(t), u(t)).
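Reverse-time integration can thus reuse a forward integrator on the negated vector field. The sketch below applies the Euler method to −f; indexing the action trajectory by reverse time s ∈ [0, t] is an assumed convention for this example, not something fixed by the text.

```python
def reverse_euler_simulate(f, x_final, u_traj, t, dt=1e-3):
    """Integrate xdot = f(x, u) backward from x_final over [-t, 0] by
    applying forward Euler to the negated vector field -f, per (14.20).

    u_traj is indexed by reverse time s in [0, t]: u_traj(s) is the action
    applied s time units before the final state (an assumed convention).
    Returns the estimated initial state x(-t).
    """
    steps = max(1, round(t / dt))
    h = t / steps
    x = tuple(x_final)
    for i in range(steps):
        xdot = f(x, u_traj(i * h))
        x = tuple(xi - h * di for xi, di in zip(x, xdot))  # negated f
    return x

# Double integrator under u = 1 reaches (0.5, 1.0) from rest after t = 1,
# so integrating backward from (0.5, 1.0) should land near the origin.
x_init = reverse_euler_simulate(lambda x, u: (x[1], u), (0.5, 1.0),
                                lambda s: 1.0, 1.0)
```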

14.3.3 Local Planning

The methods of Chapter 5 were based on the existence of a local planning method (LPM) that is simple and efficient. This represented an important part of both the incremental sampling and searching framework of Section 5.4 and the sampling-based roadmap framework of Section 5.6. In the absence of obstacles and differential constraints, it is trivial to define an LPM that connects two configurations. They can, for example, be connected using the shortest path (geodesic) in C. The sampling-based roadmap approach from Section 5.6 relies on this simple LPM.

In the presence of differential constraints, the problem of constructing an LPM that connects two configurations or states is considerably more challenging. Recall from Section 14.1 that this is the classical BVP, which is difficult to solve for most systems. There are two main alternatives for handling this difficulty in a sampling-based planning algorithm:

  1. Design the sampling scheme, which may include careful selection of motion primitives, so that the BVP can be trivially solved.

Figure 14.10: Some methods in Chapter 15 can solve two-point boundary value problems in the absence of Xobs. This is difficult to obtain for most systems, but it is more powerful than the system simulator. It is very valuable, for example, in making a sampling-based roadmap that satisfies differential constraints.

  2. Design the planning algorithm so that as few BVPs as possible need to be solved. The LPM in this case does not specify precise goal states that must be reached.

Under the first alternative, the BVP solver can be considered as a black box, as shown in Figure 14.10, that efficiently connects xI to xG in the absence of obstacles. In the case of the Piano Mover’s Problem, this was obtained by moving along the shortest path in C. For many of the wheeled vehicle systems from Section 13.1.2, steering methods exist that could serve as an efficient BVP solver; see Section 15.5. Efficient techniques also exist for linear systems and are covered in Section 15.2.2.

If the BVP is efficiently solved, then virtually any sampling-based planning algorithm from Chapter 5 can be adapted to the case of differential constraints. This is achieved by using the module in Figure 14.10 as the LPM. For example, a sampling-based roadmap can use the computed solution in place of the shortest path through C. If the BVP solver is not efficient enough, then this approach becomes impractical because it must typically be used thousands of times to build a roadmap. The existence of an efficient module as shown in Figure 14.10 magically eliminates most of the complications associated with planning under differential constraints. The only remaining concern is that the solutions provided by the BVP solver could be quite long in comparison to the shortest path in the absence of differential constraints (for example, how far must the Dubins car travel to move slightly backward?).

Under the second alternative, it is assumed that solving the BVP is very costly. The planning method in this case should avoid solving BVPs whenever possible. Some planning algorithms may only require an LPM that approximately reaches intermediate goal states, which is simpler for some systems. Other planning algorithms may not require the LPM to make any kind of connection. The LPM may return a motion primitive that appears to make some progress in the search but is not designed to connect to a prescribed state. This usually involves incremental planning methods, which are covered in Section 14.4 and extend the methods of Sections 5.4 and 5.5 to handle differential constraints.

14.3.4 General Framework Under Differential Constraints

The framework presented here is a direct extension of the sampling and searching framework from Section 5.4.1 and includes the extension of Section 5.5 to allow the selection of any point in the swath of the search graph. This replaces the vertex selection method (VSM) by a swath-point selection method (SSM). The framework also naturally extends the discrete search framework of Section 2.2.4. The components are as follows:

  1. Initialization: Let G(V, E) represent an undirected search graph, for which the vertex set V contains a vertex for xI and possibly other states in Xfree, and the edge set E is empty. The graph can be interpreted as a topological graph with a swath S(G).

  2. Swath-point Selection Method (SSM): Choose a vertex xcur ∈ S(G) for expansion.

  3. Local Planning Method (LPM): Generate a motion primitive u˜p that, when integrated from xcur, yields a state trajectory x˜p : [0, tF ] → Xfree such that x˜p(0) = xcur and x˜p(tF ) = xr for some xr ∈ Xfree, which may or may not be a vertex in G. Using the system simulator, a collision detection algorithm, and by testing the phase constraints, u˜p must be verified to be violation-free. If this step fails, then go to Step 2.

  4. Insert an Edge in the Graph: Insert u˜p into E. Upon integration, u˜p yields a state trajectory from xcur to xr. If xr is not already in V, it is added. If xcur lies in the interior of an edge trajectory for some e ∈ E, then e is split by the introduction of a new vertex at xcur.

  5. Check for a Solution: Determine whether G encodes a solution path. In some applications, a small gap in the state trajectory may be tolerated.

  6. Return to Step 2: Iterate unless a solution has been found or some termination condition is satisfied. In the latter case, the algorithm reports failure.
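The six steps above can be sketched as a single loop in which the SSM, LPM, system simulator, and solution check are all caller-supplied placeholders; the one-dimensional usage example is purely illustrative and not from the text.

```python
def plan(x_init, select, lpm, simulate, is_goal, max_iters=1000):
    """Skeleton of the sampling-based framework under differential
    constraints; every component (select, lpm, simulate, is_goal) is a
    caller-supplied placeholder.

    Vertices are states; each edge stores the motion primitive that,
    when simulated, connects its endpoints. The LPM is assumed to return
    None on failure and a violation-free primitive otherwise.
    """
    V = {x_init}                               # Step 1: initialization
    E = []                                     # edges (x_from, primitive, x_to)
    for _ in range(max_iters):                 # Step 6: iterate
        x_cur = select(V)                      # Step 2: SSM
        primitive = lpm(x_cur)                 # Step 3: LPM (may fail)
        if primitive is None:
            continue
        x_new = simulate(x_cur, primitive)     # system simulator
        V.add(x_new)                           # Step 4: insert edge
        E.append((x_cur, primitive, x_new))
        if is_goal(x_new):                     # Step 5: solution check
            return V, E, x_new
    return V, E, None                          # failure
```

A toy usage: states are integers, the LPM always proposes a unit step, and the goal is any state at or beyond 10.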

The general framework may be applied in the same ways as in Section 5.4.1 to obtain unidirectional, bidirectional, and multidirectional searches. The issues from the Piano Mover’s Problem extend to motion planning under differential constraints. For example, bug traps cause the same difficulties, and as the number of trees increases, it becomes difficult to coordinate the search.

The main new complication is due to BVPs. See Figure 14.11. Recall from Section 14.1.1 that for most systems it is important to reduce the number of BVPs that must be solved during planning as much as possible. Assume that connecting precisely to a prescribed state is difficult. Figure 14.11a shows the best situation, in which forward, unidirectional search is used to enter a large goal region. In this case, no BVPs need to be solved. As the goal region is reduced, the problem

Figure 14.11: (a) Forward, unidirectional search for which the BVP is avoided. (b) Reaching the goal precisely causes a BVP. (c) Backward, unidirectional search also causes a BVP. (d) For bidirectional search, the BVP arises when connecting the trees.

becomes more challenging. Figure 14.11b shows the limiting case in which XG is a point {xG}. This requires the planning algorithm to solve at least one BVP.

Figure 14.11c shows the case of backward, unidirectional search. This has the effect of moving the BVP to xI. Since xI is precisely given (there is no “initial region”), the BVP cannot be avoided as in the forward case. If an algorithm produces a solution u˜ for which x(0) is very close to xI, and if XG is large, then it may be possible to salvage the solution. The system simulator can be applied to u˜ from xI instead of x(0). It is known that x˜(x(0), u˜) is violation-free, and x˜(xI, u˜) may travel close to x˜(x(0), u˜) at all times. This requires f to vary only a small amount with respect to changes in x (this would be implied by a small Lipschitz constant) and also requires ‖xI − x(0)‖ to be small. One problem is that the difference between points on the two trajectories usually increases as time increases. If it is verified by the system simulator that x˜(xI, u˜) is violation-free and the final state still lies in XG, then a solution can be declared.

For bidirectional search, a BVP must be solved somewhere in the middle of a trajectory, as shown in Figure 14.11d. This complicates the problem of determining whether the two trees can be connected. Once again, if the goal region is large, it may be possible to remove the gap in the middle of the trajectory by moving the starting state of the trajectory produced by the backward tree. Let u˜1 and u˜2 denote the action trajectories produced by the forward and backward trees, respectively. Suppose that their termination times are t1 and t2, respectively. The action trajectories can be concatenated to yield a function u˜ : [0, t1 + t2] → U by shifting the domain of u˜2 from [0, t2] to [t1, t1 + t2]. If t ≤ t1, then u(t) = u1(t); otherwise, u(t) = u2(t − t1). If there is a gap, the new state trajectory x˜(xI, u˜) must be checked using the system simulator to determine whether it is violation-free and terminates in XG. Multi-directional search becomes even more difficult because more BVPs are created. It is possible in principle to extend the ideas above to concatenate a sequence of action trajectories, which tries to remove all of the gaps.
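The concatenation of the two action trajectories can be written out directly; u1, u2, t1, and t2 below are placeholders for the trajectories and termination times described above.

```python
def concatenate_actions(u1, t1, u2, t2):
    """Concatenate two action trajectories into u on [0, t1 + t2] by
    shifting the domain of u2 from [0, t2] to [t1, t1 + t2]:
    u(t) = u1(t) for t <= t1, and u(t) = u2(t - t1) otherwise.

    Returns the combined trajectory and its termination time.
    """
    def u(t):
        if t <= t1:
            return u1(t)
        return u2(t - t1)
    return u, t1 + t2
```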

Consider the relationship between the search graph and reachability graphs. In the case of unidirectional search, the search graph is always a subset of a reachability graph (assuming perfect precision and no numerical integration error). In the forward case, the reachability graph starts at xI , and in the backward case it starts at xG. In the case of bidirectional search, there are two reachability graphs. It might be the case that vertices from the two coincide, which is another way that the BVP can be avoided. Such cases are unfortunately rare, unless xI and xG are intentionally chosen to cause this. For example, the precise location of xG may be chosen because it is known to be a vertex of the reachability graph from xI . For most systems, it is difficult to force this behavior. Thus, in general, BVPs arise because the reachability graphs do not have common vertices. In the case of multi-directional search, numerous reachability graphs are being explored, none of which may have vertices that coincide with vertices of others.
