14.3 Sampling-Based Motion Planning Revisited

Now that the preliminary concepts have been defined for motion planning under differential constraints, the focus shifts to extending the sampling-based planning methods of Chapter 5. This primarily involves extending the incremental sampling and searching framework from Section 5.4 to incorporate differential constraints. Following the general framework, several popular methods are covered in Section 14.4 as special cases of the framework. If an efficient BVP solver is available, then it may also be possible to extend the sampling-based roadmaps of Section 5.6 to handle differential constraints.
14.3.1 Basic Components

This section describes how Sections 5.1 to 5.3 are adapted to handle phase spaces and differential constraints.

Distance and volume in X

Recall from Chapter 5 that many sampling-based planning algorithms rely on measuring distances or volumes in C. If X = C, as in the wheeled systems from Section 13.1.2, then the concepts of Section 5.1 apply directly. The equivalent is needed for a general state space X, which may include phase variables in addition to the configuration variables. In most cases, the topology of the phase variables is trivial. For example, if x = (q, q˙), then each q˙i component is constrained to an interval of R. In this case, the velocity components form an axis-aligned rectangular region in R^(n/2), if n is the dimension of X. It is straightforward in this case to extend a measure and metric defined on C up to X by forming the Cartesian product.

A metric can be defined using the Cartesian product method given by (5.4). The usual difficulty arises of arbitrarily weighting different components and combining them into a single scalar function. In the case of C, this has involved combining translations and rotations. For X, this additionally includes velocity components, which makes it more difficult to choose meaningful weights.
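As a rough illustration of the weighted Cartesian product approach, the following sketch combines translation, rotation, and velocity differences for a hypothetical planar rigid body with state x = (x, y, θ, x˙, y˙, θ˙); the state layout and weights are arbitrary placeholders, not values prescribed by the text.

```python
import math

def weighted_state_metric(xa, xb, w_trans=1.0, w_rot=0.5, w_vel=0.1):
    """Combine translation, rotation, and velocity differences into one scalar.

    States are (x, y, theta, xdot, ydot, thetadot); the weights are
    arbitrary placeholders, illustrating the difficulty of choosing them.
    """
    dx, dy = xb[0] - xa[0], xb[1] - xa[1]
    # Angular difference measured on the circle S^1 (identification at 2*pi).
    dtheta = abs(xb[2] - xa[2]) % (2 * math.pi)
    dtheta = min(dtheta, 2 * math.pi - dtheta)
    # Euclidean distance over the three velocity components.
    dvel = math.sqrt(sum((xb[i] - xa[i]) ** 2 for i in range(3, 6)))
    return w_trans * math.hypot(dx, dy) + w_rot * dtheta + w_vel * dvel
```

Note that the angular component respects the identification of 0 and 2π, as required when extending a metric from C.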

Riemannian metrics A rigorous way to define a metric on a smooth manifold is to define a metric tensor (or Riemannian tensor), which is a quadratic function of two tangent vectors. This can be considered as an inner product on X, which can be used to measure angles. This leads to the definition of the Riemannian metric, which is based on the shortest paths (called geodesics) in X [133]. An example of this appeared in the context of Lagrangian mechanics in Section 13.4.1. The kinetic energy, (13.70), serves as the required metric tensor, and the geodesics are the motions taken by the dynamical system to conserve energy. The metric can be defined as the length of the geodesic that connects a pair of points. If the chosen Riemannian metric has some physical significance, as in the case of Lagrangian mechanics, then the resulting metric provides meaningful information. Unfortunately, it may be difficult or expensive to compute its value.

The ideal distance function The ideal way to define distance on X is to use a cost functional and then define the distance from x ∈ Xfree to x′ ∈ Xfree as the optimal cost-to-go from x to x′ while remaining in Xfree. In some contexts, this has also been referred to as the nonholonomic metric, Carnot-Carathéodory metric, or sub-Riemannian metric [596]. Note that this is not a true metric, as mentioned in Section 5.1.2, because the cost may not be symmetric. For example, traveling a small distance forward with the Dubins car is much shorter than traveling a small distance backward. If there are obstacles, it may not even be possible to reach configurations behind the car.

This concept of distance should be somewhat disturbing because it requires optimally solving the motion planning problem of Formulation 14.1. Thus, it cannot be practical for efficient use in a motion planning algorithm. Nevertheless, understanding this ideal notion of distance can be very helpful in designing practical distance functions on X. For example, rather than using a weighted Euclidean metric (often called a Mahalanobis metric) for the Dubins car, a distance function can be defined based on the length of the shortest path between two configurations. These lengths are straightforward to compute and are based on the optimal curve families that will be covered in Section 15.3. This distance function neglects obstacles, but it should still provide better distance information than the weighted Euclidean metric. It may also be useful for car models that involve dynamics.

The general idea is to get as close as possible to the optimal cost-to-go without having to perform expensive computations. It is often possible to compute a useful underestimate of the optimal cost-to-go by neglecting some of the constraints, such as obstacles or dynamics. This may help in applying A∗ search heuristics.
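As a small example of such an underestimate: ignoring both obstacles and dynamics leaves straight-line travel at some maximum speed, which can never overestimate the true optimal time-to-go and is therefore admissible for A∗ search. The speed bound vmax below is an assumed property of the system, not something given in the text.

```python
import math

def euclidean_underestimate(x, x_goal, vmax):
    """Lower bound on the optimal time-to-go, obtained by neglecting
    obstacles and dynamics: straight-line distance at an assumed
    maximum speed vmax.

    Since no feasible trajectory can beat the straight line traversed
    at full speed, this never overestimates, so it is admissible as an
    A* heuristic.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_goal)))
    return dist / vmax
```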

Defining measure As mentioned already, it is straightforward to extend a measure on C to X if the topology associated with the phase variables is trivial. It may not be possible, however, to obtain an invariant measure. In most cases, C is a transformation group, in which the Haar measure exists, thereby yielding the “true” volume in a sense that is not sensitive to parameterizations of C. This was observed for SO(3) in Section 5.1.4. For a general state space X, a Haar measure may not exist. If a Riemannian metric is defined, then intrinsic notions of surface integration and volume exist [133]; however, these may be difficult to exploit in a sampling-based planning algorithm.

Sampling theory

Section 14.2.2 already covered some of the sampling issues. There are at least two continuous spaces: X and the time interval T. In most cases, the action space U is also continuous. Each continuous space must be sampled in some way. In the limit, it is important that any sample sequence is dense in the space on which sampling occurs. This was required for the resolution completeness concepts of Section 14.2.2.

Sampling of T and U can be performed by directly using the random or deterministic methods of Section 5.2. Time is just an interval of R, and U is typically expressed as a convex m-dimensional subset of R^m. For example, U is often an axis-aligned rectangular subset of R^m.
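A minimal sketch of both options for a rectangular U follows, using random sampling and a simple deterministic grid; the bounds and resolution are placeholders.

```python
import random

def sample_action(u_lo, u_hi, rng=random.Random(0)):
    """Draw one random sample from an axis-aligned rectangular action set
    U = [u_lo[0], u_hi[0]] x ... x [u_lo[m-1], u_hi[m-1]]."""
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(u_lo, u_hi))

def grid_actions(u_lo, u_hi, k):
    """Deterministic alternative: k evenly spaced values per axis,
    combined into a grid over U (shown here only for m = 1 and m = 2)."""
    axes = [[lo + (hi - lo) * i / (k - 1) for i in range(k)]
            for lo, hi in zip(u_lo, u_hi)]
    if len(axes) == 1:
        return [(a,) for a in axes[0]]
    return [(a, b) for a in axes[0] for b in axes[1]]
```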

Some planning methods may require sampling on X. The definitions of discrepancy and dispersion from Section 5.2 can be easily adapted to any measure space and metric space, respectively. Even though it may be straightforward to define a good criterion, generating samples that optimize the criterion may be difficult or impossible.

A convenient way to avoid this problem is to work in a coordinate neighborhood of X. This makes the manifold appear as an n-dimensional region in R^n, which in many cases is rectangular. This enables the sampling concepts of Section 5.2 to be applied in a straightforward manner. While this is the most straightforward approach, the sampling quality depends on the particular parameterization used to define the coordinate neighborhood. Note that when working with a coordinate neighborhood (for example, by imagining that X is a cube), appropriate identifications must be taken into account.

Collision detection

As in Chapter 5, efficient collision detection algorithms are a key enabler of sampling-based planning. If X = C, then the methods of Section 5.3 directly apply. If X includes phase constraints, then additional tests must be performed. These constraints are usually given and are therefore straightforward to evaluate. Recall from Section 4.3 that this is not efficient for the obstacle constraints on C due to the complicated mapping between obstacles in W and obstacles in C.

If only pointwise tests are performed, the trajectory segment between the points is not guaranteed to stay in Xfree. This problem was addressed in Section 5.3.4 by using distance information from collision checking algorithms. The same problem exists for the phase constraints of the form hi(x) ≤ 0. In this general form, there is no additional information that can be used to ensure that some neighborhood of x is contained in Xfree. Fortunately, the phase constraints are not complicated in most applications, and it is possible to ensure that x is at least some distance away from the constraint boundary. In general, careful analysis of each phase constraint is required to ensure that the state trajectory segments are violation-free.

In summary, determining whether x ∈ Xfree involves

          1. Using a collision detection algorithm as in Section 5.3 to ensure that κ(x) ∈ Cfree.
          2. Checking x to ensure that other constraints of the form hi(x) ≤ 0 have been satisfied.
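These two steps can be combined into a single validity test. In the sketch below, kappa, the configuration-space collision detector, and the constraint functions hi are all caller-supplied placeholders standing in for the components described above.

```python
def state_is_valid(x, kappa, in_cfree, phase_constraints):
    """Decide whether x lies in Xfree.

    kappa: projects a state x to its configuration (a placeholder for
           the mapping used in Step 1).
    in_cfree: a configuration-space collision detector, standing in for
           the methods of Section 5.3.
    phase_constraints: a list of functions h_i for which h_i(x) <= 0
           is required.
    """
    if not in_cfree(kappa(x)):                        # Step 1: collision detection
        return False
    return all(h(x) <= 0 for h in phase_constraints)  # Step 2: h_i(x) <= 0
```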

Figure 14.9: Using a system simulator, the system x˙ = f(x, u) is integrated from x(0) using u˜t : [0, t] → U to produce a state trajectory x˜t : [0, t] → X. Sometimes x˜ is specified as a parameterized path, but most often it is approximated as a sequence of samples in X.

Entire trajectory segments should theoretically be checked. In practice, however, often only individual points are checked, which is more efficient but technically incorrect.

14.3.2 System Simulator

A new component is needed for sampling-based planning under differential constraints because of (14.1). Motions are now expressed in terms of an action trajectory, but collision detection and constraint satisfaction tests must be performed in X. Therefore, the system x˙ = f(x, u) needs to be integrated frequently during the planning process. Similar to the modeling of collision detection as a “black box,” the integration process is modeled as a module called the system simulator. See Figure 14.9. Since the systems considered in this chapter are time-invariant, the starting time for any required integration can always be shifted to start at t = 0. Integration can be considered as a module that implements (14.1) by computing the state trajectory resulting from a given initial state x(0), an action trajectory u˜t, and time t. The incremental simulator encapsulates the details of integrating the state transition equation so that they do not need to be addressed in the design of planners. However, information from the particular state transition equation may still be important in the design of the planning algorithm.

Closed-form solutions According to (14.1), the action trajectory must be integrated to produce a state trajectory. In some cases, this integration leads to a closed-form expression. For example, if the system is a chain of integrators, then a polynomial expression can easily be obtained for x(t). For example, suppose q is a scalar and q¨ = u. If q(0) = q˙(0) = 0 and a constant action u = 1 is applied, then q(t) = t^2/2. If x˙ = f(x, u) is a linear system (which includes chains of integrators; recall the definition from Section 13.2.2), then a closed-form expression for the state trajectory can always be obtained. This is based on matrix exponentials and is given in many control theory texts (e.g., [192]).
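The double-integrator example can be written out directly. This sketch evaluates the closed-form expression for a constant action from an arbitrary start state; the particular start values and action are illustrative.

```python
def double_integrator_state(t, u=1.0, q0=0.0, qd0=0.0):
    """Closed-form state trajectory for the double integrator qddot = u
    under a constant action, starting from (q0, qd0):

        q(t)    = q0 + qd0 * t + u * t^2 / 2
        qdot(t) = qd0 + u * t

    Returns the state (q(t), qdot(t)).
    """
    return (q0 + qd0 * t + 0.5 * u * t * t, qd0 + u * t)
```

With q(0) = q˙(0) = 0 and u = 1, this reproduces q(t) = t^2/2 from the text.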

Euler method For most systems, the integration must be performed numerically. A system simulator based on numerical integration can be constructed by breaking t into smaller intervals and iterating classical methods for computing numerical solutions to differential equations. The Euler method is the simplest of these methods. Let ∆t denote a small time interval over which the approximation will be made. This can be considered as an internal parameter of the system simulator. In practice, this ∆t is usually much smaller than the ∆t used in the discrete-time model of Section 14.2.2. Suppose that x(0) and u(0) are given and the task is to estimate x(∆t).

By performing integration over time, the state transition equation can be used to determine the state after some fixed amount of time ∆t has passed. For example, if x(0) is given and u(t′) is known over the interval t′ ∈ [0, ∆t], then the state at time ∆t can be determined as

x(∆t) = x(0) + ∫_0^{∆t} f(x(t), u(t)) dt.    (14.14)

The integral cannot be evaluated directly because x(t) appears in the integrand and is unknown for time t > 0.

Using the fact that

f(x, u) = x˙ = dx/dt ≈ (x(∆t) − x(0)) / ∆t,    (14.15)

solving for x(∆t) yields the classic Euler integration method

x(∆t) ≈ x(0) + ∆t f(x(0), u(0)). (14.16)

The approximation error depends on how quickly x(t) changes over time and on the length of the interval ∆t. If the planning algorithm applies a motion primitive u˜p, it gives tF (u˜p) as the time input, and the system simulator may subdivide the time interval to maintain higher accuracy. This allows the developer of the planning algorithm to ignore numerical accuracy issues.
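A minimal system simulator built on the Euler step (14.16), subdividing the requested time interval internally as described above; the double-integrator check at the end is a hypothetical usage example, not part of the text.

```python
def euler_simulate(f, x0, u_traj, t, dt=1e-3):
    """Approximate the state trajectory of xdot = f(x, u) by iterating
    the Euler step (14.16) over subintervals of length roughly dt.

    f: state transition function returning a tuple xdot.
    u_traj: action trajectory, a function of time on [0, t].
    Returns the list of state samples, starting with x0.
    """
    steps = max(1, round(t / dt))
    h = t / steps                       # internal step, an approximation of dt
    x = tuple(x0)
    samples = [x]
    for i in range(steps):
        xdot = f(x, u_traj(i * h))
        x = tuple(xi + h * di for xi, di in zip(x, xdot))
        samples.append(x)
    return samples

# Double integrator q'' = u with u = 1: the exact answer is q(t) = t^2 / 2.
traj = euler_simulate(lambda x, u: (x[1], u), (0.0, 0.0), lambda t: 1.0, 1.0)
q_final = traj[-1][0]   # close to 0.5, with O(dt) error
```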

Runge-Kutta methods Although Euler integration is efficient and easy to understand, it generally yields poor approximations. Taking a Taylor series expansion of x˜ at t = 0 yields

x(∆t) = x(0) + ∆t x˙(0) + ((∆t)^2 / 2!) x¨(0) + ((∆t)^3 / 3!) x^(3)(0) + · · · .    (14.17)

Comparing to (14.16), it can be seen that the Euler method uses the Taylor series only through its first-order term, even though the full series is an exact representation (if x˜ is analytic). Thus, the neglected terms reflect the approximation error. If x(t) is roughly linear, then the error may be small; however, if x˙(t) or higher order derivatives change quickly, then poor approximations are obtained.

Runge-Kutta methods are based on using higher order terms of the Taylor series expansion. One of the most widely used and efficient numerical integration methods is the fourth-order Runge-Kutta method. It is simple to implement and yields good numerical behavior in most applications. Also, it is generally recommended over Euler integration. The technique can be derived by performing a Taylor series expansion at x((1/2)∆t). This state itself is estimated in the approximation process.

The fourth-order Runge-Kutta integration method is

x(∆t) ≈ x(0) + (∆t / 6) (w1 + 2w2 + 2w3 + w4),    (14.18)

in which

w1 = f(x(0), u(0))
w2 = f(x(0) + (1/2)∆t w1, u((1/2)∆t))
w3 = f(x(0) + (1/2)∆t w2, u((1/2)∆t))    (14.19)
w4 = f(x(0) + ∆t w3, u(∆t)).

Although this is more expensive than Euler integration, the improved accuracy is usually worthwhile in practice. Note that the action is needed at three different times: 0, (1/2)∆t, and ∆t. If the action is constant over [0, ∆t), then the same value is used at all three times.

The approximation error depends on how quickly higher order derivatives of x˜ vary over time. This can be expressed using the remaining terms of the Taylor series. In practice, it may be advantageous to adapt ∆t over successive iterations of Runge-Kutta integration. In [247], for example, it is suggested that ∆t is scaled by (∆t/∆x)^(1/5), in which ∆x = ‖x(∆t) − x(0)‖, the Euclidean distance in R^n.
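For concreteness, a single fourth-order Runge-Kutta step following (14.18) and (14.19) can be sketched as follows; the double-integrator usage at the end is illustrative. Note that the action trajectory is sampled at the three required times.

```python
def rk4_step(f, x, u_traj, t0, dt):
    """One fourth-order Runge-Kutta step (14.18)-(14.19) for xdot = f(x, u).

    The action trajectory u_traj is sampled at t0, t0 + dt/2, and t0 + dt,
    which are the three times at which (14.19) needs the action.
    """
    def offset(x, w, c):
        # x + c * w, componentwise
        return tuple(xi + c * wi for xi, wi in zip(x, w))

    w1 = f(x, u_traj(t0))
    w2 = f(offset(x, w1, 0.5 * dt), u_traj(t0 + 0.5 * dt))
    w3 = f(offset(x, w2, 0.5 * dt), u_traj(t0 + 0.5 * dt))
    w4 = f(offset(x, w3, dt), u_traj(t0 + dt))
    return tuple(xi + (dt / 6.0) * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, w1, w2, w3, w4))

# Double integrator with u = 1: RK4 recovers q(dt) = dt^2 / 2 essentially
# exactly, since the true trajectory is a low-degree polynomial.
x1 = rk4_step(lambda x, u: (x[1], u), (0.0, 0.0), lambda t: 1.0, 0.0, 0.1)
```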

Multistep methods Runge-Kutta methods represent a popular trade-off between simplicity and efficiency. However, by focusing on the integration problem more carefully, it is often possible to improve efficiency further. The Euler and Runge-Kutta methods are often referred to as single-step methods. There exist multi-step methods, which rely on the fact that a sequence of integrations will be performed, in a manner analogous to incremental collision detection in Section 5.3.3. The key issues are ensuring that the methods properly initialize, ensuring numerical stability over time, and estimating error to adaptively adjust the step size. Many books on numerical analysis cover multi-step methods [51, 440, 863]. One of the most popular families is the Adams methods.

Multistep methods require more investment to understand and implement. For a particular application, the decision to pursue this route should be based on the relative costs of planning, collision detection, and numerical integration. If integration tends to dominate and efficiency is critical, then multi-step methods could improve running times dramatically over Runge-Kutta methods.

Black-box simulators For some problems, a state transition equation might not be available; however, it is still possible to compute future states given a current state and an action trajectory. This might occur, for example, in a complex software system that simulates the dynamics of an automobile or a collection of parts that bounce around on a table. In computer graphics applications, simulations may arise from motion capture data. Some simulators may even work internally with implicit differential constraints of the form gi(x, x˙, u) = 0, instead of x˙ = f(x, u). In such situations, many sampling-based planners can be applied because they rely only on the existence of the system simulator. The planning algorithm is thus shielded from the particular details of how the system is represented and integrated.

Reverse-time system simulation Some planning algorithms require integration in the reverse-time direction. For a given x(0) and an action trajectory u˜ that runs from −t to 0, the backward system simulator computes a state trajectory x˜ : [−t, 0] → X, which when integrated from −t to 0 under the application of u˜ yields x(0). This may seem like an inverse control problem [856] or a BVP as shown in Figure 14.10; however, it is much simpler. Determining the action trajectory for given initial and goal states is more complicated; in reverse-time integration, however, the action trajectory and final state are given, and the initial state does not need to be fixed.

The reverse-time version of (14.14) is

x(−∆t) = x(0) + ∫_0^{−∆t} f(x(t), u(t)) dt = x(0) + ∫_0^{∆t} −f(x(−t), u(−t)) dt,    (14.20)

which relies on the fact that x˙ = f(x, u) is time-invariant. Thus, reverse-time integration is obtained by simply negating the state transition equation. The Euler and Runge-Kutta methods can then be applied in the usual way to −f(x(t), u(t)).
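Reverse-time integration can thus reuse a forward integrator on the negated vector field. The sketch below applies the Euler method to −f; indexing the action trajectory by reverse time s ∈ [0, t] is an assumed convention for this example, not something fixed by the text.

```python
def reverse_euler_simulate(f, x_final, u_traj, t, dt=1e-3):
    """Integrate xdot = f(x, u) backward from x_final over [-t, 0] by
    applying forward Euler to the negated vector field -f, per (14.20).

    u_traj is indexed by reverse time s in [0, t]: u_traj(s) is the action
    applied s time units before the final state (an assumed convention).
    Returns the estimated initial state x(-t).
    """
    steps = max(1, round(t / dt))
    h = t / steps
    x = tuple(x_final)
    for i in range(steps):
        xdot = f(x, u_traj(i * h))
        x = tuple(xi - h * di for xi, di in zip(x, xdot))  # negated f
    return x

# Double integrator under u = 1 reaches (0.5, 1.0) from rest after t = 1,
# so integrating backward from (0.5, 1.0) should land near the origin.
x_init = reverse_euler_simulate(lambda x, u: (x[1], u), (0.5, 1.0),
                                lambda s: 1.0, 1.0)
```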

14.3.3 Local Planning

The methods of Chapter 5 were based on the existence of a local planning method (LPM) that is simple and efficient. This represented an important part of both the incremental sampling and searching framework of Section 5.4 and the sampling-based roadmap framework of Section 5.6. In the absence of obstacles and differential constraints, it is trivial to define an LPM that connects two configurations. They can, for example, be connected using the shortest path (geodesic) in C. The sampling-based roadmap approach from Section 5.6 relies on this simple LPM.

In the presence of differential constraints, the problem of constructing an LPM that connects two configurations or states is considerably more challenging. Recall from Section 14.1 that this is the classical BVP, which is difficult to solve for most systems. There are two main alternatives for handling this difficulty in a sampling-based planning algorithm:

  1. Design the sampling scheme, which may include careful selection of motion primitives, so that the BVP can be trivially solved.

Figure 14.10: Some methods in Chapter 15 can solve two-point boundary value problems in the absence of Xobs. This is difficult to obtain for most systems, but it is more powerful than the system simulator. It is very valuable, for example, in making a sampling-based roadmap that satisfies differential constraints.

  2. Design the planning algorithm so that as few BVPs as possible need to be solved. The LPM in this case does not specify precise goal states that must be reached.

Under the first alternative, the BVP solver can be considered as a black box, as shown in Figure 14.10, that efficiently connects xI to xG in the absence of obstacles. In the case of the Piano Mover’s Problem, this was obtained by moving along the shortest path in C. For many of the wheeled vehicle systems from Section 13.1.2, steering methods exist that could serve as an efficient BVP solver; see Section 15.5. Efficient techniques also exist for linear systems and are covered in Section 15.2.2.

If the BVP is efficiently solved, then virtually any sampling-based planning algorithm from Chapter 5 can be adapted to the case of differential constraints. This is achieved by using the module in Figure 14.10 as the LPM. For example, a sampling-based roadmap can use the computed solution in place of the shortest path through C. If the BVP solver is not efficient enough, then this approach becomes impractical because it must typically be used thousands of times to build a roadmap. The existence of an efficient module as shown in Figure 14.10 magically eliminates most of the complications associated with planning under differential constraints. The only remaining concern is that the solutions provided by the BVP solver could be quite long in comparison to the shortest path in the absence of differential constraints (for example, how far must the Dubins car travel to move slightly backward?).

Under the second alternative, it is assumed that solving the BVP is very costly. The planning method in this case should avoid solving BVPs whenever possible. Some planning algorithms may only require an LPM that approximately reaches intermediate goal states, which is simpler for some systems. Other planning algorithms may not require the LPM to make any kind of connection. The LPM may return a motion primitive that appears to make some progress in the search but is not designed to connect to a prescribed state. This usually involves incremental planning methods, which are covered in Section 14.4 and extend the methods of Sections 5.4 and 5.5 to handle differential constraints.

14.3.4 General Framework Under Differential Constraints

The framework presented here is a direct extension of the sampling and searching framework from Section 5.4.1 and includes the extension of Section 5.5 to allow the selection of any point in the swath of the search graph. This replaces the vertex selection method (VSM) by a swath-point selection method (SSM). The framework also naturally extends the discrete search framework of Section 2.2.4. The components are as follows:

  1. Initialization: Let G(V, E) represent an undirected search graph, for which the vertex set V contains a vertex for xI and possibly other states in Xfree, and the edge set E is empty. The graph can be interpreted as a topological graph with a swath S(G).

  2. Swath-point Selection Method (SSM): Choose a vertex xcur ∈ S(G) for expansion.

  3. Local Planning Method (LPM): Generate a motion primitive u˜p that, when integrated from xcur, yields a state trajectory x˜p : [0, tF ] → Xfree such that x˜p(0) = xcur and x˜p(tF ) = xr for some xr ∈ Xfree, which may or may not be a vertex in G. Using the system simulator, a collision detection algorithm, and by testing the phase constraints, u˜p must be verified to be violation-free. If this step fails, then go to Step 2.

  4. Insert an Edge in the Graph: Insert u˜p into E. Upon integration, u˜p yields a state trajectory from xcur to xr. If xr is not already in V, it is added. If xcur lies in the interior of an edge trajectory for some e ∈ E, then e is split by the introduction of a new vertex at xcur.

  5. Check for a Solution: Determine whether G encodes a solution path. In some applications, a small gap in the state trajectory may be tolerated.

  6. Return to Step 2: Iterate unless a solution has been found or some termination condition is satisfied. In the latter case, the algorithm reports failure.
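The six steps above can be sketched as a single loop in which the SSM, LPM, system simulator, and solution check are all caller-supplied placeholders; the one-dimensional usage example is purely illustrative and not from the text.

```python
def plan(x_init, select, lpm, simulate, is_goal, max_iters=1000):
    """Skeleton of the sampling-based framework under differential
    constraints; every component (select, lpm, simulate, is_goal) is a
    caller-supplied placeholder.

    Vertices are states; each edge stores the motion primitive that,
    when simulated, connects its endpoints. The LPM is assumed to return
    None on failure and a violation-free primitive otherwise.
    """
    V = {x_init}                               # Step 1: initialization
    E = []                                     # edges (x_from, primitive, x_to)
    for _ in range(max_iters):                 # Step 6: iterate
        x_cur = select(V)                      # Step 2: SSM
        primitive = lpm(x_cur)                 # Step 3: LPM (may fail)
        if primitive is None:
            continue
        x_new = simulate(x_cur, primitive)     # system simulator
        V.add(x_new)                           # Step 4: insert edge
        E.append((x_cur, primitive, x_new))
        if is_goal(x_new):                     # Step 5: solution check
            return V, E, x_new
    return V, E, None                          # failure
```

A toy usage: states are integers, the LPM always proposes a unit step, and the goal is any state at or beyond 10.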

The general framework may be applied in the same ways as in Section 5.4.1 to obtain unidirectional, bidirectional, and multidirectional searches. The issues from the Piano Mover’s Problem extend to motion planning under differential constraints. For example, bug traps cause the same difficulties, and as the number of trees increases, it becomes difficult to coordinate the search.

The main new complication is due to BVPs. See Figure 14.11. Recall from Section 14.1.1 that for most systems it is important to reduce the number of BVPs that must be solved during planning as much as possible. Assume that connecting precisely to a prescribed state is difficult. Figure 14.11a shows the best situation, in which forward, unidirectional search is used to enter a large goal region. In this case, no BVPs need to be solved. As the goal region is reduced, the problem

Figure 14.11: (a) Forward, unidirectional search for which the BVP is avoided. (b) Reaching the goal precisely causes a BVP. (c) Backward, unidirectional search also causes a BVP. (d) For bidirectional search, the BVP arises when connecting the trees.

becomes more challenging. Figure 14.11b shows the limiting case in which XG is a point {xG}. This requires the planning algorithm to solve at least one BVP.

Figure 14.11c shows the case of backward, unidirectional search. This has the effect of moving the BVP to xI. Since xI is precisely given (there is no “initial region”), the BVP cannot be avoided as in the forward case. If an algorithm produces a solution u˜ for which x(0) is very close to xI, and if XG is large, then it may be possible to salvage the solution. The system simulator can be applied to u˜ from xI instead of x(0). It is known that x˜(x(0), u˜) is violation-free, and x˜(xI, u˜) may travel close to x˜(x(0), u˜) at all times. This requires f to vary only a small amount with respect to changes in x (this would be implied by a small Lipschitz constant) and also requires ‖xI − x(0)‖ to be small. One problem is that the difference between points on the two trajectories usually increases as time increases. If it is verified by the system simulator that x˜(xI, u˜) is violation-free and the final state still lies in XG, then a solution can be declared.

For bidirectional search, a BVP must be solved somewhere in the middle of a trajectory, as shown in Figure 14.11d. This complicates the problem of determining whether the two trees can be connected. Once again, if the goal region is large, it may be possible to remove the gap in the middle of the trajectory by moving the starting state of the trajectory produced by the backward tree. Let u˜1 and u˜2 denote the action trajectories produced by the forward and backward trees, respectively. Suppose that their termination times are t1 and t2, respectively. The action trajectories can be concatenated to yield a function u˜ : [0, t1 + t2] → U by shifting the domain of u˜2 from [0, t2] to [t1, t1 + t2]. If t ≤ t1, then u(t) = u1(t); otherwise, u(t) = u2(t − t1). If there is a gap, the new state trajectory x˜(xI, u˜) must be checked using the system simulator to determine whether it is violation-free and terminates in XG. Multi-directional search becomes even more difficult because more BVPs are created. It is possible in principle to extend the ideas above to concatenate a sequence of action trajectories, which tries to remove all of the gaps.
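The concatenation of the two action trajectories can be written out directly; u1, u2, t1, and t2 below are placeholders for the trajectories and termination times described above.

```python
def concatenate_actions(u1, t1, u2, t2):
    """Concatenate two action trajectories into u on [0, t1 + t2] by
    shifting the domain of u2 from [0, t2] to [t1, t1 + t2]:
    u(t) = u1(t) for t <= t1, and u(t) = u2(t - t1) otherwise.

    Returns the combined trajectory and its termination time.
    """
    def u(t):
        if t <= t1:
            return u1(t)
        return u2(t - t1)
    return u, t1 + t2
```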

Consider the relationship between the search graph and reachability graphs. In the case of unidirectional search, the search graph is always a subset of a reachability graph (assuming perfect precision and no numerical integration error). In the forward case, the reachability graph starts at xI , and in the backward case it starts at xG. In the case of bidirectional search, there are two reachability graphs. It might be the case that vertices from the two coincide, which is another way that the BVP can be avoided. Such cases are unfortunately rare, unless xI and xG are intentionally chosen to cause this. For example, the precise location of xG may be chosen because it is known to be a vertex of the reachability graph from xI . For most systems, it is difficult to force this behavior. Thus, in general, BVPs arise because the reachability graphs do not have common vertices. In the case of multi-directional search, numerous reachability graphs are being explored, none of which may have vertices that coincide with vertices of others.
