14.6 Decoupled Planning Approaches

      14.6.1 Different Ways to Decouple the Big Problem

As sampling-based algorithms continue to improve along with computation power, it becomes increasingly feasible in practice to directly solve challenging planning problems under differential constraints. There are many situations, however, in which computing such solutions is still too costly due to expensive numerical integration, collision detection, and complicated obstacles in a high-dimensional state space. Decoupled approaches become appealing because they divide the big problem into modules that are each easier to solve. For versions of the Piano Mover’s Problem, such methods were already seen in Chapter 7. Section 7.1.3 introduced the velocity-tuning method to handle time-varying obstacles, and Section 7.2.2 presented decoupled approaches to coordinating multiple robots.

Ideally, we would like to obtain feedback plans on any state space in the presence of obstacles and differential constraints. This assumes that the state can be reliably measured during execution. Section 14.5 provided the best generic techniques for solving the problem, but they are unfortunately limited to a few dimensions. If there is substantial sensing uncertainty, then the feedback plan must be defined on the I-space, which was covered in Chapter 11. Back in Section 1.4, Figure 1.19 showed a popular model of decoupling the big planning problem into a sequence of refinements. A typical decoupled approach involves four modules:

        1. Use a motion planning algorithm to find a collision-free path τ : [0, 1] → Cfree.

        2. Transform τ into a new path τ′ so that velocity constraints on C (if there are any) are satisfied. This might, for example, ensure that the Dubins car can actually follow the path. At the very least, some path smoothing is needed in most circumstances.

        3. Compute a timing function σ : [0, tF ] → [0, 1] for τ′ so that τ′ ◦ σ is a time-parameterized path through Cfree with the following requirement. The state trajectory x˜ must satisfy x˙ = f(x(t), u(t)) and u(t) ∈ U(x(t)) for all time, until uT is applied at time tF.

        4. Design a feedback plan (or feedback control law) π : X → U that tracks x˜. The plan should attempt to minimize the error between the desired state and the measured state during execution.
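The four modules can be viewed as a simple pipeline in which each stage may fail. A minimal sketch, in which `plan_path`, `transform_path`, `compute_timing`, and `make_tracker` are hypothetical stand-ins for the modules rather than a prescribed API:

```python
# A sketch of the four-module decoupled pipeline. Each callback is a
# hypothetical stand-in and may report failure by returning None.

def decoupled_pipeline(plan_path, transform_path, compute_timing, make_tracker):
    tau = plan_path()                       # Module 1: collision-free path in Cfree
    if tau is None:
        return None
    tau_prime = transform_path(tau)         # Module 2: satisfy velocity constraints
    if tau_prime is None:
        return None
    sigma = compute_timing(tau_prime)       # Module 3: timing function sigma
    if sigma is None:
        return None
    return make_tracker(tau_prime, sigma)   # Module 4: feedback tracking plan
```

A failure at any stage propagates out as `None`, reflecting the fact that completeness may be lost by a poor choice in an earlier module.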

Given recent techniques and computation power, the significance of this approach may diminish somewhat; however, it remains an important way to decompose and solve problems. Be aware that this decomposition is arbitrary. If every module can be solved, then it is sufficient for producing a solution; however, such a decomposition is not necessary. At any step along the way, completeness may be lost because of poor choices in earlier modules. It is often difficult for modules to take into account problems that may arise later.

Various ways to merge the modules have been considered. The methods of Section 14.4 solve either: 1) the first two modules simultaneously, if paths that satisfy q˙ = f(q, u) are computed through Cfree, or 2) the first three modules simultaneously, if paths that satisfy x˙ = f(x, u) are computed through Xfree. Section 14.5 solved all four modules simultaneously but was limited to low-dimensional state spaces.

Now consider keeping the modules separate. Planning methods from Part II can be applied to solve the first module. Section 14.6.2 will cover methods that implement the second module. Section 14.6.3 will cover methods that solve the third module, possibly while also solving the second module. The fourth module is a well-studied control problem that is covered in numerous texts [523, 846, 856].

      14.6.2 Plan and Transform

For the decoupled approach in this section, assume that X = C, which means there are only velocity constraints, as covered in Section 13.1. The system may be specified as q˙ = f(q, u) or implicitly as a set of constraints of the form gi(q, q˙) = 0. The ideas in this section can easily be extended to phase spaces. The method given here was developed primarily by Laumond (see [596]) and was also applied to the simple car of Section 13.1.2 in [587]; other applications of the method are covered in [596].

PLAN-AND-TRANSFORM APPROACH

        1. Compute a path τ : [0, 1] → Cfree using a motion planning algorithm, such as one from Part II.

        2. Choose some s1, s2 ∈ [0, 1] such that s1 < s2, and use an LPM to attempt to replace the portion of τ from τ(s1) to τ(s2) with a path γ that satisfies the differential constraints.

        3. If τ now satisfies the differential constraints over all of [0, 1], then the algorithm terminates. Otherwise, go to Step 2.

Figure 14.21: A general outline of the plan-and-transform approach.

An outline of the plan-and-transform approach is shown in Figure 14.21. In the first step, a collision-free path τ : [0, 1] → Cfree is computed by ignoring differential constraints. The path is then iteratively modified until it satisfies the constraints. In each iteration, a subinterval [s1, s2] ⊆ [0, 1] is selected by specifying some s1, s2 ∈ [0, 1] so that s1 < s2. These points may be chosen using random sequences or may be chosen deterministically. The approach may use binary subdivision to refine intervals and gradually improve the resolution on [0, 1] over the iterations.

For each chosen interval [s1, s2], an LPM is used to compute a path segment γ : [0, 1] → Cfree that satisfies the conditions γ(0) = τ(s1) and γ(1) = τ(s2). The LPM might fail, either because it cannot connect the two configurations or because a collision occurs. In this case, another subinterval is chosen, and the process repeats. Each time the LPM succeeds, τ is updated to τ′ as

        ⎧ τ(s)                    if s < s1
τ′(s) = ⎨ γ((s − s1)/(s2 − s1))  if s ∈ [s1, s2]          (14.30)
        ⎩ τ(s)                    if s > s2.

The argument to γ reparameterizes it to run from s1 to s2, instead of 0 to 1.
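The update (14.30) is a simple splice of the two paths; a minimal sketch, representing paths as functions of s:

```python
def splice_path(tau, gamma, s1, s2):
    """Return tau' per (14.30): tau with the portion over [s1, s2] replaced
    by gamma, reparameterized so gamma runs from s1 to s2."""
    def tau_prime(s):
        if s < s1 or s > s2:
            return tau(s)
        return gamma((s - s1) / (s2 - s1))
    return tau_prime
```

Continuity of τ′ at s1 and s2 relies on the LPM guaranteeing γ(0) = τ(s1) and γ(1) = τ(s2).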

Example 14.5 (Plan-and-Transform for the Dubins Car) For a concrete example, suppose that the task is to plan a path for the Dubins car. Figure 14.22 shows a path τ that might be computed by a motion planning algorithm that ignores differential constraints. Two sharp corners cannot be traversed by the car. Suppose that s1 and s2 are chosen at random, and appear at the locations shown in Figure 14.22. The portion of τ between τ (s1) and τ (s2) needs to be replaced by a path that can be executed by the Dubins car. Note that matching the orientations at τ (s1) and τ (s2) is important because they are part of the configuration.

A replacement path γ is shown in Figure 14.23. This is obtained by implementing the following LPM. For the Dubins car, a path between any configurations can be found by drawing circles at the starting and stopping configurations as shown

Figure 14.22: An initial path that ignores differential constraints.

Figure 14.23: A path for the Dubins car can always be found by connecting a bitangent to two circles generated by the minimum turning radius. The path is not necessarily optimal; see Section 15.3.1 for optimal paths.

in the figure. Each circle corresponds to the sharpest possible left turn or right turn. It is straightforward to find a line that is tangent to one circle from each configuration and also matches the direction of flow for the car (the circles are like one-way streets). Using γ, the path τ is updated to obtain τ ′, which is shown in Figure 14.24, and satisfies the differential constraints for the Dubins car. This problem was very simple, and in practice dozens of iterations may be necessary to replace path segments. Also, if randomization is used, then intervals of the form [0, s] and [s, 1] must not be neglected.
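The bitangent construction can be sketched for one assumed case, a right-straight-right (RSR) connection in which both minimum-radius turning circles lie to the right of their configurations; a full LPM would also consider the other circle and tangent combinations:

```python
import math

def dubins_rsr_length(q0, q1, r=1.0):
    """Length of the right-straight-right Dubins connection between
    configurations (x, y, theta), assuming minimum turning radius r."""
    x0, y0, t0 = q0
    x1, y1, t1 = q1
    # Centers of the right-turn circles, offset r to the right of each heading.
    c0 = (x0 + r * math.sin(t0), y0 - r * math.cos(t0))
    c1 = (x1 + r * math.sin(t1), y1 - r * math.cos(t1))
    dx, dy = c1[0] - c0[0], c1[1] - c0[1]
    d = math.hypot(dx, dy)
    phi = math.atan2(dy, dx)               # heading of the straight bitangent
    arc0 = (t0 - phi) % (2.0 * math.pi)    # clockwise arc on the first circle
    arc1 = (phi - t1) % (2.0 * math.pi)    # clockwise arc on the second circle
    return r * arc0 + d + r * arc1
```

For equal radii, the external tangent on the matching side is parallel to the line between centers, which is why the straight segment simply takes the heading of that line.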

Example 14.5 seemed easy because of the existence of a simple local planner. Also, there were no obstacles. Imagine that τ instead traveled through a narrow, zig-zagging corridor. In this case, a solution might not even exist because of sharp corners that cannot be turned by the Dubins car. If a single obstacle had happened to intersect the loop in Figure 14.24, then the replacement would have failed. In general, there is no guarantee that the replacement segment

Figure 14.24: Upon replacement, the resulting path τ ′ can be followed by the Dubins car.

is collision-free. It is important for the LPM to construct path segments that are as close as possible to the original path. For the Dubins car, this is not possible in many cases. For example, moving the Dubins car a small distance backward requires moving along the circles shown in Figure 14.23. Even as the distance between two configurations is reduced, the distance that the car needs to travel does not approach zero. This is true even if the shortest possible paths are used for the Dubins car.

What property should an LPM have to ensure resolution completeness of the plan-and-transform approach? A sufficient condition is given in [596]. Let ρ denote a metric on X. An LPM is said to satisfy the topological property if and only if the following statement holds: For any ε > 0, there exists some δ > 0 such that for any pair q, q′ ∈ Cfree with ρ(q, q′) < δ, the path γ produced by the LPM satisfies ρ(γ(s), q) < ε for all s ∈ [0, 1]. If an LPM satisfies the topological property, then any collision-free path through Cfree can be transformed into one that satisfies the differential constraints.

Suppose that a path τ has some clearance of at least ε in Cfree. By dividing the domain of τ into intervals so that the change in q is no more than δ over each interval, the LPM will produce collision-free path segments for replacement. It turns out that such an LPM can be designed for the Reeds-Shepp car (which has reverse) because it is small-time locally controllable, a property that will be covered in Sections 15.1.3 and 15.4. In general, many techniques from Chapter 15 may be useful for analyzing and designing effective LPMs.

An interesting adaptation of the plan-and-transform approach has been developed for problems that involve k implicit constraints of the form gi(q, q˙) = 0. An outline of the multi-level approach, which was introduced in [859], is shown in Figure 14.25 (a similar approach was also introduced in [333]). The idea is to sort the k constraints into a sequence and introduce them one at a time. Initially, a path is planned that ignores the constraints. This path is first transformed to satisfy g1(q, q˙) = 0 and avoid collisions by using the plan-and-transform method of Figure 14.21. If successful, then the resulting path is transformed into one that is collision-free and satisfies both g1(q, q˙) = 0 and g2(q, q˙) = 0. This process repeats by adding one constraint each time, until either the method fails or all k

MULTI-LEVEL APPROACH

  1. Compute a path τ : [0, 1] → Cfree using a standard motion planning algorithm (as in Part II), and let i = 1.

  2. Transform τ into a collision-free path that satisfies gj(q, q˙) = 0 for all j from 1 to i.

  3. If the transformation failed in Step 2, then terminate and report failure.

  4. If i < k, the number of implicit velocity constraints, then increment i and go to Step 2. Otherwise, terminate and successfully report τ as a path that satisfies all constraints.

Figure 14.25: The multi-level approach considers implicit constraints one at a time.

constraints have been taken into account.
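The loop of Figure 14.25 is easy to sketch, with `plan` and `transform` as hypothetical callbacks standing in for a motion planner and for the plan-and-transform method restricted to the first i constraints:

```python
def multi_level(plan, transform, k):
    """Sketch of the multi-level loop (Figure 14.25). plan() returns an initial
    path that ignores all constraints; transform(tau, i) returns a
    collision-free path satisfying g_1, ..., g_i, or None on failure."""
    tau = plan()
    if tau is None:
        return None
    for i in range(1, k + 1):     # introduce the constraints one at a time
        tau = transform(tau, i)
        if tau is None:
            return None           # Step 3: transformation failed
    return tau
```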

      14.6.3 Path-Constrained Trajectory Planning

This section assumes that a path τ : [0, 1] → Cfree has been given. It may be computed by a motion planning algorithm from Part II or given by hand. The remaining task is to determine the speed along the path in a way that satisfies differential constraints on the phase space X. Assume that each state x ∈ X represents both a configuration and its time derivative, to obtain x = (q, q˙). Let n denote the dimension of C; hence, the dimension of X is 2n. Once a path is given, there are only two remaining degrees of freedom in X: 1) the position s ∈ [0, 1] along the domain of τ, and 2) the speed s˙ = ds/dt at each s. The full state, x, can be recovered from these two parameters. As the state changes, it must satisfy a given system, x˙ = f(x, u). It will be seen that a 2D planning problem arises, which can be solved efficiently using many alternative techniques. Similar concepts appeared for decoupled versions of time-varying motion planning in Section 7.1. The presentation in the current section is inspired by work in time-scaling paths for robot manipulators [456, 876, 879], which was developed a couple of decades ago. At that time, computers were much slower, which motivated the development of strongly decoupled approaches.

Expressing systems in terms of s, s˙, and s¨

Suppose that a system is given in the form

q¨ = h(q, q˙, u), (14.31)

in which there are n action variables u = (u1, . . . , un). It may be helpful to glance ahead to Example 14.6, which will illustrate the coming concepts for the simple case of double integrators q¨ = u. The acceleration in C is determined from the state x = (q, q˙) and action u. Assume u ∈ U, in which U is an n-dimensional subset of Rn. If h is nonsingular at x, then an n-dimensional set of possible accelerations arises from choices of u ∈ U. This means the system is fully actuated. If there were fewer than n action variables, then there would generally not be enough freedom to follow a specified path. Therefore, U must be n-dimensional. Which choices of u, however, constrain the motion to follow the given path τ? To determine this, the q, q˙, and q¨ variables need to be related to the path domain s and its first and second time derivatives s˙ and s¨, respectively. This leads to a subset of U that corresponds to actions that follow the path.

Suppose that s, s˙, s¨, and a path τ are given. The configuration q ∈ Cfree is

q = τ(s).          (14.32)

Assume that all first and second derivatives of τ exist. The velocity q˙ can be determined by the chain rule as

q˙ = (dτ/ds)(ds/dt) = (dτ/ds) s˙,          (14.33)

in which the derivative dτ/ds is evaluated at s. The acceleration is obtained by taking another derivative, which yields

q¨ = d/dt ((dτ/ds) s˙) = (d²τ/ds²)(ds/dt) s˙ + (dτ/ds) s¨ = (d²τ/ds²) s˙² + (dτ/ds) s¨,          (14.34)

by application of the product rule. The full state x = (q, q˙) can be recovered from (s, s˙) using (14.32) and (14.33).
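As a sanity check, (14.32) through (14.34) can be evaluated numerically for an assumed example path, here τ(s) = (cos s, sin s):

```python
import math

def path_state(tau, dtau, d2tau, s, sdot, sddot):
    """Recover q, qdot, qddot from (s, sdot, sddot) via (14.32)-(14.34)."""
    q    = tau(s)                                           # (14.32)
    qdot = tuple(a * sdot for a in dtau(s))                 # (14.33)
    qdd  = tuple(c * sdot ** 2 + a * sddot
                 for a, c in zip(dtau(s), d2tau(s)))        # (14.34)
    return q, qdot, qdd

# Assumed example path and its derivatives with respect to s.
tau   = lambda s: (math.cos(s), math.sin(s))
dtau  = lambda s: (-math.sin(s), math.cos(s))
d2tau = lambda s: (-math.cos(s), -math.sin(s))
```

Note that only (s, s˙) is needed to recover the state x = (q, q˙); s¨ enters only through the acceleration.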

The next step is to obtain an equation that looks similar to (14.31), but is expressed in terms of s, s˙, and s¨. A function h′(s, s˙, u) can be obtained from h(q, q˙, u) by substituting τ(s) for q and the right side of (14.33) for q˙:

h′(s, s˙, u) = h(τ(s), (dτ/ds) s˙, u).          (14.35)

This yields

q¨ = h′(s, s˙, u).          (14.36)

For a given state x (which can be obtained from s and s˙), the set of accelerations that can be obtained by a choice of u in (14.36) is the same as that for the original system in (14.31). The only difference is that x is now constrained to a 2D subset of X, which are the states that can be reached by selecting values for s and s˙.

Applying (14.34) to the left side of (14.36) constrains the accelerations to cause motions that follow τ. This yields

(d²τ/ds²) s˙² + (dτ/ds) s¨ = h′(s, s˙, u),          (14.37)

which can also be expressed as

(dτ/ds) s¨ = h′(s, s˙, u) − (d²τ/ds²) s˙²,          (14.38)

by moving the first term of (14.34) to the right. Note that n equations are actually represented in (14.38). For each i in which dτi/ds ≠ 0, a constraint of the form

s¨ = (h′i(s, s˙, u) − (d²τi/ds²) s˙²) / (dτi/ds)          (14.39)

is obtained by solving for s¨.

Determining the allowable accelerations

The actions in U that cause τ to be followed can now be characterized. An action u ∈ U follows τ if and only if every equation of the form (14.39) is satisfied. If dτi/ds ≠ 0 for all i from 1 to n, then n such equations exist. Suppose that u1 is chosen, and the first equation is solved for s¨. The required values of the remaining action variables u2, . . ., un can be obtained by substituting the determined s¨ value into the remaining n − 1 equations. This means that the actions that follow τ form at most a one-dimensional subset of U.

If dτi/ds = 0 for some i, then following the path requires that q˙i = 0. Instead of (14.39), the constraint is that hi(q, q˙, u) = 0. Example 14.6 will provide a simple illustration of this. If dτi/ds = 0 for all i, then the configuration is not allowed to change. This occurs in the degenerate (and useless) case in which τ is a constant function.
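For the special case of double integrators q¨ = u (treated in Example 14.6, just ahead), this relationship can be run explicitly in the other direction: choosing s¨ determines the entire action vector, confirming that the path-following actions form a one-parameter family. A minimal sketch:

```python
def action_on_path(dtau, d2tau, sdot, sddot):
    """For double integrators qddot = u, the unique action that keeps the
    system on the path: u_i = (dtau_i/ds) sddot + (d2tau_i/ds2) sdot^2,
    obtained by inverting (14.39) with h'_i = u_i."""
    return [a * sddot + c * sdot ** 2 for a, c in zip(dtau, d2tau)]
```

Varying s¨ while holding (s, s˙) fixed moves u along a straight line in U whose direction is dτ/ds.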

In many cases, a value of u does not exist that satisfies all of the constraint equations. This means that the path cannot be followed at that particular state. Such states should be removed, if possible, by defining phase constraints on X. For a poor choice of path τ, violating such a phase constraint may be unavoidable: there may exist some s for which no u ∈ U can follow τ, regardless of s˙.

Even if a state trajectory is optimal in some sense, its quality ultimately depends on the given path τ : [0, 1] → Cfree. Consider the path shown in Figure 14.26. At τ(1/3), a “corner” is reached. This violates the differentiability assumption and would require infinite acceleration to traverse while remaining on τ. For some models, it may be possible to stop at τ(1/3) and then start again. For example, imagine a floating particle in the plane. It can be decelerated to rest exactly at τ(1/3) and then started in a new direction to exactly follow the curve. This

assumes that the particle is fully actuated. If there are nonholonomic constraints on C, as in the case of the Dubins car, then the given path must at least satisfy

Figure 14.26: A bad path for path-constrained trajectory planning.

them before accelerations can be considered. The solution in this case depends on the existence of decoupling vector fields [157, 224].

It is generally preferable to round off any corners that might have been produced by a motion planning algorithm in constructing τ. This helps, but it still does not completely resolve the issue. The portion of the path around τ(2/3) is not desirable because of high curvature. At a fixed speed, larger accelerations are generally needed to follow sharp turns. The speed may have to be decreased simply because τ carelessly requires sharp turns in C. Imagine developing an autonomous double-decker tour bus. It is clear that following the curve around τ(2/3) may cause the bus to topple at high speeds. The bus will have to slow down because it is a slave to the particular choice of τ.

The path-constrained phase space

Recall the approach in Section 14.4.1 that enabled systems of the form q¨ = h(q, q˙, u) to be expressed as q¨ = u′ for some suitable U′(q, q˙) ⊆ U (this was illustrated in Figure 14.15). This enabled many systems to be imagined as multiple, independent double integrators with phase-dependent constraints on the action space. The same idea can be applied here to obtain a single double integrator.

Let S denote a 2D path-constrained phase space, in which each element is of the form (s, s˙) and represents the position and velocity along τ. This parameterizes a 2D subset of the original phase space X. Each original state vector is x = (q, q˙) = (τ(s), (dτ/ds) s˙). Which accelerations are possible at points in S? At each (s, s˙), a subset of U can be determined that satisfies the equations of the form (14.39).

Each valid action yields an acceleration s¨. Let U′(s, s˙) ⊆ R denote the set of all values of s¨ that can be obtained from an action u ∈ U that satisfies (14.39) for

each i (except the ones for which dτi/ds = 0). Now the system can be expressed as s¨ = u′, in which u′ ∈ U′(s, s˙). After all of this work, we have arrived at the double integrator. The main complication is that U′(s, s˙) can be challenging to determine for some systems. It could consist of a single interval, disjoint intervals, or may even be empty. Assuming that U′(s, s˙) has been characterized, it is straightforward to solve the remaining planning problem using techniques already presented in this chapter. One double integrator is not very challenging; hence, efficient sampling-based algorithms exist.

An obstacle region Sobs ⊂ S will now be considered. This includes any states that do not map into Xfree. Given s and s˙, the state x can be computed to determine whether any constraints on X are violated. Usually, τ is constructed to avoid obstacle collision; however, some phase constraints may also exist. The obstacle region Sobs also includes any points (s, s˙) for which U′(s, s˙) is empty. Let Sfree denote S \ Sobs.

Before considering computation methods, we give some examples.

Example 14.6 (Path-Constrained Double Integrators) Consider the case of two double integrators. This could correspond physically to a particle moving

in R2. Hence, C = W = R2. Let U = [−1, 1]2 and q¨ = u for u ∈ U. The

path τ will be chosen to force the particle to move along a line. For linear paths, dτ/ds is constant and d²τ/ds² = 0. Using these observations and the fact that h′(s, s˙, u) = u, (14.39) simplifies to

s¨ = ui / (dτi/ds),          (14.40)

for i = 1, 2.

Suppose that τ(s) = (s, s), which means that the particle must move along a diagonal line through the origin of C. This further simplifies (14.40) to s¨ = u1 and s¨ = u2. Hence, any u1 ∈ [−1, 1] may be chosen, but u2 must then be chosen as u2 = u1. The constrained system can be written as one double integrator s¨ = u′, in which u′ ∈ [−1, 1]. Both u1 and u2 are derived from u′ as u1 = u2 = u′. Note that U′ does not vary over S; this occurs because a linear path is degenerate.

Now consider constraining the motion to a general line:

τ (s) = (a1s + b1, a2s + b2), (14.41)

in which a1 and a2 are nonzero. In this case, (14.40) yields s¨ = u1/a1 and s¨ = u2/a2. Since each ui ∈ [−1, 1], each equation indicates that s¨ ∈ [−1/|ai|, 1/|ai|]. The acceleration must lie in the intersection of these two intervals. If |a1| ≥ |a2|, then s¨ ∈ [−1/|a1|, 1/|a1|]. We can designate u′ = u1 and let u2 = u′a2/a1. If |a2| > |a1|, then s¨ ∈ [−1/|a2|, 1/|a2|], u′ = u2, and u1 = u′a1/a2.

Suppose that a1 = 0 and a2 ≠ 0. The path is

τ (s) = (q1, a2s + b2), (14.42)

in which q1 is fixed and the particle is constrained to move along a vertical line in C = R2. In this case, only one constraint, s¨ = u2/a2, is obtained from (14.40). However, u1 is independently constrained to u1 = 0 because horizontal motions are prohibited.

If n independent double integrators are constrained to a line, a similar result is obtained. There are n equations of the form (14.40). The i ∈ {1, . . . , n} for which |ai| is largest determines the acceleration range as s¨ ∈ [−1/|ai|, 1/|ai|]. The action u′ is defined as u′ = ui, and the uj for j ≠ i are obtained from the remaining n − 1 equations.

Now assume τ is nonlinear, in which case (14.39) becomes

s¨ = (ui − (d²τi/ds²) s˙²) / (dτi/ds),          (14.43)

for each i for which dτi/ds ≠ 0. Now the set U′(s, s˙) varies over S. As the speed s˙ increases, it becomes less likely that U′(s, s˙) is nonempty. In other words, it is less likely that a solution exists to all equations of the form (14.43). In a physical system, this means that staying on the path requires turning too sharply. At a high speed, this may require an acceleration q¨ that lies outside of [−1, 1]^n.
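The interval U′(s, s˙) implied by (14.43) for unit-bounded double integrators can be computed by intersecting the per-coordinate intervals. A minimal sketch, passing dτ/ds and d²τ/ds² as tuples and assuming |ui| ≤ umax:

```python
def u_prime_interval(dtau, d2tau, sdot, umax=1.0):
    """Intersect the per-coordinate sddot intervals from (14.43) for
    qddot = u with |u_i| <= umax. Returns (lo, hi), or None when
    U'(s, sdot) is empty."""
    lo, hi = float('-inf'), float('inf')
    for a, c in zip(dtau, d2tau):
        if a == 0.0:
            continue  # this coordinate constrains u_i directly, not sddot
        b1 = (-umax - c * sdot ** 2) / a
        b2 = ( umax - c * sdot ** 2) / a
        lo = max(lo, min(b1, b2))   # min/max handle the sign of dtau_i/ds
        hi = min(hi, max(b1, b2))
    return (lo, hi) if lo <= hi else None
```

At low speed a curved path still admits a nonempty interval, but as s˙ grows the curvature terms shift the per-coordinate intervals apart until the intersection vanishes.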

The same ideas can be applied to systems that are much more complicated. This should not be surprising because in Section 14.4.1 systems of the form q¨ = h(q, q˙, u) were interpreted as multiple, independent double integrators of the form q¨ = u′, in which u′ ∈ U′(q, q˙) provided the possible accelerations. Under this interpretation, and in light of Example 14.6, constraining the motions of a general system to a path τ just further restricts U′(q, q˙). The resulting set of allowable accelerations is at most one-dimensional.

The following example indicates the specialization of (14.39) for a robot arm.

Example 14.7 (Path-Constrained Manipulators) Suppose that the system is described as (13.142) from Section 13.4.2. This is a common form that has been used for controlling robot arms for decades. Constraints of the form (14.39) can be derived by expressing q, q˙, and q¨ in terms of s, s˙, and s¨. This requires using (14.32), (14.33), and (14.34). Direct substitution into (13.142) yields

M(τ(s)) ((d²τ/ds²) s˙² + (dτ/ds) s¨) + C(τ(s), (dτ/ds) s˙) (dτ/ds) s˙ + g(τ(s)) = u.          (14.44)

This can be simplified to n equations of the form

αi(s) s¨ + βi(s) s˙² + γi(s) = ui.          (14.45)

Solving each one for s¨ yields a special case of (14.39). As in Example 14.6, each equation determines a bounding interval for s¨. The intersection of the intervals for all n equations yields the allowed interval for s¨. The action u′ once again indicates the acceleration in the interval, and the original action variables ui can be obtained from (14.45). If dτi/ds = 0, then αi(s) = 0, which corresponds to the case in which the constraint does not apply. Instead, the constraint is that the vector u must be chosen so that q˙i = 0.
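A toy instance of (14.45) can make the coefficients concrete. The following sketch assumes a hypothetical 1-DOF arm (a single revolute link with inertia I and gravity term mgl·sin q, standing in for (13.142)) and the path τ(s) = πs, so that α(s) = Iπ, β(s) = 0, and γ(s) = mgl·sin(πs); the parameter values are illustrative only:

```python
import math

def pendulum_sddot_interval(s, sdot, I=1.0, mgl=0.5, umax=2.0):
    """Allowed sddot interval from (14.45) for a hypothetical 1-DOF link
    with dynamics I*qddot + mgl*sin(q) = u, path tau(s) = pi*s, |u| <= umax."""
    alpha = I * math.pi                   # coefficient of sddot
    beta = 0.0                            # sdot^2 term vanishes (linear path)
    gamma = mgl * math.sin(math.pi * s)   # gravity term along the path
    lo = (-umax - beta * sdot ** 2 - gamma) / alpha
    hi = ( umax - beta * sdot ** 2 - gamma) / alpha
    return lo, hi
```

The gravity term shifts the whole interval: near s = 1/2 the link is horizontal, so less acceleration is available in the direction that gravity opposes.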


Figure 14.27: (a) Planning occurs in the path-constrained phase space. (b) Due to the forward-progress assumption, value iteration can be reduced to a quick wavefront propagation across regularly spaced vertical lines in S.

Computing optimal solutions via dynamic programming

Dynamic programming with interpolation, as covered in Section 14.5, can be applied to solve the problem once it is formulated in terms of the path-constrained phase space S ⊂ R². The domain of τ provides the constraint 0 ≤ s ≤ 1. Assume that only forward progress along the path is needed; moving in the reverse direction should not be necessary. This implies that s˙ > 0. To make S bounded, an upper bound, s˙max, is usually assumed, beyond which it is known that the speed is too high to follow the path.

This results in the planning problem shown in Figure 14.27a. The system is expressed as s¨ = u′, in which u′ ∈ U′(s, s˙). The initial phase in S is (0, s˙i) and the goal phase is (1, s˙g). Typically, s˙i = s˙g = 0. The region shown in Figure 14.27 is contained in the first quadrant of the phase space because only positive values of s and s˙ are allowed (in Figure 14.13, q and q˙ could be positive or negative). This implies that all motions are to the right. The actions determine whether accelerations or decelerations will occur.

Backward value iteration with interpolation can be easily applied by discretizing S and U′(s, s˙). Due to the constraint s˙ > 0, making a Dijkstra-like version of the algorithm is straightforward. A simple wavefront propagation can even be performed, starting at s = 1 and progressing backward in vertical waves until s = 0 is reached. See Figure 14.27b. The backprojection (14.29) can be greatly simplified. Suppose that the s-axis is discretized into m + 1 regularly spaced values s0, . . ., sm at every ∆s, for some fixed ∆s > 0. Thus, sk = k∆s, in which ∆s = 1/m. The index k can be interpreted as the stage. Starting at k = m, the final cost-to-go G∗m(sm, s˙m) is defined as 0 if the corresponding phase represents the goal, and ∞ otherwise. At each sk, the s˙ values are sampled, and the cost-to-go function is represented using one-dimensional linear interpolation along the vertical axis. At each stage, the dynamic programming computation

G∗k(sk, s˙k) =      min       { l′d(sk, s˙k, u′) + G∗k+1(sk+1, s˙k+1) }          (14.46)
               u′∈U′(sk, s˙k)

is performed at each s˙ sample. This represents a special form of (14.27). Linear interpolation over discretized s˙ values is used to evaluate G∗k+1(sk+1, s˙k+1). The cost term l′d is obtained from ld by computing the original state x ∈ X from s and s˙; however, if the trajectory segment enters Sobs, it receives infinite cost. The computations proceed until stage k = 1, at which time the optimal cost-to-go G∗1(s1, s˙1) is computed. The optimal trajectory is obtained by using the cost-to-go function at each stage as a navigation function.
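The stage-by-stage recursion (14.46) can be sketched for a minimum-time objective. Over one ∆s step under constant acceleration u′, the exact updates are s˙k+1 = sqrt(s˙k² + 2u′∆s) and elapsed time ∆s divided by the average speed. This is an illustrative sketch, not the reference implementation: the grid sizes, the large finite stand-in for infinite cost, and the relaxed goal region (s˙ ≈ 0 at s = 1) are all assumptions.

```python
import math

BIG = 1e9  # finite stand-in for infinite cost-to-go

def backward_dp(u_interval, m=50, vmax=1.5, nv=51, nu=21):
    """Backward value iteration (14.46) on a regular (s, sdot) grid.
    u_interval(s, v) returns the interval U'(s, v), or None inside Sobs.
    The stage cost is elapsed time, so the result approximates minimum time."""
    ds = 1.0 / m
    dv = vmax / (nv - 1)
    vs = [j * dv for j in range(nv)]

    def interp(G, v):  # linear interpolation of cost-to-go over sdot samples
        if v < 0.0 or v > vmax:
            return BIG
        t = v / dv
        j = min(int(t), nv - 2)
        a = t - j
        return (1.0 - a) * G[j] + a * G[j + 1]

    G = [BIG] * nv
    G[0] = G[1] = 0.0                      # goal region: s = 1 with sdot ~ 0
    for k in range(m - 1, -1, -1):         # wavefront from s = 1 back to s = 0
        Gn = [BIG] * nv
        for j, v in enumerate(vs):
            bounds = u_interval(k * ds, v)
            if bounds is None:
                continue                   # (s, sdot) lies in Sobs
            lo, hi = bounds
            best = BIG
            for i in range(nu):            # sample u' in U'(s, sdot)
                u = lo + (hi - lo) * i / (nu - 1)
                v2sq = v * v + 2.0 * u * ds
                if v2sq < 0.0:
                    continue               # forward progress would be lost
                v2 = math.sqrt(v2sq)
                avg = 0.5 * (v + v2)
                if avg <= 0.0:
                    continue
                best = min(best, ds / avg + interp(G, v2))
            Gn[j] = best
        G = Gn
    return G                               # G[j]: cost-to-go from (0, vs[j])
```

For a single double integrator with U′ = [−1, 1] everywhere, the minimum time from (0, 0) to (1, 0) is 2 (accelerate to the midpoint, then decelerate), and the grid approximation lands close to that value.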

The dynamic programming approach is so general that it can even be extended to path-constrained trajectory planning in the presence of higher order constraints [880]. For example, if a system is specified as q(3) = h(q, q˙, q¨, u), then a 3D path-constrained phase space results, in which each element is expressed as (s, s˙, s¨). The actions in this space are jerks, yielding s(3) = u′ for u′ ∈ U′(s, s˙, s¨).

A bang-bang approach for time optimality

The dynamic programming approach is already very efficient because the search is confined to two dimensions. Nevertheless, trajectories that are time optimal can be computed even more efficiently if Sobs has some special structure. The idea is to find an alternating sequence between two motion primitives: one of maximum acceleration and one of maximum deceleration. This kind of switching between extreme opposites is often called bang-bang control and arises often in the development of time-optimal control laws (look ahead to Example 15.4). The method explained here was introduced in [121, 879]. One drawback of obtaining time-optimal trajectories is that they cannot be tracked (the fourth module from Section 14.6.1) if errors occur because the solutions travel on the boundary of the reachable set.

The approach was developed for robot arms, as considered in Example 14.7. Suppose that Sobs is a single connected component that is bounded above by s˙max, and on the sides it is bounded by s = 0 and s = 1. It is assumed that Sobs arises only due to the vanishing of the interval of allowable values for s¨ (in this case, U′(s, s˙) becomes empty). It is also assumed that the lower boundary of Sobs can be expressed as a differentiable function φ : [0, 1] → R, called the limit curve, which yields the maximum speed s˙ = φ(s) for every s ∈ [0, 1]. The method is extended to handle multiple obstacles in [879], but this case is not considered here. Assume also that dτi/ds ≠ 0 for every i; the case of dτi/ds = 0 can also be handled in the method [878].

Let u′min(s, s˙) and u′max(s, s˙) denote the smallest and largest possible accelerations, respectively, from (s, s˙) ∈ S. If (s, s˙) ∉ Sobs, then u′min(s, s˙) < u′max(s, s˙).

BANG-BANG APPROACH

          1. From the final state (1, 0), apply reverse-time integration to s¨ = u′min(s, s˙). Continue constructing the curve numerically until either the interior of Sobs is entered or s˙ = 0. In the latter case, the algorithm terminates with failure.

          2. Let (scur, s˙cur) = (0, 0).

          3. Apply forward integration to s¨ = u′max(s, s˙) from (scur, s˙cur) until either the interior of Sobs is entered or the curve generated in Step 1 is crossed. In the latter case, the problem is solved.

          4. Starting at the point where the trajectory from Step 3 pierced the limit curve, find the next tangent point (stan, s˙tan) to the right along the limit curve. From (stan, s˙tan), perform reverse integration on s¨ = u′min(s, s˙) until the curve from Step 3 is hit. Let (scur, s˙cur) = (stan, s˙tan) and go to Step 3.

Figure 14.28: The bang-bang approach finds a time-optimal, path-constrained trajectory with less searching than the dynamic programming approach.

At the limit curve, u′min(s, φ(s)) = u′max(s, φ(s)). Applying the only feasible action in this case generates a velocity that is tangent to the limit curve. A point at which this occurs is called a tangent point, (stan, s˙tan), of φ. Inside of Sobs, no accelerations are possible.

The bang-bang approach is described in Figure 14.28, and a graphical illustration appears in Figure 14.29. Assume that the initial and goal phases are (0, 0) and (1, 0), respectively. Step 1 essentially enlarges the goal by constructing a maximum-deceleration curve that terminates at (1, 0). A trajectory that contacts this curve can optimally reach (1, 0) by switching to maximum deceleration. Steps 3 and 4 construct a maximum-acceleration curve followed by a maximum-deceleration curve. The acceleration curve runs until it pierces the limit curve. This constraint violation must be avoided. Therefore, a deceleration must be determined that departs earlier from the acceleration curve and just barely misses entering the interior of Sobs. This curve must become tangent to the limit curve; therefore, a search is made along the limit curve for the next possible tangent point. From there, reverse-time integration is used in Step 4 to generate a deceleration curve that contacts the acceleration curve. A portion of the solution has now been obtained in which an acceleration is followed by a deceleration that arrives at a tangent point of φ. It is possible that Step 4 is not reached because the curve that connects to the goal is contacted. Starting from the tangent point, Steps 3 and 4 are repeated until the goal curve is contacted.

Figure 14.29: An illustration of the bang-bang approach to computing a time-optimal trajectory. The solution trajectory is obtained by connecting the dots.
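The switching idea can be illustrated numerically for the simplest case: a single double integrator s¨ = u′ ∈ [−umax, umax] with no Sobs, so that the reverse-time curve of Step 1 from (1, 0) is s˙ = sqrt(2·umax·(1 − s)) and a single acceleration/deceleration switch suffices (minimum time 2 for umax = 1). A sketch of this degenerate case, not the general tangent-point algorithm:

```python
def bang_bang_double_integrator(umax=1.0, dt=1e-4):
    """Forward-integrate maximum acceleration from (0, 0) until crossing the
    maximum-deceleration curve into (1, 0), then decelerate to the goal."""
    s, v, t = 0.0, 0.0, 0.0
    # Step 3: accelerate while strictly below the Step-1 goal curve.
    while s < 1.0 and v * v < 2.0 * umax * (1.0 - s):
        v += umax * dt
        s += v * dt
        t += dt
    # Switch: decelerate along (an approximation of) the goal curve.
    while s < 1.0 and v > 0.0:
        v -= umax * dt
        s += v * dt
        t += dt
    return t, s, v
```

The switch occurs near (s, s˙) = (1/2, 1), and the computed time approaches the optimal value 2 as dt shrinks.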
