Pose3

The Pose3 class implements two things: a pose manifold and a Lie group of transforms. They look awfully similar, and are often treated as the same thing, but they are really not.

%pip install --quiet gtsam-develop

import gtsam
import numpy as np
from gtsam import Pose3, Rot3, Point3

Manifold $\mathcal{SE}(3)$ ¶

The manifold $\mathcal{SE}(3)$ is defined by elements $(C, r)$ , where:

$C \in \mathcal{SO}(3)$ : The orientation or attitude.
$r \in \mathbb{R}^3$ : The position.

This manifold represents poses in 3D space, with $C$ encoding orientation/attitude and $r$ encoding the position of a body in some reference frame.

This manifold can be thought of all possible values that can specify the position and orientation of a rigid body in some reference frame. It does not have a notion of composition: we would never think to compose two poses of two different rigid bodies.

The simplest thing you can do in GTSAM is just creating a pose with identity orientation and position:

P = Pose3()
print(f"P=\n{P}")

P=
R: [
	1, 0, 0;
	0, 1, 0;
	0, 0, 1
]
t: 0 0 0

But of course, it is more useful to specify some orientation and position. To do that, you need to create a Rot3 instance, see Rotations in 3D. For example, here is a pose of a robot at $r=(3,4,0)$ and rotated by 30 degrees around the Z-axis:

C = Rot3.Yaw(np.deg2rad(30))
r = Point3(3,4,0)
print(C,r)
P = Pose3(C,r)
print(f"P=\n{P}")

R: [
	0.866025, -0.5, 0;
	0.5, 0.866025, 0;
	0, 0, 1
]
 [3. 4. 0.]
P=
R: [
	0.866025, -0.5, 0;
	0.5, 0.866025, 0;
	0, 0, 1
]
t: 3 4 0

A great many things can be done using just the manifold $\mathcal{SE}(3)$ , as it provides a flexible and foundational framework for representing and reasoning about poses in 3D space. Below are some key use cases and examples:

Trajectories on $\mathcal{SE}(3)$ ¶

A trajectory over time can be represented as a continuous or discrete curve on the manifold:

P(t) = (C(t), r(t)),

(1)

where:

$C(t)$ is a time-varying orientation.
$r(t)$ is a time-varying position vector.

This representation is useful for describing motion paths of rigid bodies, such as robots, drones, or vehicles.

Optimization on $\mathcal{SE}(3)$ ¶

Many problems involve optimization over the pose manifold, such as:

Localization: Estimating the pose of a robot or sensor in a global or local frame.
SLAM (Simultaneous Localization and Mapping): Optimizing a graph of poses and landmarks to minimize error.

Lie Group $SE(3)$ ¶

The Lie group $SE(3)$ consists of elements $(R, t)$ , where:

$R \in SO(3)$ : A rotation matrix.
$t \in \mathbb{R}^3$ : A translation vector.

The group operation in $SE(3)$ is defined as:

(R_1, t_1) \cdot (R_2, t_2) = (R_1 R_2, R_1 t_2 + t_1).

(2)

This operation combines two transformations, where $R_1 R_2$ composes the rotations and $R_1 t_2 + t_1$ composes the translations.

Note that this particular form of composition defines $SE(3)$ as the semi-product of the group $SO(3)$ and the vector space $\mathbb{R}^3$ . This should be contrasted with a direct product, which would yield $(R_1 R_2, t_2 + t_1)$ as the composition.

Group Action on 3D Points¶

The group $SE(3)$ acts on a 3D point $p \in \mathbb{R}^3$ as follows:

p' = (R, t) \cdot p = R p + t.

(3)

Here, the rotation $R$ is applied first, followed by the translation $t$ .

Group Action on Poses¶

The group $SE(3)$ also acts on a pose $(C,r) \in \mathcal{SE}(3)$ as follows:

(R,t) \cdot (C,r) = (R C, R r +t).

(4)

$R C$ rotates the orientation into a different frame.
$R r +t$ transforms the position, by first rotating the position in the old frame, then translating to account for the origin $t$ of the new frame.

4x4 Matrix Lie Group View of Transforms¶

A transform in $SE(3)$ can also be represented as a $4 \times 4$ homogeneous transformation matrix: $$ T =

\begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix}

(5)

$$

The composition of two transforms $T_1$ and $T_2$ in $SE(3)$ is then simply matrix multiplication $$ T_1 \cdot T_2 =

\begin{bmatrix} R_1 & t_1 \\ 0^\top & 1 \end{bmatrix}

(6)

\begin{bmatrix} R_2 & t_2 \\ 0^\top & 1 \end{bmatrix}

(7)

=

\begin{bmatrix} R_1 R_2 & R_1 t_2 + t_1 \\ 0^\top & 1 \end{bmatrix}

(8)

$$

and the inverse is given by: $$ T^{-1} =

\begin{bmatrix} R^\top & -R^\top t \\ 0^\top & 1 \end{bmatrix}

(9)

$ $where$ R^\top $is the transpose (and also the inverse) of the rotation matrix$ R$.

To implement the action of $SE(3)$ on $\mathbb{R}^3$ we need to embed a point $p$ in homogeneous coordinates as $\tilde{p} = [p^\top, 1]^\top$ , where $p \in \mathbb{R}^3$ . The action of a transform $T$ on a point $\tilde{p}$ is then given by: $$ \tilde{p}’ = T \tilde{p} =

\begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix}

(10)

\begin{bmatrix} p \\ 1 \end{bmatrix}

(11)

=

\begin{bmatrix} R p + t \\ 1 \end{bmatrix}

(12)

p’ \ 1 \end{bmatrix}. $$

The matrix representation provides an elegant way to handle transformations in 3D space. This view also emphasizes that $SE(3)$ is a matrix Lie group, i.e., a subgroup of invertible $d\times d$ matrices, with in this case $d=4$ . However, it is by no means required to implement the Lie group $SE(3)$ and in fact GTSAM internally does not use $4\times 4$ matrices.

Reference Frame Hierarchies¶

Poses can sometimes be used as transforms when hierarchical relationships exist between frames. For example, consider the pose $P^b_i$ of an IMU relative to a rigid body B, which itself has a pose $P^n_b$ in a navigation frame N.

If we seek is the pose $P^n_i$ of the IMU relative to the navigation frame, we need to upgrade the body pose $P^n_b$ to a rigid transform $T^n_b$ . The pose of the IMU relative to the navigation frame is then given by:

P^n_i = T^n_b \cdot P^b_i.

(13)

Note how the $b$ indices cancel out.

Retractions and the Exponential Map¶

The Tangent Space of $SE(3)$ ¶

The tangent space of $SE(3)$ at the identity, known as its Lie algebra $\mathfrak{se}(3)$ , can be thought of informally as encoding angular and linear velocities. Elements of $\mathfrak{se}(3)$ are twists $\xi = (\omega, v)$ , where:

$\omega \in \mathbb{R}^3$ : Represents angular velocity.
$v \in \mathbb{R}^3$ : Represents linear velocity.

These quantities describe instantaneous motion in 3D space. A twist $\xi$ can be represented as a skew-symmetric matrix: $$ \xi^\wedge =

\begin{bmatrix} \omega^\wedge & v \\ 0^\top & 0 \end{bmatrix}

(14)

$ $where$ \omega^\wedge $is the skew-symmetric matrix of$ \omega$, encoding rotational motion.

To describe finite motions, the twist $\xi$ is scaled by a time or displacement parameter $\Delta t$ , resulting in a scaled twist $\xi \Delta t = (\phi, \rho)$ :

$\phi = \omega \Delta t$ : A finite angular displacement.
$\rho = v \Delta t$ : A finite linear displacement.

We used $\Delta t$ to indicate a time interval: not this has nothing to do with the translation $t$ . This finite motion $\xi \Delta t$ is the input to the exponential map, which we explain next.

The Exponential Map for $SE(3)$ ¶

The exponential map for $SE(3)$ takes a finite motion $\xi \Delta t = (\phi, \rho)$ , derived from a twist $\xi = (\omega, v)$ scaled by a parameter $t$ , and maps it to a rigid body transform $(R, t) \in SE(3)$ . The exponential map produces a transformation $(R, t)$ in $SE(3)$ : $$ \exp((\xi \Delta t)^\wedge) =

\begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix}

(15)

$ $or$ (R,t)\in SE(3)$, where

$R = \exp(\phi^\wedge)$ , where $\phi^\wedge$ is the skew-symmetric matrix of $\phi$ .
$t = J_l(\phi) \rho$ , where $J_l(\phi)$ is the left Jacobian of $SO(3)$ , a $3 \times 3$ matrix. $J_l(\phi)$ handles the semi-direct product nature of $SE(3)$ by coupling the rotational and translational components appropriately.

An often-used shorthand notation is defining $\text{Exp}: \mathbb{R}^6 \rightarrow SE(3)$ or a two-argument variant as:

\text{Exp}(\phi,\rho) = \text{Exp}(\xi \Delta t) \doteq \exp((\xi \Delta t)^\wedge) = (\text{Exp}(\phi), J_l(\phi) \rho)

(16)

where we used the same notation for $SO(3)$ , i.e., $\text{Exp}(\phi)\doteq \exp(\phi^\wedge)$ .

Retractions to the Manifold $\mathcal{SE}(3)$ ¶

The exponential map can also be used as a retraction for optimization on the $SE(3)$ manifold. Retractions are mappings from the tangent space to the manifold, e.g., the exponential map implements the following retraction:

(C, r) \times (\phi, \rho) \mapsto \exp((\xi \Delta t)^\wedge) \cdot (C,r) = (C \exp(\phi^\wedge), r + C J_l(\phi) \rho),

(17)

Alternative retractions exists, however, and the only restriction on them is that they need to have the same behavior as the exponential map near the identity. One alternative retraction simply treats the attitude and position separately:

(C, r) \times (\phi, \rho) \mapsto (C\exp(\phi^\wedge), r + \rho),

(18)

This alternative retraction behaves identically to the exponential map at $t = 0$ and is computationally simpler.

Manifold SE(3)\mathcal{SE}(3)SE(3)¶

Trajectories on SE(3)\mathcal{SE}(3)SE(3)¶

Optimization on SE(3)\mathcal{SE}(3)SE(3)¶

Lie Group SE(3)SE(3)SE(3)¶