The Pose3
class implements two things: a pose manifold and a Lie group of transforms. They look awfully similar, and are often treated as the same thing, but they are really not.
%pip install --quiet gtsam-develop
import gtsam
import numpy as np
from gtsam import Pose3, Rot3, Point3
Manifold ¶
The manifold is defined by elements , where:
- : The orientation or attitude.
- : The position.
This manifold represents poses in 3D space, with encoding orientation/attitude and encoding the position of a body in some reference frame.
This manifold can be thought of all possible values that can specify the position and orientation of a rigid body in some reference frame. It does not have a notion of composition: we would never think to compose two poses of two different rigid bodies.
The simplest thing you can do in GTSAM is just creating a pose with identity orientation and position:
P = Pose3()
print(f"P=\n{P}")
P=
R: [
1, 0, 0;
0, 1, 0;
0, 0, 1
]
t: 0 0 0
But of course, it is more useful to specify some orientation and position. To do that, you need to create a Rot3
instance, see Rotations in 3D. For example, here is a pose of a robot at and rotated by 30 degrees around the Z-axis:
C = Rot3.Yaw(np.deg2rad(30))
r = Point3(3,4,0)
print(C,r)
P = Pose3(C,r)
print(f"P=\n{P}")
R: [
0.866025, -0.5, 0;
0.5, 0.866025, 0;
0, 0, 1
]
[3. 4. 0.]
P=
R: [
0.866025, -0.5, 0;
0.5, 0.866025, 0;
0, 0, 1
]
t: 3 4 0
A great many things can be done using just the manifold , as it provides a flexible and foundational framework for representing and reasoning about poses in 3D space. Below are some key use cases and examples:
Trajectories on ¶
A trajectory over time can be represented as a continuous or discrete curve on the manifold:
where:
- is a time-varying orientation.
- is a time-varying position vector.
This representation is useful for describing motion paths of rigid bodies, such as robots, drones, or vehicles.
Optimization on ¶
Many problems involve optimization over the pose manifold, such as:
- Localization: Estimating the pose of a robot or sensor in a global or local frame.
- SLAM (Simultaneous Localization and Mapping): Optimizing a graph of poses and landmarks to minimize error.
Lie Group ¶
The Lie group consists of elements , where:
- : A rotation matrix.
- : A translation vector.
The group operation in is defined as:
This operation combines two transformations, where composes the rotations and composes the translations.
Note that this particular form of composition defines as the semi-product of the group and the vector space . This should be contrasted with a direct product, which would yield as the composition.
Group Action on 3D Points¶
The group acts on a 3D point as follows:
Here, the rotation is applied first, followed by the translation .
Group Action on Poses¶
The group also acts on a pose as follows:
- rotates the orientation into a different frame.
- transforms the position, by first rotating the position in the old frame, then translating to account for the origin of the new frame.
4x4 Matrix Lie Group View of Transforms¶
A transform in can also be represented as a homogeneous transformation matrix: $$ T =
$$
The composition of two transforms and in is then simply matrix multiplication $$ T_1 \cdot T_2 =
=
$$
and the inverse is given by: $$ T^{-1} =
$R^\topR$.
To implement the action of on we need to embed a point in homogeneous coordinates as , where . The action of a transform on a point is then given by: $$ \tilde{p}’ = T \tilde{p} =
=
p’ \ 1 \end{bmatrix}. $$
The matrix representation provides an elegant way to handle transformations in 3D space. This view also emphasizes that is a matrix Lie group, i.e., a subgroup of invertible matrices, with in this case . However, it is by no means required to implement the Lie group and in fact GTSAM internally does not use matrices.
Reference Frame Hierarchies¶
Poses can sometimes be used as transforms when hierarchical relationships exist between frames. For example, consider the pose of an IMU relative to a rigid body B, which itself has a pose in a navigation frame N.
If we seek is the pose of the IMU relative to the navigation frame, we need to upgrade the body pose to a rigid transform . The pose of the IMU relative to the navigation frame is then given by:
Note how the indices cancel out.
Retractions and the Exponential Map¶
The Tangent Space of ¶
The tangent space of at the identity, known as its Lie algebra , can be thought of informally as encoding angular and linear velocities. Elements of are twists , where:
- : Represents angular velocity.
- : Represents linear velocity.
These quantities describe instantaneous motion in 3D space. A twist ξ can be represented as a skew-symmetric matrix: $$ \xi^\wedge =
$\omega^\wedge\omega$, encoding rotational motion.
To describe finite motions, the twist ξ is scaled by a time or displacement parameter , resulting in a scaled twist :
- : A finite angular displacement.
- : A finite linear displacement.
We used to indicate a time interval: not this has nothing to do with the translation . This finite motion is the input to the exponential map, which we explain next.
The Exponential Map for ¶
The exponential map for takes a finite motion , derived from a twist scaled by a parameter , and maps it to a rigid body transform . The exponential map produces a transformation in : $$ \exp((\xi \Delta t)^\wedge) =
$(R,t)\in SE(3)$, where
- , where is the skew-symmetric matrix of ϕ.
- , where is the left Jacobian of , a matrix. handles the semi-direct product nature of by coupling the rotational and translational components appropriately.
An often-used shorthand notation is defining or a two-argument variant as:
where we used the same notation for , i.e., .
Retractions to the Manifold ¶
The exponential map can also be used as a retraction for optimization on the manifold. Retractions are mappings from the tangent space to the manifold, e.g., the exponential map implements the following retraction:
Alternative retractions exists, however, and the only restriction on them is that they need to have the same behavior as the exponential map near the identity. One alternative retraction simply treats the attitude and position separately:
This alternative retraction behaves identically to the exponential map at and is computationally simpler.