Used in:
1-d array of [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]. Note that these intrinsics correspond to the images after scaling. Camera model: pinhole camera. Lens distortion: radial distortion coefficients k1, k2, k3; tangential distortion coefficients p1, p2. k1, k2, k3 and p1, p2 follow the same definitions as OpenCV. https://en.wikipedia.org/wiki/Distortion_(optics) https://docs.opencv.org/2.4/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
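For concreteness, a minimal Python sketch (NumPy assumed; the function name is made up for illustration) that unpacks this array into an OpenCV-style camera matrix and distortion vector:

import numpy as np

def unpack_intrinsic(intrinsic):
    # intrinsic is the 1-d array documented above:
    # [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]
    f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3 = intrinsic
    # Standard 3x3 pinhole camera matrix.
    camera_matrix = np.array([[f_u, 0.0, c_u],
                              [0.0, f_v, c_v],
                              [0.0, 0.0, 1.0]])
    # OpenCV orders its distortion vector as (k1, k2, p1, p2, k3),
    # matching the definitions referenced above.
    dist_coeffs = np.array([k1, k2, p1, p2, k3])
    return camera_matrix, dist_coeffs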
Vehicle frame to camera frame.
Camera image size.
Used in:
All timestamps in this proto are represented as seconds since Unix epoch.
Used in:
JPEG image.
SDC pose.
SDC velocity at 'pose_timestamp' below. The velocity value is represented in the vehicle frame. With this velocity, the pose can be extrapolated:
r(t+dt) = r(t) + dr/dt * dt, where dr/dt = v_{x,y,z};
R(t+dt) = R(t) + R(t)*SkewSymmetric(w_{x,y,z})*dt.
Here r(t) = (x(t), y(t), z(t)) is the vehicle location at time t in the global frame, R(t) is the 3x3 rotation matrix from the body frame to the global frame at time t, and SkewSymmetric(x, y, z) is the cross-product matrix defined in: https://en.wikipedia.org/wiki/Cross_product#Conversion_to_matrix_multiplication
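A minimal NumPy sketch of this extrapolation, following the formulas exactly as stated (function and argument names are illustrative, not part of the proto):

import numpy as np

def skew_symmetric(w):
    # Cross-product matrix of w = (w_x, w_y, w_z).
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def extrapolate_pose(r, R, v, w, dt):
    # r: vehicle location (3,) in the global frame at time t.
    # R: 3x3 rotation from body frame to global frame at time t.
    # v, w: linear and angular velocity from this message.
    r_next = r + v * dt                      # r(t+dt) = r(t) + dr/dt * dt
    R_next = R + R @ skew_symmetric(w) * dt  # R(t+dt) = R(t) + R(t)*[w]x*dt
    return r_next, R_next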
Timestamp of the `pose` above.
Shutter duration in seconds, i.e. the time taken for one shutter exposure.
Time when the sensor was triggered and when readout finished. The difference between trigger time and readout done time includes the exposure time and the actual sensor readout time.
The camera labels associated with a given camera image. This message indicates the ground truth information for the camera image recorded by the given camera. If there are no labeled objects in the image, then the labels field is empty.
Used in:
(message has no fields)
Used in:
Used in:
A unique name that identifies the frame sequence.
Some stats for the run segment used.
Used in:
Day, Dawn/Dusk, or Night, determined from sun elevation.
Human readable location (e.g. CHD, SF) of the run segment.
Currently either Sunny or Rain.
Used in:
The number of unique objects of the given type in the segment.
This context is the same for all frames belonging to the same driving run segment. Use context.name to identify frames belonging to the same driving segment. We do not store all frames from one driving segment in one proto to avoid huge protos.
Frame start time, which is the timestamp of the first top lidar spin within this frame.
The vehicle pose.
Lidar labels (laser_labels) projected to camera images. A projected label is the smallest image axis aligned rectangle that can cover all projected points from the 3d lidar label. The projected label is ignored if the projection is fully outside a camera image. The projected label is clamped to the camera image if it is partially outside.
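A sketch of that rule in Python, assuming the projected points of a 3d label are already available as an (N, 2) NumPy array of pixel coordinates:

import numpy as np

def covering_box(points_2d, width, height):
    # Smallest image-axis-aligned rectangle covering all projected points.
    x_min, y_min = points_2d.min(axis=0)
    x_max, y_max = points_2d.max(axis=0)
    # Ignore the label if the projection is fully outside the image.
    if x_max < 0 or y_max < 0 or x_min >= width or y_min >= height:
        return None
    # Clamp a partially visible projection to the image bounds.
    x_min, x_max = np.clip([x_min, x_max], 0, width - 1)
    y_min, y_max = np.clip([y_min, y_max], 0, height - 1)
    return x_min, y_min, x_max, y_max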
NOTE: if a camera identified by CameraLabels.name has an entry in this field, then it has been labeled, even though it is possible that there are no labeled objects in the corresponding image, indicated by an empty (zero-sized) CameraLabels.labels.
No label zones in the *global* frame.
Used in:
Object ID.
Difficulty level for detection problem.
Difficulty level for tracking problem.
Upright box, zero pitch and roll.
Used in:
Box coordinates in vehicle frame.
Dimensions of the box: length along the x axis, width along the y axis, height along the z axis.
The heading of the bounding box (in radians). The heading is the angle required to rotate +x to the surface normal of the box's front face. A sketch that recovers the footprint corners from these fields follows the box type list below.
7-DOF 3D box (a.k.a. upright 3D box).
5-DOF 2D box. Mostly used for the laser top-down representation.
Axis-aligned 2D box. Mostly used for camera images.
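As an illustration of the 7-DOF definition above (names are illustrative), the four footprint corners of an upright box can be recovered from its center, size and heading:

import numpy as np

def bev_corners(center_x, center_y, length, width, heading):
    # Rotation about +z by `heading` (right-hand rule).
    c, s = np.cos(heading), np.sin(heading)
    rot = np.array([[c, -s],
                    [s,  c]])
    # Half-extents: length along x, width along y, per the definition above.
    half = 0.5 * np.array([[ length,  width],
                           [ length, -width],
                           [-length, -width],
                           [-length,  width]])
    # Rotate each corner and translate to the box center (vehicle frame).
    return half @ rot.T + np.array([center_x, center_y])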
The difficulty level of this label. The higher the level, the harder it is.
Used in:
Used in:
Used in:
Used in:
Used in:
If non-empty, the beam pitch (in radians) is non-uniform. When constructing a range image, this mapping is used to map from beam pitch to range image row. If this is empty, we assume a uniform distribution.
beam_inclination_{min,max} (in radians) are used to determine the mapping.
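A hedged sketch of the uniform case: mapping a beam inclination to a range image row from beam_inclination_{min,max}. The assumption that row 0 corresponds to the highest beam is mine, not stated in this file:

import numpy as np

def inclination_to_row(inclination, inc_min, inc_max, height):
    # Fraction of the way from the top (max inclination) to the bottom.
    frac = (inc_max - inclination) / (inc_max - inc_min)
    # Uniformly spaced rows; clamp to a valid row index.
    return int(np.clip(np.floor(frac * height), 0, height - 1))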
Lidar frame to vehicle frame.
'Laser' is used interchangeably with 'Lidar' in this file.
(message has no fields)
Used in:
Row-major matrix. Requires: data.size() = product(shape.dims()).
Used in:
Row-major matrix. Requires: data.size() = product(shape.dims()).
Used in:
Dimensions for the Matrix messages defined in this file. Must not be empty. The order of entries in 'dims' matters, as it indicates the layout of the values in the tensor in-memory representation. The first entry in 'dims' is the outermost dimension used to lay out the values; the last entry is the innermost dimension. This matches the in-memory layout of row-major matrices.
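Because the data is row-major, a plain reshape recovers the layout. A one-line sketch in Python (assuming NumPy and an already-parsed matrix message):

import numpy as np

def matrix_to_numpy(m):
    # Works for both MatrixFloat and MatrixInt32: row-major data plus dims.
    return np.array(m.data).reshape(m.shape.dims)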
A non-self-intersecting 2d polygon. The polygon is not necessarily convex.
Used in:
A globally unique ID.
A range image is a 2d tensor. The first dimension (rows) represents pitch; the second represents yaw. There are two types of range images:
1. Raw range image: a range image with a non-empty 'range_image_pose_compressed', which gives the vehicle pose of each range image cell.
2. Virtual range image: a range image with an empty 'range_image_pose_compressed'. It is constructed by transforming all lidar points into a fixed vehicle frame (usually the vehicle frame of the middle scan).
NOTE: 'range_image_pose_compressed' is only populated for the first range image return. The second return has exactly the same range image pose as the first one.
Used in:
Zlib compressed [H, W, 4] serialized version of MatrixFloat. To decompress:
  string val = ZlibDecompress(range_image_compressed);
  MatrixFloat range_image;
  range_image.ParseFromString(val);
Inner dimensions are:
  * channel 0: range
  * channel 1: intensity
  * channel 2: elongation
  * channel 3: is in any no label zone.
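The same steps in Python, assuming the waymo_open_dataset pip package for the generated bindings (the identical pattern decodes the camera projection field below, with MatrixInt32 in place of MatrixFloat):

import zlib
import numpy as np
from waymo_open_dataset import dataset_pb2

def decode_range_image(range_image_compressed):
    # ZlibDecompress + ParseFromString, then reshape to [H, W, 4].
    matrix = dataset_pb2.MatrixFloat()
    matrix.ParseFromString(zlib.decompress(range_image_compressed))
    return np.array(matrix.data).reshape(matrix.shape.dims)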
Lidar point to camera image projections. A point can be projected to multiple camera images. We pick the first two in the following order: [FRONT, FRONT_LEFT, FRONT_RIGHT, SIDE_LEFT, SIDE_RIGHT].
Zlib compressed [H, W, 6] serialized version of MatrixInt32. To decompress:
  string val = ZlibDecompress(camera_projection_compressed);
  MatrixInt32 camera_projection;
  camera_projection.ParseFromString(val);
Inner dimensions are:
  * channel 0: CameraName.Name of the 1st projection. Set to UNKNOWN if no projection.
  * channel 1: x (axis along image width)
  * channel 2: y (axis along image height)
  * channel 3: CameraName.Name of the 2nd projection. Set to UNKNOWN if no projection.
  * channel 4: x (axis along image width)
  * channel 5: y (axis along image height)
Note: pixel 0 corresponds to the left edge of the first pixel in the image.
Zlib compressed [H, W, 6] serialized version of MatrixFloat. To decompress:
  string val = ZlibDecompress(range_image_pose_compressed);
  MatrixFloat range_image_pose;
  range_image_pose.ParseFromString(val);
Inner dimensions are [roll, pitch, yaw, x, y, z], representing a transform from the vehicle frame to the global frame for every range image pixel. This is ONLY populated for the first return; the second return is assumed to have exactly the same range_image_pose_compressed.
The roll, pitch and yaw are specified as 3-2-1 Euler angle rotations, meaning that rotating from the navigation frame to the vehicle frame consists of a yaw, then pitch, and finally roll rotation about the z, y and x axes respectively. All rotations use the right-hand rule and are positive in the counterclockwise direction.
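A NumPy sketch of the 3-2-1 convention as described: the rotation is composed as yaw about z, then pitch about y, then roll about x (the function name is illustrative):

import numpy as np

def euler_321_to_rotation(roll, pitch, yaw):
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    # Right-handed elementary rotations about z, y and x.
    R_z = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    R_y = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    R_x = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    # 3-2-1 order: yaw, then pitch, then roll.
    return R_z @ R_y @ R_x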
Deprecated, do not use.
4x4 row-major transform matrix that transforms 3d points from one frame to another.
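A short sketch applying such a transform to a set of 3d points (NumPy assumed; names illustrative):

import numpy as np

def apply_transform(transform, points):
    # 16 row-major doubles -> 4x4 matrix.
    t = np.array(transform.matrix).reshape(4, 4)
    # Homogeneous coordinates for (N, 3) points.
    ones = np.ones((points.shape[0], 1))
    points_h = np.hstack([points, ones])
    # Row-vector convention: p' = T @ p, written as p_h @ T.T per row.
    return (points_h @ t.T)[:, :3]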
Used in:
Used in:
Velocity in m/s.
Angular velocity in rad/s.