Used in:
1-d array of [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]. Note that these intrinsics correspond to the images after scaling. Camera model: pinhole camera. Lens distortion: radial distortion coefficients k1, k2, k3; tangential distortion coefficients p1, p2. k1, k2, k3 and p1, p2 follow the same definitions as OpenCV. https://en.wikipedia.org/wiki/Distortion_(optics) https://docs.opencv.org/2.4/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
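For concreteness, a minimal Python sketch (NumPy assumed; the function name is made up for illustration) that unpacks this array into an OpenCV-style camera matrix and distortion vector:

import numpy as np

def unpack_intrinsic(intrinsic):
    # intrinsic is the 1-d array documented above:
    # [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]
    f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3 = intrinsic
    # Standard 3x3 pinhole camera matrix.
    camera_matrix = np.array([[f_u, 0.0, c_u],
                              [0.0, f_v, c_v],
                              [0.0, 0.0, 1.0]])
    # OpenCV orders its distortion vector as (k1, k2, p1, p2, k3),
    # matching the definitions referenced above.
    dist_coeffs = np.array([k1, k2, p1, p2, k3])
    return camera_matrix, dist_coeffs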
Vehicle frame to camera frame.
Camera image size.
Used in:
All timestamps in this proto are represented as seconds since Unix epoch.
Used in:
JPEG image.
SDC pose.
SDC velocity at 'pose_timestamp' below. The velocity value is represented in the vehicle frame. With this velocity, the pose can be extrapolated:
r(t+dt) = r(t) + dr/dt * dt, where dr/dt = v_{x,y,z};
R(t+dt) = R(t) + R(t)*SkewSymmetric(w_{x,y,z})*dt.
Here r(t) = (x(t), y(t), z(t)) is the vehicle location at time t in the global frame, R(t) is the 3x3 rotation matrix from the body frame to the global frame at time t, and SkewSymmetric(x, y, z) is the cross-product matrix defined in: https://en.wikipedia.org/wiki/Cross_product#Conversion_to_matrix_multiplication
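A minimal NumPy sketch of this extrapolation, following the formulas exactly as stated (function and argument names are illustrative, not part of the proto):

import numpy as np

def skew_symmetric(w):
    # Cross-product matrix of w = (w_x, w_y, w_z).
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def extrapolate_pose(r, R, v, w, dt):
    # r: vehicle location (3,) in the global frame at time t.
    # R: 3x3 rotation from body frame to global frame at time t.
    # v, w: linear and angular velocity from this message.
    r_next = r + v * dt                      # r(t+dt) = r(t) + dr/dt * dt
    R_next = R + R @ skew_symmetric(w) * dt  # R(t+dt) = R(t) + R(t)*[w]x*dt
    return r_next, R_next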
Timestamp of the `pose` above.
Shutter duration in seconds, i.e. the time taken for one shutter exposure.
Time when the sensor was triggered and when readout finished. The difference between trigger time and readout done time includes the exposure time and the actual sensor readout time.
The camera labels associated with a given camera image. This message indicates the ground truth information for the camera image recorded by the given camera. If there are no labeled objects in the image, then the labels field is empty.
Used in:
(message has no fields)
Used in:
Used in:
A unique name that identifies the frame sequence.
Some stats for the run segment used.
Used in:
Day, Dawn/Dusk, or Night, determined from sun elevation.
Human readable location (e.g. CHD, SF) of the run segment.
Currently either Sunny or Rain.
Used in:
The number of unique objects of the given type in the segment.
This context is the same for all frames belonging to the same driving run segment. Use context.name to identify frames belonging to the same driving segment. We do not store all frames from one driving segment in one proto to avoid huge protos.
Frame start time, which is the timestamp of the first top lidar spin within this frame.
The vehicle pose.
Lidar labels (laser_labels) projected to camera images. A projected label is the smallest image axis aligned rectangle that can cover all projected points from the 3d lidar label. The projected label is ignored if the projection is fully outside a camera image. The projected label is clamped to the camera image if it is partially outside.
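A sketch of that rule in Python, assuming the projected points of a 3d label are already available as an (N, 2) NumPy array of pixel coordinates:

import numpy as np

def covering_box(points_2d, width, height):
    # Smallest image-axis-aligned rectangle covering all projected points.
    x_min, y_min = points_2d.min(axis=0)
    x_max, y_max = points_2d.max(axis=0)
    # Ignore the label if the projection is fully outside the image.
    if x_max < 0 or y_max < 0 or x_min >= width or y_min >= height:
        return None
    # Clamp a partially visible projection to the image bounds.
    x_min, x_max = np.clip([x_min, x_max], 0, width - 1)
    y_min, y_max = np.clip([y_min, y_max], 0, height - 1)
    return x_min, y_min, x_max, y_max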
NOTE: if a camera identified by CameraLabels.name has an entry in this field, then it has been labeled, even though it is possible that there are no labeled objects in the corresponding image, indicated by an empty (zero-sized) CameraLabels.labels.
No label zones in the *global* frame.
Used in:
Object ID.
Difficulty level for detection problem.
Difficulty level for tracking problem.
Upright box, zero pitch and roll.
Used in:
Box coordinates in vehicle frame.
Dimensions of the box: length along the x axis, width along the y axis, height along the z axis.
The heading of the bounding box (in radians). The heading is the angle required to rotate +x to the surface normal of the box's front face. A sketch that recovers the footprint corners from these fields follows the box type list below.
7-DOF 3D box (a.k.a. upright 3D box).
5-DOF 2D box. Mostly used for the laser top-down representation.
Axis-aligned 2D box. Mostly used for camera images.
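As an illustration of the 7-DOF definition above (names are illustrative), the four footprint corners of an upright box can be recovered from its center, size and heading:

import numpy as np

def bev_corners(center_x, center_y, length, width, heading):
    # Rotation about +z by `heading` (right-hand rule).
    c, s = np.cos(heading), np.sin(heading)
    rot = np.array([[c, -s],
                    [s,  c]])
    # Half-extents: length along x, width along y, per the definition above.
    half = 0.5 * np.array([[ length,  width],
                           [ length, -width],
                           [-length, -width],
                           [-length,  width]])
    # Rotate each corner and translate to the box center (vehicle frame).
    return half @ rot.T + np.array([center_x, center_y])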
The difficulty level of this label. The higher the level, the harder it is.
Used in:
Used in:
Used in:
Used in:
Used in:
If non-empty, the beam pitch (in radians) is non-uniform. When constructing a range image, this mapping is used to map from beam pitch to range image row. If this is empty, we assume a uniform distribution.
beam_inclination_{min,max} (in radians) are used to determine the mapping.
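A hedged sketch of the uniform case: mapping a beam inclination to a range image row from beam_inclination_{min,max}. The assumption that row 0 corresponds to the highest beam is mine, not stated in this file:

import numpy as np

def inclination_to_row(inclination, inc_min, inc_max, height):
    # Fraction of the way from the top (max inclination) to the bottom.
    frac = (inc_max - inclination) / (inc_max - inc_min)
    # Uniformly spaced rows; clamp to a valid row index.
    return int(np.clip(np.floor(frac * height), 0, height - 1))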
Lidar frame to vehicle frame.
'Laser' is used interchangeably with 'Lidar' in this file.
(message has no fields)
Used in:
Row-major matrix. Requires: data.size() = product(shape.dims()).
Used in:
Row-major matrix. Requires: data.size() = product(shape.dims()).
Used in:
Dimensions for the Matrix messages defined in this file. Must not be empty. The order of entries in 'dims' matters, as it indicates the layout of the values in the tensor in-memory representation. The first entry in 'dims' is the outermost dimension used to lay out the values; the last entry is the innermost dimension. This matches the in-memory layout of row-major matrices.
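Because the data is row-major, a plain reshape recovers the layout. A one-line sketch in Python (assuming NumPy and an already-parsed matrix message):

import numpy as np

def matrix_to_numpy(m):
    # Works for both MatrixFloat and MatrixInt32: row-major data plus dims.
    return np.array(m.data).reshape(m.shape.dims)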
A non-self-intersecting 2d polygon. The polygon is not necessarily convex.
Used in:
A globally unique ID.
A range image is a 2d tensor. The first dimension (rows) represents pitch; the second represents yaw. There are two types of range images:
1. Raw range image: a range image with a non-empty 'range_image_pose_compressed', which gives the vehicle pose of each range image cell.
2. Virtual range image: a range image with an empty 'range_image_pose_compressed'. It is constructed by transforming all lidar points into a fixed vehicle frame (usually the vehicle frame of the middle scan).
NOTE: 'range_image_pose_compressed' is only populated for the first range image return. The second return has exactly the same range image pose as the first one.
Used in:
Zlib compressed [H, W, 4] serialized version of MatrixFloat. To decompress:
  string val = ZlibDecompress(range_image_compressed);
  MatrixFloat range_image;
  range_image.ParseFromString(val);
Inner dimensions are:
  * channel 0: range
  * channel 1: intensity
  * channel 2: elongation
  * channel 3: is in any no label zone.
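The same steps in Python, assuming the waymo_open_dataset pip package for the generated bindings (the identical pattern decodes the camera projection field below, with MatrixInt32 in place of MatrixFloat):

import zlib
import numpy as np
from waymo_open_dataset import dataset_pb2

def decode_range_image(range_image_compressed):
    # ZlibDecompress + ParseFromString, then reshape to [H, W, 4].
    matrix = dataset_pb2.MatrixFloat()
    matrix.ParseFromString(zlib.decompress(range_image_compressed))
    return np.array(matrix.data).reshape(matrix.shape.dims)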
Lidar point to camera image projections. A point can be projected to multiple camera images. We pick the first two in the following order: [FRONT, FRONT_LEFT, FRONT_RIGHT, SIDE_LEFT, SIDE_RIGHT].
Zlib compressed [H, W, 6] serialized version of MatrixInt32. To decompress:
  string val = ZlibDecompress(camera_projection_compressed);
  MatrixInt32 camera_projection;
  camera_projection.ParseFromString(val);
Inner dimensions are:
  * channel 0: CameraName.Name of the 1st projection. Set to UNKNOWN if no projection.
  * channel 1: x (axis along image width)
  * channel 2: y (axis along image height)
  * channel 3: CameraName.Name of the 2nd projection. Set to UNKNOWN if no projection.
  * channel 4: x (axis along image width)
  * channel 5: y (axis along image height)
Note: pixel 0 corresponds to the left edge of the first pixel in the image.
Zlib compressed [H, W, 6] serialized version of MatrixFloat. To decompress:
  string val = ZlibDecompress(range_image_pose_compressed);
  MatrixFloat range_image_pose;
  range_image_pose.ParseFromString(val);
Inner dimensions are [roll, pitch, yaw, x, y, z], representing a transform from the vehicle frame to the global frame for every range image pixel. This is ONLY populated for the first return; the second return is assumed to have exactly the same range_image_pose_compressed.
The roll, pitch and yaw are specified as 3-2-1 Euler angle rotations, meaning that rotating from the navigation frame to the vehicle frame consists of a yaw, then pitch, and finally roll rotation about the z, y and x axes respectively. All rotations use the right-hand rule and are positive in the counterclockwise direction.
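A NumPy sketch of the 3-2-1 convention as described: the rotation is composed as yaw about z, then pitch about y, then roll about x (the function name is illustrative):

import numpy as np

def euler_321_to_rotation(roll, pitch, yaw):
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    # Right-handed elementary rotations about z, y and x.
    R_z = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    R_y = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    R_x = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    # 3-2-1 order: yaw, then pitch, then roll.
    return R_z @ R_y @ R_x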
Deprecated, do not use.
4x4 row-major transform matrix that transforms 3d points from one frame to another.
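A short sketch applying such a transform to a set of 3d points (NumPy assumed; names illustrative):

import numpy as np

def apply_transform(transform, points):
    # 16 row-major doubles -> 4x4 matrix.
    t = np.array(transform.matrix).reshape(4, 4)
    # Homogeneous coordinates for (N, 3) points.
    ones = np.ones((points.shape[0], 1))
    points_h = np.hstack([points, ones])
    # Row-vector convention: p' = T @ p, written as p_h @ T.T per row.
    return (points_h @ t.T)[:, :3]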
Used in:
Used in:
Velocity in m/s.
Angular velocity in rad/s.