package xeno.pursuit.proto

message AnnotatedKeyPoint

Used in: ObjectAnnotation

int32 id = 1
optional Point3D point_3d = 2
optional NormalizedPoint2D point_2d = 3

message Edge

The edge connecting two keypoints together.

Used in: Skeleton

int32 source = 1
keypoint id of the edge's source
int32 sink = 2
keypoint id of the edge's sink

message FrameAnnotation

Used in: Sequence

int32 frame_id = 1
Unique frame id, corresponds to images.
repeated ObjectAnnotation annotations = 2
List of the annotated objects in this frame. Depending on how many objects are observable in this frame, we might have none or as many as sequence.objects_size() annotations.
optional research.compvideo.arcapture.ARCamera camera = 3
Information about the camera transformation (in the world coordinate) and imaging characteristics for a captured video frame.
double timestamp = 4
The timestamp for the frame.
repeated float plane_center = 5
Plane center and normal in camera frame.
repeated float plane_normal = 6

message KeyPoint

Used in: Object, Skeleton

float x = 1
The position of the keypoint in the local coordinate system of the rigid object.
float y = 2
float z = 3
float confidence_radius = 4
Sphere around the keypoint, indicating the annotator's confidence in the position, in meters.
string name = 5
The name of the keypoint (e.g. legs, head, etc.). Does not have to be unique.
bool hidden = 6
Indicates whether the keypoint is hidden or not.

message NormalizedPoint2D

Projection of a 3D point on an image, and its metric depth.

Used in: AnnotatedKeyPoint

float x = 1
x-y position of the 2D keypoint in the image coordinate system. u, v ∈ [0, 1], where the top-left corner is (0, 0) and the bottom-right corner is (1, 1).
float y = 2
float depth = 3
The depth of the point in the camera coordinate system (in meters).
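
The normalized coordinates above can be mapped back to pixels once the image size is known. The following is a minimal sketch, not part of the schema: the dataclass is a stand-in for the generated message class, `to_pixels` is a hypothetical helper, and it assumes that x is the horizontal axis and that (1, 1) maps to (width, height).

```python
from dataclasses import dataclass


@dataclass
class NormalizedPoint2D:
    """Stand-in for the NormalizedPoint2D message (not the generated class)."""
    x: float      # normalized position in [0, 1]; assumed horizontal, 0 = left
    y: float      # normalized position in [0, 1]; assumed vertical, 0 = top
    depth: float  # depth in the camera coordinate system, in meters


def to_pixels(point, width, height):
    """Scale a normalized point to pixel coordinates.

    Assumes (1, 1) maps to (width, height); some pipelines use
    (width - 1, height - 1) instead.
    """
    return point.x * width, point.y * height


sample = NormalizedPoint2D(x=0.5, y=0.25, depth=1.2)
print(to_pixels(sample, width=640, height=480))  # (320.0, 120.0)
```

The depth field is left untouched by this conversion because it is already metric.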

message Object

Used in: Sequence

int32 id = 1
Unique object id through a sequence. There might be multiple objects of the same label in this sequence.
string category = 2
Describes what category an object is. E.g. object class, attribute, instance or person identity. This provides additional context for the object type.
Object.Type type = 3
repeated float rotation = 4
3x3 row-major rotation matrix describing the orientation of the rigid object's frame of reference in the world-coordinate system.
repeated float translation = 5
3x1 vector describing the translation of the rigid object's frame of reference in the world-coordinate system in meters.
repeated float scale = 6
3x1 vector describing the scale of the rigid object's frame of reference in the world-coordinate system in meters.
repeated KeyPoint keypoints = 7
List of all the keypoints associated with this object in the object coordinate system. The first keypoint is always the object's frame of reference, e.g. the centroid of the box. For example, a bounding box with its center as the frame of reference has 9 keypoints: {0., 0., 0.}, {-.5, -.5, -.5}, {-.5, -.5, +.5}, {-.5, +.5, -.5}, {-.5, +.5, +.5}, {+.5, -.5, -.5}, {+.5, -.5, +.5}, {+.5, +.5, -.5}, {+.5, +.5, +.5}. To get the bounding box in the world-coordinate system, we first scale the box and then transform the scaled box; that is, the bounding box in the world-coordinate system is rotation * scale * keypoints + translation.
Object.Method method = 8
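
The transformation described for the keypoints field (rotation * scale * keypoints + translation) can be sketched in plain Python. This is an illustration under stated assumptions: the helper names are hypothetical, scale is applied per axis, and the example rotation, translation and scale values are made up.

```python
def mat_vec(matrix, vec):
    """Multiply a 3x3 row-major matrix (flat list of 9 floats) by a 3-vector."""
    return [sum(matrix[3 * r + c] * vec[c] for c in range(3)) for r in range(3)]


def keypoint_to_world(keypoint, rotation, translation, scale):
    """Apply rotation * scale * keypoint + translation to one keypoint."""
    scaled = [keypoint[i] * scale[i] for i in range(3)]  # per-axis scale (assumed)
    rotated = mat_vec(rotation, scaled)
    return [rotated[i] + translation[i] for i in range(3)]


# Unit box in object coordinates: center first, then the 8 corners,
# in the same order as the keypoint listing in the field description.
UNIT_BOX = [(0.0, 0.0, 0.0)] + [
    (x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)
]

# Made-up pose: 90 degree rotation about z, box stretched along its own x axis.
rotation = [0.0, -1.0, 0.0,
            1.0, 0.0, 0.0,
            0.0, 0.0, 1.0]
translation = [1.0, 0.0, 0.0]
scale = [2.0, 1.0, 1.0]

world_box = [keypoint_to_world(k, rotation, translation, scale) for k in UNIT_BOX]
print(world_box[0])  # [1.0, 0.0, 0.0]  (the center lands on the translation)
print(world_box[8])  # [0.5, 1.0, 0.5]
```

Scaling before rotating keeps the scale expressed along the object's own axes, which matches the "scale the box, then transform the scaled box" order in the description.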

enum Object.Method

Enum to reflect how this object is created.

Used in: Object

UNKNOWN_METHOD = 0
ANNOTATION = 1
Created by data annotation.
AUGMENTATION = 2
Created by data augmentation.

enum Object.Type

Used in: Object

UNDEFINED_TYPE = 0
BOUNDING_BOX = 1
SKELETON = 2
MESH = 3

message ObjectAnnotation

Used in: FrameAnnotation

int32 object_id = 1
Reference to the identifier (id) of the corresponding Object.
repeated AnnotatedKeyPoint keypoints = 2
For each object, list all the annotated keypoints here. E.g. for bounding boxes we have 8 keypoints, for hands 21 keypoints, etc. These normalized points are the projection of the Object's 3D keypoints using the current frame's camera pose.
float visibility = 3
Visibility of this annotation in a frame.

message Point3D

The 3D point in the camera coordinate system; the scales are in meters.

Used in: AnnotatedKeyPoint

float x = 1
float y = 2
float z = 3

message Sequence

The sequence protocol contains the annotation data for the entire video clip.

repeated Object objects = 1
List of all the annotated 3D objects in this sequence in the world-coordinate system. Given the camera poses of each frame (also in the world-coordinate system), these objects' bounding boxes can be projected to each frame to get the per-frame annotation (i.e. image_annotation below).
repeated FrameAnnotation frame_annotations = 2
List of the annotated data for each frame in the sequence, plus frame information.
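
A Sequence ties the two lists together through ids: each ObjectAnnotation carries an object_id that points at an entry in objects. The sketch below shows that join using plain dictionaries as a stand-in for the parsed message (the generated Python module is not shown in this listing, so none is imported); `objects_per_frame` and the sample values are hypothetical.

```python
# Plain-dict stand-in for a parsed Sequence message; values are made up.
sequence = {
    "objects": [
        {"id": 0, "category": "chair"},
        {"id": 1, "category": "chair"},
    ],
    "frame_annotations": [
        {"frame_id": 0, "annotations": [{"object_id": 0, "visibility": 1.0}]},
        {"frame_id": 1, "annotations": [{"object_id": 0, "visibility": 0.4},
                                        {"object_id": 1, "visibility": 0.9}]},
        {"frame_id": 2, "annotations": []},  # no object observable
    ],
}


def objects_per_frame(seq):
    """Map each frame_id to the (id, category) pairs annotated in that frame."""
    by_id = {obj["id"]: obj for obj in seq["objects"]}
    result = {}
    for frame in seq["frame_annotations"]:
        result[frame["frame_id"]] = [
            (ann["object_id"], by_id[ann["object_id"]]["category"])
            for ann in frame["annotations"]
        ]
    return result


print(objects_per_frame(sequence))
# {0: [(0, 'chair')], 1: [(0, 'chair'), (1, 'chair')], 2: []}
```

A frame may list anywhere from zero annotations up to one per object in the sequence, which is why frame 2 maps to an empty list here.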

message Skeleton

The skeleton template for different objects (e.g. humans, chairs, hands, etc.). The annotation tool reads the skeleton template dictionary.

Used in: Skeletons

int32 reference_keypoint = 1
The origin keypoint in the object coordinate system (i.e. the point (0, 0, 0)).
string category = 2
The skeleton's category (e.g. human, chair, hand). Should be unique in the dictionary.
repeated KeyPoint keypoints = 3
Initialization values for all the keypoints in the skeleton, in the object's local coordinate system. Pursuit will transform these points using the object's transformation to get the keypoints in the world-coordinate system.
repeated Edge edges = 4
List of edges connecting keypoints
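
Two constraints on a skeleton template can be checked mechanically: every edge should reference existing keypoints, and each category should appear only once in the dictionary. The following sketch uses dataclasses as stand-ins for the generated classes. It assumes that the keypoint ids in an Edge are indices into the skeleton's keypoints list, which the schema does not state explicitly, and both helper functions are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Edge:
    """Stand-in for the Edge message."""
    source: int
    sink: int


@dataclass
class Skeleton:
    """Stand-in for the Skeleton message; keypoints reduced to their names."""
    reference_keypoint: int
    category: str
    keypoint_names: list = field(default_factory=list)
    edges: list = field(default_factory=list)


def invalid_edges(skeleton):
    """Return the edges whose source or sink is not a valid keypoint index."""
    count = len(skeleton.keypoint_names)
    return [
        edge for edge in skeleton.edges
        if not (0 <= edge.source < count and 0 <= edge.sink < count)
    ]


def duplicate_categories(skeletons):
    """Return the categories that occur more than once in a list of skeletons."""
    counts = Counter(s.category for s in skeletons)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical three-keypoint template; the last edge is deliberately broken.
stick = Skeleton(
    reference_keypoint=0,
    category="stick",
    keypoint_names=["base", "middle", "tip"],
    edges=[Edge(0, 1), Edge(1, 2), Edge(2, 5)],
)
print(invalid_edges(stick))                  # [Edge(source=2, sink=5)]
print(duplicate_categories([stick, stick]))  # ['stick']
```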

message Skeletons

The list of all the modeled skeletons in our library. These models can be objects (chairs, desks, etc.), humans (full pose, hands, faces, etc.), or boxes. We can have multiple skeletons in the same file.

repeated Skeleton object = 1

package xeno.pursuit.proto

message AnnotatedKeyPoint

int32 id = 1

optional Point3D point_3d = 2

optional NormalizedPoint2D point_2d = 3

message Edge

int32 source = 1

int32 sink = 2

message FrameAnnotation

int32 frame_id = 1

repeated ObjectAnnotation annotations = 2

optional research.compvideo.arcapture.ARCamera camera = 3

double timestamp = 4

repeated float plane_center = 5

repeated float plane_normal = 6

message KeyPoint

float x = 1

float y = 2

float z = 3

float confidence_radius = 4

string name = 5

bool hidden = 6

message NormalizedPoint2D

float x = 1

float y = 2

float depth = 3

message Object

int32 id = 1

string category = 2

Object.Type type = 3

repeated float rotation = 4

repeated float translation = 5

repeated float scale = 6

repeated KeyPoint keypoints = 7

Object.Method method = 8

enum Object.Method

UNKNOWN_METHOD = 0

ANNOTATION = 1

AUGMENTATION = 2

enum Object.Type

UNDEFINED_TYPE = 0

BOUNDING_BOX = 1

SKELETON = 2

MESH = 3

message ObjectAnnotation

int32 object_id = 1

repeated AnnotatedKeyPoint keypoints = 2

float visibility = 3

message Point3D

float x = 1

float y = 2

float z = 3

message Sequence

repeated Object objects = 1

repeated FrameAnnotation frame_annotations = 2

message Skeleton

int32 reference_keypoint = 1

string category = 2

repeated KeyPoint keypoints = 3

repeated Edge edges = 4

message Skeletons

repeated Skeleton object = 1