Returns the score (e.g., log perplexity) of the given text.
Returns image generation results given the text.
Returns image generation results given the text and image.
Returns an image embedding given an image.
Returns bounding box, label, and the score of objects detected given an image.
For open-set detection models, the set of concepts to detect can be passed here. Elements in `text` describe concepts in the image that should be detected; it is up to the detection model to interpret these texts. Some detection models may interpret the text as object names, in which case the corresponding response.bounding_boxes.text must be one of the given text elements.
Optionally accept box inputs to predict region class labels. The boxes are N regions of interest in the image. The .text and .score fields should be unset.
A list of bounding boxes. The bounding boxes have no explicit order.
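For models that treat the `text` elements as object names, the constraint on response.bounding_boxes.text can be checked on the client side. The following is a minimal sketch, not part of the service itself: `Box` is a hypothetical plain-Python stand-in for one response bounding box (not the generated proto class), and `unexpected_labels` is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical stand-in for one response bounding box (assumption)."""
    text: str
    score: float


def unexpected_labels(query_text: List[str], boxes: List[Box]) -> List[str]:
    """Return response labels that were not among the queried text elements."""
    allowed = set(query_text)
    return [box.text for box in boxes if box.text not in allowed]


# Example: "bird" was never requested, so it is reported.
boxes = [Box(text="cat", score=0.9), Box(text="bird", score=0.2)]
print(unexpected_labels(["cat", "dog"], boxes))
```

An empty result means every returned label was one of the requested concepts. Models that interpret `text` more loosely are not bound by this check.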
Returns text generation results given image_bytes.
Optional prefix text.
Returns image generation results given image_bytes.
Returns text generation results given video.
Optional prefix text.
Returns video tokens given a video (tokenization).
Video composed of multiple image frames.
Quantized or soft tokens.
Returns video bytes given video tokens (de-tokenization).
Quantized or soft tokens.
Video composed of multiple image frames.
Coordinates are in pixel space. The upper left corner of the image represents (0.0, 0.0). The bottom right corner of the image represents (image_width, image_height).
A label for the bounding box object.
A positive number which represents the ranking (the higher the better) of the bounding boxes. The semantic meaning of the score is left for each model to define.
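Because a higher score ranks better, a client that wants the best detections first can sort the returned boxes itself. A minimal sketch, assuming each box is represented as a hypothetical `(label, score)` tuple rather than the actual response message; `rank_boxes` is an illustrative name.

```python
from typing import List, Tuple

# Hypothetical stand-in for a response bounding box: (label, score).
LabeledScore = Tuple[str, float]


def rank_boxes(boxes: List[LabeledScore], top_k: int = 0) -> List[LabeledScore]:
    """Sort boxes by descending score; keep only the first top_k if top_k > 0."""
    ranked = sorted(boxes, key=lambda box: box[1], reverse=True)
    return ranked[:top_k] if top_k > 0 else ranked


print(rank_boxes([("dog", 0.3), ("cat", 0.9), ("car", 0.6)], top_k=2))
```

Since the meaning of the score is model-defined, this ordering is only meaningful among boxes returned by the same model.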
When a mask is present, it only contains values inside the bounding box defined by cx, cy, w, h. That is, mask[i, j] corresponds to the image pixel [cx - w/2 + i, cy - h/2 + j], where cx, cy, w, h are given in the BoundingBox. Outside the bounding box the mask is always zero. This enables sending smaller masks, since the mask only needs to cover the bounding box. E.g., MaskRCNN returns masks that are always 28x28 pixels, and they are resized by third_party/cloud_tpu/models/detection/utils/mask_utils.py.
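The index mapping above can be written out directly. This is a sketch under stated assumptions: the mask has already been resized to the box's w by h extent, and the function names `box_corners` and `mask_index_to_pixel` are illustrative, not part of the API.

```python
from typing import Tuple


def box_corners(cx: float, cy: float, w: float, h: float) -> Tuple[float, float, float, float]:
    """Convert a center/size box to corner coordinates in pixel space."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def mask_index_to_pixel(i: int, j: int, cx: float, cy: float,
                        w: float, h: float) -> Tuple[float, float]:
    """Map a box-local mask index (i, j) to image pixel coordinates.

    Follows the documented formula: pixel[cx - w/2 + i, cy - h/2 + j].
    Assumes the mask was already resized to the box size.
    """
    return (cx - w / 2 + i, cy - h / 2 + j)


# A 20x10 box centered at (50, 40): index (0, 0) lands on the box's first corner.
print(box_corners(50.0, 40.0, 20.0, 10.0))
print(mask_index_to_pixel(0, 0, 50.0, 40.0, 20.0, 10.0))
```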
The label of the classified object.
The score of the classified object.
mask represents a C-order 2-D uint8 array. The array's dimensions are given by [mask_height, mask_width]. mask[i, j] / 255 represents the probability of the pixel being in the segment (the range [0, 1] was scaled to [0, 255] for compression).
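Decoding such a mask back into probabilities could look like the following. This is a minimal sketch that assumes the mask arrives as raw bytes, one byte per pixel in row-major (C) order; `decode_mask` is a hypothetical helper name, not an existing function.

```python
from typing import List


def decode_mask(mask: bytes, mask_height: int, mask_width: int) -> List[List[float]]:
    """Decode a C-order uint8 mask into per-pixel probabilities in [0, 1]."""
    if len(mask) != mask_height * mask_width:
        raise ValueError(
            f"expected {mask_height * mask_width} bytes, got {len(mask)}"
        )
    # Row r, column c lives at offset r * mask_width + c in C order.
    return [
        [mask[r * mask_width + c] / 255 for c in range(mask_width)]
        for r in range(mask_height)
    ]


# A 2x2 mask: byte 0 maps to probability 0.0, byte 255 to 1.0.
probabilities = decode_mask(bytes([0, 255, 51, 102]), mask_height=2, mask_width=2)
print(probabilities)
```

The length check guards against mismatched dimensions, which would otherwise silently shift rows.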
The generated image as a byte array. TODO(jianlijianli): decide on an image encoding format; currently PNG.
The score for the generated image.