Configuration message for the AdamOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
Used in:
Default value for epsilon (1e-8) matches the default value in tf.train.AdamOptimizer. This differs from the TF2 default of 1e-7 in tf.keras.optimizers.Adam.
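For reference, a minimal optimizer block as it might appear in a pipeline.config (the constant learning rate is an illustrative placeholder, not a recommendation):

optimizer {
  adam_optimizer {
    learning_rate {
      constant_learning_rate {
        learning_rate: 1e-3  # illustrative value
      }
    }
    epsilon: 1e-8  # matches the tf.train.AdamOptimizer default noted above
  }
  use_moving_average: false
}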
Configuration proto for the anchor generator to use in the object detection pipeline. See core/anchor_generator.py for details.
Used in:
Used in:
The base sizes in pixels for each anchor in this anchor layer.
The aspect ratios for each anchor in this anchor layer.
The anchor height stride in pixels.
The anchor width stride in pixels.
The anchor height offset in pixels.
The anchor width offset in pixels.
Configuration proto for ArgMaxMatcher. See matchers/argmax_matcher.py for details.
Used in:
Threshold for positive matches.
Threshold for negative matches.
Whether to construct ArgMaxMatcher without thresholds.
If True, then negative matches are the ones below the unmatched_threshold, whereas ignored matches are in between the matched and unmatched thresholds. If False, then negative matches are in between the matched and unmatched thresholds, and everything lower than unmatched is ignored.
Whether to ensure each row is matched to at least one column.
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather
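For illustration, a typical argmax_matcher block in the style of public SSD configs (threshold values are common defaults, not prescriptions):

matcher {
  argmax_matcher {
    matched_threshold: 0.5
    unmatched_threshold: 0.5
    ignore_thresholds: false
    negatives_lower_than_unmatched: true
    force_match_for_each_row: true
    use_matmul_gather: true  # matrix-multiplication-based gather, e.g. for TPU
  }
}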
Apply an Autoaugment policy to the image and bounding boxes.
Used in:
What AutoAugment policy to apply to the Image
Configuration proto for non-max-suppression operation on a batch of detections.
Used in:
Scalar threshold for score (low scoring boxes are removed).
Scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
Maximum number of detections to retain per class.
Maximum number of detections to retain across all classes.
Whether to use the implementation of NMS that guarantees static shapes.
Whether to use class-agnostic NMS. The class-agnostic NMS function implements a class-agnostic version of Non-Maximum Suppression where, if max_classes_per_detection=k, 1) we keep the top-k scores for each detection and 2) during NMS, each detection only uses the highest class score for sorting. 3) Compared to regular NMS, the worst-case runtime of this version is O(N^2) instead of O(KN^2), where N is the number of detections and K is the number of classes.
Soft NMS sigma parameter; see Bodla et al., https://arxiv.org/abs/1704.04503.
Whether to use partitioned version of non_max_suppression.
Whether to use tf.image.combined_non_max_suppression.
Whether to change coordinate frame of the boxlist to be relative to window's frame.
Use hard NMS. Note that even if this field is set to false, the behavior of NMS will be equivalent to hard NMS; setting this field to true forces the tf.image.non_max_suppression function to be called instead of tf.image.non_max_suppression_with_scores, and can be used to export models for older versions of TF.
Use CPU NMS. NMSV3/NMSV4 runs on GPU by default, which may cause OOM issues if the model and/or the batch size is large during training. Setting this flag to true moves the NMS op to the CPU when OOM happens. The flag is not needed if use_hard_nms = false, as soft NMS currently runs on the CPU by default.
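A sketch of how these fields combine inside a post_processing block (values are illustrative):

post_processing {
  batch_non_max_suppression {
    score_threshold: 1e-8
    iou_threshold: 0.6
    max_detections_per_class: 100
    max_total_detections: 100
  }
  score_converter: SIGMOID
}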
Configuration proto for batch norm to apply after convolution op. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
Used in:
Whether to train the batch norm variables. If this is set to false during training, the current values of the batch_norm variables are used for the forward pass but they are never updated.
Configuration for Bidirectional Feature Pyramid Networks.
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
The number of repeated top-down bottom-up iterations for BiFPN-based feature extractors (bidirectional feature pyramid networks).
The number of filters (channels) to use in feature pyramid layers for BiFPN-based feature extractors (bidirectional feature pyramid networks).
Method used to combine inputs to BiFPN nodes.
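A hypothetical bifpn block within an SSD feature extractor; the extractor type string and values loosely follow public EfficientDet-D0 configs and should be treated as assumptions, not an authoritative setup:

feature_extractor {
  type: 'ssd_efficientnet-b0_bifpn_keras'  # assumed extractor type string
  bifpn {
    min_level: 3
    max_level: 7
    num_iterations: 3
    num_filters: 64
  }
}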
Configuration proto for bipartite matcher. See matchers/bipartite_matcher.py for details.
Used in:
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather
Classification loss using a sigmoid function over the class prediction with the highest prediction score.
Used in:
Interpolation weight between 0 and 1.
Whether hard bootstrapping should be used or not. If true, will only use one class favored by the model. Otherwise, will use all predicted class probabilities.
DEPRECATED, do not use. Output loss per anchor.
Configuration proto for the box coder to be used in the object detection pipeline. See core/box_coder.py for details.
Used in:
Configuration proto for box predictor. See core/box_predictor.py for details.
Used in:
Message wrapper for various calibration configurations.
Used in:
Class-agnostic calibration via linear interpolation (usually output from isotonic regression).
Per-class calibration via linear interpolation.
Class-agnostic sigmoid calibration.
Per-class sigmoid calibration.
Temperature scaling calibration.
Used in:
Number of classes to predict.
Feature extractor config.
Image resizer for preprocessing the input image.
If set, all task heads will be constructed with separable convolutions.
Path of the file that contains the label map along with the keypoint information, including the keypoint indices, corresponding labels, and the corresponding class. The file should be the same one as used in the input pipeline. Note that a plain-text StringIntLabelMap proto is expected in this file. It is required only if the keypoint estimation task is specified.
Parameters which are related to DensePose estimation task. http://densepose.org/
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Class ID (0-indexed) that corresponds to the object in the label map that contains DensePose data.
Loss configuration for DensePose heatmap and regression losses. Note that the localization loss is used for surface coordinate losses and classification loss is used for part classification losses.
The number of body parts.
Loss weights for the two DensePose heads.
Whether to upsample the prediction feature maps back to the original input dimension prior to applying loss. This has the benefit of maintaining finer groundtruth location information.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1.
Parameters which are related to keypoint estimation task.
Used in:
Name of the task, e.g. "human pose". Note that the task name should be unique to each keypoint task.
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Loss configuration for keypoint heatmap, offset, regression losses. Note that the localization loss is used for offset/regression losses and classification loss is used for heatmap loss.
The name of the class that contains the keypoints for this task. This is used to retrieve the corresponding keypoint indices from the label map. Note that this corresponds to the "name" field, not "display_name".
The standard deviation of the Gaussian kernel used to generate the keypoint heatmap, in output-image pixels. This provides the flexibility of using a different Gaussian kernel size for each keypoint class. Note that if provided, the keypoint standard deviations will be overridden by the values specified here; otherwise, the default value 5.0 will be used. TODO(yuhuic): Update the default value once we find the best value.
Loss weights corresponding to different heads.
The initial bias value of the convolution kernel of the keypoint heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1. See "Focal Loss for Dense Object Detection" at https://arxiv.org/abs/1708.02002.
The heatmap score threshold for a keypoint to become a valid candidate.
The maximum number of candidates to retrieve for each keypoint.
Max pool kernel size to use to pull off peak score locations in a neighborhood (independently for each keypoint types).
The default score to use for regressed keypoints that are not successfully snapped to a nearby candidate.
The multiplier to expand the bounding boxes (either the provided boxes or those which tightly cover the regressed keypoints). Note that the new expanded box for an instance becomes the feasible search window for all associated keypoints.
The scale parameter that multiplies the largest dimension of a bounding box. The resulting distance becomes a search radius for candidates in the vicinity of each regressed keypoint.
One of ['min_distance', 'score_distance_ratio'] indicating how to select the keypoint candidate.
The radius (in units of output pixels) around a heatmap peak to assign the offset targets. If set to 0, the offset target will only be assigned to the heatmap peak (same behavior as the original paper).
Indicates whether to assign offsets for each keypoint channel separately. If set to False, the output offset target has the shape [batch_size, out_height, out_width, 2] (same behavior as the original paper). If set to True, the output offset target has the shape [batch_size, out_height, out_width, 2 * num_keypoints] (recommended when offset_peak_radius is not zero).
Parameters which are related to mask estimation task. Note: Currently, CenterNet supports a weak instance segmentation, where semantic segmentation masks are estimated, and then cropped based on bounding box detections. Therefore, it is possible for the same image pixel to be assigned to multiple instances.
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Classification loss configuration for segmentation loss.
Each instance mask (one per detection) is cropped and resized (bilinear resampling) from the predicted segmentation feature map. After resampling, the masks are binarized with the provided score threshold.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1.
Parameters related to object center prediction. This is required for both object detection and keypoint estimation tasks.
Used in:
Weight for the object center loss.
Classification loss configuration for object center loss.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1. See "Focal Loss for Dense Object Detection" at https://arxiv.org/abs/1708.02002.
The minimum IOU overlap boxes need to have to not be penalized.
Maximum number of boxes to predict.
If set, loss is only computed for the labeled classes.
Parameters which are related to object detection task.
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
Weight for the offset localization loss.
Weight for the height/width localization loss.
Localization loss configuration for object scale and offset losses.
Temporal offset prediction head similar to CenterTrack. Note that our current implementation adopts an LSTM, unlike the original paper. See go/lstd-centernet for more details. Tracking Objects as Points: https://arxiv.org/abs/2004.01177
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
Localization loss configuration for offset loss.
Parameters which are related to the tracking embedding estimation task. A Simple Baseline for Multi-Object Tracking: https://arxiv.org/abs/2004.01888
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
The maximum track ID of the dataset.
The embedding size for re-identification (ReID) task in tracking.
The number of (fully-connected, batch-norm, relu) layers for track ID classification head. The output dimension of each intermediate FC layer will all be 'reid_embed_size'. The last FC layer will directly project to the track ID classification space of size 'num_track_ids' without batch-norm and relu layers.
Classification loss configuration for ReID loss.
Used in:
Channel means to be subtracted from each image channel. If not specified, we use a default value of 0.
Channel standard deviations. Each channel will be normalized by dividing it by its standard deviation. If not specified, we use a default value of 1.
If set, will change channel order to be [blue, green, red]. This can be useful to be compatible with some pre-trained feature extractors.
If set, the feature upsampling layers will be constructed with separable convolutions. This typically applies to the feature pyramid network, if any.
Used in:
Message for class-specific domain/range mapping for function approximations.
Used in:
Message mapping class ids to indices.
Message for class-specific Sigmoid Calibration.
Used in:
Message mapping class index to Sigmoid Parameters.
Configuration for class prediction loss function.
Used in:
A message to evaluate COCO keypoint metrics for a specific class.
Used in:
Identifies the class of object to which keypoints belong. By default this should use the class's "display_name" in the label map.
Keypoint-specific standard deviations for COCO keypoint metrics, which control how OKS is computed. See http://cocodataset.org/#keypoints-eval for details. If your keypoints are similar to the COCO keypoints, use the precomputed standard deviations below:
  "nose": 0.026
  "left_eye": 0.025
  "right_eye": 0.025
  "left_ear": 0.035
  "right_ear": 0.035
  "left_shoulder": 0.079
  "right_shoulder": 0.079
  "left_elbow": 0.072
  "right_elbow": 0.072
  "left_wrist": 0.062
  "right_wrist": 0.062
  "left_hip": 0.107
  "right_hip": 0.107
  "left_knee": 0.087
  "right_knee": 0.087
  "left_ankle": 0.089
  "right_ankle": 0.089
Configuration proto for image resizer that resizes only if input image height or width is greater or smaller than a certain size. Aspect ratio is maintained.
Used in:
Condition which must be true to resize the image.
Threshold for the image size. If any image dimension is above or below this (as specified by condition) the image will be resized so that it meets the threshold.
Desired method when resizing image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Enumeration for the condition on which to resize an image.
Used in:
Default value.
Resizes image if a dimension is greater than specified size.
Resizes image if a dimension is smaller than specified size.
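A minimal sketch, assuming the field names condition and size_threshold from image_resizer.proto (values are placeholders):

image_resizer {
  conditional_shape_resizer {
    condition: GREATER     # resize only if a dimension exceeds the threshold
    size_threshold: 1024
  }
}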
Configuration message for a constant learning rate.
Used in:
Configuration proto for Context. Next id: 4
Used in:
The maximum number of contextual features per image, used for padding.
The bottleneck feature dimension of the attention block.
The attention temperature.
The context feature length.
Converts class logits to softmax scores, optionally scaling the values by a temperature first.
Used in:
Scale to use on logits before applying softmax.
Configuration proto for Convolutional box predictor. Next id: 13
Used in:
Hyperparameters for convolution ops used in the box predictor.
Minimum feature depth prior to predicting box encodings and class predictions.
Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
Number of the additional conv layers before the predictor.
Whether to use dropout for class prediction.
Keep probability for dropout.
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height).
Size of the encoding for boxes.
Whether to apply sigmoid to the output of class predictions. TODO(jonathanhuang): Do we need this since we have a post-processing module?
Whether to use depthwise separable convolution for box predictor layers.
If specified, apply clipping to box encodings.
Used in:
Configuration message for a cosine decaying learning rate as defined in object_detection/utils/learning_schedules.py
Used in:
Top level configuration for DetectionModels.
Used in:
This can be used to define experimental models. To define your own experimental meta architecture, populate a key in the model_builder.EXPERIMENTAL_META_ARCH_BUILDER_MAP dict and set its value to a function that builds your model.
Randomly drops ground truth boxes for a label with some probability.
Used in:
The label that should be dropped. This corresponds to one of the entries in the label map.
Probability of dropping the label.
Message for configuring DetectionModel evaluation jobs (eval.py). Next id: 35
Used in:
Number of visualization images to generate.
Number of examples to process for evaluation.
How often to run evaluation.
Maximum number of times to run evaluation. If set to 0, will run forever.
Whether the TensorFlow graph used for evaluation should be saved to disk.
Path to directory to store visualizations in. If empty, visualization images are not exported (only shown on Tensorboard).
BNS name of the TensorFlow master.
Type of metrics to use for evaluation.
Type of metrics to use for evaluation. Unlike `metrics_set` above, this field allows configuring evaluation metric through config files.
Path to export detections to COCO compatible JSON format.
Option to not read groundtruth labels and only export detections to COCO-compatible JSON file.
Use exponential moving averages of variables for evaluation. TODO(rathodv): When this is false make sure the model is constructed without moving averages in restore_fn.
Whether to evaluate instance masks. Note that since there is no evaluation code currently for instance segmentation this option is unused.
Minimum score threshold for a detected object box to be visualized.
Maximum number of detections to visualize.
When drawing a single detection, each label is by default visualized as <label name> : <label score>. One can skip the name and/or score using the following fields:
Whether to show groundtruth boxes in addition to detected boxes in visualizations.
Box color for visualizing groundtruth boxes.
Whether to keep image identifier in filename when exported to visualization_export_dir.
Whether to retain original images (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard.
If True, additionally include per-category metrics.
Optional super-category definitions: keys are super-category names; values are comma-separated categories (assumed to correspond to category names (`display_name`) in the label map).
Recall range within which precision should be computed.
Whether to retain additional channels (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard.
When this flag is set, images are not resized during evaluation. When this flag is not set (the default case), images are resized according to the image_resizer config in the model during evaluation.
Whether to use a dummy loss in eval so model.loss() is not executed.
Specifies which keypoints should be connected by an edge, which may improve visualization. An example would be human pose estimation where certain joints can be connected.
The "groundtruth_labeled_classes" field indicates which classes have been labeled on the images. If skip_predictions_for_unlabeled_class is set, detector predictions that do not match to the groundtruth_labeled_classes will be ignored. This is useful for evaluating on test data that are not exhaustively labeled.
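A small eval_config sketch combining several of the fields above (dataset size and thresholds are placeholders):

eval_config {
  metrics_set: "coco_detection_metrics"
  num_examples: 8000           # placeholder dataset size
  num_visualizations: 10
  min_score_threshold: 0.2
  include_metrics_per_category: true
}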
Used in:
Configuration message for an exponentially decaying learning rate. See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
Used in:
An externally defined input reader. Users may define an extension to this proto to interface their own input readers.
Used in:
(message has no fields)
Configuration for Faster R-CNN models. See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py Naming conventions: Faster R-CNN models have two stages: a first stage region proposal network (or RPN) and a second stage box classifier. We thus use the prefixes `first_stage_` and `second_stage_` to indicate the stage to which each parameter pertains when relevant.
Used in:
Whether to construct only the Region Proposal Network (RPN).
Number of classes to predict.
Image resizer for preprocessing the input image.
Feature extractor config.
Anchor generator to compute RPN anchors.
Atrous rate for the convolution op applied to the `first_stage_features_to_crop` tensor to obtain box predictions.
Hyperparameters for the convolutional RPN box predictor.
Kernel size to use for the convolution op just prior to RPN box predictions.
Output depth for the convolution op just prior to RPN box predictions.
The batch size to use for computing the first stage objectness and location losses.
Fraction of positive examples per image for the RPN.
Non max suppression score threshold applied to first stage RPN proposals.
Non max suppression IOU threshold applied to first stage RPN proposals.
Maximum number of RPN proposals retained after first stage postprocessing.
First stage RPN localization loss weight.
First stage RPN objectness loss weight.
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling.
Kernel size of the max pool op on the cropped feature map during ROI pooling.
Stride of the max pool op on the cropped feature map during ROI pooling.
Hyperparameters for the second stage box predictor. If box predictor type is set to rfcn_box_predictor, a R-FCN model is constructed, otherwise a Faster R-CNN model is constructed.
The batch size per image used for computing the classification and refined location loss of the box classifier. Note that this field is ignored if `hard_example_miner` is configured.
Fraction of positive examples to use per image for the box classifier.
Post processing to apply on the second stage box classifier predictions. Note: the `score_converter` provided to the FasterRCNNMetaArch constructor is taken from this `second_stage_post_processing` proto.
Second stage refined localization loss weight.
Second stage classification loss weight.
Second stage instance mask loss weight. Note that this is only applicable when `MaskRCNNBoxPredictor` is selected for second stage and configured to predict instance masks.
If not left to default, applies hard example mining only to classification and localization loss.
Loss for second stage box classifiers, supports Softmax and Sigmoid. Note that the score converter must be consistent with the loss type. When there are multiple labels assigned to the same boxes, it is recommended to use Sigmoid loss and enable merge_multiple_label_boxes. If not specified, Softmax loss is used as the default.
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters.
Force the use of matrix multiplication based crop and resize instead of standard tf.image.crop_and_resize while computing second stage input feature maps.
Normally, anchors generated for a given image size are pruned during training if they lie outside the image window. Setting this option to true, clips the anchors to be within the image instead of pruning.
After performing matching between anchors and targets, in order to pull out targets for training the Faster R-CNN meta architecture we perform a gather operation. This option specifies whether to use an alternate implementation of tf.gather that is faster on TPUs.
Whether to use the balanced positive negative sampler implementation with static shape guarantees.
If True, uses implementation of ops with static shape guarantees.
Whether the masks present in groundtruth should be resized in the model to match the image size.
If True, uses implementation of ops with static shape guarantees when running evaluation (specifically not is_training if False).
If true, uses implementation of partitioned_non_max_suppression in first stage.
Whether to return raw detections (pre NMS).
Whether to use tf.image.combined_non_max_suppression.
Whether to output final box feature. If true, it will crop the feature map in the postprocess() method based on the final predictions.
Configs for context model.
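A skeletal faster_rcnn model block illustrating how the first- and second-stage fields fit together; values follow common public configs and should be read as a sketch, not a tuned setup:

model {
  faster_rcnn {
    num_classes: 90
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
  }
}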
Configuration proto for FasterRCNNBoxCoder. See box_coders/faster_rcnn_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center.
Scale factor for anchor encoded box height.
Scale factor for anchor encoded box width.
Used in:
Type of Faster R-CNN model (e.g., 'faster_rcnn_resnet101'; See builders/model_builder.py for expected types).
Output stride of extracted RPN feature map.
Whether to update batch norm parameters during training or not. When training with a relatively large batch size (e.g. 8), it could be desirable to enable batch norm update.
Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor.
If the value is set to true, the base feature extractor's hyperparams will be overridden with the `conv_hyperparams`.
The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even.
Feature Pyramid Networks config.
Configuration for Feature Pyramid Networks.
We recommend using multi_resolution_feature_map_generator with FPN, and the levels there must match the levels defined below for better performance. Correspondence from FPN levels to Resnet/Mobilenet V1 feature maps:

  FPN Level   Resnet Feature Map   Mobilenet-V1 Feature Map
  2           Block 1              Conv2d_3_pointwise
  3           Block 2              Conv2d_5_pointwise
  4           Block 3              Conv2d_11_pointwise
  5           Block 4              Conv2d_13_pointwise
  6           Bottomup_5           bottom_up_Conv2d_14
  7           Bottomup_6           bottom_up_Conv2d_15
  8           Bottomup_7           bottom_up_Conv2d_16
  9           Bottomup_8           bottom_up_Conv2d_17
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
Channel depth for additional coarse feature layers.
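A representative fpn block inside an SSD feature extractor; the extractor type string is one public example, and placing additional_layer_depth inside fpn follows the fields listed above (a sketch, not authoritative):

feature_extractor {
  type: 'ssd_resnet50_v1_fpn'
  fpn {
    min_level: 3
    max_level: 7
    additional_layer_depth: 256  # depth for the extra coarse layers
  }
}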
Configuration proto for image resizer that resizes to a fixed shape.
Used in:
Desired height of image in pixels.
Desired width of image in pixels.
Desired method when resizing image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Used in:
Whether to produce anchors in normalized coordinates.
Message for class-agnostic domain/range mapping for function approximations.
Used in:
Message mapping class labels to indices.
Message to configure graph rewriter for the tf graph.
Used in:
Configuration proto for GridAnchorGenerator. See anchor_generators/grid_anchor_generator.py for details.
Used in:
Anchor height in pixels.
Anchor width in pixels.
Anchor stride in height dimension in pixels.
Anchor stride in width dimension in pixels.
Anchor height offset in pixels.
Anchor width offset in pixels.
List of scales for the anchors.
List of aspect ratios for the anchors.
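A grid_anchor_generator sketch matching the Faster R-CNN first stage (strides of 16 correspond to the typical feature stride; values are illustrative):

first_stage_anchor_generator {
  grid_anchor_generator {
    scales: [0.25, 0.5, 1.0, 2.0]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}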
Configuration proto for group normalization to apply after convolution op. https://arxiv.org/abs/1803.08494
Used in:
(message has no fields)
Configuration for hard example miner.
Used in:
Maximum number of hard examples to be selected per image (prior to enforcing the max negative-to-positive ratio constraint). If set to 0, all examples obtained after NMS are considered.
Minimum intersection over union for an example to be discarded during NMS.
Maximum number of negatives to retain for each positive anchor. If num_negatives_per_positive is 0 no prespecified negative:positive ratio is enforced.
Minimum number of negative anchors to sample for a given image. Setting this to a positive number allows sampling negatives in an image without any positive anchors, and thus avoids biasing the model towards having at least one detection per image.
Whether to use classification losses ('cls', default), localization losses ('loc') or both losses ('both'). In the case of 'both', cls_loss_weight and loc_loss_weight are used to compute weighted sum of the two losses.
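An illustrative hard_example_miner block in the style of public SSD configs (values are examples only):

hard_example_miner {
  num_hard_examples: 3000
  iou_threshold: 0.99
  loss_type: CLASSIFICATION
  max_negatives_per_positive: 3
  min_negatives_per_image: 3
}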
Used in:
Configuration proto for the convolution op hyperparameters to use in the object detection pipeline.
Used in:
Regularizer for the weights of the convolution op.
Initializer for the weights of the convolution op.
Note that if nothing below is selected, then no normalization is applied. BatchNorm hyperparameters.
GroupNorm hyperparameters. This is only supported on a subset of models. Note that the current implementation of group norm instantiated in tf.contrib.group.layers.group_norm() only supports fixed_size_resizer for image preprocessing.
Whether depthwise convolutions should be regularized. If this parameter is NOT set then the conv hyperparams will default to the parent scope.
By default, use_bias is set to False if batch_norm is not None and batch_norm.center is True. When force_use_bias is set to True, this behavior will be overridden, and use_bias will be set to True, regardless of batch norm parameters. Note, this only applies to KerasLayerHyperparams.
Type of activation to apply after convolution.
Used in:
Use None (no activation)
Use tf.nn.relu
Use tf.nn.relu6
Use tf.nn.swish
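Putting the pieces together, a typical conv_hyperparams block might read as follows (a sketch; the regularizer and initializer values are illustrative):

conv_hyperparams {
  regularizer {
    l2_regularizer {
      weight: 0.00004
    }
  }
  initializer {
    truncated_normal_initializer {
      mean: 0.0
      stddev: 0.03
    }
  }
  activation: RELU_6
  batch_norm {
    decay: 0.9997
    center: true
    scale: true
    epsilon: 0.001
    train: true
  }
}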
Operations affected by hyperparameters.
Used in:
Convolution, Separable Convolution, Convolution transpose.
Fully connected
Used in:
(message has no fields)
Configuration proto for image resizing operations. See builders/image_resizer_builder.py for details.
Used in:
Proto with one-of field for initializers.
Used in:
Next id: 35
Used in:
Name of input reader. Typically used to describe the dataset that is read by this input reader.
Path to StringIntLabelMap pbtxt file specifying the mapping from string labels to integer ids.
Whether data should be processed in the order they are read in, or shuffled randomly.
Buffer size to be used when shuffling.
Buffer size to be used when shuffling file names.
The number of times a data source is read. If set to zero, the data source will be reused indefinitely.
Integer representing how often an example should be sampled. To feed only 1/3 of your data into your model, set `sample_1_of_n_examples` to 3. This is particularly useful for evaluation, where you might not want to evaluate all of your samples.
Number of file shards to read in parallel. When sample_from_datasets_weights are configured, num_readers is applied for each dataset.
Number of batches to produce in parallel. If this is run on a 2x2 TPU set this to 8.
Number of batches to prefetch. Prefetch decouples input pipeline and model so they can be pipelined resulting in higher throughput. Set this to a small constant and increment linearly until the improvements become marginal or you exceed your cpu memory budget. Setting this to -1, automatically tunes this value for you.
Maximum number of records to keep in reader queue.
Minimum number of records to keep in reader queue. A large value is needed to generate a good random shuffle.
Number of records to read from each reader at once.
Number of decoded records to prefetch before batching.
Number of parallel decode ops to apply.
If positive, TfExampleDecoder will try to decode rasters of additional channels from tf.Examples.
Number of groundtruth keypoints per object.
Keypoint weights. These weights can be used to apply per-keypoint loss multipliers. The size of this field should agree with `num_keypoints`.
Maximum number of boxes to pad to during training / evaluation. Set this to at least the maximum amount of boxes in the input data, otherwise some groundtruth boxes may be clipped.
Whether to load multiclass scores from the dataset.
Whether to load context features from the dataset.
Whether to load groundtruth instance masks.
Type of instance mask.
Whether to load DensePose data. If set, must also set load_instance_masks to true.
Whether to load track information.
Whether to use the display name when decoding examples. This is only used when mapping class text strings to integers.
Whether to include the source_id string in the input features.
Whether the input data type is tf.Examples or tf.SequenceExamples.
Which frame to choose from the input if Sequence Example. -1 indicates random choice.
When multiple input files are configured, we can sample across them based on weights. The number of weights must match the number of input files configured. When set, shuffling, shuffle buffer size, and num_readers settings are applied individually to each dataset. Implementation follows tf.data.experimental.sample_from_datasets sampling strategy.
Expand labels to ancestors or descendants in the hierarchy for positive and negative labels, respectively.
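A minimal train input reader sketch (paths are placeholders to be configured):

train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record-?????-of-00100"
  }
  shuffle: true
  num_readers: 4
}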
Input type format: whether inputs are TfExamples or TfSequenceExamples.
Used in:
Default implementation, currently TF_EXAMPLE
TfExample input
TfSequenceExample Input
Instance mask format. Note that PNG masks are much more space efficient.
Used in:
Default implementation, currently NUMERICAL_MASKS
[num_masks, H, W] float32 binary masks.
Encoded PNG masks.
Configuration for intersection-over-area (IOA) similarity calculator.
Used in:
(message has no fields)
Configuration for intersection-over-union (IOU) similarity calculator.
Used in:
(message has no fields)
Configuration proto for image resizer that keeps aspect ratio.
Used in:
Desired size of the smaller image dimension in pixels.
Desired size of the larger image dimension in pixels.
Desired method when resizing image.
Whether to pad the image with zeros so the output spatial size is [max_dimension, max_dimension]. Note that the zeros are padded to the bottom and the right of the resized image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Per-channel pad value. This is only used when pad_to_max_dimension is True. If unspecified, a default pad value of 0 is applied to all channels.
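For example, a square-padded 640x640 resizer as commonly seen in FPN configs (dimensions are illustrative):

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 640
    max_dimension: 640
    pad_to_max_dimension: true
  }
}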
Configuration proto for KeypointBoxCoder. See box_coders/keypoint_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center and keypoints.
Scale factor for anchor encoded box height.
Scale factor for anchor encoded box width.
Defines an edge that should be drawn between two keypoints.
Used in:
Index of the keypoint where the edge starts from. Index starts at 0.
Index of the keypoint where the edge ends. Index starts at 0.
L1 Localization Loss.
Used in:
(message has no fields)
Configuration proto for L1 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l1_regularizer
Used in:
Configuration proto for L2 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l2_regularizer
Used in:
Configuration message for optimizer learning rate.
Used in:
Configuration for bounding box localization loss function.
Used in:
Message for configuring the localization loss, classification loss and hard example miner used for training object detection models. See core/losses.py for details.
Used in:
Localization loss to use.
Classification loss to use.
If not left to default, applies hard example mining.
Classification loss weight.
Localization loss weight.
If not left to default, applies random example sampling.
Method to compute expected loss weights with respect to balanced positive/negative sampling scheme. If NONE, use explicit sampling. TODO(birdbrain): Move under ExpectedLossWeights.
Minimum number of effective negative samples. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights.
Desired number of effective negative samples per positive sample. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights.
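An example loss block combining these fields; the focal-loss parameters follow the values popularized by https://arxiv.org/abs/1708.02002 and are shown as a sketch:

loss {
  localization_loss {
    weighted_smooth_l1 {}
  }
  classification_loss {
    weighted_sigmoid_focal {
      gamma: 2.0
      alpha: 0.25
    }
  }
  localization_weight: 1.0
  classification_weight: 1.0
}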
Equalization loss.
Used in:
Weight equalization loss strength.
When computing equalization loss, ops that start with equalization_exclude_prefixes will be ignored. Only used when equalization_weight > 0.
Used in:
Use expected_classification_loss_by_expected_sampling from third_party/tensorflow_models/object_detection/utils/ops.py
Use expected_classification_loss_by_reweighting_unmatched_anchors from third_party/tensorflow_models/object_detection/utils/ops.py
Configuration message for a manually defined learning rate schedule.
Used in:
Whether to linearly interpolate learning rates for steps in [0, schedule[0].step].
Used in:
TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn head easily. Next id: 15
Used in:
Hyperparameters for fully connected ops used in the box predictor.
Whether to use a dropout op prior to both the box and class predictions.
Keep probability for dropout. This is only used if use_dropout is true.
Size of the encoding for the boxes.
Hyperparameters for convolution ops used in the box predictor.
Whether to predict instance masks inside detection boxes.
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes.
Whether to predict keypoints inside detection boxes.
The height and the width of the predicted mask.
The number of convolutions applied to image_features in the mask prediction branch.
Whether to use one box for all classes rather than a different box for each class.
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [`mask_height`, `mask_width`] before applying convolutions and predicting masks.
Configuration proto for the matcher to be used in the object detection pipeline. See core/matcher.py for details.
Used in:
Configuration proto for MeanStddevBoxCoder. See box_coders/mean_stddev_box_coder.py for details.
Used in:
The standard deviation used to encode and decode boxes.
Configuration message for the MomentumOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
Used in:
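A momentum optimizer sketch with a cosine-decay schedule (values mirror common public configs and are not recommendations):

optimizer {
  momentum_optimizer {
    learning_rate {
      cosine_decay_learning_rate {
        learning_rate_base: 0.04
        total_steps: 25000
        warmup_learning_rate: 0.013333
        warmup_steps: 2000
      }
    }
    momentum_optimizer_value: 0.9
  }
  use_moving_average: false
}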
Configuration proto for RetinaNet anchor generator described in https://arxiv.org/abs/1708.02002. See anchor_generators/multiscale_grid_anchor_generator.py for details.
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
Scale of anchor to feature stride.
Aspect ratios for anchors at each grid point.
Number of intermediate scales per scale octave.
Whether to produce anchors in normalized coordinates.
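A multiscale_anchor_generator sketch in the RetinaNet style (values shown for illustration):

anchor_generator {
  multiscale_anchor_generator {
    min_level: 3
    max_level: 7
    anchor_scale: 4.0
    aspect_ratios: [1.0, 2.0, 0.5]
    scales_per_octave: 2
  }
}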
Configuration for negative squared distance similarity calculator.
Used in:
(message has no fields)
Normalizes pixel values in an image. For every channel in the image, moves the pixel values from the range [original_minval, original_maxval] to [target_minval, target_maxval].
Used in:
Top level optimizer message.
Used in:
An image resizer which resizes inputs by zero padding them such that their spatial dimensions are divisible by a specified multiple. This is useful when you want to concatenate or compare the input to an output of a fully convolutional network.
Used in:
The multiple to which the spatial dimensions will be padded.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
A message to configure parameterized evaluation metric.
Used in:
Pixelwise logistic focal loss with pixels near the target having a reduced penalty.
Used in:
Focusing parameter of the focal loss.
Penalty reduction factor.
Configuration proto for post-processing predicted boxes and scores.
Used in:
Non max suppression parameters.
Score converter to use.
Scale logit (input) value before conversion in post-processing step. Typically used for softmax distillation, though can be used to scale for other reasons.
Calibrate score outputs. Calibration is applied after score converter and before non max suppression.
Enum to specify how to convert the detection scores.
Used in:
Input scores equals output scores.
Applies a sigmoid on input scores.
Applies a softmax on input scores.
Message for defining a preprocessing operation on input data. See: //third_party/tensorflow_models/object_detection/core/preprocessor.py Next ID: 39
Used in:
Message for quantization options. See tensorflow/contrib/quantize/python/quantize.py for details.
Used in:
Number of steps to delay before quantization takes effect during training.
Number of bits to use for quantizing weights. Only 8 bit is supported for now.
Number of bits to use for quantizing activations. Only 8 bit is supported for now.
Whether to use symmetric weight quantization.
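An illustrative graph_rewriter block enabling quantization-aware training (the delay value is a placeholder):

graph_rewriter {
  quantization {
    delay: 48000       # steps to wait before enabling quantization
    weight_bits: 8
    activation_bits: 8
  }
}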
Converts the RGB image to a grayscale image. This also converts the image depth from 3 to 1, unlike RandomRGBtoGray which does not change the image depth.
Used in:
(message has no fields)
Configuration message for the RMSPropOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Used in:
Randomly adds a padding of size [0, max_height_padding), [0, max_width_padding).
Used in:
Height will be padded uniformly at random from [0, max_height_padding).
Width will be padded uniformly at random from [0, max_width_padding).
Color of the padding. If unset, will pad using average color of the input image.
Randomly changes image brightness by up to max_delta. Image outputs will be saturated between 0 and 1.
Used in:
Randomly scales contrast by a value between [min_delta, max_delta].
Used in:
Randomly alters hue by a value of up to max_delta.
Used in:
Randomly changes saturation by a value between [min_delta, max_delta].
Used in:
Randomly adds black square patches to an image.
Used in:
The maximum number of black patches to add.
The probability of a black patch being added to an image.
Ratio between the dimension of the black patch and the minimum dimension of the image (patch_width = patch_height = ratio * min(image_height, image_width)).
Randomly crops the image and bounding boxes.
Used in:
Cropped image must cover at least one box by this fraction.
Aspect ratio bounds of cropped image.
Allowed area ratio of cropped image to original image.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Probability of keeping the original image.
Randomly crops an image followed by a random pad.
Used in:
Cropping operation must cover at least one box by this fraction.
Aspect ratio bounds of image after cropping operation.
Allowed area ratio of image after cropping operation.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Probability of keeping the original image during the crop operation.
Maximum dimensions for padded image. If unset, will use double the original image dimension as a lower bound. Both of the following fields should be length 2.
Color of the padding. If unset, will pad using average color of the input image. This field should be of length 3.
Randomly crops an image to a given aspect ratio.
Used in:
Aspect ratio.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Performs a random color distortion. color_orderings should either be 0 or 1.
Used in:
Randomly shrinks image (keeping aspect ratio) to a target number of pixels. If the image contains fewer than the chosen target number of pixels, it will not be changed.
Used in:
Probability of keeping the original image.
The target number of pixels will be chosen to be in the range [min_target_pixels, max_target_pixels].
Configuration for random example sampler.
Used in:
The desired fraction of positive samples in batch when applying random example sampling.
Randomly horizontally flips the image and detections with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to horizontally flipped indices. This is used in the event that keypoints are specified, in which case when the image is horizontally flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below:
  keypoint_flip_permutation: 1
  keypoint_flip_permutation: 0
  keypoint_flip_permutation: 2
  keypoint_flip_permutation: 3
  keypoint_flip_permutation: 5
  keypoint_flip_permutation: 4
If nothing is specified, the order of keypoints will be maintained.
The probability of running this augmentation for each image.
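As a usage sketch, the flip op sits under train_config's data_augmentation_options; the permutation shown assumes the six keypoints from the example above:

data_augmentation_options {
  random_horizontal_flip {
    keypoint_flip_permutation: 1
    keypoint_flip_permutation: 0
    keypoint_flip_permutation: 2
    keypoint_flip_permutation: 3
    keypoint_flip_permutation: 5
    keypoint_flip_permutation: 4
    probability: 0.5
  }
}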
Randomly enlarges or shrinks image (keeping aspect ratio).
Used in:
Randomly jitters the corners of boxes in the image, determined by ratio; i.e., if a box is [100, 200] and ratio is 0.02, the corners can move by [1, 4].
Used in:
Applies a jpeg encoding with a random quality factor.
Used in:
Probability of keeping the original image.
Minimum jpeg quality to use.
Maximum jpeg quality to use.
Configuration proto for random normal initializer. See https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
Used in:
Randomly adds padding to the image.
Used in:
Minimum dimensions for padded image. If unset, will use original image dimension as a lower bound.
Maximum dimensions for padded image. If unset, will use double the original image dimension as a lower bound.
Color of the padding. If unset, will pad using average color of the input image.
Used in:
Probability of keeping the original image.
The patch size will be chosen to be in the range [min_patch_size, max_patch_size).
The standard deviation of the gaussian noise applied within the patch will be chosen to be in the range [min_gaussian_stddev, max_gaussian_stddev).
Randomly scales the values of all pixels in the image by some constant value between [minval, maxval], then clips the values to the range [0, 1.0].
Used in:
Randomly converts the entire image to grayscale.
Used in:
Randomly resizes the image up to [target_height, target_width].
Used in:
Randomly rotates the image and detections by 90 degrees counter-clockwise with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to 90 degree counter clockwise indices. This is used in the event that keypoints are specified, in which case when the image is rotated the keypoints might need to be permuted.
The probability of running this augmentation for each image.
Randomly scales, crops, and then pads an image to the desired square output dimensions. Specifically, this method first samples a random_scale factor from a uniform distribution between scale_min and scale_max, and then resizes the image such that its maximum dimension is (output_size * random_scale). Second, a square output_size crop is extracted from the resized image, and finally the cropped region is padded to the desired square output_size. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1911.09070
Used in:
The (square) output image size
The minimum and maximum values from which to sample the random scale.
Randomly concatenates the image with itself horizontally and/or vertically.
Used in:
Probability of concatenating the image vertically.
Probability of concatenating the image horizontally.
Extract a square-sized crop from an image whose side length is sampled by randomly scaling the maximum spatial dimension of the image. If part of the crop falls outside the image, it is filled with zeros. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1904.07850
Used in:
The maximum size of the border. The border defines the distance in pixels to the image boundaries that will not be considered as a center of a crop. To make sure that the border does not go over the center of the image, we choose the border value by computing the minimum k such that (max_border / (2**k)) < image_dimension/2.
The minimum and maximum values of scale.
The number of discrete scale values to randomly sample between [min_scale, max_scale]
Randomly vertically flips the image and detections with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to vertically flipped indices. This is used in the event that keypoints are specified, in which case when the image is vertically flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below:
  keypoint_flip_permutation: 1
  keypoint_flip_permutation: 0
  keypoint_flip_permutation: 2
  keypoint_flip_permutation: 3
  keypoint_flip_permutation: 5
  keypoint_flip_permutation: 4
The probability of running this augmentation for each image.
Configuration proto for region similarity calculators. See core/region_similarity_calculator.py for details.
Used in:
Proto with one-of field for regularizers.
Used in:
Remap a set of labels to a new label.
Used in:
Labels to be remapped.
Label to map to.
Resizes images to [new_height, new_width].
Used in:
Used in:
Enumeration type for image resizing methods provided in TensorFlow.
Used in:
Corresponds to tf.image.ResizeMethod.BILINEAR
Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
Corresponds to tf.image.ResizeMethod.BICUBIC
Corresponds to tf.image.ResizeMethod.AREA
Used in:
Hyperparameters for convolution ops used in the box predictor.
Bin sizes for RFCN crops.
Target depth to reduce the input image features to.
Size of the encoding for the boxes.
Size to resize the rfcn crops to.
Randomly crops an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropOperations. Only one operation (chosen at random) is actually performed on an image.
Used in:
Randomly crops an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.
Used in:
Aspect ratio to crop to. This value is used for all crop operations.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Randomly crops and pads an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropPadOperations. Only one operation (chosen at random) is actually performed on an image.
Used in:
Randomly crops and pads an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropPadFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.
Used in:
Aspect ratio to pad to. This value is used for all crop and pad operations.
Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
Max ratio of padded image height and width to the input image's height and width. Two entries per operation.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
Max ratio of padded image height and width to the input image's height and width. Two entries per operation.
Padding color.
Scales boxes from normalized coordinates to pixel coordinates.
Used in:
(message has no fields)
Message for class-agnostic Sigmoid Calibration.
Used in:
Message mapping class index to Sigmoid Parameters.
Sigmoid Focal cross entropy loss as described in https://arxiv.org/abs/1708.02002
Used in:
DEPRECATED, do not use.
Modulating factor for the loss.
Alpha weighting factor for the loss.
Message defining parameters for sigmoid calibration.
Used in:
Configuration proto for SquareBoxCoder. See box_coders/square_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center.
Scale factor for anchor encoded box length.
Configuration for Single Shot Detection (SSD) models. Next id: 27
Used in:
Number of classes to predict.
Image resizer for preprocessing the input image.
Feature extractor config.
Box coder to encode the boxes.
Matcher to match groundtruth with anchors.
Region similarity calculator to compute similarity of boxes.
Whether background targets are to be encoded as an all zeros vector or a one-hot vector (where background is the 0th class).
Classification weight to be associated with negative anchors (default: 1.0). The weight must be in [0., 1.].
Box predictor to attach to the features.
Anchor generator to compute anchors.
Post processing to apply on the predictions.
Whether to normalize the loss by number of groundtruth boxes that match to the anchors.
Whether to normalize the localization loss by the code size of the box encodings. This is applied along with other normalization factors.
Loss configuration for training.
Whether to update batch norm parameters during training or not. When training with a relatively small batch size (e.g. 1), it is desirable to disable batch norm update and use pretrained batch norm params. Note: Some feature extractors are used with canned arg_scopes (e.g. resnet arg scopes). In these cases the training behavior of batch norm variables may depend on both the values of `batch_norm_trainable` and `is_training`. When canned arg_scopes are used with feature extractors, `conv_hyperparams` will apply only to the additional layers that are added and are outside the canned arg_scope.
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters.
Whether to add an implicit background class to one-hot encodings of groundtruth labels. Set to false if training a single class model or using an explicit background class.
Whether to use an explicit background class. Set to true if using groundtruth labels with an explicit background class, as in multiclass scores.
Configs for mask head.
Configuration proto for MaskHead. Next id: 11
Used in:
The height and the width of the predicted mask. Only used when predict_instance_masks is true.
Whether to predict class agnostic masks. Only used when predict_instance_masks is true.
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes.
The number of convolutions applied to image_features in the mask prediction branch.
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [`mask_height`, `mask_width`] before applying convolutions and predicting masks.
Mask loss weight.
Number of boxes to be generated at training time for computing mask loss.
Hyperparameters for convolution ops used in the box predictor.
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. Only used when we have second stage prediction head enabled (e.g. mask head).
Configuration proto for SSD anchor generator described in https://arxiv.org/abs/1512.02325. See anchor_generators/multiple_grid_anchor_generator.py for details.
Used in:
Number of grid layers to create anchors for.
Scale of anchors corresponding to finest resolution.
Scale of anchors corresponding to coarsest resolution.
Can be used to override min_scale->max_scale, with an explicitly defined set of scales. If empty, then min_scale->max_scale is used.
Aspect ratios for anchors at each grid point.
When this aspect ratio is greater than 0, an additional anchor with an interpolated scale is added at this aspect ratio.
Whether to use the following aspect ratio and scale combinations for the layer with the finest resolution: (scale=0.1, aspect_ratio=1.0), (scale=min_scale, aspect_ratio=2.0), (scale=min_scale, aspect_ratio=0.5).
The base anchor size in height dimension.
The base anchor size in width dimension.
Anchor stride in height dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor stride in width dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor height offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor width offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
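A typical ssd_anchor_generator config as commonly seen in SSD pipelines; the scales and aspect ratios are illustrative values, not mandates:
  anchor_generator {
    ssd_anchor_generator {
      num_layers: 6
      min_scale: 0.2
      max_scale: 0.95
      aspect_ratios: 1.0
      aspect_ratios: 2.0
      aspect_ratios: 0.5
      aspect_ratios: 3.0
      aspect_ratios: 0.3333
      reduce_boxes_in_lowest_layer: true
    }
  }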
Configuration proto for the SSD feature extractor. Next id: 20.
Used in:
Type of ssd feature extractor.
The factor to alter the depth of the channels in the feature extractor.
Minimum number of the channels in the feature extractor.
Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor.
Normally, SSD feature extractors are constructed by reusing an existing base feature extractor (that has its own hyperparams) and adding new layers on top of it. `conv_hyperparams` above normally applies only to the new layers while base feature extractor uses its own default hyperparams. If this value is set to true, the base feature extractor's hyperparams will be overridden with the `conv_hyperparams`.
The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even.
Whether to use explicit padding when extracting SSD multiresolution features. This will also apply to the base feature extractor if a MobileNet architecture is used.
Whether to use depthwise separable convolutions to extract the additional feature maps added by SSD.
Feature Pyramid Networks config.
Bidirectional Feature Pyramid Networks config.
If true, replace preprocess function of feature extractor with a placeholder. This should only be used if all the image preprocessing steps happen outside the graph.
The number of SSD layers.
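A feature extractor sketch; the type string names one of the stock extractors (here "ssd_mobilenet_v2" is assumed) and the hyperparameter values are illustrative:
  feature_extractor {
    type: "ssd_mobilenet_v2"
    depth_multiplier: 1.0
    min_depth: 16
    pad_to_multiple: 1
    conv_hyperparams {
      regularizer { l2_regularizer { weight: 4e-5 } }
      initializer { truncated_normal_initializer { stddev: 0.03 } }
    }
  }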
Used in:
String name. The most common practice is to set this to a MID or synset id.
Integer id that maps to the string name above. Label ids should start from 1.
Human readable string label.
Label ids for the elements that are connected in the hierarchy with the current element. Value should correspond to another label id element.
Names of the class-specific keypoints for each class of object, along with their respective keypoint IDs.
Used in:
Id for the keypoint. Id must be unique within a given class, however, it could be shared across classes. For example "nose" keypoint can occur in both "face" and "person" classes. Hence they can be mapped to the same id. Note: It is advised to assign ids in range [1, num_unique_keypoints] to encode keypoint targets efficiently.
Label for the keypoint.
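Label map entries are written in pbtxt; a sketch follows, where the MID-style name and the keypoint labels are illustrative placeholders:
  item {
    name: "/m/01g317"
    id: 1
    display_name: "person"
    keypoints { id: 1 label: "nose" }
    keypoints { id: 2 label: "left_eye" }
  }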
Normalizes an image by subtracting a mean from each channel.
Used in:
The mean to subtract from each channel. Its dimension should match the number of channels in the input image.
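A sketch of this step used as a preprocessing option; the per-channel values below are the commonly used ImageNet RGB means, shown only as an example:
  data_augmentation_options {
    subtract_channel_mean {
      means: 123.68
      means: 116.779
      means: 103.939
    }
  }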
An input reader that reads TF Example or TF Sequence Example protos from local TFRecord files.
Used in:
Path(s) to `TFRecordFile`s.
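A sketch of an input reader config; the paths and the sharded file pattern are placeholders:
  train_input_reader {
    label_map_path: "/path/to/label_map.pbtxt"
    tf_record_input_reader {
      input_path: "/path/to/train.record-?????-of-00100"
    }
  }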
Message to configure Target Assigner for object detectors.
Message for Temperature Scaling Calibration.
Used in:
Configuration for thresholded-intersection-over-union similarity calculator.
Used in:
IOU threshold used for filtering scores.
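Assuming the thresholded_iou_similarity field name from the region similarity calculator proto, a minimal sketch:
  similarity_calculator {
    thresholded_iou_similarity {
      iou_threshold: 0.5
    }
  }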
Message for configuring DetectionModel training jobs (train.py). Next id: 30
Used in:
Effective batch size to use for training. For TPU (or sync SGD jobs), the batch size per core (or GPU) is going to be `batch_size` / number of cores (or `batch_size` / number of GPUs).
Data augmentation options.
Whether to synchronize replicas during training.
How frequently to keep checkpoints.
Optimizer used to train the DetectionModel.
If greater than 0, clips gradients by this value.
Checkpoint to restore variables from. Typically used to load feature extractor variables trained outside of object detection.
Type of checkpoint to restore variables from, e.g. 'classification' or 'detection'. Provides extensibility to from_detection_checkpoint. Typically used to load feature extractor variables from trained models.
Either "v1" or "v2". If v1, restores the checkpoint using the tensorflow v1 style of restoring checkpoints. If v2, uses the eager mode checkpoint restoration API.
[Deprecated]: use fine_tune_checkpoint_type instead. Specifies whether the finetune checkpoint is from an object detection model. If it is, the model being trained should have the same parameters, with the exception of the num_classes parameter. If false, the checkpoint is assumed to be from an object classification model.
Whether to load all checkpoint vars that match model variable names and sizes. This option is only available if `from_detection_checkpoint` is True. This option is *not* supported for TF2 --- setting it to true will raise an error.
Number of steps to train the DetectionModel for. If 0, will train the model indefinitely.
Number of training steps between replica startup. This flag must be set to 0 if sync_replicas is set to true.
If greater than 0, multiplies the gradient of bias variables by this amount.
Variables that should be updated during training. Note that variables which also match the patterns in freeze_variables will be excluded.
Variables that should not be updated during training. If update_trainable_variables is not empty, the freeze_variables patterns only remove variables from the set already selected by update_trainable_variables.
Number of replicas to aggregate before making parameter updates.
Maximum number of elements to store within a queue.
Number of threads to use for batching.
Maximum capacity of the queue used to prefetch assembled batches.
If true, boxes with the same coordinates will be merged together. This is useful when each box can have multiple labels. Note that only Sigmoid classification losses should be used.
If true, will use multiclass scores from object annotations as ground truth. Currently only compatible with annotated image inputs.
Whether to add regularization loss to `total_loss`. This is true by default and adds all regularization losses defined in the model to `total_loss`. Setting this option to false is very useful while debugging the model and losses.
Maximum number of boxes used during training. Set this to at least the maximum amount of boxes in the input data. Otherwise, it may cause "Data loss: Attempted to pad to a smaller size than the input element" errors.
Whether to remove padding along `num_boxes` dimension of the groundtruth tensors.
Whether to retain original images (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard. Note that this will lead to a larger memory footprint.
Whether to use bfloat16 for training. This is currently only supported for TPUs.
Whether to summarize gradients.
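A train_config sketch tying several of the fields above together; all values are illustrative and the checkpoint path is a placeholder:
  train_config {
    batch_size: 64
    num_steps: 50000
    fine_tune_checkpoint: "/path/to/model.ckpt"
    fine_tune_checkpoint_type: "detection"
    data_augmentation_options { random_horizontal_flip {} }
    optimizer {
      momentum_optimizer {
        learning_rate {
          cosine_decay_learning_rate { learning_rate_base: 0.04 total_steps: 50000 }
        }
        momentum_optimizer_value: 0.9
      }
    }
    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
  }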
Convenience message for configuring a training and eval pipeline. Allows all of the pipeline parameters to be configured from one file. Next id: 8
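The top-level pipeline file simply nests the individual configs documented in this file; a skeleton with sub-messages elided:
  model { ssd { } }       # see the Ssd message above
  train_config { }        # see the train config above
  train_input_reader { }  # e.g. a tf_record_input_reader
  eval_config { }
  eval_input_reader { }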
Description of data used to fit the calibration model. CLASS_SPECIFIC indicates that the calibration parameters are derived from detections pertaining to a single class. ALL_CLASSES indicates that parameters were obtained by fitting a model on detections from all classes (including the background class).
Used in:
Configuration proto for truncated normal initializer. See https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
Used in:
Configuration proto for variance scaling initializer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer
Used in:
Used in:
Configuration proto for weight shared convolutional box predictor. Next id: 19
Used in:
Hyperparameters for convolution ops used in the box predictor.
Number of the additional conv layers before the predictor.
Output depth for the convolution ops prior to predicting box encodings and class predictions.
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height).
Size of the encoding for boxes.
Bias initialization for class prediction. It has been shown to stabilize training when there is a large number of negative boxes. See https://arxiv.org/abs/1708.02002 for details.
Whether to use dropout for class prediction.
Keep probability for dropout.
Whether to share the multi-layer tower between box prediction and class prediction heads.
Whether to use depthwise separable convolution for box predictor layers.
Callable elementwise score converter at inference time.
If specified, apply clipping to box encodings.
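A sketch of this predictor as it commonly appears in FPN-style SSD configs; the bias init of -4.6 corresponds to a ~0.01 prior probability per the focal loss paper, and all numbers are illustrative:
  box_predictor {
    weight_shared_convolutional_box_predictor {
      depth: 256
      num_layers_before_predictor: 4
      kernel_size: 3
      class_prediction_bias_init: -4.6
      conv_hyperparams {
        activation: RELU_6
        regularizer { l2_regularizer { weight: 4e-4 } }
        initializer { random_normal_initializer { stddev: 0.01 } }
        batch_norm { scale: true }
      }
    }
  }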
Used in:
Enum to specify how to convert the detection scores at inference time.
Used in:
Input scores equal output scores.
Applies a sigmoid on input scores.
Intersection over union location loss: 1 - IOU
Used in:
(message has no fields)
L2 location loss: 0.5 * ||weight * (a - b)||^2
Used in:
DEPRECATED, do not use. Output loss per anchor.
Classification loss using a sigmoid function over class predictions.
Used in:
DEPRECATED, do not use. Output loss per anchor.
SmoothL1 (Huber) location loss. The smooth L1 loss is defined elementwise as 0.5 * x^2 if |x| <= delta and delta * (|x| - 0.5 * delta) otherwise, where x is the difference between predictions and targets.
Used in:
DEPRECATED, do not use. Output loss per anchor.
Delta value for huber loss.
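A sketch of this loss inside the enclosing loss config; the sibling classification loss and the two weights are illustrative choices, not part of this message:
  loss {
    localization_loss { weighted_smooth_l1 { delta: 1.0 } }
    classification_loss { weighted_sigmoid_focal { gamma: 2.0 alpha: 0.25 } }
    localization_weight: 1.0
    classification_weight: 1.0
  }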
Classification loss using a softmax function over class predictions and a softmax function over the groundtruth labels (assumed to be logits).
Used in:
DEPRECATED, do not use.
Scale and softmax groundtruth logits before calculating softmax classification loss. Typically used for softmax distillation with teacher annotations stored as logits.
Classification loss using a softmax function over class predictions.
Used in:
DEPRECATED, do not use. Output loss per anchor.
Scale logit (input) value before calculating softmax classification loss. Typically used for softmax distillation.
Message to store a domain/range pair for the function to be approximated.
Used in:
Sequence of x/y pairs for function approximation.
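Assuming the x_y_pair field naming from the calibration proto (worth verifying against calibration.proto), a sketch of a small piecewise-linear function:
  x_y_pairs {
    x_y_pair { x: 0.0 y: 0.0 }
    x_y_pair { x: 0.5 y: 0.4 }
    x_y_pair { x: 1.0 y: 1.0 }
  }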
Description of data used to fit the calibration model.
Used in: