Configuration message for the AdamOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
Used in:
Default value for epsilon (1e-8) matches the default value in tf.train.AdamOptimizer. This differs from the TF2 default of 1e-7 in tf.keras.optimizers.Adam.
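For reference, a minimal optimizer block as it might appear in a pipeline.config (the constant learning rate is an illustrative placeholder, not a recommendation):

optimizer {
  adam_optimizer {
    learning_rate {
      constant_learning_rate {
        learning_rate: 1e-3  # illustrative value
      }
    }
    epsilon: 1e-8  # matches the tf.train.AdamOptimizer default noted above
  }
  use_moving_average: false
}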
Configuration proto for the anchor generator to use in the object detection pipeline. See core/anchor_generator.py for details.
Used in:
Used in:
The base sizes in pixels for each anchor in this anchor layer.
The aspect ratios for each anchor in this anchor layer.
The anchor height stride in pixels.
The anchor width stride in pixels.
The anchor height offset in pixels.
The anchor width offset in pixels.
Configuration proto for ArgMaxMatcher. See matchers/argmax_matcher.py for details.
Used in:
Threshold for positive matches.
Threshold for negative matches.
Whether to construct ArgMaxMatcher without thresholds.
If True, then negative matches are the ones below the unmatched_threshold, whereas ignored matches are in between the matched and unmatched thresholds. If False, then negative matches are in between the matched and unmatched thresholds, and everything lower than unmatched is ignored.
Whether to ensure each row is matched to at least one column.
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather
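For illustration, a typical argmax_matcher block in the style of public SSD configs (threshold values are common defaults, not prescriptions):

matcher {
  argmax_matcher {
    matched_threshold: 0.5
    unmatched_threshold: 0.5
    ignore_thresholds: false
    negatives_lower_than_unmatched: true
    force_match_for_each_row: true
    use_matmul_gather: true  # matrix-multiplication-based gather, e.g. for TPU
  }
}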
Apply an Autoaugment policy to the image and bounding boxes.
Used in:
What AutoAugment policy to apply to the Image
Configuration proto for non-max-suppression operation on a batch of detections.
Used in:
Scalar threshold for score (low scoring boxes are removed).
Scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
Maximum number of detections to retain per class.
Maximum number of detections to retain across all classes.
Whether to use the implementation of NMS that guarantees static shapes.
Whether to use class-agnostic NMS. The class-agnostic NMS function implements a class-agnostic version of Non-Maximum Suppression where, if max_classes_per_detection=k, 1) we keep the top-k scores for each detection and 2) during NMS, each detection only uses the highest class score for sorting. 3) Compared to regular NMS, the worst-case runtime of this version is O(N^2) instead of O(KN^2), where N is the number of detections and K is the number of classes.
Soft NMS sigma parameter; see Bodla et al., https://arxiv.org/abs/1704.04503.
Whether to use partitioned version of non_max_suppression.
Whether to use tf.image.combined_non_max_suppression.
Whether to change coordinate frame of the boxlist to be relative to window's frame.
Use hard NMS. Note that even if this field is set to false, the behavior of NMS will be equivalent to hard NMS; setting this field to true forces the tf.image.non_max_suppression function to be called instead of tf.image.non_max_suppression_with_scores, and can be used to export models for older versions of TF.
Use CPU NMS. NMSV3/NMSV4 runs on GPU by default, which may cause OOM issues if the model and/or the batch size is large during training. Setting this flag to true moves the NMS op to the CPU when OOM happens. The flag is not needed if use_hard_nms = false, as soft NMS currently runs on the CPU by default.
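A sketch of how these fields combine inside a post_processing block (values are illustrative):

post_processing {
  batch_non_max_suppression {
    score_threshold: 1e-8
    iou_threshold: 0.6
    max_detections_per_class: 100
    max_total_detections: 100
  }
  score_converter: SIGMOID
}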
Configuration proto for batch norm to apply after convolution op. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
Used in:
Whether to train the batch norm variables. If this is set to false during training, the current values of the batch_norm variables are used for the forward pass but they are never updated.
Configuration for Bidirectional Feature Pyramid Networks.
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
The number of repeated top-down bottom-up iterations for BiFPN-based feature extractors (bidirectional feature pyramid networks).
The number of filters (channels) to use in feature pyramid layers for BiFPN-based feature extractors (bidirectional feature pyramid networks).
Method used to combine inputs to BiFPN nodes.
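A hypothetical bifpn block within an SSD feature extractor; the extractor type string and values loosely follow public EfficientDet-D0 configs and should be treated as assumptions, not an authoritative setup:

feature_extractor {
  type: 'ssd_efficientnet-b0_bifpn_keras'  # assumed extractor type string
  bifpn {
    min_level: 3
    max_level: 7
    num_iterations: 3
    num_filters: 64
  }
}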
Configuration proto for bipartite matcher. See matchers/bipartite_matcher.py for details.
Used in:
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather
Classification loss using a sigmoid function over the class prediction with the highest prediction score.
Used in:
Interpolation weight between 0 and 1.
Whether hard bootstrapping should be used or not. If true, will only use one class favored by the model. Otherwise, will use all predicted class probabilities.
DEPRECATED, do not use. Output loss per anchor.
Configuration proto for the box coder to be used in the object detection pipeline. See core/box_coder.py for details.
Used in:
Configuration proto for box predictor. See core/box_predictor.py for details.
Used in:
Message wrapper for various calibration configurations.
Used in:
Class-agnostic calibration via linear interpolation (usually output from isotonic regression).
Per-class calibration via linear interpolation.
Class-agnostic sigmoid calibration.
Per-class sigmoid calibration.
Temperature scaling calibration.
Used in:
Number of classes to predict.
Feature extractor config.
Image resizer for preprocessing the input image.
If set, all task heads will be constructed with separable convolutions.
Path of the file that contains the label map along with the keypoint information, including the keypoint indices, corresponding labels, and the corresponding class. The file should be the same one as used in the input pipeline. Note that a plain-text StringIntLabelMap proto is expected in this file. It is required only if the keypoint estimation task is specified.
Parameters which are related to DensePose estimation task. http://densepose.org/
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Class ID (0-indexed) that corresponds to the object in the label map that contains DensePose data.
Loss configuration for DensePose heatmap and regression losses. Note that the localization loss is used for surface coordinate losses and classification loss is used for part classification losses.
The number of body parts.
Loss weights for the two DensePose heads.
Whether to upsample the prediction feature maps back to the original input dimension prior to applying loss. This has the benefit of maintaining finer groundtruth location information.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1.
Parameters which are related to keypoint estimation task.
Used in:
Name of the task, e.g. "human pose". Note that the task name should be unique to each keypoint task.
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Loss configuration for keypoint heatmap, offset, regression losses. Note that the localization loss is used for offset/regression losses and classification loss is used for heatmap loss.
The name of the class that contains the keypoints for this task. This is used to retrieve the corresponding keypoint indices from the label map. Note that this corresponds to the "name" field, not "display_name".
The standard deviation of the Gaussian kernel used to generate the keypoint heatmap, in output-image pixels. This provides the flexibility of using a different Gaussian kernel size for each keypoint class. Note that if provided, the keypoint standard deviations will be overridden by the values specified here; otherwise, the default value 5.0 will be used. TODO(yuhuic): Update the default value once we find the best value.
Loss weights corresponding to different heads.
The initial bias value of the convolution kernel of the keypoint heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1. See "Focal Loss for Dense Object Detection" at https://arxiv.org/abs/1708.02002.
The heatmap score threshold for a keypoint to become a valid candidate.
The maximum number of candidates to retrieve for each keypoint.
Max pool kernel size to use to pull off peak score locations in a neighborhood (independently for each keypoint types).
The default score to use for regressed keypoints that are not successfully snapped to a nearby candidate.
The multiplier to expand the bounding boxes (either the provided boxes or those which tightly cover the regressed keypoints). Note that the new expanded box for an instance becomes the feasible search window for all associated keypoints.
The scale parameter that multiplies the largest dimension of a bounding box. The resulting distance becomes a search radius for candidates in the vicinity of each regressed keypoint.
One of ['min_distance', 'score_distance_ratio'] indicating how to select the keypoint candidate.
The radius (in units of output pixels) around a heatmap peak to assign the offset targets. If set to 0, the offset target will only be assigned to the heatmap peak (same behavior as the original paper).
Indicates whether to assign offsets for each keypoint channel separately. If set to False, the output offset target has the shape [batch_size, out_height, out_width, 2] (same behavior as the original paper). If set to True, the output offset target has the shape [batch_size, out_height, out_width, 2 * num_keypoints] (recommended when offset_peak_radius is not zero).
Parameters which are related to mask estimation task. Note: Currently, CenterNet supports a weak instance segmentation, where semantic segmentation masks are estimated, and then cropped based on bounding box detections. Therefore, it is possible for the same image pixel to be assigned to multiple instances.
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by their weights.
Classification loss configuration for segmentation loss.
Each instance mask (one per detection) is cropped and resized (bilinear resampling) from the predicted segmentation feature map. After resampling, the masks are binarized with the provided score threshold.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1.
Parameters related to object center prediction. This is required for both object detection and keypoint estimation tasks.
Used in:
Weight for the object center loss.
Classification loss configuration for object center loss.
The initial bias value of the convolution kernel of the class heatmap prediction head. -2.19 corresponds to predicting foreground with a probability of 0.1. See "Focal Loss for Dense Object Detection" at https://arxiv.org/abs/1708.02002.
The minimum IOU overlap boxes need to have to not be penalized.
Maximum number of boxes to predict.
If set, loss is only computed for the labeled classes.
Parameters which are related to object detection task.
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
Weight for the offset localization loss.
Weight for the height/width localization loss.
Localization loss configuration for object scale and offset losses.
Temporal offset prediction head similar to CenterTrack. Note that our current implementation adopts an LSTM, unlike the original paper. See go/lstd-centernet for more details. Tracking Objects as Points: https://arxiv.org/abs/2004.01177
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
Localization loss configuration for offset loss.
Parameters which are related to the tracking embedding estimation task. A Simple Baseline for Multi-Object Tracking: https://arxiv.org/abs/2004.01888
Used in:
Weight of the task loss. The total loss of the model will be the summation of task losses weighted by the weights.
The maximum track ID of the dataset.
The embedding size for re-identification (ReID) task in tracking.
The number of (fully-connected, batch-norm, relu) layers for track ID classification head. The output dimension of each intermediate FC layer will all be 'reid_embed_size'. The last FC layer will directly project to the track ID classification space of size 'num_track_ids' without batch-norm and relu layers.
Classification loss configuration for ReID loss.
Used in:
Channel means to be subtracted from each image channel. If not specified, we use a default value of 0.
Channel standard deviations. Each channel will be normalized by dividing it by its standard deviation. If not specified, we use a default value of 1.
If set, will change channel order to be [blue, green, red]. This can be useful to be compatible with some pre-trained feature extractors.
If set, the feature upsampling layers will be constructed with separable convolutions. This typically applies to the feature pyramid network, if any.
Used in:
Message for class-specific domain/range mapping for function approximations.
Used in:
Message mapping class ids to indices.
Message for class-specific Sigmoid Calibration.
Used in:
Message mapping class index to Sigmoid Parameters.
Configuration for class prediction loss function.
Used in:
A message to evaluate COCO keypoint metrics for a specific class.
Used in:
Identifies the class of object to which keypoints belong. By default this should use the class's "display_name" in the label map.
Keypoint-specific standard deviations for COCO keypoint metrics, which control how OKS is computed. See http://cocodataset.org/#keypoints-eval for details. If your keypoints are similar to the COCO keypoints, use the precomputed standard deviations below:
  "nose": 0.026
  "left_eye": 0.025
  "right_eye": 0.025
  "left_ear": 0.035
  "right_ear": 0.035
  "left_shoulder": 0.079
  "right_shoulder": 0.079
  "left_elbow": 0.072
  "right_elbow": 0.072
  "left_wrist": 0.062
  "right_wrist": 0.062
  "left_hip": 0.107
  "right_hip": 0.107
  "left_knee": 0.087
  "right_knee": 0.087
  "left_ankle": 0.089
  "right_ankle": 0.089
Configuration proto for image resizer that resizes only if input image height or width is greater or smaller than a certain size. Aspect ratio is maintained.
Used in:
Condition which must be true to resize the image.
Threshold for the image size. If any image dimension is above or below this (as specified by condition) the image will be resized so that it meets the threshold.
Desired method when resizing image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Enumeration for the condition on which to resize an image.
Used in:
Default value.
Resizes image if a dimension is greater than specified size.
Resizes image if a dimension is smaller than specified size.
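A minimal sketch, assuming the field names condition and size_threshold from image_resizer.proto (values are placeholders):

image_resizer {
  conditional_shape_resizer {
    condition: GREATER     # resize only if a dimension exceeds the threshold
    size_threshold: 1024
  }
}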
Configuration message for a constant learning rate.
Used in:
Configuration proto for Context. Next id: 4
Used in:
The maximum number of contextual features per image, used for padding.
The bottleneck feature dimension of the attention block.
The attention temperature.
The context feature length.
Converts class logits to softmax scores, optionally scaling the values by a temperature first.
Used in:
Scale to use on logits before applying softmax.
Configuration proto for Convolutional box predictor. Next id: 13
Used in:
Hyperparameters for convolution ops used in the box predictor.
Minimum feature depth prior to predicting box encodings and class predictions.
Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
Number of the additional conv layers before the predictor.
Whether to use dropout for class prediction.
Keep probability for dropout.
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height).
Size of the encoding for boxes.
Whether to apply sigmoid to the output of class predictions. TODO(jonathanhuang): Do we need this since we have a post-processing module?
Whether to use depthwise separable convolution for box predictor layers.
If specified, apply clipping to box encodings.
Used in:
Configuration message for a cosine decaying learning rate as defined in object_detection/utils/learning_schedules.py
Used in:
Top level configuration for DetectionModels.
Used in:
This can be used to define experimental models. To define your own experimental meta architecture, populate a key in the model_builder.EXPERIMENTAL_META_ARCH_BUILDER_MAP dict and set its value to a function that builds your model.
Randomly drops ground truth boxes for a label with some probability.
Used in:
The label that should be dropped. This corresponds to one of the entries in the label map.
Probability of dropping the label.
Message for configuring DetectionModel evaluation jobs (eval.py). Next id: 35
Used in:
Number of visualization images to generate.
Number of examples to process for evaluation.
How often to run evaluation.
Maximum number of times to run evaluation. If set to 0, will run forever.
Whether the TensorFlow graph used for evaluation should be saved to disk.
Path to directory to store visualizations in. If empty, visualization images are not exported (only shown on Tensorboard).
BNS name of the TensorFlow master.
Type of metrics to use for evaluation.
Type of metrics to use for evaluation. Unlike `metrics_set` above, this field allows configuring evaluation metric through config files.
Path to export detections to COCO compatible JSON format.
Option to not read groundtruth labels and only export detections to COCO-compatible JSON file.
Use exponential moving averages of variables for evaluation. TODO(rathodv): When this is false make sure the model is constructed without moving averages in restore_fn.
Whether to evaluate instance masks. Note that since there is no evaluation code currently for instance segmentation this option is unused.
Minimum score threshold for a detected object box to be visualized.
Maximum number of detections to visualize.
When drawing a single detection, each label is by default visualized as <label name> : <label score>. One can skip the name and/or score using the following fields:
Whether to show groundtruth boxes in addition to detected boxes in visualizations.
Box color for visualizing groundtruth boxes.
Whether to keep image identifier in filename when exported to visualization_export_dir.
Whether to retain original images (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard.
If True, additionally include per-category metrics.
Optional super-category definitions: keys are super-category names; values are comma-separated categories (assumed to correspond to category names (`display_name`) in the label map).
Recall range within which precision should be computed.
Whether to retain additional channels (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard.
When this flag is set, images are not resized during evaluation. When this flag is not set (the default case), images are resized according to the image_resizer config in the model during evaluation.
Whether to use a dummy loss in eval so model.loss() is not executed.
Specifies which keypoints should be connected by an edge, which may improve visualization. An example would be human pose estimation where certain joints can be connected.
The "groundtruth_labeled_classes" field indicates which classes have been labeled on the images. If skip_predictions_for_unlabeled_class is set, detector predictions that do not match to the groundtruth_labeled_classes will be ignored. This is useful for evaluating on test data that are not exhaustively labeled.
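A small eval_config sketch combining several of the fields above (dataset size and thresholds are placeholders):

eval_config {
  metrics_set: "coco_detection_metrics"
  num_examples: 8000           # placeholder dataset size
  num_visualizations: 10
  min_score_threshold: 0.2
  include_metrics_per_category: true
}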
Used in:
Configuration message for an exponentially decaying learning rate. See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
Used in:
An externally defined input reader. Users may define an extension to this proto to interface their own input readers.
Used in:
(message has no fields)
Configuration for Faster R-CNN models. See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py Naming conventions: Faster R-CNN models have two stages: a first stage region proposal network (or RPN) and a second stage box classifier. We thus use the prefixes `first_stage_` and `second_stage_` to indicate the stage to which each parameter pertains when relevant.
Used in:
Whether to construct only the Region Proposal Network (RPN).
Number of classes to predict.
Image resizer for preprocessing the input image.
Feature extractor config.
Anchor generator to compute RPN anchors.
Atrous rate for the convolution op applied to the `first_stage_features_to_crop` tensor to obtain box predictions.
Hyperparameters for the convolutional RPN box predictor.
Kernel size to use for the convolution op just prior to RPN box predictions.
Output depth for the convolution op just prior to RPN box predictions.
The batch size to use for computing the first stage objectness and location losses.
Fraction of positive examples per image for the RPN.
Non max suppression score threshold applied to first stage RPN proposals.
Non max suppression IOU threshold applied to first stage RPN proposals.
Maximum number of RPN proposals retained after first stage postprocessing.
First stage RPN localization loss weight.
First stage RPN objectness loss weight.
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling.
Kernel size of the max pool op on the cropped feature map during ROI pooling.
Stride of the max pool op on the cropped feature map during ROI pooling.
Hyperparameters for the second stage box predictor. If box predictor type is set to rfcn_box_predictor, a R-FCN model is constructed, otherwise a Faster R-CNN model is constructed.
The batch size per image used for computing the classification and refined location loss of the box classifier. Note that this field is ignored if `hard_example_miner` is configured.
Fraction of positive examples to use per image for the box classifier.
Post processing to apply on the second stage box classifier predictions. Note: the `score_converter` provided to the FasterRCNNMetaArch constructor is taken from this `second_stage_post_processing` proto.
Second stage refined localization loss weight.
Second stage classification loss weight.
Second stage instance mask loss weight. Note that this is only applicable when `MaskRCNNBoxPredictor` is selected for second stage and configured to predict instance masks.
If not left to default, applies hard example mining only to classification and localization loss.
Loss for second stage box classifiers, supports Softmax and Sigmoid. Note that the score converter must be consistent with the loss type. When there are multiple labels assigned to the same boxes, it is recommended to use Sigmoid loss and enable merge_multiple_label_boxes. If not specified, Softmax loss is used as the default.
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters.
Force the use of matrix multiplication based crop and resize instead of standard tf.image.crop_and_resize while computing second stage input feature maps.
Normally, anchors generated for a given image size are pruned during training if they lie outside the image window. Setting this option to true, clips the anchors to be within the image instead of pruning.
After performing matching between anchors and targets, in order to pull out targets for training the Faster R-CNN meta architecture we perform a gather operation. This option specifies whether to use an alternate implementation of tf.gather that is faster on TPUs.
Whether to use the balanced positive negative sampler implementation with static shape guarantees.
If True, uses implementation of ops with static shape guarantees.
Whether the masks present in groundtruth should be resized in the model to match the image size.
If True, uses implementation of ops with static shape guarantees when running evaluation (specifically not is_training if False).
If true, uses implementation of partitioned_non_max_suppression in first stage.
Whether to return raw detections (pre NMS).
Whether to use tf.image.combined_non_max_suppression.
Whether to output final box feature. If true, it will crop the feature map in the postprocess() method based on the final predictions.
Configs for context model.
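A skeletal faster_rcnn model block illustrating how the first- and second-stage fields fit together; values follow common public configs and should be read as a sketch, not a tuned setup:

model {
  faster_rcnn {
    num_classes: 90
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
  }
}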
Configuration proto for FasterRCNNBoxCoder. See box_coders/faster_rcnn_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center.
Scale factor for anchor encoded box height.
Scale factor for anchor encoded box width.
Used in:
Type of Faster R-CNN model (e.g., 'faster_rcnn_resnet101'; See builders/model_builder.py for expected types).
Output stride of extracted RPN feature map.
Whether to update batch norm parameters during training or not. When training with a relatively large batch size (e.g. 8), it could be desirable to enable batch norm update.
Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor.
If the value is set to true, the base feature extractor's hyperparams will be overridden with the `conv_hyperparams`.
The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even.
Feature Pyramid Networks config.
Configuration for Feature Pyramid Networks.
We recommend using multi_resolution_feature_map_generator with FPN, and the levels there must match the levels defined below for better performance. Correspondence from FPN levels to Resnet/Mobilenet V1 feature maps:

  FPN Level   Resnet Feature Map   Mobilenet-V1 Feature Map
  2           Block 1              Conv2d_3_pointwise
  3           Block 2              Conv2d_5_pointwise
  4           Block 3              Conv2d_11_pointwise
  5           Block 4              Conv2d_13_pointwise
  6           Bottomup_5           bottom_up_Conv2d_14
  7           Bottomup_6           bottom_up_Conv2d_15
  8           Bottomup_7           bottom_up_Conv2d_16
  9           Bottomup_8           bottom_up_Conv2d_17
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
Channel depth for additional coarse feature layers.
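A representative fpn block inside an SSD feature extractor; the extractor type string is one public example, and placing additional_layer_depth inside fpn follows the fields listed above (a sketch, not authoritative):

feature_extractor {
  type: 'ssd_resnet50_v1_fpn'
  fpn {
    min_level: 3
    max_level: 7
    additional_layer_depth: 256  # depth for the extra coarse layers
  }
}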
Configuration proto for image resizer that resizes to a fixed shape.
Used in:
Desired height of image in pixels.
Desired width of image in pixels.
Desired method when resizing image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Used in:
Whether to produce anchors in normalized coordinates.
Message for class-agnostic domain/range mapping for function approximations.
Used in:
Message mapping class labels to indices.
Message to configure graph rewriter for the tf graph.
Used in:
Configuration proto for GridAnchorGenerator. See anchor_generators/grid_anchor_generator.py for details.
Used in:
Anchor height in pixels.
Anchor width in pixels.
Anchor stride in height dimension in pixels.
Anchor stride in width dimension in pixels.
Anchor height offset in pixels.
Anchor width offset in pixels.
List of scales for the anchors.
List of aspect ratios for the anchors.
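A grid_anchor_generator sketch matching the Faster R-CNN first stage (strides of 16 correspond to the typical feature stride; values are illustrative):

first_stage_anchor_generator {
  grid_anchor_generator {
    scales: [0.25, 0.5, 1.0, 2.0]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}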
Configuration proto for group normalization to apply after convolution op. https://arxiv.org/abs/1803.08494
Used in:
(message has no fields)
Configuration for hard example miner.
Used in:
Maximum number of hard examples to be selected per image (prior to enforcing the max negative-to-positive ratio constraint). If set to 0, all examples obtained after NMS are considered.
Minimum intersection over union for an example to be discarded during NMS.
Maximum number of negatives to retain for each positive anchor. If num_negatives_per_positive is 0 no prespecified negative:positive ratio is enforced.
Minimum number of negative anchors to sample for a given image. Setting this to a positive number allows sampling negatives in an image without any positive anchors, and thus avoids biasing the model towards having at least one detection per image.
Whether to use classification losses ('cls', default), localization losses ('loc') or both losses ('both'). In the case of 'both', cls_loss_weight and loc_loss_weight are used to compute weighted sum of the two losses.
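An illustrative hard_example_miner block in the style of public SSD configs (values are examples only):

hard_example_miner {
  num_hard_examples: 3000
  iou_threshold: 0.99
  loss_type: CLASSIFICATION
  max_negatives_per_positive: 3
  min_negatives_per_image: 3
}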
Used in:
Configuration proto for the convolution op hyperparameters to use in the object detection pipeline.
Used in:
Regularizer for the weights of the convolution op.
Initializer for the weights of the convolution op.
Note that if nothing below is selected, then no normalization is applied. BatchNorm hyperparameters.
GroupNorm hyperparameters. This is only supported on a subset of models. Note that the current implementation of group norm instantiated in tf.contrib.group.layers.group_norm() only supports fixed_size_resizer for image preprocessing.
Whether depthwise convolutions should be regularized. If this parameter is NOT set then the conv hyperparams will default to the parent scope.
By default, use_bias is set to False if batch_norm is not None and batch_norm.center is True. When force_use_bias is set to True, this behavior will be overridden, and use_bias will be set to True, regardless of batch norm parameters. Note, this only applies to KerasLayerHyperparams.
Type of activation to apply after convolution.
Used in:
Use None (no activation)
Use tf.nn.relu
Use tf.nn.relu6
Use tf.nn.swish
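Putting the pieces together, a typical conv_hyperparams block might read as follows (a sketch; the regularizer and initializer values are illustrative):

conv_hyperparams {
  regularizer {
    l2_regularizer {
      weight: 0.00004
    }
  }
  initializer {
    truncated_normal_initializer {
      mean: 0.0
      stddev: 0.03
    }
  }
  activation: RELU_6
  batch_norm {
    decay: 0.9997
    center: true
    scale: true
    epsilon: 0.001
    train: true
  }
}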
Operations affected by hyperparameters.
Used in:
Convolution, Separable Convolution, Convolution transpose.
Fully connected
Used in:
(message has no fields)
Configuration proto for image resizing operations. See builders/image_resizer_builder.py for details.
Used in:
Proto with one-of field for initializers.
Used in:
Next id: 35
Used in:
Name of input reader. Typically used to describe the dataset that is read by this input reader.
Path to StringIntLabelMap pbtxt file specifying the mapping from string labels to integer ids.
Whether data should be processed in the order they are read in, or shuffled randomly.
Buffer size to be used when shuffling.
Buffer size to be used when shuffling file names.
The number of times a data source is read. If set to zero, the data source will be reused indefinitely.
Integer representing how often an example should be sampled. To feed only 1/3 of your data into your model, set `sample_1_of_n_examples` to 3. This is particularly useful for evaluation, where you might not want to evaluate all of your samples.
Number of file shards to read in parallel. When sample_from_datasets_weights are configured, num_readers is applied for each dataset.
Number of batches to produce in parallel. If this is run on a 2x2 TPU set this to 8.
Number of batches to prefetch. Prefetch decouples input pipeline and model so they can be pipelined resulting in higher throughput. Set this to a small constant and increment linearly until the improvements become marginal or you exceed your cpu memory budget. Setting this to -1, automatically tunes this value for you.
Maximum number of records to keep in reader queue.
Minimum number of records to keep in reader queue. A large value is needed to generate a good random shuffle.
Number of records to read from each reader at once.
Number of decoded records to prefetch before batching.
Number of parallel decode ops to apply.
If positive, TfExampleDecoder will try to decode rasters of additional channels from tf.Examples.
Number of groundtruth keypoints per object.
Keypoint weights. These weights can be used to apply per-keypoint loss multipliers. The size of this field should agree with `num_keypoints`.
Maximum number of boxes to pad to during training / evaluation. Set this to at least the maximum amount of boxes in the input data, otherwise some groundtruth boxes may be clipped.
Whether to load multiclass scores from the dataset.
Whether to load context features from the dataset.
Whether to load groundtruth instance masks.
Type of instance mask.
Whether to load DensePose data. If set, must also set load_instance_masks to true.
Whether to load track information.
Whether to use the display name when decoding examples. This is only used when mapping class text strings to integers.
Whether to include the source_id string in the input features.
Whether the input data type is tf.Examples or tf.SequenceExamples.
Which frame to choose from the input if Sequence Example. -1 indicates random choice.
When multiple input files are configured, we can sample across them based on weights. The number of weights must match the number of input files configured. When set, shuffling, shuffle buffer size, and num_readers settings are applied individually to each dataset. Implementation follows tf.data.experimental.sample_from_datasets sampling strategy.
Expand labels to ancestors or descendants in the hierarchy for positive and negative labels, respectively.
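A minimal train input reader sketch (paths are placeholders to be configured):

train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record-?????-of-00100"
  }
  shuffle: true
  num_readers: 4
}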
Input type format: whether inputs are TfExamples or TfSequenceExamples.
Used in:
Default implementation, currently TF_EXAMPLE
TfExample input
TfSequenceExample Input
Instance mask format. Note that PNG masks are much more space efficient.
Used in:
Default implementation, currently NUMERICAL_MASKS
[num_masks, H, W] float32 binary masks.
Encoded PNG masks.
Configuration for intersection-over-area (IOA) similarity calculator.
Used in:
(message has no fields)
Configuration for intersection-over-union (IOU) similarity calculator.
Used in:
(message has no fields)
Configuration proto for image resizer that keeps aspect ratio.
Used in:
Desired size of the smaller image dimension in pixels.
Desired size of the larger image dimension in pixels.
Desired method when resizing image.
Whether to pad the image with zeros so the output spatial size is [max_dimension, max_dimension]. Note that the zeros are padded to the bottom and the right of the resized image.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
Per-channel pad value. This is only used when pad_to_max_dimension is True. If unspecified, a default pad value of 0 is applied to all channels.
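For example, a square-padded 640x640 resizer as commonly seen in FPN configs (dimensions are illustrative):

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 640
    max_dimension: 640
    pad_to_max_dimension: true
  }
}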
Configuration proto for KeypointBoxCoder. See box_coders/keypoint_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center and keypoints.
Scale factor for anchor encoded box height.
Scale factor for anchor encoded box width.
Defines an edge that should be drawn between two keypoints.
Used in:
Index of the keypoint where the edge starts from. Index starts at 0.
Index of the keypoint where the edge ends. Index starts at 0.
L1 Localization Loss.
Used in:
(message has no fields)
Configuration proto for L1 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l1_regularizer
Used in:
Configuration proto for L2 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l2_regularizer
Used in:
Configuration message for optimizer learning rate.
Used in:
Configuration for bounding box localization loss function.
Used in:
Message for configuring the localization loss, classification loss and hard example miner used for training object detection models. See core/losses.py for details.
Used in:
Localization loss to use.
Classification loss to use.
If not left to default, applies hard example mining.
Classification loss weight.
Localization loss weight.
If not left to default, applies random example sampling.
Method to compute expected loss weights with respect to balanced positive/negative sampling scheme. If NONE, use explicit sampling. TODO(birdbrain): Move under ExpectedLossWeights.
Minimum number of effective negative samples. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights.
Desired number of effective negative samples per positive sample. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights.
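An example loss block combining these fields; the focal-loss parameters follow the values popularized by https://arxiv.org/abs/1708.02002 and are shown as a sketch:

loss {
  localization_loss {
    weighted_smooth_l1 {}
  }
  classification_loss {
    weighted_sigmoid_focal {
      gamma: 2.0
      alpha: 0.25
    }
  }
  localization_weight: 1.0
  classification_weight: 1.0
}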
Equalization loss.
Used in:
Weight equalization loss strength.
When computing equalization loss, ops that start with equalization_exclude_prefixes will be ignored. Only used when equalization_weight > 0.
Used in:
Use expected_classification_loss_by_expected_sampling from third_party/tensorflow_models/object_detection/utils/ops.py
Use expected_classification_loss_by_reweighting_unmatched_anchors from third_party/tensorflow_models/object_detection/utils/ops.py
Configuration message for a manually defined learning rate schedule.
Used in:
Whether to linearly interpolate learning rates for steps in [0, schedule[0].step].
Used in:
TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn head easily. Next id: 15
Used in:
Hyperparameters for fully connected ops used in the box predictor.
Whether to use a dropout op prior to both the box and class predictions.
Keep probability for dropout. This is only used if use_dropout is true.
Size of the encoding for the boxes.
Hyperparameters for convolution ops used in the box predictor.
Whether to predict instance masks inside detection boxes.
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes.
Whether to predict keypoints inside detection boxes.
The height and the width of the predicted mask.
The number of convolutions applied to image_features in the mask prediction branch.
Whether to use one box for all classes rather than a different box for each class.
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [`mask_height`, `mask_width`] before applying convolutions and predicting masks.
Configuration proto for the matcher to be used in the object detection pipeline. See core/matcher.py for details.
Used in:
Configuration proto for MeanStddevBoxCoder. See box_coders/mean_stddev_box_coder.py for details.
Used in:
The standard deviation used to encode and decode boxes.
Configuration message for the MomentumOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
Used in:
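A momentum optimizer sketch with a cosine-decay schedule (values mirror common public configs and are not recommendations):

optimizer {
  momentum_optimizer {
    learning_rate {
      cosine_decay_learning_rate {
        learning_rate_base: 0.04
        total_steps: 25000
        warmup_learning_rate: 0.013333
        warmup_steps: 2000
      }
    }
    momentum_optimizer_value: 0.9
  }
  use_moving_average: false
}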
Configuration proto for RetinaNet anchor generator described in https://arxiv.org/abs/1708.02002. See anchor_generators/multiscale_grid_anchor_generator.py for details.
Used in:
Minimum level in the feature pyramid.
Maximum level in the feature pyramid.
Scale of anchor to feature stride.
Aspect ratios for anchors at each grid point.
Number of intermediate scales per scale octave.
Whether to produce anchors in normalized coordinates.
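A multiscale_anchor_generator sketch in the RetinaNet style (values shown for illustration):

anchor_generator {
  multiscale_anchor_generator {
    min_level: 3
    max_level: 7
    anchor_scale: 4.0
    aspect_ratios: [1.0, 2.0, 0.5]
    scales_per_octave: 2
  }
}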
Configuration for negative squared distance similarity calculator.
Used in:
(message has no fields)
Normalizes pixel values in an image. For every channel in the image, moves the pixel values from the range [original_minval, original_maxval] to [target_minval, target_maxval].
Used in:
Top level optimizer message.
Used in:
An image resizer which resizes inputs by zero padding them such that their spatial dimensions are divisible by a specified multiple. This is useful when you want to concatenate or compare the input to an output of a fully convolutional network.
Used in:
The multiple to which the spatial dimensions will be padded.
Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
A message to configure parameterized evaluation metric.
Used in:
Pixelwise logistic focal loss with pixels near the target having a reduced penalty.
Used in:
Focusing parameter of the focal loss.
Penalty reduction factor.
Configuration proto for post-processing predicted boxes and scores.
Used in:
Non max suppression parameters.
Score converter to use.
Scale logit (input) value before conversion in post-processing step. Typically used for softmax distillation, though can be used to scale for other reasons.
Calibrate score outputs. Calibration is applied after score converter and before non max suppression.
Enum to specify how to convert the detection scores.
Used in:
Input scores equals output scores.
Applies a sigmoid on input scores.
Applies a softmax on input scores.
Message for defining a preprocessing operation on input data. See: //third_party/tensorflow_models/object_detection/core/preprocessor.py Next ID: 39
Used in:
Message for quantization options. See tensorflow/contrib/quantize/python/quantize.py for details.
Used in:
Number of steps to delay before quantization takes effect during training.
Number of bits to use for quantizing weights. Only 8 bit is supported for now.
Number of bits to use for quantizing activations. Only 8 bit is supported for now.
Whether to use symmetric weight quantization.
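An illustrative graph_rewriter block enabling quantization-aware training (the delay value is a placeholder):

graph_rewriter {
  quantization {
    delay: 48000       # steps to wait before enabling quantization
    weight_bits: 8
    activation_bits: 8
  }
}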
Converts the RGB image to a grayscale image. This also converts the image depth from 3 to 1, unlike RandomRGBtoGray which does not change the image depth.
Used in:
(message has no fields)
Configuration message for the RMSPropOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Used in:
Randomly adds a padding of size [0, max_height_padding), [0, max_width_padding).
Used in:
Height will be padded uniformly at random from [0, max_height_padding).
Width will be padded uniformly at random from [0, max_width_padding).
Color of the padding. If unset, will pad using average color of the input image.
Randomly changes image brightness by up to max_delta. Image outputs will be saturated between 0 and 1.
Used in:
Randomly scales contrast by a value between [min_delta, max_delta].
Used in:
Randomly alters hue by a value of up to max_delta.
Used in:
Randomly changes saturation by a value between [min_delta, max_delta].
Used in:
Randomly adds black square patches to an image.
Used in:
The maximum number of black patches to add.
The probability of a black patch being added to an image.
Ratio between the dimension of the black patch and the minimum dimension of the image (patch_width = patch_height = ratio * min(image_height, image_width)).
Randomly crops the image and bounding boxes.
Used in:
Cropped image must cover at least one box by this fraction.
Aspect ratio bounds of cropped image.
Allowed area ratio of cropped image to original image.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Probability of keeping the original image.
Randomly crops an image followed by a random pad.
Used in:
Cropping operation must cover at least one box by this fraction.
Aspect ratio bounds of image after cropping operation.
Allowed area ratio of image after cropping operation.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Probability of keeping the original image during the crop operation.
Maximum dimensions for padded image. If unset, will use double the original image dimension as a lower bound. Both of the following fields should be length 2.
Color of the padding. If unset, will pad using average color of the input image. This field should be of length 3.
Randomly crops an image to a given aspect ratio.
Used in:
Aspect ratio.
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image.
Whether to clip the boxes to the cropped image.
Performs a random color distortion. color_orderings should either be 0 or 1.
Used in:
Randomly shrinks image (keeping aspect ratio) to a target number of pixels. If the image contains fewer than the chosen target number of pixels, it will not be changed.
Used in:
Probability of keeping the original image.
The target number of pixels will be chosen to be in the range [min_target_pixels, max_target_pixels].
Configuration for random example sampler.
Used in:
The desired fraction of positive samples in batch when applying random example sampling.
Randomly horizontally flips the image and detections with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to horizontally flipped indices. This is used in the event that keypoints are specified, in which case when the image is horizontally flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below:
  keypoint_flip_permutation: 1
  keypoint_flip_permutation: 0
  keypoint_flip_permutation: 2
  keypoint_flip_permutation: 3
  keypoint_flip_permutation: 5
  keypoint_flip_permutation: 4
If nothing is specified, the order of keypoints will be maintained.
The probability of running this augmentation for each image.
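As a usage sketch, the flip op sits under train_config's data_augmentation_options; the permutation shown assumes the six keypoints from the example above:

data_augmentation_options {
  random_horizontal_flip {
    keypoint_flip_permutation: 1
    keypoint_flip_permutation: 0
    keypoint_flip_permutation: 2
    keypoint_flip_permutation: 3
    keypoint_flip_permutation: 5
    keypoint_flip_permutation: 4
    probability: 0.5
  }
}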
Randomly enlarges or shrinks image (keeping aspect ratio).
Used in:
Randomly jitters the corners of boxes in the image, determined by ratio; i.e., if a box is [100, 200] and ratio is 0.02, the corners can move by [1, 4].
Used in:
Applies a jpeg encoding with a random quality factor.
Used in:
Probability of keeping the original image.
Minimum jpeg quality to use.
Maximum jpeg quality to use.
Configuration proto for random normal initializer. See https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
Used in:
Randomly adds padding to the image.
Used in:
Minimum dimensions for padded image. If unset, will use original image dimension as a lower bound.
Maximum dimensions for padded image. If unset, will use double the original image dimension as a lower bound.
Color of the padding. If unset, will pad using average color of the input image.
Used in:
Probability of keeping the original image.
The patch size will be chosen to be in the range [min_patch_size, max_patch_size).
The standard deviation of the gaussian noise applied within the patch will be chosen to be in the range [min_gaussian_stddev, max_gaussian_stddev).
Randomly scales the values of all pixels in the image by some constant value between [minval, maxval], then clips the values to the range [0, 1.0].
Used in:
Randomly converts the entire image to grayscale.
Used in:
Randomly resizes the image up to [target_height, target_width].
Used in:
Randomly rotates the image and detections by 90 degrees counter-clockwise with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to 90 degree counter clockwise indices. This is used in the event that keypoints are specified, in which case when the image is rotated the keypoints might need to be permuted.
The probability of running this augmentation for each image.
Randomly scales, crops, and then pads an image to the desired square output dimensions. Specifically, this method first samples a random_scale factor from a uniform distribution between scale_min and scale_max, and then resizes the image such that its maximum dimension is (output_size * random_scale). Second, a square output_size crop is extracted from the resized image, and finally the cropped region is padded to the desired square output_size. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1911.09070
Used in:
The (square) output image size
The minimum and maximum values from which to sample the random scale.
Randomly concatenates the image with itself horizontally and/or vertically.
Used in:
Probability of concatenating the image vertically.
Probability of concatenating the image horizontally.
Extract a square-sized crop from an image whose side length is sampled by randomly scaling the maximum spatial dimension of the image. If part of the crop falls outside the image, it is filled with zeros. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1904.07850
Used in:
The maximum size of the border. The border defines the distance in pixels to the image boundaries that will not be considered as a center of a crop. To make sure that the border does not go over the center of the image, we choose the border value by computing the minimum k such that (max_border / (2**k)) < image_dimension/2.
The minimum and maximum values of scale.
The number of discrete scale values to randomly sample between [min_scale, max_scale]
Randomly vertically flips the image and detections with the specified probability, default to 50% of the time.
Used in:
Specifies a mapping from the original keypoint indices to vertically flipped indices. This is used in the event that keypoints are specified, in which case when the image is vertically flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below:
  keypoint_flip_permutation: 1
  keypoint_flip_permutation: 0
  keypoint_flip_permutation: 2
  keypoint_flip_permutation: 3
  keypoint_flip_permutation: 5
  keypoint_flip_permutation: 4
The probability of running this augmentation for each image.
Configuration proto for region similarity calculators. See core/region_similarity_calculator.py for details.
Used in:
Proto with one-of field for regularizers.
Used in:
Remap a set of labels to a new label.
Used in:
Labels to be remapped.
Label to map to.
Resizes images to [new_height, new_width].
Used in:
Used in:
Enumeration type for image resizing methods provided in TensorFlow.
Used in:
Corresponds to tf.image.ResizeMethod.BILINEAR
Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
Corresponds to tf.image.ResizeMethod.BICUBIC
Corresponds to tf.image.ResizeMethod.AREA
Used in:
Hyperparameters for convolution ops used in the box predictor.
Bin sizes for RFCN crops.
Target depth to reduce the input image features to.
Size of the encoding for the boxes.
Size to resize the rfcn crops to.
Randomly crops an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropOperations. Only one operation (chosen at random) is actually performed on an image.
Used in:
Randomly crops an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.
Used in:
Aspect ratio to crop to. This value is used for all crop operations.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Randomly crops and pads an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropPadOperations. Only one operation (chosen at random) is actually performed on an image.
Used in:
Randomly crops and pads an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropPadFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.
Used in:
Aspect ratio to pad to. This value is used for all crop and pad operations.
Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
Max ratio of padded image height and width to the input image's height and width. Two entries per operation.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Used in:
Cropped image must cover at least this fraction of one original bounding box.
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
The area of the cropped image must be within the range of [min_area, max_area].
Cropped box area ratio must be above this threshold to be kept.
Whether to clip the boxes to the cropped image.
Probability a crop operation is skipped.
Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
Max ratio of padded image height and width to the input image's height and width. Two entries per operation.
Padding color.
Scales boxes from normalized coordinates to pixel coordinates.
Used in:
(message has no fields)
Message for class-agnostic Sigmoid Calibration.
Used in:
Message mapping class index to Sigmoid Parameters.
Sigmoid Focal cross entropy loss as described in https://arxiv.org/abs/1708.02002
Used in:
DEPRECATED, do not use.
Modulating factor for the loss.
Alpha weighting factor for the loss.
Message defining parameters for sigmoid calibration.
Used in:
Configuration proto for SquareBoxCoder. See box_coders/square_box_coder.py for details.
Used in:
Scale factor for anchor encoded box center.
Scale factor for anchor encoded box length.
Configuration for Single Shot Detection (SSD) models. Next id: 27
Used in:
Number of classes to predict.
Image resizer for preprocessing the input image.
Feature extractor config.
Box coder to encode the boxes.
Matcher to match groundtruth with anchors.
Region similarity calculator to compute similarity of boxes.
Whether background targets are to be encoded as an all zeros vector or a one-hot vector (where background is the 0th class).
Classification weight to be associated with negative anchors (default: 1.0). The weight must be in [0., 1.].
Box predictor to attach to the features.
Anchor generator to compute anchors.
Post processing to apply on the predictions.
Whether to normalize the loss by number of groundtruth boxes that match to the anchors.
Whether to normalize the localization loss by the code size of the box encodings. This is applied along with other normalization factors.
Loss configuration for training.
Whether to update batch norm parameters during training or not. When training with a relatively small batch size (e.g. 1), it is desirable to disable batch norm update and use pretrained batch norm params. Note: Some feature extractors are used with canned arg_scopes (e.g. resnet arg scopes). In these cases the training behavior of batch norm variables may depend on both the values of `batch_norm_trainable` and `is_training`. When canned arg_scopes are used with feature extractors, `conv_hyperparams` will apply only to the additional layers that are added and are outside the canned arg_scope.
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters.
Whether to add an implicit background class to one-hot encodings of groundtruth labels. Set to false if training a single class model or using an explicit background class.
Whether to use an explicit background class. Set to true if using groundtruth labels with an explicit background class, as in multiclass scores.
Configs for mask head.
Configuration proto for MaskHead. Next id: 11
Used in:
The height and the width of the predicted mask. Only used when predict_instance_masks is true.
Whether to predict class agnostic masks. Only used when predict_instance_masks is true.
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes.
The number of convolutions applied to image_features in the mask prediction branch.
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [`mask_height`, `mask_width`] before applying convolutions and predicting masks.
Mask loss weight.
Number of boxes to be generated at training time for computing mask loss.
Hyperparameters for convolution ops used in the box predictor.
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. Only used when we have second stage prediction head enabled (e.g. mask head).
Configuration proto for SSD anchor generator described in https://arxiv.org/abs/1512.02325. See anchor_generators/multiple_grid_anchor_generator.py for details.
Used in:
Number of grid layers to create anchors for.
Scale of anchors corresponding to finest resolution.
Scale of anchors corresponding to coarsest resolution.
Can be used to override min_scale->max_scale, with an explicitly defined set of scales. If empty, then min_scale->max_scale is used.
Aspect ratios for anchors at each grid point.
When this aspect ratio is greater than 0, an additional anchor with an interpolated scale is added at this aspect ratio.
Whether to use the following aspect ratio and scale combinations for the layer with the finest resolution: (scale=0.1, aspect_ratio=1.0), (scale=min_scale, aspect_ratio=2.0), (scale=min_scale, aspect_ratio=0.5).
The base anchor size in height dimension.
The base anchor size in width dimension.
Anchor stride in height dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor stride in width dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor height offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
Anchor width offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
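A typical ssd_anchor_generator config as commonly seen in SSD pipelines; the scales and aspect ratios are illustrative values, not mandates:
  anchor_generator {
    ssd_anchor_generator {
      num_layers: 6
      min_scale: 0.2
      max_scale: 0.95
      aspect_ratios: 1.0
      aspect_ratios: 2.0
      aspect_ratios: 0.5
      aspect_ratios: 3.0
      aspect_ratios: 0.3333
      reduce_boxes_in_lowest_layer: true
    }
  }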
Configuration proto for the SSD feature extractor. Next id: 20.
Used in:
Type of ssd feature extractor.
The factor to alter the depth of the channels in the feature extractor.
Minimum number of the channels in the feature extractor.
Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor.
Normally, SSD feature extractors are constructed by reusing an existing base feature extractor (that has its own hyperparams) and adding new layers on top of it. `conv_hyperparams` above normally applies only to the new layers while base feature extractor uses its own default hyperparams. If this value is set to true, the base feature extractor's hyperparams will be overridden with the `conv_hyperparams`.
The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even.
Whether to use explicit padding when extracting SSD multiresolution features. This will also apply to the base feature extractor if a MobileNet architecture is used.
Whether to use depthwise separable convolutions to extract the additional feature maps added by SSD.
Feature Pyramid Networks config.
Bidirectional Feature Pyramid Networks config.
If true, replace preprocess function of feature extractor with a placeholder. This should only be used if all the image preprocessing steps happen outside the graph.
The number of SSD layers.
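A feature extractor sketch; the type string names one of the stock extractors (here "ssd_mobilenet_v2" is assumed) and the hyperparameter values are illustrative:
  feature_extractor {
    type: "ssd_mobilenet_v2"
    depth_multiplier: 1.0
    min_depth: 16
    pad_to_multiple: 1
    conv_hyperparams {
      regularizer { l2_regularizer { weight: 4e-5 } }
      initializer { truncated_normal_initializer { stddev: 0.03 } }
    }
  }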
Used in:
String name. The most common practice is to set this to a MID or synset id.
Integer id that maps to the string name above. Label ids should start from 1.
Human readable string label.
Label ids for the elements that are connected in the hierarchy with the current element. Value should correspond to another label id element.
Names of the class-specific keypoints for each class of object, along with their respective keypoint IDs.
Used in:
Id for the keypoint. Id must be unique within a given class, however, it could be shared across classes. For example "nose" keypoint can occur in both "face" and "person" classes. Hence they can be mapped to the same id. Note: It is advised to assign ids in range [1, num_unique_keypoints] to encode keypoint targets efficiently.
Label for the keypoint.
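Label map entries are written in pbtxt; a sketch follows, where the MID-style name and the keypoint labels are illustrative placeholders:
  item {
    name: "/m/01g317"
    id: 1
    display_name: "person"
    keypoints { id: 1 label: "nose" }
    keypoints { id: 2 label: "left_eye" }
  }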
Normalizes an image by subtracting a mean from each channel.
Used in:
The mean to subtract from each channel. Its dimension should match the number of channels in the input image.
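A sketch of this step used as a preprocessing option; the per-channel values below are the commonly used ImageNet RGB means, shown only as an example:
  data_augmentation_options {
    subtract_channel_mean {
      means: 123.68
      means: 116.779
      means: 103.939
    }
  }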
An input reader that reads TF Example or TF Sequence Example protos from local TFRecord files.
Used in:
Path(s) to `TFRecordFile`s.
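A sketch of an input reader config; the paths and the sharded file pattern are placeholders:
  train_input_reader {
    label_map_path: "/path/to/label_map.pbtxt"
    tf_record_input_reader {
      input_path: "/path/to/train.record-?????-of-00100"
    }
  }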
Message to configure Target Assigner for object detectors.
Message for Temperature Scaling Calibration.
Used in:
Configuration for thresholded-intersection-over-union similarity calculator.
Used in:
IOU threshold used for filtering scores.
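Assuming the thresholded_iou_similarity field name from the region similarity calculator proto, a minimal sketch:
  similarity_calculator {
    thresholded_iou_similarity {
      iou_threshold: 0.5
    }
  }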
Message for configuring DetectionModel training jobs (train.py). Next id: 30
Used in:
Effective batch size to use for training. For TPU (or sync SGD jobs), the batch size per core (or GPU) is going to be `batch_size` / number of cores (or `batch_size` / number of GPUs).
Data augmentation options.
Whether to synchronize replicas during training.
How frequently to keep checkpoints.
Optimizer used to train the DetectionModel.
If greater than 0, clips gradients by this value.
Checkpoint to restore variables from. Typically used to load feature extractor variables trained outside of object detection.
Type of checkpoint to restore variables from, e.g. 'classification' or 'detection'. Provides extensibility to from_detection_checkpoint. Typically used to load feature extractor variables from trained models.
Either "v1" or "v2". If v1, restores the checkpoint using the tensorflow v1 style of restoring checkpoints. If v2, uses the eager mode checkpoint restoration API.
[Deprecated]: use fine_tune_checkpoint_type instead. Specifies whether the finetune checkpoint is from an object detection model. If it is, the model being trained should have the same parameters, with the exception of the num_classes parameter. If false, the checkpoint is assumed to be from an object classification model.
Whether to load all checkpoint vars that match model variable names and sizes. This option is only available if `from_detection_checkpoint` is True. This option is *not* supported for TF2 --- setting it to true will raise an error.
Number of steps to train the DetectionModel for. If 0, will train the model indefinitely.
Number of training steps between replica startup. This flag must be set to 0 if sync_replicas is set to true.
If greater than 0, multiplies the gradient of bias variables by this amount.
Variables that should be updated during training. Note that variables which also match the patterns in freeze_variables will be excluded.
Variables that should not be updated during training. If update_trainable_variables is not empty, the freeze_variables patterns only remove variables from the set already selected by update_trainable_variables.
Number of replicas to aggregate before making parameter updates.
Maximum number of elements to store within a queue.
Number of threads to use for batching.
Maximum capacity of the queue used to prefetch assembled batches.
If true, boxes with the same coordinates will be merged together. This is useful when each box can have multiple labels. Note that only Sigmoid classification losses should be used.
If true, will use multiclass scores from object annotations as ground truth. Currently only compatible with annotated image inputs.
Whether to add regularization loss to `total_loss`. This is true by default and adds all regularization losses defined in the model to `total_loss`. Setting this option to false is very useful while debugging the model and losses.
Maximum number of boxes used during training. Set this to at least the maximum amount of boxes in the input data. Otherwise, it may cause "Data loss: Attempted to pad to a smaller size than the input element" errors.
Whether to remove padding along `num_boxes` dimension of the groundtruth tensors.
Whether to retain original images (i.e. not pre-processed) in the tensor dictionary, so that they can be displayed in Tensorboard. Note that this will lead to a larger memory footprint.
Whether to use bfloat16 for training. This is currently only supported for TPUs.
Whether to summarize gradients.
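A train_config sketch tying several of the fields above together; all values are illustrative and the checkpoint path is a placeholder:
  train_config {
    batch_size: 64
    num_steps: 50000
    fine_tune_checkpoint: "/path/to/model.ckpt"
    fine_tune_checkpoint_type: "detection"
    data_augmentation_options { random_horizontal_flip {} }
    optimizer {
      momentum_optimizer {
        learning_rate {
          cosine_decay_learning_rate { learning_rate_base: 0.04 total_steps: 50000 }
        }
        momentum_optimizer_value: 0.9
      }
    }
    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
  }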
Convenience message for configuring a training and eval pipeline. Allows all of the pipeline parameters to be configured from one file. Next id: 8
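The top-level pipeline file simply nests the individual configs documented in this file; a skeleton with sub-messages elided:
  model { ssd { } }       # see the Ssd message above
  train_config { }        # see the train config above
  train_input_reader { }  # e.g. a tf_record_input_reader
  eval_config { }
  eval_input_reader { }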
Description of data used to fit the calibration model. CLASS_SPECIFIC indicates that the calibration parameters are derived from detections pertaining to a single class. ALL_CLASSES indicates that parameters were obtained by fitting a model on detections from all classes (including the background class).
Used in:
Configuration proto for truncated normal initializer. See https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
Used in:
Configuration proto for variance scaling initializer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer
Used in:
Used in:
Configuration proto for weight shared convolutional box predictor. Next id: 19
Used in:
Hyperparameters for convolution ops used in the box predictor.
Number of the additional conv layers before the predictor.
Output depth for the convolution ops prior to predicting box encodings and class predictions.
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height).
Size of the encoding for boxes.
Bias initialization for class prediction. It has been shown to stabilize training when there is a large number of negative boxes. See https://arxiv.org/abs/1708.02002 for details.
Whether to use dropout for class prediction.
Keep probability for dropout.
Whether to share the multi-layer tower between box prediction and class prediction heads.
Whether to use depthwise separable convolution for box predictor layers.
Callable elementwise score converter at inference time.
If specified, apply clipping to box encodings.
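A sketch of this predictor as it commonly appears in FPN-style SSD configs; the bias init of -4.6 corresponds to a ~0.01 prior probability per the focal loss paper, and all numbers are illustrative:
  box_predictor {
    weight_shared_convolutional_box_predictor {
      depth: 256
      num_layers_before_predictor: 4
      kernel_size: 3
      class_prediction_bias_init: -4.6
      conv_hyperparams {
        activation: RELU_6
        regularizer { l2_regularizer { weight: 4e-4 } }
        initializer { random_normal_initializer { stddev: 0.01 } }
        batch_norm { scale: true }
      }
    }
  }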
Used in:
Enum to specify how to convert the detection scores at inference time.
Used in:
Input scores equal output scores.
Applies a sigmoid on input scores.
Intersection over union location loss: 1 - IOU
Used in:
(message has no fields)
L2 location loss: 0.5 * ||weight * (a - b)||^2
Used in:
DEPRECATED, do not use. Output loss per anchor.
Classification loss using a sigmoid function over class predictions.
Used in:
DEPRECATED, do not use. Output loss per anchor.
SmoothL1 (Huber) location loss. The smooth L1 loss is defined elementwise as 0.5 * x^2 if |x| <= delta and delta * (|x| - 0.5 * delta) otherwise, where x is the difference between predictions and targets.
Used in:
DEPRECATED, do not use. Output loss per anchor.
Delta value for huber loss.
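A sketch of this loss inside the enclosing loss config; the sibling classification loss and the two weights are illustrative choices, not part of this message:
  loss {
    localization_loss { weighted_smooth_l1 { delta: 1.0 } }
    classification_loss { weighted_sigmoid_focal { gamma: 2.0 alpha: 0.25 } }
    localization_weight: 1.0
    classification_weight: 1.0
  }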
Classification loss using a softmax function over class predictions and a softmax function over the groundtruth labels (assumed to be logits).
Used in:
DEPRECATED, do not use.
Scale and softmax groundtruth logits before calculating softmax classification loss. Typically used for softmax distillation with teacher annotations stored as logits.
Classification loss using a softmax function over class predictions.
Used in:
DEPRECATED, do not use. Output loss per anchor.
Scale logit (input) value before calculating softmax classification loss. Typically used for softmax distillation.
Message to store a domain/range pair for the function to be approximated.
Used in:
Sequence of x/y pairs for function approximation.
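Assuming the x_y_pair field naming from the calibration proto (worth verifying against calibration.proto), a sketch of a small piecewise-linear function:
  x_y_pairs {
    x_y_pair { x: 0.0 y: 0.0 }
    x_y_pair { x: 0.5 y: 0.4 }
    x_y_pair { x: 1.0 y: 1.0 }
  }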
Description of data used to fit the calibration model.
Used in: