An ArraysExtraInfo message stores a collection of additional information about arrays in a model, complementing the information in the model itself. It is intentionally a separate message so that it may be serialized and passed separately from the model. See --arrays_extra_info_file. A typical use case is to manually specify MinMax for specific arrays in a model that does not already contain such MinMax information.
Next ID to use: 8.
Supported I/O file formats. Some formats may be input-only or output-only.
GraphDef, third_party/tensorflow/core/framework/graph.proto
TensorFlow's mobile inference model. third_party/tensorflow/contrib/tflite/schema.fbs
GraphViz. Export-only.
IODataType describes the numeric data types of input and output arrays of a model.
Float32, not quantized
Uint8, quantized
Int32, not quantized
Int64, not quantized
String, not quantized
Int16, quantized
Boolean
Complex64, not quantized
Int8, quantized based on QuantizationParameters in schema.
Half precision float, not quantized.
Next ID to use: 7.
Name of the input arrays, i.e. the arrays from which input activations will be read.
Shape of the input. For many applications the dimensions are {batch, height, width, depth}. Often the batch is left "unspecified" by providing a value of -1. The last dimension is typically called 'depth' or 'channels'. For example, for an image model taking RGB images as input, this would have the value 3.
mean_value and std_value parameters control the interpretation of raw input activation values (elements of the input array) as real numbers. The mapping is given by:

  real_value = (raw_input_value - mean_value) / std_value

In particular, the defaults (mean_value=0, std_value=1) yield real_value = raw_input_value. Often, non-default values are used in image models. For example, an image model taking uint8 image channel values as its raw inputs, in the [0, 255] range, may use mean_value=128, std_value=128 to map them into the interval [-1, 1). Note: this matches exactly the meaning of mean_value and std_value in (TensorFlow via LegacyFedInput).
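The mapping above is simple enough to sketch directly; the function name below is ours, not part of the proto:

```python
def raw_to_real(raw_input_value, mean_value=0.0, std_value=1.0):
    """Interpret a raw input activation value as a real number."""
    return (raw_input_value - mean_value) / std_value

# With the defaults (mean_value=0, std_value=1) the mapping is the identity:
assert raw_to_real(100) == 100.0

# uint8 image channels in [0, 255] with mean_value=128, std_value=128
# map into the interval [-1, 1):
assert raw_to_real(0, 128, 128) == -1.0
assert raw_to_real(255, 128, 128) == 127 / 128
```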
Data type of the input. In many graphs, the input arrays already have defined data types, e.g. Placeholder nodes in a TensorFlow GraphDef have a dtype attribute. In those cases, it is not necessary to specify this data_type flag. The purpose of this flag is only to define the data type of input arrays whose type isn't defined in the input graph file, for example when specifying an arbitrary (non-Placeholder) --input_array into a TensorFlow GraphDef. When this data_type is quantized (e.g. QUANTIZED_UINT8), the corresponding quantization parameters are the mean_value and std_value fields.

It is also important to understand the nuance between this data_type flag and the inference_input_type in TocoFlags. The basic difference is that this data_type (like all ModelFlags) describes a property of the input graph, while inference_input_type (like all TocoFlags) describes an aspect of the toco transformation process and thus of the output file. The types of input arrays may differ between the input and output files if quantization or dequantization occurred. Such differences can only occur for real-number data, i.e. only between FLOAT and quantized types (e.g. QUANTIZED_UINT8).
ModelFlags encodes properties of a model that, depending on the file format, may or may not be recorded in the model file. The purpose of representing these properties in ModelFlags is to allow passing them separately from the input model file, for instance as command-line parameters, so that we can offer a single uniform interface that can handle files from different input formats. For each of these properties, and each supported file format, we detail in comments below whether the property exists in the given file format.

Obsolete flags that have been removed:
  optional int32 input_depth = 3;
  optional int32 input_width = 4;
  optional int32 input_height = 5;
  optional int32 batch = 6 [default = 1];
  optional float mean_value = 7;
  optional float std_value = 8 [default = 1.];
  optional int32 input_dims = 11 [default = 4];
  repeated int32 input_shape = 13;

Next ID to use: 20.
Information about the input arrays, i.e. the arrays from which input activations will be read.
Name of the output arrays, i.e. the arrays into which output activations will be written.
If true, the model accepts an arbitrary batch size. Mutually exclusive with the 'batch' field: at most one of these two fields can be set.
If true, nonexistent arrays may be passed in --input_arrays and --output_arrays. This makes little sense and is only useful to more easily obtain graph visualizations.
If true, non-ASCII-printable characters may be passed in --input_arrays and --output_arrays. By default (if false), only ASCII printable characters are allowed, i.e. character codes ranging from 32 to 127. This is disallowed by default so as to catch common copy-and-paste issues where invisible Unicode characters are unwittingly added to these strings.
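The default check is easy to state as code; this is a sketch of the documented 32-to-127 rule, not toco's actual implementation:

```python
def is_ascii_printable(array_name):
    # Only character codes 32..127 are allowed by default, per the
    # allow_nonascii_arrays documentation.
    return all(32 <= ord(c) <= 127 for c in array_name)

assert is_ascii_printable("input_tensor:0")
# A pasted-in invisible zero-width space (U+200B) fails the check:
assert not is_ascii_printable("input\u200btensor")
```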
If set, this ArraysExtraInfo allows passing extra information about arrays not specified in the input model file, such as extra MinMax information.
When set to false, toco will not change the input ranges and the output ranges of the concat operator to the overlap of all input ranges.
Checks applied to the model, typically after toco's comprehensive graph transformations. Next ID to use: 4.
Use the name of a type of operator to check its counts. Use "Total" for overall operator counts. Use "Arrays" for overall array counts.
A count of zero is a meaningful check, so a negative value is used to mean "disabled".
If count_max < count_min, then count_min is the only allowed value.
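One plausible reading of these count semantics can be sketched as follows; the function name and exact disable condition are our interpretation, not quoted from toco:

```python
def check_passes(actual_count, count_min=-1, count_max=-1):
    """Evaluate one ModelCheck against an observed operator/array count."""
    # A negative count_min disables the check (zero is a meaningful bound).
    if count_min < 0:
        return True
    # If count_max < count_min, count_min is the only allowed value.
    if count_max < count_min:
        return actual_count == count_min
    return count_min <= actual_count <= count_max

assert check_passes(5)                 # default bounds: check disabled
assert check_passes(0, 0, 0)           # a count of zero is a real check
assert not check_passes(3, 2, -1)      # exact-count mode: only 2 passes
```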
size allows specifying a 1-D shape for the RNN state array. It will be expanded with 1's to fit the model. TODO(benoitjacob): should allow a generic, explicit shape.
TocoFlags encodes extra parameters that drive tooling operations and that are not normally encoded in model files. In general they may not be thought of as properties of models; instead, they describe how models are to be processed in the context of the present tooling job. Next ID to use: 31.
Input file format
Output file format
Similar to inference_type, but allows controlling specifically the quantization of input arrays, separately from other arrays. If not set, then the value of inference_type is implicitly used, i.e. by default input arrays are quantized like other arrays. Like inference_type, this only affects real-number arrays. By "real-number" we mean float arrays and quantized arrays. This excludes plain integer arrays, string arrays, and every other data type. The typical use for this flag is for vision models taking a bitmap as input, typically with uint8 channels, yet still requiring floating-point inference. For such image models, the uint8 input is quantized, i.e. the uint8 values are interpreted as real numbers, and the quantization parameters used for such input arrays are their mean_value and std_value parameters.
Sets the type of real-number arrays in the output file, that is, controls the representation (quantization) of real numbers in the output file, except for input arrays, which are controlled by inference_input_type.

NOTE: this flag only impacts real-number arrays. By "real-number" we mean float arrays and quantized arrays. This excludes plain integer arrays, string arrays, and every other data type. For real-number arrays, the impact of this flag is to allow the output file to choose a different real-number representation (quantization) from what the input file used. For any other type of array, changing the data type would not make sense. Specifically:
- If FLOAT, then real-number arrays will be of type float in the output file. If they were quantized in the input file, then they get dequantized.
- If QUANTIZED_UINT8, then real-number arrays will be quantized as uint8 in the output file. If they were float in the input file, then they get quantized.
- If not set, then all real-number arrays retain the same type in the output file as they have in the input file.
default_ranges_min and default_ranges_max are helpers to experiment with quantization of models. Normally, quantization requires the input model to have (min, max) range information for every activations array. This is needed in order to know how to quantize arrays and still achieve satisfactory accuracy. However, in some circumstances one would just like to estimate the performance of quantized inference, without caring about accuracy. That is what default_ranges_min and default_ranges_max are for: when specified, they will be used as default (min, max) range boundaries for all activation arrays that lack (min, max) range information, thus allowing for quantization to proceed. It should be clear from the above explanation that these parameters are for experimentation purposes only and should not be used in production: they make it easy to quantize models, but the resulting quantized model will be inaccurate. These values only apply to arrays quantized with the kUint8 data type.
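As a sketch of how a default (min, max) range would let quantization proceed, the following derives uint8 affine quantization parameters from a range. The formulas are a standard affine quantization scheme, not quoted from toco, and the function names are ours:

```python
def uint8_quantization_params(range_min, range_max):
    """Derive (scale, zero_point) for affine uint8 quantization of a range."""
    scale = (range_max - range_min) / 255.0
    zero_point = round(-range_min / scale)
    return scale, zero_point

def quantize(real_value, scale, zero_point):
    """Map a real number to its uint8 representation, clamping to [0, 255]."""
    q = round(real_value / scale) + zero_point
    return min(255, max(0, q))

# An experiment-only default range, e.g. --default_ranges_min=-6
# --default_ranges_max=6, applied to an array lacking MinMax info:
scale, zp = uint8_quantization_params(-6.0, 6.0)
assert 0 <= zp <= 255
assert quantize(0.0, scale, zp) == zp     # real 0 maps to the zero point
assert quantize(-100.0, scale, zp) == 0   # out-of-range values are clamped
```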
Equivalent versions of default_ranges_min/_max for arrays quantized with the kInt16 data type.
Ignore and discard FakeQuant nodes. For instance, this can be used to generate plain float code without fake-quantization from a quantized graph.
Normally, FakeQuant nodes must be strict boundaries for graph transformations, in order to ensure that quantized inference has the exact same arithmetic behavior as quantized training --- which is the whole point of quantized training and of FakeQuant nodes in the first place. However, that entails subtle requirements on where exactly FakeQuant nodes must be placed in the graph. Some quantized graphs have FakeQuant nodes at unexpected locations, that prevent graph transformations that are necessary in order to generate inference code for these graphs. Such graphs should be fixed, but as a temporary work-around, setting this reorder_across_fake_quant flag allows toco to perform necessary graph transformations on them, at the cost of no longer faithfully matching inference and training arithmetic.
If true, allow TOCO to create TF Lite Custom operators for all the unsupported TensorFlow ops.
Applies only to the case when the input format is TENSORFLOW_GRAPHDEF. If true, then control dependencies will be immediately dropped during import. If not set, the default behavior is as follows:
- Default to false if the output format is TENSORFLOW_GRAPHDEF.
- Default to true in all other cases.
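The default-resolution logic described above can be sketched as follows; the function name and the string placeholders for format names are ours:

```python
def drop_control_dependencies(flag_value, output_format):
    """Resolve the effective value of the drop-control-dependency flag."""
    if flag_value is not None:
        return flag_value  # explicitly set by the user: honor it
    # Unset: keep control dependencies only when the output is also a
    # TensorFlow GraphDef, i.e. when round-tripping GraphDefs.
    return output_format != "TENSORFLOW_GRAPHDEF"

assert drop_control_dependencies(None, "TENSORFLOW_GRAPHDEF") is False
assert drop_control_dependencies(None, "TFLITE") is True
assert drop_control_dependencies(True, "TENSORFLOW_GRAPHDEF") is True
```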
Disables transformations that fuse subgraphs such as known LSTMs (not all LSTMs are identified).
Uses the FakeQuantWithMinMaxArgs.num_bits attribute to adjust quantized array data types throughout the graph. The graph must be properly annotated with FakeQuant* ops on at least the edges and may contain additional ops on the interior of the graph to widen/narrow as desired. Input and output array data types may change because of this propagation and users must be sure to query the final data_type values.
Some fast uint8 GEMM kernels require uint8 weights to avoid the value 0. This flag allows nudging them to 1 to allow proceeding, with moderate inaccuracy.
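A minimal sketch of the nudge, assuming the quantized weights are available as a plain list of uint8 values (the function name is ours):

```python
def nudge_uint8_weights(weights):
    """Replace the value 0 with 1 in quantized uint8 weights, for GEMM
    kernels that cannot accept 0. Introduces a small quantization error."""
    return [1 if w == 0 else w for w in weights]

assert nudge_uint8_weights([0, 17, 255, 0]) == [1, 17, 255, 1]
```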
Minimum size of constant arrays to deduplicate; smaller arrays will not be deduplicated.
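Deduplication subject to a minimum size can be sketched as follows; this is an illustration of the idea, not toco's implementation, and the names are ours:

```python
def deduplicate_constants(arrays, min_size):
    """Map equal constant arrays of at least min_size elements onto one
    canonical name; arrays below the threshold are left alone."""
    canonical = {}  # array contents -> first name seen with those contents
    remap = {}      # array name -> canonical name
    for name, values in arrays.items():
        if len(values) < min_size:
            continue  # too small: not worth the indirection of sharing
        key = tuple(values)
        remap[name] = canonical.setdefault(key, name)
    return remap

arrays = {"a": [1, 2, 3], "b": [1, 2, 3], "c": [9]}
# "b" is folded into "a"; "c" is below the threshold and untouched:
assert deduplicate_constants(arrays, min_size=2) == {"a": "a", "b": "a"}
```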
Split the LSTM inputs from 5 tensors to 18 tensors for TFLite. Ignored if the output format is not TFLite.
Store weights as quantized weights followed by dequantize operations. Computation is still done in float, but reduces model size (at the cost of accuracy and latency). DEPRECATED: Please use post_training_quantize instead.
Full filepath of the folder in which to dump GraphViz .dot files of the graph at various stages of processing. Preferred over --output_format=GRAPHVIZ_DOT in order to keep the requirements of the output file.
Boolean indicating whether to dump the graph after every graph transformation.
Boolean indicating whether to quantize the weights of the converted float model. Model size will be reduced and there will be latency improvements (at the cost of accuracy).
This flag only works when converting to TensorFlow Lite format. When enabled, unsupported ops will be converted to select TensorFlow ops. TODO(ycling): Consider renaming the following 2 flags and don't call it "Flex". `enable_select_tf_ops` should always be used with `allow_custom_ops`. WARNING: Experimental interface, subject to change
This flag only works when converting to TensorFlow Lite format. When enabled, all TensorFlow ops will be converted to select TensorFlow ops. This will force `enable_select_tf_ops` to true. `force_select_tf_ops` should always be used with `enable_select_tf_ops`. WARNING: Experimental interface, subject to change
Boolean indicating whether to convert float32 constant buffers to float16. This is typically done to reduce model size. Delegates may also wish to implement kernels on reduced precision floats for performance gains.
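The precision loss this trades for a smaller model can be demonstrated with the standard library alone, since `struct` supports IEEE half precision via the 'e' format; the function name is ours:

```python
import struct

def round_to_float16(x):
    """Round a float to IEEE 754 half precision and back, showing the
    precision lost when float32 constant buffers are stored as float16."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# Values exactly representable in half precision survive unchanged:
assert round_to_float16(1.0) == 1.0
# Most values do not; float16 keeps only an 11-bit significand:
assert round_to_float16(3.14159265) != 3.14159265
```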
Boolean flag indicating whether the converter should allow models with dynamic Tensor shape. When set to False, the converter will generate runtime memory offsets for activation Tensors (with 128 bits alignment) and error out on models with undetermined Tensor shape. (Default: True)