List of supported opsets for deploying the quantized model. The quantized model contains a different set of ops depending on the opset.
go/do-include-enum-unspecified
Uses TF ops that mimic quantization behavior. Used when the corresponding integer op is not yet present.
Uses TF XLA ops
Uses TF Uniform Quantized ops
Model quantization method for optimization. Various techniques for model quantization are defined within this message along with a field that specifies a method to be used for a particular quantization request.
The quantization method is either an experimental or a non-experimental method.
Experimental quantization methods. These methods are either not implemented or have unstable behavior.
This should never be used. Using this will generally result in an error.
go/do-include-enum-unspecified
Static range quantization. Quantized tensor values' ranges are statically determined.
Dynamic range quantization. Quantized tensor values' ranges are determined during graph execution. The weights are quantized during conversion.
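The difference between the two methods can be illustrated with a minimal sketch in plain Python. This is generic affine quantization, not TensorFlow's implementation; the helper names (`scale_and_zero_point`, `quantize_static`, `quantize_dynamic`) and the calibration values are hypothetical and exist only for illustration.

```python
def scale_and_zero_point(lo, hi, num_bits=8):
    """Affine quantization parameters for the value range [lo, hi]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range so that 0.0 is always representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to unsigned integers, clamping to the representable range."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]


# Static range: the range is fixed ahead of time (here from assumed
# calibration data) and reused for every input.
calibration = [-1.0, 0.5, 3.0]
static_params = scale_and_zero_point(min(calibration), max(calibration))


def quantize_static(activations):
    return quantize(activations, *static_params)


# Dynamic range: the range is recomputed from the values seen at execution time.
def quantize_dynamic(activations):
    params = scale_and_zero_point(min(activations), max(activations))
    return quantize(activations, *params)
```

In this sketch, a value outside the calibrated range (such as 6.0 with the calibration data above) is clamped by `quantize_static`, whereas `quantize_dynamic` adapts its range to each input.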
Quantization methods that are supported as a stable API.
This should never be used. Using this will generally result in an error.
go/do-include-enum-unspecified
Defines various options to specify and control the behavior of the quantizer. It consists of: 1) a model-wise quantization configuration used as the default configuration (if it is None, the default configuration is "do not quantize the model"); 2) a set of supported operations; 3) unit-wise quantization precision; 4) the target hardware name.
The default quantization configuration for the model. If no unit-wise configuration is given below, this default configuration applies to the entire model. If a unit-wise configuration is given, this default applies to the units that the unit-wise configuration does not specify.
Quantization precision for each unit. Units can be either nodes or ops, and a mixture of the two is allowed. If the unit-wise precisions conflict or are ambiguous, the quantizer raises an error.
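The interaction between the default configuration and the unit-wise configuration can be sketched as follows. This is an illustrative model only, not the quantizer's actual code: the function `resolve_precisions`, the tuple layouts, and the string labels for precisions are all assumptions.

```python
def resolve_precisions(units, default=None, unit_wise=()):
    """Pick a precision for every unit.

    units:     list of (node_name, op_type) pairs (hypothetical layout).
    default:   model-wise precision; None means "do not quantize".
    unit_wise: list of (kind, name, precision), where kind is "node" or "op".
    """
    default = default if default is not None else "FULL"
    resolved = {}
    for node_name, op_type in units:
        # Collect every precision requested for this unit, by node or by op.
        matches = {
            precision
            for kind, name, precision in unit_wise
            if (kind == "node" and name == node_name)
            or (kind == "op" and name == op_type)
        }
        if len(matches) > 1:
            # Conflicting or ambiguous unit-wise settings are an error.
            raise ValueError(f"conflicting precisions for {node_name}: {sorted(matches)}")
        # Units without a unit-wise entry fall back to the default.
        resolved[node_name] = matches.pop() if matches else default
    return resolved


units = [("conv1", "Conv2D"), ("fc1", "MatMul"), ("fc2", "MatMul")]
result = resolve_precisions(units, default="W8A8",
                            unit_wise=[("op", "Conv2D", "W4A8")])
```

Here `conv1` takes the op-wise precision and the two `MatMul` nodes fall back to the default. Adding a node-wise entry for `fc1` that disagrees with an op-wise entry for `MatMul` would raise an error in this sketch.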
Minimum number of weight elements required for quantization to be applied. Currently only supported for Post-training Dynamic Range Quantization. By default, it is set to 1024. To disable this threshold, set the value to -1 explicitly.
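A minimal sketch of how such a threshold might be checked. The function name `should_quantize_weight` is hypothetical, and treating the threshold as inclusive (a weight with exactly 1024 elements is quantized) is an assumption based on the word "minimum", not confirmed behavior.

```python
DEFAULT_MIN_NUM_ELEMENTS = 1024  # default stated in the field description


def should_quantize_weight(num_elements, min_num_elements=DEFAULT_MIN_NUM_ELEMENTS):
    """Return True if a weight with `num_elements` elements should be quantized."""
    if min_num_elements == -1:
        # -1 disables the threshold: every weight is eligible.
        return True
    # Assumption: the threshold is inclusive.
    return num_elements >= min_num_elements
```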
Quantization precisions. If the specified quantization precision is not available, the quantizer must raise an error.
Full precision (do not quantize)
4-bit weight and 4-bit activation quantization
4-bit weight and 8-bit activation quantization
8-bit weight and 8-bit activation quantization
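The four precisions can be modeled as (weight bits, activation bits) pairs. The sketch below is illustrative: the enum member names and the `check_precision` helper are hypothetical and are not the identifiers used by the actual message.

```python
from enum import Enum


class Precision(Enum):
    # (weight bits, activation bits); None means the tensor is left unquantized.
    FULL = (None, None)
    W4A4 = (4, 4)
    W4A8 = (4, 8)
    W8A8 = (8, 8)


def check_precision(precision, available):
    """Return the bit widths, or raise if the precision is not available."""
    if precision not in available:
        raise ValueError(f"precision {precision.name} is not available")
    return precision.value


weight_bits, activation_bits = check_precision(
    Precision.W4A8, available={Precision.FULL, Precision.W4A8, Precision.W8A8}
)
```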
Unit-wise (currently either nodes or ops) quantization method for mixed bit precision quantization. It contains the name of the unit, the granularity of the unit, and the quantization method for each unit.
Available quantization unit. Currently node-wise and op-wise are available quantization units.
Uniqueness isn't guaranteed across SavedModels, but it is guaranteed within each function def. The configuration interfaces reflect this: if users do not need to guarantee uniqueness, func_name can be omitted.
Quantization option information for the current unit. TODO(b/241322587): Support specifying quantization method for each unit of TF GraphDef.
Quantization unit granularity.
This should never be used. Using this will generally result in an error.