package tensorflow.quantization

enum OpSet

List of supported opsets for deploying the quantized model. The quantized model contains a different set of ops depending on the opset.

Used in: QuantizationOptions

OP_SET_UNSPECIFIED = 0
go/do-include-enum-unspecified
TF = 1
Uses TF ops that mimic quantization behavior. Used when the corresponding integer op is not yet present.
XLA = 2
Uses TF XLA ops
UNIFORM_QUANTIZED = 3
Uses TF Uniform Quantized ops

message QuantizationMethod

Model quantization method for optimization. This message defines the available quantization techniques, along with a field that selects the method to use for a particular quantization request.

Used in: QuantizationOptions

oneof method_oneof
The quantization method is either an experimental or a non-experimental method.
- QuantizationMethod.Method method = 1
- QuantizationMethod.ExperimentalMethod experimental_method = 2
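
Because the two fields share a oneof, at most one of them is set at a time, and setting one clears the other. A minimal Python sketch of that behavior follows. `MethodChoice` is a hypothetical stand-in written for illustration, not the generated protobuf class; the enum values are copied from this page.

```python
from enum import IntEnum
from typing import Optional


class Method(IntEnum):
    METHOD_UNSPECIFIED = 0


class ExperimentalMethod(IntEnum):
    EXPERIMENTAL_METHOD_UNSPECIFIED = 0
    STATIC_RANGE = 1
    DYNAMIC_RANGE = 2


class MethodChoice:
    """Hypothetical stand-in that mimics the method_oneof group."""

    def __init__(self) -> None:
        self._method: Optional[Method] = None
        self._experimental_method: Optional[ExperimentalMethod] = None

    def set_method(self, value: Method) -> None:
        # Setting one member of a oneof clears the other member.
        self._method = value
        self._experimental_method = None

    def set_experimental_method(self, value: ExperimentalMethod) -> None:
        self._experimental_method = value
        self._method = None

    def which(self) -> Optional[str]:
        """Return the name of the member that is set, or None if neither is."""
        if self._method is not None:
            return "method"
        if self._experimental_method is not None:
            return "experimental_method"
        return None
```

With the generated Python classes, the equivalent check is normally done with the message's `WhichOneof("method_oneof")` method.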

enum QuantizationMethod.ExperimentalMethod

Experimental quantization methods. These methods are either not yet implemented or have unstable behavior.

Used in: QuantizationMethod

EXPERIMENTAL_METHOD_UNSPECIFIED = 0
This should never be used. Using this will generally result in an error.
go/do-include-enum-unspecified
STATIC_RANGE = 1
Static range quantization. Quantized tensor values' ranges are statically determined.
DYNAMIC_RANGE = 2
Dynamic range quantization. Quantized tensor values' ranges are determined during graph execution. The weights are quantized during conversion.
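
The difference between the two methods is when the value range is fixed. The sketch below illustrates that with a generic asymmetric int8 scheme: the static path takes a range decided ahead of time, while the dynamic path reads the range from the tensor at run time. The int8 target, the affine formula, and all function names are assumptions made for illustration; this is not the quantizer's actual implementation.

```python
def quantize_int8(values, lo, hi):
    """Affine-quantize floats to int8 for a real-valued range [lo, hi]."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain zero
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for an all-zero range
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


def quantize_static(values, calibrated_range):
    """Static range: the range is fixed before execution (e.g. by calibration)."""
    lo, hi = calibrated_range
    return quantize_int8(values, lo, hi)


def quantize_dynamic(values):
    """Dynamic range: the range is taken from the values seen at run time."""
    return quantize_int8(values, min(values), max(values))
```

In this sketch, a value outside a static range is clamped to the edge of that range, whereas the dynamic path always covers the values it is given, at the cost of computing the range on every call.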

enum QuantizationMethod.Method

Quantization methods that are supported as a stable API.

Used in: QuantizationMethod

METHOD_UNSPECIFIED = 0
This should never be used. Using this will generally result in an error.
go/do-include-enum-unspecified

message QuantizationOptions

Defines options that specify and control the behavior of the quantizer. It consists of:
1) A model-wise quantization configuration, used as the default configuration. If it is None, the default configuration is "do not quantize the model".
2) A set of supported operations.
3) Unit-wise quantization precision.
4) Target hardware name.

optional QuantizationMethod quantization_method = 1
The default quantization configuration for the model. If no unit-wise configuration is given, this default applies to the entire model. If unit-wise configurations are given, this default applies only to the units they do not specify.
OpSet op_set = 2
QuantizationPrecision quantization_precision = 3
repeated UnitWiseQuantizationPrecision unit_wise_quantization_precision = 4
Quantization precision for each unit. A unit can be either a node or an op, and a mixture of the two is allowed. If the unit-wise precisions conflict or are ambiguous, the quantizer raises an error.
int64 min_num_elements_for_weights = 5
Minimum number of elements a weight must have for quantization to be applied. Currently supported only for post-training dynamic range quantization. The default is 1024. To disable this threshold, explicitly set the value to -1.
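
The threshold rule for min_num_elements_for_weights can be sketched as a small predicate. `should_quantize_weight` is a hypothetical helper, not part of the library. It assumes that an unset field (value 0) falls back to the documented default of 1024 and that the minimum is inclusive; neither detail is stated on this page.

```python
DEFAULT_MIN_NUM_ELEMENTS = 1024  # default value stated in the field description


def should_quantize_weight(num_elements: int,
                           min_num_elements_for_weights: int = 0) -> bool:
    """Hypothetical helper: is this weight large enough to be quantized?"""
    if min_num_elements_for_weights == -1:
        return True  # -1 explicitly disables the threshold
    # Assumption: an unset field (0) means "use the default of 1024".
    threshold = min_num_elements_for_weights or DEFAULT_MIN_NUM_ELEMENTS
    # Assumption: the minimum is inclusive.
    return num_elements >= threshold
```

Under these assumptions, a 512-element weight is skipped by default, while the same weight is quantized once the field is set to -1.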

enum QuantizationPrecision

Quantization precisions. If the specified quantization precision is not available, the quantizer must raise an error.

Used in: QuantizationOptions, UnitWiseQuantizationPrecision

PRECISION_UNSPECIFIED = 0
PRECISION_FULL = 1
Full Precision (Do not quantize)
PRECISION_W4A4 = 2
Weight 4 bit and activation 4 bit quantization
PRECISION_W4A8 = 3
Weight 4 bit and activation 8 bit quantization
PRECISION_W8A8 = 4
Weight 8 bit and activation 8 bit quantization
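
The value names encode the bit widths as W<weight bits>A<activation bits>. A minimal sketch that decodes them is shown below; `precision_bits` is a hypothetical helper name, and only the enum value names come from this page.

```python
import re
from typing import Optional, Tuple


def precision_bits(name: str) -> Optional[Tuple[int, int]]:
    """Return (weight_bits, activation_bits) for a PRECISION_WxAy name.

    Returns None for names without bit widths, such as PRECISION_FULL
    and PRECISION_UNSPECIFIED.
    """
    match = re.fullmatch(r"PRECISION_W(\d+)A(\d+)", name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, `precision_bits("PRECISION_W4A8")` gives `(4, 8)`, and `precision_bits("PRECISION_FULL")` gives `None`.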

message UnitWiseQuantizationPrecision

Unit-wise quantization method for mixed bit precision quantization, where a unit is currently either a node or an op. It contains the name of the unit, the granularity of the unit, and the quantization method for each unit.

Used in: QuantizationOptions

UnitWiseQuantizationPrecision.UnitType unit_type = 1
The quantization unit type. Node-wise and op-wise units are currently available.
string func_name = 2
Uniqueness is not guaranteed across SavedModels, but it is guaranteed within each function def. The configuration interface reflects this: if users do not need to guarantee uniqueness, func_name can be omitted.
string unit_name = 3
QuantizationPrecision quantization_precision = 5
Quantization option information for the current unit. TODO(b/241322587): Support specifying quantization method for each unit of TF GraphDef.
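
One way to picture how unit-wise entries combine with a model-wide setting is a lookup table with a fallback. The sketch below is an illustration under stated assumptions, not the quantizer's logic: `UnitPrecision` and `build_precision_lookup` are hypothetical names, a "conflict" is assumed to mean two entries for the same (unit_type, func_name, unit_name) with different precisions, and the model-wide precision is assumed to act as the fallback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class UnitPrecision:
    """Hypothetical mirror of the UnitWiseQuantizationPrecision fields."""
    unit_type: str               # "UNIT_NODE" or "UNIT_OP"
    unit_name: str
    quantization_precision: str  # e.g. "PRECISION_W8A8"
    func_name: str = ""          # may be omitted, as described above


def build_precision_lookup(
    units: Iterable[UnitPrecision], default_precision: str
) -> Callable[..., str]:
    """Build a lookup that falls back to the model-wide precision."""
    table: Dict[Tuple[str, str, str], str] = {}
    for unit in units:
        if unit.unit_type == "UNIT_UNSPECIFIED":
            raise ValueError("unit_type must be specified")
        key = (unit.unit_type, unit.func_name, unit.unit_name)
        # Assumed conflict rule: same unit listed with different precisions.
        if key in table and table[key] != unit.quantization_precision:
            raise ValueError(f"conflicting precision for {key}")
        table[key] = unit.quantization_precision

    def lookup(unit_type: str, unit_name: str, func_name: str = "") -> str:
        return table.get((unit_type, func_name, unit_name), default_precision)

    return lookup
```

A unit that has no entry in the table resolves to the default, which matches the fallback behavior described for QuantizationOptions.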

enum UnitWiseQuantizationPrecision.UnitType

Quantization unit granularity.

Used in: UnitWiseQuantizationPrecision

UNIT_UNSPECIFIED = 0
This should never be used. Using this will generally result in an error.
UNIT_NODE = 1
UNIT_OP = 2

enum OpSet

OP_SET_UNSPECIFIED = 0

TF = 1

XLA = 2

UNIFORM_QUANTIZED = 3

message QuantizationMethod

oneof method_oneof

QuantizationMethod.Method method = 1

QuantizationMethod.ExperimentalMethod experimental_method = 2

enum QuantizationMethod.ExperimentalMethod

EXPERIMENTAL_METHOD_UNSPECIFIED = 0

STATIC_RANGE = 1

DYNAMIC_RANGE = 2

enum QuantizationMethod.Method

METHOD_UNSPECIFIED = 0

message QuantizationOptions

optional QuantizationMethod quantization_method = 1

OpSet op_set = 2

QuantizationPrecision quantization_precision = 3

repeated UnitWiseQuantizationPrecision unit_wise_quantization_precision = 4

int64 min_num_elements_for_weights = 5

enum QuantizationPrecision

PRECISION_UNSPECIFIED = 0

PRECISION_FULL = 1

PRECISION_W4A4 = 2

PRECISION_W4A8 = 3

PRECISION_W8A8 = 4

message UnitWiseQuantizationPrecision

UnitWiseQuantizationPrecision.UnitType unit_type = 1

string func_name = 2

string unit_name = 3

QuantizationPrecision quantization_precision = 5

enum UnitWiseQuantizationPrecision.UnitType

UNIT_UNSPECIFIED = 0

UNIT_NODE = 1

UNIT_OP = 2