An error that occurred during benchmarking. Used with event type ERROR.
Used in:
How far benchmarking got.
Process exit code.
Signal the process received.
Handled tflite error.
Mini-benchmark error code.
Top-level benchmarking event stored on-device. All events for a model are parsed to detect the status.
Used in:
Which settings were used for benchmarking.
Type of the event.
Result of benchmark, used when type is END.
Error during benchmark, used when type is ERROR.
Start timestamps. These are used for (1) checking whether a test was started but not completed within a given deadline and (2) optionally, telemetry timestamps.
Which stage of benchmarking the event is for. There might be multiple events with the same type, if a benchmark is run multiple times.
Used in:
Benchmark start. A start without an end can be interpreted as a test that has crashed or hung.
Benchmarking completion. A model was successfully loaded, acceleration configured and inference run without errors. There may still be an issue with correctness of results, or with performance.
Benchmark was not completed due to an error. The error may be a handled error (e.g., failure in a delegate), or a crash.
Benchmark data has been sent for logging.
Benchmark encountered an error but was able to continue. The error is not related to the model execution but to the mini-benchmark logic. An example of such an error is a failure when trying to set the CPU affinity of the benchmark runner process.
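The way a run's status can be derived from its stored events can be sketched as follows. This is a minimal illustration, not the mini-benchmark's actual logic: only END and ERROR are named in the text above, so the other enum member names (START, LOGGED, RECOVERED_ERROR) and the returned status strings are assumptions made for this example.

```python
from enum import Enum


class EventType(Enum):
    # END and ERROR are named in the text; the other names are assumed.
    START = "start"
    END = "end"
    ERROR = "error"
    LOGGED = "logged"
    RECOVERED_ERROR = "recovered_error"


def benchmark_status(event_types):
    """Classify one benchmark run from the event types recorded for it."""
    if EventType.ERROR in event_types:
        return "failed"
    if EventType.END in event_types:
        # A recovered error does not prevent completion.
        return "completed"
    if EventType.START in event_types:
        # A start without an end: the test may have crashed or hung.
        return "crashed_or_hung"
    return "not_started"
```

A caller would typically group the stored events per settings configuration before classifying each group.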
Represents a failure during the initialization of the mini-benchmark.
Used in:
Status code returned by the mini-benchmark initialization function.
A correctness metric from a benchmark, for example KL-divergence between known-good CPU output and on-device output. These are primarily used for telemetry and monitored server-side.
Used in:
Outcome of a successfully completed benchmark run. This information is intended both to be used on-device to select the best compute configuration and to be sent to the server for monitoring. Used with event type END.
Used in:
Time to load model and apply acceleration. Initialization may get run multiple times to get information on variance.
Time to run inference (call Invoke()). Inference may get run multiple times to get information on variance.
Maximum memory used. Measures the size of the application heap (does not necessarily take into account driver-side allocation).
Whether the inference produced correct results (validation graph output 'ok' for all test inputs). Used on-device to disallow configurations that produce incorrect results (e.g., due to OpenCL driver bugs).
Metrics that were used to determine the 'ok' status.
When during benchmark execution an error occurred.
Used in:
During model loading or delegation.
During inference.
Where to store mini-benchmark state.
Used in:
Base path to the files used to store benchmark results in. Two files will be generated: one with the given path and an extra file to store events related to best acceleration results at path storage_file_path + ".extra.fb". Must be specific to the model. Note: on Android, this should be the code cache directory.
Path to a directory for intermediate files (lock files, extracted binaries). Note: on Android, this is typically the data cache directory (i.e. the one returned by `getCacheDir()`).
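The two storage files described above can be derived as in the sketch below. The helper name, the `model_id` parameter, and the `_benchmark.fb` file name pattern are hypothetical; only the `.extra.fb` suffix and the requirement that the path be model-specific come from the text.

```python
import os


def storage_paths(code_cache_dir, model_id):
    """Return (storage_file_path, extra_file_path) for one model.

    The "<model_id>_benchmark.fb" naming is an assumption for illustration;
    the ".extra.fb" suffix is the one described in the text.
    """
    storage_file_path = os.path.join(code_cache_dir, f"{model_id}_benchmark.fb")
    return storage_file_path, storage_file_path + ".extra.fb"
```

Including the model identifier in the file name is one simple way to keep the path specific to the model.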
Represents the decision on the best acceleration from the mini-benchmark.
Used in:
Number of events used to take the decision. Using just the size instead of the full list of events to save space.
Event with the minimum latency among the source events.
Min latency as read from min_latency_event.
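A best-acceleration decision of this shape could be computed roughly as follows. This is a sketch under stated assumptions: `EndEvent` is a simplified, hypothetical stand-in for a benchmark END event, the latency of an event is taken to be the smallest of its repeated inference timings, and events whose results were not `ok` are excluded. None of these choices is confirmed by the text; only `min_latency_event` is a field name taken from it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EndEvent:
    # Hypothetical, simplified view of an END event and its result.
    settings_name: str
    ok: bool
    inference_time_us: List[int]


@dataclass
class Decision:
    number_of_source_events: int
    min_latency_event: Optional[EndEvent]
    min_inference_time_us: Optional[int]


def decide(events):
    """Pick the correct configuration with the lowest observed latency."""
    candidates = [e for e in events if e.ok and e.inference_time_us]
    if not candidates:
        return Decision(len(events), None, None)
    best = min(candidates, key=lambda e: min(e.inference_time_us))
    # Only the count of source events is kept, not the events themselves.
    return Decision(len(events), best, min(best.inference_time_us))
```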
Used in:
Set to -1 to let the interpreter choose. Otherwise, must be > 0.
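The constraint on this value can be checked with a small helper such as the one below. The parameter name `num_threads` is an assumption (the field name is not shown in the text), and raising `ValueError` is just one possible way to report an invalid value.

```python
def validate_num_threads(num_threads):
    """Accept -1 (let the interpreter choose) or any positive value."""
    if num_threads == -1 or num_threads > 0:
        return num_threads
    raise ValueError("value must be -1 or greater than 0")
```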
Indicates the type and a human readable text for an error in an operation.
Used in:
Type of the errors.
Human readable message explaining the error.
Used in:
Quantization scale and/or zero point are not in the supported value(s) for the accelerated operation. Applied DDC(s): NNAPI
Indicates that the caller specified an invalid argument, such as incorrect stride values. Applied DDC(s): GPU
Indicates an internal error has occurred and some invariants expected by the underlying system have not been satisfied, such as expecting a different number of input or output tensors. Applied DDC(s): GPU
Indicates the operation is not implemented or supported in this service. In this case, the operation should not be re-attempted. Applied DDC(s): GPU
Indicates the operation was attempted past the valid range, such as requesting an index that goes beyond the array size. Applied DDC(s): GPU
The operator is not supported by the Delegate. Applied DDC(s): NNAPI
The given operation or operands are not supported on the specified runtime feature level. The min supported version is specified in the compatibility failure message. Applied DDC(s): NNAPI
The version of the operator (value of OpSignature.version) for the given op is not supported. The max supported version is specified in the compatibility failure message. For more details on each operator version see the GetBuiltinOperatorVersion function in third_party/tensorflow/lite/tools/versioning/op_version.cc. Applied DDC(s): NNAPI
The given input operand type is not supported for the current combination of operator type and runtime feature level. Applied DDC(s): NNAPI
When using NN API version 1.0 or 1.1, the condition input_scale * filter_scale < output_scale must be true for quantized versions of the following ops: CONV_2D, DEPTHWISE_CONV_2D, and FULLY_CONNECTED (where filter actually stands for weights). The condition is relaxed and no longer required since version 1.2. Applied DDC(s): NNAPI
The given output operand type is not supported for the current combination of operator type and runtime feature level. Applied DDC(s): NNAPI
The size of the operand tensor is too large. Applied DDC(s): NNAPI
The value of one of the operands or of a combination of operands is not supported. Details are provided in the compatibility failure message. Applied DDC(s): NNAPI
The combination of float inputs and quantized weights or filters is not supported. Applied DDC(s): NNAPI
The quantization type (for example per-channel quantization) is not supported. Applied DDC(s): NNAPI
The accelerated version of operation requires a specific operand to be specified. Applied DDC(s): NNAPI
The rank of the operand is not supported. Details in the compatibility failure message. Applied DDC(s): NNAPI
The input tensor cannot be dynamically-sized. Applied DDC(s): NNAPI
The operator has a different number of inputs from the number or numbers supported by NNAPI. Applied DDC(s): NNAPI
The accelerated version of the operator cannot specify an activation function. Applied DDC(s): NNAPI
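Among the failure types above, the NN API 1.0/1.1 scale condition is a plain arithmetic check and can be sketched directly. The function name and the representation of the NN API version as a `(major, minor)` tuple are assumptions for this example; it is not the delegate's own validation code.

```python
# Ops whose quantized versions are subject to the scale condition.
OPS_WITH_SCALE_CONDITION = {"CONV_2D", "DEPTHWISE_CONV_2D", "FULLY_CONNECTED"}


def scale_condition_ok(op, nnapi_version, input_scale, filter_scale, output_scale):
    """Check input_scale * filter_scale < output_scale where it applies.

    nnapi_version is a (major, minor) tuple, e.g. (1, 1).
    For FULLY_CONNECTED, filter_scale stands for the weights scale.
    """
    if op not in OPS_WITH_SCALE_CONDITION:
        return True
    if nnapi_version >= (1, 2):
        # The condition is no longer required since version 1.2.
        return True
    return input_scale * filter_scale < output_scale
```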
One result for each operation.
One possible acceleration configuration.
Which preference to use this accelerator for.
How to configure TFLite.
Identifiers to use for instrumentation and telemetry.
'Maybe' acceleration: use mini-benchmark to select settings.
Coral Dev Board / USB accelerator delegate settings. See https://github.com/google-coral/edgetpu/blob/master/libedgetpu/edgetpu_c.h
Used in:
The Edge Tpu device to be used. See https://github.com/google-coral/libcoral/blob/982426546dfa10128376d0c24fd8a8b161daac97/coral/tflite_utils.h#L131-L137
The desired performance level. This setting adjusts the internal clock rate to achieve different performance / power balance. Higher performance values improve speed, but increase power usage.
If true, always perform device firmware update (DFU) after reset. DFU is usually only necessary after power cycle.
The maximum bulk in queue length. Larger queue length may improve USB performance on the direction from device to host. When not specified (or zero), `usb_max_bulk_in_queue_length` will default to 32 according to the current EdgeTpu Coral implementation.
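The defaulting behaviour described for `usb_max_bulk_in_queue_length` amounts to the following. The helper name is hypothetical, and the value 32 is the default stated in the text for the current EdgeTpu Coral implementation, so it may differ in other versions.

```python
DEFAULT_USB_MAX_BULK_IN_QUEUE_LENGTH = 32  # default stated in the text


def effective_queue_length(usb_max_bulk_in_queue_length=None):
    """Return the queue length to use, treating unset or zero as the default."""
    if not usb_max_bulk_in_queue_length:
        return DEFAULT_USB_MAX_BULK_IN_QUEUE_LENGTH
    return usb_max_bulk_in_queue_length
```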
Used in:
CoreML Delegate settings. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/coreml/coreml_delegate.h
Used in:
Only create delegate when Neural Engine is available on the device.
Specifies the target Core ML version for model conversion. Core ML 3 comes with many more ops, but some ops (e.g. reshape) are not delegated due to an input rank constraint. If not set to one of the valid versions, the delegate will use the highest version possible on the platform. Valid versions: (2, 3)
This sets the maximum number of Core ML delegates created. Each graph corresponds to one delegated node subset in the TFLite model. Set this to 0 to delegate all possible partitions.
This sets the minimum number of nodes per partition delegated with Core ML delegate. Defaults to 2.
Note the enum order change from the above header for better proto practice.
Used in:
Always create Core ML delegate.
Create Core ML delegate only on devices with Apple Neural Engine.
TFLite accelerator to use.
Used in:
The EdgeTpu in Pixel devices.
The Coral EdgeTpu Dev Board / USB accelerator.
Apple CoreML.
EdgeTPU device spec.
Used in:
Execution platform for the EdgeTPU device.
Number of chips to use for the EdgeTPU device.
Paths to the EdgeTPU devices.
Chip family used by the EdgeTpu device.
EdgeTPU platform types.
Used in:
Used in:
Inactive power states between inferences.
Inactive timeout in microseconds between inferences.
Generic definitions of EdgeTPU power states.
Used in:
Undefined power state.
TPU core is off but control cluster is on.
A non-active low-power state that has much smaller transition time to active compared to off.
Minimum power active state.
Very low performance, very low power.
Low performance, low power.
The normal performance and power. This setting usually provides the optimal perf/power trade-off for the average use-case.
Maximum performance level. Potentially higher power and thermal. This setting may not be allowed in production depending on the system.
EdgeTPU Delegate settings.
Used in:
Target inference power state for running the model.
Inactive power states between inferences.
Priority for the inference request.
Device spec for creating the EdgeTpu device.
A unique identifier of the input TfLite model.
Float truncation type for EdgeTPU.
QoS class to determine chunking size for PRO onward.
Float truncation types for EdgeTPU.
Used in:
Used in:
A handled error.
Used in:
Which delegate the error comes from (or NONE, if it comes from the tflite framework).
What the tflite level error is.
What the underlying error is (e.g., NNAPI or OpenGL error).
ExecutionPreference is used to match accelerators against the preferences of the current application or usecase. Some of the values here can appear both in the compatibility list and as input, some only as input. These are separate from NNAPIExecutionPreference - the compatibility list design doesn't assume a one-to-one mapping between which usecases compatibility list entries have been developed for and what settings are used for NNAPI.
Used in:
Match any selected preference. Allowlist (semantically - value is same as on input).
Match low latency preference. Both compatibility list and input.
Match low power preference. Both compatibility list and input.
Never accelerate. Can be used for input to compatibility list or for standalone Acceleration configuration.
Whether to automatically fall back to the TFLite CPU path on delegation errors. Typically fallback is enabled in production use but disabled in tests and benchmarks to ensure they test the intended path.
Used in:
Whether to allow automatically falling back to TfLite CPU path on compilation failure. Default is not allowing automatic fallback. This is useful in naive production usecases where the caller would prefer for the model to run even if it's not accelerated. More advanced users will implement fallback themselves; e.g., by using a different model on CPU. Note that compilation errors may occur either at initial ModifyGraphWithDelegate() time, or when calling AllocateTensors() after resizing.
Whether to allow automatically falling back to TfLite CPU path on execution error. Default is not allowing automatic fallback. Experimental, use with care (only when you have complete control over the client code). The caveat above for compilation error holds. Additionally, execution-time errors are harder to handle automatically as they require invalidating the TfLite interpreter which most client code has not been designed to deal with.
Which GPU backend to select. Default behaviour on Android is to try OpenCL and if it's not available fall back to OpenGL.
Used in:
Not yet supported. VULKAN = 3; METAL = 4;
GPU inference priorities define relative priorities given by the GPU delegate to different client needs. Corresponds to TfLiteGpuInferencePriority.
Used in:
GPU inference preference for initialization time vs. inference time. Corresponds to TfLiteGpuInferenceUsage.
Used in:
Delegate will be used only once, therefore, bootstrap/init time should be taken into account.
Prefer maximizing the throughput. Same delegate will be used repeatedly on multiple inputs.
GPU Delegate settings. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/gpu/delegate.h
Used in:
Ignored if inference_priority1/2/3 are set.
Ordered priorities provide better control over desired semantics, where priority(n) is more important than priority(n+1). Therefore, each time the inference engine needs to make a decision, it uses ordered priorities to do so. Default values correspond to GPU_PRIORITY_AUTO. AUTO priority can only be used when higher priorities are fully specified. For example:
  VALID: priority1 = MIN_LATENCY, priority2 = AUTO, priority3 = AUTO
  VALID: priority1 = MIN_LATENCY, priority2 = MAX_PRECISION, priority3 = AUTO
  INVALID: priority1 = AUTO, priority2 = MIN_LATENCY, priority3 = AUTO
  INVALID: priority1 = MIN_LATENCY, priority2 = AUTO, priority3 = MAX_PRECISION
Invalid priorities will result in an error. For more information, see TfLiteGpuDelegateOptionsV2.
Whether to optimize for compilation+execution time or execution time only.
Model serialization. Setting both of these fields will also set the TFLITE_GPU_EXPERIMENTAL_FLAGS_ENABLE_SERIALIZATION flag on the delegate. GPU model serialization directory passed in TfLiteGpuDelegateOptionsV2. This should be set to the application's code cache directory so that it cannot be accessed by other apps and is correctly deleted on app updates.
Normally, the model name with version number should be provided here, since each model needs a unique ID to avoid cache collision.
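The ordering rule for inference_priority1/2/3 in these settings (AUTO may only follow fully specified higher priorities) can be expressed as a short check. This sketch implements only the rule as worded in the text and reproduces its four examples. It treats all-AUTO as valid on the assumption that the defaults are acceptable, and it does not check for duplicated priorities or any other constraint the GPU delegate itself may enforce. The string constants are stand-ins for the enum values.

```python
AUTO = "AUTO"


def priorities_valid(priority1=AUTO, priority2=AUTO, priority3=AUTO):
    """Return False if a non-AUTO priority appears after an AUTO one."""
    seen_auto = False
    for priority in (priority1, priority2, priority3):
        if priority == AUTO:
            seen_auto = True
        elif seen_auto:
            # A lower priority is specified while a higher one is AUTO.
            return False
    return True
```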
Hexagon Delegate settings. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/hexagon/hexagon_delegate.h
Used in:
Events generated by the mini-benchmark before and after triggering the different configuration-specific benchmarks
Not using oneof because of the way the cpp code is generated. See comment above on TfLite settings for details.
If set to true, this event is used to mark all previous events in the mini-benchmark internal storage as read and one of the other fields in this message will have a value.
Event generated when a best acceleration decision is taken.
Reports a failure during mini-benchmark initialization.
Event generated while benchmarking the different settings to test locally.
How to run a minibenchmark. Next ID: 5
Used in:
Which settings to test. This would typically be filled in from an allowlist.
How to access the model. This would typically be set dynamically, as it depends on the application folder and/or runtime state.
Where to store state. This would typically be set dynamically, as it depends on the application folder.
Validation test related settings.
How to access the model for mini-benchmark. Since mini-benchmark runs in a separate process, it cannot access an in-memory model. It can read the model either from a file or from a file descriptor. The file descriptor typically comes from the Android asset manager. Users should set either filename, or all of fd, offset and length.
Used in:
Filename for reading model from.
File descriptor to read model from.
Offset for model in file descriptor.
Length of model in file descriptor.
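The "either filename, or all of fd, offset and length" requirement can be checked as in the sketch below. The function name and return values are hypothetical, and rejecting a mix of filename and descriptor fields is an assumption, since the text does not say how that case is handled.

```python
def model_source(filename=None, fd=None, offset=None, length=None):
    """Return "file" or "fd" depending on which access method is fully set."""
    fd_fields = (fd, offset, length)
    any_fd_field = any(v is not None for v in fd_fields)
    all_fd_fields = all(v is not None for v in fd_fields)
    if filename and not any_fd_field:
        return "file"
    if not filename and all_fd_fields:
        return "fd"
    # Partial descriptor info, nothing set, or both methods mixed (assumed invalid).
    raise ValueError("set either filename, or all of fd, offset and length")
```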
Used in:
Undefined.
Prefer executing in a way that minimizes battery drain.
Prefer returning a single answer as fast as possible, even if this causes more power consumption.
Prefer maximizing the throughput of successive frames, for example when processing successive frames coming from the camera.
Used in:
NNAPI delegate settings.
Used in:
Which instance (NNAPI accelerator) to use. One driver may provide several accelerators (though a driver may also hide several back-ends behind one name, at the choice of the driver vendor). Note that driver introspection is only available in Android Q and later.
NNAPI model compilation caching settings to be passed to tflite::StatefulNnApiDelegate
NNAPI execution preference to pass. See https://developer.android.com/ndk/reference/group/neural-networks.html
Number of instances to cache for the same model (for input size changes). This is mandatory for getting reasonable performance in that case.
Deprecated; use the fallback_settings in TFLiteSettings. Whether to automatically fall back to TFLite CPU path.
Whether to allow use of NNAPI CPU (nnapi-reference accelerator) on Android 10+ when an accelerator name is not specified. The NNAPI CPU typically performs less well than the TfLite built-in kernels, but allowing it lets a model be partially accelerated, which may be a win.
Whether to allow dynamic dimension sizes without re-compilation. A tensor with a dynamic dimension must have a valid dims_signature defined. Only supported in NNAPI 1.1 and newer versions. WARNING: Setting this flag to true may result in the model being rejected by the accelerator. This should only be enabled if the target device supports dynamic dimensions of the model. By default this is set to false.
Whether to allow the NNAPI accelerator to optionally use lower-precision float16 (16-bit floating point) arithmetic when doing calculations on float32 (32-bit floating point).
Whether to use NNAPI Burst mode. Burst mode allows accelerators to efficiently manage resources, which would significantly reduce overhead especially if the same delegate instance is to be used for multiple inferences.
Optional pointer to the NnApiSLDriverImplFL5 provided by the NNAPI Support Library, which can be used to construct the NNAPI delegate.
Result for one operation of the given model; stores whether the operation is supported. If it is supported, validation_failures will not have a value. If it is not supported, validation_failures will contain all the errors for that operation. Also saves the subgraph index inside the model and the operator index inside the subgraph.
Used in:
True if the operation is supported for the required DCC.
Index of the subgraph where this operation is contained.
Index of the operator inside the subgraph.
Type of the errors.
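A per-operation result of this kind might be consumed as follows. The `OpResult` class is a simplified, hypothetical mirror of the message: `validation_failures` is named in the text, while the other attribute names and the `(type, message)` tuple shape for failures are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OpResult:
    is_supported: bool
    subgraph_index: int
    operator_index: int
    # Each failure is a (failure_type, human_readable_message) pair.
    validation_failures: List[Tuple[str, str]] = field(default_factory=list)


def unsupported_ops(results):
    """List (subgraph, operator, failures) for every unsupported operation."""
    return [
        (r.subgraph_index, r.operator_index, r.validation_failures)
        for r in results
        if not r.is_supported
    ]
```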
How to configure TFLite.
Used in:
Which delegate to use.
How to configure the chosen delegate. (In principle we would like to use 'oneof', but flatc turns that into an nested anonymous table rather than a union. See https://github.com/google/flatbuffers/issues/4628).
How to configure CPU execution.
Shared delegation settings.
For configuring the EdgeTpuDelegate.
For configuring the Coral EdgeTpu Delegate.
Whether to automatically fall back to TFLite CPU path.
Whether to disable default delegates (XNNPack).
Validation related settings. Next ID: 2
Used in:
Timeout for one setting under test. If the test does not finish within this timeout, the setting is considered hanging.
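Combined with a start timestamp, this timeout gives a simple hang check. The sketch below assumes all times are in seconds and that the caller already knows whether a completion or error event was recorded. The parameter names are hypothetical, and the real units of the stored timestamps and of the timeout are not stated in the text.

```python
def is_hanging(start_time_s, now_s, timeout_s, finished):
    """Return True if a started test has exceeded its timeout without finishing.

    finished: whether an END or ERROR event was recorded after the start.
    """
    if finished:
        return False
    return (now_s - start_time_s) > timeout_s
```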
XNNPack Delegate settings. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h
Used in:
These flags match the flags in xnnpack_delegate.h.
Enable fast signed integer XNNpack kernels.
Enable fast unsigned integer XNNpack kernels.
Enable both signed and unsigned integer XNNpack kernels.
Force 16-bit floating point inference.
Used in: