Inference Server GRPC endpoints.
Check liveness of the inference server.
ServerLive messages.
(message has no fields)
True if the inference server is live, false if not live.
Check readiness of the inference server.
ServerReady messages.
(message has no fields)
True if the inference server is ready, false if not ready.
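A client typically polls the liveness and readiness checks before sending work. The following is a minimal sketch of such a polling loop, not part of the service definition. The names `wait_until` and `wait_for_server` are hypothetical, and the two check callables stand in for whatever wrapper a client uses to call ServerLive and ServerReady and return the boolean from the response.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 30.0,
               interval_s: float = 0.5,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` until it returns True or `timeout_s` elapses.

    `check` is a hypothetical wrapper around a ServerLive or ServerReady
    call that returns the boolean carried by the response.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def wait_for_server(is_live: Callable[[], bool],
                    is_ready: Callable[[], bool],
                    **kwargs) -> bool:
    """Wait for liveness first, then for readiness."""
    return wait_until(is_live, **kwargs) and wait_until(is_ready, **kwargs)
```

Checking liveness before readiness is a design choice of this sketch: a server that is not live cannot be ready, so the cheaper failure is detected first.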
Check readiness of a model in the inference server.
ModelReady messages.
The name of the model to check for readiness.
The version of the model to check for readiness. If not given the server will choose a version based on the model and internal policy.
True if the model is ready, false if not ready.
Get server metadata.
ServerMetadata messages.
(message has no fields)
The server name.
The server version.
The extensions supported by the server.
Get model metadata.
ModelMetadata messages.
The name of the model.
The version of the model whose metadata is requested. If not given, the server will choose a version based on the model and internal policy.
The model name.
The versions of the model available on the server.
The model's platform. See Platforms.
The model's inputs.
The model's outputs.
Optional default parameters for the request / response. NOTE: This is an extension to the standard
Perform inference using a specific model.
Perform stream inference using a specific model.
Get the index of model repository contents.
The name of the repository. If empty the index is returned for all repositories.
If true return only models currently ready for inferencing.
An index entry for each model.
Load or reload a model from a repository.
The name of the repository to load from. If empty the model is loaded from any repository.
The name of the model to load, or reload.
Optional model repository request parameters.
(message has no fields)
Unload a model.
The name of the repository from which the model was originally loaded. If empty the repository is not considered.
The name of the model to unload.
Optional model repository request parameters.
(message has no fields)
An inference parameter value.
The parameter value can be a string, an int64, a boolean or a message specific to a predefined parameter.
A boolean parameter value.
An int64 parameter value.
A string parameter value.
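Because a parameter holds exactly one of these value kinds, a client has to pick the matching field for each value. The sketch below shows one way to do that with plain dictionaries. The field names `bool_param`, `int64_param` and `string_param` and the helper `to_infer_parameter` are assumptions made for illustration; this page does not list the field names.

```python
from typing import Dict, Union

ParamValue = Union[bool, int, str]

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def to_infer_parameter(value: ParamValue) -> Dict[str, ParamValue]:
    """Map a Python value to a single-key dict naming the chosen field.

    The key names are assumed, not taken from this page.
    """
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"bool_param": value}
    if isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            raise ValueError("integer does not fit in int64")
        return {"int64_param": value}
    if isinstance(value, str):
        return {"string_param": value}
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")
```

For example, `to_infer_parameter("high")` selects the string field, while `to_infer_parameter(1.5)` raises `TypeError` because no floating-point choice is described above.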
The data contained in a tensor. For a given data type the tensor contents can be represented in "raw" bytes form or in the repeated type that matches the tensor's data type. Protobuf oneof is not used because oneofs cannot contain repeated fields.
Representation for BOOL data type. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for INT8, INT16, and INT32 data types. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for INT64 data types. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for UINT8, UINT16, and UINT32 data types. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for UINT64 data types. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for FP32 data type. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for FP64 data type. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
Representation for BYTES data type. The size must match what is expected by the tensor's shape. The contents must be the flattened, one-dimensional, row-major order of the tensor elements.
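Every representation above requires the same layout: a flat list in row-major order whose length equals the product of the shape dimensions. The following sketch illustrates that rule with nested Python lists. The helper name `flatten_row_major` is hypothetical.

```python
from math import prod
from typing import Any, List, Sequence


def flatten_row_major(nested: Any, shape: Sequence[int]) -> List[Any]:
    """Flatten nested lists of the given shape; the last dimension varies fastest."""
    if not shape:
        return [nested]  # a scalar element
    if len(nested) != shape[0]:
        raise ValueError(
            f"expected {shape[0]} items along this dimension, got {len(nested)}"
        )
    flat: List[Any] = []
    for item in nested:
        flat.extend(flatten_row_major(item, shape[1:]))
    return flat


shape = [2, 3]
flat = flatten_row_major([[1, 2, 3], [4, 5, 6]], shape)
# The flattened length must equal the number of elements implied by the shape.
assert len(flat) == prod(shape)
```

Here the 2x3 tensor flattens to `[1, 2, 3, 4, 5, 6]`: the first row is emitted in full before the second row starts.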
ModelInfer messages.
Used as request type in: GRPCInferenceService.ModelInfer, GRPCInferenceService.ModelStreamInfer
The name of the model to use for inferencing.
The version of the model to use for inference. If not given the server will choose a version based on the model and internal policy.
Optional identifier for the request. If specified, it is returned in the response.
Optional inference parameters.
The input tensors for the inference.
The requested output tensors for the inference. Optional; if not specified, all outputs produced by the model are returned.
The data contained in an input tensor can be represented in "raw" bytes form or in the repeated type that matches the tensor's data type. Using the "raw" bytes form will typically allow higher performance due to the way protobuf allocation and reuse interacts with GRPC. For example, see https://github.com/grpc/grpc/issues/23231. To use the raw representation 'raw_input_contents' must be initialized with data for each tensor in the same order as 'inputs'. For each tensor, the size of this content must match what is expected by the tensor's shape and data type. The raw data must be the flattened, one-dimensional, row-major order of the tensor elements without any stride or padding between the elements. Note that the FP16 and BF16 data types must be represented as raw content as there is no specific data type for a 16-bit float type. If this field is specified then InferInputTensor::contents must not be specified for any input tensor.
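The raw form described above can be sketched with Python's standard `struct` module. This is an illustration under stated assumptions, not the server's reference encoding: the byte order is assumed to be little-endian (the text above does not specify one), only a few data types are covered, and the helper name `pack_raw` is hypothetical.

```python
import struct
from math import prod
from typing import Sequence

# struct format characters for a few of the data types named above.
# Little-endian byte order is an assumption of this sketch.
_FORMATS = {"FP16": "e", "FP32": "f", "FP64": "d", "INT32": "i", "INT64": "q"}


def pack_raw(datatype: str, shape: Sequence[int], flat_values: Sequence) -> bytes:
    """Pack flattened, row-major values into contiguous bytes with no padding."""
    expected = prod(shape)
    if len(flat_values) != expected:
        raise ValueError(
            f"shape {list(shape)} needs {expected} elements, got {len(flat_values)}"
        )
    # The "<" prefix selects standard sizes, so no alignment padding is inserted.
    return struct.pack(f"<{expected}{_FORMATS[datatype]}", *flat_values)


# One entry per input tensor, in the same order as the request's inputs.
raw_input_contents = [
    pack_raw("FP32", [2, 3], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    pack_raw("FP16", [2], [1.0, 2.0]),
]
```

The FP16 entry shows why 16-bit floats need the raw form: the bytes can be produced directly even though no repeated field exists for that type.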
An input tensor for an inference request.
The tensor name.
The tensor data type.
The tensor shape.
Optional inference input tensor parameters.
The input tensor data. This field must not be specified if tensor contents are being specified in ModelInferRequest.raw_input_contents.
An output tensor requested for an inference request.
The tensor name.
Optional requested output tensor parameters.
Used as response type in: GRPCInferenceService.ModelInfer, GRPCInferenceService.ModelStreamInfer
The name of the model used for inference.
The version of the model used for inference.
The id of the inference request if one was specified.
Optional inference response parameters.
The output tensors holding inference results.
The data contained in an output tensor can be represented in "raw" bytes form or in the repeated type that matches the tensor's data type. Using the "raw" bytes form will typically allow higher performance due to the way protobuf allocation and reuse interacts with GRPC. For example, see https://github.com/grpc/grpc/issues/23231. To use the raw representation 'raw_output_contents' must be initialized with data for each tensor in the same order as 'outputs'. For each tensor, the size of this content must match what is expected by the tensor's shape and data type. The raw data must be the flattened, one-dimensional, row-major order of the tensor elements without any stride or padding between the elements. Note that the FP16 and BF16 data types must be represented as raw content as there is no specific data type for a 16-bit float type. If this field is specified then InferOutputTensor::contents must not be specified for any output tensor.
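On the client side, a raw output entry has to be decoded using the data type and shape reported for the matching output tensor. The sketch below is one way to do this with the standard library. Little-endian byte order is assumed (the text above does not specify one), only a few data types are covered, and the helper names `unpack_raw` and `reshape` are hypothetical.

```python
import struct
from math import prod
from typing import Any, List, Sequence

# struct format characters for a few data types; little-endian is assumed.
_UNPACK_FORMATS = {"FP16": "e", "FP32": "f", "FP64": "d", "INT32": "i", "INT64": "q"}


def reshape(flat: Sequence[Any], shape: Sequence[int]) -> Any:
    """Rebuild nested lists from a flat row-major sequence."""
    if not shape:
        return flat[0]
    step = prod(shape[1:])  # number of elements in each slice along dimension 0
    return [reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def unpack_raw(datatype: str, shape: Sequence[int], raw: bytes) -> List[Any]:
    """Decode raw tensor bytes, checking the size against shape and data type."""
    fmt = _UNPACK_FORMATS[datatype]
    count = prod(shape)
    expected_size = count * struct.calcsize("<" + fmt)
    if len(raw) != expected_size:
        raise ValueError(f"expected {expected_size} bytes, got {len(raw)}")
    flat = struct.unpack(f"<{count}{fmt}", raw)
    return reshape(flat, shape)
```

The size check mirrors the rule stated above: the content length must equal the element count implied by the shape multiplied by the element size of the data type.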
An output tensor returned for an inference request.
The tensor name.
The tensor data type.
The tensor shape.
Optional output tensor parameters.
The output tensor data. This field must not be specified if tensor contents are being specified in ModelInferResponse.raw_output_contents.
Metadata for a tensor.
The tensor name.
The tensor data type.
The tensor shape. A variable-size dimension is represented by a -1 value.
Optional default parameters for input. NOTE: This is an extension to the standard
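The -1 convention for variable-size dimensions in the tensor metadata shape can be checked against a concrete tensor shape before a request is sent. A minimal sketch follows; the helper name `shape_matches` is hypothetical.

```python
from typing import Sequence


def shape_matches(metadata_shape: Sequence[int], actual_shape: Sequence[int]) -> bool:
    """Return True if `actual_shape` is compatible with the metadata shape.

    A -1 in the metadata shape accepts any non-negative size in that position;
    every other dimension must match exactly, and the ranks must be equal.
    """
    if len(metadata_shape) != len(actual_shape):
        return False
    return all(
        actual >= 0 and (expected == -1 or expected == actual)
        for expected, actual in zip(metadata_shape, actual_shape)
    )
```

For instance, a metadata shape of `[-1, 3, 224, 224]` accepts an actual shape of `[8, 3, 224, 224]` but rejects `[8, 1, 224, 224]`.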
A model repository parameter value.
The parameter value can be a string, an int64, a boolean or bytes.
A boolean parameter value.
An int64 parameter value.
A string parameter value.
A bytes parameter value.
Index entry for a model.
The name of the model.
The version of the model.
The state of the model.
The reason, if any, that the model is in the given state.