package angel.serving

ModelService provides methods to query and update the state of the server, e.g. which models/versions are being served.

rpc GetModelStatus (GetModelStatusRequest, GetModelStatusResponse)
model_service.proto:18
Gets status of model. If the ModelSpec in the request does not specify version, information about all versions of the model will be returned. If the ModelSpec in the request does specify a version, the status of only that version will be returned.
message GetModelStatusRequest
get_model_status.proto:16
GetModelStatusRequest contains a ModelSpec indicating the model for which to get status.
- optional ModelSpec model_spec = 1
  Model Specification. If version is not specified, information about all versions of the model will be returned. If a version is specified, the status of only that version will be returned.
message GetModelStatusResponse
get_model_status.proto:68
Response for GetModelStatusRequest on a successful run.
- repeated ModelVersionStatus model_version_status = 1
  Version number and status information for applicable model version(s).
- map<string, DataType> type_map = 2
- int64 dim = 3
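
The version-selection rule described for GetModelStatus can be illustrated with a minimal Python sketch. It is not the server's implementation: `VersionStatus` and `select_version_statuses` are hypothetical names, states are plain strings, and returning an empty list for an unknown version is an assumption, since the reference does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VersionStatus:
    """Hypothetical stand-in for ModelVersionStatus (version and state only)."""
    version: int
    state: str


def select_version_statuses(
    served: Dict[int, str], requested_version: Optional[int] = None
) -> List[VersionStatus]:
    """Return statuses for all versions, or only for the requested one."""
    if requested_version is None:
        # No version in the ModelSpec: report every known version.
        return [VersionStatus(v, s) for v, s in sorted(served.items())]
    if requested_version not in served:
        # Assumption: an unknown version yields an empty result in this sketch.
        return []
    return [VersionStatus(requested_version, served[requested_version])]


served_versions = {1: "END", 2: "AVAILABLE", 3: "LOADING"}
all_statuses = select_version_statuses(served_versions)
one_status = select_version_statuses(served_versions, requested_version=2)
```

Here `all_statuses` covers versions 1, 2 and 3, while `one_status` holds only version 2.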
rpc HandleReloadConfigRequest (ReloadConfigRequest, ReloadConfigResponse)
model_service.proto:23
Reloads the set of served models. The new config supersedes the old one, so if a model is omitted from the new config it will be unloaded and no longer served.
message ReloadConfigRequest
model_management.proto:11
- optional ModelServerConfig config = 1
message ReloadConfigResponse
model_management.proto:15
- optional StatusProto status = 1
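
Because the new config supersedes the old one, the effect of a reload can be reasoned about as a set difference over model names. The sketch below assumes models are identified by name only; `plan_reload` is a hypothetical helper, not part of the service.

```python
from typing import Dict, Iterable, List


def plan_reload(current: Iterable[str], new_config: Iterable[str]) -> Dict[str, List[str]]:
    """Classify model names by what a superseding config implies for them."""
    current_set, new_set = set(current), set(new_config)
    return {
        "load": sorted(new_set - current_set),    # only in the new config
        "keep": sorted(new_set & current_set),    # in both configs
        "unload": sorted(current_set - new_set),  # omitted from the new config
    }


plan = plan_reload(
    current=["ranker", "spam_filter"],
    new_config=["ranker", "recommender"],
)
```

In this example `spam_filter` ends up under `"unload"` because the new config omits it.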

PredictionService provides access to machine-learned models loaded by model_servers.

rpc Classify (Request, Response)
prediction_service.proto:17
Classify.
rpc GetModelMetadata (GetModelMetadataRequest, GetModelMetadataResponse)
prediction_service.proto:29
GetModelMetadata - provides access to metadata for loaded models.
message GetModelMetadataRequest
get_model_metadata.proto:12
- optional ModelSpec model_spec = 1
  Model Specification indicating which model we are querying for metadata. If version is not specified, the latest (numerical) version is used.
- repeated string metadata_field = 2
  Metadata fields to get. Currently supported: "signature_def".
message GetModelMetadataResponse
get_model_metadata.proto:20
- optional ModelSpec model_spec = 1
  Model Specification indicating which model this metadata belongs to.
- map<string, google.protobuf.Any> metadata = 2
  Map of metadata field name to metadata field. The options for metadata field name are listed in GetModelMetadataRequest. Currently supported: "signature_def".
rpc MultiInference (Request, Response)
prediction_service.proto:26
MultiInference API for multi-headed models.
rpc Predict (Request, Response)
prediction_service.proto:23
Predict - provides access to a loaded TensorFlow model.
rpc Regress (Request, Response)
prediction_service.proto:20
Regress.

SessionService defines a service with which a client can interact to execute TensorFlow model inference. The SessionService::SessionRun method is similar to MasterService::RunStep of TensorFlow, except that all sessions are ready to run, and you request a specific model/session with ModelSpec.

rpc SessionRun (SessionRunRequest, SessionRunResponse)
session_service.proto:51
Runs inference of a given model.
message SessionRunRequest
session_service.proto:15
- optional ModelSpec model_spec = 1
  Model Specification. If version is not specified, the latest (numerical) version is used.
- repeated NamedTensorProto feed = 2
  Tensors to be fed in the step. Each feed is a named tensor.
- repeated string fetch = 3
  Fetches. A list of tensor names. The caller expects a tensor to be returned for each fetch[i] (see SessionRunResponse.tensor). The order of specified fetches does not change the execution order.
- repeated string target = 4
  Target Nodes. A list of node names. The named nodes will be run, but their outputs will not be fetched.
- optional RunOptions options = 5
  Options for the run call. **Currently ignored.**
message SessionRunResponse
session_service.proto:36
- repeated NamedTensorProto tensor = 1
  NOTE: The order of the returned tensors may or may not match the fetch order specified in SessionRunRequest.
- optional RunMetadata metadata = 2
  Returned metadata if requested in the options.
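
Since the returned tensors are not guaranteed to follow the fetch order, a client may prefer to look them up by name. The following is a minimal sketch under the assumption that each returned NamedTensorProto can be viewed as a `(name, value)` pair; `tensors_in_fetch_order` is a hypothetical helper and the tensor names are made up for illustration.

```python
from typing import Any, List, Sequence, Tuple


def tensors_in_fetch_order(
    fetch: Sequence[str], returned: Sequence[Tuple[str, Any]]
) -> List[Any]:
    """Reorder returned (name, value) pairs to match the requested fetch list."""
    by_name = dict(returned)
    missing = [name for name in fetch if name not in by_name]
    if missing:
        raise KeyError(f"no tensor returned for fetches: {missing}")
    return [by_name[name] for name in fetch]


# Hypothetical tensor names; the response arrives in a different order.
fetch = ["logits:0", "probs:0"]
returned = [("probs:0", [0.1, 0.9]), ("logits:0", [-1.2, 1.0])]
ordered = tensors_in_fetch_order(fetch, returned)
```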

message BytesList

Containers to hold repeated fundamental values.

Used in: Feature

repeated bytes value = 1

enum Code

The canonical error codes for TensorFlow APIs.

Warnings:
- Do not change any numeric assignments.
- Changes to this list should only be made if there is a compelling need that can't be satisfied in another way. Such changes must be approved by at least two OWNERS.

Sometimes multiple error codes may apply. Services should return the most specific error code that applies. For example, prefer OUT_OF_RANGE over FAILED_PRECONDITION if both codes apply. Similarly, prefer NOT_FOUND or ALREADY_EXISTS over FAILED_PRECONDITION.

Used in: StatusProto

OK = 0
Not an error; returned on success
CANCELLED = 1
The operation was cancelled (typically by the caller).
UNKNOWN = 2
Unknown error. An example of where this error may be returned is if a Status value received from another address space belongs to an error-space that is not known in this address space. Also errors raised by APIs that do not return enough error information may be converted to this error.
INVALID_ARGUMENT = 3
Client specified an invalid argument. Note that this differs from FAILED_PRECONDITION. INVALID_ARGUMENT indicates arguments that are problematic regardless of the state of the system (e.g., a malformed file name).
DEADLINE_EXCEEDED = 4
Deadline expired before operation could complete. For operations that change the state of the system, this error may be returned even if the operation has completed successfully. For example, a successful response from a server could have been delayed long enough for the deadline to expire.
NOT_FOUND = 5
Some requested entity (e.g., file or directory) was not found. For privacy reasons, this code *may* be returned when the client does not have the access right to the entity.
ALREADY_EXISTS = 6
Some entity that we attempted to create (e.g., file or directory) already exists.
PERMISSION_DENIED = 7
The caller does not have permission to execute the specified operation. PERMISSION_DENIED must not be used for rejections caused by exhausting some resource (use RESOURCE_EXHAUSTED instead for those errors). PERMISSION_DENIED must not be used if the caller cannot be identified (use UNAUTHENTICATED instead for those errors).
UNAUTHENTICATED = 16
The request does not have valid authentication credentials for the operation.
RESOURCE_EXHAUSTED = 8
Some resource has been exhausted, perhaps a per-user quota, or perhaps the entire file system is out of space.
FAILED_PRECONDITION = 9
Operation was rejected because the system is not in a state required for the operation's execution. For example, the directory to be deleted may be non-empty, an rmdir operation is applied to a non-directory, etc.
A litmus test that may help a service implementor in deciding between FAILED_PRECONDITION, ABORTED, and UNAVAILABLE:
(a) Use UNAVAILABLE if the client can retry just the failing call.
(b) Use ABORTED if the client should retry at a higher level (e.g., restarting a read-modify-write sequence).
(c) Use FAILED_PRECONDITION if the client should not retry until the system state has been explicitly fixed. E.g., if an "rmdir" fails because the directory is non-empty, FAILED_PRECONDITION should be returned since the client should not retry unless they have first fixed up the directory by deleting files from it.
(d) Use FAILED_PRECONDITION if the client performs conditional REST Get/Update/Delete on a resource and the resource on the server does not match the condition. E.g., conflicting read-modify-write on the same resource.
ABORTED = 10
The operation was aborted, typically due to a concurrency issue like sequencer check failures, transaction aborts, etc. See litmus test above for deciding between FAILED_PRECONDITION, ABORTED, and UNAVAILABLE.
OUT_OF_RANGE = 11
Operation tried to iterate past the valid input range. E.g., seeking or reading past end of file. Unlike INVALID_ARGUMENT, this error indicates a problem that may be fixed if the system state changes. For example, a 32-bit file system will generate INVALID_ARGUMENT if asked to read at an offset that is not in the range [0,2^32-1], but it will generate OUT_OF_RANGE if asked to read from an offset past the current file size. There is a fair bit of overlap between FAILED_PRECONDITION and OUT_OF_RANGE. We recommend using OUT_OF_RANGE (the more specific error) when it applies so that callers who are iterating through a space can easily look for an OUT_OF_RANGE error to detect when they are done.
UNIMPLEMENTED = 12
Operation is not implemented or not supported/enabled in this service.
INTERNAL = 13
Internal errors. Means some invariant expected by the underlying system has been broken. If you see one of these errors, something is very broken.
UNAVAILABLE = 14
The service is currently unavailable. This is most likely a transient condition and may be corrected by retrying with a backoff. See litmus test above for deciding between FAILED_PRECONDITION, ABORTED, and UNAVAILABLE.
DATA_LOSS = 15
Unrecoverable data loss or corruption.
DO_NOT_USE_RESERVED_FOR_FUTURE_EXPANSION_USE_DEFAULT_IN_SWITCH_INSTEAD_ = 20
An extra enum entry to prevent people from writing code that fails to compile when a new code is added. Nobody should ever reference this enumeration entry. In particular, if you write C++ code that switches on this enumeration, add a default: case instead of a case that mentions this enumeration entry. Nobody should rely on the value (currently 20) listed here. It may change in the future.
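
From the client's side, the litmus test for FAILED_PRECONDITION, ABORTED, and UNAVAILABLE amounts to a small decision function. The sketch below mirrors the numeric assignments listed above in a Python `IntEnum`, leaving out the reserved DO_NOT_USE entry on purpose. `retry_advice` and its return strings are hypothetical, and the fallback branch plays the role of the recommended `default:` case.

```python
from enum import IntEnum


class Code(IntEnum):
    """Numeric values copied from the enum listing above."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


def retry_advice(code: Code) -> str:
    """Map a code to the retry guidance given by the litmus test."""
    if code == Code.OK:
        return "success"
    if code == Code.UNAVAILABLE:
        return "retry_call"        # (a) retry just the failing call, with backoff
    if code == Code.ABORTED:
        return "retry_sequence"    # (b) retry at a higher level
    if code == Code.FAILED_PRECONDITION:
        return "fix_state_first"   # (c), (d) do not retry until state is fixed
    # Default branch: the litmus test gives no retry guidance for other codes.
    return "no_guidance"


advice = retry_advice(Code.ABORTED)
```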

message Example

Used in: ExampleList, ExampleListWithContext

optional Features features = 1

message ExampleList

Specifies one or more fully independent input Examples. See examples at: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/example/example.proto

Used in: Input

repeated Example examples = 1

message ExampleListWithContext

input.proto:68

Specifies one or more independent input Examples, with a common context Example. The common use case for context is to cleanly and optimally specify some features that are common across multiple examples. See the example below, with a search query as the context and multiple restaurants to perform some inference on.

  context: {
    feature: {
      key : "query"
      value: { bytes_list: { value: [ "pizza" ] } }
    }
  }
  examples: {
    feature: {
      key : "cuisine"
      value: { bytes_list: { value: [ "Pizzeria" ] } }
    }
  }
  examples: {
    feature: {
      key : "cuisine"
      value: { bytes_list: { value: [ "Taqueria" ] } }
    }
  }

Implementations of ExampleListWithContext merge the context Example into each of the Examples. Note that feature keys must not be duplicated between the Examples and context Example, or the behavior is undefined.

See also:
  tensorflow/core/example/example.proto
  https://developers.google.com/protocol-buffers/docs/proto3#maps

Used in: Input

repeated Example examples = 1
optional Example context = 2
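
The merge of the context Example into each Example can be sketched with plain dictionaries that stand in for feature maps. This is an illustration only: `merge_context` is a hypothetical helper, and raising on a duplicated key is a choice made for this sketch, since the reference leaves that case undefined.

```python
from typing import Dict, List

FeatureMap = Dict[str, List[str]]  # simplified stand-in for Features


def merge_context(examples: List[FeatureMap], context: FeatureMap) -> List[FeatureMap]:
    """Copy the shared context features into every example."""
    merged = []
    for example in examples:
        duplicated = example.keys() & context.keys()
        if duplicated:
            # Undefined behavior per the reference; this sketch rejects it.
            raise ValueError(f"duplicated feature keys: {sorted(duplicated)}")
        merged.append({**context, **example})
    return merged


context = {"query": ["pizza"]}
examples = [{"cuisine": ["Pizzeria"]}, {"cuisine": ["Taqueria"]}]
merged_examples = merge_context(examples, context)
```

Each merged example carries both its own `cuisine` feature and the shared `query` feature.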

message Feature

Containers for non-sequential data.

Used in: FeatureList, Features

oneof kind
Each feature can be exactly one kind.
- BytesList bytes_list = 1
- FloatList float_list = 2
- Int64List int64_list = 3

message FeatureList

Containers for sequential data. A FeatureList contains lists of Features. These may hold zero or more Feature values. FeatureLists are organized into categories by name. The FeatureLists message contains the mapping from name to FeatureList.

Used in: FeatureLists

repeated Feature feature = 1

message FeatureLists

Used in: SequenceExample

map<string, FeatureList> feature_list = 1
Map from feature name to feature list.

message Features

Used in: Example, SequenceExample

map<string, Feature> feature = 1
Map from feature name to feature.

message FloatList

Used in: Feature

repeated float value = 1

message Input

oneof kind
- ExampleList example_list = 1
- ExampleListWithContext example_list_with_context = 2

message Instance

Used in: Request, Response

DataType dType = 1
string name = 2
InstanceFlag flag = 3
optional TensorShapeProto shape = 4
oneof value
- bytes bs = 5
- int32 i = 6
- int64 l = 7
- float f = 8
- double d = 9
- bool b = 10
- string s = 11
- ListValue lv = 12
- MapValue mv = 13

enum InstanceFlag

Used in: Instance

IF_SCALAR = 0
IF_DENSE_VECTOR = 2
IF_INTKEY_SPARSE_VECTOR = 3
IF_LONGKEY_SPARSE_VECTOR = 4
IF_STRINGKEY_VECTOR = 5
IF_2D_MATRIX = 6
IF_3D_MATRIX = 7
IF_MAP_OBJECT = 8

message Int64List

Used in: Feature

repeated int64 value = 1

message ListValue

Used in: Instance, MapValue

repeated bytes bs = 1
repeated int32 i = 2
repeated int64 l = 3
repeated float f = 4
repeated double d = 5
repeated bool b = 6
repeated string s = 7
repeated ListValue vlist = 8
repeated MapValue mlist = 9

message MapValue

Used in: Instance, ListValue

map<string, bytes> s2bs_map = 1
map<string, int32> s2i_map = 2
map<string, int64> s2l_map = 3
map<string, float> s2f_map = 4
map<string, double> s2d_map = 5
map<string, bool> s2b_map = 6
map<string, string> s2s_map = 7
map<string, ListValue> s2list_map = 8
map<string, MapValue> s2map_map = 9
map<int32, bytes> i2bs_map = 11
map<int32, int32> i2i_map = 12
map<int32, int64> i2l_map = 13
map<int32, float> i2f_map = 14
map<int32, double> i2d_map = 15
map<int32, bool> i2b_map = 16
map<int32, string> i2s_map = 17
map<int32, ListValue> i2list_map = 18
map<int32, MapValue> i2map_map = 19
map<int64, bytes> l2bs_map = 21
map<int64, int32> l2i_map = 22
map<int64, int64> l2l_map = 23
map<int64, float> l2f_map = 24
map<int64, double> l2d_map = 25
map<int64, bool> l2b_map = 26
map<int64, string> l2s_map = 27
map<int64, ListValue> l2list_map = 28
map<int64, MapValue> l2map_map = 29

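message Metrics

Used in: MetricsResponse.Versions

string model_name = 1
Model name.
uint32 model_version = 2
Model version.
uint64 prediction_count_total = 3
Total requests: cumulative number of requests up to the time of the query.
uint64 prediction_count_success = 4
Total successful requests: cumulative number of successful requests up to the time of the query.
uint64 prediction_count_failed = 5
Total failed requests: cumulative number of failed requests up to the time of the query.
double total_predict_time_ms = 6
Total time of successful requests: cumulative time spent on successful requests up to the time of the query (unit: ms).
uint64 count_distribution0 = 7
Number of successful requests that took 0~5 ms.
uint64 count_distribution1 = 8
Number of successful requests that took 5~10 ms.
uint64 count_distribution2 = 9
Number of successful requests that took 10~15 ms.
uint64 count_distribution3 = 10
Number of successful requests that took more than 15 ms.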
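
These cumulative counters can be turned into rates and averages on the client side. The sketch below treats one metrics record as a plain dictionary keyed by the field names above; `summarize_metrics` and the sample numbers are hypothetical, and dividing each latency bucket by the success count assumes every successful request is counted in exactly one bucket.

```python
from typing import Dict, Union

Number = Union[int, float]


def summarize_metrics(m: Dict[str, Number]) -> Dict[str, object]:
    """Derive success rate, mean latency and bucket shares from raw counters."""
    total = m["prediction_count_total"]
    success = m["prediction_count_success"]
    # Buckets: 0~5 ms, 5~10 ms, 10~15 ms, >15 ms.
    buckets = [m[f"count_distribution{i}"] for i in range(4)]
    return {
        "success_rate": success / total if total else None,
        "mean_latency_ms": m["total_predict_time_ms"] / success if success else None,
        "bucket_share": [b / success for b in buckets] if success else None,
    }


# Made-up sample values for illustration only.
sample = {
    "prediction_count_total": 10,
    "prediction_count_success": 8,
    "prediction_count_failed": 2,
    "total_predict_time_ms": 40.0,
    "count_distribution0": 4,
    "count_distribution1": 2,
    "count_distribution2": 1,
    "count_distribution3": 1,
}
summary = summarize_metrics(sample)
```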

message MetricsResponse

map<string, MetricsResponse.Versions> models = 1

message MetricsResponse.Versions

Used in: MetricsResponse

map<string, Metrics> versions = 1

message ModelSpec

Metadata for an inference request such as the model name and version.

Used in: LogMetadata, GetModelMetadataRequest, GetModelMetadataResponse, GetModelStatusRequest, Request, Response, SessionRunRequest

string name = 1
Required servable name.
oneof version_choice
Optional choice of which version of the model to use. Recommended to be left unset in the common case. Should be specified only when there is a strong version consistency requirement. When left unspecified, the system will serve the best available version. This is typically the latest version, though during version transitions, notably when serving on a fleet of instances, it may be either the previous or the new version.
- google.protobuf.Int64Value version = 2
  Use this specific version number.
- string version_label = 4
  Use the version associated with the given label. EXPERIMENTAL. DO NOT USE AT THIS TIME.
string signature_name = 3
A named signature to evaluate. If unspecified, the default signature will be used.
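
The `version_choice` oneof means that at most one of `version` and `version_label` is set at a time. A minimal sketch of that constraint follows, using a hypothetical `ModelSpecSketch` dataclass instead of generated protobuf code. Note one simplification: a real protobuf oneof clears the previously set member when another is assigned, whereas this sketch rejects the combination outright.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSpecSketch:
    """Hypothetical stand-in for ModelSpec; not generated protobuf code."""
    name: str
    version: Optional[int] = None
    version_label: Optional[str] = None
    signature_name: str = ""  # empty means "use the default signature"

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name is required")
        if self.version is not None and self.version_label is not None:
            raise ValueError("set at most one of version / version_label")

    def version_choice(self) -> Optional[str]:
        """Name of the oneof member that is set, or None if unset."""
        if self.version is not None:
            return "version"
        if self.version_label is not None:
            return "version_label"
        return None


latest = ModelSpecSketch(name="ranker")             # recommended common case
pinned = ModelSpecSketch(name="ranker", version=3)  # strict version consistency
```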

message ModelVersionStatus

Version number, state, and status for a single version of a model.

Used in: GetModelStatusResponse

int64 version = 1
Model version.
ModelVersionStatus.State state = 2
Model state.
optional StatusProto status = 3
Model status.

enum ModelVersionStatus.State

States that map to ManagerState enum in tensorflow_serving/core/servable_state.h

Used in: ModelVersionStatus

UNKNOWN = 0
Default value.
START = 10
The manager is tracking this servable, but has not initiated any action pertaining to it.
LOADING = 20
The manager has decided to load this servable. In particular, checks around resource availability and other aspects have passed, and the manager is about to invoke the loader's Load() method.
AVAILABLE = 30
The manager has successfully loaded this servable and made it available for serving (i.e. GetServableHandle(id) will succeed). To avoid races, this state is not reported until *after* the servable is made available.
UNLOADING = 40
The manager has decided to make this servable unavailable, and unload it. To avoid races, this state is reported *before* the servable is made unavailable.
END = 50
This servable has reached the end of its journey in the manager. Either it loaded and ultimately unloaded successfully, or it hit an error at some point in its lifecycle.
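
A client that polls GetModelStatus typically wants to know which versions can currently serve requests. The sketch below mirrors the state values above in a Python `IntEnum`; `is_servable` and `is_terminal` are hypothetical helpers. Treating only AVAILABLE as servable is a conservative reading: AVAILABLE is reported after the servable becomes available, and UNLOADING is reported before it becomes unavailable.

```python
from enum import IntEnum


class State(IntEnum):
    """Numeric values copied from the state listing above."""
    UNKNOWN = 0
    START = 10
    LOADING = 20
    AVAILABLE = 30
    UNLOADING = 40
    END = 50


def is_servable(state: State) -> bool:
    """True only for the state in which the servable is known to be available."""
    return state == State.AVAILABLE


def is_terminal(state: State) -> bool:
    """True once the servable has finished its lifecycle in the manager."""
    return state == State.END


# Made-up status snapshot: version number -> state.
statuses = {1: State.END, 2: State.AVAILABLE, 3: State.LOADING}
servable_versions = sorted(v for v, s in statuses.items() if is_servable(s))
```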

message Request

Used as request type in: PredictionService.Classify, PredictionService.MultiInference, PredictionService.Predict, PredictionService.Regress

optional ModelSpec model_spec = 1
string platform = 2
repeated Instance instances = 3

message Response

Used as response type in: PredictionService.Classify, PredictionService.MultiInference, PredictionService.Predict, PredictionService.Regress

optional ModelSpec model_spec = 1
repeated Instance predictions = 2
string error = 3

message SequenceExample

optional Features context = 1
optional FeatureLists feature_lists = 2

message StatusProto

Status that corresponds to Status in third_party/tensorflow/core/lib/core/status.h.

Used in: ModelVersionStatus, ReloadConfigResponse

Code error_code = 1
Error code.
string error_message = 2
Error message. Will only be set if an error was encountered.

package angel.serving

service ModelService

rpc GetModelStatus (GetModelStatusRequest, GetModelStatusResponse)

message GetModelStatusRequest

optional ModelSpec model_spec = 1

message GetModelStatusResponse

repeated ModelVersionStatus model_version_status = 1

map<string, DataType> type_map = 2

int64 dim = 3

rpc HandleReloadConfigRequest (ReloadConfigRequest, ReloadConfigResponse)

message ReloadConfigRequest

optional ModelServerConfig config = 1

message ReloadConfigResponse

optional StatusProto status = 1

service PredictionService

rpc Classify (Request, Response)

rpc GetModelMetadata (GetModelMetadataRequest, GetModelMetadataResponse)

message GetModelMetadataRequest

optional ModelSpec model_spec = 1

repeated string metadata_field = 2

message GetModelMetadataResponse

optional ModelSpec model_spec = 1

map<string, google.protobuf.Any> metadata = 2

rpc MultiInference (Request, Response)

rpc Predict (Request, Response)

rpc Regress (Request, Response)

service SessionService

rpc SessionRun (SessionRunRequest, SessionRunResponse)

message SessionRunRequest

optional ModelSpec model_spec = 1

repeated NamedTensorProto feed = 2

repeated string fetch = 3

repeated string target = 4

optional RunOptions options = 5

message SessionRunResponse

repeated NamedTensorProto tensor = 1

optional RunMetadata metadata = 2

message BytesList

repeated bytes value = 1

enum Code

OK = 0

CANCELLED = 1

UNKNOWN = 2

INVALID_ARGUMENT = 3

DEADLINE_EXCEEDED = 4

NOT_FOUND = 5

ALREADY_EXISTS = 6

PERMISSION_DENIED = 7

UNAUTHENTICATED = 16

RESOURCE_EXHAUSTED = 8

FAILED_PRECONDITION = 9

ABORTED = 10

OUT_OF_RANGE = 11

UNIMPLEMENTED = 12

INTERNAL = 13

UNAVAILABLE = 14

DATA_LOSS = 15

DO_NOT_USE_RESERVED_FOR_FUTURE_EXPANSION_USE_DEFAULT_IN_SWITCH_INSTEAD_ = 20

message Example

optional Features features = 1

message ExampleList

repeated Example examples = 1

message ExampleListWithContext

repeated Example examples = 1

optional Example context = 2

message Feature

oneof kind

BytesList bytes_list = 1

FloatList float_list = 2

Int64List int64_list = 3

message FeatureList

repeated Feature feature = 1

message FeatureLists

map<string, FeatureList> feature_list = 1

message Features

map<string, Feature> feature = 1

message FloatList

repeated float value = 1

message Input

oneof kind