package tensorflow.data.experimental

message DispatcherConfig

Configuration for a tf.data service DispatchServer. Next id: 10

int64 port = 1
The port for the dispatcher to bind to. A value of 0 indicates that the dispatcher may bind to any available port.
string protocol = 2
The protocol for the dispatcher to use when connecting to workers.
string work_dir = 3
A work directory to use for storing dispatcher state, and for recovering during restarts. The empty string indicates not to use any work directory.
bool fault_tolerant_mode = 4
Whether to run in fault tolerant mode, where dispatcher state is saved across restarts. Requires that `work_dir` is nonempty.
repeated string worker_addresses = 7
(Optional.) If the job uses auto-sharding, it needs to specify a fixed list of worker addresses that will register with the dispatcher. The worker addresses should be in the format "host" or "host:port", where "port" is an integer, named port, or %port% to match any port.
DeploymentMode deployment_mode = 9
(Optional.) tf.data service deployment mode. Supported values are "REMOTE", "COLOCATED", and "HYBRID". If unspecified, it is assumed to be "REMOTE".
int64 job_gc_check_interval_ms = 5
How often the dispatcher should scan through to delete old and unused jobs. A value of 0 indicates that the decision should be left up to the runtime.
int64 job_gc_timeout_ms = 6
How long a job needs to be unused before it becomes a candidate for garbage collection. A value of -1 indicates that jobs should never be garbage collected. A value of 0 indicates that the decision should be left up to the runtime.
int64 client_timeout_ms = 8
How long to wait before garbage-collecting a client that hasn't heartbeated to the dispatcher. A value of 0 indicates that the timeout should be left to the runtime.
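
The constraints described for the dispatcher fields can be sketched as a small consistency check. This is a minimal illustration in plain Python, not the TensorFlow API: `DispatcherSettings`, `validate`, `describe_job_gc_timeout`, and the address pattern are hypothetical names introduced here, and the pattern assumes simple host names (no IPv6 literals).

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumed pattern for "host" or "host:port", where the port is an integer,
# a named port, or the literal placeholder %port%.
_WORKER_ADDRESS = re.compile(r"^[^:\s]+(:(\d+|[A-Za-z_][\w-]*|%port%))?$")


@dataclass
class DispatcherSettings:
    """Plain-Python stand-in for the DispatcherConfig fields (hypothetical)."""
    port: int = 0                      # 0: bind to any available port
    protocol: str = ""
    work_dir: str = ""                 # "": no work directory
    fault_tolerant_mode: bool = False
    worker_addresses: List[str] = field(default_factory=list)
    job_gc_check_interval_ms: int = 0  # 0: left to the runtime
    job_gc_timeout_ms: int = 0         # -1: never collect, 0: left to the runtime
    client_timeout_ms: int = 0         # 0: left to the runtime


def validate(cfg: DispatcherSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    # Fault tolerant mode persists dispatcher state, so it needs a work directory.
    if cfg.fault_tolerant_mode and not cfg.work_dir:
        problems.append("fault_tolerant_mode requires a nonempty work_dir")
    for addr in cfg.worker_addresses:
        if not _WORKER_ADDRESS.match(addr):
            problems.append(f"malformed worker address: {addr!r}")
    return problems


def describe_job_gc_timeout(timeout_ms: int) -> str:
    """Spell out the documented meaning of job_gc_timeout_ms."""
    if timeout_ms == -1:
        return "never garbage collect jobs"
    if timeout_ms == 0:
        return "runtime decides"
    return f"collect jobs unused for {timeout_ms} ms"
```

For example, `validate(DispatcherSettings(fault_tolerant_mode=True))` reports the missing `work_dir`, while addresses such as `"w0"`, `"w1:5000"`, and `"w2:%port%"` pass the format check.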

message SnapshotMetadataRecord

This stores the metadata information present in each snapshot record.

string graph_hash = 1
Stores the fingerprint of the graph that describes the dataset that is snapshotted.
string run_id = 2
Run ID that this snapshot corresponds to.
int64 creation_timestamp = 3
Time when we started creating this snapshot.
int64 version = 4
Version of the snapshot data file format.
repeated DataType dtype = 5
A list of tensor dtypes corresponding to each element of the snapshot.
int64 num_elements = 6
The number of elements in the snapshot.
bool finalized = 1000

message SnapshotRecord

Each SnapshotRecord represents one batch of pre-processed input data. A batch consists of a list of tensors that we encode as TensorProtos. This message doesn't store the structure of the batch.

repeated TensorProto tensor = 1
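
Because the record holds only a flat list of tensors, whoever reads it needs the batch structure from somewhere else. The following is a rough sketch of that idea in plain Python, with strings standing in for tensors. `flatten` and `unflatten` are hypothetical helpers written for this illustration and are not how tf.data itself serializes structure.

```python
from typing import Any, List, Tuple


def flatten(batch: Any) -> Tuple[List[Any], Any]:
    """Split a nested batch (dicts, lists, tuples) into flat leaves plus a template.

    The template has the same nesting as the batch, but each leaf is replaced
    by its index in the flat list.
    """
    leaves: List[Any] = []

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            # Sort keys so the flat order is deterministic.
            return {key: walk(node[key]) for key in sorted(node)}
        if isinstance(node, (list, tuple)):
            return type(node)(walk(item) for item in node)
        leaves.append(node)
        return len(leaves) - 1

    structure = walk(batch)
    return leaves, structure


def unflatten(leaves: List[Any], structure: Any) -> Any:
    """Rebuild the nested batch from the flat leaves and the template."""
    if isinstance(structure, dict):
        return {key: unflatten(leaves, value) for key, value in structure.items()}
    if isinstance(structure, (list, tuple)):
        return type(structure)(unflatten(leaves, item) for item in structure)
    return leaves[structure]
```

In this sketch the flat `leaves` list plays the role of the repeated `tensor` field, and the template is the piece of information that the record does not carry.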

message SnapshotTensorMetadata

Metadata for all the tensors in a Snapshot Record.

repeated TensorMetadata tensor_metadata = 1

message TensorMetadata

Metadata for a single tensor in the Snapshot Record.

Used in: SnapshotTensorMetadata

optional TensorShapeProto tensor_shape = 2
int64 tensor_size_bytes = 3
Number of uncompressed bytes used to store the tensor representation.

message WorkerConfig

Configuration for a tf.data service WorkerServer. Next id: 12

Used in: WorkerStateExport

int64 port = 1
The port for the worker to bind to. A value of 0 indicates that the worker may bind to any available port.
string protocol = 2
The protocol for the worker to use when connecting to the dispatcher.
string dispatcher_address = 3
The address of the dispatcher to register with.
string worker_address = 4
The address of the worker server. The substring "%port%", if specified, will be replaced with the worker's bound port. This is useful when the port is set to `0`.
repeated string worker_tags = 10
Tags attached to the worker. This allows reading from selected workers. For example, by applying a "COLOCATED" tag, tf.data service is able to read from the local tf.data worker if one exists, then from off-TF-host workers, to avoid cross-TF-host reads.
int64 heartbeat_interval_ms = 5
How often the worker should heartbeat to the dispatcher. A value of 0 indicates that the decision should be left up to the runtime.
int64 dispatcher_timeout_ms = 6
How long to retry requests to the dispatcher before giving up and reporting an error. A value of 0 indicates that the decision should be left up to the runtime.
string data_transfer_protocol = 7
The protocol for the worker to use when transferring data to clients.
string data_transfer_address = 8
The data transfer address of the worker server. The substring "%port%", if specified, will be replaced with the worker's bound port. This is useful when the port is set to `0`.
int64 cross_trainer_cache_size_bytes = 11
Maximum size of the cross-trainer cache in bytes. If enabled, make sure your training job provides sufficient memory resources.
int64 shutdown_quiet_period_ms = 9
When shutting down a worker, how long to wait for the gRPC server to process the final requests. This is used to achieve clean shutdown in unit tests.
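
The "%port%" substitution described for `worker_address` and `data_transfer_address` can be illustrated with a one-line helper. This is a sketch only: `resolve_address` is a hypothetical name, and the real worker performs the replacement internally once it knows its bound port.

```python
def resolve_address(template: str, bound_port: int) -> str:
    """Replace every "%port%" placeholder with the port the worker actually bound."""
    return template.replace("%port%", str(bound_port))


# With port set to 0 the bound port is only known at startup, so the address
# is written as a template and filled in afterwards.
advertised = resolve_address("localhost:%port%", 5050)
```

An address without the placeholder, such as `"localhost:4000"`, passes through unchanged.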

message DispatcherConfig

int64 port = 1

string protocol = 2

string work_dir = 3

bool fault_tolerant_mode = 4

repeated string worker_addresses = 7

DeploymentMode deployment_mode = 9

int64 job_gc_check_interval_ms = 5

int64 job_gc_timeout_ms = 6

int64 client_timeout_ms = 8

message SnapshotMetadataRecord

string graph_hash = 1

string run_id = 2

int64 creation_timestamp = 3

int64 version = 4

repeated DataType dtype = 5

int64 num_elements = 6

bool finalized = 1000

message SnapshotRecord

repeated TensorProto tensor = 1

message SnapshotTensorMetadata

repeated TensorMetadata tensor_metadata = 1

message TensorMetadata

optional TensorShapeProto tensor_shape = 2

int64 tensor_size_bytes = 3

message WorkerConfig

int64 port = 1

string protocol = 2

string dispatcher_address = 3

string worker_address = 4

repeated string worker_tags = 10

int64 heartbeat_interval_ms = 5

int64 dispatcher_timeout_ms = 6

string data_transfer_protocol = 7

string data_transfer_address = 8

int64 cross_trainer_cache_size_bytes = 11

int64 shutdown_quiet_period_ms = 9