Used in: ,
Total number of bytes requested
Total number of bytes allocated if known
Name of the allocator used
Identifier of the allocated buffer if known
Set if this tensor only has one remaining reference
Address of the allocation.
An allocation/de-allocation operation performed by the allocator.
Used in:
The timestamp of the operation.
Number of bytes allocated, or de-allocated if negative.
Used in:
These are per-node allocator memory stats.
The bytes that are not deallocated.
The allocation and deallocation timeline.
These are snapshots of the overall allocator memory stats. The number of live bytes currently allocated by the allocator.
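As a rough illustration of how the allocation timeline relates to the number of live bytes, the sketch below replays a list of records and keeps a running total. The `AllocationRecord` class and its field names (`alloc_micros`, `alloc_bytes`) are assumptions modeled on the descriptions above, not the generated protobuf classes.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class AllocationRecord:
    """Hypothetical stand-in for one allocation/de-allocation operation."""
    alloc_micros: int  # timestamp of the operation
    alloc_bytes: int   # bytes allocated, or de-allocated if negative


def live_and_peak_bytes(records: Iterable[AllocationRecord]) -> Tuple[int, int]:
    """Replay the timeline in timestamp order; return (live bytes, peak bytes)."""
    live = peak = 0
    for rec in sorted(records, key=lambda r: r.alloc_micros):
        live += rec.alloc_bytes
        peak = max(peak, live)
    return live, peak


timeline = [
    AllocationRecord(alloc_micros=10, alloc_bytes=1024),
    AllocationRecord(alloc_micros=20, alloc_bytes=512),
    AllocationRecord(alloc_micros=30, alloc_bytes=-1024),
]
print(live_and_peak_bytes(timeline))  # (512, 1536)
```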
An asset file def for a single file or a set of sharded files with the same name.
Used in:
The tensor to bind the asset filename to.
The filename within an assets directory. Note: does not include the path prefix, i.e. directories. For an asset at /tmp/path/vocab.txt, the filename would be "vocab.txt".
Protocol buffer representing the value for an attr used to configure an Op. Comment indicates the corresponding attr type. Only the field matching the attr type may be filled.
Used in: , , , , ,
"string"
"int"
"float"
"bool"
"type"
"shape"
"tensor"
any "list(...)"
"func" represents a function. func.name is a function's name or a primitive op's name. func.attr.first is the name of an attr defined for that function. func.attr.second is the value for that attr in the instantiation.
This is a placeholder only used in nodes defined inside a function. It indicates the attr value will be supplied when the function is instantiated. For example, let us suppose a node "N" in function "FN". "N" has an attr "A" with value placeholder = "foo". When FN is instantiated with attr "foo" set to "bar", the instantiated node N's attr A will have been given the value "bar".
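The placeholder substitution described above can be sketched in plain Python. The `Placeholder` class and `instantiate_attrs` helper are hypothetical names for illustration only; the real runtime works on protobuf messages rather than dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Placeholder:
    """Hypothetical marker: the attr value is supplied at instantiation time."""
    name: str


def instantiate_attrs(node_attrs: Dict[str, Any],
                      bindings: Dict[str, Any]) -> Dict[str, Any]:
    """Replace each placeholder attr with the value bound to its name."""
    resolved = {}
    for attr_name, value in node_attrs.items():
        if isinstance(value, Placeholder):
            resolved[attr_name] = bindings[value.name]
        else:
            resolved[attr_name] = value
    return resolved


# Node "N" in function "FN" has attr "A" with placeholder "foo".
node_n = {"A": Placeholder("foo")}
# FN is instantiated with attr "foo" set to "bar".
print(instantiate_attrs(node_n, {"foo": "bar"}))  # {'A': 'bar'}
```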
Used in:
"list(string)"
"list(int)"
"list(float)"
"list(bool)"
"list(type)"
"list(shape)"
"list(tensor)"
"list(attr)"

Used in:
A protobuf to represent tf.BoundedTensorSpec.
Used in:
Containers to hold repeated fundamental values.
Used in:
Defines a subgraph in another `GraphDef` as a set of feed points and nodes to be fetched or executed. Compare with the arguments to `Session::Run()`.
Tensors to be fed in the callable. Each feed is the name of a tensor.
Fetches. A list of tensor names. The caller of the callable expects a tensor to be returned for each fetch[i] (see RunStepResponse.tensor). The order of specified fetches does not change the execution order.
Target Nodes. A list of node names. The named nodes will be run by the callable but their outputs will not be returned.
Options that will be applied to each run.
Tensors to be connected in the callable. Each TensorConnection denotes a pair of tensors in the graph, between which an edge will be created in the callable.
The Tensor objects fed in the callable and fetched from the callable are expected to be backed by host (CPU) memory by default. The options below allow changing that: feeding tensors backed by device memory, or returning tensors that are backed by device memory. The maps below map the name of a feed/fetch tensor (which appears in the 'feed' or 'fetch' fields above) to the fully qualified name of the device owning the memory backing the contents of the tensor.
For example, creating a callable with the following options:
  CallableOptions {
    feed: "a:0"
    feed: "b:0"
    fetch: "x:0"
    fetch: "y:0"
    feed_devices: { "a:0": "/job:localhost/replica:0/task:0/device:GPU:0" }
    fetch_devices: { "y:0": "/job:localhost/replica:0/task:0/device:GPU:0" }
  }
means that the Callable expects:
  - The first argument ("a:0") is a Tensor backed by GPU memory.
  - The second argument ("b:0") is a Tensor backed by host memory.
and of its return values:
  - The first output ("x:0") will be backed by host memory.
  - The second output ("y:0") will be backed by GPU memory.
FEEDS: It is the responsibility of the caller to ensure that the memory of the fed tensors will be correctly initialized and synchronized before it is accessed by operations executed during the call to Session::RunCallable().
This is typically ensured by using the TensorFlow memory allocators (Device::GetAllocator()) to create the Tensor to be fed.
Alternatively, for CUDA-enabled GPU devices, this typically means that the operation that produced the contents of the tensor has completed, i.e., the CUDA stream has been synchronized (e.g., via cuCtxSynchronize() or cuStreamSynchronize()).
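The default-to-host rule for feeds and fetches can be sketched as a simple lookup. The helper name `backing_memory` and the `HOST_MEMORY` label are hypothetical; the sketch only models how the two device maps are read, not how a session actually places tensors.

```python
from typing import Dict, List

HOST_MEMORY = "host"  # hypothetical label for "backed by host (CPU) memory"


def backing_memory(tensor_names: List[str],
                   device_map: Dict[str, str]) -> Dict[str, str]:
    """Map each feed/fetch name to its device, defaulting to host memory."""
    return {name: device_map.get(name, HOST_MEMORY) for name in tensor_names}


GPU0 = "/job:localhost/replica:0/task:0/device:GPU:0"
feeds = backing_memory(["a:0", "b:0"], {"a:0": GPU0})
fetches = backing_memory(["x:0", "y:0"], {"y:0": GPU0})
print(feeds)    # "a:0" on GPU:0, "b:0" on host
print(fetches)  # "x:0" on host, "y:0" on GPU:0
```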
By default, RunCallable() will synchronize the GPU stream before returning fetched tensors on a GPU device, to ensure that the values in those tensors have been produced. This simplifies interacting with the tensors, but potentially incurs a performance hit. If this option is set to true, the caller is responsible for ensuring that the values in the fetched tensors have been produced before they are used. The caller can do this by invoking `Device::Sync()` on the underlying device(s), or by feeding the tensors back to the same Session using `feed_devices` with the same corresponding device name.
Used in:
Name of captured tensor
Name of concrete function which contains the computed graph tensor.
Defines a TensorFlow cluster as a set of jobs.
Used in:
The jobs that comprise the cluster.
CollectionDef should cover most collections. To add a user-defined collection, do one of the following:
1. For simple data types, such as string, int, float:
     tf.add_to_collection("your_collection_name", your_simple_value)
   strings will be stored as bytes_list.
2. For Protobuf types, there are three ways to add them:
   1) tf.add_to_collection("your_collection_name", your_proto.SerializeToString())
      collection_def {
        key: "user_defined_bytes_collection"
        value {
          bytes_list {
            value: "queue_name: \"test_queue\"\n"
          }
        }
      }
   or
   2) tf.add_to_collection("your_collection_name", str(your_proto))
      collection_def {
        key: "user_defined_string_collection"
        value {
          bytes_list {
            value: "\n\ntest_queue"
          }
        }
      }
   or
   3) any_buf = any_pb2.Any()
      tf.add_to_collection("your_collection_name", any_buf.Pack(your_proto))
      collection_def {
        key: "user_defined_any_collection"
        value {
          any_list {
            value {
              type_url: "type.googleapis.com/tensorflow.QueueRunnerDef"
              value: "\n\ntest_queue"
            }
          }
        }
      }
3. For Python objects, implement to_proto() and from_proto(), and register them in the following manner:
     ops.register_proto_function("your_collection_name", proto_type, to_proto=YourPythonObject.to_proto, from_proto=YourPythonObject.from_proto)
   These functions will be invoked to serialize and de-serialize the collection. For example:
     ops.register_proto_function(ops.GraphKeys.GLOBAL_VARIABLES, proto_type=variable_pb2.VariableDef, to_proto=Variable.to_proto, from_proto=Variable.from_proto)
Used in:
AnyList is used for collecting Any protos.
Used in:
BytesList is used for collecting strings and serialized protobufs. For example:
  collection_def {
    key: "trainable_variables"
    value {
      bytes_list {
        value: "\n\017conv1/weights:0\022\024conv1/weights/Assign\032\024conv1/weights/read:0"
        value: "\n\016conv1/biases:0\022\023conv1/biases/Assign\032\023conv1/biases/read:0"
      }
    }
  }
Used in:
FloatList is used for collecting float values.
Used in:
Int64List is used for collecting int, int64 and long values.
Used in:
NodeList is used for collecting nodes in graph. For example:
  collection_def {
    key: "summaries"
    value {
      node_list {
        value: "input_producer/ScalarSummary:0"
        value: "shuffle_batch/ScalarSummary:0"
        value: "ImageSummary:0"
      }
    }
  }
Used in:
Session configuration parameters. The system picks appropriate values for fields that are not set.
Map from device type name (e.g., "CPU" or "GPU" ) to maximum number of devices of that type to use. If a particular device type is not found in the map, the system picks an appropriate number.
The execution of an individual op (for some op types) can be parallelized on a pool of intra_op_parallelism_threads. 0 means the system picks an appropriate number.
If you create an ordinary session, e.g., from Python or C++, then there is exactly one intra op thread pool per process. The first session created determines the number of threads in this pool. All subsequent sessions reuse/share this one global pool.
There are notable exceptions to the default behavior described above:
  1. There is an environment variable for overriding this thread pool, named TF_OVERRIDE_GLOBAL_THREADPOOL.
  2. When connecting to a server, such as a remote `tf.train.Server` instance, this option will be ignored altogether.
Nodes that perform blocking operations are enqueued on a pool of inter_op_parallelism_threads available in each process. 0 means the system picks an appropriate number. Negative means all operations are performed in caller's thread. Note that the first Session created in the process sets the number of threads for all future sessions unless use_per_session_threads is true or session_inter_op_thread_pool is configured.
If true, use a new set of threads for this session rather than the global pool of threads. Only supported by direct sessions. If false, use the global threads created by the first session, or the per-session thread pools configured by session_inter_op_thread_pool. This option is deprecated. The same effect can be achieved by setting session_inter_op_thread_pool to have one element, whose num_threads equals inter_op_parallelism_threads.
This option is experimental: it may be replaced with a different mechanism in the future.
Configures session thread pools. If this is configured, then RunOptions for a Run call can select the thread pool to use.
The intended use is for when some session invocations need to run in a background pool limited to a small number of threads:
  - For example, a session may be configured to have one large pool (for regular compute) and one small pool (for periodic, low priority work); using the small pool is currently the mechanism for limiting the inter-op parallelism of the low priority work. Note that it does not limit the parallelism of work spawned by a single op kernel implementation.
  - Using this setting is normally not needed in training, but may help some serving use cases.
  - It is also generally recommended to set the global_name field of this proto, to avoid creating multiple large pools. It is typically better to run the non-low-priority work, even across sessions, in a single large pool.
Assignment of Nodes to Devices is recomputed every placement_period steps until the system warms up (at which point the recomputation typically slows down automatically).
When any filters are present sessions will ignore all devices which do not match the filters. Each filter can be partially specified, e.g. "/job:ps" "/job:worker/replica:3", etc.
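One way to picture partially specified filters is as a subset match on the components of a device name. The sketch below is a simplified model under that assumption; the helper names are hypothetical and the runtime's actual matching rules may differ in detail.

```python
from typing import Dict, List


def parse_device_spec(spec: str) -> Dict[str, str]:
    """Split "/job:ps/replica:0" into {"job": "ps", "replica": "0"}."""
    parts = {}
    for component in spec.strip("/").split("/"):
        key, _, value = component.partition(":")
        parts[key] = value
    return parts


def matches_filter(device_name: str, device_filter: str) -> bool:
    """A device matches if it agrees with every component the filter names."""
    device = parse_device_spec(device_name)
    wanted = parse_device_spec(device_filter)
    return all(device.get(key) == value for key, value in wanted.items())


def visible_devices(devices: List[str], filters: List[str]) -> List[str]:
    """With no filters, keep everything; otherwise keep devices matching any."""
    if not filters:
        return list(devices)
    return [d for d in devices if any(matches_filter(d, f) for f in filters)]


devices = [
    "/job:ps/replica:0/task:0/device:CPU:0",
    "/job:worker/replica:3/task:0/device:GPU:0",
    "/job:worker/replica:1/task:0/device:GPU:0",
]
print(visible_devices(devices, ["/job:ps", "/job:worker/replica:3"]))
```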
Options that apply to all GPUs.
Whether soft placement is allowed. If allow_soft_placement is true, an op will be placed on CPU if 1. there is no GPU implementation for the op, or 2. no GPU devices are known or registered, or 3. the op needs to be co-located with reftype input(s) which are from CPU.
Whether device placements should be logged.
Options that apply to all graphs.
Global timeout for all blocking operations in this session. If non-zero, and not overridden on a per-operation basis, this value will be used as the deadline for all blocking operations.
Options that apply when this session uses the distributed runtime.
Optional list of all workers to use in this session.
If true, any resources such as Variables used in the session will not be shared with other sessions. However, when clusterspec propagation is enabled, this field is ignored and sessions are always isolated.
When true, WorkerSessions are created with device attributes from the full cluster. This is helpful when a worker wants to partition a graph (for example during a PartitionedCallOp).
Everything inside Experimental is subject to change and is not subject to API stability guarantees in https://www.tensorflow.org/guide/version_compat.
Used in:
Task name for group resolution.
Which executor to use. The default executor will be used if this is an empty string or "DEFAULT".
Guidance to formatting of large RecvBuf fields for transfer. Any positive value sets the max chunk size. 0 defaults to 4096. Any negative value indicates no max, i.e. one chunk only.
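The three cases for the chunk-size setting can be sketched as follows. The function name `split_recv_buf` is hypothetical, and the sketch only models how a buffer length would be divided, not the actual transfer code.

```python
from typing import List

DEFAULT_MAX_CHUNK_BYTES = 4096  # used when the setting is 0


def split_recv_buf(num_bytes: int, max_chunk_bytes: int) -> List[int]:
    """Return the chunk sizes a buffer of num_bytes would be split into."""
    if num_bytes <= 0:
        return []
    if max_chunk_bytes < 0:
        return [num_bytes]  # negative: no max, i.e. one chunk only
    limit = max_chunk_bytes or DEFAULT_MAX_CHUNK_BYTES  # 0 falls back to 4096
    full, rest = divmod(num_bytes, limit)
    return [limit] * full + ([rest] if rest else [])


print(split_recv_buf(10000, 0))     # [4096, 4096, 1808]
print(split_recv_buf(10000, 6000))  # [6000, 4000]
print(split_recv_buf(10000, -1))    # [10000]
```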
If true, and supported by the platform, the runtime will attempt to use NUMA affinity where applicable. One consequence will be the existence of as many CPU devices as there are available NUMA nodes.
If true, make collective op execution order sequential and deterministic for potentially concurrent collective instances.
If true, use NCCL for CollectiveOps. This feature is highly experimental.
In the following, session state means the value of a variable, elements in a hash table, or any other resource, accessible by worker sessions held by a TF server.
When ClusterSpec propagation is enabled, the value of isolate_session_state is ignored when deciding whether to share session states in a TF server (for backwards compatibility reasons).
  - If share_session_state_in_clusterspec_propagation is true, the session states are shared.
  - If share_session_state_in_clusterspec_propagation is false, session states are isolated.
When ClusterSpec propagation is not used, the value of share_session_state_in_clusterspec_propagation is ignored when deciding whether to share session states in a TF server.
  - If isolate_session_state is true, session states are isolated.
  - If isolate_session_state is false, session states are shared.
TODO(b/129330037): Add a single API that consistently treats isolate_session_state and ClusterSpec propagation.
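The decision table above reduces to a two-branch function. This is a minimal sketch; the function name and boolean parameters are hypothetical stand-ins for the corresponding configuration fields.

```python
def session_state_is_shared(clusterspec_propagation: bool,
                            isolate_session_state: bool,
                            share_session_state_in_clusterspec_propagation: bool) -> bool:
    """Return True if session states are shared in the TF server."""
    if clusterspec_propagation:
        # isolate_session_state is ignored in this mode.
        return share_session_state_in_clusterspec_propagation
    # share_session_state_in_clusterspec_propagation is ignored in this mode.
    return not isolate_session_state


# With propagation enabled, isolate_session_state has no effect.
print(session_state_is_shared(True, True, True))    # True
# Without propagation, only isolate_session_state matters.
print(session_state_is_shared(False, True, True))   # False
```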
If using a direct session, disable spinning while waiting for work in the thread pool. This may result in higher latency for completing ops, but in the case where there is a lot of spinning may result in lower CPU usage.
This was promoted to a non-experimental API. Please use ConfigProto.share_cluster_devices_in_session instead.
Metadata about the session. If set, this can be used by the runtime and the Ops for debugging, monitoring, etc. NOTE: This is currently used and propagated only by the direct session.
If true, the session may treat the graph as being static for optimization purposes. If this option is set to true when a session is created, the full GraphDef must be passed in a single call to Session::Create(), and Session::Extend() may not be supported.
This field will eventually be deprecated and replaced by mlir_bridge_rollout (b/166038521). Whether to enable the MLIR-based TF->XLA bridge. This is a replacement to the existing bridge, and not ready for production usage yet. If this option is set to true when a session is created, MLIR is used to perform the set of graph transformations to put the graph in a form that can be executed with delegation of some computations to an accelerator. This builds on the model of XLA where a subset of the graph is encapsulated and attached to a "compile" operation, whose result is fed to an "execute" operation. The kernel for these operations is responsible to lower the encapsulated graph to a particular device.
This field is under development; for now, use enable_mlir_bridge (b/166038521). Whether to enable the MLIR-based TF->XLA bridge.
Whether to enable the MLIR-based Graph optimizations. This will become a part of the standard TensorFlow graph optimization pipeline. Currently it is only used for gradual migration and for testing new passes that are replacing existing optimizations in Grappler.
If true, the session will not store an additional copy of the graph for each subgraph. If this option is set to true when a session is created, the `RunOptions.output_partition_graphs` options must not be set.
Minimum number of batches run through the XLA graph before XLA fusion autotuner is enabled. Default value of zero disables the autotuner. The XLA fusion autotuner can improve performance by executing a heuristic search on the compiler parameters.
Whether runtime execution uses TFRT.
Whether functional control flow op lowering should be disabled. This is useful when executing within a portable runtime where control flow op kernels may not be loaded due to selective registration.
Provides a hint to XLA auto clustering to prefer forming a single large cluster that encompasses most of the graph.
Distributed coordination service configurations.
An enum that describes the state of the MLIR bridge rollout.
Used in:
If this field is left unspecified, the MLIR bridge may be selectively enabled on a per graph basis.
Enabling the MLIR bridge enables it for all graphs in this session.
Disabling the MLIR bridge disables it for all graphs in this session.
Enable the MLIR bridge on a per graph basis based on an analysis of the features used in the graph. If the features used by the graph are supported by the MLIR bridge, the MLIR bridge will be used to run the graph.
Enable the MLIR bridge in a fallback mode on a per graph basis based on an analysis of the features used in the graph. Running the MLIR bridge in the fallback mode means that it is executed and commits all its changes to the TF graph if it succeeds. If it fails, it commits nothing and lets the old bridge process the TF graph.
Coordination service configuration parameters. The system picks appropriate values for fields that are not set.
Used in:
Type of coordination service implementation to enable. For example, setting the service type as "standalone" starts a service instance on the leader task to provide the coordination services such as heartbeats and consistent key-value store.
Address where the coordination service instance is hosted.
Whether to enable the health check mechanism.
Maximum wait time for all members in the cluster to be registered.
Heartbeat timeout. If a worker does not record a heartbeat within this time window, it will be considered disconnected.
The list of jobs that participate in the coordination service. If empty, all jobs will be included in the coordination service by default.
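The heartbeat rule can be sketched as a comparison against the last recorded heartbeat. The function name, the millisecond units, and the strict `>` comparison at the boundary are all assumptions for illustration; the service's actual bookkeeping is not described here.

```python
from typing import Dict, List


def disconnected_tasks(last_heartbeat_ms: Dict[str, int],
                       now_ms: int,
                       heartbeat_timeout_ms: int) -> List[str]:
    """Return tasks whose last heartbeat is older than the timeout window."""
    return sorted(
        task for task, seen_ms in last_heartbeat_ms.items()
        if now_ms - seen_ms > heartbeat_timeout_ms
    )


heartbeats = {"/job:worker/task:0": 9_500, "/job:worker/task:1": 2_000}
print(disconnected_tasks(heartbeats, now_ms=10_000, heartbeat_timeout_ms=5_000))
# ['/job:worker/task:1']
```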
Used in:
Total cost of this graph, typically used for balancing decisions.
Used in:
Aggregated cost value.
Aggregated cost dimension (e.g. 'memory', 'compute', 'network').
Used in:
The name of the node. Names are globally unique.
The device of the node. Can be empty if the node is mapped to the default partition or partitioning hasn't been run yet.
The id of the node. Node ids are only unique inside a partition.
Temporary memory used by this node.
Persistent memory used by this node.
Estimate of the computational cost of this node, in microseconds.
Analytical estimate of the computational cost of this node, in microseconds.
Analytical estimate of the memory access cost of this node, in microseconds.
If true, the output is permanent: it can't be discarded, because this node is part of the "final output". Nodes may depend on final nodes.
Ids of the control inputs for this node.
Are the costs inaccurate?
Inputs of this node. They must be executed before this node can be executed. An input is a particular output of another node, specified by the node id and the output index.
Used in:
Outputs of this node.
Used in:
If >= 0, the output is an alias of an input. Note that an alias input may itself be an alias. The algorithm will therefore need to follow those pointers.
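Following alias pointers back to the output that actually owns the data can be sketched as a loop. The dictionary layout below (`inputs` as `(node id, output index)` pairs and `alias_input_port` as one entry per output) is a hypothetical simplification of the node, input, and output records described above.

```python
from typing import Dict, Tuple


def resolve_alias(nodes: Dict[int, dict], node_id: int,
                  output_index: int) -> Tuple[int, int]:
    """Follow alias pointers until reaching an output that is not an alias."""
    while True:
        node = nodes[node_id]
        alias_port = node["alias_input_port"][output_index]
        if alias_port < 0:
            return node_id, output_index
        # The aliased input is itself a particular output of another node.
        node_id, output_index = node["inputs"][alias_port]


nodes = {
    0: {"inputs": [], "alias_input_port": [-1]},       # owns its output
    1: {"inputs": [(0, 0)], "alias_input_port": [0]},  # aliases node 0
    2: {"inputs": [(1, 0)], "alias_input_port": [0]},  # aliases node 1
}
print(resolve_alias(nodes, 2, 0))  # (0, 0)
```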
Used in: , , , , , , , , , , ,
Not a legal value for DataType. Used to indicate a DataType field has not been set.
Data types that all computation devices are expected to be capable of supporting.
Single-precision complex
Quantized int8
Quantized uint8
Quantized int32
Float32 truncated to 16 bits. Only for cast ops.
Quantized int16
Quantized uint16
Double-precision complex
Arbitrary C++ data types
Do not use! These are only for parameters. Every enum above should have a corresponding value below (verified by types_test).
Options for initializing DebuggerState in TensorFlow Debugger (tfdbg).
Used in:
Debugging options
Caller-specified global step count. Note that this is distinct from the session run count and the executor step count.
Whether the total disk usage of tfdbg is to be reset to zero in this Session.run call. This is used by wrappers and hooks such as the local CLI ones to indicate that the dumped tensors are cleaned up from the disk after each Session.run.
Option for watching a node in TensorFlow Debugger (tfdbg).
Used in:
Name of the node to watch. Use "*" for wildcard. But note: currently, regex is not supported in general.
Output slot to watch. The semantics of output_slot == -1 is that all outputs of the node will be watched (i.e., a wildcard). Other negative values of output_slot are invalid and will lead to errors currently.
Name(s) of the debugging op(s). One or more probes on a tensor, e.g., {"DebugIdentity", "DebugNanCount"}.
URL(s) for debug target(s). Supported URL formats are:
  - file:///foo/tfdbg_dump: Writes out Event content to file /foo/tfdbg_dump. Assumes all directories can be created if they don't already exist.
  - grpc://localhost:11011: Sends an RPC request to an EventListener service running at localhost:11011 with the event.
  - memcbk:///event_key: Routes tensors to clients using the callback registered with the DebugCallbackRegistry for event_key.
Each debug op listed in debug_ops will publish its output tensor (debug signal) to all URLs in debug_urls.
N.B. Session::Run() supports concurrent invocations of the same inputs (feed keys), outputs and target nodes. If such concurrent invocations are to be debugged, the callers of Session::Run() must use distinct debug_urls to make sure that the streamed or dumped events do not overlap among the invocations.
TODO(cais): More visible documentation of this in g3docs.
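The three URL formats can be told apart with the standard library's URL parser. The `classify_debug_url` helper below is hypothetical and only shows which part of each URL carries the target; it is not how tfdbg itself dispatches events.

```python
from typing import Tuple
from urllib.parse import urlsplit


def classify_debug_url(url: str) -> Tuple[str, str]:
    """Return (scheme, target) for one of the supported debug URL formats."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        return "file", parts.path               # dump directory path
    if parts.scheme == "grpc":
        return "grpc", parts.netloc             # host:port of the listener
    if parts.scheme == "memcbk":
        return "memcbk", parts.path.lstrip("/")  # callback registry key
    raise ValueError(f"unsupported debug URL: {url}")


for url in ("file:///foo/tfdbg_dump",
            "grpc://localhost:11011",
            "memcbk:///event_key"):
    print(classify_debug_url(url))
```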
Do not error out if debug op creation fails (e.g., due to dtype incompatibility). Instead, just log the failure.
Used in:
The host name on which a source code file is located.
Path to the source code file.
The timestamp at which the source code file was last modified.
Byte size of the file.
Line-by-line content of the source code file.
A collection of source code files.
Used in:
Its key is thread id.
Represents a Python dict keyed by `str`. The comment on Unicode from Value.string_value applies analogously.
Used in:
Used in: ,
Containers for non-sequential data.
Used in: ,
Each feature can be exactly one kind.
Containers for sequential data. A FeatureList contains lists of Features. These may hold zero or more Feature values. FeatureLists are organized into categories by name. The FeatureLists message contains the mapping from name to FeatureList.
Used in:
Used in:
Map from feature name to feature list.
Used in: ,
Map from feature name to feature.
Used in:
Highly experimental and very likely to change. This encoding uses tags instead of dedicated messages for regularity. In particular the encoding imposes no restrictions on what the parameters of any type should be, which in particular needs to be true for type symbols.
Used in: ,
The principal type represented by this object. This may be a concrete type (Tensor, Dataset) a type variable (used for dependent types) a type symbol (Any, Union). See FullTypeId for details.
Literal values of this type object, if the type admits one. For example, a type variable admits a string attribute - its name. Shape-related types may admit int attributes - their static shape values. Fields for more data types to be added as needed.
TODO(mdan): list/tensor, map? Need to reconcile with TFT_RECORD, etc.
Experimental. Represents the complete type information of a TensorFlow value.
Used in:
The default represents an uninitialized value.
Type variables may serve as placeholder for any other type ID in type templates. Examples:
  TFT_DATASET[TFT_VAR["T"]] is a Dataset returning a type indicated by "T".
  TFT_TENSOR[TFT_VAR["T"]] is a Tensor of an element type indicated by "T".
  TFT_TENSOR[TFT_VAR["T"]], TFT_TENSOR[TFT_VAR["T"]] are two tensors of identical element types.
  TFT_TENSOR[TFT_VAR["P"]], TFT_TENSOR[TFT_VAR["Q"]] are two tensors of independent element types.
Wildcard type. Describes a parameter of unknown type. In TensorFlow, that can mean either a "Top" type (accepts any type), or a dynamically typed object whose type is unknown in context. Important: "unknown" does not necessarily mean undeterminable!
The algebraic product type. This is an algebraic type that may be used just for logical grouping. Not to be confused with TFT_TUPLE, which describes a concrete object of several elements. Example:
  TFT_DATASET[TFT_PRODUCT[TFT_TENSOR[TFT_INT32], TFT_TENSOR[TFT_FLOAT64]]] is a Dataset producing two tensors, an integer one and a float one.
Represents a named field, with the name stored in the attribute.
Parametrization: TFT_NAMED[<type>]{<name>}
  * <type> is the type of the field
  * <name> is the field name, as string (though it can theoretically be an int as well)
Example:
  TFT_RECORD[
    TFT_NAMED[TFT_TENSOR[TFT_INT32]]{'foo'},
    TFT_NAMED[TFT_TENSOR[TFT_FLOAT32]]{'bar'},
  ]
  is a structure with two fields, an int tensor "foo" and a float tensor "bar".
Template definition. Expands the variables by repeating a template as arguments of container.
Parametrization: TFT_FOR_EACH[<container_type>, <template>, <expansions>]
  * <container_type> is the type of the container that the template will be expanded into
  * <template> is any type definition that potentially contains type variables
  * <expansions> is a TFT_VAR and may include more types in the future
Example:
  TFT_FOR_EACH[
    TFT_PRODUCT,
    TFT_TENSOR[TFT_VAR["t"]],
    TFT_VAR["t"]
  ]
  will substitute a T = TFT_INT32 to TFT_PRODUCT[TFT_TENSOR[TFT_INT32]] and a T = (TFT_INT32, TFT_INT64) to TFT_PRODUCT[TFT_TENSOR[TFT_INT32], TFT_TENSOR[TFT_INT64]].
Callable types describe functions and ops.
Parametrization: TFT_CALLABLE[<arg type>, <return type>]
  * <arg type> is the type of the arguments; TFT_PRODUCT represents multiple arguments.
  * <return type> is the return type; TFT_PRODUCT represents multiple return values (that means that callables returning multiple things don't necessarily return a single tuple).
Example:
  TFT_CALLABLE[
    TFT_ANY,
    TFT_PRODUCT[TFT_TENSOR[TFT_INT32], TFT_TENSOR[TFT_FLOAT64]],
  ]
  is a callable with unspecified (for now) input arguments, and two return values of type tensor.
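The template expansion can be sketched with nested tuples standing in for type expressions. This representation (a tuple of a type name followed by its arguments) and the helper names are hypothetical, chosen only to make the substitution step concrete.

```python
def substitute(type_expr, var, replacement):
    """Replace every occurrence of `var` inside a nested-tuple type expression."""
    if type_expr == var:
        return replacement
    if isinstance(type_expr, tuple):
        head, *args = type_expr
        return (head, *(substitute(arg, var, replacement) for arg in args))
    return type_expr


def expand_for_each(container, template, var, bindings):
    """Repeat `template` once per bound type, as arguments of `container`."""
    return (container, *(substitute(template, var, b) for b in bindings))


VAR_T = ("TFT_VAR", "t")
TEMPLATE = ("TFT_TENSOR", VAR_T)

print(expand_for_each("TFT_PRODUCT", TEMPLATE, VAR_T, ["TFT_INT32"]))
# ('TFT_PRODUCT', ('TFT_TENSOR', 'TFT_INT32'))
print(expand_for_each("TFT_PRODUCT", TEMPLATE, VAR_T, ["TFT_INT32", "TFT_INT64"]))
# ('TFT_PRODUCT', ('TFT_TENSOR', 'TFT_INT32'), ('TFT_TENSOR', 'TFT_INT64'))
```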
The usual Tensor. This is a parametric type.
Parametrization: TFT_TENSOR[<element type>, <shape type>]
  * <element type> is currently limited to one of the element types defined below.
  * <shape type> is not yet defined, and may only be TFT_UNKNOWN for now. A TFT_SHAPE type will be defined in the future.
Example:
  TFT_TENSOR[TFT_INT32, TFT_UNKNOWN] is a Tensor of int32 element type and unknown shape.
TODO(mdan): Define TFT_SHAPE and add more examples.
Array (or tensorflow::TensorList in the variant type registry). Note: this is not to be confused with the deprecated `TensorArray*` ops which are not supported by FullType. This type represents a random-access list whose elements can be described by a single type. Although immutable, Array is expected to support efficient mutation semantics (i.e. element update) in the user-facing API. The element type may be generic or even TFT_ANY for a heterogeneous list.
Parametrization: TFT_ARRAY[<element type>]
  * <element type> may be any concrete type.
Examples:
  TFT_ARRAY[TFT_TENSOR[TFT_INT32]] is a TensorArray holding int32 Tensors of any shape.
  TFT_ARRAY[TFT_TENSOR[TFT_UNKNOWN]] is a TensorArray holding Tensors of mixed element types.
  TFT_ARRAY[TFT_UNKNOWN] is a TensorArray holding any element type.
  TFT_ARRAY[] is equivalent to TFT_ARRAY[TFT_UNKNOWN].
  TFT_ARRAY[TFT_ARRAY[]] is an array of arrays (of unknown types).
Optional (or tensorflow::OptionalVariant in the variant type registry). This type represents a value that may either hold an element of a single specified type, or nothing at all.
Parametrization: TFT_OPTIONAL[<element type>]
  * <element type> may be any concrete type.
Examples:
  TFT_OPTIONAL[TFT_TENSOR[TFT_INT32]] is an Optional holding an int32 Tensor of any shape.
Literal types describe compile-time constant values. Literal types may also participate in dependent types.
Parametrization: TFT_LITERAL[<value type>]{<value>}
  * <value type> may be any concrete type that can hold <value>
  * <value> is the type's attribute, and holds the actual literal value
Examples:
  TFT_LITERAL[TFT_INT32]{1} is the compile-time constant 1.
The bool element type. TODO(mdan): Quantized types, legacy representations (e.g. ref)
Integer element types.
Floating-point element types.
Complex element types. TODO(mdan): Represent as TFT_COMPLEX[TFT_DOUBLE] instead?
The string element type.
Datasets created by tf.data ops and APIs. Datasets have generator/iterable semantics, that is, one can construct an iterator from them. Like Array, they are considered to return elements that can be described by a single type. Unlike Array, they do not support random access or mutation, and can potentially produce an infinite number of elements. A dataset can produce logical structures (e.g. multiple elements). This is expressed using TFT_PRODUCT.
Parametrization: TFT_DATASET[<element type>]
  * <element type> may be a concrete type or a type symbol. It represents the data type of the elements produced by the dataset.
Examples:
  TFT_DATASET[TFT_TENSOR[TFT_INT32]] is a Dataset producing single int32 Tensors of unknown shape.
  TFT_DATASET[TFT_PRODUCT[TFT_TENSOR[TFT_INT32], TFT_TENSOR[TFT_FLOAT32]]] is a Dataset producing pairs of Tensors, one integer and one float.
Note: The high ID number is to prepare for the eventuality that Datasets will be supported by user types in the future.
A ragged tensor created by tf.ragged ops and APIs. Parametrization: TFT_RAGGED[<element_type>].
A mutex lock tensor, produced by tf.raw_ops.MutexLock. Unlike strict execution models, where ownership of a lock is denoted by "running after the lock has been acquired", in non-strict mode, lock ownership is in the true sense: "the op argument representing the lock is available". Mutex locks are the dynamic counterpart of control dependencies. TODO(mdan): Properly document this thing. Parametrization: TFT_MUTEX_LOCK[].
The equivalent of a Tensor with DT_VARIANT dtype, kept here to simplify translation. This type should not normally appear after type inference. Note that LEGACY_VARIANT != ANY: TENSOR[INT32] is a subtype of ANY, but is not a subtype of LEGACY_VARIANT.
A function can be instantiated when the runtime can bind every attr with a value. When a GraphDef has a call to a function, it must have binding for every attr defined in the signature. TODO(zhifengc): * device spec, etc.
Used in:
The definition of the function's name, arguments, return values, attrs etc.
Attributes specific to this function definition.
Unique IDs for each resource argument, used to track aliasing resources. If Argument A and Argument B alias each other, then resource_arg_unique_ids[A.index] == resource_arg_unique_ids[B.index]. If this field is empty, none of the arguments could alias; otherwise, every resource argument should have an entry in this field. When instantiated, the unique IDs will be attached to the _Arg nodes' "_resource_arg_unique_id" attribute.
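The aliasing rule for resource arguments can be sketched as a dictionary comparison. The `may_alias` helper and the use of a plain dict keyed by argument index are assumptions for illustration.

```python
from typing import Dict


def may_alias(resource_arg_unique_ids: Dict[int, int],
              index_a: int, index_b: int) -> bool:
    """Two resource arguments alias each other iff their unique IDs are equal."""
    if not resource_arg_unique_ids:
        return False  # empty field: none of the arguments could alias
    return resource_arg_unique_ids[index_a] == resource_arg_unique_ids[index_b]


# Arguments 0 and 2 share unique ID 7, so they alias; argument 1 does not.
unique_ids = {0: 7, 1: 8, 2: 7}
print(may_alias(unique_ids, 0, 2))  # True
print(may_alias(unique_ids, 0, 1))  # False
print(may_alias({}, 0, 1))          # False
```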
By convention, "op" in node_def is resolved by consulting with a user-defined library first. If not resolved, "func" is assumed to be a builtin op.
A mapping from the output arg names from `signature` to the outputs from `node_def` that should be returned by the function.
A mapping from control output names from `signature` to node names in `node_def` which should be control outputs of this function.
Attributes for function arguments. These attributes are the same set of valid attributes as to _Arg nodes.
Used in:
A library is a set of named functions.
Used in:
Represents `FunctionSpec` used in `Function`. This represents a function that has been wrapped as a TensorFlow `Function`.
Used in: ,
Full arg spec from inspect.getfullargspec().
Whether this represents a class method.
The input signature, if specified.
Whether the function should be compiled by XLA. The public interface to `tf.function` uses an optional boolean to represent three distinct states for this field. Unfortunately, proto3 removes the ability to explicitly check for the presence or absence of a field, so we instead map to an enum. See `tf.function` for details.
Used in:
Used in:
Fraction of the available GPU memory to allocate for each process. 1 means to allocate all of the GPU memory, 0.5 means the process allocates up to ~50% of the available GPU memory. GPU memory is pre-allocated unless the allow_growth option is enabled. If greater than 1.0, uses CUDA unified memory to potentially oversubscribe the amount of memory available on the GPU device by using host memory as a swap space. Accessing memory not available on the device will be significantly slower as that would require memory transfer between the host and the device. Options to reduce the memory requirement should be considered before enabling this option as this may come with a negative performance impact. Oversubscription using the unified memory requires Pascal class or newer GPUs and it is currently only supported on the Linux operating system. See https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements for the detailed requirements.
If true, the allocator does not pre-allocate the entire specified GPU memory region, instead starting small and growing as needed.
The type of GPU allocation strategy to use. Allowed values: "": The empty string (default) uses a system-chosen default which may change over time. "BFC": A "Best-fit with coalescing" algorithm, simplified from a version of dlmalloc.
Delay deletion of up to this many bytes to reduce the number of interactions with gpu driver code. If 0, the system chooses a reasonable default (several MBs).
A comma-separated list of GPU ids that determines the 'visible' to 'virtual' mapping of GPU devices. For example, if TensorFlow can see 8 GPU devices in the process, and one wanted to map visible GPU devices 5 and 3 as "/device:GPU:0", and "/device:GPU:1", then one would specify this field as "5,3". This field is similar in spirit to the CUDA_VISIBLE_DEVICES environment variable, except it applies to the visible GPU devices in the process. NOTE: 1. The GPU driver provides the process with the visible GPUs in an order which is not guaranteed to have any correlation to the *physical* GPU id in the machine. This field is used for remapping "visible" to "virtual", which means this operates only after the process starts. Users are required to use vendor specific mechanisms (e.g., CUDA_VISIBLE_DEVICES) to control the physical to visible device mapping prior to invoking TensorFlow. 2. In the code, the ids in this list are also called "platform GPU id"s, and the 'virtual' ids of GPU devices (i.e. the ids in the device name "/device:GPU:<id>") are also called "TF GPU id"s. Please refer to third_party/tensorflow/core/common_runtime/gpu/gpu_id.h for more information.
In the event polling loop sleep this many microseconds between PollEvents calls, when the queue is not empty. If value is not set or set to 0, gets set to a non-zero default.
This field is deprecated and ignored.
Force all tensors to be gpu_compatible. On a GPU-enabled TensorFlow, enabling this option forces all CPU tensors to be allocated with Cuda pinned memory. Normally, TensorFlow will infer which tensors should be allocated as pinned memory. But in cases where the inference is incomplete, this option can significantly speed up the cross-device memory copy performance as long as it fits the memory. Note that this option should not be enabled by default for unknown or very large models: all Cuda pinned memory is unpageable, so having too much pinned memory might negatively impact the overall host system performance.
Everything inside experimental is subject to change and is not subject to API stability guarantees in https://www.tensorflow.org/guide/version_compat.
Used in:
The multi virtual device settings. If empty (not set), it will create single virtual device on each visible GPU, according to the settings in "visible_device_list" above. Otherwise, the number of elements in the list must be the same as the number of visible GPUs (after "visible_device_list" filtering if it is set), and the string represented device names (e.g. /device:GPU:<id>) will refer to the virtual devices and have the <id> field assigned sequentially starting from 0, according to the order they appear in this list and the "memory_limit" list inside each element. For example,
  visible_device_list = "1,0"
  virtual_devices { memory_limit: 1GB memory_limit: 2GB }
  virtual_devices {}
will create three virtual devices as:
  /device:GPU:0 -> visible GPU 1 with 1GB memory
  /device:GPU:1 -> visible GPU 1 with 2GB memory
  /device:GPU:2 -> visible GPU 0 with all available memory
NOTE:
1. It's invalid to set both this and "per_process_gpu_memory_fraction" at the same time.
2. Currently this setting is per-process, not per-session. Using different settings in different sessions within same process will result in undefined behavior.
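The numbering rule in the example above can be reproduced with a small stand-alone sketch. It does not call TensorFlow. The function name `enumerate_virtual_devices` is hypothetical, each `virtual_devices` element is modeled as a list of memory limits in MB, and `None` stands for "all available memory".

```python
def enumerate_virtual_devices(visible_device_list, virtual_devices):
    """Return (device name, visible GPU id, memory limit in MB) triples."""
    visible = [int(part) for part in visible_device_list.split(",")]
    if len(virtual_devices) != len(visible):
        raise ValueError("need one virtual_devices element per visible GPU")
    result = []
    for gpu_id, limits in zip(visible, virtual_devices):
        # An empty element means one virtual device with all available memory.
        for limit in (limits or [None]):
            # Virtual ids are assigned sequentially, starting from 0.
            result.append((f"/device:GPU:{len(result)}", gpu_id, limit))
    return result


for entry in enumerate_virtual_devices("1,0", [[1024, 2048], []]):
    print(entry)
```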
If true, uses CUDA unified memory for memory allocations. If per_process_gpu_memory_fraction option is greater than 1.0, then unified memory is used regardless of the value for this field. See comments for per_process_gpu_memory_fraction field for more details and requirements of the unified memory. This option is useful to oversubscribe memory if multiple processes are sharing a single GPU while individually using less than 1.0 per process memory fraction.
If > 1, the number of device-to-device copy streams to create for each GPUDevice. Default value is 0, which is automatically converted to 1.
If non-empty, defines a good GPU ring order on a single worker based on device interconnect. This assumes that all workers have the same GPU topology. Specify as a comma-separated string, e.g. "3,2,1,0,7,6,5,4". This ring order is used by the RingReducer implementation of CollectiveReduce, and serves as an override to automatic ring order generation in OrderTaskDeviceMap() during CollectiveParam resolution.
If true then extra work is done by GPUDevice and GPUBFCAllocator to keep track of when GPU memory is freed and when kernels actually complete so that we can know when a nominally free memory chunk is really not subject to pending use.
Parameters for GPUKernelTracker. By default no kernel tracking is done. Note that timestamped_allocator is only effective if some tracking is specified. If kernel_tracker_max_interval = n > 0, then a tracking event is inserted after every n kernels without an event.
If kernel_tracker_max_bytes = n > 0, then a tracking event is inserted after every series of kernels allocating a sum of memory >= n. If one kernel allocates b * n bytes, then one event will be inserted after it, but it will count as b against the pending limit.
If kernel_tracker_max_pending > 0 then no more than this many tracking events can be outstanding at a time. An attempt to launch an additional kernel will stall until an event completes.
BFC Allocator can return an allocated chunk of memory up to 2x the requested size. For virtual devices with tight memory constraints, and proportionately large allocation requests, this can lead to a significant reduction in available memory. The threshold below controls when a chunk should be split if the chunk size exceeds requested memory size. It is expressed as a fraction of total available memory for the tf device. For example, setting it to 0.05 would imply a chunk needs to be split if its size exceeds the requested memory by 5% of the total virtual device/gpu memory size.
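As a rough arithmetic illustration of the threshold, the sketch below assumes a strict "exceeds" comparison. The function and parameter names are hypothetical, and the allocator's actual logic may differ in detail.

```python
def should_split_chunk(chunk_bytes, requested_bytes, total_device_bytes, fraction):
    """Return True if the slack in the chunk exceeds the configured fraction."""
    slack = chunk_bytes - requested_bytes
    # Assumption: "exceeds" is read as a strict comparison.
    return slack > fraction * total_device_bytes


# With a 1000-byte device and fraction 0.05, the threshold is 50 bytes of slack.
print(should_split_chunk(200, 120, 1000, 0.05))  # slack of 80 bytes
print(should_split_chunk(200, 160, 1000, 0.05))  # slack of 40 bytes
```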
When true, use CUDA cudaMallocAsync API instead of TF gpu allocator.
Configuration for breaking down a visible GPU into multiple "virtual" devices.
Used in:
Per "virtual" device memory limit, in MB. The number of elements in the list is the number of virtual devices to create on the corresponding visible GPU (see "virtual_devices" below). If empty, it will create single virtual device taking all available memory from the device. For the concept of "visible" and "virtual" GPU, see the comments for "visible_device_list" above for more information.
Priority values to use with the virtual devices. Use the cuda function cudaDeviceGetStreamPriorityRange to query for valid range of values for priority. On a P4000 GPU with cuda 10.1, the priority range reported was 0 for least priority and -1 for greatest priority. If this field is not specified, then the virtual devices will be created with the default. If this field has values set, then the size of this must match with the above memory_limit_mb.
GradientDef defines the gradient function of a function defined in a function library. A gradient function g (specified by gradient_func) for a function f (specified by function_name) must satisfy the following: The function 'f' must be a numerical function which takes N inputs and produces M outputs. Its gradient function 'g' is a function which takes N + M inputs and produces N outputs. I.e. if we have (y1, y2, ..., y_M) = f(x1, x2, ..., x_N), then, g is (dL/dx1, dL/dx2, ..., dL/dx_N) = g(x1, x2, ..., x_N, dL/dy1, dL/dy2, ..., dL/dy_M), where L is a scalar-value function of (x1, x2, ..., xN) (e.g., the loss function). dL/dx_i is the partial derivative of L with respect to x_i.
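A small worked instance of this contract, with N = 2 and M = 2, is sketched below in plain Python. The functions `f` and `g` are made up for illustration and are not taken from any library.

```python
def f(x1, x2):
    """A numerical function with N = 2 inputs and M = 2 outputs."""
    y1 = x1 * x2
    y2 = x1 + x2
    return y1, y2


def g(x1, x2, dl_dy1, dl_dy2):
    """Gradient function of f: N + M = 4 inputs, N = 2 outputs.

    By the chain rule, dL/dx_i = sum over j of dL/dy_j * dy_j/dx_i.
    """
    dl_dx1 = dl_dy1 * x2 + dl_dy2  # dy1/dx1 = x2, dy2/dx1 = 1
    dl_dx2 = dl_dy1 * x1 + dl_dy2  # dy1/dx2 = x1, dy2/dx2 = 1
    return dl_dx1, dl_dx2


# Take L = y1 + 2 * y2, so dL/dy1 = 1 and dL/dy2 = 2.
print(g(3.0, 4.0, 1.0, 2.0))
```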
Used in:
The function name.
The gradient function's name.
Represents the graph of operations
Used in: , ,
Compatibility versions of the graph. See core/public/version.h for version history. The GraphDef version is distinct from the TensorFlow version, and each release of TensorFlow will support a range of GraphDef versions.
Deprecated single version field; use versions above instead. Since all GraphDef changes before "versions" was introduced were forward compatible, this field is entirely ignored.
"library" provides user-defined functions. Naming: * library.function.name are in a flat namespace. NOTE: We may need to change it to be hierarchical to support different orgs. E.g., { "/google/nn", { ... }}, { "/google/vision", { ... }} { "/org_foo/module_bar", { ... }} map<string, FunctionDefLib> named_lib; * If node[i].op is the name of one function in "library", node[i] is deemed as a function call. Otherwise, node[i].op must be a primitive operation supported by the runtime. Function call semantics: * The callee may start execution as soon as some of its inputs are ready. The caller may want to use Tuple() mechanism to ensure all inputs are ready in the same time. * The consumer of return values may start executing as soon as the return values the consumer depends on are ready. The consumer may want to use Tuple() mechanism to ensure the consumer does not start until all return values of the callee function are ready.
Used in:
If true, use control flow to schedule the activation of Recv nodes. (Currently ignored.)
Options controlling how graph is optimized.
The number of steps to run before returning a cost model detailing the memory usage and performance of each node of the graph. 0 means no cost model.
The number of steps to skip before collecting statistics for the cost model.
Annotate each Node with Op output shape data, to the extent it can be statically inferred.
Only place the subgraphs that are run, rather than the entire graph. This is useful for interactive graph building, where one might produce graphs that cannot be placed during the debugging process. In particular, it allows the client to continue work in a session after adding a node to a graph whose placement constraints are unsatisfiable.
If true, transfer float values between processes as bfloat16.
If > 0, record a timeline every this many steps. EXPERIMENTAL: This currently has no effect in MasterSession.
Options that control the type and amount of graph rewriting. Not currently configurable via the public Python API (i.e. there is no API stability guarantee if you import RewriterConfig explicitly).
Used in:
Defines a single job in a TensorFlow cluster.
Used in:
The name of this job.
Mapping from task ID to "hostname:port" string. If the `name` field contains "worker", and the `tasks` map contains a mapping from 7 to "example.org:2222", then the device prefix "/job:worker/task:7" will be assigned to "example.org:2222".
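The mapping from a job definition to device prefixes can be illustrated as follows. This is a sketch only: the job is modeled as a name plus a dict of tasks, and `device_prefixes` is a hypothetical helper.

```python
def device_prefixes(job_name, tasks):
    """Map each device prefix to the "hostname:port" string of its task."""
    return {
        f"/job:{job_name}/task:{task_id}": address
        for task_id, address in sorted(tasks.items())
    }


print(device_prefixes("worker", {7: "example.org:2222"}))
```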
Represents a Python list.
Used in:
For memory tracking.
Used in:
NOTE: This protocol buffer is evolving, and will go through revisions in the coming months. Protocol buffer containing the following, which are necessary to restart training or run inference: MetaInfoDef, GraphDef, SaverDef, CollectionDef, TensorInfo, and SignatureDef. It can be used to serialize/de-serialize memory objects necessary for running computation in a graph when crossing the process boundary. It can be used for long term storage of graphs, cross-language execution of graphs, etc.
GraphDef.
SaverDef.
collection_def: Map from collection name to collections. See CollectionDef section for details.
signature_def: Map from user supplied key for a signature to a single SignatureDef.
Asset file def to be used with the defined graph.
Extra information about the structure of functions and stateful objects.
Meta information regarding the graph to be exported. To be used by users of this protocol buffer to encode information regarding their meta graph.
Used in:
User specified Version string. Can be the name of the model and revision, steps this model has been trained to, etc.
A copy of the OpDefs used by the producer of this graph_def. Descriptions and Ops not used in graph_def are stripped out.
A serialized protobuf. Can be the time this meta graph is created, or modified, or name of the model.
User supplied tag(s) on the meta_graph and included graph_def. MetaGraphDefs should be tagged with their capabilities or use-cases. Examples: "train", "serve", "gpu", "tpu", etc. These tags enable loaders to access the MetaGraph(s) appropriate for a specific use-case or runtime environment.
The __version__ string of the tensorflow build used to write this graph. This will be populated by the framework, which will overwrite any user supplied value.
The __git_version__ string of the tensorflow build used to write this graph. This will be populated by the framework, which will overwrite any user supplied value.
A flag to denote whether default-valued attrs have been stripped from the nodes in this graph_def.
FunctionDef name to aliases mapping.
A list of attr names and their values. The whole list is attached with a string name. E.g., MatMul[T=float].
Used in: ,
A pair of tensor name and tensor values.
Used in: ,
Name of the tensor.
The client can populate a TensorProto using a `tensorflow::Tensor`, or directly using the protobuf field accessors. The client specifies whether the returned tensor values should be filled into the tensor fields (float_val, int_val, etc.) or encoded in a compact form in tensor.tensor_content.
Represents Python's namedtuple.
Used in:
Used in: ,
The name given to this operator. Used for naming inputs, logging, visualization, etc. Unique within a single GraphDef. Must match the regexp "[A-Za-z0-9.][A-Za-z0-9_>./]*".
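The naming rule can be checked with Python's `re` module, using the regexp exactly as quoted above. The helper name `is_valid_node_name` is hypothetical.

```python
import re

# The pattern quoted in the field description, applied to the whole name.
NODE_NAME_RE = re.compile(r"[A-Za-z0-9.][A-Za-z0-9_>./]*")


def is_valid_node_name(name):
    """Return True if the entire name matches the documented regexp."""
    return NODE_NAME_RE.fullmatch(name) is not None


for candidate in ["conv1/weights", "_hidden", ""]:
    print(repr(candidate), is_valid_node_name(candidate))
```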
The operation name. There may be custom parameters in attrs. Op names starting with an underscore are reserved for internal use.
Each input is "node:src_output" with "node" being a string name and "src_output" indicating which output tensor to use from "node". If "src_output" is 0 the ":0" suffix can be omitted. Regular inputs may optionally be followed by control inputs that have the format "^node".
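The input string format can be decoded with a few lines of Python. The sketch below is illustrative only: `parse_input` and the keys of the returned dict are hypothetical names.

```python
def parse_input(spec):
    """Split an input string into node name, output index, and control flag."""
    if spec.startswith("^"):
        # Control inputs name a node only and carry no tensor.
        return {"node": spec[1:], "src_output": None, "control": True}
    node, sep, index = spec.rpartition(":")
    if not sep:
        # No ":" suffix: output 0 is implied.
        return {"node": spec, "src_output": 0, "control": False}
    return {"node": node, "src_output": int(index), "control": False}


for spec in ["add:1", "add", "^init"]:
    print(spec, parse_input(spec))
```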
A (possibly partial) specification for the device on which this node should be placed. The expected syntax for this string is as follows:
  DEVICE_SPEC ::= PARTIAL_SPEC
  PARTIAL_SPEC ::= ("/" CONSTRAINT) *
  CONSTRAINT ::= ("job:" JOB_NAME) | ("replica:" [1-9][0-9]*) | ("task:" [1-9][0-9]*) | ("device:" [A-Za-z]* ":" ([1-9][0-9]* | "*") )
Valid values for this string include:
* "/job:worker/replica:0/task:1/device:GPU:3" (full specification)
* "/job:worker/device:GPU:3" (partial specification)
* "" (no specification)
If the constraints do not resolve to a single device (or if this field is empty or not present), the runtime will attempt to choose a device automatically.
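A device string of this shape can be broken into its constraints with a short, deliberately lenient parser. It only recognizes the four constraint keys and does not enforce the numeric patterns of the grammar. The name `parse_device_spec` is hypothetical.

```python
def parse_device_spec(spec):
    """Return the constraints of a (possibly partial) device spec as a dict."""
    constraints = {}
    for part in filter(None, spec.split("/")):
        # Split on the first ":" only, so "device:GPU:3" keeps "GPU:3" as value.
        key, _, value = part.partition(":")
        if key not in ("job", "replica", "task", "device"):
            raise ValueError(f"unknown constraint: {part!r}")
        constraints[key] = value
    return constraints


print(parse_device_spec("/job:worker/replica:0/task:1/device:GPU:3"))
print(parse_device_spec("/job:worker/device:GPU:3"))
print(parse_device_spec(""))
```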
Operation-specific graph-construction-time configuration. Note that this should include all attrs defined in the corresponding OpDef, including those with a value matching the default -- this allows the default to change and makes NodeDefs easier to interpret on their own. However, if an attr with a default is not specified in this list, the default will be used. The "names" (keys) must match the regexp "[a-z][a-z0-9_]+" (and one of the names from the corresponding OpDef's attr field). The values must have a type matching the corresponding OpDef attr's type field. TODO(josh11b): Add some examples here showing best practices.
This stores debug information associated with the node.
The complete type of this node. Experimental and subject to change. Currently, the field only contains the return types of the node. That will extend in the future to contain the entire signature of the node, as a function type.
Used in:
Opaque string inserted into error messages created by the runtime. This is intended to store the list of names of the nodes from the original graph from which this node was derived. For example if this node, say C, was the result of a fusion of 2 nodes A and B, then 'original_node' would be {A, B}. This information can be used to map errors originating at the current node to some top level source code.
This is intended to store the list of names of the functions from the original graph from which this node was derived. For example if this node, say C, was the result of a fusion of node A in function FA and node B in function FB, then `original_funcs` would be {FA, FB}. If the node is in the top level graph, the `original_func` is empty. This information, with the `original_node_names`, can be used to map errors originating at the current node to some top level source code.
Time/size stats recorded for a single execution of a graph node.
Used in:
TODO(tucker): Use some more compact form of node identity than the full string name. Either all processes should agree on a global id (cost_id?) for each node, or we should use a hash of the name.
Output sizes recorded for a single execution of a graph node.
Used in:
Represents None.
Used in:
(message has no fields)
Defines an operation. A NodeDef in a GraphDef specifies an Op by using the "op" field which should match the name of an OpDef.
Used in: ,
Op names starting with an underscore are reserved for internal use. Names should be CamelCase and match the regexp "[A-Z][a-zA-Z0-9>_]*".
Description of the input(s).
Description of the output(s).
Named control outputs for this operation. Useful only for composite operations (i.e. functions) which want to name different control outputs.
Optional deprecation based on GraphDef versions.
One-line human-readable description of what the Op does.
Additional, longer human-readable description of what the Op does.
True if the operation is commutative ("op(a,b) == op(b,a)" for all inputs)
If is_aggregate is true, then this operation accepts N >= 2 inputs and produces 1 output all of the same type. Should be associative and commutative, and produce output with the same shape as the input. The optimizer may replace an aggregate op taking input from multiple devices with a tree of aggregate ops that aggregate locally within each device (and possibly within groups of nearby devices) before communicating. TODO(josh11b): Implement that optimization.
for things like add
Ops are marked as stateful if their behavior depends on some state beyond their input tensors (e.g. variable reading op) or if they have a side-effect (e.g. printing or asserting ops). Equivalently, stateless ops must always produce the same output for the same input and have no side-effects. By default Ops may be moved between devices. Stateful ops should either not be moved, or should only be moved if that state can also be moved (e.g. via some sort of save / restore). Stateful ops are guaranteed to never be optimized away by Common Subexpression Elimination (CSE).
for things like variables, queue
By default, all inputs to an Op must be initialized Tensors. Ops that may initialize tensors for the first time should set this field to true, to allow the Op to take an uninitialized Tensor as input.
for Assign, etc.
Indicates whether the op implementation uses distributed communication. If True, the op is allowed to return errors for network disconnection and trigger TF network failure handling logic.
For describing inputs and outputs.
Used in:
Name for the input/output. Should match the regexp "[a-z][a-z0-9_]*".
Human readable description.
Describes the type of one or more tensors that are accepted/produced by this input/output arg. The only legal combinations are: * For a single tensor: either the "type" field is set or the "type_attr" field is set to the name of an attr with type "type". * For a sequence of tensors with the same type: the "number_attr" field will be set to the name of an attr with type "int", and either the "type" or "type_attr" field will be set as for single tensors. * For a sequence of tensors, the "type_list_attr" field will be set to the name of an attr with type "list(type)".
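The legal field combinations can be expressed as a small classifier. This is a sketch under assumptions: unset fields are modeled as `None`, "either ... or" is read as "exactly one", and the function name and return strings are hypothetical.

```python
def classify_arg(type_=None, type_attr=None, number_attr=None, type_list_attr=None):
    """Classify an input/output arg by which type-related fields are set."""
    if type_list_attr is not None:
        # A list(type) attr excludes all the other type-related fields.
        if type_ is not None or type_attr is not None or number_attr is not None:
            raise ValueError("type_list_attr excludes type, type_attr, number_attr")
        return "sequence typed by a list(type) attr"
    if (type_ is None) == (type_attr is None):
        # Assumption: exactly one of type / type_attr must be set.
        raise ValueError("exactly one of type or type_attr must be set")
    if number_attr is not None:
        return "sequence of tensors with the same type"
    return "single tensor"


print(classify_arg(type_="DT_FLOAT"))
print(classify_arg(type_attr="T", number_attr="N"))
print(classify_arg(type_list_attr="Tout"))
```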
if specified, attr must have type "type"
if specified, attr must have type "int"
If specified, attr must have type "list(type)", and none of type, type_attr, and number_attr may be specified.
The handle data for resource inputs.
For inputs: if true, the inputs are required to be refs. By default, inputs can be either refs or non-refs. For outputs: if true, outputs are refs, otherwise they are not.
Experimental. Full type declaration for this argument. The full type specification combines type, type_attr, type_list_attr, etc. into a unified representation. This declaration may contain non-concrete types (for example, Tensor<TypeVar<'T'>> is a valid type declaration). Note: this is a transient field. The long-term aim is to represent the entire OpDef as a single type: a callable. In that context, this field is just the type of a single argument.
Description of the graph-construction-time configuration of this Op. That is to say, this describes the attr fields that will be specified in the NodeDef.
Used in:
A descriptive name for the argument. May be used, e.g. by the Python client, as a keyword argument name, and so should match the regexp "[a-z][a-z0-9_]+".
One of the type names from attr_value.proto ("string", "list(string)", "int", etc.).
A reasonable default for this attribute if the user does not supply a value. If not specified, the user must supply a value.
Human-readable description.
For type == "int", this is a minimum value. For "list(___)" types, this is the minimum length.
The set of allowed values. Has type that is the "list" version of the "type" field above (uses the "list" field of AttrValue). If type == "type" or "list(type)" above, then the "type" field of "allowed_values.list" has the set of allowed DataTypes. If type == "string" or "list(string)", then the "s" field of "allowed_values.list" has the set of allowed strings.
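The minimum and allowed-values constraints can be illustrated with a stand-alone checker. It is a simplified model: attr values are plain Python values, `None` means the constraint is not set, and `check_attr` is a hypothetical name.

```python
def check_attr(value, attr_type, minimum=None, allowed_values=None):
    """Return a list of constraint violations (empty if the value is fine)."""
    problems = []
    is_list = attr_type.startswith("list(")
    if minimum is not None:
        if attr_type == "int" and value < minimum:
            # For "int", the minimum bounds the value itself.
            problems.append(f"value {value} is below minimum {minimum}")
        elif is_list and len(value) < minimum:
            # For "list(...)" types, the minimum bounds the list length.
            problems.append(f"list length {len(value)} is below minimum {minimum}")
    if allowed_values is not None:
        for item in (value if is_list else [value]):
            if item not in allowed_values:
                problems.append(f"{item!r} is not an allowed value")
    return problems


print(check_attr(0, "int", minimum=1))
print(check_attr("NHWC", "string", allowed_values=["NHWC", "NCHW"]))
```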
Information about version-dependent deprecation of an op
Used in:
First GraphDef version at which the op is disallowed.
Explanation of why it was deprecated and what to use instead.
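Read literally, the deprecation version is the first GraphDef version that rejects the op, which suggests the comparison sketched below. The helper name is hypothetical and the runtime's actual check is not shown in this reference.

```python
def op_is_disallowed(graph_def_version, deprecation_version):
    """Return True once the graph version reaches the deprecation version."""
    return graph_def_version >= deprecation_version


print(op_is_disallowed(20, 21))  # graph is older than the deprecation point
print(op_is_disallowed(21, 21))  # first version at which the op is disallowed
```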
A collection of OpDefs
Used in:
Options passed to the graph optimizer
Used in:
If true, optimize the graph using common subexpression elimination. Note: the optimization Level L1 will override this setting to true. So in order to disable common subexpression elimination the opt_level has to be set to L0.
If true, perform constant folding optimization on the graph. Note: the optimization Level L1 will override this setting to true. So in order to disable constant folding the opt_level has to be set to L0.
Constant folding optimization replaces tensors whose values can be predetermined, with constant nodes. To avoid inserting too large constants, the size of each constant created can be limited. If this value is zero, a default limit of 10 MiB will be applied. If constant folding optimization is disabled, this value is ignored.
If true, perform function inlining on the graph.
Overall optimization level. The actual optimizations applied will be the logical OR of the flags that this level implies and any flags already set.
CPU code will be autoclustered only if global_jit_level >= ON_1 and either: - this flag is true, or - TF_XLA_FLAGS contains --tf_xla_cpu_global_jit=true.
Control the use of the compiler/jit. Experimental.
Used in:
Default setting ("off" now, but later expected to be "on")
The following settings turn on compilation, with higher values being more aggressive. Higher values may reduce opportunities for parallelism and may use more memory. (At present, there is no distinction, but this is expected to change.)
Optimization level
Used in:
L1 is the default level. Optimization performed at L1 : 1. Common subexpression elimination 2. Constant folding
No optimizations
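The interaction between the optimization level and the individual flags can be modeled as a set union. This is an illustration only: the optimization names are descriptive labels, not proto field names, and `effective_optimizations` is hypothetical.

```python
# Optimizations implied by each level, as described for L1 and L0.
LEVEL_IMPLIES = {
    "L1": {"common_subexpression_elimination", "constant_folding"},
    "L0": set(),
}


def effective_optimizations(opt_level, explicitly_enabled):
    """Logical OR of the level's implied flags and the flags already set."""
    return LEVEL_IMPLIES[opt_level] | set(explicitly_enabled)


# At L1, CSE and constant folding are applied even if not set explicitly.
print(sorted(effective_optimizations("L1", {"function_inlining"})))
# At L0, only explicitly set flags remain.
print(sorted(effective_optimizations("L0", {"function_inlining"})))
```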
Represents a (key, value) pair.
Used in:
Used in:
If true, always use RPC to contact the session target. If false (the default option), TensorFlow may use an optimized transport for client-master communication that avoids the RPC stack. This option is primarily used for testing the RPC stack.
The compression algorithm to be used. One of "deflate", "gzip".
If compression_algorithm is set, the compression level to be used. From 0 (no compression), up to 3.
Setting cache_rpc_response to true will enable sender side caching of response for RecvTensorAsync and RecvBufAsync to allow receiver to retry requests. This is only necessary when the network fabric is experiencing a significant error rate. Without it we'll fail a step on a network error, while with it we'll be able to complete long steps (like complex initializations) in the face of some network errors during RecvTensor.
Disables TCP connection sharing when opening a new RPC channel.
Setting num_channels_per_target > 0 allows the use of multiple channels to communicate to the same target. This can be used to improve the aggregate throughput on high speed links (e.g., 100G) where a single connection is not sufficient to maximize link utilization. Note that a single RPC only goes on a single channel, so this only helps in situations where there are multiple transfers to the same target overlapping in time.
RegisteredGradient stores a gradient function that is registered in the gradients library and used in the ops of a function in the function library. Unlike GradientDef, these gradients are identified by op type, and not directly linked to any function.
Used in:
The gradient function's name.
The gradient function's registered op type.
Used in:
The name of the registered saver/restore function.
Unique auto-generated name of the object.
Protocol buffer representing a handle to a tensorflow resource. Handles are not valid across executions, but can be serialized back and forth from within a single run.
Used in:
Unique name for the device containing the resource.
Container in which this resource is placed.
Unique name of this resource.
Hash code for the type of the resource. Is only valid in the same device and in the same execution.
For debug-only, the name of the type pointed to by this handle, if available.
Data types and shapes for the underlying resource.
Protocol buffer representing a pair of (data type, tensor shape).
Used in: ,
Graph rewriting is experimental and subject to change, not covered by any API stability guarantees.
Used in:
CPU conversion settings between NHWC and NCHW.
Optimize tensor layouts (default is ON) e.g. This will try to use NCHW layout on GPU which is faster.
Fold constants (default is ON) Statically infer the value of tensors when possible, and materialize the result using constants.
Shape optimizations (default is ON) Simplify computations made on shapes.
Remapping (default is ON) Remap subgraphs onto more efficient implementations.
Common subgraph elimination (default is ON) e.g. Simplify arithmetic ops; merge ops with same value (like constants).
Arithmetic optimizations (default is ON) e.g. Simplify arithmetic ops; merge ops with same value (like constants).
Control dependency optimizations (default is ON). Remove redundant control dependencies, which may enable other optimization.
Loop optimizations (default is ON).
Function optimizations (default is ON).
Strips debug-related nodes from the graph (off by default).
If true, don't remove unnecessary ops from the graph
Try to allocate some independent Op outputs contiguously in order to merge or eliminate downstream Ops (off by default).
Force small ops onto the CPU (default is OFF).
Enable the swap of kernel implementations based on the device placement (default is ON).
Optimize data types for CUDA (default is OFF). This will try to use float16 on GPU which is faster. Note that this can change the numerical stability of the graph and may require the use of loss scaling to maintain model convergence.
Optimize data types for MKL (default is OFF). This will try to use bfloat16 on CPUs, which is faster. Note that this can change the numerical stability of the graph.
Emulate a model using data type float16 on CPU (default is OFF). This will try to emulate the float16 inputs and outputs of an operator on CPU to have better correlation with float16 on GPU; however the computation in the operator is based on float32. Note that this can change the numerical stability of the graph.
Disable the entire meta optimizer (off by default).
Optimizers registered by plugin (default is ON)
Controls how many times we run the optimizers in meta optimizer (default is once).
The minimum number of nodes in a graph to optimize. For smaller graphs, optimization is skipped. 0 means the system picks an appropriate number. < 0 means do not skip optimization.
Disable optimizations that assume compressed tensors. Note that this flag is experimental and may be removed in the future.
Disable folding quantization emulation ops such as FakeQuantWithMinMax* and QuantizeAndDequantize*. Some compilers (e.g. the TF-to-tflite converter) have to extract quantization configs (e.g. min/max range, number of bits, and per-channel) from the quantization emulation ops. Note that this flag is experimental and may be removed in the future. See b/174138564 for more details.
Configures memory optimization passes through the meta-optimizer. Has no effect on manually requested memory optimization passes in the optimizers field.
A node name scope for node names which are valid outputs of recomputations. Inputs to nodes that match this scope may be recomputed (subject either to manual annotation of those input nodes or to manual annotation and heuristics depending on memory_optimization), but the nodes themselves will not be recomputed. This matches any sub-scopes as well, meaning the scope can appear not just as a top-level scope. For example, if the value is "gradients/", the default, it will match node name "gradients/foo", "foo/gradients/bar", but not "foo_gradients/"
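One matching rule that is consistent with the three examples above is sketched here: a name matches if the scope appears at the start of the name or directly after a "/". This is an assumption inferred from the examples, not the optimizer's actual code, and `matches_scope` is a hypothetical name.

```python
def matches_scope(node_name, scope="gradients/"):
    """Return True if `scope` occurs as a top-level scope or as a sub-scope."""
    # Top-level scope: the name begins with the scope string.
    # Sub-scope: the scope string follows a "/" somewhere in the name.
    return node_name.startswith(scope) or ("/" + scope) in node_name


for name in ["gradients/foo", "foo/gradients/bar", "foo_gradients/"]:
    print(name, matches_scope(name))
```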
Maximum number of milliseconds to spend optimizing a single graph before timing out. If less than or equal to 0 (default value) the optimizer will never time out.
Configures AutoParallel optimization passes either through the meta-optimizer or when manually specified through the optimizers field.
If true, any optimization pass failing will cause the MetaOptimizer to stop with an error. By default, or when set to false, failing passes are skipped silently.
If non-empty, will use this as an alternative way to specify a list of optimizations to turn on and the order of the optimizations (replacing the meta-optimizer). Of the RewriterConfig options, only the AutoParallel configuration options (the auto_parallel field) apply to manually requested optimization passes ("autoparallel"). Memory optimization passes ("memory") invoked here are not configurable (in contrast to memory optimization passes through the meta-optimizer) and act only on manual op annotations. Custom optimizers (see custom_optimizers) that are not part of this schedule will be run afterwards, in the order that they were specified.
list of CustomGraphOptimizers to apply.
VerifierConfig specifying the verifiers to be run after every optimizer.
VerifierConfig specifying the verifiers to be run at the end, after all optimizers have run.
Enum for layout conversion between NCHW and NHWC on CPU. Default is OFF.
Used in:
Message to describe custom graph optimizer and its parameters
Used in:
Used in:
The default setting (SCHEDULING and SWAPPING HEURISTICS only)
Disabled in the meta-optimizer.
Driven by manual op-level annotations.
Swapping heuristic will move a tensor from the GPU to the CPU and move it back when needed to reduce peak memory usage.
Recomputation heuristics will recompute ops (such as Relu activation) during backprop instead of storing them, reducing peak memory usage.
Scheduling will split big ops such as AddN and try to enforce a schedule of the new computations that decreases peak memory usage.
Use any combination of swapping and recomputation heuristics.
Enum controlling the number of times to run optimizers. The default is to run them twice.
Used in:
Used in:
Enable some aggressive optimizations that use assumptions that TF graphs may break. For example, assume the shape of a placeholder matches its actual feed.
Metadata output (i.e., non-Tensor) for a single Run() call.
Used in:
Statistics traced for this step. Populated if tracing is turned on via the "RunOptions" proto. EXPERIMENTAL: The format and set of events may change in future versions.
The cost graph for the computation defined by the run call.
Graphs of the partitions executed by executors.
This is only populated for graphs that are run as functions in TensorFlow V2. There will be an entry below for each function that is traced. The main use cases of the post_optimization_graph and the partition_graphs is to give the caller insight into the graphs that were actually run by the runtime. Additional information (such as those in step_stats) will match these graphs. We also include the pre_optimization_graph since it is usually easier to read, and is helpful in situations where the caller wants to get a high level idea of what the built graph looks like (since the various graph optimization passes might change the structure of the graph significantly).
Used in:
TODO(nareshmodi): Include some sort of function/cache-key identifier?
Options for a single Run() call.
Used in: ,
Time to wait for operation to complete in milliseconds.
The thread pool to use, if session_inter_op_thread_pool is configured. To use the caller thread, set this to -1. This executes Session::Run() on the caller thread and thus avoids a context switch. Using the caller thread to execute Session::Run() should be done ONLY for simple graphs, where the overhead of an additional context switch is comparable with the overhead of Session::Run().
Whether the partition graph(s) executed by the executor(s) should be outputted via RunMetadata.
EXPERIMENTAL. Options used to initialize DebuggerState, if enabled.
When enabled, causes tensor allocation information to be included in the error message when the Run() call fails because the allocator ran out of memory (OOM). Enabling this option can slow down the Run() call.
Everything inside Experimental is subject to change and is not subject to API stability guarantees in https://www.tensorflow.org/guide/version_compat.
Used in:
If non-zero, declares that this graph is going to use collective ops and must synchronize step_ids with any other graph with this same group_key value (in a distributed computation where tasks run disjoint graphs).
If true, then operations (using the inter-op pool) across all session::run() calls will be centrally scheduled, optimizing for (median and tail) latency. Consider using this option for CPU-bound workloads like inference.
Options for run handler thread pool.
Used in:
Priority of the request. The run handler thread pool will schedule ops based on the priority number. The larger number means higher priority.
TODO(pbar) Turn this into a TraceOptions proto which allows tracing to be controlled in a more orthogonal manner?
Used in:
Used in:
Name of the full variable of which this is a slice.
Shape of the full variable.
Offset of this variable into the full variable.
Shape of this variable.
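The relationship between the full shape, the offset, and the slice shape can be checked with a short sketch. The function name `slice_fits` and the plain-list representation are assumptions made for illustration; the consistency rule itself (each offset plus slice extent must stay within the full extent) is the natural reading of the fields above, not a quoted validation routine.

```python
from typing import Sequence


def slice_fits(full_shape: Sequence[int],
               var_offset: Sequence[int],
               var_shape: Sequence[int]) -> bool:
    """Check that a slice lies entirely inside the full variable.

    Hypothetical helper: all three sequences must have the same rank, and in
    every dimension offset + slice size must not exceed the full size.
    """
    if not (len(full_shape) == len(var_offset) == len(var_shape)):
        return False
    return all(
        off >= 0 and size >= 0 and off + size <= full
        for full, off, size in zip(full_shape, var_offset, var_shape)
    )


# A [10, 4] variable split into two row-wise slices of shape [5, 4].
print(slice_fits([10, 4], [0, 0], [5, 4]))  # True
print(slice_fits([10, 4], [5, 0], [5, 4]))  # True
print(slice_fits([10, 4], [8, 0], [5, 4]))  # False: rows 8..12 overflow
```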
Used in:
Node ids of concrete functions for saving and loading from a checkpoint. These functions save and restore directly from tensors.
A SavedAsset points to an asset in the MetaGraph. When bound to a function this object evaluates to a tensor with the absolute filename. Users should not depend on a particular part of the filename to remain stable (e.g. basename could be changed).
Used in:
Index into `MetaGraphDef.asset_file_def[]` that describes the Asset. Only the field `AssetFileDef.filename` is used. Other fields, such as `AssetFileDef.tensor_info`, MUST be ignored.
Used in:
Identifies a SavedConcreteFunction.
A sequence of unique strings, one per Tensor argument.
The prefix of `argument_keywords` which may be identified by position.
The spec of the function that this ConcreteFunction is traced from. This allows the ConcreteFunction to be called with nested-structure inputs. This field may not be populated. If this field is absent, the concrete function can only be called with flat inputs. TODO(b/169361281): support calling saved ConcreteFunction with structured inputs in C++ SavedModel API.
Stores low-level information about a concrete function. Referenced in either a SavedFunction or a SavedBareConcreteFunction.
Used in:
Input in canonicalized form that was received to create this concrete function.
Output that was the return value of this function after replacing all Tensors with TensorSpecs. This can be an arbitrarily nested structure and will be used to reconstruct the full structure from pure tensors.
Used in:
An Operation name for a ConstantOp in this SavedObjectGraph's MetaGraph.
A function with multiple signatures, possibly with non-Tensor arguments.
Used in:
Used in:
Objects which this object depends on: named edges in the dependency graph. Note: currently only valid if kind == "user_object" or "resource".
Ordered list of dependencies that must be loaded before this object. SavedModel loads with the bottom-up approach, by first creating all objects (in the order defined by the dependencies), then connecting the edges.
Slot variables owned by this object. This describes the three-way (optimizer, variable, slot variable) relationship; none of the three depend on the others directly. Note: currently only valid if kind == "user_object".
Stores the functions used to save and restore this object. At most one of `saveable_objects` or `registered_saver` is defined for each SavedObject. See the comment below for the difference between SaveableObject and registered savers.
The name of the registered class of the form "{package}.{class_name}". This field is used to search for the registered class at loading time.
The user-generated proto storing metadata for this object, to be passed to the registered class's _deserialize_from_proto method when this object is loaded from the SavedModel.
String name of the registered saver. At most one of `saveable_objects` or `registered_saver` is defined for each SavedObject.
Used in:
Flattened list of objects in the object graph. The position of the object in this list indicates its id. Nodes[0] is considered the root node.
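Because an object's id is simply its position in the flattened list, walking the graph from the root only needs list indexing. The sketch below models each node as a plain dict with a `children` list of `(edge_name, node_id)` pairs; that representation and the function name `reachable_paths` are assumptions for illustration, not the generated protobuf API.

```python
from collections import deque


def reachable_paths(nodes):
    """Map each node id reachable from nodes[0] to a dotted path from the root.

    `nodes[i]["children"]` is a list of (edge_name, node_id) pairs, where
    node_id is an index into `nodes` (hypothetical representation).
    """
    paths = {0: "root"}
    queue = deque([0])
    while queue:
        node_id = queue.popleft()
        for edge_name, child_id in nodes[node_id]["children"]:
            if child_id not in paths:  # visit each object once
                paths[child_id] = paths[node_id] + "." + edge_name
                queue.append(child_id)
    return paths


nodes = [
    {"children": [("model", 1), ("optimizer", 2)]},  # id 0: root
    {"children": [("kernel", 3)]},                   # id 1
    {"children": []},                                # id 2
    {"children": []},                                # id 3
]
print(reachable_paths(nodes))
# {0: 'root', 1: 'root.model', 2: 'root.optimizer', 3: 'root.model.kernel'}
```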
Information about captures and output structures in concrete functions. Referenced from SavedBareConcreteFunction and SavedFunction.
A SavedResource represents a TF object that holds state during its lifetime. An object of this type can have a reference to a create_resource() function and an initialize() function.
Used in:
A device specification indicating a required placement for the resource creation function, e.g. "CPU". An empty string allows the user to select a device.
A SavedUserObject is an object (in the object-oriented language of the TensorFlow program) of some user- or framework-defined class other than those handled specifically by the other kinds of SavedObjects. This object cannot be evaluated as a tensor, and therefore cannot be bound to an input of a function.
Used in:
Corresponds to a registration of the type to use in the loading program.
Version information from the producer of this SavedUserObject.
Metadata for deserializing this object. Deprecated! At the time of deprecation, Keras was the only user of this field, and its saving and loading code will be updated shortly. Please save your application-specific metadata to a separate file.
Represents a Variable that is initialized by loading the contents from the checkpoint.
Used in:
List of component variables for a distributed variable. When this field is non-empty, the SavedVariable will be assumed to be a distributed variable defined by the components listed here. This is only supported by experimental loaders at the moment.
Protocol buffer representing the configuration of a Saver.
Used in:
The name of the tensor in which to specify the filename when saving or restoring a model checkpoint.
The operation to run when saving a model checkpoint.
The operation to run when restoring a model checkpoint.
Maximum number of checkpoints to keep. If 0, no checkpoints are deleted.
Shard the save files, one per device that has Variable nodes.
How often to keep an additional checkpoint. If not specified, only the last "max_to_keep" checkpoints are kept; if specified, in addition to keeping the last "max_to_keep" checkpoints, an additional checkpoint will be kept for every n hours of training.
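The interaction between the two retention settings can be sketched as follows. This is a simplified reading of the descriptions above, not the Saver's actual bookkeeping: checkpoint times are given in hours since the start of training, the function name `checkpoints_to_keep` is hypothetical, and the rule for choosing the "additional" checkpoints (the first evicted checkpoint at or after each multiple of n hours) is an assumption.

```python
def checkpoints_to_keep(times_h, max_to_keep, keep_every_n_hours=None):
    """Return the checkpoint times that survive pruning.

    times_h: save times in hours, sorted ascending (assumed input format).
    max_to_keep: 0 means nothing is ever deleted.
    keep_every_n_hours: if set, also keep one older checkpoint per n hours.
    """
    if max_to_keep == 0:
        return list(times_h)
    recent = list(times_h[-max_to_keep:])
    older = times_h[:-max_to_keep]
    kept = []
    if keep_every_n_hours is not None:
        next_keep = keep_every_n_hours
        for t in older:
            if t >= next_keep:  # first evicted checkpoint past the mark
                kept.append(t)
                next_keep += keep_every_n_hours
    return kept + recent


saves = [1, 2, 3, 4, 5, 6, 7, 8]
print(checkpoints_to_keep(saves, max_to_keep=2))                        # [7, 8]
print(checkpoints_to_keep(saves, max_to_keep=2, keep_every_n_hours=3))  # [3, 6, 7, 8]
print(checkpoints_to_keep(saves, max_to_keep=0))                        # all eight
```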
A version number that identifies a different on-disk checkpoint format. Usually, each subclass of BaseSaverBuilder works with a particular version/format. However, it is possible that the same builder may be upgraded to support a newer checkpoint format in the future.
Used in:
Internal legacy format.
Deprecated format: tf.Saver() which works with tensorflow::table::Table.
Current format: more efficient.
Used in:
If present, only perform optimization for these ops.
Metadata about the session. This can be used by the runtime and the Ops for debugging, monitoring, etc. The (name, version) tuple is expected to be a unique identifier for sessions within the same process. NOTE: This is currently used and propagated only by the direct session.
Used in:
The version is optional. If set, needs to be >= 0.
SignatureDef defines the signature of a computation supported by a TensorFlow graph. For example, a model with two loss computations, sharing a single input, might have the following signature_def map, in a MetaGraphDef message. Note that across the two SignatureDefs "loss_A" and "loss_B", the input key, output key, and method_name are identical, and will be used by system(s) that implement or rely upon this particular loss method. The output tensor names differ, demonstrating how different outputs can exist for the same method.
    signature_def {
      key: "loss_A"
      value {
        inputs {
          key: "input"
          value {
            name: "input:0"
            dtype: DT_STRING
            tensor_shape: ...
          }
        }
        outputs {
          key: "loss_output"
          value {
            name: "loss_output_A:0"
            dtype: DT_FLOAT
            tensor_shape: ...
          }
        }
        method_name: "some/package/compute_loss"
      }
      ...
    }
    signature_def {
      key: "loss_B"
      value {
        inputs {
          key: "input"
          value {
            name: "input:0"
            dtype: DT_STRING
            tensor_shape: ...
          }
        }
        outputs {
          key: "loss_output"
          value {
            name: "loss_output_B:0"
            dtype: DT_FLOAT
            tensor_shape: ...
          }
        }
        method_name: "some/package/compute_loss"
      }
      ...
    }
Used in: ,
Named input parameters.
Named output parameters.
Extensible method_name information enabling third-party users to mark a SignatureDef as supporting a particular method. This enables producers and consumers of SignatureDefs, e.g. a model definition library and a serving library to have a clear hand-off regarding the semantics of a computation. Note that multiple SignatureDefs in a single MetaGraphDef may have the same method_name. This is commonly used to support multi-headed computation, where a single graph computation may return multiple results.
Used in:
`StructuredValue` represents a dynamically typed value representing various data structures that are inspired by Python data structures typically used in TensorFlow functions as inputs and outputs.
For example when saving a Layer there may be a `training` argument. If the user passes a boolean True/False, that switches between two concrete TensorFlow functions. In order to switch between them in the same way after loading the SavedModel, we need to represent "True" and "False".
A more advanced example might be a function which takes a list of dictionaries mapping from strings to Tensors. In order to map from user-specified arguments `[{"a": tf.constant(1.)}, {"q": tf.constant(3.)}]` after load to the right saved TensorFlow function, we need to represent the nested structure and the strings, recording that we have a trace for anything matching `[{"a": tf.TensorSpec(None, tf.float32)}, {"q": tf.TensorSpec([], tf.float64)}]` as an example.
Likewise functions may return nested structures of Tensors, for example returning a dictionary mapping from strings to Tensors. In order for the loaded function to return the same structure we need to serialize it.
This is an ergonomic aid for working with loaded SavedModels, not a promise to serialize all possible function signatures. For example we do not expect to pickle generic Python objects, and ideally we'd stay language-agnostic.
Used in: , , , , , ,
The kind of value.
Represents None.
Represents a double-precision floating-point value (a Python `float`).
Represents a signed integer value, limited to 64 bits. Larger values from Python's arbitrary-precision integers are unsupported.
Represents a string of Unicode characters stored in a Python `str`. In Python 3, this is exactly what type `str` is. In Python 2, this is the UTF-8 encoding of the characters. For strings with ASCII characters only (as often used in TensorFlow code) there is effectively no difference between the language versions. The obsolescent `unicode` type of Python 2 is not supported here.
Represents a boolean value.
Represents a TensorShape.
Represents an enum value for dtype.
Represents a value for tf.TensorSpec.
Represents a value for tf.TypeSpec.
Represents a value for tf.BoundedTensorSpec.
Represents a list of `Value`.
Represents a tuple of `Value`.
Represents a dict `Value`.
Represents Python's namedtuple.
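A minimal sketch of how plain Python values could be mapped onto the kinds listed above. It is not the SavedModel encoder: the function name `encode` and the `(kind, payload)` tuple layout are assumptions, the kind tags are illustrative labels, and the tensor-related kinds and named tuples are deliberately left out (a namedtuple would fall into the plain tuple branch here).

```python
def encode(value):
    """Encode a Python value as a (kind, payload) pair (illustrative layout)."""
    if value is None:
        return ("none_value", None)
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return ("bool_value", value)
    if isinstance(value, int):
        if not -(2 ** 63) <= value < 2 ** 63:
            raise ValueError("integer does not fit in a signed 64-bit value")
        return ("int64_value", value)
    if isinstance(value, float):
        return ("float64_value", value)
    if isinstance(value, str):
        return ("string_value", value)
    if isinstance(value, list):
        return ("list_value", [encode(v) for v in value])
    if isinstance(value, tuple):
        return ("tuple_value", [encode(v) for v in value])
    if isinstance(value, dict):
        return ("dict_value", {k: encode(v) for k, v in value.items()})
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(encode([{"training": True}, (1, 2.5)]))
```

Checking `bool` before `int` matters because `isinstance(True, int)` is true in Python; without that ordering the `training` flag would be recorded as an integer.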
Defines a connection between two tensors in a `GraphDef`.
Used in:
A tensor name. The value of this tensor will be substituted for the tensor named in `to_tensor`.
A tensor name. The value of this tensor will be bound to the value of the tensor named in `from_tensor`.
Used in:
Data type of tensor elements
Shape of the tensor.
Information about the size and allocator used for the data
Information about a Tensor necessary for feeding or retrieval.
Used in: , ,
For dense `Tensor`s, the name of the tensor in the graph.
There are many possible encodings of sparse matrices (https://en.wikipedia.org/wiki/Sparse_matrix). Currently, TensorFlow uses only the COO encoding. This is supported and documented in the SparseTensor Python class.
Generic encoding for CompositeTensors.
The static shape should be recorded here, to the extent that it can be known in advance. In the case of a SparseTensor, this field describes the logical shape of the represented tensor (aka dense_shape).
Generic encoding for composite tensors.
Used in:
The serialized TypeSpec for the composite tensor.
A TensorInfo for each flattened component tensor.
For sparse tensors, the COO encoding stores a triple of values, indices, and shape.
Used in:
The shape of the values Tensor is [?]. Its dtype must be the dtype of the SparseTensor as a whole, given in the enclosing TensorInfo.
The indices Tensor must have dtype int64 and shape [?, ?].
The dynamic logical shape represented by the SparseTensor is recorded in the Tensor referenced here. It must have dtype int64 and shape [?].
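To make the COO triple concrete, the sketch below expands `(indices, values, dense_shape)` into a nested Python list. It uses plain lists rather than tensors, and the function name `coo_to_dense` is hypothetical; it only illustrates how the three components relate.

```python
def coo_to_dense(indices, values, dense_shape, default=0):
    """Expand a COO triple into a nested list of shape `dense_shape`.

    indices: list of N index lists, each of length rank  (shape [N, rank])
    values:  list of N values                            (shape [N])
    dense_shape: list of rank dimension sizes            (shape [rank])
    """
    rank = len(dense_shape)
    if rank == 0:
        raise ValueError("this sketch requires rank >= 1")
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")

    def filled(shape):
        # Build a nested list of the given shape filled with `default`.
        if len(shape) == 1:
            return [default] * shape[0]
        return [filled(shape[1:]) for _ in range(shape[0])]

    dense = filled(list(dense_shape))
    for index, value in zip(indices, values):
        if len(index) != rank:
            raise ValueError("each index must have one entry per dimension")
        cell = dense
        for i in index[:-1]:
            cell = cell[i]
        cell[index[-1]] = value
    return dense


print(coo_to_dense([[0, 0], [1, 2]], [1.0, 2.0], [2, 3]))
# [[1.0, 0, 0], [0, 0, 2.0]]
```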
Protocol buffer representing a tensor.
Used in: , , , , , ,
Shape of the tensor. TODO(touts): sort out the 0-rank issues.
Version number. In version 0, if the "repeated xxx" representations contain only one element, that element is repeated to fill the shape. This makes it easy to represent a constant Tensor with a single value.
Serialized raw tensor content from either Tensor::AsProtoTensorContent or memcpy in tensorflow::grpc::EncodeTensorToByteBuffer. This representation can be used for all tensor types. The purpose of this representation is to reduce serialization overhead during RPC call by avoiding serialization of many repeated small items.
DT_HALF, DT_BFLOAT16. Note that since protobuf has no int16 type, we'll have some pointless zero padding for each value here.
DT_FLOAT.
DT_DOUBLE.
DT_INT32, DT_INT16, DT_UINT16, DT_INT8, DT_UINT8.
DT_STRING
DT_COMPLEX64. scomplex_val(2*i) and scomplex_val(2*i+1) are real and imaginary parts of i-th single precision complex.
DT_INT64
DT_BOOL
DT_COMPLEX128. dcomplex_val(2*i) and dcomplex_val(2*i+1) are real and imaginary parts of i-th double precision complex.
DT_RESOURCE
DT_VARIANT
DT_UINT32
DT_UINT64
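Two of the conventions above lend themselves to a short sketch: the version-0 rule that a single stored element fills the whole shape, and the interleaved real/imaginary layout of the complex fields. The helpers below work on plain Python lists; their names are hypothetical, and rejecting any other length mismatch is an assumption of this sketch rather than documented behavior.

```python
from math import prod


def fill_to_shape(values, shape):
    """Version-0 rule: one stored element is repeated to fill the shape."""
    count = prod(shape)  # prod([]) == 1, so a scalar holds one element
    if len(values) == 1:
        return list(values) * count
    if len(values) != count:
        # Assumption: this sketch treats any other mismatch as an error.
        raise ValueError(f"expected 1 or {count} values, got {len(values)}")
    return list(values)


def pack_complex(values):
    """Flatten complex numbers so that entries 2*i and 2*i+1 hold real, imag."""
    flat = []
    for z in values:
        flat.extend([z.real, z.imag])
    return flat


def unpack_complex(flat):
    """Inverse of pack_complex."""
    if len(flat) % 2:
        raise ValueError("flat list must contain an even number of entries")
    return [complex(flat[2 * i], flat[2 * i + 1]) for i in range(len(flat) // 2)]


print(fill_to_shape([7], [2, 3]))              # [7, 7, 7, 7, 7, 7]
print(pack_complex([1 + 2j, 3 - 4j]))          # [1.0, 2.0, 3.0, -4.0]
print(unpack_complex([1.0, 2.0, 3.0, -4.0]))   # [(1+2j), (3-4j)]
```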
Dimensions of a tensor.
Used in: , , , , , , , , , ,
Dimensions of the tensor, such as {"input", 30}, {"output", 40} for a 30 x 40 2D tensor. If an entry has size -1, this corresponds to a dimension of unknown size. The names are optional. The order of entries in "dim" matters: It indicates the layout of the values in the tensor in-memory representation. The first entry in "dim" is the outermost dimension used to layout the values, the last entry is the innermost dimension. This matches the in-memory layout of RowMajor Eigen tensors. If "dim.size()" > 0, "unknown_rank" must be false.
If true, the number of dimensions in the shape is unknown. If true, "dim.size()" must be 0.
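The statement that the first entry in "dim" is the outermost dimension and the last is the innermost is the usual row-major rule. A small sketch, using a hypothetical helper `row_major_offset` and plain integer dimension sizes, shows how an index maps to a position in the flat in-memory layout.

```python
def row_major_offset(index, dims):
    """Flat offset of `index` in a row-major layout with sizes `dims`.

    The last dimension varies fastest (innermost); the first varies slowest.
    """
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same rank")
    offset = 0
    for i, size in zip(index, dims):
        if not 0 <= i < size:
            raise IndexError(f"index {i} out of range for dimension of size {size}")
        offset = offset * size + i
    return offset


dims = [30, 40]  # the 30 x 40 example from the description
print(row_major_offset([0, 39], dims))  # 39: end of the first row
print(row_major_offset([1, 0], dims))   # 40: start of the second row
print(row_major_offset([2, 5], dims))   # 85 = 2 * 40 + 5
```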
One dimension of the tensor.
Used in:
Size of the tensor in that dimension. This value must be >= -1; a value of -1 is reserved to mean that the dimension is unknown. Certain wrappers that work with TensorShapeProto may fail at runtime when deserializing a TensorShapeProto containing a dim value of -1.
Optional name of the tensor dimension.
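The constraints stated for shapes (sizes no smaller than -1, and no dimensions when the rank is unknown) can be collected into a small validation sketch. The function names and the list-of-ints representation are assumptions for illustration only.

```python
def validate_shape(dim_sizes, unknown_rank=False):
    """Raise ValueError if the shape violates the documented constraints."""
    if unknown_rank and len(dim_sizes) > 0:
        raise ValueError("unknown_rank requires an empty dim list")
    for size in dim_sizes:
        if size < -1:
            raise ValueError(f"dimension size {size} is below -1")


def is_fully_defined(dim_sizes, unknown_rank=False):
    """True when the rank is known and no dimension is -1 (unknown)."""
    return not unknown_rank and all(size >= 0 for size in dim_sizes)


validate_shape([30, 40])                      # fine
validate_shape([-1, 40])                      # fine: first dimension unknown
print(is_fully_defined([30, 40]))             # True
print(is_fully_defined([-1, 40]))             # False
print(is_fully_defined([], unknown_rank=True))  # False
```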
A protobuf to represent tf.TensorSpec.
Used in:
Used in:
The number of threads in the pool. 0 means the system picks a value based on where this option proto is used (see the declaration of the specific field for more info).
The global name of the threadpool. If empty, then the threadpool is made and used according to the scope it's in; e.g., for a session threadpool, it is used by that session only. If non-empty, then:
- a global threadpool associated with this name is looked up or created. This allows, for example, sharing one threadpool across many sessions (e.g., like the default behavior, if inter_op_parallelism_threads is not configured), but still partitioning into a large and small pool.
- if the threadpool for this global_name already exists, then it is an error if the existing pool was created using a different num_threads value than is specified on this call.
- threadpools created this way are never garbage collected.
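The lookup-or-create behavior for named pools can be modeled with a small registry. This is a toy model of the rules stated above, not the runtime's code: the class names `Pool` and `ThreadPoolRegistry` are hypothetical, and no real threads are started.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str          # empty for a pool private to its owner
    num_threads: int


class ThreadPoolRegistry:
    """Toy model: named pools are shared and never removed."""

    def __init__(self):
        self._global_pools = {}

    def get_or_create(self, global_name: str, num_threads: int) -> Pool:
        if not global_name:
            # Unnamed pool: created fresh and owned by the caller's scope.
            return Pool("", num_threads)
        existing = self._global_pools.get(global_name)
        if existing is None:
            existing = Pool(global_name, num_threads)
            self._global_pools[global_name] = existing
        elif existing.num_threads != num_threads:
            raise ValueError(
                f"pool {global_name!r} already exists with "
                f"{existing.num_threads} threads, not {num_threads}"
            )
        return existing


registry = ThreadPoolRegistry()
large = registry.get_or_create("large", 16)
print(registry.get_or_create("large", 16) is large)  # True: shared pool
print(registry.get_or_create("", 4) is registry.get_or_create("", 4))  # False
```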
Used in:
Objects which this object depends on.
Serialized data specific to this object.
Slot variables owned by this object.
The registered saver used to save this object. If this saver is not present when loading the checkpoint, then loading will fail.
Whether this object has checkpoint values or descendants with checkpoint values. This is computed at save time to avoid traversing the entire object graph proto when restoring (which also has to traverse the live object graph).
Used in: ,
An index into `TrackableObjectGraph.nodes`, indicating the object being referenced.
A user-provided name for the edge.
Used in:
A name for the Tensor. Simple variables have only one `SerializedTensor` named "VARIABLE_VALUE" by convention. This value may be restored on object creation as an optimization.
The full name of the variable/tensor, if applicable. Used to allow name-based loading of checkpoints which were saved using an object-based API. Should match the checkpoint key which would have been assigned by tf.train.Saver.
The generated name of the Tensor in the checkpoint.
Whether checkpoints should be considered as matching even without this value restored. Used for non-critical values which don't affect the TensorFlow graph, such as layer configurations.
Used in: ,
An index into `TrackableObjectGraph.nodes`, indicating the variable object this slot was created for.
The name of the slot (e.g. "m"/"v").
An index into `TrackableObjectGraph.nodes`, indicating the `Object` with the value of the slot variable.
Represents a Python tuple.
Used in:
Represents a tf.TypeSpec
Used in: ,
The value returned by TypeSpec._serialize().
The name of the TypeSpec class.
* If type_spec_class == REGISTERED_TYPE_SPEC, the TypeSpec class is the one registered under this name. For types registered outside core TensorFlow by an add-on library, that library must be loaded before this value can be deserialized by nested_structure_coder.
* If type_spec_class specifies a particular TypeSpec class, this field is redundant with the type_spec_class enum, and is only used for error reporting in older binaries that do not know the type_spec_class enum.
The number of flat tensor components required by this TypeSpec.
Used in:
tf.SparseTensorSpec
tf.IndexedSlicesSpec
tf.RaggedTensorSpec
tf.TensorArraySpec
tf.data.DatasetSpec
IteratorSpec from data/ops/iterator_ops.py
tf.OptionalSpec
PerReplicaSpec from distribute/values.py
tf.VariableSpec
RowPartitionSpec from ragged/row_partition.py
The type registered as type_spec_class_name.
Subclasses of tf.ExtensionType
Indicates how a distributed variable will be aggregated.
Used in: ,
`NONE`: This is the default, giving an error if you use a variable-update operation with multiple replicas.
`SUM`: Add the updates across replicas.
`MEAN`: Take the arithmetic mean ("average") of the updates across replicas.
`ONLY_FIRST_REPLICA`: This is for when every replica is performing the same update, but we only want to perform the update once. Used, e.g., for the global step counter.
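The four aggregation modes can be illustrated on a list of per-replica scalar updates. This sketch uses plain floats and a hypothetical `aggregate` function; it shows the arithmetic only and does not reflect how a distribution strategy applies updates.

```python
def aggregate(updates, mode):
    """Combine per-replica updates according to the aggregation mode."""
    if not updates:
        raise ValueError("at least one update is required")
    if mode == "NONE":
        if len(updates) > 1:
            raise ValueError("NONE does not allow updates from multiple replicas")
        return updates[0]
    if mode == "SUM":
        return sum(updates)
    if mode == "MEAN":
        return sum(updates) / len(updates)
    if mode == "ONLY_FIRST_REPLICA":
        return updates[0]
    raise ValueError(f"unknown aggregation mode: {mode}")


per_replica = [1.0, 2.0, 3.0]
print(aggregate(per_replica, "SUM"))                 # 6.0
print(aggregate(per_replica, "MEAN"))                # 2.0
print(aggregate(per_replica, "ONLY_FIRST_REPLICA"))  # 1.0
```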
Protocol buffer representing a Variable.
Name of the variable tensor.
Name of the tensor holding the variable's initial value.
Name of the initializer op.
Name of the snapshot tensor.
Support for saving variables as slices of a larger variable.
Whether to represent this as a ResourceVariable.
Whether this variable should be trained.
Indicates when a distributed variable will be synced.
Indicates how a distributed variable will be aggregated.
Indicates when a distributed variable will be synced.
Used in: ,
`AUTO`: Indicates that the synchronization will be determined by the current `DistributionStrategy` (e.g., with `MirroredStrategy` this would be `ON_WRITE`).
`NONE`: Indicates that there will only be one copy of the variable, so there is no need to sync.
`ON_WRITE`: Indicates that the variable will be updated across devices every time it is written.
`ON_READ`: Indicates that the variable will be aggregated across devices when it is read (e.g., when checkpointing or when evaluating an op that uses the variable).
Protocol buffer representing the serialization format of DT_VARIANT tensors.
Used in:
Name of the type of objects being serialized.
Portions of the object that are not Tensors.
Tensors contained within objects being serialized.
The config for graph verifiers.
Used in:
Deadline for completion of all verification, i.e., all verifiers that are toggled ON must complete execution within this time.
Perform structural validation on a tensorflow graph. Default is OFF.
Used in:
Version information for a piece of serialized data. There are different types of versions for each type of data (GraphDef, etc.), but they all have the same common shape described here. Each consumer has "consumer" and "min_producer" versions (specified elsewhere). A consumer is allowed to consume this data if
  producer >= min_producer
  consumer >= min_consumer
  consumer not in bad_consumers
Used in: ,
The version of the code that produced this data.
Any consumer below this version is not allowed to consume this data.
Specific consumer versions which are disallowed (e.g. due to bugs).
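The three compatibility conditions translate directly into a predicate. The function name `can_consume` and its argument layout are assumptions for illustration: `producer`, `min_consumer`, and `bad_consumers` describe the data, while `consumer` and `min_producer` belong to the code reading it.

```python
def can_consume(producer, min_consumer, bad_consumers, consumer, min_producer):
    """Return True if a consumer may read data with the given version info."""
    return (
        producer >= min_producer
        and consumer >= min_consumer
        and consumer not in bad_consumers
    )


# Data written by producer version 27, readable by consumers >= 12 except 15.
data = dict(producer=27, min_consumer=12, bad_consumers=[15])
print(can_consume(**data, consumer=20, min_producer=21))  # True
print(can_consume(**data, consumer=15, min_producer=21))  # False: bad consumer
print(can_consume(**data, consumer=10, min_producer=21))  # False: consumer too old
print(can_consume(**data, consumer=20, min_producer=30))  # False: producer too old
```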