package tensorflow.tpu

message AdadeltaParameters

optimization_parameters.proto:171

https://www.tensorflow.org/api_docs/python/tf/train/AdadeltaOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L68

Used in: OptimizationParameters

message AdagradParameters

optimization_parameters.proto:59

https://www.tensorflow.org/api_docs/python/tf/train/AdagradOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L151

Used in: OptimizationParameters

message AdamParameters

optimization_parameters.proto:111

The Adam optimizer does not implement hyper-parameter update; use the dynamic learning rate feature instead, setting the learning rate to the user-specified learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t), where t is the current timestep. https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer https://github.com/tensorflow/tensorflow/blob/ab51450c817674c8ff08a7ae4f8ac50cdc4bed8b/tensorflow/python/training/adam.py#L54

Note that by default the code implements the lazy version of Adam (https://www.tensorflow.org/api_docs/python/tf/contrib/opt/LazyAdamOptimizer) unless the use_non_lazy_adam parameter is set, in which case it implements the normal version of Adam, which updates all parameters in the embedding table even for entries that are not used in the current minibatch (https://www.tensorflow.org/api_docs/python/tf/contrib/opt/AdamOptimizer). If use_non_lazy_adam is enabled, gradient accumulation must also be enabled to get correct results; otherwise a warning is printed (this may become an error in the future).

If use_sum_inside_sqrt is set, the Adam variable update formula changes from m / (sqrt(v) + epsilon) to m / sqrt(v + epsilon**2); this option improves the performance of TPU training and is not expected to harm model quality.

Used in: OptimizationParameters
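
As a hedged illustration of the two formulas above, the sketch below computes the suggested dynamic learning rate and the two denominator variants; the function and argument names are illustrative and not fields of this proto.

import math

def adam_dynamic_learning_rate(user_learning_rate, beta1, beta2, step):
    # user learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t), with t = step.
    return user_learning_rate * math.sqrt(1.0 - beta2**step) / (1.0 - beta1**step)

def adam_update_denominator(v, epsilon, use_sum_inside_sqrt):
    # Default: sqrt(v) + epsilon; with use_sum_inside_sqrt the denominator
    # becomes sqrt(v + epsilon**2), which is faster on TPU.
    return math.sqrt(v + epsilon**2) if use_sum_inside_sqrt else math.sqrt(v) + epsilon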

message BoundedAdagradParameters

optimization_parameters.proto:64

Algorithm in http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.

Used in: OptimizationParameters

message CenteredRmsPropParameters

optimization_parameters.proto:141

https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L372

Used in: OptimizationParameters

message ClippingLimits

optimization_parameters.proto:7

Used in: OptimizationParameters

message CompilationResultProto

compilation_result.proto:10

Describes the result of a TPU compilation.

message DynamicLearningRate

optimization_parameters.proto:16

Dynamic learning rate specification in the TPUEmbeddingConfiguration. The actual learning rates are provided as a list of scalar inputs to the SendTPUEmbeddingGradients Op, indexed by the tag specified in this proto.

Used in: LearningRate
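
A conceptual sketch of the tag indexing described above, assuming each table's optimizer carries an integer tag (the names below are illustrative, not proto fields): every table tagged with k reads entry k of the scalar learning-rate list fed to SendTPUEmbeddingGradients, so tables sharing a tag share one dynamically supplied learning rate.

table_tags = {"table_a": 0, "table_b": 0, "table_c": 1}  # illustrative tags
learning_rates = [0.05, 0.001]                           # scalar list, indexed by tag
for table, tag in table_tags.items():
    print(table, "->", learning_rates[tag])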

message FtrlParameters

optimization_parameters.proto:83

https://www.tensorflow.org/api_docs/python/tf/train/FtrlOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L192

Used in: OptimizationParameters

message GradientAccumulationStatus

optimization_parameters.proto:229

Status of using gradient accumulation (doing two passes over the input gradients: one to accumulate them into a temporary array and another to apply them using the actual optimization algorithm). The extra message is to wrap the enum for scoping.

(message has no fields)

enum GradientAccumulationStatus.Status

optimization_parameters.proto:231

If UNSPECIFIED (default), gradient accumulation is ENABLED.

Used in: OptimizationParameters
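
A minimal sketch (plain NumPy, not TPU code) of the two-pass scheme described for GradientAccumulationStatus above: gradients are first summed into a temporary array, then applied once by the actual optimization algorithm.

import numpy as np

def accumulate_then_apply(parameters, gradient_batches, apply_update):
    accumulator = np.zeros_like(parameters)       # pass 1: accumulate into a temporary array
    for g in gradient_batches:
        accumulator += g
    return apply_update(parameters, accumulator)  # pass 2: apply with the real optimizer

params = np.zeros(4)
grads = [np.ones(4), 2 * np.ones(4)]
sgd = lambda p, g: p - 0.1 * g                    # stand-in for the actual algorithm
print(accumulate_then_apply(params, grads, sgd))  # [-0.3 -0.3 -0.3 -0.3]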

message HotIdReplicationConfiguration

optimization_parameters.proto:240

Configuration proto for hot ID optimization. This is an experimental feature that is currently disabled (by default).

Used in: OptimizationParameters

enum HotIdReplicationConfiguration.Status

optimization_parameters.proto:243

Whether to enable or disable hot ID optimization. If UNSPECIFIED (default), hot ID optimization is DISABLED.

Used in: HotIdReplicationConfiguration

message LearningRate

optimization_parameters.proto:47

Source of learning rate to use.

Used in: OptimizationParameters

message MdlAdagradLightParameters

optimization_parameters.proto:151

Variant of algorithm in http://proceedings.mlr.press/v44/shamir15.pdf

Used in: OptimizationParameters

message MomentumParameters

optimization_parameters.proto:123

https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L271

Used in: OptimizationParameters

message OnlineYogiParameters

optimization_parameters.proto:195

The online Yogi optimizer does not implement hyper-parameter update; use the dynamic learning rate feature instead, setting the learning rate to the user-specified learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t), where t is the current timestep. The algorithm follows https://papers.nips.cc/paper/8186-adaptive-methods-for-nonconvex-optimization.pdf, plus some extensions based on FTRL. Note that by default the code implements the lazy version of online Yogi.

Used in: OptimizationParameters

message OnlineYogiParameters.SignActivation

optimization_parameters.proto:212

x -> copysign(1, x) (i.e., return 1 for an input of +0 rather than 0).

Used in: OnlineYogiParameters

(message has no fields)

message OnlineYogiParameters.TanhActivation

optimization_parameters.proto:215

x -> tanh(x * 10)

Used in: OnlineYogiParameters

(message has no fields)
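
The two activation choices above reduce to one-line functions, sketched below. How the chosen activation enters the Yogi second-moment update is an assumption based on the linked paper, not something these protos spell out.

import math

def sign_activation(x):
    # x -> copysign(1, x): returns 1.0 for an input of +0.0 rather than 0.0.
    return math.copysign(1.0, x)

def tanh_activation(x):
    # x -> tanh(x * 10): a smooth surrogate for the sign function.
    return math.tanh(x * 10.0)

print(sign_activation(0.0), sign_activation(-0.25))  # 1.0 -1.0
print(round(tanh_activation(0.05), 3))               # 0.462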

message OptimizationParameters

optimization_parameters.proto:251

Used in: TPUEmbeddingConfiguration.TableDescriptor

message PaddingMap

dynamic_padding.proto:9

A mapping between the dynamic shape dimension of an input and the arg that represents the real shape.

message ProximalAdagradParameters

optimization_parameters.proto:180

https://www.tensorflow.org/api_docs/python/tf/train/ProximalAdagradOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L164

Used in: OptimizationParameters

message RmsPropParameters

optimization_parameters.proto:131

https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L356

Used in: OptimizationParameters

message StateVariableSpecification

optimization_parameters.proto:304

Specification of an optimization algorithm's state variables (both the main value vector and any extra accumulators, etc.). This proto is only used internally by the TPU software and is not exposed directly to the TF model.

message StateVariableSpecification.FillWithConstant

optimization_parameters.proto:330

A state variable that should be filled with a constant and normally hidden from users (used for intermediate gradients being accumulated, for example).

Used in: StateVariableSpecification

message StateVariableSpecification.UserDefined

optimization_parameters.proto:310

A normal state variable that should be saved and restored in checkpoints and used as an input or output to non-debug TensorFlow ops.

Used in: StateVariableSpecification

message StochasticGradientDescentParameters

optimization_parameters.proto:79

https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/core/kernels/training_ops.cc#L423

Used in: OptimizationParameters

(message has no fields)

message TPUEmbeddingConfiguration

tpu_embedding_configuration.proto:8

enum TPUEmbeddingConfiguration.Mode

tpu_embedding_configuration.proto:27

Mode: whether the embedding layer program should be run for inference (forward pass only), training (both forward and backward passes), or just the backward pass.

Used in: TPUEmbeddingConfiguration

enum TPUEmbeddingConfiguration.ShardingStrategy

tpu_embedding_configuration.proto:57

Sharding strategy of the embedding tables among the hosts. If the sharding_strategy is "mod", each id is assigned to host "id % num_hosts". For instance, 13 ids are split across 5 hosts as: [[0, 5, 10], [1, 6, 11], [2, 7, 12], [3, 8], [4, 9]]. If the sharding_strategy is "div", ids are assigned to hosts in a contiguous manner. In this case, 13 ids are split across 5 hosts as: [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10], [11, 12]]. In both strategies, if the number of hosts does not evenly divide the id space, each of the first "table_descriptor.vocabulary_size % num_hosts" hosts is assigned one more id. This partitioning strategy exactly follows that of the embedding_lookup TensorFlow function at tensorflow/python/ops/embedding_ops.py.

Used in: TPUEmbeddingConfiguration
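
The sketch below reproduces the "mod" and "div" examples above for 13 ids over 5 hosts; it illustrates the partitioning rule only and is not the TPU implementation.

def shard_ids(num_ids, num_hosts, strategy):
    if strategy == "mod":
        # id i is assigned to host i % num_hosts.
        return [[i for i in range(num_ids) if i % num_hosts == h]
                for h in range(num_hosts)]
    # "div": contiguous ranges; the first (num_ids % num_hosts) hosts get one extra id.
    base, extra = divmod(num_ids, num_hosts)
    out, start = [], 0
    for h in range(num_hosts):
        size = base + (1 if h < extra else 0)
        out.append(list(range(start, start + size)))
        start += size
    return out

print(shard_ids(13, 5, "mod"))  # [[0, 5, 10], [1, 6, 11], [2, 7, 12], [3, 8], [4, 9]]
print(shard_ids(13, 5, "div"))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10], [11, 12]]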

message TPUEmbeddingConfiguration.TableDescriptor

tpu_embedding_configuration.proto:10

Description of the various embedding tables.

Used in: TPUEmbeddingConfiguration

message TPUEmbeddingOutputLayout

tpu_embedding_output_layout.proto:17

Used in: TPUEmbeddingConfiguration

message TPUEmbeddingOutputLayout.EmbeddingOutputTensor

tpu_embedding_output_layout.proto:67

Format information for a single output tensor.

Used in: TPUEmbeddingOutputLayout

message TPUEmbeddingOutputLayout.FeatureDescriptor

tpu_embedding_output_layout.proto:37

Description of the output placement for one feature.

Used in: TableDescriptor

message TPUEmbeddingOutputLayout.OutputLocation

tpu_embedding_output_layout.proto:19

Location of one copy of the feature's data.

Used in: FeatureDescriptor

message TPUEmbeddingOutputLayout.TableDescriptor

tpu_embedding_output_layout.proto:45

Description of the output placement for features of one table.

Used in: TPUEmbeddingOutputLayout

message TPUEmbeddingOutputLayout.TwoDOutputTensor

tpu_embedding_output_layout.proto:57

Size and layout information for 2-D tensors.

Used in: EmbeddingOutputTensor

message TopologyProto

topology.proto:8

Describes the geometry of a TPU mesh.