Arithmetic operations.
Comparison operators.
Element-wise maximum.
Element-wise minimum.
Raises the left-hand-side to the right-hand-side power.
Remainder operation.
Element-wise, logical operators on booleans and bitwise operators on ints.
Complex from real, imag.
Computes the 4-quadrant arctangent of the y, x input arguments.
Used in:
Serialization of BufferAllocation.
Used in:
Assigned represents a single LogicalBuffer that is assigned to this BufferAllocation.
Used in:
Serialization of BufferAssignment.
Used in:
Alias represents a source LogicalBuffer, and the buffer location that aliases it.
Used in:
Used in:
Handle given to a user to represent a channel between two computations via a Send and Recv instruction pair. Channels are unbuffered, so Send instructions will be blocked until the data is transferred.
Used in:
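Concretely, the handle is just an opaque integer naming the channel. A minimal sketch, assuming a single int64 field (the field name and number are illustrative, not copied from the source):

```proto
// Hypothetical sketch of ChannelHandle: an opaque id naming the channel
// shared by a matched Send/Recv pair. Field name/number are assumptions.
message ChannelHandle {
  int64 handle = 1;
}
```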
Handle given to a user that represents a data result in a computation. This is used to pass to subsequent computations that depend upon the data as an operand.
Used in:
Handle given to a user that represents a computation that the user builds up before execution.
Used in:
Statistics of a computation.
Used in:
The number of floating point operations in the computation.
The number of transcendental operations (e.g., exp) in the computation.
A LiteralProto is returned directly for this request, instead of a ComputationDataHandle.
Used in:
The dimension in which we concatenate; e.g. if you had arrays of shape [4, 1] and [5, 1], you'd concatenate in dimension 0 to produce a [9, 1]. Attempting to concatenate them in dimension 1 would produce an error, as 4 != 5 (and there is no ragged array support).
Used in:
Used in:
Used in:
Used in:
The number of the dimension that represents batch in the input.
The number of the dimension that represents features in the input.
The dimension numbers for the spatial dimensions that the window moves through in the input.
The number of the dimension that represents input features in the convolutional kernel (rhs).
The number of the dimension that represents output features in the convolutional kernel (rhs).
The dimension numbers for the spatial dimensions that the window moves through in the kernel (rhs). window.strides(0) is the stride in the kernel_spatial_dimensions(0) dimension.
The number of the dimension that represents batch in the output.
The number of the dimension that represents features in the output.
The dimension numbers for the spatial dimensions that the window moves through in the output.
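Putting the nine field comments above together, the message plausibly looks like the sketch below; field names and numbers are assumptions inferred from the comments, not copied from the source:

```proto
// Illustrative sketch of ConvolutionDimensionNumbers (names/numbers assumed).
message ConvolutionDimensionNumbers {
  int64 input_batch_dimension = 1;
  int64 input_feature_dimension = 2;
  repeated int64 input_spatial_dimensions = 3;
  int64 kernel_input_feature_dimension = 4;   // input features of the kernel (rhs)
  int64 kernel_output_feature_dimension = 5;  // output features of the kernel (rhs)
  repeated int64 kernel_spatial_dimensions = 6;
  int64 output_batch_dimension = 7;
  int64 output_feature_dimension = 8;
  repeated int64 output_spatial_dimensions = 9;
}
```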
Used in:
This is the filter/kernel.
Describes the filter/kernel.
(message has no fields)
Used in:
Used in:
Debugging options for XLA. These options may change at any time - there are no guarantees about backward or forward compatibility for these fields.
Used in:
HLO modules matching this regex will be dumped to a .dot file throughout various stages in compilation (file names are LOG(INFO)'d). Set to ".*" to dump *all* HLO modules.
Show addresses of HLO ops in graph dump.
Path to dump HLO graphs to.
Dump HLO graphs as TensorFlow GraphDefs.
HLO modules matching this regex will be dumped to LOG(INFO). Set to ".*" to dump *all* HLO modules.
Dump all HLO modules as text into the provided directory path.
Dump HLO after all HLO passes are executed as proto binary into this directory.
Instrument the computation to collect per-HLO cycle counts.
Dumps computations that XLA executes into the provided directory path.
Dumps parameters and results of computations that XLA executes into the provided directory path.
List of HLO passes to disable. These names must exactly match the pass names as specified by the HloPassInterface::name() method.
Numerical optimization level for the XLA compiler backend; the specific interpretation of this value is left to the backends.
When true, "unsafe" mathematical optimizations are enabled. These transformations include but are not limited to: - Reducing the precision of operations (e.g. using an approximate sin function, or transforming x/y into x * (1/y)). - Assuming that operations never produce or consume NaN or +/- Inf. - Assuming that +0 and -0 are indistinguishable.
Embed the compiler IR as a string in the executable.
Dump the compiler IR into this directory as individual files.
Eliminate implicit broadcasts when lowering user computations to HLO instructions; use explicit broadcast instead.
When generating calls to Eigen in the CPU backend, use multi-threaded Eigen mode.
Path to directory with cuda/ptx tools and libraries.
Enable flush-to-zero semantics in the GPU backend.
Disable multi-streaming in the GPU backend.
If true, in LLVM-based backends, emit !alias.scope metadata in generated IR.
If true, in LLVM-based backends, emit !noalias metadata in the generated IR.
If true, in LLVM-based backends, emit !invariant.load metadata in the generated IR.
If true, a set of expensive LLVM optimization passes will not be run.
Options for inserting reduce-precision operations for numerical experimentation. This is a repeated field, as we may want to have multiple passes with different parameters.
This is used by ClientLibraryTestBase::ComputeAndCompare*. If true, the computation will run n! times with all permutations of layouts for the output shape in rank n. For example, with a 3D shape, all permutations of the set {0, 1, 2} are tried.
This is used by ClientLibraryTestBase::ComputeAndCompare*. If true, the computation will run for all permutations of layouts of all input arguments. For example, with 2 input arguments in 2D and 4D shapes, the computation will run 2! * 4! times.
Assign colors based on sharding information when generating the Graphviz HLO graph.
Prefix the name scopes of the TF graph exports with "devX" device assignments, if available.
If true, the GPU backend is free to use cudnn for HLO batch normalization ops.
Dump HLO before any HLO passes are executed as proto binary into this directory.
Dump HLO after each pass as an HloProto in binary file format into this directory.
Generate calls to MKL-DNN in the CPU backend.
Extra options to pass to the compilation backend; specific interpretation of these values is left to the backend.
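As a rough illustration, a few of the options above might be declared along the lines of the sketch below; every field name and number here is an assumption for illustration, not a copy of the real xla.proto:

```proto
// Illustrative partial sketch of DebugOptions; names/numbers assumed.
message DebugOptions {
  // Regex of HLO modules to dump as .dot graphs.
  string xla_generate_hlo_graph = 1;
  // Show op addresses in graph dumps.
  bool xla_hlo_graph_addresses = 2;
  // Numerical optimization level for the backend.
  int32 xla_backend_optimization_level = 3;
  // Enable "unsafe" mathematical optimizations.
  bool xla_enable_fast_math = 4;
  // Extra options handed to the backend uninterpreted.
  map<string, string> xla_backend_extra_options = 500;
}
```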
DeviceAssignmentProto is a serialized form of the DeviceAssignment class, which represents the device ids assigned to a set of replicated computations. See the xla::DeviceAssignment class comment for more details.
Each logical computation runs on replica_count physical devices. ComputationDevice represents the device ids assigned to the replicas.
Used in:
Handle given to a user that represents a replicated virtual device. Each replicated device represents N physical devices for execution where N is the number of replicas.
Used in:
The number of model-parallel virtual devices that communicate via XLA Send/Recv instructions.
Used in:
The dimension numbers that represent the 'lhs' contracting dimensions.
The dimension numbers that represent the 'rhs' contracting dimensions.
The dimension numbers that represent the 'lhs' batch dimensions.
The dimension numbers that represent the 'rhs' batch dimensions.
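The four comments above map naturally onto four repeated integer fields; a sketch (names and numbers assumed from the comments):

```proto
// Illustrative sketch of DotDimensionNumbers.
message DotDimensionNumbers {
  repeated int64 lhs_contracting_dimensions = 1;
  repeated int64 rhs_contracting_dimensions = 2;
  repeated int64 lhs_batch_dimensions = 3;
  repeated int64 rhs_batch_dimensions = 4;
}
```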
Used in:
Used in:
Operand from which to slice at dynamic 'start_indices'.
Dynamically computed 'start_indices' for slice operation.
Slice sizes for each dimension (note that index calculations are computed modulo the dimension sizes to avoid out-of-bounds array accesses).
Used in:
Operand on which slice 'update' is to be applied.
The slice update to apply to 'operand'.
Dynamically computed start indices for the update slice operation.
Options that affect how XLA compiles and runs code to service this request.
A handle to the execution launched asynchronously.
Used in:
Options that affect how XLA compiles and runs code to service this request.
Used in:
Options that affect how XLA compiles and runs code to service this request.
Used in:
Handle given to a user that represents an execution that the user launched asynchronously on the device.
Used in:
These settings control how XLA compiles and/or runs code. Not all settings will have an effect on every platform. When adding new fields, keep in mind that boolean fields default to false.
Used in:
This optional field's layout is used as a hint when storing the output of this computation. Subsequent transfers of this output array to the client may be faster when using this layout. We use a Shape here to accommodate computations that return a tuple.
Used to seed random-number generators used in this computation. If this is 0, we generate a seed ourselves. TODO(b/32083678): Changing the seed unnecessarily forces a recompilation.
This optional field specifies a particular set of devices to run the computation on. The computation will be partitioned across these devices. If not provided, the default device will be chosen.
Profile data from the execution of a computation.
Used in:
Whether the executable was read from the compilation cache.
The time in milliseconds spent to compile the computation. This is only set if the executable was not read from the compilation cache (compilation_cache_hit == false).
The number of cycles spent for the computation. This does not include the time taken for the data transfers between the host and the device. This is a target-dependent field and only used for debugging purposes.
The time in nanoseconds spent for the computation, without data transfer.
The time in nanoseconds spent for the entire computation, including the result data transfer time. Current implementation does not spend any cycles for the input data transfer since the memory is initialized with the proper values before the execution.
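The profile fields described above fit a simple flat message; a sketch with assumed field names and numbers:

```proto
// Illustrative sketch of ExecutionProfile; names/numbers assumed.
message ExecutionProfile {
  bool compilation_cache_hit = 1;
  int64 compile_time_ms = 2;              // unset on a cache hit
  int64 compute_cycle_count = 3;          // target-dependent, debug only
  int64 compute_time_ns = 4;              // excludes data transfer
  int64 compute_and_transfer_time_ns = 5; // includes result transfer
}
```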
Used in:
Multivalent for higher-order FFT.
Used in:
Forward FFT; complex in, complex out.
Inverse FFT; complex in, complex out.
Forward real FFT; real in, fft_length / 2 + 1 complex out
Inverse real FFT; fft_length / 2 + 1 complex in, fft_length real out.
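As an enum, the four variants read as follows (tag values assumed):

```proto
// Illustrative sketch of the FftType enum.
enum FftType {
  FFT = 0;    // forward, complex in -> complex out
  IFFT = 1;   // inverse, complex in -> complex out
  RFFT = 2;   // forward, real in -> fft_length / 2 + 1 complex out
  IRFFT = 3;  // inverse, fft_length / 2 + 1 complex in -> fft_length real out
}
```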
A format specifies the method used by a layout to store an array in memory.
Used in:
The default layout, with exactly one storage location per element (ignoring padding).
A sparsely encoded layout, providing only the index/value pairs of non-zero elements.
Describes the dimension numbers for a gather operation. See https://www.tensorflow.org/performance/xla/operation_semantics#gather for more details.
Used in:
,"Window indices" is a term for a set of indices that index into the interior of a dynamic-slice from the input tensor, the starting indices for which were computed from output_gather_dims (see the operation semantic for how this is defined) and the gather_indices tensor. The window indices for a specific output index Out is computed as: i = 0 for (k : [0, input_tensor_shape.rank)) window_indices[k] = if k in elided_window_dims then 0 else Out[output_window_dims[i++]]
This is interpreted as a map from i to gather_dims_to_operand_dims[i]. It transforms the gather index looked up from the gather_indices tensor into the starting index in the input space.
The dimension in the gather_indices input that contains the starting indices.
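The three comments above suggest a message along these lines; all names and numbers are assumptions inferred from the comments (index_vector_dim in particular is a hypothetical name for the last field):

```proto
// Illustrative sketch of GatherDimensionNumbers; names/numbers assumed.
message GatherDimensionNumbers {
  repeated int64 output_window_dims = 1;
  repeated int64 elided_window_dims = 2;
  repeated int64 gather_dims_to_operand_dims = 3;
  // Dimension of gather_indices holding the starting indices
  // (hypothetical name).
  int64 index_vector_dim = 4;
}
```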
Used in:
Used in:
Handle given to a user that represents a globally accessible allocation. Contrast this against a ComputationDataHandle, which is not globally accessible, since it only exists within a specific computation.
Used in:
A trace of a HeapSimulator run.
Used in:
The trace includes a list of events, where each event describes one action performed by the heap simulator.
Used in:
The id of the LogicalBuffer that the event applies to.
The HloInstruction that the simulation was processing that caused this event to occur, identified by its computation and instruction name. E.g. buffers defined by instruction A are allocated when processing A.
The id of the canonical LogicalBuffer that the buffer shares with. Only set for SHARE_WITH events.
Used in:
A memory region was allocated for the buffer.
A memory region was freed for the buffer.
A buffer was shared with another (canonical) buffer. This is similar to ALLOC, except that instead of allocating a new region of memory, the memory region of the canonical buffer is directly re-used. Multiple buffers may share with the same canonical buffer. The lifetime of the canonical buffer is extended to the union of all lifetimes.
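Taken together, the trace and its events plausibly look like the sketch below; field names and numbers are assumptions based on the comments above:

```proto
// Illustrative sketch of HeapSimulatorTrace and its Event message.
message HeapSimulatorTrace {
  message Event {
    enum Kind {
      ALLOC = 0;       // a region was allocated for the buffer
      FREE = 1;        // the buffer's region was freed
      SHARE_WITH = 2;  // the buffer reuses a canonical buffer's region
    }
    Kind kind = 1;
    int64 buffer_id = 2;                // LogicalBuffer the event applies to
    string computation_name = 3;        // instruction being processed,
    string instruction_name = 4;        // identified by computation + name
    int64 share_with_canonical_id = 5;  // only set for SHARE_WITH
  }
  repeated Event events = 1;
}
```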
Serialization of HloComputation.
Used in:
The array of instructions is always in a valid dependency order, where operands appear before their users.
The program shape (with layout) of this computation.
The id of this computation.
The id of the root of the computation.
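A sketch of the computation serialization described above; field numbers are assumed, and HloInstructionProto/ProgramShape are the messages described elsewhere in this section:

```proto
// Illustrative sketch of HloComputationProto; names/numbers assumed.
message HloComputationProto {
  string name = 1;
  repeated HloInstructionProto instructions = 2;  // valid dependency order
  ProgramShape program_shape = 3;                 // with layout
  int64 id = 4;
  int64 root_id = 5;  // id of the computation's root instruction
}
```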
Serialization of HloInstruction.
Used in:
Literal, only present for kConstant.
Parameter number is only present for kParameter.
Fusion state, only present for kFusion.
Index for kGetTupleElement.
Dimensions present for some operations that require reshaping or broadcasting, including Reshape, Reduce, ReduceWindow, and Reverse.
Describes the window in a windowed operation such as convolution.
Describes the dimension numbers used for a convolution.
The bit sizes for a reduce-precision operation.
Describes the [start, start + size) range size for a dynamic slice ('start' is specified dynamically in the second operand of the operation).
The padding configuration that describes the edge padding and interior padding of this pad instruction. Only set for pad instructions.
Outfeed configuration information, only present for kOutfeed.
The distribution requested for random number generation. Only present for kRng.
A small float number added to the variance to avoid divide-by-zero error. Only present for kBatchNormTraining.
An integer value representing the index of the feature dimension. Only present for kBatchNormTraining.
Represents a unique identifier for each Send/Recv instruction pair. Only present for kSend or kRecv.
The string representation of the infeed configuration.
Name of a global symbol to call, only present for kCustomCall.
Shape of outfeed request.
Describes the dimension numbers used for a dot operation.
FFT type (FFT, IFFT, etc).
FFT length.
Gather dimension numbers.
The id of this instruction.
Describes the [begin, end) index range and stride for slices.
Used in:
Serialization of HloModule.
Used in:
The array of computations is always in a valid dependency order, where callees appear before their callers.
The program shape (with layout) of the entry computation.
The id of this module.
Serialization of HloOrdering.
Used in:
NOTE: currently only sequential orderings are serialized.
Used in:
Describes how to pretty-print a profile counter array gathered for a specific HloModule.
HloComputationInfos for every HloComputation in the HloModule.
The size of the profile counters array we will pretty-print.
Pretty-printer information about an HloComputation.
Used in:
The index into the profile counters array for the HloComputation corresponding to this HloComputationInfo.
HloInstructionInfos for every HloInstruction in the HloComputation corresponding to this HloComputationInfo.
Pretty-printer information about an HloInstruction.
Used in:
Metrics computed by HloCostAnalysis.
The index into the profile counters array for the HloInstruction corresponding to this HloInstructionInfo.
Grouping message that contains all of the information above.
Options for the HLO insert-reduce-precision-operations pass.
Used in:
Exponent and mantissa bit counts for the reduced precision.
Operations matching these opcodes should be suffixed with reduce-precision operations.
Operations with names containing these substrings should be suffixed with reduce-precision operations.
Where and when the reduce-precision operations will be added.
Used in:
Add reduce-precision operations to the inputs of selected instructions. This is done before any optimization occurs.
Add reduce-precision operations to the outputs of selected instructions. This is done before any optimization occurs.
After operation-fusion occurs, add reduce-precision operations to the outputs of any selected instructions that have not been fused into fusion instructions.
After operation-fusion occurs, add reduce-precision operations to the inputs of any fusion instructions that contain operations matching the selection criteria.
After operation-fusion occurs, add reduce-precision operations to the outputs of any fusion instructions that contain operations matching the selection criteria.
Used in:
Operand to the HostCompute. Supports tuple.
Name used to identify HostSend/Recv channels.
Cost estimate in nanoseconds.
The shape of any data returned by the host.
Used in:
The shape of the data returned by reading the device's infeed buffer.
Additional infeed configuration for the backend.
A layout describes how the array is placed in (1D) memory space. This includes the minor-to-major ordering of dimensions within a shape, as well as any padding present in those dimensions. Clients must specify the layouts of input Literals to the computation. Layouts specified in interior operations which take Shapes (for example, Convert) are ignored. See the XLA documentation for more information on shapes and layouts.
Used in:
The method used to store the data in memory. The format determines which of the other fields are used by the layout.
Sequence of dimension numbers, from minor (fastest varying index) to major (slowest varying index). This field is required.
The width to which each dimension is padded. If present, the size of padded_dimensions must equal the rank of the shape. The padding appears at the end of a dimension, not at the beginning. This kind of padding, unlike padding in e.g. convolution, is not part of the shape. This field must be unset unless the format is DENSE.
Describes the values in the padding specified by padded_dimensions. This field must be unset unless the format is DENSE.
The maximum number of elements that can be stored for SPARSE formats. This can be used to determine the maximum size in bytes of arrays stored in memory. This field must be unset unless the format is SPARSE.
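Combining the field comments above, the layout message plausibly looks like this sketch; field numbers are assumed, and Format/PaddingValue are the types described elsewhere in this section:

```proto
// Illustrative sketch of Layout; names/numbers assumed.
message Layout {
  Format format = 1;
  repeated int64 minor_to_major = 2;     // required; minor to major
  repeated int64 padded_dimensions = 3;  // DENSE only
  PaddingValue padding_value = 4;        // DENSE only
  int64 max_sparse_elements = 5;         // SPARSE only
}
```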
Literals are used when the server and client need to exchange materialized data / results. Literals are also used to describe constants used in computations. Transfers to/from the client are encoded in literal form, and the structure of the repeated fields is implied by the shape.
Used in:
Stored as interleaved real, imag floats.
The F16s and BF16s are encoded in little endian byte order.
Next = 15
Describes the path of the ColumnIO tablet to load.
Describes the field to load within the ColumnIO tablet.
Individual element shape, excluding rows.
Warning: ColumnIO does not support random-access, so use offset with caution in performance-critical scenarios.
Maximum number of elements (with shape element_shape) to load.
If more than one item is requested (via limit > 1), then this request attribute zips together the produced vectors.
Serialization of LogicalBuffer.
Used in:
The location where the buffer is defined.
Location represents an instruction and its shape index, which uniquely identifies a point where a buffer is needed.
Used in:
NOTE: module_name isn't necessary, since all LogicalBuffers are associated with a single HloModule.
Used in:
The dimensions over which to map. Example mapping a Dot operation along the batch dimension 0:

  operand0.shape = [2, 2, 2]
  operand1.shape = [2, 2, 3]
  Map({operand0, operand1}, Dot, {0})
Symbolization metadata for HLO Instructions. This metadata is used for debugging XLA code generation, as well as performance profiling of XLA-generated executables.
Used in:
The framework op name that generated this XLA op. Frameworks that build on top of XLA should mirror the names of their ops back to users by specifying the op_type. In this way, even if the framework's "ops" are implemented as multiple XLA HLO Ops, they can be grouped appropriately. (e.g. if a SoftMax layer is emitted into XLA as multiple ops, then each op should have the op_type be "SoftMax".)
The user-specified name of the op. This name is often unique within a computation. Note: some frameworks add auto-generated names if the user does not provide one.
Indicate a file and line that this op is associated to in a user's program. e.g. it could be the file and line of user code that generated the op.
Used in:
Next: 47
Used in:
The shape of the sharded tile.
The shape of the tile assignment tensor - this must be the same rank as tile_shape and the product of its dimensions must equal tile_assignment_devices.size().
Flattened list of device IDs. The order of flattening is the same as used by IndexUtil::MultiToLinearIndex(tile_assignment_shape).
If type == TUPLE, the sub-shardings, one per leaf node in the tuple shape, in pre-order. The tuple shape could be nested; here we store just a flattened list of all leaves in the tuple shape. Note that the tuple shape is not stored here; shardings do not store the shapes to which they are applied, this is inferred from the instruction this sharding gets attached to.
Used in:
This sharding is replicated across all devices (implies maximal, all other fields are unused).
This sharding is maximal - one device runs the entire operation.
This sharding is a tuple - only the tuple_shardings field is valid.
None of the above; tile_shape and tile_assignment are both used.
Describes a single operation request.
Used in:
For operations which call embedded computations such as "Map", these are the version(s) that the embedded computation should be called at. A version value of a computation is the ComputationDataHandle of the root of the computation at that point in time. "Call", "Map", "Reduce", and "ReduceWindow" operations take a single embedded computation, so this field will have a single value for those operations. The "While" operation takes two; index 0 is the "condition" version and index 1 is the "body" version.
The actual request, which in itself is a tagged union of all possible operation request types.
Used in:
The shape of the data returned by reading the device's outfeed buffer.
Operand to the Outfeed. Supports tuple.
Backend-specific information for how to perform the outfeed.
Used in:
Describes the padding configuration for Pad operation. The padding amount on both edges as well as between the elements are specified for each dimension.
Used in:
The padding configuration for all dimensions.
Describes the padding configuration for a dimension.
Used in:
Padding amount on the low-end (next to the index 0).
Padding amount on the high-end (next to the highest index).
Padding amount between the elements.
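The two messages above nest naturally; a sketch with assumed field names and numbers:

```proto
// Illustrative sketch of the Pad configuration.
message PaddingConfig {
  message PaddingConfigDimension {
    int64 edge_padding_low = 1;   // next to index 0
    int64 edge_padding_high = 2;  // next to the highest index
    int64 interior_padding = 3;   // between elements
  }
  repeated PaddingConfigDimension dimensions = 1;  // one per dimension
}
```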
Describes the value held inside padding elements.
Used in:
Zero padding must be 0-values that correspond to the shape's element type.
One padding must be 1-values that correspond to the shape's element type.
"Lowest" padding must be the lowest values in the shape's element type, used as padding for operations like max-accumulation.
"Highest" padding must be the largest values in the shape's element type, used as padding for operations like min-accumulation.
Unknown padding could be anything; e.g. floating NaNs!
Used in:
Primitive types are the individual values that can be held in rectangular multidimensional arrays. A description of the rectangular multidimensional array dimensions / primitive type is given by Shape, below.
Used in:
Invalid primitive type to serve as default.
Predicates are two-state booleans.
Signed integral values of fixed width.
Unsigned integral values of fixed width.
Floating-point values of fixed width. Note: if f16s are not natively supported on the device, they will be converted to f16 from f32 at arbitrary points in the computation.
Truncated 16 bit floating-point format. This is similar to IEEE's 16 bit floating-point format, but uses 1 bit for the sign, 8 bits for the exponent and 7 bits for the mantissa.
Complex values of fixed width.
Paired F32 (real, imag), as in std::complex<float>.
A tuple is a polymorphic sequence; e.g. a shape that holds different sub-shapes. They are used for things like returning multiple values from a computation; e.g. a computation that returns weights and biases may have a signature that results in a tuple like (f32[784x2000], f32[2000]). If a shape proto has the tuple element type, it may not have any entries in the dimensions field.
An opaque type used for passing context specific data to a custom operation.
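Collecting the value comments above into a single enum gives roughly the sketch below; the tag values are assumptions for illustration:

```proto
// Illustrative sketch of PrimitiveType; tag values assumed.
enum PrimitiveType {
  PRIMITIVE_TYPE_INVALID = 0;  // default / invalid
  PRED = 1;                    // two-state boolean
  S8 = 2;  S16 = 3;  S32 = 4;  S64 = 5;  // signed integers
  U8 = 6;  U16 = 7;  U32 = 8;  U64 = 9;  // unsigned integers
  F16 = 10;  F32 = 11;  F64 = 12;        // IEEE floating point
  TUPLE = 13;                  // polymorphic sequence of shapes
  OPAQUE = 14;                 // backend-specific context data
  C64 = 15;                    // paired F32 (real, imag)
  BF16 = 16;                   // truncated float: 1 sign/8 exp/7 mantissa
}
```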
Shape of the parameters and output of a computation (like a traditional function signature).
Used in:
Used in:
Creates a uniform-distribution-generated random number on the semi-open interval [parameter[0], parameter[1]).
Creates a normal-distribution-generated random number with mean parameter[0] and standard deviation parameter[1].
Used in:
Used in:
Used in:
Operand to the reduction.
Initial value for the reduction. This must be consistent with the result shape of to_apply.
The dimensions to reduce over.
The computation to apply in the reduction.
Used in:
(message has no fields)
Used in:
The dimension order for collapse (from fastest-changing to slowest).
The new dimension sizes (from dimension 0 to n-1).
Used in:
Used in:
Used in:
Operand array on which the windows slide.
Source array for the data to scatter.
Initial scalar value for each element in the output.
Window configuration.
Binary function used to select an element from each window.
Binary function used to combine each scattered value from source with the current output value at the selected location.
Used in:
Describes a sequence of operation requests which define an XLA computation.
Used in:
The ComputationHandle used to refer to this computation in the XLA service.
Map from ComputationDataHandle value to operation request. The highest ComputationDataHandle value corresponds to the root of the computation.
Describes a group of SessionComputations with an "entry point" computation that may refer to the other non-entry (AKA embedded) computations. This message is used to serialize a computation that has been built via the XLA service API, along with its dependencies, for purposes such as analysis/replay/file-storage.
Used in:
The entry computation, which was requested for serialization. This may have referred to embedded computations, which are reflected below.
Embedded computations that are transitively referred to by the entry computation.
The arguments passed to the computation.
The result of the computation.
The name of the platform used to run the computation.
(message has no fields)
A shape describes the number of dimensions in the array, the size of each dimension, and the primitive component type. Tuples are a special case in that they have rank zero and have tuple_shapes defined. See the XLA documentation for more information on shapes and layouts.
Used in:
The element type for this shape.
The size (number of elements) for each dimension. In XLA, dimensions are numbered from 0 to N-1 for an N-dimensional array. The first element of 'dimensions' is the size of dimension 0, the second element is the size of dimension 1, and so forth. Empty list indicates a scalar.
For tuples only, the shapes of constituent shapes in the tuple sequence.
The layout used to back this shape.
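A sketch of the Shape message per the field comments above; field numbers are assumed, and PrimitiveType/Layout are the types described elsewhere in this section:

```proto
// Illustrative sketch of Shape; names/numbers assumed.
message Shape {
  PrimitiveType element_type = 1;
  repeated int64 dimensions = 2;    // empty list indicates a scalar
  repeated Shape tuple_shapes = 3;  // tuples only
  Layout layout = 4;
}
```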
Used in:
(message has no fields)
Used in:
Used in:
Given a predicate and two operands, selects operand0 if the predicate is true and operand1 if the predicate is false.
Given a min, a max, and an operand, returns the operand if it is between min and max; otherwise returns min if the operand is less than min, or max if the operand is greater than max.
Used in:
This optional field directs the service to return the literal in this layout. A shape is used to hold the layout to accommodate tuples.
(message has no fields)
Used in:
The permutation of the operand's dimensions (in the range 0 to n-1).
Used in:
Used in:
Elementwise, logical negation on booleans and bitwise negation on ints.
Elementwise, computes e^x.
Elementwise, computes -x.
Puts the elements in the operand into sorted order.
Elementwise, computes tanh(x).
Elementwise, computes the natural logarithm of x.
Elementwise, computes the floor of x.
Elementwise, computes the ceil of x.
Elementwise, computes the abs of x.
Elementwise, computes the sign of x.
Elementwise, tests if values are finite (not NaN or inf).
Elementwise, computes the cosine of x.
Elementwise, computes the sine of x.
Elementwise, rounds x to nearest integral value, rounding half-way cases away from zero.
Elementwise, extract real component of complex x.
Elementwise, extract imaginary component of complex x.
(message has no fields)
Used in:
Used in:
Creates a tuple from its operands.
Used in:
Describes the windowing in an operation such as convolution. The window is moved across a base area and for each position of the window a computation is performed. The field below describes the window and the movement of the window across a base area.
Used in:
Used in:
The size of the window in this dimension. For a rectangle, this would be the width or height.
The stride at which the window moves across the base area in this dimension. In other words, this is the spacing between different positions of the window in this dimension.
If positive, means the amount of padding with zeroes to add to the base area at the low end of this dimension; if negative, its negative means the number of elements removed from the low end of this dimension. For example, in the horizontal dimension of a rectangle, this would be the number of zeroes to pad on the left, given that indices increase when going right.
As padding_low, but on the high end of this dimension. For example, in the horizontal dimension of a rectangle, this would be the number of zeroes to pad on the right, given that indices increase when going right.
Dilation factor of the sliding window in this dimension. A dilation factor of 1 means no dilation. window_dilation - 1 no-op entries ("holes") are implicitly placed between each kernel element. See documentation for convolution.
Dilation factor of the base area in this dimension. A dilation factor of 1 means no dilation. base_dilation - 1 no-op entries ("holes") are implicitly placed between each base area element. See documentation for convolution.
Window reversal means that this dimension was logically reversed before the operation.
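The per-dimension comments above, plus the containing Window message, suggest the sketch below; field names and numbers are assumptions based on the comments:

```proto
// Illustrative sketch of WindowDimension and its containing Window message.
message WindowDimension {
  int64 size = 1;             // window extent in this dimension
  int64 stride = 2;           // spacing between window positions
  int64 padding_low = 3;      // negative values remove elements
  int64 padding_high = 4;
  int64 window_dilation = 5;  // 1 means no dilation
  int64 base_dilation = 6;    // 1 means no dilation
  bool window_reversal = 7;   // dimension logically reversed first
}

message Window {
  repeated WindowDimension dimensions = 1;
}
```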