package stream_executor.dnn

enum ActivationMode

Describes a kind of non-linearity (threshold-like mathematical function).

Used in: tensorflow.ConvParametersProto.Fusion, tensorflow.ConvolutionProto, tensorflow.MatmulParametersProto, tensorflow.MatmulProto

kNone = 0
kSigmoid = 1
kRelu = 2
Rectified linear activation: f(x) = x < 0 ? 0 : x
kRelu6 = 3
Rectified linear activation, clipped to an upper bound of 6.0.
kReluX = 4
Rectified linear activation, clipped to the upper bound given by BatchDescriptor::value_max().
kTanh = 5
kBandPass = 6
Like ReluX, but passes all values in the range [-X, X].
kElu = 7
Exponential linear activation: f(x) = x < 0 ? e^x - 1 : x
kLeakyRelu = 8
Leaky Rectified linear activation: f(x) = x < 0 ? alpha * x : x
kGeluExact = 9
Gaussian error linear unit activation: f(x) = x * P(X <= x) = 0.5 * x * (1 + erf(x / sqrt(2))), where X ~ N(0, 1).
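
The formulas listed for the activation modes can be written out directly. The following is a minimal Python sketch, not StreamExecutor code: the function names are hypothetical, `value_max` stands in for BatchDescriptor::value_max(), `alpha` is passed explicitly because the enum comment does not say where it comes from, and the kBandPass behavior (clamping to [-X, X]) is an assumption based on the one-line description above.

```python
import math


def relu(x: float) -> float:
    # kRelu: f(x) = x < 0 ? 0 : x
    return 0.0 if x < 0 else x


def relu_x(x: float, value_max: float) -> float:
    # kReluX: ReLU clipped to an upper bound (value_max is a stand-in
    # for BatchDescriptor::value_max()).
    return min(relu(x), value_max)


def relu6(x: float) -> float:
    # kRelu6: ReluX with the upper bound fixed at 6.0.
    return relu_x(x, 6.0)


def band_pass(x: float, value_max: float) -> float:
    # kBandPass: ASSUMED to clamp x to [-value_max, value_max].
    return max(-value_max, min(x, value_max))


def elu(x: float) -> float:
    # kElu: f(x) = x < 0 ? e^x - 1 : x
    return math.expm1(x) if x < 0 else x


def leaky_relu(x: float, alpha: float) -> float:
    # kLeakyRelu: f(x) = x < 0 ? alpha * x : x
    return alpha * x if x < 0 else x


def gelu_exact(x: float) -> float:
    # kGeluExact: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For example, `relu6(7.0)` is clipped to 6.0, while `band_pass(-5.0, 2.0)` is clamped to -2.0 under the assumption stated above.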

message AlgorithmConfigProto

Proto definition of AlgorithmConfig in "dnn.h". TODO(ruochengw): After cl/380702564 is submitted, add support for algorithm configs with cuDNN Frontend APIs.

Used in: tensorflow.ConvMapProto.Entry

oneof optional_algorithm
Each single-member oneof emulates proto2 optional semantics, since older versions of proto3 cannot distinguish an unset field from a field set to its default value.
- AlgorithmProto algorithm = 1
oneof optional_algorithm_no_scratch
- AlgorithmProto algorithm_no_scratch = 2
oneof optional_scratch_size
- int64 scratch_size = 3

message AlgorithmProto

Generic algorithm representation.

Used in: AlgorithmConfigProto, tensorflow.AutotuneResult, tensorflow.AutotuneResult.FailureResult, xla.gpu.CudnnConvBackendConfig

int64 algo_id = 1
AlgorithmProto.MathType math_type = 2
map<int64, int64> tuning_knobs = 4
bool is_cudnn_frontend = 5
Legacy algorithm enums and cuDNN Frontend engine numbers need to coexist in the same proto medium-term, until we can be confident of no longer needing the legacy cuDNN convolution API. Once the migration is complete, we can stop producing legacy algorithm enums and remove this field.
optional google.protobuf.UInt64Value workspace_size = 6
On ROCm, it is impossible to re-query the required workspace size after running the algorithm search, so the workspace size must be stored along with the choice of algorithm. For consistency and convenience, cuDNN uses this field in the same way, even though it would be possible to re-query the workspace size from cuDNN at each use. Since this message is persisted in files, we need to be able to distinguish a workspace size of 0 from an unknown workspace size in an old message, so this is a message field (a UInt64Value wrapper) rather than a plain scalar.
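
The difference between "0" and "unknown" described for workspace_size can be illustrated outside of protobuf. The following is a minimal Python sketch under stated assumptions: `AlgorithmSketch` and `describe_workspace` are hypothetical names, and `None` models an absent UInt64Value wrapper. It is not the generated protobuf API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlgorithmSketch:
    """Hypothetical stand-in for two of the AlgorithmProto fields."""
    algo_id: int = 0
    # None models an absent wrapper message (e.g. an old persisted file);
    # an int, including 0, models a wrapper that is present.
    workspace_size: Optional[int] = None


def describe_workspace(algo: AlgorithmSketch) -> str:
    # A plain integer field would read as 0 in both cases; the wrapper
    # keeps "never recorded" separate from "recorded as zero".
    if algo.workspace_size is None:
        return "unknown"
    return str(algo.workspace_size)
```

Here `describe_workspace(AlgorithmSketch(algo_id=1))` reports "unknown", while `describe_workspace(AlgorithmSketch(algo_id=1, workspace_size=0))` reports "0".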

enum AlgorithmProto.MathType

Used in: AlgorithmProto

DEFAULT_MATH = 0
TENSOR_OP_MATH = 1
The GPU may perform 4x4 matrix FMA operations. See cuDNN's documentation for CUDNN_TENSOR_OP_MATH.

message ConvolutionDescriptorProto

Convolution-specific parameters.

Used in: tensorflow.ConvolutionProto

repeated int64 paddings = 1
repeated int64 strides = 2
repeated int64 dilations = 3
DataType compute_mode = 4
The "accumulator" type. For example, use F32 as an accumulator for F16 convolutions. See the computeType argument of cuDNN's cudnnSetConvolutionNdDescriptor.
int32 group_count = 5
See cuDNN's group count.
ConvolutionMode convolution_mode = 6
string name = 7
TensorFlow node name, same as in NodeDef, for debugging purposes.
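
The paddings, strides, and dilations fields hold one value per spatial dimension. As an illustration of how such values are commonly combined, the following Python sketch applies the standard convolution output-size formula, out = (in + 2 * pad - ((filter - 1) * dilation + 1)) / stride + 1 with integer division. The function name is hypothetical, and symmetric padding is an assumption, not something the proto states.

```python
from typing import List, Sequence


def conv_output_dims(
    input_dims: Sequence[int],
    filter_dims: Sequence[int],
    paddings: Sequence[int],
    strides: Sequence[int],
    dilations: Sequence[int],
) -> List[int]:
    """Spatial output sizes, ASSUMING the same padding on both sides."""
    out = []
    for size, k, pad, stride, dilation in zip(
        input_dims, filter_dims, paddings, strides, dilations
    ):
        # A dilated filter covers (k - 1) * dilation + 1 input positions.
        effective_filter = (k - 1) * dilation + 1
        out.append((size + 2 * pad - effective_filter) // stride + 1)
    return out
```

For a 224x224 input with a 7x7 filter, padding 3, stride 2, and dilation 1, this gives 112x112.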

enum ConvolutionKind

Used in: tensorflow.ConvolutionProto

INVALID = 0
FORWARD = 1
BACKWARD_FILTER = 2
BACKWARD_DATA = 3
FORWARD_BIAS_ACTIVATION = 4

enum ConvolutionMode

Describes the mathematical definition of the conv op. The commonly used behavior is, mathematically, cross-correlation, even though the operation is often referred to as convolution. See cuDNN's cudnnConvolutionMode_t.

Used in: ConvolutionDescriptorProto

CROSS_CORRELATION = 0
CONVOLUTION = 1
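
The difference between the two modes is easiest to see in one dimension. The following is a minimal Python sketch with hypothetical function names and no padding or striding: cross-correlation slides the kernel as-is, and convolution in the mathematical sense is the same computation with the kernel flipped.

```python
from typing import List, Sequence


def cross_correlate_1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    # CROSS_CORRELATION: slide the kernel over the signal without flipping it.
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def convolve_1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    # CONVOLUTION: the same sliding sum, but with the kernel reversed.
    return cross_correlate_1d(signal, list(kernel)[::-1])
```

With `signal = [1, 2, 3, 4]` and `kernel = [1, 0, -1]`, cross-correlation yields `[-2, -2]` and convolution yields `[2, 2]`. For a symmetric kernel the two modes agree.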

enum DataLayout

Describes how a convolution input or output layer's data is formatted.

Used in: TensorDescriptorProto

kYXDepthBatch = 0
Naming convention:
  Y <-> row or height
  X <-> column or width
  Batch <-> batch, or N
  Depth <-> feature, or channel
TODO(timshen): turn them into cuDNN names, e.g. kNCHW.
Note: In cuDNN, kBatchDepthYX4 and kBatchDepthYX32 are the same layout (namely, NCHW_VECT_C). It differentiates between these two by using a different data type (int8x4 vs int8x32). In StreamExecutor we use different layouts for these, because we don't usually pass an explicit data type to StreamExecutor functions.
kYXBatchDepth = 1
kBatchYXDepth = 2
cuDNN's NHWC layout
kBatchDepthYX = 3
cuDNN's NCHW layout
kBatchDepthYX4 = 4
cuDNN's NCHW_VECT_C with 4-elem vectors (e.g. int8x4)
kBatchDepthYX32 = 5
cuDNN's NCHW_VECT_C with 32-elem vectors (e.g. int8x32)
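
To make the layout names concrete, the following Python sketch computes the flat element offset of logical coordinate (n, c, y, x) for three of the layouts. It is illustrative only: the function names are hypothetical, `dims` is assumed to be the logical (N, C, H, W) sizes, storage is assumed dense, and for the vectorized layout C is assumed to be a multiple of the vector width.

```python
from typing import Tuple

Dims = Tuple[int, int, int, int]  # logical sizes (N, C, H, W)


def offset_batch_depth_yx(n: int, c: int, y: int, x: int, dims: Dims) -> int:
    # kBatchDepthYX (NCHW): x varies fastest, then y, then depth, then batch.
    _, C, H, W = dims
    return ((n * C + c) * H + y) * W + x


def offset_batch_yx_depth(n: int, c: int, y: int, x: int, dims: Dims) -> int:
    # kBatchYXDepth (NHWC): depth varies fastest.
    _, C, H, W = dims
    return ((n * H + y) * W + x) * C + c


def offset_batch_depth_yx_vect(
    n: int, c: int, y: int, x: int, dims: Dims, vect: int
) -> int:
    # kBatchDepthYX4 / kBatchDepthYX32 (NCHW_VECT_C): channels are grouped
    # into blocks of `vect`; within a block the channel index varies fastest.
    _, C, H, W = dims
    block, lane = divmod(c, vect)
    return (((n * (C // vect) + block) * H + y) * W + x) * vect + lane
```

With `dims = (2, 8, 3, 3)`, moving from channel 0 to channel 1 at the same pixel advances the offset by 9 in NCHW, but by 1 in NHWC and in the 4-element vectorized layout.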

enum DataType

Specifies the data type used by an operation.

Used in: ConvolutionDescriptorProto, TensorDescriptorProto, tensorflow.MatmulProto

kFloat = 0
kDouble = 1
kHalf = 2
kInt8 = 3
kInt32 = 4
kComplexFloat = 5
kComplexDouble = 6
kBF16 = 7

enum FilterLayout

Describes how a convolution filter is laid out in memory.

Used in: TensorDescriptorProto

kOutputInputYX = 0
Naming convention:
  Y <-> row or height
  X <-> column or width
  Output <-> output feature, or N
  Input <-> input feature, or C
TODO(timshen): turn them into cuDNN names, e.g. kNCHW.
cuDNN's NCHW layout
kOutputYXInput = 1
cuDNN's NHWC layout
kOutputInputYX4 = 2
cuDNN's NCHW_VECT_C layout with 4-elem vectors
kOutputInputYX32 = 5
cuDNN's NCHW_VECT_C layout with 32-elem vectors
kInputYXOutput = 3
kYXInputOutput = 4

message TensorDescriptorProto

Generic tensor representation.

Used in: tensorflow.ConvolutionProto

repeated int64 dimensions = 1
DataType data_type = 2
oneof layout_oneof
- DataLayout data_layout = 3
- FilterLayout filter_layout = 4
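
As a small worked example of how dimensions and data_type relate, the following Python sketch estimates the storage size of a tensor. The names `ELEMENT_SIZE_BYTES` and `tensor_size_bytes` are hypothetical, the element widths are the usual ones for these types (a complex value is two floats or two doubles), and dense storage without padding is an assumption.

```python
from math import prod
from typing import Sequence

# Usual element widths in bytes for each DataType name (assumed, not
# taken from the proto itself).
ELEMENT_SIZE_BYTES = {
    "kFloat": 4,
    "kDouble": 8,
    "kHalf": 2,
    "kInt8": 1,
    "kInt32": 4,
    "kComplexFloat": 8,
    "kComplexDouble": 16,
    "kBF16": 2,
}


def tensor_size_bytes(dimensions: Sequence[int], data_type: str) -> int:
    """Bytes needed for a densely packed tensor with the given dimensions."""
    return prod(dimensions) * ELEMENT_SIZE_BYTES[data_type]
```

A kHalf tensor with dimensions [2, 3, 4, 4] holds 96 elements, so `tensor_size_bytes([2, 3, 4, 4], "kHalf")` is 192.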

enum ActivationMode

kNone = 0

kSigmoid = 1

kRelu = 2

kRelu6 = 3

kReluX = 4

kTanh = 5

kBandPass = 6

kElu = 7

kLeakyRelu = 8

kGeluExact = 9

message AlgorithmConfigProto

oneof optional_algorithm

AlgorithmProto algorithm = 1

oneof optional_algorithm_no_scratch

AlgorithmProto algorithm_no_scratch = 2

oneof optional_scratch_size

int64 scratch_size = 3

message AlgorithmProto

int64 algo_id = 1

AlgorithmProto.MathType math_type = 2

map<int64, int64> tuning_knobs = 4

bool is_cudnn_frontend = 5

optional google.protobuf.UInt64Value workspace_size = 6

enum AlgorithmProto.MathType

DEFAULT_MATH = 0

TENSOR_OP_MATH = 1

message ConvolutionDescriptorProto

repeated int64 paddings = 1

repeated int64 strides = 2

repeated int64 dilations = 3

DataType compute_mode = 4

int32 group_count = 5

ConvolutionMode convolution_mode = 6

string name = 7

enum ConvolutionKind

INVALID = 0

FORWARD = 1

BACKWARD_FILTER = 2

BACKWARD_DATA = 3

FORWARD_BIAS_ACTIVATION = 4

enum ConvolutionMode

CROSS_CORRELATION = 0

CONVOLUTION = 1

enum DataLayout

kYXDepthBatch = 0

kYXBatchDepth = 1

kBatchYXDepth = 2

kBatchDepthYX = 3

kBatchDepthYX4 = 4

kBatchDepthYX32 = 5

enum DataType

kFloat = 0

kDouble = 1

kHalf = 2

kInt8 = 3

kInt32 = 4

kComplexFloat = 5

kComplexDouble = 6

kBF16 = 7

enum FilterLayout

kOutputInputYX = 0

kOutputYXInput = 1

kOutputInputYX4 = 2

kOutputInputYX32 = 5

kInputYXOutput = 3

kYXInputOutput = 4

message TensorDescriptorProto

repeated int64 dimensions = 1

DataType data_type = 2

oneof layout_oneof

DataLayout data_layout = 3

FilterLayout filter_layout = 4