package tensorflow.metadata.v0

message AUC

metric.proto:168

Area under curve for the ROC-curve. https://www.tensorflow.org/api_docs/python/tf/keras/metrics/AUC

Used in: PerformanceMetric

(message has no fields)

message AUCPrecisionRecall

metric.proto:174

Area under curve for the precision-recall-curve. https://www.tensorflow.org/api_docs/python/tf/keras/metrics/AUC

Used in: PerformanceMetric

(message has no fields)

message AllowlistDeriver

derived_feature.proto:61

Used in: DerivedFeatureConfig

message Annotation

schema.proto:265

Additional information about the schema or about a feature.

Used in: Feature, Schema

message Anomalies

anomalies.proto:305

Message to represent the anomalies, which describe the mismatches (if any) between the stats and the schema.

enum Anomalies.AnomalyNameFormat

anomalies.proto:316

Map from a column to the difference that it represents.

Used in: Anomalies

message AnomalyInfo

anomalies.proto:31

Message to represent information about an individual anomaly.

Used in: Anomalies

message AnomalyInfo.Reason

anomalies.proto:264

Reason for the anomaly. There may be more than one reason, e.g. the field might be missing sometimes AND a new value is present.

Used in: AnomalyInfo

enum AnomalyInfo.Severity

anomalies.proto:39

Used in: AnomalyInfo

enum AnomalyInfo.Type

anomalies.proto:56

Next ID: 89

Used in: Reason

message ArgmaxTopK

derived_feature.proto:66

Used in: DerivedFeatureConfig

message AudioDomain

schema.proto:644

Audio data.

Used in: Feature

(message has no fields)

message BinaryAccuracy

metric.proto:47

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/binary_accuracy

Used in: PerformanceMetric

(message has no fields)

message BinaryClassification

problem_statement.proto:49

Configuration for a binary classification task. The output is one of two possible class labels, encoded as the same type as the label column. BinaryClassification is the same as MultiClassClassification with n_classes = 2.

Used in: Type

message BinaryClassification.PositiveNegativeSpec

problem_statement.proto:63

Defines which label value is the positive and/or negative class.

Used in: BinaryClassification

message BinaryClassification.PositiveNegativeSpec.LabelValue

problem_statement.proto:66

Specifies a label's value which can be used for positive/negative class specification.

Used in: PositiveNegativeSpec

message BinaryCrossEntropy

metric.proto:152

Binary cross entropy as a metric is equal to the negative log likelihood (see logistic regression). In addition, when used to solve a binary classification task, binary cross entropy implies that the binary label will maximize binary accuracy. binary_crossentropy(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/binary_crossentropy

Used in: PerformanceMetric

(message has no fields)

message BlockUtility

metric.proto:278

DEPRECATED

Used in: PerformanceMetric

message BoolDomain

schema.proto:603

Encodes information about the domain of a boolean attribute that encodes its TRUE/FALSE values as strings, or 0=false, 1=true. Note that FeatureType could be either INT or BYTES.

Used in: Feature

message BytesStatistics

statistics.proto:399

Statistics for a bytes feature in a dataset.

Used in: FeatureNameStatistics

message CategoricalAccuracy

metric.proto:53

categorical_accuracy(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/categorical_accuracy

Used in: PerformanceMetric

(message has no fields)

message CategoricalCrossEntropy

metric.proto:59

categorical_crossentropy(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/categorical_crossentropy

Used in: PerformanceMetric

(message has no fields)

message CategoricalCrossStatistics

statistics.proto:84

Used in: CrossFeatureStatistics

message ChangedRegion

anomalies.proto:399

Describes a chunk that represents changes in both artifacts over the same number of lines.

Used in: DiffRegion

message CommonStatistics

statistics.proto:446

Common statistics for all feature types. Statistics counting number of values (i.e., min_num_values, max_num_values, avg_num_values, and tot_num_values) include NaNs. For nested features with N nested levels (N > 1), the statistics counting number of values will rely on the innermost level.

Used in: BytesStatistics, NumericStatistics, StringStatistics, StructStatistics

message ContentChunkDomain

schema.proto:650

ContentChunk data.

Used in: Feature

(message has no fields)

message Cosine

metric.proto:67

cosine(...) cosine_proximity(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/cosine_proximity DEPRECATED

Used in: PerformanceMetric

(message has no fields)

message CrossFeatureStatistics

statistics.proto:61

NextID: 8

Used in: DatasetFeatureStatistics

message CustomMetric

metric.proto:288

A custom metric. Prefer using or adding an explicit metric message and only use this generic message as a last resort. NEXT_TAG: 4

Used in: PerformanceMetric

message CustomMetric.RegistrySpec

metric.proto:304

RegistrySpec is a full specification of the custom metric and its construction based on the binary’s metric registry. New custom metrics must be linked to the binary and registered in its metric registry to be identifiable via this specification.

Used in: CustomMetric

message CustomStatistic

statistics.proto:222

Stores the name and value of any custom statistic. The value can be a string, double, or histogram.

Used in: FeatureNameStatistics

message DatasetConstraints

schema.proto:289

Constraints on the entire dataset.

Used in: Schema

message DatasetFeatureStatistics

statistics.proto:44

The feature statistics for a single dataset.

Used in: DatasetFeatureStatisticsList

message DatasetFeatureStatisticsList

statistics.proto:39

A list of feature statistics for different datasets. If you wish to compare different datasets using this list, then the DatasetFeatureStatistics entries should all contain the same list of features.

message DerivedFeatureConfig

derived_feature.proto:51

Stores configuration for a variety of canned feature derivers. TODO(b/227478330): Consider validating config in merge_util.cc.

Used in: DerivedFeatureSource

message DerivedFeatureSource

derived_feature.proto:32

DerivedFeatureSource tracks information about the source of a derived feature. Derived features are computed from ordinary features for the purposes of statistics collection and validation, but do not exist in the dataset. Experimental and subject to change.

Used in: Feature, FeatureNameStatistics

message DiffRegion

anomalies.proto:364

Describes a region in the comparison between two text artifacts. Note that a region also contains the contents of the two artifacts that correspond to the region.

Used in: AnomalyInfo

message DistributionConstraints

schema.proto:461

Models constraints on the distribution of a feature's values. TODO(martinz): replace min_domain_mass with max_off_domain (but slowly).

Used in: Feature

message DriftSkewInfo

anomalies.proto:275

Message to contain the result of the drift/skew measurements for a feature.

Used in: Anomalies

message DriftSkewInfo.Measurement

anomalies.proto:276

Used in: DriftSkewInfo

enum DriftSkewInfo.Measurement.Type

anomalies.proto:277

Used in: Measurement

message DynamicClassSpec

problem_statement.proto:83

Specifies a dynamic multiclass/multi-label problem where the number of label classes is inferred from the data.

Used in: MultiClassClassification, MultiLabelClassification

message DynamicClassSpec.OovClassSpec

problem_statement.proto:87

Note: it is up to a solution provider to implement support for OOV labels. Note: both a frequency_threshold and a top_k may be set. A class is grouped into the OOV class if it fails to meet either of the criteria below.

Used in: DynamicClassSpec
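
The "fails to meet either of the criteria" rule can be illustrated with a minimal sketch. It assumes that frequency_threshold is a minimum absolute count and that top_k keeps the k most frequent classes; the helper name split_oov and both interpretations are hypothetical, and the proto's actual semantics may differ.

```python
from collections import Counter


def split_oov(labels, frequency_threshold=None, top_k=None):
    """Split observed classes into (kept, oov) sets.

    Assumption: a class is kept only if it passes every criterion that is
    set, so failing either one sends it to the OOV group.
    """
    counts = Counter(labels)
    ranked = [label for label, _ in counts.most_common()]
    kept = set(ranked)
    if top_k is not None:
        # Keep only the top_k most frequent classes.
        kept &= set(ranked[:top_k])
    if frequency_threshold is not None:
        # Keep only classes seen at least frequency_threshold times.
        kept &= {label for label, n in counts.items() if n >= frequency_threshold}
    return kept, set(ranked) - kept


labels = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
kept, oov = split_oov(labels, frequency_threshold=2, top_k=3)
```

In this example "c" would survive top_k=3 on its own but is still grouped into the OOV class because it misses the frequency criterion.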

message FalseNegativeRateAtThreshold

metric.proto:210

message FalsePositiveRateAtThreshold

metric.proto:224

message Feature

schema.proto:148

Describes schema-level information about a specific feature. NextID: 39

Used in: Schema, StructDomain

message FeatureComparator

schema.proto:771

Used in: Feature

message FeatureCoverageConstraints

schema.proto:470

Encodes vocabulary coverage constraints.

Used in: NaturalLanguageDomain

message FeatureNameStatistics

statistics.proto:152

The complete set of statistics for a given feature name for a dataset. NextID: 11

Used in: DatasetFeatureStatistics

enum FeatureNameStatistics.Type

statistics.proto:157

The types supported by the feature statistics. When aggregating tf.Examples, if the bytelist contains a string, it is recommended to encode it here as STRING instead of BYTES in order to calculate string-specific statistical measures.

Used in: FeatureNameStatistics

message FeaturePresence

schema.proto:716

Describes constraints on the presence of the feature in the data.

Used in: Feature, SparseFeature

message FeaturePresenceWithinGroup

schema.proto:725

Records constraints on the presence of a feature inside a "group" context (e.g., .presence inside a group of features that define a sequence).

Used in: Feature

enum FeatureType

schema.proto:707

Describes the physical representation of a feature. It may be different than the logical representation, which is represented as a Domain.

Used in: Feature, SparseFeature

message FixedShape

schema.proto:319

Specifies a fixed shape for the feature's values. The immediate implication is that each feature has a fixed number of values. Moreover, these values can be parsed in a multi-dimensional tensor using the specified axis sizes. The FixedShape defines a lexicographical ordering of the data. For instance, if there is a FixedShape
  { dim {size:3} dim {size:2} }
then
  tensor[0][0]=field[0]
  tensor[0][1]=field[1]
  tensor[1][0]=field[2]
  tensor[1][1]=field[3]
  tensor[2][0]=field[4]
  tensor[2][1]=field[5]
The FixedShape message is identical with the tensorflow.TensorShape proto message for fully defined shapes. The FixedShape message cannot represent unknown dimensions or an unknown rank.

Used in: Feature, SparseFeature, TensorRepresentation.DenseTensor, TensorRepresentation.SparseTensor
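
The lexicographical ordering described for FixedShape is ordinary row-major indexing. The following is a small illustrative sketch in plain Python; the helper name unflatten is hypothetical and not part of any TensorFlow Metadata API.

```python
from itertools import product


def unflatten(values, dims):
    """Map a flat list of values to {index tuple: value} in row-major order."""
    expected = 1
    for size in dims:
        expected *= size
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    # product() yields index tuples lexicographically: the last axis varies fastest.
    indices = product(*(range(size) for size in dims))
    return dict(zip(indices, values))


# FixedShape { dim {size:3} dim {size:2} } applied to field[0..5].
tensor = unflatten([0, 1, 2, 3, 4, 5], [3, 2])
```

Here tensor[(1, 0)] holds field[2] and tensor[(2, 1)] holds field[5], matching the listing in the description.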

message FixedShape.Dim

schema.proto:325

An axis in a multi-dimensional feature representation.

Used in: FixedShape

message FloatDomain

schema.proto:538

Encodes information for domains of float values. Note that FeatureType could be either INT or BYTES.

Used in: Feature, Schema

message HiddenRegion

anomalies.proto:410

A chunk that represents identical lines, whose contents are hidden.

Used in: DiffRegion

message Hinge

metric.proto:75

Linear Hinge Loss hinge(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/hinge DEPRECATED

Used in: PerformanceMetric

(message has no fields)

message Histogram

statistics.proto:484

The data used to create a histogram of a numeric feature for a dataset.

Used in: CommonStatistics, CustomStatistic, NaturalLanguageStatistics, NaturalLanguageStatistics.TokenStatistics, NumericStatistics, WeightedNaturalLanguageStatistics, WeightedNumericStatistics

message Histogram.Bucket

statistics.proto:489

Each bucket defines its low and high values along with its count. The low and high values must be a real number or positive or negative infinity. They cannot be NaN or undefined. Counts of those special values can be found in the numNaN and numUndefined fields.

Used in: Histogram

enum Histogram.HistogramType

statistics.proto:516

The type of the histogram. A standard histogram has equal-width buckets. The quantiles type is used when the histogram message stores quantile information (by using approximately equal-count buckets with variable widths).

Used in: Histogram
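
To make the difference between the two histogram types concrete, here is a rough sketch of how bucket boundaries could be chosen in each case. Both helper names are hypothetical, and the quantile rule shown (picking evenly spaced order statistics) is only one simple approximation, not necessarily what any statistics generator does.

```python
def equal_width_boundaries(values, num_buckets):
    """Boundaries for a standard histogram: every bucket has the same width."""
    low, high = min(values), max(values)
    width = (high - low) / num_buckets
    return [low + i * width for i in range(num_buckets + 1)]


def quantile_boundaries(values, num_buckets):
    """Boundaries for a quantiles histogram: buckets hold roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    # Pick evenly spaced order statistics; clamp the last index to n - 1.
    return [ordered[min(n - 1, (i * n) // num_buckets)] for i in range(num_buckets + 1)]


values = [1, 2, 3, 4, 5, 6, 7, 100]
standard = equal_width_boundaries(values, 4)
quantiles = quantile_boundaries(values, 4)
```

With the single outlier 100, the equal-width boundaries put almost every value in the first bucket, while the quantile boundaries stay close to the bulk of the data.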

message HistogramSelection

schema.proto:739

Used in: JensenShannonDivergence

enum HistogramSelection.Type

schema.proto:743

Type controls the source of the histogram used for numeric drift and skew calculations. Currently the default is STANDARD. Calculations based on QUANTILES are more robust to outliers.

Used in: HistogramSelection

message ImageDomain

schema.proto:634

Image data.

Used in: Feature

message ImageQualityDeriver

derived_feature.proto:102

Used in: DerivedFeatureConfig

message InfinityNorm

schema.proto:733

Checks that the L-infinity norm is below a certain threshold between the two discrete distributions. Since this is applied to a FeatureNameStatistics, it only considers the top k. L_infty(p,q) = max_i |p_i-q_i|

Used in: FeatureComparator
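
The formula L_infty(p,q) = max_i |p_i-q_i| can be sketched as follows. This is an illustration only: it assumes the two distributions are given as value-to-count dictionaries and normalizes them to probabilities, and it does not restrict itself to the top k values as the description mentions. The function name is hypothetical.

```python
def l_infinity_distance(p_counts, q_counts):
    """Return max_i |p_i - q_i| for two discrete distributions given as counts."""
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    if p_total == 0 or q_total == 0:
        raise ValueError("both distributions need at least one observation")
    # A value missing from one side has probability 0 there.
    keys = set(p_counts) | set(q_counts)
    return max(
        abs(p_counts.get(k, 0) / p_total - q_counts.get(k, 0) / q_total)
        for k in keys
    )


distance = l_infinity_distance({"a": 3, "b": 1}, {"a": 1, "b": 1, "c": 2})
```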

message IntDomain

schema.proto:522

Encodes information for domains of integer values. Note that FeatureType could be either INT or BYTES.

Used in: Feature, Schema

message JensenShannonDivergence

schema.proto:753

Checks that the approximate Jensen-Shannon Divergence is below a certain threshold between the two distributions.

Used in: FeatureComparator
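
For orientation, the textbook Jensen-Shannon divergence between two aligned discrete distributions can be computed as below. This is a sketch under assumptions: it uses base-2 logarithms (so the result lies in [0, 1]) and exact bucket probabilities, whereas the description refers to an approximate divergence whose details are not given here. The function name is hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) of two aligned probability lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of buckets")

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL divergence.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    # m is the midpoint distribution; it is nonzero wherever p or q is nonzero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


same = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

Identical distributions give 0 and distributions with disjoint support give 1, which is why a threshold between those extremes is meaningful.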

message KullbackLeiblerDivergence

metric.proto:84

kld(...) kullback_leibler_divergence(...) KLD(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/kullback_leibler_divergence DEPRECATED

Used in: PerformanceMetric

(message has no fields)

enum LifecycleStage

schema.proto:32

LifecycleStage. Only UNKNOWN_STAGE, BETA, PRODUCTION, and VALIDATION_DERIVED features are actually validated. PLANNED, ALPHA, DISABLED, and DEBUG are treated as DEPRECATED.

Used in: Feature, SparseFeature, WeightedFeature

message LiftSeries

statistics.proto:99

Container for lift information for a specific y-value.

Used in: LiftStatistics

message LiftSeries.Bucket

statistics.proto:101

A bucket for referring to binned numeric features.

Used in: LiftSeries

message LiftSeries.LiftValue

statistics.proto:125

A container for lift information about a specific value of path_x.

Used in: LiftSeries

message LiftStatistics

statistics.proto:88

Used in: CategoricalCrossStatistics

message LogisticRegression

metric.proto:162

AKA the negative log likelihood or log loss. Given a label y in {0, 1} and a predicted probability p in [0, 1]: -y ln(p) - (1 - y) ln(1 - p). TODO(martinz): if this is interpreted the same as binary_cross_entropy, we may need to revisit the semantics. DEPRECATED

Used in: PerformanceMetric

(message has no fields)
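
The negative log likelihood for a single example can be written out as a short sketch. The clipping constant eps is an assumption added here to avoid taking ln(0); it is not part of the definition, and the function name is hypothetical.

```python
import math


def log_loss(y, p, eps=1e-12):
    """Negative log likelihood of label y in {0, 1} under predicted probability p."""
    if y not in (0, 1):
        raise ValueError("label must be 0 or 1")
    # Clip p away from 0 and 1 so both logarithms are defined.
    p = min(max(p, eps), 1 - eps)
    return -y * math.log(p) - (1 - y) * math.log(1 - p)


confident_right = log_loss(1, 0.9)
confident_wrong = log_loss(1, 0.1)
```

A confident correct prediction gives a small loss and a confident wrong one gives a large loss, which is the behavior the formula is meant to capture.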

message MIDDomain

schema.proto:653

Knowledge graph ID, see: https://www.wikidata.org/wiki/Property:P646

Used in: Feature

(message has no fields)

message MaximumMeanDiscrepancy

metric.proto:245

https://www.tensorflow.org/responsible_ai/model_remediation/api_docs/python/model_remediation/min_diff/losses/MMDLoss

message MeanAbsoluteError

metric.proto:92

MAE(...) mae(...) mean_absolute_error(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/mean_absolute_error

Used in: PerformanceMetric

(message has no fields)

message MeanAbsolutePercentageError

metric.proto:100

MAPE(...) mape(...) mean_absolute_percentage_error(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/mean_absolute_percentage_error

Used in: PerformanceMetric

(message has no fields)

message MeanReciprocalRank

metric.proto:242

Used in: PerformanceMetric

(message has no fields)

message MeanSquaredError

metric.proto:108

MSE(...) mse(...) mean_squared_error(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/mean_squared_error

Used in: PerformanceMetric

(message has no fields)

message MeanSquaredLogarithmicError

metric.proto:116

msle(...) MSLE(...) mean_squared_logarithmic_error(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/mean_squared_logarithmic_error

Used in: PerformanceMetric

(message has no fields)

message MetaOptimizationTarget

problem_statement.proto:339

The high-level objectives described by this problem statement. These objectives provide a basis for ranking models and can be optimized by a meta optimizer (e.g. a grid search over hyperparameters). A solution provider may also directly use the meta optimization targets to heuristically select losses, etc. without any meta-optimization process. If not specified, the high-level meta optimization target is inferred from the task. These objectives do not need to be differentiable, as the solution provider may use a proxy function to optimize model weights. Target definitions include tasks, metrics, and any weighted combination of them.

Used in: ProblemStatement

message MetaOptimizationTarget.ThresholdConfig

problem_statement.proto:351

Configuration for thresholded meta-optimization targets.

Used in: MetaOptimizationTarget

enum MetricType

metric.proto:32

Metric type indicates which direction of a real-valued metric is "better". For most message types, this is invariant. For custom message types, is_maximized == true is like MAXIMIZE, and otherwise MINIMIZE.

message MicroAUC

metric.proto:266

Area under ROC-curve calculated globally for MultiClassClassification (model predicts a single label) or MultiLabelClassification (model predicts class probabilities). The area is calculated by treating the entire set of data as an aggregate result, and computing a single metric rather than k metrics (one for each target label) that get averaged together. For example, the FPR and TPR at a given point on the AUC curve for k target labels are:
  FPR = (FP1 + FP2 + ... + FPk) / ((FP1 + FP2 + ... + FPk) + (TN1 + TN2 + ... + TNk))
  TPR = (TP1 + TP2 + ... + TPk) / ((TP1 + TP2 + ... + TPk) + (FN1 + FN2 + ... + FNk))

Used in: PerformanceMetric

(message has no fields)
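
The pooled FPR and TPR formulas translate directly into code. The sketch below assumes the per-label confusion counts at one threshold are already available as dictionaries; the function name and the sample counts are made up for illustration.

```python
def micro_rates(per_label_counts):
    """Pool confusion counts over all labels, then compute one FPR and one TPR."""
    tp = sum(c["tp"] for c in per_label_counts)
    fp = sum(c["fp"] for c in per_label_counts)
    tn = sum(c["tn"] for c in per_label_counts)
    fn = sum(c["fn"] for c in per_label_counts)
    fpr = fp / (fp + tn)
    tpr = tp / (tp + fn)
    return fpr, tpr


# Hypothetical counts for k = 2 target labels at a single threshold.
counts = [
    {"tp": 8, "fp": 2, "tn": 88, "fn": 2},
    {"tp": 4, "fp": 6, "tn": 84, "fn": 6},
]
fpr, tpr = micro_rates(counts)
```

Sweeping the threshold and collecting the resulting (FPR, TPR) pairs would trace the single pooled ROC curve whose area the metric reports.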

message MultiClassClassification

problem_statement.proto:108

Configuration for a multi-class classification task. In this problem type, there are n_classes possible label values, and the model predicts a single label. The output is one of the class labels, out of n_classes possible classes. The output type will correspond to the label column type.

Used in: Type

message MultiDimensionalRegression

problem_statement.proto:235

A multi-dimensional regression task. Similar to OneDimensionalRegression, MultiDimensionalRegression predicts continuous real numbers. However, instead of predicting a single scalar value per example, we predict a fixed-dimensional vector of values. By default the range is any float -inf to inf, but specific sub-types (e.g. probability) define narrower ranges.

message MultiDimensionalRegression.Probability

problem_statement.proto:250

Defines a regression problem where labels are in [0, 1] and represent a probability (e.g: probability of click).

Used in: MultiDimensionalRegression

message MultiLabelClassification

problem_statement.proto:138

Configuration for a multi-label classification task. In this problem type there are n_classes unique possible label values overall. There can be from zero up to n_classes unique labels per example. The output, which is of type real number, is class probabilities associated with each class. It will be of n_classes dimension for each example, if n_classes is specified. Otherwise, the dimension will be set to the number of unique class labels that are dynamically inferred from the data based on dynamic_class_spec.

Used in: Type

message MultilabelCrossEntropy

metric.proto:273

Cross entropy for MultiLabelClassification where each target and prediction is the probability of belonging to that class independent of other classes.

Used in: PerformanceMetric

(message has no fields)

message NaturalLanguageDomain

schema.proto:621

Natural language text.

Used in: Feature

message NaturalLanguageStatistics

statistics.proto:300

Statistics for a feature containing a NL domain.

message NaturalLanguageStatistics.TokenStatistics

statistics.proto:320

Used in: NaturalLanguageStatistics, WeightedNaturalLanguageStatistics

message NormalizedAbsoluteDifference

schema.proto:767

Checks that the absolute count difference relative to the total count of both datasets is small. This metric is appropriate for comparing datasets that are expected to have similar absolute counts, and not necessarily just similar distributions. Computed as max_i | x_i - y_i | / sum_i(x_i + y_i) for aligned datasets x and y. Results will be in the interval [0.0, 1.0] so sensible bounds should be in the interval [0.0, 1.0).

Used in: FeatureComparator
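
The expression max_i | x_i - y_i | / sum_i(x_i + y_i) can be sketched as follows, assuming the two datasets are given as value-to-count dictionaries and that a value absent from one side counts as 0 there. The function name and the zero-total convention are assumptions made for the example.

```python
def normalized_absolute_difference(x_counts, y_counts):
    """Largest per-value count gap, relative to the total count of both datasets."""
    total = sum(x_counts.values()) + sum(y_counts.values())
    if total == 0:
        # Assumed convention: two empty datasets do not differ.
        return 0.0
    keys = set(x_counts) | set(y_counts)
    largest_gap = max(abs(x_counts.get(k, 0) - y_counts.get(k, 0)) for k in keys)
    return largest_gap / total


score = normalized_absolute_difference({"a": 10, "b": 30}, {"a": 20, "b": 40})
```

Unlike a comparison of normalized distributions, this score reacts to changes in absolute counts, as the description points out.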

message NumericCrossStatistics

statistics.proto:77

Used in: CrossFeatureStatistics

message NumericStatistics

statistics.proto:234

Statistics for a numeric feature in a dataset.

Used in: FeatureNameStatistics

message NumericValueComparator

schema.proto:283

Checks that the ratio of the current value to the previous value is not below the min_fraction_threshold or above the max_fraction_threshold. That is, previous value * min_fraction_threshold <= current value <= previous value * max_fraction_threshold. To specify that the value cannot change, set both min_fraction_threshold and max_fraction_threshold to 1.0.

Used in: DatasetConstraints
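
The inequality above can be checked with a few lines of code. This sketch treats an unset threshold as "no bound", which is an assumption; the function name is hypothetical, while the parameter names follow the field names in the description.

```python
def within_thresholds(previous, current,
                      min_fraction_threshold=None, max_fraction_threshold=None):
    """Check previous * min <= current <= previous * max for the bounds that are set."""
    if min_fraction_threshold is not None and current < previous * min_fraction_threshold:
        return False
    if max_fraction_threshold is not None and current > previous * max_fraction_threshold:
        return False
    return True


small_change = within_thresholds(100, 95, 0.9, 1.1)
large_change = within_thresholds(100, 120, 0.9, 1.1)
no_change_allowed = within_thresholds(100, 101, 1.0, 1.0)
```

The last call shows the "value cannot change" case: with both thresholds at 1.0, any difference from the previous value fails the check.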

message OneDimensionalRegression

problem_statement.proto:200

A one-dimensional regression task. The output is a single real number, whose range is dependent upon the objective.

Used in: Type

message OneDimensionalRegression.Counts

problem_statement.proto:218

Defines a regression problem where the labels are counts i.e. integers >=0.

Used in: OneDimensionalRegression

(message has no fields)

message OneDimensionalRegression.Probability

problem_statement.proto:215

Defines a regression problem where labels are in [0, 1] and represent a probability (e.g: probability of click).

Used in: OneDimensionalRegression

(message has no fields)

message OneSideRegion

anomalies.proto:390

Describes a chunk that applies to only one of the two artifacts.

Used in: DiffRegion

message Path

path.proto:39

A path is a more general substitute for the name of a field or feature that can be used for flat examples as well as structured data. For example, if we had data in a protocol buffer:
  message Person {
    int age = 1;
    optional string gender = 2;
    repeated Person parent = 3;
  }
Thus, here the path {step:["parent", "age"]} in statistics would refer to the age of a parent, and {step:["parent", "parent", "age"]} would refer to the age of a grandparent. This allows us to distinguish between the statistics of parents' ages and grandparents' ages. In general, repeated messages are to be preferred to linked lists of arbitrary length. For SequenceExample, if we have a feature list "foo", this is represented by {step:["##SEQUENCE##", "foo"]}.

Used in: AnomalyInfo, BinaryClassification, CrossFeatureStatistics, DerivedFeatureSource, DriftSkewInfo, FeatureNameStatistics, MultiClassClassification, MultiDimensionalRegression, MultiLabelClassification, OneDimensionalRegression, TensorRepresentation.RaggedTensor, TopKClassification, WeightedFeature
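
To show how a list of steps selects values in structured data, here is a sketch that models the Person message as nested Python dictionaries and lists. The helper values_at_path and the sample record are hypothetical; this is not how any library resolves a Path, only an illustration of the idea.

```python
def values_at_path(record, steps):
    """Collect every value reached by following steps through nested dicts and lists."""
    current = [record]
    for step in steps:
        reached = []
        for node in current:
            if not isinstance(node, dict) or step not in node:
                continue
            child = node[step]
            # A repeated field fans out into all of its elements.
            if isinstance(child, list):
                reached.extend(child)
            else:
                reached.append(child)
        current = reached
    return current


person = {
    "age": 30,
    "parent": [
        {"age": 55, "parent": [{"age": 80}, {"age": 82}]},
        {"age": 58},
    ],
}
parent_ages = values_at_path(person, ["parent", "age"])
grandparent_ages = values_at_path(person, ["parent", "parent", "age"])
```

The two step lists yield separate collections of values, which is what lets statistics for parents' ages and grandparents' ages be kept apart.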

message PerformanceMetric

metric.proto:319

Performance metrics measure the quality of a model. They need not be differentiable.

Used in: MetaOptimizationTarget, Task

message Poisson

metric.proto:122

poisson(...) DEPRECATED

Used in: PerformanceMetric

(message has no fields)

message PrecisionAtK

metric.proto:238

Used in: PerformanceMetric

(message has no fields)

message PrecisionAtRecall

metric.proto:195

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/PrecisionAtRecall

Used in: PerformanceMetric

message PredictionMean

metric.proto:254

The mean of the prediction across the dataset.

(message has no fields)

message PresenceAndValencyStatistics

statistics.proto:426

Statistics about the presence and valency of feature values. Feature values could be nested lists. A feature in tf.Examples or other "flat" datasets has values of nest level 1 -- they are lists of primitives. A nest level N (N > 1) feature value is a list of lists of nest level (N - 1). This proto can be used to describe the presence and valency of values at each level.

Used in: CommonStatistics

message ProblemStatement

problem_statement.proto:391

message RankHistogram

statistics.proto:532

The data used to create a rank histogram of a non-numeric feature of a dataset. The rank of a value in a feature can be used as a measure of how commonly the value is found in the entire dataset. With bucket sizes of one, this becomes a distribution function of all feature values.

Used in: CustomStatistic, NaturalLanguageStatistics, StringStatistics, WeightedNaturalLanguageStatistics, WeightedStringStatistics

message RankHistogram.Bucket

statistics.proto:534

Each bucket defines its start and end ranks along with its count.

Used in: RankHistogram

message RecallAtPrecision

metric.proto:203

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/RecallAtPrecision

Used in: PerformanceMetric

message ReduceOp

derived_feature.proto:70

Used in: DerivedFeatureConfig

message Schema

schema.proto:72

Message to represent schema information. NextID: 15

Used in: Anomalies

message SensitivityAtSpecificity

metric.proto:179

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/SensitivityAtSpecificity

Used in: PerformanceMetric

message SequenceLengthConstraints

schema.proto:509

Encodes constraints on sequence lengths.

Used in: NaturalLanguageDomain

message SequenceMetadata

schema.proto:952

Used in: Feature

enum SequenceMetadata.SequentialStatus

schema.proto:955

This enum specifies whether to treat the feature as a sequence which has meaningful element order.

Used in: SequenceMetadata

message SequenceValueConstraints

schema.proto:491

Encodes constraints on specific values in sequences.

Used in: NaturalLanguageDomain

message SliceSql

derived_feature.proto:81

Used in: DerivedFeatureConfig

enum SliceValueTypes

derived_feature.proto:74

Used in: SliceSql

message SparseFeature

schema.proto:408

A sparse feature represents a sparse tensor that is encoded with a combination of raw features, namely index features and a value feature. Each index feature defines a list of indices in a different dimension.

Used in: Schema, StructDomain

message SparseFeature.IndexFeature

schema.proto:438

Used in: SparseFeature

message SparseFeature.ValueFeature

schema.proto:448

Used in: SparseFeature

message SparseTopKCategoricalAccuracy

metric.proto:142

sparse_top_k_categorical_accuracy(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/sparse_top_k_categorical_accuracy DEPRECATED

Used in: PerformanceMetric

(message has no fields)

message SpecificityAtSensitivity

metric.proto:187

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/SpecificityAtSensitivity

Used in: PerformanceMetric

message SquaredHinge

metric.proto:129

squared_hinge(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/squared_hinge DEPRECATED

Used in: PerformanceMetric

(message has no fields)

message StringDomain

schema.proto:581

Encodes information for domains of string values.

Used in: Feature, Schema

enum StringDomain.Categorical

schema.proto:592

Currently unused. StringDomain consists of Categorical. This enum allows the user to specify whether to treat the feature as categorical.

Used in: StringDomain

message StringStatistics

statistics.proto:256

Statistics for a string feature in a dataset.

Used in: FeatureNameStatistics

message StringStatistics.FreqAndValue

statistics.proto:261

Used in: StringStatistics, WeightedStringStatistics

message StructDomain

schema.proto:574

Domain for a recursive struct. NOTE: If a feature with a StructDomain is deprecated, then all the child features (features and sparse_features of the StructDomain) are also considered to be deprecated. Similarly child features can only be in environments of the parent feature.

Used in: Feature

message StructStatistics

statistics.proto:416

Used in: FeatureNameStatistics

message Task

problem_statement.proto:296

Describes a single task in a model and all its properties. A task corresponds to a single output of the model. Multiple tasks in the same problem statement correspond to different outputs of the model.

Used in: ProblemStatement

enum TaskType

problem_statement.proto:29

message TensorRepresentation

schema.proto:793

A TensorRepresentation captures the intent for converting columns in a dataset to TensorFlow Tensors (or more generally, tf.CompositeTensors). Note that one tf.CompositeTensor may consist of data from multiple columns, for example, an N-dimensional tf.SparseTensor may need N + 1 columns to provide the sparse indices and values. Note that the "column name" that a TensorRepresentation needs is a string, not a Path. This means that the column name identifies a top-level Feature in the schema (i.e. you cannot specify a Feature nested in a STRUCT Feature).

Used in: TensorRepresentationGroup

message TensorRepresentation.DefaultValue

schema.proto:794

Used in: DenseTensor

message TensorRepresentation.DenseTensor

schema.proto:808

A tf.Tensor

Used in: TensorRepresentation

message TensorRepresentation.RaggedTensor

schema.proto:849

A tf.RaggedTensor that models nested lists. Currently there is no way for the user to specify the shape of the leaf value (the innermost value tensor of the RaggedTensor). The leaf value will always be a 1-D tensor.

Used in: TensorRepresentation

message TensorRepresentation.RaggedTensor.Partition

schema.proto:861

Further partition of the feature values at the leaf level.

Used in: RaggedTensor

enum TensorRepresentation.RowPartitionDType

schema.proto:898

RaggedTensor consists of RowPartitions. This enum allows the user to specify the dtype of those RowPartitions. If it is UNSPECIFIED, then we default to INT64.

Used in: RaggedTensor

message TensorRepresentation.SparseTensor

schema.proto:831

A tf.SparseTensor whose indices and values come from separate data columns. This will replace Schema.sparse_feature eventually. The index columns must be of INT type, and all the columns must co-occur and have the same valency at the same row.

Used in: TensorRepresentation

message TensorRepresentation.VarLenSparseTensor

schema.proto:821

A ragged tf.SparseTensor that models nested lists.

Used in: TensorRepresentation

message TensorRepresentationGroup

schema.proto:948

A TensorRepresentationGroup is a collection of TensorRepresentations with names. These names may serve as identifiers when converting the dataset to a collection of Tensors or tf.CompositeTensors. For example, given the following group:
  {
    key: "dense_tensor"
    tensor_representation {
      dense_tensor {
        column_name: "univalent_feature"
        shape {
          dim {
            size: 1
          }
        }
        default_value {
          float_value: 0
        }
      }
    }
  }
  {
    key: "varlen_sparse_tensor"
    tensor_representation {
      varlen_sparse_tensor {
        column_name: "multivalent_feature"
      }
    }
  }
Then the schema is expected to have feature "univalent_feature" and "multivalent_feature", and when a batch of data is converted to Tensors using this TensorRepresentationGroup, the result may be the following dict:
  {
    "dense_tensor": tf.Tensor(...),
    "varlen_sparse_tensor": tf.SparseTensor(...),
  }

Used in: Schema

message TextGeneration

problem_statement.proto:268

Configuration for a text generation task where the model should predict a sequence of natural language text.

Used in: Type

message TimeDomain

schema.proto:659

Time or date representation.

Used in: Feature

enum TimeDomain.IntegerTimeFormat

schema.proto:660

Used in: TimeDomain

message TimeOfDayDomain

schema.proto:681

Time of day, without a particular date.

Used in: Feature

enum TimeOfDayDomain.IntegerTimeOfDayFormat

schema.proto:682

Used in: TimeOfDayDomain

message TopKCategoricalAccuracy

metric.proto:135

top_k_categorical_accuracy(...) https://www.tensorflow.org/api_docs/python/tf/keras/metrics/top_k_categorical_accuracy

Used in: PerformanceMetric

(message has no fields)

message TopKClassification

problem_statement.proto:170

Configuration for a top-K classification task. In this problem type, there are n_classes possible label values, and the model predicts n_predicted_labels labels. The output is a sequence of n_predicted_labels labels, out of n_classes possible classes. The order of the predicted output labels is determined by the predictions_order field.
  (*) MultiClassClassification is the same as TopKClassification with n_predicted_labels = 1.
  (*) TopKClassification does NOT mean multi-class multi-label classification: e.g., the output contains a sequence of labels, all coming from the same label column in the data.

Used in: Type

enum TopKClassification.Order

problem_statement.proto:187

Used in: TopKClassification

message Type

problem_statement.proto:280

The type of a head or meta-objective. Specifies the label, weight, and output type of the head. TODO(martinz): add logistic regression.

Used in: Task

message URLDomain

schema.proto:656

A URL, see: https://en.wikipedia.org/wiki/URL

Used in: Feature

(message has no fields)

message UnchangedRegion

anomalies.proto:381

Describes a chunk that is the same in the two artifacts.

Used in: DiffRegion

message UniqueConstraints

schema.proto:779

Checks that the number of unique values is greater than or equal to the min, and less than or equal to the max.

Used in: Feature

message ValueCount

schema.proto:337

Limits on maximum and minimum number of values in a single example (when the feature is present). Use this when the minimum value count can be different than the maximum value count. Otherwise prefer FixedShape.

Used in: Feature, ValueCountList

message ValueCountList

schema.proto:142

Used in: Feature

message VideoDomain

schema.proto:647

Video data.

Used in: Feature

(message has no fields)

message WeightedCommonStatistics

statistics.proto:206

Common weighted statistics for all feature types. Statistics counting number of values (i.e., avg_num_values and tot_num_values) include NaNs. If the weighted column is missing, then this counts as a weight of 1 for that example. For nested features with N nested levels (N > 1), the statistics counting number of values will rely on the innermost level.

Used in: CommonStatistics

message WeightedFeature

schema.proto:391

Represents a weighted feature that is encoded as a combination of raw base features. The `weight_feature` should be a float feature with the same shape as the `feature`. This is useful for representing weights associated with categorical tokens (e.g. a TFIDF weight associated with each token). TODO(b/142122960): Handle WeightedCategorical end to end in TFX (validation, TFX Unit Testing, etc.)

Used in: Schema

message WeightedNaturalLanguageStatistics

statistics.proto:381

Statistics for a weighted feature with an NL domain.

Used in: NaturalLanguageStatistics

message WeightedNumericStatistics

statistics.proto:358

Statistics for a weighted numeric feature in a dataset.

Used in: NumericStatistics

message WeightedStringStatistics

statistics.proto:371

Statistics for a weighted string feature in a dataset.

Used in: StringStatistics