package google.cloud.bigquery.v2

service ModelService

rpc GetModel (GetModelRequest, Model)
model.proto:43
Gets the specified model resource by model ID.
message GetModelRequest
model.proto:578
- string project_id = 1
  Required. Project ID of the requested model.
- string dataset_id = 2
  Required. Dataset ID of the requested model.
- string model_id = 3
  Required. Model ID of the requested model.
rpc ListModels (ListModelsRequest, ListModelsResponse)
model.proto:49
Lists all models in the specified dataset. Requires the READER dataset role.
message ListModelsRequest
model.proto:616
- string project_id = 1
  Required. Project ID of the models to list.
- string dataset_id = 2
  Required. Dataset ID of the models to list.
- optional protobuf.UInt32Value max_results = 3
  The maximum number of results to return in a single response page. Use the page tokens to iterate through the entire collection.
- string page_token = 4
  Page token, returned by a previous call, to request the next page of results.
message ListModelsResponse
model.proto:632
- repeated Model models = 1
  Models in the requested dataset. Only the following fields are populated: model_reference, model_type, creation_time, last_modified_time and labels.
- string next_page_token = 2
  A token to request the next page of results.
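The page_token / next_page_token pair above implies a simple loop: pass the token from the previous response until none comes back. The following is a minimal sketch only. `fetch_page` is a hypothetical callable standing in for a ListModels call, and the dict-shaped pages are an assumption made for illustration, not the client library's actual types.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page shape: {"models": [...], "next_page_token": "..."}.
Page = dict


def iter_models(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every model by following next_page_token until it is empty."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("models", [])
        # An empty or missing token means there are no more pages.
        token = page.get("next_page_token") or None
        if token is None:
            break


# Fake two-page backend standing in for ListModels (illustration only).
_PAGES = {
    None: {"models": [{"model_id": "m1"}, {"model_id": "m2"}], "next_page_token": "t1"},
    "t1": {"models": [{"model_id": "m3"}], "next_page_token": ""},
}


def fake_fetch(token: Optional[str]) -> Page:
    return _PAGES[token]


model_ids = [m["model_id"] for m in iter_models(fake_fetch)]
```

Keeping the transport behind a callable lets the same loop be exercised against a fake backend, as done here.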
rpc PatchModel (PatchModelRequest, Model)
model.proto:54
Patch specific fields in the specified model.
message PatchModelRequest
model.proto:589
- string project_id = 1
  Required. Project ID of the model to patch.
- string dataset_id = 2
  Required. Dataset ID of the model to patch.
- string model_id = 3
  Required. Model ID of the model to patch.
- optional Model model = 4
  Required. Patched model. Follows RFC5789 patch semantics. Missing fields are not updated. To clear a field, explicitly set to default value.
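The patch semantics described above (missing fields are not updated; a field is cleared by sending its default value) can be illustrated with plain dictionaries. This is a sketch of the semantics only; `apply_patch` is a hypothetical helper, not part of the service, and real requests carry a Model message rather than a dict.

```python
def apply_patch(current: dict, patch: dict) -> dict:
    """Return a copy of `current` with only the fields present in `patch` replaced."""
    updated = dict(current)
    updated.update(patch)
    return updated


model = {
    "description": "churn model",
    "friendly_name": "Churn v1",
    "labels": {"team": "growth"},
}

# Only friendly_name is sent, so description and labels stay untouched.
renamed = apply_patch(model, {"friendly_name": "Churn v2"})

# Clearing a field means sending its default value (an empty string here),
# not omitting it from the patch.
cleared = apply_patch(model, {"description": ""})
```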
rpc DeleteModel (DeleteModelRequest, protobuf.Empty)
model.proto:59
Deletes the model specified by modelId from the dataset.
message DeleteModelRequest
model.proto:605
- string project_id = 1
  Required. Project ID of the model to delete.
- string dataset_id = 2
  Required. Dataset ID of the model to delete.
- string model_id = 3
  Required. Model ID of the model to delete.

message EncryptionConfiguration

Used in: Model

optional protobuf.StringValue kms_key_name = 1
Optional. Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key.

message Model

Used as response type in: ModelService.GetModel, ModelService.PatchModel

Used as field type in: ListModelsResponse, PatchModelRequest

string etag = 1
Output only. A hash of this resource.
optional ModelReference model_reference = 2
Required. Unique identifier for this model.
int64 creation_time = 5
Output only. The time when this model was created, in milliseconds since the epoch.
int64 last_modified_time = 6
Output only. The time when this model was last modified, in milliseconds since the epoch.
string description = 12
Optional. A user-friendly description of this model.
string friendly_name = 14
Optional. A descriptive name for this model.
map<string, string> labels = 15
The labels associated with this model. You can use these to organize and group your models. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key.
int64 expiration_time = 16
Optional. The time when this model expires, in milliseconds since the epoch. If not present, the model will persist indefinitely. Expired models will be deleted and their storage reclaimed. The defaultTableExpirationMs property of the encapsulating dataset can be used to set a default expirationTime on newly created models.
string location = 13
Output only. The geographic location where the model resides. This value is inherited from the dataset.
optional EncryptionConfiguration encryption_configuration = 17
Custom encryption configuration (e.g., Cloud KMS keys). This shows the encryption configuration of the model data while stored in BigQuery storage.
Model.ModelType model_type = 7
Output only. Type of the model resource.
repeated Model.TrainingRun training_runs = 9
Output only. Information for all training runs in increasing order of start_time.
repeated StandardSqlField feature_columns = 10
Output only. Input feature columns that were used to train this model.
repeated StandardSqlField label_columns = 11
Output only. Label columns that were used to train this model. The output of the model will have a "predicted_" prefix to these columns.
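The creation_time, last_modified_time and expiration_time fields are int64 values in milliseconds since the epoch. A minimal sketch of converting them and checking expiry follows. The helper names are hypothetical, and treating 0 or a missing value as "not present" is an assumption based on the int64 default.

```python
from datetime import datetime, timezone
from typing import Optional


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def is_expired(expiration_time_ms: Optional[int], now: datetime) -> bool:
    """A model without an expiration_time persists indefinitely."""
    if not expiration_time_ms:  # assumption: 0 or None means "not present"
        return False
    return ms_to_datetime(expiration_time_ms) <= now


created = ms_to_datetime(1_546_300_800_000)  # 2019-01-01T00:00:00Z
```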

message Model.AggregateClassificationMetrics

Aggregate metrics for classification/classifier models. For multi-class models, the metrics are either macro-averaged or micro-averaged. When macro-averaged, the metrics are calculated for each label and then an unweighted average is taken of those values. When micro-averaged, the metric is calculated globally by counting the total number of correctly predicted rows.

Used in: BinaryClassificationMetrics, MultiClassClassificationMetrics

optional protobuf.DoubleValue precision = 1
Precision is the fraction of actual positive predictions that had positive actual labels. For multiclass this is a macro-averaged metric treating each class as a binary classifier.
optional protobuf.DoubleValue recall = 2
Recall is the fraction of actual positive labels that were given a positive prediction. For multiclass this is a macro-averaged metric.
optional protobuf.DoubleValue accuracy = 3
Accuracy is the fraction of predictions given the correct label. For multiclass this is a micro-averaged metric.
optional protobuf.DoubleValue threshold = 4
Threshold at which the metrics are computed. For binary classification models this is the positive class threshold. For multi-class classification models this is the confidence threshold.
optional protobuf.DoubleValue f1_score = 5
The F1 score is the harmonic mean of recall and precision. For multiclass this is a macro-averaged metric.
optional protobuf.DoubleValue log_loss = 6
Logarithmic Loss. For multiclass this is a macro-averaged metric.
optional protobuf.DoubleValue roc_auc = 7
Area Under a ROC Curve. For multiclass this is a macro-averaged metric.
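To make the macro/micro distinction concrete, the sketch below computes a macro-averaged precision (per-label values, then an unweighted mean) and a micro-averaged accuracy (one global count of correct rows) over a toy label set. It illustrates the averaging only and is not the service's implementation. Scoring a never-predicted label as 0 is an assumption.

```python
def macro_precision(actual, predicted):
    """Average the per-label precision, weighting every label equally."""
    labels = sorted(set(actual) | set(predicted))
    per_label = []
    for label in labels:
        # Actual labels of the rows that were predicted as `label`.
        predicted_as_label = [a for a, p in zip(actual, predicted) if p == label]
        if predicted_as_label:
            per_label.append(predicted_as_label.count(label) / len(predicted_as_label))
        else:
            per_label.append(0.0)  # assumption: a never-predicted label scores 0
    return sum(per_label) / len(per_label)


def micro_accuracy(actual, predicted):
    """Count correctly predicted rows globally, regardless of label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


actual = ["a", "a", "a", "a", "b", "c"]
predicted = ["a", "a", "a", "a", "c", "c"]

macro = macro_precision(actual, predicted)  # (1.0 + 0.0 + 0.5) / 3
micro = micro_accuracy(actual, predicted)   # 5 correct rows out of 6
```

The rare label "b" drags the macro average down to 0.5, while the micro-averaged accuracy stays at 5/6 because the dominant label is predicted well.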

message Model.BinaryClassificationMetrics

Evaluation metrics for binary classification/classifier models.

Used in: EvaluationMetrics

optional AggregateClassificationMetrics aggregate_classification_metrics = 1
Aggregate classification metrics.
repeated BinaryClassificationMetrics.BinaryConfusionMatrix binary_confusion_matrix_list = 2
Binary confusion matrix at multiple thresholds.
string positive_label = 3
Label representing the positive class.
string negative_label = 4
Label representing the negative class.

message Model.BinaryClassificationMetrics.BinaryConfusionMatrix

Confusion matrix for binary classification models.

Used in: BinaryClassificationMetrics

optional protobuf.DoubleValue positive_class_threshold = 1
Threshold value used when computing each of the following metrics.
optional protobuf.Int64Value true_positives = 2
Number of true samples predicted as true.
optional protobuf.Int64Value false_positives = 3
Number of false samples predicted as true.
optional protobuf.Int64Value true_negatives = 4
Number of false samples predicted as false.
optional protobuf.Int64Value false_negatives = 5
Number of true samples predicted as false.
optional protobuf.DoubleValue precision = 6
The fraction of actual positive predictions that had positive actual labels.
optional protobuf.DoubleValue recall = 7
The fraction of actual positive labels that were given a positive prediction.
optional protobuf.DoubleValue f1_score = 8
The harmonic mean of recall and precision.
optional protobuf.DoubleValue accuracy = 9
The fraction of predictions given the correct label.
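The derived fields of this message follow directly from the four counts. A minimal sketch under the usual definitions, with F1 computed as the harmonic mean of precision and recall; the function name and the zero-denominator handling are assumptions for illustration.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "accuracy": accuracy,
    }


m = binary_metrics(tp=8, fp=2, tn=85, fn=5)
```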

message Model.ClusteringMetrics

Evaluation metrics for clustering models.

Used in: EvaluationMetrics

optional protobuf.DoubleValue davies_bouldin_index = 1
Davies-Bouldin index.
optional protobuf.DoubleValue mean_squared_distance = 2
Mean of squared distances between each sample and its cluster centroid.
repeated ClusteringMetrics.Cluster clusters = 3
[Beta] Information for all clusters.

message Model.ClusteringMetrics.Cluster

Message containing the information about one cluster.

Used in: ClusteringMetrics

int64 centroid_id = 1
Centroid id.
repeated Cluster.FeatureValue feature_values = 2
Values of highly variant features for this cluster.
optional protobuf.Int64Value count = 3
Count of training data rows that were assigned to this cluster.

message Model.ClusteringMetrics.Cluster.FeatureValue

Representative value of a single feature within the cluster.

Used in: Cluster

string feature_column = 1
The feature column name.
oneof value
- protobuf.DoubleValue numerical_value = 2
  The numerical feature value. This is the centroid value for this feature.
- FeatureValue.CategoricalValue categorical_value = 3
  The categorical feature value.

message Model.ClusteringMetrics.Cluster.FeatureValue.CategoricalValue

Representative value of a categorical feature.

Used in: FeatureValue

repeated CategoricalValue.CategoryCount category_counts = 1
Counts of all categories for the categorical feature. If there are more than ten categories, the top ten (by count) are returned, plus one more CategoryCount with category "_OTHER_" whose count is the aggregate count of the remaining categories.
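The top-ten-plus-"_OTHER_" rule can be sketched as follows. This only illustrates the aggregation described above; `summarize_categories` is a hypothetical helper, and tie-breaking among equal counts is not specified by the source.

```python
from collections import Counter


def summarize_categories(values, top_n=10):
    """Return (category, count) pairs, folding the tail into "_OTHER_"."""
    counts = Counter(values).most_common()  # sorted by count, descending
    if len(counts) <= top_n:
        return counts
    top, rest = counts[:top_n], counts[top_n:]
    return top + [("_OTHER_", sum(count for _, count in rest))]


# Twelve categories: "c0" appears 12 times, "c1" 11 times, ..., "c11" once.
values = [f"c{i}" for i in range(12) for _ in range(12 - i)]
summary = summarize_categories(values)
```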

message Model.ClusteringMetrics.Cluster.FeatureValue.CategoricalValue.CategoryCount

Represents the count of a single category within the cluster.

Used in: CategoricalValue

string category = 1
The name of the category.
optional protobuf.Int64Value count = 2
The count of training samples matching the category within the cluster.

enum Model.DataSplitMethod

Indicates the method to split input data into multiple tables.

Used in: TrainingRun.TrainingOptions

DATA_SPLIT_METHOD_UNSPECIFIED = 0
RANDOM = 1
Splits data randomly.
CUSTOM = 2
Splits data with the user provided tags.
SEQUENTIAL = 3
Splits data sequentially.
NO_SPLIT = 4
Data split will be skipped.
AUTO_SPLIT = 5
Splits data automatically: Uses NO_SPLIT if the data size is small. Otherwise uses RANDOM.

enum Model.DistanceType

Distance metric used to compute the distance between two points.

Used in: TrainingRun.TrainingOptions

DISTANCE_TYPE_UNSPECIFIED = 0
EUCLIDEAN = 1
Euclidean distance.
COSINE = 2
Cosine distance.
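For reference, the two distance types correspond to the following standard formulas. This is a minimal sketch; taking cosine distance as one minus cosine similarity is the common definition and is assumed here rather than stated by the source.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Euclidean distance is sensitive to vector magnitude, while cosine distance depends only on direction: `(1, 1)` and `(2, 2)` are far apart under the first and coincide under the second.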

message Model.EvaluationMetrics

Evaluation metrics of a model. These are either computed on all training data or just the eval data based on whether eval data was used during training. These are not present for imported models.

Used in: TrainingRun

oneof metrics
- RegressionMetrics regression_metrics = 1
  Populated for regression models and explicit feedback type matrix factorization models.
- BinaryClassificationMetrics binary_classification_metrics = 2
  Populated for binary classification/classifier models.
- MultiClassClassificationMetrics multi_class_classification_metrics = 3
  Populated for multi-class classification/classifier models.
- ClusteringMetrics clustering_metrics = 4
  Populated for clustering models.

message Model.KmeansEnums

(message has no fields)

enum Model.KmeansEnums.KmeansInitializationMethod

Indicates the method used to initialize the centroids for the k-means clustering algorithm.

Used in: TrainingRun.TrainingOptions

KMEANS_INITIALIZATION_METHOD_UNSPECIFIED = 0
RANDOM = 1
Initializes the centroids randomly.
CUSTOM = 2
Initializes the centroids using data specified in kmeans_initialization_column.

enum Model.LearnRateStrategy

Indicates the learning rate optimization strategy to use.

Used in: TrainingRun.TrainingOptions

LEARN_RATE_STRATEGY_UNSPECIFIED = 0
LINE_SEARCH = 1
Use line search to determine learning rate.
CONSTANT = 2
Use a constant learning rate.

enum Model.LossType

Loss metric to evaluate model training performance.

Used in: TrainingRun.TrainingOptions

LOSS_TYPE_UNSPECIFIED = 0
MEAN_SQUARED_LOSS = 1
Mean squared loss, used for linear regression.
MEAN_LOG_LOSS = 2
Mean log loss, used for logistic regression.

enum Model.ModelType

Indicates the type of the Model.

Used in: Model

MODEL_TYPE_UNSPECIFIED = 0
LINEAR_REGRESSION = 1
Linear regression model.
LOGISTIC_REGRESSION = 2
Logistic regression based classification model.
KMEANS = 3
K-means clustering model.
TENSORFLOW = 6
[Beta] An imported TensorFlow model.

message Model.MultiClassClassificationMetrics

Evaluation metrics for multi-class classification/classifier models.

Used in: EvaluationMetrics

optional AggregateClassificationMetrics aggregate_classification_metrics = 1
Aggregate classification metrics.
repeated MultiClassClassificationMetrics.ConfusionMatrix confusion_matrix_list = 2
Confusion matrix at different thresholds.

message Model.MultiClassClassificationMetrics.ConfusionMatrix

Confusion matrix for multi-class classification models.

Used in: MultiClassClassificationMetrics

optional protobuf.DoubleValue confidence_threshold = 1
Confidence threshold used when computing the entries of the confusion matrix.
repeated ConfusionMatrix.Row rows = 2
One row per actual label.

message Model.MultiClassClassificationMetrics.ConfusionMatrix.Entry

A single entry in the confusion matrix.

Used in: Row

string predicted_label = 1
The predicted label. For confidence_threshold > 0, we will also add an entry indicating the number of items under the confidence threshold.
optional protobuf.Int64Value item_count = 2
Number of items being predicted as this label.

message Model.MultiClassClassificationMetrics.ConfusionMatrix.Row

A single row in the confusion matrix.

Used in: ConfusionMatrix

string actual_label = 1
The original label of this row.
repeated Entry entries = 2
Info describing predicted label distribution.
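The Row/Entry layout (one row per actual label, one entry per predicted label with its item count) can be sketched with plain dictionaries. The dict keys mirror the field names above. The helper itself is hypothetical and ignores confidence thresholds.

```python
from collections import Counter, defaultdict


def confusion_rows(actual, predicted):
    """Group predictions by actual label into Row/Entry-shaped dicts."""
    counts = defaultdict(Counter)
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return [
        {
            "actual_label": a,
            "entries": [
                {"predicted_label": p, "item_count": n}
                for p, n in sorted(counts[a].items())
            ],
        }
        for a in sorted(counts)
    ]


rows = confusion_rows(
    actual=["cat", "cat", "dog", "dog", "dog"],
    predicted=["cat", "dog", "dog", "dog", "cat"],
)
```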

enum Model.OptimizationStrategy

Indicates the optimization strategy used for training.

Used in: TrainingRun.TrainingOptions

OPTIMIZATION_STRATEGY_UNSPECIFIED = 0
BATCH_GRADIENT_DESCENT = 1
Uses an iterative batch gradient descent algorithm.
NORMAL_EQUATION = 2
Uses a normal equation to solve the linear regression problem.

message Model.RegressionMetrics

Evaluation metrics for regression and explicit feedback type matrix factorization models.

Used in: EvaluationMetrics

optional protobuf.DoubleValue mean_absolute_error = 1
Mean absolute error.
optional protobuf.DoubleValue mean_squared_error = 2
Mean squared error.
optional protobuf.DoubleValue mean_squared_log_error = 3
Mean squared log error.
optional protobuf.DoubleValue median_absolute_error = 4
Median absolute error.
optional protobuf.DoubleValue r_squared = 5
R^2 score.
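The five regression metrics can be computed from paired actual and predicted values as sketched below. The formulas are the common textbook definitions and are an assumption here. In particular, mean_squared_log_error is taken as the mean of squared differences of log(1 + value), which the source does not spell out.

```python
import math
from statistics import mean, median


def regression_metrics(actual, predicted):
    """Compute common regression metrics from paired actual/predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    avg = mean(actual)
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((a - avg) ** 2 for a in actual)    # total sum of squares
    return {
        "mean_absolute_error": mean(abs(e) for e in errors),
        "mean_squared_error": ss_res / len(errors),
        # Assumed definition: squared difference of log(1 + value).
        "mean_squared_log_error": mean(
            (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
        ),
        "median_absolute_error": median(abs(e) for e in errors),
        "r_squared": 1.0 - ss_res / ss_tot,
    }


metrics = regression_metrics(
    actual=[1.0, 2.0, 3.0, 4.0],
    predicted=[1.5, 2.0, 2.5, 5.0],
)
```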

message Model.TrainingRun

Information about a single training query run for the model.

Used in: Model

optional TrainingRun.TrainingOptions training_options = 1
Options that were used for this training run, including user-specified and default options.
optional protobuf.Timestamp start_time = 8
The start time of this training run.
repeated TrainingRun.IterationResult results = 6
Output of each iteration run, results.size() <= max_iterations.
optional EvaluationMetrics evaluation_metrics = 7
The evaluation metrics over training/eval data that were computed at the end of training.

message Model.TrainingRun.IterationResult

Information about a single iteration of the training run.

Used in: TrainingRun

optional protobuf.Int32Value index = 1
Index of the iteration, 0 based.
optional protobuf.Int64Value duration_ms = 4
Time taken to run the iteration in milliseconds.
optional protobuf.DoubleValue training_loss = 5
Loss computed on the training data at the end of iteration.
optional protobuf.DoubleValue eval_loss = 6
Loss computed on the eval data at the end of iteration.
double learn_rate = 7
Learn rate used for this iteration.
repeated IterationResult.ClusterInfo cluster_infos = 8
Information about top clusters for clustering models.

message Model.TrainingRun.IterationResult.ClusterInfo

Information about a single cluster for a clustering model.

Used in: IterationResult

int64 centroid_id = 1
Centroid id.
optional protobuf.DoubleValue cluster_radius = 2
Cluster radius, the average distance from centroid to each point assigned to the cluster.
optional protobuf.Int64Value cluster_size = 3
Cluster size, the total number of points assigned to the cluster.
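As an illustration of the two derived fields, the sketch below computes cluster_radius as the average centroid-to-point distance and cluster_size as the number of assigned points. Euclidean distance is assumed here; the distance actually used depends on the model's distance_type.

```python
import math


def cluster_info(centroid, points):
    """Average distance from the centroid to each assigned point, plus the point count."""
    distances = [math.dist(centroid, p) for p in points]
    return {
        "cluster_radius": sum(distances) / len(distances),
        "cluster_size": len(points),
    }


info = cluster_info(centroid=(0.0, 0.0), points=[(3.0, 4.0), (0.0, 1.0), (0.0, 0.0)])
```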

message Model.TrainingRun.TrainingOptions

Used in: TrainingRun

int64 max_iterations = 1
The maximum number of iterations in training. Used only for iterative training algorithms.
LossType loss_type = 2
Type of loss function used during training run.
double learn_rate = 3
Learning rate in training. Used only for iterative training algorithms.
optional protobuf.DoubleValue l1_regularization = 4
L1 regularization coefficient.
optional protobuf.DoubleValue l2_regularization = 5
L2 regularization coefficient.
optional protobuf.DoubleValue min_relative_progress = 6
When early_stop is true, stops training when accuracy improvement is less than 'min_relative_progress'. Used only for iterative training algorithms.
optional protobuf.BoolValue warm_start = 7
Whether to train a model from the last checkpoint.
optional protobuf.BoolValue early_stop = 8
Whether to stop early when the loss doesn't improve significantly any more (compared to min_relative_progress). Used only for iterative training algorithms.
repeated string input_label_columns = 9
Name of input label columns in training data.
DataSplitMethod data_split_method = 10
The data split type for training and evaluation, e.g. RANDOM.
double data_split_eval_fraction = 11
The fraction of evaluation data over the whole input data. The rest of data will be used as training data. The format should be double. Accurate to two decimal places. Default value is 0.2.
string data_split_column = 12
The column to split data with. This column won't be used as a feature. 1. When data_split_method is CUSTOM, the corresponding column should be boolean. The rows with true value tag are eval data, and the false are training data. 2. When data_split_method is SEQUENTIAL, the first data_split_eval_fraction rows (from smallest to largest) in the corresponding column are used as training data, and the rest are eval data. It respects the order in Orderable data types: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#data-type-properties
LearnRateStrategy learn_rate_strategy = 13
The strategy to determine learn rate for the current iteration.
double initial_learn_rate = 16
Specifies the initial learning rate for the line search learn rate strategy.
map<string, double> label_class_weights = 17
Weights associated with each label class, for rebalancing the training data. Only applicable for classification models.
DistanceType distance_type = 20
Distance type for clustering models.
int64 num_clusters = 21
Number of clusters for clustering models.
string model_uri = 22
[Beta] Google Cloud Storage URI from which the model was imported. Only applicable for imported models.
OptimizationStrategy optimization_strategy = 23
Optimization strategy for training linear regression models.
KmeansEnums.KmeansInitializationMethod kmeans_initialization_method = 33
The method used to initialize the centroids for the k-means algorithm.
string kmeans_initialization_column = 34
The column used to provide the initial centroids for the k-means algorithm when kmeans_initialization_method is CUSTOM.
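The interaction between max_iterations, early_stop and min_relative_progress can be sketched as a loop over per-iteration losses. The exact progress formula is not given by the source. This sketch assumes relative loss improvement, (previous - current) / previous, and replays a precomputed loss sequence instead of training anything.

```python
def iterations_run(losses, max_iterations, min_relative_progress, early_stop=True):
    """Return how many iterations run before a stop condition is met."""
    previous = None
    iterations = 0
    for loss in losses[:max_iterations]:  # never exceeds max_iterations
        iterations += 1
        if early_stop and previous is not None and previous > 0:
            # Assumed progress measure: relative improvement of the loss.
            if (previous - loss) / previous < min_relative_progress:
                break
        previous = loss
    return iterations


losses = [1.0, 0.5, 0.4, 0.398, 0.397]

stopped_early = iterations_run(losses, max_iterations=20, min_relative_progress=0.01)
ran_to_end = iterations_run(losses, 20, 0.01, early_stop=False)
capped = iterations_run(losses, max_iterations=3, min_relative_progress=0.01)
```

With these numbers the fourth iteration improves the loss by only 0.5%, below the 1% threshold, so training stops there.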

message ModelReference

Id path of a model.

Used in: Model

string project_id = 1
Required. The ID of the project containing this model.
string dataset_id = 2
Required. The ID of the dataset containing this model.
string model_id = 3
Required. The ID of the model. The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.
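The model_id constraints above translate into a short client-side check. This is a minimal sketch; rejecting the empty string is an assumption that follows from the field being required.

```python
import re

# Letters, digits and underscores only; between 1 and 1,024 characters.
_MODEL_ID = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_model_id(model_id: str) -> bool:
    """Check a model ID against the documented character set and length limit."""
    return _MODEL_ID.fullmatch(model_id) is not None
```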

message StandardSqlDataType

The type of a variable, e.g., a function argument. Examples:

INT64: {type_kind="INT64"}
ARRAY<STRING>: {type_kind="ARRAY", array_element_type="STRING"}
STRUCT<x STRING, y ARRAY<DATE>>: {type_kind="STRUCT", struct_type={fields=[ {name="x", type={type_kind="STRING"}}, {name="y", type={type_kind="ARRAY", array_element_type="DATE"}} ]}}

Used in: StandardSqlField

StandardSqlDataType.TypeKind type_kind = 1
Required. The top level type of this field. Can be any standard SQL data type (e.g., "INT64", "DATE", "ARRAY").
oneof sub_type
- StandardSqlDataType array_element_type = 2
  The type of the array's elements, if type_kind = "ARRAY".
- StandardSqlStructType struct_type = 3
  The fields of this struct, in order, if type_kind = "STRUCT".
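Because array_element_type and struct_type nest recursively, a type description can be walked with a small recursive function. The sketch below renders a dict-shaped type back to its SQL spelling. The dict representation is an assumption for illustration, and array_element_type is written as a nested type dict rather than the bare-string shorthand used in the examples for this message.

```python
def render_type(t: dict) -> str:
    """Render a StandardSqlDataType-shaped dict as a SQL type string."""
    kind = t["type_kind"]
    if kind == "ARRAY":
        return f"ARRAY<{render_type(t['array_element_type'])}>"
    if kind == "STRUCT":
        parts = []
        for field in t["struct_type"]["fields"]:
            rendered = render_type(field["type"])
            # Struct field names may be absent.
            name = field.get("name")
            parts.append(f"{name} {rendered}" if name else rendered)
        return f"STRUCT<{', '.join(parts)}>"
    return kind


struct_type = {
    "type_kind": "STRUCT",
    "struct_type": {
        "fields": [
            {"name": "x", "type": {"type_kind": "STRING"}},
            {
                "name": "y",
                "type": {
                    "type_kind": "ARRAY",
                    "array_element_type": {"type_kind": "DATE"},
                },
            },
        ]
    },
}

rendered = render_type(struct_type)
```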

enum StandardSqlDataType.TypeKind

Used in: StandardSqlDataType

TYPE_KIND_UNSPECIFIED = 0
Invalid type.
INT64 = 2
Encoded as a string in decimal format.
BOOL = 5
Encoded as a boolean "false" or "true".
FLOAT64 = 7
Encoded as a number, or string "NaN", "Infinity" or "-Infinity".
STRING = 8
Encoded as a string value.
BYTES = 9
Encoded as a base64 string per RFC 4648, section 4.
TIMESTAMP = 19
Encoded as an RFC 3339 timestamp with mandatory "Z" time zone string: 1985-04-12T23:20:50.52Z
DATE = 10
Encoded as RFC 3339 full-date format string: 1985-04-12
TIME = 20
Encoded as RFC 3339 partial-time format string: 23:20:50.52
DATETIME = 21
Encoded as RFC 3339 full-date "T" partial-time: 1985-04-12T23:20:50.52
GEOGRAPHY = 22
Encoded as WKT.
NUMERIC = 23
Encoded as a decimal string.
ARRAY = 16
Encoded as a list with types matching Type.array_type.
STRUCT = 17
Encoded as a list with fields of type Type.struct_type[i]. List is used because a JSON object cannot have duplicate field names.
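A few of the scalar encodings above are easy to get wrong in client code, notably INT64 as a decimal string and the non-finite FLOAT64 values. The sketch below covers a subset of type kinds only. `encode_value` is a hypothetical helper and the remaining kinds are deliberately left out.

```python
import base64
import math
from datetime import date


def encode_value(type_kind: str, value):
    """Encode a Python value for a subset of the type kinds listed above."""
    if type_kind == "INT64":
        return str(value)  # decimal string, so large values keep full precision
    if type_kind == "FLOAT64":
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        return value  # finite values stay numbers
    if type_kind == "STRING":
        return value
    if type_kind == "BYTES":
        return base64.b64encode(value).decode("ascii")
    if type_kind == "DATE":
        return value.isoformat()  # RFC 3339 full-date
    raise ValueError(f"type kind not covered by this sketch: {type_kind}")
```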

message StandardSqlField

A field or a column.

Used in: Model, StandardSqlStructType

string name = 1
Optional. The name of this field. Can be absent for struct fields.
optional StandardSqlDataType type = 2
Optional. The type of this parameter. Absent if not explicitly specified (e.g., CREATE FUNCTION statement can omit the return type; in this case the output parameter does not have this "type" field).

message StandardSqlStructType

Used in: StandardSqlDataType

repeated StandardSqlField fields = 1

package google.cloud.bigquery.v2

service ModelService

rpc GetModel (GetModelRequest, Model)

message GetModelRequest

string project_id = 1

string dataset_id = 2

string model_id = 3

rpc ListModels (ListModelsRequest, ListModelsResponse)

message ListModelsRequest

string project_id = 1

string dataset_id = 2

optional protobuf.UInt32Value max_results = 3

string page_token = 4

message ListModelsResponse

repeated Model models = 1

string next_page_token = 2

rpc PatchModel (PatchModelRequest, Model)

message PatchModelRequest

string project_id = 1

string dataset_id = 2

string model_id = 3

optional Model model = 4

rpc DeleteModel (DeleteModelRequest, protobuf.Empty)

message DeleteModelRequest

string project_id = 1

string dataset_id = 2

string model_id = 3

message EncryptionConfiguration

optional protobuf.StringValue kms_key_name = 1

message Model

string etag = 1

optional ModelReference model_reference = 2

int64 creation_time = 5

int64 last_modified_time = 6

string description = 12

string friendly_name = 14

map<string, string> labels = 15

int64 expiration_time = 16

string location = 13

optional EncryptionConfiguration encryption_configuration = 17

Model.ModelType model_type = 7

repeated Model.TrainingRun training_runs = 9

repeated StandardSqlField feature_columns = 10

repeated StandardSqlField label_columns = 11

message Model.AggregateClassificationMetrics

optional protobuf.DoubleValue precision = 1

optional protobuf.DoubleValue recall = 2

optional protobuf.DoubleValue accuracy = 3

optional protobuf.DoubleValue threshold = 4

optional protobuf.DoubleValue f1_score = 5

optional protobuf.DoubleValue log_loss = 6

optional protobuf.DoubleValue roc_auc = 7

message Model.BinaryClassificationMetrics

optional AggregateClassificationMetrics aggregate_classification_metrics = 1

repeated BinaryClassificationMetrics.BinaryConfusionMatrix binary_confusion_matrix_list = 2

string positive_label = 3

string negative_label = 4

message Model.BinaryClassificationMetrics.BinaryConfusionMatrix

optional protobuf.DoubleValue positive_class_threshold = 1

optional protobuf.Int64Value true_positives = 2

optional protobuf.Int64Value false_positives = 3

optional protobuf.Int64Value true_negatives = 4

optional protobuf.Int64Value false_negatives = 5

optional protobuf.DoubleValue precision = 6

optional protobuf.DoubleValue recall = 7

optional protobuf.DoubleValue f1_score = 8

optional protobuf.DoubleValue accuracy = 9

message Model.ClusteringMetrics

optional protobuf.DoubleValue davies_bouldin_index = 1

optional protobuf.DoubleValue mean_squared_distance = 2

repeated ClusteringMetrics.Cluster clusters = 3

message Model.ClusteringMetrics.Cluster

int64 centroid_id = 1

repeated Cluster.FeatureValue feature_values = 2

optional protobuf.Int64Value count = 3

message Model.ClusteringMetrics.Cluster.FeatureValue

string feature_column = 1

oneof value

protobuf.DoubleValue numerical_value = 2

FeatureValue.CategoricalValue categorical_value = 3