The Cloud Data Loss Prevention (DLP) API is a service that allows clients to detect the presence of Personally Identifiable Information (PII) and other privacy-sensitive data in user-supplied, unstructured data streams, like text blocks or images. The service also includes methods for sensitive data redaction and scheduling of data scans on Google Cloud Platform based data sets. To learn more about concepts and find how-to guides see https://cloud.google.com/dlp/docs/.
Finds potentially sensitive info in content. This method has limits on input size, processing time, and output size. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated. For how-to guides, see https://cloud.google.com/dlp/docs/inspecting-images and https://cloud.google.com/dlp/docs/inspecting-text.
Request to search for potentially sensitive info in a ContentItem.
The parent resource name, for example projects/my-project-id.
Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.
The item to inspect.
Optional template to use. Any configuration directly specified in inspect_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged.
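The merge rules described above (singular fields replaced, repeated fields appended, sub-messages merged recursively) can be sketched with plain dictionaries. This is only an illustration of the stated rules, not the service's implementation: the function name `merge_inspect_config` is hypothetical, and the keys used in the sample data are illustrative placeholders.

```python
from copy import deepcopy


def merge_inspect_config(template: dict, request: dict) -> dict:
    """Merge request-level settings over template settings.

    dict  -> treated as a singular sub-message, merged recursively
    list  -> treated as a repeated field, request items are appended
    other -> treated as a singular field, request value replaces the template value
    """
    merged = deepcopy(template)
    for key, value in request.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_inspect_config(merged[key], value)
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            merged[key] = merged[key] + deepcopy(value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative data only; the keys are assumptions, not taken from this page.
template = {
    "min_likelihood": "POSSIBLE",
    "info_types": [{"name": "EMAIL_ADDRESS"}],
    "limits": {"max_findings_per_item": 10, "max_findings_per_request": 100},
}
request = {
    "min_likelihood": "LIKELY",
    "info_types": [{"name": "PHONE_NUMBER"}],
    "limits": {"max_findings_per_item": 5},
}
merged = merge_inspect_config(template, request)
```

In this sketch the request's `min_likelihood` wins, both infoTypes end up in the list, and only the one limit set in the request is overridden.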
Results of inspecting an item.
The findings.
Redacts potentially sensitive info from an image. This method has limits on input size, processing time, and output size. See https://cloud.google.com/dlp/docs/redacting-sensitive-data-images to learn more. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
Request to search for potentially sensitive info in an image and redact it by covering it with a colored rectangle.
The parent resource name, for example projects/my-project-id.
Configuration for the inspector.
The configuration for specifying what content to redact from images.
Whether the response should include findings along with the redacted image.
The content must be PNG, JPEG, SVG or BMP.
Results of redacting an image.
The redacted image. The type will be the same as the original image.
If an image was being inspected and the InspectConfig's include_quote was set to true, then this field will include all text, if any, that was found in the image.
The findings. Populated when include_findings in the request is true.
De-identifies potentially sensitive info from a ContentItem. This method has limits on input size and output size. See https://cloud.google.com/dlp/docs/deidentify-sensitive-data to learn more. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
Request to de-identify a list of items.
The parent resource name, for example projects/my-project-id.
Configuration for the de-identification of the content item. Items specified here will override the template referenced by the deidentify_template_name argument.
Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.
The item to de-identify. Will be treated as text.
Optional template to use. Any configuration directly specified in inspect_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged.
Optional template to use. Any configuration directly specified in deidentify_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged.
Results of de-identifying a ContentItem.
The de-identified item.
An overview of the changes that were made on the `item`.
Re-identifies content that has been de-identified. See https://cloud.google.com/dlp/docs/pseudonymization#re-identification_in_free_text_code_example to learn more.
Request to re-identify an item.
The parent resource name.
Configuration for the re-identification of the content item. This field shares the same proto message type that is used for de-identification, however its usage here is for the reversal of the previous de-identification. Re-identification is performed by examining the transformations used to de-identify the items and executing the reverse. This requires that only reversible transformations be provided here. The reversible transformations are: - `CryptoReplaceFfxFpeConfig`
Configuration for the inspector.
The item to re-identify. Will be treated as text.
Optional template to use. Any configuration directly specified in `inspect_config` will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged.
Optional template to use. References an instance of `DeidentifyTemplate`. Any configuration directly specified in `reidentify_config` or `inspect_config` will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged.
Results of re-identifying an item.
The re-identified item.
An overview of the changes that were made to the `item`.
Returns a list of the sensitive information types that the DLP API supports. See https://cloud.google.com/dlp/docs/infotypes-reference to learn more.
Request for the list of infoTypes.
Optional BCP-47 language code for localized infoType friendly names. If omitted, or if localized strings are not available, en-US strings will be returned.
Optional filter to only return infoTypes supported by certain parts of the API. Defaults to supported_by=INSPECT.
Response to the ListInfoTypes request.
Set of sensitive infoTypes.
Creates an InspectTemplate for re-using frequently used configuration for inspecting content, images, and storage. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
Request message for CreateInspectTemplate.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
The InspectTemplate to create.
The template id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: `[a-zA-Z\d-_]+`. The maximum length is 100 characters. Can be empty to allow the system to generate one.
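A client can pre-check an id against these constraints before sending a request. The following is a minimal sketch; the helper name `is_valid_resource_id` is hypothetical. Python's `re` module rejects `\d-_` inside a character class as a bad range, so the hyphen is escaped in this adaptation; the accepted characters (letters, digits, hyphen, underscore) are unchanged.

```python
import re

# Adapted from the documented pattern: the hyphen is escaped because Python's
# re module does not accept "\d-_" inside a character class.
# re.ASCII restricts \d to the digits 0-9.
_ID_PATTERN = re.compile(r"[a-zA-Z\d\-_]+", re.ASCII)
_MAX_ID_LENGTH = 100


def is_valid_resource_id(resource_id: str) -> bool:
    """Return True if the id is empty or satisfies the documented constraints."""
    if resource_id == "":
        return True  # an empty id lets the system generate one
    if len(resource_id) > _MAX_ID_LENGTH:
        return False
    return _ID_PATTERN.fullmatch(resource_id) is not None
```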
Updates the InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
Request message for UpdateInspectTemplate.
Resource name of organization and inspectTemplate to be updated, for example `organizations/433245324/inspectTemplates/432452342` or projects/project-id/inspectTemplates/432452342.
New InspectTemplate value.
Mask to control which fields get updated.
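As a rough illustration of how an update mask limits which fields change, the sketch below applies a list of dot-separated paths to dictionaries. The path syntax, the clearing of fields that are absent from the new value, and the name `apply_update_mask` are all assumptions made for illustration; this page does not specify them.

```python
from copy import deepcopy


def apply_update_mask(current: dict, new: dict, paths: list[str]) -> dict:
    """Copy only the masked paths from `new` onto a copy of `current`."""
    updated = deepcopy(current)
    for path in paths:
        keys = path.split(".")
        src, dst = new, updated
        for key in keys[:-1]:
            src = src.get(key, {})
            dst = dst.setdefault(key, {})
        if keys[-1] in src:
            dst[keys[-1]] = deepcopy(src[keys[-1]])
        else:
            dst.pop(keys[-1], None)  # assumed: a masked field missing from `new` is cleared
    return updated


current = {
    "display_name": "old",
    "description": "keep",
    "inspect_config": {"min_likelihood": "POSSIBLE", "include_quote": True},
}
new = {
    "display_name": "new",
    "description": "ignored",
    "inspect_config": {"min_likelihood": "LIKELY"},
}
result = apply_update_mask(current, new, ["display_name", "inspect_config.min_likelihood"])
```

Fields outside the mask (`description`, `inspect_config.include_quote`) keep their current values in this sketch.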
Gets an InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
Request message for GetInspectTemplate.
Resource name of the organization and inspectTemplate to be read, for example `organizations/433245324/inspectTemplates/432452342` or projects/project-id/inspectTemplates/432452342.
Lists InspectTemplates. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
Request message for ListInspectTemplates.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
Optional page token to continue retrieval. Comes from previous call to `ListInspectTemplates`.
Optional size of the page, can be limited by the server. If zero, the server returns a page of max size 100.
Optional comma separated list of fields to order by, followed by `asc` or `desc` postfix. This list is case-insensitive, default sorting order is ascending, redundant space characters are insignificant. Example: `name asc,update_time, create_time desc` Supported fields are: - `create_time`: corresponds to time the template was created. - `update_time`: corresponds to time the template was last updated. - `name`: corresponds to template's name. - `display_name`: corresponds to template's display name.
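To make the `order_by` rules concrete (case-insensitive, ascending by default, extra spaces ignored), here is a small client-side parser. It is a sketch for building or checking the string before a call; the function name `parse_order_by` is hypothetical and the server's own parsing may differ.

```python
SUPPORTED_FIELDS = {"create_time", "update_time", "name", "display_name"}


def parse_order_by(order_by: str) -> list[tuple[str, str]]:
    """Parse an order_by string into (field, direction) pairs."""
    clauses = []
    for raw in order_by.split(","):
        parts = raw.lower().split()  # case-insensitive; redundant spaces are dropped
        if not parts:
            continue
        if len(parts) > 2:
            raise ValueError(f"unexpected clause: {raw!r}")
        field = parts[0]
        direction = parts[1] if len(parts) == 2 else "asc"  # ascending by default
        if field not in SUPPORTED_FIELDS:
            raise ValueError(f"unsupported field: {field!r}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"unsupported direction: {direction!r}")
        clauses.append((field, direction))
    return clauses


parsed = parse_order_by("name asc,update_time, create_time desc")
```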
Response message for ListInspectTemplates.
List of inspectTemplates, up to page_size in ListInspectTemplatesRequest.
If the next page is available then the next page token to be used in following ListInspectTemplates request.
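The page token flow can be sketched as a loop that feeds each returned token into the next call. The callable `list_page` stands in for a real ListInspectTemplates call and is hypothetical, as is the in-memory fake used here; treating an empty token as "no more pages" is an assumption.

```python
from typing import Callable, Iterator, List, Tuple

# (items, next_page_token); an empty token is assumed to mean there are no more pages.
Page = Tuple[List[str], str]


def iterate_all(list_page: Callable[[str], Page]) -> Iterator[str]:
    """Yield every item by following next-page tokens until none is returned."""
    token = ""
    while True:
        items, token = list_page(token)
        yield from items
        if not token:
            break


# Hypothetical in-memory stand-in for the list call.
_FAKE_PAGES = {"": (["t1", "t2"], "p2"), "p2": (["t3"], "")}


def fake_list_page(page_token: str) -> Page:
    return _FAKE_PAGES[page_token]


all_templates = list(iterate_all(fake_list_page))
```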
Deletes an InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
Request message for DeleteInspectTemplate.
Resource name of the organization and inspectTemplate to be deleted, for example `organizations/433245324/inspectTemplates/432452342` or projects/project-id/inspectTemplates/432452342.
Creates a DeidentifyTemplate for re-using frequently used configuration for de-identifying content, images, and storage. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
Request message for CreateDeidentifyTemplate.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
The DeidentifyTemplate to create.
The template id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: `[a-zA-Z\d-_]+`. The maximum length is 100 characters. Can be empty to allow the system to generate one.
Updates the DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
Request message for UpdateDeidentifyTemplate.
Resource name of organization and deidentify template to be updated, for example `organizations/433245324/deidentifyTemplates/432452342` or projects/project-id/deidentifyTemplates/432452342.
New DeidentifyTemplate value.
Mask to control which fields get updated.
Gets a DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
Request message for GetDeidentifyTemplate.
Resource name of the organization and deidentify template to be read, for example `organizations/433245324/deidentifyTemplates/432452342` or projects/project-id/deidentifyTemplates/432452342.
Lists DeidentifyTemplates. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
Request message for ListDeidentifyTemplates.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
Optional page token to continue retrieval. Comes from previous call to `ListDeidentifyTemplates`.
Optional size of the page, can be limited by the server. If zero, the server returns a page of max size 100.
Optional comma separated list of fields to order by, followed by `asc` or `desc` postfix. This list is case-insensitive, default sorting order is ascending, redundant space characters are insignificant. Example: `name asc,update_time, create_time desc` Supported fields are: - `create_time`: corresponds to time the template was created. - `update_time`: corresponds to time the template was last updated. - `name`: corresponds to template's name. - `display_name`: corresponds to template's display name.
Response message for ListDeidentifyTemplates.
List of deidentify templates, up to page_size in ListDeidentifyTemplatesRequest.
If the next page is available then the next page token to be used in following ListDeidentifyTemplates request.
Deletes a DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
Request message for DeleteDeidentifyTemplate.
Resource name of the organization and deidentify template to be deleted, for example `organizations/433245324/deidentifyTemplates/432452342` or projects/project-id/deidentifyTemplates/432452342.
Creates a job trigger to run DLP actions such as scanning storage for sensitive information on a set schedule. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Request message for CreateJobTrigger.
The parent resource name, for example projects/my-project-id.
The JobTrigger to create.
The trigger id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: `[a-zA-Z\d-_]+`. The maximum length is 100 characters. Can be empty to allow the system to generate one.
Updates a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Request message for UpdateJobTrigger.
Resource name of the project and the triggeredJob, for example `projects/dlp-test-project/jobTriggers/53234423`.
New JobTrigger value.
Mask to control which fields get updated.
Gets a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Request message for GetJobTrigger.
Resource name of the project and the triggeredJob, for example `projects/dlp-test-project/jobTriggers/53234423`.
Lists job triggers. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Request message for ListJobTriggers.
The parent resource name, for example `projects/my-project-id`.
Optional page token to continue retrieval. Comes from previous call to ListJobTriggers. `order_by` field must not change for subsequent calls.
Optional size of the page, can be limited by a server.
Optional comma separated list of triggeredJob fields to order by, followed by `asc` or `desc` postfix. This list is case-insensitive, default sorting order is ascending, redundant space characters are insignificant. Example: `name asc,update_time, create_time desc` Supported fields are: - `create_time`: corresponds to time the JobTrigger was created. - `update_time`: corresponds to time the JobTrigger was last updated. - `last_run_time`: corresponds to the last time the JobTrigger ran. - `name`: corresponds to JobTrigger's name. - `display_name`: corresponds to JobTrigger's display name. - `status`: corresponds to JobTrigger's status.
Optional. Allows filtering. Supported syntax:
* Filter expressions are made up of one or more restrictions.
* Restrictions can be combined by `AND` or `OR` logical operators. A sequence of restrictions implicitly uses `AND`.
* A restriction has the form of `<field> <operator> <value>`.
* Supported fields/values for inspect jobs:
  - `status` - HEALTHY|PAUSED|CANCELLED
  - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY
  - `last_run_time` - RFC 3339 formatted timestamp, surrounded by quotation marks. Nanoseconds are ignored.
  - `error_count` - Number of errors that have occurred while running.
* The operator must be `=` or `!=` for status and inspected_storage.
Examples:
* inspected_storage = cloud_storage AND status = HEALTHY
* inspected_storage = cloud_storage OR inspected_storage = bigquery
* inspected_storage = cloud_storage AND (status = PAUSED OR status = HEALTHY)
* last_run_time > "2017-12-12T00:00:00+00:00"
The length of this field should be no more than 500 characters.
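A filter string of this shape can be assembled from small helpers and checked against the 500-character limit before the request is sent. This is a sketch only; the helper names (`restriction`, `all_of`, `any_of`, `checked`) are hypothetical and nothing here validates field names or values.

```python
MAX_FILTER_LENGTH = 500


def restriction(field: str, operator: str, value: str) -> str:
    """Build one `<field> <operator> <value>` restriction."""
    return f"{field} {operator} {value}"


def all_of(*restrictions: str) -> str:
    """Combine restrictions with AND."""
    return " AND ".join(restrictions)


def any_of(*restrictions: str) -> str:
    """Combine restrictions with OR, parenthesized so it can be nested."""
    return "(" + " OR ".join(restrictions) + ")"


def checked(filter_expr: str) -> str:
    """Reject filters longer than the documented limit."""
    if len(filter_expr) > MAX_FILTER_LENGTH:
        raise ValueError("filter exceeds 500 characters")
    return filter_expr


trigger_filter = checked(all_of(
    restriction("inspected_storage", "=", "cloud_storage"),
    any_of(
        restriction("status", "=", "PAUSED"),
        restriction("status", "=", "HEALTHY"),
    ),
))
# Timestamps are wrapped in quotation marks, as the syntax requires.
time_filter = checked(restriction("last_run_time", ">", '"2017-12-12T00:00:00+00:00"'))
```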
Response message for ListJobTriggers.
List of triggeredJobs, up to page_size in ListJobTriggersRequest.
If the next page is available then the next page token to be used in following ListJobTriggers request.
Deletes a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Request message for DeleteJobTrigger.
Resource name of the project and the triggeredJob, for example `projects/dlp-test-project/jobTriggers/53234423`.
Activates a job trigger. Causes the immediate execution of a trigger instead of waiting for the trigger event to occur.
Request message for ActivateJobTrigger.
Resource name of the trigger to activate, for example `projects/dlp-test-project/jobTriggers/53234423`.
Creates a new job to inspect storage or calculate risk metrics. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. When no InfoTypes or CustomInfoTypes are specified in inspect jobs, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
Request message for CreateDlpJob. Used to initiate long running jobs such as calculating risk metrics or inspecting Google Cloud Storage.
The parent resource name, for example projects/my-project-id.
The configuration details for the specific type of job to run.
The job id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: `[a-zA-Z\d-_]+`. The maximum length is 100 characters. Can be empty to allow the system to generate one.
Lists DlpJobs that match the specified filter in the request. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
The request message for listing DLP jobs.
The parent resource name, for example projects/my-project-id.
Optional. Allows filtering. Supported syntax:
* Filter expressions are made up of one or more restrictions.
* Restrictions can be combined by `AND` or `OR` logical operators. A sequence of restrictions implicitly uses `AND`.
* A restriction has the form of `<field> <operator> <value>`.
* Supported fields/values for inspect jobs:
  - `state` - PENDING|RUNNING|CANCELED|FINISHED|FAILED
  - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY
  - `trigger_name` - The resource name of the trigger that created the job.
  - `end_time` - Corresponds to time the job finished.
  - `start_time` - Corresponds to time the job started.
* Supported fields for risk analysis jobs:
  - `state` - RUNNING|CANCELED|FINISHED|FAILED
  - `end_time` - Corresponds to time the job finished.
  - `start_time` - Corresponds to time the job started.
* The operator must be `=` or `!=`.
Examples:
* inspected_storage = cloud_storage AND state = done
* inspected_storage = cloud_storage OR inspected_storage = bigquery
* inspected_storage = cloud_storage AND (state = done OR state = canceled)
* end_time > "2017-12-12T00:00:00+00:00"
The length of this field should be no more than 500 characters.
The standard list page size.
The standard list page token.
The type of job. Defaults to `DlpJobType.INSPECT`
Optional comma separated list of fields to order by, followed by `asc` or `desc` postfix. This list is case-insensitive, default sorting order is ascending, redundant space characters are insignificant. Example: `name asc, end_time asc, create_time desc` Supported fields are: - `create_time`: corresponds to time the job was created. - `end_time`: corresponds to time the job ended. - `name`: corresponds to job's name. - `state`: corresponds to `state`
The response message for listing DLP jobs.
A list of DlpJobs that matches the specified filter in the request.
The standard List next-page token.
Gets the latest state of a long-running DlpJob. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
The request message for GetDlpJob.
The name of the DlpJob resource.
Deletes a long-running DlpJob. This method indicates that the client is no longer interested in the DlpJob result. The job will be cancelled if possible. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
The request message for deleting a DLP job.
The name of the DlpJob resource to be deleted.
Starts asynchronous cancellation on a long-running DlpJob. The server makes a best effort to cancel the DlpJob, but success is not guaranteed. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
The request message for canceling a DLP job.
The name of the DlpJob resource to be cancelled.
Creates a pre-built stored infoType to be used for inspection. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
Request message for CreateStoredInfoType.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
Configuration of the storedInfoType to create.
The storedInfoType ID can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: `[a-zA-Z\d-_]+`. The maximum length is 100 characters. Can be empty to allow the system to generate one.
Updates the stored infoType by creating a new version. The existing version will continue to be used until the new version is ready. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
Request message for UpdateStoredInfoType.
Resource name of organization and storedInfoType to be updated, for example `organizations/433245324/storedInfoTypes/432452342` or projects/project-id/storedInfoTypes/432452342.
Updated configuration for the storedInfoType. If not provided, a new version of the storedInfoType will be created with the existing configuration.
Mask to control which fields get updated.
Gets a stored infoType. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
Request message for GetStoredInfoType.
Resource name of the organization and storedInfoType to be read, for example `organizations/433245324/storedInfoTypes/432452342` or projects/project-id/storedInfoTypes/432452342.
Lists stored infoTypes. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
Request message for ListStoredInfoTypes.
The parent resource name, for example projects/my-project-id or organizations/my-org-id.
Optional page token to continue retrieval. Comes from previous call to `ListStoredInfoTypes`.
Optional size of the page, can be limited by the server. If zero, the server returns a page of max size 100.
Optional comma separated list of fields to order by, followed by `asc` or `desc` postfix. This list is case-insensitive, default sorting order is ascending, redundant space characters are insignificant. Example: `name asc, display_name, create_time desc` Supported fields are: - `create_time`: corresponds to time the most recent version of the resource was created. - `state`: corresponds to the state of the resource. - `name`: corresponds to resource name. - `display_name`: corresponds to info type's display name.
Response message for ListStoredInfoTypes.
List of storedInfoTypes, up to page_size in ListStoredInfoTypesRequest.
If the next page is available then the next page token to be used in following ListStoredInfoTypes request.
Deletes a stored infoType. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
Request message for DeleteStoredInfoType.
Resource name of the organization and storedInfoType to be deleted, for example `organizations/433245324/storedInfoTypes/432452342` or projects/project-id/storedInfoTypes/432452342.
A task to execute on the completion of a job. See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
Save resulting findings in a provided location.
Publish a notification to a pubsub topic.
Publish summary to Cloud Security Command Center (Alpha).
Publish findings to Cloud Data Catalog.
Enable email notification to project owners and editors on job's completion/failure.
Enable email notification to project owners and editors on job's completion/failure.
(message has no fields)
Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the results of the DlpJob will be applied to the entry for the resource scanned in Cloud Data Catalog. Any labels previously written by another DlpJob will be deleted. InfoType naming patterns are strictly enforced when using this feature. Note that the findings will be persisted in Cloud Data Catalog storage and are governed by Data Catalog service-specific policy; see https://cloud.google.com/terms/service-terms. Only a single instance of this action can be specified, and it is only allowed if all resources being scanned are BigQuery tables. Compatible with: Inspect
(message has no fields)
Publish the result summary of a DlpJob to the Cloud Security Command Center (CSCC Alpha). This action is only available for projects that are part of an organization and whitelisted for the alpha Cloud Security Command Center. The action will publish the count of finding instances and their info types. The summary of findings will be persisted in CSCC and is governed by CSCC service-specific policy; see https://cloud.google.com/terms/service-terms. Only a single instance of this action can be specified. Compatible with: Inspect
(message has no fields)
Publish a message into given Pub/Sub topic when DlpJob has completed. The message contains a single field, `DlpJobName`, which is equal to the finished job's [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). Compatible with: Inspect, Risk
Cloud Pub/Sub topic to send notifications to. The topic must have given publishing access rights to the DLP API service account executing the long running DlpJob sending the notifications. Format is projects/{project}/topics/{topic}.
If set, the detailed findings will be persisted to the specified OutputStorageConfig. Only a single instance of this action can be specified. Compatible with: Inspect, Risk
Result of a risk analysis operation request.
Privacy metric to compute.
Input dataset to compute metrics over.
Values associated with this metric.
Result of the categorical stats computation.
Histogram of value frequencies in the column.
Lower bound on the value frequency of the values in this bucket.
Upper bound on the value frequency of the values in this bucket.
Total number of values in this bucket.
Sample of value frequencies in this bucket. The total number of values returned per bucket is capped at 20.
Total number of distinct values in this bucket.
Result of the δ-presence computation. Note that these results are an estimation, not exact values.
The intervals [min_probability, max_probability) do not overlap. If a value doesn't correspond to any such interval, the associated frequency is zero. For example, the following records: {min_probability: 0, max_probability: 0.1, frequency: 17} {min_probability: 0.2, max_probability: 0.3, frequency: 42} {min_probability: 0.3, max_probability: 0.4, frequency: 99} mean that there are no records with an estimated probability in [0.1, 0.2) or greater than or equal to 0.4.
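The half-open interval rule can be checked with a small lookup over the three example buckets. This is an illustrative sketch using plain dictionaries; the function name `frequency_for` is hypothetical.

```python
buckets = [
    {"min_probability": 0.0, "max_probability": 0.1, "frequency": 17},
    {"min_probability": 0.2, "max_probability": 0.3, "frequency": 42},
    {"min_probability": 0.3, "max_probability": 0.4, "frequency": 99},
]


def frequency_for(probability: float, histogram: list) -> int:
    """Return the frequency of the bucket whose [min, max) interval holds the value."""
    for bucket in histogram:
        if bucket["min_probability"] <= probability < bucket["max_probability"]:
            return bucket["frequency"]
    return 0  # no interval covers the value, so the frequency is zero


in_first_bucket = frequency_for(0.05, buckets)
in_gap = frequency_for(0.15, buckets)
at_upper_edge = frequency_for(0.4, buckets)
```

Because the upper bound is exclusive, 0.3 falls in the third bucket rather than the second, and 0.4 falls in no bucket.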
A DeltaPresenceEstimationHistogramBucket message with the following values: min_probability: 0.1 max_probability: 0.2 frequency: 42 means that there are 42 records for which δ is in [0.1, 0.2). An important particular case is when min_probability = max_probability = 1: then, every individual who shares this quasi-identifier combination is in the dataset.
Between 0 and 1.
Always greater than or equal to min_probability.
Number of records within these probability bounds.
Sample of quasi-identifier tuple values in this bucket. The total number of classes returned per bucket is capped at 20.
Total number of distinct quasi-identifier tuple values in this bucket.
A tuple of values for the quasi-identifier columns.
The quasi-identifier values.
The estimated probability that a given individual sharing these quasi-identifier values is in the dataset. This value, typically called δ, is the ratio between the number of records in the dataset with these quasi-identifier values, and the total number of individuals (inside *and* outside the dataset) with these quasi-identifier values. For example, if there are 15 individuals in the dataset who share the same quasi-identifier values, and an estimated 100 people in the entire population with these values, then δ is 0.15.
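The ratio in the example above can be written out directly. This is a minimal sketch of the arithmetic only; the function name `delta_presence` is hypothetical, and how the service estimates the population count is not covered here.

```python
def delta_presence(records_in_dataset: int, estimated_population: int) -> float:
    """Ratio of dataset records to all individuals sharing the quasi-identifier values."""
    if estimated_population <= 0:
        raise ValueError("estimated_population must be positive")
    return records_in_dataset / estimated_population


# 15 individuals in the dataset, an estimated 100 in the whole population.
delta = delta_presence(15, 100)
```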
Result of the k-anonymity computation.
Histogram of k-anonymity equivalence classes.
The set of columns' values that share the same k-anonymity value.
Set of values defining the equivalence class. One value per quasi-identifier column in the original KAnonymity metric message. The order is always the same as the original request.
Size of the equivalence class, that is, the number of rows with the above set of values.
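Equivalence classes and their sizes can be illustrated by grouping rows on their quasi-identifier values. The sketch below works on an in-memory list of dictionaries with made-up column names; it shows the concept, not how the service computes the result.

```python
from collections import Counter


def equivalence_classes(rows: list, quasi_id_columns: list) -> Counter:
    """Map each tuple of quasi-identifier values to the number of rows sharing it."""
    return Counter(tuple(row[col] for col in quasi_id_columns) for row in rows)


# Made-up sample data for illustration.
rows = [
    {"zip": "94043", "age": 30, "diagnosis": "flu"},
    {"zip": "94043", "age": 30, "diagnosis": "cold"},
    {"zip": "94043", "age": 41, "diagnosis": "flu"},
]
classes = equivalence_classes(rows, ["zip", "age"])
# The dataset is k-anonymous for k equal to the smallest class size.
k = min(classes.values())
```

The values in each key follow the order of `quasi_id_columns`, mirroring the statement that the order matches the original request.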
Lower bound on the size of the equivalence classes in this bucket.
Upper bound on the size of the equivalence classes in this bucket.
Total number of equivalence classes in this bucket.
Sample of equivalence classes in this bucket. The total number of classes returned per bucket is capped at 20.
Total number of distinct equivalence classes in this bucket.
Result of the reidentifiability analysis. Note that these results are an estimation, not exact values.
The intervals [min_anonymity, max_anonymity] do not overlap. If a value doesn't correspond to any such interval, the associated frequency is zero. For example, the following records: {min_anonymity: 1, max_anonymity: 1, frequency: 17} {min_anonymity: 2, max_anonymity: 3, frequency: 42} {min_anonymity: 5, max_anonymity: 10, frequency: 99} mean that there are no records with an estimated anonymity of 4 or larger than 10.
A KMapEstimationHistogramBucket message with the following values: min_anonymity: 3 max_anonymity: 5 frequency: 42 means that there are 42 records whose quasi-identifier values correspond to 3, 4 or 5 people in the overlying population. An important particular case is when min_anonymity = max_anonymity = 1: the frequency field then corresponds to the number of uniquely identifiable records.
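One way to read such a histogram is to total the records whose estimated anonymity is at or below a threshold. The sketch below does this over example buckets written as dictionaries; the function name is hypothetical, and a bucket that straddles the threshold is deliberately left out because its records cannot be split.

```python
buckets = [
    {"min_anonymity": 1, "max_anonymity": 1, "frequency": 17},
    {"min_anonymity": 2, "max_anonymity": 3, "frequency": 42},
    {"min_anonymity": 5, "max_anonymity": 10, "frequency": 99},
]


def records_with_anonymity_at_most(k: int, histogram: list) -> int:
    """Sum the frequencies of buckets that lie entirely at or below k."""
    return sum(b["frequency"] for b in histogram if b["max_anonymity"] <= k)


# Buckets with min_anonymity = max_anonymity = 1 count uniquely identifiable records.
uniquely_identifiable = records_with_anonymity_at_most(1, buckets)
at_most_three = records_with_anonymity_at_most(3, buckets)
```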
Always positive.
Always greater than or equal to min_anonymity.
Number of records within these anonymity bounds.
Sample of quasi-identifier tuple values in this bucket. The total number of classes returned per bucket is capped at 20.
Total number of distinct quasi-identifier tuple values in this bucket.
A tuple of values for the quasi-identifier columns.
The quasi-identifier values.
The estimated anonymity for these quasi-identifier values.
Result of the l-diversity computation.
Histogram of l-diversity equivalence class sensitive value frequencies.
The set of columns' values that share the same l-diversity value.
Quasi-identifier values defining the k-anonymity equivalence class. The order is always the same as the original request.
Size of the k-anonymity equivalence class.
Number of distinct sensitive values in this equivalence class.
Estimated frequencies of top sensitive values.
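The per-class quantities described above can be illustrated by grouping rows on their quasi-identifiers and counting sensitive values within each group. This sketch uses exact counts on made-up in-memory data, whereas the service reports estimated frequencies; the function name and dictionary keys are chosen for illustration.

```python
from collections import Counter, defaultdict


def l_diversity_classes(rows: list, quasi_id_columns: list, sensitive_column: str) -> dict:
    """Summarize each equivalence class: size, distinct sensitive values, top values."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[col] for col in quasi_id_columns)
        groups[key][row[sensitive_column]] += 1
    return {
        key: {
            "equivalence_class_size": sum(counts.values()),
            "num_distinct_sensitive_values": len(counts),
            "top_sensitive_values": counts.most_common(3),
        }
        for key, counts in groups.items()
    }


# Made-up sample data for illustration.
rows = [
    {"zip": "94043", "age": 30, "diagnosis": "flu"},
    {"zip": "94043", "age": 30, "diagnosis": "flu"},
    {"zip": "94043", "age": 30, "diagnosis": "cold"},
    {"zip": "94043", "age": 41, "diagnosis": "flu"},
]
summary = l_diversity_classes(rows, ["zip", "age"], "diagnosis")
```

A class with only one distinct sensitive value, like the second one here, reveals that value for every row in the class.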
Lower bound on the sensitive value frequencies of the equivalence classes in this bucket.
Upper bound on the sensitive value frequencies of the equivalence classes in this bucket.
Total number of equivalence classes in this bucket.
Sample of equivalence classes in this bucket. The total number of classes returned per bucket is capped at 20.
Total number of distinct equivalence classes in this bucket.
Result of the numerical stats computation.
Minimum value appearing in the column.
Maximum value appearing in the column.
List of 99 values that partition the set of field values into 100 equal sized buckets.
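The idea of 99 cut points that split a column into 100 buckets can be reproduced locally with the standard library. This is a sketch on made-up data; `statistics.quantiles` uses its own interpolation method, which is not necessarily the one the service applies.

```python
import statistics

# Made-up column values for illustration.
values = list(range(1, 1001))

minimum = min(values)
maximum = max(values)
# n=100 asks for the 99 cut points between 100 equal-sized groups.
cut_points = statistics.quantiles(values, n=100)
```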
Message defining a field of a BigQuery table.
Source table of the field.
Designated field in the BigQuery table.
Row key for identifying a record in BigQuery table.
Complete BigQuery table reference.
Absolute number of the row from the beginning of the table at the time of scanning.
Options defining BigQuery table and row identifiers.
Complete BigQuery table reference.
References to fields uniquely identifying rows within the table. Nested fields in a dotted format, such as `person.birthdate.year`, are allowed.
Max number of rows to scan. If the table has more rows than this value, the rest of the rows are omitted. If not set, or if set to 0, all rows will be scanned. Only one of rows_limit and rows_limit_percent can be specified. Cannot be used in conjunction with TimespanConfig.
Max percentage of rows to scan. The rest are omitted. The number of rows scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one of rows_limit and rows_limit_percent can be specified. Cannot be used in conjunction with TimespanConfig.
References to fields excluded from scanning. This allows you to skip inspection of entire columns which you know have no findings.
How to sample rows if not all rows are scanned. Meaningful only when used in conjunction with either rows_limit or rows_limit_percent. If not specified, scanning would start from the top.
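The row-limit rules above can be sketched as a small helper. The function name `rows_to_scan` and its argument handling are hypothetical; only the rounding and "no limit" rules come from the field descriptions.

```python
def rows_to_scan(total_rows: int, rows_limit: int = 0, rows_limit_percent: int = 0) -> int:
    """Hypothetical helper: how many rows would be scanned under the limits."""
    if rows_limit and rows_limit_percent:
        raise ValueError("only one of rows_limit and rows_limit_percent can be specified")
    if not 0 <= rows_limit_percent <= 100:
        raise ValueError("rows_limit_percent must be between 0 and 100")
    if rows_limit > 0:
        # Rows beyond the limit are omitted.
        return min(total_rows, rows_limit)
    if rows_limit_percent in (0, 100):
        # Both 0 and 100 mean no limit.
        return total_rows
    # The number of rows scanned is rounded down.
    return total_rows * rows_limit_percent // 100


print(rows_to_scan(1001, rows_limit_percent=10))
```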
Used in:
Scan from the top (default).
Randomly pick the row to start scanning. The scanned rows are contiguous.
Message defining the location of a BigQuery table. A table is uniquely identified by its project_id, dataset_id, and table_name. Within a query a table is often referenced with a string in the format of: `<project_id>:<dataset_id>.<table_id>` or `<project_id>.<dataset_id>.<table_id>`.
Used in:
The Google Cloud Platform project ID of the project containing the table. If omitted, project ID is inferred from the API call.
Dataset ID of the table.
Name of the table.
Bounding box encompassing detected text within an image.
Used in:
Top coordinate of the bounding box. (0,0) is upper left.
Left coordinate of the bounding box. (0,0) is upper left.
Width of the bounding box in pixels.
Height of the bounding box in pixels.
Generalization function that buckets values based on ranges. The ranges and replacement values are dynamically provided by the user for custom behavior, such as 1-30 -> LOW, 31-65 -> MEDIUM, 66-100 -> HIGH. This can be used on data of type: number, long, string, timestamp. If the bound `Value` type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing. See https://cloud.google.com/dlp/docs/concepts-bucketing to learn more.
Used in:
Set of buckets. Ranges must be non-overlapping.
Bucket is represented as a range, along with replacement values.
Used in:
Lower bound of the range, inclusive. Type should be the same as max if used.
Upper bound of the range, exclusive; type must match min.
Replacement value for this bucket. If not provided the default behavior will be to hyphenate the min-max range.
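A minimal sketch of range-based bucketing under the rules above (inclusive lower bound, exclusive upper bound, hyphenated default replacement). The `Bucket` class and `bucketize` function are illustrative names, and returning `None` for a value outside every bucket is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Bucket:
    min: int                           # inclusive lower bound
    max: int                           # exclusive upper bound
    replacement: Optional[str] = None  # defaults to "min-max" when omitted


def bucketize(value: int, buckets: Sequence[Bucket]) -> Optional[str]:
    for b in buckets:
        if b.min <= value < b.max:
            return b.replacement if b.replacement is not None else f"{b.min}-{b.max}"
    return None  # assumption: no bucket matched


# Exclusive upper bounds, so 1-30 -> LOW is expressed as [1, 31).
buckets = [Bucket(1, 31, "LOW"), Bucket(31, 66, "MEDIUM"), Bucket(66, 101, "HIGH")]
print(bucketize(30, buckets), bucketize(31, buckets), bucketize(100, buckets))
```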
Container for bytes to inspect or redact.
Used in:
The type of data stored in the bytes string. Default will be TEXT_UTF8.
Content data to inspect or redact.
Used in:
Partially mask a string by replacing a given number of characters with a fixed character. Masking can start from the beginning or end of the string. This can be used on data of any type (numbers, longs, and so on) and when de-identifying structured data we'll attempt to preserve the original data's type. (This allows you to take a long like 123 and modify it to a string like **3.)
Used in:
Character to mask the sensitive values—for example, "*" for an alphabetic string such as name, or "0" for a numeric string such as ZIP code or credit card number. String must have length 1. If not supplied, we will default to "*" for strings, 0 for digits.
Number of characters to mask. If not set, all matching chars will be masked. Skipped characters do not count towards this tally.
Mask characters in reverse order. For example, if `masking_character` is '0', `number_to_mask` is 14, and `reverse_order` is false, then 1234-5678-9012-3456 -> 00000000000000-3456. If `masking_character` is '*', `number_to_mask` is 3, and `reverse_order` is true, then 12345 -> 12***.
When masking a string, items in this list will be skipped when replacing. For example, if your string is 555-555-5555 and you ask us to skip `-` and mask 5 chars with * we would produce ***-**5-5555.
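The masking fields can be read together as the following sketch. The `mask` function is a hypothetical stand-in that mirrors the field names; it is not the service's implementation, and it models the characters to skip as a plain string.

```python
def mask(value: str, masking_character: str = "*", number_to_mask: int = 0,
         reverse_order: bool = False, skip: str = "") -> str:
    """Illustrative character masking that follows the field descriptions."""
    chars = list(value)
    if reverse_order:
        indices = range(len(chars) - 1, -1, -1)
    else:
        indices = range(len(chars))
    # If number_to_mask is not set, all matching characters are masked.
    remaining = number_to_mask if number_to_mask > 0 else len(chars)
    for i in indices:
        if remaining == 0:
            break
        if chars[i] in skip:
            continue  # skipped characters do not count towards the tally
        chars[i] = masking_character
        remaining -= 1
    return "".join(chars)


print(mask("1234-5678-9012-3456", "0", 14))         # 00000000000000-3456
print(mask("12345", "*", 3, reverse_order=True))    # 12***
```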
Characters to skip when doing deidentification of a value. These will be left alone and skipped.
Used in:
Used in:
0-9
A-Z
a-z
US Punctuation, one of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Whitespace character, one of [ \t\n\x0B\f\r]
Message representing a set of files in Cloud Storage.
Used in:
The url, in the format `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
Options defining a file or a set of files within a Google Cloud Storage bucket.
Used in:
The set of one or more files to scan.
Max number of bytes to scan from a file. If a scanned file's size is bigger than this value then the rest of the bytes are omitted. Only one of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
Max percentage of bytes to scan from a file. The rest are omitted. The number of bytes scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
List of file type groups to include in the scan. If empty, all files are scanned and available data format processors are applied. In addition, the binary content of the selected files is always scanned as well.
Limits the number of files to scan to this percentage of the input FileSet. Number of files scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0.
Set of files to scan.
Used in:
The Cloud Storage url of the file(s) to scan, in the format `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. If the url ends in a trailing slash, the bucket or directory represented by the url will be scanned non-recursively (content in sub-directories will not be scanned). This means that `gs://mybucket/` is equivalent to `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to `gs://mybucket/directory/*`. Exactly one of `url` or `regex_file_set` must be set.
The regex-filtered set of files to scan. Exactly one of `url` or `regex_file_set` must be set.
How to sample bytes if not all bytes are scanned. Meaningful only when used in conjunction with bytes_limit_per_file. If not specified, scanning would start from the top.
Used in:
Scan from the top (default).
For each file larger than bytes_limit_per_file, randomly pick the offset to start scanning. The scanned bytes are contiguous.
Message representing a single file or path in Cloud Storage.
Used in:
A url representing a file or path (no wildcards) in Cloud Storage. Example: gs://[BUCKET_NAME]/dictionary.txt
Message representing a set of files in a Cloud Storage bucket. Regular expressions are used to allow fine-grained control over which files in the bucket to include. Included files are those that match at least one item in `include_regex` and do not match any items in `exclude_regex`. Note that a file that matches items from both lists will _not_ be included. For a match to occur, the entire file path (i.e., everything in the url after the bucket name) must match the regular expression.
For example, given the input `{bucket_name: "mybucket", include_regex: ["directory1/.*"], exclude_regex: ["directory1/excluded.*"]}`:
* `gs://mybucket/directory1/myfile` will be included
* `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches across `/`)
* `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the full path doesn't match any items in `include_regex`)
* `gs://mybucket/directory1/excludedfile` will _not_ be included (the path matches an item in `exclude_regex`)
If `include_regex` is left empty, it will match all files by default (this is equivalent to setting `include_regex: [".*"]`).
Some other common use cases:
* `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all files in `mybucket` except for .pdf files
* `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will include all files directly under `gs://mybucket/directory/`, without matching across `/`
Used in:
The name of a Cloud Storage bucket. Required.
A list of regular expressions matching file paths to include. All files in the bucket that match at least one of these regular expressions will be included in the set of files, except for those that also match an item in `exclude_regex`. Leaving this field empty will match all files by default (this is equivalent to including `.*` in the list). Regular expressions use RE2 [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found under the google/re2 repository on GitHub.
A list of regular expressions matching file paths to exclude. All files in the bucket that match at least one of these regular expressions will be excluded from the scan. Regular expressions use RE2 [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found under the google/re2 repository on GitHub.
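The include/exclude semantics can be approximated locally as below. This sketch uses Python's `re` module rather than RE2, so syntax support differs in places, and `in_file_set` is a hypothetical helper name. Paths are everything in the url after the bucket name.

```python
import re
from typing import Sequence


def in_file_set(path: str, include_regex: Sequence[str] = (),
                exclude_regex: Sequence[str] = ()) -> bool:
    """Return True if `path` would belong to the file set."""
    # An empty include list behaves like [".*"].
    include = list(include_regex) or [".*"]
    # The entire path must match, hence fullmatch rather than search.
    included = any(re.fullmatch(p, path) for p in include)
    excluded = any(re.fullmatch(p, path) for p in exclude_regex)
    return included and not excluded


inc, exc = ["directory1/.*"], ["directory1/excluded.*"]
for p in ("directory1/myfile", "directory1/directory2/myfile",
          "directory0/directory1/myfile", "directory1/excludedfile"):
    print(p, in_file_set(p, inc, exc))
```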
Represents a color in the RGB color space.
Used in:
The amount of red in the color as a value in the interval [0, 1].
The amount of green in the color as a value in the interval [0, 1].
The amount of blue in the color as a value in the interval [0, 1].
Container structure for the content to inspect.
Used in:
Data of the item either in the byte array or UTF-8 string form, or table.
String data to inspect or redact.
Structured content for inspection. See https://cloud.google.com/dlp/docs/inspecting-text#inspecting_a_table to learn more.
Content data to inspect or redact. Replaces `type` and `data`.
Findings container location data.
Used in:
Name of the container where the finding is located. The top level name is the source file name or table name. Names of some common storage containers are formatted as follows:
* BigQuery tables: `<project_id>:<dataset_id>.<table_id>`
* Cloud Storage files: `gs://<bucket>/<path>`
* Datastore namespace: <namespace>
Nested names could be absent if the embedded object has no string identifier (for example, an image contained within a document).
Type of the container within the file with location of the finding.
Location within a row or record of a database table.
Location within an image's pixels.
Location data for document files.
Findings container modification timestamp, if applicable. For Google Cloud Storage contains last file modification timestamp. For BigQuery table contains last_modified_time property. For Datastore - not populated.
Findings container version, if available ("generation" for Google Cloud Storage).
Options describing which parts of the provided content should be scanned.
Used in:
Includes entire content of a file or a data stream.
Text content within the data, excluding any metadata.
Images found in the data.
Pseudonymization method that generates deterministic encryption for the given input. Outputs a base64 encoded representation of the encrypted output. Uses AES-SIV based on the RFC https://tools.ietf.org/html/rfc5297.
Used in:
The key used by the encryption function.
The custom info type to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom info type followed by the number of characters comprising the surrogate. The following scheme defines the format: <info type name>(<surrogate character count>):<surrogate>. For example, if the name of custom info type is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc'. This annotation identifies the surrogate when inspecting content using the custom info type 'Surrogate'. This facilitates reversal of the surrogate when it occurs in free text. In order for inspection to work properly, the name of this info type must not occur naturally anywhere in your data; otherwise, inspection may either reverse a surrogate that does not correspond to an actual identifier, or be unable to parse the surrogate and result in an error. Therefore, choose your custom info type name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE
Optional. A context may be used for higher security and maintaining referential integrity such that the same identifier in two different contexts will be given a distinct surrogate. The context is appended to plaintext value being encrypted. On decryption the provided context is validated against the value used during encryption. If a context was provided during encryption, same context must be provided during decryption as well. If the context is not set, plaintext would be used as is for encryption. If the context is set but: 1. there is no record present when transforming a given value or 2. the field is not present when transforming a given value, plaintext would be used as is for encryption. Note that case (1) is expected when an `InfoTypeTransformation` is applied to both structured and non-structured `ContentItem`s.
Pseudonymization method that generates surrogates via cryptographic hashing. Uses SHA-256. The key size must be either 32 or 64 bytes. Outputs a base64 encoded representation of the hashed output (for example, L7k0BHmF1ha5U3NfGykjro4xWi1MPVQPjhMAZbSV9mM=). Currently, only string and integer values can be hashed. See https://cloud.google.com/dlp/docs/pseudonymization to learn more.
Used in:
The key used by the hash function.
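To show the shape of the output only, here is a sketch that assumes the keyed hash is HMAC-SHA-256. The description above says only "SHA-256" with a 32- or 64-byte key, so the HMAC construction is an assumption, and `hash_surrogate` is a hypothetical name. A 32-byte digest encodes to a 44-character base64 string, the same length as the example value above.

```python
import base64
import hashlib
import hmac


def hash_surrogate(value: str, key: bytes) -> str:
    """Keyed SHA-256 hash of `value`, base64 encoded (HMAC is an assumption)."""
    if len(key) not in (32, 64):
        raise ValueError("the key size must be either 32 or 64 bytes")
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


demo_key = bytes(32)  # all-zero key, for demonstration only
print(hash_surrogate("patient-123", demo_key))
```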
This is a data encryption key (DEK) (as opposed to a key encryption key (KEK) stored by KMS). When using KMS to wrap/unwrap DEKs, be sure to set an appropriate IAM policy on the KMS CryptoKey (KEK) to ensure an attacker cannot unwrap the data crypto key.
Used in:
Replaces an identifier with a surrogate using Format Preserving Encryption (FPE) with the FFX mode of operation; however when used in the `ReidentifyContent` API method, it serves the opposite function by reversing the surrogate back into the original identifier. The identifier must be encoded as ASCII. For a given crypto key and context, the same identifier will be replaced with the same surrogate. Identifiers must be at least two characters long. In the case that the identifier is the empty string, it will be skipped. See https://cloud.google.com/dlp/docs/pseudonymization to learn more. Note: We recommend using CryptoDeterministicConfig for all use cases which do not require preserving the input alphabet space and size, plus warrant referential integrity.
Used in:
The key used by the encryption algorithm. [required]
The 'tweak'. A context may be used for higher security since the same identifier in two different contexts won't be given the same surrogate. If the context is not set, a default tweak will be used. If the context is set but: 1. there is no record present when transforming a given value or 2. the field is not present when transforming a given value, a default tweak will be used. Note that case (1) is expected when an `InfoTypeTransformation` is applied to both structured and non-structured `ContentItem`s. Currently, the referenced field may be of value type integer or string. The tweak is constructed as a sequence of bytes in big endian byte order such that a 64 bit integer is encoded followed by a single byte of value 1, and a string is encoded in UTF-8 format followed by a single byte of value 2.
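The tweak byte layout described above can be sketched as follows. Whether the 64-bit integer is signed is not stated; this example assumes a signed two's-complement encoding, and `tweak_bytes` is a hypothetical helper name.

```python
import struct


def tweak_bytes(context_value) -> bytes:
    """Build the tweak byte sequence for an integer or string context value."""
    if isinstance(context_value, bool):
        raise TypeError("context must be an integer or a string")
    if isinstance(context_value, int):
        # 64-bit big-endian integer (assumed signed), then a byte of value 1.
        return struct.pack(">q", context_value) + b"\x01"
    if isinstance(context_value, str):
        # UTF-8 encoded string, then a byte of value 2.
        return context_value.encode("utf-8") + b"\x02"
    raise TypeError("context must be an integer or a string")


print(tweak_bytes(1).hex())
print(tweak_bytes("ab").hex())
```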
This is supported by mapping these to the alphanumeric characters that the FFX mode natively supports. This happens before/after encryption/decryption. Each character listed must appear only once. Number of characters must be in the range [2, 62]. This must be encoded as ASCII. The order of characters does not matter.
The native way to select the alphabet. Must be in the range [2, 62].
The custom infoType to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom infoType followed by the number of characters comprising the surrogate. The following scheme defines the format: info_type_name(surrogate_character_count):surrogate. For example, if the name of custom infoType is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc'. This annotation identifies the surrogate when inspecting content using the custom infoType [`SurrogateType`](/dlp/docs/reference/rest/v2/InspectConfig#surrogatetype). This facilitates reversal of the surrogate when it occurs in free text. In order for inspection to work properly, the name of this infoType must not occur naturally anywhere in your data; otherwise, inspection may find a surrogate that does not correspond to an actual identifier. Therefore, choose your custom infoType name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE
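The annotation scheme is simple enough to sketch directly. Both helpers below are hypothetical: `annotate` builds the prefixed value, and `find_surrogates` shows why the character count makes the surrogate recoverable from free text.

```python
import re
from typing import List


def annotate(info_type_name: str, surrogate: str) -> str:
    """Prefix a surrogate with its infoType name and character count."""
    return f"{info_type_name}({len(surrogate)}):{surrogate}"


def find_surrogates(text: str, info_type_name: str) -> List[str]:
    """Extract annotated surrogates for one infoType name from free text."""
    pattern = re.compile(re.escape(info_type_name) + r"\((\d+)\):")
    found, pos = [], 0
    while (m := pattern.search(text, pos)) is not None:
        count = int(m.group(1))
        surrogate = text[m.end():m.end() + count]
        if len(surrogate) == count:  # ignore truncated surrogates
            found.append(surrogate)
        pos = m.end() + count
    return found


token = annotate("MY_TOKEN_INFO_TYPE", "abc")
print(token)  # MY_TOKEN_INFO_TYPE(3):abc
print(find_surrogates(f"user {token} logged in", "MY_TOKEN_INFO_TYPE"))
```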
These are commonly used subsets of the alphabet that the FFX mode natively supports. In the algorithm, the alphabet is selected using the "radix". Therefore each corresponds to particular radix.
Used in:
[0-9] (radix of 10)
[0-9A-F] (radix of 16)
[0-9A-Z] (radix of 36)
[0-9A-Za-z] (radix of 62)
Custom information type provided by the user. Used to find domain-specific sensitive information configurable to the data in question.
Used in:
CustomInfoType can either be a new infoType, or an extension of a built-in infoType, when the name matches one of the existing infoTypes and that infoType is specified in the `InspectContent.info_types` field. Specifying the latter adds findings to the ones detected by the system. If the built-in infoType is not specified in the `InspectContent.info_types` list then the name is treated as a custom infoType.
Likelihood to return for this CustomInfoType. This base value can be altered by a detection rule if the finding meets the criteria specified by the rule. Defaults to `VERY_LIKELY` if not specified.
A list of phrases to detect as a CustomInfoType.
Regular expression based CustomInfoType.
Message for detecting output from deidentification transformations that support reversing.
Load an existing `StoredInfoType` resource for use in `InspectDataSource`. Not currently supported in `InspectContent`.
Set of detection rules to apply to all findings of this CustomInfoType. Rules are applied in order that they are specified. Not supported for the `surrogate_type` CustomInfoType.
If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding to be returned. It still can be used for rules matching.
Deprecated; use `InspectionRuleSet` instead. Rule for modifying a `CustomInfoType` to alter behavior under certain circumstances, depending on the specific details of the rule. Not supported for the `surrogate_type` custom infoType.
Used in:
Hotword-based detection rule.
The rule that adjusts the likelihood of findings within a certain proximity of hotwords.
Used in:
Regular expression pattern defining what qualifies as a hotword.
Proximity of the finding within which the entire hotword must reside. The total length of the window cannot exceed 1000 characters. Note that the finding itself will be included in the window, so that hotwords may be used to match substrings of the finding itself. For example, the certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be adjusted upwards if the area code is known to be the local area code of a company office using the hotword regex "\(xxx\)", where "xxx" is the area code in question.
Likelihood adjustment to apply to all matching findings.
Message for specifying an adjustment to the likelihood of a finding as part of a detection rule.
Used in:
Set the likelihood of a finding to a fixed value.
Increase or decrease the likelihood by the specified number of levels. For example, if a finding would be `POSSIBLE` without the detection rule and `relative_likelihood` is 1, then it is upgraded to `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. Likelihood may never drop below `VERY_UNLIKELY` or exceed `VERY_LIKELY`, so applying an adjustment of 1 followed by an adjustment of -1 when base likelihood is `VERY_LIKELY` will result in a final likelihood of `LIKELY`.
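The clamping behavior of relative adjustments can be sketched as below. The `LEVELS` list and `adjust` function are illustrative; the level names are the ones used in the description above.

```python
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust(likelihood: str, relative_likelihood: int) -> str:
    """Shift a likelihood by a number of levels, clamped to the valid range."""
    index = LEVELS.index(likelihood) + relative_likelihood
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


print(adjust("POSSIBLE", 1))                 # LIKELY
print(adjust(adjust("VERY_LIKELY", 1), -1))  # LIKELY
```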
Message for specifying a window around a finding to apply a detection rule.
Used in:
Number of characters before the finding to consider.
Number of characters after the finding to consider.
Custom information type based on a dictionary of words or phrases. This can be used to match sensitive information specific to the data, such as a list of employee IDs or job titles. Dictionary words are case-insensitive and all characters other than letters and digits in the unicode [Basic Multilingual Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) will be replaced with whitespace when scanning for matches, so the dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters surrounding any match must be of a different type than the adjacent characters within the word, so letters must be next to non-letters and digits next to non-digits. For example, the dictionary word "jen" will match the first three letters of the text "jen123" but will return no matches for "jennifer". Dictionary words containing a large number of characters that are not letters or digits may result in unexpected findings because such characters are treated as whitespace. The [limits](https://cloud.google.com/dlp/limits) page contains details about the size limits of dictionaries. For dictionaries that do not fit within these constraints, consider using `LargeCustomDictionaryConfig` in the `StoredInfoType` API.
Used in:
List of words or phrases to search for.
Newline-delimited file of words in Cloud Storage. Only a single file is accepted.
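The dictionary matching rules can be approximated with a token comparison. This is a simplified sketch, not the service's matcher: it splits text into runs of letters and runs of digits, ignores every other character, and compares case-insensitively. One known simplification is that it treats a letter-to-digit boundary the same as whitespace.

```python
import re
from typing import List

# A token is a run of letters or a run of digits; everything else separates tokens.
_TOKEN = re.compile(r"[^\W\d_]+|\d+")


def tokens(s: str) -> List[str]:
    return [t.lower() for t in _TOKEN.findall(s)]


def dictionary_matches(phrase: str, text: str) -> bool:
    """True if the phrase's tokens appear consecutively in the text's tokens."""
    p, t = tokens(phrase), tokens(text)
    if not p:
        return False
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))


for text in ("sam johnson", "Sam, Johnson", "Sam (Johnson)"):
    print(text, dictionary_matches("Sam Johnson", text))
print(dictionary_matches("jen", "jen123"), dictionary_matches("jen", "jennifer"))
```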
Message defining a list of words or phrases to search for in the data.
Used in:
Words or phrases defining the dictionary. The dictionary must contain at least one phrase and every phrase must contain at least 2 characters that are letters or digits. [required]
Used in:
A finding of this custom info type will not be excluded from results.
A finding of this custom info type will be excluded from final results, but can still affect rule execution.
Message defining a custom regular expression.
Used in:
Pattern defining the regular expression. Its syntax (https://github.com/google/re2/wiki/Syntax) can be found under the google/re2 repository on GitHub.
The index of the submatch to extract as findings. When not specified, the entire match is returned. No more than 3 may be included.
Message for detecting output from deidentification transformations such as [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). These types of transformations are those that perform pseudonymization, thereby producing a "surrogate" as output. This should be used in conjunction with a field on the transformation such as `surrogate_info_type`. This CustomInfoType does not support the use of `detection_rules`.
Used in:
(message has no fields)
Record key for a finding in Cloud Datastore.
Used in:
Datastore entity key.
Options defining a data set within Google Cloud Datastore.
Used in:
A partition ID identifies a grouping of entities. The grouping is always by project and namespace, however the namespace ID may be empty.
The kind to process.
Shifts dates by random number of days, with option to be consistent for the same context. See https://cloud.google.com/dlp/docs/concepts-date-shifting to learn more.
Used in:
Upper bound of the range of shift in days. The actual shift will be selected at random within this range (inclusive ends). Negative means shift to earlier in time. Must not be more than 365250 days (1000 years) in each direction. For example, 3 means shift date to at most 3 days into the future. [Required]
Lower bound of the range of shift in days. For example, -5 means shift date to at most 5 days back in the past. [Required]
Points to the field that contains the context, for example, an entity id. If set, must also set method. If set, shift will be consistent for the given context.
Method for calculating shift that takes context into consideration. If set, must also set context. Can only be applied to table items.
Causes the shift to be computed based on this key and the context. This results in the same shift for the same context and crypto_key.
Message for a date time object. e.g. 2018-01-01, 5th August.
Used in:
One or more of the following must be set. All fields are optional, but when set must be valid date or time values.
Used in:
Set only if the offset can be determined. Positive for time ahead of UTC. E.g. For "UTC-9", this value is -540.
The configuration that controls how the data will change.
Used in:
Treat the dataset as free-form text and apply the same free text transformation everywhere.
Treat the dataset as structured. Transformations can be applied to specific locations within structured datasets, such as transforming a column within a table.
The DeidentifyTemplates contains instructions on how to deidentify content. See https://cloud.google.com/dlp/docs/concepts-templates to learn more.
Used as response type in: DlpService.CreateDeidentifyTemplate, DlpService.GetDeidentifyTemplate, DlpService.UpdateDeidentifyTemplate
Used as field type in:
The template name. Output only. The template will have one of the following formats: `projects/PROJECT_ID/deidentifyTemplates/TEMPLATE_ID` OR `organizations/ORGANIZATION_ID/deidentifyTemplates/TEMPLATE_ID`
Display name (max 256 chars).
Short description (max 256 chars).
The creation timestamp of the template. Output only.
The last update timestamp of the template. Output only.
Combines all of the information about a DLP job.
Used as response type in: DlpService.ActivateJobTrigger, DlpService.CreateDlpJob, DlpService.GetDlpJob
Used as field type in:
The server-assigned name.
The type of job.
State of a job.
Results from analyzing risk of a data source.
Results from inspecting a data source.
Time when the job was created.
Time when the job started.
Time when the job finished.
If created by a job trigger, the resource name of the trigger that instantiated the job.
A stream of errors encountered running the job.
Used in:
The job has not yet started.
The job is currently running.
The job is no longer running.
The job was canceled before it could complete.
The job had an error and did not complete.
An enum to represent the various types of DLP jobs.
Used in:
The job inspected Google Cloud for sensitive data.
The job executed a Risk Analysis computation.
Location of a finding within a document.
Used in:
Offset of the line, from the beginning of the file, where the finding is located.
An entity in a dataset is a field or set of fields that correspond to a single person. For example, in medical records the `EntityId` might be a patient identifier, or for financial records it might be an account identifier. This message is used when generalizations or analysis must take into account that multiple rows correspond to the same entity.
Used in:
Composite key indicating which field contains the entity identifier.
Details about an error encountered during job execution or the results of an unsuccessful activation of the JobTrigger. Output only field.
Used in:
The times the error occurred.
List of exclude infoTypes.
Used in:
The infoType list in an ExclusionRule drops a finding when it overlaps or is contained within a finding of an infoType from this list. For example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and `exclusion_rule` containing `exclude_info_types.info_types` with "EMAIL_ADDRESS", the phone number findings are dropped if they overlap with an EMAIL_ADDRESS finding. As a result, "555-222-2222@example.org" generates only a single finding, namely the email address.
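The overlap rule can be sketched with character ranges. The `Finding` class and `apply_exclusion` function are hypothetical, and the offsets are computed by hand for the example string; the service's own finding representation is richer than this.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Finding:
    info_type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def overlaps(a: Finding, b: Finding) -> bool:
    return a.start < b.end and b.start < a.end


def apply_exclusion(findings: List[Finding], rule_info_types: Iterable[str],
                    exclude_info_types: Iterable[str]) -> List[Finding]:
    """Drop findings of the rule's infoTypes that overlap an excluding finding."""
    rule_types, exclude_types = set(rule_info_types), set(exclude_info_types)
    excluders = [f for f in findings if f.info_type in exclude_types]
    return [f for f in findings
            if not (f.info_type in rule_types
                    and any(overlaps(f, e) for e in excluders))]


text = "555-222-2222@example.org"
findings = [Finding("PHONE_NUMBER", 0, 12), Finding("EMAIL_ADDRESS", 0, len(text))]
print(apply_exclusion(findings, ["PHONE_NUMBER"], ["EMAIL_ADDRESS"]))
```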
The rule that specifies conditions when findings of infoTypes specified in `InspectionRuleSet` are removed from results.
Used in:
Dictionary which defines the rule.
Regular expression which defines the rule.
Set of infoTypes for which findings would affect this rule.
How the rule is applied, see MatchingType documentation for details.
General identifier of a data field in a storage service.
Used in:
Name describing the field.
The transformation to apply to the field.
Used in:
Input field(s) to apply the transformation to. [required]
Only apply the transformation if the condition evaluates to true for the given `RecordCondition`. The conditions are allowed to reference fields that are not used in the actual transformation. [optional] Example Use Cases: - Apply a different bucket transformation to an age column if the zip code column for the same record is within a specific range. - Redact a field if the date of birth field is greater than 85.
Transformation to apply. [required]
Apply the transformation to the entire field.
Treat the contents of the field as free text, and selectively transform content that matches an `InfoType`.
Definitions of file type groups to scan.
Used in:
Includes all files.
Includes all file extensions not covered by text file types.
Included file extensions: asc, brf, c, cc, cpp, csv, cxx, c++, cs, css, dart, eml, go, h, hh, hpp, hxx, h++, hs, html, htm, shtml, shtm, xhtml, lhs, ini, java, js, json, ocaml, md, mkd, markdown, m, ml, mli, pl, pm, php, phtml, pht, py, pyw, rb, rbw, rs, rc, scala, sh, sql, tex, txt, text, tsv, vcard, vcs, wml, xml, xsl, xsd, yml, yaml.
Included file extensions: bmp, gif, jpg, jpeg, jpe, png. bytes_limit_per_file has no effect on image files.
Included file extensions: avro
Represents a piece of potentially sensitive content.
Used in:
The content that was found. Even if the content is not textual, it may be converted to a textual representation here. Provided if `include_quote` is true and the finding is less than or equal to 4096 bytes long. If the finding exceeds 4096 bytes in length, the quote may be omitted.
The type of content that might have been found. Provided if `excluded_types` is false.
Confidence of how likely it is that the `info_type` is correct.
Where the content was found.
Timestamp when finding was detected.
Contains data parsed from quotes. Only populated if include_quote was set to true and a supported infoType was requested. Currently supported infoTypes: DATE, DATE_OF_BIRTH and TIME.
Buckets values based on fixed size ranges. The Bucketing transformation can provide all of this functionality, but requires more configuration. This message is provided as a convenience to the user for simple bucketing strategies. The transformed value will be a hyphenated string of <lower_bound>-<upper_bound>; for example, if lower_bound = 10 and upper_bound = 20, all values that are within this bucket will be replaced with "10-20". This can be used on data of type: double, long. If the bound Value type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing. See https://cloud.google.com/dlp/docs/concepts-bucketing to learn more.
Used in:
Lower bound value of buckets. All values less than `lower_bound` are grouped together into a single bucket; for example if `lower_bound` = 10, then all values less than 10 are replaced with the value “-10”. [Required].
Upper bound value of buckets. All values greater than upper_bound are grouped together into a single bucket; for example if `upper_bound` = 89, then all values greater than 89 are replaced with the value “89+”. [Required].
Size of each bucket (except for minimum and maximum buckets). So if `lower_bound` = 10, `upper_bound` = 89, and `bucket_size` = 10, then the following buckets would be used: -10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-89, 89+. Precision up to 2 decimals works. [Required].
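The bucket layout in the example above can be reproduced with a short sketch. Both function names are hypothetical, and the handling of values that fall exactly on a boundary (inclusive lower edge, `upper_bound` itself placed in the last regular bucket) is an assumption.

```python
import math
from typing import List


def bucket_labels(lower_bound: float, upper_bound: float, bucket_size: float) -> List[str]:
    """List every bucket label, including the open-ended first and last buckets."""
    labels = [f"-{lower_bound:g}"]
    start = lower_bound
    while start < upper_bound:
        end = min(start + bucket_size, upper_bound)
        labels.append(f"{start:g}-{end:g}")
        start = end
    labels.append(f"{upper_bound:g}+")
    return labels


def fixed_size_bucket(value: float, lower_bound: float, upper_bound: float,
                      bucket_size: float) -> str:
    """Return the label of the bucket that `value` falls into."""
    if value < lower_bound:
        return f"-{lower_bound:g}"
    if value > upper_bound:
        return f"{upper_bound:g}+"
    count = math.ceil((upper_bound - lower_bound) / bucket_size)
    index = min(int((value - lower_bound) // bucket_size), count - 1)
    start = lower_bound + index * bucket_size
    end = min(start + bucket_size, upper_bound)
    return f"{start:g}-{end:g}"


print(bucket_labels(10, 89, 10))
print(fixed_size_bucket(85, 10, 89, 10))
```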
Location of the finding within an image.
Used in:
Bounding boxes locating the pixels within the image containing the finding.
Type of information detected by the API.
Used in:
Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64}.
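The naming pattern can be checked on the client before a request is sent. The sketch below is a hypothetical pre-flight check (the function name is not part of the API); the service performs its own validation regardless.

```python
import re

# Pattern quoted in the field description above.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")


def is_valid_info_type_name(name: str) -> bool:
    """Hypothetical client-side check that the whole name matches the pattern."""
    return INFO_TYPE_NAME.fullmatch(name) is not None


print(is_valid_info_type_name("PHONE_NUMBER"))  # True
print(is_valid_info_type_name("my-custom-type"))  # False: hyphens are not allowed
```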
InfoType description.
Used in:
Internal name of the infoType.
Human readable form of the infoType name.
Which parts of the API support this InfoType.
Description of the infotype. Translated when language is provided in the request.
Statistics regarding a specific InfoType.
Used in:
The type of finding this stat is for.
Number of findings for this infoType.
Parts of the APIs which use certain infoTypes.
Used in:
Supported by the inspect operations.
Supported by the risk analysis operations.
A type of transformation that will scan unstructured text and apply various `PrimitiveTransformation`s to each finding, where the transformation is applied to only values that were identified as a specific info_type.
Used in:
Transformation for each infoType. Cannot specify more than one for a given infoType. [required]
A transformation to apply to text that is identified as a specific info_type.
Used in:
InfoTypes to apply the transformation to. An empty list will cause this transformation to apply to all findings that correspond to infoTypes that were requested in `InspectConfig`.
Primitive transformation to apply to the infoType. [required]
Configuration description of the scanning process. When used with redactContent only info_types and min_likelihood are currently used.
Used in:
Restricts what info_types to look for. The values must correspond to InfoType values returned by ListInfoTypes or listed at https://cloud.google.com/dlp/docs/infotypes-reference. When no InfoTypes or CustomInfoTypes are specified in a request, the system may automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated. The special InfoType name "ALL_BASIC" can be used to trigger all detectors, but may change over time as new InfoTypes are added. If you need precise control and predictability as to what detectors are run, you should specify specific InfoTypes listed in the reference.
Only returns findings equal to or above this threshold. The default is POSSIBLE. See https://cloud.google.com/dlp/docs/likelihood to learn more.
When true, a contextual quote from the data that triggered a finding is included in the response; see Finding.quote.
When true, excludes type information of the findings.
CustomInfoTypes provided by the user. See https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
List of options defining data content to scan. If empty, text, images, and other content will be included.
Set of rules to apply to the findings for this InspectConfig. Exclusion rules contained in the set are executed last; other rules are executed in the order they are specified for each info type.
Used in:
Max number of findings that will be returned for each item scanned. When set within `InspectDataSourceRequest`, the maximum returned is 2000 regardless of whether this is set higher. When set within `InspectContentRequest`, this field is ignored.
Max number of findings that will be returned per request/job. When set within `InspectContentRequest`, the maximum returned is 2000 regardless of whether this is set higher.
Configuration of findings limit given for specified infoTypes.
Max findings configuration per infoType, per content item or long running DlpJob.
Used in:
Type of information the findings limit applies to. Only one limit per info_type should be provided. If InfoTypeLimit does not have an info_type, the DLP API applies the limit against all info_types that are found but not specified in another InfoTypeLimit.
Max findings limit for the given infoType.
The results of an inspect DataSource job.
Used in:
The configuration used for this job.
A summary of the outcome of this inspect job.
Used in:
If run with an InspectTemplate, a snapshot of its state at the time of this run.
All result fields mentioned below are updated while the job is processing.
Used in:
Total size in bytes that were processed.
Estimate of the number of bytes to process.
Statistics of how many instances of each info type were found during inspect job.
Used in:
The data to scan.
How and what to scan for.
If provided, will be used as the default for all values in InspectConfig. `inspect_config` will be merged into the values persisted as part of the template.
Actions to execute at the completion of the job.
All the findings for a single scanned item.
Used in:
List of findings for an item.
If true, then this item might have more findings than were returned, and the findings returned are an arbitrary subset of all findings. The findings list might be truncated because the input items were too large, or because the server reached the maximum amount of resources allowed for a single API call. For best results, divide the input into smaller batches.
The inspectTemplate contains a configuration (set of types of sensitive data to be detected) to be used anywhere you otherwise would normally specify InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates to learn more.
Used as response type in: DlpService.CreateInspectTemplate, DlpService.GetInspectTemplate, DlpService.UpdateInspectTemplate
Used as field type in:
The template name. Output only. The template will have one of the following formats: `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`
Display name (max 256 chars).
Short description (max 256 chars).
The creation timestamp of an inspectTemplate, output only field.
The last update timestamp of an inspectTemplate, output only field.
The core content of the template. Configuration of the scanning process.
A single inspection rule to be applied to infoTypes, specified in `InspectionRuleSet`.
Used in:
Hotword-based detection rule.
Exclusion rule.
Rule set for modifying a set of infoTypes to alter behavior under certain circumstances, depending on the specific details of the rules within the set.
Used in:
List of infoTypes this rule set is applied to.
Set of rules to be applied to infoTypes. The rules are applied in order.
Contains a configuration to make DLP API calls on a repeating basis. See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
Used as response type in: DlpService.CreateJobTrigger, DlpService.GetJobTrigger, DlpService.UpdateJobTrigger
Used as field type in:
Unique resource name for the triggeredJob, assigned by the service when the triggeredJob is created, for example `projects/dlp-test-project/triggeredJobs/53234423`.
Display name (max 100 chars)
User provided description (max 256 chars)
The configuration details for the specific type of job to run.
A list of triggers which will be OR'ed together. Only one in the list needs to trigger for a job to be started. The list may contain only a single Schedule trigger and must have at least one object.
A stream of errors encountered when the trigger was activated. Repeated errors may result in the JobTrigger automatically being paused. Will return the last 100 errors. Whenever the JobTrigger is modified this list will be cleared. Output only field.
The creation timestamp of a triggeredJob, output only field.
The last update timestamp of a triggeredJob, output only field.
The timestamp of the last time this trigger executed, output only field.
A status for this trigger. [required]
Whether the trigger is currently active. If PAUSED or CANCELLED, no jobs will be created with this configuration. The service may automatically pause triggers experiencing frequent errors. To restart a job, set the status to HEALTHY after correcting user errors.
Used in:
Trigger is healthy.
Trigger is temporarily paused.
Trigger is cancelled and cannot be resumed.
What event needs to occur for a new job to be started.
Used in:
Create a job on a repeating basis based on the elapse of time.
A unique identifier for a Datastore entity. If a key's partition ID or any of its path kinds or names are reserved/read-only, the key is reserved/read-only. A reserved/read-only key is forbidden in certain documented contexts.
Used in:
Entities are partitioned into subsets, currently identified by a project ID and namespace ID. Queries are scoped to a single partition.
The entity path. An entity path consists of one or more elements composed of a kind and a string or numerical identifier, which identify entities. The first element identifies a _root entity_, the second element identifies a _child_ of the root entity, the third element identifies a child of the second entity, and so forth. The entities identified by all prefixes of the path are called the element's _ancestors_. A path can never be empty, and a path can have at most 100 elements.
A (kind, ID/name) pair used to construct a key path. If either name or ID is set, the element is complete. If neither is set, the element is incomplete.
Used in:
The kind of the entity. A kind matching regex `__.*__` is reserved/read-only. A kind must not contain more than 1500 bytes when UTF-8 encoded. Cannot be `""`.
The type of ID.
The auto-allocated ID of the entity. Never equal to zero. Values less than zero are discouraged and may not be supported in the future.
The name of the entity. A name matching regex `__.*__` is reserved/read-only. A name must not be more than 1500 bytes when UTF-8 encoded. Cannot be `""`.
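The constraints on keys and path elements listed above can be collected into a small model. This is a sketch only: the class and function names are hypothetical and do not come from any Datastore client library, and it checks just the rules stated in this section (non-empty path, at most 100 elements, 1500-byte UTF-8 limit, reserved `__.*__` names, non-zero IDs).

```python
import re
from dataclasses import dataclass
from typing import List, Optional

RESERVED = re.compile(r"__.*__")   # reserved/read-only kinds and names
MAX_BYTES = 1500                   # UTF-8 size limit for kinds and names
MAX_PATH_ELEMENTS = 100


@dataclass
class PathElement:
    """Hypothetical model of a (kind, ID/name) pair."""
    kind: str
    id: Optional[int] = None
    name: Optional[str] = None

    def is_complete(self) -> bool:
        # An element is complete if either the name or the ID is set.
        return self.id is not None or self.name is not None

    def is_reserved(self) -> bool:
        if RESERVED.fullmatch(self.kind):
            return True
        return self.name is not None and RESERVED.fullmatch(self.name) is not None


def validate_path(path: List[PathElement]) -> None:
    """Raise ValueError if the path breaks one of the documented limits."""
    if not 1 <= len(path) <= MAX_PATH_ELEMENTS:
        raise ValueError("a path needs between 1 and 100 elements")
    for element in path:
        if element.kind == "" or len(element.kind.encode("utf-8")) > MAX_BYTES:
            raise ValueError("kind must be non-empty and at most 1500 UTF-8 bytes")
        if element.name is not None and (
            element.name == "" or len(element.name.encode("utf-8")) > MAX_BYTES
        ):
            raise ValueError("name must be non-empty and at most 1500 UTF-8 bytes")
        if element.id == 0:
            raise ValueError("an auto-allocated ID is never zero")


validate_path([PathElement("Customer", id=42), PathElement("Order", name="2024-001")])
```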
A representation of a Datastore kind.
Used in:
The name of the kind.
Include to use an existing data crypto key wrapped by KMS. The wrapped key must be a 128/192/256 bit key. Authorization requires the following IAM permissions when sending a request to perform a crypto transformation using a kms-wrapped crypto key: dlp.kms.encrypt
Used in:
The wrapped data crypto key. [required]
The resource name of the KMS CryptoKey to use for unwrapping. [required]
Configuration for a custom dictionary created from a data source of any size up to the maximum size defined in the [limits](https://cloud.google.com/dlp/limits) page. The artifacts of dictionary creation are stored in the specified Google Cloud Storage location. Consider using `CustomInfoType.Dictionary` for smaller dictionaries that satisfy the size requirements.
Used in:
Location to store dictionary artifacts in Google Cloud Storage. These files will only be accessible by project owners and the DLP API. If any of these artifacts are modified, the dictionary is considered invalid and can no longer be used.
Set of files containing newline-delimited lists of dictionary phrases.
Field in a BigQuery table where each cell represents a dictionary phrase.
Summary statistics of a custom dictionary.
Used in:
Approximate number of distinct phrases in the dictionary.
Categorization of results based on how likely they are to represent a match, based on the number of elements they contain which imply a match.
Used in:
Default value; same as POSSIBLE.
Few matching elements.
Some matching elements.
Many matching elements.
Specifies the location of the finding.
Used in:
Zero-based byte offsets delimiting the finding. These are relative to the finding's containing element. Note that when the content is not textual, this references the UTF-8 encoded textual representation of the content. Omitted if content is an image.
Unicode character offsets delimiting the finding. These are relative to the finding's containing element. Provided when the content is text.
List of nested objects pointing to the precise location of the finding within the file or record.
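Byte offsets and Unicode character offsets differ as soon as the text contains characters that take more than one byte in UTF-8. The sketch below illustrates the difference; the helper `locate` is hypothetical, and it assumes both ranges are half-open (start inclusive, end exclusive).

```python
def locate(text: str, quote: str):
    """Return (byte_range, codepoint_range) for the first occurrence of quote.

    Hypothetical helper; both ranges are half-open (start, end) pairs.
    """
    cp_start = text.index(quote)              # Python str indices count code points
    cp_end = cp_start + len(quote)
    byte_start = len(text[:cp_start].encode("utf-8"))
    byte_end = byte_start + len(quote.encode("utf-8"))
    return (byte_start, byte_end), (cp_start, cp_end)


text = "Caf\u00e9: 206-555-0123"               # "é" occupies two bytes in UTF-8
byte_range, codepoint_range = locate(text, "206-555-0123")
print(byte_range)        # (7, 19)
print(codepoint_range)   # (6, 18)
```

The one-position shift between the two ranges comes from the single two-byte character that precedes the finding.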
Type of the match which can be applied to different ways of matching, like Dictionary, regular expression and intersecting with findings of another info type.
Used in:
Invalid.
Full match.
- Dictionary: join of Dictionary results matched complete finding quote
- Regex: all regex matches fill a finding quote start to end
- Exclude info type: completely inside affecting info types findings
Partial match.
- Dictionary: at least one of the tokens in the finding matches
- Regex: substring of the finding matches
- Exclude info type: intersects with affecting info types findings
Inverse match.
- Dictionary: no tokens in the finding match the dictionary
- Regex: finding doesn't match the regex
- Exclude info type: no intersection with affecting info types findings
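For the regex case, the three matching types can be approximated with Python's `re` module. This is a rough sketch, not the service's matching logic: the function name and the returned keys are hypothetical, and treating "all regex matches fill a finding quote start to end" as a single `fullmatch` is a simplifying assumption.

```python
import re


def regex_matching_types(pattern: str, quote: str) -> dict:
    """Report which matching types a finding quote would satisfy (approximation)."""
    regex = re.compile(pattern)
    found = regex.search(quote) is not None
    return {
        "full": regex.fullmatch(quote) is not None,  # regex covers the whole quote
        "partial": found,                            # some substring matches
        "inverse": not found,                        # nothing in the quote matches
    }


print(regex_matching_types(r"\d{3}-\d{4}", "555-0123"))
print(regex_matching_types(r"\d{3}-\d{4}", "call 555-0123"))
print(regex_matching_types(r"\d{3}-\d{4}", "hello"))
```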
Cloud repository for storing output.
Used in:
Store findings in an existing table or a new table in an existing dataset. If table_id is not set a new one will be generated for you with the following format: dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for generating the date details. For Inspect, each column in an existing output table must have the same name, type, and mode of a field in the `Finding` object. For Risk, an existing output table should be the output of a previous Risk analysis job run on the same source table, with the same privacy metric and quasi-identifiers. Risk jobs that analyze the same table but compute a different privacy metric, or use different sets of quasi-identifiers, cannot store their results in the same table.
Schema used for writing the findings for Inspect jobs. This field is only used for Inspect and must be unspecified for Risk jobs. Columns are derived from the `Finding` object. If appending to an existing table, any columns from the predefined schema that are missing will be added. No columns in the existing table will be deleted. If unspecified, then all available columns will be used for a new table or an (existing) table with no schema, and no changes will be made to an existing table that has a schema.
Predefined schemas for storing findings.
Used in:
Basic schema including only `info_type`, `quote`, `certainty`, and `timestamp`.
Schema tailored to findings from scanning Google Cloud Storage.
Schema tailored to findings from scanning Google Datastore.
Schema tailored to findings from scanning Google BigQuery.
Schema containing all columns.
Datastore partition ID. A partition ID identifies a grouping of entities. The grouping is always by project and namespace, however the namespace ID may be empty. A partition ID contains several dimensions: project ID and namespace ID.
Used in:
The ID of the project to which the entities belong.
If not empty, the ID of the namespace to which the entities belong.
A rule for transforming a value.
Used in:
Privacy metric to compute for reidentification risk analysis.
Used in:
Compute numerical stats over an individual column, including number of distinct values and value count distribution.
Used in:
Field to compute categorical stats on. All column types are supported except for arrays and structs. However, it may be more informative to use NumericalStats when the field type is supported, depending on the data.
δ-presence metric, used to estimate how likely it is for an attacker to figure out that one given individual appears in a de-identified dataset. Similarly to the k-map metric, we cannot compute δ-presence exactly without knowing the attack dataset, so we use a statistical model instead.
Used in:
Fields considered to be quasi-identifiers. No two fields can have the same tag. [required]
ISO 3166-1 alpha-2 region code to use in the statistical modeling. Required if no column is tagged with a region-specific InfoType (like US_ZIP_5) or a region code.
Several auxiliary tables can be used in the analysis. Each custom_tag used to tag a quasi-identifiers field must appear in exactly one field of one auxiliary table.
k-anonymity metric, used for analysis of reidentification risk.
Used in:
Set of fields to compute k-anonymity over. When multiple fields are specified, they are considered a single composite key. Structs and repeated data types are not supported; however, nested fields are supported so long as they are not structs themselves or nested within a repeated field.
Optional message indicating that multiple rows might be associated to a single individual. If the same entity_id is associated to multiple quasi-identifier tuples over distinct rows, we consider the entire collection of tuples as the composite quasi-identifier. This collection is a multiset: the order in which the different tuples appear in the dataset is ignored, but their frequency is taken into account. Important note: a maximum of 1000 rows can be associated to a single entity ID. If more rows are associated with the same entity ID, some might be ignored.
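As a reminder of what the metric measures: a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows. The sketch below computes that k for in-memory rows, treating multiple fields as one composite key as described above. It is an illustration only; the function name and sample data are made up, and the optional entity ID grouping is not modeled.

```python
from collections import Counter


def k_anonymity(rows, quasi_ids):
    """Size of the smallest equivalence class over the composite key (0 if no rows)."""
    classes = Counter(tuple(row[field] for field in quasi_ids) for row in rows)
    return min(classes.values(), default=0)


rows = [
    {"zip": "98101", "age": 34},
    {"zip": "98101", "age": 36},
    {"zip": "98102", "age": 51},
    {"zip": "98102", "age": 51},
]
print(k_anonymity(rows, ["zip"]))         # 2: each ZIP code is shared by two rows
print(k_anonymity(rows, ["zip", "age"]))  # 1: (98101, 34) identifies a single row
```

Adding a field to the composite key can only keep k the same or lower it, which is why the choice of quasi-identifiers matters.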
Reidentifiability metric. This corresponds to a risk model similar to what is called "journalist risk" in the literature, except the attack dataset is statistically modeled instead of being perfectly known. This can be done using publicly available data (like the US Census), or using a custom statistical model (indicated as one or several BigQuery tables), or by extrapolating from the distribution of values in the input dataset.
Used in:
Fields considered to be quasi-identifiers. No two columns can have the same tag. [required]
ISO 3166-1 alpha-2 region code to use in the statistical modeling. Required if no column is tagged with a region-specific InfoType (like US_ZIP_5) or a region code.
Several auxiliary tables can be used in the analysis. Each custom_tag used to tag a quasi-identifiers column must appear in exactly one column of one auxiliary table.
An auxiliary table contains statistical information on the relative frequency of different quasi-identifiers values. It has one or several quasi-identifiers columns, and one column that indicates the relative frequency of each quasi-identifier tuple. If a tuple is present in the data but not in the auxiliary table, the corresponding relative frequency is assumed to be zero (and thus, the tuple is highly reidentifiable).
Used in:
Auxiliary table location. [required]
Quasi-identifier columns. [required]
The relative frequency column must contain a floating-point number between 0 and 1 (inclusive). Null values are assumed to be zero. [required]
A quasi-identifier column has a custom_tag, used to know which column in the data corresponds to which column in the statistical model.
Used in:
Used in:
Identifies the column. [required]
Semantic tag that identifies what a column contains, to determine which statistical model to use to estimate the reidentifiability of each value. [required]
A column can be tagged with an InfoType to use the relevant public dataset as a statistical model of population, if available. We currently support US ZIP codes, region codes, ages and genders. To programmatically obtain the list of supported InfoTypes, use ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
A column can be tagged with a custom tag. In this case, the user must indicate an auxiliary table that contains statistical information on the possible values of this column (below).
If no semantic tag is indicated, we infer the statistical model from the distribution of values in the input data.
l-diversity metric, used for analysis of reidentification risk.
Used in:
Set of quasi-identifiers indicating how equivalence classes are defined for the l-diversity computation. When multiple fields are specified, they are considered a single composite key.
Sensitive field for computing the l-value.
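The l-value can be illustrated in the same way as k-anonymity. The sketch below computes distinct l-diversity, the simplest variant: for each equivalence class defined by the quasi-identifiers, count the distinct values of the sensitive field and take the minimum. Whether the service uses this exact variant is not stated here, so treat it as an assumption; the function name and data are illustrative.

```python
from collections import defaultdict


def l_diversity(rows, quasi_ids, sensitive_field):
    """Smallest number of distinct sensitive values in any equivalence class."""
    classes = defaultdict(set)
    for row in rows:
        key = tuple(row[field] for field in quasi_ids)   # composite key
        classes[key].add(row[sensitive_field])
    return min((len(values) for values in classes.values()), default=0)


rows = [
    {"zip": "98101", "diagnosis": "flu"},
    {"zip": "98101", "diagnosis": "asthma"},
    {"zip": "98102", "diagnosis": "flu"},
    {"zip": "98102", "diagnosis": "flu"},
]
print(l_diversity(rows, ["zip"], "diagnosis"))  # 1: everyone in 98102 has the same value
```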
Compute numerical stats over an individual column, including min, max, and quantiles.
Used in:
Field to compute numerical stats on. Supported types are integer, float, date, datetime, timestamp, time.
A column with a semantic tag attached.
Used in:
Identifies the column. [required]
Semantic tag that identifies what a column contains, to determine which statistical model to use to estimate the reidentifiability of each value. [required]
A column can be tagged with an InfoType to use the relevant public dataset as a statistical model of population, if available. We currently support US ZIP codes, region codes, ages and genders. To programmatically obtain the list of supported InfoTypes, use ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
A column can be tagged with a custom tag. In this case, the user must indicate an auxiliary table that contains statistical information on the possible values of this column (below).
If no semantic tag is indicated, we infer the statistical model from the distribution of values in the input data.
Message for infoType-dependent details parsed from quote.
Used in:
Object representation of the quote.
The date time indicated by the quote.
Generic half-open interval [start, end)
Used in:
Index of the first character of the range (inclusive).
Index of the last character of the range (exclusive).
A condition for determining whether a transformation should be applied to a field.
Used in:
An expression.
The field type of `value` and `field` do not need to match to be considered equal, but not all comparisons are possible. EQUAL_TO and NOT_EQUAL_TO attempt to compare even with incompatible types, but all other comparisons are invalid with incompatible types. A `value` of type:
- `string` can be compared against all other types
- `boolean` can only be compared against other booleans
- `integer` can be compared against doubles or a string if the string value can be parsed as an integer.
- `double` can be compared against integers or a string if the string can be parsed as a double.
- `Timestamp` can be compared against strings in RFC 3339 date string format.
- `TimeOfDay` can be compared against timestamps and strings in the format of 'HH:mm:ss'.

If we fail to compare due to type mismatch, a warning will be given and the condition will evaluate to false.
Used in:
Field within the record this condition is evaluated against. [required]
Operator used to compare the field or infoType to the value. [required]
Value to compare against. [Required, except for `EXISTS` tests.]
A collection of conditions.
Used in:
An expression, consisting of an operator and conditions.
Used in:
The operator to apply to the result of conditions. Default and currently only supported value is `AND`.
Used in:
Message for a unique key indicating a record that contains a finding.
Used in:
Values of identifying columns in the given row. Order of values matches the order of field identifiers specified in the scanning request.
Location of a finding within a row or record.
Used in:
Key of the finding.
Field id of the field containing the finding.
Location within a `ContentItem.Table`.
Configuration to suppress records whose suppression conditions evaluate to true.
Used in:
A condition that, when it evaluates to true, causes the record to be suppressed from the transformed content.
A type of transformation that is applied over structured data such as a table.
Used in:
Transform the record by applying various field transformations.
Configuration defining which records get suppressed entirely. Records that match any suppression rule are omitted from the output [optional].
Redact a given value. For example, if used with an `InfoTypeTransformation` transforming PHONE_NUMBER, and input 'My phone number is 206-555-0123', the output would be 'My phone number is '.
Used in:
(message has no fields)
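The redaction example above can be reproduced locally with a stand-in detector. In the sketch below, the regular expression plays the role of the PHONE_NUMBER infoType detector; it is an assumption made for illustration and is far simpler than the service's detection, and the function name `redact` is hypothetical.

```python
import re

# Stand-in for the PHONE_NUMBER detector (assumption, for illustration only).
PHONE_NUMBER = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def redact(text: str, detector: "re.Pattern[str]" = PHONE_NUMBER) -> str:
    """Remove every finding from the text, leaving the surrounding content intact."""
    return detector.sub("", text)


print(repr(redact("My phone number is 206-555-0123")))  # 'My phone number is '
```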
Configuration for determining how redaction of images should occur.
Used in:
Type of information to redact from images.
Only one per info_type should be provided per request. If not specified, and redact_all_text is false, the DLP API will redact all text that it matches against all info_types that are found, but not specified in another ImageRedactionConfig.
If true, all text found in the image, regardless of whether it matches an info_type, is redacted. Only one should be provided.
The color to use when redacting content from an image. If not specified, the default is black.
Operators available for comparing the value of fields.
Used in:
Equal. Attempts to match even with incompatible types.
Not equal to. Attempts to match even with incompatible types.
Greater than.
Less than.
Greater than or equals.
Less than or equals.
Exists
Replace each input value with a given `Value`.
Used in:
Value to replace it with.
Replace each matching finding with the name of the info_type.
Used in:
(message has no fields)
Configuration for a risk analysis job. See https://cloud.google.com/dlp/docs/concepts-risk-analysis to learn more.
Used in:
Privacy metric to compute.
Input dataset to compute metrics over.
Actions to execute at the completion of the job. They are executed in the order provided.
Schedule for triggeredJobs.
Used in:
With this option a job is started on a regular periodic basis. For example: every day (86400 seconds). A scheduled start time will be skipped if the previous execution has not ended when its scheduled time occurs. This value must be set to a time duration greater than or equal to 1 day and can be no longer than 60 days.
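The allowed range for the schedule period is easy to check before creating a trigger. The following is a hypothetical client-side check using `datetime.timedelta`; the function name is not part of the API, and the service enforces the limits itself.

```python
from datetime import timedelta

MIN_PERIOD = timedelta(days=1)    # at least 1 day (86400 seconds)
MAX_PERIOD = timedelta(days=60)   # no longer than 60 days


def is_valid_schedule_period(period: timedelta) -> bool:
    """Hypothetical check that a schedule period is within the documented bounds."""
    return MIN_PERIOD <= period <= MAX_PERIOD


print(is_valid_schedule_period(timedelta(seconds=86400)))  # True
print(is_valid_schedule_period(timedelta(hours=12)))       # False: shorter than 1 day
```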
An auxiliary table containing statistical information on the relative frequency of different quasi-identifiers values. It has one or several quasi-identifiers columns, and one column that indicates the relative frequency of each quasi-identifier tuple. If a tuple is present in the data but not in the auxiliary table, the corresponding relative frequency is assumed to be zero (and thus, the tuple is highly reidentifiable).
Used in:
Auxiliary table location. [required]
Quasi-identifier columns. [required]
The relative frequency column must contain a floating-point number between 0 and 1 (inclusive). Null values are assumed to be zero. [required]
A quasi-identifier column has a custom_tag, used to know which column in the data corresponds to which column in the statistical model.
Used in:
Shared message indicating Cloud storage type.
Used in:
Google Cloud Datastore options specification.
Google Cloud Storage options specification.
BigQuery options specification.
Configuration of the timespan of the items to include in scanning. Currently only supported when inspecting Google Cloud Storage and BigQuery.
Used in:
Exclude files or rows older than this value.
Exclude files or rows newer than this value. If set to zero, no upper time limit is applied.
Specification of the field containing the timestamp of scanned items. Used for data sources like Datastore or BigQuery. If not specified for BigQuery, table last modification timestamp is checked against given time span. The valid data types of the timestamp field are: for BigQuery - timestamp, date, datetime; for Datastore - timestamp. Datastore entity will be scanned if the timestamp property does not exist or its value is empty or invalid.
When the job is started by a JobTrigger we will automatically figure out a valid start_time to avoid scanning files that have not been modified since the last time the JobTrigger executed. This will be based on the time of the execution of the last run of the JobTrigger.
StoredInfoType resource message that contains information about the current version and any pending updates.
Used as response type in: DlpService.CreateStoredInfoType, DlpService.GetStoredInfoType, DlpService.UpdateStoredInfoType
Used as field type in:
Resource name.
Current version of the stored info type.
Pending versions of the stored info type. Empty if no versions are pending.
Configuration for a StoredInfoType.
Used in:
Display name of the StoredInfoType (max 256 characters).
Description of the StoredInfoType (max 256 characters).
StoredInfoType where findings are defined by a dictionary of phrases.
State of a StoredInfoType version.
Used in:
StoredInfoType version is being created.
StoredInfoType version is ready for use.
StoredInfoType creation failed. All relevant error messages are returned in the `StoredInfoTypeVersion` message.
StoredInfoType is no longer valid because artifacts stored in user-controlled storage were modified. To fix an invalid StoredInfoType, use the `UpdateStoredInfoType` method to create a new version.
Statistics for a StoredInfoType.
Used in:
StoredInfoType where findings are defined by a dictionary of phrases.
Version of a StoredInfoType, including the configuration used to build it, create timestamp, and current state.
Used in:
StoredInfoType configuration.
Create timestamp of the version. Read-only, determined by the system when the version is created.
Stored info type version state. Read-only, updated by the system during dictionary creation.
Errors that occurred when creating this storedInfoType version, or anomalies detected in the storedInfoType data that render it unusable. Only the five most recent errors will be displayed, with the most recent error appearing first. For example, some of the data for stored custom dictionaries is put in the user's Google Cloud Storage bucket, and if this data is modified or deleted by the user or another system, the dictionary becomes invalid. If any errors occur, fix the problem indicated by the error message and use the UpdateStoredInfoType API method to create another version of the storedInfoType to continue using it, reusing the same `config` if it was not the source of the error.
Statistics about this storedInfoType version.
A reference to a StoredInfoType to use with scanning.
Used in:
Resource name of the requested `StoredInfoType`, for example `organizations/433245324/storedInfoTypes/432452342` or `projects/project-id/storedInfoTypes/432452342`.
Timestamp indicating when the version of the `StoredInfoType` used for inspection was created. Output-only field, populated by the system.
Structured content to inspect. Up to 50,000 `Value`s per request allowed. See https://cloud.google.com/dlp/docs/inspecting-text#inspecting_a_table to learn more.
Used in:
Used in:
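A table with more cells than the per-request limit has to be split across several requests. The generator below sketches one way to do that for rows held in memory. It assumes each cell counts as one `Value` toward the 50,000 limit and ignores headers; both points are assumptions, and the function name is hypothetical.

```python
MAX_VALUES_PER_REQUEST = 50_000


def batch_rows(rows, max_values=MAX_VALUES_PER_REQUEST):
    """Yield lists of rows whose total cell count stays within max_values."""
    batch, count = [], 0
    for row in rows:
        if len(row) > max_values:
            raise ValueError("a single row exceeds the per-request limit")
        if batch and count + len(row) > max_values:
            yield batch              # current batch is full; start a new one
            batch, count = [], 0
        batch.append(row)
        count += len(row)
    if batch:
        yield batch                  # final, possibly smaller, batch


rows = [["cell"] * 10 for _ in range(12_001)]   # 120,010 cells in total
print([len(batch) for batch in batch_rows(rows)])  # [5000, 5000, 2001]
```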
Location of a finding within a table.
Used in:
The zero-based index of the row where the finding is located.
For use with `Date`, `Timestamp`, and `TimeOfDay`, extract or preserve a portion of the value.
Used in:
Used in:
[0-9999]
[1-12]
[1-31]
[1-7]
[1-52]
[0-23]
Overview of the modifications that occurred.
Used in:
Total size in bytes that were transformed in some way.
Transformations applied to the dataset.
Summary of a single transformation. Only one of 'transformation', 'field_transformation', or 'record_suppress' will be set.
Used in:
Set if the transformation was limited to a specific InfoType.
Set if the transformation was limited to a specific FieldId.
The specific transformation these stats apply to.
The field transformation that was applied. If multiple field transformations are requested for a single field, this list will contain all of them; otherwise, only one is supplied.
The specific suppression option these stats apply to.
Total size in bytes that were transformed in some way.
A collection that informs the user the number of times a particular `TransformationResultCode` and error details occurred.
Used in:
A place for warnings or errors to show up if a transformation didn't work as expected.
Possible outcomes of transformations.
Used in:
Use this to have a random data crypto key generated. It will be discarded after the request finishes.
Used in:
Name of the key. [required] This is an arbitrary string used to differentiate different keys. A unique key is generated per name: two separate `TransientCryptoKey` protos share the same generated key if their names are the same. When the data crypto key is generated, this name is not used in any way (repeating the API call will result in a different key being generated).
Using raw keys is prone to security risks due to accidentally leaking the key. Choose another type of key if possible.
Used in:
A 128/192/256 bit key. [required]
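The key-size requirement translates to 16, 24, or 32 bytes. A minimal, hypothetical length check is sketched below; it only verifies the size and says nothing about how the key should be generated or stored.

```python
import os

VALID_KEY_BITS = {128, 192, 256}


def is_valid_key_length(key: bytes) -> bool:
    """Hypothetical check that a raw key is 128, 192, or 256 bits long."""
    return len(key) * 8 in VALID_KEY_BITS


print(is_valid_key_length(os.urandom(32)))  # True: 256 bits
print(is_valid_key_length(os.urandom(20)))  # False: 160 bits
```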
Set of primitive values supported by the system. Note that for the purposes of inspection or transformation, the number of bytes considered to comprise a 'Value' is based on its representation as a UTF-8 encoded string. For example, if 'integer_value' is set to 123456789, the number of bytes would be counted as 9, even though an int64 only holds up to 8 bytes of data.
Used in:
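The byte-counting rule for a `Value` can be checked with a one-line helper. The sketch assumes that `str()` produces the textual representation being measured, which holds for the integer example above but is an assumption for other types; the function name is hypothetical.

```python
def value_size_in_bytes(value) -> int:
    """Size of a value measured as its UTF-8 encoded string representation."""
    return len(str(value).encode("utf-8"))


print(value_size_in_bytes(123456789))  # 9, although an int64 occupies 8 bytes
```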
A value of a field, including its frequency.
Used in:
A value contained in the field in question.
How many times the value is contained in the field.