A service for running genomics pipelines.
Creates a pipeline that can be run later. Create takes a Pipeline that has all fields other than `pipelineId` populated, and then returns the same pipeline with `pipelineId` populated. This id can be used to run the pipeline. Caller must have WRITE permission to the project.
The request to create a pipeline. The pipeline field here should not have `pipelineId` populated, as that will be populated by the server.
The pipeline to create. Should not have `pipelineId` populated.
Runs a pipeline. If `pipelineId` is specified in the request, then run a saved pipeline. If `ephemeralPipeline` is specified, then run that pipeline once without saving a copy. The caller must have READ permission to the project where the pipeline is stored and WRITE permission to the project where the pipeline will be run, as VMs will be created and storage will be used.
The request to run a pipeline. If `pipelineId` is specified, it refers to a saved pipeline created with CreatePipeline and set as the `pipelineId` of the returned Pipeline object. If `ephemeralPipeline` is specified, that pipeline is run once with the given args and not saved. It is an error to specify both `pipelineId` and `ephemeralPipeline`. `pipelineArgs` must be specified.
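The rules above for choosing between a saved and an ephemeral pipeline can be sketched as a small client-side check. This is only an illustration: the request is modelled as a plain dict keyed by the field names used in this section, `run_mode` is a hypothetical helper, and rejecting a request that sets neither field is an assumption the text does not state.

```python
from typing import Any, Dict


def run_mode(request: Dict[str, Any]) -> str:
    """Return "saved" or "ephemeral" for a RunPipelineRequest-shaped dict."""
    has_saved = bool(request.get("pipelineId"))
    has_ephemeral = request.get("ephemeralPipeline") is not None
    if has_saved and has_ephemeral:
        # Stated in the text: specifying both is an error.
        raise ValueError("specify only one of pipelineId and ephemeralPipeline")
    if not has_saved and not has_ephemeral:
        # Assumption: the text does not say what happens when neither is set.
        raise ValueError("specify pipelineId or ephemeralPipeline")
    if request.get("pipelineArgs") is None:
        # Stated in the text: pipelineArgs must be specified.
        raise ValueError("pipelineArgs must be specified")
    return "saved" if has_saved else "ephemeral"
```

Running the check before sending a request surfaces the conflict locally instead of waiting for the service to reject it.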
The already created pipeline to run.
A new pipeline object to run once and then delete.
The arguments to use when running this pipeline.
Retrieves a pipeline based on ID. Caller must have READ permission to the project.
A request to get a saved pipeline by id.
Caller must have READ access to the project in which this pipeline is defined.
Lists pipelines. Caller must have READ permission to the project.
A request to list pipelines in a given project. Pipelines can be filtered by name using `namePrefix`: all pipelines with names that begin with `namePrefix` will be returned. Uses standard pagination: `pageSize` indicates how many pipelines to return, and `pageToken` comes from a previous ListPipelinesResponse to indicate offset.
Required. The name of the project to search for pipelines. Caller must have READ access to this project.
Pipelines with names that match this prefix should be returned. If unspecified, all pipelines in the project, up to `pageSize`, will be returned.
Number of pipelines to return at once. Defaults to 256, and max is 2048.
Token to use to indicate where to start getting results. If unspecified, returns the first page of results.
The response of ListPipelines. Contains at most `pageSize` pipelines. If it contains `pageSize` pipelines, and more pipelines exist, then `nextPageToken` will be populated and should be used as the `pageToken` argument to a subsequent ListPipelines request.
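The pagination contract described above can be sketched as a generator that keeps requesting pages until no `nextPageToken` comes back. The transport is left abstract: `list_page` is a hypothetical callable standing in for a ListPipelines call, the `projectId` and `pipelines` dict keys are assumed names, and `make_fake_list_page` is an in-memory stand-in used only to exercise the loop.

```python
from typing import Any, Callable, Dict, Iterator, List

ListPage = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_pipelines(
    list_page: ListPage,
    project_id: str,
    name_prefix: str = "",
    page_size: int = 256,
) -> Iterator[Dict[str, Any]]:
    """Yield every matching pipeline, following nextPageToken between pages."""
    page_token = ""
    while True:
        request: Dict[str, Any] = {"projectId": project_id, "pageSize": page_size}
        if name_prefix:
            request["namePrefix"] = name_prefix
        if page_token:
            request["pageToken"] = page_token
        response = list_page(request)
        yield from response.get("pipelines", [])
        page_token = response.get("nextPageToken", "")
        if not page_token:
            return


def make_fake_list_page(names: List[str]) -> ListPage:
    """In-memory stand-in for the service; its token is just a list offset."""
    def list_page(request: Dict[str, Any]) -> Dict[str, Any]:
        prefix = request.get("namePrefix", "")
        matched = [n for n in names if n.startswith(prefix)]
        start = int(request.get("pageToken", "0"))
        end = start + request["pageSize"]
        response: Dict[str, Any] = {"pipelines": [{"name": n} for n in matched[start:end]]}
        if end < len(matched):
            response["nextPageToken"] = str(end)
        return response
    return list_page
```

Callers should treat the token as opaque; the offset encoding here belongs to the fake only.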
The matched pipelines.
The token to use to get the next page of results.
Deletes a pipeline based on ID. Caller must have WRITE permission to the project.
The request to delete a saved pipeline by ID.
Caller must have WRITE access to the project in which this pipeline is defined.
Gets controller configuration information. Should only be called by VMs created by the Pipelines Service and not by end users.
Request to get controller configuration. Should only be used by VMs created by the Pipelines Service and not by end users.
The operation to retrieve controller configuration for.
Stores the information that the controller will fetch from the server in order to run. Should only be used by VMs created by the Pipelines Service and not by end users.
Sets status of a given operation. Any new timestamps (as determined by description) are appended to TimestampEvents. Should only be called by VMs created by the Pipelines Service and not by end users.
Request to set operation status. Should only be used by VMs created by the Pipelines Service and not by end users.
Describes a Compute Engine resource that is being managed by a running [pipeline][google.genomics.v1alpha2.Pipeline].
The instance on which the operation is running.
The availability zone in which the instance resides.
The machine type of the instance.
The names of the disks that were created for this pipeline.
The Docker executor specification.
Required. Image name from either Docker Hub or Google Container Registry. Users that run pipelines must have READ access to the image.
Required. The command or newline delimited script to run. The command string will be executed within a bash shell. If the command exits with a non-zero exit code, output parameter de-localization will be skipped and the pipeline operation's [`error`][google.longrunning.Operation.error] field will be populated. Maximum command string length is 16384.
The logging options for the pipeline run.
The location in Google Cloud Storage to which the pipeline logs will be copied. Can be specified as a fully qualified directory path, in which case logs will be output with a unique identifier as the filename in that directory, or as a fully specified path, which must end in `.log`, in which case that path will be used, and the user must ensure that logs are not overwritten. Stdout and stderr logs from the run are also generated and output as `-stdout.log` and `-stderr.log`.
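As a rough illustration of the two accepted forms, the sketch below classifies a log location by its suffix. `classify_log_path` is a hypothetical helper, and the `gs://` prefix check is an assumption based on the location being in Google Cloud Storage.

```python
def classify_log_path(gcs_path: str) -> str:
    """Return "file" for a path ending in .log, otherwise "directory"."""
    if not gcs_path.startswith("gs://"):
        # Assumption: log locations are written as gs:// URLs.
        raise ValueError("log location must be a Google Cloud Storage path")
    # A fully specified file path must end in .log; anything else is
    # treated as a directory that receives uniquely named log files.
    return "file" if gcs_path.endswith(".log") else "directory"
```

With the "file" form, the caller is responsible for choosing a path that earlier runs have not already written to.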
The pipeline object. Represents a transformation from a set of input parameters to a set of output parameters. The transformation is defined as a docker image and command to run within that image. Each pipeline is run on a Google Compute Engine VM. A pipeline can be created with the `create` method and then later run with the `run` method, or a pipeline can be defined and run all at once with the `run` method.
Used as response type in: PipelinesV1Alpha2.CreatePipeline, PipelinesV1Alpha2.GetPipeline
Required. The project in which to create the pipeline. The caller must have WRITE access.
Required. A user specified pipeline name that does not have to be unique. This name can be used for filtering Pipelines in ListPipelines.
User-specified description.
Input parameters of the pipeline.
Output parameters of the pipeline.
Required. The executor indicates in which environment the pipeline runs.
Specifies the docker run information.
Required. Specifies resource requirements for the pipeline run. Required fields: * [minimumCpuCores][google.genomics.v1alpha2.PipelineResources.minimum_cpu_cores] * [minimumRamGb][google.genomics.v1alpha2.PipelineResources.minimum_ram_gb]
Unique pipeline id that is generated by the service when CreatePipeline is called. Cannot be specified in the Pipeline used in the CreatePipelineRequest, and will be populated in the response to CreatePipeline and all subsequent Get and List calls. Indicates that the service has registered this pipeline.
Parameters facilitate setting and delivering data into the pipeline's execution environment. They are defined at create time, with optional defaults, and can be overridden at run time.

If `localCopy` is unset, then the parameter specifies a string that is passed as-is into the pipeline, as the value of the environment variable with the given name. A default value can be optionally specified at create time. The default can be overridden at run time using the inputs map. If no default is given, a value must be supplied at runtime.

If `localCopy` is defined, then the parameter specifies a data source or sink, both in Google Cloud Storage and on the Docker container where the pipeline computation is run. The [service account associated with the Pipeline][google.genomics.v1alpha2.RunPipelineArgs.service_account] (by default the project's Compute Engine service account) must have access to the Google Cloud Storage paths.

At run time, the Google Cloud Storage paths can be overridden if a default was provided at create time, or must be set otherwise. The pipeline runner should add a key/value pair to either the inputs or outputs map. The indicated data copies will be carried out before/after pipeline execution, just as if the corresponding arguments were provided to `gsutil cp`.

For example: Given the following `PipelineParameter`, specified in the `inputParameters` list:

```
{name: "input_file", localCopy: {path: "file.txt", disk: "pd1"}}
```

where `disk` is defined in the `PipelineResources` object as:

```
{name: "pd1", mountPoint: "/mnt/disk/"}
```

We create a disk named `pd1`, mount it on the host VM, and map `/mnt/pd1` to `/mnt/disk` in the docker container. At runtime, an entry for `input_file` would be required in the inputs map, such as:

```
inputs["input_file"] = "gs://my-bucket/bar.txt"
```

This would generate the following gsutil call:

```
gsutil cp gs://my-bucket/bar.txt /mnt/pd1/file.txt
```

The file `/mnt/pd1/file.txt` maps to `/mnt/disk/file.txt` in the Docker container. Acceptable paths are:

<table>
<thead>
<tr><th>Google Cloud Storage path</th><th>Local path</th></tr>
</thead>
<tbody>
<tr><td>file</td><td>file</td></tr>
<tr><td>glob</td><td>directory</td></tr>
</tbody>
</table>

For outputs, the direction of the copy is reversed:

```
gsutil cp /mnt/disk/file.txt gs://my-bucket/bar.txt
```

Acceptable paths are:

<table>
<thead>
<tr><th>Local path</th><th>Google Cloud Storage path</th></tr>
</thead>
<tbody>
<tr><td>file</td><td>file</td></tr>
<tr><td>file</td><td>directory - directory must already exist</td></tr>
<tr><td>glob</td><td>directory - directory will be created if it doesn't exist</td></tr>
</tbody>
</table>

One restriction, due to docker limitations, is that for outputs that are found on the boot disk, the local path cannot be a glob and must be a file.
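The input example above can be reproduced with a short sketch that builds the equivalent `gsutil cp` argument list. It assumes, based only on that example, that a disk named `pd1` is mounted on the host VM at `/mnt/pd1`; the helper names are hypothetical, and the `boot` disk and the output direction are not covered.

```python
import posixpath
from typing import Dict, List


def host_path(local_copy: Dict[str, str]) -> str:
    """Host VM path for a localCopy entry, assuming disks mount at /mnt/<name>."""
    return posixpath.join("/mnt", local_copy["disk"], local_copy["path"])


def localize_command(parameter: Dict, inputs: Dict[str, str]) -> List[str]:
    """Build the gsutil argument list that copies one input onto the VM."""
    source = inputs[parameter["name"]]  # the gs:// path supplied at run time
    return ["gsutil", "cp", source, host_path(parameter["localCopy"])]


parameter = {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}}
inputs = {"input_file": "gs://my-bucket/bar.txt"}
print(" ".join(localize_command(parameter, inputs)))
```

The printed command matches the gsutil call shown in the example for `input_file`.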
Required. Name of the parameter - the pipeline runner uses this string as the key to the input and output maps in RunPipeline.
Human-readable description.
The default value for this parameter. Can be overridden at runtime. If `localCopy` is present, then this must be a Google Cloud Storage path beginning with `gs://`.
If present, this parameter is marked for copying to and from the VM. `LocalCopy` indicates where on the VM the file should be. The value given to this parameter (either at runtime or using `defaultValue`) must be the remote path where the file should be.
LocalCopy defines how a remote file should be copied to and from the VM.
Required. The path within the user's docker container where this input should be localized to and from, relative to the specified disk's mount point. For example: file.txt,
Required. The name of the disk where this parameter is located. Can be the name of one of the disks specified in the Resources field, or "boot", which represents the Docker instance's boot disk and has a mount point of `/`.
The system resources for the pipeline run.
The minimum number of cores to use. Defaults to 1.
Whether to use preemptible VMs. Defaults to `false`. To use preemptible VMs, this must be true at both create time and run time. It cannot be true at run time if it is false at create time.
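The create-time/run-time rule can be written as a small truth-table check. This is a sketch only: `effective_flag` is a hypothetical helper, and treating an unset run-time value as false is an assumption drawn from the requirement that the flag be true at both times.

```python
from typing import Optional


def effective_flag(create_time: bool, run_time: Optional[bool]) -> bool:
    """Combine the create-time and run-time values of a flag such as preemptible."""
    if run_time and not create_time:
        # Stated in the text: cannot be true at run time if false at create time.
        raise ValueError("flag cannot be true at run time if false at create time")
    # The feature is used only when both values are true.
    return bool(create_time and run_time)
```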
The minimum amount of RAM to use. Defaults to 3.75 (GB)
Disks to attach.
List of Google Compute Engine availability zones to which resource creation will be restricted. If empty, any zone may be chosen.
The size of the boot disk. Defaults to 10 (GB).
Whether to assign an external IP to the instance. This is an experimental feature that may go away. Defaults to false. Corresponds to the `--no_address` flag for [gcloud compute instances create](https://cloud.google.com/sdk/gcloud/reference/compute/instances/create). To use this, it must be true at both create time and run time; it cannot be true at run time if it is false at create time. If you need to ssh into a private IP VM for debugging, you can ssh to a public VM and then ssh into the private VM's internal IP. If `noAddress` is set, this pipeline run may only load docker images from Google Container Registry and not Docker Hub. **Note: To use this option, your project must be in the Google Access for Private IPs Early Access Program.**
A Google Compute Engine disk resource specification.
Required. The name of the disk that can be used in the pipeline parameters. Must be 1 - 63 characters. The name "boot" is reserved for system use.
Required. The type of the disk to create.
The size of the disk. Defaults to 500 (GB). This field is not applicable for local SSD.
The full or partial URL of the persistent disk to attach. See https://cloud.google.com/compute/docs/reference/latest/instances#resource and https://cloud.google.com/compute/docs/disks/persistent-disks#snapshots for more details.
Deprecated. Disks created by the Pipelines API will be deleted at the end of the pipeline run, regardless of what this field is set to.
Required at create time and cannot be overridden at run time. Specifies the path in the docker container where files on this disk should be located. For example, if `mountPoint` is `/mnt/disk`, and the parameter has `localPath` `inputs/file.txt`, the docker container can access the data at `/mnt/disk/inputs/file.txt`.
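The path resolution described above amounts to joining the disk's mount point with the parameter's relative path. A minimal sketch follows, assuming disks and `localCopy` entries are plain dicts with the field names used in this section; `container_path` is a hypothetical helper, and the `boot` entry uses the `/` mount point given for the boot disk.

```python
import posixpath
from typing import Dict, List


def container_path(disks: List[Dict[str, str]], local_copy: Dict[str, str]) -> str:
    """Resolve where a localCopy parameter appears inside the docker container."""
    mount_points = {disk["name"]: disk["mountPoint"] for disk in disks}
    mount_points["boot"] = "/"  # the reserved boot disk is mounted at /
    disk_name = local_copy["disk"]
    if disk_name not in mount_points:
        raise KeyError(f"unknown disk: {disk_name}")
    if local_copy["path"].startswith("/"):
        # Paths are relative to the disk's mount point.
        raise ValueError("path must be relative to the mount point")
    return posixpath.join(mount_points[disk_name], local_copy["path"])


disks = [{"name": "pd1", "mountPoint": "/mnt/disk"}]
print(container_path(disks, {"disk": "pd1", "path": "inputs/file.txt"}))
```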
The types of disks that may be attached to VMs.
Default disk type. Use one of the other options below.
Specifies a Google Compute Engine persistent hard disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
Specifies a Google Compute Engine persistent solid-state disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
Specifies a Google Compute Engine local SSD. See https://cloud.google.com/compute/docs/disks/local-ssd for details.
The pipeline run arguments.
Required. The project in which to run the pipeline. The caller must have WRITER access to all Google Cloud services and resources (e.g. Google Compute Engine) that will be used.
Pipeline input arguments; keys are defined in the pipeline documentation. All input parameters that do not have default values must be specified. If parameters with defaults are specified here, the defaults will be overridden.
Pipeline output arguments; keys are defined in the pipeline documentation. All output parameters without default values must be specified. If parameters with defaults are specified here, the defaults will be overridden.
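The default-and-override behaviour of the inputs and outputs maps can be sketched as a merge over the declared parameters. `resolve_arguments` is a hypothetical helper; parameters are modelled as dicts with `name` and an optional `defaultValue`, and raising on missing values is a client-side stand-in for however the service reports the error.

```python
from typing import Dict, List


def resolve_arguments(parameters: List[Dict[str, str]], supplied: Dict[str, str]) -> Dict[str, str]:
    """Apply run-time values over create-time defaults for a list of parameters."""
    resolved: Dict[str, str] = {}
    missing: List[str] = []
    for parameter in parameters:
        name = parameter["name"]
        if name in supplied:
            resolved[name] = supplied[name]  # run-time value overrides any default
        elif "defaultValue" in parameter:
            resolved[name] = parameter["defaultValue"]
        else:
            missing.append(name)  # no default and nothing supplied
    if missing:
        raise ValueError("missing values for: " + ", ".join(missing))
    return resolved
```

The same function applies to both maps: call it once with the input parameters and `inputs`, and once with the output parameters and `outputs`.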
The Google Cloud Service Account that will be used to access data and services. By default, the compute service account associated with `projectId` is used.
This field is deprecated. Use `labels` instead. Client-specified pipeline operation identifier.
Specifies resource requirements/overrides for the pipeline run.
Required. Logging options. Used by the service to communicate results to the user.
How long to keep the VM up after a failure (for example docker command failed, copying input or output files failed, etc). While the VM is up, one can ssh into the VM to debug. Default is 0; maximum allowed value is 1 day.
Labels to apply to this pipeline run. Labels will also be applied to compute resources (VM, disks) created by this pipeline run. When listing operations, operations can be [filtered by labels][google.longrunning.ListOperationsRequest.filter]. Label keys may not be empty; label values may be empty. Non-empty labels must be 1-63 characters long, and comply with [RFC1035](https://www.ietf.org/rfc/rfc1035.txt). Specifically, the name must be 1-63 characters long and match the regular expression `[a-z]([-a-z0-9]*[a-z0-9])?` which means the first character must be a lowercase letter, and all following characters must be a dash, lowercase letter, or digit, except the last character, which cannot be a dash.
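The label rules can be checked locally before submitting a run. The sketch below applies the regular expression and length limit quoted above; the helper names are hypothetical, and the service's own validation remains authoritative.

```python
import re
from typing import Dict, List

# Regular expression quoted in the label rules.
_LABEL_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def is_valid_label(text: str, allow_empty: bool) -> bool:
    """Check one label key or value against the length and pattern rules."""
    if text == "":
        return allow_empty
    return len(text) <= 63 and _LABEL_RE.fullmatch(text) is not None


def invalid_labels(labels: Dict[str, str]) -> List[str]:
    """Return the keys whose key or value breaks the rules."""
    return [
        key
        for key, value in labels.items()
        # Keys may not be empty; values may be.
        if not is_valid_label(key, allow_empty=False)
        or not is_valid_label(value, allow_empty=True)
    ]
```

`fullmatch` is used so that the pattern must cover the whole string rather than just a prefix.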
Runtime metadata that will be populated in the [runtimeMetadata][google.genomics.v1.OperationMetadata.runtime_metadata] field of the Operation associated with a RunPipeline execution.
Execution information specific to Google Compute Engine.
A Google Cloud Service Account.
Email address of the service account. Defaults to `default`, which uses the compute service account associated with the project.
List of scopes to be enabled for this service account on the VM. The following scopes are automatically included: * https://www.googleapis.com/auth/compute * https://www.googleapis.com/auth/devstorage.full_control * https://www.googleapis.com/auth/genomics * https://www.googleapis.com/auth/logging.write * https://www.googleapis.com/auth/monitoring.write
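Because the listed scopes are always included, the set a VM ends up with is the union of those and any scopes requested in this field. A minimal sketch of that union follows; `effective_scopes` is a hypothetical helper, and the ordering of the result is an arbitrary choice made here.

```python
from typing import Iterable, List

# Scopes listed above as automatically included.
AUTOMATIC_SCOPES = (
    "https://www.googleapis.com/auth/compute",
    "https://www.googleapis.com/auth/devstorage.full_control",
    "https://www.googleapis.com/auth/genomics",
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring.write",
)


def effective_scopes(requested: Iterable[str] = ()) -> List[str]:
    """Union of automatic and requested scopes, without duplicates."""
    merged = dict.fromkeys(AUTOMATIC_SCOPES)  # dicts preserve insertion order
    merged.update(dict.fromkeys(requested))
    return list(merged)
```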
Stores the list of major events in job execution and the times they occurred.
String indicating the type of event.
The time this event occurred.