Inserts or updates an ArtifactType. A type has a set of strongly typed properties describing the schema of any stored instance associated with that type. A type is identified by a name and an optional version.
Type Creation: If no type exists in the database with the given identifier (name, version), it creates a new type and returns the type_id.
Type Evolution: If a type with the same (name, version) as the request type already exists (call it stored_type), the method allows stored_type to be updated only when the request type is backward compatible with the already stored instances.
Backward compatibility is violated iff:
a) there is a property where the request type and stored_type have different value types (e.g., int vs. string);
b) `can_add_fields = false` and the request type has a new property that is not stored;
c) `can_omit_fields = false` and stored_type has an existing property that is not provided in the request type.
If a non-backward-compatible type change is required in the application (e.g., deprecating properties, re-purposing a property name, changing value types), a new type can be created with a different (name, version) identifier. Note that the type version is optional, and a version whose value is the empty string is treated as unset.
Args:
artifact_type: the type to be inserted or updated.
can_add_fields: when set to true, new properties can be added; when set to false, returns ALREADY_EXISTS if the request type has properties that are not in stored_type.
can_omit_fields: when set to true, stored properties can be omitted in the request type; when set to false, returns ALREADY_EXISTS if the stored_type has properties not in the request type.
Returns: The type_id of the stored type.
Raises: ALREADY_EXISTS error in the cases listed above. INVALID_ARGUMENT error, if the given type has no name, or any property value type is unknown.
The field is required in any request. Stored types in MLMD can be updated by introducing new properties and remain backward compatible. If a type with the same name exists in the database, it updates the existing type, otherwise it creates a new type.
If true then allows adding properties to an existing stored type. If false, then type update is not allowed and it raises AlreadyExists error if the given type has any new property that is not defined in the stored type.
If true then allows omitting properties of an existing stored type. If false, then no properties of the stored type can be omitted in the given type, otherwise it raises AlreadyExists error.
Deprecated fields.
Options regarding transactions.
The type ID of the artifact type.
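The backward-compatibility rules (a), (b) and (c) described for PutArtifactType can be sketched in a few lines of Python. This is only an illustrative model, not MLMD code: the function name `find_violations` and the representation of a type as a dict from property name to value-type name are assumptions made for the example. A non-empty result corresponds to the cases where the service would answer with ALREADY_EXISTS.

```python
from typing import Dict, List


def find_violations(
    stored: Dict[str, str],
    request: Dict[str, str],
    can_add_fields: bool = False,
    can_omit_fields: bool = False,
) -> List[str]:
    """Lists the backward-compatibility violations of a type update.

    Hypothetical helper: both dicts map property name -> value type name.
    """
    violations = []
    # Rule (a): a property present on both sides must keep its value type.
    for name in sorted(stored.keys() & request.keys()):
        if stored[name] != request[name]:
            violations.append(f"value type of {name!r} changed")
    # Rule (b): new properties are only allowed with can_add_fields.
    added = sorted(request.keys() - stored.keys())
    if added and not can_add_fields:
        violations.append(f"new properties {added} need can_add_fields")
    # Rule (c): omitted properties are only allowed with can_omit_fields.
    omitted = sorted(stored.keys() - request.keys())
    if omitted and not can_omit_fields:
        violations.append(f"omitted properties {omitted} need can_omit_fields")
    return violations


stored_type = {"split": "STRING", "version": "INT"}
# Adding a property is accepted only when can_add_fields is true.
print(find_violations(stored_type, {**stored_type, "rows": "INT"}))
print(find_violations(stored_type, {**stored_type, "rows": "INT"}, can_add_fields=True))
```

Changing a value type is never accepted in this model, whatever the flags; that matches the advice above to create a new (name, version) identifier for such changes.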
Inserts or updates an ExecutionType. Please refer to PutArtifactType for type upsert API description.
The field is required in any request. Stored types in MLMD can be updated by introducing new properties and remain backward compatible. If a type with the same name exists in the database, it updates the existing type, otherwise it creates a new type.
If true then allows adding properties to an existing stored type. If false, then type update is not allowed and it raises AlreadyExists error if the given type has any new property that is not defined in the stored type.
If true then allows omitting properties of an existing stored type. If false, then no properties of the stored type can be omitted in the given type, otherwise it raises AlreadyExists error.
Deprecated fields.
Options regarding transactions.
The type ID of the execution type.
Inserts or updates a ContextType. Please refer to PutArtifactType for type upsert API description.
The field is required in any request. Stored types in MLMD can be updated by introducing new properties and remain backward compatible. If a type with the same name exists in the database, it updates the existing type, otherwise it creates a new type.
If true then allows adding properties to an existing stored type. If false, then type update is not allowed and it raises AlreadyExists error if the given type has any new property that is not defined in the stored type.
If true then allows omitting properties of an existing stored type. If false, then no properties of the stored type can be omitted in the given type, otherwise it raises AlreadyExists error.
Deprecated fields.
Options regarding transactions.
The type ID of the context type.
Bulk inserts types atomically.
If true then allows adding properties to an existing stored type. If false, then type update is not allowed and it raises AlreadyExists error if the given type has any new property that is not defined in the stored type.
If true then allows omitting properties of an existing stored type. If false, then no properties of the stored type can be omitted in the given type, otherwise it raises AlreadyExists error.
Deprecated fields.
Options regarding transactions.
The type ids of the artifact types.
The type ids of the execution types.
The type ids of the context types.
Inserts or updates artifacts in the database. If an artifact_id is specified for an artifact, it is an update. If an artifact_id is unspecified, it will insert a new artifact. For new artifacts, type must be specified. For old artifacts, type must be unchanged or unspecified. It is not guaranteed that the created or updated artifacts will share the same `create_time_since_epoch` or `last_update_time_since_epoch` timestamps. Args: artifacts: A list of artifacts to insert or update. Returns: A list of artifact ids index-aligned with the input.
Additional options to change the behavior of the method.
Options regarding transactions.
FieldMask for artifacts in the PUT update.
If `artifact.id` is not specified, a new artifact will be created and `update_mask` will not be applied to the creation.
If `update_mask` is empty, the artifacts are updated as a whole.
If `update_mask` is not empty, only the fields or properties specified in `update_mask` are updated.
Example request protos:
1. Examples that update `properties` / `custom_properties`:
1.1 Add a <'key', 'val'> pair into `custom_properties`:
    { artifacts { id: 1234 type_id: 5678 custom_properties { key: "key" value: { string_value: "val" } } } update_mask { paths: "custom_properties.key" } }
1.2 Set `custom_properties['key'].bool_value` to true:
    { artifacts { id: 1234 type_id: 5678 custom_properties { key: "key" value: { bool_value: true } } } update_mask { paths: "custom_properties.key" } }
1.3 Delete the complete <'key', 'val'> pair from `custom_properties`:
    { artifacts { id: 1234 type_id: 5678 custom_properties {} } update_mask { paths: "custom_properties.key" } }
2. Examples that update fields such as `uri`, `external_id`, etc.:
2.1 Update the `external_id` field:
    { artifacts { id: 1234 type_id: 5678 external_id: "new_value" } update_mask { paths: "external_id" } }
2.2 Set the `uri` field:
    { artifacts { id: 1234 type_id: 5678 uri: "set_value" } update_mask { paths: "uri" } }
If `paths: "properties"` or `paths: "custom_properties"` is added to `update_mask`, the key-level updates are ignored and only field-level updates are performed on all `properties`/`custom_properties`. For example:
If the mask is {"properties", "properties.key1"}, the field path "properties.key1" is ignored and all `properties` are updated. (Not recommended.)
If the mask is {"properties", "external_id"}, all `properties` and the field `external_id` are updated. (Not recommended.)
A list of artifact ids index-aligned with PutArtifactsRequest.
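The `update_mask` rules for PutArtifactsRequest can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the MLMD implementation: artifacts are plain dicts, `apply_update_mask` is a hypothetical helper, and clearing a masked field that the request leaves unset is an assumption borrowed from common FieldMask behavior.

```python
import copy
from typing import Any, Dict, Iterable

MAP_FIELDS = ("properties", "custom_properties")


def apply_update_mask(
    stored: Dict[str, Any], request: Dict[str, Any], paths: Iterable[str]
) -> Dict[str, Any]:
    """Returns a new record: `request` applied to `stored` under the mask."""
    paths = list(paths)
    if not paths:
        # Empty mask: the artifact is updated as a whole.
        return copy.deepcopy(request)
    result = copy.deepcopy(stored)
    for path in paths:
        field, _, key = path.partition(".")
        if field in MAP_FIELDS and key:
            if field in paths:
                continue  # a field-level path is present: key-level path ignored
            target = result.setdefault(field, {})
            new_map = request.get(field, {})
            if key in new_map:
                target[key] = copy.deepcopy(new_map[key])  # add or overwrite
            else:
                target.pop(key, None)  # key missing in request: delete the pair
        elif field in request:
            result[field] = copy.deepcopy(request[field])  # field-level update
        else:
            result.pop(field, None)  # assumption: unset masked field is cleared
    return result


stored = {"id": 1234, "type_id": 5678, "uri": "old", "custom_properties": {"a": 1}}
request = {"id": 1234, "type_id": 5678, "custom_properties": {"key": "val"}}
print(apply_update_mask(stored, request, ["custom_properties.key"]))
```

With the key-level path only `custom_properties["key"]` is touched; `uri` and `custom_properties["a"]` keep their stored values.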
Inserts or updates executions in the database. If an execution_id is specified for an execution, it is an update. If an execution_id is unspecified, it will insert a new execution. For new executions, type must be specified. For old executions, type must be unchanged or unspecified. It is not guaranteed that the created or updated executions will share the same `create_time_since_epoch` or `last_update_time_since_epoch` timestamps. Args: executions: A list of executions to insert or update. Returns: A list of execution ids index-aligned with the input.
Options regarding transactions.
FieldMask for executions in the PUT update.
If `execution.id` is not specified, a new execution will be created and `update_mask` will not be applied to the creation.
If `update_mask` is empty, the executions are updated as a whole.
If `update_mask` is not empty, only the fields or properties specified in `update_mask` are updated.
Example request protos:
1. Add a <'key', 'val'> pair into `custom_properties`:
    { executions { id: 1234 type_id: 5678 custom_properties { key: "key" value: { string_value: "val" } } } update_mask { paths: "custom_properties.key" } }
2. Set the `last_known_state` field:
    { executions { id: 1234 type_id: 5678 last_known_state: CACHED } update_mask { paths: "last_known_state" } }
Please refer to `PutArtifactsRequest` for more details.
A list of execution ids index-aligned with PutExecutionsRequest.
Inserts events in the database. The execution_id and artifact_id must already exist. Once created, events cannot be modified. AlreadyExists error will be raised if duplicated events are found. It is not guaranteed that the created or updated events will share the same `milliseconds_since_epoch` timestamps. Args: events: A list of events to insert or update.
Options regarding transactions.
(message has no fields)
Inserts or updates an Execution and its input and output artifacts and related contexts atomically. The `artifact_event_pairs` include the state changes of the Artifacts used or generated by the Execution, as well as the input/output Event. The `contexts` describe the associations of the execution and the attributions of the artifacts. If an execution_id is specified, it is an update on the corresponding execution, otherwise it does an insertion. For insertion, type must be specified. The same rule applies to artifacts and contexts in the request. Corresponding errors may be raised. For example: AlreadyExists error will be raised if duplicated executions, artifacts or events are found. It is not guaranteed that the created or updated executions, artifacts, contexts and events will share the same `create_time_since_epoch`, `last_update_time_since_epoch`, or `milliseconds_since_epoch` timestamps. Args: execution: An execution to insert or update. artifact_event_pairs: Artifacts to insert or update and events to insert. contexts: The contexts that the execution and the artifacts belong to. Returns: An execution id and lists of artifact and context ids index-aligned with the input.
The execution that produces many artifact and event pairs.
The list of artifact and event pairs.
A list of contexts associated with the execution and artifacts. For each given context without a context.id, it inserts the context, otherwise it updates the stored context with the same id (unless the option force_reuse_context is set). Associations between each pair of contexts and the execution, and attributions between each pair of contexts and artifacts are created if they do not already exist.
Additional options to change the behavior of the method.
Options regarding transactions.
An execution id of the `execution` in PutExecutionRequest.
A list of artifact ids index-aligned with `artifact_event_pairs` in the PutExecutionRequest.
A list of context ids index-aligned with `contexts` in the PutExecutionRequest.
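The PutExecution description says that an association is created for each (context, execution) pair and an attribution for each (context, artifact) pair, if they do not already exist. A minimal sketch of that pairing, using Python sets to model "created only if missing"; the helper name `link_contexts` and the tuple representation are assumptions for illustration.

```python
from typing import Iterable, Set, Tuple


def link_contexts(
    context_ids: Iterable[int],
    execution_id: int,
    artifact_ids: Iterable[int],
    associations: Set[Tuple[int, int]],
    attributions: Set[Tuple[int, int]],
) -> None:
    """Adds the missing (context, execution) and (context, artifact) edges."""
    artifact_ids = list(artifact_ids)
    for context_id in context_ids:
        # Set insertion is a no-op when the edge already exists.
        associations.add((context_id, execution_id))
        for artifact_id in artifact_ids:
            attributions.add((context_id, artifact_id))


associations, attributions = set(), set()
link_contexts([1, 2], 12, [3, 7], associations, attributions)
link_contexts([1, 2], 12, [3, 7], associations, attributions)  # repeated call adds nothing
print(sorted(associations), sorted(attributions))
```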
Inserts or updates a lineage subgraph (i.e. a collection of event edges and its executions, artifacts, and related contexts) atomically. The `event_edges` include an Event and the indices of the corresponding execution and artifact from the input list of executions and artifacts. The `contexts` describe the associations of the Execution and the attributions of the Artifact. If an execution_id is specified, it is an update on the corresponding Execution, otherwise it does an insertion. For insertion, type must be specified. These rules apply to Artifacts and Contexts as well. Corresponding errors may be raised. For example: AlreadyExists error will be raised if duplicated executions, artifacts, or events are found. It is not guaranteed that the created or updated executions, artifacts, contexts and events will share the same `create_time_since_epoch`, `last_update_time_since_epoch`, or `milliseconds_since_epoch` timestamps. Args: executions: A list of executions to insert or update. artifacts: A list of artifacts to insert or update. contexts: A list of contexts to insert and/or create associations and attributions with. event_edges: A list of events to insert with the indices of the corresponding execution and artifact from the input lists of executions and artifacts. Returns: Lists of execution, artifact, and context ids index-aligned with the inputs.
A list of execution ids index-aligned with `executions` in the request
A list of artifact ids index-aligned with `artifacts` in the request
A list of context ids index-aligned with `contexts` in the request
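Because `event_edges` refer to executions and artifacts by their position in the request lists, and the response ids are index-aligned with those lists, the concrete ids of each event can be recovered by indexing. The sketch below assumes dict-shaped edges with the keys `execution_index`, `artifact_index` and `event`; these key names and the helper `resolve_event_edges` are assumptions for illustration.

```python
from typing import Any, Dict, List


def resolve_event_edges(
    execution_ids: List[int],
    artifact_ids: List[int],
    event_edges: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Turns index-based edges into events that carry concrete ids."""
    events = []
    for edge in event_edges:
        event = dict(edge["event"])  # copy so the input is left untouched
        event["execution_id"] = execution_ids[edge["execution_index"]]
        event["artifact_id"] = artifact_ids[edge["artifact_index"]]
        events.append(event)
    return events


edges = [{"execution_index": 0, "artifact_index": 2, "event": {"type": "OUTPUT"}}]
print(resolve_event_edges([10, 11], [20, 21, 22], edges))
```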
Inserts or updates contexts in the database and returns a list of context ids. If a context_id is specified for a context, it is an update. If a context_id is unspecified, it will insert a new context. For new contexts, type must be specified. For old contexts, type must be unchanged or unspecified. It is not guaranteed that the created or updated contexts will share the same `create_time_since_epoch` or `last_update_time_since_epoch` timestamps. Args: contexts: A list of contexts to insert or update. Returns: A list of context ids index-aligned with the input.
Options regarding transactions.
FieldMask for contexts in the PUT update.
If `context.id` is not specified, a new context will be created and `update_mask` will not be applied to the creation.
If `update_mask` is empty, the contexts are updated as a whole.
If `update_mask` is not empty, only the fields or properties specified in `update_mask` are updated.
Example request protos:
1. Add a <'key', 'val'> pair into `custom_properties`:
    { contexts { id: 1234 type_id: 5678 custom_properties { key: "key" value: { string_value: "val" } } } update_mask { paths: "custom_properties.key" } }
2. Set the `name` field:
    { contexts { id: 1234 type_id: 5678 name: "set_name" } update_mask { paths: "name" } }
Please refer to `PutArtifactsRequest` for more details.
A list of context ids index-aligned with PutContextsRequest.
Inserts attribution and association relationships in the database. The context_id, artifact_id, and execution_id must already exist. If the relationship exists, this call does nothing. Once added, the relationships cannot be modified. Args: attributions: A list of attributions to insert. associations: A list of associations to insert.
Options regarding transactions.
(message has no fields)
Inserts parental context relationships in the database. The ParentContext relationship has direction. The call fails if cycles are detected. Args: parent_contexts: A list of parent contexts to insert.
Options regarding transactions.
(message has no fields)
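The rule that inserting a ParentContext fails when a cycle is detected can be modeled as a reachability check on the directed child-to-parent graph. The following is an illustrative sketch only; the helper names and the dict-of-sets graph are assumptions, and the real service may detect cycles differently.

```python
from typing import Dict, Set


def would_create_cycle(parents: Dict[int, Set[int]], child_id: int, parent_id: int) -> bool:
    """True if adding the edge child -> parent would close a cycle."""
    # A cycle appears exactly when `child_id` is already reachable from
    # `parent_id` by following existing parent links (or both ids are equal).
    stack, seen = [parent_id], set()
    while stack:
        node = stack.pop()
        if node == child_id:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return False


def put_parent_context(parents: Dict[int, Set[int]], child_id: int, parent_id: int) -> None:
    """Inserts the edge, or raises ValueError when it would create a cycle."""
    if would_create_cycle(parents, child_id, parent_id):
        raise ValueError(f"context {parent_id} cannot be a parent of {child_id}")
    parents.setdefault(child_id, set()).add(parent_id)


graph: Dict[int, Set[int]] = {}
put_parent_context(graph, child_id=2, parent_id=1)
put_parent_context(graph, child_id=3, parent_id=2)
print(would_create_cycle(graph, child_id=1, parent_id=3))
```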
Gets an artifact type. Returns a NOT_FOUND error if the type does not exist.
If not set, it looks for the type with type_name with default type_version.
Options regarding transactions.
Gets an artifact type, or clear if it does not exist.
Gets a list of artifact types by ID. If no artifact type with a given ID exists, that ID is skipped.
Options regarding transactions.
The result is not index-aligned: if an id is not found, it is not returned.
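Since missing ids are silently skipped, a caller that needs one result slot per requested id has to re-align the response itself. A minimal sketch, assuming the returned types are available as dicts with an `id` key (the helper `align_by_id` is hypothetical):

```python
from typing import Any, Dict, List, Optional


def align_by_id(
    requested_ids: List[int], returned: List[Dict[str, Any]]
) -> List[Optional[Dict[str, Any]]]:
    """Re-aligns a response with the request; missing ids become None."""
    by_id = {item["id"]: item for item in returned}
    return [by_id.get(requested_id) for requested_id in requested_ids]


response = [{"id": 7, "name": "Model"}, {"id": 3, "name": "DataSet"}]
print(align_by_id([3, 5, 7], response))
```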
Gets a list of all artifact types.
Options regarding transactions.
Gets an execution type, or None if it does not exist.
If not set, it looks for the type with type_name with default type_version.
Options regarding transactions.
Gets an execution type, or clear if it does not exist.
Gets a list of execution types by ID. If no execution type with a given ID exists, that ID is skipped.
Options regarding transactions.
The result is not index-aligned: if an id is not found, it is not returned.
Gets a list of all execution types.
Options regarding transactions.
Gets a context type. Returns a NOT_FOUND error if the type does not exist.
If not set, it looks for the type with type_name with default type_version.
Options regarding transactions.
Gets a context type, or clear if it does not exist.
Gets a list of context types by ID. If no context type with a given ID exists, that ID is skipped.
Options regarding transactions.
The result is not index-aligned: if an id is not found, it is not returned.
Gets a list of all context types.
Options regarding transactions.
Gets all the artifacts.
Request to retrieve Artifacts using List options. If option is not specified then all Artifacts are returned.
Specify options. Please refer to the documentation of ListOperationOptions for the supported functionalities.
Options regarding transactions.
Returned artifacts.
Token to use to retrieve next page of results if list options are used in the request.
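The next-page token implies the usual paging loop: request a page, consume it, and repeat with the returned token until no token comes back. The sketch below shows that loop against a fake page source; `iterate_pages`, `make_fake_fetch` and the integer-offset token format are assumptions for illustration and say nothing about how MLMD encodes its tokens.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[dict], Optional[str]]


def iterate_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yields every item by following next-page tokens until none is returned."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if not token:
            return


def make_fake_fetch(rows: List[dict], page_size: int) -> Callable[[Optional[str]], Page]:
    """Builds an in-memory page source whose token is simply the next offset."""
    def fetch(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        next_token = str(end) if end < len(rows) else None
        return rows[start:end], next_token
    return fetch


rows = [{"id": i} for i in range(5)]
print([row["id"] for row in iterate_pages(make_fake_fetch(rows, page_size=2))])
```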
Gets all the executions.
Request to retrieve Executions using List options. If option is not specified then all Executions are returned.
Specify options. Please refer to the documentation of ListOperationOptions for the supported functionalities.
Options regarding transactions.
Returned executions.
Token to use to retrieve next page of results if list options are used in the request.
Gets all the contexts.
Request to retrieve Contexts using List options. If option is not specified then all Contexts are returned.
Specify options. Please refer to the documentation of ListOperationOptions for the supported functionalities.
Options regarding transactions.
Returned contexts.
Token to use to retrieve next page of results if list options are used in the request.
Gets all artifacts with matching ids. The result is not index-aligned: if an id is not found, it is not returned. Args: artifact_ids: A list of artifact ids to retrieve. Returns: Artifacts with matching ids.
A list of artifact ids to retrieve.
An option to populate all the ArtifactTypes in the response. If true, returns retrieved Artifacts and their artifact types, which can be matched by type_ids. If false, returns only the retrieved Artifacts. Example request proto: { artifact_ids: 101, populate_artifact_types: true, } The response will contain an artifact with id = 101 and an artifact type with id = artifact.type_id().
Options regarding transactions.
Artifacts with matching ids. This is not index-aligned: if an id is not found, it is not returned.
ArtifactTypes populated with matching type_ids owned by `artifacts`. This is not index-aligned: if a type_id is not found, it is not returned.
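When `populate_artifact_types` is set, the artifacts and their types arrive as two separate lists that are matched through `type_id`. A small sketch of that join, assuming dict-shaped records (the helper `attach_types` is hypothetical):

```python
from typing import Any, Dict, List, Optional, Tuple


def attach_types(
    artifacts: List[Dict[str, Any]], artifact_types: List[Dict[str, Any]]
) -> List[Tuple[Dict[str, Any], Optional[Dict[str, Any]]]]:
    """Pairs each artifact with its type, or None if the type was not returned."""
    types_by_id = {artifact_type["id"]: artifact_type for artifact_type in artifact_types}
    return [(artifact, types_by_id.get(artifact["type_id"])) for artifact in artifacts]


artifacts = [{"id": 101, "type_id": 5}, {"id": 102, "type_id": 5}]
types = [{"id": 5, "name": "DataSet"}]
for artifact, artifact_type in attach_types(artifacts, types):
    print(artifact["id"], artifact_type["name"])
```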
Gets all executions with matching ids. The result is not index-aligned: if an id is not found, it is not returned. Args: execution_ids: A list of execution ids to retrieve.
A list of execution ids to retrieve.
Options regarding transactions.
The result is not index-aligned: if an id is not found, it is not returned.
Gets all contexts with matching ids. The result is not index-aligned: if an id is not found, it is not returned. Args: context_ids: A list of context ids to retrieve.
A list of context ids to retrieve.
Options regarding transactions.
The result is not index-aligned: if an id is not found, it is not returned.
Gets all the artifacts of a given type.
If not set, it looks for the type with type_name with default type_version.
Specify List options. Currently supports: 1. Field to order the results. 2. Page size. If set, the request will first fetch all artifacts with the specified `type_name` and `type_version`, then order them by the specified field, and finally find the correct page and return a page-size number of Artifacts. Higher-level APIs may only use the functionalities partially. Please refer to the API documentation for the API behaviors.
Options regarding transactions.
Token to use to retrieve next page of results if list options are used in the request.
Gets all the executions of a given type.
If not set, it looks for the type with type_name with default type_version.
Specify List options. Currently supports: 1. Field to order the results. 2. Page size. If set, the request will first fetch all executions with the specified `type_name` and `type_version`, then order them by the specified field, and finally find the correct page and return a page-size number of Executions. Higher-level APIs may only use the functionalities partially. Please refer to the API documentation for the API behaviors.
Options regarding transactions.
Token to use to retrieve next page of results if list options are used in the request.
Gets all the contexts of a given type.
Specify options. Currently supports: 1. Field to order the results. 2. Page size. If set, the request will first fetch all contexts with the specified `type_name` and `type_version`, then order them by the specified field, and finally find the correct page and return a page-size number of Contexts. Higher-level APIs may only use the functionalities partially. Please refer to the API documentation for the API behaviors.
If not set, it looks for the type with type_name and options with default type_version.
Options regarding transactions.
Token to use to retrieve next page of results if list options are used in the request.
Gets the artifact of the given type and artifact name.
If not set, it looks for the type with type_name and artifact_name with default type_version.
Options regarding transactions.
Gets the execution of the given type and execution name.
If not set, it looks for the type with type_name and execution_name with default type_version.
Options regarding transactions.
Gets the context of the given type and context name.
If not set, it looks for the type with type_name and context_name with default type_version.
Options regarding transactions.
Gets all the artifacts with matching uris.
A list of artifact uris to retrieve.
Options regarding transactions.
Gets all events with matching execution ids.
Gets all events with matching execution ids.
Options regarding transactions.
Gets all events with matching artifact ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all the artifacts with matching external ids.
Options regarding transactions.
Gets all contexts that an artifact is attributed to.
Options regarding transactions.
Gets all contexts that an execution is associated with.
Options regarding transactions.
Gets all parent contexts that a context is related to.
Options regarding transactions.
Gets all child contexts that a context is related to.
Options regarding transactions.
Gets, in one batch, all the parent contexts that a list of contexts is related to.
Options regarding transactions.
Gets, in one batch, all the child contexts that a list of contexts is related to.
Options regarding transactions.
Gets all direct artifacts that a context attributes to.
Specify List options. Currently supports: 1. Field to order the results. 2. Page size.
Options regarding transactions.
Token to use to retrieve next page of results if list options are used in the request.
Gets all direct executions that a context associates with.
Specify List options. Currently supports: 1. Field to order the results. 2. Page size.
Options regarding transactions.
Token to use to retrieve next page of results if list options are used in the request.
Options regarding transactions.
Deprecated: GetLineageGraph API is deprecated, please refer to GetLineageSubgraph API as the alternative. The transaction performs a constrained transitive closure and returns a lineage subgraph satisfying the conditions and constraints specified in the GetLineageGraphRequest.
Deprecated: GetLineageGraph API is deprecated, please refer to GetLineageSubgraph API as the alternative. A lineage query request to specify the query nodes of interest and the boundary conditions for pruning the returned graph.
Options regarding transactions.
Deprecated: GetLineageGraph API is deprecated, please refer to GetLineageSubgraph API as the alternative. A connected lineage `subgraph` about the MLMD nodes derived from LineageGraphRequest.query_conditions.
Gets a lineage subgraph by performing graph traversal from a list of interested nodes. A lineage subgraph without node details (e.g., external_id, properties) will be returned.
Query options for lineage graph tracing from a list of interested nodes. A lineage subgraph without node details (e.g., external_id, properties) will be returned. Please refer to `LineageSubgraphQueryOptions` for more details.
`read_mask` contains user specified paths of fields that should be included in the returned lineage subgraph. Supported field paths are: 'artifacts', 'executions', 'contexts', 'artifact_types', 'execution_types', 'context_types', 'events', 'associations' and 'attributions'. If 'artifacts', 'executions', or 'contexts' is specified in `read_mask`, the dehydrated nodes will be included. Note: A dehydrated node means a node containing only its id and no other information. User should call GetNodesByID or other APIs to get node details later on. If 'artifact_types', 'execution_types', or 'context_types' is specified in `read_mask`, all the node types will be included. If 'events', 'associations', or 'attributions' is specified in `read_mask`, the corresponding edges will be included. If `read_mask` is not set, the API will return all the fields in the returned graph. Note: Only paths of fields in LineageGraph message are supported. Paths of fields in the submessage, such as "artifacts.id", "contexts.name" are not acknowledged.
A lineage subgraph of MLMD nodes and relations retrieved from lineage graph tracing.
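The `read_mask` behavior described for the lineage subgraph request can be modeled as a filter over the top-level fields of the graph, with nodes reduced to their ids. The sketch below is a simplified model under assumptions: the graph is a dict of lists, `apply_read_mask` is a hypothetical helper, and a mask containing only unsupported paths is assumed to yield an empty result.

```python
from typing import Any, Dict, Iterable, List, Optional

NODE_FIELDS = frozenset({"artifacts", "executions", "contexts"})
SUPPORTED_PATHS = NODE_FIELDS | {
    "artifact_types", "execution_types", "context_types",
    "events", "associations", "attributions",
}


def apply_read_mask(
    graph: Dict[str, List[Dict[str, Any]]], read_mask: Optional[Iterable[str]] = None
) -> Dict[str, List[Dict[str, Any]]]:
    """Keeps only the masked top-level fields; nodes come back dehydrated."""
    if read_mask is None:
        selected = SUPPORTED_PATHS  # mask not set: every field is returned
    else:
        # Sub-message paths such as "artifacts.id" are not supported: dropped.
        selected = SUPPORTED_PATHS & set(read_mask)
    result = {}
    for field in sorted(selected):
        values = graph.get(field, [])
        if field in NODE_FIELDS:
            values = [{"id": node["id"]} for node in values]  # id only
        result[field] = list(values)
    return result


graph = {
    "artifacts": [{"id": 1, "uri": "/tmp/a"}],
    "events": [{"artifact_id": 1, "execution_id": 2}],
}
print(apply_read_mask(graph, ["artifacts", "artifacts.id"]))
```

The dehydrated ids can then be passed to the by-id getters to fetch node details.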
Every ArtifactStruct is a member of this type.
(message has no fields)
Output only. The unique server generated id of the artifact.
The client provided name of the artifact. This field is optional. If set, it must be unique among all the artifacts of the same artifact type within a database instance and cannot be changed once set.
The id of an ArtifactType. This needs to be specified when an artifact is created, and it cannot be changed.
Output only. The name of an ArtifactType.
The uniform resource identifier of the physical artifact. May be empty if there is no physical artifact.
The external id that comes from the client's system. This field is optional. If set, it must be unique among all artifacts within a database instance.
Properties of the artifact. Properties must be specified in the ArtifactType.
User provided custom properties which are not defined by its type.
The state of the artifact known to the system.
Output only. Create time of the artifact in milliseconds since epoch.
Output only. Last update time of the artifact in milliseconds since epoch.
Output only.
A state indicating that the artifact may exist.
A state indicating that the artifact should exist, unless something external to the system deletes it.
A state indicating that the artifact should be deleted.
A state indicating that the artifact has been deleted.
A state indicating that the artifact has been abandoned, which may be due to a failed or cancelled execution.
A state indicating that the artifact is a reference artifact. At execution start time, the orchestrator produces an output artifact for each output key with state PENDING. However, for an intermediate artifact, this first artifact's state will be REFERENCE. Intermediate artifacts emitted during a component's execution will copy the REFERENCE artifact's attributes. At the end of an execution, the artifact state should remain REFERENCE instead of being changed to LIVE.
An artifact and type pair. Part of an artifact struct.
An artifact struct represents the input or output of an Execution. See the more specific types referenced in the message for more details.
Note: an artifact struct may be empty to indicate "None" or null.
An artifact struct that is a list.
Can be represented as a JSON list of artifact structs.
A dictionary of artifact structs. Can represent a dictionary.
An artifact struct that is a dictionary. Can be represented as a JSON dictionary of artifact structs.
The list of ArtifactStruct is EXPERIMENTAL and not in use yet. The type of an ArtifactStruct. An artifact struct type represents an infinite set of artifact structs. It can specify the input or output type of an ExecutionType. See the more specific types referenced in the message for more details.
Matches exactly this type.
The id of the type. 1-1 relationship between type names and IDs.
The name of the type. It must be unique among ArtifactTypes within a database instance.
An optional version of the type. An empty string is treated as unset.
An optional description about the type.
The external id that comes from the client's system. This field is optional. If set, it must be unique among all artifact types within a database instance.
The schema of the type. Properties are always optional in the artifact. Properties of an artifact type can be expanded but not contracted (i.e., you can add columns but not remove them).
An optional system defined base_type expressing the intent of the current type. This field is useful for the tool builders to utilize the stored MLMD information, e.g., `MyModel` ArtifactType could set base_type = MODEL.
An enum of system-defined artifact types.
The Association edges between Context and Execution instances.
Output only.
The Attribution edges between Context and Artifact instances.
Output only.
Configuration for a new connection.
PostgreSQL database connection config.
Options for overwriting the default retry setting when MLMD transactions return an Aborted error. The setting is currently available for the Python client library only. TODO(b/154862807) set the setting in transaction executor.
Output only. The unique server generated id of the context.
The client provided name of the context. It must be unique within a database instance.
The id of a ContextType. This needs to be specified when a context is created, and it cannot be changed.
Output only. The name of a ContextType.
The external id that comes from the client's system. This field is optional. If set, it must be unique among all contexts within a virtual database.
Values of the properties, which must be specified in the ContextType.
User provided custom properties which are not defined by its type.
Output only. Create time of the context in milliseconds since epoch.
Output only. Last update time of the context in milliseconds since epoch.
Output only system metadata.
The id of the type. 1-1 relationship between type names and IDs.
The name of the type, e.g., Pipeline, Task, Session, User, etc. It must be unique among ContextTypes within a database instance.
An optional version of the type. An empty string is treated as unset.
An optional description about the type.
The external id that comes from the client's system. This field is optional. If set, it must be unique among all context types within a database instance.
The schema of the type, e.g., name: string, owner: string. Properties are always optional in the context. Properties of a context type can be expanded but not contracted (i.e., you can add columns but not remove them).
An optional system defined base_type expressing the intent of the current context type. NOTE: currently there are no system Context types defined, and the field is not used for ContextType.
An enum of system-defined context types.
An artifact struct type that represents a record or struct-like dictionary. The corresponding ArtifactStruct would be a map (i.e., ArtifactStructMap).
Underlying properties for the type.
If true, a key whose type can be None (e.g., properties["foo"]) is not required.
Extra keys are allowed that are not specified in properties. These keys must have the type specified below. If this is not specified, then extra properties are not allowed.
An event represents a relationship between an artifact and an execution. There are different kinds of events, relating to both input and output, as well as how they are used by the MLMD-powered system. For example, the DECLARED_INPUT and DECLARED_OUTPUT events are part of the signature of an execution. Consider:
    my_result = my_execution({"data":[3,7],"schema":8})
where 3, 7, and 8 are artifact_ids. Assuming the execution_id of my_execution is 12 and the artifact_id of my_result is 15, the events are:
    { artifact_id:3, execution_id: 12, type:DECLARED_INPUT, path:{step:[{"key":"data"},{"index":0}]} }
    { artifact_id:7, execution_id: 12, type:DECLARED_INPUT, path:{step:[{"key":"data"},{"index":1}]} }
    { artifact_id:8, execution_id: 12, type:DECLARED_INPUT, path:{step:[{"key":"schema"}]} }
    { artifact_id:15, execution_id: 12, type:DECLARED_OUTPUT, path:{step:[{"key":"my_result"}]} }
Other event types include INPUT/OUTPUT, INTERNAL_INPUT/_OUTPUT and PENDING_OUTPUT:
* The INPUT/OUTPUT is an event where an execution actually reads/writes an artifact. The input/output artifacts may not be declared in the signature. For example, the trainer may output multiple caches of the parameters (as an OUTPUT), then finally write the SavedModel as a DECLARED_OUTPUT.
* The INTERNAL_INPUT/_OUTPUT are event types which are only meaningful to an orchestration system to keep track of the details for later debugging. For example, when a fork is conditioned on an artifact and then triggers an execution, the fork implementation may need to log the reads and writes of artifacts, which may not be worth displaying to the users. For instance, in the above example,
    my_result = my_execution({"data":[3,7],"schema":8})
there is another execution (id: 15), which represents a `garbage_collection` step in an orchestration system
    gc_result = garbage_collection(my_result)
that cleans `my_result` if needed. The details should be invisible to the end users and lineage tracking. The orchestrator can emit the following events:
    { artifact_id: 15, execution_id: 15, type:INTERNAL_INPUT, }
    { artifact_id:16, // New artifact containing the GC job result.
      execution_id: 15, type:INTERNAL_OUTPUT, path:{step:[{"key":"gc_result"}]} }
* The PENDING_OUTPUT event is used to indicate that an artifact is tentatively associated with an active execution which has not yet been finalized. For example, an orchestration system can register output artifacts of a running execution with PENDING_OUTPUT events to indicate the output artifacts the execution is expected to produce. When the execution is finished, the final set of output artifacts can be associated with the execution using OUTPUT events, and any unused artifacts which were previously registered with PENDING_OUTPUT events can be updated to set their Artifact.State to ABANDONED.
Events are unique for each (artifact_id, execution_id, type) combination within a metadata store.
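The DECLARED_INPUT and DECLARED_OUTPUT events from the `my_execution` example can be derived mechanically from the call signature. The following is a minimal Python sketch using plain dictionaries rather than the generated protobuf classes. The helper names `iter_artifact_paths` and `declared_events` are hypothetical, and leaves of the input struct are assumed to be artifact ids.

```python
from typing import Any, Dict, Iterator, List, Tuple

Step = Dict[str, Any]


def iter_artifact_paths(
    struct: Any, prefix: Tuple[Step, ...] = ()
) -> Iterator[Tuple[int, List[Step]]]:
    """Yield (artifact_id, steps) for each leaf of a nested input struct."""
    if isinstance(struct, dict):
        for key, value in struct.items():
            yield from iter_artifact_paths(value, prefix + ({"key": key},))
    elif isinstance(struct, list):
        for index, value in enumerate(struct):
            yield from iter_artifact_paths(value, prefix + ({"index": index},))
    else:
        # Assumption: every leaf value is an artifact id.
        yield struct, list(prefix)


def declared_events(
    execution_id: int, inputs: Any, output_key: str, output_artifact_id: int
) -> List[Dict[str, Any]]:
    """Build the signature events of one execution as plain dictionaries."""
    events = [
        {
            "artifact_id": artifact_id,
            "execution_id": execution_id,
            "type": "DECLARED_INPUT",
            "path": {"step": steps},
        }
        for artifact_id, steps in iter_artifact_paths(inputs)
    ]
    events.append(
        {
            "artifact_id": output_artifact_id,
            "execution_id": execution_id,
            "type": "DECLARED_OUTPUT",
            "path": {"step": [{"key": output_key}]},
        }
    )
    return events


# my_result = my_execution({"data":[3,7],"schema":8}), execution 12, result 15.
events = declared_events(12, {"data": [3, 7], "schema": 8}, "my_result", 15)
for event in events:
    print(event)
```

The four dictionaries mirror the four events listed in the description, in the same order.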
The artifact id is required for an event, and should refer to an existing artifact.
The execution_id is required for an event, and should refer to an existing execution.
The path in an artifact struct, or the name of an artifact.
The type of an event.
Time the event occurred. Epoch is Jan 1, 1970, UTC.
Output only.
A simple path (e.g. {step{key:"foo"}}) can name an artifact in the context of an execution.
A simple path (e.g. {step{key:"foo"}}) can name an artifact in the context of an execution.
Events distinguish between an artifact that is written by the execution (possibly as a cache), versus artifacts that are part of the declared output of the Execution. For more information on what DECLARED_ means, see the comment on the message.
A declared output of the execution.
A declared input of the execution.
An input of the execution.
An output of the execution.
An internal input of the execution.
An internal output of the execution.
A pending output of the execution.
Output only. The unique server-generated id of the execution.
The client provided name of the execution. This field is optional. If set, it must be unique among all the executions of the same execution type within a database instance and cannot be changed once set.
The id of an ExecutionType. This needs to be specified when an execution is created, and it cannot be changed.
Output only. The name of an ExecutionType.
The external id that comes from the clients’ system. This field is optional. If set, it must be unique among all executions within a database instance.
The last known state of an execution in the system.
Properties of the Execution. Properties must be specified in the ExecutionType.
User provided custom properties which are not defined by its type.
Output only. Create time of the execution in milliseconds since epoch.
Output only. Last update time of the execution in milliseconds since epoch.
Output only.
The state of the Execution. The state transitions are NEW -> RUNNING -> COMPLETE | CACHED | FAILED | CANCELED. CACHED means the execution is skipped due to cached results. CANCELED means the execution is skipped because a precondition was not met. It differs from CACHED in that a CANCELED execution will not have any event associated with it. It differs from FAILED in that no unexpected error happened, and it is regarded as a normal state.
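The transition order above can be written down as a small lookup table. This Python sketch is a literal reading of the documented chain (NEW, then RUNNING, then one of the four final states). It illustrates the order only: whether the store itself rejects other transitions is not stated here, and the names `State`, `ALLOWED` and `can_transition` are hypothetical.

```python
from enum import Enum


class State(Enum):
    NEW = "NEW"
    RUNNING = "RUNNING"
    COMPLETE = "COMPLETE"
    CACHED = "CACHED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


# Assumption: a literal reading of NEW -> RUNNING -> COMPLETE|CACHED|FAILED|CANCELED.
ALLOWED = {
    State.NEW: {State.RUNNING},
    State.RUNNING: {State.COMPLETE, State.CACHED, State.FAILED, State.CANCELED},
}


def can_transition(current: State, target: State) -> bool:
    """Return True if `target` directly follows `current` in the documented order."""
    return target in ALLOWED.get(current, set())


print(can_transition(State.NEW, State.RUNNING))
print(can_transition(State.RUNNING, State.CACHED))
print(can_transition(State.COMPLETE, State.RUNNING))
```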
The id of the type. 1-1 relationship between type names and IDs.
The name of the type. It must be unique among ExecutionTypes within a database instance.
An optional version of the type. An empty string is treated as unset.
An optional description about the type.
The external id that comes from the clients’ system. This field is optional. If set, it must be unique among all execution types within a database instance.
The schema of the type. Properties are always optional in the execution.
The ArtifactStructType of the input. For example: { "dict":{ "properties":{ "schema":{ "union_type":{ "none":{}, "simple":{...schema type...} }, }, "data":{ "simple":{...data_type...} } } } } That would be an optional schema field with a required data field.
The ArtifactStructType of the output. For example {"simple":{...stats gen output type...}}
An optional system defined base_type expressing the intent of the current type. This field is useful for the tool builders to utilize the stored MLMD information, e.g., `MyTrainer` ExecutionType could set base_type = TRAIN.
An enum of system-defined execution types.
Configuration for a "fake" database. This database is an in-memory SQLite database that lives only as long as the associated object lives.
(message has no fields)
A list of supported GRPC arguments defined in: https://grpc.github.io/grpc/core/group__grpc__arg__keys.html
Maximum message length in bytes per response that the channel can receive.
Maximum number of misbehaving pings the server can bear before sending goaway and closing the transport (0 indicates an infinite number of misbehaving pings).
A member of this type must satisfy all constraints. This is primarily useful not as an end-user type, but as something calculated as an intermediate type in the system. For example, suppose you have a method:
    def infer_my_input_type(a):  # try to infer the input type of this method.
      use_in_method_x(a)  # with input type x_input
      use_in_method_y(a)  # with input type y_input
Given this information, you know that infer_my_input_type has type {"intersection":{"constraints":[x_input, y_input]}}.
    IntersectionArtifactStructType intersection_type = {"constraints":[
        {"dict":{"properties":{"schema":{"any":{}}}, "extra_properties":{"any":{}}}},
        {"dict":{"properties":{"data":{"any":{}}}, "extra_properties":{"any":{}}}}]}
Since the first constraint requires the dictionary to have a schema property, and the second constraint requires it to have a data property, this is equivalent to:
    ArtifactStructType other_type = {"dict":{"properties":{"schema":{"any":{}},"data":{"any":{}}}, "extra_properties":{"any":{}}}}
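The equivalence in the example can be checked with a small merge over plain dictionaries. This is a sketch under narrow assumptions, not the store's type checker: every constraint must be a `dict` type whose `extra_properties` is `{"any": {}}`, and a property named in several constraints must have the same type in each. The function name `intersect_dict_types` is hypothetical.

```python
from typing import Any, Dict, List

ANY_TYPE = {"any": {}}


def intersect_dict_types(constraints: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge dict-type constraints that all allow arbitrary extra properties."""
    properties: Dict[str, Any] = {}
    for constraint in constraints:
        dict_type = constraint["dict"]
        if dict_type.get("extra_properties") != ANY_TYPE:
            # Outside the simplified case this sketch covers.
            raise ValueError("only extra_properties == {'any': {}} is supported")
        for name, prop_type in dict_type.get("properties", {}).items():
            if properties.setdefault(name, prop_type) != prop_type:
                raise ValueError(f"conflicting types for property {name!r}")
    return {"dict": {"properties": properties, "extra_properties": ANY_TYPE}}


constraints = [
    {"dict": {"properties": {"schema": ANY_TYPE}, "extra_properties": ANY_TYPE}},
    {"dict": {"properties": {"data": ANY_TYPE}, "extra_properties": ANY_TYPE}},
]
other_type = intersect_dict_types(constraints)
print(other_type)
```

Because each constraint accepts any extra property, a property required by one constraint is always permitted by the others, which is why taking the union of the `properties` maps is sound in this restricted case.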
A self-contained provenance (sub)graph representation consists of MLMD nodes and their relationships. It is used to represent the query results from the persistent backend (e.g., lineage about a node, reachability of two nodes).
extracted types
extracted nodes
extracted edges
Deprecated: GetLineageGraph API is deprecated, please refer to GetLineageSubgraph API as the alternative. The query options for `get_lineage_graph` operation. `query_nodes` is a list of nodes of interest. Currently only artifacts are supported as `query_nodes`. `stop_conditions` defines the filtering rules when querying a lineage graph. `max_node_size` defines the total number of artifacts and executions returned in the subgraph.
A query to specify the nodes of interest. `ListOperationOptions.max_result_size` sets the maximum number of nodes to begin with the graph search. TODO(b/178491112) Support query_nodes for Executions.
A constraint option to define the filtering rules when querying a lineage graph.
Maximum total number of artifacts and executions in the whole returned lineage graph. If set to 0 or below, all related nodes will be returned without any limit. Only Artifacts and Executions count toward this number; nothing else is considered. NOTE: Pagination is not supported.
Filtering conditions for retrieving the lineage graph.
The maximum number of hops from the `query_nodes` to traverse. A hop is defined as a jump to the next node following the path of node -> event -> next_node. For example, in the lineage graph a_1 -> e_1 -> a_2: a_2 is 2 hops away from a_1, and e_1 is 1 hop away from a_1. `max_num_hops` should be non-negative. When its value is set to 0, only the `query_nodes` are returned.
Filtering conditions for retrieving the lineage graph. Please refer to `ListOperationOptions.filter_query` for the syntax. If set, the `boundary_artifacts` defines which artifacts to keep in the returned lineage graph during the graph search. Artifacts that do not satisfy the `boundary_artifacts` are filtered out, and the subgraphs starting at them will be pruned. If not set, no artifacts will be filtered out. Taking the following lineage graph as example (`a` represents an Artifact, `e` represents an Execution, each arrow represents a hop):
    a_0   a_1     a_3
     |      \   /     \
    \/      \/ \/     \/
    e_0     e_1        e_3
                    /       \
                   \/       \/
        a_2       a_4       a_5
         \        /
         \/      \/
            e_2
To query all the upstream and downstream nodes 3 hops away from a_4, while excluding the upstream subgraph starting at a_3, `stop_conditions` can be set as:
    { max_num_hops: 3 boundary_artifacts: 'id != 3' }
With the `stop_conditions`, {a_3, e_1, a_1, a_0, e_0} will be filtered out. The returned lineage graph looks like:
                       e_3
                    /       \
                   \/       \/
        a_2       a_4       a_5
         \        /
         \/      \/
            e_2
If set, the `boundary_executions` defines which executions to keep in the returned lineage graph during the graph search. Executions that do not satisfy the `boundary_executions` are filtered out and the subgraphs starting at them will be pruned. If not set, no executions will be filtered out. In the example above, to query for all the upstream and downstream nodes 3 hops away from a_4, while excluding the upstream subgraph and the downstream subgraph starting at e_3, `stop_conditions` can be set as:
    { max_num_hops: 3 boundary_executions: 'id != 3' }
With the `stop_conditions`, {e_3, a_5, a_3, e_1, a_1, a_0, e_0} will be filtered out. The returned lineage graph looks like:
        a_2       a_4
         \        /
         \/      \/
            e_2
However, for the following graph:
    a_0   a_1     a_3
     |      \   /     \
    \/      \/ \/     \/
    e_0     e_1        e_3
              \     /       \
              \/   \/       \/
        a_2       a_4       a_5
         \        /
         \/      \/
            e_2
With the same `stop_conditions`, only {e_3, a_5, a_0, e_0} will be filtered out. The returned lineage graph looks like:
          a_1     a_3
            \   /
            \/ \/
            e_1
              \
              \/
        a_2       a_4
         \        /
         \/      \/
            e_2
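The effect of `max_num_hops` together with a boundary condition can be reproduced with a breadth-first search over the example graphs. The sketch below is an illustration in plain Python, not the backend's query implementation: nodes are strings, the boundary filter is a Python predicate standing in for the `filter_query` string, and the names `lineage`, `GRAPH_1` and `GRAPH_2` are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (upstream node, downstream node)

# First example graph: e_1 has no outputs.
GRAPH_1: List[Edge] = [
    ("a_0", "e_0"), ("a_1", "e_1"), ("a_3", "e_1"), ("a_3", "e_3"),
    ("e_3", "a_4"), ("e_3", "a_5"), ("a_2", "e_2"), ("a_4", "e_2"),
]
# Second example graph: e_1 additionally outputs a_4.
GRAPH_2: List[Edge] = GRAPH_1 + [("e_1", "a_4")]


def lineage(
    edges: Iterable[Edge],
    query_nodes: Iterable[str],
    max_num_hops: int,
    keep: Callable[[str], bool] = lambda node: True,
) -> Set[str]:
    """Collect nodes within max_num_hops, pruning at nodes rejected by keep."""
    neighbors: Dict[str, Set[str]] = {}
    for src, dst in edges:
        # Both upstream and downstream hops are followed.
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    seen = set(query_nodes)
    frontier = deque((node, 0) for node in seen)
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_num_hops:
            continue
        for nxt in neighbors.get(node, ()):
            # A rejected node is dropped and never expanded, so the
            # subgraph reachable only through it is pruned as well.
            if nxt not in seen and keep(nxt):
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return seen


# boundary_artifacts: 'id != 3'
print(sorted(lineage(GRAPH_1, ["a_4"], 3, keep=lambda n: n != "a_3")))
# boundary_executions: 'id != 3' on both graphs
print(sorted(lineage(GRAPH_1, ["a_4"], 3, keep=lambda n: n != "e_3")))
print(sorted(lineage(GRAPH_2, ["a_4"], 3, keep=lambda n: n != "e_3")))
```

In the second graph a_1 and a_3 survive because they remain reachable from a_4 through e_1, even though e_3 is excluded.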
The query options for lineage graph tracing from a list of interested nodes.
The maximum number of hops from the `starting_nodes` to traverse. A hop is defined as a jump to the next node following the path of node -> event -> next_node. For example, in the lineage graph a_1 -> e_1 -> a_2: a_2 is 2 hops away from a_1, and e_1 is 1 hop away from a_1. `max_num_hops` should be non-negative. When its value is set to 0, only the `starting_nodes` are returned.
The direction of lineage graph tracing, which means the direction of all hops in the tracing. A DOWNSTREAM hop means an expansion following the path of execution -> output_event -> artifact or artifact -> input_event -> execution An UPSTREAM hop means an expansion following the path of execution -> input_event -> artifact or artifact -> output_event -> execution Please refer to `Direction` for more details.
If set, `ending_artifacts` defines the terminal artifacts within `max_num_hops` of the traversed subgraph. No further expansion will occur at these terminal artifact nodes. If not set, expansion will continue until the traversal reaches `max_num_hops`. Taking the following lineage graph as example (`a` represents an Artifact, `e` represents an Execution, each arrow represents a hop):
    a_0   a_1     a_3
     |      \   /     \
    \/      \/ \/     \/
    e_0     e_1        e_3
                    /       \
                   \/       \/
        a_2       a_4       a_5
         \        /
         \/      \/
            e_2
To query all the upstream and downstream nodes 3 hops away from a_4, and end traversal if a_3 is met, `ending_artifacts` can be set as:
    { filter_query: 'id = 3' include_ending_nodes: false }
The returned lineage graph looks like:
                       e_3
                    /       \
                   \/       \/
        a_2       a_4       a_5
         \        /
         \/      \/
            e_2
`ending_artifacts` and `ending_executions` can be combined in the request. Taking the same graph as example, if we combine `ending_artifacts` defined above with `ending_executions` as:
    { filter_query: 'id = 2' include_ending_nodes: true }
The returned lineage graph looks like:
                       e_3
                    /       \
                   \/       \/
                  a_4       a_5
                  /
                 \/
            e_2
If set, `ending_executions` defines the terminal executions within `max_num_hops` of the traversed subgraph. No further expansion will occur at these terminal execution nodes. If not set, expansion will continue until the traversal reaches `max_num_hops`.
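The interaction of `direction`, ending nodes and `include_ending_nodes` can be sketched as a directed breadth-first search over the same example graph. This is an illustration only: ending nodes are given as a mapping from node name to its `include_ending_nodes` flag instead of being selected by a `filter_query`, and the names `trace` and `GRAPH` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (upstream node, downstream node)

GRAPH: List[Edge] = [
    ("a_0", "e_0"), ("a_1", "e_1"), ("a_3", "e_1"), ("a_3", "e_3"),
    ("e_3", "a_4"), ("e_3", "a_5"), ("a_2", "e_2"), ("a_4", "e_2"),
]


def trace(
    edges: Iterable[Edge],
    starting_nodes: Iterable[str],
    max_num_hops: int,
    direction: str = "BIDIRECTIONAL",
    ending: Optional[Dict[str, bool]] = None,
) -> Set[str]:
    """Trace lineage; `ending` maps an ending node to its include flag."""
    ending = ending or {}
    step: Dict[str, Set[str]] = {}
    for src, dst in edges:
        if direction in ("DOWNSTREAM", "BIDIRECTIONAL"):
            step.setdefault(src, set()).add(dst)
        if direction in ("UPSTREAM", "BIDIRECTIONAL"):
            step.setdefault(dst, set()).add(src)
    seen = set(starting_nodes)
    frontier = deque((node, 0) for node in seen)
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_num_hops:
            continue
        for nxt in step.get(node, ()):
            if nxt in seen:
                continue
            if nxt in ending:
                # An ending node is never expanded; it is returned only
                # when its include flag is true.
                if ending[nxt]:
                    seen.add(nxt)
                continue
            seen.add(nxt)
            frontier.append((nxt, hops + 1))
    return seen


# ending_artifacts: id = 3, include_ending_nodes: false
print(sorted(trace(GRAPH, ["a_4"], 3, ending={"a_3": False})))
# combined with ending_executions: id = 2, include_ending_nodes: true
print(sorted(trace(GRAPH, ["a_4"], 3, ending={"a_3": False, "e_2": True})))
# downstream only, two hops from a_3
print(sorted(trace(GRAPH, ["a_3"], 2, direction="DOWNSTREAM")))
```

In the combined case a_2 disappears from the result because e_2 is included but not expanded.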
Direction is by default DIRECTION_UNSPECIFIED, which is equivalent to BIDIRECTIONAL.
Indicates tracing the lineage graph by hops in upstream direction.
Indicates tracing the lineage graph by hops in downstream direction.
Indicates tracing the lineage graph in both directions.
`ending_nodes` is a list of nodes that end expanding the graph.
`filter_query` is a boolean expression in SQL syntax that is used to specify the conditions on ending nodes. Please refer to ListOperationOptions.filter_query for more details.
If true, include the ending nodes defined by the filter query, as well as edges connected from the traversed subgraph to them. If false, do not include the nodes and edges connected to them.
`starting_nodes` is a list of nodes of interest to start graph tracing. NOTE: At most 100 starting nodes are allowed.
`filter_query` is a boolean expression in SQL syntax that is used to specify the conditions on starting nodes. Please refer to ListOperationOptions.filter_query for more details.
Represents an ArtifactStruct list type with homogeneous elements.
Every entry in the list must be of this type. Note: if this type is Any, then the list can have arbitrary elements.
Encapsulates information to identify the next page of resources in ListOperation.
Id offset within the result set to start the next page. The id offset is returned because id is the unique field used to break ties for fields that might have duplicate entries, e.g. there could be two resources with the same create_time. In such cases the id offset is used to break the tie in ordering. This field is currently only set when the order_by field is CREATE_TIME.
Offset value of the order by field. If ID is used this value is same as id_offset.
Options set in the first call to ListOperation. This ensures that if next_page_token is set by the caller then ListPipelineJobs API will always use options set in the first call.
List of ids that have the same order_by field values. This is used to ensure List Operation does not return duplicate entries for nodes that have the same order_by field value. This field is currently only set when the order_by field is LAST_UPDATE_TIME.
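The role of the id offset as a tie-breaker can be shown with a small keyset-pagination sketch over in-memory rows. It assumes ascending order on `create_time`, models only the field offset and id offset (not the listed ids used for LAST_UPDATE_TIME), and uses hypothetical names (`list_page`, and `field_offset` and `id_offset` as dictionary keys); it is not the store's implementation.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, int]
Token = Dict[str, int]


def list_page(
    rows: List[Row], page_size: int, token: Optional[Token] = None
) -> Tuple[List[Row], Optional[Token]]:
    """Return one page ordered by (create_time, id) and the next-page token."""
    ordered = sorted(rows, key=lambda r: (r["create_time"], r["id"]))
    if token is not None:
        start = (token["field_offset"], token["id_offset"])
        # Rows sharing the last create_time are separated by their id,
        # so none of them is skipped or returned twice.
        ordered = [r for r in ordered if (r["create_time"], r["id"]) > start]
    page = ordered[:page_size]
    next_token = None
    if len(ordered) > page_size:
        last = page[-1]
        next_token = {"field_offset": last["create_time"], "id_offset": last["id"]}
    return page, next_token


rows = [
    {"id": 1, "create_time": 100},
    {"id": 2, "create_time": 100},
    {"id": 3, "create_time": 100},
    {"id": 4, "create_time": 200},
]
page_1, token_1 = list_page(rows, page_size=2)
page_2, token_2 = list_page(rows, page_size=2, token=token_1)
print([r["id"] for r in page_1], token_1)
print([r["id"] for r in page_2], token_2)
```

Without the id offset, a token carrying only `create_time = 100` could not tell whether the row with id 3 had already been returned.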
ListOperationOptions represents the set of options and predicates to be used for List operations on Artifacts, Executions and Contexts.
Max number of resources to return in the result. A value of zero or less results in an InvalidArgumentError. The API implementation also enforces an upper bound of 100, and picks the minimum between this value and the one specified here.
Ordering field.
Identifies the next page of results.
A boolean expression in SQL syntax that is used to specify the conditions on node attributes and directly connected assets. In the current implementation, filtering Artifact/Execution/Context with the following attributes and neighborhood is supported:
Attributes:
    id:int64, type_id:int64, type:string, uri:string, name:string, external_id:string, create_time_since_epoch:int64, last_update_time_since_epoch:int64
    state:ENUM (Artifact only)
    last_known_state:ENUM (Execution only)
Neighborhood:
  - Properties and Custom Properties (for all node types):
      syntax: properties.$name ($name is the property name)
              custom_properties.$name ($name is the custom property name)
      attributes: int_value:int64, double_value:double, string_value:string, bool_value:bool
  - Context (for Artifact and Execution):
      syntax: contexts_$alias ($alias can be [0-9A-Za-z_])
      attributes: id:int64, name:string, type:string, create_time_since_epoch:int64, last_update_time_since_epoch:int64
  - Parent and Child Contexts (for Contexts):
      syntax: parent_contexts_$alias ($alias can be [0-9A-Za-z_])
              child_contexts_$alias ($alias can be [0-9A-Za-z_])
      attributes: id:int64, name:string, type:string
  - Event (for Artifact and Execution):
      syntax: events_$alias ($alias can be [0-9A-Za-z_])
      attributes: artifact_id:int64 (Execution only), execution_id:int64 (Artifact only), type:ENUM, milliseconds_since_epoch:int64
Examples:
a) to filter nodes attributes:
  - id != 1
  - id IN (1, 3)
  - type_id = 5
  - type = 'my_type_name'
  - name = 'foo'
  - type = 'bar' AND name LIKE 'foo%'
  - external_id = 'my_external_id'
  - NOT(create_time_since_epoch < 1 OR last_update_time_since_epoch < 1)
b) to filter artifacts' uri:
  - uri = 'exact_path_string'
  - uri LIKE 'path_like_this%'
  - uri IS NOT NULL
c) to filter artifact's state or execution's last_known_state:
  - state = LIVE
  - state IS NULL
  - state IN (PENDING, LIVE)
  - last_known_state = RUNNING
  - last_known_state != RUNNING
  - last_known_state NOT IN (FAILED, CANCELED)
d) to filter nodes having a specific context, artifact, or execution:
  - contexts_a.id = 5
  - contexts_a.type = 'RunContext'
  - contexts_a.name = 'my_run'
  - contexts_a.create_time_since_epoch = 1626761453
  - contexts_a.last_update_time_since_epoch = 1626761453
  To filter nodes with conditions on multiple contexts:
  - contexts_a.name = 'my_run' AND contexts_b.name = 'my_pipeline'
  To filter contexts with artifacts:
  - artifacts_a.id = 5
  - artifacts_a.type = 'Dataset'
  - artifacts_a.name = 'my_dataset'
  - artifacts_a.uri = 'exact_path_string'
  - artifacts_a.state = LIVE
  - artifacts_a.state IN (PENDING, LIVE)
  - artifacts_a.external_id = "my_external_id"
  - artifacts_a.create_time_since_epoch = 1626761453
  - artifacts_a.last_update_time_since_epoch = 1626761453
  To filter contexts with conditions on multiple artifacts:
  - artifacts_a.name = 'my_run' AND artifacts_b.name = 'my_pipeline'
  To filter contexts with executions:
  - executions_a.id = 5
  - executions_a.type = 'Dataset'
  - executions_a.name = 'my_dataset'
  - executions_a.last_known_state = RUNNING
  - executions_a.last_known_state IN (NEW, RUNNING)
  - executions_a.external_id = "my_external_id"
  - executions_a.create_time_since_epoch = 1626761453
  - executions_a.last_update_time_since_epoch = 1626761453
  To filter contexts with conditions on multiple executions:
  - executions_a.name = 'my_run' AND executions_b.name = 'my_pipeline'
e) to filter nodes with conditions on their properties:
  - properties.accuracy.double_value > 0.95
  - custom_properties.my_param.string_value = "foo"
  If the name of the property or custom property includes characters other than [0-9A-Za-z_], then the name needs to be backquoted, e.g.,
  - properties.`my property`.int_value > 0
  - custom_properties.`my:custom.property`.bool_value = true
f) complex query to filter both node attributes and neighborhood:
  - type = 'DataSet' AND (contexts_a.type = 'RunContext' AND contexts_a.name = 'my_run') AND (properties.span = 1 OR custom_properties.span = 1)
g) to filter parent/child context:
  - parent_contexts_a.id = 5
  - child_contexts_a.type = 'RunContext'
  - parent_contexts_a.name = 'parent_context_1'
h) to filter Artifacts on Events:
  - events_0.execution_id = 1
  - events_0.type = INPUT
  - events_0.milliseconds_since_epoch = 1
  to filter Executions on Events:
  - events_0.artifact_id = 1
  - events_0.type IN (INPUT, INTERNAL_INPUT)
  - events_0.milliseconds_since_epoch = 1
TODO(b/145945460) Support filtering on event step fields.
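The backquoting rule for property names in example e) is easy to get wrong when filter strings are assembled by hand. The helper below is a hypothetical convenience for building such strings in Python; it is not part of any client library, and it only produces the text of the expression without validating it against the store.

```python
import re

# Names made only of these characters can be used without backquotes.
_PLAIN_NAME = re.compile(r"^[0-9A-Za-z_]+$")


def property_ref(name: str, value_field: str, custom: bool = False) -> str:
    """Build `properties.<name>.<value_field>`, backquoting the name if needed."""
    if "`" in name:
        # Assumption: names containing a backquote are out of scope here.
        raise ValueError("property names containing a backquote are not handled")
    prefix = "custom_properties" if custom else "properties"
    quoted = name if _PLAIN_NAME.match(name) else f"`{name}`"
    return f"{prefix}.{quoted}.{value_field}"


def all_of(*conditions: str) -> str:
    """Join conditions with AND, parenthesizing each one."""
    return " AND ".join(f"({condition})" for condition in conditions)


print(property_ref("accuracy", "double_value") + " > 0.95")
print(property_ref("my property", "int_value") + " > 0")
print(property_ref("my:custom.property", "bool_value", custom=True) + " = true")
print(all_of("type = 'DataSet'", "contexts_a.name = 'my_run'"))
```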
Field to order.
Direction of ordering.
Supported fields for Ordering.
A config includes a set of SQL queries and the type of metadata source. It is used by MetadataAccessObject to init backend and issue queries. Next ID: 144
the type of the metadata source
Drops the Type table.
Creates the Type table.
Checks the existence of the Type table.
Inserts an artifact type into the Type table. It has 3 parameters. $0 is the name $1 is the version $2 is the description
Inserts an execution type into the Type table. It has 5 parameters. $0 is the type name $1 is the version $2 is the description $3 is the input_type serialized as JSON or null. $4 is the output_type serialized as JSON or null.
Inserts a context type into the Type table. It has 3 parameters. $0 is the type name $1 is the version $2 is the description
Queries types by a list of type ids. It has 2 parameters. $0 is the type_ids $1 is the is_artifact_type
Queries types by a list of external ids. It has 2 parameters. $0 is the external_ids $1 is the type_kind
Queries a type by its type id. It has 2 parameters. $0 is the type id $1 is the is_artifact_type
Queries a type by its type name. It has 2 parameters. $0 is the type name $1 is the type_kind
Queries a type by its type name and version. It has 3 parameters. $0 is the type name $1 is the type version $2 is the type_kind
Queries types by a list of type names. It has 2 parameters. $0 is the type names $1 is the type kind
Queries types by a list of type name and version pairs. It has 2 parameters. $0 is the type name and version pairs $1 is the type kind
Queries for all type instances. It has 1 parameter. $0 is the is_artifact_type
Updates a type in the Type table. It has 2 parameters. $0 is the existing type id $1 is the external_id of the Type
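The `$0`, `$1`, ... placeholders in these query descriptions are positional parameters of a query template. As a rough illustration of how such a template could be filled in, the Python sketch below substitutes already-escaped parameter strings by index. The SQL text is invented for the example (the real templates are not shown in this reference), and the function name `bind` is hypothetical.

```python
import re
from typing import Sequence

_PLACEHOLDER = re.compile(r"\$(\d+)")


def bind(template: str, params: Sequence[str]) -> str:
    """Replace each $N in the template with params[N].

    Parameters are inserted verbatim, so they must already be quoted and
    escaped for the target database.
    """

    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        if index >= len(params):
            raise ValueError(
                f"template needs parameter {index}, got {len(params)} parameters"
            )
        return params[index]

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical template with 2 parameters: a list of ids and a type kind.
template = "SELECT * FROM T WHERE id IN ($0) AND kind = $1;"
type_ids = ", ".join(str(i) for i in [1, 2, 3])
print(bind(template, [type_ids, "0"]))
```

A list-valued parameter such as a collection of ids is passed as a single string joined by ", ", which is how the event queries further down describe their `$0`.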
Drops the ParentType table.
Creates the ParentType table.
Checks the existence of the ParentType table.
Inserts a parent type into the ParentType table. It has 2 parameters: $0 is the type_id $1 is the parent_type_id
Queries parent types from the ParentType table by type_id. It has 1 parameter. $0 is the type_id
Drops the TypeProperty table.
Creates the TypeProperty table.
Checks the existence of the TypeProperty table.
Inserts a property of a type into the TypeProperty table. It has 3 parameters. $0 is the type_id $1 is the name of the property $2 is the data_type of the property
Queries properties of a type from the TypeProperty table by the type_id Returns a list of properties (name, data_type). It has 1 parameter. $0 is the type_id
Queries the last inserted id.
Drops the Artifact table.
Creates the Artifact table.
Checks the existence of the Artifact table.
Inserts an artifact into the Artifact table. It has 5 parameters. $0 is the type_id $1 is the uri of the Artifact $2 is the name of the Artifact $3 is the create_time_since_epoch of the Artifact $4 is the last_update_time_since_epoch of the Artifact
Queries an artifact from the Artifact table by its id. It has 1 parameter. $0 is the artifact_id
Queries an artifact from the Artifact table by its name and type id. It has 2 parameters. $0 is the type_id $1 is the name of the Artifact
Queries an artifact from the Artifact table by its type_id. It has 1 parameter. $0 is the artifact_type_id
Queries an artifact from the Artifact table by its uri. It has 1 parameter. $0 is the uri
Queries artifacts from the Artifact table by external_ids. It has 1 parameter. $0 is the external_ids
Updates an artifact in the Artifact table. It has 4 parameters. $0 is the existing artifact id $1 is the type_id $2 is the uri of the Artifact $3 is the last_update_time_since_epoch of the Artifact
Drops the ArtifactProperty table.
Creates the ArtifactProperty table.
Checks the existence of the ArtifactProperty table.
Inserts a property of an artifact into the ArtifactProperty table. It has 5 parameters. $0 is the property data type $1 is the artifact_id $2 is the name of the artifact property $3 is the flag to indicate whether it is a custom property $4 is the value of the property
Queries properties of an artifact from the ArtifactProperty table by the artifact id. It has 1 parameter. $0 is the artifact_id
Updates a property of an artifact in the ArtifactProperty table. It has 4 parameters. $0 is the property data type $1 is the value of the property $2 is the artifact_id $3 is the name of the artifact property
Deletes a property of an artifact. It has 2 parameters. $0 is the artifact_id $1 is the name of the artifact property
Drops the Execution table.
Creates the Execution table.
Checks the existence of the Execution table.
Inserts an execution into the Execution table. It has 4 parameters. $0 is the type_id $1 is the name of the execution $2 is the create_time_since_epoch of the execution $3 is the last_update_time_since_epoch of the execution
Queries an execution from the Execution table by its id. It has 1 parameter. $0 is the execution_id
Queries an execution from the Execution table by its name and type id. It has 2 parameters. $0 is the type_id $1 is the name
Queries an execution from the Execution table by its type_id. It has 1 parameter. $0 is the execution_type_id
Queries executions from the Execution table by external_ids. It has 1 parameter. $0 is the external_ids
Updates an execution in the Execution table. It has 3 parameters. $0 is the existing execution id $1 is the type_id $2 is the last_update_time_since_epoch of the execution
Drops the ExecutionProperty table.
Creates the ExecutionProperty table.
Checks the existence of the ExecutionProperty table.
Inserts a property of an execution into the ExecutionProperty table. It has 5 parameters. $0 is the property data type $1 is the execution_id $2 is the name of the execution property $3 is the flag to indicate whether it is a custom property $4 is the value of the property
Queries properties of an execution from the ExecutionProperty table by the execution id. It has 1 parameter. $0 is the execution_id
Updates a property of an execution in the ExecutionProperty table. It has 4 parameters. $0 is the property data type $1 is the value of the property $2 is the execution_id $3 is the name of the execution property
Deletes a property of an execution. It has 2 parameters. $0 is the execution_id $1 is the name of the execution property
Drops the Context table.
Creates the Context table.
Checks the existence of the Context table.
Inserts a context into the Context table. It has 4 parameters. $0 is the type_id $1 is the name of the Context $2 is the create_time_since_epoch of the Context $3 is the last_update_time_since_epoch of the Context
Queries a context from the Context table by its id. It has 1 parameter. $0 is the context_id
Queries a context from the Context table by its type_id. It has 1 parameter. $0 is the context_type_id
Queries a context from the Context table by its type_id and name. It has 2 parameters. $0 is the context_type_id $1 is the context_name
Queries contexts from the Context table by external_ids. It has 1 parameter. $0 is the external_ids
Updates a context in the Context table. It has 4 parameters. $0 is the existing context id $1 is the type_id $2 is the name of the Context $3 is the last_update_time_since_epoch of the Context
Drops the ContextProperty table.
Creates the ContextProperty table.
Checks the existence of the ContextProperty table.
Inserts a property of a context into the ContextProperty table. It has 5 parameters. $0 is the property data type $1 is the context_id $2 is the name of the context property $3 is the flag to indicate whether it is a custom property $4 is the value of the property
Queries properties of a context from the ContextProperty table by the context id. It has 1 parameter. $0 is the context_id
Updates a property of a context in the ContextProperty table. It has 4 parameters. $0 is the property data type $1 is the value of the property $2 is the context_id $3 is the name of the context property
Deletes a property of a context. It has 2 parameters. $0 is the context_id $1 is the name of the context property
Drops the ParentContext table.
Creates the ParentContext table.
Checks the existence of the ParentContext table.
Inserts a parent context into the ParentContext table. It has 2 parameters: $0 is the context_id $1 is the parent_context_id
Queries parent contexts from the ParentContext table by context_id. It has 1 parameter. $0 is the context_id
Queries parent contexts from the ParentContext table by parent_context_id. It has 1 parameter. $0 is the parent_context_id
Queries parent contexts from the ParentContext table by context_ids. It has 1 parameter. $0 is the context_ids
Queries parent contexts from the ParentContext table by parent_context_ids. It has 1 parameter. $0 is the parent_context_ids
Drops the Event table.
Creates the Event table.
Checks the existence of the Event table.
Inserts an event into the Event table. It has 4 parameters. $0 is the artifact_id $1 is the execution_id $2 is the event type $3 is the event time
Queries events from the Event table by a collection of artifact ids. It has 1 parameter. $0 is the collection string of artifact ids joined by ", ".
Queries events from the Event table by a collection of execution ids. It has 1 parameter. $0 is the collection string of execution ids joined by ", ".
Drops the EventPath table.
Creates the EventPath table.
Checks the existence of the EventPath table.
Inserts a path into the EventPath table. It has 4 parameters. $0 is the event_id $1 is the step value case, either index or key $2 is the is_index_step flag that indicates the step value case $3 is the value of the step
Queries paths from the EventPath table by a collection of event ids. It has 1 parameter. $0 is the collection string of event ids joined by ", ".
Drops the Association table.
Creates the Association table.
Checks the existence of the Association table.
Inserts an association into the Association table. It has 2 parameters. $0 is the context_id $1 is the execution_id
Queries association from the Association table by its context id. It has 1 parameter. $0 is the context_id
Queries associations from the Association table by execution ids. It has 1 parameter. $0 are the execution_ids.
Drops the Attribution table.
Creates the Attribution table.
Checks the existence of the Attribution table.
Inserts an attribution into the Attribution table. It has 2 parameters. $0 is the context_id $1 is the artifact_id
Queries attribution from the Attribution table by its context id. It has 1 parameter. $0 is the context_id
Queries attributions from the Attribution table by artifact ids. It has 1 parameter. $0 are the artifact_ids.
Drops the MLMDEnv table.
Creates the MLMDEnv table.
The version of the current query config. Increase the version by 1 in any CL that includes physical schema changes and provides a migration function that uses a list of migration queries. The database stores it to indicate the current database version. When a metadata source is created, it compares the given `schema_version` in the query config with the `schema_version` stored in the database, and migrates the database if needed.
Checks the MLMDEnv table and queries the schema version. At MLMD release v0.13.2, by default it is v0.
This query returns an int 0 or int 1 to indicate whether the mlmdenv table and the corresponding schema exist. In contrast, check_mlmd_env_table returns the schema_version value directly, assuming the mlmdenv table exists. The reason for this query is that PostgreSQL aborts the transaction if a query inside that transaction fails, while other DB types do not. This query serves the single purpose of checking whether the mlmdenv table exists, while check_mlmd_env_table serves two goals: checking the table schema and querying the MLMD schema version.
Inserts schema_version. $0 is the schema_version.
Updates schema_version. $0 is the schema_version.
Checks that the database is a valid database produced by the 0.13.2 MLMD release. The schema version and migration were introduced after that release.
A list of secondary indices to be applied on the current schema. This is intended for indices that cover multiple columns or which cannot be created as part of table DDL statements.
Each metadata source should provide migration schemes, each of which defines the schema change details for a particular `schema_version` (sv_i). When a migration procedure wants to upgrade to sv_i, it looks for the MigrationScheme with sv_i as the map key.
Deletes contexts by id. $0 are the context ids.
Deletes context properties by context ids. $0 are the context ids.
Deletes parent contexts by parent context ids. $0 are the parent context ids.
Deletes parent contexts by child context ids. $0 are the child context ids.
Deletes parent contexts by parent context id and child context ids. $0 is the parent context id. $1 are the child context ids.
Deletes artifacts by id. $0 are the artifact ids.
Deletes artifact properties by artifact ids. $0 are the artifact ids.
Deletes executions by id. $0 are the execution ids.
Deletes execution properties by execution ids. $0 are the execution ids.
Delete events by artifact ids. $0 are the artifact ids.
Delete events by execution ids. $0 are the execution ids.
Delete associations by context ids. $0 are the context ids.
Delete associations by execution ids. $0 are the execution ids.
Delete attributions by context ids. $0 are the context ids.
Delete attributions by artifact ids. $0 are the artifact ids.
Delete event paths. This query cleans up event paths where the events do not exist.
Deletes a parent type from the ParentType table. It has 2 parameters: $0 is the type_id; $1 is the parent_type_id.
Queries properties of a list of types from the TypeProperty table by `type_ids`. Returns a list of properties (type_id, name, data_type). It has 1 parameter: $0 is the type_ids.
Allows different MetadataSourceTypes to configure its own options when creating MetadataAccessObject.
Total number of MLMD tables.
Total number of MLMD indexes.
A migration scheme that is used by a migration function to transition a database from a schema_version to schema_version + 1. DDL is often metadata-source specific; if provided, each metadata source should have its own setting.
Sequence of queries to increase the schema version by 1.
Sequence of queries to decrease the schema version by 1.
Details of verifying the correctness of upgrade_queries.
Details of verifying the correctness of downgrade_queries.
Verification of the DB schema in this version.
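The stepwise upgrade described above can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3`, not MLMD's implementation: the `MIGRATION_SCHEMES` dictionary, the `Demo` table, the single-column `MLMDEnv` layout, and the helper names are all assumptions made for the example. Each scheme is keyed by the schema_version it upgrades to, and its upgrade queries are applied one version at a time.

```python
import sqlite3

# Hypothetical schemes, keyed by the schema_version they upgrade to.
MIGRATION_SCHEMES = {
    1: {"upgrade_queries": ["CREATE TABLE Demo (id INTEGER PRIMARY KEY)"]},
    2: {"upgrade_queries": ["ALTER TABLE Demo ADD COLUMN name TEXT"]},
}


def read_schema_version(conn):
    """Returns the schema_version stored in the (assumed) MLMDEnv table."""
    return conn.execute("SELECT schema_version FROM MLMDEnv").fetchone()[0]


def upgrade(conn, lib_version):
    """Applies upgrade queries one version at a time up to lib_version."""
    db_version = read_schema_version(conn)
    for target in range(db_version + 1, lib_version + 1):
        scheme = MIGRATION_SCHEMES[target]
        # The connection context manager commits the pending transaction on
        # success and rolls it back if an exception is raised.
        with conn:
            for query in scheme["upgrade_queries"]:
                conn.execute(query)
            conn.execute("UPDATE MLMDEnv SET schema_version = ?", (target,))
    return read_schema_version(conn)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MLMDEnv (schema_version INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO MLMDEnv VALUES (0)")
final_version = upgrade(conn, lib_version=2)
```

Recording the new version in the same step as the schema change keeps the stored schema_version consistent with the tables that actually exist.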
For test purposes, it defines the setup query and post condition invariants of a migration scheme.
This optional field defines the additional details of recreating: a) the schema of the previous version, as the DDL queries will not be available in newer library versions; b) and/or the records of related tables which have schema changes. Note that the setup queries are used in a sequence along with other migration schemes of previous versions in order to test upgrading from all known previous versions to the library version.
This optional field defines the verification queries, each of which returns only True/False in the select to assert the state transition invariant is the same, e.g., * conditions on number of rows * conditions on data model entities (e.g., type, artifact, property).
Template of a SQL query, which can contain parameterized variables using $0, $1, ... $9. For instance: query: "select * from foo where bar = $0" parameter_num: 1
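As an illustration of how such a template might be filled in, the sketch below substitutes `$0` to `$9` placeholders with caller-supplied strings. The helper names `render_query` and `join_ids` and the second query text are assumptions for this example; the sketch does no escaping or binding, which a real metadata source would need to handle.

```python
import re


def render_query(template, params):
    """Substitutes $0..$9 placeholders with the given parameter strings."""
    def replace(match):
        index = int(match.group(1))
        if index >= len(params):
            raise ValueError(
                f"template references ${index} but only "
                f"{len(params)} parameter(s) were given")
        return params[index]

    # A single digit after '$' is treated as a placeholder index.
    return re.sub(r"\$(\d)", replace, template)


def join_ids(ids):
    """Builds the collection string of ids joined by ', '."""
    return ", ".join(str(i) for i in ids)


query = render_query("select * from foo where bar = $0", ["42"])

# Hypothetical template for a query that takes a joined id collection.
events_query = render_query(
    "select * from Event where artifact_id in ($0)", [join_ids([1, 2, 3])])
```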
Contains the supported metadata source types in MetadataAccessObject. Next index: 7
A fake in-memory metadata_source for testing. DEPRECATED: use SQLITE_METADATA_SOURCE instead.
A MYSQL metadata source.
A Sqlite metadata source.
A PostgreSQL metadata source. The index number is related to ConnectionConfig.
Configuration for the gRPC metadata store client.
The hostname or IP address of the gRPC server. Must be specified.
The TCP Port number that the gRPC server accepts connections on. Must be specified.
Configuration for a secure gRPC channel. If not given, insecure connection is used.
gRPC channel creation arguments.
Time duration that a client is willing to wait for a reply from the server. If unset, the timeout is considered infinite. When the field is specified, gRPC APIs return DeadlineExceededError when the server does not respond within `client_timeout_sec`. Floating point valued, in seconds.
The PEM-encoded private key as a byte string, or Empty if no private key should be used.
The PEM-encoded certificate chain as a byte string to use, or Empty if no certificate chain should be used.
The PEM-encoded root certificates as a byte string, or Empty to retrieve them from a default location chosen by gRPC runtime.
Configuration for the gRPC metadata store server.
Configuration to connect the metadata source backend.
Configuration for upgrade and downgrade migrations of the metadata source.
Configuration for a secure gRPC channel. If not given, insecure connection is used.
Private server key for SSL
Public server certificate
Custom certificate authority
Valid client certificate required?
If not set, the upgrade migration is disabled by default: MLMD only compares db_v with lib_v and raises an error if the two do not align. If the field is set to true, MLMD performs the upgrade migration: it upgrades the database schema version (db_v) to align with the library schema version (lib_v) when connecting to the database. Schema migration should not be run concurrently by multiple clients, to prevent data races.
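The connect-time decision described above can be sketched as a small function. The function name, its return values, and the error type are hypothetical; only the comparison of db_v against lib_v and the opt-in flag come from the description.

```python
def resolve_schema_version(db_v, lib_v, enable_upgrade_migration=False):
    """Decides what to do at connection time (illustrative sketch only)."""
    if db_v == lib_v:
        return "compatible"
    if db_v < lib_v and enable_upgrade_migration:
        # The caller opted in, so the database would be upgraded to lib_v.
        return "upgrade"
    # Either upgrades are disabled, or the database is newer than the library.
    raise RuntimeError(
        f"database schema version {db_v} does not align with "
        f"library schema version {lib_v}")


decision_same = resolve_schema_version(4, 4)
decision_upgrade = resolve_schema_version(0, 4, enable_upgrade_migration=True)
```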
Downgrades the given database to the specified schema version. For the v0.13.2 release, the schema_version is 0. For the 0.14.0 and 0.15.0 releases, the schema_version is 4. More details are described in g3doc/get_start.md#upgrade-mlmd-library. Set this field only when a database is accidentally upgraded by a newer version of the library. Each library version only knows how to downgrade to previous schema versions. As downgrade migrations inevitably introduce data loss, please consider taking a backup of the database before downgrading the schema. After a downgrade migration, the database connection is canceled. The user needs to downgrade the library to use the database.
The hostname or IP address of the MYSQL server: * If unspecified, a connection to the local host is assumed. The client connects using a Unix socket specified by `socket`. * Otherwise, TCP/IP is used. Currently a replicated MYSQL backend is not supported.
The TCP Port number that the MYSQL server accepts connections on. If unspecified, the default MYSQL port (3306) is used.
The database to connect to. Must be specified. After connecting to the MYSQL server, this database is created if not already present unless skip_db_creation is set. All queries after Connect() are assumed to be for this database.
The MYSQL login id. If empty, the current user is assumed.
The password to use for `user`. If empty, only MYSQL user ids that don't have a password set are allowed to connect.
The Unix socket to use to connect to the server. If unspecified, a `host` must be provided.
If the field is set, the ssl options are set in mysql_options before establishing a connection. It is ignored if the mysql server does not enable SSL.
A config to skip database creation, if the database does not exist, when connecting to the db instance. It is useful when db creation is handled by an admin process and the library user should not issue db creation clauses.
The options to establish encrypted connections to MySQL using SSL.
The path name of the client private key file.
The path name of the client public key certificate file.
The path name of the CA certificate file.
The path name of the directory that contains trusted SSL CA certificates.
The list of permissible ciphers for SSL encryption.
If set, enable verification of the server certificate against the host name used when connecting to the server.
A payload that can be optionally attached to absl::Status messages to indicate failure specific information like error codes for MySQL based backends.
Error code for the error.
The only member of this type is a None artifact. Note: ArtifactStruct{} is a None artifact. This can represent an execution that has no outputs (or inputs), or can be part of a UnionArtifactStructType to represent an optional input. For example, StatsGen has an "optional" schema input. A practical example of this is:
    stats_gen_type = {
      "dict": {
        "properties": {
          "schema": {
            "union_type": {
              "none": {},
              "simple": {...schema type...}
            },
          },
          "data": {
            "simple": {...data_type...}
          }
        }
      }
    };
(message has no fields)
The parental context edges between pairs of Context instances.
Output only.
A config containing the parameters used with PostgreSQLMetadataSource. Next index: 10
Name of host to connect to. If the host name starts with /, it is taken as a Unix-domain socket in the abstract namespace.
Numeric IP address of host to connect to. If this field is provided, `host` field is ignored.
Port number to connect to at the server host, or socket file name extension for Unix-domain connections.
PostgreSQL user name to connect as. Defaults to be the same as the operating system name of the user running the application.
Password to be used if the server demands password authentication.
Specifies the name of the file used to store passwords.
The database name. Defaults to be the same as the user name.
A config to skip database creation, if the database does not exist, when connecting to the db instance. It is useful when db creation is handled by an admin process and the library user should not issue db creation clauses.
disable, allow, verify-ca, verify-full, etc. Reference: https://www.postgresql.org/docs/current/libpq-connect.html
This parameter specifies the file name of the client SSL certificate, replacing the default ~/.postgresql/postgresql.crt. This parameter is ignored if an SSL connection is not made.
This parameter specifies the location for the secret key used for the client certificate. It can specify a file name that will be used instead of the default ~/.postgresql/postgresql.key. This parameter is ignored if an SSL connection is not made.
This parameter specifies the password for the secret key specified in sslkey, allowing client certificate private keys to be stored in encrypted form on disk even when interactive passphrase input is not practical.
This parameter specifies the name of a file containing SSL certificate authority (CA) certificate(s). If the file exists, the server's certificate will be verified to be signed by one of these authorities. The default is ~/.postgresql/root.crt.
The list of supported property value types.
Prefer to use `PROTO` to store structured data since this option has inefficient database storage usage.
When there are multiple writers updating an existing node to different states, there may be a race, and the end result of the concurrent update is nondeterministic. If the field is set, then an optimistic concurrency control (OCC) scheme is used during update: it compares the `artifact`.`last_update_time_since_epoch` in the request with the stored `last_update_time_since_epoch` having the same `artifact`.`id`. If they are different, the request fails, and the user can read the stored node and retry the node update. When the option is set, the timestamp after the update is guaranteed to be increased and different from the input artifact. When setting the option, the caller should set it for all concurrent writers.
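The optimistic concurrency check described above can be sketched with an in-memory dictionary standing in for the backend. The `update_artifact` helper, the `ConcurrentUpdateError` type, and the `state` values are assumptions for this illustration; only the timestamp comparison and the guarantee that the timestamp increases come from the description.

```python
class ConcurrentUpdateError(Exception):
    """Raised when the request was built from a stale read."""


def update_artifact(store, artifact, now_ms):
    """Updates `artifact` only if its timestamp matches the stored one."""
    stored = store[artifact["id"]]
    stored_time = stored["last_update_time_since_epoch"]
    if stored_time != artifact["last_update_time_since_epoch"]:
        raise ConcurrentUpdateError("stored node changed; re-read and retry")
    # The new timestamp is strictly greater than the stored one.
    new_time = max(now_ms, stored_time + 1)
    store[artifact["id"]] = {**artifact, "last_update_time_since_epoch": new_time}
    return new_time


store = {1: {"id": 1, "state": "PENDING", "last_update_time_since_epoch": 100}}

# Two writers read the same stored node, then try to update it.
writer_a = dict(store[1], state="LIVE")
writer_b = dict(store[1], state="DELETED")

first_update_time = update_artifact(store, writer_a, now_ms=100)
```

A second call with `writer_b` would now fail, because its timestamp no longer matches the stored value. That writer is expected to re-read the node and retry.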
A pair of an artifact and an event used or generated by an execution, e.g., during the execution run, it uses none or many artifacts as input, and generates none or many artifacts as output.
The pair may have an artifact. If present and no artifact.id is given, then it inserts the artifact, otherwise it updates the artifact.
The pair may have an event. Providing event.artifact_id or event.execution_id is optional. If the ids are given, they must align with the `artifact`.id / `execution`.id respectively. If artifact is not given and event.artifact_id is set, it must exist in the backend.
When there's a race to publish executions with a new context with the same context.name, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to true, the API will reuse the stored context in the transaction and perform an update.
When there's a race to publish executions with a new artifact with the same artifact.external_id, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to true and an Artifact has a non-empty external_id, the API will reuse the stored artifact in the transaction and perform an update. Otherwise, it falls back to relying on the `id` field to decide whether it is an update (if `id` exists) or an insert (if `id` is empty).
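The insert-or-update decision for artifacts described above can be sketched as below. The function name, its parameters, and the dictionary standing in for stored artifacts are hypothetical, and the behavior when the external_id is not yet stored is an assumption of this sketch.

```python
def plan_artifact_write(artifact, stored_ids_by_external_id,
                        reuse_by_external_id=False):
    """Returns ("update", id) or ("insert", None) for an artifact write."""
    external_id = artifact.get("external_id")
    if (reuse_by_external_id and external_id
            and external_id in stored_ids_by_external_id):
        # Reuse the stored artifact that already has this external_id.
        return ("update", stored_ids_by_external_id[external_id])
    # Fall back to the `id` field: present means update, empty means insert.
    if artifact.get("id") is not None:
        return ("update", artifact["id"])
    return ("insert", None)


stored = {"gs://bucket/model": 7}  # hypothetical external_id -> stored id
reused = plan_artifact_write(
    {"external_id": "gs://bucket/model"}, stored, reuse_by_external_id=True)
not_reused = plan_artifact_write({"external_id": "gs://bucket/model"}, stored)
```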
When `force_update_time` is set to true, `execution.last_update_time_since_epoch` is force-updated even if input execution is the same as stored execution.
If true, for contexts with a context.id, the stored context will NOT be updated. For such contexts, we will only look at the context.id. We will validate that it exists, and associate the context with the execution. If the context does not exist, a NotFound error is returned.
Index in the array of executions.
Index in the array of artifacts.
When there's a race to publish executions with a new context with the same context.name, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to true, the API will reuse the stored context in the transaction and perform an update.
When there's a race to publish executions with a new artifact with the same artifact.external_id, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to true and an Artifact has a non-empty external_id, the API will reuse the stored artifact in the transaction and perform an update. Otherwise, it falls back to relying on the `id` field to decide whether it is an update (if `id` exists) or an insert (if `id` is empty).
A collection of returned records.
Index-aligned column names for all records.
A list of records returned by a query.
An individual record (e.g., row) returned by a MetadataSource. The record does not address the type conversion.
The max number of retries when transaction returns Aborted error.
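A retry budget of this kind can be sketched as a loop around the transaction body. The `AbortedError` type and the `run_with_retries` helper are assumptions for illustration; the sketch only shows how a maximum retry count bounds the number of attempts after an Aborted error.

```python
class AbortedError(Exception):
    """Stands in for a transaction that returned an Aborted error."""


def run_with_retries(transaction, max_num_retries):
    """Runs `transaction`, retrying on AbortedError up to max_num_retries."""
    retries = 0
    while True:
        try:
            return transaction()
        except AbortedError:
            if retries >= max_num_retries:
                raise  # Retry budget exhausted; surface the error.
            retries += 1


def make_flaky_transaction(failures):
    """Builds a transaction that aborts `failures` times, then succeeds."""
    state = {"calls": 0}

    def transaction():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise AbortedError()
        return state["calls"]

    return transaction


attempts_used = run_with_retries(make_flaky_transaction(2), max_num_retries=2)
```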
A bundle of ml-metadata types to describe artifacts, executions and contexts in general ML pipelines. The details of the data model are described in go/mlmd. ml-metadata provides a predefined bundle defined in simple_types_constants.h.
A list of artifact types.
A list of execution types.
A list of context types.
A config containing the parameters used with SqliteMetadataSource.
A URI specifying the Sqlite3 database filename, for example: file:some_sqlite3_file_in_local_dir.db file:///home/username/some_sqlite3_file.db see https://www.sqlite.org/c3ref/open.html for more details. If not given, an in-memory sqlite3 database is used, and destroyed when disconnecting the metadata source.
A flag specifying the connection mode. If not given, default connection mode is set to READWRITE_OPENCREATE.
Connection parameters for SQLite3 based metadata source.
Connects a metadata source in read-only mode. The connection fails if the sqlite3 database at the `filename` does not exist. Any queries modifying the database fail.
Connects a metadata source in read/write mode. The connection fails if the sqlite3 database at the `filename` does not exist.
Similar to READWRITE. In addition, it creates the database if it does not exist.
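The three connection modes correspond closely to SQLite's own URI `mode` parameter (`ro`, `rw`, `rwc`). The sketch below shows that correspondence using Python's built-in `sqlite3` module. The `connect` helper, the mapping table, and the `READONLY` enum name are assumptions for this example, and it is not how MLMD itself opens the database.

```python
import os
import sqlite3
import tempfile

# Assumed mapping from the connection modes above to SQLite URI modes.
MODE_TO_URI_MODE = {
    "READONLY": "ro",
    "READWRITE": "rw",
    "READWRITE_OPENCREATE": "rwc",
}


def connect(filename_uri=None, mode="READWRITE_OPENCREATE"):
    """Opens a database URI in the given mode, or in memory if no URI is given."""
    if filename_uri is None:
        return sqlite3.connect(":memory:")
    uri = f"{filename_uri}?mode={MODE_TO_URI_MODE[mode]}"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    uri = "file:" + os.path.join(tmp, "demo.db")

    # READWRITE does not create a missing database file.
    try:
        connect(uri, "READWRITE")
        missing_db_rejected = False
    except sqlite3.OperationalError:
        missing_db_rejected = True

    # READWRITE_OPENCREATE creates the file if needed.
    rwc = connect(uri, "READWRITE_OPENCREATE")
    rwc.execute("CREATE TABLE t (x INTEGER)")
    rwc.commit()
    rwc.close()

    # READONLY rejects queries that modify the database.
    ro = connect(uri, "READONLY")
    try:
        ro.execute("INSERT INTO t VALUES (1)")
        readonly_write_rejected = False
    except sqlite3.OperationalError:
        readonly_write_rejected = True
    ro.close()
```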
The name of a system defined type.
Options for transactions. Note: This is under development. Clients should not use it.
Transaction tag for debug use only.
An ordered list of heterogeneous artifact structs. The length of the list is fixed. Each position in the list can have a different type.
Represents a union of types.
An artifact struct matches this type if it matches any of the candidates. If candidates is empty, this is a bottom type (matches no artifacts).
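The matching rule for union types can be sketched with plain dictionaries. The `"kind"`, `"name"`, and `"candidates"` keys and the `matches` helper are a hypothetical encoding chosen for this example, not the proto representation. The sketch shows a union of a none type and a simple type acting as an optional input, and an empty union matching nothing.

```python
def matches(struct, struct_type):
    """Returns True if `struct` matches `struct_type` (illustrative encoding)."""
    kind = struct_type["kind"]
    if kind == "none":
        # Only the None artifact is a member of the none type.
        return struct is None
    if kind == "simple":
        return struct is not None and struct.get("type") == struct_type["name"]
    if kind == "union":
        # Matches if any candidate matches; an empty union matches nothing.
        return any(matches(struct, c) for c in struct_type["candidates"])
    raise ValueError(f"unknown kind: {kind}")


optional_schema = {
    "kind": "union",
    "candidates": [{"kind": "none"}, {"kind": "simple", "name": "Schema"}],
}
bottom = {"kind": "union", "candidates": []}

absent_ok = matches(None, optional_schema)
present_ok = matches({"type": "Schema"}, optional_schema)
wrong_type_ok = matches({"type": "Examples"}, optional_schema)
```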
A value in properties.