Creates a new deep learning training job for a given model definition.
EMExtractionSpec allows the caller to specify evaluation metrics extraction.
Reads a training job with a given ID.
Returns all training jobs for a given user.
Deletes a training job with a given ID.
Halts the training with a given ID without discarding the result.
Returns the model definition that was used for training as application/zip.
Returns the trained model as application/zip.
Returns the logs from the ZIP file stored in the object store. Deprecated.
These are for internal use only, and will be eventually removed!
LogLine represents one line of log information, returned by training data endpoints.
For MetaInfo, at the minimum training_id and user_id must be specified.
Raw line from the logs.
Returns the evaluation metrics records that match the given query.
EMetrics specifies an evaluation metrics record from the training data.
For MetaInfo, at the minimum training_id and user_id must be specified.
Map of temporal keys, normally values for the x-axis on a graph. Example: {"iteration": 209}
Group label, such as test, train, or validate
Map of value keys, normally values for the y-axis on a graph. Example: {"cross_entropy": 0.4430539906024933, "accuracy": 0.8999999761581421}
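To illustrate how the temporal keys, group label, and value keys fit together, here is a minimal Python sketch of one evaluation metrics record. It is not the generated protobuf class: the class name `EMetricsSketch`, the field names, and the ID strings are assumptions made for illustration, and values are shown as plain numbers rather than as typed `Any` values.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class EMetricsSketch:
    """Hypothetical stand-in for an EMetrics record (not the real generated type)."""
    training_id: str                 # assumed name, from the MetaInfo description
    user_id: str                     # assumed name, from the MetaInfo description
    grouplabel: str                  # assumed name: e.g. "test", "train", "validate"
    etimes: Dict[str, float] = field(default_factory=dict)   # x-axis keys
    values: Dict[str, float] = field(default_factory=dict)   # y-axis keys


def to_point(rec: EMetricsSketch, x_key: str, y_key: str) -> Tuple[float, float]:
    """Pick one (x, y) pair out of a record for plotting."""
    return rec.etimes[x_key], rec.values[y_key]


# Example values taken from the field descriptions; the IDs are made up.
record = EMetricsSketch(
    training_id="training-123",
    user_id="user-abc",
    grouplabel="test",
    etimes={"iteration": 209},
    values={"cross_entropy": 0.4430539906024933, "accuracy": 0.8999999761581421},
)
```

Calling `to_point(record, "iteration", "accuracy")` yields the pair that would be plotted for the `test` group at iteration 209.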
(message has no fields)
Contains a list of all frameworks currently supported along with the versions of that framework and whether a specific framework version can be used by anyone or only for internal usage.
For internal use only!
For internal use only! Updates an existing training status. TODO: we should not have this, but until we fix the status update handling properly, we have no other choice.
Not implemented, to be removed (for GA)
Any represents a typed value used with the evaluation metrics record.
Datatype of the value.
String representation of a value
Additional fields for the given Datastore type
Connection information for the given Datastore type
Typed value for EMExtractionSpec. The data type here can't be an enum, due to internal issues with YAML conversion.
one of: STRING, JSONSTRING, INT, FLOAT
String representation of the value.
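Because the data type is carried as a plain string next to a string-encoded value, a consumer has to decode the pair itself. The following Python sketch shows one way that could look. The function name `decode_typed_value` is hypothetical, and treating JSONSTRING as "parse the string as JSON" is an assumption, not something the description above confirms.

```python
import json


def decode_typed_value(type_name: str, value: str):
    """Decode a string-encoded value according to its type name.

    Hypothetical helper; the type names come from the field description,
    the decoding rules are assumptions.
    """
    if type_name == "STRING":
        return value
    if type_name == "JSONSTRING":
        # Assumption: the value holds a JSON document.
        return json.loads(value)
    if type_name == "INT":
        return int(value)
    if type_name == "FLOAT":
        return float(value)
    raise ValueError(f"unsupported type name: {type_name!r}")
```

For example, `decode_typed_value("INT", "209")` gives the integer 209, while an unknown type name raises `ValueError` instead of silently passing the string through.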
EMExtractionSpec represents the specification for extracting structured evaluation metrics from training jobs. It is used across all log collectors, so some fields may not be relevant for all log collectors. Note: Don't use enums with this, because the spec needs to go through an untyped YAML conversion to string and back. Refer to https://github.ibm.com/deep-learning-platform/dlaas-training-metrics-service for complete documentation.
Loosely typed string representing what kind of log-collector to use. For logs only, specify `type: logger`. For the Regex_extractor log-collector, specify `type: regex_extractor`. For Tensorboard, specify `type: tensorboard`. To invoke the emetrics_file_extractor, specify any of the following synonyms: `type: optivist`, `type: emetrics_file`, or `type: file`.
Dev only.
The filename of the logfile. (Normally this should be left to default).
For the regex_extractor, number of lines to keep in the buffer for regex matching.
(Eventual) Available event types: 'images', 'distributions', 'histograms', 'audio', 'scalars', 'tensors', 'graph', 'meta_graph', 'run_metadata'. For now only scalars are supported.
For the regex_extractor, the `EMExtractionSpec` should contain a `groups` section with templates for groups such as `test` and `train`. The group names are the keys of this map.
EMGroup represents a group, such as `test` or `train`, that acts as a template for structured evaluation metrics. It allows the specification of a regular expression (regex) that contains named bindings with sub-expressions, which can then be used as references to specify structured time-related (x-axis) and value-related (y-axis) values.
Python regular expression that uses the named group feature `(?P<name>...)` to name a matching sub-expression. The name can then be used to specify the value that is used in the template for the `EMetrics` record. To help with verbosity, the regex_extractor allows the following macros: GLOG_STAMP, TIMESTAMP, FLOAT, INT, INT_ANY, and HEX. (See the dlaas-training-metrics-service README for more details.)
Allows the caller to specify a binding for the time field of the meta structure.
Map of keys and regex references for value-related (y-axis) values.
Map of keys and regex references for time-related (x-axis) values.
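The named-group mechanism described for EMGroup can be tried out with Python's standard `re` module. The sketch below is not the regex_extractor itself: the log line format, the pattern, and the function name `extract` are invented for illustration, and the macros (FLOAT, INT, and so on) are deliberately not used because their expansions are not given here.

```python
import re

# Hypothetical log line; real learner output will look different.
LOG_LINE = "Step 209: cross_entropy = 0.4431, accuracy = 0.9000"

# Each (?P<name>...) binds a sub-expression to a name that the
# time-related and value-related maps can then refer to.
PATTERN = re.compile(
    r"Step (?P<iteration>\d+): "
    r"cross_entropy = (?P<cross_entropy>\d+\.\d+), "
    r"accuracy = (?P<accuracy>\d+\.\d+)"
)


def extract(line: str, grouplabel: str = "test"):
    """Return a dict shaped like an evaluation metrics record, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    bound = match.groupdict()
    return {
        "grouplabel": grouplabel,
        # x-axis: time-related values
        "etimes": {"iteration": int(bound["iteration"])},
        # y-axis: value-related values
        "values": {
            "cross_entropy": float(bound["cross_entropy"]),
            "accuracy": float(bound["accuracy"]),
        },
    }
```

Lines that do not match the pattern return `None`, which mirrors the idea that only lines matching a group's regex produce a record.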
Allows the user to bind an extracted value to the time field of the evaluation metrics.
Time that the metric occurred, represented as the number of milliseconds since midnight January 1, 1970. Specified as a reference (for instance, $timestamp). The value will be extracted from timestamps.
Optional subid
Optional: tag used for learner testing
Optional: non-standard location for learner image
If true, the image can be used by any user. If false, the image is only available for internal usage.
Used as request type in: Trainer.GetTrainingJob, Trainer.GetTrainingStatusID
the server name for the docker registry
namespace within the registry
Token used to access images stored in the registry+namespace
Email address associated with the account
MetaInfo represents data shared with both log lines and evaluation metrics.
Unique id identifying the training job
Unique id identifying the user
Time that the metric occurred, represented as the number of milliseconds since midnight January 1, 1970.
Sequential index, 1-based
Optional subid
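The time field is an integer count of milliseconds since midnight January 1, 1970. A minimal Python sketch of converting to and from that representation follows. It assumes the reference point is midnight UTC (the Unix epoch), which the description does not state explicitly; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "midnight January 1, 1970" means the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis: int) -> datetime:
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)
```

Using timedelta arithmetic rather than `timestamp() * 1000` keeps the conversion exact at millisecond resolution.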
Optional: application/zip as bytes containing the model definition. If not present, the location field needs to be set.
Optional: data store location where the model definition (code) is located
Query specifies the input query for logs and evaluation metrics.
Used as request type in: Trainer.GetTrainingEMetrics, Trainer.GetTrainingLogs
At this time, the SearchType value should normally always be TERM.
At the minimum, the training_id and user_id must be specified in the meta substructure.
Time, represented as the number of milliseconds since midnight January 1, 1970. Mutually exclusive with pos.
Only get this many records.
The starting position. If positive or zero, count from the beginning; if negative, count from the end. Mutually exclusive with since.
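The starting position and record limit can be read as a windowing rule over an ordered list of records. The Python sketch below shows one plausible reading and is not the service's implementation: the function name `select_records` and the parameter name `pagesize` are hypothetical, the position is assumed to be a zero-based offset, and the time-based alternative is not modeled.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_records(records: Sequence[T], pos: int = 0,
                   pagesize: Optional[int] = None) -> List[T]:
    """Return up to `pagesize` records starting at `pos`.

    pos >= 0 counts from the beginning; pos < 0 counts from the end.
    Assumption: pos is a zero-based offset.
    """
    if pos >= 0:
        start = pos
    else:
        # Clamp so that a large negative pos starts at the first record.
        start = max(len(records) + pos, 0)
    end = len(records) if pagesize is None else start + pagesize
    return list(records[start:end])
```

With ten records numbered 0 through 9, `select_records(records, -3, 2)` returns the records numbered 7 and 8: the window starts three from the end and is capped at two entries.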
Used to specify resource requirements of a training job
Number of CPU cores
Number of GPUs
RAM
Number of learners
Optional. If not specified, the job will be scheduled ONLY on nvidia-TeslaK80. The constraint is strictly enforced: if, e.g., an nvidia-TeslaP100 is requested, the job will NOT start until an nvidia-TeslaP100 is available. Can only be nvidia-TeslaK80, nvidia-TeslaP100, or nvidia-TeslaV100.
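A client may want to validate the GPU type before submitting a job, since an unavailable type keeps the job from starting. The following Python sketch captures the defaulting and the allowed values listed above. The function and constant names are hypothetical, and whether the service itself rejects unknown values is not stated here.

```python
from typing import Optional

# Allowed values as listed in the field description.
SUPPORTED_GPU_TYPES = ("nvidia-TeslaK80", "nvidia-TeslaP100", "nvidia-TeslaV100")
# Used when the caller does not specify a GPU type.
DEFAULT_GPU_TYPE = "nvidia-TeslaK80"


def resolve_gpu_type(requested: Optional[str] = None) -> str:
    """Return the GPU type a job would be scheduled on, or raise for unknown types."""
    if not requested:
        return DEFAULT_GPU_TYPE
    if requested not in SUPPORTED_GPU_TYPES:
        raise ValueError(
            f"unsupported GPU type {requested!r}; "
            f"expected one of {', '.join(SUPPORTED_GPU_TYPES)}"
        )
    return requested
```

Checking on the client side surfaces a typo such as a wrong model name immediately, instead of leaving a job waiting for hardware that will never match.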
Command to execute during training
Resource requirements for the training
Input and output data as data store references
Whether to enable detailed profiling during the training
Used as response type in: Trainer.GetModelDefinition, Trainer.GetTrainedModel