Meta-data of a checkpoint.
Training configuration for the Distributed Gradient Boosted Trees algorithm.
Classical training configuration for a GBT.
Hyper-parameters for the creation of the dataset cache.
How to read the dataset cache.
If true, workers will print training logs.
Dynamic balancing of the workload between workers, for the case where the speed of the workers is not uniform or not constant.
Interval between the creation of checkpoints. If one of the workers or the manager is rescheduled, all the training since the last checkpoint is lost. On the other hand, creating a checkpoint is expensive. The value "-1" disables checkpointing.
Defaults to 10 minutes.
Ratio of workers used for evaluation. The remaining workers are used for training. If no validation dataset is available, all the workers are used for training, independently of the value of "ratio_evaluation_workers". Validation and training run concurrently. Increase this value if the validation takes more time than the training (high average duration of the "EndIter" stage; see the training logs).
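The configuration fields above can be pictured as a single proto message. The following is a minimal sketch in proto2 syntax: the message name, field names, field numbers, and the encoding of the 10-minute default as 600 seconds are illustrative assumptions, not the actual Yggdrasil Decision Forests schema.

```proto
syntax = "proto2";

// Hypothetical distributed GBT training configuration; all identifiers
// and field numbers are assumptions for illustration.
message DistributedGbtConfig {
  // If true, workers print training logs.
  optional bool worker_logs = 1;
  // Dynamic balancing of the workload between workers.
  optional bool load_balancing = 2;
  // Seconds between checkpoints; -1 disables checkpointing.
  optional int32 checkpoint_interval_seconds = 3 [default = 600];
  // Ratio of workers used for evaluation (the rest train).
  optional float ratio_evaluation_workers = 4;
}
```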
Used in:
If true, the workers will simulate failures to test the checkpointing during training. The training will still complete eventually.
If true, all the workers run all the splits, i.e., they perform the same amount of computation. This option can be used to benchmark and detect slow workers. Note that each worker will then have a full copy of the dataset, i.e., the dataset is not distributed.
Evaluation. Can be partial, i.e., computed on a subset of the dataset.
Used in:
The order and semantics of the metrics are defined by the loss implementation.
Used in:
Number of evaluation fragments to make a full evaluation.
Evaluations indexed by "iter_idx".
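Schematically, a partial evaluation and the per-iteration indexing described above could be laid out as in the following sketch (proto2 syntax; all names and field numbers are hypothetical, not the library's actual schema).

```proto
syntax = "proto2";

// Hypothetical partial evaluation of the model on a subset of a dataset.
message Evaluation {
  // Metric values; their order and semantics are defined by the loss.
  repeated float metrics = 1 [packed = true];
  // Number of evaluation fragments that make a full evaluation.
  optional int32 num_fragments = 2;
}

// Hypothetical container of evaluations, indexed by "iter_idx".
message PendingEvaluations {
  map<int32, Evaluation> evaluations = 1;
}
```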
Used in:
Request messages of the workers. Unless stated otherwise, the messages are designed to be sent from the manager to one of the workers. A hypothetical sketch of the request envelope follows the list of request types below.
Computes the statistics of the labels, e.g., the number of elements in each class for classification. Worker type: Trainer
Sets the initial predictions (also called the bias) of the model. Worker type: Trainer & Evaluator
Starts the training of a new iteration, e.g., starts the training of a new tree if one tree is trained at each iteration. The workers return the statistics of the weak model labels. Worker type: Trainer
Finds the highest scoring splits for each of the model nodes in the current tree. Worker type: Trainer
Each worker will evaluate the split (on each example) based on the features it owns. Worker type: Trainer
Shares the split evaluation values between the workers. Once the split values are sharded, updates the node map, the tree structure, and the label statistics. Worker type: Trainer
Requests a subset of split values. This message is only used between workers. Worker type: Trainer
Finalize the current iteration. Worker type: Trainer & Evaluator
Restore an existing checkpoint. Worker type: Trainer & Evaluator
Create the training-worker-side checkpoint, possibly only part of it (for distributed checkpoint creation). Worker type: Trainer
First message sent to the worker by the manager. Sent for both new and resumed trainings. Worker type: Trainer
Create an evaluation worker side checkpoint. Worker type: Evaluator
If set, the worker must ensure that the following (and only the following) features are loaded in RAM (in the case where features are loaded in RAM), as they are used in the computation. Those values can differ from the features in the welcome message (which are the initial features owned by the worker).
Features not currently used in requests, but that will be used (or will stop being used) in the future. The worker is expected to load these features in the background, i.e., independently of the core computation.
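The request types listed above suggest an envelope message with one oneof field per request. The sketch below is a hypothetical reconstruction in proto2 syntax: every message name, field name, and field number is an assumption, and the empty payload messages merely stand in for the real ones (which carry the fields described above).

```proto
syntax = "proto2";

// Empty placeholders so the sketch compiles; the real payloads carry
// the fields described above.
message GetLabelStatistics {}
message SetInitialPredictions {}
message StartNewIter {}
message FindSplits {}
message EvaluateSplits {}
message ShareSplits {}
message GetSplitValue {}
message EndIter {}
message RestoreCheckpoint {}
message CreateCheckpoint {}
message StartTraining {}
message CreateEvaluationCheckpoint {}

// Hypothetical request envelope, sent from the manager to a worker
// (except "get_split_value", which travels between workers).
message WorkerRequest {
  oneof type {
    GetLabelStatistics get_label_statistics = 1;        // Trainer
    SetInitialPredictions set_initial_predictions = 2;  // Trainer & Evaluator
    StartNewIter start_new_iter = 3;                    // Trainer
    FindSplits find_splits = 4;                         // Trainer
    EvaluateSplits evaluate_splits = 5;                 // Trainer
    ShareSplits share_splits = 6;                       // Trainer
    GetSplitValue get_split_value = 7;                  // Worker to worker
    EndIter end_iter = 8;                               // Trainer & Evaluator
    RestoreCheckpoint restore_checkpoint = 9;           // Trainer & Evaluator
    CreateCheckpoint create_checkpoint = 10;            // Trainer
    StartTraining start_training = 11;                  // Trainer
    CreateEvaluationCheckpoint create_evaluation_checkpoint = 12;  // Evaluator
  }
  // Features the worker must have loaded in RAM for this request.
  repeated int32 features = 13 [packed = true];
  // Features to load (or unload) in the background for future requests.
  repeated int32 future_features = 14 [packed = true];
}
```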
Used in:
Range of examples to export. Note: the checkpoint only contains the prediction accumulator.
Used by the manager to keep track of the shards.
Used in:
Used in:
If true, the worker is expected to return the training loss.
Newly learned tree. Only available to evaluation workers.
If true, the evaluation worker is expected to return the validation evaluation immediately (i.e. not in the next iteration).
Used in:
Used in:
Used in:
List of features to test per weak learner and open nodes.
Used in:
Used in:
Used in:
Used in:
(message has no fields)
Used in:
Used in:
Used in:
Used in:
Used in:
Index of the iteration.
Unique identifier of the iteration. If the manager is rescheduled, the same iteration index can be started multiple times. However, the UID will change.
Seed used to initialize the random generator.
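For instance, the iteration identity described above could be carried as in this hypothetical fragment (proto2 syntax; names and numbers are assumptions).

```proto
syntax = "proto2";

// Hypothetical payload starting a new iteration.
message StartNewIter {
  // Index of the iteration; stable across manager rescheduling.
  optional int32 iter_idx = 1;
  // Changes every time the same iter_idx is (re)started, e.g. after the
  // manager is rescheduled.
  optional string iter_uid = 2;
  // Seed used to initialize the random generator.
  optional int64 seed = 3;
}
```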
Used in:
(message has no fields)
Used in:
Result message of the worker.
Each WorkerRequest leads to a WorkerResult with the same oneof field set. Keep the same indexing as in "WorkerRequest" for debugging purposes.
If true, indicates that the worker is missing information to complete the request and continue the training of the tree. This situation is caused by a rescheduling. This message is only possible for the messages related to the training of an individual tree, as snapshots are made between trees.
Duration of the computation, expressed in seconds. If the worker restarts during the computation, the duration of the last execution is used.
True if the pre-loading is still in progress.
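A result envelope mirroring the request envelope could look as follows. This is a hypothetical sketch (proto2 syntax): the names, field numbers, and placeholder payloads are assumptions.

```proto
syntax = "proto2";

// Placeholder result payloads; the real ones carry the fields above.
message GetLabelStatisticsResult {}
message EndIterResult {}

// Hypothetical result envelope; mirrors the oneof of WorkerRequest.
message WorkerResult {
  oneof type {
    // Same field numbers as in WorkerRequest, for debugging.
    GetLabelStatisticsResult get_label_statistics = 1;
    EndIterResult end_iter = 8;
    // ... one result message per request type ...
  }
  // The worker lost the in-tree state (e.g. after a rescheduling) and
  // cannot complete the request.
  optional bool request_restart_iter = 20;
  // Duration of the computation in seconds (last execution on restart).
  optional double runtime_seconds = 21;
  // True while the background feature pre-loading is still running.
  optional bool preloading_work_in_progress = 22;
}
```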
Used in:
Used in:
Any pending validation evaluation.
Used in:
Because the validation evaluation is asynchronous, there can be multiple validation evaluations corresponding to several previous iterations.
Used in:
(message has no fields)
Used in:
One for each weak model.
Used in:
Used in:
Used in:
Used in:
(message has no fields)
Used in:
(message has no fields)
Used in:
(message has no fields)
Used in:
One for each weak model.
Used in:
If specified, the worker loaded the dataset in memory.
"Welcome" proto message of the worker. The welcome message is received as an argument of the "Setup" method. All the workers have the same welcome message.
Location used by the manager and the workers to store intermediate data.
Location of the dataset cache i.e. the dataset indexed for fast training.
List of features owned by each training worker. "owned_features[i].features" are the features owned by the i-th worker.
Classical Yggdrasil training configuration.
Number of training workers. A fraction of the workers will only be used for training while the others will only be used for evaluation. Training workers have index "WorkerIdx() < num_train_workers" while evaluation workers have index "WorkerIdx() >= num_train_workers".
Validation dataset for each evaluation worker.
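Putting the fields above together, a hypothetical "Welcome" message might read as follows (proto2 syntax; names and numbers are assumptions, and the classical training configuration is reduced to a placeholder).

```proto
syntax = "proto2";

// Placeholder for the classical Yggdrasil training configuration.
message TrainingConfig {}

// Features owned by one training worker.
message FeatureSet {
  repeated int32 features = 1 [packed = true];
}

// Hypothetical "Welcome" message; identical for all workers.
message Welcome {
  // Location for intermediate manager/worker data.
  optional string work_directory = 1;
  // Location of the dataset cache (dataset indexed for fast training).
  optional string cache_path = 2;
  // owned_features[i].features: features owned by the i-th training worker.
  repeated FeatureSet owned_features = 3;
  // Classical Yggdrasil training configuration.
  optional TrainingConfig train_config = 4;
  // Workers with WorkerIdx() <  num_train_workers are trainers;
  // workers with WorkerIdx() >= num_train_workers are evaluators.
  optional int32 num_train_workers = 5;
  // Validation dataset, one entry per evaluation worker.
  repeated string validation_dataset_per_evaluator = 6;
}
```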
Used in: