Next id: 2
Architecture of the various towers in the model.
If true, Phoenix will apply (step) exponential decay in some of the trials. The number of steps can be specified in the fields below.
The minimal number of times to apply decay when applying learning rate exponential decay.
The maximal number of times to apply decay when applying learning rate exponential decay.
Minimum rate for exponential learning rate decay.
Maximum rate for exponential learning rate decay.
If true, the trials will apply gradient clipping.
Minimal norm allowed when clipping.
Maximal norm allowed when clipping.
Allow L2 regularization on trainable variables.
Minimal rate (lambda) for the L2 regularization.
Maximal rate (lambda) for the L2 regularization.
Number of steps to warm up the learning rate.
Minimum and maximum dropout allowed. The study will explore discretely between the minimum and the maximum over 10 different values.
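Putting the learning-rate knobs together, a spec for these fields might look like the sketch below in proto text format. The field names are assumptions for illustration only; consult the actual .proto for the real ones.

    # Hypothetical learning spec; field names assumed, values illustrative.
    learning_spec {
      apply_exponential_decay: true
      min_decay_times: 1               # minimal number of decay applications
      max_decay_times: 5               # maximal number of decay applications
      min_decay_rate: 0.9
      max_decay_rate: 0.99
      apply_gradient_clipping: true
      min_gradient_norm: 1.0
      max_gradient_norm: 5.0
      apply_l2_regularization: true
      min_l2_regularization: 1e-5      # minimal lambda
      max_l2_regularization: 1e-2      # maximal lambda
      learning_rate_warmup_steps: 1000
      min_dropout: 0.0                 # explored over 10 discrete values
      max_dropout: 0.5
    }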
Parameters that apply only to the LinearModel search algorithm.
The L2 ridge regression penalty used to fit the linear model. Must be > 0.
Ignore very high-loss architectures by performing asymmetric outlier removal: trials with loss in the top 10% or with loss >= 10 * median are removed. Huge loss == SGD diverged; remove. Low loss == good model; keep!
With fewer than this many trials, just generate a random suggestion.
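To make the outlier rule concrete: if the median loss across completed trials is 0.5, every trial with loss >= 5.0 is dropped, in addition to the trials in the top 10% of losses. A minimal configuration sketch, with field names assumed rather than taken from the .proto:

    # Hypothetical LinearModel spec; field names assumed.
    linear_model {
      ridge_penalty: 0.1        # L2 ridge penalty; must be > 0
      trials_before_fit: 20     # below this many trials, suggest randomly
    }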
Controls how to align variable-depth architectures such that all architectures can be encoded as a fixed-length feature vector input.
Networks will be aligned at the last layer before outputs.
Networks will be aligned at the first layer after inputs.
Next id: 53
Below this depth, Phoenix will fill the network with random blocks. This depth is also used for the non-adaptive search, i.e., all networks have a fixed depth equal to minimum_depth.
Once the search exceeds this depth, we go to evolutionary mode, i.e., we mutate one block at a time.
The type of search you would like.
Submessage for SearchTypes that accept further configuration. Must be compatible with the value of search_type.
The number of blocks between reduction cells.
The reduction block to use after each cell.
Whether to only search for a cell, which is then replicated n times until the size of the architecture is equal to maximum_depth. If this option is false, each cell will be searched independently.
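For example, a cell-based search could be configured along these lines, using the three field names referenced in the algorithm notes further below (the block name, and whether reduction_block_type takes a string, are assumptions for this sketch):

    num_blocks_in_cell: 4        # a reduction block after every 4 blocks
    reduction_block_type: "DOWNSAMPLE_CONVOLUTION_3X3"  # assumed block name
    replicate_cell: true         # one searched cell, replicated up to maximum_depth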
To make the search statistically significant, we set thresholds below which the algorithm cannot increase complexity (i.e., go deeper). Example: [10, 30, 50] -- the algorithm must complete 10 trials at depth 1 to go to depth 2, and has to finish 30 trials to try depth 3, etc. If not set, the default is [5 * i for i in range(maximum_depth)].
The probability that the architecture will grow deeper if there is enough capacity. With probability 1 - increase_complexity_probability, evolutionary mode will be run instead. Setting increase_complexity_probability < 1.0 increases exploration (this is in effect an epsilon-greedy exploration strategy) and could enable the search algorithm to get out of local minima.
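For example, with the thresholds set to [10, 30, 50] and increase_complexity_probability = 0.9: once 10 trials at depth 1 have completed, each new suggestion grows to depth 2 with probability 0.9 and falls back to evolutionary mutation with probability 0.1.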
The following three probabilities are only used in ADAPTIVE_COORDINATE_DESCENT_GENETIC. The probability to mutate a block.
The probability of crossover. A new architecture is the result of mixing two of the best candidates.
The probability of generating a random architecture.
The number of best-performing completed trials to consider when increasing complexity (i.e., going deeper). E.g., if beam_size = 5, then the base for the new deeper tower will be randomly selected from the top 5 completed trials. Must be >= 1.
TODO(b/172564129): Find a way to figure this out from the head/dataset.
List of blocks to use in the search. Please use names defined in: model_search/blocks.py
User suggestions for architectures. These suggestions will be trained first. Once trained, Phoenix will switch to its specified search algorithm. Restrictions: 1. If you intend to run Phoenix solely to train the user-suggested architectures, then there are no limitations. 2. If you are using a search algorithm after the user_suggestions (i.e., the number of max trials is greater than the length of this repeated field), then your suggestions are going to be data points for the search algorithm. Therefore, they must be compatible with the search; namely, if you are using HARMONICA_SEARCH or NONADAPTIVE_RANDOM_SEARCH, then your suggestions must be of fixed depth, since these algorithms work in a fixed-depth space. The depth should be equal to PhoenixSpec.minimum_depth. 3. If you are using ADAPTIVE_COORDINATE_DESCENT, this algorithm searches over dynamic-depth architectures, so no restriction applies.
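For illustration, two fixed-depth suggestions (depth 3, suitable for minimum_depth = 3) might be written as below. The inner field name architecture and the block names are assumptions for this sketch; real block names must come from model_search/blocks.py.

    user_suggestions {
      architecture: "FIXED_CHANNEL_CONVOLUTION_32"
      architecture: "CONVOLUTION_3X3"
      architecture: "FLATTEN"
    }
    user_suggestions {
      architecture: "FIXED_CHANNEL_CONVOLUTION_64"
      architecture: "DOWNSAMPLE_CONVOLUTION_3X3"
      architecture: "PLATE_REDUCTION_FLATTEN"
    }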
Fixes the architecture to a given one so that when replay is on, no search is performed. The Phoenix-returned estimator will always give the same architecture. When replay is on, there is no point in using parameterized_train_and_eval, as replay also contains the hparams used to train every tower.
Whether to scale how long a trial trains for based on the architecture size. If enabled, each extra block of an architecture adds max_train_steps / max_depth number of steps to the training. This only affects training when using mutation search.
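For example, with max_train_steps = 10000 and a maximum depth of 5, each extra block adds 10000 / 5 = 2000 steps, so deeper architectures earn proportionally more training time.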
If provided, it means that the input function provided for Phoenix already created the lengths tensor for the recurrent problem. Phoenix will use the provided lengths tensor and will assume that we have only one sequential feature.
The name of the weight tensor in the features dict. Keep empty for no weights.
Apply dropout.
Spec for learning, e.g., apply_exponential_decay, etc.
Specifies how to create an ensemble inside Phoenix. Ensembling is currently not supported for TPU (b/122906465).
Specifies how to distill the ensemble that Phoenix creates. If both a DistillationSpec and an EnsemblingSpec are provided, the DistillationSpec will be ignored. TODO(b/172564129): Integrate distillation with ensembling.
Use synchronous optimization across workers.
Task spec for the various tasks. Please keep empty if you have one task.
The most important task when using multi_task_spec. The loss / predictions of the estimator are going to be set to this label.
If set to true, the losses from the multi-task spec above are summed into one loss, which is then optimized. Otherwise, each loss is trained separately (i.e., training alternates between optimizing the various losses).
NHWC or NCHW -- only applies to CNN problems.
TODO(b/172564129): Add the ability to specify a custom tower to use with the auxiliary head. Whether to use an auxiliary head for CNN tasks. If True, two kinds of auxiliary heads can be attached: 1. If the network contains a downsample or reduction block, the NASNet auxiliary head will be attached right before the block. 2. If the network contains a flatten block which is not connected to the logits, an auxiliary head will be attached to this flatten block. If neither of these two criteria hold, no auxiliary head will be attached.
The weight to use for the auxiliary head loss. Must be in [0, 1.0]. The weight for the loss of the main head will be set to 1 - auxiliary_head_loss_weight.
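For example, auxiliary_head_loss_weight = 0.4 yields a total loss of 0.6 * main_head_loss + 0.4 * auxiliary_head_loss.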
If true, when there is an embedding feature column, we use the embedding once for all towers in Phoenix. Otherwise, we create a different embedding for every tower. If in doubt, keep false.
Temperature added to the predict graph when exporting.
This option gives the user the ability to disallow Phoenix from adding a last dense layer to create logits that match the problem size. This option is to be used if the user can guarantee that the last block produces logits (i.e., an output tensor) with the appropriate size. If unsure, please keep as false.
Only applies to multi-task. If true, we don't infer the label tensor and label weight when passing them to the loss_fn and metric_fn; we pass the estimator label dict as is. Note that this requires the user to modify the default loss_fn/metric_fn to accept a dictionary and calculate the loss based on the tensors in that dictionary.
What does the input look like? Use DNN if you have many feature columns. Use CNN for plate tensors like images or speech (timeframes x freq. bands). Use RNN_ALL_ACTIVATIONS for language-model-like classification, i.e., assume a label for every input in the sequence. For a one-classification RNN (i.e., one label per sequence), use RNN_LAST_ACTIVATIONS.
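For instance, an image classification problem would be configured along these lines (treating problem_type as the field name is an assumption for this sketch):

    problem_type: CNN   # plate tensors: images, or timeframes x frequency bands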
Each trial will try to add or modify one block from the best trial so far.
Each trial will randomly choose blocks to construct the network.
The harmonica search: https://arxiv.org/pdf/1706.00764.pdf
An extension of the coordinate descent algorithm which does not search over reduction blocks. Instead, a reduction block is placed every num_blocks_in_cell blocks. For example, if num_blocks_in_cell = 4, then a reduction block will be added to the architecture after every 4 blocks. NOTE: Do not include reduction blocks in blocks_to_use, otherwise we will crash (e.g., pooling, downsample, reduction, flatten, etc.). NOTE: The following fields should be set if using this algorithm: num_blocks_in_cell, reduction_block_type, replicate_cell.
Fit a linear model predicting accuracy from block binary feature vector. Then take the argmax of that model, and possibly increase depth.
This message helps Phoenix to search for an architecture when we are optimizing for multiple tasks. The search is trying to find a good architecture for the shared model. We will call the logits of the shared model (shared across the various tasks) `search logits`. You can specify how to create `final logits` for the tasks given the search logits. I.e., you can specify what blocks to add above the shared model for the specific task -- this addition is currently fixed and not searchable. Next id: 8
Required. This name must match the name in the label dict in the model_fn of the estimator.
The number of classification classes -- use 2 for binary classification.
Additional blocks between the search logits and the final logits. Keep empty if you don't wish to add blocks between the search logits and the final logits for the task. A fully connected layer is added above the search logits if their dimension is not equal to the number of classes of the task.
Feature name for the weight for the label.
True if the weight feature is in the feature dict; otherwise (if it is in the label dict), false.
Whether or not to apply an activation before adding the appropriate architecture for the task specified in the architecture field above.
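Putting the task fields together, a hypothetical multi-task entry might look like the sketch below. Apart from architecture, which is referenced above, the field names and the block name are assumptions:

    multi_task_spec {
      label_name: "sentiment"               # must match the label dict key
      number_of_classes: 2                  # binary classification
      architecture: "FULLY_CONNECTED_64"    # extra block above the search logits
      weight_feature_name: "example_weight"
      weight_is_a_feature: true             # the weight lives in the feature dict
      apply_activation: true                # activation before the task blocks
    }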
Next id: 3
List of blocks to use in the architecture (stacked). Please use names (strings) defined in: model_search/blocks.py
Ability to override additional hparams in the trial. Keep empty to rely on tuner hparams instead.
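A hypothetical entry combining the two fields above (the field name architecture and the block names are assumed; real block names come from model_search/blocks.py):

    architecture: "FIXED_OUTPUT_FULLY_CONNECTED_128"
    architecture: "FIXED_OUTPUT_FULLY_CONNECTED_256"
    # hparams left empty here to rely on the tuner hparams instead.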