Used in:
Optimizer name.
Learning rate decay values.
Learning rate decay steps.
Total number of training steps.
Training batch size.
Number of GPUs used for training.
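A minimal proto3 sketch of how this optimizer message might be declared. The field names, numbers, and types are illustrative assumptions reconstructed from the comments above, not the actual schema.

syntax = "proto3";

message OptimizerConfig {
  // Optimizer name.
  string optimizer = 1;
  // Learning rate decay values.
  repeated float lr_decay = 2;
  // Learning rate decay steps.
  repeated int64 lr_decay_steps = 3;
  // Total number of training steps.
  int64 max_train_steps = 4;
  // Training batch size.
  int32 batch_size = 5;
  // Number of GPUs used for training.
  int32 num_gpu = 6;
}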
Used in:
Similarity function, euclidean or cosine.
Whether to use cosine softmax in task A.
Whether to add a learnable scaling factor in task A.
Whether to use cosine softmax in task B.
Whether to add a learnable scaling factor in task B.
Whether to use cosine attention mechanism.
Whether to gate the prototypes with learned parameters.
Whether to reinitialize the tau variable for cosine softmax in task A.
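Continuing the same hypothetical .proto sketch, one plausible declaration of this prototype/similarity message; all identifiers below are assumptions derived from the comments, not the real field names.

message ProtonetConfig {
  // Similarity function, "euclidean" or "cosine".
  string similarity = 1;
  // Whether to use cosine softmax in task A, and whether its scale is learnable.
  bool cosine_softmax_a = 2;
  bool learnable_scale_a = 3;
  // Whether to use cosine softmax in task B, and whether its scale is learnable.
  bool cosine_softmax_b = 4;
  bool learnable_scale_b = 5;
  // Whether to use the cosine attention mechanism.
  bool cosine_attention = 6;
  // Whether to gate the prototypes with learned parameters.
  bool gate_prototypes = 7;
  // Whether to reinitialize the tau variable for cosine softmax in task A.
  bool reinit_tau = 8;
}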
Used in:
Input image height.
Input image width.
Input image number of channels.
Number of residual units.
Number of filters in each resolution stage.
Stride for the initial convolution of each resolution stage.
Stride of the initial convolution layer.
Whether to apply max pooling in the initial convolution stage.
Number of filters in the initial convolution.
Whether to use a bottleneck layer in each residual unit.
Weight decay.
Normalization scheme used in every layer.
Number of groups (used by GroupNorm).
Whether to perform global average pooling at the end.
Data format, NCHW or NHWC.
ResNet version, v1, v2, snailv1, or snailv2.
Whether to use Leaky ReLU and, if so, its slope.
Filter initialization scheme, normal or uniform.
Whether to add a final ReLU to the feature map.
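The backbone message could look roughly like the sketch below; field names such as num_residual_units or init_filter are guesses made for illustration only.

message ResnetConfig {
  // Input image height, width, and number of channels.
  int32 height = 1;
  int32 width = 2;
  int32 num_channel = 3;
  // Number of residual units and filters per resolution stage.
  repeated int32 num_residual_units = 4;
  repeated int32 num_filters = 5;
  // Stride for the initial convolution of each resolution stage.
  repeated int32 strides = 6;
  // Initial convolution: stride, max pooling, and number of filters.
  int32 init_stride = 7;
  bool init_max_pool = 8;
  int32 init_filter = 9;
  // Whether to use a bottleneck layer in each residual unit.
  bool use_bottleneck = 10;
  // Weight decay.
  float wd = 11;
  // Normalization scheme and number of GroupNorm groups.
  string normalization = 12;
  int32 num_norm_groups = 13;
  // Whether to perform global average pooling at the end.
  bool global_avg_pool = 14;
  // Data format, "NCHW" or "NHWC".
  string data_format = 15;
  // ResNet version, "v1", "v2", "snailv1", or "snailv2".
  string version = 16;
  // Leaky ReLU slope (0.0 for plain ReLU).
  float leaky_relu = 17;
  // Filter initialization scheme, "normal" or "uniform".
  string filter_initialization = 18;
  // Whether to add a final ReLU to the feature map.
  bool add_last_relu = 19;
}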
Used in:
Number of steps for one validation run.
Number of steps between printed outputs.
Number of steps between model checkpoints.
Number of steps to pretrain task A.
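A sketch of the surrounding training-schedule message, again with assumed field names:

message TrainConfig {
  // Number of steps for one validation run.
  int64 steps_per_val = 1;
  // Number of steps between printed outputs.
  int64 steps_per_log = 2;
  // Number of steps between model checkpoints.
  int64 steps_per_save = 3;
  // Number of steps to pretrain task A.
  int64 pretrain_steps = 4;
}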
Used in:
Layers to finetune.
Whether to reduce learning rate for lower layers.
Dummy learning rate used to build the one-step computation graph.
Damping rate.
Number of steps in the RBP (recurrent backpropagation) loop.
RBP batch size.
Finetune weight decay.
Whether the query set contains both old and new instances.
Type of the additional transfer learning loss.
Ratio of the transfer loss.
Whether to learn meta-weights only.
Whether to finetune readout weights for task A.
Replace the gradients for the task A weights with the one-step approximation.
Fractional learning rate for slow layers.
Initialize weights with prototypes.
Cache the transfer loss values.
SciPy optimizer interface.
Layers to include in meta-learning.
Learned memory size.
Weight decay for the meta weights.
Whether to warm start with no learned regularizer.
Initial value for the cosine similarity scale variable.
Initial variance of the MLP.
Fast model type, LR (logistic regression) or MLP.
Fast model MLP hidden size.
Initial log gamma.
Whether to learn gamma.
Number of BPTT steps.
Whether to add BPTT in the transient phase.
Whether to add RBP in the steady phase.
Whether to do BPTT at all steps.
Loss multiplier for BPTT.
Whether to prevent RBP from backpropagating into the embeddings.
MLP hidden size for the transfer loss function.
MLP activation function.
Number of MLP layers for the transfer loss function.
Cost multiplier for task A.
Cost multiplier for task B.
Cost to remember task A (not used).
Number of steps after which the cool-down of task A starts.
Final task A ratio.
Number of steps to warm up task B from 0.0.
Number of steps to cool down task A.
Finetune optimizer config.
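Finally, a partial sketch of this transfer/finetuning message. Only a representative subset of the fields listed above is shown, the names are assumptions, and the nested OptimizerConfig refers to the optimizer sketch earlier on this page (assumed to live in the same .proto file).

message TransferConfig {
  // Layers to finetune, and whether to reduce the learning rate for lower layers.
  repeated string finetune_layers = 1;
  bool reduce_lr_lower_layers = 2;
  // Dummy learning rate used to build the one-step computation graph.
  float dummy_lr = 3;
  // Damping rate, number of RBP steps, and RBP batch size.
  float rbp_damping = 4;
  int32 rbp_steps = 5;
  int32 rbp_batch_size = 6;
  // Finetuning weight decay.
  float finetune_wd = 7;
  // Type and ratio of the additional transfer learning loss.
  string transfer_loss_type = 8;
  float transfer_loss_ratio = 9;
  // Fast model type ("lr" or "mlp") and its hidden size.
  string fast_model = 10;
  int32 fast_model_hidden_size = 11;
  // Number of BPTT steps and the BPTT loss multiplier.
  int32 bptt_steps = 12;
  float bptt_loss_multiplier = 13;
  // Cost multipliers for task A and task B.
  float cost_a = 14;
  float cost_b = 15;
  // Cool-down / warm-up schedule between task A and task B.
  int64 cooldown_start_steps = 16;
  float final_cost_a_ratio = 17;
  int64 warmup_b_steps = 18;
  int64 cooldown_a_steps = 19;
  // Finetune optimizer config.
  OptimizerConfig finetune_optimizer = 20;
  // ... remaining fields from the list above follow the same pattern.
}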