Used in:
Used in:
Used in:
(message has no fields)
Configuration message for the AdagradOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/AdagradOptimizer
Used in:
Only available on pai-tf; has better performance than AdamOptimizer
Used in:
Used in:
Configuration message for the AdamOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
Used in:
Used in:
Used in:
Used in:
(message has no fields)
Used in:
Used in:
The number of heads
The dimension of heads
The number of interacting layers
Used in:
Used in:
Size of the encoder layers and the pooler layer
Number of hidden layers in the Transformer encoder
Number of attention heads for each attention layer in the Transformer encoder
The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder
The non-linear activation function (function or string) in the encoder and pooler.
"gelu", "relu", "tanh" and "swish" are supported.
The dropout probability for all fully connected layers in the embeddings, encoder, and pooler
The dropout ratio for the attention probabilities
The maximum sequence length that this model might ever be used with
Whether to add position embeddings for the position of each token in the text sequence
The stddev of the truncated_normal_initializer for initializing all weight matrices
Whether to output all token embeddings; if set to false, only the first token embedding is output
The position of target item (i.e. head, tail, ignore)
Whether to preserve a position for target
Used in:
Used in:
a few sub DAGs
a few blocks generating a DAG
the names of output blocks; they will be merged into a single tensor
the names of output blocks; returned as a list or a single tensor
optional top mlp layer
Used in:
task name for the task tower
label for the task; defaults to label_fields in order
metrics for the task
loss for the task
num_class for multi-class classification loss
task specific dnn
related tower names
relation dnn
training loss weights
label name for indicating the sample space for the task tower
the loss weight for samples in the task space
the loss weight for samples outside the task space
level for prediction (reserved field: required uint32 prediction_level = 13)
prediction weight (reserved field: optional float prediction_weight = 14 [default = 1.0])
multiple losses
whether to use sample weight in this tower
field name for indicating the sample space for this task
field value for indicating the sample space for this task
Used in:
Used in:
supports gfile.Glob
Used in:
Used in:
the input names of feature groups or other blocks
sequential layers
only takes effect when there are no layers
a package of blocks for reuse; e.g. called in a contrastive learning manner
Used in:
package name
a few blocks generating a DAG
the names of output blocks; they will be merged into a single tensor
the names of output blocks; returned as a list or a single tensor
Used in:
Used in:
Used in:
The number of heads of cross modal fusion layer
The number of heads of image feature learning layer
The number of heads of text feature learning layer
The dimension of text heads
The dimension of image heads
The number of patches of image feature; takes effect when there is only one image feature
Reduce the image feature dimension to this size before the single modal learning module
The number of self attention layers for image features
The number of self attention layers for text features
The number of cross modal layers
The dimension of image cross modal heads
The dimension of text cross modal heads
Dropout probability for hidden layers
Dropout probability of the attention probabilities
Whether to add embeddings for different text sequence features
Whether to add position embeddings for the position of each token in the text sequence
Maximum sequence length that might ever be used with this model
Dropout probability for text sequence embeddings
dnn layers for other features
Used in:
max number of high capsules
max behaviour sequence length
high capsule embedding vector dimension
number of EM iterations
routing logits scale
routing logits initial stddev
squash power
output ratio
use a constant interest number by default; otherwise use log(seq_len)
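For reference, a hedged capsule_config sketch built from the comments above; the field names (max_k, max_seq_len, high_dim, num_iters, routing_logits_scale, routing_logits_stddev, const_caps_num) follow the public MIND samples but should be treated as assumptions:

  capsule_config {
    max_k: 5                     # max number of high capsules
    max_seq_len: 64              # max behaviour sequence length
    high_dim: 64                 # high capsule embedding vector dimension
    num_iters: 3                 # number of EM iterations
    routing_logits_scale: 20.0   # routing logits scale
    routing_logits_stddev: 1.0   # routing logits initial stddev
    const_caps_num: false        # if false, interest number follows log(seq_len)
  }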
Used in:
Used in:
Used in:
Configuration message for a constant learning rate.
Used in:
Configuration message for a cosine decaying learning rate as defined in utils/learning_schedules.py
Used in:
Used in:
The number of cross layers
Used in:
loss weight for amm_i
loss weight for amm_u
Used in:
Used in:
shared bottom cmbf layer
shared bottom uniter layer
shared bottom dnn layer
mmoe expert dnn layer definition
number of mmoe experts
bayes task tower
l2 regularization
Used in:
Used in:
din attention layer
whether to keep target item feature
option: softmax, sigmoid
Used in:
Used in:
options are: dot and cat
whether a feature will interact with itself
whether to include dense features after interaction
Used in:
hidden units for each layer
ratio of dropout
activation function
use batch normalization
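For reference, a hedged sketch of this dnn message in protobuf text format; values are illustrative:

  dnn {
    hidden_units: [256, 128, 64]     # hidden units for each layer
    dropout_ratio: [0.2, 0.2, 0.2]   # ratio of dropout per layer
    activation: "tf.nn.relu"         # activation function
    use_bn: true                     # use batch normalization
  }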
Used in:
add a layer for scaling the similarity
normalize user_tower_embedding and item_tower_embedding
Used in:
Used in:
add a layer for scaling the similarity
normalize user_tower_embedding and item_tower_embedding
Used in:
Used in:
in json format: {"0":{"cursor": ""}, "1":{"cursor":""}}
offset_time can be in one of two formats: 1) %Y%m%d %H:%M:%S, e.g. "20220508 12:00:00"; 2) %s, e.g. "1651982400"
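Hedged fragments showing the two offset fields above in protobuf text format; only the formats are taken from the comments:

  offset_info: '{"0":{"cursor": ""}, "1":{"cursor":""}}'   # per-partition cursors, json format
  offset_time: "20220508 12:00:00"                         # %Y%m%d %H:%M:%S format
  # or, equivalently, as a unix timestamp:
  offset_time: "1651982400"                                # %s format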
Used in:
mini batch size to use for training and evaluation.
set auto_expand_input_fields to true to auto-expand field[1-21] into field1, field2, ..., field21
label fields, normally only one field is used. For multiple target models such as MMOE multiple label_fields will be set.
label separator
label dimensions; must be set when some labels have dimension > 1
extra transformation functions that generate new labels
whether to shuffle data
shuffle buffer for better performance; even when the shuffle buffer is set, it is suggested to do a full data shuffle before training, especially when model performance is not good
The number of times a data source is read. If set to zero, the data source will be reused indefinitely.
Number of decoded batches to prefetch.
shard the dataset to 1/num_workers in distributed mode; this param is not used anymore
shard by file, not by sample, valid only for CSVInput
separator of column features; only used for CSVInput*, not used in OdpsInput*. Binary separators are supported: CTRL+A can be set as '\001', CTRL+B as '\002', CTRL+C as '\003'. For RTPInput and OdpsRTPInput it is usually set to '\002'
parallel preprocessing of raw data; avoid using too small or too large numbers (suggested to be smaller than the number of cores)
only used for OdpsInput/OdpsInputV2/OdpsRTPInput, comma separated. For RTPInput, selected_cols uses indices as column names, such as '1,2,4', where 1 and 2 are the label columns and 4 is the feature column; columns 0 and 3 are not used
selected col types; only used for OdpsInput/OdpsInputV2 to avoid wrong data type settings
the input fields must match the number and order of the columns in the csv files or odps tables
for RTPInput only
ignore some data errors; it is not suggested to set this parameter
whether to use pai global shuffle queue, only for OdpsInput, OdpsInputV2, OdpsRTPInputV2
if true, one worker will duplicate the data of the chief node and undertake the gradient computation of the chief node
input field for sample weight
the compression type of tfrecord
n data for one feature in tfrecord
for csv files, which may optionally have a header; in that case, input_name must match the header name, and the number and order of input_fields need not be the same as in the csv files
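A condensed data_config sketch tying the fields above together; the field names follow the comments and the public EasyRec samples, and the values are illustrative assumptions:

  data_config {
    batch_size: 1024                  # mini batch size for training and evaluation
    label_fields: "clk"               # multi-target models list several label fields
    num_epochs: 1                     # 0 reuses the data source indefinitely
    prefetch_size: 32                 # number of decoded batches to prefetch
    separator: "\002"                 # CTRL+B, the usual choice for RTPInput/OdpsRTPInput
    auto_expand_input_fields: true    # expands field[1-21] to field1 ... field21
    input_fields {
      input_name: "clk"
      input_type: INT32
    }
    input_fields {
      input_name: "field[1-21]"       # expanded because of the flag above
      input_type: STRING
    }
  }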
Used in:
user-defined function for label. eg: tf.math.log1p, remap_lbl
user-defined function path. eg: /samples/demo_script/process_lbl.py
output field type of user-defined function.
ignore value
Used in:
Used in:
csv format input; can be used locally or on hdfs; supports .gz compression (but not .tar.gz files)
@Deprecated
extended csv format, allows quotes in fields
@Deprecated; has a memory leak problem
odps input, used on pai
for the purpose to debug performance bottleneck of input pipelines
All features are packed into one field for fast copying to gpu, and there is no feature preprocessing step; it is assumed that features are preprocessed before training. Requirements: python3 and tf2.x, due to the multiprocessing spawn and RaggedTensor apis.
Features are not packed, and are preprocessed separately. Requirements: python3 and tf2.x, due to the multiprocessing spawn and RaggedTensor apis.
c++ version of parquet dataset, which is currently only available with deeprec
Used in:
Used in:
deprecated
deprecated
Used in:
use old SyncReplicasOptimizer for ParameterServer training
PSStrategy with multiple gpus on one node does not work on pai-tf; it only works on TF >= 1.15
only works on PaiTF or TF >= 1.15; single worker multiple gpu mode
Deprecated
currently not working well
multi worker multi gpu mode see tf.distribute.experimental.MultiWorkerMirroredStrategy
use horovod strategy
supports kv embedding and kv embedding shard
supports embedding shard; requires horovod
Used in:
for input performance test
Used in:
(message has no fields)
Used in:
Used in:
use embedding cache, only for sok hybrid embedding
for sok hybrid key value embedding
train config, including optimizer, weight decay, num_steps and so on
for compatibility
recommendation model config
Json file [RTP FG] to define input data and features:
* In easy_rec.python.utils.fg_util.load_fg_json_to_config: data_config and feature_config will be generated based on fg_json.
* After generation, a prefix '!' is added: fg_json_path = '!' + fg_json_path, which indicates the config update is already done and should not be applied again. In this way, we make the load_fg_json_to_config function reentrant. This step is done before edit_config_json to take effect.
Used in:
just a name for backbone config
actually input layers; each layer produces a group of features
model parameters
implemented in easy_rec/python/model/easy_rec_estimator; adds regularization to all variables with "embedding_weights:" in their name
filter variables matching any pattern in restore_filters; common filters are Adam, Momentum, etc.
label name for rank_model to select one label between multiple labels
Used in:
Used in:
Message for configuring EasyRecModel evaluation jobs (eval.py).
Used in:
Number of examples to process for evaluation.
How often to run evaluation.
Maximum number of times to run evaluation. If set to 0, will run forever.
Whether the TensorFlow graph used for evaluation should be saved to disk.
Type of metrics to use for evaluation. Possible values:
Online evaluation with the batch forward data of training
Used in:
Used in:
Configuration message for an exponentially decaying learning rate. See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
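A hedged example of selecting this schedule inside the learning_rate one-of; the field names follow the TF Object Detection-style optimizer protos and are assumptions here:

  learning_rate {
    exponential_decay_learning_rate {
      initial_learning_rate: 0.0001
      decay_steps: 10000
      decay_factor: 0.7
      min_learning_rate: 1e-07
    }
  }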
Used in:
Message for configuring exporting models.
Used in:
batch size used for the exported model; -1 indicates batch_size is None, which is only supported by classification models right now, while other models support static batch_size
type of exporter [final | latest | best | none] used with train_and_evaluate:
final: performs a single export at the end of training
latest: regularly exports the serving graph and checkpoints
best: exports the best model according to best_exporter_metric
none: does not perform export
the metric used to determine the best checkpoint
whether a bigger metric value is better
enable early stop
custom early stop function, format: early_stop_func(eval_results, early_stop_params); returns True if training should stop
custom early stop parameters
early stop max check steps
each feature has a placeholder
number of exports to keep; only for exporter_type in [best, latest]
multi value field list
auto analyze multi value fields
whether placeholders are named by input
filter out inputs, only keep effective ones
export the original feature values as string
export the outputs required by RTP
export asset files
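Putting the options above together, a hedged export_config sketch; the field names are inferred from the comments and should be checked against the actual proto:

  export_config {
    batch_size: -1                   # None batch size; classification models only
    exporter_type: "best"            # [final | latest | best | none]
    best_exporter_metric: "auc"      # metric used to pick the best checkpoint
    metric_bigger: true              # a bigger metric value is better
    exports_to_keep: 3               # only for exporter_type in [best, latest]
    enable_early_stop: true
  }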
Used in:
number of experts per task
number of shared experts; for the last extraction_network, no need to configure this
dnn network of experts per task
dnn network of the shared experts; for the last extraction_network, no need to configure this
Used in:
Used in:
Used in:
input field names: must be included in DatasetConfig.input_fields
for categorical_column_with_identity
only for raw features
separator within features
delimiter to separate key from value
delimiter to separate sequence multi-values
truncate sequence data to max_seq_len
many other fields share this config
max number of elements selected in lookup, default 10
max_partitions
combiner
embedding initializer
number of digits kept after the dot when formatting float/double to string; scientific format is not used. By default, converting float/double to string is not allowed
normalize raw feature to [0-1]
normalization function for raw features: such as: tf.math.log1p
raw feature of multiple dimensions
sequence feature combiner
sub feature type for sequence feature
sequence length
for expr feature
embedding variable params
for combo feature: if not set, cross_column is used; otherwise, the input features are first joined and then passed to categorical_column
separator for each input; if not set, combo inputs will not be split
Used in:
Used in:
force place embedding lookup ops on cpu to improve training and inference efficiency.
Used in:
Used in:
Used in:
optional float learning_rate = 1 [default=1e-4];
Used in:
uid field name
reduction method for auc of different users:
* "mean": simple mean of different users
* "mean_by_sample_num": weighted mean with sample num of different users
* "mean_by_positive_num": weighted mean with positive sample num of different users
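As a usage sketch, per-user grouped AUC could be configured as below; the message name gauc and the enclosing metrics_set are assumptions, while uid_field and reduction come from the comments above:

  metrics_set {
    gauc {
      uid_field: "user_id"              # uid field name
      reduction: "mean_by_sample_num"   # weighted mean with per-user sample num
    }
  }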
used in PPNet
Used in:
activation function
use batch normalization
Used in:
(message has no fields)
Weighted Random Sampling ItemID not in Batch and Sampling Hard Edge
Used in:
user data path (columns: userid weight)
item data path (columns: itemid weight attrs)
hard negative edge path (columns: userid itemid weight)
number of negative samples
max number of hard negative samples
field names of attrs in train data or eval data
field name of item_id in train data or eval data
field name of user_id in train data or eval data
only works on DataScience/Local
Weighted Random Sampling ItemID not with Edge and Sampling Hard Edge
Used in:
user data path (columns: userid weight)
item data path (columns: itemid weight attrs)
positive edge path (columns: userid itemid weight)
hard negative edge path (columns: userid itemid weight)
number of negative samples
max number of hard negative samples
field names of attrs in train data or eval data
field name of item_id in train data or eval data
field name of user_id in train data or eval data
only works on DataScience/Local
Used in:
Used in:
hive master's ip
hive port
hive username
hive database
Used in:
if enabled, incremental updates will be saved to model_dir/incr_save/
Used in:
Used in:
Used in:
relative to model_dir
for online inference, please set storage.mount_path to mount_path, otherwise the online service will fail
Used in:
Used in:
Proto with one-of field for initializers.
Used in:
Used in:
Used in:
Used in:
for knowledge distillation
Used in:
default to be logits
for CROSS_ENTROPY_LOSS, soft_label must be logits instead of probs
default to be logits
only for loss_type == CROSS_ENTROPY_LOSS or BINARY_CROSS_ENTROPY_LOSS or KL_DIVERGENCE_LOSS
field name for indicating the sample space for this task
field value for indicating the sample space for this task
the loss weight for samples in the task space
the loss weight for samples outside the task space
Used in:
in json format: {'0':10, '1':20}
offset_time can be in one of two formats: 1) %Y%m%d %H:%M:%S, e.g. '20220508 12:00:00'; 2) %s, e.g. '1651982400'
kafka global config, such as: fetch.max.bytes=1024
kafka topic config, such as: max.partition.fetch.bytes=1024
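A hedged fragment for the kafka-related fields above; offset_info and the two config strings are taken from the comments, but the field names global_config and topic_config are assumptions:

  offset_info: "{'0':10, '1':20}"                 # per-partition offsets, json format
  global_config: "fetch.max.bytes=1024"           # assumed field name for the kafka global config
  topic_config: "max.partition.fetch.bytes=1024"  # assumed field name for the kafka topic config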
Used in:
Configuration proto for L2 Regularizer.
Used in:
Configuration proto for L1 Regularizer.
Used in:
Configuration proto for L2 Regularizer.
Used in:
Used in:
Used in:
Used in:
Configuration message for optimizer learning rate.
Used in:
Used in:
Used in:
Used in:
Used in:
cross-entropy loss / log loss
Used in:
preprocessing dnn before entering capsule layer
dnn layers applied on user_context (non-sequence features)
concat user and capsule dnn
method to combine several user sequences such as item_ids, category_ids
dnn layers applied on item features
similarity power; the paper says the bigger the better
add a layer for scaling the similarity
if smaller than 1.0, a loss will be added to limit the maximal interest similarities; but in experiments, such a loss leads to low hitrate
Used in:
Used in:
hidden units for each layer
ratio of dropout
activation function
use batch normalization
kernel_initializer
Used in:
deprecated: original mmoe experts config
mmoe expert dnn layer definition
number of mmoe experts
task tower
l2 regularization
Used in:
number of tasks
mmoe expert mlp layer definition
number of mmoe experts
Configuration message for a manually defined learning rate schedule.
Used in:
Whether to linearly interpolate learning rates for steps in [0, schedule[0].step].
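A hedged manual_step_learning_rate sketch; the schedule sub-message with step and learning_rate follows the TF Object Detection-style proto, and warmup maps to the linear interpolation described above:

  manual_step_learning_rate {
    initial_learning_rate: 0.01
    schedule { step: 10000 learning_rate: 0.001 }    # lr drops to 0.001 at step 10000
    schedule { step: 50000 learning_rate: 0.0001 }   # and to 0.0001 at step 50000
    warmup: true    # linearly interpolate for steps in [0, schedule[0].step]
  }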
Used in:
Used in:
Used in:
Used in:
Used in:
(message has no fields)
Used in:
(message has no fields)
Used in:
(message has no fields)
configure backbone network common parameters
Used in:
Configuration message for the MomentumOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
Used in:
Used in:
Used in:
The expected shape of an output tensor, besides the batch and sequence dims. If not specified, projects back to the query feature dim (the query input's last dimension).
axes over which the attention is applied.
Used in:
(message has no fields)
Used in:
Used in:
Used in:
Used in:
Used in:
for now, inter_ary_pooling is not supported yet
Weighted Random Sampling ItemID not in Batch
Used in:
sample data path (columns: itemid weight attrs)
number of negative samples
field names of attrs in train data or eval data
field name of item_id in train data or eval data
only works on DataScience/Local
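A hedged negative_sampler sketch based on the fields above and the public EasyRec samples; the path and values are illustrative:

  negative_sampler {
    input_path: "data/taobao/item_gl"   # sample data path: itemid weight attrs
    num_sample: 1024                    # number of negative samples
    attr_fields: "adgroup_id"           # attr field names in train/eval data
    attr_fields: "cate_id"
    item_id_field: "adgroup_id"         # item_id field name
  }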
Used in:
sample data path (columns: itemid weight attrs)
number of negative samples
field names of attrs in train data or eval data
field name of item_id in train data or eval data
only works on DataScience/Local
Weighted Random Sampling ItemID not with Edge
Used in:
user data path (columns: userid weight)
item data path (columns: itemid weight attrs)
positive edge path (columns: userid itemid weight)
number of negative samples
field names of attrs in train data or eval data
field name of item_id in train data or eval data
field name of user_id in train data or eval data
only works on DataScience/Local
Top level optimizer message.
Used in:
Used in:
Used in:
encode user info
encode target item info
encode u2i seq info
produce trigger score
encode trigger item seqs to target item co-occurrence info
produce sim score
direct net user_dnn
direct net item_dnn
for direct net, similar to DSSM
for direct net
bias net dnn
Used in:
extraction network
task tower
l2 regularization
Used in:
run mode: eager, lazy
Used in:
Used in:
Used in:
Used in:
Used in:
Configuration message for a poly decaying learning rate. See https://www.tensorflow.org/api_docs/python/tf/train/polynomial_decay.
Used in:
Used in:
(message has no fields)
Configuration message for the RMSPropOptimizer. See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Used in:
Configuration proto for random normal initializer. See https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
Used in:
Used in:
(message has no fields)
Used in:
(message has no fields)
Used in:
Used in:
Used in:
Proto with one-of field for regularizers.
Used in:
by default, output the list of multiple outputs
Used in:
COSINE = 0; EUCLID = 1;
Used in:
(message has no fields)
Used in:
Used in:
Used in:
Used in:
Percentage of the original sequence length to mask
Percentage of the original sequence left after cropping
Percentage of the original sequence length to reorder
Used in:
Used in:
session id field name
reduction method for auc of different sessions:
* "mean": simple mean of different sessions
* "mean_by_sample_num": weighted mean with sample num of different sessions
* "mean_by_positive_num": weighted mean with positive sample num of different sessions
Used in:
Used in:
Used in:
Used in:
task name for the task tower
label for the task; defaults to label_fields in order
metrics for the task
loss for the task
num_class for multi-class classification loss
task specific dnn
training loss weights
label name for indicating the sample space for the task tower
the loss weight for samples in the task space
the loss weight for samples outside the task space
multiple losses
whether to use sample weight in this tower
field name for indicating the sample space for this task
field value for indicating the sample space for this task
Used in:
Used in:
Used in:
Message for configuring EasyRecModel training jobs (train.py). Next id: 25
Used in:
optimizer options
If greater than 0, clips gradients by this value.
Number of steps to train the models: if 0, will train the model indefinitely.
Checkpoint to restore variables from.
Whether to synchronize replicas during training. If so, a SyncReplicasOptimizer is built
only takes effect on pai-tf when sync_replicas is set; options are: raw, hash, multi_map, list, parallel. In general, multi_map runs faster than the other options
Number of training steps between replica startup. This flag must be set to 0 if sync_replicas is set to true.
Step interval for saving checkpoint
Seconds interval for saving checkpoint
Max checkpoints to keep
Save summaries every this many steps.
The frequency at which global step/sec and the loss are logged during training.
profiling or not
if a variable's shape is incompatible, clip or pad the variable from the checkpoint
DistributionStrategy; available values are 'mirrored', 'collective' and 'ess':
- mirrored: MirroredStrategy, single machine and multiple devices
- collective: CollectiveAllReduceStrategy, multiple machines and multiple devices
Number of gpus per machine
summary model variables or not
distribute training protocol [grpc++ | star_server]
grpc++: https://help.aliyun.com/document_detail/173157.html?spm=5176.10695662.1996646101.searchclickresult.3ebf450evuaPT3
star_server: https://help.aliyun.com/document_detail/173154.html?spm=a2c4g.11186623.6.627.39ad7e3342KOX4
inter_op_parallelism_threads
intra_op_parallelism_threads
tensor fusion on PAI-TF
whether to write the graph into graph.pbtxt and summaries
match variable patterns to freeze
increment save config
enable oss stop signal; stop by creating OSS_STOP_SIGNAL under model_dir
stop training after dead_line time, format: 20220508 23:59:59
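A hedged train_config sketch combining the common fields above; the nesting of optimizer_config follows the public EasyRec samples, and all values are illustrative:

  train_config {
    optimizer_config {
      adam_optimizer {
        learning_rate {
          exponential_decay_learning_rate {
            initial_learning_rate: 0.0001
            decay_steps: 100000
            decay_factor: 0.5
          }
        }
      }
    }
    sync_replicas: true            # builds a SyncReplicasOptimizer
    num_steps: 100000              # 0 trains indefinitely
    save_checkpoints_steps: 2000   # step interval for saving checkpoints
    keep_checkpoint_max: 10        # max checkpoints to keep
    log_step_count_steps: 100      # logging frequency for global step/sec and loss
  }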
Used in:
Size of the encoder layers and the pooler layer
Number of hidden layers in the Transformer encoder
Number of attention heads for each attention layer in the Transformer encoder
The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder
The non-linear activation function (function or string) in the encoder and pooler.
The dropout probability for all fully connected layers in the embeddings, encoder, and pooler
The maximum sequence length that this model might ever be used with
Whether to add position embeddings for the position of each token in the text sequence
Whether to output all token embeddings; if set to false, only the first token embedding is output
The dropout ratio for the attention probabilities
Used in:
Configuration proto for truncated normal initializer. See https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
Used in:
Used in:
Used in:
Size of the encoder layers and the pooler layer
Number of hidden layers in the Transformer encoder
Number of attention heads for each attention layer in the Transformer encoder
The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder
The non-linear activation function (function or string) in the encoder and pooler.
"gelu", "relu", "tanh" and "swish" are supported.
The dropout probability for all fully connected layers in the embeddings, encoder, and pooler
The dropout ratio for the attention probabilities
The maximum sequence length that this model might ever be used with
Whether to add position embeddings for the position of each token in the text sequence
The stddev of the truncated_normal_initializer for initializing all weight matrices
dnn layers for other features
Used in:
regularization coefficient lambda
variational_dropout dimension
used in CDN model
Used in:
Used in:
if set, the outputs of the dnn and wide parts are concatenated and passed to the final_dnn; otherwise, they are summed
Used in: