Proto commits in alibaba/TorchEasyRec

These are the commits in which the Protocol Buffers files changed (only the last 100 relevant commits are shown):

Commit:52002e2
Author:Eric
Committer:GitHub

[feat] add model delta tracker (#546)

The documentation is generated from this commit.

Commit:3d4d5a8
Author:ShuQi
Committer:GitHub

[feat] SID: add SidRqkmeans model (FAISS-trained residual K-Means) (#539) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Commit:7886e4c
Author:Hongsheng Jin
Committer:GitHub

[feat] Kafka: event-time driven checkpointing from message timestamp (#541) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Commit:fbd47be
Author:天邑

Merge remote-tracking branch 'origin/master' into feat/ultra-hstu-fp8
# Conflicts:
# tzrec/version.py

Commit:da99c02
Author:天邑

[fix] ULTRA-HSTU FP8: narrow the arch gate to SM90 + SM120-mode2
The previous "SM90+" gate was too permissive:
- SM100 (Blackwell datacenter) has no FP8 kernel in the wheel; the dispatcher routes there via _sm100.hstu_varlen_fwd_100 which doesn't even take quant_mode (cuda_hstu_attention.py:399-403).
- SM120 (Blackwell RTX) only handles quant_mode==2 (per-block, fwd-only, cuda_hstu_attention.py:282); for any other mode the wheel silently falls into the sm80 bf16/fp16 branch (line 308's `or major_version == 12`) -- the user gets non-FP8 attention with no warning.
Tighten _assert_fp8_capable to accept exactly (sm90, any mode) or (sm120, mode=2), and reject everything else loudly. Pass fp8_quant_mode into the helper so it can mode-check on sm120.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
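The acceptance rule stated in this commit message can be sketched roughly as follows. This is an illustration only, not the tzrec code: the names `fp8_gate_allows` and `assert_fp8_gate`, the integer `sm` argument (90, 100, 120), and the error text are all assumptions made for the example.

```python
def fp8_gate_allows(sm: int, fp8_quant_mode: int) -> bool:
    """Return True if the (arch, mode) pair is accepted by the narrowed gate.

    Hypothetical helper: `sm` is the architecture number (e.g. 90 for SM90).
    """
    if sm == 90:
        # SM90 (Hopper): any FP8 mode is accepted.
        return True
    if sm == 120:
        # SM120 (Blackwell RTX): only quant mode 2 is accepted.
        return fp8_quant_mode == 2
    # Everything else (including SM100) is rejected.
    return False


def assert_fp8_gate(sm: int, fp8_quant_mode: int) -> None:
    """Fail loudly instead of silently falling back to non-FP8 attention."""
    if not fp8_gate_allows(sm, fp8_quant_mode):
        raise RuntimeError(
            f"FP8 attention not supported for sm{sm} with "
            f"fp8_quant_mode={fp8_quant_mode}"
        )
```

Calling `assert_fp8_gate(100, 2)` raises in this sketch, which mirrors the "reject everything else loudly" intent of the message.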

Commit:90d592a
Author:天邑

[feat] ULTRA-HSTU FP8: extend support from SM90 to SM90+ (Blackwell)
Relax the FP8 capability gate from "exactly SM90 (Hopper)" to "SM90+" so the same fp8_quant_mode>=0 path also runs on sm100 (Blackwell) and sm120 (Blackwell RTX). The wheel dispatches to its per-arch FP8 kernel internally (sm120/Blackwell RTX is forward-only and supports only quant_mode=2; that constraint surfaces from the wheel's own check, not tzrec's).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:f8ac3b3
Author:Hongsheng Jin
Committer:GitHub

[feat] support keep_checkpoint_max with async checkpoint pruning (#528) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:b9dd23e
Author:天邑

[feat] ULTRA-HSTU: add STU.fp8_quant_mode proto field
Add an int32 `fp8_quant_mode` (default -1) to the STU message. -1 keeps attention in bf16/fp16; 0..5 select an FP8 mode forwarded to the CUTLASS (SM90/Hopper) kernel. Mirrors the wheel's quant_mode int and the existing scaling_seqlen=-1 sentinel style.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
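The sentinel convention described here (-1 means "off", 0..5 select an FP8 mode) might be interpreted along these lines. The function name `uses_fp8_attention` is hypothetical, and rejecting out-of-range values is an assumption of this sketch rather than something the commit message states.

```python
def uses_fp8_attention(fp8_quant_mode: int = -1) -> bool:
    """Interpret the fp8_quant_mode value from the commit message above.

    -1 (the default) keeps attention in bf16/fp16; 0..5 select an FP8 mode.
    """
    if fp8_quant_mode == -1:
        return False
    if 0 <= fp8_quant_mode <= 5:
        return True
    # Assumption: any other value is treated as a configuration error.
    raise ValueError(f"unsupported fp8_quant_mode: {fp8_quant_mode}")
```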

Commit:f67ec93
Author:Hongsheng Jin
Committer:GitHub

[refactor] HSTUMatch with STUStack + UIHPreprocessor + block-suffix candidates (#506) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:3a2a589
Author:Hongsheng Jin
Committer:GitHub

[refactor] MatchTowerWoEG accepts feature_groups (plural); DSSMTower takes EmbeddingGroup (#510) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:9ae727f
Author:Hongsheng Jin
Committer:GitHub

[feat] metrics: NormalizedEntropy for binary classification (#507) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:98855ee
Author:Hongsheng Jin
Committer:GitHub

[refactor] dataset/sampler: dynamic expand_factor + build_sampler_input + block-suffix combine (#505) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:37d7fc4
Author:Hongsheng Jin
Committer:GitHub

[feat] FeatureGroupConfig.embedding_name_suffix to break embedding sharing across groups (#504) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:a446e12
Author:Hongsheng Jin
Committer:GitHub

[bugfix] thread contextual_seq_len from preprocessor to STULayer (proto sentinel + truncation total_uih_len + AOTI-friendly SLA builder) (#501) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:c7de8a9
Author:Hongsheng Jin
Committer:GitHub

[feat] stu.scaling_seqlen + drop autotune assert strip (#500) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:fa39911
Author:Hongsheng Jin
Committer:GitHub

[feat] Adadelta + RMSprop sparse and dense optimizers (#499) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:5993c45
Author:Hongsheng Jin
Committer:GitHub

[perf] AOTI export knobs: fp32 unbacked floats, sample-input autotune, TF32 from export_config (#498) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:f2d0116
Author:Hongsheng Jin
Committer:GitHub

[feat] ULTRA-HSTU Mixture of Transducers (MoT) (#492) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:03ec5e6
Author:Hongsheng Jin
Committer:GitHub

[feat] ULTRA-HSTU mid-stack attention truncation (#488) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:401cb29
Author:Hongsheng Jin
Committer:GitHub

[feat] Semi-Local Attention + selective activation rematerialization (#486) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit:8327341
Author:Hongsheng Jin
Committer:GitHub

[feat] support TokenizeFeature as token-level sequence input (#470) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:b203ece
Author:Hongsheng Jin
Committer:GitHub

[feat] add CUTLASS kernel backend for HSTU attention (#465) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:6673e00
Author:Hongsheng Jin
Committer:GitHub

[feat] integrate dynamicemb table fusion (wheel 20260407.97b80bf) (#466) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:8cadac0
Author:Hongsheng Jin
Committer:GitHub

[feat] add concat_contextual_features option to DlrmHSTU (#459)
When enabled, all contextual features are concatenated on the channel dimension and projected as a single token instead of N separate tokens, reducing HSTU attention cost by shortening sequence length.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:a0d6e8a
Author:Hongsheng Jin
Committer:GitHub

[feat] add per-task loss weight to FusionSubTaskConfig (#453) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:8c2bdec
Author:Hongsheng Jin
Committer:GitHub

[feat] add CosineAnnealingLR and CosineAnnealingWarmRestartsLR schedules (#454) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:5351e3f
Author:Hongsheng Jin
Committer:GitHub

[feat] add CombineFeature support (#447) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Commit:1b33f24
Author:Hongsheng Jin
Committer:GitHub

[feat] add label_smoothing support to BinaryCrossEntropy loss (#455)
Label smoothing helps with noisy click labels and improves generalization in ranking models. Smooths hard binary labels using the standard formula: label * (1 - eps) + 0.5 * eps, consistent with PyTorch CrossEntropyLoss.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
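As a small worked example of the formula quoted in this commit message, the sketch below applies it to scalar labels. The function name `smooth_binary_label` and the range check on `eps` are assumptions for illustration; the actual loss implementation is not shown in this log.

```python
def smooth_binary_label(label: float, eps: float) -> float:
    """Apply label * (1 - eps) + 0.5 * eps to a hard binary label."""
    if not 0.0 <= eps <= 1.0:
        # Assumption: eps is restricted to [0, 1] in this sketch.
        raise ValueError("eps must be in [0, 1]")
    return label * (1.0 - eps) + 0.5 * eps


# With eps = 0.1, positives move slightly below 1 and negatives slightly above 0.
positive = smooth_binary_label(1.0, 0.1)
negative = smooth_binary_label(0.0, 0.1)
```

With `eps = 0` the label is returned unchanged, and with `eps = 1` both classes collapse to 0.5.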

Commit:f93d73d
Author:Hongsheng Jin
Committer:GitHub

[feat] add grad clipping for dense params (#424)

Commit:1e7922b
Author:Hongsheng Jin
Committer:GitHub

[feat] support input fields str (#412)

Commit:777f1e8
Author:chengaofei
Committer:GitHub

[feat] support pepnet (#402)

Commit:999cbe5
Author:Hongsheng Jin
Committer:GitHub

[feat] add kafka dataset (#401)

Commit:4d1dcc5
Author:Hongsheng Jin
Committer:GitHub

[feat] dlrm hstu support sequence timestamp is descending order (#395)

Commit:0693de6
Author:Hongsheng Jin
Committer:GitHub

[bugfix] make sequence related config optional (#390)

Commit:4ca5ed5
Author:chengaofei
Committer:GitHub

[feat] dlrm and wukong support only one sparse group (#385)

Commit:f5776db
Author:Hongsheng Jin
Committer:GitHub

[feat] support sequence cross features (#375)

Commit:0f2f0dd
Author:chengaofei
Committer:GitHub

[feat] support pe ltr in train wrapper (#381)

Commit:f138b22
Author:Hongsheng Jin
Committer:GitHub

[feat] support initial_accumulator_value for FusedSparseAdagradOptimizer & add additional optimizer configuration options (#382)

Commit:b926ac1
Author:chengaofei
Committer:GitHub

[feat] add wukong model (#372)

Commit:93c6b46
Author:Hongsheng Jin
Committer:GitHub

[feat] add AdmissionStrategy support for DynamicEmbedding (#362)

Commit:c8e5561
Author:Hongsheng Jin
Committer:天邑

[feat] add time_bucket_increments for DlrmHSTU PositionEncoder (#359)

Commit:9f6d8d7
Author:Hongsheng Jin
Committer:GitHub

[feat] add time_bucket_increments for DlrmHSTU PositionEncoder (#359)

Commit:a7271a9
Author:Hongsheng Jin
Committer:GitHub

[feat] mtl tower of dlrm hstu support num_class > 1 (#352)

Commit:3d2a4a8
Author:Hongsheng Jin
Committer:GitHub

[feat] support dynamic batch size with sample cost (#343)

Commit:c4a4944
Author:Eric Ge
Committer:GitHub

[feat] mind dynamic routing support zero init (#342)

Commit:7165db6
Author:Hongsheng Jin
Committer:GitHub

[feat] add TMA support for hstu attn & rms_norm test (#336)

Commit:906f0ce
Author:Hongsheng Jin
Committer:GitHub

[feat] add global average loss option for DlrmHSTU (#334)

Commit:a16a85a
Author:Eric Ge
Committer:GitHub

[feat] TensorRT export (#318)

Commit:c8fbc3f
Author:Hongsheng Jin
Committer:GitHub

[feat] refactor dlrm hstu preprocess modules (#314)

Commit:3348869
Author:chengaofei
Committer:GitHub

[feat] support log training metric (#310)

Commit:bdf615f
Author:chengaofei
Committer:GitHub

[feat] support adamw optimizer and part optimizer and label soomthing (#297)

Commit:5af80ff
Author:chengaofei
Committer:GitHub

[feat] export best model (#294)

Commit:cf2e73b
Author:Hongsheng Jin
Committer:GitHub

[feat] add contextual_feature_to_pooling to DlrmHSTU preprocessors (#296)

Commit:a1247d9
Author:Hongsheng Jin
Committer:GitHub

[feat] make dlrmhstu watchtime feature optional (#290)

Commit:77daba0
Author:chengaofei
Committer:GitHub

[feat] support bool mask feature (#285)

Commit:b312efd
Author:Hongsheng Jin
Committer:GitHub

[feat] add dynamicemb doc (#283)

Commit:25dd248
Author:Hongsheng Jin
Committer:GitHub

[feat] add tool to initialize dynamic embeddings from tables (#282)

Commit:30865d1
Author:Hongsheng Jin
Committer:GitHub

[bugfix] revert dynamic embedding hash bucket size and remove unused evict_strategy (#281)

Commit:1a61fd1
Author:Hongsheng Jin
Committer:GitHub

[feat] add dynamic embedding support (#279)

Commit:a75b5b7
Author:Hongsheng Jin
Committer:GitHub

[feat] add kv dot product feature (#276)

Commit:a580a28
Author:Eric Ge
Committer:GitHub

[feat] support xauc and grouped xauc (#252)

Commit:2627da0
Author:chengaofei
Committer:GitHub

[feat] add sequence self attention encoder (#251)

Commit:3779c0e
Author:chengaofei
Committer:GitHub

[feat] add dcnv2 and xdeepfm net (#242)

Commit:ce67118
Author:Eric Ge
Committer:GitHub

[feat] dcn_v1 (#235)

Commit:74ef405
Author:Hongsheng Jin
Committer:GitHub

[feat] refine hstu ops & add triton tests for dlrm hstu (#231)

Commit:525ce95
Author:Hongsheng Jin
Committer:GitHub

[feat] add hstu rank model (#227)

Commit:be56886
Author:Hongsheng Jin
Committer:GitHub

[feat] oss dlrm hstu modules (#224)

Commit:df65528
Author:Hongsheng Jin
Committer:GitHub

[feat] add use_ln option for MLP module & fix parse encoded sequence feature error msg (#223)

Commit:ee97015
Author:Hongsheng Jin
Committer:GitHub

[feat] add fp16 embedding dtype and fix weight decay mode (#221)

Commit:d270e1f
Author:Hongsheng Jin
Committer:GitHub

[feat] add mixed_precision bf16/fp16 and gradient accumulation support (#220)

Commit:70208df
Author:Hongsheng Jin
Committer:GitHub

[feat] support feature only used as fg dag intermediate result (stub_type=true) (#218)

Commit:efda7f5
Author:Hongsheng Jin
Committer:GitHub

[feat] expr feature support value_dim & bump up pyfg to 0.7.1 (#216)

Commit:91a4847
Author:Hongsheng Jin
Committer:GitHub

[feat] add wide and deep model and wide_init_fn (#212)

Commit:4c3ae6c
Author:Hongsheng Jin
Committer:GitHub

[feat] upgrade pyfg to 0.6.9 and refine expr/overlap feature doc (#199)

Commit:10af7d3
Author:Hongsheng Jin
Committer:GitHub

[feat] support freeze embedding parameters (#206)

Commit:4d3ac46
Author:Eric Ge
Committer:GitHub

[feat] add binary focal loss (#208)

Commit:bc6bfbe
Author:Hongsheng Jin
Committer:GitHub

[feat] add allow_tf32 flag and global embedding param constraint (#188)

Commit:46947a6
Author:Hongsheng Jin
Committer:GitHub

[feat] add masknet for dbmtl and refine masknet logic (#187)

Commit:18ee4d6
Author:Hongsheng Jin
Committer:GitHub

[feat] add max sequence length for sequence encoder (#184)

Commit:be8da4b
Author:Eric Ge
Committer:GitHub

[feat] write tensorboard log for model parameters (#181)

Commit:4d59215
Author:Eric Ge
Committer:GitHub

[feat] masknet (#179)

Commit:49a7f73
Author:Hongsheng Jin
Committer:GitHub

[feat] add fg value_type config and make num_buckets default value_dtype as string (#175)

Commit:92ad14b
Author:Eric Ge
Committer:GitHub

[feat] optimize mind model (#157)
- optimize creation and scaling for the routing_logit tensor
- optimize the iteration of dynamic routing, capturing gradient after iteration
- adjust MindUserTower's MLP modules, the inner layers and output layers are extracted separately
- add bias hyper-parameter for the MLP module. For sequence feature, bias is not used

Commit:75f3c47
Author:Hongsheng Jin
Committer:GitHub

[feat] add kernel config and BaseModule (#151)

Commit:4f833b4
Author:Hongsheng Jin
Committer:GitHub

[feat] add regression and multi-classification metric (#149)

Commit:ec50459
Author:chengaofei
Committer:GitHub

[feat] support dlrm model (#148)

Commit:caa27b5
Author:Hongsheng Jin
Committer:GitHub

[feat] add custom feature and custom sequence feature (#144)

Commit:bf05427
Author:iWelkin-coder
Committer:GitHub

[feat] Optimize HSTU training and sampling process (#93)

Commit:e8989ec
Author:Hongsheng Jin
Committer:GitHub

[feat] add odps_data_compression config (#146)

Commit:3fa13c4
Author:chengaofei
Committer:GitHub

[feat] add rocket launching model (#129)

Commit:62a90da
Author:Eric Ge
Committer:GitHub

[feat] add mind model (#119)

Commit:457da32
Author:Hongsheng Jin
Committer:GitHub

[feat] eval and save checkpoint by epoch (#116)

Commit:ae00b33
Author:Hongsheng Jin
Committer:GitHub

[feat] support dataset shuffle (#114)

Commit:3bee923
Author:Hongsheng Jin
Committer:GitHub

[feat] add vocab file for features (#97)

Commit:d382062
Author:Hongsheng Jin
Committer:GitHub

[feat] make default bucketize value configurable (#94)

Commit:4bcbee6
Author:iWelkin-coder
Committer:GitHub

[feat] add hstu (#55)

Commit:cdba485
Author:Eric Ge
Committer:GitHub

[feat] add dual augmented two-tower model (#83)

Commit:1b48405
Author:chengaofei
Committer:GitHub

[feat] add task space for mtl loss (#82)

Commit:00a24e4
Author:Hongsheng Jin
Committer:GitHub

[feat] refactor embedding group input tile and dense embedding collection (#75)
* refactor embedding group input tile and dense embedding collection
* fix tests
* refactor proto and add tests
* refactor proto and add tests
* fix tests
* fix tests
* fix tests
* add docs

Commit:dfd2051
Author:Eric Ge
Committer:GitHub

[feat] support Autodis and MLP embedding for raw features (#73)