13 Commits

Author       SHA1        Message                                                        Date
jingyaogong  1279a61681  [update] prompt prefill                                        2026-01-13 17:46:54 +08:00
jingyaogong  9d898576ac  [update] aux loss                                              2026-01-01 22:41:46 +08:00
jingyaogong  c65335b56f  [fix] experts unused                                           2025-12-31 21:47:04 +08:00
jingyaogong  5129f0e2a2  [fix] dtype & lr                                               2025-12-09 13:01:38 +08:00
jingyaogong  ecd1ae1563  [fix] reduce aux_loss_alpha                                    2025-12-05 23:08:29 +08:00
jingyaogong  151fdf7e76  [feat] update yarn                                             2025-12-01 16:15:05 +08:00
jingyaogong  6b86ea399a  [feat] release memory                                          2025-11-27 19:39:49 +08:00
jingyaogong  f5374dc87f  [fix] model attn_mask                                          2025-11-19 22:26:53 +08:00
jingyaogong  a044578d73  [fix] update model                                             2025-11-18 13:07:20 +08:00
yuyu5333     7d02ce673c  fix: attn_forwad when is_causal=True assert attn_mask is None  2025-11-18 03:17:17 +00:00
jingyaogong  4e35fb9da8  [fix] update model                                             2025-10-17 00:09:32 +08:00
jingyaogong  29454c31af  fix bugs                                                       2025-04-27 09:56:49 +08:00
jingyaogong  a62faf34bd  250426                                                         2025-04-26 10:05:47 +08:00