whitesword
|
3a18fdd666
|
Fix: support loading DDP-saved LoRA weights for inference
|
2025-12-22 20:50:25 +08:00 |
|
jingyaogong
|
fe24501602
|
[feat] adjust seq length
|
2025-12-14 20:41:58 +08:00 |
|
jingyaogong
|
fa82707c9c
|
[feat] update readme
|
2025-12-11 15:45:50 +08:00 |
|
jingyaogong
|
5129f0e2a2
|
[fix] dtype & lr
|
2025-12-09 13:01:38 +08:00 |
|
jingyaogong
|
aa7dc0f61e
|
Merge pull request #571 from dyhuachi/dyhuachi-patch-1
[fix] Refactor get_lr function to include min_lr calculation
|
2025-12-09 12:59:11 +08:00 |
|
dyhuachi
|
bf3878ace8
|
[fix] Refactor get_lr function to include min_lr calculation
The annealing schedule here makes the starting lr 1.1 times the lr passed in the arguments, so the following change is made.
|
2025-12-06 17:09:51 +08:00 |
|
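The patch itself is not shown in this log, so the following is only a sketch of what the `get_lr` commit message describes. It assumes a cosine schedule with a floor of `lr / 10`, which is inferred from the 1.1x figure (1 + 0.1) rather than confirmed by the source. The name `get_lr_old` and the argument names are hypothetical.

```python
import math


def get_lr_old(current_step, total_steps, lr):
    # Assumed pre-fix form: the lr/10 floor is added on top of a cosine term
    # that already peaks at the full lr, so step 0 gives lr/10 + lr = 1.1 * lr.
    return lr / 10 + 0.5 * lr * (1 + math.cos(math.pi * current_step / total_steps))


def get_lr(current_step, total_steps, lr):
    # Assumed post-fix form: compute min_lr explicitly and anneal only the
    # gap between lr and min_lr, so the schedule starts at exactly lr.
    min_lr = lr / 10  # assumed floor, inferred from the 1.1x figure
    cosine = 0.5 * (1 + math.cos(math.pi * current_step / total_steps))
    return min_lr + (lr - min_lr) * cosine


if __name__ == "__main__":
    base_lr, steps = 1e-3, 100
    for step in (0, steps // 2, steps):
        print(step, get_lr_old(step, steps, base_lr), get_lr(step, steps, base_lr))
```

Under these assumptions both versions end at `lr / 10`, but only the second starts at the configured `lr`.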
jingyaogong
|
ecd1ae1563
|
[fix] reduce aux_loss_alpha
|
2025-12-05 23:08:29 +08:00 |
|
jingyaogong
|
5e1447b913
|
[fix] cuda memory #559
|
2025-12-01 16:17:43 +08:00 |
|
jingyaogong
|
151fdf7e76
|
[feat] update yarn
|
2025-12-01 16:15:05 +08:00 |
|
jingyaogong
|
6b86ea399a
|
[feat] release memory
|
2025-11-27 19:39:49 +08:00 |
|
jingyaogong
|
d7f4f4eab8
|
[fix] ppo mask
|
2025-11-19 23:39:02 +08:00 |
|
jingyaogong
|
f5374dc87f
|
[fix] model attn_mask
|
2025-11-19 22:26:53 +08:00 |
|
jingyaogong
|
a044578d73
|
[fix] update model
|
2025-11-18 13:07:20 +08:00 |
|
jingyaogong
|
ce9394670b
|
Merge pull request #536 from yuyu5333/fix/attn_forward
fix: attn_forward when is_causal=True assert attn_mask is None
|
2025-11-18 13:02:46 +08:00 |
|
yuyu5333
|
7d02ce673c
|
fix: attn_forward when is_causal=True assert attn_mask is None
|
2025-11-18 03:17:17 +00:00 |
|
jingyaogong
|
9c98cabc9a
|
[fix] prompt length calculate
|
2025-11-15 18:25:37 +08:00 |
|
jingyaogong
|
f3441b0078
|
Merge pull request #528 from wangzhaode/feat/add_mnn_support
[feat] add MNN support to README.
|
2025-11-10 22:46:15 +08:00 |
|
yanxing
|
5959396096
|
[feat] add MNN support to README.
|
2025-11-10 21:59:22 +08:00 |
|
jingyaogong
|
bf60bde8fb
|
[fix] model-name
|
2025-11-07 19:38:20 +08:00 |
|
jingyaogong
|
81e869fc3e
|
[fix] harmonize template
|
2025-11-06 13:14:08 +08:00 |
|
jingyaogong
|
509d8dacf1
|
[feat] clear cache
|
2025-11-06 13:12:28 +08:00 |
|
jingyaogong
|
8a0b04ed82
|
[fix] harmonize template
|
2025-11-02 23:18:11 +08:00 |
|
jingyaogong
|
0323815729
|
[feat] update import
|
2025-10-31 23:45:55 +08:00 |
|
jingyaogong
|
8d71754e05
|
[feat] update readme
|
2025-10-30 23:39:25 +08:00 |
|
jingyaogong
|
d8ac558ce2
|
[feat] update readme
|
2025-10-30 23:30:12 +08:00 |
|
jingyaogong
|
e4807a5214
|
[feat] update readme
|
2025-10-30 23:27:15 +08:00 |
|
jingyaogong
|
de23e1ea39
|
[feat] update datasets
|
2025-10-30 11:08:13 +08:00 |
|
jingyaogong
|
08ce3da228
|
[feat] update args
|
2025-10-30 10:48:31 +08:00 |
|
jingyaogong
|
bf123b585d
|
[feat] add args
|
2025-10-30 10:05:12 +08:00 |
|
jingyaogong
|
800fed4639
|
[feat] update readme
|
2025-10-29 12:13:25 +08:00 |
|
jingyaogong
|
1713c24114
|
[fix] model device
|
2025-10-29 10:36:28 +08:00 |
|
jingyaogong
|
acd5925193
|
[feat] update trainer
|
2025-10-29 00:52:37 +08:00 |
|
jingyaogong
|
8f7e07b8ef
|
[feat] update trainer
|
2025-10-28 23:30:10 +08:00 |
|
jingyaogong
|
eb96113cd4
|
[feat] update readme
|
2025-10-27 16:31:40 +08:00 |
|
jingyaogong
|
cc81925ecc
|
[feat] update readme
|
2025-10-26 18:52:50 +08:00 |
|
jingyaogong
|
35bed8936d
|
[feat] update eval-llm
|
2025-10-26 18:52:01 +08:00 |
|
jingyaogong
|
e8484874f5
|
[feat] pause-training
|
2025-10-26 18:49:52 +08:00 |
|
jingyaogong
|
6efba3249a
|
[feat] update readme
|
2025-10-24 01:18:33 +08:00 |
|
jingyaogong
|
ea2abb5fb3
|
[feat] update readme
|
2025-10-24 00:45:11 +08:00 |
|
jingyaogong
|
c4e9789d7e
|
[feat] update requirements
|
2025-10-23 23:54:55 +08:00 |
|
jingyaogong
|
4276f8a72f
|
[feat] update requirements
|
2025-10-23 23:50:21 +08:00 |
|
jingyaogong
|
d762926f48
|
[feat] update readme
|
2025-10-23 23:23:52 +08:00 |
|
jingyaogong
|
09be7dba11
|
[feat] update readme
|
2025-10-23 20:23:46 +08:00 |
|
jingyaogong
|
fa6df82ff8
|
[feat] repetition-penalty
|
2025-10-23 20:23:25 +08:00 |
|
jingyaogong
|
28cc44579a
|
[feat] convert2llama
|
2025-10-23 20:22:42 +08:00 |
|
jingyaogong
|
a82526da11
|
[feat] shuffle data
|
2025-10-23 20:13:28 +08:00 |
|
jingyaogong
|
805744e60a
|
[fix] loss-issues-430
|
2025-10-23 19:08:42 +08:00 |
|
jingyaogong
|
4014e62cdf
|
[fix] restore
|
2025-10-23 19:00:06 +08:00 |
|
jingyaogong
|
557bcc018d
|
[fix] issue-431
|
2025-10-23 18:56:30 +08:00 |
|
jingyaogong
|
463044e92a
|
[fix] sampler-ddp
|
2025-10-23 15:03:19 +08:00 |
|