nvxuanyuc
|
d1398c05e6
|
[None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
|
2025-10-27 13:12:31 -04:00 |
|
wili
|
eba3623a54
|
Feat: Variable-Beam-Width-Search (VBWS) part4 (#3979)
* feat/vbws-part4-v1.8: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* feat/vbws-part4-v1.9: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.1: remove useless variables
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.2: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.3: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.4: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.5: remove API change
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
---------
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
|
2025-05-12 22:32:29 +02:00 |
|
Yan Chunwei
|
0c26059703
|
chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732)
* beam_width and max_new_token
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove beam_width
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove min_length
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove return_num_sequences
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-05-07 13:20:25 +08:00 |
|
wili
|
34e63d07e6
|
feat: Variable-Beam-Width-Search (VBWS) Part2 (#3133)
* feat: Variable-Beam-Width-Search Part2
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search Part2
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search Part2, fix CPP tests
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search Part3, simplify CPP tests
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search Part4, move beam_width_array param
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search, fix CI error
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search part2
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search part2
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search part2, fix pre-commit
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
* feat: Variable-Beam-Width-Search part2, fix review
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
---------
Signed-off-by: wili-65535 <wili-65535@user.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@user.noreply.github.com>
|
2025-04-02 12:31:28 +08:00 |
|
wili
|
3e035f2219
|
v1.2 (#3082)
Signed-off-by: wili <wili@nvidia.com>
|
2025-03-26 23:31:29 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|