wili
|
eba3623a54
|
Feat: Variable-Beam-Width-Search (VBWS) part4 (#3979)
* feat/vbws-part4-v1.8: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* feat/vbws-part4-v1.9: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.1: remove useless variables
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.2: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.3: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.4: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.5: remove API change
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
---------
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
|
2025-05-12 22:32:29 +02:00 |
|
Robin Kobus
|
94dd456bd0
|
refactor: Remove speculative decoding parameters from stateful decoders (#3024)
Simplify StatefulGptDecoderBatched constructor:
- Remove speculative decoding mode parameter
- Initialize with default mode=None
- Update GptSession class accordingly
Simplify setup method signatures in StatefulGptDecoder and StatefulGptDecoderBatched:
- Remove maxTokensPerStep parameter
- Initialize decoders with default maxTokensPerStep=1
- Update GptSession class accordingly
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-03-26 20:16:26 +08:00 |
|
Kaiyu Xie
|
3aa6b11d13
|
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
|
2025-03-18 21:25:19 +08:00 |
|
Kaiyu Xie
|
ab5b19e027
|
Update TensorRT-LLM (#2820)
|
2025-02-25 21:21:49 +08:00 |
|
Dan Blanaru
|
16d2467ea8
|
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
|
2025-02-11 03:01:00 +00:00 |
|
Kaiyu Xie
|
b7868dd1bd
|
Update TensorRT-LLM (#2413)
|
2024-11-05 16:27:06 +08:00 |
|
石晓伟
|
32ed92e449
|
Update TensorRT-LLM
Co-authored-by: Rong Zhou <130957722+ReginaZh@users.noreply.github.com>
Co-authored-by: Onur Galoglu <33498883+ogaloglu@users.noreply.github.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
|
2024-08-20 18:55:15 +08:00 |
|
Kaiyu Xie
|
74b324f667
|
Update TensorRT-LLM (#2110)
|
2024-08-13 22:34:33 +08:00 |
|
Kaiyu Xie
|
be9cd719f7
|
Update TensorRT-LLM (#2094)
* Update TensorRT-LLM
---------
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Tayef Shah <tayefshah@gmail.com>
Co-authored-by: lfz941 <linfanzai941@gmail.com>
|
2024-08-07 16:44:43 +08:00 |
|
Kaiyu Xie
|
bca9a33b02
|
Update TensorRT-LLM (#2008)
* Update TensorRT-LLM
---------
Co-authored-by: Timur Abishev <abishev.timur@gmail.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
Co-authored-by: Saeyoon Oh <saeyoon.oh@furiosa.ai>
Co-authored-by: hattizai <hattizai@gmail.com>
|
2024-07-23 23:05:09 +08:00 |
|
Kaiyu Xie
|
b777bd6475
|
Update TensorRT-LLM (#1725)
* Update TensorRT-LLM
---------
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Tlntin <TlntinDeng01@Gmail.com>
Co-authored-by: ZHENG, Zhen <zhengzhen.z@qq.com>
Co-authored-by: Pham Van Ngoan <ngoanpham1196@gmail.com>
Co-authored-by: Nathan Price <nathan@abridge.com>
Co-authored-by: Tushar Goel <tushar.goel.ml@gmail.com>
Co-authored-by: Mati <132419219+matichon-vultureprime@users.noreply.github.com>
|
2024-06-04 20:26:32 +08:00 |
|
Kaiyu Xie
|
f430a4b447
|
Update TensorRT-LLM (#1688)
* Update TensorRT-LLM
---------
Co-authored-by: IbrahimAmin <ibrahimamin532@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: CoderHam <hemant@cohere.com>
Co-authored-by: Konstantin Lopuhin <kostia.lopuhin@gmail.com>
|
2024-05-28 20:07:49 +08:00 |
|
Kaiyu Xie
|
bf0a5afc92
|
Update TensorRT-LLM (#1598)
* Update TensorRT-LLM
|
2024-05-14 16:43:41 +08:00 |
|
Kaiyu Xie
|
66ef1df492
|
Update TensorRT-LLM (#1492)
* Update TensorRT-LLM
---------
Co-authored-by: Loki <lokravi@amazon.com>
|
2024-04-24 14:44:22 +08:00 |
|
石晓伟
|
850b6fa1e7
|
Update TensorRT-LLM (#1358)
Co-authored-by: Kaiyu <26294424+kaiyux@users.noreply.github.com>
|
2024-03-26 20:47:14 +08:00 |
|
Kaiyu Xie
|
655524dd82
|
Update TensorRT-LLM (#1168)
* Update TensorRT-LLM
---------
Co-authored-by: Bhuvanesh Sridharan <bhuvan.sridharan@gmail.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2024-02-27 17:37:34 +08:00 |
|
Kaiyu Xie
|
0f041b7b57
|
Update TensorRT-LLM (#1098)
* Update TensorRT-LLM
* update submodule
* Remove unused binaries
|
2024-02-18 15:48:08 +08:00 |
|
Kaiyu Xie
|
deaae40bd7
|
Update TensorRT-LLM (#787)
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2024-01-02 17:54:32 +08:00 |
|
Kaiyu Xie
|
f7eca56161
|
Update TensorRT-LLM (#613)
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: zhang-ge-hao <842720660@qq.com>
|
2023-12-08 17:49:24 +08:00 |
|
Kaiyu Xie
|
711a28d9bf
|
Update TensorRT-LLM (#465)
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2023-11-24 22:12:26 +08:00 |
|
Kaiyu Xie
|
6755a3f077
|
Update TensorRT-LLM (#422)
* Update TensorRT-LLM
---------
Co-authored-by: Tltin <TltinDeng01@gmail.com>
Co-authored-by: zhaohb <zhaohbcloud@126.com>
Co-authored-by: Bradley Heilbrun <brad@repl.it>
Co-authored-by: nqbao11 <nqbao11.01@gmail.com>
Co-authored-by: Nikhil Varghese <nikhil@bot-it.ai>
|
2023-11-18 00:05:54 +08:00 |
|
Kaiyu Xie
|
b2fd493c16
|
Update TensorRT-LLM (#349)
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2023-11-10 22:30:31 +08:00 |
|
Kaiyu Xie
|
4de32a86ae
|
Update TensorRT-LLM (#188)
* Update batch manager
* Update src
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: jdemouth-nvidia <11447840+jdemouth-nvidia@users.noreply.github.com>
|
2023-10-30 16:06:41 +08:00 |
|
Kevin Xie
|
027cd518e3
|
Update
|
2023-10-10 23:22:17 -07:00 |
|
Kevin Xie
|
6e9e318e91
|
Update code
|
2023-09-28 09:00:05 -07:00 |
|
Kaiyu Xie
|
23bc5b7c49
|
Initial commit
|
2023-09-20 00:29:41 -07:00 |
|