Jhao-Ting Chen
|
220dc01372
|
[None][feat] support JIT mha.cu for SPEC_DEC in runtime (#6078)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
|
2025-09-23 14:56:17 -07:00 |
|
Yao Yao
|
12e075eb70
|
[nvbug 5333996 ][fix] Unload XQA cubins early to avoid static lifetime (#5133)
Signed-off-by: Yao Yao <lowsfer@users.noreply.github.com>
|
2025-06-13 15:53:29 +08:00 |
|
Yao Yao
|
3545d59635
|
Support speculative decoding with Hopper XQA (#3269)
Signed-off-by: Yao Yao <lowsfer@users.noreply.github.com>
|
2025-04-07 17:14:34 +08:00 |
|
Kaiyu Xie
|
e88da961c5
|
Update TensorRT-LLM (#2783)
|
2025-02-13 18:40:22 +08:00 |
|
Dan Blanaru
|
16d2467ea8
|
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
|
2025-02-11 03:01:00 +00:00 |
|
Kaiyu Xie
|
1730a587d8
|
Update TensorRT-LLM (#2363)
* Update TensorRT-LLM
---------
Co-authored-by: tonylek <137782967+tonylek@users.noreply.github.com>
|
2024-10-22 20:27:35 +08:00 |
|
Kaiyu Xie
|
9dbc5b38ba
|
Update TensorRT-LLM (#1891)
* Update TensorRT-LLM
---------
Co-authored-by: Marks101 <markus.schnoes@gmx.de>
Co-authored-by: lkm2835 <lkm2835@gmail.com>
|
2024-07-04 14:37:19 +08:00 |
|
石晓伟
|
2a115dae84
|
Update TensorRT-LLM (#1793)
Co-authored-by: DreamGenX <x@dreamgen.com>
Co-authored-by: Ace-RR <78812427+Ace-RR@users.noreply.github.com>
Co-authored-by: bprus <39293131+bprus@users.noreply.github.com>
Co-authored-by: janpetrov <janpetrov@icloud.com>
|
2024-06-18 18:18:23 +08:00 |
|
Kaiyu Xie
|
f430a4b447
|
Update TensorRT-LLM (#1688)
* Update TensorRT-LLM
---------
Co-authored-by: IbrahimAmin <ibrahimamin532@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: CoderHam <hemant@cohere.com>
Co-authored-by: Konstantin Lopuhin <kostia.lopuhin@gmail.com>
|
2024-05-28 20:07:49 +08:00 |
|
Kaiyu Xie
|
89ba1b1a67
|
Update TensorRT-LLM (#1554)
|
2024-05-07 23:34:28 +08:00 |
|