ixlmar
|
fbe4db207d
|
feat: forward exceptions to Python and catch OOMs (#4497)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-05-28 11:58:10 +02:00 |
|
Jinyang Yuan
|
b618e1f55b
|
perf: Eliminate the need for attention DP padding when possible (#3439)
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
Co-authored-by: raccoonliukai <raccoonliu@tencent.com>
|
2025-05-17 13:30:55 +08:00 |
|
Kaiyu Xie
|
2ea17cdad2
|
Update TensorRT-LLM (#2792)
* Update TensorRT-LLM
---------
Co-authored-by: jlee <jungmoolee@clika.io>
|
2025-02-18 21:27:39 +08:00 |
|
Kaiyu Xie
|
e88da961c5
|
Update TensorRT-LLM (#2783)
|
2025-02-13 18:40:22 +08:00 |
|