Faraz
|
27a5091fcb
|
[None][feat] GPT-OSS Sm120/Sm121 Support (#7937)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Vincent Huang <vincenth@nvidia.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Vincent Huang <vincenth@nvidia.com>
|
2025-10-06 16:59:06 -04:00 |
|
Cheng Hang
|
cdce68c3e0
|
[TRTLLM-6741][fix] Add heuristics for lm head tp size when enable_lm_head_tp_in_adp=True (#7891)
Signed-off-by: Cheng Hang <chang@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-30 09:24:35 +08:00 |
|
Kaiyu Xie
|
62042a9733
|
[TRTLLM-6741] [feat] enable LM tp for MTP, under attention dp case (cherry-pick #7128) (#7571)
Signed-off-by: Cheng Hang <chang@nvidia.com>
Co-authored-by: Cheng Hang <chang@nvidia.com>
|
2025-09-17 09:41:32 +08:00 |
|
Chang Liu
|
99b98f1374
|
[TRTLLM-7440][fix] Split fused_input_embed to separate out host sync (#7280)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-06 23:11:39 -04:00 |
|
hlu1
|
8207d5fd39
|
[None] [feat] Add model gpt-oss (#6645)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-08-07 03:04:18 -04:00 |
|
Zhenhuan Chen
|
30608a5e6d
|
[https://nvbugs/5355316] fix: update torch.compile option to fix triton store_cubin error (#5865)
Signed-off-by: Zhenhuan Chen <chenzhh3671@gmail.com>
|
2025-07-14 17:17:30 +08:00 |
|
Perkz Zheng
|
ba2ab5098b
|
[Bug] attention DP doesn't work with embedding TP (#5642)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-07-02 08:57:46 +08:00 |
|
Yilin Fan
|
31bb650298
|
Cherry pick feat/llama4 to main (#4739)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Yilin Fan <206948969+nv-yilinf@users.noreply.github.com>
Co-authored-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2025-05-30 05:28:40 +08:00 |
|
yuxianq
|
4f8afe4cc6
|
feat: [nvbugs/5261055][nvbugs/5170160] non-invasive pipeline parallelism (#4034)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-05-16 04:16:53 +08:00 |
|
milesial
|
362a8272f8
|
feat: llama4 input processor (#3383)
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Signed-off-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
Co-authored-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
|
2025-04-25 16:47:14 -07:00 |
|
QI JUN
|
d167cbd5bb
|
refactor: remove ParallelConfig in tensorrt_llm._torch.distributed module (#3370)
* remove tensorrt_llm._torch.distributed.ParallelConfig
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* clean
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix embedding test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix comments
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* polish
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* rebase
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
---------
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>
|
2025-04-11 15:34:20 -07:00 |
|
Kaiyu Xie
|
2631f21089
|
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-03-23 16:39:35 +08:00 |
|
Kaiyu Xie
|
77d7fe1eb2
|
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM
---------
Co-authored-by: aotman <chenhangatm@gmail.com>
|
2025-03-04 18:44:00 +08:00 |
|
Kaiyu Xie
|
2ea17cdad2
|
Update TensorRT-LLM (#2792)
* Update TensorRT-LLM
---------
Co-authored-by: jlee <jungmoolee@clika.io>
|
2025-02-18 21:27:39 +08:00 |
|
Kaiyu Xie
|
e88da961c5
|
Update TensorRT-LLM (#2783)
|
2025-02-13 18:40:22 +08:00 |
|
Dan Blanaru
|
16d2467ea8
|
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
|
2025-02-11 03:01:00 +00:00 |
|