shuyixiong
|
6776caaad1
|
[TRTLLM-8507][fix] Fix ray resource cleanup and error handling in LoRA test (#8175)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
|
2025-10-14 23:46:30 +08:00 |
|
Jonas Yang CN
|
88ea2c4ee9
|
[TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-10-04 08:12:24 +08:00 |
|
xinhe-nv
|
1fbea497ff
|
[TRTLLM-7070][feat] add gpt-oss serve benchmark tests (#7638)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-16 16:39:31 +08:00 |
|
xiweny
|
c076a02b38
|
[TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
Signed-off-by: Daniel Stokes <dastokes@nvidia.com>
Signed-off-by: Zhanrui Sun <zhanruis@nvidia.com>
Signed-off-by: Xiwen Yu <xiweny@nvidia.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com>
Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
Co-authored-by: Daniel Stokes <dastokes@nvidia.com>
Co-authored-by: Zhanrui Sun <zhanruis@nvidia.com>
Co-authored-by: Jiagan Cheng <jiaganc@nvidia.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Bo Deng <deemod@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-09-16 09:56:18 +08:00 |
|
Yuan Tong
|
db8dc97b7b
|
[None][fix] Migrate to new cuda binding package name (#6700)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-08-07 16:29:55 -04:00 |
|
hlu1
|
8207d5fd39
|
[None] [feat] Add model gpt-oss (#6645)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-08-07 03:04:18 -04:00 |
|
amitz-nv
|
85af62184b
|
[TRTLLM-6683][feat] Support LoRA reload CPU cache evicted adapter (#6510)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-08-07 09:05:36 +03:00 |
|
amitz-nv
|
98428f330e
|
[TRTLLM-5826][feat] Support pytorch LoRA adapter eviction (#5616)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-07-20 08:00:14 +03:00 |
|
danielafrimi
|
7a617ad1fe
|
feat: W4A16 GEMM (#4232)
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
|
2025-07-01 10:36:05 +03:00 |
|
Omer Ullman Argov
|
1db63c2546
|
[fix] speedup modeling unittests (#5579)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-30 06:30:45 +03:00 |
|
shaharmor98
|
27afcb9928
|
add changes for fp8, nemotron-nas, API (#4180)
Signed-off-by: Shahar Mor <17088876+shaharmor98@users.noreply.github.com>
|
2025-05-18 23:27:25 +08:00 |
|
xiweny
|
68a19a33d4
|
TRTLLM-4624 feat: Add nvfp4 gemm and moe support for SM120 (#3770)
* upgrade cutlass to 3.9
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
update latest internal_cutlass_kernels; revert cutlass version update; fix fp4 gemm for sm100
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
* update internal cutlass kernels
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
* fix file
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
* remove unnecessary change
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
* update hash
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
---------
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
|
2025-04-29 11:19:11 -04:00 |
|
Pamela Peng
|
6cdfc54883
|
feat: Add FP8 support for SM 120 (#3248)
* Allow FP8 on SM120
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
* fix sm121
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
* fix
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
* fix pre-commit
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
* review update
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
---------
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
|
2025-04-14 16:05:41 -07:00 |
|
Kaiyu Xie
|
2631f21089
|
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-03-23 16:39:35 +08:00 |
|
Kaiyu Xie
|
3aa6b11d13
|
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
|
2025-03-18 21:25:19 +08:00 |
|