Bala Marimuthu
|
1c065fbb3e
|
[#11109][feat] AutoDeploy: GLM 4.7 Flash Improvements (#11414)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Gal Hubara-Agam <96368689+galagam@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Co-authored-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
|
2026-02-17 08:43:59 -05:00 |
|
chenfeiz0326
|
eae480b713
|
[https://nvbugs/5820874][fix] Adjust deepgemm tuning buckets to cover larger num_tokens's scope (#11259)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-02-05 23:12:38 +08:00 |
|
Anish Shanbhag
|
e308eb50f4
|
[TRTLLM-10803][fix] Fix mocking of HuggingFace downloads in with_mocked_hf_download (#11200)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-02-02 21:58:15 -08:00 |
|
chenfeiz0326
|
56073f501a
|
[TRTLLM-8263][feat] Add Aggregated Perf Tests (#10598)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-17 13:16:36 +08:00 |
|
Anish Shanbhag
|
faa80e73fd
|
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-14 21:06:07 -08:00 |
|
Anish Shanbhag
|
dacc881993
|
[https://nvbugs/5761391][fix] Use correct model names for config database regression tests (#10192)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-12 10:55:07 -08:00 |
|
Lizhi Zhou
|
bd13957e70
|
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-12-16 05:16:32 -08:00 |
|