Lucas Liebenwein
5cdd6bb10f
[AutoDeploy] Increased Model Coverage Mass Migration Week 1 ( #4468 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
Co-authored-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Co-authored-by: sugunav14 <178320438+sugunav14@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-27 16:43:15 +08:00
Yiqing Yan
3ca05330f9
Waive L0 test ( #4609 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-23 15:54:11 +08:00
Lucas Liebenwein
8e4320ede5
[AutoDeploy] configurable cache resize ( #4372 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-05-16 10:07:09 -04:00
Fridah-nv
21dbd163a7
[TRTLLM-5188] fix: [AutoDeploy] unwaive AD build test ( #4273 )
...
* unwaive small build test
Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>
* unwaive mutigpu/integration tests
Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>
* fix for torch.compile+flashinfer attention
Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>
---------
Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>
2025-05-14 10:40:12 +08:00
Yiqing Yan
fda8b0277a
[Infra][TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04 ( #4049 )
...
* [TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* fix review
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update images
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* Update jenkins/L0_Test.groovy
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update image name
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
---------
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-13 14:59:12 +08:00
amitz-nv
a6a2ae6cc1
chore: Rename nvsmall to nemotron nas ( #3447 )
...
* Rename nvsmall to nemotron NAS
* Revert nvsmall to nemotron_nas rename in paths in tests that access llm_models_root/nvsmall/tests
* Add NemotronNAS to pytorch supported models table
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-04-10 23:16:52 +08:00
Suyog Gupta
047f2b234d
perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench ( #3041 )
...
* Enable AutoDeploy as a backend in trtllm-bench
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* update how caches are resized
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* fix: files permission from 100755 to 100644
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* some comments
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* lint
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* lint
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* lint
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* lint
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* Fix function name
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* refactor
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* Remove spurious change
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* Add cursor generated doc strings
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* re-enable ad test
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* some perf cleanup
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* debug ci
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* ensure that overlap scheduler is enabled
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
* Reorder the tests
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
---------
Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-26 14:33:14 -07:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
...
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00