QI JUN
257abfbc51
move pytorch tests of LLM API into separate test files ( #3745 )
...
* move pytorch tests of LLM API into separate test files
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* polish
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* update
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* clean
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
---------
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-22 14:36:59 -07:00
Yan Chunwei
231b39015c
unwaive multi_node test ( #3715 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-21 21:26:07 +08:00
QI JUN
d51ae53940
move the rest of the models into examples/models/core directory ( #3555 )
...
* move rest models to examples/models/core directory
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* update multimodal readme
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix example path
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix cpp test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix tensorrt test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
---------
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-19 20:48:59 -07:00
Yechan Kim
5460d18b10
feat: trtllm-serve multimodal support ( #3590 )
...
* feat: trtllm-serve multimodal support
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove disable argument
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove disable
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* add and separate tests and move the doc
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove block_reuse arg from serve.py
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
---------
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-19 05:01:28 +08:00
Zheng Duan
bce7ea8c38
test: add kv cache event tests for disagg workers ( #3602 )
2025-04-18 18:30:19 +08:00
Yan Chunwei
2a09826ec4
fix hmac in remote mpi session ( #3649 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Tao Li @ NVIDIA <tali@nvidia.com>
2025-04-18 17:47:51 +08:00
Erin
4fedf0be5c
unwaive test for nvbug_5150466 ( #3552 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-04-18 15:15:58 +08:00
rakib-hasan
ff3b741045
feat: adding multimodal (only image for now) support in trtllm-bench ( #3490 )
...
* feat: adding multimodal (only image for now) support in trtllm-bench
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* fix: add in load_dataset() calls to maintain the v2.19.2 behavior
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* re-adding prompt_token_ids and using that for prompt_len
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* updating the datasets version in examples as well
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* api changes are not needed
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* moving datasets requirement and removing a missed api change
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* addressing review comments
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* refactoring the quickstart example
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
---------
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
2025-04-18 07:06:16 +08:00
QI JUN
91660939fd
tests: waive test_llm_multi_node ( #3664 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-18 01:59:16 +08:00
QI JUN
fac1a905e9
waive test_llm_multi_node_with_postproc ( #3628 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-16 05:49:39 -07:00
Yan Chunwei
63f3fba679
waive test_llm_multi_node_pytorch ( #3592 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-16 10:49:07 +08:00
Pengyun Lin
1899e71364
doc: add genai-perf benchmark & slurm multi-node for trtllm-serve doc ( #3407 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-04-16 00:11:58 +08:00
xinhe-nv
5cfa927132
update waive list ( #3503 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-04-15 16:53:53 +08:00
xinhe-nv
863d023fd0
test: fix memory leak of tests ( #3392 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-04-10 14:31:40 +08:00
yuxianq
7b03350527
Add thread leak check and fix thread/memory leak issues. ( #3270 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-08 19:03:18 +08:00
pansicheng
ef1ba468a1
feat: support abort disconnected requests ( #3214 )
...
Signed-off-by: pansicheng <sicheng.pan.chn@gmail.com>
2025-04-07 16:14:58 +08:00
Yan Chunwei
b21cfcfed1
chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python ( #3025 )
...
* make LlmArgs Pydantic
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* amending doc
fix api_stability
fix tests
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* restore yaml groups
refine StackTrace
singleton
clean tests
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* fix trtllm-bench
fix pytorch
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* fix serve disagg
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* fix
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-05 13:31:48 +08:00
Pengyun Lin
f25c7cefb4
doc: refactor trtllm-serve examples and doc ( #3187 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-04 11:40:43 +08:00
pcastonguay
b5b83009ff
chore: Reenabling get_stats_async test which seems to have been fixed by recent commit ( #3246 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
2025-04-02 20:57:31 -07:00
Enwei Zhu
3cf7066350
test: Accuracy test improvement (Part 3.2): Move Qwen tests (NvBug 5135332) ( #3219 )
...
* remove test_llm_models_multi_gpu.py
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* qwen 2.5
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* upgrade
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-04-02 17:29:57 +08:00
bhsueh_NV
322ac565fc
chore: clean some ci of qa test ( #3083 )
...
* move some models to examples/models/contrib
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* update the document
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove arctic, blip2, cogvlm, dbrx from qa test list
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove tests of dit, mmdit and stdit from qa test
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove grok, jais, sdxl, skywork, smaug from qa test list
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* re-organize the glm examples
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix issues after running pre-commit
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix some typos in glm_4_9b readme
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix bug
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
---------
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-31 14:30:41 +08:00
xiweny
6979afa6f2
test: reorganize tests folder hierarchy ( #2996 )
...
1. move TRT path tests to 'trt' folder
2. optimize some import usage
2025-03-27 12:07:53 +08:00
Yuan Tong
53adb3cb4e
test: waive flaky test_kv_cache_event_async_api ( #3062 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-03-25 18:41:30 +08:00
Yan Chunwei
531b98ed62
feat: Add several pure python configs to LlmArgs ( #2997 )
...
* add SchedulerConfig
* add PeftCacheConfig
2025-03-24 16:16:17 +08:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
...
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00