Dom Brown
f995a92a31
CI: Waive for https://nvbugspro.nvidia.com/bug/5189673 ( #3100 )
...
* Waive for https://nvbugspro.nvidia.com/bug/5189673
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
* Update waive
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
---------
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2025-03-26 19:13:43 +08:00
Enwei Zhu
224469b096
test: [TRTLLM-4334] Create 1.0 criteria scope from API stability references ( #3069 )
...
* committed APIs validation
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* clean name
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* separate
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* add TODOs
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix naming
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-03-26 18:14:35 +08:00
Ivy Zhang
3e116c9687
test: add random image test for llama-3.2-11b-vision ( #3055 )
...
* add random image test for llama-3.2-11b-vision
Signed-off-by: Ivy Zhang <yanzh@nvidia.com>
* rename case
Signed-off-by: Ivy Zhang <yanzh@nvidia.com>
---------
Signed-off-by: Ivy Zhang <yanzh@nvidia.com>
Co-authored-by: Larry <larryx@nvidia.com>
CI got Passed: https://nv/trt-llm-cicd/job/helpers/job/PR_Github/522/
2025-03-26 15:38:16 +08:00
Aurelien Chartier
0ec7b5701f
chore: Handle qwen2audio input ids expansion during processing ( #3080 )
...
* Handle qwen2audio input ids expansion during processing
Signed-off-by: Aurelien Chartier <achartier@nvidia.com>
* remove more dead code
Signed-off-by: Aurelien Chartier <achartier@nvidia.com>
* fix yapf
Signed-off-by: Aurelien Chartier <achartier@nvidia.com>
---------
Signed-off-by: Aurelien Chartier <achartier@nvidia.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-03-26 15:00:27 +08:00
Yechan Kim
3c7cb6629c
Add EXAONE-Deep ( #3054 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-03-26 14:24:04 +08:00
kxdc
e6cb34d921
test: fix QA TRT integration testlist mismatch issue ( #3090 )
...
An incorrect test-db context caused an empty test list output.
Fixed by correcting a typo: `llm_trt_*` -> `trt_llm_*`.
Signed-off-by: kxdc <xink@nvidia.com>
2025-03-26 14:03:21 +08:00
peaceh-nv
5e272eef81
feat: reduce TRT engine build time in testing ( #3014 )
...
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
2025-03-26 13:02:54 +08:00
Anurag Mukkara
7361c7d401
Add second possible output ( #3043 )
...
Signed-off-by: Anurag Mukkara <amukkara@nvidia.com>
2025-03-25 12:59:27 -07:00
Enwei Zhu
f93ac9672e
clean ( #3061 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-03-25 21:55:08 +08:00
Chuang Zhu
110c6fc0f0
wait longer for disagg test ( #2998 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-03-25 20:52:38 +08:00
Yuan Tong
53adb3cb4e
test: waive flaky test_kv_cache_event_async_api ( #3062 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-03-25 18:41:30 +08:00
bhsueh_NV
5724c61934
chore: fix model path bugs in confset.py ( #3011 )
...
* fix model paths for models in examples/models/contrib/
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix code layout
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix bug in test_multimodal.py
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* add gptj_example_root back
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
---------
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-25 17:00:44 +08:00
xiweny
aacb8d66f4
doc: document running CI stage locally ( #3060 )
...
Signed-off-by: Xiwen Yu <xiweny@nvidia.com>
2025-03-25 16:18:17 +08:00
QI JUN
a8ec1cc4ea
remove examples/test_gptj.py::test_llm_gptj_fp8_manage_weights_summary test case ( #3057 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-03-25 15:41:27 +08:00
Yan Chunwei
69feafc947
fix: amend the test list ( #3056 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-03-25 14:17:36 +08:00
bhsueh_NV
ed84f8f923
fix bug in test_phi ( #3050 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-25 13:12:06 +08:00
Yan Chunwei
c29cebf79d
Deprecate model_api examples ( #2999 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-03-25 09:37:20 +08:00
Enwei Zhu
705eef68c2
test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite ( #2982 )
...
* Accuracy test improvement (Part 2)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* WAR OOM
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-03-25 07:34:10 +08:00
Netanel Haber
da0b0e0ee3
fix: disable kv cache reuse when minimum window size is reached, instead of maximum window size ( #2983 )
...
* fix variable window size reuse - disable when *min attention window* starts sliding, not max
* isPreCyclic -> isCyclic, and invert logic, for clarity
* getDecoderState()
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
2025-03-24 22:49:52 +08:00
Yan Chunwei
531b98ed62
feat: Add several pure python configs to LlmArgs ( #2997 )
...
* add SchedulerConfig
* add PeftCacheConfig
2025-03-24 16:16:17 +08:00
nv-guomingz
ec4f43a0ab
test: remove opt/mpt/gptj/gptneox/bloom/falcon/baichuan/internlm/deep_… ( #2987 )
...
* test: remove opt/mpt/gptj/gptneox/bloom/falcon/baichuan/internlm/deep_seek_v2 test cases.
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
* update test cases per review comments
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
---------
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
2025-03-24 14:18:06 +08:00
bhsueh_NV
7413cb555a
relax the setuptools restriction ( #2992 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-24 13:36:10 +08:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
...
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Kaiyu Xie
9b931c0f63
Update TensorRT-LLM ( #2873 )
2025-03-11 21:13:42 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM ( #2849 )
...
* Update TensorRT-LLM
---------
Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM ( #2820 )
2025-02-25 21:21:49 +08:00
Kaiyu Xie
2ea17cdad2
Update TensorRT-LLM ( #2792 )
...
* Update TensorRT-LLM
---------
Co-authored-by: jlee <jungmoolee@clika.io>
2025-02-18 21:27:39 +08:00
Kaiyu Xie
e88da961c5
Update TensorRT-LLM ( #2783 )
2025-02-13 18:40:22 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM ( #2755 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
2025-02-11 03:01:00 +00:00
Kaiyu Xie
be17881062
Update TensorRT-LLM ( #2582 )
2024-12-16 21:50:47 -08:00
Kaiyu Xie
aaacc9bd68
Update TensorRT-LLM ( #2562 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Starrick Liu <73152103+StarrickLiu@users.noreply.github.com>
2024-12-11 00:31:05 -08:00
石晓伟
548b5b7310
Update TensorRT-LLM ( #2532 )
...
* blossom-ci.yml: run vulnerability scan on blossom
* open source efb18c1256f8c9c3d47b7d0c740b83e5d5ebe0ec
---------
Co-authored-by: niukuo <6831097+niukuo@users.noreply.github.com>
Co-authored-by: pei0033 <59505847+pei0033@users.noreply.github.com>
Co-authored-by: Kyungmin Lee <30465912+lkm2835@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2024-12-04 21:16:56 +08:00
Kaiyu Xie
385626572d
Update TensorRT-LLM ( #2502 )
...
* Update TensorRT-LLM
---------
Co-authored-by: 岑灿 <yunyi.hyy@alibaba-inc.com>
2024-11-26 16:51:34 +08:00
Kaiyu Xie
535c9cc673
Update TensorRT-LLM ( #2460 )
2024-11-19 18:30:34 +08:00
Kaiyu Xie
c629546ce4
Update TensorRT-LLM ( #2436 )
2024-11-12 15:27:49 +08:00
Kaiyu Xie
b7868dd1bd
Update TensorRT-LLM ( #2413 )
2024-11-05 16:27:06 +08:00
Kaiyu Xie
f14d1d433c
Update TensorRT-LLM ( #2389 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Alessio Netti <netti.alessio@gmail.com>
2024-10-29 22:24:38 +08:00
Kaiyu Xie
1730a587d8
Update TensorRT-LLM ( #2363 )
...
* Update TensorRT-LLM
---------
Co-authored-by: tonylek <137782967+tonylek@users.noreply.github.com>
2024-10-22 20:27:35 +08:00
Kaiyu Xie
75057cd036
Update TensorRT-LLM ( #2333 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Puneesh Khanna <puneesh.khanna@tii.ae>
Co-authored-by: Ethan Zhang <26497102+ethnzhng@users.noreply.github.com>
2024-10-15 15:28:40 +08:00
Kaiyu Xie
8681b3a4c0
open source 4dbf696ae9b74a26829d120b67ab8443d70c8e58 ( #2297 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Bhuvanesh Sridharan <bhuvanesh.sridharan@sprinklr.com>
Co-authored-by: Qingquan Song <ustcsqq@gmail.com>
2024-10-08 12:19:19 +02:00
Dan Blanaru
48686bca3a
open source 7f370deb0090d885d7518c2b146399ba3933c004 ( #2273 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Qingquan Song <ustcsqq@gmail.com>
2024-09-30 13:51:19 +02:00
Kaiyu Xie
e153372759
Update TensorRT-LLM ( #2253 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Ivan Sorokin <isorokin@nvidia.com>
Co-authored-by: lkm2835 <lkm2835@gmail.com>
2024-09-24 17:27:31 +02:00
Kaiyu Xie
fe7dc6ad4e
Update TensorRT-LLM ( #2230 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Yi Wang <yi.wang.2005@gmail.com>
Co-authored-by: lkm2835 <lkm2835@gmail.com>
2024-09-17 14:39:09 +08:00
Kaiyu Xie
31ac30e928
Update TensorRT-LLM ( #2215 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Sherlock Xu <65327072+Sherlock113@users.noreply.github.com>
2024-09-10 18:21:22 +08:00
Kaiyu Xie
78f5c2936b
Update TensorRT-LLM ( #2184 )
2024-09-03 12:14:23 +02:00
石晓伟
b8fc6633ba
Update TensorRT-LLM ( #2156 )
...
Co-authored-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
2024-08-27 18:20:59 +08:00
石晓伟
32ed92e449
Update TensorRT-LLM
...
Co-authored-by: Rong Zhou <130957722+ReginaZh@users.noreply.github.com>
Co-authored-by: Onur Galoglu <33498883+ogaloglu@users.noreply.github.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
2024-08-20 18:55:15 +08:00
Kaiyu Xie
74b324f667
Update TensorRT-LLM ( #2110 )
2024-08-13 22:34:33 +08:00
Kaiyu Xie
be9cd719f7
Update TensorRT-LLM ( #2094 )
...
* Update TensorRT-LLM
---------
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Tayef Shah <tayefshah@gmail.com>
Co-authored-by: lfz941 <linfanzai941@gmail.com>
2024-08-07 16:44:43 +08:00