TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-22 03:35:00 +08:00

Author	SHA1	Message	Date
Ivy Zhang	5eefdf2c75	tests: Add llama4 functional cases (#6392 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-08-04 11:19:58 +08:00
Yechan Kim	ee6ab5be96	chore: add EXAONE4 accuracy test (#6397 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-08-04 10:14:16 +08:00
Ivy Zhang	7547a7d0a2	[TRTLLM-6473][test] add speculative decoding and ep load balance cases into QA test list (#6436 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-08-03 22:11:26 -04:00
Jhao-Ting Chen	4da5cfc511	[None][infra] add eagle3 one model accuracy tests (#6264 ) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-08-02 16:07:46 -07:00
Lizhi Zhou	6f34f3489b	[TRTLLM-6357][test] Add accuracy tests for Qwen3 (#6177 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-08-01 13:33:34 -04:00
xinhe-nv	263c6c0ad0	test: skip post blackwell (#6357 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-08-01 13:10:14 -04:00
Emma Qiao	16febefee0	[None][Infra] - Skip failed tests in post-merge (#6558 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-08-01 22:21:23 +08:00
brb-nv	7447d6ed85	[TRTLLM-6657][feat] Add LoRA support for Gemma3 (#6371 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-08-01 09:19:54 -04:00
liji-nv	1daa8c3232	[https://nvbugs/5340941 ][https://nvbugs/5375785 ] - fix: Wrap attentio… (#6355 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-08-01 07:38:06 -04:00
Yukun He	90856bf97d	[https://nvbugs/5419069 ][fix] Fix the mismatched layer name components. (#6417 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-08-01 16:32:39 +08:00
brb-nv	2eca0d5925	fix: Fix poor generation with FP8 Gemma3 1B checkpoint (#6499 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-07-31 17:18:23 -07:00
Ziyi Xiong	8062e0fe7c	[TRTLLM-6392][feat] Support turning on/off spec decoding dynamically (#6363 ) Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>	2025-07-31 15:31:39 -04:00
Faraz	8e84df74b5	Fix e2e test failure for RTX6000 Pro (#6420 ) Signed-off-by: list <58580514+farazkh80@users.noreply.github.com> Signed-off-by: Faraz <58580514+farazkh80@users.noreply.github.com>	2025-07-30 23:32:44 -04:00
xinhe-nv	ca534e4798	test: add accuracy reference (#6479 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-07-31 12:27:29 +10:00
bhsueh_NV	ae3a5fc918	[doc][ci][Qwen3][nvbugs 5374145] Add Qwen3 235B eagle3 CI (#6477 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-07-31 09:37:23 +08:00
brb-nv	0e16d1f070	test: Add time logging for lora tests (#6466 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-07-30 14:02:43 -07:00
Anurag Mukkara	fac186e3b5	[nvbug/5409417] Unwaive llava test case (#6460 ) Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>	2025-07-30 14:38:47 -04:00
brb-nv	f6287e4498	Unwaive Gemma2 LoRA test on H100 (#6461 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-07-30 12:56:12 -04:00
Bo Deng	24e7f4eece	[nvbug/5410296][fix] Fix OOM in Llama 4 disagg-serve tests (#6439 ) Signed-off-by: Bo Deng <deemod@nvidia.com>	2025-07-31 00:41:37 +08:00
Wanli Jiang	9632dba02e	feat: TRTLLM-6450 update long rope for phi3.5/phi4-mini/phi4-mm (#6353 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-07-30 09:20:16 -07:00
pcastonguay	0f083b9daf	fix: Unwaive triton cpp test [nvbug 5401088] (#6412 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-07-30 11:25:18 -04:00
pcastonguay	e7ae5e2824	feat: Add support for disaggregation with pp with pytorch backend (#6369 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com> Signed-off-by: raayandhar <rdhar@nvidia.com> Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Signed-off-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com> Co-authored-by: raayandhar <rdhar@nvidia.com> Co-authored-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-07-30 09:42:13 -04:00
tomeras91	a2514d93fc	[nvbug 5380101][fix] Fix nemotronNAS loading for TP>1 (#6447 ) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>	2025-07-30 07:22:32 -04:00
xinhe-nv	d9ab3fd35e	tests: add TestNemotronH cuda graph tests (#6390 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>	2025-07-30 18:45:58 +10:00
xinhe-nv	c00d6763b2	test: [CI] Add failed cases into waives.txt (#6457 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-07-30 12:36:58 +10:00
Yechan Kim	d6eb8e2366	fix: support mixture of text & multimodal prompts (#6345 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-07-30 08:52:31 +08:00
xinhe-nv	f1086e7d4f	test: [CI] remove closed bugs (#6381 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>	2025-07-29 19:01:23 +10:00
xinhe-nv	4fbb344caf	test: [CI] Add failed cases into waives.txt (#6423 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-07-29 19:00:30 +10:00
Yukun He	0eee2e2850	[5385981] fix: Update the usage of VisionAttention init API. (#6413 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-07-29 16:41:48 +08:00
ruodil	e11255e9d0	test:[nvbug 5415268] add kv_cache_free_gpu_mem_fraction param and llama4 rcca cases (#6430 ) Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>	2025-07-29 15:52:45 +10:00
Michal Guzek	2573bb729d	feat: Add Phi-4-Mini-Instruct in Pytorch backend for LLM API accuracy tests (#6303 ) Signed-off-by: moraxu <mguzek@nvidia.com>	2025-07-28 14:02:14 -07:00
2ez4bz	cdca541148	[test] Unwaive mistral3.1 small E2E test (#6352 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-07-28 14:37:42 -04:00
2ez4bz	60e4d3a9d4	[test] Add accuracy regression test for Mistral3.1 (#6322 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-07-28 09:41:44 -07:00
ruodil	03632a679f	test: organize perf cases and add missing perflab cases in qa test list (#6283 ) Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>	2025-07-28 20:33:32 +10:00
xinhe-nv	971be1fe86	test: waive failed cases (#6394 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-07-28 20:31:43 +10:00
Emma Qiao	b3ca159787	[Infa] - waive failed cases and fix a typo (#6384 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-07-28 02:06:57 -04:00
Chang Liu	dc757799e1	[nvbugs/5401156][fix] Avoid import all models when import trtllm._common (#6266 )	2025-07-27 23:29:21 -04:00
Yan Chunwei	908f49a4ad	[nvbug/5320234] fix: test_trtllm_bench_llmapi_launch (#6359 ) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-07-28 09:01:10 +08:00
nv-guomingz	b8d4cb8beb	feat: Support JSON Schema in OpenAI-Compatible API (#6321 ) Signed-off-by: noiji <52301388+noiji@users.noreply.github.com>	2025-07-25 12:55:56 -04:00
xiaoqi	a0aecf0476	[feat]: support logit_bias (#5354 ) Signed-off-by: xq25478 <xq25478@qq.com> Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com> Signed-off-by: hexiao.xq <hexiao.xq@antgroup.com> Co-authored-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com> Co-authored-by: hexiao.xq <hexiao.xq@antgroup.com> Co-authored-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>	2025-07-25 09:37:41 +00:00
xinhe-nv	470544cf17	test: [CI] Add failed cases into waives.txt (#6333 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-07-25 17:18:06 +10:00
xinhe-nv	6268a60ab3	tests: add test_chunked_prefill for llama4 (#5549 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-07-24 23:02:00 -04:00
bhsueh_NV	7b6aadc800	[Fix][nvbug 5401163][nvbug 5404726][Qwen3] Fix bug of MoE on tp > 1 with trtllm moe backend (#6235 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-07-24 21:47:37 +08:00
Emma Qiao	0cc1f8c03d	[Infra] - Wiave failed tests in post-merge (#6331 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-07-24 21:18:06 +08:00
Iman Tabrizian	5fceaa6153	Revert "tests: add timeout_manager to tensorrt flow test cases (#5942 )" (#6309 )	2025-07-23 23:58:10 -04:00
Iman Tabrizian	7740bfa31d	Waive tests (#6312 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2025-07-23 18:15:07 -07:00
Emma Qiao	cb737a5fcd	[Infra] - Skip failed cases (#6299 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-07-23 21:26:31 +08:00
xinhe-nv	2b0fa24175	test: [CI] Add failed cases into waives.txt (#6289 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-07-23 19:04:21 +10:00
YueWeng	ed62a06eef	[nvbug/5322354] fix PD + MTP + overlap scheduler accuracy issue (#6136 ) Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>	2025-07-23 14:53:37 +08:00
Iman Tabrizian	bc2fb29c5e	[nvbugs/5401261][fix] Fix Triton backend disaggregated serving support (#6224 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2025-07-23 05:27:16 +08:00

536 Commits