TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-29 15:15:08 +08:00

Author	SHA1	Message	Date
chenfeiz0326	cc4ab8d9d1	[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database (#8653) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-11-03 16:23:13 +08:00
ruodil	07a957e5cb	[None][test] remove redunctant runtime backend in perf test (#8358) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>	2025-10-24 02:01:34 -04:00
Eran Geva	d4b3bae5af	[#8391][fix] check perf by device subtype (#8428) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-10-22 12:38:05 +03:00
chenfeiz0326	6cf1c3fba4	[TRTLLM-8260][feat] Add Server-Client Perf Test in pytest for B200 and B300 (#7985) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-10-22 10:17:22 +08:00
fredricz-20070104	fc4e6d3702	[TRTLLM-7183][test] Feature fix model issue for disagg serving (#7785) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-09-19 10:12:55 +08:00
Bo Deng	bf57829acf	[TRTLLM-7871][infra] Extend test_perf.py to add disagg-serving perf tests. (#7503) Signed-off-by: Bo Deng <deemod@nvidia.com>	2025-09-10 17:35:51 +08:00
amirkl94	8039ef45d3	CI: Performance regression tests update (#3531)	2025-06-01 09:47:55 +03:00
Emma Qiao	c945e92fdb	[Infra]Remove some old keyword (#4552) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-05-31 13:50:45 +08:00
ruodil	9c03a7ab74	test: add llama_3.2_1B model and fix for test lora script issue (#4139) * test: add llama_v3.1_8b_fp8 model, llama_v3.1_405b model and llama_nemotron_49b model in perf test, and modify original llama models dtype from float16 to bfloat16 according to README.md Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> * add llama_3.2_1B model and fix for lora script issue Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> --------- Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>	2025-05-12 14:51:59 +08:00
ruodil	4d0e462723	tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models (#3864) * tests: skip writing prepare_dataset output to logs Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> * test: add llama_v3.1_8b_fp8 model, llama_v3.1_405b model and llama_nemotron_49b model in perf test, and modify original llama models dtype from float16 to bfloat16 according to README.md Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> --------- Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> Signed-off-by: Larry <197874197+LarryXFly@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>	2025-05-07 13:56:35 +08:00
ruodil	9223000765	waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3657) Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> Signed-off-by: Larry <197874197+LarryXFly@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>	2025-04-22 14:51:45 +08:00
Kaiyu Xie	2631f21089	Update (#2978) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-03-23 16:39:35 +08:00

12 Commits