TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-28 22:56:13 +08:00

2ez4bz 7ebb770dce [None][fix] Fix batching bug in Mistral3 model (#6841)	2025-08-14 02:15:44 -04:00

Prior to this commit, if multiple requests with images were in the same batch, the batching logic for the images would fail. This commit fixes it, and adds unit tests for it that were verified to fail prior to the fix.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

test_modeling_bert.py	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
test_modeling_clip.py	feat: add Pytorch support of Vision Encoder for multimodal models (#3791)	2025-05-03 05:13:47 +08:00
test_modeling_deepseek.py	[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend (#5752)	2025-07-16 16:42:59 +08:00
test_modeling_exaone4.py	chore: add EXAONE4 accuracy test (#6397)	2025-08-04 10:14:16 +08:00
test_modeling_gemma3.py	[None][chore] Update Gemma3 closeness check to mitigate flakiness (#6591)	2025-08-04 10:10:58 -04:00
test_modeling_llama_min_latency.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
test_modeling_llama.py	[fix] speedup modeling unittests (#5579)	2025-06-30 06:30:45 +03:00
test_modeling_mistral.py	[None][fix] Fix batching bug in Mistral3 model (#6841)	2025-08-14 02:15:44 -04:00
test_modeling_mixtral.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
test_modeling_mllama.py	Update transformers to 4.53.0 (#5747)	2025-07-09 09:32:24 -07:00
test_modeling_nemotron_h.py	[https://nvbugs/5404046][fix] Fix Nemotron-H flaky CUDA graph / overlap scheduler test (#6485)	2025-07-31 21:35:10 +03:00
test_modeling_nemotron_nas.py	[fix][test] Speedup Nemotron NAS unittests (#5202)	2025-06-15 11:26:03 +03:00
test_modeling_nemotron.py	[fix] speedup modeling unittests (#5579)	2025-06-30 06:30:45 +03:00
test_modeling_out_of_tree.py	chores: merge examples for v1.0 doc (#5736)	2025-07-08 21:00:42 -07:00
test_modeling_phi3.py	feat: TRTLLM-5574 Add phi-4-multimodal pytorch-backend support (#5644)	2025-07-17 06:30:58 +08:00
test_modeling_pixtral.py	[TRTLLM-5252][fix] Propagate mapping to intermediate layers (#6611) (#6765)	2025-08-11 10:13:10 -07:00
test_modeling_qwen_moe.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
test_modeling_qwen.py	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
test_modeling_siglip.py	feat: Update Gemma3 Vision Encoder (#5973)	2025-07-14 22:38:10 +08:00
test_modeling_vila.py	feat: llama4 input processor (#3383)	2025-04-25 16:47:14 -07:00