TensorRT-LLMs/tests/unittest/_torch
William Zhang c53d1814a7
[None][feat] Extend VLM factory and add Mistral3 factory (#7583)
This commit:

* extends existing factory interfaces to enable Mistral3 in AutoDeploy.
* adds a Mistral3 VLM factory.
* adds various model patches for pixtral (the vision model) and mistral3
  to make the VLM export compliant.
* adjusts checkpoint loading code to take possible parameter name
  conversions into account.
* fixes a sampling bug (the `end_id` needs to be taken into account when
  sampling, but it is not included in the stop words' token IDs).

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-09-09 02:47:18 -04:00
attention [https://nvbugs/5453806][unwaive] Unwaive fp8 kvcache attention test (#7243) 2025-09-05 12:13:57 -04:00
auto_deploy [None][feat] Extend VLM factory and add Mistral3 factory (#7583) 2025-09-09 02:47:18 -04:00
compilation [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
debugger Fix: fix nvbug 5356427 (#5464) 2025-06-25 22:24:26 +08:00
executor [TRTLLM-7353][feat] Implement capturable drafting loops for speculation (#7100) 2025-09-01 14:37:44 -04:00
misc [None][perf] Make finalize fusion part of the tactic selection logic (#6915) 2025-08-21 14:08:03 -07:00
modeling [None][ci] remove unnecessary test_modeling_deepseek.py (#7542) 2025-09-04 20:05:27 -07:00
models/checkpoints/hf [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) 2025-08-25 23:56:21 -04:00
modules [TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests (#7033) 2025-08-25 10:37:40 +03:00
multi_gpu [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
multi_gpu_modeling [None][chore] Mass integration of release/1.0 - 3rd (#7519) 2025-09-08 14:03:04 +08:00
multimodal [None][feat] Update multimodal utility get_num_tokens_per_image for better generalization (#7544) 2025-09-08 07:42:46 -04:00
sampler [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) 2025-08-22 08:09:30 +02:00
speculative [https://nvbugs/5502352][fix] Fix 2-model CDL path (#7543) 2025-09-06 23:53:27 -04:00
thop [OMNIML-2336][feat] Add NVFP4 x FP8 (#6809) 2025-09-04 09:03:38 -07:00
helpers.py [None][chore] share input_ids buffers among different cuda graphs (#7236) 2025-09-06 17:49:42 -04:00
pattern_watcher.py [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
test_connector.py [None][feat] KV Cache Connector API (#7228) 2025-08-28 23:09:27 -04:00
test_torch_sampler.py [TRTLLM-7153] [feat] Move stop_criteria to sample_async (#7041) 2025-09-07 17:36:49 +03:00