Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)
Prior to this commit, if multiple requests with images were in the same batch, the batching logic for the images would fail. This commit fixes it, and adds unit tests for it that were verified to fail prior to the fix.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

| Name |
|---|
| attention |
| auto_deploy |
| compilation |
| debugger |
| executor |
| misc |
| modeling |
| models/checkpoints/hf |
| modules |
| multi_gpu |
| multi_gpu_modeling |
| multimodal |
| sampler |
| speculative |
| thop |
| helpers.py |
| pattern_watcher.py |
| test_connector.py |