Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)
* Fix padded vocab size for Llama
* Refactor multi GPU llama executor tests, and reuse the built model engines
* Fix test list typo
* WIP
* Further WIP
* WIP
* Update test lists and readme
* Try parametrize for asymmetric
* Parametrize + skip unsupported combinations
* Update test list
* Reduce environment duplicated code

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Signed-off-by: domb <3886319+DomBrown@users.noreply.github.com>

| Name |
|---|
| conftest.py |
| cpp_common.py |
| test_e2e.py |
| test_multi_gpu.py |
| test_unit_tests.py |