mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
* test: add llama_v3.1_8b_fp8 model, llama_v3.1_405b model and llama_nemotron_49b model in perf test, and modify original llama models dtype from float16 to bfloat16 according to README.md Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> * add llama_3.2_1B model and fix for lora script issue Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> --------- Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| dev | ||
| qa | ||
| test-db | ||
| waives.txt | ||