TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-09 20:43:50 +08:00

Yibin Li	32ae1564bd	update FP4 quantize layout (#3045)	Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>	2025-04-03 13:13:54 -04:00
test_ar_residual_norm.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
test_deepseek_allreduce.py	perf: Add optimizations for deepseek in min latency mode (#3093)	2025-04-02 09:05:24 +08:00
test_embedding.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_linear.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_star_attention_input.jsonl	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_star_attention.py	move BuildConfig functional args to llmargs (#3036)	2025-03-29 02:20:18 +08:00
test_user_buffers.py	update FP4 quantize layout (#3045)	2025-04-03 13:13:54 -04:00