TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Chuang Zhu ffc0b8f5da Cache transceiver support VSWA (#5505)	2025-07-05 01:18:42 +09:00
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
blockKeyTest.cpp	fix partialMatch (#3413)	2025-04-11 16:42:52 +08:00
cacheTransceiverTest.cpp	Cache transceiver support VSWA (#5505)	2025-07-05 01:18:42 +09:00
CMakeLists.txt	refactor: Move ModelSpec to core library (#3980)	2025-05-04 01:39:09 +08:00
guidedDecoderTest.cpp	[TRTLLM-4460] test: Use Llama 3.2 1B for Llama C++ tests (#3206)	2025-05-01 05:31:08 +08:00
peftCacheManagerTest.cpp	Update TensorRT-LLM (#2873)	2025-03-11 21:13:42 +08:00
trtEncoderModelTest.cpp	refactor: remove TrtGptModelOptionalParams (#5165)	2025-06-20 10:31:40 +02:00
trtGptModelRealDecoderTest.cpp	fix: Improve chunking test and skip empty kernel calls (#5710)	2025-07-04 09:08:15 +02:00
trtGptModelTest.cpp	[fix: nvbugs/5355493] Correctly clamp max sequence len to max attention window (#5720)	2025-07-04 08:16:25 +02:00