TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Jinyang Yuan bc2b01d1dd chore: update FMHA cubin files (#3680) Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>	2025-04-21 15:04:04 +08:00
cmake	refactor: Clean up CMakeLists.txt (#3479)	2025-04-18 14:39:29 +08:00
include/tensorrt_llm	feat: Offloading Multimodal embedding table to CPU in Chunked Prefill Mode (#3380)	2025-04-21 14:31:01 +08:00
micro_benchmarks	feat: Add FP8 support for SM 120 (#3248)	2025-04-14 16:05:41 -07:00
tensorrt_llm	chore: update FMHA cubin files (#3680)	2025-04-21 15:04:04 +08:00
tests	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
CMakeLists.txt	refactor: Clean up CMakeLists.txt (#3479)	2025-04-18 14:39:29 +08:00