TensorRT-LLMs/tests/unittest
liji-nv dca6397d1e
feat: Introduce UB allocator for pytorch flow (#3257)
* Instead of allocating userbuffers at the beginning of runtime, UB buffers
  are now managed by a global allocator. The allocator dynamically assigns
  a free UB buffer or allocates a new buffer for a torch tensor, which
  makes userbuffers easier to use.

* In the common use case, the userbuffers are allocated correctly during
  the warm-up stage, so there is no dynamic allocation during inference.

* The UB fusion pattern is rewritten using the new UB allocator. It
  contains the following passes:

1. Fuse quant with allreduce, replace it with the UB implementation, and
   insert a copy_to_userbuffers. The normal allreduce currently does not
   support FP8 quant, so this needs to be done in the UB pass.
2. Convert all supported allreduce ops to UB and insert
   copy_to_userbuffers.
3. Fuse the op before the allreduce with the copy_to_userbuffers, so the
   op writes directly to the userbuffer.
4. Remove the userbuffers finalize if the output is connected to another
   UB allreduce.
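
The allocator behaviour described above (reuse a free buffer, otherwise allocate a new one, so that steady-state inference allocates nothing after warm-up) can be illustrated with a toy pool. This is a minimal sketch only: `Buffer`, `PoolAllocator`, `run_step`, and the size-based reuse rule are hypothetical names and assumptions made for illustration, not the actual TensorRT-LLM UB allocator or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    """Hypothetical stand-in for a registered user buffer."""
    size: int
    in_use: bool = False


@dataclass
class PoolAllocator:
    """Toy pool: hand out a free buffer that is large enough, else allocate."""
    buffers: list = field(default_factory=list)
    new_allocations: int = 0

    def allocate(self, size: int) -> Buffer:
        # Prefer an existing free buffer (assumed reuse rule: size fits).
        for buf in self.buffers:
            if not buf.in_use and buf.size >= size:
                buf.in_use = True
                return buf
        # No free buffer fits, so grow the pool.
        buf = Buffer(size=size, in_use=True)
        self.buffers.append(buf)
        self.new_allocations += 1
        return buf

    def release(self, buf: Buffer) -> None:
        # Return the buffer to the pool without freeing it.
        buf.in_use = False


def run_step(pool: PoolAllocator, sizes: list) -> None:
    """Simulate one forward step that holds several buffers, then frees them."""
    held = [pool.allocate(s) for s in sizes]
    for buf in held:
        pool.release(buf)


pool = PoolAllocator()
run_step(pool, [1024, 4096])          # warm-up: the pool grows here
after_warmup = pool.new_allocations
run_step(pool, [1024, 4096])          # inference: same shapes, buffers reused
after_inference = pool.new_allocations
```

In this sketch, the second step with the same request sizes is served entirely from the pool, which mirrors the claim that dynamic allocation is confined to warm-up in the common case.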

Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-04-08 18:39:49 +08:00
..
_torch feat: Introduce UB allocator for pytorch flow (#3257) 2025-04-08 18:39:49 +08:00
api_stability chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025) 2025-04-05 13:31:48 +08:00
bindings feat: support abort disconnected requests (#3214) 2025-04-07 16:14:58 +08:00
llmapi feat: support abort disconnected requests (#3214) 2025-04-07 16:14:58 +08:00
others test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
scaffolding feat: refactor scaffolding worker and support openai api worker (#3166) 2025-04-01 18:31:52 +08:00
tools Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
trt Waive unittest/trt/model/test_mamba.py::TestMamba::test_loaders_mamba_130m_hf_from_checkpoint. Will fix it later. (#3356) 2025-04-07 22:36:35 -07:00
utils test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
conftest.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
dump_checkpoint_stats.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
profile_utils.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
pytest.ini test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
test_model_runner_cpp.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
test_pip_install.py relax the limitation of setuptools (#2992) 2025-03-24 13:36:10 +08:00