TensorRT-LLM/tensorrt_llm
Faraz 49c45ebef1
[None][fix] change logging for weight loading on unified memory (#9177)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
2025-11-19 14:31:19 -05:00
_tensorrt_engine
_torch [None][fix] change logging for weight loading on unified memory (#9177) 2025-11-19 14:31:19 -05:00
bench [#9237][feat] enable iter stats in autodeploy (#9278) 2025-11-19 19:29:29 +01:00
commands [None][chore] local imports for AutoDeploy in serve and bench (#9199) 2025-11-18 08:14:32 +08:00
evaluate [TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm (#8840) 2025-11-11 07:48:23 -08:00
executor [TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server (#8765) 2025-11-13 17:21:24 -08:00
inputs [TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm (#8840) 2025-11-11 07:48:23 -08:00
layers
llmapi [None][feat] Have ability to cancel disagg request if KV cache resources are exhausted (#9155) 2025-11-18 20:59:17 -05:00
metrics [None][feat] Add trtllm_ prefix for exposed metrics (#8845) 2025-11-06 15:27:18 +08:00
models [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) 2025-10-28 09:17:26 -07:00
plugin [TRTLLM-8683][chore] Migrate PluginConfig to Pydantic (#8277) 2025-10-17 16:13:22 -04:00
quantization [None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701) 2025-10-29 12:45:09 +08:00
runtime [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) 2025-10-28 09:17:26 -07:00
scaffolding [None][feat] Deep Research Implemented with Scaffolding (#8452) 2025-11-06 10:33:28 +08:00
serve [None][chore] Support json_schema in response_format (#8934) 2025-11-14 09:43:13 +08:00
tools [None][feat] Add Qwen3-Next to layer-wise benchmarks (#9065) 2025-11-14 10:03:00 +08:00
__init__.py [None] [fix] Disable UCC as WAR to MPI allgather issue before NGC PyTorch 25.12 upgrade (#9126) 2025-11-13 02:25:30 -08:00
_common.py [None][chore] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851) 2025-09-25 21:02:35 +08:00
_dlpack_utils.py
_ipc_utils.py [TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520) 2025-10-04 08:12:24 +08:00
_mnnvl_utils.py
_ray_utils.py [TRTLLM-8511][feat] Add update_weights and sleep_wakeup support for rl integration (#8302) 2025-11-04 10:19:24 -08:00
_utils.py [TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server (#8765) 2025-11-13 17:21:24 -08:00
builder.py [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) 2025-10-28 09:17:26 -07:00
disaggregated_params.py [TRTLLM-7328][feat] E-PD Disagg Support via llmapi (3/N) (#7577) 2025-09-22 19:07:18 -07:00
functional.py [None][chore] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851) 2025-09-25 21:02:35 +08:00
graph_rewriting.py
logger.py [None][chore] Mass integration of release/1.0 - 3rd (#7519) 2025-09-08 14:03:04 +08:00
lora_helper.py [TRTLLM-8682][chore] Remove auto_parallel module (#8329) 2025-10-22 20:53:08 -04:00
lora_manager.py [https://nvbugs/5510879][fix] Fix pytorch & TRT-python flows fused LoRA adapter modules weight split with TP>1 (#8063) 2025-10-12 12:29:52 -07:00
mapping.py [TRTLLM-9179][feat] add pp_partition to customize each rank's layer number (#9003) 2025-11-13 10:34:17 +08:00
math_utils.py
module.py [None][chore] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851) 2025-09-25 21:02:35 +08:00
network.py [TRTLLM-8682][chore] Remove auto_parallel module (#8329) 2025-10-22 20:53:08 -04:00
parameter.py
profiler.py
prompt_adapter_manager.py
python_plugin.py
ray_stub.py [TRTLLM-8507][fix] Fix ray resource cleanup and error handling in LoRA test (#8175) 2025-10-14 23:46:30 +08:00
sampling_params.py [None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127) 2025-10-27 13:12:31 -04:00
scheduling_params.py
serialization.py [TRTLLM-8682][chore] Remove auto_parallel module (#8329) 2025-10-22 20:53:08 -04:00
top_model_mixin.py [TRTLLM-8683][chore] Migrate PluginConfig to Pydantic (#8277) 2025-10-17 16:13:22 -04:00
version.py [None][chore] Bump version to 1.2.0rc3 (#9004) 2025-11-07 01:24:32 -08:00