| File | Last commit | Date |
|------|-------------|------|
| llm_auto_parallel.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_eagle2_decoding.py | [fix] Eagle-2 LLMAPI pybind argument fix. (#3967) | 2025-05-29 12:23:25 -07:00 |
| llm_eagle_decoding.py | [fix] Eagle-2 LLMAPI pybind argument fix. (#3967) | 2025-05-29 12:23:25 -07:00 |
| llm_guided_decoding.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_inference_async_streaming.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_inference_async.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_inference_customize.py | chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732) | 2025-05-07 13:20:25 +08:00 |
| llm_inference_distributed.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_inference_kv_events.py | chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603) | 2025-05-28 18:43:04 +08:00 |
| llm_inference.py | Update TensorRT-LLM (#2755) | 2025-02-11 03:01:00 +00:00 |
| llm_logits_processor.py | [fix]: Fall back to HMAC to Avoid IPC Serialization Churn (#5074) | 2025-06-13 11:37:50 +08:00 |
| llm_lookahead_decoding.py | Update TensorRT-LLM (#2755) | 2025-02-11 03:01:00 +00:00 |
| llm_medusa_decoding.py | Update TensorRT-LLM (#2849) | 2025-03-04 18:44:00 +08:00 |
| llm_mgmn_llm_distributed.sh | make LLM-API slurm examples executable (#3402) | 2025-04-13 21:42:45 +08:00 |
| llm_mgmn_trtllm_bench.sh | chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603) | 2025-05-28 18:43:04 +08:00 |
| llm_mgmn_trtllm_serve.sh | [TRTQA-2802][fix]: add --host for mgmn serve examples script (#4175) | 2025-05-12 13:28:42 +08:00 |
| llm_multilora.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| llm_quantization.py | feat: use cudaMalloc to allocate kvCache (#3303) | 2025-04-08 10:59:14 +08:00 |
| quickstart_example.py | Update TensorRT-LLM (#2562) | 2024-12-11 00:31:05 -08:00 |
| README.md | chore: Mass integration of release/0.20 (#4898) | 2025-06-08 23:26:26 +08:00 |