Erin
89dabf5aa1
[TRTLLM-9736][feat] AsyncLLM and verl integ ( #9353 )
...
Signed-off-by: Liwei Ma <liweim@nvidia.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Liwei Ma <liweim@nvidia.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-12-11 09:33:25 -08:00
Kaiyu Xie
177ba7b0f1
[None] [fix] Disable UCC as WAR to MPI allgather issue before NGC PyTorch 25.12 upgrade ( #9126 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-11-13 02:25:30 -08:00
Anish Shanbhag
15de45d782
[TRTLLM-8682][chore] Remove auto_parallel module ( #8329 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-22 20:53:08 -04:00
Yuan Tong
fae83c387b
[ #6102 ][fix] support non-system python installation ( #7763 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-09-26 10:16:15 +08:00
Guoming Zhang
202bed4574
[None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. ( #7851 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-25 21:02:35 +08:00
Enwei Zhu
f882fb86db
[ https://nvbugs/5367180 ][fix] Fix xgrammar import before loading tensorrt_llm binary ( #7906 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-09-23 00:29:57 -07:00
Chang Liu
ce53832610
[TRTLLM-7326][feat] Add standalone multimodal encoder ( #6743 )
...
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-08-19 21:42:50 -07:00
rakib-hasan
7ab8112450
[None][fix] Refactoring to avoid circular import when importing torch models ( #6720 )
...
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
2025-08-11 18:00:42 -04:00
Wanli Jiang
3789ba1d37
feat: TRTLLM-5941 Upgrade xgrammar to 0.1.18 ( #5364 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-07-01 20:12:55 +08:00
Bo Li
1bab9000a6
perf: Optimize swizzle_sf, unswizzle_sf, reswizzle_sf ( #5318 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-06-26 14:03:56 +08:00
Yan Chunwei
9bd42ecf9b
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default ( #5312 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-20 03:01:10 +08:00
Yan Chunwei
4798d088d9
chore: Partition LlmArgs into TorchLlmArgs and TrtLlmArgs ( #3823 )
...
* partition LlmArgs
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* update backend
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-05-22 09:40:56 +08:00
yuxianq
adfa04745e
fix: revert https://github.com/NVIDIA/TensorRT-LLM/pull/3858 ( #3928 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-29 11:26:13 +08:00
yuxianq
e6c14ca97a
fix: Detect pmix and raise error when mpirun is not used. ( #3858 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-26 21:49:41 +08:00
dongxuy04
16535991b2
feat: Add MNNVL MoE A2A support ( #3504 )
...
* add MNNVL memory mapping support
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* add more MPI environment for trtllm-llmapi-launch
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* add MoE communication and prepare kernels
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* add MNNVL AlltoAll support for DeepSeekV3
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* add output dump for throughput benchmark
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* support dynamic kernel launch grid
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* address review comments
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
* address review comments #2
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
---------
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
2025-04-25 17:29:08 +08:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM ( #2820 )
2025-02-25 21:21:49 +08:00
石晓伟
548b5b7310
Update TensorRT-LLM ( #2532 )
...
* blossom-ci.yml: run vulnerability scan on blossom
* open source efb18c1256f8c9c3d47b7d0c740b83e5d5ebe0ec
---------
Co-authored-by: niukuo <6831097+niukuo@users.noreply.github.com>
Co-authored-by: pei0033 <59505847+pei0033@users.noreply.github.com>
Co-authored-by: Kyungmin Lee <30465912+lkm2835@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2024-12-04 21:16:56 +08:00
Kaiyu Xie
f14d1d433c
Update TensorRT-LLM ( #2389 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Alessio Netti <netti.alessio@gmail.com>
2024-10-29 22:24:38 +08:00
Kaiyu Xie
75057cd036
Update TensorRT-LLM ( #2333 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Puneesh Khanna <puneesh.khanna@tii.ae>
Co-authored-by: Ethan Zhang <26497102+ethnzhng@users.noreply.github.com>
2024-10-15 15:28:40 +08:00
石晓伟
32ed92e449
Update TensorRT-LLM
...
Co-authored-by: Rong Zhou <130957722+ReginaZh@users.noreply.github.com>
Co-authored-by: Onur Galoglu <33498883+ogaloglu@users.noreply.github.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
2024-08-20 18:55:15 +08:00
Kaiyu Xie
9dbc5b38ba
Update TensorRT-LLM ( #1891 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Marks101 <markus.schnoes@gmx.de>
Co-authored-by: lkm2835 <lkm2835@gmail.com>
2024-07-04 14:37:19 +08:00
Kaiyu Xie
f430a4b447
Update TensorRT-LLM ( #1688 )
...
* Update TensorRT-LLM
---------
Co-authored-by: IbrahimAmin <ibrahimamin532@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: CoderHam <hemant@cohere.com>
Co-authored-by: Konstantin Lopuhin <kostia.lopuhin@gmail.com>
2024-05-28 20:07:49 +08:00
Kaiyu Xie
66ef1df492
Update TensorRT-LLM ( #1492 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Loki <lokravi@amazon.com>
2024-04-24 14:44:22 +08:00
Kaiyu Xie
035b99e0d0
Update TensorRT-LLM ( #1427 )
...
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
2024-04-09 17:03:34 +08:00
石晓伟
850b6fa1e7
Update TensorRT-LLM ( #1358 )
...
Co-authored-by: Kaiyu <26294424+kaiyux@users.noreply.github.com>
2024-03-26 20:47:14 +08:00
Kaiyu Xie
4bb65f216f
Update TensorRT-LLM ( #1274 )
...
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-03-12 18:15:52 +08:00
Kaiyu Xie
eb8f26c7e4
Update TensorRT-LLM ( #1122 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Eddie-Wang1120 <wangjinheng1120@163.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-02-21 21:30:55 +08:00
Kaiyu Xie
0ab9d17a59
Update TensorRT-LLM ( #1055 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-02-06 18:38:07 +08:00
石晓伟
da79354b8e
Update TensorRT-LLM ( #1017 )
2024-01-31 17:48:46 +08:00
Kaiyu Xie
b57221b764
Update TensorRT-LLM ( #941 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-23 23:22:35 +08:00
Kaiyu Xie
d879430b04
Update TensorRT-LLM ( #846 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-09 21:03:35 +08:00
Kaiyu Xie
deaae40bd7
Update TensorRT-LLM ( #787 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-02 17:54:32 +08:00
Kaiyu Xie
f7eca56161
Update TensorRT-LLM ( #613 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: zhang-ge-hao <842720660@qq.com>
2023-12-08 17:49:24 +08:00
Kaiyu Xie
711a28d9bf
Update TensorRT-LLM ( #465 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2023-11-24 22:12:26 +08:00
Kevin Xie
6e9e318e91
Update code
2023-09-28 09:00:05 -07:00
Kaiyu Xie
23bc5b7c49
Initial commit
2023-09-20 00:29:41 -07:00