Enwei Zhu
|
1745102e72
|
[TRTLLM-7027][feat] Fuse d2t to logitsBitmaskKernel and fix a race condition in one-model spec (#7481)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-09-04 23:30:14 +08:00 |
|
Jin Li
|
2a2dfe273b
|
[https://nvbugs/5485102][fix] Correctly set stride for piecewise outp… (#7442)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-09-04 10:48:15 +08:00 |
|
Yi Zhang
|
a15af879ec
|
[None][refactor] Refactor Torch Compile Backend, MoeLoadBalancer and warmup Logic (#6615)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
|
2025-08-19 09:58:44 +08:00 |
|
liji-nv
|
dcbfa7e509
|
[https://nvbugs/5252313][fix] Fix torch compile + MTP (#6554)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-08-05 10:31:29 -04:00 |
|
liji-nv
|
1daa8c3232
|
[https://nvbugs/5340941][https://nvbugs/5375785] - fix: Wrap attentio… (#6355)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-08-01 07:38:06 -04:00 |
|
liji-nv
|
e07fff4f78
|
[https://nvbugs/5340941] - fix: Correct custom ops used by Qwen3 Moe … (#6285)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-07-25 14:49:45 +08:00 |
|
liji-nv
|
3e0fb60e50
|
[TRTLLM-4279] feat: Multistream initial support for torch compile flow (#5847)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-07-21 19:10:22 +08:00 |
|
Yi Zhang
|
91bf5e6a8e
|
[TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804)
Add Piecewise CUDA Graph Support
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
|
2025-05-09 11:04:01 +08:00 |
|