xinhe-nv
2c86cee38c
[None][chore] Remove closed bugs ( #6969 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-08-19 16:01:33 +08:00
Shunkangz
54ec2c1af1
[None][opt] Add batch wait timeout in fetching requests ( #6923 )
...
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-08-19 03:50:08 -04:00
Eran Geva
636c622bb8
[ https://nvbugs/5458798 ][fix] Relaxed test threshold, added documentation ( #6997 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-08-19 00:24:03 -07:00
Ivy Zhang
bff5fdf6df
[TRTLLM-6541][test] Add NIM Related Cases Part 1 ( #6684 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-19 13:59:14 +08:00
William Zhang
daa2a65d37
[ https://nvbugs/5454875 ][ci] Unwaive Mistral Small 3.1 test ( #7011 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-19 00:32:14 -04:00
fredricz-20070104
e90280a84d
[TRTLLM-6541][test] Add NIM Related Cases [StarCoder2_7B] and [Codestral_22B_V01] ( #6939 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-08-19 00:13:04 -04:00
Fanrong Li
816a120af6
[TRTLLM-6991][chore] add DeepSeek-R1 FP8 accuracy tests on Blackwell ( #6710 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-08-19 00:03:03 -04:00
Zhenhuan Chen
2bb90ba002
[TRTLLM-6960][fix] enable scaled_mm tests ( #6936 )
...
Signed-off-by: Zhenhuan Chen <chenzhh3671@gmail.com>
2025-08-19 10:18:04 +08:00
Yi Zhang
a15af879ec
[None][refactor] Refactor Torch Compile Backend, MoeLoadBalancer and warmup Logic ( #6615 )
...
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-08-19 09:58:44 +08:00
Lizhi Zhou
71e28eab36
[TRTLLM-7014][chore] Add accuracy test for ctx and gen workers with different models ( #6741 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-08-19 09:58:22 +08:00
Wanli Jiang
dabebb2c7a
[ https://nvbugs/5371480 ][fix] Enable test_phi3_small_8k ( #6938 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-08-19 09:42:35 +08:00
Leslie Fang
e76e5c640f
[None][infra] Enable accuracy test for mtp and chunked prefill ( #6314 )
...
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2025-08-19 07:42:52 +08:00
Yiqing Yan
1ce23545fc
[None][chore] Remove duplicate test waives ( #6998 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-18 21:15:49 +08:00
Emma Qiao
69ff32f9b1
[None][infra] Waive failed tests on main 0818 ( #6992 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-18 20:34:52 +08:00
Shi Xiaowei
5ec15b98f0
[TRTLLM-7030][fix] uppercase def value in pd-config ( #6981 )
...
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-08-18 02:33:23 -04:00
Leslie Fang
ce0b13ea02
[None][infra] update feature_combination_matrix of disaggregated and Eagle3 ( #6945 )
...
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2025-08-18 09:18:17 +08:00
Naveassaf
d6322f70b7
[ https://nvbugs/5451028 ][fix] Constrain NemotronSuper test parameters to prevent OOMs ( #6970 )
...
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
2025-08-17 13:38:36 -04:00
amitz-nv
3a49b47081
[ https://nvbugs/5390853 ][fix] Fix _test_openai_lora.py - disable cuda graph ( #6965 )
...
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-08-17 16:56:16 +03:00
Emma Qiao
cc6d763824
[None][infra]Waive failed cases in main branch ( #6951 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-17 14:27:59 +03:00
bhsueh_NV
85cbd0263b
[None][feat] Support Yarn on Qwen3 ( #6785 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-08-17 07:21:29 +08:00
Daniel Cámpora
53312eeebd
[TRTLLM-7157][feat] BREAKING CHANGE Introduce sampler_type, detect sampler according to options ( #6831 )
...
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
2025-08-16 00:27:24 -04:00
brb-nv
9505727d31
[ https://nvbugs/5401114 ][fix] Unwaive Gemma3 tests ( #6952 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-15 16:35:02 -07:00
Yuening Li
1f8ae2b2db
[TRTLLM-5863][feat] Support MoE INT8 Weight-Only-Quantization in PyTorch Workflow ( #6629 )
...
Signed-off-by: Yuening Li <62227368+yueningl@users.noreply.github.com>
2025-08-15 17:15:49 -04:00
dongfengy
0ad0b967bb
[None][fix] Make TP working for Triton MOE (in additional to EP we are using) ( #6722 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-08-15 16:58:42 -04:00
ajrasane
4162d2d746
[None][test] Add accuracy evaluation for AutoDeploy ( #6764 )
...
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-08-15 13:46:09 -04:00
yifeizhang-c
4127d77678
[ https://nvbugs/5394392 ][fix] Enlarge scheduler capacity under disagg bs == 1 ( #6537 )
...
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
2025-08-15 09:52:06 -07:00
liji-nv
18ccd053d3
[ https://nvbugs/5427801 ][fix] Torch compile support for Llama4 and Ea… ( #6858 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-08-15 11:14:20 -04:00
peaceh-nv
1c1d5d2495
[ https://nvbugs/5451373 ][fix] : Fix the accuracy issue when using FP8 context MLA ( #6881 )
...
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
2025-08-15 16:53:56 +08:00
xinhe-nv
b23fdfc62f
[None][chore] Add failed cases into waives.txt ( #6914 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-08-15 14:00:16 +08:00
Yanchao Lu
3a987891d8
[TRTLLM-7141][infra] Use repo mirrors to avoid intermittent network failures ( #6836 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-08-15 11:16:07 +08:00
Bo Deng
e54ba75dac
[None][fix] Update tests to use standardized uppercase backend identifiers ( #6921 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-08-15 11:14:15 +08:00
Frank
2cc59aacb3
[None][fix] Correct reporting of torch_dtype for ModelConfig class. ( #6800 )
...
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-08-14 22:46:20 -04:00
Aurelien Chartier
b13a5a99b2
[None][chore] Add tests for non-existent and completed request cancellation ( #6840 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-08-14 15:57:01 -07:00
Raayan Dhar
8b237b943b
[ https://nvbugs/5441714 ][chore] remove skip on disagg n-gram test ( #6872 )
...
Signed-off-by: raayandhar <rdhar@nvidia.com>
2025-08-14 15:45:00 -07:00
Bo Li
26f413ad90
[ https://nvbugs/5450262 ][fix] Fix unsupported alltoall use case ( #6882 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-08-14 17:46:54 -04:00
Matthias Jouanneaux
69574ad730
[TRTLLM-5966][feat] Helix: extend mapping to support different CP types ( #6816 )
...
Signed-off-by: Matthias Jouanneaux <mjoux@nvidia.com>
2025-08-14 09:00:02 -07:00
Emma Qiao
96339c69a9
[None][infra] Waive failed cases on main ( #6902 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-08-14 23:59:44 +08:00
Pengbo Wang @ NVIDIA
ffc976ceaf
[ https://nvbugs/5445466 ][fix] fix deepseek r1 hang by not enabling mnnvl by default ( #6860 )
...
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
Co-authored-by: Tao Li @ NVIDIA <tali@nvidia.com>
2025-08-14 22:36:56 +08:00
Shi Xiaowei
1095dfd03c
[None][fix] BREAKING CHANGE: Mismatch between docs and actual commands ( #6323 )
2025-08-14 03:48:57 -04:00
chenfeiz0326
5cd8c0f6cc
[None][test] Add perf-sweep scripts ( #6738 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-08-14 14:04:47 +08:00
NVJiangShao
a700646132
[None][fix] Add FP4 all2all unitest and fix a bug for module WideEPMoE ( #6784 )
...
Signed-off-by: Jiang Shao <91270701+StudyingShao@users.noreply.github.com>
2025-08-14 13:35:37 +08:00
Yan Chunwei
0132c1db84
[ https://nvbugs/5427043 ][fix] request length exceeds max_num_tokens ( #6821 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-14 13:31:12 +08:00
Bo Deng
d8acca495b
[TRTLLM-6675][infra] Cherry-pick https://github.com/NVIDIA/TensorRT-LLM/pull/6623 ( #6735 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-08-14 04:36:38 +00:00
jmydurant
4200fa46d1
[None][feat] Add support for Hopper MLA chunked prefill ( #6655 )
...
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
2025-08-14 10:39:26 +08:00
Izzy Putterman
ef53de8eef
[None][feat] Add test for speculative rejection sampler (2-model) ( #6542 )
...
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
2025-08-13 22:09:35 -04:00
Mike Iovine
7cba883932
[ https://nvbugs/5410399 ][chore] Unwaive mtp llmapi test ( #6833 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-08-13 17:38:45 -04:00
Emma Qiao
c7e6145409
[None][infra] Waive failed cases on main ( #6863 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-13 09:50:14 -04:00
Anthony Chang
2198587b35
[ https://nvbugs/5378031 ] [feat] Hopper W4A8 MoE supports ModelOpt ckpt for PyT backend ( #6200 )
...
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2025-08-13 21:24:40 +08:00
Yukun He
bc5f766e0e
[TRTLLM-4501][feat] AutoTuner tuning config refactor and valid tactic generalization. ( #6545 )
...
* Generalize the definition of tactics so that users can implement more customizable tactic types, making the configurations clearer for each kernel run.
* Allow the user not to specify the `gen_tuning_buckets` or the `map_to_tuning_buckets` function.
* Other code refactoring.
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-08-13 16:25:22 +08:00
Mike Iovine
f68e03e646
[ https://nvbugs/5452167 ][fix] Fix ngram padding issue ( #6837 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-08-13 11:23:16 +08:00