Commit Graph

53 Commits

Author SHA1 Message Date
Venky
59c9588e9a
enh(doc): Add ci-overview in docs/source/reference/ (#5137)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-06-12 17:48:13 +08:00
Po-Wei (Vincent)
ad99a08fa2
[TRTLLM-5581][infra] Update Module Owners (#5052)
Signed-off-by: Po-Wei Wang (Vincent)
2025-06-12 09:38:42 +08:00
tburt-nv
ddfe4fceb3
[chore] 2025-06-10 update allowlist (#5102)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-06-11 18:02:18 +08:00
Izzy Putterman
6cb2b7d370
CI: Allow run (#5101)
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
2025-06-11 06:03:38 +08:00
Po-Wei (Vincent)
9ae2ce6665
[TRTLLM-5502][infra] Add github action to identify if PR is from community (#4824)
Signed-off-by: Po-Wei Wang (Vincent)
2025-06-03 06:36:35 +08:00
juney-nvidia
fe359d9df9
Added code owners for AutoDeploy (#4769)
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-30 09:55:27 +08:00
juney-nvidia
7b2bb67491
Update CODEOWNERS for PyTorch backend - runtime component (#4620)
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-23 20:40:44 +08:00
Kevin Chen
b80b78f87c
Add pytorch backend team (#4405)
* Add pytorch backend team

Signed-off-by: Kevin Chen 

* Update .github/CODEOWNERS

Co-authored-by: Yanchao Lu 
Signed-off-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>

---------

Signed-off-by: Kevin Chen 
Signed-off-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: Yanchao Lu
2025-05-21 21:10:35 +08:00
tburt-nv
58bb34c460
[chore] update CI allowlist 2025-05-13 (#4278)
update allowlist

Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-14 10:41:57 +08:00
Robin Kobus
b1bee9c394
Revert "Add initial list of CODEOWNERS (#4105)" (#4234)
This reverts commit aa7300e040.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-12 16:53:49 +02:00
Robin Kobus
ba13b51a58
chore: Update CODEOWNERS (#4221)
Remove @funatiq and @dcampora

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-11 23:55:20 -07:00
Kevin Chen
aa7300e040
Add initial list of CODEOWNERS (#4105)
Signed-off-by: Kevin Chen <kevinch@nvidia.com>
2025-05-09 16:16:48 -07:00
Yanchao Lu
7175392206
[Infra] - Update code ownership rules for public APIs (#4122)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-08 11:04:31 +08:00
Yanchao Lu
0446270f78
[Infra] - Update code ownership rules (#4109)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-07 13:35:27 +08:00
tburt-nv
80b96cf910
update CI allowlist (#3969)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-05 10:05:04 +08:00
tburt-nv
afb7d3adce
remove release branch codeowners (#3954)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-30 11:59:42 +08:00
Dom Brown
8709fe8b53
chore: bump version to 0.19.0 (#3598) (#3841)
test: add test cases for 0.19 release (#3608)

* fix test name

* add quickstart test for nemotron-ultra

* add rcca multi-node test case for deepseek-v3

* add rcca info

---------

squash (#3642)

fix: nvbugs/5187237: fix deterministic mode crash (#3448)

* nvbugs/5187237 nvbugs/5112075: fix deterministic mode error

* remove waive

* Revert "remove waive"

This reverts commit 0bf5486d19906d692bfb7a6262333c296b0087ac.

* revert ar fusion

---------

update fp8 doc (#3647)

tests: change qa perf test to trtllm-bench (#3619)

fix: FP8 quantized lm_head (NvBug 5214229) (#3567)

infra: Add PR approval protection for the release branch (#3634)

fix: nvbugs/5231298: pytorch allreduce issue (#3673)

Fix: nvbugs/5222698 variable not defined (#3630)

* Fix: nvbugs/5222698 variable not defined

* Tidy code

---------

test:sync waives.txt from main branch by disabling test_perf/gpt_350m-cppmanager case (#3685)

test:restore fp8 kv cache testing for L0 (#3671)

doc: Update DeepSeek perf docs (#3693)

* Update DeepSeek perf docs

* update

* Apply suggestions from code review

---------

tests: waive test_llm_multi_node (#3664)

fix: update test_user_buffers_mm_add_prologue atol (#3711)

Fix: cherry-pick hmac encryption from main branch (#3635)

* security fix cherry-pick changes from main

* fix hmac in remote mpi session (#3649)

---------

Un-waive DS-V3-Lite tests. (#3621)

fix: FP8 kv accuracy (#3675)

* fix FP8 kv accuracy

* update doc

---------

Fix script options for engines. (#3622)

unwaive multi-node test (#3721)

chore : Split more tests out of gpt tests (#3524) (#3674)

doc:add torch examples link into torch backend documentation (#3749)

test: Get Eagle tests working (#3593) (#3722)

Waive L0 test (#3756)

waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656)

Update ds v3 parameters in stress test. (#3676)

waive gemma on L20 (#3766)

https://nvbugs/5141291: Fix convert.py script for Qwen model. (#3758)

Include Qwen2VLDecoderLayer in the smooth_qwen2_model function.

fix: PP4 fixes and cleanup (#3688)

remove benchmark test list (#3643)

skip disagg deepseek test if sm!=90 (#3720)

test: skip failed cases on B200 (#3710)

* add skip condition to tests

* fix error

---------

test: [nvbug: 5234494] skip_pre_ada for fp8 cases (#3718)

* skip_pre_ada for fp8 cases

* update

* update after rebase

---------

add know issue to deepseek doc. (#3800)

Fix ModelOpt Mixtral AWQ OOM (#3714) (#3761)

Waive L0 tests (#3826)

fix: Reduce memory usage in fused moe op associated with AutoTuning and fix moe fallback issue. (#3793)

* Reduce memory usage in fused moe op associated with AutoTuning.
* Replace pre-defined bucket size strategy with a generating function based on the tune_max_num_tokens.
* Add free_memory logic of workspace in min_latency_mode fused moe path.

* Fix fused_moe fallback issue. (#3652)

min_latency_mode is only set to False during warmup phase. Thus when it becomes true during inference, all tactics fall back to the default one and thus cause perf regression.

---------

[doc] Better document for Draft-Target-Model (DTM) speculative decoding (#3797)

Fix pre-commit

Fix again

Address some review comments for the MI

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-04-29 16:57:22 +08:00
bhsueh_NV
c4d86b267c
chore: add pull request template (#3760)
* add pull request template

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix pre-commit issue

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

---------

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-04-23 10:21:31 +08:00
Yiteng Niu
ca88674210
update user list (#3614)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-04-16 15:13:29 +08:00
tburt-nv
5616c0d232
add precommit check to github actions (#3129)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-11 06:40:53 +08:00
tburt-nv
8d164f40d7
update allowlist (#3428)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-10 06:41:40 +08:00
tburt-nv
3a8443f1e1
extend allowlist (#3379)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-09 11:10:42 +08:00
Zhanrui Sun
c692474b59
infra: Fix bot help error when " in bot command (#3314)
* Fix bot help error when " in bot command

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Delete a.txt

Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>

---------

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-08 18:16:05 +08:00
Zhanrui Sun
bd75ec02f2
Fix bot check error when triggered by pull request (#3268)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-04-03 21:47:05 +08:00
Zhanrui Sun
67e9f99d46
infra: [TRTLLM-4308] Add Bot help (#3192)
* Add bot command help and check bot command

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Fix permission error

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Fix add comment

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Fix

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Fix review

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Update bot-command.yml

Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>

* Update .github/workflows/bot-command.yml

Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>

* Fix pre-commit

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

---------

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-03 17:48:25 +08:00
Yiteng Niu
c725f1043f
update user list (#3193)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-04-01 16:41:15 +08:00
Yiteng Niu
3aae124a00
infra: update concurrency control (#3120)
* update concurrency control

Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>

* Update .github/workflows/blossom-ci.yml

Co-authored-by: tburt-nv <195370667+tburt-nv@users.noreply.github.com>
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>

---------

Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
Co-authored-by: tburt-nv <195370667+tburt-nv@users.noreply.github.com>
2025-03-30 23:28:50 +08:00
tburt-nv
e68749ca1e
2025-03-25 update CI allowlist (#3074)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
2025-03-26 08:13:01 +08:00
Yiteng Niu
cb11c10719
add ratelimit in workflow (#3001)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-03-24 15:54:11 +08:00
Yiteng Niu
37644e22bc
update approver list (#2994)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-03-24 12:51:27 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
tburt-nv
c2ac9e6269
update github workflow (#2943)
cherry-picks aa1c52f

Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-03-18 22:20:46 -04:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM

---------

Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
niukuo
aa1c52fa26
update github workflow
2025-03-17 23:11:07 +08:00
Yiteng Niu
c384d26736
migrate to l0-test.yml (#2858)
Signed-off-by: niukuo <6831097+niukuo@users.noreply.github.com>
2025-03-06 15:24:40 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM

---------

Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
tburt-nv
0bcfdca6aa
Use NVIDIA-gha runners to collect test results (#2830)
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2025-02-27 23:02:02 -05:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM (#2820)
2025-02-25 21:21:49 +08:00
tburt-nv
5c794e3714
allow build command arguments (#2808)
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2025-02-21 10:38:49 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM

---------

Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>

Update
2025-02-11 03:01:00 +00:00
Kaiyu Xie
be17881062
Update TensorRT-LLM (#2582)
2024-12-16 21:50:47 -08:00
Kaiyu Xie
b171e87956
Add issue triage workflows (#2566)
2024-12-11 09:27:40 -08:00
Kaiyu Xie
aaacc9bd68
Update TensorRT-LLM (#2562)
* Update TensorRT-LLM

---------

Co-authored-by: Starrick Liu <73152103+StarrickLiu@users.noreply.github.com>
2024-12-11 00:31:05 -08:00
Kevin Chen
340a1b62fc
Add issue triage workflows (#2498)
Signed-off-by: Kevin Chen <kevinch@nvidia.com>
2024-12-04 23:50:46 +08:00
niukuo
c994b69731
blossom-ci.yml: run vulnerability scan on ubuntu
2024-11-29 00:47:11 -08:00
niukuo
af3d49ce53
update blossom-ci.yml
2024-11-28 23:43:11 -08:00
niukuo
ae640fd376
Add blossom-ci.yml (#2512)
2024-11-29 15:01:26 +08:00
Kaiyu Xie
bf0a5afc92
Update TensorRT-LLM (#1598)
* Update TensorRT-LLM
2024-05-14 16:43:41 +08:00
Kaiyu Xie
89ba1b1a67
Update TensorRT-LLM (#1554)
2024-05-07 23:34:28 +08:00
Kaiyu Xie
06c0e9b1ec
Update TensorRT-LLM (#1530)
2024-04-30 17:19:10 +08:00