Frank
baf7eaa1cc
Add trtllm-bench reviewers. ( #5452 )
...
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-06-26 18:48:00 +08:00
tburt-nv
7d55c381fa
Revert "[infra] Report CI authorization errors to PR" ( #5298 )
2025-06-17 17:28:33 -04:00
tburt-nv
2df9f875cf
[infra] Report CI authorization errors to PR ( #5175 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-06-17 17:26:49 -04:00
Venky
59c9588e9a
enh(doc): Add ci-overview in docs/source/reference/ ( #5137 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-06-12 17:48:13 +08:00
Po-Wei (Vincent)
ad99a08fa2
[TRTLLM-5581][infra] Update Module Owners ( #5052 )
...
Signed-off-by: Po-Wei Wang (Vincent)
2025-06-12 09:38:42 +08:00
tburt-nv
ddfe4fceb3
[chore] 2025-06-10 update allowlist ( #5102 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-06-11 18:02:18 +08:00
Izzy Putterman
6cb2b7d370
CI: Allow run ( #5101 )
...
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
2025-06-11 06:03:38 +08:00
Po-Wei (Vincent)
9ae2ce6665
[TRTLLM-5502][infra] Add github action to identify if PR is from community ( #4824 )
...
Signed-off-by: Po-Wei Wang (Vincent)
2025-06-03 06:36:35 +08:00
juney-nvidia
fe359d9df9
Added code owners for AutoDeploy ( #4769 )
...
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-30 09:55:27 +08:00
juney-nvidia
7b2bb67491
Update CODEOWNERS for PyTorch backend - runtime component ( #4620 )
...
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-23 20:40:44 +08:00
Kevin Chen
b80b78f87c
Add pytorch backend team ( #4405 )
...
* Add pytorch backend team
Signed-off-by: Kevin Chen
* Update .github/CODEOWNERS
Co-authored-by: Yanchao Lu
Signed-off-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
---------
Signed-off-by: Kevin Chen
Signed-off-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: Yanchao Lu
2025-05-21 21:10:35 +08:00
tburt-nv
58bb34c460
[chore] update CI allowlist 2025-05-13 ( #4278 )
...
update allowlist
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-14 10:41:57 +08:00
Robin Kobus
b1bee9c394
Revert "Add initial list of CODEOWNERS ( #4105 )" ( #4234 )
...
This reverts commit aa7300e040.
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-12 16:53:49 +02:00
Robin Kobus
ba13b51a58
chore: Update CODEOWNERS ( #4221 )
...
Remove @funatiq and @dcampora
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-11 23:55:20 -07:00
Kevin Chen
aa7300e040
Add initial list of CODEOWNERS ( #4105 )
...
Signed-off-by: Kevin Chen <kevinch@nvidia.com>
2025-05-09 16:16:48 -07:00
Yanchao Lu
7175392206
[Infra] - Update code ownership rules for public APIs ( #4122 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-08 11:04:31 +08:00
Yanchao Lu
0446270f78
[Infra] - Update code ownership rules ( #4109 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-07 13:35:27 +08:00
tburt-nv
80b96cf910
update CI allowlist ( #3969 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-05 10:05:04 +08:00
tburt-nv
afb7d3adce
remove release branch codeowners ( #3954 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-30 11:59:42 +08:00
Dom Brown
8709fe8b53
chore: bump version to 0.19.0 ( #3598 ) ( #3841 )
...
test: add test cases for 0.19 release (#3608 )
* fix test name
* add quickstart test for nemotron-ultra
* add rcca multi-node test case for deepseek-v3
* add rcca info
---------
squash (#3642 )
fix: nvbugs/5187237: fix deterministic mode crash (#3448 )
* nvbugs/5187237 nvbugs/5112075: fix deterministic mode error
* remove waive
* Revert "remove waive"
This reverts commit 0bf5486d19906d692bfb7a6262333c296b0087ac.
* revert ar fusion
---------
update fp8 doc (#3647 )
tests: change qa perf test to trtllm-bench (#3619 )
fix: FP8 quantized lm_head (NvBug 5214229) (#3567 )
infra: Add PR approval protection for the release branch (#3634 )
fix: nvbugs/5231298: pytorch allreduce issue (#3673 )
Fix: nvbugs/5222698 variable not defined (#3630 )
* Fix: nvbugs/5222698 variable not defined
* Tidy code
---------
test:sync waives.txt from main branch by disabling test_perf/gpt_350m-cppmanager case (#3685 )
test:restore fp8 kv cache testing for L0 (#3671 )
doc: Update DeepSeek perf docs (#3693 )
* Update DeepSeek perf docs
* update
* Apply suggestions from code review
---------
tests: waive test_llm_multi_node (#3664 )
fix: update test_user_buffers_mm_add_prologue atol (#3711 )
Fix: cherry-pick hmac encryption from main branch (#3635 )
* security fix cherry-pick changes from main
* fix hmac in remote mpi session (#3649 )
---------
Un-waive DS-V3-Lite tests. (#3621 )
fix: FP8 kv accuracy (#3675 )
* fix FP8 kv accuracy
* update doc
---------
Fix script options for engines. (#3622 )
unwaive multi-node test (#3721 )
chore : Split more tests out of gpt tests (#3524 ) (#3674 )
doc:add torch examples link into torch backend documentation (#3749 )
test: Get Eagle tests working (#3593 ) (#3722 )
Waive L0 test (#3756 )
waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656 )
Update ds v3 parameters in stress test. (#3676 )
waive gemma on L20 (#3766 )
https://nvbugs/5141291: Fix convert.py script for Qwen model. (#3758 )
Include Qwen2VLDecoderLayer in the smooth_qwen2_model function.
fix: PP4 fixes and cleanup (#3688 )
remove benchmark test list (#3643 )
skip disagg deepseek test if sm!=90 (#3720 )
test: skip failed cases on B200 (#3710 )
* add skip condition to tests
* fix error
---------
test: [nvbug: 5234494] skip_pre_ada for fp8 cases (#3718 )
* skip_pre_ada for fp8 cases
* update
* update after rebase
---------
add know issue to deepseek doc. (#3800 )
Fix ModelOpt Mixtral AWQ OOM (#3714 ) (#3761 )
Waive L0 tests (#3826 )
fix: Reduce memory usage in fused moe op associated with AutoTuning and fix moe fallback issue. (#3793 )
* Reduce memory usage in fused moe op associated with AutoTuning.
* Replace pre-defined bucket size strategy with a generating function based on the tune_max_num_tokens.
* Add free_memory logic of workspace in min_latency_mode fused moe path.
* Fix fused_moe fallback issue. (#3652 )
min_latency_mode is only set to False during warmup phase. Thus when it becomes true during inference, all tactics fall back to the default one and thus cause perf regression.
---------
[doc] Better document for Draft-Target-Model (DTM) speculative decoding (#3797 )
Fix pre-commit
Fix again
Address some review comments for the MI
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-04-29 16:57:22 +08:00
bhsueh_NV
c4d86b267c
chore: add pull request template ( #3760 )
...
* add pull request template
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix pre-commit issue
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
---------
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-04-23 10:21:31 +08:00
Yiteng Niu
ca88674210
update user list ( #3614 )
...
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-04-16 15:13:29 +08:00
tburt-nv
5616c0d232
add precommit check to github actions ( #3129 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-11 06:40:53 +08:00
tburt-nv
8d164f40d7
update allowlist ( #3428 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-10 06:41:40 +08:00
tburt-nv
3a8443f1e1
extend allowlist ( #3379 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-09 11:10:42 +08:00
Zhanrui Sun
c692474b59
infra: Fix bot help error when " in bot command ( #3314 )
...
* Fix bot help error when " in bot command
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Delete a.txt
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
---------
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-08 18:16:05 +08:00
Zhanrui Sun
bd75ec02f2
Fix bot check error when triggered by pull request ( #3268 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-04-03 21:47:05 +08:00
Zhanrui Sun
67e9f99d46
infra: [TRTLLM-4308] Add Bot help ( #3192 )
...
* Add bot command help and check bot command
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Fix permission error
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Fix add comment
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Fix
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Fix review
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Update bot-command.yml
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
* Update .github/workflows/bot-command.yml
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
* Fix pre-commit
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
---------
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-03 17:48:25 +08:00
Yiteng Niu
c725f1043f
update user list ( #3193 )
...
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-04-01 16:41:15 +08:00
Yiteng Niu
3aae124a00
infra: update concurrency control ( #3120 )
...
* update concurrency control
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
* Update .github/workflows/blossom-ci.yml
Co-authored-by: tburt-nv <195370667+tburt-nv@users.noreply.github.com>
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
---------
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
Co-authored-by: tburt-nv <195370667+tburt-nv@users.noreply.github.com>
2025-03-30 23:28:50 +08:00
tburt-nv
e68749ca1e
2025-03-25 update CI allowlist ( #3074 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
2025-03-26 08:13:01 +08:00
Yiteng Niu
cb11c10719
add ratelimit in workflow ( #3001 )
...
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-03-24 15:54:11 +08:00
Yiteng Niu
37644e22bc
update approver list ( #2994 )
...
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-03-24 12:51:27 +08:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
tburt-nv
c2ac9e6269
update github workflow ( #2943 )
...
cherry-picks aa1c52f
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-03-18 22:20:46 -04:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
...
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
niukuo
aa1c52fa26
update github workflow
2025-03-17 23:11:07 +08:00
Yiteng Niu
c384d26736
migrate to l0-test.yml ( #2858 )
...
Signed-off-by: niukuo <6831097+niukuo@users.noreply.github.com>
2025-03-06 15:24:40 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM ( #2849 )
...
* Update TensorRT-LLM
---------
Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
tburt-nv
0bcfdca6aa
Use NVIDIA-gha runners to collect test results ( #2830 )
...
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2025-02-27 23:02:02 -05:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM ( #2820 )
2025-02-25 21:21:49 +08:00
tburt-nv
5c794e3714
allow build command arguments ( #2808 )
...
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2025-02-21 10:38:49 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM ( #2755 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
2025-02-11 03:01:00 +00:00
Kaiyu Xie
be17881062
Update TensorRT-LLM ( #2582 )
2024-12-16 21:50:47 -08:00
Kaiyu Xie
b171e87956
Add issue triage workflows ( #2566 )
2024-12-11 09:27:40 -08:00
Kaiyu Xie
aaacc9bd68
Update TensorRT-LLM ( #2562 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Starrick Liu <73152103+StarrickLiu@users.noreply.github.com>
2024-12-11 00:31:05 -08:00
Kevin Chen
340a1b62fc
Add issue triage workflows ( #2498 )
...
Signed-off-by: Kevin Chen <kevinch@nvidia.com>
2024-12-04 23:50:46 +08:00
niukuo
c994b69731
blossom-ci.yml: run vulnerability scan on ubuntu
2024-11-29 00:47:11 -08:00
niukuo
af3d49ce53
update blossom-ci.yml
2024-11-28 23:43:11 -08:00
niukuo
ae640fd376
Add blossom-ci.yml ( #2512 )
2024-11-29 15:01:26 +08:00
Kaiyu Xie
bf0a5afc92
Update TensorRT-LLM ( #1598 )
...
* Update TensorRT-LLM
2024-05-14 16:43:41 +08:00
Kaiyu Xie
89ba1b1a67
Update TensorRT-LLM ( #1554 )
2024-05-07 23:34:28 +08:00
Kaiyu Xie
06c0e9b1ec
Update TensorRT-LLM ( #1530 )
2024-04-30 17:19:10 +08:00
Kaiyu Xie
c89653021e
Update TensorRT-LLM (20240116) ( #891 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Eddie-Wang1120 <81598289+Eddie-Wang1120@users.noreply.github.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-16 20:03:11 +08:00
juney-nvidia
6cc5e177ff
Update issue templates
2024-01-03 16:22:51 +08:00
juney-nvidia
a413d132b8
Update issue templates
2024-01-03 16:22:03 +08:00