Yuan Tong
32b244af38
feat: reduce unnecessary kernel generation ( #5476 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-07-04 14:37:49 +08:00
Alessio Netti
7e681fbe52
[chore] Allow configuring linking of NVRTC wrapper ( #5189 )
...
Signed-off-by: Alessio Netti <netti.alessio@gmail.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-06-26 07:26:10 +02:00
qsang-nv
faca19c2f0
update setup.py for special cases ( #5227 )
...
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-17 16:41:07 +08:00
dongxuy04
1e369658f1
feat: large-scale EP(part 6: Online EP load balancer integration for GB200 nvfp4) ( #4818 )
...
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-06-08 10:25:18 +08:00
Emma Qiao
202813f054
Check test names in waive list ( #4292 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-01 14:39:30 +08:00
Emma Qiao
c945e92fdb
[Infra]Remove some old keyword ( #4552 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-31 13:50:45 +08:00
Yuan Tong
5cb4f9be33
feat: improve build_wheel.py venv handling ( #4525 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-27 15:17:14 +08:00
Emma Qiao
6f626af386
[TRTLLM-4535][infra]: Add marker TIMEOUT for test level ( #3905 )
...
* Add marker for TIMEOUT
Signed-off-by: qqiao <qqiao@nvidia.com>
* Remove workspace after tests
Signed-off-by: qqiao <qqiao@nvidia.com>
* Add missed property
Signed-off-by: qqiao <qqiao@nvidia.com>
* Add some debug info
Signed-off-by: qqiao <qqiao@nvidia.com>
* Fix errors
Signed-off-by: qqiao <qqiao@nvidia.com>
* Testing
Signed-off-by: qqiao <qqiao@nvidia.com>
* Special process for unittests
Signed-off-by: qqiao <qqiao@nvidia.com>
* Move special proecessing unittests to test generating stage
Signed-off-by: qqiao <qqiao@nvidia.com>
* Process for the whole test list
Signed-off-by: qqiao <qqiao@nvidia.com>
* Test more
Signed-off-by: qqiao <qqiao@nvidia.com>
* Add another test case
Signed-off-by: qqiao <qqiao@nvidia.com>
* Change back the setting for testing
Signed-off-by: qqiao <qqiao@nvidia.com>
* Revert another config file
Signed-off-by: qqiao <qqiao@nvidia.com>
* Add descriptionf or timeout in test readme
Signed-off-by: qqiao <qqiao@nvidia.com>
---------
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-25 23:30:40 -07:00
Shi Xiaowei
3d62727303
test: NIXL single process test ( #4486 )
2025-05-21 10:41:46 +08:00
Shi Xiaowei
df2798e0c3
feat: NIXL interface integration ( #3934 )
...
NIXL interfaces
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-05-19 18:18:22 +08:00
Yuan Tong
593f65ff6a
fix: better method to help torch find nvtx3 ( #4110 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-15 16:42:30 +08:00
Zhanrui Sun
5dc3b539ba
infra: Down the gcc toolset version from 13 to 11 ( #4114 )
...
* Down the gcc toolset version from 13 to 11
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
* Update rocky8 images
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
---------
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-05-15 11:08:51 +08:00
qsang-nv
0fd59d64ab
infra: open source fmha v2 kernels ( #4185 )
...
* add fmha repo
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix code style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header kernel_traits.h
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add .gitignore file
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add SLIDING_WINDOW_ATTENTION
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update setup.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update build_wheel.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
---------
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
Signed-off-by: qsang-nv <200703406+qsang-nv@users.noreply.github.com>
2025-05-15 10:56:34 +08:00
Yanchao Lu
504f4bf779
[Infra] - Update the upstream PyTorch dependency to 2.7.0 ( #4235 )
...
[Infra][TRTLLM-4941] - Update the upstream PyTorch dependency to 2.7.0
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-14 22:28:13 +08:00
Yiqing Yan
fda8b0277a
[Infra][TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04 ( #4049 )
...
* [TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* fix review
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update images
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* Update jenkins/L0_Test.groovy
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update image name
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
---------
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-13 14:59:12 +08:00
Martin Marciniszyn Mehringer
33977dbd42
infra: [TRTLLM-325] Prepare for NGC release - multiplatform build ( #4191 )
...
* infra: [TRTLLM-325] Prepare for NGC release - prepare multiplatform build
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-05-12 00:38:45 -07:00
Martin Marciniszyn Mehringer
d0e672f96d
chore: [TRTLLM-325][infra] Prepare for NGC release - reduce size of the docker images ( #3990 )
...
* chore: reduce size of the docker images
Signed-off-by: Martin Marciniszyn Mehringer <11665257+martinmarciniszyn@users.noreply.github.com>
* Finish the renaming script and run with new images.
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
* Fix installation of GCC toolset for Rocky Linux
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
* Upgrade to new docker images
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
---------
Signed-off-by: Martin Marciniszyn Mehringer <11665257+martinmarciniszyn@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-05-09 19:31:29 +08:00
Yuan Tong
4b6c19737b
feat: support add internal cutlass kernels as subproject ( #3658 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-06 11:35:07 +08:00
tburt-nv
0aca05514a
build: keep using system python for dev install ( #4014 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-03 07:38:47 +02:00
tburt-nv
7053d0ad5a
infra: add conan ( #3744 )
...
This MR integrates Conan into the build system, so that it can be used to fetch dependencies in future changes.
Also installs all requirements-dev.txt inside a virtualenv instead of the system, since some of Conan's dependencies may conflict with the system packages. Virtualenv is used instead of venv because the triton server backend container has only virtualenv installed. This also allows developers to cache the requirements-dev.txt packages between container launches.
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-30 11:53:14 -07:00
Ming Wei
ed887940d4
infra: open source XQA kernels ( #3762 )
...
Replace libtensorrt_llm_nvrtc_wrapper.so with its source code, which
consists of two parts:
1. NVRTC glue code
2. XQA kernel code
During TensorRT-LLM build, XQA kernel code is embedded as C++ arries via
gen_cpp_header.py and passed to NVRTC for JIT compilation.
Signed-off-by: Ming Wei <2345434+ming-wei@users.noreply.github.com>
2025-04-30 18:05:15 +08:00
Emma Qiao
48db263d9a
infra: Add test list name check ( #3097 )
...
* Add steps to check test names
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Correct test-db command
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Switch to use a trt-llm image
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Update go path
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Correct go path
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Move the test list check to test ci
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Correct file path
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix path again
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix get path
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix typo
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Skip test list check for ARM
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix expression
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Change back unrelated file
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Correct qa test names
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Remove a stage
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Update jenkins/L0_Test.groovy
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Move some steps to a python script
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix script path
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Split commands and debug
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix typo
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix typo
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Also correct case name in waives list
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Move check script to another folder
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Update qa list after rebase
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Fix rebase
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Remove the perf tests under QA
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Some tests already fixed after rebase to TOT
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
---------
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-20 23:02:16 +08:00
Zheng Duan
bce7ea8c38
test: add kv cache event tests for disagg workers ( #3602 )
2025-04-18 18:30:19 +08:00
Yuan Tong
0b0e6d8a0a
refactor: Clean up CMakeLists.txt ( #3479 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-04-18 14:39:29 +08:00
Emma Qiao
2f48985b9c
infra: Add step to generate new duration file ( #3298 )
...
* Add step to generate new duration file
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Install python in earlier step
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Clone repo and add debug info
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Remove debug info and only generate duration for post-merge
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Test for the new duration file
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Update the duration file format
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
* Move generate_duration.py to scripts folder and add try-catch avoiding any broken
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
---------
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
2025-04-18 12:56:31 +08:00
Kaiyu Xie
258ae9c58c
Revert "infra: move nvrtc_wrapper to conan ( #3282 )" ( #3573 )
...
This reverts commit c0dd6cbce0 .
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-15 22:45:13 +08:00
tburt-nv
c0dd6cbce0
infra: move nvrtc_wrapper to conan ( #3282 )
...
* add pip scripts dir to path
* move nvrtc_wrapper to conan
* support building nvrtc wrapper from source
---------
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-15 05:31:01 +08:00
Yuan Tong
5985d362a9
fix: install RTC headers with linking when using --linking_install_binary ( #3484 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-04-14 21:22:03 +08:00
tburt-nv
5616c0d232
add precommit check to github actions ( #3129 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-11 06:40:53 +08:00
Gabriel Wu
376731013d
feat: use NVRTC for DeepGEMM JIT compilation ( #3239 )
...
* feat: use NVRTC for DeepGEMM JIT compilation
Signed-off-by: Zihua Wu
* fix: add license
Signed-off-by: Zihua Wu
* feat: store NVRTC JIT results in memory by default
Signed-off-by: Zihua Wu
* feat: refinement
Signed-off-by: Zihua Wu
* feat: refinement
Signed-off-by: Zihua Wu
* test: set timeout to 7200
Signed-off-by: Zihua Wu
---------
Signed-off-by: Zihua Wu
2025-04-07 20:29:23 +08:00
Ming Wei
ca6615d800
Remove gen_cuda_headers_for_xqa.py ( #3222 )
...
No longer needed.
2025-04-03 07:13:22 +08:00
xiweny
6979afa6f2
test: reorganize tests folder hierarchy ( #2996 )
...
1. move TRT path tests to 'trt' folder
2. optimize some import usage
2025-03-27 12:07:53 +08:00
Kaiyu Xie
2631f21089
Update ( #2978 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
...
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Kaiyu Xie
9b931c0f63
Update TensorRT-LLM ( #2873 )
2025-03-11 21:13:42 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM ( #2820 )
2025-02-25 21:21:49 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM ( #2755 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
2025-02-11 03:01:00 +00:00
石晓伟
548b5b7310
Update TensorRT-LLM ( #2532 )
...
* blossom-ci.yml: run vulnerability scan on blossom
* open source efb18c1256f8c9c3d47b7d0c740b83e5d5ebe0ec
---------
Co-authored-by: niukuo <6831097+niukuo@users.noreply.github.com>
Co-authored-by: pei0033 <59505847+pei0033@users.noreply.github.com>
Co-authored-by: Kyungmin Lee <30465912+lkm2835@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2024-12-04 21:16:56 +08:00
Kaiyu Xie
b7868dd1bd
Update TensorRT-LLM ( #2413 )
2024-11-05 16:27:06 +08:00
Kaiyu Xie
1730a587d8
Update TensorRT-LLM ( #2363 )
...
* Update TensorRT-LLM
---------
Co-authored-by: tonylek <137782967+tonylek@users.noreply.github.com>
2024-10-22 20:27:35 +08:00
Kaiyu Xie
8681b3a4c0
open source 4dbf696ae9b74a26829d120b67ab8443d70c8e58 ( #2297 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Bhuvanesh Sridharan <bhuvanesh.sridharan@sprinklr.com>
Co-authored-by: Qingquan Song <ustcsqq@gmail.com>
2024-10-08 12:19:19 +02:00
石晓伟
b8fc6633ba
Update TensorRT-LLM ( #2156 )
...
Co-authored-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
2024-08-27 18:20:59 +08:00
石晓伟
32ed92e449
Update TensorRT-LLM
...
Co-authored-by: Rong Zhou <130957722+ReginaZh@users.noreply.github.com>
Co-authored-by: Onur Galoglu <33498883+ogaloglu@users.noreply.github.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
2024-08-20 18:55:15 +08:00
Kaiyu Xie
74b324f667
Update TensorRT-LLM ( #2110 )
2024-08-13 22:34:33 +08:00
Kaiyu Xie
bca9a33b02
Update TensorRT-LLM ( #2008 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Timur Abishev <abishev.timur@gmail.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
Co-authored-by: Saeyoon Oh <saeyoon.oh@furiosa.ai>
Co-authored-by: hattizai <hattizai@gmail.com>
2024-07-23 23:05:09 +08:00
石晓伟
2a115dae84
Update TensorRT-LLM ( #1793 )
...
Co-authored-by: DreamGenX <x@dreamgen.com>
Co-authored-by: Ace-RR <78812427+Ace-RR@users.noreply.github.com>
Co-authored-by: bprus <39293131+bprus@users.noreply.github.com>
Co-authored-by: janpetrov <janpetrov@icloud.com>
2024-06-18 18:18:23 +08:00
Kaiyu Xie
db4edea1e1
Update TensorRT-LLM ( #1763 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Kota Tsuyuzaki <bloodeagle40234@gmail.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
2024-06-11 16:59:02 +08:00
Kaiyu Xie
b777bd6475
Update TensorRT-LLM ( #1725 )
...
* Update TensorRT-LLM
---------
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Tlntin <TlntinDeng01@Gmail.com>
Co-authored-by: ZHENG, Zhen <zhengzhen.z@qq.com>
Co-authored-by: Pham Van Ngoan <ngoanpham1196@gmail.com>
Co-authored-by: Nathan Price <nathan@abridge.com>
Co-authored-by: Tushar Goel <tushar.goel.ml@gmail.com>
Co-authored-by: Mati <132419219+matichon-vultureprime@users.noreply.github.com>
2024-06-04 20:26:32 +08:00
Kaiyu Xie
f430a4b447
Update TensorRT-LLM ( #1688 )
...
* Update TensorRT-LLM
---------
Co-authored-by: IbrahimAmin <ibrahimamin532@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: CoderHam <hemant@cohere.com>
Co-authored-by: Konstantin Lopuhin <kostia.lopuhin@gmail.com>
2024-05-28 20:07:49 +08:00
Kaiyu Xie
bf0a5afc92
Update TensorRT-LLM ( #1598 )
...
* Update TensorRT-LLM
2024-05-14 16:43:41 +08:00
Kaiyu Xie
89ba1b1a67
Update TensorRT-LLM ( #1554 )
2024-05-07 23:34:28 +08:00
Kaiyu Xie
06c0e9b1ec
Update TensorRT-LLM ( #1530 )
2024-04-30 17:19:10 +08:00
Kaiyu Xie
66ef1df492
Update TensorRT-LLM ( #1492 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Loki <lokravi@amazon.com>
2024-04-24 14:44:22 +08:00
Kaiyu Xie
035b99e0d0
Update TensorRT-LLM ( #1427 )
...
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
2024-04-09 17:03:34 +08:00
石晓伟
850b6fa1e7
Update TensorRT-LLM ( #1358 )
...
Co-authored-by: Kaiyu <26294424+kaiyux@users.noreply.github.com>
2024-03-26 20:47:14 +08:00
Kaiyu Xie
4bb65f216f
Update TensorRT-LLM ( #1274 )
...
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-03-12 18:15:52 +08:00
Kaiyu Xie
0f041b7b57
Update TensorRT-LLM ( #1098 )
...
* Update TensorRT-LLM
* update submodule
* Remove unused binaries
2024-02-18 15:48:08 +08:00
Kaiyu Xie
0ab9d17a59
Update TensorRT-LLM ( #1055 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-02-06 18:38:07 +08:00
Kaiyu Xie
e06f537e08
Update TensorRT-LLM ( #1019 )
...
* Update TensorRT-LLM
---------
Co-authored-by: erenup <ping.nie@pku.edu.cn>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-31 21:55:32 +08:00
Kaiyu Xie
c89653021e
Update TensorRT-LLM (20240116) ( #891 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Eddie-Wang1120 <81598289+Eddie-Wang1120@users.noreply.github.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-16 20:03:11 +08:00
Kaiyu Xie
deaae40bd7
Update TensorRT-LLM ( #787 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-01-02 17:54:32 +08:00
Kaiyu Xie
a75618df24
Update TensorRT-LLM ( #667 )
...
* Update TensorRT-LLM
---------
Co-authored-by: 0xymoro <jerrymeng100@gmail.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2023-12-15 22:14:51 +08:00
Kaiyu Xie
71f60f6df0
Update TensorRT-LLM ( #524 )
2023-12-01 22:27:51 +08:00
Kaiyu Xie
711a28d9bf
Update TensorRT-LLM ( #465 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2023-11-24 22:12:26 +08:00
Kaiyu Xie
6755a3f077
Update TensorRT-LLM ( #422 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Tltin <TltinDeng01@gmail.com>
Co-authored-by: zhaohb <zhaohbcloud@126.com>
Co-authored-by: Bradley Heilbrun <brad@repl.it>
Co-authored-by: nqbao11 <nqbao11.01@gmail.com>
Co-authored-by: Nikhil Varghese <nikhil@bot-it.ai>
2023-11-18 00:05:54 +08:00
Kaiyu Xie
b2fd493c16
Update TensorRT-LLM ( #349 )
...
* Update TensorRT-LLM
---------
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2023-11-10 22:30:31 +08:00
Kaiyu Xie
f044eb8d94
Update TensorRT-LLM ( #302 )
...
* Update TensorRT-LLM
---------
Co-authored-by: wangruohui <12756472+wangruohui@users.noreply.github.com>
2023-11-07 19:51:58 +08:00
Kaiyu Xie
75b6210ff4
Kaiyu/update main ( #5 )
...
* Update
* Update
2023-10-18 22:38:53 +08:00
Kevin Xie
6e9e318e91
Update code
2023-09-28 09:00:05 -07:00
Kaiyu Xie
23bc5b7c49
Initial commit
2023-09-20 00:29:41 -07:00