yufeiwu-nv
|
f323b74d42
|
[None][test] Update llm_models_root to improve path handling on BareMetal environment (#7876)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-09-24 17:35:57 +08:00 |
|
amitz-nv
|
750d15bfaa
|
[https://nvbugs/5503529][fix] Change test_llmapi_example_multilora to get adapters path from cmd line to avoid downloading from HF (#7740)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-09-16 16:35:13 +08:00 |
|
Yan Chunwei
|
612c26be22
|
[None][doc] add legacy section for tensorrt engine (#6724)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-01 11:02:31 +08:00 |
|
Richard Huo
|
ce580ce4f5
|
[None][feat] KV Cache Connector API (#7228)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Signed-off-by: richardhuo-nv <rihuo@nvidia.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
|
2025-08-28 23:09:27 -04:00 |
|
dominicshanshan
|
6f245ec78b
|
[None][chore] Mass integration of release/1.0 (#6864)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: raayandhar <rdhar@nvidia.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Bo Deng <deemod@nvidia.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: 2ez4bz <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Raayan Dhar <58057652+raayandhar@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-08-22 09:25:15 +08:00 |
|
Erin
|
9522cde464
|
fix: NVBug 5385576 py_batch_idx issue (#6153)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-07-18 22:36:43 +08:00 |
|
Erin
|
de60ae47e3
|
chores: unwaive a few tests for v1.0 (#6107)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-07-17 17:59:51 +08:00 |
|
Zhanrui Sun
|
3a0ef73414
|
infra: [TRTLLM-6242] install cuda-toolkit to fix sanity check (#5709)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-07-14 18:52:13 +09:00 |
|
Yan Chunwei
|
9c673e9707
|
[TRTLLM-6160] chore: add sampling examples for pytorch (#5951)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-07-14 15:28:32 +09:00 |
|
Yan Chunwei
|
c30eead09f
|
[TRTLLM-6164][TRTLLM-6165] chore: add runtime example for pytorch (#5956)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-07-14 14:09:39 +08:00 |
|
Yan Chunwei
|
e50d95c40d
|
chore [TRTLLM-6161]: add LLM speculative decoding example (#5706)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-07-09 07:33:11 +08:00 |
|
Yan Chunwei
|
a5eff139f1
|
[TRTLLM-5277] chore: refine llmapi examples for 1.0 (part1) (#5431)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-07-01 19:06:41 +08:00 |
|
Yan Chunwei
|
9bd42ecf9b
|
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-06-20 03:01:10 +08:00 |
|
Jhao-Ting Chen
|
fcadce9f8d
|
[fix] Eagle-2 LLMAPI pybind argument fix. (#3967)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
|
2025-05-29 12:23:25 -07:00 |
|
Yan Chunwei
|
bc0cf41592
|
chore: refactor llmapi e2e tests (#3803)
* refactor llmapi e2e tests
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* fix
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-05-05 07:37:24 +08:00 |
|