Chang Liu
|
31bc14b350
|
[TRTLLM-9654][feat] Support DeepSeek-V32 chat template (#9814)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-12-19 17:05:38 +08:00 |
|
Enwei Zhu
|
a3455f55c7
|
[None][chore] Fix trtllm-eval and move GroupedGemmInputsHelper (#9612)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-12-03 07:55:03 +08:00 |
|
Venky
|
639c939a4f
|
[TRTC-1943][feat] Env vars override support in LLM API (#9104)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2025-12-01 10:04:49 -08:00 |
|
brb-nv
|
f61067cbb5
|
[None][chore] Defer exposing context parallel configs (#9552)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-01 09:50:02 -08:00 |
|
brb-nv
|
b77f4ffe54
|
[TRTLLM-5971][feat] Integrate helix parallelism (#9342)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-11-29 15:17:30 -08:00 |
|
Aurelien Chartier
|
f2f197360d
|
[#9463][feat] Add revision option to trtllm commands (#9498)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-11-27 09:30:01 +08:00 |
|
Fanrong Li
|
c36f144591
|
[None][chore] Fix trtllm-eval for PyTorchLLM (#9427)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-11-25 04:49:03 -08:00 |
|
Anish Shanbhag
|
a09b38a862
|
[TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2025-10-28 09:17:26 -07:00 |
|
Chao Ni
|
0019d99e6d
|
[None][test] Add longbench v2 for long context evaluation (#8604)
Signed-off-by: mni <125171826+baize97@users.noreply.github.com>
|
2025-10-27 20:01:14 +08:00 |
|
Yuan Tong
|
f050b8d871
|
[None][fix] refine backend option handling for commands (#7829)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-09-24 10:54:33 +08:00 |
|
Wanli Jiang
|
a7ca0fff54
|
[TRTLLM-6577][feat] Support nano_v2_vlm in pytorch backend (#7207)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-18 16:26:20 +08:00 |
|
Yechan Kim
|
0893afae3d
|
[TRTLLM-6771][feat] Support MMMU for multimodal models (#6828)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-08-21 08:54:12 +08:00 |
|
Yan Chunwei
|
9bd42ecf9b
|
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-06-20 03:01:10 +08:00 |
|
Enwei Zhu
|
babdd9ce06
|
test: Add json_mode_eval for guided decoding evaluation (#5179)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-06-16 10:03:55 +08:00 |
|
Yan Chunwei
|
5506f60037
|
chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-05-28 18:43:04 +08:00 |
|
Kaiyu Xie
|
b4e5df0ee0
|
Breaking change: perf: Enable scheduling overlap by default (#4174)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-05-15 14:27:36 +08:00 |
|
Enwei Zhu
|
3fa19ffa4e
|
test [TRTLLM-4477,TRTLLM-4481]: Accuracy test improvement (Part 3.5): Support GSM8K and GPQA (#3483)
* add gsm8k
* fix gsm8k
* add gpqa
* conditional import lm_eval
* gpqa in lm_eval
* system prompt
* shuffle
* update AA prompt and regex
* revert AA prompt and regex
* integration to tests
* fix
* add DS-R1
* fix and clean
* fix
* update tests
* update
* clean up
* free_gpu_memory_fraction=0.8
* fix
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-04-22 07:38:16 +08:00 |
|
Enwei Zhu
|
b2f69db507
|
test: Accuracy test improvement (Part 3.1): Extend accuracy test suite with LLM API and initial implementation of trtllm-eval (#3167)
* add eval_llmapi
tmp commit
port to CLI tool
move
setup llmapi
fix spec_dec_algo
_update_from_hf_quant_config
fix
migrate test_pytorch.py
fix fp8 block scales
fix fp8 rowwise
adj alpha
move test_pytorch.py cases
move
rename test_accuracy.py to test_cli.py
clean
* fix cnn_dailymail
* renaming to cli flow
* rename MMLU
* rename
* add error
* fix
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-04-01 22:20:29 +08:00 |
|