QI JUN
d0d19e81ca
chore: fix some invalid paths of contrib models (#3818)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-24 05:36:16 +08:00
Kaiyu Xie
dfbcb543ce
doc: fix path after examples migration (#3814)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-24 02:36:45 +08:00
rakib-hasan
b16a127026
fixing the metric fmeasure access (#3774)
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
2025-04-23 05:10:04 +08:00
rakib-hasan
74c13ea84f
datasets API change : datasets.load_metric => evaluate.load (#3741)
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
2025-04-22 08:23:48 +08:00
Enwei Zhu
3fa19ffa4e
test [TRTLLM-4477,TRTLLM-4481]: Accuracy test improvement (Part 3.5): Support GSM8K and GPQA (#3483)
* add gsm8k
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix gsm8k
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* add gpqa
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* conditional import lm_eval
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* gpqa in lm_eval
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* system prompt
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* shuffle
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update AA prompt and regex
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* revert AA prompt and regex
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* integration to tests
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* add DS-R1
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix and clean
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update tests
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* clean up
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* free_gpu_memory_fraction=0.8
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-04-22 07:38:16 +08:00
katec846
eeb605abd6
feat: Offloading Multimodal embedding table to CPU in Chunked Prefill Mode (#3380)
* Feat: Offload ptable to cpu if enable_chunk_context
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Feat: offload ptable to cpu for chunk context mode
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix and add comment
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Update Readme for multimodal and add a new param mm_embedding_offloading
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* fix: Correct prompt table offloading condition in PromptTuningBuffers
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Clean up the code
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Add commits to explain copy from cpu <-> gpu using pinned memory
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix namings based on comments
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix format based on precommit
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Modify --mm_embedding_offloading flag
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
---------
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-21 14:31:01 +08:00
Naveassaf
f7c2eb4fa2
Update Nemotron Super and Ultra in Supported Models and add an example (#3632)
* Update Nemotron Super and Ultra in Supported Models and add an example
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
* Update README link to match new examples structure
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
---------
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
2025-04-20 21:14:33 +08:00
QI JUN
d51ae53940
move the reset models into examples/models/core directory (#3555)
* move rest models to examples/models/core directory
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* update multimodal readme
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix example path
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix cpp test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix tensorrt test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
---------
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-19 20:48:59 -07:00
rakib-hasan
ff3b741045
feat: adding multimodal (only image for now) support in trtllm-bench (#3490)
* feat: adding multimodal (only image for now) support in trtllm-bench
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* fix: add in load_dataset() calls to maintain the v2.19.2 behavior
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* re-adding prompt_token_ids and using that for prompt_len
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* updating the datasets version in examples as well
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* api changes are not needed
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* moving datasets requirement and removing a missed api change
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* addressing review comments
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
* refactoring the quickstart example
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
---------
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
2025-04-18 07:06:16 +08:00
bhsueh_NV
322ac565fc
chore: clean some ci of qa test (#3083)
* move some models to examples/models/contrib
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* update the document
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove arctic, blip2, cogvlm, dbrx from qa test list
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove tests of dit, mmdit and stdit from qa test
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* remove grok, jais, sdxl, skywork, smaug from qa test list
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* re-organize the glm examples
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix issues after running pre-commit
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix some typo in glm_4_9b readme
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
* fix bug
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
---------
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-31 14:30:41 +08:00
Enwei Zhu
705eef68c2
test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite (#2982)
* Accuracy test improvement (Part 2)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* WAR OOM
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
update
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-03-25 07:34:10 +08:00
Pradeep Raj Prabhu Raj
5b4a5014d1
Fix: wrong path to constraints.txt in bloom/requirements.txt (#3003)
Signed-off-by: Pradeep Raj Prabhu Raj <pradeepraj18062002@gmail.com>
2025-03-24 23:03:40 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00