Omer Ullman Argov
8731f5f14f
chore: Mass integration of release/0.20 (#4898)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: moraxu <mguzek@nvidia.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Faraz <58580514+farazkh80@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
2025-06-08 23:26:26 +08:00
yuxianq
0e87fcc228
refactor: use x is None instead of x == None. (#4244)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-05-15 20:00:04 +08:00
brb-nv
cd5b3d21a0
feat: Support Mistral Small 3.1 24B VLM in TRT workflow (#4183)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 03:47:22 +08:00
mayani-nv
fe3a993234
chore: PR to fix the formatting errors (#4200)
* updating the run_dtm_pld.py to handle logits correctly
* following correct code formatting
* Update run_dtm_pld.py to account for correct code formatting
Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>
* correct formatting for the multimodal README PR
---------
Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>
Co-authored-by: Ubuntu <Azureuser@mayani-nch100-vm1.42fmcfe2wyrepoque0x2bs4uue.jx.internal.cloudapp.net>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-05-09 16:31:36 -07:00
mayani-nv
25533a7736
Updating the multimodal models README to add steps for running phi-4-multimodal instruct (#3932)
* Update run.py for draft_target_model
This change makes the draft target model works without mismatch in the vocab size
Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>
* updating README with phi-4-multimodal-instruct steps
* adding ENGINE_DIR, HF_DIR and CKPT_DIR as per review
* addressing review comments on PR
* updating readme
---------
Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>
Co-authored-by: rakib-hasan <rhasan@nvidia.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-05-09 15:42:58 -07:00
QI JUN
d0d19e81ca
chore: fix some invalid paths of contrib models (#3818)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-24 05:36:16 +08:00
Kaiyu Xie
dfbcb543ce
doc: fix path after examples migration (#3814)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-24 02:36:45 +08:00
katec846
eeb605abd6
feat: Offloading Multimodal embedding table to CPU in Chunked Prefill Mode (#3380)
* Feat: Offload ptable to cpu if enable_chunk_context
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Feat: offload ptable to cpu for chunk context mode
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix and add comment
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Update Readme for multimodal and add a new param mm_embedding_offloading
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* fix: Correct prompt table offloading condition in PromptTuningBuffers
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Clean up the code
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Add commits to explain copy from cpu <-> gpu using pinned memory
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix namings based on comments
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Fix format based on precommit
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
* Modify --mm_embedding_offloading flag
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
---------
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-21 14:31:01 +08:00
QI JUN
d51ae53940
move the reset models into examples/models/core directory (#3555)
* move rest models to examples/models/core directory
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* update multimodal readme
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix example path
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix cpp test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix tensorrt test
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
* fix ci
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
---------
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-19 20:48:59 -07:00