Thor Johnsen
5d438be59a
[TRTLLM-5000][feat] Pytorch implementation of ngram drafter (#3936)
* v1.5
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
v1.5.4 Add back draft_overhead to spec dec stats
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
* v1.5.5: fix CI error
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.6: fix CI error 8196 > 8192
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* Address reviewer concerns
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
* Address reviewer concerns
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
* precommit run
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
* v2.0: Address reviewer concerns
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v2.1: add fix from wili
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* Revert changes that require use of TypeAlias because that requires python version >= 3.10
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
---------
Signed-off-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
2025-05-21 10:40:00 +08:00
brb-nv
8280c3d4f2
feat: Support Gemma3-1b-it in Pytorch workflow (#3999)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 14:02:44 +08:00
Yechan Kim
3e9bda3a09
[feat] Support HyperCLOVAX-SEED-Text language part (#3902)
* feat: support HyperCLOVAX-SEED-Text language part
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* add Pytorch flow and remove test file
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* revert summarize
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix summarize
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove from pytorch example
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
---------
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-05-12 16:05:14 +08:00
milesial
362a8272f8
feat: llama4 input processor (#3383)
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Signed-off-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
Co-authored-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-25 16:47:14 -07:00
Mike Iovine
68e774ff9e
[chore] Add Llama 4 Maverick to quickstart README (#3848)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-04-26 01:04:24 +08:00
Luis Vega
f95dbbb6cb
added nemotron-h to supported models (#3663)
Signed-off-by: Luis Vega <lvega@nvidia.com>
2025-04-24 10:41:32 -07:00
Naveassaf
f7c2eb4fa2
Update Nemotron Super and Ultra in Supported Models and add an example (#3632)
* Update Nemotron Super and Ultra in Supported Models and add an example
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
* Update README link to match new examples structure
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
---------
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
2025-04-20 21:14:33 +08:00
amitz-nv
a6a2ae6cc1
chore: Rename nvsmall to nemotron nas (#3447)
* Rename nvsmall to nemotron NAS
* Revert nvsmall to nemotron_nas rename in paths in tests that access llm_models_root/nvsmall/tests
* Add NemotronNAS to pytorch supported models table
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-04-10 23:16:52 +08:00
Yechan Kim
943218b54a
feat: Add Qwen2.5-VL and refactor Qwen2-VL (#3156)
* feat: Add Qwen2.5-VL and refactor Qwen2-VL
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix yapf and codespell
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* add test
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix test_e2e
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* generalize get_rope_index
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix qwen2.5-vl on README
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix test
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix image test
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
---------
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-10 04:09:03 +08:00
Mike Iovine
5bdf997963
Add Llama 4 (#3302)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-04-09 03:35:21 +08:00
Yechan Kim
c7533d271f
doc: add supported-models on PyTorch example (#3179)
* doc: add supported-models on PyTorch example
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove vision support from Llama3.2
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
---------
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
2025-04-03 21:09:25 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM
---------
Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM (#2820)
2025-02-25 21:21:49 +08:00
Kaiyu Xie
2ea17cdad2
Update TensorRT-LLM (#2792)
* Update TensorRT-LLM
---------
Co-authored-by: jlee <jungmoolee@clika.io>
2025-02-18 21:27:39 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
2025-02-11 03:01:00 +00:00