Chenghao Zhang
|
bac9e8c2ad
|
[None][feat] AutoDeploy: Add Nemotron MOE support for AutoDeploy (#8469)
|
2025-10-21 15:32:01 -07:00 |
|
Suyog Gupta
|
7050b1ea49
|
[#8272][feat] Enable chunked prefill for SSMs in AutoDeploy (#8477)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-10-20 15:31:52 -07:00 |
|
Lucas Liebenwein
|
41169fb20c
|
[None][feat] AutoDeploy: chunked prefill support (#8158)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-18 00:47:35 -07:00 |
|
h-guo18
|
55fed1873c
|
[None][chore] AutoDeploy: cleanup old inference optimizer configs (#8039)
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-17 15:55:57 -04:00 |
|
Lucas Liebenwein
|
3492391feb
|
[None][chore] AutoDeploy: clean up accuracy test configs (#8134)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-06 12:51:01 -07:00 |
|
Lucas Liebenwein
|
2c454e8003
|
[None][feat] AutoDeploy: Nemotron-H accuracy test (#8133)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 15:39:03 -07:00 |
|
Lucas Liebenwein
|
5faa5e9dd8
|
[None][feat] AutoDeploy: dive deeper into token generation bugs + enable_block_reuse (#8108)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 04:57:26 -07:00 |
|
Eran Geva
|
5f2a42b3df
|
[TRTLLM-6142][feat] AutoDeploy: set torch recompile_limit based on cuda_graph_batch_sizes and refactored (#7219)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-09-08 08:45:58 -04:00 |
|
Suyog Gupta
|
e3de5758a3
|
[#7136][feat] trtllm-serve + autodeploy integration (#7141)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-08-22 08:30:53 -07:00 |
|
ajrasane
|
4162d2d746
|
[None][test] Add accuracy evaluation for AutoDeploy (#6764)
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-08-15 13:46:09 -04:00 |
|