__init__.py
feat: add Pytorch support of Vision Encoder for multimodal models ( #3791 )
2025-05-03 05:13:47 +08:00
.gitkeep
Update TensorRT-LLM ( #2755 )
2025-02-11 03:01:00 +00:00
modeling_auto.py
Fix create_weights in attention ( #3692 )
2025-04-24 07:30:00 +08:00
modeling_bert.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_clip.py
feat: add Pytorch support of Vision Encoder for multimodal models ( #3791 )
2025-05-03 05:13:47 +08:00
modeling_deepseekv3.py
[nvbugs/5268808][fix] Fix the list-out-of-range access issue of AllReduce workspace on multi-node. ( #4159 )
2025-05-13 17:17:25 +08:00
modeling_llama.py
fix: [ https://nvbugspro.nvidia.com/bug/5238626 ] illegal memory address when running llama 4 with cuda graph enabled ( #4101 )
2025-05-13 14:58:54 +08:00
modeling_llava_next.py
feat: add Pytorch support of Vision Encoder for multimodal models ( #3791 )
2025-05-03 05:13:47 +08:00
modeling_mamba_hybrid.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_mistral.py
feat: Mistral-Large-2 support in the Pytorch workflow
2025-04-30 20:12:39 +08:00
modeling_mixtral.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_mllama.py
feat: LogitsProcessor in PyTorch backend ( #3145 )
2025-05-01 14:15:30 -07:00
modeling_multimodal_encoder.py
Update TensorRT-LLM ( #2873 )
2025-03-11 21:13:42 +08:00
modeling_multimodal_utils.py
Adding option to specify a set of token ids for multimodal tokens ( #4107 )
2025-05-07 12:15:41 +08:00
modeling_nemotron_h.py
Support NemotronH FP8 Quantization
2025-04-29 18:51:43 +03:00
modeling_nemotron_nas.py
Fix create_weights in attention ( #3692 )
2025-04-24 07:30:00 +08:00
modeling_nemotron.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_qwen2vl.py
feat: Add multimodal embedding field in LlmRequest ( #3855 )
2025-05-01 12:23:30 +08:00
modeling_qwen3_moe.py
refactor: Allow models to override apply_qk_norm. ( #4078 )
2025-05-12 19:38:24 +08:00
modeling_qwen3.py
refactor: Allow models to override apply_qk_norm. ( #4078 )
2025-05-12 19:38:24 +08:00
modeling_qwen_moe.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_qwen.py
feat: Support cos_sin_cache in all cases. ( #3517 )
2025-04-16 13:48:44 +08:00
modeling_siglip.py
feat: add Pytorch support of Vision Encoder for multimodal models ( #3791 )
2025-05-03 05:13:47 +08:00
modeling_utils.py
feat: support multi lora adapters and TP ( #3885 )
2025-05-08 23:45:45 +08:00
modeling_vila.py
feat: Add multimodal embedding field in LlmRequest ( #3855 )
2025-05-01 12:23:30 +08:00