TensorRT-LLMs/requirements.txt
Rundong Li f1b85fea4c
[None][feat] Integrate cuda.tile RMS norm kernels (#9725)
Signed-off-by: Rundong (David) Li <davidli@nvidia.com>
Co-authored-by: Jinman Xie <jinmanx@nvidia.com>
Co-authored-by: Alexey Bylinkin <abylinkin@nvidia.com>
Co-authored-by: Qiqi Xiao <qiqix@nvidia.com>
Co-authored-by: Biao Wang <biaow@nvidia.com>
Co-authored-by: Thomas Schmid <thschmid@nvidia.com>
2026-02-02 19:44:27 +08:00

--extra-index-url https://download.pytorch.org/whl/cu130
-c constraints.txt
accelerate>=1.7.0
build
colored
cuda-python>=13
diffusers>=0.27.0
lark
mpi4py
numpy<2
onnx>=1.18.0,<1.20.0
onnx_graphsurgeon>=0.5.2
onnxscript==0.5.4
graphviz
openai
polygraphy
psutil
nvidia-ml-py>=13
pulp
pandas
h5py==3.12.1
StrEnum
sentencepiece>=0.1.99
tensorrt~=10.14.1
# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-25-12.html#rel-25-12 uses 2.10.0a0.
torch>=2.9.1,<=2.10.0a0
torchvision
nvidia-modelopt[torch]~=0.37.0
# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-25-12.html#rel-25-12 uses 2.28.9
# torch 2.9.1+cu130 depends on nvidia-nccl-cu13==2.27.7
nvidia-nccl-cu13>=2.27.7,<=2.28.9
nvidia-cuda-nvrtc
transformers==4.57.1
prometheus_client
prometheus_fastapi_instrumentator
pydantic>=2.9.1
pydantic-settings[yaml]
omegaconf
pillow
wheel<=0.45.1
optimum
# evaluate requires datasets>=2.0.0, which lets pip resolve to datasets>3.1.0, and those releases are not stable: https://github.com/huggingface/datasets/issues/7467
datasets==3.1.0
evaluate
mpmath>=1.3.0
click
click_option_group
aenum
pyzmq
fastapi>=0.120.1,<=0.121.3
starlette>=0.49.1
uvicorn
setuptools<80
ordered-set
peft
patchelf
einops
flashinfer-python~=0.6.2
opencv-python-headless
xgrammar==0.1.25
llguidance==0.7.29
jsonschema
backoff
nvtx
matplotlib # FIXME: this is added to make nvtx happy
meson
ninja
etcd3 @ git+https://github.com/kragniz/python-etcd3.git@e58a899579ba416449c4e225b61f039457c8072a
blake3
soundfile
triton==3.5.1 # NOTE: if you update this, you must also run scripts/vendor_triton_kernels.py to vendor the new version of triton_kernels
tiktoken
blobfile
openai-harmony==0.0.4
nvidia-cutlass-dsl==4.3.4; python_version >= "3.10"
plotly
numexpr<2.14.0 # Workaround (WAR) for attempted use of nonexistent numpy.typing
partial_json_parser
apache-tvm-ffi==0.1.6 # used to reduce nvidia-cutlass-dsl host overhead
torch-c-dlpack-ext==0.1.3 # used to reduce nvidia-cutlass-dsl host overhead; optional package that improves torch tensor calling performance
mistral-common==1.8.6
torchao>=0.14.1
cuda-core
llist
cuda-tile>=1.0.1
nvidia-cuda-tileiras>=13.1