diff --git a/.github/ISSUE_TEMPLATE/01-installation.yml b/.github/ISSUE_TEMPLATE/01-installation.yml new file mode 100644 index 0000000000..fd24fd93f0 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/01-installation.yml @@ -0,0 +1,66 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/200-installation.yml +name: 🛠️ Installation +description: Report an issue here when you hit errors during installation. +title: "[Installation]: " +labels: ["Installation"] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). +- type: textarea + attributes: + label: System Info + description: | + Please provide the following system information to help us debug your installation issue: + + ```bash + # System information + cat /etc/os-release + nvidia-smi + nvcc --version + python --version + pip list | grep -E "(tensorrt|torch|cuda)" + + # TensorRT-LLM installation method and version + pip show tensorrt_llm + ``` + value: | + **System Information:** + - OS: + - Python version: + - CUDA version: + - GPU model(s): + - Driver version: + - TensorRT version: + - PyTorch version: + - TensorRT-LLM version: + + **Detailed output:** + ```text + Paste the output of the above commands here + ``` + validations: + required: true +- type: textarea + attributes: + label: How you are installing TensorRT-LLM + description: | + Paste the full command you are trying to execute or describe your installation method. + value: | + ```sh + # Installation command or method + pip install tensorrt_llm + ``` +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... 
+ options: + - label: Make sure you already searched for relevant issues, and checked the [installation documentation](https://nvidia.github.io/TensorRT-LLM/installation/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/02-new-model.yml b/.github/ISSUE_TEMPLATE/02-new-model.yml new file mode 100644 index 0000000000..688c11866f --- /dev/null +++ b/.github/ISSUE_TEMPLATE/02-new-model.yml @@ -0,0 +1,41 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/600-new-model.yml +name: 🤗 Support request for a new model from Hugging Face +description: Submit a proposal/request for a new model from Hugging Face +title: "[New Model]: " +labels: ["new model"] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). + + #### We also highly recommend you read https://nvidia.github.io/TensorRT-LLM/architecture/add-model.html first to understand how to add a new model. +- type: textarea + attributes: + label: The model to consider. + description: > + A Hugging Face identifier pointing to the model, e.g. `meta-llama/Llama-3.1-8B-Instruct`. + validations: + required: true +- type: textarea + attributes: + label: The closest model TensorRT-LLM already supports. + description: > + Here is the list of models already supported by TensorRT-LLM: https://github.com/NVIDIA/TensorRT-LLM/tree/main/tensorrt_llm/models (TRT backend) and https://github.com/NVIDIA/TensorRT-LLM/tree/main/tensorrt_llm/_torch/models (PyTorch backend). Which model is the most similar to the one you want to add support for? +- type: textarea + attributes: + label: What difficulties do you anticipate in supporting this model?
+ description: > + For example, any new operators or new architecture? +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... + options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/03-documentation.yml b/.github/ISSUE_TEMPLATE/03-documentation.yml new file mode 100644 index 0000000000..df7643337b --- /dev/null +++ b/.github/ISSUE_TEMPLATE/03-documentation.yml @@ -0,0 +1,31 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/100-documentation.yml +name: 📚 Documentation +description: Report an issue related to https://nvidia.github.io/TensorRT-LLM/ +title: "[Doc]: " +labels: ["Documentation"] +assignees: ["nv-guomingz"] + +body: +- type: textarea + attributes: + label: 📚 The doc issue + description: > + A clear and concise description of what content in https://nvidia.github.io/TensorRT-LLM/ is an issue. + validations: + required: true +- type: textarea + attributes: + label: Suggest a potential alternative/fix + description: > + Tell us how we could improve the documentation in this regard. +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... + options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. 
+ required: true diff --git a/.github/ISSUE_TEMPLATE/04-questions.yml b/.github/ISSUE_TEMPLATE/04-questions.yml new file mode 100644 index 0000000000..75a9416e92 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/04-questions.yml @@ -0,0 +1,62 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/300-usage.yml +name: 💻 Questions +description: Raise an issue here if you don't know how to use TensorRT-LLM. +title: "[Usage]: " +labels: ["question"] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). +- type: textarea + attributes: + label: System Info + description: | + Please provide the following system information to help us debug your usage issue: + + ```bash + # System information + nvidia-smi + python --version + pip show tensorrt_llm + ``` + value: | + **System Information:** + - OS: + - Python version: + - CUDA version: + - GPU model(s): + - Driver version: + - TensorRT-LLM version: + + **Detailed output:** + ```text + Paste the output of the above commands here + ``` + validations: + required: true +- type: textarea + attributes: + label: How would you like to use TensorRT-LLM + description: | + A detailed description of how you want to use TensorRT-LLM. + value: | + I want to run inference of a [specific model](put Hugging Face link here). I don't know how to integrate it with TensorRT-LLM or optimize it for my use case. + + **Specific questions:** + - Model: + - Use case (e.g., chatbot, batch inference, real-time serving): + - Expected throughput/latency requirements: + - Multi-GPU setup needed: +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... 
+ options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/05-feature-request.yml b/.github/ISSUE_TEMPLATE/05-feature-request.yml new file mode 100644 index 0000000000..32c1ee43c7 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/05-feature-request.yml @@ -0,0 +1,40 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/500-feature-request.yml +name: 🚀 Feature request +description: Submit a proposal/request for a new TensorRT-LLM feature +title: "[Feature]: " +labels: ["feature request"] +assignees: ["laikhtewari"] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). +- type: textarea + attributes: + label: 🚀 The feature, motivation and pitch + description: > + A clear and concise description of the feature proposal. Please outline the motivation for the proposal. Is your feature request related to a specific problem? e.g., *"I'm working on X and would like Y to be possible"*. If this is related to another GitHub issue, please link here too. + validations: + required: true +- type: textarea + attributes: + label: Alternatives + description: > + A description of any alternative solutions or features you've considered, if any. +- type: textarea + attributes: + label: Additional context + description: > + Add any other context or screenshots about the feature request. +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... 
+ options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/06-bug-report.yml b/.github/ISSUE_TEMPLATE/06-bug-report.yml new file mode 100644 index 0000000000..c41ff62ded --- /dev/null +++ b/.github/ISSUE_TEMPLATE/06-bug-report.yml @@ -0,0 +1,191 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/400-bug-report.yml +name: "🐛 Bug Report" +description: Submit a bug report to help us improve TensorRT-LLM +title: "[Bug]: " +labels: [ "bug" ] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). +- type: markdown + attributes: + value: | + ⚠️ **SECURITY WARNING:** Please review any text you paste to ensure it does not contain sensitive information such as: + - API tokens or keys (e.g., Hugging Face tokens, OpenAI API keys) + - Passwords or authentication credentials + - Private URLs or endpoints + - Personal or confidential data + + Consider redacting or replacing sensitive values with placeholders like `<REDACTED>` when sharing configuration or code examples. +- type: textarea + id: system-info + attributes: + label: System Info + description: Please share your system info with us.
+ placeholder: | + - CPU architecture (e.g., x86_64, aarch64) + - CPU/Host memory size (if known) + - GPU properties + - GPU name (e.g., NVIDIA H100, NVIDIA A100, NVIDIA L40S) + - GPU memory size (if known) + - Clock frequencies used (if applicable) + - Libraries + - TensorRT-LLM branch or tag (e.g., main, v0.7.1) + - TensorRT-LLM commit (if known) + - Versions of TensorRT, Modelopt, CUDA, cuBLAS, etc. used + - Container used (if running TensorRT-LLM in a container) + - NVIDIA driver version + - OS (Ubuntu 24.04, CentOS 8) + - Any other information that may be useful in reproducing the bug + + **Commands to gather system information:** + ```bash + nvidia-smi + nvcc --version + python --version + pip show tensorrt_llm tensorrt torch + ``` + validations: + required: true + +- type: textarea + id: who-can-help + attributes: + label: Who can help? + description: | + To expedite the response to your issue, it would be helpful if you could identify the appropriate person + to tag using the **@** symbol. Here is a general guideline on **whom to tag**. + + Rest assured that all issues are reviewed by the core maintainers. If you are unsure about whom to tag, + you can leave it blank, and a core maintainer will make sure to involve the appropriate person. + + Please tag fewer than 3 people. + + Quantization: @Tracin + + Documentation: @juney-nvidia + + Feature request: @laikhtewari + + Performance: @kaiyux + + placeholder: "@Username ..." 
+ +- type: checkboxes + id: information-scripts-examples + attributes: + label: Information + description: 'The problem arises when using:' + options: + - label: "The official example scripts" + - label: "My own modified scripts" + +- type: checkboxes + id: information-tasks + attributes: + label: Tasks + description: "The tasks I am working on are:" + options: + - label: "An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)" + - label: "My own task or dataset (give details below)" + +- type: textarea + id: reproduction + validations: + required: true + attributes: + label: Reproduction + description: | + Please provide a clear and concise description of what the bug is and how to reproduce it. + + If relevant, add a minimal example so that we can reproduce the error by running the code. It is very important for the snippet to be as succinct (minimal) as possible, so please take time to trim down any irrelevant code to help us debug efficiently. We are going to copy-paste your code and we expect to get the same result as you did: avoid any external data, and include the relevant imports, etc. For example: + + ```python + from tensorrt_llm import LLM + from tensorrt_llm.sampling_params import SamplingParams + + prompts = [ + "Hello, my name is", + "The president of the United States is", + "The capital of France is", + "The future of AI is", + ] + sampling_params = SamplingParams(temperature=0.8, top_p=0.95) + + llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct") + + outputs = llm.generate(prompts, sampling_params) + + # Print the outputs. + for output in outputs: + prompt = output.prompt + generated_text = output.outputs[0].text + print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}") + ``` + + If the code is too long (hopefully, it isn't), feel free to put it in a public gist and link it in the issue: https://gist.github.com. + + Remember to use code tags to properly format your code. 
You can refer to the + link https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting for guidance on code formatting. + + Please refrain from using screenshots, as they can be difficult to read and prevent others from copying and pasting your code. + It would be most helpful if we could reproduce your issue by simply copying and pasting your scripts and code. + + Please set the environment variable `TLLM_DEBUG_MODE=1` (e.g. `export TLLM_DEBUG_MODE=1`) to turn on more logging to help debug potential issues. + + placeholder: | + Steps to reproduce the behavior: + + 1. + 2. + 3. + + ```python + # Sample code to reproduce the problem + ``` + + ``` + The error message you got, with the full traceback and the error logs. + ``` + +- type: textarea + id: expected-behavior + validations: + required: true + attributes: + label: Expected behavior + description: "Provide a brief summary of the expected behavior of the software. Provide output files or examples if possible." + +- type: textarea + id: actual-behavior + validations: + required: true + attributes: + label: Actual behavior + description: "Describe the actual behavior of the software and how it deviates from the expected behavior. Provide output files or examples if possible." + +- type: textarea + id: additional-notes + validations: + required: true + attributes: + label: Additional notes + description: "Provide any additional context here that you think might be useful for the TensorRT-LLM team to help debug this issue (such as experiments done, potential things to investigate)." + +- type: markdown + attributes: + value: | + ⚠️ Please separate bugs in the `transformers` or `pytorch` implementation or usage from bugs in `TensorRT-LLM`. + + - If the error only appears in TensorRT-LLM, please provide the detailed script you use to run `TensorRT-LLM`, and highlight the difference and what you expect. + + Thanks for reporting 🙏!
+- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... + options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/07-performance-discussion.yml b/.github/ISSUE_TEMPLATE/07-performance-discussion.yml new file mode 100644 index 0000000000..feb3b02501 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/07-performance-discussion.yml @@ -0,0 +1,74 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/700-performance-discussion.yml +name: ⚡ Discussion on the performance of TensorRT-LLM +description: Submit a proposal/discussion about the performance of TensorRT-LLM +title: "[Performance]: " +labels: ["Performance"] +assignees: ["byshiue", "kaiyux"] + +body: +- type: markdown + attributes: + value: > + #### Before submitting an issue, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue+sort%3Acreated-desc+). +- type: textarea + attributes: + label: Proposal to improve performance + description: > + How do you plan to improve TensorRT-LLM's performance? + validations: + required: false +- type: textarea + attributes: + label: Report of performance regression + description: > + Please provide a detailed description of the performance comparison to confirm the regression. You may want to run the benchmark script at https://github.com/NVIDIA/TensorRT-LLM/tree/main/benchmarks. + validations: + required: false +- type: textarea + attributes: + label: Misc discussion on performance + description: > + Anything else about performance.
+ validations: + required: false +- type: textarea + attributes: + label: Your current environment (if you think it is necessary) + description: | + Please provide the following system information to help with performance analysis: + + ```bash + # System information + nvidia-smi + nvcc --version + python --version + pip show tensorrt_llm tensorrt torch + ``` + value: | + **System Information:** + - OS: + - Python version: + - CUDA version: + - GPU model(s): + - Driver version: + - TensorRT version: + - PyTorch version: + - TensorRT-LLM version: + + **Detailed output:** + ```text + Paste the output of the above commands here + ``` + validations: + required: false +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... + options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/08-RFC.yml b/.github/ISSUE_TEMPLATE/08-RFC.yml new file mode 100644 index 0000000000..20d505171b --- /dev/null +++ b/.github/ISSUE_TEMPLATE/08-RFC.yml @@ -0,0 +1,58 @@ +# Adapted from https://github.com/vllm-project/vllm/tree/main/.github/ISSUE_TEMPLATE/750-RFC.yml +name: 💬 Request for comments (RFC). +description: Ask for feedback on major architectural changes or design choices. +title: "[RFC]: " +labels: ["RFC"] +assignees: ["laikhtewari"] + +body: +- type: markdown + attributes: + value: > + #### Please take a look at previous [RFCs](https://github.com/NVIDIA/TensorRT-LLM/issues?q=label%3ARFC+sort%3Aupdated-desc) for reference. +- type: textarea + attributes: + label: Motivation. + description: > + The motivation of the RFC. + validations: + required: true +- type: textarea + attributes: + label: Proposed Change. 
+ description: > + The proposed change of the RFC. + validations: + required: true +- type: textarea + attributes: + label: Feedback Period. + description: > + The feedback period of the RFC. Usually at least one week. + validations: + required: false +- type: textarea + attributes: + label: CC List. + description: > + The list of people you want to CC. + validations: + required: false +- type: textarea + attributes: + label: Any Other Things. + description: > + Any other things you would like to mention. + validations: + required: false +- type: markdown + attributes: + value: > + Thanks for contributing 🎉! The TensorRT-LLM team reviews RFCs during regular team meetings. Most RFCs can be discussed online, but you can also reach out to the team through GitHub discussions or issues for additional feedback. +- type: checkboxes + id: askllm + attributes: + label: Before submitting a new issue... + options: + - label: Make sure you already searched for relevant issues, and checked the [documentation](https://nvidia.github.io/TensorRT-LLM/) and [examples](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples) for answers to frequently asked questions. + required: true diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml deleted file mode 100644 index 10591e6b23..0000000000 --- a/.github/ISSUE_TEMPLATE/bug_report.yml +++ /dev/null @@ -1,114 +0,0 @@ -name: "Bug Report" -description: Submit a bug report to help us improve TensorRT-LLM -labels: [ "bug" ] -body: - - type: textarea - id: system-info - attributes: - label: System Info - description: Please share your system info with us. 
- placeholder: | - - CPU architecture (e.g., x86_64, aarch64) - - CPU/Host memory size (if known) - - GPU properties - - GPU name (e.g., NVIDIA H100, NVIDIA A100, NVIDIA L40S) - - GPU memory size (if known) - - Clock frequencies used (if applicable) - - Libraries - - TensorRT-LLM branch or tag (e.g., main, v0.7.1) - - TensorRT-LLM commit (if known) - - Versions of TensorRT, Modelopt, CUDA, cuBLAS, etc. used - - Container used (if running TensorRT-LLM in a container) - - NVIDIA driver version - - OS (Ubuntu 24.04, CentOS 8) - - Any other information that may be useful in reproducing the bug - validations: - required: true - - - type: textarea - id: who-can-help - attributes: - label: Who can help? - description: | - To expedite the response to your issue, it would be helpful if you could identify the appropriate person - to tag using the **@** symbol. Here is a general guideline on **whom to tag**. - - Rest assured that all issues are reviewed by the core maintainers. If you are unsure about whom to tag, - you can leave it blank, and a core maintainer will make sure to involve the appropriate person. - - Please tag fewer than 3 people. - - Quantization: @Tracin - - Documentation: @juney-nvidia - - Feature request: @ncomly-nvidia - - Performance: @kaiyux - - placeholder: "@Username ..." 
- - - type: checkboxes - id: information-scripts-examples - attributes: - label: Information - description: 'The problem arises when using:' - options: - - label: "The official example scripts" - - label: "My own modified scripts" - - - type: checkboxes - id: information-tasks - attributes: - label: Tasks - description: "The tasks I am working on are:" - options: - - label: "An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)" - - label: "My own task or dataset (give details below)" - - - type: textarea - id: reproduction - validations: - required: true - attributes: - label: Reproduction - description: | - Kindly share a code example that demonstrates the issue you encountered. It is recommending to provide a code snippet directly. - Additionally, if you have any error messages, or stack traces related to the problem, please include them here. - - Remember to use code tags to properly format your code. You can refer to the - link https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting for guidance on code formatting. - - Please refrain from using screenshots, as they can be difficult to read and prevent others from copying and pasting your code. - It would be most helpful if we could reproduce your issue by simply copying and pasting your scripts and codes. - - placeholder: | - Steps to reproduce the behavior: - - 1. - 2. - 3. - - - type: textarea - id: expected-behavior - validations: - required: true - attributes: - label: Expected behavior - description: "Provide a brief summary of the expected behavior of the software. Provide output files or examples if possible." - - - type: textarea - id: actual-behavior - validations: - required: true - attributes: - label: actual behavior - description: "Describe the actual behavior of the software and how it deviates from the expected behavior. Provide output files or examples if possible." 
- - - type: textarea - id: additioanl-notes - validations: - required: true - attributes: - label: additional notes - description: "Provide any additional context here you think might be useful for the TensorRT-LLM team to help debug this issue (such as experiments done, potential things to investigate)." diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000000..93ef69beeb --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,5 @@ +blank_issues_enabled: false +contact_links: + - name: 🤔 Questions + url: https://github.com/NVIDIA/TensorRT-LLM/discussions + about: Ask questions and discuss with other TensorRT-LLM community members