Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-05 02:31:33 +08:00)
[https://nvbugs/5741304][chore] Update flashinfer-python to 0.6.1 (#10872)
Signed-off-by: Yihan Wang
Parent: 128d4ac5be
Commit: cdb9ffd0ab
@@ -5261,7 +5261,7 @@ For more information, please refer to <http://unlicense.org>
 - `Tracker`: https://github.com/tox-dev/py-filelock/issues
 
 
-## flashinfer-python (0.6.0)
+## flashinfer-python (0.6.1)
 
 ### Licenses
 License: `Apache-2.0`
@@ -53,7 +53,7 @@ ordered-set
 peft
 patchelf
 einops
-flashinfer-python~=0.6.0
+flashinfer-python~=0.6.1
 opencv-python-headless
 xgrammar==0.1.25
 llguidance==0.7.29
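The `~=0.6.1` pin above uses PEP 440's compatible-release operator, which is equivalent to `>=0.6.1, ==0.6.*`: patch bumps within 0.6.x are accepted, but 0.7.0 is not. A minimal sketch of that rule (the helper name `is_compatible` is ours, and it handles only plain `X.Y.Z` versions, not pre-releases, epochs, or local versions):

```python
def is_compatible(version: str, pin: str) -> bool:
    """Check a plain X.Y.Z version against a PEP 440 '~=' pin.

    '~=0.6.1' is equivalent to '>=0.6.1, ==0.6.*': the last pinned
    component may increase, everything before it must match exactly.
    """
    floor = tuple(int(p) for p in pin.split("."))
    ver = tuple(int(p) for p in version.split("."))
    # All components before the last pinned one must match exactly.
    if ver[: len(floor) - 1] != floor[:-1]:
        return False
    # The candidate must sit at or above the pinned floor.
    return ver >= floor

# flashinfer-python~=0.6.1 accepts patch bumps within 0.6.x:
print(is_compatible("0.6.1", "0.6.1"))  # True
print(is_compatible("0.6.4", "0.6.1"))  # True
print(is_compatible("0.7.0", "0.6.1"))  # False (minor version changed)
print(is_compatible("0.6.0", "0.6.1"))  # False (below the floor)
```

This is why the commit bumps both the `~=` pin here and the explicit `>=0.6.1,<0.7.0` range in pyproject below: the two spellings express the same constraint.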
@@ -55,7 +55,7 @@ dependencies = [
 "peft (>=0.18.1,<0.19.0)",
 "patchelf (>=0.17.2.4,<0.18.0.0)",
 "einops (>=0.8.1,<0.9.0)",
-"flashinfer-python (>=0.6.0,<0.7.0)",
+"flashinfer-python (>=0.6.1,<0.7.0)",
 "xgrammar (==0.1.25)",
 "llguidance (==0.7.29)",
 "jsonschema (>=4.26.0,<5.0.0)",
@@ -236,7 +236,6 @@ unittest/_torch/modeling/test_modeling_out_of_tree.py::TestOutOfTree::test_llm_a
 unittest/_torch/modeling/test_modeling_out_of_tree.py::TestOutOfTree::test_serve[True] SKIP (https://nvbugs/5739981)
 full:sm89/accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_ctx_pp_gen_tp_asymmetric[MMLU-gen_tp=2-ctx_pp=2] SKIP (https://nvbugs/5596337)
 full:sm89/accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_tp_pp_symmetric[MMLU-tp2pp2] SKIP (https://nvbugs/5596337)
-accuracy/test_llm_api_pytorch.py::TestLlama3_1_8BInstruct::test_fp8_4gpus[tp4-fp8kv=True-attn_backend=FLASHINFER-torch_compile=True] SKIP (https://nvbugs/5741304)
 unittest/executor/test_rpc.py::TestRpcCorrectness::test_incremental_task_async SKIP (https://nvbugs/5741476)
 test_e2e.py::test_trtllm_bench_llmapi_launch[pytorch_backend-llama-v3-llama3-8b] SKIP (https://nvbugs/5744432)
 test_e2e.py::test_trtllm_serve_multimodal_example SKIP (https://nvbugs/5747920)
@@ -330,7 +329,6 @@ accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_auto_dtype[mtp_
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16_4gpus[tp4-mtp_nextn=0-attention_dp=False-cuda_graph=True-overlap_scheduler=True-torch_compile=True] SKIP (https://nvbugs/5800646)
 full:RTXPro6000D/accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTLASS-mtp_nextn=0-ep4-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False] SKIP (https://nvbugs/5800672)
 full:RTXPro6000D/accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_eagle3_4gpus[cutlass-one_model-overlap_scheduler] SKIP (https://nvbugs/5800679)
-accuracy/test_llm_api_pytorch.py::TestLlama3_1_8BInstruct::test_bfloat16_4gpus[tp4-attn_backend=FLASHINFER-torch_compile=False] SKIP (https://nvbugs/5741304)
 examples/test_medusa.py::test_llm_medusa_with_qaunt_base_model_1gpu[fp8-use_cpp_session-medusa-vicuna-7b-v1.3-4-heads-float16-bs1] SKIP (https://nvbugs/5802248)
 unittest/_torch/modeling/test_modeling_llama.py::TestLlama::test_llama_verification_with_kv_cache_relocation SKIP (https://nvbugs/5804923)
 accuracy/test_disaggregated_serving.py::TestGemma3_1BInstruct::test_auto_dtype[False] SKIP (https://nvbugs/5799901)
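The two waive-list hunks remove the entries tagged `SKIP (https://nvbugs/5741304)`, re-enabling the FLASHINFER-backend tests alongside the dependency bump. Each entry in that list follows the pattern `<test id> SKIP (<bug url>)`; a hypothetical helper to split such a line (the regex and the name `parse_waive` are ours, not part of the repo's tooling):

```python
import re

# A waive-list entry looks like: "<test id> SKIP (<bug url>)".
WAIVE_RE = re.compile(r"^(?P<test>\S+)\s+SKIP\s+\((?P<reason>[^)]+)\)$")

def parse_waive(line: str):
    """Return (test_id, reason_url) for one waive-list line, or None."""
    m = WAIVE_RE.match(line.strip())
    return (m.group("test"), m.group("reason")) if m else None

test_id, reason = parse_waive(
    "unittest/executor/test_rpc.py::TestRpcCorrectness::test_incremental_task_async "
    "SKIP (https://nvbugs/5741476)"
)
print(test_id)  # unittest/executor/test_rpc.py::TestRpcCorrectness::test_incremental_task_async
print(reason)   # https://nvbugs/5741476
```

Because pytest parameter brackets like `[tp4-fp8kv=True-...]` contain no whitespace, the single `\S+` group captures the full test id, bug URL and all bracketed parameters intact.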