```diff
 strategy:
   fail-fast: false
   matrix:
-    cuda_version: ['cu12.8', cu12, cu11]
+    cuda_version: ['cu12.8', 'cu12']
 env:
   CUDA_VERSION: ${{ matrix.cuda_version }}
   TAG_PREFIX: "openmmlab/lmdeploy"
```
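For illustration, the trimmed matrix above now fans out into two build jobs instead of three. A minimal sketch of that expansion (the job dict shape is an assumption; GitHub Actions performs this expansion itself):

```python
# Sketch: how the cuda_version matrix above expands into per-job env values.
# GitHub Actions does this expansion natively; this only illustrates the result.
matrix = {"cuda_version": ["cu12.8", "cu12"]}  # cu11 is dropped by this change

jobs = [
    {"CUDA_VERSION": v, "TAG_PREFIX": "openmmlab/lmdeploy"}
    for v in matrix["cuda_version"]
]
print(len(jobs))  # 2 jobs after dropping cu11
```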
````diff
@@ -225,7 +225,7 @@ The default prebuilt package is compiled on **CUDA 12** since v0.3.0.
 For the GeForce RTX 50 series, please install the LMDeploy prebuilt package compiled with **CUDA 12.8**

 ```shell
-export LMDEPLOY_VERSION=0.12.0
+export LMDEPLOY_VERSION=0.12.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````
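The wheel URL in the snippet above follows a fixed release naming scheme. A small helper that assembles it (`wheel_url` is a hypothetical illustration, not part of lmdeploy):

```python
# Sketch: assembling the release wheel URL used in the shell snippet above.
# wheel_url is a hypothetical helper for illustration only.
def wheel_url(version: str, py: str, cuda: str) -> str:
    base = "https://github.com/InternLM/lmdeploy/releases/download"
    name = (f"lmdeploy-{version}+{cuda}-cp{py}-cp{py}-"
            "manylinux2014_x86_64.whl")
    return f"{base}/v{version}/{name}"

print(wheel_url("0.12.1", "310", "cu128"))
# https://github.com/InternLM/lmdeploy/releases/download/v0.12.1/lmdeploy-0.12.1+cu128-cp310-cp310-manylinux2014_x86_64.whl
```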
````diff
@@ -227,7 +227,7 @@ pip install lmdeploy
 If you are using a GeForce RTX 50 series GPU, please install the LMDeploy prebuilt package compiled with **CUDA 12.8**.

 ```shell
-export LMDEPLOY_VERSION=0.12.0
+export LMDEPLOY_VERSION=0.12.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````
```diff
@@ -110,4 +110,9 @@ pip install -r /tmp/requirements/serve.txt
 if [[ "${CUDA_VERSION_SHORT}" = "cu118" ]]; then
     rm -rf /opt/py3/lib/python${PYTHON_VERSION}/site-packages/nvidia/nccl
     cp -R /nccl /opt/py3/lib/python${PYTHON_VERSION}/site-packages/nvidia/
+elif [[ "${CUDA_VERSION_SHORT}" = "cu128" ]]; then
+    # As described in https://github.com/InternLM/lmdeploy/pull/4313, window
+    # registration may cause memory leaks in NCCL 2.27; NCCL 2.28+ resolves the issue,
+    # but the turbomind engine will use NCCL GIN for EP in the future, added in 2.29.
+    pip install "nvidia-nccl-cu12>2.29"
 fi
```
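The comment above explains why the pin is strictly above 2.29 even though 2.28 already fixes the leak. A minimal sketch of which versions the `>2.29` specifier accepts, using plain tuple comparison (an approximation of pip's version ordering for simple versions):

```python
# Sketch: which NCCL releases the ">2.29" pin accepts.
# Plain tuple comparison approximates pip's ordering for simple X.Y.Z versions.
def parse(v: str) -> tuple:
    return tuple(int(p) for p in v.split("."))

def satisfies_pin(v: str, pin: str = "2.29") -> bool:
    return parse(v) > parse(pin)

# 2.27 leaks, 2.28 fixes the leak but lacks GIN, 2.29 introduces GIN
for v in ["2.27.5", "2.28.3", "2.29.1"]:
    print(v, satisfies_pin(v))
# 2.27.5 False
# 2.28.3 False
# 2.29.1 True
```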
````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy by:

 ```shell
-export LMDEPLOY_VERSION=0.12.0
+export LMDEPLOY_VERSION=0.12.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
````
````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 The default prebuilt package is compiled on **CUDA 12**. If you need CUDA 11+ (>=11.3), you can install lmdeploy with:

 ```shell
-export LMDEPLOY_VERSION=0.12.0
+export LMDEPLOY_VERSION=0.12.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
````
```diff
 # Copyright (c) OpenMMLab. All rights reserved.
 from typing import Tuple

-__version__ = '0.12.0'
+__version__ = '0.12.1'
 short_version = __version__
```
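The version file above imports `Tuple`, which suggests it derives a comparable version tuple from `__version__`. A hedged sketch of such a helper (the name `parse_version_info` is an assumption, not confirmed by the diff):

```python
from typing import Tuple

__version__ = '0.12.1'

# Hypothetical helper: split '0.12.1' into a comparable (0, 12, 1) tuple.
def parse_version_info(version_str: str) -> Tuple[int, ...]:
    return tuple(int(x) for x in version_str.split('.'))

print(parse_version_info(__version__))  # (0, 12, 1)
```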