
Commit fd44c00

Merge pull request #2562 from kevincheng2/develop
[LLM] update v1.2 images
2 parents: 3bb05ac + 203f3ae

2 files changed: 4 additions & 4 deletions

llm/README.md (2 additions & 2 deletions)

````diff
@@ -15,9 +15,9 @@
 # Mount the model files
 export MODEL_PATH=${PWD}/Llama-3-8B-A8W8C8
 
-docker run --gpus all --shm-size 5G --network=host \
+docker run --gpus all --shm-size 5G --network=host --privileged --cap-add=SYS_PTRACE \
 -v ${MODEL_PATH}:/models/ \
--dit registry.baidubce.com/paddlepaddle/fastdeploy:llm-serving-cuda123-cudnn9-v1.0 \
+-dit registry.baidubce.com/paddlepaddle/fastdeploy:llm-serving-cuda123-cudnn9-v1.2 \
 bash -c 'export USE_CACHE_KV_INT8=1 && cd /opt/output/Serving && bash start_server.sh; exec bash'
 ```
 
````
llm/docs/FastDeploy_usage_tutorial.md (2 additions & 2 deletions)

````diff
@@ -144,7 +144,7 @@ health endpoint: (whether the model is ready for inference)
 from fastdeploy_client.chatbot import ChatBot
 
 hostname = "127.0.0.1"  # hostname where the service is deployed
-port = 8000  # the GRPC_PORT configured for the service
+port = 8811  # the GRPC_PORT configured for the service
 
 chatbot = ChatBot(hostname=hostname, port=port)
 
@@ -153,7 +153,7 @@ result = chatbot.generate("你好", topp=0.8, max_dec_len=128, timeout=120)
 print(result)
 
 # Streaming interface
-chatbot = ChatBot(hostname=hostname, port=port, model_id=model_id, mode=mode)
+chatbot = ChatBot(hostname=hostname, port=port)
 stream_result = chatbot.stream_generate("你好", max_dec_len=128, timeout=120)
 for res in stream_result:
     print(res)
````
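For context, the two tutorial hunks above assemble into the following client usage. This is a hedged sketch, not part of the commit: it assumes a FastDeploy serving container is already running with GRPC_PORT 8811, and the wrapper function name `query_service` is invented here for illustration; the individual calls are taken verbatim from the updated tutorial.

```python
# Sketch of the updated fastdeploy_client usage shown in the diff above.
# Assumes a running FastDeploy serving container (GRPC_PORT=8811); the
# network calls are wrapped in a function so nothing connects on import.

def query_service(hostname: str = "127.0.0.1", port: int = 8811) -> None:
    # Imported lazily: fastdeploy_client ships inside the serving image
    # and may not be installed where this sketch is read.
    from fastdeploy_client.chatbot import ChatBot

    chatbot = ChatBot(hostname=hostname, port=port)

    # Blocking call: returns the complete generation in one result.
    result = chatbot.generate("你好", topp=0.8, max_dec_len=128, timeout=120)
    print(result)

    # Streaming call: after this commit the ChatBot is constructed with
    # hostname/port only (no model_id or mode arguments).
    stream_result = chatbot.stream_generate("你好", max_dec_len=128, timeout=120)
    for res in stream_result:
        print(res)
```

Call `query_service()` once the container from the README command is up; the defaults mirror the tutorial's hostname and the new port.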
