/, which specifies where the user's custom large model is stored.
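!pip install modelscope
from modelscope import snapshot_download
# qwen/Qwen2-7B-Instruct is the name of the model to download.
# cache_dir is the directory where the downloaded model is saved;
# here './' saves the model in the root directory of CFS.
model_dir = snapshot_download('qwen/Qwen2-7B-Instruct', cache_dir='./')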


python3 -m vllm.entrypoints.openai.api_server
docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-v0.1
Model parameter count | GPU type and count |
6 ~ 8B | PNV5b * 1 / A10 * 1 / A100 * 1 |
12 ~ 14B | PNV5b * 1 / A10 * 2 / A100 * 1 |
65 ~ 72B | PNV5b * 8 / A100 * 8 |
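As a rough cross-check of the table above, the memory needed for the model weights alone can be estimated from the parameter count. The sketch below assumes 16-bit weights (2 bytes per parameter), which the source does not state; the function name `fp16_weight_gb` is hypothetical. The KV cache, activations, and runtime overhead need additional memory on top of this figure, so treat it as a lower bound rather than a sizing rule.

```python
def fp16_weight_gb(params_billion):
    """Approximate weight memory in GB, assuming 2 bytes per parameter (FP16/BF16)."""
    bytes_per_param = 2  # assumption: 16-bit weights
    # 1e9 parameters * 2 bytes = 2e9 bytes = 2 GB per billion parameters
    return params_billion * bytes_per_param


for size in (7, 14, 72):
    print(f"{size}B parameters: about {fp16_weight_gb(size)} GB for weights alone")
```

Under this assumption, a 14B model needs about 28 GB for weights, which exceeds the 24 GB of a single A10 and is consistent with the table listing two A10 cards for that size.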


{"model": "Qwen2-7B-Instruct","prompt": "你好","max_tokens": 50,"temperature": 0}

# Public access address
SERVER_URL="https://ms-gp6rjk2jj-**********.gw.ap-shanghai.ti.tencentcs.com/ms-gp6rjk2j"
# Non-streaming call
curl -H "content-type: application/json" "${SERVER_URL}/v1/completions" -d '{"model":"Qwen2-7B-Instruct","prompt":"你好","max_tokens":50,"temperature":0}'
# Streaming call
curl -H "content-type: application/json" "${SERVER_URL}/v1/completions" -d '{"model":"Qwen2-7B-Instruct","prompt":"你好","max_tokens":50,"temperature":0, "stream": true}'
{"id":"cmpl-f2bec3ca2ded4b518fb8e73dc3461202","object":"text_completion","created":1719890717,"model":"Qwen2-7B-Instruct","choices":[{"index":0,"text":",我最近感觉很焦虑,有什么方法可以缓解吗?\\n你好!焦虑是一种常见的情绪反应,但可以通过一些方法来缓解。你可以尝试深呼吸、冥想、运动、听音乐、与朋友聊天等方式来放松自己。同时","logprobs":null,"finish_reason":"length","stop_reason":null}],"usage":{"prompt_tokens":1,"total_tokens":51,"completion_tokens":50}}
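The non-streaming call can also be made from Python. The following is a minimal sketch that uses only the standard library. `SERVER_URL` is a placeholder for your own service address, and the helper names `build_request`, `extract_text`, and `complete` are hypothetical; they are not part of any SDK.

```python
import json
import urllib.request

# Placeholder: replace with the public access address of your own service.
SERVER_URL = "https://example.invalid/ms-xxxx"


def build_request(server_url, prompt, max_tokens=50, temperature=0):
    """Build a POST request for the /v1/completions endpoint."""
    payload = {
        "model": "Qwen2-7B-Instruct",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{server_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def extract_text(response_body):
    """Return the generated text of the first choice in a completion response."""
    return json.loads(response_body)["choices"][0]["text"]


def complete(server_url, prompt):
    """Send the request and return the generated text (requires network access)."""
    with urllib.request.urlopen(build_request(server_url, prompt)) as resp:
        return extract_text(resp.read().decode("utf-8"))
```

Calling `complete(SERVER_URL, "你好")` against a running service should return the `text` field of the first entry in `choices`, as in the response shown above.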
data: {"id":"cmpl-3a575c7fd0204234afc51e195ee06596","created":1719890729,"model":"Qwen2-7B-Instruct","choices":[{"index":0,"text":",","logprobs":null,"finish_reason":null,"stop_reason":null}]}
data: {"id":"cmpl-3a575c7fd0204234afc51e195ee06596","created":1719890729,"model":"Qwen2-7B-Instruct","choices":[{"index":0,"text":"我","logprobs":null,"finish_reason":null,"stop_reason":null}]}
... intermediate results omitted ...
data: {"id":"cmpl-3a575c7fd0204234afc51e195ee06596","created":1719890729,"model":"Qwen2-7B-Instruct","choices":[{"index":0,"text":"。","logprobs":null,"finish_reason":null,"stop_reason":null}]}
data: {"id":"cmpl-3a575c7fd0204234afc51e195ee06596","created":1719890729,"model":"Qwen2-7B-Instruct","choices":[{"index":0,"text":"同时","logprobs":null,"finish_reason":"length","stop_reason":null}]}
data: [DONE]
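A streamed response like the one above can be reassembled on the client by reading one event at a time. The sketch below assumes each event arrives on its own line prefixed with `data:` and that the stream ends with `data: [DONE]`, as in the output shown. The function name `iter_stream_text` is hypothetical, and the `sample` lines are illustrative, not actual service output.

```python
import json


def iter_stream_text(lines):
    """Yield the text fragment from each `data:` line of a streamed response.

    `lines` is any iterable of decoded lines; iteration stops at `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not an event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        yield chunk["choices"][0]["text"]


# Illustrative input only; real events carry additional fields such as "id" and "model".
sample = [
    'data: {"choices":[{"index":0,"text":"Hello","finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"text":" world","finish_reason":"length"}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

Because the function accepts any iterable of lines, it can be fed from an HTTP response object that is read line by line, so fragments can be displayed as they arrive instead of after the whole completion finishes.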