mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-21 23:46:50 +00:00
feat(models): add vLLM provider support (#1860)
support for vLLM 0.19.0 OpenAI-compatible chat endpoints and fixes the Qwen reasoning toggle so flash mode can actually disable thinking. Co-authored-by: NmanQAQ <normangyao@qq.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
This commit is contained in:
@@ -141,12 +141,26 @@ That prompt is intended for coding agents. It tells the agent to clone the repo
|
||||
api_key: $OPENAI_API_KEY
|
||||
use_responses_api: true
|
||||
output_version: responses/v1
|
||||
|
||||
- name: qwen3-32b-vllm
|
||||
display_name: Qwen3 32B (vLLM)
|
||||
use: deerflow.models.vllm_provider:VllmChatModel
|
||||
model: Qwen/Qwen3-32B
|
||||
api_key: $VLLM_API_KEY
|
||||
base_url: http://localhost:8000/v1
|
||||
supports_thinking: true
|
||||
when_thinking_enabled:
|
||||
extra_body:
|
||||
chat_template_kwargs:
|
||||
enable_thinking: true
|
||||
```
|
||||
|
||||
OpenRouter and similar OpenAI-compatible gateways should be configured with `langchain_openai:ChatOpenAI` plus `base_url`. If you prefer a provider-specific environment variable name, point `api_key` at that variable explicitly (for example `api_key: $OPENROUTER_API_KEY`).
|
||||
|
||||
To route OpenAI models through `/v1/responses`, keep using `langchain_openai:ChatOpenAI` and set `use_responses_api: true` with `output_version: responses/v1`.
|
||||
|
||||
For vLLM 0.19.0, use `deerflow.models.vllm_provider:VllmChatModel`. For Qwen-style reasoning models, DeerFlow toggles reasoning with `extra_body.chat_template_kwargs.enable_thinking` and preserves vLLM's non-standard `reasoning` field across multi-turn tool-call conversations. Legacy `thinking` configs are normalized automatically for backward compatibility. Reasoning models may also require the server to be started with `--reasoning-parser ...`. If your local vLLM deployment accepts any non-empty API key, you can still set `VLLM_API_KEY` to a placeholder value.
|
||||
|
||||
CLI-backed provider examples:
|
||||
|
||||
```yaml
|
||||
|
||||
Reference in New Issue
Block a user