mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-06-10 09:25:57 +00:00
* fix(#3189): prevent write_file streaming timeout on long reports

  Adds a layered defense against StreamChunkTimeoutError caused by oversized single-shot write_file tool calls:

  - factory: default stream_chunk_timeout to 240s for OpenAI-compatible clients (overridable via ModelConfig.stream_chunk_timeout in config.yaml)
  - sandbox/tools: server-side 80 KB length guard on non-append write_file calls (configurable via DEERFLOW_WRITE_FILE_MAX_BYTES env var, 0 disables); rejects oversized payloads with a structured error pointing the model at str_replace or append=True
  - middleware: classify StreamChunkTimeoutError as transient but cap retries at 1 via per-exception _RETRY_BUDGET_OVERRIDES (same-payload retry on a chunk-gap timeout buffers the same way upstream; full 3-attempt loop would stack 6-12 min of dead air)
  - middleware: surface an actionable user-facing message for stream-drop exceptions instead of leaking the raw langchain stack
  - prompts: add a routing-style File Editing Workflow hint to both lead_agent and general_purpose subagent prompts, pointing the model at str_replace for incremental edits (mirrors Claude Code's Edit / Codex's apply_patch)
  - tests: behavioural coverage for size guard, retry budget override, stream-drop user message, factory default injection

  Refs #3189

* fix(#3189): drop stream_chunk_timeout for non-OpenAI providers

  Address CR feedback on PR #3195:

  - factory: pop `stream_chunk_timeout` from kwargs for any model_use_path other than `langchain_openai:ChatOpenAI` instead of returning early. `ModelConfig.stream_chunk_timeout` is part of the shared schema, so a user-supplied value on a non-OpenAI provider would otherwise be forwarded to its constructor and raise `TypeError: unexpected keyword argument`.
  - factory: rewrite docstring to describe the actual `exclude_none=True` behaviour (explicit null is excluded and falls back to the default) instead of the misleading "None falling out via exclude_none=True keeps its value".
  - tests: add regression coverage asserting the kwarg is stripped before reaching a non-OpenAI provider's constructor.

  Refs: bytedance#3189

* fix(#3189): restrict stream-drop user copy to StreamChunkTimeoutError only

  Per CR on #3195: narrow _STREAM_DROP_EXCEPTIONS to StreamChunkTimeoutError. Generic httpx RemoteProtocolError / ReadError fall back to the standard 'temporarily unavailable' copy, since they routinely fire on transient network blips where the 'split the output' guidance is misleading. Retry/backoff classification is unchanged: both remain transient/retriable.

  Tests updated to reflect new copy, plus a symmetric regression test for ReadError.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
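Note: the sandbox/tools size guard described in the first commit is not part of the hunks shown on this page. As a rough illustration only, it could look something like the sketch below. The helper names (`max_write_bytes`, `check_write_size`), the error wording, and the reading of "80 KB" as 80 * 1024 bytes are assumptions; only the `DEERFLOW_WRITE_FILE_MAX_BYTES` variable, the "0 disables" rule, and the non-append restriction come from the commit message.

```python
import os
from typing import Optional

# Assumption: "80 KB" is taken to mean 80 * 1024 bytes.
_DEFAULT_MAX_BYTES = 80 * 1024


def max_write_bytes() -> int:
    """Read the limit from the environment; fall back to the default if unset or invalid."""
    raw = os.environ.get("DEERFLOW_WRITE_FILE_MAX_BYTES")
    if raw is None:
        return _DEFAULT_MAX_BYTES
    try:
        return max(int(raw), 0)
    except ValueError:
        return _DEFAULT_MAX_BYTES


def check_write_size(content: str, append: bool = False) -> Optional[str]:
    """Return an error message for oversized non-append writes, else None."""
    limit = max_write_bytes()
    if append or limit == 0:  # append calls are exempt; 0 disables the guard
        return None
    size = len(content.encode("utf-8"))
    if size <= limit:
        return None
    return (
        f"write_file payload is {size} bytes (limit {limit}). "
        "Use str_replace for incremental edits or write in chunks with append=True."
    )
```

Returning a message rather than raising keeps the rejection visible to the model as a tool result, which matches the commit's stated goal of steering it towards `str_replace` or `append=True`.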
@@ -47,6 +47,38 @@ def _enable_stream_usage_by_default(model_use_path: str, model_settings_from_con
         model_settings_from_config["stream_usage"] = True
 
 
+# Default chunk-gap budget for OpenAI-compatible streaming responses.
+#
+# langchain-openai raises ``StreamChunkTimeoutError`` after this many seconds
+# without receiving a chunk. Its own default is 60s, which is too aggressive for
+# reasoning models (DeepSeek-R1, Doubao-thinking, GPT-5) whose first chunk can
+# legitimately take 90~150s. We default to 240s so the streaming layer rarely
+# trips on long thinking pauses; the LLMErrorHandlingMiddleware still retries
+# (budget=2) if a real stall happens. Users can override per-model in config.yaml.
+_DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS: float = 240.0
+
+
+def _apply_stream_chunk_timeout_default(model_use_path: str, model_settings_from_config: dict) -> None:
+    """Inject a generous ``stream_chunk_timeout`` for OpenAI-compatible clients.
+
+    The ``stream_chunk_timeout`` kwarg is specific to ``langchain_openai:ChatOpenAI``
+    and is rejected by other providers' constructors as an unexpected keyword
+    argument. Behaviour:
+
+    * OpenAI-compatible path: an explicit value in ``config.yaml`` is preserved.
+      An explicit ``null`` is dropped upstream by ``model_dump(exclude_none=True)``
+      and therefore treated as "unset", so the default is injected.
+    * Non-OpenAI path: drop the key so it is never forwarded to an incompatible
+      constructor (which would raise ``TypeError: unexpected keyword argument``).
+    """
+    if model_use_path != "langchain_openai:ChatOpenAI":
+        model_settings_from_config.pop("stream_chunk_timeout", None)
+        return
+    if "stream_chunk_timeout" in model_settings_from_config:
+        return
+    model_settings_from_config["stream_chunk_timeout"] = _DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS
+
+
 def create_chat_model(name: str | None = None, thinking_enabled: bool = False, *, app_config: AppConfig | None = None, attach_tracing: bool = True, **kwargs) -> BaseChatModel:
     """Create a chat model instance from the config.
 
@@ -128,6 +160,7 @@ def create_chat_model(name: str | None = None, thinking_enabled: bool = False, *
         model_settings_from_config.pop("reasoning_effort", None)
 
     _enable_stream_usage_by_default(model_config.use, model_settings_from_config)
+    _apply_stream_chunk_timeout_default(model_config.use, model_settings_from_config)
 
     # For Codex Responses API models: map thinking mode to reasoning_effort
     from deerflow.models.openai_codex_provider import CodexChatModel