fix(#3189): prevent write_file streaming timeout on long reports (#3195)

* fix(#3189): prevent write_file streaming timeout on long reports

Adds a layered defense against StreamChunkTimeoutError caused by oversized
single-shot write_file tool calls:

- factory: default stream_chunk_timeout to 240s for OpenAI-compatible clients
  (overridable via ModelConfig.stream_chunk_timeout in config.yaml)
- sandbox/tools: server-side 80 KB length guard on non-append write_file calls
  (configurable via DEERFLOW_WRITE_FILE_MAX_BYTES env var, 0 disables); rejects
  oversized payloads with a structured error pointing the model at str_replace
  or append=True
- middleware: classify StreamChunkTimeoutError as transient but cap retries at 1
  via per-exception _RETRY_BUDGET_OVERRIDES (same-payload retry on a chunk-gap
  timeout buffers the same way upstream; full 3-attempt loop would stack
  6-12 min of dead air)
- middleware: surface an actionable user-facing message for stream-drop
  exceptions instead of leaking the raw langchain stack
- prompts: add a routing-style File Editing Workflow hint to both lead_agent and
  general_purpose subagent prompts, pointing the model at str_replace for
  incremental edits (mirrors Claude Code's Edit / Codex's apply_patch)
- tests: behavioural coverage for size guard, retry budget override,
  stream-drop user message, factory default injection

Refs #3189

* fix(#3189): drop stream_chunk_timeout for non-OpenAI providers

Address CR feedback on PR #3195:

- factory: pop `stream_chunk_timeout` from kwargs for any model_use_path other
  than `langchain_openai:ChatOpenAI` instead of returning early.
  `ModelConfig.stream_chunk_timeout` is part of the shared schema, so a
  user-supplied value on a non-OpenAI provider would otherwise be forwarded to
  its constructor and raise `TypeError: unexpected keyword argument`.
- factory: rewrite docstring to describe the actual `exclude_none=True`
  behaviour (explicit null is excluded and falls back to the default) instead
  of the misleading "None falling out via exclude_none=True keeps its value".
- tests: add regression coverage asserting the kwarg is stripped before
  reaching a non-OpenAI provider's constructor.

Refs: bytedance#3189

* fix(#3189): restrict stream-drop user copy to StreamChunkTimeoutError only

Per CR on #3195: narrow _STREAM_DROP_EXCEPTIONS to StreamChunkTimeoutError.
Generic httpx RemoteProtocolError / ReadError fall back to the standard
'temporarily unavailable' copy, since they routinely fire on transient network
blips where the 'split the output' guidance is misleading. Retry/backoff
classification is unchanged; both remain transient/retriable.

Tests updated to reflect new copy, plus a symmetric regression test for
ReadError.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
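
For readers without the sandbox/tools change at hand, the write_file length guard described above might look roughly like the sketch below. Only the `DEERFLOW_WRITE_FILE_MAX_BYTES` variable, the 80 KB default, the "0 disables" rule and the non-append restriction come from the commit message. The function names, the error fields, the reading of "80 KB" as 80 * 1024 bytes and the fallback on unparsable values are assumptions made for illustration.

```python
import os

# Assumption: "80 KB" is read here as 80 * 1024 bytes.
DEFAULT_MAX_BYTES = 80 * 1024


def max_write_bytes(env=None) -> int:
    """Resolve the limit from DEERFLOW_WRITE_FILE_MAX_BYTES (0 disables the guard)."""
    env = os.environ if env is None else env
    raw = env.get("DEERFLOW_WRITE_FILE_MAX_BYTES")
    if raw is None:
        return DEFAULT_MAX_BYTES
    try:
        return max(int(raw), 0)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_MAX_BYTES


def check_write_size(content: str, append: bool, limit: int):
    """Return None if the write may proceed, otherwise a structured error dict."""
    if append or limit == 0:
        return None  # appends are exempt; a limit of 0 turns the guard off
    size = len(content.encode("utf-8"))
    if size <= limit:
        return None
    # Hypothetical error shape; the real tool's fields are not shown in this commit.
    return {
        "error": "write_file_too_large",
        "size_bytes": size,
        "limit_bytes": limit,
        "hint": "Use str_replace for incremental edits, or write in chunks with append=True.",
    }
```

Returning a structured error rather than raising keeps the rejection inside the tool result, so the model can read the hint and switch to `str_replace` or `append=True` on its next call.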
2026-06-10 17:35:57 +00:00 · 2026-06-07 17:47:11 +08:00
parent 268fdd6968
commit 88e36d9686
10 changed files with 677 additions and 4 deletions
@@ -1069,3 +1069,116 @@ def test_no_duplicate_kwarg_when_reasoning_effort_in_config_and_thinking_disable

    # kwargs (runtime) takes precedence: thinking-disabled path sets reasoning_effort=minimal
    assert captured.get("reasoning_effort") == "minimal"
+
+
+# ---------------------------------------------------------------------------
+# stream_chunk_timeout default injection (issue #3189)
+# ---------------------------------------------------------------------------
+
+
+def test_stream_chunk_timeout_defaults_to_240_for_openai_compatible_model(monkeypatch):
+    """OpenAI-compatible clients must receive a generous 240s chunk-gap budget by
+    default, so reasoning models with long thinking pauses don't trip
+    langchain-openai's aggressive 60s built-in default.
+    """
+    model = _make_model(use="langchain_openai:ChatOpenAI")
+    cfg = _make_app_config([model])
+
+    captured: dict = {}
+
+    class CapturingModel(FakeChatModel):
+        def __init__(self, **kwargs):
+            captured.update(kwargs)
+            BaseChatModel.__init__(self, **kwargs)
+
+    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
+    factory_module.create_chat_model(name="test-model")
+
+    assert captured.get("stream_chunk_timeout") == 240.0
+
+
+def test_stream_chunk_timeout_user_value_not_overridden(monkeypatch):
+    """If the user explicitly sets stream_chunk_timeout in config.yaml, the
+    factory must not overwrite it with the default — even if the value is
+    smaller (60s) or larger (600s) than the default.
+    """
+    model = ModelConfig(
+        name="custom-timeout-model",
+        display_name="Custom Timeout",
+        description=None,
+        use="langchain_openai:ChatOpenAI",
+        model="gpt-4o-mini",
+        stream_chunk_timeout=60.0,  # user-set explicit value
+    )
+    cfg = _make_app_config([model])
+
+    captured: dict = {}
+
+    class CapturingModel(FakeChatModel):
+        def __init__(self, **kwargs):
+            captured.update(kwargs)
+            BaseChatModel.__init__(self, **kwargs)
+
+    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
+    factory_module.create_chat_model(name="custom-timeout-model")
+
+    assert captured.get("stream_chunk_timeout") == 60.0
+
+
+def test_stream_chunk_timeout_not_injected_for_non_openai_provider(monkeypatch):
+    """Only langchain_openai:ChatOpenAI receives the default. Anthropic / Vertex /
+    other clients that don't understand this kwarg must not be polluted with it.
+    """
+    model = _make_model(use="langchain_anthropic:ChatAnthropic")
+    cfg = _make_app_config([model])
+
+    captured: dict = {}
+
+    class CapturingModel(FakeChatModel):
+        def __init__(self, **kwargs):
+            captured.update(kwargs)
+            BaseChatModel.__init__(self, **kwargs)
+
+    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
+    factory_module.create_chat_model(name="test-model")
+
+    assert "stream_chunk_timeout" not in captured
+
+
+def test_stream_chunk_timeout_default_constant_is_documented():
+    """Lock the default value at 240s. If we ever want to change this, the
+    deliberate update here (and the docstring on _apply_stream_chunk_timeout_default)
+    forces a paired review of the rationale comment block above the constant.
+    """
+    assert factory_module._DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS == 240.0
+
+
+def test_stream_chunk_timeout_popped_for_non_openai_provider_when_user_set_it(monkeypatch):
+    """Regression for CR feedback on issue #3189: if a user accidentally sets
+    ``stream_chunk_timeout`` on a non-OpenAI provider, the factory must drop
+    the kwarg before forwarding it to the model constructor. Otherwise the
+    third-party client raises ``TypeError: unexpected keyword argument
+    'stream_chunk_timeout'`` because the parameter is specific to
+    ``langchain_openai:ChatOpenAI``.
+    """
+    model = ModelConfig(
+        name="anthropic-with-stray-timeout",
+        display_name="Anthropic With Stray Timeout",
+        description=None,
+        use="langchain_anthropic:ChatAnthropic",
+        model="claude-sonnet-4",
+        stream_chunk_timeout=60.0,  # user-set on a non-OpenAI provider — must be dropped
+    )
+    cfg = _make_app_config([model])
+
+    captured: dict = {}
+
+    class CapturingModel(FakeChatModel):
+        def __init__(self, **kwargs):
+            captured.update(kwargs)
+            BaseChatModel.__init__(self, **kwargs)
+
+    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
+    factory_module.create_chat_model(name="anthropic-with-stray-timeout")
+
+    assert "stream_chunk_timeout" not in captured
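
The five tests above pin down three behaviours: inject 240s for `langchain_openai:ChatOpenAI`, keep a user-supplied value, and strip the kwarg for every other provider. A minimal sketch of logic that would satisfy them follows. The helper name and signature are hypothetical (the real `_apply_stream_chunk_timeout_default` in the factory module is not shown in this diff), and the sketch assumes `None` values have already been removed by `exclude_none=True` before the kwargs reach it.

```python
# Hypothetical stand-ins for the factory module's constant and helper.
DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS = 240.0
OPENAI_USE_PATH = "langchain_openai:ChatOpenAI"


def apply_stream_chunk_timeout(use_path: str, kwargs: dict) -> dict:
    """Return a copy of kwargs with stream_chunk_timeout defaulted or stripped."""
    result = dict(kwargs)  # copy so the caller's dict is left untouched
    if use_path != OPENAI_USE_PATH:
        # Other providers do not accept this kwarg; drop it instead of forwarding.
        result.pop("stream_chunk_timeout", None)
        return result
    # Only fill in the default when the user did not set a value.
    result.setdefault("stream_chunk_timeout", DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS)
    return result
```

Popping the key, rather than returning early for non-OpenAI providers, is what the second commit in the message changed: an early return would leave a stray user-set value in place and let it reach a constructor that rejects it.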