fix(#3189): prevent write_file streaming timeout on long reports (#3195)

* fix(#3189): prevent write_file streaming timeout on long reports

Adds a layered defense against StreamChunkTimeoutError caused by oversized
single-shot write_file tool calls:

- factory: default stream_chunk_timeout to 240s for OpenAI-compatible
  clients (overridable via ModelConfig.stream_chunk_timeout in config.yaml)
- sandbox/tools: server-side 80 KB length guard on non-append write_file
  calls (configurable via DEERFLOW_WRITE_FILE_MAX_BYTES env var, 0 disables);
  rejects oversized payloads with a structured error pointing the model at
  str_replace or append=True
- middleware: classify StreamChunkTimeoutError as transient but cap retries
  at 1 via per-exception _RETRY_BUDGET_OVERRIDES (retrying the same payload
  after a chunk-gap timeout stalls upstream in the same way, so the full
  3-attempt loop would stack 6-12 min of dead air)
- middleware: surface an actionable user-facing message for stream-drop
  exceptions instead of leaking the raw langchain stack
- prompts: add a routing-style File Editing Workflow hint to both lead_agent
  and general_purpose subagent prompts, pointing the model at str_replace
  for incremental edits (mirrors Claude Code's Edit / Codex's apply_patch)
- tests: behavioural coverage for size guard, retry budget override,
  stream-drop user message, factory default injection

Refs #3189
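
For illustration, the size guard from the sandbox/tools bullet above could look roughly like this. It is a minimal sketch, not the code in this commit: the helper name `check_write_size`, the shape of the error dict, and reading "80 KB" as 80 * 1024 bytes are assumptions. Only the env var name, the "0 disables" rule, and the append exemption come from the message.

```python
import os
from typing import Optional

# Assumption: "80 KB" is interpreted as 80 * 1024 bytes.
DEFAULT_MAX_BYTES = 80 * 1024


def check_write_size(content: str, append: bool = False) -> Optional[dict]:
    """Return a structured error for an oversized non-append write, else None.

    Hypothetical helper, for illustration only.
    """
    limit = int(os.environ.get("DEERFLOW_WRITE_FILE_MAX_BYTES", DEFAULT_MAX_BYTES))
    # A limit of 0 disables the guard; append calls are never size-checked.
    if limit == 0 or append:
        return None
    size = len(content.encode("utf-8"))
    if size <= limit:
        return None
    return {
        "error": "write_file_too_large",  # hypothetical error code
        "size_bytes": size,
        "max_bytes": limit,
        "hint": "Use str_replace for edits, or write in chunks with append=True.",
    }
```

Measuring the UTF-8 encoded length rather than the character count keeps the limit meaningful for non-ASCII reports, although the actual guard may measure size differently.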

* fix(#3189): drop stream_chunk_timeout for non-OpenAI providers

Address CR feedback on PR #3195:

- factory: pop `stream_chunk_timeout` from kwargs for any model_use_path other than `langchain_openai:ChatOpenAI` instead of returning early. `ModelConfig.stream_chunk_timeout` is part of the shared schema, so a user-supplied value on a non-OpenAI provider would otherwise be forwarded to its constructor and raise `TypeError: unexpected keyword argument`.

- factory: rewrite docstring to describe the actual `exclude_none=True` behaviour (explicit null is excluded and falls back to the default) instead of the misleading "None falling out via exclude_none=True keeps its value".

- tests: add regression coverage asserting the kwarg is stripped before reaching a non-OpenAI provider's constructor.

Refs: bytedance#3189
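
The kwarg handling described in this follow-up can be sketched as below. The function name, its signature, and the un-prefixed constant names are hypothetical; the 240s default and the `langchain_openai:ChatOpenAI` path are taken from the message and tests. The real factory code may be structured differently.

```python
# Assumption: names below are illustrative, not the identifiers used in the factory.
DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS = 240.0
OPENAI_USE_PATH = "langchain_openai:ChatOpenAI"


def apply_stream_chunk_timeout(use_path: str, kwargs: dict) -> dict:
    """Return a copy of kwargs that is safe to pass to the model constructor."""
    result = dict(kwargs)
    if use_path != OPENAI_USE_PATH:
        # Other providers do not accept this kwarg, so drop it even when the
        # user set it explicitly in config.
        result.pop("stream_chunk_timeout", None)
        return result
    # OpenAI-compatible client: keep a user-supplied value, else use the default.
    # An explicit null never reaches this point if the config was dumped with
    # exclude_none=True, so it also ends up with the default.
    result.setdefault("stream_chunk_timeout", DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS)
    return result
```

Popping instead of returning early is what keeps a stray user-supplied value from reaching a constructor that would reject it.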

* fix(#3189): restrict stream-drop user copy to StreamChunkTimeoutError only

Per CR on #3195: narrow _STREAM_DROP_EXCEPTIONS to StreamChunkTimeoutError. Generic httpx RemoteProtocolError / ReadError now fall back to the standard 'temporarily unavailable' copy, since they routinely fire on transient network blips where the 'split the output' guidance is misleading. Retry/backoff classification is unchanged: both remain transient/retriable. Tests updated to reflect the new copy, plus a symmetric regression test for ReadError.
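
A rough sketch of how the retry budget and the user-facing copy can be kept separate, as this change describes. The exception classes are local stand-ins so the snippet runs on its own, the message strings are invented, and treating the budget as a retry count with a default of 3 is an assumption. Only the override of 1 for StreamChunkTimeoutError and the narrowed stream-drop tuple come from the message.

```python
class StreamChunkTimeoutError(Exception):
    """Stand-in for the real exception class (assumption, not the real import)."""


class ReadError(Exception):
    """Stand-in for httpx.ReadError."""


DEFAULT_RETRY_BUDGET = 3  # assumption: the "full 3-attempt loop"
RETRY_BUDGET_OVERRIDES = {StreamChunkTimeoutError: 1}
STREAM_DROP_EXCEPTIONS = (StreamChunkTimeoutError,)


def retry_budget(exc: Exception) -> int:
    """Look up the per-exception retry cap, falling back to the default."""
    for exc_type, budget in RETRY_BUDGET_OVERRIDES.items():
        if isinstance(exc, exc_type):
            return budget
    return DEFAULT_RETRY_BUDGET


def user_message(exc: Exception) -> str:
    """Pick the user-facing copy; both strings here are hypothetical."""
    if isinstance(exc, STREAM_DROP_EXCEPTIONS):
        return "The response stream stalled. Try splitting the output into smaller writes."
    return "The model is temporarily unavailable. Please try again."
```

With this split, ReadError stays retriable under the default budget but no longer receives the "split the output" guidance.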

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Huixin615
2026-06-07 17:47:11 +08:00
committed by GitHub
parent 268fdd6968
commit 88e36d9686
10 changed files with 677 additions and 4 deletions
+113
@@ -1069,3 +1069,116 @@ def test_no_duplicate_kwarg_when_reasoning_effort_in_config_and_thinking_disable
    # kwargs (runtime) takes precedence: thinking-disabled path sets reasoning_effort=minimal
    assert captured.get("reasoning_effort") == "minimal"


# ---------------------------------------------------------------------------
# stream_chunk_timeout default injection (issue #3189)
# ---------------------------------------------------------------------------


def test_stream_chunk_timeout_defaults_to_240_for_openai_compatible_model(monkeypatch):
    """OpenAI-compatible clients must receive a generous 240s chunk-gap budget by
    default, so reasoning models with long thinking pauses don't trip
    langchain-openai's aggressive 60s built-in default.
    """
    model = _make_model(use="langchain_openai:ChatOpenAI")
    cfg = _make_app_config([model])
    captured: dict = {}

    class CapturingModel(FakeChatModel):
        def __init__(self, **kwargs):
            captured.update(kwargs)
            BaseChatModel.__init__(self, **kwargs)

    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
    factory_module.create_chat_model(name="test-model")
    assert captured.get("stream_chunk_timeout") == 240.0


def test_stream_chunk_timeout_user_value_not_overridden(monkeypatch):
    """If the user explicitly sets stream_chunk_timeout in config.yaml, the
    factory must not overwrite it with the default, even if the value is
    smaller (60s) or larger (600s) than the default.
    """
    model = ModelConfig(
        name="custom-timeout-model",
        display_name="Custom Timeout",
        description=None,
        use="langchain_openai:ChatOpenAI",
        model="gpt-4o-mini",
        stream_chunk_timeout=60.0,  # user-set explicit value
    )
    cfg = _make_app_config([model])
    captured: dict = {}

    class CapturingModel(FakeChatModel):
        def __init__(self, **kwargs):
            captured.update(kwargs)
            BaseChatModel.__init__(self, **kwargs)

    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
    factory_module.create_chat_model(name="custom-timeout-model")
    assert captured.get("stream_chunk_timeout") == 60.0


def test_stream_chunk_timeout_not_injected_for_non_openai_provider(monkeypatch):
    """Only langchain_openai:ChatOpenAI receives the default. Anthropic / Vertex /
    other clients that don't understand this kwarg must not be polluted with it.
    """
    model = _make_model(use="langchain_anthropic:ChatAnthropic")
    cfg = _make_app_config([model])
    captured: dict = {}

    class CapturingModel(FakeChatModel):
        def __init__(self, **kwargs):
            captured.update(kwargs)
            BaseChatModel.__init__(self, **kwargs)

    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
    factory_module.create_chat_model(name="test-model")
    assert "stream_chunk_timeout" not in captured


def test_stream_chunk_timeout_default_constant_is_documented():
    """Lock the default value at 240s. If we ever want to change this, the
    deliberate update here (and the docstring on _apply_stream_chunk_timeout_default)
    forces a paired review of the rationale comment block above the constant.
    """
    assert factory_module._DEFAULT_STREAM_CHUNK_TIMEOUT_SECONDS == 240.0


def test_stream_chunk_timeout_popped_for_non_openai_provider_when_user_set_it(monkeypatch):
    """Regression for CR feedback on issue #3189: if a user accidentally sets
    ``stream_chunk_timeout`` on a non-OpenAI provider, the factory must drop
    the kwarg before forwarding it to the model constructor. Otherwise the
    third-party client raises ``TypeError: unexpected keyword argument
    'stream_chunk_timeout'`` because the parameter is specific to
    ``langchain_openai:ChatOpenAI``.
    """
    model = ModelConfig(
        name="anthropic-with-stray-timeout",
        display_name="Anthropic With Stray Timeout",
        description=None,
        use="langchain_anthropic:ChatAnthropic",
        model="claude-sonnet-4",
        stream_chunk_timeout=60.0,  # user-set on a non-OpenAI provider; must be dropped
    )
    cfg = _make_app_config([model])
    captured: dict = {}

    class CapturingModel(FakeChatModel):
        def __init__(self, **kwargs):
            captured.update(kwargs)
            BaseChatModel.__init__(self, **kwargs)

    _patch_factory(monkeypatch, cfg, model_class=CapturingModel)
    factory_module.create_chat_model(name="anthropic-with-stray-timeout")
    assert "stream_chunk_timeout" not in captured