mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-22 07:56:48 +00:00
* fix(backend): stream DeerFlowClient AI text as token deltas (#1969)
DeerFlowClient.stream() subscribed to LangGraph stream_mode=["values",
"custom"] which only delivers full-state snapshots at graph-node
boundaries, so AI replies were dumped as a single messages-tuple event
per node instead of streaming token-by-token. `client.stream("hello")`
looked identical to `client.chat("hello")` — the bug reported in #1969.
Subscribe to "messages" mode as well, forward AIMessageChunk deltas as
messages-tuple events with delta semantics (consumers accumulate by id),
and dedup the values-snapshot path so it does not re-synthesize AI
text that was already streamed. Introduce a per-id usage_metadata
counter so the final AIMessage in the values snapshot and the final
"messages" chunk — which carry the same cumulative usage — are not
double-counted.
chat() now accumulates per-id deltas and returns the last message's
full accumulated text. Non-streaming mock sources (single event per id)
are a degenerate case of the same logic, keeping existing callers and
tests backward compatible.
Verified end-to-end against a real LLM: a 15-number count emits 35
messages-tuple events with BPE subword boundaries clearly visible
("eleven" -> "ele" / "ven", "twelve" -> "tw" / "elve"), 476ms across
the window, end-event usage matches the values-snapshot usage exactly
(not doubled). tests/test_client_live.py::TestLiveStreaming passes.
New unit tests:
- test_messages_mode_emits_token_deltas: 3 AIMessageChunks produce 3
delta events with correct content/id/usage, values-snapshot does not
duplicate, usage counted once.
- test_chat_accumulates_streamed_deltas: chat() rebuilds full text
from deltas.
- test_messages_mode_tool_message: ToolMessage delivered via messages
mode is not duplicated by the values-snapshot synthesis path.
The stream() docstring now documents why this client does not reuse
Gateway's run_agent() / StreamBridge pipeline (sync vs async, raw
LangChain objects vs serialized dicts, single caller vs HTTP fan-out).
Fixes #1969
* refactor(backend): simplify DeerFlowClient streaming helpers (#1969)
Post-review cleanup for the token-level streaming fix. No behavior
change for correct inputs; one efficiency regression fixed.
Fix: chat() O(n²) accumulator
-----------------------------
`chat()` accumulated per-id text via `buffers[id] = buffers.get(id,"") + delta`,
which is O(n) per concat → O(n²) total over a streamed response. At
~2 KB cumulative text this becomes user-visible; at 50 KB / 5000 chunks
it costs roughly 100-300 ms of pure copying. Switched to
`dict[str, list[str]]` + `"".join()` once at return.
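A minimal sketch of the list-plus-join accumulator described above, not the actual `chat()` body. The function name `accumulate_last_message` and the `(msg_id, delta)` event shape are hypothetical stand-ins for `messages-tuple` AI events.

```python
def accumulate_last_message(events):
    """Return the joined text of the last id that produced a non-empty delta.

    `events` is an iterable of (msg_id, delta) pairs, a simplified
    stand-in for messages-tuple AI events (assumption for illustration).
    """
    chunks: dict[str, list[str]] = {}
    last_id = ""
    for msg_id, delta in events:
        if delta:
            # list.append is amortized O(1) and never copies earlier text.
            chunks.setdefault(msg_id, []).append(delta)
            last_id = msg_id
    # A single join costs O(total length), unlike repeated str + str.
    return "".join(chunks.get(last_id, ()))


events = [("a", "plan"), ("b", "Hel"), ("b", "lo"), ("b", "")]
print(accumulate_last_message(events))
```

Only the last id's deltas are joined, so an earlier message (`"a"` here) is dropped, which matches the "last message's full accumulated text" contract stated earlier in this commit.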
Cleanup
-------
- Extract `_serialize_tool_calls`, `_ai_text_event`, `_ai_tool_calls_event`,
and `_tool_message_event` static helpers. The messages-mode and
values-mode branches previously repeated four inline dict literals each;
they now call the same builders.
- `StreamEvent.type` is now typed as `Literal["values", "messages-tuple",
"custom", "end"]` via a `StreamEventType` alias. Makes the closed set
explicit and catches typos at type-check time.
- Direct attribute access on `AIMessage`/`AIMessageChunk`: `.usage_metadata`,
`.tool_calls`, `.id` all have default values on the base class, so the
`getattr(..., None)` fallbacks were dead code. Removed from the hot
path.
- `_account_usage` parameter type loosened to `Any` so that LangChain's
`UsageMetadata` TypedDict is accepted under strict type checking.
- Trimmed narrating comments on `seen_ids` / `streamed_ids` / the
values-synthesis skip block; kept the non-obvious ones that document
the cross-mode dedup invariant.
Net diff: -15 lines. All 132 unit tests + harness boundary test still
pass; ruff check and ruff format pass.
* docs(backend): add STREAMING.md design note (#1969)
Dedicated design document for the token-level streaming architecture,
prompted by the bug investigation in #1969.
Contents:
- Why two parallel streaming paths exist (Gateway HTTP/async vs
DeerFlowClient sync/in-process) and why they cannot be merged.
- LangGraph's three-layer mode naming (Graph "messages" vs Platform
SDK "messages-tuple" vs HTTP SSE) and why a shared string constant
would be harmful.
- Gateway path: run_agent + StreamBridge + sse_consumer with a
sequence diagram.
- DeerFlowClient path: sync generator + direct yield, delta semantics,
chat() accumulator.
- Why the three id sets (seen_ids / streamed_ids / counted_usage_ids)
each carry an independent invariant and cannot be collapsed.
- End-to-end sequence for a real conversation turn.
- Lessons from #1969: why mock-based tests missed the bug, why
BPE subword boundaries in live output are the strongest
correctness signal, and the regression test that locks it in.
- Source code location index.
Also:
- Link from backend/CLAUDE.md Embedded Client section.
- Link from backend/docs/README.md under Feature Documentation.
* test(backend): add refactor regression guards for stream() (#1969)
Three new tests in TestStream that lock the contract introduced by
PR #1974 so any future refactor (sync->async migration, sharing a
core with Gateway's run_agent, dedup strategy change) cannot
silently change behavior.
- test_dedup_requires_messages_before_values_invariant: canary that
documents the order-dependence of cross-mode dedup. streamed_ids
is populated only by the messages branch, so values-before-messages
for the same id produces duplicate AI text events. Real LangGraph
never inverts this order, but a refactor that does (or that makes
dedup idempotent) must update this test deliberately.
- test_messages_mode_golden_event_sequence: locks the *exact* event
sequence (4 events: 2 messages-tuple deltas, 1 values snapshot, 1
end) for a canonical streaming turn. List equality gives a clear
diff on any drift in order, type, or payload shape.
- test_chat_accumulates_in_linear_time: perf canary for the O(n^2)
fix in commit 1f11ba10. 10,000 single-char chunks must accumulate
in under 1s; the threshold is wide enough to pass on slow CI but
tight enough to fail if buffer = buffer + delta is restored.
All three tests pass alongside the existing 12 TestStream tests
(15/15). ruff check + ruff format clean.
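The order-dependence that the first canary test guards can be modeled in a few lines. This is a simplified sketch under assumed shapes, not the client code: `emit_text_events` is a hypothetical helper, and each stream item is reduced to a `(mode, msg_id, text)` triple.

```python
def emit_text_events(stream):
    """Toy model of the cross-mode dedup (illustrative only).

    `stream` yields ("messages", msg_id, delta) or
    ("values", msg_id, full_text) triples.
    """
    streamed_ids: set[str] = set()
    out = []
    for mode, msg_id, text in stream:
        if mode == "messages":
            # Only the messages branch records ids as already streamed.
            if msg_id:
                streamed_ids.add(msg_id)
            out.append(text)
        elif msg_id and msg_id in streamed_ids:
            continue  # values snapshot: text was already sent as deltas
        else:
            out.append(text)
    return out


normal = [("messages", "m1", "Hel"), ("messages", "m1", "lo"), ("values", "m1", "Hello")]
inverted = [("values", "m1", "Hello"), ("messages", "m1", "Hel"), ("messages", "m1", "lo")]
print(emit_text_events(normal))
print(emit_text_events(inverted))
```

With messages first, the snapshot is skipped; with values first, the full text and the deltas are both emitted, which is the duplication the canary documents.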
* docs(backend): clarify stream() docstring on JSON serialization (#1969)
Replace the misleading "raw LangChain objects (AIMessage,
usage_metadata as dataclasses), not dicts" claim in the
"Why not reuse Gateway's run_agent?" section. The implementation
already yields plain Python dicts (StreamEvent.data is dict, and
usage_metadata is a TypedDict), so the original wording suggested
a richer return type than the API actually delivers.
The corrected wording focuses on what is actually true and
relevant: this client skips the JSON/SSE serialization layer that
Gateway adds for HTTP wire transmission, and yields stream event
payloads directly as Python data structures.
Addresses Copilot review feedback on PR #1974.
* test(backend): document none-id messages dedup limitation (#1969)
Add test_none_id_chunks_produce_duplicates_known_limitation to
TestStream that explicitly documents and asserts the current
behavior when an LLM provider emits AIMessageChunk with id=None
(vLLM, certain custom backends).
The cross-mode dedup machinery cannot record a None id in
streamed_ids (guarded by ``if msg_id:``), so the values snapshot's
reassembled AIMessage with a real id falls through and synthesizes
a duplicate AI text event. The test asserts len == 2 and locks
this as a known limitation rather than silently letting future
contributors hit it without context.
Why this is documented rather than fixed:
* Falling back to ``metadata.get("id")`` does not help — LangGraph's
messages-mode metadata never carries the message id.
* Synthesizing ``f"_synth_{id(msg_chunk)}"`` only helps if the
values snapshot uses the same fallback, which it does not.
* A real fix requires provider cooperation (always emit chunk ids)
or content-based dedup (false-positive risk), neither of which
belongs in this PR.
If a real fix lands, replace this test with a positive assertion
that dedup works for None-id chunks.
Addresses Copilot review feedback on PR #1974 (client.py:515).
* fix(frontend): UI polish - fix CSS typo, dark mode border, and hardcoded colors (#1942)
- Fix `font-norma` typo to `font-normal` in message-list subtask count
- Fix dark mode `--border` using reddish hue (22.216) instead of neutral
- Replace hardcoded `rgb(184,184,192)` in hero with `text-muted-foreground`
- Replace hardcoded `bg-[#a3a1a1]` in streaming indicator with `bg-muted-foreground`
- Add missing `font-sans` to welcome description `<pre>` for consistency
- Make case-study-section padding responsive (`px-4 md:px-20`)
Closes #1940
* docs: clarify deployment sizing guidance (#1963)
* fix(frontend): prevent stale 'new' thread ID from triggering 422 history requests (#1960)
After history.replaceState updates the URL from /chats/new to
/chats/{UUID}, Next.js useParams does not update because replaceState
bypasses the router. The useEffect in useThreadChat would then set
threadIdFromPath ('new') as the threadId, causing the LangGraph SDK
to call POST /threads/new/history which returns HTTP 422 (Invalid
thread ID: must be a UUID).
This fix adds a guard to skip the threadId update when
threadIdFromPath is the literal string 'new', preserving the
already-correct UUID that was set when the thread was created.
* fix(frontend): avoid using route new as thread id (#1967)
Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com>
* Fix(subagent): Event loop conflict in SubagentExecutor.execute() (#1965)
* Fix event loop conflict in SubagentExecutor.execute()
When SubagentExecutor.execute() is called from within an already-running
event loop (e.g., when the parent agent uses async/await), calling
asyncio.run() creates a new event loop that conflicts with asyncio
primitives (like httpx.AsyncClient) that were created in and bound to
the parent loop.
This fix detects if we're already in a running event loop, and if so,
runs the subagent in a separate thread with its own isolated event loop
to avoid conflicts.
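A minimal sketch of the detect-and-isolate approach described above, not the actual `SubagentExecutor.execute()` code. `run_sync` and `work` are hypothetical names, and the sketch assumes that blocking the calling thread until the worker finishes is acceptable.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def work() -> str:
    # Stand-in for the subagent coroutine (hypothetical).
    await asyncio.sleep(0)
    return "done"


def run_sync(coro_factory):
    """Run a coroutine to completion from sync code, with or without a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(coro_factory())
    # A loop is already running in this thread, so asyncio.run cannot be
    # used here. Run the coroutine in a worker thread, where asyncio.run
    # creates a fresh, isolated event loop.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(coro_factory())).result()


async def parent() -> str:
    # Called from inside a running loop, as an async parent agent would.
    return run_sync(work)


print(run_sync(work))
print(asyncio.run(parent()))
```

The coroutine is created inside the worker thread (via the factory) so that any asyncio primitives it builds are bound to the isolated loop rather than the parent's.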
Fixes: sub-task cards not appearing in Ultra mode when using async parent agents
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(subagent): harden isolated event loop execution
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(backend): remove dead getattr in _tool_message_event
---------
Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>
Co-authored-by: Xinmin Zeng <135568692+fancyboi999@users.noreply.github.com>
Co-authored-by: 13ernkastel <LennonCMJ@live.com>
Co-authored-by: siwuai <458372151@qq.com>
Co-authored-by: 肖 <168966994+luoxiao6645@users.noreply.github.com>
Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com>
Co-authored-by: Saber <11769524+hawkli-1994@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
@@ -25,7 +25,7 @@ import uuid
 from collections.abc import Generator, Sequence
 from dataclasses import dataclass, field
 from pathlib import Path
-from typing import Any
+from typing import Any, Literal

 from langchain.agents import create_agent
 from langchain.agents.middleware import AgentMiddleware
@@ -55,6 +55,9 @@ from deerflow.uploads.manager import (
 logger = logging.getLogger(__name__)


+StreamEventType = Literal["values", "messages-tuple", "custom", "end"]
+
+
 @dataclass
 class StreamEvent:
     """A single event from the streaming agent response.
@@ -69,7 +72,7 @@ class StreamEvent:
         data: Event payload. Contents vary by type.
     """

-    type: str
+    type: StreamEventType
     data: dict[str, Any] = field(default_factory=dict)


@@ -254,13 +257,53 @@ class DeerFlowClient:

         return get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled)

+    @staticmethod
+    def _serialize_tool_calls(tool_calls) -> list[dict]:
+        """Reshape LangChain tool_calls into the wire format used in events."""
+        return [{"name": tc["name"], "args": tc["args"], "id": tc.get("id")} for tc in tool_calls]
+
+    @staticmethod
+    def _ai_text_event(msg_id: str | None, text: str, usage: dict | None) -> "StreamEvent":
+        """Build a ``messages-tuple`` AI text event, attaching usage when present."""
+        data: dict[str, Any] = {"type": "ai", "content": text, "id": msg_id}
+        if usage:
+            data["usage_metadata"] = usage
+        return StreamEvent(type="messages-tuple", data=data)
+
+    @staticmethod
+    def _ai_tool_calls_event(msg_id: str | None, tool_calls) -> "StreamEvent":
+        """Build a ``messages-tuple`` AI tool-calls event."""
+        return StreamEvent(
+            type="messages-tuple",
+            data={
+                "type": "ai",
+                "content": "",
+                "id": msg_id,
+                "tool_calls": DeerFlowClient._serialize_tool_calls(tool_calls),
+            },
+        )
+
+    @staticmethod
+    def _tool_message_event(msg: ToolMessage) -> "StreamEvent":
+        """Build a ``messages-tuple`` tool-result event from a ToolMessage."""
+        return StreamEvent(
+            type="messages-tuple",
+            data={
+                "type": "tool",
+                "content": DeerFlowClient._extract_text(msg.content),
+                "name": msg.name,
+                "tool_call_id": msg.tool_call_id,
+                "id": msg.id,
+            },
+        )
+
     @staticmethod
     def _serialize_message(msg) -> dict:
         """Serialize a LangChain message to a plain dict for values events."""
         if isinstance(msg, AIMessage):
             d: dict[str, Any] = {"type": "ai", "content": msg.content, "id": getattr(msg, "id", None)}
             if msg.tool_calls:
-                d["tool_calls"] = [{"name": tc["name"], "args": tc["args"], "id": tc.get("id")} for tc in msg.tool_calls]
+                d["tool_calls"] = DeerFlowClient._serialize_tool_calls(msg.tool_calls)
             if getattr(msg, "usage_metadata", None):
                 d["usage_metadata"] = msg.usage_metadata
             return d
@@ -438,6 +481,53 @@ class DeerFlowClient:
         consumers can switch between HTTP streaming and embedded mode
         without changing their event-handling logic.

+        Token-level streaming
+        ~~~~~~~~~~~~~~~~~~~~~
+        This method subscribes to LangGraph's ``messages`` stream mode, so
+        ``messages-tuple`` events for AI text are emitted as **deltas** as
+        the model generates tokens, not as one cumulative dump at node
+        completion. Each delta carries a stable ``id`` — consumers that
+        want the full text must accumulate ``content`` per ``id``.
+        ``chat()`` already does this for you.
+
+        Tool calls and tool results are still emitted once per logical
+        message. ``values`` events continue to carry full state snapshots
+        after each graph node finishes; AI text already delivered via the
+        ``messages`` stream is **not** re-synthesized from the snapshot to
+        avoid duplicate deliveries.
+
+        Why not reuse Gateway's ``run_agent``?
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+        Gateway (``runtime/runs/worker.py``) has a complete streaming
+        pipeline: ``run_agent`` → ``StreamBridge`` → ``sse_consumer``. It
+        looks like this client duplicates that work, but the two paths
+        serve different audiences and **cannot** share execution:
+
+        * ``run_agent`` is ``async def`` and uses ``agent.astream()``;
+          this method is a sync generator using ``agent.stream()`` so
+          callers can write ``for event in client.stream(...)`` without
+          touching asyncio. Bridging the two would require spinning up
+          an event loop + thread per call.
+        * Gateway events are JSON-serialized by ``serialize()`` for SSE
+          wire transmission. This client yields in-process stream event
+          payloads directly as Python data structures (``StreamEvent``
+          with ``data`` as a plain ``dict``), without the extra
+          JSON/SSE serialization layer used for HTTP delivery.
+        * ``StreamBridge`` is an asyncio-queue decoupling producers from
+          consumers across an HTTP boundary (``Last-Event-ID`` replay,
+          heartbeats, multi-subscriber fan-out). A single in-process
+          caller with a direct iterator needs none of that.
+
+        So ``DeerFlowClient.stream()`` is a parallel, sync, in-process
+        consumer of the same ``create_agent()`` factory — not a wrapper
+        around Gateway. The two paths **should** stay in sync on which
+        LangGraph stream modes they subscribe to; that invariant is
+        enforced by ``tests/test_client.py::test_messages_mode_emits_token_deltas``
+        rather than by a shared constant, because the three layers
+        (Graph, Platform SDK, HTTP) each use their own naming
+        (``messages`` vs ``messages-tuple``) and cannot literally share
+        a string.
+
         Args:
             message: User message text.
             thread_id: Thread ID for conversation context. Auto-generated if None.
@@ -448,8 +538,8 @@ class DeerFlowClient:
             StreamEvent with one of:
             - type="values" data={"title": str|None, "messages": [...], "artifacts": [...]}
             - type="custom" data={...}
-            - type="messages-tuple" data={"type": "ai", "content": str, "id": str}
-            - type="messages-tuple" data={"type": "ai", "content": str, "id": str, "usage_metadata": {...}}
+            - type="messages-tuple" data={"type": "ai", "content": <delta>, "id": str}
+            - type="messages-tuple" data={"type": "ai", "content": <delta>, "id": str, "usage_metadata": {...}}
             - type="messages-tuple" data={"type": "ai", "content": "", "id": str, "tool_calls": [...]}
             - type="messages-tuple" data={"type": "tool", "content": str, "name": str, "tool_call_id": str, "id": str}
             - type="end" data={"usage": {"input_tokens": int, "output_tokens": int, "total_tokens": int}}
@@ -466,13 +556,47 @@ class DeerFlowClient:
             context["agent_name"] = self._agent_name

         seen_ids: set[str] = set()
+        # Cross-mode handoff: ids already streamed via LangGraph ``messages``
+        # mode so the ``values`` path skips re-synthesis of the same message.
+        streamed_ids: set[str] = set()
+        # The same message id carries identical cumulative ``usage_metadata``
+        # in both the final ``messages`` chunk and the values snapshot —
+        # count it only on whichever arrives first.
+        counted_usage_ids: set[str] = set()
         cumulative_usage: dict[str, int] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}

+        def _account_usage(msg_id: str | None, usage: Any) -> dict | None:
+            """Add *usage* to cumulative totals if this id has not been counted.
+
+            ``usage`` is a ``langchain_core.messages.UsageMetadata`` TypedDict
+            or ``None``; typed as ``Any`` because TypedDicts are not
+            structurally assignable to plain ``dict`` under strict type
+            checking. Returns the normalized usage dict (for attaching
+            to an event) when we accepted it, otherwise ``None``.
+            """
+            if not usage:
+                return None
+            if msg_id and msg_id in counted_usage_ids:
+                return None
+            if msg_id:
+                counted_usage_ids.add(msg_id)
+            input_tokens = usage.get("input_tokens", 0) or 0
+            output_tokens = usage.get("output_tokens", 0) or 0
+            total_tokens = usage.get("total_tokens", 0) or 0
+            cumulative_usage["input_tokens"] += input_tokens
+            cumulative_usage["output_tokens"] += output_tokens
+            cumulative_usage["total_tokens"] += total_tokens
+            return {
+                "input_tokens": input_tokens,
+                "output_tokens": output_tokens,
+                "total_tokens": total_tokens,
+            }
+
         for item in self._agent.stream(
             state,
             config=config,
             context=context,
-            stream_mode=["values", "custom"],
+            stream_mode=["values", "messages", "custom"],
         ):
             if isinstance(item, tuple) and len(item) == 2:
                 mode, chunk = item
@@ -484,6 +608,36 @@ class DeerFlowClient:
                 yield StreamEvent(type="custom", data=chunk)
                 continue

+            if mode == "messages":
+                # LangGraph ``messages`` mode emits ``(message_chunk, metadata)``.
+                if isinstance(chunk, tuple) and len(chunk) == 2:
+                    msg_chunk, _metadata = chunk
+                else:
+                    msg_chunk = chunk
+
+                msg_id = getattr(msg_chunk, "id", None)
+
+                if isinstance(msg_chunk, AIMessage):
+                    text = self._extract_text(msg_chunk.content)
+                    counted_usage = _account_usage(msg_id, msg_chunk.usage_metadata)
+
+                    if text:
+                        if msg_id:
+                            streamed_ids.add(msg_id)
+                        yield self._ai_text_event(msg_id, text, counted_usage)
+
+                    if msg_chunk.tool_calls:
+                        if msg_id:
+                            streamed_ids.add(msg_id)
+                        yield self._ai_tool_calls_event(msg_id, msg_chunk.tool_calls)
+
+                elif isinstance(msg_chunk, ToolMessage):
+                    if msg_id:
+                        streamed_ids.add(msg_id)
+                    yield self._tool_message_event(msg_chunk)
+                continue
+
+            # mode == "values"
             messages = chunk.get("messages", [])

             for msg in messages:
@@ -493,47 +647,25 @@ class DeerFlowClient:
                 if msg_id:
                     seen_ids.add(msg_id)

+                # Already streamed via ``messages`` mode; only (defensively)
+                # capture usage here and skip re-synthesizing the event.
+                if msg_id and msg_id in streamed_ids:
+                    if isinstance(msg, AIMessage):
+                        _account_usage(msg_id, getattr(msg, "usage_metadata", None))
+                    continue
+
                 if isinstance(msg, AIMessage):
-                    # Track token usage from AI messages
-                    usage = getattr(msg, "usage_metadata", None)
-                    if usage:
-                        cumulative_usage["input_tokens"] += usage.get("input_tokens", 0) or 0
-                        cumulative_usage["output_tokens"] += usage.get("output_tokens", 0) or 0
-                        cumulative_usage["total_tokens"] += usage.get("total_tokens", 0) or 0
+                    counted_usage = _account_usage(msg_id, msg.usage_metadata)

                     if msg.tool_calls:
-                        yield StreamEvent(
-                            type="messages-tuple",
-                            data={
-                                "type": "ai",
-                                "content": "",
-                                "id": msg_id,
-                                "tool_calls": [{"name": tc["name"], "args": tc["args"], "id": tc.get("id")} for tc in msg.tool_calls],
-                            },
-                        )
+                        yield self._ai_tool_calls_event(msg_id, msg.tool_calls)

                     text = self._extract_text(msg.content)
                     if text:
-                        event_data: dict[str, Any] = {"type": "ai", "content": text, "id": msg_id}
-                        if usage:
-                            event_data["usage_metadata"] = {
-                                "input_tokens": usage.get("input_tokens", 0) or 0,
-                                "output_tokens": usage.get("output_tokens", 0) or 0,
-                                "total_tokens": usage.get("total_tokens", 0) or 0,
-                            }
-                        yield StreamEvent(type="messages-tuple", data=event_data)
+                        yield self._ai_text_event(msg_id, text, counted_usage)

                 elif isinstance(msg, ToolMessage):
-                    yield StreamEvent(
-                        type="messages-tuple",
-                        data={
-                            "type": "tool",
-                            "content": self._extract_text(msg.content),
-                            "name": getattr(msg, "name", None),
-                            "tool_call_id": getattr(msg, "tool_call_id", None),
-                            "id": msg_id,
-                        },
-                    )
+                    yield self._tool_message_event(msg)

             # Emit a values event for each state snapshot
             yield StreamEvent(
@@ -550,10 +682,12 @@ class DeerFlowClient:
     def chat(self, message: str, *, thread_id: str | None = None, **kwargs) -> str:
         """Send a message and return the final text response.

-        Convenience wrapper around :meth:`stream` that returns only the
-        **last** AI text from ``messages-tuple`` events. If the agent emits
-        multiple text segments in one turn, intermediate segments are
-        discarded. Use :meth:`stream` directly to capture all events.
+        Convenience wrapper around :meth:`stream` that accumulates delta
+        ``messages-tuple`` events per ``id`` and returns the text of the
+        **last** AI message to complete. Intermediate AI messages (e.g.
+        planner drafts) are discarded — only the final id's accumulated
+        text is returned. Use :meth:`stream` directly if you need every
+        delta as it arrives.

         Args:
             message: User message text.
@@ -561,15 +695,21 @@ class DeerFlowClient:
             **kwargs: Override client defaults (same as stream()).

         Returns:
-            The last AI message text, or empty string if no response.
+            The accumulated text of the last AI message, or empty string
+            if no AI text was produced.
         """
-        last_text = ""
+        # Per-id delta lists joined once at the end — avoids the O(n²) cost
+        # of repeated ``str + str`` on a growing buffer for long responses.
+        chunks: dict[str, list[str]] = {}
+        last_id: str = ""
         for event in self.stream(message, thread_id=thread_id, **kwargs):
             if event.type == "messages-tuple" and event.data.get("type") == "ai":
-                content = event.data.get("content", "")
-                if content:
-                    last_text = content
-        return last_text
+                msg_id = event.data.get("id") or ""
+                delta = event.data.get("content", "")
+                if delta:
+                    chunks.setdefault(msg_id, []).append(delta)
+                    last_id = msg_id
+        return "".join(chunks.get(last_id, ()))

     # ------------------------------------------------------------------
     # Public API — configuration queries