deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-11 09:55:59 +00:00

Author	SHA1	Message	Date
taohe	d1768606c0	Merge remote-tracking branch 'origin/main' into codex/im-channel-connections # Conflicts: # backend/app/gateway/services.py # frontend/src/app/workspace/chats/page.tsx	2026-06-11 17:51:16 +08:00
taohe	dae7c7870e	Prefill IM channel runtime config	2026-06-11 17:02:02 +08:00
taohe	4f56437030	Show IM channel source on threads	2026-06-11 16:51:04 +08:00
taohe	42fd0cc22f	Use default user for auth-disabled local mode	2026-06-11 16:33:37 +08:00
taohe	4a0278420f	Allow disconnecting runtime IM channels	2026-06-11 16:10:02 +08:00
taohe	ade4a55cfe	Persist IM runtime config locally	2026-06-11 15:58:40 +08:00
taohe	92f562920d	Avoid password autofill for channel secrets	2026-06-11 15:12:18 +08:00
taohe	9d51e38641	Keep configured IM channels editable	2026-06-11 14:40:37 +08:00
taohe	c966eb71a7	Guard global shortcut key handling	2026-06-11 13:57:56 +08:00
taohe	c4368c9018	Add runtime setup for enabled IM channels	2026-06-11 12:10:16 +08:00
taohe	f83767bb17	Fix IM channel provider icons	2026-06-11 11:48:08 +08:00
taohe	0e939bfe23	Keep unavailable channel connect buttons clickable	2026-06-11 11:28:56 +08:00
taohe	a52deada8b	Support all integrated IM channel connections	2026-06-11 11:19:27 +08:00
Huixin615	5819bd8a59	fix(frontend): paginate workspace chat list beyond 50 threads (#3482 ) (#3485 ) * fix(frontend): paginate workspace chat list beyond 50 threads (#3482) The sidebar 'Recent chats' and /workspace/chats list were hard-capped at the first 50 threads returned by threads.search. Replace the single-shot useThreads() consumers with useInfiniteThreads() and add an IntersectionObserver sentinel to each list so further pages are fetched on demand. In search mode on the chats page, the sentinel is replaced by an explicit 'Load more' button to prevent the observer from draining the entire backend list while the filtered view stays empty. - Add useInfiniteThreads + page-size constant and pure cache helpers (map/filterInfiniteThreadsCache, getInfiniteThreadsNextPageParam) - Mirror rename / delete / stream-finish updates into the new infinite cache so optimistic UI stays consistent - Extend the e2e mock to honour limit/offset slicing - Unit tests for the cache helpers and pagination boundary - Playwright e2e covering chats page + sidebar load-more, and the search-mode guard against runaway auto-pagination - Add en/zh i18n entries for the search-mode load-more button Fixes #3482 * docs(frontend): clarify infinite-threads offset semantics and test post-delete invariant - Add docstring to getInfiniteThreadsNextPageParam explaining that TanStack Query freezes the returned offset into pageParams once, so optimistic cache mutations that shrink page lengths (filterInfiniteThreadsCache on delete) cannot retroactively move the offset backwards. Delete/rename paths reconcile against the backend via invalidateQueries in onSettled. - Add unit test covering the post-delete invariant. - Fix misleading comment in thread-list-infinite-scroll.spec.ts: the thread-search mock does not sort by updated_at; it returns the array in the order provided. Addresses Copilot CR comments on #3485. 
* fix(frontend): mirror onCreated upsert into infinite cache; add sidebar Load-older button Address review feedback on #3485: - New upsertThreadInInfiniteCache helper; useThreadStream onCreated now upserts into both the legacy ['threads','search'] cache and the new infinite cache, so a freshly created thread appears in the sidebar immediately during streaming instead of only after the run finishes and onSettled invalidates the query. Restores parity with main. - Sidebar Recent Chats now exposes a visible 'Load older chats' button alongside the IntersectionObserver sentinel, so keyboard-only users and environments where IO is unavailable can still reach older conversations. - Add zh-CN / en-US / types entry for chats.loadOlderChats. - Cover the new helper with 3 unit tests (no-op on uninitialised cache, prepend new thread to first page, merge with existing entry without duplication).	2026-06-10 23:59:38 +08:00
taohe	d06643d8a2	Align IM connections with local channels	2026-06-10 22:16:47 +08:00
taohe	92c185b90d	Support local IM channel connections	2026-06-10 21:59:33 +08:00
taohe	b66152c514	Use async channel connect flow	2026-06-10 21:34:29 +08:00
taohe	78fbc0abdb	Fix dev startup and channel connect popup	2026-06-10 21:33:15 +08:00
taohe	ec5ed185cd	Merge remote-tracking branch 'origin/main' into codex/im-channel-connections # Conflicts: # backend/app/channels/discord.py # backend/app/channels/manager.py # backend/app/channels/slack.py # backend/app/channels/telegram.py	2026-06-10 21:13:02 +08:00
taohe	dbe3a3bb0d	Add user-owned IM channel connections	2026-06-10 21:07:44 +08:00
DanielWalnut	2b795265e7	fix: align auth-disabled mode and mock history loading (#3471) * fix: align auth-disabled mode and mock history loading * fix: address auth-disabled review feedback * test: cover auth-disabled backend contract * style: format frontend tests * fix: address follow-up review comments	2026-06-10 16:11:00 +08:00
DanielWalnut	16391e35ab	fix(skills): harden slash skill activation across chat channels (#3466) * support slash skill activation * format slash skill activation * Preserve slash skill activation with uploads * Address slash skill review feedback * Address slash skill follow-up review * Fix lazy slash skill storage resolution * Keep slash skill activation out of system prompt * Address slash skill review issues * fix: harden slash skill command handling * feat(frontend): add slash skill autocomplete * fix: address slash skill review feedback * fix: preserve slash skill text for IM uploads	2026-06-09 23:07:17 +08:00
Admire	5b81588b87	fix(frontend): fallback Streamdown clipboard copy (#3397) * fix(frontend): fallback streamdown clipboard copy * fix(frontend): address clipboard fallback review * fix(frontend): normalize clipboard fallback rejection * fix(frontend): harden clipboard fallback install * fix(frontend): clarify clipboard fallback errors * fix(frontend): cover clipboard fallback edge cases * fix(frontend): tighten clipboard fallback cleanup * fix(frontend): reduce clipboard fallback copy window * fix(frontend): guard clipboard item fallback install * fix(frontend): clean up clipboard fallback on selection errors * Address clipboard fallback review feedback * fix(frontend): guard clipboard fallback install during SSR	2026-06-09 22:09:13 +08:00
AochenShen99	0fb18e368c	refactor(lead-agent): make build_middlewares public to drop the last cross-module private import (#3458 ) `client.py` imported the private `_build_middlewares` from `agent.py` across a module boundary and called it as public API. Because the `_` name signals "module-private, no external callers", any future rename or signature change silently breaks the embedded `DeerFlowClient` path — and the test suite even monkeypatched `deerflow.client._build_middlewares`, baking the leak in. `DeerFlowClient` is a lead-agent variant that genuinely needs the lead agent's full middleware composition, so make the dependency honest: promote the helper to a documented public entry point `build_middlewares` and update every in-repo caller. Found during #3341 review; #3341 already removed one such leak (`_assemble_deferred` -> public `assemble_deferred_tools`) and left this one out of scope on purpose. - agent.py: rename def + both internal call sites; expand the docstring into a public-entry-point contract and document the previously-undocumented model_name / app_config / deferred_setup params - client.py: import + call site now use the public name (removes the last cross-module private import) - scripts/tool-error-degradation-detection.sh: update its import + call site - tests (5 files): update monkeypatch/patch targets and direct calls - docs (backend/CLAUDE.md, plan_mode_usage.md, middlewares.mdx): sync the live references that describe the symbol as current API Pure mechanical rename, no behavior change. Historical design docs (rfc, superpowers spec) intentionally keep the old name as point-in-time records. Closes #3431	2026-06-09 11:56:28 +08:00
DanielWalnut	cd5bedaa74	feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437 ) * docs(spec): MiniMax integration for generation skills + new music skill Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(plan): MiniMax generation providers implementation plan Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(skills): add importlib loader + FakeResp for skill tests * test(skills): register loaded module in sys.modules; raise requests.HTTPError in FakeResp * feat(image-generation): add MiniMax provider with env auto-detect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(image-generation): guard unknown provider, derive ref MIME, strengthen tests Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(video-generation): add MiniMax provider with async poll/download Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(video-generation): surface base_resp errors while polling; add timeout test * feat(podcast-generation): add MiniMax t2a_v2 provider with env auto-detect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(podcast-generation): restore TTS credential guard; add volcengine + voice tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(music-generation): new MiniMax music skill via skill-creator Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(music-generation): treat empty lyrics as absent; test no-audio-data path * refactor(skills): add request timeouts to MiniMax network calls Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Potential fix for pull request finding 'Explicit returns mixed with implicit (fall through) returns' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix(models): strip inconsistent user-message names for MiniMax chat DeerFlow middlewares tag user messages 
with provenance names (user-input, summary, loop_warning); langchain serializes them into the OpenAI-compatible payload and MiniMax rejects mismatched user-message names with "user name must be consistent (2013)". PatchedChatMiniMax now drops the per-message name from user-role messages. Point the config.example MiniMax models at PatchedChatMiniMax so they also get reasoning_content mapping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(image-generation): MiniMax sends JSON prompt field, guard 1500-char limit MiniMax image-01 takes one text string capped at 1500 chars, but the skill was sending the whole structured JSON. The MiniMax provider now extracts the JSON `prompt` field (relying on prompt_optimizer to expand it) and fails fast with a clear error before calling the API when that field exceeds 1500 chars. Authoring stays provider-agnostic; Gemini still receives the full JSON. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(podcast-generation): per-provider TTS concurrency and retry/backoff Each TTS provider owns its concurrency internally — MiniMax runs single-threaded to reduce rate-limit failures, Volcengine keeps 4 workers — with automatic retry and backoff on transient HTTP and base_resp errors. No caller-facing concurrency knob. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(skills): address Copilot review comments on generation skills - video: add raise_for_status + timeout to the Gemini download/POST/poll calls so non-2xx responses surface as clear HTTP errors instead of JSON/KeyError or hangs - video: check the task Fail status before the generic base_resp check so the failure keeps its task_id context - video/image: create the output file parent directory before writing (matching music-generation) so nested output paths do not raise FileNotFoundError - music: require a non-empty prompt and fail fast with ValueError instead of sending an empty prompt to the API Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(scripts): reclaim dev ports across worktrees in make stop/dev All deer-flow worktrees (main checkout + linked worktrees) hardcode the same dev ports (8001/3000/2026), so a service started from any worktree must be reclaimable from another. stop_all now resolves the set of worktree roots (DEERFLOW_ROOTS) and treats a process as deer-flow-owned when its open files live under any of them. It also force-kills survivors on 2026 alongside 8001/3000, fixing `make dev` aborting on the nginx port preflight when a prior nginx lingered on 2026. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(view-image): hide the injected image-context message from the UI ViewImageMiddleware injects a HumanMessage (text + base64 images) so the vision model can see viewed images, but it was the only internal injector that set neither hide_from_ui nor a hidden name, so it leaked into the chat UI (and IM channels) as a user bubble reading "Here are the images you've viewed:". Mark it with additional_kwargs={"hide_from_ui": True}, matching todo/dynamic_context injections, which the frontend isHiddenFromUIMessage and the channel sender already honor. The model still receives the full content. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(minimax): mark M2.7 models as text-only (no vision) MiniMax M2.7 / M2.7-highspeed do not support vision; only M3 does. The provider config asserted vision support for M2.7 in four places. - config.example.yaml: 4 M2.7 entries -> supports_vision: false - backend/docs/CONFIGURATION.md: M2.7 + highspeed -> supports_vision: false - wizard: add LLMProvider.model_vision_overrides + extra_config_for() so selecting an M2.7 model writes supports_vision: false while M3 (default) keeps vision; wire it through setup_wizard.py - tests: M2.7-highspeed fixture -> supports_vision=False; add test_minimax_vision_is_per_model Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>	2026-06-08 22:04:38 +08:00
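The image-generation change in this commit (send only the JSON `prompt` field to MiniMax and fail fast past 1500 characters) can be sketched roughly as below. This is an illustration under stated assumptions, not the skill's actual code: the helper name `extract_minimax_prompt`, the constant name, and the treatment of a missing `prompt` field are hypothetical; only the use of the `prompt` field and the 1500-character cap come from the commit message.

```python
import json

# Cap stated in the commit message for MiniMax image-01 prompts.
MINIMAX_PROMPT_LIMIT = 1500


def extract_minimax_prompt(structured_prompt: str) -> str:
    """Return the `prompt` field of a structured JSON prompt (hypothetical helper)."""
    data = json.loads(structured_prompt)
    prompt = data.get("prompt") if isinstance(data, dict) else None
    if not isinstance(prompt, str):
        # Assumption: a missing or non-string field is treated as an error.
        raise ValueError("structured prompt has no string 'prompt' field")
    if len(prompt) > MINIMAX_PROMPT_LIMIT:
        # Fail before any network call so the user gets a clear message.
        raise ValueError(
            f"prompt is {len(prompt)} chars; MiniMax accepts at most {MINIMAX_PROMPT_LIMIT}"
        )
    return prompt
```

Checking the length before the request is what keeps the failure local and readable; other providers would bypass this helper and receive the full JSON unchanged.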
DanielWalnut	1651d1f1f5	fix(frontend): restructure Memory settings toolbar into two rows (#3433) The search input, filter tabs, and four action buttons were crammed into a single horizontal row, which squeezed the search box into an unusable sliver and truncated the "Summaries" filter tab to "Summarie". Split the toolbar into two rows: search + filter tabs on the first, actions on the second. The search input now keeps a usable min width, filter tabs use whitespace-nowrap so they never truncate, and the destructive "Clear all memory" button is pushed to the far right (ml-auto) to separate it from the constructive actions. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:17:14 +08:00
Xinmin Zeng	799bef6d9d	fix(replay-e2e): match by conversation, not the living system prompt (#3436 ) * fix(replay-e2e): match by conversation, not the living system prompt The model-replay match key hashed the full input including the lead-agent system prompt. That prompt is edited frequently (e.g. #3195 added a "File Editing Workflow" section), so the committed fixture went stale the moment the prompt changed on main — turning the Layer-2 render gate RED on every unrelated PR (#3430, #3432, ...). This was a self-inflicted false positive. Root-cause fix: - replay_provider._canonical_messages now EXCLUDES the system message from the hash. The conversation (human/ai/tool) is the stable contract that identifies a recorded turn; the system prompt is an internal detail not part of the front-back contract under test. (Mirrors how open-design keys its mock picker on the user prompt, not the system internals.) Proven robust: injecting a prompt edit no longer causes a replay miss. - Layer-1 golden was BLIND to replay misses: the gateway swallows a miss into an assistant error message, so the shape-only golden stayed green on a stale fixture. It now inspects replay_provider.replay_misses() and fails loud. (Layer-2 already fails on a miss.) - Re-recorded write_read_file.ultra fixture + regenerated golden under the new conversation-only hash. - Layer-2 render spec: assert the in-graph auto-title (deterministic); the follow-up suggestion is fired async and depends on a clean JSON model output, so assert it only when the fixture captured one — never gate on its absence (recording flakiness must not block CI). - docs: REPLAY_E2E.md updated. Verified: Layer-1 golden green (no miss), Layer-2 both specs green, CI=true make test 4033 passed / 0 failed, frontend pnpm check clean. 
* test(replay-e2e): restore suggestions coverage with a reliable capture Addresses review feedback (the suggestion path was dropped from Layer-2): - record spec now waits for the `/suggestions` response before checking capture stability, so the recorded fixture reliably includes the frontend-fired suggestions turn (previously the stability window could return before suggestions fired, yielding a fixture without it). - Re-recorded write_read_file.ultra: 5 turns (write_file, auto-title, read_file, answer, suggestions). Golden unchanged — suggestions is a separate /suggestions call, not part of the /runs/stream SSE sequence. - Layer-2 spec: restore the hard `EXPECTED_SUGGESTION` assertion. With the record spec now waiting for /suggestions, a fixture missing the suggestion turn means a broken recording and must fail loud, not pass silently. Verified: Layer-1 golden green (no miss), Layer-2 both specs green (auto-title + suggestion render), frontend pnpm check clean. * ci: re-trigger (flaky Docker Hub image pull in sandbox e2e, unrelated) backend-unit-tests failed only in test_sandbox_orphan_reconciliation_e2e.py with 'docker pull busybox:latest ... context deadline exceeded' — a CI-runner network flake reaching Docker Hub, not related to this docs/tests-only change. Empty commit to re-run CI. --------- Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>	2026-06-08 17:32:41 +08:00
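The root-cause fix in this commit, keying the replay match on the conversation and leaving the system prompt out of the hash, can be illustrated with a small sketch. It assumes dict-shaped messages with `role` and `content` keys and uses hypothetical function names; the real `_canonical_messages` in `replay_provider` is not shown in this log and may normalise more than this.

```python
import hashlib
import json


def canonical_messages(messages):
    """Keep only conversation turns; system messages are excluded from the key."""
    return [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]


def match_key(messages):
    """Stable hash of the conversation, independent of the system prompt."""
    payload = json.dumps(canonical_messages(messages), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this shape, editing the system prompt leaves the key unchanged, while any change to a human, AI, or tool turn produces a different key and therefore a replay miss.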
Xinmin Zeng	88759015e4	test(e2e): deterministic record/replay front-back contract verification (#3365 ) * test(e2e): record/replay front-back contract verification Guards the front-back contract with a deterministic, key-free record/replay harness (mirrors open-design's golden-trace approach): - ReplayChatModel (tests/replay_provider.py): replays recorded LLM turns by a normalized hash of the model input. Strips <system-reminder>/date/uuid/tmp-path so one fixture replays across days and from both the browser and direct-POST paths; a miss raises loudly (no silent divergence). - Recording is record-through-browser (scripts/record_gateway.py + build_fixture_from_jsonl.py + frontend/tests/e2e-record): a real run is driven through the real frontend so captured inputs match exactly what the browser sends; fixtures contain no API key. - Layer 1 — backend golden (tests/test_replay_golden.py): replay through the real gateway, assert the SSE event sequence == committed golden. - Layer 2 — full-stack render (frontend/tests/e2e-real-backend): real Next.js + real gateway (replay model) + Chromium; assert the replayed auto-title and follow-up suggestions render. DOM assertions are the gate; visual regression is a local dev gate (CI uploads the render as an artifact). - CI (.github/workflows/replay-e2e.yml): both layers, triggered on EITHER side of the contract (frontend/** or backend gateway/harness/fixtures). * test(e2e): multi-run render-order cross-stack scenario (#3352) Guards the dangerous front-back class where a backend ordering change silently breaks a frontend assumption while both sides' unit tests stay green. Reproduces issue #3352: backend list_by_thread returns runs newest-first (#2932) and the frontend prepended per-run pages, inverting chronological order once the checkpoint no longer held the older messages. - tests/seed_runs_router.py: test-only seeder, mounted on the replay gateway only when DEERFLOW_ENABLE_TEST_SEED=1 (never in the production app). 
Seeds a thread with >=2 runs + per-run message events and no checkpoint -- the #3352 precondition -- so the frontend per-run reload path is the sole source of truth and the prepend inversion is observable. - frontend/tests/e2e-real-backend/multi-run-order.spec.ts: drives the real frontend against the real gateway, asserts the first run renders above the second. Reverting the #3354 fix turns it red. - replay-e2e.yml: trigger on the new replay test-infra paths. - docs: REPLAY_E2E.md cross-stack scenario section. * test(e2e): address Copilot review on the replay harness - Fix stale recorder references (scripts/record_traces.py -> scripts/record_gateway.py + scripts/build_fixture_from_jsonl.py) in replay_provider.py, test_replay_golden.py, _replay_fixture.py. - MODE_CONTEXT['ultra']: thinking_enabled False -> True, mirroring the frontend's `context.mode !== 'flash'` (hooks.ts). It did not affect the hashed input (Layer 1 golden still green), but the table now matches the real frontend context it claims to mirror. - replay_provider.py docstring: stop claiming memory is recorded-enabled; the replay config disables memory/summarization for determinism (title stays, as an in-graph deterministic call). - record_gateway.py / run_replay_gateway.py: override DEER_FLOW_HOME instead of setdefault, so an outer value can't leak into the hermetic harness. - record_gateway.py: clear error when DEERFLOW_RECORD_OUT is unset (was a bare KeyError). - playwright.record.config.ts: forward OPENAI_/DEERFLOW_RECORD_OUT only when set, so the gateway raises a clear 'missing env' error instead of getting ''. test(e2e): address Copilot review round 2 - seed_runs_router.py: constrain SeedMessage.role to Literal['human','ai'] so a bad value is a clean 422 at the boundary instead of a 500 (KeyError on _EVENT_TYPE). - record-write-read-file.spec.ts: waitForCaptureStable now throws on timeout instead of returning the last count, so a truncated/partial recording can't pass silently. 
- real-backend-render.spec.ts: guard the suggestions JSON.parse; a bracket-prefixed non-JSON turn falls back to '' so the existing not.toBe('') assertion fails clearly instead of a generic parse throw.	2026-06-08 12:35:03 +08:00
Xinmin Zeng	7679f21edf	fix(frontend): truncate overflowing text in agent cards (#3391 ) * fix(frontend): truncate overflowing text in agent cards Long custom agent names, descriptions, skills and tool-group labels overflowed the agent card and broke its layout (issue #3389). The title already had `truncate`, but it never took effect: an ancestor flex container lacked `min-w-0`, so the flex item refused to shrink below its content width. - Restore the truncation chain by adding `min-w-0` to the title's flex ancestors so `truncate` can finally take effect. - Cap and ellipsize model / skill / tool-group badges via a small `TruncatedBadge` (`block max-w-full truncate`). - Reveal the full value on hover, but only when the text is actually clipped (`TruncatedTooltip`, width + height detection), so names, descriptions and labels stay readable without popping redundant tooltips on short cards. * fix(frontend): wrap unbreakable strings in agent card tooltips A long token with no break opportunity (no spaces or hyphens) could still overflow the tooltip horizontally. Add `break-words` next to the existing `text-wrap` so such strings wrap instead of overflowing. Addresses Copilot review feedback on tooltip wrapping robustness. * fix(frontend): show agent card tooltips instantly Drop the explicit `delayDuration` so card tooltips fall back to the provider's default 0ms delay. Instant feedback is better UX for revealing text that is already clipped, per maintainer review. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-06-07 23:29:59 +08:00
Xinmin Zeng	8d2e55a05f	fix(subagent): structured subagent_status field over text parsing (#3146 ) (#3154 ) * fix(subagent): structured subagent_status field over text parsing Closes #3146. ## Why The frontend used to derive subtask card state by string-matching the leading text of the `task` tool's result. That contract surface was fragile — `#3107` BUG-007 and the `#3131` review both surfaced cases where new backend wording (`Task cancelled by user.`, `Task polling timed out after N minutes`, `ToolErrorHandlingMiddleware` exception wrappers) silently broke the card lifecycle. The frontend fallback kept growing more prefixes; any future rewording would break it again. ## Design 1. Backend → frontend contract: `ToolMessage.additional_kwargs` carries `subagent_status` (one of `completed \| failed \| cancelled \| timed_out \| polling_timed_out`) and an optional `subagent_error` blob. The frontend prefers it over parsing `content`. 2. Centralised stamping, not 8 sprinkled stamps: rather than have each of `task_tool.py`'s 5 normal-return + 3 pre-execution `Error:` paths remember to set `additional_kwargs`, `ToolErrorHandlingMiddleware` stamps the field after every task-tool call. Adding a new return path in `task_tool.py` cannot now skip the stamp. 3. Cross-language contract fixture: the prefix→status mapping is the one piece both sides must agree on. The shared fixture at `contracts/subagent_status_contract.json` lists every backend return string, the expected status, and what the error substring should contain. Backend test (`backend/tests/test_subagent_status_contract.py`) and frontend test (`frontend/tests/unit/core/tasks/subtask-result.test.ts`) both load that fixture and assert the same cases. A wording drift on either side fails the matching language's test. 4. Round-trip serialisation pinned: the round-trip test asserts `ToolMessage.model_dump_json()` → `model_validate_json()` preserves `additional_kwargs.subagent_status`. 
Catches the case where a future LangChain or Pydantic upgrade silently strips unknown kwargs. 5. Frontend status collapse documented: the backend has five status values, the frontend card has three (`completed \| failed \| in_progress`). `cancelled` / `timed_out` / `polling_timed_out` all collapse to `failed` with the original status preserved in `error`. `parseSubtaskResult` returns `in_progress` for unknown values so a backend that ships a new enum variant before the frontend upgrades degrades to the legacy prefix fallback instead of getting pinned. ## Changes Backend: - `deerflow.subagents.status_contract` — new module exporting `SUBAGENT_STATUS_KEY`, `SUBAGENT_ERROR_KEY`, `SUBAGENT_STATUS_VALUES`, `extract_subagent_status(content)`, and `make_subagent_additional_kwargs(status, error)`. - `ToolErrorHandlingMiddleware`: new `_stamp_task_subagent_status` helper centralises the stamp; `wrap_tool_call` / `awrap_tool_call` stamp on the success path; `_build_error_message` stamps on the wrapper path (carrying `ExcClass: detail` into `subagent_error`). Non-task tools are untouched. - New tests: `test_subagent_status_contract.py` (19 cases from the shared fixture + status-enum / blank-error / unknown-status rejection) and `test_tool_error_handling_subagent_stamp.py` (middleware integration: terminal-content stamps, non-terminal doesn't, non-task tools untouched, async path mirrors sync, existing additional_kwargs survive, JSON round-trip preserved). Frontend: - `parseSubtaskResult(text, additionalKwargs?)` — prefers the structured stamp; falls back to the legacy prefix matcher for historical threads / unknown future status values. - `STRUCTURED_STATUS_TO_SUBTASK` documents the five→three collapse. - `message-list.tsx` passes `message.additional_kwargs` through. - `subtask-result.test.ts` adds a structured-status block + a fixture-driven contract block; legacy prefix tests stay green for the fallback path. 
Contract: - `contracts/subagent_status_contract.json` — single source of truth both languages load. Whitespace variants, varied N for polling timeouts, the 3 pre-execution `Error:` returns task_tool produces, and the middleware wrapper shape are all in there. ## Test plan - `make lint` clean (backend + frontend). - `pytest tests/test_subagent_status_contract.py tests/test_tool_error_handling_subagent_stamp.py` → 37 passed. - `pnpm test --run` → 103 passed (was 76, +27 new). ## Migration / fallback retirement The text-prefix fallback stays in place until backend telemetry shows the frontend never hits it for newly produced messages. At that point a follow-up PR can drop the prefix branches and keep only the structured-status branch. Refs: bytedance/deer-flow#3138 (split summary), #3107 (origin), #3131 (prior prefix-only fix), #3146 (this issue). * fix(subtask): back-fill result/error from text when structured status present Three follow-ups on the PR #3154 review: 1. `readStructuredStatus` no longer short-circuits the prefix parse. The backend currently stamps only the `subagent_status` enum value; the human-facing `result` body and wrapped-error message still live in `ToolMessage.content`. Dropping the text parse meant successful tasks rendered empty completed pills and wrapped failures lost their diagnostic. Now both shapes get composed: structured status wins, `result`/`error` come from text when both sides agree, and a lying success body under a `failed` stamp is dropped instead of leaking. 2. Replace the ESM-incompatible `__dirname` fixture lookup in subtask-result.test.ts with `fileURLToPath(new URL(..., import.meta.url))`. The frontend package is `"type": "module"`, so the previous path would have thrown at runtime if anything ever changed under the contract directory. 3. Drop the `$schema` reference from contracts/subagent_status_contract.json pointing at a file that doesn't exist in the tree. 
Three new tests cover the structured + text composition: completed back-fills the success body, failed back-fills the wrapper text, and unrecognised content under a `failed` stamp stays empty rather than echoing noise.	2026-06-07 22:49:55 +08:00
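The five-to-three status collapse described in this commit can be sketched as follows. The real logic lives in the TypeScript frontend (`parseSubtaskResult` and `STRUCTURED_STATUS_TO_SUBTASK`); this Python version is only an illustration, the function name `collapse_subagent_status` is hypothetical, and the legacy text-prefix fallback and result back-fill are deliberately omitted.

```python
from typing import Optional

# Five backend status values mapped onto the three card states.
STRUCTURED_STATUS_TO_SUBTASK = {
    "completed": "completed",
    "failed": "failed",
    "cancelled": "failed",
    "timed_out": "failed",
    "polling_timed_out": "failed",
}


def collapse_subagent_status(additional_kwargs: Optional[dict]) -> dict:
    """Map a stamped `subagent_status` onto a card state (illustrative only)."""
    kwargs = additional_kwargs or {}
    status = kwargs.get("subagent_status")
    card_status = (
        STRUCTURED_STATUS_TO_SUBTASK.get(status) if isinstance(status, str) else None
    )
    if card_status is None:
        # Missing or unknown value: stay in progress so a caller can fall
        # back to other parsing instead of pinning a wrong terminal state.
        return {"status": "in_progress", "error": None}
    error = kwargs.get("subagent_error")
    if card_status == "failed" and status != "failed" and error is None:
        # Assumption: keep the original status visible when no error blob exists.
        error = status
    return {"status": card_status, "error": error}
```

For example, a message stamped `cancelled` without a `subagent_error` comes back as `failed` with `cancelled` kept in `error`, and an unrecognised value such as a future enum variant comes back as `in_progress`.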
Nan Gao	9a5de8d6a5	fix(ux): remove Backspace shortcut for deleting prompt attachments (#3410) * Remove backspace attachment deletion * Fix the lint error --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-06-06 15:13:24 +08:00
Nan Gao	1aac408dd0	fix upload file size contract (#3408)	2026-06-06 15:12:17 +08:00
Huixin615	9a53f9dfbb	fix(frontend): preserve chronological order of thread history after context compression (#3354) * fix(frontend): preserve chronological order of thread history after context compression Iterate runs from newest to match backend `list_by_thread` (newest-first) and the prepend semantics of the history loader, so refreshed history renders in A→B→C→D→E→F order. Fixes #3352 * fix(frontend): auto-continue loading runs with no visible messages after context compression	2026-06-03 21:51:48 +08:00
Eilen Shin	019bd16a06	fix: load paginated run history messages (#3305)	2026-06-01 15:50:39 +08:00
Eilen Shin	872079b894	docs: clean standalone LangGraph server remnants (#3301)	2026-05-29 11:36:45 +08:00
Nan Gao	d46a5779bc	fix(chat): preserve messages after summarization (#3280) * fix(chat): preserve messages after summarization * make format * fix(chat): address summarization review comments	2026-05-29 08:24:47 +08:00
Xinmin Zeng	2ace78d1e5	fix(frontend): surface backend detail when agent name check fails (#3048 ) * fix(frontend): surface backend detail when agent name check fails The new-agent page caught AgentNameCheckError but only branched on reason === "backend_unreachable". Everything else (notably the 422 "Invalid agent name '...'. Must match ^[A-Za-z0-9-]+$" response from GET /api/agents/check when the user submits a name with disallowed characters — trailing space, dot, Chinese, invisible whitespace from copy-paste) fell through to the generic fallback "Could not verify name availability — please try again", swallowing the detail that already told the user exactly what to fix. Add a request_failed branch that surfaces err.message (which checkAgentName already populates from the backend's detail at core/agents/api.ts). The disabled / backend_unreachable / unknown- error paths are unchanged. Pin the contract with unit tests covering: 200 success, fetch rejection, 502/503/504 network errors, agents_api disabled detail, 422 validation detail carried verbatim, statusText fallback when detail is absent, and a regression guard against misclassifying a 422 as agents_api disabled. Closes #3041 * fix(frontend): localise the error prefix when surfacing backend detail The previous commit surfaced the backend's raw `err.message` on the new-agent page when the name check failed. The detail itself is English (backend's `_validate_agent_name` text, any 5xx business message, etc.) and dropping it bare into a zh-CN page produced a jarring English-among-Chinese line that didn't match neighbouring strings like "已存在同名智能体" / "无法验证名称可用性". Add `nameStepCheckErrorWithDetail` as a templated string ("Name check failed: {detail}" / "名称校验失败：{detail}"), mirroring the existing `nameStepBootstrapMessage` `{name}` template pattern. The page wraps `err.message` in it when present and falls back to the plain `nameStepCheckError` when the detail is empty. 
Rendered output (verified locally with a Console fetch mock that returns 500 + detail): zh-CN: 名称校验失败：Database connection lost: SQLAlchemy connection pool exhausted (max 5 connections, all in use) en-US: Name check failed: Database connection lost: SQLAlchemy connection pool exhausted (max 5 connections, all in use) The localised prefix tells the user what operation failed; the raw detail tells them why. Translating the detail itself would be lossy (any unbounded backend string would need a translation table) and would break the debuggability the previous commit delivered. Refs #3041 * fix(frontend): distinguish backend detail from generated fallback in AgentNameCheckError Addresses Copilot's review on #3048: the previous commits keyed off `err.message`, but `checkAgentName` substitutes a generated fallback string ("Failed to check agent name: ${statusText}") when the backend sent no detail. That guaranteed `err.message` was always truthy, made the `nameStepCheckError` fallback branch unreachable in practice, and could surface awkward strings like "名称校验失败：Failed to check agent name: Bad Gateway" in the UI. Add an explicit `detail: string | null` field to AgentNameCheckError. `checkAgentName` populates it only when the backend response actually carried a string `detail` (defensive guard against the dict-shaped detail that other deer-flow endpoints use for typed error codes). The new-agent page now selects on `err.detail` instead of `err.message` so the localised fallback wins when no real detail exists. Also fix the prettier formatting that broke lint-frontend CI on the previous push. Test changes: - The 422 carry-through test now asserts both `detail` and `message` hold the backend string verbatim. - A new "falls back to statusText in message but leaves detail null" test pins the contract that no real detail ⇒ no UI surface leak. - A new "treats non-string detail as null" test guards against future backend schema drift toward dict-shaped detail.
Refs #3041 #3048	2026-05-28 18:38:45 +08:00
Admire	2fdfff0db3	fix(frontend): fix Mermaid preview failure in historical messages (#3196) * fix(frontend): render historical mermaid diagrams * fix(frontend): address mermaid review feedback * Stabilize cancel lifecycle test * fix(frontend): handle mermaid fence variants * fix(frontend): normalize mermaid arrow spacing * fix(frontend): handle mermaid CRLF fences * chore: keep mermaid fix frontend-scoped --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-28 18:20:02 +08:00
zgenu	737abc0e45	fix: ignore stale run reconnect conflicts (#3284) * fix: ignore stale run reconnect conflicts * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: ignore stale run reconnect conflicts --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-28 17:29:30 +08:00
Xinmin Zeng	0287240728	fix(frontend): show new thread in sidebar immediately on creation (#3276) (#3283) When a user starts a new conversation, the sidebar list did not display it until the AI finished streaming and generated a title. This made it impossible to switch back to an in-progress conversation when working with multiple threads concurrently. Optimistically insert the new thread into the TanStack Query cache during the `onCreated` callback so the sidebar renders a placeholder entry ("New chat") as soon as the backend acknowledges thread creation. The existing `onUpdateEvent` title handler and `onFinish` query invalidation then update the entry in-place with the real title.	2026-05-28 15:27:38 +08:00
dependabot[bot]	9e332c594a	chore(deps): bump uuid from 10.0.0 to 14.0.0 in /frontend (#3281) Bumps [uuid](https://github.com/uuidjs/uuid) from 10.0.0 to 14.0.0. - [Release notes](https://github.com/uuidjs/uuid/releases) - [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md) - [Commits](https://github.com/uuidjs/uuid/compare/v10.0.0...v14.0.0) --- updated-dependencies: - dependency-name: uuid dependency-version: 14.0.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-28 07:14:44 +08:00
Admire	f68bcb771c	fix(frontend): guard message copy clipboard access (#3211) * fix(frontend): guard message copy clipboard access * fix(frontend): reuse clipboard guard across copy actions	2026-05-26 09:37:51 +08:00
AochenShen99	11dd5b0683	fix(frontend): strip unclosed <think> tags from streaming AI content (#3218 ) * fix(frontend): strip unclosed <think> tags from streaming AI content During streaming, an opening <think> tag may arrive in one chunk while the matching </think> arrives in a later chunk. The existing splitInlineReasoning regex only matched fully closed pairs, so the mid-flight reasoning was left in message.content and rendered into the chat bubble via the markdown pipeline's rehypeRaw plugin until the closing tag landed. Extend splitInlineReasoning with a second pass: after stripping every closed <think>...</think> pair, route any remaining content from a lone opener to the reasoning slot and leave only the preceding preamble in content. Closed-tag behavior is unchanged. Covers every provider whose stream emits reasoning inline as <think> tags (MiniMax streaming path, MindIE, PatchedChatOpenAI, and any gateway-served DeepSeek/OpenAI-compatible model). * style(frontend): apply prettier formatting to streaming reasoning tests * fix(frontend): skip <think> split for literal think tags in inline code Treats a `<think>` opener immediately preceded by a backtick as part of markdown inline code rather than a streaming reasoning marker. Prevents permanent content truncation when an AI message documents the `<think>` tag literally (e.g. ``Use `<think>` markers``), where the streaming-safe fallback would otherwise route the rest of the answer into the reasoning panel because no `</think>` ever arrives. Adds regression tests for both the post-stream and mid-stream cases.	2026-05-26 09:35:07 +08:00
Admire	e7967a7fc3	fix(frontend): hide copy for streaming assistant turn (#3176)	2026-05-23 23:29:16 +08:00
Admire	d0fa37e71d	fix(frontend): avoid duplicate optimistic user message (#3002)	2026-05-23 17:02:23 +08:00
AochenShen99	604fcbb9d2	Stabilize write artifact previews (#3172)	2026-05-23 16:56:14 +08:00
Nan Gao	a64a39dbc0	config: raise default summarization trigger before v2.0-m1 (#3174) * config: update summarization configuration * docs: sync summarization trigger guidance	2026-05-23 15:38:25 +08:00
JeffJiang	b103d1a7f5	feat(frontend): support static website demo mode (#3170) * feat(frontend): support static website demo mode * fix(frontend): render html artifact previews from blob content * chore(frontend): apply pre-commit formatting * fix(frontend): address static demo PR review comments * Update the release information of DeerFlow --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-23 00:10:56 +08:00
Nan Gao	914d6a4f1c	docs: add provider safety termination post (#3167)	2026-05-22 21:33:15 +08:00
Nan Gao	253542ea0d	docs: discourage MCP filesystem workspace config (#3141)	2026-05-22 09:19:23 +08:00


494 Commits