* fix(task-tool): unwrap callback manager when locating usage recorder
`config["callbacks"]` may arrive as a `BaseCallbackManager` (e.g. the
`AsyncCallbackManager` LangChain hands to async tool runs), not just a plain
list. The previous `for cb in callbacks` loop raised
`TypeError: 'AsyncCallbackManager' object is not iterable`, which
`ToolErrorHandlingMiddleware` then converted into a failed `task` ToolMessage
even though the subagent had completed internally — Ultra mode lost subagent
results and the lead agent fell back to redoing the work.
Unwrap `BaseCallbackManager.handlers` before searching for the recorder.
Refs: bytedance/deer-flow#3107 (BUG-002)
* fix(frontend): treat any task tool error as a terminal subtask failure
The subtask card status machine matched only three English prefixes (`Task
Succeeded. Result:`, `Task failed.`, `Task timed out`). Anything else fell
through to `in_progress`, so a `task` tool error wrapped by
`ToolErrorHandlingMiddleware` (`Error: Tool 'task' failed ...`) left the card
spinning forever even after the run had ended.
Extract the prefix logic into `parseSubtaskResult` and recognise any leading
`Error:` token as a terminal failure. The extracted function is unit-tested
against the legacy prefixes plus the `AsyncCallbackManager` regression
captured in the upstream issue.
Refs: bytedance/deer-flow#3107 (BUG-007)
* fix(frontend): exclude hidden, reasoning, and tool payloads from chat export
`formatThreadAsMarkdown` / `formatThreadAsJSON` iterated raw messages without
running the UI-level `isHiddenFromUIMessage` filter. Exported transcripts
therefore included `hide_from_ui` system reminders, memory injections,
provider `reasoning_content`, tool calls, and tool result messages — content
that is intentionally hidden in the chat view.
Filter the export to the user-visible transcript by default and gate
reasoning / tool calls / tool messages / hidden messages behind explicit
`ExportOptions` flags so a future debug export can opt back in without
forking the formatter.
Refs: bytedance/deer-flow#3107 (BUG-006)
* fix(gateway): route get_config through get_app_config for mtime hot reload
`get_config(request)` returned the `app.state.config` snapshot captured at
startup. The worker / lead-agent path then threaded that frozen `AppConfig`
through `RunContext` and `agent_factory`, so per-run fields edited in
`config.yaml` (notably `max_tokens`) were ignored until the gateway process
was restarted — even though `get_app_config()` already does mtime-based
reload at the bottom layer.
Route the request dependency through `get_app_config()` directly. Runtime
`ContextVar` overrides (`push_current_app_config`) and test-injected
singletons (`set_app_config`) keep working; `app.state.config` is now only
read at startup for one-shot bootstrap (logging level, IM channels,
`langgraph_runtime` engines).
`tests/test_gateway_deps_config.py` encoded the old snapshot contract and is
removed; `tests/test_gateway_config_freshness.py` replaces it with mtime,
ContextVar, and `set_app_config` coverage. `test_skills_custom_router.py` and
`test_uploads_router.py` now inject test configs via FastAPI
`dependency_overrides[get_config]` instead of mutating `app.state.config`.
Document the hot-reload boundary in `backend/CLAUDE.md` so reviewers know
which fields are picked up on the next request vs. which still require a
restart (`database`, `checkpointer`, `run_events`, `stream_bridge`,
`sandbox.use`, `log_level`, `channels.*`).
Refs: bytedance/deer-flow#3107 (BUG-001)
* fix(gateway): broaden get_config 503 to any config-load failure
Address review feedback on the previous commit:
1. Narrow exception catch removed. The old contract returned 503 whenever
`app.state.config is None`. The first cut only mapped
`FileNotFoundError`, leaving `PermissionError`, YAML parse errors, and
pydantic `ValidationError` to bubble up as 500. At the request boundary
we treat any inability to materialise the config as "configuration not
available" (503) and log the original exception so the operator still
has the stack.
2. Removed the unused `request: Request` parameter and the matching
`# noqa: ARG001`. FastAPI's `Depends()` does not require the dependency
to accept `Request`; the only call site uses the no-arg form.
3. `backend/CLAUDE.md` boundary now lists the *reason* each field is
restart-required (engine binding, singleton caching, one-shot
`apply_logging_level`, etc.), not just the field name, so reviewers do
not have to reverse-engineer the boundary themselves.
Tests parametrise four exception classes (`FileNotFoundError`,
`PermissionError`, `ValueError`, `RuntimeError`) and assert 503 for each.
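
The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's code: `HTTPError` is a stand-in for the framework's HTTP exception, and the `load_config` parameter is a hypothetical hook added so the sketch can be exercised without a real `config.yaml`.

```python
import logging

logger = logging.getLogger(__name__)


class HTTPError(Exception):
    """Stand-in for a framework HTTP exception (hypothetical, for illustration)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_config(load_config):
    """Resolve the config per request; any load failure becomes a 503."""
    try:
        return load_config()
    except Exception as exc:
        # Log the original exception so the operator still has the stack.
        logger.exception("Failed to load configuration")
        raise HTTPError(503, "Configuration not available") from exc
```

Catching `Exception` broadly is deliberate in this sketch: at the request boundary the caller cannot act differently on a missing file versus a parse error, while the log keeps the distinction for the operator.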
Refs: bytedance/deer-flow#3107 (BUG-001)
* fix(task-tool): defend _find_usage_recorder against non-list callbacks
Address review feedback. The previous commit handled the two common shapes
LangChain hands to async tool runs — a plain `list[BaseCallbackHandler]` and
a `BaseCallbackManager` subclass — but iterated any other shape directly,
which would still raise `TypeError` if e.g. a single handler instance leaked
through without a list wrapper.
Treat any non-list, non-manager `config["callbacks"]` value as "no recorder"
rather than crash. Docstring now lists all four shapes explicitly. New tests
cover the single-handler-object case, `runtime is None`, `callbacks is None`,
and `runtime.config` being a non-dict — all required to be silent no-ops.
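
A rough sketch of the shape handling described above, under stated assumptions: `UsageRecorder` and `FakeCallbackManager` are hypothetical stand-ins so the example runs without LangChain, and the manager is detected by its `handlers` attribute rather than by an `isinstance` check against `BaseCallbackManager`.

```python
from typing import Any, Optional


class UsageRecorder:
    """Hypothetical stand-in for the handler the task tool looks for."""


class FakeCallbackManager:
    """Stand-in for a callback manager: handlers live on `.handlers`."""

    def __init__(self, handlers: list) -> None:
        self.handlers = handlers


def find_usage_recorder(config: Any) -> Optional[UsageRecorder]:
    """Return the first UsageRecorder reachable from config["callbacks"], else None."""
    if not isinstance(config, dict):
        return None  # non-dict config: silent no-op
    callbacks = config.get("callbacks")
    if callbacks is None:
        return None
    # Manager shape: unwrap to the underlying handler list.
    handlers = getattr(callbacks, "handlers", None)
    if isinstance(handlers, list):
        callbacks = handlers
    if not isinstance(callbacks, list):
        return None  # e.g. a bare handler object: treat as "no recorder"
    for cb in callbacks:
        if isinstance(cb, UsageRecorder):
            return cb
    return None
```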
Refs: bytedance/deer-flow#3107 (BUG-002)
* fix(frontend): drop dead identity ternary and add opt-in export tests
Address review feedback on the previous export commit:
1. Removed the no-op `typeof msg.content === "string" ? msg.content : msg.content`
expression in `formatThreadAsJSON`. Both branches returned the same value;
the message content now flows through unchanged whether it is a string or
the rich `MessageContent[]` shape (LangChain JSON-serialises the array
structure correctly already).
2. Expanded the JSDoc on `ExportOptions` to make it clearer that the four
flags are not currently wired to any UI control — callers wanting a debug
export must build the options object explicitly. The default behaviour
continues to match the explicit prescription in
bytedance/deer-flow#3107 BUG-006.
3. Added opt-in coverage. The previous tests only exercised the
`options = {}` default path; the new cases verify each flag flips the
corresponding payload back into the export so a future debug-export
surface does not silently break the contract.
Refs: bytedance/deer-flow#3107 (BUG-006)
* fix(frontend): export subtask prefix constants and document fallback intent
Address review feedback on the previous BUG-007 commit:
1. `SUCCESS_PREFIX`, `FAILURE_PREFIX`, `TIMEOUT_PREFIX`, and the
`ERROR_WRAPPER_PATTERN` regex are now exported. The JSDoc explicitly
pins them as part of the backend↔frontend contract defined in
`task_tool.py` and `tool_error_handling_middleware.py`, so any future
structured-status migration (e.g. backend writing
`additional_kwargs.subagent_status` instead of leading text) can
reference these from one canonical place rather than redefine them.
2. The `in_progress` fallback now carries a docstring explaining the
deliberate choice — LangChain only ever emits a `ToolMessage` once the
tool itself has returned, so unrecognised content means the contract
has drifted and "still running" is the right operator signal (eagerly
marking it terminal-failed would mask the drift).
No behaviour change; this is documentation and an API export.
Refs: bytedance/deer-flow#3107 (BUG-007)
* fix(gateway): drop app.state.config snapshot and freeze run_events_config
Address @ShenAC-SAC's BUG-001 review on #3131. The previous cut still
stored an ``AppConfig`` snapshot on ``app.state.config`` for startup
bootstrap. Two follow-on hazards from that:
1. Future code touching the gateway lifespan could accidentally start
reading ``app.state.config`` again, silently regressing the request
hot path back to a stale snapshot.
2. ``get_run_context()`` paired a freshly-reloaded ``AppConfig`` with the
startup-bound ``event_store`` and a *live* ``run_events_config``
field — so an operator who edited ``run_events.backend`` mid-flight
would have produced a run context whose ``event_store`` and
``run_events_config`` referred to different backends.
Clean approach (aligned with the direction in PR #3128):
- ``lifespan()`` keeps a local ``startup_config`` variable and passes it
explicitly into ``langgraph_runtime(app, startup_config)`` and into
``start_channel_service``. No ``app.state.config`` attribute is set at
any point.
- ``langgraph_runtime`` now accepts ``startup_config`` as a required
parameter, removing the ``getattr(app.state, "config", None)`` lookup
and the "config not initialised" runtime error.
- The matching ``run_events_config`` is frozen onto ``app.state`` next
to ``run_event_store`` so ``get_run_context`` reads the two from the
same startup-time source. ``app_config`` continues to be resolved
live via ``get_app_config()``.
- ``backend/CLAUDE.md`` boundary explanation updated to spell out the
``startup_config`` / ``get_app_config()`` split.
New regression test ``test_run_context_app_config_reflects_yaml_edit``
exercises the worker-feeding path: it asserts that ``ctx.app_config``
follows a mid-flight ``config.yaml`` edit while
``ctx.run_events_config`` stays frozen to the startup snapshot the
event store was built from.
Refs: bytedance/deer-flow#3107 (BUG-001), bytedance/deer-flow#3131 review
* fix(frontend): parse Task cancelled and polling timed out as terminal
Address @ShenAC-SAC's BUG-007 review on #3131. `task_tool.py` actually
emits five terminal strings:
- `Task Succeeded. Result: …`
- `Task failed. …`
- `Task timed out. …`
- `Task cancelled by user.` ← previously matched none
- `Task polling timed out after N minutes …` ← previously matched none
The previous cut handled three; the last two fell through to the
"unknown content" branch and pushed the subtask card back to
`in_progress` even though the backend had already reached a terminal
state. Add explicit matches plus regression tests for both. The
`in_progress` fallback is now reserved for genuinely unrecognised
output (i.e. contract drift), as documented.
Refs: bytedance/deer-flow#3107 (BUG-007), bytedance/deer-flow#3131 review
* fix(frontend): sanitize JSON export content via the Markdown content path
Address @ShenAC-SAC's BUG-006 review and the Copilot inline comment on
#3131. The previous cut filtered hidden/tool messages out of the JSON
export but still serialised `msg.content` verbatim, so:
- inline `<think>…</think>` wrappers stayed in the exported `content`
even with `includeReasoning: false`,
- content-array thinking blocks leaked the `thinking` field,
- `<uploaded_files>…</uploaded_files>` markers leaked the workspace
paths a user uploaded files to.
JSON now goes through the same sanitiser the Markdown path uses
(`extractContentFromMessage` + `stripUploadedFilesTag`). Reasoning and
tool_calls remain gated behind their `ExportOptions` flags. AI / human
rows that sanitise to empty content with no opted-in reasoning or tool
calls are dropped so the JSON matches the Markdown path's `continue`
on empty assistant fragments.
New regression tests cover the three leak shapes the reviewer called
out plus the empty-content-drop case.
Refs: bytedance/deer-flow#3107 (BUG-006), bytedance/deer-flow#3131 review
* test(gateway): align lifespan stub with langgraph_runtime two-arg signature
Codex round-3 review of c0bc7a06 flagged this: changing
`langgraph_runtime` to require `startup_config` as a second positional
argument broke the one-arg stub `_noop_langgraph_runtime(_app)` in
`test_gateway_lifespan_shutdown.py`, which is patched into
`app.gateway.app.langgraph_runtime` by the lifespan shutdown bounded-timeout
regression. Lifespan would then call the stub with two args and raise
`TypeError` before the bounded-shutdown assertion ran.
Update the stub to match the new signature. The shutdown test itself is
unaffected — it only cares about the channel `stop_channel_service` hang
path.
Refs: bytedance/deer-flow#3107 (BUG-001), bytedance/deer-flow#3131 review
* fix(frontend): strip every known backend marker in export, not just uploads
Codex round-3 review of 258ca800 and the matching maintainer feedback on
PR #3131 made the same point: the JSON export now ran the
Markdown-side sanitiser, but that sanitiser only stripped
`<uploaded_files>`. The full set of payloads middleware embeds inside
message `content` is larger:
- `<uploaded_files>` — `UploadsMiddleware`
- `<system-reminder>` — `DynamicContextMiddleware`
- `<memory>` — `DynamicContextMiddleware` (nested inside system-reminder)
- `<current_date>` — `DynamicContextMiddleware`
The primary protection is still `isHiddenFromUIMessage`: the
`<system-reminder>` HumanMessage is marked `hide_from_ui: true` and never
reaches the formatter. This commit adds the second line of defence so a
regression that drops the `hide_from_ui` flag — or any future middleware
that injects the same tag vocabulary into a visible HumanMessage —
cannot leak the payload into the export file.
Concrete changes:
- New `INTERNAL_MARKER_TAGS` constant + `stripInternalMarkers(content)`
helper in `core/messages/utils.ts`. The constant doubles as
documentation for the backend↔frontend contract.
- `formatMessageContent` in `export.ts` now calls `stripInternalMarkers`
instead of `stripUploadedFilesTag`. UI render paths
(`message-list-item.tsx`) keep using the narrower function so a user
legitimately typing `<memory>` in a meta-discussion is preserved.
- The "drop empty rows" guard in `buildJSONMessage` switched from
`=== undefined` to truthy `!` checks. Codex spotted the asymmetry: when
`extractReasoningContentFromMessage` returned the empty string (which it
legitimately can), the JSON path emitted `{reasoning: ""}` while the
Markdown path's `!reasoning` `continue` correctly dropped the row.
New regression tests cover the defence-in-depth strip with a
`<system-reminder><memory><current_date>` payload deliberately *not*
marked `hide_from_ui`; tool-message sanitization under
`includeToolMessages: true`; the mixed-content-array case
(`thinking + text + image_url`); and the opted-in empty-reasoning drop.
Live verification on a real Ultra-mode thread that uploaded a PDF
(`曾鑫民-薪资交易流水.pdf`): backend state's first HumanMessage carries the
`<uploaded_files>` block (with `/mnt/user-data/uploads/...` paths) as part
of a content-array. The Markdown and JSON export blobs both come back
free of `<uploaded_files>`, `<system-reminder>`, `<current_date>`,
`tool_calls`, and reasoning, while preserving the user's `这是什么 ?`
("What is this?") prompt and the assistant's visible answer.
Refs: bytedance/deer-flow#3107 (BUG-006), bytedance/deer-flow#3131 review
* test(frontend): cover trim, varied N, and pre-execution Error: prefixes
Codex round-3 review of 50e2c257 flagged three coverage gaps in the
subtask-status parser:
1. `Task cancelled by user.` and `Task polling timed out` previously had
no whitespace-trim coverage — the original trim test only exercised
the success prefix. Streaming chunks can arrive with leading/trailing
newlines; the regex needed an explicit assertion.
2. The polling-timeout case was tested only at one `N` (15 minutes). The
backend interpolates the live `timeout_seconds // 60` value, so the
matcher must hold for any positive integer. Now we run the case for
1, 5, and 60 minutes.
3. `task_tool.py` also emits three `Error:` strings for pre-execution
failures — unknown subagent type, host-bash disabled, and "task
disappeared from background tasks". They are intentionally handled by
`ERROR_WRAPPER_PATTERN` rather than dedicated prefixes (the wrapper
already produces the right terminal-failed shape) but had no test
coverage proving that wiring. Codex was right that a refactor splitting
one of them off into its own prefix would silently break things.
The JSDoc on the constants block now spells the three pre-execution
errors out so the relationship between `task_tool.py` returns and the
prefix vocabulary is explicit.
No production code change beyond the docstring — this commit is pure
coverage hardening for the contract that already exists.
Refs: bytedance/deer-flow#3107 (BUG-007), bytedance/deer-flow#3131 review
* fix(gateway): preserve message additional_kwargs in normalize_input (#3132)
The gateway's hand-rolled dict→message coercion only forwarded `content`
and collapsed every role to `HumanMessage`, silently dropping the
frontend's `additional_kwargs.files` payload (along with `id`, `name`,
and ai/system/tool roles).
Effect on issue #3132:
- `UploadsMiddleware` saw no `files` on the last human message, so the
just-uploaded file got bucketed under "previous messages" while the
current turn was reported as `(empty)`.
- The persisted human message had no `files`, so the attachment chip on
the message disappeared the moment the optimistic UI cleared.
Delegate the conversion to `langchain_core.messages.utils.convert_to_messages`
so `additional_kwargs`, `id`, `name`, and non-human roles round-trip
unchanged.
* fix(gateway): convert malformed-message ValueError into HTTP 400
normalize_input now sits at the request boundary, so a malformed
input.messages[N] dict (missing role/type/content, unsupported role,
etc.) should surface as 400 with the offending index — not bubble out
of FastAPI as 500.
Per Copilot review on #3136.
* feat(trace): LangGraph -> lead_agent and set user custom agent name to run_name
* feat(trace): follow GitHub Copilot suggestions
* feat(trace): refactor run_name resolution and improve test coverage
* fix(loop-detection): defer warn injection to wrap_model_call
The warn branch in LoopDetectionMiddleware injected a HumanMessage
into state from after_model. The tools node had not yet produced
ToolMessage responses to the previous AIMessage(tool_calls=...), so
the new HumanMessage landed *between* the assistant's tool_calls and
their responses. OpenAI/Moonshot reject the next request with
"tool_call_ids did not have response messages" because their
validators require tool_calls to be followed immediately by tool
messages.
Detection now runs in after_model as before, but only enqueues the
warning into a per-thread list. Injection happens in wrap_model_call,
where every prior ToolMessage is already present in request.messages.
The warning is appended at the end as HumanMessage(name="loop_warning")
— pairing intact, AIMessage semantics untouched, no SystemMessage
issues for Anthropic.
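
The enqueue-then-inject split can be illustrated with a simplified sketch. Messages are plain dicts here, the method signatures are reduced to the essentials (the real middleware hooks take different arguments), and the warning text is made up for the example.

```python
from collections import defaultdict


class DeferredLoopWarnings:
    """Simplified model of the deferred-warning flow (hypothetical names)."""

    def __init__(self) -> None:
        self._pending = defaultdict(list)  # thread_id -> queued warning texts

    def after_model(self, thread_id: str, loop_detected: bool) -> None:
        # Detection only enqueues; the message history is not touched here.
        if loop_detected:
            self._pending[thread_id].append("You appear to be repeating the same tool calls.")

    def wrap_model_call(self, thread_id: str, messages: list) -> list:
        # By now the tool responses for the previous turn are already in `messages`,
        # so appending at the end cannot split a tool_calls / tool-response pair.
        warnings = self._pending.pop(thread_id, [])
        extra = [{"role": "human", "name": "loop_warning", "content": w} for w in warnings]
        return messages + extra
```

Because the queue is drained with `pop`, a warning is injected into exactly one model request and does not persist in state.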
Closes #2029, addresses #2255 #2293 #2304 #2511.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(channels): remove loop warning display filter
* feat(loop-detection): scope pending warnings by run
* docs(loop-detection): update docs
* test(loop-detection): assert deferred warnings are queued
* fix(loop-detection): cap transient warning state
* docs: update docs
* add async awrap_model_call test coverage
* docs(loop-detection): document transient warnings
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix(security): mask sensitive values in MCP config API responses
GET /api/mcp/config previously returned plaintext secrets including
env dict values (API keys), headers (auth tokens), and OAuth
client_secret/refresh_token. Any authenticated user could read all
MCP service credentials.
This commit masks sensitive fields in GET/PUT responses while
preserving the key structure so the frontend round-trip (GET masked
→ toggle enabled → PUT) correctly preserves existing secrets.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(security): address Copilot review on MCP config masking
- Load raw JSON (un-resolved $VAR placeholders) as merge source instead
of resolved config, preventing plaintext secrets from replacing
$VAR placeholders on disk (Comment 2)
- Preserve all top-level keys (e.g. mcpInterceptors) in PUT, not just
mcpServers/skills (Comment 1)
- Reject masked value '***' for new keys that don't exist in existing
config, returning 400 with actionable error (Comment 3)
- Allow empty string '' to explicitly clear OAuth secrets, while None
means 'preserve existing' for safe round-trip (Comment 4)
- Add 3 new tests for rejection, clearing, and edge cases (18 total)
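
The round-trip rules above might look roughly like this. The helper names are hypothetical, and representing a cleared OAuth secret as `None` is an assumption of the sketch, not something the commit states.

```python
from typing import Optional

MASK = "***"


def mask_env(env: dict) -> dict:
    """Keep the keys, hide the values, so the client can round-trip the structure."""
    return {key: MASK for key in env}


def merge_env(existing: dict, incoming: dict) -> dict:
    """Resolve a PUT payload against the stored raw values ($VAR placeholders included)."""
    merged = {}
    for key, value in incoming.items():
        if value == MASK:
            if key not in existing:
                # The endpoint is described as answering 400 in this case.
                raise ValueError(f"masked value for unknown key {key!r}")
            merged[key] = existing[key]  # untouched secret: keep what is on disk
        else:
            merged[key] = value
    return merged


def merge_oauth_secret(existing: Optional[str], incoming: Optional[str]) -> Optional[str]:
    """None preserves the stored secret; '' clears it; anything else replaces it."""
    if incoming is None:
        return existing
    if incoming == "":
        return None  # assumption: "cleared" is stored as None
    return incoming
```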
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(harness): hydrate run history from RunStore and persist cancellation status
fix:
- Make RunManager.get() async and hydrate from RunStore when in-memory record is missing
- Merge store rows into list_by_thread() with in-memory precedence for active runs
- Persist interrupted status to RunStore in cancel() and create_or_reject(interrupt|rollback)
- Extract _persist_status() to reuse the best-effort store update pattern
- Await run_mgr.get() in all gateway endpoints
- Return 409 with distinct message for store-only runs not active on current worker
Closes #2812, Closes #2813
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(harness): consistent sort and guarded hydration in RunManager
fix:
- list_by_thread() now sorts by created_at desc (newest first) even when
no RunStore is configured, matching the store-backed code path
- guard _record_from_store() call sites in get() and list_by_thread()
with best-effort error handling so a single malformed store row cannot
turn read paths into 500s
test:
- update test_list_by_thread assertion to expect newest-first order
- seed MemoryRunStore via public put() API instead of writing to _runs
* fix(harness): guard store-only runs from streaming and fix get() TOCTOU
Add RunRecord.store_only flag set by _record_from_store so callers can
distinguish hydrated history from live in-memory runs. join_run and
stream_existing_run (action=None) now return 409 instead of hanging
forever on an empty MemoryStreamBridge channel.
Re-check _runs under lock after the store await in RunManager.get() so a
concurrent create() that lands between the two checks returns the
authoritative in-memory record rather than a stale store-hydrated copy.
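
A minimal sketch of the re-check pattern, assuming an `asyncio.Lock` guards the in-memory map. `RunManagerSketch`, the dict-based records, and the dict standing in for the store are all hypothetical simplifications.

```python
import asyncio
from typing import Optional


class RunManagerSketch:
    def __init__(self, store: dict) -> None:
        self._runs = {}  # run_id -> live in-memory record
        self._lock = asyncio.Lock()
        self._store = store  # hypothetical stand-in for RunStore

    async def _fetch_from_store(self, run_id: str) -> Optional[dict]:
        await asyncio.sleep(0)  # yield control, as a real store query would
        return self._store.get(run_id)

    async def create(self, run_id: str) -> dict:
        async with self._lock:
            record = {"run_id": run_id, "store_only": False}
            self._runs[run_id] = record
            return record

    async def get(self, run_id: str) -> Optional[dict]:
        async with self._lock:
            record = self._runs.get(run_id)
        if record is not None:
            return record
        row = await self._fetch_from_store(run_id)  # the lock is not held here
        async with self._lock:
            # Re-check: a concurrent create() may have landed during the await.
            record = self._runs.get(run_id)
            if record is not None:
                return record
        if row is None:
            return None
        return {**row, "store_only": True}  # hydrated history, not a live run
```

Releasing the lock around the store query keeps slow I/O from blocking other callers; the second check is what makes that release safe.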
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix(harness): reorder bridge fetch in join_run and make list_by_thread limit explicit
Move get_stream_bridge() after the store_only guard in join_run so a
missing bridge cannot produce 503 for historical runs before the 409
guard fires.
Add limit parameter to RunManager.list_by_thread (default 100, matching
the store's page size) and pass it explicitly to the store call.
Update docstring to document the limit instead of claiming all runs are
returned.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix(harness): cap list_by_thread result to limit after merge
Apply [:limit] to all return paths in list_by_thread so the method
consistently returns at most limit records regardless of how many
in-memory runs exist, making the limit parameter a true upper bound
on the response size rather than just a store-query hint.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix `list_by_thread` docstring
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(runtime): add update_model_name to RunStore to prevent SQL integrity errors
RunManager.update_model_name() was calling _persist_to_store() which uses
RunStore.put(), but RunRepository.put() is insert-only. This caused integrity
errors when updating model_name for existing runs in SQL-backed stores.
fix:
- Add abstract update_model_name method to RunStore base class
- Implement update_model_name in MemoryRunStore
- Implement update_model_name in RunRepository with proper normalization
- Add _persist_model_name helper in RunManager
- Update RunManager.update_model_name to use the new method
test:
- Add tests for update_model_name functionality
- Add integration tests for RunManager with SQL-backed store
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(runtime): handle NULL status/on_disconnect in _record_from_store
`dict.get(key, default)` only uses the default when the key is absent,
so a SQL row with an explicit NULL status would pass `None` to
`RunStatus(None)` and raise, breaking hydration for otherwise valid rows.
Switch to `row.get(...) or fallback` so both missing and NULL values
get a safe default. Add tests for get() and list_by_thread() with a
NULL status row to prevent regression.
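
The difference between the two lookups can be shown in a few lines. The `RunStatus` members below are a hypothetical subset used only for illustration; the `pending` fallback follows the policy described for `_record_from_store`.

```python
from enum import Enum


class RunStatus(Enum):
    # Hypothetical subset of the real enum, for illustration only.
    pending = "pending"
    success = "success"


def status_from_row(row: dict) -> RunStatus:
    # `or` covers both a missing key and an explicit NULL (None) value.
    return RunStatus(row.get("status") or "pending")


null_row = {"run_id": "r1", "status": None}

# The default is ignored because the key exists, so None leaks through.
assert null_row.get("status", "pending") is None
```

Note that `or` also replaces other falsy values such as an empty string, which is acceptable here only because no valid status is falsy.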
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix(runs): address PR review feedback on store consistency changes
- Fix list_by_thread limit semantics: pass store_limit = max(0, limit - len(memory_records)) to store so newer store records are not crowded out by in-memory records
- Remove dead code: the `cancelled` guard after the raise is always True, so simplify the condition to `if wait and record.task`
- Document _record_from_store NULL fallback policy (status→pending, on_disconnect→cancel) in docstring
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(auth): replace setup-status 429 rate limit with cached response
The /api/v1/auth/setup-status endpoint had a 60-second cooldown that
returned HTTP 429 for all but the first request per IP. When the service
restarted with multiple browser tabs open, all tabs hit this endpoint
simultaneously from the same source IP, causing a storm of 429 errors
that blocked the login flow.
Replace the cooldown-with-429 model with a per-IP response cache that
returns the previously computed result within the TTL. The database
query (count_admin_users) still only runs once per IP per 60 seconds,
preserving the original performance goal while eliminating spurious
429 errors on multi-tab reconnection.
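
A rough sketch of the cache-instead-of-429 idea. `SetupStatusCache` and its injectable `clock` are hypothetical, and the sketch ignores concurrent requests and eviction of old entries, both of which a real endpoint would need to consider.

```python
import time
from typing import Callable


class SetupStatusCache:
    """Per-IP cache: recompute at most once per TTL, otherwise replay the last result."""

    def __init__(self, ttl_seconds: float = 60.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # client_ip -> (computed_at, result)

    def get(self, client_ip: str, compute: Callable[[], dict]) -> dict:
        now = self._clock()
        entry = self._entries.get(client_ip)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # within TTL: serve the cached body instead of a 429
        result = compute()  # e.g. the admin-count query
        self._entries[client_ip] = (now, result)
        return result
```

Every tab from the same IP gets a successful response, while the expensive computation still runs at most once per TTL window for that IP.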
Fixes #2902
* fix(auth): address setup-status cache review issues
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/439a0e8c-8b64-41d4-a3cd-fe9a00eec534
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
* test(auth): improve readability of setup-status concurrency assertion
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/439a0e8c-8b64-41d4-a3cd-fe9a00eec534
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
* fix the unit test error
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
* fix(auth): persist auto-generated JWT secret to survive restarts
When AUTH_JWT_SECRET is not set, the auto-generated secret is now
written to .deer-flow/.jwt_secret (mode 0600) and reused on subsequent
starts. This prevents session invalidation on every restart while still
allowing explicit AUTH_JWT_SECRET in .env to take precedence.
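
One way the load-or-create behaviour could be written, as a sketch only: the function name, the `env` parameter, and the choice of `secrets.token_urlsafe(32)` are assumptions, not taken from the commit.

```python
import os
import secrets
from pathlib import Path


def load_or_create_jwt_secret(secret_file: Path, env: dict) -> str:
    explicit = env.get("AUTH_JWT_SECRET")
    if explicit:
        return explicit  # explicit configuration always wins
    if secret_file.exists():
        stored = secret_file.read_text(encoding="utf-8").strip()
        if stored:
            return stored  # reuse the secret from a previous start
    secret = secrets.token_urlsafe(32)
    secret_file.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with owner-only permissions (0600) from the start.
    fd = os.open(secret_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(secret)
    return secret
```

Passing the mode to `os.open` avoids a window in which the file exists with default permissions before a later `chmod`.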
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix backend lint errors
---------
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* feat(channels): enhance Discord with mention-only mode, thread routing, and typing indicators
Add mention_only config to only respond when bot is mentioned, with
allowed_channels override. Add thread_mode for Hermes-style auto-thread
creation. Add periodic typing indicators while bot is processing.
* fix(discord): include allowed_channels in mention_only skip condition (line 274)
* docs: fix Discord config example to match boolean thread_mode implementation
* style: format with ruff
* fix(discord): apply Copilot review fixes and resolve lint errors
- Remove unused Optional import
- Fix thread_ts type hints to str | None
- Fix has_mention logic for None values
- Implement thread_mode fallback to channel replies on thread creation failure
- Fix thread_mode docstring alignment
- Fix allowed_channels comment formatting in config.example.yaml
* fix(discord): reset context for orphaned threads in mention_only mode
When a message arrives in a thread not tracked by _active_threads,
clear thread_id and typing_target so the message falls through to
the standard channel handling pipeline, which creates a fresh thread
instead of incorrectly routing to the stale thread.
* fix(discord): create new thread on @ when channel has existing tracked thread
When mention_only is enabled and a user @-s the bot in a channel
that already has a tracked thread, create a new thread instead of
incorrectly routing to the old one.
* fix(discord): allow no-@ thread replies while skipping no-@ channel messages
The skip block for no-@ messages was too aggressive — it blocked
continuation replies within tracked threads AND incorrectly routed
no-@ channel messages to the existing thread.
Now:
- Thread message, no @ → routed to existing tracked thread
- Channel message, no @ → skipped
- Channel message, with @ → creates new thread
* feat(discord): add checkmark reaction to acknowledge received messages
* Move discord.py to optional dependency and auto-detect from config.yaml
- Add discord extra to [project.optional-dependencies] in pyproject.toml
- Update detect_uv_extras.py to map channels.discord.enabled: true -> --extra discord
- Set UV_EXTRAS=discord in docker-compose-dev.yaml gateway env
* fix(discord): persist thread-channel mappings to store for recovery after restart
Discord's _active_threads dict was purely in-memory, so all channel-to-thread
mappings were lost on server restart. This fix bridges ChannelStore into
DiscordChannel:
- Save thread mappings to store.json after every thread creation
- Restore active threads from store on DiscordChannel startup
- Pass channel_store to all channels via service.py config injection
Store keys follow the pattern: discord:<channel_id>:<thread_id>
* fix(discord): address Copilot review — fix types, typing targets, cross-thread safety, and config comments
* fix(tests): add multitask_strategy param to mock for clarification follow-up test
* fix(tests): explicitly set model_name=None for title middleware test isolation
* fix(discord): use trigger_typing() instead of typing() for typing indicators
discord.py 2.x TextChannel.typing() and Thread.typing() are async context
managers, not one-shot coroutines. Use trigger_typing() for periodic
typing indicator pings.
* fix(discord): cancel typing tasks on channel shutdown
Prevents 'Task was destroyed but it is pending' warnings when the
Discord client stops while typing indicator loops are still running.
* fix(scripts): detect nested YAML config for discord extra
section_value() only matched top-level YAML sections. Added
nested_section_value() that handles two-level nesting (e.g.,
channels.discord.enabled), so auto-detection of the discord
extra works when config uses the standard nested format.
* fix(docker): remove hard-coded UV_EXTRAS=discord from dev compose
Relies on auto-detection via detect_uv_extras.py instead of forcing
discord.py install even when channels.discord.enabled is false.
Matches production docker-compose.yaml behavior (UV_EXTRAS:-).
* refactor(nginx): move proxy_buffering/proxy_cache to server level
DRY cleanup — these directives were repeated in 14 location blocks.
Set at server level once, reducing duplication and risk of drift.
* fix(discord): use dedicated JSON file for thread persistence
Replace ChannelStore usage for Discord thread-ID persistence with a
dedicated discord_threads.json file. ChannelStore is designed to map
IM conversations to DeerFlow thread IDs — using it to persist Discord
thread IDs was semantically wrong and confusing.
Changes:
- _save_thread() now reads/writes a simple {channel_id: thread_id} JSON dict
- _load_active_threads() reads directly from the JSON file
- File path derived from ChannelStore directory (when available) or
defaults to ~/.deer-flow/channels/discord_threads.json
- Removed unused ChannelStore import
* fix(discord): address WillemJiang's code review comments on PR #2842
1. Remove semantically incorrect message_in_thread variable. At this code
point (after the Thread case is handled above), we're guaranteed to be in
a channel, not a thread. Always apply mention_only check here.
2. Add _active_thread_ids reverse-lookup set for O(1) thread ID membership
checks instead of O(n) scan of _active_threads.values(). Keep the set
in sync with _active_threads in _load_active_threads() and _save_thread().
3. Add _thread_store_lock (threading.Lock) to protect _active_threads and
the JSON file from concurrent access between the Discord loop thread
(_run_client) and the main thread (_load_active_threads, _save_thread).
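The three points above can be sketched together as a small store, assuming a `{channel_id: thread_id}` JSON file as described in the earlier commit. `ThreadStore` and its method names are hypothetical; only the attribute names mirror the ones mentioned above.

```python
import json
import tempfile
import threading
from pathlib import Path


class ThreadStore:
    """Hypothetical stand-in for the channel's thread bookkeeping."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()
        self._active_threads: dict[str, str] = {}
        self._active_thread_ids: set[str] = set()

    def load(self) -> None:
        with self._lock:
            if self._path.exists():
                self._active_threads = json.loads(self._path.read_text())
            # Rebuild the reverse-lookup set from the loaded mapping.
            self._active_thread_ids = set(self._active_threads.values())

    def save(self, channel_id: str, thread_id: str) -> None:
        with self._lock:
            old = self._active_threads.get(channel_id)
            if old is not None:
                self._active_thread_ids.discard(old)
            self._active_threads[channel_id] = thread_id
            self._active_thread_ids.add(thread_id)
            self._path.write_text(json.dumps(self._active_threads))

    def is_active(self, thread_id: str) -> bool:
        # O(1) membership check instead of scanning the dict values.
        with self._lock:
            return thread_id in self._active_thread_ids


store_path = Path(tempfile.mkdtemp()) / "discord_threads.json"
store = ThreadStore(store_path)
store.save("channel-1", "thread-9")

reloaded = ThreadStore(store_path)
reloaded.load()
```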
* perf(harness): push thread metadata filters into SQL
Replace Python-side metadata filtering (5x overfetch + in-memory match)
with database-side json_extract predicates so LIMIT/OFFSET pagination
is exact regardless of match density.
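A small illustration of why the push-down makes pagination exact, using the standard-library `sqlite3` module and assuming a SQLite build with the JSON functions available. The `threads` table and the `owner` key are made up for the example; only the `metadata_json` column name comes from this change set.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE threads (thread_id TEXT PRIMARY KEY, metadata_json TEXT)")
rows = [
    (f"t{i}", json.dumps({"owner": "alice" if i % 5 == 0 else "bob"}))
    for i in range(20)
]
conn.executemany("INSERT INTO threads VALUES (?, ?)", rows)

# The filter runs inside the database, so LIMIT/OFFSET count only
# matching rows, however sparse the matches are.
page = conn.execute(
    "SELECT thread_id FROM threads "
    "WHERE json_extract(metadata_json, '$.owner') = ? "
    "ORDER BY thread_id LIMIT ? OFFSET ?",
    ("alice", 2, 1),
).fetchall()
```

With Python-side filtering, the same page would depend on how many non-matching rows happened to be fetched first.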
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
* fix(harness): add dialect-aware JsonMatch compiler for type-safe metadata SQL filters
Replace SQLAlchemy JSON index/comparator APIs with a custom JsonMatch
ColumnElement that compiles to json_type/json_extract on SQLite and
jsonb_typeof/->>/-> on PostgreSQL. Tighten key validation regex to
single-segment identifiers, handle None/bool/numeric value types with
json_type-based discrimination, and strengthen test coverage for edge
cases and discriminability.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
* fix(harness): address Copilot review comments on JSON metadata filters
- Use json_typeof instead of jsonb_typeof in PostgreSQL compiler; the
metadata_json column is JSON not JSONB so jsonb_typeof would error at
runtime on any PostgreSQL backend
- Align _is_safe_json_key with json_match's _KEY_CHARSET_RE so keys
containing hyphens or leading digits are not silently skipped
- Add thread_id as secondary ORDER BY in search() to make pagination
deterministic when updated_at values collide; remove asyncio.sleep
from the pagination regression test
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix(harness): address remaining review comments on metadata SQL filters
- Remove _is_safe_json_key() and reuse json_match ValueError to avoid
validator drift (Copilot #3217603895, #3217411616)
- Raise ValueError when all metadata keys are rejected so callers never
get silent unfiltered results (WillemJiang)
- Fix integer precision: split int/float branches, bind int as Integer()
with INTEGER/BIGINT CAST instead of float() coercion (Copilot #3217603972)
- Fix jsonb_typeof -> json_typeof on JSON column (Copilot #3217411579)
- Replace manual _cleanup() calls with async yield fixture so teardown
always runs (Copilot #3217604019)
- Remove asyncio.sleep(0.01) pagination ordering; use thread_id secondary
sort instead (Copilot #3217411636)
- Add type annotations to _bind/_build_clause/_compile_* and remove EOL
comments from _Dialect fields (coding.mdc)
- Expand test coverage: boolean/null/mixed-type/large-int precision,
partial unsafe-key skip with caplog assertion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(harness): address third-round Copilot review comments on JsonMatch
- Reject unsupported value types (list, dict, ...) in JsonMatch.__init__
with TypeError so inherit_cache=True never receives an unhashable value
and callers get an explicit error instead of silent str() coercion
(Copilot #3217933201)
- Upgrade int bindparam from Integer() to BigInteger() to align with
BIGINT CAST and avoid overflow on large integers (Copilot #3217933252)
- Catch TypeError alongside ValueError in search() so non-string metadata
keys are warned and skipped rather than raising unexpectedly
(Copilot #3217933300)
- Add three tests: json_match rejects unsupported value types, search()
warns and raises on non-string key, search() warns and raises on
unsupported value type
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(harness): address fourth-round Copilot review comments on JsonMatch
- Add CASE WHEN guard for PostgreSQL integer matching: json_typeof returns
'number' for both ints and floats; wrap CAST in CASE with regex guard
'^-?[0-9]+$' so float rows never trigger CAST error (Copilot #3218413860)
- Validate isinstance(key, str) before regex match in JsonMatch.__init__
so non-string keys raise ValueError consistently instead of TypeError
from re.match (Copilot #3218413900)
- Include exception message in metadata filter skip warning so callers
can distinguish invalid key from unsupported value type (Copilot #3218413924)
- Update tests: assert CASE WHEN guard in PG int compilation, cover
non-string key ValueError in test_json_match_rejects_unsafe_key
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(harness): align ThreadMetaStore.search() signature with sql.py implementation
Use `dict[str, Any]` for `metadata` and `list[dict[str, Any]]` as return
type in base class and MemoryThreadMetaStore to resolve an LSP signature
mismatch; also correct a test docstring that cited the wrong exception type.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(harness): surface InvalidMetadataFilterError as HTTP 400 in search endpoint
Replace bare ValueError with a domain-specific InvalidMetadataFilterError
(subclass of ValueError) so the Gateway handler can catch it and return
HTTP 400 instead of letting it bubble up as a 500.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
* fix(harness): sanitize metadata keys in log output to prevent log injection
Use ascii() instead of %r to escape control characters in client-supplied
metadata keys before logging, preventing multiline/forged log entries.
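A brief sketch of the escaping, assuming the key is rendered with `ascii()` before being passed to the logger. The logger name and `safe_key_for_log` are hypothetical.

```python
import logging

logger = logging.getLogger("metadata_filter_demo")


def safe_key_for_log(key: object) -> str:
    # ascii() escapes control characters and all non-ASCII code points,
    # so the rendered key always stays on a single log line.
    return ascii(key)


forged = "owner\n2024-01-01 ERROR forged entry"
rendered = safe_key_for_log(forged)
logger.warning("Skipping metadata filter key %s", rendered)
```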
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(harness): validate metadata filters at API boundary and dedupe key/value rules
- Add Pydantic ``field_validator`` on ``ThreadSearchRequest.metadata`` so
unsafe keys / unsupported value types are rejected with HTTP 422 from
both SQL and memory backends (closes Copilot review 3218830849).
- Export ``validate_metadata_filter_key`` / ``validate_metadata_filter_value``
(and ``ALLOWED_FILTER_VALUE_TYPES``) from ``json_compat`` and have
``JsonMatch.__init__`` reuse them — the Gateway-side validator and the
SQL-side ``JsonMatch`` constructor now share one admission rule and
cannot drift.
- Format ``InvalidMetadataFilterError`` rejected-keys list as a
comma-separated plain string instead of a Python list repr so the
surfaced HTTP 400 detail is readable (closes Copilot review 3218830899).
- Update router tests to cover both 422 boundary paths plus the 400
defense-in-depth path when a backend still raises the error.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(harness): harden JsonMatch compile-time key validation against __init__ bypass
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* fix: address review feedback on metadata filter SQL push-down
- Add signed 64-bit range check to validate_metadata_filter_value; give
out-of-range ints a distinct TypeError message.
- Replace assert guards in _compile_sqlite/_compile_pg with explicit
if/raise so they survive python -O optimisation.
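The range check and the explicit `if`/`raise` style might look roughly like this. `check_filter_value` is a hypothetical stand-in for `validate_metadata_filter_value`, and the allowed-type tuple is an assumption based on the value types named in the earlier commits.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
# Assumed set of supported value types; the real list may differ.
ALLOWED_TYPES = (type(None), bool, int, float, str)


def check_filter_value(value: object) -> None:
    # Explicit if/raise rather than assert, so the check still runs
    # when Python is started with -O.
    if not isinstance(value, ALLOWED_TYPES):
        raise TypeError(f"unsupported filter value type: {type(value).__name__}")
    # bool is a subclass of int, so exclude it from the range check.
    if isinstance(value, int) and not isinstance(value, bool):
        if not INT64_MIN <= value <= INT64_MAX:
            raise TypeError("integer filter value is outside the signed 64-bit range")
```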
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs: document auth design and user isolation
* docs: align auth docs with current storage and reset behavior
---------
Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>
* feat(run): propagate model_name from gateway request context to persistence layer
Pass model_name through the full run creation pipeline — from
RunCreateRequest.context in the gateway, through RunManager, to the
RunStore interface and SQL persistence. This enables client-specified
model selection to be recorded per-run in the database.
* feat(run): add model allowlist validation and effective model name capture
- Validate model_name against allowlist in gateway services.py using
get_app_config().get_model_config()
- Truncate model_name to 128 chars to match DB column constraint
- In worker.py, capture effective model name from agent.metadata after
agent creation and persist if resolved differently than requested
* feat(run): add defense-in-depth model_name normalization and round-trip persistence tests
- Add _normalize_model_name() to RunRepository for whitespace stripping
and 128-char truncation before DB writes.
- Add round-trip unit tests for model_name creation and default None
in test_run_manager.py.
* fix(run): coerce non-string model_name values before strip/truncate in _normalize_model_name
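Taken together, the normalization steps from the commits above can be sketched as below. `normalize_model_name` is an illustrative stand-in for `_normalize_model_name()`; mapping an empty result to `None` is an assumption, not something the commits state.

```python
from typing import Any, Optional

MAX_MODEL_NAME_LEN = 128  # matches the DB column constraint mentioned above


def normalize_model_name(value: Any) -> Optional[str]:
    if value is None:
        return None
    if not isinstance(value, str):
        # Coerce first so .strip() never raises AttributeError.
        value = str(value)
    value = value.strip()
    if not value:
        return None  # assumption: blank names are stored as NULL
    return value[:MAX_MODEL_NAME_LEN]
```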
* fix(gateway): add runtime type guard for model_name coercion in gateway services
Add isinstance check and str() coercion before calling .strip() to prevent
AttributeError when non-string types (int, None, etc.) flow through the
gateway. Paired with SQL integration test for end-to-end model_name
persistence across gateway → langgraph → persistence layer.
* fix(run): drop Alembic migration for model_name (no-op) and expose public update method on RunManager
- Drop a1b2c3d4e5f6 migration: model_name already exists in RunRow schema
and is auto-created via Base.metadata.create_all() at startup
- Add update_model_name() public method to RunManager to replace the private
_persist_to_store call in worker.py, preserving internal locking/persistence
* fix(nginx): defer cors to gateway allowlist
Remove proxy-level wildcard CORS handling so browser origins are controlled by the Gateway allowlist and stay aligned with CSRF origin checks.
* docs: document gateway cors allowlist
Clarify that same-origin nginx access needs no CORS headers while split-origin or port-forwarded browser clients must opt in with GATEWAY_CORS_ORIGINS.
* docs(gateway): record cors source of truth
Document that Gateway CORSMiddleware and CSRFMiddleware share GATEWAY_CORS_ORIGINS as the split-origin source of truth.
* fix(gateway): align cors origin normalization
* docs: clarify gateway langgraph routing
* docs(gateway): update runtime routing note
* feat(agent): add update_agent tool for in-chat custom-agent self-updates (#2616)
Custom agents had no built-in way to persist updates to their own SOUL.md /
config.yaml from a normal chat — `setup_agent` was only bound during the
bootstrap flow, so when the user asked the agent to refine its description
or personality, the agent would shell out via bash/write_file and the edits
landed in a temporary sandbox/tool workspace instead of
`{base_dir}/agents/{agent_name}/`.
Changes:
- New `update_agent` builtin tool with partial-update semantics (only the
fields you pass are written) and atomic temp-file + os.replace writes so
a failed update never corrupts existing SOUL.md / config.yaml.
- Lead agent now binds `update_agent` in the non-bootstrap path whenever
`agent_name` is set in the runtime context. Default agent (no
agent_name) and bootstrap flow are unchanged.
- New `<self_update>` system-prompt section is injected for custom agents,
instructing them to use `update_agent` — and explicitly NOT bash /
write_file — to persist self-updates.
- Tests: 11 new cases in `tests/test_update_agent_tool.py` covering
validation (missing/invalid agent_name, unknown agent, no fields),
partial updates (soul-only, description-only, skills=[] vs omitted),
no-op detection, atomic-write safety, and AgentConfig round-tripping;
plus 2 new cases in `tests/test_lead_agent_prompt.py` covering the
self-update prompt section.
- Docs: updated backend/CLAUDE.md builtin tools list and tools.mdx
(en/zh) with the new tool description.
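The atomic-write approach mentioned in the first bullet can be sketched as follows. `atomic_write_text` is a hypothetical helper, not the tool's actual function; the temp file is created in the target's directory so that `os.replace` stays on one filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, content: str) -> None:
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_name, target)
    except BaseException:
        # Leave the existing file untouched and clean up the temp file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


soul_path = Path(tempfile.mkdtemp()) / "SOUL.md"
atomic_write_text(soul_path, "first version\n")
atomic_write_text(soul_path, "second version\n")
```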
* feat(agent): isolate custom agents per user
Store custom agent definitions under the effective user, keep legacy agents readable until migration, and cover API/tool/migration behavior with tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat: consistent write/delete targets & add --user-id to migration
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(loop-detection): keep tool-call pairing on warn injection (#2724)
* make format
* fix(loop-detection): avoid IMMessage leak to downstream consumer
* fix(channels): filter loop warning text from IM replies
* fix(channels): preserve clarification conversation history across follow-up turns
Pin channel-triggered runs to the root checkpoint namespace and ensure thread_id is always present in configurable run config so follow-up replies resume the same conversation state.
Add regression coverage to channel tests:
- assert checkpoint_ns/thread_id are passed in wait and stream paths
- add an integration-style clarification flow test that verifies the second user reply continues prior context instead of starting a new session
This addresses history loss after ask_clarification interruptions (issue #2425).
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(channels): copy configurable dict before injecting run-scoped fields
When configurable was already a plain dict, _resolve_run_params mutated
it in place, leaking checkpoint_ns and thread_id back into the shared
session config. Always copy via dict() before mutating to prevent
cross-user or cross-channel config pollution.
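A minimal sketch of the copy-before-mutate fix. `resolve_run_config` is a hypothetical simplification of `_resolve_run_params`, and the empty string for `checkpoint_ns` assumes that is how the root namespace is represented.

```python
from typing import Any


def resolve_run_config(session_config: dict[str, Any], thread_id: str) -> dict[str, Any]:
    # dict() makes a shallow copy, so run-scoped keys never leak back
    # into the shared session config.
    configurable = dict(session_config.get("configurable") or {})
    configurable["thread_id"] = thread_id
    configurable["checkpoint_ns"] = ""  # assumed root namespace value
    return {**session_config, "configurable": configurable}


shared = {"configurable": {"model_name": "example-model"}}
run_a = resolve_run_config(shared, "thread-a")
run_b = resolve_run_config(shared, "thread-b")
```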
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(gateway): return ISO 8601 timestamps from threads endpoints (#2594)
ThreadResponse documents created_at / updated_at as ISO timestamps,
matching the LangGraph Platform schema (langgraph_sdk.schema.Thread
exposes them as datetime, JSON-encoded as ISO 8601). The gateway
threads router was instead emitting str(time.time()) — unix-second
floats — breaking frontend new Date() parsing and producing a mixed
ISO/unix wire format that also corrupted the search sort order.
Centralize timestamp generation in deerflow.utils.time:
- now_iso() — datetime.now(UTC).isoformat()
- coerce_iso(x) — heals legacy unix-timestamp strings on read so the
store converges to ISO without a one-shot migration
threads.py: replace 6 time.time() call sites with now_iso(); wrap all
read paths and Phase-2 checkpoint metadata with coerce_iso(); _store_upsert
opportunistically heals legacy created_at on update; drop unused time import.
thread_runs.py: reuse now_iso() instead of a private duplicate _now_iso(),
preventing future drift between the two timestamp call sites.
Tests: 9 unit tests for the helper; 5 integration tests pinning the ISO
contract for create/get/patch/search and the legacy-healing path on the
internal store upsert. Full suite: 2144 passed, 15 skipped, 0 failed.
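The two helpers might look roughly like this. It is a sketch of the behavior described above, not the contents of `deerflow.utils.time`; returning an empty string for unsupported inputs is an assumption.

```python
from datetime import datetime, timezone
from typing import Any


def now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


def coerce_iso(value: Any) -> str:
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    if isinstance(value, str):
        try:
            seconds = float(value)  # legacy unix-second string
        except ValueError:
            return value  # assumed to be ISO 8601 already
        return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
    return ""  # assumption: unknown inputs become an empty string


healed = coerce_iso("1777252410.411327")
unchanged = coerce_iso("2024-01-02T03:04:05+00:00")
```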
Closes #2594
* fix(gateway): coerce checkpoint metadata timestamps to ISO on read
After the merge with main, three additional read paths in ``threads.py``
were still emitting raw ``str(metadata.get("created_at", ""))`` —
``get_thread_state``, ``update_thread_state``, and ``get_thread_history``.
Same root cause as #2594: when the checkpoint metadata's ``created_at``
is a unix-second float (legacy data, or a checkpoint written by an older
Gateway version), ``str(float)`` produces ``"1777252410.411327"`` and the
frontend's ``new Date(...)`` returns ``Invalid Date``. The fix on the
``/threads/{id}`` GET path was already in place; these three sibling
endpoints needed the same treatment.
All four call sites now flow through ``coerce_iso``, so:
- legacy float metadata heals to ISO on the way out,
- ISO metadata passes through unchanged,
- ``datetime`` instances (which the new ``coerce_iso`` branch handles
explicitly) emit with the ``T`` separator instead of falling through
to the space-separated ``str(datetime)`` form.
Coverage added for the two endpoints not already pinned by the merge:
- ``test_get_thread_state_returns_iso_for_legacy_checkpoint_metadata``
- ``test_get_thread_history_returns_iso_for_legacy_checkpoint_metadata``
Both pre-seed a checkpoint whose metadata carries the literal float
from the issue body and assert the wire format is ISO.
* fix(agents): propagate agent_name into ToolRuntime.context for setup_agent (#2677)
When creating a custom agent via the web UI, SOUL.md was always written
to the global base_dir/SOUL.md instead of agents/<name>/SOUL.md.
Root cause: the bootstrap flow sends agent_name via body.context, but
two layers were broken:
1. services.py only forwarded body.context keys into config["configurable"];
config["context"] was never populated.
2. worker.py constructed the parent Runtime with a hard-coded
{thread_id, run_id} context, ignoring config["context"] entirely.
After the langgraph >= 1.1.9 bump (#98a5b34f), ToolRuntime.context no
longer falls back to configurable, so setup_agent's
runtime.context.get("agent_name") returned None and the tool's silent
agent_name=None -> base_dir fallback kicked in, overwriting the global
SOUL.md.
Fix:
- services.py: extract merge_run_context_overrides() and write the
whitelisted context keys into both configurable (legacy readers) and
context (langgraph 1.1+ ToolRuntime consumers).
- worker.py: extract _build_runtime_context() and merge config["context"]
into the Runtime's context (without letting callers override
thread_id/run_id).
The base_dir fallback in setup_agent_tool.py is left in place because
the IM /bootstrap channel command depends on it. That code path can
be tightened in a follow-up.
Adds regression tests covering both helpers.
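The merge rule for the Runtime context can be sketched as below. `build_runtime_context` is a hypothetical stand-in for `_build_runtime_context()`, written only to show the ordering that keeps `thread_id` / `run_id` authoritative.

```python
from typing import Any, Optional


def build_runtime_context(
    thread_id: str, run_id: str, extra: Optional[dict[str, Any]]
) -> dict[str, Any]:
    context: dict[str, Any] = dict(extra or {})
    # Identity fields are written last, so caller-supplied values
    # cannot override them.
    context["thread_id"] = thread_id
    context["run_id"] = run_id
    return context


ctx = build_runtime_context(
    "t1", "r1", {"agent_name": "demo-agent", "thread_id": "spoofed"}
)
```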
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Centralize log level parsing in `logging_level_from_config()` and
application in `apply_logging_level()` within `deerflow.config.app_config`.
- Gateway lifespan applies configured log level on startup
- `debug.py` uses shared helpers instead of local duplicates
- `apply_logging_level()` targets only `deerflow`/`app` logger hierarchies
so third-party library verbosity is not affected; root handler levels
are only lowered (never raised) to allow configured loggers through
without suppressing third-party output; root logger level is not modified
- Config field description updated to clarify scope
- Tests save/restore global logging state to avoid test pollution
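The scoping rules above can be sketched with the standard `logging` module. `apply_level` is a hypothetical simplification of `apply_logging_level()`.

```python
import logging


def apply_level(level: int, names: tuple[str, ...] = ("deerflow", "app")) -> None:
    # Only the named logger hierarchies get the configured level.
    for name in names:
        logging.getLogger(name).setLevel(level)
    # Root handlers are lowered when they would block the configured
    # level, and never raised; the root logger level is left alone.
    for handler in logging.getLogger().handlers:
        if handler.level > level:
            handler.setLevel(level)


root = logging.getLogger()
strict = logging.StreamHandler()
strict.setLevel(logging.ERROR)
root.addHandler(strict)
root_level_before = root.level

apply_level(logging.DEBUG)
root.removeHandler(strict)  # restore global state after the demo
```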
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* feat(channels): add DingTalk channel integration
Add a new DingTalk messaging channel using the dingtalk-stream SDK
with Stream Push (WebSocket), requiring no public IP. Supports both
plain sampleMarkdown replies and optional AI Card streaming for a
typewriter effect when card_template_id is configured.
- Add DingTalkChannel implementation with token management, message
routing, allowed_users filtering, and markdown adaptation
- Register dingtalk in channel service registry and capability map
- Propagate inbound metadata to outbound messages in ChannelManager
for DingTalk sender context (sender_staff_id, conversation_type)
- Add dingtalk-stream dependency to pyproject.toml
- Add configuration examples in config.example.yaml and .env.example
- Update all README translations with setup instructions
- Add comprehensive test suite (test_dingtalk_channel.py) and
metadata propagation test in test_channels.py
- Update backend CLAUDE.md to document DingTalk channel
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels): address PR review feedback for DingTalk integration
- Replace runtime mutation of CHANNEL_CAPABILITIES with a
`supports_streaming` property on the Channel base class, overridden
by DingTalkChannel, FeishuChannel, and WeComChannel
- Store stream client reference and attempt graceful disconnect in
stop(); guard _on_chatbot_message with _running check to prevent
post-stop message processing
- Use msg.chat_id as the primary routing key in send/send_file via
a shared _resolve_routing helper, with metadata as fallback
- Fix process() return type annotation from tuple[str, str] to
tuple[int, str] to match AckMessage.STATUS_OK
- Protect _incoming_messages with threading.Lock for cross-thread
safety between the Stream Push thread and the asyncio loop
- Re-add Docker Compose URL guidance removed during DingTalk setup
docs addition in README.md
- Fix incomplete sentence in README_zh.md (missing verb "启用")
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(docs): restore plain paragraph format for Docker Compose note
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels): fix isinstance TypeError and add file size guard in DingTalk channel
Use tuple syntax for isinstance() type check to avoid runtime TypeError
with PEP 604 union types. Add upload size limit (20MB) before reading
files into memory. Narrow exception handlers to specific types.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels): propagate markdown fallback errors and validate access token response
- Re-raise exceptions in _send_markdown_fallback to prevent partial
deliveries (files sent without accompanying text)
- Validate _get_access_token response: reject non-dict bodies, empty
tokens, and coerce invalid expireIn to a safe default
- Add tests for both fixes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels): validate upload response and broaden send_file exception handling
- Validate _upload_media JSON response: handle JSONDecodeError and
non-dict payloads gracefully by returning None
- Broaden send_file exception tuple to include TypeError and
AttributeError for unexpected JSON shapes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels): fix streaming race on channel registration and slim outbound metadata
- Register channel in service before calling start() to avoid race
where background receiver publishes inbound before registration,
causing manager to fall back to static CHANNEL_CAPABILITIES
- Strip known-large metadata keys (raw_message, ref_msg) from outbound
messages to prevent memory bloat from propagated inbound payloads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Update service.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update CLAUDE.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(security): allow disabling API docs in production via GATEWAY_ENABLE_DOCS
Expose /docs, /redoc, and /openapi.json only when GATEWAY_ENABLE_DOCS=true
(default). Setting GATEWAY_ENABLE_DOCS=false disables all three endpoints,
preventing unauthorized API surface discovery in production deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(security): add unit tests and docs for GATEWAY_ENABLE_DOCS
Add 7 tests covering default behavior, env var parsing (case-insensitive,
fail-closed), endpoint visibility, and health endpoint independence.
Update CONFIGURATION.md and CLAUDE.md with the new toggle.
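One way to read "case-insensitive, fail-closed" is sketched below. `docs_enabled` is a hypothetical helper, and accepting only the literal `true` is an assumption; the gateway may accept other truthy spellings.

```python
from collections.abc import Mapping


def docs_enabled(env: Mapping[str, str]) -> bool:
    raw = env.get("GATEWAY_ENABLE_DOCS")
    if raw is None:
        return True  # documented default
    # Fail closed: anything other than "true" disables the docs.
    return raw.strip().lower() == "true"
```

In FastAPI, passing `docs_url=None`, `redoc_url=None`, and `openapi_url=None` to the app constructor disables the three endpoints; whether the gateway wires the flag exactly this way is not stated here.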
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style(security): apply ruff formatting to gateway app.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(security): harden auth system and fix run journal logic bug
- Fix inverted condition in RunJournal.on_chat_model_start that prevented
first human message capture (not messages → messages)
- Pre-hash passwords with SHA-256 before bcrypt to avoid silent 72-byte
truncation vulnerability
- Move load_dotenv() from module scope into get_auth_config() to prevent
import-time os.environ mutation breaking test isolation
- Return generic 'Invalid token' instead of exposing specific error
variants (expired, malformed, invalid_signature) to clients
- Make @require_auth independently enforce 401 instead of silently
passing through when AuthMiddleware is absent
- Rate-limit /setup-status endpoint with per-IP cooldown to mitigate
initialization-state information leak
- Document in-process rate limiter limitation for multi-worker deployments
* fix(security): return 429+Retry-After on setup-status rate limit, bound cooldown dict
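A rough sketch of a per-IP cooldown with a bounded table. `CooldownLimiter` and its parameters are hypothetical; the returned value is what a handler could send as `Retry-After` alongside a 429 response.

```python
import math
from collections import OrderedDict
from typing import Optional


class CooldownLimiter:
    def __init__(self, cooldown_seconds: float, max_entries: int) -> None:
        self._cooldown = cooldown_seconds
        self._max_entries = max_entries
        self._last_seen: "OrderedDict[str, float]" = OrderedDict()

    def retry_after(self, client_ip: str, now: float) -> Optional[int]:
        """Return None if allowed, else the seconds the client should wait."""
        last = self._last_seen.get(client_ip)
        if last is not None and now - last < self._cooldown:
            return max(1, math.ceil(self._cooldown - (now - last)))
        self._last_seen[client_ip] = now
        self._last_seen.move_to_end(client_ip)
        # Bound memory: evict the oldest entries beyond the cap.
        while len(self._last_seen) > self._max_entries:
            self._last_seen.popitem(last=False)
        return None


limiter = CooldownLimiter(cooldown_seconds=60, max_entries=2)
first = limiter.retry_after("10.0.0.1", now=0.0)
second = limiter.retry_after("10.0.0.1", now=15.0)
```

As the earlier commit notes, an in-process table like this is not shared across workers.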
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/070d0be8-99a5-46c8-85bb-6b81b5284021
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
* fix(security): add versioned password hashes with auto-migration on login
The SHA-256 pre-hash change silently broke verification for any existing
bcrypt-only password hashes. Introduce a <N>$ prefix scheme so hashes
are self-describing:
- v2 (current): bcrypt(b64(sha256(password))) with $ prefix
- v1 (legacy): plain bcrypt, prefixed $ or bare (no prefix)
verify_password auto-detects the version and falls back to v1 for older
hashes. LocalAuthProvider.authenticate() now rehashes legacy hashes to v2
on successful login via needs_rehash(), so existing users upgrade
transparently without a dedicated migration step.
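The pre-hash step of the v2 construction can be sketched with the standard library alone. `prehash` is a hypothetical name, and the bcrypt call itself is left out because it needs a third-party package.

```python
import base64
import hashlib


def prehash(password: str) -> bytes:
    # SHA-256 gives a fixed 32-byte digest; base64 turns it into 44
    # ASCII bytes, well under bcrypt's 72-byte input limit and free of
    # NUL bytes.
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return base64.b64encode(digest)


long_a = prehash("a" * 72 + "1")
long_b = prehash("a" * 72 + "2")
```

Two passwords that differ only after byte 72 would collide under plain bcrypt, but their pre-hashes differ, so the bcrypt step sees distinct inputs.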
* fix(auth): harden verify_password, best-effort rehash, update require_auth docstring, downgrade journal logging
- password.py: wrap bcrypt.checkpw in try/except → return False for malformed/corrupt hashes instead of crashing
- local_provider.py: wrap auto-rehash update_user() in try/except so transient DB errors don't fail valid logins
- authz.py: update require_auth docstring to reflect independent 401 enforcement
- journal.py: downgrade on_chat_model_start from INFO to DEBUG, log only metadata (batch_count, message_counts) instead of full serialized/messages content
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
* fix(auth): address code review - narrow ValueError catch, add rehash warning log, rename num_batches
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
- Updated documentation and comments to reflect the transition from LangGraph Server to Gateway.
- Changed default URLs in ChannelManager and tests to point to Gateway.
- Removed references to LangGraph Server in deployment scripts and configurations.
- Updated Nginx configuration to route API traffic to Gateway.
- Adjusted frontend configurations to utilize Gateway's API.
- Removed LangGraph service from Docker Compose files, consolidating services under Gateway.
- Added regression tests to ensure Gateway integration works as expected.
Co-authored-by: Copilot <copilot@github.com>
* Refactor API fetch calls to use a unified fetch function; enhance chat history loading with new hooks and UI components
- Replaced `fetchWithAuth` with a generic `fetch` function across various API modules for consistency.
- Updated `useThreadStream` and `useThreadHistory` hooks to manage chat history loading, including loading states and pagination.
- Introduced `LoadMoreHistoryIndicator` component for better user experience when loading more chat history.
- Enhanced message handling in `MessageList` to accommodate new loading states and history management.
- Added support for run messages in the thread context, improving the overall message handling logic.
- Updated translations for loading indicators in English and Chinese.
* Fix test assertions for run ordering in RunManager tests
- Updated assertions in `test_list_by_thread` to reflect correct ordering of runs.
- Modified `test_list_by_thread_is_stable_when_timestamps_tie` to ensure stable ordering when timestamps are tied.