deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-11 18:05:58 +00:00

Author	SHA1	Message	Date
taohe	d1768606c0	Merge remote-tracking branch 'origin/main' into codex/im-channel-connections # Conflicts: # backend/app/gateway/services.py # frontend/src/app/workspace/chats/page.tsx	2026-06-11 17:51:16 +08:00
taohe	b26b30ac3d	Reflect IM channel runtime health	2026-06-11 17:11:55 +08:00
taohe	dae7c7870e	Prefill IM channel runtime config	2026-06-11 17:02:02 +08:00
taohe	4a0278420f	Allow disconnecting runtime IM channels	2026-06-11 16:10:02 +08:00
taohe	ade4a55cfe	Persist IM runtime config locally	2026-06-11 15:58:40 +08:00
taohe	09872af36c	Make channel threads visible to connection owners	2026-06-11 15:40:49 +08:00
taohe	9d51e38641	Keep configured IM channels editable	2026-06-11 14:40:37 +08:00
taohe	c4368c9018	Add runtime setup for enabled IM channels	2026-06-11 12:10:16 +08:00
taohe	a52deada8b	Support all integrated IM channel connections	2026-06-11 11:19:27 +08:00
taohe	b7097baaec	Address IM channel review comments	2026-06-11 10:33:44 +08:00
taohe	87200ff920	Address Copilot IM channel feedback	2026-06-11 08:26:48 +08:00
Ryker_Feng	167ef4512f	feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429 ) (#3465 ) * feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) Add a `memory.token_counting` option (`tiktoken` \| `char`) so deployments in network-restricted environments can opt out of tiktoken entirely. In `char` mode the memory-injection budget uses a network-free character-based estimate and never triggers the BPE download from openaipublic.blob.core.windows.net, which could otherwise block for tens of minutes (see #3402). Also harden the default `tiktoken` path: - cache an in-flight LOADING sentinel so concurrent callers fall back immediately instead of spawning more blocking get_encoding threads when the first load is still running (e.g. under the 5s startup warm-up timeout); - cache failures with a timestamp and retry after a cooldown so a transient network outage self-heals back to accurate counting without a restart; - skip startup warm-up entirely in char mode. The new config is surfaced via the memory config API and config.example.yaml (config_version bumped). Default remains `tiktoken`, so existing deployments are unaffected. * fix(memory): use CJK-aware char token estimate and address review feedback - Replace the flat len(text)//4 fallback with a CJK-aware estimate so Chinese/Japanese/Korean memory content does not over-fill the injection budget - Document the internal tiktoken retry cooldown and char-mode escape hatch - Sync CLAUDE.md / config.example.yaml / MEMORY_IMPROVEMENTS.md wording - Fix MemoryConfigResponse mocks/assertions and add CJK estimate tests	2026-06-10 23:26:15 +08:00
taohe	d06643d8a2	Align IM connections with local channels	2026-06-10 22:16:47 +08:00
taohe	92c185b90d	Support local IM channel connections	2026-06-10 21:59:33 +08:00
taohe	ec5ed185cd	Merge remote-tracking branch 'origin/main' into codex/im-channel-connections # Conflicts: # backend/app/channels/discord.py # backend/app/channels/manager.py # backend/app/channels/slack.py # backend/app/channels/telegram.py	2026-06-10 21:13:02 +08:00
taohe	dbe3a3bb0d	Add user-owned IM channel connections	2026-06-10 21:07:44 +08:00
DanielWalnut	2b795265e7	fix: align auth-disabled mode and mock history loading (#3471) * fix: align auth-disabled mode and mock history loading * fix: address auth-disabled review feedback * test: cover auth-disabled backend contract * style: format frontend tests * fix: address follow-up review comments	2026-06-10 16:11:00 +08:00
ly-wang19	b62c5a7b5b	fix(agents): offload blocking filesystem IO in the custom-agent router off the event loop (#3457 ) * fix(agents): offload blocking filesystem IO in delete_agent off the event loop delete_agent is an async route handler but resolved the agent directory (Paths.base_dir -> Path.resolve), probed it (Path.exists), and removed it (shutil.rmtree) directly on the event loop, blocking it for the duration of every delete. Surfaced by 'make detect-blocking-io'. Move the resolve/exists/rmtree sequence into a sync helper run via asyncio.to_thread, mapping its outcome back to the existing 404/409/500 responses (behavior unchanged). Adds a tests/blocking_io/ regression anchor under the strict Blockbuster gate, mirroring test_skills_load.py (#1917). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agents): offload blocking filesystem IO in create_agent_endpoint too Like delete_agent, the async create_agent_endpoint resolved and created the agent directory and wrote config.yaml + SOUL.md (with rmtree cleanup on failure) directly on the event loop. Move the whole create-or-409 sequence into a sync helper run via asyncio.to_thread; behavior is unchanged (201 / 409 / 500). Extends the blocking_io regression anchor to cover create as well as delete and renames it to test_agents_router.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-09 22:24:53 +08:00
Xun	3c2b60aaae	fix(threads): assign new checkpoint ID in update_thread_state (#2391 ) * async * add test * test(threads): assert aput preserves endpoint-assigned checkpoint id Confirm the update_thread_state fix is real, not a no-op: all supported savers (InMemorySaver, AsyncSqliteSaver, AsyncPostgresSaver) persist and echo checkpoint["id"] verbatim rather than minting their own. Add assertions that each POST /state response's checkpoint_id round-tripped into persisted history and kept its uuid6 time-ordering through aput, and document the verified contract in the router. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 23:12:25 +08:00
DanielWalnut	3b105d1e5f	fix(suggestions): strip inline <think> reasoning before parsing follow-up questions (#3435 ) Reasoning models such as MiniMax-M3 inline their chain-of-thought into the message content as <think>...</think> (reasoning_split defaults to false) instead of a separate reasoning_content field. The follow-up-suggestions endpoint extracted the JSON array via find('[') / rfind(']'), which silently broke whenever the reasoning text contained '[' or ']' — or when long thinking hit max_tokens and truncated before the array was emitted — returning empty suggestions. - Add _strip_think_blocks() and apply it before JSON extraction; it removes complete <think>...</think> blocks (case-insensitive) and drops an unclosed <think> left by max_tokens truncation. - Document the MiniMax thinking toggle in config.example.yaml (when_thinking_enabled: adaptive / when_thinking_disabled: disabled) so thinking_enabled=False actually disables reasoning on M3; note that M2.x models always think and rely on the defensive strip above. - Tests cover complete/unclosed think blocks, brackets-inside-think, think + code-fence, and an end-to-end suggestions case reproducing the empty-result bug. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:48:00 +08:00
greatmengqi	40a371b88c	fix(security): harden MCP config endpoint (#3425) * fix(security): harden MCP config endpoint * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-08 12:21:02 +08:00
Nan Gao	1aac408dd0	fix upload file size contract (#3408)	2026-06-06 15:12:17 +08:00
Eilen Shin	019bd16a06	fix: load paginated run history messages (#3305)	2026-06-01 15:50:39 +08:00
Eilen Shin	872079b894	docs: clean standalone LangGraph server remnants (#3301)	2026-05-29 11:36:45 +08:00
Lucy Shen	37451500eb	fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds (#3228 ) * fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds `@router.api_route("/.../stream", methods=["GET", "POST"])` registers a single FastAPI route that holds both methods. FastAPI's auto-generated `operationId` is computed once per route from a single method picked out of `route.methods`, so when OpenAPI generation iterates over every method on that route both end up sharing the same `operationId`. That triggers `UserWarning: Duplicate Operation ID stream_existing_run_..._stream_(get\|post) for function stream_existing_run` during `app.openapi()` and produces an invalid OpenAPI spec for SDK / codegen consumers. Register GET and POST as two separate routes on the same handler so each method gets a distinct auto-generated `operationId` ("..._stream_get" and "..._stream_post"). Behavior is otherwise unchanged: same handler, same `require_permission` decoration, same response. Add `tests/test_openapi_operation_ids.py` to lock in the invariant: no duplicate-operationId warnings during spec generation, globally unique operationIds across the spec, and distinct GET / POST operationIds on the stream endpoint specifically. Reverted the source change locally and confirmed all three tests fail before the fix. * test(runtime): widen CancelledError catch in _ScriptedAgent to fix cancel-race flake `_ScriptedAgent.astream()` previously only caught `asyncio.CancelledError` inside the inner `if self.block_after_first_chunk:` while-loop. Cancellation arriving during any earlier `await` in the same body (`self.model.ainvoke`, `_write_checkpoint`, the `yield`) would propagate without setting `controller.cancelled`, so callers waiting on `controller.cancelled.wait(5)` after `POST /cancel` returned 204 could race and time out. 
`test_cancel_interrupt_stops_running_background_run` waits only for the `started` event (set on the first line of `astream`) before issuing cancel, so its race window spans all three pre-loop `await`s. On a clean `main` checkout, stress-running the test 20× reproduces the failure 6/20 (~30%). `test_cancel_rollback_restores_pre_run_checkpoint`, which waits for the later `checkpoint_written` event, passes 20/20 — confirming the race lives entirely in the gap between `started.set()` and the cancellation-aware block. Widen the try/except to cover the entire `astream` body so any `CancelledError` sets the controller event; the non-cancel path is unchanged (no exception means no event set). After this change the previously flaky test passes 50/50, the rollback test still passes 30/30, and the full backend suite remains at 3649 passed / 19 skipped. Test-only change — `backend/tests/test_runtime_lifecycle_e2e.py` is the only file touched; the production cancel pipeline is unaffected.	2026-05-28 08:20:52 +08:00
AochenShen99	a5599c100c	fix(gateway): honour on_disconnect on /wait endpoints (#3267 ) * fix(gateway): honour on_disconnect on /wait endpoints (#3265) The non-streaming /threads/{tid}/runs/wait and /runs/wait handlers used to await record.task directly with no disconnect handling and silently swallow CancelledError. When a long tool call (e.g. pip install inside a custom skill) kept the connection idle long enough for an intermediate HTTP layer to time out, the handler would still read the in-progress checkpoint and return it as if the run had completed normally -- masking a half-finished run as a successful response. Add wait_for_run_completion in app.gateway.services that mirrors sse_consumer's bridge-consumption pattern: subscribe to the stream bridge until END_SENTINEL, poll request.is_disconnected on every wake-up, and on real client disconnect cancel the background run when record.on_disconnect is "cancel". Wire it into both wait endpoints. The streaming path was unaffected because sse_consumer already has this loop; this just brings /wait to parity. * fix(gateway): skip checkpoint serialization on /wait disconnect Copilot review on #3267 caught a follow-on of the same #3265 bug: when the client disconnects, wait_for_run_completion breaks out of the bridge loop and cancels the run, but the /wait endpoint then continues to read the checkpointer and serializes whatever partial checkpoint exists as a normal 200 response. Have the helper return a bool — True only when END_SENTINEL was observed — and skip the checkpoint serialization path on False. Also reorder the inner check so END_SENTINEL is honoured even when is_disconnected() flips true in the same iteration; the run truly finished so the real final checkpoint is still valid.	2026-05-28 07:22:39 +08:00
Willem Jiang	f9b7071304	fix(sandbox): add group/other read permissions to uploaded files for Docker sandbox (#3127 ) (#3134 ) * fix(sandbox): add group/other read permissions to uploaded files for Docker sandbox (#3127) When using AIO sandbox with LocalContainerBackend, uploaded files are created with 0o600 (owner-only) permissions by the gateway process running as root. The sandbox process inside the Docker container runs as a non-root user and cannot read these bind-mounted files, causing a "Permission denied" error on read_file. Add `needs_upload_permission_adjustment` attribute to SandboxProvider (default True) to indicate that uploaded files need chmod adjustment. LocalSandboxProvider opts out (same user). A new `_make_file_sandbox_readable` function adds S_IRGRP \| S_IROTH bits after files are written, changing permissions from 0o600 to 0o644 so the sandbox can read the uploads. fixes #3127 * fix(uploads): unconditionally adjust file permissions for sandbox access The conditional check meant uploaded files retained 0o600 permissions in some Docker sandbox configurations, preventing the sandbox process (UID 1000) from reading them. Always add group/other read bits so every sandbox setup can access uploaded content. Also add read bits to the sync-path writable helper as defense in depth.	2026-05-25 09:26:18 +08:00
Lawrance_YXLiao	2eeb597985	fix(runs): expose active progress counters (#3148) * fix(runs): expose active progress counters * fix(runs): avoid delayed progress flush on completion * fix(runs): tighten progress snapshot semantics * fix(runs): preserve omitted progress fields * chore(runs): remove duplicate journal initialization	2026-05-22 21:42:14 +08:00
sunsine	7ec8d3a6e7	fix(security): mask sensitive values in MCP config API responses (#2667 ) * fix(security): mask sensitive values in MCP config API responses GET /api/mcp/config previously returned plaintext secrets including env dict values (API keys), headers (auth tokens), and OAuth client_secret/refresh_token. Any authenticated user could read all MCP service credentials. This commit masks sensitive fields in GET/PUT responses while preserving the key structure so the frontend round-trip (GET masked → toggle enabled → PUT) correctly preserves existing secrets. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(security): address Copilot review on MCP config masking - Load raw JSON (un-resolved $VAR placeholders) as merge source instead of resolved config, preventing plaintext secrets from replacing $VAR placeholders on disk (Comment 2) - Preserve all top-level keys (e.g. mcpInterceptors) in PUT, not just mcpServers/skills (Comment 1) - Reject masked value '***' for new keys that don't exist in existing config, returning 400 with actionable error (Comment 3) - Allow empty string '' to explicitly clear OAuth secrets, while None means 'preserve existing' for safe round-trip (Comment 4) - Add 3 new tests for rejection, clearing, and edge cases (18 total) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-21 10:28:57 +08:00
He Wang	c810e9f809	fix(harness)!: hydrate runs from RunStore and persist interrupted status (#2932 ) * fix(harness): hydrate run history from RunStore and persist cancellation status fix: - Make RunManager.get() async and hydrate from RunStore when in-memory record is missing - Merge store rows into list_by_thread() with in-memory precedence for active runs - Persist interrupted status to RunStore in cancel() and create_or_reject(interrupt\|rollback) - Extract _persist_status() to reuse the best-effort store update pattern - Await run_mgr.get() in all gateway endpoints - Return 409 with distinct message for store-only runs not active on current worker Closes #2812, Closes #2813 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(harness): consistent sort and guarded hydration in RunManager fix: - list_by_thread() now sorts by created_at desc (newest first) even when no RunStore is configured, matching the store-backed code path - guard _record_from_store() call sites in get() and list_by_thread() with best-effort error handling so a single malformed store row cannot turn read paths into 500s test: - update test_list_by_thread assertion to expect newest-first order - seed MemoryRunStore via public put() API instead of writing to _runs * fix(harness): guard store-only runs from streaming and fix get() TOCTOU Add RunRecord.store_only flag set by _record_from_store so callers can distinguish hydrated history from live in-memory runs. join_run and stream_existing_run (action=None) now return 409 instead of hanging forever on an empty MemoryStreamBridge channel. Re-check _runs under lock after the store await in RunManager.get() so a concurrent create() that lands between the two checks returns the authoritative in-memory record rather than a stale store-hydrated copy. 
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix(harness): reorder bridge fetch in join_run and make list_by_thread limit explicit Move get_stream_bridge() after the store_only guard in join_run so a missing bridge cannot produce 503 for historical runs before the 409 guard fires. Add limit parameter to RunManager.list_by_thread (default 100, matching the store's page size) and pass it explicitly to the store call. Update docstring to document the limit instead of claiming all runs are returned. Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix(harness): cap list_by_thread result to limit after merge Apply [:limit] to all return paths in list_by_thread so the method consistently returns at most limit records regardless of how many in-memory runs exist, making the limit parameter a true upper bound on the response size rather than just a store-query hint. Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix `list_by_thread` docstring Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(runtime): add update_model_name to RunStore to prevent SQL integrity errors RunManager.update_model_name() was calling _persist_to_store() which uses RunStore.put(), but RunRepository.put() is insert-only. This caused integrity errors when updating model_name for existing runs in SQL-backed stores. 
fix: - Add abstract update_model_name method to RunStore base class - Implement update_model_name in MemoryRunStore - Implement update_model_name in RunRepository with proper normalization - Add _persist_model_name helper in RunManager - Update RunManager.update_model_name to use the new method test: - Add tests for update_model_name functionality - Add integration tests for RunManager with SQL-backed store Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(runtime): handle NULL status/on_disconnect in _record_from_store `dict.get(key, default)` only uses the default when the key is absent, so a SQL row with an explicit NULL status would pass `None` to `RunStatus(None)` and raise, breaking hydration for otherwise valid rows. Switch to `row.get(...) or fallback` so both missing and NULL values get a safe default. Add tests for get() and list_by_thread() with a NULL status row to prevent regression. Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix(runs): address PR review feedback on store consistency changes - Fix list_by_thread limit semantics: pass store_limit = max(0, limit - len(memory_records)) to store so newer store records are not crowded out by in-memory records - Remove dead code: cancelled guard after raise is always True, simplify to if wait and record.task - Document _record_from_store NULL fallback policy (status→pending, on_disconnect→cancel) in docstring Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-18 22:25:02 +08:00
Willem Jiang	b5108e3520	fix(auth): replace setup-status 429 rate limit with cached response (#2915 ) * fix(auth): replace setup-status 429 rate limit with cached response The /api/v1/auth/setup-status endpoint had a 60-second cooldown that returned HTTP 429 for all but the first request per IP. When the service restarted with multiple browser tabs open, all tabs hit this endpoint simultaneously from the same source IP, causing a storm of 429 errors that blocked the login flow. Replace the cooldown-with-429 model with a per-IP response cache that returns the previously computed result within the TTL. The database query (count_admin_users) still only runs once per IP per 60 seconds, preserving the original performance goal while eliminating spurious 429 errors on multi-tab reconnection. Fixes #2902 * fix(auth): address setup-status cache review issues Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/439a0e8c-8b64-41d4-a3cd-fe9a00eec534 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * test(auth): improve readability of setup-status concurrency assertion Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/439a0e8c-8b64-41d4-a3cd-fe9a00eec534 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix the unit test error --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>	2026-05-18 22:07:01 +08:00
Willem Jiang	39f901d3a5	fix(runs): restore historical runs from persistent store after gateway restart (#2989 ) * fix(runs): restore historical runs from persistent store after gateway restart RunManager.list_by_thread() and get() only queried the in-memory _runs dict, returning empty results after a restart even when PostgreSQL had the records. Add store fallback to both read paths and a new async aget() for the API endpoint, keeping sync get() for internal callers that need live task/abort_event state. Fixes #2984 * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(runs): scope run store fallback reads by user id Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/e73daada-1215-4bc1-ab7d-7117826c5013 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * test(runs): clarify ordering expectation and mock store filters Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/e73daada-1215-4bc1-ab7d-7117826c5013 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * test(runs): make user filter fallback assertions explicit Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/e73daada-1215-4bc1-ab7d-7117826c5013 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * test(runs): verify user-isolated fallback behavior with memory store Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/e73daada-1215-4bc1-ab7d-7117826c5013 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * update the code with feedback from issue-2984 --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-05-17 20:03:21 +08:00
Hinotobi	7a2670eaea	fix(gateway): cap skill artifact preview size (#2963)	2026-05-15 22:15:58 +08:00
He Wang	e9deb6c2f2	perf(harness): push thread metadata filters into SQL (#2865 ) * perf(harness): push thread metadata filters into SQL Replace Python-side metadata filtering (5x overfetch + in-memory match) with database-side json_extract predicates so LIMIT/OFFSET pagination is exact regardless of match density. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com> * fix(harness): add dialect-aware JsonMatch compiler for type-safe metadata SQL filters Replace SQLAlchemy JSON index/comparator APIs with a custom JsonMatch ColumnElement that compiles to json_type/json_extract on SQLite and jsonb_typeof/->>/-> on PostgreSQL. Tighten key validation regex to single-segment identifiers, handle None/bool/numeric value types with json_type-based discrimination, and strengthen test coverage for edge cases and discriminability. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com> * fix(harness): address Copilot review comments on JSON metadata filters - Use json_typeof instead of jsonb_typeof in PostgreSQL compiler; the metadata_json column is JSON not JSONB so jsonb_typeof would error at runtime on any PostgreSQL backend - Align _is_safe_json_key with json_match's _KEY_CHARSET_RE so keys containing hyphens or leading digits are not silently skipped - Add thread_id as secondary ORDER BY in search() to make pagination deterministic when updated_at values collide; remove asyncio.sleep from the pagination regression test Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix(harness): address remaining review comments on metadata SQL filters - Remove _is_safe_json_key() and reuse json_match ValueError to avoid validator drift (Copilot #3217603895, #3217411616) - Raise ValueError when all metadata keys are rejected so callers never get silent unfiltered results (WillemJiang) - Fix integer precision: split int/float branches, bind int as Integer() with INTEGER/BIGINT CAST instead of float() coercion (Copilot #3217603972) - Fix jsonb_typeof -> json_typeof on JSON column 
(Copilot #3217411579) - Replace manual _cleanup() calls with async yield fixture so teardown always runs (Copilot #3217604019) - Remove asyncio.sleep(0.01) pagination ordering; use thread_id secondary sort instead (Copilot #3217411636) - Add type annotations to _bind/_build_clause/_compile_* and remove EOL comments from _Dialect fields (coding.mdc) - Expand test coverage: boolean/null/mixed-type/large-int precision, partial unsafe-key skip with caplog assertion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(harness): address third-round Copilot review comments on JsonMatch - Reject unsupported value types (list, dict, ...) in JsonMatch.__init__ with TypeError so inherit_cache=True never receives an unhashable value and callers get an explicit error instead of silent str() coercion (Copilot #3217933201) - Upgrade int bindparam from Integer() to BigInteger() to align with BIGINT CAST and avoid overflow on large integers (Copilot #3217933252) - Catch TypeError alongside ValueError in search() so non-string metadata keys are warned and skipped rather than raising unexpectedly (Copilot #3217933300) - Add three tests: json_match rejects unsupported value types, search() warns and raises on non-string key, search() warns and raises on unsupported value type Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(harness): address fourth-round Copilot review comments on JsonMatch - Add CASE WHEN guard for PostgreSQL integer matching: json_typeof returns 'number' for both ints and floats; wrap CAST in CASE with regex guard '^-?[0-9]+$' so float rows never trigger CAST error (Copilot #3218413860) - Validate isinstance(key, str) before regex match in JsonMatch.__init__ so non-string keys raise ValueError consistently instead of TypeError from re.match (Copilot #3218413900) - Include exception message in metadata filter skip warning so callers can distinguish invalid key from unsupported value type (Copilot #3218413924) - Update tests: assert CASE WHEN 
guard in PG int compilation, cover non-string key ValueError in test_json_match_rejects_unsafe_key Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(harness): align ThreadMetaStore.search() signature with sql.py implementation Use `dict[str, Any]` for `metadata` and `list[dict[str, Any]]` as return type in base class and MemoryThreadMetaStore to resolve an LSP signature mismatch; also correct a test docstring that cited the wrong exception type. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(harness): surface InvalidMetadataFilterError as HTTP 400 in search endpoint Replace bare ValueError with a domain-specific InvalidMetadataFilterError (subclass of ValueError) so the Gateway handler can catch it and return HTTP 400 instead of letting it bubble up as a 500. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com> * fix(harness): sanitize metadata keys in log output to prevent log injection Use ascii() instead of %r to escape control characters in client-supplied metadata keys before logging, preventing multiline/forged log entries. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(harness): validate metadata filters at API boundary and dedupe key/value rules - Add Pydantic ``field_validator`` on ``ThreadSearchRequest.metadata`` so unsafe keys / unsupported value types are rejected with HTTP 422 from both SQL and memory backends (closes Copilot review 3218830849). - Export ``validate_metadata_filter_key`` / ``validate_metadata_filter_value`` (and ``ALLOWED_FILTER_VALUE_TYPES``) from ``json_compat`` and have ``JsonMatch.__init__`` reuse them — the Gateway-side validator and the SQL-side ``JsonMatch`` constructor now share one admission rule and cannot drift. 
- Format ``InvalidMetadataFilterError`` rejected-keys list as a comma-separated plain string instead of a Python list repr so the surfaced HTTP 400 detail is readable (closes Copilot review 3218830899). - Update router tests to cover both 422 boundary paths plus the 400 defense-in-depth path when a backend still raises the error. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(harness): harden JsonMatch compile-time key validation against __init__ bypass Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> * fix: address review feedback on metadata filter SQL push-down - Add signed 64-bit range check to validate_metadata_filter_value; give out-of-range ints a distinct TypeError message. - Replace assert guards in _compile_sqlite/_compile_pg with explicit if/raise so they survive python -O optimisation. Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4 <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 23:21:22 +08:00
greatmengqi	f734e14d8b	docs: document auth design and user isolation (#2913) * docs: document auth design and user isolation * docs: align auth docs with current storage and reset behavior --------- Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>	2026-05-12 23:07:11 +08:00
YuJitang	417416087b	fix: use backend thread token usage for header total (#2800) * fix: use backend thread token usage for header total * Refactor thread token usage fetch	2026-05-09 19:40:32 +08:00
ChenglongZ	7a3c58a733	Fix duplicate gateway upload filenames (#2789)	2026-05-09 18:02:40 +08:00
yangzheli	59c4a3f0a4	feat(agent): add custom-agent self-updates with user isolation (#2713 ) * feat(agent): add update_agent tool for in-chat custom-agent self-updates (#2616) Custom agents had no built-in way to persist updates to their own SOUL.md / config.yaml from a normal chat — `setup_agent` was only bound during the bootstrap flow, so when the user asked the agent to refine its description or personality, the agent would shell out via bash/write_file and the edits landed in a temporary sandbox/tool workspace instead of `{base_dir}/agents/{agent_name}/`. Changes: - New `update_agent` builtin tool with partial-update semantics (only the fields you pass are written) and atomic temp-file + os.replace writes so a failed update never corrupts existing SOUL.md / config.yaml. - Lead agent now binds `update_agent` in the non-bootstrap path whenever `agent_name` is set in the runtime context. Default agent (no agent_name) and bootstrap flow are unchanged. - New `<self_update>` system-prompt section is injected for custom agents, instructing them to use `update_agent` — and explicitly NOT bash / write_file — to persist self-updates. - Tests: 11 new cases in `tests/test_update_agent_tool.py` covering validation (missing/invalid agent_name, unknown agent, no fields), partial updates (soul-only, description-only, skills=[] vs omitted), no-op detection, atomic-write safety, and AgentConfig round-tripping; plus 2 new cases in `tests/test_lead_agent_prompt.py` covering the self-update prompt section. - Docs: updated backend/CLAUDE.md builtin tools list and tools.mdx (en/zh) with the new tool description. * feat(agent): isolate custom agents per user Store custom agent definitions under the effective user, keep legacy agents readable until migration, and cover API/tool/migration behavior with tests. 
Co-authored-by: Cursor <cursoragent@cursor.com> * feat: consistent write/delete targets & add --user-id to migration --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 23:17:42 +08:00
Hinotobi	e543bbf5d6	[security] fix(upload): reject symlinked upload destinations (#2623) * fix: reject symlinked upload destinations * test: harden upload destination checks * fix: address PR feedback for #2623 * test: cover safe upload re-uploads * fix: preserve upload limit checks after rebase * fix(upload): stream safe HTTP upload writes	2026-05-02 15:19:28 +08:00
Xinmin Zeng	ca3332f8bf	fix(gateway): return ISO 8601 timestamps from threads endpoints (#2599 ) * fix(gateway): return ISO 8601 timestamps from threads endpoints (#2594) ThreadResponse documents created_at / updated_at as ISO timestamps, matching the LangGraph Platform schema (langgraph_sdk.schema.Thread exposes them as datetime, JSON-encoded as ISO 8601). The gateway threads router was instead emitting str(time.time()) — unix-second floats — breaking frontend new Date() parsing and producing a mixed ISO/unix wire format that also corrupted the search sort order. Centralize timestamp generation in deerflow.utils.time: - now_iso() — datetime.now(UTC).isoformat() - coerce_iso(x) — heals legacy unix-timestamp strings on read so the store converges to ISO without a one-shot migration threads.py: replace 6 time.time() call sites with now_iso(); wrap all read paths and Phase-2 checkpoint metadata with coerce_iso(); _store_upsert opportunistically heals legacy created_at on update; drop unused time import. thread_runs.py: reuse now_iso() instead of a private duplicate _now_iso(), preventing future drift between the two timestamp call sites. Tests: 9 unit tests for the helper; 5 integration tests pinning the ISO contract for create/get/patch/search and the legacy-healing path on the internal store upsert. Full suite: 2144 passed, 15 skipped, 0 failed. Closes #2594 * fix(gateway): coerce checkpoint metadata timestamps to ISO on read After the merge with main, three additional read paths in ``threads.py`` were still emitting raw ``str(metadata.get("created_at", ""))`` — ``get_thread_state``, ``update_thread_state``, and ``get_thread_history``. Same root cause as #2594: when the checkpoint metadata's ``created_at`` is a unix-second float (legacy data, or a checkpoint written by an older Gateway version), ``str(float)`` produces ``"1777252410.411327"`` and the frontend's ``new Date(...)`` returns ``Invalid Date``. 
The fix on the ``/threads/{id}`` GET path was already in place; these three sibling endpoints needed the same treatment. All four call sites now flow through ``coerce_iso``, so: - legacy float metadata heals to ISO on the way out, - ISO metadata passes through unchanged, - ``datetime`` instances (which the new ``coerce_iso`` branch handles explicitly) emit with the ``T`` separator instead of falling through to the space-separated ``str(datetime)`` form. Coverage added for the two endpoints not already pinned by the merge: - ``test_get_thread_state_returns_iso_for_legacy_checkpoint_metadata`` - ``test_get_thread_history_returns_iso_for_legacy_checkpoint_metadata`` Both pre-seed a checkpoint whose metadata carries the literal float from the issue body and assert the wire format is ISO.	2026-05-02 15:16:16 +08:00
KiteEater	17447fccbe	fix(runtime): make rollback restore checkpoint supersede newer checkpoints (#2582) * Restore rollback checkpoints with fresh ids * Tighten rollback checkpoint tests and imports * Update test_run_worker_rollback.py --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-02 11:25:45 +08:00
KiteEater	8939ccaed2	fix(uploads): enforce streaming upload limits in gateway (#2589) * fix: enforce gateway upload limits * fix: acquire sandbox before upload writes * Fix upload limit config wiring * Sanitize upload size error filenames * test: call upload routes unwrapped * fix: guard upload limits endpoint --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-01 20:19:30 +08:00
Xun	1ad1420e31	refactor(skills): Unified skill storage capability (#2613)	2026-05-01 13:23:26 +08:00
Willem Jiang	11afd32459	Fix the log Injection error of skills.py	2026-04-28 21:42:38 +08:00
Willem Jiang	64f4dc1639	fixed the CI build errors	2026-04-28 19:01:36 +08:00
Willem Jiang	844ad8e528	Merge branch 'main' into release/2.0-rc	2026-04-28 15:44:02 +08:00
greatmengqi	e82940c03d	refactor: thread release config through lead path (#2612) Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>	2026-04-28 14:53:18 +08:00
DanielWalnut	707ed328dd	fix(skills): scan skill archives before install (#2561) * fix(skills): scan skill archives before install Fixes #2536 * fix(skills): scan archive support files before install * style(skills): format archive installer * fix(skills): address archive install review comments	2026-04-28 11:56:11 +08:00
Willem Jiang	4e4e4f92a0	fix(security): harden auth system and fix run journal logic bug (#2593 ) * fix(security): harden auth system and fix run journal logic bug - Fix inverted condition in RunJournal.on_chat_model_start that prevented first human message capture (not messages → messages) - Pre-hash passwords with SHA-256 before bcrypt to avoid silent 72-byte truncation vulnerability - Move load_dotenv() from module scope into get_auth_config() to prevent import-time os.environ mutation breaking test isolation - Return generic ‘Invalid token’ instead of exposing specific error variants (expired, malformed, invalid_signature) to clients - Make @require_auth independently enforce 401 instead of silently passing through when AuthMiddleware is absent - Rate-limit /setup-status endpoint with per-IP cooldown to mitigate initialization-state information leak - Document in-process rate limiter limitation for multi-worker deployments * fix(security): return 429+Retry-After on setup-status rate limit, bound cooldown dict Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/070d0be8-99a5-46c8-85bb-6b81b5284021 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * fix(security): add versioned password hashes with auto-migration on login The SHA-256 pre-hash change silently broke verification for any existing bcrypt-only password hashes. Introduce a <N>$ prefix scheme so hashes are self-describing: - v2 (current): bcrypt(b64(sha256(password))) with $ prefix - v1 (legacy): plain bcrypt, prefixed $ or bare (no prefix) verify_password auto-detects the version and falls back to v1 for older hashes. LocalAuthProvider.authenticate() now rehashes legacy hashes to v2 on successful login via needs_rehash(), so existing users upgrade transparently without a dedicated migration step. 
* fix(auth): harden verify_password, best-effort rehash, update require_auth docstring, downgrade journal logging - password.py: wrap bcrypt.checkpw in try/except → return False for malformed/corrupt hashes instead of crashing - local_provider.py: wrap auto-rehash update_user() in try/except so transient DB errors don't fail valid logins - authz.py: update require_auth docstring to reflect independent 401 enforcement - journal.py: downgrade on_chat_model_start from INFO to DEBUG, log only metadata (batch_count, message_counts) instead of full serialized/messages content Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * fix(auth): address code review - narrow ValueError catch, add rehash warning log, rename num_batches Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-04-28 11:34:07 +08:00
Willem Jiang	829e82a9af	fix the lint error in backend	2026-04-26 15:09:25 +08:00
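
The timestamp healing described in commit ca3332f8bf above (`now_iso()` and `coerce_iso()` in `deerflow.utils.time`) can be illustrated with a minimal sketch. Only the two function names and the behaviours listed in the commit message (legacy unix floats heal to ISO, ISO strings pass through, `datetime` instances emit with the `T` separator) come from the log. The function bodies, the empty-value handling, and the fallback for unparseable strings are assumptions and may differ from the repository's actual code.

```python
from datetime import datetime, timezone


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def coerce_iso(value) -> str:
    """Best-effort conversion of a stored timestamp to ISO 8601.

    Hypothetical sketch: the real helper's signature and edge-case
    handling are not shown in the commit log.
    """
    if value is None or value == "":
        return ""  # assumption: missing timestamps stay empty
    if isinstance(value, datetime):
        # isoformat() uses the "T" separator, unlike str(datetime)
        return value.isoformat()
    text = str(value)
    try:
        # Legacy rows stored str(time.time()), e.g. "1777252410.411327"
        seconds = float(text)
        return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
    except (ValueError, OverflowError, OSError):
        # Not a usable unix timestamp: assume it is already ISO and pass it through
        return text


if __name__ == "__main__":
    print(coerce_iso("1777252410.411327"))          # legacy float string
    print(coerce_iso("2026-05-02T15:16:16+08:00"))  # already ISO, unchanged
    print(now_iso())
```

Healing on read rather than in a one-shot migration matches the approach the commit message describes: old records converge to ISO as they are touched, and existing ISO values are never rewritten.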

89 Commits