56d5fa3337cb6ecf54620c2fe09379aedffaa5e7
78 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
56d5fa3337 |
feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134)
* feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930) * feat(persistence): add SQLAlchemy 2.0 async ORM scaffold Introduce a unified database configuration (DatabaseConfig) that controls both the LangGraph checkpointer and the DeerFlow application persistence layer from a single `database:` config section. New modules: - deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends - deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton - deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation Gateway integration initializes/tears down the persistence engine in the existing langgraph_runtime() context manager. Legacy checkpointer config is preserved for backward compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add RunEventStore ABC + MemoryRunEventStore Phase 2-A prerequisite for event storage: adds the unified run event stream interface (RunEventStore) with an in-memory implementation, RunEventsConfig, gateway integration, and comprehensive tests (27 cases). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints Phase 2-B: run persistence + event storage + token tracking. - ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow - RunRepository implements RunStore ABC via SQLAlchemy ORM - ThreadMetaRepository with owner access control - DbRunEventStore with trace content truncation and cursor pagination - JsonlRunEventStore with per-run files and seq recovery from disk - RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events, accumulates token usage by caller type, buffers and flushes to store - RunManager now accepts optional RunStore for persistent backing - Worker creates RunJournal, writes human_message, injects callbacks - Gateway deps use factory functions (RunRepository when DB available) - New endpoints: messages, run messages, run events, token-usage - ThreadCreateRequest gains assistant_id field - 92 tests pass (33 new), zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add user feedback + follow-up run association Phase 2-C: feedback and follow-up tracking. - FeedbackRow ORM model (rating +1/-1, optional message_id, comment) - FeedbackRepository with CRUD, list_by_run/thread, aggregate stats - Feedback API endpoints: create, list, stats, delete - follow_up_to_run_id in RunCreateRequest (explicit or auto-detected from latest successful run on the thread) - Worker writes follow_up_to_run_id into human_message event metadata - Gateway deps: feedback_repo factory + getter - 17 new tests (14 FeedbackRepository + 3 follow-up association) - 109 total tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config - config.example.yaml: deprecate standalone checkpointer section, activate unified database:sqlite as default (drives both checkpointer + app data) - New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage including check_access owner logic, list_by_owner pagination - Extended test_run_repository.py (+4 tests) — completion preserves fields, list ordering desc, limit, owner_none returns all - Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false, middleware no ai_message, unknown caller tokens, convenience fields, tool_error, non-summarization custom event - Extended test_run_event_store.py (+7 tests) — DB batch seq continuity, make_run_event_store factory (memory/db/jsonl/fallback/unknown) - Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists, follow-up metadata, summarization in history, full DB-backed lifecycle - Fixed DB integration test to use proper fake objects (not MagicMock) for JSON-serializable metadata - 157 total Phase 2 tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * config: move default sqlite_dir to .deer-flow/data Keep SQLite databases alongside other DeerFlow-managed data (threads, memory) under the .deer-flow/ directory instead of a top-level ./data folder. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now() - Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM models. Add json_serializer=json.dumps(ensure_ascii=False) to all create_async_engine calls so non-ASCII text (Chinese etc.) is stored as-is in both SQLite and Postgres. - Change ORM datetime defaults from datetime.now(UTC) to datetime.now(), remove UTC imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): simplify deps.py with getter factory + inline repos - Replace 6 identical getter functions with _require() factory. - Inline 3 _make_*_repo() factories into langgraph_runtime(), call get_session_factory() once instead of 3 times. - Add thread_meta upsert in start_run (services.py). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(docker): add UV_EXTRAS build arg for optional dependencies Support installing optional dependency groups (e.g. postgres) at Docker build time via UV_EXTRAS build arg: UV_EXTRAS=postgres docker compose build Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(journal): fix flush, token tracking, and consolidate tests RunJournal fixes: - _flush_sync: retain events in buffer when no event loop instead of dropping them; worker's finally block flushes via async flush(). - on_llm_end: add tool_calls filter and caller=="lead_agent" guard for ai_message events; mark message IDs for dedup with record_llm_usage. - worker.py: persist completion data (tokens, message count) to RunStore in finally block. Model factory: - Auto-inject stream_usage=True for BaseChatOpenAI subclasses with custom api_base, so usage_metadata is populated in streaming responses. Test consolidation: - Delete test_phase2b_integration.py (redundant with existing tests). - Move DB-backed lifecycle test into test_run_journal.py. - Add tests for stream_usage injection in test_model_factory.py. - Clean up executor/task_tool dead journal references. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): widen content type to str|dict in all store backends Allow event content to be a dict (for structured OpenAI-format messages) in addition to plain strings. Dict values are JSON-serialized for the DB backend and deserialized on read; memory and JSONL backends handle dicts natively. Trace truncation now serializes dicts to JSON before measuring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(events): use metadata flag instead of heuristic for dict content detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(converters): add LangChain-to-OpenAI message format converters Pure functions langchain_to_openai_message, langchain_to_openai_completion, langchain_messages_to_openai, and _infer_finish_reason for converting LangChain BaseMessage objects to OpenAI Chat Completions format, used by RunJournal for event storage. 15 unit tests added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(converters): handle empty list content as null, clean up test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): human_message content uses OpenAI user message format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): ai_message uses OpenAI format, add ai_tool_call message event - ai_message content now uses {"role": "assistant", "content": "..."} format - New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls - ai_tool_call uses langchain_to_openai_message converter for consistent format - Both events include finish_reason in metadata ("stop" or "tool_calls") Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add tool_result message event with OpenAI tool message format Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end, then emit a tool_result message event (role=tool, tool_call_id, content) after each successful tool completion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): summary content uses OpenAI system message format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format Add on_chat_model_start to capture structured prompt messages as llm_request events. Replace llm_end trace events with llm_response using OpenAI Chat Completions format. Track llm_call_index to pair request/response events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add record_middleware method for middleware trace events Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(events): add full run sequence integration test for OpenAI content format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): align message events with checkpoint format and add middleware tag injection - Message events (ai_message, ai_tool_call, tool_result, human_message) now use BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages - on_tool_end extracts tool_call_id/name/status from ToolMessage objects - on_tool_error now emits tool_result message events with error status - record_middleware uses middleware:{tag} event_type and middleware category - Summarization custom events use middleware:summarize category - TitleMiddleware injects middleware:title tag via get_config() inheritance - SummarizationMiddleware model bound with middleware:summarize tag - Worker writes human_message using HumanMessage.model_dump() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): switch search endpoint to threads_meta table and sync title - POST /api/threads/search now queries threads_meta table directly, removing the two-phase Store + Checkpointer scan approach - Add ThreadMetaRepository.search() with metadata/status filters - Add ThreadMetaRepository.update_display_name() for title sync - Worker syncs checkpoint title to threads_meta.display_name on run completion - Map display_name to values.title in search response for API compatibility Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): history endpoint reads messages from event store - POST /api/threads/{thread_id}/history now combines two data sources: checkpointer for checkpoint_id, metadata, title, thread_data; event store for messages (complete history, not truncated by summarization) - Strip internal LangGraph metadata keys from response - Remove full channel_values serialization in favor of selective fields Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate optional-dependencies header in pyproject.toml Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(middleware): pass tagged config to TitleMiddleware ainvoke call Without the config, the middleware:title tag was not injected, causing the LLM response to be recorded as a lead_agent ai_message in run_events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve merge conflict in .env.example Keep both DATABASE_URL (from persistence-scaffold) and WECOM credentials (from main) after the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address review feedback on PR #1851 - Fix naive datetime.now() → datetime.now(UTC) in all ORM models - Fix seq race condition in DbRunEventStore.put() with FOR UPDATE and UNIQUE(thread_id, seq) constraint - Encapsulate _store access in RunManager.update_run_completion() - Deduplicate _store.put() logic in RunManager via _persist_to_store() - Add update_run_completion to RunStore ABC + MemoryRunStore - Wire follow_up_to_run_id through the full create path - Add error recovery to RunJournal._flush_sync() lost-event scenario - Add migration note for search_threads breaking change - Fix test_checkpointer_none_fix mock to set database=None Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: update uv.lock Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality Bug fixes: - Sanitize log params to prevent log injection (CodeQL) - Reset threads_meta.status to idle/error when run completes - Attach messages only to latest checkpoint in /history response - Write threads_meta on POST /threads so new threads appear in search Lint fixes: - Remove unused imports (journal.py, migrations/env.py, test_converters.py) - Convert lambda to named function (engine.py, Ruff E731) - Remove unused logger definitions in repos (Ruff F841) - Add logging to JSONL decode errors and empty except blocks - Separate assert side-effects in tests (CodeQL) - Remove unused local variables in tests (Ruff F841) - Fix max_trace_content truncation to use byte length, not char length Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: apply ruff format to persistence and runtime files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Potential fix for pull request finding 'Statement has no effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * refactor(runtime): introduce RunContext to reduce run_agent parameter bloat Extract checkpointer, store, event_store, run_events_config, thread_meta_repo, and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context() in deps.py to build the base context from app.state singletons. start_run() uses dataclasses.replace() to enrich per-run fields before passing ctx to run_agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): move sanitize_log_param to app/gateway/utils.py Extract the log-injection sanitizer from routers/threads.py into a shared utils module and rename to sanitize_log_param (public API). Eliminates the reverse service → router import in services.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: use SQL aggregation for feedback stats and thread token usage Replace Python-side counting in FeedbackRepository.aggregate_by_run with a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread abstract method with SQL GROUP BY implementation in RunRepository and Python fallback in MemoryRunStore. Simplify the thread_token_usage endpoint to delegate to the new method, eliminating the limit=10000 truncation risk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: annotate DbRunEventStore.put() as low-frequency path Add docstring clarifying that put() opens a per-call transaction with FOR UPDATE and should only be used for infrequent writes (currently just the initial human_message event). High-throughput callers should use put_batch() instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(threads): fall back to Store search when ThreadMetaRepository is unavailable When database.backend=memory (default) or no SQL session factory is configured, search_threads now queries the LangGraph Store instead of returning 503. Returns empty list if neither Store nor repo is available. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata Add ThreadMetaStore abstract base class with create/get/search/update/delete interface. ThreadMetaRepository (SQL) now inherits from it. New MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments. deps.py now always provides a non-None thread_meta_repo, eliminating all `if thread_meta_repo is not None` guards in services.py, worker.py, and routers/threads.py. search_threads no longer needs a Store fallback branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(history): read messages from checkpointer instead of RunEventStore The /history endpoint now reads messages directly from the checkpointer's channel_values (the authoritative source) instead of querying RunEventStore.list_messages(). The RunEventStore API is preserved for other consumers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address new Copilot review comments - feedback.py: validate thread_id/run_id before deleting feedback - jsonl.py: add path traversal protection with ID validation - run_repo.py: parse `before` to datetime for PostgreSQL compat - thread_meta_repo.py: fix pagination when metadata filter is active - database_config.py: use resolve_path for sqlite_dir consistency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Implement skill self-evolution and skill_manage flow (#1874) * chore: ignore .worktrees directory * Add skill_manage self-evolution flow * Fix CI regressions for skill_manage * Address PR review feedback for skill evolution * fix(skill-evolution): preserve history on delete * fix(skill-evolution): tighten scanner fallbacks * docs: add skill_manage e2e evidence screenshot * fix(skill-manage): avoid blocking fs ops in session runtime --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> * fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir resolve_path() resolves relative to Paths.base_dir (.deer-flow), which double-nested the path to .deer-flow/.deer-flow/data/app.db. Use Path.resolve() (CWD-relative) instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Feature/feishu receive file (#1608) * feat(feishu): add channel file materialization hook for inbound messages - Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op. - Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text. - Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files. - No impact on Slack/Telegram or other channels (they inherit the default no-op). * style(backend): format code with ruff for lint compliance - Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format` - Ensured both files conform to project linting standards - Fixes CI lint check failures caused by code style issues * fix(feishu): handle file write operation asynchronously to prevent blocking * fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code * test(feishu): add tests for receive_file method and placeholder replacement * fix(manager): remove unnecessary type casting for channel retrieval * fix(feishu): update logging messages to reflect resource handling instead of image * fix(feishu): sanitize filename by replacing invalid characters in file uploads * fix(feishu): improve filename sanitization and reorder image key handling in message processing * fix(feishu): add thread lock to prevent filename conflicts during file downloads * fix(test): correct bad merge in test_feishu_parser.py * chore: run ruff and apply formatting cleanup fix(feishu): preserve rich-text attachment order and improve fallback filename handling * fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915) Two production docker-compose.yaml bugs prevent `make up` from working: 1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH environment overrides. Added in |
||
|
|
d8ecaf46c9 |
feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930)
* feat(persistence): add SQLAlchemy 2.0 async ORM scaffold Introduce a unified database configuration (DatabaseConfig) that controls both the LangGraph checkpointer and the DeerFlow application persistence layer from a single `database:` config section. New modules: - deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends - deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton - deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation Gateway integration initializes/tears down the persistence engine in the existing langgraph_runtime() context manager. Legacy checkpointer config is preserved for backward compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add RunEventStore ABC + MemoryRunEventStore Phase 2-A prerequisite for event storage: adds the unified run event stream interface (RunEventStore) with an in-memory implementation, RunEventsConfig, gateway integration, and comprehensive tests (27 cases). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints Phase 2-B: run persistence + event storage + token tracking. - ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow - RunRepository implements RunStore ABC via SQLAlchemy ORM - ThreadMetaRepository with owner access control - DbRunEventStore with trace content truncation and cursor pagination - JsonlRunEventStore with per-run files and seq recovery from disk - RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events, accumulates token usage by caller type, buffers and flushes to store - RunManager now accepts optional RunStore for persistent backing - Worker creates RunJournal, writes human_message, injects callbacks - Gateway deps use factory functions (RunRepository when DB available) - New endpoints: messages, run messages, run events, token-usage - ThreadCreateRequest gains assistant_id field - 92 tests pass (33 new), zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add user feedback + follow-up run association Phase 2-C: feedback and follow-up tracking. - FeedbackRow ORM model (rating +1/-1, optional message_id, comment) - FeedbackRepository with CRUD, list_by_run/thread, aggregate stats - Feedback API endpoints: create, list, stats, delete - follow_up_to_run_id in RunCreateRequest (explicit or auto-detected from latest successful run on the thread) - Worker writes follow_up_to_run_id into human_message event metadata - Gateway deps: feedback_repo factory + getter - 17 new tests (14 FeedbackRepository + 3 follow-up association) - 109 total tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config - config.example.yaml: deprecate standalone checkpointer section, activate unified database:sqlite as default (drives both checkpointer + app data) - New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage including check_access owner logic, list_by_owner pagination - Extended test_run_repository.py (+4 tests) — completion preserves fields, list ordering desc, limit, owner_none returns all - Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false, middleware no ai_message, unknown caller tokens, convenience fields, tool_error, non-summarization custom event - Extended test_run_event_store.py (+7 tests) — DB batch seq continuity, make_run_event_store factory (memory/db/jsonl/fallback/unknown) - Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists, follow-up metadata, summarization in history, full DB-backed lifecycle - Fixed DB integration test to use proper fake objects (not MagicMock) for JSON-serializable metadata - 157 total Phase 2 tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * config: move default sqlite_dir to .deer-flow/data Keep SQLite databases alongside other DeerFlow-managed data (threads, memory) under the .deer-flow/ directory instead of a top-level ./data folder. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now() - Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM models. Add json_serializer=json.dumps(ensure_ascii=False) to all create_async_engine calls so non-ASCII text (Chinese etc.) is stored as-is in both SQLite and Postgres. - Change ORM datetime defaults from datetime.now(UTC) to datetime.now(), remove UTC imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): simplify deps.py with getter factory + inline repos - Replace 6 identical getter functions with _require() factory. - Inline 3 _make_*_repo() factories into langgraph_runtime(), call get_session_factory() once instead of 3 times. - Add thread_meta upsert in start_run (services.py). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(docker): add UV_EXTRAS build arg for optional dependencies Support installing optional dependency groups (e.g. postgres) at Docker build time via UV_EXTRAS build arg: UV_EXTRAS=postgres docker compose build Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(journal): fix flush, token tracking, and consolidate tests RunJournal fixes: - _flush_sync: retain events in buffer when no event loop instead of dropping them; worker's finally block flushes via async flush(). - on_llm_end: add tool_calls filter and caller=="lead_agent" guard for ai_message events; mark message IDs for dedup with record_llm_usage. - worker.py: persist completion data (tokens, message count) to RunStore in finally block. Model factory: - Auto-inject stream_usage=True for BaseChatOpenAI subclasses with custom api_base, so usage_metadata is populated in streaming responses. Test consolidation: - Delete test_phase2b_integration.py (redundant with existing tests). - Move DB-backed lifecycle test into test_run_journal.py. - Add tests for stream_usage injection in test_model_factory.py. - Clean up executor/task_tool dead journal references. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): widen content type to str|dict in all store backends Allow event content to be a dict (for structured OpenAI-format messages) in addition to plain strings. Dict values are JSON-serialized for the DB backend and deserialized on read; memory and JSONL backends handle dicts natively. Trace truncation now serializes dicts to JSON before measuring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(events): use metadata flag instead of heuristic for dict content detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(converters): add LangChain-to-OpenAI message format converters Pure functions langchain_to_openai_message, langchain_to_openai_completion, langchain_messages_to_openai, and _infer_finish_reason for converting LangChain BaseMessage objects to OpenAI Chat Completions format, used by RunJournal for event storage. 15 unit tests added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(converters): handle empty list content as null, clean up test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): human_message content uses OpenAI user message format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): ai_message uses OpenAI format, add ai_tool_call message event - ai_message content now uses {"role": "assistant", "content": "..."} format - New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls - ai_tool_call uses langchain_to_openai_message converter for consistent format - Both events include finish_reason in metadata ("stop" or "tool_calls") Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add tool_result message event with OpenAI tool message format Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end, then emit a tool_result message event (role=tool, tool_call_id, content) after each successful tool completion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): summary content uses OpenAI system message format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format Add on_chat_model_start to capture structured prompt messages as llm_request events. Replace llm_end trace events with llm_response using OpenAI Chat Completions format. Track llm_call_index to pair request/response events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add record_middleware method for middleware trace events Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(events): add full run sequence integration test for OpenAI content format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): align message events with checkpoint format and add middleware tag injection - Message events (ai_message, ai_tool_call, tool_result, human_message) now use BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages - on_tool_end extracts tool_call_id/name/status from ToolMessage objects - on_tool_error now emits tool_result message events with error status - record_middleware uses middleware:{tag} event_type and middleware category - Summarization custom events use middleware:summarize category - TitleMiddleware injects middleware:title tag via get_config() inheritance - SummarizationMiddleware model bound with middleware:summarize tag - Worker writes human_message using HumanMessage.model_dump() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): switch search endpoint to threads_meta table and sync title - POST /api/threads/search now queries threads_meta table directly, removing the two-phase Store + Checkpointer scan approach - Add ThreadMetaRepository.search() with metadata/status filters - Add ThreadMetaRepository.update_display_name() for title sync - Worker syncs checkpoint title to threads_meta.display_name on run completion - Map display_name to values.title in search response for API compatibility Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): history endpoint reads messages from event store - POST /api/threads/{thread_id}/history now combines two data sources: checkpointer for checkpoint_id, metadata, title, thread_data; event store for messages (complete history, not truncated by summarization) - Strip internal LangGraph metadata keys from response - Remove full channel_values serialization in favor of selective fields Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate optional-dependencies header in pyproject.toml Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(middleware): pass tagged config to TitleMiddleware ainvoke call Without the config, the middleware:title tag was not injected, causing the LLM response to be recorded as a lead_agent ai_message in run_events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve merge conflict in .env.example Keep both DATABASE_URL (from persistence-scaffold) and WECOM credentials (from main) after the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address review feedback on PR #1851 - Fix naive datetime.now() → datetime.now(UTC) in all ORM models - Fix seq race condition in DbRunEventStore.put() with FOR UPDATE and UNIQUE(thread_id, seq) constraint - Encapsulate _store access in RunManager.update_run_completion() - Deduplicate _store.put() logic in RunManager via _persist_to_store() - Add update_run_completion to RunStore ABC + MemoryRunStore - Wire follow_up_to_run_id through the full create path - Add error recovery to RunJournal._flush_sync() lost-event scenario - Add migration note for search_threads breaking change - Fix test_checkpointer_none_fix mock to set database=None Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: update uv.lock Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality Bug fixes: - Sanitize log params to prevent log injection (CodeQL) - Reset threads_meta.status to idle/error when run completes - Attach messages only to latest checkpoint in /history response - Write threads_meta on POST /threads so new threads appear in search Lint fixes: - Remove unused imports (journal.py, migrations/env.py, test_converters.py) - Convert lambda to named function (engine.py, Ruff E731) - Remove unused logger definitions in repos (Ruff F841) - Add logging to JSONL decode errors and empty except blocks - Separate assert side-effects in tests (CodeQL) - Remove unused local variables in tests (Ruff F841) - Fix max_trace_content truncation to use byte length, not char length Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: apply ruff format to persistence and runtime files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Potential fix for pull request finding 'Statement has no effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * refactor(runtime): introduce RunContext to reduce run_agent parameter bloat Extract checkpointer, store, event_store, run_events_config, thread_meta_repo, and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context() in deps.py to build the base context from app.state singletons. start_run() uses dataclasses.replace() to enrich per-run fields before passing ctx to run_agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): move sanitize_log_param to app/gateway/utils.py Extract the log-injection sanitizer from routers/threads.py into a shared utils module and rename to sanitize_log_param (public API). Eliminates the reverse service → router import in services.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: use SQL aggregation for feedback stats and thread token usage Replace Python-side counting in FeedbackRepository.aggregate_by_run with a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread abstract method with SQL GROUP BY implementation in RunRepository and Python fallback in MemoryRunStore. Simplify the thread_token_usage endpoint to delegate to the new method, eliminating the limit=10000 truncation risk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: annotate DbRunEventStore.put() as low-frequency path Add docstring clarifying that put() opens a per-call transaction with FOR UPDATE and should only be used for infrequent writes (currently just the initial human_message event). High-throughput callers should use put_batch() instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(threads): fall back to Store search when ThreadMetaRepository is unavailable When database.backend=memory (default) or no SQL session factory is configured, search_threads now queries the LangGraph Store instead of returning 503. Returns empty list if neither Store nor repo is available. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata Add ThreadMetaStore abstract base class with create/get/search/update/delete interface. ThreadMetaRepository (SQL) now inherits from it. New MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments. deps.py now always provides a non-None thread_meta_repo, eliminating all `if thread_meta_repo is not None` guards in services.py, worker.py, and routers/threads.py. search_threads no longer needs a Store fallback branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(history): read messages from checkpointer instead of RunEventStore The /history endpoint now reads messages directly from the checkpointer's channel_values (the authoritative source) instead of querying RunEventStore.list_messages(). The RunEventStore API is preserved for other consumers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address new Copilot review comments - feedback.py: validate thread_id/run_id before deleting feedback - jsonl.py: add path traversal protection with ID validation - run_repo.py: parse `before` to datetime for PostgreSQL compat - thread_meta_repo.py: fix pagination when metadata filter is active - database_config.py: use resolve_path for sqlite_dir consistency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Implement skill self-evolution and skill_manage flow (#1874) * chore: ignore .worktrees directory * Add skill_manage self-evolution flow * Fix CI regressions for skill_manage * Address PR review feedback for skill evolution * fix(skill-evolution): preserve history on delete * fix(skill-evolution): tighten scanner fallbacks * docs: add skill_manage e2e evidence screenshot * fix(skill-manage): avoid blocking fs ops in session runtime --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> * fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir resolve_path() resolves relative to Paths.base_dir (.deer-flow), which double-nested the path to .deer-flow/.deer-flow/data/app.db. Use Path.resolve() (CWD-relative) instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Feature/feishu receive file (#1608) * feat(feishu): add channel file materialization hook for inbound messages - Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op. - Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text. - Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files. - No impact on Slack/Telegram or other channels (they inherit the default no-op). * style(backend): format code with ruff for lint compliance - Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format` - Ensured both files conform to project linting standards - Fixes CI lint check failures caused by code style issues * fix(feishu): handle file write operation asynchronously to prevent blocking * fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code * test(feishu): add tests for receive_file method and placeholder replacement * fix(manager): remove unnecessary type casting for channel retrieval * fix(feishu): update logging messages to reflect resource handling instead of image * fix(feishu): sanitize filename by replacing invalid characters in file uploads * fix(feishu): improve filename sanitization and reorder image key handling in message processing * fix(feishu): add thread lock to prevent filename conflicts during file downloads * fix(test): correct bad merge in test_feishu_parser.py * chore: run ruff and apply formatting cleanup fix(feishu): preserve rich-text attachment order and improve fallback filename handling * fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915) Two production docker-compose.yaml bugs prevent `make up` from working: 1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH environment overrides. Added in |
||
|
|
b970993425 |
fix: read lead agent options from context (#2515)
* fix: read lead agent options from context * fix: validate runtime context config |
||
|
|
ec8a8cae38 |
fix: gate deferred MCP tool execution (#2513)
* fix: gate deferred MCP tool execution * style: format deferred tool middleware * fix: address deferred tool review feedback |
||
|
|
d78ed5c8f2 | fix: inherit subagent skill allowlists (#2514) | ||
|
|
f9ff3a698d |
fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458)
* fix(middelware): narrow skill rescue to skill-related tool outputs * fix(summarization): address skill rescue review feedback * fix: wire summarization skill rescue config * fix: remove dead skill tool helper * fix(lint): fix format --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
11f557a2c6 |
feat(trace):Add run_name to the trace info for system agents. (#2492)
* feat(trace): Add `run_name` to the trace info for suggestions and memory. before(in langsmith): CodexChatModel CodexChatModel lead_agent after: suggest_agent memory_agent lead_agent feat(trace): Add `run_name` to the trace info for suggestions and memory. before(in langsmith): CodexChatModel CodexChatModel lead_agent after: suggest_agent memory_agent lead_agent * feat(trace): Add `run_name` to the trace info for system agents. before(in langsmith): CodexChatModel CodexChatModel CodexChatModel CodexChatModel lead_agent after: suggest_agent title_agent security_agent memory_agent lead_agent * chore(code format):code format --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
30d619de08 |
feat(subagents): support per-subagent skill loading and custom subagent types (#2253)
* feat(subagents): support per-subagent skill loading and custom subagent types (#2230) Add per-subagent skill configuration and custom subagent type registration, aligned with Codex's role-based config layering and per-session skill injection. Backend: - SubagentConfig gains `skills` field (None=all, []=none, list=whitelist) - New CustomSubagentConfig for user-defined subagent types in config.yaml - SubagentsAppConfig gains `custom_agents` section and `get_skills_for()` - Registry resolves custom agents with three-layer config precedence - SubagentExecutor loads skills per-session as conversation items (Codex pattern) - task_tool no longer appends skills to system_prompt - Lead agent system prompt dynamically lists all registered subagent types - setup_agent tool accepts optional skills parameter - Gateway agents API transparently passes skills in CRUD operations Frontend: - Agent/CreateAgentRequest/UpdateAgentRequest types include skills field - Agent card displays skills as badges alongside tool_groups Config: - config.example.yaml documents custom_agents and per-agent skills override Tests: - 40 new tests covering all skill config, custom agents, and registry logic - Existing tests updated for new get_skills_prompt_section signature Closes #2230 * fix: address review feedback on skills PR - Remove stale get_skills_prompt_section monkeypatches from test_task_tool_core_logic.py (task_tool no longer imports this function after skill injection moved to executor) - Add key prefixes (tg:/sk:) to agent-card badges to prevent React key collisions between tool_groups and skills * fix(ci): resolve lint and test failures - Format agent-card.tsx with prettier (lint-frontend) - Remove stale "Skills Appendix" system_prompt assertion — skills are now loaded per-session by SubagentExecutor, not appended to system_prompt * fix(ci): sort imports in test_subagent_skills_config.py (ruff I001) * fix(ci): use nullish coalescing in agent-card badge condition (eslint) * fix: address review feedback on skills PR - Use model_fields_set in AgentUpdateRequest to distinguish "field omitted" from "explicitly set to null" — fixes skills=None ambiguity where None means "inherit all" but was treated as "don't change" - Move lazy import of get_subagent_config outside loop in _build_available_subagents_description to avoid repeated import overhead --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
5ba1dacf25 |
fix: rename present_file to present_files in docs and prompts (#2393)
The tool is registered as `present_files` (plural) in present_file_tool.py, but four references in documentation and prompt strings incorrectly used the singular form `present_file`. This could cause confusion and potentially lead to incorrect tool invocations. Changed files: - backend/docs/GUARDRAILS.md - backend/docs/ARCHITECTURE.md - backend/packages/harness/deerflow/agents/lead_agent/prompt.py (2 occurrences) |
||
|
|
a62ca5dd47 |
fix: Catch httpx.ReadError in the error handling (#2309)
* fix: Catch httpx.ReadError in the error handling * fix |
||
|
|
f514e35a36 | fix(backend): make clarification messages idempotent (#2350) (#2351) | ||
|
|
55474011c9 |
fix(subagent): inherit parent agent's tool_groups in task_tool (#2305)
* fix(subagent): inherit parent agent's tool_groups in task_tool
When a custom agent defines tool_groups (e.g. [file:read, file:write, bash]),
the restriction is correctly applied to the lead agent. However, when the lead
agent delegates work to a subagent via the task tool, get_available_tools() is
called without the groups parameter, causing the subagent to receive ALL tools
(including web_search, web_fetch, image_search, etc.) regardless of the parent
agent's configuration.
This fix propagates tool_groups through run metadata so that task_tool passes
the same group filter when building the subagent's tool set.
Changes:
- agent.py: include tool_groups in run metadata
- task_tool.py: read tool_groups from metadata and pass to get_available_tools()
* fix: initialize metadata before conditional block and update tests for tool_groups propagation
- Initialize metadata = {} before the 'if runtime is not None' block to
avoid Ruff F821 (possibly-undefined variable) and simplify the
parent_tool_groups expression.
- Update existing test assertion to expect groups=None in
get_available_tools call signature.
- Add 3 new test cases:
- test_task_tool_propagates_tool_groups_to_subagent
- test_task_tool_no_tool_groups_passes_none
- test_task_tool_runtime_none_passes_groups_none
|
||
|
|
898f4e8ac2 |
fix: Memory update system has cache corruption, data loss, and thread-safety bugs (#2251)
* fix(memory): cache corruption, thread-safety, and caller mutation bugs
Bug 1 (updater.py): deep-copy current_memory before passing to
_apply_updates() so a subsequent save() failure cannot leave a
partially-mutated object in the storage cache.
Bug 3 (storage.py): add _cache_lock (threading.Lock) to
FileMemoryStorage and acquire it around every read/write of
_memory_cache, fixing concurrent-access races between the background
timer thread and HTTP reload calls.
Bug 4 (storage.py): replace in-place mutation
memory_data["lastUpdated"] = ...
with a shallow copy
memory_data = {**memory_data, "lastUpdated": ...}
so save() no longer silently modifies the caller's dict.
Regression tests added for all three bugs in test_memory_storage.py
and test_memory_updater.py.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: format test_memory_updater.py with ruff
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: remove stale bug-number labels from code comments and docstrings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
a664d2f5c4 |
fix(checkpointer): create parent directory before opening SQLite in sync provider (#2272)
* fix(checkpointer): create parent directory before opening SQLite in sync provider
The sync checkpointer factory (_sync_checkpointer_cm) opens a SQLite
connection without first ensuring the parent directory exists. The async
provider and both store providers already call ensure_sqlite_parent_dir(),
but this call was missing from the sync path.
When the deer-flow harness package is used from an external virtualenv
(where the .deer-flow directory is not pre-created), the missing parent
directory causes:
sqlite3.OperationalError: unable to open database file
Add the missing ensure_sqlite_parent_dir() call in the sync SQLite
branch, consistent with the async provider, and add a regression test.
Closes #2259
* style: fix ruff format + add call-order assertion for ensure_parent_dir
- Fix formatting in test_checkpointer.py (ruff format)
- Add test_sqlite_ensure_parent_dir_before_connect to verify
ensure_sqlite_parent_dir is called before from_conn_string
(addresses Copilot review suggestion)
---------
Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
|
||
|
|
2176b2bbfc |
fix: validate bootstrap agent names before filesystem writes (#2274)
* fix: validate bootstrap agent names before filesystem writes * fix: tighten bootstrap agent-name validation |
||
|
|
8760937439 |
fix(memory): use asyncio.to_thread for blocking file I/O in aupdate_memory (#2220)
* fix(memory): use asyncio.to_thread for blocking file I/O in aupdate_memory `_finalize_update` performs synchronous blocking operations (os.mkdir, file open/write/rename/stat) that were called directly from the async `aupdate_memory` method, causing `BlockingError` from blockbuster when running under an ASGI server. Wrap the call with `asyncio.to_thread` to offload all blocking I/O to a thread pool. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(memory): use unique temp filename to prevent concurrent write collision `file_path.with_suffix(".tmp")` produces a fixed path — concurrent saves for the same agent (now possible after wrapping _finalize_update in asyncio.to_thread) would clobber the same temp file. Use a UUID-suffixed temp file so each write is isolated. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(memory): also offload _prepare_update_prompt to thread pool FileMemoryStorage.load() inside _prepare_update_prompt performs synchronous stat() and file read, blocking the event loop just like _finalize_update did. Wrap _prepare_update_prompt in asyncio.to_thread for the same reason. The async path now has no blocking file I/O on the event loop: to_thread(_prepare_update_prompt) → await model.ainvoke() → to_thread(_finalize_update) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
4ba3167f48 |
feat: flush memory before summarization (#2176)
* feat: flush memory before summarization * fix: keep agent-scoped memory on summarization flush * fix: harden summarization hook plumbing * fix: address summarization review feedback * style: format memory middleware |
||
|
|
e4f896e90d |
fix(todo-middleware): prevent premature agent exit with incomplete todos (#2135)
* fix(todo-middleware): prevent premature agent exit with incomplete todos When plan mode is active (is_plan_mode=True), the agent occasionally exits the loop and outputs a final response while todo items are still incomplete. This happens because the routing edge only checks for tool_calls, not todo completion state. Fixes #2112 Add an after_model override to TodoMiddleware with @hook_config(can_jump_to=["model"]). When the model produces a response with no tool calls but there are still incomplete todos, the middleware injects a todo_completion_reminder HumanMessage and returns jump_to=model to force another model turn. A cap of 2 reminders prevents infinite loops when the agent cannot make further progress. Also adds _completion_reminder_count() helper and 14 new unit tests covering all edge cases of the new after_model / aafter_model logic. * Remove unnecessary blank line in test file * Fix runtime argument annotation in before_model * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: octo-patch <octo-patch@github.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
07fc25d285 |
feat: switch memory updater to async LLM calls (#2138)
* docs: mark memory updater async migration as completed - Update TODO.md to mark the replacement of sync model.invoke() with async model.ainvoke() in title_middleware and memory updater as completed using [x] format Addresses #2131 * feat: switch memory updater to async LLM calls - Add async aupdate_memory() method using await model.ainvoke() - Convert sync update_memory() to use async wrapper - Add _run_async_update_sync() for nested loop context handling - Maintain backward compatibility with existing sync API - Add ThreadPoolExecutor for async execution from sync contexts Addresses #2131 * test: add tests for async memory updater - Add test_async_update_memory_uses_ainvoke() to verify async path - Convert existing tests to use AsyncMock and ainvoke assertions - Add test_sync_update_memory_wrapper_works_in_running_loop() - Update all model mocks to use async await patterns Addresses #2131 * fix: apply ruff formatting to memory updater - Format multi-line expressions to single line - Ensure code style consistency with project standards - Fix lint issues caught by GitHub Actions * test: add comprehensive tests for async memory updater - Add test_async_update_memory_uses_ainvoke() to verify async path - Convert existing tests to use AsyncMock and ainvoke assertions - Add test_sync_update_memory_wrapper_works_in_running_loop() - Update all model mocks to use async await patterns - Ensure backward compatibility with sync API * fix: satisfy ruff formatting in memory updater test --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
c91785dd68 |
fix(title): strip <think> tags from title model responses and assistant context (#1927)
* fix(title): strip <think> tags from title model responses and assistant context Reasoning models (e.g. minimax M2.7, DeepSeek-R1) emit <think>...</think> blocks before their actual output. When such a model is used as the title model (or as the main agent), the raw thinking content leaked into the thread title stored in state, so the chat list showed the internal monologue instead of a meaningful title. Fixes #1884 - Add `_strip_think_tags()` helper using a regex to remove all <think>...</think> blocks - Apply it in `_parse_title()` so the title model response is always clean - Apply it to the assistant message in `_build_title_prompt()` so thinking content from the first AI turn is not fed back to the title model - Add four new unit tests covering: stripping in parse, think-only response, assistant prompt stripping, and end-to-end async flow with think tags * Fix the lint error --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
5db71cb68c |
fix(middleware): repair dangling tool-call history after loop interru… (#2035)
* fix(middleware): repair dangling tool-call history after loop interruption (#2029) * docs(backend): fix middleware chain ordering --------- Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com> |
||
|
|
4d4ddb3d3f | feat(llm): introduce lightweight circuit breaker to prevent rate-limit bans and resource exhaustion (#2095) | ||
|
|
5b633449f8 |
fix(middleware): add per-tool-type frequency detection to LoopDetectionMiddleware (#1988)
* fix(middleware): add per-tool-type frequency detection to LoopDetectionMiddleware The existing hash-based loop detection only catches identical tool call sets. When the agent calls the same tool type (e.g. read_file) on many different files, each call produces a unique hash and bypasses detection. This causes the agent to exhaust recursion_limit, consuming 150K-225K tokens per failed run. Add a second detection layer that tracks cumulative call counts per tool type per thread. Warns at 30 calls (configurable) and forces stop at 50. The hard stop message now uses the actual returned message instead of a hardcoded constant, so both hash-based and frequency-based stops produce accurate diagnostics. Also fix _apply() to use the warning message returned by _track_and_check() for hard stops, instead of always using _HARD_STOP_MSG. Closes #1987 * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(lint): remove unused imports and fix line length - Remove unused _TOOL_FREQ_HARD_STOP_MSG and _TOOL_FREQ_WARNING_MSG imports from test file (F401) - Break long _TOOL_FREQ_WARNING_MSG string to fit within 240 char limit (E501) * style: apply ruff format * test: add LRU eviction and per-thread reset coverage for frequency state Address review feedback from @WillemJiang: - Verify _tool_freq and _tool_freq_warned are cleaned on LRU eviction - Add test for reset(thread_id=...) clearing only the target thread's frequency state while leaving others intact * fix(makefile): route Windows shell-script targets through Git Bash (#2060) --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Asish Kumar <87874775+officialasishkumar@users.noreply.github.com> |
||
|
|
02569136df |
fix(sandbox): improve sandbox security and preserve multimodal content (#2114)
* fix: improve sandbox security and preserve multimodal content * Add unit test modifications for test_injects_uploaded_files_tag_into_list_content * format updated_content * Add regression tests for multimodal upload content and host bash default safety |
||
|
|
eef0a6e2da | feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) | ||
|
|
563383c60f |
fix(agent): file-io path guidance in agent prompts (#2019)
* fix(prompt): guide workspace-relative file io * Clarify bash agent file IO path guidance |
||
|
|
ad6d934a5f |
fix(middleware): handle string-serialized options in ClarificationMiddleware (#1997)
* fix(middleware): handle string-serialized options in ClarificationMiddleware (#1995) Some models (e.g. Qwen3-Max) serialize array tool parameters as JSON strings instead of native arrays. Add defensive type checking in _format_clarification_message() to deserialize string options before iteration, preventing per-character rendering. * fix(middleware): normalize options after JSON deserialization Address Copilot review feedback: - Add post-deserialization normalization so options is always a list (handles json.loads returning a scalar string, dict, or None) - Add test for JSON-encoded scalar string ("development") - Fix test_json_string_with_mixed_types to use actual mixed types |
||
|
|
29817c3b34 |
fix(backend): use timezone-aware UTC in memory modules (fix pytest DeprecationWarnings) (#1992)
* fix(backend): use timezone-aware UTC in memory modules Replace datetime.utcnow() with datetime.now(timezone.utc) and a shared utc_now_iso_z() helper so persisted ISO timestamps keep the trailing Z suffix without triggering Python 3.12+ deprecation warnings. Made-with: Cursor * refactor(backend): use removesuffix for utc_now_iso_z suffix Makes the +00:00 -> Z transform explicit for the trailing offset only (Copilot review on PR #1992). Made-with: Cursor * style(backend): satisfy ruff UP017 with datetime.UTC in memory queue Made-with: Cursor --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
c3170f22da |
fix(backend): make loop detection hash tool calls by stable keys (#1911)
* fix(backend): make loop detection hash tool calls by stable keys The loop detection middleware previously hashed full tool call arguments, which made repeated calls look different when only non-essential argument details changed. In particular, `read_file` calls with nearby line ranges could bypass repetition detection even when the agent was effectively reading the same file region again and again. - Hash tool calls using stable keys instead of the full raw args payload - Bucket `read_file` line ranges so nearby reads map to the same region key - Prefer stable identifiers such as `path`, `url`, `query`, or `command` before falling back to JSON serialization of args - Keep hashing order-independent so the same tool call set produces the same hash regardless of call order Fixes #1905 * fix(backend): harden loop detection hash normalization - Normalize and parse stringified tool args defensively - Expand stable key derivation to include pattern, glob, and cmd - Normalize reversed read_file ranges before bucketing Fixes #1905 * fix(backend): harden loop detection tool format * exclude write_file and str_replace from the stable-key path — writing different content to the same file shouldn't be flagged. --------- Co-authored-by: JeffJiang <for-eleven@hotmail.com> |
||
|
|
3b3e8e1b0b |
feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881)
* fix(sandbox): strengthen regex coverage in SandboxAuditMiddleware
Expand high-risk patterns from 6 to 13 and medium-risk from 4 to 6,
closing several bypass vectors identified by cross-referencing Claude
Code's BashSecurity validator chain against DeerFlow's threat model.
High-risk additions:
- Generalised pipe-to-sh (replaces narrow curl|sh rule)
- Targeted command substitution ($() / backtick with dangerous executables)
- base64 decode piped to execution
- Overwrite system binaries (/usr/bin/, /bin/, /sbin/)
- Overwrite shell startup files (~/.bashrc, ~/.profile, etc.)
- /proc/*/environ leakage
- LD_PRELOAD / LD_LIBRARY_PATH hijack
- /dev/tcp/ bash built-in networking
Medium-risk additions:
- sudo/su (no-op under Docker root, warn only)
- PATH= modification (long attack chain, warn only)
Design decisions:
- Command substitution uses targeted matching (curl/wget/bash/sh/python/
ruby/perl/base64) rather than blanket block to avoid false positives
on safe usage like $(date) or `whoami`.
- Skipped encoding/obfuscation checks (hex, octal, Unicode homoglyphs)
as ROI is low in Docker sandbox — LLMs don't generate encoded commands
and container isolation bounds the blast radius.
- Merged pip/pip3 into single pip3? pattern.
* feat(sandbox): compound command splitting and fork bomb detection
Split compound bash commands (&&, ||, ;) into sub-commands and classify
each independently — prevents dangerous commands hidden after safe
prefixes (e.g. "cd /workspace && rm -rf /") from bypassing detection.
- Add _split_compound_command() with shlex quote-aware splitting
- Add fork bomb detection patterns (classic and while-loop variants)
- Most severe verdict wins; block short-circuits
- 15 new tests covering compound commands, splitting, and fork bombs
* test(sandbox): add async tests for fork bomb and compound commands
Cover awrap_tool_call path for fork bomb detection (3 variants) and
compound command splitting (block/warn/pass scenarios).
* fix(sandbox): address Copilot review — no-whitespace operators, >>/etc/, whole-command scan
- _split_compound_command: replace shlex-based implementation with a
character-by-character quote/escape-aware scanner. shlex.split only
separates '&&' / '||' / ';' when they are surrounded by whitespace,
so payloads like 'rm -rf /&&echo ok' or 'safe;rm -rf /' bypassed the
previous splitter and therefore the per-sub-command classifier.
- _HIGH_RISK_PATTERNS: change r'>\s*/etc/' to r'>+\s*/etc/' so append
redirection ('>>/etc/hosts') is also blocked.
- _classify_command: run a whole-command high-risk scan *before*
splitting. Structural attacks like 'while true; do bash & done'
span multiple shell statements — splitting on ';' destroys the
pattern context, so the raw command must be scanned first.
- tests: add no-whitespace operator cases to TestSplitCompoundCommand
and test_compound_command_classification to lock in the bypass fix.
|
||
|
|
7643a46fca |
fix(skill): make skill prompt cache refresh nonblocking (#1924)
* fix: make skill prompt cache refresh nonblocking * fix: harden skills prompt cache refresh * chore: add timeout to skills cache warm-up |
||
|
|
c4da0e8ca9 |
Move async SQLite mkdir off the event loop (#1921)
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com> |
||
|
|
888f7bfb9d |
Implement skill self-evolution and skill_manage flow (#1874)
* chore: ignore .worktrees directory * Add skill_manage self-evolution flow * Fix CI regressions for skill_manage * Address PR review feedback for skill evolution * fix(skill-evolution): preserve history on delete * fix(skill-evolution): tighten scanner fallbacks * docs: add skill_manage e2e evidence screenshot * fix(skill-manage): avoid blocking fs ops in session runtime --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
055e4df049 |
fix(sandbox): add input sanitisation guard to SandboxAuditMiddleware (#1872)
* fix(sandbox): add L2 input sanitisation to SandboxAuditMiddleware Add _validate_input() to reject malformed bash commands before regex classification: empty commands, oversized commands (>10 000 chars), and null bytes that could cause detection/execution layer inconsistency. * fix(sandbox): address Copilot review — type guard, log truncation, reject reason - Coerce None/non-string command to str before validation - Truncate oversized commands in audit logs to prevent log amplification - Propagate reject_reason through _pre_process() to block message - Remove L2 label from comments and test class names * fix(sandbox): isinstance type guard + async input sanitisation tests Address review comments: - Replace str() coercion with isinstance(raw_command, str) guard so non-string truthy values (0, [], False) fall back to empty string instead of passing validation as "0"/"[]"/"False". - Add TestInputSanitisationBlocksInAwrapToolCall with 4 async tests covering empty, null-byte, oversized, and None command via awrap_tool_call path. |
||
|
|
1ced6e977c |
fix(backend): preserve viewed image reducer metadata (#1900)
Fix concurrent viewed_images state updates for multi-image input by preserving the reducer metadata in the vision middleware state schema. |
||
|
|
8049785de6 |
fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804)
* fix(memory): case-insensitive fact deduplication and positive reinforcement detection Two fixes to the memory system: 1. _fact_content_key() now lowercases content before comparison, preventing semantically duplicate facts like "User prefers Python" and "user prefers python" from being stored separately. 2. Adds detect_reinforcement() to MemoryMiddleware (closes #1719), mirroring detect_correction(). When users signal approval ("yes exactly", "perfect", "完全正确", etc.), the memory updater now receives reinforcement_detected=True and injects a hint prompting the LLM to record confirmed preferences and behaviors with high confidence. Changes across the full signal path: - memory_middleware.py: _REINFORCEMENT_PATTERNS + detect_reinforcement() - queue.py: reinforcement_detected field in ConversationContext and add() - updater.py: reinforcement_detected param in update_memory() and update_memory_from_conversation(); builds reinforcement_hint alongside the existing correction_hint Tests: 11 new tests covering deduplication, hint injection, and signal detection (Chinese + English patterns, window boundary, conflict with correction). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(memory): address Copilot review comments on reinforcement detection - Tighten _REINFORCEMENT_PATTERNS: remove 很好, require punctuation/end-of-string boundaries on remaining patterns, split this-is-good into stricter variants - Suppress reinforcement_detected when correction_detected is true to avoid mixed-signal noise - Use casefold() instead of lower() for Unicode-aware fact deduplication - Add missing test coverage for reinforcement_detected OR merge and forwarding in queue --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
2a150f5d4a |
fix: unblock concurrent threads and workspace hydration (#1839)
* fix: unblock concurrent threads and workspace hydration * fix: restore async title generation * fix: address PR review feedback * style: format lead agent prompt |
||
|
|
163121d327 |
fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838)
* feat(uploads): guide agent to use grep/glob/read_file for uploaded documents Add workflow guidance to the <uploaded_files> context block so the agent knows to use grep and glob (added in #1784) alongside read_file when working with uploaded documents, rather than falling back to web search. This is the final piece of the three-PR PDF agentic search pipeline: - PR1 (#1727): pymupdf4llm converter produces structured Markdown with headings - PR2 (#1738): document outline injected into agent context with line numbers - PR3 (this): agent guided to use outline + grep + read_file workflow * feat(uploads): add file-first priority and fallback guidance to uploaded_files context * fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline - Add _clean_bold_title() to merge adjacent bold spans (** **) produced by pymupdf4llm when bold text crosses span boundaries - Add _SPLIT_BOLD_HEADING_RE (Style 3) to recognise **<num>** **<title>** headings common in academic papers; excludes pure-number table headers and rows with more than 4 bold blocks - When outline is empty, read first 5 non-empty lines of the .md as a content preview and surface a grep hint in the agent context - Update _format_file_entry to render the preview + grep hint instead of silently omitting the outline section - Add 3 new extract_outline tests and 2 new middleware tests (65 total) * fix(uploads): address Copilot review comments on extract_outline regex - Replace ASCII [A-Za-z] guard with negative lookahead to support non-ASCII titles (e.g. **1** **概述**); pure-numeric/punctuation blocks still excluded - Replace .+ with [^*]+ and cap repetition at {0,2} (four blocks total) to keep _SPLIT_BOLD_HEADING_RE linear and avoid ReDoS on malformed input - Remove now-redundant len(blocks) <= 4 code-level check (enforced by regex) - Log debug message with exc_info when preview extraction fails |
||
|
|
bbd0866374 |
feat(uploads): guide agent using agentic search for uploaded documents (#1816)
* feat(uploads): guide agent to use grep/glob/read_file for uploaded documents Add workflow guidance to the <uploaded_files> context block so the agent knows to use grep and glob (added in #1784) alongside read_file when working with uploaded documents, rather than falling back to web search. This is the final piece of the three-PR PDF agentic search pipeline: - PR1 (#1727): pymupdf4llm converter produces structured Markdown with headings - PR2 (#1738): document outline injected into agent context with line numbers - PR3 (this): agent guided to use outline + grep + read_file workflow * feat(uploads): add file-first priority and fallback guidance to uploaded_files context |
||
|
|
db82b59254 |
fix(middleware): handle list-type AIMessage.content in LoopDetectionMiddleware (#1823)
* fix: inject longTermBackground into memory prompt
The format_memory_for_injection function only processed recentMonths and
earlierContext from the history section, silently dropping longTermBackground.
The LLM writes longTermBackground correctly and it persists to memory.json,
but it was never injected into the system prompt — making the user's
long-term background invisible to the AI.
Add the missing field handling and a regression test.
* fix(middleware): handle list-type AIMessage.content in LoopDetectionMiddleware
LangChain AIMessage.content can be str | list. When using providers that
return structured content blocks (e.g. Anthropic thinking mode, certain
OpenAI-compatible gateways), content is a list of dicts like
[{"type": "text", "text": "..."}].
The hard_limit branch in _apply() concatenated content with a string via
(last_msg.content or "") + f"\n\n{_HARD_STOP_MSG}", which raises
TypeError when content is a non-empty list (list + str is invalid).
Add _append_text() static method that:
- Returns the text directly when content is None
- Appends a {"type": "text"} block when content is a list
- Falls back to string concatenation when content is a str
This is consistent with how other modules in the project already handle
list content (client.py._extract_text, memory_middleware, executor.py).
* test(middleware): add unit tests for _append_text and list content hard stop
Add regression tests to verify LoopDetectionMiddleware handles list-type
AIMessage.content correctly during hard stop:
- TestAppendText: unit tests for the new _append_text() static method
covering None, str, list (including empty list) content types
- TestHardStopWithListContent: integration tests verifying hard stop
works correctly with list content (Anthropic thinking mode), None
content, and str content
Requested by reviewer in PR #1823.
* fix(middleware): improve _append_text robustness and test isolation
- Add explicit isinstance(content, str) check with fallback for
unexpected types (coerce to str) to prevent TypeError on edge cases
- Deep-copy list content in _make_state() test helper to prevent
shared mutable references across test iterations
- Add test_unexpected_type_coerced_to_str: verify fallback for
non-str/list/None content types
- Add test_list_content_not_mutated_in_place: verify _append_text
does not modify the original list
* style: fix ruff format whitespace in test file
---------
Co-authored-by: ppyt <14163465+ppyt@users.noreply.github.com>
|
||
|
|
5ff230eafd |
feat(uploads): inject document outline into agent context for converted files (#1738)
* feat(uploads): inject document outline into agent context for converted files
Extract headings from converted .md files and inject them into the
<uploaded_files> context block so the agent can navigate large documents
by line number before reading.
- Add `extract_outline()` to `file_conversion.py`: recognises standard
Markdown headings (#/##/###) and SEC-style bold structural headings
(**ITEM N. BUSINESS**, **PART II**); caps at 50 entries; excludes
cover-page boilerplate (WASHINGTON DC, CURRENT REPORT, SIGNATURES)
- Add `_extract_outline_for_file()` helper in `uploads_middleware.py`:
looks for a sibling `.md` file produced by the conversion pipeline
- Update `UploadsMiddleware._create_files_message()` to render the outline
under each file entry with `L{line}: {title}` format and a `read_file`
prompt for range-based reading
- Tests: 10 new tests for `extract_outline()`, 4 new tests for outline
injection in `UploadsMiddleware`; existing test updated for new `outline`
field in `uploaded_files` state
Partially addresses #1647 (agent ignores uploaded files).
* fix(uploads): stream outline file reads and strip inline bold from heading titles
- Switch extract_outline() from read_text().splitlines() to open()+line iteration
so large converted documents are not loaded into memory on every agent turn;
exits as soon as MAX_OUTLINE_ENTRIES is reached (Copilot suggestion)
- Strip **...** wrapper from standard Markdown heading titles before appending
to outline so agent context stays clean (e.g. "## **Overview**" → "Overview")
(Copilot suggestion)
- Remove unused pathlib.Path import and fix import sort order in test_file_conversion.py
to satisfy ruff CI lint
* fix(uploads): show truncation hint when outline exceeds MAX_OUTLINE_ENTRIES
When extract_outline() hits the cap it now appends a sentinel entry
{"truncated": True} instead of silently dropping the rest of the headings.
UploadsMiddleware reads the sentinel and renders a hint line:
... (showing first 50 headings; use `read_file` to explore further)
Without this the agent had no way to know the outline was incomplete and
would treat the first 50 headings as the full document structure.
* fix(uploads): fall back to configurable.thread_id when runtime.context lacks thread_id
runtime.context does not always carry thread_id (depends on LangGraph
invocation path). ThreadDataMiddleware already falls back to
get_config().configurable.thread_id — apply the same pattern so
UploadsMiddleware can resolve the uploads directory and attach outlines
in all invocation paths.
* style: apply ruff format
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
|
||
|
|
46d0c329c1 |
fix(uploads): fall back to configurable.thread_id when runtime.context lacks thread_id (#1814)
* fix(uploads): fall back to configurable.thread_id when runtime.context lacks thread_id runtime.context does not always carry thread_id depending on the LangGraph invocation path. When absent, uploads_dir resolved to None and the entire outline/historical-files attachment was silently skipped. Apply the same fallback pattern already used by ThreadDataMiddleware: try get_config().configurable.thread_id, with a RuntimeError guard for test environments where get_config() is called outside a runnable context. Discovered via live integration testing (curl against local LangGraph). Unit tests inject uploads_dir directly and would not catch this. * style: apply ruff format to uploads_middleware.py |
||
|
|
a2aba23962 | fix: replace the offline link in the lead_agent prompt (#1800) | ||
|
|
5664b9d413 |
fix: inject longTermBackground into memory prompt (#1734)
The format_memory_for_injection function only processed recentMonths and earlierContext from the history section, silently dropping longTermBackground. The LLM writes longTermBackground correctly and it persists to memory.json, but it was never injected into the system prompt — making the user's long-term background invisible to the AI. Add the missing field handling and a regression test. Co-authored-by: ppyt <14163465+ppyt@users.noreply.github.com> |
||
|
|
8128a3bc57 | fix: enable DanglingToolCallMiddleware for subagents (#1766) | ||
|
|
f8fb8d6fb1 |
feat/per agent skill filter (#1650)
* feat(agent): 为AgentConfig添加skills字段并更新lead_agent系统提示 在AgentConfig中添加skills字段以支持配置agent可用技能 更新lead_agent的系统提示模板以包含可用技能信息 * fix: resolve agent skill configuration edge cases and add tests * Update backend/packages/harness/deerflow/agents/lead_agent/prompt.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * refactor(agent): address PR review comments for skills configuration - Add detailed docstring to `skills` field in `AgentConfig` to clarify the semantics of `None` vs `[]`. - Add unit tests in `test_custom_agent.py` to verify `load_agent_config()` correctly parses omitted skills and explicit empty lists. - Fix `test_make_lead_agent_empty_skills_passed_correctly` to include `agent_name` in the runtime config, ensuring it exercises the real code path. * docs: 添加关于按代理过滤技能的配置说明 在配置示例文件和文档中添加说明,解释如何通过代理的config.yaml文件限制加载的技能 --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
3a672b39c7 |
Fix/1681 llm call retry handling (#1683)
* fix(runtime): handle llm call errors gracefully * fix(runtime): preserve graph control flow in llm retry middleware --------- Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com> |
||
|
|
0cdecf7b30 |
feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668)
* feat(memory): add structured reflection and correction detection * fix(memory): align sourceError schema and prompt guidance --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
aae59a8ba8 |
fix: surface configured sandbox mounts to agents (#1638)
* fix: surface configured sandbox mounts to agents * fix: address PR review feedback --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
9a557751d6 |
feat: support memory import and export (#1521)
* feat: support memory import and export * fix(memory): address review feedback * style: format memory settings page --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |