mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-21 15:36:48 +00:00
0948c7a4e1db52ea49072c116a3de65fc7496c71
21 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
888f7bfb9d |
Implement skill self-evolution and skill_manage flow (#1874)
* chore: ignore .worktrees directory * Add skill_manage self-evolution flow * Fix CI regressions for skill_manage * Address PR review feedback for skill evolution * fix(skill-evolution): preserve history on delete * fix(skill-evolution): tighten scanner fallbacks * docs: add skill_manage e2e evidence screenshot * fix(skill-manage): avoid blocking fs ops in session runtime --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
5fd2c581f6 |
fix: add output truncation to ls_tool to prevent context window overflow (#1896)
ls_tool was the only sandbox tool without output size limits, allowing multi-MB results from large directories to blow up the model context window. Add head-truncation (configurable via ls_output_max_chars, default 20000) consistent with existing bash and read_file truncation. Closes #1887 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
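As a rough illustration of the head-truncation described in 5fd2c581f6, the sketch below keeps the start of a long listing and appends a marker. Only the `ls_output_max_chars` default of 20000 comes from the commit message; the function name `head_truncate`, the marker wording, and treating 0 as "no limit" are assumptions, not the project's actual code.

```python
def head_truncate(text: str, max_chars: int = 20000) -> str:
    """Keep the beginning of text so the result never exceeds max_chars.

    Hypothetical helper; treating max_chars == 0 as "disabled" is an assumption.
    """
    if max_chars == 0 or len(text) <= max_chars:
        return text
    marker = f"\n... [truncated, {len(text)} chars total]"
    keep = max(max_chars - len(marker), 0)
    # Slice once more so the cap holds even when the marker alone is too long.
    return (text[:keep] + marker)[:max_chars]


# A directory listing far larger than the limit.
listing = "\n".join(f"file_{i}.txt" for i in range(5000))
short = head_truncate(listing, max_chars=200)
```

Head truncation suits a directory listing because the first entries are usually enough for the model to decide on a narrower follow-up call.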
|
|
0ffe5a73c1 | chore(config): Increase subagent max-turn limits (#1852) | ||
|
|
2a150f5d4a |
fix: unblock concurrent threads and workspace hydration (#1839)
* fix: unblock concurrent threads and workspace hydration * fix: restore async title generation * fix: address PR review feedback * style: format lead agent prompt |
||
|
|
ddfc988bef |
feat(uploads): add pymupdf4llm PDF converter with auto-fallback and async offload (#1727)
* feat(uploads): add pymupdf4llm PDF converter with auto-fallback and async offload - Introduce pymupdf4llm as an optional PDF converter with better heading detection and table preservation than MarkItDown - Auto mode: prefer pymupdf4llm when installed; fall back to MarkItDown when output is suspiciously sparse (image-based / scanned PDFs) - Sparsity check uses chars-per-page (< 50 chars/page) rather than an absolute threshold, correctly handling both short and long documents - Large files (> 1 MB) are offloaded to asyncio.to_thread() to avoid blocking the event loop (related: #1569) - Add UploadsConfig with pdf_converter field (auto/pymupdf4llm/markitdown) - Add pymupdf4llm as optional dependency: pip install deerflow-harness[pymupdf] - Add 14 unit tests covering sparsity heuristic, routing logic, and async path * fix(uploads): address Copilot review comments on PDF converter - Fix docstring: MIN_CHARS_PYMUPDF -> _MIN_CHARS_PER_PAGE (typo) - Fix file handle leak: wrap pymupdf.open in try/finally to ensure doc.close() - Fix silent fallback gap: _convert_pdf_with_pymupdf4llm now catches all conversion exceptions (not just ImportError), so encrypted/corrupt PDFs fall back to MarkItDown instead of propagating - Tighten type: pdf_converter field changed from str to Literal[auto|pymupdf4llm|markitdown] - Normalize config value: _get_pdf_converter() strips and lowercases the raw config string, warns and falls back to 'auto' on unknown values |
||
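The auto-fallback routing in ddfc988bef can be pictured with the small sketch below. The 50 chars-per-page threshold and the `_MIN_CHARS_PER_PAGE` name come from the commit message; `is_sparse`, `convert_auto`, and the callable signatures are hypothetical stand-ins for the real converters, and stripping whitespace before counting is an assumption.

```python
from typing import Callable

_MIN_CHARS_PER_PAGE = 50  # threshold quoted in the commit message


def is_sparse(markdown: str, page_count: int) -> bool:
    """True when the extracted text is too thin for the page count."""
    if page_count <= 0:
        return True
    return len(markdown.strip()) / page_count < _MIN_CHARS_PER_PAGE


def convert_auto(
    primary: Callable[[], tuple[str, int]],
    fallback: Callable[[], str],
) -> str:
    """Try the primary converter; fall back on any error or sparse output.

    primary returns (markdown, page_count); both callables are hypothetical.
    """
    try:
        markdown, pages = primary()
    except Exception:
        # Encrypted or corrupt input: use the fallback converter instead.
        return fallback()
    return fallback() if is_sparse(markdown, pages) else markdown
```

Dividing by the page count is what lets a legitimately short one-page document pass while a 40-page scan that yields only a few stray characters is routed to the fallback.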
|
|
f8fb8d6fb1 |
feat/per agent skill filter (#1650)
* feat(agent): add skills field to AgentConfig and update the lead_agent system prompt Add a skills field to AgentConfig to support configuring the skills available to an agent Update the lead_agent system prompt template to include information about available skills * fix: resolve agent skill configuration edge cases and add tests * Update backend/packages/harness/deerflow/agents/lead_agent/prompt.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * refactor(agent): address PR review comments for skills configuration - Add detailed docstring to `skills` field in `AgentConfig` to clarify the semantics of `None` vs `[]`. - Add unit tests in `test_custom_agent.py` to verify `load_agent_config()` correctly parses omitted skills and explicit empty lists. - Fix `test_make_lead_agent_empty_skills_passed_correctly` to include `agent_name` in the runtime config, ensuring it exercises the real code path. * docs: add configuration notes on per-agent skill filtering Add notes to the example config file and the docs explaining how to restrict the loaded skills through an agent's config.yaml file --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
2d1f90d5dc |
feat(tracing): add optional Langfuse support (#1717)
* feat(tracing): add optional Langfuse support * Fix tracing fail-fast behavior for explicitly enabled providers * fix(lint) |
||
|
|
df5339b5d0 |
feat(sandbox): truncate oversized bash and read_file tool outputs (#1677)
* feat(sandbox): truncate oversized bash and read_file tool outputs Long tool outputs (large directory listings, multi-MB source files) can overflow the model's context window. Two new configurable limits: - bash_output_max_chars (default 20000): middle-truncates bash output, preserving both head and tail so stderr at the end is not lost - read_file_output_max_chars (default 50000): head-truncates file output with a hint to use start_line/end_line for targeted reads Both limits are enforced at the tool layer (sandbox/tools.py) rather than middleware, so truncation is guaranteed regardless of call path. Setting either limit to 0 disables truncation entirely. Measured: read_file on a 250KB source file drops from 63,698 tokens to 19,927 tokens (69% reduction) with the default limit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): remove unused pytest import and fix import sort order * style: apply ruff format to sandbox/tools.py * refactor(sandbox): address Copilot review feedback on truncation feature - strict hard cap: while-loop ensures result (including marker) ≤ max_chars - max_chars=0 now returns "" instead of original output - get_app_config() wrapped in try/except with fallback to defaults - sandbox_config.py: add ge=0 validation on truncation limit fields - config.example.yaml: bump config_version 4→5 - tests: add len(result) <= max_chars assertions, edge-case (max=0, small max, various sizes) tests; fix skipped-count test for strict hard cap * refactor(sandbox): replace while-loop truncation with fixed marker budget Use a pre-allocated constant (_MARKER_MAX_LEN) instead of a convergence loop to ensure result <= max_chars. Simpler, safer, and skipped-char count in the marker is now an exact predictable value. 
* refactor(sandbox): compute marker budget dynamically instead of hardcoding * fix(sandbox): make max_chars=0 disable truncation instead of returning empty string --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: JeffJiang <for-eleven@hotmail.com> |
||
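The middle-truncation with a marker budget described in df5339b5d0 can be sketched roughly as follows. This is not the code in sandbox/tools.py: `middle_truncate`, `_marker`, and the marker wording are hypothetical. What it does take from the commit message is that head and tail are both preserved, that the result including the marker stays within `max_chars`, and that 0 disables truncation.

```python
def _marker(skipped: int) -> str:
    """Hypothetical marker text; the real wording is not given in the log."""
    return f"\n... [{skipped} chars skipped] ...\n"


def middle_truncate(text: str, max_chars: int) -> str:
    """Keep head and tail of text; the result is never longer than max_chars."""
    if max_chars == 0 or len(text) <= max_chars:
        return text
    # Budget for the longest marker we could need: the skipped count can
    # never have more digits than len(text) itself.
    budget = len(_marker(len(text)))
    keep = max_chars - budget
    if keep <= 0:
        # Limit too small to fit a marker at all; fall back to a plain cut.
        return text[:max_chars]
    head = (keep + 1) // 2
    tail = keep // 2
    skipped = len(text) - head - tail
    # Guard tail == 0, because text[-0:] would return the whole string.
    return text[:head] + _marker(skipped) + (text[-tail:] if tail else "")


sample = "a" * 100 + "b" * 100
clipped = middle_truncate(sample, 80)
```

Computing the budget from the input length, rather than looping until the result fits, keeps the function a single pass and makes the skipped count in the marker exact.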
|
|
3ff15423d6 |
fix Windows Docker sandbox path mounting (#1634)
* fix windows docker sandbox paths * fix windows sandbox mount validation * fix backend checks for windows sandbox path PR |
||
|
|
34e835bc33 |
feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403)
* feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli Implement all core LangGraph Platform API endpoints in the Gateway, allowing it to fully replace the langgraph-cli dev server for local development. This eliminates a heavyweight dependency and simplifies the development stack. Changes: - Add runs lifecycle endpoints (create, stream, wait, cancel, join) - Add threads CRUD and search endpoints - Add assistants compatibility endpoints (search, get, graph, schemas) - Add StreamBridge (in-memory pub/sub for SSE) and async provider - Add RunManager with atomic create_or_reject (eliminates TOCTOU race) - Add worker with interrupt/rollback cancel actions and runtime context injection - Route /api/langgraph/* to Gateway in nginx config - Skip langgraph-cli startup by default (SKIP_LANGGRAPH_SERVER=0 to restore) - Add unit tests for RunManager, SSE format, and StreamBridge * fix: drain bridge queue on client disconnect to prevent backpressure When on_disconnect=continue, keep consuming events from the bridge without yielding, so the worker is not blocked by a full queue. Only on_disconnect=cancel breaks out immediately. 
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: remove pytest import Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: Fix default stream_mode to ["values", "messages-tuple"] Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: Remove unused if_exists field from ThreadCreateRequest Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: address review comments on gateway LangGraph API - Mount runs.py router in app.py (missing include_router) - Normalize interrupt_before/after "*" to node list before run_agent() - Use entry.id for SSE event ID instead of counter - Drain bridge queue on disconnect when on_disconnect=continue - Reuse serialization helper in wait_run() for consistent wire format - Reject unsupported multitask_strategy with 400 - Remove SKIP_LANGGRAPH_SERVER fallback, always use Gateway * feat: extract app.state access into deps.py Encapsulate read/write operations for singleton objects (RunManager, StreamBridge, checkpointer) held in app.state into a shared utility, reducing repeated access patterns across router modules. * feat: extract deerflow.runtime.serialization module with tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: replace duplicated serialization with deerflow.runtime.serialization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: extract app/gateway/services.py with run lifecycle logic Create a service layer that centralizes SSE formatting, input/config normalization, and run lifecycle management. Router modules will delegate to these functions instead of using private cross-imported helpers. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: wire routers to use services layer, remove cross-module private imports Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: apply ruff formatting to refactored files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime): support LangGraph dev server and add compat route - Enable official LangGraph dev server for local development workflow - Decouple runtime components from agents package for better separation - Provide gateway-backed fallback route when dev server is skipped - Simplify lifecycle management using context manager in gateway * feat(runtime): add Store providers with auto-backend selection - Add async_provider.py and provider.py under deerflow/runtime/store/ - Support memory, sqlite, postgres backends matching checkpointer config - Integrate into FastAPI lifespan via AsyncExitStack in deps.py - Replace hardcoded InMemoryStore with config-driven factory * refactor(gateway): migrate thread management from checkpointer to Store and resolve multiple endpoint failures - Add Store-backed CRUD helpers (_store_get, _store_put, _store_upsert) - Replace checkpoint-scanning search with two-phase strategy: phase 1 reads Store (O(threads)), phase 2 backfills from checkpointer for legacy/LangGraph Server threads with lazy migration - Extend Store record schema with values field for title persistence - Sync thread title from checkpoint to Store after run completion - Fix /threads/{id}/runs/{run_id}/stream 405 by accepting both GET and POST methods; POST handles interrupt/rollback actions - Fix /threads/{id}/state 500 by separating read_config and write_config, adding checkpoint_ns to configurable, and shallow-copying checkpoint/metadata before mutation - Sync title to Store on state update for immediate search reflection - Move _upsert_thread_in_store into services.py, remove duplicate logic - Add _sync_thread_title_after_run: 
await run task, read final checkpoint title, write back to Store record - Spawn title sync as background task from start_run when Store exists * refactor(runtime): deduplicate store and checkpointer provider logic Extract _ensure_sqlite_parent_dir() helper into checkpointer/provider.py and use it in all three places that previously inlined the same mkdir logic. Consolidate duplicate error constants in store/async_provider.py by importing from store/provider.py instead of redefining them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(runtime): move SQLite helpers to runtime/store, checkpointer imports from store _resolve_sqlite_conn_str and _ensure_sqlite_parent_dir now live in runtime/store/provider.py. agents/checkpointer/provider and agents/checkpointer/async_provider import from there, reversing the previous dependency direction (store → checkpointer becomes checkpointer → store). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(runtime): extract SQLite helpers into runtime/store/_sqlite_utils.py Move resolve_sqlite_conn_str and ensure_sqlite_parent_dir out of checkpointer/provider.py into a dedicated _sqlite_utils module. Functions are now public (no underscore prefix), making cross-module imports semantically correct. All four provider files import from the single shared location. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(gateway): use adelete_thread to fully remove thread checkpoints on delete AsyncSqliteSaver has no adelete method — the previous hasattr check always evaluated to False, silently leaving all checkpoint rows in the database. Switch to adelete_thread(thread_id) which deletes every checkpoint and pending-write row for the thread across all namespaces (including sub-graph checkpoints). 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(gateway): remove dead bridge_cm/ckpt_cm code and fix StrEnum lint app.py had unreachable code after the async-with lifespan refactor: bridge_cm and ckpt_cm were referenced but never defined (F821), and the channel service startup/shutdown was outside the langgraph_runtime block so it never ran. Move channel service lifecycle inside the async-with block where it belongs. Replace str+Enum inheritance in RunStatus and DisconnectMode with StrEnum as suggested by UP042. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style: format with ruff --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: JeffJiang <for-eleven@hotmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
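The "atomic create_or_reject" idea from 34e835bc33 can be illustrated with a minimal sketch. `RunManagerSketch` is a hypothetical stand-in for the real RunManager, which is not shown in this log; the one-active-run-per-thread rule and the `asyncio.sleep(0)` standing in for an async lookup are assumptions made for the demonstration.

```python
import asyncio


class RunManagerSketch:
    """Hypothetical stand-in: at most one active run per thread."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._active: dict[str, str] = {}

    async def create_or_reject(self, thread_id: str, run_id: str) -> bool:
        """Register run_id for thread_id, or return False if one is active."""
        async with self._lock:
            # Simulated async lookup. Without the lock, a second task could
            # pass the same check while this one is suspended here.
            await asyncio.sleep(0)
            if thread_id in self._active:
                return False
            self._active[thread_id] = run_id
            return True

    async def finish(self, thread_id: str) -> None:
        async with self._lock:
            self._active.pop(thread_id, None)


async def demo() -> list[bool]:
    manager = RunManagerSketch()
    # Two concurrent attempts on the same thread: only one may win.
    return await asyncio.gather(
        manager.create_or_reject("t1", "run-a"),
        manager.create_or_reject("t1", "run-b"),
    )
```

Holding the lock across both the check and the write is what closes the time-of-check to time-of-use gap; a separate "is there a run?" call followed by "create run" would leave it open.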
|
|
92c7a20cb7 |
[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)
* fix(security): disable host bash by default in local sandbox * fix(security): address review feedback for local bash hardening * fix(ci): sort live test imports for lint * style: apply backend formatter --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
8590249db4 |
feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447)
Allow per-agent environment variables to be declared in config.yaml under acp_agents.<name>.env. Values prefixed with $ are resolved from the host environment at invocation time, consistent with other config fields. Passes None to spawn_agent_process when env is empty so the subprocess inherits the parent environment unchanged. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
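A small sketch of the env resolution described in 8590249db4: values starting with `$` are looked up in the host environment, and an empty mapping becomes `None` so the subprocess would inherit the parent environment. `resolve_agent_env` is a hypothetical name, and substituting an empty string for a missing host variable is an assumption, since the commit message does not say what happens in that case.

```python
import os
from typing import Mapping, Optional


def resolve_agent_env(
    env: Optional[Mapping[str, str]],
    host_env: Optional[Mapping[str, str]] = None,
) -> Optional[dict[str, str]]:
    """Resolve "$NAME" values against the host environment.

    Returns None for an empty or missing mapping so the caller can let the
    child process inherit the parent environment unchanged.
    """
    if not env:
        return None
    host = os.environ if host_env is None else host_env
    resolved: dict[str, str] = {}
    for key, value in env.items():
        if value.startswith("$"):
            # Assumption: an unset host variable resolves to "".
            resolved[key] = host.get(value[1:], "")
        else:
            resolved[key] = value
    return resolved
```

Resolving at invocation time rather than at config load means a rotated secret in the host environment is picked up by the next agent call without a restart.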
|
|
1c542ab7f1 |
feat(memory): Introduce configurable memory storage abstraction (#1353)
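* feat(memory storage): add support for configurable memory storage providers Implement the MemoryStorage abstract base class and the FileMemoryStorage file-backed implementation Refactor the memory data loading and saving logic into the storage provider Add the storage_class config option to support custom storage providers * refactor(memory): restructure the memory storage module and update related tests Move the memory storage logic from the updater module into a standalone storage module Use the storage interface pattern instead of direct file operations Update all related tests to use the new storage interface * Update backend/packages/harness/deerflow/agents/memory/storage.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update backend/packages/harness/deerflow/agents/memory/storage.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(memory storage): add a thread-safety lock and more test cases Add a thread lock to make initialization of the memory storage singleton thread-safe Add validation tests for invalid agent names Add test cases for singleton thread safety and exception handling * Update backend/tests/test_memory_storage.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(agents): validate agent names with a unified pattern Change the agent name validation logic to use the AGENT_NAME_PATTERN defined in the repository, keeping the codebase consistent and preventing security issues such as path traversal. Also update the test cases to cover more invalid names. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |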
||
|
|
d119214fee |
feat(harness): integration ACP agent tool (#1344)
* refactor: extract shared utils to break harness→app cross-layer imports Move _validate_skill_frontmatter to src/skills/validation.py and CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py. This eliminates the two reverse dependencies from client.py (harness layer) into gateway/routers/ (app layer), preparing for the harness/app package split. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: split backend/src into harness (deerflow.*) and app (app.*) Physically split the monolithic backend/src/ package into two layers: - **Harness** (`packages/harness/deerflow/`): publishable agent framework package with import prefix `deerflow.*`. Contains agents, sandbox, tools, models, MCP, skills, config, and all core infrastructure. - **App** (`app/`): unpublished application code with import prefix `app.*`. Contains gateway (FastAPI REST API) and channels (IM integrations). Key changes: - Move 13 harness modules to packages/harness/deerflow/ via git mv - Move gateway + channels to app/ via git mv - Rename all imports: src.* → deerflow.* (harness) / app.* (app layer) - Set up uv workspace with deerflow-harness as workspace member - Update langgraph.json, config.example.yaml, all scripts, Docker files - Add build-system (hatchling) to harness pyproject.toml - Add PYTHONPATH=. to gateway startup commands for app.* resolution - Update ruff.toml with known-first-party for import sorting - Update all documentation to reflect new directory structure Boundary rule enforced: harness code never imports from app. All 429 tests pass. Lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: add harness→app boundary check test and update docs Add test_harness_boundary.py that scans all Python files in packages/harness/deerflow/ and fails if any `from app.*` or `import app.*` statement is found. This enforces the architectural rule that the harness layer never depends on the app layer. 
Update CLAUDE.md to document the harness/app split architecture, import conventions, and the boundary enforcement test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add config versioning with auto-upgrade on startup When config.example.yaml schema changes, developers' local config.yaml files can silently become outdated. This adds a config_version field and auto-upgrade mechanism so breaking changes (like src.* → deerflow.* renames) are applied automatically before services start. - Add config_version: 1 to config.example.yaml - Add startup version check warning in AppConfig.from_file() - Add scripts/config-upgrade.sh with migration registry for value replacements - Add `make config-upgrade` target - Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services - Add config error hints in service failure messages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix comments * fix: update src.* import in test_sandbox_tools_security to deerflow.* Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle empty config and search parent dirs for config.example.yaml Address Copilot review comments on PR #1131: - Guard against yaml.safe_load() returning None for empty config files - Search parent directories for config.example.yaml instead of only looking next to config.yaml, fixing detection in common setups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct skills root path depth and config_version type coercion - loader.py: fix get_skills_root_path() to use 5 parent levels (was 3) after harness split, file lives at packages/harness/deerflow/skills/ so parent×3 resolved to backend/packages/harness/ instead of backend/ - app_config.py: coerce config_version to int() before comparison in _check_config_version() to prevent TypeError when YAML stores value as string (e.g. 
config_version: "1") - tests: add regression tests for both fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: update test imports from src.* to deerflow.*/app.* after harness refactor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(harness): add tool-first ACP agent invocation (#37) * feat(harness): add tool-first ACP agent invocation * build(harness): make ACP dependency required * fix(harness): address ACP review feedback * feat(harness): decouple ACP agent workspace from thread data ACP agents (codex, claude-code) previously used per-thread workspace directories, causing path resolution complexity and coupling task execution to DeerFlow's internal thread data layout. This change: - Replace _resolve_cwd() with a fixed _get_work_dir() that always uses {base_dir}/acp-workspace/, eliminating virtual path translation and thread_id lookups - Introduce /mnt/acp-workspace virtual path for lead agent read-only access to ACP agent output files (same pattern as /mnt/skills) - Add security guards: read-only validation, path traversal prevention, command path allowlisting, and output masking for acp-workspace - Update system prompt and tool description to guide LLM: send self-contained tasks to ACP agents, copy results via /mnt/acp-workspace - Add 11 new security tests for ACP workspace path handling Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(prompt): inject ACP section only when ACP agents are configured The ACP agent guidance in the system prompt is now conditionally built by _build_acp_section(), which checks get_acp_agents() and returns an empty string when no ACP agents are configured. This avoids polluting the prompt with irrelevant instructions for users who don't use ACP. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix lint * fix(harness): address Copilot review comments on sandbox path handling and ACP tool - local_sandbox: fix path-segment boundary bug in _resolve_path (== or startswith +"/") and add lookahead in _resolve_paths_in_command regex to prevent /mnt/skills matching inside /mnt/skills-extra - local_sandbox_provider: replace print() with logger.warning(..., exc_info=True) - invoke_acp_agent_tool: guard getattr(option, "optionId") with None default + continue; move full prompt from INFO to DEBUG level (truncated to 200 chars) - sandbox/tools: fix _get_acp_workspace_host_path docstring to match implementation; remove misleading "read-only" language from validate_local_bash_command_paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(acp): thread-isolated workspaces, permission guardrail, and ContextVar registry P1.1 – ACP workspace thread isolation - Add `Paths.acp_workspace_dir(thread_id)` for per-thread paths - `_get_work_dir(thread_id)` in invoke_acp_agent_tool now uses `{base_dir}/threads/{thread_id}/acp-workspace/`; falls back to global workspace when thread_id is absent or invalid - `_invoke` extracts thread_id from `RunnableConfig` via `Annotated[RunnableConfig, InjectedToolArg]` - `sandbox/tools.py`: `_get_acp_workspace_host_path(thread_id)`, `_resolve_acp_workspace_path(path, thread_id)`, and all callers (`replace_virtual_paths_in_command`, `mask_local_paths_in_output`, `ls_tool`, `read_file_tool`) now resolve ACP paths per-thread P1.2 – ACP permission guardrail - New `auto_approve_permissions: bool = False` field in `ACPAgentConfig` - `_build_permission_response(options, *, auto_approve: bool)` now defaults to deny; only approves when `auto_approve=True` - Document field in `config.example.yaml` P2 – Deferred tool registry race condition - Replace module-level `_registry` global with `contextvars.ContextVar` - Each asyncio request context gets its own registry; worker threads inherit 
the context automatically via `loop.run_in_executor` - Expose `get_deferred_registry` / `set_deferred_registry` / `reset_deferred_registry` helpers Tests: 831 pass (57 for affected modules, 3 new tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sandbox): mount /mnt/acp-workspace in docker sandbox container The AioSandboxProvider was not mounting the ACP workspace into the sandbox container, so /mnt/acp-workspace was inaccessible when the lead agent tried to read ACP results in docker mode. Changes: - `ensure_thread_dirs`: also create `acp-workspace/` (chmod 0o777) so the directory exists before the sandbox container starts — required for Docker volume mounts - `_get_thread_mounts`: add read-only `/mnt/acp-workspace` mount using the per-thread host path (`host_paths.acp_workspace_dir(thread_id)`) - Update stale CLAUDE.md description (was "fixed global workspace") Tests: `test_aio_sandbox_provider.py` (4 new tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): remove unused imports in test_aio_sandbox_provider Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix config --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
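The path-segment boundary fix mentioned in d119214fee (`== or startswith + "/"`, plus a lookahead so `/mnt/skills` does not match inside `/mnt/skills-extra`) can be shown in isolation. The helper names `is_under` and `replace_virtual_prefix` are hypothetical, and the exact lookahead used by the project is not given in the log; this is one plausible form.

```python
import re


def is_under(path: str, prefix: str) -> bool:
    """True only if path is prefix itself or lies below it as a directory."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def replace_virtual_prefix(command: str, virtual: str, host: str) -> str:
    """Swap a virtual path prefix for a host path inside a shell command.

    The lookahead requires a segment boundary ("/", whitespace, or end of
    string) after the prefix, so longer sibling names are left alone.
    """
    pattern = re.compile(re.escape(virtual) + r"(?=/|\s|$)")
    # A callable replacement avoids backslash escapes in host being
    # interpreted, which matters for Windows-style paths.
    return pattern.sub(lambda _match: host, command)
```

A bare `startswith(prefix)` check would treat `/mnt/skills-extra/a` as living under `/mnt/skills`, which is exactly the class of bug the commit describes.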
|
|
16ed797e0e |
feat: add configurable log level and token usage tracking (#1301)
* feat: add configurable log level and token usage tracking - Add `log_level` config to control deerflow module log level, synced to LangGraph Server via serve.sh `--server-log-level` - Add `token_usage.enabled` config with TokenUsageMiddleware that logs input/output/total tokens per LLM call from usage_metadata - Add .omc/ to .gitignore * fix: use info level for token usage logs since feature has its own toggle * fix: sort imports to pass lint check --------- Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
8b0f3fe233 |
fix(threads): clean up local thread data after thread deletion (#1262)
* fix(threads): clean up local thread data after thread deletion Delete DeerFlow-managed thread directories after the web UI removes a LangGraph thread. This keeps local thread data in sync with conversation deletion and adds regression coverage for the cleanup flow. * fix(threads): address thread cleanup review feedback Encode thread cleanup URLs in the web client, keep cache updates explicit when no thread search data is cached, and return a generic 500 response from the cleanup endpoint while documenting the sanitized error behavior. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
a29134d7c9 |
feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240)
Add GuardrailMiddleware that evaluates every tool call before execution. Three provider options: built-in AllowlistProvider (zero deps), OAP passport providers (open standard), or custom providers loaded by class path. - GuardrailProvider protocol with GuardrailRequest/Decision dataclasses - GuardrailMiddleware (AgentMiddleware, position 5 in chain) - AllowlistProvider for simple deny/allow by tool name - GuardrailsConfig (Pydantic singleton, loaded from config.yaml) - 25 tests covering allow/deny, fail-closed/open, async, GraphBubbleUp - Comprehensive docs at backend/docs/GUARDRAILS.md Closes #1213 Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
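To make the provider idea in a29134d7c9 concrete, here is a minimal sketch of an allowlist check with fail-closed handling. None of these names are the project's API: `ToolRequest`, `Decision`, `Provider`, `AllowlistSketch`, and `authorize` are hypothetical, and the real GuardrailRequest and decision types almost certainly carry more fields.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class ToolRequest:
    tool_name: str


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str = ""


class Provider(Protocol):
    def evaluate(self, request: ToolRequest) -> Decision: ...


class AllowlistSketch:
    """Allow or deny purely by tool name (hypothetical provider)."""

    def __init__(
        self,
        allowed: Optional[set[str]] = None,
        denied: Optional[set[str]] = None,
    ) -> None:
        self.allowed = allowed  # None means "no allowlist configured"
        self.denied = denied or set()

    def evaluate(self, request: ToolRequest) -> Decision:
        if request.tool_name in self.denied:
            return Decision(False, "tool is denied")
        if self.allowed is not None and request.tool_name not in self.allowed:
            return Decision(False, "tool is not on the allowlist")
        return Decision(True)


def authorize(
    provider: Provider, request: ToolRequest, fail_closed: bool = True
) -> Decision:
    """Ask the provider; on provider failure, deny unless fail_closed is off."""
    try:
        return provider.evaluate(request)
    except Exception as exc:
        return Decision(allow=not fail_closed, reason=f"provider error: {exc}")
```

Keeping the fail-closed decision in the caller rather than in each provider means a custom provider loaded by class path cannot accidentally turn an outage into blanket approval.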
|
|
e119dc74ae |
feat(codex): support explicit OpenAI Responses API config (#1235)
* feat: support explicit OpenAI Responses API config Co-authored-by: Codex <noreply@openai.com> * Update backend/packages/harness/deerflow/config/model_config.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
644501ae07 |
fix(config): reload AppConfig when config path or mtime changes (#1239)
* fix(config): reload AppConfig when config path or mtime changes - Track resolved path + mtime; invalidate cache on change - Preserve set_app_config() injection behavior - Add regression tests (test_app_config_reload.py) - Document behavior in README and backend/CLAUDE.md Signed-off-by: Gao Mingfei <g199209@gmail.com> * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Signed-off-by: Gao Mingfei <g199209@gmail.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
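The cache invalidation rule in 644501ae07 (track resolved path plus mtime, reload when either changes) can be sketched as below. `get_config` and the `loader` callable are hypothetical; the real AppConfig cache and its `set_app_config()` injection path are not reproduced here.

```python
from pathlib import Path
from typing import Any, Callable, Optional

# (resolved path, mtime in nanoseconds, loaded config); hypothetical cache slot.
_cached: Optional[tuple[Path, int, Any]] = None


def get_config(path: str, loader: Callable[[Path], Any]) -> Any:
    """Return the cached config unless the file path or its mtime changed."""
    global _cached
    resolved = Path(path).resolve()
    mtime = resolved.stat().st_mtime_ns
    if _cached is not None and _cached[0] == resolved and _cached[1] == mtime:
        return _cached[2]
    # Either a different file or an edited one: load again and remember it.
    config = loader(resolved)
    _cached = (resolved, mtime, config)
    return config
```

Comparing the resolved path as well as the mtime covers the case where the config location itself is switched, for example through an environment variable, to a file that happens to share the old timestamp.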
|
|
0091d9f071 |
feat(tools): add tool_search for deferred MCP tool loading (#1176)
* feat(tools): add tool_search for deferred MCP tool loading When multiple MCP servers are enabled, total tool count can exceed 30-50, causing context bloat and degraded tool selection accuracy. This adds a deferred tool loading mechanism controlled by `tool_search.enabled` config. - Add ToolSearchConfig with single `enabled` field - Add DeferredToolRegistry with regex search (select:, +keyword, keyword) - Add tool_search tool returning OpenAI-compatible function JSON - Add DeferredToolFilterMiddleware to hide deferred schemas from bind_tools - Add <available-deferred-tools> section to system prompt - Enable MCP tool_name_prefix to prevent cross-server name collisions - Add 34 unit tests covering registry, tool, prompt, and middleware * fix: reset stale deferred registry and bump config_version - Reset deferred registry upfront in get_available_tools() to prevent stale tool entries when MCP servers are disabled between calls - Bump config_version to 2 for new tool_search config field Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tests): mock get_app_config in prompt section tests for CI CI has no config.yaml, causing TestDeferredToolsPromptSection to fail with FileNotFoundError. Add autouse fixture to mock get_app_config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
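A rough picture of the deferred-loading idea in 0091d9f071: the prompt lists tool names only, and full definitions are fetched through a search call. Everything named here is hypothetical (`DeferredTool`, `DeferredRegistrySketch`). The reading of `select:` as an exact, comma-separated name lookup is an assumption, and the `+keyword` form is left out because the log does not describe its semantics.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredTool:
    name: str
    description: str


class DeferredRegistrySketch:
    """Hypothetical registry holding tools whose schemas are not yet bound."""

    def __init__(self) -> None:
        self._tools: dict[str, DeferredTool] = {}

    def register(self, tool: DeferredTool) -> None:
        self._tools[tool.name] = tool

    def names(self) -> list[str]:
        """Names only: what a prompt section could list without schemas."""
        return sorted(self._tools)

    def search(self, query: str) -> list[DeferredTool]:
        if query.startswith("select:"):
            # Assumed meaning: exact lookup of comma-separated tool names.
            wanted = [n.strip() for n in query[len("select:"):].split(",")]
            return [self._tools[n] for n in wanted if n in self._tools]
        # Plain keyword: case-insensitive regex over name and description.
        pattern = re.compile(query, re.IGNORECASE)
        return [
            tool
            for tool in self._tools.values()
            if pattern.search(tool.name) or pattern.search(tool.description)
        ]
```

The saving comes from the asymmetry: a name costs a few tokens in the prompt, while a full function schema can cost hundreds, so only the tools the model actually asks for are expanded.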
|
|
76803b826f |
refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)
* refactor: extract shared utils to break harness→app cross-layer imports Move _validate_skill_frontmatter to src/skills/validation.py and CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py. This eliminates the two reverse dependencies from client.py (harness layer) into gateway/routers/ (app layer), preparing for the harness/app package split. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: split backend/src into harness (deerflow.*) and app (app.*) Physically split the monolithic backend/src/ package into two layers: - **Harness** (`packages/harness/deerflow/`): publishable agent framework package with import prefix `deerflow.*`. Contains agents, sandbox, tools, models, MCP, skills, config, and all core infrastructure. - **App** (`app/`): unpublished application code with import prefix `app.*`. Contains gateway (FastAPI REST API) and channels (IM integrations). Key changes: - Move 13 harness modules to packages/harness/deerflow/ via git mv - Move gateway + channels to app/ via git mv - Rename all imports: src.* → deerflow.* (harness) / app.* (app layer) - Set up uv workspace with deerflow-harness as workspace member - Update langgraph.json, config.example.yaml, all scripts, Docker files - Add build-system (hatchling) to harness pyproject.toml - Add PYTHONPATH=. to gateway startup commands for app.* resolution - Update ruff.toml with known-first-party for import sorting - Update all documentation to reflect new directory structure Boundary rule enforced: harness code never imports from app. All 429 tests pass. Lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: add harness→app boundary check test and update docs Add test_harness_boundary.py that scans all Python files in packages/harness/deerflow/ and fails if any `from app.*` or `import app.*` statement is found. This enforces the architectural rule that the harness layer never depends on the app layer. 
Update CLAUDE.md to document the harness/app split architecture, import conventions, and the boundary enforcement test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add config versioning with auto-upgrade on startup When config.example.yaml schema changes, developers' local config.yaml files can silently become outdated. This adds a config_version field and auto-upgrade mechanism so breaking changes (like src.* → deerflow.* renames) are applied automatically before services start. - Add config_version: 1 to config.example.yaml - Add startup version check warning in AppConfig.from_file() - Add scripts/config-upgrade.sh with migration registry for value replacements - Add `make config-upgrade` target - Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services - Add config error hints in service failure messages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix comments * fix: update src.* import in test_sandbox_tools_security to deerflow.* Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle empty config and search parent dirs for config.example.yaml Address Copilot review comments on PR #1131: - Guard against yaml.safe_load() returning None for empty config files - Search parent directories for config.example.yaml instead of only looking next to config.yaml, fixing detection in common setups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct skills root path depth and config_version type coercion - loader.py: fix get_skills_root_path() to use 5 parent levels (was 3) after harness split, file lives at packages/harness/deerflow/skills/ so parent×3 resolved to backend/packages/harness/ instead of backend/ - app_config.py: coerce config_version to int() before comparison in _check_config_version() to prevent TypeError when YAML stores value as string (e.g. 
config_version: "1") - tests: add regression tests for both fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: update test imports from src.* to deerflow.*/app.* after harness refactor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
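The boundary rule from 76803b826f (harness code never imports from app) lends itself to a short check. The sketch below is not the project's test_harness_boundary.py; `find_app_imports` is a hypothetical helper, and using the `ast` module rather than text matching is a choice made here so that comments and strings mentioning `app.` are ignored.

```python
import ast


def find_app_imports(source: str) -> list[str]:
    """Return the app-layer modules imported by the given Python source."""
    offenders: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers "import app" and "import app.gateway".
            offenders += [
                alias.name
                for alias in node.names
                if alias.name == "app" or alias.name.startswith("app.")
            ]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Covers "from app import x" and "from app.gateway import x";
            # level == 0 skips relative imports inside the harness itself.
            if node.module == "app" or node.module.startswith("app."):
                offenders.append(node.module)
    return offenders
```

A test built on this would read every `.py` file under packages/harness/deerflow/ and fail if the returned list is non-empty for any of them.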