* feat(subagents): support per-subagent skill loading and custom subagent types (#2230)
Add per-subagent skill configuration and custom subagent type registration,
aligned with Codex's role-based config layering and per-session skill injection.
Backend:
- SubagentConfig gains `skills` field (None=all, []=none, list=whitelist)
- New CustomSubagentConfig for user-defined subagent types in config.yaml
- SubagentsAppConfig gains `custom_agents` section and `get_skills_for()`
- Registry resolves custom agents with three-layer config precedence
- SubagentExecutor loads skills per-session as conversation items (Codex pattern)
- task_tool no longer appends skills to system_prompt
- Lead agent system prompt dynamically lists all registered subagent types
- setup_agent tool accepts optional skills parameter
- Gateway agents API transparently passes skills in CRUD operations
Frontend:
- Agent/CreateAgentRequest/UpdateAgentRequest types include skills field
- Agent card displays skills as badges alongside tool_groups
Config:
- config.example.yaml documents custom_agents and per-agent skills override
Tests:
- 40 new tests covering all skill config, custom agents, and registry logic
- Existing tests updated for new get_skills_prompt_section signature
Closes#2230
* fix: address review feedback on skills PR
- Remove stale get_skills_prompt_section monkeypatches from test_task_tool_core_logic.py
(task_tool no longer imports this function after skill injection moved to executor)
- Add key prefixes (tg:/sk:) to agent-card badges to prevent React key collisions
between tool_groups and skills
* fix(ci): resolve lint and test failures
- Format agent-card.tsx with prettier (lint-frontend)
- Remove stale "Skills Appendix" system_prompt assertion — skills are now
loaded per-session by SubagentExecutor, not appended to system_prompt
* fix(ci): sort imports in test_subagent_skills_config.py (ruff I001)
* fix(ci): use nullish coalescing in agent-card badge condition (eslint)
* fix: address review feedback on skills PR
- Use model_fields_set in AgentUpdateRequest to distinguish "field omitted"
from "explicitly set to null" — fixes skills=None ambiguity where None
means "inherit all" but was treated as "don't change"
- Move lazy import of get_subagent_config outside loop in
_build_available_subagents_description to avoid repeated import overhead
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
The tool is registered as `present_files` (plural) in present_file_tool.py,
but four references in documentation and prompt strings incorrectly used the
singular form `present_file`. This could cause confusion and potentially
lead to incorrect tool invocations.
Changed files:
- backend/docs/GUARDRAILS.md
- backend/docs/ARCHITECTURE.md
- backend/packages/harness/deerflow/agents/lead_agent/prompt.py (2 occurrences)
* fix(subagent): inherit parent agent's tool_groups in task_tool
When a custom agent defines tool_groups (e.g. [file:read, file:write, bash]),
the restriction is correctly applied to the lead agent. However, when the lead
agent delegates work to a subagent via the task tool, get_available_tools() is
called without the groups parameter, causing the subagent to receive ALL tools
(including web_search, web_fetch, image_search, etc.) regardless of the parent
agent's configuration.
This fix propagates tool_groups through run metadata so that task_tool passes
the same group filter when building the subagent's tool set.
Changes:
- agent.py: include tool_groups in run metadata
- task_tool.py: read tool_groups from metadata and pass to get_available_tools()
* fix: initialize metadata before conditional block and update tests for tool_groups propagation
- Initialize metadata = {} before the 'if runtime is not None' block to
avoid Ruff F821 (possibly-undefined variable) and simplify the
parent_tool_groups expression.
- Update existing test assertion to expect groups=None in
get_available_tools call signature.
- Add 3 new test cases:
- test_task_tool_propagates_tool_groups_to_subagent
- test_task_tool_no_tool_groups_passes_none
- test_task_tool_runtime_none_passes_groups_none
* feat(agent): 为AgentConfig添加skills字段并更新lead_agent系统提示
在AgentConfig中添加skills字段以支持配置agent可用技能
更新lead_agent的系统提示模板以包含可用技能信息
* fix: resolve agent skill configuration edge cases and add tests
* Update backend/packages/harness/deerflow/agents/lead_agent/prompt.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* refactor(agent): address PR review comments for skills configuration
- Add detailed docstring to `skills` field in `AgentConfig` to clarify the semantics of `None` vs `[]`.
- Add unit tests in `test_custom_agent.py` to verify `load_agent_config()` correctly parses omitted skills and explicit empty lists.
- Fix `test_make_lead_agent_empty_skills_passed_correctly` to include `agent_name` in the runtime config, ensuring it exercises the real code path.
* docs: 添加关于按代理过滤技能的配置说明
在配置示例文件和文档中添加说明,解释如何通过代理的config.yaml文件限制加载的技能
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(security): disable host bash by default in local sandbox
* fix(security): address review feedback for local bash hardening
* fix(ci): sort live test imports for lint
* style: apply backend formatter
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* feat(client): support custom middleware injection
Add support for custom middleware, allowing custom middleware list to be passed when initializing DeerFlowClient. These middleware will be injected after the default middleware when creating the agent, extending the agent's functionality.
* feat: inject custom middlewares before ClarificationMiddleware to preserve ordering
- Add `custom_middlewares` param to `_build_middlewares`
- Inject custom middlewares right before `ClarificationMiddleware` to keep it as the last in the chain
- Remove unsafe `.extend()` in `client.py`
- Update tests in `test_client.py` and `test_lead_agent_model_resolution.py` to assert correct injection ordering
Replace all bare print() calls with proper logging using Python's
standard logging module across the deerflow harness package.
Changes across 8 files (16 print statements replaced):
- agents/middlewares/clarification_middleware.py: use logger.info/debug
- agents/middlewares/memory_middleware.py: use logger.debug
- agents/middlewares/thread_data_middleware.py: use logger.debug
- agents/middlewares/view_image_middleware.py: use logger.debug
- agents/memory/queue.py: use logger.info/debug/warning/error
- agents/lead_agent/prompt.py: use logger.error
- skills/loader.py: use logger.warning
- skills/parser.py: use logger.error
Each file follows the established codebase convention:
import logging
logger = logging.getLogger(__name__)
Log levels chosen based on message semantics:
- debug: routine operational details (directory creation, timer resets)
- info: significant state changes (memory queued, updates processed)
- warning: recoverable issues (config load failures, skipped updates)
- error: unexpected failures (parsing errors, memory update errors)
Note: client.py is intentionally excluded as it uses print() for
CLI output, which is the correct behavior for a command-line client.
Co-authored-by: moose-lab <moose-lab@users.noreply.github.com>
* refactor: extract shared utils to break harness→app cross-layer imports
Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: split backend/src into harness (deerflow.*) and app (app.*)
Physically split the monolithic backend/src/ package into two layers:
- **Harness** (`packages/harness/deerflow/`): publishable agent framework
package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
models, MCP, skills, config, and all core infrastructure.
- **App** (`app/`): unpublished application code with import prefix `app.*`.
Contains gateway (FastAPI REST API) and channels (IM integrations).
Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure
Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: add harness→app boundary check test and update docs
Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.
Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add config versioning with auto-upgrade on startup
When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.
- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix comments
* fix: update src.* import in test_sandbox_tools_security to deerflow.*
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: handle empty config and search parent dirs for config.example.yaml
Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
looking next to config.yaml, fixing detection in common setups
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: correct skills root path depth and config_version type coercion
- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
after harness split, file lives at packages/harness/deerflow/skills/
so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
_check_config_version() to prevent TypeError when YAML stores value
as string (e.g. config_version: "1")
- tests: add regression tests for both fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: update test imports from src.* to deerflow.*/app.* after harness refactor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(harness): add tool-first ACP agent invocation (#37)
* feat(harness): add tool-first ACP agent invocation
* build(harness): make ACP dependency required
* fix(harness): address ACP review feedback
* feat(harness): decouple ACP agent workspace from thread data
ACP agents (codex, claude-code) previously used per-thread workspace
directories, causing path resolution complexity and coupling task
execution to DeerFlow's internal thread data layout. This change:
- Replace _resolve_cwd() with a fixed _get_work_dir() that always uses
{base_dir}/acp-workspace/, eliminating virtual path translation and
thread_id lookups
- Introduce /mnt/acp-workspace virtual path for lead agent read-only
access to ACP agent output files (same pattern as /mnt/skills)
- Add security guards: read-only validation, path traversal prevention,
command path allowlisting, and output masking for acp-workspace
- Update system prompt and tool description to guide LLM: send
self-contained tasks to ACP agents, copy results via /mnt/acp-workspace
- Add 11 new security tests for ACP workspace path handling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(prompt): inject ACP section only when ACP agents are configured
The ACP agent guidance in the system prompt is now conditionally built
by _build_acp_section(), which checks get_acp_agents() and returns an
empty string when no ACP agents are configured. This avoids polluting
the prompt with irrelevant instructions for users who don't use ACP.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix lint
* fix(harness): address Copilot review comments on sandbox path handling and ACP tool
- local_sandbox: fix path-segment boundary bug in _resolve_path (== or startswith +"/")
and add lookahead in _resolve_paths_in_command regex to prevent /mnt/skills matching
inside /mnt/skills-extra
- local_sandbox_provider: replace print() with logger.warning(..., exc_info=True)
- invoke_acp_agent_tool: guard getattr(option, "optionId") with None default + continue;
move full prompt from INFO to DEBUG level (truncated to 200 chars)
- sandbox/tools: fix _get_acp_workspace_host_path docstring to match implementation;
remove misleading "read-only" language from validate_local_bash_command_paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(acp): thread-isolated workspaces, permission guardrail, and ContextVar registry
P1.1 – ACP workspace thread isolation
- Add `Paths.acp_workspace_dir(thread_id)` for per-thread paths
- `_get_work_dir(thread_id)` in invoke_acp_agent_tool now uses
`{base_dir}/threads/{thread_id}/acp-workspace/`; falls back to
global workspace when thread_id is absent or invalid
- `_invoke` extracts thread_id from `RunnableConfig` via
`Annotated[RunnableConfig, InjectedToolArg]`
- `sandbox/tools.py`: `_get_acp_workspace_host_path(thread_id)`,
`_resolve_acp_workspace_path(path, thread_id)`, and all callers
(`replace_virtual_paths_in_command`, `mask_local_paths_in_output`,
`ls_tool`, `read_file_tool`) now resolve ACP paths per-thread
P1.2 – ACP permission guardrail
- New `auto_approve_permissions: bool = False` field in `ACPAgentConfig`
- `_build_permission_response(options, *, auto_approve: bool)` now
defaults to deny; only approves when `auto_approve=True`
- Document field in `config.example.yaml`
P2 – Deferred tool registry race condition
- Replace module-level `_registry` global with `contextvars.ContextVar`
- Each asyncio request context gets its own registry; worker threads
inherit the context automatically via `loop.run_in_executor`
- Expose `get_deferred_registry` / `set_deferred_registry` /
`reset_deferred_registry` helpers
Tests: 831 pass (57 for affected modules, 3 new tests)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(sandbox): mount /mnt/acp-workspace in docker sandbox container
The AioSandboxProvider was not mounting the ACP workspace into the
sandbox container, so /mnt/acp-workspace was inaccessible when the lead
agent tried to read ACP results in docker mode.
Changes:
- `ensure_thread_dirs`: also create `acp-workspace/` (chmod 0o777) so
the directory exists before the sandbox container starts — required
for Docker volume mounts
- `_get_thread_mounts`: add read-only `/mnt/acp-workspace` mount using
the per-thread host path (`host_paths.acp_workspace_dir(thread_id)`)
- Update stale CLAUDE.md description (was "fixed global workspace")
Tests: `test_aio_sandbox_provider.py` (4 new tests)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(lint): remove unused imports in test_aio_sandbox_provider
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix config
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add configurable log level and token usage tracking
- Add `log_level` config to control deerflow module log level, synced
to LangGraph Server via serve.sh `--server-log-level`
- Add `token_usage.enabled` config with TokenUsageMiddleware that logs
input/output/total tokens per LLM call from usage_metadata
- Add .omc/ to .gitignore
* fix: use info level for token usage logs since feature has its own toggle
* fix: sort imports to pass lint check
---------
Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* feat(tools): add tool_search for deferred MCP tool loading
When multiple MCP servers are enabled, total tool count can exceed 30-50,
causing context bloat and degraded tool selection accuracy. This adds a
deferred tool loading mechanism controlled by `tool_search.enabled` config.
- Add ToolSearchConfig with single `enabled` field
- Add DeferredToolRegistry with regex search (select:, +keyword, keyword)
- Add tool_search tool returning OpenAI-compatible function JSON
- Add DeferredToolFilterMiddleware to hide deferred schemas from bind_tools
- Add <available-deferred-tools> section to system prompt
- Enable MCP tool_name_prefix to prevent cross-server name collisions
- Add 34 unit tests covering registry, tool, prompt, and middleware
* fix: reset stale deferred registry and bump config_version
- Reset deferred registry upfront in get_available_tools() to prevent
stale tool entries when MCP servers are disabled between calls
- Bump config_version to 2 for new tool_search config field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tests): mock get_app_config in prompt section tests for CI
CI has no config.yaml, causing TestDeferredToolsPromptSection to fail
with FileNotFoundError. Add autouse fixture to mock get_app_config.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add citation/reference support to deep research reports (#1141)
- Enhance lead agent system prompt with mandatory citation requirements
after web_search/web_fetch tool usage
- Add citation examples and best practices to GitHub Deep Research skill
- Add citation hints to report template (Executive Summary, Key Analysis)
- Style regular markdown links in frontend for visual distinction
(color, underline, hover effect)
- Fix TitleMiddleware being registered when title generation is disabled
* fix: address PR review comments
- Revert TitleMiddleware conditional registration (agent.py) to avoid
sync/async incompatibility with DeerFlowClient
- Fix markdown link rendering: merge classNames instead of overwriting,
only set target=_blank for external http(s) URLs
- Remove unrelated package.json/pnpm-lock.yaml changes
* fix: use plain markdown links in Sources section for cleaner rendering
Inline citations in report body use [citation:Title](URL) for pill/badge style.
Sources section uses plain [Title](URL) for simple underlined link style.
* fix(frontend): render plain links as underlined text in artifact markdown
Only links with citation: prefix render as Badge pills.
Regular links in Sources section now render as underlined text links.
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* refactor: extract shared utils to break harness→app cross-layer imports
Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: split backend/src into harness (deerflow.*) and app (app.*)
Physically split the monolithic backend/src/ package into two layers:
- **Harness** (`packages/harness/deerflow/`): publishable agent framework
package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
models, MCP, skills, config, and all core infrastructure.
- **App** (`app/`): unpublished application code with import prefix `app.*`.
Contains gateway (FastAPI REST API) and channels (IM integrations).
Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure
Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: add harness→app boundary check test and update docs
Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.
Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add config versioning with auto-upgrade on startup
When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.
- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix comments
* fix: update src.* import in test_sandbox_tools_security to deerflow.*
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: handle empty config and search parent dirs for config.example.yaml
Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
looking next to config.yaml, fixing detection in common setups
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: correct skills root path depth and config_version type coercion
- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
after harness split, file lives at packages/harness/deerflow/skills/
so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
_check_config_version() to prevent TypeError when YAML stores value
as string (e.g. config_version: "1")
- tests: add regression tests for both fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: update test imports from src.* to deerflow.*/app.* after harness refactor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>