diff --git a/Makefile b/Makefile index b6554fa06..7fdcbc7f4 100644 --- a/Makefile +++ b/Makefile @@ -76,7 +76,7 @@ install: @echo "Installing frontend dependencies..." @cd frontend && pnpm install @echo "Installing pre-commit hooks..." - @$(BACKEND_UV_RUN) --with pre-commit pre-commit install + @$(BACKEND_UV_RUN) --with pre-commit pre-commit install @echo "✓ All dependencies installed" @echo "" @echo "==========================================" diff --git a/SECURITY.md b/SECURITY.md index 1dffa1d8a..459654a0a 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -10,71 +10,3 @@ Currently, we have two branches to maintain: ## Reporting a Vulnerability Please go to https://github.com/bytedance/deer-flow/security to report the vulnerability you find. - -## Sandbox Isolation and the Docker Socket (DooD) - -DeerFlow executes agent-generated shell/code through a configurable sandbox -(`sandbox.use` in `config.yaml`). The isolation guarantees differ by mode, and -one mode requires mounting the host Docker socket. Understand the trade-offs -before exposing an instance to untrusted input. - -| Mode | `config.yaml` | Host Docker socket | Isolation | -|------|---------------|--------------------|-----------| -| `local` (default) | `deerflow.sandbox.local:LocalSandboxProvider` | Not mounted | Commands run **inside the gateway container** on its filesystem. Not a strong boundary — `allow_host_bash` is `false` by default and should stay off for untrusted workloads. | -| `aio` (pure DooD) | `deerflow.community.aio_sandbox:AioSandboxProvider` (no `provisioner_url`) | **Mounted** (opt-in overlay) | Sandbox containers are started via the host Docker daemon. | -| `provisioner` (Kubernetes) | `AioSandboxProvider` + `provisioner_url` | Not mounted | Sandbox pods are created through the provisioner's K8s API over HTTP. Strongest isolation. 
| - -### The Docker socket is host root - -Mounting `/var/run/docker.sock` into a container grants that container -**root-equivalent control of the host**: anything able to reach the socket can -start a new container that bind-mounts the host filesystem and escape. This -matters for DeerFlow because the gateway executes model-generated commands, so a -prompt injection or any in-container code-execution primitive could pivot to the -host through the socket. - -To keep this off the default attack surface: - -- The host Docker socket is **not** mounted by the default Compose stack. It is - added only for `aio` mode through the opt-in `docker/docker-compose.dood.yaml` - overlay, which `scripts/deploy.sh` and `scripts/docker.sh` append - automatically when `detect_sandbox_mode()` returns `aio`. -- Prefer **provisioner/Kubernetes mode** for multi-tenant or internet-exposed - deployments — it isolates sandboxes without handing the gateway the host - daemon. -- If you must use `aio`/DooD, treat the host as part of the gateway's trust - boundary: run it on a dedicated host, and consider a scoped Docker API proxy - instead of the raw socket. - -> Note: the gateway bind-mounts `$HOME/.claude` and `$HOME/.codex` (read-only) -> for CLI auto-auth in **all** modes. These hold long-lived CLI credentials; -> scope or omit them when the gateway runs untrusted workloads. - -## CLI Credential Mounts (Claude Code / Codex) - -DeerFlow can reuse your Claude Code / Codex CLI subscription login as a model -provider (`ClaudeChatModel`, the Codex provider) or for ACP agents that run the -CLI in-container. The Compose stack used to bind-mount the **entire** `~/.claude` -and `~/.codex` directories (read-only) into the gateway container in **every** -configuration — exposing not just credentials but full conversation history, -per-project session data, and global CLI config. A gateway compromise (prompt -injection, tool/MCP misuse, RCE) would leak all of it. 
- -These directories are **no longer mounted by default**. Supply CLI credentials -with the least exposure that fits your setup: - -| Need | How | Exposure | -|------|-----|----------| -| Claude model provider | env `CLAUDE_CODE_OAUTH_TOKEN` / `ANTHROPIC_AUTH_TOKEN` (via `.env`), or `CLAUDE_CODE_CREDENTIALS_PATH` → a single mounted `.credentials.json` | none / one file | -| Codex model provider | env `CODEX_AUTH_PATH` pointing at a single mounted `auth.json` | one file | -| ACP agent | the adapter's own auth — many ACP adapters take an env API key (e.g. `ANTHROPIC_API_KEY` / `OPENAI_API_KEY`) and need no mount; use the opt-in `docker/docker-compose.cli-auth.yaml` overlay only if your adapter reads the full CLI config dir | none / full dir | - -The Gateway credential loader checks environment variables **before** the -default credential files, so the env-token paths need no bind mount at all. ACP -adapters authenticate independently of DeerFlow via their own documented env — -for example the common `claude-code-acp` adapter starts as -`ANTHROPIC_API_KEY=… claude-code-acp` and honors `CLAUDE_CONFIG_DIR` to redirect -its config directory, so it needs no `~/.claude` mount at all. Prefer the -adapter's documented env auth, and reach for the -`docker-compose.cli-auth.yaml` overlay only as a fallback for an adapter that -genuinely reads the full CLI config directory. diff --git a/backend/docs/CONFIGURATION.md b/backend/docs/CONFIGURATION.md index a2e8561f8..cd32b3f7c 100644 --- a/backend/docs/CONFIGURATION.md +++ b/backend/docs/CONFIGURATION.md @@ -434,6 +434,76 @@ DeerFlow searches for configuration in this order: 3. `config.yaml` under `DEER_FLOW_PROJECT_ROOT`, or under the current working directory when `DEER_FLOW_PROJECT_ROOT` is unset 4. 
Legacy backend/repository-root locations for monorepo compatibility +## Security Notes +### Sandbox Isolation and the Docker Socket (DooD) + +DeerFlow executes agent-generated shell/code through a configurable sandbox +(`sandbox.use` in `config.yaml`). The isolation guarantees differ by mode, and +one mode requires mounting the host Docker socket. Understand the trade-offs +before exposing an instance to untrusted input. + +| Mode | `config.yaml` | Host Docker socket | Isolation | +|------|---------------|--------------------|-----------| +| `local` (default) | `deerflow.sandbox.local:LocalSandboxProvider` | Not mounted | Commands run **inside the gateway container** on its filesystem. Not a strong boundary — `allow_host_bash` is `false` by default and should stay off for untrusted workloads. | +| `aio` (pure DooD) | `deerflow.community.aio_sandbox:AioSandboxProvider` (no `provisioner_url`) | **Mounted** (opt-in overlay) | Sandbox containers are started via the host Docker daemon. | +| `provisioner` (Kubernetes) | `AioSandboxProvider` + `provisioner_url` | Not mounted | Sandbox pods are created through the provisioner's K8s API over HTTP. Strongest isolation. | + +#### The Docker socket is host root + +Mounting `/var/run/docker.sock` into a container grants that container +**root-equivalent control of the host**: anything able to reach the socket can +start a new container that bind-mounts the host filesystem and escape. This +matters for DeerFlow because the gateway executes model-generated commands, so a +prompt injection or any in-container code-execution primitive could pivot to the +host through the socket. + +To keep this off the default attack surface: + +- The host Docker socket is **not** mounted by the default Compose stack. It is + added only for `aio` mode through the opt-in `docker/docker-compose.dood.yaml` + overlay, which `scripts/deploy.sh` and `scripts/docker.sh` append + automatically when `detect_sandbox_mode()` returns `aio`. 
+- Prefer **provisioner/Kubernetes mode** for multi-tenant or internet-exposed + deployments — it isolates sandboxes without handing the gateway the host + daemon. +- If you must use `aio`/DooD, treat the host as part of the gateway's trust + boundary: run it on a dedicated host, and consider a scoped Docker API proxy + instead of the raw socket. + +> Note: the gateway no longer bind-mounts `$HOME/.claude` and `$HOME/.codex` by +> default. These hold long-lived CLI credentials; if you opt in to mounting +> them, scope or omit them when the gateway runs untrusted workloads. + +### CLI Credential Mounts (Claude Code / Codex) + +DeerFlow can reuse your Claude Code / Codex CLI subscription login as a model +provider (`ClaudeChatModel`, the Codex provider) or for ACP agents that run the +CLI in-container. The Compose stack used to bind-mount the **entire** `~/.claude` +and `~/.codex` directories (read-only) into the gateway container in **every** +configuration — exposing not just credentials but full conversation history, +per-project session data, and global CLI config. A gateway compromise (prompt +injection, tool/MCP misuse, RCE) would leak all of it. + +These directories are **no longer mounted by default**. Supply CLI credentials +with the least exposure that fits your setup: + +| Need | How | Exposure | +|------|-----|----------| +| Claude model provider | env `CLAUDE_CODE_OAUTH_TOKEN` / `ANTHROPIC_AUTH_TOKEN` (via `.env`), or `CLAUDE_CODE_CREDENTIALS_PATH` → a single mounted `.credentials.json` | none / one file | +| Codex model provider | env `CODEX_AUTH_PATH` pointing at a single mounted `auth.json` | one file | +| ACP agent | the adapter's own auth — many ACP adapters take an env API key (e.g.
`ANTHROPIC_API_KEY` / `OPENAI_API_KEY`) and need no mount; use the opt-in `docker/docker-compose.cli-auth.yaml` overlay only if your adapter reads the full CLI config dir | none / full dir | + +The Gateway credential loader checks environment variables **before** the +default credential files, so the env-token paths need no bind mount at all. ACP +adapters authenticate independently of DeerFlow via their own documented env — +for example the common `claude-code-acp` adapter starts as +`ANTHROPIC_API_KEY=… claude-code-acp` and honors `CLAUDE_CONFIG_DIR` to redirect +its config directory, so it needs no `~/.claude` mount at all. Prefer the +adapter's documented env auth, and reach for the +`docker-compose.cli-auth.yaml` overlay only as a fallback for an adapter that +genuinely reads the full CLI config directory. + + ## Best Practices 1. **Place `config.yaml` in project root** - Set `DEER_FLOW_PROJECT_ROOT` if the runtime starts elsewhere diff --git a/docker/docker-compose-dev.yaml b/docker/docker-compose-dev.yaml index 197d41110..97cbbeea9 100644 --- a/docker/docker-compose-dev.yaml +++ b/docker/docker-compose-dev.yaml @@ -148,14 +148,14 @@ services: - gateway-uv-cache:/root/.cache/uv # DooD: the host Docker socket is NOT mounted by default. It is added only # for aio (pure-DooD) sandbox mode via the opt-in docker-compose.dood.yaml - # overlay (appended by scripts/docker.sh). See SECURITY.md. + # overlay (appended by scripts/docker.sh). See backend/docs/CONFIGURATION.md. # CLI auth dirs (Claude Code / Codex) are NOT mounted by default: they # expose the entire ~/.claude and ~/.codex (history, projects, global # config, credentials) into the container. Mount them only when you use # the Claude/Codex CLI login as a model provider or ACP agent, via the # opt-in docker-compose.cli-auth.yaml overlay. Prefer an env token - # (CLAUDE_CODE_OAUTH_TOKEN, see .env.example / SECURITY.md). + # (CLAUDE_CODE_OAUTH_TOKEN, see .env.example / backend/docs/CONFIGURATION.md).
working_dir: /app environment: - CI=true diff --git a/docker/docker-compose.dood.yaml b/docker/docker-compose.dood.yaml index 85bca9482..d353407c3 100644 --- a/docker/docker-compose.dood.yaml +++ b/docker/docker-compose.dood.yaml @@ -10,7 +10,8 @@ # root-equivalent control of the host. Only load this overlay when you have # explicitly chosen aio (DooD) sandbox mode and accept that trade-off. The # default LocalSandboxProvider and the provisioner/Kubernetes mode do NOT need -# it and never load this file. See SECURITY.md for the full threat model. +# it and never load this file. See backend/docs/CONFIGURATION.md Security Notes +# section for the full threat model. # # scripts/deploy.sh and scripts/docker.sh append this overlay automatically # only when detect_sandbox_mode() returns "aio". Manual use: diff --git a/docker/docker-compose.yaml b/docker/docker-compose.yaml index 23191ec7a..7e3096e73 100644 --- a/docker/docker-compose.yaml +++ b/docker/docker-compose.yaml @@ -86,14 +86,15 @@ services: - ${DEER_FLOW_HOME}:/app/backend/.deer-flow # DooD: the host Docker socket is NOT mounted by default. It is added only # for aio (pure-DooD) sandbox mode via the opt-in docker-compose.dood.yaml - # overlay (appended by scripts/deploy.sh). See SECURITY.md. + # overlay (appended by scripts/deploy.sh). See backend/docs/CONFIGURATION.md + # Security Notes section for details. # CLI auth dirs (Claude Code / Codex) are NOT mounted by default: they # expose the entire ~/.claude and ~/.codex (history, projects, global # config, credentials) into the container. Mount them only when you use # the Claude/Codex CLI login as a model provider or ACP agent, via the # opt-in docker-compose.cli-auth.yaml overlay. Prefer an env token - # (CLAUDE_CODE_OAUTH_TOKEN, see .env.example / SECURITY.md). + # (CLAUDE_CODE_OAUTH_TOKEN, see .env.example / backend/docs/CONFIGURATION.md). working_dir: /app environment: - CI=true