furyhawk/deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-07-20 09:45:47 +00:00

DeerFlow Backend

DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent memory, and extensible tool integration. The backend enables AI agents to execute code, browse the web, manage files, delegate tasks to subagents, and retain context across conversations - all in isolated, per-thread environments.

Architecture

                        ┌──────────────────────────────────────┐
                        │          Nginx (Port 2026)           │
                        │      Unified reverse proxy           │
                        └───────┬──────────────────────────────┘
                                │
            /api/langgraph/*    │    /api/* (other)
            rewritten to /api/* │
                                ▼
               ┌────────────────────────────────────────┐
               │        Gateway API (8001)              │
               │        FastAPI REST + agent runtime    │
               │                                        │
               │ Models, MCP, Skills, Memory, Uploads,  │
               │ Artifacts, Threads, Runs, Streaming    │
               │                                        │
               │ ┌────────────────────────────────────┐ │
               │ │ Lead Agent                         │ │
               │ │ Middleware Chain, Tools, Subagents │ │
               │ └────────────────────────────────────┘ │
               └────────────────────────────────────────┘

Request Routing (via Nginx):

/api/langgraph/* → Gateway LangGraph-compatible API - agent interactions, threads, streaming
/api/* (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
/ (non-API) → Frontend - Next.js web interface

Core Components

Lead Agent

The single LangGraph agent (lead_agent) is the runtime entry point, created via make_lead_agent(config). It combines:

Dynamic model selection with thinking and vision support
Middleware chain for cross-cutting concerns (9 middlewares)
Tool system with sandbox, MCP, community, and built-in tools
Subagent delegation for parallel task execution
System prompt with skills injection, memory context, and working directory guidance

Middleware Chain

Middlewares execute in strict order, each handling a specific concern:

#	Middleware	Purpose
1	ThreadDataMiddleware	Creates per-thread isolated directories (workspace, uploads, outputs)
2	UploadsMiddleware	Injects newly uploaded files into conversation context
3	SandboxMiddleware	Acquires sandbox environment for code execution
4	SummarizationMiddleware	Reduces context when approaching token limits (optional)
5	TodoListMiddleware	Tracks multi-step tasks in plan mode (optional)
6	TitleMiddleware	Auto-generates conversation titles after first exchange
7	MemoryMiddleware	Queues conversations for async memory extraction
8	ViewImageMiddleware	Injects image data for vision-capable models (conditional)
9	ClarificationMiddleware	Intercepts clarification requests and interrupts execution (must be last)
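
The ordering in the table can be sketched as a plain list that is assembled once and run front to back. This is a minimal illustration, not DeerFlow's implementation: `recording`, `chain_order`, `run_chain`, and the three boolean flags are hypothetical names, and the stand-in middlewares only record that they ran.

```python
from typing import Callable

State = dict
Middleware = Callable[[State], State]


def recording(name: str) -> Middleware:
    """Return a stand-in middleware that only records that it ran."""
    def run(state: State) -> State:
        state.setdefault("trace", []).append(name)
        return state
    return run


def chain_order(summarize: bool, plan_mode: bool, vision: bool) -> list[str]:
    """Middleware names in table order; the flags are hypothetical switches."""
    names = ["ThreadDataMiddleware", "UploadsMiddleware", "SandboxMiddleware"]
    if summarize:
        names.append("SummarizationMiddleware")   # optional
    if plan_mode:
        names.append("TodoListMiddleware")        # optional
    names += ["TitleMiddleware", "MemoryMiddleware"]
    if vision:
        names.append("ViewImageMiddleware")       # conditional
    names.append("ClarificationMiddleware")       # must be last
    return names


def run_chain(names: list[str], state: State) -> State:
    """Run every middleware in list order, threading the state through."""
    for name in names:
        state = recording(name)(state)
    return state


state = run_chain(chain_order(summarize=False, plan_mode=False, vision=True), {})
print(state["trace"][-1])  # ClarificationMiddleware
```

Appending the clarification step after all conditional entries keeps it last regardless of which optional middlewares are enabled.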

Sandbox System

Per-thread isolated execution with virtual path translation:

Abstract interface: execute_command, read_file, write_file, list_dir
Providers: LocalSandboxProvider (filesystem) and AioSandboxProvider (Docker, in community/). Async runtime paths use async sandbox lifecycle hooks so startup, readiness polling, and release do not block the event loop. AioSandboxProvider validates active-cache and warm-pool containers during acquire/reuse, dropping definitively dead entries so a thread can provision a fresh sandbox after an unexpected container exit while keeping get() as an in-memory lookup. Backend health-check failures are treated as unknown, not dead, and a container that cannot be verified during discovery is simply not adopted (acquire falls through to create instead of failing).
Virtual paths: /mnt/user-data/{workspace,uploads,outputs} → thread-specific physical directories
Skills path: /mnt/skills → deer-flow/skills/ directory
Skills loading: Recursively discovers nested SKILL.md files under skills/{public,custom} and preserves nested container paths
SkillScan: Native offline deterministic scanning runs before the LLM skill scanner on installs and agent-managed skill writes; CRITICAL findings block the operation, and warning findings are passed to the LLM scanner as context
File-write safety: str_replace serializes read-modify-write per (sandbox.id, path) so isolated sandboxes keep concurrency even when virtual paths match
Tools: bash, ls, read_file, write_file, str_replace (write_file overwrites by default and exposes append for end-of-file writes; bash is disabled by default when using LocalSandboxProvider; use AioSandboxProvider for isolated shell access)
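
The virtual path mapping described above can be pictured as a small translation function. The sketch below is an assumption about the general shape only: `to_physical`, the `thread_root` argument, and the rejection rules are illustrative and are not taken from the DeerFlow source.

```python
from pathlib import Path, PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")
AREAS = {"workspace", "uploads", "outputs"}


def to_physical(virtual_path: str, thread_root: Path) -> Path:
    """Map /mnt/user-data/<area>/... onto <thread_root>/<area>/..."""
    try:
        relative = PurePosixPath(virtual_path).relative_to(VIRTUAL_ROOT)
    except ValueError:
        raise ValueError(f"not under {VIRTUAL_ROOT}: {virtual_path}") from None
    parts = relative.parts
    if not parts or parts[0] not in AREAS:
        raise ValueError(f"unknown area in {virtual_path}")
    if ".." in parts:
        # Refuse parent traversal so a path cannot escape the thread directory.
        raise ValueError(f"parent traversal is not allowed: {virtual_path}")
    return thread_root.joinpath(*parts)


thread_root = Path("/srv/threads/thread-123")  # hypothetical physical layout
print(to_physical("/mnt/user-data/outputs/report.md", thread_root))
```

Because each thread gets its own `thread_root`, two threads can use identical virtual paths without touching the same file.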

Subagent System

Async task delegation with concurrent execution:

Built-in agents: general-purpose (full toolset) and bash (command specialist, exposed only when shell access is available)
Concurrency: Max 3 subagents per turn, 15-minute timeout
Execution: Background thread pools with status tracking and SSE events
Flow: Agent calls task() tool → executor runs subagent in background → polls for completion → returns result
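
A rough sketch of the delegation flow above, using a standard-library thread pool. The limits come from the list above; `run_subagent` and `delegate` are hypothetical stand-ins, and the real executor also tracks status and emits SSE events, which this sketch leaves out.

```python
import concurrent.futures as cf
import time

MAX_SUBAGENTS_PER_TURN = 3     # limit stated above
SUBAGENT_TIMEOUT_S = 15 * 60   # 15-minute timeout stated above


def run_subagent(prompt: str) -> str:
    """Stand-in for a real subagent run."""
    time.sleep(0.01)
    return f"done: {prompt}"


def delegate(prompts: list[str], timeout_s: float = SUBAGENT_TIMEOUT_S) -> dict[str, str]:
    """Run up to three subagents concurrently and collect their results."""
    if len(prompts) > MAX_SUBAGENTS_PER_TURN:
        raise ValueError(f"at most {MAX_SUBAGENTS_PER_TURN} subagents per turn")
    results: dict[str, str] = {}
    with cf.ThreadPoolExecutor(max_workers=MAX_SUBAGENTS_PER_TURN) as pool:
        futures = {pool.submit(run_subagent, p): p for p in prompts}
        # as_completed raises TimeoutError if the batch exceeds timeout_s.
        for future in cf.as_completed(futures, timeout=timeout_s):
            results[futures[future]] = future.result()
    return results


print(delegate(["summarize docs", "list open issues"]))
```

Note that a thread pool cannot forcibly stop a running thread, so a production executor needs its own cancellation signal in addition to the timeout.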

Memory System

LLM-powered persistent context retention across conversations:

Automatic extraction: Analyzes conversations for user context, facts, and preferences
Structured storage: User context (work, personal, top-of-mind), history, and confidence-scored facts
Debounced updates: Batches updates to minimize LLM calls (configurable wait time)
System prompt injection: Top facts + context injected into agent prompts
Run-level memory identity: GET /api/threads/{thread_id}/runs/{run_id}/events?event_types=context:memory returns the SHA-256 identity of the effective hidden memory block without copying memory text into the event store
Storage: JSON file with mtime-based cache invalidation
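
The mtime-based cache invalidation mentioned above can be illustrated with a small reader that re-parses the JSON file only when its modification time changes. `MtimeCachedJson` and the `memory.json` file name are hypothetical; the actual storage class may differ.

```python
import json
import os
import tempfile
from pathlib import Path


class MtimeCachedJson:
    """Re-read a JSON file only when its modification time has changed."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._mtime_ns: int | None = None
        self._data: dict = {}
        self.loads = 0  # number of times the file was actually parsed

    def get(self) -> dict:
        mtime_ns = self._path.stat().st_mtime_ns
        if mtime_ns != self._mtime_ns:
            self._data = json.loads(self._path.read_text(encoding="utf-8"))
            self._mtime_ns = mtime_ns
            self.loads += 1
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memory.json"  # hypothetical file name
    path.write_text(json.dumps({"facts": []}), encoding="utf-8")
    cache = MtimeCachedJson(path)
    cache.get()
    cache.get()
    print(cache.loads)  # 1: the second read is served from the cache

    path.write_text(json.dumps({"facts": ["prefers tea"]}), encoding="utf-8")
    stat = path.stat()
    # Bump the mtime explicitly so the change is visible on coarse-grained filesystems.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    print(cache.get())
```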

Tool Ecosystem

Category	Tools
Sandbox	`bash`, `ls`, `read_file`, `write_file`, `str_replace`
Built-in	`present_files`, `ask_clarification`, `view_image`, `task` (subagent)
Community	Tavily (web search), Jina AI (web fetch), Crawl4AI (web fetch), Firecrawl (scraping), fastCRW (scraping), DuckDuckGo (image search)
MCP	Any Model Context Protocol server (stdio, SSE, HTTP transports)
Skills	Domain-specific workflows injected via system prompt

Gateway API

FastAPI application providing REST endpoints for frontend integration:

Route	Purpose
`GET /api/models`	List available LLM models
`GET/PUT /api/mcp/config`	Manage MCP server configurations
`POST /api/mcp/cache/reset`	Reset cached MCP tools so they reload on next use
`GET/PUT /api/skills`	List and manage skills
`POST /api/skills/install`	Install skill from `.skill` archive
`GET /api/memory`	Retrieve memory data
`POST /api/memory/reload`	Force memory reload
`GET /api/memory/config`	Memory configuration
`GET /api/memory/status`	Combined config + data
`GET /api/threads/{id}/runs/{run_id}/events`	Debug/audit events for one run; filter `event_types=context:memory` for effective memory identity
`POST /api/threads/{id}/uploads`	Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths, auto-renames duplicate filenames in one request)
`GET /api/threads/{id}/uploads/list`	List uploaded files
`DELETE /api/threads/{id}`	Delete DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail
`GET /api/threads/{id}/artifacts/{path}`	Serve generated artifacts

IM Channels

The IM bridge supports Feishu, Slack, and Telegram. Slack and Telegram still use the final runs.wait() response path, while Feishu now streams through runs.stream(["messages-tuple", "values"]), serializes rapid same-thread turns inside the channel manager, and updates a single in-thread card per source message in place.

For Feishu card updates, DeerFlow stores the running card's message_id per inbound message and patches that same card until the run finishes, preserving the existing OK / DONE reaction flow. When a follow-up arrives inside an existing Feishu topic while another turn is still running, the later message now waits on the mapped DeerFlow thread_id, receives a queued/running card on that exact source message, and keeps a compact source-message blockquote in subsequent patches so rapid consecutive questions remain distinguishable.

Quick Start

Prerequisites

Python 3.12+
uv package manager
API keys for your chosen LLM provider

Installation

cd deer-flow

# Copy configuration files
cp config.example.yaml config.yaml

# Install backend dependencies
cd backend
make install

Configuration

Edit config.yaml in the project root:

models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
    supports_thinking: false
    supports_vision: true

  - name: gpt-5-responses
    display_name: GPT-5 (Responses API)
    use: langchain_openai:ChatOpenAI
    model: gpt-5
    api_key: $OPENAI_API_KEY
    use_responses_api: true
    output_version: responses/v1
    supports_vision: true

Set your API keys:

export OPENAI_API_KEY="your-api-key-here"

Running

Full Application (from project root):

make dev  # Starts Gateway + Frontend + Nginx

Access at: http://localhost:2026

Backend Only (from backend directory):

# Gateway API + embedded agent runtime
make dev

Direct access: Gateway at http://localhost:8001

Terminal Workbench (TUI) — a terminal-native UI over the embedded harness, no services required:

uv pip install 'deerflow-harness[tui]'   # optional 'textual' dependency
deerflow                                 # launch the TUI
deerflow --print "summarize this repo"   # headless one-shot

Sessions opened in the TUI appear in the Web UI sidebar, because the TUI writes to the shared threads_meta store under the local default user. See docs/TUI.md.

Project Structure

backend/
├── packages/harness/           # deerflow-harness package (import: deerflow.*)
│   └── deerflow/
│       ├── agents/             # Agent system
│       │   ├── lead_agent/     # Main agent (factory, prompts)
│       │   ├── middlewares/    # Middleware components
│       │   ├── memory/         # Memory extraction & storage
│       │   └── thread_state.py # ThreadState schema
│       ├── sandbox/            # Sandbox execution
│       │   ├── local/          # Local filesystem provider
│       │   ├── sandbox.py      # Abstract interface
│       │   ├── tools.py        # bash, ls, read/write/str_replace
│       │   └── middleware.py   # Sandbox lifecycle
│       ├── subagents/          # Subagent delegation
│       │   ├── builtins/       # general-purpose, bash agents
│       │   ├── executor.py     # Background execution engine
│       │   └── registry.py     # Agent registry
│       ├── tools/builtins/     # Built-in tools
│       ├── mcp/                # MCP protocol integration
│       ├── models/             # Model factory
│       ├── skills/             # Skill discovery & loading
│       ├── config/             # Configuration system
│       ├── runtime/            # Embedded run execution (RunManager, StreamBridge)
│       ├── persistence/        # Checkpointer/store engines & schema migrations
│       ├── guardrails/         # Pre-tool-call authorization providers
│       ├── tracing/            # Tracer factory & trace metadata
│       ├── uploads/            # Uploads manager
│       ├── tui/                # Terminal UI (`deerflow` console script)
│       ├── community/          # Community tools & providers
│       ├── reflection/         # Dynamic module loading
│       └── utils/              # Utilities
├── app/                        # FastAPI Gateway + IM channels (import: app.*)
│   ├── gateway/                # Gateway API
│   │   ├── app.py              # Application setup
│   │   └── routers/            # Route modules
│   └── channels/               # IM channel integrations
├── docs/                       # Documentation
├── tests/                      # Test suite
├── langgraph.json              # LangGraph graph registry for tooling/Studio compatibility
├── pyproject.toml              # Python dependencies
├── Makefile                    # Development commands
└── Dockerfile                  # Container build

langgraph.json is not the default service entrypoint. The scripts and Docker deployments run the Gateway embedded runtime; the file is kept for LangGraph tooling, Studio, or direct LangGraph Server compatibility.

Configuration

Main Configuration (`config.yaml`)

Place in project root. Config values starting with $ resolve as environment variables.

Key sections:

models - LLM configurations with class paths, API keys, thinking/vision flags
tools - Tool definitions with module paths and groups
tool_groups - Logical tool groupings
sandbox - Execution environment provider
skills - Skills directory paths
title - Auto-title generation settings
summarization - Context summarization settings
subagents - Subagent system (enabled/disabled)
memory - Memory system settings (enabled, storage, debounce, facts limits)

Provider note:

models[*].use references provider classes by module path (for example langchain_openai:ChatOpenAI).
If a provider module is missing, DeerFlow now returns an actionable error with install guidance (for example uv add langchain-google-genai).
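
A `module:Class` reference like the one above can be resolved with `importlib`. The sketch below is illustrative only: `load_class` is a hypothetical helper, and deriving the package name by swapping underscores for hyphens is a guess that happens to match the `langchain-google-genai` example.

```python
import importlib


def load_class(path: str) -> type:
    """Resolve a "module.path:ClassName" reference to the class object."""
    module_name, _, attr = path.partition(":")
    if not module_name or not attr:
        raise ValueError(f"expected 'module:Class', got {path!r}")
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Assumption: the distribution name mirrors the top-level module name.
        package = module_name.split(".")[0].replace("_", "-")
        raise ImportError(
            f"Cannot import {module_name!r}. Try installing it: uv add {package}"
        ) from exc
    return getattr(module, attr)


print(load_class("json:JSONDecoder").__name__)  # JSONDecoder
```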

Extensions Configuration (`extensions_config.json`)

MCP servers and skill states in a single file:

{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}
    },
    "secure-http": {
      "enabled": true,
      "type": "http",
      "url": "https://api.example.com/mcp",
      "oauth": {
        "enabled": true,
        "token_url": "https://auth.example.com/oauth/token",
        "grant_type": "client_credentials",
        "client_id": "$MCP_OAUTH_CLIENT_ID",
        "client_secret": "$MCP_OAUTH_CLIENT_SECRET"
      }
    },
    "postgres": {
      "enabled": false,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"],
      "description": "PostgreSQL database access",
      "routing": {
        "mode": "prefer",
        "priority": 50,
        "keywords": ["orders", "users", "SQL", "database", "table"]
      },
      "tools": {
        "query": {
          "routing": {
            "priority": 100,
            "keywords": ["query database", "orders table", "metrics"]
          }
        }
      }
    }
  },
  "skills": {
    "pdf-processing": {"enabled": true}
  }
}

routing adds soft MCP preference hints to the agent prompt. It helps the model prefer a configured MCP tool for matching requests without forbidding other tools. When MCP schemas are deferred (tool_search.enabled=true), matching routing metadata can auto-promote up to tool_search.auto_promote_top_k deferred schemas before the model call.
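
One way to picture auto-promotion is a keyword match followed by a priority sort. Everything in this sketch is an assumption made for illustration: `RoutingHint`, `auto_promote`, the substring matching, and the tie-breaking are not taken from DeerFlow, whose actual scoring may differ.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingHint:
    tool: str
    priority: int = 0
    keywords: list[str] = field(default_factory=list)


def auto_promote(request: str, hints: list[RoutingHint], top_k: int) -> list[str]:
    """Pick up to top_k tools whose keywords appear in the request."""
    text = request.lower()
    matched = [h for h in hints if any(k.lower() in text for k in h.keywords)]
    # Higher priority first; sorted() is stable, so ties keep config order.
    ranked = sorted(matched, key=lambda h: h.priority, reverse=True)
    return [h.tool for h in ranked[:top_k]]


hints = [
    RoutingHint("postgres", 50, ["orders", "users", "SQL", "database", "table"]),
    RoutingHint("postgres.query", 100, ["query database", "orders table", "metrics"]),
    RoutingHint("github", 10, ["pull request", "issue"]),
]
print(auto_promote("Show metrics from the orders table", hints, top_k=1))
# ['postgres.query']
```

The hints stay soft in this model: tools that do not match are simply not promoted, and nothing is removed from the agent's toolset.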

Environment Variables

DEER_FLOW_CONFIG_PATH - Override config.yaml location
DEER_FLOW_EXTENSIONS_CONFIG_PATH - Override extensions_config.json location
Model API keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY, etc.
Tool API keys: TAVILY_API_KEY, GITHUB_TOKEN, etc.

LangSmith Tracing

DeerFlow has built-in LangSmith integration for observability. When enabled, all LLM calls, agent runs, tool executions, and middleware processing are traced and visible in the LangSmith dashboard.

Setup:

Sign up at smith.langchain.com and create a project.
Add the following to your .env file in the project root:

LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=lsv2_pt_xxxxxxxxxxxxxxxx
LANGSMITH_PROJECT=xxx

Legacy variables: The LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT, and LANGCHAIN_ENDPOINT variables are also supported for backward compatibility. LANGSMITH_* variables take precedence when both are set.

Langfuse Tracing

DeerFlow also supports Langfuse observability for LangChain-compatible runs.

Add the following to your .env file:

LANGFUSE_TRACING=true
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_BASE_URL=https://cloud.langfuse.com

If you are using a self-hosted Langfuse deployment, set LANGFUSE_BASE_URL to your Langfuse host.

Dual Provider Behavior

If both LangSmith and Langfuse are enabled, DeerFlow initializes and attaches both callbacks so the same run data is reported to both systems.

If a provider is explicitly enabled but required credentials are missing, or the provider callback cannot be initialized, DeerFlow raises an error when tracing is initialized during model creation instead of silently disabling tracing.
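
The fail-fast behavior described above might look like the following check. It is a sketch under assumptions: `enabled_providers` is a hypothetical function, and the list of required credentials per provider is inferred from the `.env` examples in this README rather than from the source.

```python
from collections.abc import Mapping

# Assumed requirements: enable flag plus the credentials shown in the .env examples.
REQUIRED = {
    "langsmith": ("LANGSMITH_TRACING", ["LANGSMITH_API_KEY"]),
    "langfuse": ("LANGFUSE_TRACING", ["LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY"]),
}


def enabled_providers(env: Mapping[str, str]) -> list[str]:
    """Return enabled tracing providers, raising if one is misconfigured."""
    enabled = []
    for name, (flag, required) in REQUIRED.items():
        if env.get(flag, "").lower() != "true":
            continue  # provider not explicitly enabled
        missing = [var for var in required if not env.get(var)]
        if missing:
            # Raise instead of silently disabling tracing.
            raise RuntimeError(f"{name} tracing is enabled but missing: {', '.join(missing)}")
        enabled.append(name)
    return enabled


env = {
    "LANGSMITH_TRACING": "true",
    "LANGSMITH_API_KEY": "lsv2_pt_example",
    "LANGFUSE_TRACING": "true",
    "LANGFUSE_PUBLIC_KEY": "pk-lf-example",
    "LANGFUSE_SECRET_KEY": "sk-lf-example",
}
print(enabled_providers(env))  # ['langsmith', 'langfuse']
```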

Docker: In docker-compose.yaml, tracing is disabled by default (LANGSMITH_TRACING=false). Set LANGSMITH_TRACING=true and/or LANGFUSE_TRACING=true in your .env, together with the required credentials, to enable tracing in containerized deployments.

Development

Commands

make install    # Install dependencies
make dev        # Run Gateway API + embedded agent runtime (port 8001)
make gateway    # Run Gateway API without reload (port 8001)
make lint       # Run linter (ruff)
make format     # Format code (ruff)
make detect-blocking-io  # Inventory blocking IO that may block the backend event loop
make migrate-rev MSG="..."  # Autogenerate a new alembic revision against the live ORM models

Schema Migrations

DeerFlow's application tables (runs, threads_meta, feedback, users, run_events, and the channel_* tables) are owned by alembic. The Gateway runs alembic upgrade head automatically on startup via bootstrap_schema(engine, backend=...), so operators do not run alembic manually in production. Bootstrap is concurrency-safe (Postgres advisory lock across processes; per-engine asyncio.Lock inside one SQLite process) and idempotent against pre-existing schemas (empty / legacy / versioned).

When you add or change an ORM model, ship the change as a new revision under packages/harness/deerflow/persistence/migrations/versions/:

make migrate-rev MSG="add foo column to runs"

The target invokes scripts/_autogen_revision.py, which builds a fresh temp SQLite at head and diffs the live models against it — so a clean checkout does not need a pre-existing ./data/deerflow.db. Review the generated file and switch raw op.add_column / op.drop_column calls to the idempotent helpers in migrations/_helpers.py before committing. There is no make migrate / make migrate-stamp target on purpose — Gateway startup is the only execution path, which keeps operational mistakes off the table. See backend/CLAUDE.md (Schema Migrations) for the full design.
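
To show what "idempotent" means for a column addition, here is a standalone sketch using the standard-library `sqlite3` module. It is not the helper from migrations/_helpers.py; `add_column_if_missing` is a hypothetical name, and a real alembic revision would use the project's own helpers.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, ddl_type: str) -> bool:
    """Add a column only if it is absent; return True when a change was made."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) rows.
    # Identifiers are interpolated directly, so pass trusted names only.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY)")
print(add_column_if_missing(conn, "runs", "foo", "TEXT"))  # True: column added
print(add_column_if_missing(conn, "runs", "foo", "TEXT"))  # False: already present
```

Running the same step twice leaves the schema unchanged, which is the property that lets startup bootstrap tolerate pre-existing schemas.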

Code Style

Linter/Formatter: ruff
Line length: 240 characters
Python: 3.12+ with type hints
Quotes: Double quotes
Indentation: 4 spaces

Testing

uv run pytest

make detect-blocking-io statically scans backend business code for blocking IO that may run on the backend event loop. Because the scan is static, it does not depend on test coverage. It prints a concise summary for human review and writes complete JSON findings to .deer-flow/blocking-io-findings.json at the repository root (regardless of whether the target is invoked from the repo root or from backend/). JSON findings include both a broad IO category and review-oriented fields such as priority, location, blocking_call, event_loop_exposure, reason, and code. priority is a deterministic review ordering derived from the operation type, not proof of a bug. Bare-name same-file calls are resolved by function name, so duplicate helper names in one file can conservatively over-report async reachability.

Technology Stack

LangGraph (1.0.6+) - Agent framework and multi-agent orchestration
LangChain (1.2.3+) - LLM abstractions and tool system
FastAPI (0.115.0+) - Gateway REST API
langchain-mcp-adapters - Model Context Protocol support
agent-sandbox - Sandboxed code execution
markitdown - Multi-format document conversion
tavily-python / firecrawl-py - Web search and scraping
