Merge branch 'main' of https://github.com/bytedance/deer-flow into rayhpeng/storage-package-base
This commit is contained in:
+4
-4
@@ -185,9 +185,9 @@ If you need to start services individually:
|
|||||||
|
|
||||||
1. **Start backend service**:
|
1. **Start backend service**:
|
||||||
```bash
|
```bash
|
||||||
# Terminal 1: Start Gateway API and embedded LangGraph-compatible runtime (port 8001)
|
# Terminal 1: Start Gateway API + embedded agent runtime (port 8001)
|
||||||
cd backend
|
cd backend
|
||||||
make gateway
|
make dev
|
||||||
|
|
||||||
# Terminal 2: Start Frontend (port 3000)
|
# Terminal 2: Start Frontend (port 3000)
|
||||||
cd frontend
|
cd frontend
|
||||||
@@ -207,7 +207,7 @@ If you need to start services individually:
|
|||||||
|
|
||||||
The nginx configuration provides:
|
The nginx configuration provides:
|
||||||
- Unified entry point on port 2026
|
- Unified entry point on port 2026
|
||||||
- Gateway owns `/api/langgraph/*` and translates those public LangGraph-compatible paths to its native `/api/*` routers behind nginx
|
- Rewrites `/api/langgraph/*` to Gateway's LangGraph-compatible API (8001)
|
||||||
- Routes other `/api/*` endpoints to Gateway API (8001)
|
- Routes other `/api/*` endpoints to Gateway API (8001)
|
||||||
- Routes non-API requests to Frontend (3000)
|
- Routes non-API requests to Frontend (3000)
|
||||||
- Same-origin API routing; split-origin or port-forwarded browser clients should use the Gateway `GATEWAY_CORS_ORIGINS` allowlist
|
- Same-origin API routing; split-origin or port-forwarded browser clients should use the Gateway `GATEWAY_CORS_ORIGINS` allowlist
|
||||||
@@ -231,7 +231,7 @@ deer-flow/
|
|||||||
├── backend/ # Backend application
|
├── backend/ # Backend application
|
||||||
│ ├── src/
|
│ ├── src/
|
||||||
│ │ ├── gateway/ # Gateway API and LangGraph-compatible runtime (port 8001)
|
│ │ ├── gateway/ # Gateway API and LangGraph-compatible runtime (port 8001)
|
||||||
│ │ ├── agents/ # LangGraph agent definitions
|
│ │ ├── agents/ # LangGraph agent runtime used by Gateway
|
||||||
│ │ ├── mcp/ # Model Context Protocol integration
|
│ │ ├── mcp/ # Model Context Protocol integration
|
||||||
│ │ ├── skills/ # Skills system
|
│ │ ├── skills/ # Skills system
|
||||||
│ │ └── sandbox/ # Sandbox execution
|
│ │ └── sandbox/ # Sandbox execution
|
||||||
|
|||||||
@@ -628,7 +628,7 @@ See [`skills/public/claude-to-deerflow/SKILL.md`](skills/public/claude-to-deerfl
|
|||||||
|
|
||||||
Complex tasks rarely fit in a single pass. DeerFlow decomposes them.
|
Complex tasks rarely fit in a single pass. DeerFlow decomposes them.
|
||||||
|
|
||||||
The lead agent can spawn sub-agents on the fly — each with its own scoped context, tools, and termination conditions. Sub-agents run in parallel when possible, report back structured results, and the lead agent synthesizes everything into a coherent output.
|
The lead agent can spawn sub-agents on the fly — each with its own scoped context, tools, and termination conditions. Sub-agents run in parallel when possible, report back structured results, and the lead agent synthesizes everything into a coherent output. When token usage tracking is enabled, completed sub-agent usage is attributed back to the dispatching step.
|
||||||
|
|
||||||
This is how DeerFlow handles tasks that take minutes to hours: a research task might fan out into a dozen sub-agents, each exploring a different angle, then converge into a single report — or a website — or a slide deck with generated visuals. One harness, many hands.
|
This is how DeerFlow handles tasks that take minutes to hours: a research task might fan out into a dozen sub-agents, each exploring a different angle, then converge into a single report — or a website — or a slide deck with generated visuals. One harness, many hands.
|
||||||
|
|
||||||
|
|||||||
+3
-3
@@ -228,7 +228,7 @@ make down # Stop and remove containers
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> Le serveur d'agents LangGraph fonctionne actuellement via `langgraph dev` (le serveur CLI open source).
|
> Le runtime d'agent s'exécute actuellement dans la Gateway. nginx réécrit `/api/langgraph/*` vers l'API compatible LangGraph servie par la Gateway.
|
||||||
|
|
||||||
Accès : http://localhost:2026
|
Accès : http://localhost:2026
|
||||||
|
|
||||||
@@ -296,8 +296,8 @@ DeerFlow peut recevoir des tâches depuis des applications de messagerie. Les ca
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraph Server URL (default: http://localhost:2024)
|
# LangGraph-compatible Gateway API base URL (default: http://localhost:8001/api)
|
||||||
langgraph_url: http://localhost:2024
|
langgraph_url: http://localhost:8001/api
|
||||||
# Gateway API URL (default: http://localhost:8001)
|
# Gateway API URL (default: http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
+3
-3
@@ -181,7 +181,7 @@ make down # コンテナを停止して削除
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> LangGraphエージェントサーバーは現在`langgraph dev`(オープンソースCLIサーバー)経由で実行されます。
|
> Agentランタイムは現在Gateway内で実行されます。`/api/langgraph/*`はnginxによってGatewayのLangGraph-compatible APIへ書き換えられます。
|
||||||
|
|
||||||
アクセス: http://localhost:2026
|
アクセス: http://localhost:2026
|
||||||
|
|
||||||
@@ -249,8 +249,8 @@ DeerFlowはメッセージングアプリからのタスク受信をサポート
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraphサーバーURL(デフォルト: http://localhost:2024)
|
# LangGraph-compatible Gateway API base URL(デフォルト: http://localhost:8001/api)
|
||||||
langgraph_url: http://localhost:2024
|
langgraph_url: http://localhost:8001/api
|
||||||
# Gateway API URL(デフォルト: http://localhost:8001)
|
# Gateway API URL(デフォルト: http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
+3
-3
@@ -184,7 +184,7 @@ make down # 停止并移除容器
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> 当前 LangGraph agent server 通过开源 CLI 服务 `langgraph dev` 运行。
|
> 当前 Agent 运行时嵌入在 Gateway 中运行,`/api/langgraph/*` 会由 nginx 重写到 Gateway 的 LangGraph-compatible API。
|
||||||
|
|
||||||
访问地址:http://localhost:2026
|
访问地址:http://localhost:2026
|
||||||
|
|
||||||
@@ -254,8 +254,8 @@ DeerFlow 支持从即时通讯应用接收任务。只要配置完成,对应
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraph Server URL(默认:http://localhost:2024)
|
# LangGraph-compatible Gateway API base URL(默认:http://localhost:8001/api)
|
||||||
langgraph_url: http://localhost:2024
|
langgraph_url: http://localhost:8001/api
|
||||||
# Gateway API URL(默认:http://localhost:8001)
|
# Gateway API URL(默认:http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
+1
-1
@@ -165,7 +165,7 @@ Lead-agent middlewares are assembled in strict append order across `packages/har
|
|||||||
8. **ToolErrorHandlingMiddleware** - Converts tool exceptions into error `ToolMessage`s so the run can continue instead of aborting
|
8. **ToolErrorHandlingMiddleware** - Converts tool exceptions into error `ToolMessage`s so the run can continue instead of aborting
|
||||||
9. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
|
9. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
|
||||||
10. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
|
10. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
|
||||||
11. **TokenUsageMiddleware** - Records token usage metrics when token tracking is enabled (optional)
|
11. **TokenUsageMiddleware** - Records token usage metrics when token tracking is enabled (optional); subagent usage is cached by `tool_call_id` only while token usage is enabled and merged back into the dispatching AIMessage by message position rather than message id
|
||||||
12. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
|
12. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
|
||||||
13. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
|
13. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
|
||||||
14. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
|
14. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
|
||||||
|
|||||||
@@ -56,11 +56,8 @@ export OPENAI_API_KEY="your-api-key"
|
|||||||
### Run the Development Server
|
### Run the Development Server
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Terminal 1: LangGraph server
|
# Gateway API + embedded agent runtime
|
||||||
make dev
|
make dev
|
||||||
|
|
||||||
# Terminal 2: Gateway API
|
|
||||||
make gateway
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|||||||
+27
-34
@@ -11,34 +11,26 @@ DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent
|
|||||||
│ Nginx (Port 2026) │
|
│ Nginx (Port 2026) │
|
||||||
│ Unified reverse proxy │
|
│ Unified reverse proxy │
|
||||||
└───────┬──────────────────┬───────────┘
|
└───────┬──────────────────┬───────────┘
|
||||||
│ │
|
│
|
||||||
/api/langgraph/* │ │ /api/* (other)
|
/api/langgraph/* │ /api/* (other)
|
||||||
▼ ▼
|
rewritten to /api/* │
|
||||||
┌──────────────────────────────────────────────┐
|
▼
|
||||||
│ Gateway API (8001) │
|
┌────────────────────────────────────────┐
|
||||||
│ FastAPI REST + LangGraph-compatible runtime │
|
│ Gateway API (8001) │
|
||||||
│ │
|
│ FastAPI REST + agent runtime │
|
||||||
│ Models, MCP, Skills, Memory, Uploads, │
|
│ │
|
||||||
│ Artifacts, Threads, Runs, Streaming │
|
│ Models, MCP, Skills, Memory, Uploads, │
|
||||||
│ │
|
│ Artifacts, Threads, Runs, Streaming │
|
||||||
│ ┌────────────────┐ │
|
│ │
|
||||||
│ │ Lead Agent │ │
|
│ ┌────────────────────────────────────┐ │
|
||||||
│ │ ┌──────────┐ │ │
|
│ │ Lead Agent │ │
|
||||||
│ │ │Middleware│ │ │
|
│ │ Middleware Chain, Tools, Subagents │ │
|
||||||
│ │ │ Chain │ │ │
|
│ └────────────────────────────────────┘ │
|
||||||
│ │ └──────────┘ │ │
|
└────────────────────────────────────────┘
|
||||||
│ │ ┌──────────┐ │ │
|
|
||||||
│ │ │ Tools │ │ │
|
|
||||||
│ │ └──────────┘ │ │
|
|
||||||
│ │ ┌──────────┐ │ │
|
|
||||||
│ │ │Subagents │ │ │
|
|
||||||
│ │ └──────────┘ │ │
|
|
||||||
│ └────────────────┘ │
|
|
||||||
└──────────────────────────────────────────────┘
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**Request Routing** (via Nginx):
|
**Request Routing** (via Nginx):
|
||||||
- `/api/langgraph/*` → Gateway API - LangGraph-compatible agent interactions, threads, runs, and streaming translated to native `/api/*` routers
|
- `/api/langgraph/*` → Gateway LangGraph-compatible API - agent interactions, threads, streaming
|
||||||
- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
|
- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
|
||||||
- `/` (non-API) → Frontend - Next.js web interface
|
- `/` (non-API) → Frontend - Next.js web interface
|
||||||
|
|
||||||
@@ -196,7 +188,7 @@ export OPENAI_API_KEY="your-api-key-here"
|
|||||||
**Full Application** (from project root):
|
**Full Application** (from project root):
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
make dev # Starts LangGraph + Gateway + Frontend + Nginx
|
make dev # Starts Gateway + Frontend + Nginx
|
||||||
```
|
```
|
||||||
|
|
||||||
Access at: http://localhost:2026
|
Access at: http://localhost:2026
|
||||||
@@ -204,14 +196,11 @@ Access at: http://localhost:2026
|
|||||||
**Backend Only** (from backend directory):
|
**Backend Only** (from backend directory):
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Terminal 1: LangGraph server
|
# Gateway API + embedded agent runtime
|
||||||
make dev
|
make dev
|
||||||
|
|
||||||
# Terminal 2: Gateway API
|
|
||||||
make gateway
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Direct access: LangGraph at http://localhost:2024, Gateway at http://localhost:8001
|
Direct access: Gateway at http://localhost:8001
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -247,12 +236,16 @@ backend/
|
|||||||
│ └── utils/ # Utilities
|
│ └── utils/ # Utilities
|
||||||
├── docs/ # Documentation
|
├── docs/ # Documentation
|
||||||
├── tests/ # Test suite
|
├── tests/ # Test suite
|
||||||
├── langgraph.json # LangGraph server configuration
|
├── langgraph.json # LangGraph graph registry for tooling/Studio compatibility
|
||||||
├── pyproject.toml # Python dependencies
|
├── pyproject.toml # Python dependencies
|
||||||
├── Makefile # Development commands
|
├── Makefile # Development commands
|
||||||
└── Dockerfile # Container build
|
└── Dockerfile # Container build
|
||||||
```
|
```
|
||||||
|
|
||||||
|
`langgraph.json` is not the default service entrypoint. The scripts and Docker
|
||||||
|
deployments run the Gateway embedded runtime; the file is kept for LangGraph
|
||||||
|
tooling, Studio, or direct LangGraph Server compatibility.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
@@ -365,8 +358,8 @@ If a provider is explicitly enabled but required credentials are missing, or the
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
make install # Install dependencies
|
make install # Install dependencies
|
||||||
make dev # Run LangGraph server (port 2024)
|
make dev # Run Gateway API + embedded agent runtime (port 8001)
|
||||||
make gateway # Run Gateway API (port 8001)
|
make gateway # Run Gateway API without reload (port 8001)
|
||||||
make lint # Run linter (ruff)
|
make lint # Run linter (ruff)
|
||||||
make format # Format code (ruff)
|
make format # Format code (ruff)
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -62,7 +62,7 @@ async def _ensure_admin_user(app: FastAPI) -> None:
|
|||||||
|
|
||||||
Subsequent boots (admin already exists):
|
Subsequent boots (admin already exists):
|
||||||
- Runs the one-time "no-auth → with-auth" orphan thread migration for
|
- Runs the one-time "no-auth → with-auth" orphan thread migration for
|
||||||
existing LangGraph thread metadata that has no owner_id.
|
existing LangGraph thread metadata that has no user_id.
|
||||||
|
|
||||||
No SQL persistence migration is needed: the four user_id columns
|
No SQL persistence migration is needed: the four user_id columns
|
||||||
(threads_meta, runs, run_events, feedback) only come into existence
|
(threads_meta, runs, run_events, feedback) only come into existence
|
||||||
@@ -177,7 +177,7 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
|||||||
async with langgraph_runtime(app):
|
async with langgraph_runtime(app):
|
||||||
logger.info("LangGraph runtime initialised")
|
logger.info("LangGraph runtime initialised")
|
||||||
|
|
||||||
# Ensure admin user exists (auto-create on first boot)
|
# Check admin bootstrap state and migrate orphan threads after admin exists.
|
||||||
# Must run AFTER langgraph_runtime so app.state.store is available for thread migration
|
# Must run AFTER langgraph_runtime so app.state.store is available for thread migration
|
||||||
await _ensure_admin_user(app)
|
await _ensure_admin_user(app)
|
||||||
|
|
||||||
|
|||||||
@@ -28,7 +28,7 @@ class User(BaseModel):
|
|||||||
oauth_id: str | None = Field(None, description="User ID from OAuth provider")
|
oauth_id: str | None = Field(None, description="User ID from OAuth provider")
|
||||||
|
|
||||||
# Auth lifecycle
|
# Auth lifecycle
|
||||||
needs_setup: bool = Field(default=False, description="True for auto-created admin until setup completes")
|
needs_setup: bool = Field(default=False, description="True when a reset account must complete setup")
|
||||||
token_version: int = Field(default=0, description="Incremented on password change to invalidate old JWTs")
|
token_version: int = Field(default=0, description="Incremented on password change to invalidate old JWTs")
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,8 +1,12 @@
|
|||||||
"""LangGraph Server auth handler — shares JWT logic with Gateway.
|
"""LangGraph compatibility auth handler — shares JWT logic with Gateway.
|
||||||
|
|
||||||
Loaded by LangGraph Server via langgraph.json ``auth.path``.
|
The default DeerFlow runtime is embedded in the FastAPI Gateway; scripts and
|
||||||
Reuses the same ``decode_token`` / ``get_auth_config`` as Gateway,
|
Docker deployments do not load this module. It is retained for LangGraph
|
||||||
so both modes validate tokens with the same secret and rules.
|
tooling, Studio, or direct LangGraph Server compatibility through
|
||||||
|
``langgraph.json``'s ``auth.path``.
|
||||||
|
|
||||||
|
When that compatibility path is used, this module reuses the same JWT and CSRF
|
||||||
|
rules as Gateway so both modes validate sessions consistently.
|
||||||
|
|
||||||
Two layers:
|
Two layers:
|
||||||
1. @auth.authenticate — validates JWT cookie, extracts user_id,
|
1. @auth.authenticate — validates JWT cookie, extracts user_id,
|
||||||
|
|||||||
@@ -305,7 +305,7 @@ async def login_local(
|
|||||||
async def register(request: Request, response: Response, body: RegisterRequest):
|
async def register(request: Request, response: Response, body: RegisterRequest):
|
||||||
"""Register a new user account (always 'user' role).
|
"""Register a new user account (always 'user' role).
|
||||||
|
|
||||||
Admin is auto-created on first boot. This endpoint creates regular users.
|
The first admin is created explicitly through /initialize. This endpoint creates regular users.
|
||||||
Auto-login by setting the session cookie.
|
Auto-login by setting the session cookie.
|
||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
|
|||||||
@@ -90,6 +90,28 @@ class ThreadSearchRequest(BaseModel):
|
|||||||
offset: int = Field(default=0, ge=0, description="Pagination offset")
|
offset: int = Field(default=0, ge=0, description="Pagination offset")
|
||||||
status: str | None = Field(default=None, description="Filter by thread status")
|
status: str | None = Field(default=None, description="Filter by thread status")
|
||||||
|
|
||||||
|
@field_validator("metadata")
|
||||||
|
@classmethod
|
||||||
|
def _validate_metadata_filters(cls, v: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
"""Reject filter entries the SQL backend cannot compile.
|
||||||
|
|
||||||
|
Enforces consistent behaviour across SQL and memory backends.
|
||||||
|
See ``deerflow.persistence.json_compat`` for the shared validators.
|
||||||
|
"""
|
||||||
|
if not v:
|
||||||
|
return v
|
||||||
|
from deerflow.persistence.json_compat import validate_metadata_filter_key, validate_metadata_filter_value
|
||||||
|
|
||||||
|
bad_entries: list[str] = []
|
||||||
|
for key, value in v.items():
|
||||||
|
if not validate_metadata_filter_key(key):
|
||||||
|
bad_entries.append(f"{key!r} (unsafe key)")
|
||||||
|
elif not validate_metadata_filter_value(value):
|
||||||
|
bad_entries.append(f"{key!r} (unsupported value type {type(value).__name__})")
|
||||||
|
if bad_entries:
|
||||||
|
raise ValueError(f"Invalid metadata filter entries: {', '.join(bad_entries)}")
|
||||||
|
return v
|
||||||
|
|
||||||
|
|
||||||
class ThreadStateResponse(BaseModel):
|
class ThreadStateResponse(BaseModel):
|
||||||
"""Response model for thread state."""
|
"""Response model for thread state."""
|
||||||
@@ -294,14 +316,18 @@ async def search_threads(body: ThreadSearchRequest, request: Request) -> list[Th
|
|||||||
(SQL-backed for sqlite/postgres, Store-backed for memory mode).
|
(SQL-backed for sqlite/postgres, Store-backed for memory mode).
|
||||||
"""
|
"""
|
||||||
from app.gateway.deps import get_thread_store
|
from app.gateway.deps import get_thread_store
|
||||||
|
from deerflow.persistence.thread_meta import InvalidMetadataFilterError
|
||||||
|
|
||||||
repo = get_thread_store(request)
|
repo = get_thread_store(request)
|
||||||
rows = await repo.search(
|
try:
|
||||||
metadata=body.metadata or None,
|
rows = await repo.search(
|
||||||
status=body.status,
|
metadata=body.metadata or None,
|
||||||
limit=body.limit,
|
status=body.status,
|
||||||
offset=body.offset,
|
limit=body.limit,
|
||||||
)
|
offset=body.offset,
|
||||||
|
)
|
||||||
|
except InvalidMetadataFilterError as exc:
|
||||||
|
raise HTTPException(status_code=400, detail=str(exc)) from exc
|
||||||
return [
|
return [
|
||||||
ThreadResponse(
|
ThreadResponse(
|
||||||
thread_id=r["thread_id"],
|
thread_id=r["thread_id"],
|
||||||
|
|||||||
+40
-17
@@ -535,14 +535,28 @@ All APIs return errors in a consistent format:
|
|||||||
|
|
||||||
## Authentication
|
## Authentication
|
||||||
|
|
||||||
Currently, DeerFlow does not implement authentication. All APIs are accessible without credentials.
|
DeerFlow enforces authentication for all non-public HTTP routes. Public routes are limited to health/docs metadata and these public auth endpoints:
|
||||||
|
|
||||||
Note: This is about DeerFlow API authentication. MCP outbound connections can still use OAuth for configured HTTP/SSE MCP servers.
|
- `POST /api/v1/auth/initialize` creates the first admin account when no admin exists.
|
||||||
|
- `POST /api/v1/auth/login/local` logs in with email/password and sets an HttpOnly `access_token` cookie.
|
||||||
|
- `POST /api/v1/auth/register` creates a regular `user` account and sets the session cookie.
|
||||||
|
- `POST /api/v1/auth/logout` clears the session cookie.
|
||||||
|
- `GET /api/v1/auth/setup-status` reports whether the first admin still needs to be created.
|
||||||
|
|
||||||
For production deployments, it is recommended to:
|
The authenticated auth endpoints are:
|
||||||
1. Use Nginx for basic auth or OAuth integration
|
|
||||||
2. Deploy behind a VPN or private network
|
- `GET /api/v1/auth/me` returns the current user.
|
||||||
3. Implement custom authentication middleware
|
- `POST /api/v1/auth/change-password` changes password, optionally changes email during setup, increments `token_version`, and reissues the cookie.
|
||||||
|
|
||||||
|
Protected state-changing requests also require the CSRF double-submit token: send the `csrf_token` cookie value as the `X-CSRF-Token` header. Login/register/initialize/logout are bootstrap auth endpoints: they are exempt from the double-submit token but still reject hostile browser `Origin` headers.
|
||||||
|
|
||||||
|
User isolation is enforced from the authenticated user context:
|
||||||
|
|
||||||
|
- Thread metadata is scoped by `threads_meta.user_id`; search/read/write/delete APIs only expose the current user's threads.
|
||||||
|
- Thread files live under `{base_dir}/users/{user_id}/threads/{thread_id}/user-data/` and are exposed inside the sandbox as `/mnt/user-data/`.
|
||||||
|
- Memory and custom agents are stored under `{base_dir}/users/{user_id}/...`.
|
||||||
|
|
||||||
|
Note: MCP outbound connections can still use OAuth for configured HTTP/SSE MCP servers; that is separate from DeerFlow API authentication.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -561,12 +575,13 @@ location /api/ {
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## WebSocket Support
|
## Streaming Support
|
||||||
|
|
||||||
The LangGraph server supports WebSocket connections for real-time streaming. Connect to:
|
Gateway's LangGraph-compatible API streams run events with Server-Sent Events (SSE):
|
||||||
|
|
||||||
```
|
```http
|
||||||
ws://localhost:2026/api/langgraph/threads/{thread_id}/runs/stream
|
POST /api/langgraph/threads/{thread_id}/runs/stream
|
||||||
|
Accept: text/event-stream
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -602,13 +617,21 @@ const response = await fetch('/api/models');
|
|||||||
const data = await response.json();
|
const data = await response.json();
|
||||||
console.log(data.models);
|
console.log(data.models);
|
||||||
|
|
||||||
// Using EventSource for streaming
|
// Create a run and stream SSE events
|
||||||
const eventSource = new EventSource(
|
const streamResponse = await fetch(`/api/langgraph/threads/${threadId}/runs/stream`, {
|
||||||
`/api/langgraph/threads/${threadId}/runs/stream`
|
method: "POST",
|
||||||
);
|
headers: {
|
||||||
eventSource.onmessage = (event) => {
|
"Content-Type": "application/json",
|
||||||
console.log(JSON.parse(event.data));
|
Accept: "text/event-stream",
|
||||||
};
|
},
|
||||||
|
body: JSON.stringify({
|
||||||
|
input: { messages: [{ role: "user", content: "Hello" }] },
|
||||||
|
stream_mode: ["values", "messages-tuple", "custom"],
|
||||||
|
}),
|
||||||
|
});
|
||||||
|
|
||||||
|
const reader = streamResponse.body?.getReader();
|
||||||
|
// Decode and parse SSE frames from reader in your client code.
|
||||||
```
|
```
|
||||||
|
|
||||||
### cURL Examples
|
### cURL Examples
|
||||||
|
|||||||
@@ -20,24 +20,22 @@ This document provides a comprehensive overview of the DeerFlow backend architec
|
|||||||
│ └────────────────────────────────────────────────────────────────────┘ │
|
│ └────────────────────────────────────────────────────────────────────┘ │
|
||||||
└─────────────────────────────────┬────────────────────────────────────────┘
|
└─────────────────────────────────┬────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
┌───────────────────────┼───────────────────────┐
|
┌───────────────────────┴───────────────────────┐
|
||||||
│ │ │
|
│ │
|
||||||
▼ ▼ ▼
|
▼ ▼
|
||||||
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
|
┌─────────────────────────────────────────────┐ ┌─────────────────────┐
|
||||||
│ Embedded Runtime │ │ Gateway API │ │ Frontend │
|
│ Gateway API │ │ Frontend │
|
||||||
│ (inside Gateway) │ │ (Port 8001) │ │ (Port 3000) │
|
│ (Port 8001) │ │ (Port 3000) │
|
||||||
│ │ │ │ │ │
|
│ │ │ │
|
||||||
│ - Agent Runtime │ │ - Models API │ │ - Next.js App │
|
│ - LangGraph-compatible runs/threads API │ │ - Next.js App │
|
||||||
│ - Thread Mgmt │ │ - MCP Config │ │ - React UI │
|
│ - Embedded Agent Runtime │ │ - React UI │
|
||||||
│ - SSE Streaming │ │ - Skills Mgmt │ │ - Chat Interface │
|
│ - SSE Streaming │ │ - Chat Interface │
|
||||||
│ - Checkpointing │ │ - File Uploads │ │ │
|
│ - Checkpointing │ │ │
|
||||||
│ │ │ - Thread Cleanup │ │ │
|
│ - Models, MCP, Skills, Uploads, Artifacts │ │ │
|
||||||
│ │ │ - Artifacts │ │ │
|
│ - Thread Cleanup │ │ │
|
||||||
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
|
└─────────────────────────────────────────────┘ └─────────────────────┘
|
||||||
│ │
|
│
|
||||||
│ ┌─────────────────┘
|
▼
|
||||||
│ │
|
|
||||||
▼ ▼
|
|
||||||
┌──────────────────────────────────────────────────────────────────────────┐
|
┌──────────────────────────────────────────────────────────────────────────┐
|
||||||
│ Shared Configuration │
|
│ Shared Configuration │
|
||||||
│ ┌─────────────────────────┐ ┌────────────────────────────────────────┐ │
|
│ ┌─────────────────────────┐ ┌────────────────────────────────────────┐ │
|
||||||
@@ -52,9 +50,9 @@ This document provides a comprehensive overview of the DeerFlow backend architec
|
|||||||
|
|
||||||
## Component Details
|
## Component Details
|
||||||
|
|
||||||
### Embedded LangGraph Runtime
|
### Gateway Embedded Agent Runtime
|
||||||
|
|
||||||
The LangGraph-compatible runtime runs inside the Gateway process and is built on LangGraph for robust multi-agent workflow orchestration.
|
The agent runtime is embedded in the FastAPI Gateway and built on LangGraph for robust multi-agent workflow orchestration. Nginx rewrites `/api/langgraph/*` to Gateway's native `/api/*` routes, so the public API remains compatible with LangGraph SDK clients without running a separate LangGraph server.
|
||||||
|
|
||||||
**Entry Point**: `packages/harness/deerflow/agents/lead_agent/agent.py:make_lead_agent`
|
**Entry Point**: `packages/harness/deerflow/agents/lead_agent/agent.py:make_lead_agent`
|
||||||
|
|
||||||
@@ -65,7 +63,8 @@ The LangGraph-compatible runtime runs inside the Gateway process and is built on
|
|||||||
- Tool execution orchestration
|
- Tool execution orchestration
|
||||||
- SSE streaming for real-time responses
|
- SSE streaming for real-time responses
|
||||||
|
|
||||||
**Configuration**: `langgraph.json`
|
**Graph registry**: `langgraph.json` remains available for tooling, Studio, or direct LangGraph Server compatibility.
|
||||||
|
It is not the default service entrypoint; scripts and Docker deployments run the Gateway embedded runtime.
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -84,6 +83,7 @@ FastAPI application providing REST endpoints plus the public LangGraph-compatibl
|
|||||||
|
|
||||||
**Routers**:
|
**Routers**:
|
||||||
- `models.py` - `/api/models` - Model listing and details
|
- `models.py` - `/api/models` - Model listing and details
|
||||||
|
- `thread_runs.py` / `runs.py` - `/api/threads/{id}/runs`, `/api/runs/*` - LangGraph-compatible runs and streaming
|
||||||
- `mcp.py` - `/api/mcp` - MCP server configuration
|
- `mcp.py` - `/api/mcp` - MCP server configuration
|
||||||
- `skills.py` - `/api/skills` - Skills management
|
- `skills.py` - `/api/skills` - Skills management
|
||||||
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
|
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
|
||||||
@@ -91,7 +91,7 @@ FastAPI application providing REST endpoints plus the public LangGraph-compatibl
|
|||||||
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving
|
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving
|
||||||
- `suggestions.py` - `/api/threads/{id}/suggestions` - Follow-up suggestion generation
|
- `suggestions.py` - `/api/threads/{id}/suggestions` - Follow-up suggestion generation
|
||||||
|
|
||||||
The web conversation delete flow is now split across both backend surfaces: LangGraph handles `DELETE /api/langgraph/threads/{thread_id}` for thread state, then the Gateway `threads.py` router removes DeerFlow-managed filesystem data via `Paths.delete_thread_dir()`.
|
The web conversation delete flow first deletes Gateway-managed thread state through the LangGraph-compatible route, then the Gateway `threads.py` router removes DeerFlow-managed filesystem data via `Paths.delete_thread_dir()`.
|
||||||
|
|
||||||
### Agent Architecture
|
### Agent Architecture
|
||||||
|
|
||||||
@@ -354,9 +354,9 @@ SKILL.md Format:
|
|||||||
{"input": {"messages": [{"role": "user", "content": "Hello"}]}}
|
{"input": {"messages": [{"role": "user", "content": "Hello"}]}}
|
||||||
|
|
||||||
2. Nginx → Gateway API (8001)
|
2. Nginx → Gateway API (8001)
|
||||||
Routes `/api/langgraph/*` to the Gateway's LangGraph-compatible runtime
|
`/api/langgraph/*` is rewritten to Gateway's LangGraph-compatible `/api/*` routes
|
||||||
|
|
||||||
3. Embedded LangGraph runtime
|
3. Gateway embedded runtime
|
||||||
a. Load/create thread state
|
a. Load/create thread state
|
||||||
b. Execute middleware chain:
|
b. Execute middleware chain:
|
||||||
- ThreadDataMiddleware: Set up paths
|
- ThreadDataMiddleware: Set up paths
|
||||||
@@ -412,7 +412,7 @@ SKILL.md Format:
|
|||||||
### Thread Cleanup Flow
|
### Thread Cleanup Flow
|
||||||
|
|
||||||
```
|
```
|
||||||
1. Client deletes conversation via LangGraph
|
1. Client deletes conversation via the LangGraph-compatible Gateway route
|
||||||
DELETE /api/langgraph/threads/{thread_id}
|
DELETE /api/langgraph/threads/{thread_id}
|
||||||
|
|
||||||
2. Web UI follows up with Gateway cleanup
|
2. Web UI follows up with Gateway cleanup
|
||||||
|
|||||||
@@ -0,0 +1,331 @@
|
|||||||
|
# 用户认证与隔离设计
|
||||||
|
|
||||||
|
本文档描述 DeerFlow 当前内置认证模块的设计,而不是历史 RFC。它覆盖浏览器登录、API 认证、CSRF、用户隔离、首次初始化、密码重置、内部调用和升级迁移。
|
||||||
|
|
||||||
|
## 设计目标
|
||||||
|
|
||||||
|
认证模块的核心目标是把 DeerFlow 从“本地单用户工具”提升为“可多用户部署的 agent runtime”,并让用户身份贯穿 HTTP API、LangGraph-compatible runtime、文件系统、memory、自定义 agent 和反馈数据。
|
||||||
|
|
||||||
|
设计约束:
|
||||||
|
|
||||||
|
- 默认强制认证:除健康检查、文档和 auth bootstrap 端点外,HTTP 路由都必须有有效 session。
|
||||||
|
- 服务端持有所有权:客户端 metadata 不能声明 `user_id` 或 `owner_id`。
|
||||||
|
- 隔离默认开启:repository(仓储)、文件路径、memory、agent 配置默认按当前用户解析。
|
||||||
|
- 旧数据可升级:无认证版本留下的 thread 可以在 admin 存在后迁移到 admin。
|
||||||
|
- 密码不进日志:首次初始化由操作者设置密码;`reset_admin` 只写 0600 凭据文件。
|
||||||
|
|
||||||
|
非目标:
|
||||||
|
|
||||||
|
- 当前 OAuth 端点只是占位,尚未实现第三方登录。
|
||||||
|
- 当前用户角色只有 `admin` 和 `user`,尚未实现细粒度 RBAC。
|
||||||
|
- 当前登录限速是进程内字典,多 worker 下不是全局精确限速。
|
||||||
|
|
||||||
|
## 核心模型
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TB
|
||||||
|
classDef actor fill:#D8CFC4,stroke:#6E6259,color:#2F2A26;
|
||||||
|
classDef api fill:#C9D7D2,stroke:#5D706A,color:#21302C;
|
||||||
|
classDef state fill:#D7D3E8,stroke:#6B6680,color:#29263A;
|
||||||
|
classDef data fill:#E5D2C4,stroke:#806A5B,color:#30251E;
|
||||||
|
|
||||||
|
Browser["Browser — access_token cookie and csrf_token cookie"]:::actor
|
||||||
|
AuthMiddleware["AuthMiddleware — strict session gate"]:::api
|
||||||
|
CSRFMiddleware["CSRFMiddleware — double-submit token and Origin check"]:::api
|
||||||
|
AuthRoutes["Auth routes — initialize login register logout me change-password"]:::api
|
||||||
|
UserContext["Current user ContextVar — request-scoped identity"]:::state
|
||||||
|
Repositories["Repositories — AUTO resolves user_id from context"]:::state
|
||||||
|
Files["Filesystem — users/{user_id}/threads/{thread_id}/user-data"]:::data
|
||||||
|
Memory["Memory and agents — users/{user_id}/memory.json and agents"]:::data
|
||||||
|
|
||||||
|
Browser --> AuthMiddleware
|
||||||
|
Browser --> CSRFMiddleware
|
||||||
|
AuthMiddleware --> AuthRoutes
|
||||||
|
AuthMiddleware --> UserContext
|
||||||
|
UserContext --> Repositories
|
||||||
|
UserContext --> Files
|
||||||
|
UserContext --> Memory
|
||||||
|
```
|
||||||
|
|
||||||
|
### 用户表
|
||||||
|
|
||||||
|
用户记录定义在 `app.gateway.auth.models.User`,持久化到 `users` 表。关键字段:
|
||||||
|
|
||||||
|
| 字段 | 语义 |
|
||||||
|
|---|---|
|
||||||
|
| `id` | 用户主键,JWT `sub` 使用该值 |
|
||||||
|
| `email` | 唯一登录名 |
|
||||||
|
| `password_hash` | bcrypt hash,OAuth 用户可为空 |
|
||||||
|
| `system_role` | `admin` 或 `user` |
|
||||||
|
| `needs_setup` | reset 后要求用户完成邮箱 / 密码设置 |
|
||||||
|
| `token_version` | 改密码或 reset 时递增,用于废弃旧 JWT |
|
||||||
|
|
||||||
|
### 运行时身份
|
||||||
|
|
||||||
|
认证成功后,`AuthMiddleware` 把用户同时写入:
|
||||||
|
|
||||||
|
- `request.state.user`
|
||||||
|
- `request.state.auth`
|
||||||
|
- `deerflow.runtime.user_context` 的 `ContextVar`
|
||||||
|
|
||||||
|
`ContextVar` 是这里的核心边界。上层 Gateway 负责写入身份,下层 persistence / file path 只读取结构化的当前用户,不反向依赖 `app.gateway.auth` 具体类型。
|
||||||
|
|
||||||
|
可以把 repository 调用的用户参数理解成一个三态 ADT:
|
||||||
|
|
||||||
|
```scala
|
||||||
|
enum UserScope:
|
||||||
|
case AutoFromContext
|
||||||
|
case Explicit(userId: String)
|
||||||
|
case BypassForMigration
|
||||||
|
```
|
||||||
|
|
||||||
|
对应 Python 实现是 `AUTO | str | None`:
|
||||||
|
|
||||||
|
- `AUTO`:从 `ContextVar` 解析当前用户;没有上下文则抛错。
|
||||||
|
- `str`:显式指定用户,主要用于测试或管理脚本。
|
||||||
|
- `None`:跳过用户过滤,只允许迁移脚本或 admin CLI 使用。
|
||||||
|
|
||||||
|
## 登录与初始化流程
|
||||||
|
|
||||||
|
### 首次初始化
|
||||||
|
|
||||||
|
首次启动时,如果没有 admin,服务不会自动创建账号,只记录日志提示访问 `/setup`。
|
||||||
|
|
||||||
|
流程:
|
||||||
|
|
||||||
|
1. 用户访问 `/setup`。
|
||||||
|
2. 前端调用 `GET /api/v1/auth/setup-status`。
|
||||||
|
3. 如果返回 `{"needs_setup": true}`,前端展示创建 admin 表单。
|
||||||
|
4. 表单提交 `POST /api/v1/auth/initialize`。
|
||||||
|
5. 服务端确认当前没有 admin,创建 `system_role="admin"`、`needs_setup=false` 的用户。
|
||||||
|
6. 服务端设置 `access_token` HttpOnly cookie,用户进入 workspace。
|
||||||
|
|
||||||
|
`/api/v1/auth/initialize` 只在没有 admin 时可用。并发初始化由数据库唯一约束兜底,失败方返回 409。
|
||||||
|
|
||||||
|
### 普通登录
|
||||||
|
|
||||||
|
`POST /api/v1/auth/login/local` 使用 `OAuth2PasswordRequestForm`:
|
||||||
|
|
||||||
|
- `username` 是邮箱。
|
||||||
|
- `password` 是密码。
|
||||||
|
- 成功后签发 JWT,放入 `access_token` HttpOnly cookie。
|
||||||
|
- 响应体只返回 `expires_in` 和 `needs_setup`,不返回 token。
|
||||||
|
|
||||||
|
登录失败会按客户端 IP 计数。IP 解析只在 TCP peer 属于 `AUTH_TRUSTED_PROXIES` 时信任 `X-Real-IP`,不使用 `X-Forwarded-For`。
|
||||||
|
|
||||||
|
### 注册
|
||||||
|
|
||||||
|
`POST /api/v1/auth/register` 创建普通 `user`,并自动登录。
|
||||||
|
|
||||||
|
当前实现允许在没有 admin 时注册普通用户,但 `setup-status` 仍会返回 `needs_setup=true`,因为 admin 仍不存在。这是当前产品策略边界:如果后续要求“必须先初始化 admin 才能注册普通用户”,需要在 `/register` 增加 admin-exists gate。
|
||||||
|
|
||||||
|
### 改密码与 reset setup
|
||||||
|
|
||||||
|
`POST /api/v1/auth/change-password` 需要当前密码和新密码:
|
||||||
|
|
||||||
|
- 校验当前密码。
|
||||||
|
- 更新 bcrypt hash。
|
||||||
|
- `token_version += 1`,使旧 JWT 立即失效。
|
||||||
|
- 重新签发 cookie。
|
||||||
|
- 如果 `needs_setup=true` 且传了 `new_email`,则更新邮箱并清除 `needs_setup`。
|
||||||
|
|
||||||
|
`python -m app.gateway.auth.reset_admin` 会:
|
||||||
|
|
||||||
|
- 找到 admin 或指定邮箱用户。
|
||||||
|
- 生成随机密码。
|
||||||
|
- 更新密码 hash。
|
||||||
|
- `token_version += 1`。
|
||||||
|
- 设置 `needs_setup=true`。
|
||||||
|
- 写入 `.deer-flow/admin_initial_credentials.txt`,权限 `0600`。
|
||||||
|
|
||||||
|
命令行只输出凭据文件路径,不输出明文密码。
|
||||||
|
|
||||||
|
## HTTP 认证边界
|
||||||
|
|
||||||
|
`AuthMiddleware` 是 fail-closed(默认拒绝)的全局认证门。
|
||||||
|
|
||||||
|
公开路径:
|
||||||
|
|
||||||
|
- `/health`
|
||||||
|
- `/docs`
|
||||||
|
- `/redoc`
|
||||||
|
- `/openapi.json`
|
||||||
|
- `/api/v1/auth/login/local`
|
||||||
|
- `/api/v1/auth/register`
|
||||||
|
- `/api/v1/auth/logout`
|
||||||
|
- `/api/v1/auth/setup-status`
|
||||||
|
- `/api/v1/auth/initialize`
|
||||||
|
|
||||||
|
其余路径都要求有效 `access_token` cookie。存在 cookie 但 JWT 无效、过期、用户不存在或 `token_version` 不匹配时,直接返回 401,而不是让请求穿透到业务路由。
|
||||||
|
|
||||||
|
路由级别的 owner check 由 `require_permission(..., owner_check=True)` 完成:
|
||||||
|
|
||||||
|
- 读类请求允许旧的未追踪 legacy thread 兼容读取。
|
||||||
|
- 写 / 删除类请求使用 `require_existing=True`,要求 thread row 存在且属于当前用户,避免删除后缺 row 导致其他用户误通过。
|
||||||
|
|
||||||
|
## CSRF 设计
|
||||||
|
|
||||||
|
DeerFlow 使用 Double Submit Cookie:
|
||||||
|
|
||||||
|
- 服务端设置 `csrf_token` cookie。
|
||||||
|
- 前端 state-changing 请求发送同值 `X-CSRF-Token` header。
|
||||||
|
- 服务端用 `secrets.compare_digest` 比较 cookie/header。
|
||||||
|
|
||||||
|
需要 CSRF 的方法:
|
||||||
|
|
||||||
|
- `POST`
|
||||||
|
- `PUT`
|
||||||
|
- `DELETE`
|
||||||
|
- `PATCH`
|
||||||
|
|
||||||
|
auth bootstrap 端点(login/register/initialize/logout)不要求 double-submit token,因为首次调用时浏览器还没有 token;但这些端点会校验 browser `Origin`,拒绝 hostile Origin,避免 login CSRF / session fixation。
|
||||||
|
|
||||||
|
## 用户隔离
|
||||||
|
|
||||||
|
### Thread metadata
|
||||||
|
|
||||||
|
Thread metadata 存在 `threads_meta`,关键隔离字段是 `user_id`。
|
||||||
|
|
||||||
|
创建 thread 时:
|
||||||
|
|
||||||
|
- 客户端传入的 `metadata.user_id` 和 `metadata.owner_id` 会被剥离。
|
||||||
|
- `ThreadMetaRepository.create(..., user_id=AUTO)` 从 `ContextVar` 解析真实用户。
|
||||||
|
- `/api/threads/search` 默认只返回当前用户的 thread。
|
||||||
|
|
||||||
|
读取 / 修改 / 删除时:
|
||||||
|
|
||||||
|
- `get()` 默认按当前用户过滤。
|
||||||
|
- `check_access()` 用于路由 owner check。
|
||||||
|
- 对其他用户的 thread 返回 404,避免泄露资源存在性。
|
||||||
|
|
||||||
|
### 文件系统
|
||||||
|
|
||||||
|
当前线程文件布局:
|
||||||
|
|
||||||
|
```text
|
||||||
|
{base_dir}/users/{user_id}/threads/{thread_id}/user-data/
|
||||||
|
├── workspace/
|
||||||
|
├── uploads/
|
||||||
|
└── outputs/
|
||||||
|
```
|
||||||
|
|
||||||
|
agent 在 sandbox 内看到统一虚拟路径:
|
||||||
|
|
||||||
|
```text
|
||||||
|
/mnt/user-data/workspace
|
||||||
|
/mnt/user-data/uploads
|
||||||
|
/mnt/user-data/outputs
|
||||||
|
```
|
||||||
|
|
||||||
|
`ThreadDataMiddleware` 使用 `get_effective_user_id()` 解析当前用户并生成线程路径。没有认证上下文时会落到 `default` 用户桶,主要用于内部调用、嵌入式 client 或无 HTTP 的本地执行路径。
|
||||||
|
|
||||||
|
### Memory
|
||||||
|
|
||||||
|
默认 memory 存储:
|
||||||
|
|
||||||
|
```text
|
||||||
|
{base_dir}/users/{user_id}/memory.json
|
||||||
|
{base_dir}/users/{user_id}/agents/{agent_name}/memory.json
|
||||||
|
```
|
||||||
|
|
||||||
|
有用户上下文时,空或相对 `memory.storage_path` 都使用上述 per-user 默认路径;只有绝对 `memory.storage_path` 会视为显式 opt-out(退出) per-user isolation,所有用户共享该路径。无用户上下文的 legacy 路径仍会把相对 `storage_path` 解析到 `Paths.base_dir` 下。
|
||||||
|
|
||||||
|
### 自定义 agent
|
||||||
|
|
||||||
|
用户自定义 agent 写入:
|
||||||
|
|
||||||
|
```text
|
||||||
|
{base_dir}/users/{user_id}/agents/{agent_name}/
|
||||||
|
├── config.yaml
|
||||||
|
├── SOUL.md
|
||||||
|
└── memory.json
|
||||||
|
```
|
||||||
|
|
||||||
|
旧布局 `{base_dir}/agents/{agent_name}/` 只作为只读兼容回退。更新或删除旧共享 agent 会要求先运行迁移脚本。
|
||||||
|
|
||||||
|
## 内部调用与 IM 渠道
|
||||||
|
|
||||||
|
IM channel worker 不是浏览器用户,不持有浏览器 cookie。它们通过 Gateway 内部认证:
|
||||||
|
|
||||||
|
- 请求带 `X-DeerFlow-Internal-Token`。
|
||||||
|
- 同时带匹配的 CSRF cookie/header。
|
||||||
|
- 服务端识别为内部用户,`id="default"`、`system_role="internal"`。
|
||||||
|
|
||||||
|
这意味着 channel 产生的数据默认进入 `default` 用户桶。这个选择适合“平台级 bot 身份”,但不是“每个 IM 用户单独隔离”。如果后续要做到外部 IM 用户隔离,需要把外部 platform user 映射到 DeerFlow user,并让 channel manager 设置对应的 scoped identity。
|
||||||
|
|
||||||
|
## LangGraph-compatible 认证
|
||||||
|
|
||||||
|
Gateway 内嵌 runtime 路径由 `AuthMiddleware` 和 `CSRFMiddleware` 保护。
|
||||||
|
|
||||||
|
仓库仍保留 `app.gateway.langgraph_auth`,用于 LangGraph Server 直连模式:
|
||||||
|
|
||||||
|
- `@auth.authenticate` 校验 JWT cookie、CSRF、用户存在性和 `token_version`。
|
||||||
|
- `@auth.on` 在写入 metadata 时注入 `user_id`,并在读路径返回 `{"user_id": current_user}` 过滤条件。
|
||||||
|
|
||||||
|
这保证 Gateway 路由和 LangGraph-compatible 直连模式使用同一 JWT 语义。
|
||||||
|
|
||||||
|
## 升级与迁移
|
||||||
|
|
||||||
|
从无认证版本升级时,可能存在没有 `user_id` 的历史 thread。
|
||||||
|
|
||||||
|
当前策略:
|
||||||
|
|
||||||
|
1. 首次启动如果没有 admin,只提示访问 `/setup`,不迁移。
|
||||||
|
2. 操作者创建 admin。
|
||||||
|
3. 后续启动时,`_ensure_admin_user()` 找到 admin,并把 LangGraph store 中缺少 `metadata.user_id` 的 thread 迁移到 admin。
|
||||||
|
|
||||||
|
文件系统旧布局迁移由脚本处理:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd backend
|
||||||
|
PYTHONPATH=. python scripts/migrate_user_isolation.py --dry-run
|
||||||
|
PYTHONPATH=. python scripts/migrate_user_isolation.py --user-id <target-user-id>
|
||||||
|
```
|
||||||
|
|
||||||
|
迁移脚本覆盖 legacy `memory.json`、`threads/` 和 `agents/` 到 per-user layout。
|
||||||
|
|
||||||
|
## 安全不变量
|
||||||
|
|
||||||
|
必须长期保持的不变量:
|
||||||
|
|
||||||
|
- JWT 只在 HttpOnly cookie 中传输,不出现在响应 JSON。
|
||||||
|
- 任何非 public HTTP 路由都不能只靠“cookie 存在”放行,必须严格验证 JWT。
|
||||||
|
- `token_version` 不匹配必须拒绝,保证改密码 / reset 后旧 session 失效。
|
||||||
|
- 客户端 metadata 中的 `user_id` / `owner_id` 必须剥离。
|
||||||
|
- repository 默认 `AUTO` 必须从当前用户上下文解析,不能静默退化成全局查询。
|
||||||
|
- 只有迁移脚本和 admin CLI 可以显式传 `user_id=None` 绕过隔离。
|
||||||
|
- 本地文件路径必须通过 `Paths` 和 sandbox path validation 解析,不能拼接未校验的用户输入。
|
||||||
|
- 捕获认证、迁移、后台任务异常必须记录日志;不能空 catch。
|
||||||
|
|
||||||
|
## 已知边界
|
||||||
|
|
||||||
|
| 边界 | 当前行为 | 后续方向 |
|
||||||
|
|---|---|---|
|
||||||
|
| 无 admin 时注册普通用户 | 允许注册普通 `user` | 如产品要求先初始化 admin,给 `/register` 加 gate |
|
||||||
|
| 登录限速 | 进程内 dict,单 worker 精确,多 worker 近似 | Redis / DB-backed rate limiter |
|
||||||
|
| OAuth | 端点占位,未实现 | 接入 provider 并统一 `token_version` / role 语义 |
|
||||||
|
| IM 用户隔离 | channel 使用 `default` 内部用户 | 建立外部用户到 DeerFlow user 的映射 |
|
||||||
|
| 绝对 memory path | 显式共享 memory | UI / docs 明确提示 opt-out 风险 |
|
||||||
|
|
||||||
|
## 相关文件
|
||||||
|
|
||||||
|
| 文件 | 职责 |
|
||||||
|
|---|---|
|
||||||
|
| `app/gateway/auth_middleware.py` | 全局认证门、JWT 严格验证、写入 user context |
|
||||||
|
| `app/gateway/csrf_middleware.py` | CSRF double-submit 和 auth Origin 校验 |
|
||||||
|
| `app/gateway/routers/auth.py` | initialize/login/register/logout/me/change-password |
|
||||||
|
| `app/gateway/auth/jwt.py` | JWT 创建与解析 |
|
||||||
|
| `app/gateway/auth/reset_admin.py` | 密码 reset CLI |
|
||||||
|
| `app/gateway/auth/credential_file.py` | 0600 凭据文件写入 |
|
||||||
|
| `app/gateway/authz.py` | 路由权限与 owner check |
|
||||||
|
| `deerflow/runtime/user_context.py` | 当前用户 ContextVar 与 `AUTO` sentinel |
|
||||||
|
| `deerflow/persistence/thread_meta/` | thread metadata owner filter |
|
||||||
|
| `deerflow/config/paths.py` | per-user filesystem layout |
|
||||||
|
| `deerflow/agents/middlewares/thread_data_middleware.py` | run 时解析用户线程目录 |
|
||||||
|
| `deerflow/agents/memory/storage.py` | per-user memory storage |
|
||||||
|
| `deerflow/config/agents_config.py` | per-user custom agents |
|
||||||
|
| `app/channels/manager.py` | IM channel 内部认证调用 |
|
||||||
|
| `scripts/migrate_user_isolation.py` | legacy 数据迁移到 per-user layout |
|
||||||
|
| `.deer-flow/data/deerflow.db` | 统一 SQLite 数据库,包含 users / threads_meta / runs / feedback 等表 |
|
||||||
|
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | 用户自定义 agent 配置、SOUL 和 agent memory |
|
||||||
|
| `.deer-flow/admin_initial_credentials.txt` | `reset_admin` 生成的新凭据文件(0600,读完应删除) |
|
||||||
@@ -24,11 +24,11 @@ All other test plan sections were executed against either:
|
|||||||
|
|
||||||
| Case | Title | What it covers | Why not run |
|
| Case | Title | What it covers | Why not run |
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
| TC-DOCKER-01 | `users.db` volume persistence | Verify the `DEER_FLOW_HOME` bind mount survives container restart | needs `docker compose up` |
|
| TC-DOCKER-01 | `deerflow.db` volume persistence | Verify the `DEER_FLOW_HOME` bind mount survives container restart | needs `docker compose up` |
|
||||||
| TC-DOCKER-02 | Session persistence across container restart | `AUTH_JWT_SECRET` env var keeps cookies valid after `docker compose down && up` | needs `docker compose down/up` |
|
| TC-DOCKER-02 | Session persistence across container restart | `AUTH_JWT_SECRET` env var keeps cookies valid after `docker compose down && up` | needs `docker compose down/up` |
|
||||||
| TC-DOCKER-03 | Per-worker rate limiter divergence | Confirms in-process `_login_attempts` dict doesn't share state across `gunicorn` workers (4 by default in the compose file); known limitation, documented | needs multi-worker container |
|
| TC-DOCKER-03 | Per-worker rate limiter divergence | Confirms in-process `_login_attempts` dict doesn't share state across `gunicorn` workers (4 by default in the compose file); known limitation, documented | needs multi-worker container |
|
||||||
| TC-DOCKER-04 | IM channels skip AuthMiddleware | Verify Feishu/Slack/Telegram dispatchers run in-container against `http://langgraph:2024` without going through nginx | needs `docker logs` |
|
| TC-DOCKER-04 | IM channels use internal Gateway auth | Verify Feishu/Slack/Telegram dispatchers attach the process-local internal auth header plus CSRF cookie/header when calling Gateway-compatible LangGraph APIs | needs `docker logs` |
|
||||||
| TC-DOCKER-05 | Admin credentials surfacing | **Updated post-simplify** — was "log scrape", now "0600 credential file in `DEER_FLOW_HOME`". The file-based behavior is already validated by TC-1.1 + TC-UPG-13 on sg_dev (non-Docker), so the only Docker-specific gap is verifying the volume mount carries the file out to the host | needs container + host volume |
|
| TC-DOCKER-05 | Reset credentials surfacing | `reset_admin` writes a 0600 credential file in `DEER_FLOW_HOME` instead of logging plaintext. The file-based behavior is validated by non-Docker reset tests, so the only Docker-specific gap is verifying the volume mount carries the file out to the host | needs container + host volume |
|
||||||
| TC-DOCKER-06 | Gateway-mode Docker deploy | `./scripts/deploy.sh --gateway` produces a 3-container topology (no `langgraph` container); same auth flow as standard mode | needs `docker compose --profile gateway` |
|
| TC-DOCKER-06 | Gateway-mode Docker deploy | `./scripts/deploy.sh --gateway` produces a 3-container topology (no `langgraph` container); same auth flow as standard mode | needs `docker compose --profile gateway` |
|
||||||
|
|
||||||
## Coverage already provided by non-Docker tests
|
## Coverage already provided by non-Docker tests
|
||||||
@@ -41,8 +41,8 @@ the test cases that ran on sg_dev or local:
|
|||||||
| TC-DOCKER-01 (volume persistence) | TC-REENT-01 on sg_dev (admin row survives gateway restart) — same SQLite file, just no container layer between |
|
| TC-DOCKER-01 (volume persistence) | TC-REENT-01 on sg_dev (admin row survives gateway restart) — same SQLite file, just no container layer between |
|
||||||
| TC-DOCKER-02 (session persistence) | TC-API-02/03/06 (cookie roundtrip), plus TC-REENT-04 (multi-cookie) — JWT verification is process-state-free, container restart is equivalent to `pkill uvicorn && uv run uvicorn` |
|
| TC-DOCKER-02 (session persistence) | TC-API-02/03/06 (cookie roundtrip), plus TC-REENT-04 (multi-cookie) — JWT verification is process-state-free, container restart is equivalent to `pkill uvicorn && uv run uvicorn` |
|
||||||
| TC-DOCKER-03 (per-worker rate limit) | TC-GW-04 + TC-REENT-09 (single-worker rate limit + 5min expiry). The cross-worker divergence is an architectural property of the in-memory dict; no auth code path differs |
|
| TC-DOCKER-03 (per-worker rate limit) | TC-GW-04 + TC-REENT-09 (single-worker rate limit + 5min expiry). The cross-worker divergence is an architectural property of the in-memory dict; no auth code path differs |
|
||||||
| TC-DOCKER-04 (IM channels skip auth) | Code-level only: `app/channels/manager.py` uses `langgraph_sdk` directly with no cookie handling. The langgraph_auth handler is bypassed by going through SDK, not HTTP |
|
| TC-DOCKER-04 (IM channels use internal auth) | Code-level: `app/channels/manager.py` creates the `langgraph_sdk` client with `create_internal_auth_headers()` plus CSRF cookie/header, so channel workers do not rely on browser cookies |
|
||||||
| TC-DOCKER-05 (credential surfacing) | TC-1.1 on sg_dev (file at `~/deer-flow/backend/.deer-flow/admin_initial_credentials.txt`, mode 0600, password 22 chars) — the only Docker-unique step is whether the bind mount projects this path onto the host, which is a `docker compose` config check, not a runtime behavior change |
|
| TC-DOCKER-05 (credential surfacing) | `reset_admin` writes `.deer-flow/admin_initial_credentials.txt` with mode 0600 and logs only the path — the only Docker-unique step is whether the bind mount projects this path onto the host, which is a `docker compose` config check, not a runtime behavior change |
|
||||||
| TC-DOCKER-06 (gateway-mode container) | Section 七 7.2 covered by TC-GW-01..05 + Section 二 (gateway-mode auth flow on sg_dev) — same Gateway code, container is just a packaging change |
|
| TC-DOCKER-06 (gateway-mode container) | Section 七 7.2 covered by TC-GW-01..05 + Section 二 (gateway-mode auth flow on sg_dev) — same Gateway code, container is just a packaging change |
|
||||||
|
|
||||||
## Reproduction steps when Docker becomes available
|
## Reproduction steps when Docker becomes available
|
||||||
@@ -72,6 +72,6 @@ Then run TC-DOCKER-01..06 from the test plan as written.
|
|||||||
about *container packaging* details (bind mounts, multi-worker, log
|
about *container packaging* details (bind mounts, multi-worker, log
|
||||||
collection), not about whether the auth code paths work.
|
collection), not about whether the auth code paths work.
|
||||||
- **TC-DOCKER-05 was updated in place** in `AUTH_TEST_PLAN.md` to reflect
|
- **TC-DOCKER-05 was updated in place** in `AUTH_TEST_PLAN.md` to reflect
|
||||||
the post-simplify reality (credentials file → 0600 file, no log leak).
|
the current reset flow (`reset_admin` → 0600 credentials file, no log leak).
|
||||||
The old "grep 'Password:' in docker logs" expectation would have failed
|
The old "grep 'Password:' in docker logs" expectation would have failed
|
||||||
silently and given a false sense of coverage.
|
silently and given a false sense of coverage.
|
||||||
|
|||||||
+149
-105
@@ -19,7 +19,7 @@
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 清除已有数据
|
# 清除已有数据
|
||||||
rm -f backend/.deer-flow/users.db
|
rm -f backend/.deer-flow/data/deerflow.db
|
||||||
|
|
||||||
# 选择模式启动
|
# 选择模式启动
|
||||||
make dev # 标准模式
|
make dev # 标准模式
|
||||||
@@ -28,10 +28,11 @@ make dev-pro # Gateway 模式
|
|||||||
```
|
```
|
||||||
|
|
||||||
**验证点:**
|
**验证点:**
|
||||||
- [ ] 控制台输出 admin 邮箱和随机密码
|
- [ ] 控制台不输出 admin 邮箱或明文密码
|
||||||
- [ ] 密码格式为 `secrets.token_urlsafe(16)` 的 22 字符字符串
|
- [ ] 控制台提示 `First boot detected — no admin account exists.`
|
||||||
- [ ] 邮箱为 `admin@deerflow.dev`
|
- [ ] 控制台提示访问 `/setup` 完成 admin 创建
|
||||||
- [ ] 提示 `Change it after login: Settings -> Account`
|
- [ ] `GET /api/v1/auth/setup-status` 返回 `{"needs_setup": true}`
|
||||||
|
- [ ] 前端访问 `/login` 会跳转 `/setup`
|
||||||
|
|
||||||
### 1.2 非首次启动
|
### 1.2 非首次启动
|
||||||
|
|
||||||
@@ -42,7 +43,8 @@ make dev
|
|||||||
|
|
||||||
**验证点:**
|
**验证点:**
|
||||||
- [ ] 控制台不输出密码
|
- [ ] 控制台不输出密码
|
||||||
- [ ] 如果 admin 仍 `needs_setup=True`,控制台有 warning 提示
|
- [ ] `GET /api/v1/auth/setup-status` 返回 `{"needs_setup": false}`
|
||||||
|
- [ ] 已登录用户如果 `needs_setup=True`,访问 workspace 会被引导到 `/setup` 完成改邮箱 / 改密码流程
|
||||||
|
|
||||||
### 1.3 环境变量配置
|
### 1.3 环境变量配置
|
||||||
|
|
||||||
@@ -76,19 +78,22 @@ make dev
|
|||||||
curl -s $BASE/api/v1/auth/setup-status | jq .
|
curl -s $BASE/api/v1/auth/setup-status | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 返回 `{"needs_setup": false}`(admin 在启动时已自动创建,`count_users() > 0`)。仅在启动完成前的极短窗口内可能返回 `true`。
|
**预期:**
|
||||||
|
- 干净数据库且尚未初始化 admin:返回 `{"needs_setup": true}`
|
||||||
|
- 已存在 admin:返回 `{"needs_setup": false}`
|
||||||
|
|
||||||
#### TC-API-02: Admin 首次登录
|
#### TC-API-02: 首次初始化 Admin
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -s -X POST $BASE/api/v1/auth/login/local \
|
curl -s -X POST $BASE/api/v1/auth/initialize \
|
||||||
-d "username=admin@deerflow.dev&password=<控制台密码>" \
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
||||||
-c cookies.txt | jq .
|
-c cookies.txt | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- 状态码 200
|
- 状态码 201
|
||||||
- Body: `{"expires_in": 604800, "needs_setup": true}`
|
- Body: `{"id": "...", "email": "admin@example.com", "system_role": "admin", "needs_setup": false}`
|
||||||
- `cookies.txt` 包含 `access_token`(HttpOnly)和 `csrf_token`(非 HttpOnly)
|
- `cookies.txt` 包含 `access_token`(HttpOnly)和 `csrf_token`(非 HttpOnly)
|
||||||
|
|
||||||
#### TC-API-03: 获取当前用户
|
#### TC-API-03: 获取当前用户
|
||||||
@@ -97,9 +102,9 @@ curl -s -X POST $BASE/api/v1/auth/login/local \
|
|||||||
curl -s $BASE/api/v1/auth/me -b cookies.txt | jq .
|
curl -s $BASE/api/v1/auth/me -b cookies.txt | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** `{"id": "...", "email": "admin@deerflow.dev", "system_role": "admin", "needs_setup": true}`
|
**预期:** `{"id": "...", "email": "admin@example.com", "system_role": "admin", "needs_setup": false}`
|
||||||
|
|
||||||
#### TC-API-04: Setup 流程(改邮箱 + 改密码)
|
#### TC-API-04: 改密码流程
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
||||||
@@ -107,13 +112,36 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b cookies.txt \
|
-b cookies.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"<控制台密码>","new_password":"NewPass123!","new_email":"admin@example.com"}' | jq .
|
-d '{"current_password":"AdminPass1!","new_password":"NewPass123!"}' | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- 状态码 200
|
- 状态码 200
|
||||||
- `{"message": "Password changed successfully"}`
|
- `{"message": "Password changed successfully"}`
|
||||||
- 再调 `/auth/me` 邮箱变为 `admin@example.com`,`needs_setup` 变为 `false`
|
- 再调 `/auth/me` 仍为 `admin@example.com`,`needs_setup` 仍为 `false`
|
||||||
|
|
||||||
|
#### TC-API-04a: reset_admin 后的 Setup 流程(改邮箱 + 改密码)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd backend
|
||||||
|
python -m app.gateway.auth.reset_admin --email admin@example.com
|
||||||
|
# 从 .deer-flow/admin_initial_credentials.txt 读取 reset 后密码
|
||||||
|
|
||||||
|
curl -s -X POST $BASE/api/v1/auth/login/local \
|
||||||
|
-d "username=admin@example.com&password=<凭据文件密码>" \
|
||||||
|
-c cookies.txt | jq .
|
||||||
|
|
||||||
|
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
||||||
|
curl -s -X POST $BASE/api/v1/auth/change-password \
|
||||||
|
-b cookies.txt \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
|
-d '{"current_password":"<凭据文件密码>","new_password":"AdminPass2!","new_email":"admin2@example.com"}' | jq .
|
||||||
|
```
|
||||||
|
|
||||||
|
**预期:**
|
||||||
|
- 登录返回 `{"expires_in": 604800, "needs_setup": true}`
|
||||||
|
- `change-password` 后 `/auth/me` 邮箱变为 `admin2@example.com`,`needs_setup` 变为 `false`
|
||||||
|
|
||||||
#### TC-API-05: 普通用户注册
|
#### TC-API-05: 普通用户注册
|
||||||
|
|
||||||
@@ -493,7 +521,7 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 检查数据库
|
# 检查数据库
|
||||||
sqlite3 backend/.deer-flow/users.db "SELECT email, password_hash FROM users LIMIT 3;"
|
sqlite3 backend/.deer-flow/data/deerflow.db "SELECT email, password_hash FROM users LIMIT 3;"
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** `password_hash` 以 `$2b$` 开头(bcrypt 格式)
|
**预期:** `password_hash` 以 `$2b$` 开头(bcrypt 格式)
|
||||||
@@ -506,24 +534,25 @@ sqlite3 backend/.deer-flow/users.db "SELECT email, password_hash FROM users LIMI
|
|||||||
|
|
||||||
### 4.1 首次登录流程
|
### 4.1 首次登录流程
|
||||||
|
|
||||||
#### TC-UI-01: 访问首页跳转登录
|
#### TC-UI-01: 无 admin 时访问 workspace 跳转 setup
|
||||||
|
|
||||||
1. 打开 `http://localhost:2026/workspace`
|
1. 打开 `http://localhost:2026/workspace`
|
||||||
2. **预期:** 自动跳转到 `/login`
|
2. **预期:** 自动跳转到 `/setup`
|
||||||
|
|
||||||
#### TC-UI-02: Login 页面
|
#### TC-UI-02: Setup 页面创建 admin
|
||||||
|
|
||||||
1. 输入 admin 邮箱和控制台密码
|
1. 输入 admin 邮箱、密码、确认密码
|
||||||
2. 点击 Login
|
2. 点击 Create Admin Account
|
||||||
3. **预期:** 跳转到 `/setup`(因为 `needs_setup=true`)
|
|
||||||
|
|
||||||
#### TC-UI-03: Setup 页面
|
|
||||||
|
|
||||||
1. 输入新邮箱、控制台密码(current)、新密码、确认密码
|
|
||||||
2. 点击 Complete Setup
|
|
||||||
3. **预期:** 跳转到 `/workspace`
|
3. **预期:** 跳转到 `/workspace`
|
||||||
4. 刷新页面不跳回 `/setup`
|
4. 刷新页面不跳回 `/setup`
|
||||||
|
|
||||||
|
#### TC-UI-03: 已初始化后 Login 页面
|
||||||
|
|
||||||
|
1. 退出登录后访问 `/login`
|
||||||
|
2. 输入 admin 邮箱和密码
|
||||||
|
3. 点击 Login
|
||||||
|
4. **预期:** 跳转到 `/workspace`
|
||||||
|
|
||||||
#### TC-UI-04: Setup 密码不匹配
|
#### TC-UI-04: Setup 密码不匹配
|
||||||
|
|
||||||
1. 新密码和确认密码不一致
|
1. 新密码和确认密码不一致
|
||||||
@@ -602,7 +631,7 @@ sqlite3 backend/.deer-flow/users.db "SELECT email, password_hash FROM users LIMI
|
|||||||
#### TC-UI-15: reset_admin 后重新登录
|
#### TC-UI-15: reset_admin 后重新登录
|
||||||
|
|
||||||
1. 执行 `cd backend && python -m app.gateway.auth.reset_admin`
|
1. 执行 `cd backend && python -m app.gateway.auth.reset_admin`
|
||||||
2. 使用新密码登录
|
2. 从 `.deer-flow/admin_initial_credentials.txt` 读取新密码并登录
|
||||||
3. **预期:** 跳转到 `/setup` 页面(`needs_setup` 被重置为 true)
|
3. **预期:** 跳转到 `/setup` 页面(`needs_setup` 被重置为 true)
|
||||||
4. 旧 session 已失效
|
4. 旧 session 已失效
|
||||||
|
|
||||||
@@ -645,18 +674,28 @@ make install
|
|||||||
make dev
|
make dev
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-UPG-01: 首次启动创建 admin
|
#### TC-UPG-01: 首次启动等待 admin 初始化
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 控制台输出 admin 邮箱(`admin@deerflow.dev`)和随机密码
|
- [ ] 控制台不输出 admin 邮箱或随机密码
|
||||||
|
- [ ] 访问 `/setup` 可创建第一个 admin
|
||||||
- [ ] 无报错,正常启动
|
- [ ] 无报错,正常启动
|
||||||
|
|
||||||
#### TC-UPG-02: 旧 Thread 迁移到 admin
|
#### TC-UPG-02: 旧 Thread 迁移到 admin
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# 创建第一个 admin
|
||||||
|
curl -s -X POST http://localhost:2026/api/v1/auth/initialize \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
||||||
|
-c cookies.txt
|
||||||
|
|
||||||
|
# 重启一次:启动迁移只在已有 admin 的启动路径执行
|
||||||
|
make stop && make dev
|
||||||
|
|
||||||
# 登录 admin
|
# 登录 admin
|
||||||
curl -s -X POST http://localhost:2026/api/v1/auth/login/local \
|
curl -s -X POST http://localhost:2026/api/v1/auth/login/local \
|
||||||
-d "username=admin@deerflow.dev&password=<控制台密码>" \
|
-d "username=admin@example.com&password=AdminPass1!" \
|
||||||
-c cookies.txt
|
-c cookies.txt
|
||||||
|
|
||||||
# 查看 thread 列表
|
# 查看 thread 列表
|
||||||
@@ -670,8 +709,8 @@ curl -s -X POST http://localhost:2026/api/threads/search \
|
|||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 返回的 thread 数量 ≥ 旧版创建的数量
|
- [ ] 返回的 thread 数量 ≥ 旧版创建的数量
|
||||||
- [ ] 控制台日志有 `Migrated N orphaned thread(s) to admin`
|
- [ ] 控制台日志有 `Migrated N orphan LangGraph thread(s) to admin`
|
||||||
- [ ] 每个 thread 的 `metadata.owner_id` 都已被设为 admin 的 ID
|
- [ ] 旧 thread 只对 admin 可见
|
||||||
|
|
||||||
#### TC-UPG-03: 旧 Thread 内容完整
|
#### TC-UPG-03: 旧 Thread 内容完整
|
||||||
|
|
||||||
@@ -683,7 +722,7 @@ curl -s http://localhost:2026/api/threads/<old-thread-id> \
|
|||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] `metadata.title` 保留原值(如 `old-thread-1`)
|
- [ ] `metadata.title` 保留原值(如 `old-thread-1`)
|
||||||
- [ ] `metadata.owner_id` 已填充
|
- [ ] 响应不回显服务端保留的 `user_id` / `owner_id`
|
||||||
|
|
||||||
#### TC-UPG-04: 新用户看不到旧 Thread
|
#### TC-UPG-04: 新用户看不到旧 Thread
|
||||||
|
|
||||||
@@ -706,18 +745,19 @@ curl -s -X POST http://localhost:2026/api/threads/search \
|
|||||||
|
|
||||||
### 5.3 数据库 Schema 兼容
|
### 5.3 数据库 Schema 兼容
|
||||||
|
|
||||||
#### TC-UPG-05: 无 users.db 时自动创建
|
#### TC-UPG-05: 无 deerflow.db 时创建 schema 但不创建默认用户
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ls -la backend/.deer-flow/users.db
|
ls -la backend/.deer-flow/data/deerflow.db
|
||||||
|
sqlite3 backend/.deer-flow/data/deerflow.db "SELECT COUNT(*) FROM users;"
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 文件存在,`sqlite3` 可查到 `users` 表含 `needs_setup`、`token_version` 列
|
**预期:** 文件存在,`sqlite3` 可查到 `users` 表含 `needs_setup`、`token_version` 列;未调用 `/initialize` 前用户数为 0
|
||||||
|
|
||||||
#### TC-UPG-06: users.db WAL 模式
|
#### TC-UPG-06: deerflow.db WAL 模式
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sqlite3 backend/.deer-flow/users.db "PRAGMA journal_mode;"
|
sqlite3 backend/.deer-flow/data/deerflow.db "PRAGMA journal_mode;"
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 返回 `wal`
|
**预期:** 返回 `wal`
|
||||||
@@ -768,9 +808,9 @@ make dev
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 服务正常启动(忽略 `users.db`,无 auth 相关代码不报错)
|
- [ ] 服务正常启动(忽略 `deerflow.db`,无 auth 相关代码不报错)
|
||||||
- [ ] 旧对话数据仍然可访问
|
- [ ] 旧对话数据仍然可访问
|
||||||
- [ ] `users.db` 文件残留但不影响运行
|
- [ ] `deerflow.db` 文件残留但不影响运行
|
||||||
|
|
||||||
#### TC-UPG-12: 再次升级到 auth 分支
|
#### TC-UPG-12: 再次升级到 auth 分支
|
||||||
|
|
||||||
@@ -781,51 +821,47 @@ make dev
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 识别已有 `users.db`,不重新创建 admin
|
- [ ] 识别已有 `deerflow.db`,不重新创建 admin
|
||||||
- [ ] 旧的 admin 账号仍可登录(如果回退期间未删 `users.db`)
|
- [ ] 旧的 admin 账号仍可登录(如果回退期间未删 `deerflow.db`)
|
||||||
|
|
||||||
### 5.7 休眠 Admin(初始密码未使用/未更改)
|
### 5.7 Admin 初始化与 reset_admin
|
||||||
|
|
||||||
> 首次启动生成 admin + 随机密码,但运维未登录、未改密码。
|
> 首次启动不生成默认 admin,也不在日志输出密码。忘记密码时走 `reset_admin`,新密码写入 0600 凭据文件。
|
||||||
> 密码只在首次启动的控制台闪过一次,后续启动不再显示。
|
|
||||||
|
|
||||||
#### TC-UPG-13: 重启后自动重置密码并打印
|
#### TC-UPG-13: 未初始化 admin 时重启不创建默认账号
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 首次启动,记录密码
|
rm -f backend/.deer-flow/data/deerflow.db
|
||||||
rm -f backend/.deer-flow/users.db
|
|
||||||
make dev
|
make dev
|
||||||
# 控制台输出密码 P0,不登录
|
|
||||||
make stop
|
make stop
|
||||||
|
|
||||||
# 隔了几天,再次启动
|
|
||||||
make dev
|
make dev
|
||||||
# 控制台输出新密码 P1
|
curl -s $BASE/api/v1/auth/setup-status | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 控制台输出 `Admin account setup incomplete — password reset`
|
- [ ] 控制台不输出密码
|
||||||
- [ ] 输出新密码 P1(P0 已失效)
|
- [ ] `setup-status` 仍为 `{"needs_setup": true}`
|
||||||
- [ ] 用 P1 可以登录,P0 不可以
|
- [ ] 访问 `/setup` 仍可创建第一个 admin
|
||||||
- [ ] 登录后 `needs_setup=true`,跳转 `/setup`
|
|
||||||
- [ ] `token_version` 递增(旧 session 如有也失效)
|
|
||||||
|
|
||||||
#### TC-UPG-14: 密码丢失 — 无需 CLI,重启即可
|
#### TC-UPG-14: 密码丢失 — reset_admin 写入凭据文件
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 忘记了控制台密码 → 直接重启服务
|
python -m app.gateway.auth.reset_admin --email admin@example.com
|
||||||
make stop && make dev
|
ls -la backend/.deer-flow/admin_initial_credentials.txt
|
||||||
# 控制台自动输出新密码
|
cat backend/.deer-flow/admin_initial_credentials.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 无需 `reset_admin`,重启服务即可拿到新密码
|
- [ ] 命令行只输出凭据文件路径,不输出明文密码
|
||||||
- [ ] `reset_admin` CLI 仍然可用作手动备选方案
|
- [ ] 凭据文件权限为 `0600`
|
||||||
|
- [ ] 凭据文件包含 email + password 行
|
||||||
|
- [ ] 该用户下次登录返回 `needs_setup=true`
|
||||||
|
|
||||||
#### TC-UPG-15: 休眠 admin 期间普通用户注册
|
#### TC-UPG-15: 未初始化 admin 期间普通用户注册策略边界
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# admin 存在但从未登录,普通用户先注册
|
# admin 尚不存在,普通用户尝试注册
|
||||||
curl -s -X POST $BASE/api/v1/auth/register \
|
curl -s -X POST $BASE/api/v1/auth/register \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"email":"earlybird@example.com","password":"EarlyPass1!"}' \
|
-d '{"email":"earlybird@example.com","password":"EarlyPass1!"}' \
|
||||||
@@ -833,11 +869,11 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 注册成功(201),角色为 `user`
|
- [ ] 当前代码允许注册普通用户并自动登录(201,角色为 `user`)
|
||||||
- [ ] 无法提权为 admin
|
- [ ] 但 `setup-status` 仍为 `{"needs_setup": true}`,因为 admin 仍不存在
|
||||||
- [ ] 普通用户的数据与 admin 隔离
|
- [ ] 这是一个产品策略边界:若要求“必须先有 admin”,需要在 `/register` 增加 admin-exists gate
|
||||||
|
|
||||||
#### TC-UPG-16: 休眠 admin 不影响后续操作
|
#### TC-UPG-16: 普通用户数据与后续 admin 隔离
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 普通用户正常创建 thread、发消息
|
# 普通用户正常创建 thread、发消息
|
||||||
@@ -849,14 +885,13 @@ curl -s -X POST $BASE/api/threads \
|
|||||||
-d '{"metadata":{}}' | jq .thread_id
|
-d '{"metadata":{}}' | jq .thread_id
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 正常创建,不受休眠 admin 影响
|
**预期:** 普通用户正常创建 thread;后续 admin 创建后,搜索不到该普通用户 thread
|
||||||
|
|
||||||
#### TC-UPG-17: 休眠 admin 最终完成 Setup
|
#### TC-UPG-17: reset_admin 后完成 Setup
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 运维终于登录
|
|
||||||
curl -s -X POST $BASE/api/v1/auth/login/local \
|
curl -s -X POST $BASE/api/v1/auth/login/local \
|
||||||
-d "username=admin@deerflow.dev&password=<P0或P1>" \
|
-d "username=admin@example.com&password=<凭据文件密码>" \
|
||||||
-c admin.txt | jq .needs_setup
|
-c admin.txt | jq .needs_setup
|
||||||
# 预期: true
|
# 预期: true
|
||||||
|
|
||||||
@@ -866,7 +901,7 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b admin.txt \
|
-b admin.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"<密码>","new_password":"AdminFinal1!","new_email":"admin@real.com"}' \
|
-d '{"current_password":"<凭据文件密码>","new_password":"AdminFinal1!","new_email":"admin@real.com"}' \
|
||||||
-c admin.txt
|
-c admin.txt
|
||||||
|
|
||||||
# 验证
|
# 验证
|
||||||
@@ -876,7 +911,7 @@ curl -s $BASE/api/v1/auth/me -b admin.txt | jq '{email, needs_setup}'
|
|||||||
**预期:**
|
**预期:**
|
||||||
- [ ] `email` 变为 `admin@real.com`
|
- [ ] `email` 变为 `admin@real.com`
|
||||||
- [ ] `needs_setup` 变为 `false`
|
- [ ] `needs_setup` 变为 `false`
|
||||||
- [ ] 后续重启控制台不再有 warning
|
- [ ] 后续登录使用新密码
|
||||||
|
|
||||||
#### TC-UPG-18: 长期未用后 JWT 密钥轮换
|
#### TC-UPG-18: 长期未用后 JWT 密钥轮换
|
||||||
|
|
||||||
@@ -890,8 +925,8 @@ make stop && make dev
|
|||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 服务正常启动
|
- [ ] 服务正常启动
|
||||||
- [ ] 旧密码仍可登录(密码存在 DB,与 JWT 密钥无关)
|
- [ ] 账号密码仍可登录(密码存在 DB,与 JWT 密钥无关)
|
||||||
- [ ] 旧的 JWT token 失效(密钥变了签名不匹配)— 但因为从未登录过也没有旧 token
|
- [ ] 旧的 JWT token 失效(密钥变了签名不匹配)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -910,7 +945,7 @@ for i in 1 2 3; do
|
|||||||
done
|
done
|
||||||
|
|
||||||
# 检查 admin 数量
|
# 检查 admin 数量
|
||||||
sqlite3 backend/.deer-flow/users.db \
|
sqlite3 backend/.deer-flow/data/deerflow.db \
|
||||||
"SELECT COUNT(*) FROM users WHERE system_role='admin';"
|
"SELECT COUNT(*) FROM users WHERE system_role='admin';"
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -1055,7 +1090,7 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
wait
|
wait
|
||||||
|
|
||||||
# 检查用户数
|
# 检查用户数
|
||||||
sqlite3 backend/.deer-flow/users.db \
|
sqlite3 backend/.deer-flow/data/deerflow.db \
|
||||||
"SELECT COUNT(*) FROM users WHERE email='race@example.com';"
|
"SELECT COUNT(*) FROM users WHERE email='race@example.com';"
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -1165,13 +1200,16 @@ curl -s -w "%{http_code}" -X DELETE "$BASE/api/threads/$TID" \
|
|||||||
```bash
|
```bash
|
||||||
cd backend
|
cd backend
|
||||||
python -m app.gateway.auth.reset_admin
|
python -m app.gateway.auth.reset_admin
|
||||||
# 记录密码 P1
|
cp .deer-flow/admin_initial_credentials.txt /tmp/deerflow-reset-p1.txt
|
||||||
|
P1=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p1.txt)
|
||||||
|
|
||||||
python -m app.gateway.auth.reset_admin
|
python -m app.gateway.auth.reset_admin
|
||||||
# 记录密码 P2
|
cp .deer-flow/admin_initial_credentials.txt /tmp/deerflow-reset-p2.txt
|
||||||
|
P2=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p2.txt)
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
|
- [ ] `.deer-flow/admin_initial_credentials.txt` 每次都会被重写,文件权限为 `0600`
|
||||||
- [ ] P1 ≠ P2(每次生成新随机密码)
|
- [ ] P1 ≠ P2(每次生成新随机密码)
|
||||||
- [ ] P1 不可用,只有 P2 有效
|
- [ ] P1 不可用,只有 P2 有效
|
||||||
- [ ] `token_version` 递增了 2
|
- [ ] `token_version` 递增了 2
|
||||||
@@ -1324,7 +1362,8 @@ done
|
|||||||
```bash
|
```bash
|
||||||
GW=http://localhost:8001
|
GW=http://localhost:8001
|
||||||
|
|
||||||
for path in /health /api/v1/auth/setup-status /api/v1/auth/login/local /api/v1/auth/register; do
|
for path in /health /api/v1/auth/setup-status /api/v1/auth/login/local \
|
||||||
|
/api/v1/auth/register /api/v1/auth/initialize /api/v1/auth/logout; do
|
||||||
echo "$path: $(curl -s -w '%{http_code}' -o /dev/null $GW$path)"
|
echo "$path: $(curl -s -w '%{http_code}' -o /dev/null $GW$path)"
|
||||||
done
|
done
|
||||||
# 预期: 200 或 405/422(方法不对但不是 401)
|
# 预期: 200 或 405/422(方法不对但不是 401)
|
||||||
@@ -1399,9 +1438,9 @@ done
|
|||||||
>
|
>
|
||||||
> 前置条件:
|
> 前置条件:
|
||||||
> - `.env` 中设置 `AUTH_JWT_SECRET`(否则每次容器重启 session 全部失效)
|
> - `.env` 中设置 `AUTH_JWT_SECRET`(否则每次容器重启 session 全部失效)
|
||||||
> - `DEER_FLOW_HOME` 挂载到宿主机目录(持久化 `users.db`)
|
> - `DEER_FLOW_HOME` 挂载到宿主机目录(持久化 `deerflow.db`)
|
||||||
|
|
||||||
#### TC-DOCKER-01: users.db 通过 volume 持久化
|
#### TC-DOCKER-01: deerflow.db 通过 volume 持久化
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 启动容器
|
# 启动容器
|
||||||
@@ -1416,13 +1455,13 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"email":"docker-test@example.com","password":"DockerTest1!"}' -w "\nHTTP %{http_code}"
|
-d '{"email":"docker-test@example.com","password":"DockerTest1!"}' -w "\nHTTP %{http_code}"
|
||||||
|
|
||||||
# 检查宿主机上的 users.db
|
# 检查宿主机上的 deerflow.db
|
||||||
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/users.db
|
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/data/deerflow.db
|
||||||
sqlite3 ${DEER_FLOW_HOME:-backend/.deer-flow}/users.db \
|
sqlite3 ${DEER_FLOW_HOME:-backend/.deer-flow}/data/deerflow.db \
|
||||||
"SELECT email FROM users WHERE email='docker-test@example.com';"
|
"SELECT email FROM users WHERE email='docker-test@example.com';"
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** users.db 在宿主机 `DEER_FLOW_HOME` 目录中,查询可见刚注册的用户。
|
**预期:** deerflow.db 在宿主机 `DEER_FLOW_HOME` 目录中,查询可见刚注册的用户。
|
||||||
|
|
||||||
#### TC-DOCKER-02: 重启容器后 session 保持
|
#### TC-DOCKER-02: 重启容器后 session 保持
|
||||||
|
|
||||||
@@ -1466,22 +1505,24 @@ done
|
|||||||
|
|
||||||
**已知限制:** In-process rate limiter 不跨 worker 共享。生产环境如需精确限速,需要 Redis 等外部存储。
|
**已知限制:** In-process rate limiter 不跨 worker 共享。生产环境如需精确限速,需要 Redis 等外部存储。
|
||||||
|
|
||||||
#### TC-DOCKER-04: IM 渠道不经过 auth
|
#### TC-DOCKER-04: IM 渠道使用内部认证
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# IM 渠道(Feishu/Slack/Telegram)在 gateway 容器内部通过 LangGraph SDK 通信
|
# IM 渠道(Feishu/Slack/Telegram)在 gateway 容器内部通过 LangGraph SDK 调 Gateway
|
||||||
# 不走 nginx,不经过 AuthMiddleware
|
# 请求携带 process-local internal auth header,并带匹配的 CSRF cookie/header
|
||||||
|
|
||||||
# 验证方式:检查 gateway 日志中 channel manager 的请求不包含 auth 错误
|
# 验证方式:检查 gateway 日志中 channel manager 的请求不包含 auth 错误
|
||||||
docker logs deer-flow-gateway 2>&1 | grep -E "ChannelManager|channel" | head -10
|
docker logs deer-flow-gateway 2>&1 | grep -E "ChannelManager|channel" | head -10
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 无 auth 相关错误。渠道通过 `langgraph-sdk` 直连 LangGraph Server(`http://langgraph:2024`),不走 auth 层。
|
**预期:** 无 auth 相关错误。渠道不依赖浏览器 cookie;服务端通过内部认证头把请求归入 `default` 用户桶。
|
||||||
|
|
||||||
#### TC-DOCKER-05: admin 密码写入 0600 凭证文件(不再走日志)
|
#### TC-DOCKER-05: reset_admin 密码写入 0600 凭证文件(不再走日志)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 凭证文件写在挂载到宿主机的 DEER_FLOW_HOME 下
|
# 首次启动不会自动生成 admin 密码。先重置已有 admin,凭据文件写在挂载到宿主机的 DEER_FLOW_HOME 下。
|
||||||
|
docker exec deer-flow-gateway python -m app.gateway.auth.reset_admin --email docker-test@example.com
|
||||||
|
|
||||||
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/admin_initial_credentials.txt
|
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/admin_initial_credentials.txt
|
||||||
# 预期文件权限: -rw------- (0600)
|
# 预期文件权限: -rw------- (0600)
|
||||||
|
|
||||||
@@ -1512,14 +1553,15 @@ sleep 15
|
|||||||
docker ps --filter name=deer-flow-langgraph --format '{{.Names}}' | wc -l
|
docker ps --filter name=deer-flow-langgraph --format '{{.Names}}' | wc -l
|
||||||
# 预期: 0
|
# 预期: 0
|
||||||
|
|
||||||
# auth 流程正常
|
# auth 流程正常:未登录受保护接口返回 401
|
||||||
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
||||||
# 预期: 401
|
# 预期: 401
|
||||||
|
|
||||||
curl -s -X POST $BASE/api/v1/auth/login/local \
|
curl -s -X POST $BASE/api/v1/auth/initialize \
|
||||||
-d "username=admin@deerflow.dev&password=<日志密码>" \
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
||||||
-c cookies.txt -w "\nHTTP %{http_code}"
|
-c cookies.txt -w "\nHTTP %{http_code}"
|
||||||
# 预期: 200
|
# 预期: 201
|
||||||
```
|
```
|
||||||
|
|
||||||
### 7.4 补充边界用例
|
### 7.4 补充边界用例
|
||||||
@@ -1587,13 +1629,15 @@ curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
|||||||
#### TC-EDGE-05: HTTP 无 max_age / HTTPS 有 max_age
|
#### TC-EDGE-05: HTTP 无 max_age / HTTPS 有 max_age
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
GW=http://localhost:8001
|
||||||
|
|
||||||
# HTTP
|
# HTTP
|
||||||
curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
curl -s -D - -X POST $GW/api/v1/auth/login/local \
|
||||||
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
||||||
| grep "access_token=" | grep -oi "max-age=[0-9]*" || echo "NO max-age (HTTP session cookie)"
|
| grep "access_token=" | grep -oi "max-age=[0-9]*" || echo "NO max-age (HTTP session cookie)"
|
||||||
|
|
||||||
# HTTPS
|
# HTTPS:直连 Gateway 才能用 X-Forwarded-Proto 模拟 HTTPS;nginx 会覆盖该 header
|
||||||
curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
curl -s -D - -X POST $GW/api/v1/auth/login/local \
|
||||||
-H "X-Forwarded-Proto: https" \
|
-H "X-Forwarded-Proto: https" \
|
||||||
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
||||||
| grep "access_token=" | grep -oi "max-age=[0-9]*"
|
| grep "access_token=" | grep -oi "max-age=[0-9]*"
|
||||||
@@ -1712,10 +1756,10 @@ curl -s -X POST $BASE/api/threads \
|
|||||||
-b cookies.txt \
|
-b cookies.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"metadata":{"owner_id":"victim-user-id"}}' | jq .metadata.owner_id
|
-d '{"metadata":{"owner_id":"victim-user-id","user_id":"victim-user-id"}}' | jq .metadata
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 返回的 `metadata.owner_id` 应为当前登录用户的 ID,不是请求中注入的 `victim-user-id`。服务端应覆盖客户端提供的 `user_id`。
|
**预期:** 返回的 `metadata` 不包含 `owner_id` 或 `user_id`。真实所有权写入 `threads_meta.user_id`,不从客户端 metadata 接收,也不通过 metadata 回显。
|
||||||
|
|
||||||
#### 7.5.6 HTTP Method 探测
|
#### 7.5.6 HTTP Method 探测
|
||||||
|
|
||||||
@@ -1796,6 +1840,6 @@ cd backend && PYTHONPATH=. uv run pytest \
|
|||||||
# 核心接口冒烟
|
# 核心接口冒烟
|
||||||
curl -s $BASE/health # 200
|
curl -s $BASE/health # 200
|
||||||
curl -s $BASE/api/models # 401 (无 cookie)
|
curl -s $BASE/api/models # 401 (无 cookie)
|
||||||
curl -s -X POST $BASE/api/v1/auth/setup-status # 200
|
curl -s $BASE/api/v1/auth/setup-status # 200
|
||||||
curl -s $BASE/api/v1/auth/me -b cookies.txt # 200 (有 cookie)
|
curl -s $BASE/api/v1/auth/me -b cookies.txt # 200 (有 cookie)
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -2,13 +2,16 @@
|
|||||||
|
|
||||||
DeerFlow 内置了认证模块。本文档面向从无认证版本升级的用户。
|
DeerFlow 内置了认证模块。本文档面向从无认证版本升级的用户。
|
||||||
|
|
||||||
|
完整设计见 [AUTH_DESIGN.md](AUTH_DESIGN.md)。
|
||||||
|
|
||||||
## 核心概念
|
## 核心概念
|
||||||
|
|
||||||
认证模块采用**始终强制**策略:
|
认证模块采用**始终强制**策略:
|
||||||
|
|
||||||
- 首次启动时自动创建 admin 账号,随机密码打印到控制台日志
|
- 首次启动时不会自动创建账号;首次访问 `/setup` 时由操作者创建第一个 admin 账号
|
||||||
- 认证从一开始就是强制的,无竞争窗口
|
- 认证从一开始就是强制的,无竞争窗口
|
||||||
- 历史对话(升级前创建的 thread)自动迁移到 admin 名下
|
- 已有 admin 后,服务启动时会把历史对话(升级前创建且缺少 `user_id` 的 thread)迁移到 admin 名下
|
||||||
|
- 新数据按用户隔离:thread、workspace/uploads/outputs、memory、自定义 agent 都归属当前用户
|
||||||
|
|
||||||
## 升级步骤
|
## 升级步骤
|
||||||
|
|
||||||
@@ -25,39 +28,41 @@ cd backend && make install
|
|||||||
make dev
|
make dev
|
||||||
```
|
```
|
||||||
|
|
||||||
控制台会输出:
|
如果没有 admin 账号,控制台只会提示:
|
||||||
|
|
||||||
```
|
```
|
||||||
============================================================
|
============================================================
|
||||||
Admin account created on first boot
|
First boot detected — no admin account exists.
|
||||||
Email: admin@deerflow.dev
|
Visit /setup to complete admin account creation.
|
||||||
Password: aB3xK9mN_pQ7rT2w
|
|
||||||
Change it after login: Settings → Account
|
|
||||||
============================================================
|
============================================================
|
||||||
```
|
```
|
||||||
|
|
||||||
如果未登录就重启了服务,不用担心——只要 setup 未完成,每次启动都会重置密码并重新打印到控制台。
|
首次启动不会在日志里打印随机密码,也不会写入默认 admin。这样避免启动日志泄露凭据,也避免在操作者创建账号前出现可被猜测的默认身份。
|
||||||
|
|
||||||
### 3. 登录
|
### 3. 创建 admin
|
||||||
|
|
||||||
访问 `http://localhost:2026/login`,使用控制台输出的邮箱和密码登录。
|
访问 `http://localhost:2026/setup`,填写邮箱和密码创建第一个 admin 账号。创建成功后会自动登录并进入 workspace。
|
||||||
|
|
||||||
### 4. 修改密码
|
如果这是从无认证版本升级,创建 admin 后重启一次服务,让启动迁移把缺少 `user_id` 的历史 thread 归属到 admin。
|
||||||
|
|
||||||
登录后进入 Settings → Account → Change Password。
|
### 4. 登录
|
||||||
|
|
||||||
|
后续访问 `http://localhost:2026/login`,使用已创建的邮箱和密码登录。
|
||||||
|
|
||||||
### 5. 添加用户(可选)
|
### 5. 添加用户(可选)
|
||||||
|
|
||||||
其他用户通过 `/login` 页面注册,自动获得 **user** 角色。每个用户只能看到自己的对话。
|
其他用户通过 `/login` 页面注册,自动获得 **user** 角色。每个用户只能看到自己的对话、上传文件、输出文件、memory 和自定义 agent。
|
||||||
|
|
||||||
## 安全机制
|
## 安全机制
|
||||||
|
|
||||||
| 机制 | 说明 |
|
| 机制 | 说明 |
|
||||||
|------|------|
|
|------|------|
|
||||||
| JWT HttpOnly Cookie | Token 不暴露给 JavaScript,防止 XSS 窃取 |
|
| JWT HttpOnly Cookie | Token 不暴露给 JavaScript,防止 XSS 窃取 |
|
||||||
| CSRF Double Submit Cookie | 所有 POST/PUT/DELETE 请求需携带 `X-CSRF-Token` |
|
| CSRF Double Submit Cookie | 受保护的 POST/PUT/PATCH/DELETE 请求需携带 `X-CSRF-Token`;登录/注册/初始化/登出走 auth 端点 Origin 校验 |
|
||||||
| bcrypt 密码哈希 | 密码不以明文存储 |
|
| bcrypt 密码哈希 | 密码不以明文存储 |
|
||||||
| 多租户隔离 | 用户只能访问自己的 thread |
|
| Thread owner filter | `threads_meta.user_id` 由服务端认证上下文写入,搜索、读取、更新、删除默认按当前用户过滤 |
|
||||||
|
| 文件系统隔离 | 线程数据写入 `{base_dir}/users/{user_id}/threads/{thread_id}/user-data/`,sandbox 内统一映射为 `/mnt/user-data/` |
|
||||||
|
| Memory / agent 隔离 | 用户 memory 和自定义 agent 写入 `{base_dir}/users/{user_id}/...`;旧共享 agent 只作为只读兼容回退 |
|
||||||
| HTTPS 自适应 | 检测 `x-forwarded-proto`,自动设置 `Secure` cookie 标志 |
|
| HTTPS 自适应 | 检测 `x-forwarded-proto`,自动设置 `Secure` cookie 标志 |
|
||||||
|
|
||||||
## 常见操作
|
## 常见操作
|
||||||
@@ -74,22 +79,26 @@ python -m app.gateway.auth.reset_admin
|
|||||||
python -m app.gateway.auth.reset_admin --email user@example.com
|
python -m app.gateway.auth.reset_admin --email user@example.com
|
||||||
```
|
```
|
||||||
|
|
||||||
会输出新的随机密码。
|
会把新的随机密码写入 `.deer-flow/admin_initial_credentials.txt`,文件权限为 `0600`。命令行只输出文件路径,不输出明文密码。
|
||||||
|
|
||||||
### 完全重置
|
### 完全重置
|
||||||
|
|
||||||
删除用户数据库,重启后自动创建新 admin:
|
删除统一 SQLite 数据库,重启后重新访问 `/setup` 创建新 admin:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
rm -f backend/.deer-flow/users.db
|
rm -f backend/.deer-flow/data/deerflow.db
|
||||||
# 重启服务,控制台输出新密码
|
# 重启服务后访问 http://localhost:2026/setup
|
||||||
```
|
```
|
||||||
|
|
||||||
## 数据存储
|
## 数据存储
|
||||||
|
|
||||||
| 文件 | 内容 |
|
| 文件 | 内容 |
|
||||||
|------|------|
|
|------|------|
|
||||||
| `.deer-flow/users.db` | SQLite 用户数据库(密码哈希、角色) |
|
| `.deer-flow/data/deerflow.db` | 统一 SQLite 数据库(users、threads_meta、runs、feedback 等应用数据) |
|
||||||
|
| `.deer-flow/users/{user_id}/threads/{thread_id}/user-data/` | 用户线程的 workspace、uploads、outputs |
|
||||||
|
| `.deer-flow/users/{user_id}/memory.json` | 用户级 memory |
|
||||||
|
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | 用户自定义 agent 配置、SOUL 和 agent memory |
|
||||||
|
| `.deer-flow/admin_initial_credentials.txt` | `reset_admin` 生成的新凭据文件(0600,读完应删除) |
|
||||||
| `.env` 中的 `AUTH_JWT_SECRET` | JWT 签名密钥(未设置时自动生成临时密钥,重启后 session 失效) |
|
| `.env` 中的 `AUTH_JWT_SECRET` | JWT 签名密钥(未设置时自动生成临时密钥,重启后 session 失效) |
|
||||||
|
|
||||||
### 生产环境建议
|
### 生产环境建议
|
||||||
@@ -111,19 +120,21 @@ python -c "import secrets; print(secrets.token_urlsafe(32))"
|
|||||||
| `/api/v1/auth/me` | GET | 获取当前用户信息 |
|
| `/api/v1/auth/me` | GET | 获取当前用户信息 |
|
||||||
| `/api/v1/auth/change-password` | POST | 修改密码 |
|
| `/api/v1/auth/change-password` | POST | 修改密码 |
|
||||||
| `/api/v1/auth/setup-status` | GET | 检查 admin 是否存在 |
|
| `/api/v1/auth/setup-status` | GET | 检查 admin 是否存在 |
|
||||||
|
| `/api/v1/auth/initialize` | POST | 首次初始化第一个 admin(仅无 admin 时可调用) |
|
||||||
|
|
||||||
## 兼容性
|
## 兼容性
|
||||||
|
|
||||||
- **标准模式**(`make dev`):完全兼容,admin 自动创建
|
- **标准模式**(`make dev`):完全兼容;无 admin 时访问 `/setup` 初始化
|
||||||
- **Gateway 模式**(`make dev-pro`):完全兼容
|
- **Gateway 模式**(`make dev-pro`):完全兼容
|
||||||
- **Docker 部署**:完全兼容,`.deer-flow/users.db` 需持久化卷挂载
|
- **Docker 部署**:完全兼容,`.deer-flow/data/deerflow.db` 需持久化卷挂载
|
||||||
- **IM 渠道**(Feishu/Slack/Telegram):通过 LangGraph SDK 通信,不经过认证层
|
- **IM 渠道**(Feishu/Slack/Telegram):通过 Gateway 内部认证通信,使用 `default` 用户桶
|
||||||
- **DeerFlowClient**(嵌入式):不经过 HTTP,不受认证影响
|
- **DeerFlowClient**(嵌入式):不经过 HTTP,不受认证影响
|
||||||
|
|
||||||
## 故障排查
|
## 故障排查
|
||||||
|
|
||||||
| 症状 | 原因 | 解决 |
|
| 症状 | 原因 | 解决 |
|
||||||
|------|------|------|
|
|------|------|------|
|
||||||
| 启动后没看到密码 | admin 已存在(非首次启动) | 用 `reset_admin` 重置,或删 `users.db` |
|
| 启动后没看到密码 | 当前实现不在启动日志输出密码 | 首次安装访问 `/setup`;忘记密码用 `reset_admin` |
|
||||||
|
| `/login` 自动跳到 `/setup` | 系统还没有 admin | 在 `/setup` 创建第一个 admin |
|
||||||
| 登录后 POST 返回 403 | CSRF token 缺失 | 确认前端已更新 |
|
| 登录后 POST 返回 403 | CSRF token 缺失 | 确认前端已更新 |
|
||||||
| 重启后需要重新登录 | `AUTH_JWT_SECRET` 未持久化 | 在 `.env` 中设置固定密钥 |
|
| 重启后需要重新登录 | `AUTH_JWT_SECRET` 未持久化 | 在 `.env` 中设置固定密钥 |
|
||||||
|
|||||||
@@ -8,6 +8,7 @@ This directory contains detailed documentation for the DeerFlow backend.
|
|||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
|
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
|
||||||
| [API.md](API.md) | Complete API reference |
|
| [API.md](API.md) | Complete API reference |
|
||||||
|
| [AUTH_DESIGN.md](AUTH_DESIGN.md) | User authentication, CSRF, and per-user isolation design |
|
||||||
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
|
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
|
||||||
| [SETUP.md](SETUP.md) | Quick setup guide |
|
| [SETUP.md](SETUP.md) | Quick setup guide |
|
||||||
|
|
||||||
@@ -42,6 +43,7 @@ docs/
|
|||||||
├── README.md # This file
|
├── README.md # This file
|
||||||
├── ARCHITECTURE.md # System architecture
|
├── ARCHITECTURE.md # System architecture
|
||||||
├── API.md # API reference
|
├── API.md # API reference
|
||||||
|
├── AUTH_DESIGN.md # User authentication and isolation design
|
||||||
├── CONFIGURATION.md # Configuration guide
|
├── CONFIGURATION.md # Configuration guide
|
||||||
├── SETUP.md # Setup instructions
|
├── SETUP.md # Setup instructions
|
||||||
├── FILE_UPLOAD.md # File upload feature
|
├── FILE_UPLOAD.md # File upload feature
|
||||||
|
|||||||
@@ -40,6 +40,15 @@ class MemoryUpdateQueue:
|
|||||||
self._timer: threading.Timer | None = None
|
self._timer: threading.Timer | None = None
|
||||||
self._processing = False
|
self._processing = False
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _queue_key(
|
||||||
|
thread_id: str,
|
||||||
|
user_id: str | None,
|
||||||
|
agent_name: str | None,
|
||||||
|
) -> tuple[str, str | None, str | None]:
|
||||||
|
"""Return the debounce identity for a memory update target."""
|
||||||
|
return (thread_id, user_id, agent_name)
|
||||||
|
|
||||||
def add(
|
def add(
|
||||||
self,
|
self,
|
||||||
thread_id: str,
|
thread_id: str,
|
||||||
@@ -115,8 +124,9 @@ class MemoryUpdateQueue:
|
|||||||
correction_detected: bool,
|
correction_detected: bool,
|
||||||
reinforcement_detected: bool,
|
reinforcement_detected: bool,
|
||||||
) -> None:
|
) -> None:
|
||||||
|
queue_key = self._queue_key(thread_id, user_id, agent_name)
|
||||||
existing_context = next(
|
existing_context = next(
|
||||||
(context for context in self._queue if context.thread_id == thread_id),
|
(context for context in self._queue if self._queue_key(context.thread_id, context.user_id, context.agent_name) == queue_key),
|
||||||
None,
|
None,
|
||||||
)
|
)
|
||||||
merged_correction_detected = correction_detected or (existing_context.correction_detected if existing_context is not None else False)
|
merged_correction_detected = correction_detected or (existing_context.correction_detected if existing_context is not None else False)
|
||||||
@@ -130,7 +140,7 @@ class MemoryUpdateQueue:
|
|||||||
reinforcement_detected=merged_reinforcement_detected,
|
reinforcement_detected=merged_reinforcement_detected,
|
||||||
)
|
)
|
||||||
|
|
||||||
self._queue = [c for c in self._queue if c.thread_id != thread_id]
|
self._queue = [context for context in self._queue if self._queue_key(context.thread_id, context.user_id, context.agent_name) != queue_key]
|
||||||
self._queue.append(context)
|
self._queue.append(context)
|
||||||
|
|
||||||
def _reset_timer(self) -> None:
|
def _reset_timer(self) -> None:
|
||||||
|
|||||||
@@ -6,6 +6,7 @@ from deerflow.agents.memory.message_processing import detect_correction, detect_
|
|||||||
from deerflow.agents.memory.queue import get_memory_queue
|
from deerflow.agents.memory.queue import get_memory_queue
|
||||||
from deerflow.agents.middlewares.summarization_middleware import SummarizationEvent
|
from deerflow.agents.middlewares.summarization_middleware import SummarizationEvent
|
||||||
from deerflow.config.memory_config import get_memory_config
|
from deerflow.config.memory_config import get_memory_config
|
||||||
|
from deerflow.runtime.user_context import resolve_runtime_user_id
|
||||||
|
|
||||||
|
|
||||||
def memory_flush_hook(event: SummarizationEvent) -> None:
|
def memory_flush_hook(event: SummarizationEvent) -> None:
|
||||||
@@ -21,11 +22,13 @@ def memory_flush_hook(event: SummarizationEvent) -> None:
|
|||||||
|
|
||||||
correction_detected = detect_correction(filtered_messages)
|
correction_detected = detect_correction(filtered_messages)
|
||||||
reinforcement_detected = not correction_detected and detect_reinforcement(filtered_messages)
|
reinforcement_detected = not correction_detected and detect_reinforcement(filtered_messages)
|
||||||
|
user_id = resolve_runtime_user_id(event.runtime)
|
||||||
queue = get_memory_queue()
|
queue = get_memory_queue()
|
||||||
queue.add_nowait(
|
queue.add_nowait(
|
||||||
thread_id=event.thread_id,
|
thread_id=event.thread_id,
|
||||||
messages=filtered_messages,
|
messages=filtered_messages,
|
||||||
agent_name=event.agent_name,
|
agent_name=event.agent_name,
|
||||||
|
user_id=user_id,
|
||||||
correction_detected=correction_detected,
|
correction_detected=correction_detected,
|
||||||
reinforcement_detected=reinforcement_detected,
|
reinforcement_detected=reinforcement_detected,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ from typing import Any, override
|
|||||||
from langchain.agents import AgentState
|
from langchain.agents import AgentState
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
from langchain.agents.middleware import AgentMiddleware
|
||||||
from langchain.agents.middleware.todo import Todo
|
from langchain.agents.middleware.todo import Todo
|
||||||
from langchain_core.messages import AIMessage
|
from langchain_core.messages import AIMessage, ToolMessage
|
||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -217,6 +217,17 @@ def _infer_step_kind(message: AIMessage, actions: list[dict[str, Any]]) -> str:
|
|||||||
return "thinking"
|
return "thinking"
|
||||||
|
|
||||||
|
|
||||||
|
def _has_tool_call(message: AIMessage, tool_call_id: str) -> bool:
|
||||||
|
"""Return True if the AIMessage contains a tool_call with the given id."""
|
||||||
|
for tc in message.tool_calls or []:
|
||||||
|
if isinstance(tc, dict):
|
||||||
|
if tc.get("id") == tool_call_id:
|
||||||
|
return True
|
||||||
|
elif hasattr(tc, "id") and tc.id == tool_call_id:
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _build_attribution(message: AIMessage, todos: list[Todo]) -> dict[str, Any]:
|
def _build_attribution(message: AIMessage, todos: list[Todo]) -> dict[str, Any]:
|
||||||
tool_calls = getattr(message, "tool_calls", None) or []
|
tool_calls = getattr(message, "tool_calls", None) or []
|
||||||
actions: list[dict[str, Any]] = []
|
actions: list[dict[str, Any]] = []
|
||||||
@@ -261,8 +272,51 @@ class TokenUsageMiddleware(AgentMiddleware):
|
|||||||
if not messages:
|
if not messages:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
# Annotate subagent token usage onto the AIMessage that dispatched it.
|
||||||
|
# When a task tool completes, its usage is cached by tool_call_id. Detect
|
||||||
|
# the ToolMessage → search backward for the corresponding AIMessage → merge.
|
||||||
|
# Walk backward through consecutive ToolMessages before the new AIMessage
|
||||||
|
# so that multiple concurrent task tool calls all get their subagent tokens
|
||||||
|
# written back to the same dispatch message (merging into one update).
|
||||||
|
state_updates: dict[int, AIMessage] = {}
|
||||||
|
if len(messages) >= 2:
|
||||||
|
from deerflow.tools.builtins.task_tool import pop_cached_subagent_usage
|
||||||
|
|
||||||
|
idx = len(messages) - 2
|
||||||
|
while idx >= 0:
|
||||||
|
tool_msg = messages[idx]
|
||||||
|
if not isinstance(tool_msg, ToolMessage) or not tool_msg.tool_call_id:
|
||||||
|
break
|
||||||
|
|
||||||
|
subagent_usage = pop_cached_subagent_usage(tool_msg.tool_call_id)
|
||||||
|
if subagent_usage:
|
||||||
|
# Search backward from the ToolMessage to find the AIMessage
|
||||||
|
# that dispatched it. A single model response can dispatch
|
||||||
|
# multiple task tool calls, so we can't assume a fixed offset.
|
||||||
|
dispatch_idx = idx - 1
|
||||||
|
while dispatch_idx >= 0:
|
||||||
|
candidate = messages[dispatch_idx]
|
||||||
|
if isinstance(candidate, AIMessage) and _has_tool_call(candidate, tool_msg.tool_call_id):
|
||||||
|
# Accumulate into an existing update for the same
|
||||||
|
# AIMessage (multiple task calls in one response),
|
||||||
|
# or merge fresh from the original message.
|
||||||
|
existing_update = state_updates.get(dispatch_idx)
|
||||||
|
prev = existing_update.usage_metadata if existing_update else (getattr(candidate, "usage_metadata", None) or {})
|
||||||
|
merged = {
|
||||||
|
**prev,
|
||||||
|
"input_tokens": prev.get("input_tokens", 0) + subagent_usage["input_tokens"],
|
||||||
|
"output_tokens": prev.get("output_tokens", 0) + subagent_usage["output_tokens"],
|
||||||
|
"total_tokens": prev.get("total_tokens", 0) + subagent_usage["total_tokens"],
|
||||||
|
}
|
||||||
|
state_updates[dispatch_idx] = candidate.model_copy(update={"usage_metadata": merged})
|
||||||
|
break
|
||||||
|
dispatch_idx -= 1
|
||||||
|
idx -= 1
|
||||||
|
|
||||||
last = messages[-1]
|
last = messages[-1]
|
||||||
if not isinstance(last, AIMessage):
|
if not isinstance(last, AIMessage):
|
||||||
|
if state_updates:
|
||||||
|
return {"messages": [state_updates[idx] for idx in sorted(state_updates)]}
|
||||||
return None
|
return None
|
||||||
|
|
||||||
usage = getattr(last, "usage_metadata", None)
|
usage = getattr(last, "usage_metadata", None)
|
||||||
@@ -288,11 +342,12 @@ class TokenUsageMiddleware(AgentMiddleware):
|
|||||||
additional_kwargs = dict(getattr(last, "additional_kwargs", {}) or {})
|
additional_kwargs = dict(getattr(last, "additional_kwargs", {}) or {})
|
||||||
|
|
||||||
if additional_kwargs.get(TOKEN_USAGE_ATTRIBUTION_KEY) == attribution:
|
if additional_kwargs.get(TOKEN_USAGE_ATTRIBUTION_KEY) == attribution:
|
||||||
return None
|
return {"messages": [state_updates[idx] for idx in sorted(state_updates)]} if state_updates else None
|
||||||
|
|
||||||
additional_kwargs[TOKEN_USAGE_ATTRIBUTION_KEY] = attribution
|
additional_kwargs[TOKEN_USAGE_ATTRIBUTION_KEY] = attribution
|
||||||
updated_msg = last.model_copy(update={"additional_kwargs": additional_kwargs})
|
updated_msg = last.model_copy(update={"additional_kwargs": additional_kwargs})
|
||||||
return {"messages": [updated_msg]}
|
state_updates[len(messages) - 1] = updated_msg
|
||||||
|
return {"messages": [state_updates[idx] for idx in sorted(state_updates)]}
|
||||||
|
|
||||||
@override
|
@override
|
||||||
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
||||||
|
|||||||
@@ -0,0 +1,195 @@
|
|||||||
|
"""Dialect-aware JSON value matching for SQLAlchemy (SQLite + PostgreSQL)."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import re
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from sqlalchemy import BigInteger, Float, String, bindparam
|
||||||
|
from sqlalchemy.ext.compiler import compiles
|
||||||
|
from sqlalchemy.sql.compiler import SQLCompiler
|
||||||
|
from sqlalchemy.sql.expression import ColumnElement
|
||||||
|
from sqlalchemy.sql.visitors import InternalTraversal
|
||||||
|
from sqlalchemy.types import Boolean, TypeEngine
|
||||||
|
|
||||||
|
# Key is interpolated into compiled SQL; restrict charset to prevent injection.
|
||||||
|
_KEY_CHARSET_RE = re.compile(r"^[A-Za-z0-9_\-]+$")
|
||||||
|
|
||||||
|
# Allowed value types for metadata filter values (same set accepted by JsonMatch).
|
||||||
|
ALLOWED_FILTER_VALUE_TYPES: tuple[type, ...] = (type(None), bool, int, float, str)
|
||||||
|
|
||||||
|
# SQLite raises an overflow when binding values outside signed 64-bit range;
|
||||||
|
# PostgreSQL overflows during BIGINT cast. Reject at validation time instead.
|
||||||
|
_INT64_MIN = -(2**63)
|
||||||
|
_INT64_MAX = 2**63 - 1
|
||||||
|
|
||||||
|
|
||||||
|
def validate_metadata_filter_key(key: object) -> bool:
|
||||||
|
"""Return True if *key* is safe for use as a JSON metadata filter key.
|
||||||
|
|
||||||
|
A key is "safe" when it is a string matching ``[A-Za-z0-9_-]+``. The
|
||||||
|
charset is restricted because the key is interpolated into the
|
||||||
|
compiled SQL path expression (``$."<key>"`` / ``->`` literal), so any
|
||||||
|
laxer pattern would open a SQL/JSONPath injection surface.
|
||||||
|
"""
|
||||||
|
return isinstance(key, str) and bool(_KEY_CHARSET_RE.match(key))
|
||||||
|
|
||||||
|
|
||||||
|
def validate_metadata_filter_value(value: object) -> bool:
|
||||||
|
"""Return True if *value* is an allowed type for a JSON metadata filter.
|
||||||
|
|
||||||
|
Matches the set of types ``_build_clause`` knows how to compile into
|
||||||
|
a dialect-portable predicate. Anything else (list/dict/bytes/...) is
|
||||||
|
intentionally rejected rather than silently coerced via ``str()`` —
|
||||||
|
silent coercion would (a) produce wrong matches and (b) break
|
||||||
|
SQLAlchemy's ``inherit_cache`` invariant when ``value`` is unhashable.
|
||||||
|
|
||||||
|
Integer values are additionally restricted to the signed 64-bit range
|
||||||
|
``[-2**63, 2**63 - 1]``: SQLite overflows when binding larger values
|
||||||
|
and PostgreSQL overflows during the ``BIGINT`` cast.
|
||||||
|
"""
|
||||||
|
if not isinstance(value, ALLOWED_FILTER_VALUE_TYPES):
|
||||||
|
return False
|
||||||
|
if isinstance(value, int) and not isinstance(value, bool):
|
||||||
|
if not (_INT64_MIN <= value <= _INT64_MAX):
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
class JsonMatch(ColumnElement):
|
||||||
|
"""Dialect-portable ``column[key] == value`` for JSON columns.
|
||||||
|
|
||||||
|
Compiles to ``json_type``/``json_extract`` on SQLite and
|
||||||
|
``json_typeof``/``->>`` on PostgreSQL, with type-safe comparison
|
||||||
|
that distinguishes bool vs int and NULL vs missing key.
|
||||||
|
|
||||||
|
*key* must be a single literal key matching ``[A-Za-z0-9_-]+``.
|
||||||
|
*value* must be one of: ``None``, ``bool``, ``int`` (signed 64-bit), ``float``, ``str``.
|
||||||
|
"""
|
||||||
|
|
||||||
|
inherit_cache = True
|
||||||
|
type = Boolean()
|
||||||
|
_is_implicitly_boolean = True
|
||||||
|
|
||||||
|
_traverse_internals = [
|
||||||
|
("column", InternalTraversal.dp_clauseelement),
|
||||||
|
("key", InternalTraversal.dp_string),
|
||||||
|
("value", InternalTraversal.dp_plain_obj),
|
||||||
|
]
|
||||||
|
|
||||||
|
def __init__(self, column: ColumnElement, key: str, value: object) -> None:
|
||||||
|
if not validate_metadata_filter_key(key):
|
||||||
|
raise ValueError(f"JsonMatch key must match {_KEY_CHARSET_RE.pattern!r}; got: {key!r}")
|
||||||
|
if not validate_metadata_filter_value(value):
|
||||||
|
if isinstance(value, int) and not isinstance(value, bool):
|
||||||
|
raise TypeError(f"JsonMatch int value out of signed 64-bit range [-2**63, 2**63-1]: {value!r}")
|
||||||
|
raise TypeError(f"JsonMatch value must be None, bool, int, float, or str; got: {type(value).__name__!r}")
|
||||||
|
self.column = column
|
||||||
|
self.key = key
|
||||||
|
self.value = value
|
||||||
|
super().__init__()
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class _Dialect:
|
||||||
|
"""Per-dialect names used when emitting JSON type/value comparisons."""
|
||||||
|
|
||||||
|
null_type: str
|
||||||
|
num_types: tuple[str, ...]
|
||||||
|
num_cast: str
|
||||||
|
int_types: tuple[str, ...]
|
||||||
|
int_cast: str
|
||||||
|
# None for SQLite where json_type already returns 'integer'/'real';
|
||||||
|
# regex literal for PostgreSQL where json_typeof returns 'number' for
|
||||||
|
# both ints and floats, so an extra guard prevents CAST errors on floats.
|
||||||
|
int_guard: str | None
|
||||||
|
string_type: str
|
||||||
|
bool_type: str | None
|
||||||
|
|
||||||
|
|
||||||
|
_SQLITE = _Dialect(
|
||||||
|
null_type="null",
|
||||||
|
num_types=("integer", "real"),
|
||||||
|
num_cast="REAL",
|
||||||
|
int_types=("integer",),
|
||||||
|
int_cast="INTEGER",
|
||||||
|
int_guard=None,
|
||||||
|
string_type="text",
|
||||||
|
bool_type=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
_PG = _Dialect(
|
||||||
|
null_type="null",
|
||||||
|
num_types=("number",),
|
||||||
|
num_cast="DOUBLE PRECISION",
|
||||||
|
int_types=("number",),
|
||||||
|
int_cast="BIGINT",
|
||||||
|
int_guard="'^-?[0-9]+$'",
|
||||||
|
string_type="string",
|
||||||
|
bool_type="boolean",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _bind(compiler: SQLCompiler, value: object, sa_type: TypeEngine[Any], **kw: Any) -> str:
|
||||||
|
param = bindparam(None, value, type_=sa_type)
|
||||||
|
return compiler.process(param, **kw)
|
||||||
|
|
||||||
|
|
||||||
|
def _type_check(typeof: str, types: tuple[str, ...]) -> str:
|
||||||
|
if len(types) == 1:
|
||||||
|
return f"{typeof} = '{types[0]}'"
|
||||||
|
quoted = ", ".join(f"'{t}'" for t in types)
|
||||||
|
return f"{typeof} IN ({quoted})"
|
||||||
|
|
||||||
|
|
||||||
|
def _build_clause(compiler: SQLCompiler, typeof: str, extract: str, value: object, dialect: _Dialect, **kw: Any) -> str:
|
||||||
|
if value is None:
|
||||||
|
return f"{typeof} = '{dialect.null_type}'"
|
||||||
|
if isinstance(value, bool):
|
||||||
|
# bool check must precede int check — bool is a subclass of int in Python
|
||||||
|
bool_str = "true" if value else "false"
|
||||||
|
if dialect.bool_type is None:
|
||||||
|
return f"{typeof} = '{bool_str}'"
|
||||||
|
return f"({typeof} = '{dialect.bool_type}' AND {extract} = '{bool_str}')"
|
||||||
|
if isinstance(value, int):
|
||||||
|
bp = _bind(compiler, value, BigInteger(), **kw)
|
||||||
|
if dialect.int_guard:
|
||||||
|
# CASE prevents CAST error when json_typeof = 'number' also matches floats
|
||||||
|
return f"(CASE WHEN {_type_check(typeof, dialect.int_types)} AND {extract} ~ {dialect.int_guard} THEN CAST({extract} AS {dialect.int_cast}) END = {bp})"
|
||||||
|
return f"({_type_check(typeof, dialect.int_types)} AND CAST({extract} AS {dialect.int_cast}) = {bp})"
|
||||||
|
if isinstance(value, float):
|
||||||
|
bp = _bind(compiler, value, Float(), **kw)
|
||||||
|
return f"({_type_check(typeof, dialect.num_types)} AND CAST({extract} AS {dialect.num_cast}) = {bp})"
|
||||||
|
bp = _bind(compiler, str(value), String(), **kw)
|
||||||
|
return f"({typeof} = '{dialect.string_type}' AND {extract} = {bp})"
|
||||||
|
|
||||||
|
|
||||||
|
@compiles(JsonMatch, "sqlite")
|
||||||
|
def _compile_sqlite(element: JsonMatch, compiler: SQLCompiler, **kw: Any) -> str:
|
||||||
|
if not validate_metadata_filter_key(element.key):
|
||||||
|
raise ValueError(f"Key escaped validation: {element.key!r}")
|
||||||
|
col = compiler.process(element.column, **kw)
|
||||||
|
path = f'$."{element.key}"'
|
||||||
|
typeof = f"json_type({col}, '{path}')"
|
||||||
|
extract = f"json_extract({col}, '{path}')"
|
||||||
|
return _build_clause(compiler, typeof, extract, element.value, _SQLITE, **kw)
|
||||||
|
|
||||||
|
|
||||||
|
@compiles(JsonMatch, "postgresql")
|
||||||
|
def _compile_pg(element: JsonMatch, compiler: SQLCompiler, **kw: Any) -> str:
|
||||||
|
if not validate_metadata_filter_key(element.key):
|
||||||
|
raise ValueError(f"Key escaped validation: {element.key!r}")
|
||||||
|
col = compiler.process(element.column, **kw)
|
||||||
|
typeof = f"json_typeof({col} -> '{element.key}')"
|
||||||
|
extract = f"({col} ->> '{element.key}')"
|
||||||
|
return _build_clause(compiler, typeof, extract, element.value, _PG, **kw)
|
||||||
|
|
||||||
|
|
||||||
|
@compiles(JsonMatch)
|
||||||
|
def _compile_default(element: JsonMatch, compiler: SQLCompiler, **kw: Any) -> str:
|
||||||
|
raise NotImplementedError(f"JsonMatch supports only sqlite and postgresql; got dialect: {compiler.dialect.name}")
|
||||||
|
|
||||||
|
|
||||||
|
def json_match(column: ColumnElement, key: str, value: object) -> JsonMatch:
|
||||||
|
return JsonMatch(column, key, value)
|
||||||
@@ -223,10 +223,11 @@ class RunRepository(RunStore):
|
|||||||
"""Aggregate token usage via a single SQL GROUP BY query."""
|
"""Aggregate token usage via a single SQL GROUP BY query."""
|
||||||
_completed = RunRow.status.in_(("success", "error"))
|
_completed = RunRow.status.in_(("success", "error"))
|
||||||
_thread = RunRow.thread_id == thread_id
|
_thread = RunRow.thread_id == thread_id
|
||||||
|
model_name = func.coalesce(RunRow.model_name, "unknown")
|
||||||
|
|
||||||
stmt = (
|
stmt = (
|
||||||
select(
|
select(
|
||||||
func.coalesce(RunRow.model_name, "unknown").label("model"),
|
model_name.label("model"),
|
||||||
func.count().label("runs"),
|
func.count().label("runs"),
|
||||||
func.coalesce(func.sum(RunRow.total_tokens), 0).label("total_tokens"),
|
func.coalesce(func.sum(RunRow.total_tokens), 0).label("total_tokens"),
|
||||||
func.coalesce(func.sum(RunRow.total_input_tokens), 0).label("total_input_tokens"),
|
func.coalesce(func.sum(RunRow.total_input_tokens), 0).label("total_input_tokens"),
|
||||||
@@ -236,7 +237,7 @@ class RunRepository(RunStore):
|
|||||||
func.coalesce(func.sum(RunRow.middleware_tokens), 0).label("middleware"),
|
func.coalesce(func.sum(RunRow.middleware_tokens), 0).label("middleware"),
|
||||||
)
|
)
|
||||||
.where(_thread, _completed)
|
.where(_thread, _completed)
|
||||||
.group_by(func.coalesce(RunRow.model_name, "unknown"))
|
.group_by(model_name)
|
||||||
)
|
)
|
||||||
|
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ from __future__ import annotations
|
|||||||
|
|
||||||
from typing import TYPE_CHECKING
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
from deerflow.persistence.thread_meta.base import InvalidMetadataFilterError, ThreadMetaStore
|
||||||
from deerflow.persistence.thread_meta.memory import MemoryThreadMetaStore
|
from deerflow.persistence.thread_meta.memory import MemoryThreadMetaStore
|
||||||
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
||||||
from deerflow.persistence.thread_meta.sql import ThreadMetaRepository
|
from deerflow.persistence.thread_meta.sql import ThreadMetaRepository
|
||||||
@@ -14,6 +14,7 @@ if TYPE_CHECKING:
|
|||||||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
|
"InvalidMetadataFilterError",
|
||||||
"MemoryThreadMetaStore",
|
"MemoryThreadMetaStore",
|
||||||
"ThreadMetaRepository",
|
"ThreadMetaRepository",
|
||||||
"ThreadMetaRow",
|
"ThreadMetaRow",
|
||||||
|
|||||||
@@ -15,10 +15,15 @@ three-state semantics (see :mod:`deerflow.runtime.user_context`):
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import abc
|
import abc
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel
|
||||||
|
|
||||||
|
|
||||||
|
class InvalidMetadataFilterError(ValueError):
|
||||||
|
"""Raised when all client-supplied metadata filter keys are rejected."""
|
||||||
|
|
||||||
|
|
||||||
class ThreadMetaStore(abc.ABC):
|
class ThreadMetaStore(abc.ABC):
|
||||||
@abc.abstractmethod
|
@abc.abstractmethod
|
||||||
async def create(
|
async def create(
|
||||||
@@ -40,12 +45,12 @@ class ThreadMetaStore(abc.ABC):
|
|||||||
async def search(
|
async def search(
|
||||||
self,
|
self,
|
||||||
*,
|
*,
|
||||||
metadata: dict | None = None,
|
metadata: dict[str, Any] | None = None,
|
||||||
status: str | None = None,
|
status: str | None = None,
|
||||||
limit: int = 100,
|
limit: int = 100,
|
||||||
offset: int = 0,
|
offset: int = 0,
|
||||||
user_id: str | None | _AutoSentinel = AUTO,
|
user_id: str | None | _AutoSentinel = AUTO,
|
||||||
) -> list[dict]:
|
) -> list[dict[str, Any]]:
|
||||||
pass
|
pass
|
||||||
|
|
||||||
@abc.abstractmethod
|
@abc.abstractmethod
|
||||||
|
|||||||
@@ -69,12 +69,12 @@ class MemoryThreadMetaStore(ThreadMetaStore):
|
|||||||
async def search(
|
async def search(
|
||||||
self,
|
self,
|
||||||
*,
|
*,
|
||||||
metadata: dict | None = None,
|
metadata: dict[str, Any] | None = None,
|
||||||
status: str | None = None,
|
status: str | None = None,
|
||||||
limit: int = 100,
|
limit: int = 100,
|
||||||
offset: int = 0,
|
offset: int = 0,
|
||||||
user_id: str | None | _AutoSentinel = AUTO,
|
user_id: str | None | _AutoSentinel = AUTO,
|
||||||
) -> list[dict]:
|
) -> list[dict[str, Any]]:
|
||||||
resolved_user_id = resolve_user_id(user_id, method_name="MemoryThreadMetaStore.search")
|
resolved_user_id = resolve_user_id(user_id, method_name="MemoryThreadMetaStore.search")
|
||||||
filter_dict: dict[str, Any] = {}
|
filter_dict: dict[str, Any] = {}
|
||||||
if metadata:
|
if metadata:
|
||||||
|
|||||||
@@ -2,16 +2,20 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
from datetime import UTC, datetime
|
from datetime import UTC, datetime
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from sqlalchemy import select, update
|
from sqlalchemy import select, update
|
||||||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
||||||
|
|
||||||
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
from deerflow.persistence.thread_meta.base import InvalidMetadataFilterError, ThreadMetaStore
|
||||||
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class ThreadMetaRepository(ThreadMetaStore):
|
class ThreadMetaRepository(ThreadMetaStore):
|
||||||
def __init__(self, session_factory: async_sessionmaker[AsyncSession]) -> None:
|
def __init__(self, session_factory: async_sessionmaker[AsyncSession]) -> None:
|
||||||
@@ -20,7 +24,7 @@ class ThreadMetaRepository(ThreadMetaStore):
|
|||||||
@staticmethod
|
@staticmethod
|
||||||
def _row_to_dict(row: ThreadMetaRow) -> dict[str, Any]:
|
def _row_to_dict(row: ThreadMetaRow) -> dict[str, Any]:
|
||||||
d = row.to_dict()
|
d = row.to_dict()
|
||||||
d["metadata"] = d.pop("metadata_json", {})
|
d["metadata"] = d.pop("metadata_json", None) or {}
|
||||||
for key in ("created_at", "updated_at"):
|
for key in ("created_at", "updated_at"):
|
||||||
val = d.get(key)
|
val = d.get(key)
|
||||||
if isinstance(val, datetime):
|
if isinstance(val, datetime):
|
||||||
@@ -104,39 +108,43 @@ class ThreadMetaRepository(ThreadMetaStore):
|
|||||||
async def search(
|
async def search(
|
||||||
self,
|
self,
|
||||||
*,
|
*,
|
||||||
metadata: dict | None = None,
|
metadata: dict[str, Any] | None = None,
|
||||||
status: str | None = None,
|
status: str | None = None,
|
||||||
limit: int = 100,
|
limit: int = 100,
|
||||||
offset: int = 0,
|
offset: int = 0,
|
||||||
user_id: str | None | _AutoSentinel = AUTO,
|
user_id: str | None | _AutoSentinel = AUTO,
|
||||||
) -> list[dict]:
|
) -> list[dict[str, Any]]:
|
||||||
"""Search threads with optional metadata and status filters.
|
"""Search threads with optional metadata and status filters.
|
||||||
|
|
||||||
Owner filter is enforced by default: caller must be in a user
|
Owner filter is enforced by default: caller must be in a user
|
||||||
context. Pass ``user_id=None`` to bypass (migration/CLI).
|
context. Pass ``user_id=None`` to bypass (migration/CLI).
|
||||||
"""
|
"""
|
||||||
resolved_user_id = resolve_user_id(user_id, method_name="ThreadMetaRepository.search")
|
resolved_user_id = resolve_user_id(user_id, method_name="ThreadMetaRepository.search")
|
||||||
stmt = select(ThreadMetaRow).order_by(ThreadMetaRow.updated_at.desc())
|
stmt = select(ThreadMetaRow).order_by(ThreadMetaRow.updated_at.desc(), ThreadMetaRow.thread_id.desc())
|
||||||
if resolved_user_id is not None:
|
if resolved_user_id is not None:
|
||||||
stmt = stmt.where(ThreadMetaRow.user_id == resolved_user_id)
|
stmt = stmt.where(ThreadMetaRow.user_id == resolved_user_id)
|
||||||
if status:
|
if status:
|
||||||
stmt = stmt.where(ThreadMetaRow.status == status)
|
stmt = stmt.where(ThreadMetaRow.status == status)
|
||||||
|
|
||||||
if metadata:
|
if metadata:
|
||||||
# When metadata filter is active, fetch a larger window and filter
|
applied = 0
|
||||||
# in Python. TODO(Phase 2): use JSON DB operators (Postgres @>,
|
for key, value in metadata.items():
|
||||||
# SQLite json_extract) for server-side filtering.
|
try:
|
||||||
stmt = stmt.limit(limit * 5 + offset)
|
stmt = stmt.where(json_match(ThreadMetaRow.metadata_json, key, value))
|
||||||
async with self._sf() as session:
|
applied += 1
|
||||||
result = await session.execute(stmt)
|
except (ValueError, TypeError) as exc:
|
||||||
rows = [self._row_to_dict(r) for r in result.scalars()]
|
logger.warning("Skipping metadata filter key %s: %s", ascii(key), exc)
|
||||||
rows = [r for r in rows if all(r.get("metadata", {}).get(k) == v for k, v in metadata.items())]
|
if applied == 0:
|
||||||
return rows[offset : offset + limit]
|
# Comma-separated plain string (no list repr / nested
|
||||||
else:
|
# quoting) so the 400 detail surfaced by the Gateway is
|
||||||
stmt = stmt.limit(limit).offset(offset)
|
# easy for clients to read. Sorted for determinism.
|
||||||
async with self._sf() as session:
|
rejected_keys = ", ".join(sorted(str(k) for k in metadata))
|
||||||
result = await session.execute(stmt)
|
raise InvalidMetadataFilterError(f"All metadata filter keys were rejected as unsafe: {rejected_keys}")
|
||||||
return [self._row_to_dict(r) for r in result.scalars()]
|
|
||||||
|
stmt = stmt.limit(limit).offset(offset)
|
||||||
|
async with self._sf() as session:
|
||||||
|
result = await session.execute(stmt)
|
||||||
|
return [self._row_to_dict(r) for r in result.scalars()]
|
||||||
|
|
||||||
async def _check_ownership(self, session: AsyncSession, thread_id: str, resolved_user_id: str | None) -> bool:
|
async def _check_ownership(self, session: AsyncSession, thread_id: str, resolved_user_id: str | None) -> bool:
|
||||||
"""Return True if the row exists and is owned (or filter bypassed)."""
|
"""Return True if the row exists and is owned (or filter bypassed)."""
|
||||||
|
|||||||
@@ -11,7 +11,7 @@ import logging
|
|||||||
from datetime import UTC, datetime
|
from datetime import UTC, datetime
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from sqlalchemy import delete, func, select
|
from sqlalchemy import delete, func, select, text
|
||||||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
||||||
|
|
||||||
from deerflow.persistence.models.run_event import RunEventRow
|
from deerflow.persistence.models.run_event import RunEventRow
|
||||||
@@ -86,6 +86,28 @@ class DbRunEventStore(RunEventStore):
|
|||||||
user = get_current_user()
|
user = get_current_user()
|
||||||
return str(user.id) if user is not None else None
|
return str(user.id) if user is not None else None
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
async def _max_seq_for_thread(session: AsyncSession, thread_id: str) -> int | None:
|
||||||
|
"""Return the current max seq while serializing writers per thread.
|
||||||
|
|
||||||
|
PostgreSQL rejects ``SELECT max(...) FOR UPDATE`` because aggregate
|
||||||
|
results are not lockable rows. As a release-safe workaround, take a
|
||||||
|
transaction-level advisory lock keyed by thread_id before reading the
|
||||||
|
aggregate. Other dialects keep the existing row-locking statement.
|
||||||
|
"""
|
||||||
|
stmt = select(func.max(RunEventRow.seq)).where(RunEventRow.thread_id == thread_id)
|
||||||
|
bind = session.get_bind()
|
||||||
|
dialect_name = bind.dialect.name if bind is not None else ""
|
||||||
|
|
||||||
|
if dialect_name == "postgresql":
|
||||||
|
await session.execute(
|
||||||
|
text("SELECT pg_advisory_xact_lock(hashtext(CAST(:thread_id AS text))::bigint)"),
|
||||||
|
{"thread_id": thread_id},
|
||||||
|
)
|
||||||
|
return await session.scalar(stmt)
|
||||||
|
|
||||||
|
return await session.scalar(stmt.with_for_update())
|
||||||
|
|
||||||
async def put(self, *, thread_id, run_id, event_type, category, content="", metadata=None, created_at=None): # noqa: D401
|
async def put(self, *, thread_id, run_id, event_type, category, content="", metadata=None, created_at=None): # noqa: D401
|
||||||
"""Write a single event — low-frequency path only.
|
"""Write a single event — low-frequency path only.
|
||||||
|
|
||||||
@@ -100,10 +122,7 @@ class DbRunEventStore(RunEventStore):
|
|||||||
user_id = self._user_id_from_context()
|
user_id = self._user_id_from_context()
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
async with session.begin():
|
async with session.begin():
|
||||||
# Use FOR UPDATE to serialize seq assignment within a thread.
|
max_seq = await self._max_seq_for_thread(session, thread_id)
|
||||||
# NOTE: with_for_update() on aggregates is a no-op on SQLite;
|
|
||||||
# the UNIQUE(thread_id, seq) constraint catches races there.
|
|
||||||
max_seq = await session.scalar(select(func.max(RunEventRow.seq)).where(RunEventRow.thread_id == thread_id).with_for_update())
|
|
||||||
seq = (max_seq or 0) + 1
|
seq = (max_seq or 0) + 1
|
||||||
row = RunEventRow(
|
row = RunEventRow(
|
||||||
thread_id=thread_id,
|
thread_id=thread_id,
|
||||||
@@ -126,10 +145,8 @@ class DbRunEventStore(RunEventStore):
|
|||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
async with session.begin():
|
async with session.begin():
|
||||||
# Get max seq for the thread (assume all events in batch belong to same thread).
|
# Get max seq for the thread (assume all events in batch belong to same thread).
|
||||||
# NOTE: with_for_update() on aggregates is a no-op on SQLite;
|
|
||||||
# the UNIQUE(thread_id, seq) constraint catches races there.
|
|
||||||
thread_id = events[0]["thread_id"]
|
thread_id = events[0]["thread_id"]
|
||||||
max_seq = await session.scalar(select(func.max(RunEventRow.seq)).where(RunEventRow.thread_id == thread_id).with_for_update())
|
max_seq = await self._max_seq_for_thread(session, thread_id)
|
||||||
seq = max_seq or 0
|
seq = max_seq or 0
|
||||||
rows = []
|
rows = []
|
||||||
for e in events:
|
for e in events:
|
||||||
|
|||||||
@@ -109,6 +109,34 @@ def get_effective_user_id() -> str:
|
|||||||
return str(user.id)
|
return str(user.id)
|
||||||
|
|
||||||
|
|
||||||
|
def resolve_runtime_user_id(runtime: object | None) -> str:
|
||||||
|
"""Single source of truth for a tool/middleware's effective user_id.
|
||||||
|
|
||||||
|
Resolution order (most authoritative first):
|
||||||
|
1. ``runtime.context["user_id"]`` — set by ``inject_authenticated_user_context``
|
||||||
|
in the gateway from the auth-validated ``request.state.user``. This is
|
||||||
|
the only source that survives boundaries where the contextvar may have
|
||||||
|
been lost (background tasks scheduled outside the request task,
|
||||||
|
worker pools that don't copy_context, future cross-process drivers).
|
||||||
|
2. The ``_current_user`` ContextVar — set by the auth middleware at
|
||||||
|
request entry. Reliable for in-task work; copied by ``asyncio``
|
||||||
|
child tasks and by ``ContextThreadPoolExecutor``.
|
||||||
|
3. ``DEFAULT_USER_ID`` — last-resort fallback so unauthenticated
|
||||||
|
CLI / migration / test paths keep working without raising.
|
||||||
|
|
||||||
|
Tools that persist user-scoped state (custom agents, memory, uploads)
|
||||||
|
MUST call this instead of ``get_effective_user_id()`` directly so they
|
||||||
|
benefit from the runtime.context channel that ``setup_agent`` already
|
||||||
|
relies on.
|
||||||
|
"""
|
||||||
|
context = getattr(runtime, "context", None)
|
||||||
|
if isinstance(context, dict):
|
||||||
|
ctx_user_id = context.get("user_id")
|
||||||
|
if ctx_user_id:
|
||||||
|
return str(ctx_user_id)
|
||||||
|
return get_effective_user_id()
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Sentinel-based user_id resolution
|
# Sentinel-based user_id resolution
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -7,19 +7,12 @@ from langgraph.types import Command
|
|||||||
|
|
||||||
from deerflow.config.agents_config import validate_agent_name
|
from deerflow.config.agents_config import validate_agent_name
|
||||||
from deerflow.config.paths import get_paths
|
from deerflow.config.paths import get_paths
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import resolve_runtime_user_id
|
||||||
from deerflow.tools.types import Runtime
|
from deerflow.tools.types import Runtime
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
def _get_runtime_user_id(runtime: Runtime) -> str:
|
|
||||||
context_user_id = runtime.context.get("user_id") if runtime.context else None
|
|
||||||
if context_user_id:
|
|
||||||
return str(context_user_id)
|
|
||||||
return get_effective_user_id()
|
|
||||||
|
|
||||||
|
|
||||||
@tool(parse_docstring=True)
|
@tool(parse_docstring=True)
|
||||||
def setup_agent(
|
def setup_agent(
|
||||||
soul: str,
|
soul: str,
|
||||||
@@ -45,7 +38,7 @@ def setup_agent(
|
|||||||
if agent_name:
|
if agent_name:
|
||||||
# Custom agents are persisted under the current user's bucket so
|
# Custom agents are persisted under the current user's bucket so
|
||||||
# different users do not see each other's agents.
|
# different users do not see each other's agents.
|
||||||
user_id = _get_runtime_user_id(runtime)
|
user_id = resolve_runtime_user_id(runtime)
|
||||||
agent_dir = paths.user_agent_dir(user_id, agent_name)
|
agent_dir = paths.user_agent_dir(user_id, agent_name)
|
||||||
else:
|
else:
|
||||||
# Default agent (no agent_name): SOUL.md lives at the global base dir.
|
# Default agent (no agent_name): SOUL.md lives at the global base dir.
|
||||||
|
|||||||
@@ -26,6 +26,28 @@ if TYPE_CHECKING:
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Cache subagent token usage by tool_call_id so TokenUsageMiddleware can
|
||||||
|
# write it back to the triggering AIMessage's usage_metadata.
|
||||||
|
_subagent_usage_cache: dict[str, dict[str, int]] = {}
|
||||||
|
|
||||||
|
|
||||||
|
def _token_usage_cache_enabled(app_config: "AppConfig | None") -> bool:
|
||||||
|
if app_config is None:
|
||||||
|
try:
|
||||||
|
app_config = get_app_config()
|
||||||
|
except FileNotFoundError:
|
||||||
|
return False
|
||||||
|
return bool(getattr(getattr(app_config, "token_usage", None), "enabled", False))
|
||||||
|
|
||||||
|
|
||||||
|
def _cache_subagent_usage(tool_call_id: str, usage: dict | None, *, enabled: bool = True) -> None:
|
||||||
|
if enabled and usage:
|
||||||
|
_subagent_usage_cache[tool_call_id] = usage
|
||||||
|
|
||||||
|
|
||||||
|
def pop_cached_subagent_usage(tool_call_id: str) -> dict | None:
|
||||||
|
return _subagent_usage_cache.pop(tool_call_id, None)
|
||||||
|
|
||||||
|
|
||||||
def _is_subagent_terminal(result: Any) -> bool:
|
def _is_subagent_terminal(result: Any) -> bool:
|
||||||
"""Return whether a background subagent result is safe to clean up."""
|
"""Return whether a background subagent result is safe to clean up."""
|
||||||
@@ -92,6 +114,17 @@ def _find_usage_recorder(runtime: Any) -> Any | None:
|
|||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _summarize_usage(records: list[dict] | None) -> dict | None:
|
||||||
|
"""Summarize token usage records into a compact dict for SSE events."""
|
||||||
|
if not records:
|
||||||
|
return None
|
||||||
|
return {
|
||||||
|
"input_tokens": sum(r.get("input_tokens", 0) or 0 for r in records),
|
||||||
|
"output_tokens": sum(r.get("output_tokens", 0) or 0 for r in records),
|
||||||
|
"total_tokens": sum(r.get("total_tokens", 0) or 0 for r in records),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def _report_subagent_usage(runtime: Any, result: Any) -> None:
|
def _report_subagent_usage(runtime: Any, result: Any) -> None:
|
||||||
"""Report subagent token usage to the parent RunJournal, if available.
|
"""Report subagent token usage to the parent RunJournal, if available.
|
||||||
|
|
||||||
@@ -177,6 +210,7 @@ async def task_tool(
|
|||||||
subagent_type: The type of subagent to use. ALWAYS PROVIDE THIS PARAMETER THIRD.
|
subagent_type: The type of subagent to use. ALWAYS PROVIDE THIS PARAMETER THIRD.
|
||||||
"""
|
"""
|
||||||
runtime_app_config = _get_runtime_app_config(runtime)
|
runtime_app_config = _get_runtime_app_config(runtime)
|
||||||
|
cache_token_usage = _token_usage_cache_enabled(runtime_app_config)
|
||||||
available_subagent_names = get_available_subagent_names(app_config=runtime_app_config) if runtime_app_config is not None else get_available_subagent_names()
|
available_subagent_names = get_available_subagent_names(app_config=runtime_app_config) if runtime_app_config is not None else get_available_subagent_names()
|
||||||
|
|
||||||
# Get subagent configuration
|
# Get subagent configuration
|
||||||
@@ -312,27 +346,32 @@ async def task_tool(
|
|||||||
last_message_count = current_message_count
|
last_message_count = current_message_count
|
||||||
|
|
||||||
# Check if task completed, failed, or timed out
|
# Check if task completed, failed, or timed out
|
||||||
|
usage = _summarize_usage(getattr(result, "token_usage_records", None))
|
||||||
if result.status == SubagentStatus.COMPLETED:
|
if result.status == SubagentStatus.COMPLETED:
|
||||||
|
_cache_subagent_usage(tool_call_id, usage, enabled=cache_token_usage)
|
||||||
_report_subagent_usage(runtime, result)
|
_report_subagent_usage(runtime, result)
|
||||||
writer({"type": "task_completed", "task_id": task_id, "result": result.result})
|
writer({"type": "task_completed", "task_id": task_id, "result": result.result, "usage": usage})
|
||||||
logger.info(f"[trace={trace_id}] Task {task_id} completed after {poll_count} polls")
|
logger.info(f"[trace={trace_id}] Task {task_id} completed after {poll_count} polls")
|
||||||
cleanup_background_task(task_id)
|
cleanup_background_task(task_id)
|
||||||
return f"Task Succeeded. Result: {result.result}"
|
return f"Task Succeeded. Result: {result.result}"
|
||||||
elif result.status == SubagentStatus.FAILED:
|
elif result.status == SubagentStatus.FAILED:
|
||||||
|
_cache_subagent_usage(tool_call_id, usage, enabled=cache_token_usage)
|
||||||
_report_subagent_usage(runtime, result)
|
_report_subagent_usage(runtime, result)
|
||||||
writer({"type": "task_failed", "task_id": task_id, "error": result.error})
|
writer({"type": "task_failed", "task_id": task_id, "error": result.error, "usage": usage})
|
||||||
logger.error(f"[trace={trace_id}] Task {task_id} failed: {result.error}")
|
logger.error(f"[trace={trace_id}] Task {task_id} failed: {result.error}")
|
||||||
cleanup_background_task(task_id)
|
cleanup_background_task(task_id)
|
||||||
return f"Task failed. Error: {result.error}"
|
return f"Task failed. Error: {result.error}"
|
||||||
elif result.status == SubagentStatus.CANCELLED:
|
elif result.status == SubagentStatus.CANCELLED:
|
||||||
|
_cache_subagent_usage(tool_call_id, usage, enabled=cache_token_usage)
|
||||||
_report_subagent_usage(runtime, result)
|
_report_subagent_usage(runtime, result)
|
||||||
writer({"type": "task_cancelled", "task_id": task_id, "error": result.error})
|
writer({"type": "task_cancelled", "task_id": task_id, "error": result.error, "usage": usage})
|
||||||
logger.info(f"[trace={trace_id}] Task {task_id} cancelled: {result.error}")
|
logger.info(f"[trace={trace_id}] Task {task_id} cancelled: {result.error}")
|
||||||
cleanup_background_task(task_id)
|
cleanup_background_task(task_id)
|
||||||
return "Task cancelled by user."
|
return "Task cancelled by user."
|
||||||
elif result.status == SubagentStatus.TIMED_OUT:
|
elif result.status == SubagentStatus.TIMED_OUT:
|
||||||
|
_cache_subagent_usage(tool_call_id, usage, enabled=cache_token_usage)
|
||||||
_report_subagent_usage(runtime, result)
|
_report_subagent_usage(runtime, result)
|
||||||
writer({"type": "task_timed_out", "task_id": task_id, "error": result.error})
|
writer({"type": "task_timed_out", "task_id": task_id, "error": result.error, "usage": usage})
|
||||||
logger.warning(f"[trace={trace_id}] Task {task_id} timed out: {result.error}")
|
logger.warning(f"[trace={trace_id}] Task {task_id} timed out: {result.error}")
|
||||||
cleanup_background_task(task_id)
|
cleanup_background_task(task_id)
|
||||||
return f"Task timed out. Error: {result.error}"
|
return f"Task timed out. Error: {result.error}"
|
||||||
@@ -351,7 +390,9 @@ async def task_tool(
|
|||||||
timeout_minutes = config.timeout_seconds // 60
|
timeout_minutes = config.timeout_seconds // 60
|
||||||
logger.error(f"[trace={trace_id}] Task {task_id} polling timed out after {poll_count} polls (should have been caught by thread pool timeout)")
|
logger.error(f"[trace={trace_id}] Task {task_id} polling timed out after {poll_count} polls (should have been caught by thread pool timeout)")
|
||||||
_report_subagent_usage(runtime, result)
|
_report_subagent_usage(runtime, result)
|
||||||
writer({"type": "task_timed_out", "task_id": task_id})
|
usage = _summarize_usage(getattr(result, "token_usage_records", None))
|
||||||
|
_cache_subagent_usage(tool_call_id, usage, enabled=cache_token_usage)
|
||||||
|
writer({"type": "task_timed_out", "task_id": task_id, "usage": usage})
|
||||||
return f"Task polling timed out after {timeout_minutes} minutes. This may indicate the background task is stuck. Status: {result.status.value}"
|
return f"Task polling timed out after {timeout_minutes} minutes. This may indicate the background task is stuck. Status: {result.status.value}"
|
||||||
except asyncio.CancelledError:
|
except asyncio.CancelledError:
|
||||||
# Signal the background subagent thread to stop cooperatively.
|
# Signal the background subagent thread to stop cooperatively.
|
||||||
@@ -374,4 +415,8 @@ async def task_tool(
|
|||||||
cleanup_background_task(task_id)
|
cleanup_background_task(task_id)
|
||||||
else:
|
else:
|
||||||
_schedule_deferred_subagent_cleanup(task_id, trace_id, max_poll_count)
|
_schedule_deferred_subagent_cleanup(task_id, trace_id, max_poll_count)
|
||||||
|
_subagent_usage_cache.pop(tool_call_id, None)
|
||||||
|
raise
|
||||||
|
except Exception:
|
||||||
|
_subagent_usage_cache.pop(tool_call_id, None)
|
||||||
raise
|
raise
|
||||||
|
|||||||
@@ -27,7 +27,7 @@ from langgraph.types import Command
|
|||||||
from deerflow.config.agents_config import load_agent_config, validate_agent_name
|
from deerflow.config.agents_config import load_agent_config, validate_agent_name
|
||||||
from deerflow.config.app_config import get_app_config
|
from deerflow.config.app_config import get_app_config
|
||||||
from deerflow.config.paths import get_paths
|
from deerflow.config.paths import get_paths
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import resolve_runtime_user_id
|
||||||
from deerflow.tools.types import Runtime
|
from deerflow.tools.types import Runtime
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -118,9 +118,13 @@ def update_agent(
|
|||||||
return _err("update_agent is only available inside a custom agent's chat. There is no agent_name in the current runtime context, so there is nothing to update. If you are inside the bootstrap flow, use setup_agent instead.")
|
return _err("update_agent is only available inside a custom agent's chat. There is no agent_name in the current runtime context, so there is nothing to update. If you are inside the bootstrap flow, use setup_agent instead.")
|
||||||
|
|
||||||
# Resolve the active user so that updates only affect this user's agent.
|
# Resolve the active user so that updates only affect this user's agent.
|
||||||
# ``get_effective_user_id`` returns DEFAULT_USER_ID when no auth context
|
# ``resolve_runtime_user_id`` prefers ``runtime.context["user_id"]`` (set by
|
||||||
# is set (matching how memory and thread storage behave).
|
# the gateway from the auth-validated request) and falls back to the
|
||||||
user_id = get_effective_user_id()
|
# contextvar, then DEFAULT_USER_ID. This matches setup_agent so a user
|
||||||
|
# creating an agent and later refining it always touches the same files,
|
||||||
|
# even if the contextvar gets lost across an async/thread boundary
|
||||||
|
# (issue #2782 / #2862 class of bugs).
|
||||||
|
user_id = resolve_runtime_user_id(runtime)
|
||||||
|
|
||||||
# Reject an unknown ``model`` *before* touching the filesystem. Otherwise
|
# Reject an unknown ``model`` *before* touching the filesystem. Otherwise
|
||||||
# ``_resolve_model_name`` silently falls back to the default at runtime
|
# ``_resolve_model_name`` silently falls back to the default at runtime
|
||||||
|
|||||||
@@ -7,7 +7,7 @@ from deerflow.config.app_config import AppConfig
|
|||||||
from deerflow.reflection import resolve_variable
|
from deerflow.reflection import resolve_variable
|
||||||
from deerflow.sandbox.security import is_host_bash_allowed
|
from deerflow.sandbox.security import is_host_bash_allowed
|
||||||
from deerflow.tools.builtins import ask_clarification_tool, present_file_tool, task_tool, view_image_tool
|
from deerflow.tools.builtins import ask_clarification_tool, present_file_tool, task_tool, view_image_tool
|
||||||
from deerflow.tools.builtins.tool_search import reset_deferred_registry
|
from deerflow.tools.builtins.tool_search import get_deferred_registry
|
||||||
from deerflow.tools.sync import make_sync_tool_wrapper
|
from deerflow.tools.sync import make_sync_tool_wrapper
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -116,8 +116,6 @@ def get_available_tools(
|
|||||||
# made through the Gateway API (which runs in a separate process) are immediately
|
# made through the Gateway API (which runs in a separate process) are immediately
|
||||||
# reflected when loading MCP tools.
|
# reflected when loading MCP tools.
|
||||||
mcp_tools = []
|
mcp_tools = []
|
||||||
# Reset deferred registry upfront to prevent stale state from previous calls
|
|
||||||
reset_deferred_registry()
|
|
||||||
if include_mcp:
|
if include_mcp:
|
||||||
try:
|
try:
|
||||||
from deerflow.config.extensions_config import ExtensionsConfig
|
from deerflow.config.extensions_config import ExtensionsConfig
|
||||||
@@ -135,12 +133,51 @@ def get_available_tools(
|
|||||||
from deerflow.tools.builtins.tool_search import DeferredToolRegistry, set_deferred_registry
|
from deerflow.tools.builtins.tool_search import DeferredToolRegistry, set_deferred_registry
|
||||||
from deerflow.tools.builtins.tool_search import tool_search as tool_search_tool
|
from deerflow.tools.builtins.tool_search import tool_search as tool_search_tool
|
||||||
|
|
||||||
registry = DeferredToolRegistry()
|
# Reuse the existing registry if one is already set for
|
||||||
for t in mcp_tools:
|
# this async context. ``get_available_tools`` is
|
||||||
registry.register(t)
|
# re-entered whenever a subagent is spawned
|
||||||
set_deferred_registry(registry)
|
# (``task_tool`` calls it to build the child agent's
|
||||||
|
# toolset), and previously we used to unconditionally
|
||||||
|
# rebuild the registry — wiping out the parent agent's
|
||||||
|
# tool_search promotions. The
|
||||||
|
# ``DeferredToolFilterMiddleware`` then re-hid those
|
||||||
|
# tools from subsequent model calls, leaving the agent
|
||||||
|
# able to see a tool's name but unable to invoke it
|
||||||
|
# (issue #2884). ``contextvars`` already gives us the
|
||||||
|
# lifetime semantics we want: a fresh request / graph
|
||||||
|
# run starts in a new asyncio task with the
|
||||||
|
# ContextVar at its default of ``None``, so reuse is
|
||||||
|
# only triggered for re-entrant calls inside one run.
|
||||||
|
#
|
||||||
|
# Intentionally NOT reconciling against the current
|
||||||
|
# ``mcp_tools`` snapshot. The MCP cache only refreshes
|
||||||
|
# on ``extensions_config.json`` mtime changes, which
|
||||||
|
# in practice happens between graph runs — not inside
|
||||||
|
# one. And even if a refresh did happen mid-run, the
|
||||||
|
# already-built lead agent's ``ToolNode`` still holds
|
||||||
|
# the *previous* tool set (LangGraph binds tools at
|
||||||
|
# graph construction time), so a brand-new MCP tool
|
||||||
|
# couldn't actually be invoked anyway. The
|
||||||
|
# ``DeferredToolRegistry`` doesn't retain the names
|
||||||
|
# of previously-promoted tools (``promote()`` drops
|
||||||
|
# the entry entirely), so re-syncing the registry
|
||||||
|
# against a fresh ``mcp_tools`` list would
|
||||||
|
# mis-classify those promotions as new tools and
|
||||||
|
# re-register them as deferred — exactly the bug
|
||||||
|
# this fix exists to prevent.
|
||||||
|
existing_registry = get_deferred_registry()
|
||||||
|
if existing_registry is None:
|
||||||
|
registry = DeferredToolRegistry()
|
||||||
|
for t in mcp_tools:
|
||||||
|
registry.register(t)
|
||||||
|
set_deferred_registry(registry)
|
||||||
|
logger.info(f"Tool search active: {len(mcp_tools)} tools deferred")
|
||||||
|
else:
|
||||||
|
mcp_tool_names = {t.name for t in mcp_tools}
|
||||||
|
still_deferred = len(existing_registry)
|
||||||
|
promoted_count = max(0, len(mcp_tool_names) - still_deferred)
|
||||||
|
logger.info(f"Tool search active (preserved promotions): {still_deferred} tools deferred, {promoted_count} already promoted")
|
||||||
builtin_tools.append(tool_search_tool)
|
builtin_tools.append(tool_search_tool)
|
||||||
logger.info(f"Tool search active: {len(mcp_tools)} tools deferred")
|
|
||||||
except ImportError:
|
except ImportError:
|
||||||
logger.warning("MCP module not available. Install 'langchain-mcp-adapters' package to enable MCP tools.")
|
logger.warning("MCP module not available. Install 'langchain-mcp-adapters' package to enable MCP tools.")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
|
|||||||
@@ -0,0 +1,68 @@
|
|||||||
|
"""Shared helpers for user-isolation e2e tests on the custom-agent tooling.
|
||||||
|
|
||||||
|
Centralises the small fake-LLM shim and a few test-data builders that the
|
||||||
|
three e2e files in this PR (``test_setup_agent_e2e_user_isolation``,
|
||||||
|
``test_update_agent_e2e_user_isolation``, ``test_setup_agent_http_e2e_real_server``)
|
||||||
|
all need. The shim is what lets a real ``langchain.agents.create_agent``
|
||||||
|
graph run without an API key — every other layer in those tests is real
|
||||||
|
production code, which is the entire point of the test design.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from langchain_core.language_models.fake_chat_models import FakeMessagesListChatModel
|
||||||
|
from langchain_core.messages import AIMessage
|
||||||
|
from langchain_core.runnables import Runnable
|
||||||
|
|
||||||
|
|
||||||
|
class FakeToolCallingModel(FakeMessagesListChatModel):
|
||||||
|
"""FakeMessagesListChatModel plus a no-op ``bind_tools`` for create_agent.
|
||||||
|
|
||||||
|
``langchain.agents.create_agent`` calls ``model.bind_tools(...)`` to
|
||||||
|
expose the tool schemas to the model; the upstream fake raises
|
||||||
|
``NotImplementedError`` there. We just return ``self`` because we
|
||||||
|
drive deterministic tool_call output via ``responses=...``, no schema
|
||||||
|
handling needed.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def bind_tools( # type: ignore[override]
|
||||||
|
self,
|
||||||
|
tools: Any,
|
||||||
|
*,
|
||||||
|
tool_choice: Any = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Runnable:
|
||||||
|
return self
|
||||||
|
|
||||||
|
|
||||||
|
def build_single_tool_call_model(
|
||||||
|
*,
|
||||||
|
tool_name: str,
|
||||||
|
tool_args: dict[str, Any],
|
||||||
|
tool_call_id: str = "call_e2e_1",
|
||||||
|
final_text: str = "done",
|
||||||
|
) -> FakeToolCallingModel:
|
||||||
|
"""Build a fake model that emits exactly one tool_call then finishes.
|
||||||
|
|
||||||
|
Two-turn behaviour, identical across our e2e tests:
|
||||||
|
turn 1 → AIMessage with a single tool_call for *tool_name*
|
||||||
|
turn 2 → AIMessage with *final_text* (terminates the agent loop)
|
||||||
|
"""
|
||||||
|
return FakeToolCallingModel(
|
||||||
|
responses=[
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": tool_name,
|
||||||
|
"args": tool_args,
|
||||||
|
"id": tool_call_id,
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content=final_text),
|
||||||
|
]
|
||||||
|
)
|
||||||
@@ -4,6 +4,8 @@ Sets up sys.path and pre-mocks modules that would cause circular import
|
|||||||
issues when unit-testing lightweight config/registry code in isolation.
|
issues when unit-testing lightweight config/registry code in isolation.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
import importlib.util
|
import importlib.util
|
||||||
import sys
|
import sys
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
@@ -11,11 +13,16 @@ from types import SimpleNamespace
|
|||||||
from unittest.mock import MagicMock
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
from support.detectors.blocking_io import BlockingIOProbe, detect_blocking_io
|
||||||
|
|
||||||
# Make 'app' and 'deerflow' importable from any working directory
|
# Make 'app' and 'deerflow' importable from any working directory
|
||||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||||
sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "scripts"))
|
sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "scripts"))
|
||||||
|
|
||||||
|
_BACKEND_ROOT = Path(__file__).resolve().parents[1]
|
||||||
|
_blocking_io_probe = BlockingIOProbe(_BACKEND_ROOT)
|
||||||
|
_BLOCKING_IO_DETECTOR_ATTR = "_blocking_io_detector"
|
||||||
|
|
||||||
# Break the circular import chain that exists in production code:
|
# Break the circular import chain that exists in production code:
|
||||||
# deerflow.subagents.__init__
|
# deerflow.subagents.__init__
|
||||||
# -> .executor (SubagentExecutor, SubagentResult)
|
# -> .executor (SubagentExecutor, SubagentResult)
|
||||||
@@ -56,6 +63,92 @@ def provisioner_module():
|
|||||||
return module
|
return module
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture()
|
||||||
|
def blocking_io_detector():
|
||||||
|
"""Fail a focused test if blocking calls run on the event loop thread."""
|
||||||
|
with detect_blocking_io(fail_on_exit=True) as detector:
|
||||||
|
yield detector
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_addoption(parser: pytest.Parser) -> None:
|
||||||
|
group = parser.getgroup("blocking-io")
|
||||||
|
group.addoption(
|
||||||
|
"--detect-blocking-io",
|
||||||
|
action="store_true",
|
||||||
|
default=False,
|
||||||
|
help="Collect blocking calls made while an asyncio event loop is running and report a summary.",
|
||||||
|
)
|
||||||
|
group.addoption(
|
||||||
|
"--detect-blocking-io-fail",
|
||||||
|
action="store_true",
|
||||||
|
default=False,
|
||||||
|
help="Set a failing exit status when --detect-blocking-io records violations.",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_configure(config: pytest.Config) -> None:
|
||||||
|
config.addinivalue_line("markers", "no_blocking_io_probe: skip the optional blocking IO probe")
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_sessionstart(session: pytest.Session) -> None:
|
||||||
|
if _blocking_io_probe_enabled(session.config):
|
||||||
|
_blocking_io_probe.clear()
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.hookimpl(hookwrapper=True)
|
||||||
|
def pytest_runtest_call(item: pytest.Item):
|
||||||
|
if not _blocking_io_probe_enabled(item.config) or _blocking_io_probe_skipped(item):
|
||||||
|
yield
|
||||||
|
return
|
||||||
|
|
||||||
|
detector = detect_blocking_io(fail_on_exit=False, stack_limit=18)
|
||||||
|
detector.__enter__()
|
||||||
|
setattr(item, _BLOCKING_IO_DETECTOR_ATTR, detector)
|
||||||
|
yield
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.hookimpl(hookwrapper=True)
|
||||||
|
def pytest_runtest_teardown(item: pytest.Item):
|
||||||
|
yield
|
||||||
|
|
||||||
|
detector = getattr(item, _BLOCKING_IO_DETECTOR_ATTR, None)
|
||||||
|
if detector is None:
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
detector.__exit__(None, None, None)
|
||||||
|
_blocking_io_probe.record(item.nodeid, detector.violations)
|
||||||
|
finally:
|
||||||
|
delattr(item, _BLOCKING_IO_DETECTOR_ATTR)
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_sessionfinish(session: pytest.Session) -> None:
|
||||||
|
if _blocking_io_fail_enabled(session.config) and _blocking_io_probe.violation_count and session.exitstatus == pytest.ExitCode.OK:
|
||||||
|
session.exitstatus = pytest.ExitCode.TESTS_FAILED
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_terminal_summary(terminalreporter: pytest.TerminalReporter) -> None:
|
||||||
|
if not _blocking_io_probe_enabled(terminalreporter.config):
|
||||||
|
return
|
||||||
|
|
||||||
|
header, *details = _blocking_io_probe.format_summary().splitlines()
|
||||||
|
terminalreporter.write_sep("=", header)
|
||||||
|
for line in details:
|
||||||
|
terminalreporter.write_line(line)
|
||||||
|
|
||||||
|
|
||||||
|
def _blocking_io_probe_enabled(config: pytest.Config) -> bool:
|
||||||
|
return bool(config.getoption("--detect-blocking-io") or config.getoption("--detect-blocking-io-fail"))
|
||||||
|
|
||||||
|
|
||||||
|
def _blocking_io_fail_enabled(config: pytest.Config) -> bool:
|
||||||
|
return bool(config.getoption("--detect-blocking-io-fail"))
|
||||||
|
|
||||||
|
|
||||||
|
def _blocking_io_probe_skipped(item: pytest.Item) -> bool:
|
||||||
|
return item.path.name == "test_blocking_io_detector.py" or item.get_closest_marker("no_blocking_io_probe") is not None
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Auto-set user context for every test unless marked no_auto_user
|
# Auto-set user context for every test unless marked no_auto_user
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -0,0 +1 @@
|
|||||||
|
"""Shared test support helpers."""
|
||||||
@@ -0,0 +1 @@
|
|||||||
|
"""Runtime and static detectors used by tests."""
|
||||||
@@ -0,0 +1,287 @@
|
|||||||
|
"""Test helper for detecting blocking calls on an asyncio event loop.
|
||||||
|
|
||||||
|
The detector is intentionally test-only. It monkeypatches a small set of
|
||||||
|
well-known blocking entry points and their already-loaded module-level aliases,
|
||||||
|
then records calls only when they happen on a thread that is currently running
|
||||||
|
an asyncio event loop. Aliases captured in closures or default arguments remain
|
||||||
|
out of scope.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import importlib
|
||||||
|
import sys
|
||||||
|
import traceback
|
||||||
|
from collections import Counter
|
||||||
|
from collections.abc import Callable, Iterable, Iterator
|
||||||
|
from contextlib import AbstractContextManager
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from functools import wraps
|
||||||
|
from pathlib import Path
|
||||||
|
from types import TracebackType
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
BlockingCallable = Callable[..., Any]
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class BlockingCallSpec:
|
||||||
|
"""Describes one blocking callable to wrap during a detector run."""
|
||||||
|
|
||||||
|
name: str
|
||||||
|
target: str
|
||||||
|
record_on_iteration: bool = False
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class BlockingCall:
|
||||||
|
"""One blocking call observed on an asyncio event loop thread."""
|
||||||
|
|
||||||
|
name: str
|
||||||
|
target: str
|
||||||
|
stack: tuple[traceback.FrameSummary, ...]
|
||||||
|
|
||||||
|
|
||||||
|
DEFAULT_BLOCKING_CALL_SPECS: tuple[BlockingCallSpec, ...] = (
|
||||||
|
BlockingCallSpec("time.sleep", "time:sleep"),
|
||||||
|
BlockingCallSpec("requests.Session.request", "requests.sessions:Session.request"),
|
||||||
|
BlockingCallSpec("httpx.Client.request", "httpx:Client.request"),
|
||||||
|
BlockingCallSpec("os.walk", "os:walk", record_on_iteration=True),
|
||||||
|
BlockingCallSpec("pathlib.Path.resolve", "pathlib:Path.resolve"),
|
||||||
|
BlockingCallSpec("pathlib.Path.read_text", "pathlib:Path.read_text"),
|
||||||
|
BlockingCallSpec("pathlib.Path.write_text", "pathlib:Path.write_text"),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _is_event_loop_thread() -> bool:
|
||||||
|
try:
|
||||||
|
loop = asyncio.get_running_loop()
|
||||||
|
except RuntimeError:
|
||||||
|
return False
|
||||||
|
return loop.is_running()
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_target(target: str) -> tuple[object, str, BlockingCallable]:
|
||||||
|
module_name, attr_path = target.split(":", maxsplit=1)
|
||||||
|
owner: object = importlib.import_module(module_name)
|
||||||
|
parts = attr_path.split(".")
|
||||||
|
for part in parts[:-1]:
|
||||||
|
owner = getattr(owner, part)
|
||||||
|
|
||||||
|
attr_name = parts[-1]
|
||||||
|
original = getattr(owner, attr_name)
|
||||||
|
return owner, attr_name, original
|
||||||
|
|
||||||
|
|
||||||
|
def _trim_detector_frames(stack: Iterable[traceback.FrameSummary]) -> tuple[traceback.FrameSummary, ...]:
|
||||||
|
return tuple(frame for frame in stack if frame.filename != __file__)
|
||||||
|
|
||||||
|
|
||||||
|
class BlockingIODetector(AbstractContextManager["BlockingIODetector"]):
|
||||||
|
"""Record blocking calls made from async runtime code.
|
||||||
|
|
||||||
|
By default the detector reports violations but does not fail on context
|
||||||
|
exit. Tests can set ``fail_on_exit=True`` or call
|
||||||
|
``assert_no_blocking_calls()`` explicitly.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
specs: Iterable[BlockingCallSpec] = DEFAULT_BLOCKING_CALL_SPECS,
|
||||||
|
*,
|
||||||
|
fail_on_exit: bool = False,
|
||||||
|
patch_loaded_aliases: bool = True,
|
||||||
|
stack_limit: int = 12,
|
||||||
|
) -> None:
|
||||||
|
self._specs = tuple(specs)
|
||||||
|
self._fail_on_exit = fail_on_exit
|
||||||
|
self._patch_loaded_aliases_enabled = patch_loaded_aliases
|
||||||
|
self._stack_limit = stack_limit
|
||||||
|
self._patches: list[tuple[object, str, BlockingCallable]] = []
|
||||||
|
self._patch_keys: set[tuple[int, str]] = set()
|
||||||
|
self.violations: list[BlockingCall] = []
|
||||||
|
self._active = False
|
||||||
|
|
||||||
|
def __enter__(self) -> BlockingIODetector:
|
||||||
|
try:
|
||||||
|
self._active = True
|
||||||
|
alias_replacements: dict[int, BlockingCallable] = {}
|
||||||
|
for spec in self._specs:
|
||||||
|
owner, attr_name, original = _resolve_target(spec.target)
|
||||||
|
wrapper = self._wrap(spec, original)
|
||||||
|
self._patch_attribute(owner, attr_name, original, wrapper)
|
||||||
|
alias_replacements[id(original)] = wrapper
|
||||||
|
|
||||||
|
if self._patch_loaded_aliases_enabled:
|
||||||
|
self._patch_loaded_module_aliases(alias_replacements)
|
||||||
|
except Exception:
|
||||||
|
self._restore()
|
||||||
|
self._active = False
|
||||||
|
raise
|
||||||
|
return self
|
||||||
|
|
||||||
|
def __exit__(
|
||||||
|
self,
|
||||||
|
exc_type: type[BaseException] | None,
|
||||||
|
exc_value: BaseException | None,
|
||||||
|
traceback_value: TracebackType | None,
|
||||||
|
) -> bool | None:
|
||||||
|
self._restore()
|
||||||
|
self._active = False
|
||||||
|
if exc_type is None and self._fail_on_exit:
|
||||||
|
self.assert_no_blocking_calls()
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _restore(self) -> None:
|
||||||
|
for owner, attr_name, original in reversed(self._patches):
|
||||||
|
setattr(owner, attr_name, original)
|
||||||
|
self._patches.clear()
|
||||||
|
self._patch_keys.clear()
|
||||||
|
|
||||||
|
def _patch_attribute(self, owner: object, attr_name: str, original: BlockingCallable, replacement: BlockingCallable) -> None:
|
||||||
|
key = (id(owner), attr_name)
|
||||||
|
if key in self._patch_keys:
|
||||||
|
return
|
||||||
|
setattr(owner, attr_name, replacement)
|
||||||
|
self._patches.append((owner, attr_name, original))
|
||||||
|
self._patch_keys.add(key)
|
||||||
|
|
||||||
|
def _patch_loaded_module_aliases(self, replacements_by_id: dict[int, BlockingCallable]) -> None:
|
||||||
|
for module in tuple(sys.modules.values()):
|
||||||
|
namespace = getattr(module, "__dict__", None)
|
||||||
|
if not isinstance(namespace, dict):
|
||||||
|
continue
|
||||||
|
|
||||||
|
for attr_name, value in tuple(namespace.items()):
|
||||||
|
replacement = replacements_by_id.get(id(value))
|
||||||
|
if replacement is not None:
|
||||||
|
self._patch_attribute(module, attr_name, value, replacement)
|
||||||
|
|
||||||
|
def _wrap(self, spec: BlockingCallSpec, original: BlockingCallable) -> BlockingCallable:
|
||||||
|
@wraps(original)
|
||||||
|
def wrapper(*args: Any, **kwargs: Any) -> Any:
|
||||||
|
if spec.record_on_iteration:
|
||||||
|
result = original(*args, **kwargs)
|
||||||
|
return self._wrap_iteration(spec, result)
|
||||||
|
self._record_if_blocking(spec)
|
||||||
|
return original(*args, **kwargs)
|
||||||
|
|
||||||
|
return wrapper
|
||||||
|
|
||||||
|
def _wrap_iteration(self, spec: BlockingCallSpec, iterable: Iterable[Any]) -> Iterator[Any]:
|
||||||
|
iterator = iter(iterable)
|
||||||
|
reported = False
|
||||||
|
|
||||||
|
while True:
|
||||||
|
if not reported:
|
||||||
|
reported = self._record_if_blocking(spec)
|
||||||
|
try:
|
||||||
|
yield next(iterator)
|
||||||
|
except StopIteration:
|
||||||
|
return
|
||||||
|
|
||||||
|
def _record_if_blocking(self, spec: BlockingCallSpec) -> bool:
|
||||||
|
if self._active and _is_event_loop_thread():
|
||||||
|
stack = _trim_detector_frames(traceback.extract_stack(limit=self._stack_limit))
|
||||||
|
self.violations.append(BlockingCall(spec.name, spec.target, stack))
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
def assert_no_blocking_calls(self) -> None:
|
||||||
|
if self.violations:
|
||||||
|
raise AssertionError(format_blocking_calls(self.violations))
|
||||||
|
|
||||||
|
|
||||||
|
class BlockingIOProbe:
|
||||||
|
"""Collect detector output across tests and format a compact summary."""
|
||||||
|
|
||||||
|
def __init__(self, project_root: Path) -> None:
|
||||||
|
self._project_root = project_root.resolve()
|
||||||
|
self._observed: list[tuple[str, BlockingCall]] = []
|
||||||
|
|
||||||
|
@property
|
||||||
|
def violation_count(self) -> int:
|
||||||
|
return len(self._observed)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def test_count(self) -> int:
|
||||||
|
return len({nodeid for nodeid, _violation in self._observed})
|
||||||
|
|
||||||
|
def clear(self) -> None:
|
||||||
|
self._observed.clear()
|
||||||
|
|
||||||
|
def record(self, nodeid: str, violations: Iterable[BlockingCall]) -> None:
|
||||||
|
for violation in violations:
|
||||||
|
self._observed.append((nodeid, violation))
|
||||||
|
|
||||||
|
def format_summary(self, *, limit: int = 30) -> str:
|
||||||
|
if not self._observed:
|
||||||
|
return "blocking io probe: no violations"
|
||||||
|
|
||||||
|
call_sites: Counter[tuple[str, str, int, str, str]] = Counter()
|
||||||
|
for _nodeid, violation in self._observed:
|
||||||
|
frame = self._local_call_site(violation.stack)
|
||||||
|
if frame is None:
|
||||||
|
call_sites[(violation.name, "<unknown>", 0, "<unknown>", "")] += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
call_sites[
|
||||||
|
(
|
||||||
|
violation.name,
|
||||||
|
self._relative(frame.filename),
|
||||||
|
frame.lineno,
|
||||||
|
frame.name,
|
||||||
|
(frame.line or "").strip(),
|
||||||
|
)
|
||||||
|
] += 1
|
||||||
|
|
||||||
|
lines = [f"blocking io probe: {self.violation_count} violations across {self.test_count} tests", "Top call sites:"]
|
||||||
|
for (name, filename, lineno, function, line), count in call_sites.most_common(limit):
|
||||||
|
lines.append(f"{count:4d} {name} {filename}:{lineno} {function} | {line}")
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
def _relative(self, filename: str) -> str:
|
||||||
|
try:
|
||||||
|
return str(Path(filename).resolve().relative_to(self._project_root))
|
||||||
|
except ValueError:
|
||||||
|
return filename
|
||||||
|
|
||||||
|
def _local_call_site(self, stack: tuple[traceback.FrameSummary, ...]) -> traceback.FrameSummary | None:
|
||||||
|
local_frames = [frame for frame in stack if str(self._project_root) in frame.filename and "/.venv/" not in frame.filename and not self._relative(frame.filename).startswith("tests/")]
|
||||||
|
if local_frames:
|
||||||
|
return local_frames[-1]
|
||||||
|
|
||||||
|
test_frames = [frame for frame in stack if str(self._project_root) in frame.filename and "/.venv/" not in frame.filename]
|
||||||
|
return test_frames[-1] if test_frames else None
|
||||||
|
|
||||||
|
|
||||||
|
def detect_blocking_io(
|
||||||
|
specs: Iterable[BlockingCallSpec] = DEFAULT_BLOCKING_CALL_SPECS,
|
||||||
|
*,
|
||||||
|
fail_on_exit: bool = False,
|
||||||
|
patch_loaded_aliases: bool = True,
|
||||||
|
stack_limit: int = 12,
|
||||||
|
) -> BlockingIODetector:
|
||||||
|
"""Create a detector context manager for a focused test scope."""
|
||||||
|
|
||||||
|
return BlockingIODetector(specs, fail_on_exit=fail_on_exit, patch_loaded_aliases=patch_loaded_aliases, stack_limit=stack_limit)
|
||||||
|
|
||||||
|
|
||||||
|
def format_blocking_calls(violations: Iterable[BlockingCall]) -> str:
|
||||||
|
"""Format detector output with enough stack context to locate call sites."""
|
||||||
|
|
||||||
|
lines = ["Blocking calls were executed on an asyncio event loop thread:"]
|
||||||
|
for index, violation in enumerate(violations, start=1):
|
||||||
|
lines.append(f"{index}. {violation.name} ({violation.target})")
|
||||||
|
lines.extend(_format_stack(violation.stack))
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def _format_stack(stack: Iterable[traceback.FrameSummary]) -> Iterator[str]:
|
||||||
|
for frame in stack:
|
||||||
|
location = f"{frame.filename}:{frame.lineno}"
|
||||||
|
lines = [f" at {frame.name} ({location})"]
|
||||||
|
if frame.line:
|
||||||
|
lines.append(f" {frame.line.strip()}")
|
||||||
|
yield from lines
|
||||||
@@ -0,0 +1,190 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
from os import walk as imported_walk
|
||||||
|
from pathlib import Path
|
||||||
|
from time import sleep as imported_sleep
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
import pytest
|
||||||
|
import requests
|
||||||
|
from support.detectors.blocking_io import (
|
||||||
|
BlockingCallSpec,
|
||||||
|
BlockingIOProbe,
|
||||||
|
detect_blocking_io,
|
||||||
|
)
|
||||||
|
|
||||||
|
pytestmark = pytest.mark.asyncio
|
||||||
|
|
||||||
|
|
||||||
|
TIME_SLEEP_ONLY = (BlockingCallSpec("time.sleep", "time:sleep"),)
|
||||||
|
REQUESTS_ONLY = (BlockingCallSpec("requests.Session.request", "requests.sessions:Session.request"),)
|
||||||
|
HTTPX_ONLY = (BlockingCallSpec("httpx.Client.request", "httpx:Client.request"),)
|
||||||
|
OS_WALK_ONLY = (BlockingCallSpec("os.walk", "os:walk", record_on_iteration=True),)
|
||||||
|
PATH_READ_TEXT_ONLY = (BlockingCallSpec("pathlib.Path.read_text", "pathlib:Path.read_text"),)
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_time_sleep_on_event_loop() -> None:
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY) as detector:
|
||||||
|
time.sleep(0)
|
||||||
|
|
||||||
|
assert [violation.name for violation in detector.violations] == ["time.sleep"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_already_imported_sleep_alias_on_event_loop() -> None:
|
||||||
|
original_alias = imported_sleep
|
||||||
|
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY) as detector:
|
||||||
|
imported_sleep(0)
|
||||||
|
|
||||||
|
assert imported_sleep is original_alias
|
||||||
|
assert [violation.name for violation in detector.violations] == ["time.sleep"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_can_disable_loaded_alias_patching() -> None:
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY, patch_loaded_aliases=False) as detector:
|
||||||
|
imported_sleep(0)
|
||||||
|
|
||||||
|
assert detector.violations == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_does_not_record_time_sleep_offloaded_to_thread() -> None:
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY) as detector:
|
||||||
|
await asyncio.to_thread(time.sleep, 0)
|
||||||
|
|
||||||
|
assert detector.violations == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_fixture_allows_offloaded_sync_work(blocking_io_detector) -> None:
|
||||||
|
await asyncio.to_thread(time.sleep, 0)
|
||||||
|
|
||||||
|
assert blocking_io_detector.violations == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_does_not_record_sync_call_without_running_event_loop() -> None:
|
||||||
|
def call_sleep() -> list[str]:
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY) as detector:
|
||||||
|
time.sleep(0)
|
||||||
|
return [violation.name for violation in detector.violations]
|
||||||
|
|
||||||
|
assert await asyncio.to_thread(call_sleep) == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_fail_on_exit_includes_call_site() -> None:
|
||||||
|
with pytest.raises(AssertionError) as exc_info:
|
||||||
|
with detect_blocking_io(TIME_SLEEP_ONLY, fail_on_exit=True):
|
||||||
|
time.sleep(0)
|
||||||
|
|
||||||
|
message = str(exc_info.value)
|
||||||
|
assert "time.sleep" in message
|
||||||
|
assert "test_fail_on_exit_includes_call_site" in message
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_requests_session_request_without_real_network(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
def fake_request(self: requests.Session, method: str, url: str, **kwargs: object) -> str:
|
||||||
|
return f"{method}:{url}"
|
||||||
|
|
||||||
|
monkeypatch.setattr(requests.sessions.Session, "request", fake_request)
|
||||||
|
|
||||||
|
with detect_blocking_io(REQUESTS_ONLY) as detector:
|
||||||
|
assert requests.get("https://example.invalid") == "get:https://example.invalid"
|
||||||
|
|
||||||
|
assert [violation.name for violation in detector.violations] == ["requests.Session.request"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_sync_httpx_client_request_without_real_network(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
def fake_request(self: httpx.Client, method: str, url: str, **kwargs: object) -> httpx.Response:
|
||||||
|
return httpx.Response(200, request=httpx.Request(method, url))
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.Client, "request", fake_request)
|
||||||
|
|
||||||
|
with detect_blocking_io(HTTPX_ONLY) as detector:
|
||||||
|
with httpx.Client() as client:
|
||||||
|
response = client.get("https://example.invalid")
|
||||||
|
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert [violation.name for violation in detector.violations] == ["httpx.Client.request"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_os_walk_on_event_loop(tmp_path: Path) -> None:
|
||||||
|
(tmp_path / "nested").mkdir()
|
||||||
|
|
||||||
|
with detect_blocking_io(OS_WALK_ONLY) as detector:
|
||||||
|
assert list(os.walk(tmp_path))
|
||||||
|
|
||||||
|
assert [violation.name for violation in detector.violations] == ["os.walk"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_already_imported_os_walk_alias_on_iteration(tmp_path: Path) -> None:
|
||||||
|
(tmp_path / "nested").mkdir()
|
||||||
|
original_alias = imported_walk
|
||||||
|
|
||||||
|
with detect_blocking_io(OS_WALK_ONLY) as detector:
|
||||||
|
assert list(imported_walk(tmp_path))
|
||||||
|
|
||||||
|
assert imported_walk is original_alias
|
||||||
|
assert [violation.name for violation in detector.violations] == ["os.walk"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_does_not_record_os_walk_before_iteration(tmp_path: Path) -> None:
|
||||||
|
with detect_blocking_io(OS_WALK_ONLY) as detector:
|
||||||
|
walker = os.walk(tmp_path)
|
||||||
|
|
||||||
|
assert list(walker)
|
||||||
|
assert detector.violations == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_does_not_record_os_walk_iterated_off_event_loop(tmp_path: Path) -> None:
|
||||||
|
(tmp_path / "nested").mkdir()
|
||||||
|
|
||||||
|
with detect_blocking_io(OS_WALK_ONLY) as detector:
|
||||||
|
walker = os.walk(tmp_path)
|
||||||
|
assert await asyncio.to_thread(lambda: list(walker))
|
||||||
|
|
||||||
|
assert detector.violations == []
|
||||||
|
|
||||||
|
|
||||||
|
async def test_records_path_read_text_on_event_loop(tmp_path: Path) -> None:
|
||||||
|
path = tmp_path / "data.txt"
|
||||||
|
path.write_text("content", encoding="utf-8")
|
||||||
|
|
||||||
|
with detect_blocking_io(PATH_READ_TEXT_ONLY) as detector:
|
||||||
|
assert path.read_text(encoding="utf-8") == "content"
|
||||||
|
|
||||||
|
assert [violation.name for violation in detector.violations] == ["pathlib.Path.read_text"]
|
||||||
|
|
||||||
|
|
||||||
|
async def test_probe_formats_summary_for_recorded_violations(tmp_path: Path) -> None:
|
||||||
|
probe = BlockingIOProbe(Path(__file__).resolve().parents[1])
|
||||||
|
path = tmp_path / "data.txt"
|
||||||
|
path.write_text("content", encoding="utf-8")
|
||||||
|
|
||||||
|
with detect_blocking_io(PATH_READ_TEXT_ONLY, stack_limit=18) as detector:
|
||||||
|
assert path.read_text(encoding="utf-8") == "content"
|
||||||
|
|
||||||
|
probe.record("tests/test_example.py::test_example", detector.violations)
|
||||||
|
summary = probe.format_summary()
|
||||||
|
|
||||||
|
assert "blocking io probe: 1 violations across 1 tests" in summary
|
||||||
|
assert "pathlib.Path.read_text" in summary
|
||||||
|
|
||||||
|
|
||||||
|
async def test_probe_formats_empty_summary_and_can_be_cleared(tmp_path: Path) -> None:
|
||||||
|
probe = BlockingIOProbe(Path(__file__).resolve().parents[1])
|
||||||
|
|
||||||
|
assert probe.format_summary() == "blocking io probe: no violations"
|
||||||
|
|
||||||
|
path = tmp_path / "data.txt"
|
||||||
|
path.write_text("content", encoding="utf-8")
|
||||||
|
with detect_blocking_io(PATH_READ_TEXT_ONLY, stack_limit=18) as detector:
|
||||||
|
assert path.read_text(encoding="utf-8") == "content"
|
||||||
|
|
||||||
|
probe.record("tests/test_example.py::test_example", detector.violations)
|
||||||
|
assert probe.violation_count == 1
|
||||||
|
|
||||||
|
probe.clear()
|
||||||
|
|
||||||
|
assert probe.violation_count == 0
|
||||||
|
assert probe.format_summary() == "blocking io probe: no violations"
|
||||||
@@ -0,0 +1,22 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import time
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
ORIGINAL_SLEEP = time.sleep
|
||||||
|
|
||||||
|
|
||||||
|
def replacement_sleep(seconds: float) -> None:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def test_probe_survives_monkeypatch_teardown(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
monkeypatch.setattr(time, "sleep", replacement_sleep)
|
||||||
|
assert time.sleep is replacement_sleep
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_blocking_io_probe
|
||||||
|
def test_probe_restores_original_after_monkeypatch_teardown() -> None:
|
||||||
|
assert time.sleep is ORIGINAL_SLEEP
|
||||||
|
assert getattr(time.sleep, "__wrapped__", None) is None
|
||||||
@@ -0,0 +1,222 @@
|
|||||||
|
"""Real-LLM end-to-end verification for issue #2884.
|
||||||
|
|
||||||
|
Drives a real ``langchain.agents.create_agent`` graph against a real OpenAI-
|
||||||
|
compatible LLM (one-api gateway), bound through ``DeferredToolFilterMiddleware``
|
||||||
|
and the production ``get_available_tools`` pipeline. The only thing we mock is
|
||||||
|
the MCP tool source — we hand-roll two ``@tool``s and inject them through
|
||||||
|
``deerflow.mcp.cache.get_cached_mcp_tools``.
|
||||||
|
|
||||||
|
The flow exercised:
|
||||||
|
1. Turn 1: agent sees ``tool_search`` (plus a ``fake_subagent_trigger``
|
||||||
|
that re-enters ``get_available_tools`` on the same task — this is the
|
||||||
|
code path issue #2884 reports). It must call ``tool_search`` to
|
||||||
|
discover the deferred ``fake_calculator`` tool.
|
||||||
|
2. Tool batch: ``tool_search`` promotes ``fake_calculator``;
|
||||||
|
``fake_subagent_trigger`` re-enters ``get_available_tools``.
|
||||||
|
3. Turn 2: the promoted ``fake_calculator`` schema must reach the model
|
||||||
|
so it can actually call it. Without this PR's fix, the re-entry wipes
|
||||||
|
the promotion and the model can no longer invoke the tool.
|
||||||
|
|
||||||
|
Skipped unless ``ONEAPI_E2E=1`` is set so this doesn't burn credits on every
|
||||||
|
test run. Run with::
|
||||||
|
|
||||||
|
ONEAPI_E2E=1 OPENAI_API_KEY=... OPENAI_API_BASE=... \
|
||||||
|
PYTHONPATH=. uv run pytest \
|
||||||
|
tests/test_deferred_tool_promotion_real_llm.py -v -s
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from langchain_core.messages import HumanMessage
|
||||||
|
from langchain_core.tools import tool as as_tool
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Skip control: only run when explicitly opted in.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
pytestmark = pytest.mark.skipif(
|
||||||
|
os.getenv("ONEAPI_E2E") != "1",
|
||||||
|
reason="Real-LLM e2e: opt in with ONEAPI_E2E=1 (requires OPENAI_API_KEY + OPENAI_API_BASE)",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Fake "MCP" tools the agent should discover via tool_search.
|
||||||
|
# Keep them obviously synthetic so the model can pattern-match the search.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
_calls: list[str] = []
|
||||||
|
|
||||||
|
|
||||||
|
@as_tool
|
||||||
|
def fake_calculator(expression: str) -> str:
|
||||||
|
"""Evaluate a tiny arithmetic expression like '2 + 2'.
|
||||||
|
|
||||||
|
Reserved for the user — only call this if the user asks for arithmetic.
|
||||||
|
"""
|
||||||
|
_calls.append(f"fake_calculator:{expression}")
|
||||||
|
try:
|
||||||
|
# Trivially safe-eval just for the e2e check
|
||||||
|
allowed = set("0123456789+-*/() .")
|
||||||
|
if not set(expression) <= allowed:
|
||||||
|
return "expression contains disallowed characters"
|
||||||
|
return str(eval(expression, {"__builtins__": {}}, {})) # noqa: S307
|
||||||
|
except Exception as e:
|
||||||
|
return f"error: {e}"
|
||||||
|
|
||||||
|
|
||||||
|
@as_tool
|
||||||
|
def fake_translator(text: str, target_lang: str) -> str:
|
||||||
|
"""Translate text into the given language code. Decorative — not used."""
|
||||||
|
_calls.append(f"fake_translator:{text}:{target_lang}")
|
||||||
|
return f"[{target_lang}] {text}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Pipeline wiring (same shape as the in-process tests).
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(autouse=True)
|
||||||
|
def _reset_registry_between_tests():
|
||||||
|
from deerflow.tools.builtins.tool_search import reset_deferred_registry
|
||||||
|
|
||||||
|
reset_deferred_registry()
|
||||||
|
yield
|
||||||
|
reset_deferred_registry()
|
||||||
|
|
||||||
|
|
||||||
|
def _patch_mcp_pipeline(monkeypatch: pytest.MonkeyPatch, mcp_tools: list) -> None:
|
||||||
|
from deerflow.config.extensions_config import ExtensionsConfig, McpServerConfig
|
||||||
|
|
||||||
|
real_ext = ExtensionsConfig(
|
||||||
|
mcpServers={"fake-server": McpServerConfig(type="stdio", command="echo", enabled=True)},
|
||||||
|
)
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"deerflow.config.extensions_config.ExtensionsConfig.from_file",
|
||||||
|
classmethod(lambda cls: real_ext),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("deerflow.mcp.cache.get_cached_mcp_tools", lambda: list(mcp_tools))
|
||||||
|
|
||||||
|
|
||||||
|
def _force_tool_search_enabled(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
"""Build a minimal mock AppConfig and patch the symbol — never call the
|
||||||
|
real loader, which would trigger ``_apply_singleton_configs`` and
|
||||||
|
permanently mutate cross-test singletons (memory, title, …)."""
|
||||||
|
from deerflow.config.app_config import AppConfig
|
||||||
|
from deerflow.config.tool_search_config import ToolSearchConfig
|
||||||
|
|
||||||
|
mock_cfg = AppConfig.model_construct(
|
||||||
|
log_level="info",
|
||||||
|
models=[],
|
||||||
|
tools=[],
|
||||||
|
tool_groups=[],
|
||||||
|
sandbox=AppConfig.model_fields["sandbox"].annotation.model_construct(use="x"),
|
||||||
|
tool_search=ToolSearchConfig(enabled=True),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("deerflow.tools.tools.get_app_config", lambda: mock_cfg)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Real-LLM e2e test
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_real_llm_promotes_then_invokes_with_subagent_reentry(monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""End-to-end against a real OpenAI-compatible LLM.
|
||||||
|
|
||||||
|
The model must:
|
||||||
|
Turn 1 — see ``tool_search`` (deferred tools aren't bound yet) and
|
||||||
|
batch-call BOTH ``tool_search(select:fake_calculator)`` AND
|
||||||
|
``fake_subagent_trigger(...)``.
|
||||||
|
Turn 2 — call ``fake_calculator`` and finish.
|
||||||
|
|
||||||
|
Pass criterion: ``fake_calculator`` actually gets invoked at the tool
|
||||||
|
layer — recorded in ``_calls`` — which proves the model received the
|
||||||
|
promoted schema after the re-entrant ``get_available_tools`` call.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
from langchain_openai import ChatOpenAI
|
||||||
|
|
||||||
|
from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware
|
||||||
|
from deerflow.tools.tools import get_available_tools
|
||||||
|
|
||||||
|
_patch_mcp_pipeline(monkeypatch, [fake_calculator, fake_translator])
|
||||||
|
_force_tool_search_enabled(monkeypatch)
|
||||||
|
_calls.clear()
|
||||||
|
|
||||||
|
@as_tool
|
||||||
|
async def fake_subagent_trigger(prompt: str) -> str:
|
||||||
|
"""Pretend to spawn a subagent. Internally rebuilds the toolset.
|
||||||
|
|
||||||
|
Use this whenever the user asks you to delegate work — pass a short
|
||||||
|
description as ``prompt``.
|
||||||
|
"""
|
||||||
|
# ``task_tool`` does this internally. Whether the registry-reset that
|
||||||
|
# used to happen here actually leaks back to the parent task depends
|
||||||
|
# on asyncio's implicit context-copying semantics (gather creates
|
||||||
|
# child tasks with copied contexts, so reset_deferred_registry is
|
||||||
|
# task-local) — but the fix in this PR is what GUARANTEES the
|
||||||
|
# promotion sticks regardless of which integration path triggers a
|
||||||
|
# re-entrant ``get_available_tools`` call.
|
||||||
|
get_available_tools(subagent_enabled=False)
|
||||||
|
_calls.append(f"fake_subagent_trigger:{prompt}")
|
||||||
|
return "subagent completed"
|
||||||
|
|
||||||
|
tools = get_available_tools() + [fake_subagent_trigger]
|
||||||
|
|
||||||
|
model = ChatOpenAI(
|
||||||
|
model=os.environ.get("ONEAPI_MODEL", "claude-sonnet-4-6"),
|
||||||
|
api_key=os.environ["OPENAI_API_KEY"],
|
||||||
|
base_url=os.environ["OPENAI_API_BASE"],
|
||||||
|
temperature=0,
|
||||||
|
max_retries=1,
|
||||||
|
)
|
||||||
|
|
||||||
|
system_prompt = (
|
||||||
|
"You are a meticulous assistant. Available deferred tools include a "
|
||||||
|
"calculator and a translator — their schemas are hidden until you "
|
||||||
|
"search for them via tool_search.\n\n"
|
||||||
|
"Procedure for the user's request:\n"
|
||||||
|
" 1. Call tool_search with query 'select:fake_calculator' AND "
|
||||||
|
"in the SAME tool batch also call fake_subagent_trigger(prompt='go') "
|
||||||
|
"to delegate the side work. Put both tool_calls in your first response.\n"
|
||||||
|
" 2. After both tool messages come back, call fake_calculator with "
|
||||||
|
"the user's expression.\n"
|
||||||
|
" 3. Reply with just the numeric result."
|
||||||
|
)
|
||||||
|
|
||||||
|
graph = create_agent(
|
||||||
|
model=model,
|
||||||
|
tools=tools,
|
||||||
|
middleware=[DeferredToolFilterMiddleware()],
|
||||||
|
system_prompt=system_prompt,
|
||||||
|
)
|
||||||
|
|
||||||
|
result = await graph.ainvoke(
|
||||||
|
{"messages": [HumanMessage(content="What is 17 * 23? Use the deferred calculator tool.")]},
|
||||||
|
config={"recursion_limit": 12},
|
||||||
|
)
|
||||||
|
|
||||||
|
print("\n=== tool calls recorded ===")
|
||||||
|
for c in _calls:
|
||||||
|
print(f" {c}")
|
||||||
|
print("\n=== final message ===")
|
||||||
|
final_text = result["messages"][-1].content if result["messages"] else "(none)"
|
||||||
|
print(f" {final_text!r}")
|
||||||
|
|
||||||
|
# The smoking-gun assertion: fake_calculator was actually invoked at the
|
||||||
|
# tool layer. This is only possible if the promoted schema reached the
|
||||||
|
# model in turn 2, despite the subagent-style re-entry in turn 1.
|
||||||
|
calc_calls = [c for c in _calls if c.startswith("fake_calculator:")]
|
||||||
|
assert calc_calls, f"REGRESSION (#2884): the model never managed to call fake_calculator. All recorded tool calls: {_calls!r}. Final text: {final_text!r}"
|
||||||
|
|
||||||
|
# And the math should actually be done correctly (sanity that the LLM
|
||||||
|
# really used the result, not just hallucinated the answer).
|
||||||
|
assert "391" in str(final_text), f"Model didn't surface 17*23=391. Final text: {final_text!r}"
|
||||||
@@ -0,0 +1,390 @@
|
|||||||
|
"""Reproduce + regression-guard issue #2884.
|
||||||
|
|
||||||
|
Hypothesis from the issue:
|
||||||
|
``tools.tools.get_available_tools`` unconditionally calls
|
||||||
|
``reset_deferred_registry()`` and constructs a fresh ``DeferredToolRegistry``
|
||||||
|
every time it is invoked. If anything calls ``get_available_tools`` again
|
||||||
|
during the same async context (after the agent has promoted tools via
|
||||||
|
``tool_search``), the promotion is wiped and the next model call hides the
|
||||||
|
tool's schema again.
|
||||||
|
|
||||||
|
These tests pin two things:
|
||||||
|
|
||||||
|
A. **At the unit boundary** — verify the failure mode directly. Promote a
|
||||||
|
tool in the registry, then call ``get_available_tools`` again and observe
|
||||||
|
that the ContextVar registry is reset and the promotion is lost.
|
||||||
|
|
||||||
|
B. **At the graph-execution boundary** — drive a real ``create_agent`` graph
|
||||||
|
with the real ``DeferredToolFilterMiddleware`` through two model turns.
|
||||||
|
The first turn calls ``tool_search`` which promotes a tool. The second
|
||||||
|
turn must see that tool's schema in ``request.tools``. If
|
||||||
|
``get_available_tools`` were to run again between the two turns and reset
|
||||||
|
the registry, the second turn's filter would strip the tool.
|
||||||
|
|
||||||
|
Strategy: use the production ``deerflow.tools.tools.get_available_tools``
|
||||||
|
unmodified; mock only the LLM and the MCP tool source. Patch
|
||||||
|
``deerflow.mcp.cache.get_cached_mcp_tools`` (the symbol that
|
||||||
|
``get_available_tools`` resolves via lazy import) to return our fixture
|
||||||
|
tools so we don't need a real MCP server.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from langchain_core.language_models.fake_chat_models import FakeMessagesListChatModel
|
||||||
|
from langchain_core.messages import AIMessage, HumanMessage
|
||||||
|
from langchain_core.runnables import Runnable
|
||||||
|
from langchain_core.tools import tool as as_tool
|
||||||
|
|
||||||
|
|
||||||
|
class FakeToolCallingModel(FakeMessagesListChatModel):
|
||||||
|
"""FakeMessagesListChatModel + no-op bind_tools so create_agent works."""
|
||||||
|
|
||||||
|
def bind_tools( # type: ignore[override]
|
||||||
|
self,
|
||||||
|
tools: Any,
|
||||||
|
*,
|
||||||
|
tool_choice: Any = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Runnable:
|
||||||
|
return self
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Fixtures: a fake MCP tool source + a way to force config.tool_search.enabled
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@as_tool
|
||||||
|
def fake_mcp_search(query: str) -> str:
|
||||||
|
"""Pretend to search a knowledge base for the given query."""
|
||||||
|
return f"results for {query}"
|
||||||
|
|
||||||
|
|
||||||
|
@as_tool
|
||||||
|
def fake_mcp_fetch(url: str) -> str:
|
||||||
|
"""Pretend to fetch a page at the given URL."""
|
||||||
|
return f"content of {url}"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(autouse=True)
|
||||||
|
def _supply_env(monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""config.yaml references $OPENAI_API_KEY at parse time; supply a placeholder."""
|
||||||
|
monkeypatch.setenv("OPENAI_API_KEY", "sk-fake-not-used")
|
||||||
|
monkeypatch.setenv("OPENAI_API_BASE", "https://example.invalid")
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(autouse=True)
|
||||||
|
def _reset_deferred_registry_between_tests():
|
||||||
|
"""Each test must start with a clean ContextVar.
|
||||||
|
|
||||||
|
The registry lives in a module-level ContextVar with no per-task isolation
|
||||||
|
in a synchronous test runner, so one test's promotion can leak into the
|
||||||
|
next and silently break filter assertions.
|
||||||
|
"""
|
||||||
|
from deerflow.tools.builtins.tool_search import reset_deferred_registry
|
||||||
|
|
||||||
|
reset_deferred_registry()
|
||||||
|
yield
|
||||||
|
reset_deferred_registry()
|
||||||
|
|
||||||
|
|
||||||
|
def _patch_mcp_pipeline(monkeypatch: pytest.MonkeyPatch, mcp_tools: list) -> None:
|
||||||
|
"""Make get_available_tools believe an MCP server is registered.
|
||||||
|
|
||||||
|
Build a real ``ExtensionsConfig`` with one enabled MCP server entry so
|
||||||
|
that both ``AppConfig.from_file`` (which calls
|
||||||
|
``ExtensionsConfig.from_file().model_dump()``) and ``tools.get_available_tools``
|
||||||
|
(which calls ``ExtensionsConfig.from_file().get_enabled_mcp_servers()``)
|
||||||
|
see a valid instance. Then point the MCP tool cache at our fixture tools.
|
||||||
|
"""
|
||||||
|
from deerflow.config.extensions_config import ExtensionsConfig, McpServerConfig
|
||||||
|
|
||||||
|
real_ext = ExtensionsConfig(
|
||||||
|
mcpServers={"fake-server": McpServerConfig(type="stdio", command="echo", enabled=True)},
|
||||||
|
)
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"deerflow.config.extensions_config.ExtensionsConfig.from_file",
|
||||||
|
classmethod(lambda cls: real_ext),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("deerflow.mcp.cache.get_cached_mcp_tools", lambda: list(mcp_tools))
|
||||||
|
|
||||||
|
|
||||||
|
def _force_tool_search_enabled(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
"""Force config.tool_search.enabled=True without touching the yaml.
|
||||||
|
|
||||||
|
Calling the real ``get_app_config()`` would trigger ``_apply_singleton_configs``
|
||||||
|
which permanently mutates module-level singletons (``_memory_config``,
|
||||||
|
``_title_config``, …) to match the developer's ``config.yaml`` — even
|
||||||
|
after pytest restores our patch. That leaks across tests later in the
|
||||||
|
run that rely on those singletons' DEFAULTS (e.g. memory queue tests
|
||||||
|
require ``_memory_config.enabled = True``, which is the dataclass default
|
||||||
|
but FALSE in the actual yaml).
|
||||||
|
|
||||||
|
Build a minimal mock AppConfig instead and never call the real loader.
|
||||||
|
"""
|
||||||
|
from deerflow.config.app_config import AppConfig
|
||||||
|
from deerflow.config.tool_search_config import ToolSearchConfig
|
||||||
|
|
||||||
|
mock_cfg = AppConfig.model_construct(
|
||||||
|
log_level="info",
|
||||||
|
models=[],
|
||||||
|
tools=[],
|
||||||
|
tool_groups=[],
|
||||||
|
sandbox=AppConfig.model_fields["sandbox"].annotation.model_construct(use="x"),
|
||||||
|
tool_search=ToolSearchConfig(enabled=True),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("deerflow.tools.tools.get_app_config", lambda: mock_cfg)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Section A — direct unit-level reproduction
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_get_available_tools_preserves_promotions_across_reentrant_calls(monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""Re-entrant ``get_available_tools()`` must preserve prior promotions.
|
||||||
|
|
||||||
|
Step 1: call get_available_tools() — registers MCP tools as deferred.
|
||||||
|
Step 2: simulate the agent calling tool_search by promoting one tool.
|
||||||
|
Step 3: call get_available_tools() again (the same code path
|
||||||
|
``task_tool`` exercises mid-run).
|
||||||
|
|
||||||
|
Assertion: after step 3, the promoted tool is STILL promoted (not
|
||||||
|
re-deferred). On ``main`` before the fix, step 3's
|
||||||
|
``reset_deferred_registry()`` wiped the promotion and re-registered
|
||||||
|
every MCP tool as deferred — this assertion fired with
|
||||||
|
``REGRESSION (#2884)``.
|
||||||
|
"""
|
||||||
|
from deerflow.tools.builtins.tool_search import get_deferred_registry
|
||||||
|
from deerflow.tools.tools import get_available_tools
|
||||||
|
|
||||||
|
_patch_mcp_pipeline(monkeypatch, [fake_mcp_search, fake_mcp_fetch])
|
||||||
|
_force_tool_search_enabled(monkeypatch)
|
||||||
|
|
||||||
|
# Step 1: first call — both MCP tools start deferred
|
||||||
|
get_available_tools()
|
||||||
|
reg1 = get_deferred_registry()
|
||||||
|
assert reg1 is not None
|
||||||
|
assert {e.name for e in reg1.entries} == {"fake_mcp_search", "fake_mcp_fetch"}
|
||||||
|
|
||||||
|
# Step 2: simulate tool_search promoting one of them
|
||||||
|
reg1.promote({"fake_mcp_search"})
|
||||||
|
assert {e.name for e in reg1.entries} == {"fake_mcp_fetch"}, "Sanity: promote should remove fake_mcp_search"
|
||||||
|
|
||||||
|
# Step 3: second call — registry must NOT silently undo the promotion
|
||||||
|
get_available_tools()
|
||||||
|
reg2 = get_deferred_registry()
|
||||||
|
assert reg2 is not None
|
||||||
|
deferred_after = {e.name for e in reg2.entries}
|
||||||
|
assert "fake_mcp_search" not in deferred_after, f"REGRESSION (#2884): get_available_tools wiped the deferred registry, re-deferring a tool that was already promoted by tool_search. deferred_after_second_call={deferred_after!r}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Section B — graph-execution reproduction
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class _ToolSearchPromotingModel(FakeToolCallingModel):
|
||||||
|
"""Two-turn model that:
|
||||||
|
|
||||||
|
Turn 1 → emit a tool_call for ``tool_search`` (the real one)
|
||||||
|
Turn 2 → emit a tool_call for ``fake_mcp_search`` (the promoted tool)
|
||||||
|
|
||||||
|
Records the tools it received on each turn so the test can inspect what
|
||||||
|
DeferredToolFilterMiddleware actually fed to ``bind_tools``.
|
||||||
|
"""
|
||||||
|
|
||||||
|
bound_tools_per_turn: list[list[str]] = []
|
||||||
|
|
||||||
|
def bind_tools( # type: ignore[override]
|
||||||
|
self,
|
||||||
|
tools: Any,
|
||||||
|
*,
|
||||||
|
tool_choice: Any = None,
|
||||||
|
**kwargs: Any,
|
||||||
|
) -> Runnable:
|
||||||
|
# Record the tool names the model would see in this turn
|
||||||
|
names = [getattr(t, "name", getattr(t, "__name__", repr(t))) for t in tools]
|
||||||
|
self.bound_tools_per_turn.append(names)
|
||||||
|
return self
|
||||||
|
|
||||||
|
|
||||||
|
def _build_promoting_model() -> _ToolSearchPromotingModel:
|
||||||
|
return _ToolSearchPromotingModel(
|
||||||
|
responses=[
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "tool_search",
|
||||||
|
"args": {"query": "select:fake_mcp_search"},
|
||||||
|
"id": "call_search_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "fake_mcp_search",
|
||||||
|
"args": {"query": "hello"},
|
||||||
|
"id": "call_mcp_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content="all done"),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_promoted_tool_is_visible_to_model_on_second_turn(monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""End-to-end: drive a real create_agent graph through two turns.
|
||||||
|
|
||||||
|
Without the fix, the second-turn bind_tools call should NOT contain
|
||||||
|
fake_mcp_search (because DeferredToolFilterMiddleware sees it in the
|
||||||
|
registry and strips it). With the fix, the model sees the schema and can
|
||||||
|
invoke it.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
|
||||||
|
from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware
|
||||||
|
from deerflow.tools.tools import get_available_tools
|
||||||
|
|
||||||
|
_patch_mcp_pipeline(monkeypatch, [fake_mcp_search, fake_mcp_fetch])
|
||||||
|
_force_tool_search_enabled(monkeypatch)
|
||||||
|
|
||||||
|
tools = get_available_tools()
|
||||||
|
# Sanity: the assembled tool list includes the deferred tools (they're in
|
||||||
|
# bind_tools but DeferredToolFilterMiddleware strips deferred ones before
|
||||||
|
# they reach the model)
|
||||||
|
tool_names = {getattr(t, "name", "") for t in tools}
|
||||||
|
assert {"tool_search", "fake_mcp_search", "fake_mcp_fetch"} <= tool_names
|
||||||
|
|
||||||
|
model = _build_promoting_model()
|
||||||
|
model.bound_tools_per_turn = [] # reset class-level recorder
|
||||||
|
|
||||||
|
graph = create_agent(
|
||||||
|
model=model,
|
||||||
|
tools=tools,
|
||||||
|
middleware=[DeferredToolFilterMiddleware()],
|
||||||
|
system_prompt="bug-2884-repro",
|
||||||
|
)
|
||||||
|
|
||||||
|
graph.invoke({"messages": [HumanMessage(content="use the search tool")]})
|
||||||
|
|
||||||
|
# Turn 1: model should NOT see fake_mcp_search (it's deferred)
|
||||||
|
turn1 = set(model.bound_tools_per_turn[0])
|
||||||
|
assert "fake_mcp_search" not in turn1, f"Turn 1 sanity: deferred tools must be hidden from the model. Saw: {turn1!r}"
|
||||||
|
assert "tool_search" in turn1, f"Turn 1 sanity: tool_search must be visible so the agent can discover. Saw: {turn1!r}"
|
||||||
|
|
||||||
|
# Turn 2: AFTER tool_search promotes fake_mcp_search, the model must see it.
|
||||||
|
# This is the load-bearing assertion for issue #2884.
|
||||||
|
assert len(model.bound_tools_per_turn) >= 2, f"Expected at least 2 model turns, got {len(model.bound_tools_per_turn)}"
|
||||||
|
turn2 = set(model.bound_tools_per_turn[1])
|
||||||
|
assert "fake_mcp_search" in turn2, f"REGRESSION (#2884): tool_search promoted fake_mcp_search in turn 1, but the deferred-tool filter still hid it from the model in turn 2. Turn 2 bound tools: {turn2!r}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Section C — the actual issue #2884 trigger: a re-entrant
|
||||||
|
# get_available_tools call (e.g. when task_tool spawns a subagent) must not
|
||||||
|
# wipe the parent's promotion.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_reentrant_get_available_tools_preserves_promotion(monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""Issue #2884 in its real shape: a re-entrant get_available_tools call
|
||||||
|
(the same pattern that happens when ``task_tool`` builds a subagent's
|
||||||
|
toolset mid-run) must not wipe the parent agent's tool_search promotions.
|
||||||
|
|
||||||
|
Turn 1's tool batch contains BOTH ``tool_search`` (which promotes
|
||||||
|
``fake_mcp_search``) AND ``fake_subagent_trigger`` (which calls
|
||||||
|
``get_available_tools`` again — exactly what ``task_tool`` does when it
|
||||||
|
builds a subagent's toolset). With the fix, turn 2's bind_tools sees the
|
||||||
|
promoted tool. Without the fix, the re-entry wipes the registry and
|
||||||
|
the filter re-hides it.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
|
||||||
|
from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware
|
||||||
|
from deerflow.tools.tools import get_available_tools
|
||||||
|
|
||||||
|
_patch_mcp_pipeline(monkeypatch, [fake_mcp_search, fake_mcp_fetch])
|
||||||
|
_force_tool_search_enabled(monkeypatch)
|
||||||
|
|
||||||
|
# The trigger tool simulates what task_tool does internally: rebuild the
|
||||||
|
# toolset by calling get_available_tools while the registry is live.
|
||||||
|
@as_tool
|
||||||
|
def fake_subagent_trigger(prompt: str) -> str:
|
||||||
|
"""Pretend to spawn a subagent. Internally rebuilds the toolset."""
|
||||||
|
get_available_tools(subagent_enabled=False)
|
||||||
|
return f"spawned subagent for: {prompt}"
|
||||||
|
|
||||||
|
tools = get_available_tools() + [fake_subagent_trigger]
|
||||||
|
|
||||||
|
bound_per_turn: list[list[str]] = []
|
||||||
|
|
||||||
|
class _Model(FakeToolCallingModel):
|
||||||
|
def bind_tools(self, tools_arg, **kwargs): # type: ignore[override]
|
||||||
|
bound_per_turn.append([getattr(t, "name", repr(t)) for t in tools_arg])
|
||||||
|
return self
|
||||||
|
|
||||||
|
model = _Model(
|
||||||
|
responses=[
|
||||||
|
# Turn 1: do both in one batch — promote AND trigger the
|
||||||
|
# subagent-style rebuild. LangGraph executes them in order in the
|
||||||
|
# same agent step.
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "tool_search",
|
||||||
|
"args": {"query": "select:fake_mcp_search"},
|
||||||
|
"id": "call_search_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "fake_subagent_trigger",
|
||||||
|
"args": {"prompt": "go"},
|
||||||
|
"id": "call_trigger_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
},
|
||||||
|
],
|
||||||
|
),
|
||||||
|
# Turn 2: try to invoke the promoted tool. The model gets this
|
||||||
|
# turn only if turn 1's bind_tools recorded what the filter sent.
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "fake_mcp_search",
|
||||||
|
"args": {"query": "hello"},
|
||||||
|
"id": "call_mcp_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content="all done"),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
graph = create_agent(
|
||||||
|
model=model,
|
||||||
|
tools=tools,
|
||||||
|
middleware=[DeferredToolFilterMiddleware()],
|
||||||
|
system_prompt="bug-2884-subagent-repro",
|
||||||
|
)
|
||||||
|
graph.invoke({"messages": [HumanMessage(content="use the search tool")]})
|
||||||
|
|
||||||
|
# Turn 1 sanity: deferred tool not visible yet
|
||||||
|
assert "fake_mcp_search" not in set(bound_per_turn[0]), bound_per_turn[0]
|
||||||
|
|
||||||
|
# The smoking-gun assertion: turn 2 sees the promoted tool DESPITE the
|
||||||
|
# re-entrant get_available_tools call that happened in turn 1's tool batch.
|
||||||
|
assert len(bound_per_turn) >= 2, f"Expected ≥2 turns, got {len(bound_per_turn)}"
|
||||||
|
turn2 = set(bound_per_turn[1])
|
||||||
|
assert "fake_mcp_search" in turn2, f"REGRESSION (#2884): a re-entrant get_available_tools call (e.g. task_tool spawning a subagent) wiped the parent agent's promotion. Turn 2 bound tools: {turn2!r}"
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
import threading
|
import threading
|
||||||
import time
|
import time
|
||||||
from unittest.mock import MagicMock, patch
|
from unittest.mock import MagicMock, call, patch
|
||||||
|
|
||||||
from deerflow.agents.memory.queue import ConversationContext, MemoryUpdateQueue
|
from deerflow.agents.memory.queue import ConversationContext, MemoryUpdateQueue
|
||||||
from deerflow.config.memory_config import MemoryConfig
|
from deerflow.config.memory_config import MemoryConfig
|
||||||
@@ -164,3 +164,85 @@ def test_flush_nowait_is_non_blocking() -> None:
|
|||||||
assert elapsed < 0.1
|
assert elapsed < 0.1
|
||||||
assert finished.is_set() is False
|
assert finished.is_set() is False
|
||||||
assert finished.wait(1.0) is True
|
assert finished.wait(1.0) is True
|
||||||
|
|
||||||
|
|
||||||
|
def test_queue_keeps_updates_for_different_agents_in_same_thread() -> None:
|
||||||
|
queue = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("deerflow.agents.memory.queue.get_memory_config", return_value=_memory_config(enabled=True)),
|
||||||
|
patch.object(queue, "_reset_timer"),
|
||||||
|
):
|
||||||
|
queue.add(thread_id="thread-1", messages=["agent-a"], agent_name="agent-a")
|
||||||
|
queue.add(thread_id="thread-1", messages=["agent-b"], agent_name="agent-b")
|
||||||
|
|
||||||
|
assert queue.pending_count == 2
|
||||||
|
assert [context.agent_name for context in queue._queue] == ["agent-a", "agent-b"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_queue_still_coalesces_updates_for_same_agent_in_same_thread() -> None:
|
||||||
|
queue = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("deerflow.agents.memory.queue.get_memory_config", return_value=_memory_config(enabled=True)),
|
||||||
|
patch.object(queue, "_reset_timer"),
|
||||||
|
):
|
||||||
|
queue.add(
|
||||||
|
thread_id="thread-1",
|
||||||
|
messages=["first"],
|
||||||
|
agent_name="agent-a",
|
||||||
|
correction_detected=True,
|
||||||
|
)
|
||||||
|
queue.add(
|
||||||
|
thread_id="thread-1",
|
||||||
|
messages=["second"],
|
||||||
|
agent_name="agent-a",
|
||||||
|
correction_detected=False,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert queue.pending_count == 1
|
||||||
|
assert queue._queue[0].agent_name == "agent-a"
|
||||||
|
assert queue._queue[0].messages == ["second"]
|
||||||
|
assert queue._queue[0].correction_detected is True
|
||||||
|
|
||||||
|
|
||||||
|
def test_process_queue_updates_different_agents_in_same_thread_separately() -> None:
|
||||||
|
queue = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("deerflow.agents.memory.queue.get_memory_config", return_value=_memory_config(enabled=True)),
|
||||||
|
patch.object(queue, "_reset_timer"),
|
||||||
|
):
|
||||||
|
queue.add(thread_id="thread-1", messages=["agent-a"], agent_name="agent-a")
|
||||||
|
queue.add(thread_id="thread-1", messages=["agent-b"], agent_name="agent-b")
|
||||||
|
|
||||||
|
mock_updater = MagicMock()
|
||||||
|
mock_updater.update_memory.return_value = True
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("deerflow.agents.memory.updater.MemoryUpdater", return_value=mock_updater),
|
||||||
|
patch("deerflow.agents.memory.queue.time.sleep"),
|
||||||
|
):
|
||||||
|
queue.flush()
|
||||||
|
|
||||||
|
assert mock_updater.update_memory.call_count == 2
|
||||||
|
mock_updater.update_memory.assert_has_calls(
|
||||||
|
[
|
||||||
|
call(
|
||||||
|
messages=["agent-a"],
|
||||||
|
thread_id="thread-1",
|
||||||
|
agent_name="agent-a",
|
||||||
|
correction_detected=False,
|
||||||
|
reinforcement_detected=False,
|
||||||
|
user_id=None,
|
||||||
|
),
|
||||||
|
call(
|
||||||
|
messages=["agent-b"],
|
||||||
|
thread_id="thread-1",
|
||||||
|
agent_name="agent-b",
|
||||||
|
correction_detected=False,
|
||||||
|
reinforcement_detected=False,
|
||||||
|
user_id=None,
|
||||||
|
),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|||||||
@@ -3,6 +3,7 @@
|
|||||||
from unittest.mock import MagicMock, patch
|
from unittest.mock import MagicMock, patch
|
||||||
|
|
||||||
from deerflow.agents.memory.queue import ConversationContext, MemoryUpdateQueue
|
from deerflow.agents.memory.queue import ConversationContext, MemoryUpdateQueue
|
||||||
|
from deerflow.config.memory_config import MemoryConfig
|
||||||
|
|
||||||
|
|
||||||
def test_conversation_context_has_user_id():
|
def test_conversation_context_has_user_id():
|
||||||
@@ -17,7 +18,7 @@ def test_conversation_context_user_id_default_none():
|
|||||||
|
|
||||||
def test_queue_add_stores_user_id():
|
def test_queue_add_stores_user_id():
|
||||||
q = MemoryUpdateQueue()
|
q = MemoryUpdateQueue()
|
||||||
with patch.object(q, "_reset_timer"):
|
with patch("deerflow.agents.memory.queue.get_memory_config", return_value=MemoryConfig(enabled=True)), patch.object(q, "_reset_timer"):
|
||||||
q.add(thread_id="t1", messages=["msg"], user_id="alice")
|
q.add(thread_id="t1", messages=["msg"], user_id="alice")
|
||||||
assert len(q._queue) == 1
|
assert len(q._queue) == 1
|
||||||
assert q._queue[0].user_id == "alice"
|
assert q._queue[0].user_id == "alice"
|
||||||
@@ -26,7 +27,7 @@ def test_queue_add_stores_user_id():
|
|||||||
|
|
||||||
def test_queue_process_passes_user_id_to_updater():
|
def test_queue_process_passes_user_id_to_updater():
|
||||||
q = MemoryUpdateQueue()
|
q = MemoryUpdateQueue()
|
||||||
with patch.object(q, "_reset_timer"):
|
with patch("deerflow.agents.memory.queue.get_memory_config", return_value=MemoryConfig(enabled=True)), patch.object(q, "_reset_timer"):
|
||||||
q.add(thread_id="t1", messages=["msg"], user_id="alice")
|
q.add(thread_id="t1", messages=["msg"], user_id="alice")
|
||||||
|
|
||||||
mock_updater = MagicMock()
|
mock_updater = MagicMock()
|
||||||
@@ -37,3 +38,42 @@ def test_queue_process_passes_user_id_to_updater():
|
|||||||
mock_updater.update_memory.assert_called_once()
|
mock_updater.update_memory.assert_called_once()
|
||||||
call_kwargs = mock_updater.update_memory.call_args.kwargs
|
call_kwargs = mock_updater.update_memory.call_args.kwargs
|
||||||
assert call_kwargs["user_id"] == "alice"
|
assert call_kwargs["user_id"] == "alice"
|
||||||
|
|
||||||
|
|
||||||
|
def test_queue_keeps_updates_for_different_users_in_same_thread_and_agent():
|
||||||
|
q = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with patch("deerflow.agents.memory.queue.get_memory_config", return_value=MemoryConfig(enabled=True)), patch.object(q, "_reset_timer"):
|
||||||
|
q.add(thread_id="main", messages=["alice update"], agent_name="researcher", user_id="alice")
|
||||||
|
q.add(thread_id="main", messages=["bob update"], agent_name="researcher", user_id="bob")
|
||||||
|
|
||||||
|
assert q.pending_count == 2
|
||||||
|
assert [context.user_id for context in q._queue] == ["alice", "bob"]
|
||||||
|
assert [context.messages for context in q._queue] == [["alice update"], ["bob update"]]
|
||||||
|
|
||||||
|
|
||||||
|
def test_queue_still_coalesces_updates_for_same_user_thread_and_agent():
|
||||||
|
q = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with patch("deerflow.agents.memory.queue.get_memory_config", return_value=MemoryConfig(enabled=True)), patch.object(q, "_reset_timer"):
|
||||||
|
q.add(thread_id="main", messages=["first"], agent_name="researcher", user_id="alice")
|
||||||
|
q.add(thread_id="main", messages=["second"], agent_name="researcher", user_id="alice")
|
||||||
|
|
||||||
|
assert q.pending_count == 1
|
||||||
|
assert q._queue[0].messages == ["second"]
|
||||||
|
assert q._queue[0].user_id == "alice"
|
||||||
|
assert q._queue[0].agent_name == "researcher"
|
||||||
|
|
||||||
|
|
||||||
|
def test_add_nowait_keeps_different_users_separate():
|
||||||
|
q = MemoryUpdateQueue()
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("deerflow.agents.memory.queue.get_memory_config", return_value=MemoryConfig(enabled=True)),
|
||||||
|
patch.object(q, "_schedule_timer"),
|
||||||
|
):
|
||||||
|
q.add_nowait(thread_id="main", messages=["alice update"], agent_name="researcher", user_id="alice")
|
||||||
|
q.add_nowait(thread_id="main", messages=["bob update"], agent_name="researcher", user_id="bob")
|
||||||
|
|
||||||
|
assert q.pending_count == 2
|
||||||
|
assert [context.user_id for context in q._queue] == ["alice", "bob"]
|
||||||
|
|||||||
@@ -268,6 +268,39 @@ class TestEdgeCases:
|
|||||||
class TestDbRunEventStore:
|
class TestDbRunEventStore:
|
||||||
"""Tests for DbRunEventStore with temp SQLite."""
|
"""Tests for DbRunEventStore with temp SQLite."""
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_postgres_max_seq_uses_advisory_lock_without_for_update(self):
|
||||||
|
from sqlalchemy.dialects import postgresql
|
||||||
|
|
||||||
|
from deerflow.runtime.events.store.db import DbRunEventStore
|
||||||
|
|
||||||
|
class FakeSession:
|
||||||
|
def __init__(self):
|
||||||
|
self.dialect = postgresql.dialect()
|
||||||
|
self.execute_calls = []
|
||||||
|
self.scalar_stmt = None
|
||||||
|
|
||||||
|
def get_bind(self):
|
||||||
|
return self
|
||||||
|
|
||||||
|
async def execute(self, stmt, params=None):
|
||||||
|
self.execute_calls.append((stmt, params))
|
||||||
|
|
||||||
|
async def scalar(self, stmt):
|
||||||
|
self.scalar_stmt = stmt
|
||||||
|
return 41
|
||||||
|
|
||||||
|
session = FakeSession()
|
||||||
|
|
||||||
|
max_seq = await DbRunEventStore._max_seq_for_thread(session, "thread-1")
|
||||||
|
|
||||||
|
assert max_seq == 41
|
||||||
|
assert session.execute_calls
|
||||||
|
assert session.execute_calls[0][1] == {"thread_id": "thread-1"}
|
||||||
|
assert "pg_advisory_xact_lock" in str(session.execute_calls[0][0])
|
||||||
|
compiled = str(session.scalar_stmt.compile(dialect=postgresql.dialect()))
|
||||||
|
assert "FOR UPDATE" not in compiled
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_basic_crud(self, tmp_path):
|
async def test_basic_crud(self, tmp_path):
|
||||||
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine
|
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine
|
||||||
|
|||||||
@@ -3,7 +3,10 @@
|
|||||||
Uses a temp SQLite DB to test ORM-backed CRUD operations.
|
Uses a temp SQLite DB to test ORM-backed CRUD operations.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
from sqlalchemy.dialects import postgresql
|
||||||
|
|
||||||
from deerflow.persistence.run import RunRepository
|
from deerflow.persistence.run import RunRepository
|
||||||
|
|
||||||
@@ -278,3 +281,48 @@ class TestRunRepository:
|
|||||||
assert row4["model_name"] is None
|
assert row4["model_name"] is None
|
||||||
|
|
||||||
await _cleanup()
|
await _cleanup()
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_aggregate_tokens_by_thread_reuses_shared_model_name_expression(self):
|
||||||
|
captured = []
|
||||||
|
|
||||||
|
class FakeResult:
|
||||||
|
def all(self):
|
||||||
|
return []
|
||||||
|
|
||||||
|
class FakeSession:
|
||||||
|
async def execute(self, stmt):
|
||||||
|
captured.append(stmt)
|
||||||
|
return FakeResult()
|
||||||
|
|
||||||
|
class FakeSessionContext:
|
||||||
|
async def __aenter__(self):
|
||||||
|
return FakeSession()
|
||||||
|
|
||||||
|
async def __aexit__(self, exc_type, exc, tb):
|
||||||
|
return None
|
||||||
|
|
||||||
|
repo = RunRepository(lambda: FakeSessionContext())
|
||||||
|
|
||||||
|
agg = await repo.aggregate_tokens_by_thread("t1")
|
||||||
|
assert agg == {
|
||||||
|
"total_tokens": 0,
|
||||||
|
"total_input_tokens": 0,
|
||||||
|
"total_output_tokens": 0,
|
||||||
|
"total_runs": 0,
|
||||||
|
"by_model": {},
|
||||||
|
"by_caller": {"lead_agent": 0, "subagent": 0, "middleware": 0},
|
||||||
|
}
|
||||||
|
assert len(captured) == 1
|
||||||
|
|
||||||
|
stmt = captured[0]
|
||||||
|
compiled_sql = str(stmt.compile(dialect=postgresql.dialect()))
|
||||||
|
select_sql, group_by_sql = compiled_sql.split(" GROUP BY ", maxsplit=1)
|
||||||
|
model_expr_pattern = r"coalesce\(runs\.model_name, %\(([^)]+)\)s\)"
|
||||||
|
|
||||||
|
select_match = re.search(model_expr_pattern + r" AS model", select_sql)
|
||||||
|
group_by_match = re.fullmatch(model_expr_pattern, group_by_sql.strip())
|
||||||
|
|
||||||
|
assert select_match is not None
|
||||||
|
assert group_by_match is not None
|
||||||
|
assert select_match.group(1) == group_by_match.group(1)
|
||||||
|
|||||||
@@ -0,0 +1,429 @@
|
|||||||
|
"""End-to-end verification for issue #2862 (and the regression of #2782).
|
||||||
|
|
||||||
|
Goal: prove — without trusting any single layer's claim — that an authenticated
|
||||||
|
user creating a custom agent through the real ``setup_agent`` tool, driven by a
|
||||||
|
real LangGraph ``create_agent`` graph, ends up with files under
|
||||||
|
``users/<auth_uid>/agents/<name>`` and **not** under ``users/default/agents/...``.
|
||||||
|
|
||||||
|
We intentionally exercise the full pipeline:
|
||||||
|
|
||||||
|
HTTP body shape (mimics LangGraph SDK wire format)
|
||||||
|
-> app.gateway.services.start_run config-assembly chain
|
||||||
|
-> deerflow.runtime.runs.worker._build_runtime_context
|
||||||
|
-> langchain.agents.create_agent graph
|
||||||
|
-> ToolNode dispatch
|
||||||
|
-> setup_agent tool
|
||||||
|
|
||||||
|
The only thing we mock is the LLM (FakeMessagesListChatModel) — every layer
|
||||||
|
that handles ``user_id`` is the real production code path. If the
|
||||||
|
``user_id`` propagation is broken anywhere in this chain, these tests will
|
||||||
|
fail.
|
||||||
|
|
||||||
|
These tests intentionally ``no_auto_user`` so that the ``contextvar``
|
||||||
|
fallback would put files into ``default/`` if propagation breaks.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from types import SimpleNamespace
|
||||||
|
from unittest.mock import patch
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from _agent_e2e_helpers import FakeToolCallingModel
|
||||||
|
from langchain_core.messages import AIMessage, HumanMessage
|
||||||
|
|
||||||
|
from app.gateway.services import (
|
||||||
|
build_run_config,
|
||||||
|
inject_authenticated_user_context,
|
||||||
|
merge_run_context_overrides,
|
||||||
|
)
|
||||||
|
from deerflow.runtime.runs.worker import _build_runtime_context, _install_runtime_context
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers — real production code paths
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _make_request(user_id_str: str | None) -> SimpleNamespace:
|
||||||
|
"""Build a fake FastAPI Request that carries an authenticated user."""
|
||||||
|
if user_id_str is None:
|
||||||
|
user = None
|
||||||
|
else:
|
||||||
|
# User.id is UUID in production; honour that
|
||||||
|
user = SimpleNamespace(id=UUID(user_id_str), email="alice@local")
|
||||||
|
return SimpleNamespace(state=SimpleNamespace(user=user))
|
||||||
|
|
||||||
|
|
||||||
|
def _assemble_config(
|
||||||
|
*,
|
||||||
|
body_config: dict | None,
|
||||||
|
body_context: dict | None,
|
||||||
|
request_user_id: str | None,
|
||||||
|
thread_id: str = "thread-e2e",
|
||||||
|
assistant_id: str = "lead_agent",
|
||||||
|
) -> dict:
|
||||||
|
"""Replay the **exact** start_run config-assembly sequence."""
|
||||||
|
config = build_run_config(thread_id, body_config, None, assistant_id=assistant_id)
|
||||||
|
merge_run_context_overrides(config, body_context)
|
||||||
|
inject_authenticated_user_context(config, _make_request(request_user_id))
|
||||||
|
return config
|
||||||
|
|
||||||
|
|
||||||
|
def _make_paths_mock(tmp_path: Path):
|
||||||
|
"""Mirror the production paths.user_agent_dir signature."""
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
|
paths = MagicMock()
|
||||||
|
paths.base_dir = tmp_path
|
||||||
|
paths.agent_dir = lambda name: tmp_path / "agents" / name
|
||||||
|
paths.user_agent_dir = lambda user_id, name: tmp_path / "users" / user_id / "agents" / name
|
||||||
|
return paths
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# L1-L3: HTTP wire format → start_run → worker._build_runtime_context
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestConfigAssembly:
|
||||||
|
"""Covers L1-L3: validate that user_id reaches runtime_ctx for every wire shape."""
|
||||||
|
|
||||||
|
def test_typical_wire_format_user_id_in_runtime_ctx(self):
|
||||||
|
"""Real frontend: body.config={recursion_limit}, body.context={agent_name,...}."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 1000},
|
||||||
|
body_context={"agent_name": "myagent", "is_bootstrap": True, "mode": "flash"},
|
||||||
|
request_user_id="11111111-2222-3333-4444-555555555555",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == "11111111-2222-3333-4444-555555555555"
|
||||||
|
assert runtime_ctx["agent_name"] == "myagent"
|
||||||
|
|
||||||
|
def test_body_context_none_still_injects_user_id(self):
|
||||||
|
"""If frontend omits body.context entirely, inject must still create it."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 1000},
|
||||||
|
body_context=None,
|
||||||
|
request_user_id="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
|
||||||
|
|
||||||
|
def test_body_context_empty_dict_still_injects_user_id(self):
|
||||||
|
"""body.context={} (falsy) path: inject must still produce user_id."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 1000},
|
||||||
|
body_context={},
|
||||||
|
request_user_id="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
|
||||||
|
|
||||||
|
def test_body_config_already_contains_context_field(self):
|
||||||
|
"""body.config={'context': {...}} (LG 0.6 alt wire): inject still wins."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"context": {"agent_name": "myagent"}, "recursion_limit": 1000},
|
||||||
|
body_context=None,
|
||||||
|
request_user_id="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
|
||||||
|
|
||||||
|
def test_client_supplied_user_id_is_overridden(self):
|
||||||
|
"""Spoofed client user_id must be overwritten by inject (auth-trusted source)."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 1000},
|
||||||
|
body_context={"agent_name": "myagent", "user_id": "spoofed"},
|
||||||
|
request_user_id="11111111-2222-3333-4444-555555555555",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == "11111111-2222-3333-4444-555555555555"
|
||||||
|
|
||||||
|
def test_unauthenticated_request_does_not_inject(self):
|
||||||
|
"""If request.state.user is missing (impossible under fail-closed auth, but
|
||||||
|
verify defensively), inject must not write user_id and runtime_ctx must
|
||||||
|
therefore lack it — forcing the tool fallback path to reveal itself."""
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 1000},
|
||||||
|
body_context={"agent_name": "myagent"},
|
||||||
|
request_user_id=None,
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e", "run-1", config.get("context"), None)
|
||||||
|
assert "user_id" not in runtime_ctx
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# L4-L7: Real LangGraph create_agent driving the real setup_agent tool
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _build_real_bootstrap_graph(authenticated_user_id: str):
|
||||||
|
"""Construct a real LangGraph using create_agent + the real setup_agent tool.
|
||||||
|
|
||||||
|
The LLM is faked (FakeMessagesListChatModel) so we don't need an API key.
|
||||||
|
Everything else — ToolNode dispatch, runtime injection, middleware — is
|
||||||
|
the real production code path.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
|
||||||
|
from deerflow.tools.builtins.setup_agent_tool import setup_agent
|
||||||
|
|
||||||
|
# First model turn: emit a tool_call for setup_agent
|
||||||
|
# Second model turn (after tool result): final answer (terminates the loop)
|
||||||
|
fake_model = FakeToolCallingModel(
|
||||||
|
responses=[
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "setup_agent",
|
||||||
|
"args": {
|
||||||
|
"soul": "# My E2E Agent\n\nA SOUL written by the model.",
|
||||||
|
"description": "End-to-end test agent",
|
||||||
|
},
|
||||||
|
"id": "call_setup_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content=f"Done. Agent created for user {authenticated_user_id}."),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
graph = create_agent(
|
||||||
|
model=fake_model,
|
||||||
|
tools=[setup_agent],
|
||||||
|
system_prompt="You are a bootstrap agent. Call setup_agent immediately.",
|
||||||
|
)
|
||||||
|
return graph
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_real_graph_real_setup_agent_writes_to_authenticated_user_dir(tmp_path: Path):
|
||||||
|
"""The smoking-gun test for issue #2862.
|
||||||
|
|
||||||
|
Under no_auto_user (contextvar = empty), if user_id propagation through
|
||||||
|
runtime.context is broken, setup_agent will fall back to DEFAULT_USER_ID
|
||||||
|
and write to users/default/agents/... The assertion that this directory
|
||||||
|
DOES NOT exist is what makes this test load-bearing.
|
||||||
|
"""
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
auth_uid = "abcdef01-2345-6789-abcd-ef0123456789"
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 50},
|
||||||
|
body_context={"agent_name": "e2e-agent", "is_bootstrap": True},
|
||||||
|
request_user_id=auth_uid,
|
||||||
|
thread_id="thread-e2e-1",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Replay worker.run_agent's runtime construction. This is the key step:
|
||||||
|
# it is what makes ToolRuntime.context contain user_id when the tool
|
||||||
|
# actually fires.
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e-1", "run-1", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
graph = _build_real_bootstrap_graph(auth_uid)
|
||||||
|
|
||||||
|
# Patch get_paths only (the file-system rooting); everything else is real
|
||||||
|
with patch(
|
||||||
|
"deerflow.tools.builtins.setup_agent_tool.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
):
|
||||||
|
# Drive the real graph. This goes through real ToolNode + real Runtime merge.
|
||||||
|
final_state = await graph.ainvoke(
|
||||||
|
{"messages": [HumanMessage(content="Create an agent named e2e-agent")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
expected_dir = tmp_path / "users" / auth_uid / "agents" / "e2e-agent"
|
||||||
|
default_dir = tmp_path / "users" / "default" / "agents" / "e2e-agent"
|
||||||
|
|
||||||
|
# Load-bearing assertions:
|
||||||
|
assert expected_dir.exists(), f"Agent directory not found at the authenticated user's path. Expected: {expected_dir}. tmp_path tree: {[str(p) for p in tmp_path.rglob('*')]}"
|
||||||
|
assert (expected_dir / "SOUL.md").read_text() == "# My E2E Agent\n\nA SOUL written by the model."
|
||||||
|
assert (expected_dir / "config.yaml").exists()
|
||||||
|
assert not default_dir.exists(), "REGRESSION: agent landed under users/default/. user_id propagation broke somewhere between HTTP layer and ToolRuntime.context."
|
||||||
|
|
||||||
|
# And final state should reflect tool success
|
||||||
|
last = final_state["messages"][-1]
|
||||||
|
assert "Done" in (last.content if isinstance(last.content, str) else str(last.content))
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_inject_failure_falls_back_to_default_proving_test_is_load_bearing(tmp_path: Path):
|
||||||
|
"""Negative control: if inject does NOT happen (no user in request), and
|
||||||
|
contextvar is empty (no_auto_user), setup_agent must land in default/.
|
||||||
|
|
||||||
|
This proves the positive test is actually load-bearing — i.e. it would
|
||||||
|
have failed before PR #2784, not passed accidentally.
|
||||||
|
"""
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 50},
|
||||||
|
body_context={"agent_name": "fallback-agent", "is_bootstrap": True},
|
||||||
|
request_user_id=None, # no auth — inject is a no-op
|
||||||
|
thread_id="thread-e2e-2",
|
||||||
|
)
|
||||||
|
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e-2", "run-2", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
graph = _build_real_bootstrap_graph("does-not-matter")
|
||||||
|
|
||||||
|
with patch(
|
||||||
|
"deerflow.tools.builtins.setup_agent_tool.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
):
|
||||||
|
await graph.ainvoke(
|
||||||
|
{"messages": [HumanMessage(content="Create fallback-agent")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
default_dir = tmp_path / "users" / "default" / "agents" / "fallback-agent"
|
||||||
|
assert default_dir.exists(), "Negative control failed: even without inject + contextvar, agent did not land in default/. The test infrastructure may not be reproducing the bug condition."
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# L5: Sub-graph runtime propagation (the task tool case)
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_subgraph_invocation_preserves_user_id_in_runtime(tmp_path: Path):
|
||||||
|
"""When a parent graph invokes a child graph (the pattern used by
|
||||||
|
subagents), parent_runtime.merge() must keep user_id intact.
|
||||||
|
|
||||||
|
We construct a child graph that contains setup_agent and call it from
|
||||||
|
a parent graph's tool. If LangGraph re-creates the Runtime and drops
|
||||||
|
user_id at the sub-graph boundary, this fails.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
from deerflow.tools.builtins.setup_agent_tool import setup_agent
|
||||||
|
|
||||||
|
auth_uid = "deadbeef-0000-1111-2222-333344445555"
|
||||||
|
|
||||||
|
# Inner graph: same as the bootstrap flow
|
||||||
|
inner_model = FakeToolCallingModel(
|
||||||
|
responses=[
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "setup_agent",
|
||||||
|
"args": {"soul": "# Inner", "description": "subgraph"},
|
||||||
|
"id": "call_inner_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content="inner done"),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
inner_graph = create_agent(
|
||||||
|
model=inner_model,
|
||||||
|
tools=[setup_agent],
|
||||||
|
system_prompt="inner",
|
||||||
|
)
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 50},
|
||||||
|
body_context={"agent_name": "subgraph-agent", "is_bootstrap": True},
|
||||||
|
request_user_id=auth_uid,
|
||||||
|
thread_id="thread-e2e-3",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e-3", "run-3", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
with patch(
|
||||||
|
"deerflow.tools.builtins.setup_agent_tool.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
):
|
||||||
|
# Direct sub-graph invoke (mimics what a subagent invocation looks like
|
||||||
|
# — distinct ainvoke call, but parent config carries the same runtime).
|
||||||
|
await inner_graph.ainvoke(
|
||||||
|
{"messages": [HumanMessage(content="Create subgraph-agent")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
expected_dir = tmp_path / "users" / auth_uid / "agents" / "subgraph-agent"
|
||||||
|
default_dir = tmp_path / "users" / "default" / "agents" / "subgraph-agent"
|
||||||
|
assert expected_dir.exists()
|
||||||
|
assert not default_dir.exists()
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# L6: Sync tool path through ContextThreadPoolExecutor
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_sync_tool_dispatch_through_thread_pool_uses_runtime_context(tmp_path: Path):
|
||||||
|
"""setup_agent is a sync function. When dispatched through ToolNode's
|
||||||
|
ContextThreadPoolExecutor, runtime.context must still carry user_id —
|
||||||
|
not via thread-local copy_context (which only carries contextvars), but
|
||||||
|
because it was passed in as the ToolRuntime constructor argument.
|
||||||
|
"""
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
from deerflow.tools.builtins.setup_agent_tool import setup_agent
|
||||||
|
|
||||||
|
auth_uid = "11112222-3333-4444-5555-666677778888"
|
||||||
|
|
||||||
|
fake_model = FakeToolCallingModel(
|
||||||
|
responses=[
|
||||||
|
AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{
|
||||||
|
"name": "setup_agent",
|
||||||
|
"args": {"soul": "# Sync", "description": "sync path"},
|
||||||
|
"id": "call_sync_1",
|
||||||
|
"type": "tool_call",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
),
|
||||||
|
AIMessage(content="sync done"),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
graph = create_agent(model=fake_model, tools=[setup_agent], system_prompt="sync")
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_config={"recursion_limit": 50},
|
||||||
|
body_context={"agent_name": "sync-agent", "is_bootstrap": True},
|
||||||
|
request_user_id=auth_uid,
|
||||||
|
thread_id="thread-e2e-4",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-e2e-4", "run-4", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
with patch(
|
||||||
|
"deerflow.tools.builtins.setup_agent_tool.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
):
|
||||||
|
# Use SYNC invoke to hit the ContextThreadPoolExecutor path
|
||||||
|
graph.invoke(
|
||||||
|
{"messages": [HumanMessage(content="Create sync-agent")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
expected_dir = tmp_path / "users" / auth_uid / "agents" / "sync-agent"
|
||||||
|
default_dir = tmp_path / "users" / "default" / "agents" / "sync-agent"
|
||||||
|
assert expected_dir.exists()
|
||||||
|
assert not default_dir.exists()
|
||||||
@@ -0,0 +1,326 @@
|
|||||||
|
"""Real HTTP end-to-end verification for issue #2862's setup_agent path.
|
||||||
|
|
||||||
|
This test drives the **entire** FastAPI gateway through ``starlette.testclient.TestClient``:
|
||||||
|
|
||||||
|
starlette.testclient.TestClient (real ASGI stack)
|
||||||
|
-> AuthMiddleware (real cookie parsing, real JWT decode)
|
||||||
|
-> /api/v1/auth/register endpoint (real password hash + sqlite write)
|
||||||
|
-> /api/threads/{id}/runs/stream endpoint (real start_run config-assembly)
|
||||||
|
-> background asyncio.create_task(run_agent) (real worker, real Runtime)
|
||||||
|
-> langchain.agents.create_agent graph (real, with fake LLM)
|
||||||
|
-> ToolNode dispatch (real)
|
||||||
|
-> setup_agent tool (real file I/O)
|
||||||
|
|
||||||
|
The only mock is the LLM (no API key needed). Every layer that participates
|
||||||
|
in ``user_id`` propagation — auth, ContextVar, ``inject_authenticated_user_context``,
|
||||||
|
``worker._build_runtime_context``, ``Runtime.merge`` — is the real production
|
||||||
|
code path. If the chain is broken at any layer, this test fails.
|
||||||
|
|
||||||
|
This is what "真实验证" looks like for a server that lives behind authentication:
|
||||||
|
register a user, log in (cookie), POST to /runs/stream, wait for the run to
|
||||||
|
finish, then read the filesystem.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
from unittest.mock import patch
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from _agent_e2e_helpers import FakeToolCallingModel, build_single_tool_call_model
|
||||||
|
|
||||||
|
|
||||||
|
def _build_fake_create_chat_model(agent_name: str):
|
||||||
|
"""Return a callable matching the real ``create_chat_model`` signature.
|
||||||
|
|
||||||
|
Whenever the lead agent constructs a chat model during the bootstrap flow,
|
||||||
|
we hand it a fake that emits a single setup_agent tool_call on its first
|
||||||
|
turn, then a benign final answer on its second turn.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def fake_create_chat_model(*args: Any, **kwargs: Any) -> FakeToolCallingModel:
|
||||||
|
return build_single_tool_call_model(
|
||||||
|
tool_name="setup_agent",
|
||||||
|
tool_args={
|
||||||
|
"soul": f"# Real HTTP E2E SOUL for {agent_name}",
|
||||||
|
"description": "real-http-e2e agent",
|
||||||
|
},
|
||||||
|
tool_call_id="call_real_http_1",
|
||||||
|
final_text=f"Agent {agent_name} created via real HTTP e2e.",
|
||||||
|
)
|
||||||
|
|
||||||
|
return fake_create_chat_model
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def isolated_deer_flow_home(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""Stand up an isolated DeerFlow data root + config under tmp_path.
|
||||||
|
|
||||||
|
- Sets ``DEER_FLOW_HOME`` so paths land under tmp_path, not the real
|
||||||
|
``.deer-flow`` directory.
|
||||||
|
- Stages a copy of the project's ``config.yaml`` (or ``config.example.yaml``
|
||||||
|
on a fresh CI checkout where ``config.yaml`` is gitignored) and pins
|
||||||
|
``DEER_FLOW_CONFIG_PATH`` to it, so lifespan boot doesn't depend on the
|
||||||
|
developer's local config layout.
|
||||||
|
- Sets a placeholder OPENAI_API_KEY because the config has
|
||||||
|
``$OPENAI_API_KEY`` that gets resolved at parse time; the LLM itself is
|
||||||
|
mocked, so any non-empty value works.
|
||||||
|
"""
|
||||||
|
home = tmp_path / "deer-flow-home"
|
||||||
|
home.mkdir()
|
||||||
|
monkeypatch.setenv("DEER_FLOW_HOME", str(home))
|
||||||
|
monkeypatch.setenv("OPENAI_API_KEY", "sk-fake-key-not-used-because-llm-is-mocked")
|
||||||
|
monkeypatch.setenv("OPENAI_API_BASE", "https://example.invalid")
|
||||||
|
|
||||||
|
# Hermetic config: do not depend on whether the dev machine has a real
|
||||||
|
# ``config.yaml`` at the repo root. CI's ``actions/checkout`` only ships
|
||||||
|
# ``config.example.yaml`` (and its ``models:`` list is commented out, so
|
||||||
|
# AppConfig validation would reject it). Write a minimal, self-sufficient
|
||||||
|
# config to tmp_path and pin ``DEER_FLOW_CONFIG_PATH`` to it.
|
||||||
|
staged_config = tmp_path / "config.yaml"
|
||||||
|
staged_config.write_text(_MINIMAL_CONFIG_YAML, encoding="utf-8")
|
||||||
|
monkeypatch.setenv("DEER_FLOW_CONFIG_PATH", str(staged_config))
|
||||||
|
|
||||||
|
return home
|
||||||
|
|
||||||
|
|
||||||
|
# Minimal config that satisfies AppConfig + LeadAgent's _resolve_model_name.
|
||||||
|
# The model `use` path must resolve to a real class for config parsing to
|
||||||
|
# succeed; the test patches ``create_chat_model`` on the lead agent module,
|
||||||
|
# so the model is never actually instantiated. SandboxConfig.use is required
|
||||||
|
# at schema level; LocalSandboxProvider is the only sandbox that runs without
|
||||||
|
# Docker.
|
||||||
|
_MINIMAL_CONFIG_YAML = """\
|
||||||
|
log_level: info
|
||||||
|
models:
|
||||||
|
- name: fake-test-model
|
||||||
|
display_name: Fake Test Model
|
||||||
|
use: langchain_openai:ChatOpenAI
|
||||||
|
model: gpt-4o-mini
|
||||||
|
api_key: $OPENAI_API_KEY
|
||||||
|
base_url: $OPENAI_API_BASE
|
||||||
|
sandbox:
|
||||||
|
use: deerflow.sandbox.local:LocalSandboxProvider
|
||||||
|
agents_api:
|
||||||
|
enabled: true
|
||||||
|
database:
|
||||||
|
backend: sqlite
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
def _reset_process_singletons(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
"""Reset every process-wide cache that would survive across tests.
|
||||||
|
|
||||||
|
This fixture stands up a full FastAPI app + sqlite DB + LangGraph runtime
|
||||||
|
inside ``tmp_path``. To get true per-test isolation we have to invalidate
|
||||||
|
a handful of module-level caches that production normally never resets,
|
||||||
|
so they pick up our test-only ``DEER_FLOW_HOME`` and sqlite path:
|
||||||
|
|
||||||
|
- ``deerflow.config.app_config`` caches the parsed ``config.yaml``.
|
||||||
|
- ``deerflow.config.paths`` caches the ``Paths`` singleton derived from
|
||||||
|
``DEER_FLOW_HOME`` at first access.
|
||||||
|
- ``deerflow.persistence.engine`` caches the SQLAlchemy engine and
|
||||||
|
session factory after the first call to ``init_engine_from_config``.
|
||||||
|
|
||||||
|
``raising=False`` keeps the fixture resilient if upstream renames or
|
||||||
|
drops one of these attributes — the test will simply skip that reset
|
||||||
|
instead of failing with a confusing AttributeError, and the next test
|
||||||
|
to call ``get_app_config()``/``get_paths()`` will surface the real
|
||||||
|
incompatibility loudly.
|
||||||
|
"""
|
||||||
|
from deerflow.config import app_config as app_config_module
|
||||||
|
from deerflow.config import paths as paths_module
|
||||||
|
from deerflow.persistence import engine as engine_module
|
||||||
|
|
||||||
|
for module, attr in (
|
||||||
|
(app_config_module, "_app_config"),
|
||||||
|
(app_config_module, "_app_config_path"),
|
||||||
|
(app_config_module, "_app_config_mtime"),
|
||||||
|
(paths_module, "_paths_singleton"),
|
||||||
|
(engine_module, "_engine"),
|
||||||
|
(engine_module, "_session_factory"),
|
||||||
|
):
|
||||||
|
monkeypatch.setattr(module, attr, None, raising=False)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def isolated_app(isolated_deer_flow_home: Path, monkeypatch: pytest.MonkeyPatch):
|
||||||
|
"""Build a fresh FastAPI app inside a clean DEER_FLOW_HOME.
|
||||||
|
|
||||||
|
Each test gets its own sqlite DB and checkpoint store under ``tmp_path``,
|
||||||
|
with no cross-test contamination.
|
||||||
|
"""
|
||||||
|
_reset_process_singletons(monkeypatch)
|
||||||
|
|
||||||
|
# Re-resolve the config from the test-only DEER_FLOW_HOME and pin its
|
||||||
|
# sqlite path into tmp_path so the lifespan-time engine init lands there.
|
||||||
|
from deerflow.config import app_config as app_config_module
|
||||||
|
|
||||||
|
cfg = app_config_module.get_app_config()
|
||||||
|
cfg.database.sqlite_dir = str(isolated_deer_flow_home / "db")
|
||||||
|
|
||||||
|
from app.gateway.app import create_app
|
||||||
|
|
||||||
|
return create_app()
|
||||||
|
|
||||||
|
|
||||||
|
def _drain_stream(response, *, timeout: float = 30.0, max_bytes: int = 4 * 1024 * 1024) -> str:
|
||||||
|
"""Consume an SSE response body until the run terminates and return the text.
|
||||||
|
|
||||||
|
Bounded to keep the test fail-fast:
|
||||||
|
- Stops as soon as an ``event: end`` SSE frame is observed (the gateway
|
||||||
|
sends this when the background run finishes — see ``services.format_sse``
|
||||||
|
and ``StreamBridge.publish_end``).
|
||||||
|
- Stops at ``timeout`` seconds wall-clock so a stuck run / runaway heartbeat
|
||||||
|
loop surfaces a real failure instead of hanging pytest.
|
||||||
|
- Stops at ``max_bytes`` so a runaway producer can't OOM the test process.
|
||||||
|
"""
|
||||||
|
import time as _time
|
||||||
|
|
||||||
|
deadline = _time.monotonic() + timeout
|
||||||
|
body = b""
|
||||||
|
for chunk in response.iter_bytes():
|
||||||
|
body += chunk
|
||||||
|
if b"event: end" in body:
|
||||||
|
break
|
||||||
|
if len(body) >= max_bytes:
|
||||||
|
break
|
||||||
|
if _time.monotonic() >= deadline:
|
||||||
|
break
|
||||||
|
return body.decode("utf-8", errors="replace")
|
||||||
|
|
||||||
|
|
||||||
|
def _wait_for_file(path: Path, *, timeout: float = 10.0) -> bool:
|
||||||
|
"""Block until *path* exists or *timeout* elapses.
|
||||||
|
|
||||||
|
The run completes inside ``asyncio.create_task`` after start_run returns,
|
||||||
|
so the test must wait for the background task to flush its writes.
|
||||||
|
"""
|
||||||
|
import time as _time
|
||||||
|
|
||||||
|
deadline = _time.monotonic() + timeout
|
||||||
|
while _time.monotonic() < deadline:
|
||||||
|
if path.exists():
|
||||||
|
return True
|
||||||
|
_time.sleep(0.05)
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
def test_real_http_create_agent_lands_in_authenticated_user_dir(
|
||||||
|
isolated_app: Any,
|
||||||
|
isolated_deer_flow_home: Path,
|
||||||
|
monkeypatch: pytest.MonkeyPatch,
|
||||||
|
):
|
||||||
|
"""The full real-server contract test.
|
||||||
|
|
||||||
|
1. Register a real user via POST /api/v1/auth/register (also auto-logs in)
|
||||||
|
2. POST to /api/threads/{tid}/runs/stream with the **exact** body shape the
|
||||||
|
frontend (LangGraph SDK) sends during the bootstrap flow.
|
||||||
|
3. Wait for the background run to finish.
|
||||||
|
4. Assert SOUL.md exists under users/<authenticated_uid>/agents/<name>/.
|
||||||
|
5. Assert NOTHING exists under users/default/agents/<name>/.
|
||||||
|
"""
|
||||||
|
# ``deerflow.agents.lead_agent.agent`` imports ``create_chat_model`` with
|
||||||
|
# ``from deerflow.models import create_chat_model`` at module load time,
|
||||||
|
# rebinding the symbol into its own namespace. So the only patch that
|
||||||
|
# intercepts the call is the bound name on ``lead_agent.agent`` — patching
|
||||||
|
# ``deerflow.models.create_chat_model`` would be too late.
|
||||||
|
agent_name = "real-http-agent"
|
||||||
|
|
||||||
|
from starlette.testclient import TestClient
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch(
|
||||||
|
"deerflow.agents.lead_agent.agent.create_chat_model",
|
||||||
|
new=_build_fake_create_chat_model(agent_name),
|
||||||
|
),
|
||||||
|
TestClient(isolated_app) as client,
|
||||||
|
):
|
||||||
|
# --- 1. Register & auto-login ---
|
||||||
|
register = client.post(
|
||||||
|
"/api/v1/auth/register",
|
||||||
|
json={"email": "e2e-user@example.com", "password": "very-strong-password-123"},
|
||||||
|
)
|
||||||
|
assert register.status_code == 201, register.text
|
||||||
|
registered = register.json()
|
||||||
|
auth_uid = registered["id"]
|
||||||
|
# The endpoint sets both access_token (auth) and csrf_token (CSRF Double
|
||||||
|
# Submit Cookie) cookies; the TestClient cookie jar propagates them.
|
||||||
|
assert client.cookies.get("access_token"), "register endpoint must set session cookie"
|
||||||
|
csrf_token = client.cookies.get("csrf_token")
|
||||||
|
assert csrf_token, "register endpoint must set csrf_token cookie"
|
||||||
|
|
||||||
|
# --- 2. Create a thread (require_existing=True on /runs/stream means
|
||||||
|
# we must call POST /api/threads first; the React frontend does the
|
||||||
|
# same via the LangGraph SDK's threads.create) ---
|
||||||
|
import uuid as _uuid
|
||||||
|
|
||||||
|
thread_id = str(_uuid.uuid4())
|
||||||
|
created = client.post(
|
||||||
|
"/api/threads",
|
||||||
|
json={"thread_id": thread_id, "metadata": {}},
|
||||||
|
headers={"X-CSRF-Token": csrf_token},
|
||||||
|
)
|
||||||
|
assert created.status_code == 200, created.text
|
||||||
|
|
||||||
|
# --- 3. POST /runs/stream with the bootstrap wire format ---
|
||||||
|
# This is the EXACT shape the React frontend sends after PR #2784:
|
||||||
|
# thread.submit(input, {config, context}) ->
|
||||||
|
# POST /api/threads/{id}/runs/stream body =
|
||||||
|
# {assistant_id, input, config, context}
|
||||||
|
body = {
|
||||||
|
"assistant_id": "lead_agent",
|
||||||
|
"input": {
|
||||||
|
"messages": [
|
||||||
|
{
|
||||||
|
"role": "user",
|
||||||
|
"content": (f"The new custom agent name is {agent_name}. Help me design its SOUL.md before saving it."),
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"config": {"recursion_limit": 50},
|
||||||
|
"context": {
|
||||||
|
"agent_name": agent_name,
|
||||||
|
"is_bootstrap": True,
|
||||||
|
"mode": "flash",
|
||||||
|
"thinking_enabled": False,
|
||||||
|
"is_plan_mode": False,
|
||||||
|
"subagent_enabled": False,
|
||||||
|
},
|
||||||
|
"stream_mode": ["values"],
|
||||||
|
}
|
||||||
|
# The /stream endpoint returns SSE; we drain it so the server-side
|
||||||
|
# background task (run_agent) gets to completion before we look at disk.
|
||||||
|
with client.stream(
|
||||||
|
"POST",
|
||||||
|
f"/api/threads/{thread_id}/runs/stream",
|
||||||
|
json=body,
|
||||||
|
headers={"X-CSRF-Token": csrf_token},
|
||||||
|
) as resp:
|
||||||
|
assert resp.status_code == 200, resp.read().decode()
|
||||||
|
transcript = _drain_stream(resp)
|
||||||
|
|
||||||
|
# Sanity: the stream should have produced at least one event
|
||||||
|
assert "event:" in transcript, f"no SSE events in response: {transcript[:500]!r}"
|
||||||
|
|
||||||
|
# --- 4. Verify filesystem outcome ---
|
||||||
|
expected_dir = isolated_deer_flow_home / "users" / auth_uid / "agents" / agent_name
|
||||||
|
default_dir = isolated_deer_flow_home / "users" / "default" / "agents" / agent_name
|
||||||
|
|
||||||
|
# The setup_agent tool runs inside the background asyncio task spawned
|
||||||
|
# by start_run; SSE-drain typically waits for it, but we add a bounded
|
||||||
|
# poll to be robust against scheduler jitter.
|
||||||
|
assert _wait_for_file(expected_dir / "SOUL.md", timeout=15.0), (
|
||||||
|
"SOUL.md did not appear under users/<auth_uid>/agents/. "
|
||||||
|
f"Expected: {expected_dir / 'SOUL.md'}. "
|
||||||
|
f"tmp tree: {sorted(str(p.relative_to(isolated_deer_flow_home)) for p in isolated_deer_flow_home.rglob('SOUL.md'))}. "
|
||||||
|
f"SSE transcript tail: {transcript[-1000:]!r}"
|
||||||
|
)
|
||||||
|
|
||||||
|
soul_text = (expected_dir / "SOUL.md").read_text()
|
||||||
|
assert agent_name in soul_text, f"unexpected SOUL content: {soul_text!r}"
|
||||||
|
|
||||||
|
# The smoking-gun assertion: the agent must NOT have landed in default/
|
||||||
|
assert not default_dir.exists(), f"REGRESSION: agent landed under users/default/{agent_name} instead of the authenticated user. Default-dir contents: {list(default_dir.rglob('*')) if default_dir.exists() else 'n/a'}"
|
||||||
@@ -30,12 +30,18 @@ def _dynamic_context_reminder(msg_id: str = "reminder-1") -> HumanMessage:
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def _runtime(thread_id: str | None = "thread-1", agent_name: str | None = None) -> SimpleNamespace:
|
def _runtime(
|
||||||
|
thread_id: str | None = "thread-1",
|
||||||
|
agent_name: str | None = None,
|
||||||
|
user_id: str | None = None,
|
||||||
|
) -> SimpleNamespace:
|
||||||
context = {}
|
context = {}
|
||||||
if thread_id is not None:
|
if thread_id is not None:
|
||||||
context["thread_id"] = thread_id
|
context["thread_id"] = thread_id
|
||||||
if agent_name is not None:
|
if agent_name is not None:
|
||||||
context["agent_name"] = agent_name
|
context["agent_name"] = agent_name
|
||||||
|
if user_id is not None:
|
||||||
|
context["user_id"] = user_id
|
||||||
return SimpleNamespace(context=context)
|
return SimpleNamespace(context=context)
|
||||||
|
|
||||||
|
|
||||||
@@ -634,3 +640,22 @@ def test_memory_flush_hook_preserves_agent_scoped_memory(monkeypatch: pytest.Mon
|
|||||||
|
|
||||||
queue.add_nowait.assert_called_once()
|
queue.add_nowait.assert_called_once()
|
||||||
assert queue.add_nowait.call_args.kwargs["agent_name"] == "research-agent"
|
assert queue.add_nowait.call_args.kwargs["agent_name"] == "research-agent"
|
||||||
|
|
||||||
|
|
||||||
|
def test_memory_flush_hook_passes_runtime_user_id(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
queue = MagicMock()
|
||||||
|
monkeypatch.setattr("deerflow.agents.memory.summarization_hook.get_memory_config", lambda: MemoryConfig(enabled=True))
|
||||||
|
monkeypatch.setattr("deerflow.agents.memory.summarization_hook.get_memory_queue", lambda: queue)
|
||||||
|
|
||||||
|
memory_flush_hook(
|
||||||
|
SummarizationEvent(
|
||||||
|
messages_to_summarize=tuple(_messages()[:2]),
|
||||||
|
preserved_messages=(),
|
||||||
|
thread_id="main",
|
||||||
|
agent_name="researcher",
|
||||||
|
runtime=_runtime(thread_id="main", agent_name="researcher", user_id="alice"),
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
queue.add_nowait.assert_called_once()
|
||||||
|
assert queue.add_nowait.call_args.kwargs["user_id"] == "alice"
|
||||||
|
|||||||
@@ -59,12 +59,15 @@ def _make_result(
|
|||||||
ai_messages: list[dict] | None = None,
|
ai_messages: list[dict] | None = None,
|
||||||
result: str | None = None,
|
result: str | None = None,
|
||||||
error: str | None = None,
|
error: str | None = None,
|
||||||
|
token_usage_records: list[dict] | None = None,
|
||||||
) -> SimpleNamespace:
|
) -> SimpleNamespace:
|
||||||
return SimpleNamespace(
|
return SimpleNamespace(
|
||||||
status=status,
|
status=status,
|
||||||
ai_messages=ai_messages or [],
|
ai_messages=ai_messages or [],
|
||||||
result=result,
|
result=result,
|
||||||
error=error,
|
error=error,
|
||||||
|
token_usage_records=token_usage_records or [],
|
||||||
|
usage_reported=False,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -1132,3 +1135,153 @@ def test_cancellation_reports_subagent_usage(monkeypatch):
|
|||||||
assert len(report_calls) == 1
|
assert len(report_calls) == 1
|
||||||
assert report_calls[0][1] is cancel_result
|
assert report_calls[0][1] is cancel_result
|
||||||
assert cleanup_calls == ["tc-cancel-report"]
|
assert cleanup_calls == ["tc-cancel-report"]
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.parametrize(
|
||||||
|
"status, expected_type",
|
||||||
|
[
|
||||||
|
(FakeSubagentStatus.COMPLETED, "task_completed"),
|
||||||
|
(FakeSubagentStatus.FAILED, "task_failed"),
|
||||||
|
(FakeSubagentStatus.CANCELLED, "task_cancelled"),
|
||||||
|
(FakeSubagentStatus.TIMED_OUT, "task_timed_out"),
|
||||||
|
],
|
||||||
|
)
|
||||||
|
def test_terminal_events_include_usage(monkeypatch, status, expected_type):
|
||||||
|
"""Terminal task events include a usage summary from token_usage_records."""
|
||||||
|
config = _make_subagent_config()
|
||||||
|
runtime = _make_runtime()
|
||||||
|
events = []
|
||||||
|
|
||||||
|
records = [
|
||||||
|
{"source_run_id": "r1", "caller": "subagent:general-purpose", "input_tokens": 100, "output_tokens": 50, "total_tokens": 150},
|
||||||
|
{"source_run_id": "r2", "caller": "subagent:general-purpose", "input_tokens": 200, "output_tokens": 80, "total_tokens": 280},
|
||||||
|
]
|
||||||
|
result = _make_result(status, result="ok" if status == FakeSubagentStatus.COMPLETED else None, error="err" if status != FakeSubagentStatus.COMPLETED else None, token_usage_records=records)
|
||||||
|
|
||||||
|
monkeypatch.setattr(task_tool_module, "SubagentStatus", FakeSubagentStatus)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_subagent_config", lambda _: config)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_background_task_result", lambda _: result)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_stream_writer", lambda: events.append)
|
||||||
|
monkeypatch.setattr(task_tool_module.asyncio, "sleep", _no_sleep)
|
||||||
|
monkeypatch.setattr(task_tool_module, "_report_subagent_usage", lambda *_: None)
|
||||||
|
monkeypatch.setattr(task_tool_module, "cleanup_background_task", lambda _: None)
|
||||||
|
monkeypatch.setattr("deerflow.tools.get_available_tools", MagicMock(return_value=[]))
|
||||||
|
|
||||||
|
_run_task_tool(
|
||||||
|
runtime=runtime,
|
||||||
|
description="test",
|
||||||
|
prompt="do work",
|
||||||
|
subagent_type="general-purpose",
|
||||||
|
tool_call_id="tc-usage",
|
||||||
|
)
|
||||||
|
|
||||||
|
terminal_events = [e for e in events if e["type"] == expected_type]
|
||||||
|
assert len(terminal_events) == 1
|
||||||
|
assert terminal_events[0]["usage"] == {
|
||||||
|
"input_tokens": 300,
|
||||||
|
"output_tokens": 130,
|
||||||
|
"total_tokens": 430,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def test_terminal_event_usage_none_when_no_records(monkeypatch):
|
||||||
|
"""Terminal event has usage=None when token_usage_records is empty."""
|
||||||
|
config = _make_subagent_config()
|
||||||
|
runtime = _make_runtime()
|
||||||
|
events = []
|
||||||
|
|
||||||
|
result = _make_result(FakeSubagentStatus.COMPLETED, result="done", token_usage_records=[])
|
||||||
|
|
||||||
|
monkeypatch.setattr(task_tool_module, "SubagentStatus", FakeSubagentStatus)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_subagent_config", lambda _: config)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_background_task_result", lambda _: result)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_stream_writer", lambda: events.append)
|
||||||
|
monkeypatch.setattr(task_tool_module.asyncio, "sleep", _no_sleep)
|
||||||
|
monkeypatch.setattr(task_tool_module, "_report_subagent_usage", lambda *_: None)
|
||||||
|
monkeypatch.setattr(task_tool_module, "cleanup_background_task", lambda _: None)
|
||||||
|
monkeypatch.setattr("deerflow.tools.get_available_tools", MagicMock(return_value=[]))
|
||||||
|
|
||||||
|
_run_task_tool(
|
||||||
|
runtime=runtime,
|
||||||
|
description="test",
|
||||||
|
prompt="do work",
|
||||||
|
subagent_type="general-purpose",
|
||||||
|
tool_call_id="tc-no-records",
|
||||||
|
)
|
||||||
|
|
||||||
|
completed = [e for e in events if e["type"] == "task_completed"]
|
||||||
|
assert len(completed) == 1
|
||||||
|
assert completed[0]["usage"] is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_usage_cache_is_skipped_when_config_file_is_missing(monkeypatch):
|
||||||
|
monkeypatch.setattr(
|
||||||
|
task_tool_module,
|
||||||
|
"get_app_config",
|
||||||
|
MagicMock(side_effect=FileNotFoundError("missing config")),
|
||||||
|
)
|
||||||
|
|
||||||
|
assert task_tool_module._token_usage_cache_enabled(None) is False
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_usage_cache_is_skipped_when_token_usage_is_disabled(monkeypatch):
|
||||||
|
config = _make_subagent_config()
|
||||||
|
app_config = SimpleNamespace(token_usage=SimpleNamespace(enabled=False))
|
||||||
|
runtime = _make_runtime(app_config=app_config)
|
||||||
|
records = [{"input_tokens": 10, "output_tokens": 5, "total_tokens": 15}]
|
||||||
|
result = _make_result(FakeSubagentStatus.COMPLETED, result="done", token_usage_records=records)
|
||||||
|
|
||||||
|
task_tool_module._subagent_usage_cache.clear()
|
||||||
|
monkeypatch.setattr(task_tool_module, "SubagentStatus", FakeSubagentStatus)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_available_subagent_names", lambda *, app_config: ["general-purpose"])
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_subagent_config", lambda _, *, app_config: config)
|
||||||
|
monkeypatch.setattr(
|
||||||
|
task_tool_module,
|
||||||
|
"SubagentExecutor",
|
||||||
|
type("DummyExecutor", (), {"__init__": lambda self, **kwargs: None, "execute_async": lambda self, prompt, task_id=None: task_id}),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_background_task_result", lambda _: result)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_stream_writer", lambda: lambda _: None)
|
||||||
|
monkeypatch.setattr(task_tool_module, "_report_subagent_usage", lambda *_: None)
|
||||||
|
monkeypatch.setattr(task_tool_module, "cleanup_background_task", lambda _: None)
|
||||||
|
monkeypatch.setattr("deerflow.tools.get_available_tools", MagicMock(return_value=[]))
|
||||||
|
|
||||||
|
_run_task_tool(
|
||||||
|
runtime=runtime,
|
||||||
|
description="test",
|
||||||
|
prompt="do work",
|
||||||
|
subagent_type="general-purpose",
|
||||||
|
tool_call_id="tc-disabled-cache",
|
||||||
|
)
|
||||||
|
|
||||||
|
assert task_tool_module.pop_cached_subagent_usage("tc-disabled-cache") is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_usage_cache_is_cleared_when_polling_raises(monkeypatch):
|
||||||
|
config = _make_subagent_config()
|
||||||
|
app_config = SimpleNamespace(token_usage=SimpleNamespace(enabled=True))
|
||||||
|
runtime = _make_runtime(app_config=app_config)
|
||||||
|
|
||||||
|
task_tool_module._subagent_usage_cache["tc-error"] = {"input_tokens": 1, "output_tokens": 1, "total_tokens": 2}
|
||||||
|
monkeypatch.setattr(task_tool_module, "SubagentStatus", FakeSubagentStatus)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_available_subagent_names", lambda *, app_config: ["general-purpose"])
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_subagent_config", lambda _, *, app_config: config)
|
||||||
|
monkeypatch.setattr(
|
||||||
|
task_tool_module,
|
||||||
|
"SubagentExecutor",
|
||||||
|
type("DummyExecutor", (), {"__init__": lambda self, **kwargs: None, "execute_async": lambda self, prompt, task_id=None: task_id}),
|
||||||
|
)
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_background_task_result", MagicMock(side_effect=RuntimeError("poll failed")))
|
||||||
|
monkeypatch.setattr(task_tool_module, "get_stream_writer", lambda: lambda _: None)
|
||||||
|
monkeypatch.setattr("deerflow.tools.get_available_tools", MagicMock(return_value=[]))
|
||||||
|
|
||||||
|
with pytest.raises(RuntimeError, match="poll failed"):
|
||||||
|
_run_task_tool(
|
||||||
|
runtime=runtime,
|
||||||
|
description="test",
|
||||||
|
prompt="do work",
|
||||||
|
subagent_type="general-purpose",
|
||||||
|
tool_call_id="tc-error",
|
||||||
|
)
|
||||||
|
|
||||||
|
assert task_tool_module.pop_cached_subagent_usage("tc-error") is None
|
||||||
|
|||||||
@@ -1,28 +1,25 @@
|
|||||||
"""Tests for ThreadMetaRepository (SQLAlchemy-backed)."""
|
"""Tests for ThreadMetaRepository (SQLAlchemy-backed)."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
from deerflow.persistence.thread_meta import ThreadMetaRepository
|
from deerflow.persistence.thread_meta import InvalidMetadataFilterError, ThreadMetaRepository
|
||||||
|
|
||||||
|
|
||||||
async def _make_repo(tmp_path):
|
@pytest.fixture
|
||||||
from deerflow.persistence.engine import get_session_factory, init_engine
|
async def repo(tmp_path):
|
||||||
|
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine
|
||||||
|
|
||||||
url = f"sqlite+aiosqlite:///{tmp_path / 'test.db'}"
|
url = f"sqlite+aiosqlite:///{tmp_path / 'test.db'}"
|
||||||
await init_engine("sqlite", url=url, sqlite_dir=str(tmp_path))
|
await init_engine("sqlite", url=url, sqlite_dir=str(tmp_path))
|
||||||
return ThreadMetaRepository(get_session_factory())
|
yield ThreadMetaRepository(get_session_factory())
|
||||||
|
|
||||||
|
|
||||||
async def _cleanup():
|
|
||||||
from deerflow.persistence.engine import close_engine
|
|
||||||
|
|
||||||
await close_engine()
|
await close_engine()
|
||||||
|
|
||||||
|
|
||||||
class TestThreadMetaRepository:
|
class TestThreadMetaRepository:
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_create_and_get(self, tmp_path):
|
async def test_create_and_get(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
record = await repo.create("t1")
|
record = await repo.create("t1")
|
||||||
assert record["thread_id"] == "t1"
|
assert record["thread_id"] == "t1"
|
||||||
assert record["status"] == "idle"
|
assert record["status"] == "idle"
|
||||||
@@ -31,148 +28,523 @@ class TestThreadMetaRepository:
|
|||||||
fetched = await repo.get("t1")
|
fetched = await repo.get("t1")
|
||||||
assert fetched is not None
|
assert fetched is not None
|
||||||
assert fetched["thread_id"] == "t1"
|
assert fetched["thread_id"] == "t1"
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_create_with_assistant_id(self, tmp_path):
|
async def test_create_with_assistant_id(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
record = await repo.create("t1", assistant_id="agent1")
|
record = await repo.create("t1", assistant_id="agent1")
|
||||||
assert record["assistant_id"] == "agent1"
|
assert record["assistant_id"] == "agent1"
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_create_with_owner_and_display_name(self, tmp_path):
|
async def test_create_with_owner_and_display_name(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
record = await repo.create("t1", user_id="user1", display_name="My Thread")
|
record = await repo.create("t1", user_id="user1", display_name="My Thread")
|
||||||
assert record["user_id"] == "user1"
|
assert record["user_id"] == "user1"
|
||||||
assert record["display_name"] == "My Thread"
|
assert record["display_name"] == "My Thread"
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_create_with_metadata(self, tmp_path):
|
async def test_create_with_metadata(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
record = await repo.create("t1", metadata={"key": "value"})
|
record = await repo.create("t1", metadata={"key": "value"})
|
||||||
assert record["metadata"] == {"key": "value"}
|
assert record["metadata"] == {"key": "value"}
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_get_nonexistent(self, tmp_path):
|
async def test_get_nonexistent(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
assert await repo.get("nonexistent") is None
|
assert await repo.get("nonexistent") is None
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_no_record_allows(self, tmp_path):
|
async def test_check_access_no_record_allows(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
assert await repo.check_access("unknown", "user1") is True
|
assert await repo.check_access("unknown", "user1") is True
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_owner_matches(self, tmp_path):
|
async def test_check_access_owner_matches(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", user_id="user1")
|
await repo.create("t1", user_id="user1")
|
||||||
assert await repo.check_access("t1", "user1") is True
|
assert await repo.check_access("t1", "user1") is True
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_owner_mismatch(self, tmp_path):
|
async def test_check_access_owner_mismatch(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", user_id="user1")
|
await repo.create("t1", user_id="user1")
|
||||||
assert await repo.check_access("t1", "user2") is False
|
assert await repo.check_access("t1", "user2") is False
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_no_owner_allows_all(self, tmp_path):
|
async def test_check_access_no_owner_allows_all(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
# Explicit user_id=None to bypass the new AUTO default that
|
# Explicit user_id=None to bypass the new AUTO default that
|
||||||
# would otherwise pick up the test user from the autouse fixture.
|
# would otherwise pick up the test user from the autouse fixture.
|
||||||
await repo.create("t1", user_id=None)
|
await repo.create("t1", user_id=None)
|
||||||
assert await repo.check_access("t1", "anyone") is True
|
assert await repo.check_access("t1", "anyone") is True
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_strict_missing_row_denied(self, tmp_path):
|
async def test_check_access_strict_missing_row_denied(self, repo):
|
||||||
"""require_existing=True flips the missing-row case to *denied*.
|
"""require_existing=True flips the missing-row case to *denied*.
|
||||||
|
|
||||||
Closes the delete-idempotence cross-user gap: after a thread is
|
Closes the delete-idempotence cross-user gap: after a thread is
|
||||||
deleted, the row is gone, and the permissive default would let any
|
deleted, the row is gone, and the permissive default would let any
|
||||||
caller "claim" it as untracked. The strict mode demands a row.
|
caller "claim" it as untracked. The strict mode demands a row.
|
||||||
"""
|
"""
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
assert await repo.check_access("never-existed", "user1", require_existing=True) is False
|
assert await repo.check_access("never-existed", "user1", require_existing=True) is False
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_strict_owner_match_allowed(self, tmp_path):
|
async def test_check_access_strict_owner_match_allowed(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", user_id="user1")
|
await repo.create("t1", user_id="user1")
|
||||||
assert await repo.check_access("t1", "user1", require_existing=True) is True
|
assert await repo.check_access("t1", "user1", require_existing=True) is True
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_strict_owner_mismatch_denied(self, tmp_path):
|
async def test_check_access_strict_owner_mismatch_denied(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", user_id="user1")
|
await repo.create("t1", user_id="user1")
|
||||||
assert await repo.check_access("t1", "user2", require_existing=True) is False
|
assert await repo.check_access("t1", "user2", require_existing=True) is False
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_check_access_strict_null_owner_still_allowed(self, tmp_path):
|
async def test_check_access_strict_null_owner_still_allowed(self, repo):
|
||||||
"""Even in strict mode, a row with NULL user_id stays shared.
|
"""Even in strict mode, a row with NULL user_id stays shared.
|
||||||
|
|
||||||
The strict flag tightens the *missing row* case, not the *shared
|
The strict flag tightens the *missing row* case, not the *shared
|
||||||
row* case — legacy pre-auth rows that survived a clean migration
|
row* case — legacy pre-auth rows that survived a clean migration
|
||||||
without an owner are still everyone's.
|
without an owner are still everyone's.
|
||||||
"""
|
"""
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", user_id=None)
|
await repo.create("t1", user_id=None)
|
||||||
assert await repo.check_access("t1", "anyone", require_existing=True) is True
|
assert await repo.check_access("t1", "anyone", require_existing=True) is True
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_update_status(self, tmp_path):
|
async def test_update_status(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1")
|
await repo.create("t1")
|
||||||
await repo.update_status("t1", "busy")
|
await repo.update_status("t1", "busy")
|
||||||
record = await repo.get("t1")
|
record = await repo.get("t1")
|
||||||
assert record["status"] == "busy"
|
assert record["status"] == "busy"
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_delete(self, tmp_path):
|
async def test_delete(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1")
|
await repo.create("t1")
|
||||||
await repo.delete("t1")
|
await repo.delete("t1")
|
||||||
assert await repo.get("t1") is None
|
assert await repo.get("t1") is None
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_delete_nonexistent_is_noop(self, tmp_path):
|
async def test_delete_nonexistent_is_noop(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.delete("nonexistent") # should not raise
|
await repo.delete("nonexistent") # should not raise
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_update_metadata_merges(self, tmp_path):
|
async def test_update_metadata_merges(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1", metadata={"a": 1, "b": 2})
|
await repo.create("t1", metadata={"a": 1, "b": 2})
|
||||||
await repo.update_metadata("t1", {"b": 99, "c": 3})
|
await repo.update_metadata("t1", {"b": 99, "c": 3})
|
||||||
record = await repo.get("t1")
|
record = await repo.get("t1")
|
||||||
# Existing key preserved, overlapping key overwritten, new key added
|
# Existing key preserved, overlapping key overwritten, new key added
|
||||||
assert record["metadata"] == {"a": 1, "b": 99, "c": 3}
|
assert record["metadata"] == {"a": 1, "b": 99, "c": 3}
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_update_metadata_on_empty(self, tmp_path):
|
async def test_update_metadata_on_empty(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.create("t1")
|
await repo.create("t1")
|
||||||
await repo.update_metadata("t1", {"k": "v"})
|
await repo.update_metadata("t1", {"k": "v"})
|
||||||
record = await repo.get("t1")
|
record = await repo.get("t1")
|
||||||
assert record["metadata"] == {"k": "v"}
|
assert record["metadata"] == {"k": "v"}
|
||||||
await _cleanup()
|
|
||||||
|
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
async def test_update_metadata_nonexistent_is_noop(self, tmp_path):
|
async def test_update_metadata_nonexistent_is_noop(self, repo):
|
||||||
repo = await _make_repo(tmp_path)
|
|
||||||
await repo.update_metadata("nonexistent", {"k": "v"}) # should not raise
|
await repo.update_metadata("nonexistent", {"k": "v"}) # should not raise
|
||||||
await _cleanup()
|
|
||||||
|
# --- search with metadata filter (SQL push-down) ---
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_filter_string(self, repo):
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
await repo.create("t3", metadata={"env": "prod", "region": "us"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"env": "prod"})
|
||||||
|
ids = {r["thread_id"] for r in results}
|
||||||
|
assert ids == {"t1", "t3"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_filter_numeric(self, repo):
|
||||||
|
await repo.create("t1", metadata={"priority": 1})
|
||||||
|
await repo.create("t2", metadata={"priority": 2})
|
||||||
|
await repo.create("t3", metadata={"priority": 1, "extra": "x"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"priority": 1})
|
||||||
|
ids = {r["thread_id"] for r in results}
|
||||||
|
assert ids == {"t1", "t3"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_filter_multiple_keys(self, repo):
|
||||||
|
await repo.create("t1", metadata={"env": "prod", "region": "us"})
|
||||||
|
await repo.create("t2", metadata={"env": "prod", "region": "eu"})
|
||||||
|
await repo.create("t3", metadata={"env": "staging", "region": "us"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"env": "prod", "region": "us"})
|
||||||
|
assert len(results) == 1
|
||||||
|
assert results[0]["thread_id"] == "t1"
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_no_match(self, repo):
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"env": "dev"})
|
||||||
|
assert results == []
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_pagination_correct(self, repo):
|
||||||
|
"""Regression: SQL push-down makes limit/offset exact even when most rows don't match."""
|
||||||
|
for i in range(30):
|
||||||
|
meta = {"target": "yes"} if i % 3 == 0 else {"target": "no"}
|
||||||
|
await repo.create(f"t{i:03d}", metadata=meta)
|
||||||
|
|
||||||
|
# Total matching rows: i in {0,3,6,9,12,15,18,21,24,27} = 10 rows
|
||||||
|
all_matches = await repo.search(metadata={"target": "yes"}, limit=100)
|
||||||
|
assert len(all_matches) == 10
|
||||||
|
|
||||||
|
# Paginate: first page
|
||||||
|
page1 = await repo.search(metadata={"target": "yes"}, limit=3, offset=0)
|
||||||
|
assert len(page1) == 3
|
||||||
|
|
||||||
|
# Paginate: second page
|
||||||
|
page2 = await repo.search(metadata={"target": "yes"}, limit=3, offset=3)
|
||||||
|
assert len(page2) == 3
|
||||||
|
|
||||||
|
# No overlap between pages
|
||||||
|
page1_ids = {r["thread_id"] for r in page1}
|
||||||
|
page2_ids = {r["thread_id"] for r in page2}
|
||||||
|
assert page1_ids.isdisjoint(page2_ids)
|
||||||
|
|
||||||
|
# Last page
|
||||||
|
page_last = await repo.search(metadata={"target": "yes"}, limit=3, offset=9)
|
||||||
|
assert len(page_last) == 1
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_with_status_filter(self, repo):
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "prod"})
|
||||||
|
await repo.update_status("t1", "busy")
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"env": "prod"}, status="busy")
|
||||||
|
assert len(results) == 1
|
||||||
|
assert results[0]["thread_id"] == "t1"
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_without_metadata_still_works(self, repo):
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2")
|
||||||
|
|
||||||
|
results = await repo.search(limit=10)
|
||||||
|
assert len(results) == 2
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_missing_key_no_match(self, repo):
|
||||||
|
"""Rows without the requested metadata key should not match."""
|
||||||
|
await repo.create("t1", metadata={"other": "val"})
|
||||||
|
await repo.create("t2", metadata={"env": "prod"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"env": "prod"})
|
||||||
|
assert len(results) == 1
|
||||||
|
assert results[0]["thread_id"] == "t2"
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_all_unsafe_keys_raises(self, repo, caplog):
|
||||||
|
"""When ALL metadata keys are unsafe, raises InvalidMetadataFilterError."""
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
|
||||||
|
with caplog.at_level(logging.WARNING, logger="deerflow.persistence.thread_meta.sql"):
|
||||||
|
with pytest.raises(InvalidMetadataFilterError, match="rejected") as exc_info:
|
||||||
|
await repo.search(metadata={"bad;key": "x"})
|
||||||
|
assert any("bad;key" in r.message for r in caplog.records)
|
||||||
|
# Subclass of ValueError for backward compatibility
|
||||||
|
assert isinstance(exc_info.value, ValueError)
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_partial_unsafe_key_skipped(self, repo, caplog):
|
||||||
|
"""Valid keys filter rows; only the invalid key is warned and skipped."""
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
|
||||||
|
with caplog.at_level(logging.WARNING, logger="deerflow.persistence.thread_meta.sql"):
|
||||||
|
results = await repo.search(metadata={"env": "prod", "bad;key": "x"})
|
||||||
|
ids = {r["thread_id"] for r in results}
|
||||||
|
assert ids == {"t1"}
|
||||||
|
assert any("bad;key" in r.message for r in caplog.records)
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_filter_boolean(self, repo):
|
||||||
|
"""True matches only boolean true, not integer 1."""
|
||||||
|
await repo.create("t1", metadata={"active": True})
|
||||||
|
await repo.create("t2", metadata={"active": False})
|
||||||
|
await repo.create("t3", metadata={"active": True, "extra": "x"})
|
||||||
|
await repo.create("t4", metadata={"active": 1})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"active": True})
|
||||||
|
ids = {r["thread_id"] for r in results}
|
||||||
|
assert ids == {"t1", "t3"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_filter_none(self, repo):
|
||||||
|
"""Only rows with explicit JSON null match; missing key does not."""
|
||||||
|
await repo.create("t1", metadata={"tag": None})
|
||||||
|
await repo.create("t2", metadata={"tag": "present"})
|
||||||
|
await repo.create("t3", metadata={"other": "val"})
|
||||||
|
|
||||||
|
results = await repo.search(metadata={"tag": None})
|
||||||
|
ids = {r["thread_id"] for r in results}
|
||||||
|
assert ids == {"t1"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_non_string_key_skipped(self, repo, caplog):
|
||||||
|
"""Non-string keys raise ValueError from isinstance check; should be warned and skipped."""
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
|
||||||
|
with caplog.at_level(logging.WARNING, logger="deerflow.persistence.thread_meta.sql"):
|
||||||
|
with pytest.raises(InvalidMetadataFilterError, match="rejected"):
|
||||||
|
await repo.search(metadata={1: "x"})
|
||||||
|
assert any("1" in r.message for r in caplog.records)
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_unsupported_value_type_skipped(self, repo, caplog):
|
||||||
|
"""Unsupported value types (list, dict) raise TypeError; should be warned and skipped."""
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
|
||||||
|
with caplog.at_level(logging.WARNING, logger="deerflow.persistence.thread_meta.sql"):
|
||||||
|
with pytest.raises(InvalidMetadataFilterError, match="rejected"):
|
||||||
|
await repo.search(metadata={"env": ["prod", "staging"]})
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_dotted_key_raises(self, repo, caplog):
|
||||||
|
"""Dotted keys are rejected; when ALL keys are dotted, raises ValueError."""
|
||||||
|
await repo.create("t1", metadata={"env": "prod"})
|
||||||
|
await repo.create("t2", metadata={"env": "staging"})
|
||||||
|
|
||||||
|
with caplog.at_level(logging.WARNING, logger="deerflow.persistence.thread_meta.sql"):
|
||||||
|
with pytest.raises(InvalidMetadataFilterError, match="rejected"):
|
||||||
|
await repo.search(metadata={"a.b": "anything"})
|
||||||
|
assert any("a.b" in r.message for r in caplog.records)
|
||||||
|
|
||||||
|
# --- dialect-aware type-safe filtering edge cases ---
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_bool_vs_int_distinction(self, repo):
|
||||||
|
"""True must not match 1; False must not match 0."""
|
||||||
|
await repo.create("bool_true", metadata={"flag": True})
|
||||||
|
await repo.create("bool_false", metadata={"flag": False})
|
||||||
|
await repo.create("int_one", metadata={"flag": 1})
|
||||||
|
await repo.create("int_zero", metadata={"flag": 0})
|
||||||
|
|
||||||
|
true_hits = {r["thread_id"] for r in await repo.search(metadata={"flag": True})}
|
||||||
|
assert true_hits == {"bool_true"}
|
||||||
|
|
||||||
|
false_hits = {r["thread_id"] for r in await repo.search(metadata={"flag": False})}
|
||||||
|
assert false_hits == {"bool_false"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_int_does_not_match_bool(self, repo):
|
||||||
|
"""Integer 1 must not match boolean True."""
|
||||||
|
await repo.create("bool_true", metadata={"val": True})
|
||||||
|
await repo.create("int_one", metadata={"val": 1})
|
||||||
|
|
||||||
|
hits = {r["thread_id"] for r in await repo.search(metadata={"val": 1})}
|
||||||
|
assert hits == {"int_one"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_none_excludes_missing_key(self, repo):
|
||||||
|
"""Filtering by None matches explicit JSON null only, not missing key or empty {}."""
|
||||||
|
await repo.create("explicit_null", metadata={"k": None})
|
||||||
|
await repo.create("missing_key", metadata={"other": "x"})
|
||||||
|
await repo.create("empty_obj", metadata={})
|
||||||
|
|
||||||
|
hits = {r["thread_id"] for r in await repo.search(metadata={"k": None})}
|
||||||
|
assert hits == {"explicit_null"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_float_value(self, repo):
|
||||||
|
await repo.create("t1", metadata={"score": 3.14})
|
||||||
|
await repo.create("t2", metadata={"score": 2.71})
|
||||||
|
await repo.create("t3", metadata={"score": 3.14})
|
||||||
|
|
||||||
|
hits = {r["thread_id"] for r in await repo.search(metadata={"score": 3.14})}
|
||||||
|
assert hits == {"t1", "t3"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_mixed_types_same_key(self, repo):
|
||||||
|
"""Each type query only matches its own type, even when the key is shared."""
|
||||||
|
await repo.create("str_row", metadata={"x": "hello"})
|
||||||
|
await repo.create("int_row", metadata={"x": 42})
|
||||||
|
await repo.create("bool_row", metadata={"x": True})
|
||||||
|
await repo.create("null_row", metadata={"x": None})
|
||||||
|
|
||||||
|
assert {r["thread_id"] for r in await repo.search(metadata={"x": "hello"})} == {"str_row"}
|
||||||
|
assert {r["thread_id"] for r in await repo.search(metadata={"x": 42})} == {"int_row"}
|
||||||
|
assert {r["thread_id"] for r in await repo.search(metadata={"x": True})} == {"bool_row"}
|
||||||
|
assert {r["thread_id"] for r in await repo.search(metadata={"x": None})} == {"null_row"}
|
||||||
|
|
||||||
|
@pytest.mark.anyio
|
||||||
|
async def test_search_metadata_large_int_precision(self, repo):
|
||||||
|
"""Integers beyond float precision (> 2**53) must match exactly."""
|
||||||
|
large = 2**53 + 1
|
||||||
|
await repo.create("t1", metadata={"id": large})
|
||||||
|
await repo.create("t2", metadata={"id": large - 1})
|
||||||
|
|
||||||
|
hits = {r["thread_id"] for r in await repo.search(metadata={"id": large})}
|
||||||
|
assert hits == {"t1"}
|
||||||
|
|
||||||
|
|
||||||
|
class TestJsonMatchCompilation:
|
||||||
|
"""Verify compiled SQL for both SQLite and PostgreSQL dialects."""
|
||||||
|
|
||||||
|
def test_json_match_compiles_sqlite(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table, create_engine
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
engine = create_engine("sqlite://")
|
||||||
|
|
||||||
|
cases = [
|
||||||
|
(None, "json_type(t.data, '$.\"k\"') = 'null'"),
|
||||||
|
(True, "json_type(t.data, '$.\"k\"') = 'true'"),
|
||||||
|
(False, "json_type(t.data, '$.\"k\"') = 'false'"),
|
||||||
|
]
|
||||||
|
for value, expected_fragment in cases:
|
||||||
|
expr = json_match(t.c.data, "k", value)
|
||||||
|
sql = expr.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True})
|
||||||
|
assert str(sql) == expected_fragment, f"value={value!r}: {sql}"
|
||||||
|
|
||||||
|
# int: uses INTEGER cast for precision, type-check narrows to 'integer' only
|
||||||
|
int_expr = json_match(t.c.data, "k", 42)
|
||||||
|
sql = str(int_expr.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_type" in sql
|
||||||
|
assert "= 'integer'" in sql
|
||||||
|
assert "INTEGER" in sql
|
||||||
|
assert "CAST" in sql
|
||||||
|
|
||||||
|
# float: uses REAL cast, type-check spans 'integer' and 'real'
|
||||||
|
float_expr = json_match(t.c.data, "k", 3.14)
|
||||||
|
sql = str(float_expr.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_type" in sql
|
||||||
|
assert "IN ('integer', 'real')" in sql
|
||||||
|
assert "REAL" in sql
|
||||||
|
|
||||||
|
str_expr = json_match(t.c.data, "k", "hello")
|
||||||
|
sql = str(str_expr.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_type" in sql
|
||||||
|
assert "'text'" in sql
|
||||||
|
|
||||||
|
def test_json_match_compiles_pg(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table
|
||||||
|
from sqlalchemy.dialects import postgresql
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
dialect = postgresql.dialect()
|
||||||
|
|
||||||
|
cases = [
|
||||||
|
(None, "json_typeof(t.data -> 'k') = 'null'"),
|
||||||
|
(True, "(json_typeof(t.data -> 'k') = 'boolean' AND (t.data ->> 'k') = 'true')"),
|
||||||
|
(False, "(json_typeof(t.data -> 'k') = 'boolean' AND (t.data ->> 'k') = 'false')"),
|
||||||
|
]
|
||||||
|
for value, expected_fragment in cases:
|
||||||
|
expr = json_match(t.c.data, "k", value)
|
||||||
|
sql = expr.compile(dialect=dialect, compile_kwargs={"literal_binds": True})
|
||||||
|
assert str(sql) == expected_fragment, f"value={value!r}: {sql}"
|
||||||
|
|
||||||
|
# int: CASE guard prevents CAST error when 'number' also matches floats
|
||||||
|
int_expr = json_match(t.c.data, "k", 42)
|
||||||
|
sql = str(int_expr.compile(dialect=dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_typeof" in sql
|
||||||
|
assert "'number'" in sql
|
||||||
|
assert "BIGINT" in sql
|
||||||
|
assert "CASE WHEN" in sql
|
||||||
|
assert "'^-?[0-9]+$'" in sql
|
||||||
|
|
||||||
|
# float: uses DOUBLE PRECISION cast
|
||||||
|
float_expr = json_match(t.c.data, "k", 3.14)
|
||||||
|
sql = str(float_expr.compile(dialect=dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_typeof" in sql
|
||||||
|
assert "'number'" in sql
|
||||||
|
assert "DOUBLE PRECISION" in sql
|
||||||
|
|
||||||
|
str_expr = json_match(t.c.data, "k", "hello")
|
||||||
|
sql = str(str_expr.compile(dialect=dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
assert "json_typeof" in sql
|
||||||
|
assert "'string'" in sql
|
||||||
|
|
||||||
|
def test_json_match_rejects_unsafe_key(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
|
||||||
|
for bad_key in ["a.b", "with space", "bad'quote", 'bad"quote', "back\\slash", "semi;colon", ""]:
|
||||||
|
with pytest.raises(ValueError, match="JsonMatch key must match"):
|
||||||
|
json_match(t.c.data, bad_key, "x")
|
||||||
|
|
||||||
|
# Non-string keys must also raise ValueError (not TypeError from re.match)
|
||||||
|
for non_str_key in [42, None, ("k",)]:
|
||||||
|
with pytest.raises(ValueError, match="JsonMatch key must match"):
|
||||||
|
json_match(t.c.data, non_str_key, "x")
|
||||||
|
|
||||||
|
def test_json_match_rejects_unsupported_value_type(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
|
||||||
|
for bad_value in [[], {}, object()]:
|
||||||
|
with pytest.raises(TypeError, match="JsonMatch value must be"):
|
||||||
|
json_match(t.c.data, "k", bad_value)
|
||||||
|
|
||||||
|
def test_json_match_unsupported_dialect_raises(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table
|
||||||
|
from sqlalchemy.dialects import mysql
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
expr = json_match(t.c.data, "k", "v")
|
||||||
|
|
||||||
|
with pytest.raises(NotImplementedError, match="mysql"):
|
||||||
|
str(expr.compile(dialect=mysql.dialect(), compile_kwargs={"literal_binds": True}))
|
||||||
|
|
||||||
|
def test_json_match_rejects_out_of_range_int(self):
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
|
||||||
|
# boundary values must be accepted
|
||||||
|
json_match(t.c.data, "k", 2**63 - 1)
|
||||||
|
json_match(t.c.data, "k", -(2**63))
|
||||||
|
|
||||||
|
# one beyond each boundary must be rejected
|
||||||
|
for out_of_range in [2**63, -(2**63) - 1, 10**30]:
|
||||||
|
with pytest.raises(TypeError, match="out of signed 64-bit range"):
|
||||||
|
json_match(t.c.data, "k", out_of_range)
|
||||||
|
|
||||||
|
def test_compiler_raises_on_escaped_key(self):
|
||||||
|
"""Compiler raises ValueError even when __init__ validation is bypassed."""
|
||||||
|
from sqlalchemy import Column, MetaData, String, Table, create_engine
|
||||||
|
from sqlalchemy.dialects import postgresql
|
||||||
|
from sqlalchemy.types import JSON
|
||||||
|
|
||||||
|
from deerflow.persistence.json_compat import json_match
|
||||||
|
|
||||||
|
metadata = MetaData()
|
||||||
|
t = Table("t", metadata, Column("data", JSON), Column("id", String))
|
||||||
|
engine = create_engine("sqlite://")
|
||||||
|
|
||||||
|
elem = json_match(t.c.data, "k", "v")
|
||||||
|
elem.key = "bad.key" # bypass __init__ to simulate -O stripping assert
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="Key escaped validation"):
|
||||||
|
str(elem.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True}))
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="Key escaped validation"):
|
||||||
|
str(elem.compile(dialect=postgresql.dialect(), compile_kwargs={"literal_binds": True}))
|
||||||
|
|||||||
@@ -10,6 +10,7 @@ from langgraph.store.memory import InMemoryStore
|
|||||||
|
|
||||||
from app.gateway.routers import threads
|
from app.gateway.routers import threads
|
||||||
from deerflow.config.paths import Paths
|
from deerflow.config.paths import Paths
|
||||||
|
from deerflow.persistence.thread_meta import InvalidMetadataFilterError
|
||||||
from deerflow.persistence.thread_meta.memory import THREADS_NS, MemoryThreadMetaStore
|
from deerflow.persistence.thread_meta.memory import THREADS_NS, MemoryThreadMetaStore
|
||||||
|
|
||||||
_ISO_TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
|
_ISO_TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
|
||||||
@@ -431,3 +432,56 @@ def test_get_thread_history_returns_iso_for_legacy_checkpoint_metadata() -> None
|
|||||||
assert entries, "expected at least one history entry"
|
assert entries, "expected at least one history entry"
|
||||||
for entry in entries:
|
for entry in entries:
|
||||||
assert _ISO_TIMESTAMP_RE.match(entry["created_at"]), entry
|
assert _ISO_TIMESTAMP_RE.match(entry["created_at"]), entry
|
||||||
|
|
||||||
|
|
||||||
|
# ── Metadata filter validation at API boundary ────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_threads_rejects_invalid_key_at_api_boundary() -> None:
|
||||||
|
"""Keys that don't match [A-Za-z0-9_-]+ are rejected by the Pydantic
|
||||||
|
validator on ThreadSearchRequest.metadata — 422 from both backends.
|
||||||
|
"""
|
||||||
|
app, _store, _checkpointer = _build_thread_app()
|
||||||
|
|
||||||
|
with TestClient(app) as client:
|
||||||
|
response = client.post("/api/threads/search", json={"metadata": {"bad;key": "x"}})
|
||||||
|
|
||||||
|
assert response.status_code == 422
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_threads_rejects_unsupported_value_type_at_api_boundary() -> None:
|
||||||
|
"""Value types outside (None, bool, int, float, str) are rejected."""
|
||||||
|
app, _store, _checkpointer = _build_thread_app()
|
||||||
|
|
||||||
|
with TestClient(app) as client:
|
||||||
|
response = client.post("/api/threads/search", json={"metadata": {"env": ["a", "b"]}})
|
||||||
|
|
||||||
|
assert response.status_code == 422
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_threads_returns_400_for_backend_invalid_metadata_filter() -> None:
|
||||||
|
"""If the backend still raises InvalidMetadataFilterError (defense in
|
||||||
|
depth), the handler surfaces it as HTTP 400.
|
||||||
|
"""
|
||||||
|
app, _store, _checkpointer = _build_thread_app()
|
||||||
|
thread_store = app.state.thread_store
|
||||||
|
|
||||||
|
async def _raise(**kwargs):
|
||||||
|
raise InvalidMetadataFilterError("rejected")
|
||||||
|
|
||||||
|
with TestClient(app) as client:
|
||||||
|
with patch.object(thread_store, "search", side_effect=_raise):
|
||||||
|
response = client.post("/api/threads/search", json={"metadata": {"valid_key": "x"}})
|
||||||
|
|
||||||
|
assert response.status_code == 400
|
||||||
|
assert "rejected" in response.json()["detail"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_threads_succeeds_with_valid_metadata() -> None:
|
||||||
|
"""Sanity check: valid metadata passes through without error."""
|
||||||
|
app, _store, _checkpointer = _build_thread_app()
|
||||||
|
|
||||||
|
with TestClient(app) as client:
|
||||||
|
response = client.post("/api/threads/search", json={"metadata": {"env": "prod"}})
|
||||||
|
|
||||||
|
assert response.status_code == 200
|
||||||
|
|||||||
@@ -1,9 +1,10 @@
|
|||||||
"""Tests for TokenUsageMiddleware attribution annotations."""
|
"""Tests for TokenUsageMiddleware attribution annotations."""
|
||||||
|
|
||||||
|
import importlib
|
||||||
import logging
|
import logging
|
||||||
from unittest.mock import MagicMock
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
from langchain_core.messages import AIMessage
|
from langchain_core.messages import AIMessage, ToolMessage
|
||||||
|
|
||||||
from deerflow.agents.middlewares.token_usage_middleware import (
|
from deerflow.agents.middlewares.token_usage_middleware import (
|
||||||
TOKEN_USAGE_ATTRIBUTION_KEY,
|
TOKEN_USAGE_ATTRIBUTION_KEY,
|
||||||
@@ -232,3 +233,49 @@ class TestTokenUsageMiddleware:
|
|||||||
"tool_call_id": "write_todos:remove",
|
"tool_call_id": "write_todos:remove",
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
|
||||||
|
def test_merges_subagent_usage_by_message_position_when_ai_message_ids_are_missing(self, monkeypatch):
|
||||||
|
middleware = TokenUsageMiddleware()
|
||||||
|
first_dispatch = AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[{"id": "task:first", "name": "task", "args": {}}],
|
||||||
|
)
|
||||||
|
second_dispatch = AIMessage(
|
||||||
|
content="",
|
||||||
|
tool_calls=[
|
||||||
|
{"id": "task:second-a", "name": "task", "args": {}},
|
||||||
|
{"id": "task:second-b", "name": "task", "args": {}},
|
||||||
|
],
|
||||||
|
)
|
||||||
|
messages = [
|
||||||
|
first_dispatch,
|
||||||
|
ToolMessage(content="first", tool_call_id="task:first"),
|
||||||
|
second_dispatch,
|
||||||
|
ToolMessage(content="second-a", tool_call_id="task:second-a"),
|
||||||
|
ToolMessage(content="second-b", tool_call_id="task:second-b"),
|
||||||
|
AIMessage(content="done"),
|
||||||
|
]
|
||||||
|
cached_usage = {
|
||||||
|
"task:second-a": {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
|
||||||
|
"task:second-b": {"input_tokens": 20, "output_tokens": 7, "total_tokens": 27},
|
||||||
|
}
|
||||||
|
|
||||||
|
task_tool_module = importlib.import_module("deerflow.tools.builtins.task_tool")
|
||||||
|
monkeypatch.setattr(
|
||||||
|
task_tool_module,
|
||||||
|
"pop_cached_subagent_usage",
|
||||||
|
lambda tool_call_id: cached_usage.pop(tool_call_id, None),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = middleware.after_model({"messages": messages}, _make_runtime())
|
||||||
|
|
||||||
|
assert result is not None
|
||||||
|
usage_updates = [message for message in result["messages"] if getattr(message, "usage_metadata", None)]
|
||||||
|
assert len(usage_updates) == 1
|
||||||
|
updated = usage_updates[0]
|
||||||
|
assert updated.tool_calls == second_dispatch.tool_calls
|
||||||
|
assert updated.usage_metadata == {
|
||||||
|
"input_tokens": 30,
|
||||||
|
"output_tokens": 12,
|
||||||
|
"total_tokens": 42,
|
||||||
|
}
|
||||||
|
|||||||
@@ -65,8 +65,7 @@ def _make_minimal_config(tools):
|
|||||||
|
|
||||||
@patch("deerflow.tools.tools.get_app_config")
|
@patch("deerflow.tools.tools.get_app_config")
|
||||||
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
||||||
@patch("deerflow.tools.tools.reset_deferred_registry")
|
def test_config_loaded_async_only_tool_gets_sync_wrapper(mock_bash, mock_cfg):
|
||||||
def test_config_loaded_async_only_tool_gets_sync_wrapper(mock_reset, mock_bash, mock_cfg):
|
|
||||||
"""Config-loaded async-only tools can still be invoked by sync clients."""
|
"""Config-loaded async-only tools can still be invoked by sync clients."""
|
||||||
|
|
||||||
async def async_tool_impl(x: int) -> str:
|
async def async_tool_impl(x: int) -> str:
|
||||||
@@ -98,8 +97,7 @@ def test_config_loaded_async_only_tool_gets_sync_wrapper(mock_reset, mock_bash,
|
|||||||
|
|
||||||
@patch("deerflow.tools.tools.get_app_config")
|
@patch("deerflow.tools.tools.get_app_config")
|
||||||
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
||||||
@patch("deerflow.tools.tools.reset_deferred_registry")
|
def test_no_duplicates_returned(mock_bash, mock_cfg):
|
||||||
def test_no_duplicates_returned(mock_reset, mock_bash, mock_cfg):
|
|
||||||
"""get_available_tools() never returns two tools with the same name."""
|
"""get_available_tools() never returns two tools with the same name."""
|
||||||
mock_cfg.return_value = _make_minimal_config([])
|
mock_cfg.return_value = _make_minimal_config([])
|
||||||
|
|
||||||
@@ -113,8 +111,7 @@ def test_no_duplicates_returned(mock_reset, mock_bash, mock_cfg):
|
|||||||
|
|
||||||
@patch("deerflow.tools.tools.get_app_config")
|
@patch("deerflow.tools.tools.get_app_config")
|
||||||
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
||||||
@patch("deerflow.tools.tools.reset_deferred_registry")
|
def test_first_occurrence_wins(mock_bash, mock_cfg):
|
||||||
def test_first_occurrence_wins(mock_reset, mock_bash, mock_cfg):
|
|
||||||
"""When duplicates exist, the first occurrence is kept."""
|
"""When duplicates exist, the first occurrence is kept."""
|
||||||
mock_cfg.return_value = _make_minimal_config([])
|
mock_cfg.return_value = _make_minimal_config([])
|
||||||
|
|
||||||
@@ -132,8 +129,7 @@ def test_first_occurrence_wins(mock_reset, mock_bash, mock_cfg):
|
|||||||
|
|
||||||
@patch("deerflow.tools.tools.get_app_config")
|
@patch("deerflow.tools.tools.get_app_config")
|
||||||
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
@patch("deerflow.tools.tools.is_host_bash_allowed", return_value=True)
|
||||||
@patch("deerflow.tools.tools.reset_deferred_registry")
|
def test_duplicate_triggers_warning(mock_bash, mock_cfg, caplog):
|
||||||
def test_duplicate_triggers_warning(mock_reset, mock_bash, mock_cfg, caplog):
|
|
||||||
"""A warning is logged for every skipped duplicate."""
|
"""A warning is logged for every skipped duplicate."""
|
||||||
import logging
|
import logging
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,253 @@
|
|||||||
|
"""End-to-end verification for update_agent's user_id resolution.
|
||||||
|
|
||||||
|
PR #2784 hardened setup_agent to prefer runtime.context["user_id"] over the
|
||||||
|
contextvar. update_agent had the same latent gap: it unconditionally called
|
||||||
|
get_effective_user_id() at module level, so any scenario where the contextvar
|
||||||
|
was unavailable while runtime.context carried user_id (a background task
|
||||||
|
scheduled outside the request task, a worker pool that doesn't copy_context,
|
||||||
|
checkpoint resume on a different task) would silently route writes to
|
||||||
|
users/default/agents/...
|
||||||
|
|
||||||
|
These tests are load-bearing under @no_auto_user (contextvar empty):
|
||||||
|
|
||||||
|
- The negative-control test confirms the fixture actually puts the tool in
|
||||||
|
the regime where the contextvar fallback would land in users/default/.
|
||||||
|
Without that, the positive test would be vacuously satisfied.
|
||||||
|
- The positive test verifies update_agent honours runtime.context["user_id"]
|
||||||
|
injected by inject_authenticated_user_context in the gateway. Before the
|
||||||
|
fix in this PR, this test failed; now it passes.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from contextlib import ExitStack
|
||||||
|
from pathlib import Path
|
||||||
|
from types import SimpleNamespace
|
||||||
|
from unittest.mock import MagicMock, patch
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
import yaml
|
||||||
|
from _agent_e2e_helpers import build_single_tool_call_model
|
||||||
|
from langchain_core.messages import HumanMessage
|
||||||
|
|
||||||
|
from app.gateway.services import (
|
||||||
|
build_run_config,
|
||||||
|
inject_authenticated_user_context,
|
||||||
|
merge_run_context_overrides,
|
||||||
|
)
|
||||||
|
from deerflow.runtime.runs.worker import _build_runtime_context, _install_runtime_context
|
||||||
|
|
||||||
|
|
||||||
|
def _make_request(user_id_str: str | None) -> SimpleNamespace:
|
||||||
|
user = SimpleNamespace(id=UUID(user_id_str), email="alice@local") if user_id_str else None
|
||||||
|
return SimpleNamespace(state=SimpleNamespace(user=user))
|
||||||
|
|
||||||
|
|
||||||
|
def _assemble_config(*, body_context: dict | None, request_user_id: str | None, thread_id: str) -> dict:
|
||||||
|
config = build_run_config(thread_id, {"recursion_limit": 50}, None, assistant_id="lead_agent")
|
||||||
|
merge_run_context_overrides(config, body_context)
|
||||||
|
inject_authenticated_user_context(config, _make_request(request_user_id))
|
||||||
|
return config
|
||||||
|
|
||||||
|
|
||||||
|
def _seed_existing_agent(tmp_path: Path, user_id: str, agent_name: str, soul: str = "# Original"):
|
||||||
|
"""Pre-create an agent on disk for update_agent to overwrite."""
|
||||||
|
agent_dir = tmp_path / "users" / user_id / "agents" / agent_name
|
||||||
|
agent_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
(agent_dir / "config.yaml").write_text(
|
||||||
|
yaml.dump({"name": agent_name, "description": "old"}, allow_unicode=True),
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
(agent_dir / "SOUL.md").write_text(soul, encoding="utf-8")
|
||||||
|
return agent_dir
|
||||||
|
|
||||||
|
|
||||||
|
def _make_paths_mock(tmp_path: Path):
|
||||||
|
paths = MagicMock()
|
||||||
|
paths.base_dir = tmp_path
|
||||||
|
paths.agent_dir = lambda name: tmp_path / "agents" / name
|
||||||
|
paths.user_agent_dir = lambda user_id, name: tmp_path / "users" / user_id / "agents" / name
|
||||||
|
return paths
|
||||||
|
|
||||||
|
|
||||||
|
def _patch_update_agent_dependencies(tmp_path: Path):
|
||||||
|
"""update_agent reads load_agent_config + get_app_config — stub them
|
||||||
|
minimally so the tool can run without a real config file or LLM."""
|
||||||
|
fake_model_cfg = SimpleNamespace(name="fake-model")
|
||||||
|
fake_app_cfg = MagicMock()
|
||||||
|
fake_app_cfg.get_model_config = lambda name: fake_model_cfg if name == "fake-model" else None
|
||||||
|
|
||||||
|
return [
|
||||||
|
patch(
|
||||||
|
"deerflow.tools.builtins.update_agent_tool.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
),
|
||||||
|
patch(
|
||||||
|
"deerflow.tools.builtins.update_agent_tool.get_app_config",
|
||||||
|
return_value=fake_app_cfg,
|
||||||
|
),
|
||||||
|
# load_agent_config (used by update_agent to read existing config) also
|
||||||
|
# reads paths via its own module-level get_paths reference. Patch it too
|
||||||
|
# or the tool returns "Agent does not exist" before touching disk.
|
||||||
|
patch(
|
||||||
|
"deerflow.config.agents_config.get_paths",
|
||||||
|
return_value=_make_paths_mock(tmp_path),
|
||||||
|
),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def _build_update_graph(*, soul_payload: str):
|
||||||
|
from langchain.agents import create_agent
|
||||||
|
|
||||||
|
from deerflow.tools.builtins.update_agent_tool import update_agent
|
||||||
|
|
||||||
|
fake_model = build_single_tool_call_model(
|
||||||
|
tool_name="update_agent",
|
||||||
|
tool_args={"soul": soul_payload, "description": "refined"},
|
||||||
|
tool_call_id="call_update_1",
|
||||||
|
final_text="updated",
|
||||||
|
)
|
||||||
|
return create_agent(model=fake_model, tools=[update_agent], system_prompt="updater")
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Negative control — proves the test environment puts update_agent in the
|
||||||
|
# regime where the contextvar fallback would land in default/.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
def test_update_agent_falls_back_to_default_when_no_inject_and_no_contextvar(tmp_path: Path):
|
||||||
|
"""No request.state.user, no contextvar — update_agent must look in
|
||||||
|
users/default/agents/. We seed the file there so the tool succeeds and
|
||||||
|
we know which directory it actually consulted."""
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
_seed_existing_agent(tmp_path, "default", "fallback-target")
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_context={"agent_name": "fallback-target"},
|
||||||
|
request_user_id=None, # no auth, inject is no-op
|
||||||
|
thread_id="thread-update-1",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-update-1", "run-1", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
graph = _build_update_graph(soul_payload="# Fallback Updated")
|
||||||
|
|
||||||
|
with ExitStack() as stack:
|
||||||
|
for p in _patch_update_agent_dependencies(tmp_path):
|
||||||
|
stack.enter_context(p)
|
||||||
|
graph.invoke(
|
||||||
|
{"messages": [HumanMessage(content="update fallback-target")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
soul = (tmp_path / "users" / "default" / "agents" / "fallback-target" / "SOUL.md").read_text()
|
||||||
|
assert soul == "# Fallback Updated", "Sanity: tool should have written under default/"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Regression guard — passes on this branch, would fail on main before the fix.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.no_auto_user
|
||||||
|
def test_update_agent_should_use_runtime_context_user_id_when_contextvar_missing(tmp_path: Path):
|
||||||
|
"""update_agent prefers the authenticated user_id carried in
|
||||||
|
runtime.context (placed there by inject_authenticated_user_context)
|
||||||
|
over the contextvar — same contract as setup_agent (PR #2784).
|
||||||
|
|
||||||
|
Before this PR's fix, update_agent unconditionally called
|
||||||
|
get_effective_user_id() and landed in default/ whenever the contextvar
|
||||||
|
was unavailable. This test pins the corrected behaviour.
|
||||||
|
"""
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
auth_uid = "abcdef01-2345-6789-abcd-ef0123456789"
|
||||||
|
|
||||||
|
# Seed the agent in BOTH locations so we can prove which one was opened.
|
||||||
|
auth_dir = _seed_existing_agent(tmp_path, auth_uid, "shared-name", soul="# Auth Original")
|
||||||
|
default_dir = _seed_existing_agent(tmp_path, "default", "shared-name", soul="# Default Original")
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_context={"agent_name": "shared-name"},
|
||||||
|
request_user_id=auth_uid,
|
||||||
|
thread_id="thread-update-2",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-update-2", "run-2", config.get("context"), None)
|
||||||
|
assert runtime_ctx["user_id"] == auth_uid, "Pre-condition: inject must have placed user_id into runtime_ctx"
|
||||||
|
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
graph = _build_update_graph(soul_payload="# Auth Updated")
|
||||||
|
|
||||||
|
with ExitStack() as stack:
|
||||||
|
for p in _patch_update_agent_dependencies(tmp_path):
|
||||||
|
stack.enter_context(p)
|
||||||
|
graph.invoke(
|
||||||
|
{"messages": [HumanMessage(content="update shared-name")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
|
||||||
|
auth_soul = (auth_dir / "SOUL.md").read_text()
|
||||||
|
default_soul = (default_dir / "SOUL.md").read_text()
|
||||||
|
|
||||||
|
assert auth_soul == "# Auth Updated", f"REGRESSION: update_agent ignored runtime.context['user_id']={auth_uid!r} and routed the write to users/default/ instead. auth_soul={auth_soul!r}, default_soul={default_soul!r}"
|
||||||
|
assert default_soul == "# Default Original", "REGRESSION: update_agent corrupted the shared default-user agent. It should have written under the authenticated user's path."
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Positive — when contextvar IS the auth user (the normal HTTP case), things
|
||||||
|
# already work. Pin it as a regression guard so future refactors don't
|
||||||
|
# accidentally break the contextvar path in pursuit of the runtime-context fix.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_update_agent_uses_contextvar_when_present(tmp_path: Path, monkeypatch):
|
||||||
|
"""The normal HTTP case: contextvar is set by auth_middleware. This must
|
||||||
|
keep working regardless of how runtime.context is populated."""
|
||||||
|
from types import SimpleNamespace as _SN
|
||||||
|
|
||||||
|
from deerflow.runtime.user_context import reset_current_user, set_current_user
|
||||||
|
|
||||||
|
auth_uid = "11112222-3333-4444-5555-666677778888"
|
||||||
|
user = _SN(id=auth_uid, email="ctxvar@local")
|
||||||
|
|
||||||
|
_seed_existing_agent(tmp_path, auth_uid, "ctxvar-agent", soul="# Original")
|
||||||
|
|
||||||
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
|
config = _assemble_config(
|
||||||
|
body_context={"agent_name": "ctxvar-agent"},
|
||||||
|
request_user_id=auth_uid,
|
||||||
|
thread_id="thread-update-3",
|
||||||
|
)
|
||||||
|
runtime_ctx = _build_runtime_context("thread-update-3", "run-3", config.get("context"), None)
|
||||||
|
_install_runtime_context(config, runtime_ctx)
|
||||||
|
runtime = Runtime(context=runtime_ctx, store=None)
|
||||||
|
config.setdefault("configurable", {})["__pregel_runtime"] = runtime
|
||||||
|
|
||||||
|
graph = _build_update_graph(soul_payload="# CtxVar Updated")
|
||||||
|
|
||||||
|
with ExitStack() as stack:
|
||||||
|
for p in _patch_update_agent_dependencies(tmp_path):
|
||||||
|
stack.enter_context(p)
|
||||||
|
token = set_current_user(user)
|
||||||
|
try:
|
||||||
|
final = graph.invoke(
|
||||||
|
{"messages": [HumanMessage(content="update ctxvar-agent")]},
|
||||||
|
config=config,
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
reset_current_user(token)
|
||||||
|
|
||||||
|
# surface the tool's reply for debug if it errored
|
||||||
|
tool_replies = [m.content for m in final["messages"] if getattr(m, "type", "") == "tool"]
|
||||||
|
soul = (tmp_path / "users" / auth_uid / "agents" / "ctxvar-agent" / "SOUL.md").read_text()
|
||||||
|
assert soul == "# CtxVar Updated", f"tool replies: {tool_replies}"
|
||||||
Generated
+3
-3
@@ -2088,7 +2088,7 @@ wheels = [
|
|||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "langsmith"
|
name = "langsmith"
|
||||||
version = "0.7.36"
|
version = "0.8.0"
|
||||||
source = { registry = "https://pypi.org/simple" }
|
source = { registry = "https://pypi.org/simple" }
|
||||||
dependencies = [
|
dependencies = [
|
||||||
{ name = "httpx" },
|
{ name = "httpx" },
|
||||||
@@ -2101,9 +2101,9 @@ dependencies = [
|
|||||||
{ name = "xxhash" },
|
{ name = "xxhash" },
|
||||||
{ name = "zstandard" },
|
{ name = "zstandard" },
|
||||||
]
|
]
|
||||||
sdist = { url = "https://files.pythonhosted.org/packages/8d/4c/5f20508000ee0559bfa713b85c431b1cdc95d2913247ff9eb318e7fdff7b/langsmith-0.7.36.tar.gz", hash = "sha256:d18ef34819e0a252cf52c74ce6e9bd5de6deea4f85a3aef50abc9f48d8c5f8b8", size = 4402322, upload-time = "2026-04-24T16:58:06.681Z" }
|
sdist = { url = "https://files.pythonhosted.org/packages/a8/64/95f1f013531395f4e8ed73caeee780f65c7c58fe028cb543f8937b45611b/langsmith-0.8.0.tar.gz", hash = "sha256:59fe5b2a56bbbe14a08aa76691f84b49e8675dd21e11b57d80c6db8c08bac2e3", size = 4432996, upload-time = "2026-04-30T22:13:07.341Z" }
|
||||||
wheels = [
|
wheels = [
|
||||||
{ url = "https://files.pythonhosted.org/packages/f3/8d/3ca31ae3a4a437191243ad6d9061ede9367440bb7dc9a0da1ecc2c2a4865/langsmith-0.7.36-py3-none-any.whl", hash = "sha256:e1657a795f3f1982bb8d34c98b143b630ca3eee9de2c10e670c9105233b54654", size = 381808, upload-time = "2026-04-24T16:58:04.572Z" },
|
{ url = "https://files.pythonhosted.org/packages/f3/e1/a4be2e696c9473bb53298df398237da5674704d781d4b748ed35aeef592a/langsmith-0.8.0-py3-none-any.whl", hash = "sha256:12cc4bc5622b835a6d841964d6034df3617bdb912dae0c1381fd0a68a9b3a3ef", size = 393268, upload-time = "2026-04-30T22:13:05.56Z" },
|
||||||
]
|
]
|
||||||
|
|
||||||
[package.optional-dependencies]
|
[package.optional-dependencies]
|
||||||
|
|||||||
+3
-3
@@ -82,10 +82,10 @@ pnpm start
|
|||||||
Key environment variables (see `.env.example` for full list):
|
Key environment variables (see `.env.example` for full list):
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Backend API URLs (optional, uses nginx proxy by default)
|
# Backend API URL (optional, uses local Next.js/nginx proxy by default)
|
||||||
NEXT_PUBLIC_BACKEND_BASE_URL="http://localhost:8001"
|
NEXT_PUBLIC_BACKEND_BASE_URL="http://localhost:8001"
|
||||||
# LangGraph API URLs (optional, uses nginx proxy by default)
|
# LangGraph-compatible API URL (optional, uses local Next.js/nginx proxy by default)
|
||||||
NEXT_PUBLIC_LANGGRAPH_BASE_URL="http://localhost:2024"
|
NEXT_PUBLIC_LANGGRAPH_BASE_URL="http://localhost:8001/api"
|
||||||
```
|
```
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|||||||
@@ -12,13 +12,11 @@ function TokenUsageSummary({
|
|||||||
inputTokens,
|
inputTokens,
|
||||||
outputTokens,
|
outputTokens,
|
||||||
totalTokens,
|
totalTokens,
|
||||||
unavailable = false,
|
|
||||||
}: {
|
}: {
|
||||||
className?: string;
|
className?: string;
|
||||||
inputTokens?: number;
|
inputTokens?: number;
|
||||||
outputTokens?: number;
|
outputTokens?: number;
|
||||||
totalTokens?: number;
|
totalTokens?: number;
|
||||||
unavailable?: boolean;
|
|
||||||
}) {
|
}) {
|
||||||
const { t } = useI18n();
|
const { t } = useI18n();
|
||||||
|
|
||||||
@@ -33,21 +31,15 @@ function TokenUsageSummary({
|
|||||||
<CoinsIcon className="size-3" />
|
<CoinsIcon className="size-3" />
|
||||||
{t.tokenUsage.label}
|
{t.tokenUsage.label}
|
||||||
</span>
|
</span>
|
||||||
{!unavailable ? (
|
<span>
|
||||||
<>
|
{t.tokenUsage.input}: {formatTokenCount(inputTokens ?? 0)}
|
||||||
<span>
|
</span>
|
||||||
{t.tokenUsage.input}: {formatTokenCount(inputTokens ?? 0)}
|
<span>
|
||||||
</span>
|
{t.tokenUsage.output}: {formatTokenCount(outputTokens ?? 0)}
|
||||||
<span>
|
</span>
|
||||||
{t.tokenUsage.output}: {formatTokenCount(outputTokens ?? 0)}
|
<span className="font-medium">
|
||||||
</span>
|
{t.tokenUsage.total}: {formatTokenCount(totalTokens ?? 0)}
|
||||||
<span className="font-medium">
|
</span>
|
||||||
{t.tokenUsage.total}: {formatTokenCount(totalTokens ?? 0)}
|
|
||||||
</span>
|
|
||||||
</>
|
|
||||||
) : (
|
|
||||||
<span>{t.tokenUsage.unavailableShort}</span>
|
|
||||||
)}
|
|
||||||
</div>
|
</div>
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
@@ -55,7 +47,7 @@ function TokenUsageSummary({
|
|||||||
export function MessageTokenUsageList({
|
export function MessageTokenUsageList({
|
||||||
className,
|
className,
|
||||||
enabled = false,
|
enabled = false,
|
||||||
isLoading = false,
|
isLoading: _isLoading = false,
|
||||||
messages,
|
messages,
|
||||||
}: {
|
}: {
|
||||||
className?: string;
|
className?: string;
|
||||||
@@ -63,7 +55,7 @@ export function MessageTokenUsageList({
|
|||||||
isLoading?: boolean;
|
isLoading?: boolean;
|
||||||
messages: Message[];
|
messages: Message[];
|
||||||
}) {
|
}) {
|
||||||
if (!enabled || isLoading) {
|
if (!enabled) {
|
||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -75,13 +67,16 @@ export function MessageTokenUsageList({
|
|||||||
|
|
||||||
const usage = accumulateUsage(aiMessages);
|
const usage = accumulateUsage(aiMessages);
|
||||||
|
|
||||||
|
if (!usage) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<TokenUsageSummary
|
<TokenUsageSummary
|
||||||
className={className}
|
className={className}
|
||||||
inputTokens={usage?.inputTokens}
|
inputTokens={usage.inputTokens}
|
||||||
outputTokens={usage?.outputTokens}
|
outputTokens={usage.outputTokens}
|
||||||
totalTokens={usage?.totalTokens}
|
totalTokens={usage.totalTokens}
|
||||||
unavailable={!usage}
|
|
||||||
/>
|
/>
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -111,10 +111,9 @@ checkpointer:
|
|||||||
```
|
```
|
||||||
|
|
||||||
<Callout type="info">
|
<Callout type="info">
|
||||||
The LangGraph Server manages its own state separately. The
|
The Gateway embedded runtime uses the <code>checkpointer</code> setting in
|
||||||
<code>checkpointer</code> setting in <code>config.yaml</code> applies to the
|
<code>config.yaml</code>. The same setting is also used by
|
||||||
embedded <code>DeerFlowClient</code> (used in direct Python integrations), not
|
<code>DeerFlowClient</code> in direct Python integrations.
|
||||||
to the LangGraph Server deployment used by DeerFlow App.
|
|
||||||
</Callout>
|
</Callout>
|
||||||
|
|
||||||
### Thread data storage
|
### Thread data storage
|
||||||
|
|||||||
@@ -23,8 +23,7 @@ Services started:
|
|||||||
|
|
||||||
| Service | Port | Description |
|
| Service | Port | Description |
|
||||||
| ----------- | ---- | ------------------------ |
|
| ----------- | ---- | ------------------------ |
|
||||||
| LangGraph | 2024 | DeerFlow Harness runtime |
|
| Gateway API | 8001 | FastAPI backend + embedded agent runtime |
|
||||||
| Gateway API | 8001 | FastAPI backend |
|
|
||||||
| Frontend | 3000 | Next.js UI |
|
| Frontend | 3000 | Next.js UI |
|
||||||
| nginx | 2026 | Unified reverse proxy |
|
| nginx | 2026 | Unified reverse proxy |
|
||||||
|
|
||||||
@@ -36,13 +35,12 @@ Access the app at **http://localhost:2026**.
|
|||||||
make stop
|
make stop
|
||||||
```
|
```
|
||||||
|
|
||||||
Stops all four services. Safe to run even if a service is not running.
|
Stops all services. Safe to run even if a service is not running.
|
||||||
|
|
||||||
</Tabs.Tab>
|
</Tabs.Tab>
|
||||||
<Tabs.Tab>
|
<Tabs.Tab>
|
||||||
```
|
```
|
||||||
logs/langgraph.log # Agent runtime logs
|
logs/gateway.log # API gateway and agent runtime logs
|
||||||
logs/gateway.log # API gateway logs
|
|
||||||
logs/frontend.log # Next.js dev server logs
|
logs/frontend.log # Next.js dev server logs
|
||||||
logs/nginx.log # nginx access/error logs
|
logs/nginx.log # nginx access/error logs
|
||||||
```
|
```
|
||||||
@@ -50,7 +48,7 @@ logs/nginx.log # nginx access/error logs
|
|||||||
Tail a log in real time:
|
Tail a log in real time:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
tail -f logs/langgraph.log
|
tail -f logs/gateway.log
|
||||||
```
|
```
|
||||||
|
|
||||||
</Tabs.Tab>
|
</Tabs.Tab>
|
||||||
@@ -74,7 +72,7 @@ export DEER_FLOW_ROOT=/path/to/deer-flow
|
|||||||
docker compose -f docker/docker-compose-dev.yaml up --build
|
docker compose -f docker/docker-compose-dev.yaml up --build
|
||||||
```
|
```
|
||||||
|
|
||||||
Services: nginx, frontend, gateway, langgraph, and optionally provisioner (for K8s-managed sandboxes).
|
Services: nginx, frontend, gateway, and optionally provisioner (for K8s-managed sandboxes).
|
||||||
|
|
||||||
Access the app at **http://localhost:2026**.
|
Access the app at **http://localhost:2026**.
|
||||||
|
|
||||||
@@ -99,7 +97,7 @@ The `docker-compose*.yaml` files include an `env_file: ../.env` directive that l
|
|||||||
|
|
||||||
### Data persistence
|
### Data persistence
|
||||||
|
|
||||||
Thread data is stored in `backend/.deer-flow/threads/`. In Docker deployments, this directory is bind-mounted into the langgraph container.
|
Thread data is stored in `backend/.deer-flow/threads/`. In Docker deployments, this directory is bind-mounted into the gateway container.
|
||||||
|
|
||||||
To avoid data loss when containers are recreated:
|
To avoid data loss when containers are recreated:
|
||||||
|
|
||||||
@@ -161,14 +159,7 @@ When `USERDATA_PVC_NAME` is set, the provisioner automatically uses subPath (`th
|
|||||||
|
|
||||||
### nginx configuration
|
### nginx configuration
|
||||||
|
|
||||||
nginx routes all traffic. Key environment variables that control routing:
|
nginx routes all traffic to the frontend or Gateway. `/api/langgraph/*` is rewritten to Gateway's LangGraph-compatible `/api/*` routes, so no separate LangGraph upstream is required.
|
||||||
|
|
||||||
| Variable | Default | Description |
|
|
||||||
| -------------------- | ---------------- | --------------------------------------- |
|
|
||||||
| `LANGGRAPH_UPSTREAM` | `langgraph:2024` | LangGraph service address |
|
|
||||||
| `LANGGRAPH_REWRITE` | `/` | URL rewrite prefix for LangGraph routes |
|
|
||||||
|
|
||||||
These are set in the Docker Compose environment and processed by `envsubst` at container startup.
|
|
||||||
|
|
||||||
### Authentication
|
### Authentication
|
||||||
|
|
||||||
@@ -186,8 +177,7 @@ openssl rand -base64 32
|
|||||||
|
|
||||||
| Service | Minimum | Recommended |
|
| Service | Minimum | Recommended |
|
||||||
| ------------------------------- | ---------------- | ---------------- |
|
| ------------------------------- | ---------------- | ---------------- |
|
||||||
| LangGraph (agent runtime) | 2 vCPU, 4 GB RAM | 4 vCPU, 8 GB RAM |
|
| Gateway + agent runtime | 2 vCPU, 4 GB RAM | 4 vCPU, 8 GB RAM |
|
||||||
| Gateway | 0.5 vCPU, 512 MB | 1 vCPU, 1 GB |
|
|
||||||
| Frontend | 0.5 vCPU, 512 MB | 1 vCPU, 1 GB |
|
| Frontend | 0.5 vCPU, 512 MB | 1 vCPU, 1 GB |
|
||||||
| Sandbox container (per session) | 1 vCPU, 1 GB | 2 vCPU, 2 GB |
|
| Sandbox container (per session) | 1 vCPU, 1 GB | 2 vCPU, 2 GB |
|
||||||
|
|
||||||
@@ -199,9 +189,6 @@ After starting, verify the deployment:
|
|||||||
# Check Gateway health
|
# Check Gateway health
|
||||||
curl http://localhost:8001/health
|
curl http://localhost:8001/health
|
||||||
|
|
||||||
# Check LangGraph health
|
|
||||||
curl http://localhost:2024/ok
|
|
||||||
|
|
||||||
# List configured models (through nginx)
|
# List configured models (through nginx)
|
||||||
curl http://localhost:2026/api/models
|
curl http://localhost:2026/api/models
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -25,11 +25,11 @@ DeerFlow App is the reference implementation of what a production DeerFlow exper
|
|||||||
| **Streaming responses** | Real-time token streaming with thinking steps and tool call visibility |
|
| **Streaming responses** | Real-time token streaming with thinking steps and tool call visibility |
|
||||||
| **Artifact viewer** | In-browser preview and download of files and outputs produced by the agent |
|
| **Artifact viewer** | In-browser preview and download of files and outputs produced by the agent |
|
||||||
| **Extensions UI** | Enable/disable MCP servers and skills without editing config files |
|
| **Extensions UI** | Enable/disable MCP servers and skills without editing config files |
|
||||||
| **Gateway API** | FastAPI-based REST API that bridges the frontend and the LangGraph runtime |
|
| **Gateway API** | FastAPI-based REST API with the embedded LangGraph-compatible agent runtime |
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
The DeerFlow App runs as four services behind a single nginx reverse proxy:
|
The DeerFlow App runs behind a single nginx reverse proxy:
|
||||||
|
|
||||||
```
|
```
|
||||||
┌──────────────────┐
|
┌──────────────────┐
|
||||||
@@ -42,19 +42,11 @@ The DeerFlow App runs as four services behind a single nginx reverse proxy:
|
|||||||
│ Frontend :3000 │ │ Gateway API :8001 │
|
│ Frontend :3000 │ │ Gateway API :8001 │
|
||||||
│ (Next.js) │ │ (FastAPI) │
|
│ (Next.js) │ │ (FastAPI) │
|
||||||
└──────────────────┘ └──────────────────────┘
|
└──────────────────┘ └──────────────────────┘
|
||||||
│
|
|
||||||
┌─────────┘
|
|
||||||
▼
|
|
||||||
┌──────────────────────┐
|
|
||||||
│ LangGraph :2024 │
|
|
||||||
│ (DeerFlow Harness) │
|
|
||||||
└──────────────────────┘
|
|
||||||
```
|
```
|
||||||
|
|
||||||
- **nginx**: routes requests — `/api/*` to the Gateway, LangGraph streaming endpoints to LangGraph directly, and everything else to the frontend.
|
- **nginx**: routes requests — `/api/*` and `/api/langgraph/*` to Gateway, and everything else to the frontend.
|
||||||
- **Frontend** (Next.js + React): the browser UI. Communicates with both the Gateway and LangGraph.
|
- **Frontend** (Next.js + React): the browser UI. Communicates with Gateway.
|
||||||
- **Gateway** (FastAPI): handles API operations — model listing, agent CRUD, memory, extensions management, file uploads.
|
- **Gateway** (FastAPI): handles API operations and the embedded LangGraph-compatible runtime for thread state, agent execution, and streaming.
|
||||||
- **LangGraph**: the DeerFlow Harness runtime. Manages thread state, agent execution, and streaming.
|
|
||||||
|
|
||||||
## Technology stack
|
## Technology stack
|
||||||
|
|
||||||
@@ -64,7 +56,7 @@ The DeerFlow App runs as four services behind a single nginx reverse proxy:
|
|||||||
| Gateway | FastAPI, Python 3.12, uvicorn |
|
| Gateway | FastAPI, Python 3.12, uvicorn |
|
||||||
| Agent runtime | LangGraph, LangChain, DeerFlow Harness |
|
| Agent runtime | LangGraph, LangChain, DeerFlow Harness |
|
||||||
| Reverse proxy | nginx |
|
| Reverse proxy | nginx |
|
||||||
| State persistence | LangGraph Server (default) + optional SQLite/PostgreSQL checkpointer |
|
| State persistence | Gateway runtime + optional SQLite/PostgreSQL checkpointer |
|
||||||
|
|
||||||
<Cards num={2}>
|
<Cards num={2}>
|
||||||
<Cards.Card title="Quick Start" href="/docs/application/quick-start" />
|
<Cards.Card title="Quick Start" href="/docs/application/quick-start" />
|
||||||
|
|||||||
@@ -15,15 +15,13 @@ All services write logs to the `logs/` directory when started with `make dev`:
|
|||||||
|
|
||||||
| File | Service |
|
| File | Service |
|
||||||
| -------------------- | ------------------------------------ |
|
| -------------------- | ------------------------------------ |
|
||||||
| `logs/langgraph.log` | LangGraph / DeerFlow Harness runtime |
|
| `logs/gateway.log` | FastAPI Gateway API and agent runtime |
|
||||||
| `logs/gateway.log` | FastAPI Gateway API |
|
|
||||||
| `logs/frontend.log` | Next.js frontend dev server |
|
| `logs/frontend.log` | Next.js frontend dev server |
|
||||||
| `logs/nginx.log` | nginx reverse proxy |
|
| `logs/nginx.log` | nginx reverse proxy |
|
||||||
|
|
||||||
Tail logs in real time:
|
Tail logs in real time:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
tail -f logs/langgraph.log
|
|
||||||
tail -f logs/gateway.log
|
tail -f logs/gateway.log
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -41,9 +39,6 @@ Verify each service is responding:
|
|||||||
# Gateway health
|
# Gateway health
|
||||||
curl http://localhost:8001/health
|
curl http://localhost:8001/health
|
||||||
|
|
||||||
# LangGraph health
|
|
||||||
curl http://localhost:2024/ok
|
|
||||||
|
|
||||||
# Through nginx (verifies full proxy chain)
|
# Through nginx (verifies full proxy chain)
|
||||||
curl http://localhost:2026/api/models
|
curl http://localhost:2026/api/models
|
||||||
```
|
```
|
||||||
@@ -66,7 +61,7 @@ grep config_version config.yaml
|
|||||||
|
|
||||||
### The app loads but the agent doesn't respond
|
### The app loads but the agent doesn't respond
|
||||||
|
|
||||||
1. Check `logs/langgraph.log` for startup errors.
|
1. Check `logs/gateway.log` for startup errors.
|
||||||
2. Verify your model is correctly configured in `config.yaml` with a valid API key.
|
2. Verify your model is correctly configured in `config.yaml` with a valid API key.
|
||||||
3. Confirm the API key environment variable is set in the shell that ran `make dev`.
|
3. Confirm the API key environment variable is set in the shell that ran `make dev`.
|
||||||
4. Test the model endpoint directly with `curl` to rule out network issues.
|
4. Test the model endpoint directly with `curl` to rule out network issues.
|
||||||
@@ -126,7 +121,7 @@ Connection refused: http://provisioner:8002
|
|||||||
|
|
||||||
If MCP tools appear in `extensions_config.json` but are not available in the agent:
|
If MCP tools appear in `extensions_config.json` but are not available in the agent:
|
||||||
|
|
||||||
1. Check `logs/langgraph.log` for MCP initialization errors.
|
1. Check `logs/gateway.log` for MCP initialization errors.
|
||||||
2. Verify the MCP server command is installed (`npx`, `uvx`, or the relevant binary).
|
2. Verify the MCP server command is installed (`npx`, `uvx`, or the relevant binary).
|
||||||
3. Test the server command manually to confirm it starts without errors.
|
3. Test the server command manually to confirm it starts without errors.
|
||||||
4. Set `log_level: debug` to see detailed MCP loading output.
|
4. Set `log_level: debug` to see detailed MCP loading output.
|
||||||
@@ -137,7 +132,7 @@ If MCP tools appear in `extensions_config.json` but are not available in the age
|
|||||||
|
|
||||||
- Verify `memory.enabled: true` in `config.yaml`.
|
- Verify `memory.enabled: true` in `config.yaml`.
|
||||||
- Check that the storage path is writable: `ls -la backend/.deer-flow/`.
|
- Check that the storage path is writable: `ls -la backend/.deer-flow/`.
|
||||||
- Look for memory update errors in `logs/langgraph.log` (search for "memory").
|
- Look for memory update errors in `logs/gateway.log` (search for "memory").
|
||||||
|
|
||||||
## Data backup
|
## Data backup
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
title: Quick Start
|
title: Quick Start
|
||||||
description: This guide walks you through starting DeerFlow App on your local machine using the `make dev` workflow. All four services (LangGraph, Gateway, Frontend, nginx) start together and are accessible through a single URL.
|
description: This guide walks you through starting DeerFlow App on your local machine using the `make dev` workflow. Gateway, Frontend, and nginx start together and are accessible through a single URL.
|
||||||
---
|
---
|
||||||
|
|
||||||
import { Callout, Cards, Steps } from "nextra/components";
|
import { Callout, Cards, Steps } from "nextra/components";
|
||||||
@@ -12,7 +12,7 @@ import { Callout, Cards, Steps } from "nextra/components";
|
|||||||
Python 3.12+, Node.js 22+, and at least one LLM API key.
|
Python 3.12+, Node.js 22+, and at least one LLM API key.
|
||||||
</Callout>
|
</Callout>
|
||||||
|
|
||||||
This guide walks you through starting DeerFlow App on your local machine using the `make dev` workflow. All four services (LangGraph, Gateway, Frontend, nginx) start together and are accessible through a single URL.
|
This guide walks you through starting DeerFlow App on your local machine using the `make dev` workflow. Gateway, Frontend, and nginx start together and are accessible through a single URL.
|
||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
@@ -88,8 +88,7 @@ make dev
|
|||||||
|
|
||||||
This starts:
|
This starts:
|
||||||
|
|
||||||
- LangGraph server on port `2024`
|
- Gateway API and embedded agent runtime on port `8001`
|
||||||
- Gateway API on port `8001`
|
|
||||||
- Frontend on port `3000`
|
- Frontend on port `3000`
|
||||||
- nginx reverse proxy on port `2026`
|
- nginx reverse proxy on port `2026`
|
||||||
|
|
||||||
@@ -113,15 +112,13 @@ Log files:
|
|||||||
|
|
||||||
| Service | Log file |
|
| Service | Log file |
|
||||||
| --------- | -------------------- |
|
| --------- | -------------------- |
|
||||||
| LangGraph | `logs/langgraph.log` |
|
|
||||||
| Gateway | `logs/gateway.log` |
|
| Gateway | `logs/gateway.log` |
|
||||||
| Frontend | `logs/frontend.log` |
|
| Frontend | `logs/frontend.log` |
|
||||||
| nginx | `logs/nginx.log` |
|
| nginx | `logs/nginx.log` |
|
||||||
|
|
||||||
<Callout type="tip">
|
<Callout type="tip">
|
||||||
If something is not working, check the log files first. Most startup errors
|
If something is not working, check the log files first. Most startup errors
|
||||||
(missing API keys, config parsing failures) appear in `logs/langgraph.log` or
|
(missing API keys, config parsing failures) appear in `logs/gateway.log`.
|
||||||
`logs/gateway.log`.
|
|
||||||
</Callout>
|
</Callout>
|
||||||
|
|
||||||
<Cards num={2}>
|
<Cards num={2}>
|
||||||
|
|||||||
@@ -68,7 +68,7 @@ DeerFlow ships with the following public skills:
|
|||||||
|
|
||||||
### Discovery and loading
|
### Discovery and loading
|
||||||
|
|
||||||
`load_skills()` in `skills/loader.py` scans both `public/` and `custom/` directories under the configured skills path. It re-reads `ExtensionsConfig.from_file()` on every call, which means enabling or disabling a skill through the Gateway API takes effect immediately in the running LangGraph server without a restart.
|
`load_skills()` in `skills/loader.py` scans both `public/` and `custom/` directories under the configured skills path. It re-reads `ExtensionsConfig.from_file()` on every call, which means enabling or disabling a skill through the Gateway API takes effect immediately in the running agent runtime without a restart.
|
||||||
|
|
||||||
### Parsing
|
### Parsing
|
||||||
|
|
||||||
|
|||||||
@@ -215,7 +215,6 @@ BETTER_AUTH_SECRET=local-dev-secret-at-least-32-chars
|
|||||||
| `DEER_FLOW_CONFIG_PATH` | 自动发现 | `config.yaml` 的绝对路径 |
|
| `DEER_FLOW_CONFIG_PATH` | 自动发现 | `config.yaml` 的绝对路径 |
|
||||||
| `LOG_LEVEL` | `info` | 日志详细程度(`debug`/`info`/`warning`/`error`) |
|
| `LOG_LEVEL` | `info` | 日志详细程度(`debug`/`info`/`warning`/`error`) |
|
||||||
| `DEER_FLOW_ROOT` | 仓库根目录 | 用于 Docker 中的技能和线程挂载 |
|
| `DEER_FLOW_ROOT` | 仓库根目录 | 用于 Docker 中的技能和线程挂载 |
|
||||||
| `LANGGRAPH_UPSTREAM` | `langgraph:2024` | nginx 代理的 LangGraph 地址 |
|
|
||||||
|
|
||||||
<Cards num={2}>
|
<Cards num={2}>
|
||||||
<Cards.Card title="Harness 配置" href="/docs/harness/configuration" />
|
<Cards.Card title="Harness 配置" href="/docs/harness/configuration" />
|
||||||
|
|||||||
@@ -23,8 +23,7 @@ make dev
|
|||||||
|
|
||||||
| 服务 | 端口 | 描述 |
|
| 服务 | 端口 | 描述 |
|
||||||
| ----------- | ---- | ----------------------- |
|
| ----------- | ---- | ----------------------- |
|
||||||
| LangGraph | 2024 | DeerFlow Harness 运行时 |
|
| Gateway API | 8001 | FastAPI 后端 + 嵌入式 Agent 运行时 |
|
||||||
| Gateway API | 8001 | FastAPI 后端 |
|
|
||||||
| 前端 | 3000 | Next.js 界面 |
|
| 前端 | 3000 | Next.js 界面 |
|
||||||
| nginx | 2026 | 统一反向代理 |
|
| nginx | 2026 | 统一反向代理 |
|
||||||
|
|
||||||
@@ -36,13 +35,12 @@ make dev
|
|||||||
make stop
|
make stop
|
||||||
```
|
```
|
||||||
|
|
||||||
停止所有四个服务。即使某个服务没有运行也可以安全执行。
|
停止所有服务。即使某个服务没有运行也可以安全执行。
|
||||||
|
|
||||||
</Tabs.Tab>
|
</Tabs.Tab>
|
||||||
<Tabs.Tab>
|
<Tabs.Tab>
|
||||||
```
|
```
|
||||||
logs/langgraph.log # Agent 运行时日志
|
logs/gateway.log # API Gateway 和 Agent 运行时日志
|
||||||
logs/gateway.log # API Gateway 日志
|
|
||||||
logs/frontend.log # Next.js 开发服务器日志
|
logs/frontend.log # Next.js 开发服务器日志
|
||||||
logs/nginx.log # nginx 访问/错误日志
|
logs/nginx.log # nginx 访问/错误日志
|
||||||
```
|
```
|
||||||
@@ -50,7 +48,7 @@ logs/nginx.log # nginx 访问/错误日志
|
|||||||
实时追踪日志:
|
实时追踪日志:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
tail -f logs/langgraph.log
|
tail -f logs/gateway.log
|
||||||
```
|
```
|
||||||
|
|
||||||
</Tabs.Tab>
|
</Tabs.Tab>
|
||||||
@@ -96,7 +94,7 @@ BETTER_AUTH_SECRET=your-secret-here-min-32-chars
|
|||||||
|
|
||||||
### 数据持久化
|
### 数据持久化
|
||||||
|
|
||||||
线程数据存储在 `backend/.deer-flow/threads/`。在 Docker 部署中,此目录被绑定挂载到 langgraph 容器中。
|
线程数据存储在 `backend/.deer-flow/threads/`。在 Docker 部署中,此目录会绑定挂载到 gateway 容器中。
|
||||||
|
|
||||||
为避免容器重建时数据丢失:
|
为避免容器重建时数据丢失:
|
||||||
|
|
||||||
@@ -156,14 +154,7 @@ SKILLS_PVC_NAME=deer-flow-skills-pvc
|
|||||||
|
|
||||||
### nginx 配置
|
### nginx 配置
|
||||||
|
|
||||||
nginx 路由所有流量,控制路由的关键环境变量:
|
nginx 将流量路由到前端或 Gateway。`/api/langgraph/*` 会被重写到 Gateway 的 LangGraph-compatible `/api/*` 路由,因此不需要单独的 LangGraph upstream。
|
||||||
|
|
||||||
| 变量 | 默认值 | 描述 |
|
|
||||||
| -------------------- | ---------------- | ----------------------------- |
|
|
||||||
| `LANGGRAPH_UPSTREAM` | `langgraph:2024` | LangGraph 服务地址 |
|
|
||||||
| `LANGGRAPH_REWRITE` | `/` | LangGraph 路由的 URL 重写前缀 |
|
|
||||||
|
|
||||||
这些在 Docker Compose 环境中设置,并在容器启动时由 `envsubst` 处理。
|
|
||||||
|
|
||||||
### 认证配置
|
### 认证配置
|
||||||
|
|
||||||
@@ -181,8 +172,7 @@ openssl rand -base64 32
|
|||||||
|
|
||||||
| 服务 | 最低配置 | 推荐配置 |
|
| 服务 | 最低配置 | 推荐配置 |
|
||||||
| ------------------------- | ---------------- | ---------------- |
|
| ------------------------- | ---------------- | ---------------- |
|
||||||
| LangGraph(Agent 运行时) | 2 vCPU、4 GB RAM | 4 vCPU、8 GB RAM |
|
| Gateway + Agent 运行时 | 2 vCPU、4 GB RAM | 4 vCPU、8 GB RAM |
|
||||||
| Gateway | 0.5 vCPU、512 MB | 1 vCPU、1 GB |
|
|
||||||
| 前端 | 0.5 vCPU、512 MB | 1 vCPU、1 GB |
|
| 前端 | 0.5 vCPU、512 MB | 1 vCPU、1 GB |
|
||||||
| 沙箱容器(每会话) | 1 vCPU、1 GB | 2 vCPU、2 GB |
|
| 沙箱容器(每会话) | 1 vCPU、1 GB | 2 vCPU、2 GB |
|
||||||
|
|
||||||
@@ -194,9 +184,6 @@ openssl rand -base64 32
|
|||||||
# 检查 Gateway 健康状态
|
# 检查 Gateway 健康状态
|
||||||
curl http://localhost:8001/health
|
curl http://localhost:8001/health
|
||||||
|
|
||||||
# 检查 LangGraph 健康状态
|
|
||||||
curl http://localhost:2024/ok
|
|
||||||
|
|
||||||
# 通过 nginx 列出配置的模型(验证完整代理链)
|
# 通过 nginx 列出配置的模型(验证完整代理链)
|
||||||
curl http://localhost:2026/api/models
|
curl http://localhost:2026/api/models
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -25,11 +25,11 @@ DeerFlow 应用是 DeerFlow 生产体验的参考实现。它将 Harness 运行
|
|||||||
| **流式响应** | 实时 token 流式传输,带思考步骤和工具调用可见性 |
|
| **流式响应** | 实时 token 流式传输,带思考步骤和工具调用可见性 |
|
||||||
| **产出物查看器** | Agent 生成文件和输出的浏览器内预览和下载 |
|
| **产出物查看器** | Agent 生成文件和输出的浏览器内预览和下载 |
|
||||||
| **扩展界面** | 无需编辑配置文件即可启用/禁用 MCP 服务器和技能 |
|
| **扩展界面** | 无需编辑配置文件即可启用/禁用 MCP 服务器和技能 |
|
||||||
| **Gateway API** | 桥接前端和 LangGraph 运行时的基于 FastAPI 的 REST API |
|
| **Gateway API** | 基于 FastAPI 的 REST API,并内置 LangGraph-compatible Agent 运行时 |
|
||||||
|
|
||||||
## 架构
|
## 架构
|
||||||
|
|
||||||
DeerFlow 应用以四个服务的形式运行,通过单个 nginx 反向代理提供:
|
DeerFlow 应用通过单个 nginx 反向代理提供:
|
||||||
|
|
||||||
```
|
```
|
||||||
┌──────────────────┐
|
┌──────────────────┐
|
||||||
@@ -42,19 +42,11 @@ DeerFlow 应用以四个服务的形式运行,通过单个 nginx 反向代理
|
|||||||
│ 前端 :3000 │ │ Gateway API :8001 │
|
│ 前端 :3000 │ │ Gateway API :8001 │
|
||||||
│ (Next.js) │ │ (FastAPI) │
|
│ (Next.js) │ │ (FastAPI) │
|
||||||
└──────────────────┘ └──────────────────────┘
|
└──────────────────┘ └──────────────────────┘
|
||||||
│
|
|
||||||
┌─────────┘
|
|
||||||
▼
|
|
||||||
┌──────────────────────┐
|
|
||||||
│ LangGraph :2024 │
|
|
||||||
│ (DeerFlow Harness) │
|
|
||||||
└──────────────────────┘
|
|
||||||
```
|
```
|
||||||
|
|
||||||
- **nginx**:路由请求——`/api/*` 到 Gateway,LangGraph 流式端点到 LangGraph,其余到前端。
|
- **nginx**:路由请求——`/api/*` 和 `/api/langgraph/*` 到 Gateway,其余到前端。
|
||||||
- **前端**(Next.js + React):浏览器界面,与 Gateway 和 LangGraph 通信。
|
- **前端**(Next.js + React):浏览器界面,与 Gateway 通信。
|
||||||
- **Gateway**(FastAPI):处理 API 操作——模型列表、Agent CRUD、记忆、扩展管理、文件上传。
|
- **Gateway**(FastAPI):处理 API 操作,并通过内置 LangGraph-compatible 运行时管理线程状态、Agent 执行和流式传输。
|
||||||
- **LangGraph**:DeerFlow Harness 运行时,管理线程状态、Agent 执行和流式传输。
|
|
||||||
|
|
||||||
## 技术栈
|
## 技术栈
|
||||||
|
|
||||||
@@ -64,7 +56,7 @@ DeerFlow 应用以四个服务的形式运行,通过单个 nginx 反向代理
|
|||||||
| Gateway | FastAPI、Python 3.12、uvicorn |
|
| Gateway | FastAPI、Python 3.12、uvicorn |
|
||||||
| Agent 运行时 | LangGraph、LangChain、DeerFlow Harness |
|
| Agent 运行时 | LangGraph、LangChain、DeerFlow Harness |
|
||||||
| 反向代理 | nginx |
|
| 反向代理 | nginx |
|
||||||
| 状态持久化 | LangGraph Server(默认)+ 可选 SQLite/PostgreSQL 检查点 |
|
| 状态持久化 | Gateway 运行时 + 可选 SQLite/PostgreSQL 检查点 |
|
||||||
|
|
||||||
<Cards num={2}>
|
<Cards num={2}>
|
||||||
<Cards.Card title="快速上手" href="/docs/application/quick-start" />
|
<Cards.Card title="快速上手" href="/docs/application/quick-start" />
|
||||||
|
|||||||
@@ -15,16 +15,14 @@ DeerFlow 应用在 `logs/` 目录中写入每个服务的日志:
|
|||||||
|
|
||||||
| 文件 | 内容 |
|
| 文件 | 内容 |
|
||||||
| -------------------- | -------------------------------------- |
|
| -------------------- | -------------------------------------- |
|
||||||
| `logs/langgraph.log` | Agent 运行时、工具调用、LangGraph 错误 |
|
| `logs/gateway.log` | API 请求/响应、Agent 运行时和 Gateway 错误 |
|
||||||
| `logs/gateway.log` | API 请求/响应、Gateway 错误 |
|
|
||||||
| `logs/frontend.log` | Next.js 服务器日志 |
|
| `logs/frontend.log` | Next.js 服务器日志 |
|
||||||
| `logs/nginx.log` | 代理访问和错误日志 |
|
| `logs/nginx.log` | 代理访问和错误日志 |
|
||||||
|
|
||||||
**实时追踪日志**:
|
**实时追踪日志**:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
tail -f logs/langgraph.log # 查看 Agent 活动
|
tail -f logs/gateway.log # 查看 API 请求和 Agent 活动
|
||||||
tail -f logs/gateway.log # 查看 API 请求
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**调整日志级别**:
|
**调整日志级别**:
|
||||||
@@ -42,9 +40,6 @@ DeerFlow 暴露健康检查端点:
|
|||||||
# Gateway 健康状态
|
# Gateway 健康状态
|
||||||
curl http://localhost:8001/health
|
curl http://localhost:8001/health
|
||||||
|
|
||||||
# LangGraph 健康状态
|
|
||||||
curl http://localhost:2024/ok
|
|
||||||
|
|
||||||
# 通过 nginx 完整代理链验证
|
# 通过 nginx 完整代理链验证
|
||||||
curl http://localhost:2026/api/models
|
curl http://localhost:2026/api/models
|
||||||
```
|
```
|
||||||
@@ -68,8 +63,8 @@ make config-upgrade
|
|||||||
**诊断**:
|
**诊断**:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 检查 LangGraph 日志中的模型错误
|
# 检查 Gateway 日志中的模型错误
|
||||||
grep -i "error\|apikey\|unauthorized" logs/langgraph.log | tail -20
|
grep -i "error\|apikey\|unauthorized" logs/gateway.log | tail -20
|
||||||
```
|
```
|
||||||
|
|
||||||
**解决**:
|
**解决**:
|
||||||
@@ -118,13 +113,13 @@ SKIP_ENV_VALIDATION=1 pnpm build
|
|||||||
|
|
||||||
### MCP 服务器连接失败
|
### MCP 服务器连接失败
|
||||||
|
|
||||||
**症状**:MCP 工具未出现,`logs/langgraph.log` 中有超时错误。
|
**症状**:MCP 工具未出现,`logs/gateway.log` 中有超时错误。
|
||||||
|
|
||||||
**诊断**:
|
**诊断**:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 检查 MCP 相关错误
|
# 检查 MCP 相关错误
|
||||||
grep -i "mcp\|timeout" logs/langgraph.log | tail -20
|
grep -i "mcp\|timeout" logs/gateway.log | tail -20
|
||||||
```
|
```
|
||||||
|
|
||||||
**解决**:
|
**解决**:
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
title: 快速上手
|
title: 快速上手
|
||||||
description: 本指南引导你使用 `make dev` 工作流在本地机器上启动 DeerFlow 应用。所有四个服务(LangGraph、Gateway、前端、nginx)一起启动,通过单个 URL 访问。
|
description: 本指南引导你使用 `make dev` 工作流在本地机器上启动 DeerFlow 应用。Gateway、前端和 nginx 会一起启动,通过单个 URL 访问。
|
||||||
---
|
---
|
||||||
|
|
||||||
import { Callout, Cards, Steps } from "nextra/components";
|
import { Callout, Cards, Steps } from "nextra/components";
|
||||||
@@ -12,7 +12,7 @@ import { Callout, Cards, Steps } from "nextra/components";
|
|||||||
3.12+、Node.js 22+ 的机器,以及至少一个 LLM API Key。
|
3.12+、Node.js 22+ 的机器,以及至少一个 LLM API Key。
|
||||||
</Callout>
|
</Callout>
|
||||||
|
|
||||||
本指南引导你使用 `make dev` 工作流在本地机器上启动 DeerFlow 应用。所有四个服务(LangGraph、Gateway、前端、nginx)一起启动,通过单个 URL 访问。
|
本指南引导你使用 `make dev` 工作流在本地机器上启动 DeerFlow 应用。Gateway、前端和 nginx 会一起启动,通过单个 URL 访问。
|
||||||
|
|
||||||
## 前置条件
|
## 前置条件
|
||||||
|
|
||||||
@@ -88,8 +88,7 @@ make dev
|
|||||||
|
|
||||||
这会启动:
|
这会启动:
|
||||||
|
|
||||||
- LangGraph 服务,端口 `2024`
|
- Gateway API 和嵌入式 Agent 运行时,端口 `8001`
|
||||||
- Gateway API,端口 `8001`
|
|
||||||
- 前端,端口 `3000`
|
- 前端,端口 `3000`
|
||||||
- nginx 反向代理,端口 `2026`
|
- nginx 反向代理,端口 `2026`
|
||||||
|
|
||||||
@@ -113,15 +112,13 @@ make stop
|
|||||||
|
|
||||||
| 服务 | 日志文件 |
|
| 服务 | 日志文件 |
|
||||||
| --------- | -------------------- |
|
| --------- | -------------------- |
|
||||||
| LangGraph | `logs/langgraph.log` |
|
|
||||||
| Gateway | `logs/gateway.log` |
|
| Gateway | `logs/gateway.log` |
|
||||||
| 前端 | `logs/frontend.log` |
|
| 前端 | `logs/frontend.log` |
|
||||||
| nginx | `logs/nginx.log` |
|
| nginx | `logs/nginx.log` |
|
||||||
|
|
||||||
<Callout type="tip">
|
<Callout type="tip">
|
||||||
如果有问题,先检查日志文件。大多数启动错误(缺失 API
|
如果有问题,先检查日志文件。大多数启动错误(缺失 API
|
||||||
Key、配置解析失败)会出现在 <code>logs/langgraph.log</code> 或{" "}
|
Key、配置解析失败)会出现在 <code>logs/gateway.log</code> 中。
|
||||||
<code>logs/gateway.log</code> 中。
|
|
||||||
</Callout>
|
</Callout>
|
||||||
|
|
||||||
<Cards num={2}>
|
<Cards num={2}>
|
||||||
|
|||||||
@@ -65,7 +65,7 @@ export function accumulateUsage(messages: Message[]): TokenUsage | null {
|
|||||||
return hasUsage ? cumulative : null;
|
return hasUsage ? cumulative : null;
|
||||||
}
|
}
|
||||||
|
|
||||||
function hasNonZeroUsage(
|
export function hasNonZeroUsage(
|
||||||
usage: TokenUsage | null | undefined,
|
usage: TokenUsage | null | undefined,
|
||||||
): usage is TokenUsage {
|
): usage is TokenUsage {
|
||||||
return (
|
return (
|
||||||
@@ -75,7 +75,7 @@ function hasNonZeroUsage(
|
|||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
function addUsage(base: TokenUsage, delta: TokenUsage): TokenUsage {
|
export function addUsage(base: TokenUsage, delta: TokenUsage): TokenUsage {
|
||||||
return {
|
return {
|
||||||
inputTokens: base.inputTokens + delta.inputTokens,
|
inputTokens: base.inputTokens + delta.inputTokens,
|
||||||
outputTokens: base.outputTokens + delta.outputTokens,
|
outputTokens: base.outputTokens + delta.outputTokens,
|
||||||
|
|||||||
@@ -296,7 +296,11 @@ export function useThreadStream({
|
|||||||
onError(error) {
|
onError(error) {
|
||||||
setOptimisticMessages([]);
|
setOptimisticMessages([]);
|
||||||
toast.error(getStreamErrorMessage(error));
|
toast.error(getStreamErrorMessage(error));
|
||||||
pendingUsageBaselineMessageIdsRef.current = new Set();
|
pendingUsageBaselineMessageIdsRef.current = new Set(
|
||||||
|
messagesRef.current
|
||||||
|
.map(messageIdentity)
|
||||||
|
.filter((id): id is string => Boolean(id)),
|
||||||
|
);
|
||||||
if (threadIdRef.current && !isMock) {
|
if (threadIdRef.current && !isMock) {
|
||||||
void queryClient.invalidateQueries({
|
void queryClient.invalidateQueries({
|
||||||
queryKey: threadTokenUsageQueryKey(threadIdRef.current),
|
queryKey: threadTokenUsageQueryKey(threadIdRef.current),
|
||||||
@@ -305,7 +309,11 @@ export function useThreadStream({
|
|||||||
},
|
},
|
||||||
onFinish(state) {
|
onFinish(state) {
|
||||||
listeners.current.onFinish?.(state.values);
|
listeners.current.onFinish?.(state.values);
|
||||||
pendingUsageBaselineMessageIdsRef.current = new Set();
|
pendingUsageBaselineMessageIdsRef.current = new Set(
|
||||||
|
messagesRef.current
|
||||||
|
.map(messageIdentity)
|
||||||
|
.filter((id): id is string => Boolean(id)),
|
||||||
|
);
|
||||||
void queryClient.invalidateQueries({ queryKey: ["threads", "search"] });
|
void queryClient.invalidateQueries({ queryKey: ["threads", "search"] });
|
||||||
if (threadIdRef.current && !isMock) {
|
if (threadIdRef.current && !isMock) {
|
||||||
void queryClient.invalidateQueries({
|
void queryClient.invalidateQueries({
|
||||||
@@ -339,7 +347,11 @@ export function useThreadStream({
|
|||||||
useEffect(() => {
|
useEffect(() => {
|
||||||
startedRef.current = false;
|
startedRef.current = false;
|
||||||
sendInFlightRef.current = false;
|
sendInFlightRef.current = false;
|
||||||
pendingUsageBaselineMessageIdsRef.current = new Set();
|
pendingUsageBaselineMessageIdsRef.current = new Set(
|
||||||
|
messagesRef.current
|
||||||
|
.map(messageIdentity)
|
||||||
|
.filter((id): id is string => Boolean(id)),
|
||||||
|
);
|
||||||
prevHumanMsgCountRef.current =
|
prevHumanMsgCountRef.current =
|
||||||
latestMessageCountsRef.current.humanMessageCount;
|
latestMessageCountsRef.current.humanMessageCount;
|
||||||
}, [threadId]);
|
}, [threadId]);
|
||||||
|
|||||||
@@ -14,8 +14,8 @@ DeerFlow exposes two API surfaces behind an Nginx reverse proxy:
|
|||||||
|
|
||||||
| Service | Direct Port | Via Proxy | Purpose |
|
| Service | Direct Port | Via Proxy | Purpose |
|
||||||
|----------------|-------------|----------------------------------|----------------------------------|
|
|----------------|-------------|----------------------------------|----------------------------------|
|
||||||
| Gateway API | 8001 | `$DEERFLOW_GATEWAY_URL` | REST endpoints (models, skills, memory, uploads) |
|
| Gateway API | 8001 | `$DEERFLOW_GATEWAY_URL` | REST endpoints and embedded agent runtime |
|
||||||
| LangGraph API | 2024 | `$DEERFLOW_LANGGRAPH_URL` | Agent threads, runs, streaming |
|
| LangGraph-compatible API | 8001 | `$DEERFLOW_LANGGRAPH_URL` | Agent threads, runs, streaming |
|
||||||
|
|
||||||
## Environment Variables
|
## Environment Variables
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user