Mirror of https://github.com/bytedance/deer-flow.git (synced 2026-06-10 01:15:58 +00:00)
Compare commits: 1 commit, SHA1 2eb45e9bb5
@@ -59,7 +59,7 @@ smoke-test/
|
|||||||
2. **Check pnpm** - Package manager
|
2. **Check pnpm** - Package manager
|
||||||
3. **Check uv** - Python package manager
|
3. **Check uv** - Python package manager
|
||||||
4. **Check nginx** - Reverse proxy
|
4. **Check nginx** - Reverse proxy
|
||||||
5. **Check required ports** - Confirm that ports 2026, 3000, and 8001 are not occupied
|
5. **Check required ports** - Confirm that ports 2026, 3000, 8001, and 2024 are not occupied
|
||||||
|
|
||||||
**Docker mode environment check** (if Docker is selected):
|
**Docker mode environment check** (if Docker is selected):
|
||||||
1. **Check whether Docker is installed** - Run `docker --version`
|
1. **Check whether Docker is installed** - Run `docker --version`
|
||||||
@@ -93,17 +93,17 @@ smoke-test/
|
|||||||
### Phase 5: Service Health Check
|
### Phase 5: Service Health Check
|
||||||
|
|
||||||
**Local mode health check**:
|
**Local mode health check**:
|
||||||
1. **Check process status** - Confirm that Gateway, Frontend, and Nginx processes are all running
|
1. **Check process status** - Confirm that LangGraph, Gateway, Frontend, and Nginx processes are all running
|
||||||
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
||||||
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
||||||
4. **Check LangGraph-compatible API** - Verify the `/api/langgraph/*` route exposed by Gateway
|
4. **Check LangGraph service** - Verify the availability of relevant endpoints
|
||||||
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
||||||
|
|
||||||
**Docker mode health check** (when using Docker):
|
**Docker mode health check** (when using Docker):
|
||||||
1. **Check container status** - Run `docker ps` and confirm that all containers are running
|
1. **Check container status** - Run `docker ps` and confirm that all containers are running
|
||||||
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
||||||
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
||||||
4. **Check LangGraph-compatible API** - Verify the `/api/langgraph/*` route exposed by Gateway
|
4. **Check LangGraph service** - Verify the availability of relevant endpoints
|
||||||
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
||||||
|
|
||||||
### Optional Functional Verification
|
### Optional Functional Verification
|
||||||
@@ -135,7 +135,7 @@ smoke-test/
|
|||||||
|
|
||||||
The following warnings can appear during smoke testing and do not block a successful result:
|
The following warnings can appear during smoke testing and do not block a successful result:
|
||||||
- Feishu/Lark SSL errors in Gateway logs (certificate verification failure) can be ignored if that channel is not enabled
|
- Feishu/Lark SSL errors in Gateway logs (certificate verification failure) can be ignored if that channel is not enabled
|
||||||
- Warnings in Gateway logs about missing methods in the custom checkpointer, such as `adelete_for_runs` or `aprune`, do not affect the core functionality
|
- Warnings in LangGraph logs about missing methods in the custom checkpointer, such as `adelete_for_runs` or `aprune`, do not affect the core functionality
|
||||||
|
|
||||||
## Key Tools
|
## Key Tools
|
||||||
|
|
||||||
|
|||||||
@@ -138,6 +138,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
lsof -i :2026 # Main port
|
lsof -i :2026 # Main port
|
||||||
lsof -i :3000 # Frontend
|
lsof -i :3000 # Frontend
|
||||||
lsof -i :8001 # Gateway
|
lsof -i :8001 # Gateway
|
||||||
|
lsof -i :2024 # LangGraph
|
||||||
```
|
```
|
||||||
|
|
||||||
**Success Criteria**: All ports are free, or they are occupied only by DeerFlow-related processes.
|
**Success Criteria**: All ports are free, or they are occupied only by DeerFlow-related processes.
|
||||||
@@ -257,7 +258,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Run `make dev-daemon` (background mode)
|
1. Run `make dev-daemon` (background mode)
|
||||||
|
|
||||||
**Description**: This command starts all services (Gateway embedded runtime, Frontend, Nginx).
|
**Description**: This command starts all services (LangGraph, Gateway, Frontend, Nginx).
|
||||||
|
|
||||||
**Notes**:
|
**Notes**:
|
||||||
- `make dev` runs in the foreground and stops with Ctrl+C
|
- `make dev` runs in the foreground and stops with Ctrl+C
|
||||||
@@ -271,6 +272,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Wait 90-120 seconds for all services to start completely
|
1. Wait 90-120 seconds for all services to start completely
|
||||||
2. You can monitor startup progress by checking these log files:
|
2. You can monitor startup progress by checking these log files:
|
||||||
|
- `logs/langgraph.log`
|
||||||
- `logs/gateway.log`
|
- `logs/gateway.log`
|
||||||
- `logs/frontend.log`
|
- `logs/frontend.log`
|
||||||
- `logs/nginx.log`
|
- `logs/nginx.log`
|
||||||
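The fixed 90-120 second wait in this step could also be expressed as a poll with an upper bound. The following is a minimal sketch, not part of the repository: `wait_until` is a hypothetical helper name, and it simply retries any command until it succeeds or the timeout elapses.

```bash
# Hypothetical helper (not part of the DeerFlow scripts):
# retry a command until it succeeds or a timeout elapses.
# $1 = timeout in seconds, $2 = polling interval in seconds,
# remaining arguments = the command to run.
wait_until() {
    timeout_s="$1"
    interval_s="$2"
    shift 2
    elapsed=0
    # Run the command quietly; stop as soon as it exits with status 0.
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    return 0
}
```

Assuming the `/health` endpoint on port 2026 described in this document, a call such as `wait_until 120 5 curl -sf http://localhost:2026/health` would return as soon as the endpoint responds, instead of always waiting the full interval.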
@@ -314,10 +316,11 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Run the following command to check processes:
|
1. Run the following command to check processes:
|
||||||
```bash
|
```bash
|
||||||
ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
ps aux | grep -E "(langgraph|uvicorn|next|nginx)" | grep -v grep
|
||||||
```
|
```
|
||||||
|
|
||||||
**Success Criteria**: Confirm that the following processes are running:
|
**Success Criteria**: Confirm that the following processes are running:
|
||||||
|
- LangGraph (`langgraph dev`)
|
||||||
- Gateway (`uvicorn app.gateway.app:app`)
|
- Gateway (`uvicorn app.gateway.app:app`)
|
||||||
- Frontend (`next dev` or `next start`)
|
- Frontend (`next dev` or `next start`)
|
||||||
- Nginx (`nginx`)
|
- Nginx (`nginx`)
|
||||||
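The combined `grep -E` above prints whichever processes match, but it does not say which expected process is absent. A minimal sketch of a per-process check follows; `missing_processes` is a hypothetical helper, and the patterns passed to it are assumptions taken from the success criteria in this step.

```bash
# Hypothetical helper (not part of the DeerFlow scripts):
# read a process listing on stdin and print every expected pattern
# that matches no line. Returns 1 if at least one pattern is missing.
missing_processes() {
    listing=$(cat)
    missing=0
    for pattern in "$@"; do
        # grep -E treats each pattern as an extended regular expression.
        if ! printf '%s\n' "$listing" | grep -Eq -- "$pattern"; then
            echo "missing: $pattern"
            missing=1
        fi
    done
    return "$missing"
}
```

It could be used as `ps aux | grep -v grep | missing_processes "langgraph" "uvicorn" "next" "nginx"`, which prints one `missing:` line per absent process and exits non-zero if any are absent.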
@@ -353,11 +356,10 @@ curl http://localhost:2026/health
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
#### 5.1.4 Check LangGraph-compatible API
|
#### 5.1.4 Check LangGraph Service
|
||||||
|
|
||||||
**Steps**:
|
**Steps**:
|
||||||
1. Visit `http://localhost:2026/api/langgraph/assistants/lead_agent` to verify Gateway's LangGraph-compatible API route is reachable.
|
1. Visit relevant LangGraph endpoints to verify availability
|
||||||
2. A `401` response is acceptable when authentication is enabled and no session cookie is provided.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
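"Verify availability" can be made concrete by comparing the HTTP status code against a set of accepted codes, which is how the check script in this change treats `http://localhost:2024/`. The sketch below is an illustration only: `status_accepted` and `probe_status` are hypothetical names, not the helpers used by the repository's scripts.

```bash
# Hypothetical helpers (not the repository's check_http_status).

# Return 0 when the status code is one of the accepted codes.
# $1 = status code, $2 = accepted codes separated by "|", e.g. "200|401".
status_accepted() {
    # -x requires the whole line to match, so "20" does not match "200".
    printf '%s\n' "$1" | grep -Eqx "$2"
}

# Print only the HTTP status code for a URL.
# curl prints 000 when the connection fails.
probe_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

With the accepted set from the check script, a call would look like `status_accepted "$(probe_status http://localhost:2024/)" "200|301|302|307|308|404"`. Accepting `404` here only shows that the server is answering on the port, not that a particular route exists.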
@@ -371,6 +373,7 @@ curl http://localhost:2026/health
|
|||||||
- `deer-flow-nginx`
|
- `deer-flow-nginx`
|
||||||
- `deer-flow-frontend`
|
- `deer-flow-frontend`
|
||||||
- `deer-flow-gateway`
|
- `deer-flow-gateway`
|
||||||
|
- `deer-flow-langgraph` (if not in gateway mode)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -403,11 +406,10 @@ curl http://localhost:2026/health
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
#### 5.2.4 Check LangGraph-compatible API
|
#### 5.2.4 Check LangGraph Service
|
||||||
|
|
||||||
**Steps**:
|
**Steps**:
|
||||||
1. Visit `http://localhost:2026/api/langgraph/assistants/lead_agent` to verify Gateway's LangGraph-compatible API route is reachable.
|
1. Visit relevant LangGraph endpoints to verify availability
|
||||||
2. A `401` response is acceptable when authentication is enabled and no session cookie is provided.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
@@ -254,6 +254,7 @@ Processes exit quickly after running `make dev-daemon`.
|
|||||||
**Solutions**:
|
**Solutions**:
|
||||||
1. Check log files:
|
1. Check log files:
|
||||||
```bash
|
```bash
|
||||||
|
tail -f logs/langgraph.log
|
||||||
tail -f logs/gateway.log
|
tail -f logs/gateway.log
|
||||||
tail -f logs/frontend.log
|
tail -f logs/frontend.log
|
||||||
tail -f logs/nginx.log
|
tail -f logs/nginx.log
|
||||||
@@ -366,7 +367,24 @@ Errors appear in `gateway.log`.
|
|||||||
uv sync
|
uv sync
|
||||||
```
|
```
|
||||||
|
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally (if not in gateway mode)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Issue: LangGraph Fails to Start
|
||||||
|
|
||||||
|
**Symptoms**:
|
||||||
|
Errors appear in `langgraph.log`.
|
||||||
|
|
||||||
|
**Solutions**:
|
||||||
|
1. Check LangGraph logs:
|
||||||
|
```bash
|
||||||
|
tail -f logs/langgraph.log
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Check config.yaml
|
||||||
|
3. Check whether Python dependencies are complete
|
||||||
|
4. Confirm that port 2024 is not occupied
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -501,7 +519,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
2. Confirm that config.yaml exists and has valid formatting
|
2. Confirm that config.yaml exists and has valid formatting
|
||||||
3. Check whether Python dependencies are complete
|
3. Check whether Python dependencies are complete
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally
|
||||||
|
|
||||||
**Solutions** (Docker mode):
|
**Solutions** (Docker mode):
|
||||||
1. Check gateway container logs:
|
1. Check gateway container logs:
|
||||||
@@ -511,7 +529,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
2. Confirm that config.yaml is mounted correctly
|
2. Confirm that config.yaml is mounted correctly
|
||||||
3. Check whether Python dependencies are complete
|
3. Check whether Python dependencies are complete
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -521,7 +539,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
#### View All Service Processes
|
#### View All Service Processes
|
||||||
```bash
|
```bash
|
||||||
ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
ps aux | grep -E "(langgraph|uvicorn|next|nginx)" | grep -v grep
|
||||||
```
|
```
|
||||||
|
|
||||||
#### View Service Logs
|
#### View Service Logs
|
||||||
@@ -530,6 +548,7 @@ ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
|||||||
tail -f logs/*.log
|
tail -f logs/*.log
|
||||||
|
|
||||||
# View specific service logs
|
# View specific service logs
|
||||||
|
tail -f logs/langgraph.log
|
||||||
tail -f logs/gateway.log
|
tail -f logs/gateway.log
|
||||||
tail -f logs/frontend.log
|
tail -f logs/frontend.log
|
||||||
tail -f logs/nginx.log
|
tail -f logs/nginx.log
|
||||||
|
|||||||
@@ -65,7 +65,7 @@ if ! command -v lsof >/dev/null 2>&1; then
|
|||||||
echo " Install lsof and rerun this check"
|
echo " Install lsof and rerun this check"
|
||||||
all_passed=false
|
all_passed=false
|
||||||
else
|
else
|
||||||
for port in 2026 3000 8001; do
|
for port in 2026 3000 8001 2024; do
|
||||||
if lsof -i :$port >/dev/null 2>&1; then
|
if lsof -i :$port >/dev/null 2>&1; then
|
||||||
echo "⚠ Port $port is already in use:"
|
echo "⚠ Port $port is already in use:"
|
||||||
lsof -i :$port | head -2
|
lsof -i :$port | head -2
|
||||||
|
|||||||
@@ -54,6 +54,7 @@ echo "=========================================="
|
|||||||
echo ""
|
echo ""
|
||||||
echo "🌐 Access URL: http://localhost:2026"
|
echo "🌐 Access URL: http://localhost:2026"
|
||||||
echo "📋 View logs:"
|
echo "📋 View logs:"
|
||||||
|
echo " - logs/langgraph.log"
|
||||||
echo " - logs/gateway.log"
|
echo " - logs/gateway.log"
|
||||||
echo " - logs/frontend.log"
|
echo " - logs/frontend.log"
|
||||||
echo " - logs/nginx.log"
|
echo " - logs/nginx.log"
|
||||||
|
|||||||
@@ -76,11 +76,12 @@ if [ "$mode" = "docker" ]; then
|
|||||||
all_passed=false
|
all_passed=false
|
||||||
fi
|
fi
|
||||||
else
|
else
|
||||||
summary_hint="logs/{gateway,frontend,nginx}.log"
|
summary_hint="logs/{langgraph,gateway,frontend,nginx}.log"
|
||||||
print_step "1. Checking local service ports..."
|
print_step "1. Checking local service ports..."
|
||||||
check_listen_port "Nginx" 2026
|
check_listen_port "Nginx" 2026
|
||||||
check_listen_port "Frontend" 3000
|
check_listen_port "Frontend" 3000
|
||||||
check_listen_port "Gateway" 8001
|
check_listen_port "Gateway" 8001
|
||||||
|
check_listen_port "LangGraph" 2024
|
||||||
fi
|
fi
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
@@ -103,8 +104,8 @@ else
|
|||||||
fi
|
fi
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
echo "5. Checking LangGraph-compatible Gateway API..."
|
echo "5. Checking LangGraph service..."
|
||||||
check_http_status "LangGraph-compatible Gateway API" "http://localhost:2026/api/langgraph/assistants/lead_agent" "200|401"
|
check_http_status "LangGraph service" "http://localhost:2024/" "200|301|302|307|308|404"
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
echo "=========================================="
|
echo "=========================================="
|
||||||
|
|||||||
@@ -78,7 +78,7 @@
|
|||||||
- [x] Container status - {{status_containers}}
|
- [x] Container status - {{status_containers}}
|
||||||
- [x] Frontend service - {{status_frontend}}
|
- [x] Frontend service - {{status_frontend}}
|
||||||
- [x] API Gateway - {{status_api_gateway}}
|
- [x] API Gateway - {{status_api_gateway}}
|
||||||
- [x] LangGraph-compatible Gateway API - {{status_langgraph}}
|
- [x] LangGraph service - {{status_langgraph}}
|
||||||
|
|
||||||
**Phase Status**: {{stage5_status}}
|
**Phase Status**: {{stage5_status}}
|
||||||
|
|
||||||
@@ -147,6 +147,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
| deer-flow-nginx | {{nginx_status}} | {{nginx_uptime}} |
|
| deer-flow-nginx | {{nginx_status}} | {{nginx_uptime}} |
|
||||||
| deer-flow-frontend | {{frontend_status}} | {{frontend_uptime}} |
|
| deer-flow-frontend | {{frontend_status}} | {{frontend_uptime}} |
|
||||||
| deer-flow-gateway | {{gateway_status}} | {{gateway_uptime}} |
|
| deer-flow-gateway | {{gateway_status}} | {{gateway_uptime}} |
|
||||||
|
| deer-flow-langgraph | {{langgraph_status}} | {{langgraph_uptime}} |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
@@ -80,7 +80,7 @@
|
|||||||
- [x] Process status - {{status_processes}}
|
- [x] Process status - {{status_processes}}
|
||||||
- [x] Frontend service - {{status_frontend}}
|
- [x] Frontend service - {{status_frontend}}
|
||||||
- [x] API Gateway - {{status_api_gateway}}
|
- [x] API Gateway - {{status_api_gateway}}
|
||||||
- [x] LangGraph-compatible Gateway API - {{status_langgraph}}
|
- [x] LangGraph service - {{status_langgraph}}
|
||||||
|
|
||||||
**Phase Status**: {{stage5_status}}
|
**Phase Status**: {{stage5_status}}
|
||||||
|
|
||||||
@@ -152,7 +152,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
| Nginx | {{nginx_status}} | {{nginx_endpoint}} |
|
| Nginx | {{nginx_status}} | {{nginx_endpoint}} |
|
||||||
| Frontend | {{frontend_status}} | {{frontend_endpoint}} |
|
| Frontend | {{frontend_status}} | {{frontend_endpoint}} |
|
||||||
| Gateway | {{gateway_status}} | {{gateway_endpoint}} |
|
| Gateway | {{gateway_status}} | {{gateway_endpoint}} |
|
||||||
| Gateway LangGraph API | {{langgraph_status}} | {{langgraph_endpoint}} |
|
| LangGraph | {{langgraph_status}} | {{langgraph_endpoint}} |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -166,7 +166,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
|
|
||||||
### If the Test Fails
|
### If the Test Fails
|
||||||
1. [ ] Review references/troubleshooting.md for common solutions
|
1. [ ] Review references/troubleshooting.md for common solutions
|
||||||
2. [ ] Check local logs: `logs/{gateway,frontend,nginx}.log`
|
2. [ ] Check local logs: `logs/{langgraph,gateway,frontend,nginx}.log`
|
||||||
3. [ ] Verify configuration file format and content
|
3. [ ] Verify configuration file format and content
|
||||||
4. [ ] If needed, fully reset the environment: `make stop && make clean && make install && make dev-daemon`
|
4. [ ] If needed, fully reset the environment: `make stop && make clean && make install && make dev-daemon`
|
||||||
|
|
||||||
|
|||||||
+2
-23
@@ -1,6 +1,3 @@
|
|||||||
# Serper API Key (Google Search) - https://serper.dev
|
|
||||||
SERPER_API_KEY=your-serper-api-key
|
|
||||||
|
|
||||||
# TAVILY API Key
|
# TAVILY API Key
|
||||||
TAVILY_API_KEY=your-tavily-api-key
|
TAVILY_API_KEY=your-tavily-api-key
|
||||||
|
|
||||||
@@ -9,9 +6,8 @@ JINA_API_KEY=your-jina-api-key
|
|||||||
|
|
||||||
# InfoQuest API Key
|
# InfoQuest API Key
|
||||||
INFOQUEST_API_KEY=your-infoquest-api-key
|
INFOQUEST_API_KEY=your-infoquest-api-key
|
||||||
# Browser CORS allowlist for split-origin or port-forwarded deployments (comma-separated exact origins).
|
# CORS Origins (comma-separated) - e.g., http://localhost:3000,http://localhost:3001
|
||||||
# Leave unset when using the unified nginx endpoint, e.g. http://localhost:2026.
|
# CORS_ORIGINS=http://localhost:3000
|
||||||
# GATEWAY_CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
|
|
||||||
|
|
||||||
# Optional:
|
# Optional:
|
||||||
# FIRECRAWL_API_KEY=your-firecrawl-api-key
|
# FIRECRAWL_API_KEY=your-firecrawl-api-key
|
||||||
@@ -21,7 +17,6 @@ INFOQUEST_API_KEY=your-infoquest-api-key
|
|||||||
# DEEPSEEK_API_KEY=your-deepseek-api-key
|
# DEEPSEEK_API_KEY=your-deepseek-api-key
|
||||||
# NOVITA_API_KEY=your-novita-api-key # OpenAI-compatible, see https://novita.ai
|
# NOVITA_API_KEY=your-novita-api-key # OpenAI-compatible, see https://novita.ai
|
||||||
# MINIMAX_API_KEY=your-minimax-api-key # OpenAI-compatible, see https://platform.minimax.io
|
# MINIMAX_API_KEY=your-minimax-api-key # OpenAI-compatible, see https://platform.minimax.io
|
||||||
# STEPFUN_API_KEY=your-stepfun-api-key # OpenAI-compatible, see https://platform.stepfun.com
|
|
||||||
# VLLM_API_KEY=your-vllm-api-key # OpenAI-compatible
|
# VLLM_API_KEY=your-vllm-api-key # OpenAI-compatible
|
||||||
# FEISHU_APP_ID=your-feishu-app-id
|
# FEISHU_APP_ID=your-feishu-app-id
|
||||||
# FEISHU_APP_SECRET=your-feishu-app-secret
|
# FEISHU_APP_SECRET=your-feishu-app-secret
|
||||||
@@ -50,19 +45,3 @@ INFOQUEST_API_KEY=your-infoquest-api-key
|
|||||||
|
|
||||||
# Set to "false" to disable Swagger UI, ReDoc, and OpenAPI schema in production
|
# Set to "false" to disable Swagger UI, ReDoc, and OpenAPI schema in production
|
||||||
# GATEWAY_ENABLE_DOCS=false
|
# GATEWAY_ENABLE_DOCS=false
|
||||||
|
|
||||||
# Shared internal Gateway auth token for multi-worker deployments.
|
|
||||||
# `make up` generates and persists this automatically; set it manually only
|
|
||||||
# when you run Gateway workers outside the bundled deploy script.
|
|
||||||
# DEER_FLOW_INTERNAL_AUTH_TOKEN=your-shared-internal-token
|
|
||||||
|
|
||||||
# ── Frontend SSR → Gateway wiring ─────────────────────────────────────────────
|
|
||||||
# The Next.js server uses these to reach the Gateway during SSR (auth checks,
|
|
||||||
# /api/* rewrites). They default to localhost values that match `make dev` and
|
|
||||||
# `make start`, so most local users do not need to set them.
|
|
||||||
#
|
|
||||||
# Override only when the Gateway is not on localhost:8001 (e.g. when the
|
|
||||||
# frontend and gateway run on different hosts, in containers with a service
|
|
||||||
# alias, or behind a different port). docker-compose already sets these.
|
|
||||||
# DEER_FLOW_INTERNAL_GATEWAY_BASE_URL=http://localhost:8001
|
|
||||||
# DEER_FLOW_TRUSTED_ORIGINS=http://localhost:3000,http://localhost:2026
|
|
||||||
|
|||||||
@@ -1,159 +0,0 @@
|
|||||||
name: 🐛 Bug report
|
|
||||||
description: Report something that isn't working so maintainers can reproduce and fix it.
|
|
||||||
title: "[bug] "
|
|
||||||
labels: ["bug"]
|
|
||||||
body:
|
|
||||||
- type: markdown
|
|
||||||
attributes:
|
|
||||||
value: |
|
|
||||||
Thanks for taking the time to file a bug. A clear, reproducible report is the
|
|
||||||
single biggest factor in how fast it gets fixed.
|
|
||||||
|
|
||||||
Please fill in every required field — especially **reproduction steps** and **logs**.
|
|
||||||
|
|
||||||
- type: checkboxes
|
|
||||||
id: preflight
|
|
||||||
attributes:
|
|
||||||
label: Before you start
|
|
||||||
options:
|
|
||||||
- label: I searched [existing issues](https://github.com/bytedance/deer-flow/issues?q=is%3Aissue) and this is not a duplicate.
|
|
||||||
required: true
|
|
||||||
- label: I can reproduce this on the latest `main`.
|
|
||||||
required: false
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: summary
|
|
||||||
attributes:
|
|
||||||
label: Problem summary
|
|
||||||
description: One sentence describing the bug.
|
|
||||||
placeholder: e.g. make dev fails to start the gateway service
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: dropdown
|
|
||||||
id: areas
|
|
||||||
attributes:
|
|
||||||
label: Affected area(s)
|
|
||||||
description: Which part of DeerFlow does this touch? Select all that apply.
|
|
||||||
multiple: true
|
|
||||||
options:
|
|
||||||
- Frontend (UI / Next.js)
|
|
||||||
- Backend API (gateway / endpoints / SSE)
|
|
||||||
- Agents / LangGraph (graph, prompts, langgraph.json)
|
|
||||||
- Sandbox / Docker
|
|
||||||
- Skills
|
|
||||||
- MCP
|
|
||||||
- Config / setup (make, config.yaml, env)
|
|
||||||
- Docs
|
|
||||||
- Not sure
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: actual
|
|
||||||
attributes:
|
|
||||||
label: What happened?
|
|
||||||
description: The actual behavior. Include the key error lines verbatim.
|
|
||||||
placeholder: When I do X, I expected Y but I got Z.
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: expected
|
|
||||||
attributes:
|
|
||||||
label: Expected behavior
|
|
||||||
placeholder: What did you expect to happen instead?
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: reproduce
|
|
||||||
attributes:
|
|
||||||
label: Steps to reproduce
|
|
||||||
description: Exact commands and sequence. Minimal steps that reliably reproduce the problem.
|
|
||||||
placeholder: |
|
|
||||||
1. make check
|
|
||||||
2. make install
|
|
||||||
3. make dev
|
|
||||||
4. ...
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: logs
|
|
||||||
attributes:
|
|
||||||
label: Relevant logs
|
|
||||||
description: Paste key lines from logs (for example `logs/gateway.log`, `logs/frontend.log`). Redact secrets.
|
|
||||||
render: shell
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: dropdown
|
|
||||||
id: run_mode
|
|
||||||
attributes:
|
|
||||||
label: How are you running DeerFlow?
|
|
||||||
options:
|
|
||||||
- Local (make dev)
|
|
||||||
- Docker (make docker-start)
|
|
||||||
- CI
|
|
||||||
- Other
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: dropdown
|
|
||||||
id: os
|
|
||||||
attributes:
|
|
||||||
label: Operating system
|
|
||||||
options:
|
|
||||||
- macOS
|
|
||||||
- Linux
|
|
||||||
- Windows
|
|
||||||
- Other
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: platform_details
|
|
||||||
attributes:
|
|
||||||
label: Platform details
|
|
||||||
description: Architecture and shell, if relevant.
|
|
||||||
placeholder: e.g. arm64, zsh
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: python_version
|
|
||||||
attributes:
|
|
||||||
label: Python version
|
|
||||||
placeholder: e.g. Python 3.12.9
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: node_version
|
|
||||||
attributes:
|
|
||||||
label: Node.js version
|
|
||||||
placeholder: e.g. v22.11.0
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: pnpm_version
|
|
||||||
attributes:
|
|
||||||
label: pnpm version
|
|
||||||
placeholder: e.g. 10.26.2
|
|
||||||
|
|
||||||
- type: input
|
|
||||||
id: uv_version
|
|
||||||
attributes:
|
|
||||||
label: uv version
|
|
||||||
placeholder: e.g. 0.7.20
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: git_info
|
|
||||||
attributes:
|
|
||||||
label: Git state
|
|
||||||
description: Output of `git branch --show-current` and the latest commit SHA.
|
|
||||||
placeholder: |
|
|
||||||
branch: feature/my-branch
|
|
||||||
commit: abcdef1
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: additional
|
|
||||||
attributes:
|
|
||||||
label: Additional context
|
|
||||||
description: Screenshots, related issues, config snippets (redacted), or anything else that helps triage.
|
|
||||||
@@ -1,11 +0,0 @@
|
|||||||
blank_issues_enabled: false
|
|
||||||
contact_links:
|
|
||||||
- name: 💬 Questions & usage help
|
|
||||||
url: https://github.com/bytedance/deer-flow/discussions/categories/q-a
|
|
||||||
about: "How do I use X? Why does Y behave like that? Ask in Discussions — it gets answered faster and stays searchable."
|
|
||||||
- name: 💡 Ideas & proposals
|
|
||||||
url: https://github.com/bytedance/deer-flow/discussions/categories/ideas
|
|
||||||
about: Have a half-formed idea? Float it in Discussions before opening a formal feature request.
|
|
||||||
- name: 🔒 Report a security vulnerability
|
|
||||||
url: https://github.com/bytedance/deer-flow/security/policy
|
|
||||||
about: Do not open a public issue for security problems. Follow the security policy instead.
|
|
||||||
@@ -1,67 +0,0 @@
|
|||||||
name: 💡 Feature request
|
|
||||||
description: Propose a new capability or an improvement to an existing one.
|
|
||||||
title: "[feat] "
|
|
||||||
labels: ["enhancement"]
|
|
||||||
body:
|
|
||||||
- type: markdown
|
|
||||||
attributes:
|
|
||||||
value: |
|
|
||||||
Thanks for the suggestion. For non-trivial features, please open a
|
|
||||||
[Discussion](https://github.com/bytedance/deer-flow/discussions/categories/ideas)
|
|
||||||
first to align on scope before writing code.
|
|
||||||
|
|
||||||
- type: checkboxes
|
|
||||||
id: preflight
|
|
||||||
attributes:
|
|
||||||
label: Before you start
|
|
||||||
options:
|
|
||||||
- label: I searched [existing issues](https://github.com/bytedance/deer-flow/issues?q=is%3Aissue) and this is not a duplicate.
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: problem
|
|
||||||
attributes:
|
|
||||||
label: Problem / motivation
|
|
||||||
description: What problem does this solve? What is painful today, or what does it unblock?
|
|
||||||
placeholder: "I'm always frustrated when ..."
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: solution
|
|
||||||
attributes:
|
|
||||||
label: Proposed solution
|
|
||||||
description: Describe the change from a user's / caller's perspective.
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: dropdown
|
|
||||||
id: areas
|
|
||||||
attributes:
|
|
||||||
label: Affected area(s)
|
|
||||||
description: Which part of DeerFlow would this touch? Select all that apply.
|
|
||||||
multiple: true
|
|
||||||
options:
|
|
||||||
- Frontend (UI / Next.js)
|
|
||||||
- Backend API (gateway / endpoints / SSE)
|
|
||||||
- Agents / LangGraph (graph, prompts, langgraph.json)
|
|
||||||
- Sandbox / Docker
|
|
||||||
- Skills
|
|
||||||
- MCP
|
|
||||||
- Config / setup
|
|
||||||
- Docs
|
|
||||||
- Not sure
|
|
||||||
validations:
|
|
||||||
required: true
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: alternatives
|
|
||||||
attributes:
|
|
||||||
label: Alternatives considered
|
|
||||||
description: Other approaches you weighed and why you discarded them.
|
|
||||||
|
|
||||||
- type: textarea
|
|
||||||
id: additional
|
|
||||||
attributes:
|
|
||||||
label: Additional context
|
|
||||||
description: Mockups, links, related issues, or anything else that helps.
|
|
||||||
@@ -0,0 +1,128 @@
|
|||||||
|
name: Runtime Information
|
||||||
|
description: Report runtime/environment details to help reproduce an issue.
|
||||||
|
title: "[runtime] "
|
||||||
|
labels:
|
||||||
|
- needs-triage
|
||||||
|
body:
|
||||||
|
- type: markdown
|
||||||
|
attributes:
|
||||||
|
value: |
|
||||||
|
Thanks for sharing runtime details.
|
||||||
|
Complete this form so maintainers can quickly reproduce and diagnose the problem.
|
||||||
|
|
||||||
|
- type: input
|
||||||
|
id: summary
|
||||||
|
attributes:
|
||||||
|
label: Problem summary
|
||||||
|
description: Short summary of the issue.
|
||||||
|
placeholder: e.g. make dev fails to start gateway service
|
||||||
|
validations:
|
||||||
|
required: true
|
||||||
|
|
||||||
|
- type: textarea
|
||||||
|
id: expected
|
||||||
|
attributes:
|
||||||
|
label: Expected behavior
|
||||||
|
placeholder: What did you expect to happen?
|
||||||
|
validations:
|
||||||
|
required: true
|
||||||
|
|
||||||
|
- type: textarea
|
||||||
|
id: actual
|
||||||
|
attributes:
|
||||||
|
label: Actual behavior
|
||||||
|
      placeholder: What happened instead? Include key error lines.
    validations:
      required: true

  - type: dropdown
    id: os
    attributes:
      label: Operating system
      options:
        - macOS
        - Linux
        - Windows
        - Other
    validations:
      required: true

  - type: input
    id: platform_details
    attributes:
      label: Platform details
      description: Add architecture and shell if relevant.
      placeholder: e.g. arm64, zsh

  - type: input
    id: python_version
    attributes:
      label: Python version
      placeholder: e.g. Python 3.12.9

  - type: input
    id: node_version
    attributes:
      label: Node.js version
      placeholder: e.g. v23.11.0

  - type: input
    id: pnpm_version
    attributes:
      label: pnpm version
      placeholder: e.g. 10.26.2

  - type: input
    id: uv_version
    attributes:
      label: uv version
      placeholder: e.g. 0.7.20

  - type: dropdown
    id: run_mode
    attributes:
      label: How are you running DeerFlow?
      options:
        - Local (make dev)
        - Docker (make docker-dev)
        - CI
        - Other
    validations:
      required: true

  - type: textarea
    id: reproduce
    attributes:
      label: Reproduction steps
      description: Provide exact commands and sequence.
      placeholder: |
        1. make check
        2. make install
        3. make dev
        4. ...
    validations:
      required: true

  - type: textarea
    id: logs
    attributes:
      label: Relevant logs
      description: Paste key lines from logs (for example logs/gateway.log, logs/frontend.log).
      render: shell
    validations:
      required: true

  - type: textarea
    id: git_info
    attributes:
      label: Git state
      description: Share output of git branch and latest commit SHA.
      placeholder: |
        branch: feature/my-branch
        commit: abcdef1

  - type: textarea
    id: additional
    attributes:
      label: Additional context
      description: Add anything else that might help triage.
@@ -1,119 +0,0 @@
# Declarative label source of truth for DeerFlow.
#
# This file is the single source of truth for repository labels used by the
# auto-labeling workflows (.github/workflows/pr-labeler.yml, pr-triage.yml,
# issue-triage.yml). Auto-labelers can only apply labels that already exist,
# so every label referenced by a workflow MUST be declared here.
#
# Apply with: uv run --with pyyaml python scripts/sync_labels.py [--repo OWNER/NAME]
# CI keeps it in sync via .github/workflows/label-sync.yml (runs on changes here).
#
# Sync is additive/update-only: it creates or updates the labels listed below
# and never deletes labels that are not listed.
#
# Color = 6-digit hex without the leading '#'.

labels:
  # ── Type ─────────────────────────────────────────────────────────────────
  # Mostly GitHub defaults; declared here so colors/descriptions stay stable
  # and so issue templates can rely on them existing.
  - name: bug
    color: d73a4a
    description: Something isn't working
  - name: enhancement
    color: a2eeef
    description: New feature or request
  - name: documentation
    color: 0075ca
    description: Improvements or additions to documentation
  - name: question
    color: d876e3
    description: Further information is requested

  # ── Area (auto, by changed paths — see .github/labeler.yml) ───────────────
  # Mirrors the "Surface area" section of the pull request template.
  - name: "area:frontend"
    color: c5def5
    description: Next.js frontend under frontend/
  - name: "area:backend"
    color: c5def5
    description: Gateway / runtime / core backend under backend/
  - name: "area:agents"
    color: c5def5
    description: Agents, subagents, graph wiring, prompts, langgraph.json
  - name: "area:sandbox"
    color: c5def5
    description: Sandboxed execution and docker/
  - name: "area:skills"
    color: c5def5
    description: Skills under skills/ or the skills harness
  - name: "area:mcp"
    color: c5def5
    description: Model Context Protocol integration
  - name: "area:ci"
    color: c5def5
    description: GitHub Actions, CI config, repo tooling
  - name: "area:docs"
    color: c5def5
    description: Documentation and Markdown only
  - name: "area:deps"
    color: c5def5
    description: Dependency manifests / lockfiles

  # ── Size (auto, by additions + deletions — see pr-triage.yml) ─────────────
  - name: "size/XS"
    color: "009900"
    description: PR changes < 20 lines
  - name: "size/S"
    color: 77bb00
    description: PR changes 20-100 lines
  - name: "size/M"
    color: eebb00
    description: PR changes 100-300 lines
  - name: "size/L"
    color: ee9900
    description: PR changes 300-700 lines
  - name: "size/XL"
    color: ee5500
    description: PR changes 700+ lines

  # ── Risk (auto, by changed paths — see pr-triage.yml) ─────────────────────
  - name: "risk:low"
    color: 0e8a16
    description: "Low risk: docs / i18n / assets only"
  - name: "risk:medium"
    color: fbca04
    description: "Medium risk: regular code changes"
  - name: "risk:high"
    color: b60205
    description: "High risk: backend API, agents, sandbox, auth, deps, CI"

  # ── Priority (manual) ─────────────────────────────────────────────────────
  - name: P0
    color: b60205
    description: Critical priority
  - name: P1
    color: d93f0b
    description: Major priority
  - name: P2
    color: e99695
    description: Normal priority

  # ── Status (auto + manual) ────────────────────────────────────────────────
  - name: needs-triage
    color: fef2c0
    description: Awaiting maintainer triage
  - name: needs-validation
    color: d4c5f9
    description: Touches front/back contract surface; needs real-path validation
  - name: skip-validation
    color: cccccc
    description: "Maintainer override: do not auto-add needs-validation on this PR"
  - name: reviewing
    color: 5319e7
    description: A maintainer is reviewing this PR

  # ── Contributor ───────────────────────────────────────────────────────────
  - name: first-time-contributor
    color: c2e0c6
    description: First contribution to this repository — be welcoming
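The header comment of the label file describes the sync as additive/update-only. The planning step behind such a sync might look like the minimal sketch below. It assumes labels are plain dicts with `name`, `color` and `description` keys; `plan_label_sync` is a hypothetical helper and is not taken from `scripts/sync_labels.py`, whose contents are not shown in this diff.

```python
def plan_label_sync(declared, existing):
    """Return (to_create, to_update) for an additive/update-only sync.

    Both arguments are lists of dicts with "name", "color" and "description".
    Labels that exist remotely but are not declared are deliberately ignored,
    so nothing is ever scheduled for deletion.
    """
    existing_by_name = {label["name"]: label for label in existing}
    to_create, to_update = [], []
    for label in declared:
        current = existing_by_name.get(label["name"])
        if current is None:
            to_create.append(label)
            continue
        # Compare colors as lowercase strings; descriptions may be missing.
        same_color = str(current.get("color", "")).lower() == str(label["color"]).lower()
        same_description = (current.get("description") or "") == label.get("description", "")
        if not (same_color and same_description):
            to_update.append(label)
    return to_create, to_update


# Illustrative data only: two declared labels, one stale and one undeclared remote label.
declared = [
    {"name": "bug", "color": "d73a4a", "description": "Something isn't working"},
    {"name": "needs-triage", "color": "fef2c0", "description": "Awaiting maintainer triage"},
]
existing = [
    {"name": "bug", "color": "ff0000", "description": "Something isn't working"},
    {"name": "wontfix", "color": "ffffff", "description": "Undeclared; left untouched"},
]
to_create, to_update = plan_label_sync(declared, existing)
print([label["name"] for label in to_create])
print([label["name"] for label in to_update])
```

Because the plan only contains create and update lists, an undeclared label such as `wontfix` can never be removed by this step, which is the property the header comment promises.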
@@ -1,75 +0,0 @@
<!-- Reference a related issue with #123. Use Fixes / Closes / Resolves to
     auto-close it on merge. Delete this line if the PR doesn't reference an issue. -->
Fixes #

## Why

<!-- Why are you opening this PR? Cover two things:
     - The trigger — what made you write this? A bug you hit, a feature you need,
       tech debt, or a prod issue?
     - The pain being addressed — user-facing problem, or what it unblocks.
     For non-trivial features, please open an issue/discussion first to align on
     scope before writing code. -->


## What changed

<!-- Describe the change from a user's / caller's perspective, not as a code diff. e.g.:
     - "Settings now has a 'Custom endpoint' field, off by default"
     - "Backend /api/chat gains a `stream` flag, defaults to false"
     - "Default model changed from X to Y — existing users notice on first run" -->


## Surface area

<!-- Check every box that applies. Reviewers use this to scope the review. -->

- [ ] **Frontend UI** — page / component / setting / interaction under `frontend/`
- [ ] **Backend API** — endpoint / SSE event / request-response shape under `backend/app`
- [ ] **Agents / LangGraph** — agent node, graph wiring, `langgraph.json`, or prompt change
- [ ] **Sandbox** — `docker/` or sandboxed execution
- [ ] **Skills** — change under `skills/`
- [ ] **Dependencies** — new/upgraded entry in `backend/pyproject.toml` or `frontend/package.json` (say what it buys us)
- [ ] **Default behavior change** — changes existing behavior without the user opting in (default model, default setting, data shape)
- [ ] **Docs / tests / CI only** — no runtime behavior change


## Screenshots / Recording

<!-- If you checked "Frontend UI", attach screenshots showing the entry point —
     where users discover the change — not just the feature in isolation.
     Before/after is best for behavior changes. Short GIFs welcome. -->


## Bug fix verification

<!-- Skip (delete) this section if this PR is not a bug fix.

     Bugs should be encoded as a failing test that goes red before the fix.
     Confirm:
     - Test path that reproduces the bug:
     - Did it go red on `main` and green on this branch? (yes / no)
     - If a red test wasn't cheap to write, explain why and what you did instead. -->


## Validation

<!-- What you actually ran. Run at least the checks for the area you changed:
     Backend: cd backend && make lint && make test
     Frontend: cd frontend && pnpm format && pnpm lint && pnpm typecheck && BETTER_AUTH_SECRET=local-dev-secret pnpm build && make test
     Frontend E2E (if you touched frontend/): cd frontend && make test-e2e -->


## AI assistance

<!-- DeerFlow is an AI project — most PRs here use AI coding tools, and that's
     welcome. Disclosing it just helps reviewers calibrate how closely to read the
     diff. Please fill all three; don't delete the section. -->

**Tool(s) used:** <!-- e.g. Claude Code, Cursor, GitHub Copilot, Codex, Windsurf, or "none" -->

**How you used it:** <!-- e.g. "generated the module from a spec", "autocomplete only",
     "AI wrote tests, I wrote the impl". A prompt or conversation link is great too. -->

- [ ] I've read and understand every line of this change and take responsibility for it — it's not unreviewed AI output.
@@ -1,46 +0,0 @@
name: Backend Blocking IO

on:
  push:
    branches: ["main"]
    paths:
      - "backend/**"
      - ".github/workflows/backend-blocking-io-tests.yml"
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    paths:
      - "backend/**"
      - ".github/workflows/backend-blocking-io-tests.yml"

concurrency:
  group: blocking-io-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  backend-blocking-io:
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    timeout-minutes: 10

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install uv
        uses: astral-sh/setup-uv@v3

      - name: Install backend dependencies
        working-directory: backend
        run: uv sync --group dev

      - name: Run blocking IO regression tests
        working-directory: backend
        run: make test-blocking-io
@@ -1,101 +0,0 @@
name: Publish Containers

on:
  push:
    tags:
      - "v*"

jobs:

  backend-container:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      attestations: write
      id-token: write
    env:
      REGISTRY: ghcr.io
      IMAGE_NAME: ${{ github.repository }}-backend
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
      - name: Log in to the Container registry
        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 #v3.4.0
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 #v5.7.0
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=tag
            type=ref,event=branch
            type=sha
            type=raw,value=latest,enable={{is_default_branch}}
      - name: Build and push Docker image
        id: push
        uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 #v6.18.0
        with:
          context: .
          file: backend/Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      - name: Generate artifact attestation
        uses: actions/attest-build-provenance@v2
        with:
          subject-name: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME}}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true

  frontend-container:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      attestations: write
      id-token: write
    env:
      REGISTRY: ghcr.io
      IMAGE_NAME: ${{ github.repository }}-frontend

    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
      - name: Log in to the Container registry
        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 #v3.4.0
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 #v5.7.0
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=tag
            type=ref,event=branch
            type=sha
            type=raw,value=latest,enable={{is_default_branch}}
      - name: Build and push Docker image
        id: push
        uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 #v6.18.0
        with:
          context: .
          file: frontend/Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      - name: Generate artifact attestation
        uses: actions/attest-build-provenance@v2
        with:
          subject-name: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME}}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true
@@ -1,38 +0,0 @@
name: Label Sync

# Keeps repository labels in sync with the declarative source of truth
# (.github/labels.yml). Runs whenever that file changes on main, and can be
# triggered manually. Additive/update-only — never deletes labels.

on:
  push:
    branches: [main]
    paths:
      - ".github/labels.yml"
      - "scripts/sync_labels.py"
      - ".github/workflows/label-sync.yml"
  workflow_dispatch:

permissions:
  contents: read
  issues: write

concurrency:
  group: label-sync
  cancel-in-progress: false

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Install uv
        uses: astral-sh/setup-uv@v7

      - name: Sync labels
        run: uv run --with pyyaml python scripts/sync_labels.py
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GH_REPO: ${{ github.repository }}
@@ -1,108 +0,0 @@
name: Replay E2E (front-back contract)

# Guards the front-back contract via record/replay (no API key in CI):
#   Layer 1 — backend golden: replay a recorded trace through the real gateway,
#             assert the SSE event sequence matches the committed golden.
#   Layer 2 — full-stack render: real Next.js frontend + real gateway (replay
#             model) + Chromium; assert the replayed turns render in the browser.
# Triggered by changes on EITHER side of the contract so a backend change can no
# longer pass without the frontend-facing checks running.

on:
  push:
    branches: ["main"]
    paths:
      - "frontend/**"
      - "backend/app/gateway/**"
      - "backend/packages/harness/**"
      - "backend/tests/fixtures/replay/**"
      - "backend/tests/replay_provider.py"
      - "backend/tests/_replay_fixture.py"
      - "backend/tests/seed_runs_router.py"
      - "backend/tests/test_replay_golden.py"
      - "backend/scripts/run_replay_gateway.py"
      - ".github/workflows/replay-e2e.yml"
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    paths:
      - "frontend/**"
      - "backend/app/gateway/**"
      - "backend/packages/harness/**"
      - "backend/tests/fixtures/replay/**"
      - "backend/tests/replay_provider.py"
      - "backend/tests/_replay_fixture.py"
      - "backend/tests/seed_runs_router.py"
      - "backend/tests/test_replay_golden.py"
      - "backend/scripts/run_replay_gateway.py"
      - ".github/workflows/replay-e2e.yml"

concurrency:
  group: replay-e2e-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  backend-replay-golden:
    name: Layer 1 — backend golden (no API key)
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v6
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.12"
      - name: Install uv
        uses: astral-sh/setup-uv@v7
      - name: Install backend dependencies
        working-directory: backend
        run: uv sync --group dev
      - name: Replay golden (backend SSE contract)
        working-directory: backend
        run: PYTHONPATH=. uv run pytest tests/test_replay_golden.py -v

  fullstack-replay-render:
    name: Layer 2 — full-stack render (no API key)
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    timeout-minutes: 25
    steps:
      - uses: actions/checkout@v6
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.12"
      - name: Install uv
        uses: astral-sh/setup-uv@v7
      - name: Install backend dependencies (replay gateway)
        working-directory: backend
        run: uv sync --group dev
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "22"
      - name: Enable Corepack
        run: corepack enable
      - name: Use pinned pnpm version
        run: corepack prepare pnpm@10.26.2 --activate
      - name: Install frontend dependencies
        working-directory: frontend
        run: pnpm install --frozen-lockfile
      - name: Install Playwright Chromium
        working-directory: frontend
        run: npx playwright install chromium --with-deps
      - name: Full-stack replay render (DOM assertions are the gate)
        working-directory: frontend
        run: pnpm exec playwright test -c playwright.real-backend.config.ts
      - name: Upload report + render artifact
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: replay-render
          path: |
            frontend/playwright-report/
            frontend/test-results/
          retention-days: 7
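Layer 1 of the replay workflow asserts that a replayed SSE event sequence matches a committed golden. The sketch below shows one way such a comparison could be written. It is an illustration only: `parse_sse`, `first_mismatch`, the sample stream and the event names are all hypothetical, and the actual `tests/test_replay_golden.py` is not shown in this diff.

```python
def parse_sse(stream_text):
    """Split a text/event-stream body into (event, data) tuples."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far, if it has data.
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # drop the single optional space after the colon
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
    return events


def first_mismatch(actual, golden):
    """Return the index of the first differing item, or None if both match."""
    for index, (got, want) in enumerate(zip(actual, golden)):
        if got != want:
            return index
    if len(actual) != len(golden):
        return min(len(actual), len(golden))
    return None


# Made-up stream and golden sequence, for illustration only.
replayed = (
    "event: metadata\n"
    'data: {"run": 1}\n'
    "\n"
    ": ping\n"
    "\n"
    "event: values\n"
    "data: {}\n"
    "\n"
    "event: end\n"
    "data: null\n"
    "\n"
)
golden_names = ["metadata", "values", "end"]
names = [name for name, _ in parse_sse(replayed)]
print(first_mismatch(names, golden_names))
```

Comparing only the event names keeps the check stable when payload details such as timestamps vary between runs; a stricter golden could compare the data fields as well.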
@@ -1,223 +0,0 @@
name: Triage

# One workflow for all event-driven PR/issue labeling. Replaces the former
# pr-labeler / pr-triage / issue-triage workflows (and drops actions/labeler).
#
# Design notes:
#   * All jobs are pure-metadata: they read changed-file lists / PR fields / the
#     review payload via the API and write labels. PR code is NEVER checked out
#     or executed, so pull_request_target is safe here.
#   * Each job only reconciles labels in namespaces IT owns
#     (area:* / size/* / risk:* / needs-validation). It never touches labels
#     applied by maintainers or other tools (bug, priority, etc.).
#     first-time-contributor and reviewing are add-only.
#   * State is read LIVE (listFiles + listLabelsOnIssue) at run time, not from
#     the (stale) event payload, so rapid synchronize events converge instead
#     of thrashing.

on:
  pull_request_target:
    types: [opened, synchronize, reopened, ready_for_review]
  pull_request_review:
    types: [submitted]
  issues:
    types: [opened]

permissions:
  contents: read
  pull-requests: write
  issues: write

jobs:
  # ── PR: area / size / risk / needs-validation / first-time ─────────────────
  pr-labels:
    if: github.event_name == 'pull_request_target' && github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    concurrency:
      group: triage-pr-${{ github.event.pull_request.number }}
      cancel-in-progress: true
    steps:
      - name: Apply PR labels from live state
        uses: actions/github-script@v8
        with:
          script: |
            const pr = context.payload.pull_request;
            const { owner, repo } = context.repo;
            const num = pr.number;

            // ---- live changed files ----
            const files = await github.paginate(github.rest.pulls.listFiles, {
              owner, repo, pull_number: num, per_page: 100,
            });
            const paths = files.map(f => f.filename);
            const m = (re) => paths.some(p => re.test(p));

            // ---- area: replaces .github/labeler.yml (path -> area) ----
            const AREA_RULES = [
              ['area:frontend', [/^frontend\//]],
              ['area:backend', [/^backend\/app\//, /^backend\/packages\/harness\/deerflow\/(runtime|persistence|config|tools|guardrails|tracing|models|utils|uploads)\//]],
              ['area:agents', [/^backend\/packages\/harness\/deerflow\/(agents|subagents|reflection)\//, /(^|\/)langgraph\.json$/, /^backend\/.*\/prompts\//]],
              ['area:sandbox', [/^docker\//, /^backend\/packages\/harness\/deerflow\/sandbox\//, /(^|\/)Dockerfile$/]],
              ['area:skills', [/^skills\//, /^backend\/packages\/harness\/deerflow\/skills\//, /^frontend\/src\/core\/skills\//]],
              ['area:mcp', [/^backend\/packages\/harness\/deerflow\/mcp\//, /^frontend\/src\/core\/mcp\//]],
              ['area:ci', [/^\.github\//, /^scripts\//]],
              ['area:docs', [/^docs\//, /\.mdx?$/]],
              ['area:deps', [/(^|\/)(pyproject\.toml|uv\.lock|package\.json|pnpm-lock\.yaml)$/]],
            ];
            const areaLabels = AREA_RULES
              .filter(([, res]) => res.some(re => m(re)))
              .map(([label]) => label);

            // ---- size: additions+deletions, excluding lockfiles/snapshots ----
            const EXCLUDE_SIZE = /(^|\/)(uv\.lock|pnpm-lock\.yaml|package-lock\.json)$|\.snap$/;
            const churn = files
              .filter(f => !EXCLUDE_SIZE.test(f.filename))
              .reduce((s, f) => s + (f.additions || 0) + (f.deletions || 0), 0);
            const sizeLabel =
              churn < 20 ? 'size/XS' :
              churn < 100 ? 'size/S' :
              churn < 300 ? 'size/M' :
              churn < 700 ? 'size/L' : 'size/XL';

            // ---- risk ----
            const docsOnly = paths.length > 0 && paths.every(p =>
              /\.(md|mdx|txt)$/i.test(p) || p.startsWith('docs/') ||
              /\.(png|jpe?g|gif|svg|webp|ico)$/i.test(p));
            const highRisk =
              m(/^backend\/app\/gateway\//) ||
              m(/^backend\/packages\/harness\/deerflow\/(agents|subagents|sandbox)\//) ||
              m(/(^|\/)langgraph\.json$/) ||
              m(/(^|\/)(auth|authz|security)/i) ||
              m(/(pyproject\.toml|uv\.lock|package\.json|pnpm-lock\.yaml)$/) ||
              m(/^docker\//) ||
              m(/^\.github\/workflows\//);
            const riskLabel = docsOnly ? 'risk:low' : (highRisk ? 'risk:high' : 'risk:medium');

            // ---- needs-validation: front/back contract surface ----
            const contract =
              m(/^backend\/app\/gateway\//) ||
              m(/^backend\/packages\/harness\/deerflow\/(agents|subagents)\//) ||
              m(/(^|\/)langgraph\.json$/) ||
              m(/^frontend\/src\/core\/(api|threads|messages)\//);

            // ---- live current labels (NOT the stale event payload) ----
            const current = (await github.paginate(github.rest.issues.listLabelsOnIssue, {
              owner, repo, issue_number: num, per_page: 100,
            })).map(l => l.name);
            const hasSkip = current.includes('skip-validation');

            // Reconcile ONLY namespaces we own; never touch others.
            const owned = (n) =>
              n.startsWith('area:') || n.startsWith('size/') ||
              n.startsWith('risk:') || n === 'needs-validation';
            const desired = new Set([...areaLabels, sizeLabel, riskLabel]);
            if (contract && !hasSkip) desired.add('needs-validation');

            const toRemove = current.filter(n => owned(n) && !desired.has(n));
            const toAdd = [...desired].filter(n => !current.includes(n));

            // first-time-contributor: add-only, on opened, real users only.
            if (context.payload.action === 'opened' &&
                pr.user.type === 'User' &&
                ['FIRST_TIME_CONTRIBUTOR', 'FIRST_TIMER'].includes(pr.author_association) &&
                !current.includes('first-time-contributor')) {
              toAdd.push('first-time-contributor');
            }

            for (const name of toRemove) {
              try {
                await github.rest.issues.removeLabel({ owner, repo, issue_number: num, name });
              } catch (e) {
                if (e.status !== 404) throw e;
              }
            }
            if (toAdd.length) {
              await github.rest.issues.addLabels({ owner, repo, issue_number: num, labels: toAdd });
            }
            core.info(`area=[${areaLabels.join(',')}] ${sizeLabel} ${riskLabel} churn=${churn} ` +
              `validation=${desired.has('needs-validation')} ` +
              `(+${toAdd.join(',') || '-'} / -${toRemove.join(',') || '-'})`);

  # ── PR: reviewing label on a maintainer's human review ─────────────────────
  reviewing:
    if: github.event_name == 'pull_request_review'
    runs-on: ubuntu-latest
    concurrency:
      group: triage-review-${{ github.event.pull_request.number }}
      cancel-in-progress: false
    steps:
      - name: Add reviewing label for maintainer reviews
        uses: actions/github-script@v8
        with:
          script: |
            const { owner, repo } = context.repo;
            const num = context.payload.pull_request.number;
            const review = context.payload.review;
            const assoc = review.author_association; // payload field; no API call
            const type = review.user && review.user.type;

            // author_association is NONE for every automated reviewer
            // (Copilot, CodeRabbit, Codex, Sourcery, ...), so this allowlist
            // drops them all without a denylist — and never calls the
            // collaborators API that 404s on "Copilot is not a user".
            // user.type === 'User' guards the rare bot-added-as-collaborator case.
            if (!['OWNER', 'MEMBER', 'COLLABORATOR'].includes(assoc) || type !== 'User') {
              core.info(`reviewer ${review.user && review.user.login} assoc=${assoc} type=${type}; skipping.`);
              return;
            }

            const labels = (await github.paginate(github.rest.issues.listLabelsOnIssue, {
              owner, repo, issue_number: num, per_page: 100,
            })).map(l => l.name);
            if (labels.includes('reviewing')) {
              core.info('Already labeled reviewing; skipping.');
              return;
            }
            try {
              await github.rest.issues.addLabels({
                owner, repo, issue_number: num, labels: ['reviewing'],
              });
              core.info('Added "reviewing".');
            } catch (e) {
              if (e.status === 403) core.info('No permission to label (expected on some fork PRs).');
              else throw e;
            }

  # ── Issue: needs-triage on every new issue ────────────────────────────────
  issue-triage:
    if: github.event_name == 'issues'
    runs-on: ubuntu-latest
    concurrency:
      group: triage-issue-${{ github.event.issue.number }}
      cancel-in-progress: false
    steps:
      - name: Add needs-triage label
        uses: actions/github-script@v8
        with:
          script: |
            const { owner, repo } = context.repo;
            const issue_number = context.payload.issue.number;

            // Read live labels (not the event payload) so labels added at creation
            // time via the API or by another automation are seen — consistent with
            // the live-state reads in the PR jobs above.
            const current = (await github.paginate(github.rest.issues.listLabelsOnIssue, {
              owner, repo, issue_number, per_page: 100,
            })).map(l => l.name);
            if (current.includes('needs-triage')) {
              core.info('Issue already has needs-triage; nothing to do.');
              return;
            }
            // Self-heal: create the label if it does not exist yet.
            try {
              await github.rest.issues.createLabel({
                owner, repo, name: 'needs-triage', color: 'fef2c0',
                description: 'Awaiting maintainer triage',
              });
            } catch (e) {
              if (e.status !== 422) throw e; // 422 = already exists
            }
            await github.rest.issues.addLabels({
              owner, repo, issue_number, labels: ['needs-triage'],
            });
            core.info(`Added needs-triage to #${issue_number}.`);
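The size bucketing and owned-namespace reconciliation in the `pr-labels` job can be restated outside the workflow as a small Python sketch. The thresholds and the owned prefixes are copied from the script; the function names `size_label`, `is_owned` and `reconcile` are hypothetical and exist only for this illustration.

```python
def size_label(churn):
    """Map additions + deletions to a size label using the workflow's thresholds."""
    if churn < 20:
        return "size/XS"
    if churn < 100:
        return "size/S"
    if churn < 300:
        return "size/M"
    if churn < 700:
        return "size/L"
    return "size/XL"


def is_owned(name):
    """True for labels in namespaces the triage job is allowed to remove."""
    return name.startswith(("area:", "size/", "risk:")) or name == "needs-validation"


def reconcile(current, desired):
    """Return (to_add, to_remove) without touching labels outside owned namespaces."""
    desired = set(desired)
    to_remove = [name for name in current if is_owned(name) and name not in desired]
    to_add = sorted(name for name in desired if name not in current)
    return to_add, to_remove


# Illustrative PR state: "bug" was applied by a maintainer and must survive.
current = ["bug", "size/S", "area:docs", "risk:low"]
desired = ["area:backend", size_label(450), "risk:high", "needs-validation"]
to_add, to_remove = reconcile(current, desired)
print(to_add)
print(to_remove)
```

Because each comparison is a strict `<`, a boundary value falls into the larger bucket: a churn of exactly 100 is `size/M`, even though the label descriptions list it under both "20-100" and "100-300".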
+19
-28
@@ -46,12 +46,12 @@ Docker provides a consistent, isolated environment with all dependencies pre-con
 All services will start with hot-reload enabled:
 - Frontend changes are automatically reloaded
 - Backend changes trigger automatic restart
-- Gateway-hosted LangGraph-compatible runtime supports hot-reload
+- LangGraph server supports hot-reload

 4. **Access the application**:
 - Web Interface: http://localhost:2026
 - API Gateway: http://localhost:2026/api/*
-- LangGraph-compatible API: http://localhost:2026/api/langgraph/*
+- LangGraph: http://localhost:2026/api/langgraph/*

 #### Docker Commands

@@ -94,7 +94,7 @@ Use these as practical starting points for development and review environments:
 If `make docker-init`, `make docker-start`, or `make docker-stop` fails on Linux with an error like below, your current user likely does not have permission to access the Docker daemon socket:

 ```text
-unable to get image 'deer-flow-gateway': permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock
+unable to get image 'deer-flow-dev-langgraph': permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock
 ```

 Recommended fix: add your current user to the `docker` group so Docker commands work without `sudo`.
@@ -131,8 +131,9 @@ Host Machine
 Docker Compose (deer-flow-dev)
 ├→ nginx (port 2026) ← Reverse proxy
 ├→ web (port 3000) ← Frontend with hot-reload
-├→ gateway (port 8001) ← Gateway API + LangGraph-compatible runtime with hot-reload
+├→ api (port 8001) ← Gateway API with hot-reload
|
||||||
└→ provisioner (optional, port 8002) ← Started only in provisioner/K8s sandbox mode
|
├→ langgraph (port 2024) ← LangGraph server with hot-reload
|
||||||
|
└→ provisioner (optional, port 8002) ← Started only in provisioner/K8s sandbox mode
|
||||||
```
|
```
|
||||||
|
|
||||||
**Benefits of Docker Development**:
|
**Benefits of Docker Development**:
|
||||||
@@ -183,13 +184,17 @@ Required tools:
|
|||||||
|
|
||||||
If you need to start services individually:
|
If you need to start services individually:
|
||||||
|
|
||||||
1. **Start backend service**:
|
1. **Start backend services**:
|
||||||
```bash
|
```bash
|
||||||
# Terminal 1: Start Gateway API + embedded agent runtime (port 8001)
|
# Terminal 1: Start LangGraph Server (port 2024)
|
||||||
cd backend
|
cd backend
|
||||||
make dev
|
make dev
|
||||||
|
|
||||||
# Terminal 2: Start Frontend (port 3000)
|
# Terminal 2: Start Gateway API (port 8001)
|
||||||
|
cd backend
|
||||||
|
make gateway
|
||||||
|
|
||||||
|
# Terminal 3: Start Frontend (port 3000)
|
||||||
cd frontend
|
cd frontend
|
||||||
pnpm dev
|
pnpm dev
|
||||||
```
|
```
|
||||||
@@ -207,10 +212,10 @@ If you need to start services individually:
|
|||||||
|
|
||||||
The nginx configuration provides:
|
The nginx configuration provides:
|
||||||
- Unified entry point on port 2026
|
- Unified entry point on port 2026
|
||||||
- Rewrites `/api/langgraph/*` to Gateway's LangGraph-compatible API (8001)
|
- Routes `/api/langgraph/*` to LangGraph Server (2024)
|
||||||
- Routes other `/api/*` endpoints to Gateway API (8001)
|
- Routes other `/api/*` endpoints to Gateway API (8001)
|
||||||
- Routes non-API requests to Frontend (3000)
|
- Routes non-API requests to Frontend (3000)
|
||||||
- Same-origin API routing; split-origin or port-forwarded browser clients should use the Gateway `GATEWAY_CORS_ORIGINS` allowlist
|
- Centralized CORS handling
|
||||||
- SSE/streaming support for real-time agent responses
|
- SSE/streaming support for real-time agent responses
|
||||||
- Optimized timeouts for long-running operations
|
- Optimized timeouts for long-running operations
|
||||||
|
|
||||||
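The routing rules listed above can be sketched as a small prefix match. This is only an illustration, not the project's nginx configuration: the function name `pick_upstream` and the `langgraph_port` parameter are hypothetical, and real nginx location matching is driven by the config file rather than by code like this.

```python
def pick_upstream(path: str, langgraph_port: int = 2024) -> int:
    """Return the local port a request path would be proxied to.

    Defaults follow the layout with a separate LangGraph Server on 2024;
    pass langgraph_port=8001 to model the Gateway-hosted layout instead.
    """
    # Check the most specific prefix first so /api/langgraph/* is not
    # swallowed by the generic /api/* rule.
    if path == "/api/langgraph" or path.startswith("/api/langgraph/"):
        return langgraph_port
    if path == "/api" or path.startswith("/api/"):
        return 8001  # Gateway API
    return 3000  # Frontend handles every non-API request


if __name__ == "__main__":
    for p in ("/", "/workspace", "/api/models", "/api/langgraph/threads"):
        print(p, "->", pick_upstream(p))
```

Checking the longer prefix first is the design point: with the order reversed, agent traffic would land on the Gateway API rule.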
@@ -230,8 +235,8 @@ deer-flow/
|
|||||||
│ └── nginx.local.conf # Nginx config for local dev
|
│ └── nginx.local.conf # Nginx config for local dev
|
||||||
├── backend/ # Backend application
|
├── backend/ # Backend application
|
||||||
│ ├── src/
|
│ ├── src/
|
||||||
│ │ ├── gateway/ # Gateway API and LangGraph-compatible runtime (port 8001)
|
│ │ ├── gateway/ # Gateway API (port 8001)
|
||||||
│ │ ├── agents/ # LangGraph agent runtime used by Gateway
|
│ │ ├── agents/ # LangGraph agents (port 2024)
|
||||||
│ │ ├── mcp/ # Model Context Protocol integration
|
│ │ ├── mcp/ # Model Context Protocol integration
|
||||||
│ │ ├── skills/ # Skills system
|
│ │ ├── skills/ # Skills system
|
||||||
│ │ └── sandbox/ # Sandbox execution
|
│ │ └── sandbox/ # Sandbox execution
|
||||||
@@ -251,7 +256,8 @@ Browser
|
|||||||
↓
|
↓
|
||||||
Nginx (port 2026) ← Unified entry point
|
Nginx (port 2026) ← Unified entry point
|
||||||
├→ Frontend (port 3000) ← / (non-API requests)
|
├→ Frontend (port 3000) ← / (non-API requests)
|
||||||
└→ Gateway API (port 8001) ← /api/* and /api/langgraph/* (LangGraph-compatible agent interactions)
|
├→ Gateway API (port 8001) ← /api/models, /api/mcp, /api/skills, /api/threads/*/artifacts
|
||||||
|
└→ LangGraph Server (port 2024) ← /api/langgraph/* (agent interactions)
|
||||||
```
|
```
|
||||||
|
|
||||||
## Development Workflow
|
## Development Workflow
|
||||||
@@ -287,21 +293,6 @@ Nginx (port 2026) ← Unified entry point
|
|||||||
git push origin feature/your-feature-name
|
git push origin feature/your-feature-name
|
||||||
```
|
```
|
||||||
|
|
||||||
-## AI assistance disclosure
-
-DeerFlow is an AI project and we welcome AI-assisted contributions. To help
-reviewers calibrate how closely to read a change, **every pull request must
-complete the "AI assistance" section of the
-[PR template](.github/pull_request_template.md)**:
-
-- which tool(s) you used (or `none`),
-- how you used them, and
-- a confirmation that a human has read, understands, and takes responsibility
-  for the change.
-
-Please don't delete the section. PRs that ignore it may be asked to fill it in
-before review.
-
## Testing
|
## Testing
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# DeerFlow - Unified Development Environment
|
# DeerFlow - Unified Development Environment
|
||||||
|
|
||||||
.PHONY: help config config-upgrade check install setup doctor detect-thread-boundaries detect-blocking-io dev dev-daemon start start-daemon stop up down clean docker-init docker-start docker-stop docker-logs docker-logs-frontend docker-logs-gateway
|
.PHONY: help config config-upgrade check install setup doctor dev dev-daemon start start-daemon stop up down clean docker-init docker-start docker-stop docker-logs docker-logs-frontend docker-logs-gateway
|
||||||
|
|
||||||
BASH ?= bash
|
BASH ?= bash
|
||||||
BACKEND_UV_RUN = cd backend && uv run
|
BACKEND_UV_RUN = cd backend && uv run
|
||||||
@@ -23,8 +23,6 @@ help:
|
|||||||
@echo " make config - Generate local config files (aborts if config already exists)"
|
@echo " make config - Generate local config files (aborts if config already exists)"
|
||||||
@echo " make config-upgrade - Merge new fields from config.example.yaml into config.yaml"
|
@echo " make config-upgrade - Merge new fields from config.example.yaml into config.yaml"
|
||||||
@echo " make check - Check if all required tools are installed"
|
@echo " make check - Check if all required tools are installed"
|
||||||
@echo " make detect-thread-boundaries - Inventory async/thread boundary points"
|
|
||||||
@echo " make detect-blocking-io - Inventory blocking IO that may block the backend event loop"
|
|
||||||
@echo " make install - Install all dependencies (frontend + backend + pre-commit hooks)"
|
@echo " make install - Install all dependencies (frontend + backend + pre-commit hooks)"
|
||||||
@echo " make setup-sandbox - Pre-pull sandbox container image (recommended)"
|
@echo " make setup-sandbox - Pre-pull sandbox container image (recommended)"
|
||||||
@echo " make dev - Start all services in development mode (with hot-reloading)"
|
@echo " make dev - Start all services in development mode (with hot-reloading)"
|
||||||
@@ -53,12 +51,6 @@ setup:
|
|||||||
doctor:
|
doctor:
|
||||||
@$(BACKEND_UV_RUN) python ../scripts/doctor.py
|
@$(BACKEND_UV_RUN) python ../scripts/doctor.py
|
||||||
|
|
||||||
detect-thread-boundaries:
|
|
||||||
@$(PYTHON) ./scripts/detect_thread_boundaries.py
|
|
||||||
|
|
||||||
detect-blocking-io:
|
|
||||||
@$(MAKE) -C backend detect-blocking-io
|
|
||||||
|
|
||||||
config:
|
config:
|
||||||
@$(PYTHON) ./scripts/configure.py
|
@$(PYTHON) ./scripts/configure.py
|
||||||
|
|
||||||
@@ -89,7 +81,36 @@ install:
|
|||||||
|
|
||||||
# Pre-pull sandbox Docker image (optional but recommended)
|
# Pre-pull sandbox Docker image (optional but recommended)
|
||||||
setup-sandbox:
|
setup-sandbox:
|
||||||
@$(RUN_WITH_GIT_BASH) ./scripts/setup-sandbox.sh
|
@echo "=========================================="
|
||||||
|
@echo " Pre-pulling Sandbox Container Image"
|
||||||
|
@echo "=========================================="
|
||||||
|
@echo ""
|
||||||
|
@IMAGE=$$(grep -A 20 "# sandbox:" config.yaml 2>/dev/null | grep "image:" | awk '{print $$2}' | head -1); \
|
||||||
|
if [ -z "$$IMAGE" ]; then \
|
||||||
|
IMAGE="enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest"; \
|
||||||
|
echo "Using default image: $$IMAGE"; \
|
||||||
|
else \
|
||||||
|
echo "Using configured image: $$IMAGE"; \
|
||||||
|
fi; \
|
||||||
|
echo ""; \
|
||||||
|
if command -v container >/dev/null 2>&1 && [ "$$(uname)" = "Darwin" ]; then \
|
||||||
|
echo "Detected Apple Container on macOS, pulling image..."; \
|
||||||
|
container image pull "$$IMAGE" || echo "⚠ Apple Container pull failed, will try Docker"; \
|
||||||
|
fi; \
|
||||||
|
if command -v docker >/dev/null 2>&1; then \
|
||||||
|
echo "Pulling image using Docker..."; \
|
||||||
|
if docker pull "$$IMAGE"; then \
|
||||||
|
echo ""; \
|
||||||
|
echo "✓ Sandbox image pulled successfully"; \
|
||||||
|
else \
|
||||||
|
echo ""; \
|
||||||
|
echo "⚠ Failed to pull sandbox image (this is OK for local sandbox mode)"; \
|
||||||
|
fi; \
|
||||||
|
else \
|
||||||
|
echo "✗ Neither Docker nor Apple Container is available"; \
|
||||||
|
echo " Please install Docker: https://docs.docker.com/get-docker/"; \
|
||||||
|
exit 1; \
|
||||||
|
fi
|
||||||
|
|
||||||
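To make the image-selection pipeline in the inline `setup-sandbox` recipe easier to follow, here is a rough Python equivalent of the `grep -A 20 | grep | awk | head -1` chain. It is a sketch under assumptions: `resolve_sandbox_image` is a hypothetical helper, not part of the repository, and it only mirrors the text matching, not the pull step.

```python
DEFAULT_IMAGE = (
    "enterprise-public-cn-beijing.cr.volces.com/"
    "vefaas-public/all-in-one-sandbox:latest"
)


def resolve_sandbox_image(config_text: str, window: int = 20) -> str:
    """Mimic: grep -A 20 "# sandbox:" | grep "image:" | awk '{print $2}' | head -1."""
    lines = config_text.splitlines()
    selected = set()
    for i, line in enumerate(lines):
        # grep -A N keeps the matching line plus the N lines after it.
        if "# sandbox:" in line:
            selected.update(range(i, min(i + window + 1, len(lines))))
    for i in sorted(selected):
        if "image:" in lines[i]:
            fields = lines[i].split()
            # awk's $2 is the second whitespace-separated field (may be empty).
            value = fields[1] if len(fields) > 1 else ""
            return value or DEFAULT_IMAGE
    return DEFAULT_IMAGE


if __name__ == "__main__":
    sample = "# sandbox:\n  image: example.invalid/sandbox:dev\n"
    print(resolve_sandbox_image(sample))
```

Read this way, the lookup only triggers on a line containing the literal text `# sandbox:`; a config without that marker falls through to the default image.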
# Start all services in development mode (with hot-reloading)
|
# Start all services in development mode (with hot-reloading)
|
||||||
dev:
|
dev:
|
||||||
@@ -119,6 +140,7 @@ stop:
|
|||||||
clean: stop
|
clean: stop
|
||||||
@echo "Cleaning up..."
|
@echo "Cleaning up..."
|
||||||
@-rm -rf backend/.deer-flow 2>/dev/null || true
|
@-rm -rf backend/.deer-flow 2>/dev/null || true
|
||||||
|
@-rm -rf backend/.langgraph_api 2>/dev/null || true
|
||||||
@-rm -rf logs/*.log 2>/dev/null || true
|
@-rm -rf logs/*.log 2>/dev/null || true
|
||||||
@echo "✓ Cleanup complete"
|
@echo "✓ Cleanup complete"
|
||||||
|
|
||||||
|
|||||||
@@ -245,8 +245,6 @@ make down # Stop and remove containers
|
|||||||
|
|
||||||
Access: http://localhost:2026
|
Access: http://localhost:2026
|
||||||
|
|
||||||
The unified nginx endpoint is same-origin by default and does not emit browser CORS headers. If you run a split-origin or port-forwarded browser client, set `GATEWAY_CORS_ORIGINS` to comma-separated exact origins such as `http://localhost:3000`; the Gateway then applies the CORS allowlist and matching CSRF origin checks.
|
|
||||||
|
|
||||||
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed Docker development guide.
|
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed Docker development guide.
|
||||||
|
|
||||||
#### Option 2: Local Development
|
#### Option 2: Local Development
|
||||||
@@ -546,15 +544,6 @@ LANGFUSE_BASE_URL=https://cloud.langfuse.com
|
|||||||
|
|
||||||
If you are using a self-hosted Langfuse instance, set `LANGFUSE_BASE_URL` to your deployment URL.
|
If you are using a self-hosted Langfuse instance, set `LANGFUSE_BASE_URL` to your deployment URL.
|
||||||
|
|
||||||
-**Trace correlation fields.** Every agent run is annotated with Langfuse's reserved trace attributes so the Sessions and Users pages light up automatically:
-
-- `session_id` = LangGraph `thread_id` — groups every trace of the same conversation
-- `user_id` = effective user from `get_effective_user_id()` (falls back to `default` in no-auth mode)
-- `trace_name` = assistant id (defaults to `lead-agent`)
-- `tags` = `[env:<DEER_FLOW_ENV>, model:<model_name>]` (omitted when not set)
-
-These are injected into `RunnableConfig.metadata` at the graph invocation root for both the gateway path (`runtime/runs/worker.py::run_agent`) and the embedded path (`client.py::DeerFlowClient.stream`), so any LangChain-compatible callback can read them. Set `DEER_FLOW_ENV` (or `ENVIRONMENT`) to tag traces by deployment environment.
|
|
||||||
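The trace correlation fields described above could be assembled roughly as follows. This is a minimal sketch, not DeerFlow's implementation: `build_trace_metadata` is a hypothetical helper, the key names are copied from the list above, and dropping unset tags individually is an assumption about what "omitted when not set" means.

```python
import os
from typing import Optional


def build_trace_metadata(
    thread_id: str,
    user_id: Optional[str] = None,
    assistant_id: Optional[str] = None,
    model_name: Optional[str] = None,
    env: Optional[str] = None,
) -> dict:
    """Build the metadata dict that would be attached to a run's config."""
    metadata = {
        "session_id": thread_id,                     # groups traces per conversation
        "user_id": user_id or "default",             # no-auth fallback
        "trace_name": assistant_id or "lead-agent",  # default assistant id
    }
    tags = []
    if env:
        tags.append(f"env:{env}")
    if model_name:
        tags.append(f"model:{model_name}")
    if tags:  # assumption: the key is left out entirely when no tag is set
        metadata["tags"] = tags
    return metadata


if __name__ == "__main__":
    env = os.environ.get("DEER_FLOW_ENV") or os.environ.get("ENVIRONMENT")
    print(build_trace_metadata("thread-123", env=env, model_name="example-model"))
```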
#### Using Both Providers
|
#### Using Both Providers
|
||||||
|
|
||||||
If both LangSmith and Langfuse are enabled, DeerFlow attaches both tracing callbacks and reports the same model activity to both systems.
|
If both LangSmith and Langfuse are enabled, DeerFlow attaches both tracing callbacks and reports the same model activity to both systems.
|
||||||
@@ -585,8 +574,6 @@ A standard Agent Skill is a structured capability module — a Markdown file tha
|
|||||||
|
|
||||||
Skills are loaded progressively — only when the task needs them, not all at once. This keeps the context window lean and makes DeerFlow work well even with token-sensitive models.
|
Skills are loaded progressively — only when the task needs them, not all at once. This keeps the context window lean and makes DeerFlow work well even with token-sensitive models.
|
||||||
|
|
||||||
Users can explicitly activate an enabled skill for a single turn by starting the request with `/skill-name`, for example `/data-analysis analyze uploads/foo.csv`. DeerFlow loads that skill's `SKILL.md` as hidden current-turn context while leaving the base prompt limited to skill metadata. Slash activation respects disabled skills, custom-agent skill whitelists, and existing channel commands such as `/new` and `/help`.
|
|
||||||
|
|
||||||
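The slash-activation rule described above can be illustrated with a small parser. This is a sketch under assumptions: the function name, the regular expression for valid skill names, and the reserved-command set are all hypothetical; only the `/skill-name task` shape and the `/new` and `/help` examples come from the text.

```python
import re
from typing import Optional, Set, Tuple

# Channel commands that must keep their existing meaning (examples from the text).
RESERVED_COMMANDS = {"new", "help"}

# Assumed name syntax: lowercase letters, digits, and hyphens.
_ACTIVATION = re.compile(r"^/([a-z0-9][a-z0-9-]*)(?:\s+(.*))?$", re.DOTALL)


def parse_skill_activation(
    message: str,
    enabled_skills: Set[str],
    allowed_skills: Optional[Set[str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (skill_name, task) if the message activates a usable skill."""
    match = _ACTIVATION.match(message.strip())
    if not match:
        return None
    name = match.group(1)
    task = (match.group(2) or "").strip()
    if name in RESERVED_COMMANDS:
        return None  # leave channel commands alone
    if name not in enabled_skills:
        return None  # disabled or unknown skill
    if allowed_skills is not None and name not in allowed_skills:
        return None  # custom-agent whitelist excludes it
    return name, task


if __name__ == "__main__":
    print(parse_skill_activation(
        "/data-analysis analyze uploads/foo.csv", {"data-analysis"}
    ))
```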
When you install `.skill` archives through the Gateway, DeerFlow accepts standard optional frontmatter metadata such as `version`, `author`, and `compatibility` instead of rejecting otherwise valid external skills.
|
When you install `.skill` archives through the Gateway, DeerFlow accepts standard optional frontmatter metadata such as `version`, `author`, and `compatibility` instead of rejecting otherwise valid external skills.
|
||||||
|
|
||||||
Tools follow the same philosophy. DeerFlow comes with a core toolset — web search, web fetch, file operations, bash execution — and supports custom tools via MCP servers and Python functions. Swap anything. Add anything.
|
Tools follow the same philosophy. DeerFlow comes with a core toolset — web search, web fetch, file operations, bash execution — and supports custom tools via MCP servers and Python functions. Swap anything. Add anything.
|
||||||
@@ -639,7 +626,7 @@ See [`skills/public/claude-to-deerflow/SKILL.md`](skills/public/claude-to-deerfl
|
|||||||
|
|
||||||
Complex tasks rarely fit in a single pass. DeerFlow decomposes them.
|
Complex tasks rarely fit in a single pass. DeerFlow decomposes them.
|
||||||
|
|
||||||
The lead agent can spawn sub-agents on the fly — each with its own scoped context, tools, and termination conditions. Sub-agents run in parallel when possible, report back structured results, and the lead agent synthesizes everything into a coherent output. When token usage tracking is enabled, completed sub-agent usage is attributed back to the dispatching step.
|
The lead agent can spawn sub-agents on the fly — each with its own scoped context, tools, and termination conditions. Sub-agents run in parallel when possible, report back structured results, and the lead agent synthesizes everything into a coherent output.
|
||||||
|
|
||||||
This is how DeerFlow handles tasks that take minutes to hours: a research task might fan out into a dozen sub-agents, each exploring a different angle, then converge into a single report — or a website — or a slide deck with generated visuals. One harness, many hands.
|
This is how DeerFlow handles tasks that take minutes to hours: a research task might fan out into a dozen sub-agents, each exploring a different angle, then converge into a single report — or a website — or a slide deck with generated visuals. One harness, many hands.
|
||||||
|
|
||||||
@@ -742,12 +729,6 @@ DeerFlow has key high-privilege capabilities including **system command executio
|
|||||||
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, workflow, and guidelines.
|
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, workflow, and guidelines.
|
||||||
|
|
||||||
Regression coverage includes Docker sandbox mode detection and provisioner kubeconfig-path handling tests in `backend/tests/`.
|
Regression coverage includes Docker sandbox mode detection and provisioner kubeconfig-path handling tests in `backend/tests/`.
|
||||||
-Backend blocking-IO diagnostics are available from the repository root with
-`make detect-blocking-io`: it statically scans backend business code for
-blocking IO that may run on the backend event loop, prints a concise summary,
-and writes complete JSON findings to `.deer-flow/blocking-io-findings.json`.
-The JSON includes compact review records with `priority`, `location`,
-`blocking_call`, `event_loop_exposure`, `reason`, and `code`.
Gateway artifact serving now forces active web content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) to download as attachments instead of inline rendering, reducing XSS risk for generated artifacts.
|
Gateway artifact serving now forces active web content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) to download as attachments instead of inline rendering, reducing XSS risk for generated artifacts.
|
||||||
|
|
||||||
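The artifact-serving rule above amounts to choosing a `Content-Disposition` value per MIME type. The sketch below is illustrative only: `content_disposition` is a hypothetical helper, and serving every other type inline is an assumption, since the text only states what happens to the three active types.

```python
# Types a browser would execute or render as active content (from the text).
ACTIVE_CONTENT_TYPES = {"text/html", "application/xhtml+xml", "image/svg+xml"}


def content_disposition(mime_type: str, filename: str) -> str:
    """Pick a Content-Disposition header value for a generated artifact."""
    # Ignore parameters such as "; charset=utf-8" and normalise case.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    kind = "attachment" if base_type in ACTIVE_CONTENT_TYPES else "inline"
    safe_name = filename.replace('"', "")  # keep the quoted string well-formed
    return f'{kind}; filename="{safe_name}"'


if __name__ == "__main__":
    print(content_disposition("text/html; charset=utf-8", "report.html"))
    print(content_disposition("image/png", "chart.png"))
```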
## License
|
## License
|
||||||
|
|||||||
+3
-3
@@ -228,7 +228,7 @@ make down # Stop and remove containers
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> The agent runtime currently runs inside the Gateway. nginx rewrites `/api/langgraph/*` to the LangGraph-compatible API served by the Gateway.
|
> The LangGraph agent server currently runs via `langgraph dev` (the open-source CLI server).
|
||||||
|
|
||||||
Access: http://localhost:2026
|
Access: http://localhost:2026
|
||||||
|
|
||||||
@@ -296,8 +296,8 @@ DeerFlow peut recevoir des tâches depuis des applications de messagerie. Les ca
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraph-compatible Gateway API base URL (default: http://localhost:8001/api)
|
# LangGraph Server URL (default: http://localhost:2024)
|
||||||
langgraph_url: http://localhost:8001/api
|
langgraph_url: http://localhost:2024
|
||||||
# Gateway API URL (default: http://localhost:8001)
|
# Gateway API URL (default: http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
+3
-3
@@ -181,7 +181,7 @@ make down # Stop and remove containers
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> The agent runtime currently runs inside the Gateway. nginx rewrites `/api/langgraph/*` to the Gateway's LangGraph-compatible API.
|
> The LangGraph agent server currently runs via `langgraph dev` (the open-source CLI server).
|
||||||
|
|
||||||
Access: http://localhost:2026
|
Access: http://localhost:2026
|
||||||
|
|
||||||
@@ -249,8 +249,8 @@ DeerFlowはメッセージングアプリからのタスク受信をサポート
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraph-compatible Gateway API base URL (default: http://localhost:8001/api)
|
# LangGraph Server URL (default: http://localhost:2024)
|
||||||
langgraph_url: http://localhost:8001/api
|
langgraph_url: http://localhost:2024
|
||||||
# Gateway API URL (default: http://localhost:8001)
|
# Gateway API URL (default: http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
+3
-3
@@ -184,7 +184,7 @@ make down # Stop and remove containers
|
|||||||
```
|
```
|
||||||
|
|
||||||
> [!NOTE]
|
> [!NOTE]
|
||||||
> The agent runtime currently runs embedded in the Gateway; nginx rewrites `/api/langgraph/*` to the Gateway's LangGraph-compatible API.
|
> The LangGraph agent server currently runs via `langgraph dev`, the open-source CLI server.
|
||||||
|
|
||||||
Access: http://localhost:2026
|
Access: http://localhost:2026
|
||||||
|
|
||||||
@@ -254,8 +254,8 @@ DeerFlow 支持从即时通讯应用接收任务。只要配置完成,对应
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
channels:
|
channels:
|
||||||
# LangGraph-compatible Gateway API base URL (default: http://localhost:8001/api)
|
# LangGraph Server URL (default: http://localhost:2024)
|
||||||
langgraph_url: http://localhost:8001/api
|
langgraph_url: http://localhost:2024
|
||||||
# Gateway API URL (default: http://localhost:8001)
|
# Gateway API URL (default: http://localhost:8001)
|
||||||
gateway_url: http://localhost:8001
|
gateway_url: http://localhost:8001
|
||||||
|
|
||||||
|
|||||||
@@ -24,10 +24,5 @@ config.yaml
|
|||||||
# Langgraph
|
# Langgraph
|
||||||
.langgraph_api
|
.langgraph_api
|
||||||
|
|
||||||
# Sandbox runtime working dir — pre-created and excluded from uvicorn reload
|
|
||||||
# (scripts/serve.sh, docker/dev-entrypoint.sh). Anchored so it does not match
|
|
||||||
# the source package backend/packages/harness/deerflow/sandbox/.
|
|
||||||
/sandbox/
|
|
||||||
|
|
||||||
# Claude Code settings
|
# Claude Code settings
|
||||||
.claude/settings.local.json
|
.claude/settings.local.json
|
||||||
|
|||||||
+29
-106
@@ -88,57 +88,18 @@ make stop # Stop all services
|
|||||||
|
|
||||||
**Backend directory** (for backend development only):
|
**Backend directory** (for backend development only):
|
||||||
```bash
|
```bash
|
||||||
make install # Install backend dependencies
|
make install # Install backend dependencies
|
||||||
make dev # Run Gateway API with reload (port 8001)
|
make dev # Run Gateway API with reload (port 8001)
|
||||||
make gateway # Run Gateway API only (port 8001)
|
make gateway # Run Gateway API only (port 8001)
|
||||||
make test # Run all backend tests
|
make test # Run all backend tests
|
||||||
make test-blocking-io # Run strict Blockbuster runtime gate on tests/blocking_io/
|
make lint # Lint with ruff
|
||||||
make lint # Lint with ruff
|
make format # Format code with ruff
|
||||||
make format # Format code with ruff
|
|
||||||
```
|
```
|
||||||
|
|
||||||
-The `detect-blocking-io` target parses `app/`, `packages/harness/deerflow/`,
-and `scripts/` with AST. By default it reports only blocking IO candidates that
-are inside async code, reachable from async code in the same file, or reachable
-from sync-only `AgentMiddleware` before/after hooks that LangGraph can execute
-on the async graph path. It prints a concise summary and writes complete JSON
-findings to `.deer-flow/blocking-io-findings.json` at the repository root
-(both `make detect-blocking-io` from the repo root and `cd backend && make
-detect-blocking-io` resolve to the same repo-root path). JSON findings include
-`priority`, `location`, `blocking_call`, `event_loop_exposure`, `reason`, and
-`code` for model-assisted or manual review. `priority` is a deterministic
-review ordering from operation type, not proof of a bug. Bare-name same-file
-calls are resolved by function name, so duplicate helper names in one file can
-conservatively over-report async reachability. It is intentionally
-informational and is not run from CI in this round.
|
|
||||||
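As a rough illustration of the AST approach described above, the sketch below flags a few well-known blocking calls that appear lexically inside `async def` bodies. It is a deliberately reduced stand-in, not the project's detector: the `BLOCKING_CALLS` set and function names are assumptions, and it performs no same-file reachability analysis or middleware-hook handling.

```python
import ast
from typing import List, Optional, Tuple

# Illustrative list only; a real detector would cover far more operations.
BLOCKING_CALLS = {"open", "time.sleep", "requests.get", "subprocess.run"}


def dotted_name(node: ast.AST) -> Optional[str]:
    """Turn a Name/Attribute chain such as time.sleep into 'time.sleep'."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def find_blocking_calls(source: str) -> List[Tuple[str, str, int]]:
    """Return (async_function, call_name, line) for calls inside async defs."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.AsyncFunctionDef):
            continue
        # Note: nested async functions are visited again on their own, so
        # their calls are reported once per enclosing async def.
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                name = dotted_name(node.func)
                if name in BLOCKING_CALLS:
                    findings.append((func.name, name, node.lineno))
    return findings


if __name__ == "__main__":
    sample = "import time\nasync def handler():\n    time.sleep(1)\n"
    print(find_blocking_calls(sample))
```

Like the target it imitates, a scan of this kind produces candidates for review rather than proof of a bug.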
Regression tests related to Docker/provisioner behavior:
|
Regression tests related to Docker/provisioner behavior:
|
||||||
- `tests/test_docker_sandbox_mode_detection.py` (mode detection from `config.yaml`)
|
- `tests/test_docker_sandbox_mode_detection.py` (mode detection from `config.yaml`)
|
||||||
- `tests/test_provisioner_kubeconfig.py` (kubeconfig file/directory handling)
|
- `tests/test_provisioner_kubeconfig.py` (kubeconfig file/directory handling)
|
||||||
|
|
||||||
-Blocking-IO runtime gate (`tests/blocking_io/`):
-- Wraps every item under `tests/blocking_io/` with a strict Blockbuster
-  context scoped to `app.*` and `deerflow.*` (see
-  `tests/support/detectors/blocking_io_runtime.py`). Any sync blocking IO
-  call whose stack passes through DeerFlow business code while running on
-  the asyncio event loop raises `BlockingError` and fails the test.
-- Regression anchors live there: `test_skills_load.py` (locks the
-  `asyncio.to_thread` offload around `LocalSkillStorage.load_skills`, fix
-  for #1917); `test_sqlite_lifespan.py` (locks the offload around
-  SQLite path resolution plus `ensure_sqlite_parent_dir`, fix for #1912);
-  `test_jsonl_run_event_store.py` (locks `JsonlRunEventStore`'s async
-  API offloading its file IO via `asyncio.to_thread`, fix #3084); and
-  `test_uploads_middleware.py` (locks `UploadsMiddleware.abefore_agent`
-  offloading the uploads-directory scan off the event loop).
-- `test_gate_smoke.py` is a meta-test asserting the gate actually catches
-  unoffloaded blocking IO and that the `@pytest.mark.allow_blocking_io`
-  opt-out works.
-- Coverage boundary: the gate only sees code that test execution actually
-  touches. Static AST coverage is a separate concern (out of scope for
-  this PR).
-- CI: runs on every PR via `.github/workflows/backend-blocking-io-tests.yml`,
-  hard-fail.
|
|
||||||
Boundary check (harness → app import firewall):
|
Boundary check (harness → app import firewall):
|
||||||
- `tests/test_harness_boundary.py` — ensures `packages/harness/deerflow/` never imports from `app.*`
|
- `tests/test_harness_boundary.py` — ensures `packages/harness/deerflow/` never imports from `app.*`
|
||||||
|
|
||||||
@@ -192,7 +153,7 @@ from deerflow.config import get_app_config
|
|||||||
|
|
||||||
### Middleware Chain
|
### Middleware Chain
|
||||||
|
|
||||||
Lead-agent middlewares are assembled in strict append order across `packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py` (`build_lead_runtime_middlewares`) and `packages/harness/deerflow/agents/lead_agent/agent.py` (`build_middlewares`):
|
Lead-agent middlewares are assembled in strict append order across `packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py` (`build_lead_runtime_middlewares`) and `packages/harness/deerflow/agents/lead_agent/agent.py` (`_build_middlewares`):
|
||||||
|
|
||||||
1. **ThreadDataMiddleware** - Creates per-thread directories under the user's isolation scope (`backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/{workspace,uploads,outputs}`); resolves `user_id` via `get_effective_user_id()` (falls back to `"default"` in no-auth mode); Web UI thread deletion now follows LangGraph thread removal with Gateway cleanup of the local thread directory
|
1. **ThreadDataMiddleware** - Creates per-thread directories under the user's isolation scope (`backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/{workspace,uploads,outputs}`); resolves `user_id` via `get_effective_user_id()` (falls back to `"default"` in no-auth mode); Web UI thread deletion now follows LangGraph thread removal with Gateway cleanup of the local thread directory
|
||||||
2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
|
2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
|
||||||
@@ -202,17 +163,16 @@ Lead-agent middlewares are assembled in strict append order across `packages/har
|
|||||||
6. **GuardrailMiddleware** - Pre-tool-call authorization via pluggable `GuardrailProvider` protocol (optional, if `guardrails.enabled` in config). Evaluates each tool call and returns error ToolMessage on deny. Three provider options: built-in `AllowlistProvider` (zero deps), OAP policy providers (e.g. `aport-agent-guardrails`), or custom providers. See [docs/GUARDRAILS.md](docs/GUARDRAILS.md) for setup, usage, and how to implement a provider.
|
6. **GuardrailMiddleware** - Pre-tool-call authorization via pluggable `GuardrailProvider` protocol (optional, if `guardrails.enabled` in config). Evaluates each tool call and returns error ToolMessage on deny. Three provider options: built-in `AllowlistProvider` (zero deps), OAP policy providers (e.g. `aport-agent-guardrails`), or custom providers. See [docs/GUARDRAILS.md](docs/GUARDRAILS.md) for setup, usage, and how to implement a provider.
|
||||||
7. **SandboxAuditMiddleware** - Audits sandboxed shell/file operations for security logging before tool execution continues
|
7. **SandboxAuditMiddleware** - Audits sandboxed shell/file operations for security logging before tool execution continues
|
||||||
8. **ToolErrorHandlingMiddleware** - Converts tool exceptions into error `ToolMessage`s so the run can continue instead of aborting
|
8. **ToolErrorHandlingMiddleware** - Converts tool exceptions into error `ToolMessage`s so the run can continue instead of aborting
|
||||||
9. **SkillActivationMiddleware** - Detects strict `/skill-name task` syntax on the latest real user message, resolves only enabled and runtime-allowed skills, reads `SKILL.md` from trusted skill storage, injects the skill body as hidden current-turn model context, and records a `middleware:skill_activation` audit event with skill name, category, path, and content hash
|
9. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
|
||||||
10. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
|
10. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
|
||||||
11. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
|
11. **TokenUsageMiddleware** - Records token usage metrics when token tracking is enabled (optional)
|
||||||
12. **TokenUsageMiddleware** - Records token usage metrics when token tracking is enabled (optional); subagent usage is cached by `tool_call_id` only while token usage is enabled and merged back into the dispatching AIMessage by message position rather than message id
|
12. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
|
||||||
13. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
|
13. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
|
||||||
14. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
|
14. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
|
||||||
15. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
|
15. **DeferredToolFilterMiddleware** - Hides deferred tool schemas from the bound model until tool search is enabled (optional)
|
||||||
16. **DeferredToolFilterMiddleware** - Hides deferred (MCP) tool schemas from the bound model using a build-time deferred-name set + catalog hash, reading per-thread promotions from `ThreadState.promoted` (hash-scoped, no ContextVar); a tool becomes bound on subsequent turns after `tool_search` returns its schema (optional, if `tool_search.enabled`)
|
16. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if `subagent_enabled`)
|
||||||
17. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if `subagent_enabled`)
|
17. **LoopDetectionMiddleware** - Detects repeated tool-call loops; hard-stop responses clear both structured `tool_calls` and raw provider tool-call metadata before forcing a final text answer
|
||||||
18. **LoopDetectionMiddleware** - Detects repeated tool-call loops; hard-stop responses clear both structured `tool_calls` and raw provider tool-call metadata before forcing a final text answer
|
18. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)
|
||||||
19. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)
|
|
||||||
|
|
||||||
### Configuration System
|
### Configuration System
|
||||||
|
|
||||||
@@ -224,10 +184,6 @@ Setup: Copy `config.example.yaml` to `config.yaml` in the **project root** direc
|
|||||||
|
|
||||||
**Config Caching**: `get_app_config()` caches the parsed config, but automatically reloads it when the resolved config path changes or the file's mtime increases. This keeps Gateway and LangGraph reads aligned with `config.yaml` edits without requiring a manual process restart.
|
**Config Caching**: `get_app_config()` caches the parsed config, but automatically reloads it when the resolved config path changes or the file's mtime increases. This keeps Gateway and LangGraph reads aligned with `config.yaml` edits without requiring a manual process restart.
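The caching rule described above (reload when the resolved path changes or the file's mtime increases) can be sketched as follows. This is a minimal illustration, not the project's implementation: `MtimeCache`, its `loader` callback, and the `loads` counter are hypothetical names introduced here.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Any, Callable


class MtimeCache:
    """Hypothetical cache: reparse when the path changes or mtime increases."""

    def __init__(self, loader: Callable[[Path], Any]) -> None:
        self._loader = loader
        self._path: Path | None = None
        self._mtime: float | None = None
        self._value: Any = None
        self.loads = 0  # number of times the loader actually ran

    def get(self, path: str | os.PathLike[str]) -> Any:
        resolved = Path(path).resolve()
        mtime = resolved.stat().st_mtime
        stale = (
            self._path != resolved      # a different config file was selected
            or self._mtime is None      # nothing cached yet
            or mtime > self._mtime      # the file was edited since the last load
        )
        if stale:
            self._value = self._loader(resolved)
            self._path, self._mtime = resolved, mtime
            self.loads += 1
        return self._value
```

Comparing with `>` rather than `!=` mirrors the wording "mtime increases"; whether the real code treats a decreasing mtime as stale is not stated in the text.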
|
||||||
|
|
||||||
**Config Hot-Reload Boundary**: Gateway dependencies route through `get_app_config()` on every request, so per-run fields like `models[*].max_tokens`, `summarization.*`, `title.*`, `memory.*`, `subagents.*`, `tools[*]`, and the agent system prompt pick up `config.yaml` edits on the next message. `AppConfig` is intentionally **not** cached on `app.state` — `lifespan()` keeps a local `startup_config` variable for one-shot bootstrap work and passes it to `langgraph_runtime(app, startup_config)`.
|
|
||||||
|
|
||||||
Infrastructure fields are **restart-required**. The authoritative list lives in `packages/harness/deerflow/config/reload_boundary.py::STARTUP_ONLY_FIELDS` and is mirrored by the standardised `"startup-only:"` prefix on the corresponding `Field(description=...)` in `AppConfig`, so IDE hover on those fields surfaces the reason inline (no need to context-switch into this table). Currently registered: `database`, `checkpointer`, `run_events`, `stream_bridge`, `sandbox`, `log_level`, `channels`. Adding a new restart-required field requires updating the registry; drift is pinned by `tests/test_reload_boundary.py`.
|
|
||||||
|
|
||||||
Configuration priority:
|
Configuration priority:
|
||||||
1. Explicit `config_path` argument
|
1. Explicit `config_path` argument
|
||||||
2. `DEER_FLOW_CONFIG_PATH` environment variable
|
2. `DEER_FLOW_CONFIG_PATH` environment variable
|
||||||
@@ -251,8 +207,6 @@ Configuration priority:
|
|||||||
|
|
||||||
FastAPI application on port 8001 with health check at `GET /health`. Set `GATEWAY_ENABLE_DOCS=false` to disable `/docs`, `/redoc`, and `/openapi.json` in production (default: enabled).
|
FastAPI application on port 8001 with health check at `GET /health`. Set `GATEWAY_ENABLE_DOCS=false` to disable `/docs`, `/redoc`, and `/openapi.json` in production (default: enabled).
|
||||||
|
|
||||||
CORS is same-origin by default when requests enter through nginx on port 2026. Split-origin or port-forwarded browser clients must opt in with `GATEWAY_CORS_ORIGINS` (comma-separated exact origins); Gateway `CORSMiddleware` and `CSRFMiddleware` both read that variable so browser CORS and auth-origin checks stay aligned.
|
|
||||||
|
|
||||||
**Routers**:
|
**Routers**:
|
||||||
|
|
||||||
| Router | Endpoints |
|
| Router | Endpoints |
|
||||||
@@ -264,39 +218,32 @@ CORS is same-origin by default when requests enter through nginx on port 2026. S
|
|||||||
| **Uploads** (`/api/threads/{id}/uploads`) | `POST /` - upload files (auto-converts PDF/PPT/Excel/Word); `GET /list` - list; `DELETE /{filename}` - delete |
|
| **Uploads** (`/api/threads/{id}/uploads`) | `POST /` - upload files (auto-converts PDF/PPT/Excel/Word); `GET /list` - list; `DELETE /{filename}` - delete |
|
||||||
| **Threads** (`/api/threads/{id}`) | `DELETE /` - remove DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
|
| **Threads** (`/api/threads/{id}`) | `DELETE /` - remove DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
|
||||||
| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; active content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) are always forced as download attachments to reduce XSS risk; `?download=true` still forces download for other file types |
|
| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; active content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) are always forced as download attachments to reduce XSS risk; `?download=true` still forces download for other file types |
|
||||||
| **Suggestions** (`/api/threads/{id}/suggestions`) | `POST /` - generate follow-up questions; rich list/block model content is normalized and inline reasoning (`<think>...</think>`, including unclosed/truncated blocks from reasoning models like MiniMax-M3) is stripped before JSON parsing |
|
| **Suggestions** (`/api/threads/{id}/suggestions`) | `POST /` - generate follow-up questions; rich list/block model content is normalized before JSON parsing |
|
||||||
| **Thread Runs** (`/api/threads/{id}/runs`) | `POST /` - create background run; `POST /stream` - create + SSE stream; `POST /wait` - create + block; `GET /` - list runs; `GET /{rid}` - run details; `POST /{rid}/cancel` - cancel; `GET /{rid}/join` - join SSE; `GET /{rid}/messages` - paginated messages `{data, has_more}`; `GET /{rid}/events` - full event stream; `GET /../messages` - thread messages with feedback; `GET /../token-usage` - aggregate tokens |
|
| **Thread Runs** (`/api/threads/{id}/runs`) | `POST /` - create background run; `POST /stream` - create + SSE stream; `POST /wait` - create + block; `GET /` - list runs; `GET /{rid}` - run details; `POST /{rid}/cancel` - cancel; `GET /{rid}/join` - join SSE; `GET /{rid}/messages` - paginated messages `{data, has_more}`; `GET /{rid}/events` - full event stream; `GET /../messages` - thread messages with feedback; `GET /../token-usage` - aggregate tokens |
|
||||||
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
|
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
|
||||||
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
|
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
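The Artifacts row above states that active content types are always served as attachments, while `?download=true` forces a download for everything else. A rough sketch of that decision, assuming a hypothetical helper `content_disposition` (the real router's function names are not shown in this document):

```python
# Active content types listed in the Artifacts row above.
ACTIVE_CONTENT_TYPES = {"text/html", "application/xhtml+xml", "image/svg+xml"}


def content_disposition(mime_type: str, download: bool = False) -> str:
    """Hypothetical helper: pick 'attachment' or 'inline' for an artifact."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base in ACTIVE_CONTENT_TYPES or download:
        return "attachment"
    return "inline"
```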
|
||||||
|
|
||||||
**RunManager / RunStore contract**:
|
Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` → Gateway.
|
||||||
- `RunManager.get()` is async; direct callers must `await` it.
|
|
||||||
- When a persistent `RunStore` is configured, `get()` and `list_by_thread()` hydrate historical runs from the store. In-memory records win for the same `run_id` so task, abort, and stream-control state stays attached to active local runs.
|
|
||||||
- `cancel()` and `create_or_reject(..., multitask_strategy="interrupt"|"rollback")` persist interrupted status through `RunStore.update_status()`, matching normal `set_status()` transitions.
|
|
||||||
- Store-only hydrated runs are readable history. If the current worker has no in-memory task/control state for that run, cancellation APIs can return 409 because this worker cannot stop the task.
|
|
||||||
- `POST /wait` (both thread-scoped and `/api/runs/wait`) drains the stream bridge via `wait_for_run_completion()` instead of bare `await record.task`, so it honours the run's `on_disconnect` setting and cancels the background run on real client disconnect rather than returning a stale checkpoint (issue #3265).
|
|
||||||
|
|
||||||
Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runtime, all other `/api/*` → Gateway REST APIs.
|
|
||||||
|
|
||||||
### Sandbox System (`packages/harness/deerflow/sandbox/`)
|
### Sandbox System (`packages/harness/deerflow/sandbox/`)
|
||||||
|
|
||||||
**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
|
**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
|
||||||
**Provider Pattern**: `SandboxProvider` with `acquire`, `acquire_async`, `get`, `release` lifecycle. Async agent/tool paths call async sandbox lifecycle hooks so Docker sandbox creation, discovery, cross-process locking, readiness polling, and release stay off the event loop.
|
**Provider Pattern**: `SandboxProvider` with `acquire`, `get`, `release` lifecycle
|
||||||
**Implementations**:
|
**Implementations**:
|
||||||
- `LocalSandboxProvider` - Local filesystem execution. `acquire(thread_id)` returns a per-thread `LocalSandbox` (id `local:{thread_id}`) whose `path_mappings` resolve `/mnt/user-data/{workspace,uploads,outputs}` and `/mnt/acp-workspace` to that thread's host directories, so the public `Sandbox` API honours the `/mnt/user-data` contract uniformly with AIO. `acquire()` / `acquire(None)` keeps the legacy generic singleton (id `local`) for callers without a thread context. Per-thread sandboxes are held in an LRU cache (default 256 entries) guarded by a `threading.Lock`.
|
- `LocalSandboxProvider` - Singleton local filesystem execution with path mappings
|
||||||
- `AioSandboxProvider` (`packages/harness/deerflow/community/`) - Docker-based isolation
|
- `AioSandboxProvider` (`packages/harness/deerflow/community/`) - Docker-based isolation
|
||||||
|
|
||||||
**Virtual Path System**:
|
**Virtual Path System**:
|
||||||
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
|
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
|
||||||
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
|
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
|
||||||
- Translation: `LocalSandboxProvider` builds per-thread `PathMapping`s for the user-data prefixes at acquire time; `tools.py` keeps `replace_virtual_path()` / `replace_virtual_paths_in_command()` as a defense-in-depth layer (and for path validation). AIO has the directories volume-mounted at the same virtual paths inside its container, so both implementations accept `/mnt/user-data/...` natively.
|
- Translation: `replace_virtual_path()` / `replace_virtual_paths_in_command()`
|
||||||
- Detection: `is_local_sandbox()` accepts both `sandbox_id == "local"` (legacy / no-thread) and `sandbox_id.startswith("local:")` (per-thread)
|
- Detection: `is_local_sandbox()` checks `sandbox_id == "local"`
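To make the virtual-to-physical translation concrete, here is a small sketch. It is not the body of `replace_virtual_path()`; `translate_virtual_path` and the `thread_root` parameter are hypothetical stand-ins, and rejecting `..` segments is an assumption about what path validation might involve.

```python
from pathlib import PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")


def translate_virtual_path(path: str, thread_root: str) -> str:
    """Hypothetical helper: map /mnt/user-data/... onto a thread's host directory."""
    try:
        relative = PurePosixPath(path).relative_to(VIRTUAL_ROOT)
    except ValueError:
        return path  # not under the virtual root, leave it untouched
    if ".." in relative.parts:
        # Assumed validation step: refuse paths that climb out of the root.
        raise ValueError(f"path escapes the virtual root: {path}")
    return str(PurePosixPath(thread_root) / relative)
```

For example, with `thread_root="/srv/t1/user-data"`, the agent-visible path `/mnt/user-data/workspace/a.txt` maps to `/srv/t1/user-data/workspace/a.txt`, while `/etc/hosts` is returned unchanged.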
|
||||||
|
|
||||||
**Sandbox Tools** (in `packages/harness/deerflow/sandbox/tools.py`):
|
**Sandbox Tools** (in `packages/harness/deerflow/sandbox/tools.py`):
|
||||||
- `bash` - Execute commands with path translation and error handling
|
- `bash` - Execute commands with path translation and error handling
|
||||||
- `ls` - Directory listing (tree format, max 2 levels)
|
- `ls` - Directory listing (tree format, max 2 levels)
|
||||||
- `read_file` - Read file contents with optional line range
|
- `read_file` - Read file contents with optional line range
|
||||||
- `write_file` - Write/append to files, creates directories; overwrites by default and exposes the `append` argument in the model-facing schema for end-of-file writes
|
- `write_file` - Write/append to files, creates directories
|
||||||
- `str_replace` - Substring replacement (single or all occurrences); same-path serialization is scoped to `(sandbox.id, path)` so isolated sandboxes do not contend on identical virtual paths inside one process
|
- `str_replace` - Substring replacement (single or all occurrences); same-path serialization is scoped to `(sandbox.id, path)` so isolated sandboxes do not contend on identical virtual paths inside one process
|
||||||
|
|
||||||
### Subagent System (`packages/harness/deerflow/subagents/`)
|
### Subagent System (`packages/harness/deerflow/subagents/`)
|
||||||
@@ -306,7 +253,6 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
**Concurrency**: `MAX_CONCURRENT_SUBAGENTS = 3` enforced by `SubagentLimitMiddleware` (truncates excess tool calls in `after_model`), 15-minute timeout
|
**Concurrency**: `MAX_CONCURRENT_SUBAGENTS = 3` enforced by `SubagentLimitMiddleware` (truncates excess tool calls in `after_model`), 15-minute timeout
|
||||||
**Flow**: `task()` tool → `SubagentExecutor` → background thread → poll 5s → SSE events → result
|
**Flow**: `task()` tool → `SubagentExecutor` → background thread → poll 5s → SSE events → result
|
||||||
**Events**: `task_started`, `task_running`, `task_completed`/`task_failed`/`task_timed_out`
|
**Events**: `task_started`, `task_running`, `task_completed`/`task_failed`/`task_timed_out`
|
||||||
**Deferred MCP tools** (if `tool_search.enabled`): `SubagentExecutor._build_initial_state` assembles deferral after policy filtering via the shared `assemble_deferred_tools` (fail-closed), appends the `tool_search` tool, injects the `<available-deferred-tools>` section into the subagent's `SystemMessage`, and threads the setup to `_create_agent`, which attaches `DeferredToolFilterMiddleware` through `build_subagent_runtime_middlewares(deferred_setup=...)`. Subagents thus withhold full MCP schemas until promotion, same as the lead agent; each task run gets a fresh `ThreadState` so promotion is isolated per run
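The concurrency rule above (excess `task` tool calls are truncated in `after_model`) could look roughly like this. The function name `truncate_task_calls` and the dict shape of a tool call are assumptions for illustration; keeping non-`task` calls untouched is also an assumption, since the text only mentions truncating `task` calls.

```python
MAX_CONCURRENT_SUBAGENTS = 3


def truncate_task_calls(
    tool_calls: list[dict], limit: int = MAX_CONCURRENT_SUBAGENTS
) -> list[dict]:
    """Hypothetical helper: keep at most `limit` calls to the `task` tool."""
    kept = []
    task_seen = 0
    for call in tool_calls:
        if call.get("name") == "task":
            task_seen += 1
            if task_seen > limit:
                continue  # drop task calls beyond the limit
        kept.append(call)  # other tool calls pass through (assumed)
    return kept
```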
|
|
||||||
|
|
||||||
### Tool System (`packages/harness/deerflow/tools/`)
|
### Tool System (`packages/harness/deerflow/tools/`)
|
||||||
|
|
||||||
@@ -317,10 +263,8 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
- `present_files` - Make output files visible to user (only `/mnt/user-data/outputs`)
|
- `present_files` - Make output files visible to user (only `/mnt/user-data/outputs`)
|
||||||
- `ask_clarification` - Request clarification (intercepted by ClarificationMiddleware → interrupts)
|
- `ask_clarification` - Request clarification (intercepted by ClarificationMiddleware → interrupts)
|
||||||
- `view_image` - Read image as base64 (added only if model supports vision)
|
- `view_image` - Read image as base64 (added only if model supports vision)
|
||||||
- `setup_agent` - Bootstrap-only: persist a brand-new custom agent's `SOUL.md` and `config.yaml`. Bound only when `is_bootstrap=True`.
|
|
||||||
- `update_agent` - Custom-agent-only: persist self-updates to the current agent's `SOUL.md` / `config.yaml` from inside a normal chat (partial update + atomic write). Bound when `agent_name` is set and `is_bootstrap=False`.
|
|
||||||
4. **Subagent tool** (if enabled):
|
4. **Subagent tool** (if enabled):
|
||||||
- `task` - Delegate to subagent (description, prompt, subagent_type)
|
- `task` - Delegate to subagent (description, prompt, subagent_type, max_turns)
|
||||||
|
|
||||||
**Community tools** (`packages/harness/deerflow/community/`):
|
**Community tools** (`packages/harness/deerflow/community/`):
|
||||||
- `tavily/` - Web search (5 results default) and web fetch (4KB limit)
|
- `tavily/` - Web search (5 results default) and web fetch (4KB limit)
|
||||||
@@ -341,7 +285,7 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
- **Cache invalidation**: Detects config file changes via mtime comparison
|
- **Cache invalidation**: Detects config file changes via mtime comparison
|
||||||
- **Transports**: stdio (command-based), SSE, HTTP
|
- **Transports**: stdio (command-based), SSE, HTTP
|
||||||
- **OAuth (HTTP/SSE)**: Supports token endpoint flows (`client_credentials`, `refresh_token`) with automatic token refresh + Authorization header injection
|
- **OAuth (HTTP/SSE)**: Supports token endpoint flows (`client_credentials`, `refresh_token`) with automatic token refresh + Authorization header injection
|
||||||
- **Runtime updates**: Gateway API saves to extensions_config.json; the Gateway-embedded runtime detects changes via mtime
|
- **Runtime updates**: Gateway API saves to extensions_config.json; LangGraph detects via mtime
|
||||||
|
|
||||||
### Skills System (`packages/harness/deerflow/skills/`)
|
### Skills System (`packages/harness/deerflow/skills/`)
|
||||||
|
|
||||||
@@ -349,7 +293,6 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
- **Format**: Directory with `SKILL.md` (YAML frontmatter: name, description, license, allowed-tools)
|
- **Format**: Directory with `SKILL.md` (YAML frontmatter: name, description, license, allowed-tools)
|
||||||
- **Loading**: `load_skills()` recursively scans `skills/{public,custom}` for `SKILL.md`, parses metadata, and reads enabled state from extensions_config.json
|
- **Loading**: `load_skills()` recursively scans `skills/{public,custom}` for `SKILL.md`, parses metadata, and reads enabled state from extensions_config.json
|
||||||
- **Injection**: Enabled skills listed in agent system prompt with container paths
|
- **Injection**: Enabled skills listed in agent system prompt with container paths
|
||||||
- **Slash activation**: `/skill-name task` loads that enabled skill's `SKILL.md` for the current model call only. The resolver rejects leading whitespace, missing separators, reserved channel commands (`/new`, `/help`, `/bootstrap`, `/status`, `/models`, `/memory`), disabled skills, and skills outside a custom agent's whitelist.
|
|
||||||
- **Installation**: `POST /api/skills/install` extracts .skill ZIP archive to custom/ directory
|
- **Installation**: `POST /api/skills/install` extracts .skill ZIP archive to custom/ directory
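The strict `/skill-name task` syntax described under slash activation can be sketched as a small parser. This is illustrative only: `parse_skill_activation`, the allowed character set in the regex, and passing enabled skills as a plain set are assumptions, not the resolver's actual API.

```python
from __future__ import annotations

import re

# Reserved channel commands listed in the slash-activation text.
RESERVED = {"new", "help", "bootstrap", "status", "models", "memory"}

# Assumed grammar: '/' at position 0, a skill name, whitespace, then a task.
_PATTERN = re.compile(r"^/([a-z0-9][a-z0-9_-]*)\s+(\S.*)$", re.DOTALL)


def parse_skill_activation(
    text: str, enabled_skills: set[str]
) -> tuple[str, str] | None:
    """Hypothetical helper: return (skill_name, task) or None if rejected."""
    match = _PATTERN.match(text)
    if match is None:
        return None  # leading whitespace, missing separator, or empty task
    name, task = match.group(1), match.group(2)
    if name in RESERVED or name not in enabled_skills:
        return None  # reserved command, disabled skill, or not whitelisted
    return name, task
```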
|
||||||
|
|
||||||
### Model Factory (`packages/harness/deerflow/models/factory.py`)
|
### Model Factory (`packages/harness/deerflow/models/factory.py`)
|
||||||
@@ -369,7 +312,7 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
|
|
||||||
### IM Channels System (`app/channels/`)
|
### IM Channels System (`app/channels/`)
|
||||||
|
|
||||||
Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the DeerFlow agent via Gateway's LangGraph-compatible API.
|
Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the DeerFlow agent via the LangGraph Server.
|
||||||
|
|
||||||
|
|
||||||
**Architecture**: Channels communicate with Gateway through the `langgraph-sdk` HTTP client (same as the frontend), ensuring threads are created and managed server-side. The internal SDK client injects process-local internal auth plus a matching CSRF cookie/header pair so Gateway accepts state-changing thread/run requests from channel workers without relying on browser session cookies.
|
**Architecture**: Channels communicate with Gateway through the `langgraph-sdk` HTTP client (same as the frontend), ensuring threads are created and managed server-side. The internal SDK client injects process-local internal auth plus a matching CSRF cookie/header pair so Gateway accepts state-changing thread/run requests from channel workers without relying on browser session cookies.
|
||||||
@@ -411,11 +354,10 @@ Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the
|
|||||||
**Per-User Isolation**:
|
**Per-User Isolation**:
|
||||||
- Memory is stored per-user at `{base_dir}/users/{user_id}/memory.json`
|
- Memory is stored per-user at `{base_dir}/users/{user_id}/memory.json`
|
||||||
- Per-agent per-user memory at `{base_dir}/users/{user_id}/agents/{agent_name}/memory.json`
|
- Per-agent per-user memory at `{base_dir}/users/{user_id}/agents/{agent_name}/memory.json`
|
||||||
- Custom agent definitions (`SOUL.md` + `config.yaml`) are also per-user at `{base_dir}/users/{user_id}/agents/{agent_name}/`. The legacy shared layout `{base_dir}/agents/{agent_name}/` remains a read-only fallback for unmigrated installations
|
|
||||||
- `user_id` is resolved via `get_effective_user_id()` from `deerflow.runtime.user_context`
|
- `user_id` is resolved via `get_effective_user_id()` from `deerflow.runtime.user_context`
|
||||||
- In no-auth mode, `user_id` defaults to `"default"` (constant `DEFAULT_USER_ID`)
|
- In no-auth mode, `user_id` defaults to `"default"` (constant `DEFAULT_USER_ID`)
|
||||||
- Absolute `storage_path` in config opts out of per-user isolation
|
- Absolute `storage_path` in config opts out of per-user isolation
|
||||||
- **Migration**: Run `PYTHONPATH=. python scripts/migrate_user_isolation.py` to move legacy `memory.json`, `threads/`, and `agents/` into the per-user layout. Supports `--dry-run` (preview changes) and `--user-id USER_ID` (assign unowned legacy data to a user, defaults to `default`).
|
- **Migration**: Run `PYTHONPATH=. python scripts/migrate_user_isolation.py` to move legacy `memory.json` and `threads/` into the per-user layout; supports `--dry-run`
|
||||||
|
|
||||||
**Data Structure** (stored in `{base_dir}/users/{user_id}/memory.json`):
|
**Data Structure** (stored in `{base_dir}/users/{user_id}/memory.json`):
|
||||||
- **User Context**: `workContext`, `personalContext`, `topOfMind` (1-3 sentence summaries)
|
- **User Context**: `workContext`, `personalContext`, `topOfMind` (1-3 sentence summaries)
|
||||||
@@ -444,24 +386,6 @@ Focused regression coverage for the updater lives in `backend/tests/test_memory_
|
|||||||
- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
|
- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
|
||||||
- `resolve_class(path, base_class)` - Import and validate class against base class
|
- `resolve_class(path, base_class)` - Import and validate class against base class
|
||||||
|
|
||||||
### Tracing System (`packages/harness/deerflow/tracing/`)
|
|
||||||
|
|
||||||
LangSmith and Langfuse are both supported. The wiring lives in two layers:
|
|
||||||
|
|
||||||
- `factory.py::build_tracing_callbacks()` — returns the LangChain `CallbackHandler` list for the providers currently enabled via env vars (`LANGSMITH_TRACING`, `LANGFUSE_TRACING`, etc.). The handlers are attached at the **graph invocation root** for in-graph runs (`make_lead_agent` and `DeerFlowClient.stream` both append them to `config["callbacks"]` before invoking the graph) so a single run produces one trace with all node / LLM / tool calls as child spans. Standalone callers — anything that invokes a model outside such a graph (e.g. `MemoryUpdater`) — keep `create_chat_model`'s default `attach_tracing=True`, which falls back to model-level callback attachment.
|
|
||||||
- `metadata.py::build_langfuse_trace_metadata()` — builds the Langfuse-reserved trace attributes for `RunnableConfig.metadata`. The Langfuse v4 `langchain.CallbackHandler` lifts these onto the root trace (see its `_parse_langfuse_trace_attributes`), but only when it sees `on_chain_start(parent_run_id=None)` — which is why the callbacks have to live at the graph root, not the model.
|
|
||||||
|
|
||||||
**Trace-attribute injection points**: both `runtime/runs/worker.py::run_agent` (gateway path) and `client.py::DeerFlowClient.stream` (embedded path) merge the metadata into `config["metadata"]` right before constructing the graph. Caller-supplied keys win via `setdefault`, so an external `session_id` override is preserved. Field mapping:
|
|
||||||
|
|
||||||
| Langfuse field | Source |
|
|
||||||
|-----------------------|----------------------------------------------|
|
|
||||||
| `langfuse_session_id` | LangGraph `thread_id` |
|
|
||||||
| `langfuse_user_id` | `get_effective_user_id()` (`default` in no-auth) |
|
|
||||||
| `langfuse_trace_name` | `RunRecord.assistant_id` / client `agent_name` (defaults to `lead-agent`) |
|
|
||||||
| `langfuse_tags` | `env:<DEER_FLOW_ENV>` + `model:<model_name>` |
|
|
||||||
|
|
||||||
Returns `{}` when Langfuse is not in the enabled providers — LangSmith-only deployments are unaffected. Set `DEER_FLOW_ENV` (or `ENVIRONMENT`) to tag traces by deployment environment. Tests live in `tests/test_tracing_factory.py`, `tests/test_tracing_metadata.py`, `tests/test_worker_langfuse_metadata.py`, and `tests/test_client_langfuse_metadata.py`.
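The "caller-supplied keys win via `setdefault`" rule above can be illustrated with a short sketch. `merge_trace_metadata` is a hypothetical name; only the `setdefault` behaviour and the `langfuse_*` keys come from the text.

```python
def merge_trace_metadata(config: dict, trace_metadata: dict) -> dict:
    """Hypothetical helper: add trace attributes without overriding the caller."""
    metadata = config.setdefault("metadata", {})
    for key, value in trace_metadata.items():
        # setdefault only writes when the caller has not set the key already.
        metadata.setdefault(key, value)
    return config
```

With a caller config of `{"metadata": {"langfuse_session_id": "external"}}`, the external session id is preserved and only the missing attributes are filled in. An empty `trace_metadata` (the `{}` case when Langfuse is disabled) leaves the config's metadata unchanged.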
|
|
||||||
|
|
||||||
### Config Schema
|
### Config Schema
|
||||||
|
|
||||||
**`config.yaml`** key sections:
|
**`config.yaml`** key sections:
|
||||||
@@ -495,7 +419,7 @@ Both can be modified at runtime via Gateway API endpoints or `DeerFlowClient` me
|
|||||||
- `"messages-tuple"` — per-chunk update: for AI text this is a **delta** (concat per `id` to rebuild the full message); tool calls and tool results are emitted once each
|
- `"messages-tuple"` — per-chunk update: for AI text this is a **delta** (concat per `id` to rebuild the full message); tool calls and tool results are emitted once each
|
||||||
- `"custom"` — forwarded from `StreamWriter`
|
- `"custom"` — forwarded from `StreamWriter`
|
||||||
- `"end"` — stream finished (carries cumulative `usage` counted once per message id)
|
- `"end"` — stream finished (carries cumulative `usage` counted once per message id)
|
||||||
- Agent created lazily via `create_agent()` + `build_middlewares()`, same as `make_lead_agent`
|
- Agent created lazily via `create_agent()` + `_build_middlewares()`, same as `make_lead_agent`
|
||||||
- Supports `checkpointer` parameter for state persistence across turns
|
- Supports `checkpointer` parameter for state persistence across turns
|
||||||
- `reset_agent()` forces agent recreation (e.g. after memory or skill changes)
|
- `reset_agent()` forces agent recreation (e.g. after memory or skill changes)
|
||||||
- See [docs/STREAMING.md](docs/STREAMING.md) for the full design: why Gateway and DeerFlowClient are parallel paths, LangGraph's `stream_mode` semantics, the per-id dedup invariants, and regression testing strategy
|
- See [docs/STREAMING.md](docs/STREAMING.md) for the full design: why Gateway and DeerFlowClient are parallel paths, LangGraph's `stream_mode` semantics, the per-id dedup invariants, and regression testing strategy
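Because `"messages-tuple"` events carry AI text as deltas, a consumer has to concatenate them per message `id`. A minimal sketch of that rebuild step, assuming each chunk has already been reduced to a hypothetical `(message_id, delta_text)` pair (the real event payload shape is documented in docs/STREAMING.md, not here):

```python
from typing import Iterable


def rebuild_messages(chunks: Iterable[tuple[str, str]]) -> dict[str, str]:
    """Concatenate text deltas per message id, preserving first-seen order."""
    texts: dict[str, str] = {}
    for message_id, delta in chunks:
        texts[message_id] = texts.get(message_id, "") + delta
    return texts
```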
|
||||||
@@ -593,7 +517,6 @@ Multi-file upload with automatic document conversion:
|
|||||||
- Rejects directory inputs before copying so uploads stay all-or-nothing
|
- Rejects directory inputs before copying so uploads stay all-or-nothing
|
||||||
- Reuses one conversion worker per request when called from an active event loop
|
- Reuses one conversion worker per request when called from an active event loop
|
||||||
- Files stored in thread-isolated directories
|
- Files stored in thread-isolated directories
|
||||||
- Duplicate filenames in a single upload request are auto-renamed with `_N` suffixes so later files do not truncate earlier files
|
|
||||||
- Agent receives uploaded file list via `UploadsMiddleware`
|
- Agent receives uploaded file list via `UploadsMiddleware`
|
||||||
|
|
||||||
See [docs/FILE_UPLOAD.md](docs/FILE_UPLOAD.md) for details.
|
See [docs/FILE_UPLOAD.md](docs/FILE_UPLOAD.md) for details.
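The `_N` auto-rename for duplicate filenames within one upload request might work along these lines. This is a sketch under assumptions: `dedupe_filenames` is a hypothetical name, and starting the counter at 1 and placing the suffix before the extension are guesses, since the text only says `_N` suffixes are used.

```python
from pathlib import PurePath


def dedupe_filenames(names: list[str]) -> list[str]:
    """Hypothetical helper: rename repeats so later files never clobber earlier ones."""
    seen: set[str] = set()
    result: list[str] = []
    for name in names:
        stem, suffix = PurePath(name).stem, PurePath(name).suffix
        candidate = name
        n = 1
        while candidate in seen:
            candidate = f"{stem}_{n}{suffix}"  # assumed placement of the suffix
            n += 1
        seen.add(candidate)
        result.append(candidate)
    return result
```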
|
||||||
|
|||||||
@@ -56,8 +56,11 @@ export OPENAI_API_KEY="your-api-key"
|
|||||||
### Run the Development Server
|
### Run the Development Server
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Gateway API + embedded agent runtime
|
# Terminal 1: LangGraph server
|
||||||
make dev
|
make dev
|
||||||
|
|
||||||
|
# Terminal 2: Gateway API
|
||||||
|
make gateway
|
||||||
```
|
```
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|||||||
+3
-13
@@ -50,12 +50,6 @@ COPY backend ./backend
|
|||||||
RUN --mount=type=cache,target=/root/.cache/uv \
|
RUN --mount=type=cache,target=/root/.cache/uv \
|
||||||
sh -c "cd backend && UV_INDEX_URL=${UV_INDEX_URL:-https://pypi.org/simple} uv sync ${UV_EXTRAS:+--extra $UV_EXTRAS}"
|
sh -c "cd backend && UV_INDEX_URL=${UV_INDEX_URL:-https://pypi.org/simple} uv sync ${UV_EXTRAS:+--extra $UV_EXTRAS}"
|
||||||
|
|
||||||
# UTF-8 locale prevents UnicodeEncodeError on Chinese/emoji content in minimal
|
|
||||||
# containers where locale configuration may be missing and the default encoding is not UTF-8.
|
|
||||||
ENV LANG=C.UTF-8
|
|
||||||
ENV LC_ALL=C.UTF-8
|
|
||||||
ENV PYTHONIOENCODING=utf-8
|
|
||||||
|
|
||||||
# ── Stage 2: Dev ──────────────────────────────────────────────────────────────
|
# ── Stage 2: Dev ──────────────────────────────────────────────────────────────
|
||||||
# Retains compiler toolchain from builder so startup-time `uv sync` can build
|
# Retains compiler toolchain from builder so startup-time `uv sync` can build
|
||||||
# source distributions in development containers.
|
# source distributions in development containers.
|
||||||
@@ -64,7 +58,7 @@ FROM builder AS dev
|
|||||||
# Install Docker CLI (for DooD: allows starting sandbox containers via host Docker socket)
|
# Install Docker CLI (for DooD: allows starting sandbox containers via host Docker socket)
|
||||||
COPY --from=docker:cli /usr/local/bin/docker /usr/local/bin/docker
|
COPY --from=docker:cli /usr/local/bin/docker /usr/local/bin/docker
|
||||||
|
|
||||||
EXPOSE 8001
|
EXPOSE 8001 2024
|
||||||
|
|
||||||
CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001"]
|
CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001"]
|
||||||
|
|
||||||
@@ -72,10 +66,6 @@ CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run uvicorn app.gateway.app:app
|
|||||||
# Clean image without build-essential — reduces size (~200 MB) and attack surface.
|
# Clean image without build-essential — reduces size (~200 MB) and attack surface.
|
||||||
FROM python:3.12-slim-bookworm
|
FROM python:3.12-slim-bookworm
|
||||||
|
|
||||||
ENV LANG=C.UTF-8
|
|
||||||
ENV LC_ALL=C.UTF-8
|
|
||||||
ENV PYTHONIOENCODING=utf-8
|
|
||||||
|
|
||||||
# Copy Node.js runtime from builder (provides npx for MCP servers)
|
# Copy Node.js runtime from builder (provides npx for MCP servers)
|
||||||
COPY --from=builder /usr/bin/node /usr/bin/node
|
COPY --from=builder /usr/bin/node /usr/bin/node
|
||||||
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
|
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
|
||||||
@@ -94,8 +84,8 @@ WORKDIR /app
 # Copy backend with pre-built virtualenv from builder
 COPY --from=builder /app/backend ./backend
 
-# Expose Gateway API port.
-EXPOSE 8001
+# Expose ports (gateway: 8001, langgraph: 2024)
+EXPOSE 8001 2024
 
 # Default command (can be overridden in docker-compose)
 CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run --no-sync uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001"]
|
|||||||
+3
-9
@@ -2,16 +2,13 @@ install:
 	uv sync
 
 dev:
-	PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001 --reload
+	PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001 --reload
 
 gateway:
-	PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001
+	PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001
 
 test:
-	PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/ -v
+	PYTHONPATH=. uv run pytest tests/ -v
 
-test-blocking-io:
-	PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/blocking_io -q --tb=short
-
 lint:
 	uvx ruff check .
@@ -19,6 +16,3 @@ lint:
 
 format:
 	uvx ruff check . --fix && uvx ruff format .
-
-detect-blocking-io:
-	@PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run python ../scripts/detect_blocking_io_static.py --output ../.deer-flow/blocking-io-findings.json
|
|||||||
+34
-43
@@ -11,26 +11,31 @@ DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent
|
|||||||
│ Nginx (Port 2026) │
|
│ Nginx (Port 2026) │
|
||||||
│ Unified reverse proxy │
|
│ Unified reverse proxy │
|
||||||
└───────┬──────────────────┬───────────┘
|
└───────┬──────────────────┬───────────┘
|
||||||
│
|
│ │
|
||||||
/api/langgraph/* │ /api/* (other)
|
/api/langgraph/* │ │ /api/* (other)
|
||||||
rewritten to /api/* │
|
▼ ▼
|
||||||
▼
|
┌────────────────────┐ ┌────────────────────────┐
|
||||||
┌────────────────────────────────────────┐
|
│ LangGraph Server │ │ Gateway API (8001) │
|
||||||
│ Gateway API (8001) │
|
│ (Port 2024) │ │ FastAPI REST │
|
||||||
│ FastAPI REST + agent runtime │
|
│ │ │ │
|
||||||
│ │
|
│ ┌────────────────┐ │ │ Models, MCP, Skills, │
|
||||||
│ Models, MCP, Skills, Memory, Uploads, │
|
│ │ Lead Agent │ │ │ Memory, Uploads, │
|
||||||
│ Artifacts, Threads, Runs, Streaming │
|
│ │ ┌──────────┐ │ │ │ Artifacts │
|
||||||
│ │
|
│ │ │Middleware│ │ │ └────────────────────────┘
|
||||||
│ ┌────────────────────────────────────┐ │
|
│ │ │ Chain │ │ │
|
||||||
│ │ Lead Agent │ │
|
│ │ └──────────┘ │ │
|
||||||
│ │ Middleware Chain, Tools, Subagents │ │
|
│ │ ┌──────────┐ │ │
|
||||||
│ └────────────────────────────────────┘ │
|
│ │ │ Tools │ │ │
|
||||||
└────────────────────────────────────────┘
|
│ │ └──────────┘ │ │
|
||||||
|
│ │ ┌──────────┐ │ │
|
||||||
|
│ │ │Subagents │ │ │
|
||||||
|
│ │ └──────────┘ │ │
|
||||||
|
│ └────────────────┘ │
|
||||||
|
└────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
**Request Routing** (via Nginx):
|
**Request Routing** (via Nginx):
|
||||||
- `/api/langgraph/*` → Gateway LangGraph-compatible API - agent interactions, threads, streaming
|
- `/api/langgraph/*` → LangGraph Server - agent interactions, threads, streaming
|
||||||
- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
|
- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
|
||||||
- `/` (non-API) → Frontend - Next.js web interface
|
- `/` (non-API) → Frontend - Next.js web interface
|
||||||
|
|
||||||
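The routing list above amounts to a first-match-wins prefix lookup, with the most specific prefix checked first. A minimal Python sketch, assuming the variant in which `/api/langgraph/*` is served by a separate LangGraph Server. `ROUTES` and `resolve_upstream` are hypothetical names used only for illustration; they are not part of the Nginx configuration or the DeerFlow codebase.

```python
# Hypothetical routing table, ordered from most to least specific prefix.
# Upstream labels mirror the routing list; they are not real config keys.
ROUTES: list[tuple[str, str]] = [
    ("/api/langgraph/", "langgraph"),  # agent interactions, threads, streaming
    ("/api/", "gateway"),              # models, MCP, skills, memory, uploads
    ("/", "frontend"),                 # Next.js web interface
]


def resolve_upstream(path: str) -> str:
    """Return the label of the first route whose prefix matches the path."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    raise ValueError(f"path must start with '/': {path!r}")


for sample in ("/api/langgraph/threads", "/api/models", "/workspace"):
    print(sample, "->", resolve_upstream(sample))
```

Ordering matters in this sketch: if `/api/` were checked before `/api/langgraph/`, agent traffic would be labelled as Gateway traffic.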
@@ -69,12 +74,12 @@ Middlewares execute in strict order, each handling a specific concern:
 Per-thread isolated execution with virtual path translation:
 
 - **Abstract interface**: `execute_command`, `read_file`, `write_file`, `list_dir`
-- **Providers**: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/). Async runtime paths use async sandbox lifecycle hooks so startup, readiness polling, and release do not block the event loop.
+- **Providers**: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/)
 - **Virtual paths**: `/mnt/user-data/{workspace,uploads,outputs}` → thread-specific physical directories
 - **Skills path**: `/mnt/skills` → `deer-flow/skills/` directory
 - **Skills loading**: Recursively discovers nested `SKILL.md` files under `skills/{public,custom}` and preserves nested container paths
 - **File-write safety**: `str_replace` serializes read-modify-write per `(sandbox.id, path)` so isolated sandboxes keep concurrency even when virtual paths match
-- **Tools**: `bash`, `ls`, `read_file`, `write_file`, `str_replace` (`write_file` overwrites by default and exposes `append` for end-of-file writes; `bash` is disabled by default when using `LocalSandboxProvider`; use `AioSandboxProvider` for isolated shell access)
+- **Tools**: `bash`, `ls`, `read_file`, `write_file`, `str_replace` (`bash` is disabled by default when using `LocalSandboxProvider`; use `AioSandboxProvider` for isolated shell access)
 
 ### Subagent System
 
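The "Virtual paths" bullet in the hunk above describes a translation from `/mnt/user-data/...` to a per-thread directory. The sketch below illustrates one way such a mapping could work. `to_physical`, the `thread_root` argument, and the example directory layout are assumptions made for illustration; the providers' actual implementation is not shown in this diff.

```python
from pathlib import PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")
ALLOWED_AREAS = {"workspace", "uploads", "outputs"}


def to_physical(virtual_path: str, thread_root: str) -> str:
    """Map a virtual sandbox path onto a (hypothetical) per-thread directory."""
    # relative_to raises ValueError when the path is outside VIRTUAL_ROOT.
    relative = PurePosixPath(virtual_path).relative_to(VIRTUAL_ROOT)
    if not relative.parts or relative.parts[0] not in ALLOWED_AREAS:
        raise ValueError(f"unsupported virtual area: {virtual_path!r}")
    if ".." in relative.parts:
        raise ValueError(f"parent traversal is not allowed: {virtual_path!r}")
    return str(PurePosixPath(thread_root) / relative)


print(to_physical("/mnt/user-data/workspace/report.md", "/data/threads/t-1"))
```

Rejecting `..` segments before joining keeps a translated path inside the thread's directory, which is the property a per-thread sandbox depends on.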
@@ -119,7 +124,7 @@ FastAPI application providing REST endpoints for frontend integration:
 | `POST /api/memory/reload` | Force memory reload |
 | `GET /api/memory/config` | Memory configuration |
 | `GET /api/memory/status` | Combined config + data |
-| `POST /api/threads/{id}/uploads` | Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths, auto-renames duplicate filenames in one request) |
+| `POST /api/threads/{id}/uploads` | Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths) |
 | `GET /api/threads/{id}/uploads/list` | List uploaded files |
 | `DELETE /api/threads/{id}` | Delete DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
 | `GET /api/threads/{id}/artifacts/{path}` | Serve generated artifacts |
@@ -188,7 +193,7 @@ export OPENAI_API_KEY="your-api-key-here"
 **Full Application** (from project root):
 
 ```bash
-make dev # Starts Gateway + Frontend + Nginx
+make dev # Starts LangGraph + Gateway + Frontend + Nginx
 ```
 
 Access at: http://localhost:2026
@@ -196,11 +201,14 @@ Access at: http://localhost:2026
 **Backend Only** (from backend directory):
 
 ```bash
-# Gateway API + embedded agent runtime
+# Terminal 1: LangGraph server
 make dev
+
+# Terminal 2: Gateway API
+make gateway
 ```
 
-Direct access: Gateway at http://localhost:8001
+Direct access: LangGraph at http://localhost:2024, Gateway at http://localhost:8001
 
 ---
 
@@ -236,16 +244,12 @@ backend/
 │ └── utils/ # Utilities
 ├── docs/ # Documentation
 ├── tests/ # Test suite
-├── langgraph.json # LangGraph graph registry for tooling/Studio compatibility
+├── langgraph.json # LangGraph server configuration
 ├── pyproject.toml # Python dependencies
 ├── Makefile # Development commands
 └── Dockerfile # Container build
 ```
 
-`langgraph.json` is not the default service entrypoint. The scripts and Docker
-deployments run the Gateway embedded runtime; the file is kept for LangGraph
-tooling, Studio, or direct LangGraph Server compatibility.
-
 ---
 
 ## Configuration
@@ -358,11 +362,10 @@ If a provider is explicitly enabled but required credentials are missing, or the
 
 ```bash
 make install # Install dependencies
-make dev # Run Gateway API + embedded agent runtime (port 8001)
-make gateway # Run Gateway API without reload (port 8001)
+make dev # Run LangGraph server (port 2024)
+make gateway # Run Gateway API (port 8001)
 make lint # Run linter (ruff)
 make format # Format code (ruff)
-make detect-blocking-io # Inventory blocking IO that may block the backend event loop
 ```
 
 ### Code Style
@@ -379,18 +382,6 @@ make detect-blocking-io # Inventory blocking IO that may block the backend even
 uv run pytest
 ```
 
-`make detect-blocking-io` statically scans backend business code for blocking
-IO that may run on the backend event loop and is not test-coverage-bound. It
-prints a concise summary for human review and writes complete JSON findings to
-`.deer-flow/blocking-io-findings.json` at the repository root (regardless of
-whether the target is invoked from the repo root or from `backend/`). JSON
-findings include both broad IO category and review-oriented fields such as
-`priority`, `location`, `blocking_call`, `event_loop_exposure`, `reason`, and
-`code`. `priority` is a deterministic review ordering from the operation type,
-not proof of a bug. Bare-name same-file calls are resolved by function name,
-so duplicate helper names in one file can conservatively over-report async
-reachability.
-
 ---
 
 ## Technology Stack
|
|||||||
@@ -18,10 +18,3 @@ KNOWN_CHANNEL_COMMANDS: frozenset[str] = frozenset(
         "/help",
     }
 )
-
-
-def is_known_channel_command(text: str) -> bool:
-    """Return whether text starts with a registered channel control command."""
-    if not text.startswith("/"):
-        return False
-    return text.split(maxsplit=1)[0].lower() in KNOWN_CHANNEL_COMMANDS
|
|||||||
@@ -14,7 +14,7 @@ from typing import Any
 import httpx
 
 from app.channels.base import Channel
-from app.channels.commands import is_known_channel_command
+from app.channels.commands import KNOWN_CHANNEL_COMMANDS
 from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
 
 logger = logging.getLogger(__name__)
@@ -59,7 +59,9 @@ def _normalize_allowed_users(allowed_users: Any) -> set[str]:
 
 
 def _is_dingtalk_command(text: str) -> bool:
-    return is_known_channel_command(text)
+    if not text.startswith("/"):
+        return False
+    return text.split(maxsplit=1)[0].lower() in KNOWN_CHANNEL_COMMANDS
 
 
 def _extract_text_from_rich_text(rich_text_list: list) -> str:
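The command check that appears in this diff, both as the shared helper and inlined in `_is_dingtalk_command`, can be exercised in isolation. The sketch below copies its logic and contrasts it with a bare leading-slash test. Only `"/help"` is visible in the diff, so the `"/new"` entry in the registry is an illustrative stand-in, and `looks_like_command` is a hypothetical name for the looser check.

```python
# Stand-in registry: "/help" appears in the diff; "/new" is an assumed example entry.
KNOWN_CHANNEL_COMMANDS: frozenset[str] = frozenset({"/help", "/new"})


def is_known_channel_command(text: str) -> bool:
    """Return whether text starts with a registered channel control command."""
    if not text.startswith("/"):
        return False
    # First whitespace-delimited token, compared case-insensitively.
    return text.split(maxsplit=1)[0].lower() in KNOWN_CHANNEL_COMMANDS


def looks_like_command(text: str) -> bool:
    """Looser check: any message with a leading slash counts as a command."""
    return text.startswith("/")


for sample in ("/help", "/HELP me", "/tmp/report.txt is missing", "hello"):
    print(repr(sample), is_known_channel_command(sample), looks_like_command(sample))
```

The two checks disagree on messages such as a pasted file path: the registry-based check treats it as ordinary chat, while the leading-slash check classifies it as a command.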
|
|||||||
+12
-293
@@ -3,14 +3,11 @@
 from __future__ import annotations
 
 import asyncio
-import json
 import logging
 import threading
-from pathlib import Path
 from typing import Any
 
 from app.channels.base import Channel
-from app.channels.commands import is_known_channel_command
 from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
 
 logger = logging.getLogger(__name__)
@@ -24,12 +21,6 @@ class DiscordChannel(Channel):
|
|||||||
Configuration keys (in ``config.yaml`` under ``channels.discord``):
|
Configuration keys (in ``config.yaml`` under ``channels.discord``):
|
||||||
- ``bot_token``: Discord Bot token.
|
- ``bot_token``: Discord Bot token.
|
||||||
- ``allowed_guilds``: (optional) List of allowed Discord guild IDs. Empty = allow all.
|
- ``allowed_guilds``: (optional) List of allowed Discord guild IDs. Empty = allow all.
|
||||||
- ``mention_only``: (optional) If true, only respond when the bot is mentioned.
|
|
||||||
- ``allowed_channels``: (optional) List of channel IDs where messages are always accepted
|
|
||||||
(even when mention_only is true). Use for channels where you want the bot to respond
|
|
||||||
without mentions. Empty = mention_only applies everywhere.
|
|
||||||
- ``thread_mode``: (optional) If true, group a channel conversation into a thread.
|
|
||||||
Default: same as ``mention_only``.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
|
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
|
||||||
@@ -41,29 +32,6 @@ class DiscordChannel(Channel):
|
|||||||
self._allowed_guilds.add(int(guild_id))
|
self._allowed_guilds.add(int(guild_id))
|
||||||
except (TypeError, ValueError):
|
except (TypeError, ValueError):
|
||||||
continue
|
continue
|
||||||
self._mention_only: bool = bool(config.get("mention_only", False))
|
|
||||||
self._thread_mode: bool = config.get("thread_mode", self._mention_only)
|
|
||||||
self._allowed_channels: set[str] = set()
|
|
||||||
for channel_id in config.get("allowed_channels", []):
|
|
||||||
self._allowed_channels.add(str(channel_id))
|
|
||||||
|
|
||||||
# Session tracking: channel_id -> Discord thread_id (in-memory, persisted to JSON).
|
|
||||||
# Uses a dedicated JSON file separate from ChannelStore, which maps IM
|
|
||||||
# conversations to DeerFlow thread IDs — a different concern.
|
|
||||||
self._active_threads: dict[str, str] = {}
|
|
||||||
# Reverse-lookup set for O(1) thread ID checks (avoids O(n) scan of _active_threads.values()).
|
|
||||||
self._active_thread_ids: set[str] = set()
|
|
||||||
# Lock protecting _active_threads and the JSON file from concurrent access.
|
|
||||||
# _run_client (Discord loop thread) and the main thread both read/write.
|
|
||||||
self._thread_store_lock = threading.Lock()
|
|
||||||
store = config.get("channel_store")
|
|
||||||
if store is not None:
|
|
||||||
self._thread_store_path = store._path.parent / "discord_threads.json"
|
|
||||||
else:
|
|
||||||
self._thread_store_path = Path.home() / ".deer-flow" / "channels" / "discord_threads.json"
|
|
||||||
|
|
||||||
# Typing indicator management
|
|
||||||
self._typing_tasks: dict[str, asyncio.Task] = {}
|
|
||||||
|
|
||||||
self._client = None
|
self._client = None
|
||||||
self._thread: threading.Thread | None = None
|
self._thread: threading.Thread | None = None
|
||||||
@@ -107,56 +75,12 @@ class DiscordChannel(Channel):
|
|||||||
|
|
||||||
self._thread = threading.Thread(target=self._run_client, daemon=True)
|
self._thread = threading.Thread(target=self._run_client, daemon=True)
|
||||||
self._thread.start()
|
self._thread.start()
|
||||||
self._load_active_threads()
|
|
||||||
logger.info("Discord channel started")
|
logger.info("Discord channel started")
|
||||||
|
|
||||||
def _load_active_threads(self) -> None:
|
|
||||||
"""Restore Discord thread mappings from the dedicated JSON file on startup."""
|
|
||||||
with self._thread_store_lock:
|
|
||||||
try:
|
|
||||||
if not self._thread_store_path.exists():
|
|
||||||
logger.debug("[Discord] no thread mappings file at %s", self._thread_store_path)
|
|
||||||
return
|
|
||||||
data = json.loads(self._thread_store_path.read_text())
|
|
||||||
self._active_threads.clear()
|
|
||||||
self._active_thread_ids.clear()
|
|
||||||
for channel_id, thread_id in data.items():
|
|
||||||
self._active_threads[channel_id] = thread_id
|
|
||||||
self._active_thread_ids.add(thread_id)
|
|
||||||
if self._active_threads:
|
|
||||||
logger.info("[Discord] restored %d thread mappings from %s", len(self._active_threads), self._thread_store_path)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Discord] failed to load thread mappings")
|
|
||||||
|
|
||||||
def _save_thread(self, channel_id: str, thread_id: str) -> None:
|
|
||||||
"""Persist a Discord thread mapping to the dedicated JSON file."""
|
|
||||||
with self._thread_store_lock:
|
|
||||||
try:
|
|
||||||
data: dict[str, str] = {}
|
|
||||||
if self._thread_store_path.exists():
|
|
||||||
data = json.loads(self._thread_store_path.read_text())
|
|
||||||
old_id = data.get(channel_id)
|
|
||||||
data[channel_id] = thread_id
|
|
||||||
# Update reverse-lookup set
|
|
||||||
if old_id:
|
|
||||||
self._active_thread_ids.discard(old_id)
|
|
||||||
self._active_thread_ids.add(thread_id)
|
|
||||||
self._thread_store_path.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
self._thread_store_path.write_text(json.dumps(data, indent=2))
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Discord] failed to save thread mapping for channel %s", channel_id)
|
|
||||||
|
|
||||||
async def stop(self) -> None:
|
async def stop(self) -> None:
|
||||||
self._running = False
|
self._running = False
|
||||||
self.bus.unsubscribe_outbound(self._on_outbound)
|
self.bus.unsubscribe_outbound(self._on_outbound)
|
||||||
|
|
||||||
# Cancel all active typing indicator tasks
|
|
||||||
for target_id, task in list(self._typing_tasks.items()):
|
|
||||||
if not task.done():
|
|
||||||
task.cancel()
|
|
||||||
logger.debug("[Discord] cancelled typing task for target %s", target_id)
|
|
||||||
self._typing_tasks.clear()
|
|
||||||
|
|
||||||
if self._client and self._discord_loop and self._discord_loop.is_running():
|
if self._client and self._discord_loop and self._discord_loop.is_running():
|
||||||
close_future = asyncio.run_coroutine_threadsafe(self._client.close(), self._discord_loop)
|
close_future = asyncio.run_coroutine_threadsafe(self._client.close(), self._discord_loop)
|
||||||
try:
|
try:
|
||||||
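The `_load_active_threads` and `_save_thread` methods in the hunk above pair a channel-to-thread dictionary with a reverse-lookup set and a JSON file, all guarded by one lock. The stand-alone sketch below shows that bookkeeping outside the channel class. `ThreadMappingStore` is a hypothetical name, and unlike the diffed code it updates the in-memory dictionary inside `save` rather than leaving that to the caller.

```python
import json
import tempfile
import threading
from pathlib import Path


class ThreadMappingStore:
    """Hypothetical helper: channel_id -> thread_id map persisted as JSON."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()  # guards both dict/set and the file
        self.active_threads: dict[str, str] = {}
        self.active_thread_ids: set[str] = set()  # O(1) membership checks

    def load(self) -> None:
        with self._lock:
            if not self._path.exists():
                return
            data = json.loads(self._path.read_text())
            self.active_threads = dict(data)
            self.active_thread_ids = set(data.values())

    def save(self, channel_id: str, thread_id: str) -> None:
        with self._lock:
            old_id = self.active_threads.get(channel_id)
            if old_id:
                self.active_thread_ids.discard(old_id)
            self.active_threads[channel_id] = thread_id
            self.active_thread_ids.add(thread_id)
            self._path.parent.mkdir(parents=True, exist_ok=True)
            self._path.write_text(json.dumps(self.active_threads, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    store_path = Path(tmp) / "channels" / "discord_threads.json"
    store = ThreadMappingStore(store_path)
    store.save("channel-1", "thread-a")
    store.save("channel-1", "thread-b")  # replaces the earlier mapping
    restored = ThreadMappingStore(store_path)
    restored.load()
    print(restored.active_threads, restored.active_thread_ids)
```

Keeping the reverse-lookup set in step with the dictionary is the part that is easy to get wrong: replacing a channel's thread has to discard the old thread ID, otherwise stale threads would still be treated as active.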
@@ -176,10 +100,6 @@ class DiscordChannel(Channel):
|
|||||||
logger.info("Discord channel stopped")
|
logger.info("Discord channel stopped")
|
||||||
|
|
||||||
async def send(self, msg: OutboundMessage) -> None:
|
async def send(self, msg: OutboundMessage) -> None:
|
||||||
# Stop typing indicator once we're sending the response
|
|
||||||
stop_future = asyncio.run_coroutine_threadsafe(self._stop_typing(msg.chat_id, msg.thread_ts), self._discord_loop)
|
|
||||||
await asyncio.wrap_future(stop_future)
|
|
||||||
|
|
||||||
target = await self._resolve_target(msg)
|
target = await self._resolve_target(msg)
|
||||||
if target is None:
|
if target is None:
|
||||||
logger.error("[Discord] target not found for chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
logger.error("[Discord] target not found for chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
||||||
@@ -191,9 +111,6 @@ class DiscordChannel(Channel):
|
|||||||
await asyncio.wrap_future(send_future)
|
await asyncio.wrap_future(send_future)
|
||||||
|
|
||||||
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
|
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
|
||||||
stop_future = asyncio.run_coroutine_threadsafe(self._stop_typing(msg.chat_id, msg.thread_ts), self._discord_loop)
|
|
||||||
await asyncio.wrap_future(stop_future)
|
|
||||||
|
|
||||||
target = await self._resolve_target(msg)
|
target = await self._resolve_target(msg)
|
||||||
if target is None:
|
if target is None:
|
||||||
logger.error("[Discord] target not found for file upload chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
logger.error("[Discord] target not found for file upload chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
||||||
@@ -213,41 +130,6 @@ class DiscordChannel(Channel):
|
|||||||
logger.exception("[Discord] failed to upload file: %s", attachment.filename)
|
logger.exception("[Discord] failed to upload file: %s", attachment.filename)
|
||||||
return False
|
return False
|
||||||
|
|
||||||
async def _start_typing(self, channel, chat_id: str, thread_ts: str | None = None) -> None:
|
|
||||||
"""Starts a loop to send periodic typing indicators."""
|
|
||||||
target_id = thread_ts or chat_id
|
|
||||||
if target_id in self._typing_tasks:
|
|
||||||
return # Already typing for this target
|
|
||||||
|
|
||||||
async def _typing_loop():
|
|
||||||
try:
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
await channel.trigger_typing()
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
await asyncio.sleep(10)
|
|
||||||
except asyncio.CancelledError:
|
|
||||||
pass
|
|
||||||
|
|
||||||
task = asyncio.create_task(_typing_loop())
|
|
||||||
self._typing_tasks[target_id] = task
|
|
||||||
|
|
||||||
async def _stop_typing(self, chat_id: str, thread_ts: str | None = None) -> None:
|
|
||||||
"""Stops the typing loop for a specific target."""
|
|
||||||
target_id = thread_ts or chat_id
|
|
||||||
task = self._typing_tasks.pop(target_id, None)
|
|
||||||
if task and not task.done():
|
|
||||||
task.cancel()
|
|
||||||
logger.debug("[Discord] stopped typing indicator for target %s", target_id)
|
|
||||||
|
|
||||||
async def _add_reaction(self, message) -> None:
|
|
||||||
"""Add a checkmark reaction to acknowledge the message was received."""
|
|
||||||
try:
|
|
||||||
await message.add_reaction("✅")
|
|
||||||
except Exception:
|
|
||||||
logger.debug("[Discord] failed to add reaction to message %s", message.id, exc_info=True)
|
|
||||||
|
|
||||||
async def _on_message(self, message) -> None:
|
async def _on_message(self, message) -> None:
|
||||||
if not self._running or not self._client:
|
if not self._running or not self._client:
|
||||||
return
|
return
|
||||||
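The `_start_typing` and `_stop_typing` methods in the hunk above follow a common asyncio pattern: one background task per target that repeats an action until it is cancelled. The sketch below reproduces that pattern without any Discord dependency. `TypingIndicators` and the injected `trigger` coroutine are hypothetical stand-ins for the channel object used in the diffed code.

```python
import asyncio
from collections.abc import Awaitable, Callable


class TypingIndicators:
    """Hypothetical helper: one periodic task per target, cancelled on stop."""

    def __init__(self, interval: float = 10.0) -> None:
        self._interval = interval
        self._tasks: dict[str, asyncio.Task] = {}

    def start(self, target_id: str, trigger: Callable[[], Awaitable[None]]) -> None:
        """Start the loop for a target; must be called from a running event loop."""
        if target_id in self._tasks:
            return  # already running for this target

        async def _loop() -> None:
            while True:
                try:
                    await trigger()
                except Exception:
                    pass  # a failed indicator should not end the loop
                await asyncio.sleep(self._interval)

        self._tasks[target_id] = asyncio.create_task(_loop())

    def stop(self, target_id: str) -> bool:
        """Cancel the loop for a target; return whether a live task was cancelled."""
        task = self._tasks.pop(target_id, None)
        if task and not task.done():
            task.cancel()
            return True
        return False


async def demo() -> tuple[int, int, bool]:
    calls: list[int] = []

    async def trigger() -> None:
        calls.append(1)

    indicators = TypingIndicators(interval=0.01)
    indicators.start("thread-1", trigger)
    indicators.start("thread-1", trigger)  # no second task is created
    await asyncio.sleep(0.05)
    stopped = indicators.stop("thread-1")
    count_at_stop = len(calls)
    await asyncio.sleep(0.03)  # nothing should fire after cancellation
    return count_at_stop, len(calls), stopped


count_at_stop, count_later, stopped = asyncio.run(demo())
```

`except Exception` does not swallow `asyncio.CancelledError`, which derives from `BaseException` in current Python versions, so cancellation still ends the loop.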
@@ -270,145 +152,17 @@ class DiscordChannel(Channel):
|
|||||||
if self._discord_module is None:
|
if self._discord_module is None:
|
||||||
return
|
return
|
||||||
|
|
||||||
# Determine whether the bot is mentioned in this message
|
|
||||||
user = self._client.user if self._client else None
|
|
||||||
if user:
|
|
||||||
bot_mention = user.mention # <@ID>
|
|
||||||
alt_mention = f"<@!{user.id}>" # <@!ID> (ping variant)
|
|
||||||
standard_mention = f"<@{user.id}>"
|
|
||||||
else:
|
|
||||||
bot_mention = None
|
|
||||||
alt_mention = None
|
|
||||||
standard_mention = ""
|
|
||||||
has_mention = (bot_mention and bot_mention in message.content) or (alt_mention and alt_mention in message.content) or (standard_mention and standard_mention in message.content)
|
|
||||||
|
|
||||||
# Strip mention from text for processing
|
|
||||||
if has_mention:
|
|
||||||
text = text.replace(bot_mention or "", "").replace(alt_mention or "", "").replace(standard_mention or "", "").strip()
|
|
||||||
# Don't return early if text is empty — still process the mention (e.g., create thread)
|
|
||||||
|
|
||||||
# --- Determine thread/channel routing and typing target ---
|
|
||||||
thread_id = None
|
|
||||||
chat_id = None
|
|
||||||
typing_target = None # The Discord object to type into
|
|
||||||
|
|
||||||
if isinstance(message.channel, self._discord_module.Thread):
|
if isinstance(message.channel, self._discord_module.Thread):
|
||||||
# --- Message already inside a thread ---
|
chat_id = str(message.channel.parent_id or message.channel.id)
|
||||||
thread_obj = message.channel
|
thread_id = str(message.channel.id)
|
||||||
thread_id = str(thread_obj.id)
|
|
||||||
chat_id = str(thread_obj.parent_id or thread_obj.id)
|
|
||||||
typing_target = thread_obj
|
|
||||||
|
|
||||||
# If this is a known active thread, process normally
|
|
||||||
if thread_id in self._active_thread_ids:
|
|
||||||
msg_type = InboundMessageType.COMMAND if is_known_channel_command(text) else InboundMessageType.CHAT
|
|
||||||
inbound = self._make_inbound(
|
|
||||||
chat_id=chat_id,
|
|
||||||
user_id=str(message.author.id),
|
|
||||||
text=text,
|
|
||||||
msg_type=msg_type,
|
|
||||||
thread_ts=thread_id,
|
|
||||||
metadata={
|
|
||||||
"guild_id": str(guild.id) if guild else None,
|
|
||||||
"channel_id": str(message.channel.id),
|
|
||||||
"message_id": str(message.id),
|
|
||||||
},
|
|
||||||
)
|
|
||||||
inbound.topic_id = thread_id
|
|
||||||
self._publish(inbound)
|
|
||||||
# Start typing indicator in the thread
|
|
||||||
if typing_target:
|
|
||||||
asyncio.create_task(self._start_typing(typing_target, chat_id, thread_id))
|
|
||||||
asyncio.create_task(self._add_reaction(message))
|
|
||||||
return
|
|
||||||
|
|
||||||
# Thread not tracked (orphaned) — create new thread and handle below
|
|
||||||
logger.debug("[Discord] message in orphaned thread %s, will create new thread", thread_id)
|
|
||||||
thread_id = None
|
|
||||||
typing_target = None
|
|
||||||
|
|
||||||
# At this point we're guaranteed to be in a channel, not a thread
|
|
||||||
# (the Thread case is handled above). Apply mention_only for all
|
|
||||||
# non-thread messages — no special case needed.
|
|
||||||
channel_id = str(message.channel.id)
|
|
||||||
|
|
||||||
# Check if there's an active thread for this channel
|
|
||||||
if channel_id in self._active_threads:
|
|
||||||
# respect mention_only: if enabled, only process messages that mention the bot
|
|
||||||
# (unless the channel is in allowed_channels)
|
|
||||||
# Messages within a thread are always allowed through (continuation).
|
|
||||||
# At this code point we know the message is in a channel, not a thread
|
|
||||||
# (Thread case handled above), so always apply the check.
|
|
||||||
if self._mention_only and not has_mention and channel_id not in self._allowed_channels:
|
|
||||||
logger.debug("[Discord] skipping no-@ message in channel %s (not in thread)", channel_id)
|
|
||||||
return
|
|
||||||
# mention_only + fresh @ → create new thread instead of routing to existing one
|
|
||||||
if self._mention_only and has_mention:
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is not None:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj
|
|
||||||
logger.info("[Discord] created new thread %s in channel %s on mention (replacing existing thread)", target_thread_id, channel_id)
|
|
||||||
else:
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel
|
|
||||||
else:
|
|
||||||
# Existing session → route to the existing thread
|
|
||||||
target_thread_id = self._active_threads[channel_id]
|
|
||||||
logger.debug("[Discord] routing message in channel %s to existing thread %s", channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = await self._get_channel_or_thread(target_thread_id)
|
|
||||||
elif self._mention_only and not has_mention and channel_id not in self._allowed_channels:
|
|
||||||
# Not mentioned and not in an allowed channel → skip
|
|
||||||
logger.debug("[Discord] skipping message without mention in channel %s", channel_id)
|
|
||||||
return
|
|
||||||
elif self._mention_only and has_mention:
|
|
||||||
# First mention in this channel → create thread
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is not None:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj # Type into the new thread
|
|
||||||
logger.info("[Discord] created thread %s in channel %s for user %s", target_thread_id, channel_id, message.author.display_name)
|
|
||||||
else:
|
|
||||||
# Fallback: thread creation failed (disabled/permissions), reply in channel
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel # Type into the channel
|
|
||||||
elif self._thread_mode:
|
|
||||||
# thread_mode but mention_only is False → create thread anyway for conversation grouping
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is None:
|
|
||||||
# Thread creation failed (disabled/permissions), fall back to channel replies
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel # Type into the channel
|
|
||||||
else:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj # Type into the new thread
|
|
||||||
else:
|
else:
|
||||||
# No threading — reply directly in channel
|
thread = await self._create_thread(message)
|
||||||
thread_id = channel_id
|
if thread is None:
|
||||||
chat_id = channel_id
|
return
|
||||||
typing_target = message.channel # Type into the channel
|
chat_id = str(message.channel.id)
|
||||||
|
thread_id = str(thread.id)
|
||||||
|
|
||||||
msg_type = InboundMessageType.COMMAND if is_known_channel_command(text) else InboundMessageType.CHAT
|
msg_type = InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT
|
||||||
inbound = self._make_inbound(
|
inbound = self._make_inbound(
|
||||||
chat_id=chat_id,
|
chat_id=chat_id,
|
||||||
user_id=str(message.author.id),
|
user_id=str(message.author.id),
|
||||||
@@ -423,15 +177,6 @@ class DiscordChannel(Channel):
|
|||||||
)
|
)
|
||||||
inbound.topic_id = thread_id
|
inbound.topic_id = thread_id
|
||||||
|
|
||||||
# Start typing indicator in the correct target (thread or channel)
|
|
||||||
if typing_target:
|
|
||||||
asyncio.create_task(self._start_typing(typing_target, chat_id, thread_id))
|
|
||||||
|
|
||||||
self._publish(inbound)
|
|
||||||
asyncio.create_task(self._add_reaction(message))
|
|
||||||
|
|
||||||
def _publish(self, inbound) -> None:
|
|
||||||
"""Publish an inbound message to the main event loop."""
|
|
||||||
if self._main_loop and self._main_loop.is_running():
|
if self._main_loop and self._main_loop.is_running():
|
||||||
future = asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)
|
future = asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)
|
||||||
future.add_done_callback(lambda f: logger.exception("[Discord] publish_inbound failed", exc_info=f.exception()) if f.exception() else None)
|
future.add_done_callback(lambda f: logger.exception("[Discord] publish_inbound failed", exc_info=f.exception()) if f.exception() else None)
|
||||||
@@ -453,40 +198,14 @@ class DiscordChannel(Channel):
|
|||||||
|
|
||||||
async def _create_thread(self, message):
|
async def _create_thread(self, message):
|
||||||
try:
|
try:
|
||||||
if self._discord_module is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Only TextChannel (type 0) and NewsChannel (type 10) support threads
|
|
||||||
channel_type = message.channel.type
|
|
||||||
if channel_type not in (
|
|
||||||
self._discord_module.ChannelType.text,
|
|
||||||
self._discord_module.ChannelType.news,
|
|
||||||
):
|
|
||||||
logger.info(
|
|
||||||
"[Discord] channel type %s (%s) does not support threads",
|
|
||||||
channel_type.value,
|
|
||||||
channel_type.name,
|
|
||||||
)
|
|
||||||
return None
|
|
||||||
|
|
||||||
thread_name = f"deerflow-{message.author.display_name}-{message.id}"[:100]
|
thread_name = f"deerflow-{message.author.display_name}-{message.id}"[:100]
|
||||||
return await message.create_thread(name=thread_name)
|
return await message.create_thread(name=thread_name)
|
||||||
except self._discord_module.errors.HTTPException as exc:
|
|
||||||
if exc.code == 50024:
|
|
||||||
logger.info(
|
|
||||||
"[Discord] cannot create thread in channel %s (error code 50024): %s",
|
|
||||||
message.channel.id,
|
|
||||||
channel_type.name if (channel_type := message.channel.type) else "unknown",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
logger.exception(
|
|
||||||
"[Discord] failed to create thread for message=%s (HTTPException %s)",
|
|
||||||
message.id,
|
|
||||||
exc.code,
|
|
||||||
)
|
|
||||||
return None
|
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception("[Discord] failed to create thread for message=%s (threads may be disabled or missing permissions)", message.id)
|
logger.exception("[Discord] failed to create thread for message=%s (threads may be disabled or missing permissions)", message.id)
|
||||||
|
try:
|
||||||
|
await message.channel.send("Could not create a thread for your message. Please check that threads are enabled in this channel.")
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
return None
|
return None
|
||||||
|
|
||||||
async def _resolve_target(self, msg: OutboundMessage):
|
async def _resolve_target(self, msg: OutboundMessage):
|
||||||
|
|||||||
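The Discord hunk above classifies a message as a command either with `is_known_channel_command(text)` or with a plain `text.startswith("/")` check. The two behave differently for text that merely begins with a slash. The sketch below is a hedged illustration, not the project's implementation: the contents of `KNOWN_CHANNEL_COMMANDS` are hypothetical (the real set lives in `app.channels.commands` and is not shown in this diff), and `looks_like_known_command` is an illustrative name whose body mirrors the inline check visible in the Feishu hunk.

```python
# Hypothetical command names; the real set is defined in app.channels.commands.
KNOWN_CHANNEL_COMMANDS = frozenset({"/new", "/status", "/help"})


def looks_like_known_command(text: str) -> bool:
    """Return True only when the first token is a registered slash command."""
    if not text.startswith("/"):
        return False
    # The first whitespace-delimited token is compared case-insensitively.
    return text.split(maxsplit=1)[0].lower() in KNOWN_CHANNEL_COMMANDS


def classify_by_prefix(text: str) -> str:
    """Prefix-only rule: anything starting with '/' is treated as a command."""
    return "COMMAND" if text.startswith("/") else "CHAT"


def classify_by_registry(text: str) -> str:
    """Registry rule: only registered commands are treated as commands."""
    return "COMMAND" if looks_like_known_command(text) else "CHAT"


if __name__ == "__main__":
    for sample in ("/help me", "/tmp/report.txt looks wrong", "hello"):
        print(sample, classify_by_prefix(sample), classify_by_registry(sample))
```

Under the prefix-only rule, a message such as a pasted file path is routed as a command; under the registry rule it stays a chat message.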
+13
-190
@@ -7,30 +7,22 @@ import json
|
|||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
import threading
|
import threading
|
||||||
import time
|
|
||||||
from typing import Any, Literal
|
from typing import Any, Literal
|
||||||
|
|
||||||
from app.channels.base import Channel
|
from app.channels.base import Channel
|
||||||
from app.channels.commands import is_known_channel_command
|
from app.channels.commands import KNOWN_CHANNEL_COMMANDS
|
||||||
from app.channels.message_bus import (
|
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
|
||||||
PENDING_CLARIFICATION_METADATA_KEY,
|
|
||||||
RESOLVED_FROM_PENDING_CLARIFICATION_METADATA_KEY,
|
|
||||||
InboundMessage,
|
|
||||||
InboundMessageType,
|
|
||||||
MessageBus,
|
|
||||||
OutboundMessage,
|
|
||||||
ResolvedAttachment,
|
|
||||||
)
|
|
||||||
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
|
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import get_effective_user_id
|
||||||
from deerflow.sandbox.sandbox_provider import get_sandbox_provider
|
from deerflow.sandbox.sandbox_provider import get_sandbox_provider
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
PENDING_CLARIFICATION_TTL_SECONDS = 30 * 60
|
|
||||||
|
|
||||||
|
|
||||||
def _is_feishu_command(text: str) -> bool:
|
def _is_feishu_command(text: str) -> bool:
|
||||||
return is_known_channel_command(text)
|
if not text.startswith("/"):
|
||||||
|
return False
|
||||||
|
return text.split(maxsplit=1)[0].lower() in KNOWN_CHANNEL_COMMANDS
|
||||||
|
|
||||||
|
|
||||||
class FeishuChannel(Channel):
|
class FeishuChannel(Channel):
|
||||||
@@ -64,7 +56,6 @@ class FeishuChannel(Channel):
|
|||||||
self._background_tasks: set[asyncio.Task] = set()
|
self._background_tasks: set[asyncio.Task] = set()
|
||||||
self._running_card_ids: dict[str, str] = {}
|
self._running_card_ids: dict[str, str] = {}
|
||||||
self._running_card_tasks: dict[str, asyncio.Task] = {}
|
self._running_card_tasks: dict[str, asyncio.Task] = {}
|
||||||
self._pending_clarifications: dict[tuple[str, str], list[dict[str, Any]]] = {}
|
|
||||||
self._CreateFileRequest = None
|
self._CreateFileRequest = None
|
||||||
self._CreateFileRequestBody = None
|
self._CreateFileRequestBody = None
|
||||||
self._CreateImageRequest = None
|
self._CreateImageRequest = None
|
||||||
@@ -72,16 +63,6 @@ class FeishuChannel(Channel):
|
|||||||
self._GetMessageResourceRequest = None
|
self._GetMessageResourceRequest = None
|
||||||
self._thread_lock = threading.Lock()
|
self._thread_lock = threading.Lock()
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _non_empty_str(value: Any) -> str | None:
|
|
||||||
if isinstance(value, str) and value.strip():
|
|
||||||
return value.strip()
|
|
||||||
return None
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _pending_key(chat_id: str, user_id: str) -> tuple[str, str]:
|
|
||||||
return (chat_id, user_id)
|
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def supports_streaming(self) -> bool:
|
def supports_streaming(self) -> bool:
|
||||||
return True
|
return True
|
||||||
@@ -550,25 +531,18 @@ class FeishuChannel(Channel):
|
|||||||
"[Feishu] failed to patch running card %s, falling back to final reply",
|
"[Feishu] failed to patch running card %s, falling back to final reply",
|
||||||
running_card_id,
|
running_card_id,
|
||||||
)
|
)
|
||||||
fallback_card_id = await self._reply_card(source_message_id, msg.text)
|
await self._reply_card(source_message_id, msg.text)
|
||||||
self._remember_thread_mapping(msg, source_message_id, fallback_card_id)
|
|
||||||
self._remember_pending_clarification(msg, fallback_card_id)
|
|
||||||
else:
|
else:
|
||||||
self._remember_thread_mapping(msg, source_message_id, running_card_id)
|
|
||||||
self._remember_pending_clarification(msg, running_card_id)
|
|
||||||
logger.info("[Feishu] running card updated: source=%s card=%s", source_message_id, running_card_id)
|
logger.info("[Feishu] running card updated: source=%s card=%s", source_message_id, running_card_id)
|
||||||
elif msg.is_final:
|
elif msg.is_final:
|
||||||
final_card_id = await self._reply_card(source_message_id, msg.text)
|
await self._reply_card(source_message_id, msg.text)
|
||||||
self._remember_thread_mapping(msg, source_message_id, final_card_id)
|
|
||||||
self._remember_pending_clarification(msg, final_card_id)
|
|
||||||
elif awaited_running_card_task:
|
elif awaited_running_card_task:
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"[Feishu] running card task finished without message_id for source=%s, skipping duplicate non-final creation",
|
"[Feishu] running card task finished without message_id for source=%s, skipping duplicate non-final creation",
|
||||||
source_message_id,
|
source_message_id,
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
created_card_id = await self._ensure_running_card(source_message_id, msg.text)
|
await self._ensure_running_card(source_message_id, msg.text)
|
||||||
self._remember_thread_mapping(msg, source_message_id, created_card_id)
|
|
||||||
|
|
||||||
if msg.is_final:
|
if msg.is_final:
|
||||||
self._running_card_ids.pop(source_message_id, None)
|
self._running_card_ids.pop(source_message_id, None)
|
||||||
@@ -579,129 +553,6 @@ class FeishuChannel(Channel):
|
|||||||
|
|
||||||
# -- internal ----------------------------------------------------------
|
# -- internal ----------------------------------------------------------
|
||||||
|
|
||||||
def _remember_thread_mapping(self, msg: OutboundMessage, *topic_ids: str | None) -> None:
|
|
||||||
store = self.config.get("channel_store")
|
|
||||||
if store is None or not msg.thread_id:
|
|
||||||
return
|
|
||||||
|
|
||||||
metadata_topic_ids = [
|
|
||||||
msg.metadata.get("message_id"),
|
|
||||||
msg.metadata.get("root_id"),
|
|
||||||
msg.metadata.get("parent_id"),
|
|
||||||
msg.metadata.get("thread_id"),
|
|
||||||
msg.metadata.get("topic_id"),
|
|
||||||
]
|
|
||||||
user_id = ""
|
|
||||||
raw_user_id = msg.metadata.get("user_id")
|
|
||||||
if isinstance(raw_user_id, str):
|
|
||||||
user_id = raw_user_id
|
|
||||||
|
|
||||||
seen: set[str] = set()
|
|
||||||
for topic_id in [*topic_ids, *metadata_topic_ids]:
|
|
||||||
topic_id = self._non_empty_str(topic_id)
|
|
||||||
if not topic_id or topic_id in seen:
|
|
||||||
continue
|
|
||||||
seen.add(topic_id)
|
|
||||||
try:
|
|
||||||
store.set_thread_id(
|
|
||||||
self.name,
|
|
||||||
msg.chat_id,
|
|
||||||
msg.thread_id,
|
|
||||||
topic_id=topic_id,
|
|
||||||
user_id=user_id,
|
|
||||||
)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Feishu] failed to remember thread mapping for topic_id=%s", topic_id)
|
|
||||||
|
|
||||||
def _remember_pending_clarification(self, msg: OutboundMessage, card_message_id: str | None) -> None:
|
|
||||||
if not msg.is_final or msg.metadata.get(PENDING_CLARIFICATION_METADATA_KEY) is not True:
|
|
||||||
return
|
|
||||||
|
|
||||||
user_id = self._non_empty_str(msg.metadata.get("user_id"))
|
|
||||||
topic_id = self._non_empty_str(msg.metadata.get("topic_id"))
|
|
||||||
source_message_id = self._non_empty_str(msg.thread_ts) or self._non_empty_str(msg.metadata.get("message_id"))
|
|
||||||
if not (user_id and topic_id and msg.thread_id and source_message_id and card_message_id):
|
|
||||||
return
|
|
||||||
|
|
||||||
key = self._pending_key(msg.chat_id, user_id)
|
|
||||||
pending = {
|
|
||||||
"thread_id": msg.thread_id,
|
|
||||||
"topic_id": topic_id,
|
|
||||||
"source_message_id": source_message_id,
|
|
||||||
"card_message_id": card_message_id,
|
|
||||||
"created_at": time.time(),
|
|
||||||
}
|
|
||||||
with self._thread_lock:
|
|
||||||
# Plain-message clarification continuity is a short-lived in-memory
|
|
||||||
# hint; explicit Feishu replies are still covered by persisted
|
|
||||||
# message-id mappings.
|
|
||||||
self._pending_clarifications.setdefault(key, []).append(pending)
|
|
||||||
logger.info(
|
|
||||||
"[Feishu] pending clarification remembered: chat_id=%s user_id=%s topic_id=%s thread_id=%s",
|
|
||||||
msg.chat_id,
|
|
||||||
user_id,
|
|
||||||
topic_id,
|
|
||||||
msg.thread_id,
|
|
||||||
)
|
|
||||||
|
|
||||||
def _consume_pending_clarification(self, chat_id: str, user_id: str) -> dict[str, Any] | None:
|
|
||||||
key = self._pending_key(chat_id, user_id)
|
|
||||||
with self._thread_lock:
|
|
||||||
pending_items = self._pending_clarifications.get(key)
|
|
||||||
if not pending_items:
|
|
||||||
return None
|
|
||||||
|
|
||||||
now = time.time()
|
|
||||||
while pending_items:
|
|
||||||
pending = pending_items.pop(0)
|
|
||||||
created_at = pending.get("created_at")
|
|
||||||
if isinstance(created_at, (int, float)) and now - created_at <= PENDING_CLARIFICATION_TTL_SECONDS:
|
|
||||||
if pending_items:
|
|
||||||
self._pending_clarifications[key] = pending_items
|
|
||||||
else:
|
|
||||||
self._pending_clarifications.pop(key, None)
|
|
||||||
return pending
|
|
||||||
logger.info("[Feishu] pending clarification expired: chat_id=%s user_id=%s", chat_id, user_id)
|
|
||||||
|
|
||||||
self._pending_clarifications.pop(key, None)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _ensure_pending_thread_mapping(self, chat_id: str, user_id: str, pending: dict[str, Any]) -> None:
|
|
||||||
store = self.config.get("channel_store")
|
|
||||||
topic_id = self._non_empty_str(pending.get("topic_id"))
|
|
||||||
thread_id = self._non_empty_str(pending.get("thread_id"))
|
|
||||||
if store is None or not topic_id or not thread_id:
|
|
||||||
return
|
|
||||||
try:
|
|
||||||
store.set_thread_id(self.name, chat_id, thread_id, topic_id=topic_id, user_id=user_id)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Feishu] failed to restore pending clarification mapping for topic_id=%s", topic_id)
|
|
||||||
|
|
||||||
def _resolve_topic_id(
|
|
||||||
self,
|
|
||||||
chat_id: str,
|
|
||||||
msg_id: str,
|
|
||||||
*,
|
|
||||||
root_id: str | None,
|
|
||||||
parent_id: str | None,
|
|
||||||
thread_id: str | None,
|
|
||||||
) -> tuple[str, bool]:
|
|
||||||
store = self.config.get("channel_store")
|
|
||||||
candidates = [root_id, parent_id, thread_id]
|
|
||||||
|
|
||||||
if store is not None:
|
|
||||||
for candidate in candidates:
|
|
||||||
candidate = self._non_empty_str(candidate)
|
|
||||||
if not candidate:
|
|
||||||
continue
|
|
||||||
try:
|
|
||||||
if store.get_thread_id(self.name, chat_id, topic_id=candidate):
|
|
||||||
return candidate, True
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Feishu] failed to resolve stored topic mapping for topic_id=%s", candidate)
|
|
||||||
|
|
||||||
return root_id or msg_id, False
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _log_future_error(fut, name: str, msg_id: str) -> None:
|
def _log_future_error(fut, name: str, msg_id: str) -> None:
|
||||||
"""Callback for run_coroutine_threadsafe futures to surface errors."""
|
"""Callback for run_coroutine_threadsafe futures to surface errors."""
|
||||||
@@ -742,9 +593,7 @@ class FeishuChannel(Channel):
|
|||||||
|
|
||||||
# root_id is set when the message is a reply within a Feishu thread.
|
# root_id is set when the message is a reply within a Feishu thread.
|
||||||
# Use it as topic_id so all replies share the same DeerFlow thread.
|
# Use it as topic_id so all replies share the same DeerFlow thread.
|
||||||
root_id = self._non_empty_str(getattr(message, "root_id", None))
|
root_id = getattr(message, "root_id", None) or None
|
||||||
parent_id = self._non_empty_str(getattr(message, "parent_id", None))
|
|
||||||
feishu_thread_id = self._non_empty_str(getattr(message, "thread_id", None))
|
|
||||||
|
|
||||||
# Parse message content
|
# Parse message content
|
||||||
content = json.loads(message.content)
|
content = json.loads(message.content)
|
||||||
@@ -805,12 +654,10 @@ class FeishuChannel(Channel):
|
|||||||
text = text.strip()
|
text = text.strip()
|
||||||
|
|
||||||
logger.info(
|
logger.info(
|
||||||
"[Feishu] parsed message: chat_id=%s, msg_id=%s, root_id=%s, parent_id=%s, thread_id=%s, sender=%s, text=%r",
|
"[Feishu] parsed message: chat_id=%s, msg_id=%s, root_id=%s, sender=%s, text=%r",
|
||||||
chat_id,
|
chat_id,
|
||||||
msg_id,
|
msg_id,
|
||||||
root_id,
|
root_id,
|
||||||
parent_id,
|
|
||||||
feishu_thread_id,
|
|
||||||
sender_id,
|
sender_id,
|
||||||
text[:100] if text else "",
|
text[:100] if text else "",
|
||||||
)
|
)
|
||||||
@@ -826,24 +673,8 @@ class FeishuChannel(Channel):
|
|||||||
else:
|
else:
|
||||||
msg_type = InboundMessageType.CHAT
|
msg_type = InboundMessageType.CHAT
|
||||||
|
|
||||||
# Prefer any platform message id that already maps to a DeerFlow
|
# topic_id: use root_id for replies (same topic), msg_id for new messages (new topic)
|
||||||
# thread. This keeps replies to bot clarification cards in the
|
topic_id = root_id or msg_id
|
||||||
# original conversation even when Feishu reports the card as root.
|
|
||||||
topic_id, resolved_from_stored_mapping = self._resolve_topic_id(
|
|
||||||
chat_id,
|
|
||||||
msg_id,
|
|
||||||
root_id=root_id,
|
|
||||||
parent_id=parent_id,
|
|
||||||
thread_id=feishu_thread_id,
|
|
||||||
)
|
|
||||||
resolved_from_pending = False
|
|
||||||
if msg_type == InboundMessageType.CHAT and not resolved_from_stored_mapping:
|
|
||||||
pending = self._consume_pending_clarification(chat_id, sender_id)
|
|
||||||
pending_topic_id = self._non_empty_str(pending.get("topic_id")) if pending else None
|
|
||||||
if pending_topic_id:
|
|
||||||
topic_id = pending_topic_id
|
|
||||||
self._ensure_pending_thread_mapping(chat_id, sender_id, pending)
|
|
||||||
resolved_from_pending = True
|
|
||||||
|
|
||||||
inbound = self._make_inbound(
|
inbound = self._make_inbound(
|
||||||
chat_id=chat_id,
|
chat_id=chat_id,
|
||||||
@@ -852,15 +683,7 @@ class FeishuChannel(Channel):
|
|||||||
msg_type=msg_type,
|
msg_type=msg_type,
|
||||||
thread_ts=msg_id,
|
thread_ts=msg_id,
|
||||||
files=files_list,
|
files=files_list,
|
||||||
metadata={
|
metadata={"message_id": msg_id, "root_id": root_id},
|
||||||
"message_id": msg_id,
|
|
||||||
"root_id": root_id,
|
|
||||||
"parent_id": parent_id,
|
|
||||||
"thread_id": feishu_thread_id,
|
|
||||||
"topic_id": topic_id,
|
|
||||||
"user_id": sender_id,
|
|
||||||
RESOLVED_FROM_PENDING_CLARIFICATION_METADATA_KEY: resolved_from_pending,
|
|
||||||
},
|
|
||||||
)
|
)
|
||||||
inbound.topic_id = topic_id
|
inbound.topic_id = topic_id
|
||||||
|
|
||||||
|
|||||||
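The Feishu diff above contrasts two ways of choosing `topic_id`: the simple `root_id or msg_id` rule, and a lookup that prefers any platform message id already mapped to a DeerFlow thread. The sketch below illustrates the lookup order under stated assumptions: `lookup` is a hypothetical callable standing in for `store.get_thread_id(self.name, chat_id, topic_id=candidate)`, and the function is a simplified free-function version of `_resolve_topic_id` without its error handling.

```python
from typing import Callable, Optional


def resolve_topic_id(
    lookup: Callable[[str], Optional[str]],
    msg_id: str,
    *,
    root_id: Optional[str] = None,
    parent_id: Optional[str] = None,
    thread_id: Optional[str] = None,
) -> tuple[str, bool]:
    """Return (topic_id, resolved_from_stored_mapping)."""
    for candidate in (root_id, parent_id, thread_id):
        # Skip missing or blank ids, as _non_empty_str does in the diff.
        if not isinstance(candidate, str) or not candidate.strip():
            continue
        candidate = candidate.strip()
        if lookup(candidate):
            return candidate, True
    # No stored mapping: fall back to the simple rule.
    return root_id or msg_id, False


if __name__ == "__main__":
    stored = {"card-1": "thread-A"}  # hypothetical topic_id -> thread_id mapping
    print(resolve_topic_id(stored.get, "m3", root_id="card-0", parent_id="card-1"))
    print(resolve_topic_id(stored.get, "m4"))
```

With the simple rule, a reply whose `root_id` is an unmapped bot card would start a new topic; with the lookup, a mapped `parent_id` or `thread_id` keeps it in the original conversation.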
+30
-258
@@ -8,7 +8,6 @@ import mimetypes
|
|||||||
import re
|
import re
|
||||||
import time
|
import time
|
||||||
from collections.abc import Awaitable, Callable, Mapping
|
from collections.abc import Awaitable, Callable, Mapping
|
||||||
from dataclasses import dataclass
|
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
@@ -16,24 +15,11 @@ import httpx
|
|||||||
from langgraph_sdk.errors import ConflictError
|
from langgraph_sdk.errors import ConflictError
|
||||||
|
|
||||||
from app.channels.commands import KNOWN_CHANNEL_COMMANDS
|
from app.channels.commands import KNOWN_CHANNEL_COMMANDS
|
||||||
from app.channels.message_bus import (
|
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
|
||||||
PENDING_CLARIFICATION_METADATA_KEY,
|
|
||||||
InboundMessage,
|
|
||||||
InboundMessageType,
|
|
||||||
MessageBus,
|
|
||||||
OutboundMessage,
|
|
||||||
ResolvedAttachment,
|
|
||||||
)
|
|
||||||
from app.channels.store import ChannelStore
|
from app.channels.store import ChannelStore
|
||||||
from app.gateway.csrf_middleware import CSRF_COOKIE_NAME, CSRF_HEADER_NAME, generate_csrf_token
|
from app.gateway.csrf_middleware import CSRF_COOKIE_NAME, CSRF_HEADER_NAME, generate_csrf_token
|
||||||
from app.gateway.internal_auth import create_internal_auth_headers
|
from app.gateway.internal_auth import create_internal_auth_headers
|
||||||
from deerflow.config.agents_config import load_agent_config
|
|
||||||
from deerflow.config.paths import make_safe_user_id
|
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import get_effective_user_id
|
||||||
from deerflow.skills.slash import parse_slash_skill_reference
|
|
||||||
from deerflow.skills.storage import get_or_new_skill_storage
|
|
||||||
from deerflow.skills.storage.skill_storage import SkillStorage
|
|
||||||
from deerflow.utils.messages import ORIGINAL_USER_CONTENT_KEY
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -130,16 +116,6 @@ class InvalidChannelSessionConfigError(ValueError):
|
|||||||
"""Raised when IM channel session overrides contain invalid agent config."""
|
"""Raised when IM channel session overrides contain invalid agent config."""
|
||||||
|
|
||||||
|
|
||||||
class SlashSkillCommandResolutionError(RuntimeError):
|
|
||||||
"""Raised when IM slash-skill command resolution cannot complete safely."""
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True, slots=True)
|
|
||||||
class _SlashSkillCommandResolution:
|
|
||||||
route_to_chat: bool = False
|
|
||||||
failure_message: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
def _is_thread_busy_error(exc: BaseException | None) -> bool:
|
def _is_thread_busy_error(exc: BaseException | None) -> bool:
|
||||||
if exc is None:
|
if exc is None:
|
||||||
return False
|
return False
|
||||||
@@ -179,6 +155,7 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
Handles special cases:
|
Handles special cases:
|
||||||
- Regular AI text responses
|
- Regular AI text responses
|
||||||
- Clarification interrupts (``ask_clarification`` tool messages)
|
- Clarification interrupts (``ask_clarification`` tool messages)
|
||||||
|
- AI messages with tool_calls but no text content
|
||||||
"""
|
"""
|
||||||
if isinstance(result, list):
|
if isinstance(result, list):
|
||||||
messages = result
|
messages = result
|
||||||
@@ -197,8 +174,6 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
|
|
||||||
# Stop at the last human message — anything before it is a previous turn
|
# Stop at the last human message — anything before it is a previous turn
|
||||||
if msg_type == "human":
|
if msg_type == "human":
|
||||||
if _is_hidden_human_control_message(msg):
|
|
||||||
continue
|
|
||||||
break
|
break
|
||||||
|
|
||||||
# Check for tool messages from ask_clarification (interrupt case)
|
# Check for tool messages from ask_clarification (interrupt case)
|
||||||
@@ -226,54 +201,6 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
return ""
|
return ""
|
||||||
|
|
||||||
|
|
||||||
def _messages_from_result(result: dict | list) -> list[Any]:
|
|
||||||
if isinstance(result, list):
|
|
||||||
return result
|
|
||||||
if isinstance(result, dict):
|
|
||||||
messages = result.get("messages", [])
|
|
||||||
if isinstance(messages, list):
|
|
||||||
return messages
|
|
||||||
return []
|
|
||||||
|
|
||||||
|
|
||||||
def _current_turn_messages(result: dict | list) -> list[dict[str, Any]]:
|
|
||||||
messages = _messages_from_result(result)
|
|
||||||
current_turn: list[dict[str, Any]] = []
|
|
||||||
for msg in reversed(messages):
|
|
||||||
if not isinstance(msg, dict):
|
|
||||||
continue
|
|
||||||
if msg.get("type") == "human":
|
|
||||||
break
|
|
||||||
current_turn.append(msg)
|
|
||||||
current_turn.reverse()
|
|
||||||
return current_turn
|
|
||||||
|
|
||||||
|
|
||||||
def _has_current_turn_clarification(result: dict | list) -> bool:
|
|
||||||
"""Return True only when the current turn's final result is clarification."""
|
|
||||||
for msg in reversed(_current_turn_messages(result)):
|
|
||||||
msg_type = msg.get("type")
|
|
||||||
if msg_type == "tool":
|
|
||||||
return msg.get("name") == "ask_clarification"
|
|
||||||
if msg_type == "ai":
|
|
||||||
content = msg.get("content")
|
|
||||||
if isinstance(content, str):
|
|
||||||
if content:
|
|
||||||
return False
|
|
||||||
elif content:
|
|
||||||
return False
|
|
||||||
if msg.get("tool_calls"):
|
|
||||||
return False
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
def _response_metadata(base_metadata: dict[str, Any], *, pending_clarification: bool = False) -> dict[str, Any]:
|
|
||||||
metadata = _slim_metadata(base_metadata)
|
|
||||||
if pending_clarification:
|
|
||||||
metadata[PENDING_CLARIFICATION_METADATA_KEY] = True
|
|
||||||
return metadata
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_text_content(content: Any) -> str:
|
def _extract_text_content(content: Any) -> str:
|
||||||
"""Extract text from a streaming payload content field."""
|
"""Extract text from a streaming payload content field."""
|
||||||
if isinstance(content, str):
|
if isinstance(content, str):
|
||||||
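The helpers `_current_turn_messages` and `_has_current_turn_clarification` in the left-hand column of the preceding hunk decide whether the latest turn ended in an `ask_clarification` interrupt. A minimal sketch of that logic on plain dictionaries follows; the function names without the leading underscore and the sample message history are illustrative assumptions, not data from the project.

```python
from typing import Any


def current_turn_messages(messages: list[Any]) -> list[dict[str, Any]]:
    """Collect the messages that follow the last human message, in order."""
    turn: list[dict[str, Any]] = []
    for msg in reversed(messages):
        if not isinstance(msg, dict):
            continue
        if msg.get("type") == "human":
            break
        turn.append(msg)
    turn.reverse()
    return turn


def has_current_turn_clarification(messages: list[Any]) -> bool:
    """True only when the current turn ends in an ask_clarification tool message."""
    for msg in reversed(current_turn_messages(messages)):
        msg_type = msg.get("type")
        if msg_type == "tool":
            return msg.get("name") == "ask_clarification"
        if msg_type == "ai" and (msg.get("content") or msg.get("tool_calls")):
            return False
    return False


if __name__ == "__main__":
    history = [
        {"type": "human", "content": "Book a flight"},
        {"type": "ai", "content": "", "tool_calls": [{"name": "ask_clarification"}]},
        {"type": "tool", "name": "ask_clarification", "content": "Which date?"},
    ]
    print(has_current_turn_clarification(history))
```

Scanning backwards and stopping at the last human message keeps a clarification from an earlier turn from being mistaken for a pending one.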
@@ -387,8 +314,6 @@ def _extract_artifacts(result: dict | list) -> list[str]:
|
|||||||
continue
|
continue
|
||||||
# Stop at the last human message — anything before it is a previous turn
|
# Stop at the last human message — anything before it is a previous turn
|
||||||
if msg.get("type") == "human":
|
if msg.get("type") == "human":
|
||||||
if _is_hidden_human_control_message(msg):
|
|
||||||
continue
|
|
||||||
break
|
break
|
||||||
# Look for AI messages with present_files tool calls
|
# Look for AI messages with present_files tool calls
|
||||||
if msg.get("type") == "ai":
|
if msg.get("type") == "ai":
|
||||||
@@ -401,18 +326,6 @@ def _extract_artifacts(result: dict | list) -> list[str]:
|
|||||||
return artifacts
|
return artifacts
|
||||||
|
|
||||||
|
|
||||||
def _is_hidden_human_control_message(msg: Mapping[str, Any]) -> bool:
|
|
||||||
"""Return whether a human message is an internal control message hidden from UI."""
|
|
||||||
if msg.get("type") != "human":
|
|
||||||
return False
|
|
||||||
|
|
||||||
additional_kwargs = msg.get("additional_kwargs")
|
|
||||||
if not isinstance(additional_kwargs, Mapping):
|
|
||||||
return False
|
|
||||||
|
|
||||||
return additional_kwargs.get("hide_from_ui") is True
|
|
||||||
|
|
||||||
|
|
||||||
def _format_artifact_text(artifacts: list[str]) -> str:
|
def _format_artifact_text(artifacts: list[str]) -> str:
|
||||||
"""Format artifact paths into a human-readable text block listing filenames."""
|
"""Format artifact paths into a human-readable text block listing filenames."""
|
||||||
import posixpath
|
import posixpath
|
||||||
@@ -426,46 +339,6 @@ def _format_artifact_text(artifacts: list[str]) -> str:
|
|||||||
_OUTPUTS_VIRTUAL_PREFIX = "/mnt/user-data/outputs/"
|
_OUTPUTS_VIRTUAL_PREFIX = "/mnt/user-data/outputs/"
|
||||||
|
|
||||||
|
|
||||||
def _unknown_command_reply(command: str | None = None) -> str:
|
|
||||||
available = " | ".join(sorted(KNOWN_CHANNEL_COMMANDS))
|
|
||||||
if command:
|
|
||||||
return f"Unknown command: /{command}. Available commands: {available}"
|
|
||||||
return f"Unknown command. Available commands: {available}"
|
|
||||||
|
|
||||||
|
|
||||||
def _human_input_message(content: str, *, original_content: str | None = None) -> dict[str, Any]:
|
|
||||||
message: dict[str, Any] = {"role": "human", "content": content}
|
|
||||||
if original_content is not None and original_content != content:
|
|
||||||
message["additional_kwargs"] = {ORIGINAL_USER_CONTENT_KEY: original_content}
|
|
||||||
return message
|
|
||||||
|
|
||||||
|
|
||||||
def _resolve_slash_skill_command(
|
|
||||||
text: str,
|
|
||||||
available_skills: set[str] | None = None,
|
|
||||||
storage: SkillStorage | Callable[[], SkillStorage] | None = None,
|
|
||||||
) -> _SlashSkillCommandResolution | None:
|
|
||||||
reference = parse_slash_skill_reference(text)
|
|
||||||
if reference is None:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
resolved_storage = storage() if callable(storage) else storage or get_or_new_skill_storage()
|
|
||||||
skills = resolved_storage.load_skills(enabled_only=False)
|
|
||||||
|
|
||||||
skill = next((candidate for candidate in skills if candidate.name == reference.name), None)
|
|
||||||
if skill is None:
|
|
||||||
return None
|
|
||||||
if not skill.enabled:
|
|
||||||
return _SlashSkillCommandResolution(failure_message=f"Skill `/{reference.name}` is installed but disabled. Enable it before using slash activation.")
|
|
||||||
if available_skills is not None and reference.name not in available_skills:
|
|
||||||
return _SlashSkillCommandResolution(failure_message=f"Skill `/{reference.name}` is not available for this agent.")
|
|
||||||
|
|
||||||
return _SlashSkillCommandResolution(route_to_chat=True)
|
|
||||||
except Exception as exc:
|
|
||||||
logger.exception("[Manager] failed to resolve slash skill command")
|
|
||||||
raise SlashSkillCommandResolutionError("Failed to resolve slash skill command. Please check the skill configuration.") from exc
|
|
||||||
|
|
||||||
|
|
||||||
def _resolve_attachments(thread_id: str, artifacts: list[str]) -> list[ResolvedAttachment]:
|
def _resolve_attachments(thread_id: str, artifacts: list[str]) -> list[ResolvedAttachment]:
|
||||||
"""Resolve virtual artifact paths to host filesystem paths with metadata.
|
"""Resolve virtual artifact paths to host filesystem paths with metadata.
|
||||||
|
|
||||||
@@ -547,13 +420,7 @@ async def _ingest_inbound_files(thread_id: str, msg: InboundMessage) -> list[dic
|
|||||||
if not msg.files:
|
if not msg.files:
|
||||||
return []
|
return []
|
||||||
|
|
||||||
from deerflow.uploads.manager import (
|
from deerflow.uploads.manager import claim_unique_filename, ensure_uploads_dir, normalize_filename
|
||||||
UnsafeUploadPathError,
|
|
||||||
claim_unique_filename,
|
|
||||||
ensure_uploads_dir,
|
|
||||||
normalize_filename,
|
|
||||||
write_upload_file_no_symlink,
|
|
||||||
)
|
|
||||||
|
|
||||||
uploads_dir = ensure_uploads_dir(thread_id)
|
uploads_dir = ensure_uploads_dir(thread_id)
|
||||||
seen_names = {entry.name for entry in uploads_dir.iterdir() if entry.is_file()}
|
seen_names = {entry.name for entry in uploads_dir.iterdir() if entry.is_file()}
|
||||||
@@ -604,10 +471,7 @@ async def _ingest_inbound_files(thread_id: str, msg: InboundMessage) -> list[dic
|
|||||||
|
|
||||||
dest = uploads_dir / safe_name
|
dest = uploads_dir / safe_name
|
||||||
try:
|
try:
|
||||||
dest = write_upload_file_no_symlink(uploads_dir, safe_name, data)
|
dest.write_bytes(data)
|
||||||
except UnsafeUploadPathError:
|
|
||||||
logger.warning("[Manager] skipping inbound file with unsafe destination: %s", safe_name)
|
|
||||||
continue
|
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception("[Manager] failed to write inbound file: %s", dest)
|
logger.exception("[Manager] failed to write inbound file: %s", dest)
|
||||||
continue
|
continue
|
||||||
@@ -680,7 +544,6 @@ class ChannelManager:
|
|||||||
self._default_session = _as_dict(default_session)
|
self._default_session = _as_dict(default_session)
|
||||||
self._channel_sessions = dict(channel_sessions or {})
|
self._channel_sessions = dict(channel_sessions or {})
|
||||||
self._client = None # lazy init — langgraph_sdk async client
|
self._client = None # lazy init — langgraph_sdk async client
|
||||||
self._skill_storage: SkillStorage | None = None
|
|
||||||
self._csrf_token = generate_csrf_token()
|
self._csrf_token = generate_csrf_token()
|
||||||
self._semaphore: asyncio.Semaphore | None = None
|
self._semaphore: asyncio.Semaphore | None = None
|
||||||
self._running = False
|
self._running = False
|
||||||
@@ -717,31 +580,12 @@ class ChannelManager:
             user_layer.get("config"),
         )
 
-        configurable = run_config.get("configurable")
-        if isinstance(configurable, Mapping):
-            configurable = dict(configurable)
-        else:
-            configurable = {}
-        run_config["configurable"] = configurable
-        # Pin channel-triggered runs to the root graph namespace so follow-up
-        # turns continue from the same conversation checkpoint.
-        configurable["checkpoint_ns"] = ""
-        configurable["thread_id"] = thread_id
-
-        # ``user_id`` drives user-scoped filesystem buckets that only accept
-        # ``[A-Za-z0-9_-]``, so normalize the channel id and keep the raw value
-        # under ``channel_user_id`` for platform-facing lookups.
-        run_context_identity: dict[str, Any] = {"thread_id": thread_id}
-        if msg.user_id:
-            run_context_identity["user_id"] = make_safe_user_id(msg.user_id)
-            run_context_identity["channel_user_id"] = msg.user_id
-
         run_context = _merge_dicts(
             DEFAULT_RUN_CONTEXT,
             self._default_session.get("context"),
             channel_layer.get("context"),
             user_layer.get("context"),
-            run_context_identity,
+            {"thread_id": thread_id},
         )
 
         # Custom agents are implemented as lead_agent + agent_name context.
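The comment removed in the hunk above states that `user_id` feeds filesystem buckets accepting only `[A-Za-z0-9_-]`, while the raw channel id is kept under `channel_user_id`. The following is a minimal sketch of that idea under stated assumptions: `sanitize_user_id` is a hypothetical stand-in written for illustration, not the repository's `make_safe_user_id`, whose real behavior is not visible in this diff.

```python
import re

# Any character outside the allowed bucket alphabet [A-Za-z0-9_-].
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_user_id(raw: str) -> str:
    """Hypothetical normalizer: replace each disallowed character with '_'."""
    return _UNSAFE_CHARS.sub("_", raw)


def build_identity(thread_id: str, raw_user_id: str) -> dict:
    """Keep a filesystem-safe id and the raw platform id side by side."""
    identity = {"thread_id": thread_id}
    if raw_user_id:
        identity["user_id"] = sanitize_user_id(raw_user_id)
        identity["channel_user_id"] = raw_user_id
    return identity


if __name__ == "__main__":
    print(build_identity("t-1", "user@example.com"))
```

A simple character replacement like this can map two different raw ids to the same safe id, so a real implementation may need an extra disambiguation step.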
@@ -753,21 +597,6 @@ class ChannelManager:
 
         return assistant_id, run_config, run_context
 
-    def _resolve_available_skill_names(self, msg: InboundMessage) -> set[str] | None:
-        thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id) or ""
-        _, _, run_context = self._resolve_run_params(msg, thread_id)
-        if run_context.get("is_bootstrap"):
-            return {"bootstrap"}
-
-        agent_name = run_context.get("agent_name")
-        if not isinstance(agent_name, str) or not agent_name.strip():
-            return None
-
-        agent_config = load_agent_config(_normalize_custom_agent_name(agent_name))
-        if agent_config and agent_config.skills is not None:
-            return set(agent_config.skills)
-        return None
-
     # -- LangGraph SDK client (lazy) ----------------------------------------
 
     def _get_client(self):
@@ -785,11 +614,6 @@ class ChannelManager:
             )
         return self._client
 
-    def _get_skill_storage(self) -> SkillStorage:
-        if self._skill_storage is None:
-            self._skill_storage = get_or_new_skill_storage()
-        return self._skill_storage
-
     # -- lifecycle ---------------------------------------------------------
 
     async def start(self) -> None:
@@ -859,14 +683,6 @@ class ChannelManager:
                 exc,
             )
             await self._send_error(msg, str(exc))
-        except SlashSkillCommandResolutionError as exc:
-            logger.warning(
-                "Slash skill command resolution failed for %s (chat=%s): %s",
-                msg.channel_name,
-                msg.chat_id,
-                exc,
-            )
-            await self._send_error(msg, str(exc))
         except Exception:
             logger.exception(
                 "Error handling message from %s (chat=%s)",
@@ -921,11 +737,9 @@ class ChannelManager:
         if extra_context:
             run_context.update(extra_context)
 
-        original_text = msg.text
         uploaded = await _ingest_inbound_files(thread_id, msg)
         if uploaded:
             msg.text = f"{_format_uploaded_files_block(uploaded)}\n\n{msg.text}".strip()
-        human_message = _human_input_message(msg.text, original_content=original_text)
 
         if self._channel_supports_streaming(msg.channel_name):
             await self._handle_streaming_chat(
@@ -935,30 +749,19 @@ class ChannelManager:
                 assistant_id,
                 run_config,
                 run_context,
-                human_message,
             )
             return
 
         logger.info("[Manager] invoking runs.wait(thread_id=%s, text=%r)", thread_id, msg.text[:100])
-        try:
-            result = await client.runs.wait(
-                thread_id,
-                assistant_id,
-                input={"messages": [human_message]},
-                config=run_config,
-                context=run_context,
-                multitask_strategy="reject",
-            )
-        except Exception as exc:
-            if _is_thread_busy_error(exc):
-                logger.warning("[Manager] thread busy (concurrent run rejected): thread_id=%s", thread_id)
-                await self._send_error(msg, THREAD_BUSY_MESSAGE)
-                return
-            else:
-                raise
+        result = await client.runs.wait(
+            thread_id,
+            assistant_id,
+            input={"messages": [{"role": "human", "content": msg.text}]},
+            config=run_config,
+            context=run_context,
+        )
 
         response_text = _extract_response_text(result)
-        pending_clarification = _has_current_turn_clarification(result)
         artifacts = _extract_artifacts(result)
 
         logger.info(
@@ -984,7 +787,7 @@ class ChannelManager:
             artifacts=artifacts,
             attachments=attachments,
             thread_ts=msg.thread_ts,
-            metadata=_response_metadata(msg.metadata, pending_clarification=pending_clarification),
+            metadata=_slim_metadata(msg.metadata),
         )
         logger.info("[Manager] publishing outbound message to bus: channel=%s, chat_id=%s", msg.channel_name, msg.chat_id)
         await self.bus.publish_outbound(outbound)
@@ -997,7 +800,6 @@ class ChannelManager:
         assistant_id: str,
         run_config: dict[str, Any],
         run_context: dict[str, Any],
-        human_message: dict[str, Any],
     ) -> None:
         logger.info("[Manager] invoking runs.stream(thread_id=%s, text=%r)", thread_id, msg.text[:100])
 
@@ -1013,7 +815,7 @@ class ChannelManager:
             async for chunk in client.runs.stream(
                 thread_id,
                 assistant_id,
-                input={"messages": [human_message]},
+                input={"messages": [{"role": "human", "content": msg.text}]},
                 config=run_config,
                 context=run_context,
                 stream_mode=["messages-tuple", "values"],
@@ -1047,7 +849,7 @@ class ChannelManager:
                         text=latest_text,
                         is_final=False,
                         thread_ts=msg.thread_ts,
-                        metadata=_response_metadata(msg.metadata),
+                        metadata=_slim_metadata(msg.metadata),
                     )
                 )
                 last_published_text = latest_text
@@ -1061,7 +863,6 @@ class ChannelManager:
         finally:
             result = last_values if last_values is not None else {"messages": [{"type": "ai", "content": latest_text}]}
             response_text = _extract_response_text(result)
-            pending_clarification = _has_current_turn_clarification(result)
             artifacts = _extract_artifacts(result)
             response_text, attachments = _prepare_artifact_delivery(thread_id, response_text, artifacts)
 
@@ -1093,27 +894,18 @@ class ChannelManager:
                     attachments=attachments,
                     is_final=True,
                     thread_ts=msg.thread_ts,
-                    metadata=_response_metadata(msg.metadata, pending_clarification=pending_clarification),
+                    metadata=_slim_metadata(msg.metadata),
                 )
             )
 
     # -- command handling --------------------------------------------------
 
     async def _handle_command(self, msg: InboundMessage) -> None:
-        raw_text = msg.text
-        text = raw_text.strip()
+        text = msg.text.strip()
         parts = text.split(maxsplit=1)
-        reply: str | None = None
-        if not parts:
-            command = None
-            reply = _unknown_command_reply()
-        else:
-            command = parts[0].lower().removeprefix("/")
+        command = parts[0].lower().lstrip("/")
 
-        if reply is None and not raw_text.startswith("/"):
-            reply = _unknown_command_reply(command)
-
-        if reply is None and command == "bootstrap":
+        if command == "bootstrap":
             from dataclasses import replace as _dc_replace
 
             chat_text = parts[1] if len(parts) > 1 else "Initialize workspace"
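The two sides of the hunk above parse the command token differently: one uses `str.lstrip("/")` with no guard for empty input, the other checks `parts` first and uses `str.removeprefix("/")`. The sketch below isolates just that difference. The function names `parse_unguarded` and `parse_guarded` are illustrative assumptions and do not appear in the repository; `removeprefix` needs Python 3.9 or later.

```python
from typing import Optional


def parse_unguarded(text: str) -> str:
    """Index parts[0] directly and strip every leading slash."""
    parts = text.strip().split(maxsplit=1)
    # Raises IndexError when text is empty or whitespace-only.
    return parts[0].lower().lstrip("/")


def parse_guarded(text: str) -> Optional[str]:
    """Return None for empty input and strip at most one leading slash."""
    parts = text.strip().split(maxsplit=1)
    if not parts:
        return None
    return parts[0].lower().removeprefix("/")


if __name__ == "__main__":
    for sample in ["/help", "//Help now", "status"]:
        print(repr(sample), parse_unguarded(sample), parse_guarded(sample))
    print(repr("   "), parse_guarded("   "))
```

`lstrip("/")` removes any run of leading slashes, so `//help` and `/help` become the same command, whereas `removeprefix("/")` removes only one.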
@@ -1121,7 +913,7 @@ class ChannelManager:
             await self._handle_chat(chat_msg, extra_context={"is_bootstrap": True})
             return
 
-        if reply is None and command == "new":
+        if command == "new":
             # Create a new thread through Gateway
             client = self._get_client()
             thread = await client.threads.create()
@@ -1134,14 +926,14 @@ class ChannelManager:
                 user_id=msg.user_id,
             )
             reply = "New conversation started."
-        elif reply is None and command == "status":
+        elif command == "status":
             thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
             reply = f"Active thread: {thread_id}" if thread_id else "No active conversation."
-        elif reply is None and command == "models":
+        elif command == "models":
             reply = await self._fetch_gateway("/api/models", "models")
-        elif reply is None and command == "memory":
+        elif command == "memory":
             reply = await self._fetch_gateway("/api/memory", "memory")
-        elif reply is None and command == "help":
+        elif command == "help":
             reply = (
                 "Available commands:\n"
                 "/bootstrap — Start a bootstrap session (enables agent setup)\n"
@@ -1149,32 +941,16 @@ class ChannelManager:
                 "/status — Show current thread info\n"
                 "/models — List available models\n"
                 "/memory — Show memory status\n"
-                "/<skill-name> <task> — Activate an enabled skill for one turn\n"
                 "/help — Show this help"
             )
-        elif reply is None:
-            slash_resolution = await asyncio.to_thread(
-                lambda: _resolve_slash_skill_command(
-                    raw_text,
-                    self._resolve_available_skill_names(msg),
-                    self._get_skill_storage,
-                )
-            )
-            if slash_resolution and slash_resolution.failure_message:
-                reply = slash_resolution.failure_message
-            elif slash_resolution and slash_resolution.route_to_chat:
-                from dataclasses import replace as _dc_replace
-
-                chat_msg = _dc_replace(msg, msg_type=InboundMessageType.CHAT)
-                await self._handle_chat(chat_msg)
-                return
-            else:
-                reply = _unknown_command_reply(command)
+        else:
+            available = " | ".join(sorted(KNOWN_CHANNEL_COMMANDS))
+            reply = f"Unknown command: /{command}. Available commands: {available}"
 
         outbound = OutboundMessage(
             channel_name=msg.channel_name,
             chat_id=msg.chat_id,
-            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id) or "",
+            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id) or "",
             text=reply,
             thread_ts=msg.thread_ts,
             metadata=_slim_metadata(msg.metadata),
@@ -1187,11 +963,7 @@ class ChannelManager:
 
         try:
             async with httpx.AsyncClient() as http:
-                resp = await http.get(
-                    f"{self._gateway_url}{path}",
-                    timeout=10,
-                    headers=create_internal_auth_headers(),
-                )
+                resp = await http.get(f"{self._gateway_url}{path}", timeout=10)
                 resp.raise_for_status()
                 data = resp.json()
         except Exception:
@@ -1212,7 +984,7 @@ class ChannelManager:
         outbound = OutboundMessage(
             channel_name=msg.channel_name,
             chat_id=msg.chat_id,
-            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id) or "",
+            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id) or "",
             text=error_text,
             thread_ts=msg.thread_ts,
             metadata=_slim_metadata(msg.metadata),
@@ -13,9 +13,6 @@ from typing import Any
 
 logger = logging.getLogger(__name__)
 
-PENDING_CLARIFICATION_METADATA_KEY = "pending_clarification"
-RESOLVED_FROM_PENDING_CLARIFICATION_METADATA_KEY = "resolved_from_pending_clarification"
-
 
 # ---------------------------------------------------------------------------
 # Message types
@@ -167,8 +167,6 @@ class ChannelService:
             return False
 
         try:
-            config = dict(config)
-            config["channel_store"] = self.store
             channel = channel_cls(bus=self.bus, config=config)
             self._channels[name] = channel
             await channel.start()
@@ -9,7 +9,6 @@ from typing import Any
 from markdown_to_mrkdwn import SlackMarkdownConverter
 
 from app.channels.base import Channel
-from app.channels.commands import is_known_channel_command
 from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
 
 logger = logging.getLogger(__name__)
@@ -33,20 +32,6 @@ def _normalize_allowed_users(allowed_users: Any) -> set[str]:
     return {str(user_id) for user_id in values if str(user_id)}
 
 
-def _strip_leading_slack_bot_mention(text: str, bot_user_id: str | None) -> str:
-    if not bot_user_id:
-        return text
-    if not text.startswith("<@"):
-        return text
-    end = text.find(">")
-    if end <= 2:
-        return text
-    mentioned_user_id = text[2:end].split("|", 1)[0].lstrip("!")
-    if mentioned_user_id != bot_user_id:
-        return text
-    return text[end + 1 :].lstrip()
-
-
 class SlackChannel(Channel):
     """Slack IM channel using Socket Mode (WebSocket, no public IP).
 
@@ -64,8 +49,6 @@ class SlackChannel(Channel):
         self._web_client = None
         self._loop: asyncio.AbstractEventLoop | None = None
         self._allowed_users = _normalize_allowed_users(config.get("allowed_users", []))
-        configured_bot_user_id = config.get("bot_user_id")
-        self._bot_user_id = str(configured_bot_user_id).lstrip("@") if configured_bot_user_id else None
 
     async def start(self) -> None:
         if self._running:
@@ -89,17 +72,6 @@ class SlackChannel(Channel):
             return
 
         self._web_client = WebClient(token=bot_token)
-        if self._bot_user_id is None:
-            try:
-                auth_info = await asyncio.to_thread(self._web_client.auth_test)
-                user_id = auth_info.get("user_id") if isinstance(auth_info, dict) else None
-                if user_id is None:
-                    auth_get = getattr(auth_info, "get", None)
-                    user_id = auth_get("user_id") if callable(auth_get) else None
-                if isinstance(user_id, str) and user_id:
-                    self._bot_user_id = user_id
-            except Exception:
-                logger.warning("[Slack] failed to resolve bot user id; app mention text may include the bot mention", exc_info=True)
         self._socket_client = SocketModeClient(
             app_token=app_token,
             web_client=self._web_client,
@@ -238,12 +210,6 @@ class SlackChannel(Channel):
         if event_type != "events_api":
             return
 
-        if self._bot_user_id is None:
-            authorization = next((item for item in req.payload.get("authorizations", []) if isinstance(item, dict)), None)
-            user_id = authorization.get("user_id") if authorization else None
-            if isinstance(user_id, str) and user_id:
-                self._bot_user_id = user_id
-
         event = req.payload.get("event", {})
         etype = event.get("type", "")
 
@@ -267,15 +233,13 @@ class SlackChannel(Channel):
             return
 
         text = event.get("text", "").strip()
-        if event.get("type") == "app_mention":
-            text = _strip_leading_slack_bot_mention(text, self._bot_user_id)
         if not text:
             return
 
         channel_id = event.get("channel", "")
         thread_ts = event.get("thread_ts") or event.get("ts", "")
 
-        if is_known_channel_command(text):
+        if text.startswith("/"):
             msg_type = InboundMessageType.COMMAND
         else:
             msg_type = InboundMessageType.CHAT
@@ -60,17 +60,12 @@ class TelegramChannel(Channel):
 
         # Command handlers
         app.add_handler(CommandHandler("start", self._cmd_start))
-        app.add_handler(CommandHandler("bootstrap", self._cmd_generic))
         app.add_handler(CommandHandler("new", self._cmd_generic))
         app.add_handler(CommandHandler("status", self._cmd_generic))
         app.add_handler(CommandHandler("models", self._cmd_generic))
         app.add_handler(CommandHandler("memory", self._cmd_generic))
         app.add_handler(CommandHandler("help", self._cmd_generic))
 
-        # Slash skill commands are dynamic and cannot all be pre-registered
-        # with Telegram, so route unknown slash commands through chat handling.
-        app.add_handler(MessageHandler(filters.TEXT & filters.COMMAND, self._on_text))
-
         # General message handler
         app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, self._on_text))
 
@@ -233,33 +228,6 @@ class TelegramChannel(Channel):
             return True
         return user_id in self._allowed_users
 
-    def _get_bot_username(self, context) -> str | None:
-        bot = getattr(context, "bot", None)
-        username = getattr(bot, "username", None)
-        if not username and self._application is not None:
-            username = getattr(getattr(self._application, "bot", None), "username", None)
-        return str(username) if username else None
-
-    @staticmethod
-    def _strip_bot_username_from_leading_command(text: str, bot_username: str | None) -> str:
-        username = (bot_username or "").lstrip("@").lower()
-        if not username or not text.startswith("/"):
-            return text
-
-        parts = text.split(maxsplit=1)
-        command_token = parts[0]
-        if "@" not in command_token:
-            return text
-
-        command_name, addressed_username = command_token[1:].rsplit("@", 1)
-        if not command_name or addressed_username.lower() != username:
-            return text
-
-        normalized = f"/{command_name}"
-        if len(parts) > 1:
-            normalized = f"{normalized} {parts[1]}"
-        return normalized
-
     async def _cmd_start(self, update, context) -> None:
         """Handle /start command."""
         if not self._check_user(update.effective_user.id):
@@ -275,7 +243,7 @@ class TelegramChannel(Channel):
         if not self._check_user(update.effective_user.id):
             return
 
-        text = self._strip_bot_username_from_leading_command(update.message.text.strip(), self._get_bot_username(context))
+        text = update.message.text
         chat_id = str(update.effective_chat.id)
         user_id = str(update.effective_user.id)
         msg_id = str(update.message.message_id)
@@ -311,7 +279,7 @@ class TelegramChannel(Channel):
         if not self._check_user(update.effective_user.id):
             return
 
-        text = self._strip_bot_username_from_leading_command(update.message.text.strip(), self._get_bot_username(context))
+        text = update.message.text.strip()
         if not text:
             return
 
@@ -22,7 +22,6 @@ from cryptography.hazmat.primitives import padding
 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
 
 from app.channels.base import Channel
-from app.channels.commands import is_known_channel_command
 from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
 
 logger = logging.getLogger(__name__)
@@ -621,7 +620,7 @@ class WechatChannel(Channel):
             chat_id=chat_id,
             user_id=chat_id,
             text=text,
-            msg_type=InboundMessageType.COMMAND if is_known_channel_command(text) else InboundMessageType.CHAT,
+            msg_type=InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT,
             thread_ts=thread_ts,
             files=files,
             metadata={
@@ -8,7 +8,6 @@ from collections.abc import Awaitable, Callable
 from typing import Any, cast
 
 from app.channels.base import Channel
-from app.channels.commands import is_known_channel_command
 from app.channels.message_bus import (
     InboundMessageType,
     MessageBus,
@@ -271,7 +270,7 @@ class WeComChannel(Channel):
 
         user_id = (body.get("from") or {}).get("userid")
 
-        inbound_type = InboundMessageType.COMMAND if is_known_channel_command(text) else InboundMessageType.CHAT
+        inbound_type = InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT
         inbound = self._make_inbound(
             chat_id=user_id, # keep user's conversation in memory
             user_id=user_id,
+33
-54
@@ -1,5 +1,6 @@
 import asyncio
 import logging
+import os
 from collections.abc import AsyncGenerator
 from contextlib import asynccontextmanager
 
@@ -8,7 +9,7 @@ from fastapi.middleware.cors import CORSMiddleware
 
 from app.gateway.auth_middleware import AuthMiddleware
 from app.gateway.config import get_gateway_config
-from app.gateway.csrf_middleware import CSRFMiddleware, get_configured_cors_origins
+from app.gateway.csrf_middleware import CSRFMiddleware
 from app.gateway.deps import langgraph_runtime
 from app.gateway.routers import (
     agents,
@@ -62,7 +63,7 @@ async def _ensure_admin_user(app: FastAPI) -> None:
 
     Subsequent boots (admin already exists):
     - Runs the one-time "no-auth → with-auth" orphan thread migration for
-      existing LangGraph thread metadata that has no user_id.
+      existing LangGraph thread metadata that has no owner_id.
 
     No SQL persistence migration is needed: the four user_id columns
     (threads_meta, runs, run_events, feedback) only come into existence
@@ -161,16 +162,10 @@ async def _migrate_orphaned_threads(store, admin_user_id: str) -> int:
 async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
     """Application lifespan handler."""
 
-    # Load config and check necessary environment variables at startup.
-    # `startup_config` is a local snapshot used only for one-shot bootstrap
-    # work (logging level, langgraph_runtime engines, channels). Request-time
-    # config resolution always routes through `get_app_config()` in
-    # `app/gateway/deps.py::get_config()` so `config.yaml` edits become
-    # visible without a process restart. We deliberately do NOT cache this
-    # snapshot on `app.state` to keep that contract enforceable.
+    # Load config and check necessary environment variables at startup
     try:
-        startup_config = get_app_config()
-        apply_logging_level(startup_config.log_level)
+        app.state.config = get_app_config()
+        apply_logging_level(app.state.config.log_level)
         logger.info("Configuration loaded successfully")
     except Exception as e:
         error_msg = f"Failed to load configuration during gateway startup: {e}"
@@ -179,30 +174,11 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
     config = get_gateway_config()
     logger.info(f"Starting API Gateway on {config.host}:{config.port}")
 
-    # Pre-warm tiktoken encoding cache so the first memory-injection request
-    # never blocks on the BPE data download (which hits an OpenAI/Azure URL
-    # that may be unreachable in restricted networks — see issue #3402).
-    try:
-        from deerflow.agents.memory.prompt import warm_tiktoken_cache
-
-        warmed = await asyncio.wait_for(
-            asyncio.to_thread(warm_tiktoken_cache),
-            timeout=5,
-        )
-        if warmed:
-            logger.info("tiktoken encoding cache warmed successfully")
-        else:
-            logger.warning("tiktoken encoding cache warm-up failed; token counting will use character-based fallback")
-    except TimeoutError:
-        logger.warning("tiktoken encoding cache warm-up timed out; token counting will use character-based fallback")
-    except Exception:
-        logger.warning("tiktoken warm-up skipped", exc_info=True)
-
     # Initialize LangGraph runtime components (StreamBridge, RunManager, checkpointer, store)
-    async with langgraph_runtime(app, startup_config):
+    async with langgraph_runtime(app):
         logger.info("LangGraph runtime initialised")
 
-        # Check admin bootstrap state and migrate orphan threads after admin exists.
+        # Ensure admin user exists (auto-create on first boot)
         # Must run AFTER langgraph_runtime so app.state.store is available for thread migration
         await _ensure_admin_user(app)
 
@@ -210,7 +186,7 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
|||||||
try:
|
try:
|
||||||
from app.channels.service import start_channel_service
|
from app.channels.service import start_channel_service
|
||||||
|
|
||||||
channel_service = await start_channel_service(startup_config)
|
channel_service = await start_channel_service(app.state.config)
|
||||||
logger.info("Channel service started: %s", channel_service.get_status())
|
logger.info("Channel service started: %s", channel_service.get_status())
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception("No IM channels configured or channel service failed to start")
|
logger.exception("No IM channels configured or channel service failed to start")
|
||||||
@@ -243,9 +219,7 @@ def create_app() -> FastAPI:
|
|||||||
Configured FastAPI application instance.
|
Configured FastAPI application instance.
|
||||||
"""
|
"""
|
||||||
config = get_gateway_config()
|
config = get_gateway_config()
|
||||||
docs_url = "/docs" if config.enable_docs else None
|
docs_kwargs = {"docs_url": "/docs", "redoc_url": "/redoc", "openapi_url": "/openapi.json"} if config.enable_docs else {"docs_url": None, "redoc_url": None, "openapi_url": None}
|
||||||
redoc_url = "/redoc" if config.enable_docs else None
|
|
||||||
openapi_url = "/openapi.json" if config.enable_docs else None
|
|
||||||
|
|
||||||
app = FastAPI(
|
app = FastAPI(
|
||||||
title="DeerFlow API Gateway",
|
title="DeerFlow API Gateway",
|
||||||
@@ -265,14 +239,12 @@ API Gateway for DeerFlow - A LangGraph-based AI agent backend with sandbox execu
|
|||||||
|
|
||||||
### Architecture
|
### Architecture
|
||||||
|
|
||||||
LangGraph-compatible requests are routed through nginx to this gateway.
|
LangGraph requests are handled by nginx reverse proxy.
|
||||||
This gateway provides runtime endpoints for agent runs plus custom endpoints for models, MCP configuration, skills, and artifacts.
|
This gateway provides custom endpoints for models, MCP configuration, skills, and artifacts.
|
||||||
""",
|
""",
|
||||||
version="0.1.0",
|
version="0.1.0",
|
||||||
lifespan=lifespan,
|
lifespan=lifespan,
|
||||||
docs_url=docs_url,
|
**docs_kwargs,
|
||||||
redoc_url=redoc_url,
|
|
||||||
openapi_url=openapi_url,
|
|
||||||
openapi_tags=[
|
openapi_tags=[
|
||||||
{
|
{
|
||||||
"name": "models",
|
"name": "models",
|
||||||
@@ -335,18 +307,25 @@ This gateway provides runtime endpoints for agent runs plus custom endpoints for
|
|||||||
# CSRF: Double Submit Cookie pattern for state-changing requests
|
# CSRF: Double Submit Cookie pattern for state-changing requests
|
||||||
app.add_middleware(CSRFMiddleware)
|
app.add_middleware(CSRFMiddleware)
|
||||||
|
|
||||||
# CORS: the unified nginx endpoint is same-origin by default. Split-origin
|
# CORS: when GATEWAY_CORS_ORIGINS is set (dev without nginx), add CORS middleware.
|
||||||
# browser clients must opt in with this explicit Gateway allowlist so CORS
|
# In production, nginx handles CORS and no middleware is needed.
|
||||||
# and CSRF origin checks share the same source of truth.
|
cors_origins_env = os.environ.get("GATEWAY_CORS_ORIGINS", "")
|
||||||
cors_origins = sorted(get_configured_cors_origins())
|
if cors_origins_env:
|
||||||
if cors_origins:
|
cors_origins = [o.strip() for o in cors_origins_env.split(",") if o.strip()]
|
||||||
app.add_middleware(
|
# Validate: wildcard origin with credentials is a security misconfiguration
|
||||||
CORSMiddleware,
|
for origin in cors_origins:
|
||||||
allow_origins=cors_origins,
|
if origin == "*":
|
||||||
allow_credentials=True,
|
logger.error("GATEWAY_CORS_ORIGINS contains wildcard '*' with allow_credentials=True. This is a security misconfiguration — browsers will reject the response. Use explicit scheme://host:port origins instead.")
|
||||||
allow_methods=["*"],
|
cors_origins = [o for o in cors_origins if o != "*"]
|
||||||
allow_headers=["*"],
|
break
|
||||||
)
|
if cors_origins:
|
||||||
|
app.add_middleware(
|
||||||
|
CORSMiddleware,
|
||||||
|
allow_origins=cors_origins,
|
||||||
|
allow_credentials=True,
|
||||||
|
allow_methods=["*"],
|
||||||
|
allow_headers=["*"],
|
||||||
|
)
|
||||||
|
|
||||||
# Include routers
|
# Include routers
|
||||||
# Models API is mounted at /api/models
|
# Models API is mounted at /api/models
|
||||||
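The right-hand side of the CORS hunk above parses `GATEWAY_CORS_ORIGINS` inline and drops a wildcard entry before registering `CORSMiddleware`. A minimal sketch of that parsing step as a standalone function follows; the helper name `parse_cors_origins` is hypothetical and does not appear in the diff, which performs these steps inside `create_app()`.

```python
import logging

logger = logging.getLogger(__name__)


def parse_cors_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list and drop the wildcard entry.

    Hypothetical helper: mirrors the inline logic shown in the diff.
    """
    # Trim whitespace and discard empty entries such as "a,,b" or trailing commas.
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    # A wildcard cannot be combined with allow_credentials=True, so remove it.
    if "*" in origins:
        logger.error("Wildcard origin is not allowed together with credentials; ignoring '*'.")
        origins = [o for o in origins if o != "*"]
    return origins
```

With this shape, the caller only registers the middleware when the returned list is non-empty, which matches the `if cors_origins:` guard in the hunk.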
@@ -395,7 +374,7 @@ This gateway provides runtime endpoints for agent runs plus custom endpoints for
|
|||||||
app.include_router(runs.router)
|
app.include_router(runs.router)
|
||||||
|
|
||||||
@app.get("/health", tags=["health"])
|
@app.get("/health", tags=["health"])
|
||||||
async def health_check() -> dict[str, str]:
|
async def health_check() -> dict:
|
||||||
"""Health check endpoint.
|
"""Health check endpoint.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
|
|||||||
@@ -8,8 +8,6 @@ from pydantic import BaseModel, Field
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_SECRET_FILE = ".jwt_secret"
|
|
||||||
|
|
||||||
|
|
||||||
class AuthConfig(BaseModel):
|
class AuthConfig(BaseModel):
|
||||||
"""JWT and auth-related configuration. Parsed once at startup.
|
"""JWT and auth-related configuration. Parsed once at startup.
|
||||||
@@ -32,32 +30,6 @@ class AuthConfig(BaseModel):
|
|||||||
_auth_config: AuthConfig | None = None
|
_auth_config: AuthConfig | None = None
|
||||||
|
|
||||||
|
|
||||||
def _load_or_create_secret() -> str:
|
|
||||||
"""Load persisted JWT secret from ``{base_dir}/.jwt_secret``, or generate and persist a new one."""
|
|
||||||
from deerflow.config.paths import get_paths
|
|
||||||
|
|
||||||
paths = get_paths()
|
|
||||||
secret_file = paths.base_dir / _SECRET_FILE
|
|
||||||
|
|
||||||
try:
|
|
||||||
if secret_file.exists():
|
|
||||||
secret = secret_file.read_text(encoding="utf-8").strip()
|
|
||||||
if secret:
|
|
||||||
return secret
|
|
||||||
except OSError as exc:
|
|
||||||
raise RuntimeError(f"Failed to read JWT secret from {secret_file}. Set AUTH_JWT_SECRET explicitly or fix DEER_FLOW_HOME/base directory permissions so DeerFlow can read its persisted auth secret.") from exc
|
|
||||||
|
|
||||||
secret = secrets.token_urlsafe(32)
|
|
||||||
try:
|
|
||||||
secret_file.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
fd = os.open(secret_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
|
|
||||||
with os.fdopen(fd, "w", encoding="utf-8") as fh:
|
|
||||||
fh.write(secret)
|
|
||||||
except OSError as exc:
|
|
||||||
raise RuntimeError(f"Failed to persist JWT secret to {secret_file}. Set AUTH_JWT_SECRET explicitly or fix DEER_FLOW_HOME/base directory permissions so DeerFlow can store a stable auth secret.") from exc
|
|
||||||
return secret
|
|
||||||
|
|
||||||
|
|
||||||
def get_auth_config() -> AuthConfig:
|
def get_auth_config() -> AuthConfig:
|
||||||
"""Get the global AuthConfig instance. Parses from env on first call."""
|
"""Get the global AuthConfig instance. Parses from env on first call."""
|
||||||
global _auth_config
|
global _auth_config
|
||||||
@@ -67,11 +39,11 @@ def get_auth_config() -> AuthConfig:
|
|||||||
load_dotenv()
|
load_dotenv()
|
||||||
jwt_secret = os.environ.get("AUTH_JWT_SECRET")
|
jwt_secret = os.environ.get("AUTH_JWT_SECRET")
|
||||||
if not jwt_secret:
|
if not jwt_secret:
|
||||||
jwt_secret = _load_or_create_secret()
|
jwt_secret = secrets.token_urlsafe(32)
|
||||||
os.environ["AUTH_JWT_SECRET"] = jwt_secret
|
os.environ["AUTH_JWT_SECRET"] = jwt_secret
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"⚠ AUTH_JWT_SECRET is not set — using an auto-generated secret "
|
"⚠ AUTH_JWT_SECRET is not set — using an auto-generated ephemeral secret. "
|
||||||
"persisted to .jwt_secret. Sessions will survive restarts. "
|
"Sessions will be invalidated on restart. "
|
||||||
"For production, add AUTH_JWT_SECRET to your .env file: "
|
"For production, add AUTH_JWT_SECRET to your .env file: "
|
||||||
'python -c "import secrets; print(secrets.token_urlsafe(32))"'
|
'python -c "import secrets; print(secrets.token_urlsafe(32))"'
|
||||||
)
|
)
|
||||||
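The two sides of this hunk differ in whether the auto-generated JWT secret survives a restart: one side persists it to `.jwt_secret`, the other generates a fresh value per process. The sketch below illustrates the persisted variant in isolation. It assumes the secret path is passed in directly (the diff resolves it through `get_paths()`), and the `demo` function is purely illustrative.

```python
import os
import secrets
import tempfile
from pathlib import Path


def load_or_create_secret(secret_file: Path) -> str:
    """Return the persisted secret, or generate and store a new one."""
    try:
        if secret_file.exists():
            secret = secret_file.read_text(encoding="utf-8").strip()
            if secret:
                return secret
    except OSError as exc:
        raise RuntimeError(f"Failed to read secret from {secret_file}") from exc

    secret = secrets.token_urlsafe(32)
    try:
        secret_file.parent.mkdir(parents=True, exist_ok=True)
        # Request owner-only permissions (0o600) at creation time.
        fd = os.open(secret_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(secret)
    except OSError as exc:
        raise RuntimeError(f"Failed to persist secret to {secret_file}") from exc
    return secret


def demo() -> bool:
    """Illustrative check: two calls against the same file return the same secret."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / ".jwt_secret"
        return load_or_create_secret(path) == load_or_create_secret(path)
```

The second call reads the file written by the first, which is what lets sessions signed with the secret remain valid across restarts.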
|
|||||||
@@ -28,7 +28,7 @@ class User(BaseModel):
|
|||||||
oauth_id: str | None = Field(None, description="User ID from OAuth provider")
|
oauth_id: str | None = Field(None, description="User ID from OAuth provider")
|
||||||
|
|
||||||
# Auth lifecycle
|
# Auth lifecycle
|
||||||
needs_setup: bool = Field(default=False, description="True when a reset account must complete setup")
|
needs_setup: bool = Field(default=False, description="True for auto-created admin until setup completes")
|
||||||
token_version: int = Field(default=0, description="Incremented on password change to invalidate old JWTs")
|
token_version: int = Field(default=0, description="Incremented on password change to invalidate old JWTs")
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -8,6 +8,7 @@ class GatewayConfig(BaseModel):
|
|||||||
|
|
||||||
host: str = Field(default="0.0.0.0", description="Host to bind the gateway server")
|
host: str = Field(default="0.0.0.0", description="Host to bind the gateway server")
|
||||||
port: int = Field(default=8001, description="Port to bind the gateway server")
|
port: int = Field(default=8001, description="Port to bind the gateway server")
|
||||||
|
cors_origins: list[str] = Field(default_factory=lambda: ["http://localhost:3000"], description="Allowed CORS origins")
|
||||||
enable_docs: bool = Field(default=True, description="Enable Swagger/ReDoc/OpenAPI endpoints")
|
enable_docs: bool = Field(default=True, description="Enable Swagger/ReDoc/OpenAPI endpoints")
|
||||||
|
|
||||||
|
|
||||||
@@ -18,9 +19,11 @@ def get_gateway_config() -> GatewayConfig:
|
|||||||
"""Get gateway config, loading from environment if available."""
|
"""Get gateway config, loading from environment if available."""
|
||||||
global _gateway_config
|
global _gateway_config
|
||||||
if _gateway_config is None:
|
if _gateway_config is None:
|
||||||
|
cors_origins_str = os.getenv("CORS_ORIGINS", "http://localhost:3000")
|
||||||
_gateway_config = GatewayConfig(
|
_gateway_config = GatewayConfig(
|
||||||
host=os.getenv("GATEWAY_HOST", "0.0.0.0"),
|
host=os.getenv("GATEWAY_HOST", "0.0.0.0"),
|
||||||
port=int(os.getenv("GATEWAY_PORT", "8001")),
|
port=int(os.getenv("GATEWAY_PORT", "8001")),
|
||||||
|
cors_origins=cors_origins_str.split(","),
|
||||||
enable_docs=os.getenv("GATEWAY_ENABLE_DOCS", "true").lower() == "true",
|
enable_docs=os.getenv("GATEWAY_ENABLE_DOCS", "true").lower() == "true",
|
||||||
)
|
)
|
||||||
return _gateway_config
|
return _gateway_config
|
||||||
|
|||||||
@@ -4,10 +4,8 @@ Per RFC-001:
|
|||||||
State-changing operations require CSRF protection.
|
State-changing operations require CSRF protection.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import os
|
|
||||||
import secrets
|
import secrets
|
||||||
from collections.abc import Awaitable, Callable
|
from collections.abc import Callable
|
||||||
from urllib.parse import urlsplit
|
|
||||||
|
|
||||||
from fastapi import Request, Response
|
from fastapi import Request, Response
|
||||||
from starlette.middleware.base import BaseHTTPMiddleware
|
from starlette.middleware.base import BaseHTTPMiddleware
|
||||||
@@ -21,7 +19,7 @@ CSRF_TOKEN_LENGTH = 64 # bytes
|
|||||||
|
|
||||||
def is_secure_request(request: Request) -> bool:
|
def is_secure_request(request: Request) -> bool:
|
||||||
"""Detect whether the original client request was made over HTTPS."""
|
"""Detect whether the original client request was made over HTTPS."""
|
||||||
return _request_scheme(request) == "https"
|
return request.headers.get("x-forwarded-proto", request.url.scheme) == "https"
|
||||||
|
|
||||||
|
|
||||||
def generate_csrf_token() -> str:
|
def generate_csrf_token() -> str:
|
||||||
@@ -63,129 +61,15 @@ def is_auth_endpoint(request: Request) -> bool:
|
|||||||
return request.url.path.rstrip("/") in _AUTH_EXEMPT_PATHS
|
return request.url.path.rstrip("/") in _AUTH_EXEMPT_PATHS
|
||||||
|
|
||||||
|
|
||||||
def _host_with_optional_port(hostname: str, port: int | None, scheme: str) -> str:
|
|
||||||
"""Return normalized host[:port], omitting default ports."""
|
|
||||||
host = hostname.lower()
|
|
||||||
if ":" in host and not host.startswith("["):
|
|
||||||
host = f"[{host}]"
|
|
||||||
|
|
||||||
if port is None or (scheme == "http" and port == 80) or (scheme == "https" and port == 443):
|
|
||||||
return host
|
|
||||||
return f"{host}:{port}"
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_origin(origin: str) -> str | None:
|
|
||||||
"""Return a normalized scheme://host[:port] origin, or None for invalid input."""
|
|
||||||
try:
|
|
||||||
parsed = urlsplit(origin.strip())
|
|
||||||
port = parsed.port
|
|
||||||
except ValueError:
|
|
||||||
return None
|
|
||||||
|
|
||||||
scheme = parsed.scheme.lower()
|
|
||||||
if scheme not in {"http", "https"} or not parsed.hostname:
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Browser Origin is only scheme/host/port. Reject URL-shaped or credentialed values.
|
|
||||||
if parsed.username or parsed.password or parsed.path or parsed.query or parsed.fragment:
|
|
||||||
return None
|
|
||||||
|
|
||||||
return f"{scheme}://{_host_with_optional_port(parsed.hostname, port, scheme)}"
|
|
||||||
|
|
||||||
|
|
||||||
def _configured_cors_origins() -> set[str]:
|
|
||||||
"""Return explicit configured browser origins that may call auth routes."""
|
|
||||||
origins = set()
|
|
||||||
for raw_origin in os.environ.get("GATEWAY_CORS_ORIGINS", "").split(","):
|
|
||||||
origin = raw_origin.strip()
|
|
||||||
if not origin or origin == "*":
|
|
||||||
continue
|
|
||||||
normalized = _normalize_origin(origin)
|
|
||||||
if normalized:
|
|
||||||
origins.add(normalized)
|
|
||||||
return origins
|
|
||||||
|
|
||||||
|
|
||||||
def get_configured_cors_origins() -> set[str]:
|
|
||||||
"""Return normalized explicit browser origins from GATEWAY_CORS_ORIGINS."""
|
|
||||||
return _configured_cors_origins()
|
|
||||||
|
|
||||||
|
|
||||||
def _first_header_value(value: str | None) -> str | None:
|
|
||||||
"""Return the first value from a comma-separated proxy header."""
|
|
||||||
if not value:
|
|
||||||
return None
|
|
||||||
first = value.split(",", 1)[0].strip()
|
|
||||||
return first or None
|
|
||||||
|
|
||||||
|
|
||||||
def _forwarded_param(request: Request, name: str) -> str | None:
|
|
||||||
"""Extract a parameter from the first RFC 7239 Forwarded header entry."""
|
|
||||||
forwarded = _first_header_value(request.headers.get("forwarded"))
|
|
||||||
if not forwarded:
|
|
||||||
return None
|
|
||||||
|
|
||||||
for part in forwarded.split(";"):
|
|
||||||
key, sep, value = part.strip().partition("=")
|
|
||||||
if sep and key.lower() == name:
|
|
||||||
return value.strip().strip('"') or None
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _request_scheme(request: Request) -> str:
|
|
||||||
"""Resolve the original request scheme from trusted proxy headers."""
|
|
||||||
scheme = _forwarded_param(request, "proto") or _first_header_value(request.headers.get("x-forwarded-proto")) or request.url.scheme
|
|
||||||
return scheme.lower()
|
|
||||||
|
|
||||||
|
|
||||||
def _request_origin(request: Request) -> str | None:
|
|
||||||
"""Build the origin for the URL the browser is targeting."""
|
|
||||||
scheme = _request_scheme(request)
|
|
||||||
host = _forwarded_param(request, "host") or _first_header_value(request.headers.get("x-forwarded-host")) or request.headers.get("host") or request.url.netloc
|
|
||||||
|
|
||||||
forwarded_port = _first_header_value(request.headers.get("x-forwarded-port"))
|
|
||||||
if forwarded_port and ":" not in host.rsplit("]", 1)[-1]:
|
|
||||||
host = f"{host}:{forwarded_port}"
|
|
||||||
|
|
||||||
return _normalize_origin(f"{scheme}://{host}")
|
|
||||||
|
|
||||||
|
|
||||||
def is_allowed_auth_origin(request: Request) -> bool:
|
|
||||||
"""Allow auth POSTs only from the same origin or explicit configured origins.
|
|
||||||
|
|
||||||
Login/register/initialize are exempt from the double-submit token because
|
|
||||||
first-time browser clients do not have a CSRF token yet. They still create
|
|
||||||
a session cookie, so browser requests with a hostile Origin header must be
|
|
||||||
rejected to prevent login CSRF / session fixation. Requests without Origin
|
|
||||||
are allowed for non-browser clients such as curl and mobile integrations.
|
|
||||||
"""
|
|
||||||
origin = request.headers.get("origin")
|
|
||||||
if not origin:
|
|
||||||
return True
|
|
||||||
|
|
||||||
normalized_origin = _normalize_origin(origin)
|
|
||||||
if normalized_origin is None:
|
|
||||||
return False
|
|
||||||
|
|
||||||
request_origin = _request_origin(request)
|
|
||||||
return normalized_origin in _configured_cors_origins() or (request_origin is not None and normalized_origin == request_origin)
|
|
||||||
|
|
||||||
|
|
||||||
class CSRFMiddleware(BaseHTTPMiddleware):
|
class CSRFMiddleware(BaseHTTPMiddleware):
|
||||||
"""Middleware that implements CSRF protection using Double Submit Cookie pattern."""
|
"""Middleware that implements CSRF protection using Double Submit Cookie pattern."""
|
||||||
|
|
||||||
def __init__(self, app: ASGIApp) -> None:
|
def __init__(self, app: ASGIApp) -> None:
|
||||||
super().__init__(app)
|
super().__init__(app)
|
||||||
|
|
||||||
async def dispatch(self, request: Request, call_next: Callable[[Request], Awaitable[Response]]) -> Response:
|
async def dispatch(self, request: Request, call_next: Callable) -> Response:
|
||||||
_is_auth = is_auth_endpoint(request)
|
_is_auth = is_auth_endpoint(request)
|
||||||
|
|
||||||
if should_check_csrf(request) and _is_auth and not is_allowed_auth_origin(request):
|
|
||||||
return JSONResponse(
|
|
||||||
status_code=403,
|
|
||||||
content={"detail": "Cross-site auth request denied."},
|
|
||||||
)
|
|
||||||
|
|
||||||
if should_check_csrf(request) and not _is_auth:
|
if should_check_csrf(request) and not _is_auth:
|
||||||
cookie_token = request.cookies.get(CSRF_COOKIE_NAME)
|
cookie_token = request.cookies.get(CSRF_COOKIE_NAME)
|
||||||
header_token = request.headers.get(CSRF_HEADER_NAME)
|
header_token = request.headers.get(CSRF_HEADER_NAME)
|
||||||
|
|||||||
+17
-160
@@ -3,22 +3,11 @@
|
|||||||
**Getters** (used by routers): raise 503 when a required dependency is
|
**Getters** (used by routers): raise 503 when a required dependency is
|
||||||
missing, except ``get_store`` which returns ``None``.
|
missing, except ``get_store`` which returns ``None``.
|
||||||
|
|
||||||
``AppConfig`` is intentionally *not* cached on ``app.state``. Routers and the
|
|
||||||
run path resolve it through :func:`deerflow.config.app_config.get_app_config`,
|
|
||||||
which performs mtime-based hot reload, so edits to ``config.yaml`` take
|
|
||||||
effect on the next request without a process restart. The engines created in
|
|
||||||
:func:`langgraph_runtime` (stream bridge, persistence, checkpointer, store,
|
|
||||||
run-event store) accept a ``startup_config`` snapshot — they are
|
|
||||||
restart-required by design and stay bound to that snapshot to keep the live
|
|
||||||
process consistent with itself.
|
|
||||||
|
|
||||||
Initialization is handled directly in ``app.py`` via :class:`AsyncExitStack`.
|
Initialization is handled directly in ``app.py`` via :class:`AsyncExitStack`.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
|
||||||
from collections.abc import AsyncGenerator, Callable
|
from collections.abc import AsyncGenerator, Callable
|
||||||
from contextlib import AsyncExitStack, asynccontextmanager
|
from contextlib import AsyncExitStack, asynccontextmanager
|
||||||
from typing import TYPE_CHECKING, TypeVar, cast
|
from typing import TYPE_CHECKING, TypeVar, cast
|
||||||
@@ -26,144 +15,36 @@ from typing import TYPE_CHECKING, TypeVar, cast
|
|||||||
from fastapi import FastAPI, HTTPException, Request
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
from langgraph.types import Checkpointer
|
from langgraph.types import Checkpointer
|
||||||
|
|
||||||
from deerflow.config.app_config import AppConfig, get_app_config
|
from deerflow.config.app_config import AppConfig
|
||||||
from deerflow.persistence.feedback import FeedbackRepository
|
from deerflow.persistence.feedback import FeedbackRepository
|
||||||
from deerflow.runtime import RunContext, RunManager, StreamBridge
|
from deerflow.runtime import RunContext, RunManager, StreamBridge
|
||||||
from deerflow.runtime.events.store.base import RunEventStore
|
from deerflow.runtime.events.store.base import RunEventStore
|
||||||
from deerflow.runtime.runs.store.base import RunStore
|
from deerflow.runtime.runs.store.base import RunStore
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
# Upper bound (seconds) for draining in-flight runs during shutdown, before the
|
|
||||||
# AsyncExitStack tears down the checkpointer (and its connection pool). Kept
|
|
||||||
# local to avoid an app -> deps -> app import cycle. This is a *separate* budget
|
|
||||||
# from ``app.gateway.app._SHUTDOWN_HOOK_TIMEOUT_SECONDS`` (currently also 5.0s,
|
|
||||||
# which bounds channel-service stop): the two govern independent teardown steps
|
|
||||||
# and may diverge, but both count toward the lifespan shutdown window — revisit
|
|
||||||
# them together if their sum must stay within the server's graceful-shutdown
|
|
||||||
# timeout.
|
|
||||||
_RUN_DRAIN_TIMEOUT_SECONDS = 5.0
|
|
||||||
|
|
||||||
|
|
||||||
async def _drain_inflight_runs(run_manager: RunManager) -> None:
|
|
||||||
"""Drain in-flight runs before the checkpointer is torn down (issue #3373).
|
|
||||||
|
|
||||||
Shields the (internally-bounded) drain so that even if the lifespan
|
|
||||||
coroutine is itself cancelled mid-shutdown — a second SIGINT or the server's
|
|
||||||
graceful-shutdown timeout, i.e. the same signal storm behind #3373 — the
|
|
||||||
checkpointer pool is not closed while run tasks are still writing
|
|
||||||
checkpoints. On such a cancellation we let the already-running drain finish
|
|
||||||
(it is bounded by ``RunManager.shutdown``'s own timeout) and then propagate
|
|
||||||
the cancellation.
|
|
||||||
"""
|
|
||||||
drain = asyncio.create_task(run_manager.shutdown(timeout=_RUN_DRAIN_TIMEOUT_SECONDS))
|
|
||||||
try:
|
|
||||||
await asyncio.shield(drain)
|
|
||||||
except asyncio.CancelledError:
|
|
||||||
# Re-shield so this second wait does not abandon the in-flight drain;
|
|
||||||
# it is bounded, so this cannot hang. Then re-raise to honour shutdown.
|
|
||||||
try:
|
|
||||||
await asyncio.shield(drain)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("In-flight run drain failed after shutdown cancellation")
|
|
||||||
raise
|
|
||||||
except Exception:
|
|
||||||
logger.exception("Failed to drain in-flight runs during shutdown")
|
|
||||||
|
|
||||||
|
|
||||||
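The `_drain_inflight_runs` docstring on the left-hand side describes shielding a bounded drain task so that cancelling the lifespan coroutine does not abandon it. The following sketch shows the same shield-then-re-shield pattern with a stand-in drain coroutine; `fake_drain`, `shutdown_step`, and the sleep durations are hypothetical and only serve to make the ordering observable.

```python
import asyncio


async def fake_drain(events: list[str]) -> None:
    """Stand-in for a bounded drain such as RunManager.shutdown(timeout=...)."""
    await asyncio.sleep(0.05)
    events.append("drained")


async def shutdown_step(events: list[str]) -> None:
    drain = asyncio.create_task(fake_drain(events))
    try:
        # shield() keeps the inner task alive if this coroutine is cancelled.
        await asyncio.shield(drain)
    except asyncio.CancelledError:
        # Wait for the already-running drain to finish, then honour the cancellation.
        try:
            await asyncio.shield(drain)
        except Exception:
            events.append("drain failed")
        raise


async def main() -> list[str]:
    events: list[str] = []
    step = asyncio.create_task(shutdown_step(events))
    await asyncio.sleep(0.01)
    step.cancel()  # simulate a second SIGINT arriving mid-shutdown
    try:
        await step
    except asyncio.CancelledError:
        events.append("cancelled")
    return events
```

Running `asyncio.run(main())` records the drain completing before the cancellation propagates, which is the ordering the docstring relies on to keep the checkpointer pool open while runs are still writing.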
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
from app.gateway.auth.local_provider import LocalAuthProvider
|
from app.gateway.auth.local_provider import LocalAuthProvider
|
||||||
from app.gateway.auth.repositories.sqlite import SQLiteUserRepository
|
from app.gateway.auth.repositories.sqlite import SQLiteUserRepository
|
||||||
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
||||||
from deerflow.runtime import RunRecord
|
|
||||||
|
|
||||||
|
|
||||||
T = TypeVar("T")
|
T = TypeVar("T")
|
||||||
|
|
||||||
|
|
||||||
async def _mark_latest_recovered_threads_error(
|
def get_config(request: Request) -> AppConfig:
|
||||||
run_manager: RunManager,
|
"""Return the app-scoped ``AppConfig`` stored on ``app.state``."""
|
||||||
thread_store: ThreadMetaStore,
|
config = getattr(request.app.state, "config", None)
|
||||||
recovered_runs: list[RunRecord],
|
if config is None:
|
||||||
) -> None:
|
raise HTTPException(status_code=503, detail="Configuration not available")
|
||||||
"""Mark thread status as error only when its newest run was recovered."""
|
return config
|
||||||
recovered_by_thread: dict[str, set[str]] = {}
|
|
||||||
for record in recovered_runs:
|
|
||||||
recovered_by_thread.setdefault(record.thread_id, set()).add(record.run_id)
|
|
||||||
|
|
||||||
for thread_id, recovered_run_ids in recovered_by_thread.items():
|
|
||||||
try:
|
|
||||||
latest_runs = await run_manager.list_by_thread(thread_id, user_id=None, limit=1)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to find latest run for thread %s during run reconciliation", thread_id, exc_info=True)
|
|
||||||
continue
|
|
||||||
if not latest_runs or latest_runs[0].run_id not in recovered_run_ids:
|
|
||||||
continue
|
|
||||||
try:
|
|
||||||
await thread_store.update_status(thread_id, "error", user_id=None)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to mark thread %s as error during run reconciliation", thread_id, exc_info=True)
|
|
||||||
|
|
||||||
|
|
||||||
def get_config() -> AppConfig:
|
|
||||||
"""Return the freshest ``AppConfig`` for the current request.
|
|
||||||
|
|
||||||
Routes through :func:`deerflow.config.app_config.get_app_config`, which
|
|
||||||
honours runtime ``ContextVar`` overrides and reloads ``config.yaml`` from
|
|
||||||
disk when its mtime changes. ``AppConfig`` is not cached on ``app.state``
|
|
||||||
at all — the only startup-time snapshot lives as a local
|
|
||||||
``startup_config`` variable inside ``lifespan()`` and is passed
|
|
||||||
explicitly into :func:`langgraph_runtime` for the engines that are
|
|
||||||
restart-required by design. Routing every request through
|
|
||||||
:func:`get_app_config` closes the bytedance/deer-flow issue #3107 BUG-001
|
|
||||||
split-brain where the worker / lead-agent thread saw a stale startup
|
|
||||||
snapshot.
|
|
||||||
|
|
||||||
Hot-reload boundary: fields backed by startup-time singletons
|
|
||||||
(engines, sandbox provider, IM channels, logging handler) require a
|
|
||||||
process restart to change at runtime. The authoritative list lives in
|
|
||||||
:mod:`deerflow.config.reload_boundary` and is mirrored by the
|
|
||||||
standardised ``"startup-only:"`` prefix on the matching
|
|
||||||
``Field(description=...)`` in :class:`AppConfig` — IDE hover on those
|
|
||||||
fields will surface the boundary inline. See
|
|
||||||
``backend/CLAUDE.md`` "Config Hot-Reload Boundary" for the operator
|
|
||||||
summary.
|
|
||||||
|
|
||||||
Any failure to materialise the config (missing file, permission denied,
|
|
||||||
YAML parse error, validation error) is reported as 503 — semantically
|
|
||||||
"the gateway cannot serve requests without a usable configuration" — and
|
|
||||||
logged with the original exception so operators have something to debug.
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
return get_app_config()
|
|
||||||
except Exception as exc: # noqa: BLE001 - request boundary: log and degrade gracefully
|
|
||||||
logger.exception("Failed to load AppConfig at request time")
|
|
||||||
raise HTTPException(status_code=503, detail="Configuration not available") from exc
|
|
||||||
|
|
||||||
|
|
||||||
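The left-hand `get_config` docstring relies on `get_app_config()` reloading `config.yaml` when the file's mtime changes, whereas the right-hand version reads a snapshot from `app.state`. The sketch below shows one way an mtime-based reload can work. It is not DeerFlow's implementation: the `ReloadingConfig` class is hypothetical, and JSON is used in place of YAML so the example needs only the standard library.

```python
import json
import os
import tempfile
from pathlib import Path


class ReloadingConfig:
    """Hypothetical loader that re-parses the file only when its mtime changes."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._mtime_ns: int | None = None
        self._data: dict = {}

    def get(self) -> dict:
        mtime_ns = self._path.stat().st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # File changed since the last read (or first call): parse it again.
            self._data = json.loads(self._path.read_text(encoding="utf-8"))
            self._mtime_ns = mtime_ns
        return self._data


def demo() -> tuple[str, str]:
    """Illustrative run: edit the file, bump its mtime, and read the new value."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        path.write_text(json.dumps({"log_level": "info"}), encoding="utf-8")
        cfg = ReloadingConfig(path)
        before = cfg.get()["log_level"]

        path.write_text(json.dumps({"log_level": "debug"}), encoding="utf-8")
        st = path.stat()
        # Force a distinct mtime so the example does not depend on clock resolution.
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
        after = cfg.get()["log_level"]
        return before, after
```

A request-time getter built on this kind of loader sees edits on the next call, while anything constructed once at startup from an earlier result keeps the values it was built with.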
@asynccontextmanager
|
@asynccontextmanager
|
||||||
async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGenerator[None, None]:
|
async def langgraph_runtime(app: FastAPI) -> AsyncGenerator[None, None]:
|
||||||
"""Bootstrap and tear down all LangGraph runtime singletons.
|
"""Bootstrap and tear down all LangGraph runtime singletons.
|
||||||
|
|
||||||
``startup_config`` is the ``AppConfig`` snapshot taken once during
|
|
||||||
``lifespan()`` for one-shot infrastructure bootstrap. The engines and
|
|
||||||
stores constructed here (stream bridge, persistence engine, checkpointer,
|
|
||||||
store, run-event store) are restart-required by design — they hold live
|
|
||||||
connections, file handles, or singleton providers — so they bind to this
|
|
||||||
snapshot and survive across `config.yaml` edits. Request-time consumers
|
|
||||||
must still go through :func:`get_config` for any field that should be
|
|
||||||
hot-reloadable. See ``backend/CLAUDE.md`` "Config Hot-Reload Boundary".
|
|
||||||
|
|
||||||
The matching ``run_events_config`` is frozen onto ``app.state`` so
|
|
||||||
:func:`get_run_context` pairs a freshly-loaded ``AppConfig`` with the
|
|
||||||
*startup-time* run-events configuration the underlying ``event_store``
|
|
||||||
was built from — otherwise the runtime could end up combining a live
|
|
||||||
new ``run_events_config`` with an event store still bound to the
|
|
||||||
previous backend.
|
|
||||||
|
|
||||||
Usage in ``app.py``::
|
Usage in ``app.py``::
|
||||||
|
|
||||||
async with langgraph_runtime(app, startup_config):
|
async with langgraph_runtime(app):
|
||||||
yield
|
yield
|
||||||
"""
|
"""
|
||||||
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine_from_config
|
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine_from_config
|
||||||
@@ -172,7 +53,9 @@ async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGen
|
|||||||
from deerflow.runtime.events.store import make_run_event_store
|
from deerflow.runtime.events.store import make_run_event_store
|
||||||
|
|
||||||
async with AsyncExitStack() as stack:
|
async with AsyncExitStack() as stack:
|
||||||
config = startup_config
|
config = getattr(app.state, "config", None)
|
||||||
|
if config is None:
|
||||||
|
raise RuntimeError("langgraph_runtime() requires app.state.config to be initialized")
|
||||||
|
|
||||||
app.state.stream_bridge = await stack.enter_async_context(make_stream_bridge(config))
|
app.state.stream_bridge = await stack.enter_async_context(make_stream_bridge(config))
|
||||||
|
|
||||||
@@ -201,38 +84,16 @@ async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGen
|
|||||||
|
|
||||||
app.state.thread_store = make_thread_store(sf, app.state.store)
|
app.state.thread_store = make_thread_store(sf, app.state.store)
|
||||||
|
|
||||||
# Run event store. The store and the matching ``run_events_config`` are
|
# Run event store (has its own factory with config-driven backend selection)
|
||||||
# both frozen at startup so ``get_run_context`` does not combine a
|
|
||||||
# freshly-reloaded ``AppConfig.run_events`` with a store still bound to
|
|
||||||
# the previous backend.
|
|
||||||
run_events_config = getattr(config, "run_events", None)
|
run_events_config = getattr(config, "run_events", None)
|
||||||
app.state.run_events_config = run_events_config
|
|
||||||
app.state.run_event_store = make_run_event_store(run_events_config)
|
app.state.run_event_store = make_run_event_store(run_events_config)
|
||||||
|
|
||||||
# RunManager with store backing for persistence
|
# RunManager with store backing for persistence
|
||||||
app.state.run_manager = RunManager(store=app.state.run_store)
|
app.state.run_manager = RunManager(store=app.state.run_store)
|
||||||
if getattr(config.database, "backend", None) == "sqlite":
|
|
||||||
from deerflow.utils.time import now_iso
|
|
||||||
|
|
||||||
# Startup-only recovery: clean shutdowns return no active rows and
|
|
||||||
# the thread-status update below becomes a no-op.
|
|
||||||
recovered_runs = await app.state.run_manager.reconcile_orphaned_inflight_runs(
|
|
||||||
error="Gateway restarted before this run reached a durable final state.",
|
|
||||||
before=now_iso(),
|
|
||||||
)
|
|
||||||
await _mark_latest_recovered_threads_error(app.state.run_manager, app.state.thread_store, recovered_runs)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
yield
|
yield
|
||||||
finally:
|
finally:
|
||||||
# Drain in-flight run tasks BEFORE the AsyncExitStack tears down the
|
|
||||||
# checkpointer (and its connection pool). A run still mid-graph would
|
|
||||||
# otherwise leak into asyncio.run() shutdown, where langgraph's
|
|
||||||
# _checkpointer_put_after_previous aput races the closed pool and
|
|
||||||
# raises PoolClosed (issue #3373).
|
|
||||||
run_manager = getattr(app.state, "run_manager", None)
|
|
||||||
if run_manager is not None:
|
|
||||||
await _drain_inflight_runs(run_manager)
|
|
||||||
await close_engine()
|
await close_engine()
|
||||||
|
|
||||||
|
|
||||||
@@ -278,20 +139,16 @@ def get_thread_store(request: Request) -> ThreadMetaStore:
|
|||||||
def get_run_context(request: Request) -> RunContext:
|
def get_run_context(request: Request) -> RunContext:
|
||||||
"""Build a :class:`RunContext` from ``app.state`` singletons.
|
"""Build a :class:`RunContext` from ``app.state`` singletons.
|
||||||
|
|
||||||
Returns a *base* context with infrastructure dependencies. The
|
Returns a *base* context with infrastructure dependencies.
|
||||||
``app_config`` field is resolved live so per-run fields (e.g.
|
|
||||||
``models[*].max_tokens``) follow ``config.yaml`` edits; the
|
|
||||||
``event_store`` / ``run_events_config`` pair stays frozen to the snapshot
|
|
||||||
captured in :func:`langgraph_runtime` so callers never see a store bound
|
|
||||||
to one backend paired with a config pointing at another.
|
|
||||||
"""
|
"""
|
||||||
|
config = get_config(request)
|
||||||
return RunContext(
|
return RunContext(
|
||||||
checkpointer=get_checkpointer(request),
|
checkpointer=get_checkpointer(request),
|
||||||
store=get_store(request),
|
store=get_store(request),
|
||||||
event_store=get_run_event_store(request),
|
event_store=get_run_event_store(request),
|
||||||
run_events_config=getattr(request.app.state, "run_events_config", None),
|
run_events_config=getattr(config, "run_events", None),
|
||||||
thread_store=get_thread_store(request),
|
thread_store=get_thread_store(request),
|
||||||
app_config=get_config(),
|
app_config=config,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,38 +1,26 @@
|
|||||||
"""Authentication for trusted Gateway internal callers."""
|
"""Process-local authentication for Gateway internal callers."""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import os
|
|
||||||
import secrets
|
import secrets
|
||||||
from types import SimpleNamespace
|
from types import SimpleNamespace
|
||||||
|
|
||||||
from deerflow.runtime.user_context import DEFAULT_USER_ID
|
from deerflow.runtime.user_context import DEFAULT_USER_ID
|
||||||
|
|
||||||
INTERNAL_AUTH_HEADER_NAME = "X-DeerFlow-Internal-Token"
|
INTERNAL_AUTH_HEADER_NAME = "X-DeerFlow-Internal-Token"
|
||||||
INTERNAL_AUTH_ENV_VAR = "DEER_FLOW_INTERNAL_AUTH_TOKEN"
|
_INTERNAL_AUTH_TOKEN = secrets.token_urlsafe(32)
|
||||||
INTERNAL_SYSTEM_ROLE = "internal"
|
|
||||||
|
|
||||||
|
|
||||||
def _load_internal_auth_token() -> str:
|
|
||||||
token = os.environ.get(INTERNAL_AUTH_ENV_VAR)
|
|
||||||
if token:
|
|
||||||
return token
|
|
||||||
return secrets.token_urlsafe(32)
|
|
||||||
|
|
||||||
|
|
||||||
_INTERNAL_AUTH_TOKEN = _load_internal_auth_token()
|
|
||||||
|
|
||||||
|
|
||||||
def create_internal_auth_headers() -> dict[str, str]:
|
def create_internal_auth_headers() -> dict[str, str]:
|
||||||
"""Return headers that authenticate trusted Gateway internal calls."""
|
"""Return headers that authenticate same-process Gateway internal calls."""
|
||||||
return {INTERNAL_AUTH_HEADER_NAME: _INTERNAL_AUTH_TOKEN}
|
return {INTERNAL_AUTH_HEADER_NAME: _INTERNAL_AUTH_TOKEN}
|
||||||
|
|
||||||
|
|
||||||
def is_valid_internal_auth_token(token: str | None) -> bool:
|
def is_valid_internal_auth_token(token: str | None) -> bool:
|
||||||
"""Return True when *token* matches this Gateway worker's internal token."""
|
"""Return True when *token* matches the process-local internal token."""
|
||||||
return bool(token) and secrets.compare_digest(token, _INTERNAL_AUTH_TOKEN)
|
return bool(token) and secrets.compare_digest(token, _INTERNAL_AUTH_TOKEN)
|
||||||
|
|
||||||
|
|
||||||
def get_internal_user():
|
def get_internal_user():
|
||||||
"""Return the synthetic user used for trusted internal channel calls."""
|
"""Return the synthetic user used for trusted internal channel calls."""
|
||||||
return SimpleNamespace(id=DEFAULT_USER_ID, system_role=INTERNAL_SYSTEM_ROLE)
|
return SimpleNamespace(id=DEFAULT_USER_ID, system_role="internal")
|
||||||
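Review note: the left-hand side of this file lets the internal token come from the `DEER_FLOW_INTERNAL_AUTH_TOKEN` environment variable, and the right-hand side always mints a random per-process token. A minimal sketch of the environment-aware variant follows. The `environ` and `expected` parameters are assumptions added only so the snippet can be exercised in isolation; the module in the diff reads `os.environ` directly and compares against a module-level token.

```python
from __future__ import annotations

import os
import secrets

# Constant names are taken from the diff; everything else is illustrative.
INTERNAL_AUTH_HEADER_NAME = "X-DeerFlow-Internal-Token"
INTERNAL_AUTH_ENV_VAR = "DEER_FLOW_INTERNAL_AUTH_TOKEN"


def load_internal_auth_token(environ=os.environ) -> str:
    """Prefer a token supplied through the environment, else mint a random one."""
    token = environ.get(INTERNAL_AUTH_ENV_VAR)
    if token:
        return token
    return secrets.token_urlsafe(32)


def is_valid_internal_auth_token(token: str | None, expected: str) -> bool:
    """Constant-time comparison; a missing or empty token never matches."""
    return bool(token) and secrets.compare_digest(token, expected)


def create_internal_auth_headers(expected: str) -> dict[str, str]:
    """Build the header a trusted internal caller would attach."""
    return {INTERNAL_AUTH_HEADER_NAME: expected}
```

With a purely random token, only callers inside the same process can authenticate. An environment-supplied token is what would let several workers accept each other's internal calls, which is presumably why the two sides of the diff describe the callers as "trusted" versus "same-process".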
|
|||||||
@@ -1,12 +1,8 @@
|
|||||||
"""LangGraph compatibility auth handler — shares JWT logic with Gateway.
|
"""LangGraph Server auth handler — shares JWT logic with Gateway.
|
||||||
|
|
||||||
The default DeerFlow runtime is embedded in the FastAPI Gateway; scripts and
|
Loaded by LangGraph Server via langgraph.json ``auth.path``.
|
||||||
Docker deployments do not load this module. It is retained for LangGraph
|
Reuses the same ``decode_token`` / ``get_auth_config`` as Gateway,
|
||||||
tooling, Studio, or direct LangGraph Server compatibility through
|
so both modes validate tokens with the same secret and rules.
|
||||||
``langgraph.json``'s ``auth.path``.
|
|
||||||
|
|
||||||
When that compatibility path is used, this module reuses the same JWT and CSRF
|
|
||||||
rules as Gateway so both modes validate sessions consistently.
|
|
||||||
|
|
||||||
Two layers:
|
Two layers:
|
||||||
1. @auth.authenticate — validates JWT cookie, extracts user_id,
|
1. @auth.authenticate — validates JWT cookie, extracts user_id,
|
||||||
|
|||||||
@@ -1,15 +0,0 @@
|
|||||||
"""Shared pagination helpers for gateway routers."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
|
|
||||||
def trim_run_message_page(rows: list[dict], *, limit: int, after_seq: int | None) -> tuple[list[dict], bool]:
|
|
||||||
"""Trim a ``limit + 1`` run-message page while preserving page boundaries."""
|
|
||||||
has_more = len(rows) > limit
|
|
||||||
if not has_more:
|
|
||||||
return rows, False
|
|
||||||
|
|
||||||
if after_seq is not None:
|
|
||||||
return rows[:limit], True
|
|
||||||
|
|
||||||
return rows[-limit:], True
|
|
||||||
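Review note: the helper removed in this file trims a page that was fetched with one extra row (`limit + 1`) to detect whether more rows exist. The sketch below restates it with its indentation restored and shows both branches. It assumes the rows arrive in ascending `seq` order, which the diff does not state explicitly; the sample rows are made up.

```python
from __future__ import annotations


def trim_run_message_page(
    rows: list[dict], *, limit: int, after_seq: int | None
) -> tuple[list[dict], bool]:
    """Trim a ``limit + 1`` page and report whether more rows exist."""
    has_more = len(rows) > limit
    if not has_more:
        return rows, False
    if after_seq is not None:
        # Paging forward from a cursor: keep the first ``limit`` rows.
        return rows[:limit], True
    # No cursor: keep the last ``limit`` rows (the newest ones, if ascending).
    return rows[-limit:], True


rows = [{"seq": s} for s in (2, 3, 4, 5)]  # four rows fetched for limit=3
forward_page, forward_more = trim_run_message_page(rows, limit=3, after_seq=1)
latest_page, latest_more = trim_run_message_page(rows, limit=3, after_seq=None)
```

The extra row is only a probe: it never appears in the returned page, but which end it is dropped from depends on the paging direction.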
@@ -1,6 +1,5 @@
|
|||||||
"""CRUD API for custom agents."""
|
"""CRUD API for custom agents."""
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
import shutil
|
import shutil
|
||||||
@@ -12,7 +11,6 @@ from pydantic import BaseModel, Field
|
|||||||
from deerflow.config.agents_api_config import get_agents_api_config
|
from deerflow.config.agents_api_config import get_agents_api_config
|
||||||
from deerflow.config.agents_config import AgentConfig, list_custom_agents, load_agent_config, load_agent_soul
|
from deerflow.config.agents_config import AgentConfig, list_custom_agents, load_agent_config, load_agent_soul
|
||||||
from deerflow.config.paths import get_paths
|
from deerflow.config.paths import get_paths
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
router = APIRouter(prefix="/api", tags=["agents"])
|
router = APIRouter(prefix="/api", tags=["agents"])
|
||||||
@@ -88,11 +86,11 @@ def _require_agents_api_enabled() -> None:
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def _agent_config_to_response(agent_cfg: AgentConfig, include_soul: bool = False, *, user_id: str | None = None) -> AgentResponse:
|
def _agent_config_to_response(agent_cfg: AgentConfig, include_soul: bool = False) -> AgentResponse:
|
||||||
"""Convert AgentConfig to AgentResponse."""
|
"""Convert AgentConfig to AgentResponse."""
|
||||||
soul: str | None = None
|
soul: str | None = None
|
||||||
if include_soul:
|
if include_soul:
|
||||||
soul = load_agent_soul(agent_cfg.name, user_id=user_id) or ""
|
soul = load_agent_soul(agent_cfg.name) or ""
|
||||||
|
|
||||||
return AgentResponse(
|
return AgentResponse(
|
||||||
name=agent_cfg.name,
|
name=agent_cfg.name,
|
||||||
@@ -118,10 +116,9 @@ async def list_agents() -> AgentsListResponse:
|
|||||||
"""
|
"""
|
||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
|
|
||||||
user_id = get_effective_user_id()
|
|
||||||
try:
|
try:
|
||||||
agents = list_custom_agents(user_id=user_id)
|
agents = list_custom_agents()
|
||||||
return AgentsListResponse(agents=[_agent_config_to_response(a, include_soul=True, user_id=user_id) for a in agents])
|
return AgentsListResponse(agents=[_agent_config_to_response(a, include_soul=True) for a in agents])
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to list agents: {e}", exc_info=True)
|
logger.error(f"Failed to list agents: {e}", exc_info=True)
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to list agents: {str(e)}")
|
raise HTTPException(status_code=500, detail=f"Failed to list agents: {str(e)}")
|
||||||
@@ -147,12 +144,7 @@ async def check_agent_name(name: str) -> dict:
|
|||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
_validate_agent_name(name)
|
_validate_agent_name(name)
|
||||||
normalized = _normalize_agent_name(name)
|
normalized = _normalize_agent_name(name)
|
||||||
user_id = get_effective_user_id()
|
available = not get_paths().agent_dir(normalized).exists()
|
||||||
paths = get_paths()
|
|
||||||
# Treat the name as taken if either the per-user path or the legacy shared
|
|
||||||
# path holds an agent — picking a name that collides with an unmigrated
|
|
||||||
# legacy agent would shadow the legacy entry once migration runs.
|
|
||||||
available = not paths.user_agent_dir(user_id, normalized).exists() and not paths.agent_dir(normalized).exists()
|
|
||||||
return {"available": available, "name": normalized}
|
return {"available": available, "name": normalized}
|
||||||
|
|
||||||
|
|
||||||
@@ -177,11 +169,10 @@ async def get_agent(name: str) -> AgentResponse:
|
|||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
_validate_agent_name(name)
|
_validate_agent_name(name)
|
||||||
name = _normalize_agent_name(name)
|
name = _normalize_agent_name(name)
|
||||||
user_id = get_effective_user_id()
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
agent_cfg = load_agent_config(name, user_id=user_id)
|
agent_cfg = load_agent_config(name)
|
||||||
return _agent_config_to_response(agent_cfg, include_soul=True, user_id=user_id)
|
return _agent_config_to_response(agent_cfg, include_soul=True)
|
||||||
except FileNotFoundError:
|
except FileNotFoundError:
|
||||||
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
@@ -211,63 +202,47 @@ async def create_agent_endpoint(request: AgentCreateRequest) -> AgentResponse:
|
|||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
_validate_agent_name(request.name)
|
_validate_agent_name(request.name)
|
||||||
normalized_name = _normalize_agent_name(request.name)
|
normalized_name = _normalize_agent_name(request.name)
|
||||||
user_id = get_effective_user_id()
|
|
||||||
paths = get_paths()
|
|
||||||
|
|
||||||
def _create_agent() -> AgentResponse | None:
|
agent_dir = get_paths().agent_dir(normalized_name)
|
||||||
# Worker thread: base-dir resolution, existence checks, directory/file
|
|
||||||
# creation, read-back, and failure cleanup are all blocking filesystem
|
|
||||||
# IO that must stay off the event loop.
|
|
||||||
agent_dir = paths.user_agent_dir(user_id, normalized_name)
|
|
||||||
legacy_dir = paths.agent_dir(normalized_name)
|
|
||||||
|
|
||||||
if legacy_dir.exists():
|
if agent_dir.exists():
|
||||||
return None # signals 409 to the caller
|
|
||||||
|
|
||||||
try:
|
|
||||||
try:
|
|
||||||
agent_dir.mkdir(parents=True, exist_ok=False)
|
|
||||||
except FileExistsError:
|
|
||||||
return None # signals 409 to the caller
|
|
||||||
# Write config.yaml
|
|
||||||
config_data: dict = {"name": normalized_name}
|
|
||||||
if request.description:
|
|
||||||
config_data["description"] = request.description
|
|
||||||
if request.model is not None:
|
|
||||||
config_data["model"] = request.model
|
|
||||||
if request.tool_groups is not None:
|
|
||||||
config_data["tool_groups"] = request.tool_groups
|
|
||||||
if request.skills is not None:
|
|
||||||
config_data["skills"] = request.skills
|
|
||||||
|
|
||||||
config_file = agent_dir / "config.yaml"
|
|
||||||
with open(config_file, "w", encoding="utf-8") as f:
|
|
||||||
yaml.dump(config_data, f, default_flow_style=False, allow_unicode=True)
|
|
||||||
|
|
||||||
# Write SOUL.md
|
|
||||||
soul_file = agent_dir / "SOUL.md"
|
|
||||||
soul_file.write_text(request.soul, encoding="utf-8")
|
|
||||||
|
|
||||||
logger.info(f"Created agent '{normalized_name}' at {agent_dir}")
|
|
||||||
|
|
||||||
agent_cfg = load_agent_config(normalized_name, user_id=user_id)
|
|
||||||
return _agent_config_to_response(agent_cfg, include_soul=True, user_id=user_id)
|
|
||||||
except Exception:
|
|
||||||
# Clean up partial state on failure before surfacing the error.
|
|
||||||
if agent_dir.exists():
|
|
||||||
shutil.rmtree(agent_dir)
|
|
||||||
raise
|
|
||||||
|
|
||||||
try:
|
|
||||||
response = await asyncio.to_thread(_create_agent)
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"Failed to create agent '{request.name}': {e}", exc_info=True)
|
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to create agent: {str(e)}")
|
|
||||||
|
|
||||||
if response is None:
|
|
||||||
raise HTTPException(status_code=409, detail=f"Agent '{normalized_name}' already exists")
|
raise HTTPException(status_code=409, detail=f"Agent '{normalized_name}' already exists")
|
||||||
|
|
||||||
return response
|
try:
|
||||||
|
agent_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# Write config.yaml
|
||||||
|
config_data: dict = {"name": normalized_name}
|
||||||
|
if request.description:
|
||||||
|
config_data["description"] = request.description
|
||||||
|
if request.model is not None:
|
||||||
|
config_data["model"] = request.model
|
||||||
|
if request.tool_groups is not None:
|
||||||
|
config_data["tool_groups"] = request.tool_groups
|
||||||
|
if request.skills is not None:
|
||||||
|
config_data["skills"] = request.skills
|
||||||
|
|
||||||
|
config_file = agent_dir / "config.yaml"
|
||||||
|
with open(config_file, "w", encoding="utf-8") as f:
|
||||||
|
yaml.dump(config_data, f, default_flow_style=False, allow_unicode=True)
|
||||||
|
|
||||||
|
# Write SOUL.md
|
||||||
|
soul_file = agent_dir / "SOUL.md"
|
||||||
|
soul_file.write_text(request.soul, encoding="utf-8")
|
||||||
|
|
||||||
|
logger.info(f"Created agent '{normalized_name}' at {agent_dir}")
|
||||||
|
|
||||||
|
agent_cfg = load_agent_config(normalized_name)
|
||||||
|
return _agent_config_to_response(agent_cfg, include_soul=True)
|
||||||
|
|
||||||
|
except HTTPException:
|
||||||
|
raise
|
||||||
|
except Exception as e:
|
||||||
|
# Clean up on failure
|
||||||
|
if agent_dir.exists():
|
||||||
|
shutil.rmtree(agent_dir)
|
||||||
|
logger.error(f"Failed to create agent '{request.name}': {e}", exc_info=True)
|
||||||
|
raise HTTPException(status_code=500, detail=f"Failed to create agent: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
@router.put(
|
@router.put(
|
||||||
@@ -292,20 +267,13 @@ async def update_agent(name: str, request: AgentUpdateRequest) -> AgentResponse:
|
|||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
_validate_agent_name(name)
|
_validate_agent_name(name)
|
||||||
name = _normalize_agent_name(name)
|
name = _normalize_agent_name(name)
|
||||||
user_id = get_effective_user_id()
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
agent_cfg = load_agent_config(name, user_id=user_id)
|
agent_cfg = load_agent_config(name)
|
||||||
except FileNotFoundError:
|
except FileNotFoundError:
|
||||||
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
||||||
|
|
||||||
paths = get_paths()
|
agent_dir = get_paths().agent_dir(name)
|
||||||
agent_dir = paths.user_agent_dir(user_id, name)
|
|
||||||
if not agent_dir.exists() and paths.agent_dir(name).exists():
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=409,
|
|
||||||
detail=(f"Agent '{name}' only exists in the legacy shared layout and is not scoped to a user. Run scripts/migrate_user_isolation.py to move legacy agents into the per-user layout before updating."),
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Update config if any config fields changed
|
# Update config if any config fields changed
|
||||||
@@ -346,8 +314,8 @@ async def update_agent(name: str, request: AgentUpdateRequest) -> AgentResponse:
|
|||||||
|
|
||||||
logger.info(f"Updated agent '{name}'")
|
logger.info(f"Updated agent '{name}'")
|
||||||
|
|
||||||
refreshed_cfg = load_agent_config(name, user_id=user_id)
|
refreshed_cfg = load_agent_config(name)
|
||||||
return _agent_config_to_response(refreshed_cfg, include_soul=True, user_id=user_id)
|
return _agent_config_to_response(refreshed_cfg, include_soul=True)
|
||||||
|
|
||||||
except HTTPException:
|
except HTTPException:
|
||||||
raise
|
raise
|
||||||
@@ -434,38 +402,20 @@ async def delete_agent(name: str) -> None:
|
|||||||
name: The agent name.
|
name: The agent name.
|
||||||
|
|
||||||
Raises:
|
Raises:
|
||||||
HTTPException: 404 if no per-user copy exists; 409 if only a legacy
|
HTTPException: 404 if agent not found.
|
||||||
shared copy exists (suggesting the migration script).
|
|
||||||
"""
|
"""
|
||||||
_require_agents_api_enabled()
|
_require_agents_api_enabled()
|
||||||
_validate_agent_name(name)
|
_validate_agent_name(name)
|
||||||
name = _normalize_agent_name(name)
|
name = _normalize_agent_name(name)
|
||||||
user_id = get_effective_user_id()
|
|
||||||
paths = get_paths()
|
|
||||||
|
|
||||||
def _remove_agent_dir() -> tuple[str, str]:
|
agent_dir = get_paths().agent_dir(name)
|
||||||
# Runs in a worker thread: resolving the base dir, probing the directory
|
|
||||||
# (`exists`), and removing it (`rmtree`) are all blocking filesystem IO
|
if not agent_dir.exists():
|
||||||
# that must stay off the event loop.
|
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
||||||
agent_dir = paths.user_agent_dir(user_id, name)
|
|
||||||
if not agent_dir.exists():
|
|
||||||
outcome = "legacy" if paths.agent_dir(name).exists() else "missing"
|
|
||||||
return outcome, str(agent_dir)
|
|
||||||
shutil.rmtree(agent_dir)
|
|
||||||
return "deleted", str(agent_dir)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
outcome, agent_dir = await asyncio.to_thread(_remove_agent_dir)
|
shutil.rmtree(agent_dir)
|
||||||
|
logger.info(f"Deleted agent '{name}' from {agent_dir}")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to delete agent '{name}': {e}", exc_info=True)
|
logger.error(f"Failed to delete agent '{name}': {e}", exc_info=True)
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to delete agent: {str(e)}")
|
raise HTTPException(status_code=500, detail=f"Failed to delete agent: {str(e)}")
|
||||||
|
|
||||||
if outcome == "legacy":
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=409,
|
|
||||||
detail=(f"Agent '{name}' only exists in the legacy shared layout and is not scoped to a user. Run scripts/migrate_user_isolation.py to move legacy agents into the per-user layout before deleting."),
|
|
||||||
)
|
|
||||||
if outcome == "missing":
|
|
||||||
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
|
|
||||||
|
|
||||||
logger.info(f"Deleted agent '{name}' from {agent_dir}")
|
|
||||||
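Review note: the left-hand side of the create endpoint claims the agent directory with `mkdir(..., exist_ok=False)` inside a worker thread, while the right-hand side checks `exists()` first and then calls `mkdir(..., exist_ok=True)` on the event loop. A minimal sketch of the left-hand pattern follows. `claim_agent_dir` and `create_agent` are hypothetical names, and only `SOUL.md` is written to keep the example short.

```python
import asyncio
import shutil
import tempfile
from pathlib import Path


def claim_agent_dir(agent_dir: Path, soul: str) -> bool:
    """Create the directory and write SOUL.md; return False if it already exists."""
    try:
        # exist_ok=False makes the existence check and the creation one step,
        # so two concurrent requests cannot both succeed.
        agent_dir.mkdir(parents=True, exist_ok=False)
    except FileExistsError:
        return False
    try:
        (agent_dir / "SOUL.md").write_text(soul, encoding="utf-8")
    except Exception:
        # Remove the partially created directory before surfacing the error.
        shutil.rmtree(agent_dir, ignore_errors=True)
        raise
    return True


async def create_agent(agent_dir: Path, soul: str) -> bool:
    # Blocking filesystem work runs in a worker thread, off the event loop.
    return await asyncio.to_thread(claim_agent_dir, agent_dir, soul)
```

A `False` return plays the role of the `None` sentinel in the diff, which the endpoint maps to HTTP 409. With the check-then-create form there is a window between `exists()` and `mkdir()` in which a second request can slip through.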
|
|||||||
@@ -20,9 +20,6 @@ ACTIVE_CONTENT_MIME_TYPES = {
|
|||||||
"image/svg+xml",
|
"image/svg+xml",
|
||||||
}
|
}
|
||||||
|
|
||||||
MAX_SKILL_ARCHIVE_MEMBER_BYTES = 16 * 1024 * 1024
|
|
||||||
_SKILL_ARCHIVE_READ_CHUNK_SIZE = 64 * 1024
|
|
||||||
|
|
||||||
|
|
||||||
def _build_content_disposition(disposition_type: str, filename: str) -> str:
|
def _build_content_disposition(disposition_type: str, filename: str) -> str:
|
||||||
"""Build an RFC 5987 encoded Content-Disposition header value."""
|
"""Build an RFC 5987 encoded Content-Disposition header value."""
|
||||||
@@ -47,22 +44,6 @@ def is_text_file_by_content(path: Path, sample_size: int = 8192) -> bool:
|
|||||||
return False
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _read_skill_archive_member(zip_ref: zipfile.ZipFile, info: zipfile.ZipInfo) -> bytes:
|
|
||||||
"""Read a .skill archive member while enforcing an uncompressed size cap."""
|
|
||||||
if info.file_size > MAX_SKILL_ARCHIVE_MEMBER_BYTES:
|
|
||||||
raise HTTPException(status_code=413, detail="Skill archive member is too large to preview")
|
|
||||||
|
|
||||||
chunks: list[bytes] = []
|
|
||||||
total_read = 0
|
|
||||||
with zip_ref.open(info, "r") as src:
|
|
||||||
while chunk := src.read(_SKILL_ARCHIVE_READ_CHUNK_SIZE):
|
|
||||||
total_read += len(chunk)
|
|
||||||
if total_read > MAX_SKILL_ARCHIVE_MEMBER_BYTES:
|
|
||||||
raise HTTPException(status_code=413, detail="Skill archive member is too large to preview")
|
|
||||||
chunks.append(chunk)
|
|
||||||
return b"".join(chunks)
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
|
def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
|
||||||
"""Extract a file from a .skill ZIP archive.
|
"""Extract a file from a .skill ZIP archive.
|
||||||
|
|
||||||
@@ -79,16 +60,16 @@ def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> byte
|
|||||||
try:
|
try:
|
||||||
with zipfile.ZipFile(zip_path, "r") as zip_ref:
|
with zipfile.ZipFile(zip_path, "r") as zip_ref:
|
||||||
# List all files in the archive
|
# List all files in the archive
|
||||||
infos_by_name = {info.filename: info for info in zip_ref.infolist()}
|
namelist = zip_ref.namelist()
|
||||||
|
|
||||||
# Try direct path first
|
# Try direct path first
|
||||||
if internal_path in infos_by_name:
|
if internal_path in namelist:
|
||||||
return _read_skill_archive_member(zip_ref, infos_by_name[internal_path])
|
return zip_ref.read(internal_path)
|
||||||
|
|
||||||
# Try with any top-level directory prefix (e.g., "skill-name/SKILL.md")
|
# Try with any top-level directory prefix (e.g., "skill-name/SKILL.md")
|
||||||
for name, info in infos_by_name.items():
|
for name in namelist:
|
||||||
if name.endswith("/" + internal_path) or name == internal_path:
|
if name.endswith("/" + internal_path) or name == internal_path:
|
||||||
return _read_skill_archive_member(zip_ref, info)
|
return zip_ref.read(name)
|
||||||
|
|
||||||
# Not found
|
# Not found
|
||||||
return None
|
return None
|
||||||
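Review note: the left-hand side reads `.skill` archive members through a size-capped, chunked reader, and the right-hand side calls `zip_ref.read(...)` with no limit. The sketch below shows the capped read using only the standard library. `MemberTooLarge` and `read_member_capped` are hypothetical names; the code in the diff raises an HTTP 413 instead of a custom exception.

```python
import io
import zipfile

MAX_MEMBER_BYTES = 16 * 1024 * 1024  # same cap as the constant in the diff
READ_CHUNK_SIZE = 64 * 1024


class MemberTooLarge(Exception):
    """Raised when a member exceeds the uncompressed size cap."""


def read_member_capped(
    zip_ref: zipfile.ZipFile,
    info: zipfile.ZipInfo,
    max_bytes: int = MAX_MEMBER_BYTES,
) -> bytes:
    # Cheap early rejection based on the size declared in the archive header.
    if info.file_size > max_bytes:
        raise MemberTooLarge(info.filename)

    chunks: list[bytes] = []
    total_read = 0
    with zip_ref.open(info, "r") as src:
        while chunk := src.read(READ_CHUNK_SIZE):
            total_read += len(chunk)
            # The declared size may be wrong, so the cap is enforced
            # again on the bytes actually decompressed.
            if total_read > max_bytes:
                raise MemberTooLarge(info.filename)
            chunks.append(chunk)
    return b"".join(chunks)
```

Checking both the declared size and the running total is what bounds memory use against a highly compressed member; a plain `read()` decompresses the whole member before any limit could apply.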
|
|||||||
@@ -1,6 +1,5 @@
|
|||||||
"""Authentication endpoints."""
|
"""Authentication endpoints."""
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
import time
|
import time
|
||||||
@@ -306,7 +305,7 @@ async def login_local(
|
|||||||
async def register(request: Request, response: Response, body: RegisterRequest):
|
async def register(request: Request, response: Response, body: RegisterRequest):
|
||||||
"""Register a new user account (always 'user' role).
|
"""Register a new user account (always 'user' role).
|
||||||
|
|
||||||
The first admin is created explicitly through /initialize. This endpoint creates regular users.
|
Admin is auto-created on first boot. This endpoint creates regular users.
|
||||||
Auto-login by setting the session cookie.
|
Auto-login by setting the session cookie.
|
||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
@@ -383,15 +382,9 @@ async def get_me(request: Request):
|
|||||||
return UserResponse(id=str(user.id), email=user.email, system_role=user.system_role, needs_setup=user.needs_setup)
|
return UserResponse(id=str(user.id), email=user.email, system_role=user.system_role, needs_setup=user.needs_setup)
|
||||||
|
|
||||||
|
|
||||||
# Per-IP cache: ip → (timestamp, result_dict).
|
_SETUP_STATUS_COOLDOWN: dict[str, float] = {}
|
||||||
# Returns the cached result within the TTL instead of 429, because
|
_SETUP_STATUS_COOLDOWN_SECONDS = 60
|
||||||
# the answer (whether an admin exists) rarely changes and returning
|
|
||||||
# 429 breaks multi-tab / post-restart reconnection storms.
|
|
||||||
_SETUP_STATUS_CACHE: dict[str, tuple[float, dict]] = {}
|
|
||||||
_SETUP_STATUS_CACHE_TTL_SECONDS = 60
|
|
||||||
_MAX_TRACKED_SETUP_STATUS_IPS = 10000
|
_MAX_TRACKED_SETUP_STATUS_IPS = 10000
|
||||||
_SETUP_STATUS_INFLIGHT: dict[str, asyncio.Task[dict]] = {}
|
|
||||||
_SETUP_STATUS_INFLIGHT_GUARD = asyncio.Lock()
|
|
||||||
|
|
||||||
|
|
||||||
@router.get("/setup-status")
|
@router.get("/setup-status")
|
||||||
@@ -399,56 +392,29 @@ async def setup_status(request: Request):
|
|||||||
"""Check if an admin account exists. Returns needs_setup=True when no admin exists."""
|
"""Check if an admin account exists. Returns needs_setup=True when no admin exists."""
|
||||||
client_ip = _get_client_ip(request)
|
client_ip = _get_client_ip(request)
|
||||||
now = time.time()
|
now = time.time()
|
||||||
|
last_check = _SETUP_STATUS_COOLDOWN.get(client_ip, 0)
|
||||||
# Return cached result when within TTL — avoids 429 on multi-tab reconnection.
|
elapsed = now - last_check
|
||||||
cached = _SETUP_STATUS_CACHE.get(client_ip)
|
if elapsed < _SETUP_STATUS_COOLDOWN_SECONDS:
|
||||||
if cached is not None:
|
retry_after = max(1, int(_SETUP_STATUS_COOLDOWN_SECONDS - elapsed))
|
||||||
cached_time, cached_result = cached
|
raise HTTPException(
|
||||||
if now - cached_time < _SETUP_STATUS_CACHE_TTL_SECONDS:
|
status_code=status.HTTP_429_TOO_MANY_REQUESTS,
|
||||||
return cached_result
|
detail="Setup status check is rate limited",
|
||||||
|
headers={"Retry-After": str(retry_after)},
|
||||||
async with _SETUP_STATUS_INFLIGHT_GUARD:
|
)
|
||||||
# Recheck cache after waiting for the inflight guard.
|
# Evict stale entries when dict grows too large to bound memory usage.
|
||||||
now = time.time()
|
if len(_SETUP_STATUS_COOLDOWN) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
||||||
cached = _SETUP_STATUS_CACHE.get(client_ip)
|
cutoff = now - _SETUP_STATUS_COOLDOWN_SECONDS
|
||||||
if cached is not None:
|
stale = [k for k, t in _SETUP_STATUS_COOLDOWN.items() if t < cutoff]
|
||||||
cached_time, cached_result = cached
|
for k in stale:
|
||||||
if now - cached_time < _SETUP_STATUS_CACHE_TTL_SECONDS:
|
del _SETUP_STATUS_COOLDOWN[k]
|
||||||
return cached_result
|
# If still too large after evicting expired entries, remove oldest half.
|
||||||
|
if len(_SETUP_STATUS_COOLDOWN) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
||||||
task = _SETUP_STATUS_INFLIGHT.get(client_ip)
|
by_time = sorted(_SETUP_STATUS_COOLDOWN.items(), key=lambda kv: kv[1])
|
||||||
if task is None:
|
for k, _ in by_time[: len(by_time) // 2]:
|
||||||
# Evict stale entries when dict grows too large to bound memory usage.
|
del _SETUP_STATUS_COOLDOWN[k]
|
||||||
if len(_SETUP_STATUS_CACHE) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
_SETUP_STATUS_COOLDOWN[client_ip] = now
|
||||||
cutoff = now - _SETUP_STATUS_CACHE_TTL_SECONDS
|
admin_count = await get_local_provider().count_admin_users()
|
||||||
stale = [k for k, (t, _) in _SETUP_STATUS_CACHE.items() if t < cutoff]
|
return {"needs_setup": admin_count == 0}
|
||||||
for k in stale:
|
|
||||||
del _SETUP_STATUS_CACHE[k]
|
|
||||||
if len(_SETUP_STATUS_CACHE) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
|
||||||
by_time = sorted(_SETUP_STATUS_CACHE.items(), key=lambda entry: entry[1][0])
|
|
||||||
for k, _ in by_time[: len(by_time) // 2]:
|
|
||||||
del _SETUP_STATUS_CACHE[k]
|
|
||||||
|
|
||||||
async def _compute_setup_status() -> dict:
|
|
||||||
admin_count = await get_local_provider().count_admin_users()
|
|
||||||
return {"needs_setup": admin_count == 0}
|
|
||||||
|
|
||||||
task = asyncio.create_task(_compute_setup_status())
|
|
||||||
_SETUP_STATUS_INFLIGHT[client_ip] = task
|
|
||||||
|
|
||||||
try:
|
|
||||||
result = await task
|
|
||||||
finally:
|
|
||||||
async with _SETUP_STATUS_INFLIGHT_GUARD:
|
|
||||||
if _SETUP_STATUS_INFLIGHT.get(client_ip) is task:
|
|
||||||
del _SETUP_STATUS_INFLIGHT[client_ip]
|
|
||||||
|
|
||||||
# Cache only the stable "initialized" result to avoid stale setup redirects.
|
|
||||||
if result["needs_setup"] is False:
|
|
||||||
_SETUP_STATUS_CACHE[client_ip] = (time.time(), result)
|
|
||||||
else:
|
|
||||||
_SETUP_STATUS_CACHE.pop(client_ip, None)
|
|
||||||
return result
|
|
||||||
|
|
||||||
|
|
||||||
class InitializeAdminRequest(BaseModel):
|
class InitializeAdminRequest(BaseModel):
|
||||||
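Review note: the two sides of `/setup-status` differ in what a repeat caller sees within 60 seconds. The left-hand side returns a cached answer, and the right-hand side raises HTTP 429 with a `Retry-After` header. A minimal, synchronous sketch of the caching side follows, including the bounded eviction and the rule that only the stable `needs_setup == False` result is stored. `SetupStatusCache` and its injectable `clock` are assumptions for illustration; the diff uses module-level dicts plus an in-flight task map that is omitted here.

```python
import time


class SetupStatusCache:
    """Per-key TTL cache that stores only the stable 'initialized' result."""

    def __init__(self, ttl_seconds=60.0, max_entries=10000, clock=time.time):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._entries = {}  # key -> (stored_at, result)

    def __len__(self):
        return len(self._entries)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        return result if self._clock() - stored_at < self._ttl else None

    def put(self, key, result):
        if result["needs_setup"] is False:
            self._evict_if_full()
            self._entries[key] = (self._clock(), result)
        else:
            # Never cache "needs setup": it can flip as soon as an admin exists.
            self._entries.pop(key, None)

    def _evict_if_full(self):
        if len(self._entries) < self._max:
            return
        cutoff = self._clock() - self._ttl
        for key in [k for k, (t, _) in self._entries.items() if t < cutoff]:
            del self._entries[key]
        if len(self._entries) >= self._max:
            # Still full after dropping expired entries: drop the oldest half.
            by_time = sorted(self._entries.items(), key=lambda kv: kv[1][0])
            for key, _ in by_time[: len(by_time) // 2]:
                del self._entries[key]
```

Serving the cached answer keeps multi-tab reloads and post-restart reconnects from failing, at the cost of a result that may be up to one TTL old. The 429 variant is simpler but pushes retry handling onto every client.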
|
|||||||
@@ -1,10 +1,9 @@
|
|||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import os
|
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Literal
|
from typing import Literal
|
||||||
|
|
||||||
from fastapi import APIRouter, HTTPException, Request, status
|
from fastapi import APIRouter, HTTPException
|
||||||
from pydantic import BaseModel, Field
|
from pydantic import BaseModel, Field
|
||||||
|
|
||||||
from deerflow.config.extensions_config import ExtensionsConfig, get_extensions_config, reload_extensions_config
|
from deerflow.config.extensions_config import ExtensionsConfig, get_extensions_config, reload_extensions_config
|
||||||
@@ -13,11 +12,6 @@ logger = logging.getLogger(__name__)
|
|||||||
router = APIRouter(prefix="/api", tags=["mcp"])
|
router = APIRouter(prefix="/api", tags=["mcp"])
|
||||||
|
|
||||||
|
|
||||||
_MCP_STDIO_COMMAND_ALLOWLIST_ENV = "DEER_FLOW_MCP_STDIO_COMMAND_ALLOWLIST"
|
|
||||||
_DEFAULT_MCP_STDIO_COMMAND_ALLOWLIST = frozenset({"npx", "uvx"})
|
|
||||||
_SHELL_METACHARS = frozenset(";|&`$<>\n\r")
|
|
||||||
|
|
||||||
|
|
||||||
class McpOAuthConfigResponse(BaseModel):
|
class McpOAuthConfigResponse(BaseModel):
|
||||||
"""OAuth configuration for an MCP server."""
|
"""OAuth configuration for an MCP server."""
|
||||||
|
|
||||||
@@ -69,178 +63,13 @@ class McpConfigUpdateRequest(BaseModel):
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
_MASKED_VALUE = "***"
|
|
||||||
|
|
||||||
|
|
||||||
async def _require_admin_user(request: Request) -> None:
|
|
||||||
"""Require the authenticated caller to be an admin user.
|
|
||||||
|
|
||||||
``AuthMiddleware`` normally stamps ``request.state.user`` before the
|
|
||||||
request reaches this router. Falling back to the strict dependency keeps
|
|
||||||
this route safe even in tests or alternative ASGI compositions that mount
|
|
||||||
the router without the global middleware.
|
|
||||||
"""
|
|
||||||
user = getattr(request.state, "user", None)
|
|
||||||
if user is None:
|
|
||||||
from app.gateway.deps import get_current_user_from_request
|
|
||||||
|
|
||||||
user = await get_current_user_from_request(request)
|
|
||||||
|
|
||||||
if getattr(user, "system_role", None) != "admin":
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN,
|
|
||||||
detail="Admin privileges required to manage MCP configuration.",
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _allowed_stdio_commands() -> set[str]:
|
|
||||||
"""Return executable names allowed for API-managed stdio MCP servers."""
|
|
||||||
raw = os.environ.get(_MCP_STDIO_COMMAND_ALLOWLIST_ENV)
|
|
||||||
base = set(_DEFAULT_MCP_STDIO_COMMAND_ALLOWLIST)
|
|
||||||
if raw is None:
|
|
||||||
return base
|
|
||||||
extra = {item.strip() for item in raw.split(",") if item.strip()}
|
|
||||||
return base | extra
|
|
||||||
|
|
||||||
|
|
||||||
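The allowlist logic in `_allowed_stdio_commands` can be tried without the gateway. The sketch below mirrors the diff, with one assumption: the `environ` parameter is hypothetical and exists only so the function can be exercised without changing the real process environment.

```python
import os

ALLOWLIST_ENV = "DEER_FLOW_MCP_STDIO_COMMAND_ALLOWLIST"
DEFAULT_ALLOWLIST = frozenset({"npx", "uvx"})


def allowed_stdio_commands(environ=None):
    """Return the default commands plus comma-separated extras from the env."""
    env = os.environ if environ is None else environ  # `environ` is illustrative
    raw = env.get(ALLOWLIST_ENV)
    base = set(DEFAULT_ALLOWLIST)
    if raw is None:
        return base
    # Blank entries (for example from a trailing comma) are ignored.
    extra = {item.strip() for item in raw.split(",") if item.strip()}
    return base | extra
```

The environment variable only extends the defaults. In this version, setting it to an empty string still leaves `npx` and `uvx` allowed.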
def _stdio_command_name(command: str | None, *, server_name: str) -> str:
|
|
||||||
"""Normalize and validate a stdio command field from the API boundary."""
|
|
||||||
if command is None or not command.strip():
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail=f"MCP server '{server_name}' with stdio transport requires a command.",
|
|
||||||
)
|
|
||||||
|
|
||||||
stripped = command.strip()
|
|
||||||
has_path_separator = "/" in stripped or "\\" in stripped
|
|
||||||
if stripped != command or has_path_separator or any(ch.isspace() for ch in stripped) or any(ch in stripped for ch in _SHELL_METACHARS):
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail=(f"MCP server '{server_name}' command must be a single executable name; put parameters in args instead."),
|
|
||||||
)
|
|
||||||
|
|
||||||
return stripped
|
|
||||||
|
|
||||||
|
|
||||||
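The checks in `_stdio_command_name` accept a bare executable name and nothing else. The sketch below keeps the same conditions but raises `ValueError` instead of FastAPI's `HTTPException`, an assumption made only so the example has no third-party dependencies.

```python
SHELL_METACHARS = frozenset(";|&`$<>\n\r")


def stdio_command_name(command, *, server_name):
    """Return the command if it is a single bare executable name, else raise."""
    if command is None or not command.strip():
        raise ValueError(f"MCP server '{server_name}' requires a command.")

    stripped = command.strip()
    has_path_separator = "/" in stripped or "\\" in stripped
    if (
        stripped != command  # leading or trailing whitespace
        or has_path_separator  # absolute or relative paths
        or any(ch.isspace() for ch in stripped)  # inline arguments
        or any(ch in stripped for ch in SHELL_METACHARS)  # shell syntax
    ):
        raise ValueError(
            f"MCP server '{server_name}' command must be a single executable name; "
            "put parameters in args instead."
        )
    return stripped
```

With these rules `"npx"` passes, while `"npx -y pkg"`, `"/usr/bin/npx"`, `" npx"` and `"npx;ls"` are all rejected.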
def _validate_mcp_update_request(request: McpConfigUpdateRequest) -> None:
|
|
||||||
"""Validate API-submitted MCP config before it is persisted.
|
|
||||||
|
|
||||||
Local config files can still express arbitrary advanced setups, but the
|
|
||||||
HTTP API is an untrusted boundary. Restricting stdio commands here reduces
|
|
||||||
the blast radius of a compromised authenticated browser session.
|
|
||||||
"""
|
|
||||||
allowed_commands = _allowed_stdio_commands()
|
|
||||||
for name, server in request.mcp_servers.items():
|
|
||||||
transport_type = (server.type or "stdio").lower()
|
|
||||||
if transport_type != "stdio":
|
|
||||||
continue
|
|
||||||
|
|
||||||
command_name = _stdio_command_name(server.command, server_name=name)
|
|
||||||
if command_name not in allowed_commands:
|
|
||||||
allowed = ", ".join(sorted(allowed_commands)) or "<none>"
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail=(f"MCP server '{name}' uses disallowed stdio command '{command_name}'. Allowed commands: {allowed}. Configure {_MCP_STDIO_COMMAND_ALLOWLIST_ENV} to extend this list."),
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _mask_server_config(server: McpServerConfigResponse) -> McpServerConfigResponse:
|
|
||||||
"""Return a copy of server config with sensitive fields masked.
|
|
||||||
|
|
||||||
Masks env values, header values, and removes OAuth secrets so they
|
|
||||||
are not exposed through the GET API endpoint.
|
|
||||||
"""
|
|
||||||
masked_env = {k: _MASKED_VALUE for k in server.env}
|
|
||||||
masked_headers = {k: _MASKED_VALUE for k in server.headers}
|
|
||||||
masked_oauth = None
|
|
||||||
if server.oauth is not None:
|
|
||||||
masked_oauth = server.oauth.model_copy(
|
|
||||||
update={
|
|
||||||
"client_secret": None,
|
|
||||||
"refresh_token": None,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
return server.model_copy(
|
|
||||||
update={
|
|
||||||
"env": masked_env,
|
|
||||||
"headers": masked_headers,
|
|
||||||
"oauth": masked_oauth,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
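The masking step in `_mask_server_config` can be illustrated with standard-library dataclasses in place of the Pydantic models. The `Server` and `OAuth` classes below are hypothetical simplifications, and `client_id` is an assumed non-secret field. Only `env`, `headers`, `client_secret` and `refresh_token` come from the diff.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional

MASKED_VALUE = "***"


@dataclass(frozen=True)
class OAuth:  # hypothetical stand-in for the OAuth config model
    client_id: str
    client_secret: Optional[str] = None
    refresh_token: Optional[str] = None


@dataclass(frozen=True)
class Server:  # hypothetical stand-in for McpServerConfigResponse
    env: Dict[str, str] = field(default_factory=dict)
    headers: Dict[str, str] = field(default_factory=dict)
    oauth: Optional[OAuth] = None


def mask_server(server: Server) -> Server:
    """Return a copy with env/header values masked and OAuth secrets removed."""
    masked_oauth = None
    if server.oauth is not None:
        masked_oauth = replace(server.oauth, client_secret=None, refresh_token=None)
    return replace(
        server,
        env={key: MASKED_VALUE for key in server.env},
        headers={key: MASKED_VALUE for key in server.headers},
        oauth=masked_oauth,
    )
```

Keys stay visible so a client can still see which variables are configured. Only the values are hidden, and the original object is left unchanged.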
def _merge_preserving_secrets(
|
|
||||||
incoming: McpServerConfigResponse,
|
|
||||||
existing: McpServerConfigResponse,
|
|
||||||
) -> McpServerConfigResponse:
|
|
||||||
"""Merge incoming config with existing, preserving secrets masked by GET.
|
|
||||||
|
|
||||||
When the frontend toggles ``enabled`` it round-trips the full config:
|
|
||||||
GET (masked) → modify enabled → PUT (masked values sent back).
|
|
||||||
This function ensures masked values (``***``) are replaced with the
|
|
||||||
real secrets from the current on-disk config.
|
|
||||||
|
|
||||||
``***`` is only accepted for keys that already exist in *existing*.
|
|
||||||
New keys must provide a real value.
|
|
||||||
|
|
||||||
For OAuth secrets, ``None`` means "preserve the existing stored value"
|
|
||||||
so masked GET responses can be safely round-tripped. To explicitly clear
|
|
||||||
a stored secret, clients may send an empty string, which is converted
|
|
||||||
to ``None`` before persisting.
|
|
||||||
"""
|
|
||||||
merged_env = {}
|
|
||||||
for k, v in incoming.env.items():
|
|
||||||
if v == _MASKED_VALUE:
|
|
||||||
if k in existing.env:
|
|
||||||
merged_env[k] = existing.env[k]
|
|
||||||
else:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=400,
|
|
||||||
detail=f"Cannot set env key '{k}' to masked value '***'; provide a real value.",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_env[k] = v
|
|
||||||
|
|
||||||
merged_headers = {}
|
|
||||||
for k, v in incoming.headers.items():
|
|
||||||
if v == _MASKED_VALUE:
|
|
||||||
if k in existing.headers:
|
|
||||||
merged_headers[k] = existing.headers[k]
|
|
||||||
else:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=400,
|
|
||||||
detail=f"Cannot set header '{k}' to masked value '***'; provide a real value.",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_headers[k] = v
|
|
||||||
|
|
||||||
merged_oauth = incoming.oauth
|
|
||||||
if incoming.oauth is not None and existing.oauth is not None:
|
|
||||||
# None = preserve (masked round-trip), "" = explicitly clear, else = new value
|
|
||||||
merged_client_secret = existing.oauth.client_secret if incoming.oauth.client_secret is None else (None if incoming.oauth.client_secret == "" else incoming.oauth.client_secret)
|
|
||||||
merged_refresh_token = existing.oauth.refresh_token if incoming.oauth.refresh_token is None else (None if incoming.oauth.refresh_token == "" else incoming.oauth.refresh_token)
|
|
||||||
merged_oauth = incoming.oauth.model_copy(
|
|
||||||
update={
|
|
||||||
"client_secret": merged_client_secret,
|
|
||||||
"refresh_token": merged_refresh_token,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
return incoming.model_copy(
|
|
||||||
update={
|
|
||||||
"env": merged_env,
|
|
||||||
"headers": merged_headers,
|
|
||||||
"oauth": merged_oauth,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
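The round-trip rules in the `_merge_preserving_secrets` docstring come down to two small decisions, sketched here on plain dicts and strings. The helper names `merge_masked` and `merge_secret` are hypothetical, and `ValueError` stands in for the HTTP 400 response raised in the diff.

```python
MASKED_VALUE = "***"


def merge_masked(incoming, existing, *, kind):
    """Replace '***' with the stored value; reject '***' for keys not yet stored."""
    merged = {}
    for key, value in incoming.items():
        if value != MASKED_VALUE:
            merged[key] = value  # a real value always wins
        elif key in existing:
            merged[key] = existing[key]  # masked round-trip: keep the stored secret
        else:
            raise ValueError(
                f"Cannot set {kind} '{key}' to masked value '***'; provide a real value."
            )
    return merged


def merge_secret(incoming, existing):
    """None preserves the stored secret, '' clears it, anything else replaces it."""
    if incoming is None:
        return existing
    return None if incoming == "" else incoming
```

For example, `merge_masked({"TOKEN": "***"}, {"TOKEN": "real"}, kind="env key")` gives back the stored value, and a key missing from `incoming` is dropped from the result.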
@router.get(
|
@router.get(
|
||||||
"/mcp/config",
|
"/mcp/config",
|
||||||
response_model=McpConfigResponse,
|
response_model=McpConfigResponse,
|
||||||
summary="Get MCP Configuration",
|
summary="Get MCP Configuration",
|
||||||
description="Retrieve the current Model Context Protocol (MCP) server configurations.",
|
description="Retrieve the current Model Context Protocol (MCP) server configurations.",
|
||||||
)
|
)
|
||||||
async def get_mcp_configuration(request: Request) -> McpConfigResponse:
|
async def get_mcp_configuration() -> McpConfigResponse:
|
||||||
"""Get the current MCP configuration.
|
"""Get the current MCP configuration.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
@@ -254,19 +83,16 @@ async def get_mcp_configuration(request: Request) -> McpConfigResponse:
|
|||||||
"enabled": true,
|
"enabled": true,
|
||||||
"command": "npx",
|
"command": "npx",
|
||||||
"args": ["-y", "@modelcontextprotocol/server-github"],
|
"args": ["-y", "@modelcontextprotocol/server-github"],
|
||||||
"env": {"GITHUB_TOKEN": "***"},
|
"env": {"GITHUB_TOKEN": "ghp_xxx"},
|
||||||
"description": "GitHub MCP server for repository operations"
|
"description": "GitHub MCP server for repository operations"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
"""
|
"""
|
||||||
await _require_admin_user(request)
|
|
||||||
|
|
||||||
config = get_extensions_config()
|
config = get_extensions_config()
|
||||||
|
|
||||||
servers = {name: _mask_server_config(McpServerConfigResponse(**server.model_dump())) for name, server in config.mcp_servers.items()}
|
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in config.mcp_servers.items()})
|
||||||
return McpConfigResponse(mcp_servers=servers)
|
|
||||||
|
|
||||||
|
|
||||||
@router.put(
|
@router.put(
|
||||||
@@ -275,7 +101,7 @@ async def get_mcp_configuration(request: Request) -> McpConfigResponse:
|
|||||||
summary="Update MCP Configuration",
|
summary="Update MCP Configuration",
|
||||||
description="Update Model Context Protocol (MCP) server configurations and save to file.",
|
description="Update Model Context Protocol (MCP) server configurations and save to file.",
|
||||||
)
|
)
|
||||||
async def update_mcp_configuration(request: Request, body: McpConfigUpdateRequest) -> McpConfigResponse:
|
async def update_mcp_configuration(request: McpConfigUpdateRequest) -> McpConfigResponse:
|
||||||
"""Update the MCP configuration.
|
"""Update the MCP configuration.
|
||||||
|
|
||||||
This will:
|
This will:
|
||||||
@@ -308,9 +134,6 @@ async def update_mcp_configuration(request: Request, body: McpConfigUpdateReques
|
|||||||
```
|
```
|
||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
await _require_admin_user(request)
|
|
||||||
_validate_mcp_update_request(body)
|
|
||||||
|
|
||||||
# Get the current config path (or determine where to save it)
|
# Get the current config path (or determine where to save it)
|
||||||
config_path = ExtensionsConfig.resolve_config_path()
|
config_path = ExtensionsConfig.resolve_config_path()
|
||||||
|
|
||||||
@@ -319,39 +142,14 @@ async def update_mcp_configuration(request: Request, body: McpConfigUpdateReques
|
|||||||
config_path = Path.cwd().parent / "extensions_config.json"
|
config_path = Path.cwd().parent / "extensions_config.json"
|
||||||
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
|
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
|
||||||
|
|
||||||
# Load current config to preserve skills
|
# Load current config to preserve skills configuration
|
||||||
current_config = get_extensions_config()
|
current_config = get_extensions_config()
|
||||||
|
|
||||||
# Load raw (un-resolved) JSON from disk to use as the merge source.
|
# Convert request to dict format for JSON serialization
|
||||||
# This preserves $VAR placeholders in env values and top-level keys
|
config_data = {
|
||||||
# like mcpInterceptors that would otherwise be lost.
|
"mcpServers": {name: server.model_dump() for name, server in request.mcp_servers.items()},
|
||||||
raw_servers: dict[str, dict] = {}
|
"skills": {name: {"enabled": skill.enabled} for name, skill in current_config.skills.items()},
|
||||||
raw_other_keys: dict = {}
|
}
|
||||||
if config_path is not None and config_path.exists():
|
|
||||||
with open(config_path, encoding="utf-8") as f:
|
|
||||||
raw_data = json.load(f)
|
|
||||||
raw_servers = raw_data.get("mcpServers", {})
|
|
||||||
# Preserve any top-level keys beyond mcpServers/skills
|
|
||||||
for key, value in raw_data.items():
|
|
||||||
if key not in ("mcpServers", "skills"):
|
|
||||||
raw_other_keys[key] = value
|
|
||||||
|
|
||||||
# Merge incoming server configs with raw on-disk secrets
|
|
||||||
merged_servers: dict[str, McpServerConfigResponse] = {}
|
|
||||||
for name, incoming in body.mcp_servers.items():
|
|
||||||
raw_server = raw_servers.get(name)
|
|
||||||
if raw_server is not None:
|
|
||||||
merged_servers[name] = _merge_preserving_secrets(
|
|
||||||
incoming,
|
|
||||||
McpServerConfigResponse(**raw_server),
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_servers[name] = incoming
|
|
||||||
|
|
||||||
# Build config data preserving all top-level keys from the original file
|
|
||||||
config_data = dict(raw_other_keys)
|
|
||||||
config_data["mcpServers"] = {name: server.model_dump() for name, server in merged_servers.items()}
|
|
||||||
config_data["skills"] = {name: {"enabled": skill.enabled} for name, skill in current_config.skills.items()}
|
|
||||||
|
|
||||||
# Write the configuration to file
|
# Write the configuration to file
|
||||||
with open(config_path, "w", encoding="utf-8") as f:
|
with open(config_path, "w", encoding="utf-8") as f:
|
||||||
@@ -359,15 +157,13 @@ async def update_mcp_configuration(request: Request, body: McpConfigUpdateReques
|
|||||||
|
|
||||||
logger.info(f"MCP configuration updated and saved to: {config_path}")
|
logger.info(f"MCP configuration updated and saved to: {config_path}")
|
||||||
|
|
||||||
# Reload the Gateway configuration and update the global cache. The
|
# NOTE: No need to reload/reset cache here - LangGraph Server (separate process)
|
||||||
# agent runtime lives in Gateway, so this keeps API reads and tool
|
# will detect config file changes via mtime and reinitialize MCP tools automatically
|
||||||
# execution aligned after extensions_config.json changes.
|
|
||||||
reloaded_config = reload_extensions_config()
|
# Reload the configuration and update the global cache
|
||||||
servers = {name: _mask_server_config(McpServerConfigResponse(**server.model_dump())) for name, server in reloaded_config.mcp_servers.items()}
|
reloaded_config = reload_extensions_config()
|
||||||
return McpConfigResponse(mcp_servers=servers)
|
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in reloaded_config.mcp_servers.items()})
|
||||||
|
|
||||||
except HTTPException:
|
|
||||||
raise
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to update MCP configuration: {e}", exc_info=True)
|
logger.error(f"Failed to update MCP configuration: {e}", exc_info=True)
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to update MCP configuration: {str(e)}")
|
raise HTTPException(status_code=500, detail=f"Failed to update MCP configuration: {str(e)}")
|
||||||
|
|||||||
@@ -7,6 +7,7 @@ is reused so that conversation history is preserved across calls.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import uuid
|
import uuid
|
||||||
|
|
||||||
@@ -15,9 +16,8 @@ from fastapi.responses import StreamingResponse
|
|||||||
|
|
||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
from app.gateway.deps import get_checkpointer, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
from app.gateway.deps import get_checkpointer, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
||||||
from app.gateway.pagination import trim_run_message_page
|
|
||||||
from app.gateway.routers.thread_runs import RunCreateRequest
|
from app.gateway.routers.thread_runs import RunCreateRequest
|
||||||
from app.gateway.services import sse_consumer, start_run, wait_for_run_completion
|
from app.gateway.services import sse_consumer, start_run
|
||||||
from deerflow.runtime import serialize_channel_values
|
from deerflow.runtime import serialize_channel_values
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -66,25 +66,24 @@ async def stateless_wait(body: RunCreateRequest, request: Request) -> dict:
|
|||||||
Otherwise a new temporary thread is created.
|
Otherwise a new temporary thread is created.
|
||||||
"""
|
"""
|
||||||
thread_id = _resolve_thread_id(body)
|
thread_id = _resolve_thread_id(body)
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
run_mgr = get_run_manager(request)
|
|
||||||
record = await start_run(body, thread_id, request)
|
record = await start_run(body, thread_id, request)
|
||||||
|
|
||||||
completed = True
|
|
||||||
if record.task is not None:
|
if record.task is not None:
|
||||||
completed = await wait_for_run_completion(bridge, record, request, run_mgr)
|
|
||||||
|
|
||||||
if completed:
|
|
||||||
checkpointer = get_checkpointer(request)
|
|
||||||
config = {"configurable": {"thread_id": thread_id}}
|
|
||||||
try:
|
try:
|
||||||
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
await record.task
|
||||||
if checkpoint_tuple is not None:
|
except asyncio.CancelledError:
|
||||||
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
pass
|
||||||
channel_values = checkpoint.get("channel_values", {})
|
|
||||||
return serialize_channel_values(channel_values)
|
checkpointer = get_checkpointer(request)
|
||||||
except Exception:
|
config = {"configurable": {"thread_id": thread_id}}
|
||||||
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
try:
|
||||||
|
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
||||||
|
if checkpoint_tuple is not None:
|
||||||
|
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
||||||
|
channel_values = checkpoint.get("channel_values", {})
|
||||||
|
return serialize_channel_values(channel_values)
|
||||||
|
except Exception:
|
||||||
|
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
||||||
|
|
||||||
return {"status": record.status.value, "error": record.error}
|
return {"status": record.status.value, "error": record.error}
|
||||||
|
|
||||||
@@ -130,7 +129,8 @@ async def run_messages(
|
|||||||
before_seq=before_seq,
|
before_seq=before_seq,
|
||||||
after_seq=after_seq,
|
after_seq=after_seq,
|
||||||
)
|
)
|
||||||
data, has_more = trim_run_message_page(rows, limit=limit, after_seq=after_seq)
|
has_more = len(rows) > limit
|
||||||
|
data = rows[:limit] if has_more else rows
|
||||||
return {"data": data, "has_more": has_more}
|
return {"data": data, "has_more": has_more}
|
||||||
|
|
||||||
|
|
||||||
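The inline pagination on one side of this hunk (`has_more = len(rows) > limit`) follows a common pattern: fetch one more row than the page size and use the extra row only as a "more results exist" signal. The sketch below assumes the caller fetched up to `limit + 1` rows, which this hunk does not show. `trim_page` is a hypothetical name and is not the `trim_run_message_page` helper, whose handling of `after_seq` is not visible here.

```python
def trim_page(rows, limit):
    """Return (page, has_more) given rows fetched with a limit of `limit + 1`."""
    has_more = len(rows) > limit
    data = rows[:limit] if has_more else rows
    return data, has_more
```

A fetch of three rows with `limit=2` yields the first two rows and `has_more=True`. A fetch of exactly two rows yields both rows and `has_more=False`.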
|
|||||||
@@ -1,6 +1,5 @@
|
|||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import re
|
|
||||||
|
|
||||||
from fastapi import APIRouter, Depends, Request
|
from fastapi import APIRouter, Depends, Request
|
||||||
from langchain_core.messages import HumanMessage, SystemMessage
|
from langchain_core.messages import HumanMessage, SystemMessage
|
||||||
@@ -31,31 +30,6 @@ class SuggestionsResponse(BaseModel):
|
|||||||
suggestions: list[str] = Field(default_factory=list, description="Suggested follow-up questions")
|
suggestions: list[str] = Field(default_factory=list, description="Suggested follow-up questions")
|
||||||
|
|
||||||
|
|
||||||
# Matches a complete <think>...</think> block (case-insensitive, spans newlines).
|
|
||||||
_THINK_BLOCK_RE = re.compile(r"<think\b[^>]*>.*?</think\s*>", re.IGNORECASE | re.DOTALL)
|
|
||||||
# Matches a dangling, unclosed <think> (model truncated at max_tokens mid-thought).
|
|
||||||
_OPEN_THINK_RE = re.compile(r"<think\b[^>]*>", re.IGNORECASE)
|
|
||||||
|
|
||||||
|
|
||||||
def _strip_think_blocks(text: str) -> str:
|
|
||||||
"""Remove reasoning-model ``<think>...</think>`` blocks from the response.
|
|
||||||
|
|
||||||
Reasoning models such as MiniMax-M3 inline their chain-of-thought into the
|
|
||||||
message ``content`` wrapped in ``<think>...</think>`` (``reasoning_split``
|
|
||||||
defaults to false), rather than exposing a separate ``reasoning_content``
|
|
||||||
field. The thinking text frequently contains ``[`` / ``]`` characters, which
|
|
||||||
corrupted the downstream ``find('[')`` / ``rfind(']')`` JSON extraction and
|
|
||||||
produced empty suggestions. We strip the reasoning before parsing so only
|
|
||||||
the actual answer remains.
|
|
||||||
"""
|
|
||||||
text = _THINK_BLOCK_RE.sub("", text)
|
|
||||||
# Drop any unclosed <think> (and everything after it) left by truncation.
|
|
||||||
open_match = _OPEN_THINK_RE.search(text)
|
|
||||||
if open_match:
|
|
||||||
text = text[: open_match.start()]
|
|
||||||
return text.strip()
|
|
||||||
|
|
||||||
|
|
||||||
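The effect of `_strip_think_blocks` on a model response can be checked with the same two regular expressions from the hunk. The sample strings in the usage note are made up for illustration.

```python
import re

# Complete <think>...</think> block, case-insensitive, spanning newlines.
THINK_BLOCK_RE = re.compile(r"<think\b[^>]*>.*?</think\s*>", re.IGNORECASE | re.DOTALL)
# Dangling, unclosed <think> left behind when the output is truncated.
OPEN_THINK_RE = re.compile(r"<think\b[^>]*>", re.IGNORECASE)


def strip_think_blocks(text):
    """Remove reasoning blocks so only the answer text remains."""
    text = THINK_BLOCK_RE.sub("", text)
    open_match = OPEN_THINK_RE.search(text)
    if open_match:
        # Drop the unclosed tag and everything after it.
        text = text[: open_match.start()]
    return text.strip()
```

With the input `'<think>maybe [a] or [b]</think>\n["q1", "q2"]'` the function returns `'["q1", "q2"]'`, so a later `find("[")` / `rfind("]")` extraction no longer picks up brackets from the reasoning text.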
def _strip_markdown_code_fence(text: str) -> str:
|
def _strip_markdown_code_fence(text: str) -> str:
|
||||||
stripped = text.strip()
|
stripped = text.strip()
|
||||||
if not stripped.startswith("```"):
|
if not stripped.startswith("```"):
|
||||||
@@ -67,8 +41,7 @@ def _strip_markdown_code_fence(text: str) -> str:
|
|||||||
|
|
||||||
|
|
||||||
def _parse_json_string_list(text: str) -> list[str] | None:
|
def _parse_json_string_list(text: str) -> list[str] | None:
|
||||||
candidate = _strip_think_blocks(text)
|
candidate = _strip_markdown_code_fence(text)
|
||||||
candidate = _strip_markdown_code_fence(candidate)
|
|
||||||
start = candidate.find("[")
|
start = candidate.find("[")
|
||||||
end = candidate.rfind("]")
|
end = candidate.rfind("]")
|
||||||
if start == -1 or end == -1 or end <= start:
|
if start == -1 or end == -1 or end <= start:
|
||||||
|
|||||||
@@ -21,9 +21,8 @@ from pydantic import BaseModel, Field
|
|||||||
|
|
||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
from app.gateway.deps import get_checkpointer, get_current_user, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
from app.gateway.deps import get_checkpointer, get_current_user, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
||||||
from app.gateway.pagination import trim_run_message_page
|
from app.gateway.services import sse_consumer, start_run
|
||||||
from app.gateway.services import sse_consumer, start_run, wait_for_run_completion
|
from deerflow.runtime import RunRecord, serialize_channel_values
|
||||||
from deerflow.runtime import RunRecord, RunStatus, serialize_channel_values
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
router = APIRouter(prefix="/api/threads", tags=["runs"])
|
router = APIRouter(prefix="/api/threads", tags=["runs"])
|
||||||
@@ -67,35 +66,6 @@ class RunResponse(BaseModel):
|
|||||||
multitask_strategy: str = "reject"
|
multitask_strategy: str = "reject"
|
||||||
created_at: str = ""
|
created_at: str = ""
|
||||||
updated_at: str = ""
|
updated_at: str = ""
|
||||||
total_input_tokens: int = 0
|
|
||||||
total_output_tokens: int = 0
|
|
||||||
total_tokens: int = 0
|
|
||||||
llm_call_count: int = 0
|
|
||||||
lead_agent_tokens: int = 0
|
|
||||||
subagent_tokens: int = 0
|
|
||||||
middleware_tokens: int = 0
|
|
||||||
message_count: int = 0
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadTokenUsageModelBreakdown(BaseModel):
|
|
||||||
tokens: int = 0
|
|
||||||
runs: int = 0
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadTokenUsageCallerBreakdown(BaseModel):
|
|
||||||
lead_agent: int = 0
|
|
||||||
subagent: int = 0
|
|
||||||
middleware: int = 0
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadTokenUsageResponse(BaseModel):
|
|
||||||
thread_id: str
|
|
||||||
total_tokens: int = 0
|
|
||||||
total_input_tokens: int = 0
|
|
||||||
total_output_tokens: int = 0
|
|
||||||
total_runs: int = 0
|
|
||||||
by_model: dict[str, ThreadTokenUsageModelBreakdown] = Field(default_factory=dict)
|
|
||||||
by_caller: ThreadTokenUsageCallerBreakdown = Field(default_factory=ThreadTokenUsageCallerBreakdown)
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
@@ -103,12 +73,6 @@ class ThreadTokenUsageResponse(BaseModel):
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
def _cancel_conflict_detail(run_id: str, record: RunRecord) -> str:
|
|
||||||
if record.status in (RunStatus.pending, RunStatus.running):
|
|
||||||
return f"Run {run_id} is not active on this worker and cannot be cancelled"
|
|
||||||
return f"Run {run_id} is not cancellable (status: {record.status.value})"
|
|
||||||
|
|
||||||
|
|
||||||
def _record_to_response(record: RunRecord) -> RunResponse:
|
def _record_to_response(record: RunRecord) -> RunResponse:
|
||||||
return RunResponse(
|
return RunResponse(
|
||||||
run_id=record.run_id,
|
run_id=record.run_id,
|
||||||
@@ -120,14 +84,6 @@ def _record_to_response(record: RunRecord) -> RunResponse:
|
|||||||
multitask_strategy=record.multitask_strategy,
|
multitask_strategy=record.multitask_strategy,
|
||||||
created_at=record.created_at,
|
created_at=record.created_at,
|
||||||
updated_at=record.updated_at,
|
updated_at=record.updated_at,
|
||||||
total_input_tokens=record.total_input_tokens,
|
|
||||||
total_output_tokens=record.total_output_tokens,
|
|
||||||
total_tokens=record.total_tokens,
|
|
||||||
llm_call_count=record.llm_call_count,
|
|
||||||
lead_agent_tokens=record.lead_agent_tokens,
|
|
||||||
subagent_tokens=record.subagent_tokens,
|
|
||||||
middleware_tokens=record.middleware_tokens,
|
|
||||||
message_count=record.message_count,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -176,25 +132,24 @@ async def stream_run(thread_id: str, body: RunCreateRequest, request: Request) -
|
|||||||
@require_permission("runs", "create", owner_check=True, require_existing=True)
|
@require_permission("runs", "create", owner_check=True, require_existing=True)
|
||||||
async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) -> dict:
|
async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) -> dict:
|
||||||
"""Create a run and block until it completes, returning the final state."""
|
"""Create a run and block until it completes, returning the final state."""
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
run_mgr = get_run_manager(request)
|
|
||||||
record = await start_run(body, thread_id, request)
|
record = await start_run(body, thread_id, request)
|
||||||
|
|
||||||
completed = True
|
|
||||||
if record.task is not None:
|
if record.task is not None:
|
||||||
completed = await wait_for_run_completion(bridge, record, request, run_mgr)
|
|
||||||
|
|
||||||
if completed:
|
|
||||||
checkpointer = get_checkpointer(request)
|
|
||||||
config = {"configurable": {"thread_id": thread_id}}
|
|
||||||
try:
|
try:
|
||||||
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
await record.task
|
||||||
if checkpoint_tuple is not None:
|
except asyncio.CancelledError:
|
||||||
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
pass
|
||||||
channel_values = checkpoint.get("channel_values", {})
|
|
||||||
return serialize_channel_values(channel_values)
|
checkpointer = get_checkpointer(request)
|
||||||
except Exception:
|
config = {"configurable": {"thread_id": thread_id}}
|
||||||
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
try:
|
||||||
|
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
||||||
|
if checkpoint_tuple is not None:
|
||||||
|
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
||||||
|
channel_values = checkpoint.get("channel_values", {})
|
||||||
|
return serialize_channel_values(channel_values)
|
||||||
|
except Exception:
|
||||||
|
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
||||||
|
|
||||||
return {"status": record.status.value, "error": record.error}
|
return {"status": record.status.value, "error": record.error}
|
||||||
|
|
||||||
@@ -204,8 +159,7 @@ async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) ->
|
|||||||
async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
||||||
"""List all runs for a thread."""
|
"""List all runs for a thread."""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
user_id = await get_current_user(request)
|
records = await run_mgr.list_by_thread(thread_id)
|
||||||
records = await run_mgr.list_by_thread(thread_id, user_id=user_id)
|
|
||||||
return [_record_to_response(r) for r in records]
|
return [_record_to_response(r) for r in records]
|
||||||
|
|
||||||
|
|
||||||
@@ -214,8 +168,7 @@ async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
|||||||
async def get_run(thread_id: str, run_id: str, request: Request) -> RunResponse:
|
async def get_run(thread_id: str, run_id: str, request: Request) -> RunResponse:
|
||||||
"""Get details of a specific run."""
|
"""Get details of a specific run."""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
user_id = await get_current_user(request)
|
record = run_mgr.get(run_id)
|
||||||
record = await run_mgr.get(run_id, user_id=user_id)
|
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
return _record_to_response(record)
|
return _record_to_response(record)
|
||||||
@@ -238,13 +191,16 @@ async def cancel_run(
|
|||||||
- wait=false: Return immediately with 202
|
- wait=false: Return immediately with 202
|
||||||
"""
|
"""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
|
|
||||||
cancelled = await run_mgr.cancel(run_id, action=action)
|
cancelled = await run_mgr.cancel(run_id, action=action)
|
||||||
if not cancelled:
|
if not cancelled:
|
||||||
raise HTTPException(status_code=409, detail=_cancel_conflict_detail(run_id, record))
|
raise HTTPException(
|
||||||
|
status_code=409,
|
||||||
|
detail=f"Run {run_id} is not cancellable (status: {record.status.value})",
|
||||||
|
)
|
||||||
|
|
||||||
if wait and record.task is not None:
|
if wait and record.task is not None:
|
||||||
try:
|
try:
|
||||||
@@ -260,14 +216,12 @@ async def cancel_run(
|
|||||||
@require_permission("runs", "read", owner_check=True)
|
@require_permission("runs", "read", owner_check=True)
|
||||||
async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingResponse:
|
async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingResponse:
|
||||||
"""Join an existing run's SSE stream."""
|
"""Join an existing run's SSE stream."""
|
||||||
|
bridge = get_stream_bridge(request)
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
if record.store_only:
|
|
||||||
raise HTTPException(status_code=409, detail=f"Run {run_id} is not active on this worker and cannot be streamed")
|
|
||||||
|
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
return StreamingResponse(
|
return StreamingResponse(
|
||||||
sse_consumer(bridge, record, request, run_mgr),
|
sse_consumer(bridge, record, request, run_mgr),
|
||||||
media_type="text/event-stream",
|
media_type="text/event-stream",
|
||||||
@@ -279,12 +233,7 @@ async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingRe
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
# Register GET and POST as separate routes so each method gets a unique OpenAPI
|
@router.api_route("/{thread_id}/runs/{run_id}/stream", methods=["GET", "POST"], response_model=None)
|
||||||
# operationId. ``api_route(methods=["GET", "POST"])`` shares one route registration
|
|
||||||
# across both methods, which makes FastAPI emit the same ``operationId`` twice and
|
|
||||||
# warn about a duplicate operation id during OpenAPI generation.
|
|
||||||
@router.get("/{thread_id}/runs/{run_id}/stream", response_model=None)
|
|
||||||
@router.post("/{thread_id}/runs/{run_id}/stream", response_model=None)
|
|
||||||
@require_permission("runs", "read", owner_check=True)
|
@require_permission("runs", "read", owner_check=True)
|
||||||
async def stream_existing_run(
|
async def stream_existing_run(
|
||||||
thread_id: str,
|
thread_id: str,
|
||||||
@@ -301,18 +250,14 @@ async def stream_existing_run(
|
|||||||
remaining buffered events so the client observes a clean shutdown.
|
remaining buffered events so the client observes a clean shutdown.
|
||||||
"""
|
"""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
if record.store_only and action is None:
|
|
||||||
raise HTTPException(status_code=409, detail=f"Run {run_id} is not active on this worker and cannot be streamed")
|
|
||||||
|
|
||||||
# Cancel if an action was requested (stop-button / interrupt flow)
|
# Cancel if an action was requested (stop-button / interrupt flow)
|
||||||
if action is not None:
|
if action is not None:
|
||||||
cancelled = await run_mgr.cancel(run_id, action=action)
|
cancelled = await run_mgr.cancel(run_id, action=action)
|
||||||
if not cancelled:
|
if cancelled and wait and record.task is not None:
|
||||||
raise HTTPException(status_code=409, detail=_cancel_conflict_detail(run_id, record))
|
|
||||||
if wait and record.task is not None:
|
|
||||||
try:
|
try:
|
||||||
await record.task
|
await record.task
|
||||||
except (asyncio.CancelledError, Exception):
|
except (asyncio.CancelledError, Exception):
|
||||||
@@ -403,7 +348,8 @@ async def list_run_messages(
|
|||||||
before_seq=before_seq,
|
before_seq=before_seq,
|
||||||
after_seq=after_seq,
|
after_seq=after_seq,
|
||||||
)
|
)
|
||||||
data, has_more = trim_run_message_page(rows, limit=limit, after_seq=after_seq)
|
has_more = len(rows) > limit
|
||||||
|
data = rows[:limit] if has_more else rows
|
||||||
return {"data": data, "has_more": has_more}
|
return {"data": data, "has_more": has_more}
|
||||||
|
|
||||||
|
|
||||||
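Note on the hunk above: the right-hand side computes `has_more` by comparing the number of returned rows with `limit`. That only signals a further page if the store was asked for one row more than the page size. The following is a minimal sketch of that fetch-one-extra pattern, not the gateway's actual code; `fetch_rows` is a hypothetical stand-in for the event store query, and the `limit + 1` request is an assumption about how the store is called.

```python
def fetch_rows(all_rows: list[int], limit: int) -> list[int]:
    """Hypothetical store call: return at most ``limit`` rows."""
    return all_rows[:limit]


def paginate(all_rows: list[int], limit: int) -> dict:
    # Ask for one extra row so the caller can tell whether another page exists.
    rows = fetch_rows(all_rows, limit + 1)
    has_more = len(rows) > limit
    # Drop the sentinel row before returning the page.
    data = rows[:limit] if has_more else rows
    return {"data": data, "has_more": has_more}
```

If the store were queried with exactly `limit`, `len(rows) > limit` could never be true and `has_more` would always be `False`.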
@@ -422,17 +368,10 @@ async def list_run_events(
|
|||||||
return await event_store.list_events(thread_id, run_id, event_types=types, limit=limit)
|
return await event_store.list_events(thread_id, run_id, event_types=types, limit=limit)
|
||||||
|
|
||||||
|
|
||||||
@router.get("/{thread_id}/token-usage", response_model=ThreadTokenUsageResponse)
|
@router.get("/{thread_id}/token-usage")
|
||||||
@require_permission("threads", "read", owner_check=True)
|
@require_permission("threads", "read", owner_check=True)
|
||||||
async def thread_token_usage(
|
async def thread_token_usage(thread_id: str, request: Request) -> dict:
|
||||||
thread_id: str,
|
|
||||||
request: Request,
|
|
||||||
include_active: bool = Query(default=False, description="Include running run progress snapshots"),
|
|
||||||
) -> ThreadTokenUsageResponse:
|
|
||||||
"""Thread-level token usage aggregation."""
|
"""Thread-level token usage aggregation."""
|
||||||
run_store = get_run_store(request)
|
run_store = get_run_store(request)
|
||||||
if include_active:
|
agg = await run_store.aggregate_tokens_by_thread(thread_id)
|
||||||
agg = await run_store.aggregate_tokens_by_thread(thread_id, include_active=True)
|
return {"thread_id": thread_id, **agg}
|
||||||
else:
|
|
||||||
agg = await run_store.aggregate_tokens_by_thread(thread_id)
|
|
||||||
return ThreadTokenUsageResponse(thread_id=thread_id, **agg)
|
|
||||||
|
|||||||
@@ -13,11 +13,11 @@ matching the LangGraph Platform wire format expected by the
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
|
import time
|
||||||
import uuid
|
import uuid
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from fastapi import APIRouter, HTTPException, Request
|
from fastapi import APIRouter, HTTPException, Request
|
||||||
from langgraph.checkpoint.base import empty_checkpoint, uuid6
|
|
||||||
from pydantic import BaseModel, Field, field_validator
|
from pydantic import BaseModel, Field, field_validator
|
||||||
|
|
||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
@@ -26,7 +26,6 @@ from app.gateway.utils import sanitize_log_param
|
|||||||
from deerflow.config.paths import Paths, get_paths
|
from deerflow.config.paths import Paths, get_paths
|
||||||
from deerflow.runtime import serialize_channel_values
|
from deerflow.runtime import serialize_channel_values
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import get_effective_user_id
|
||||||
from deerflow.utils.time import coerce_iso, now_iso
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
router = APIRouter(prefix="/api/threads", tags=["threads"])
|
router = APIRouter(prefix="/api/threads", tags=["threads"])
|
||||||
@@ -90,28 +89,6 @@ class ThreadSearchRequest(BaseModel):
|
|||||||
offset: int = Field(default=0, ge=0, description="Pagination offset")
|
offset: int = Field(default=0, ge=0, description="Pagination offset")
|
||||||
status: str | None = Field(default=None, description="Filter by thread status")
|
status: str | None = Field(default=None, description="Filter by thread status")
|
||||||
|
|
||||||
@field_validator("metadata")
|
|
||||||
@classmethod
|
|
||||||
def _validate_metadata_filters(cls, v: dict[str, Any]) -> dict[str, Any]:
|
|
||||||
"""Reject filter entries the SQL backend cannot compile.
|
|
||||||
|
|
||||||
Enforces consistent behaviour across SQL and memory backends.
|
|
||||||
See ``deerflow.persistence.json_compat`` for the shared validators.
|
|
||||||
"""
|
|
||||||
if not v:
|
|
||||||
return v
|
|
||||||
from deerflow.persistence.json_compat import validate_metadata_filter_key, validate_metadata_filter_value
|
|
||||||
|
|
||||||
bad_entries: list[str] = []
|
|
||||||
for key, value in v.items():
|
|
||||||
if not validate_metadata_filter_key(key):
|
|
||||||
bad_entries.append(f"{key!r} (unsafe key)")
|
|
||||||
elif not validate_metadata_filter_value(value):
|
|
||||||
bad_entries.append(f"{key!r} (unsupported value type {type(value).__name__})")
|
|
||||||
if bad_entries:
|
|
||||||
raise ValueError(f"Invalid metadata filter entries: {', '.join(bad_entries)}")
|
|
||||||
return v
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadStateResponse(BaseModel):
|
class ThreadStateResponse(BaseModel):
|
||||||
"""Response model for thread state."""
|
"""Response model for thread state."""
|
||||||
@@ -256,7 +233,7 @@ async def create_thread(body: ThreadCreateRequest, request: Request) -> ThreadRe
|
|||||||
checkpointer = get_checkpointer(request)
|
checkpointer = get_checkpointer(request)
|
||||||
thread_store = get_thread_store(request)
|
thread_store = get_thread_store(request)
|
||||||
thread_id = body.thread_id or str(uuid.uuid4())
|
thread_id = body.thread_id or str(uuid.uuid4())
|
||||||
now = now_iso()
|
now = time.time()
|
||||||
# ``body.metadata`` is already stripped of server-reserved keys by
|
# ``body.metadata`` is already stripped of server-reserved keys by
|
||||||
# ``ThreadCreateRequest._strip_reserved`` — see the model definition.
|
# ``ThreadCreateRequest._strip_reserved`` — see the model definition.
|
||||||
|
|
||||||
@@ -266,8 +243,8 @@ async def create_thread(body: ThreadCreateRequest, request: Request) -> ThreadRe
|
|||||||
return ThreadResponse(
|
return ThreadResponse(
|
||||||
thread_id=thread_id,
|
thread_id=thread_id,
|
||||||
status=existing_record.get("status", "idle"),
|
status=existing_record.get("status", "idle"),
|
||||||
created_at=coerce_iso(existing_record.get("created_at", "")),
|
created_at=str(existing_record.get("created_at", "")),
|
||||||
updated_at=coerce_iso(existing_record.get("updated_at", "")),
|
updated_at=str(existing_record.get("updated_at", "")),
|
||||||
metadata=existing_record.get("metadata", {}),
|
metadata=existing_record.get("metadata", {}),
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -285,6 +262,8 @@ async def create_thread(body: ThreadCreateRequest, request: Request) -> ThreadRe
|
|||||||
# Write an empty checkpoint so state endpoints work immediately
|
# Write an empty checkpoint so state endpoints work immediately
|
||||||
config = {"configurable": {"thread_id": thread_id, "checkpoint_ns": ""}}
|
config = {"configurable": {"thread_id": thread_id, "checkpoint_ns": ""}}
|
||||||
try:
|
try:
|
||||||
|
from langgraph.checkpoint.base import empty_checkpoint
|
||||||
|
|
||||||
ckpt_metadata = {
|
ckpt_metadata = {
|
||||||
"step": -1,
|
"step": -1,
|
||||||
"source": "input",
|
"source": "input",
|
||||||
@@ -302,8 +281,8 @@ async def create_thread(body: ThreadCreateRequest, request: Request) -> ThreadRe
|
|||||||
return ThreadResponse(
|
return ThreadResponse(
|
||||||
thread_id=thread_id,
|
thread_id=thread_id,
|
||||||
status="idle",
|
status="idle",
|
||||||
created_at=now,
|
created_at=str(now),
|
||||||
updated_at=now,
|
updated_at=str(now),
|
||||||
metadata=body.metadata,
|
metadata=body.metadata,
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -316,27 +295,20 @@ async def search_threads(body: ThreadSearchRequest, request: Request) -> list[Th
|
|||||||
(SQL-backed for sqlite/postgres, Store-backed for memory mode).
|
(SQL-backed for sqlite/postgres, Store-backed for memory mode).
|
||||||
"""
|
"""
|
||||||
from app.gateway.deps import get_thread_store
|
from app.gateway.deps import get_thread_store
|
||||||
from deerflow.persistence.thread_meta import InvalidMetadataFilterError
|
|
||||||
|
|
||||||
repo = get_thread_store(request)
|
repo = get_thread_store(request)
|
||||||
try:
|
rows = await repo.search(
|
||||||
rows = await repo.search(
|
metadata=body.metadata or None,
|
||||||
metadata=body.metadata or None,
|
status=body.status,
|
||||||
status=body.status,
|
limit=body.limit,
|
||||||
limit=body.limit,
|
offset=body.offset,
|
||||||
offset=body.offset,
|
)
|
||||||
)
|
|
||||||
except InvalidMetadataFilterError as exc:
|
|
||||||
raise HTTPException(status_code=400, detail=str(exc)) from exc
|
|
||||||
return [
|
return [
|
||||||
ThreadResponse(
|
ThreadResponse(
|
||||||
thread_id=r["thread_id"],
|
thread_id=r["thread_id"],
|
||||||
status=r.get("status", "idle"),
|
status=r.get("status", "idle"),
|
||||||
# ``coerce_iso`` heals legacy unix-second values that
|
created_at=r.get("created_at", ""),
|
||||||
# ``MemoryThreadMetaStore`` historically wrote with ``time.time()``;
|
updated_at=r.get("updated_at", ""),
|
||||||
# SQL-backed rows already arrive as ISO strings and pass through.
|
|
||||||
created_at=coerce_iso(r.get("created_at", "")),
|
|
||||||
updated_at=coerce_iso(r.get("updated_at", "")),
|
|
||||||
metadata=r.get("metadata", {}),
|
metadata=r.get("metadata", {}),
|
||||||
values={"title": r["display_name"]} if r.get("display_name") else {},
|
values={"title": r["display_name"]} if r.get("display_name") else {},
|
||||||
interrupts={},
|
interrupts={},
|
||||||
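Note on the hunk above: the left-hand side wraps timestamps in `coerce_iso`, which its comment describes as healing legacy unix-second values written with `time.time()`, while the right-hand side passes the stored value through unchanged. The implementation of `coerce_iso` is not part of this diff, so the sketch below uses a hypothetical `to_iso` helper to illustrate what such a normalisation could look like. The handling of numeric strings and empty values is an assumption.

```python
from datetime import datetime, timezone


def to_iso(value: object) -> str:
    """Hypothetical stand-in for ``coerce_iso``: normalise a timestamp to ISO 8601."""
    if value is None or value == "":
        return ""
    if isinstance(value, (int, float)):
        # Legacy rows: seconds since the Unix epoch.
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    text = str(value)
    try:
        # Numeric strings such as "1700000000.5" are also treated as epoch seconds.
        return datetime.fromtimestamp(float(text), tz=timezone.utc).isoformat()
    except (ValueError, OverflowError, OSError):
        # Anything else is assumed to be an ISO string already.
        return text
```

With plain `str(...)`, a legacy float would reach the client as a string like `"1700000000.5"` rather than an ISO timestamp, which is the mismatch the left-hand comment refers to.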
@@ -368,8 +340,8 @@ async def patch_thread(thread_id: str, body: ThreadPatchRequest, request: Reques
|
|||||||
return ThreadResponse(
|
return ThreadResponse(
|
||||||
thread_id=thread_id,
|
thread_id=thread_id,
|
||||||
status=record.get("status", "idle"),
|
status=record.get("status", "idle"),
|
||||||
created_at=coerce_iso(record.get("created_at", "")),
|
created_at=str(record.get("created_at", "")),
|
||||||
updated_at=coerce_iso(record.get("updated_at", "")),
|
updated_at=str(record.get("updated_at", "")),
|
||||||
metadata=record.get("metadata", {}),
|
metadata=record.get("metadata", {}),
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -409,8 +381,8 @@ async def get_thread(thread_id: str, request: Request) -> ThreadResponse:
|
|||||||
record = {
|
record = {
|
||||||
"thread_id": thread_id,
|
"thread_id": thread_id,
|
||||||
"status": "idle",
|
"status": "idle",
|
||||||
"created_at": coerce_iso(ckpt_meta.get("created_at", "")),
|
"created_at": ckpt_meta.get("created_at", ""),
|
||||||
"updated_at": coerce_iso(ckpt_meta.get("updated_at", ckpt_meta.get("created_at", ""))),
|
"updated_at": ckpt_meta.get("updated_at", ckpt_meta.get("created_at", "")),
|
||||||
"metadata": {k: v for k, v in ckpt_meta.items() if k not in ("created_at", "updated_at", "step", "source", "writes", "parents")},
|
"metadata": {k: v for k, v in ckpt_meta.items() if k not in ("created_at", "updated_at", "step", "source", "writes", "parents")},
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -424,8 +396,8 @@ async def get_thread(thread_id: str, request: Request) -> ThreadResponse:
|
|||||||
return ThreadResponse(
|
return ThreadResponse(
|
||||||
thread_id=thread_id,
|
thread_id=thread_id,
|
||||||
status=status,
|
status=status,
|
||||||
created_at=coerce_iso(record.get("created_at", "")),
|
created_at=str(record.get("created_at", "")),
|
||||||
updated_at=coerce_iso(record.get("updated_at", "")),
|
updated_at=str(record.get("updated_at", "")),
|
||||||
metadata=record.get("metadata", {}),
|
metadata=record.get("metadata", {}),
|
||||||
values=serialize_channel_values(channel_values),
|
values=serialize_channel_values(channel_values),
|
||||||
)
|
)
|
||||||
@@ -476,10 +448,10 @@ async def get_thread_state(thread_id: str, request: Request) -> ThreadStateRespo
|
|||||||
values=values,
|
values=values,
|
||||||
next=next_tasks,
|
next=next_tasks,
|
||||||
metadata=metadata,
|
metadata=metadata,
|
||||||
checkpoint={"id": checkpoint_id, "ts": coerce_iso(metadata.get("created_at", ""))},
|
checkpoint={"id": checkpoint_id, "ts": str(metadata.get("created_at", ""))},
|
||||||
checkpoint_id=checkpoint_id,
|
checkpoint_id=checkpoint_id,
|
||||||
parent_checkpoint_id=parent_checkpoint_id,
|
parent_checkpoint_id=parent_checkpoint_id,
|
||||||
created_at=coerce_iso(metadata.get("created_at", "")),
|
created_at=str(metadata.get("created_at", "")),
|
||||||
tasks=tasks,
|
tasks=tasks,
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -529,28 +501,16 @@ async def update_thread_state(thread_id: str, body: ThreadStateUpdateRequest, re
|
|||||||
channel_values.update(body.values)
|
channel_values.update(body.values)
|
||||||
|
|
||||||
checkpoint["channel_values"] = channel_values
|
checkpoint["channel_values"] = channel_values
|
||||||
metadata["updated_at"] = now_iso()
|
metadata["updated_at"] = time.time()
|
||||||
|
|
||||||
if body.as_node:
|
if body.as_node:
|
||||||
metadata["source"] = "update"
|
metadata["source"] = "update"
|
||||||
metadata["step"] = metadata.get("step", 0) + 1
|
metadata["step"] = metadata.get("step", 0) + 1
|
||||||
metadata["writes"] = {body.as_node: body.values}
|
metadata["writes"] = {body.as_node: body.values}
|
||||||
|
|
||||||
# Assign a new checkpoint ID so aput performs an INSERT rather than an
|
|
||||||
# in-place REPLACE of the existing row. Use uuid6 (time-ordered) rather
|
|
||||||
# than uuid4 (random) so the new ID is always lexicographically greater
|
|
||||||
# than the previous one — LangGraph's checkpointers determine the "latest"
|
|
||||||
# checkpoint by max(checkpoint_ids) string order, matching the uuid6 epoch.
|
|
||||||
checkpoint["id"] = str(uuid6())
|
|
||||||
|
|
||||||
# aput requires checkpoint_ns in the config — use the same config used for the
|
# aput requires checkpoint_ns in the config — use the same config used for the
|
||||||
# read (which always includes checkpoint_ns=""). The fresh checkpoint ID is
|
# read (which always includes checkpoint_ns=""). Do NOT include checkpoint_id
|
||||||
# assigned above via checkpoint["id"]; keep checkpoint_id out of the config so
|
# so that aput generates a fresh checkpoint ID for the new snapshot.
|
||||||
# the write is keyed by the new checkpoint payload rather than the prior read.
|
|
||||||
# All supported savers (InMemorySaver, AsyncSqliteSaver, AsyncPostgresSaver)
|
|
||||||
# persist and echo back checkpoint["id"] verbatim — none mint their own — so
|
|
||||||
# the new_config below carries the uuid6 we assigned here. (Regression-locked
|
|
||||||
# by test_update_thread_state_inserts_new_checkpoint_each_call.)
|
|
||||||
write_config: dict[str, Any] = {
|
write_config: dict[str, Any] = {
|
||||||
"configurable": {
|
"configurable": {
|
||||||
"thread_id": thread_id,
|
"thread_id": thread_id,
|
||||||
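Note on the hunk above: the left-hand comment argues for a time-ordered ID because the "latest" checkpoint is chosen by string order of the IDs. The sketch below illustrates only that ordering property. It does not use `uuid6`; `time_ordered_id` is a hypothetical format (fixed-width hex timestamp plus random suffix) chosen for illustration.

```python
import uuid


def time_ordered_id(timestamp_ns: int) -> str:
    """Hypothetical ID: zero-padded hex timestamp prefix plus a random suffix."""
    return f"{timestamp_ns:016x}-{uuid.uuid4().hex[:12]}"


older = time_ordered_id(1_700_000_000_000_000_000)
newer = time_ordered_id(1_700_000_000_000_000_001)

# String comparison follows the timestamp prefix, so max() picks the later ID.
latest = max(older, newer)
```

A purely random ID such as `uuid.uuid4()` gives no such guarantee, so a newly written checkpoint could sort before the one it replaces.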
@@ -569,7 +529,7 @@ async def update_thread_state(thread_id: str, body: ThreadStateUpdateRequest, re
|
|||||||
|
|
||||||
# Sync title changes through the ThreadMetaStore abstraction so /threads/search
|
# Sync title changes through the ThreadMetaStore abstraction so /threads/search
|
||||||
# reflects them immediately in both sqlite and memory backends.
|
# reflects them immediately in both sqlite and memory backends.
|
||||||
if thread_store and body.values and "title" in body.values:
|
if body.values and "title" in body.values:
|
||||||
new_title = body.values["title"]
|
new_title = body.values["title"]
|
||||||
if new_title: # Skip empty strings and None
|
if new_title: # Skip empty strings and None
|
||||||
try:
|
try:
|
||||||
@@ -582,7 +542,7 @@ async def update_thread_state(thread_id: str, body: ThreadStateUpdateRequest, re
|
|||||||
next=[],
|
next=[],
|
||||||
metadata=metadata,
|
metadata=metadata,
|
||||||
checkpoint_id=new_checkpoint_id,
|
checkpoint_id=new_checkpoint_id,
|
||||||
created_at=coerce_iso(metadata.get("created_at", "")),
|
created_at=str(metadata.get("created_at", "")),
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -649,7 +609,7 @@ async def get_thread_history(thread_id: str, body: ThreadHistoryRequest, request
|
|||||||
parent_checkpoint_id=parent_id,
|
parent_checkpoint_id=parent_id,
|
||||||
metadata=user_meta,
|
metadata=user_meta,
|
||||||
values=values,
|
values=values,
|
||||||
created_at=coerce_iso(metadata.get("created_at", "")),
|
created_at=str(metadata.get("created_at", "")),
|
||||||
next=next_tasks,
|
next=next_tasks,
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -5,7 +5,7 @@ import os
|
|||||||
import stat
|
import stat
|
||||||
|
|
||||||
from fastapi import APIRouter, Depends, File, HTTPException, Request, UploadFile
|
from fastapi import APIRouter, Depends, File, HTTPException, Request, UploadFile
|
||||||
from pydantic import BaseModel, Field
|
from pydantic import BaseModel
|
||||||
|
|
||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
from app.gateway.deps import get_config
|
from app.gateway.deps import get_config
|
||||||
@@ -15,15 +15,12 @@ from deerflow.runtime.user_context import get_effective_user_id
|
|||||||
from deerflow.sandbox.sandbox_provider import SandboxProvider, get_sandbox_provider
|
from deerflow.sandbox.sandbox_provider import SandboxProvider, get_sandbox_provider
|
||||||
from deerflow.uploads.manager import (
|
from deerflow.uploads.manager import (
|
||||||
PathTraversalError,
|
PathTraversalError,
|
||||||
UnsafeUploadPathError,
|
|
||||||
claim_unique_filename,
|
|
||||||
delete_file_safe,
|
delete_file_safe,
|
||||||
enrich_file_listing,
|
enrich_file_listing,
|
||||||
ensure_uploads_dir,
|
ensure_uploads_dir,
|
||||||
get_uploads_dir,
|
get_uploads_dir,
|
||||||
list_files_in_dir,
|
list_files_in_dir,
|
||||||
normalize_filename,
|
normalize_filename,
|
||||||
open_upload_file_no_symlink,
|
|
||||||
upload_artifact_url,
|
upload_artifact_url,
|
||||||
upload_virtual_path,
|
upload_virtual_path,
|
||||||
)
|
)
|
||||||
@@ -39,37 +36,12 @@ DEFAULT_MAX_FILE_SIZE = 50 * 1024 * 1024
|
|||||||
DEFAULT_MAX_TOTAL_SIZE = 100 * 1024 * 1024
|
DEFAULT_MAX_TOTAL_SIZE = 100 * 1024 * 1024
|
||||||
|
|
||||||
|
|
||||||
class UploadedFileInfo(BaseModel):
|
|
||||||
"""Uploaded file metadata exposed by upload and list APIs."""
|
|
||||||
|
|
||||||
filename: str
|
|
||||||
size: int
|
|
||||||
path: str
|
|
||||||
virtual_path: str
|
|
||||||
artifact_url: str
|
|
||||||
extension: str | None = None
|
|
||||||
modified: float | None = None
|
|
||||||
original_filename: str | None = None
|
|
||||||
markdown_file: str | None = None
|
|
||||||
markdown_path: str | None = None
|
|
||||||
markdown_virtual_path: str | None = None
|
|
||||||
markdown_artifact_url: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
class UploadResponse(BaseModel):
|
class UploadResponse(BaseModel):
|
||||||
"""Response model for file upload."""
|
"""Response model for file upload."""
|
||||||
|
|
||||||
success: bool
|
success: bool
|
||||||
files: list[UploadedFileInfo]
|
files: list[dict[str, str]]
|
||||||
message: str
|
message: str
|
||||||
skipped_files: list[str] = Field(default_factory=list)
|
|
||||||
|
|
||||||
|
|
||||||
class UploadListResponse(BaseModel):
|
|
||||||
"""Response model for uploaded file listing."""
|
|
||||||
|
|
||||||
files: list[UploadedFileInfo]
|
|
||||||
count: int
|
|
||||||
|
|
||||||
|
|
||||||
class UploadLimits(BaseModel):
|
class UploadLimits(BaseModel):
|
||||||
@@ -93,30 +65,11 @@ def _make_file_sandbox_writable(file_path: os.PathLike[str] | str) -> None:
|
|||||||
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
||||||
return
|
return
|
||||||
|
|
||||||
writable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH | stat.S_IRGRP | stat.S_IROTH
|
writable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
|
||||||
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
||||||
os.chmod(file_path, writable_mode, **chmod_kwargs)
|
os.chmod(file_path, writable_mode, **chmod_kwargs)
|
||||||
|
|
||||||
|
|
||||||
def _make_file_sandbox_readable(file_path: os.PathLike[str] | str) -> None:
|
|
||||||
"""Ensure uploaded files are readable by the sandbox process.
|
|
||||||
|
|
||||||
For Docker sandboxes (AIO), the gateway writes files as root with 0o600
|
|
||||||
permissions, then bind-mounts the host directory into the container. The
|
|
||||||
sandbox process inside the container runs as a non-root user and cannot
|
|
||||||
read those files without group/other read bits. This function adds
|
|
||||||
``S_IRGRP | S_IROTH`` so the sandbox can read the uploaded content.
|
|
||||||
"""
|
|
||||||
file_stat = os.lstat(file_path)
|
|
||||||
if stat.S_ISLNK(file_stat.st_mode):
|
|
||||||
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
|
||||||
return
|
|
||||||
|
|
||||||
readable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IRGRP | stat.S_IROTH
|
|
||||||
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
|
||||||
os.chmod(file_path, readable_mode, **chmod_kwargs)
|
|
||||||
|
|
||||||
|
|
||||||
def _uses_thread_data_mounts(sandbox_provider: SandboxProvider) -> bool:
|
def _uses_thread_data_mounts(sandbox_provider: SandboxProvider) -> bool:
|
||||||
return bool(getattr(sandbox_provider, "uses_thread_data_mounts", False))
|
return bool(getattr(sandbox_provider, "uses_thread_data_mounts", False))
|
||||||
|
|
||||||
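Note on the hunk above: `_make_file_sandbox_readable` on the left-hand side adds group/other read bits to an uploaded file while refusing to touch symlinks. The sketch below reproduces that pattern in isolation on a temporary file. `add_read_bits` is a hypothetical name, and the expected mode assumes a POSIX filesystem.

```python
import os
import stat
import tempfile


def add_read_bits(path: str) -> int:
    """Add group/other read bits without following symlinks; return the new mode."""
    file_stat = os.lstat(path)
    if stat.S_ISLNK(file_stat.st_mode):
        # Leave symlinks untouched, mirroring the guard in the diff.
        return stat.S_IMODE(file_stat.st_mode)
    mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IRGRP | stat.S_IROTH
    # Only pass follow_symlinks when the platform supports it for chmod.
    kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
    os.chmod(path, mode, **kwargs)
    return stat.S_IMODE(os.lstat(path).st_mode)


# Usage: start from an owner-only file, as described in the docstring above.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
os.chmod(demo_path, 0o600)
new_mode = add_read_bits(demo_path)
os.unlink(demo_path)
```

Using `os.lstat` rather than `os.stat` matters here: it inspects the link itself, so a symlink planted in the uploads directory cannot redirect the permission change to another file.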
@@ -163,18 +116,17 @@ def _cleanup_uploaded_paths(paths: list[os.PathLike[str] | str]) -> None:
|
|||||||
logger.warning("Failed to clean up upload path after rejected request: %s", path, exc_info=True)
|
logger.warning("Failed to clean up upload path after rejected request: %s", path, exc_info=True)
|
||||||
|
|
||||||
|
|
||||||
async def _write_upload_file_with_limits(
|
async def _write_upload_file_streaming(
|
||||||
file: UploadFile,
|
file: UploadFile,
|
||||||
|
file_path: os.PathLike[str] | str,
|
||||||
*,
|
*,
|
||||||
uploads_dir: os.PathLike[str] | str,
|
|
||||||
display_filename: str,
|
display_filename: str,
|
||||||
max_single_file_size: int,
|
max_single_file_size: int,
|
||||||
max_total_size: int,
|
max_total_size: int,
|
||||||
total_size: int,
|
total_size: int,
|
||||||
) -> tuple[os.PathLike[str] | str, int, int]:
|
) -> tuple[int, int]:
|
||||||
file_size = 0
|
file_size = 0
|
||||||
file_path, fh = open_upload_file_no_symlink(uploads_dir, display_filename)
|
with open(file_path, "wb") as output:
|
||||||
try:
|
|
||||||
while chunk := await file.read(UPLOAD_CHUNK_SIZE):
|
while chunk := await file.read(UPLOAD_CHUNK_SIZE):
|
||||||
file_size += len(chunk)
|
file_size += len(chunk)
|
||||||
total_size += len(chunk)
|
total_size += len(chunk)
|
||||||
@@ -182,17 +134,8 @@ async def _write_upload_file_with_limits(
|
|||||||
raise HTTPException(status_code=413, detail=f"File too large: {display_filename}")
|
raise HTTPException(status_code=413, detail=f"File too large: {display_filename}")
|
||||||
if total_size > max_total_size:
|
if total_size > max_total_size:
|
||||||
raise HTTPException(status_code=413, detail="Total upload size too large")
|
raise HTTPException(status_code=413, detail="Total upload size too large")
|
||||||
fh.write(chunk)
|
output.write(chunk)
|
||||||
except Exception:
|
return file_size, total_size
|
||||||
fh.close()
|
|
||||||
try:
|
|
||||||
os.unlink(file_path)
|
|
||||||
except FileNotFoundError:
|
|
||||||
pass
|
|
||||||
raise
|
|
||||||
else:
|
|
||||||
fh.close()
|
|
||||||
return file_path, file_size, total_size
|
|
||||||
|
|
||||||
|
|
||||||
def _auto_convert_documents_enabled(app_config: AppConfig) -> bool:
|
def _auto_convert_documents_enabled(app_config: AppConfig) -> bool:
|
||||||
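Note on the hunk above: both sides of the diff read the upload in chunks and reject it as soon as a size limit is crossed, before the offending chunk is written. The sketch below shows that loop on in-memory streams. `copy_with_limit` and `TooLargeError` are hypothetical names standing in for the gateway helper and its HTTP 413 response.

```python
import io
from typing import BinaryIO


class TooLargeError(Exception):
    """Hypothetical stand-in for the HTTP 413 raised in the gateway."""


def copy_with_limit(source: BinaryIO, dest: BinaryIO, max_size: int, chunk_size: int = 8192) -> int:
    """Copy ``source`` to ``dest`` in chunks, failing once ``max_size`` is exceeded."""
    written = 0
    while chunk := source.read(chunk_size):
        written += len(chunk)
        if written > max_size:
            # Reject before writing the chunk that crosses the limit.
            raise TooLargeError(f"payload exceeds {max_size} bytes")
        dest.write(chunk)
    return written


# Usage: a 6-byte payload fits within a 10-byte limit.
sink = io.BytesIO()
copied = copy_with_limit(io.BytesIO(b"abcdef"), sink, max_size=10, chunk_size=4)
```

Chunks written before the limit is hit remain in the destination, which is why the left-hand side unlinks the partially written file in its `except` branch and the right-hand side relies on `_cleanup_uploaded_paths`.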
@@ -234,12 +177,7 @@ async def upload_files(
|
|||||||
uploaded_files = []
|
uploaded_files = []
|
||||||
written_paths = []
|
written_paths = []
|
||||||
sandbox_sync_targets = []
|
sandbox_sync_targets = []
|
||||||
skipped_files = []
|
|
||||||
total_size = 0
|
total_size = 0
|
||||||
# Track filenames within this request so duplicate form parts do not
|
|
||||||
# silently truncate each other. Existing uploads keep the historical
|
|
||||||
# overwrite behavior for a single replacement upload.
|
|
||||||
seen_filenames: set[str] = set()
|
|
||||||
|
|
||||||
sandbox_provider = get_sandbox_provider()
|
sandbox_provider = get_sandbox_provider()
|
||||||
sync_to_sandbox = not _uses_thread_data_mounts(sandbox_provider)
|
sync_to_sandbox = not _uses_thread_data_mounts(sandbox_provider)
|
||||||
@@ -256,22 +194,22 @@ async def upload_files(
|
|||||||
continue
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
original_filename = normalize_filename(file.filename)
|
safe_filename = normalize_filename(file.filename)
|
||||||
safe_filename = claim_unique_filename(original_filename, seen_filenames)
|
|
||||||
except ValueError:
|
except ValueError:
|
||||||
logger.warning(f"Skipping file with unsafe filename: {file.filename!r}")
|
logger.warning(f"Skipping file with unsafe filename: {file.filename!r}")
|
||||||
continue
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
file_path, file_size, total_size = await _write_upload_file_with_limits(
|
file_path = uploads_dir / safe_filename
|
||||||
|
written_paths.append(file_path)
|
||||||
|
file_size, total_size = await _write_upload_file_streaming(
|
||||||
file,
|
file,
|
||||||
uploads_dir=uploads_dir,
|
file_path,
|
||||||
display_filename=safe_filename,
|
display_filename=safe_filename,
|
||||||
max_single_file_size=limits.max_file_size,
|
max_single_file_size=limits.max_file_size,
|
||||||
max_total_size=limits.max_total_size,
|
max_total_size=limits.max_total_size,
|
||||||
total_size=total_size,
|
total_size=total_size,
|
||||||
)
|
)
|
||||||
written_paths.append(file_path)
|
|
||||||
|
|
||||||
virtual_path = upload_virtual_path(safe_filename)
|
virtual_path = upload_virtual_path(safe_filename)
|
||||||
|
|
||||||
@@ -280,13 +218,11 @@ async def upload_files(
|
|||||||
|
|
||||||
file_info = {
|
file_info = {
|
||||||
"filename": safe_filename,
|
"filename": safe_filename,
|
||||||
"size": file_size,
|
"size": str(file_size),
|
||||||
"path": str(sandbox_uploads / safe_filename),
|
"path": str(sandbox_uploads / safe_filename),
|
||||||
"virtual_path": virtual_path,
|
"virtual_path": virtual_path,
|
||||||
"artifact_url": upload_artifact_url(thread_id, safe_filename),
|
"artifact_url": upload_artifact_url(thread_id, safe_filename),
|
||||||
}
|
}
|
||||||
if safe_filename != original_filename:
|
|
||||||
file_info["original_filename"] = original_filename
|
|
||||||
|
|
||||||
logger.info(f"Saved file: {safe_filename} ({file_size} bytes) to {file_info['path']}")
|
logger.info(f"Saved file: {safe_filename} ({file_size} bytes) to {file_info['path']}")
|
||||||
|
|
||||||
@@ -310,39 +246,20 @@ async def upload_files(
|
|||||||
except HTTPException as e:
|
except HTTPException as e:
|
||||||
_cleanup_uploaded_paths(written_paths)
|
_cleanup_uploaded_paths(written_paths)
|
||||||
raise e
|
raise e
|
||||||
except UnsafeUploadPathError as e:
|
|
||||||
logger.warning("Skipping upload with unsafe destination %s: %s", file.filename, e)
|
|
||||||
skipped_files.append(safe_filename)
|
|
||||||
continue
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to upload {file.filename}: {e}")
|
logger.error(f"Failed to upload {file.filename}: {e}")
|
||||||
_cleanup_uploaded_paths(written_paths)
|
_cleanup_uploaded_paths(written_paths)
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to upload {file.filename}: {str(e)}")
|
raise HTTPException(status_code=500, detail=f"Failed to upload {file.filename}: {str(e)}")
|
||||||
|
|
||||||
# Uploaded files are created with 0o600 permissions (owner read/write only).
|
|
||||||
# In Docker sandbox deployments the gateway writes as root but the sandbox
|
|
||||||
# process runs as a non-root user (typically UID 1000). Without group/other
|
|
||||||
# read bits the sandbox cannot access the files — whether the uploads
|
|
||||||
# directory is bind-mounted into the container or synced via
|
|
||||||
# sandbox.update_file. Always add group/other read bits so every sandbox
|
|
||||||
# configuration can read the uploaded content.
|
|
||||||
for file_path in written_paths:
|
|
||||||
_make_file_sandbox_readable(file_path)
|
|
||||||
|
|
||||||
if sync_to_sandbox:
|
if sync_to_sandbox:
|
||||||
for file_path, virtual_path in sandbox_sync_targets:
|
for file_path, virtual_path in sandbox_sync_targets:
|
||||||
_make_file_sandbox_writable(file_path)
|
_make_file_sandbox_writable(file_path)
|
||||||
sandbox.update_file(virtual_path, file_path.read_bytes())
|
sandbox.update_file(virtual_path, file_path.read_bytes())
|
||||||
|
|
||||||
message = f"Successfully uploaded {len(uploaded_files)} file(s)"
|
|
||||||
if skipped_files:
|
|
||||||
message += f"; skipped {len(skipped_files)} unsafe file(s)"
|
|
||||||
|
|
||||||
return UploadResponse(
|
return UploadResponse(
|
||||||
success=not skipped_files,
|
success=True,
|
||||||
files=uploaded_files,
|
files=uploaded_files,
|
||||||
message=message,
|
message=f"Successfully uploaded {len(uploaded_files)} file(s)",
|
||||||
skipped_files=skipped_files,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
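The comment block on the left-hand side of this hunk explains why uploaded files created with `0o600` need extra read bits before a non-root sandbox process can open them. The body of `_make_file_sandbox_readable` is not shown in this diff, so the following is only a minimal sketch of that idea, assuming a POSIX filesystem; the helper name `make_file_world_readable` is hypothetical.

```python
import os
import stat
import tempfile
from pathlib import Path


def make_file_world_readable(path: Path) -> int:
    """Add group/other read bits to a file and return the resulting mode.

    Hypothetical stand-in for the helper referenced in the diff.
    """
    current = stat.S_IMODE(path.stat().st_mode)
    updated = current | stat.S_IRGRP | stat.S_IROTH
    os.chmod(path, updated)
    return stat.S_IMODE(path.stat().st_mode)


with tempfile.TemporaryDirectory() as tmp:
    upload = Path(tmp) / "report.txt"
    upload.write_text("hello")
    os.chmod(upload, 0o600)  # owner read/write only, as the comment describes
    new_mode = make_file_world_readable(upload)
    print(oct(new_mode))
```

Only read bits are added, so the owner's write permission is kept and no other user gains write access.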
@@ -357,9 +274,9 @@ async def get_upload_limits(
|
|||||||
return _get_upload_limits(config)
|
return _get_upload_limits(config)
|
||||||
|
|
||||||
|
|
||||||
@router.get("/list", response_model=UploadListResponse)
|
@router.get("/list", response_model=dict)
|
||||||
@require_permission("threads", "read", owner_check=True)
|
@require_permission("threads", "read", owner_check=True)
|
||||||
async def list_uploaded_files(thread_id: str, request: Request) -> UploadListResponse:
|
async def list_uploaded_files(thread_id: str, request: Request) -> dict:
|
||||||
"""List all files in a thread's uploads directory."""
|
"""List all files in a thread's uploads directory."""
|
||||||
try:
|
try:
|
||||||
uploads_dir = get_uploads_dir(thread_id)
|
uploads_dir = get_uploads_dir(thread_id)
|
||||||
@@ -373,7 +290,7 @@ async def list_uploaded_files(thread_id: str, request: Request) -> UploadListRes
|
|||||||
for f in result["files"]:
|
for f in result["files"]:
|
||||||
f["path"] = str(sandbox_uploads / f["filename"])
|
f["path"] = str(sandbox_uploads / f["filename"])
|
||||||
|
|
||||||
return UploadListResponse(**result)
|
return result
|
||||||
|
|
||||||
|
|
||||||
@router.delete("/{filename}")
|
@router.delete("/{filename}")
|
||||||
|
|||||||
+13
-129
@@ -15,13 +15,10 @@ from collections.abc import Mapping
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from fastapi import HTTPException, Request
|
from fastapi import HTTPException, Request
|
||||||
from langchain_core.messages import BaseMessage
|
from langchain_core.messages import HumanMessage
|
||||||
from langchain_core.messages.utils import convert_to_messages
|
|
||||||
|
|
||||||
from app.gateway.deps import get_run_context, get_run_manager, get_stream_bridge
|
from app.gateway.deps import get_run_context, get_run_manager, get_stream_bridge
|
||||||
from app.gateway.internal_auth import INTERNAL_SYSTEM_ROLE
|
|
||||||
from app.gateway.utils import sanitize_log_param
|
from app.gateway.utils import sanitize_log_param
|
||||||
from deerflow.config.app_config import get_app_config
|
|
||||||
from deerflow.runtime import (
|
from deerflow.runtime import (
|
||||||
END_SENTINEL,
|
END_SENTINEL,
|
||||||
HEARTBEAT_SENTINEL,
|
HEARTBEAT_SENTINEL,
|
||||||
@@ -34,7 +31,6 @@ from deerflow.runtime import (
|
|||||||
UnsupportedStrategyError,
|
UnsupportedStrategyError,
|
||||||
run_agent,
|
run_agent,
|
||||||
)
|
)
|
||||||
from deerflow.runtime.runs.naming import resolve_root_run_name
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -78,35 +74,21 @@ def normalize_stream_modes(raw: list[str] | str | None) -> list[str]:
|
|||||||
|
|
||||||
|
|
||||||
def normalize_input(raw_input: dict[str, Any] | None) -> dict[str, Any]:
|
def normalize_input(raw_input: dict[str, Any] | None) -> dict[str, Any]:
|
||||||
"""Convert LangGraph Platform input format to LangChain state dict.
|
"""Convert LangGraph Platform input format to LangChain state dict."""
|
||||||
|
|
||||||
Delegates dict→message coercion to ``langchain_core.messages.utils.convert_to_messages``
|
|
||||||
so that ``additional_kwargs`` (e.g. uploaded-file metadata — gh #3132), ``id``,
|
|
||||||
``name``, and non-human roles (ai/system/tool) survive unchanged. An earlier
|
|
||||||
hand-rolled version only forwarded ``content`` and collapsed every role to
|
|
||||||
``HumanMessage``, which silently stripped frontend-supplied attachments.
|
|
||||||
|
|
||||||
Malformed message dicts (missing ``role``/``type``/``content``, unsupported
|
|
||||||
role, etc.) raise ``HTTPException(400)`` with the offending index, instead
|
|
||||||
of bubbling up as a 500. The gateway is a system boundary, so per-entry
|
|
||||||
validation errors are the right shape for clients to retry against.
|
|
||||||
"""
|
|
||||||
if raw_input is None:
|
if raw_input is None:
|
||||||
return {}
|
return {}
|
||||||
messages = raw_input.get("messages")
|
messages = raw_input.get("messages")
|
||||||
if messages and isinstance(messages, list):
|
if messages and isinstance(messages, list):
|
||||||
converted: list[Any] = []
|
converted = []
|
||||||
for index, msg in enumerate(messages):
|
for msg in messages:
|
||||||
if isinstance(msg, BaseMessage):
|
if isinstance(msg, dict):
|
||||||
converted.append(msg)
|
role = msg.get("role", msg.get("type", "user"))
|
||||||
elif isinstance(msg, dict):
|
content = msg.get("content", "")
|
||||||
try:
|
if role in ("user", "human"):
|
||||||
converted.extend(convert_to_messages([msg]))
|
converted.append(HumanMessage(content=content))
|
||||||
except (ValueError, TypeError, NotImplementedError) as exc:
|
else:
|
||||||
raise HTTPException(
|
# TODO: handle other message types (system, ai, tool)
|
||||||
status_code=400,
|
converted.append(HumanMessage(content=content))
|
||||||
detail=f"Invalid message at input.messages[{index}]: {exc}",
|
|
||||||
) from exc
|
|
||||||
else:
|
else:
|
||||||
converted.append(msg)
|
converted.append(msg)
|
||||||
return {**raw_input, "messages": converted}
|
return {**raw_input, "messages": converted}
|
||||||
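The left-hand docstring of `normalize_input` describes two behaviours: extra keys such as `additional_kwargs` survive coercion, and a malformed entry is reported with its index rather than surfacing as a 500. The sketch below illustrates only that per-index validation pattern without any LangChain dependency. `InvalidMessageError`, `SUPPORTED_ROLES`, and `validate_messages` are illustrative assumptions; the code in the diff delegates to `convert_to_messages` and raises `HTTPException(400)` instead.

```python
from typing import Any

# Assumed role set for this sketch only; the real set is defined by LangChain.
SUPPORTED_ROLES = {"user", "human", "ai", "assistant", "system", "tool"}


class InvalidMessageError(ValueError):
    """Stand-in for HTTPException(status_code=400) in this sketch."""


def validate_messages(messages: list[Any]) -> list[Any]:
    """Validate message dicts one by one, keeping every key they carry."""
    validated: list[Any] = []
    for index, msg in enumerate(messages):
        if not isinstance(msg, dict):
            validated.append(msg)  # already-built message objects pass through
            continue
        role = msg.get("role", msg.get("type"))
        if role not in SUPPORTED_ROLES or "content" not in msg:
            raise InvalidMessageError(
                f"Invalid message at input.messages[{index}]: role={role!r}"
            )
        validated.append(dict(msg))  # copy, so additional_kwargs etc. survive
    return validated


good = validate_messages(
    [{"role": "user", "content": "hi", "additional_kwargs": {"files": ["a.pdf"]}}]
)

try:
    validate_messages([{"role": "user", "content": "ok"}, {"role": "robot"}])
    error_text = ""
except InvalidMessageError as exc:
    error_text = str(exc)
```

Reporting the offending index lets a client correct a single entry instead of guessing which message was rejected.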
@@ -141,14 +123,7 @@ def merge_run_context_overrides(config: dict[str, Any], context: Mapping[str, An
|
|||||||
"""Merge whitelisted keys from ``body.context`` into both ``config['configurable']``
|
"""Merge whitelisted keys from ``body.context`` into both ``config['configurable']``
|
||||||
and ``config['context']`` so they are visible to legacy configurable readers and
|
and ``config['context']`` so they are visible to legacy configurable readers and
|
||||||
to LangGraph ``ToolRuntime.context`` consumers (e.g. the ``setup_agent`` tool —
|
to LangGraph ``ToolRuntime.context`` consumers (e.g. the ``setup_agent`` tool —
|
||||||
see issue #2677).
|
see issue #2677)."""
|
||||||
|
|
||||||
``user_id`` is intentionally propagated into ``config['context']`` in addition to
|
|
||||||
the whitelisted keys, so non-web callers (e.g. IM channels) that supply identity in
|
|
||||||
``body.context`` keep it on ``ToolRuntime.context``. It is merged with
|
|
||||||
``setdefault`` so a server-authenticated id stamped by
|
|
||||||
:func:`inject_authenticated_user_context` always wins over the client-supplied one.
|
|
||||||
"""
|
|
||||||
if not context:
|
if not context:
|
||||||
return
|
return
|
||||||
configurable = config.setdefault("configurable", {})
|
configurable = config.setdefault("configurable", {})
|
||||||
@@ -159,29 +134,6 @@ def merge_run_context_overrides(config: dict[str, Any], context: Mapping[str, An
|
|||||||
configurable.setdefault(key, context[key])
|
configurable.setdefault(key, context[key])
|
||||||
if isinstance(runtime_context, dict):
|
if isinstance(runtime_context, dict):
|
||||||
runtime_context.setdefault(key, context[key])
|
runtime_context.setdefault(key, context[key])
|
||||||
if "user_id" in context and isinstance(runtime_context, dict):
|
|
||||||
runtime_context.setdefault("user_id", context["user_id"])
|
|
||||||
|
|
||||||
|
|
||||||
def inject_authenticated_user_context(config: dict[str, Any], request: Request) -> None:
|
|
||||||
"""Stamp the authenticated user into the run context for background tools.
|
|
||||||
|
|
||||||
Tool execution may happen after the request handler has returned, so tools
|
|
||||||
that persist user-scoped files should not rely only on ambient ContextVars.
|
|
||||||
The value comes from server-side auth state, never from client context.
|
|
||||||
"""
|
|
||||||
|
|
||||||
user = getattr(request.state, "user", None)
|
|
||||||
user_id = getattr(user, "id", None)
|
|
||||||
if user_id is None:
|
|
||||||
return
|
|
||||||
|
|
||||||
if getattr(user, "system_role", None) == INTERNAL_SYSTEM_ROLE:
|
|
||||||
return
|
|
||||||
|
|
||||||
runtime_context = config.setdefault("context", {})
|
|
||||||
if isinstance(runtime_context, dict):
|
|
||||||
runtime_context["user_id"] = str(user_id)
|
|
||||||
|
|
||||||
|
|
||||||
def resolve_agent_factory(assistant_id: str | None):
|
def resolve_agent_factory(assistant_id: str | None):
|
||||||
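The left-hand side of this hunk relies on an ordering detail: the client-supplied `user_id` is merged with `setdefault`, while the authenticated id is written by plain assignment, so the server value wins whichever runs first. A minimal sketch of that precedence, using simplified stand-ins for the two functions (the names and reduced signatures below are assumptions for illustration):

```python
from typing import Any


def merge_client_user_id(config: dict[str, Any], context: dict[str, Any]) -> None:
    """Copy a client-supplied user_id only if none is present yet."""
    runtime_context = config.setdefault("context", {})
    if "user_id" in context and isinstance(runtime_context, dict):
        runtime_context.setdefault("user_id", context["user_id"])


def stamp_authenticated_user(config: dict[str, Any], user_id: Any) -> None:
    """Overwrite user_id with the server-side identity when there is one."""
    if user_id is None:
        return
    runtime_context = config.setdefault("context", {})
    if isinstance(runtime_context, dict):
        runtime_context["user_id"] = str(user_id)


# Merge first, then stamp (the order used in start_run on the left-hand side).
web_config: dict[str, Any] = {}
merge_client_user_id(web_config, {"user_id": "client-claimed"})
stamp_authenticated_user(web_config, 42)

# Stamp first, then merge: setdefault leaves the server value untouched.
reversed_config: dict[str, Any] = {}
stamp_authenticated_user(reversed_config, 42)
merge_client_user_id(reversed_config, {"user_id": "client-claimed"})

# No authenticated user (e.g. an IM channel): the client value is kept.
im_config: dict[str, Any] = {}
merge_client_user_id(im_config, {"user_id": "im-user-7"})
stamp_authenticated_user(im_config, None)
```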
@@ -264,7 +216,6 @@ def build_run_config(
|
|||||||
target = config.setdefault("configurable", {})
|
target = config.setdefault("configurable", {})
|
||||||
if target is not None and "agent_name" not in target:
|
if target is not None and "agent_name" not in target:
|
||||||
target["agent_name"] = normalized
|
target["agent_name"] = normalized
|
||||||
config.setdefault("run_name", resolve_root_run_name(config, normalized))
|
|
||||||
if metadata:
|
if metadata:
|
||||||
config.setdefault("metadata", {}).update(metadata)
|
config.setdefault("metadata", {}).update(metadata)
|
||||||
return config
|
return config
|
||||||
@@ -298,23 +249,6 @@ async def start_run(
|
|||||||
|
|
||||||
disconnect = DisconnectMode.cancel if body.on_disconnect == "cancel" else DisconnectMode.continue_
|
disconnect = DisconnectMode.cancel if body.on_disconnect == "cancel" else DisconnectMode.continue_
|
||||||
|
|
||||||
body_context = getattr(body, "context", None) or {}
|
|
||||||
model_name = body_context.get("model_name")
|
|
||||||
|
|
||||||
# Coerce non-string model_name values to str before truncation.
|
|
||||||
if model_name is not None and not isinstance(model_name, str):
|
|
||||||
model_name = str(model_name)
|
|
||||||
|
|
||||||
# Validate model against the allowlist when a model_name is provided.
|
|
||||||
if model_name:
|
|
||||||
app_config = get_app_config()
|
|
||||||
resolved = app_config.get_model_config(model_name)
|
|
||||||
if resolved is None:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=400,
|
|
||||||
detail=f"Model {model_name!r} is not in the configured model allowlist",
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
record = await run_mgr.create_or_reject(
|
record = await run_mgr.create_or_reject(
|
||||||
thread_id,
|
thread_id,
|
||||||
@@ -323,7 +257,6 @@ async def start_run(
|
|||||||
metadata=body.metadata or {},
|
metadata=body.metadata or {},
|
||||||
kwargs={"input": body.input, "config": body.config},
|
kwargs={"input": body.input, "config": body.config},
|
||||||
multitask_strategy=body.multitask_strategy,
|
multitask_strategy=body.multitask_strategy,
|
||||||
model_name=model_name,
|
|
||||||
)
|
)
|
||||||
except ConflictError as exc:
|
except ConflictError as exc:
|
||||||
raise HTTPException(status_code=409, detail=str(exc)) from exc
|
raise HTTPException(status_code=409, detail=str(exc)) from exc
|
||||||
@@ -355,7 +288,6 @@ async def start_run(
|
|||||||
# that carries agent configuration (model_name, thinking_enabled, etc.).
|
# that carries agent configuration (model_name, thinking_enabled, etc.).
|
||||||
# Only agent-relevant keys are forwarded; unknown keys (e.g. thread_id) are ignored.
|
# Only agent-relevant keys are forwarded; unknown keys (e.g. thread_id) are ignored.
|
||||||
merge_run_context_overrides(config, getattr(body, "context", None))
|
merge_run_context_overrides(config, getattr(body, "context", None))
|
||||||
inject_authenticated_user_context(config, request)
|
|
||||||
|
|
||||||
stream_modes = normalize_stream_modes(body.stream_mode)
|
stream_modes = normalize_stream_modes(body.stream_mode)
|
||||||
|
|
||||||
@@ -415,51 +347,3 @@ async def sse_consumer(
|
|||||||
if record.status in (RunStatus.pending, RunStatus.running):
|
if record.status in (RunStatus.pending, RunStatus.running):
|
||||||
if record.on_disconnect == DisconnectMode.cancel:
|
if record.on_disconnect == DisconnectMode.cancel:
|
||||||
await run_mgr.cancel(record.run_id)
|
await run_mgr.cancel(record.run_id)
|
||||||
|
|
||||||
|
|
||||||
async def wait_for_run_completion(
|
|
||||||
bridge: StreamBridge,
|
|
||||||
record: RunRecord,
|
|
||||||
request: Request,
|
|
||||||
run_mgr: RunManager,
|
|
||||||
) -> bool:
|
|
||||||
"""Block until the run publishes ``END_SENTINEL``, honouring on_disconnect.
|
|
||||||
|
|
||||||
The non-streaming ``/wait`` endpoints used to ``await record.task``
|
|
||||||
directly with no disconnect handling. When the client (or an
|
|
||||||
intermediate HTTP proxy) timed out during a long tool call such as
|
|
||||||
``pip install``, the handler would swallow ``CancelledError`` and
|
|
||||||
serialize whatever checkpoint happened to exist — masking a half-finished
|
|
||||||
run as a normal completion (issue #3265).
|
|
||||||
|
|
||||||
This helper consumes the same bridge that ``sse_consumer`` does so the
|
|
||||||
wait path shares its disconnect semantics: each wake-up polls
|
|
||||||
``request.is_disconnected()``; on a real disconnect it cancels the
|
|
||||||
background run when ``record.on_disconnect`` is ``cancel``. The bridge's
|
|
||||||
heartbeat sentinels guarantee at least one wake-up per
|
|
||||||
``heartbeat_interval`` even when the agent emits no events for a while.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
``True`` when ``END_SENTINEL`` was observed (run reached a terminal
|
|
||||||
state), ``False`` when the loop exited because the client
|
|
||||||
disconnected. Callers must skip checkpoint serialization on
|
|
||||||
``False`` so a partial checkpoint is not returned as a normal
|
|
||||||
response.
|
|
||||||
"""
|
|
||||||
completed = False
|
|
||||||
try:
|
|
||||||
async for entry in bridge.subscribe(record.run_id):
|
|
||||||
# END_SENTINEL means the run reached a terminal state; honour it
|
|
||||||
# even if the client just disconnected so the caller still serializes
|
|
||||||
# the real final checkpoint.
|
|
||||||
if entry is END_SENTINEL:
|
|
||||||
completed = True
|
|
||||||
return True
|
|
||||||
if await request.is_disconnected():
|
|
||||||
break
|
|
||||||
# Heartbeats and regular events: keep waiting for END_SENTINEL.
|
|
||||||
return completed
|
|
||||||
finally:
|
|
||||||
if not completed and record.status in (RunStatus.pending, RunStatus.running):
|
|
||||||
if record.on_disconnect == DisconnectMode.cancel:
|
|
||||||
await run_mgr.cancel(record.run_id)
|
|
||||||
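The control flow of `wait_for_run_completion` on the left-hand side can be exercised in isolation. The sketch below replaces the bridge, request, and run manager with small hypothetical stand-ins (`fake_bridge`, `is_disconnected`, `cancel`) and omits the `record.status` and `on_disconnect` checks, so it shows only the sentinel and disconnect logic, not the real implementation.

```python
import asyncio

END = object()        # stand-in for END_SENTINEL
HEARTBEAT = object()  # stand-in for HEARTBEAT_SENTINEL


async def fake_bridge(events):
    """Yield prepared entries, simulating bridge.subscribe(run_id)."""
    for entry in events:
        await asyncio.sleep(0)
        yield entry


async def wait_for_end(stream, is_disconnected, cancel) -> bool:
    """Return True once END is seen, False if the client disconnected first."""
    completed = False
    try:
        async for entry in stream:
            if entry is END:
                completed = True
                return True
            if await is_disconnected():
                break
        return completed
    finally:
        if not completed:
            await cancel()  # simplified: the real code also checks run status


async def demo():
    cancelled = []

    async def connected():
        return False

    async def disconnected():
        return True

    async def cancel():
        cancelled.append(True)

    events = [HEARTBEAT, "event", END]
    finished = await wait_for_end(fake_bridge(events), connected, cancel)
    dropped = await wait_for_end(fake_bridge(events), disconnected, cancel)
    return finished, dropped, len(cancelled)


result = asyncio.run(demo())
```

The first call reaches `END` and never cancels. The second call sees a disconnect on the first heartbeat, returns `False`, and triggers exactly one cancellation.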
|
|||||||
@@ -79,9 +79,7 @@ async def main():
|
|||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
from deerflow.agents import make_lead_agent
|
from deerflow.agents import make_lead_agent
|
||||||
from deerflow.config.paths import get_paths
|
|
||||||
from deerflow.mcp import initialize_mcp_tools
|
from deerflow.mcp import initialize_mcp_tools
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
|
||||||
|
|
||||||
# Initialize MCP tools at startup
|
# Initialize MCP tools at startup
|
||||||
try:
|
try:
|
||||||
@@ -115,8 +113,6 @@ async def main():
|
|||||||
print("Tip: `uv sync --group dev` to enable arrow-key & history support")
|
print("Tip: `uv sync --group dev` to enable arrow-key & history support")
|
||||||
print("=" * 50)
|
print("=" * 50)
|
||||||
|
|
||||||
seen_artifacts: set[str] = set()
|
|
||||||
|
|
||||||
while True:
|
while True:
|
||||||
try:
|
try:
|
||||||
if session:
|
if session:
|
||||||
@@ -138,22 +134,6 @@ async def main():
|
|||||||
last_message = result["messages"][-1]
|
last_message = result["messages"][-1]
|
||||||
print(f"\nAgent: {last_message.content}")
|
print(f"\nAgent: {last_message.content}")
|
||||||
|
|
||||||
# Show files presented to the user this turn (new artifacts only)
|
|
||||||
artifacts = result.get("artifacts") or []
|
|
||||||
new_artifacts = [p for p in artifacts if p not in seen_artifacts]
|
|
||||||
if new_artifacts:
|
|
||||||
thread_id = config["configurable"]["thread_id"]
|
|
||||||
user_id = get_effective_user_id()
|
|
||||||
paths = get_paths()
|
|
||||||
print("\n[Presented files]")
|
|
||||||
for virtual in new_artifacts:
|
|
||||||
try:
|
|
||||||
physical = paths.resolve_virtual_path(thread_id, virtual, user_id=user_id)
|
|
||||||
print(f" - {virtual}\n → {physical}")
|
|
||||||
except ValueError as exc:
|
|
||||||
print(f" - {virtual} (failed to resolve physical path: {exc})")
|
|
||||||
seen_artifacts.update(new_artifacts)
|
|
||||||
|
|
||||||
except (KeyboardInterrupt, EOFError):
|
except (KeyboardInterrupt, EOFError):
|
||||||
print("\nGoodbye!")
|
print("\nGoodbye!")
|
||||||
break
|
break
|
||||||
|
|||||||
+46
-74
@@ -6,16 +6,16 @@ This document provides a complete reference for the DeerFlow backend APIs.
|
|||||||
|
|
||||||
DeerFlow backend exposes two sets of APIs:
|
DeerFlow backend exposes two sets of APIs:
|
||||||
|
|
||||||
1. **LangGraph-compatible API** - Agent interactions, threads, and streaming (`/api/langgraph/*`)
|
1. **LangGraph API** - Agent interactions, threads, and streaming (`/api/langgraph/*`)
|
||||||
2. **Gateway API** - Models, MCP, skills, uploads, and artifacts (`/api/*`)
|
2. **Gateway API** - Models, MCP, skills, uploads, and artifacts (`/api/*`)
|
||||||
|
|
||||||
All APIs are accessed through the Nginx reverse proxy at port 2026.
|
All APIs are accessed through the Nginx reverse proxy at port 2026.
|
||||||
|
|
||||||
## LangGraph-compatible API
|
## LangGraph API
|
||||||
|
|
||||||
Base URL: `/api/langgraph`
|
Base URL: `/api/langgraph`
|
||||||
|
|
||||||
The public LangGraph-compatible API follows LangGraph SDK conventions. In the unified nginx deployment, Gateway owns `/api/langgraph/*` and translates those paths to its native `/api/*` run, thread, and streaming routers.
|
The LangGraph API is provided by the LangGraph server and follows the LangGraph SDK conventions.
|
||||||
|
|
||||||
### Threads
|
### Threads
|
||||||
|
|
||||||
@@ -104,11 +104,17 @@ Content-Type: application/json
|
|||||||
**Recursion Limit:**
|
**Recursion Limit:**
|
||||||
|
|
||||||
`config.recursion_limit` caps the number of graph steps LangGraph will execute
|
`config.recursion_limit` caps the number of graph steps LangGraph will execute
|
||||||
in a single run. The unified Gateway path defaults to `100` in
|
in a single run. The `/api/langgraph/*` endpoints go straight to the LangGraph
|
||||||
`build_run_config` (see `backend/app/gateway/services.py`), which is a safer
|
server and therefore inherit LangGraph's native default of **25**, which is
|
||||||
starting point for plan-mode or subagent-heavy runs. Clients can still set
|
too low for plan-mode or subagent-heavy runs — the agent typically errors out
|
||||||
`recursion_limit` explicitly in the request body; increase it if you run deeply
|
with `GraphRecursionError` after the first round of subagent results comes
|
||||||
nested subagent graphs.
|
back, before the lead agent can synthesize the final answer.
|
||||||
|
|
||||||
|
DeerFlow's own Gateway and IM-channel paths mitigate this by defaulting to
|
||||||
|
`100` in `build_run_config` (see `backend/app/gateway/services.py`), but
|
||||||
|
clients calling the LangGraph API directly must set `recursion_limit`
|
||||||
|
explicitly in the request body. `100` matches the Gateway default and is a
|
||||||
|
safe starting point; increase it if you run deeply nested subagent graphs.
|
||||||
|
|
||||||
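Both versions of this paragraph agree that clients can set `recursion_limit` explicitly in the request body. A minimal sketch of such a body follows; the helper name `build_run_request` is hypothetical, and the value `100` is taken from the Gateway default mentioned above.

```python
import json


def build_run_request(text: str, recursion_limit: int = 100) -> dict:
    """Build a create-run body with an explicit config.recursion_limit."""
    return {
        "input": {"messages": [{"role": "user", "content": text}]},
        "config": {"recursion_limit": recursion_limit},
    }


payload = json.dumps(build_run_request("Hello"))
print(payload)
```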
**Configurable Options:**
|
**Configurable Options:**
|
||||||
- `model_name` (string): Override the default model
|
- `model_name` (string): Override the default model
|
||||||
@@ -228,13 +234,10 @@ Get current MCP server configurations.
|
|||||||
GET /api/mcp/config
|
GET /api/mcp/config
|
||||||
```
|
```
|
||||||
|
|
||||||
Requires an authenticated admin session. Sensitive env/header/OAuth secret
|
|
||||||
values are masked in the response.
|
|
||||||
|
|
||||||
**Response:**
|
**Response:**
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"mcp_servers": {
|
"mcpServers": {
|
||||||
"github": {
|
"github": {
|
||||||
"enabled": true,
|
"enabled": true,
|
||||||
"type": "stdio",
|
"type": "stdio",
|
||||||
@@ -244,6 +247,13 @@ values are masked in the response.
|
|||||||
"GITHUB_TOKEN": "***"
|
"GITHUB_TOKEN": "***"
|
||||||
},
|
},
|
||||||
"description": "GitHub operations"
|
"description": "GitHub operations"
|
||||||
|
},
|
||||||
|
"filesystem": {
|
||||||
|
"enabled": false,
|
||||||
|
"type": "stdio",
|
||||||
|
"command": "npx",
|
||||||
|
"args": ["-y", "@modelcontextprotocol/server-filesystem"],
|
||||||
|
"description": "File system access"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -258,15 +268,10 @@ PUT /api/mcp/config
|
|||||||
Content-Type: application/json
|
Content-Type: application/json
|
||||||
```
|
```
|
||||||
|
|
||||||
Requires an authenticated admin session. API-managed `stdio` MCP servers may
|
|
||||||
only use allowed executable names for `command` (default: `npx`, `uvx`). Set
|
|
||||||
`DEER_FLOW_MCP_STDIO_COMMAND_ALLOWLIST` to a comma-separated list when a
|
|
||||||
deployment needs additional trusted launchers.
|
|
||||||
|
|
||||||
**Request Body:**
|
**Request Body:**
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"mcp_servers": {
|
"mcpServers": {
|
||||||
"github": {
|
"github": {
|
||||||
"enabled": true,
|
"enabled": true,
|
||||||
"type": "stdio",
|
"type": "stdio",
|
||||||
@@ -284,18 +289,8 @@ deployment needs additional trusted launchers.
|
|||||||
**Response:**
|
**Response:**
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"mcp_servers": {
|
"success": true,
|
||||||
"github": {
|
"message": "MCP configuration updated"
|
||||||
"enabled": true,
|
|
||||||
"type": "stdio",
|
|
||||||
"command": "npx",
|
|
||||||
"args": ["-y", "@modelcontextprotocol/server-github"],
|
|
||||||
"env": {
|
|
||||||
"GITHUB_TOKEN": "***"
|
|
||||||
},
|
|
||||||
"description": "GitHub operations"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -546,28 +541,14 @@ All APIs return errors in a consistent format:
|
|||||||
|
|
||||||
## Authentication
|
## Authentication
|
||||||
|
|
||||||
DeerFlow enforces authentication for all non-public HTTP routes. Public routes are limited to health/docs metadata and these public auth endpoints:
|
Currently, DeerFlow does not implement authentication. All APIs are accessible without credentials.
|
||||||
|
|
||||||
- `POST /api/v1/auth/initialize` creates the first admin account when no admin exists.
|
Note: This is about DeerFlow API authentication. MCP outbound connections can still use OAuth for configured HTTP/SSE MCP servers.
|
||||||
- `POST /api/v1/auth/login/local` logs in with email/password and sets an HttpOnly `access_token` cookie.
|
|
||||||
- `POST /api/v1/auth/register` creates a regular `user` account and sets the session cookie.
|
|
||||||
- `POST /api/v1/auth/logout` clears the session cookie.
|
|
||||||
- `GET /api/v1/auth/setup-status` reports whether the first admin still needs to be created.
|
|
||||||
|
|
||||||
The authenticated auth endpoints are:
|
For production deployments, it is recommended to:
|
||||||
|
1. Use Nginx for basic auth or OAuth integration
|
||||||
- `GET /api/v1/auth/me` returns the current user.
|
2. Deploy behind a VPN or private network
|
||||||
- `POST /api/v1/auth/change-password` changes password, optionally changes email during setup, increments `token_version`, and reissues the cookie.
|
3. Implement custom authentication middleware
|
||||||
|
|
||||||
Protected state-changing requests also require the CSRF double-submit token: send the `csrf_token` cookie value as the `X-CSRF-Token` header. Login/register/initialize/logout are bootstrap auth endpoints: they are exempt from the double-submit token but still reject hostile browser `Origin` headers.
|
|
||||||
|
|
||||||
User isolation is enforced from the authenticated user context:
|
|
||||||
|
|
||||||
- Thread metadata is scoped by `threads_meta.user_id`; search/read/write/delete APIs only expose the current user's threads.
|
|
||||||
- Thread files live under `{base_dir}/users/{user_id}/threads/{thread_id}/user-data/` and are exposed inside the sandbox as `/mnt/user-data/`.
|
|
||||||
- Memory and custom agents are stored under `{base_dir}/users/{user_id}/...`.
|
|
||||||
|
|
||||||
Note: MCP outbound connections can still use OAuth for configured HTTP/SSE MCP servers; that is separate from DeerFlow API authentication.
|
|
||||||
|
|
||||||
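The left-hand text describes a CSRF double-submit check: the `csrf_token` cookie value must be echoed in the `X-CSRF-Token` header. The server-side comparison is not shown in this diff; a minimal sketch of the idea, with a hypothetical function name and plain dictionaries standing in for the request's cookies and headers, might look like this:

```python
import hmac
from collections.abc import Mapping


def csrf_token_matches(cookies: Mapping[str, str], headers: Mapping[str, str]) -> bool:
    """Return True when the header repeats a non-empty csrf_token cookie."""
    cookie_value = cookies.get("csrf_token", "")
    header_value = headers.get("X-CSRF-Token", "")
    if not cookie_value:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie_value.encode("utf-8"), header_value.encode("utf-8"))


ok = csrf_token_matches({"csrf_token": "abc123"}, {"X-CSRF-Token": "abc123"})
missing_header = csrf_token_matches({"csrf_token": "abc123"}, {})
mismatch = csrf_token_matches({"csrf_token": "abc123"}, {"X-CSRF-Token": "zzz"})
```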
---
|
---
|
||||||
|
|
||||||
@@ -586,13 +567,12 @@ location /api/ {
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Streaming Support
|
## WebSocket Support
|
||||||
|
|
||||||
Gateway's LangGraph-compatible API streams run events with Server-Sent Events (SSE):
|
The LangGraph server supports WebSocket connections for real-time streaming. Connect to:
|
||||||
|
|
||||||
```http
|
```
|
||||||
POST /api/langgraph/threads/{thread_id}/runs/stream
|
ws://localhost:2026/api/langgraph/threads/{thread_id}/runs/stream
|
||||||
Accept: text/event-stream
|
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -628,21 +608,13 @@ const response = await fetch('/api/models');
|
|||||||
const data = await response.json();
|
const data = await response.json();
|
||||||
console.log(data.models);
|
console.log(data.models);
|
||||||
|
|
||||||
// Create a run and stream SSE events
|
// Using EventSource for streaming
|
||||||
const streamResponse = await fetch(`/api/langgraph/threads/${threadId}/runs/stream`, {
|
const eventSource = new EventSource(
|
||||||
method: "POST",
|
`/api/langgraph/threads/${threadId}/runs/stream`
|
||||||
headers: {
|
);
|
||||||
"Content-Type": "application/json",
|
eventSource.onmessage = (event) => {
|
||||||
Accept: "text/event-stream",
|
console.log(JSON.parse(event.data));
|
||||||
},
|
};
|
||||||
body: JSON.stringify({
|
|
||||||
input: { messages: [{ role: "user", content: "Hello" }] },
|
|
||||||
stream_mode: ["values", "messages-tuple", "custom"],
|
|
||||||
}),
|
|
||||||
});
|
|
||||||
|
|
||||||
const reader = streamResponse.body?.getReader();
|
|
||||||
// Decode and parse SSE frames from reader in your client code.
|
|
||||||
```
|
```
|
||||||
|
|
||||||
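The left-hand JavaScript example leaves SSE frame decoding to the client. For reference, a small parser for the `text/event-stream` framing is sketched below in Python. The sample stream is invented for illustration: the `values` event name comes from the `stream_mode` list above, while the `end` event and the heartbeat comment line are assumptions about what the server sends.

```python
import json


def parse_sse(text: str) -> list[tuple[str, str]]:
    """Split an SSE payload into (event, data) pairs.

    Handles `event:` and `data:` fields and skips comment lines that start
    with a colon. Other fields (`id:`, `retry:`) are ignored in this sketch.
    """
    events: list[tuple[str, str]] = []
    for block in text.replace("\r\n", "\n").split("\n\n"):
        event_name = "message"  # default event type per the SSE format
        data_lines: list[str] = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:
            events.append((event_name, "\n".join(data_lines)))
    return events


sample = (
    'event: values\ndata: {"messages": []}\n\n'
    ": heartbeat\n\n"
    "event: end\ndata: null\n\n"
)
frames = parse_sse(sample)
decoded = [(name, json.loads(data)) for name, data in frames]
```

A streaming client would feed this parser incrementally and keep any trailing partial frame in a buffer; the sketch assumes the whole payload is already available.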
### cURL Examples
|
### cURL Examples
|
||||||
@@ -677,7 +649,7 @@ curl -X POST http://localhost:2026/api/langgraph/threads/abc123/runs \
|
|||||||
}'
|
}'
|
||||||
```
|
```
|
||||||
|
|
||||||
> The unified Gateway path defaults `config.recursion_limit` to 100 for
|
> The `/api/langgraph/*` endpoints bypass DeerFlow's Gateway and inherit
|
||||||
> plan-mode and subagent-heavy runs. Clients may still set
|
> LangGraph's native `recursion_limit` default of 25, which is too low for
|
||||||
> `config.recursion_limit` explicitly — see the [Create Run](#create-run)
|
> plan-mode or subagent runs. Set `config.recursion_limit` explicitly — see
|
||||||
> section for details.
|
> the [Create Run](#create-run) section for details.
|
||||||
|
|||||||
@@ -14,28 +14,30 @@ This document provides a comprehensive overview of the DeerFlow backend architec
|
|||||||
│ Nginx (Port 2026) │
|
│ Nginx (Port 2026) │
|
||||||
│ Unified Reverse Proxy Entry Point │
|
│ Unified Reverse Proxy Entry Point │
|
||||||
│ ┌────────────────────────────────────────────────────────────────────┐ │
|
│ ┌────────────────────────────────────────────────────────────────────┐ │
|
||||||
│ │ /api/langgraph/* → Gateway LangGraph-compatible runtime (8001) │ │
|
│ │ /api/langgraph/* → LangGraph Server (2024) │ │
|
||||||
│ │ /api/* → Gateway REST APIs (8001) │ │
|
│ │ /api/* → Gateway API (8001) │ │
|
||||||
│ │ /* → Frontend (3000) │ │
|
│ │ /* → Frontend (3000) │ │
|
||||||
│ └────────────────────────────────────────────────────────────────────┘ │
|
│ └────────────────────────────────────────────────────────────────────┘ │
|
||||||
└─────────────────────────────────┬────────────────────────────────────────┘
|
└─────────────────────────────────┬────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
┌───────────────────────┴───────────────────────┐
|
┌───────────────────────┼───────────────────────┐
|
||||||
│ │
|
│ │ │
|
||||||
▼ ▼
|
▼ ▼ ▼
|
||||||
┌─────────────────────────────────────────────┐ ┌─────────────────────┐
|
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
|
||||||
│ Gateway API │ │ Frontend │
|
│ LangGraph Server │ │ Gateway API │ │ Frontend │
|
||||||
│ (Port 8001) │ │ (Port 3000) │
|
│ (Port 2024) │ │ (Port 8001) │ │ (Port 3000) │
|
||||||
│ │ │ │
|
│ │ │ │ │ │
|
||||||
│ - LangGraph-compatible runs/threads API │ │ - Next.js App │
|
│ - Agent Runtime │ │ - Models API │ │ - Next.js App │
|
||||||
│ - Embedded Agent Runtime │ │ - React UI │
|
│ - Thread Mgmt │ │ - MCP Config │ │ - React UI │
|
||||||
│ - SSE Streaming │ │ - Chat Interface │
|
│ - SSE Streaming │ │ - Skills Mgmt │ │ - Chat Interface │
|
||||||
│ - Checkpointing │ │ │
|
│ - Checkpointing │ │ - File Uploads │ │ │
|
||||||
│ - Models, MCP, Skills, Uploads, Artifacts │ │ │
|
│ │ │ - Thread Cleanup │ │ │
|
||||||
│ - Thread Cleanup │ │ │
|
│ │ │ - Artifacts │ │ │
|
||||||
└─────────────────────────────────────────────┘ └─────────────────────┘
|
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
|
||||||
│
|
│ │
|
||||||
▼
|
│ ┌─────────────────┘
|
||||||
|
│ │
|
||||||
|
▼ ▼
|
||||||
┌──────────────────────────────────────────────────────────────────────────┐
|
┌──────────────────────────────────────────────────────────────────────────┐
|
||||||
│ Shared Configuration │
|
│ Shared Configuration │
|
||||||
│ ┌─────────────────────────┐ ┌────────────────────────────────────────┐ │
|
│ ┌─────────────────────────┐ ┌────────────────────────────────────────┐ │
|
||||||
@@ -50,9 +52,9 @@ This document provides a comprehensive overview of the DeerFlow backend architec
|
|||||||
|
|
||||||
## Component Details
|
## Component Details
|
||||||
|
|
||||||
### Gateway Embedded Agent Runtime
|
### LangGraph Server
|
||||||
|
|
||||||
The agent runtime is embedded in the FastAPI Gateway and built on LangGraph for robust multi-agent workflow orchestration. Nginx rewrites `/api/langgraph/*` to Gateway's native `/api/*` routes, so the public API remains compatible with LangGraph SDK clients without running a separate LangGraph server.
|
The LangGraph server is the core agent runtime, built on LangGraph for robust multi-agent workflow orchestration.
|
||||||
|
|
||||||
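The left-hand paragraph says Nginx rewrites `/api/langgraph/*` to the Gateway's native `/api/*` routes. The actual Nginx rule is not shown in this diff; the path mapping it implies can be sketched as a plain function (the name `rewrite_langgraph_path` is hypothetical):

```python
LANGGRAPH_PREFIX = "/api/langgraph"


def rewrite_langgraph_path(path: str) -> str:
    """Map a public /api/langgraph/* path to the Gateway's /api/* equivalent."""
    if path == LANGGRAPH_PREFIX or path.startswith(LANGGRAPH_PREFIX + "/"):
        return "/api" + path[len(LANGGRAPH_PREFIX):]
    return path  # every other path is left unchanged


rewritten = rewrite_langgraph_path("/api/langgraph/threads/abc123/runs")
untouched = rewrite_langgraph_path("/api/models")
```

Matching on the prefix followed by a slash keeps a path such as `/api/langgraphs` from being rewritten by accident.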
**Entry Point**: `packages/harness/deerflow/agents/lead_agent/agent.py:make_lead_agent`
|
**Entry Point**: `packages/harness/deerflow/agents/lead_agent/agent.py:make_lead_agent`
|
||||||
|
|
||||||
@@ -63,8 +65,7 @@ The agent runtime is embedded in the FastAPI Gateway and built on LangGraph for
|
|||||||
- Tool execution orchestration
|
- Tool execution orchestration
|
||||||
- SSE streaming for real-time responses
|
- SSE streaming for real-time responses
|
||||||
|
|
||||||
**Graph registry**: `langgraph.json` remains available for tooling, Studio, or direct LangGraph Server compatibility.
|
**Configuration**: `langgraph.json`
|
||||||
It is not the default service entrypoint; scripts and Docker deployments run the Gateway embedded runtime.
|
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -77,13 +78,12 @@ It is not the default service entrypoint; scripts and Docker deployments run the
|
|||||||
|
|
||||||
### Gateway API
|
### Gateway API
|
||||||
|
|
||||||
FastAPI application providing REST endpoints plus the public LangGraph-compatible `/api/langgraph/*` runtime routes.
|
FastAPI application providing REST endpoints for non-agent operations.
|
||||||
|
|
||||||
**Entry Point**: `app/gateway/app.py`
|
**Entry Point**: `app/gateway/app.py`
|
||||||
|
|
||||||
**Routers**:
|
**Routers**:
|
||||||
- `models.py` - `/api/models` - Model listing and details
|
- `models.py` - `/api/models` - Model listing and details
|
||||||
- `thread_runs.py` / `runs.py` - `/api/threads/{id}/runs`, `/api/runs/*` - LangGraph-compatible runs and streaming
|
|
||||||
- `mcp.py` - `/api/mcp` - MCP server configuration
|
- `mcp.py` - `/api/mcp` - MCP server configuration
|
||||||
- `skills.py` - `/api/skills` - Skills management
|
- `skills.py` - `/api/skills` - Skills management
|
||||||
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
|
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
|
||||||
@@ -91,7 +91,7 @@ FastAPI application providing REST endpoints plus the public LangGraph-compatibl
|
|||||||
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving
|
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving
|
||||||
- `suggestions.py` - `/api/threads/{id}/suggestions` - Follow-up suggestion generation
|
- `suggestions.py` - `/api/threads/{id}/suggestions` - Follow-up suggestion generation
|
||||||
|
|
||||||
The web conversation delete flow first deletes Gateway-managed thread state through the LangGraph-compatible route, then the Gateway `threads.py` router removes DeerFlow-managed filesystem data via `Paths.delete_thread_dir()`.
|
The web conversation delete flow is now split across both backend surfaces: LangGraph handles `DELETE /api/langgraph/threads/{thread_id}` for thread state, then the Gateway `threads.py` router removes DeerFlow-managed filesystem data via `Paths.delete_thread_dir()`.
|
||||||
|
|
||||||
### Agent Architecture
|
### Agent Architecture
|
||||||
|
|
||||||
@@ -353,10 +353,10 @@ SKILL.md Format:
|
|||||||
POST /api/langgraph/threads/{thread_id}/runs
|
POST /api/langgraph/threads/{thread_id}/runs
|
||||||
{"input": {"messages": [{"role": "user", "content": "Hello"}]}}
|
{"input": {"messages": [{"role": "user", "content": "Hello"}]}}
|
||||||
|
|
||||||
2. Nginx → Gateway API (8001)
|
2. Nginx → LangGraph Server (2024)
|
||||||
`/api/langgraph/*` is rewritten to Gateway's LangGraph-compatible `/api/*` routes
|
Proxied to LangGraph server
|
||||||
|
|
||||||
3. Gateway embedded runtime
|
3. LangGraph Server
|
||||||
a. Load/create thread state
|
a. Load/create thread state
|
||||||
b. Execute middleware chain:
|
b. Execute middleware chain:
|
||||||
- ThreadDataMiddleware: Set up paths
|
- ThreadDataMiddleware: Set up paths
|
||||||
@@ -412,7 +412,7 @@ SKILL.md Format:
|
|||||||
### Thread Cleanup Flow
|
### Thread Cleanup Flow
|
||||||
|
|
||||||
```
|
```
|
||||||
1. Client deletes conversation via the LangGraph-compatible Gateway route
|
1. Client deletes conversation via LangGraph
|
||||||
DELETE /api/langgraph/threads/{thread_id}
|
DELETE /api/langgraph/threads/{thread_id}
|
||||||
|
|
||||||
2. Web UI follows up with Gateway cleanup
|
2. Web UI follows up with Gateway cleanup
|
||||||
|
|||||||
@@ -1,331 +0,0 @@
|
|||||||
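# User Authentication and Isolation Design

This document describes the design of DeerFlow's current built-in authentication module, not a historical RFC. It covers browser login, API authentication, CSRF, user isolation, first-time initialization, password reset, internal calls, and upgrade migration.

## Design goals

The core goal of the authentication module is to take DeerFlow from a "local single-user tool" to an "agent runtime that can be deployed for multiple users", and to carry user identity through the HTTP API, the LangGraph-compatible runtime, the filesystem, memory, custom agents, and feedback data.

Design constraints:

- Authentication enforced by default: apart from the health check, documentation, and auth bootstrap endpoints, every HTTP route requires a valid session.
- Ownership is held by the server: client metadata cannot declare `user_id` or `owner_id`.
- Isolation on by default: repositories, file paths, memory, and agent configuration all resolve against the current user by default.
- Legacy data can be upgraded: threads left by the unauthenticated version can be migrated to the admin once an admin exists.
- Passwords never reach the logs: the operator sets the password during first-time initialization; `reset_admin` only writes a 0600 credential file.

Non-goals:

- The current OAuth endpoints are placeholders; third-party login is not implemented yet.
- There are currently only two user roles, `admin` and `user`; fine-grained RBAC is not implemented yet.
- The current login rate limit is an in-process dictionary, so it is not a globally exact limit across multiple workers.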
|
|
||||||
## Core model
|
|
||||||
|
|
||||||
```mermaid
|
|
||||||
graph TB
|
|
||||||
classDef actor fill:#D8CFC4,stroke:#6E6259,color:#2F2A26;
|
|
||||||
classDef api fill:#C9D7D2,stroke:#5D706A,color:#21302C;
|
|
||||||
classDef state fill:#D7D3E8,stroke:#6B6680,color:#29263A;
|
|
||||||
classDef data fill:#E5D2C4,stroke:#806A5B,color:#30251E;
|
|
||||||
|
|
||||||
Browser["Browser — access_token cookie and csrf_token cookie"]:::actor
|
|
||||||
AuthMiddleware["AuthMiddleware — strict session gate"]:::api
|
|
||||||
CSRFMiddleware["CSRFMiddleware — double-submit token and Origin check"]:::api
|
|
||||||
AuthRoutes["Auth routes — initialize login register logout me change-password"]:::api
|
|
||||||
UserContext["Current user ContextVar — request-scoped identity"]:::state
|
|
||||||
Repositories["Repositories — AUTO resolves user_id from context"]:::state
|
|
||||||
Files["Filesystem — users/{user_id}/threads/{thread_id}/user-data"]:::data
|
|
||||||
Memory["Memory and agents — users/{user_id}/memory.json and agents"]:::data
|
|
||||||
|
|
||||||
Browser --> AuthMiddleware
|
|
||||||
Browser --> CSRFMiddleware
|
|
||||||
AuthMiddleware --> AuthRoutes
|
|
||||||
AuthMiddleware --> UserContext
|
|
||||||
UserContext --> Repositories
|
|
||||||
UserContext --> Files
|
|
||||||
UserContext --> Memory
|
|
||||||
```
|
|
||||||
|
|
||||||
### User table

User records are defined in `app.gateway.auth.models.User` and persisted to the `users` table. Key fields:

| Field | Meaning |
|---|---|
| `id` | User primary key; the JWT `sub` claim uses this value |
| `email` | Unique login name |
| `password_hash` | bcrypt hash; may be empty for OAuth users |
| `system_role` | `admin` or `user` |
| `needs_setup` | After a reset, requires the user to complete email / password setup |
| `token_version` | Incremented on password change or reset; used to invalidate old JWTs |

### Runtime identity

After successful authentication, `AuthMiddleware` writes the user to all of the following:

- `request.state.user`
- `request.state.auth`
- the `ContextVar` in `deerflow.runtime.user_context`

The `ContextVar` is the core boundary here. The upper Gateway layer is responsible for writing the identity; the lower persistence / file path layers only read the structured current user and do not depend back on the concrete types in `app.gateway.auth`.

The user parameter of a repository call can be understood as a three-state ADT:

```scala
enum UserScope:
  case AutoFromContext
  case Explicit(userId: String)
  case BypassForMigration
```

The corresponding Python implementation is `AUTO | str | None`:

- `AUTO`: resolve the current user from the `ContextVar`; raise an error if there is no context.
- `str`: specify the user explicitly, mainly for tests or management scripts.
- `None`: skip user filtering; only migration scripts or the admin CLI may use this.
||||||
|
|
||||||
## Login and initialization flows

### First-time initialization

On first startup, if no admin exists, the service does not create an account automatically; it only logs a message prompting the operator to visit `/setup`.

Flow:

1. The user visits `/setup`.
2. The frontend calls `GET /api/v1/auth/setup-status`.
3. If the response is `{"needs_setup": true}`, the frontend shows the create-admin form.
4. The form submits `POST /api/v1/auth/initialize`.
5. The server confirms that no admin currently exists, then creates a user with `system_role="admin"` and `needs_setup=false`.
6. The server sets the `access_token` HttpOnly cookie, and the user enters the workspace.

`/api/v1/auth/initialize` is available only while no admin exists. Concurrent initialization is backstopped by a database unique constraint; the losing request gets a 409.

### Regular login

`POST /api/v1/auth/login/local` uses `OAuth2PasswordRequestForm`:

- `username` is the email.
- `password` is the password.
- On success, a JWT is issued and placed in the `access_token` HttpOnly cookie.
- The response body returns only `expires_in` and `needs_setup`; it does not return the token.

Failed logins are counted per client IP. IP resolution trusts `X-Real-IP` only when the TCP peer belongs to `AUTH_TRUSTED_PROXIES`; `X-Forwarded-For` is not used.
||||||
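The client-IP rule above (trust `X-Real-IP` only when the TCP peer is a trusted proxy) can be sketched as follows. The function name, its signature, and the idea of passing the trusted proxies as a list of CIDR strings are assumptions for illustration; the source does not show how `AUTH_TRUSTED_PROXIES` is parsed.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip(peer_ip: str, x_real_ip: Optional[str], trusted_proxies: Iterable[str]) -> str:
    """Pick the IP address that failed-login counting should be keyed on."""
    peer = ipaddress.ip_address(peer_ip)
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in trusted_proxies]
    # Only a trusted proxy is allowed to speak for the real client.
    if x_real_ip and any(peer in net for net in networks):
        try:
            return str(ipaddress.ip_address(x_real_ip.strip()))
        except ValueError:
            pass  # malformed header: fall back to the TCP peer
    return str(peer)
```

Ignoring the header for untrusted peers matters because any client can set `X-Real-IP` itself; honoring it unconditionally would let an attacker rotate the counted address on every attempt.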
|
|
||||||
### Registration

`POST /api/v1/auth/register` creates a regular `user` and logs them in automatically.

The current implementation allows regular users to register while no admin exists, but `setup-status` still returns `needs_setup=true` because the admin still does not exist. This is the current product policy boundary: if a later requirement is "an admin must be initialized before regular users can register", an admin-exists gate needs to be added to `/register`.

### Password change and reset setup

`POST /api/v1/auth/change-password` requires the current password and the new password:

- Verify the current password.
- Update the bcrypt hash.
- `token_version += 1`, which invalidates old JWTs immediately.
- Reissue the cookie.
- If `needs_setup=true` and `new_email` is provided, update the email and clear `needs_setup`.
||||||
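The way `token_version` invalidates old sessions can be illustrated with a small sketch. JWT signing is omitted, and the `ver` claim name, the `User` dataclass, and the helper functions are hypothetical; the source only states that the version is incremented and that a mismatch is rejected.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: str
    token_version: int = 0


def issue_claims(user: User) -> dict:
    """Claims that would be signed into the JWT (signing omitted in this sketch)."""
    return {"sub": user.id, "ver": user.token_version}


def claims_are_current(claims: dict, user: User) -> bool:
    """A token is accepted only if it carries the user's current version."""
    return claims.get("sub") == user.id and claims.get("ver") == user.token_version


def change_password(user: User) -> dict:
    """Bump the version, then reissue claims for the new cookie."""
    user.token_version += 1
    return issue_claims(user)
```

Because verification compares against the stored version, every token issued before the bump fails the check without any server-side session list.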
|
|
||||||
`python -m app.gateway.auth.reset_admin` will:

- Find the admin, or the user with the specified email.
- Generate a random password.
- Update the password hash.
- `token_version += 1`.
- Set `needs_setup=true`.
- Write `.deer-flow/admin_initial_credentials.txt` with mode `0600`.

The command line prints only the path of the credential file, never the plaintext password.
||||||
|
|
||||||
## HTTP authentication boundary

`AuthMiddleware` is a fail-closed (deny by default) global authentication gate.

Public paths:

- `/health`
- `/docs`
- `/redoc`
- `/openapi.json`
- `/api/v1/auth/login/local`
- `/api/v1/auth/register`
- `/api/v1/auth/logout`
- `/api/v1/auth/setup-status`
- `/api/v1/auth/initialize`

All other paths require a valid `access_token` cookie. When the cookie is present but the JWT is invalid or expired, the user does not exist, or `token_version` does not match, the middleware returns 401 directly instead of letting the request pass through to the business routes.
||||||
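The fail-closed behavior above can be summarized in a short sketch. The `gate` function and the `verify` callback are hypothetical, and matching public paths by exact string is an assumption; the real middleware may use prefixes or a router-level check.

```python
from typing import Callable, Optional

# Public paths copied from the list above; exact-match lookup is an assumption.
PUBLIC_PATHS = frozenset({
    "/health",
    "/docs",
    "/redoc",
    "/openapi.json",
    "/api/v1/auth/login/local",
    "/api/v1/auth/register",
    "/api/v1/auth/logout",
    "/api/v1/auth/setup-status",
    "/api/v1/auth/initialize",
})


def gate(path: str, access_token: Optional[str], verify: Callable[[str], bool]) -> int:
    """Return the status the gate would produce: 200 to continue, 401 to reject."""
    if path in PUBLIC_PATHS:
        return 200
    # Fail closed: a missing cookie and a cookie that fails verification
    # are both rejected before any business route runs.
    if access_token is None or not verify(access_token):
        return 401
    return 200
```

Here `verify` stands in for the full check described in the text: signature, expiry, user existence, and `token_version`.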
|
|
||||||
Route-level owner checks are done by `require_permission(..., owner_check=True)`:

- Read requests allow compatibility reads of old, untracked legacy threads.
- Write / delete requests use `require_existing=True`, which requires the thread row to exist and belong to the current user; this prevents a missing row after deletion from letting another user pass the check by mistake.

## CSRF design

DeerFlow uses the Double Submit Cookie pattern:

- The server sets a `csrf_token` cookie.
- The frontend sends an `X-CSRF-Token` header with the same value on state-changing requests.
- The server compares the cookie and the header with `secrets.compare_digest`.
|
|
||||||
Methods that require CSRF protection:

- `POST`
- `PUT`
- `DELETE`
- `PATCH`

The auth bootstrap endpoints (login/register/initialize/logout) do not require the double-submit token, because the browser has no token yet on the first call. These endpoints do validate the browser `Origin` and reject a hostile Origin, which prevents login CSRF / session fixation.
||||||
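The double-submit comparison described in this section can be sketched as follows. The function name and the plain-mapping inputs are assumptions for illustration, and the sketch deliberately leaves out the `Origin` check and the bootstrap-endpoint exemption.

```python
import secrets
from typing import Mapping

# Methods that change state and therefore need the double-submit check.
STATE_CHANGING = frozenset({"POST", "PUT", "DELETE", "PATCH"})


def csrf_ok(method: str, cookies: Mapping[str, str], headers: Mapping[str, str]) -> bool:
    """Return True if the request passes the double-submit CSRF check."""
    if method.upper() not in STATE_CHANGING:
        return True
    cookie = cookies.get("csrf_token")
    header = headers.get("X-CSRF-Token")
    if not cookie or not header:
        return False
    # Constant-time comparison; encode to bytes so non-ASCII input cannot raise.
    return secrets.compare_digest(cookie.encode("utf-8"), header.encode("utf-8"))
```

The check works because a cross-site page can cause the browser to send the cookie but cannot read it, so it cannot supply a matching header value.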
|
|
||||||
## User isolation

### Thread metadata

Thread metadata is stored in `threads_meta`; the key isolation field is `user_id`.

When a thread is created:

- `metadata.user_id` and `metadata.owner_id` passed by the client are stripped.
- `ThreadMetaRepository.create(..., user_id=AUTO)` resolves the real user from the `ContextVar`.
- `/api/threads/search` returns only the current user's threads by default.

On read / update / delete:

- `get()` filters by the current user by default.
- `check_access()` is used for route owner checks.
- Threads owned by other users return 404, so resource existence is not leaked.

### Filesystem

Current thread file layout:

```text
{base_dir}/users/{user_id}/threads/{thread_id}/user-data/
├── workspace/
├── uploads/
└── outputs/
```

Inside the sandbox, the agent sees a unified virtual path:

```text
/mnt/user-data/workspace
/mnt/user-data/uploads
/mnt/user-data/outputs
```

`ThreadDataMiddleware` uses `get_effective_user_id()` to resolve the current user and build the thread path. Without an authentication context it falls back to the `default` user bucket, which mainly serves internal calls, the embedded client, or local execution paths with no HTTP.

### Memory

Default memory storage:

```text
{base_dir}/users/{user_id}/memory.json
{base_dir}/users/{user_id}/agents/{agent_name}/memory.json
```

With a user context, an empty or relative `memory.storage_path` both use the per-user default paths above; only an absolute `memory.storage_path` is treated as an explicit opt-out of per-user isolation, and all users then share that path. The legacy path with no user context still resolves a relative `storage_path` under `Paths.base_dir`.
||||||
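The `memory.storage_path` resolution rules above can be written out as a small decision function. This is a sketch under assumptions: the function name and signature are hypothetical, paths are modeled with `PurePosixPath` so the example behaves the same on any platform, and the fallback to `{base_dir}/memory.json` for an empty path with no user context is a guess that the text does not confirm.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_memory_path(
    base_dir: PurePosixPath,
    storage_path: Optional[str],
    user_id: Optional[str],
) -> PurePosixPath:
    """Decide where the global memory file lives for this call."""
    if storage_path and PurePosixPath(storage_path).is_absolute():
        # Explicit opt-out of per-user isolation: every user shares this file.
        return PurePosixPath(storage_path)
    if user_id is not None:
        # Empty or relative path with a user context: per-user default.
        return base_dir / "users" / user_id / "memory.json"
    if storage_path:
        # Legacy path without a user context: relative to base_dir.
        return base_dir / storage_path
    return base_dir / "memory.json"  # assumed legacy default
```

Checking the absolute case first reflects the rule that only an absolute path overrides isolation, regardless of whether a user is in context.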
|
|
||||||
### Custom agents

User-defined agents are written to:

```text
{base_dir}/users/{user_id}/agents/{agent_name}/
├── config.yaml
├── SOUL.md
└── memory.json
```

The old layout `{base_dir}/agents/{agent_name}/` serves only as a read-only compatibility fallback. Updating or deleting an old shared agent requires running the migration script first.

## Internal calls and IM channels

IM channel workers are not browser users and hold no browser cookies. They authenticate internally through the Gateway:

- The request carries `X-DeerFlow-Internal-Token`.
- It also carries a matching CSRF cookie/header.
- The server identifies it as the internal user, with `id="default"` and `system_role="internal"`.

This means data produced by channels goes into the `default` user bucket by default. This choice suits a "platform-level bot identity", but it is not "each IM user isolated separately". To isolate external IM users later, the external platform user needs to be mapped to a DeerFlow user, and the channel manager needs to set the corresponding scoped identity.

## LangGraph-compatible authentication

The Gateway embedded runtime paths are protected by `AuthMiddleware` and `CSRFMiddleware`.

The repository still keeps `app.gateway.langgraph_auth` for the direct LangGraph Server mode:

- `@auth.authenticate` validates the JWT cookie, CSRF, user existence, and `token_version`.
- `@auth.on` injects `user_id` when metadata is written, and returns a `{"user_id": current_user}` filter on read paths.

This ensures that Gateway routes and the LangGraph-compatible direct mode share the same JWT semantics.
|
|
||||||
## Upgrade and migration

When upgrading from the unauthenticated version, historical threads without a `user_id` may exist.

Current strategy:

1. On first startup, if no admin exists, only prompt to visit `/setup`; do not migrate.
2. The operator creates the admin.
3. On later startups, `_ensure_admin_user()` finds the admin and migrates threads in the LangGraph store that lack `metadata.user_id` to the admin.

Migration of the old filesystem layout is handled by a script:

```bash
cd backend
PYTHONPATH=. python scripts/migrate_user_isolation.py --dry-run
PYTHONPATH=. python scripts/migrate_user_isolation.py --user-id <target-user-id>
```

The migration script moves legacy `memory.json`, `threads/`, and `agents/` to the per-user layout.

## Security invariants

Invariants that must hold over the long term:

- JWTs are transported only in the HttpOnly cookie and never appear in response JSON.
- No non-public HTTP route may admit a request merely because a cookie is present; the JWT must be strictly verified.
- A `token_version` mismatch must be rejected, so old sessions are invalidated after a password change / reset.
- `user_id` / `owner_id` in client metadata must be stripped.
- The repository default `AUTO` must resolve from the current user context and must not silently degrade to a global query.
- Only migration scripts and the admin CLI may explicitly pass `user_id=None` to bypass isolation.
- Local file paths must be resolved through `Paths` and sandbox path validation; unvalidated user input must not be concatenated into paths.
- Exceptions caught in authentication, migration, or background tasks must be logged; empty catch blocks are not allowed.
|
|
||||||
## Known boundaries

| Boundary | Current behavior | Future direction |
|---|---|---|
| Registering regular users with no admin | Registering a regular `user` is allowed | If the product requires initializing the admin first, add a gate to `/register` |
| Login rate limiting | In-process dict; exact with a single worker, approximate with multiple workers | Redis / DB-backed rate limiter |
| OAuth | Endpoints are placeholders, not implemented | Integrate a provider and unify `token_version` / role semantics |
| IM user isolation | Channels use the `default` internal user | Build a mapping from external users to DeerFlow users |
| Absolute memory path | Memory is explicitly shared | Make the opt-out risk explicit in the UI / docs |

## Related files

| File | Responsibility |
|---|---|
| `app/gateway/auth_middleware.py` | Global authentication gate, strict JWT verification, writes the user context |
| `app/gateway/csrf_middleware.py` | CSRF double-submit and auth Origin validation |
| `app/gateway/routers/auth.py` | initialize/login/register/logout/me/change-password |
| `app/gateway/auth/jwt.py` | JWT creation and parsing |
| `app/gateway/auth/reset_admin.py` | Password reset CLI |
| `app/gateway/auth/credential_file.py` | Writes the 0600 credential file |
| `app/gateway/authz.py` | Route permissions and owner check |
| `deerflow/runtime/user_context.py` | Current-user ContextVar and the `AUTO` sentinel |
| `deerflow/persistence/thread_meta/` | thread metadata owner filter |
| `deerflow/config/paths.py` | per-user filesystem layout |
| `deerflow/agents/middlewares/thread_data_middleware.py` | Resolves the user's thread directory at run time |
| `deerflow/agents/memory/storage.py` | per-user memory storage |
| `deerflow/config/agents_config.py` | per-user custom agents |
| `app/channels/manager.py` | Internal-auth calls from IM channels |
| `scripts/migrate_user_isolation.py` | Migrates legacy data to the per-user layout |
| `.deer-flow/data/deerflow.db` | Unified SQLite database containing the users / threads_meta / runs / feedback tables, among others |
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | User-defined agent configuration, SOUL, and agent memory |
| `.deer-flow/admin_initial_credentials.txt` | New credential file generated by `reset_admin` (0600; delete after reading) |
@@ -24,12 +24,12 @@ All other test plan sections were executed against either:
|
|||||||
|
|
||||||
| Case | Title | What it covers | Why not run |
|
| Case | Title | What it covers | Why not run |
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
| TC-DOCKER-01 | `deerflow.db` volume persistence | Verify the `DEER_FLOW_HOME` bind mount survives container restart | needs `docker compose up` |
|
| TC-DOCKER-01 | `users.db` volume persistence | Verify the `DEER_FLOW_HOME` bind mount survives container restart | needs `docker compose up` |
|
||||||
| TC-DOCKER-02 | Session persistence across container restart | `AUTH_JWT_SECRET` env var keeps cookies valid after `docker compose down && up` | needs `docker compose down/up` |
|
| TC-DOCKER-02 | Session persistence across container restart | `AUTH_JWT_SECRET` env var keeps cookies valid after `docker compose down && up` | needs `docker compose down/up` |
|
||||||
| TC-DOCKER-03 | Per-worker rate limiter divergence | Confirms in-process `_login_attempts` dict doesn't share state across `gunicorn` workers (4 by default in the compose file); known limitation, documented | needs multi-worker container |
|
| TC-DOCKER-03 | Per-worker rate limiter divergence | Confirms in-process `_login_attempts` dict doesn't share state across `gunicorn` workers (4 by default in the compose file); known limitation, documented | needs multi-worker container |
|
||||||
| TC-DOCKER-04 | IM channels use internal Gateway auth | Verify Feishu/Slack/Telegram dispatchers attach the process-local internal auth header plus CSRF cookie/header when calling Gateway-compatible LangGraph APIs | needs `docker logs` |
|
| TC-DOCKER-04 | IM channels skip AuthMiddleware | Verify Feishu/Slack/Telegram dispatchers run in-container against `http://langgraph:2024` without going through nginx | needs `docker logs` |
|
||||||
| TC-DOCKER-05 | Reset credentials surfacing | `reset_admin` writes a 0600 credential file in `DEER_FLOW_HOME` instead of logging plaintext. The file-based behavior is validated by non-Docker reset tests, so the only Docker-specific gap is verifying the volume mount carries the file out to the host | needs container + host volume |
|
| TC-DOCKER-05 | Admin credentials surfacing | **Updated post-simplify** — was "log scrape", now "0600 credential file in `DEER_FLOW_HOME`". The file-based behavior is already validated by TC-1.1 + TC-UPG-13 on sg_dev (non-Docker), so the only Docker-specific gap is verifying the volume mount carries the file out to the host | needs container + host volume |
|
||||||
| TC-DOCKER-06 | Docker deploy uses Gateway embedded runtime | `./scripts/deploy.sh` produces a Gateway + frontend + nginx topology (no `langgraph` container); same auth flow as local `make dev` | needs `docker compose up` |
|
| TC-DOCKER-06 | Gateway-mode Docker deploy | `./scripts/deploy.sh --gateway` produces a 3-container topology (no `langgraph` container); same auth flow as standard mode | needs `docker compose --profile gateway` |
|
||||||
|
|
||||||
## Coverage already provided by non-Docker tests
|
## Coverage already provided by non-Docker tests
|
||||||
|
|
||||||
@@ -41,9 +41,9 @@ the test cases that ran on sg_dev or local:
|
|||||||
| TC-DOCKER-01 (volume persistence) | TC-REENT-01 on sg_dev (admin row survives gateway restart) — same SQLite file, just no container layer between |
|
| TC-DOCKER-01 (volume persistence) | TC-REENT-01 on sg_dev (admin row survives gateway restart) — same SQLite file, just no container layer between |
|
||||||
| TC-DOCKER-02 (session persistence) | TC-API-02/03/06 (cookie roundtrip), plus TC-REENT-04 (multi-cookie) — JWT verification is process-state-free, container restart is equivalent to `pkill uvicorn && uv run uvicorn` |
|
| TC-DOCKER-02 (session persistence) | TC-API-02/03/06 (cookie roundtrip), plus TC-REENT-04 (multi-cookie) — JWT verification is process-state-free, container restart is equivalent to `pkill uvicorn && uv run uvicorn` |
|
||||||
| TC-DOCKER-03 (per-worker rate limit) | TC-GW-04 + TC-REENT-09 (single-worker rate limit + 5min expiry). The cross-worker divergence is an architectural property of the in-memory dict; no auth code path differs |
|
| TC-DOCKER-03 (per-worker rate limit) | TC-GW-04 + TC-REENT-09 (single-worker rate limit + 5min expiry). The cross-worker divergence is an architectural property of the in-memory dict; no auth code path differs |
|
||||||
| TC-DOCKER-04 (IM channels use internal auth) | Code-level: `app/channels/manager.py` creates the `langgraph_sdk` client with `create_internal_auth_headers()` plus CSRF cookie/header, so channel workers do not rely on browser cookies |
|
| TC-DOCKER-04 (IM channels skip auth) | Code-level only: `app/channels/manager.py` uses `langgraph_sdk` directly with no cookie handling. The langgraph_auth handler is bypassed by going through SDK, not HTTP |
|
||||||
| TC-DOCKER-05 (credential surfacing) | `reset_admin` writes `.deer-flow/admin_initial_credentials.txt` with mode 0600 and logs only the path — the only Docker-unique step is whether the bind mount projects this path onto the host, which is a `docker compose` config check, not a runtime behavior change |
|
| TC-DOCKER-05 (credential surfacing) | TC-1.1 on sg_dev (file at `~/deer-flow/backend/.deer-flow/admin_initial_credentials.txt`, mode 0600, password 22 chars) — the only Docker-unique step is whether the bind mount projects this path onto the host, which is a `docker compose` config check, not a runtime behavior change |
|
||||||
| TC-DOCKER-06 (Gateway embedded runtime container) | Section 七 7.2 covered by TC-GW-01..05 + Section 二 (Gateway auth flow on sg_dev) — same Gateway code, container is just a packaging change |
|
| TC-DOCKER-06 (gateway-mode container) | Section 七 7.2 covered by TC-GW-01..05 + Section 二 (gateway-mode auth flow on sg_dev) — same Gateway code, container is just a packaging change |
|
||||||
|
|
||||||
## Reproduction steps when Docker becomes available
|
## Reproduction steps when Docker becomes available
|
||||||
|
|
||||||
@@ -72,6 +72,6 @@ Then run TC-DOCKER-01..06 from the test plan as written.
|
|||||||
about *container packaging* details (bind mounts, multi-worker, log
|
about *container packaging* details (bind mounts, multi-worker, log
|
||||||
collection), not about whether the auth code paths work.
|
collection), not about whether the auth code paths work.
|
||||||
- **TC-DOCKER-05 was updated in place** in `AUTH_TEST_PLAN.md` to reflect
|
- **TC-DOCKER-05 was updated in place** in `AUTH_TEST_PLAN.md` to reflect
|
||||||
the current reset flow (`reset_admin` → 0600 credentials file, no log leak).
|
the post-simplify reality (credentials file → 0600 file, no log leak).
|
||||||
The old "grep 'Password:' in docker logs" expectation would have failed
|
The old "grep 'Password:' in docker logs" expectation would have failed
|
||||||
silently and given a false sense of coverage.
|
silently and given a false sense of coverage.
|
||||||
|
|||||||
+156
-179
@@ -4,12 +4,10 @@
|
|||||||
|
|
||||||
| Mode | Start command | Auth layer | Port |
|
| Mode | Start command | Auth layer | Port |
|
||||||
|------|---------|---------|------|
|
|------|---------|---------|------|
|
||||||
| Standard mode | `make dev` | Gateway AuthMiddleware (full) | 2026 (nginx) |
|
| Standard mode | `make dev` | Gateway AuthMiddleware + LangGraph auth | 2026 (nginx) |
|
||||||
|
| Gateway mode | `make dev-pro` | Gateway AuthMiddleware (full) | 2026 (nginx) |
|
||||||
| Direct Gateway | `cd backend && make gateway` | Gateway AuthMiddleware | 8001 |
|
| Direct Gateway | `cd backend && make gateway` | Gateway AuthMiddleware | 8001 |
|
||||||
| Direct LangGraph compatibility | Used when running the LangGraph toolchain manually | LangGraph auth | 2024 |
|
| Direct LangGraph | `cd backend && make dev` | LangGraph auth | 2024 |
|
||||||
|
|
||||||
`make dev`, Docker dev, and production deployments all run the Gateway embedded runtime by default.
|
|
||||||
`app.gateway.langgraph_auth` is used only for the retained direct LangGraph toolchain / Studio compatibility tests; it is not the standard service startup path.
|
|
||||||
|
|
||||||
The following tests must be run in each mode.
|
The following tests must be run in each mode.
|
||||||
|
|
||||||
@@ -21,18 +19,19 @@
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Clear existing data
|
# Clear existing data
|
||||||
rm -f backend/.deer-flow/data/deerflow.db
|
rm -f backend/.deer-flow/users.db
|
||||||
|
|
||||||
# Start standard mode (Gateway embedded runtime)
|
# Choose a mode to start
|
||||||
make dev
|
make dev # standard mode
|
||||||
|
# or
|
||||||
|
make dev-pro # Gateway mode
|
||||||
```
|
```
|
||||||
|
|
||||||
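**Verification points:**
|
**Verification points:**
|
||||||
- [ ] The console does not print the admin email or a plaintext password
|
- [ ] The console prints the admin email and a random password
|
||||||
- [ ] The console shows `First boot detected — no admin account exists.`
|
- [ ] The password is a 22-character string in the `secrets.token_urlsafe(16)` format
|
||||||
- [ ] The console prompts to visit `/setup` to finish creating the admin
|
- [ ] The email is `admin@deerflow.dev`
|
||||||
- [ ] `GET /api/v1/auth/setup-status` returns `{"needs_setup": true}`
|
- [ ] The console shows `Change it after login: Settings -> Account`
|
||||||
- [ ] Visiting `/login` in the frontend redirects to `/setup`
|
|
||||||
|
|
||||||
### 1.2 Subsequent startup
|
### 1.2 Subsequent startup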
|
||||||
|
|
||||||
@@ -43,8 +42,7 @@ make dev
|
|||||||
|
|
||||||
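**Verification points:**
|
**Verification points:**
|
||||||
- [ ] The console does not print the password
|
- [ ] The console does not print the password
|
||||||
- [ ] `GET /api/v1/auth/setup-status` returns `{"needs_setup": false}`
|
- [ ] If the admin still has `needs_setup=True`, the console shows a warning
|
||||||
- [ ] A logged-in user with `needs_setup=True` is directed to `/setup` when visiting the workspace, to complete the change-email / change-password flow
|
|
||||||
|
|
||||||
### 1.3 Environment variable configuration
|
### 1.3 Environment variable configuration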
|
||||||
|
|
||||||
@@ -57,7 +55,7 @@ make dev
|
|||||||
|
|
||||||
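## 2. API flow tests
|
## 2. API flow tests
|
||||||
|
|
||||||
> The examples below use `BASE=http://localhost:2026`. Standard mode exposes this address through nginx.
|
> The examples below use `BASE=http://localhost:2026`. Standard mode and Gateway mode both use this address.
|
||||||
> For direct-connection tests, substitute the corresponding port.
|
> For direct-connection tests, substitute the corresponding port.
|
||||||
>
|
>
|
||||||
> **CSRF token extraction**: several cases extract the CSRF token from the cookie jar; use this form throughout:
|
> **CSRF token extraction**: several cases extract the CSRF token from the cookie jar; use this form throughout: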
|
||||||
@@ -78,22 +76,19 @@ make dev
|
|||||||
curl -s $BASE/api/v1/auth/setup-status | jq .
|
curl -s $BASE/api/v1/auth/setup-status | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
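**Expected:**
|
**Expected:** returns `{"needs_setup": false}` (the admin is created automatically at startup, so `count_users() > 0`). It may return `true` only in the very short window before startup completes.
|
||||||
- Clean database with no admin initialized yet: returns `{"needs_setup": true}`
|
|
||||||
- Admin already exists: returns `{"needs_setup": false}`
|
|
||||||
|
|
||||||
#### TC-API-02: First-time admin initialization
|
#### TC-API-02: First admin login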
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -s -X POST $BASE/api/v1/auth/initialize \
|
curl -s -X POST $BASE/api/v1/auth/login/local \
|
||||||
-H "Content-Type: application/json" \
|
-d "username=admin@deerflow.dev&password=<console-password>" \
|
||||||
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
|
||||||
-c cookies.txt | jq .
|
-c cookies.txt | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:**
|
**Expected:**
|
||||||
- Status code 201
|
- Status code 200
|
||||||
- Body: `{"id": "...", "email": "admin@example.com", "system_role": "admin", "needs_setup": false}`
|
- Body: `{"expires_in": 604800, "needs_setup": true}`
|
||||||
- `cookies.txt` contains `access_token` (HttpOnly) and `csrf_token` (not HttpOnly)
|
- `cookies.txt` contains `access_token` (HttpOnly) and `csrf_token` (not HttpOnly)
|
||||||
|
|
||||||
#### TC-API-03: Get the current user
|
#### TC-API-03: Get the current user
|
||||||
@@ -102,9 +97,9 @@ curl -s -X POST $BASE/api/v1/auth/initialize \
|
|||||||
curl -s $BASE/api/v1/auth/me -b cookies.txt | jq .
|
curl -s $BASE/api/v1/auth/me -b cookies.txt | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:** `{"id": "...", "email": "admin@example.com", "system_role": "admin", "needs_setup": false}`
|
**Expected:** `{"id": "...", "email": "admin@deerflow.dev", "system_role": "admin", "needs_setup": true}`
|
||||||
|
|
||||||
#### TC-API-04: Password change flow
|
#### TC-API-04: Setup flow (change email + change password)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
||||||
@@ -112,36 +107,13 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b cookies.txt \
|
-b cookies.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"AdminPass1!","new_password":"NewPass123!"}' | jq .
|
-d '{"current_password":"<console-password>","new_password":"NewPass123!","new_email":"admin@example.com"}' | jq .
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:**
|
**Expected:**
|
||||||
- Status code 200
|
- Status code 200
|
||||||
- `{"message": "Password changed successfully"}`
|
- `{"message": "Password changed successfully"}`
|
||||||
- Calling `/auth/me` again still shows `admin@example.com`, and `needs_setup` is still `false`
|
- Calling `/auth/me` again shows the email changed to `admin@example.com`, and `needs_setup` changed to `false`
|
||||||
|
|
||||||
#### TC-API-04a: Setup flow after reset_admin (change email + change password)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
cd backend
|
|
||||||
python -m app.gateway.auth.reset_admin --email admin@example.com
|
|
||||||
# Read the post-reset password from .deer-flow/admin_initial_credentials.txt
|
|
||||||
|
|
||||||
curl -s -X POST $BASE/api/v1/auth/login/local \
|
|
||||||
-d "username=admin@example.com&password=<credential-file-password>" \
|
|
||||||
-c cookies.txt | jq .
|
|
||||||
|
|
||||||
CSRF=$(grep csrf_token cookies.txt | awk '{print $NF}')
|
|
||||||
curl -s -X POST $BASE/api/v1/auth/change-password \
|
|
||||||
-b cookies.txt \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
|
||||||
-d '{"current_password":"<credential-file-password>","new_password":"AdminPass2!","new_email":"admin2@example.com"}' | jq .
|
|
||||||
```
|
|
||||||
|
|
||||||
**Expected:**
|
|
||||||
- Login returns `{"expires_in": 604800, "needs_setup": true}`
|
|
||||||
- After `change-password`, `/auth/me` shows the email changed to `admin2@example.com`, and `needs_setup` changed to `false`
|
|
||||||
|
|
||||||
#### TC-API-05: Regular user registration
|
#### TC-API-05: Regular user registration
|
||||||
|
|
||||||
@@ -211,18 +183,20 @@ curl -s -X POST $BASE/api/threads/search \
|
|||||||
|
|
||||||
**Expected:** returns 0 threads, or only user2's own threads
|
**Expected:** returns 0 threads, or only user2's own threads
|
||||||
|
|
||||||
### 2.3 LangGraph-compatible Gateway route isolation
|
### 2.3 Standard-mode LangGraph Server isolation
|
||||||
|
|
||||||
#### TC-API-10: LangGraph-compatible endpoints require a cookie
|
> Test only in standard mode. Gateway mode does not run the LangGraph Server.
|
||||||
|
|
||||||
|
#### TC-API-10: LangGraph endpoints require a cookie
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Call the LangGraph-compatible endpoint without a cookie
|
# Call the LangGraph endpoint without a cookie
|
||||||
curl -s -w "%{http_code}" $BASE/api/langgraph/threads
|
curl -s -w "%{http_code}" $BASE/api/langgraph/threads
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:** 401
|
**Expected:** 401
|
||||||
|
|
||||||
#### TC-API-11: LangGraph-compatible routes are accessible with a cookie
|
#### TC-API-11: LangGraph is accessible with a cookie
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
||||||
@@ -230,10 +204,10 @@ curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
|||||||
|
|
||||||
**Expected:** 200, returns user1's thread list
|
**Expected:** 200, returns user1's thread list
|
||||||
|
|
||||||
#### TC-API-12: LangGraph-compatible route isolation: users see only their own threads
|
#### TC-API-12: LangGraph isolation: users see only their own threads
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# user2 lists threads
|
# user2 lists LangGraph threads
|
||||||
curl -s $BASE/api/langgraph/threads -b user2.txt | jq length
|
curl -s $BASE/api/langgraph/threads -b user2.txt | jq length
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -519,7 +493,7 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Inspect the database
|
# Inspect the database
|
||||||
sqlite3 backend/.deer-flow/data/deerflow.db "SELECT email, password_hash FROM users LIMIT 3;"
|
sqlite3 backend/.deer-flow/users.db "SELECT email, password_hash FROM users LIMIT 3;"
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:** `password_hash` starts with `$2b$` (bcrypt format)
|
**Expected:** `password_hash` starts with `$2b$` (bcrypt format)
|
||||||
@@ -532,25 +506,24 @@ sqlite3 backend/.deer-flow/data/deerflow.db "SELECT email, password_hash FROM us
|
|||||||
|
|
||||||
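### 4.1 First login flow
|
### 4.1 First login flow
|
||||||
|
|
||||||
#### TC-UI-01: Visiting the workspace with no admin redirects to setup
|
#### TC-UI-01: Visiting the home page redirects to login
|
||||||
|
|
||||||
1. Open `http://localhost:2026/workspace`
|
1. Open `http://localhost:2026/workspace`
|
||||||
2. **Expected:** automatically redirects to `/setup`
|
2. **Expected:** automatically redirects to `/login`
|
||||||
|
|
||||||
#### TC-UI-02: Create the admin on the Setup page
|
#### TC-UI-02: Login page
|
||||||
|
|
||||||
1. Enter the admin email, password, and password confirmation
|
1. Enter the admin email and the console password
|
||||||
2. Click Create Admin Account
|
2. Click Login
|
||||||
|
3. **Expected:** redirects to `/setup` (because `needs_setup=true`)
|
||||||
|
|
||||||
|
#### TC-UI-03: Setup page
|
||||||
|
|
||||||
|
1. Enter the new email, the console password (current), the new password, and the password confirmation
|
||||||
|
2. Click Complete Setup
|
||||||
3. **Expected:** redirects to `/workspace`
|
3. **Expected:** redirects to `/workspace`
|
||||||
4. Refreshing the page does not redirect back to `/setup`
|
4. Refreshing the page does not redirect back to `/setup`
|
||||||
|
|
||||||
#### TC-UI-03: Login page after initialization
|
|
||||||
|
|
||||||
1. After logging out, visit `/login`
|
|
||||||
2. Enter the admin email and password
|
|
||||||
3. Click Login
|
|
||||||
4. **Expected:** redirects to `/workspace`
|
|
||||||
|
|
||||||
#### TC-UI-04: Setup password mismatch
|
#### TC-UI-04: Setup password mismatch
|
||||||
|
|
||||||
1. The new password and the confirmation do not match
|
1. The new password and the confirmation do not match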
|
||||||
@@ -629,7 +602,7 @@ sqlite3 backend/.deer-flow/data/deerflow.db "SELECT email, password_hash FROM us
|
|||||||
#### TC-UI-15: reset_admin 后重新登录
|
#### TC-UI-15: reset_admin 后重新登录
|
||||||
|
|
||||||
1. 执行 `cd backend && python -m app.gateway.auth.reset_admin`
|
1. 执行 `cd backend && python -m app.gateway.auth.reset_admin`
|
||||||
2. 从 `.deer-flow/admin_initial_credentials.txt` 读取新密码并登录
|
2. 使用新密码登录
|
||||||
3. **预期:** 跳转到 `/setup` 页面(`needs_setup` 被重置为 true)
|
3. **预期:** 跳转到 `/setup` 页面(`needs_setup` 被重置为 true)
|
||||||
4. 旧 session 已失效
|
4. 旧 session 已失效
|
||||||
|
|
||||||
@@ -672,28 +645,18 @@ make install
|
|||||||
make dev
|
make dev
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-UPG-01: 首次启动等待 admin 初始化
|
#### TC-UPG-01: 首次启动创建 admin
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 控制台不输出 admin 邮箱或随机密码
|
- [ ] 控制台输出 admin 邮箱(`admin@deerflow.dev`)和随机密码
|
||||||
- [ ] 访问 `/setup` 可创建第一个 admin
|
|
||||||
- [ ] 无报错,正常启动
|
- [ ] 无报错,正常启动
|
||||||
|
|
||||||
#### TC-UPG-02: 旧 Thread 迁移到 admin
|
#### TC-UPG-02: 旧 Thread 迁移到 admin
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 创建第一个 admin
|
|
||||||
curl -s -X POST http://localhost:2026/api/v1/auth/initialize \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
|
||||||
-c cookies.txt
|
|
||||||
|
|
||||||
# 重启一次:启动迁移只在已有 admin 的启动路径执行
|
|
||||||
make stop && make dev
|
|
||||||
|
|
||||||
# 登录 admin
|
# 登录 admin
|
||||||
curl -s -X POST http://localhost:2026/api/v1/auth/login/local \
|
curl -s -X POST http://localhost:2026/api/v1/auth/login/local \
|
||||||
-d "username=admin@example.com&password=AdminPass1!" \
|
-d "username=admin@deerflow.dev&password=<控制台密码>" \
|
||||||
-c cookies.txt
|
-c cookies.txt
|
||||||
|
|
||||||
# 查看 thread 列表
|
# 查看 thread 列表
|
||||||
@@ -707,8 +670,8 @@ curl -s -X POST http://localhost:2026/api/threads/search \
|
|||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 返回的 thread 数量 ≥ 旧版创建的数量
|
- [ ] 返回的 thread 数量 ≥ 旧版创建的数量
|
||||||
- [ ] 控制台日志有 `Migrated N orphan LangGraph thread(s) to admin`
|
- [ ] 控制台日志有 `Migrated N orphaned thread(s) to admin`
|
||||||
- [ ] 旧 thread 只对 admin 可见
|
- [ ] 每个 thread 的 `metadata.owner_id` 都已被设为 admin 的 ID
|
||||||
|
|
||||||
#### TC-UPG-03: 旧 Thread 内容完整
|
#### TC-UPG-03: 旧 Thread 内容完整
|
||||||
|
|
||||||
@@ -720,7 +683,7 @@ curl -s http://localhost:2026/api/threads/<old-thread-id> \
|
|||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] `metadata.title` 保留原值(如 `old-thread-1`)
|
- [ ] `metadata.title` 保留原值(如 `old-thread-1`)
|
||||||
- [ ] 响应不回显服务端保留的 `user_id` / `owner_id`
|
- [ ] `metadata.owner_id` 已填充
|
||||||
|
|
||||||
#### TC-UPG-04: 新用户看不到旧 Thread
|
#### TC-UPG-04: 新用户看不到旧 Thread
|
||||||
|
|
||||||
@@ -743,19 +706,18 @@ curl -s -X POST http://localhost:2026/api/threads/search \
|
|||||||
|
|
||||||
### 5.3 数据库 Schema 兼容
|
### 5.3 数据库 Schema 兼容
|
||||||
|
|
||||||
#### TC-UPG-05: 无 deerflow.db 时创建 schema 但不创建默认用户
|
#### TC-UPG-05: 无 users.db 时自动创建
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ls -la backend/.deer-flow/data/deerflow.db
|
ls -la backend/.deer-flow/users.db
|
||||||
sqlite3 backend/.deer-flow/data/deerflow.db "SELECT COUNT(*) FROM users;"
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 文件存在,`sqlite3` 可查到 `users` 表含 `needs_setup`、`token_version` 列;未调用 `/initialize` 前用户数为 0
|
**预期:** 文件存在,`sqlite3` 可查到 `users` 表含 `needs_setup`、`token_version` 列
|
||||||
|
|
||||||
#### TC-UPG-06: deerflow.db WAL mode
|
#### TC-UPG-06: users.db WAL mode
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sqlite3 backend/.deer-flow/data/deerflow.db "PRAGMA journal_mode;"
|
sqlite3 backend/.deer-flow/users.db "PRAGMA journal_mode;"
|
||||||
```
|
```
|
||||||
|
|
||||||
**Expected:** returns `wal`
|
**Expected:** returns `wal`
|
||||||
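The WAL check above can be reproduced against a throwaway database. The following is a minimal sketch using only Python's standard `sqlite3` module; the function names are hypothetical, and whether DeerFlow enables WAL through this exact pragma call is an assumption.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path: str) -> str:
    """Switch a file-backed SQLite database to WAL and return the resulting mode."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns one row holding the journal mode now in effect.
        return conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    finally:
        conn.close()


def read_journal_mode(db_path: str) -> str:
    """Read the journal mode, as `sqlite3 <db> "PRAGMA journal_mode;"` would."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode;").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.db")  # throwaway file, not the real DB
        enable_wal(path)
        print(read_journal_mode(path))
```

WAL is stored in the database file itself, which is why a later connection (or the `sqlite3` CLI) still reports it after the process that enabled it has exited.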
@@ -806,9 +768,9 @@ make dev
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 服务正常启动(忽略 `deerflow.db`,无 auth 相关代码不报错)
|
- [ ] 服务正常启动(忽略 `users.db`,无 auth 相关代码不报错)
|
||||||
- [ ] 旧对话数据仍然可访问
|
- [ ] 旧对话数据仍然可访问
|
||||||
- [ ] `deerflow.db` 文件残留但不影响运行
|
- [ ] `users.db` 文件残留但不影响运行
|
||||||
|
|
||||||
#### TC-UPG-12: 再次升级到 auth 分支
|
#### TC-UPG-12: 再次升级到 auth 分支
|
||||||
|
|
||||||
@@ -819,47 +781,51 @@ make dev
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 识别已有 `deerflow.db`,不重新创建 admin
|
- [ ] 识别已有 `users.db`,不重新创建 admin
|
||||||
- [ ] 旧的 admin 账号仍可登录(如果回退期间未删 `deerflow.db`)
|
- [ ] 旧的 admin 账号仍可登录(如果回退期间未删 `users.db`)
|
||||||
|
|
||||||
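### 5.7 Admin initialization and reset_admin
|
### 5.7 Dormant admin (initial password unused/unchanged)
|
||||||
|
|
||||||
> The first boot does not generate a default admin and does not log a password. If the password is forgotten, run `reset_admin`; the new password is written to a 0600 credentials file.
|
> The first boot generates an admin with a random password, but the operator has not logged in or changed the password.
|
||||||
|
> The password is shown on the console only once, at first boot; later boots do not show it again.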
|
||||||
|
|
||||||
#### TC-UPG-13: 未初始化 admin 时重启不创建默认账号
|
#### TC-UPG-13: 重启后自动重置密码并打印
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
rm -f backend/.deer-flow/data/deerflow.db
|
# 首次启动,记录密码
|
||||||
|
rm -f backend/.deer-flow/users.db
|
||||||
make dev
|
make dev
|
||||||
|
# 控制台输出密码 P0,不登录
|
||||||
make stop
|
make stop
|
||||||
|
|
||||||
|
# 隔了几天,再次启动
|
||||||
make dev
|
make dev
|
||||||
curl -s $BASE/api/v1/auth/setup-status | jq .
|
# 控制台输出新密码 P1
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 控制台不输出密码
|
- [ ] 控制台输出 `Admin account setup incomplete — password reset`
|
||||||
- [ ] `setup-status` 仍为 `{"needs_setup": true}`
|
- [ ] 输出新密码 P1(P0 已失效)
|
||||||
- [ ] 访问 `/setup` 仍可创建第一个 admin
|
- [ ] 用 P1 可以登录,P0 不可以
|
||||||
|
- [ ] 登录后 `needs_setup=true`,跳转 `/setup`
|
||||||
|
- [ ] `token_version` 递增(旧 session 如有也失效)
|
||||||
|
|
||||||
#### TC-UPG-14: Lost password: reset_admin writes a credentials file
|
#### TC-UPG-14: Lost password: no CLI needed, a restart is enough
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python -m app.gateway.auth.reset_admin --email admin@example.com
|
# 忘记了控制台密码 → 直接重启服务
|
||||||
ls -la backend/.deer-flow/admin_initial_credentials.txt
|
make stop && make dev
|
||||||
cat backend/.deer-flow/admin_initial_credentials.txt
|
# 控制台自动输出新密码
|
||||||
```
|
```
|
||||||
|
|
||||||
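**Expected:**
|
**Expected:**
|
||||||
- [ ] The command line prints only the credentials file path, not the plaintext password
|
- [ ] No `reset_admin` is needed; restarting the service yields a new password
|
||||||
- [ ] The credentials file has mode `0600`
|
- [ ] The `reset_admin` CLI remains available as a manual fallback
|
||||||
- [ ] The credentials file contains email and password lines
|
|
||||||
- [ ] The user's next login returns `needs_setup=true`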
|
|
||||||
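The `0600` expectation can be illustrated with a short sketch of how a credentials file might be written with owner-only permissions. This is not the `reset_admin` implementation: the function name is hypothetical, and the `email:` / `password:` line format is assumed from the `awk` extraction used elsewhere in this test plan. It relies on POSIX file modes.

```python
import os
import stat
import tempfile


def write_credentials_file(path: str, email: str, password: str) -> None:
    """Write credentials so that only the file owner can read or write them."""
    # Request 0600 at creation time so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        # A file that already existed keeps its old mode, so tighten it explicitly.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(f"email: {email}\npassword: {password}\n")


def file_mode(path: str) -> int:
    """Return only the permission bits, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "admin_initial_credentials.txt")
        write_credentials_file(target, "admin@example.com", "example-password")
        print(oct(file_mode(target)))
```

Logging only `target` (the path) rather than the file contents keeps the password out of console and container logs, which is what the first expectation checks.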
|
|
||||||
#### TC-UPG-15: 未初始化 admin 期间普通用户注册策略边界
|
#### TC-UPG-15: 休眠 admin 期间普通用户注册
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# admin 尚不存在,普通用户尝试注册
|
# admin 存在但从未登录,普通用户先注册
|
||||||
curl -s -X POST $BASE/api/v1/auth/register \
|
curl -s -X POST $BASE/api/v1/auth/register \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"email":"earlybird@example.com","password":"EarlyPass1!"}' \
|
-d '{"email":"earlybird@example.com","password":"EarlyPass1!"}' \
|
||||||
@@ -867,11 +833,11 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] 当前代码允许注册普通用户并自动登录(201,角色为 `user`)
|
- [ ] 注册成功(201),角色为 `user`
|
||||||
- [ ] 但 `setup-status` 仍为 `{"needs_setup": true}`,因为 admin 仍不存在
|
- [ ] 无法提权为 admin
|
||||||
- [ ] 这是一个产品策略边界:若要求“必须先有 admin”,需要在 `/register` 增加 admin-exists gate
|
- [ ] 普通用户的数据与 admin 隔离
|
||||||
|
|
||||||
#### TC-UPG-16: 普通用户数据与后续 admin 隔离
|
#### TC-UPG-16: 休眠 admin 不影响后续操作
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 普通用户正常创建 thread、发消息
|
# 普通用户正常创建 thread、发消息
|
||||||
@@ -883,13 +849,14 @@ curl -s -X POST $BASE/api/threads \
|
|||||||
-d '{"metadata":{}}' | jq .thread_id
|
-d '{"metadata":{}}' | jq .thread_id
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 普通用户正常创建 thread;后续 admin 创建后,搜索不到该普通用户 thread
|
**预期:** 正常创建,不受休眠 admin 影响
|
||||||
|
|
||||||
#### TC-UPG-17: reset_admin 后完成 Setup
|
#### TC-UPG-17: 休眠 admin 最终完成 Setup
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# 运维终于登录
|
||||||
curl -s -X POST $BASE/api/v1/auth/login/local \
|
curl -s -X POST $BASE/api/v1/auth/login/local \
|
||||||
-d "username=admin@example.com&password=<凭据文件密码>" \
|
-d "username=admin@deerflow.dev&password=<P0或P1>" \
|
||||||
-c admin.txt | jq .needs_setup
|
-c admin.txt | jq .needs_setup
|
||||||
# 预期: true
|
# 预期: true
|
||||||
|
|
||||||
@@ -899,7 +866,7 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b admin.txt \
|
-b admin.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"<凭据文件密码>","new_password":"AdminFinal1!","new_email":"admin@real.com"}' \
|
-d '{"current_password":"<密码>","new_password":"AdminFinal1!","new_email":"admin@real.com"}' \
|
||||||
-c admin.txt
|
-c admin.txt
|
||||||
|
|
||||||
# 验证
|
# 验证
|
||||||
@@ -909,7 +876,7 @@ curl -s $BASE/api/v1/auth/me -b admin.txt | jq '{email, needs_setup}'
|
|||||||
**预期:**
|
**预期:**
|
||||||
- [ ] `email` 变为 `admin@real.com`
|
- [ ] `email` 变为 `admin@real.com`
|
||||||
- [ ] `needs_setup` 变为 `false`
|
- [ ] `needs_setup` 变为 `false`
|
||||||
- [ ] 后续登录使用新密码
|
- [ ] 后续重启控制台不再有 warning
|
||||||
|
|
||||||
#### TC-UPG-18: JWT secret rotation after long disuse
|
#### TC-UPG-18: JWT secret rotation after long disuse
|
||||||
|
|
||||||
@@ -923,8 +890,8 @@ make stop && make dev
|
|||||||
|
|
||||||
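**Expected:**
|
**Expected:**
|
||||||
- [ ] The service starts normally
|
- [ ] The service starts normally
|
||||||
- [ ] The account password still works (passwords live in the DB and are independent of the JWT secret)
|
- [ ] The old password still works (passwords live in the DB and are independent of the JWT secret)
|
||||||
- [ ] Old JWT tokens become invalid (the secret changed, so the signature no longer matches)
|
- [ ] Old JWT tokens become invalid (the secret changed, so the signature no longer matches), although no old token exists because nobody ever logged in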
|
||||||
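Why rotating the secret invalidates old tokens while leaving passwords untouched can be shown with a simplified stand-in for JWT signing. The sketch below uses a bare HMAC-SHA256 signature from the standard library rather than a real JWT library; the token layout and function names are assumptions made for illustration only.

```python
import base64
import hashlib
import hmac
from typing import Optional


def sign_token(payload: bytes, secret: bytes) -> str:
    """Return '<base64url payload>.<hex signature>' signed with the given secret."""
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    signature = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(token: str, secret: bytes) -> Optional[bytes]:
    """Return the payload if the signature matches the secret, otherwise None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    return base64.urlsafe_b64decode(body)


if __name__ == "__main__":
    token = sign_token(b'{"sub": "user-1"}', b"old-secret")
    print(verify_token(token, b"old-secret") is not None)  # signed and checked with the same secret
    print(verify_token(token, b"new-secret") is None)      # checked after rotation
```

The password hash never enters this computation, which is why the second expectation (login still works) and the third (old tokens rejected) can both hold after a rotation.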
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -943,7 +910,7 @@ for i in 1 2 3; do
|
|||||||
done
|
done
|
||||||
|
|
||||||
# 检查 admin 数量
|
# 检查 admin 数量
|
||||||
sqlite3 backend/.deer-flow/data/deerflow.db \
|
sqlite3 backend/.deer-flow/users.db \
|
||||||
"SELECT COUNT(*) FROM users WHERE system_role='admin';"
|
"SELECT COUNT(*) FROM users WHERE system_role='admin';"
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -1088,7 +1055,7 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
wait
|
wait
|
||||||
|
|
||||||
# 检查用户数
|
# 检查用户数
|
||||||
sqlite3 backend/.deer-flow/data/deerflow.db \
|
sqlite3 backend/.deer-flow/users.db \
|
||||||
"SELECT COUNT(*) FROM users WHERE email='race@example.com';"
|
"SELECT COUNT(*) FROM users WHERE email='race@example.com';"
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -1198,16 +1165,13 @@ curl -s -w "%{http_code}" -X DELETE "$BASE/api/threads/$TID" \
|
|||||||
```bash
|
```bash
|
||||||
cd backend
|
cd backend
|
||||||
python -m app.gateway.auth.reset_admin
|
python -m app.gateway.auth.reset_admin
|
||||||
cp .deer-flow/admin_initial_credentials.txt /tmp/deerflow-reset-p1.txt
|
# 记录密码 P1
|
||||||
P1=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p1.txt)
|
|
||||||
|
|
||||||
python -m app.gateway.auth.reset_admin
|
python -m app.gateway.auth.reset_admin
|
||||||
cp .deer-flow/admin_initial_credentials.txt /tmp/deerflow-reset-p2.txt
|
# 记录密码 P2
|
||||||
P2=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p2.txt)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:**
|
**预期:**
|
||||||
- [ ] `.deer-flow/admin_initial_credentials.txt` 每次都会被重写,文件权限为 `0600`
|
|
||||||
- [ ] P1 ≠ P2(每次生成新随机密码)
|
- [ ] P1 ≠ P2(每次生成新随机密码)
|
||||||
- [ ] P1 不可用,只有 P2 有效
|
- [ ] P1 不可用,只有 P2 有效
|
||||||
- [ ] `token_version` 递增了 2
|
- [ ] `token_version` 递增了 2
|
||||||
@@ -1232,11 +1196,21 @@ P2=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p2.txt)
|
|||||||
## 7. Mode difference tests
|
## 7. Mode difference tests
|
||||||
|
|
||||||
> Below, `GW=http://localhost:8001` means connecting to the Gateway directly, and `BASE=http://localhost:2026` means going through nginx.
|
> Below, `GW=http://localhost:8001` means connecting to the Gateway directly, and `BASE=http://localhost:2026` means going through nginx.
|
||||||
> Standard start command: `make dev` (or `./scripts/serve.sh --dev`).
|
> Gateway mode start command: `make dev-pro` (or `./scripts/serve.sh --dev --gateway`).
|
||||||
|
|
||||||
### 7.1 标准启动模式
|
### 7.1 标准模式独有
|
||||||
|
|
||||||
#### TC-MODE-01: Gateway AuthMiddleware 的 token_version 检查
|
> 启动命令:`make dev`(或 `./scripts/serve.sh --dev`)
|
||||||
|
|
||||||
|
#### TC-MODE-01: LangGraph Server 独立运行,需 cookie
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 无 cookie 访问 LangGraph
|
||||||
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/langgraph/threads/search
|
||||||
|
# 预期: 403(LangGraph auth handler 拒绝)
|
||||||
|
```
|
||||||
|
|
||||||
|
#### TC-MODE-02: LangGraph auth 的 token_version 检查
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 登录拿 cookie
|
# 登录拿 cookie
|
||||||
@@ -1249,9 +1223,9 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b cookies.txt -H "Content-Type: application/json" -H "X-CSRF-Token: $CSRF" \
|
-b cookies.txt -H "Content-Type: application/json" -H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"正确密码","new_password":"NewPass1!"}' -c new_cookies.txt
|
-d '{"current_password":"正确密码","new_password":"NewPass1!"}' -c new_cookies.txt
|
||||||
|
|
||||||
# Access the LangGraph-compatible route with the old cookie
|
# Access LangGraph with the old cookie
|
||||||
curl -s -w "%{http_code}" $BASE/api/langgraph/threads/search -b cookies.txt
|
curl -s -w "%{http_code}" $BASE/api/langgraph/threads/search -b cookies.txt
|
||||||
# Expected: 401 (token_version mismatch)
|
# Expected: 403 (token_version mismatch)
|
||||||
|
|
||||||
# 用新 cookie 访问
|
# 用新 cookie 访问
|
||||||
CSRF2=$(grep csrf_token new_cookies.txt | awk '{print $NF}')
|
CSRF2=$(grep csrf_token new_cookies.txt | awk '{print $NF}')
|
||||||
@@ -1260,7 +1234,7 @@ curl -s -w "%{http_code}" -X POST $BASE/api/langgraph/threads/search \
|
|||||||
# 预期: 200
|
# 预期: 200
|
||||||
```
|
```
|
||||||
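The `token_version` check exercised above can be sketched as a comparison between a version number embedded in the token and the one stored for the user. The claim names `sub` and `ver`, the `User` dataclass, and the helper names are hypothetical; the real middleware may structure this differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class User:
    id: str
    token_version: int


def token_is_current(claims: dict, user: User) -> bool:
    """Accept a token only if it was issued for this user's current token_version."""
    return claims.get("sub") == user.id and claims.get("ver") == user.token_version


def bump_token_version(user: User) -> User:
    """Model a password change or reset: every previously issued token becomes stale."""
    return replace(user, token_version=user.token_version + 1)


if __name__ == "__main__":
    user = User(id="user-1", token_version=0)
    old_claims = {"sub": user.id, "ver": user.token_version}  # cookie issued at login

    user = bump_token_version(user)                           # change-password
    new_claims = {"sub": user.id, "ver": user.token_version}  # cookie issued afterwards

    print(token_is_current(old_claims, user))
    print(token_is_current(new_claims, user))
```

Because the check depends only on the stored counter, a session can be revoked without rotating the signing secret, which would log out every user at once.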
|
|
||||||
#### TC-MODE-02: Gateway owner filter 隔离
|
#### TC-MODE-03: LangGraph auth 的 owner filter 隔离
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# user1 创建 thread
|
# user1 创建 thread
|
||||||
@@ -1285,9 +1259,18 @@ print('OK: user2 sees', len(threads), 'threads, none belong to user1')
|
|||||||
"
|
"
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-MODE-03: 所有请求经 AuthMiddleware
|
### 7.2 Gateway 模式独有
|
||||||
|
|
||||||
|
> 启动命令:`make dev-pro`(或 `./scripts/serve.sh --dev --gateway`)
|
||||||
|
> 无 LangGraph Server 进程,agent runtime 嵌入 Gateway。
|
||||||
|
|
||||||
|
#### TC-MODE-04: 所有请求经 AuthMiddleware
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# 确认 LangGraph Server 未运行
|
||||||
|
curl -s -w "%{http_code}" -o /dev/null http://localhost:2024/ok
|
||||||
|
# 预期: 000(连接被拒)
|
||||||
|
|
||||||
# Gateway API 受保护
|
# Gateway API 受保护
|
||||||
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
||||||
# 预期: 401
|
# 预期: 401
|
||||||
@@ -1298,7 +1281,7 @@ curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads/searc
|
|||||||
# 预期: 401
|
# 预期: 401
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-MODE-04: 标准模式下完整 auth 流程
|
#### TC-MODE-05: Gateway 模式下完整 auth 流程
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 登录
|
# 登录
|
||||||
@@ -1313,7 +1296,7 @@ curl -s -X POST $BASE/api/langgraph/threads \
|
|||||||
-d '{"metadata":{}}' | python3 -c "import sys,json; print(json.load(sys.stdin)['thread_id'])"
|
-d '{"metadata":{}}' | python3 -c "import sys,json; print(json.load(sys.stdin)['thread_id'])"
|
||||||
# 预期: 返回 thread_id
|
# 预期: 返回 thread_id
|
||||||
|
|
||||||
# CSRF 保护(CSRFMiddleware 覆盖所有 Gateway 路由)
|
# CSRF 保护(Gateway 模式下 CSRFMiddleware 直接覆盖所有路由)
|
||||||
curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads \
|
curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads \
|
||||||
-b cookies.txt -H "Content-Type: application/json" -d '{"metadata":{}}'
|
-b cookies.txt -H "Content-Type: application/json" -d '{"metadata":{}}'
|
||||||
# 预期: 403(CSRF token missing)
|
# 预期: 403(CSRF token missing)
|
||||||
@@ -1341,8 +1324,7 @@ done
|
|||||||
```bash
|
```bash
|
||||||
GW=http://localhost:8001
|
GW=http://localhost:8001
|
||||||
|
|
||||||
for path in /health /api/v1/auth/setup-status /api/v1/auth/login/local \
|
for path in /health /api/v1/auth/setup-status /api/v1/auth/login/local /api/v1/auth/register; do
|
||||||
/api/v1/auth/register /api/v1/auth/initialize /api/v1/auth/logout; do
|
|
||||||
echo "$path: $(curl -s -w '%{http_code}' -o /dev/null $GW$path)"
|
echo "$path: $(curl -s -w '%{http_code}' -o /dev/null $GW$path)"
|
||||||
done
|
done
|
||||||
# 预期: 200 或 405/422(方法不对但不是 401)
|
# 预期: 200 或 405/422(方法不对但不是 401)
|
||||||
@@ -1412,14 +1394,14 @@ done
|
|||||||
|
|
||||||
### 7.4 Docker 部署
|
### 7.4 Docker 部署
|
||||||
|
|
||||||
> 启动命令:`./scripts/deploy.sh`
|
> 启动命令:`./scripts/deploy.sh`(标准)或 `./scripts/deploy.sh --gateway`(Gateway 模式)
|
||||||
> Docker Compose 文件:`docker/docker-compose.yaml`
|
> Docker Compose 文件:`docker/docker-compose.yaml`
|
||||||
>
|
>
|
||||||
> 前置条件:
|
> 前置条件:
|
||||||
> - `.env` 中设置 `AUTH_JWT_SECRET`(否则每次容器重启 session 全部失效)
|
> - `.env` 中设置 `AUTH_JWT_SECRET`(否则每次容器重启 session 全部失效)
|
||||||
> - `DEER_FLOW_HOME` 挂载到宿主机目录(持久化 `deerflow.db`)
|
> - `DEER_FLOW_HOME` 挂载到宿主机目录(持久化 `users.db`)
|
||||||
|
|
||||||
#### TC-DOCKER-01: deerflow.db 通过 volume 持久化
|
#### TC-DOCKER-01: users.db 通过 volume 持久化
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 启动容器
|
# 启动容器
|
||||||
@@ -1434,13 +1416,13 @@ curl -s -X POST $BASE/api/v1/auth/register \
|
|||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"email":"docker-test@example.com","password":"DockerTest1!"}' -w "\nHTTP %{http_code}"
|
-d '{"email":"docker-test@example.com","password":"DockerTest1!"}' -w "\nHTTP %{http_code}"
|
||||||
|
|
||||||
# 检查宿主机上的 deerflow.db
|
# 检查宿主机上的 users.db
|
||||||
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/data/deerflow.db
|
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/users.db
|
||||||
sqlite3 ${DEER_FLOW_HOME:-backend/.deer-flow}/data/deerflow.db \
|
sqlite3 ${DEER_FLOW_HOME:-backend/.deer-flow}/users.db \
|
||||||
"SELECT email FROM users WHERE email='docker-test@example.com';"
|
"SELECT email FROM users WHERE email='docker-test@example.com';"
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** deerflow.db 在宿主机 `DEER_FLOW_HOME` 目录中,查询可见刚注册的用户。
|
**预期:** users.db 在宿主机 `DEER_FLOW_HOME` 目录中,查询可见刚注册的用户。
|
||||||
|
|
||||||
#### TC-DOCKER-02: 重启容器后 session 保持
|
#### TC-DOCKER-02: 重启容器后 session 保持
|
||||||
|
|
||||||
@@ -1484,24 +1466,22 @@ done
|
|||||||
|
|
||||||
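**Known limitation:** The in-process rate limiter is not shared across workers. Production deployments that need precise rate limiting require an external store such as Redis.
|
**Known limitation:** The in-process rate limiter is not shared across workers. Production deployments that need precise rate limiting require an external store such as Redis.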
|
||||||
|
|
||||||
#### TC-DOCKER-04: IM 渠道使用内部认证
|
#### TC-DOCKER-04: IM 渠道不经过 auth
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# IM 渠道(Feishu/Slack/Telegram)在 gateway 容器内部通过 LangGraph SDK 调 Gateway
|
# IM 渠道(Feishu/Slack/Telegram)在 gateway 容器内部通过 LangGraph SDK 通信
|
||||||
# 请求携带 process-local internal auth header,并带匹配的 CSRF cookie/header
|
# 不走 nginx,不经过 AuthMiddleware
|
||||||
|
|
||||||
# 验证方式:检查 gateway 日志中 channel manager 的请求不包含 auth 错误
|
# 验证方式:检查 gateway 日志中 channel manager 的请求不包含 auth 错误
|
||||||
docker logs deer-flow-gateway 2>&1 | grep -E "ChannelManager|channel" | head -10
|
docker logs deer-flow-gateway 2>&1 | grep -E "ChannelManager|channel" | head -10
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 无 auth 相关错误。渠道不依赖浏览器 cookie;服务端通过内部认证头把请求归入 `default` 用户桶。
|
**预期:** 无 auth 相关错误。渠道通过 `langgraph-sdk` 直连 LangGraph Server(`http://langgraph:2024`),不走 auth 层。
|
||||||
|
|
||||||
#### TC-DOCKER-05: reset_admin 密码写入 0600 凭证文件(不再走日志)
|
#### TC-DOCKER-05: admin 密码写入 0600 凭证文件(不再走日志)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 首次启动不会自动生成 admin 密码。先重置已有 admin,凭据文件写在挂载到宿主机的 DEER_FLOW_HOME 下。
|
# 凭证文件写在挂载到宿主机的 DEER_FLOW_HOME 下
|
||||||
docker exec deer-flow-gateway python -m app.gateway.auth.reset_admin --email docker-test@example.com
|
|
||||||
|
|
||||||
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/admin_initial_credentials.txt
|
ls -la ${DEER_FLOW_HOME:-backend/.deer-flow}/admin_initial_credentials.txt
|
||||||
# 预期文件权限: -rw------- (0600)
|
# 预期文件权限: -rw------- (0600)
|
||||||
|
|
||||||
@@ -1521,26 +1501,25 @@ docker logs deer-flow-gateway 2>&1 | grep -iE "Password: .{15,}" && echo "FAIL:
|
|||||||
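- The container log prints the **path** (not the password itself), which satisfies the CodeQL `py/clear-text-logging-sensitive-data` rule
|
- The container log prints the **path** (not the password itself), which satisfies the CodeQL `py/clear-text-logging-sensitive-data` rule
|
||||||
- `grep "Password:"` **should find no match** in the log (the old behavior is deprecated; the simplify pass removed the log-leak path)
|
- `grep "Password:"` **should find no match** in the log (the old behavior is deprecated; the simplify pass removed the log-leak path)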
|
||||||
|
|
||||||
#### TC-DOCKER-06: Docker 部署
|
#### TC-DOCKER-06: Gateway 模式 Docker 部署
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 标准 Docker 模式:runtime 嵌入 gateway 容器
|
# Gateway 模式:无 langgraph 容器
|
||||||
./scripts/deploy.sh
|
./scripts/deploy.sh --gateway
|
||||||
sleep 15
|
sleep 15
|
||||||
|
|
||||||
# 确认 gateway 容器存在
|
# 确认 langgraph 容器不存在
|
||||||
docker ps --filter name=deer-flow-gateway --format '{{.Names}}'
|
docker ps --filter name=deer-flow-langgraph --format '{{.Names}}' | wc -l
|
||||||
# 预期: deer-flow-gateway
|
# 预期: 0
|
||||||
|
|
||||||
# auth 流程正常:未登录受保护接口返回 401
|
# auth 流程正常
|
||||||
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
||||||
# 预期: 401
|
# 预期: 401
|
||||||
|
|
||||||
curl -s -X POST $BASE/api/v1/auth/initialize \
|
curl -s -X POST $BASE/api/v1/auth/login/local \
|
||||||
-H "Content-Type: application/json" \
|
-d "username=admin@deerflow.dev&password=<日志密码>" \
|
||||||
-d '{"email":"admin@example.com","password":"AdminPass1!"}' \
|
|
||||||
-c cookies.txt -w "\nHTTP %{http_code}"
|
-c cookies.txt -w "\nHTTP %{http_code}"
|
||||||
# 预期: 201
|
# 预期: 200
|
||||||
```
|
```
|
||||||
|
|
||||||
### 7.4 补充边界用例
|
### 7.4 补充边界用例
|
||||||
@@ -1608,15 +1587,13 @@ curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
|||||||
#### TC-EDGE-05: HTTP 无 max_age / HTTPS 有 max_age
|
#### TC-EDGE-05: HTTP 无 max_age / HTTPS 有 max_age
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
GW=http://localhost:8001
|
|
||||||
|
|
||||||
# HTTP
|
# HTTP
|
||||||
curl -s -D - -X POST $GW/api/v1/auth/login/local \
|
curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
||||||
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
||||||
| grep "access_token=" | grep -oi "max-age=[0-9]*" || echo "NO max-age (HTTP session cookie)"
|
| grep "access_token=" | grep -oi "max-age=[0-9]*" || echo "NO max-age (HTTP session cookie)"
|
||||||
|
|
||||||
# HTTPS:直连 Gateway 才能用 X-Forwarded-Proto 模拟 HTTPS;nginx 会覆盖该 header
|
# HTTPS
|
||||||
curl -s -D - -X POST $GW/api/v1/auth/login/local \
|
curl -s -D - -X POST $BASE/api/v1/auth/login/local \
|
||||||
-H "X-Forwarded-Proto: https" \
|
-H "X-Forwarded-Proto: https" \
|
||||||
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
-d "username=admin@example.com&password=正确密码" 2>/dev/null \
|
||||||
| grep "access_token=" | grep -oi "max-age=[0-9]*"
|
| grep "access_token=" | grep -oi "max-age=[0-9]*"
|
||||||
@@ -1735,10 +1712,10 @@ curl -s -X POST $BASE/api/threads \
|
|||||||
-b cookies.txt \
|
-b cookies.txt \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-H "X-CSRF-Token: $CSRF" \
|
-H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"metadata":{"owner_id":"victim-user-id","user_id":"victim-user-id"}}' | jq .metadata
|
-d '{"metadata":{"owner_id":"victim-user-id"}}' | jq .metadata.owner_id
|
||||||
```
|
```
|
||||||
|
|
||||||
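**Expected:** The returned `metadata` contains neither `owner_id` nor `user_id`. Real ownership is written to `threads_meta.user_id`; it is not accepted from client metadata and is not echoed back through metadata.
|
**Expected:** The returned `metadata.owner_id` should be the ID of the currently logged-in user, not the `victim-user-id` injected in the request. The server should override the client-supplied `user_id`.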
|
||||||
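The first variant of this expectation (reserved keys are dropped and ownership comes from the auth context) can be sketched as follows. The reserved-key set, the record layout, and both function names are assumptions for illustration; they are not taken from the Gateway code.

```python
# Keys the server owns; values sent by a client under these names are ignored.
RESERVED_METADATA_KEYS = frozenset({"owner_id", "user_id"})


def sanitize_metadata(client_metadata: dict) -> dict:
    """Drop server-reserved keys so a client cannot claim another user's identity."""
    return {k: v for k, v in client_metadata.items() if k not in RESERVED_METADATA_KEYS}


def build_thread_record(client_metadata: dict, authenticated_user_id: str) -> dict:
    """Ownership comes from the authenticated session, never from the request body."""
    return {
        "user_id": authenticated_user_id,                 # stored server-side only
        "metadata": sanitize_metadata(client_metadata),   # what gets echoed back
    }


if __name__ == "__main__":
    record = build_thread_record(
        {"title": "demo", "owner_id": "victim-user-id", "user_id": "victim-user-id"},
        authenticated_user_id="real-user-id",
    )
    print(record)
```

Stripping the keys, rather than overwriting them inside `metadata`, keeps the ownership column as the single source of truth and avoids leaking user IDs in responses.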
|
|
||||||
#### 7.5.6 HTTP Method 探测
|
#### 7.5.6 HTTP Method 探测
|
||||||
|
|
||||||
@@ -1819,6 +1796,6 @@ cd backend && PYTHONPATH=. uv run pytest \
|
|||||||
# 核心接口冒烟
|
# 核心接口冒烟
|
||||||
curl -s $BASE/health # 200
|
curl -s $BASE/health # 200
|
||||||
curl -s $BASE/api/models # 401 (无 cookie)
|
curl -s $BASE/api/models # 401 (无 cookie)
|
||||||
curl -s $BASE/api/v1/auth/setup-status # 200
|
curl -s -X POST $BASE/api/v1/auth/setup-status # 200
|
||||||
curl -s $BASE/api/v1/auth/me -b cookies.txt # 200 (有 cookie)
|
curl -s $BASE/api/v1/auth/me -b cookies.txt # 200 (有 cookie)
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -2,16 +2,13 @@
|
|||||||
|
|
||||||
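DeerFlow ships with a built-in authentication module. This document is for users upgrading from a version without authentication.
|
DeerFlow ships with a built-in authentication module. This document is for users upgrading from a version without authentication.
|
||||||
|
|
||||||
For the full design, see [AUTH_DESIGN.md](AUTH_DESIGN.md).
|
|
||||||
|
|
||||||
## Core concepts
|
## Core concepts
|
||||||
|
|
||||||
The authentication module is **always enforced**:
|
The authentication module is **always enforced**:
|
||||||
|
|
||||||
- No account is created automatically on first boot; the operator creates the first admin account on the first visit to `/setup`
|
- An admin account is created automatically on first boot, and its random password is printed to the console log
|
||||||
- Authentication is enforced from the start, so there is no race window
|
- Authentication is enforced from the start, so there is no race window
|
||||||
- Once an admin exists, service startup migrates historical conversations (threads created before the upgrade that lack a `user_id`) to the admin
|
- Historical conversations (threads created before the upgrade) are migrated to the admin automatically
|
||||||
- New data is isolated per user: threads, workspace/uploads/outputs, memory, and custom agents all belong to the current user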
|
|
||||||
|
|
||||||
## Upgrade steps
|
## Upgrade steps
|
||||||
|
|
||||||
@@ -28,41 +25,39 @@ cd backend && make install
|
|||||||
make dev
|
make dev
|
||||||
```
|
```
|
||||||
|
|
||||||
If no admin account exists, the console only prints:
|
The console prints:
|
||||||
|
|
||||||
```
|
```
|
||||||
============================================================
|
============================================================
|
||||||
First boot detected — no admin account exists.
|
Admin account created on first boot
|
||||||
Visit /setup to complete admin account creation.
|
Email: admin@deerflow.dev
|
||||||
|
Password: aB3xK9mN_pQ7rT2w
|
||||||
|
Change it after login: Settings → Account
|
||||||
============================================================
|
============================================================
|
||||||
```
|
```
|
||||||
|
|
||||||
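The first boot does not print a random password to the log and does not create a default admin. This keeps credentials out of startup logs and avoids a guessable default identity existing before the operator creates an account.
|
If you restart the service before logging in, nothing is lost: as long as setup is incomplete, every startup resets the password and prints it to the console again.
|
||||||
|
|
||||||
### 3. Create the admin
|
### 3. Log in
|
||||||
|
|
||||||
Visit `http://localhost:2026/setup` and enter an email and password to create the first admin account. On success you are logged in automatically and taken to the workspace.
|
Visit `http://localhost:2026/login` and log in with the email and password printed to the console.
|
||||||
|
|
||||||
If you are upgrading from a version without authentication, restart the service once after creating the admin so that the startup migration assigns historical threads lacking a `user_id` to the admin.
|
### 4. Change the password
|
||||||
|
|
||||||
### 4. Log in
|
After logging in, go to Settings → Account → Change Password.
|
||||||
|
|
||||||
From then on, visit `http://localhost:2026/login` and log in with the email and password you created.
|
|
||||||
|
|
||||||
### 5. Add users (optional)
|
### 5. Add users (optional)
|
||||||
|
|
||||||
Other users register through the `/login` page and automatically get the **user** role. Each user can see only their own conversations, uploaded files, output files, memory, and custom agents.
|
Other users register through the `/login` page and automatically get the **user** role. Each user can see only their own conversations.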
|
||||||
|
|
||||||
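## Security mechanisms
|
## Security mechanisms
|
||||||
|
|
||||||
| Mechanism | Description |
|
| Mechanism | Description |
|
||||||
|------|------|
|
|------|------|
|
||||||
| JWT HttpOnly Cookie | The token is not exposed to JavaScript, which prevents theft via XSS |
|
| JWT HttpOnly Cookie | The token is not exposed to JavaScript, which prevents theft via XSS |
|
||||||
| CSRF Double Submit Cookie | Protected POST/PUT/PATCH/DELETE requests must carry `X-CSRF-Token`; login, register, initialize, and logout go through Origin validation on the auth endpoints |
|
| CSRF Double Submit Cookie | All POST/PUT/DELETE requests must carry `X-CSRF-Token` |
|
||||||
| bcrypt password hashing | Passwords are never stored in plaintext |
|
| bcrypt password hashing | Passwords are never stored in plaintext |
|
||||||
| Thread owner filter | `threads_meta.user_id` is written from the server-side auth context; search, read, update, and delete filter by the current user by default |
|
| Multi-tenant isolation | Users can access only their own threads |
|
||||||
| Filesystem isolation | Thread data is written to `{base_dir}/users/{user_id}/threads/{thread_id}/user-data/` and is always mapped to `/mnt/user-data/` inside the sandbox |
|
|
||||||
| Memory / agent isolation | User memory and custom agents are written to `{base_dir}/users/{user_id}/...`; legacy shared agents serve only as a read-only compatibility fallback |
|
|
||||||
| Adaptive HTTPS | Detects `x-forwarded-proto` and sets the `Secure` cookie flag automatically |
|
| Adaptive HTTPS | Detects `x-forwarded-proto` and sets the `Secure` cookie flag automatically |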
|
||||||
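The double-submit row in the table above amounts to comparing a token the browser stores in a cookie with the same token echoed in a request header. The following is a minimal sketch of that comparison, assuming the cookie is named `csrf_token` and the header `X-CSRF-Token` as used in the test commands; the function name and the set of protected methods (taken from the first table variant) are illustrative, not the middleware's actual code.

```python
import hmac
from typing import Optional

# Methods that change state and therefore need the CSRF check.
PROTECTED_METHODS = frozenset({"POST", "PUT", "PATCH", "DELETE"})


def csrf_check(method: str, cookie_token: Optional[str], header_token: Optional[str]) -> bool:
    """Return True when the request may proceed under the double-submit rule."""
    if method.upper() not in PROTECTED_METHODS:
        return True  # safe methods such as GET are not checked
    if not cookie_token or not header_token:
        return False  # a missing token on either side is a rejection (HTTP 403)
    # Compare in constant time; encode first so non-ASCII input cannot raise.
    return hmac.compare_digest(cookie_token.encode("utf-8"), header_token.encode("utf-8"))


if __name__ == "__main__":
    print(csrf_check("GET", None, None))
    print(csrf_check("POST", "token-a", None))
    print(csrf_check("POST", "token-a", "token-a"))
```

The scheme works because a page on another origin can make the browser send the cookie but cannot read it, so it cannot fill in the matching header.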
|
|
||||||
## Common operations
|
## Common operations
|
||||||
@@ -79,27 +74,23 @@ python -m app.gateway.auth.reset_admin
|
|||||||
python -m app.gateway.auth.reset_admin --email user@example.com
|
python -m app.gateway.auth.reset_admin --email user@example.com
|
||||||
```
|
```
|
||||||
|
|
||||||
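The new random password is written to `.deer-flow/admin_initial_credentials.txt` with file mode `0600`. The command line prints only the file path, never the plaintext password.
|
A new random password is printed.
|
||||||
|
|
||||||
### Full reset
|
### Full reset
|
||||||
|
|
||||||
Delete the unified SQLite database, then restart and visit `/setup` again to create a new admin:
|
Delete the user database; a new admin is created automatically after restart: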
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
rm -f backend/.deer-flow/data/deerflow.db
|
rm -f backend/.deer-flow/users.db
|
||||||
# After restarting the service, visit http://localhost:2026/setup
|
# Restart the service; the console prints the new password
|
||||||
```
|
```
|
||||||
|
|
||||||
## Data storage
|
## Data storage
|
||||||
|
|
||||||
| File | Contents |
|
| File | Contents |
|
||||||
|------|------|
|
|------|------|
|
||||||
| `.deer-flow/data/deerflow.db` | Unified SQLite database (application data such as users, threads_meta, runs, and feedback) |
|
| `.deer-flow/users.db` | SQLite user database (password hashes, roles) |
|
||||||
| `.deer-flow/users/{user_id}/threads/{thread_id}/user-data/` | Workspace, uploads, and outputs for the user's threads |
|
| `AUTH_JWT_SECRET` in `.env` | JWT signing key (if unset, a temporary key is generated automatically and sessions are lost on restart) |
|
||||||
| `.deer-flow/users/{user_id}/memory.json` | User-level memory |
|
|
||||||
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | The user's custom agent configuration, SOUL, and agent memory |
|
|
||||||
| `.deer-flow/admin_initial_credentials.txt` | New credentials file generated by `reset_admin` (0600; delete it after reading) |
|
|
||||||
| `AUTH_JWT_SECRET` in `.env` | JWT signing key (if unset, one is generated automatically and persisted to `.deer-flow/.jwt_secret`, so sessions survive restarts) |
|
|
||||||
|
|
||||||
### Production recommendations
|
### Production recommendations
|
||||||
|
|
||||||
@@ -120,21 +111,19 @@ python -c "import secrets; print(secrets.token_urlsafe(32))"
|
|||||||
| `/api/v1/auth/me` | GET | Get the current user's information |
|
| `/api/v1/auth/me` | GET | Get the current user's information |
|
||||||
| `/api/v1/auth/change-password` | POST | Change the password |
|
| `/api/v1/auth/change-password` | POST | Change the password |
|
||||||
| `/api/v1/auth/setup-status` | GET | Check whether an admin exists |
|
| `/api/v1/auth/setup-status` | GET | Check whether an admin exists |
|
||||||
| `/api/v1/auth/initialize` | POST | Initialize the first admin (callable only while no admin exists) |
|
|
||||||
|
|
||||||
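## Compatibility
|
## Compatibility
|
||||||
|
|
||||||
- **Local development** (`make dev`): the Gateway embedded runtime is fully compatible; when no admin exists, visit `/setup` to initialize
|
- **Standard mode** (`make dev`): fully compatible; the admin is created automatically
|
||||||
- **Gateway embedded runtime**: the standard scripts, Docker dev, and production deployments all provide authentication and the LangGraph-compatible API through the Gateway
|
- **Gateway mode** (`make dev-pro`): fully compatible
|
||||||
- **Docker deployment**: fully compatible; `.deer-flow/data/deerflow.db` needs a persistent volume mount
|
- **Docker deployment**: fully compatible; `.deer-flow/users.db` needs a persistent volume mount
|
||||||
- **IM channels** (Feishu/Slack/Telegram): communicate through Gateway internal authentication and use the `default` user bucket
|
- **IM channels** (Feishu/Slack/Telegram): communicate through the LangGraph SDK and bypass the authentication layer
|
||||||
- **DeerFlowClient** (embedded): does not go through HTTP and is unaffected by authentication
|
- **DeerFlowClient** (embedded): does not go through HTTP and is unaffected by authentication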
|
||||||
|
|
||||||
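## Troubleshooting
|
## Troubleshooting
|
||||||
|
|
||||||
| Symptom | Cause | Fix |
|
| Symptom | Cause | Fix |
|
||||||
|------|------|------|
|
|------|------|------|
|
||||||
| No password shown after startup | The current implementation does not print passwords to the startup log | On a fresh install, visit `/setup`; if the password is forgotten, use `reset_admin` |
|
| No password shown after startup | The admin already exists (not the first boot) | Reset with `reset_admin`, or delete `users.db` |
|
||||||
| `/login` redirects to `/setup` | The system has no admin yet | Create the first admin at `/setup` |
|
|
||||||
| POST returns 403 after login | The CSRF token is missing | Make sure the frontend has been updated |
|
| POST returns 403 after login | The CSRF token is missing | Make sure the frontend has been updated |
|
||||||
| Login required again after restart | The `.jwt_secret` file was deleted and `AUTH_JWT_SECRET` is not set in `.env` | Set a fixed secret in `.env` |
|
| Login required again after restart | `AUTH_JWT_SECRET` is not persisted | Set a fixed secret in `.env` |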
|
||||||
|
|||||||
@@ -1,154 +0,0 @@
|
|||||||
# Blocking IO detection usage and maintenance
|
|
||||||
|
|
||||||
This document describes how to use and maintain DeerFlow backend blocking-IO
|
|
||||||
detection for async event-loop safety.
|
|
||||||
|
|
||||||
The goal is narrow: find and prevent synchronous IO from blocking backend
|
|
||||||
async event-loop paths. Static and runtime detection are complementary, but
|
|
||||||
they have different jobs.
|
|
||||||
|
|
||||||
## Static detector
|
|
||||||
|
|
||||||
The static detector is the discovery tool. It scans backend source code and
|
|
||||||
reports candidate blocking-IO call sites that may need human review.
|
|
||||||
|
|
||||||
Run it from the repository root:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make detect-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
Or from `backend/`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make detect-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
The report is written to:
|
|
||||||
|
|
||||||
```text
|
|
||||||
.deer-flow/blocking-io-findings.json
|
|
||||||
```
|
|
||||||
|
|
||||||
Use this output for review and triage. A static finding is a candidate, not
|
|
||||||
proof that production blocks the event loop at runtime. The current static
|
|
||||||
rules are intentionally broad; prefer triaging existing output before adding
|
|
||||||
new static rules.
|
|
||||||
|
|
||||||
Add a static rule only when review finds a recurring high-risk blocking
|
|
||||||
pattern that is invisible to the current detector.
|
|
||||||
|
|
||||||
## Runtime detector
|
|
||||||
|
|
||||||
The runtime detector is the CI regression guard. It uses Blockbuster to fail a
|
|
||||||
focused test when code under `app.*` or `deerflow.*` performs blocking IO on
|
|
||||||
the asyncio event-loop thread.
|
|
||||||
|
|
||||||
Run it from `backend/`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make test-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
The runtime gate starts from confirmed production bugs and protects those
|
|
||||||
paths from regressing. It does not prove that the entire backend is free of
|
|
||||||
blocking IO; it only covers the production paths exercised by
|
|
||||||
`backend/tests/blocking_io/`.
|
|
||||||
|
|
||||||
## Maintenance workflow
|
|
||||||
|
|
||||||
Use the static detector to find candidates, then use review to decide which
|
|
||||||
async production paths are worth protecting in CI.
|
|
||||||
|
|
||||||
The normal workflow is:
|
|
||||||
|
|
||||||
1. Run the static detector to find backend blocking-IO candidates.
|
|
||||||
2. Use human review to pick high-risk production async paths.
|
|
||||||
3. Add or update a focused runtime anchor in `backend/tests/blocking_io/`.
|
|
||||||
4. Let CI prevent that path from regressing.
|
|
||||||
|
|
||||||
Runtime detection has two maintenance paths.
|
|
||||||
|
|
||||||
### Add a runtime rule
|
|
||||||
|
|
||||||
Add a runtime rule when Blockbuster's default rules do not cover a generic
|
|
||||||
blocking primitive used by production code.
|
|
||||||
|
|
||||||
Rules belong in:
|
|
||||||
|
|
||||||
```text
|
|
||||||
backend/tests/support/detectors/blocking_io_runtime.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Add them to `_PROJECT_BLOCKING_RULES`, not directly inside individual tests.
|
|
||||||
Keeping rules centralized makes it clear which extra primitives DeerFlow
|
|
||||||
expects Blockbuster to catch.
|
|
||||||
|
|
||||||
Example shape:
|
|
||||||
|
|
||||||
```python
|
|
||||||
import subprocess
|
|
||||||
|
|
||||||
from blockbuster import BlockBusterFunction
|
|
||||||
|
|
||||||
_PROJECT_BLOCKING_RULES = (
|
|
||||||
(
|
|
||||||
"subprocess.Popen.__init__",
|
|
||||||
BlockBusterFunction(
|
|
||||||
subprocess.Popen,
|
|
||||||
"__init__",
|
|
||||||
scanned_modules=["app", "deerflow"],
|
|
||||||
),
|
|
||||||
),
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Do not add a runtime rule just because a business path is not tested. A rule
|
|
||||||
only expands what Blockbuster can intercept after code runs.
|
|
||||||
|
|
||||||
### Add a runtime anchor
|
|
||||||
|
|
||||||
Add a runtime anchor when a high-risk async production path should be protected
|
|
||||||
by CI but no existing `backend/tests/blocking_io/` test executes it.
|
|
||||||
|
|
||||||
Anchors belong in:
|
|
||||||
|
|
||||||
```text
|
|
||||||
backend/tests/blocking_io/
|
|
||||||
```
|
|
||||||
|
|
||||||
A good anchor should:
|
|
||||||
|
|
||||||
- Call the real production async entry point.
|
|
||||||
- Avoid bypassing the blocking surface with test-only `asyncio.to_thread`
|
|
||||||
wrappers.
|
|
||||||
- Use real local filesystem inputs when the bug shape is filesystem IO.
|
|
||||||
- Mock only the external dependency boundary, such as a network service or
|
|
||||||
third-party saver class.
|
|
||||||
- Fail if a future change moves the blocking operation back onto the event
|
|
||||||
loop.
|
|
||||||
|
|
||||||
Avoid testing only the low-level helper unless that helper is the production
|
|
||||||
async entry point. The runtime gate is most useful when it protects the caller
|
|
||||||
that production actually executes.
|
|
||||||
|
|
||||||
## Current runtime coverage
|
|
||||||
|
|
||||||
The runtime anchors protect confirmed blocking-IO bug shapes:
|
|
||||||
|
|
||||||
- SQLite checkpointer setup, including path resolution and parent-directory
|
|
||||||
creation.
|
|
||||||
- Subagent skill metadata loading through `SubagentExecutor._load_skills()`.
|
|
||||||
- `JsonlRunEventStore` async API (`put` / `list_*` / `delete_*`): the JSONL
|
|
||||||
run-event backend offloads its synchronous file IO via `asyncio.to_thread`
|
|
||||||
(fix #3084); this anchor drives the real async API under the gate so any
|
|
||||||
blocking IO reintroduced on the loop fails, not only removal of one
|
|
||||||
`to_thread` call.
|
|
||||||
- `UploadsMiddleware.before_agent` uploads-directory scan: a sync-only middleware
|
|
||||||
hook runs on the event loop under async graph execution, so the scan is
|
|
||||||
offloaded via `abefore_agent` + `run_in_executor`.
|
|
||||||
- Gate health checks: Blockbuster catches unoffloaded calls, opt-out works, and
|
|
||||||
patches are restored after exceptions.
|
|
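The idea behind the runtime gate can be sketched with the standard library alone. This is a minimal illustration, not Blockbuster's implementation: `forbid_on_event_loop`, `BlockingCallError`, `read_text`, `load_blocking`, and `load_offloaded` are hypothetical names introduced here. The sketch assumes that "blocking on the loop" can be approximated by "a guarded sync function was called from a thread that has a running event loop".

```python
import asyncio
import functools
import tempfile
from pathlib import Path


class BlockingCallError(RuntimeError):
    """Raised when a guarded sync function runs on an event-loop thread."""


def forbid_on_event_loop(func):
    """Wrap a sync function so it fails when called from a running event loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No loop is running in this thread (e.g. a worker thread): allowed.
            return func(*args, **kwargs)
        raise BlockingCallError(f"{func.__name__} called on the event-loop thread")

    return wrapper


@forbid_on_event_loop
def read_text(path: Path) -> str:
    """Hypothetical blocking primitive standing in for real file IO."""
    return path.read_text(encoding="utf-8")


async def load_blocking(path: Path) -> str:
    # Bug shape: the sync call executes directly on the event loop.
    return read_text(path)


async def load_offloaded(path: Path) -> str:
    # Fixed shape: the sync call runs in a worker thread.
    return await asyncio.to_thread(read_text, path)


def demo() -> tuple[str, bool]:
    """Return the offloaded result and whether the blocking path was caught."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "skill.md"
        path.write_text("hello", encoding="utf-8")
        content = asyncio.run(load_offloaded(path))
        try:
            asyncio.run(load_blocking(path))
            caught = False
        except BlockingCallError:
            caught = True
    return content, caught
```

An anchor in the real suite plays the role of `demo()`: it drives the production async entry point and relies on the gate to fail if the blocking call moves back onto the loop.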
||||||
|
|
||||||
As static detection and review identify more high-risk async paths, add new
|
|
||||||
runtime anchors incrementally.
|
|
||||||
@@ -36,7 +36,6 @@ models:
|
|||||||
- OpenAI (`langchain_openai:ChatOpenAI`)
|
- OpenAI (`langchain_openai:ChatOpenAI`)
|
||||||
- Anthropic (`langchain_anthropic:ChatAnthropic`)
|
- Anthropic (`langchain_anthropic:ChatAnthropic`)
|
||||||
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
|
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
|
||||||
- Xiaomi MiMo (`deerflow.models.patched_mimo:PatchedChatMiMo`)
|
|
||||||
- Claude Code OAuth (`deerflow.models.claude_provider:ClaudeChatModel`)
|
- Claude Code OAuth (`deerflow.models.claude_provider:ClaudeChatModel`)
|
||||||
- Codex CLI (`deerflow.models.openai_codex_provider:CodexChatModel`)
|
- Codex CLI (`deerflow.models.openai_codex_provider:CodexChatModel`)
|
||||||
- Any LangChain-compatible provider
|
- Any LangChain-compatible provider
|
||||||
@@ -95,35 +94,25 @@ models:
|
|||||||
thinking:
|
thinking:
|
||||||
type: enabled
|
type: enabled
|
||||||
|
|
||||||
- name: minimax-m3
|
- name: minimax-m2.5
|
||||||
display_name: MiniMax M3
|
display_name: MiniMax M2.5
|
||||||
use: langchain_openai:ChatOpenAI
|
use: langchain_openai:ChatOpenAI
|
||||||
model: MiniMax-M3
|
model: MiniMax-M2.5
|
||||||
api_key: $MINIMAX_API_KEY
|
api_key: $MINIMAX_API_KEY
|
||||||
base_url: https://api.minimax.io/v1
|
base_url: https://api.minimax.io/v1
|
||||||
max_tokens: 4096
|
max_tokens: 4096
|
||||||
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
|
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
|
||||||
supports_vision: true
|
supports_vision: true
|
||||||
|
|
||||||
- name: minimax-m2.7
|
- name: minimax-m2.5-highspeed
|
||||||
display_name: MiniMax M2.7
|
display_name: MiniMax M2.5 Highspeed
|
||||||
use: langchain_openai:ChatOpenAI
|
use: langchain_openai:ChatOpenAI
|
||||||
model: MiniMax-M2.7
|
model: MiniMax-M2.5-highspeed
|
||||||
api_key: $MINIMAX_API_KEY
|
api_key: $MINIMAX_API_KEY
|
||||||
base_url: https://api.minimax.io/v1
|
base_url: https://api.minimax.io/v1
|
||||||
max_tokens: 4096
|
max_tokens: 4096
|
||||||
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
|
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
|
||||||
supports_vision: false # M2.7 is text-only; M3 supports vision
|
supports_vision: true
|
||||||
|
|
||||||
- name: minimax-m2.7-highspeed
|
|
||||||
display_name: MiniMax M2.7 Highspeed
|
|
||||||
use: langchain_openai:ChatOpenAI
|
|
||||||
model: MiniMax-M2.7-highspeed
|
|
||||||
api_key: $MINIMAX_API_KEY
|
|
||||||
base_url: https://api.minimax.io/v1
|
|
||||||
max_tokens: 4096
|
|
||||||
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
|
|
||||||
supports_vision: false # M2.7 is text-only; M3 supports vision
|
|
||||||
- name: openrouter-gemini-2.5-flash
|
- name: openrouter-gemini-2.5-flash
|
||||||
display_name: Gemini 2.5 Flash (OpenRouter)
|
display_name: Gemini 2.5 Flash (OpenRouter)
|
||||||
use: langchain_openai:ChatOpenAI
|
use: langchain_openai:ChatOpenAI
|
||||||
@@ -177,37 +166,6 @@ models:
|
|||||||
|
|
||||||
For Gemini accessed **without** thinking (e.g. via OpenRouter where thinking is not activated), the plain `langchain_openai:ChatOpenAI` with `supports_thinking: false` is sufficient and no patch is needed.
|
For Gemini accessed **without** thinking (e.g. via OpenRouter where thinking is not activated), the plain `langchain_openai:ChatOpenAI` with `supports_thinking: false` is sufficient and no patch is needed.
|
||||||
|
|
||||||
**MiMo with thinking via OpenAI-compatible API**:
|
|
||||||
|
|
||||||
MiMo returns `reasoning_content` on assistant messages in thinking mode. In multi-turn agent conversations with tool calls, subsequent requests must preserve that historical `reasoning_content` on assistant messages or the MiMo API can return HTTP 400. Standard `langchain_openai:ChatOpenAI` drops this provider-specific field, so use `deerflow.models.patched_mimo:PatchedChatMiMo`:
|
|
||||||
|
|
||||||
For pay-as-you-go API keys (`sk-...`), use `https://api.xiaomimimo.com/v1`. For Token Plan keys (`tp-...`), use the regional Token Plan Base URL shown in the MiMo console, such as `https://token-plan-cn.xiaomimimo.com/v1`. MiMo documents these key types as separate and non-interchangeable.
|
|
||||||
|
|
||||||
`PatchedChatMiMo` is model-id agnostic. Use it for every MiMo thinking model entry you configure, including model entries referenced by `subagents.*.model` overrides (for example `mimo-v2.5-pro`, `mimo-v2.5`, `mimo-v2-pro`, `mimo-v2-omni`, or `mimo-v2-flash`).
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
models:
|
|
||||||
- name: mimo-v2.5-pro
|
|
||||||
display_name: MiMo V2.5 Pro
|
|
||||||
use: deerflow.models.patched_mimo:PatchedChatMiMo
|
|
||||||
model: mimo-v2.5-pro
|
|
||||||
api_key: $MIMO_API_KEY
|
|
||||||
base_url: https://api.xiaomimimo.com/v1
|
|
||||||
max_tokens: 8192
|
|
||||||
supports_thinking: true
|
|
||||||
supports_vision: false
|
|
||||||
when_thinking_enabled:
|
|
||||||
extra_body:
|
|
||||||
thinking:
|
|
||||||
type: enabled
|
|
||||||
when_thinking_disabled:
|
|
||||||
extra_body:
|
|
||||||
thinking:
|
|
||||||
type: disabled
|
|
||||||
```
|
|
||||||
|
|
||||||
`PatchedChatMiMo` preserves MiMo's `choices[].message.reasoning_content`, streaming `delta.reasoning_content`, and request-history assistant `reasoning_content` fields. It does not reuse the DeepSeek provider.
|
|
||||||
|
|
||||||
### Tool Groups
|
### Tool Groups
|
||||||
|
|
||||||
Organize tools into logical groups:
|
Organize tools into logical groups:
|
||||||
@@ -361,7 +319,6 @@ models:
|
|||||||
- `OPENAI_API_KEY` - OpenAI API key
|
- `OPENAI_API_KEY` - OpenAI API key
|
||||||
- `ANTHROPIC_API_KEY` - Anthropic API key
|
- `ANTHROPIC_API_KEY` - Anthropic API key
|
||||||
- `DEEPSEEK_API_KEY` - DeepSeek API key
|
- `DEEPSEEK_API_KEY` - DeepSeek API key
|
||||||
- `MIMO_API_KEY` - Xiaomi MiMo API key
|
|
||||||
- `NOVITA_API_KEY` - Novita API key (OpenAI-compatible endpoint)
|
- `NOVITA_API_KEY` - Novita API key (OpenAI-compatible endpoint)
|
||||||
- `TAVILY_API_KEY` - Tavily search API key
|
- `TAVILY_API_KEY` - Tavily search API key
|
||||||
- `DEER_FLOW_PROJECT_ROOT` - Project root for relative runtime paths
|
- `DEER_FLOW_PROJECT_ROOT` - Project root for relative runtime paths
|
||||||
|
|||||||
@@ -14,19 +14,6 @@ DeerFlow supports configurable MCP servers and skills to extend its capabilities
|
|||||||
3. Configure each server’s command, arguments, and environment variables as needed.
|
3. Configure each server’s command, arguments, and environment variables as needed.
|
||||||
4. Restart the application to load and register MCP tools.
|
4. Restart the application to load and register MCP tools.
|
||||||
|
|
||||||
## Filesystem MCP Servers
|
|
||||||
|
|
||||||
DeerFlow already provides built-in file tools for thread-scoped workspace access.
|
|
||||||
Do not add an MCP filesystem server for the same DeerFlow workspace. The
|
|
||||||
overlapping file tools use different path semantics, which can make LLM tool
|
|
||||||
selection and file access behavior unstable.
|
|
||||||
|
|
||||||
DeerFlow does not currently adapt the MCP Roots mode for filesystem servers. In
|
|
||||||
particular, it does not publish per-thread MCP roots or map DeerFlow sandbox
|
|
||||||
paths such as `/mnt/user-data/...` to paths accepted by
|
|
||||||
`@modelcontextprotocol/server-filesystem`. Use DeerFlow's built-in file tools
|
|
||||||
for DeerFlow workspace files.
|
|
||||||
|
|
||||||
## OAuth Support (HTTP/SSE MCP Servers)
|
## OAuth Support (HTTP/SSE MCP Servers)
|
||||||
|
|
||||||
For `http` and `sse` MCP servers, DeerFlow supports OAuth token acquisition and automatic token refresh.
|
For `http` and `sse` MCP servers, DeerFlow supports OAuth token acquisition and automatic token refresh.
|
||||||
@@ -101,6 +88,7 @@ MCP servers expose tools that are automatically discovered and integrated into D
|
|||||||
|
|
||||||
MCP servers can provide access to:
|
MCP servers can provide access to:
|
||||||
|
|
||||||
|
- **File systems**
|
||||||
- **Databases** (e.g., PostgreSQL)
|
- **Databases** (e.g., PostgreSQL)
|
||||||
- **External APIs** (e.g., GitHub, Brave Search)
|
- **External APIs** (e.g., GitHub, Brave Search)
|
||||||
- **Browser automation** (e.g., Puppeteer)
|
- **Browser automation** (e.g., Puppeteer)
|
||||||
@@ -109,4 +97,4 @@ MCP servers can provide access to:
|
|||||||
## Learn More
|
## Learn More
|
||||||
|
|
||||||
For detailed documentation about the Model Context Protocol, visit:
|
For detailed documentation about the Model Context Protocol, visit:
|
||||||
https://modelcontextprotocol.io
|
https://modelcontextprotocol.io
|
||||||
@@ -8,7 +8,6 @@ This directory contains detailed documentation for the DeerFlow backend.
|
|||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
|
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
|
||||||
| [API.md](API.md) | Complete API reference |
|
| [API.md](API.md) | Complete API reference |
|
||||||
| [AUTH_DESIGN.md](AUTH_DESIGN.md) | User authentication, CSRF, and per-user isolation design |
|
|
||||||
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
|
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
|
||||||
| [SETUP.md](SETUP.md) | Quick setup guide |
|
| [SETUP.md](SETUP.md) | Quick setup guide |
|
||||||
|
|
||||||
@@ -19,7 +18,6 @@ This directory contains detailed documentation for the DeerFlow backend.
|
|||||||
| [STREAMING.md](STREAMING.md) | Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup |
|
| [STREAMING.md](STREAMING.md) | Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup |
|
||||||
| [FILE_UPLOAD.md](FILE_UPLOAD.md) | File upload functionality |
|
| [FILE_UPLOAD.md](FILE_UPLOAD.md) | File upload functionality |
|
||||||
| [PATH_EXAMPLES.md](PATH_EXAMPLES.md) | Path types and usage examples |
|
| [PATH_EXAMPLES.md](PATH_EXAMPLES.md) | Path types and usage examples |
|
||||||
| [SANDBOX_MEMORY_PROFILING.md](SANDBOX_MEMORY_PROFILING.md) | Sandbox memory baseline and runtime comparison guide |
|
|
||||||
| [summarization.md](summarization.md) | Context summarization feature |
|
| [summarization.md](summarization.md) | Context summarization feature |
|
||||||
| [plan_mode_usage.md](plan_mode_usage.md) | Plan mode with TodoList |
|
| [plan_mode_usage.md](plan_mode_usage.md) | Plan mode with TodoList |
|
||||||
| [AUTO_TITLE_GENERATION.md](AUTO_TITLE_GENERATION.md) | Automatic title generation |
|
| [AUTO_TITLE_GENERATION.md](AUTO_TITLE_GENERATION.md) | Automatic title generation |
|
||||||
@@ -44,7 +42,6 @@ docs/
|
|||||||
├── README.md # This file
|
├── README.md # This file
|
||||||
├── ARCHITECTURE.md # System architecture
|
├── ARCHITECTURE.md # System architecture
|
||||||
├── API.md # API reference
|
├── API.md # API reference
|
||||||
├── AUTH_DESIGN.md # User authentication and isolation design
|
|
||||||
├── CONFIGURATION.md # Configuration guide
|
├── CONFIGURATION.md # Configuration guide
|
||||||
├── SETUP.md # Setup instructions
|
├── SETUP.md # Setup instructions
|
||||||
├── FILE_UPLOAD.md # File upload feature
|
├── FILE_UPLOAD.md # File upload feature
|
||||||
|
|||||||
@@ -1,120 +0,0 @@
|
|||||||
# Record/Replay E2E — front-back contract verification
|
|
||||||
|
|
||||||
Deterministic, **key-free** end-to-end checks that a backend change can't
|
|
||||||
silently break the frontend (and vice-versa). Two complementary layers, fed by a
|
|
||||||
single recording.
|
|
||||||
|
|
||||||
## Why
|
|
||||||
|
|
||||||
The mock-based frontend e2e hand-writes the backend's JSON/SSE, so a backend
|
|
||||||
schema or SSE change passes green ("fake green"). These layers replay a recorded
|
|
||||||
**real** run against the **real** backend (and, for Layer 2, the real frontend),
|
|
||||||
so contract drift turns the build red instead.
|
|
||||||
|
|
||||||
## The two layers
|
|
||||||
|
|
||||||
- **Layer 1 — backend golden** (`tests/test_replay_golden.py`): replays a fixture
|
|
||||||
through the real FastAPI gateway with `ReplayChatModel` and asserts the streamed
|
|
||||||
SSE event sequence equals a committed golden. Fast, no browser. Guards protocol
|
|
||||||
*shape*.
|
|
||||||
- **Layer 2 — full-stack render** (`frontend/tests/e2e-real-backend/`): real
|
|
||||||
Next.js + real gateway (replay model) + Chromium; asserts the replayed
|
|
||||||
auto-title and a follow-up suggestion render in the browser. Guards semantic
|
|
||||||
*render*. (Complementary to Layer 1 — neither subsumes the other.)
|
|
||||||
|
|
||||||
Layer 2 also hosts **cross-stack contract scenarios** — the dangerous class
|
|
||||||
where a backend change silently breaks a frontend assumption and *both sides'
|
|
||||||
unit tests stay green*. See below.
|
|
||||||
|
|
||||||
## Cross-stack scenario: multi-run render order (`multi-run-order.spec.ts`)
|
|
||||||
|
|
||||||
Regression guard for issue **#3352** (after context compression, refreshing a
|
|
||||||
thread rendered history out of order). Root cause was a front-back desync:
|
|
||||||
backend `RunManager.list_by_thread` returns runs **newest-first** (PR #2932),
|
|
||||||
while the frontend (`core/threads/hooks.ts`) iterated runs and **prepended** each
|
|
||||||
loaded page — inverting chronological order once the checkpoint no longer held
|
|
||||||
the older messages. The backend ordering test was green throughout, and the
|
|
||||||
frontend regression unit test hardcodes "backend returns newest-first" in a mock,
|
|
||||||
so only a *real frontend against a real backend* catches the desync.
|
|
||||||
|
|
||||||
This scenario does **not** record a conversation. It uses a **test-only seeder**
|
|
||||||
(`tests/seed_runs_router.py`, mounted on the replay gateway only when
|
|
||||||
`DEERFLOW_ENABLE_TEST_SEED=1`) to stand up a thread with ≥2 runs and per-run
|
|
||||||
message events — and deliberately **no checkpoint**, which is the #3352
|
|
||||||
precondition: it forces the frontend's per-run reload path to be the sole source
|
|
||||||
of truth so the ordering bug becomes observable. The seeder writes through the
|
|
||||||
gateway's own run/event stores using the request's auth context, so the real
|
|
||||||
`list_by_thread` → `/runs/{id}/messages` → prepend path runs live. Reverting the
|
|
||||||
#3354 frontend fix turns this spec red.
|
|
||||||
|
|
||||||
## How replay works
|
|
||||||
|
|
||||||
`tests/replay_provider.py::ReplayChatModel` returns recorded assistant turns keyed
|
|
||||||
by a **normalized hash of the model caller + conversation**. The conversation is
|
|
||||||
human / ai / tool messages — role, text, tool-call name+args; with
|
|
||||||
`<system-reminder>`, dates, UUIDs, tmp paths stripped. The caller is the stable
|
|
||||||
source of the model call (`lead_agent`, `middleware:title`, `suggest_agent`,
|
|
||||||
`subagent:*`, etc.). A miss raises loudly rather than passing silently.
|
|
||||||
|
|
||||||
**The system prompt is excluded from the match key.** The lead-agent system
|
|
||||||
prompt is a living, frequently-edited implementation detail — its wording changes
|
|
||||||
across PRs (e.g. #3195 added a "File Editing Workflow" section). Hashing it would
|
|
||||||
make every fixture go stale and red-fail unrelated PRs the moment anyone edits the
|
|
||||||
prompt. The conversation flow (user input → tool calls → results → answer) is the
|
|
||||||
stable contract that identifies a recorded turn. The caller still stays in the
|
|
||||||
key so two different model users with identical conversation text do not compete
|
|
||||||
for the same replay bucket. (This mirrors how open-design's mock picker keys on
|
|
||||||
the user prompt, not the system internals.) Combined with pinning skills +
|
|
||||||
extensions empty and disabling memory/summarization
|
|
||||||
(`tests/_replay_fixture.py::build_config_yaml`), a fixture replays the same across
|
|
||||||
machines, days, prompt edits, and CI. Replaying needs **no API key**.
|
|
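The matching rule described above can be sketched as follows. This is an illustrative approximation, not the code in `replay_provider.py`: the function names `normalize` and `replay_key`, the message dictionary shape, the regular expressions, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
import re

# Hypothetical normalization patterns; the real provider may strip more or less.
_PATTERNS = [
    (re.compile(r"<system-reminder>.*?</system-reminder>", re.DOTALL), ""),
    (
        re.compile(
            r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
            re.IGNORECASE,
        ),
        "<uuid>",
    ),
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "<date>"),
    (re.compile(r"/tmp/[^\s\"']+"), "<tmp>"),
]


def normalize(text: str) -> str:
    """Replace run-specific noise with stable placeholders."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text.strip()


def replay_key(caller: str, messages: list[dict]) -> str:
    """Hash the caller plus the normalized human / ai / tool conversation."""
    turns = []
    for message in messages:
        if message["role"] == "system":
            continue  # The system prompt is deliberately excluded from the key.
        turns.append(
            {
                "role": message["role"],
                "text": normalize(message.get("content", "")),
                "tool_calls": [
                    {"name": call["name"], "args": call["args"]}
                    for call in message.get("tool_calls", [])
                ],
            }
        )
    payload = json.dumps(
        {"caller": caller, "turns": turns}, sort_keys=True, ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under these assumptions, editing the system prompt or regenerating a UUID leaves the key unchanged, while the same conversation issued by a different caller lands in a different bucket.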
||||||
|
|
||||||
A swallowed hash-miss keeps the SSE *event shapes* identical (the gateway wraps it
|
|
||||||
into a normal assistant error message), so the Layer-1 golden can't catch a miss
|
|
||||||
by shape alone — it inspects `replay_provider.replay_misses()` and fails loud
|
|
||||||
instead. Layer-2 already fails on a miss (the recorded turns never render).
|
|
||||||
|
|
||||||
## Record a new scenario (needs a real key — dev machine only)
|
|
||||||
|
|
||||||
Recording drives the **real frontend** so captured inputs match exactly what the
|
|
||||||
browser sends; fixtures contain no API key.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# 1. drive the real frontend against a real-model gateway, capturing model calls
|
|
||||||
OPENAI_API_KEY=... OPENAI_API_BASE=<openai-compatible-endpoint>/v1 \
|
|
||||||
DEERFLOW_RECORD_OUT=/tmp/rec/turns.jsonl RECORD_MODEL=<model> \
|
|
||||||
bash -c 'cd frontend && pnpm exec playwright test -c playwright.record.config.ts'
|
|
||||||
|
|
||||||
# 2. stitch the capture into a fixture
|
|
||||||
cd backend && uv run python scripts/build_fixture_from_jsonl.py \
|
|
||||||
--jsonl /tmp/rec/turns.jsonl --meta /tmp/rec/turns.jsonl.meta.json \
|
|
||||||
--out tests/fixtures/replay/<scenario>.<mode>.json --model <model>
|
|
||||||
|
|
||||||
# 3. regenerate the committed golden
|
|
||||||
DEERFLOW_WRITE_GOLDEN=1 PYTHONPATH=. uv run pytest tests/test_replay_golden.py
|
|
||||||
```
|
|
||||||
|
|
||||||
## Run (no key)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
cd backend && PYTHONPATH=. uv run pytest tests/test_replay_golden.py # Layer 1
|
|
||||||
cd frontend && pnpm exec playwright test -c playwright.real-backend.config.ts # Layer 2
|
|
||||||
```
|
|
||||||
|
|
||||||
## CI
|
|
||||||
|
|
||||||
`.github/workflows/replay-e2e.yml` runs both layers on changes to **either** side
|
|
||||||
of the contract (`frontend/**`, `backend/app/gateway/**`,
|
|
||||||
`backend/packages/harness/**`, fixtures). DOM assertions are the gate; the rendered
|
|
||||||
screenshot + Playwright HTML report are uploaded as a CI artifact.
|
|
||||||
|
|
||||||
## Known limitations
|
|
||||||
|
|
||||||
- Visual regression baselines are OS-specific, so they are a **local dev gate
|
|
||||||
only** (gitignored); CI uploads the render as an artifact for human review
|
|
||||||
instead of hard-asserting a cross-OS baseline.
|
|
||||||
- Fixtures are coupled to the recording-time prompt; if new
|
|
||||||
environment-dependent content enters the system prompt, extend the
|
|
||||||
normalization in `replay_provider.py` (or pin it in `build_config_yaml`).
|
|
||||||
- Re-record a scenario if the agent graph changes how many model calls it makes
|
|
||||||
— the replay raises loudly on a hash miss pointing at the divergence.
|
|
||||||
@@ -1,81 +0,0 @@
|
|||||||
# Sandbox Memory Profiling
|
|
||||||
|
|
||||||
This guide records a repeatable baseline before changing the sandbox runtime.
|
|
||||||
Issue #3213 reports per-sandbox memory near 1 GiB in Kubernetes. Before adding
|
|
||||||
or recommending a new provider, capture the current AIO sandbox baseline and
|
|
||||||
compare candidates with the same DeerFlow workload.
|
|
||||||
|
|
||||||
## What to Measure
|
|
||||||
|
|
||||||
Measure at least these samples:
|
|
||||||
|
|
||||||
1. Empty sandbox after it becomes ready.
|
|
||||||
2. After a simple bash command.
|
|
||||||
3. After a Python task that imports common packages.
|
|
||||||
4. After a Node task when Node-based workloads are expected.
|
|
||||||
5. After generating files under `/mnt/user-data/outputs`.
|
|
||||||
6. After release and warm reuse.
|
|
||||||
7. At the target concurrency level, for example 10, 50, or 100 sandboxes.
|
|
||||||
|
|
||||||
`kubectl top` reports Kubernetes/container working set memory. Treat it as a
|
|
||||||
capacity signal, not exclusive RSS/PSS. Pod-level memory includes every
|
|
||||||
container in the Pod and may include cache charged to the cgroup. If a result
|
|
||||||
looks surprising, inspect the sandbox processes and cgroup metrics on the node
|
|
||||||
before drawing conclusions.
|
|
||||||
|
|
||||||
## Capture a Snapshot
|
|
||||||
|
|
||||||
Run this from the repository root:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python scripts/sandbox_memory_profile.py \
|
|
||||||
--namespace deer-flow \
|
|
||||||
--selector app=deer-flow-sandbox \
|
|
||||||
--sample empty \
|
|
||||||
--include-processes \
|
|
||||||
--format markdown
|
|
||||||
```
|
|
||||||
|
|
||||||
Use a descriptive `--sample` value for each phase:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python scripts/sandbox_memory_profile.py --sample after-bash --format json
|
|
||||||
python scripts/sandbox_memory_profile.py --sample after-python --format json
|
|
||||||
python scripts/sandbox_memory_profile.py --sample after-artifact --format json
|
|
||||||
```
|
|
||||||
|
|
||||||
`--include-processes` runs `kubectl exec ... ps` in each sandbox Pod and adds
|
|
||||||
the highest-RSS processes to the report. This helps distinguish Pod-level cgroup
|
|
||||||
memory from process RSS. The two numbers will not match exactly because cgroup
|
|
||||||
memory can include cache and other kernel-accounted memory.
|
|
||||||
|
|
||||||
Save the raw JSON when comparing backends so totals, pod names, images,
|
|
||||||
requests, limits, and timestamps can be audited later.
|
|
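When comparing saved samples by hand, the capacity figures (count, total, average, max) can be derived from plain `kubectl top pods --no-headers` text. The sketch below is a hypothetical helper, not part of `scripts/sandbox_memory_profile.py`; it assumes three whitespace-separated columns (name, CPU, memory) and memory values with `Ki`, `Mi`, or `Gi` suffixes.

```python
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_memory(value: str) -> int:
    """Convert a Kubernetes memory quantity such as '512Mi' to bytes."""
    for suffix, factor in _UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # Assume a bare number is already in bytes.


def summarize(top_output: str) -> dict:
    """Summarize working-set memory across the pods in one sample."""
    usage = {}
    for line in top_output.strip().splitlines():
        name, _cpu, memory = line.split()
        usage[name] = parse_memory(memory)
    values = list(usage.values())
    if not values:
        return {"pods": 0, "total": 0, "average": 0.0, "max": 0}
    return {
        "pods": len(values),
        "total": sum(values),
        "average": sum(values) / len(values),
        "max": max(values),
    }
```

These numbers are working-set memory, so they carry the same caveat as `kubectl top` itself: treat them as a capacity signal rather than exclusive RSS.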
||||||
|
|
||||||
## Candidate Runtime Matrix
|
|
||||||
|
|
||||||
For AIO, CubeSandbox, OpenSandbox, gVisor, Kata, or another candidate, compare
|
|
||||||
the same workload and record:
|
|
||||||
|
|
||||||
| Area | Required Evidence |
|
|
||||||
| --- | --- |
|
|
||||||
| Capacity | Pod or instance count, total memory, average memory, max memory |
|
|
||||||
| Startup | Ready latency at 1, 10, 50, and 100 concurrent sandboxes |
|
|
||||||
| Commands | Bash output, timeout behavior, failure shape |
|
|
||||||
| Files | `read_file`, `write_file`, binary `update_file`, `list_dir`, `glob`, `grep` |
|
|
||||||
| Uploads | Files uploaded by the gateway are visible inside the sandbox |
|
|
||||||
| Artifacts | Files written to `/mnt/user-data/outputs` are readable by the backend artifact API |
|
|
||||||
| Paths | `/mnt/user-data/workspace`, `/mnt/user-data/uploads`, `/mnt/user-data/outputs`, `/mnt/acp-workspace`, and skills paths keep their expected semantics |
|
|
||||||
| Isolation | Different users and threads cannot read each other's data |
|
|
||||||
| Cleanup | Release, idle timeout, process restart, and orphan cleanup free resources |
|
|
||||||
| Operations | Deployment prerequisites, privileged components, networking, storage, and upgrade path |
|
|
||||||
|
|
||||||
## PR Guidance
|
|
||||||
|
|
||||||
Do not claim that a new provider fixes high-concurrency memory usage until the
|
|
||||||
same DeerFlow workload has been measured on both the current AIO sandbox and the
|
|
||||||
candidate backend.
|
|
||||||
|
|
||||||
For an experimental provider PR, prefer `Related to #3213` unless the PR also
|
|
||||||
includes reproducible DeerFlow workload data that demonstrates the target memory
|
|
||||||
reduction and preserves uploads, outputs, artifacts, and isolation behavior.
|
|
||||||
@@ -26,7 +26,7 @@
|
|||||||
- Replace sync `requests` with `httpx.AsyncClient` in community tools (tavily, jina_ai, firecrawl, infoquest, image_search)
|
- Replace sync `requests` with `httpx.AsyncClient` in community tools (tavily, jina_ai, firecrawl, infoquest, image_search)
|
||||||
- [x] Replace sync `model.invoke()` with async `model.ainvoke()` in title_middleware and memory updater
|
- [x] Replace sync `model.invoke()` with async `model.ainvoke()` in title_middleware and memory updater
|
||||||
- Consider `asyncio.to_thread()` wrapper for remaining blocking file I/O
|
- Consider `asyncio.to_thread()` wrapper for remaining blocking file I/O
|
||||||
- For production: tune Gateway worker/runtime settings for long-running agent workloads
|
- For production: use `langgraph up` (multi-worker) instead of `langgraph dev` (single-worker)
|
||||||
|
|
||||||
## Resolved Issues
|
## Resolved Issues
|
||||||
|
|
||||||
|
|||||||
@@ -4,22 +4,22 @@
|
|||||||
|
|
||||||
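The full middleware chain that `create_deerflow_agent` assembles through `RuntimeFeatures` (with everything enabled by default):
|
The full middleware chain that `create_deerflow_agent` assembles through `RuntimeFeatures` (with everything enabled by default):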
|
||||||
|
|
||||||
| # | Middleware | `before_agent` | `before_model` | `after_model` | `after_agent` | `wrap_model_call` | `wrap_tool_call` | Main agent | Subagent | Source |
|
| # | Middleware | `before_agent` | `before_model` | `after_model` | `after_agent` | `wrap_tool_call` | Main agent | Subagent | Source |
|
||||||
|---|-----------|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|------|
|
|---|-----------|:-:|:-:|:-:|:-:|:-:|:-:|:-:|------|
|
||||||
| 0 | ThreadDataMiddleware | ✓ | | | | | | ✓ | ✓ | `sandbox` |
|
| 0 | ThreadDataMiddleware | ✓ | | | | | ✓ | ✓ | `sandbox` |
|
||||||
| 1 | UploadsMiddleware | ✓ | | | | | | ✓ | ✗ | `sandbox` |
|
| 1 | UploadsMiddleware | ✓ | | | | | ✓ | ✗ | `sandbox` |
|
||||||
| 2 | SandboxMiddleware | ✓ | | | ✓ | | | ✓ | ✓ | `sandbox` |
|
| 2 | SandboxMiddleware | ✓ | | | ✓ | | ✓ | ✓ | `sandbox` |
|
||||||
| 3 | DanglingToolCallMiddleware | | | | | ✓ | | ✓ | ✗ | always on |
|
| 3 | DanglingToolCallMiddleware | | | ✓ | | | ✓ | ✗ | always on |
|
||||||
| 4 | GuardrailMiddleware | | | | | | ✓ | ✓ | ✓ | *included in Phase 2* |
|
| 4 | GuardrailMiddleware | | | | | ✓ | ✓ | ✓ | *included in Phase 2* |
|
||||||
| 5 | ToolErrorHandlingMiddleware | | | | | | ✓ | ✓ | ✓ | always on |
|
| 5 | ToolErrorHandlingMiddleware | | | | | ✓ | ✓ | ✓ | always on |
|
||||||
| 6 | SummarizationMiddleware | | ✓ | | | | | ✓ | ✗ | `summarization` |
|
| 6 | SummarizationMiddleware | | | ✓ | | | ✓ | ✗ | `summarization` |
|
||||||
| 7 | TodoMiddleware | | ✓ | ✓ | | ✓ | | ✓ | ✗ | `plan_mode` parameter |
|
| 7 | TodoMiddleware | | | ✓ | | | ✓ | ✗ | `plan_mode` parameter |
|
||||||
| 8 | TitleMiddleware | | | ✓ | | | | ✓ | ✗ | `auto_title` |
|
| 8 | TitleMiddleware | | | ✓ | | | ✓ | ✗ | `auto_title` |
|
||||||
| 9 | MemoryMiddleware | | | | ✓ | | | ✓ | ✗ | `memory` |
|
| 9 | MemoryMiddleware | | | | ✓ | | ✓ | ✗ | `memory` |
|
||||||
| 10 | ViewImageMiddleware | | ✓ | | | | | ✓ | ✗ | `vision` |
|
| 10 | ViewImageMiddleware | | ✓ | | | | ✓ | ✗ | `vision` |
|
||||||
| 11 | SubagentLimitMiddleware | | | ✓ | | | | ✓ | ✗ | `subagent` |
|
| 11 | SubagentLimitMiddleware | | | ✓ | | | ✓ | ✗ | `subagent` |
|
||||||
| 12 | LoopDetectionMiddleware | ✓ | | ✓ | ✓ | ✓ | | ✓ | ✗ | always on |
|
| 12 | LoopDetectionMiddleware | | | ✓ | | | ✓ | ✗ | always on |
|
||||||
| 13 | ClarificationMiddleware | | | | | | ✓ | ✓ | ✗ | always last |
|
| 13 | ClarificationMiddleware | | | ✓ | | | ✓ | ✗ | always last |
|
||||||
|
|
||||||
The lead agent has **14** middlewares (`make_lead_agent`); a subagent has **4** (ThreadData, Sandbox, Guardrail, ToolErrorHandling). `create_deerflow_agent` Phase 1 implements **13** (Guardrail only supports a custom instance and has no built-in default).
|
The lead agent has **14** middlewares (`make_lead_agent`); a subagent has **4** (ThreadData, Sandbox, Guardrail, ToolErrorHandling). `create_deerflow_agent` Phase 1 implements **13** (Guardrail only supports a custom instance and has no built-in default).
|
||||||
|
|
||||||
@@ -35,7 +35,7 @@ graph TB
|
|||||||
|
|
||||||
subgraph BA ["<b>before_agent</b> forward order 0→N"]
|
subgraph BA ["<b>before_agent</b> forward order 0→N"]
|
||||||
direction TB
|
direction TB
|
||||||
TD["[0] ThreadData<br/>create thread directory"] --> UL["[1] Uploads<br/>scan uploaded files"] --> SB["[2] Sandbox<br/>acquire sandbox"] --> LD_BA["[12] LoopDetection<br/>clear stale warnings"]
|
TD["[0] ThreadData<br/>create thread directory"] --> UL["[1] Uploads<br/>scan uploaded files"] --> SB["[2] Sandbox<br/>acquire sandbox"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph BM ["<b>before_model</b> forward order 0→N"]
|
subgraph BM ["<b>before_model</b> forward order 0→N"]
|
||||||
@@ -43,42 +43,34 @@ graph TB
|
|||||||
VI["[10] ViewImage<br/>inject image base64"]
|
VI["[10] ViewImage<br/>inject image base64"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph WM ["<b>wrap_model_call</b>"]
|
SB --> VI
|
||||||
direction TB
|
VI --> M["<b>MODEL</b>"]
|
||||||
DTC_WM["[3] DanglingToolCall<br/>add ToolMessage for dangling tool calls"] --> LD_WM["[12] LoopDetection<br/>inject current-run warning"]
|
|
||||||
end
|
|
||||||
|
|
||||||
LD_BA --> VI
|
|
||||||
VI --> DTC_WM
|
|
||||||
LD_WM --> M["<b>MODEL</b>"]
|
|
||||||
|
|
||||||
subgraph AM ["<b>after_model</b> reverse order N→0"]
|
subgraph AM ["<b>after_model</b> reverse order N→0"]
|
||||||
direction TB
|
direction TB
|
||||||
LD["[12] LoopDetection<br/>detect loops / queue warning"] --> SL["[11] SubagentLimit<br/>truncate excess tasks"] --> TI["[8] Title<br/>generate title"]
|
CL["[13] Clarification<br/>intercept ask_clarification"] --> LD["[12] LoopDetection<br/>detect loops"] --> SL["[11] SubagentLimit<br/>truncate excess tasks"] --> TI["[8] Title<br/>generate title"] --> SM["[6] Summarization<br/>compress context"] --> DTC["[3] DanglingToolCall<br/>add missing ToolMessage"]
|
||||||
end
|
end
|
||||||
|
|
||||||
M --> LD
|
M --> CL
|
||||||
|
|
||||||
subgraph AA ["<b>after_agent</b> reverse order N→0"]
|
subgraph AA ["<b>after_agent</b> reverse order N→0"]
|
||||||
direction TB
|
direction TB
|
||||||
LD_CLEAN["[12] LoopDetection<br/>clear pending warnings"] --> MEM["[9] Memory<br/>enqueue memory"] --> SBR["[2] Sandbox<br/>release sandbox"]
|
SBR["[2] Sandbox<br/>release sandbox"] --> MEM["[9] Memory<br/>enqueue memory"]
|
||||||
end
|
end
|
||||||
|
|
||||||
TI --> LD_CLEAN
|
DTC --> SBR
|
||||||
SBR --> END(["response"])
|
MEM --> END(["response"])
|
||||||
|
|
||||||
classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
|
classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
|
||||||
classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
|
classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
|
||||||
classDef wrapModelNode fill:#a8a0b5,stroke:#6b637a,color:#2d3239
|
|
||||||
classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
|
classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
|
||||||
classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
|
classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
|
||||||
classDef terminalNode fill:#a8b5a0,stroke:#6b7a63,color:#2d3239
|
classDef terminalNode fill:#a8b5a0,stroke:#6b7a63,color:#2d3239
|
||||||
|
|
||||||
class TD,UL,SB,LD_BA,VI beforeNode
|
class TD,UL,SB,VI beforeNode
|
||||||
class DTC_WM,LD_WM wrapModelNode
|
|
||||||
class M modelNode
|
class M modelNode
|
||||||
class LD,SL,TI afterModelNode
|
class CL,LD,SL,TI,SM,DTC afterModelNode
|
||||||
class LD_CLEAN,SBR,MEM afterAgentNode
|
class SBR,MEM afterAgentNode
|
||||||
class START,END terminalNode
|
class START,END terminalNode
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -90,12 +82,13 @@ sequenceDiagram
|
|||||||
participant TD as ThreadDataMiddleware
|
participant TD as ThreadDataMiddleware
|
||||||
participant UL as UploadsMiddleware
|
participant UL as UploadsMiddleware
|
||||||
participant SB as SandboxMiddleware
|
participant SB as SandboxMiddleware
|
||||||
participant LD as LoopDetectionMiddleware
|
|
||||||
participant VI as ViewImageMiddleware
|
participant VI as ViewImageMiddleware
|
||||||
participant DTC as DanglingToolCallMiddleware
|
|
||||||
participant M as MODEL
|
participant M as MODEL
|
||||||
|
participant CL as ClarificationMiddleware
|
||||||
participant SL as SubagentLimitMiddleware
|
participant SL as SubagentLimitMiddleware
|
||||||
participant TI as TitleMiddleware
|
participant TI as TitleMiddleware
|
||||||
|
participant SM as SummarizationMiddleware
|
||||||
|
participant DTC as DanglingToolCallMiddleware
|
||||||
participant MEM as MemoryMiddleware
|
participant MEM as MemoryMiddleware
|
||||||
|
|
||||||
U ->> TD: invoke
|
U ->> TD: invoke
|
||||||
@@ -110,26 +103,19 @@ sequenceDiagram
|
|||||||
activate SB
|
activate SB
|
||||||
Note right of SB: before_agent acquire sandbox
|
Note right of SB: before_agent acquire sandbox
|
||||||
|
|
||||||
SB ->> LD: before_agent
|
SB ->> VI: before_model
|
||||||
activate LD
|
|
||||||
Note right of LD: before_agent clear pending warnings left by earlier runs on the same thread
|
|
||||||
LD ->> VI: before_model
|
|
||||||
activate VI
|
activate VI
|
||||||
Note right of VI: before_model inject image base64
|
Note right of VI: before_model inject image base64
|
||||||
|
|
||||||
VI ->> DTC: wrap_model_call
|
VI ->> M: messages + tools
|
||||||
activate DTC
|
|
||||||
Note right of DTC: wrap_model_call add ToolMessage for dangling tool calls
|
|
||||||
DTC ->> LD: wrap_model_call
|
|
||||||
Note right of LD: wrap_model_call drain current-run warnings and append them last
|
|
||||||
LD ->> M: messages + tools
|
|
||||||
activate M
|
activate M
|
||||||
M -->> LD: AI response
|
M -->> CL: AI response
|
||||||
deactivate M
|
deactivate M
|
||||||
|
|
||||||
Note right of LD: after_model detect loops, queue warnings, clear tool_calls on hard stop
|
activate CL
|
||||||
LD -->> SL: after_model
|
Note right of CL: after_model intercept ask_clarification
|
||||||
deactivate LD
|
CL -->> SL: after_model
|
||||||
|
deactivate CL
|
||||||
|
|
||||||
activate SL
|
activate SL
|
||||||
Note right of SL: after_model truncate excess tasks
|
Note right of SL: after_model truncate excess tasks
|
||||||
@@ -138,18 +124,22 @@ sequenceDiagram
|
|||||||
|
|
||||||
activate TI
|
activate TI
|
||||||
Note right of TI: after_model generate title
|
Note right of TI: after_model generate title
|
||||||
TI -->> DTC: done
|
TI -->> SM: after_model
|
||||||
deactivate TI
|
deactivate TI
|
||||||
|
|
||||||
|
activate SM
|
||||||
|
Note right of SM: after_model compress context
|
||||||
|
SM -->> DTC: after_model
|
||||||
|
deactivate SM
|
||||||
|
|
||||||
|
activate DTC
|
||||||
|
Note right of DTC: after_model add missing ToolMessage
|
||||||
|
DTC -->> VI: done
|
||||||
deactivate DTC
|
deactivate DTC
|
||||||
|
|
||||||
VI -->> SB: done
|
VI -->> SB: done
|
||||||
deactivate VI
|
deactivate VI
|
||||||
|
|
||||||
Note right of LD: after_agent clear unconsumed warnings of the current run
|
|
||||||
|
|
||||||
Note right of MEM: after_agent enqueue memory
|
|
||||||
|
|
||||||
Note right of SB: after_agent release sandbox
|
Note right of SB: after_agent release sandbox
|
||||||
SB -->> UL: done
|
SB -->> UL: done
|
||||||
deactivate SB
|
deactivate SB
|
||||||
@@ -157,6 +147,8 @@ sequenceDiagram
|
|||||||
UL -->> TD: done
|
UL -->> TD: done
|
||||||
deactivate UL
|
deactivate UL
|
||||||
|
|
||||||
|
Note right of MEM: after_agent enqueue memory
|
||||||
|
|
||||||
TD -->> U: response
|
TD -->> U: response
|
||||||
deactivate TD
|
deactivate TD
|
||||||
```
|
```
|
||||||
@@ -232,12 +224,12 @@ sequenceDiagram
|
|||||||
participant TD as ThreadData
|
participant TD as ThreadData
|
||||||
participant UL as Uploads
|
participant UL as Uploads
|
||||||
participant SB as Sandbox
|
participant SB as Sandbox
|
||||||
participant LD as LoopDetection
|
|
||||||
participant VI as ViewImage
|
participant VI as ViewImage
|
||||||
participant DTC as DanglingToolCall
|
|
||||||
participant M as MODEL
|
participant M as MODEL
|
||||||
|
participant CL as Clarification
|
||||||
participant SL as SubagentLimit
|
participant SL as SubagentLimit
|
||||||
participant TI as Title
|
participant TI as Title
|
||||||
|
participant SM as Summarization
|
||||||
participant MEM as Memory
|
participant MEM as Memory
|
||||||
|
|
||||||
U ->> TD: invoke
|
U ->> TD: invoke
|
||||||
@@ -246,40 +238,34 @@ sequenceDiagram
|
|||||||
Note right of UL: before_agent scan files
|
Note right of UL: before_agent scan files
|
||||||
UL ->> SB: .
|
UL ->> SB: .
|
||||||
Note right of SB: before_agent acquire sandbox
|
Note right of SB: before_agent acquire sandbox
|
||||||
SB ->> LD: .
|
|
||||||
Note right of LD: before_agent clear stale pending warnings
|
|
||||||
|
|
||||||
loop each round (tool call loop)
|
loop each round (tool call loop)
|
||||||
SB ->> VI: .
|
SB ->> VI: .
|
||||||
Note right of VI: before_model inject images
|
Note right of VI: before_model inject images
|
||||||
VI ->> DTC: .
|
VI ->> M: messages + tools
|
||||||
Note right of DTC: wrap_model_call add results for dangling tool calls
|
M -->> CL: AI response
|
||||||
DTC ->> LD: .
|
Note right of CL: after_model intercept ask_clarification
|
||||||
Note right of LD: wrap_model_call inject current-run warnings
|
CL -->> SL: .
|
||||||
LD ->> M: messages + tools
|
|
||||||
M -->> LD: AI response
|
|
||||||
Note right of LD: after_model detect loops / queue warnings
|
|
||||||
LD -->> SL: .
|
|
||||||
Note right of SL: after_model truncate excess tasks
|
Note right of SL: after_model truncate excess tasks
|
||||||
SL -->> TI: .
|
SL -->> TI: .
|
||||||
Note right of TI: after_model generate title
|
Note right of TI: after_model generate title
|
||||||
|
TI -->> SM: .
|
||||||
|
Note right of SM: after_model compress context
|
||||||
end
|
end
|
||||||
|
|
||||||
Note right of LD: after_agent clear pending warnings of the current run
|
|
||||||
LD -->> MEM: .
|
|
||||||
Note right of MEM: after_agent enqueue memory
|
|
||||||
MEM -->> SB: .
|
|
||||||
Note right of SB: after_agent release sandbox
|
Note right of SB: after_agent release sandbox
|
||||||
SB -->> U: response
|
SB -->> MEM: .
|
||||||
|
Note right of MEM: after_agent enqueue memory
|
||||||
|
MEM -->> U: response
|
||||||
```
|
```
|
||||||
|
|
||||||
> [!warning] Not an onion
|
> [!warning] Not an onion
|
||||||
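> Most middlewares use only one phase. SandboxMiddleware uses `before_agent`/`after_agent` to acquire and release a resource; LoopDetectionMiddleware also uses these two hooks, but to clear run-scoped pending warnings rather than for a symmetric resource lifecycle. `before_agent` / `after_agent` run only once, while `before_model` / `after_model` / `wrap_model_call` run on every loop iteration.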
|
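> Of the 14 middlewares, only SandboxMiddleware has before/after symmetry (acquire/release). The rest are one-directional: they act either only in `before_*` or only in `after_*`. `before_agent` / `after_agent` run only once, while `before_model` / `after_model` run on every loop iteration.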
|
||||||
|
|
||||||
There are only 2 hard dependencies:
|
There are only 2 hard dependencies:
|
||||||
|
|
||||||
1. **ThreadData before Sandbox**: the sandbox needs the thread directory
|
1. **ThreadData before Sandbox**: the sandbox needs the thread directory
|
||||||
2. **Clarification last in the list**: when `wrap_tool_call` handles `ask_clarification`, it intercepts first and interrupts execution via `Command(goto=END)`
|
2. **Clarification last in the list**: it runs first when `after_model` executes in reverse order, so it is the first to intercept `ask_clarification`
|
||||||
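The ordering rule behind these dependencies can be sketched in a few lines. This is a minimal illustration, not the project's dispatch code: `HookSpec` and `run_model_round` are hypothetical names, and the chain only assumes what the diagrams above show, namely that `before_model` hooks run in list order and `after_model` hooks run in reverse list order.

```python
from dataclasses import dataclass


@dataclass
class HookSpec:
    """Hypothetical stand-in for a middleware: a name plus the hooks it implements."""

    name: str
    before_model: bool = False
    after_model: bool = False


def run_model_round(chain: list[HookSpec]) -> list[str]:
    """Return the hook execution order for one model round."""
    order: list[str] = []
    for mw in chain:  # before_model: list order, 0 -> N
        if mw.before_model:
            order.append(f"before_model:{mw.name}")
    order.append("MODEL")
    for mw in reversed(chain):  # after_model: reverse order, N -> 0
        if mw.after_model:
            order.append(f"after_model:{mw.name}")
    return order


# A shortened chain in list order (illustrative subset only).
chain = [
    HookSpec("Title", after_model=True),
    HookSpec("ViewImage", before_model=True),
    HookSpec("SubagentLimit", after_model=True),
    HookSpec("LoopDetection", after_model=True),
]
print(run_model_round(chain))
```

Under these assumptions, the middleware appended last is the first to see the model output, which is why list position matters for any middleware that must intercept before the others.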
|
|
||||||
### Conclusion
|
### Conclusion
|
||||||
|
|
||||||
@@ -287,19 +273,19 @@ sequenceDiagram
|
|||||||
|---|---|---|
|
|---|---|---|
|
||||||
| Each middleware | before + after symmetric | mostly uses a single hook |
|
| Each middleware | before + after symmetric | mostly uses a single hook |
|
||||||
| Activation bars | nested (outer long, inner short) | not nested (serial) |
|
| Activation bars | nested (outer long, inner short) | not nested (serial) |
|
||||||
| Meaning of reverse order | cleanup paired with initialization | affects the execution priority of `after_model` / `after_agent` |
|
| Meaning of reverse order | cleanup paired with initialization | only affects the execution priority of after_model |
|
||||||
| Typical example | Auth: validate token / clean up context | ThreadData: only creates the directory, no cleanup |
|
| Typical example | Auth: validate token / clean up context | ThreadData: only creates the directory, no cleanup |
|
||||||
|
|
||||||
## Key Design Points
|
## Key Design Points
|
||||||
|
|
||||||
### Why is ClarificationMiddleware last in the list?
|
### Why is ClarificationMiddleware last in the list?
|
||||||
|
|
||||||
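Being last in the list lets it intercept `ask_clarification` first in the tool-call wrapping chain. On a match, it returns `Command(goto=END)`, writes the formatted clarification question as a `ToolMessage`, and interrupts execution.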
|
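Last position = first to run in `after_model`. It needs to be the **first** to see the model output and check for an `ask_clarification` tool call. If there is one, it interrupts immediately (`Command(goto=END)`), and the `after_model` hooks of the remaining middlewares do not run.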
|
||||||
|
|
||||||
### Symmetry of SandboxMiddleware
|
### Symmetry of SandboxMiddleware
|
||||||
|
|
||||||
`before_agent` (3rd in forward order) acquires the sandbox and `after_agent` (1st in reverse order) releases it. Outer enter → outer exit: a natural onion symmetry.
|
`before_agent` (3rd in forward order) acquires the sandbox and `after_agent` (1st in reverse order) releases it. Outer enter → outer exit: a natural onion symmetry.
|
||||||
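To make the acquire/release pairing concrete, here is a small sketch of one agent run. It assumes only what the text states: agent-level hooks run once per run, while model-level hooks run on every tool-call round. The function name `run_agent` and the log strings are illustrative, not taken from the codebase.

```python
def run_agent(rounds: int) -> list[str]:
    """Trace one agent run: agent-level hooks once, model-level hooks per round."""
    log: list[str] = ["before_agent: acquire sandbox"]  # runs once, on entry
    for i in range(rounds):
        # These three steps repeat for every tool-call round.
        log.append(f"round {i}: before_model")
        log.append(f"round {i}: MODEL")
        log.append(f"round {i}: after_model")
    log.append("after_agent: release sandbox")  # runs once, on exit
    return log


for line in run_agent(2):
    print(line)
```

The sandbox is held across all rounds and released only after the last one, which is the single place in the chain where the before/after pairing behaves like a classic onion layer.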
|
|
||||||
### Why does LoopDetectionMiddleware use several hooks at once?
|
### Most middlewares use only one hook
|
||||||
|
|
||||||
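`after_model` only performs detection: when repeated tool calls reach the warning threshold, it puts a warning into a pending queue scoped by `(thread_id, run_id)`. The actual injection happens at the next `wrap_model_call`: by then the `ToolMessage` for the previous round's `AIMessage(tool_calls)` is already in the request, so the warning is appended at the end and does not break OpenAI/Moonshot tool-call pairing. `before_agent` clears leftover warnings from older runs on the same thread, and `after_agent` clears warnings of the current run that were never consumed.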
|
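Of the 14 middlewares, only SandboxMiddleware uses both `before_agent` and `after_agent` (acquire/release). All the others run in a single phase. The reverse-order property of the onion model mainly affects the execution order of the `after_model` phase.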
|
||||||
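For the multi-hook variant of LoopDetectionMiddleware described in this section, the pending-warning lifecycle can be sketched as a queue keyed by `(thread_id, run_id)`. This is a simplified model under stated assumptions: the class `PendingWarnings`, its method names, and the dict-shaped messages are hypothetical and do not reproduce the middleware's real internals.

```python
from collections import defaultdict


class PendingWarnings:
    """Hypothetical run-scoped warning queue keyed by (thread_id, run_id)."""

    def __init__(self) -> None:
        self._queues: dict[tuple[str, str], list[str]] = defaultdict(list)

    def enqueue(self, thread_id: str, run_id: str, warning: str) -> None:
        """after_model: detection only queues the warning."""
        self._queues[(thread_id, run_id)].append(warning)

    def drain(self, thread_id: str, run_id: str) -> list[str]:
        """wrap_model_call: take every warning queued for the current run."""
        return self._queues.pop((thread_id, run_id), [])

    def clear_stale(self, thread_id: str, current_run_id: str) -> None:
        """before_agent: drop leftovers from older runs on the same thread."""
        stale = [k for k in self._queues if k[0] == thread_id and k[1] != current_run_id]
        for key in stale:
            del self._queues[key]

    def clear_run(self, thread_id: str, run_id: str) -> None:
        """after_agent: drop warnings of this run that were never consumed."""
        self._queues.pop((thread_id, run_id), None)


def inject(messages: list[dict], warnings: list[str]) -> list[dict]:
    """Append warnings after the existing messages, leaving tool-call pairs adjacent."""
    return messages + [{"role": "warning", "content": w} for w in warnings]


pending = PendingWarnings()
pending.enqueue("t1", "old-run", "stale warning")
pending.clear_stale("t1", "new-run")          # before_agent of the new run
pending.enqueue("t1", "new-run", "repeated tool call detected")  # after_model
request = [{"role": "ai", "tool_calls": ["search"]}, {"role": "tool", "content": "result"}]
request = inject(request, pending.drain("t1", "new-run"))  # next wrap_model_call
print(request[-1]["content"])
```

Deferring injection to the next request is what keeps the `ToolMessage` directly after its `AIMessage(tool_calls)`; the `"warning"` role used here is a placeholder for whatever message type the middleware actually emits.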
|
|||||||
@@ -127,8 +127,8 @@ complex_agent = create_agent_for_task("high")
|
|||||||
## How It Works
|
## How It Works
|
||||||
|
|
||||||
1. When `make_lead_agent(config)` is called, it extracts `is_plan_mode` from `config.configurable`
|
1. When `make_lead_agent(config)` is called, it extracts `is_plan_mode` from `config.configurable`
|
||||||
2. The config is passed to `build_middlewares(config)`
|
2. The config is passed to `_build_middlewares(config)`
|
||||||
3. `build_middlewares()` reads `is_plan_mode` and calls `_create_todo_list_middleware(is_plan_mode)`
|
3. `_build_middlewares()` reads `is_plan_mode` and calls `_create_todo_list_middleware(is_plan_mode)`
|
||||||
4. If `is_plan_mode=True`, a `TodoListMiddleware` instance is created and added to the middleware chain
|
4. If `is_plan_mode=True`, a `TodoListMiddleware` instance is created and added to the middleware chain
|
||||||
5. The middleware automatically adds a `write_todos` tool to the agent's toolset
|
5. The middleware automatically adds a `write_todos` tool to the agent's toolset
|
||||||
6. The agent can use this tool to manage tasks during execution
|
6. The agent can use this tool to manage tasks during execution
|
||||||
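The steps above can be sketched as follows. This is a simplified stand-in, not the module's code: `TodoListMiddlewareStub`, `create_todo_list_middleware`, and `build_chain` are hypothetical names, the config is modeled as a plain dict, and the chain entries are strings used only to show where the optional middleware is appended.

```python
class TodoListMiddlewareStub:
    """Placeholder for the real TodoListMiddleware, used only in this sketch."""


def create_todo_list_middleware(is_plan_mode: bool):
    """Return a middleware instance in plan mode, otherwise None."""
    return TodoListMiddlewareStub() if is_plan_mode else None


def build_chain(config: dict) -> list:
    """Read is_plan_mode from config['configurable'] and build the chain."""
    is_plan_mode = config.get("configurable", {}).get("is_plan_mode", False)
    chain: list = ["ThreadDataMiddleware", "SandboxMiddleware"]  # illustrative subset
    todo = create_todo_list_middleware(is_plan_mode)
    if todo is not None:
        chain.append(todo)
    return chain


print(len(build_chain({})))
print(len(build_chain({"configurable": {"is_plan_mode": True}})))
```

Because the flag defaults to `False`, a config without `is_plan_mode` yields a chain with no todo middleware, so the `write_todos` tool would only appear when the caller opts in.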
@@ -141,7 +141,7 @@ make_lead_agent(config)
|
|||||||
│
|
│
|
||||||
├─> Extracts: is_plan_mode = config.configurable.get("is_plan_mode", False)
|
├─> Extracts: is_plan_mode = config.configurable.get("is_plan_mode", False)
|
||||||
│
|
│
|
||||||
└─> build_middlewares(config)
|
└─> _build_middlewares(config)
|
||||||
│
|
│
|
||||||
├─> ThreadDataMiddleware
|
├─> ThreadDataMiddleware
|
||||||
├─> SandboxMiddleware
|
├─> SandboxMiddleware
|
||||||
@@ -156,7 +156,7 @@ make_lead_agent(config)
|
|||||||
### Agent Module
|
### Agent Module
|
||||||
- **Location**: `packages/harness/deerflow/agents/lead_agent/agent.py`
|
- **Location**: `packages/harness/deerflow/agents/lead_agent/agent.py`
|
||||||
- **Function**: `_create_todo_list_middleware(is_plan_mode: bool)` - Creates TodoListMiddleware if plan mode is enabled
|
- **Function**: `_create_todo_list_middleware(is_plan_mode: bool)` - Creates TodoListMiddleware if plan mode is enabled
|
||||||
- **Function**: `build_middlewares(config: RunnableConfig)` - Builds middleware chain based on runtime config
|
- **Function**: `_build_middlewares(config: RunnableConfig)` - Builds middleware chain based on runtime config
|
||||||
- **Function**: `make_lead_agent(config: RunnableConfig)` - Creates agent with appropriate middlewares
|
- **Function**: `make_lead_agent(config: RunnableConfig)` - Creates agent with appropriate middlewares
|
||||||
|
|
||||||
### Runtime Configuration
|
### Runtime Configuration
|
||||||
|
|||||||
@@ -173,7 +173,7 @@ def _assemble_from_features(
|
|||||||
9. MemoryMiddleware (memory feature)
|
9. MemoryMiddleware (memory feature)
|
||||||
10. ViewImageMiddleware (vision feature)
|
10. ViewImageMiddleware (vision feature)
|
||||||
11. SubagentLimitMiddleware (subagent feature)
|
11. SubagentLimitMiddleware (subagent feature)
|
||||||
12. LoopDetectionMiddleware (loop_detection feature)
|
12. LoopDetectionMiddleware (always)
|
||||||
13. ClarificationMiddleware (always last)
|
13. ClarificationMiddleware (always last)
|
||||||
|
|
||||||
Two-phase ordering:
|
Two-phase ordering:
|
||||||
@@ -272,15 +272,10 @@ def _assemble_from_features(
|
|||||||
|
|
||||||
extra_tools.append(task_tool)
|
extra_tools.append(task_tool)
|
||||||
|
|
||||||
# --- [12] LoopDetection ---
|
# --- [12] LoopDetection (always) ---
|
||||||
if feat.loop_detection is not False:
|
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
||||||
if isinstance(feat.loop_detection, AgentMiddleware):
|
|
||||||
chain.append(feat.loop_detection)
|
|
||||||
else:
|
|
||||||
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
|
||||||
from deerflow.config.loop_detection_config import LoopDetectionConfig
|
|
||||||
|
|
||||||
chain.append(LoopDetectionMiddleware.from_config(LoopDetectionConfig()))
|
chain.append(LoopDetectionMiddleware())
|
||||||
|
|
||||||
# --- [13] Clarification (always last among built-ins) ---
|
# --- [13] Clarification (always last among built-ins) ---
|
||||||
chain.append(ClarificationMiddleware())
|
chain.append(ClarificationMiddleware())
|
||||||
|
|||||||
@@ -31,7 +31,6 @@ class RuntimeFeatures:
|
|||||||
vision: bool | AgentMiddleware = False
|
vision: bool | AgentMiddleware = False
|
||||||
auto_title: bool | AgentMiddleware = False
|
auto_title: bool | AgentMiddleware = False
|
||||||
guardrail: Literal[False] | AgentMiddleware = False
|
guardrail: Literal[False] | AgentMiddleware = False
|
||||||
loop_detection: bool | AgentMiddleware = True
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -1,25 +1,3 @@
|
|||||||
"""Lead agent factory.
|
|
||||||
|
|
||||||
INVARIANT — tracing callback placement
|
|
||||||
======================================
|
|
||||||
|
|
||||||
Tracing callbacks (Langfuse, LangSmith) are attached at the **graph
|
|
||||||
invocation root** in :func:`_make_lead_agent` (see the
|
|
||||||
``build_tracing_callbacks()`` block that appends to ``config["callbacks"]``).
|
|
||||||
Every ``create_chat_model(...)`` call inside this module — and inside any
|
|
||||||
middleware reachable from this graph (e.g. ``TitleMiddleware``) — MUST pass
|
|
||||||
``attach_tracing=False``.
|
|
||||||
|
|
||||||
Forgetting that flag emits duplicate spans (one rooted at the graph, one at
|
|
||||||
the model) AND prevents the Langfuse handler's ``propagate_attributes``
|
|
||||||
path from firing, so ``session_id`` / ``user_id`` never reach the trace.
|
|
||||||
The four current sites are: bootstrap agent, default agent, summarization
|
|
||||||
middleware, and the async path inside ``TitleMiddleware``. Any new in-graph
|
|
||||||
``create_chat_model`` call must add to this list and pass the flag.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
|
|
||||||
from langchain.agents import create_agent
|
from langchain.agents import create_agent
|
||||||
@@ -31,7 +9,6 @@ from deerflow.agents.memory.summarization_hook import memory_flush_hook
|
|||||||
from deerflow.agents.middlewares.clarification_middleware import ClarificationMiddleware
|
from deerflow.agents.middlewares.clarification_middleware import ClarificationMiddleware
|
||||||
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
||||||
from deerflow.agents.middlewares.memory_middleware import MemoryMiddleware
|
from deerflow.agents.middlewares.memory_middleware import MemoryMiddleware
|
||||||
from deerflow.agents.middlewares.safety_finish_reason_middleware import SafetyFinishReasonMiddleware
|
|
||||||
from deerflow.agents.middlewares.subagent_limit_middleware import SubagentLimitMiddleware
|
from deerflow.agents.middlewares.subagent_limit_middleware import SubagentLimitMiddleware
|
||||||
from deerflow.agents.middlewares.summarization_middleware import BeforeSummarizationHook, DeerFlowSummarizationMiddleware
|
from deerflow.agents.middlewares.summarization_middleware import BeforeSummarizationHook, DeerFlowSummarizationMiddleware
|
||||||
from deerflow.agents.middlewares.title_middleware import TitleMiddleware
|
from deerflow.agents.middlewares.title_middleware import TitleMiddleware
|
||||||
@@ -43,14 +20,9 @@ from deerflow.agents.thread_state import ThreadState
|
|||||||
from deerflow.config.agents_config import load_agent_config, validate_agent_name
|
from deerflow.config.agents_config import load_agent_config, validate_agent_name
|
||||||
from deerflow.config.app_config import AppConfig, get_app_config
|
from deerflow.config.app_config import AppConfig, get_app_config
|
||||||
from deerflow.models import create_chat_model
|
from deerflow.models import create_chat_model
|
||||||
from deerflow.skills.tool_policy import filter_tools_by_skill_allowed_tools
|
|
||||||
from deerflow.skills.types import Skill
|
|
||||||
from deerflow.tracing import build_tracing_callbacks
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_BOOTSTRAP_SKILL_NAMES = {"bootstrap"}
|
|
||||||
|
|
||||||
|
|
||||||
def _get_runtime_config(config: RunnableConfig) -> dict:
|
def _get_runtime_config(config: RunnableConfig) -> dict:
|
||||||
"""Merge legacy configurable options with LangGraph runtime context."""
|
"""Merge legacy configurable options with LangGraph runtime context."""
|
||||||
@@ -99,14 +71,10 @@ def _create_summarization_middleware(*, app_config: AppConfig | None = None) ->
|
|||||||
# Bind "middleware:summarize" tag so RunJournal identifies these LLM calls
|
# Bind "middleware:summarize" tag so RunJournal identifies these LLM calls
|
||||||
# as middleware rather than lead_agent (SummarizationMiddleware is a
|
# as middleware rather than lead_agent (SummarizationMiddleware is a
|
||||||
# LangChain built-in, so we tag the model at creation time).
|
# LangChain built-in, so we tag the model at creation time).
|
||||||
# attach_tracing=False because the graph-level RunnableConfig (set in
|
|
||||||
# ``_make_lead_agent``) already carries tracing callbacks; binding them
|
|
||||||
# again at the model level would emit duplicate spans and break
|
|
||||||
# ``session_id`` / ``user_id`` propagation.
|
|
||||||
if config.model_name:
|
if config.model_name:
|
||||||
model = create_chat_model(name=config.model_name, thinking_enabled=False, app_config=resolved_app_config, attach_tracing=False)
|
model = create_chat_model(name=config.model_name, thinking_enabled=False, app_config=resolved_app_config)
|
||||||
else:
|
else:
|
||||||
model = create_chat_model(thinking_enabled=False, app_config=resolved_app_config, attach_tracing=False)
|
model = create_chat_model(thinking_enabled=False, app_config=resolved_app_config)
|
||||||
model = model.with_config(tags=["middleware:summarize"])
|
model = model.with_config(tags=["middleware:summarize"])
|
||||||
|
|
||||||
# Prepare kwargs
|
# Prepare kwargs
|
||||||
@@ -267,31 +235,20 @@ Being proactive with task management demonstrates thoroughness and ensures all r
|
|||||||
# ViewImageMiddleware should be before ClarificationMiddleware to inject image details before LLM
|
# ViewImageMiddleware should be before ClarificationMiddleware to inject image details before LLM
|
||||||
# ToolErrorHandlingMiddleware should be before ClarificationMiddleware to convert tool exceptions to ToolMessages
|
# ToolErrorHandlingMiddleware should be before ClarificationMiddleware to convert tool exceptions to ToolMessages
|
||||||
# ClarificationMiddleware should be last to intercept clarification requests after model calls
|
# ClarificationMiddleware should be last to intercept clarification requests after model calls
|
||||||
def build_middlewares(
|
def _build_middlewares(
|
||||||
config: RunnableConfig,
|
config: RunnableConfig,
|
||||||
model_name: str | None,
|
model_name: str | None,
|
||||||
agent_name: str | None = None,
|
agent_name: str | None = None,
|
||||||
custom_middlewares: list[AgentMiddleware] | None = None,
|
custom_middlewares: list[AgentMiddleware] | None = None,
|
||||||
*,
|
*,
|
||||||
available_skills: set[str] | None = None,
|
|
||||||
app_config: AppConfig | None = None,
|
app_config: AppConfig | None = None,
|
||||||
deferred_setup=None,
|
|
||||||
):
|
):
|
||||||
"""Build the lead-agent middleware chain based on runtime configuration.
|
"""Build middleware chain based on runtime configuration.
|
||||||
|
|
||||||
Public entry point for the lead agent's full middleware composition. Used by
|
|
||||||
``make_lead_agent`` and by the embedded ``DeerFlowClient`` (a lead-agent variant
|
|
||||||
that needs the identical chain). Keep this name stable: it is imported across a
|
|
||||||
module boundary, so renames/signature changes ripple into ``client.py``.
|
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
config: Runtime configuration containing configurable options like is_plan_mode.
|
config: Runtime configuration containing configurable options like is_plan_mode.
|
||||||
model_name: Resolved runtime model name; gates vision-only middleware.
|
|
||||||
agent_name: If provided, MemoryMiddleware will use per-agent memory storage.
|
agent_name: If provided, MemoryMiddleware will use per-agent memory storage.
|
||||||
custom_middlewares: Optional list of custom middlewares to inject into the chain.
|
custom_middlewares: Optional list of custom middlewares to inject into the chain.
|
||||||
app_config: Explicit AppConfig; falls back to ``get_app_config()`` when omitted.
|
|
||||||
deferred_setup: Optional deferred-MCP-tool setup that attaches
|
|
||||||
``DeferredToolFilterMiddleware`` when ``tool_search`` is enabled.
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
List of middleware instances.
|
List of middleware instances.
|
||||||
@@ -299,19 +256,6 @@ def build_middlewares(
|
|||||||
resolved_app_config = app_config or get_app_config()
|
resolved_app_config = app_config or get_app_config()
|
||||||
middlewares = build_lead_runtime_middlewares(app_config=resolved_app_config, lazy_init=True)
|
middlewares = build_lead_runtime_middlewares(app_config=resolved_app_config, lazy_init=True)
|
||||||
|
|
||||||
# Always inject current date (and optionally memory) as <system-reminder> into the
|
|
||||||
# first HumanMessage to keep the system prompt fully static for prefix-cache reuse.
|
|
||||||
from deerflow.agents.middlewares.dynamic_context_middleware import DynamicContextMiddleware
|
|
||||||
|
|
||||||
middlewares.append(DynamicContextMiddleware(agent_name=agent_name, app_config=resolved_app_config))
|
|
||||||
|
|
||||||
# Deterministically load a full SKILL.md when the user starts the turn with
|
|
||||||
# /skill-name. This keeps the base system prompt metadata-only while giving
|
|
||||||
# explicit user activation priority over model-side relevance guessing.
|
|
||||||
from deerflow.agents.middlewares.skill_activation_middleware import SkillActivationMiddleware
|
|
||||||
|
|
||||||
middlewares.append(SkillActivationMiddleware(available_skills=available_skills, app_config=resolved_app_config))
|
|
||||||
|
|
||||||
# Add summarization middleware if enabled
|
# Add summarization middleware if enabled
|
||||||
summarization_middleware = _create_summarization_middleware(app_config=resolved_app_config)
|
summarization_middleware = _create_summarization_middleware(app_config=resolved_app_config)
|
||||||
if summarization_middleware is not None:
|
if summarization_middleware is not None:
|
||||||
@@ -340,13 +284,11 @@ def build_middlewares(
|
|||||||
if model_config is not None and model_config.supports_vision:
|
if model_config is not None and model_config.supports_vision:
|
||||||
middlewares.append(ViewImageMiddleware())
|
middlewares.append(ViewImageMiddleware())
|
||||||
|
|
||||||
# Hide deferred tool schemas from model binding until tool_search promotes them.
|
# Add DeferredToolFilterMiddleware to hide deferred tool schemas from model binding
|
||||||
# The deferred set + catalog hash come from the build-time setup (assembled
|
if resolved_app_config.tool_search.enabled:
|
||||||
# after tool-policy filtering); promotion is read from graph state.
|
|
||||||
if deferred_setup is not None and deferred_setup.deferred_names:
|
|
||||||
from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware
|
from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware
|
||||||
|
|
||||||
middlewares.append(DeferredToolFilterMiddleware(deferred_setup.deferred_names, deferred_setup.catalog_hash))
|
middlewares.append(DeferredToolFilterMiddleware())
|
||||||
|
|
||||||
# Add SubagentLimitMiddleware to truncate excess parallel task calls
|
# Add SubagentLimitMiddleware to truncate excess parallel task calls
|
||||||
subagent_enabled = cfg.get("subagent_enabled", False)
|
subagent_enabled = cfg.get("subagent_enabled", False)
|
||||||
@@ -355,50 +297,17 @@ def build_middlewares(
|
|||||||
middlewares.append(SubagentLimitMiddleware(max_concurrent=max_concurrent_subagents))
|
middlewares.append(SubagentLimitMiddleware(max_concurrent=max_concurrent_subagents))
|
||||||
|
|
||||||
# LoopDetectionMiddleware — detect and break repetitive tool call loops
|
# LoopDetectionMiddleware — detect and break repetitive tool call loops
|
||||||
loop_detection_config = resolved_app_config.loop_detection
|
middlewares.append(LoopDetectionMiddleware())
|
||||||
if loop_detection_config.enabled:
|
|
||||||
middlewares.append(LoopDetectionMiddleware.from_config(loop_detection_config))
|
|
||||||
|
|
||||||
# Inject custom middlewares before ClarificationMiddleware
|
# Inject custom middlewares before ClarificationMiddleware
|
||||||
if custom_middlewares:
|
if custom_middlewares:
|
||||||
middlewares.extend(custom_middlewares)
|
middlewares.extend(custom_middlewares)
|
||||||
|
|
||||||
# SafetyFinishReasonMiddleware — suppress tool execution when the provider
|
|
||||||
# safety-terminated the response. Registered after custom middlewares so
|
|
||||||
# that LangChain's reverse-order after_model dispatch runs Safety first;
|
|
||||||
# cleared tool_calls then flow through Loop/Subagent accounting without
|
|
||||||
# firing extra alarms. See safety_finish_reason_middleware.py docstring.
|
|
||||||
safety_config = resolved_app_config.safety_finish_reason
|
|
||||||
if safety_config.enabled:
|
|
||||||
middlewares.append(SafetyFinishReasonMiddleware.from_config(safety_config))
|
|
||||||
|
|
||||||
# ClarificationMiddleware should always be last
|
# ClarificationMiddleware should always be last
|
||||||
middlewares.append(ClarificationMiddleware())
|
middlewares.append(ClarificationMiddleware())
|
||||||
return middlewares
|
return middlewares
|
||||||
|
|
||||||
|
|
||||||
def _available_skill_names(agent_config, is_bootstrap: bool) -> set[str] | None:
|
|
||||||
if is_bootstrap:
|
|
||||||
return set(_BOOTSTRAP_SKILL_NAMES)
|
|
||||||
if agent_config and agent_config.skills is not None:
|
|
||||||
return set(agent_config.skills)
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _load_enabled_skills_for_tool_policy(available_skills: set[str] | None, *, app_config: AppConfig) -> list[Skill]:
|
|
||||||
try:
|
|
||||||
from deerflow.agents.lead_agent.prompt import get_enabled_skills_for_config
|
|
||||||
|
|
||||||
skills = get_enabled_skills_for_config(app_config)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("Failed to load skills for allowed-tools policy")
|
|
||||||
raise
|
|
||||||
|
|
||||||
if available_skills is None:
|
|
||||||
return skills
|
|
||||||
return [skill for skill in skills if skill.name in available_skills]
|
|
||||||
|
|
||||||
|
|
||||||
def make_lead_agent(config: RunnableConfig):
|
def make_lead_agent(config: RunnableConfig):
|
||||||
"""LangGraph graph factory; keep the signature compatible with LangGraph Server."""
|
"""LangGraph graph factory; keep the signature compatible with LangGraph Server."""
|
||||||
runtime_config = _get_runtime_config(config)
|
runtime_config = _get_runtime_config(config)
|
||||||
@@ -409,8 +318,7 @@ def make_lead_agent(config: RunnableConfig):
|
|||||||
def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
||||||
# Lazy import to avoid circular dependency
|
# Lazy import to avoid circular dependency
|
||||||
from deerflow.tools import get_available_tools
|
from deerflow.tools import get_available_tools
|
||||||
from deerflow.tools.builtins import setup_agent, update_agent
|
from deerflow.tools.builtins import setup_agent
|
||||||
from deerflow.tools.builtins.tool_search import assemble_deferred_tools
|
|
||||||
|
|
||||||
cfg = _get_runtime_config(config)
|
cfg = _get_runtime_config(config)
|
||||||
resolved_app_config = app_config
|
resolved_app_config = app_config
|
||||||
@@ -425,7 +333,6 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
|||||||
agent_name = validate_agent_name(cfg.get("agent_name"))
|
agent_name = validate_agent_name(cfg.get("agent_name"))
|
||||||
|
|
||||||
agent_config = load_agent_config(agent_name) if not is_bootstrap else None
|
agent_config = load_agent_config(agent_name) if not is_bootstrap else None
|
||||||
available_skills = _available_skill_names(agent_config, is_bootstrap)
|
|
||||||
# Custom agent model from agent config (if any), or None to let _resolve_model_name pick the default
|
# Custom agent model from agent config (if any), or None to let _resolve_model_name pick the default
|
||||||
agent_model_name = agent_config.model if agent_config and agent_config.model else None
|
agent_model_name = agent_config.model if agent_config and agent_config.model else None
|
||||||
|
|
||||||
@@ -464,77 +371,41 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
|||||||
"is_plan_mode": is_plan_mode,
|
"is_plan_mode": is_plan_mode,
|
||||||
"subagent_enabled": subagent_enabled,
|
"subagent_enabled": subagent_enabled,
|
||||||
"tool_groups": agent_config.tool_groups if agent_config else None,
|
"tool_groups": agent_config.tool_groups if agent_config else None,
|
||||||
"available_skills": sorted(available_skills) if available_skills is not None else None,
|
"available_skills": ["bootstrap"] if is_bootstrap else (agent_config.skills if agent_config and agent_config.skills is not None else None),
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
|
||||||
# Inject tracing callbacks at the graph invocation root so a single LangGraph
|
|
||||||
# run produces one trace with all node / LLM / tool calls as child spans,
|
|
||||||
# AND so the Langfuse handler sees ``on_chain_start(parent_run_id=None)`` and
|
|
||||||
# actually propagates ``langfuse_session_id`` / ``langfuse_user_id`` from
|
|
||||||
# ``config["metadata"]`` onto the trace. Without root-level attachment the
|
|
||||||
# model is a nested observation and the handler strips ``langfuse_*`` keys.
|
|
||||||
tracing_callbacks = build_tracing_callbacks()
|
|
||||||
if tracing_callbacks:
|
|
||||||
existing = config.get("callbacks") or []
|
|
||||||
if not isinstance(existing, list):
|
|
||||||
existing = list(existing)
|
|
||||||
config["callbacks"] = [*existing, *tracing_callbacks]
|
|
||||||
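The callback merge above can be sketched against a plain dictionary. This is an illustrative sketch only: `attach_root_callbacks` is a hypothetical helper name, and the config is modelled as an ordinary `dict` rather than a real `RunnableConfig`.

```python
from typing import Any


def attach_root_callbacks(config: dict, tracing_callbacks: list) -> dict:
    """Append tracing callbacks to config["callbacks"], keeping existing ones."""
    if not tracing_callbacks:
        return config  # nothing to attach; leave the config untouched
    existing: Any = config.get("callbacks") or []
    if not isinstance(existing, list):
        # Normalize tuples or other iterables so the merge yields a list.
        existing = list(existing)
    config["callbacks"] = [*existing, *tracing_callbacks]
    return config
```

Existing callbacks stay ahead of the tracing ones, and a missing or `None` entry is treated as an empty list.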
|
|
||||||
skills_for_tool_policy = _load_enabled_skills_for_tool_policy(available_skills, app_config=resolved_app_config)
|
|
||||||
|
|
||||||
if is_bootstrap:
|
if is_bootstrap:
|
||||||
# Special bootstrap agent with minimal prompt for initial custom agent creation flow
|
# Special bootstrap agent with minimal prompt for initial custom agent creation flow
|
||||||
# Keep the bootstrap skill set intentionally narrow so agent creation
|
|
||||||
# remains deterministic before the custom agent's own config exists.
|
|
||||||
raw_tools = get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent]
|
|
||||||
filtered = filter_tools_by_skill_allowed_tools(raw_tools, skills_for_tool_policy)
|
|
||||||
final_tools, setup = assemble_deferred_tools(filtered, enabled=resolved_app_config.tool_search.enabled)
|
|
||||||
return create_agent(
|
return create_agent(
|
||||||
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, app_config=resolved_app_config, attach_tracing=False),
|
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, app_config=resolved_app_config),
|
||||||
tools=final_tools,
|
tools=get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent],
|
||||||
middleware=build_middlewares(
|
middleware=_build_middlewares(config, model_name=model_name, app_config=resolved_app_config),
|
||||||
config,
|
|
||||||
model_name=model_name,
|
|
||||||
available_skills=set(_BOOTSTRAP_SKILL_NAMES),
|
|
||||||
app_config=resolved_app_config,
|
|
||||||
deferred_setup=setup,
|
|
||||||
),
|
|
||||||
system_prompt=apply_prompt_template(
|
system_prompt=apply_prompt_template(
|
||||||
subagent_enabled=subagent_enabled,
|
subagent_enabled=subagent_enabled,
|
||||||
max_concurrent_subagents=max_concurrent_subagents,
|
max_concurrent_subagents=max_concurrent_subagents,
|
||||||
available_skills=set(_BOOTSTRAP_SKILL_NAMES),
|
available_skills=set(["bootstrap"]),
|
||||||
app_config=resolved_app_config,
|
app_config=resolved_app_config,
|
||||||
deferred_names=setup.deferred_names,
|
|
||||||
),
|
),
|
||||||
state_schema=ThreadState,
|
state_schema=ThreadState,
|
||||||
)
|
)
|
||||||
|
|
||||||
# Custom agents can update their own SOUL.md / config via update_agent.
|
|
||||||
# The default agent (no agent_name) does not see this tool.
|
|
||||||
extra_tools = [update_agent] if agent_name else []
|
|
||||||
# Default lead agent (unchanged behavior)
|
# Default lead agent (unchanged behavior)
|
||||||
raw_tools = get_available_tools(model_name=model_name, groups=agent_config.tool_groups if agent_config else None, subagent_enabled=subagent_enabled, app_config=resolved_app_config)
|
|
||||||
filtered = filter_tools_by_skill_allowed_tools(raw_tools + extra_tools, skills_for_tool_policy)
|
|
||||||
final_tools, setup = assemble_deferred_tools(filtered, enabled=resolved_app_config.tool_search.enabled)
|
|
||||||
return create_agent(
|
return create_agent(
|
||||||
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, reasoning_effort=reasoning_effort, app_config=resolved_app_config, attach_tracing=False),
|
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, reasoning_effort=reasoning_effort, app_config=resolved_app_config),
|
||||||
tools=final_tools,
|
tools=get_available_tools(
|
||||||
middleware=build_middlewares(
|
|
||||||
config,
|
|
||||||
model_name=model_name,
|
model_name=model_name,
|
||||||
agent_name=agent_name,
|
groups=agent_config.tool_groups if agent_config else None,
|
||||||
available_skills=available_skills,
|
subagent_enabled=subagent_enabled,
|
||||||
app_config=resolved_app_config,
|
app_config=resolved_app_config,
|
||||||
deferred_setup=setup,
|
|
||||||
),
|
),
|
||||||
|
middleware=_build_middlewares(config, model_name=model_name, agent_name=agent_name, app_config=resolved_app_config),
|
||||||
system_prompt=apply_prompt_template(
|
system_prompt=apply_prompt_template(
|
||||||
subagent_enabled=subagent_enabled,
|
subagent_enabled=subagent_enabled,
|
||||||
max_concurrent_subagents=max_concurrent_subagents,
|
max_concurrent_subagents=max_concurrent_subagents,
|
||||||
agent_name=agent_name,
|
agent_name=agent_name,
|
||||||
available_skills=available_skills,
|
available_skills=set(agent_config.skills) if agent_config and agent_config.skills is not None else None,
|
||||||
app_config=resolved_app_config,
|
app_config=resolved_app_config,
|
||||||
deferred_names=setup.deferred_names,
|
|
||||||
),
|
),
|
||||||
state_schema=ThreadState,
|
state_schema=ThreadState,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -3,6 +3,7 @@ from __future__ import annotations
|
|||||||
import asyncio
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import threading
|
import threading
|
||||||
|
from datetime import datetime
|
||||||
from functools import lru_cache
|
from functools import lru_cache
|
||||||
from typing import TYPE_CHECKING
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
@@ -10,7 +11,6 @@ from deerflow.config.agents_config import load_agent_soul
|
|||||||
from deerflow.skills.storage import get_or_new_skill_storage
|
from deerflow.skills.storage import get_or_new_skill_storage
|
||||||
from deerflow.skills.types import Skill, SkillCategory
|
from deerflow.skills.types import Skill, SkillCategory
|
||||||
from deerflow.subagents import get_available_subagent_names
|
from deerflow.subagents import get_available_subagent_names
|
||||||
from deerflow.tools.builtins.tool_search import get_deferred_tools_prompt_section
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
from deerflow.config.app_config import AppConfig
|
from deerflow.config.app_config import AppConfig
|
||||||
@@ -20,7 +20,6 @@ logger = logging.getLogger(__name__)
|
|||||||
_ENABLED_SKILLS_REFRESH_WAIT_TIMEOUT_SECONDS = 5.0
|
_ENABLED_SKILLS_REFRESH_WAIT_TIMEOUT_SECONDS = 5.0
|
||||||
_enabled_skills_lock = threading.Lock()
|
_enabled_skills_lock = threading.Lock()
|
||||||
_enabled_skills_cache: list[Skill] | None = None
|
_enabled_skills_cache: list[Skill] | None = None
|
||||||
_enabled_skills_by_config_cache: dict[int, tuple[object, list[Skill]]] = {}
|
|
||||||
_enabled_skills_refresh_active = False
|
_enabled_skills_refresh_active = False
|
||||||
_enabled_skills_refresh_version = 0
|
_enabled_skills_refresh_version = 0
|
||||||
_enabled_skills_refresh_event = threading.Event()
|
_enabled_skills_refresh_event = threading.Event()
|
||||||
@@ -85,7 +84,6 @@ def _invalidate_enabled_skills_cache() -> threading.Event:
|
|||||||
_get_cached_skills_prompt_section.cache_clear()
|
_get_cached_skills_prompt_section.cache_clear()
|
||||||
with _enabled_skills_lock:
|
with _enabled_skills_lock:
|
||||||
_enabled_skills_cache = None
|
_enabled_skills_cache = None
|
||||||
_enabled_skills_by_config_cache.clear()
|
|
||||||
_enabled_skills_refresh_version += 1
|
_enabled_skills_refresh_version += 1
|
||||||
_enabled_skills_refresh_event.clear()
|
_enabled_skills_refresh_event.clear()
|
||||||
if _enabled_skills_refresh_active:
|
if _enabled_skills_refresh_active:
|
||||||
@@ -109,15 +107,6 @@ def warm_enabled_skills_cache(timeout_seconds: float = _ENABLED_SKILLS_REFRESH_W
|
|||||||
|
|
||||||
|
|
||||||
def _get_enabled_skills():
|
def _get_enabled_skills():
|
||||||
return get_cached_enabled_skills()
|
|
||||||
|
|
||||||
|
|
||||||
def get_cached_enabled_skills() -> list[Skill]:
|
|
||||||
"""Return the cached enabled-skills list, kicking off a background refresh on miss.
|
|
||||||
|
|
||||||
Safe to call from request paths: never blocks on disk I/O. Returns an empty
|
|
||||||
list on cache miss; the next call will see the warmed result.
|
|
||||||
"""
|
|
||||||
with _enabled_skills_lock:
|
with _enabled_skills_lock:
|
||||||
cached = _enabled_skills_cache
|
cached = _enabled_skills_cache
|
||||||
|
|
||||||
@@ -128,29 +117,17 @@ def get_cached_enabled_skills() -> list[Skill]:
|
|||||||
return []
|
return []
|
||||||
|
|
||||||
|
|
||||||
def get_enabled_skills_for_config(app_config: AppConfig | None = None) -> list[Skill]:
|
def _get_enabled_skills_for_config(app_config: AppConfig | None = None) -> list[Skill]:
|
||||||
"""Return enabled skills using the caller's config source.
|
"""Return enabled skills using the caller's config source.
|
||||||
|
|
||||||
When a concrete ``app_config`` is supplied, cache the loaded skills by that
|
When a concrete ``app_config`` is supplied, bypass the global enabled-skills
|
||||||
config object's identity so request-scoped config injection still resolves
|
cache so the skill list and skill paths are resolved from the same config
|
||||||
skill paths from the matching config without rescanning storage on every
|
object. This keeps request-scoped config injection consistent even while the
|
||||||
agent factory call.
|
release branch still supports global fallback paths.
|
||||||
"""
|
"""
|
||||||
if app_config is None:
|
if app_config is None:
|
||||||
return _get_enabled_skills()
|
return _get_enabled_skills()
|
||||||
|
return list(get_or_new_skill_storage(app_config=app_config).load_skills(enabled_only=True))
|
||||||
cache_key = id(app_config)
|
|
||||||
with _enabled_skills_lock:
|
|
||||||
cached = _enabled_skills_by_config_cache.get(cache_key)
|
|
||||||
if cached is not None:
|
|
||||||
cached_config, cached_skills = cached
|
|
||||||
if cached_config is app_config:
|
|
||||||
return list(cached_skills)
|
|
||||||
|
|
||||||
skills = list(get_or_new_skill_storage(app_config=app_config).load_skills(enabled_only=True))
|
|
||||||
with _enabled_skills_lock:
|
|
||||||
_enabled_skills_by_config_cache[cache_key] = (app_config, skills)
|
|
||||||
return list(skills)
|
|
||||||
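The identity-keyed cache used by `get_enabled_skills_for_config` on this side of the diff follows a pattern that can be shown on its own. The sketch below is an assumption-labelled illustration: `load_for_config` and the `loader` callback are hypothetical names, standing in for the storage call in the real code.

```python
import threading
from typing import Any, Callable

_lock = threading.Lock()
_cache: dict = {}  # id(config) -> (config, loaded items)


def load_for_config(app_config: object, loader: Callable[[object], Any]) -> list:
    key = id(app_config)
    with _lock:
        cached = _cache.get(key)
        if cached is not None:
            cached_config, cached_items = cached
            # Defensive identity check: the entry holds a strong reference to
            # its config, so a matching id should mean the same object.
            if cached_config is app_config:
                return list(cached_items)
    # Run the (potentially slow) load outside the lock.
    items = list(loader(app_config))
    with _lock:
        _cache[key] = (app_config, items)
    # Return a copy so callers cannot mutate the cached list.
    return list(items)
```

Two concurrent misses for the same config may both call the loader; the second write simply overwrites the first, which trades a duplicate load for not holding the lock during I/O.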
|
|
||||||
|
|
||||||
def _skill_mutability_label(category: SkillCategory | str) -> str:
|
def _skill_mutability_label(category: SkillCategory | str) -> str:
|
||||||
@@ -367,7 +344,8 @@ You are {agent_name}, an open-source super agent.
|
|||||||
</role>
|
</role>
|
||||||
|
|
||||||
{soul}
|
{soul}
|
||||||
{self_update_section}
|
{memory_context}
|
||||||
|
|
||||||
<thinking_style>
|
<thinking_style>
|
||||||
- Think concisely and strategically about the user's request BEFORE taking action
|
- Think concisely and strategically about the user's request BEFORE taking action
|
||||||
- Break down the task: What is clear? What is ambiguous? What is missing?
|
- Break down the task: What is clear? What is ambiguous? What is missing?
|
||||||
@@ -543,14 +521,6 @@ combined with a FastAPI gateway for REST API access [citation:FastAPI](https://f
|
|||||||
{subagent_reminder}- Skill First: Always load the relevant skill before starting **complex** tasks.
|
{subagent_reminder}- Skill First: Always load the relevant skill before starting **complex** tasks.
|
||||||
- Progressive Loading: Load resources incrementally as referenced in skills
|
- Progressive Loading: Load resources incrementally as referenced in skills
|
||||||
- Output Files: Final deliverables must be in `/mnt/user-data/outputs`
|
- Output Files: Final deliverables must be in `/mnt/user-data/outputs`
|
||||||
- File Editing Workflow: When revising an existing file, prefer
|
|
||||||
`str_replace` over `write_file` — it sends only the diff and avoids
|
|
||||||
re-emitting the whole file (mirrors Claude Code's Edit and Codex's
|
|
||||||
apply_patch). When writing long new content from scratch, split it
|
|
||||||
into sections: the first `write_file` call creates the file, then use
|
|
||||||
`write_file` with append=True to extend it section by section. This
|
|
||||||
keeps each tool call small and avoids mid-stream chunk-gap timeouts
|
|
||||||
on oversized single-shot writes. (See issue #3189.)
|
|
||||||
- Clarity: Be direct and helpful, avoid unnecessary meta-commentary
|
- Clarity: Be direct and helpful, avoid unnecessary meta-commentary
|
||||||
- Including Images and Mermaid: Images and Mermaid diagrams are always welcomed in the Markdown format, and you're encouraged to use `\n\n` or "```mermaid" to display images in response or Markdown files
|
- Including Images and Mermaid: Images and Mermaid diagrams are always welcomed in the Markdown format, and you're encouraged to use `\n\n` or "```mermaid" to display images in response or Markdown files
|
||||||
- Multi-task: Better utilize parallel tool calling to call multiple tools at one time for better performance
|
- Multi-task: Better utilize parallel tool calling to call multiple tools at one time for better performance
|
||||||
@@ -625,11 +595,6 @@ You have access to skills that provide optimized workflows for specific tasks. E
|
|||||||
4. Load referenced resources only when needed during execution
|
4. Load referenced resources only when needed during execution
|
||||||
5. Follow the skill's instructions precisely
|
5. Follow the skill's instructions precisely
|
||||||
|
|
||||||
**Explicit Slash Skill Activation:**
|
|
||||||
- If the user starts a request with `/<skill-name>`, that skill was explicitly requested for the current turn.
|
|
||||||
- Follow the activated skill before choosing a general workflow.
|
|
||||||
- The runtime injects the activated skill content for explicit slash activations; do not call `read_file` for that SKILL.md again unless the injected skill references supporting resources you need.
|
|
||||||
|
|
||||||
**Skills are located at:** {container_base_path}
|
**Skills are located at:** {container_base_path}
|
||||||
{skill_evolution_section}
|
{skill_evolution_section}
|
||||||
{skills_list}
|
{skills_list}
|
||||||
@@ -639,7 +604,7 @@ You have access to skills that provide optimized workflows for specific tasks. E
|
|||||||
|
|
||||||
def get_skills_prompt_section(available_skills: set[str] | None = None, *, app_config: AppConfig | None = None) -> str:
|
def get_skills_prompt_section(available_skills: set[str] | None = None, *, app_config: AppConfig | None = None) -> str:
|
||||||
"""Generate the skills prompt section with available skills list."""
|
"""Generate the skills prompt section with available skills list."""
|
||||||
skills = get_enabled_skills_for_config(app_config)
|
skills = _get_enabled_skills_for_config(app_config)
|
||||||
|
|
||||||
if app_config is None:
|
if app_config is None:
|
||||||
try:
|
try:
|
||||||
@@ -678,25 +643,34 @@ def get_agent_soul(agent_name: str | None) -> str:
|
|||||||
return ""
|
return ""
|
||||||
|
|
||||||
|
|
||||||
def _build_self_update_section(agent_name: str | None) -> str:
|
def get_deferred_tools_prompt_section(*, app_config: AppConfig | None = None) -> str:
|
||||||
"""Prompt block that teaches the custom agent to persist self-updates via update_agent."""
|
"""Generate <available-deferred-tools> block for the system prompt.
|
||||||
if not agent_name:
|
|
||||||
|
Lists only deferred tool names so the agent knows what exists
|
||||||
|
and can use tool_search to load them.
|
||||||
|
Returns empty string when tool_search is disabled or no tools are deferred.
|
||||||
|
"""
|
||||||
|
from deerflow.tools.builtins.tool_search import get_deferred_registry
|
||||||
|
|
||||||
|
if app_config is None:
|
||||||
|
try:
|
||||||
|
from deerflow.config import get_app_config
|
||||||
|
|
||||||
|
config = get_app_config()
|
||||||
|
except Exception:
|
||||||
|
return ""
|
||||||
|
else:
|
||||||
|
config = app_config
|
||||||
|
|
||||||
|
if not config.tool_search.enabled:
|
||||||
return ""
|
return ""
|
||||||
return f"""<self_update>
|
|
||||||
You are running as the custom agent **{agent_name}** with a persisted SOUL.md and config.yaml.
|
|
||||||
|
|
||||||
When the user asks you to update your own description, personality, behaviour, skill set, tool groups, or default model,
|
registry = get_deferred_registry()
|
||||||
you MUST persist the change with the `update_agent` tool. Do NOT use `bash`, `write_file`, or any sandbox tool to edit
|
if not registry:
|
||||||
SOUL.md or config.yaml — those write into a temporary sandbox/tool workspace and the changes will be lost on the next turn.
|
return ""
|
||||||
|
|
||||||
Rules:
|
names = "\n".join(e.name for e in registry.entries)
|
||||||
- Always pass the FULL replacement text for `soul` (no patch semantics). Start from your current SOUL above and apply the user's edits.
|
return f"<available-deferred-tools>\n{names}\n</available-deferred-tools>"
|
||||||
- Only pass the fields that should change. Omit the others to preserve them.
|
|
||||||
- Never pass literal strings like `"null"`, `"none"`, or `"undefined"` for unchanged fields.
|
|
||||||
- Pass `skills=[]` to disable all skills, or omit `skills` to keep the existing whitelist.
|
|
||||||
- After `update_agent` returns successfully, tell the user the change is persisted and will take effect on the next turn.
|
|
||||||
</self_update>
|
|
||||||
"""
|
|
||||||
|
|
||||||
|
|
||||||
def _build_acp_section(*, app_config: AppConfig | None = None) -> str:
|
def _build_acp_section(*, app_config: AppConfig | None = None) -> str:
|
||||||
@@ -757,8 +731,10 @@ def apply_prompt_template(
|
|||||||
agent_name: str | None = None,
|
agent_name: str | None = None,
|
||||||
available_skills: set[str] | None = None,
|
available_skills: set[str] | None = None,
|
||||||
app_config: AppConfig | None = None,
|
app_config: AppConfig | None = None,
|
||||||
deferred_names: frozenset[str] = frozenset(),
|
|
||||||
) -> str:
|
) -> str:
|
||||||
|
# Get memory context
|
||||||
|
memory_context = _get_memory_context(agent_name, app_config=app_config)
|
||||||
|
|
||||||
# Include subagent section only if enabled (from runtime parameter)
|
# Include subagent section only if enabled (from runtime parameter)
|
||||||
n = max_concurrent_subagents
|
n = max_concurrent_subagents
|
||||||
subagent_section = _build_subagent_section(n, app_config=app_config) if subagent_enabled else ""
|
subagent_section = _build_subagent_section(n, app_config=app_config) if subagent_enabled else ""
|
||||||
@@ -785,25 +761,24 @@ def apply_prompt_template(
|
|||||||
skills_section = get_skills_prompt_section(available_skills, app_config=app_config)
|
skills_section = get_skills_prompt_section(available_skills, app_config=app_config)
|
||||||
|
|
||||||
# Get deferred tools section (tool_search)
|
# Get deferred tools section (tool_search)
|
||||||
deferred_tools_section = get_deferred_tools_prompt_section(deferred_names=deferred_names)
|
deferred_tools_section = get_deferred_tools_prompt_section(app_config=app_config)
|
||||||
|
|
||||||
# Build ACP agent section only if ACP agents are configured
|
# Build ACP agent section only if ACP agents are configured
|
||||||
acp_section = _build_acp_section(app_config=app_config)
|
acp_section = _build_acp_section(app_config=app_config)
|
||||||
custom_mounts_section = _build_custom_mounts_section(app_config=app_config)
|
custom_mounts_section = _build_custom_mounts_section(app_config=app_config)
|
||||||
acp_and_mounts_section = "\n".join(section for section in (acp_section, custom_mounts_section) if section)
|
acp_and_mounts_section = "\n".join(section for section in (acp_section, custom_mounts_section) if section)
|
||||||
|
|
||||||
# Build and return the fully static system prompt.
|
# Format the prompt with dynamic skills and memory
|
||||||
# Memory and current date are injected per-turn via DynamicContextMiddleware
|
prompt = SYSTEM_PROMPT_TEMPLATE.format(
|
||||||
# as a <system-reminder> in the first HumanMessage, keeping this prompt
|
|
||||||
# identical across users and sessions for maximum prefix-cache reuse.
|
|
||||||
return SYSTEM_PROMPT_TEMPLATE.format(
|
|
||||||
agent_name=agent_name or "DeerFlow 2.0",
|
agent_name=agent_name or "DeerFlow 2.0",
|
||||||
soul=get_agent_soul(agent_name),
|
soul=get_agent_soul(agent_name),
|
||||||
self_update_section=_build_self_update_section(agent_name),
|
|
||||||
skills_section=skills_section,
|
skills_section=skills_section,
|
||||||
deferred_tools_section=deferred_tools_section,
|
deferred_tools_section=deferred_tools_section,
|
||||||
|
memory_context=memory_context,
|
||||||
subagent_section=subagent_section,
|
subagent_section=subagent_section,
|
||||||
subagent_reminder=subagent_reminder,
|
subagent_reminder=subagent_reminder,
|
||||||
subagent_thinking=subagent_thinking,
|
subagent_thinking=subagent_thinking,
|
||||||
acp_section=acp_and_mounts_section,
|
acp_section=acp_and_mounts_section,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
return prompt + f"\n<current_date>{datetime.now().strftime('%Y-%m-%d, %A')}</current_date>"
|
||||||
|
|||||||
@@ -1,14 +1,9 @@
|
|||||||
"""Prompt templates for memory update and injection."""
|
"""Prompt templates for memory update and injection."""
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
import math
|
import math
|
||||||
import re
|
import re
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
import tiktoken
|
import tiktoken
|
||||||
|
|
||||||
@@ -165,39 +160,6 @@ Rules:
|
|||||||
Return ONLY valid JSON."""
|
Return ONLY valid JSON."""
|
||||||
|
|
||||||
|
|
||||||
# Module-level tiktoken encoding cache. Populated lazily on first use;
|
|
||||||
# subsequent calls are a dict lookup (no network I/O). Pre-warming at
|
|
||||||
# startup via :func:`warm_tiktoken_cache` avoids blocking a request on the
|
|
||||||
# (potentially slow) first ``get_encoding`` call.
|
|
||||||
_tiktoken_encoding_cache: dict[str, tiktoken.Encoding] = {}
|
|
||||||
|
|
||||||
|
|
||||||
def _get_tiktoken_encoding(encoding_name: str = "cl100k_base") -> tiktoken.Encoding | None:
|
|
||||||
    """Return a cached tiktoken encoding, or ``None`` on failure / unavailability.
|
|
||||||
|
|
||||||
    On the very first call for a given *encoding_name*, tiktoken may need to
|
|
||||||
    download the BPE data from ``openaipublic.blob.core.windows.net``. In
|
|
||||||
    network-restricted environments (e.g. deployments behind the GFW) this
|
|
||||||
    download can block for tens of minutes before the OS TCP timeout kicks in.
|
|
||||||
    The caller must therefore be prepared for this to block and should run it
|
|
||||||
    off the event loop (e.g. via ``asyncio.to_thread``).
|
|
||||||
    """
|
|
||||||
    if not TIKTOKEN_AVAILABLE:
|
|
||||||
        return None
|
|
||||||
|
|
||||||
    cached = _tiktoken_encoding_cache.get(encoding_name)
|
|
||||||
    if cached is not None:
|
|
||||||
        return cached
|
|
||||||
|
|
||||||
    try:
|
|
||||||
        encoding = tiktoken.get_encoding(encoding_name)
|
|
||||||
        _tiktoken_encoding_cache[encoding_name] = encoding
|
|
||||||
        return encoding
|
|
||||||
    except Exception:
|
|
||||||
        logger.warning("Failed to load tiktoken encoding %r; falling back to char-based estimation", encoding_name, exc_info=True)
|
|
||||||
        return None
|
|
||||||
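The lazy cache with a character-based fallback can be sketched without depending on tiktoken or the network. In this minimal sketch, `get_encoder`, `count_tokens`, and the injected `load` callback are hypothetical; `load` stands in for the real encoding lookup so the behaviour can be checked offline.

```python
from typing import Callable, Optional, Sequence

Encoder = Callable[[str], Sequence]

_encoder_cache: dict = {}  # encoding name -> encoder


def get_encoder(name: str, load: Callable[[str], Encoder]) -> Optional[Encoder]:
    cached = _encoder_cache.get(name)
    if cached is not None:
        return cached
    try:
        encoder = load(name)  # may be slow or fail, e.g. on a blocked download
    except Exception:
        return None  # failures are not cached, so a later call can retry
    _encoder_cache[name] = encoder
    return encoder


def count_tokens(text: str, name: str, load: Callable[[str], Encoder]) -> int:
    encoder = get_encoder(name, load)
    if encoder is None:
        # Rough estimate: about four characters per token.
        return len(text) // 4
    try:
        return len(encoder(text))
    except Exception:
        return len(text) // 4
```

Because only successful loads are stored, a transient failure degrades to the estimate for that call without pinning the fallback permanently.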
|
|
||||||
|
|
||||||
def _count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
|
def _count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
|
||||||
"""Count tokens in text using tiktoken.
|
"""Count tokens in text using tiktoken.
|
||||||
|
|
||||||
@@ -208,30 +170,18 @@ def _count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
|
|||||||
Returns:
|
Returns:
|
||||||
The number of tokens in the text.
|
The number of tokens in the text.
|
||||||
"""
|
"""
|
||||||
encoding = _get_tiktoken_encoding(encoding_name)
|
if not TIKTOKEN_AVAILABLE:
|
||||||
if encoding is None:
|
|
||||||
# Fallback to character-based estimation if tiktoken is not available
|
# Fallback to character-based estimation if tiktoken is not available
|
||||||
# or the encoding failed to load.
|
|
||||||
return len(text) // 4
|
return len(text) // 4
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
encoding = tiktoken.get_encoding(encoding_name)
|
||||||
return len(encoding.encode(text))
|
return len(encoding.encode(text))
|
||||||
except Exception:
|
except Exception:
|
||||||
# Fallback to character-based estimation on error
|
# Fallback to character-based estimation on error
|
||||||
return len(text) // 4
|
return len(text) // 4
|
||||||
|
|
||||||
|
|
||||||
def warm_tiktoken_cache() -> bool:
|
|
||||||
    """Pre-warm the tiktoken encoding cache.
|
|
||||||
|
|
||||||
    Call at startup (off the event loop) so the first request never blocks
|
|
||||||
    on the BPE download. Returns ``True`` if the encoding was loaded
|
|
||||||
    successfully (or was already cached), ``False`` if tiktoken is
|
|
||||||
    unavailable or the download failed.
|
|
||||||
    """
|
|
||||||
    return _get_tiktoken_encoding("cl100k_base") is not None
|
|
||||||
|
|
||||||
|
|
||||||
def _coerce_confidence(value: Any, default: float = 0.0) -> float:
|
def _coerce_confidence(value: Any, default: float = 0.0) -> float:
|
||||||
"""Coerce a confidence-like value to a bounded float in [0, 1].
|
"""Coerce a confidence-like value to a bounded float in [0, 1].
|
||||||
|
|
||||||
|
|||||||
@@ -40,15 +40,6 @@ class MemoryUpdateQueue:
|
|||||||
self._timer: threading.Timer | None = None
|
self._timer: threading.Timer | None = None
|
||||||
self._processing = False
|
self._processing = False
|
||||||
|
|
||||||
    @staticmethod
|
|
||||||
    def _queue_key(
|
|
||||||
        thread_id: str,
|
|
||||||
        user_id: str | None,
|
|
||||||
        agent_name: str | None,
|
|
||||||
    ) -> tuple[str, str | None, str | None]:
|
|
||||||
        """Return the debounce identity for a memory update target."""
|
|
||||||
        return (thread_id, user_id, agent_name)
|
|
||||||
|
|
||||||
def add(
|
def add(
|
||||||
self,
|
self,
|
||||||
thread_id: str,
|
thread_id: str,
|
||||||
@@ -124,9 +115,8 @@ class MemoryUpdateQueue:
|
|||||||
correction_detected: bool,
|
correction_detected: bool,
|
||||||
reinforcement_detected: bool,
|
reinforcement_detected: bool,
|
||||||
) -> None:
|
) -> None:
|
||||||
queue_key = self._queue_key(thread_id, user_id, agent_name)
|
|
||||||
existing_context = next(
|
existing_context = next(
|
||||||
(context for context in self._queue if self._queue_key(context.thread_id, context.user_id, context.agent_name) == queue_key),
|
(context for context in self._queue if context.thread_id == thread_id),
|
||||||
None,
|
None,
|
||||||
)
|
)
|
||||||
merged_correction_detected = correction_detected or (existing_context.correction_detected if existing_context is not None else False)
|
merged_correction_detected = correction_detected or (existing_context.correction_detected if existing_context is not None else False)
|
||||||
@@ -140,7 +130,7 @@ class MemoryUpdateQueue:
|
|||||||
reinforcement_detected=merged_reinforcement_detected,
|
reinforcement_detected=merged_reinforcement_detected,
|
||||||
)
|
)
|
||||||
|
|
||||||
self._queue = [context for context in self._queue if self._queue_key(context.thread_id, context.user_id, context.agent_name) != queue_key]
|
self._queue = [c for c in self._queue if c.thread_id != thread_id]
|
||||||
self._queue.append(context)
|
self._queue.append(context)
|
||||||
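The two sides of this hunk differ in what counts as "the same pending update": one side keys on `thread_id` alone, the other on the `(thread_id, user_id, agent_name)` tuple from `_queue_key`. A minimal sketch of the composite-key variant follows; `PendingUpdate` and `enqueue` are hypothetical names, and only one of the merged flags is modelled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class PendingUpdate:
    """Hypothetical, reduced version of a queued memory update."""

    thread_id: str
    user_id: Optional[str]
    agent_name: Optional[str]
    correction_detected: bool = False


def queue_key(update: PendingUpdate) -> tuple:
    return (update.thread_id, update.user_id, update.agent_name)


def enqueue(queue: list, update: PendingUpdate) -> list:
    key = queue_key(update)
    existing = next((u for u in queue if queue_key(u) == key), None)
    if existing is not None:
        # Carry the flag forward so a replaced entry does not lose it.
        merged = update.correction_detected or existing.correction_detected
        update = replace(update, correction_detected=merged)
    # Drop the older entry for the same key, then append the newest one.
    remaining = [u for u in queue if queue_key(u) != key]
    remaining.append(update)
    return remaining
```

With the composite key, updates for different users on the same thread stay as separate queue entries; keying on `thread_id` alone would collapse them into one.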
|
|
||||||
def _reset_timer(self) -> None:
|
def _reset_timer(self) -> None:
|
||||||
|
|||||||
@@ -6,7 +6,6 @@ from deerflow.agents.memory.message_processing import detect_correction, detect_
|
|||||||
from deerflow.agents.memory.queue import get_memory_queue
|
from deerflow.agents.memory.queue import get_memory_queue
|
||||||
from deerflow.agents.middlewares.summarization_middleware import SummarizationEvent
|
from deerflow.agents.middlewares.summarization_middleware import SummarizationEvent
|
||||||
from deerflow.config.memory_config import get_memory_config
|
from deerflow.config.memory_config import get_memory_config
|
||||||
from deerflow.runtime.user_context import resolve_runtime_user_id
|
|
||||||
|
|
||||||
|
|
||||||
def memory_flush_hook(event: SummarizationEvent) -> None:
|
def memory_flush_hook(event: SummarizationEvent) -> None:
|
||||||
@@ -22,13 +21,11 @@ def memory_flush_hook(event: SummarizationEvent) -> None:
|
|||||||
|
|
||||||
correction_detected = detect_correction(filtered_messages)
|
correction_detected = detect_correction(filtered_messages)
|
||||||
reinforcement_detected = not correction_detected and detect_reinforcement(filtered_messages)
|
reinforcement_detected = not correction_detected and detect_reinforcement(filtered_messages)
|
||||||
user_id = resolve_runtime_user_id(event.runtime)
|
|
||||||
queue = get_memory_queue()
|
queue = get_memory_queue()
|
||||||
queue.add_nowait(
|
queue.add_nowait(
|
||||||
thread_id=event.thread_id,
|
thread_id=event.thread_id,
|
||||||
messages=filtered_messages,
|
messages=filtered_messages,
|
||||||
agent_name=event.agent_name,
|
agent_name=event.agent_name,
|
||||||
user_id=user_id,
|
|
||||||
correction_detected=correction_detected,
|
correction_detected=correction_detected,
|
||||||
reinforcement_detected=reinforcement_detected,
|
reinforcement_detected=reinforcement_detected,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -227,110 +227,6 @@ def _extract_text(content: Any) -> str:
|
|||||||
return str(content)
|
return str(content)
|
||||||
|
|
||||||
|
|
||||||
_REQUIRED_MEMORY_UPDATE_TOP_LEVEL_KEYS = frozenset({"user", "history", "newFacts", "factsToRemove"})
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_memory_update_fact(fact: Any) -> dict[str, Any] | None:
|
|
||||||
"""Normalize a single fact entry from a model-produced memory update."""
|
|
||||||
if not isinstance(fact, dict):
|
|
||||||
return None
|
|
||||||
|
|
||||||
raw_content = fact.get("content")
|
|
||||||
if not isinstance(raw_content, str):
|
|
||||||
return None
|
|
||||||
content = raw_content.strip()
|
|
||||||
if not content:
|
|
||||||
return None
|
|
||||||
|
|
||||||
raw_category = fact.get("category")
|
|
||||||
category = raw_category.strip() if isinstance(raw_category, str) and raw_category.strip() else "context"
|
|
||||||
|
|
||||||
raw_confidence = fact.get("confidence", 0.5)
|
|
||||||
if isinstance(raw_confidence, bool):
|
|
||||||
return None
|
|
||||||
if isinstance(raw_confidence, str):
|
|
||||||
raw_confidence = raw_confidence.strip()
|
|
||||||
if not raw_confidence:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
raw_confidence = float(raw_confidence)
|
|
||||||
except ValueError:
|
|
||||||
return None
|
|
||||||
elif isinstance(raw_confidence, (int, float)):
|
|
||||||
raw_confidence = float(raw_confidence)
|
|
||||||
else:
|
|
||||||
return None
|
|
||||||
|
|
||||||
if not math.isfinite(raw_confidence):
|
|
||||||
return None
|
|
||||||
|
|
||||||
normalized_fact = {
|
|
||||||
"content": content,
|
|
||||||
"category": category,
|
|
||||||
"confidence": raw_confidence,
|
|
||||||
}
|
|
||||||
source_error = fact.get("sourceError")
|
|
||||||
if isinstance(source_error, str):
|
|
||||||
normalized_source_error = source_error.strip()
|
|
||||||
if normalized_source_error:
|
|
||||||
normalized_fact["sourceError"] = normalized_source_error
|
|
||||||
|
|
||||||
return normalized_fact
|
|
||||||
|
|
||||||
|
|
||||||
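The confidence handling in `_normalize_memory_update_fact` can be read in isolation: reject booleans, accept numeric strings and numbers, and discard anything non-finite. The sketch below restates that rule as a standalone helper. `coerce_confidence` is a hypothetical name, and the `OverflowError` guard is an addition that does not appear in the diff.

```python
import math
from typing import Any, Optional


def coerce_confidence(raw: Any) -> Optional[float]:
    """Return a finite float, or None when the value cannot be trusted."""
    if isinstance(raw, bool):
        # bool is a subclass of int, so it must be rejected before the numeric check.
        return None
    if isinstance(raw, str):
        raw = raw.strip()
        if not raw:
            return None
    elif not isinstance(raw, (int, float)):
        return None
    try:
        value = float(raw)
    except (ValueError, OverflowError):
        # ValueError: non-numeric string; OverflowError: integer too large for a float.
        return None
    # float("nan") and float("inf") parse successfully, so filter them here.
    return value if math.isfinite(value) else None
```

Returning `None` instead of a default lets the caller drop the whole fact, which matches how the function above treats a malformed confidence value.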
def _normalize_memory_update_data(update_data: dict[str, Any]) -> dict[str, Any]:
|
|
||||||
"""Coerce parsed memory update data into the shape consumed by _apply_updates."""
|
|
||||||
user = update_data.get("user")
|
|
||||||
history = update_data.get("history")
|
|
||||||
new_facts = update_data.get("newFacts")
|
|
||||||
facts_to_remove = update_data.get("factsToRemove")
|
|
||||||
normalized_facts_to_remove = [fact_id for fact_id in facts_to_remove if isinstance(fact_id, str)] if isinstance(facts_to_remove, list) else []
|
|
||||||
normalized_new_facts = []
|
|
||||||
dropped_new_fact = not isinstance(new_facts, list)
|
|
||||||
if isinstance(new_facts, list):
|
|
||||||
for fact in new_facts:
|
|
||||||
normalized_fact = _normalize_memory_update_fact(fact)
|
|
||||||
if normalized_fact is not None:
|
|
||||||
normalized_new_facts.append(normalized_fact)
|
|
||||||
else:
|
|
||||||
dropped_new_fact = True
|
|
||||||
|
|
||||||
if normalized_facts_to_remove and dropped_new_fact:
|
|
||||||
raise json.JSONDecodeError(
|
|
||||||
"Unsafe partial memory update: factsToRemove with malformed newFacts",
|
|
||||||
json.dumps(update_data, ensure_ascii=False),
|
|
||||||
0,
|
|
||||||
)
|
|
||||||
|
|
||||||
return {
|
|
||||||
"user": user if isinstance(user, dict) else {},
|
|
||||||
"history": history if isinstance(history, dict) else {},
|
|
||||||
"newFacts": normalized_new_facts,
|
|
||||||
"factsToRemove": normalized_facts_to_remove,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
def _parse_memory_update_response(response_content: Any) -> dict[str, Any]:
|
|
||||||
"""Parse the first valid memory-update JSON object from an LLM response.
|
|
||||||
|
|
||||||
Some providers may wrap JSON in thinking traces, prose, or markdown fences
|
|
||||||
even when prompted to return JSON only. This parser accepts safely
|
|
||||||
extractable JSON objects but does not repair truncated or malformed JSON.
|
|
||||||
"""
|
|
||||||
response_text = _extract_text(response_content).strip()
|
|
||||||
decoder = json.JSONDecoder()
|
|
||||||
|
|
||||||
for match in re.finditer(r"\{", response_text):
|
|
||||||
try:
|
|
||||||
parsed, _end = decoder.raw_decode(response_text[match.start() :])
|
|
||||||
except json.JSONDecodeError:
|
|
||||||
continue
|
|
||||||
if isinstance(parsed, dict) and _REQUIRED_MEMORY_UPDATE_TOP_LEVEL_KEYS.issubset(parsed):
|
|
||||||
return _normalize_memory_update_data(parsed)
|
|
||||||
|
|
||||||
raise json.JSONDecodeError("No valid memory update JSON object found", response_text, 0)
|
|
||||||
|
|
||||||
|
|
||||||
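`_parse_memory_update_response` relies on `json.JSONDecoder.raw_decode`, which parses one JSON value from the start of a string and ignores whatever follows it. Trying it at every `{` skips surrounding prose, thinking traces, and markdown fences without repairing anything. A self-contained sketch of that scan follows. `first_matching_json_object` is a hypothetical name, and the normalization step is left out.

```python
import json
import re

REQUIRED_KEYS = frozenset({"user", "history", "newFacts", "factsToRemove"})


def first_matching_json_object(text, required_keys=REQUIRED_KEYS):
    """Return the first decodable JSON object that has all required keys."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", text):
        try:
            # raw_decode stops at the end of the first complete JSON value.
            parsed, _end = decoder.raw_decode(text[match.start():])
        except json.JSONDecodeError:
            continue  # this brace did not start a valid object; try the next one
        if isinstance(parsed, dict) and required_keys.issubset(parsed):
            return parsed
    raise json.JSONDecodeError("No valid memory update JSON object found", text, 0)
```

Because a candidate must decode completely, a truncated object is never accepted. The required-key check keeps an unrelated object that appears earlier in the reply from being mistaken for the update.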
# Matches sentences that describe a file-upload *event* rather than general
|
# Matches sentences that describe a file-upload *event* rather than general
|
||||||
# file-related work. Deliberately narrow to avoid removing legitimate facts
|
# file-related work. Deliberately narrow to avoid removing legitimate facts
|
||||||
# such as "User works with CSV files" or "prefers PDF export".
|
# such as "User works with CSV files" or "prefers PDF export".
|
||||||
@@ -442,7 +338,7 @@ class MemoryUpdater:
|
|||||||
reinforcement_detected=reinforcement_detected,
|
reinforcement_detected=reinforcement_detected,
|
||||||
)
|
)
|
||||||
prompt = MEMORY_UPDATE_PROMPT.format(
|
prompt = MEMORY_UPDATE_PROMPT.format(
|
||||||
current_memory=json.dumps(current_memory, indent=2, ensure_ascii=False),
|
current_memory=json.dumps(current_memory, indent=2),
|
||||||
conversation=conversation_text,
|
conversation=conversation_text,
|
||||||
correction_hint=correction_hint,
|
correction_hint=correction_hint,
|
||||||
)
|
)
|
||||||
@@ -457,7 +353,13 @@ class MemoryUpdater:
|
|||||||
user_id: str | None = None,
|
user_id: str | None = None,
|
||||||
) -> bool:
|
) -> bool:
|
||||||
"""Parse the model response, apply updates, and persist memory."""
|
"""Parse the model response, apply updates, and persist memory."""
|
||||||
update_data = _parse_memory_update_response(response_content)
|
response_text = _extract_text(response_content).strip()
|
||||||
|
|
||||||
|
if response_text.startswith("```"):
|
||||||
|
lines = response_text.split("\n")
|
||||||
|
response_text = "\n".join(lines[1:-1] if lines[-1] == "```" else lines[1:])
|
||||||
|
|
||||||
|
update_data = json.loads(response_text)
|
||||||
# Deep-copy before in-place mutation so a subsequent save() failure
|
# Deep-copy before in-place mutation so a subsequent save() failure
|
||||||
# cannot corrupt the still-cached original object reference.
|
# cannot corrupt the still-cached original object reference.
|
||||||
updated_memory = self._apply_updates(copy.deepcopy(current_memory), update_data, thread_id)
|
updated_memory = self._apply_updates(copy.deepcopy(current_memory), update_data, thread_id)
|
||||||
|
|||||||
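The shorter variant in the hunk above strips one leading markdown fence and hands the rest to `json.loads`. A minimal sketch of that behavior follows, with `parse_fenced_json` as a hypothetical name. The fence marker is built with string multiplication only to keep literal backticks out of the example.

```python
import json

FENCE = "`" * 3  # the three-backtick markdown fence marker


def parse_fenced_json(text):
    """Strip one surrounding markdown fence, if present, then parse strictly."""
    text = text.strip()
    if text.startswith(FENCE):
        lines = text.split("\n")
        # Drop the opening fence line, and the closing one when it is present.
        text = "\n".join(lines[1:-1] if lines[-1] == FENCE else lines[1:])
    return json.loads(text)
```

This form accepts a bare object or a fenced object. Any prose before the JSON makes `json.loads` raise `json.JSONDecodeError`, so the caller has to handle that failure.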
+48
-104
@@ -15,7 +15,6 @@ to the end of the message list as before_model + add_messages reducer would do.
|
|||||||
|
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
from collections import defaultdict, deque
|
|
||||||
from collections.abc import Awaitable, Callable
|
from collections.abc import Awaitable, Callable
|
||||||
from typing import override
|
from typing import override
|
||||||
|
|
||||||
@@ -26,11 +25,6 @@ from langchain_core.messages import ToolMessage
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Workaround for issue #2894: malformed write_file calls can carry huge Markdown
|
|
||||||
# payloads in invalid tool-call args. Keep recovery error details short so the
|
|
||||||
# synthetic ToolMessage does not echo large or malformed content back to the model.
|
|
||||||
_MAX_RECOVERY_ERROR_DETAIL_LEN = 500
|
|
||||||
|
|
||||||
|
|
||||||
class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
||||||
"""Inserts placeholder ToolMessages for dangling tool calls before model invocation.
|
"""Inserts placeholder ToolMessages for dangling tool calls before model invocation.
|
||||||
@@ -42,144 +36,94 @@ class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
|||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _message_tool_calls(msg) -> list[dict]:
|
def _message_tool_calls(msg) -> list[dict]:
|
||||||
"""Return normalized tool calls from structured fields or raw provider payloads.
|
"""Return normalized tool calls from structured fields or raw provider payloads."""
|
||||||
|
|
||||||
LangChain stores malformed provider function calls in ``invalid_tool_calls``.
|
|
||||||
They do not execute, but provider adapters may still serialize enough of
|
|
||||||
the call id/name back into the next request that strict OpenAI-compatible
|
|
||||||
validators expect a matching ToolMessage. Treat them as dangling calls so
|
|
||||||
the next model request stays well-formed and the model sees a recoverable
|
|
||||||
tool error instead of another provider 400.
|
|
||||||
"""
|
|
||||||
normalized: list[dict] = []
|
|
||||||
|
|
||||||
tool_calls = getattr(msg, "tool_calls", None) or []
|
tool_calls = getattr(msg, "tool_calls", None) or []
|
||||||
normalized.extend(list(tool_calls))
|
if tool_calls:
|
||||||
|
return list(tool_calls)
|
||||||
|
|
||||||
raw_tool_calls = (getattr(msg, "additional_kwargs", None) or {}).get("tool_calls") or []
|
raw_tool_calls = (getattr(msg, "additional_kwargs", None) or {}).get("tool_calls") or []
|
||||||
if not tool_calls:
|
normalized: list[dict] = []
|
||||||
for raw_tc in raw_tool_calls:
|
for raw_tc in raw_tool_calls:
|
||||||
if not isinstance(raw_tc, dict):
|
if not isinstance(raw_tc, dict):
|
||||||
continue
|
|
||||||
|
|
||||||
function = raw_tc.get("function")
|
|
||||||
name = raw_tc.get("name")
|
|
||||||
if not name and isinstance(function, dict):
|
|
||||||
name = function.get("name")
|
|
||||||
|
|
||||||
args = raw_tc.get("args", {})
|
|
||||||
if not args and isinstance(function, dict):
|
|
||||||
raw_args = function.get("arguments")
|
|
||||||
if isinstance(raw_args, str):
|
|
||||||
try:
|
|
||||||
parsed_args = json.loads(raw_args)
|
|
||||||
except (TypeError, ValueError, json.JSONDecodeError):
|
|
||||||
parsed_args = {}
|
|
||||||
args = parsed_args if isinstance(parsed_args, dict) else {}
|
|
||||||
|
|
||||||
normalized.append(
|
|
||||||
{
|
|
||||||
"id": raw_tc.get("id"),
|
|
||||||
"name": name or "unknown",
|
|
||||||
"args": args if isinstance(args, dict) else {},
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
for invalid_tc in getattr(msg, "invalid_tool_calls", None) or []:
|
|
||||||
if not isinstance(invalid_tc, dict):
|
|
||||||
continue
|
continue
|
||||||
|
|
||||||
|
function = raw_tc.get("function")
|
||||||
|
name = raw_tc.get("name")
|
||||||
|
if not name and isinstance(function, dict):
|
||||||
|
name = function.get("name")
|
||||||
|
|
||||||
|
args = raw_tc.get("args", {})
|
||||||
|
if not args and isinstance(function, dict):
|
||||||
|
raw_args = function.get("arguments")
|
||||||
|
if isinstance(raw_args, str):
|
||||||
|
try:
|
||||||
|
parsed_args = json.loads(raw_args)
|
||||||
|
except (TypeError, ValueError, json.JSONDecodeError):
|
||||||
|
parsed_args = {}
|
||||||
|
args = parsed_args if isinstance(parsed_args, dict) else {}
|
||||||
|
|
||||||
normalized.append(
|
normalized.append(
|
||||||
{
|
{
|
||||||
"id": invalid_tc.get("id"),
|
"id": raw_tc.get("id"),
|
||||||
"name": invalid_tc.get("name") or "unknown",
|
"name": name or "unknown",
|
||||||
"args": {},
|
"args": args if isinstance(args, dict) else {},
|
||||||
"invalid": True,
|
|
||||||
"error": invalid_tc.get("error"),
|
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
|
||||||
return normalized
|
return normalized
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _synthetic_tool_message_content(tool_call: dict) -> str:
|
|
||||||
if tool_call.get("invalid"):
|
|
||||||
name = tool_call.get("name")
|
|
||||||
error = tool_call.get("error")
|
|
||||||
error_text = error[:_MAX_RECOVERY_ERROR_DETAIL_LEN] if isinstance(error, str) and error else ""
|
|
||||||
# Workaround for issue #2894: malformed write_file calls can carry huge Markdown
|
|
||||||
# payloads in invalid tool-call args. Keep recovery guidance actionable without
|
|
||||||
# echoing large or malformed content back to the model.
|
|
||||||
if name == "write_file":
|
|
||||||
details = f" Parser error: {error_text}" if error_text else ""
|
|
||||||
return (
|
|
||||||
"[write_file failed before execution: the tool-call arguments were not valid JSON, "
|
|
||||||
"so no file was written. This often happens when the model tries to write a very "
|
|
||||||
"large Markdown file in a single tool call, especially when `content` contains "
|
|
||||||
"unescaped quotes, inline JSON, backslashes, or code fences. Do not retry the same "
|
|
||||||
"large `write_file` payload for this artifact; provide the report/content directly "
|
|
||||||
"as normal assistant text in your next response. If a file write is still needed "
|
|
||||||
f"later, split the file into smaller sections instead of one large payload.{details}]"
|
|
||||||
)
|
|
||||||
if error_text:
|
|
||||||
return f"[Tool call could not be executed because its arguments were invalid: {error_text}]"
|
|
||||||
return "[Tool call could not be executed because its arguments were invalid.]"
|
|
||||||
return "[Tool call was interrupted and did not return a result.]"
|
|
||||||
|
|
||||||
def _build_patched_messages(self, messages: list) -> list | None:
|
def _build_patched_messages(self, messages: list) -> list | None:
|
||||||
"""Return messages with tool results grouped after their tool-call AIMessage.
|
"""Return a new message list with patches inserted at the correct positions.
|
||||||
|
|
||||||
This normalizes model-bound causal order before provider serialization while
|
For each AIMessage with dangling tool_calls (no corresponding ToolMessage),
|
||||||
preserving already-valid transcripts unchanged.
|
a synthetic ToolMessage is inserted immediately after that AIMessage.
|
||||||
|
Returns None if no patches are needed.
|
||||||
"""
|
"""
|
||||||
tool_messages_by_id: dict[str, deque[ToolMessage]] = defaultdict(deque)
|
# Collect IDs of all existing ToolMessages
|
||||||
|
existing_tool_msg_ids: set[str] = set()
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if isinstance(msg, ToolMessage):
|
if isinstance(msg, ToolMessage):
|
||||||
tool_messages_by_id[msg.tool_call_id].append(msg)
|
existing_tool_msg_ids.add(msg.tool_call_id)
|
||||||
|
|
||||||
tool_call_ids: set[str] = set()
|
# Check if any patching is needed
|
||||||
|
needs_patch = False
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if getattr(msg, "type", None) != "ai":
|
if getattr(msg, "type", None) != "ai":
|
||||||
continue
|
continue
|
||||||
for tc in self._message_tool_calls(msg):
|
for tc in self._message_tool_calls(msg):
|
||||||
tc_id = tc.get("id")
|
tc_id = tc.get("id")
|
||||||
if tc_id:
|
if tc_id and tc_id not in existing_tool_msg_ids:
|
||||||
tool_call_ids.add(tc_id)
|
needs_patch = True
|
||||||
|
break
|
||||||
|
if needs_patch:
|
||||||
|
break
|
||||||
|
|
||||||
|
if not needs_patch:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Build new list with patches inserted right after each dangling AIMessage
|
||||||
patched: list = []
|
patched: list = []
|
||||||
|
patched_ids: set[str] = set()
|
||||||
patch_count = 0
|
patch_count = 0
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if isinstance(msg, ToolMessage) and msg.tool_call_id in tool_call_ids:
|
|
||||||
continue
|
|
||||||
|
|
||||||
patched.append(msg)
|
patched.append(msg)
|
||||||
if getattr(msg, "type", None) != "ai":
|
if getattr(msg, "type", None) != "ai":
|
||||||
continue
|
continue
|
||||||
|
|
||||||
for tc in self._message_tool_calls(msg):
|
for tc in self._message_tool_calls(msg):
|
||||||
tc_id = tc.get("id")
|
tc_id = tc.get("id")
|
||||||
if not tc_id:
|
if tc_id and tc_id not in existing_tool_msg_ids and tc_id not in patched_ids:
|
||||||
continue
|
|
||||||
|
|
||||||
tool_msg_queue = tool_messages_by_id.get(tc_id)
|
|
||||||
existing_tool_msg = tool_msg_queue.popleft() if tool_msg_queue else None
|
|
||||||
if existing_tool_msg is not None:
|
|
||||||
patched.append(existing_tool_msg)
|
|
||||||
else:
|
|
||||||
patched.append(
|
patched.append(
|
||||||
ToolMessage(
|
ToolMessage(
|
||||||
content=self._synthetic_tool_message_content(tc),
|
content="[Tool call was interrupted and did not return a result.]",
|
||||||
tool_call_id=tc_id,
|
tool_call_id=tc_id,
|
||||||
name=tc.get("name", "unknown"),
|
name=tc.get("name", "unknown"),
|
||||||
status="error",
|
status="error",
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
patched_ids.add(tc_id)
|
||||||
patch_count += 1
|
patch_count += 1
|
||||||
|
|
||||||
if patched == messages:
|
logger.warning(f"Injecting {patch_count} placeholder ToolMessage(s) for dangling tool calls")
|
||||||
return None
|
|
||||||
|
|
||||||
if patch_count:
|
|
||||||
logger.warning(f"Injecting {patch_count} placeholder ToolMessage(s) for dangling tool calls")
|
|
||||||
return patched
|
return patched
|
||||||
|
|
||||||
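The simpler variant of `_build_patched_messages` shown above collects the IDs that already have a tool result, then inserts a placeholder directly after any AI message whose call has none. The sketch below reproduces that logic with plain dictionaries instead of LangChain message classes. The dictionary shape and the name `patch_dangling_tool_calls` are assumptions made for illustration.

```python
def patch_dangling_tool_calls(messages):
    """Return a patched copy of messages, or None when nothing is dangling."""
    # IDs of tool calls that already have a result somewhere in the transcript.
    answered = {m["tool_call_id"] for m in messages if m["type"] == "tool"}
    patched = []
    patched_ids = set()
    for msg in messages:
        patched.append(msg)
        if msg["type"] != "ai":
            continue
        for call in msg.get("tool_calls", []):
            call_id = call.get("id")
            if call_id and call_id not in answered and call_id not in patched_ids:
                # Synthetic error result placed right after the calling AI message.
                patched.append(
                    {
                        "type": "tool",
                        "tool_call_id": call_id,
                        "name": call.get("name", "unknown"),
                        "status": "error",
                        "content": "[Tool call was interrupted and did not return a result.]",
                    }
                )
                patched_ids.add(call_id)
    return patched if patched_ids else None
```

Returning `None` for a transcript that needs no patching lets the caller skip the state update. This sketch does not move existing tool results, which is what the other variant in the hunk does in addition.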
@override
|
@override
|
||||||
|
|||||||
+34
-39
@@ -1,15 +1,12 @@
|
|||||||
"""Middleware to filter deferred tool schemas from model binding.
|
"""Middleware to filter deferred tool schemas from model binding.
|
||||||
|
|
||||||
When tool_search is enabled, MCP tools are still passed to ToolNode for
|
When tool_search is enabled, MCP tools are registered in the DeferredToolRegistry
|
||||||
execution, but their schemas must NOT be sent to the LLM via bind_tools until
|
and passed to ToolNode for execution, but their schemas should NOT be sent to the
|
||||||
the model has discovered them via tool_search. This middleware removes the
|
LLM via bind_tools (that's the whole point of deferral — saving context tokens).
|
||||||
still-deferred tools from request.tools before model binding, and blocks tool
|
|
||||||
calls to tools that have not been promoted yet.
|
|
||||||
|
|
||||||
The deferred name set and the catalog hash are injected at construction time
|
This middleware intercepts wrap_model_call and removes deferred tools from
|
||||||
(no ContextVar). Promotion state is read from graph state (``state["promoted"]``),
|
request.tools so that model.bind_tools only receives active tool schemas.
|
||||||
scoped by catalog hash so a stale persisted promotion cannot expose a renamed
|
The agent discovers deferred tools at runtime via the tool_search tool.
|
||||||
or drifted tool.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
@@ -27,49 +24,47 @@ logger = logging.getLogger(__name__)
|
|||||||
|
|
||||||
|
|
||||||
class DeferredToolFilterMiddleware(AgentMiddleware[AgentState]):
|
class DeferredToolFilterMiddleware(AgentMiddleware[AgentState]):
|
||||||
"""Hide deferred tool schemas from the bound model until promoted.
|
"""Remove deferred tools from request.tools before model binding.
|
||||||
|
|
||||||
ToolNode still holds all tools (including deferred) for execution routing,
|
ToolNode still holds all tools (including deferred) for execution routing,
|
||||||
but the LLM only sees active tool schemas plus tools that have already been
|
but the LLM only sees active tool schemas — deferred tools are discoverable
|
||||||
promoted (recorded in ``state["promoted"]`` under the current catalog hash).
|
via tool_search at runtime.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(self, deferred_names: frozenset[str], catalog_hash: str | None):
|
|
||||||
super().__init__()
|
|
||||||
self._deferred = deferred_names
|
|
||||||
self._catalog_hash = catalog_hash
|
|
||||||
|
|
||||||
def _promoted(self, state) -> set[str]:
|
|
||||||
promoted = (state or {}).get("promoted")
|
|
||||||
if promoted and promoted.get("catalog_hash") == self._catalog_hash:
|
|
||||||
return set(promoted.get("names") or [])
|
|
||||||
return set()
|
|
||||||
|
|
||||||
def _hidden(self, state) -> set[str]:
|
|
||||||
return set(self._deferred) - self._promoted(state)
|
|
||||||
|
|
||||||
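The `_promoted` and `_hidden` helpers above reduce to a set difference guarded by a hash comparison: promotions count only when they were recorded under the current catalog hash. A standalone sketch follows. `hidden_tools` is a hypothetical function, and the shape of the promotion record is taken from the code above.

```python
def hidden_tools(deferred, promoted_state, catalog_hash):
    """Return deferred tool names that have not been promoted for this catalog."""
    promoted = set()
    # A promotion recorded under a different catalog hash is treated as stale.
    if promoted_state and promoted_state.get("catalog_hash") == catalog_hash:
        promoted = set(promoted_state.get("names") or [])
    return set(deferred) - promoted
```

If the tool catalog changes, the hash no longer matches and every deferred tool is hidden again until it is rediscovered.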
def _filter_tools(self, request: ModelRequest) -> ModelRequest:
|
def _filter_tools(self, request: ModelRequest) -> ModelRequest:
|
||||||
if not self._deferred:
|
from deerflow.tools.builtins.tool_search import get_deferred_registry
|
||||||
|
|
||||||
|
registry = get_deferred_registry()
|
||||||
|
if not registry:
|
||||||
return request
|
return request
|
||||||
hide = self._hidden(request.state)
|
|
||||||
if not hide:
|
deferred_names = registry.deferred_names
|
||||||
return request
|
active_tools = [t for t in request.tools if getattr(t, "name", None) not in deferred_names]
|
||||||
active = [t for t in request.tools if getattr(t, "name", None) not in hide]
|
|
||||||
if len(active) < len(request.tools):
|
if len(active_tools) < len(request.tools):
|
||||||
logger.debug("Filtered %d deferred tool schema(s) from model binding", len(request.tools) - len(active))
|
logger.debug(f"Filtered {len(request.tools) - len(active_tools)} deferred tool schema(s) from model binding")
|
||||||
return request.override(tools=active)
|
|
||||||
|
return request.override(tools=active_tools)
|
||||||
|
|
||||||
def _blocked_tool_message(self, request: ToolCallRequest) -> ToolMessage | None:
|
def _blocked_tool_message(self, request: ToolCallRequest) -> ToolMessage | None:
|
||||||
if not self._deferred:
|
from deerflow.tools.builtins.tool_search import get_deferred_registry
|
||||||
|
|
||||||
|
registry = get_deferred_registry()
|
||||||
|
if not registry:
|
||||||
return None
|
return None
|
||||||
name = str(request.tool_call.get("name") or "")
|
|
||||||
if not name or name not in self._hidden(request.state):
|
tool_name = str(request.tool_call.get("name") or "")
|
||||||
|
if not tool_name:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
if not registry.contains(tool_name):
|
||||||
|
return None
|
||||||
|
|
||||||
tool_call_id = str(request.tool_call.get("id") or "missing_tool_call_id")
|
tool_call_id = str(request.tool_call.get("id") or "missing_tool_call_id")
|
||||||
return ToolMessage(
|
return ToolMessage(
|
||||||
content=(f"Error: Tool '{name}' is deferred and has not been promoted yet. Call tool_search first to expose and promote this tool's schema, then retry."),
|
content=(f"Error: Tool '{tool_name}' is deferred and has not been promoted yet. Call tool_search first to expose and promote this tool's schema, then retry."),
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
name=name,
|
name=tool_name,
|
||||||
status="error",
|
status="error",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|||||||
@@ -1,232 +0,0 @@
|
|||||||
"""Middleware to inject dynamic context (memory, current date) as a system-reminder.
|
|
||||||
|
|
||||||
The system prompt is kept fully static for maximum prefix-cache reuse across users
|
|
||||||
and sessions. The current date is always injected. Per-user memory is also injected
|
|
||||||
when ``memory.injection_enabled`` is True in the app config. Both are delivered once
|
|
||||||
per conversation as a dedicated <system-reminder> HumanMessage inserted before the
|
|
||||||
first user message (frozen-snapshot pattern).
|
|
||||||
|
|
||||||
When a conversation spans midnight the middleware detects the date change and injects
|
|
||||||
a lightweight date-update reminder as a separate HumanMessage before the current turn.
|
|
||||||
This correction is persisted so subsequent turns on the new day see a consistent history
|
|
||||||
and do not re-inject.
|
|
||||||
|
|
||||||
Reminder format:
|
|
||||||
|
|
||||||
<system-reminder>
|
|
||||||
<memory>...</memory>
|
|
||||||
|
|
||||||
<current_date>2026-05-08, Friday</current_date>
|
|
||||||
</system-reminder>
|
|
||||||
|
|
||||||
Date-update format:
|
|
||||||
|
|
||||||
<system-reminder>
|
|
||||||
<current_date>2026-05-09, Saturday</current_date>
|
|
||||||
</system-reminder>
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
|
||||||
import re
|
|
||||||
import uuid
|
|
||||||
from datetime import datetime
|
|
||||||
from typing import TYPE_CHECKING, override
|
|
||||||
|
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
|
||||||
from langchain_core.messages import HumanMessage
|
|
||||||
from langgraph.runtime import Runtime
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from deerflow.config.app_config import AppConfig
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
# Upper bound (seconds) for a single _inject() offload. If the warm-up at
|
|
||||||
# gateway startup failed silently, the first request may still hit a cold
|
|
||||||
# tiktoken BPE download that blocks until the OS TCP timeout (~26 min).
|
|
||||||
# This cap ensures the request degrades gracefully instead of hanging.
|
|
||||||
_INJECT_TIMEOUT_SECONDS = 5.0
|
|
||||||
|
|
||||||
_DATE_RE = re.compile(r"<current_date>([^<]+)</current_date>")
|
|
||||||
_DYNAMIC_CONTEXT_REMINDER_KEY = "dynamic_context_reminder"
|
|
||||||
_SUMMARY_MESSAGE_NAME = "summary"
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_date(content: str) -> str | None:
|
|
||||||
"""Return the first <current_date> value found in *content*, or None."""
|
|
||||||
m = _DATE_RE.search(content)
|
|
||||||
return m.group(1) if m else None
|
|
||||||
|
|
||||||
|
|
||||||
def is_dynamic_context_reminder(message: object) -> bool:
|
|
||||||
"""Return whether *message* is a hidden dynamic-context reminder."""
|
|
||||||
return isinstance(message, HumanMessage) and bool(message.additional_kwargs.get(_DYNAMIC_CONTEXT_REMINDER_KEY))
|
|
||||||
|
|
||||||
|
|
||||||
def _last_injected_date(messages: list) -> str | None:
|
|
||||||
"""Scan messages in reverse and return the most recently injected date.
|
|
||||||
|
|
||||||
Detection uses the ``dynamic_context_reminder`` additional_kwargs flag rather
|
|
||||||
than content substring matching, so user messages containing ``<system-reminder>``
|
|
||||||
are not mistakenly treated as injected reminders.
|
|
||||||
"""
|
|
||||||
for msg in reversed(messages):
|
|
||||||
if is_dynamic_context_reminder(msg):
|
|
||||||
content_str = msg.content if isinstance(msg.content, str) else str(msg.content)
|
|
||||||
return _extract_date(content_str)
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _is_user_injection_target(message: object) -> bool:
|
|
||||||
"""Return whether *message* can receive a dynamic-context reminder."""
|
|
||||||
return isinstance(message, HumanMessage) and not is_dynamic_context_reminder(message) and message.name != _SUMMARY_MESSAGE_NAME
|
|
||||||
|
|
||||||
|
|
||||||
class DynamicContextMiddleware(AgentMiddleware):
|
|
||||||
"""Inject memory and current date into HumanMessages as a <system-reminder>.
|
|
||||||
|
|
||||||
First turn
|
|
||||||
----------
|
|
||||||
Prepends a full system-reminder (memory + date) to the first HumanMessage and
|
|
||||||
persists it (same message ID). The first message is then frozen for the whole
|
|
||||||
session — its content never changes again, so the prefix cache can hit on every
|
|
||||||
subsequent turn.
|
|
||||||
|
|
||||||
Midnight crossing
|
|
||||||
-----------------
|
|
||||||
If the conversation spans midnight, the current date differs from the date that
|
|
||||||
was injected earlier. In that case a lightweight date-update reminder is prepended
|
|
||||||
to the **current** (last) HumanMessage and persisted. Subsequent turns on the new
|
|
||||||
day see the corrected date in history and skip re-injection.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, agent_name: str | None = None, *, app_config: AppConfig | None = None):
|
|
||||||
super().__init__()
|
|
||||||
self._agent_name = agent_name
|
|
||||||
self._app_config = app_config
|
|
||||||
|
|
||||||
def _build_full_reminder(self) -> str:
|
|
||||||
from deerflow.agents.lead_agent.prompt import _get_memory_context
|
|
||||||
|
|
||||||
# Memory injection is gated by injection_enabled; date is always included.
|
|
||||||
injection_enabled = self._app_config.memory.injection_enabled if self._app_config else True
|
|
||||||
memory_context = _get_memory_context(self._agent_name, app_config=self._app_config) if injection_enabled else ""
|
|
||||||
current_date = datetime.now().strftime("%Y-%m-%d, %A")
|
|
||||||
|
|
||||||
lines: list[str] = ["<system-reminder>"]
|
|
||||||
if memory_context:
|
|
||||||
lines.append(memory_context.strip())
|
|
||||||
lines.append("") # blank line separating memory from date
|
|
||||||
lines.append(f"<current_date>{current_date}</current_date>")
|
|
||||||
lines.append("</system-reminder>")
|
|
||||||
|
|
||||||
return "\n".join(lines)
|
|
||||||
|
|
||||||
def _build_date_update_reminder(self) -> str:
|
|
||||||
current_date = datetime.now().strftime("%Y-%m-%d, %A")
|
|
||||||
return "\n".join(
|
|
||||||
[
|
|
||||||
"<system-reminder>",
|
|
||||||
f"<current_date>{current_date}</current_date>",
|
|
||||||
"</system-reminder>",
|
|
||||||
]
|
|
||||||
)
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _make_reminder_and_user_messages(original: HumanMessage, reminder_content: str) -> tuple[HumanMessage, HumanMessage]:
|
|
||||||
"""Return (reminder_msg, user_msg) using the ID-swap technique.
|
|
||||||
|
|
||||||
reminder_msg takes the original message's ID so that add_messages replaces it
|
|
||||||
in-place (preserving position). user_msg carries the original content with a
|
|
||||||
derived ``{id}__user`` ID and is appended immediately after by add_messages.
|
|
||||||
|
|
||||||
If the original message has no ID a stable UUID is generated so the derived
|
|
||||||
``{id}__user`` ID never collapses to the ambiguous ``None__user`` string.
|
|
||||||
"""
|
|
||||||
stable_id = original.id or str(uuid.uuid4())
|
|
||||||
reminder_msg = HumanMessage(
|
|
||||||
content=reminder_content,
|
|
||||||
id=stable_id,
|
|
||||||
additional_kwargs={"hide_from_ui": True, _DYNAMIC_CONTEXT_REMINDER_KEY: True},
|
|
||||||
)
|
|
||||||
user_msg = HumanMessage(
|
|
||||||
content=original.content,
|
|
||||||
id=f"{stable_id}__user",
|
|
||||||
name=original.name,
|
|
||||||
additional_kwargs=original.additional_kwargs,
|
|
||||||
)
|
|
||||||
return reminder_msg, user_msg
|
|
||||||
|
|
||||||
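The ID-swap described in `_make_reminder_and_user_messages` depends on a reducer that replaces a message whose ID already exists and appends one whose ID is new. The sketch below models this with a small dataclass and a simplified merge function. `Msg`, `split_with_id_swap`, and `merge_by_id` are hypothetical stand-ins, not the LangChain or LangGraph types.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    content: str
    id: Optional[str] = None
    hidden: bool = False


def split_with_id_swap(original, reminder):
    """Return (reminder_msg, user_msg); the reminder takes over the original ID."""
    stable_id = original.id or str(uuid.uuid4())  # avoids a "None__user" ID
    reminder_msg = Msg(reminder, id=stable_id, hidden=True)
    user_msg = Msg(original.content, id=f"{stable_id}__user")
    return reminder_msg, user_msg


def merge_by_id(existing, updates):
    """Simplified ID-based reducer: replace in place when the ID exists, else append."""
    merged = list(existing)
    index = {m.id: i for i, m in enumerate(merged)}
    for m in updates:
        if m.id in index:
            merged[index[m.id]] = m
        else:
            index[m.id] = len(merged)
            merged.append(m)
    return merged
```

With a history of one user message, merging the pair puts the hidden reminder in the original slot and the user text after it. That is the ordering the docstring above describes.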
def _inject(self, state) -> dict | None:
|
|
||||||
messages = list(state.get("messages", []))
|
|
||||||
if not messages:
|
|
||||||
return None
|
|
||||||
|
|
||||||
current_date = datetime.now().strftime("%Y-%m-%d, %A")
|
|
||||||
last_date = _last_injected_date(messages)
|
|
||||||
logger.debug(
|
|
||||||
"DynamicContextMiddleware._inject: msg_count=%d last_date=%r current_date=%r",
|
|
||||||
len(messages),
|
|
||||||
last_date,
|
|
||||||
current_date,
|
|
||||||
)
|
|
||||||
|
|
||||||
if last_date is None:
|
|
||||||
# ── First turn: inject full reminder as a separate HumanMessage ─────
|
|
||||||
first_idx = next((i for i, m in enumerate(messages) if _is_user_injection_target(m)), None)
|
|
||||||
if first_idx is None:
|
|
||||||
return None
|
|
||||||
full_reminder = self._build_full_reminder()
|
|
||||||
logger.info(
|
|
||||||
"DynamicContextMiddleware: injecting full reminder (len=%d, has_memory=%s) into first HumanMessage id=%r",
|
|
||||||
len(full_reminder),
|
|
||||||
"<memory>" in full_reminder,
|
|
||||||
messages[first_idx].id,
|
|
||||||
)
|
|
||||||
reminder_msg, user_msg = self._make_reminder_and_user_messages(messages[first_idx], full_reminder)
|
|
||||||
return {"messages": [reminder_msg, user_msg]}
|
|
||||||
|
|
||||||
if last_date == current_date:
|
|
||||||
# ── Same day: nothing to do ──────────────────────────────────────────
|
|
||||||
return None
|
|
||||||
|
|
||||||
# ── Midnight crossed: inject date-update reminder as a separate HumanMessage ──
|
|
||||||
last_human_idx = next((i for i in reversed(range(len(messages))) if _is_user_injection_target(messages[i])), None)
|
|
||||||
if last_human_idx is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
reminder_msg, user_msg = self._make_reminder_and_user_messages(messages[last_human_idx], self._build_date_update_reminder())
|
|
||||||
logger.info("DynamicContextMiddleware: midnight crossing detected — injected date update before current turn")
|
|
||||||
return {"messages": [reminder_msg, user_msg]}
|
|
||||||
|
|
||||||
@override
|
|
||||||
def before_agent(self, state, runtime: Runtime) -> dict | None:
|
|
||||||
return self._inject(state)
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def abefore_agent(self, state, runtime: Runtime) -> dict | None:
|
|
||||||
# _inject() performs synchronous file I/O (memory JSON loading) and
|
|
||||||
# potentially blocking network calls (tiktoken encoding download on
|
|
||||||
# first use). Offload to a thread so the event loop is never blocked
|
|
||||||
# — a blocking call here starves all concurrent HTTP handlers (auth,
|
|
||||||
# SSE heartbeats, etc.). See issue #3402.
|
|
||||||
#
|
|
||||||
# Bounded timeout: if startup warm-up failed silently (e.g. network
|
|
||||||
# blip during deploy), the first request's cold tiktoken download can
|
|
||||||
# block for tens of minutes (OS TCP timeout). Time-box injection so
|
|
||||||
# the request degrades gracefully (no memory context) rather than
|
|
||||||
# hanging.
|
|
||||||
try:
|
|
||||||
return await asyncio.wait_for(
|
|
||||||
asyncio.to_thread(self._inject, state),
|
|
||||||
timeout=_INJECT_TIMEOUT_SECONDS,
|
|
||||||
)
|
|
||||||
except TimeoutError:
|
|
||||||
logger.warning(
|
|
||||||
"DynamicContextMiddleware: injection timed out (%.1fs); skipping memory/date injection for this turn",
|
|
||||||
_INJECT_TIMEOUT_SECONDS,
|
|
||||||
)
|
|
||||||
return None
|
|
||||||
+6
-106
@@ -62,41 +62,6 @@ _AUTH_PATTERNS = (
|
|||||||
"未授权",
|
"未授权",
|
||||||
)
|
)
|
||||||
|
|
||||||
# Per-exception retry budget overrides.
|
|
||||||
#
|
|
||||||
# Some transient errors are retriable in principle but expensive to retry at
|
|
||||||
# the default budget. StreamChunkTimeoutError in particular fires after the
|
|
||||||
# upstream provider has already stalled for `stream_chunk_timeout` seconds
|
|
||||||
# (typically 120-240s); a full 3-attempt loop can therefore stack 6-12 minutes
|
|
||||||
# of dead air before surfacing the failure to the user. We keep exactly one
|
|
||||||
# retry (cheap reconnect that catches genuine transient TCP blips) and then
|
|
||||||
# fail fast — the same buffered payload is overwhelmingly likely to fail
|
|
||||||
# again at the upstream provider for the same reason.
|
|
||||||
#
|
|
||||||
# Keys are exception class *names* (not classes) so we don't introduce
|
|
||||||
# import-time coupling on optional dependencies like langchain-openai. The
|
|
||||||
# value is the absolute max attempt count, NOT additional retries, so a
|
|
||||||
# value of 2 means "1 first attempt + 1 retry" (the "keep one retry"
|
|
||||||
# behavior requested in code review), capped by `retry_max_attempts`.
|
|
||||||
_RETRY_BUDGET_OVERRIDES: dict[str, int] = {
|
|
||||||
"StreamChunkTimeoutError": 2,
|
|
||||||
}
|
|
||||||
|
|
||||||
# Exception class names that indicate the upstream stream-chunk watchdog
|
|
||||||
# fired because the model stalled mid-flight. These deserve a more specific
|
|
||||||
# user-facing message than the generic "temporarily unavailable" copy,
|
|
||||||
# because the typical root cause is a long tool-call serialization stalling
|
|
||||||
# the upstream stream — and the most actionable advice we can give the user
|
|
||||||
# is "ask for a shorter / split output" rather than "wait and retry".
|
|
||||||
# Generic connection drops (httpx RemoteProtocolError / ReadError) are
|
|
||||||
# intentionally excluded: they routinely fire on transient network blips
|
|
||||||
# with normal payloads, where the "split the work" guidance is misleading.
|
|
||||||
_STREAM_DROP_EXCEPTIONS: frozenset[str] = frozenset(
|
|
||||||
{
|
|
||||||
"StreamChunkTimeoutError",
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
||||||
"""Retry transient LLM errors and surface graceful assistant messages."""
|
"""Retry transient LLM errors and surface graceful assistant messages."""
|
||||||
@@ -118,18 +83,6 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._circuit_state = "closed"
|
self._circuit_state = "closed"
|
||||||
self._circuit_probe_in_flight = False
|
self._circuit_probe_in_flight = False
|
||||||
|
|
||||||
def _max_attempts_for(self, exc: BaseException) -> int:
|
|
||||||
"""Return the effective max attempt count for this exception.
|
|
||||||
|
|
||||||
Falls back to `self.retry_max_attempts` unless the exception class name
|
|
||||||
appears in the per-exception override table.
|
|
||||||
"""
|
|
||||||
override = _RETRY_BUDGET_OVERRIDES.get(type(exc).__name__)
|
|
||||||
if override is None:
|
|
||||||
return self.retry_max_attempts
|
|
||||||
|
|
||||||
return min(override, self.retry_max_attempts)
|
|
||||||
|
|
||||||
def _check_circuit(self) -> bool:
|
def _check_circuit(self) -> bool:
|
||||||
"""Returns True if circuit is OPEN (fast fail), False otherwise."""
|
"""Returns True if circuit is OPEN (fast fail), False otherwise."""
|
||||||
with self._circuit_lock:
|
with self._circuit_lock:
|
||||||
@@ -200,7 +153,6 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
"InternalServerError",
|
"InternalServerError",
|
||||||
"ReadError", # httpx.ReadError: connection dropped mid-stream
|
"ReadError", # httpx.ReadError: connection dropped mid-stream
|
||||||
"RemoteProtocolError", # httpx: server closed connection unexpectedly
|
"RemoteProtocolError", # httpx: server closed connection unexpectedly
|
||||||
"StreamChunkTimeoutError", # langchain-openai: chunk gap exceeded stream_chunk_timeout
|
|
||||||
}:
|
}:
|
||||||
return True, "transient"
|
return True, "transient"
|
||||||
if status_code in _RETRIABLE_STATUS_CODES:
|
if status_code in _RETRIABLE_STATUS_CODES:
|
||||||
@@ -225,24 +177,6 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
def _build_circuit_breaker_message(self) -> str:
|
def _build_circuit_breaker_message(self) -> str:
|
||||||
return "The configured LLM provider is currently unavailable due to continuous failures. Circuit breaker is engaged to protect the system. Please wait a moment before trying again."
|
return "The configured LLM provider is currently unavailable due to continuous failures. Circuit breaker is engaged to protect the system. Please wait a moment before trying again."
|
||||||
|
|
||||||
def _build_error_fallback_message(
|
|
||||||
self,
|
|
||||||
content: str,
|
|
||||||
*,
|
|
||||||
error_type: str,
|
|
||||||
reason: str,
|
|
||||||
detail: str,
|
|
||||||
) -> AIMessage:
|
|
||||||
return AIMessage(
|
|
||||||
content=content,
|
|
||||||
additional_kwargs={
|
|
||||||
"deerflow_error_fallback": True,
|
|
||||||
"error_type": error_type,
|
|
||||||
"error_reason": reason,
|
|
||||||
"error_detail": detail,
|
|
||||||
},
|
|
||||||
)
|
|
||||||
|
|
||||||
def _build_user_message(self, exc: BaseException, reason: str) -> str:
|
def _build_user_message(self, exc: BaseException, reason: str) -> str:
|
||||||
detail = _extract_error_detail(exc)
|
detail = _extract_error_detail(exc)
|
||||||
if reason == "quota":
|
if reason == "quota":
|
||||||
@@ -250,31 +184,9 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
if reason == "auth":
|
if reason == "auth":
|
||||||
return "The configured LLM provider rejected the request because authentication or access is invalid. Please check the provider credentials and try again."
|
return "The configured LLM provider rejected the request because authentication or access is invalid. Please check the provider credentials and try again."
|
||||||
if reason in {"busy", "transient"}:
|
if reason in {"busy", "transient"}:
|
||||||
# Stream-chunk timeouts (the only members of _STREAM_DROP_EXCEPTIONS;
|
|
||||||
# generic httpx connection drops are deliberately excluded) usually
|
|
||||||
# point at a single oversized tool-call payload: the model spent so
|
|
||||||
# long serializing JSON arguments that the upstream provider buffered
|
|
||||||
# and the stream gap exceeded `stream_chunk_timeout`. Surfacing this
|
|
||||||
# distinct cause lets the user split or shorten their next request
|
|
||||||
# instead of helplessly retrying the same prompt.
|
|
||||||
if type(exc).__name__ in _STREAM_DROP_EXCEPTIONS:
|
|
||||||
return (
|
|
||||||
"The model's streaming response was interrupted before it could "
|
|
||||||
"finish. This usually happens when a single response or tool call "
|
|
||||||
"is very large — please ask the assistant to split the work into "
|
|
||||||
"smaller steps, or shorten the requested output, and try again."
|
|
||||||
)
|
|
||||||
return "The configured LLM provider is temporarily unavailable after multiple retries. Please wait a moment and continue the conversation."
|
return "The configured LLM provider is temporarily unavailable after multiple retries. Please wait a moment and continue the conversation."
|
||||||
return f"LLM request failed: {detail}"
|
return f"LLM request failed: {detail}"
|
||||||
|
|
||||||
def _build_user_fallback_message(self, exc: BaseException, reason: str) -> AIMessage:
|
|
||||||
return self._build_error_fallback_message(
|
|
||||||
self._build_user_message(exc, reason),
|
|
||||||
error_type=type(exc).__name__,
|
|
||||||
reason=reason,
|
|
||||||
detail=_extract_error_detail(exc),
|
|
||||||
)
|
|
||||||
|
|
||||||
def _emit_retry_event(self, attempt: int, wait_ms: int, reason: str) -> None:
|
def _emit_retry_event(self, attempt: int, wait_ms: int, reason: str) -> None:
|
||||||
try:
|
try:
|
||||||
from langgraph.config import get_stream_writer
|
from langgraph.config import get_stream_writer
|
||||||
@@ -300,12 +212,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
handler: Callable[[ModelRequest], ModelResponse],
|
handler: Callable[[ModelRequest], ModelResponse],
|
||||||
) -> ModelCallResult:
|
) -> ModelCallResult:
|
||||||
if self._check_circuit():
|
if self._check_circuit():
|
||||||
return self._build_error_fallback_message(
|
return AIMessage(content=self._build_circuit_breaker_message())
|
||||||
self._build_circuit_breaker_message(),
|
|
||||||
error_type="CircuitBreakerOpen",
|
|
||||||
reason="circuit_open",
|
|
||||||
detail="LLM circuit breaker is open",
|
|
||||||
)
|
|
||||||
|
|
||||||
attempt = 1
|
attempt = 1
|
||||||
while True:
|
while True:
|
||||||
@@ -321,8 +228,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
raise
|
raise
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
retriable, reason = self._classify_error(exc)
|
retriable, reason = self._classify_error(exc)
|
||||||
max_attempts = self._max_attempts_for(exc)
|
if retriable and attempt < self.retry_max_attempts:
|
||||||
if retriable and attempt < max_attempts:
|
|
||||||
wait_ms = self._build_retry_delay_ms(attempt, exc)
|
wait_ms = self._build_retry_delay_ms(attempt, exc)
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"Transient LLM error on attempt %d/%d; retrying in %dms: %s",
|
"Transient LLM error on attempt %d/%d; retrying in %dms: %s",
|
||||||
@@ -343,7 +249,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
)
|
)
|
||||||
if retriable:
|
if retriable:
|
||||||
self._record_failure()
|
self._record_failure()
|
||||||
return self._build_user_fallback_message(exc, reason)
|
return AIMessage(content=self._build_user_message(exc, reason))
|
||||||
|
|
||||||
@override
|
@override
|
||||||
async def awrap_model_call(
|
async def awrap_model_call(
|
||||||
@@ -352,12 +258,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
||||||
) -> ModelCallResult:
|
) -> ModelCallResult:
|
||||||
if self._check_circuit():
|
if self._check_circuit():
|
||||||
return self._build_error_fallback_message(
|
return AIMessage(content=self._build_circuit_breaker_message())
|
||||||
self._build_circuit_breaker_message(),
|
|
||||||
error_type="CircuitBreakerOpen",
|
|
||||||
reason="circuit_open",
|
|
||||||
detail="LLM circuit breaker is open",
|
|
||||||
)
|
|
||||||
|
|
||||||
attempt = 1
|
attempt = 1
|
||||||
while True:
|
while True:
|
||||||
@@ -373,8 +274,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
raise
|
raise
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
retriable, reason = self._classify_error(exc)
|
retriable, reason = self._classify_error(exc)
|
||||||
max_attempts = self._max_attempts_for(exc)
|
if retriable and attempt < self.retry_max_attempts:
|
||||||
if retriable and attempt < max_attempts:
|
|
||||||
wait_ms = self._build_retry_delay_ms(attempt, exc)
|
wait_ms = self._build_retry_delay_ms(attempt, exc)
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"Transient LLM error on attempt %d/%d; retrying in %dms: %s",
|
"Transient LLM error on attempt %d/%d; retrying in %dms: %s",
|
||||||
@@ -395,7 +295,7 @@ class LLMErrorHandlingMiddleware(AgentMiddleware[AgentState]):
|
|||||||
)
|
)
|
||||||
if retriable:
|
if retriable:
|
||||||
self._record_failure()
|
self._record_failure()
|
||||||
return self._build_user_fallback_message(exc, reason)
|
return AIMessage(content=self._build_user_message(exc, reason))
|
||||||
|
|
||||||
|
|
||||||
def _matches_any(detail: str, patterns: tuple[str, ...]) -> bool:
|
def _matches_any(detail: str, patterns: tuple[str, ...]) -> bool:
|
||||||
|
|||||||
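The per-exception retry budget that appears on the left-hand side of the diff above can be sketched on its own. This is an illustrative sketch under stated assumptions: the local `StreamChunkTimeoutError` class, `max_attempts_for`, and `call_with_retries` are hypothetical, and the loop treats every exception as retriable and skips backoff, which the real middleware does not.

```python
class StreamChunkTimeoutError(Exception):
    """Hypothetical local stand-in for the real exception class."""


# Keys are class *names*, so no import of the optional dependency is needed.
RETRY_BUDGET_OVERRIDES: dict[str, int] = {"StreamChunkTimeoutError": 2}


def max_attempts_for(exc: BaseException, default_max_attempts: int) -> int:
    """Absolute attempt cap for this exception, never above the default."""
    override = RETRY_BUDGET_OVERRIDES.get(type(exc).__name__)
    if override is None:
        return default_max_attempts
    return min(override, default_max_attempts)


def call_with_retries(fn, default_max_attempts: int = 3) -> tuple[str, int]:
    """Call fn until it succeeds or the budget for the raised exception is spent."""
    attempt = 1
    while True:
        try:
            return fn(), attempt
        except Exception as exc:  # assumption: every error counts as retriable here
            if attempt < max_attempts_for(exc, default_max_attempts):
                attempt += 1
                continue
            return f"failed: {type(exc).__name__}", attempt


def always_stall() -> str:
    raise StreamChunkTimeoutError()


def always_reset() -> str:
    raise ConnectionError()


stall_outcome = call_with_retries(always_stall)
reset_outcome = call_with_retries(always_reset)
```

With a default budget of 3, the stalled stream is abandoned after its second attempt while the generic connection error uses all three.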
+16
-239
@@ -6,58 +6,25 @@ arguments indefinitely until the recursion limit kills the run.
|
|||||||
Detection strategy:
|
Detection strategy:
|
||||||
1. After each model response, hash the tool calls (name + args).
|
1. After each model response, hash the tool calls (name + args).
|
||||||
2. Track recent hashes in a sliding window.
|
2. Track recent hashes in a sliding window.
|
||||||
3. If the same hash appears >= warn_threshold times, queue a
|
3. If the same hash appears >= warn_threshold times, inject a
|
||||||
"you are repeating yourself — wrap up" warning for the current
|
"you are repeating yourself — wrap up" system message (once per hash).
|
||||||
thread/run. The warning is **injected at the next model call** (in
|
|
||||||
``wrap_model_call``) as a ``HumanMessage`` appended to the message
|
|
||||||
list, *after* all ToolMessage responses to the previous
|
|
||||||
AIMessage(tool_calls).
|
|
||||||
4. If it appears >= hard_limit times, strip all tool_calls from the
|
4. If it appears >= hard_limit times, strip all tool_calls from the
|
||||||
response so the agent is forced to produce a final text answer.
|
response so the agent is forced to produce a final text answer.
|
||||||
|
|
||||||
Why the warning is injected at ``wrap_model_call`` instead of
|
|
||||||
``after_model``:
|
|
||||||
|
|
||||||
``after_model`` fires immediately after the model emits an
|
|
||||||
``AIMessage`` that may carry ``tool_calls``. The tools node has not
|
|
||||||
run yet, so no matching ``ToolMessage`` exists in the history. Any
|
|
||||||
message we add here lands *between* the assistant's tool_calls and
|
|
||||||
their responses. OpenAI/Moonshot reject the next request with
|
|
||||||
``"tool_call_ids did not have response messages"`` because their
|
|
||||||
validators require the assistant's tool_calls to be followed
|
|
||||||
immediately by tool messages. Anthropic also disallows mid-stream
|
|
||||||
``SystemMessage``. By deferring the warning to ``wrap_model_call``,
|
|
||||||
every prior ToolMessage is already present in the request's message
|
|
||||||
list and the warning is appended at the end — pairing intact, no
|
|
||||||
``AIMessage`` semantics are mutated.
|
|
||||||
|
|
||||||
Queued warnings are intentionally transient. If a run ends before the
|
|
||||||
next model request drains a queued warning, ``after_agent`` drops it
|
|
||||||
instead of carrying it into a later invocation for the same thread. The
|
|
||||||
hard-stop path still forces termination when the configured safety limit
|
|
||||||
is reached.
|
|
||||||
"""
|
"""
|
||||||
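The detection strategy described in the docstring above (hash the tool calls, keep a sliding window, warn once per hash, then force a stop) can be sketched as follows. This is a minimal sketch, not the middleware itself: `hash_tool_calls`, `LoopTracker`, the SHA-256 hashing, and the `hard_limit=5` default are assumptions made for illustration.

```python
import hashlib
import json


def hash_tool_calls(tool_calls: list[dict]) -> str:
    """Hash tool names and arguments in a key-order-independent way."""
    canonical = json.dumps(
        [{"name": tc.get("name"), "args": tc.get("args", {})} for tc in tool_calls],
        sort_keys=True,
        default=str,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class LoopTracker:
    """Hypothetical single-thread tracker; the real class keeps one per thread_id."""

    def __init__(self, warn_threshold: int = 3, hard_limit: int = 5, window_size: int = 20):
        self.warn_threshold = warn_threshold
        self.hard_limit = hard_limit
        self.window_size = window_size
        self.history: list[str] = []
        self.warned: set[str] = set()

    def observe(self, tool_calls: list[dict]) -> str:
        """Return "ok", "warn" (at most once per hash), or "stop"."""
        call_hash = hash_tool_calls(tool_calls)
        self.history.append(call_hash)
        self.history = self.history[-self.window_size:]  # sliding window
        count = self.history.count(call_hash)
        if count >= self.hard_limit:
            return "stop"
        if count >= self.warn_threshold and call_hash not in self.warned:
            self.warned.add(call_hash)
            return "warn"
        return "ok"


tracker = LoopTracker()
same_call = [{"name": "bash", "args": {"cmd": "ls"}}]
verdicts = [tracker.observe(same_call) for _ in range(5)]
```

Repeating the identical call five times yields a warning on the third repetition and a stop on the fifth; the fourth is silent because that hash has already been warned about.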
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import hashlib
|
import hashlib
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import threading
|
import threading
|
||||||
from collections import OrderedDict, defaultdict
|
from collections import OrderedDict, defaultdict
|
||||||
from collections.abc import Awaitable, Callable
|
|
||||||
from copy import deepcopy
|
from copy import deepcopy
|
||||||
from typing import TYPE_CHECKING, override
|
from typing import override
|
||||||
|
|
||||||
from langchain.agents import AgentState
|
from langchain.agents import AgentState
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
from langchain.agents.middleware import AgentMiddleware
|
||||||
from langchain.agents.middleware.types import ModelCallResult, ModelRequest, ModelResponse
|
|
||||||
from langchain_core.messages import HumanMessage
|
from langchain_core.messages import HumanMessage
|
||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from deerflow.config.loop_detection_config import LoopDetectionConfig
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Defaults — can be overridden via constructor
|
# Defaults — can be overridden via constructor
|
||||||
@@ -67,7 +34,6 @@ _DEFAULT_WINDOW_SIZE = 20 # track last N tool calls
|
|||||||
_DEFAULT_MAX_TRACKED_THREADS = 100 # LRU eviction limit
|
_DEFAULT_MAX_TRACKED_THREADS = 100 # LRU eviction limit
|
||||||
_DEFAULT_TOOL_FREQ_WARN = 30 # warn after 30 calls to the same tool type
|
_DEFAULT_TOOL_FREQ_WARN = 30 # warn after 30 calls to the same tool type
|
||||||
_DEFAULT_TOOL_FREQ_HARD_LIMIT = 50 # force-stop after 50 calls to the same tool type
|
_DEFAULT_TOOL_FREQ_HARD_LIMIT = 50 # force-stop after 50 calls to the same tool type
|
||||||
_MAX_PENDING_WARNINGS_PER_RUN = 4
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_tool_call_args(raw_args: object) -> tuple[dict, str | None]:
|
def _normalize_tool_call_args(raw_args: object) -> tuple[dict, str | None]:
|
||||||
@@ -174,9 +140,6 @@ _TOOL_FREQ_HARD_STOP_MSG = "[FORCED STOP] Tool {tool_name} called {count} times
|
|||||||
class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
||||||
"""Detects and breaks repetitive tool call loops.
|
"""Detects and breaks repetitive tool call loops.
|
||||||
|
|
||||||
Threshold parameters are validated upstream by :class:`LoopDetectionConfig`;
|
|
||||||
construct via :meth:`from_config` to ensure values pass Pydantic validation.
|
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
warn_threshold: Number of identical tool call sets before injecting
|
warn_threshold: Number of identical tool call sets before injecting
|
||||||
a warning message. Default: 3.
|
a warning message. Default: 3.
|
||||||
@@ -192,14 +155,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
Default: 30.
|
Default: 30.
|
||||||
tool_freq_hard_limit: Number of calls to the same tool type before
|
tool_freq_hard_limit: Number of calls to the same tool type before
|
||||||
forcing a stop. Default: 50.
|
forcing a stop. Default: 50.
|
||||||
tool_freq_overrides: Per-tool overrides for frequency thresholds,
|
|
||||||
keyed by tool name. Each value is a ``(warn, hard_limit)`` tuple
|
|
||||||
that replaces ``tool_freq_warn`` / ``tool_freq_hard_limit`` for
|
|
||||||
that specific tool. Tools not listed here fall back to the global
|
|
||||||
thresholds. Useful for raising limits on intentionally
|
|
||||||
high-frequency tools (e.g. ``bash`` in batch pipelines) without
|
|
||||||
weakening protection on all other tools. Default: ``None``
|
|
||||||
(no overrides).
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(
|
def __init__(
|
||||||
@@ -210,7 +165,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
max_tracked_threads: int = _DEFAULT_MAX_TRACKED_THREADS,
|
max_tracked_threads: int = _DEFAULT_MAX_TRACKED_THREADS,
|
||||||
tool_freq_warn: int = _DEFAULT_TOOL_FREQ_WARN,
|
tool_freq_warn: int = _DEFAULT_TOOL_FREQ_WARN,
|
||||||
tool_freq_hard_limit: int = _DEFAULT_TOOL_FREQ_HARD_LIMIT,
|
tool_freq_hard_limit: int = _DEFAULT_TOOL_FREQ_HARD_LIMIT,
|
||||||
tool_freq_overrides: dict[str, tuple[int, int]] | None = None,
|
|
||||||
):
|
):
|
||||||
super().__init__()
|
super().__init__()
|
||||||
self.warn_threshold = warn_threshold
|
self.warn_threshold = warn_threshold
|
||||||
@@ -219,50 +173,21 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self.max_tracked_threads = max_tracked_threads
|
self.max_tracked_threads = max_tracked_threads
|
||||||
self.tool_freq_warn = tool_freq_warn
|
self.tool_freq_warn = tool_freq_warn
|
||||||
self.tool_freq_hard_limit = tool_freq_hard_limit
|
self.tool_freq_hard_limit = tool_freq_hard_limit
|
||||||
self._tool_freq_overrides: dict[str, tuple[int, int]] = tool_freq_overrides or {}
|
|
||||||
self._lock = threading.Lock()
|
self._lock = threading.Lock()
|
||||||
|
# Per-thread tracking using OrderedDict for LRU eviction
|
||||||
self._history: OrderedDict[str, list[str]] = OrderedDict()
|
self._history: OrderedDict[str, list[str]] = OrderedDict()
|
||||||
self._warned: dict[str, set[str]] = defaultdict(set)
|
self._warned: dict[str, set[str]] = defaultdict(set)
|
||||||
|
# Per-thread, per-tool-type cumulative call counts
|
||||||
self._tool_freq: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
|
self._tool_freq: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
|
||||||
self._tool_freq_warned: dict[str, set[str]] = defaultdict(set)
|
self._tool_freq_warned: dict[str, set[str]] = defaultdict(set)
|
||||||
# Per-thread/run queue of warnings to inject at the next model call.
|
|
||||||
# Populated by ``after_model`` (detection) and drained by
|
|
||||||
# ``wrap_model_call`` (injection); see module docstring.
|
|
||||||
self._pending_warnings: dict[tuple[str, str], list[str]] = defaultdict(list)
|
|
||||||
self._pending_warning_touch_order: OrderedDict[tuple[str, str], None] = OrderedDict()
|
|
||||||
self._max_pending_warning_keys = max(1, self.max_tracked_threads * 2)
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_config(cls, config: LoopDetectionConfig) -> LoopDetectionMiddleware:
|
|
||||||
"""Construct from a Pydantic-validated config, trusting its validation."""
|
|
||||||
return cls(
|
|
||||||
warn_threshold=config.warn_threshold,
|
|
||||||
hard_limit=config.hard_limit,
|
|
||||||
window_size=config.window_size,
|
|
||||||
max_tracked_threads=config.max_tracked_threads,
|
|
||||||
tool_freq_warn=config.tool_freq_warn,
|
|
||||||
tool_freq_hard_limit=config.tool_freq_hard_limit,
|
|
||||||
tool_freq_overrides={name: (o.warn, o.hard_limit) for name, o in config.tool_freq_overrides.items()},
|
|
||||||
)
|
|
||||||
|
|
||||||
def _get_thread_id(self, runtime: Runtime) -> str:
|
def _get_thread_id(self, runtime: Runtime) -> str:
|
||||||
"""Extract thread_id from runtime context for per-thread tracking."""
|
"""Extract thread_id from runtime context for per-thread tracking."""
|
||||||
thread_id = runtime.context.get("thread_id") if runtime.context else None
|
thread_id = runtime.context.get("thread_id") if runtime.context else None
|
||||||
if thread_id:
|
if thread_id:
|
||||||
return str(thread_id)
|
return thread_id
|
||||||
return "default"
|
return "default"
|
||||||
|
|
||||||
def _get_run_id(self, runtime: Runtime) -> str:
|
|
||||||
"""Extract run_id from runtime context for per-run warning scoping."""
|
|
||||||
run_id = runtime.context.get("run_id") if runtime.context else None
|
|
||||||
if run_id:
|
|
||||||
return str(run_id)
|
|
||||||
return "default"
|
|
||||||
|
|
||||||
def _pending_key(self, runtime: Runtime) -> tuple[str, str]:
|
|
||||||
"""Return the pending-warning key for the current thread/run."""
|
|
||||||
return self._get_thread_id(runtime), self._get_run_id(runtime)
|
|
||||||
|
|
||||||
def _evict_if_needed(self) -> None:
|
def _evict_if_needed(self) -> None:
|
||||||
"""Evict least recently used threads if over the limit.
|
"""Evict least recently used threads if over the limit.
|
||||||
|
|
||||||
@@ -273,52 +198,8 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._warned.pop(evicted_id, None)
|
self._warned.pop(evicted_id, None)
|
||||||
self._tool_freq.pop(evicted_id, None)
|
self._tool_freq.pop(evicted_id, None)
|
||||||
self._tool_freq_warned.pop(evicted_id, None)
|
self._tool_freq_warned.pop(evicted_id, None)
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == evicted_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
logger.debug("Evicted loop tracking for thread %s (LRU)", evicted_id)
|
logger.debug("Evicted loop tracking for thread %s (LRU)", evicted_id)
|
||||||
|
|
||||||
def _drop_pending_warning_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
"""Drop all pending-warning bookkeeping for one thread/run key.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
self._pending_warnings.pop(key, None)
|
|
||||||
self._pending_warning_touch_order.pop(key, None)
|
|
||||||
|
|
||||||
def _touch_pending_warning_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
"""Mark a pending-warning key as recently used.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
self._pending_warning_touch_order[key] = None
|
|
||||||
self._pending_warning_touch_order.move_to_end(key)
|
|
||||||
|
|
||||||
def _prune_pending_warning_state_locked(self, protected_key: tuple[str, str]) -> None:
|
|
||||||
"""Cap pending-warning state across abnormal or concurrent runs.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
overflow = len(self._pending_warning_touch_order) - self._max_pending_warning_keys
|
|
||||||
if overflow <= 0:
|
|
||||||
return
|
|
||||||
|
|
||||||
candidates = [key for key in self._pending_warning_touch_order if key != protected_key]
|
|
||||||
for key in candidates[:overflow]:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
|
|
||||||
def _queue_pending_warning(self, runtime: Runtime, warning: str) -> None:
|
|
||||||
"""Queue one transient warning for the current thread/run with caps."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
warnings = self._pending_warnings[pending_key]
|
|
||||||
if warning not in warnings:
|
|
||||||
warnings.append(warning)
|
|
||||||
if len(warnings) > _MAX_PENDING_WARNINGS_PER_RUN:
|
|
||||||
del warnings[: len(warnings) - _MAX_PENDING_WARNINGS_PER_RUN]
|
|
||||||
self._touch_pending_warning_key_locked(pending_key)
|
|
||||||
self._prune_pending_warning_state_locked(protected_key=pending_key)
|
|
||||||
|
|
||||||
def _track_and_check(self, state: AgentState, runtime: Runtime) -> tuple[str | None, bool]:
|
def _track_and_check(self, state: AgentState, runtime: Runtime) -> tuple[str | None, bool]:
|
||||||
"""Track tool calls and check for loops.
|
"""Track tool calls and check for loops.
|
||||||
|
|
||||||
@@ -359,12 +240,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
if len(history) > self.window_size:
|
if len(history) > self.window_size:
|
||||||
history[:] = history[-self.window_size :]
|
history[:] = history[-self.window_size :]
|
||||||
|
|
||||||
warned_hashes = self._warned.get(thread_id)
|
|
||||||
if warned_hashes is not None:
|
|
||||||
warned_hashes.intersection_update(history)
|
|
||||||
if not warned_hashes:
|
|
||||||
self._warned.pop(thread_id, None)
|
|
||||||
|
|
||||||
count = history.count(call_hash)
|
count = history.count(call_hash)
|
||||||
tool_names = [tc.get("name", "?") for tc in tool_calls]
|
tool_names = [tc.get("name", "?") for tc in tool_calls]
|
||||||
|
|
||||||
@@ -405,12 +280,7 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
freq[name] += 1
|
freq[name] += 1
|
||||||
tc_count = freq[name]
|
tc_count = freq[name]
|
||||||
|
|
||||||
if name in self._tool_freq_overrides:
|
if tc_count >= self.tool_freq_hard_limit:
|
||||||
eff_warn, eff_hard = self._tool_freq_overrides[name]
|
|
||||||
else:
|
|
||||||
eff_warn, eff_hard = self.tool_freq_warn, self.tool_freq_hard_limit
|
|
||||||
|
|
||||||
if tc_count >= eff_hard:
|
|
||||||
logger.error(
|
logger.error(
|
||||||
"Tool frequency hard limit reached — forcing stop",
|
"Tool frequency hard limit reached — forcing stop",
|
||||||
extra={
|
extra={
|
||||||
@@ -421,7 +291,7 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
)
|
)
|
||||||
return _TOOL_FREQ_HARD_STOP_MSG.format(tool_name=name, count=tc_count), True
|
return _TOOL_FREQ_HARD_STOP_MSG.format(tool_name=name, count=tc_count), True
|
||||||
|
|
||||||
if tc_count >= eff_warn:
|
if tc_count >= self.tool_freq_warn:
|
||||||
warned = self._tool_freq_warned[thread_id]
|
warned = self._tool_freq_warned[thread_id]
|
||||||
if name not in warned:
|
if name not in warned:
|
||||||
warned.add(name)
|
warned.add(name)
|
||||||
@@ -478,10 +348,7 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
warning, hard_stop = self._track_and_check(state, runtime)
|
warning, hard_stop = self._track_and_check(state, runtime)
|
||||||
|
|
||||||
if hard_stop:
|
if hard_stop:
|
||||||
# Strip tool_calls from the last AIMessage to force text output.
|
# Strip tool_calls from the last AIMessage to force text output
|
||||||
# Once tool_calls are stripped, the AIMessage no longer requires
|
|
||||||
# matching ToolMessage responses, so mutating it in place here
|
|
||||||
# is safe for OpenAI/Moonshot pairing validators.
|
|
||||||
messages = state.get("messages", [])
|
messages = state.get("messages", [])
|
||||||
last_msg = messages[-1]
|
last_msg = messages[-1]
|
||||||
content = self._append_text(last_msg.content, warning or _HARD_STOP_MSG)
|
content = self._append_text(last_msg.content, warning or _HARD_STOP_MSG)
|
||||||
@@ -489,48 +356,16 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
return {"messages": [stripped_msg]}
|
return {"messages": [stripped_msg]}
|
||||||
|
|
||||||
if warning:
|
if warning:
|
||||||
# Defer injection to the next model call. We must NOT alter the
|
# Inject as HumanMessage instead of SystemMessage to avoid
|
||||||
# AIMessage(tool_calls=...) here (would put framework words in
|
# Anthropic's "multiple non-consecutive system messages" error.
|
||||||
# the model's mouth, polluting downstream consumers like
|
# Anthropic models require system messages only at the start of
|
||||||
# MemoryMiddleware), nor insert a separate non-tool message
|
# the conversation; injecting one mid-conversation crashes
|
||||||
# (would break OpenAI/Moonshot tool-call pairing because the
|
# langchain_anthropic's _format_messages(). HumanMessage works
|
||||||
# tools node has not produced ToolMessage responses yet). The
|
# with all providers. See #1299.
|
||||||
# warning is delivered via ``wrap_model_call`` below.
|
return {"messages": [HumanMessage(content=warning, name="loop_warning")]}
|
||||||
self._queue_pending_warning(runtime, warning)
|
|
||||||
return None
|
|
||||||
|
|
||||||
return None
|
return None
|
||||||
|
|
||||||
def _clear_other_run_pending_warnings(self, runtime: Runtime) -> None:
|
|
||||||
"""Drop stale pending warnings for previous runs in this thread."""
|
|
||||||
thread_id, current_run_id = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == thread_id and key[1] != current_run_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
|
|
||||||
def _clear_current_run_pending_warnings(self, runtime: Runtime) -> None:
|
|
||||||
"""Drop pending warnings owned by the current thread/run."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
self._drop_pending_warning_key_locked(pending_key)
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _format_warning_message(warnings: list[str]) -> str:
|
|
||||||
"""Merge pending warnings into one prompt message."""
|
|
||||||
deduped = list(dict.fromkeys(warnings))
|
|
||||||
return "\n\n".join(deduped)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def before_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_other_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def abefore_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_other_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
@override
|
||||||
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
||||||
return self._apply(state, runtime)
|
return self._apply(state, runtime)
|
||||||
@@ -539,59 +374,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
||||||
return self._apply(state, runtime)
|
return self._apply(state, runtime)
|
||||||
|
|
||||||
@override
|
|
||||||
def after_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_current_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def aafter_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_current_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _drain_pending_warnings(self, runtime: Runtime) -> list[str]:
|
|
||||||
"""Pop and return all queued warnings for *runtime*'s thread/run."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
warnings = self._pending_warnings.pop(pending_key, [])
|
|
||||||
self._pending_warning_touch_order.pop(pending_key, None)
|
|
||||||
return warnings
|
|
||||||
|
|
||||||
def _augment_request(self, request: ModelRequest) -> ModelRequest:
|
|
||||||
"""Append queued loop warnings (if any) to the outgoing message list.
|
|
||||||
|
|
||||||
The warning is placed *after* every existing message, including the
|
|
||||||
ToolMessage responses to the previous AIMessage(tool_calls). This
|
|
||||||
keeps ``assistant tool_calls -> tool_messages`` pairing intact for
|
|
||||||
OpenAI/Moonshot, avoids the Anthropic mid-stream SystemMessage
|
|
||||||
restriction (we use HumanMessage), and never mutates an existing
|
|
||||||
AIMessage.
|
|
||||||
"""
|
|
||||||
warnings = self._drain_pending_warnings(request.runtime)
|
|
||||||
if not warnings:
|
|
||||||
return request
|
|
||||||
new_messages = [
|
|
||||||
*request.messages,
|
|
||||||
HumanMessage(content=self._format_warning_message(warnings), name="loop_warning"),
|
|
||||||
]
|
|
||||||
return request.override(messages=new_messages)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def wrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], ModelResponse],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return handler(self._augment_request(request))
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def awrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return await handler(self._augment_request(request))
|
|
||||||
|
|
||||||
def reset(self, thread_id: str | None = None) -> None:
|
def reset(self, thread_id: str | None = None) -> None:
|
||||||
"""Clear tracking state. If thread_id given, clear only that thread."""
|
"""Clear tracking state. If thread_id given, clear only that thread."""
|
||||||
with self._lock:
|
with self._lock:
|
||||||
@@ -600,13 +382,8 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._warned.pop(thread_id, None)
|
self._warned.pop(thread_id, None)
|
||||||
self._tool_freq.pop(thread_id, None)
|
self._tool_freq.pop(thread_id, None)
|
||||||
self._tool_freq_warned.pop(thread_id, None)
|
self._tool_freq_warned.pop(thread_id, None)
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == thread_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
else:
|
else:
|
||||||
self._history.clear()
|
self._history.clear()
|
||||||
self._warned.clear()
|
self._warned.clear()
|
||||||
self._tool_freq.clear()
|
self._tool_freq.clear()
|
||||||
self._tool_freq_warned.clear()
|
self._tool_freq_warned.clear()
|
||||||
self._pending_warnings.clear()
|
|
||||||
self._pending_warning_touch_order.clear()
|
|
||||||
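The left-hand column of the hunks above queues loop warnings in `after_model` and delivers them on the next model call, after every existing message. The following is a minimal, standard-library-only sketch of that queue-then-drain pattern. `PendingWarnings`, `queue`, `drain`, `clear_other_runs`, `augment`, and the plain-dict message shape are illustrative assumptions, not the middleware's actual API; only the `(thread_id, run_id)` keying, the `dict.fromkeys` de-duplication, and the `loop_warning` name come from the diff.

```python
import threading


class PendingWarnings:
    """Hypothetical stand-in for the middleware's pending-warning store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Keyed by (thread_id, run_id) so one run never sees another's warnings.
        self._pending = {}

    def queue(self, key: tuple, warning: str) -> None:
        with self._lock:
            self._pending.setdefault(key, []).append(warning)

    def drain(self, key: tuple) -> list:
        """Pop and return everything queued for this thread/run."""
        with self._lock:
            return self._pending.pop(key, [])

    def clear_other_runs(self, key: tuple) -> None:
        """Drop stale warnings left by earlier runs of the same thread."""
        thread_id, run_id = key
        with self._lock:
            for other in list(self._pending):
                if other[0] == thread_id and other[1] != run_id:
                    del self._pending[other]


def format_warnings(warnings: list) -> str:
    # dict.fromkeys keeps first-seen order while removing duplicates.
    return "\n\n".join(dict.fromkeys(warnings))


def augment(messages: list, store: PendingWarnings, key: tuple) -> list:
    """Append drained warnings after every existing message; never mutate the input."""
    warnings = store.drain(key)
    if not warnings:
        return messages
    warning_message = {
        "role": "user",  # assumption: plain dict standing in for HumanMessage
        "name": "loop_warning",
        "content": format_warnings(warnings),
    }
    return [*messages, warning_message]
```

Because the warning is appended last, any tool responses already in the message list stay directly after the assistant message that requested them, which is the pairing constraint the removed comments describe.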
|
|||||||
@@ -1,317 +0,0 @@
|
|||||||
"""Suppress tool execution when the provider safety-terminated the response.
|
|
||||||
|
|
||||||
Background — see issue bytedance/deer-flow#3028.
|
|
||||||
|
|
||||||
Some providers (OpenAI ``finish_reason='content_filter'``, Anthropic
|
|
||||||
``stop_reason='refusal'``, Gemini ``finish_reason='SAFETY'`` ...) can stop
|
|
||||||
generation mid-stream while still returning partially-formed ``tool_calls``.
|
|
||||||
LangChain's tool router treats any AIMessage with a non-empty ``tool_calls``
|
|
||||||
field as "go execute these", so half-truncated arguments — e.g. a markdown
|
|
||||||
``write_file`` that stops in the middle of a sentence — get dispatched as if
|
|
||||||
they were complete. The agent then sees the truncated file, tries to fix it,
|
|
||||||
gets filtered again, and loops.
|
|
||||||
|
|
||||||
This middleware sits at ``after_model`` and gates that behaviour: when a
|
|
||||||
configured ``SafetyTerminationDetector`` fires *and* the AIMessage carries
|
|
||||||
tool calls, we strip the tool calls (both structured and raw provider
|
|
||||||
payloads), append a user-facing explanation, and stash observability fields
|
|
||||||
in ``additional_kwargs.safety_termination`` so logs, traces, and SSE
|
|
||||||
consumers can see what happened.
|
|
||||||
|
|
||||||
Hook choice: ``after_model`` (not ``wrap_model_call``) because the response
|
|
||||||
is a *normal* return — not an exception — and we want to participate in the
|
|
||||||
same after-model chain as ``LoopDetectionMiddleware``, with which we share
|
|
||||||
the same tool-call-suppression mechanic but a different trigger.
|
|
||||||
|
|
||||||
Placement: register *after* ``LoopDetectionMiddleware`` in the middleware
|
|
||||||
list. LangChain factory wires ``after_model`` edges in reverse list order
|
|
||||||
(``langchain/agents/factory.py:add_edge("model", middleware_w_after_model[-1])``,
|
|
||||||
then walks ``range(len-1, 0, -1)``), so the *last* registered middleware is
|
|
||||||
the *first* to observe the model output. Registering Safety after Loop
|
|
||||||
means Safety sees the raw response first, clears tool calls if it fires,
|
|
||||||
and Loop then accounts against the cleaned message.
|
|
||||||
"""
|
|
||||||
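The placement note in the docstring above says that `after_model` hooks run in reverse registration order, so the last registered middleware sees the model output first. The sketch below only models that stated ordering with plain functions and dicts; `run_after_model`, `loop_hook`, and `safety_hook` are hypothetical names, and the real wiring lives in the LangChain agent graph rather than in a loop like this one.

```python
from typing import Callable, List, Tuple

Hook = Callable[[dict], dict]


def run_after_model(hooks: List[Tuple[str, Hook]], message: dict) -> Tuple[dict, List[str]]:
    """Apply hooks in reverse registration order and report the order used."""
    order = []
    for name, hook in reversed(hooks):
        message = hook(message)
        order.append(name)
    return message, order


def loop_hook(message: dict) -> dict:
    # Stand-in for loop accounting: record how many tool calls were observed.
    return {**message, "loop_saw": len(message["tool_calls"])}


def safety_hook(message: dict) -> dict:
    # Stand-in for safety suppression: drop tool calls on a content_filter stop.
    if message.get("finish_reason") == "content_filter":
        return {**message, "tool_calls": []}
    return message


# Registered as [loop, safety]: safety runs first, loop sees the cleaned message.
registered = [("loop", loop_hook), ("safety", safety_hook)]
```

With this registration order, the loop stand-in counts zero tool calls for a filtered response, which matches the docstring's statement that Loop accounts against the cleaned message.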
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
from typing import TYPE_CHECKING, override
|
|
||||||
|
|
||||||
from langchain.agents import AgentState
|
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
|
||||||
from langchain_core.messages import AIMessage
|
|
||||||
from langgraph.runtime import Runtime
|
|
||||||
|
|
||||||
from deerflow.agents.middlewares.safety_termination_detectors import (
|
|
||||||
SafetyTermination,
|
|
||||||
SafetyTerminationDetector,
|
|
||||||
default_detectors,
|
|
||||||
)
|
|
||||||
from deerflow.agents.middlewares.tool_call_metadata import clone_ai_message_with_tool_calls
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from deerflow.config.safety_finish_reason_config import SafetyFinishReasonConfig
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
_USER_FACING_MESSAGE = (
|
|
||||||
"The model provider stopped this response with a safety-related signal "
|
|
||||||
"({reason_field}={reason_value!r}, detector={detector!r}). Any tool "
|
|
||||||
"calls produced in this turn were suppressed because their arguments "
|
|
||||||
"may be truncated and unsafe to execute. Please rephrase the request "
|
|
||||||
"or ask for a narrower output."
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class SafetyFinishReasonMiddleware(AgentMiddleware[AgentState]):
|
|
||||||
"""Strip tool_calls from AIMessages flagged by a SafetyTerminationDetector."""
|
|
||||||
|
|
||||||
def __init__(self, detectors: list[SafetyTerminationDetector] | None = None) -> None:
|
|
||||||
super().__init__()
|
|
||||||
# Copy so caller mutations after construction don't leak into us.
|
|
||||||
self._detectors: list[SafetyTerminationDetector] = list(detectors) if detectors else default_detectors()
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_config(cls, config: SafetyFinishReasonConfig) -> SafetyFinishReasonMiddleware:
|
|
||||||
"""Construct from validated Pydantic config, honouring the
|
|
||||||
reflection-loaded detector list when provided.
|
|
||||||
|
|
||||||
An explicit empty list is intentionally rejected — it would silently
|
|
||||||
disable detection while leaving the middleware in the chain, which
|
|
||||||
is the worst of both worlds. Use ``enabled: false`` instead.
|
|
||||||
"""
|
|
||||||
if config.detectors is None:
|
|
||||||
return cls()
|
|
||||||
|
|
||||||
if not config.detectors:
|
|
||||||
raise ValueError("safety_finish_reason.detectors must be omitted (use built-ins) or contain at least one entry; use enabled=false to disable the middleware entirely.")
|
|
||||||
|
|
||||||
from deerflow.reflection import resolve_variable
|
|
||||||
|
|
||||||
detectors: list[SafetyTerminationDetector] = []
|
|
||||||
for entry in config.detectors:
|
|
||||||
detector_cls = resolve_variable(entry.use)
|
|
||||||
kwargs = dict(entry.config) if entry.config else {}
|
|
||||||
detector = detector_cls(**kwargs)
|
|
||||||
if not isinstance(detector, SafetyTerminationDetector):
|
|
||||||
raise TypeError(f"{entry.use} did not produce a SafetyTerminationDetector (got {type(detector).__name__}); ensure it has a `name` attribute and a `detect(message)` method")
|
|
||||||
detectors.append(detector)
|
|
||||||
return cls(detectors=detectors)
|
|
||||||
|
|
||||||
# ----- detection -------------------------------------------------------
|
|
||||||
|
|
||||||
def _detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
for detector in self._detectors:
|
|
||||||
try:
|
|
||||||
hit = detector.detect(message)
|
|
||||||
except Exception: # noqa: BLE001 - never let a buggy detector break the agent run
|
|
||||||
logger.exception("SafetyTerminationDetector %r raised; treating as no-match", getattr(detector, "name", type(detector).__name__))
|
|
||||||
continue
|
|
||||||
if hit is not None:
|
|
||||||
return hit
|
|
||||||
return None
|
|
||||||
|
|
||||||
# ----- message rewriting ----------------------------------------------
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _append_user_message(content: object, text: str) -> str | list:
|
|
||||||
"""Append a plain-text explanation to AIMessage content.
|
|
||||||
|
|
||||||
Mirrors ``LoopDetectionMiddleware._append_text`` so list-content
|
|
||||||
responses (Anthropic thinking blocks, vLLM reasoning splits) keep
|
|
||||||
their structure instead of being string-coerced into a TypeError.
|
|
||||||
"""
|
|
||||||
if content is None or content == "":
|
|
||||||
return text
|
|
||||||
if isinstance(content, list):
|
|
||||||
return [*content, {"type": "text", "text": f"\n\n{text}"}]
|
|
||||||
if isinstance(content, str):
|
|
||||||
return content + f"\n\n{text}"
|
|
||||||
return str(content) + f"\n\n{text}"
|
|
||||||
|
|
||||||
def _build_suppressed_message(
|
|
||||||
self,
|
|
||||||
message: AIMessage,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
) -> AIMessage:
|
|
||||||
suppressed_names = [tc.get("name") or "unknown" for tc in (message.tool_calls or [])]
|
|
||||||
explanation = _USER_FACING_MESSAGE.format(
|
|
||||||
reason_field=termination.reason_field,
|
|
||||||
reason_value=termination.reason_value,
|
|
||||||
detector=termination.detector,
|
|
||||||
)
|
|
||||||
new_content = self._append_user_message(message.content, explanation)
|
|
||||||
|
|
||||||
# clone_ai_message_with_tool_calls handles structured tool_calls,
|
|
||||||
# raw additional_kwargs.tool_calls, and function_call in one shot.
|
|
||||||
# It only rewrites finish_reason when the old value was "tool_calls",
|
|
||||||
# which is not our case — content_filter / refusal / SAFETY stay put
|
|
||||||
# so downstream SSE / converters keep seeing the real provider reason.
|
|
||||||
cleared = clone_ai_message_with_tool_calls(message, [], content=new_content)
|
|
||||||
|
|
||||||
# Re-clone additional_kwargs so we don't accidentally mutate the
|
|
||||||
# dict returned by clone_ai_message_with_tool_calls (which already
|
|
||||||
# made a shallow copy, but downstream model_copy still references
|
|
||||||
# it). Then stamp the observability record.
|
|
||||||
kwargs = dict(getattr(cleared, "additional_kwargs", None) or {})
|
|
||||||
kwargs["safety_termination"] = {
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(suppressed_names),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"extras": dict(termination.extras) if termination.extras else {},
|
|
||||||
}
|
|
||||||
return cleared.model_copy(update={"additional_kwargs": kwargs})
|
|
||||||
|
|
||||||
# ----- observability ---------------------------------------------------
|
|
||||||
|
|
||||||
def _emit_event(
|
|
||||||
self,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
suppressed_names: list[str],
|
|
||||||
runtime: Runtime,
|
|
||||||
) -> None:
|
|
||||||
"""Notify SSE consumers (e.g. the web UI) that a tool turn was
|
|
||||||
suppressed so they can reconcile any "tool starting..." placeholders
|
|
||||||
already streamed to the user. Failures are logged at debug and
|
|
||||||
ignored — this is a best-effort signal."""
|
|
||||||
try:
|
|
||||||
from langgraph.config import get_stream_writer
|
|
||||||
|
|
||||||
writer = get_stream_writer()
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
logger.debug("get_stream_writer unavailable; skipping safety_termination event", exc_info=True)
|
|
||||||
return
|
|
||||||
|
|
||||||
thread_id = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
thread_id = runtime.context.get("thread_id") if isinstance(runtime.context, dict) else None
|
|
||||||
|
|
||||||
try:
|
|
||||||
writer(
|
|
||||||
{
|
|
||||||
"type": "safety_termination",
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(suppressed_names),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"thread_id": thread_id,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
logger.debug("Failed to emit safety_termination stream event", exc_info=True)
|
|
||||||
|
|
||||||
def _record_audit_event(
|
|
||||||
self,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
message,
|
|
||||||
tool_calls: list[dict],
|
|
||||||
runtime: Runtime,
|
|
||||||
) -> None:
|
|
||||||
"""Write a ``middleware:safety_termination`` record to RunEventStore
|
|
||||||
for post-run auditability.
|
|
||||||
|
|
||||||
The custom stream event in ``_emit_event`` is consumed by live SSE
|
|
||||||
clients and disappears after the run; this event is persisted so an
|
|
||||||
operator can answer "which runs were safety-suppressed today?" from
|
|
||||||
a single SQL query without joining the message body. Worker exposes
|
|
||||||
the run-scoped ``RunJournal`` via ``runtime.context["__run_journal"]``;
|
|
||||||
absent in unit-test / subagent / no-event-store paths, in which case
|
|
||||||
we silently skip.
|
|
||||||
|
|
||||||
Tool **arguments** are deliberately **not** recorded — those are the
|
|
||||||
very content the provider filtered; persisting them would defeat the
|
|
||||||
purpose of the safety filter. Names / count / ids are sufficient for
|
|
||||||
audit and debugging (issue #3028 review).
|
|
||||||
"""
|
|
||||||
journal = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
context = runtime.context
|
|
||||||
if isinstance(context, dict):
|
|
||||||
journal = context.get("__run_journal")
|
|
||||||
if journal is None:
|
|
||||||
return
|
|
||||||
|
|
||||||
suppressed_names = [tc.get("name") or "unknown" for tc in tool_calls]
|
|
||||||
suppressed_ids = [tc.get("id") for tc in tool_calls if tc.get("id")]
|
|
||||||
|
|
||||||
changes = {
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(tool_calls),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"suppressed_tool_call_ids": suppressed_ids,
|
|
||||||
"message_id": getattr(message, "id", None),
|
|
||||||
"extras": dict(termination.extras) if termination.extras else {},
|
|
||||||
}
|
|
||||||
|
|
||||||
try:
|
|
||||||
journal.record_middleware(
|
|
||||||
tag="safety_termination",
|
|
||||||
name=type(self).__name__,
|
|
||||||
hook="after_model",
|
|
||||||
action="suppress_tool_calls",
|
|
||||||
changes=changes,
|
|
||||||
)
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
# Audit-event persistence must never break agent execution.
|
|
||||||
logger.debug("Failed to record middleware:safety_termination event", exc_info=True)
|
|
||||||
|
|
||||||
# ----- main apply ------------------------------------------------------
|
|
||||||
|
|
||||||
def _apply(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
messages = state.get("messages", [])
|
|
||||||
if not messages:
|
|
||||||
return None
|
|
||||||
|
|
||||||
last = messages[-1]
|
|
||||||
if not isinstance(last, AIMessage):
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Issue scope: only intervene when there's something to suppress.
|
|
||||||
# ``content_filter`` without tool_calls is allowed through unchanged
|
|
||||||
# so the partial text response (if any) reaches the user naturally.
|
|
||||||
tool_calls = last.tool_calls
|
|
||||||
if not tool_calls:
|
|
||||||
return None
|
|
||||||
|
|
||||||
termination = self._detect(last)
|
|
||||||
if termination is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
patched = self._build_suppressed_message(last, termination)
|
|
||||||
|
|
||||||
thread_id = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
thread_id = runtime.context.get("thread_id") if isinstance(runtime.context, dict) else None
|
|
||||||
|
|
||||||
logger.warning(
|
|
||||||
"Provider safety termination detected — suppressed %d tool call(s)",
|
|
||||||
len(tool_calls),
|
|
||||||
extra={
|
|
||||||
"thread_id": thread_id,
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_names": [tc.get("name") for tc in tool_calls],
|
|
||||||
},
|
|
||||||
)
|
|
||||||
|
|
||||||
self._emit_event(termination, [tc.get("name") or "unknown" for tc in tool_calls], runtime)
|
|
||||||
self._record_audit_event(termination, last, list(tool_calls), runtime)
|
|
||||||
|
|
||||||
return {"messages": [patched]}
|
|
||||||
|
|
||||||
# ----- hooks -----------------------------------------------------------
|
|
||||||
|
|
||||||
@override
|
|
||||||
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
return self._apply(state, runtime)
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
return self._apply(state, runtime)
|
|
||||||
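The file above strips tool calls from a safety-terminated response, appends an explanation without flattening list-shaped content, and records what was suppressed. A rough sketch of that rewrite on plain dicts follows. `append_explanation` mirrors the branching of `_append_user_message` as shown in the diff; `suppress_tool_calls`, the dict message shape, and the shortened explanation text are assumptions made for illustration and do not use `AIMessage` or `clone_ai_message_with_tool_calls`.

```python
from typing import Union


def append_explanation(content: object, text: str) -> Union[str, list]:
    """Append text to message content while preserving list-shaped content."""
    if content is None or content == "":
        return text
    if isinstance(content, list):
        # Block-structured content keeps its blocks and gains one text block.
        return [*content, {"type": "text", "text": f"\n\n{text}"}]
    if isinstance(content, str):
        return content + f"\n\n{text}"
    return str(content) + f"\n\n{text}"


def suppress_tool_calls(message: dict, detector: str, reason_field: str, reason_value: str) -> dict:
    """Return a copy of message with tool calls removed and an audit record attached."""
    tool_calls = message.get("tool_calls") or []
    # Only names are kept; arguments are the content the provider filtered.
    names = [tc.get("name") or "unknown" for tc in tool_calls]
    explanation = (
        f"Stopped by provider ({reason_field}={reason_value!r}); "
        "tool calls were suppressed."
    )
    kwargs = dict(message.get("additional_kwargs") or {})
    kwargs["safety_termination"] = {
        "detector": detector,
        "reason_field": reason_field,
        "reason_value": reason_value,
        "suppressed_tool_call_count": len(names),
        "suppressed_tool_call_names": names,
    }
    return {
        **message,
        "content": append_explanation(message.get("content"), explanation),
        "tool_calls": [],
        "additional_kwargs": kwargs,
    }
```

The input message is left untouched, so callers that still hold the original can compare it against the suppressed copy.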
@@ -1,237 +0,0 @@
|
|||||||
"""Detectors for provider-side safety termination signals.
|
|
||||||
|
|
||||||
Different LLM providers signal "I stopped this response for safety reasons"
|
|
||||||
through different fields with different values. This module defines a small
|
|
||||||
strategy interface and three built-in detectors that cover the major
|
|
||||||
providers DeerFlow supports today. New providers (Wenxin, Hunyuan, Bedrock
|
|
||||||
adapters, in-house gateways, ...) can be added by implementing
|
|
||||||
``SafetyTerminationDetector`` and wiring it through
|
|
||||||
``config.yaml: safety_finish_reason.detectors``.
|
|
||||||
|
|
||||||
The middleware that consumes these detectors lives in
|
|
||||||
``safety_finish_reason_middleware.py``.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any, Protocol, runtime_checkable
|
|
||||||
|
|
||||||
from langchain_core.messages import AIMessage
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class SafetyTermination:
|
|
||||||
"""A detected safety-related termination signal.
|
|
||||||
|
|
||||||
Attributes:
|
|
||||||
detector: Name of the detector that produced this result. Used for
|
|
||||||
observability so operators can see which provider rule fired.
|
|
||||||
reason_field: The message metadata field that carried the signal
|
|
||||||
(e.g. ``finish_reason``, ``stop_reason``).
|
|
||||||
reason_value: The actual value of that field
|
|
||||||
(e.g. ``content_filter``, ``refusal``, ``SAFETY``).
|
|
||||||
extras: Provider-specific metadata that may help downstream
|
|
||||||
consumers (e.g. Azure OpenAI content_filter_results, Gemini
|
|
||||||
safety_ratings). Detectors are free to populate or skip this.
|
|
||||||
"""
|
|
||||||
|
|
||||||
detector: str
|
|
||||||
reason_field: str
|
|
||||||
reason_value: str
|
|
||||||
extras: dict[str, Any] = field(default_factory=dict)
|
|
||||||
|
|
||||||
|
|
||||||
@runtime_checkable
|
|
||||||
class SafetyTerminationDetector(Protocol):
|
|
||||||
"""Strategy interface for provider safety termination detection."""
|
|
||||||
|
|
||||||
name: str
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
"""Return a SafetyTermination if *message* indicates provider safety
|
|
||||||
termination, otherwise return ``None``.
|
|
||||||
|
|
||||||
Implementations must be side-effect free and tolerant of missing or
|
|
||||||
oddly-typed metadata — detectors run on every model response.
|
|
||||||
"""
|
|
||||||
...
|
|
||||||
|
|
||||||
|
|
||||||
def _get_metadata_value(message: AIMessage, field_name: str) -> str | None:
|
|
||||||
"""Read a string-typed value from either ``response_metadata`` or
|
|
||||||
``additional_kwargs``.
|
|
||||||
|
|
||||||
LangChain provider adapters are inconsistent about where they stash
|
|
||||||
provider stop signals. Most modern adapters use ``response_metadata``,
|
|
||||||
but some legacy / passthrough paths still surface them via
|
|
||||||
``additional_kwargs``. We check both, in that order, and only accept
|
|
||||||
string values — Pydantic enums or dicts are ignored so we never raise
|
|
||||||
on malformed inputs.
|
|
||||||
"""
|
|
||||||
for container_name in ("response_metadata", "additional_kwargs"):
|
|
||||||
container = getattr(message, container_name, None) or {}
|
|
||||||
if not isinstance(container, dict):
|
|
||||||
continue
|
|
||||||
value = container.get(field_name)
|
|
||||||
if isinstance(value, str) and value:
|
|
||||||
return value
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
class OpenAICompatibleContentFilterDetector:
|
|
||||||
"""OpenAI-compatible content_filter signal.
|
|
||||||
|
|
||||||
Covers OpenAI, Azure OpenAI, Moonshot/Kimi, DeepSeek, Mistral, vLLM,
|
|
||||||
Qwen (OpenAI-compatible mode), and any other adapter that follows the
|
|
||||||
OpenAI ``finish_reason`` convention.
|
|
||||||
|
|
||||||
Some Chinese providers ship custom OpenAI-compatible gateways that use
|
|
||||||
alternative tokens like ``sensitive`` or ``violation``. Extend the set
|
|
||||||
via the ``finish_reasons`` kwarg in config.
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "openai_compatible_content_filter"
|
|
||||||
|
|
||||||
def __init__(self, finish_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = finish_reasons if finish_reasons is not None else ("content_filter",)
|
|
||||||
self._finish_reasons: frozenset[str] = frozenset(r.lower() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "finish_reason")
|
|
||||||
if value is None or value.lower() not in self._finish_reasons:
|
|
||||||
return None
|
|
||||||
|
|
||||||
extras: dict[str, Any] = {}
|
|
||||||
# Azure OpenAI ships a structured content_filter_results block; carry it
|
|
||||||
# through so operators can see *what* was filtered without re-tracing.
|
|
||||||
response_metadata = getattr(message, "response_metadata", None) or {}
|
|
||||||
if isinstance(response_metadata, dict):
|
|
||||||
filter_results = response_metadata.get("content_filter_results")
|
|
||||||
if filter_results:
|
|
||||||
extras["content_filter_results"] = filter_results
|
|
||||||
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="finish_reason",
|
|
||||||
reason_value=value,
|
|
||||||
extras=extras,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class AnthropicRefusalDetector:
|
|
||||||
"""Anthropic ``stop_reason == "refusal"`` signal.
|
|
||||||
|
|
||||||
Anthropic models surface safety refusals via a dedicated ``stop_reason``
|
|
||||||
rather than ``finish_reason``. See:
|
|
||||||
https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "anthropic_refusal"
|
|
||||||
|
|
||||||
def __init__(self, stop_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = stop_reasons if stop_reasons is not None else ("refusal",)
|
|
||||||
self._stop_reasons: frozenset[str] = frozenset(r.lower() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "stop_reason")
|
|
||||||
if value is None or value.lower() not in self._stop_reasons:
|
|
||||||
return None
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="stop_reason",
|
|
||||||
reason_value=value,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class GeminiSafetyDetector:
|
|
||||||
"""Gemini / Vertex AI safety-related finish reasons.
|
|
||||||
|
|
||||||
Gemini uses the same ``finish_reason`` field as OpenAI but with an
|
|
||||||
enumerated upper-case taxonomy. The default set covers every Gemini
|
|
||||||
finish_reason that means "the model stopped because the content/image
|
|
||||||
tripped a safety, blocklist, recitation, or PII filter" — i.e. cases
|
|
||||||
where any tool_calls returned alongside are likely truncated/
|
|
||||||
unreliable. Full enum:
|
|
||||||
https://docs.cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.Candidate.FinishReason
|
|
||||||
|
|
||||||
Intentionally **excluded** from the default set:
|
|
||||||
- ``STOP`` — normal termination.
|
|
||||||
- ``MAX_TOKENS`` — output length truncation, not safety
|
|
||||||
(same root failure mode as
|
|
||||||
content_filter, but issue #3028
|
|
||||||
scopes it out; expose separately if
|
|
||||||
desired).
|
|
||||||
- ``LANGUAGE`` / ``NO_IMAGE`` — capability mismatches, unrelated to
|
|
||||||
safety; tool_calls would be absent
|
|
||||||
anyway.
|
|
||||||
- ``MALFORMED_FUNCTION_CALL`` /
|
|
||||||
``UNEXPECTED_TOOL_CALL`` — tool-call protocol errors. The
|
|
||||||
tool_calls are *also* unreliable
|
|
||||||
here, but the failure category is
|
|
||||||
distinct from safety filtering;
|
|
||||||
handle in a dedicated detector to
|
|
||||||
keep observability records honest.
|
|
||||||
- ``OTHER`` / ``IMAGE_OTHER`` /
|
|
||||||
``FINISH_REASON_UNSPECIFIED`` — too broad to enable by default;
|
|
||||||
opt in via ``finish_reasons=`` if
|
|
||||||
your provider abuses these.
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "gemini_safety"
|
|
||||||
|
|
||||||
_DEFAULT_FINISH_REASONS = (
|
|
||||||
# Text safety
|
|
||||||
"SAFETY",
|
|
||||||
"BLOCKLIST",
|
|
||||||
"PROHIBITED_CONTENT",
|
|
||||||
"SPII",
|
|
||||||
"RECITATION",
|
|
||||||
# Image safety (multimodal generation)
|
|
||||||
"IMAGE_SAFETY",
|
|
||||||
"IMAGE_PROHIBITED_CONTENT",
|
|
||||||
"IMAGE_RECITATION",
|
|
||||||
)
|
|
||||||
|
|
||||||
def __init__(self, finish_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = finish_reasons if finish_reasons is not None else self._DEFAULT_FINISH_REASONS
|
|
||||||
self._finish_reasons: frozenset[str] = frozenset(r.upper() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "finish_reason")
|
|
||||||
if value is None or value.upper() not in self._finish_reasons:
|
|
||||||
return None
|
|
||||||
|
|
||||||
extras: dict[str, Any] = {}
|
|
||||||
response_metadata = getattr(message, "response_metadata", None) or {}
|
|
||||||
if isinstance(response_metadata, dict):
|
|
||||||
# Gemini surfaces per-category scoring under safety_ratings.
|
|
||||||
ratings = response_metadata.get("safety_ratings")
|
|
||||||
if ratings:
|
|
||||||
extras["safety_ratings"] = ratings
|
|
||||||
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="finish_reason",
|
|
||||||
reason_value=value,
|
|
||||||
extras=extras,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def default_detectors() -> list[SafetyTerminationDetector]:
|
|
||||||
"""Built-in detector set used when no custom detectors are configured."""
|
|
||||||
return [
|
|
||||||
OpenAICompatibleContentFilterDetector(),
|
|
||||||
AnthropicRefusalDetector(),
|
|
||||||
GeminiSafetyDetector(),
|
|
||||||
]
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"AnthropicRefusalDetector",
|
|
||||||
"GeminiSafetyDetector",
|
|
||||||
"OpenAICompatibleContentFilterDetector",
|
|
||||||
"SafetyTermination",
|
|
||||||
"SafetyTerminationDetector",
|
|
||||||
"default_detectors",
|
|
||||||
]
|
|
||||||
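The detector module above describes a small strategy interface: an object with a `name` and a `detect(message)` method that reads a stop signal from `response_metadata` first and `additional_kwargs` second. A minimal sketch of a custom detector in that shape follows, assuming a gateway that reports `sensitive` or `violation` as the docstring mentions. `FakeMessage`, `GatewaySensitiveDetector`, and the dict return value are hypothetical stand-ins so the example runs without LangChain; the real interface takes an `AIMessage` and returns a `SafetyTermination`.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeMessage:
    """Hypothetical stand-in for AIMessage with only the fields the lookup reads."""

    response_metadata: Any = field(default_factory=dict)
    additional_kwargs: Any = field(default_factory=dict)


def get_metadata_value(message: FakeMessage, field_name: str) -> Optional[str]:
    # response_metadata is checked first; additional_kwargs is the fallback.
    for container_name in ("response_metadata", "additional_kwargs"):
        container = getattr(message, container_name, None) or {}
        if not isinstance(container, dict):
            continue
        value = container.get(field_name)
        # Only non-empty strings count, so odd metadata never raises.
        if isinstance(value, str) and value:
            return value
    return None


class GatewaySensitiveDetector:
    """Hypothetical detector for a gateway that reports 'sensitive' or 'violation'."""

    name = "gateway_sensitive"

    def __init__(self, finish_reasons=("sensitive", "violation")) -> None:
        self._finish_reasons = frozenset(r.lower() for r in finish_reasons)

    def detect(self, message: FakeMessage) -> Optional[dict]:
        value = get_metadata_value(message, "finish_reason")
        if value is None or value.lower() not in self._finish_reasons:
            return None
        # A dict stands in for SafetyTermination in this sketch.
        return {
            "detector": self.name,
            "reason_field": "finish_reason",
            "reason_value": value,
        }
```

Matching is case-insensitive but the original value is reported unchanged, which keeps the observability record faithful to what the provider actually sent.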
@@ -1,289 +0,0 @@
-"""Middleware for explicit slash skill activation."""
-
-from __future__ import annotations
-
-import asyncio
-import hashlib
-import html
-import logging
-import uuid
-from collections.abc import Awaitable, Callable
-from dataclasses import dataclass
-from pathlib import Path
-from typing import TYPE_CHECKING, override
-
-from langchain.agents.middleware import AgentMiddleware
-from langchain.agents.middleware.types import ModelRequest, ModelResponse
-from langchain_core.messages import AIMessage, HumanMessage
-
-from deerflow.skills.slash import parse_slash_skill_reference, resolve_slash_skill
-from deerflow.skills.storage import get_or_new_skill_storage
-from deerflow.skills.storage.skill_storage import SkillStorage
-from deerflow.skills.types import SKILL_MD_FILE
-from deerflow.utils.messages import get_original_user_content_text
-
-if TYPE_CHECKING:
-    from deerflow.config.app_config import AppConfig
-
-logger = logging.getLogger(__name__)
-
-_SLASH_SKILL_ACTIVATION_KEY = "slash_skill_activation"
-_SLASH_SKILL_ACTIVATION_TARGET_ID_KEY = "slash_skill_activation_target_id"
-_SUMMARY_MESSAGE_NAME = "summary"
-
-
-@dataclass(frozen=True, slots=True)
-class _Activation:
-    skill_name: str
-    category: str
-    container_file_path: str
-    skill_content: str
-    content_hash: str
-    remaining_text: str
-
-
-@dataclass(frozen=True, slots=True)
-class _ActivationResolution:
-    activation: _Activation | None = None
-    failure_message: str | None = None
-
-
-def is_slash_skill_activation_reminder(message: object) -> bool:
-    """Return whether a message is hidden slash-skill activation context."""
-    return isinstance(message, HumanMessage) and bool(message.additional_kwargs.get(_SLASH_SKILL_ACTIVATION_KEY))
-
-
-def _is_user_activation_target(message: object) -> bool:
-    if not isinstance(message, HumanMessage):
-        return False
-    if message.name == _SUMMARY_MESSAGE_NAME:
-        return False
-    if message.additional_kwargs.get("hide_from_ui"):
-        return False
-    return True
-
-
-class SkillActivationMiddleware(AgentMiddleware):
-    """Inject full SKILL.md content when the user explicitly types /skill-name."""
-
-    def __init__(
-        self,
-        *,
-        available_skills: set[str] | None = None,
-        app_config: AppConfig | None = None,
-    ) -> None:
-        super().__init__()
-        self._available_skills = set(available_skills) if available_skills is not None else None
-        self._app_config = app_config
-
-    def _storage(self) -> SkillStorage:
-        if self._app_config is not None:
-            return get_or_new_skill_storage(app_config=self._app_config)
-        return get_or_new_skill_storage()
-
-    @staticmethod
-    def _read_skill_content(skill_file: Path, skills_root: Path) -> str:
-        if skill_file.name != SKILL_MD_FILE:
-            raise ValueError(f"Expected {SKILL_MD_FILE}, got {skill_file.name}")
-        resolved_root = skills_root.resolve()
-        resolved_file = skill_file.resolve()
-        try:
-            resolved_file.relative_to(resolved_root)
-        except ValueError as exc:
-            raise ValueError("Resolved skill file must stay within the configured skills root.") from exc
-        if not resolved_file.is_file():
-            raise FileNotFoundError(resolved_file)
-        return resolved_file.read_text(encoding="utf-8")
-
-    def _resolve_activation(self, text: str) -> _ActivationResolution | None:
-        reference = parse_slash_skill_reference(text)
-        if reference is None:
-            return None
-
-        storage = self._storage()
-        skills = storage.load_skills(enabled_only=False)
-        skill = next((candidate for candidate in skills if candidate.name == reference.name), None)
-        if skill is None:
-            return _ActivationResolution(failure_message=f"Skill `/{reference.name}` is not installed.")
-        if not skill.enabled:
-            return _ActivationResolution(failure_message=f"Skill `/{reference.name}` is installed but disabled. Enable it before using slash activation.")
-        if self._available_skills is not None and reference.name not in self._available_skills:
-            return _ActivationResolution(failure_message=f"Skill `/{reference.name}` is not available for this agent.")
-
-        resolved = resolve_slash_skill(
-            text,
-            skills,
-            available_skills=self._available_skills,
-            container_base_path=storage.get_container_root(),
-        )
-        if resolved is None:
-            return _ActivationResolution(failure_message=f"Skill `/{reference.name}` could not be resolved.")
-
-        try:
-            skill_content = self._read_skill_content(resolved.skill.skill_file, storage.get_skills_root_path())
-        except (OSError, ValueError):
-            logger.exception("Failed to read slash-activated skill %s", resolved.skill.name)
-            return _ActivationResolution(failure_message=f"Skill `/{reference.name}` could not be loaded safely. Please check the skill installation.")
-
-        content_hash = hashlib.sha256(skill_content.encode("utf-8")).hexdigest()
-        return _ActivationResolution(
-            activation=_Activation(
-                skill_name=resolved.skill.name,
-                category=str(resolved.skill.category),
-                container_file_path=resolved.container_file_path,
-                skill_content=skill_content,
-                content_hash=content_hash,
-                remaining_text=resolved.remaining_text,
-            )
-        )
-
-    @staticmethod
-    def _build_activation_reminder(activation: _Activation) -> str:
-        user_request = activation.remaining_text or ("No additional task text was provided after the slash skill command. Ask the user what they want to do with this skill if the next step is unclear.")
-        escaped_user_request = html.escape(user_request, quote=False)
-        escaped_skill_content = html.escape(activation.skill_content, quote=False)
-        escaped_skill_name = html.escape(activation.skill_name, quote=True)
-        escaped_category = html.escape(activation.category, quote=True)
-        escaped_path = html.escape(activation.container_file_path, quote=True)
-        escaped_content_hash = html.escape(activation.content_hash, quote=True)
-        return f"""<slash_skill_activation>
-The user explicitly activated the `{activation.skill_name}` skill for this turn.
-Treat the task text as:
-<user_request>
-{escaped_user_request}
-</user_request>
-
-Follow this skill before choosing a general workflow. Load supporting resources from the same skill directory only when needed.
-
-<skill name="{escaped_skill_name}" category="{escaped_category}" path="{escaped_path}" sha256="{escaped_content_hash}">
-<skill_content encoding="xml-escaped">
-{escaped_skill_content}
-</skill_content>
-</skill>
-</slash_skill_activation>"""
-
-    @staticmethod
-    def _has_existing_activation_for_target(messages: list, target_index: int, target: HumanMessage) -> bool:
-        if target_index <= 0:
-            return False
-
-        if target.id:
-            for previous in messages[:target_index]:
-                if not is_slash_skill_activation_reminder(previous):
-                    continue
-                target_id = previous.additional_kwargs.get(_SLASH_SKILL_ACTIVATION_TARGET_ID_KEY)
-                if target_id == target.id or previous.id == f"{target.id}__slash_activation":
-                    return True
-
-        previous = messages[target_index - 1]
-        return is_slash_skill_activation_reminder(previous)
-
-    def _find_activation_target(self, messages: list) -> tuple[int, HumanMessage, _ActivationResolution] | None:
-        if not messages:
-            return None
-
-        target_index = next((idx for idx in range(len(messages) - 1, -1, -1) if _is_user_activation_target(messages[idx])), None)
-        if target_index is None:
-            return None
-
-        target = messages[target_index]
-        if target is None:
-            return None
-        if self._has_existing_activation_for_target(messages, target_index, target):
-            return None
-
-        content = get_original_user_content_text(target.content, target.additional_kwargs)
-        resolution = self._resolve_activation(content)
-        if resolution is None:
-            return None
-        return target_index, target, resolution
-
-    @staticmethod
-    def _record_activation(request: ModelRequest, activation: _Activation, *, hook: str) -> None:
-        runtime = getattr(request, "runtime", None)
-        context = getattr(runtime, "context", None)
-        journal = context.get("__run_journal") if isinstance(context, dict) else None
-        if journal is None:
-            return
-        try:
-            journal.record_middleware(
-                "skill_activation",
-                name="SkillActivationMiddleware",
-                hook=hook,
-                action="activate",
-                changes={
-                    "skill_name": activation.skill_name,
-                    "category": activation.category,
-                    "path": activation.container_file_path,
-                    "content_hash": activation.content_hash,
-                },
-            )
-        except Exception:
-            logger.debug("Failed to record slash skill activation audit event", exc_info=True)
-
-    def _prepare_model_request(self, request: ModelRequest, *, hook: str) -> ModelRequest | AIMessage | None:
-        target_and_resolution = self._find_activation_target(list(request.messages))
-        if target_and_resolution is None:
-            return None
-
-        target_index, target, resolution = target_and_resolution
-        if resolution.failure_message:
-            return AIMessage(content=resolution.failure_message)
-
-        activation = resolution.activation
-        if activation is None:
-            return None
-
-        logger.info(
-            "SkillActivationMiddleware: activating slash skill %s category=%s path=%s hash=%s",
-            activation.skill_name,
-            activation.category,
-            activation.container_file_path,
-            activation.content_hash,
-        )
-        self._record_activation(request, activation, hook=hook)
-        activation_msg = self._make_activation_message(target, self._build_activation_reminder(activation))
-        messages = list(request.messages)
-        messages.insert(target_index, activation_msg)
-        return request.override(messages=messages)
-
-    @staticmethod
-    def _make_activation_message(target: HumanMessage, activation_content: str) -> HumanMessage:
-        stable_id = target.id or str(uuid.uuid4())
-        additional_kwargs = {
-            "hide_from_ui": True,
-            _SLASH_SKILL_ACTIVATION_KEY: True,
-        }
-        if target.id:
-            additional_kwargs[_SLASH_SKILL_ACTIVATION_TARGET_ID_KEY] = target.id
-        return HumanMessage(
-            content=activation_content,
-            id=f"{stable_id}__slash_activation",
-            additional_kwargs=additional_kwargs,
-        )
-
-    @override
-    def wrap_model_call(
-        self,
-        request: ModelRequest,
-        handler: Callable[[ModelRequest], ModelResponse],
-    ) -> ModelResponse | AIMessage:
-        prepared = self._prepare_model_request(request, hook="wrap_model_call")
-        if prepared is None:
-            return handler(request)
-        if isinstance(prepared, AIMessage):
-            return prepared
-        return handler(prepared)
-
-    @override
-    async def awrap_model_call(
-        self,
-        request: ModelRequest,
-        handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
-    ) -> ModelResponse | AIMessage:
-        prepared = await asyncio.to_thread(self._prepare_model_request, request, hook="awrap_model_call")
-        if prepared is None:
-            return await handler(request)
-        if isinstance(prepared, AIMessage):
-            return prepared
-        return await handler(prepared)
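The removed `_read_skill_content` guards against path traversal by resolving both the skill file and the skills root, then requiring the file to sit under the root. The sketch below isolates that check using only the standard library. The function name `read_within_root`, the `demo` helper, and the directory layout are illustrative assumptions, not part of the project.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def read_within_root(candidate: Path, root: Path) -> str:
    """Read a file only if its fully resolved path stays under the resolved root."""
    resolved_root = root.resolve()
    resolved_file = candidate.resolve()  # collapses ".." segments and follows symlinks
    try:
        # relative_to raises ValueError when resolved_file is not under resolved_root.
        resolved_file.relative_to(resolved_root)
    except ValueError as exc:
        raise ValueError("file escapes the configured root") from exc
    if not resolved_file.is_file():
        raise FileNotFoundError(resolved_file)
    return resolved_file.read_text(encoding="utf-8")


def demo() -> tuple[str, bool]:
    """Return (content read from inside the root, whether the escape attempt was rejected)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        root = base / "skills"
        (root / "demo").mkdir(parents=True)
        (root / "demo" / "SKILL.md").write_text("hello", encoding="utf-8")
        (base / "secret.md").write_text("outside", encoding="utf-8")

        inside = read_within_root(root / "demo" / "SKILL.md", root)
        try:
            read_within_root(root / "demo" / ".." / ".." / "secret.md", root)
            rejected = False
        except ValueError:
            rejected = True
        return inside, rejected
```

Resolving before comparing matters: a purely textual prefix check would accept `skills/demo/../../secret.md`, whereas the resolved path no longer lies under the root.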
@@ -7,7 +7,6 @@ from langchain.agents import AgentState
 from langchain.agents.middleware import AgentMiddleware
 from langgraph.runtime import Runtime
 
-from deerflow.agents.middlewares.tool_call_metadata import clone_ai_message_with_tool_calls
 from deerflow.subagents.executor import MAX_CONCURRENT_SUBAGENTS
 
 logger = logging.getLogger(__name__)
@@ -64,7 +63,7 @@ class SubagentLimitMiddleware(AgentMiddleware[AgentState]):
         logger.warning(f"Truncated {dropped_count} excess task tool call(s) from model response (limit: {self.max_concurrent})")
 
         # Replace the AIMessage with truncated tool_calls (same id triggers replacement)
-        updated_msg = clone_ai_message_with_tool_calls(last_msg, truncated_tool_calls)
+        updated_msg = last_msg.model_copy(update={"tool_calls": truncated_tool_calls})
         return {"messages": [updated_msg]}
 
     @override
@@ -9,15 +9,11 @@ from typing import Any, Protocol, override, runtime_checkable
 
 from langchain.agents import AgentState
 from langchain.agents.middleware import SummarizationMiddleware
-from langchain_core.messages import AIMessage, AnyMessage, HumanMessage, RemoveMessage, ToolMessage, get_buffer_string
+from langchain_core.messages import AIMessage, AnyMessage, HumanMessage, RemoveMessage, ToolMessage
 from langgraph.config import get_config
-from langgraph.constants import TAG_NOSTREAM
 from langgraph.graph.message import REMOVE_ALL_MESSAGES
 from langgraph.runtime import Runtime
 
-from deerflow.agents.middlewares.dynamic_context_middleware import is_dynamic_context_reminder
-from deerflow.agents.middlewares.tool_call_metadata import clone_ai_message_with_tool_calls
-
 logger = logging.getLogger(__name__)
 
 
@@ -82,7 +78,10 @@ def _clone_ai_message(
     content: Any | None = None,
 ) -> AIMessage:
     """Clone an AIMessage while replacing its tool_calls list and optional content."""
-    return clone_ai_message_with_tool_calls(message, tool_calls, content=content)
+    update: dict[str, Any] = {"tool_calls": tool_calls}
+    if content is not None:
+        update["content"] = content
+    return message.model_copy(update=update)
 
 
 @dataclass
@@ -117,74 +116,6 @@ class DeerFlowSummarizationMiddleware(SummarizationMiddleware):
         self._preserve_recent_skill_count = max(0, preserve_recent_skill_count)
         self._preserve_recent_skill_tokens = max(0, preserve_recent_skill_tokens)
         self._preserve_recent_skill_tokens_per_skill = max(0, preserve_recent_skill_tokens_per_skill)
-        # The summary LLM call runs inside a LangGraph middleware hook, so its token
-        # stream would otherwise be captured by the messages-tuple stream callback and
-        # broadcast to the frontend as a phantom AI message. Tag a dedicated model copy
-        # with TAG_NOSTREAM so the streaming handler skips it.
-        # Keep self.model untagged so the parent's profile / ls_params inspection still works.
-        #
-        # Preserve any tags already bound on the model (e.g. "middleware:summarize" set in
-        # lead_agent/agent.py for RunJournal attribution): RunnableBinding.with_config does a
-        # shallow merge that would otherwise overwrite the existing tags list entirely.
-        existing_tags = list((getattr(self.model, "config", None) or {}).get("tags") or [])
-        merged_tags = [*existing_tags, TAG_NOSTREAM] if TAG_NOSTREAM not in existing_tags else existing_tags
-        self._summary_model = self.model.with_config(tags=merged_tags)
-
-    @override
-    def _create_summary(self, messages_to_summarize: list[AnyMessage]) -> str:
-        return self._summarize_with(messages_to_summarize)
-
-    @override
-    async def _acreate_summary(self, messages_to_summarize: list[AnyMessage]) -> str:
-        return await self._asummarize_with(messages_to_summarize)
-
-    def _summarize_with(self, messages_to_summarize: list[AnyMessage]) -> str:
-        """Mirror the parent ``_create_summary`` but invoke the nostream-tagged model.
-
-        We do not swap ``self.model`` at the instance level: the agent/middleware is
-        cached and reused across concurrent runs, so a temporary swap would leak the
-        ``RunnableBinding`` to other coroutines during ``await`` and break parent logic
-        that inspects the raw model (``profile`` / ``_get_ls_params``).
-        """
-        if not messages_to_summarize:
-            return "No previous conversation history."
-        prompt = self._build_summary_prompt(messages_to_summarize)
-        if prompt is None:
-            return "Previous conversation was too long to summarize."
-        try:
-            response = self._summary_model.invoke(
-                prompt,
-                config={"metadata": {"lc_source": "summarization"}},
-            )
-            return response.text.strip()
-        except Exception as e:
-            return f"Error generating summary: {e!s}"
-
-    async def _asummarize_with(self, messages_to_summarize: list[AnyMessage]) -> str:
-        """Async counterpart of :meth:`_summarize_with` using the nostream model."""
-        if not messages_to_summarize:
-            return "No previous conversation history."
-        prompt = self._build_summary_prompt(messages_to_summarize)
-        if prompt is None:
-            return "Previous conversation was too long to summarize."
-        try:
-            response = await self._summary_model.ainvoke(
-                prompt,
-                config={"metadata": {"lc_source": "summarization"}},
-            )
-            return response.text.strip()
-        except Exception as e:
-            return f"Error generating summary: {e!s}"
-
-    def _build_summary_prompt(self, messages_to_summarize: list[AnyMessage]) -> str | None:
-        """Build the summary prompt, returning ``None`` when trimming leaves nothing."""
-        trimmed_messages = self._trim_messages_for_summary(messages_to_summarize)
-        if not trimmed_messages:
-            return None
-        # Format messages to avoid token inflation from metadata when str() is called on
-        # message objects.
-        formatted_messages = get_buffer_string(trimmed_messages)
-        return self.summary_prompt.format(messages=formatted_messages).rstrip()
 
     def before_model(self, state: AgentState, runtime: Runtime) -> dict | None:
         return self._maybe_summarize(state, runtime)
@@ -205,7 +136,6 @@ class DeerFlowSummarizationMiddleware(SummarizationMiddleware):
             return None
 
         messages_to_summarize, preserved_messages = self._partition_with_skill_rescue(messages, cutoff_index)
-        messages_to_summarize, preserved_messages = self._preserve_dynamic_context_reminders(messages_to_summarize, preserved_messages)
         self._fire_hooks(messages_to_summarize, preserved_messages, runtime)
         summary = self._create_summary(messages_to_summarize)
         new_messages = self._build_new_messages(summary)
@@ -231,7 +161,6 @@ class DeerFlowSummarizationMiddleware(SummarizationMiddleware):
             return None
 
         messages_to_summarize, preserved_messages = self._partition_with_skill_rescue(messages, cutoff_index)
-        messages_to_summarize, preserved_messages = self._preserve_dynamic_context_reminders(messages_to_summarize, preserved_messages)
         self._fire_hooks(messages_to_summarize, preserved_messages, runtime)
         summary = await self._acreate_summary(messages_to_summarize)
         new_messages = self._build_new_messages(summary)
@@ -251,24 +180,6 @@ class DeerFlowSummarizationMiddleware(SummarizationMiddleware):
         """
         return [HumanMessage(content=f"Here is a summary of the conversation to date:\n\n{summary}", name="summary")]
 
-    def _preserve_dynamic_context_reminders(
-        self,
-        messages_to_summarize: list[AnyMessage],
-        preserved_messages: list[AnyMessage],
-    ) -> tuple[list[AnyMessage], list[AnyMessage]]:
-        """Keep hidden dynamic-context reminders out of summary compression.
-
-        These reminders carry the current date and optional memory. If summarization
-        removes them, DynamicContextMiddleware can mistake the summary HumanMessage
-        for the first user message and inject the reminder in the wrong place.
-        """
-        reminders = [msg for msg in messages_to_summarize if is_dynamic_context_reminder(msg)]
-        if not reminders:
-            return messages_to_summarize, preserved_messages
-
-        remaining = [msg for msg in messages_to_summarize if not is_dynamic_context_reminder(msg)]
-        return remaining, reminders + preserved_messages
-
     def _partition_with_skill_rescue(
         self,
         messages: list[AnyMessage],
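The comment removed from `DeerFlowSummarizationMiddleware.__init__` explains why the no-stream tag was appended to any existing tags instead of being passed on its own: per that comment, a shallow config merge replaces the whole `tags` list. The sketch below models only that behaviour with plain dicts. `shallow_merge` is a simplified stand-in and not LangChain's implementation, and the string value of `NOSTREAM_TAG` is a placeholder for `langgraph.constants.TAG_NOSTREAM`.

```python
from __future__ import annotations

NOSTREAM_TAG = "nostream"  # placeholder value; the diff imports TAG_NOSTREAM from langgraph.constants


def merge_tags(existing: list[str] | None, extra: str = NOSTREAM_TAG) -> list[str]:
    """Return a new list with the existing tags kept and `extra` appended once."""
    tags = list(existing or [])
    if extra not in tags:
        tags.append(extra)
    return tags


def shallow_merge(base: dict, override: dict) -> dict:
    """Model of a shallow config merge: a key in `override` replaces the base value wholesale."""
    return {**base, **override}


bound_config = {"tags": ["middleware:summarize"]}

# Passing only the new tag drops the attribution tag that was already bound.
naive = shallow_merge(bound_config, {"tags": [NOSTREAM_TAG]})

# Merging first keeps both tags, which is what the removed code did.
safe = shallow_merge(bound_config, {"tags": merge_tags(bound_config.get("tags"))})
```

`merge_tags` is idempotent, so applying it to an already tagged config leaves the list unchanged rather than duplicating the tag.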
@@ -9,7 +9,6 @@ from langchain.agents.middleware import AgentMiddleware
 from langgraph.config import get_config
 from langgraph.runtime import Runtime
 
-from deerflow.agents.middlewares.dynamic_context_middleware import is_dynamic_context_reminder
 from deerflow.config.title_config import get_title_config
 from deerflow.models import create_chat_model
 
@@ -62,10 +61,6 @@ class TitleMiddleware(AgentMiddleware[TitleMiddlewareState]):
 
         return ""
 
-    @staticmethod
-    def _is_user_message_for_title(message: object) -> bool:
-        return getattr(message, "type", None) == "human" and not is_dynamic_context_reminder(message)
-
     def _should_generate_title(self, state: TitleMiddlewareState) -> bool:
         """Check if we should generate a title for this thread."""
         config = self._get_title_config()
@@ -82,7 +77,7 @@ class TitleMiddleware(AgentMiddleware[TitleMiddlewareState]):
             return False
 
         # Count user and assistant messages
-        user_messages = [m for m in messages if self._is_user_message_for_title(m)]
+        user_messages = [m for m in messages if m.type == "human"]
         assistant_messages = [m for m in messages if m.type == "ai"]
 
         # Generate title after first complete exchange
@@ -96,7 +91,7 @@ class TitleMiddleware(AgentMiddleware[TitleMiddlewareState]):
         config = self._get_title_config()
         messages = state.get("messages", [])
 
-        user_msg_content = next((m.content for m in messages if self._is_user_message_for_title(m)), "")
+        user_msg_content = next((m.content for m in messages if m.type == "human"), "")
         assistant_msg_content = next((m.content for m in messages if m.type == "ai"), "")
 
         user_msg = self._normalize_content(user_msg_content)
@@ -160,11 +155,7 @@ class TitleMiddleware(AgentMiddleware[TitleMiddlewareState]):
         prompt, user_msg = self._build_title_prompt(state)
 
         try:
-            # attach_tracing=False because ``_get_runnable_config()`` inherits
-            # the graph-level RunnableConfig (set in ``_make_lead_agent``) whose
-            # callbacks already carry tracing handlers; binding them again at
-            # the model level would emit duplicate spans.
-            model_kwargs = {"thinking_enabled": False, "attach_tracing": False}
+            model_kwargs = {"thinking_enabled": False}
             if self._app_config is not None:
                 model_kwargs["app_config"] = self._app_config
             if config.model_name:
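The `TitleMiddleware` hunks swap a filtered check (`_is_user_message_for_title`, which skips dynamic-context reminders) for a plain `m.type == "human"` test. The sketch below shows how the two selections can differ when a hidden reminder precedes the real user message. The `Msg` class and the `dynamic_context_reminder` marker key are illustrative assumptions; the real `is_dynamic_context_reminder` helper is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Msg:
    """Minimal stand-in for a chat message."""

    type: str
    content: str
    additional_kwargs: dict = field(default_factory=dict)


def is_reminder(message: Msg) -> bool:
    # Hypothetical marker key, used only for this illustration.
    return bool(message.additional_kwargs.get("dynamic_context_reminder"))


def first_user_content(messages: list[Msg], *, skip_reminders: bool) -> str:
    """Return the content of the first human message, optionally ignoring hidden reminders."""
    for m in messages:
        if m.type != "human":
            continue
        if skip_reminders and is_reminder(m):
            continue
        return m.content
    return ""


history = [
    Msg("human", "<current_date>...</current_date>", {"dynamic_context_reminder": True}),
    Msg("human", "Plan a three-day trip to Kyoto"),
    Msg("ai", "Here is a draft itinerary."),
]

filtered = first_user_content(history, skip_reminders=True)
unfiltered = first_user_content(history, skip_reminders=False)
```

With the filter, the title prompt would be built from the user's request; without it, the first human message found is the hidden reminder.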
@@ -7,26 +7,20 @@ reminder message so the model still knows about the outstanding todo list.
 
 Additionally, this middleware prevents the agent from exiting the loop while
 there are still incomplete todo items. When the model produces a final response
-(no tool calls) but todos are not yet complete, the middleware queues a reminder
-for the next model request and jumps back to the model node to force continued
-engagement. The completion reminder is injected via ``wrap_model_call`` instead
-of being persisted into graph state as a normal user-visible message.
+(no tool calls) but todos are not yet complete, the middleware injects a reminder
+and jumps back to the model node to force continued engagement.
 """
 
 from __future__ import annotations
 
-import threading
-from collections.abc import Awaitable, Callable
 from typing import Any, override
 
 from langchain.agents.middleware import TodoListMiddleware
-from langchain.agents.middleware.todo import Todo
-from langchain.agents.middleware.types import ModelCallResult, ModelRequest, ModelResponse, hook_config
+from langchain.agents.middleware.todo import PlanningState, Todo
+from langchain.agents.middleware.types import hook_config
 from langchain_core.messages import AIMessage, HumanMessage
 from langgraph.runtime import Runtime
 
-from deerflow.agents.thread_state import ThreadState
-
 
 def _todos_in_messages(messages: list[Any]) -> bool:
     """Return True if any AIMessage in *messages* contains a write_todos tool call."""
@@ -61,51 +55,6 @@ def _format_todos(todos: list[Todo]) -> str:
     return "\n".join(lines)
 
 
-def _format_completion_reminder(todos: list[Todo]) -> str:
-    """Format a completion reminder for incomplete todo items."""
-    incomplete = [t for t in todos if t.get("status") != "completed"]
-    incomplete_text = "\n".join(f"- [{t.get('status', 'pending')}] {t.get('content', '')}" for t in incomplete)
-    return (
-        "<system_reminder>\n"
-        "You have incomplete todo items that must be finished before giving your final response:\n\n"
-        f"{incomplete_text}\n\n"
-        "Please continue working on these tasks. Call `write_todos` to mark items as completed "
-        "as you finish them, and only respond when all items are done.\n"
-        "</system_reminder>"
-    )
-
-
-_TOOL_CALL_FINISH_REASONS = {"tool_calls", "function_call"}
-
-
-def _has_tool_call_intent_or_error(message: AIMessage) -> bool:
-    """Return True when an AIMessage is not a clean final answer.
-
-    Todo completion reminders should only fire when the model has produced a
-    plain final response. Provider/tool parsing details have moved across
-    LangChain versions and integrations, so keep all tool-intent/error signals
-    behind this helper instead of checking one concrete field at the call site.
-    """
-    if message.tool_calls:
-        return True
-
-    if getattr(message, "invalid_tool_calls", None):
-        return True
-
-    # Backward/provider compatibility: some integrations preserve raw or legacy
-    # tool-call intent in additional_kwargs even when structured tool_calls is
-    # empty. If this helper changes, update the matching sentinel test
-    # `TestToolCallIntentOrError.test_langchain_ai_message_tool_fields_are_explicitly_handled`;
-    # if that test fails after a LangChain upgrade, review this helper so new
-    # tool-call/error fields are not silently treated as clean final answers.
-    additional_kwargs = getattr(message, "additional_kwargs", {}) or {}
-    if additional_kwargs.get("tool_calls") or additional_kwargs.get("function_call"):
-        return True
-
-    response_metadata = getattr(message, "response_metadata", {}) or {}
-    return response_metadata.get("finish_reason") in _TOOL_CALL_FINISH_REASONS
-
-
 class TodoMiddleware(TodoListMiddleware):
     """Extends TodoListMiddleware with `write_todos` context-loss detection.
 
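The `_has_tool_call_intent_or_error` helper removed in this hunk checks four separate signals before treating a model response as a clean final answer. The following is a minimal, dependency-free sketch of that check, not the project's code: `StubAIMessage` is a hypothetical stand-in for LangChain's `AIMessage`, introduced only so the example runs without LangChain installed.

```python
from dataclasses import dataclass, field
from typing import Any

# Finish reasons that indicate the provider stopped in order to call a tool.
TOOL_CALL_FINISH_REASONS = {"tool_calls", "function_call"}


@dataclass
class StubAIMessage:
    """Hypothetical stand-in for an AIMessage (illustration only)."""

    content: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)
    invalid_tool_calls: list[dict[str, Any]] = field(default_factory=list)
    additional_kwargs: dict[str, Any] = field(default_factory=dict)
    response_metadata: dict[str, Any] = field(default_factory=dict)


def has_tool_call_intent_or_error(message: StubAIMessage) -> bool:
    """Return True unless the message looks like a clean final answer."""
    # 1. Structured tool calls, or tool calls that failed to parse.
    if message.tool_calls or message.invalid_tool_calls:
        return True
    # 2. Raw or legacy provider payloads kept in additional_kwargs.
    raw = message.additional_kwargs or {}
    if raw.get("tool_calls") or raw.get("function_call"):
        return True
    # 3. A finish reason that signals a tool call was intended.
    metadata = message.response_metadata or {}
    return metadata.get("finish_reason") in TOOL_CALL_FINISH_REASONS
```

The simpler check on the new side of this diff (`last_ai.tool_calls`) covers only the first signal, so a message with a parse error or a legacy `function_call` payload would be treated as a final answer.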
@@ -115,12 +64,10 @@ class TodoMiddleware(TodoListMiddleware):
     and injects a reminder message so the model can continue tracking progress.
     """
 
-    state_schema = ThreadState
-
     @override
     def before_model(
         self,
-        state: ThreadState,
+        state: PlanningState,
         runtime: Runtime,
     ) -> dict[str, Any] | None:
         """Inject a todo-list reminder when write_todos has left the context window."""
@@ -142,7 +89,6 @@ class TodoMiddleware(TodoListMiddleware):
         formatted = _format_todos(todos)
         reminder = HumanMessage(
             name="todo_reminder",
-            additional_kwargs={"hide_from_ui": True},
             content=(
                 "<system_reminder>\n"
                 "Your todo list from earlier is no longer visible in the current context window, "
@@ -158,7 +104,7 @@ class TodoMiddleware(TodoListMiddleware):
     @override
     async def abefore_model(
         self,
-        state: ThreadState,
+        state: PlanningState,
         runtime: Runtime,
     ) -> dict[str, Any] | None:
         """Async version of before_model."""
@@ -167,106 +113,12 @@ class TodoMiddleware(TodoListMiddleware):
     # Maximum number of completion reminders before allowing the agent to exit.
     # This prevents infinite loops when the agent cannot make further progress.
     _MAX_COMPLETION_REMINDERS = 2
-    # Hard cap for per-run reminder bookkeeping in long-lived middleware instances.
-    _MAX_COMPLETION_REMINDER_KEYS = 4096
-
-    def __init__(self, *args: Any, **kwargs: Any) -> None:
-        super().__init__(*args, **kwargs)
-        self._lock = threading.Lock()
-        self._pending_completion_reminders: dict[tuple[str, str], list[str]] = {}
-        self._completion_reminder_counts: dict[tuple[str, str], int] = {}
-        self._completion_reminder_touch_order: dict[tuple[str, str], int] = {}
-        self._completion_reminder_next_order = 0
-
-    @staticmethod
-    def _get_thread_id(runtime: Runtime) -> str:
-        context = getattr(runtime, "context", None)
-        thread_id = context.get("thread_id") if context else None
-        return str(thread_id) if thread_id else "default"
-
-    @staticmethod
-    def _get_run_id(runtime: Runtime) -> str:
-        context = getattr(runtime, "context", None)
-        run_id = context.get("run_id") if context else None
-        return str(run_id) if run_id else "default"
-
-    def _pending_key(self, runtime: Runtime) -> tuple[str, str]:
-        return self._get_thread_id(runtime), self._get_run_id(runtime)
-
-    def _touch_completion_reminder_key_locked(self, key: tuple[str, str]) -> None:
-        self._completion_reminder_next_order += 1
-        self._completion_reminder_touch_order[key] = self._completion_reminder_next_order
-
-    def _completion_reminder_keys_locked(self) -> set[tuple[str, str]]:
-        keys = set(self._pending_completion_reminders)
-        keys.update(self._completion_reminder_counts)
-        keys.update(self._completion_reminder_touch_order)
-        return keys
-
-    def _drop_completion_reminder_key_locked(self, key: tuple[str, str]) -> None:
-        self._pending_completion_reminders.pop(key, None)
-        self._completion_reminder_counts.pop(key, None)
-        self._completion_reminder_touch_order.pop(key, None)
-
-    def _prune_completion_reminder_state_locked(self, protected_key: tuple[str, str]) -> None:
-        keys = self._completion_reminder_keys_locked()
-        overflow = len(keys) - self._MAX_COMPLETION_REMINDER_KEYS
-        if overflow <= 0:
-            return
-
-        candidates = [key for key in keys if key != protected_key]
-        candidates.sort(key=lambda key: self._completion_reminder_touch_order.get(key, 0))
-        for key in candidates[:overflow]:
-            self._drop_completion_reminder_key_locked(key)
-
-    def _queue_completion_reminder(self, runtime: Runtime, reminder: str) -> None:
-        key = self._pending_key(runtime)
-        with self._lock:
-            self._pending_completion_reminders.setdefault(key, []).append(reminder)
-            self._completion_reminder_counts[key] = self._completion_reminder_counts.get(key, 0) + 1
-            self._touch_completion_reminder_key_locked(key)
-            self._prune_completion_reminder_state_locked(protected_key=key)
-
-    def _completion_reminder_count_for_runtime(self, runtime: Runtime) -> int:
-        key = self._pending_key(runtime)
-        with self._lock:
-            return self._completion_reminder_counts.get(key, 0)
-
-    def _drain_completion_reminders(self, runtime: Runtime) -> list[str]:
-        key = self._pending_key(runtime)
-        with self._lock:
-            reminders = self._pending_completion_reminders.pop(key, [])
-            if reminders or key in self._completion_reminder_counts:
-                self._touch_completion_reminder_key_locked(key)
-            return reminders
-
-    def _clear_other_run_completion_reminders(self, runtime: Runtime) -> None:
-        thread_id, current_run_id = self._pending_key(runtime)
-        with self._lock:
-            for key in self._completion_reminder_keys_locked():
-                if key[0] == thread_id and key[1] != current_run_id:
-                    self._drop_completion_reminder_key_locked(key)
-
-    def _clear_current_run_completion_reminders(self, runtime: Runtime) -> None:
-        key = self._pending_key(runtime)
-        with self._lock:
-            self._drop_completion_reminder_key_locked(key)
-
-    @override
-    def before_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
-        self._clear_other_run_completion_reminders(runtime)
-        return None
-
-    @override
-    async def abefore_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
-        self._clear_other_run_completion_reminders(runtime)
-        return None
 
     @hook_config(can_jump_to=["model"])
     @override
     def after_model(
         self,
-        state: ThreadState,
+        state: PlanningState,
         runtime: Runtime,
     ) -> dict[str, Any] | None:
         """Prevent premature agent exit when todo items are still incomplete.
@@ -285,12 +137,10 @@ class TodoMiddleware(TodoListMiddleware):
         if base_result is not None:
             return base_result
 
-        # 2. Only intervene when the agent wants to exit cleanly. Tool-call
-        # intent or tool-call parse errors should be handled by the tool path
-        # instead of being masked by todo reminders.
+        # 2. Only intervene when the agent wants to exit (no tool calls).
         messages = state.get("messages") or []
         last_ai = next((m for m in reversed(messages) if isinstance(m, AIMessage)), None)
-        if not last_ai or _has_tool_call_intent_or_error(last_ai):
+        if not last_ai or last_ai.tool_calls:
             return None
 
         # 3. Allow exit when all todos are completed or there are no todos.
@@ -299,65 +149,31 @@ class TodoMiddleware(TodoListMiddleware):
             return None
 
         # 4. Enforce a reminder cap to prevent infinite re-engagement loops.
-        if self._completion_reminder_count_for_runtime(runtime) >= self._MAX_COMPLETION_REMINDERS:
+        if _completion_reminder_count(messages) >= self._MAX_COMPLETION_REMINDERS:
             return None
 
-        # 5. Queue a reminder for the next model request and jump back. We must
-        # not persist this control prompt as a normal HumanMessage, otherwise it
-        # can leak into user-visible message streams and saved transcripts.
-        self._queue_completion_reminder(runtime, _format_completion_reminder(todos))
-        return {"jump_to": "model"}
+        # 5. Inject a reminder and force the agent back to the model.
+        incomplete = [t for t in todos if t.get("status") != "completed"]
+        incomplete_text = "\n".join(f"- [{t.get('status', 'pending')}] {t.get('content', '')}" for t in incomplete)
+        reminder = HumanMessage(
+            name="todo_completion_reminder",
+            content=(
+                "<system_reminder>\n"
+                "You have incomplete todo items that must be finished before giving your final response:\n\n"
+                f"{incomplete_text}\n\n"
+                "Please continue working on these tasks. Call `write_todos` to mark items as completed "
+                "as you finish them, and only respond when all items are done.\n"
+                "</system_reminder>"
+            ),
+        )
+        return {"jump_to": "model", "messages": [reminder]}
 
     @override
     @hook_config(can_jump_to=["model"])
     async def aafter_model(
         self,
-        state: ThreadState,
+        state: PlanningState,
         runtime: Runtime,
     ) -> dict[str, Any] | None:
         """Async version of after_model."""
         return self.after_model(state, runtime)
-
-    @staticmethod
-    def _format_pending_completion_reminders(reminders: list[str]) -> str:
-        return "\n\n".join(dict.fromkeys(reminders))
-
-    def _augment_request(self, request: ModelRequest) -> ModelRequest:
-        reminders = self._drain_completion_reminders(request.runtime)
-        if not reminders:
-            return request
-        new_messages = [
-            *request.messages,
-            HumanMessage(
-                content=self._format_pending_completion_reminders(reminders),
-                name="todo_completion_reminder",
-                additional_kwargs={"hide_from_ui": True},
-            ),
-        ]
-        return request.override(messages=new_messages)
-
-    @override
-    def wrap_model_call(
-        self,
-        request: ModelRequest,
-        handler: Callable[[ModelRequest], ModelResponse],
-    ) -> ModelCallResult:
-        return handler(self._augment_request(request))
-
-    @override
-    async def awrap_model_call(
-        self,
-        request: ModelRequest,
-        handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
-    ) -> ModelCallResult:
-        return await handler(self._augment_request(request))
-
-    @override
-    def after_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
-        self._clear_current_run_completion_reminders(runtime)
-        return None
-
-    @override
-    async def aafter_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
-        self._clear_current_run_completion_reminders(runtime)
-        return None
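The old side of this file kept completion reminders out of the persisted message state: `after_model` queued a reminder under a `(thread_id, run_id)` key, `wrap_model_call` drained it into the next request, and a hard cap pruned the least recently touched keys. The sketch below illustrates that queue, drain and prune pattern in isolation. `ReminderBook` and its method names are hypothetical, and the version here is simplified (for instance, draining does not refresh the touch order as the original does).

```python
import threading

Key = tuple[str, str]  # (thread_id, run_id)


class ReminderBook:
    """Hypothetical, simplified per-run reminder bookkeeping (illustration only)."""

    def __init__(self, max_keys: int = 4096) -> None:
        self._lock = threading.Lock()
        self._max_keys = max_keys
        self._pending: dict[Key, list[str]] = {}
        self._counts: dict[Key, int] = {}
        self._touch_order: dict[Key, int] = {}
        self._next_order = 0

    def queue(self, key: Key, reminder: str) -> None:
        """Store a reminder for the next model call and count it toward the cap."""
        with self._lock:
            self._pending.setdefault(key, []).append(reminder)
            self._counts[key] = self._counts.get(key, 0) + 1
            self._next_order += 1
            self._touch_order[key] = self._next_order
            self._prune(protected=key)

    def drain(self, key: Key) -> list[str]:
        """Return and forget pending reminders; the count is kept."""
        with self._lock:
            return self._pending.pop(key, [])

    def count(self, key: Key) -> int:
        with self._lock:
            return self._counts.get(key, 0)

    def _prune(self, protected: Key) -> None:
        # Caller holds the lock. Drop the least recently touched keys first,
        # never the key that was just written.
        overflow = len(self._touch_order) - self._max_keys
        if overflow <= 0:
            return
        victims = sorted(
            (k for k in self._touch_order if k != protected),
            key=self._touch_order.__getitem__,
        )
        for key in victims[:overflow]:
            self._pending.pop(key, None)
            self._counts.pop(key, None)
            self._touch_order.pop(key, None)
```

Keeping the count separate from the pending list is what lets a reminder cap survive draining: the reminder text is consumed once, but the number of reminders already issued for the run is still known.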
@@ -1,358 +1,37 @@
-"""Middleware for logging token usage and annotating step attribution."""
+"""Middleware for logging LLM token usage."""
 
-from __future__ import annotations
-
 import logging
-from collections import defaultdict
-from typing import Any, override
+from typing import override
 
 from langchain.agents import AgentState
 from langchain.agents.middleware import AgentMiddleware
-from langchain.agents.middleware.todo import Todo
-from langchain_core.messages import AIMessage, ToolMessage
 from langgraph.runtime import Runtime
 
 logger = logging.getLogger(__name__)
 
-TOKEN_USAGE_ATTRIBUTION_KEY = "token_usage_attribution"
-
-
-def _string_arg(value: Any) -> str | None:
-    if isinstance(value, str):
-        normalized = value.strip()
-        return normalized or None
-    return None
-
-
-def _normalize_todos(value: Any) -> list[Todo]:
-    if not isinstance(value, list):
-        return []
-
-    normalized: list[Todo] = []
-    for item in value:
-        if not isinstance(item, dict):
-            continue
-
-        todo: Todo = {}
-        content = _string_arg(item.get("content"))
-        status = item.get("status")
-
-        if content is not None:
-            todo["content"] = content
-        if status in {"pending", "in_progress", "completed"}:
-            todo["status"] = status
-
-        normalized.append(todo)
-
-    return normalized
-
-
-def _todo_action_kind(previous: Todo | None, current: Todo) -> str:
-    status = current.get("status")
-    previous_content = previous.get("content") if previous else None
-    current_content = current.get("content")
-
-    if previous is None:
-        if status == "completed":
-            return "todo_complete"
-        if status == "in_progress":
-            return "todo_start"
-        return "todo_update"
-
-    if previous_content != current_content:
-        return "todo_update"
-
-    if status == "completed":
-        return "todo_complete"
-    if status == "in_progress":
-        return "todo_start"
-    return "todo_update"
-
-
-def _build_todo_actions(previous_todos: list[Todo], next_todos: list[Todo]) -> list[dict[str, Any]]:
-    # This is the single source of truth for precise write_todos token
-    # attribution. The frontend intentionally falls back to a generic
-    # "Update to-do list" label when this metadata is missing or malformed.
-    previous_by_content: dict[str, list[tuple[int, Todo]]] = defaultdict(list)
-    matched_previous_indices: set[int] = set()
-
-    for index, todo in enumerate(previous_todos):
-        content = todo.get("content")
-        if isinstance(content, str) and content:
-            previous_by_content[content].append((index, todo))
-
-    actions: list[dict[str, Any]] = []
-
-    for index, todo in enumerate(next_todos):
-        content = todo.get("content")
-        if not isinstance(content, str) or not content:
-            continue
-
-        previous_match: Todo | None = None
-        content_matches = previous_by_content.get(content)
-        if content_matches:
-            while content_matches and content_matches[0][0] in matched_previous_indices:
-                content_matches.pop(0)
-            if content_matches:
-                previous_index, previous_match = content_matches.pop(0)
-                matched_previous_indices.add(previous_index)
-
-        if previous_match is None and index < len(previous_todos) and index not in matched_previous_indices:
-            previous_match = previous_todos[index]
-            matched_previous_indices.add(index)
-
-        if previous_match is not None:
-            previous_content = previous_match.get("content")
-            previous_status = previous_match.get("status")
-            if previous_content == content and previous_status == todo.get("status"):
-                continue
-
-        actions.append(
-            {
-                "kind": _todo_action_kind(previous_match, todo),
-                "content": content,
-            }
-        )
-
-    for index, todo in enumerate(previous_todos):
-        if index in matched_previous_indices:
-            continue
-
-        content = todo.get("content")
-        if not isinstance(content, str) or not content:
-            continue
-
-        actions.append(
-            {
-                "kind": "todo_remove",
-                "content": content,
-            }
-        )
-
-    return actions
-
-
-def _describe_tool_call(tool_call: dict[str, Any], todos: list[Todo]) -> list[dict[str, Any]]:
-    name = _string_arg(tool_call.get("name")) or "unknown"
-    args = tool_call.get("args") if isinstance(tool_call.get("args"), dict) else {}
-    tool_call_id = _string_arg(tool_call.get("id"))
-
-    if name == "write_todos":
-        next_todos = _normalize_todos(args.get("todos"))
-        actions = _build_todo_actions(todos, next_todos)
-        if not actions:
-            return [
-                {
-                    "kind": "tool",
-                    "tool_name": name,
-                    "tool_call_id": tool_call_id,
-                }
-            ]
-        return [
-            {
-                **action,
-                "tool_call_id": tool_call_id,
-            }
-            for action in actions
-        ]
-
-    if name == "task":
-        return [
-            {
-                "kind": "subagent",
-                "description": _string_arg(args.get("description")),
-                "subagent_type": _string_arg(args.get("subagent_type")),
-                "tool_call_id": tool_call_id,
-            }
-        ]
-
-    if name in {"web_search", "image_search"}:
-        query = _string_arg(args.get("query"))
-        return [
-            {
-                "kind": "search",
-                "tool_name": name,
-                "query": query,
-                "tool_call_id": tool_call_id,
-            }
-        ]
-
-    if name == "present_files":
-        return [
-            {
-                "kind": "present_files",
-                "tool_call_id": tool_call_id,
-            }
-        ]
-
-    if name == "ask_clarification":
-        return [
-            {
-                "kind": "clarification",
-                "tool_call_id": tool_call_id,
-            }
-        ]
-
-    return [
-        {
-            "kind": "tool",
-            "tool_name": name,
-            "description": _string_arg(args.get("description")),
-            "tool_call_id": tool_call_id,
-        }
-    ]
-
-
-def _infer_step_kind(message: AIMessage, actions: list[dict[str, Any]]) -> str:
-    if actions:
-        first_kind = actions[0].get("kind")
-        if len(actions) == 1 and first_kind in {"todo_start", "todo_complete", "todo_update", "todo_remove"}:
-            return "todo_update"
-        if len(actions) == 1 and first_kind == "subagent":
-            return "subagent_dispatch"
-        return "tool_batch"
-
-    if message.content:
-        return "final_answer"
-    return "thinking"
-
-
-def _has_tool_call(message: AIMessage, tool_call_id: str) -> bool:
-    """Return True if the AIMessage contains a tool_call with the given id."""
-    for tc in message.tool_calls or []:
-        if isinstance(tc, dict):
-            if tc.get("id") == tool_call_id:
-                return True
-        elif hasattr(tc, "id") and tc.id == tool_call_id:
-            return True
-    return False
-
-
-def _build_attribution(message: AIMessage, todos: list[Todo]) -> dict[str, Any]:
-    tool_calls = getattr(message, "tool_calls", None) or []
-    actions: list[dict[str, Any]] = []
-    current_todos = list(todos)
-
-    for raw_tool_call in tool_calls:
-        if not isinstance(raw_tool_call, dict):
-            continue
-
-        described_actions = _describe_tool_call(raw_tool_call, current_todos)
-        actions.extend(described_actions)
-
-        if raw_tool_call.get("name") == "write_todos":
-            args = raw_tool_call.get("args") if isinstance(raw_tool_call.get("args"), dict) else {}
-            current_todos = _normalize_todos(args.get("todos"))
-
-    tool_call_ids: list[str] = []
-    for tool_call in tool_calls:
-        if not isinstance(tool_call, dict):
-            continue
-
-        tool_call_id = _string_arg(tool_call.get("id"))
-        if tool_call_id is not None:
-            tool_call_ids.append(tool_call_id)
-
-    return {
-        # Schema changes should remain additive where possible so older
-        # frontends can ignore unknown fields and fall back safely.
-        "version": 1,
-        "kind": _infer_step_kind(message, actions),
-        "shared_attribution": len(actions) > 1,
-        "tool_call_ids": tool_call_ids,
-        "actions": actions,
-    }
-
 
 class TokenUsageMiddleware(AgentMiddleware):
-    """Logs token usage from model responses and annotates the AI step."""
+    """Logs token usage from model response usage_metadata."""
 
-    def _apply(self, state: AgentState) -> dict | None:
-        messages = state.get("messages", [])
-        if not messages:
-            return None
-
-        # Annotate subagent token usage onto the AIMessage that dispatched it.
-        # When a task tool completes, its usage is cached by tool_call_id. Detect
-        # the ToolMessage → search backward for the corresponding AIMessage → merge.
-        # Walk backward through consecutive ToolMessages before the new AIMessage
-        # so that multiple concurrent task tool calls all get their subagent tokens
-        # written back to the same dispatch message (merging into one update).
-        state_updates: dict[int, AIMessage] = {}
-        if len(messages) >= 2:
-            from deerflow.tools.builtins.task_tool import pop_cached_subagent_usage
-
-            idx = len(messages) - 2
-            while idx >= 0:
-                tool_msg = messages[idx]
-                if not isinstance(tool_msg, ToolMessage) or not tool_msg.tool_call_id:
-                    break
-
-                subagent_usage = pop_cached_subagent_usage(tool_msg.tool_call_id)
-                if subagent_usage:
-                    # Search backward from the ToolMessage to find the AIMessage
-                    # that dispatched it. A single model response can dispatch
-                    # multiple task tool calls, so we can't assume a fixed offset.
-                    dispatch_idx = idx - 1
-                    while dispatch_idx >= 0:
-                        candidate = messages[dispatch_idx]
-                        if isinstance(candidate, AIMessage) and _has_tool_call(candidate, tool_msg.tool_call_id):
-                            # Accumulate into an existing update for the same
-                            # AIMessage (multiple task calls in one response),
-                            # or merge fresh from the original message.
-                            existing_update = state_updates.get(dispatch_idx)
-                            prev = existing_update.usage_metadata if existing_update else (getattr(candidate, "usage_metadata", None) or {})
-                            merged = {
-                                **prev,
-                                "input_tokens": prev.get("input_tokens", 0) + subagent_usage["input_tokens"],
-                                "output_tokens": prev.get("output_tokens", 0) + subagent_usage["output_tokens"],
-                                "total_tokens": prev.get("total_tokens", 0) + subagent_usage["total_tokens"],
-                            }
-                            state_updates[dispatch_idx] = candidate.model_copy(update={"usage_metadata": merged})
-                            break
-                        dispatch_idx -= 1
-                idx -= 1
-
-        last = messages[-1]
-        if not isinstance(last, AIMessage):
-            if state_updates:
-                return {"messages": [state_updates[idx] for idx in sorted(state_updates)]}
-            return None
-
-        usage = getattr(last, "usage_metadata", None)
-        if usage:
-            input_token_details = usage.get("input_token_details") or {}
-            output_token_details = usage.get("output_token_details") or {}
-            detail_parts = []
-            if input_token_details:
-                detail_parts.append(f"input_token_details={input_token_details}")
-            if output_token_details:
-                detail_parts.append(f"output_token_details={output_token_details}")
-            detail_suffix = f" {' '.join(detail_parts)}" if detail_parts else ""
-            logger.info(
-                "LLM token usage: input=%s output=%s total=%s%s",
-                usage.get("input_tokens", "?"),
-                usage.get("output_tokens", "?"),
-                usage.get("total_tokens", "?"),
-                detail_suffix,
-            )
-
-        todos = state.get("todos") or []
-        attribution = _build_attribution(last, todos if isinstance(todos, list) else [])
-        additional_kwargs = dict(getattr(last, "additional_kwargs", {}) or {})
-
-        if additional_kwargs.get(TOKEN_USAGE_ATTRIBUTION_KEY) == attribution:
-            return {"messages": [state_updates[idx] for idx in sorted(state_updates)]} if state_updates else None
-
-        additional_kwargs[TOKEN_USAGE_ATTRIBUTION_KEY] = attribution
-        updated_msg = last.model_copy(update={"additional_kwargs": additional_kwargs})
-        state_updates[len(messages) - 1] = updated_msg
-        return {"messages": [state_updates[idx] for idx in sorted(state_updates)]}
-
     @override
     def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
-        return self._apply(state)
+        return self._log_usage(state)
 
     @override
     async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
-        return self._apply(state)
+        return self._log_usage(state)
+
+    def _log_usage(self, state: AgentState) -> None:
+        messages = state.get("messages", [])
+        if not messages:
+            return None
+        last = messages[-1]
+        usage = getattr(last, "usage_metadata", None)
+        if usage:
+            logger.info(
+                "LLM token usage: input=%s output=%s total=%s",
+                usage.get("input_tokens", "?"),
+                usage.get("output_tokens", "?"),
+                usage.get("total_tokens", "?"),
+            )
+        return None
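The removed `_apply` method folded each subagent's token usage into the usage of the message that dispatched it, accumulating when one response dispatched several `task` calls. A minimal sketch of just that accumulation step follows. `merge_usage` is a hypothetical helper name, and plain dicts stand in for LangChain's `usage_metadata`.

```python
from __future__ import annotations


def merge_usage(prev: dict[str, int] | None, extra: dict[str, int]) -> dict[str, int]:
    """Add one subagent's token counts onto the dispatching message's usage.

    Keys of `prev` other than the three counters are carried over unchanged.
    """
    prev = prev or {}
    return {
        **prev,
        "input_tokens": prev.get("input_tokens", 0) + extra["input_tokens"],
        "output_tokens": prev.get("output_tokens", 0) + extra["output_tokens"],
        "total_tokens": prev.get("total_tokens", 0) + extra["total_tokens"],
    }


# Two subagents dispatched by the same model response accumulate in turn.
dispatch_usage = {"input_tokens": 100, "output_tokens": 20, "total_tokens": 120}
first = {"input_tokens": 50, "output_tokens": 10, "total_tokens": 60}
second = {"input_tokens": 5, "output_tokens": 5, "total_tokens": 10}
combined = merge_usage(merge_usage(dispatch_usage, first), second)
```

Because the function returns a new dict, the original usage on the dispatching message is left untouched until the caller writes the merged copy back, which mirrors the `model_copy(update=...)` call in the removed code.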
@@ -1,50 +0,0 @@
-"""Helpers for keeping AIMessage tool-call metadata consistent."""
-
-from __future__ import annotations
-
-from typing import Any
-
-from langchain_core.messages import AIMessage
-
-
-def _raw_tool_call_id(raw_tool_call: Any) -> str | None:
-    if not isinstance(raw_tool_call, dict):
-        return None
-
-    raw_id = raw_tool_call.get("id")
-    return raw_id if isinstance(raw_id, str) and raw_id else None
-
-
-def clone_ai_message_with_tool_calls(
-    message: AIMessage,
-    tool_calls: list[dict[str, Any]],
-    *,
-    content: Any | None = None,
-) -> AIMessage:
-    """Clone an AIMessage while keeping raw provider tool-call metadata in sync."""
-    kept_ids = {tc["id"] for tc in tool_calls if isinstance(tc.get("id"), str) and tc["id"]}
-
-    update: dict[str, Any] = {"tool_calls": tool_calls}
-    if content is not None:
-        update["content"] = content
-
-    additional_kwargs = dict(getattr(message, "additional_kwargs", {}) or {})
-    raw_tool_calls = additional_kwargs.get("tool_calls")
-    if isinstance(raw_tool_calls, list):
-        synced_raw_tool_calls = [raw_tc for raw_tc in raw_tool_calls if _raw_tool_call_id(raw_tc) in kept_ids]
-        if synced_raw_tool_calls:
-            additional_kwargs["tool_calls"] = synced_raw_tool_calls
-        else:
-            additional_kwargs.pop("tool_calls", None)
-
-    if not tool_calls:
-        additional_kwargs.pop("function_call", None)
-
-    update["additional_kwargs"] = additional_kwargs
-
-    response_metadata = dict(getattr(message, "response_metadata", {}) or {})
-    if not tool_calls and response_metadata.get("finish_reason") == "tool_calls":
-        response_metadata["finish_reason"] = "stop"
-    update["response_metadata"] = response_metadata
-
-    return message.model_copy(update=update)
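The deleted helper above keeps the raw provider payload in `additional_kwargs` consistent with the structured `tool_calls` that survive filtering. The sketch below isolates that synchronization on plain dicts so it can be run without LangChain. `sync_raw_tool_calls` is a hypothetical name, and the example deliberately leaves out the `response_metadata` adjustment.

```python
from typing import Any


def sync_raw_tool_calls(
    additional_kwargs: dict[str, Any],
    kept_tool_calls: list[dict[str, Any]],
) -> dict[str, Any]:
    """Return a copy of additional_kwargs whose raw tool calls match the kept ones."""
    kept_ids = {tc["id"] for tc in kept_tool_calls if isinstance(tc.get("id"), str) and tc["id"]}

    synced = dict(additional_kwargs)  # shallow copy; the input dict is not modified
    raw = synced.get("tool_calls")
    if isinstance(raw, list):
        kept_raw = [
            r for r in raw
            if isinstance(r, dict) and isinstance(r.get("id"), str) and r["id"] in kept_ids
        ]
        if kept_raw:
            synced["tool_calls"] = kept_raw
        else:
            synced.pop("tool_calls", None)

    # With no tool calls left, a legacy function_call entry would be stale too.
    if not kept_tool_calls:
        synced.pop("function_call", None)
    return synced


raw_kwargs = {
    "tool_calls": [{"id": "a"}, {"id": "b"}],
    "function_call": {"name": "f"},
    "other": 1,
}
only_a = sync_raw_tool_calls(raw_kwargs, [{"id": "a", "name": "x", "args": {}}])
none_left = sync_raw_tool_calls(raw_kwargs, [])
```

Without this step, a message whose structured tool calls were all dropped could still carry raw tool-call entries, and a check like `_has_tool_call_intent_or_error` would keep treating it as an unfinished tool request.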
Some files were not shown because too many files have changed in this diff.